Geons are the simple 2D or 3D forms, such as cylinders, bricks, wedges, cones, circles and rectangles, that correspond to the simple parts of an object in Biederman's recognition-by-components theory. The theory proposes that the visual input is matched against structural representations of objects in the brain. These structural representations consist of geons and their relations (e.g., an ice cream cone could be broken down into a sphere located above a cone). Only a modest number of geons (< 40) are assumed. When geons are combined in different relations to each other (e.g., on-top-of, larger-than, end-to-end, end-to-middle) and allowed coarse metric variation, such as in aspect ratio and 2D orientation, billions of possible 2- and 3-geon objects can be generated. Two classes of shape-based visual identification are not done through geon representations: a) individuating similar faces, and b) classifications that don’t have definite boundaries, such as those of bushes or a crumpled garment. Typically, such identifications are not viewpoint-invariant.
Properties of geons
Geons have four essential properties:
- View-invariance: Each geon can be distinguished from the others from almost any viewpoint, except for “accidents” at highly restricted angles in which one geon projects an image that could be a different geon, as, for example, when an end-on view of a cylinder could be mistaken for a sphere or circle. Objects represented as an arrangement of geons would, similarly, be viewpoint invariant.
- Stability or resistance to visual noise: Because the geons are simple, they are readily supported by the Gestalt property of smooth continuation, which makes their identification robust to partial occlusion and degradation by visual noise, as when a cylinder is viewed behind a bush.
- Invariance to illumination direction and surface markings and texture.
- High distinctiveness: The geons differ qualitatively, with only two or three levels of an attribute, such as straight vs. curved, parallel vs. nonparallel, positive vs. negative curvature. These qualitative differences are easy to detect, so both the geons and the objects composed of them are readily distinguishable.
Derivation of invariant properties of geons
Viewpoint invariance: The viewpoint invariance of geons derives from their being distinguished by three nonaccidental properties (NAPs) of contours that do not change with orientation in depth:
- Whether the contour is straight or curved,
- The type of vertex formed in the image when two or three contours coterminate (that is, end together at the same point): an L (2 contours), a fork (3 contours with all angles < 180°), or an arrow (3 contours, with one angle > 180°), and
- Whether a pair of contours is parallel or not (with allowance for perspective). When not parallel, the contours can be straight (converging or diverging) or curved, with positive or negative curvature forming a convex or concave envelope, respectively (see Figure below).
NAPs can be distinguished from metric properties (MPs), such as the degree of non-zero curvature of a contour or its length, which do vary with changes in orientation in depth.
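The vertex types listed above can be sketched as a small classifier. This is only an illustration, not part of the theory's published machinery: the function name `classify_vertex` and the input convention (image-plane directions, in degrees, of the contours leaving the vertex) are assumptions made for the example. A junction with a gap of exactly 180° falls outside the three types described here and is simply grouped with the fork.

```python
def classify_vertex(directions_deg):
    """Classify a vertex from the image-plane directions (in degrees)
    of the contours that coterminate at it.

    Hypothetical helper: the input convention is an assumption of this sketch.
    """
    n = len(directions_deg)
    if n == 2:
        return "L"  # two coterminating contours
    if n != 3:
        raise ValueError("only 2- or 3-contour vertices are handled")

    angles = sorted(d % 360.0 for d in directions_deg)
    # Angular gaps between neighbouring contours, going once around the vertex.
    # The three gaps always sum to 360 degrees.
    gaps = [(angles[(i + 1) % 3] - angles[i]) % 360.0 for i in range(3)]

    # Arrow: one angle exceeds 180 degrees; fork: all angles are below 180.
    return "arrow" if max(gaps) > 180.0 else "fork"
```

For instance, three contours at 90°, 210° and 330° leave gaps of 120° each and are labelled a fork, whereas contours at 60°, 90° and 120° leave one gap of 300° and are labelled an arrow. The label depends only on whether an angle exceeds 180°, not on its exact value, which is what separates a qualitative property of this kind from a metric one.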
Invariance to lighting direction and surface characteristics
Geons can be determined from the contours that mark the edges at orientation and depth discontinuities of an image of an object, i.e., the contours that specify a good line drawing of the object’s shape or volume. Orientation discontinuities define those edges where there is a sharp change in the orientation of the normal to the surface of a volume, as occurs at the contour at the boundaries of the different sides of a brick. A depth discontinuity is where the observer’s line of sight jumps from the surface of an object to the background (i.e., is tangent to the surface), as occurs at the sides of a cylinder. The same contour might mark both an orientation and depth discontinuity, as with the back edge of a brick. Because the geons are based on these discontinuities, they are invariant to variations in the direction of lighting, shadows, and surface texture and markings.
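The two kinds of discontinuity can be illustrated with a toy scan line across a scene. The sketch below assumes that the depth and the unit surface normal are already known at each sample, which a real visual system would have to infer from the image; the function names and both thresholds are hypothetical values chosen for the example.

```python
import math

def angle_between(n1, n2):
    """Angle in degrees between two unit-length 3D surface normals."""
    dot = sum(a * b for a, b in zip(n1, n2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding error
    return math.degrees(math.acos(dot))

def find_edges(depths, normals, depth_jump=0.5, normal_jump_deg=30.0):
    """Return (index, kinds) for each pair of adjacent samples that
    straddles an edge. Thresholds are illustrative assumptions."""
    edges = []
    for i in range(len(depths) - 1):
        kinds = []
        # Orientation discontinuity: sharp change in the surface normal.
        if angle_between(normals[i], normals[i + 1]) > normal_jump_deg:
            kinds.append("orientation")
        # Depth discontinuity: line of sight jumps to a farther surface.
        if abs(depths[i + 1] - depths[i]) > depth_jump:
            kinds.append("depth")
        if kinds:
            edges.append((i, kinds))
    return edges

# Toy scan line: front face of a brick, its side face, then the background.
depths = [2.0, 2.0, 2.1, 2.2, 10.0]
normals = [(0, 0, 1), (0, 0, 1), (1, 0, 0), (1, 0, 0), (0, 0, 1)]
edges = find_edges(depths, normals)
```

In this toy input the boundary between the two faces is flagged as an orientation discontinuity only, while the far edge of the brick is flagged as both an orientation and a depth discontinuity. Neither test uses brightness or texture, which is the sense in which edges defined this way do not depend on lighting direction or surface markings.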
Geons and generalized cones
The geons constitute a partition of the set of generalized cones, which are the volumes created when a cross section is swept along an axis. For example, a circle swept along a straight axis would define a cylinder. A rectangle swept along an axis would define a brick. Four dimensions with contrastive values (i.e., mutually exclusive values) define the current set of geons (see Figure):
- Shape of cross section: round vs. straight.
- Axis: straight vs. curved.
- Size of cross section as it is swept along an axis: constant vs. expanding (or contracting) vs. expanding then contracting vs. contracting then expanding.
- Termination of geon with constant-sized cross sections: truncated vs. converging to a point vs. rounded.
These variations in how geons are generated create shapes that differ in NAPs.
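The four dimensions can be read as a small attribute space, and enumerating it shows how a handful of contrastive values yields a modest set of shapes. The sketch below is an illustration only: the value labels are shortened from the list above, and it assumes that termination varies only when the cross-section size is constant, which is one reading of the fourth dimension. It is not claimed to reproduce the published geon set.

```python
from itertools import product

# Contrastive values for each dimension (labels shortened from the text).
CROSS_SECTION = ("round", "straight")
AXIS = ("straight", "curved")
SIZE = (
    "constant",
    "expanding",
    "expanding-then-contracting",
    "contracting-then-expanding",
)
TERMINATION = ("truncated", "pointed", "rounded")

def enumerate_combinations():
    """List attribute combinations as (cross_section, axis, size, termination)."""
    combos = []
    for cross, axis, size in product(CROSS_SECTION, AXIS, SIZE):
        if size == "constant":
            # Assumption of this sketch: termination is a free dimension
            # only for constant-sized cross sections.
            for term in TERMINATION:
                combos.append((cross, axis, size, term))
        else:
            combos.append((cross, axis, size, None))
    return combos

combos = enumerate_combinations()
cylinder = ("round", "straight", "constant", "truncated")
brick = ("straight", "straight", "constant", "truncated")
```

Under this reading the loop yields 24 combinations, a count that follows from the stated assumption rather than from the source. The cylinder and the brick mentioned above appear as the tuples `cylinder` and `brick`, which differ only in the shape of the cross section.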
Experimental tests of the viewpoint invariance of geons
There is now considerable support for the major assumptions of geon theory (see Recognition-by-components theory). One issue that generated some discussion was the finding that the geons were viewpoint invariant, with little or no cost in the speed or accuracy of recognizing or matching a geon from an orientation in depth not previously experienced. Some studies reported modest costs in matching geons at new orientations in depth, but these studies had several methodological shortcomings.
References
- Biederman, Irving (1987). "Recognition-by-components: A theory of human image understanding". Psychological Review 94 (2): 115–47. doi:10.1037/0033-295X.94.2.115. PMID 3575582.
- Nevatia, R. (1982). Machine Perception. Prentice-Hall.
- Biederman, Irving; Gerhardstein, Peter C. (1993). "Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance". Journal of Experimental Psychology: Human Perception and Performance 19 (6): 1162–82. doi:10.1037/0096-1523.19.6.1162.
- Tarr, Michael J.; Williams, Pepper; Hayward, William G.; Gauthier, Isabel (1998). "Three-dimensional object recognition is viewpoint dependent". Nature Neuroscience 1 (4): 275–7. doi:10.1038/1089. PMID 10195159.
- Biederman, Irving; Bar, Moshe (1999). "One-shot viewpoint invariance in matching novel objects". Vision Research 39 (17): 2885–99. doi:10.1016/S0042-6989(98)00309-5. PMID 10492817.
- Dill, Marcus; Edelman, Shimon (2001). "Imperfect invariance to object translation in the discrimination of complex shapes". Perception 30 (6): 707–24. doi:10.1068/p2953. PMID 11464559.