Motion perception is the process of inferring the speed and direction of elements in a scene based on visual, vestibular and proprioceptive inputs. Although this process appears straightforward to most observers, it has proven to be a difficult problem from a computational perspective, and extraordinarily difficult to explain in terms of neural processing.
The inability to perceive motion is called akinetopsia, and it may be caused by a lesion to cortical area V5 in the extrastriate cortex. Neuropsychological studies of a patient who could not see motion, seeing the world in a series of static "frames" instead, suggested that visual area V5 in humans is homologous to motion processing area MT in non-human primates.
First-order motion perception
Two or more stimuli that are switched on and off in alternation can produce two different motion percepts. The first, demonstrated in the figure to the right, is "Beta movement", often used in billboard displays, in which an object is perceived as moving when, in fact, a series of stationary images is being presented. This is also termed "apparent motion" and is the basis of movies and television. However, at faster alternation rates, and if the distance between the stimuli is just right, an illusory "object" the same colour as the background is seen moving between the two stimuli and alternately occluding them. This is called the phi phenomenon and is an example of "pure" motion detection that, unlike Beta movement, is uncontaminated by form cues.
This pure motion perception is referred to as "first-order" motion perception and is mediated by relatively simple "motion sensors" in the visual system that have evolved to detect a change in luminance at one point on the retina and correlate it with a change in luminance at a neighbouring point on the retina after a short delay. Sensors that work this way have been referred to as Hassenstein-Reichardt detectors (after the scientists Bernhard Hassenstein and Werner Reichardt, who first modelled them), motion-energy sensors, or Elaborated Reichardt Detectors. These sensors detect motion by spatio-temporal correlation and are plausible models for how the visual system may detect motion. There is still considerable debate regarding the exact nature of this process.
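The delay-and-correlate principle described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not a model of any particular neuron: the function name `reichardt_response`, the `delay` parameter, and the sinusoidal test signal are all hypothetical, and a pure sample delay stands in for the temporal filters used in published models.

```python
import numpy as np

def reichardt_response(left, right, delay):
    """Mean opponent output of a simplified correlation-type motion detector.

    left, right: luminance time series at two neighbouring points.
    delay: delay in samples (hypothetical parameter for this sketch).
    A positive value indicates motion from the left point to the right point.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Remove the mean so that only luminance changes are correlated.
    left = left - left.mean()
    right = right - right.mean()
    # A pure delay is assumed here in place of a low-pass temporal filter.
    left_delayed = np.roll(left, delay)
    right_delayed = np.roll(right, delay)
    # np.roll wraps around, so the first `delay` samples are discarded.
    half_a = left_delayed[delay:] * right[delay:]   # left-then-right subunit
    half_b = right_delayed[delay:] * left[delay:]   # right-then-left subunit
    return float(np.mean(half_a - half_b))

t = np.arange(200)
# The same luminance modulation reaches the right point 3 samples later.
left = np.sin(2 * np.pi * t / 40)
right = np.sin(2 * np.pi * (t - 3) / 40)

rightward = reichardt_response(left, right, delay=3)
leftward = reichardt_response(right, left, delay=3)
flicker = reichardt_response(left, left, delay=3)
```

In this sketch the sign of the output carries the direction: swapping the two inputs reverses the sign, and identical inputs (flicker without displacement) cancel in the opponent subtraction.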
Second-order motion perception
Second-order motion is motion in which the moving contour is defined by contrast, texture, flicker or some other quality that does not result in an increase in luminance or motion energy in the Fourier spectrum of the stimulus. There is much evidence to suggest that early processing of first- and second-order motion is carried out by separate pathways. Second-order mechanisms have poorer temporal resolution and are low-pass in terms of the range of spatial frequencies to which they respond. Second-order motion produces a weaker motion aftereffect unless tested with dynamically flickering stimuli. First- and second-order signals appear to be fully combined at the level of Area V5/MT of the visual system.
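A contrast-defined stimulus of this kind can be sketched in Python. The example below is illustrative only: the stimulus dimensions, drift rate, and the helper `shifted_corr` are assumptions made for the sketch. A contrast envelope drifts across freshly drawn binary noise, so raw luminance shows essentially no correlation along the drift direction, whereas a rectified (squared) version of the signal does.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_t = 256, 64          # hypothetical stimulus size: pixels x frames
x = np.arange(n_x)
mean_lum = 0.5              # constant mean luminance
drift = 2                   # envelope drift in pixels per frame

frames = np.empty((n_t, n_x))
for t in range(n_t):
    # Contrast envelope (0..1) drifting rightwards.
    envelope = 0.5 * (1 + np.sin(2 * np.pi * (x - drift * t) / 64))
    # Fresh binary noise on every frame, so the carrier itself does not move.
    carrier = rng.choice([-1.0, 1.0], size=n_x)
    frames[t] = mean_lum * (1 + envelope * carrier)

def shifted_corr(movie, shift):
    """Correlation between each frame and the next frame shifted back by `shift`."""
    a = movie[:-1].ravel()
    b = np.roll(movie[1:], -shift, axis=1).ravel()
    return float(np.corrcoef(a, b)[0, 1])

# Raw luminance deviations: no consistent displacement signal expected.
raw_corr = shifted_corr(frames - mean_lum, drift)
# Squaring the deviation recovers local contrast, which does drift.
rect_corr = shifted_corr((frames - mean_lum) ** 2, drift)
```

The squaring step stands in for the kind of nonlinearity that a separate second-order pathway would need before motion can be extracted; it is one possible choice among several.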
The aperture problem
Each neuron in the visual system is sensitive to visual input in a small part of the visual field, as if each neuron were looking at the visual field through a small window or aperture. Within such an aperture, the motion direction of a contour is ambiguous, because the motion component parallel to the contour cannot be inferred from the visual input. This means that a variety of contours of different orientations moving at different speeds can cause identical responses in a motion-sensitive neuron in the visual system.
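The ambiguity can be made concrete with a short Python sketch. The function `normal_component` and the example velocities are hypothetical; the sketch only assumes that a straight contour seen through an aperture reveals the velocity component perpendicular to it.

```python
import numpy as np

def normal_component(velocity, orientation_deg):
    """Velocity component perpendicular to a straight contour.

    orientation_deg: contour orientation measured from the horizontal axis.
    """
    theta = np.deg2rad(orientation_deg)
    normal = np.array([np.sin(theta), -np.cos(theta)])  # unit normal to the contour
    return float(np.dot(velocity, normal))

# A vertical contour (90 degrees) moving with three different true velocities.
# The velocities differ only in the component parallel to the contour.
candidates = [np.array([1.0, 0.0]), np.array([1.0, 2.0]), np.array([1.0, -5.0])]
measured = [normal_component(v, 90.0) for v in candidates]
```

All three velocities yield the same measured normal component, so a sensor with access only to this aperture cannot tell them apart.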
Individual neurons early in the visual system (V1) respond to motion that occurs locally within their receptive field. Because each local motion-detecting neuron will suffer from the aperture problem, the estimates from many neurons need to be integrated into a global motion estimate. This appears to occur in Area MT/V5 in the human visual cortex.
Having extracted motion signals (first- or second-order) from the retinal image, the visual system must integrate those individual local motion signals at various parts of the visual field into a 2-dimensional or global representation of moving objects and surfaces. Further processing is required to disambiguate true "global motion" direction.
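One simple way to combine ambiguous local measurements is to treat each as a linear constraint on the object's velocity and solve the resulting system by least squares. The Python sketch below illustrates that idea only; it is not a claim about how area MT/V5 performs the integration, and the function name `integrate_local_motion` and the example values are assumptions.

```python
import numpy as np

def integrate_local_motion(normals, normal_speeds):
    """Least-squares velocity consistent with several local measurements.

    Each measurement i constrains the velocity v by: normals[i] . v = normal_speeds[i].
    """
    normals = np.asarray(normals, dtype=float)
    normal_speeds = np.asarray(normal_speeds, dtype=float)
    velocity, *_ = np.linalg.lstsq(normals, normal_speeds, rcond=None)
    return velocity

# Hypothetical rigid object moving with this velocity.
true_velocity = np.array([3.0, 1.0])

# Three contours of different orientation, each seen through its own aperture.
orientations = np.deg2rad([30.0, 90.0, 150.0])
normals = np.column_stack([np.sin(orientations), -np.cos(orientations)])

# Each aperture reports only the speed along its contour's normal.
normal_speeds = normals @ true_velocity

estimate = integrate_local_motion(normals, normal_speeds)
```

With at least two contours of different orientation, the constraints intersect at a single velocity; with noisy measurements the least-squares solution gives a compromise estimate.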
Motion in depth
As in other aspects of vision, the observer's visual input is generally insufficient to determine the true nature of stimulus sources, in this case their velocity in the real world. In monocular vision for example, the visual input will be a 2D projection of a 3D scene. The motion cues present in the 2D projection will by default be insufficient to reconstruct the motion present in the 3D scene. Put differently, many 3D scenes will be compatible with a single 2D projection. The problem of motion estimation generalizes to binocular vision when we consider occlusion or motion perception at relatively large distances, where binocular disparity is a poor cue to depth. This fundamental difficulty is referred to as the inverse problem.
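The many-to-one mapping from 3D motion to 2D image motion can be illustrated with an idealised pinhole projection. The Python sketch below is a minimal example under that assumption; the function names, the focal length, and the two example trajectories are hypothetical.

```python
import numpy as np

def project(point, focal_length=1.0):
    """Idealised pinhole projection of a 3D point (X, Y, Z) onto the image plane."""
    X, Y, Z = point
    return np.array([focal_length * X / Z, focal_length * Y / Z])

def image_track(start, velocity, times):
    """Image positions of a point moving at constant 3D velocity."""
    start = np.asarray(start, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    return np.array([project(start + t * velocity) for t in times])

times = np.linspace(0.0, 1.0, 5)

# A near object moving slowly and a far object moving five times faster.
near_track = image_track([1.0, 0.5, 2.0], [0.4, 0.0, 0.2], times)
far_track = image_track([5.0, 2.5, 10.0], [2.0, 0.0, 1.0], times)

same_image_motion = np.allclose(near_track, far_track)
```

Because scaling both the distance and the velocity by the same factor leaves the projection unchanged, the two 3D motions produce identical image trajectories, and the image alone cannot distinguish them.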
Perceptual learning of motion
Detection and discrimination of motion can be improved by training, with long-lasting results. Participants trained to detect the movements of dots on a screen in only one direction become particularly good at detecting small movements in the directions around that in which they have been trained. This improvement was still present 10 weeks later. However, perceptual learning is highly specific. For example, the participants show no improvement when tested around other motion directions, or for other sorts of stimuli.
- Hess, Baker, Zihl (1989). "The "motion-blind" patient: low-level spatial and temporal filters". Journal of Neuroscience 9 (5): 1628–1640. PMID 2723744.
- Baker, Hess, Zihl (1991). "Residual motion perception in a "motion-blind" patient, assessed with limited-lifetime random dot stimuli". Journal of Neuroscience 11 (2): 454–461. PMID 1992012.
- Steinman, Pizlo & Pizlo (2000) Phi is not Beta slideshow based on ARVO presentation.
- Reichardt, W. (1961). "Autocorrelation, a principle for the evaluation of sensory information by the central nervous system". W.A. Rosenblith (Ed.) Sensory communication (MIT Press): 303–317.
- Adelson, E.H., & Bergen, J.R. (1985). "Spatiotemporal energy models for the perception of motion". J. Opt. Soc. Am. A 2 (2): 284–299. doi:10.1364/JOSAA.2.000284. PMID 3973762.
- van Santen, J.P., & Sperling, G. (1985). "Elaborated Reichardt detectors". J. Opt. Soc. Am. A 2 (2): 300–321. doi:10.1364/JOSAA.2.000300. PMID 3973763.
- Cavanagh, P & Mather, G (1989). "Motion: the long and short of it". Spatial vision 4 (2–3): 103–129. doi:10.1163/156856889X00077. PMID 2487159.
- Chubb, C & Sperling, G (1988). "Drift-balanced random stimuli: A general basis for studying non-Fourier motion perception". J. Opt. Soc. Am. A 5 (11): 1986–2007. doi:10.1364/JOSAA.5.001986.
- Nishida, S., Ledgeway, T. & Edwards, M. (1997). "Dual multiple-scale processing for motion in the human visual system". Vision Research 37 (19): 2685–2698. doi:10.1016/S0042-6989(97)00092-8. PMID 9373668.
- Ledgeway, T. & Smith, A.T. (1994). "The duration of the motion aftereffect following adaptation to first- and second-order motion". Perception 23 (10): 1211–1219. doi:10.1068/p231211. PMID 7899037.
- Ball, K. & Sekuler, R. (1982). "A specific and enduring improvement in visual motion discrimination". Science 218 (4573): 697–698.
- Hadad, B., Maurer, D. & Lewis, T. L. (2011). "Long trajectory for the development of sensitivity to global and biological motion". Developmental Science 14 (6): 1330–1339.