
Motion estimation

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by The Anome (talk | contribs) at 11:45, 11 July 2010 (References: Category:Estimation theory). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually between adjacent frames in a video sequence. It is an ill-posed problem because the motion occurs in three dimensions, whereas the images are projections of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image (global motion estimation) or to specific parts, such as rectangular blocks, arbitrarily shaped patches or even individual pixels. The motion vectors may be represented by a translational model or by other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.

Closely related to motion estimation is optical flow, where the vectors correspond to the perceived movement of pixels. In motion estimation, an exact 1:1 correspondence of pixel positions is not required.

Applying the motion vectors to an image to synthesise the transformation to the next image is called motion compensation. The combination of motion estimation and motion compensation is a key part of video compression as used by MPEG-1, MPEG-2 and MPEG-4, as well as many other video codecs.
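
As an illustration of motion compensation, the following minimal Python sketch predicts a frame by copying blocks from a reference frame according to per-block translational motion vectors. The function name `motion_compensate`, the block size, the vector convention (an offset into the reference frame) and the edge clamping are all assumptions made for this example; they are not taken from any particular codec.

```python
def motion_compensate(reference, vectors, block=2):
    """Predict a frame by copying blocks from a reference frame.

    reference: 2D list of pixel values (rows of equal length).
    vectors:   dict mapping (block_row, block_col) -> (dy, dx). Each pixel of
               that block in the predicted frame is read from the reference
               frame at an offset of (dy, dx) pixels. Missing blocks use (0, 0).
    """
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = vectors.get((y // block, x // block), (0, 0))
            # Clamp the source position so vectors that point outside the
            # frame reuse the nearest edge pixel (an assumption of this sketch).
            sy = min(max(y + dy, 0), height - 1)
            sx = min(max(x + dx, 0), width - 1)
            predicted[y][x] = reference[sy][sx]
    return predicted
```

In a compression setting, an encoder would typically store the motion vectors together with the difference between the predicted and the actual frame, rather than the full frame.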

Algorithms

The methods for finding motion vectors can be categorised into pixel-based methods ("direct") and feature-based methods ("indirect"). A well-known debate between proponents of the two approaches resulted in a pair of papers, one from each side, written in an attempt to reach a conclusion.[1][2]

Direct Methods

Evaluation Metrics

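Direct methods can use several evaluation metrics to score how well a candidate motion vector matches the image data, such as the sum of absolute differences (SAD) or the mean squared error (MSE).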
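
One common direct approach is block matching, in which a block of the current frame is compared against candidate positions in a reference frame and the candidate with the lowest matching cost is kept. The sketch below uses the sum of absolute differences (SAD) as the metric and an exhaustive search over a small window. The names `sad` and `best_vector`, the block size and the search range are illustrative assumptions, and ties are broken by scan order.

```python
def sad(current, reference, cy, cx, ry, rx, block):
    """Sum of absolute differences between two block x block patches."""
    return sum(
        abs(current[cy + i][cx + j] - reference[ry + i][rx + j])
        for i in range(block)
        for j in range(block)
    )


def best_vector(current, reference, top, left, block=2, search=1):
    """Exhaustively search +/- `search` pixels for the lowest-SAD match.

    Returns ((dy, dx), cost), where (dy, dx) is the offset into the
    reference frame of the best-matching block.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            # Skip candidates that would fall partly outside the reference.
            if ry < 0 or rx < 0 or ry + block > height or rx + block > width:
                continue
            cost = sad(current, reference, top, left, ry, rx, block)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1], best[0]
```

Practical encoders usually replace the exhaustive search with faster search patterns, but the role of the metric stays the same.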

Indirect Methods

Indirect methods use features, such as Harris corners, and match corresponding features between frames, usually with a statistical function applied over a local or global area. The purpose of the statistical function is to remove matches that do not correspond to the actual motion.

Statistical functions that have been successfully used include RANSAC.
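
The outlier-rejection idea can be sketched with RANSAC applied to the simplest possible model, a single global translation. The example assumes that feature detection and matching have already produced a list of point pairs; the function name `ransac_translation`, the threshold, the iteration count and the fixed random seed are assumptions chosen for illustration.

```python
import random


def ransac_translation(matches, threshold=1.0, iterations=50, seed=0):
    """Estimate a global (dx, dy) translation from noisy feature matches.

    matches: non-empty list of ((x0, y0), (x1, y1)) pairs giving a feature's
             position in the previous and the current frame.
    Returns the refined translation and the list of inlier matches.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A pure translation has two unknowns, so one match is a minimal sample.
        (x0, y0), (x1, y1) = rng.choice(matches)
        dx, dy = x1 - x0, y1 - y0
        # Matches whose displacement agrees with the hypothesis are inliers.
        inliers = [
            m for m in matches
            if abs((m[1][0] - m[0][0]) - dx) <= threshold
            and abs((m[1][1] - m[0][1]) - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier displacements.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers
```

Matches that do not agree with the dominant displacement are left out of the consensus set, so they do not influence the final estimate. Richer motion models, such as affine or projective ones, need more matches per sample but follow the same pattern.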

References

  1. ^ Philip H.S. Torr and Andrew Zisserman: Feature Based Methods for Structure and Motion Estimation, ICCV Workshop on Vision Algorithms, pages 278-294, 1999.
  2. ^ Michal Irani and P. Anandan: About Direct Methods, ICCV Workshop on Vision Algorithms, pages 267-277, 1999.