In computer vision and computer graphics, 3D reconstruction is the process of capturing the shape and appearance of real objects. This process can be accomplished either by active or passive methods. If the model is allowed to change its shape in time, this is referred to as non-rigid or spatio-temporal reconstruction.
Motivation and applications
3D reconstruction has long been a difficult and actively researched problem. Using 3D reconstruction one can determine any object’s 3D profile, as well as the 3D coordinates of any point on the profile. The 3D reconstruction of objects is a general scientific problem and a core technology in a wide variety of fields, such as Computer Aided Geometric Design (CAGD), Computer Graphics, Computer Animation, Computer Vision, medical imaging, computational science, Virtual Reality and digital media. For instance, lesion information of patients can be presented in 3D on the computer, which offers a new and accurate approach to diagnosis and thus has vital clinical value.
Problem statement and basics
Binocular Stereo Vision acquires an object’s 3D geometric information on the basis of visual disparity. The figure provides a simple schematic diagram of horizontally sighted Binocular Stereo Vision, where b is the baseline between the projective centers of the two cameras.
The origin of the camera’s coordinate system is at the optical center of the camera’s lens, as shown in the figure. In reality, the camera’s image plane lies behind the optical center of the lens. However, to simplify the calculation, images are drawn in front of the optical center at a distance f. The u-axis and v-axis of the image’s coordinate system O1uv point in the same directions as the x-axis and y-axis of the camera’s coordinate system respectively. The origin of the image’s coordinate system is located at the intersection of the imaging plane and the optical axis. Suppose a world point P has corresponding image points P1(u1,v1) and P2(u2,v2) on the left and right image planes respectively. Assuming the two cameras lie in the same plane, the vertical coordinates of P1 and P2 are identical, i.e. v1=v2. According to trigonometric relations,

$$u_1 = f\,\frac{x_p}{z_p}, \qquad u_2 = f\,\frac{x_p - b}{z_p}, \qquad v_1 = v_2 = f\,\frac{y_p}{z_p},$$

where (xp, yp, zp) are the coordinates of P in the left camera’s coordinate system and f is the focal length of the camera. Visual disparity is defined as the difference in image point location of a given world point as acquired by the two cameras,

$$d = u_1 - u_2 = \frac{f\,b}{z_p},$$

based on which the coordinates of P can be worked out:

$$x_p = \frac{b\,u_1}{d}, \qquad y_p = \frac{b\,v_1}{d}, \qquad z_p = \frac{b\,f}{d}.$$
Therefore, once the coordinates of the image points are known, together with the parameters of the two cameras, the 3D coordinates of the point can be determined.
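The disparity relations of this section can be illustrated with a short Python sketch. It assumes the simplified parallel-axis geometry described above, with the right camera displaced by the baseline b along the x-axis, so that u1 = f·xp/zp, u2 = f·(xp − b)/zp and v1 = v2 = f·yp/zp. The function names `project` and `triangulate` and the numeric values are illustrative only.

```python
def project(point, f, b):
    """Project a point given in the left camera frame onto both image planes."""
    x, y, z = point
    u1 = f * x / z          # left image, horizontal coordinate
    u2 = f * (x - b) / z    # right image, camera shifted by the baseline b
    v = f * y / z           # identical in both images (v1 = v2)
    return (u1, v), (u2, v)


def triangulate(p1, p2, f, b):
    """Recover (x, y, z) in the left camera frame from matched image points."""
    (u1, v1), (u2, _) = p1, p2
    d = u1 - u2  # visual disparity
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return (b * u1 / d, b * v1 / d, b * f / d)


# Round trip with made-up values: f = 8 mm, b = 12 cm, point 2 m away.
p1, p2 = project((0.2, -0.1, 2.0), f=0.008, b=0.12)
recovered = triangulate(p1, p2, f=0.008, b=0.12)
print(recovered)
```

The round trip recovers the original point up to floating-point error. Because depth is inversely proportional to disparity, a small error in d produces a large depth error for distant points.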
3D reconstruction consists of the following steps:
2D digital image acquisition is the information source of 3D reconstruction. 3D reconstruction is commonly based on two or more images, although it may sometimes employ only a single image. There are various methods of image acquisition, depending on the occasion and purpose of the specific application. Not only must the requirements of the application be met, but the visual disparity, illumination, camera performance and features of the scene should also be considered.
Camera calibration in Binocular Stereo Vision refers to determining the mapping between the image points P1(u1,v1) and P2(u2,v2) and the space coordinate P(xp, yp, zp) in the 3D scene. Camera calibration is a basic and essential part of 3D reconstruction via Binocular Stereo Vision.
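As a rough illustration of what calibration estimates, the sketch below fits only two parameters, the focal length f and the baseline b, from reference points whose 3D coordinates are known. It assumes the idealized parallel-axis model u1 = f·xp/zp, u2 = f·(xp − b)/zp, v1 = f·yp/zp. Practical calibration also estimates lens distortion, the principal point and the relative pose of the cameras. The function name `calibrate` and the synthetic data are hypothetical.

```python
def calibrate(points, left_obs, right_obs):
    """Least-squares estimate of (f, b) from known 3D points and their image points."""
    # u1 = f * (x / z) and v1 = f * (y / z): fit the slope f through the origin.
    num = den = 0.0
    for (x, y, z), (u1, v1) in zip(points, left_obs):
        num += u1 * (x / z) + v1 * (y / z)
        den += (x / z) ** 2 + (y / z) ** 2
    f = num / den
    # u1 - u2 = (f * b) * (1 / z): fit the slope f*b, then divide by f.
    num = den = 0.0
    for (_, _, z), (u1, _), (u2, _) in zip(points, left_obs, right_obs):
        num += (u1 - u2) / z
        den += (1.0 / z) ** 2
    return f, (num / den) / f


# Synthetic check: image points generated with made-up values f = 0.008, b = 0.12.
true_f, true_b = 0.008, 0.12
points = [(0.2, -0.1, 2.0), (-0.3, 0.25, 1.5), (0.1, 0.4, 3.0)]
left_obs = [(true_f * x / z, true_f * y / z) for x, y, z in points]
right_obs = [(true_f * (x - true_b) / z, true_f * y / z) for x, y, z in points]
f_est, b_est = calibrate(points, left_obs, right_obs)
print(f_est, b_est)
```

The reference points must not all lie on the optical axis, otherwise the first fit has a zero denominator.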
The aim of feature extraction is to obtain the characteristics of the images on which stereo correspondence operates. As a result, the characteristics chosen are closely linked to the choice of matching method. There is no universally applicable theory of feature extraction, which has led to a great diversity of stereo correspondence approaches in Binocular Stereo Vision research.
Stereo correspondence establishes the correspondence between primitives in the images, i.e. it matches P1(u1,v1) and P2(u2,v2) from the two images. Interference factors in the scene, e.g. illumination, noise and the physical characteristics of the surface, should be taken into account.
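One simple way to establish correspondence is window-based block matching, sketched below for a single pair of scanlines. This is one technique among many and is not prescribed by the text. It assumes the two images are already aligned so that matching points share the same row (v1 = v2), and it uses the sum of absolute differences (SAD) as the matching cost. The function name `match_scanline` and its parameters are illustrative.

```python
def match_scanline(left, right, window=2, max_disparity=8):
    """Return, per left-image pixel, the disparity with the lowest SAD cost."""
    n = len(left)
    disparities = [0] * n
    for u in range(window, n - window):
        best_cost, best_d = float("inf"), 0
        for d in range(0, max_disparity + 1):
            if u - d - window < 0:
                break  # candidate window would leave the right image
            # Compare the window around u in the left row with the window
            # around u - d in the right row (disparity d = u1 - u2 >= 0).
            cost = sum(abs(left[u + k] - right[u - d + k])
                       for k in range(-window, window + 1))
            if cost < best_cost:
                best_cost, best_d = cost, d
        disparities[u] = best_d
    return disparities


# Toy rows: the intensity bump in the left row sits 3 pixels to the right
# of the same bump in the right row.
right_row = [0, 0, 0, 0, 0, 10, 50, 90, 40, 5, 0, 0, 0, 0, 0, 0]
left_row = [0, 0, 0] + right_row[:-3]
disparity_row = match_scanline(left_row, right_row)
print(disparity_row[10])
```

In textureless stretches every candidate has the same cost, so the result there is arbitrary (here it falls back to the first candidate). This is one reason the interference factors mentioned above matter in practice.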
Given precise correspondences and the camera location parameters, 3D geometric information can be recovered directly. Because the accuracy of 3D reconstruction depends on the precision of the correspondence, the error in the camera location parameters and so on, the preceding steps must be carried out carefully to achieve a relatively accurate 3D reconstruction.
Active methods
Active methods, i.e. range data methods, start from a depth map, reconstruct the 3D profile by numerical approximation, and build a model of the object in the scene. These methods actively interfere with the reconstructed object, either mechanically or radiometrically using rangefinders, in order to acquire the depth map, e.g. with structured light, laser range finders and other active sensing techniques. A simple example of a mechanical method would use a depth gauge to measure the distance to a rotating object placed on a turntable. More widely applicable radiometric methods emit radiance towards the object and then measure the reflected part. Examples range from moving light sources and colored visible light to time-of-flight lasers, microwaves and ultrasound. See 3D scanning for more details.
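For the time-of-flight case, the range follows from the round-trip travel time of the emitted signal: the signal covers the distance twice, so the distance is speed × time / 2. A minimal sketch, with a hypothetical function name and made-up timings:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, in vacuum


def tof_distance(round_trip_seconds, speed=SPEED_OF_LIGHT):
    """Distance to the surface from the round-trip time of an emitted pulse."""
    return speed * round_trip_seconds / 2.0


# A laser pulse returning after 20 ns corresponds to roughly 3 m.
laser_range = tof_distance(20e-9)
# The same formula applies to ultrasound; 343 m/s is an assumed speed of sound in air.
sonar_range = tof_distance(0.01, speed=343.0)
print(laser_range, sonar_range)
```

Repeating such a measurement over a grid of directions yields the depth map that active methods start from.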
Passive methods
Passive methods of 3D reconstruction do not interfere with the reconstructed object; they only use a sensor to measure the radiance reflected or emitted by the object's surface to infer its 3D structure through image understanding. Typically, the sensor is an image sensor in a camera sensitive to visible light, and the input to the method is a set of digital images (one, two or more) or video. In this case one speaks of image-based reconstruction, and the output is a 3D model. Compared to active methods, passive methods can be applied to a wider range of situations.
Monocular cues methods
Monocular cues methods use one or more images from a single viewpoint (camera) to carry out 3D reconstruction. They make use of 2D characteristics (e.g. silhouettes, shading and texture) to measure 3D shape, which is why they are also named Shape-From-X, where X can be silhouettes, shading, texture, etc. 3D reconstruction through monocular cues is simple and quick; only one appropriate digital image is needed, so a single camera is adequate. Technically, it avoids stereo correspondence, which is fairly complex.
Photometric stereo. This approach extends shape-from-shading. Images taken under different lighting conditions are used to solve for the depth information. Note that this method requires more than one image.
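The idea can be sketched for a single pixel under assumptions the text does not spell out: a Lambertian (perfectly diffuse) surface, known distant light directions, and no shadows or highlights. Under that model the observed intensity is I = albedo · (l · n) for light direction l and unit normal n, so three or more images give a linear system for the scaled normal. The function name `photometric_stereo` and the synthetic values are illustrative.

```python
import numpy as np


def photometric_stereo(intensities, light_dirs):
    """Estimate the unit normal and albedo of one pixel from k >= 3 lit images.

    intensities: shape (k,), pixel brightness under each light
    light_dirs:  shape (k, 3), unit vectors pointing towards each light
    """
    # Solve light_dirs @ g = intensities in the least-squares sense,
    # where g = albedo * normal.
    g, *_ = np.linalg.lstsq(np.asarray(light_dirs, dtype=float),
                            np.asarray(intensities, dtype=float), rcond=None)
    albedo = np.linalg.norm(g)
    return g / albedo, albedo


# Synthetic pixel: made-up normal and albedo, three non-coplanar lights.
true_normal = np.array([0.2, -0.1, 1.0])
true_normal /= np.linalg.norm(true_normal)
lights = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
observed = 0.7 * lights @ true_normal
normal, albedo = photometric_stereo(observed, lights)
print(normal, albedo)
```

Applying this per pixel gives a normal map, which still has to be integrated to obtain depth. The light directions must not be coplanar, and a pixel with zero albedo would need special handling.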
Shape-from-texture. Suppose an object has a smooth surface covered by repeated texture units; its projection from 3D to 2D then causes distortion and perspective effects. The distortion and perspective measured in the 2D images provide the cue for inversely solving for the depth or normal information of the object's surface.
Binocular stereo vision
Binocular Stereo Vision obtains the 3-dimensional geometric information of an object from multiple images, drawing on research into the human visual system. The results are presented in the form of depth maps. Images of an object acquired by two cameras simultaneously from different viewing angles, or by one single camera at different times from different viewing angles, are used to restore its 3D geometric information and reconstruct its 3D profile and location. This is more direct than monocular methods such as shape-from-shading.
The binocular stereo vision method requires two identical cameras with parallel optical axes observing the same object, acquiring two images from different points of view. Using trigonometric relations, depth information can be calculated from disparity. The method is well developed and yields stable 3D reconstructions, and it is reported to perform better than other 3D reconstruction methods. Unfortunately, it is computationally intensive, and it performs rather poorly when the baseline distance is large.
See also
- 3D modeling
- 3D data acquisition and object reconstruction
- 3D reconstruction from multiple images
- 3D scanner
- 4D reconstruction
- Depth map