User:Athlas7/sandbox
Cascaded Hough Transform
[edit]The Cascaded Hough Transform (CHT) is a technique used in image analysis, computer vision, and digital image processing for extracting several types of fixed structures in images. Examples of such features are line extraction and vanishing point detection. It works by applying several Hough transforms in a consecutive fashion, where the output of one transform becomes the input of the second, and so on, thus having the name "cascaded".[1]
Theory
[edit]A traditional Hough transform detects straight lines using a slope-intercept parametric representation. For any given line , the edge pixels are detected by a voting mechanism in the parameter space. Thus, every point of an edge in image space can be represented as the line in parameter space. A certain degree of symmetry can be observed in this translation from image space to parameter space: the parameters a and b are to image space (x, y) what x and y are to parameter space (a, b). It follows that a line in one space can be detected as points in the other and, vice versa, for every point in one space there is a corresponding line in the other. This symmetry forms the basis of the cascaded Hough transform.
Indeed, one can apply a second Hough transform to a subset of the result of the first and go from parameter space back to image space.
Applying a Hough transform to an image can be used to detect lines. In the resulting (a, b) parameter space only the dominant peaks are kept. A second Hough transform can then applied to this filtered result, that can be used to detect lines of collinear peaks in the parameter space. As the result goes back to image space, the positions of these peaks correspond to vertices where several lines in the original image intersect. An important special case of such vertices are vanishing points. Yet another Hough transform, applied to the peaks of the second, can be used to detect collinear vertices. In image space, such vertices correspond to vanishing lines (e.g. the horizon line) which consist of several vanishing points that have been detected by the previous Hough transform.
The table below summarises the resulting observations from each of the applied Hough transforms (shown as layers):
Layer | Meaning of Detected Features |
---|---|
Layer 0 | The original image |
Layer 1 | points ~ lines lines ~ convergent lines |
Layer 2 | points ~ intersection points lines ~ collinear intersection points |
Layer 3 | points ~ lines of intersection points |
Implementation
[edit]Parameterisation of the Spaces
[edit]The main drawback of having the slope-intercept parameterisation of lines is the unboundedness of the parameter space. This drawback is inherent from the implementation of the original Hough transform. An often applied solution to the issue in the original Hough transform is to use the polar coordinates as parameters, where is the orthogonal distance of the line to the origin, and is the direction of the normal to the line.[2] This gives a bounded parameter space, but breaks the symmetry between the image and parameter spaces, which cripples the key principle which CHT relies on.
To preserve the symmetry between the two spaces and obtain a bounded parameter space, the original (a, b) parameter space can be split into three bounded subspaces:
- The first subspace has coordinates (a, b) but only applies to lines where and .
- The second subspace has coordinates .
- The third subspace has coordinates .
Thus, if and , the point (a, b) shows up in the second subspace, with its coordinates. In the case where and , the point shows up in the third subspace. The resulting three subspaces are restricted to the interval [-1, 1]. A point in the original (x, y) image space still transforms into a line in each of the three subspaces. The resulting subspaces are shown in the image below:
However, the original (x, y) image space is also unbounded (e.g. vanishing points can lie outside the image size boundaries, or at infinity). To overcome the issue, and adjust the space according to the new representation of the parameter space, the image space is also split in a similar fashion into three subspaces having the following coordinates , , and . For practical reasons, the image is also rescaled so that it fits entirely within the first subspace. This allows the subspace (x, y) to correspond to the image itself.
Filtering
[edit]At each stage of the CHT, appropriate data handling can be done before passing on the result to the next Hough transform. The following aspects should be considered:
- which data should be eliminated for consideration by subsequent layers
- which data emerging from a layer should be considered as important
- which data should be passed on to the next Hough transform
Normally, only peaks, i.e. relatively high concentration of votes, are considered at each stage of the CHT. To better estimate the significance of the peaks, an estimate of the tangential line is made locally around the edge pixels, and only a small area around the corresponding (a, b) values is introduced in lieu of a complete line. The strength of the peak will still reflect the length of the line in the image.
At each layer, only truly "non-accidental" structures are read out. Such structures depend on the layer that is being considered, are correspond to the detected features as shown in the table above.
Examples
[edit]References
[edit]- ^ "The Cascaded Hough Transform as Support for Grouping and Finding Vanishing Points and Lines" (PDF). Tinne Tuytelaars, Marc Proesmans, and Luc Van Gool.
{{cite journal}}
: Cite journal requires|journal=
(help) - ^ "Use of the Hough Transformation to Detect Lines and Curves in Pictures" (PDF). Richard O. Duda, Peter E. Hart. 1971.
{{cite journal}}
: Cite journal requires|journal=
(help)