Talk:Hough transform


Line from a single point??

I understand almost everything about this article, but one big hangup that I'm not finding an explanation for is that the text currently says "For each pixel and its neighborhood, the Hough transform algorithm determines if there is enough evidence of an edge at that pixel. If so, it will calculate the parameters of that line, and then look for the accumulator's bin that the parameters fall into, and increase the value of that bin." But it seems to skip the step between identifying an edge point and calculating the parameters of a line. I'm brand new at this, but for my part, I'm searching for edge points by comparing each pixel with its right, lower, and lower-right neighbors, looking for significant changes. This identifies a single point, but gives no direction of the edge.
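For concreteness, the neighbor-comparison edge test described in the comment above might look like the following minimal sketch. The function name, image representation (a 2-D list of grayscale values), and threshold are illustrative assumptions, not anything from the article; note that, as the comment says, this finds edge points but not edge directions.

```python
def edge_points(image, threshold=50):
    """Return (row, col) pairs where the pixel differs significantly
    from its right, lower, or lower-right neighbor.
    `image` is a 2-D list of grayscale values; `threshold` is an
    illustrative sensitivity setting."""
    points = []
    rows, cols = len(image), len(image[0])
    for i in range(rows - 1):
        for j in range(cols - 1):
            p = image[i][j]
            if (abs(p - image[i][j + 1]) > threshold or      # right neighbor
                    abs(p - image[i + 1][j]) > threshold or  # lower neighbor
                    abs(p - image[i + 1][j + 1]) > threshold):  # lower-right
                points.append((i, j))
    return points

# A tiny image with a vertical intensity step between columns 1 and 2:
img = [[0, 0, 255, 255],
       [0, 0, 255, 255],
       [0, 0, 255, 255]]
print(edge_points(img))  # [(0, 1), (1, 1)]
```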

Now, later in the article, an improvement of the algorithm is mentioned taking gradient direction into account, so if this is the reference, it seems out of order.

Also, I could imagine that "For each pixel and its neighborhood" refers to neighboring pixels within some distance, with line parameters calculated between each pair of nearby points, but such neighbor-searching seems wasteful and against the apparent non-locality benefits of the method.

Also, more discussion on "finding concurrent curves" would be helpful.

Also, the "Rendering of Transform Results" image deserves more discussion. I've been staring at it for a while and fail to understand the visual relationship between it and the original image. —Preceding unsigned comment added by 12.52.198.248 (talk) 16:29, 13 May 2010 (UTC)

Something incorrect

I think the "O(A^(m-2)), where A is the image space and m is the number of parameters (Shapiro and Stockman, 310)" is incorrect; it should be "O(A^(m+2))", since the complexity increases with the number of parameters.

The complexity is not O(A^(m-2)) or O(A^(m+2)) at all as far as I know. The complexity is a function of the number of edge points (Ne), the number of parameters (p) and the discretization level of each parameter (say n for each parameter) in the parameter space, so it is something like O(Ne·n^p) — Preceding unsigned comment added by KgpianAsimov (talkcontribs) 15:57, 1 November 2011 (UTC)
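To put illustrative numbers on the counting argument above: with p parameters each discretized into n bins, the accumulator has n^p cells, and each edge point votes along a surface of roughly n^(p-1) cells (a slightly tighter count than the O(Ne·n^p) bound mentioned above, which also covers the per-cell work). All values below are made up for illustration.

```python
# Back-of-the-envelope cost of a Hough accumulator.
# Illustrative numbers, not from any reference.
Ne = 5000  # number of edge points
n = 180    # discretization bins per parameter
p = 2      # number of parameters (e.g. r and theta for lines)

accumulator_cells = n ** p          # memory: cells in parameter space
votes_cast = Ne * n ** (p - 1)      # work: one vote curve per edge point

print(accumulator_cells, votes_cast)  # 32400 900000
```

Going from lines (p = 2) to circles (p = 3) multiplies both figures by another factor of n, which is why the comment stresses that complexity grows with the number of parameters.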


The following sentence isn't 100% correct: "The parameter r represents the smallest distance between the line and the .." In order to represent all possible lines, r can be negative as well as positive; therefore it can't be a distance, as distance is always non-negative. (Note: a few sentences later it is written that r belongs to R, and therefore can be negative.)

It could probably be fixed by some other term, like 'signed distance' or something similar.


Oh, and while we're at it, the term "smallest distance" is also incorrect. The distance between a line and a point is the smallest possible by its definition! The 'smallest' is only relevant if you define the distance in the following way:

Distance(Point, Line) = Min { D(Point, q) | q belongs to Line }

It seems that nobody wants to add my proposed change, so I've changed it myself.



Hough's 1962 patent teaches a transform that uses the slope-intercept parametrization of straight lines. This is computationally very unwieldy: the transform space is unbounded because the slope can go to infinity.

Azriel Rosenfeld, in his 1969 text Picture Processing by Computer (Academic Press, New York, 1969), noted this disadvantage and suggested possibly computing every transform twice, with X- and Y-axes interchanged.

In 1972 Richard Duda and I proposed instead using the rho-theta parametrization of straight lines. This now-universally-used new transform has the great advantage of leading to a bounded transform space.

See

Duda, R. O. and P. E. Hart, "Use of the Hough Transformation to Detect Lines and Curves in Pictures," Comm. ACM, Vol. 15, pp. 1-15 (January 1972)

and also

Duda, R. O. and P. E. Hart, Pattern Classification and Scene Analysis, pp. 335-337 (John Wiley & Sons, New York, NY, 1973)


As a matter of further historical interest, this same rho-theta parametrization is used in the specialized field of integral geometry for analyzing the probability of 'random' geometric events, such as "What's the probability that a line thrown on a plane 'at random' will intersect a unit circle?"


NOTE: the link to Peter Hart links to a historian; it appears the target of the link has changed.



There are three flavors of Hough transform. The first (original) was used to detect straight lines in bubble chamber images. The technique parameterizes pattern space, then projects all points of interest (typically edges found via Canny edge detection) into the parameter space. Each point votes for every possible shape parameterization it could belong to. The resulting accumulator array is then thresholded and non-maxima suppressed, keeping only local maxima. Indices of the bins in the accumulator correspond to shape parameterizations. By looking at the bins still populated after thresholding and non-maxima suppression, it is possible to extract the parameters of all shapes of a given type in an image.
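The voting step described above can be sketched in a few lines using the normal parameterization r = x·cos(theta) + y·sin(theta). The function name, point set, and coarse bin count are illustrative; a real implementation would start from edge pixels (e.g. Canny output) and then threshold and non-maximum-suppress the accumulator as the comment describes.

```python
import math

def hough_lines(points, n_theta=180):
    """Each point votes for every (theta_index, r) bin of a line that
    could pass through it; returns the accumulator as a dict."""
    acc = {}
    for (x, y) in points:
        for t in range(n_theta):
            theta = t * math.pi / n_theta
            r = round(x * math.cos(theta) + y * math.sin(theta))
            acc[(t, r)] = acc.get((t, r), 0) + 1
    return acc

# Ten points on the horizontal line y = 2, with 4 theta bins
# (0, 45, 90, 135 degrees): the winning bin is theta index 2
# (90 degrees) with r = 2, collecting one vote per point.
pts = [(x, 2) for x in range(10)]
acc = hough_lines(pts, n_theta=4)
best = max(acc, key=acc.get)
print(best, acc[best])  # (2, 2) 10
```

A dict accumulator is used here only to keep the sketch short; production code would use a dense 2-D array so that thresholding and non-maximum suppression are cheap.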

Chadman8000 (talk) 02:27, 5 March 2009 (UTC) chad miller

A note on revisions

When rewriting an article, I think that it is a good rule to keep what is good in a previous version. I must say that I like the new figure, which is illustrative. After the most recent changes in the text, however, I'm sorry to say that the article has lost quite a bit of expressing the ideas and properties of a Hough transform for an unfamiliar reader. Tpl 16:56, 11 March 2007 (UTC)


But what is the Hough transform?

I have read this article and I am none the wiser. It needs a mathematical definition of the transform as well as the waffle about what it is used for. Given an image (say a function on the plane) I presume the Hough transform is a function in Hough space. What function is it? Also what are concurrent curves? Billlion 11:47, 15 June 2007 (UTC).

The theory section reads like a bad answer scribbled in haste to an exam question. 5/10! Can someone who knows what a Hough transform is and can write clear mathematical prose please clean it up. Billlion (talk) 12:29, 25 November 2007 (UTC)
Agreed. I had to read through it repeatedly for hours, and I still don't get it. 202.188.138.63 (talk) 14:58, 25 November 2007 (UTC)
Having just read a few papers, it transpires that the Hough transform in its simplest form is simply the discretized Radon transform of a binary image. This is usually thresholded so that the largest values in the sinogram are taken to indicate lines in the original image. That is, the points that map into high-value pixels (or groups of nearby pixels) in the sinogram space are considered to lie on straight lines. In some sense it is not really a transform in the sense of a Radon transform or Fourier transform; it is a method of using the Radon transform to identify lines in the image. Unless anyone feels like having a go before I get around to it, I suggest editing the section to reflect this point of view. This editorial raises the issue of disagreement over whether it is really a transform [1], for example. While I am at it, we need to get rid of the high-school style y=mx+c line parameterization! Billlion (talk) 16:45, 25 November 2007 (UTC)
After the first comments by user Billion, I have made a quick rewrite of the article in order to hopefully make it more readable. As user Billion points out, there are close similarities to the Radon transform. The Hough transform, however, has a long history of being used in image processing, image analysis and computer vision, and I would therefore not advise smashing this article into a specific instance of the Radon transform -- this would leave readers from the image processing, image analysis and computer vision communities completely lost unless they already know about the Radon transform and its relations to the Hough transform. Moreover, the Radon transform does not capture the detection of circles or ellipses, which are also very common applications of the Hough transform. Although the Wikipedia article on the Hough transform is (or at least was) not very well written, it (at least roughly) reflects the notions of how the Hough transform is taught in image processing courses. Although I have not personally written about the specific topics of circle and ellipse detection, I say that it would be valuable to extend this article with these notions. Tpl (talk) 17:52, 25 November 2007 (UTC)
Did our edits cross over? You seem to have reverted some of my edits to the high school maths, including inconsistent use of math markup and the spurious (m,c) parameterization. I agree that it should be accessible to the image processing community, but that is no reason to make it sound obscure and amateurish to the general but mathematically literate audience. Billlion (talk) 19:16, 25 November 2007 (UTC)
Possibly, our changes did cross over (sorry if I did not notice these). Concerning the style of presentation, I have no aims to make things sound obscure or amateurish, rather the contrary. Concerning the math description, I did actually make an attempt to clean it up a long time (a year?) ago, but I was overruled by what I perceived as more energetic people who made the text too talkative for my taste. However, I find it important not to make the explanation more complicated than it needs to be, and in my view you don't need to introduce the Radon transform as a prerequisite to describing how the Hough transform is used in the image processing community for detecting lines, circles or ellipses. Tpl (talk) 07:36, 26 November 2007 (UTC)
I think it's just the old stuff accidentally left in by our edit crossover that is obscure. I will look it over and see if I can weave in my improvements, taking note of what you say. I wonder if the mx+c notation should be included later, as it is largely of historical interest now, having introduced the sinogram coordinates and explained what the Radon transform is. Billlion (talk) 09:10, 26 November 2007 (UTC)

I think it is much easier for someone to understand how the Hough transform works in its simplest and original form using the y=mx+c parameterization, and then build upon that to explain r-theta and possibly make the connection with the Radon transform. Having a solid mathematical foundation for the Hough transform (or any other transform) is a good thing, but please do not forget that Wikipedia is not only for academia. —Preceding unsigned comment added by 217.154.24.242 (talk) 14:43, 1 April 2009 (UTC)


On Illustration

I'm no Hough Transform expert, but it seems that the way the algorithm discussed here works is basically equivalent to the well-known geometric property: given a point X and a line L, all circles having a diameter with one end at X and the other end on L will intersect L again at the point of L nearest to X. This perhaps says nothing new about the algorithm, and the circles may still need to be transformed into sine curves in the (angle, radius) space -- the high-accumulation points can be located without doing this last step, but perhaps there are computational or accuracy reasons for doing so?

My point is, though, that perhaps the intersecting circles explanation is easier to grasp than jumping straight to the intersecting sine waves. The fact of the intersections and their location and relation to the line in question seems immediate, and may suggest limitations or extensions of the algorithm. KRD —Preceding unsigned comment added by 75.109.202.185 (talk) 23:04, 22 June 2009 (UTC)
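The geometric property KRD describes is easy to check numerically. In the sketch below, the line L is taken to be the x-axis and X = (3, 4), so the nearest point of L to X is F = (3, 0); for several choices of the other diameter endpoint P on L, the circle with diameter X-P is verified to pass through F. The coordinates are illustrative.

```python
import math

X = (3.0, 4.0)
F = (3.0, 0.0)  # foot of the perpendicular from X to the x-axis

for px in [-5.0, 0.0, 3.0, 7.0, 42.0]:
    P = (px, 0.0)                                   # second diameter endpoint, on L
    center = ((X[0] + P[0]) / 2, (X[1] + P[1]) / 2)  # midpoint of the diameter X-P
    radius = math.dist(X, P) / 2
    # F should lie on every such circle:
    assert abs(math.dist(center, F) - radius) < 1e-9

print("F lies on every circle with diameter from X to a point of L")
```

The reason is the classical angle-in-a-semicircle theorem: any point on a circle sees a diameter at a right angle, so the circle's second crossing of L must be the foot of the perpendicular from X.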

In its simplest form the Hough Transform is a way for all of the thresholded edge points of an image to vote for all of the possible lines upon which those points could lie. In doing this in parameter space the accumulated votes then point the way to the dominant lines of the image. 69.14.30.77 (talk) 21:23, 17 December 2010 (UTC) L.M. Oberdier

Rename it: Duda-Hart-Hough Transform

I would like to propose that we rename the title, and usage within the page, to be the Duda-Hart-Hough Transform, and that we encourage the community to adopt this name.

The rationale is that Duda and Hart made a seminal contribution and deserve co-authorship. The original formulation by Hough had a defect; Duda and Hart fixed the defect, and as the article notes, their formulation is now the one that is universally used.

There is certainly precedent for this kind of naming, e.g. the Cocke-Kasami-Younger algorithm for parsing.

Anyone who knows it by the name of Hough Transform will instantly recognize the referent, so it should not cause any confusion. —Preceding unsigned comment added by Gordonnovak (talkcontribs) 21:16, 8 January 2010 (UTC)

It's not Wikipedia's job to determine what the name should be. We should use the name reliable sources are using. And it seems they are using the name “Hough transform”, so we should use that. Svick (talk) 11:46, 25 November 2010 (UTC)

What is the Hough transform?

I understand why one of the readers is not satisfied with the theory presented. I shall try to explain what is missing from most accounts of the subject.

If you have a segment of a straight line in an image and you want to detect this segment automatically, you may want to count (in a discrete binary setting, for simplicity) all the pixels that lie on this line segment, with the usual provisos for resolution, differences between ideal and real lines, etc. The number of black pixels that belong to this segment then gives you evidence of the presence of the segment, provided you have an idea of a threshold for the pixel count that constitutes a line segment for you. There are various ways of accomplishing this counting; for instance, you can take a linear template and move it all over your image, varying slopes and intercepts and all the while counting how many pixels in the image coincide with pixels on your template. No matter how you do it, the net result is the same: it is the pixel count. If you formalize this process, you realize that what is being calculated is the cross-correlation

C(p) = ∑_{i,j} Image(i,j) · Template(i,j,p),

where p is the set of parameters that define the position of the moving template (such as slope and intercept). It is easy to see that if the black pixels in the image and template coincide, Image(i,j) = Template(i,j,p) = 1, thus contributing to the count C(p). In a continuous setting this would correspond to an integral, and the process applies to any template, not necessarily linear. The Cauchy-Schwarz inequality guarantees that the maximum of the cross-correlation is reached when the image and the template are proportional (which in the binary case means the same).

Now, if you use the equation of a straight line in the form ax + by + 1 = 0, where (x,y) are running Cartesian coordinates, you can see that the parameters (a,b) that define a specific line appear in this equation in a completely symmetrical fashion with the coordinates (x,y), since you can treat (a,b) as coordinates and (x,y) as parameters. From this point of view lines and points are the same, and this has been known in geometry as the principle of duality for hundreds of years. Since lines and points are the same, finding a line in the plane (x,y) is the same as finding a point in the plane (a,b); to each point (X,Y) that belongs to the line ax + by + 1 = 0 in the plane (x,y) corresponds the line aX + bY + 1 = 0 in the plane (a,b), and all these lines intersect at the same point (A,B), which identifies the sought-for line. So, by mapping the points (x,y) this way to lines in the plane (a,b) and then counting the lines passing through a given point (a,b), one calculates the cross-correlation C(a,b) = ∑_{i,j} Image(i,j) · Template(i,j,a,b) with a linear template.

The invariable confusion occurs because one does not distinguish between what is being calculated (cross-correlation) and how it is being calculated (by means of these point-to-line mappings). If this distinction is left out, one feels dissatisfaction at not knowing what is actually being calculated.
Hough patented mapping points to lines and calculating linear particle tracks without realizing that he was patenting a process that had been known (incidentally, as correlation) in geometry for at least a hundred years. In a continuous setting, the same cross-correlation with a linear template is exactly the Radon transform, so the Hough transform is nothing other than a method of calculating the Radon transform. Various line parameterizations (like the normal form) are immaterial to the principle of duality. While one needs to understand the central concept of correlation (which encompasses both the linear and the circular Radon transforms), some of the innovative terminology that entered this discussion further obfuscates the matter. If you call points dudas and lines harts and then discuss how Hough transforms dudas to harts, you are unlikely to gain much new insight.

Mongo2010 (talk) 19:04, 11 August 2010 (UTC)

problem with theory

The article currently states:

An infinite number of lines can pass through a single point of the plane. If that point has coordinates (x_0, y_0) in the image plane, all the lines that go through it obey the equation r(\theta) = x_0\cdot\cos\theta + y_0\cdot\sin\theta, where r is the same as the one in formula (#1)

But if r(\theta) is the r from formula (#1), and if x_0, y_0 are fixed, then \theta can take on only one value, which cannot be valid for all lines through that point.

Perhaps the claim "where r is the same as the one in the formula (#1)" should be removed?

—Preceding unsigned comment added by 159.54.131.7 (talk) 15:02, 11 August 2010 (UTC)
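A small numeric illustration may help here: for a fixed point (x_0, y_0), the formula r(\theta) = x_0 cos(\theta) + y_0 sin(\theta) assigns a different r to each \theta, and each resulting (r, \theta) pair is a distinct line passing through (x_0, y_0). So \theta is not pinned to one value; r is a function of it. The point and angle set below are illustrative.

```python
import math

x0, y0 = 3.0, 4.0
rs = []
for deg in (0, 30, 60, 90, 120, 150):
    th = math.radians(deg)
    r = x0 * math.cos(th) + y0 * math.sin(th)
    rs.append(round(r, 6))
    # Reconstruct (x0, y0) from the line's foot point (r*cos, r*sin)
    # plus a step t along the line's direction (-sin, cos); landing
    # back on (x0, y0) confirms the point lies on this (r, theta) line.
    t = -x0 * math.sin(th) + y0 * math.cos(th)
    x = r * math.cos(th) - t * math.sin(th)
    y = r * math.sin(th) + t * math.cos(th)
    assert abs(x - x0) < 1e-9 and abs(y - y0) < 1e-9

print(rs)  # six different r values: r varies with theta
```

This is exactly the sinusoid that a single image point traces in the (theta, r) accumulator, which is why the "same r as formula (#1)" wording in the article is confusing: the article's formula defines r as a function of theta, not a single constant.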