Talk:Sobel operator

"Solving"

I'm a university AI student who's done a module in Computational Vision, and still I don't understand the first line: "a discrete differentiation operator solving for the 1st derivatives". What does the "solving" mean?

Well, I think the "solving" is in the algebraic sense, as in "Solve the following equation for x" which basically means "Find the value of x", but I agree that that was a horrible way to start the article, useful only to seasoned mathematicians. I've tried to add a more "general audience" explanation of what the operator actually does. - IMSoP 22:30, 16 December 2005 (UTC)

Technical Details - Matrix maths wrong way round

Surely the matrix maths in the Technical Details section is the wrong way round. The result shown would make Gx and Gy = 0. It should be a 3x1 column * 1x3 row = 3x3 matrix.
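To make the point above concrete, here is a minimal sketch in plain Python. The helper `matmul` and the variable names are illustrative only and do not come from the article. It multiplies the two factors in both orders: column times row gives the 3x3 kernel, while row times column collapses to a single number, which is zero.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

column = [[1], [0], [-1]]  # 3x1 differencing part
row = [[1, 2, 1]]          # 1x3 smoothing part

outer = matmul(column, row)  # 3x1 times 1x3 -> 3x3 matrix
inner = matmul(row, column)  # 1x3 times 3x1 -> 1x1 matrix

print(outer)  # the 3x3 kernel
print(inner)  # a single entry: 1*1 + 2*0 + 1*(-1)
```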

Technical Details - Convolution operator gets confused with the times operator

First we get a square matrix by multiplying a 3x1 column by a 1x3 row (so the times operator $\times$ is used):

\quad \quad {\begin{bmatrix}+1&+2&+1\\0&0&0\\-1&-2&-1\end{bmatrix}}={\begin{bmatrix}+1\\0\\-1\end{bmatrix}}\times {\begin{bmatrix}1&2&1\end{bmatrix}}

Then we convolve the square matrix with the input image to get another image, in this case the Gy derivative (and the convolution operator $*$ is used):

\mathbf {G} _{y}=\quad {\begin{bmatrix}+1&+2&+1\\0&0&0\\-1&-2&-1\end{bmatrix}}*\mathbf {A} ={\begin{bmatrix}+1\\0\\-1\end{bmatrix}}\times {\begin{bmatrix}1&2&1\end{bmatrix}}*\mathbf {A}

And, because they are separable in both $x$ and $y$ directions, we can write it as:

\mathbf {G} _{y}=\left({\begin{bmatrix}+1\\0\\-1\end{bmatrix}}*\mathbf {A} \right)*{\begin{bmatrix}1&2&1\end{bmatrix}}
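The separability claim above can be checked numerically. The following is only a sketch in plain Python: the helper `convolve2d_full` and the small test image `A` are assumptions made for illustration, not part of the article. It computes a full 2D convolution with the 3x3 kernel and compares it with two passes, first with the 3x1 column and then with the 1x3 row.

```python
def convolve2d_full(image, kernel):
    """Full 2D discrete convolution of two lists-of-rows (illustrative helper)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0] * (iw + kw - 1) for _ in range(ih + kh - 1)]
    for i in range(ih):
        for j in range(iw):
            for m in range(kh):
                for n in range(kw):
                    # Each input sample spreads a scaled copy of the kernel.
                    out[i + m][j + n] += image[i][j] * kernel[m][n]
    return out

# Arbitrary small integer test image (hypothetical data).
A = [[3, 1, 4, 1],
     [5, 9, 2, 6],
     [5, 3, 5, 8],
     [9, 7, 9, 3]]

column = [[1], [0], [-1]]
row = [[1, 2, 1]]
kernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

direct = convolve2d_full(A, kernel)
two_pass = convolve2d_full(convolve2d_full(A, column), row)

print(direct == two_pass)
```

The two-pass form needs 3 + 3 multiplications per pixel instead of 9, which is the usual practical reason for writing the kernel as a product.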

Simple Description

"giving the direction of the largest possible increase from light to dark"

I think "light to dark" is backwards, since usually dark is 0 and light is 1. And in fact, isn't it redundant, since you already say the largest possible "increase"?

Ian Stewart

Convolution Mask Notation

This is just a notational comment because I was slightly confused when reading the section defining the convolution matrices for the x and y partial derivatives. When you write:

\mathbf {G} _{y}={\begin{bmatrix}+1&+2&+1\\0&0&0\\-1&-2&-1\end{bmatrix}}*\mathbf {A} \quad {\mbox{and}}\quad \mathbf {G} _{x}={\begin{bmatrix}+1&0&-1\\+2&0&-2\\+1&0&-1\end{bmatrix}}*\mathbf {A}

It is technically correct, because convolution is defined as

$f\star g(t)=\int f(t-x)g(x)dx$

i.e. the convolution kernel (f in this case) is flipped around before being applied to g. The above convolution masks give the technically correct definition of the partial derivatives, since the masks will be flipped around before being applied to the image. However, in my experience in the image processing world, you never see convolution matrices (masks) written this way. They are always reversed, with the convolution defined differently than above, which is why I was initially confused by this section. They would then be

\mathbf {G} _{y}={\begin{bmatrix}-1&-2&-1\\0&0&0\\+1&+2&+1\end{bmatrix}}*\mathbf {A} \quad {\mbox{and}}\quad \mathbf {G} _{x}={\begin{bmatrix}-1&0&+1\\-2&0&+2\\-1&0&+1\end{bmatrix}}*\mathbf {A}

Is this anyone else's experience too or am I the only one? It's just that when I looked at the convolution matrices I thought "that's not right, they seem to be backwards of the approximation of the partial derivatives." --Paul Laroque (talk) 19:10, 11 April 2009 (UTC)

By the way, it is defined this way in Edge detection#Other first-order methods if you take right and up to be positive orientations. Maybe one should be changed? --Paul Laroque (talk) 19:21, 11 April 2009 (UTC)

First, your "flipped around" means more exactly that the filter is flipped/mirrored in both x and y directions, i.e., rotated 180 degrees (in the case of 2D filtering). There may be some examples where authors don't care to deal with this issue, either because their implementation of convolution between a signal and a filter implicitly takes care of the flipping, or because the particular application is invariant to this transformation of the filter, but in order to minimize confusion the filter should be presented in a correct way. To add to the confusion, there is no well-established interpretation of what is meant by x and y coordinates in image processing. Mathematicians tend to think that the x-axis is pointing right and the y-axis is pointing up, whereas most image coordinate systems have x (first coordinate) pointing down and y (second coordinate) pointing right (with the origin typically at the top-left corner, but this is not important for convolution). The article uses the first coordinate system. --KYN (talk) 11:00, 12 April 2009 (UTC)

I understand the conventions used in the article. I am just trying to point out that I have rarely seen the Sobel operator written this way. Even in the page linking to this one, Edge detection#Other first-order methods, it is not defined this way. The first few hits from a google search of "sobel filter" give these:

which all give the convolution mask reversed. I think this is common because you're not really meant to think of the Sobel operator as a filter, in the sense that you don't care what it does in the frequency domain; you care that it approximates a partial derivative (smoothed in the orthogonal direction). I just wanted to point out that I was confused when I first saw the convolution masks in this article and I had to scan down the article to find the definition of convolution to see that they actually made sense. And I have significant experience in image processing, so I'm sure others will be confused as well. For example, if someone were to use Gimp to implement these after reading the article, they would have to know that they should be reversed before typing them into Gimp. --Paul Laroque (talk) 13:32, 12 April 2009 (UTC)

What you are pointing out seems to be a general problem of how the convolution operation is described in various software, rather than something related to the Sobel operator specifically. If you look at cross correlation, you will find that this is the operation that appears to be called "convolution" in certain contexts. --KYN (talk) 10:03, 13 April 2009 (UTC)
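The difference discussed in this thread can be shown on a tiny example. This is a sketch in plain Python under stated assumptions: the helpers `correlate_valid`, `rotate180` and `convolve_valid` are hypothetical names, the test image is a ramp whose intensity increases to the right, and x is taken to point right. Cross-correlation applies the mask as written; convolution applies it rotated by 180 degrees, so for the article's Gx mask the two give opposite signs.

```python
def correlate_valid(image, kernel):
    """Cross-correlation over the region where the kernel fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + m][j + n] * kernel[m][n]
                 for m in range(kh) for n in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def rotate180(kernel):
    """Mirror the kernel in both directions (rotate by 180 degrees)."""
    return [r[::-1] for r in kernel[::-1]]

def convolve_valid(image, kernel):
    """True convolution: correlation with the 180-degree rotated kernel."""
    return correlate_valid(image, rotate180(kernel))

# Gx mask as written in the article.
KX = [[+1, 0, -1],
      [+2, 0, -2],
      [+1, 0, -1]]

# Horizontal ramp: the value equals the column index.
ramp = [[c for c in range(5)] for _ in range(5)]

conv = convolve_valid(ramp, KX)   # positive response on this ramp
corr = correlate_valid(ramp, KX)  # same magnitude, opposite sign

print(conv[0][0], corr[0][0])
```

So a tool that implements "convolution" as cross-correlation would need the mask entered in its reversed form to give the same sign.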

Scharr not the ultimate solution

At the bottom of the page, it states that the Sobel operator does not have perfect rotational symmetry. I read this as not being perfectly isotropic, and I agree with that. Scharr's attempt to improve this (I don't know from what time frame) is ${\begin{bmatrix}+3&+10&+3\\0&0&0\\-3&-10&-3\end{bmatrix}}$ and ${\begin{bmatrix}+3&0&-3\\+10&0&-10\\+3&0&-3\end{bmatrix}}$. In the literature, I even found another operator actually called the Isotropic Operator, which uses the kernels ${\begin{bmatrix}+1&+{\sqrt {2}}&+1\\0&0&0\\-1&-{\sqrt {2}}&-1\end{bmatrix}}$ and ${\begin{bmatrix}+1&0&-1\\+{\sqrt {2}}&0&-{\sqrt {2}}\\+1&0&-1\end{bmatrix}}$ but also gives disappointing non-isotropic results.

Well, in April 2008, I researched the most isotropic operator as part of the course Image Processing at Utrecht University. I considered the generalized operator ${\begin{bmatrix}+a&+b&+a\\0&0&0\\-a&-b&-a\end{bmatrix}}$ and ${\begin{bmatrix}+a&0&-a\\+b&0&-b\\+a&0&-a\end{bmatrix}}$, where I limited the search space to one degree of freedom by requiring $2a+b=1$. Though I have no formal proof, I did find that the operator with $a={\tfrac {1-{\sqrt {\tfrac {1}{2}}}}{2}},b={\sqrt {\tfrac {1}{2}}}$ gives equal rotational symmetry for all multiples of 45 degrees (which is an improvement on both the Sobel and Isotropic operators, which are rotationally symmetric only for multiples of 90 degrees). The results look far more pleasing to the eye than those of the Sobel or Isotropic filters, and the rotational symmetry is nearly indistinguishable from perfect isotropy. --Zom-B (talk) 22:49, 7 May 2009 (UTC)
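For anyone who wants to try these values, here is a small sketch in plain Python that only evaluates the constants quoted above and builds the two kernels. The variable names are illustrative, and the code does not test the isotropy claim itself. The centre-to-corner ratio b/a is printed for comparison with Sobel (2), Scharr (10/3) and the Isotropic Operator (sqrt(2)), whose ratios follow from the matrices given earlier in this section.

```python
import math

# Constants as quoted in the comment above.
b = math.sqrt(0.5)
a = (1 - b) / 2

kernel_y = [[+a, +b, +a],
            [0.0, 0.0, 0.0],
            [-a, -b, -a]]
kernel_x = [[+a, 0.0, -a],
            [+b, 0.0, -b],
            [+a, 0.0, -a]]

constraint = 2 * a + b  # should equal 1 by construction
ratio = b / a           # centre weight relative to corner weight

print("a =", a, "b =", b)
print("2a + b =", constraint)
print("b / a =", ratio)
```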

Djexplo (University of Twente): I have tested your kernel values in combination with a Coherence filter from "A Scheme for Coherence-Enhancing Diffusion Filtering with Optimized Rotation Invariance" by Joachim Weickert and Hanno Scharr. As a test image I used a 2D image with circles, I = sin(r^2). Then I performed Coherence filtering on the test image. The absolute error between the test image and the result of your kernel is smaller than with the kernel of Hanno Scharr. But if you plot the pixel errors as an image, the errors with your kernel values are more rectangular than with the kernel of Scharr.

Scharr not the ultimate solution 2

Djexplo (University of Twente): I have tested some first and second order derivative kernels of size 3x3 and 5x5, including Sobel and Scharr. Scharr outperforms Sobel in rotational invariance, but Scharr is only one solution for a fixed image scale (it corresponds to a truncated Gaussian derivative kernel with a certain sigma). If you want still better rotational performance, it is better to switch to 5x5 derivative schemes; see my short paper: http://www.k-zone.nl/Kroon_DerivativePaper.pdf —Preceding undated comment added 15:07, 23 December 2009 (UTC).

arctan versus atan2

It's not correct to say that the gradient's direction can be computed as arctan(Gy/Gx). arctan returns an angle between -pi/2 and pi/2, not between -pi and pi as needed in most cases. (Obviously, this happens because information on the correct quadrant is only present in the signs of Gy and Gx, and that is lost in the division). Commonly, a two-argument function atan2(Gy, Gx) is used instead, which uses sign information and thus returns an angle between -pi and pi, as desired. 134.36.37.145 (talk) 15:54, 15 February 2011 (UTC)
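A short illustration of the quadrant problem, as a sketch using Python's standard `math.atan` and `math.atan2`. The function names `direction_atan` and `direction_atan2` are made up for this example. Two gradients pointing in opposite directions give the same angle with the one-argument form and different angles with the two-argument form.

```python
import math

def direction_atan(gx, gy):
    """One-argument form: loses the quadrant and fails when gx == 0."""
    return math.atan(gy / gx)

def direction_atan2(gx, gy):
    """Two-argument form: uses the signs of both components."""
    return math.atan2(gy, gx)

# Two gradients pointing in opposite directions.
up_right = (1.0, 1.0)
down_left = (-1.0, -1.0)

same_angle = (direction_atan(*up_right), direction_atan(*down_left))
distinct_angles = (direction_atan2(*up_right), direction_atan2(*down_left))

print(same_angle)       # both entries are identical
print(distinct_angles)  # the entries differ by pi

# atan2 also handles a purely vertical gradient, where gy / gx would divide by zero.
vertical = direction_atan2(0.0, 1.0)
```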