User:Smcgruer

From Wikipedia, the free encyclopedia

(Note to the marker: My topic, Hexagonal Coordinate Systems, is technically a mathematical issue with a practical application in Computer Vision. Since the course is about vision, I have chosen to keep this page primarily focused on that. In a real article it would be preferable to keep the main page solely for mathematical definitions, and have its use in vision processing as another page.)


Hexagonal Coordinate Systems

A hexagonal coordinate system is formed by placing a hexagonal lattice over a 2D plane, forming a coordinate system where each hexagon (or 'tile') can be uniquely specified in some manner. Although the coordinate system can be continuous, the discrete case is more common in vision processing and computer games, where a hexagon usually represents a pixel or map space respectively. The main advantages of hexagonal coordinates over the traditional Cartesian approach are the consistent distance between a hexagon and all of its neighbours, and the ability of hexagons to neatly represent natural shapes such as curves.

History

The use of hexagonal coordinate systems in vision processing first occurred in 1963[citation needed], in the Illinois Pattern Recognition Computer[1], where they were referred to as rhombic arrays. However, it was not until 1969 that the first work directly relating to hexagonal coordinate systems was published, on the potential benefits of a hexagonal representation of data for pattern matching[2]. Since then, hexagonal coordinate systems have surfaced multiple times in academic work, but have never obtained widespread popularity.

Definitions and Conventions

Conventions

By convention, tiles in a hexagonal coordinate system are laid out edge-to-edge horizontally, along the x-axis. Aligning vertically along the y-axis is sometimes used; most of the definitions and formulae laid out on this page can be adapted to a vertically-aligned system by simply swapping axis parameters. Other alignments are rare, as they complicate most of the mathematics involved in hexagonal coordinate systems.

Indexing a Hexagonal Coordinate System

TODO
The effect of transforming the 'zig-zag' axis to a straight axis. Note how the transformation partitions the coordinate system in two.

Because traditional rectangular coordinate systems are so familiar, a natural approach when defining a hexagonal coordinate system is to attempt to use an orthogonal set of axes. As truly orthogonal axes are impossible on a hexagonal lattice, a zig-zag column approach is often suggested, where the y-axis alternates direction on each row of tiles. This approach is inherently problematic, as can be seen by transforming the y-axis to a straight line. Examining the effect of this transformation on the super-hexagon triangles, we find that using an approximated y-axis splits the coordinate system into two partitions, based on alternating rows. This complicates geometric calculations, as they must take the two partitions into account. For example, distance calculations become iterative, as the two ‘shortest’ paths between the start and end point (in terms of the number of tiles passed through) are not the same length in the coordinate system - the calculation must check which row it is in at each iteration and 'move' accordingly.

TODO
The effect of transforming the skewed-axis to a straight axis.

A better approach for indexing hexagonal coordinate systems can be found by altering the traditional view of the y-axis to suit hexagonal tiles. We define the y-axis to be ‘skewed’, rotated 60° from the x-axis. This viewpoint is 'straight' in terms of hexagonal coordinates, as each axis runs through the centre of an edge of the hexagon. Transforming the y-axis back to the traditional viewpoint, as we did before, shows that the super-hexagon triangles are now uniformly represented in the skewed-axis system.

Another approach for indexing hexagonal coordinate systems uses the hexagon’s axes of symmetry, giving three trigonal axes. By convention these are referred to as the x-, y- and z-axes, although it is important to realise that they are distinct from the normal three-dimensional use of these names - the coordinate system still only indexes into a two-dimensional plane. As only two of the three coordinates are required to uniquely specify any hexagon in the coordinate system, one coordinate is redundant; however the use of all three coordinates allows the easy calculation of distances and other geometric functions. Converting between the skewed-axis approach and the three-coordinate approach is straightforward - we simply identify the redundant coordinate and remove it, renaming and shifting the other axes as appropriate.
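The conversion between the skewed-axis and three-coordinate forms can be sketched in Python. This is a minimal sketch, assuming the two skewed axes are 60° apart (as described above) and that the third coordinate is chosen so that the three coordinates sum to zero, which is the constraint used in the rotation section of this page. The function names are illustrative and are not taken from any of the cited sources.

```python
def skewed_to_three(a, b):
    """Add the redundant third coordinate so that x + y + z == 0."""
    return (a, b, -(a + b))


def three_to_skewed(x, y, z):
    """Drop the redundant third coordinate after checking the constraint."""
    if x + y + z != 0:
        raise ValueError("three-coordinate points must satisfy x + y + z == 0")
    return (x, y)


def hex_distance(p, q):
    """Number of tiles stepped through between two three-coordinate points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]), abs(p[2] - q[2]))
```

Under these assumptions the distance calculation needs no iteration: it is the largest absolute difference among the three coordinates, which is the kind of simplification the redundant coordinate buys.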

A final and rather unusual representation for a hexagonal coordinate system is to represent it as a layered system. The layered system is defined recursively, as each layer is composed of a number of tiles from the layer below it. A layer 0 tile is a single hexagon. A layer L tile is composed of a layer L-1 tile and its 6 neighbouring tiles. That is, a layer 1 tile is a collection of seven hexagons, a layer 2 tile is a collection of seven layer 1 tiles, and so on. This approach is common in vision processing due to its efficient use of space. There are $7^L$ tiles in an L layer system, and each hexagon in the system can be indexed using an L-digit, base 7 number. Starting from the global tile, each digit in an index represents which sub-tile the hexagon belongs to, starting at 0 in the centre and then proceeding from 1 to 6 around the centre tile. For example, consider the hexagon indexed by $24_7$. The first digit, 2, indicates that the hexagon is in the second location of the layer 1 tile, and the last digit, 4, indicates that the hexagon is in the fourth location of the layer 0 tile.
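The base 7 indexing can be illustrated with a short Python sketch that converts between a hexagon's position in a one-dimensional vector and its digits. The function names are hypothetical, and the sketch only handles the digits themselves; it makes no assumption about where each numbered sub-tile sits geometrically.

```python
def to_base7_digits(index, layers):
    """Return the base-7 digits of `index`, most significant digit first."""
    if not 0 <= index < 7 ** layers:
        raise ValueError("index out of range for this many layers")
    digits = []
    for _ in range(layers):
        index, digit = divmod(index, 7)
        digits.append(digit)
    return digits[::-1]


def from_base7_digits(digits):
    """Return the position in the one-dimensional vector for these digits."""
    value = 0
    for digit in digits:
        if not 0 <= digit <= 6:
            raise ValueError("base-7 digits must lie between 0 and 6")
        value = value * 7 + digit
    return value
```

For instance, the two-digit index with digits 2 and 4 corresponds to position 2 × 7 + 4 = 18 in the vector.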

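As each hexagon is just a number, an image can be stored as a one-dimensional vector. A hexagonal image of L layers with 24-bit colour requires $3 \times 7^L$ bytes. If the hexagonal image was created from an M x N square image with $M = N = 2^m$, and at least one hexagon is wanted for each square pixel, we need $7^L \geq MN = 2^{2m}$, so we can calculate the required number of layers as

$L = \lceil 2m \log 2 / \log 7 \rceil$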

Aside from its efficient use of space, the hierarchical layering approach also makes the execution of some visual processing algorithms in hexagonal space easier.
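The storage figures for the layered representation can be estimated with a few lines of Python. This is a sketch under a stated assumption: the number of layers is taken to be the smallest L that provides at least one hexagon per pixel of the source square image. The function names are illustrative.

```python
def layers_needed(width, height):
    """Smallest L such that 7**L >= width * height (integer arithmetic only)."""
    pixels = width * height
    layers = 0
    while 7 ** layers < pixels:
        layers += 1
    return layers


def storage_bytes(layers, bytes_per_pixel=3):
    """Bytes needed for an L-layer image; 3 bytes per pixel is 24-bit colour."""
    return bytes_per_pixel * 7 ** layers
```

Under this assumption a 256 x 256 image needs 6 layers (since 7^5 = 16807 is too few hexagons and 7^6 = 117649 is enough), and the resulting hexagonal image occupies 3 × 117649 = 352947 bytes.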

Extension to N Dimensions

As hexagons are planar shapes, they cannot be represented in three or more dimensions. Therefore, unlike Cartesian or polar coordinate systems, there is no straightforward extension of hexagonal coordinate systems to higher dimensions.

Advantages and Issues

Advantages of Hexagonal Coordinate Systems

As stated above, the main reasons for using a hexagonal coordinate system for image processing are hexagons' consistent connection with their neighbours and the ease of representing natural shapes using hexagons. In a normal square-pixel system, a pixel’s neighbours have two different levels of connectivity - they are either 1 pixel away (horizontally or vertically), or $\sqrt{2}$ pixels away (diagonally). This means that algorithms based on neighbourhood searches either have to ignore this distinction (and lose information) or cope with it somehow (causing increased complexity). Using a hexagonal coordinate system means that each neighbour is exactly 1 pixel away, and so algorithms can treat them all the same. The natural representation of curves in hexagonal coordinate systems allows many visual operations to be performed more easily; examples of edge detection and shape extraction are given below.
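The difference in neighbour distances can be checked numerically. The following Python sketch assumes unit spacing between adjacent tile centres and the skewed-axis layout described earlier (second axis at 60° to the first); the helper names are hypothetical.

```python
import math

# The six neighbours of a hexagon in skewed-axis coordinates (axes 60° apart).
HEX_NEIGHBOURS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def skewed_to_cartesian(a, b):
    """Cartesian centre of tile (a, b), assuming unit centre-to-centre spacing."""
    return (a + 0.5 * b, (math.sqrt(3) / 2) * b)


def hex_neighbour_distances():
    """Distance from a hexagon's centre to each of its six neighbours."""
    return [math.hypot(*skewed_to_cartesian(a, b)) for a, b in HEX_NEIGHBOURS]


def square_neighbour_distances():
    """Distance from a square pixel's centre to each of its eight neighbours."""
    return sorted(
        math.hypot(dx, dy)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
```

The hexagonal list contains six equal distances, while the square list splits into four edge neighbours and four diagonal neighbours at different distances.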

In addition to the above, hexagonal lattices also more closely resemble the pattern of photo-receptors in the human eye than square lattices. This means that they may be used when attempting to simulate the visual information provided by the eye to the brain and some of the visual processing performed by the brain on image data, such as simulating saccading[3].

Potential Issues

Despite their advantages, hexagonal coordinate systems are not commonly used in vision processing outside of academic work. A number of reasons have been suggested for the lack of uptake. Firstly, people often find thinking in hexagonal space difficult compared to traditional Cartesian systems, and so are less inclined to use hexagonal coordinate systems. Whether or not this is a real difficulty, or simply something one has to 'get used to' is debated, but in either case it reduces the likelihood of people choosing to use a hexagonal representation of data.

The lack of hardware devices (both input and output) that directly support hexagonal coordinate systems is also an issue. Input images must be converted from a square lattice to a hexagonal one. This may either be done by extrapolation, in which case the resolution of the image is artificially lowered, or by capturing the image at a higher resolution than it will be processed at, which is wasteful. Once the processing has been completed the output image must then be represented on or converted back to a square lattice somehow, which collapses pixels and thus results in a lower output resolution.

Image Conversion

Square to Hexagonal Lattices

As hardware devices that can directly capture images onto a hexagonal lattice are rare, images usually have to be converted from a square lattice to a hexagonal one. This process is known as image re-sampling. The conversion method depends on how the hexagonal coordinate system is indexed; as we can easily convert between the skewed-axis and 3-coordinate representations, only the skewed-axis method will be shown.

An example hexagon showing the distances from the centre to the middle of one edge and from the centre to a corner.

To convert points from a Cartesian coordinate system to a hexagonal system, we must transform the points into hexagonal space, map them to a square lattice, and then convert the square lattice to a hexagonal one. For convenience we define two constants: r is the shortest distance from the centre of a hexagon to an edge, and s is the distance from the centre of a hexagon to a corner.

Two matrices are used to transform the Cartesian coordinates to hexagonal space, one for each hexagonal axis. For each axis we must describe an affine transformation from the hexagonal lattice to the square lattice. Starting with the x-axis, we can examine one possible transformation such that every point in a square corresponds to the same x-coordinate in hexagonal space:


TODO


We now need to define the transformation matrix

$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$

such that $M p_h = p_s$, where $p_h$ is a point in the hexagonal lattice and $p_s$ is a point in the square lattice. Consider two points in the square lattice: (1,0) and (0,1). These correspond to the points (0, -s) and (r, s/2) in the hexagonal lattice. Therefore, we can form four linear equations:

$-bs = 1, \quad -ds = 0, \quad ar + \tfrac{1}{2} bs = 0, \quad cr + \tfrac{1}{2} ds = 1$

Solving this linear system gives us $a = \frac{1}{2r}$, $b = -\frac{1}{s}$, $c = \frac{1}{r}$, and $d = 0$. Now we compute the hexagonal x-coordinate. This is calculated as

where $(x_s, y_s)$ is the transformed point, n is the offset needed to map the correct points to x = 0 (in the above example we want (-2,0), (-1,0) and (0,0) to all map to hex coordinate x = 0, so n = 2), and w is the 'width' or number of squares wide that a transformed hexagon is.

We now use a similar method for the y-axis. Another affine transformation is examined, such that every point in a square corresponds to the same y-coordinate in hexagonal space.


TODO


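Under this transformation the points (1,0) and (0,1) in the square lattice correspond to the hexagonal coordinates (r, s/2) and (-r, s/2) respectively. Writing this second transformation matrix as $M' = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}$, we again form four linear equations:

$a'r + \tfrac{1}{2} b's = 1, \quad c'r + \tfrac{1}{2} d's = 0, \quad -a'r + \tfrac{1}{2} b's = 0, \quad -c'r + \tfrac{1}{2} d's = 1$

Solving this system gives us $a' = \frac{1}{2r}$, $b' = \frac{1}{s}$, $c' = -\frac{1}{2r}$, and $d' = \frac{1}{s}$. Finally we compute the hexagonal y-coordinate. This is identical to before: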


The values of r and s depend on the chosen mapping scale between the square and hexagonal lattice. Common values are and . These give the basis functions


Note that these basis vectors mean that the horizontal scale is fixed between the square and hexagonal lattice, but that the 'vertical' scale is not independent of the horizontal and so results in the hexagonal system having a tighter packing than the original square lattice.
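The two transformation matrices described in this section can be recovered numerically from the stated point correspondences. The Python sketch below solves for the 2x2 matrix that maps two given hexagonal-lattice points onto two given square-lattice points. The values chosen for r and s (a regular hexagon with corner distance 1) are an assumption made for illustration and are not necessarily the common values referred to above; all names are hypothetical.

```python
import math


def solve_transform(hex_points, square_points):
    """Return the 2x2 matrix M with M * h_i = s_i for both point pairs."""
    (h1x, h1y), (h2x, h2y) = hex_points
    (s1x, s1y), (s2x, s2y) = square_points
    det = h1x * h2y - h2x * h1y
    if det == 0:
        raise ValueError("hexagonal points must be linearly independent")
    # Inverse of H, the matrix whose columns are the two hexagonal points.
    inv = [[h2y / det, -h2x / det],
           [-h1y / det, h1x / det]]
    # S has the two square-lattice points as columns; M = S * H^-1.
    s = [[s1x, s2x], [s1y, s2y]]
    return [[s[i][0] * inv[0][j] + s[i][1] * inv[1][j] for j in range(2)]
            for i in range(2)]


def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


S_LEN = 1.0                   # assumed centre-to-corner distance
R_LEN = math.sqrt(3) / 2      # matching centre-to-edge distance
SQUARE = [(1, 0), (0, 1)]
M_X = solve_transform([(0.0, -S_LEN), (R_LEN, S_LEN / 2)], SQUARE)
M_Y = solve_transform([(R_LEN, S_LEN / 2), (-R_LEN, S_LEN / 2)], SQUARE)
```

Applying `M_X` to (0, -s) and (r, s/2), or `M_Y` to (r, s/2) and (-r, s/2), returns (1, 0) and (0, 1) up to rounding, which is a quick way to check any hand-derived matrix entries.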

Once the image has been converted, the hexagonal lattice is often represented using the layered approach in order to make visual processing easier. The conversion to a layered indexing system is a simple bottom-up sweep: we define the origin tile $0_7$ and then walk our way up the system, recursively backtracking down to define new tiles.

Hexagonal to Square Lattices

As most output devices are based on a square lattice rather than a hexagonal one, displaying a hexagonal-based image requires us to simulate a hexagonal coordinate system using the square lattice. This is a two-step procedure: first we must convert the hexagonal coordinates to Cartesian ones, and then we must simulate each hexagonal pixel using a suitable set of square pixels.

Assuming that the image is stored using the layered approach described above, each hexagon is represented as an n-digit number in base 7:

$(d_{n-1} \ldots d_1 d_0)_7$

Each digit, starting from $d_0$, indicates an increase in distance from the origin, which can be seen by examining the sequence $\{1_7, 10_7, 100_7, \ldots\}$. As the hexagonal x-axis maps exactly to the Cartesian one, we can convert each of these points directly to Cartesian coordinates, getting

where R is determined by the inter-pixel spacing. Using these vectors we can convert any hexagonal location to a Cartesian one, by rotating each vector (using a standard Cartesian rotation matrix) depending on the location of the referenced tile relative to the centre tile. The rotation values are:

$d_i$ Rotation
1 0
2
3
4
5
6

As an example, the hexagon $32_7$ can be converted to Cartesian coordinates by first converting $2_7$:

and then converting $30_7$:

Combining the two values gives a final result of

Once the image has been converted to Cartesian coordinates, each point must now be displayed as a hexagon on the screen. There are two constraints to consider - we must attempt to simulate a hexagon as exactly as possible, and we must tile the plane exactly. As each hexagonal pixel spans multiple real pixels, they are referred to as hyper-pixels. There are many possible representations for a hexagonal hyper-pixel, but one possible choice is:



Note that a natural consequence of this approach is that the hexagonal resolution is far less than the display's true resolution, meaning we need a denser screen resolution than we wish to display at.

Geometric Transformations

As with any coordinate system, common geometric transformations can be implemented on hexagonal coordinate systems. These are most easily shown when using a 3-coordinate representation, as their form is then similar to 3D Cartesian transformations. In the following section, we will use the convention that the x-axis points 30° clockwise from 'right' (meaning that the y-axis points directly 'up', and the z-axis is rotated 150° clockwise from right). As is common when dealing with transformation matrices, homogeneous coordinates can be used to allow transformation chaining. This means that the general form of a transformation is

Some important properties to note are that:

  • Transformation matrices are not necessarily unique - some transformations have multiple representations. For example, a rotation of 60° in hexagonal space has three representations:

Translation

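Translation in a hexagonal coordinate system is similar to the 3D Cartesian case. The translation matrix is given by

$T = \begin{pmatrix} 1 & 0 & 0 & \Delta x \\ 0 & 1 & 0 & \Delta y \\ 0 & 0 & 1 & \Delta z \\ 0 & 0 & 0 & 1 \end{pmatrix}$

Unlike the Cartesian case, however, hexagonal translation has the constraint that the coordinate deltas must sum to 0:

$\Delta x + \Delta y + \Delta z = 0$

As an example, the translation matrix to move one hexagon 'right' is

$T = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$

We can see that 1 + 0 + (-1) = 0, and if we apply this to the point (5,1,2) we get

$T \begin{pmatrix} 5 \\ 1 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 1 \\ 1 \\ 1 \end{pmatrix}$

which gives us the (correct) output point (6, 1, 1).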
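The translation described above can be sketched in Python using a 4x4 homogeneous matrix. This is a minimal illustration with hypothetical function names; it enforces the zero-sum constraint on the deltas when the matrix is built.

```python
def translation_matrix(dx, dy, dz):
    """Homogeneous 4x4 translation matrix; the deltas must sum to zero."""
    if dx + dy + dz != 0:
        raise ValueError("hexagonal translation deltas must sum to 0")
    return [[1, 0, 0, dx],
            [0, 1, 0, dy],
            [0, 0, 1, dz],
            [0, 0, 0, 1]]


def apply_homogeneous(matrix, point):
    """Apply a 4x4 homogeneous matrix to a three-coordinate point."""
    vector = (point[0], point[1], point[2], 1)
    result = [sum(row[i] * vector[i] for i in range(4)) for row in matrix]
    return tuple(result[:3])
```

With these helpers, `apply_homogeneous(translation_matrix(1, 0, -1), (5, 1, 2))` returns `(6, 1, 1)`, matching the example in the text, and a set of deltas such as (1, 1, 1) is rejected.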

Reflection

A point P illustrating the set of reflection axes for a hexagonal coordinate system and the respective reflection matrices.

As a hexagonal lattice has 12-fold symmetry[4], we can examine any point placed on a circle centred at the origin and find that it has 12 reflection points on that circle. These 12 points give the basic reflection matrices for hexagonal coordinate systems (with the point itself being given by the identity matrix, I.) The matrices are:



More complex reflections about an arbitrary line can be achieved by combining a basic reflection with other transformation matrices[5].
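The twelve basic matrices can be enumerated programmatically. The sketch below does not reproduce the matrices as labelled in the figure; it relies on the fact that, for three coordinates constrained by x + y + z = 0, the twelve symmetries of the lattice about the origin can be written as the six permutations of the coordinates, each taken with or without a global sign change. Which matrix belongs to which axis in the figure is not determined here, and the function names are hypothetical.

```python
from itertools import permutations


def symmetry_matrices():
    """Return twelve 3x3 matrices: each coordinate permutation, with sign +1 or -1."""
    matrices = []
    for perm in permutations(range(3)):
        for sign in (1, -1):
            matrices.append(tuple(
                tuple(sign if perm[row] == col else 0 for col in range(3))
                for row in range(3)
            ))
    return matrices


def apply(matrix, point):
    """Multiply a 3x3 matrix by a three-coordinate point."""
    return tuple(sum(matrix[r][c] * point[c] for c in range(3)) for r in range(3))
```

Applying all twelve matrices to a generic point such as (3, -1, -2) yields twelve distinct points that still satisfy the zero-sum constraint and all lie at the same distance from the origin; the identity matrix is among the twelve.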

Scaling

As with reflection, scaling is usually more complicated in hexagonal space, as the coordinates are not independent. Uniform scaling by a factor α is simple as it scales each coordinate together, and is given by

$x' = \alpha x, \quad y' = \alpha y, \quad z' = \alpha z$

Otherwise, we can consider two different types of scaling - either scaling along one of the three axes, or along the lines formed when two axis values are equal. In each case there are three possible directions for scaling. For each of the six cases we need to derive scaling matrices such that


First consider scaling along the x-axis by a factor α. In this case, the x-value will not change, so we have $x' = x$, where primes denote the scaled coordinates. Since the scaled coordinates must still sum to 0, we have

$x' + y' + z' = 0$

Additionally, since y - z changes with respect to the scaling factor, we also have

$y' - z' = \alpha (y - z)$

Solving these equations gives the matrix for the Sx case, and the Sy and Sz matrices are found similarly:


Now consider scaling along a line formed by x = y. In this case the value x - y does not change, so, writing primes for the scaled coordinates, we have

$x' - y' = x - y$.

Therefore, only the value of z changes with respect to α, i.e.

$z' = \alpha z$

Solving these equations gives us the matrix for the Sxy case, and the Syz and Szx matrices are found similarly:
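The two families of constraints described above can be turned into concrete matrices. The Python sketch below gives one matrix for each case; because the coordinates are not independent, other matrices satisfy the same equations, so these are not necessarily the forms used in the cited sources. Exact fractions are used so that the constraints can be checked without rounding, and the function names are hypothetical.

```python
from fractions import Fraction


def scale_along_x(alpha):
    """Keeps x fixed, keeps the sum at zero, and scales (y - z) by alpha."""
    a = Fraction(alpha)
    plus, minus = (1 + a) / 2, (1 - a) / 2
    return [[1, 0, 0],
            [0, plus, minus],
            [0, minus, plus]]


def scale_along_xy(alpha):
    """Keeps (x - y) fixed, keeps the sum at zero, and scales z by alpha."""
    a = Fraction(alpha)
    half = Fraction(1, 2)
    return [[half, -half, -a / 2],
            [-half, half, -a / 2],
            [0, 0, a]]


def apply(matrix, point):
    """Multiply a 3x3 matrix by a three-coordinate point."""
    return tuple(sum(matrix[r][c] * point[c] for c in range(3)) for r in range(3))
```

For example, scaling the point (2, 1, -3) by a factor of 2 with `scale_along_x` gives (2, 3, -5): x is unchanged, the coordinates still sum to zero, and y - z has doubled from 4 to 8.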

Rotation

Since we have x + y + z = 0, a rotation in a hexagonal coordinate system is equivalent to a rotation in Cartesian 3D space around the unit normal vector of the plane

$\hat{n} = \frac{1}{\sqrt{3}} (1, 1, 1)$

By substituting this into the general rotation matrix for 3D Cartesian space around an arbitrary vector, we get:


From this we create three new matrices by subtracting off P, Q, and R, giving us (respectively):



The differences between P, Q, and R define the ratios of the sides of an equilateral triangle. These are similar to standard trigonometric functions, which define similar ratios for a right-angled triangle. Therefore, we can define three hexagonal trigonometric functions, based on the triangle:


The equilateral triangle used to define hexagonal trigonometric functions.


  • The hexagonal sine, denoted sinx θ, is the ratio of a side of the equilateral triangle to the line AB:
  • The forward hexagonal cosine, denoted cosx+ θ, is the negation of the ratio of line AC to AB:
  • The backward hexagonal cosine, denoted cosx- θ, is the negation of the ratio of line AE to AB:

We can use these hexagonal trigonometric functions to re-write our rotation matrices in a simpler form:



Note that these now closely resemble Cartesian 2D rotation matrices.
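The rotation construction in this section can be checked numerically. The Python sketch below builds the 3D rotation about the unit normal (1, 1, 1)/√3 from the standard axis-angle formula, and then forms alternative matrices by subtracting a constant from every entry. Two assumptions are made here: the labels p, q and r are assigned to the diagonal and the two off-diagonal values in an order chosen for this sketch, and "subtracting off" is interpreted as subtracting the value from every entry, which leaves the result unchanged for any point with x + y + z = 0.

```python
import math


def rotation_matrix(theta):
    """Rotation by theta about the normal of the plane x + y + z = 0."""
    c, s = math.cos(theta), math.sin(theta)
    p = (1 + 2 * c) / 3
    q = (1 - c) / 3 - s / math.sqrt(3)
    r = (1 - c) / 3 + s / math.sqrt(3)
    return [[p, q, r],
            [r, p, q],
            [q, r, p]]


def subtract_constant(matrix, k):
    """Subtract k from every entry; equivalent on points whose coordinates sum to 0."""
    return [[entry - k for entry in row] for row in matrix]


def apply(matrix, point):
    """Multiply a 3x3 matrix by a three-coordinate point."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3))
```

For a rotation of 60°, the full matrix and each of the reduced matrices send the point (1, 0, -1) to (0, 1, -1) up to rounding, which illustrates why a single rotation can have several matrix representations.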

Applications

Edge Detection

Edge detection is a commonly used technique in visual processing, which aims to find edges in an image by searching for locations at which the image intensity noticeably changes. Middleton and Sivaswamy[6] examined the use of hexagonal coordinate systems for edge detection, adapting three approaches (the Prewitt, Laplacian of Gaussian, and Canny edge detectors) to use hexagonal coordinate systems and comparing their performance to a square lattice.

Prewitt Edge Detection

The Prewitt edge detector is based on the gradient of the intensity, and looks for local maxima to detect edges. In general it is considered to be a poor detector due to its crude approximation of the change in intensity. It also uses two 3×3 edge masks, one for the horizontal direction and one for the vertical, which treat all of a pixel's neighbours equivalently. In a square-pixel system this is a false assumption, and means that the Prewitt edge detector ignores information. However, this assumption means that Prewitt edge detection is well suited for conversion to a hexagonal coordinate system, as a hexagonal pixel's neighbours are all equivalent.

Converting the Prewitt operator to a hexagonal lattice requires the definition of new edge masks. As the hexagonal horizontal axis is the same as the Cartesian one, no change is needed for the horizontal edge mask, other than to convert it to a hexagonal matrix:

The vertical mask is approximated by combining two masks that are orientated at 60° and 120° from the horizontal axis:

Note that as $h_1 = h_2 - h_3$, we do not need to store all three edge masks, and can compute the gradient images using only two of them.
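The saving from this linear dependence can be illustrated in Python. The masks below are hypothetical: they are simple direction-weighted masks over a hexagon's six neighbours, chosen only because they satisfy the same relation between the 0°, 60° and 120° orientations, and they are not the masks used by Middleton and Sivaswamy. Neighbours are addressed in skewed-axis coordinates, and the image is a plain dictionary from coordinates to intensity.

```python
# Six neighbours in skewed-axis coordinates, listed at 0°, 60°, ..., 300°.
NEIGHBOURS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
# Illustrative weights for a mask oriented at 0° (twice the cosine of each angle).
BASE_WEIGHTS = [2, 1, -1, -2, -1, 1]


def directional_mask(steps):
    """Mask oriented at steps * 60°, as a dict from neighbour offset to weight."""
    return {NEIGHBOURS[k]: BASE_WEIGHTS[(k - steps) % 6] for k in range(6)}


def response(image, centre, mask):
    """Weighted sum of the neighbours of `centre`; missing pixels count as 0."""
    a, b = centre
    return sum(w * image.get((a + da, b + db), 0) for (da, db), w in mask.items())


H1, H2, H3 = directional_mask(0), directional_mask(1), directional_mask(2)
```

Because every weight in `H1` equals the corresponding weight in `H2` minus that in `H3`, the response of `H1` at any pixel is the difference of the other two responses, so only two mask responses ever need to be computed and stored.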

Laplacian of Gaussian Edge Detection

The Laplacian of Gaussian edge detector is based on the second derivative of the intensity, looking for zero-crossings to indicate edges in the image. Because a Gaussian smoothing function is used to reduce the noise sensitivity of the second derivative, the Laplacian of Gaussian approach is isotropic. As such, adapting it to a hexagonal coordinate system is straightforward; the Gaussian masks must simply be adapted to be hexagonal instead of square.

Canny Edge Detection

The Canny edge detector combines elements of the Prewitt and Laplacian of Gaussian approaches. It uses two Gaussians: one for the horizontal axis and one for the vertical axis. As such, it is also isotropic and needs little adjustment apart from the use of hexagonal Gaussian masks.

Results

In general, Middleton and Sivaswamy found that hexagonal-based edge detection located approximately the same number of edge pixels as the square lattice approach for each of their test cases, but chose qualitatively better pixels which more accurately represented the edge locations. As would be expected, the hexagonal-based edge detection performed better on natural curves than it did on sharp edges. The hexagonal methods also seemed to require less computation: the hexagonal Prewitt edge detection method required 13% less space for its mask than the square method while achieving similar performance, the hexagonal Laplacian of Gaussian mask was far smaller than the square mask, and the hexagonal Canny edge detector required fewer computations than the square method.

Shape Extraction

Finding shapes in an image is a common visual processing task, often with the aim of reducing the data dimensionality from a whole image to a set of features. This is called shape extraction, and can be viewed as attempting to find the dominant edges for any possible shapes in an image. Middleton, Sivaswamy, and Coghill[7] examined the use of hexagonal coordinate systems for shape extraction, comparing them to a square lattice.

By using a hexagonal lattice, Middleton et al. were able to mimic the human visual system, specifically the way that human eyes saccade to isolate points of interest in an image. They began by preprocessing the image to find 'critical points', where the weight of a hexagonal attention window placed over the point was above some threshold. The hexagonal attention window was then returned to the first critical point, features were extracted in terms of the weight and orientation of the point, and the attention window was moved according to the current features. Once a whole shape was found the algorithm restarted with the next critical point that had not already been assigned to a shape, and continued until there were no more critical points.

Middleton et al. found that their hexagonal-lattice based shape extraction method worked well on their test cases. As expected it worked better on natural curves due to the ability of the 'round' hexagonal attention window to match curves, while it struggled somewhat on sharp edges where a hexagon cannot nicely sit.

See Also

References

  1. ^ McCormick, Bruce H. The Illinois Pattern Recognition Computer - ILLIAC III. IEEE Transactions on Electronic Computers, 1963.
  2. ^ Golay, Marcel J. E. Hexagonal Parallel Pattern Transformations. IEEE Transactions on Computers, 1969.
  3. ^ Middleton, Lee., Coghill, George., and Sivaswamy, Jayanthi. Saccadic Exploration using a Hexagonal Retina. Proceedings of the International ICSC Congress on Intelligent Systems and Applications, 2000.
  4. ^ Her, Innchyn. Geometric Transformations on the Hexagonal Grid. IEEE Transactions On Image Processing, 1995.
  5. ^ Foley, J. D., van Dam, A., Feiner, S. K., and Hughes, J. F. Computer Graphics: Principles and Practice. Addison-Wesley, 1990.
  6. ^ Middleton, Lee., and Sivaswamy, Jayanthi. Edge Detection in a Hexagonal-Image Processing Framework. Image and Vision Computing, Volume 19, Number 14, 2001.
  7. ^ Middleton, Lee., Sivaswamy, Jayanthi., and Coghill, George. Shape Extraction in a Hexagonal-Image Processing Framework. Proceedings of the 6th International Conference on Control, Automation, Robotics and Vision, 2000.

External Links

Category:Image processing