# Z-buffering

Z-buffer data

In computer graphics, z-buffering, also known as depth buffering, is the management of image depth coordinates in 3D graphics, usually done in hardware, sometimes in software. It is one solution to the visibility problem, which is the problem of deciding which elements of a rendered scene are visible and which are hidden. Z-buffering was first described in 1974 by Wolfgang Straßer in Chapter 6 (page 6-1) of his PhD thesis.[1] The painter's algorithm is another common solution which, though less efficient, can also handle non-opaque scene elements. The z-buffer uses the image-space method for hidden surface detection. The term z-buffer can refer either to a data structure or to the method used to perform operations on that structure.

In a 3D rendering engine, when an object is projected onto the screen, the depth (z-value) of each generated pixel in the projected screen image is stored in a buffer (the z-buffer or depth buffer). A z-value is the perpendicular distance from a pixel on the projection plane to its corresponding 3D coordinate on a polygon in world-space.

The z-buffer has the same internal data structure as an image, namely a 2D array, with the only difference being that it stores a z-value for each screen pixel instead of pixel data. It has the same dimensions as the screen buffer, except when multiple z-buffers are used, such as in split-screen rendering. It operates in screen-space and takes as its input a projected image that originates from a projection of an object to the screen.

Before a projection from world to screen-space is done, primary visibility tests (such as back-face culling) are usually performed. Before an image is passed to the z-buffer, secondary visibility tests (such as overlap checks and screen clipping) are usually performed on objects' vertices. Primary and secondary visibility tests do not require the checking of individual pixels, so the z-buffer is relieved of some duty.

When viewing an image containing partially or fully overlapping opaque objects or surfaces, it is not possible to fully see the objects that are farthest from the viewer and behind other objects (i.e., some surfaces are hidden behind others). The identification and removal of these surfaces is called the hidden-surface problem. To improve rendering time, hidden surfaces should be removed before a projected image of the surfaces is passed to the z-buffer. To check for overlap, the z-buffer calculates the z-value of a pixel corresponding to the first object and compares it with the z-value stored at the same pixel location in the z-buffer, which corresponds to the object currently known to be closest to the viewer. If the calculated z-value is smaller than the stored z-value, the stored value is replaced with the calculated value. This does not necessarily mean that the first object as a whole is closer to the viewer than the closest known object, only that the corresponding 3D point on the first object's surface in world-space is closer to the viewer. In other words, the objects overlap on screen, and at least some part of the first object is closer and thus visible to the viewer. In the end, the z-buffer allows correct reproduction of the usual depth perception: a close object hides one farther away. This is called z-culling.

The granularity of a z-buffer has a great influence on the scene quality: the traditional 16-bit z-buffer can result in artifacts (called "z-fighting" or stitching) when two objects are very close to each other. A more modern 24-bit or 32-bit z-buffer behaves much better, although the problem cannot be entirely eliminated without additional algorithms. An 8-bit z-buffer is almost never used since it has too little precision.
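The effect of bit depth can be illustrated with a small sketch. It assumes a perspective depth mapping normalized to [0, 1] (the one derived in the Mathematics section); the function name `quantize_depth`, the clip planes and the two sample depths are hypothetical values chosen for illustration.

```python
import math

def quantize_depth(z, near, far, bits):
    """Map camera-space depth z to an integer z-buffer value with `bits` bits."""
    # Perspective depth normalized to [0, 1]: 0 at the near plane, 1 at the far plane.
    normalized = far * (z - near) / (z * (far - near))
    return math.floor((2**bits - 1) * normalized)

near, far = 0.1, 1000.0      # hypothetical clip planes
z_a, z_b = 500.0, 500.5      # two surfaces half a unit apart, far from the camera

for bits in (16, 24, 32):
    a = quantize_depth(z_a, near, far, bits)
    b = quantize_depth(z_b, near, far, bits)
    print(bits, a, b, "indistinguishable" if a == b else "distinct")
```

With these particular values, the two surfaces receive the same 16-bit depth value, so the depth test cannot order them reliably, while the 24-bit and 32-bit buffers keep them apart.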

## Uses

The Z-buffer is a technology used in almost all contemporary computers, laptops and mobile phones for performing 3D graphics, for example for computer games. The Z-buffer is implemented as hardware in the silicon ICs (integrated circuits) within these computers. The Z-buffer is also used (implemented as software as opposed to hardware) for producing computer-generated special effects for films.

Furthermore, Z-buffer data obtained from rendering a surface from a light's point-of-view permits the creation of shadows by the shadow mapping technique.

## Developments

Even with small enough granularity, quality problems may arise when precision in the z-buffer's distance values is not spread evenly over distance. Nearer values are much more precise (and hence can display closer objects better) than values which are farther away. Generally, this is desirable, but sometimes it will cause artifacts to appear as objects become more distant. A variation on z-buffering which results in more evenly distributed precision is called w-buffering (see below).

At the start of a new scene, the z-buffer must be cleared to a defined value, usually 1.0, because this value is the upper limit of depth (on a scale of 0 to 1), meaning that no object is present at this point within the viewing frustum.

The invention of the z-buffer concept is most often attributed to Edwin Catmull, although Wolfgang Straßer described this idea in his 1974 Ph.D. thesis months before Catmull's invention.[1]

On PC graphics cards from the 1999–2005 period, z-buffer management uses a significant share of the available memory bandwidth. Various methods have been employed to reduce the performance cost of z-buffering, such as lossless compression (the computing resources needed to compress and decompress are cheaper than bandwidth) and ultra-fast hardware z-clear, which makes obsolete the "one frame positive, one frame negative" trick (skipping the inter-frame clear altogether by using signed numbers to check depths).

## Z-culling

In rendering, z-culling is early pixel elimination based on depth, a method that provides an increase in performance when rendering of hidden surfaces is costly. It is a direct consequence of z-buffering, where the depth of each pixel candidate is compared to the depth of existing geometry behind which it might be hidden.

When using a z-buffer, a pixel can be culled (discarded) as soon as its depth is known, which makes it possible to skip the entire process of lighting and texturing a pixel that would not be visible anyway. Also, time-consuming pixel shaders will generally not be executed for the culled pixels. This makes z-culling a good optimization candidate in situations where fillrate, lighting, texturing or pixel shaders are the main bottlenecks.

While z-buffering allows the geometry to be unsorted, sorting polygons by increasing depth (thus using a reverse painter's algorithm) allows each screen pixel to be rendered fewer times. This can increase performance in fillrate-limited scenes with large amounts of overdraw, but if not combined with z-buffering it suffers from severe problems such as:

• polygons might occlude one another in a cycle (e.g.: triangle A occludes B, B occludes C, C occludes A), and
• there is no canonical "closest" point on a triangle (e.g.: no matter whether one sorts triangles by their centroid or closest point or furthest point, one can always find two triangles A and B such that A is "closer" but in reality B should be drawn first).

As such, a reverse painter's algorithm cannot be used as an alternative to Z-culling (without strenuous re-engineering), except as an optimization to Z-culling. For example, an optimization might be to keep polygons sorted according to x/y-location and z-depth to provide bounds, in an effort to quickly determine if two polygons might possibly have an occlusion interaction.
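The benefit of combining front-to-back sorting with z-culling can be sketched by counting how many fragments pass the depth test and therefore need shading. This is a toy model under stated assumptions: every layer is a hypothetical full-screen surface at a constant depth, the "screen" is a single row of pixels, and `count_shaded` is an illustrative name.

```python
def count_shaded(layer_depths, width):
    """Return how many fragments survive the early depth test and get shaded."""
    depth = [1.0] * width            # z-buffer cleared to the far value
    shaded = 0
    for z in layer_depths:           # each layer covers every pixel at depth z
        for x in range(width):
            if z < depth[x]:         # early depth test: cull before shading
                depth[x] = z
                shaded += 1          # stands in for lighting/texturing work
    return shaded

layers = [0.2, 0.4, 0.6, 0.8]        # hypothetical full-screen layers
front_to_back = count_shaded(sorted(layers), width=8)
back_to_front = count_shaded(sorted(layers, reverse=True), width=8)
print(front_to_back, back_to_front)  # 8 32
```

In both orders the final depth buffer is identical; only the amount of shading work differs, which is why sorting acts as an optimization on top of the z-buffer rather than a replacement for it.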

## Mathematics

The range of depth values in camera space to be rendered is often defined between a ${\displaystyle {\mathit {near}}}$ and ${\displaystyle {\mathit {far}}}$ value of ${\displaystyle z}$. After a perspective transformation, the new value of ${\displaystyle z}$, or ${\displaystyle z'}$, is defined by:

${\displaystyle z'={\frac {{\mathit {far}}+{\mathit {near}}}{{\mathit {far}}-{\mathit {near}}}}+{\frac {1}{z}}\left({\frac {-2\cdot {\mathit {far}}\cdot {\mathit {near}}}{{\mathit {far}}-{\mathit {near}}}}\right)}$

After an orthographic projection, the new value of ${\displaystyle z}$, or ${\displaystyle z'}$, is defined by:

${\displaystyle z'=2\cdot {\frac {{z}-{\mathit {near}}}{{\mathit {far}}-{\mathit {near}}}}-1}$

where ${\displaystyle z}$ is the original depth value in camera space, which is sometimes called ${\displaystyle w}$ or ${\displaystyle w'}$.

The resulting values of ${\displaystyle z'}$ are normalized between the values of -1 and 1, where the ${\displaystyle {\mathit {near}}}$ plane is at -1 and the ${\displaystyle {\mathit {far}}}$ plane is at 1. Values outside of this range correspond to points which are not in the viewing frustum, and shouldn't be rendered.
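As a quick check of the two mappings, the following sketch evaluates both formulas directly. The function names and the sample clip planes are assumptions made for illustration.

```python
def perspective_depth(z, near, far):
    """Perspective mapping of camera-space z to the range [-1, 1]."""
    return (far + near) / (far - near) + (1.0 / z) * (-2.0 * far * near / (far - near))

def orthographic_depth(z, near, far):
    """Orthographic mapping of camera-space z to the range [-1, 1]."""
    return 2.0 * (z - near) / (far - near) - 1.0

near, far = 1.0, 100.0   # hypothetical clip planes
for z in (near, 2.0, 50.5, far):
    print(z,
          round(perspective_depth(z, near, far), 4),
          round(orthographic_depth(z, near, far), 4))
```

Both functions map the near plane to -1 and the far plane to 1. With these clip planes, the perspective mapping has already reached about 0.01 at z = 2, whereas the orthographic mapping reaches 0 only at the midpoint z = 50.5, which illustrates how unevenly the perspective mapping distributes depth values.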

### Fixed-point representation

Typically, these values are stored in the z-buffer of the hardware graphics accelerator in fixed point format. First they are normalized to a more common range which is [0,1] by substituting the appropriate conversion ${\displaystyle z'_{2}={\frac {\left(z'_{1}+1\right)}{2}}}$ into the previous formula:

${\displaystyle z'={\frac {{\mathit {far}}+{\mathit {near}}}{2\cdot \left({\mathit {far}}-{\mathit {near}}\right)}}+{\frac {1}{z}}\left({\frac {-{\mathit {far}}\cdot {\mathit {near}}}{{\mathit {far}}-{\mathit {near}}}}\right)+{\frac {1}{2}}}$

Second, the above formula is multiplied by ${\displaystyle S=2^{d}-1}$, where d is the depth of the z-buffer (usually 16, 24 or 32 bits), and the result is rounded to an integer:[2]

${\displaystyle z'=f(z)=\left\lfloor \left(2^{d}-1\right)\cdot \left({\frac {{\mathit {far}}+{\mathit {near}}}{2\cdot \left({\mathit {far}}-{\mathit {near}}\right)}}+{\frac {1}{z}}\left({\frac {-{\mathit {far}}\cdot {\mathit {near}}}{{\mathit {far}}-{\mathit {near}}}}\right)+{\frac {1}{2}}\right)\right\rfloor }$

This formula can be inverted and differentiated in order to calculate the z-buffer resolution (the 'granularity' mentioned earlier). The inverse of the above ${\displaystyle f(z)\,}$ is:

${\displaystyle z={\frac {-{\mathit {far}}\cdot {\mathit {near}}}{{\frac {z'}{S}}\left({\mathit {far}}-{\mathit {near}}\right)-{\mathit {far}}}}={\frac {-S\cdot {\mathit {far}}\cdot {\mathit {near}}}{z'\left({\mathit {far}}-{\mathit {near}}\right)-{\mathit {far}}\cdot S}}}$

where ${\displaystyle S=2^{d}-1}$
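The quantization ${\displaystyle f(z)}$ and its inverse can be written out as a short sketch. The names `to_fixed` and `from_fixed` and the sample values are hypothetical; because of the floor operation, the reconstructed depth is only expected to agree with the original to within one z-buffer step.

```python
import math

def to_fixed(z, near, far, bits):
    """f(z): camera-space depth to an integer z-buffer value."""
    s = 2**bits - 1
    unit = ((far + near) / (2 * (far - near))
            + (1.0 / z) * (-far * near / (far - near))
            + 0.5)
    return math.floor(s * unit)

def from_fixed(zp, near, far, bits):
    """Inverse of f: integer z-buffer value back to camera-space depth."""
    s = 2**bits - 1
    return -s * far * near / (zp * (far - near) - far * s)

near, far, bits = 1.0, 100.0, 16   # hypothetical clip planes and bit depth
for z in (2.0, 10.0, 99.0):
    stored = to_fixed(z, near, far, bits)
    print(z, stored, from_fixed(stored, near, far, bits))
```

The stored value 0 maps back to the near plane and the stored value S maps back to the far plane, matching the normalization to [0, 1] described above.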

The z-buffer resolution in terms of camera space is the increment resulting from the smallest change in the integer stored in the z-buffer, which is +1 or -1. Therefore, this resolution can be calculated from the derivative of ${\displaystyle z}$ as a function of ${\displaystyle z'}$:

${\displaystyle {\frac {dz}{dz'}}={\frac {-1\cdot (-1)\cdot S\cdot {\mathit {far}}\cdot {\mathit {near}}}{\left(z'\left({\mathit {far}}-{\mathit {near}}\right)-{\mathit {far}}\cdot S\right)^{2}}}\cdot \left({\mathit {far}}-{\mathit {near}}\right)}$

Expressing it back in camera space terms, by substituting ${\displaystyle z'}$ by the above ${\displaystyle f(z)\,}$:

${\displaystyle {\begin{aligned}{\frac {dz}{dz'}}&={\frac {-1\cdot (-1)\cdot S\cdot {\mathit {far}}\cdot {\mathit {near}}\cdot \left({\mathit {far}}-{\mathit {near}}\right)}{\left(S\cdot \left({\frac {-{\mathit {far}}\cdot {\mathit {near}}}{z}}+{\mathit {far}}\right)-{\mathit {far}}\cdot S\right)^{2}}}\\&={\frac {\left({\mathit {far}}-{\mathit {near}}\right)\cdot z^{2}}{S\cdot {\mathit {far}}\cdot {\mathit {near}}}}\\&={\frac {z^{2}}{S\cdot {\mathit {near}}}}-{\frac {z^{2}}{S\cdot {\mathit {far}}}}\approx {\frac {z^{2}}{S\cdot {\mathit {near}}}}\end{aligned}}}$

This shows that the values of ${\displaystyle z'}$ are grouped much more densely near the ${\displaystyle {\mathit {near}}}$ plane, and much more sparsely farther away, resulting in better precision closer to the camera. The smaller the ${\displaystyle {\mathit {near}}/{\mathit {far}}}$ ratio is, the less precision there is far away; setting the ${\displaystyle {\mathit {near}}}$ plane too close to the camera is a common cause of undesirable rendering artifacts in more distant objects.[3]
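The resolution formula can be evaluated numerically to see both effects: the quadratic growth with distance and the sensitivity to the near plane. The function name `depth_resolution` and the sample parameters are assumptions for illustration.

```python
def depth_resolution(z, near, far, bits):
    """Camera-space distance covered by one z-buffer step at depth z."""
    s = 2**bits - 1
    return (far - near) * z * z / (s * far * near)

far, bits = 1000.0, 24              # hypothetical far plane and bit depth
for near in (1.0, 0.01):
    for z in (1.0, 10.0, 100.0, 900.0):
        step = depth_resolution(z, near, far, bits)
        print(f"near={near} z={z}: {step:.6g}")
```

In this sketch, moving the near plane from 1.0 to 0.01 makes each depth step roughly 100 times coarser at every distance, and a tenfold increase in z makes the step 100 times coarser for a fixed near plane.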

To implement a z-buffer, the values of ${\displaystyle z'}$ are linearly interpolated across screen space between the vertices of the current polygon, and these intermediate values are generally stored in the z-buffer in fixed point format.

### W-buffer

To implement a w-buffer,[4] the old values of ${\displaystyle z}$ in camera space, or ${\displaystyle w}$, are stored in the buffer, generally in floating point format. However, these values cannot be linearly interpolated across screen space from the vertices; they usually have to be inverted, interpolated, and then inverted again. The resulting values of ${\displaystyle w}$, as opposed to ${\displaystyle z'}$, are spaced evenly between ${\displaystyle {\mathit {near}}}$ and ${\displaystyle {\mathit {far}}}$. There are implementations of the w-buffer that avoid the inversions altogether.
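The invert, interpolate, invert sequence can be sketched for a single edge between two vertices. The sketch relies on the fact that, for a planar polygon under perspective projection, 1/w varies linearly in screen space; the function name `interpolate_w` and the sample depths are hypothetical.

```python
def interpolate_w(w0, w1, t):
    """Camera-space depth at screen-space parameter t in [0, 1] between two vertices."""
    inv = (1.0 - t) * (1.0 / w0) + t * (1.0 / w1)   # interpolate 1/w linearly
    return 1.0 / inv                                 # invert again to recover w

w0, w1 = 1.0, 10.0                      # hypothetical vertex depths
print(interpolate_w(w0, w1, 0.5))       # perspective-correct depth at the screen midpoint
print((w0 + w1) / 2)                    # naive linear interpolation, for contrast
```

The screen-space midpoint of the edge corresponds to a point much closer to the near vertex than the naive average suggests, which is why interpolating w directly would give incorrect depths.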

Whether a z-buffer or w-buffer results in a better image depends on the application.

## Algorithmics

The following pseudocode demonstrates the process of z-buffering:

    // First, initialize the depth of each pixel.
    d(i, j) = infinite   // maximum depth

    // Initialize the color value of each pixel.
    c(i, j) = background color

    for (each polygon)
    {
        for (each pixel (i, j) in the polygon's projection)
        {
            z = depth of the polygon at the point (x, y) corresponding to pixel (i, j)
            if (z < d(i, j))
            {
                d(i, j) = z;
                c(i, j) = color;
            }
        }
    }
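
The pseudocode above can be turned into a small runnable sketch. As a simplifying assumption, each "polygon" here is an axis-aligned rectangle with a per-pixel depth function rather than a rasterized triangle, and colors are single characters; the `render` function and the shape tuple layout are hypothetical.

```python
import math

def render(width, height, shapes, background="."):
    """Z-buffer a list of shapes; each shape is (x0, y0, x1, y1, depth_fn, color)."""
    depth = [[math.inf] * width for _ in range(height)]     # d(i, j) = infinity
    color = [[background] * width for _ in range(height)]   # c(i, j) = background
    for x0, y0, x1, y1, depth_fn, col in shapes:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                z = depth_fn(x, y)            # depth of the shape at this pixel
                if z < depth[y][x]:           # closer than what is stored
                    depth[y][x] = z
                    color[y][x] = col
    return color

shapes = [
    (0, 0, 6, 3, lambda x, y: 5.0, "A"),        # flat rectangle at depth 5
    (2, 1, 8, 4, lambda x, y: 2.5 + x, "B"),    # slanted rectangle crossing A
]
image = render(8, 4, shapes)
for row in image:
    print("".join(row))
```

Because the depth test is applied per pixel, the slanted rectangle B is visible only where it is closer than A, and the result does not depend on the order in which the two shapes are submitted.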