Row-major order

From Wikipedia, the free encyclopedia

In computing, row-major order and column-major order describe methods for storing multidimensional arrays in linear memory. Following standard matrix notation, rows are numbered by the first index of a two-dimensional array and columns by the second index. Array layout is critical for correctly passing arrays between programs written in different languages. It is also important for performance when traversing an array because accessing array elements that are contiguous in memory is usually faster than accessing elements which are not, due to caching.

Row-major order is used in C/C++, Mathematica, PL/I, Pascal, Python (as the default layout of NumPy arrays), Speakeasy, SAS and others. Column-major order is used in Fortran, OpenGL and OpenGL ES, MATLAB, GNU Octave, R, Julia, Rasdaman, and Scilab.

Row-major order

In row-major storage, a multidimensional array in linear memory is organized such that rows are stored one after the other. It is the approach used by the C programming language, among others.

For example, consider this 2×3 array:

 \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \end{bmatrix}

An array declared in C as

int A[2][3] = { {1, 2, 3}, {4, 5, 6} };

is laid out contiguously in linear memory as:

1  2  3  4  5  6

To traverse this array in the order in which it is laid out in memory, one would use the following nested loop:

/* requires <stdio.h>; prints 1 through 6, one value per line */
for (int row = 0; row < 2; row++)
    for (int column = 0; column < 3; column++)
        printf("%d\n", A[row][column]);

Moving from one column to the next changes the offset by 1, and moving from one row to the next changes it by 3, the number of columns. Treating the matrix as a one-dimensional array, the linear offset from the beginning of the array to any given element A[row][column], with zero-based indices, can then be computed as:

offset = (row * NUMCOLS) + column

where NUMCOLS is the number of columns in the array.
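
The offset formula can be sketched in C as follows. This is a minimal illustration, not part of any library: the function name row_major_offset and the parameter name num_cols (standing in for NUMCOLS) are assumptions made for this example, and indices are zero-based.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: linear offset of element (row, column) in a
   row-major array that has num_cols columns. Indices are zero-based. */
size_t row_major_offset(size_t row, size_t column, size_t num_cols)
{
    assert(column < num_cols);      /* the column index must be in range */
    return row * num_cols + column; /* offset = (row * NUMCOLS) + column */
}
```

For the 2×3 example above, stored flat as {1, 2, 3, 4, 5, 6}, row_major_offset(1, 2, 3) evaluates to 5, which is the position of the value 6.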

Conversely, given an array element's linear offset, its corresponding row and column can be determined from:

row = offset / NUMCOLS
column = offset % NUMCOLS

where / represents integer division and % is the modulo (integer remainder) operator of the C language.

These formulas work only when following the C convention of numbering the first element 0. In other words, the element in row 1, column 2 of matrix A in one-based matrix notation is represented as A[0][1].
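
The inverse mapping can be sketched the same way. The struct index2d and the function row_major_index below are hypothetical names introduced only for this illustration; the sketch assumes zero-based indices and at least one column.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair of zero-based indices. */
struct index2d {
    size_t row;
    size_t column;
};

/* Recovers (row, column) from a linear offset in a row-major array
   that has num_cols columns. */
struct index2d row_major_index(size_t offset, size_t num_cols)
{
    assert(num_cols > 0);            /* avoid division by zero */
    struct index2d idx;
    idx.row = offset / num_cols;     /* integer division */
    idx.column = offset % num_cols;  /* integer remainder */
    return idx;
}
```

With three columns, offset 5 maps back to row 1, column 2, and offset 3 maps to row 1, column 0.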

This technique generalizes to higher dimensions, so a 2×3×4 array looks like:

int A[2][3][4] = {{{1,2,3,4}, {5,6,7,8}, {9,10,11,12}}, {{13,14,15,16}, {17,18,19,20}, {21,22,23,24}}};

and the array is laid out in linear memory as:

1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24
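
For three dimensions, the same reasoning gives offset = (i * N2 + j) * N3 + k for an N1×N2×N3 array. The sketch below is only an illustration of that formula; the function name row_major_offset_3d and its parameter names are assumptions, not part of the original text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of element (i, j, k) in a row-major
   three-dimensional array whose last two dimensions have sizes n2 and n3.
   Indices are zero-based. */
size_t row_major_offset_3d(size_t i, size_t j, size_t k,
                           size_t n2, size_t n3)
{
    assert(j < n2 && k < n3);   /* inner indices must be in range */
    return (i * n2 + j) * n3 + k;
}
```

In the 2×3×4 example, element (0, 1, 0) holds the value 5 and sits at offset 4, while the last element (1, 2, 3) holds 24 and sits at offset 23.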

Column-major order

Column-major order is a similar method of flattening arrays onto linear memory, but the columns, rather than the rows, are stored one after the other. The scientific programming languages Fortran and Julia, the matrix-oriented languages MATLAB,[1] Octave and Scilab, the statistical languages S-Plus[2] and R,[3] the shading languages GLSL and HLSL (but not Cg), and the array database Rasdaman use column-major ordering. The array

 \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \end{bmatrix}

if stored contiguously in linear memory with column-major order looks like the following:

1  4  2  5  3  6

The memory offset could then be computed as:

offset = row + column*NUMROWS

where NUMROWS represents the number of rows in the array (in this case, 2), and row and column are zero-based indices.
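
A C sketch of the column-major formula follows. C itself stores arrays in row-major order, so the example assumes the data has already been flattened by hand into a one-dimensional array; the function name col_major_offset and the parameter num_rows (standing in for NUMROWS) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: linear offset of element (row, column) in a
   column-major array that has num_rows rows. Indices are zero-based. */
size_t col_major_offset(size_t row, size_t column, size_t num_rows)
{
    assert(row < num_rows);          /* the row index must be in range */
    return row + column * num_rows;  /* offset = row + column*NUMROWS */
}
```

With the example stored as {1, 4, 2, 5, 3, 6}, the element in row 0, column 1 (the value 2) is at offset 2, and the element in row 1, column 2 (the value 6) is at offset 5.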

Treating a row-major array as a column-major array is the same as transposing it. Because performing a transpose requires data movement, and is quite difficult to do in-place for non-square matrices, such transpositions are rarely performed explicitly. For example, software libraries for linear algebra, such as the BLAS, typically provide options to specify that certain matrices are to be interpreted in transposed order to avoid the necessity of data movement.
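
The transpose relationship can be checked with a small sketch. The accessor names row_major_get and col_major_get are hypothetical, and the buffer is assumed to hold the 2×3 example in row-major order. Reading the same buffer as a column-major 3×2 array yields the transpose without moving any data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accessor: element (row, column) of a buffer interpreted
   as a row-major array with num_cols columns. */
int row_major_get(const int *buf, size_t row, size_t column, size_t num_cols)
{
    assert(column < num_cols);
    return buf[row * num_cols + column];
}

/* Hypothetical accessor: element (row, column) of the same kind of buffer
   interpreted as a column-major array with num_rows rows. */
int col_major_get(const int *buf, size_t row, size_t column, size_t num_rows)
{
    assert(row < num_rows);
    return buf[row + column * num_rows];
}
```

For the buffer {1, 2, 3, 4, 5, 6}, col_major_get(buf, r, c, 3) returns the same value as row_major_get(buf, c, r, 3) for every valid r and c, so the column-major 3×2 view is the transpose of the row-major 2×3 view.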

Generalization to higher dimensions

Both of these concepts generalize to arrays with more than two dimensions. For higher-dimensional arrays, the ordering determines which dimension is contiguous in memory. Any of the dimensions could be contiguous, just as a two-dimensional array could be listed column-first or row-first. The difference in offset between consecutive values of any other index is then determined by a product of the sizes of other dimensions. It is uncommon, however, to use any ordering other than the two in which the last dimension or the first dimension is contiguous. These two orderings correspond to row-major and column-major, respectively.

More explicitly, consider a d-dimensional N_1 \times N_2 \times \cdots \times N_d array with dimensions N_k (k = 1, \ldots, d). A given element of this array is specified by a tuple (n_1, n_2, \ldots, n_d) of d (zero-based) indices n_k \in [0,N_k - 1].

In row-major order, the last dimension is contiguous, so that the memory-offset of this element is given by:

n_d + N_d \cdot (n_{d-1} + N_{d-1} \cdot (n_{d-2} + N_{d-2} \cdot (\cdots + N_2 n_1)\cdots))
= \sum_{k=1}^d \left( \prod_{\ell=k+1}^d N_\ell \right) n_k
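
The nested form of the row-major formula can be evaluated from the innermost term outward, one dimension at a time. The following is a minimal sketch of that evaluation; the function name row_major_offset_nd and the array parameters dims and idx (holding N_1..N_d and n_1..n_d at positions 0..d-1) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: row-major offset of the element with zero-based
   indices idx[0..d-1] in an array with dimension sizes dims[0..d-1]. */
size_t row_major_offset_nd(const size_t *dims, const size_t *idx, size_t d)
{
    size_t offset = 0;
    for (size_t k = 0; k < d; k++) {
        assert(idx[k] < dims[k]);            /* each index must be in range */
        offset = offset * dims[k] + idx[k];  /* fold in the next dimension */
    }
    return offset;
}
```

For a 2×3×4 array, the indices (1, 2, 3) give offset 23 and the indices (0, 1, 0) give offset 4.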

In column-major order, the first dimension is contiguous, so that the memory-offset of this element is given by:

n_1 + N_1 \cdot (n_2 + N_2 \cdot (n_3 + N_3 \cdot (\cdots + N_{d-1} n_d)\cdots))
= \sum_{k=1}^d \left( \prod_{\ell=1}^{k-1} N_\ell \right) n_k
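
The column-major formula can be evaluated in the same manner, starting from the last dimension and working toward the first. As before, this is only a sketch under stated assumptions: the name col_major_offset_nd and the parameters dims and idx are hypothetical, with N_1..N_d and n_1..n_d stored at positions 0..d-1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: column-major offset of the element with zero-based
   indices idx[0..d-1] in an array with dimension sizes dims[0..d-1]. */
size_t col_major_offset_nd(const size_t *dims, const size_t *idx, size_t d)
{
    size_t offset = 0;
    for (size_t k = d; k-- > 0; ) {          /* visit k = d-1 down to 0 */
        assert(idx[k] < dims[k]);            /* each index must be in range */
        offset = offset * dims[k] + idx[k];  /* fold in the next dimension */
    }
    return offset;
}
```

For a 2×3×4 array, incrementing the first index moves the offset by 1, the second by 2, and the third by 6, so the indices (1, 0, 0), (0, 1, 0) and (0, 0, 1) give offsets 1, 2 and 6.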

Note that the difference between row-major and column-major order is simply that the order of the dimensions is reversed. Equivalently, in row-major order the rightmost indices vary faster as one steps through consecutive memory locations, while in column-major order the leftmost indices vary faster.

References

  1. MATLAB documentation, MATLAB Data Storage (retrieved from Mathworks.co.uk, January 2014).
  2. Spiegelhalter et al. (2003, p. 17): Spiegelhalter, David; Thomas, Andrew; Best, Nicky; Lunn, Dave (January 2003), "Formatting of data: S-Plus format", WinBUGS User Manual (Version 1.4 ed.), Robinson Way, Cambridge CB2 2SR, UK: MRC Biostatistics Unit, Institute of Public Health, PDF document.
  3. An Introduction to R, Section 5.1: Arrays (retrieved March 2010).