Talk:Row- and column-major order
WikiProject Computer science (Rated C-class, Mid-importance)
2-dimensional and 3-dimensional arrays
What is the general formula for locating an item in a 2-dimensional or 3-dimensional array?
- It depends on whether the storage is row-major or column-major and on how that is extended to higher dimensions. Mathematically, a three-dimensional (x,y,z) coordinate is pretty common, and you can assume (1,2,3) refers to the point x=1, y=2, z=3. If you were going to serialize the array into a linear sequence of numbers, you might step through it along z (the right-most dimension) first, then y, then x, in a row-major-ish way, or along x first, then y, then z, in a column-major-ish way. If you have nx, ny, and nz slots along the three dimensions, then in row-major order z varies fastest and x varies slowest, so you'd locate the (x,y,z)th item at offset z + y*nz + x*ny*nz. In column-major order x varies fastest and z varies slowest, so (x,y,z) would be at offset x + y*nx + z*nx*ny. These formulas assume zero-based indices; of course there's lots of opportunity for off-by-one errors depending on whether your arrays are zero- or one-based. Drf5n 17:49, 25 August 2006 (UTC)
The bit on multi-dimensional arrays is a little difficult when you start thinking about what 'column' means in a three- or higher-dimensional tuple. For instance, in a three-dimensional (x,y,z) array, is it clear that 'x' is the row and 'z' (definitely not 'y') is the column? Or, for an n-dimensional (d1,d2,d3,...,dn) array, is the row-major/column-major distinction better stated as: row-major means dn varies fastest (is the inner loop), whereas column-major means d1 varies fastest (is the inner loop)? Beyond 2D, the column-major concept seems a little fuzzy. Drf5n 18:08, 25 August 2006 (UTC)
- It's defined precisely in the article. The terminology is less apt in that case, I agree, but it is totally standard. — Steven G. Johnson (talk)
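- For anyone landing here with the original question, a minimal sketch of the offset calculation follows. It assumes zero-based indices and the usual convention that row-major means the last index varies fastest and column-major means the first index varies fastest. The function names and the example shape are made up for illustration and are not taken from the article.

```python
def row_major_offset(index, shape):
    """Zero-based linear offset with the LAST index varying fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def column_major_offset(index, shape):
    """Zero-based linear offset with the FIRST index varying fastest."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


# Hypothetical 3-D example: shape (nx, ny, nz), zero-based index (x, y, z).
nx, ny, nz = 2, 3, 4
x, y, z = 1, 0, 2

# The general functions agree with the expanded 3-D formulas.
assert row_major_offset((x, y, z), (nx, ny, nz)) == x * ny * nz + y * nz + z
assert column_major_offset((x, y, z), (nx, ny, nz)) == x + y * nx + z * nx * ny
```

The same two functions work for any number of dimensions, which is one way to read "row-major" and "column-major" beyond 2D: they only say which end of the index tuple changes fastest. For one-based arrays, subtract 1 from each index before calling them.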
What are advantages/disadvantages of row-major vs col-major?
It would be a helpful addition to this article to tell us why it's one way in C and the other in Fortran. Are they different for a reason, or was it a random decision? —The preceding unsigned comment was added by Msouth@gmail.com (talk • contribs) 17:32, 4 January 2007 (UTC).
- Row-major is the most common representation because it means that the rows of a 2-D array are just sub-arrays within a larger array. In C, for instance, multidimensional arrays are not defined directly, but rather as arrays of arrays. You can't do this with a column-major representation, because the elements of each inner-most array are interleaved throughout the outer array. Row-major representation also gives you better cache locality when you access each sub-array independently, as with an array of fixed-length strings.
- An advantage of column-major is that it may make more sense for vector operations, which is important for scientific computing. Consider the matrix row operation of adding one row to all other rows. You would proceed by multiplying each element in the first column by the first element of the row, and then the next column, and so on. Since each column is contiguous in a column-major representation, you get better cache locality, and so fewer cache misses and better performance.
- Something like this needs to be in the article, but I'm not comfortable adding it without a citation, which I don't have. AaronWL 11:05, 22 March 2007 (UTC)
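- To make the "rows are just sub-arrays" point concrete, here is a small sketch using plain Python lists as stand-ins for linear memory. The sizes, labels, and helper names are hypothetical. It only illustrates which accesses are contiguous and which are strided; it says nothing about actual cache behaviour.

```python
m, n = 3, 4  # hypothetical sizes: 3 rows, 4 columns

# Logical matrix: entry (i, j) holds the label "ij" so positions are easy to follow.
matrix = [[f"{i}{j}" for j in range(n)] for i in range(m)]

# Row-major buffer: rows laid end to end.
row_major = [matrix[i][j] for i in range(m) for j in range(n)]
# Column-major buffer: columns laid end to end.
col_major = [matrix[i][j] for j in range(n) for i in range(m)]


def row_from_row_major(i):
    return row_major[i * n:(i + 1) * n]  # one contiguous slice


def row_from_col_major(i):
    return col_major[i::m]  # strided: every m-th element


def col_from_col_major(j):
    return col_major[j * m:(j + 1) * m]  # one contiguous slice


def col_from_row_major(j):
    return row_major[j::n]  # strided: every n-th element


# Both layouts hold the same logical rows and columns.
assert row_from_row_major(1) == row_from_col_major(1) == matrix[1]
assert col_from_col_major(2) == col_from_row_major(2)
```

In the row-major buffer a whole row is a single contiguous slice, which is what lets C treat a 2-D array as an array of arrays. In the column-major buffer the same is true of columns instead.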
- I think that if you are designing a language, it's an arbitrary decision. However, it is important that a programmer using that language knows which order the language uses. When defining the array, the programmer decides how to arrange the axes (xyz, zyx, xzy, etc.), and it is up to them to arrange the axes in a way that enhances performance. The programmer can't do this without knowing how the language is going to unfold that multidimensional array (i.e. which major order it uses). 188.8.131.52 05:08, 8 November 2007 (UTC)
- In principle, there aren't really any intrinsic advantages one way or the other. As long as you know what the ordering is, you can order your operations accordingly to maximize locality. (In Aaron's example, for row-major order you would process entire rows at a time, rather than processing one column at a time in column-major, to maximize locality.)
- In practice, row-major has an advantage in that human programmers tend to order their loops in the same order as the indices. That is, a programmer will tend to write:
    for i = 1 to m
        for j = 1 to n
            do something with the array element A(i,j)
- rather than
    for j = 1 to n
        for i = 1 to m
            do something with the array element A(i,j)
- The former loop ordering maximizes locality for row-major order, and the latter maximizes locality for column-major order. As a consequence, many Fortran programmers accidentally put their loops in the wrong order and get poor cache performance, and considerable effort was invested into developing compiler optimizations that could fix the loop ordering.
- This kind of thing is discussed in lots of Fortran optimization guides precisely because it is so easy to get wrong (you don't see it to the same extent in C optimization guides). It's probably possible to track down a reputable source giving this as a motivation for preferring row-major order. — Steven G. Johnson (talk) 22:47, 11 March 2010 (UTC)
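- A rough way to see the loop-ordering effect described above without timing anything is to list the linear offsets each loop order touches and count how many consecutive accesses land on the adjacent memory cell. This is only a sketch: the function names are invented, indices are zero-based (unlike the 1-based pseudocode), and "unit-stride steps" is a crude stand-in for locality, not a cache model.

```python
def offsets_visited(m, n, layout, outer):
    """Return the linear offsets touched, in visiting order.

    layout: "row" (offset = i*n + j) or "col" (offset = i + j*m)
    outer:  "i" for `for i: for j:` or "j" for `for j: for i:`
    """
    def offset(i, j):
        return i * n + j if layout == "row" else i + j * m

    if outer == "i":
        return [offset(i, j) for i in range(m) for j in range(n)]
    return [offset(i, j) for j in range(n) for i in range(m)]


def unit_stride_steps(offsets):
    """Count how many consecutive accesses land on the next memory cell."""
    return sum(1 for a, b in zip(offsets, offsets[1:]) if b == a + 1)


m, n = 3, 4  # hypothetical matrix size
for layout in ("row", "col"):
    for outer in ("i", "j"):
        steps = unit_stride_steps(offsets_visited(m, n, layout, outer))
        print(f"layout={layout} outer loop={outer}: {steps} unit-stride steps")
```

With i as the outer loop, every step is unit-stride in the row-major layout and none are in the column-major layout; with j as the outer loop the situation is reversed. That matches the point that the "natural" i-then-j ordering favours row-major storage.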
I would suggest that there needs to be a clarification that even though arrays are stored in row-major or column-major order in different languages, MATLAB and Fortran still INDEX their arrays in a row-major format (i.e. a(2,3) is the element in the second row, third column of matrix a). Can anyone elaborate on this topic? I got into a hideous argument with a coworker (who programs almost entirely in C++, and I in Fortran) over this subject, and we had to open MATLAB to settle the argument. We were both stumped.
- Row-major and column-major are descriptions of mappings from multidimensional indices to one-dimensional indices (e.g. linear memory). By itself, a multidimensional index like a(2,3) is neither "row-major" nor "column-major". — Steven G. Johnson (talk) 20:21, 6 May 2010 (UTC)
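- A small sketch of that distinction, assuming 1-based (row, column) indexing as in the a(2,3) example. The helper names and the 3x4 size are hypothetical. The same logical element comes back from both layouts; only its position in linear memory differs.

```python
m, n = 3, 4  # hypothetical 3x4 matrix


def value(i, j):
    """Logical content of the matrix at 1-based (row i, column j)."""
    return 10 * i + j


# The same matrix stored two ways in a flat list.
row_major = [value(i, j) for i in range(1, m + 1) for j in range(1, n + 1)]
col_major = [value(i, j) for j in range(1, n + 1) for i in range(1, m + 1)]


def get_row_major(i, j):
    return row_major[(i - 1) * n + (j - 1)]


def get_col_major(i, j):
    return col_major[(i - 1) + (j - 1) * m]


# a(2,3) means "second row, third column" under either storage order...
assert get_row_major(2, 3) == get_col_major(2, 3) == value(2, 3)
# ...but the element sits at a different linear position in each buffer.
assert row_major.index(value(2, 3)) == 6
assert col_major.index(value(2, 3)) == 7
```

So the index a(2,3) itself carries no storage order; the order only shows up in the mapping from (i, j) to a linear offset.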
Yeah, the whole thing is confusing as is. As I understand it, C and co. do this:
While VB and co. do this:
The array is stored the same in memory, row after row, as you'd expect. And this is why e.g. in VB the Redim Preserve statement can only touch the last dimension, since the array may have to be reallocated and this may need a memcopy. —Preceding unsigned comment added by 184.108.40.206 (talk) 11:00, 11 February 2011 (UTC)
The article's title is Row-major order, but the article isn't specific to row-major order; it also fully describes column-major order. Wouldn't it make more sense to rename the article Row-major and column-major order? Column-major order is used by multiple popular applications, libraries, etc., and it isn't uncommon to include both alternatives in a title. ctxppc (talk) 16:51, 8 May 2013 (UTC)
- Agreed. Perhaps array index ordering or something all-encompassing? —Ben FrantzDale (talk) 13:35, 9 May 2013 (UTC)
- I agree with Row-major and column-major order or Row- and column-major order or similar. I don't think we should go more generic than that; other possible indexing schemes (e.g. Morton order) should have their own articles. — Steven G. Johnson (talk) 14:17, 9 May 2013 (UTC)