Indentation style

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 209.16.242.50 (talk) at 21:55, 12 July 2007 (→‎Allman style). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer programming, an indent style is a convention governing the indentation of blocks of code to convey the program's structure. This article largely addresses the C programming language and its descendants, but the styles described can be (and frequently are) applied to most other programming languages, especially those in the curly bracket family. Indent style is just one aspect of programming style.

Indentation is not a requirement of most programming languages. Rather, programmers indent to better convey the structure of their program to human readers. In particular, indentation is used to show the relationship between control flow constructs such as conditions or loops and code contained within and outside them. However, some programming languages (such as Python and Occam) use the indentation to determine the structure instead of using braces or keywords.
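To illustrate for C that indentation carries no meaning for the compiler, here is a minimal sketch. The two functions, sum_indented and sum_flat, are hypothetical names invented for this example; they contain the same tokens and differ only in layout.

```c
/* Conventional layout: the indentation mirrors the nesting. */
int sum_indented(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++) {
        if (i % 2 == 0) {
            total += i;   /* add up the even numbers from 1 to n */
        }
    }
    return total;
}

/* Same tokens with the layout flattened; only human readers notice. */
int sum_flat(int n)
{
int total = 0; int i;
for (i = 1; i <= n; i++) { if (i % 2 == 0) { total += i; } }
return total;
}
```

Both functions return the same value for every n; the first is simply easier for a person to follow.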

The size of the indent is usually independent of the style. Many early programs used tab characters for indentation, for simplicity and to save on source file size. Unix editors generally view tabs as equivalent to eight characters, while Macintosh environments would set them to four, creating confusion when code was transferred back and forth. Modern programming editors are now often able to set arbitrary indentation sizes, and will insert the appropriate combination of spaces and tabs. They can also help with the cross-platform tab confusion by being configured to insert only spaces.
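The tab-width confusion can be made concrete with a small sketch. The function expand_tabs below is a hypothetical helper written for this illustration, not part of any standard library; it assumes a single line of input with no newlines and replaces each tab with spaces up to the next tab stop.

```c
#include <stddef.h>

/* Expand each tab in src to spaces up to the next multiple of tab_width.
 * Writes at most dst_size - 1 characters plus a terminating NUL to dst.
 * Returns the number of characters written, excluding the NUL.
 * Assumes src is a single line (newlines do not reset the column). */
size_t expand_tabs(const char *src, char *dst, size_t dst_size,
    size_t tab_width)
{
    size_t col = 0;

    if (dst_size == 0)
        return 0;
    if (tab_width == 0) {
        dst[0] = '\0';
        return 0;
    }
    for (; *src != '\0' && col + 1 < dst_size; src++) {
        if (*src == '\t') {
            /* Pad with spaces until the next tab stop. */
            size_t stop = (col / tab_width + 1) * tab_width;

            while (col < stop && col + 1 < dst_size)
                dst[col++] = ' ';
        } else {
            dst[col++] = *src;
        }
    }
    dst[col] = '\0';
    return col;
}
```

Called with a tab width of 8, a line that begins with one tab starts its text in column 8; with a width of 4 the same bytes start in column 4, which is why code indented with tabs looks different on differently configured systems.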

There are a number of computer programs that automatically correct indent styles as well as the length of tabs. A famous one among them is indent, a program included with many Unix-like operating systems. These programs work best for those who use an indent style close to the one that the tool's authors considered "proper"; those who use other styles are more likely to become frustrated.

K&R style

The K&R style, so called because it was used in Kernighan and Ritchie's book The C Programming Language, is commonly used in C. It is less common for Objective-C, C++, C#, and others. It keeps the first opening brace on the same line as the control statement, indents the statements within the braces, and puts the closing brace on the same indentation level as the control statement (on a line of its own). Functions, however, are braced differently from statements: an opening function brace is placed on the line following the declaration, at the same indentation level as the declaration. This is because in the original C language argument types had to be declared on the lines following the function header, whereas, when no arguments were necessary, the opening brace would appear on the same line as the function declaration.

int main(argc, argv)
int argc;
char **argv;
{
    ...
    while (x == y) {
        something();
        somethingelse();
        if (some_error)
            do_correct();
        else
            continue_as_usual();
    }
    finalthing();
    ...
}

Advocates of this style sometimes refer to it as "The One True Brace Style" (abbreviated as 1TBS or OTBS) because of the precedent set by C (although advocates of other styles have been known to use similarly strong language). The source code of the Unix kernel and Linux kernel is written in this style.

Advantages of this style are that the beginning brace does not require an extra line by itself; and the ending brace lines up with the statement it conceptually belongs to. One disadvantage of this style is that the ending brace of a block takes up an entire line by itself, which can be partially resolved in if/else blocks and do/while blocks:

if (x < 0) {
    printf("Negative");
    negative(x);
} else {
    printf("Positive");
    positive(x);
}

This style makes it difficult to scan source code for the opening brace of a block; however, the beginning of the block is otherwise just as easy to find by locating the first line where the code 'pulls left' (where the indentation level decreases).

The motivation of this style is apparently to conserve screen real estate (and keep more code visible at once) by avoiding the need to dedicate an entire line to a single character; when this style originated, a typical terminal had only 24 lines visible. This is most visibly demonstrated in the preceding example, as a chain of if...else if...[etc.]...else statements will display more code in a given number of lines than a style that places braces on new lines.
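As a rough illustration of the space saving, the following sketch writes a three-way chain in K&R style. The function sign_name is a hypothetical example, not taken from the book. The chain occupies seven lines here; with every brace on a line of its own, the same chain would take twelve.

```c
/* Returns a short description of the sign of x.
 * Braces are kept on single statements only to show the layout. */
const char *sign_name(int x)
{
    if (x < 0) {
        return "negative";
    } else if (x == 0) {
        return "zero";
    } else {
        return "positive";
    }
}
```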

While Java is often written in Allman or other styles, a significant body of Java code uses the K&R style, largely because Sun's original style guides used K&R, and as a result most of the standard source code for the Java API is written in K&R. It is also a popular indent style for ActionScript, along with the Allman style.

The C Programming Language does not explicitly specify this style, though it is followed consistently throughout the book. Of note from the book:

The position of braces is less important, although people hold passionate beliefs. We have chosen one of several popular styles. Pick a style that suits you, then use it consistently.

This might serve as a caution to those who are overly partisan about indentation style, especially considering the source: Dennis Ritchie designed and developed the C language itself, and Kernighan was a primary contributor to Unix, whose development used the style, yet their book remains nonpartisan about indentation style.

Allman style

The Allman style is common, and is named after Eric Allman. It puts the brace associated with a control statement on the next line, indented to the same level as the control statement. Statements within the braces are indented to the next level.

while (x == y)
{
    something();
    somethingelse();
}
finalthing();

This style is similar to the standard indentation used by the Pascal programming language and Transact-SQL, where the braces are equivalent to the "begin" and "end" keywords.

Advantages of this style are that the indented code is clearly set apart from the containing statement by lines that are almost completely whitespace, which improves readability, and that the ending brace lines up in the same column as the beginning brace, which makes the matching brace easy to find. Additionally, the blocking style separates the block of code from its control statement. Commenting out the control statement, removing it entirely, refactoring, or removing the block of code is less apt to introduce syntax errors through dangling or missing braces.

The following is still syntactically correct, for example.

//while (x == y)
{
    something();
    somethingelse();
}

As is this:

//for (int i=0; i < x; i++)
//while (x == y)
if (x == y)
{
    something();
    somethingelse();
}

A disadvantage of this style is that each of the enclosing braces occupies an entire line by itself without adding any actual code. This was once an important consideration, when programs were usually edited on terminals that displayed only 24 lines, but is less significant with larger displays.

The motivation of this style is probably to promote code readability through visually separating blocks from their control statements, deeming screen real estate a secondary concern.

This style is used by default in Microsoft Visual Studio 2005 and Apple's Xcode.

BSD KNF style

Also known as Kernel Normal Form style, this is currently the form of most of the code used in the Berkeley Software Distribution operating systems. Although mostly intended for kernel code, it is widely used in userland code as well. It is essentially a thoroughly documented variant of K&R style.

The hard tabulator (ts in vi) is kept at 8 columns, while a soft tabulator is often defined as a helper as well (sw in vi), and set at 4.

The hard tabulators are used to indent code blocks, while a soft tabulator (4 spaces) of additional indent is used for all continuing lines which must be split over multiple lines.

Moreover, function calls do not use a space before the parenthesis, although C language native statements such as if, while, do, switch and return do (in the case where return is used with parens).

Here follow a few samples:

while (x == y) {
        something();
        somethingelse();
}
finalthing();

if (data != NULL && res > 0) {
        if (!JS_DefineProperty(cx, o, "data", STRING_TO_JSVAL(
            JS_NewStringCopyN(cx, data, res)), NULL, NULL,
            JSPROP_ENUMERATE)) {
                QUEUE_EXCEPTION("Internal error!");
                goto err;
        }
        PQfreemem(data);
} else {
        if (!JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL),
            NULL, NULL, JSPROP_ENUMERATE)) {
                QUEUE_EXCEPTION("Internal error!");
                goto err;
        }
}

static JSBool
pgresult_constructor(JSContext *cx, JSObject *obj, uintN argc,
    jsval *argv, jsval *rval)
{

        QUEUE_EXCEPTION("PGresult class not user-instantiable");

        return JS_FALSE;
}

Whitesmiths style

The Whitesmiths style is relatively uncommon today compared to the prior three. It was originally used in the documentation for the first commercial C compiler, the Whitesmiths Compiler. It was also popular in the early days of Windows, since it was used in three influential Windows programming books, Programmer's Guide to Windows by Durant, Carlson & Yao, Programming Windows by Petzold, and Windows 3.0 Power Programming Techniques by Norton & Yao. Symbian Ltd continues to advocate this as the recommended bracing style for Symbian C++ mobile phone applications.

This style puts the brace associated with a control statement on the next line, indented. Statements within the braces are indented to the same level as the braces.

while (x == y)
    {
    something();
    somethingelse();
    }

finalthing();

There is no clear consensus on whether to indent the body of a function in the same manner as the statements within it. Both of the following forms are used:

void MyFunc ()
{
while (x == y)
    {
    something();
    somethingelse();
    }

finalthing();
}

void MyFunc ()
    {
    while (x == y)
        {
        something();
        somethingelse();
        }

    finalthing();
    }

The advantages of this style are similar to those of the Allman style in that blocks are clearly set apart from control statements. With Whitesmiths style, however, the block is still visually connected to its control statement instead of looking like an unrelated block of code surrounded by whitespace. Another advantage is that the alignment of the braces with the block emphasizes the fact that the entire block is conceptually (as well as programmatically) a single compound statement. Furthermore, indenting the braces emphasizes that they are subordinate to the control statement.

A disadvantage of this style could be that the braces do not stand out as well. However, this is largely a matter of opinion, because the braces occupy an entire line to themselves even if they are indented to the same level as the block.

Another disadvantage could be that the ending brace no longer lines up with the statement it conceptually belongs to, although others argue that the closing brace belongs to the opening brace and not to the control statement.

An example:

if (data != NULL && res > 0)
    {
    if (!JS_DefineProperty(cx, o, "data", STRING_TO_JSVAL(JS_NewStringCopyN(cx, data, res)),
                           NULL, NULL, JSPROP_ENUMERATE))
        {
        QUEUE_EXCEPTION("Internal error!");
        goto err;
        }
    PQfreemem(data);
    }
else
    {
    if (!JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL),
        NULL, NULL, JSPROP_ENUMERATE))
        {
        QUEUE_EXCEPTION("Internal error!");
        goto err;
        }
    }

GNU style

Like the Allman and Whitesmiths styles, GNU style puts braces on a line by themselves. The braces are indented by 2 spaces, and the contained code is indented by a further 2 spaces. Popularised by Richard Stallman, the layout may be influenced by his background of writing Lisp code. Although not directly related to indentation, GNU coding style also includes a space before the bracketed list of arguments to a function.

while (x == y)
  {
    something ();
    somethingelse ();
  }
finalthing ();

This style combines the advantages of Allman and Whitesmiths, thereby removing the possible Whitesmiths disadvantage of braces not standing out from the block.

The GNU Emacs text editor and the GNU systems' indent command will reformat code according to this style by default. It is mandated by nearly all maintainers of GNU project software, but is rarely used outside of the GNU community. A disadvantage is that the ending brace does not line up with the statement it conceptually belongs to.

Those who do not use GNU Emacs, or similarly extensible/customisable editors, may find that the automatic indenting settings of their editor are unhelpful for this style. However, many editors defaulting to KNF style cope well with the GNU style when the tab width is set to 2 spaces; likewise, GNU Emacs adapts well to KNF style just by setting the tab width to 8 spaces. In both cases, automatic reformatting will destroy the original spacing, but automatic line indentation will work correctly.

Pico style

The style used most commonly in the Pico programming language by its designers is different from the aforementioned styles. The lack of return statements and the fact that semicolons are used in Pico as statement separators, instead of terminators, leads to the following syntax:

stuff(n):
{ x: 3 * n;
  y: doStuff(x);
  y + x }

The advantages and disadvantages are similar to those of saving screen real estate with K&R style. One additional advantage is that the beginning and closing braces are consistent in application (both share space with a line of code), as opposed to K&R style where one brace shares space with a line of code and one brace has a line to itself.

Banner style

The banner style makes visual scanning easier for some, since the "headers" of any block are the only thing exdented at that level (the theory being that the closing control of the previous block interferes with the header of the next block in the K&R and Allman styles). In this style, which is to Whitesmiths as K&R is to Allman, the closing control is indented as the last item in the list (and thus appropriately loses salience).

function1 () {
  dostuff
  do more stuff
  }

function2 () {
  etc
  } 

or, in a markup language...

<table>
  <tr>
    <td> lots of stuff...
      more stuff
      </td>
    <td> alternate for short lines </td>
    <td> etc. </td>
    </tr>
  </table>

<table>
  <tr ... etc>
  </table>

A programmer may even go as far as to put closing braces on the last line of a block. This style makes indentation the only way of distinguishing blocks of code, but has the advantage of containing no uninformative lines.

for(i = 0; i < 10; i++) {
    if(i % 2 == 0) {
        doSomething(i); }
    else {
        doSomethingElse(i); } }

Other considerations

Losing track of blocks

In certain situations, there is a risk of losing track of block boundaries. This is often seen in large sections of code containing many compound statements nested to many levels of indentation: by the time the programmer scrolls to the bottom of a huge set of nested statements, they may have lost track of which control statements go where.

Programmers who rely on counting the opening braces may have difficulty with indentation styles such as K&R, where the beginning brace is not visually separated from its control statement. Programmers who rely more on indentation will gain more from styles that are vertically compact, such as K&R, because the blocks are shorter.

To avoid losing track of control statements such as for, one can use a large indent, such as an 8-unit wide hard tab, and break up large functions into smaller, more readable ones. The Linux kernel is written this way, in addition to using the K&R style.
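A minimal sketch of this approach follows, using an 8-column indent (written with spaces here, where kernel code would use hard tabs). The functions count_positive_row and count_positive are hypothetical names chosen for this illustration. Written as one function, the work would need a loop inside a loop inside a condition; split in two, neither function nests deeper than two levels.

```c
#include <stddef.h>

/* Hypothetical helper: counts the positive values in one row. */
static size_t count_positive_row(const int *row, size_t cols)
{
        size_t n = 0;
        size_t c;

        for (c = 0; c < cols; c++) {
                if (row[c] > 0)
                        n++;
        }
        return n;
}

/* Counts the positive values in a rows-by-cols array stored row by row.
 * The inner loop lives in the helper, so only one loop remains here. */
size_t count_positive(const int *cells, size_t rows, size_t cols)
{
        size_t total = 0;
        size_t r;

        for (r = 0; r < rows; r++)
                total += count_positive_row(cells + r * cols, cols);
        return total;
}
```

With a wide indent, deep nesting quickly runs off the right edge of the screen, which acts as a prompt to split the function before the block structure becomes hard to follow.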

Another way is to use inline comments added after the closing brace:

for ( int i = 0 ; i < total ; i++ ) {
    foo(bar);
} //for ( i )
if (x < 0) {
    bar(foo);
} //if (x < 0)

However, this has the major disadvantage that the same code must be maintained in two places.

Another solution is implemented in a folding editor, which lets the developer hide or reveal blocks of code by their indentation level or by their compound statement structure.
