Comma operator

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Tea2min (talk | contribs) at 09:51, 23 May 2016 (Undid revision 721665926 by 39.173.161.71 (talk)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In the C and C++ programming languages, the comma operator (represented by the token ,) is a binary operator that evaluates its first operand and discards the result, and then evaluates the second operand and returns this value (and type).

The use of the comma token as an operator is distinct from its use in function calls and definitions, variable declarations, enum declarations, and similar constructs, where it acts as a separator.

Syntax

The comma operator separates expressions (which have value) in a way analogous to how the semicolon terminates statements, and sequences of expressions are enclosed in parentheses analogously to how sequences of statements are enclosed in braces: (a, b, c) is a sequence of expressions, separated by commas, which evaluates to the last expression c while {a; b; c;} is a sequence of statements, and does not evaluate to any value. A comma can only occur between two expressions – commas separate expressions – unlike the semicolon, which occurs at the end of a (non-block) statement – semicolons terminate statements.

The comma operator has the lowest precedence of any C operator, and acts as a sequence point. In a combination of commas and semicolons, semicolons have lower precedence than commas, as semicolons separate statements but commas occur within statements, which accords with their use as ordinary punctuation: a, b; c, d is grouped as (a, b); (c, d) because these are two separate statements.

Examples

In this example, the differing behavior between the second and third lines is due to the comma operator having lower precedence than assignment. The last example differs as well since the return expression must be fully evaluated before the function can return.

/**
 *  Commas act as separators in this line, not as an operator.
 *  Results: a=1, b=2, c=3, i=0
 */
int a=1, b=2, c=3, i=0;

/**
 *  Assigns value of b into i.
 *  Results: a=1, b=2, c=3, i=2
 */
int a=1, b=2, c=3;              
int i = (a, b);           
                      
/**
 *  Assigns value of a into i. Equivalent to (i = a), b;
 *  Results: a=1, b=2, c=3, i=1
 *  (The curly braces on the second line are needed to
 *   avoid a compiler error.  The second 'b' declared
 *   is given no initial value.)
 */
int a=1, b=2, c=3;                                
{ int i = a, b; }              

/**
 *  Increases value of a by 2, then assigns value of resulting operation a+b into i .
 *  Results: a=3, b=2, c=3, i=5
 */
int a=1, b=2, c=3;
int i = (a += 2, a + b);
          
/**
 *  Increases value of a by 2, then stores value of a to i, and discards unused
 *  values of resulting operation a + b . Equivalent to (i = (a += 2)), a + b; 
 *  Results: a=3, b=2, c=3, i=3
 */
int a=1, b=2, c=3;
int i;
i = a += 2, a + b;

/**
 *  Assigns value of a into i;  the following 'b' and 'c'
 *  are not part of the initializer but declarators for
 *  second instances of those variables.
 *  Results: a=1, b=2, c=3, i=1
 *  (The curly braces on the second line are needed to
 *   avoid a compiler error.  The second 'b' and second
 *   'c' declared are given no initial value.)
 */     
int a=1, b=2, c=3;
{ int i = a, b, c; }

/**
 *  Assigns value of c into i, discarding the unused a and b values.
 *  Results: a=1, b=2, c=3, i=3
 */
int a=1, b=2, c=3;
int i = (a, b, c);          

/**
 *  Returns 6, not 4, since comma operator sequence points following the keyword 
 *  'return' are considered a single expression evaluating to rvalue of final 
 *  subexpression c=6 .
 */
return a=4, b=5, c=6;

/**
 *  Returns 3, not 1, for same reason as previous example.
 */
return 1, 2, 3;

/**
 *  Returns 3, not 1, still for same reason as above. This example works as it does
 *  because return is a keyword, not a function call. Even though compilers will 
 *  allow for the construct return(value), the parentheses are only relative to "value"
 *  and have no special effect on the return keyword.
 *  Return simply gets an expression and here the expression is "(1), 2, 3".
 */
return(1), 2, 3;

In Pascal, multidimensional arrays are indexed using commas, e.g. A[i, j]. In C, however, A[i, j] is equivalent to A[j], since the value of i is discarded. The correct way to index multidimensional arrays in C is with a construction like A[i][j].

Uses

The comma operator has relatively limited use cases. Because it discards its first operand, it is generally only useful where the first operand has desirable side effects. Further, because it is rarely used outside of specific idioms, and easily mistaken with other commas or the semicolon, it is potentially confusing and error-prone. Nevertheless, there are certain circumstances where it is commonly used, notably in for loops and in SFINAE (http://en.cppreference.com/w/cpp/language/sfinae). For embedded systems which may have limited debugging capabilities, the comma operator can be used in combination with a macro to seamlessly override a function call, to insert code just before the function call.

For loops

The most common use is to allow multiple assignment statements without using a block statement, primarily in the initialization and the increment expressions of a for loop. This is the only idiomatic use in elementary C programming. In the following example, the order of the loop's initializers is significant:

void rev(char *s, size_t len)
{
    char *first;
    for (first = s, s += len - 1; s >= first; --s)
        putchar(*s);
}

An alternative solution to this problem is parallel assignment, which allows multiple assignments to occur within a single statement, and also uses a comma, though with different syntax and semantics. This is used in Go in its analogous for loop.[1]

Outside of for loop initializers (which have a special use of semicolons), the comma might be used synonymously with the semicolon, particularly when the statements in question function similarly to a loop increment (e.g. at the end of a while loop):

++p, ++q;
++p; ++q;

However, as this usage achieves the same thing as the semicolon in a visually different way, this is of dubious usefulness and might confuse readers.

Macros

The comma can be used in preprocessor macros to perform multiple operations in the space of a single syntactic expression.

One common use is to provide custom error messages in failed assertions. This is done by passing a parenthesized expression list to the assert macro, where the first expression is an error string and the second expression is the condition being asserted. The assert macro outputs its argument verbatim on an assertion failure. The following is an example:

#include <stdio.h>
#include <assert.h>

int main ( void )
{
    int i;
    for (i=0; i<=9; i++)
    {
        assert( ( "i is too big!", i <= 4 ) );
        printf("i = %i\n", i);
    }
    return 0;
}

Output:

i = 0
i = 1
i = 2
i = 3
i = 4
assert: assert.c:6: test_assert: Assertion `( "i is too big!", i <= 4 )' failed.
Aborted

Condition

The comma can be used within a condition (of an if, while, do while, or for) to allow auxiliary computations, particularly calling a function and using the result, with block scoping:

if (y = f(x), y > x) {
    ... // statements involving x and y
}

A similar idiom exists in Go, where the syntax of the if statement explicitly allows an optional statement.[2]

Complex return

The comma can be used in return statements, to assign to a global variable or out parameter (passed by reference). This idiom suggests that the assignments are part of the return, rather than auxiliary assignments in a block that terminates with the actual return. For example, in setting a global error number:

if (failure)
    return (errno = EINVAL, -1);

This can be written more verbosely as:

if (failure) {
    errno = EINVAL;
    return -1;
}

Avoid a block

For brevity, the comma can be used to avoid a block and associated braces, as in:

if (x == 1) y = 2, z = 3;
if (x == 1)
    y = 2, z = 3;

instead of:

if (x == 1) {y = 2; z = 3;}
if (x == 1) {
    y = 2; z = 3;
}

Other languages

In the OCaml and Ruby programming languages, the semicolon (";") is used for this purpose. JavaScript and Perl utilize the comma operator in the same way C/C++ does.

See also

References

  1. ^ Effective Go: for, "Finally, Go has no comma operator and ++ and -- are statements not expressions. Thus if you want to run multiple variables in a for you should use parallel assignment (although that precludes ++ and --)."
  2. ^ The Go Programming Language Specification: If statements
  • Ramajaran, V. (1994), Computer Programming in C, New Delhi: Prentice Hall of India
  • Dixit, J.B (2005), Fundamentals of computers and programming in C, New Delhi: Laxmi Publications
  • Kernighan, Brian W.; Ritchie, Dennis M. (1988), The C Programming Language (2nd ed.), Englewood Cliffs, NJ: Prentice Hall

External links