struct (C programming language)


A struct in the C programming language (and many derivatives) is a composite data type declaration that defines a physically grouped list of variables under one name in a block of memory, allowing the different variables to be accessed via a single pointer or via the name of the declared struct variable, which refers to the same address. A struct can contain other simple and composite data types, so it is a natural organizing type for mixed-data records such as a hard-drive directory entry (file length, name, extension, physical address (cylinder, disk, head indexes), etc.) or a patient record (name, address, telephone, insurance codes, balance, etc.).

The C struct corresponds directly to the record-like data layouts used in assembly language: both describe a contiguous block of physical memory, usually delimited (sized) by word-length boundaries. Language implementations that could use half-word or byte boundaries (giving denser packing and using less memory) were considered advanced in the mid-1980s. Because the block is contiguous, each member is located at a fixed offset from the start of the block, which is the address the pointer holds. As an illustration, many BASIC interpreters once used a string data structure with one value recording the string length, one indexing (the cursor value of) the previous line, and one pointing to the string data.

Basic syntax

The best way to describe it is via example:

struct Point {
   int x;
   int y;
};

declares a structure called "Point" that contains two members: an integer called "x" followed by an integer called "y". A structure is a single object, and all of its members may be treated as one unit. A pointer to a "Point" points to its first member, "x"; "y" follows it in memory, normally immediately, although a compiler is allowed to insert padding between members.

Once a structure is declared, variables may be declared using it:

struct Point vPoint;

declares a variable called "vPoint" which is a "Point". "vPoint.x" accesses the integer member "x" of the structure while "vPoint.y" accesses the integer member "y" of the structure. It is quite common to declare a structure as follows:

typedef struct Point {
   int x;
   int y;
   int z;
   char *point_name;
} Point;

so that "Point" may be used as well as "struct Point" as in the following:

Point vPoint;

The general syntax for a struct declaration in C is:

struct tag_name {
   type member1;
   type member2;
   /* declare as many members as desired, but the entire structure size must be known to the compiler. */
};

Here tag_name is optional in some contexts. Such a struct declaration may also appear within a typedef declaration of a type alias, or within the declaration or definition of a variable, but such entities are better declared separately, as in

typedef struct tag_name struct_alias;
struct tag_name struct_instance_1;
struct_alias struct_instance_2;

A struct declaration consists of a list of fields, each of which can have almost any object type. The total storage required for a struct object is the sum of the storage requirements of all the fields, plus any padding the compiler inserts between fields or after the last one.
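The effect of padding can be observed with sizeof and the standard offsetof macro. The following is a minimal sketch: struct sample and the two helper functions are hypothetical names introduced for illustration, and the actual numbers depend on the compiler and the target platform.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical struct, chosen so that padding is likely to appear. */
struct sample {
    char tag;    /* 1 byte */
    int  value;  /* usually aligned to a multiple of sizeof(int) */
    char flag;   /* 1 byte, typically followed by trailing padding */
};

/* Number of padding bytes: total size minus the sum of the member sizes. */
size_t sample_padding(void)
{
    return sizeof(struct sample)
         - (sizeof(char) + sizeof(int) + sizeof(char));
}

/* Offset of the member "value" from the start of the struct. */
size_t sample_value_offset(void)
{
    return offsetof(struct sample, value);
}
```

On many common platforms with a 4-byte int, sizeof(struct sample) is 12 and sample_padding() returns 6, but the language itself only guarantees that the first member is at offset 0 and that members are laid out in declaration order.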

For example:

struct account {
   int account_number;
   char *first_name;
   char *last_name;
   float balance;
};

defines a type, referred to as struct account. To create a new variable of this type, we can write

struct account s;

which has an integer component, accessed by s.account_number, and a floating-point component, accessed by s.balance, as well as the first_name and last_name components. The structure s contains all four values, and all four fields may be changed independently.

The primary use of a struct is the construction of complex data types, but in practice structs are sometimes used to circumvent standard C conventions and create a kind of primitive subtyping. For example, implementations of common Internet protocols rely on the fact that C compilers insert padding between struct fields in predictable ways; thus the code

struct ifoo_version_42 {
   long x, y, z;
   char *name;
   long a, b, c;
};
struct ifoo_old_stub {
   long x, y;
};
void operate_on_ifoo(struct ifoo_version_42 *);
struct ifoo_old_stub s;
. . .
/* The two pointer types differ, so an explicit cast is required. */
operate_on_ifoo((struct ifoo_version_42 *)&s);

is often assumed to work as expected if the operate_on_ifoo function accesses only fields x and y of its argument. The C standard does not guarantee this behavior, so such code relies on the particular compiler and platform.
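One related property that the C standard does guarantee is that a pointer to a struct, suitably converted, points to its first member. A more defensible way to obtain a similar effect is therefore to embed the smaller struct as the first member of the larger one. The sketch below shows this swapped-in technique, which is not what the code above does; ifoo_base, ifoo_extended, sum_xy and sum_xy_extended are hypothetical names.

```c
/* Hypothetical "base" type holding the shared leading fields. */
struct ifoo_base {
    long x, y;
};

/* Hypothetical extended type: the base is embedded as the FIRST member. */
struct ifoo_extended {
    struct ifoo_base base;
    long z;
    char *name;
};

/* Works on the shared fields only. */
long sum_xy(const struct ifoo_base *b)
{
    return b->x + b->y;
}

/* A pointer to the extended struct may be converted to a pointer to its
   first member, so an extended object can be used where a base is expected. */
long sum_xy_extended(const struct ifoo_extended *e)
{
    return sum_xy((const struct ifoo_base *)e);
}
```

The cast in sum_xy_extended is equivalent to passing &e->base, which avoids the cast altogether and is usually the clearer choice.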

Struct initialization

There are three ways to initialize a structure. For the struct type

/* Forward declare a type "point" to be a struct. */
typedef struct point point;
/* Declare the struct with integer members x, y */
struct point {
   int    x;
   int    y;
};

C89-style initializers are used when the members can be given contiguously, in declaration order.[1]

/* Define a variable p of type point, and initialize its first two members in place */
point p = {1,2};

For non-contiguous or out-of-order member lists, the designated initializer style[2] (introduced in C99) may be used:

/* Define a variable p of type point, and set members using designated initializers */
point p = {.y = 2, .x = 1};

If an initializer is given or if the object is statically allocated, omitted elements are initialized to 0.[3]
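The zero-filling of omitted members can be sketched as follows. The three-member struct point3 and the two helper functions are hypothetical, introduced only so that some members can be left out of the initializer.

```c
/* Hypothetical three-member struct. */
struct point3 {
    int x;
    int y;
    int z;
};

/* Designated form: only y is named; x and z are implicitly set to 0. */
struct point3 make_partial(void)
{
    struct point3 p = { .y = 2 };
    return p;
}

/* Positional form: x is 1; the remaining members are implicitly set to 0. */
struct point3 make_leading(void)
{
    struct point3 p = { 1 };
    return p;
}
```

The designated form requires a C99 or later compiler; the positional form is also valid C89.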

A third way of initializing a structure is to copy the value of an existing object of the same type:

/* Define a variable q of type point, and set members to the same values as those of p */
point q = p;

Assignment

Assigning one struct to another does what one might expect: every member of the source is copied into the destination. It is not necessary to use memcpy() to duplicate a struct. Declaring a variable of the struct type is enough to reserve its memory, although an automatic (local) variable declared without an initializer has indeterminate member values until it is assigned. This should not be confused with the memory management required when dealing with a pointer to a struct.

#include <stdio.h>
 
/* Define a type point to be a struct with integer members x, y */
typedef struct {
   int    x;
   int    y;
} point;
 
int main(void) {
 
/* Define a variable p of type point, and initialize all its members inline! */
    point p = {1,3};
 
/* Define a variable q of type point. Members are uninitialized. */
    point q;
 
/* Assign the value of p to q, copies the member values from p into q. */
    q = p;
 
/* Change the member x of q to have the value of 3 */
    q.x = 3;
 
/* Demonstrate we have a copy and that they are now different. */
    if (p.x != q.x) printf("The members are not equal! %d != %d\n", p.x, q.x);
 
    return 0;
}
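Assignment copies each member's value, so a pointer member is copied as a pointer: both structs then refer to the same underlying storage. The following sketch illustrates this; struct label and the helper functions are hypothetical names.

```c
/* Hypothetical struct with a pointer member. */
struct label {
    int   id;
    char *text;   /* points to storage owned elsewhere */
};

/* Returns 1 if both structs refer to the same character buffer. */
int shares_text(const struct label *a, const struct label *b)
{
    return a->text == b->text;
}

/* Copies src by plain assignment and returns the copy. */
struct label copy_label(struct label src)
{
    struct label dst;
    dst = src;    /* copies id and the pointer value, not the characters */
    return dst;
}
```

After b = copy_label(a), a change made through b.text is also visible through a.text. If independent text is wanted, the characters have to be duplicated separately.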

Pointers to struct

Pointers can be used to refer to a struct by its address. This is particularly useful for passing structs to a function by reference, or for referring to another instance of the struct type from a field. The pointer can be dereferenced with the * operator, just like any other pointer in C. C also provides the -> operator, which dereferences a pointer to a struct (left operand) and then accesses a member of that struct (right operand).

struct point {
   int x;
   int y;
};
struct point my_point = { 3, 7 };
struct point *p = &my_point;  /* Declare p as a pointer to struct point,
                                 and initialize it with the address of my_point. */
 
(*p).x = 8;                   /* To access the first member of the struct */
p->x = 8;                     /* Another way to access the first member of the struct */
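The difference between passing a struct by value and passing its address can be sketched as follows. The functions move_copy and move_in_place are hypothetical names used only for this illustration.

```c
struct point {
    int x;
    int y;
};

/* Receives a copy: the change is not visible to the caller. */
void move_copy(struct point pt, int dx)
{
    pt.x += dx;
}

/* Receives the address: the change is made to the caller's object. */
void move_in_place(struct point *pt, int dx)
{
    pt->x += dx;   /* same as (*pt).x += dx */
}
```

Calling move_copy(my_point, 5) leaves my_point unchanged, while move_in_place(&my_point, 5) adds 5 to my_point.x.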

C does not allow a recursive struct declaration: a struct cannot contain a field that has the type of the struct itself. A field can, however, be a pointer to another instance of the same struct type:

typedef struct list_element list_element;
struct list_element {
   point p;
   list_element * next;
};
list_element el = { .p = { .x = 3, .y = 7 } };
list_element le = { .p = { .x = 4, .y = 5 }, .next = &el };

Here the instance el would contain a point with coordinates 3 and 7. Its next pointer would be a null pointer since the initializer for that field is omitted. The instance le in turn would have its own point and its next pointer would refer to el.
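Elements chained in this way can be walked by following the next pointers until a null pointer is reached. A minimal sketch, in which list_length is a hypothetical helper and the type definitions are repeated so that the fragment stands alone:

```c
#include <stddef.h>  /* NULL, size_t */

typedef struct point { int x; int y; } point;

typedef struct list_element list_element;
struct list_element {
    point p;
    list_element *next;
};

/* Hypothetical helper: counts the elements reachable from head. */
size_t list_length(const list_element *head)
{
    size_t n = 0;
    while (head != NULL) {   /* a null next pointer marks the end */
        n++;
        head = head->next;
    }
    return n;
}
```

With the instances described above, list_length(&le) would return 2 and list_length(&el) would return 1.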

typedef

Typedefs can be used as shortcuts, for example:

typedef struct {
   int    account_number;
   char   *first_name;
   char   *last_name;
   float  balance;
} account;

Different users have differing preferences; proponents usually claim:

  • shorter to write
  • can simplify more complex type definitions
  • can be used to forward declare a struct type

As an example, consider the declaration of a pointer to a function that accepts two pointers to a struct type and returns a pointer to that struct type:

Without typedef:

struct point {
   int    x;
   int    y;
};
struct point *(*point_compare_func) (struct point *a, struct point *b);

With typedef:

typedef struct point point_type;
struct point {
   int    x;
   int    y;
};
point_type *(*point_compare_func) (point_type *a, point_type *b);

A common naming convention for such a typedef is to append "_t" to the struct tag name (here point_t), but such names are reserved by POSIX, so this practice should be avoided. A simpler convention is to use the same identifier for the tag name and the type name:

typedef struct point point;
struct point {
   int    x;
   int    y;
};
point *(*point_compare_func) (point *a, point *b);

Without a typedef, a function that takes a function pointer as a parameter would have to be written as follows. Although valid, such code becomes increasingly hard to read.

/* Using the struct point type from before */
 
/* Define a function that returns a pointer to the biggest point,
   using a function to do the comparison. */
struct point *
biggest_point (size_t size, struct point *points,
               struct point *(*point_compare) (struct point *a, struct point *b))
{
    size_t i;   /* same type as size, avoiding a signed/unsigned comparison */
    struct point *biggest = NULL;
 
    for (i=0; i < size; i++) {
        biggest = point_compare(biggest, points + i);
    }
    return biggest;
}

Here a second typedef, for the function pointer type, can be useful:

typedef point *(*point_compare_func_type) (point *a, point *b);

Now, with the two typedefs in use, the complexity of the function signature is drastically reduced.

/* Using the struct point type from before and the typedef for the function pointer */
 
/* Define a function that returns a pointer to the biggest point,
   using a function to do the comparison. */
point *
biggest_point (size_t size, point * points, point_compare_func_type point_compare)
{
    size_t i;   /* same type as size, avoiding a signed/unsigned comparison */
    point * biggest = NULL;
 
    for (i=0; i < size; i++) {
        biggest = point_compare(biggest, points + i);
    }
    return biggest;
}
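Note that biggest starts out as a null pointer, so the comparison function must be prepared to receive NULL as its first argument. The sketch below pairs the function with one possible comparator; larger_x is a hypothetical name, and the earlier definitions are repeated so that the fragment stands alone.

```c
#include <stddef.h>  /* size_t, NULL */

typedef struct point point;
struct point {
    int x;
    int y;
};

typedef point *(*point_compare_func_type)(point *a, point *b);

/* Hypothetical comparator: returns whichever point has the larger x.
   It accepts NULL for a, because biggest_point starts with biggest == NULL. */
point *larger_x(point *a, point *b)
{
    if (a == NULL) return b;
    return (b->x > a->x) ? b : a;
}

/* Same function as above, repeated so that the sketch is self-contained. */
point *biggest_point(size_t size, point *points, point_compare_func_type point_compare)
{
    size_t i;
    point *biggest = NULL;

    for (i = 0; i < size; i++) {
        biggest = point_compare(biggest, points + i);
    }
    return biggest;
}
```

For an array of points, biggest_point(n, array, larger_x) returns a pointer to the element with the largest x, or a null pointer when n is 0.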

However, there are a handful of disadvantages in using them:

  • They pollute the main namespace (see below), although this is easily overcome by prefixing a library name to the type name.
  • It is harder to find out what type an alias refers to (one has to scan or grep through the code), though most IDEs provide this lookup automatically.
  • Typedefs do not really "hide" anything in a struct or union: members are still accessible (account.balance). To really hide struct members, one needs to use 'incompletely-declared' structs.

/* Example for namespace clash */
 
typedef struct account { float balance; } account;

/* Inside a block; at file scope the first declaration below would itself clash with the typedef name. */
void example(void) {
    struct account account; /* possible: in this scope the variable hides the typedef name */
    account account; /* error: "account" now names the variable, not the type */
}
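Hiding members with an incompletely declared struct can be sketched as follows. In practice the first part would go in a header and the second part in a separate implementation file; both are shown together here so that the fragment stands alone. The counter type and its functions are hypothetical names.

```c
#include <stdlib.h>  /* malloc, free */

/* --- What a header would expose: an incomplete type and functions --- */
typedef struct counter counter;          /* members are not visible here */
counter *counter_create(int start);
void     counter_increment(counter *c);
int      counter_value(const counter *c);
void     counter_destroy(counter *c);

/* --- What the implementation file would contain --- */
struct counter {
    int value;
};

counter *counter_create(int start)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL) c->value = start;   /* caller must check for NULL */
    return c;
}

void counter_increment(counter *c) { c->value++; }
int  counter_value(const counter *c) { return c->value; }
void counter_destroy(counter *c) { free(c); }
```

Code that sees only the header part can hold and pass counter pointers, but it cannot access c->value or declare a counter object directly, because the type is incomplete there.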

References

  1. Kelley, Al; Pohl, Ira (2004). A Book On C: Programming in C (Fourth ed.). p. 418. ISBN 0-201-18399-4.
  2. "IBM Linux compilers. Initialization of structures and unions".
  3. "The New C Standard, §6.7.8 Initialization".