2.7. C Structs
In the previous chapter we introduced C struct types. In this chapter we dive deeper into C structs, examine statically and dynamically allocated structs, and combine structs and pointers to create more complex data types and data structures.
We begin with a quick overview of statically declared structs. See the previous chapter for more details.
2.7.1. Review of the C struct Type
A struct type represents a heterogeneous collection of data; it’s a mechanism for treating a set of different types as a single, coherent unit.
There are three steps to defining and using struct types in C programs:
- Define a struct type that defines the field values and their types.
- Declare variables of the struct type.
- Use dot notation to access individual field values in the variable.
In C, structs are lvalues (they can appear on the left-hand side of an assignment statement). The value of a struct variable is the contents of its memory (all of the bytes making up its field values). When calling functions with struct parameters, the value of the struct argument (a copy of all of the bytes of all of its fields) gets copied to the struct function parameter.
When programming with structs, and in particular when combining structs and arrays, it’s critical to carefully consider the type of every expression. Each field in a struct has a specific type, and the syntax for accessing field values and the semantics of passing individual field values to functions follow those of that type.
The following full example program demonstrates defining a struct type, declaring variables of that type, accessing field values, and passing structs and individual field values to functions. (We omit some error handling and comments for readability.)
#include <stdio.h>
#include <string.h>

/* define a new struct type (outside function bodies) */
struct studentT {
    char name[64];
    int age;
    float gpa;
    int grad_yr;
};

/* function prototypes */
int checkID(struct studentT s1, int min_age);
void changeName(char *old, char *new);

int main(void) {
    int can_vote;
    // declare variables of struct type:
    struct studentT student1, student2;

    // access field values using .
    strcpy(student1.name, "Ruth");
    student1.age = 17;
    student1.gpa = 3.5;
    student1.grad_yr = 2021;

    // structs are lvalues
    student2 = student1;
    strcpy(student2.name, "Frances");
    student2.age = student1.age + 4;

    // passing a struct
    can_vote = checkID(student1, 18);
    printf("%s %d\n", student1.name, can_vote);
    can_vote = checkID(student2, 18);
    printf("%s %d\n", student2.name, can_vote);

    // passing a struct field value
    changeName(student2.name, "Kwame");
    printf("student 2's name is now %s\n", student2.name);
    return 0;
}

int checkID(struct studentT s, int min_age) {
    int ret = 1;
    if (s.age < min_age) {
        ret = 0;
        // changes age field IN PARAMETER COPY ONLY
        s.age = min_age + 1;
    }
    return ret;
}

void changeName(char *old, char *new) {
    if ((old == NULL) || (new == NULL)) {
        return;
    }
    strcpy(old, new);
}
When run, the program produces:
Ruth 0
Frances 1
student 2's name is now Kwame
When working with structs, it’s particularly important to think about the types of the struct and its fields. For example, when passing a struct to a function, the parameter gets a copy of the struct’s value (a copy of all bytes from the argument). Consequently, changes to the parameter’s field values do not change the argument’s value. This behavior is illustrated in the preceding program in the call to checkID, which modifies the parameter’s age field. The changes in checkID have no effect on the corresponding argument’s age field value.
When passing a field of a struct to a function, the semantics match the type of the field (the type of the function’s parameter). For example, in the call to changeName, the value of the name field (the base address of the name array inside the student2 struct) gets copied to the parameter old, meaning that the parameter refers to the same set of array elements in memory as its argument. Thus, changing an element of the array in the function also changes the element’s value in the argument; the semantics of passing the name field match the type of the name field.
2.7.2. Pointers and Structs
Just like other C types, programmers can declare a variable as a pointer to a user-defined struct type. The semantics of using a struct pointer variable resemble those of other pointer types such as int *.
Consider the struct studentT type introduced in the previous program example:
struct studentT {
    char name[64];
    int age;
    float gpa;
    int grad_yr;
};
A programmer can declare variables of type struct studentT or struct studentT * (a pointer to a struct studentT):
struct studentT s;
struct studentT *sptr;

// think very carefully about the type of each field when
// accessing it (name is an array of char, age is an int ...)
strcpy(s.name, "Freya");
s.age = 18;
s.gpa = 4.0;
s.grad_yr = 2020;

// malloc space for a struct studentT for sptr to point to:
sptr = malloc(sizeof(struct studentT));
if (sptr == NULL) {
    printf("Error: malloc failed\n");
    exit(1);
}
Note that the call to malloc initializes sptr to point to a dynamically allocated struct in heap memory. Using the sizeof operator to compute malloc’s size request (e.g., sizeof(struct studentT)) ensures that malloc allocates space for all of the field values in the struct.
To access individual fields in a pointer to a struct, the pointer variable first needs to be dereferenced. Based on the rules for pointer dereferencing, you may be tempted to access struct fields like so:
// the grad_yr field of what sptr points to gets 2021:
(*sptr).grad_yr = 2021;
// the age field of what sptr points to gets s.age plus 1:
(*sptr).age = s.age + 1;
However, because pointers to structs are so commonly used, C provides a special operator (->) that both dereferences a struct pointer and accesses one of its field values. For example, sptr->grad_yr is equivalent to (*sptr).grad_yr. Here are some examples of accessing field values using this notation:
// the gpa field of what sptr points to gets 3.5:
sptr->gpa = 3.5;
// the name field of what sptr points to is an array of char
// (can use strcpy to init its value):
strcpy(sptr->name, "Lars");
Figure 1 sketches what the variables s and sptr may look like in memory after the code above executes. Recall that malloc allocates memory from the heap, and local variables are allocated on the stack.
2.7.3. Pointer Fields in Structs
Structs can also be defined to have pointer types as field values. For example:
struct personT {
    char *name;     // for a dynamically allocated string field
    int age;
};

int main(void) {
    struct personT p1, *p2;

    // need to malloc space for the name field:
    p1.name = malloc(sizeof(char) * 8);
    strcpy(p1.name, "Zhichen");
    p1.age = 22;

    // first malloc space for the struct:
    p2 = malloc(sizeof(struct personT));
    // then malloc space for the name field:
    p2->name = malloc(sizeof(char) * 4);
    strcpy(p2->name, "Vic");
    p2->age = 19;
    ...
    // Note: for strings, we must allocate one extra byte to hold the
    // terminating null character that marks the end of the string.
}
In memory, these variables will look like Figure 2 (note which parts are allocated on the stack and which are on the heap).
As structs and the types of their fields increase in complexity, be careful with their syntax. To access field values appropriately, start from the outermost variable type and use its type syntax to access individual parts. For example, the types of the struct variables shown in Table 1 govern how a programmer should access their fields.
Expression | Type | Field Access Syntax |
---|---|---|
p1 | struct personT | p1.age, p1.name |
p2 | struct personT * | p2->age, p2->name |
Further, knowing the types of field values allows a program to use the correct syntax in accessing them, as shown by the examples in Table 2.
Expression | Type | Example Access Syntax |
---|---|---|
p1.age | int | p1.age = 18; |
p2->age | int | p2->age = 18; |
p1.name | char * | printf("%s", p1.name); |
p2->name | char * | printf("%s", p2->name); |
p1.name[2] | char | p1.name[2] = 'a'; |
p2->name[2] | char | p2->name[2] = 'a'; |
In examining the last example, start by considering the type of the outermost variable (p2 is a pointer to a struct personT). Therefore, to access a field value in the struct, the programmer needs to use -> syntax (p2->name). Next, consider the type of the name field, which is a char *, used in this program to point to an array of char values. To access a specific char storage location through the name field, use array indexing notation: p2->name[2] = 'a'.
2.7.4. Arrays of Structs
Arrays, pointers, and structs can be combined to create more complex data structures. Here are some examples of declaring variables of different types of arrays of structs:
struct studentT classroom1[40];   // an array of 40 struct studentT
struct studentT *classroom2;      // a pointer to a struct studentT
                                  // (for a dynamically allocated array)
struct studentT *classroom3[40];  // an array of 40 struct studentT *
                                  // (each element stores a struct studentT *)
Again, thinking very carefully about variable and field types is necessary for understanding the syntax and semantics of using these variables in a program. Here are some examples of the correct syntax for accessing these variables:
// classroom1 is an array:
// use indexing to access a particular element
// each element in classroom1 stores a struct studentT:
// use dot notation to access fields
classroom1[3].age = 21;
// classroom2 is a pointer to a struct studentT
// call malloc to dynamically allocate an array
// of 15 studentT structs for it to point to:
classroom2 = malloc(sizeof(struct studentT) * 15);
// each element in array pointed to by classroom2 is a studentT struct
// use [] notation to access an element of the array, and dot notation
// to access a particular field value of the struct at that index:
classroom2[3].grad_yr = 2013;
// classroom3 is an array of struct studentT *
// use [] notation to access a particular element
// call malloc to dynamically allocate a struct for it to point to
classroom3[5] = malloc(sizeof(struct studentT));
// access fields of the struct using -> notation
// set the age field pointed to in element 5 of the classroom3 array to 21
classroom3[5]->age = 21;
A function that takes an array of type struct studentT as a parameter might look like this:
void updateAges(struct studentT *classroom, int size) {
    int i;
    for (i = 0; i < size; i++) {
        classroom[i].age += 1;
    }
}
A program could pass this function either a statically or dynamically allocated array of struct studentT:
updateAges(classroom1, 40);
updateAges(classroom2, 15);
The semantics of passing classroom1 (or classroom2) to updateAges match the semantics of passing a statically declared (or dynamically allocated) array to a function: the parameter refers to the same set of elements as the argument, and thus changes to the array’s values within the function affect the argument’s elements.
Figure 3 shows what the stack might look like for the second call to the updateAges function (showing the passed classroom2 array with example field values for the struct in each of its elements).
As always, the parameter gets a copy of the value of its argument (the memory address of the array in heap memory). Thus, modifying the array’s elements in the function will persist to its argument’s values (both the parameter and the argument refer to the same array in memory).
The updateAges function cannot be passed the classroom3 array because its type is not the same as the parameter’s type: classroom3 is an array of struct studentT *, not an array of struct studentT.
2.7.5. Self-Referential Structs
A struct can be defined with fields whose type is a pointer to the same struct type. These self-referential struct types can be used to build linked implementations of data structures, such as linked lists, trees, and graphs.
The details of these data types and their linked implementations are beyond the scope of this book. However, we briefly show one example of how to define and use a self-referential struct type to create a linked list in C. Refer to a textbook on data structures and algorithms for more information about linked lists.
A linked list is one way to implement a list abstract data type. A list represents a sequence of elements that are ordered by their position in the list. In C, a list data structure could be implemented as an array or as a linked list using a self-referential struct type for storing individual nodes in the list.
To build the latter, a programmer would define a node struct to contain one list element and a link to the next node in the list. Here’s an example that could store a linked list of integer values:
struct node {
    int data;           // used to store a list element's data value
    struct node *next;  // used to point to the next node in the list
};
Instances of this struct type can be linked together through the next field to create a linked list.
This example code snippet creates a linked list containing three elements (the list itself is referred to by the head variable that points to the first node in the list):
struct node *head, *temp;
int i;

head = NULL;  // an empty linked list

head = malloc(sizeof(struct node));  // allocate a node
if (head == NULL) {
    printf("Error malloc\n");
    exit(1);
}
head->data = 10;    // set the data field
head->next = NULL;  // set next to NULL (there is no next element)

// add 2 more nodes to the head of the list:
for (i = 0; i < 2; i++) {
    temp = malloc(sizeof(struct node));  // allocate a node
    if (temp == NULL) {
        printf("Error malloc\n");
        exit(1);
    }
    temp->data = i;     // set data field
    temp->next = head;  // set next to point to current first node
    head = temp;        // change head to point to newly added node
}
Note that the temp variable temporarily points to a malloc’ed node that gets initialized and then added to the beginning of the list by setting its next field to point to the node currently pointed to by head, and then by changing head to point to this new node.
The result of executing this code would look like Figure 4 in memory.