Physics and Astronomy

Structures organise our data

Show me your [functions] and conceal your [data structures] and I shall continue to be mystified,
show me your [data structures] and I won't usually need your [functions]; they'll be obvious.

Brooks

Structures do for variables what functions do for code

Representation is the essence of programming.
Brooks

We have already discussed the fact that nearly every program is so large that it is completely impossible to hold it all in our heads at the same time.

  • Functions organise code into self-contained units, enabling us to think at a higher level without having to keep track of the details.
  • Structures do exactly the same for variables.

Structures allow us to organise our data, letting us think at a higher level without having to keep track of the details.

Composite variables

We have already encountered C's two built-in composite variables: complex numbers and arrays. The latter allow us to think in terms of high-level mathematical concepts such as vectors and matrices, and let us pass a single memory address to a function so that it can access and modify any element of the array. However, arrays are limited to composite objects of the same type (all ints, all doubles, etc.). C structures remove this limitation.

C structures allow us to define our own composite data types where several "sub-variables" (called "members") can be combined into one composite variable which can be treated as a single unit, similar to the individual elements of an array.

Since structures have members with different types it would be inconvenient to refer to them by number. Instead they are referred to by name, using a dot (.) rather than square brackets.

An array has elements of the same type referred to as array[number]; a structure has members of different types referred to as structure.name.

Defining a new composite variable (structure) type

Example: a nuclear problem

Imagine a simple problem involving radioactive nuclides. If each nuclide has a name (stored as an array of characters) and a half-life (stored as a double) then to store the data for several nuclides we will need a two-dimensional array of characters and a one-dimensional array of doubles.

This isn't very convenient, but C doesn't define a composite variable type consisting of a string and a double and even if it did, it's likely that a little later on we would want to add more data such as mass and atomic number.

It would be great if C allowed us to define our own variable types, with names such as "Nuclide" or "Isotope", and luckily it does. The following code defines a composite variable type called a "Nuclide":

The structure name (nuclide) is optional, but if it is present it's normal to give the structure and the typedef similar names. In our case the typedef name is the same as the structure name but capitalised.


#define MAXLEN  32
typedef struct nuclide { 
  double halflife; 
  char name[MAXLEN];
} Nuclide;

The above code defines a new data type called "Nuclide" consisting of a double called halflife and a character array called name. This definition does not create any Nuclide variables. Rather it puts the variable type "Nuclide" on exactly the same basis as the built-in types int, float etc: we can now create (declare) some Nuclides if we need to.

The structure definition does not create any structures; it just says what the structure means so we can later create some.

It's also worth noticing that the Nuclide structure contains an array: structures can contain arrays and we shall see below that we can have arrays of structures.

Optional aside

The above combines two C features: structures, which actually do the work, and typedefs which give user-friendly names to things which already exist.

Structures

The following is the bare structure definition (technically referred to as a "specifier") which defines a new variable type, "struct nuclide":

#define MAXLEN  32
struct nuclide { 
  double halflife; 
  char name[MAXLEN];
} ;

We can now create variables of type "struct nuclide" just like ints, floats, etc.

typedef

Since "struct nuclide" is not the most user-friendly name for a new variable type, we can make things a little neater by using C's "typedef" mechanism to give our structure a neater name:

typedef struct nuclide Nuclide;

Combining the two

For convenience we may combine the two as above:

typedef struct nuclide {
  double halflife;
  char name[MAXLEN];
} Nuclide;

Now "Nuclide" and "struct nuclide" are synonyms; they mean the same thing. It is a common convention to give the synonym the same name as the structure type but with an initial capital letter.

Declaring structures

If we have the above code just once at the top of our file then Nuclide joins the list of variable types we can now declare, just like int, float, etc., and we can access the individual structure's members as varname.member:

#include <stdio.h>
#include <string.h>

#define MAXLEN  32
typedef struct nuclide { 
  double halflife; 
  char name[MAXLEN];
} Nuclide;

int main() {
  Nuclide nuc;

  nuc.halflife = 3.2;
  strncpy(nuc.name, "Mystuff", MAXLEN);
  nuc.name[MAXLEN-1] = '\0';

  // Now do something...
  return 0;
}


Once we have defined what a structure means we can then create some just like any other variable type.

In the above example we have assigned a value to nuc.halflife and used nuc.name as an argument to strncpy(). We could instead have read in the value of nuc.halflife with scanf():

  printf("Please enter the half-life.\n");
  scanf("%lg", &nuc.halflife);
 

Anything we can do with an ordinary variable or an array we can do when it's part of a structure

We can also have arrays of structures, although these are quite uncommon:

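#include <stdio.h>

#define MAXLEN  32
#define MAXNUCLIDES 10  // Illustrative value only
typedef struct nuclide {
  double halflife;
  char name[MAXLEN];
} Nuclide;

int main() {
  Nuclide nucs[MAXNUCLIDES];

  // %31s leaves room for the terminating '\0' in name[MAXLEN]
  for (int n = 0; n < MAXNUCLIDES; ++n)
    scanf("%31s %lg", nucs[n].name, &nucs[n].halflife);

  // ...
  return 0;
}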


  1. A structure to represent a cuboid
  2. To practise using structures.
  3. A solid cuboid can be considered to have three dimensions, a density and a mass. These can be represented by a double array of size three and two individual doubles.
  4. Create a new on-line program in a new window, with a suitable title and opening comment.
  5. Above main() create a structure definition for a cuboid with the above quantities. Give the structure and its members sensible names but do not at this point declare any actual structures inside main().
  6. Build & run. The code should run but not do anything.
  7. Now declare an actual cuboid inside main(). Read the value of the density into the structure using scanf() and print that value to the screen as a check.
  8. Build & run. Check the output is correct.
  9. Now read in the three dimensions, calculate the mass and print out all five numbers (the three dimensions, the density and the mass).
  10. Build & run. Check the output is correct.

Another example: an ellipse

The following defines a structure which we might use to represent an ellipse:

typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
} Ellipse;

We will use this example in the rest of the notes.

When to use structures and what to put in them

In general:

Any type of "thing" in the problem we are thinking about whose properties cannot be represented by an existing variable type or array should normally have its own structure type.

The structure should tell us everything we need to know about that thing.

As a general rule we should ask ourselves: "If I had two ellipses (or nuclides etc.) what would I need two of and what would I still just need one of?" Things we would need two of should be part of the structure.

For example, if we had two ellipses they would each have their own centre and axes but would share the same value of PI.

Structures are "proper variables"

Unlike arrays where the "value" of the name of an array is the address of its first element, structures are "proper variables" that can be copied (although this is quite rare):

  mystruct = yourstruct;

This has a serious consequence when passing structures to functions as we will see in the next example.

Passing structures to functions

Given that structures are proper variables we may pass copies of them to functions. This rather simple example shows a function whose job is to calculate and print out the area of an ellipse. (In practice this function is so simple it probably isn't worth making a separate function.) Just for fun it also makes a doomed attempt to move the ellipse.

#include <stdio.h>
#define  PI 3.14159265358979

typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
} Ellipse;

void print_area(Ellipse el) {
  float area;

  area = PI * el.axes[0] * el.axes[1];
  printf("The area is %f\n", area);
  el.centre[0] = 123.456; // Move the ellipse
  el.centre[1] = -78.9;
}

int main() {
  Ellipse ellie;

  printf("Centre? (x,y)\n");
  scanf("%f %f", &ellie.centre[0], &ellie.centre[1]);
 
  // NB: in a real program we should check
  // the axes are actually > 0 
  printf("Length of axes ( > 0 )?\n");
  scanf("%f %f", &ellie.axes[0], &ellie.axes[1]);

  printf("Orientation to the vertical?\n");
  scanf("%f", &ellie.orientation);

  print_area(ellie);

  return 0;
}

This example also illustrates that using structures without pointers is only of limited use: the function print_area() receives a copy of the original structure, not the structure itself, so if we change the copy inside the function the original is unchanged. For this reason structures are almost always used with pointers.

There are several reasons we may wish to modify an ellipse:

  1. We may wish to move the ellipse. (Or resize it, rotate it, or ...)
  2. Some properties of the ellipse, such as the area, may be so useful we may wish to add a member of the structure to store it.

We can solve this by passing a pointer to the structure to the function, rather than a copy of the structure.

Pointers to structures

Structures are most useful when combined with pointers.

These work just as we would expect. Let's add an "area" member to our ellipse structure:

typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
  float area;
} Ellipse;

This takes twenty-four bytes, assuming (as on most systems) that a float occupies four bytes. A particular Ellipse, say called "ellie", might be stored starting at byte 800 in which case its members would be laid out as follows:

               ------------------------------------------------------------------------
Byte number:  | 800 - 803 | 804 - 807 | 808 - 811 | 812 - 815 | 816 - 819 | 820 - 823 |
               ------------------------------------------------------------------------
Used for:     | centre[0] | centre[1] |  axes[0]  |  axes[1]  |orientation|   area    |
               ------------------------------------------------------------------------
              | <-------------------------- ellie ----------------------------------> |

A statement such as:

  ellie.orientation = 11.4;

would mean that the computer would go to the start of ellie (800), go along 16 bytes to byte 816 and write the value 11.4 into the four bytes 816 - 819.

Alternatively, we could declare a pointer to an Ellipse, assign it the address of ellie (800) and use that pointer, as in the following rather contrived example:

#define PI 3.14159265358979

// Uses the Ellipse type defined above
int main() {
  Ellipse ellie, *ep; // ellie is a structure, ep just a pointer

  ep = &ellie;        // ep now holds the address of ellie
  (*ep).orientation = PI/4;
  // More code here...

  return 0;
}

The notation (*ep).orientation is rather awkward (it is a consequence of the fact that operators that follow their operand, such as array[n] and ellie.orientation, bind more closely than anything else, so the brackets are needed to apply the * to ep first). However, pointers to structures are used so often that they have their own notation:

  ep->orientation = PI/4;

The fact that they have their own notation should tell us how important pointers to structures are!

If p is a pointer to a structure then p->member accesses a member of that structure.

Passing pointers-to-structures to functions

We can now solve the obvious weakness of our previous 'structure' function (that it only passes a copy of the structure into the function without passing the new values back) by passing a pointer to a structure to a function, rather than a copy of the structure.

This is an example of "several pointers pointing to the same object": although the pointer in the calling function and the pointer in the called function are different variables they both have the same value and so they both point to the same thing. When we dereference them we are therefore accessing the same original structure.

Improving the print_area() function

Having added a new member, area, to our ellipse structure we can modify our print_area() function to calculate the area and store it in the structure rather than just print it to the screen. We will call the new function calculate_area(). The program now looks like this:

#include <stdio.h>
#define  PI 3.14159265358979

typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
  float area;
} Ellipse;

void calculate_area(Ellipse *el) {
  el->area = PI * el->axes[0] * el->axes[1];
}

int main() {
  Ellipse ellie;

  printf("Centre? (x,y)\n");
  scanf("%f %f", &ellie.centre[0], &ellie.centre[1]);
 
  // NB: in a real program we should check
  // the axes are actually > 0 
  printf("Length of axes ( > 0 )?\n");
  scanf("%f %f", &ellie.axes[0], &ellie.axes[1]);

  printf("Orientation to the vertical?\n");
  scanf("%f", &ellie.orientation);

  calculate_area(&ellie);

  return 0;
}


Passing pointers-to-structures to functions allows the function to access and change the members of the structure.

Passing pointers between functions

In reality calculate_area() is still a little simple to be worth having as a separate function. However it would be quite sensible to define a new function to read in the values of the ellipse, which we might imaginatively call "read_ellipse()".

We shall have read_ellipse() pass the pointer it receives to our new calculate_area() function to illustrate an extremely common practice: functions pass pointers on to other functions. The obvious analogy is with a phone number:

If I were to tell you "my phone number is 0314 159 265, ring me any time you have a question about programming", not only could you ring me yourself but you could pass that number on to your class-mates who could also ring me.

Similarly, if a function has a pointer to something it can pass the value of that pointer to another function which can then access the original object in memory.

Finally we have functions to move and resize the ellipse, just because we can and to illustrate that the resize function can also call calculate_area().

#include <stdio.h>

#define  PI 3.14159265358979
typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
  float area;
} Ellipse;

void calculate_area(Ellipse *el);
void read_ellipse(Ellipse *el);
void resize_ellipse(Ellipse *el, float scale);
void move_ellipse(Ellipse *el, float dx[2]);

int main() {
  Ellipse ellie;
  float moveby[2], resize;

  printf("Welcome to the ellipse program\n");
  read_ellipse(&ellie);

  printf("The area of the ellipse is: %f\n", ellie.area);

  printf("Amount to move the ellipse?\n");
  scanf("%g %g", &moveby[0], &moveby[1]);
  
  move_ellipse(&ellie, moveby);

  printf("Amount to resize the ellipse?\n");
  scanf("%g", &resize);
  
  resize_ellipse(&ellie, resize);

  return 0;
}

void read_ellipse(Ellipse *el) {

  printf("Centre? (x,y)\n");
  scanf("%f %f", &el->centre[0], &el->centre[1]);

  printf("Length of axes ( > 0 )?\n");
  scanf("%f %f", &el->axes[0], &el->axes[1]);

  printf("Orientation to the vertical?\n");
  scanf("%f", &el->orientation);

  calculate_area(el); // Pass the pointer to another function
}

void resize_ellipse(Ellipse *el, float scale) {
  el->axes[0] *= scale;
  el->axes[1] *= scale;
  calculate_area(el);
}

void calculate_area(Ellipse *e) {
  e->area = PI * e->axes[0] * e->axes[1];
}

void move_ellipse(Ellipse *el, float dx[2]) {
  el->centre[0] += dx[0];
  el->centre[1] += dx[1];
}


Looking at the above example from the bottom upwards, the first thing we see is that move_ellipse() and calculate_area() are passed a pointer to the original ellipse and so are able to change the values of its position and area respectively.

Then resize_ellipse() is also passed a pointer to the original ellipse. As well as accessing the ellipse itself, it also passes this pointer on to calculate_area() to update the area.

Similarly, for read_ellipse() we pass just one pointer, to the ellipse, rather than three or five individual pointers to ellipse.orientation, ellipse.centre[0], etc. and it again passes the value of the pointer it was given on to the function calculate_area(). As mentioned above this is extremely common.

Functions that receive a pointer to a structure often pass that pointer on to other functions.

More than one type of structure

The ellipse example has just one type of structure, but the generalisation is straightforward.

Example: a projectile

Imagine a function to calculate the position and velocity of a projectile thrown into the air. Its position and velocity are known at time t and we need to calculate its position and velocity at time t + dt. Its prototype would look something like:

void move(float x[NDIMS], float v[NDIMS], float mass,
                     float drag_coeff, float ywind[NDIMS],
                     float viscosity, float dt);

This is a very simple problem but the function has seven arguments. Worse, they are all floats or arrays of floats so it would be very easy to get two in the wrong order and the compiler would not notice. There are one hundred and forty-four different legal ways of ordering these arguments (3! = 6 orderings of the three arrays times 4! = 24 orderings of the four floats), and nearly seven thousand if we allow for the chance of putting the same one in twice (3^3 x 4^4 = 6912), but only one of these is the right one!

If we take a look at the arguments we see they split into three groups: x, v, mass and drag_coeff are properties of the projectile, ywind and viscosity are properties of the air and time is a physical quantity in its own right. This suggests we want two structures, one for the projectile and one for the air, and to leave time as it is.

The following code declares what are in effect two new types of variable. Again, this code does not actually create any structures; it just tells the compiler what we mean by "Projectile" and "Air".

#define NDIMS 3
typedef struct projectile {
  float x[NDIMS];
  float v[NDIMS];
  float mass;
  float drag_coeff;
} Projectile; 

typedef struct air {
  float ywind[NDIMS];
  float viscosity;
} Air;

We have gone from seven numbers to three things: projectile, air and time.

The function is now declared as:

void move(Projectile *proj, Air *thisair, float dt);

Not only do we now only have three variables rather than seven, all three are of a different type so if we were to call the function with two of its arguments in the wrong order the compiler would notice and tell us.

Thinking at a higher level

Almost without noticing it, we've made a bit of a mental leap. We started by thinking about how we could reduce the number of arrays needed to represent our nuclides or reduce and organise the (floating point) arguments to a function. But we have quickly reached the stage where we have stopped talking about integers, floating-point numbers and strings and have started talking about nuclides, ellipses and projectiles.

This is the point about structures: we identify the types of "things" we are dealing with and typically we define a type of structure to represent that type.

Structures allow us to think at a higher level, in terms of the "things" we are dealing with rather than individual data values.

Structures allow extensibility

Another thing we did without noticing it was that we added an "area" member to our ellipse structure. All we had to do was to type in the extra member and recompile. Similarly we mentioned that our nuclide information may need to be extended to include its mass and atomic number.

Suppose our projectile example had started out with only one dimension, y. If our program is successful somebody is bound to ask us to extend it to three dimensions, in which case y, vy and ywind would have to be replaced by arrays. If we used the "separate variables" approach we would have to go through each of our functions changing the number (and type) of arguments.

With the "structure" approach, we just add some more members to the structure definition, change the part of the code responsible for calculating the acceleration and recompile.

It's extremely common for a program to start off quite simply but for more features and properties to be added later, so the question of extensibility is hugely important.

Structures can easily be extended to include new members.

Structures help make our functions more modular

When we added the area member to the ellipse structure, most of our functions did not need to know about it. Indeed, resize_ellipse() and read_ellipse() just passed the pointer on to calculate_area() to calculate it: they did not even need to know the area member existed.

In our projectile example we might need to deal with the fact that real projectiles spin in the air and have texture. Even if we assume a spherical shape, that's several more variables. And somebody is sure to want to throw non-spherical objects. In this case our move() function is going to split into two parts:

  • A numerical algorithm for moving the projectile.
  • A function for calculating the force on the object (and hence its acceleration), taking into account shape, spin, etc.

But if we use structures, the function declaration remains unchanged: our acceleration function could become very complicated but our function call stays the same. Thus, the rest of our code doesn't need to know about it.

Functions don't even need to know of the existence of structure members that don't concern them.

Reference

Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious.
Brooks, F. R. Jr, The Mythical Man-Month (1975).

Summary


Structures do for variables what functions do for code

Defining a new composite variable (structure) type

When to use structures and what to put in them

Pointers to structures

Thinking at a higher level
