The structures we declared in the last last lecture have the disadvantage that we need to know in advance how many we will need and give them all names. But for many applications we want to be able to create and destroy them as the program progresses, for example if a user opens a new tab in a web browser. How do we create and keep track of these structures?

Allocating structures

Reminder: allocating a structure

In practice structures are nearly always dynamically allocated via pointers rather than declared as variables.

We saw in a previous lecture that we can use malloc() to dynamically create simple structures. The mechanics work just as we would expect:

  struct thingy *p = NULL;
  p = xmalloc(sizeof *p);

Having a function to create and initialise a structure

It's a very good idea to define a function whose job is to allocate the space for a structure and to do any initialisation as that way we have everything together in one place. The general form of such a function to allocate and initialise a new Thing with a double member called something is something like:

typedef struct thing {
    double something;
    // More members here...
} Thing; 

// Allocate a new Thing and read the values from stdin
Thing * newThing() {
    Thing *p = NULL;
    
    p = xmalloc(sizeof *p);
    printf("Please enter the value of something\n");
    scanf("%lg", &p->something);
    // etc.
 
  return p;
}

Step through this code

Two points are worth mentioning here:

1. Do the initialisation in the simplest place

In the above example we have just put a comment where we would initialise the rest of our new Thing. We have the choice of just putting the code directly in the function or of having a separate function to setup the Thing. If we choose to have a separate function then our newThing() function becomes extremely simple:

// Allocate a new Thing and call setupThing() to initialise it.
Thing *newThing(void) {
    Thing *p;
    p = xmalloc(sizeof *p);
    setupThing(p);
    
    return p;
}

Our setupThing() function might look a bit like:

// Read in the values of a Thing from the keyboard
void setupThing(Thing *p) {
  printf("Please enter the value of something\n");
  scanf("%lg", &p->something);
  // etc.
 }

As always we just go for the simplest and clearest option in any given circumstance:

Whenever we are faced with a choice, our first question should always be: "which choice will be the clearest and give me the least chance of making a mistake?"

2. Choose how to initialise the structure

Although for simplicity we often illustrate reading in the structure members from the keyboard, there are a number of other options. For example, it might be better to initialise all the members to zero which we can do using the following neat trick:

// Set all the members of a Thing to zero
void zeroThing(Thing *t) {
  Thing zero = {0};
  *t = zero;
}

This works because:

Structures can be inialised on declaration just like arrays.
Missing initialers are treated as being zero. (NB: as always this is only true if the structure is initialised, uninitialised structures are random.)
Structures are first-class variables so we can assign a whole structure the value of another whole structure.

We could then decide to call zeroThing() instead of setupThing():

// Allocate a new Thing and inialise it to zero (using sub-function)
Thing *newThing(void) {
    Thing *p;
    p = xmalloc(sizeof *p);
    zeroThing(p);
    
    return p;
}

or we may choose to do it all inside of newThing():

// Allocate a new Thing and inialise it to zero (all-in-one)
Thing *newThing(void) {
  Thing zero = {0}, *p;
  p = xmalloc(sizeof *p);
  *p = zero;  
    
  return p;
}

The function call

Whichever choice we make, the newThing() function is now essentially a drop-in replacement for malloc() except that it knows the number of bytes required and does any initialisation. A typical call looks like this:

  Thing *t = NULL;
  
   t = newThing().

It's a very good idea to define a function whose job is to allocate the space for a structure and to do any initialisation.

Refactoring

A common situation (and one we will encounter in the following mini-exercise) is when we have a chunk of code that is part of a larger function but we decide it would be better if it became its own separate function. This could be because:

We need to use it more than once.
It, or the surrounding code, have become sufficiently complex that it is clearer to have it separately.
It should have been a function to start off with!

This is an example of refactoring; the practice of tidyng our code in a way that doesn't change what it does but makes it easier to maintain and extend in the future. It's good practice to continually do this in small chunks rather than let our code turn into a mess and do a large "spring-clean".

Notice how "Cut-and-Paste" (into a function) is better than Copy-and-Paste.

A typical way to do this is to cut and paste the code into its own small function. For example, imagine we had decided not have a separate "newThing() function because we only needed to allocate a new Thing in one place. A little later we extend the program and need to alocate a new Thing in another place as well. The wrong way to do this would be to copy and paste the allocation code into another part of the program. The correct way would be to put the code inside a small function and to call that function twice.

Often, although not in this case, the new function will require some arguments from the calling function. In our case the shell function is simply:

// Allocate a new Thing and inialise it. Currently an empty stub
// Choose the same variable name as the existing code to be moved.
Thing *newThing(void) {
  Thing *p; 

  // Move the allocation code to here
  
  return p;
}

newCuboid function
To practice creating a function whose job is to allocate and initialise a new structure and to illustrate simple refactoring.
Open up the on-line compiler in a new window
Click on "Saved programs" and open your "cuboid" mini-exercise.
Look carefully at the newThing() stub function above. Now above main() create a new shell function that is a version of this adjusted for the name of your cuboid object. Use the same variable name as in your main() function.
Move all of the code that reads in the values of the cuboid into your new function. However, keep the printf() statement that prints out the cuboid inside main().
Put a call to your "newCuboid" function inside main(). (See the "function call" example above.)
Build & run. Check the output is correct..
So far all we have done is to refactor our code: it does exactly the same as before but we shall see later that it is easier to extend for when we want multiple cuboids..

Example: a slightly more complicated allocation.

Some structures may require more than one call to malloc() to fully allocate them. Consider a structure which itself has a dynamically-allocated string:

Notice how str is a pointer not an array. In order to use it we must make it point at something.

#include <stdlib.h>
typedef struct wotsit {
  float val;
  char *str;
  int length;
} Wotsit;

To obtain a new wotsit structure we will need to first allocate a structure and then allocate the memory for the character string. We show a short but complete example as it's worth stepping through it to watch the two calls to malloc() .

#include <stdio.h>
#include <stdlib.h>
typedef struct wotsit {
  float val;
  char *str;
  int length;
} Wotsit;
Wotsit *newWotsit(int len) {
  Wotsit *w;

  w = xmalloc(sizeof *w);

  w->length = len;
  w->val = 0;
  w->str = xmalloc(w->length);
  return w;
}

int main() {
  Wotsit *william;

  william = newWotsit(20);
  snprintf(william->str, william->length, 
    "Hello, world\n");
  return 0;
}

Step through this code

Structures which themselves contain dynamically allocated members will need more than one call to malloc().

Freeing memory

Memory that is no longer required must be released using free() otherwise our program will gradually use more and more memory until the computer runs out (this is called a memory leak). However, memory is automatically freed when the program finishes.

free()d memory must never be referenced or freed again.

Thus the absolute rule is that we can't use memory after we've freed it and we can free it only once. This typically means that calls to free() are done in the reverse order to calls to malloc(). To reuse the above example we might define a function freewotsit():

void freewotsit(Wotsit *w) {
  free(w->str);
  free(w);
}

The above version is correct. Had we written:

  free(w);
  free(w->str); // Wrong!!

It would have been very, very wrong.

Step through a complete example of the correct code.

Keeping links consistent

If a person changes ther phone number without telling their friends then a lot of people's contacts lists will be incorrect and trying to dial their friend will result in "number unobtainable" or a wrong number.

When we dynamically-allocate structures we will often have several pointers pointing to them. Each of these will need to be updated (often by setting them to NULL). So the challenge is to keep links consistent when we free memory, whether this is done via free() or realloc().

Failure to do this is a notorious source of bugs.

When freeing data we must take care of any pointers pointing to it.

Keeping track of memory

The reason why structures are normally allocated dynamically instead if just declaring them is that we want to be able to create and destroy them as the program progresses, for example if a user opens a new tab in a web browser. How do we keep track of these structures?

Allocating memory is the easy bit and in reality is just the first of three tasks:

Getting more memory.
Keeping track of the memory we have obtained.
Releasing it when we have finished with it, and handling any pointers that point to it..

We have already covered 1 (malloc()) and 3 (free()); this section is about 2.

Reminder: forgetting the results of previous calls to malloc()

As mentioned in the previous lecture on dynamic allocation of arrays, we can call malloc() as often as we like to give ourselves more and more structures but there's a problem: once we have lost the value of a pointer to memory allocated by malloc() there is no way to find it out again so how do we remember where all our structures are?

The following code illustrates the problem: it allocates memory for n structures but promptly forgets where the first n-1 are stored:

   for (int i = 0; i < n; ++i)
    p = malloc(sizeof *p);

Step through this code

The first n-1 structures are thus inaccessible: we have allocated the memory but forgotten where it was. It's like writing down someone's phone number and losing the piece of paper we wrote it on - they have a perfectly good phone but we can't ring it because we've lost the number.
Linked list

One solution

We can see one possible solution by looking at the step-though of the previous code where we have a set of allocated chunks of memory but have forgotten the address of all but one.

Could each allocation remember the address of the previous one?

However, each allocated chunk of memory has an inviting space at the end and the obvious thought is "could we use the last part of each chunk to store the address of the previous chunk?"

One solution to the problem of remembering the addresses of an unknown number of chunks of allocated memory is for each allocated chunk to remember the address of the previous one.

Lists of people's favourite names or favourite colours

Suppose we have a large group of people sitting in order and we want to be able to recite all their names, favourite colours, etc. without writing them down. In principle we could do this by:

Remembering the name of the first person
Getting each person to remember the name of the person after them
Having the last person know that there is nobody after them

Then somebody could recite the names of the whole class, or some other piece of information such as their favourite colour.

Before thinking of how we might create such a list of people, let's imagine how we would use such a list to recite everybody's favourite colour.

Reciting the list of the class's names

To recite the list we can just say:

Go to the first person on the list
Ask them the name of the next person on the list.
Go to that person.
Go to step 1.

However, this has a problem: we have failed to handle part 3 of the illustration: telling the last person there is nobody after them.

[Class exercise.]

Adding a person to the list

If we were to implement this with people then the basic algorithm for adding somebody to the start of the list would be:

Before we start the list: set the first person to "nobody". Then:

Find a new person.
Tell the new person their next person is the current first person.
The new person is now the new first person on the list.

The improved traversal function

Now that we have made the crucial step of identifying when the list is at an end, our improved traversal algorithm is:

Make the first person on the list the "current person".
If that person is "nobody" then quit.
Ask them the name of the next person on the list and make him or her the new current person.
Go to step 1.

Linked lists

This leads us to the idea of making a list of pointers. To make a list we need to add a new element to our structure: a pointer to the next one. The pointer is NULL if there is no next one, ie this element is the last one in the list:

typedef struct particle {
  float mass;
  float x[NDIM];
  struct particle *next;
} Particle;

The structure definition contains an extra member which is a pointer (usually called "next") to the next structure in the list.

We have to use struct particle inside the structure definition as at this stage the Particle alias has not been defined.

Printing the value of the `next` member

Although it's not something we need to do very often (and it's never needed for a normally functioning program), we might want to print the value of a structure's next member out as a diagnostic. It makes sense to print out the address of the structure itself as well.

The format to print out the value of a pointer is %p and it will usually print it out in hexadecimal (base 16) starting with 0x.

The following prints out the value of a pointer to a structure and the address of its next member. As stated above, and as we shall see below, it can be a useful diagnostic:

  printf("p: %p p->next: %p\n", p, p->next);

The output will typically look somehting like:

p: 0xff8b3454 p->next: 0xff8b3514

where "0x" indicates a hexadecial number. A NULL pointer may be printed out as 0x00000000 or with text such as (nil), NULL, etc.

Add a next member and print it out.
A little practice with pointers.
Add a next member to your structure.
Inside your newcuboid function set the next member to NULL.
Back inside main(), immediately above your printf() statement that prints out the cuboid values add a printf() statement like the one above to print out the value of your cuboid pointer and its next member.
Build & run and check that the next member is NULL.

Creating a linked list of structures

Just as in the human case we need something to hold the identity (ie the address) of the first item of the list. Typically this is a pointer called first or firstThing, etc.

The process of creating a linked list of people converts directly into the process of creating linked lists of structures. We show then here side by side:

Human list	C program
Initialisation:
Set the first person to "nobody".	`Thing *firstThing = NULL;`
For each new person or structure:
1. Find a new person.	1. `Thing new = malloc(sizeof new)`
2. Tell the new person their next person is the current first person.	2. `new->next = firstThing;`
3. The new person is now the new first person on the list.	3. `firstThing = new;`

Steps 1-3 are easy to add to our newparticle() function, we just need to:

Pass the value of the first particle pointer as an argument
Assign this to p->next (where p is the pointer to the newly allocated particle)

We have highlighted these two steps in the following newparticle function which allocates and initialises a new structure, and adds it on to the beginning of an existing linked list:

#include <stdlib.h>
#include <stdio.h>

#define NDIM 2
typedef struct particle {
  float x[NDIM];
  float mass;
  struct particle *next;
} Particle;


// Allocate a new particle and add it to the start of the list
struct particle * newparticle(Particle *first) {
  Particle *p;

  p = xmalloc(sizeof *p);
  p->next = first;

  // Now initialise exactly as before
  printf("Please enter the mass and x and y values.\n");
  scanf("%f %f %f", &p->mass, &p->x[0], &p->x[1]);

  return p;
}

int main() {
  int i, n;
  Particle *first = NULL;

  printf("How many structures would you like?\n");
  scanf("%d", &n);

  for (i = 0; i < n; ++i) 
    first = newparticle(first);



  return 0;
}

Step through this code

The call to newparticle()

At first sight the call to newparticle() looks very strange:

    first = newparticle(first);

The interpretation of this is "pass the current value of first to newparticle() and use the value it returns as the new value of first".

Notice also:

We initialised the first element to NULL to show that the list started off empty: this is vital.
We put all the checking that malloc() didn't fail into a utility function called xmalloc() that we can now use just like malloc() but without the tedious checking.

It is essential to initialise the first member of the list to NULL.

Covert your newcuboid function to create a linked list
To practice creating a list
Inside main() make sure that the pointer to your cuboid structure is initialised to NULL. (This is now going to be the first member of the list of cuboids)
Edit your newcuboid function so that it accepts a pointer to a cuboid structure, in the same way that newparticle() above accepts a pointer to a Particle.
Change the statement inside your newcuboid function that sets the value of p->next to NULL so that it now sets p->next to the value of the pointer it received as a parameter.
Change the call to your newcuboid function to pass it the value of the current cuboid structure (which should be NULL at this point).
Build & run. Check the output is correct.. At this stage your next member should still be NULL as there is only one item in the list (ie there is no second item).
At this stage main() should contain the following three statements in successsion:
- A call to your newcuboid function.
- A diagnostic that prints out the values of your pointer and its next member.
- A statement that prints out the cuboid values such as mass etc.
Now enclose all three of these statements inside a simple for(){ } loop that executes twice.
Build & run. Check the output is correct.. For the second diagnostic "%p" printout the value of next should no longer be NULL but should be the address of the first structure, something like:
```
p: 0xff8b3454 p->next: (nil)
p: 0xff8b3514 p->next: 0xff8b3454
 
```
Notice how the second p->next is the value of the first p

Traversing a list of structures

Again, we can directly convert the process of traversing a linked list of humans into traversing a linked list of structures. We will use a temporary pointer p to refer to the structure under consideration at any one time.:

Human list	C program
Initialisation:
Make the first person on the list the "current person".	`Thing *p = firstThing`
Traversal:
1. If that person is "nobody" then quit	1. `if ( p == NULL) quit the loop`
2. Ask them the name of the next person on the list and make him or her the new current person..	2. `p = p->next`
3. Go back to step 1	3. Go back to step 1

Using a for() loop

The loop above for traversing a linked list of structures fits well into a for() loop:

Initialisation	`Thing *p = firstThing`
Test	`p != NULL`
Increment	`p = p->next`

The following for loop loops over every element of the list in turn printing out their mass and co-ordinates. The pointer p is initially equal to first and then becomes first->next and so on.

  for (p = first; p != NULL; p = p->next)
    printf("%f (%f, %f)\n", p->mass, p->x[0], p->x[1[);

A complete example for study later

#include <stdlib.h>
#include <stdio.h>

#define NDIM 2
typedef struct particle {
  float x[NDIM];
  float mass;
  struct particle *next;
} Particle;


// Allocate a new particle and add it to the start of the list
struct particle * newparticle(Particle *first) {
  Particle *p;

  p = xmalloc(sizeof *p);
  p->next = first;

  // Now initialise exactly as before
  printf("Please enter the mass and x and y values.\n");
  scanf("%f %f %f", &p->mass, &p->x[0], &p->x[1]);

  return p;
}

int main() {
  int i, n;
  Particle *first = NULL;

  printf("How many structures would you like?\n");
  scanf("%d", &n);

  for (i = 0; i < n; ++i) 
    first = newparticle(first);



  for (Particle *p = first; p != NULL; p = p->next)
    printf("Mass: %f position: (%g, %g)\n", p->mass, p->x[0], p->x[1]);

  return 0;
}

Step through this code

To go over every element of a linked list use:

for(p = first; p != NULL; p = p->next)

Traverse your linked list of cuboids
To practice traversing a list and to check the creation of the list.
Comment out the diagnostic "%p" statement inside main().
After you have setup the linked list of cuboids, create a for() loop like the one above to traverse the list printing out each cuboid. (You can just resuse the printf() statement you already have.)
Build & run. Check the output is correct.. Your second loop should print out the same cuboids as the first one but in the reverse order.
Now before the fixed loop that creates two cuboids, ask the user how many they want and modify the loop to create that many. (You will need to declare a variable to hold the number of cuboids required.)
Build & run. Check the output is correct..

Preserving the order of the list

Our current newparticle() function creates the list in reverses order. This is not a problem if this is just a way of keeping track of the items, but it may be that the order is in fact significant. If we wish to preserve the order we can tweak the algorithm slightly:

Allocate a new structure and set its next member to NULL
If the list is empty just return the address of the new structure.
Otherwise:
Find the last member of the current list.
Set its next member to point to the new structure.
Return the address of the first member of the list.

Note that in this case the start of the list changes only if the list was initially empty.

The code for finding the last element of the list is similar to the code for going over the whole list except that instead of stopping when The code looks like this:

// Allocate a new particle and add it to the END of the list
struct particle *newparticle(Particle *first) {
  Particle *p, *last = first;

  p = xmalloc(sizeof *p);
  printf("Please enter the mass and x and y values.\n");
  scanf("%f %f %f", &p->mass, &p->x[0], &p->x[1]);
  p->next = NULL;

  if ( first == NULL ) 
    return p; // New list

  // Find the last element
  while(last->next != NULL)
    last = last->next;

  last->next = p;
  return first;
}

Step through a complete example

Removing and freeing a member of a list

This introduces an extra complication: as well as freeing the structure itself we also have to remove it from the list.

To remove the first member of a list we simply set the new first member to old_first->next and free old_first. For every member except the first we find its predecessor and change its next pointer to the next of the one we want to free:

Initial list

A -> B -> C -> D -> NULL

After removing 'B'

A -> C -> D -> NULL

free(B)

After removing 'A'

C -> D -> NULL

free(A)

If we are removing the first element of the list then the first element of the list will be the former second member (or NULL)

Otherwise we must traverse the list looking for the to-be-deleted member's predecessor

Two things to look out for

It would be good to check for two other possibilities:

The case when the function is asked to remove "nobody" (or NULL in pointer-speak). This may seem strange ("if there is nothng to do why would they bother to call it?") but remember two of our core principles for functions:
The purpose of a function is to make life easier for the person calling it.

And its consequence:

Decisions about what a function should do, or whether the function should exist at all, are usually answered by asking the question "what makes it easier for the person who calls this function?"

And it is easier to write:
```
  freesomething(pointer);
```
Than:
```
  if (pointer != NULL)
    freesomething(pointer);
```
And it is also safer for our function to do the check than for it to crash.
The possibility of a mistake, ie asking to remove something that is not actually a member of the list. We will discover this when we go down the list looking for the predecessor of our "to go" member: if we get to the end without finding our predecessor we have made a mistake.
A variation of the above is that the list is actually empty.

It's perfectly OK to ask the function to remove "nothing" and the function must handle this.

Functions should try to check for common errors in calling them.

Removing a member of a list

Stripped-down version

Let's first look at a version that does not do any of the error checking:

/*
 * Stripped-down free particle without checking.
 * Free Particle "togo" and return the new first Particle
 * "first" must be the existing first member of the list
 */
struct particle * freeparticle(Particle *togo, Particle *first) {
  Particle *parent;
  
  if (togo == first) // Is it the first one?
    first = first->next;
  else {    // Find togo's predecessor:
      for (parent = first; parent->next != togo; parent = parent->next) {
    // Empty loop
      }
      parent->next = togo->next;
  }

  free(togo);
  return first;
}

The empty for() loop looks a little unusual, it is because the mere act of finding the end of the loop is actualy the task we want to acheive.

Full version

Here we incorporate the "nothing to do" and error checks:

/*
 * Free Particle "togo" and return the new first Particle
 * "first" must be the existing first member of the list
 */
struct particle * freeparticle(Particle *togo, Particle *first) {
  Particle *parent;
  
  if (togo == NULL) // Do nothing
    return first;
  else if ( first == NULL ) {
    fprintf(stderr, "Trying to remove something from an empty list!\n");
    return NULL;
  }

  if (togo == first) // Is it the first one?
    first = first->next;
  else {    // Find togo's predecessor:
    for (parent = first; parent->next != togo; parent = parent->next) {
      if (parent->next == NULL) {
        fprintf(stderr, "%p isn't in the list!\n", togo);
        return first;
      }
    }

    parent->next = togo->next;
  }

  free(togo);
  return first;
}

Step through a complete example of this code