Skip to content
Physics and Astronomy
Home Our Teaching Resources C programming Appendix: allocation
Back to top
On this page
Contents

The occasional appendices and optional examples in this module are for advanced material that you will not need for this module. They are intended for enthusiastic students who are interested in going further in programming for its own sake.

This optional appendix goes into a little more detail about the thinking behind creating multi-dimensional pseudo arrays, and the use of pointers in general.

How to use pointers effectively


Given the flexibility and usefulness of pointers and memory addresses the obvious question is "given a situation where they might be useful, how do we decide how to use them?". In general, this will depend on the answers to the following questions:

Questions to help us understand a specific use of pointers and memory addresses
  1. What type of "thing"  is this the address of and what are the values we need to represent this "thing"?
  2. How are we going to organise and store these values in the computer's memory?
  3. How are we going to proceed from knowing a single memory address to being to access any of these values?
  4. How are we going to allocate new "things", and if necessary resize and/or free them?
For built-in composite types such as complex numbers and arrays C makes decisions 2,3 and 4 for us, for more-complicated or more flexible objects we will have to decide this for ourselves. 

It is important to realise that questions 2, 3 and 4 may have more than one possible answer in which case the key is to choose one and carefully document it. Also, we may not necessarily answer the questions in that order: it's quite likely that we will have a particular way we want to access the data (Q3) for example, and this will influence the answers to questions 2 and 4. Or there may be restrictions on how we want to allocate objects which will then effect the other two. We saw examples of both of these later on in the lecture and in the examples.

Example: a fixed-size two-dimensional array of doubles


Applying these questions to a fixed-sized two-dimensional array of such as double ar[N1][N2] we find:
Question Answer
1. What type of "thing"  is this the address of and what are the values we need to represent this "thing"? We are representing a rectangle of N1*N2 floating-point numbers so we need N1*N2 floats.
2. How are we going to organise and store these values in the computer's memory? We need to store N1*N2 floating-point numbers and we will store them sequentially as N1  rows each of of N2 doubles.
3. How are we going to proceed from knowing a single memory address to being to access any of these values? Element ar[j][k] can be found by simple arithmetic from the initial address as ar + (k + (j-1)*N2) * 8 bytes. Any function needing to access the array will therefore need to know the values of ar and N2 but not N1.
4. How are we going to allocate new "things", and if necessary resize and/or free them? This decision is made for us: by declaring the array we will automatically be allocated the storage for it, with no ability to resize it.

Arrays of pointers

The first generalisation of a simple array we made in the lecture was to go from a fixed matrix (two-dimensional array) to one where we know the first dimension (the number of rows) won't change but the length of each row will do. Given we are dealing with a matrix it's very likely we would adopt the following requirement:

  • Requirement: the data must be accessed using the same [j][k] notation as a true two-dimensional array

In fact, this is an extremely common requirement in this sort of situation.

Assuming a first dimension of M then answering our four questions we see:

Question Answer
1. What type of "thing"  is this the address of and what are the values we need to represent this "thing"? We are trying to represent an M*n matrix where n may change as the program progresses but will be the same for each row. So we will need M*n floats.
2. How are we going to organise and store these values in the computer's memory? We are storing M rows of numbers. Since the length of each row will change each row will need to be dynamically allocate so we will need M pointers to hold those addresses. This suggests we use an array of M pointers, and that each pointer points to a dynamically-allocated array of n floats.
3. How are we going to proceed from knowing a single memory address to being to access any of these values? If the array of pointers is called p then the address of the  row j is p[j] and the value of element k of this row is p[j][k] . (This satisfies our requirement.)
4. How are we going to allocate new "things", and if necessary resize and/or free them? We shall decide to use an array of M pointers and for each of these M pointers we shall call malloc() or realloc() to allocate the space to store the actual row floating-point numbers.

The code looked like this:

#include <stdlib.h>
#define M 3
int main() {
  int n;
  float *p[M];        // This is the array of pointers

  printf("Size of each vector?\n");
  scanf("%d", &n);    // Find the row size
  
  for (int i = 0; i < M; ++i)
    p[i] = xmalloc(n * sizeof *p[i]); // Allocate rows
  
  p[0][1] = 3.14159;
  // Use p for something 
  
  // Advanced point: resizing the arrays
  // feel free to skip this.
  printf("New size of each vector?\n");
  scanf("%d", &n);   // Find new the row size
  
  for (int i = 0; i < M; ++i)
    p[i] = realloc(p[i], n * sizeof *p[i]); // Resize the rows
  
  // Do something else

  return 0;
}
Step through this code


Things that link to other things


Ask yourself (or a colleague on the same module):
  • How useful would the World Wide Web be if web pages coud not link to other web pages, (which can link to other web pages, which can...)
  • How useful would social network sites be if people could not have other people called "friends" (who also have other people called "friends", who also...)
  • How useful would mobile phones be if they could not store the numbers of other phones (which can also store the numbers of other phones, which can also store...)
  • You need to send an important message to Alice. You don't know her number but you know that Bob does, he would be happy to forward a text for you, and you do know Bob's number. What would you do?
  • Bob's phone number has changed and that text to Alice really important! Luckily you know Carol's number and you know she knows Bob's new number. What do you do?

 Things that can link to other Things (that in turn can link to other Things...) are extremely powerful provided we know how to follow the links!

The key application of this is simple:

Pointers can point to other pointers

Question Answer
1. What type of "thing"  is this the address of and what are the values we need to represent this "thing"? We are trying to represent an m*n matrix where both m and n may change as the program progresses.
2. How are we going to organise and store these values in the computer's memory? We are storing m rows of numbers. Since the length of each row will change each row will need to be dynamically allocate so we will need m pointers to hold those addresses. Sice m may also changehis suggests we use an dynamically-allocated array of m pointers, and that each pointer points to a dynamically-allocated array of n floats.
3. How are we going to proceed from knowing a single memory address to being to access any of these values? If the pointer to the array of pointers is called p then the address of the  row j is p[j] and the value of element k of this row is p[j][k] . (This satisfies our requirement.)
4. How are we going to allocate new "things", and if necessary resize and/or free them? We shall decide to use an pointer to a pointer to a float, dynamically allocate (and if necessary realloc) an array of m pointers and for each of these m pointers we shall call malloc() or realloc() to allocate the space to store the actual row of floating-point numbers.

We illustrate this with an example not found in the lecture.

Using pointers to pointers to create a 2-D symmetric matrix

Physical problems often involve arrays that are symmetric: ar[j][[k] == ar[k][j] Rather than store every off-diagonal value twice we can consider a "triangular matrix" where the first row is of length one, the second of length two etc and make sure we only access ar[j][k] where k <= j  (Even if we didn't want to resize the "matrix", ordinary arrays wouldn't be able to handle this as the rows are of different lengths.)

This has the interesting property that if we do want to resize the matrix we don't need to resize the rows: the first row always has length one, the second length two etc. Rather, if we shrink the array we just free() the unwanted rows, if we grow the array we just allocate more rows. What we will need to do is to change the number of pointers to the rows. Thus this is essentially the miror image of the previous example where we could change the length of the rows but not the number of rows.
Let's answer our four questions:

Question Answer
1. What type of "thing"  is this the address of and what are the values we need to represent this "thing"? For a symmetrical matrix of size N there are N rows of numbers, the first row is one number long, the second two numbers long, .. and the final row N numbers long, for N*(N-1) numbers altogether. Each number is stored as a float.
2. How are we going to organise and store these values in the computer's memory? We will use N independent chunks of data, the first large enough to store one floating-point point number, the second two, ... and the last N numbers. As before, each of the N independent chunks of data will need its own pointer to store its address, but this time we will dynamically allocate an array of pointers.
3. How are we going to proceed from knowing a single memory address to being to access any of these values? This will allow elements to be accessed via the usual ar[j][k] notation.
4. How are we going to allocate new "things", and if necessary resize and/or free them?
  • To create a new array we will need to dynamically allocate an array on N pointers and then for each pointer dynamically allocate the space to store the actual row of numbers.
  • To shrink an existing array we will need to free the unwanted rows of data and then reallocate the array of pointers to the new, smaller, size.
  • To grow an existing array we will need to reallocate the array of pointers to the new, larger, size  and then dynamically allocate the space to store the additional rows of numbers.
  • To free an existing array we will need to free the rows of data and then free the array of pointers.

Allocating and freeing data often has a required order


Looking at answer four we see that whenever we increase the size of the matrix we first grow the array of pointers and then populate it with the actual rows of data. Conversely, when we shrink the matrix we have to free the unwanted rows first and then reduce the size of the array of pointers. This is quite a common occurrence:

Calls to malloc() and free() are frequently in the reverse order.


 Finally, in any situation where we allocate or resize something it's always a good idea to have a separate function to do this, so that is what we will do in he next example:
The following function allocates memory that can then be used like a two dimensional array (but see note below).
#include<stdio.h>
#include<stdlib.h>
// Shrink or extend a symmetric array
float **resizesymarray(float **p, int oldm, int newm) {
  int i;

  // Is the new one smaller?
  for (i = newm; i < oldm; ++i) 
    free(p[i]);

  p = realloc(p, newm * sizeof *p); 

  // Is the new one bigger?
  for (i = oldm; i < newm; ++i) 
      p[i] = malloc((i+1) * sizeof *p[i]);
  
  return p;
}

// Create a new symmetric array
float **newsymarray(int m) {
  return resizesymarray(NULL, 0, m);
}

// Free a symmetric array
void freesymarray(float **p, int m) {
  resizesymarray(p, m, 0);
}
  
int main() {
  float **p = NULL;
  int n, newn;

  printf("Size of array (>0)?\n");
  scanf("%i", &n);

  p = newsymarray(n);
  p[0][0] = 17;
  p[n-1][1] = 11.3;

  printf("New size of array (>0)?\n");
  scanf("%i", &newn);

  p = resizesymarray(p, n, newn);
  // Do stuff here

  freesymarray(p, newn);
  // Do more stuff not requiring p

  return 0;
}
Step through this code


Compare this code's memory map with that of a true two-dimensional array.

Differences from true multi-dimensional arrays

Executive summary: the things above may look like ordinary two-dimensional arrays but they're not

At first sight p above looks just like a "real" array, in that their elements are accessed using the same source-level notation:

  int array[M][N];
  int **p = NULL;

  p = new2dint(M, N);

  p[1][7] = 17;
  array[1][7] = 17;
However, p and array are different things. p is a pointer to M pointers to N ints. By contrast, array points to a single block of M*N ints.

They are dereferenced as follows:

p[i][j] => *( *(p+i) + j)  
Double deference: deference: (p+i), add j and dereference the result.
array[i][j] => *(array + (N*i + j))
Single deference: add N*i+j to array and dereference that.

This is why for functions which have arguments which are multidimensional arrays we must tell the function the size of the array (which the left-most size being optional):

int oldstylefun(float array[][N]);
int newstylefun(int n, float array[][n]);
int three_d_fun(int m, int n, float array[][m][n]);

Strictly for geeks only

The following monstrous construction declares a pointer that can be used to dynamically allocate a true two-dimensional array but we don't recommend you ever use it!

  int (*par)[N], j;

  printf("First dimension?\n");
  scanf("%d", &j);
  par = malloc(j * sizeof *par);
  par[j-1][0] = 3.14159;


Advantages and disadvantages of using malloc()

Allocated memory is permanent

Unlike true arrays, which vanish when the function returns, allocated memory is permanent until explicitly freed. This means that memory can be allocated inside sub-functions and is still there for the calling function when the sub-fnction retunrs.

malloc() is quite slow

One not very obvious disadvantage of malloc() is that because it's so general it's also quite slow. This means that for, say, short arrays that will be used within a function (and the functions it calls) and then thrown away at the end of the function it is better to use automatic ("normal") arrays if we can.

Poor Better
#include <stdlib.h>

void poorfun(int n) {
  double *x;

  x = xmalloc(n * sizeof *x);
  // .. Do some stuff with x

  free(x);
}

void betterfun(int n) {
  double x[n];

  // .. Do some stuff with x
}

The first version has more ways to go wrong, we might forget to call free(), and is also slower. (However, until C99 programmers had no choice as array sizes had to be declared at compile time.) It's not unknown for the speed of some programs to be dominated by calls to malloc().


Summary

The text of each key point is a link to the place in the web page.

When we needed to "pass the array to another function" we encountered the simple, but key innovation is that whereas previously we had used memory addresses that are constants,  we can now also use memory addresses that are the values of variables.  We can illustrate a trivial example as follows:

Pointers

When pointers attack: random pointer values

Using pointers to allocate space

Advanced: realloc()

Log in
                                                                                                                                                                                                                                                                       

Validate   Link-check © Copyright & disclaimer Privacy & cookies Share
Back to top