
Computational Physics
Memory and Pointers

Comments and questions to John Rowe.

Memory

Remember how variables are stored in memory. For example the computer might choose to store two variables x and y like this:

               ----------------------- -----------------------
Byte number:  | 400 | 401 | 402 | 403 | 404 | 405 | 406 | 407 |
               ----------------------- -----------------------
Value:        |          8.4          |         11.3          |
               ----------------------- ----------------------- 
                <--      x      ---->   <---      y      --->

Pointers

A pointer is a variable that contains the address of another variable.
Kernighan and Ritchie.

Pointers say where something is stored in memory, not what its value is.

Pointers to variables

Pointers to variables are the most obvious and least used kind of pointer. We use them here merely to illustrate how pointers work; the more useful examples follow below.

C provides a special operator ('&') to find the address of a variable and a special type of variable (known as a pointer) to hold that address. If we assume that p is a pointer and that, as above, the variable y is stored in memory starting at location 404 then the line:

  p = &y;

would set the value of p to 404. We can even print out the value of p for debugging purposes using the %p conversion of printf (which expects the pointer to be converted to void *):

  printf("p has the value: %p\n", (void *) p);

The '*' operator does the opposite of '&': given a pointer p whose value is &x, *p is just a synonym for x, ie *p means 'the variable whose address is p'. You may then use *p anywhere you might want to use x:

void unwise_function(float *pp, float *qq);

void example(void) {
  float x = 17.1, y, *p, *q;  /* p, q are pointers */

  p = &x;          /* The value of p is now 400, the address of x.
                      If we do anything with or to *p it's
                      doing it to x:                              */

  printf(" p: %p\n *p: %f\n", (void *) p, *p);

  y = *p;          /* Exactly the same as y = x;                  */
  *p = 3.14159;    /* Exactly the same as x =  3.14159;           */


  /* 
   * Now let's make p point to y: 
   */

  p = &y;          /* The value of p is now 404, the address of y */
  *p = 1.414;      /* Exactly the same as y  = 1.414;             */
  q = &x;          /* The value of q is now 400, the address of x */

  unwise_function(p, q);  /* The same as unwise_function(&y, &x); */

  /* 
   * Now x has the value 73.2, y has the value 2.5 
   */    
}

void unwise_function(float *pp, float *qq) {
/* What are the values of pp and qq? */
  *pp = 2.5;
  *qq = 73.2;
}

Notice the difference between:
p = &x; changes the value of p.
*p = 3.14159; changes the value of the variable whose address is p.

The process of finding the variable whose address is p is called dereferencing p.

Now see what happens when we pass p and q to the function unwise_function(). Inside unwise_function() pp and qq have the same values as p and q had in example(), ie &y and &x. Therefore when pp and qq are dereferenced you are actually accessing y and x inside example(), and the values of y and x get changed.

Use structures, not pointers to variables!

At first sight unwise_function() looks like the answer to the question "How do I write a function that returns more than one value?". (One return value is easy of course as in x = sin(y).) But as the name unwise_function() implies it isn't! Remember two of our golden rules:

A function does just one job considered from the perspective of its caller.
Everything about a "thing" is contained in that thing's structure.

So if you find yourself wanting to pass pointers to two or more variables to get round the "functions return at most one value" limitation it's a sure sign that you're either trying to do two things at once or that you've broken the "one thing, one structure" rule.

In practice the only time you'll ever use pointers to ordinary variables is when you call the scanf functions.

Pointers to structures

These work just as you would expect. Let's look at our ellipse example again:

#include <stdio.h>

#define PI 3.14159265358979
typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
} Ellipse;

void print_area(Ellipse el);

int main() {
  Ellipse ellie, *ep; /* ellie is a structure, ep just a pointer */

  ep = &ellie;
  (*ep).orientation = PI/4;

The notation (*ep).orientation is rather awkward and pointers to structures are used so often they have their own notation:

  ep->orientation = PI/4;

The fact that they have their own notation should tell you how important pointers to structures are!

Let's now deal with the obvious weakness of our 'structure' functions so far: that they only pass the structure values into the function without passing the new values back. We will add a new member, area, to our structure and modify the old print_area() function to calculate the area and store it in the structure. We will call the new function calculate_area():

#include <stdio.h>

#define  PI 3.14159265358979
typedef struct ellipse {
  float centre[2];
  float axes[2];
  float orientation;
  float area;
} Ellipse;

void calculate_area(Ellipse *el);

int main() {
  Ellipse ellie;
  int i;

  printf("Centre? (x,y)\n");
  scanf("%f %f", &ellie.centre[0], &ellie.centre[1]);

  for(i = 0; i < 2; ++i) {
    do {
      printf("Length of axis number %d ( > 0 )?\n", i + 1);
      scanf("%f", &ellie.axes[i]);
    } while (ellie.axes[i] <= 0);
  }

  printf("Orientation to the vertical?\n");
  scanf("%f", &ellie.orientation);

  calculate_area(&ellie);
  printf("The area of the ellipse is: %f\n", ellie.area);
  return 0;
}

void calculate_area(Ellipse *el) {
  el->area = PI * el->axes[0] * el->axes[1];
}

Strings and pointers

Because strings are used so often, C has a convenient feature: a string in double quotes is treated like a nameless array. It is assigned some memory, and a line like:

  char *foo = "hello, world";

assigns foo the value of the pointer to that memory. NB: the memory foo points to must not be changed (trying to do so has undefined results), unlike the array in the next example.

Passing pointers to strings (and arrays) to functions

We will pre-empt the next section by noting that the name of an array is a synonym for the address of its first element.

#include <stdio.h>
#include <string.h>

void printstr(char *);

int main(void) {
  char str[] = "abcdxyz";

  printstr(str);
  strcpy(str, "Hi!");
  printstr(str);
  return 0;
}

void printstr(char *tmp) {
  for(; *tmp; ++tmp)
    putchar(*tmp);
  putchar('\n');
}

The first line of main both declares and initialises str. The empty brackets [] tell the compiler to make the array equal to the size of the string (ie 8, the number of characters plus one for the final zero).

The pointer to the start of the string is then passed to printstr which goes over the string printing out each character in turn until it comes to the end. Notice that the test part of the for loop is just *tmp. This is fine because:

A 'logical' test is just an integer value which is considered to be 'true' if that value is non-zero.
Strings are terminated with a zero (often written '\0', not to be confused with '0' which on most systems would have the value 48).

The argument is still a copy of the original

The loop in printstr looks strange because we are changing tmp. But remember, tmp is a variable just like any other. Let's assume that the string pointed to by str starts at memory location 400:

               -----------------------------------------------
Byte number:  | 400 | 401 | 402 | 403 | 404 | 405 | 406 | 407 |
               -----------------------------------------------
Value:        | 'a' | 'b' | 'c' | 'd' | 'x' | 'y' | 'z' | \0  |
               -----------------------------------------------

(Remember, each printable character from the 'normal' Latin alphabet is assigned an integer value in the range 1-127 which is the value actually stored in the computer's memory.)

When we call printstr then tmp is given its own (temporary) piece of memory to store its value (say bytes 640-643) and that value is initialised to 400:


               -----------------------
Byte number:  | 640 | 641 | 642 | 643 |
               -----------------------
Value:        |          400          |
               -----------------------
               <- tmp (in printstr) ->

As we keep adding one to tmp it takes the values 401, 402, etc until it reaches 407, where the loop stops because the byte at address 407 (ie *tmp) is zero. Then when printstr returns, the memory at location 640 is freed and becomes available for use by variables as other functions are called, and so the changes made to tmp are lost.

Pointer arithmetic

We've seen how we can add one to a pointer and it does what we expect (and what we would like: since a char takes one byte, adding one to tmp takes us to the next character). But what about floats, which on our systems take four bytes? Suppose we have a float array called x:

               ----------------------- -----------------------
Byte number:  | 800 | 801 | 802 | 803 | 804 | 805 | 806 | 807 |
               ----------------------- -----------------------
Value:        |         -12.8         |         73.4          |
               ----------------------- -----------------------
                <----     x[0]   ----> <----     x[1]     ---->

Then x, when used in an expression, acts as a constant pointer of value 800, and if we have the following code:

  float *p = x;  /* 800 */

  ++p;
  printf("%p\n", (void *) p);

at the end of this p has the value 804 not 801, ie adding one to a pointer always makes it point to the next item in the array. The compiler can do this because we declared p as float *p so it knew that the thing p pointed to took four bytes per item.

Pointer subtraction

Pointers to items in the same array can even be subtracted:

  float *p = &x[2];
  float *q = &x[7];

  printf("%d\n", (int) (q - p));

will print out 5.

Pointers and arrays

array is a synonym for &array[0].
(array + n) is the address of the item n places after the start of the array, ie n * sizeof *array bytes after it.
array[n] is a synonym for *(array + n).

Important consequence: when you pass the name of an array to a function you are passing a pointer to the array so changing the array in the function changes the original.

Copying strings vs. copying pointers

Consider the following program:

#include <stdio.h>
#include <string.h>

#define LENGTH 200
struct mystruct {
  float val;
  char str[LENGTH];
};

int main(void) {
  struct mystruct struct1 = { 3.14159, "This is the value of Pi!" };
  struct mystruct struct2;

  struct2 = struct1;
  strncpy(struct1.str, "Oops! No it isn't!", LENGTH);
  printf("%s\n%s\n", struct2.str, struct1.str);
  return 0;
}

(If you haven't met strncpy before it should be fairly obvious what it does.)

Each structure has 204 bytes of memory so the line

  struct2 = struct1;

copies all 204 bytes of memory from struct1 to struct2. Thus the program prints out:

This is the value of Pi!
Oops! No it isn't!

Each structure has its own 200 bytes of memory for the string, so the call to strncpy only affects struct1.

Suppose we decide to do things a bit better and allow a variable amount of memory for the string:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct mystruct {
  float val;
  char *str;
  int length;
};

int main(void) {
  char stringbuffer[100];
  struct mystruct struct1 = { 3.14159, stringbuffer, 100 };
  struct mystruct struct2;

  strncpy(struct1.str, "This is the value of Pi!", struct1.length);
  struct2 = struct1;
  strncpy(struct1.str, "Oops! No it isn't!", struct1.length);
  printf("%s\n%s\n", struct2.str, struct1.str);
  return 0;
}

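Now each structure has just 12 bytes of memory (on a system where floats, pointers and ints each take four bytes; more where pointers take eight) and the value of struct1.str is just the value of stringbuffer, ie its address, for example 600.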

Now the line:

  struct2 = struct1;

copies just 12 bytes of memory, and struct1.str and struct2.str both have the value 600, ie they both point to the same piece of memory. The output of this program is:

Oops! No it isn't!
Oops! No it isn't!

The NULL pointer

We've also been using pointers all along as in this example from week 2 of Computing II:

  FILE *fin;

  fin = fopen("idatin.dat", "r");

But what would happen if the open of the file failed, for example if it did not exist? Well, C has a special value called NULL which it can be guaranteed no legitimate pointer will ever have. So you could have something like:

  FILE *fin;

  fin = fopen("idatin.dat", "r");
  if (fin == NULL) {
    /* Print an error message and die */
  }
  else {
    /* Do something useful */
  }

Pointers are dangerous

In order to dereference a pointer p it is absolutely essential that the value of p has been explicitly set to the address of a legitimate variable.

Why? Well, remember first that uninitialised variables have random values. So when you try to dereference an uninitialised pointer you will try to read from or write to a random memory location, which will give you a random result. Worse still, if you write to this location some other variable in some other function will have its value changed and your program will fail or the whole machine will crash.

Your safest bet is whenever you declare a pointer, initialise it to NULL:

  float *p = NULL;

Using a value of NULL by mistake is not safe but it is probably less dangerous than using any other value.

Aside: named constants and enumerations

Although this doesn't logically belong here we mention it as it's useful to know.

In last week's course work our (real) quadratic equations could have been one of three types: two real roots, one repeated real root or two complex roots. It might be useful for our structure to be able to store the solutions and what type they are. Obviously we can do the latter by adding a new member to the structure and setting it to '1' for one root, '2' for two real roots and '3' for two complex roots but then we need to remember what '1', '2' and '3' mean. It's much better to give a name to these constants.

C provides two ways of naming constants. We've met one already, #define, but C provides another way specifically designed for our situation, enumerations:

enum eqnstatus { EQN_UNSOLVED, EQN_ONEROOT, EQN_REALROOTS,
EQN_COMPLEX_ROOTS };

Now anywhere in your program you could use the named constants EQN_UNSOLVED, EQN_ONEROOT, EQN_REALROOTS, EQN_COMPLEX_ROOTS to mean zero, one, two or three respectively:

enum eqnstatus { EQN_UNSOLVED, EQN_ONEROOT, EQN_REALROOTS,
EQN_COMPLEX_ROOTS };

int main(void) {
  enum eqnstatus eqn_status = EQN_UNSOLVED;

  return 0;
}

Enumerations are integers, and printing them out with %d just prints their integer values, but debuggers usually understand them. They can be combined with typedefs as in the example below, which also illustrates that enumerations and structures don't actually have to have a name of their own (a tag) if you don't want them to:

#include <stdio.h>

typedef enum { VANILLA, CHOCOLATE, STRAWBERRY } Flavour;

typedef struct {
  Flavour flavour;
  float fat;
  float sugar;
  float calories;
} Icecream;

int main(void) {
  Icecream icecream;

  icecream.flavour = CHOCOLATE;
  printf("%d\n", (int) icecream.flavour); /* Prints: 1 */
  return 0;
}
