Skip to content
Physics and Astronomy
Home Our Teaching Resources C programming Arrays, etc.
Back to top
On this page
Contents

Arrays or "numbered variables"

Warning: do not scroll past the heading "What is an array?" until you have answered the warm-up questions.

Vectors and matrices

As scientists we often deal with vectors of length N with coefficients V1, V2, Vk, etc. Clearly it would be impractical to have to declare N separate variable with names like V1, V2 ... V1000. And it would be completely inflexible: changing the length of our array from 1000 to 2000 would require us to declare 1000 new variables! For this reason C allows us to declare a single compound variable called an array that allows us to directly represent an entire vector and to to use expressions such as Vk to access the element we want. A simple extension of the concept allows us to do the same for matrices.

Locations as mathematical expressions

If we want to have a single compound variable to represent a whole vector of floats or doubles then each value of Vk will need four or eight bytes to store it. It follows that the address of Vk must be a mathematical expression that depends on k. That is, we are moving from machine code that looks like double@constant to machine code that looks like double@(expression) and that expression must depend upon k. The benefit is that rather than have to declare N individual variables to represent a vector of length N (or N-squared for a NxN matrix!) we can declare a single array and access each element with a single loop. Later on we will encounter an extension of this ability needed to enable us to effectively pass arrays to functions.

This combination of abilities has many other uses within C and is used whenever we wish to use data that is more complicated than a simple variable (in this case a vector). So we shall use the subject of vectors and matrices to introduce these two concepts: we think of two things we require to be able to program using vectors and show how these new features solve the problem for us. Later on in the course we shall show how these same two abilities can be used to make life easier for us in many other ways.

Preparation

Warning: do not scroll past the heading "What is an array?" until you have answered the warm-up questions.

Warm-up discussion 1: Storing vectors

Suppose we wish to store the elements of a vector as doubles the computer's memory and it turns out the the value of the first element of the vector is stored at location 400.

  1. If the first element of the vector is stored at location 400 where would be the simplest and most logical place to store the second element? (Remember that doubles require eight bytes each)
  2. Where would be the simplest and most logical place to store the third element?
  3. Provide a simple mathematical expression for the simplest and most logical place to store the Kth element.

Warm-up discussion 2: numbering things

Since arrays are all about numbered variables, let's briefly think about how we number things.

  • What was the first year of the twenty-first century?
  • If this is the twenty-first century, why do the years start with "20"?
  • Is "12 pm" twelve noon or twelve midnight?
  • What's your reason for the previous answer?

Although intuitively our reaction is usually to number things starting from "1", as these warm-up discussion show, this tends to lead to problems later on. Therefore all modern numbering systems tend to start from zero. For example, the twenty-four clock numbers its hours from 0-23 (not 1-24) and its minutes and seconds from 0-59.

We shall return to this a little later in the lesson.

What is an array?

Hopefully in the above discussion you decided that if the first element of an array of doubles is stored at address 400 then the logical place to store the second element was at location 408!

An array is a series of objects of the same type stored sequentially in the computer's memory.

Arrays are used where we have a set of things of the same type and we need to access each element by number, not by name.

Example: an array of floats

For example, if we have an array of floats called farray, let us suppose the compiler has chosen to store its first element, farray[0] at byte 640. Then farray[1] will be stored at byte 644, farray[2] will be stored at byte 648, etc:

        -------------------------------------------------------------------------
Byte:   | 640 | 641 | 642 | 643 | 644 | 645 | 646 | 647 | 648 | 649 | 650 | 651 | etc.
        -------------------------------------------------------------------------
Stores: | <---- farray[0] ----> | <---- farray[1] ----> | <---- farray[2] ----> | etc.
        -------------------------------------------------------------------------

Note: we shall see below that farray[0] is the notation for the first element of farray. Notice how we number the array from zero, not one.

In our "float@" notation we say the the compiler translates farray[j] into the machine-code float@(640+4*j).

It's actually even simpler than that

It would be quite complicated for the computer's processor to directly support expressions of the form float@(expression) so as always with computers it does something much simpler: computers just allow float@value_of_temporary_variable as well as float@constant. It's up to the compiler to generate the value of this temporary variable so an expression such as

farray[j] = 7.1;

might get compiled into the two-stage command:

tmp_address = 640+4*j;
float@tmp_address = 7;

(We shall see later that the variable does not even have to be temporary: we can create one ourselves if it would be useful to us.)

We can make arrays of anything

We can make arrays of anything we like, such as arrays of ints, arrays of doubles, arrays of double complexes etc. As we shall see below we can even make arrays of arrays! (Otherwise known as two-dimensional arrays.)

Anything we can have one of we can have an array of.

All arrays have both a type and a length.

What this gains us

Before seeing the (very simple) mechanics of creating and using arrays it's worth reminding ourselves of the benefits of moving from expressions of the form double@constant to double@(expression) or float@value_of_variable.

Why we need to use expressions as memory addresses.
Specific requirement: To address data by a numerical index (eg a vector of doubles)
Solution: To move from double@constant to double@(expression) or float@value_of_variable.
Wider application: Any time we need to be able to access data other than a simple variable declared inside the same function.

Declaring and using a one-dimensional array

The obvious, although not the only, way to create an array is simply to declare it in the same way we would declare an ordinary variable. C uses square brackets [] to designate the number in an array, both when declaring it and when accessing the individual elements.

Example: declaration and simple assignment

The following code sets up an array of three doubles and then sets the value of its elements.


#include <stdio.h>
int main() {
  double sides[3];  // Declare the array "sides"

  sides[0] = 5.2;
  sides[1] = 3.1;
  sides[2] = 4.6;

  // ... Do something interesting 
  return 0;
}
Step through this code


The first statement ("double sides[3];") reserves enough space to store three values, in this case doubles. Arrays can have any type: int, long, etc. We have given the array the name sides, which suggests we are dealing with a triangle. The following lines set the values of the three individual elements. The number inside the [] is referred to as that element's index and can be any integer expression, not just a constant.

  1. int myarray[N] (note the typename "int") declares a brand new array of N ints.
  2. myarray[j] on its own (note no "int") refers to one particular element of the array.
  1. Step through the above code
  2. To show how arrays are used and how they are stored in memory.
  3. Step through the above code, seeing how:
    • The array is declared.
    • The individual elements are accessed.
  4. Reload the page and select either "Combined view" or "Computer view". The main difference from what we have seen in previous codes is that the addresses are expressions rather than constants.
  5. Check that way the compiler is storing the elements of the array is the same as your answer in the preparation section (bearing in mind that the first array element is numbered zero.)
  6. Step through the code seeing how the address maths picks out each element in turn.

The first element of the array has index zero, the last has index N-1

As can be seen above, C numbers its arrays from zero. Although intuitively our reaction is usually to number things starting from "1", as the warm-up discussion shows, it tends to lead to problems later on. Therefore all modern numbering systems tend to start from zero. For example, the twenty-four clock numbers its hours from 0-23 (not 1-24) and its minutes and seconds from 0-59.

If we have an array of size N and the first element has index zero then the last element must have index N-1, not N!.

The classic array error

Be sure to remember this one!

  double sides[3];

  sides[3] = 1.7;  /* WRONG ! */

This is a particularly nasty bug which C does not check for. Trying to access the N+1th value of an array of size N is like standing three steps away from a cliff-top and taking four steps forward. (We shall see another example of this below.)

The elements of an array of size N are array[0] to array[N-1] inclusive.

Using scanf()

We can use scanf() to read in sides[0] and sides[1] just like a variable (see the complete example):


  scanf("%lg %lg", &sides[0], &sides[1]);
 


Here &sides[0] does what we want it to: it's the address of sides[0] .

Array initialisation

Arrays can have their values initialised at the point of declaration to make it more difficult to forget to miss one out. This is done using { ... } :


  double sides[3] = { 5.2, 3.1, 4.6 };

We have already seen { ... } used to group several statements together, here they are used to group several initialisers together.

Arrays can be inialised on declaration by putting the values inside { }
int myarray[3] = { 1, 2, 3 };

Two convenient array initialisation short cuts

Automatic array sizing

If we miss out the array size the compiler infers the array size from the number of initialisers:


  double sides[] = {  5.2, 3.1, 4.6 };

The compiler can infer the array size from the number of initialisers.

Missing initialisers are treated as zeros

Conversely, if we provide fewer initialisers than required the rest of the array elements are initialised to zero:


  int myarray[3] = { 1 };

Here we have shown an array of integers for variety. myarray[0] is initialised to 1, the other two elements to zero (not one!!). This does not apply to uninitialised arrays, only to arrays with fewer initialisers than values. To initialise an array to all zero use:


  int anotherarray[3] = { 0 };

Missing initialisers are treated as zeros but this does not apply to uninitialised arrays.

Looping over an array

The for() loop:
for(Initialise; Test; Increment)

  1. Initialise
  2. Loop:
    1. Test and if false quit the loop.
    2. Execute the body.
    3. Increment.

In the example above the array indices were constants (0, 1 and 2). Whilst this is of course legal, the whole point of using arrays is that we can use indices that are expressions. A common example is looping over all the elements of an array in turn. We can use the " for()" loop to loop over an entire array, in this case to read it in:

#include <stdio.h>
int main() {
  int numbers[10];

  for (int i = 0;  i < 10; ++i ) { 
    printf("Please enter value %d\n", i);
    scanf("%d", &numbers[i]);
  }

  for (int j = 0;  j < 10; ++j )
    printf("Value %d: %d\n", j, numbers[j]);

  return 0;
}
Step through this code

 

The for(Initialise; Test; Increment) loop can be used to loop over the elements of an array.

  1. Create an array and give it some values.
  2. To practice using arrays.
  3. Create a new on-line program in a new window, with a suitable title and opening comment..
  4. Declare an array of ten doubles.
  5. Use a for() loop to initialise element j to sqrt(j)
    • Be very careful about the limits of your for() loop. Look at the example above.
    • Don't forget <math.h>
  6. Now use another for() loop to print out the values of the array.
  7. Build & run. Check the output is correct..

Avoid "magic numbers", use symbolic constants instead.

Remember: we don't give ourselves the opportunity to go wrong!

In this code the size of the array (10) is hard-coded in two different places. This is sometimes referred to as a "magic number": it's just there and we have no idea why. If we were to change the size of the array to 12 we would have to change the limits of our loops to 12 as well. Worse, there may be other occurrences of the number 10 that we don't have to change, as they have the value 10 for a different reason. And if we see the number "20" we will need to ask if it should now have the value "24".

The following # (preprocessor) line allows us to give a name to a constant:


#define VALUES 10
 
  • Any subsequent occurrence of the word VALUES (outside of text strings) is replaced by the rest of the line; in this case "10". Note there is no semi-colon!
  • Like all "#" directives, it is line-based with no semi-colon
  • It's conventional to use ALL UPPER-CASE letters, and digits for symbolic constants.

We shall expect you to always use symbolic constants for array dimensions, except for applications where it is logically impossible to change the array size, for example when dealing with a triangle.

NB: #define can be used for non-numerical constants as well.

Use #define to avoid "magic constants".

Example

#include <stdio.h>
#define NSTARS 10
void loopdemo2(void) {
  double mass[NSTARS];

  for (int star = 0; star < NSTARS; ++star ) {
    printf("Please enter mass of star %d\n", star);
    scanf("%lg", &mass[star]);
  }
  // ... Do something interesting 
}
Step through this code

 
  1. Use a #define for your array dimensions and for() loops.
  2. To practice #define and see a common but confusing mistake.
  3. Modify your previous code to use #define to define a constant equal to the size of your array (in this case 10).
    • Don't forget there is no semi-colon at the end of a #define, or = sign inside it.
  4. Change both the declaration of your arrays and the limits of your loops to use the new constant.
  5. Build & run. Check the output is correct..
  6. Now change the value of your #defined constant to 16.
  7. Build & run and check that your program prints out the correct sixteen values.
  8. Now "accidentally" put a semi-colon at the end of your #define .
  9. Build & run and see what happens.

Getting it wrong

It's particularly helpful to see what happens when we gets the loop wrong, both because it's important that we don't do it and because it cements our understanding of how memory works. In the following example we have accidentally looped from 1 to N rather than 0 to N-1. (For speed and simplicity we have just made N equal to 2):

/*
 * Demonstrate an error looping over an array
 */
#define N 2
#include <stdio.h>
int main() {
  // It is likely, although not certain, that the compiler
  // will choose to store a, vec and b next to each other.
  float a = 0, vec[N], b = 0;

  // Initialise all the elements of vec to 1000
  for (int j = 1; j <= N; ++j) // Ooops!
    vec[j] = 1000;

  printf("a: %g b:%g\n", a, b);

  return 0;
}
Step through this code

 

  1. Step through the above "Key example".
  2. To demonstrate a common error and to further illustrate how the computer stores arrays in memory
  3. Step through the above "Key example" in a new window.
  4. Start the program and step through until the first iteration has changed the value of vec[1].
  5. What do you think will happen the next time the loop runs? Why?
  6. Step through the next iteration of the loop and see if you were correct. If not, try to work out why.

Run-time array sizes

The original version of the C language required the dimensions of an array to be constants, defined when the program was written but this restriction has now been relaxed. It is occasionally useful to be able to write:


#include <stdio.h>
void myfun(int n) {
  int x[n];

  for(int i = 0; i < n; ++i)
    scanf("%d", &x[i]);
  ...
}

int main() {
  myfun(2);
  myfun(6);
  ...
  return 0;
}
Step through this code

 

How arrays work

C being C, it doesn't treat the translation v[j] => double@(address_of_v + 8*j) as a special one-off but provides two simple rules for first finding the start of the array, and second going along it and finding what is there. These rules are particularly useful when we want to pass arrays to functions.

Rule 1: In an expression C treats the name of an array as meaning the address of its first element

Expressions have both a value and a type.

When (the of the name of) an array of Things is used in an expression it is replaced by the address of its first element, and its type is the address of a Thing (or "pointer to a Thing") and is a constant.The type is used both for traversing the array and for working out what to do when it gets there.

In the above example we had an array called farray stored starting at location 640. We repeat the memory example here:

        -------------------------------------------------------------------------
Byte:   | 640 | 641 | 642 | 643 | 644 | 645 | 646 | 647 | 648 | 649 | 650 | 651 | etc.
        -------------------------------------------------------------------------
Stores: | <---- farray[0] ----> | <---- farray[1] ----> | <---- farray[2] ----> | etc.
        -------------------------------------------------------------------------

Here farray is a constant and has the value "640" and type "address of a float". We can even print out the value of farray using the "%p" format used for addresses.

The name of an array is a shorthand for the address of its first element (in both value and type).

Another way to say this is that:

If x is an array then any occurence of x on its own just means &x[0].

In theory it is possible to write &farray which has the same value as farray but has the type "address of an array of N floats" not "address of an float". The compiler will warn you if you do this by mistake. (This is possible because the & operator is one of only two operators which does not evaluate the the object it operates on and is used internally by the compiler when receiving the address of a multi-dimensional array into a function, to allow the compiler to generate the correct code to traverse the array.)

Example: passing the name of an array to a function

Easily the most common example of this is when we wish to pass an array to a function so it is important to remember that:

	  myfunction(arrayname);
	

is exactly the same as:

	  myfunction(&arrayname[0]);
	

Kernighan and Ritchie wrote the book ("K&R") that first defined the C language.

When an array name is passed to a function, what is passed is the location of the beginning of the array.

[Kernighan and Ritchie: The C Programming Language]

One import consequence of this is that since the above function calls only tell the function where the array starts but now where it ends we nearly always want to write:

	  myfunction(arrayname, n);
	

where n is the length of the array.

Arrays are implemented in software, by the compiler

The fact that the name of the array x means nothing more than &x[0] is rather surprising, and to be honest disappointing. We might have hoped that x would be some high-level construct containing information concerning not only where x was being stored but also its type, dimensions etc. and to implement some protected area of memory for its data and to warn us if we try to step outside of that area.

The reason for this is simple: the vast majority of computers have no hardware support for arrays, the only support they have is to use the values of expressions as addresses. Arrays are therefore implemented entirely in software and it is up to the programmer and the compiler to keep track of an array's dimensions and to calculate the correct memory address for a given element of a vector or matrix.

Rule 2: If a is the address of a Thing then a[k] means "the Thing whose address is (a + k * the number of bytes needed to store a Thing)"


If a is the address of a Thing then a[k] means "the Thing whose address is (a + k * the size of a Thing)".

This combination of these two rules is what makes the "ar[n]" syntax work: if ar is an array of ints then the construct "ar[n]" is just a synonym for "go to location ar and move along by n times the size of an int and see what's there".

One result of this is that unlike some more specialised languages, C has no way of referring to an entire array (only its type and the address of its first element) and does not allow whole-array operations:

  int myarray[N], yourarray[N];
  ...
  yourarray = myarray;         // Wrong!

  for (i = 0; i < N; ++i)      // Correct
    yourarray[i] = myarray[i];
The array type is used:
  1. To know how many bytes to move along when traversing the array.
  2. To know how to interpret the bytes stored there.

The address[index] notation is specifically designed for use with arrays as it provides a simple way to say "go to this address, move along by this much and see what's there" (eg double@(address + 8*i)). We shall see in the next lesson an simpler notation for the situation where we just want to say "go to this address and see what's there", (eg double@address rather than double@(address + 0*i).

After an array has been declared the only thing the compiler knows is the address and type of its first element.

We shall see below that the key point about this rule is that it does not just apply to arrayname[j], it can be variablename[j]. This is a big change because the name of an array is a constant so arrayname[j] can only ever act on that one specific array. But provided we set arrayname to the correct value then it can act on whatever array as we like.

No array bound checking

There is no array bound checking, we are responsible for this ourselves.

Stepping over the edge of the cliff

The above rules have the disconcerting property that the compiler can follow the rules without needing to know the length of the array! In fact for a one-dimensional array the declaration is the only place the array size is used. The instructions generated by "ar[k]" are entirely independent of the length of the array and it is our responsibility not to go over the edge. We have already seen how going one too far over the end of an array changes the value of whatever was stored after it. We may sometimes do soemthing even worse and have an index that is not just one too many but entirely random.

This will most likely result in one of two things:

  • The variable whose address that is will have its value mysteriously changed, much as in our previous example. This tends to lead to random and confusing behaviour.
  • This will be a piece of memory we are not allowed to access and the program will die with a "segmentation fault" or SIGSEGV as it is also called.

Exactly the same will happen if we pass the wrong address to scanf(), or we try to access a file we have opened with fopen() that we have not checked for being NULL.

If we go a bit over the end of an array we will change the values of random variables, if we go a lot over the end we are likely to crash the entire program.

SIGSEGV or randomly-changing variables are likely to be due to going over the end of an array, bad arguments to scanf() or failing to check for failed calls to fopen().

The name of an array is a shorthand for the address of its first element (in both value and type).

Functions that operate on arrays

Example: normalising a vector

A common requirement in science and maths is to normalise a vector, that is scale it so that its magnitude is one. The following code normalises an array of n floats:

  // normalise the vector a:
  float norm = 0;
  for(int i = 0; i < n; ++i)
    norm += a[i]*a[i];
  norm = sqrt(norm);

  if ( norm > 0) {
    for(int i = 0; i < n; ++i)
      a[i] /= norm;
  }
 

The first loop calculates the norm of the vector (the square root of the sum of the squares), the second divides each element by the norm. (Note: the statement a[i] /= norm; means "divide a[i] by norm".)

Making it into a function

But suppose we have three vectors to normalise, a, b and c, maybe of different lengths? We could Copy-and-Paste the code and then replace "a" by "b", and then again for "c" (and the same for the lengths of the arays). But this would lose us a lot of marks(!). What we would really like to do is to be able to write a function to normalise a vector and to able to call it three times, once for each vector.

This is far from a theoretical example: vectors and matrices are very important in science and there is a wealth of libraries to do high-level operations such as inverting matrices, solving systems of linear equations, etc. A language that did not let us write functions to manipulate vectors and matrices would be a serious problem for scientists. Notice also that normalise() needs to work on the original arrays a, b and c. It would be no use passing it a copy of the array and normalising the local copy which would then be lost when the function returned.

From our previous discussion we know that the function calls will look like:

    normalise(a, lengtha);
    normalise(b, lengthb);
    normalise(c, lengthc);
  

That is, for each vector the arguments will be the address of its first element and the length of the array.

The function parameters

The function prototype will look a bit like this, although we are not yet sure what the first parameter will look like:

  void normalise(What goes here? p, int n);

The first parameter to the normalise() function, p, will as always receive the value of the first argument. We saw above that this was the address of (the first element of) an array. Specifically, in the first call p will have the value of &a[0] in main()), in the second call p will have the value of &b[0] in main()) and in the third call the value of &c[0].

Leavig aside the mechanics of this (which are actually very simple) this means that if our function contains p[j] and the value of p is &a[0] then we are accessing the array a inside main().

We've not yet seen a type of variable whose value is the address of another variable (although we shall see a lot more of them from now on!) but now we need it in order to be able to write a function that operates on an array declared inside another function. Specifically p must a variable whose value is the address of the first element of the array we wish to normalise.

A variable whose value is the address of another variable is called a pointer.

C makes it simple for us

So the question is "what type of variable declaration do we make to show that p is the address of a float (in this case the address of first element of an array)?". Fortunately we don't need to worry too much about this due to the following useful trick:

As a convenience to the programmer, when a function expects to be passed the address of the first element of an array the corresponding parameter can just be declared with the same declaration as the passed array with the left-hand dimension being ignored.

Not only is the left-hand dimension ignored, it can be omitted altogether which is a good idea that it emphasises we have not created a new array. For a one-dimensional array there is only one dimension of course so in our case the function prototype is just:

  void normalise(p[ ], int n);

the declaration p[] deliberately looks a little weird, it suggests that p is an array but it can't really be one because it has no size. The aim of the notation then is to suggest that p is a pointer to the start of an array of that type rather than an actual, local array.

The whole function

The function looks like this:

// Normalise a vector (an array) of length n
void normalise(float p[ ], int n) {
  float norm = 0;
  int i;

  for(i = 0; i < n; ++i)
    norm += p[i]*p[i];
  norm = sqrt(norm);

  if ( norm > 0) {
    for(i = 0; i < n; ++i)
    p[i] /= norm;
  }
}


 

First we see the parameter declaration p[] telling us that p is the address of a float and by implication that this float is the first element of an array.

Looking next at the body of the function we see the familiar notation p[i] to access an array element.

If a is the address of a Thing then a[k] means "the Thing whose address is (a + k * the size of a Thing)".

There is nothing in this rule that says a has to be a constant, if as in this case a (or p) is a variable C will just generate the code to fetch the value of that variable from memory and then add the offset to that value and go there. If the C programming language did not provide us with this ability then we would be unable to write functions that acted on vectors and matrices declared inside other functions.

For example, let's assume that the array a is stored starting at location 200. Then the comparison between a[i] inside main() and p[i] inside normalise() is as follows:

Expression
Type
Meaning
Start of array Use
a[i] a is an array (constant) float@(200+4*i) Fixed (location 200) Can only process the array a inside the function in which a is declared
p[i] p is a variable float@(p+4*i) Whatever we want (the value of p) Can process any array declared inside any function provided we set the value of p to the address of its first element

(Note, for simplicity we have not further expanded the variable values.)

If the first time we set p to be the address of a[0], the second to the address of b[0] and the third to the address of c[0] then the one function can be used to normalise all three arrays.

The full program.

The full program is:


#include <stdio.h>
#include <math.h>

#define NA 2
#define NB 3
#define NC 4

void normalise(float p[], int n);

int main() {
  float a[NA] = {2.7, 5.3};
  float b[NB] = {5.8, 2.9, 3.0};
  float c[NC] = {-9.6, 1.4, 13.9, 6.41};

  normalise(a, NA);
  normalise(b, NB);
  normalise(c, NC);
  
  // More stuff here..
  
  return 0;
}

// Normalise a vector (an array) of length n
void normalise(float p[], int n) {
  float norm = 0;
  int i;

  for(i = 0; i < n; ++i)
    norm += p[i]*p[i];
  norm = sqrt(norm);

  if ( norm > 0) {
    for(i = 0; i < n; ++i)
    p[i] /= norm;
  }
}
Step through this code

 
  1. Step through the above "Key example".
  2. To observe how the function uses the pointer variable p to operate on the arrays a, b and c in turn.
  3. Step through the above "Key example" in a new window.
  4. Step through the code until the program is about to first enter the function normalise().
  5. Observe how the value of the first argument to normalise() is the address of the first element of a.
  6. Now step into normalise() and see how the value of p inside normalise() is the address of the first element of a within main(). (This should be indicated a by a green arrow.)
  7. Step through the first loop observing how the code looks at each value of a in turn despite a being declared inside main().
  8. Step through the second loop observing how the code changes each value of a in turn despite a being declared inside main().
  9. Step through each subsequent call to normalise() seing how p then points to b and finally points to c.
  1. Add a function to your previous code
  2. To practice passing arrays to functions.
  3. Take the code from your previous mini-exercise and change the title.
  4. Now write an empty function above main() which expects to "receive an array of doubles" (that is to receive the address of the first element of an array of doubles). Its prototype should be identical to that of normalise() but with float replaced by double and, of course, an appropriate name. By "an empty function" we mean a function with the opening and closing { } but nothing else (see the next step for why).
  5. Now cut and paste your loop that sets up the array from main() to your function. It would be a great idea to give your pointer ("array") inside your function a different name from the array in main(). For example you could call it p just like in normalise().
    • Carefully follow the example of normalise within the notes.
    • In particular, make sure the name of the first parameter to your function is the same name you use within the function:
      The start of your function should look like:

       
      void functionName(double someName[], ...
       
      Inside your function:
       
        someName[j] = ...

  6. Now call your function from main() in the same place that the loop used to be.
  7. Build & run. Check the output is correct.

Some more examples

Pointers

We have seen here how the idea of a pointer has given us tremendous flexibility: it allows us to write a function that can act on any array declared inside any function and for that function then be called multiple times, possibly on ddifferent arrays each time. We shall make more use of this flexibility in later lectures:
 

Pointers
Specific requirement: To enable a function to access an array declared inside another function.
Solution: A variable called a pointer whose value is the address of another variable (in this case the address of the first element of the array)
Wider application:
  • To enable a function to access an object declared (or otherwise created) inside another function
  • To enable a piece of code to act on one set of data and later on another set.

And, as in this case, usually both.

Arrays of arrays, or multi-dimensional arrays

Multi-dimensional arrays are simple and straightforward and fit in well with our commonly-used ideas of matrices and "more than one vector".

Warm-up discussion 2: Storing matrices

Suppose we wish to store the elements of a two-dimensional 4x10 square matrix of floats, that is four rows of ten values, and that the first row starts at location 1000.

  1. If the first element of the first row of 10 floats is stored at location 1000 where would be the simplest and most logical place to store the first element of the second row? (Remember that floats require four bytes each)
  2. Provide a simple mathematical expression for the simplest and most logical place to store the start Jth row.
  3. Provide a simple mathematical expression for the simplest and most logical place to store the Kth element of the Jth row.

Matrices and collections of several vectors are one and the same thing

Example: simultaneous equations

When solving simultaneous equations it's common to have a set of vectors Ej with Ejk referring to the kth component of jth vector. Furthermore it's also then common to think of E as a matrix in its own right ("the matrix formed by the vectors Ej").

C's ability to make an "array of anythings" makes this very natural and easy. We just define an "array of arrays". Imagine we have a matrix formed from individual vectors and the algorithm first requires each vector to be normalised. We might write:


#include <stdio.h>
#include <math.h>

#define NX 3
#define NY 2

void normalise(float p[], int n);

int main() {
  int x, y;
  float coeffs[NX][NY] = { { 1, 2 }, { 3, 4 }, { 5, 6 } };

  // Normalise the indivdual vectors
  for (x = 0; x < NX; ++x)
    normalise(coeffs[x], NY);

  printf("The matrix is:\n");
  for (x = 0; x < NX; ++x)
    for (y = 0; y < NY; ++y)
      printf("coeffs[%d][%d] is %g\n", x, y, coeffs[x][y]);

  return 0;
}
Step through this code

 

(Note that C considers a 3x2 array to be three arrays each of length two.)


We can think of a two-dimensional array either as a matrix, or a collection of vectors, or both.

  1. Step through the above "Key example".
  2. To observe a 3x2 arraybeing treated both as three vectors of length two and as a single 3x2 matrix.
  3. Step through the above "Key example" in a new window.
  4. Step through the calls to normalise() seeing how the pointer p in turn points to coeffs[0][0], coeffs[0][1] and coeffs[0][2].
  5. Now see the double loop where coeffs is treated as a single matrix to be printed out.
  1. Create, populate and print out a matrix
  2. To practice using matrices.
  3. Create a new on-line program in a new window, with a suitable title and opening comment.
  4. Declare a small array (this can be of any type).
  5. Using a double for() loop set element [j][k] to the value j*k for all valid values of j and k.
  6. Using another double for() loop print out the indices of the matrix and its values, eg "2 * 3 = 6"'
  7. Build & run. Check the output is correct.

More dimensions

We can do this as many times as we like. For example is we wanted to store the electric and magnetic fields as an array of length 4 with one field for each point of a three-dimensional grid we could define the four-dimensional array:

double field[NX][NY][NZ][4];

and we could treat field[NX][NY][NZ] as being the one-dimensional array (of size 4) representing the field at that point.

An N+1 dimensional array is just an array of N dimensional arrays (for N > 0).

How multidimensional arrays are stored in memory

The storage of multidimensional arrays follows immediately from their definition as "an array of arrays". In the previous example we declared a three by two array of floats called coeffs:

  float coeffs[NX][NY]; // NX ==3, NY == 2

The computer will store it as three one-dimensional arrays of length two, one after the other. If the array starts at address 800 then the start of the array storage will be:

        -------------------------------------------------------------------------
Byte:   | 800 | 801 | 802 | 803 | 804 | 805 | 806 | 807 | 808 | 809 | 810 | 811 |
        -------------------------------------------------------------------------
Stores: | <-- coeffs[0][0] ---> | <-- coeffs[0][1] ---> | <-- coeffs[1][0] ---> |
        -------------------------------------------------------------------------

It follows that in the general case, if we were to declare a multiple dimensional array:

For the float array coeffs[M][N]; starting at location 800 the element ar[j][k] will be stored in location
800 + 4 * (N*j + k).

A multi-dimensional array is a single chunk of memory and the compiler calculates the address of an array element  using the address of the start of the array, the size of each array element and the dimensions of the array (excluding the left-hand dimension).

Note that we never need to know the left-hand dimension to find the address of an element of the array, that just tells us where to stop!

The left-hand dimension of an array is not needed to calculate the address of an array element.

Passing multidimensional arrays to functions

In a later lecture we shall see an even more flexible way of doing this using dynamic memory allocation.

When passing multi-dimensional array to functions we have to tell the function the dimensions of the array (apart from the leading dimension). This isn't a problem if we are using symbolic constants, but if they are dimensioned at run-time we have to pass the dimensions as additional arguments, before the array name in the argument list:

The following is a very simple matrix-multiply function.


//
// Multiply two matrices together: A = B x C
//
void matmul(int nx, int ny, int nz, float a[][ny], 
      float b[][nz], float  c[][ny]) {
  int x, y, z;

  for (x = 0; x < nx; ++x) {
    for (y = 0; y < ny; ++y) {
      a[x][y] = 0.0;
      for (z = 0; z < nz; ++z)
  a[x][y] += b[x][z] * c[z][y];
    }
  }
}
Step through this code

 

As always when using arrays, if we were to make a mistake and end up with the function thinking that the dimensions of the array were different from those it was originally declared with, we would be in serious trouble.

If pass an array to another function we also have to tell that function the dimensions of the array, other than the left-most one.

Although we tend to think of vectors and matrices in their mathematical sense with operations such as division, multiplication etc. there is also a more general non-mathematical use of "Thing number 1", "Thing number 2", which use the same underlying mechanism. Indeed modern programs can easily use hundreds of mega-bytes or even gigabytes of memory and only a tiny fraction of that is for named variables with addresses that are constants that are known at compile time. Most data is accessed via machine-code that looks like typename@(expression), just like the vectors described above.

Other uses of arrays

Although we have stessed using arrays as mathematical vectors or matrices, arrays are also used whenever we have several things that can be identified by number. For example an economist with GDP figures from yearMin to yearMax inclusive may declare an array of length (1 + yearMax - yearMin) (note he "1 "!) and store the GDP for a particular year in gdp[year - yearMin].

We shall examine this a little more in the next lesson.

Summary

The text of each key point is a link to the place in the web page.

Declaring and using a one-dimensional array


No array bound checking

Passing the address of an array to a function

Arrays of arrays, or multi-dimensional arrays

Log in
                                                                                                                                                                                                                                                                       

Validate   Link-check © Copyright & disclaimer Privacy & cookies Share
Back to top