Sharing and the pre-processor

This final lecture deals primarily with sharing variables between functions and sharing both variables and functions between source code in different files. This makes use of the pre-processor, so we take the opportunity to cover the preprocessor in a little more detail.

Sharing variables between functions

Relying too heavily on external variables is fraught with peril since it leads to programs whose data connections are not at all obvious - variables can be changed in unexpected and even inadvertent ways and the program is hard to modify.
Kernighan & Ritchie.

GLOBAL variables (to be used only if you really need them)..
Torvalds

Sharing variables between functions in the same source file

Variables defined outside of a function are called external variables and can be used by any function in the file, provided only that the function definition follows the variable definition:

Variables declared outside of a function are called external, or global, variables and can be accessed by any function.


int somenumber = 7;

// myfun() can now use the variable somenumber:
void myfun(int i) {
  somenumber = i *i;
}

// so can main(), if it is inside the same file:
int main() {
  int k;

  myfun(3);
  k = 6 + somenumber;

  return 0;
}


Initialising external variables

In the above example somenumber is initialised to 7; this happens before the program begins. Just like in-function static variables, external variables with no explicit initialiser are initialised to zero (or NULL for pointers).

External variables are initialised to zero by default.

To see a variable being shared between functions, step through the example above and observe somenumber being accessed by both main() and myfun().

External variables should be used very sparingly

Look at the comment by Kernighan & Ritchie at the top of the section ("variables can be changed in unexpected and even inadvertent ways and the program is hard to modify"). Contrast that with a "normal" function call with arguments:

   x = fun1(arg1, arg2);

Here when we call function fun1 we can be confident that fun1 isn't going to change the values of its arguments. If instead of using arguments we had made arg1 and arg2 external, so that the function call was just x = fun1();, we wouldn't know whether arg1 and arg2 still had the same values when fun1 returned. If overused, external variables tie different routines up in knots.

External variables should be used very sparingly

Use global variables for situations when all of the following criteria apply:
  • They represent quantities which really are global to the whole scope of the problem (or a large sub-part).
  • Sufficiently many functions need them that passing them as arguments is inconvenient.
  • It is logically impossible for there to be more than one of that thing, not just "I can't imagine we would ever want more than one".

Good uses include:

  • The first in a linked list of items that are fundamental to the whole program (as in the firstfood example below).
  • Program-wide options and flags.

Never use global variables just to avoid function parameters.

Splitting source code between files

As programs get larger it makes sense to split the code between different files, with related functions all going into the same file. These are referred to as source files and have the file-name suffix ".c".

  • In Code::Blocks add files by selecting File -> New -> Empty File and confirm "Add to project".

Header files: "mydefs.h"

When we split our code between different files there are a number of things that have to be consistent across the whole program, for example:

Things to go into include files
  • Structure definitions
  • Function prototypes
  • #defined constants
  • typedef
  • Enumerations
But not the actual function code.

We could just copy and paste function prototypes etc. between the different source files but how could we be sure that the different prototypes were the same? If we add an extra argument to a function, or a member to a structure, there is the danger we will forget to change the prototype or structure definition in one of the other source files. The correct way to share these definitions between different source files is to create our own header files and to include the same header file in each .c file.

Header files allow us to keep definitions and prototypes consistent between all of our .c program files.

Having our own #include files

Lines starting with a hash # are called pre-processor directives and the system uses them to create an edited version of the C file to be given to the compiler.

The answer is to put our function prototypes inside a header file and #include it in all our source files.

#include "mydefs.h"

The preprocessor line above "pastes in" the contents of the header file "mydefs.h" which should be in the same place as our ".c" files. Notice that the file name is in between double-quotes (just like C strings) rather than angle brackets <>.

We include our own header files with #include "myfile.h", using double-quotes rather than angle brackets <>.

Our header files contain everything we would normally put before the first function (apart from the initial comment), i.e. the list above but not the actual functions themselves.

Header files include function prototypes but not the actual function source.

Depending on the size of the project we may choose to have just one header file or several.

Example: xmalloc()

We may decide to put xmalloc() into a separate source file and perhaps have an xrealloc() as well. The two files would look like this:

Source file: alloc.c

#include <stdio.h>   // for fprintf()
#include <stdlib.h>
#include "alloc.h"
void *xmalloc(size_t n) {
  void *p = malloc(n);
  if (p == NULL) {
    fprintf(stderr, "Out of memory!\n");
    exit(1);
  }
  return p;
}

void *xrealloc(void *old, size_t n) {
  void *p = realloc(old, n);
  if (p == NULL && n != 0) {
    fprintf(stderr, "Out of memory!\n");
    exit(1);
  }
  return p;
}

Header file: alloc.h

#include <stddef.h>  // for size_t
void *xmalloc(size_t n);
void *xrealloc(void *old, size_t n);

We would then #include alloc.h in the source files that use these functions. (See below for a slight improvement to this file.)

Notes

  1. We have put the actual function code in the ".c" file, not the ".h" file.
  2. We have #include "alloc.h" at the top of the .c file so that the source code for the xmalloc() and xrealloc() functions is also checked against their prototypes.

Calling functions in other source files

Sharing and not sharing functions between source files

By default, functions inside any source file that is part of the project ("linked together" in the jargon) can be called from functions inside any other. It's possible to restrict a function to just being called from functions within the same source file by using the static qualifier:

This is a different use of the word static from that used for variables inside functions to make them retain their values between calls.

// This function cannot be called from functions in other files.
static int localfun(int whatever) {
  int result = 0;
  // Your code here...
  return result;
}

Now localfun() can only be called from other functions in the same source file. We would put the prototype for localfun() at the top of the source file it is in, not in the header file.

Putting the word static in front of a function definition stops it from being called from another .c file.

Sharing and not sharing variables between files

Sharing variables only between functions in the same source file

We saw above that variables defined outside of a function are called external variables and can be used by any function in the file, provided only that the function definition follows the variable definition. Just like functions, external variables can be prefixed with the word static to restrict them just to functions in the same source file:

double globalval;       // Available to any function in any file
static int somenumber;  // Available only to functions in this file

void myfun1(int i) {
  /* myfun1 can now use the variables globalval and somenumber */
}

void myfun2(int i) {
  /* so can myfun2 */
}

Putting the word static in front of a variable declared outside of a function stops it from being accessed from another .c file.

Sharing variables between functions in different source files

As mentioned above, external variables without the static qualifier can be shared between functions in different files.

However, we can't just put the statement:

double globalval;

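in every source file, as each file would then contain its own definition of globalval. A program may only define an external variable once: depending on the compiler and its options, the linker either rejects the duplicates with a "multiple definition" error or quietly merges them, and neither is something to rely on.

Putting a variable definition such as "double globalval;" inside every C file defines globalval more than once, which is an error.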

To solve this problem we "properly" declare a global variable in just one file and then tell the other files it exists, using a new keyword "extern":

extern double globalval;

The declaration "extern double globalval;" tells the compiler that the variable globalval exists but does not create it.

Using an include file

In theory we can just put extern declarations such as the above in every .c file. But this gives a potential problem: if we have separate declarations of a global variable in each .c file there is the danger that we might declare it as "float" in one file and "double" in another. Therefore, just like our function prototypes and structure definitions, it is safest to put our extern declarations in the appropriate include file to make sure that everything is consistent.

In the reverse order to the above, the steps are:

  1. Put an external ("extern") declaration inside an include file

    Here is a typical include file which we have called food.h. It defines a Food structure suitable for a linked list, declares some function prototypes and then declares an extern (global) variable for the first member of the linked list:

    // Contents of food.h
    #define MAXLEN 256
    typedef struct food {
      char name[MAXLEN];
      float calories;
      struct food *next;
    } Food;
    
    void new_food(void);
    void calculate_calories(void);
    
    extern Food *firstfood;
    
    

    As mentioned above, the extern keyword says "a variable called firstfood has been declared somewhere else and we are going to share it". Think of an extern declaration as being a bit like a function prototype for variables: it does not create a variable, it just says the variable exists somewhere else. We will include food.h in all the .c files.

  2. Put the include file inside every .c file.

    Now every .c file will know that this variable exists, although we have not yet created it. For example, the following code uses the variable firstfood on the assumption it has been declared somewhere else:

    #include "food.h"
    #include "alloc.h"   // for xmalloc()
    #include <stdlib.h>
    #include <stdio.h>
    
    void new_food(void) {
      Food *new = NULL;
    
      new = xmalloc(sizeof *new);
      new->next = firstfood;
      firstfood = new;
    
      printf("Name of food?\n");
      scanf("%255s", new->name);
      printf("Calories per gramme?\n");
      scanf("%f", &new->calories);
    }
    
    
    void calculate_calories(void) {
      Food *tmp;
      float calories = 0, grammes;
    
      for(tmp = firstfood; tmp != NULL; tmp = tmp->next) {
        printf("How many grammes of %s?\n", tmp->name);
        scanf("%f", &grammes);
        calories += grammes * tmp->calories;
      }
      printf("Your food contains %.2f calories\n", calories);
    
    }
    
    

    Every C file that wants to use a global variable declared in another file must either #include the appropriate header file or have its own declaration of the variable.

  3. Have the variable declared for real in one file only.

    Finally, we create the variable inside one source file:

    #include <stdio.h>
    #include "food.h"
    
    Food * firstfood;
    
    int main(void) {
      int choice;
    
      do {
        printf("\nChoice?\n\n"
         "0. Quit\n"
         "1. Add new food\n"
         "2. Calculate calories\n");
        scanf("%d", &choice);
    
        switch(choice) {
        case 0:
          break;
    
        case 1:
          new_food();
          break;
    
        case 2:
          calculate_calories();
          break;
    
        default:
          fprintf(stderr, "Input out of range\n");
        }
      } while(choice);
    
      return 0;
    }
    
    

    Here inside just one source file we have declared firstfood to be a non-static external variable (one that can be shared with other files as opposed to somenumber above which was a static external variable, one that cannot be shared with other files).

    A variable to be shared between files must be properly declared just once.

Notes

  • firstfood was defined properly (i.e. without an extern) just once, outside of any function and without a static modifier. We chose the file with the main function but it could have been any file.
  • Other files could then declare firstfood as extern. We chose to do it via food.h but we could have manually typed the statement in the file or even put the statement in individual functions. (Using an include file is the most common way of declaring external variables.)
  • We have declared firstfood twice in the main file: as an external variable via food.h and then declared it "properly" (without the extern) in the actual file. Rather like a function prototype, this checks that the extern and "real" declarations of firstfood agree and makes it difficult to accidentally use the same global variable name twice.

More about the preprocessor

Lines starting with a hash # are called pre-processor directives and the system uses them to create an edited version of the C file to be given to the compiler.

The following is more-advanced material, feel free to pick and choose between it. However, if you work as part of a team or on somebody else's code you will almost certainly need to know it.

We have already used two features of the preprocessor (#include and #define), which "edits" our source files before they are compiled into something the computer can understand.

The preprocessor has a number of features which can make life easier for us but it should be used with care: at its best it can be a helpful workaround, at worst it becomes a collection of bodges.

#if ... #endif

This is used to enclose sections of the file which may or may not be passed through to the compiler.

Example: Pi

We may want to have a line such as:

#define PI 3.14159265358979

However some implementations define a constant M_PI to be the most accurate value of pi that machine can support and it makes sense to use it when possible. Therefore we want to have some logic that says "if we have a value of M_PI then we #define PI to be this, otherwise we use our own version". We can do this as follows:

#if defined(M_PI)
#define PI M_PI
#else
#define PI 3.14159265358979
#endif

In this example if M_PI has been #defined then the first part is activated, if not the second.

The construct #if defined(M_PI) is used so frequently it has its own synonym #ifdef.

Using #ifdef we might write something like:

#ifdef M_PI
// Using system value of pi
#define PI M_PI
#else
// Using our value of pi
#define PI 3.14159265358979
#endif

int main() {
  double twopi = 2 * PI;

  return 0;
}
 

We show the result for both cases:

Result with M_PI previously #defined

If M_PI is defined then the first part of the #if is activated and PI is defined as the system value of M_PI. The actual occurrence of PI inside main() is then replaced with this value:

Before preprocessing:

#ifdef M_PI
// Using system value of pi
#define PI M_PI
#else
// Using our value of pi
#define PI 3.14159265358979
#endif
int main() {
  double twopi = 2 * PI;
  return 0;
}

After preprocessing:

// Using system value of pi
int main() {
  double twopi = 2 * 3.14159265358979323846;
  return 0;
}

Result with M_PI not previously #defined

If M_PI is not defined then the second part of the #if is activated and PI is defined as 3.14159265358979, which again is used inside main():

Before preprocessing:

#ifdef M_PI
// Using system value of pi
#define PI M_PI
#else
// Using our value of pi
#define PI 3.14159265358979
#endif
int main() {
  double twopi = 2 * PI;
  return 0;
}

After preprocessing:

// Using our value of pi
int main() {
  double twopi = 2 * 3.14159265358979;
  return 0;
}

NB: for clarity we have used an option to the C preprocessor that preserves comments; when compiling normally they are replaced by spaces.

C also defines #ifndef M_PI for "if M_PI has not been #defined".

Using the preprocessor to handle differences between operating systems or compilers is very common. Note that #ifdef doesn't care what value M_PI has been defined with, just that it has been defined.

#if #elif #else #endif cause the preprocessor to exclude parts of the source file

The "false" sections are completely removed

#if and #ifdef operate before the code ever gets to the compiler, and the "false" sections are completely removed. Consider the following rather extreme example:

Before preprocessing:

#define ABC "Alpha Bravo Charlie\n"
#ifndef ABC
Chim chiminey, chim chiminey,
chim chim cher-oo.
My dad is a dalek and I'm Dr Who!
#else
int main() {
  printf(ABC);
  return 0;
}
#endif

After preprocessing:

int main() {
  printf("Alpha Bravo Charlie\n");
  return 0;
}

Strange as it looks this is perfectly legal C as the "Chim chiminey.." text is removed before the code is compiled.

A little more about #define

Before we go on to some examples of using #ifdef, a couple of points are worth mentioning.

#undef

#undef FOO

removes any definition of FOO. It does not matter if FOO had not previously been defined.

"Empty" #defines

We've already met one use of an empty #define, with assert.h:

#define NDEBUG
#include <assert.h>

which turns off the assert() macro. In this case any subsequent occurrence of NDEBUG in the code would simply be removed.

"Empty" #defines like this are often used when their job is simply to turn on or off sections of code later on in the file(s). They have another use when defining debugging macros, as we shall see below.

Typical uses of #if ... #endif

Chopping out whole sections of the program before they even get to the compiler is a pretty extreme measure and it tends to get used in a few specific circumstances:

  • Code that differs between operating systems or compiler options. (See previous example.)
  • Header file protection
  • Debugging code

Header file protection

We don't normally want to include a header file twice by mistake. We can guard against this by using the preprocessor. For example our previous header file alloc.h might actually look like this:

An improved alloc.h

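#ifndef ALLOC_H
#define ALLOC_H
#include <stddef.h>  // for size_t
void *xmalloc(size_t n);
void *xrealloc(void *old, size_t n);
#endif

Should this accidentally be #included twice into a source file then the second time around the preprocessor will strip it out. This is an almost-universal convention, for example on my system the file math.h begins with a comment followed by:

#ifndef _MATH_H
#define _MATH_H 1

(System headers such as math.h use a leading underscore because names of that form are reserved for the implementation; for our own headers it is safer to leave the underscore off.)

Protect header files with #ifndef FILENAME_H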

Example: optional debugging

When developing code it's quite common to want to have debugging statements but to turn them off when the code goes out to other people. One way to do this is to have a line at the top of our include file like this:

#define DEBUG

This can be removed when the code goes "live". Later on in the include file, and sometimes in the .c files, there will be sections of the form:

#ifdef DEBUG
... debugging code
#else
... normal code
#endif

Example: "I'm here" macro

C99 defines some useful macros and non-changeable strings of characters, including:
  • __func__ the name of the current function
  • __LINE__ the (integer) line number.

This allows us to define an optional "I'm here" macro to help us with cases where we are not quite sure the order in which functions are being called, how many times loops are running, etc:

#ifdef DEBUG
#define IAMHERE fprintf(stderr, "%s() line %d\n", __func__, __LINE__)
#else
#define IAMHERE
#endif

The second half of this looks strange but without it, if a function contained the statement:

IAMHERE;

and DEBUG had not been defined then the compiler would not know what IAMHERE meant and the compilation would fail. So we put in an empty #define, which means IAMHERE is replaced by nothing.

Example: avoiding having to declare global variables twice

As it stands we have to declare global variables twice: first as extern in the include file, and then once without the extern in a .c file. But what if we are too lazy (or busy) to do this twice?

One of my header files looks like this (with several hundred lines missed out):

#ifndef GLOBALS_H
#define GLOBALS_H

#ifndef EXTERN
#define EXTERN extern
#endif
EXTERN int mouseaction, popup, msize;
#endif

There are many other lines starting with EXTERN. Every file except one (main.c) has #include "globals.h" as its very first line:

Before preprocessing:

#include "globals.h"
void somefunction(void) {
  // More code here..
}

After preprocessing:

extern int mouseaction, popup, msize;
void somefunction(void) {
  // More code here..
}

This has the effect of declaring mouseaction, etc. as extern (i.e. it does not create these variables, just says they have been declared properly somewhere else).

But main.c looks like this:

Before preprocessing:

#define EXTERN
#include "globals.h"
int main() {
  // ...
  return 0;
}

After preprocessing:

int mouseaction, popup, msize;
int main() {
  // ...
  return 0;
}

(note the missing "extern"). Now mouseaction, etc. have been properly declared and we have achieved our goal of declaring our global variables just once.

More-complicated #if expressions

In extreme circumstances we can use expressions in #if lines:

#if ! defined(M_PI) && ! defined(PI)
I don't know what pi is!
#endif

If neither M_PI nor PI has been #defined then the line "I don't know what pi is!" will get passed to the compiler with the inevitable consequence.

Macros: #defines with arguments

#defines can also take arguments, although these are less used than they used to be. For example:

#define myisdigit(c) (((c) >= '0' && (c) <= '9'))

Notice that the "expansion" contains the arguments (in this case just one, "c"); this is nearly always the case. Now if our code contains the line:

if (myisdigit(str[i]))

it gets preprocessed to the macro definition with the "c"s replaced by "str[i]":

if ((((str[i]) >= '0' && (str[i]) <= '9')))

Using this instead of isdigit() is very much faster on my machine. (It's also safer than it looks: although computers are not obliged to use ASCII to represent digits they are obliged to use a scheme where '1' == '0' + 1, etc.)

There are a few other dangers however, one of which is hinted at by the large number of brackets we chose to use.

Where possible use single-line functions

Traditionally macros have been used for two main purposes:

  1. To save time calling very short functions
  2. To do something that cannot be done with functions at all

Modern compilers will usually deal with problem 1 (by removing the function call altogether and inserting the instructions directly into the calling function) so "save time" macros are usually better written as single-line functions, particularly for macros that expand an argument twice, as in the above example:

// Single-line function replacing a macro
int myisdigit(char c) {
  return c >= '0' && c <= '9';
}
 
      

Macros that expand an argument twice are best replaced by single-line functions.

Dangers

The following attempts to define a useful "squared" macro:
#define BADSQ(foo) foo * foo

Then if the preprocessor later encounters:

z = BADSQ(x);

it replaces it with:

z = x * x;

This looks like a function but isn't: if the preprocessor later encounters:

z = BADSQ(x + y);

it replaces it with:

z = x + y * x + y;

which is not what we want! Our first attempt at a fix is to add brackets around each "foo" in the expansion:

#define BADSQ(foo) (foo) * (foo)
      

Now

z = BADSQ(x + y);

correctly becomes:

z = (x + y) * (x + y);

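But we still have a problem with the line:

z = 1.0/BADSQ(x);

which expands to:

z = 1.0 /(x) * (x);

Since / and * have equal precedence and group left to right, this is (1.0/x) * x rather than 1.0/(x * x). Instead we should use:

#define SQ(foo) ((foo) * (foo))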

Even this isn't entirely safe however, consider:

y = SQ(++x);

which becomes:

y = ((++x) * (++x));

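So x gets incremented twice. Worse, modifying x twice in one expression with no sequence point in between is undefined behaviour, so the result is not even guaranteed to be the same from one compiler to the next. We show this below: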

Before preprocessing:

#define BADSQ(foo) foo * foo
#define BADSQ2(foo) (foo) * (foo)
#define SQ(foo) ((foo) * (foo))
int main() {
  double x = 1.2, y = 2.1;
  printf("%g squared is not %g\n", x+y, BADSQ(x+y) );
  printf("1/%g squared is also not %g\n", x+y, 1.0/BADSQ2(x+y) );
  printf("1.0/%g squared is %g\n", x+y, 1.0/SQ(x+y) );
  printf("But %g squared ", x + 1);
  printf("is not %g\n", SQ(++x) );
  printf("x is now %g\n", x);
  return 0;
}

After preprocessing:

int main() {
  double x = 1.2, y = 2.1;
  printf("%g squared is not %g\n", x+y, x+y * x+y );
  printf("1/%g squared is also not %g\n", x+y, 1.0/(x+y) * (x+y) );
  printf("1.0/%g squared is %g\n", x+y, 1.0/((x+y) * (x+y)) );
  printf("But %g squared ", x + 1);
  printf("is not %g\n", ((++x) * (++x)) );
  printf("x is now %g\n", x);
  return 0;
}

What this tells us is that using macros for mathematical shortcuts is very dubious: it's better to use a function instead.

As mentioned above, macros that use their arguments twice are best avoided and so is changing the value of variables in calls to functions or macro evaluations.

Variable numbers of arguments

Macros with at least one argument may be specified to have a variable number of arguments: any "extra" arguments are just pasted in. These arguments are indicated by ...:

#define mymacro(arg1, arg2, ...)

We then need to specify where the arguments appear in the expansion. This is done by placing the word __VA_ARGS__ where the extra arguments should appear.

Example: a simple debug macro

#ifdef DEBUG
#define debug(format, ... ) fprintf(stderr, format, __VA_ARGS__)
#else
#define debug(format, ... )
#endif

We would normally put the above in a header file or at the very top of a source file. Inside a function we can then write:

  debug("x = %g\n", x);

And it will only print out the message when DEBUG has been #defined.

Example: an improved debug macro

We have already mentioned how C99 defines __func__, the name of the current function, and __LINE__, the line number. This enables us to create a slightly better debug macro:

#ifdef DEBUG
#define debug(format, ... ) fprintf(stderr, "%s() line %d: " format, __func__, __LINE__, __VA_ARGS__)
#else
#define debug(format, ... )
#endif

Before preprocessing:

#define DEBUG
#include "debug.h"

int main() {
  int x = 1;
  debug("x is %d\n", x);
  return 0;
}

After preprocessing:

int main() {
  int x = 1;
  fprintf(stderr, "%s() line %d: " "x is %d\n", __func__, 6, x);
  return 0;
}

This works because the compiler joins the first two strings together to create a single format string. The output is:

main() line 6: x is 1

If we remove the #define of DEBUG then the macro expands to a blank:

Before preprocessing:

#include "debug.h"
int main() {
  int x = 1;
  debug("x is %d\n", x);
  return 0;
}

After preprocessing:

int main() {
  int x = 1;
  ;
  return 0;
}

Preprocessor tricks can sometimes be useful but are always ugly. Use them sparingly: a few can be useful but like all bodges once we have too many they can be confusing and interact quite badly.

Summary


Sharing variables between functions

Header files: "mydefs.h"

Calling functions in other source files

Sharing and not sharing variables between files

Sharing variables between functions in different source files

Header file protection
