Multiple files and the pre-processor
Comments and questions to John Rowe.
As programs get larger it makes sense to split the code between
different files, with related functions all going into the same
file. These are referred to as source files
and have the file-name suffix ".c".
- In Code::Blocks add files by selecting File -> New -> Empty File and confirm "Add to project".
But there are a number of things that have to be consistent
across the whole program, for example:
- Structure definitions
- Function prototypes
- #defined constants
- typedef
- Enumerations
We could just copy and paste, for example, function prototypes
between the different source files, but how could we be sure that
the different prototypes were the same? If we add an extra argument
to a function there is the danger we will forget to change the
prototype in one of the other source files. Besides, every time we
add a new function we will have to modify each source file that
wants to use it. And the same applies to structure definitions,
enumerations, etc. A better way to share definitions between source
files is to create our own header files.
Header files allow us to keep definitions and
prototypes consistent between all of our .c program files.
Having our own #include files
The answer is to put our function prototypes inside a header
file and #include it in all our source
files.
The preprocessor line:
#include "mydefs.h"
"pastes in" the contents of the header file "mydefs.h",
which should be in the same place as our ".c" files. Notice that
the file name is in between double-quotes (just like C strings)
rather than angle brackets <>.
We include our own header files using
#include "myfile.h", with double-quotes rather
than angle brackets <>.
Loosely speaking our header files contain everything we would
normally put before the first
function,
i.e. the list above but not the
actual functions themselves.
Header files include function prototypes but not
the
actual function source.
Depending on the size of the project we may choose to have just
one
header file or several.
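As a rough illustration of what tends to go into a header, here is a sketch of a hypothetical mydefs.h. Every name in it (MAX_NAME, Colour, Point, distance) is invented for this example and is not taken from the course code.

```c
/* mydefs.h: a hypothetical header (all names invented for illustration) */

#define MAX_NAME 32                        /* a #defined constant */

typedef enum { RED, GREEN, BLUE } Colour;  /* an enumeration with a typedef */

typedef struct point {                     /* a structure definition */
    double x, y;
    Colour colour;
    char name[MAX_NAME];
} Point;

/* A function prototype only: the function body lives in a .c file. */
double distance(const Point *a, const Point *b);
```

Any source file that starts with #include "mydefs.h" then sees exactly the same definitions as every other.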
Example: xmalloc()
We may decide to put xmalloc() into a separate source file
and perhaps have an xrealloc() as well. The two files would
look like this:
Source file: alloc.c
Header file: alloc.h
We would then #include alloc.h in the
source
files that use these functions. (See below for a slight
improvement to this file.)
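The listings for the two files are not reproduced here, so the following is only a sketch of how they might look. It assumes that xmalloc() and xrealloc() print a message and exit when memory runs out; the exact messages, and the choice to exit, are assumptions. Both files are shown in one block, separated by comments.

```c
/* ---- alloc.h (sketch) ---- */
#include <stddef.h>

void *xmalloc(size_t nbytes);
void *xrealloc(void *ptr, size_t nbytes);

/* ---- alloc.c (sketch) ---- */
#include <stdio.h>
#include <stdlib.h>
/* The real alloc.c would also have: #include "alloc.h" */

/* Like malloc() but never returns NULL: it exits on failure instead.
   Assumes nbytes is greater than zero. */
void *xmalloc(size_t nbytes)
{
    void *p = malloc(nbytes);
    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Like realloc() but exits on failure. Assumes nbytes is greater than zero. */
void *xrealloc(void *ptr, size_t nbytes)
{
    void *p = realloc(ptr, nbytes);
    if (p == NULL) {
        fprintf(stderr, "xrealloc: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Including alloc.h in alloc.c itself lets the compiler check that the prototypes and the definitions agree.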
Sharing and not sharing functions between source files
By default, functions inside any source file that is part of
the
project ("linked together" in the jargon) can be called from
functions
inside any other. It's possible to restrict a function to just
being
called from functions within the same source file by using the
static qualifier:
This is a different use of the word static from that used for
variables inside functions to make
them retain their values between calls.
Now localfun() can only be called from other
functions
in the same source file. We would put the prototype for localfun()
at the top of the source file it is in, not in the header file.
Putting the word static in front of a function
definition stops it from being called from another .c file.
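A minimal sketch of this arrangement follows. The name localfun() comes from the text above; publicfun() and the function bodies are made up for illustration.

```c
/* Prototype at the top of this source file, not in the header file. */
static int localfun(int n);

/* No "static": this can be called from any source file in the project. */
int publicfun(int n)
{
    return localfun(n) + 1;
}

/* "static": this can only be called from functions in this source file. */
static int localfun(int n)
{
    return 2 * n;
}
```

Another source file could define its own, unrelated, static function called localfun() without any clash.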
Sharing variables between functions in the same source file
We recall that variables defined outside
of a function are called external
variables
and can be used by any function in the file, provided only that
the
function definition follows the variable definition. Just like
functions, external variables can be prefixed with the word static to restrict them just to functions
in the same source file:
Putting the word static in front of a
variable declared outside of a function
stops it from being accessed from another .c file.
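As a sketch, a static external variable shared by two functions in the same source file might look like this. The variable is called somenumber to match the name used later in these notes; its type, its initial value and the two functions are assumptions.

```c
/* Visible to every function below it in this file, but to no other file. */
static int somenumber = 0;

/* Hypothetical helper: hands out 1, 2, 3, ... on successive calls. */
int next_number(void)
{
    return ++somenumber;
}

/* Hypothetical helper: starts the sequence again. */
void reset_number(void)
{
    somenumber = 0;
}
```

Code in other files can still change somenumber, but only by calling these functions.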
As mentioned above, external variables without the static
qualifier can be shared between functions in different files.
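However, we can't just put the line:
double globalval;
in every source file as that would define a separate
variable called globalval in each file (and with many modern
compilers the program then fails to link, because globalval is
defined more than once).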
Putting a variable declaration such as
"double globalval;" inside every C file would
create a separate
globalval for each file.
To solve this problem we "properly" declare a global variable
in just one file and then tell the
other files it
exists, using a new keyword "extern".
In the reverse order to the above, the steps are:
- Put an external ("extern")
declaration
inside an include file
Here is a typical include file which we have called food.h.
It defines a Food structure suitable for a linked list,
declares some function prototypes and then declares an extern
(global) variable for the first member
of the linked list:
The extern keyword is one we haven't met
before: it says
"a variable called firstfood has been declared
somewhere else
and we are going to share it". We will include food.h
in all the .c files.
The declaration
"extern double globalval;"
tells
the compiler that the variable globalval exists but
does not
create it.
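The food.h listing itself is not reproduced here. A sketch consistent with the description (a Food structure for a linked list, some prototypes and an extern variable called firstfood) could be as follows; the structure members and the function names are assumptions.

```c
/* food.h (sketch: member and function names are assumptions) */
#include <stddef.h>

typedef struct food {
    char name[32];
    double price;
    struct food *next;          /* next item in the linked list */
} Food;

/* Function prototypes (the bodies live in the .c files). */
Food *newfood(const char *name, double price);
void printfoods(void);

/* Tells the compiler that firstfood exists but does not create it. */
extern Food *firstfood;
```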
- Have the variable declared for real in
one file only.
firstfood is a non-static
external variable
(one that can be shared with other files) as opposed to
somenumber above which was a static
external
variable (one that cannot
be shared with
other files).
Reminder: the call to printf() inside
main uses the fact that
when two or more strings follow each other separated by
white
space ("First string" "second string") C
just joins
them together to form a single string
("First stringsecond string" - note it
didn't put a
space in there).
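The string-joining rule in the reminder can be tried out in isolation. In this small sketch the function name banner() is invented for the example.

```c
/* Two string literals separated only by white space are joined by the
   compiler into one string, with nothing inserted between them. */
const char *banner(void)
{
    return "First string"
           "second string";
}
```

This is handy for splitting a long printf() format over several lines of source code.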
A variable to be shared between files must be
properly declared just once.
- Have other files refer to the variable
via the include file.
Every C file that wants to use a global
variable declared in another file must either #include
the appropriate header file
or have its own extern declaration of the
variable.
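Putting the three steps together, a sketch of the whole arrangement is shown below, with the three files marked by comments inside a single block. The file name list.c, the function countfoods() and the structure members are assumptions made for this example.

```c
#include <stddef.h>

/* ---- food.h: shared by every .c file ---- */
typedef struct food {
    char name[32];
    double price;
    struct food *next;
} Food;

extern Food *firstfood;         /* "it exists somewhere" */
int countfoods(void);

/* ---- main.c: the one place the variable is declared for real ---- */
Food *firstfood = NULL;

/* ---- list.c (hypothetical): only knows firstfood via food.h ---- */
int countfoods(void)
{
    int n = 0;
    for (const Food *f = firstfood; f != NULL; f = f->next)
        ++n;
    return n;
}
```

Because main.c and list.c both include food.h, the compiler can check that they agree about the type of firstfood.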
Notes
- firstfood was defined properly (i.e. without an
extern) just once,
outside of any function
and without a static modifier. We chose the file
with the
main function but it could have been any file.
- Other files could then declare firstfood as
extern. We chose to do it via food.h but we
could
have manually typed the statement in the file or even put the
statement in individual functions. (Using an include file is
the most
common way of declaring external variables.)
- Eagle-eyed students may have noticed that firstfood
ended up being declared twice in the main file: as an external
variable via food.h and then defined properly
(without the
extern) in the actual file. This isn't a problem - C
allows
this to make it easier to use include files. However it does
mean that
each variable has to be declared twice, once as extern
and once not. See below for a somewhat
brutal way of getting round this.
More about the preprocessor
The following is more-advanced material, feel free to pick and
choose between it. However, if you work as part of a team or on
somebody else's code you will almost certainly need to know it.
We have already used two features of the preprocessor
which "edits" our source files before
they are compiled into something the computers can understand.
The preprocessor has a number of features which can make life
easier for us but it should be used with care: at its best it
can be a
helpful workaround, at worst it becomes a collection of bodges.
The #if ... #endif construct is used to enclose sections of the
file which may or may not be passed through to the compiler.
Example: Pi
We may want to have a line such as:
However some implementations define a constant M_PI
to be the most accurate value of pi that machine can support and
it
makes sense to use it when possible. We can use this as follows:
If M_PI has been #defined then the first
part is activated, if not the second.
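A sketch of the construct just described follows. The fallback value and the function circle_area() are assumptions; only the names PI and M_PI come from the text.

```c
#include <math.h>

#if defined(M_PI)
#define PI M_PI                 /* use the system's value when there is one */
#else
#define PI 3.14159265358979     /* otherwise fall back on our own */
#endif

/* Hypothetical function that just uses PI. */
double circle_area(double r)
{
    return PI * r * r;
}
```

Either way, the rest of the program can use PI without caring where it came from.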
The construct
#if defined(M_PI) is used so frequently it has
its own synonym #ifdef. We show
the result for both cases:
Result with M_PI not previously #defined
NB: for clarity we have used an option to the
C preprocessor that preserves comments, when compiling they are
replaced by spaces.
Result with M_PI previously #defined
C also defines #ifndef M_PI for "if M_PI
has not been #defined".
Using the preprocessor to handle differences between
operating
systems or compilers is very common. Note that #ifdef
doesn't care what value M_PI has been defined with,
just that
it has been defined.
#if #elif #else #endif
cause the preprocessor to exclude parts of the source file.
The "false" sections are completely removed
#if and #ifdef operate before
the code ever gets to the compiler,
and the "false" sections are completely removed. Consider the
following
rather extreme example:
Strange as it looks this is perfectly legal C as the "Chim
chimeny.." text is removed before the code is compiled.
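The original listing is not reproduced here, but the same effect can be sketched with #if 0, whose condition is always false. The function name lucky() and the exact wording of the removed text are made up for this example.

```c
int lucky(void)
{
#if 0
    Chim chimeny, chim chimeny: this line is not C at all,
    but the preprocessor removes it before the compiler sees it.
#endif
    return 7;
}
```

One caution: some compilers still warn about an unmatched quote character inside a removed section, so it is safest to keep apostrophes out of such text.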
A little more about #define
Before we go on to some examples of using #ifdef,
a couple of points are worth mentioning.
#undef
The line "#undef FOO" removes any definition of FOO. It does not
matter if FOO had not previously been defined.
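A short sketch showing both points. FOO is the name used in the text; BAR and the function foo_is_defined() are invented for the example.

```c
#define FOO 42
#undef FOO          /* FOO is no longer defined */
#undef BAR          /* harmless: BAR was never defined in the first place */

/* Reports whether the preprocessor still knows about FOO at this point. */
int foo_is_defined(void)
{
#ifdef FOO
    return 1;
#else
    return 0;
#endif
}
```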
"Empty" #defines
We've already met one use of an empty #define,
with assert.h:
#define NDEBUG
which turns off the assert() macro. In this case any
subsequent occurrence of
NDEBUG in the code would simply be removed.
"Empty" #defines like this are often used when their
job
is simply to turn on or off sections of code later on in the
file(s).
They have another use when defining debugging macros as we shall
see
below.
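Both behaviours of an empty #define can be seen in the following sketch. All of the names (COUNT_CALLS, NOTHING, twice, call_count) are hypothetical.

```c
#define COUNT_CALLS     /* empty: only the fact that it exists matters */
#define NOTHING         /* empty: every later NOTHING is simply deleted */

static int ncalls = 0;

int twice(int x)
{
#ifdef COUNT_CALLS
    ++ncalls;               /* this section is switched on by COUNT_CALLS */
#endif
    return NOTHING 2 * x;   /* the compiler sees: return 2 * x; */
}

int call_count(void)
{
    return ncalls;
}
```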
Chopping out whole sections of the program before
they even get to the compiler is a pretty extreme measure
and it tends to get used in a few specific circumstances:
- Code that differs between operating systems or compiler
options. (See previous example.)
- Header file protection
- Debugging code
We don't normally want to include a header file twice by
mistake.
We can guard against this by using the preprocessor. For example
our previous header file alloc.h might actually look
like this:
An improved alloc.h
Should this accidentally be #included twice into a
source
file then the second time around the preprocessor will strip it
out.
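To see the protection at work, the sketch below writes out what the compiler would receive if a guarded alloc.h were included twice. The guard name ALLOC_H and the AllocStats structure are assumptions; the structure is only there because defining a structure twice would normally be an error.

```c
#include <stddef.h>

/* ---- alloc.h, arriving for the first time ---- */
#ifndef ALLOC_H
#define ALLOC_H

typedef struct allocstats { size_t ncalls; } AllocStats;  /* hypothetical */
void *xmalloc(size_t nbytes);
void *xrealloc(void *ptr, size_t nbytes);

#endif

/* ---- the same text arriving a second time ---- */
#ifndef ALLOC_H     /* false now, so everything down to #endif is removed */
#define ALLOC_H

typedef struct allocstats { size_t ncalls; } AllocStats;
void *xmalloc(size_t nbytes);
void *xrealloc(void *ptr, size_t nbytes);

#endif
```

The guard name here has no leading underscore because names beginning with an underscore and a capital letter are reserved for the system's own headers.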
This is an almost-universal convention, for example on my system
the
file math.h begins with a comment followed by:
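Protect header files with #ifndef FILENAME_H. (Names that begin with an underscore and a capital letter, such as _FILENAME_H, are reserved for the system's own headers.)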
Example: optional debugging
When developing code it's quite common to want to have
debugging statements but to turn them off when the code goes out
to other people. One way to do this is to have a line
at the top of our include file like this:
#define DEBUG
This can be removed when the code goes "live". Later on in
the include file, and sometimes in the .c files, there will be
sections
of the form:
#ifdef DEBUG
... debugging code
#else
... normal code
#endif
Example: "I'm here" macro
C99 defines some useful macros and non-changeable strings of
characters,
including:
- __func__ the name of the
current function
- __LINE__ the (integer) line
number.
This allows us to define an optional "I'm here" macro to help
us with cases where we are not quite sure the order in which
functions
are being called, how many times loops are running, etc:
The second half of this looks strange but without it, if a
function
contained the statement:
IMHERE;
and DEBUG had not been defined then the compiler
would
not know what IMHERE meant and the compilation would
fail. So
we put in an empty #define which is replaced by an
empty
string.
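A sketch of such a macro, with both halves, might be as follows. The message format, the use of stderr and the function sum_to() are assumptions.

```c
#include <stdio.h>

#define DEBUG       /* remove this line to silence IMHERE */

#ifdef DEBUG
#define IMHERE fprintf(stderr, "I'm here: %s() line %d\n", __func__, __LINE__)
#else
#define IMHERE      /* empty, so "IMHERE;" becomes just ";" */
#endif

/* Hypothetical function: IMHERE shows how many times the loop runs. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        IMHERE;
        total += i;
    }
    return total;
}
```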
Example: avoiding having to declare global variables twice
As it stands we have to declare global variables twice: first
as
extern in the include file, and then once without the extern
in a .c file. But what if we are too lazy (sorry, busy) to do
this twice?
One of my header files looks like this (with several
hundred lines missed out):
There are many other lines starting with EXTERN.
Every file except one (main.c) has #include "globals.h"
as its very first line:
But main.c looks like this:
(note the missing "extern") and we have achieved our goal of
declaring my global
variables just once.
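The header and the two kinds of source file are not reproduced here. One way the trick could work is sketched below, written out as the compiler would see main.c. The variable names verbosity and tolerance are invented, and the exact contents of the real globals.h are unknown.

```c
/* ---- main.c starts with an empty definition of EXTERN ---- */
#define EXTERN      /* so "EXTERN int x;" becomes plain "int x;" */

/* ---- then #include "globals.h" pastes in the following ---- */
#ifndef EXTERN
#define EXTERN extern   /* every other file gets this definition */
#endif

EXTERN int verbosity;       /* hypothetical global variables */
EXTERN double tolerance;
```

In every other file EXTERN has not been defined before the header is read, so the same two lines become extern declarations. A limitation of this simple form is that the variables cannot be given initial values in the header.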
More-complicated #if expressions
In extreme circumstances
we can use expressions in #if lines:
If neither M_PI nor PI has been #defined
then the line "I don't know what pi is!" will
get passed to the compiler with the inevitable consequence.
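A sketch of this kind of test is shown below, with two changes from the description that should be noted. It uses the #error directive rather than a line of plain text to stop the compilation, and it defines PI itself first (standing in for some earlier header) so that the sketch compiles everywhere. BEST_PI and degrees() are hypothetical names.

```c
#include <math.h>

/* Assumption: stands in for a PI that an earlier header may have defined. */
#define PI 3.14159265358979

#if defined(M_PI)
#define BEST_PI M_PI            /* the system's own value */
#elif defined(PI)
#define BEST_PI PI              /* our value */
#else
#error "I don't know what pi is!"
#endif

/* Hypothetical function using whichever value was chosen. */
double degrees(double radians)
{
    return radians * 180.0 / BEST_PI;
}
```

Conditions can also be combined with && and ||, as in #if !defined(M_PI) && !defined(PI).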
#defines can also take arguments, although these are
less
used than they used to be. For example:
#define myisdigit(c) (((c) >= '0' && (c) <= '9'))
Notice that the "expansion" contains the arguments (in this
case
just one, "c"); this is
nearly always the case. Now if our code contains the line:
if (myisdigit(str[i]))
It gets preprocessed to the macro definition with the "c"s
replaced
by "str[i]":
if ((((str[i]) >= '0' && (str[i]) <= '9')))
Using this instead of isdigit() is very much faster
on my
machine. (It's also safer than it looks: although computers are
not
obliged to use ASCII to represent digits they are obliged to use
a
scheme where '1' == '0' + 1, etc.)
There are a few other dangers
however, one of which is hinted at by the large number of
brackets we
chose to use.
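Here is the macro at work in a complete function. The macro is the one defined above, written with one fewer pair of brackets; the function count_digits() is invented for the example.

```c
#define myisdigit(c) ((c) >= '0' && (c) <= '9')

/* Counts the digit characters in a string. */
int count_digits(const char *str)
{
    int n = 0;
    for (int i = 0; str[i] != '\0'; ++i)
        if (myisdigit(str[i]))
            ++n;
    return n;
}
```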
Dangers
The following attempts
to define a useful "squared" macro:
#define BADSQ(foo) foo * foo
Then if the preprocessor later encounters:
z = BADSQ(x);
it replaces it with:
z = x * x;
This looks like a function but isn't: if the preprocessor later
encounters:
z = BADSQ(x + y);
it replaces it with:
z = x + y * x + y;
which is not what we want! Even the following has a problem:
#define BADSQ(foo) (foo) * (foo)
because the line:
z = 1.0/BADSQ(x);
expands to:
z = 1.0 /(x) * (x);
Instead we should use:
#define SQ(foo) ((foo) * (foo))
Even this isn't entirely safe however, consider:
y = SQ(++x);
which becomes:
y = ((++x) * (++x));
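So x gets incremented twice. (Strictly, modifying x twice within a
single expression like this is undefined behaviour in C, so even the
result cannot be relied on.)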
What this tells us is that using macros for mathematical
shortcuts is very dubious: it's better to use a function
instead.
Macros that use their arguments twice are
best
avoided and so is changing the value of variables in
calls
to functions or macro evaluations.
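The problems described in this section can be checked with a small sketch. BADSQ2 is our own name for the second bad version, and the double evaluation is shown with a function call, next_value(), rather than with ++x, so that the behaviour is well defined. The helper functions are all hypothetical.

```c
#define BADSQ(foo)  foo * foo
#define BADSQ2(foo) (foo) * (foo)
#define SQ(foo)     ((foo) * (foo))

static int ncalls = 0;

int next_value(void) { return ++ncalls; }   /* has a side effect */
int calls_made(void) { return ncalls; }

/* With x = 2 and y = 3: */
int bad_sum_squared(int x, int y)  { return BADSQ(x + y); } /* x + y*x + y = 11 */
int good_sum_squared(int x, int y) { return SQ(x + y); }    /* (x+y)*(x+y) = 25 */
double bad_reciprocal(int x)  { return 1.0 / BADSQ2(x); }   /* (1.0/x)*x = 1.0  */
double good_reciprocal(int x) { return 1.0 / SQ(x); }       /* 1.0/(x*x) = 0.25 */

/* SQ(next_value()) expands to ((next_value()) * (next_value())),
   so the function is called twice. */
int square_of_next(void) { return SQ(next_value()); }
```

After one call to square_of_next(), calls_made() returns 2 rather than 1.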
Variable numbers of arguments
Macros with at least one argument may be specified to have
a variable number of arguments: any "extra" arguments are just
pasted in.
These arguments are indicated by ...:
#define mymacro(arg1, arg2, ...)
We then need to specify where the arguments appear
in the expansion. This is done by placing
the word __VA_ARGS__ where the extra
arguments
should appear.
Example: a simple debug macro
We would normally put the above in a header file or at the very
top of a source file.
Inside a function we can then write:
debug("x = %g\n", x);
And it will only print out the message when DEBUG
has been #defined.
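A sketch of a macro that behaves as described follows. It assumes the messages go to stderr via fprintf(); the function halve() is invented so there is something to call it from.

```c
#include <stdio.h>

#define DEBUG       /* remove this line to turn the messages off */

#ifdef DEBUG
#define debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define debug(...)  /* expands to nothing */
#endif

/* Hypothetical function containing a debugging message. */
double halve(double x)
{
    debug("x = %g\n", x);
    return x / 2.0;
}
```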
Example: an improved debug macro
We have already mentioned how C99 defines
__func__, the name of the current function, and __LINE__
the line number.
This enables us to create a slightly better debug macro:
The output is:
main() line 6: x is 1
If we remove the #define of DEBUG, the debug() calls expand to nothing and no message is printed.
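The improved macro is not reproduced here, so the following is only a sketch of how one might be written. Wrapping the two statements in do { ... } while (0) is our own choice, made so that debug(...); behaves as a single statement; the function step() is invented for the example.

```c
#include <stdio.h>

#define DEBUG       /* remove this line to turn the messages off */

#ifdef DEBUG
#define debug(...) \
    do { \
        fprintf(stderr, "%s() line %d: ", __func__, __LINE__); \
        fprintf(stderr, __VA_ARGS__); \
    } while (0)
#else
#define debug(...) do { } while (0)
#endif

/* Hypothetical function: the message is prefixed with "step() line N: ". */
int step(int x)
{
    debug("x is %d\n", x);
    return x + 1;
}
```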
Preprocessor tricks can sometimes be useful
but are
always ugly. Use them sparingly: a few can be useful
but like
all bodges once we have too many they can be confusing and
interact
quite badly.