Supplementary lecture
The standard library
As well as the language itself, C provides a standard library of
functions. We've all used some of them, but it's important to know
that there are others. We won't try to give an exhaustive list (that's
what K&R and Google are for), just a few examples.
The library functions are split into groups, each with its own include
file.
String functions: <string.h>
We've all used them, but try to use the 'n' versions when possible:
#include <stdio.h>
#include <string.h>

#define BUFLEN 4

void killme(void) {
    char buffer[BUFLEN+1] = "";

    /* Writes far beyond the end of buffer: undefined behaviour */
    strcpy(buffer, "This would be very bad news indeed");
}

void dontkillme(void) {
    char buffer[BUFLEN+1] = "";

    /* Copies at most BUFLEN bytes; buffer[BUFLEN] is still zero */
    strncpy(buffer, "This would be truncated", BUFLEN);
}
The functions strlen and strcmp are also often used.
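As a reminder of what those two do, here is a minimal sketch. The
function names comes_first and total_length are made up for this
illustration; only strlen and strcmp come from the library.

```c
#include <string.h>

/* Hypothetical helper: 1 if a sorts before b in strcmp order, else 0 */
int comes_first(const char *a, const char *b) {
    return strcmp(a, b) < 0;
}

/* Hypothetical helper: combined length of two strings,
 * not counting the terminating zeros */
size_t total_length(const char *a, const char *b) {
    return strlen(a) + strlen(b);
}
```

Note that strcmp returns zero when the strings are equal, so it should
be compared with zero rather than used directly as a truth value.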
A minor warning
You may have noticed that the buffer we passed to strncpy was one byte
longer than it apparently needed to be, and that we carefully
initialised it to all zeros. This is because strncpy fails in a rather
unhelpful way if the string to be copied is at least as long as the
length given by the third argument: it copies all N bytes, leaving a
string without the zero at the end. Thus, whilst it doesn't write
beyond the end of the buffer, it does leave an invalid string. We've
got round this by making buffer one byte larger than it needs to be
and ensuring that buffer[BUFLEN] is zero. Another approach would be to
write a wrapper function:
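#include <string.h>

/* Like strncpy, but always leaves a terminated string.
 * n is the full size of the buffer that to points at. */
char *mystrncpy(char *to, const char *from, size_t n) {
    if (n == 0)
        return to;
    strncpy(to, from, n);
    to[n-1] = '\0';
    return to;
}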
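A third possibility, sketched here on the assumption that a C99
library is available, is to let snprintf do the copy: it never writes
more than n bytes and always terminates the result when n is greater
than zero. The name copy_truncated is made up for this example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy from into to, where to has room for n bytes.
 * Returns 1 if the string had to be truncated and 0 if it fitted.
 */
int copy_truncated(char *to, size_t n, const char *from) {
    /* snprintf returns the length the full string would have needed */
    int needed = snprintf(to, n, "%s", from);

    return needed < 0 || (size_t) needed >= n;
}
```

Unlike strncpy, this also tells the caller whether anything was lost,
which is often worth reporting to the user.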
Mathematical functions: <math.h>
Linux users: you may need to specify -lm when you link.
We've all used sin() and cos(). Here's an example that answers the
age-old question: "How do I find the angle whose cosine is y/sqrt(x*x+y*y) and sine is x/sqrt(x*x+y*y)?"
#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979

int main(void) {
    double angle, sinbit, cosbit;

    /* NB, sinbit and cosbit don't need to be normalised */
    printf("sin and cos?\n");
    scanf("%lf %lf", &sinbit, &cosbit);
    printf("angle is %4f pi\n", atan2(sinbit, cosbit)/PI);
}
Unformatted Input/Output
What is unformatted I/O?
The printf() and scanf() functions we have used so far are
"intelligent" in that they look at the first argument, searching for
"%" characters to see what they should do. They are known as formatted
I/O. By contrast, unformatted I/O just reads and writes bytes without
looking at them.
Unformatted input allows us to read either a single byte or a line
of input. Unformatted output allows us to write either a single
char or a character string.
Unformatted I/O is a classic example of the 80/20 rule in that it's
possible to go to enormous lengths to handle ever less-likely
situations. Here we recommend a couple of basic strategies and in
particular give two simple, "good enough" functions that do most
of what we want and can be copied directly from the lecture notes.
Unformatted I/O can be far more flexible than formatted I/O, but with
flexibility comes complexity. As we shall see below, mixing formatted
and unformatted I/O requires some care due to their differing
attitudes towards white space. The unformatted I/O functions are
declared inside stdio.h.
Single characters
A single character can be written to stdout using putchar(c) or to a
file using putc(c, file), where file is the usual FILE *. Here is one
of the worst "Hello, world" programs you will have seen:
#include <stdio.h>

int main(void) {
    putchar('H');
    putchar('e');
    putchar('l');
    putchar('l');
    putchar('o');
    putchar(',');
    putc(' ', stdout);
    putc('w', stdout);
    putc('o', stdout);
    putc('r', stdout);
    putc('l', stdout);
    putc('d', stdout);
    putc('\n', stdout);
    return 0;
}
It simply prints out one character at a time.
Single character input
The corresponding input routines are getchar() and getc(file). These
return an int, not a char, because getchar() and getc(file) return EOF
if the input ends, just like scanf() etc. (Remember that a char is
just a single-byte integer, so any value a char can store can also be
stored by an int, but not vice versa.) Technically, the argument to
putchar() and the first argument to putc() are also ints.
Fabulously, C also includes ungetc(c, file), which puts a
character back into the input! It doesn't actually write it to the
file, but the next call to getc(file), fscanf(), etc.
will see that character just as if it had been in the file or typed at
the keyboard. We shall see an example of this later, but for the time
being note that it is only guaranteed to work once: only one character
of push-back is guaranteed between reads.
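The simplest use of ungetc() is to peek at the next character without
consuming it. The sketch below is only an illustration; the name peekc
is invented here and is not part of the standard library.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the next character from fp without
 * removing it from the input, or EOF if there is nothing left.
 */
int peekc(FILE *fp) {
    int c = getc(fp);   /* an int, so that EOF can be told apart */

    if (c != EOF)
        ungetc(c, fp);  /* push it back for the next reader */
    return c;
}
```

Calling peekc(stdin) twice in a row returns the same character both
times, because each call pushes back exactly the one character it read.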
Strings
The corresponding routines for reading and writing strings are
called fgets(string, maxbytes, fp), which reads at
most maxbytes-1 bytes (why?), and
fputs(string, fp). fgets() keeps the newline at the end of the line
in the string.
Don't use gets(): it doesn't have a maxbytes argument, so the string
can overrun the buffer.
There is also puts(string), which writes to stdout and adds a newline.
For a string without the % character fputs() and
printf() do the same thing, but when the string does contain a
% character fputs() will print it out literally
whereas printf() will expect the corresponding arguments and
will print them out instead.
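The answer to the "why?" is that fgets() needs the last byte of the
buffer for the terminating zero. Here is a minimal sketch of a
line-reading helper built on it; read_line is a made-up name, and
dropping the newline is a design choice for this example rather than
something fgets() does itself.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read one line into buf (which holds size bytes)
 * and remove the trailing newline if there is one.
 * Returns buf, or NULL at end of input.
 */
char *read_line(char *buf, int size, FILE *fp) {
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    /* strcspn gives the index of the first '\n', or of the final zero */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

If a line is longer than the buffer, the rest of it stays in the input
and is returned by the next call.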
Mixing formatted and unformatted input
Formatted and unformatted output coexist quite
well because they both just write out bytes to a file or to the
screen.
Formatted and unformatted input are more tricky
as formatted input skips over leading white space and unformatted input
does not. The following is typical of what happens when we try to
follow formatted input with unformatted. We are trying to read an
integer and then a line of free-form text:
#include <stdio.h>

#define LEN 200

/*
 * A doomed attempt to read an integer and then a line of text.
 */
int main(void) {
    char string[LEN];
    int i;

    printf("Please enter an integer\n");
    scanf("%i", &i);
    printf("I read %i\n", i);
    printf("Now type some stuff\n");
    fgets(string, LEN, stdin);
    printf("I read:>%s<\n", string);
    return 0;
}
At first sight this looks fine but the problem is that when we
type, say, '7' at the first prompt, we are not really typing '7', we
are typing "7\n". The call to scanf() reads the '7' but not the
newline. fgets() is thus presented with an empty
line of text, the leftovers from the call to scanf().
One "solution" is to just read in one line and discard it, but
this assumes that we know that fgets() is following a call to
scanf(). What if it's actually following another call
to fgets()? Then we will skip the line we want.
The situation is simpler in the common case where initial white
space can be ignored. Then we can just carry on reading
characters until we meet the first non-space. This skips over
any number of blank lines, including zero.
There's a problem here though: for us to do this we must read in
the first non-space character before we know when to stop. But if
we've already read it in the next call to fgets() won't see
it! This is where ungetc() comes in.
In the following example we skip blank lines by reading in
characters until we reach either EOF or a non-space character.
In the latter case we push that character back into the
input stream so that the next call to fgets() or
fscanf() will see it.
We discuss the include file ctype.h below; it has a number
of useful tests for letters, numbers, spaces, etc.
#include <stdio.h>
#include <ctype.h>

#define LEN 200

void skipspaces(FILE *fp);

void skipspaces(FILE *fp) {
    int i;

    while ((i = getc(fp)) != EOF)
        if ( ! isspace(i)) {
            ungetc(i, fp); /* Push it back */
            break;
        }
}

int main(void) {
    char string[LEN];
    int i;

    printf("Please enter an integer\n");
    scanf("%i", &i);
    printf("I read %i\n", i);
    printf("Now type some stuff (initial spaces will be ignored)\n");
    skipspaces(stdin);
    fgets(string, LEN, stdin);
    printf("I read:> %s", string);
    return 0;
}
Do I really need unformatted I/O? Handling errors
It's easy to jump into unformatted I/O simply because we don't
fully understand the capabilities of formatted I/O.
The scanf() function has an apparent weakness in that if
the program asks for an integer and we type in, say, 'w', a program
that simply calls scanf() again goes into an infinite loop.
This looks like a bug but it's what it's meant to do:
scanf() tells us that it failed (by its return value) and
leaves the unwanted character in the input stream for us to look at.
Mini discussion
Turn to your neighbour and discuss what would be a suitable course of
action in this situation for two possible cases:
- The program is reading from the keyboard.
- The program is reading from a file.
A simple solution
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

void skipline(int num);

int main() {
    int x, y;
    int got;

    while (1 == 1) {
        printf("Please enter two integers > 0\n");
        if ((got = scanf("%d %d", &x, &y)) != 2)
            skipline(got);
        else if ( x <= 0 || y <= 0 )
            printf("Only integers greater than zero are allowed\n");
        else
            break;
        printf("\n\tPlease try again.\n\n");
    }
    printf("Read: %d %d\n", x, y);
    return 0;
}

/*
 * Read and discard the rest of the line from stdin, printing it so
 * the user knows what's going on. If num == EOF exit
 */
void skipline(int num) {
    int i;

    if ( num == EOF ) {
        printf("End of standard input\n");
        exit(1);
    }

    /* Not EOF - skip the rest of the line */
    printf("\nSkipping unexpected input: ");
    while ((i = getchar()) != '\n') {
        if ( i == EOF ) {
            printf("End of standard input\n");
            exit(1);
        }
        putchar(i);
    }
    putchar('\n');
}
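A common alternative, sketched here rather than taken from the program
above, is to read a whole line with fgets() and then pick it apart
with sscanf(). A bad line is then discarded automatically because it
has already been read. The name parse_two_positive is invented for
this example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: look for two integers > 0 in a line of text.
 * Returns 1 and fills in *x and *y on success; returns 0 and leaves
 * them untouched otherwise.
 */
int parse_two_positive(const char *line, int *x, int *y) {
    int a, b;

    if (sscanf(line, "%d %d", &a, &b) != 2)
        return 0;               /* not two integers */
    if (a <= 0 || b <= 0)
        return 0;               /* integers, but out of range */
    *x = a;
    *y = b;
    return 1;
}
```

The caller would loop on fgets(line, sizeof line, stdin), stop when it
returns NULL, and ask again whenever parse_two_positive() returns 0.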
Character tests: <ctype.h>
If we need to know if a particular character is a letter (or a number,
or a space, or...) then <ctype.h> is our friend.
In the example below we use fgets() to read in a string with spaces
and putchar() to print a single character.
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/*
 * Print out letters from the input string and
 * print out all spaces as tabs
 * Note use of fgets to handle spaces
 */
#define MAXLEN 256

int main(void) {
    char string[MAXLEN], *p;

    printf("String?\n");
    if (fgets(string, MAXLEN, stdin) == NULL)
        return 1;

    /* Get rid of the '\n' at the end, if there is one */
    p = strchr(string, '\n');
    if (p != NULL)
        *p = '\0';

    /* The ctype functions expect an unsigned char value */
    for (p = string; *p; ++p)
        if (isalpha((unsigned char) *p))
            putchar(*p);
        else if (isspace((unsigned char) *p))
            putchar('\t');
    putchar('\n');
    return 0;
}
Result:
./ctype
String?
abc def
abc def
./ctype
String?
ab2cdef ;lo
abcdef lo
I.e., all the letters have been printed out, the spaces changed to
tabs and everything else ignored. Similarly, the isprint(c) function
returns non-zero if c is a printable character.
ctype.h also includes tolower(c), which takes a
character argument (technically an int) and returns
the corresponding lower-case letter, or its original value if c is
not an upper-case letter. No prizes for guessing what toupper(c)
does.
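As a small illustration of the case-conversion functions, here is a
sketch that converts a string to upper case in place. The function
name upcase is invented for this example.

```c
#include <ctype.h>

/*
 * Hypothetical helper: convert every lower-case letter in s to upper
 * case, leaving all other characters alone. Returns s.
 */
char *upcase(char *s) {
    char *p;

    for (p = s; *p; ++p)
        /* Cast first: the ctype functions expect an unsigned char value */
        *p = (char) toupper((unsigned char) *p);
    return s;
}
```

There is no need to test islower() first, since toupper() returns
anything that is not a lower-case letter unchanged.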
Strategy
We've touched on this before but as you come to write your
mini-project it becomes more important.
Strategy is driven by several things including the non-scalability of
the mental task and the difficulty of finding bugs. It's also driven
by the three rules of planning and design:
- We can't write code without being 100% sure what that code is going
to do. Sitting down at the keyboard to write a function whose
task is almost but not quite clear is a recipe for disaster.
- Our users won't know exactly what they want until
they try using it. The specification will change.
- Even if they do, we can be sure that new, unthought of requirements will
emerge after the program has been in use for a while.
Understand the task
from the perspective of the people who will be using the program.
This not only guides what the program does but also how we implement it.
Create a framework
At the beginning of a project we are not writing a computer program;
we are creating a framework that is easy to:
- Test
- Debug
- Modify and extend
right from the start.
- More time is spent maintaining existing programs than writing
new ones. New features always get added.
- The major problem is preventing, finding and fixing errors.
- Trying to "bolt on" testability, etc. afterwards is never very
effective - it must be designed in right from the beginning.
Structure our data
- Each type of "thing" will normally have its own type of
structure.
- Each actual thing will have a structure allocated for it and a
pointer to that structure will be passed from function to function.
- That structure tells us everything we need to
know about it, including its relationship to other things, usually via
other pointers.
- Once we get our data structure right, working out what
functions to write is usually quite easy.
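A minimal sketch of what this might look like in C is given below. The
types Team and Player and the function total_score are all
hypothetical, chosen only to illustrate one structure per kind of
thing with pointers recording the relationships between them.

```c
#include <stddef.h>

/* Hypothetical example: each type of "thing" gets its own structure */
typedef struct team {
    char name[32];
} Team;

typedef struct player {
    char name[32];
    int score;
    Team *team;             /* relationship to another kind of thing */
    struct player *next;    /* next player in the same team, or NULL */
} Player;

/* Add up the scores of a chain of players, following the pointers */
int total_score(const Player *p) {
    int total = 0;

    for (; p != NULL; p = p->next)
        total += p->score;
    return total;
}
```

With the data laid out like this, a function such as total_score()
almost writes itself, which is the point of getting the structures
right first.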
Structure our code
- We structure our data first, then our code.
- Two strategies:
- Top down: divide our task into sub-tasks, the sub-tasks into
smaller sub-tasks and so on until we get to something small
enough to do.
- Bottom up: build up a collection of utility "tool-box" routines, and
string them together to make increasingly higher-level
routines.
- Top-down implies that we know our requirement pretty
accurately and that it's unlikely to change (although the way
you implement it may). Bottom-up is pretty meaningless unless
we know what higher-level functions we are aiming at.
- It's a bit of a false dichotomy.
Types of program
- Traditional. Has a fixed task: read in data, do calculation,
output data, finish. Examples: payroll, movie rendering,
batch job scientific simulation.
- Modern. Event loop: initialise, wait for input, handle input,
wait for next input. Examples: word processor, web server,
real-time scientific simulation.
- Event loop suited to null program style: start off with program
that does nothing, add a tiny functionality, test, repeat.
This is a popular solution to the "Can't write a program until you know
what it's got to do, can't work out what it's got to do until you try it"
paradox.
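A null program in this style might start out something like the
sketch below. The names handle_input and event_loop, and the single
"quit" command, are assumptions made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical handler: deal with one line of input.
 * Returns 0 when the program should stop, 1 otherwise.
 */
int handle_input(const char *line) {
    if (strcmp(line, "quit\n") == 0 || strcmp(line, "quit") == 0)
        return 0;
    /* Nothing else is implemented yet: new commands get added here */
    return 1;
}

/*
 * Wait for input, handle it, and repeat until told to stop or the
 * input ends. Returns the number of lines handled.
 */
int event_loop(FILE *in) {
    char line[200];
    int handled = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        ++handled;
        if (!handle_input(line))
            break;
    }
    return handled;
}
```

A main() that just calls event_loop(stdin) already compiles, runs and
can be tested; each new feature is then one more case inside
handle_input().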
Splitting up your task
Remember from the previous notes:
- "Functions break large computing tasks into smaller ones, and
enable people to build on what others have done instead of starting
over from scratch." (Kernighan & Ritchie)
- "Functions should be short and sweet, and do just one thing...
and do that well." (Torvalds)
- Functions allow us to think about what we are trying to do,
not how we are doing it.
- Conversely, when writing a low-level utility function we only
need to think about what the function does and not the wider
context in which it is being called.
- Avoid side effects (do one thing..)
- The structure of our code reflects the structure of the task
we are trying to achieve: we should be able to describe in
words what the function does in terms of the task.
- Better criterion: the structure of our code reflects the
decisions we had to make when deciding what to do and how to do
it.
- If I had to change one aspect of the code (say, a particular
algorithm), could I just change one function or would there be a
large number of functions that needed changing?
- Could this function be reused in this program or even in
another program?
Smart people have simple code
- Keep It Simple, Stupid.
Plan to stop!
This would be far more than four weeks' work, of course.
It's very easy to go wild and have an over-ambitious plan. This is
made worse if we start by writing all the least important code first
and leave the guts of the problem until last. Instead, write the
basics first and then add optional extras as you have time. For
example, in a game of chess first write the basic game with the
players themselves deciding if a move is legal. Then add more and more
tests leaving the really obscure ones until the end. Even if you don't
finish, at least people will be able to play a game.
Advanced material: sorting and <stdlib.h>
As well as malloc, etc. <stdlib.h> includes a function, qsort, to sort
array elements. These can be arrays of any type (float, int, etc.) or
arrays of structures.
Since this can be a little confusing, we'll provide a working example.
Sorting an array of structures poses two problems:
- Given two structures, how does qsort know which should come first?
Answer: we have to write a function to tell it.
- More subtly: if we pass qsort a pointer to the first element of
the array, how does it know where the second, third, etc. elements are?
Answer: we tell it.
Sorting example
NB the boxes of code in this section are
a single C file that has been split into sections.
Let's imagine we have a list of foods and we want them sorted by calories,
with foods with identical calories listed alphabetically. Our
data structure looks like this
#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* for strcmp, used in the comparison function */

#define NAMELEN 256

typedef struct food {
    char name[NAMELEN];
    float calories;
} Food;
Now we declare the function that qsort will call whenever it needs to
know which order two items should be in. The prototype is:
int calories_or_alphabetical(const void *a, const void *b);
It must return an int which is negative or positive according to
whether the thing a points to should come before or after the thing
b points to in the list. If it turns out they should both occupy the
same place ("second equal") it returns zero.
The arguments are both void pointers because qsort
neither knows nor cares what they point to, and the word const
in front of them says that calories_or_alphabetical is
not allowed to modify the thing they point to.
OK, let's read in the data values.
int main(void) {
    Food *foods = NULL;
    int howmany = 0, todo;

    do {
        printf("How many foods (>= 1)?\n");
        if (scanf("%d", &howmany) != 1) {
            fprintf(stderr, "Sorry, that wasn't a number\n");
            exit(EXIT_FAILURE);
        }
    } while (howmany < 1);

    if ((foods = malloc(howmany * sizeof *foods)) == NULL) {
        fprintf(stderr, "Sorry you're too hungry for this Mac!\n");
        exit(EXIT_FAILURE);
    }

    for (todo = howmany -1; todo >= 0; --todo) {
        int readin;

        printf("Name and calories?\n");
        /* %255s stops a long name over-running name[NAMELEN] */
        readin = scanf("%255s %f", foods[todo].name, &foods[todo].calories);
        if (readin == EOF) {
            fprintf(stderr, "Unexpected end of input\n");
            exit(EXIT_FAILURE);
        }
        if (readin != 2) {
            fprintf(stderr, "Sorry, I couldn't understand that\n"
                    "Please make sure the food name has no spaces.\n");
            ++todo;
        }
    }
We use a do..while loop to check that we read in a positive number of
foods, and we also test the value returned by scanf to check we've
read the (space-free) name and calories properly.
Now we sort the array and print out the foods in calorie/alphabetical order:
    qsort(foods, howmany, sizeof *foods, calories_or_alphabetical);

    for (todo = howmany -1; todo >= 0; --todo)
        printf("%s %f\n", foods[todo].name, foods[todo].calories);
    return 0;
}
The first two arguments to qsort are quite simple: a pointer to the
start of the array (note it must be an array, not a linked list) and
the number of elements to be sorted. The third argument is the answer
to our second question above: it's the 'distance' (in bytes) between
the Nth and (N+1)th elements of the array, which tells qsort where the
second, third, fourth, etc. array elements are. Finally,
calories_or_alphabetical is just the name of our function.
Yes, you can pass the name of a function to another function.
The function itself looks like this:
/*
 * Return positive or negative or zero according to which food
 * should come first in the sorted list
 */
int calories_or_alphabetical(const void *a, const void *b) {
    const Food *fooda = a, *foodb = b;

    /* Compare rather than subtract: converting a float difference
     * to int would turn e.g. 0.5 into 0 ("equal") */
    if (fooda->calories != foodb->calories)
        return (foodb->calories > fooda->calories) ? 1 : -1;
    return strcmp(foodb->name, fooda->name);
}
Notice that the very first thing the function does is to turn the two
void * pointers into something it can use (which must be the same type
as the array qsort was called with, of course). Then it checks to see
if one food has more calories than the other. If they have the same
calories it checks for which comes first in the alphabet. Note that
our function calories_or_alphabetical can return any positive or
negative number it likes. All that counts is whether the value is
positive, negative or zero.
The results
INPUT:
How many foods (>= 1)?
5
Name and calories?
cream 300
Name and calories?
bigmac 300
Name and calories?
kentucky 500
Name and calories?
yoghurt 100
Name and calories?
apple 100
OUTPUT:
apple 100.000000
yoghurt 100.000000
bigmac 300.000000
cream 300.000000
kentucky 500.000000
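The same mechanism works for arrays of simple types. The sketch below
sorts an array of ints into ascending order; the names compare_ints
and sort_ints are invented for this example.

```c
#include <stdlib.h>

/*
 * Hypothetical comparison function for ints: negative, zero or
 * positive according to whether *a is less than, equal to or greater
 * than *b. Comparing rather than subtracting avoids integer overflow.
 */
int compare_ints(const void *a, const void *b) {
    int x = *(const int *) a, y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Sort n ints into ascending order */
void sort_ints(int *values, size_t n) {
    qsort(values, n, sizeof *values, compare_ints);
}
```

Here each argument to the comparison function is a pointer to an int,
because the array elements are ints.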
Sorting and pointers
The sort function always introduces another layer of pointing. In the
above example, we have an array of structures but each argument to
calories_or_alphabetical is a pointer to a structure (Food *).
Had we had an array of pointers to structures, each argument would be
a pointer to a pointer (Food **) and the start of
calories_or_alphabetical would have been:
int calories_or_alphabetical(const void *a, const void *b) {
    const Food *fooda = *(Food * const *) a, *foodb = *(Food * const *) b;
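To make the extra layer of pointing concrete, here is a complete
sketch that sorts an array of pointers to structures by name. The type
Item and the functions by_name and sort_items are hypothetical and
stand in for the Food example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, standing in for Food */
typedef struct item {
    char name[32];
} Item;

/*
 * The array elements are Item *, so qsort hands us pointers to
 * Item *, i.e. pointers to pointers.
 */
int by_name(const void *a, const void *b) {
    const Item *ia = *(Item * const *) a;
    const Item *ib = *(Item * const *) b;

    return strcmp(ia->name, ib->name);
}

/* Sort an array of n pointers into alphabetical order of name */
void sort_items(Item **items, size_t n) {
    qsort(items, n, sizeof *items, by_name);
}
```

Only the pointers are moved around; the structures themselves stay
where they are, which is cheaper when the structures are large.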
Sorting is important in its own right, but it has also introduced two
new concepts: passing a function name as an argument to another
function so that the second function can call it, and using a void *
pointer to pass a pointer to a structure of our own devising through a
library function to another function of ours. We conclude with a small
example of this:
#include <stdio.h>
#include <string.h>

typedef struct food {
    float val;
    char name[32];
} Food;

int alphabetical(const void *a, const void *b) {
    const Food *fooda = a, *foodb = b;

    fprintf(stderr, "%s %s\n", fooda->name, foodb->name);
    return strcmp(fooda->name, foodb->name);
}

void tryit(int fun(const void *, const void *), void *a, void *b) {
    printf("%d\n", fun(a, b));
}

int main(void) {
    Food a, b;

    strcpy(a.name, "Potato");
    strcpy(b.name, "Apple");
    tryit(alphabetical, &a, &b);
    return(0);
}
Notice my diagnostic inside alphabetical. This is to check I've got my
levels of pointers right. You would be very wise to do the same!