Physics and Astronomy |
Scientific programming in C: wrap-up

Unformatted input and output

Recap: formatted input

The scanf() family of functions we have used so far are intelligent functions for situations where we know what to expect (for example, an integer) and want to interpret the input accordingly. They are extremely convenient for reading in numbers as they skip over white space, including new lines. The user can leave any number of spaces between inputs, or even just put one per line.

This is referred to as formatted input as the function has to know the format of the data it is expecting (integer, floating-point number, text string without spaces, etc.). After it has read the expected characters, the system leaves itself positioned at the next character after the last one it has used, which is nearly always a space or a new-line, '\n'.

An example

Consider a file or keyboard input that starts:

1 2
Mary had a little lamb
...

As far as our program is concerned, it is as if it were a giant character string starting:

"1 2\nMary had a little lamb\n..."

If the code now executes the statement:

fscanf(infile, "%d %d", &j, &k);

the values 1 and 2 get read into j and k respectively, with fscanf() conveniently skipping over unwanted spaces and new-line characters. Having read in everything up to and including "2", the imaginary character string now contains:

"\nMary had a little lamb\n..."

You will notice the remaining "string" starts with a newline character, '\n'. This will be very important later on.

As long as we continue to use fscanf() to read in integers and other numbers everything will be fine, as all the unwanted spaces just get skipped over.

After a call to scanf() or fscanf(), the system is left looking at the next unread character, which is usually a space or a new line.

Reading in text without interpreting it

getc() reads in a single character

getc() requires a single argument which is a FILE* variable just like fscanf(). A few points worth noting:
ungetc() puts back a single character

See the following example and note that with all unformatted input and output functions the FILE argument comes last.

#include <stdio.h>

int main() {
    int mychar;

    while (1) {
        // Read a character from stdin
        mychar = getc(stdin);
        if (mychar == EOF)
            return 0;
        printf("I read: \"%c\"\n", mychar);

        // Put it back
        ungetc(mychar, stdin);

        // Read it again!
        mychar = getc(stdin);
        printf("Again I read: \"%c\"\n\n", mychar);
    }
}

fgets() reads a line of text

Quick discussion: In this section we shall be dealing with situations where it is fairly easy to describe what we want to happen but actually making it happen involves some irritating and potentially confusing details. Turn to your neighbour and ask:
Sometimes we just want to read a whole line of text into a character array; this is referred to as unformatted input. We don't expect it to have any special form such as a number. The most common reason is to be able to input text containing spaces, for example people's names or free-form text for a notebook application.

We shall first look at the mechanics of reading in the text and then deal with the subtleties of combining formatted and unformatted input.

The fgets() function reads a line from a file without interpreting it. That is, fgets() reads the unread input up to and including the next new-line character. The "file" can be the keyboard if stdin (standard input) is used. It has the form:

fgets(buffer, maxbytes, infile);

Here infile is a FILE *. It can be obtained in the usual way using fopen(), or we could use the predefined value stdin if we want to read from standard input. buffer is a character array at least maxbytes long. Like snprintf(), fgets() is "well behaved" and always puts a zero, '\0', at the end of the text even if the input line is too long, thus always leaving a valid zero-terminated string.

Notice that for unformatted input the FILE is the last argument, unlike in fscanf() where it is the first.

Thus fgets() reads in at most maxbytes - 1 bytes of actual input, or up to and including the next new-line character, whichever comes first. It returns NULL if the read failed completely, for example if we have reached the end of the input file.

fgets(buffer, maxbytes, infile); reads at most maxbytes - 1 characters from infile into the character array buffer, stopping at the end of the line.

Removing the new-line character

If space permits, fgets() includes the new-line character, sent when the user presses the "Return" or "Enter" key. When it is present it is always the final character of the string. If we don't want this character, we just need to replace it with '\0', thus shortening the string by one.
The following snippet calls fgets() to read a line from standard input and then checks whether the last character of the string is '\n'. If so, it replaces it with '\0'. If this were a program we were writing for other people to use we would need to consider what to do if the final character were not a new line, as it probably indicates that the line was too long for our buffer.

if (fgets(line, N, stdin) != NULL) {
    // Chop off the final '\n', guarding against an empty string
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
}
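As a minimal sketch of how the snippet above fits into working code, the function below reads one line and prints it back between angle brackets. The name echo_line(), the buffer size and the output text are assumptions made for illustration; they do not come from the notes.

```c
#include <stdio.h>
#include <string.h>

#define ECHO_BUFFER_SIZE 256   // buffer size chosen arbitrarily for this sketch

// Hypothetical helper: read one line from `in` and echo it to `out`.
// Returns 1 if a line was read, 0 if the input was exhausted.
int echo_line(FILE *in, FILE *out) {
    char line[ECHO_BUFFER_SIZE];

    if (fgets(line, ECHO_BUFFER_SIZE, in) == NULL)
        return 0;

    // Chop off the final '\n' if there is one
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';

    fprintf(out, "You typed: >%s<\n", line);
    return 1;
}
```

Calling echo_line(stdin, stdout) from main() gives the keyboard version. Passing the streams as arguments is only a design choice that makes the function easy to try out on a file.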
With this we can write a very short program that uses fgets() to read a line of text from the keyboard and print it out again.

Mixing formatted and unformatted input

Whether we have a file of data or are reading from the keyboard, we are always free to mix formatted and unformatted input. A common situation is to use formatted input to get options from a menu, or to read in data values, and to then need to read in a complete line of text. The first attempt often looks like this:

//
// Flawed attempt to read in an integer followed by some text.
//
#include <stdio.h>
#include <string.h>

#define N 256   // buffer size (value chosen for illustration)

int main() {
    char line[N] = "";
    int value;

    printf("Please enter the integer value\n");
    scanf("%d", &value);

    printf("Now please type in the text string, spaces are allowed!\n");
    if (fgets(line, N, stdin) != NULL) {
        // Chop off final '\n'
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n')
            line[len - 1] = '\0';
    }

    printf("The value is %d, the text is >%s<\n", value, line);
    return 0;
}

The "conversation" goes like this:

Please enter the integer value
12
Now please type in the text string, spaces are allowed!
The value is 12, the text is ><
The user is given no chance to type in a line of text; instead fgets() just seems to read a completely blank line. What's happening?

The answer is that when we typed in "12" we actually typed three characters: '1', '2' and the new-line character, '\n', sent by the "Return" key. As in the previous example, the system reads in the two characters '1' and '2' that form the integer 12 and leaves itself positioned at the very next character, which is the new line '\n'. Thus the "next line" is completely empty!

We can illustrate this by typing at the keyboard not just "12<return>" (three keystrokes) but "12 abc<return>" (seven keystrokes including the space between "12" and "abc"). The final line of output now looks like this:

The value is 12, the text is > abc<

A call to fscanf() followed by a call to fgets() will almost certainly result in a blank line.

Recap: scanf() reads the two characters '1' and '2' and then inspects the next character. Seeing that it is white space, which ends the number, scanf() leaves it unread for the next input function to deal with. If that next function is another call to scanf() that's fine, as scanf() skips over white space. But fgets() doesn't, so it just sees the new-line character, which it treats as being an empty line.

There are various bad solutions at this point (some people's first reaction is just to read in one more character in the hope that nobody will ever type 12<space><return>), but the most common situation is that we require the input line to be non-blank. In this case it's easy to write a loop that carries on reading a line from the file, or keyboard, until it finds a line that contains a non-space character. If we are reading from stdin it looks like this (it requires <stdio.h>, <string.h> and <ctype.h>):

int readoneline(char line[], int maxbytes) {
    while (1) {
        size_t i;

        if (fgets(line, maxbytes, stdin) == NULL) {
            return 0; // Out of data
        }

        // We don't want the new-line character so chop it
        i = strlen(line);
        if (i > 0 && line[i - 1] == '\n')
            line[i - 1] = '\0';

        // Look for a non-blank character.
        for (i = 0; line[i]; ++i)
            if (isspace((unsigned char) line[i]) == 0) {
                return 1;
            }
    }
}

This is quite a useful function.

One solution is to have a function to read in a non-blank line.

Optional further study
Command-line arguments

When running a program from the command line it's possible to specify command-line arguments:

myprog hello world

Here myprog is the program name, the first argument is the word hello and the second is world.

Accessing the command-line arguments

Command-line arguments are accessed by declaring main() as:

int main(int argc, char **argv) { ...

Or equivalently:

int main(int argc, char *argv[]) { ...

That is, argc (argument count) is an integer, and argv (argument vector) is an array (vector) of character strings.

argc is the number of character strings in argv and is one more than the number of command-line arguments.
argv[0] is the program name. This is usually the name of the file the program is stored in and is not under the control of the programmer.
argv[i] (i > 0) are the program arguments.
#include <stdio.h>

int main(int argc, char *argv[]) {
    // argv[0] is only guaranteed to exist when argc is at least 1
    if (argc > 0)
        printf("Welcome to \"%s\"\n", argv[0]);
    for (int i = 1; i < argc; ++i)
        printf("Argument %d: \"%s\"\n", i, argv[i]);
    return 0;
}
Converting arguments to numbers

Command-line arguments are always presented as character strings even if they are valid numbers. There are a number of ways to convert them to numbers; the easiest is probably to use sscanf():
#include <stdio.h>

int main(int argc, char *argv[]) {
    // argv[0] is only guaranteed to exist when argc is at least 1
    if (argc > 0)
        printf("Welcome to \"%s\"\n", argv[0]);
    for (int i = 1; i < argc; ++i) {
        float val;

        printf("Argument %d: \"%s\"\n", i, argv[i]);
        // sscanf() returns the number of values successfully converted
        if (sscanf(argv[i], "%g", &val) > 0)
            printf("\tthe value is %g\n", val);
    }
    return 0;
}
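One of the other ways to convert an argument is the standard library function strtod(), which also reports where the conversion stopped, so an argument such as "12abc" can be rejected where sscanf() with "%g" would accept the leading 12. The following is a minimal sketch; the helper name parse_double() and its return convention are assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

// Hypothetical helper (not from the notes): convert text to a double.
// Returns 1 and stores the result in *out on success, 0 otherwise.
int parse_double(const char *text, double *out) {
    char *end;
    double value;

    errno = 0;
    value = strtod(text, &end);

    if (end == text)        // nothing that looks like a number was found
        return 0;
    if (*end != '\0')       // trailing characters, as in "12abc"
        return 0;
    if (errno == ERANGE)    // the value does not fit in a double
        return 0;

    *out = value;
    return 1;
}
```

Inside the loop over argv, a call such as parse_double(argv[i], &val) with val declared as a double would take the place of the sscanf() test.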
Named constants and enumerations

Suppose we need to solve quadratic equations, and let's assume we have defined a structure to represent them. Their roots can be one of three types: two real roots, one repeated real root or two complex roots. It might be useful for our structure to be able to store the solutions and what type they are.

Obviously we can do the latter by adding a new member to the structure and setting it to '1' for one root, '2' for two real roots and '3' for two complex roots, but then we need to remember what '1', '2' and '3' mean. It's much better to give names to these constants.

C provides two ways of naming constants. We've met one already, #define, but C provides another way specifically designed for our situation, enumerations:

enum eqnstatus { EQN_UNSOLVED, EQN_ONEROOT, EQN_REALROOTS, EQN_COMPLEX_ROOTS };

Now anywhere in our program we can use the named constants EQN_UNSOLVED, EQN_ONEROOT, EQN_REALROOTS and EQN_COMPLEX_ROOTS to mean zero, one, two and three respectively:
enum eqnstatus { EQN_UNSOLVED, EQN_ONEROOT, EQN_REALROOTS,
                 EQN_COMPLEX_ROOTS };

int main() {
    enum eqnstatus eqn_status = EQN_UNSOLVED;
    // More code here ...
    return 0;
}
Enumerations give names to integers and are used to list (enumerate) different, mutually-exclusive choices.

Enumerations are integers and printing them out with %d just prints their integer values, but debuggers usually understand them. They can be combined with typedefs as in the example below, which also illustrates that enumerations and structures don't actually have to have a tag name if we don't want them to:

#include <stdio.h>

typedef enum { VANILLA, CHOCOLATE, STRAWBERRY } Flavour;

typedef struct {
    Flavour flavour;
    float fat;
    float sugar;
    float calories;
} Icecream;

int main() {
    Icecream icecream;

    icecream.flavour = CHOCOLATE;
    printf("%d\n", icecream.flavour); // Prints: 1
    return 0;
}

The convention is for the individual values to have names that are either all upper-case or have just the first letter capitalised (Vanilla, ...).

Non-default values

It's possible to write:

enum something { TYPE1, TYPE2=76, TYPE3, TYPE4 };

In which case TYPE1 has the value 0, TYPE2 has the value 76, TYPE3 has the value 77, etc., but it's very unusual to do so.

One class, several types

It's worth noting that we have created a single structure definition with an integer "type" variable rather than a list of identical structure types:
// Don't do this!
typedef struct {
    float fat;
    float sugar;
    float calories;
} Vanilla_Icecream;

typedef struct {
    float fat;
    float sugar;
    float calories;
} Chocolate_Icecream;

typedef struct {
    float fat;
    float sugar;
    float calories;
} Strawberry_Icecream;
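By contrast, with the single Icecream structure one array can hold every flavour and one function can process them all. A minimal sketch follows; total_calories() is a hypothetical helper introduced only for illustration.

```c
#include <stddef.h>

typedef enum { VANILLA, CHOCOLATE, STRAWBERRY } Flavour;

typedef struct {
    Flavour flavour;
    float fat;
    float sugar;
    float calories;
} Icecream;

// Hypothetical helper: add up the calories of a mixed collection.
// This works for any flavour because there is only one structure type.
float total_calories(const Icecream *items, size_t count) {
    float total = 0.0f;

    for (size_t i = 0; i < count; ++i)
        total += items[i].calories;
    return total;
}
```

With three separate structure types we would need three arrays and three nearly identical functions to do the same job.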
An occasionally-useful trick

We may want to have an array of Icecream structures, one for each flavour. To help with this we can use the useful trick of adding a dummy enumeration value at the end of the list:

typedef enum { VANILLA, CHOCOLATE, STRAWBERRY, NUM_FLAVOURS } Flavour;

typedef struct {
    Flavour flavour;
    float fat;
    float sugar;
    float calories;
} Icecream;

Icecream icecreams[NUM_FLAVOURS];

Now NUM_FLAVOURS has the value 3, and as we add new flavours its value will always be correct as long as we are careful to keep NUM_FLAVOURS last in the list:

typedef enum { VANILLA, CHOCOLATE, STRAWBERRY, TOFFEE, NUM_FLAVOURS } Flavour;

Adding new types or options

Enumerations provide a simple and neat way of adding new types or options, for example:

// Memristor is new in v2.0
typedef enum { Resister, Capacitor, Inductor, Memristor } Component;

Enumerations and the switch() statement

Enumerations are integers and they have a natural affinity with the switch() statement:

switch (person->gender) {
case female:
    ...
}

Most modern compilers can warn us if a switch() statement with an enumeration as its argument doesn't handle one of the possible cases, which can be very helpful: in the above example, when we added Memristor we would be warned of any switch() statements that don't handle it. Of course this can't be done with an if() .. else if() statement, which is another reason to use switch().

When to use enumerations

Enumerations are used for one thing only: when a variable is used to "enumerate" different, mutually-exclusive possibilities. For general named constants use #define.

Summary of advantages over #define
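One advantage discussed above, compiler checking of switch() statements, can be sketched as follows. The enumeration Part and the function part_name() are hypothetical names modelled on the Component example. The switch() deliberately has no default label, because with GCC and Clang the relevant warning (-Wswitch, enabled by -Wall) is only issued for a missing enumeration value when there is no default.

```c
// Hypothetical enumeration, modelled on the Component example above.
typedef enum { RESISTOR, CAPACITOR, INDUCTOR, MEMRISTOR } Part;

// Hypothetical helper: return a printable name for each enumeration value.
const char *part_name(Part part) {
    switch (part) {
    case RESISTOR:  return "resistor";
    case CAPACITOR: return "capacitor";
    case INDUCTOR:  return "inductor";
    case MEMRISTOR: return "memristor";
    // No default label, so the compiler can report a forgotten case.
    }
    return "unknown";   // reached only if part holds an out-of-range value
}
```

If a fifth value were added to Part and part_name() were not updated, a compiler with this warning enabled would point at the switch(). The same list of names written as #define constants would give it nothing to check.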
Sharing variables between functions

Relying too heavily on external variables is fraught with peril since it leads to programs whose data connections are not at all obvious - variables can be changed in unexpected and even inadvertent ways and the program is hard to modify.
Kernighan & Ritchie.

Global variables (to be used only if you really need them).

Sharing variables between functions in the same source file

Variables defined outside of a function are called external variables and can be used by any function in the file, provided only that the function definition follows the variable definition:

int somenumber = 7;

// myfun() can now use the variable somenumber:
void myfun(int i) {
    somenumber = i * i;
}

// so can main(), if it is inside the same file:
int main() {
    int k;

    k = somenumber + 6;
    // ...
    return 0;
}

Variables declared outside of a function are called external, or global, variables and can be accessed by any function.

We shall see in the next lecture that it is possible to specify that variables can be shared between any function in any file or just between functions in the same source file.

External variables should be used very sparingly

Why? Well, look at the comment by Kernighan & Ritchie at the top of the section. Contrast that with this:

x = fun1(arg1, arg2);

Here when we call function fun1 we can be confident that fun1 isn't going to change the values of its arguments. If instead of using arguments we had made arg1 and arg2 external, so the function call was just x = fun1(); we wouldn't know if arg1 and arg2 had the same values when fun1 returned.
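The hidden data connection can be made concrete with a small sketch. The variable scale and both functions are hypothetical and exist only to illustrate the contrast described above.

```c
// Hypothetical external variable shared by every function in the file.
double scale = 2.0;

// Uses, and silently changes, the external variable.
double scaled_global(double x) {
    double result = x * scale;

    scale *= 2.0;   // hidden side effect: invisible at the call site
    return result;
}

// Everything the function depends on is passed in as an argument.
double scaled_arg(double x, double factor) {
    return x * factor;
}
```

Two identical calls to scaled_global(3.0) return 6 and then 12, and nothing at the call site hints at why. scaled_arg(3.0, 2.0) returns 6 every time.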
External variables should be used very sparingly. If overused, external variables tie different routines up in knots.

Use global variables for situations when all of the below apply:

Good uses:

Never use global variables just to avoid function parameters.

Static variables inside functions

By default variables and arrays inside functions are automatic, that is they automatically come into existence when a function starts and are automatically destroyed when it returns. This also means that if a function calls itself then each instance of the function has its own copies of variables.

Static and external variables retain their values between calls to functions.

Very occasionally we may need to have a variable within a function keep its value between calls. The static keyword does this. Such variables are permanent and initialised before the start of the program. Unlike ordinary variables, if no explicit initial value is specified a value of zero (or NULL for pointers) is assumed:

// static demonstration
#include <stdio.h>
#include <math.h>

double addup(double k) {
    static int called;    // starts at zero
    static double sum;    // starts at zero

    sum += k;
    printf("k is %g new sum is %f\n", k, sum);
    printf("The function has been called %d times\n", ++called);
    return sum;
}

int main() {
    int i, max = 5;
    double mysum = 0.0;

    for (i = 1; i <= max; ++i)
        mysum = addup(sqrt(i));
    printf("The sum of the square roots of 1 to %d is %g\n", max, mysum);
    return 0;
}

Static and external variables are initialised once, when the program starts to run.

Also note that if addup() were to be called recursively all instances of the function would share the same variable sum.
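The difference between static and automatic variables can be seen in a smaller sketch. Both function names are hypothetical and chosen only for illustration.

```c
// Hypothetical helper: hands out 1, 2, 3, ... on successive calls.
int next_id(void) {
    static int last_id;   // implicitly zero, and kept between calls

    return ++last_id;
}

// The same code with an automatic variable: n is recreated on every
// call, so the function always returns 1.
int not_a_counter(void) {
    int n = 0;

    return ++n;
}
```

Unlike a static variable, the automatic variable n must be given its initial value explicitly, since an automatic variable without one holds an indeterminate value.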
Unlike ordinary variables, if no explicit initial value is specified, static and external variables are initialised to zero (or NULL for pointers).

Sorting, and pointers to functions

Suppose we wish to sort an array of integers. We can easily write a double for() loop:

#include <stdio.h>
#include <stdlib.h>

#define N 8

int main() {
    int x[N] = { 8, 6, 3, 5, 7, 1, 4, 2}, i, j, tmp;

    for (i = 0; i < N; ++i) {
        for (j = i + 1; j < N; ++j) {
            if (x[j] < x[i]) {
                // swap them
                tmp = x[i];
                x[i] = x[j];
                x[j] = tmp;
            }
        }
    }

    for (i = 0; i < N; ++i)
        printf("%d\n", x[i]);
    return 0;
}

This takes N*(N-1)/2 comparisons (and possible swaps), which is fine when N equals 8 but a problem when N equals ten thousand or one million.

Using a better algorithm

There are better sorting algorithms that take approximately N*log2(N) comparisons, which is clearly much quicker for large N. But how do we use them? In principle we could find the algorithm and program it ourselves, but it would be much easier if we could have a collection of "quick" library functions written by somebody else which we could call:

int x[N];
// Setup x
qsort_ints(x, N);

double x[N];
// Setup x
qsort_doubles(x, N);

But what happens if we wish to sort an array of structures? C's answer to this is to have a generalised function "qsort()" which can sort an array of anything. Like malloc(), etc., qsort() requires <stdlib.h>.

qsort() can sort arrays of anything
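As a minimal sketch of how this works: qsort() takes the array, the number of elements, the size of one element and a pointer to a comparison function that we supply. The structure Sample and all four function names below are hypothetical, introduced only for illustration.

```c
#include <stdlib.h>

// Hypothetical structure, used to show that qsort() is not tied to one type.
typedef struct {
    int id;
    double value;
} Sample;

// A comparison function receives pointers to two elements and returns a
// negative, zero or positive int according to their order.
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

static int compare_samples_by_value(const void *a, const void *b) {
    const Sample *x = a;
    const Sample *y = b;

    return (x->value > y->value) - (x->value < y->value);
}

// Sort an array of ints into ascending order.
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof values[0], compare_ints);
}

// Sort an array of structures into ascending order of their value member.
void sort_samples(Sample *samples, size_t count) {
    qsort(samples, count, sizeof samples[0], compare_samples_by_value);
}
```

The comparisons are written as (x > y) - (x < y) rather than x - y because a plain subtraction can overflow for ints and would be truncated for doubles.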
Operator | Description |
---|---|
& | Bitwise and (both bits are one) |
\| | Bitwise or (either or both bits are one) |
^ | Bitwise exclusive or (just one bit is one) |
<< | Left shift (multiplies by power of 2) |
>> | Right shift (divides by power of 2) |
~ | One's complement |

NB the one's complement operator takes a single argument, the others two.
Expression | Result (8 bits shown) |
---|---|
00110011 & 11110000 | 00110000 |
00110011 \| 11110000 | 11110011 |
00110011 ^ 11110000 | 11000011 |
10110011 << 2 | 11001100 |
11110000 >> 3 | 00011110 |
~11110000 | 00001111 |
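The table can be checked with a short sketch that applies each operator to 8-bit values. C has no standard way to print binary with printf(), so to_binary8() is a hypothetical helper written for this example, as are the one-line wrapper functions. The casts back to uint8_t discard any bits above the lowest eight, which is why 10110011 << 2 appears as 11001100.

```c
#include <stdint.h>

// Hypothetical helper: write the 8 bits of v into out as '0'/'1' text.
void to_binary8(uint8_t v, char out[9]) {
    for (int bit = 7; bit >= 0; --bit)
        out[7 - bit] = ((v >> bit) & 1) ? '1' : '0';
    out[8] = '\0';
}

// Each wrapper applies one operator and keeps only the low 8 bits.
uint8_t and8(uint8_t a, uint8_t b)  { return (uint8_t) (a & b); }
uint8_t or8(uint8_t a, uint8_t b)   { return (uint8_t) (a | b); }
uint8_t xor8(uint8_t a, uint8_t b)  { return (uint8_t) (a ^ b); }
uint8_t shl8(uint8_t a, unsigned n) { return (uint8_t) (a << n); }
uint8_t shr8(uint8_t a, unsigned n) { return (uint8_t) (a >> n); }
uint8_t not8(uint8_t a)             { return (uint8_t) ~a; }
```

Here 00110011 is 0x33, 11110000 is 0xF0 and 10110011 is 0xB3, so for example to_binary8(xor8(0x33, 0xF0), text) fills text with "11000011".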
#define OPTION1 0x01 //  1 binary 00000001
#define OPTION2 0x02 //  2 binary 00000010
#define OPTION3 0x04 //  4 binary 00000100
#define OPTION4 0x08 //  8 binary 00001000
#define OPTION5 0x10 // 16 binary 00010000
#define OPTION6 0x20 // 32 binary 00100000

The code sets, unsets and tests options as follows:

#include <stdio.h>

#define OPTION1 0x01 //  1 binary 00000001
#define OPTION2 0x02 //  2 binary 00000010
#define OPTION3 0x04 //  4 binary 00000100
#define OPTION4 0x08 //  8 binary 00001000
#define OPTION5 0x10 // 16 binary 00010000
#define OPTION6 0x20 // 32 binary 00100000

unsigned int flags;

int main(void) {
    flags |= OPTION3;   // Set OPTION3
    flags |= OPTION4;   // Set OPTION4
    flags &= ~OPTION4;  // Unset OPTION4

    if ( (flags & OPTION3) )
        printf("Option 3 is set\n");
    return 0;
}

The setting and testing of flags is fairly clear; the unsetting of OPTION4 is a little more complicated. OPTION4, like all options, has just one bit set, so ~OPTION4 has every bit except that one set. So flags &= ~OPTION4 has the following effect:
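As a hedged reconstruction of that effect, the comment at the top of the sketch below works through the bits, showing only the lowest eight and assuming flags holds OPTION3 and OPTION4 at that point, as it does in the program above. The three wrapper functions are hypothetical names for the set, unset and test idioms.

```c
//   flags              00001100   (OPTION3 and OPTION4 set)
//   ~OPTION4           11110111   (every bit except OPTION4's)
//   flags & ~OPTION4   00000100   (only OPTION3 remains)

#define OPTION3 0x04 //  4 binary 00000100
#define OPTION4 0x08 //  8 binary 00001000

// Hypothetical helper: return flags with the given option switched on.
unsigned int set_flag(unsigned int flags, unsigned int option) {
    return flags | option;
}

// Hypothetical helper: return flags with the given option switched off.
unsigned int clear_flag(unsigned int flags, unsigned int option) {
    return flags & ~option;
}

// Hypothetical helper: 1 if the option is on, 0 if it is off.
int test_flag(unsigned int flags, unsigned int option) {
    return (flags & option) != 0;
}
```

Clearing a bit that is already clear leaves flags unchanged, so clear_flag() is safe to call without testing first.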