Appendix: Binary IO and handling bad input

The occasional appendices and optional examples in this module are for advanced material that you will not need for this module. They are intended for enthusiastic students who are interested in going further in programming for its own sake.

Handling bad input

This is an advanced topic and can be omitted if desired.

So far we have handled checking the values of numbers typed in at the keyboard by enclosing the call to scanf() inside an infinite loop, checking the values typed in and printing an error message if they are incorrect or breaking out of the loop if they are OK.

But you may already have encountered the situation where you have typed a non-numeric character by mistake, say 'q' instead of '1'. The problem is that in this situation scanf() leaves the input at the first character that doesn't match what it expects, the 'q'. The loop, if it's properly written, does not break so scanf() is called again and does exactly the same thing again: it stops at the 'q'. And so on for ever.

This might be thought to be an unhelpful response, but the question arises "what should the system do in this situation?".

Quick discussion

Turn to your neighbour and ask what would be the best thing to do in this situation.

One possible approach

The general idea here is to print a helpful message to the screen and skip the rest of the line. This only makes sense if we are reading from the keyboard; there is no point in doing it if we are reading from a file.

Even here there are a few subtleties, for example: what happens if we reach the end of the input? This could be because stdin is coming from a file not the keyboard, or because the terminal window has closed. Or the input may be coming over the network and the connection might break. Thankfully we can tell when this happens as scanf() returns the special value EOF, usually equal to -1 and certainly negative. Note the distinction: scanf() returns zero when there was data but the wrong sort, such as a letter for a format of %d, and EOF when there is no data at all.

With that in mind we can write an error handling function which we shall call skipline():

#include <stdio.h>
#include <stdlib.h>

void skipline(void);

int main() {
  int x, y;

  while (1 == 1) {
    printf("Please enter two integers > 0\n");

    if (scanf("%d %d", &x, &y) != 2)
      skipline();
    else if ( x <= 0 || y <= 0 )
      printf("Only integers greater than zero are allowed\n");
    else
      break;

    printf("\n\tPlease try again.\n\n");
  }
    
  printf("Read: %d %d\n", x, y);
  return 0;
}

//
// Read and discard the rest of the line from stdin, printing it so
// the user knows what's going on. 
//
void skipline(void) {
  int i;
  
  printf("\nSkipping unexpected input: ");

  while ((i = getchar()) != '\n') {
    if ( i == EOF ) {
      printf("End of standard input\n");
      exit(1);
    }    
    putchar(i);
  }

  putchar('\n');
}


getchar() and putchar()

getchar() reads a single character from standard input (there is a version getc(file) to read from a file).

If you look closely at the code above you will see it returns an int, not a char as we might expect. The reason is the one we mentioned above: we need a way of telling if we have run out of input. All the possible values of a char (including zero) are by definition possible successful values of getchar().

So on failure getchar() returns EOF which is not a possible char.

Example

If we want to read in a char variable, let's call it c, we might write:

char c;
int i;

if ((i = getchar()) == EOF ) {
  fprintf(stderr, "Out of data!\n");
  exit(99);
}
/* Else */
c = i;  /* Success */

Similarly putchar(int value) and putc(int value, FILE *) print a single character to stdout or a file respectively.

With that in mind skipline() should be reasonably clear. It's important to notice that we have put all of the nastiness of checking for the end of file inside skipline(); we have not left any of it for the calling function to handle.

Dealing with errors is always tedious and should be separated from the main logic of the code as far as possible.

Binary input and output of numerical data

Sometimes we require a program to be able to save data to a file but we know the file will only ever be read by another program. For example, long-running numerical simulations often periodically save their state so they can be stopped and restarted from their last saved state.

Considering the example of a two-dimensional matrix, the obvious way to do it is to use a double loop and fprintf():

 for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) 
      fprintf(datafile, "%g\n", x[m][n]);

and later read it in with:

 for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) 
      fscanf(datafile, "%lg", &y[m][n]);

However this is inefficient:

  • The program is having to take eight bytes of binary data, translate it to human-readable form and then translate it back again.
  • We are losing precision.
  • The file is bigger.

We can improve the precision, at the cost of a larger file size, by writing using a format such as "%.10g" which means "use ten significant digits". But in general the data we read in still won't be exactly the same as we wrote out and that's a problem.

Binary input and output

C gives us the ability to write the raw bytes to a file, in this case (assuming a true 2-D array) with a single function call:

 fwrite(x, sizeof x[0][0], N*M, datafile);

This means: write N*M chunks of data each of the size of x[0][0] from x into datafile.

The corresponding "read" call is:

fread(y, sizeof y[0][0], N*M, datafile);

Now y is identical to x as we have just copied the bytes to and from the disk.

The complete program

/*
 * Unformatted store and read of a matrix.
 */

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N 100
#define M 200

int main() {
  double x[M][N], y[M][N], error = 0.0;
  FILE *datafile;

  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) 
      x[m][n] = sin(0.01 * n + 0.0153 * m);

  // Write the data
  if ((datafile = fopen("Bindat.dat", "w")) == NULL ) {
    printf("Cannot create file\n");
    exit(1);
  }

  fwrite(x, sizeof x[0][0], N*M, datafile);
  fclose(datafile);

  // Now read it back in
  if ((datafile = fopen("Bindat.dat", "r")) == NULL ) {
    printf("Cannot read file\n");
    exit(1);
  }

  fread(y, sizeof y[0][0], N*M, datafile);
  fclose(datafile);

  // Now compare
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) 
      error += fabs(y[m][n] - x[m][n]);

   printf("Total error is: %g\n", error);
   return 0;
}

The error is zero (precisely) as we have read back exactly the same bytes as we wrote.

Two gotchas

Partial and/or pseudo-arrays

The "all in one go" approach above writes the whole array, but in some circumstances we may only be using a part of the whole array. For example we may be going from x[0][0] to x[m-1][n-1] where m<=M and n<=N. The solution is to write each row separately using a loop:

for (int j = 0; j < m; ++j)
  fwrite(x[j], sizeof x[j][0], n, datafile);

Alternatively, suppose the x[m][n] were not part of a true array but a dynamically-allocated pseudo-array using malloc():

double **x;

x = xmalloc(m * sizeof *x);
for (int j = 0; j < m; ++j)
  x[j] = xmalloc(n * sizeof *x[j]);

Then x is a (dynamically allocated) array of pointers, not a 2-D array of doubles. The above code, writing each row separately, works in this case as well.

MS Windows

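For horrible historical reasons MS Windows distinguishes between text and binary files, so it is recommended you open binary files with "wb" and "rb". On other systems the "b" is accepted and has no effect, so it is safe to use everywhere:

fopen("Bindat.dat", "wb") // the "b" matters on MS Windows and is harmless elsewhere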