![]() | Physics and Astronomy |
Back to top
Recursion: Functions that call themselvesTo iterate is human, to recurse divine.Computer Science saying
God is subtle but he is not malicious.
Recursion is an extremely simple concept: a function simply calls itself. Of course, a function can't always call itself, that would create an infinite loop, so all recursive functions have some sort of test before they call themselves. void myfun(some_args) { /* Do some stuff */ if ( some_test) myfun(some_other_args); } A variation on the above is to do the test inside the function: void myfun(some_args) { if (I_didnt_need_to_be_called) return; /* Do some stuff */ myfun(some_other_args); } Where it works this is neater (as we do the test only once). Try googling the word "recursion" Recursion can lead to subtle errors: either something gets treated twice or not at all, or our recursion goes into an infinite loop. The key is to understand the "test" inside the recursive function that decides whether or not it calls itself again. When functions call themselves each instance has its own local variables, unless they are static. Example 1: a factorial functionThe simplest example of recursion is the following somewhat pretentious factorial function: long factorial(int i) { if ( i > 0 ) return i * factorial(i - 1); return 1; } This illustrates the concepts of recursion: a function that calls itself together with a test to make sure the sequence of function calls stops at some point, and a local variable "i" which local to each invocation of factorial(): when factorial() calls itself each instance has its own copy of i. It also illustrates one possible pitfall: it's possible to use recursion just to show off. The above example would have been better written as a simple loop. Error checking and wrappersWe could add some simple argument error checking as follows: long factorial(int i) { if ( i > 0 ) return i * factorial(i - 1); else if (i < 0 ) { fprintf(stderr, "Factorial must be positive\n"); return -1; } return 1; } Another option would be to make the publicly-callable function a wrapper around the real function: // The actual factorial function long factorial_real(int i) { if ( i <= 1 ) return 1; return i * factorial_real(i - 1); } // Publically-callable front-end to the // factorial function with error checking long factorial(int i) { if (i < 0 ) { // Error check fprintf(stderr, "Factorial must be positive\n"); return -1; } return factorial_real(i); } This is slightly over the top in this case but is a useful illustration. Functions only callable from other functions within the same fileRecall that if we declare the following two variables outside of any function: int globalvar; static int thisfilevar; then globalvar is accessible from any function in any source file but thisfilevar is only available to functions inside the same source file. In the above code factorial_real() is clearly only meant to be called by factorial(). Whilst there's no exact way to acheive this with C we can restrict the function to be only callable by functions inside the same source file by again using the static qualifier for both the prototype and the actual function definition: static long factorial_real(int i); ... static long factorial_real(int i) { When recursion goes wrongThe biggest problem is the internal test or tests: either the recursion never stops or some possibilities get missed out.Imagine we had made a mistake in our previous rather over-the-top way of calculating a factorial: long factorial(int i) { if ( 1 > 0 ) // Bug! return i * factorial(i - 1); return 1; }You can see what it's meant to do but we've written '1' instead of 'i' so it will go on for ever. The first stage of debugging is just to print out its arguments: long factorial(int i) { fprintf(stderr, "i: %d\n", i); if ( 1 > 0 ) // Bug! return i * factorial(i - 1); return 1; }We can get even more details by adding an extra debugging argument to tell us the depth: long factorial(int i, int depth) { int d; for(d = 0; d <= depth; ++d) fprintf(stderr, " "); fprintf(stderr, "i: %d (depth %d)\n", i, depth); if ( depth > 12) { fprintf(stderr, "Too deep!\n"); abort(); } if ( 1 > 0 ) // Bug! return i * factorial(i - 1, depth + 1); return 1; }Here we have done several things:
Using a wrapperIt may be very inconvenient to add an extra argument to our routine because it is called elsewhere, there are header files, etc. In that case we just create a simple wrapper:long factorial(int i) { return factorial_debug(i, 0); } Be prepared to add some debugging to your recursive routine and also be prpared to add a "depth" parameter to the "real" recursive routine, perhaps with a simple publicly-callable wrapper routine. Example 2: geneology, printing people's descendantsConsider a simple structure to represent a person and their children: /* legal_parent defaults to bio_parent if the former is NULL */ typedef enum { bio_father, bio_mother, legal_father, legal_mother, parent_max } Parent_type; typedef enum { male, female, unknown_gender } Gender; typedef struct person { char *surname; char **forenames; struct person **children; int n_children; /* Number of children */ int childbuf_size; /* Size of children buffer */ struct person *parents[parent_max]; Gender gender; } Person; The childrenThe structure members: struct person **children; int n_children; /* Number of children */ int childbuf_size; /* Size of children buffer */ allow us to store a dynamically-sized array of that person's children. TreesThe above is an example of a tree (but see note below). Trees are quite common in programming because the concept they represesent "things that can have other things to an unlimited depth" is quite common in real life. Recursion is an essential tool when dealing with them. Printing a person's detailsWe can then define the following function to print out the details of one person: void printperson(Person *p) { int j; printf("%s,", p->surname); for(j = 0; p->forenames[j] != NULL; ++j) printf(" %s", p->forenames[j]); printf(" (%s), %d children\n", gender_name[p->gender], p->n_children); } To print a person and their descendantsIf we wished to print out a person and all of their descendants the algorithm would look like this:
The function and its public wrapper look like this: static void printdescendants_1(Person *p, int depth); // // Print a person and their descendants void printdescendants(Person *p) { printdescendants_1(p, 0); } // // Print a person and their descendants static void printdescendants_1(Person *p, int depth) { if (p == NULL) return; for (int sp = 0; sp < depth; ++sp) fputs(" ", stdout); printperson(p); for (int child = 0; child < p->n_children; ++child) printdescendants_1(p->children[child], depth + 1); } There are a couple of things to notice before the main part of the function. The first is that we check that the Person argument is not NULL. This is pure paranoia and hence is extremely wise. Second we print out some spaces before the name to indicate the depth. Then it's simple: print this person's details and call the whole function for each of this person's children, increasing the depth parameter by one. (There is an interesting issue here: since each person has two parents, if the original person has several generations of descendants it's likely that at some point two of them will have a child together and we will end up printing the same child twice. For the time being we shall assume this is what is required but we shall examine the issue later.) Example 3: flood fillA common siuation, with a number of variations, is when we have a rectangular grid of integer values, typically representing an image, and we wish to identify "islands", i.e. sets of neighbouring pixels with the same value. In the following example we wish to identify and number "islands" in an empty "sea" of zeros: The initial description#include <stdio.h> /* * Demonstrate a flood fill, using it to count and label non-empty * contiguous regions in an image. * * An image point that is empty has the value * zero, if it is non-empty but not yet assigned to a region * it has the value minus one. Pixels that have been assigned * to regions have the values one, two, three, etc. according * to the region they belong to. */ #define NX 10000 #define NY 10000 The flood fill function
/*
* Simple flood filler: set all contiguous spaces with value "-1"
* to the value "area_id".
*/
void floodfill(int image[NX][NY], int x, int y, int area_id) {
if (x < 0 || x >= NX || y < 0 || y >= NY || image[x][y] != -1)
return;
image[x][y] = area_id;
for (int ix = x-1; ix <= x+1; ++ix)
for (int iy = y-1; iy <= y+1; ++iy)
floodfill(image, ix, iy, area_id);
}
Notes
Had we reversed the last two steps then the function would have been forever calling itself. It would have done this even if we had put in an if() statement to avoid refilling itself: for (ix = x-1; ix <= x+1; ++ix) for (iy = y-1; iy <= y+1; ++iy) if ( ix != x || iy != y) floodfill(image, ix, iy, area_id); image[x][y] = area_id; // Too late!! The outer functionAll we need to do now is to go over each point in the image and call floodfill() for each point that needs it:
/*
* Count and label contiguous areas in an image.
* On input: 0 means empty space, -1 means non-empty
*
* On exit the contiguous areas have the values 1, 2, 3 etc
* and the number of areas is returned.
*/
int count_and_label(int image[NX][NY]) {
int area_id = 0;
for (int x = 0; x < NX; ++x)
for (int y = 0; y < NY; ++y)
if (image[x][y] == -1)
floodfill(image, x, y, ++area_id);
return area_id;
}
This will fill the first island/region with the value "1", the second with the value "2", etc. and return the total number of regions found. Example 4: counting a person's descendantsIn the above flood fill example, the act of labelling the point 1,2 3 etc. automatically marked the point as having been processed and enabled us to not do it twice. Furthermore, each point is only ever going to be handled once in the program. Sometimes this may not be true: we may need to explicitly mark something as having been handled, and we may need to process something more than once. To illustrate this, let's consider a variation of the above geneology example where we wish to count a person's descendants. As mentioned above, there is a subtle problem here: since each person has two parents, if the original person has several generations of descendants it's likely that at some point two of them will have a child together and we will end up counting the same child twice. Further more we may wish to count several people's descendants and of course any given person may need to be included in several people's counts (but only once each time!). The answer is to add a "flag" to the person structure to record whether that person has been dealt with. To use the flag we will need to use binary, bitwise operators Reminder: bitwise operators and flagsBitwise operators operate on the individual bits of a variable, rather than the arithmetic value of the whole variable as the arithmetic operators do. The operators
NB the One's complement operator takes a single argument, the others two. ExamplesWe take the example of an unsigned char (one byte, ie 8 bits). Although there is no way to write numbers as binary in C; the nearest is hexadecimal, but for clarity we shall show the calculation in binary.
Binary flagsAs mentioned above, the most common use of bitwise operators is to define flags, ie options that are either on or off. Typically there's a header file with various constants each of which is an exact power of 2 (ie has only one bit set): #define OPTION1 0x01 /* 1 binary 00000001 */ #define OPTION2 0x02 /* 2 binary 00000010 */ #define OPTION3 0x04 /* 4 binary 00000100 */ #define OPTION4 0x08 /* 8 binary 00001000 */ #define OPTION5 0x10 /* 16 binary 00010000 */ #define OPTION6 0x20 /* 32 binary 00100000 */ because these are powers of two they are unsuitable for enumerations. The code sets, unsets and tests options as follows: unsigned int flags; int main(void) { flags |= OPTION3; /* Set OPTION3 */ flags &= ~OPTION4; /* Unset OPTION4 */ if ( flags & OPTION3) printf("Option 3 is set\n"); } The setting and testing of flags is fairly clear, the unsetting of OPTION4 is a little more complicated: OPTION4, like all options, has just one bit set so ~OPTION4 has every bit except that one set. So flags &= ~OPTION4 has the following effect:
00111011 & ~00001000 ~00001000 = 11110111
00111011 Using binary flags in our exampleReturnin to our example, here's a typical include file defining a structure with some binary flags: typedef enum { male, female, unknown_gender } Gender; /* legal_parent defaults to bio_parent if the former is NULL */ typedef enum { bio_father, bio_mother, legal_father, legal_mother, parent_max } Parent_type; typedef struct person { char *surname; char **forenames; struct person **children; int n_children; /* Number of children */ int childbuf_size; /* Size of children buffer */ struct person *parents[parent_max]; Gender gender; unsigned int flags; } Person; // For flags #define ISDEAD 0x01 #define DONE 0x01000000 int countdescendants(Person *p); We modify our printing function so that it fist checks to see if the person has already been handled and if so immediately returns. It then larks the person as being handled. Since we assume we had to do some debugging we will add the ever-useful "depth" parameter and a (commented out) assert(): // // Count descendants, assuming "DONE" flag has already // been cleared static int countdescendants_1(Person *p, int depth) { int children; if (p == NULL || p->flags & DONE ) return 0; // assert(depth < 100); p->flags |= DONE; children = p->n_children; for (int child = 0; child < p->n_children; ++child) children += countdescendants_1(p->children[child], depth + 1); return children; } Cleaning the listAs it stands the code above will work just once because the second time around every person will have already been marked as already done. therefore before printing we must first go down the descendant tree clearing that bit: // // Clear the "DONE" flag from a person and their descendants static void markasclean(Person *p) { if (p == NULL) return; p->flags &= ~DONE; for (int child = 0; child < p->n_children; ++child) markasclean(p->children[child]); } Putting it all together#include <assert.h> #include <stdio.h> #include "peopledefs.h" static int countdescendants_1(Person *p, int depth); static void markasclean(Person *p); // // Count descendants, assuming "DONE" flag has already // been cleared static int countdescendants_1(Person *p, int depth) { int children; if (p == NULL || p->flags & DONE ) return 0; // assert(depth < 100); p->flags |= DONE; children = p->n_children; for (int child = 0; child < p->n_children; ++child) children += countdescendants_1(p->children[child], depth + 1); return children; } // // Clear the "DONE" flag from a person and their descendants static void markasclean(Person *p) { if (p == NULL) return; p->flags &= ~DONE; for (int child = 0; child < p->n_children; ++child) markasclean(p->children[child]); } // // (Finally, the "public" function) // Count a person's descendants int countdescendants(Person *p) { markasclean(p); return countdescendants_1(p, 0); } |