
Dealing with text

Introduction

So far we have been dealing with character strings (or just "strings" for short) enclosed inside double quotes ("") without really saying what they are.

As well as the fixed strings we have been dealing with, C allows us to deal with individual characters as well as modifiable (or writeable) strings via arrays of characters. C also provides a range of standard routines for inspecting and modifying characters and strings.

This lecture just deals with strings using the English character set. Should you wish to write a multi-lingual program you will need to investigate Unicode and in particular UTF-8.

Single characters

C denotes single characters using single quotes, for example:

  char mychar = 'a';

C denotes single characters using single quotes.

A char is a one-byte integer

Since the data in a computer's memory consist of zeros and ones, not letters, C stores characters by assigning a one-byte integer value to each character. When a character value is stored in memory it is this one-byte integer that is written. When library routines such as printf() read this one-byte integer from memory and are instructed to interpret it as a character they translate the integer value back to the appropriate printable character. Therefore, technically a char is a one-byte integer.

Even though internally C treats characters as a type of integer one byte long, we should never rely on any particular character having any particular integer value.

A char is a one-byte integer

The format for a single character is %c

Single characters have the "%c" format specifier to printf() and scanf(). %c is unusual in that on input it does not skip white space by default. To read in a single character skipping white space put a space before the %c:

Note the space in " %c".
  char mychar;

  printf("Please enter a single character, spaces will be ignored\n");
  scanf(" %c", &mychar); // Note blank space before %c

The format for a single character is %c and to read in a single character skipping white space put a space before the %c (you nearly always want to do this).

Avoiding a problem

If we start a program and type "7<return>8<return>" then instinctively we think we have entered two characters ('7' and '8'). But actually we have entered four characters: "7\n8\n". When reading in a number scanf() will always skip white space (such as the newline in this input), but when reading in a character we get to choose whether we want the '\n' or to skip over it. We nearly always want the latter.

But in the following code the second call to scanf() specifically asks to read in a character without skipping white space:

    scanf("%d", &intvar);
    scanf("%c", &charvar); // Doesn't skip spaces
  

So intvar will have the value 7 as expected but charvar will have the value '\n'. The following code skips spaces and charvar will have the value '8' as expected.

    scanf("%d", &intvar);
    scanf(" %c", &charvar); // Skips spaces
  
  1. Use scanf("%c") after another call to scanf()
  2. To demonstrate the problem
  3. Create a new on-line program in a new window, with a suitable title and opening comment.
  4. Declare an integer variable at the top of your program
  5. Read in its value using scanf() before you ask the user to enter a character. (This sets up the problem.)
  6. Declare a char variable.
  7. Ask the user to enter a letter.
  8. Then read in the value of the character using the " %c" format (note: with a space before the %).
  9. Print out this character using ">%c<" That is: if the character was 'A' the output should be ">A<".
  10. Build & run. Check the output is correct.
  11. Now change the format from " %c" to "%c" without a space.
  12. Build & run and type in the same input as before. You should find the value of your char value is newline, not 'A'.
  13. Fix your program and check it's OK.

Don't rely on ASCII

In practice just about every modern machine uses a mapping called ASCII to decide which one-byte integer value represents which character. In ASCII the integer value that is used internally to represent the character 'a' is 97. This can lead to code like this:

If we need to know which integer the computer uses to represent a character we are doing something wrong.

  char mychar = 97; // Bad!

This is legal but has two problems:

  • It's not guaranteed. Nowhere does the C standard say compilers have to use this value.

Have we mentioned the megaprinciple?

  • It's not clear. Use the following instead:
  char mychar = 'a'; // Good!

Similarly:

  if ( mychar == 97) {  // BAD!!
    ...
  }
  if ( mychar == 'a') {  // Use this instead
    ...
  }

With this warning we shall use the ASCII character to integer encoding throughout these notes on the clear understanding that our code must never depend on it. The only things guaranteed are that

  1. No character is represented by the value zero, as this is used to indicate the end of a string of characters.
  2.  The integer of '1' is the integer value of '0' plus one,  and so on up to the integer value of '9' being the integer value of '8' plus one.

Special characters

As with character strings, single characters that are hard to represent inside single quotes are specified using the backslash, for example:


  // Some special characters within single quotes
  char quote = '\'';
  char newline = '\n';
  char backslash = '\\';

Decimal digits

One mistake to avoid is to confuse the integer value used to internally represent a digit character with the integer whose value is that digit. For example the character '9' has the ASCII value 57 (the byte that gets stored in the computer's memory), whereas the ASCII value 9 is used to represent the tab character.

Character strings

With its typical economy and simplicity C does not define a separate "string" type, instead C treats character strings as arrays of characters.

Character strings are just arrays of characters.

Marking the end of the string

  • An array is a series of objects of the same type stored sequentially in the computer's memory.
  • There is no array bound checking, we are responsible for this ourselves.
  • If we go a bit over the end of an array we will change the values of whatever other variables happen to be stored there; if we go a lot over the end we are likely to crash the entire program.

Suppose we have a string (character array) containing the phrase "Hello, world". The first byte is 72 ('H'), the second 101 ('e'), etc. This raises the question: "how do we know when we have got to the end?".

Remember, the computer's memory goes on way past the end of any particular array. The address of a variable tells us where the memory allocated for that array starts but not where it ends.

The answer, as alluded to above, is that C appends a stop-byte of value zero to the end of every character string. Thus for example the string "abc" takes up four bytes of storage, not three, the fourth byte being zero.

The ASCII value for the character '0' is forty-eight, not zero.

The integer constant zero can obviously be written simply as just 0, but when used in a character context it is conventional to write it in the form '\0'. This is another example of using the backslash inside a character or string constant to indicate a special character, in this case a literal integer value. But either will work.

All character strings are terminated by a (hidden) zero (\0) to mark their end.

Declaring arrays of characters

If we want character strings we can modify, we can declare them like any other array. As a further convenience we can use a string such as "abc" as an initialiser instead of { 'a', 'b', 'c', '\0' } (remember, the string requires four bytes because of the final zero). A modifiable string is just an array of chars, allowing one extra for the terminating zero.

The following example declares two four byte character arrays with one initialised and one left uninitialised:

  char str[] = "abc", string2[4];

Each can be used to store a three-character string, plus the stop-byte of zero, and can be modified.

Character arrays can be initialised with "string".

Like all arrays we cannot just write string2 = str but there is a function called strncpy() to do it for us (see below).

Printing strings to the screen and reading from the keyboard

Strings take the %s format.

Strings have the "%s" format specifier to printf() and scanf(). For scanf() "%s" can only be used to read in character strings without spaces. We will see how to deal with this limitation in a later lecture, but here is a simple example of reading in a space-free name:


#include <stdio.h>
#define N 100
int main() {
  char name[N];

  printf("Please enter your name (no spaces)\n");
  scanf("%s", name); // NB: the name of the array is its address
  printf("Hello \"%s\"\n", name);

  return 0;
}


It's worth pointing out that in the call to scanf() we just had name, not &name. This follows immediately from the rule:

The name of an array is a shorthand for the address of its first element (in both value and type).

and is one of the few times we do not need an & in scanf().

Three warnings when using %s with scanf()

scanf("%s", str) cannot be used to read strings with spaces

Very Bad Things will happen if scanf("%s", str) is given a string longer than the length of the receiving array.

There is no & in scanf("%s", str)

The string length problem is particularly serious when we do not trust the people providing the input, for example in anything which takes input from the Internet. We shall see how to deal with this below.

  1. Reading and writing a string
  2. To practice reading and writing strings without spaces using %s.
  3. Rename your previous "reading a character" mini-exercise
  4. Declare a character array at the top of main(). Make it larger than anything you may want to type as input!
  5. After you have read in the single character, print a message asking for some text with no spaces.
  6. Read in the string using scanf() and %s.
  7. Print out the string using printf() and %s.
  8. Build & run. Check the output is correct.
  9. As a check see what happens when you enter a character string that does contain spaces, such as "Hello world".

Looping over character arrays

Since character arrays are (by definition!) arrays of characters we can access an individual character as string[j]. We can loop over an entire string with a for() loop, stopping when we reach a '\0' character:

    for(int j = 0; string[j] != '\0'; ++j) {
      // Do something with string[j]
    }
  
  1. Looping over an array.
  2. Practice looping over an array and accessing individual characters.
  3. This mini-exercise will use both your input character and the input string to count the number of times the character occurs in the string.
  4. In your previous mini-exercise declare an int variable that will become equal to the number of times your letter occurs in your string.
  5. Write a loop to go over our string as above. Initially just print out the value of the character to check your loop is OK, eg printf("%c\n", string[j]);
  6. Build & run. Check the output is correct.
  7. Now remove the printf() statement and replace it with an if() statement that checks to see if string[j] is equal to the input character and if so increases the count by one.
    • Remember: chars are one-byte integers so it's perfectly OK to write if (char1 == char2)
  8. After the loop print out the count of the number of matching characters.
  9. Build & run. Check the output is correct. Make sure that the count is correct.

Fixed character strings

We have already used fixed character strings enclosed within double quotes as arguments to printf(), scanf(), etc.

This is the only situation where C allows arrays without a name ("anonymous arrays")

When the compiler encounters a fixed string such as "Hello, world\n", it creates a non-modifiable array of one-byte integers with the appropriate values and passes the address of that array to printf() in exactly the same way as any other array. It's even possible, although rather unusual, to use a fixed string with the [index] notation as in the example below.

A fixed character string ("Hello") is a non-modifiable array of chars, including the terminating zero.

Example

The following somewhat strange example illustrates that a character string is just a fixed array:


#include <stdio.h>
int main() {
  int let;
  
  while (scanf("%d", &let) == 1 && let > 0 && let <= 26 )
    printf("%c\n", "_ABCDEFGHIJKLMNOPQRSTUVWXYZ"[let]);

  return 0;
}


This works because when the compiler encounters "string"[j] it stores the string in the computer's memory and replaces "string"[j] by char@(location_of_string + j), that is, the char stored j bytes after the start of the string.

Passing fixed strings to functions

As a fixed string is just an array of one-byte integers it can be passed to a function (that is, its address can be passed to a function) just like any other array. There is one qualification however; the function must not try to modify the string!

On my machine the following code produces a segmentation fault (SIGSEGV):


#include <stdio.h>
//
// Demonstrate problem trying to modify a fixed string
//
void myfunction (char message[]);

void myfunction (char message[]) {
  printf("The original message was: %s\n", message);

  message[0] = 'A';
  printf("The new message is: %s\n", message);
}

int main() {
  myfunction("Hello, world");   // Subtle error!
  return 0;
}


The const qualifier

When a function takes (the address of) an array as an argument but guarantees not to modify it, this can be indicated by adding the qualifier const (short for constant) in front of the argument declaration.

The following code demonstrates a simple function to read in an integer within a specified range with a helpful message. Since the message is only printed, not modified, we declare it as const.


#include <stdio.h>
//
// Read an integer in the range min to max inclusive
//
int getint(const char message[], int min, int max);


int getint(const char message[], int min, int max) {
  while (1 == 1) {
    int input;

    printf("%s\n (Please enter a number from %d to %d)\n\n",
     message, min, max);
    scanf("%d", &input);

    if ( input < min)
      printf("Sorry, the minimum possible value is %d. ", min);
    else if ( input > max)
      printf("Sorry, the maximum possible value is %d. ", max);
    else 
      return input;

    printf("Please try again\n\n");
  }

}

int main() {
  int value;

  value = getint("Pick a number!", 1, 10);
  printf("You chose: %d\n", value);
  return 0;
}

The const qualifier can be used for arrays of any type, not just arrays of characters. We have added the const qualifier to the function's prototype as well, since the whole point about prototypes and function definitions is that they agree.

Another example of const

We might wish to write a scalar product function which takes (the addresses of) two arrays as arguments and returns their scalar (dot) product. A first attempt at its prototype might be:

double dotprod(double v1[], double v2[], int n);

Whilst this is OK(ish!) it leaves open the possibility that dotprod() might try to modify either or both of the vectors v1 and/or v2. A better prototype would be:

double dotprod(const double v1[], const double v2[], int n);

which makes it clear that neither vector will be modified. (Of course the actual code of the function must be updated as the function and its prototype must always agree.)

Three things to look out for

'a' != "a"

The first is a single character, the second is the address of a two-byte quantity, the first byte having the value 'a', the second zero or '\0'.

The format for a single character is "%c", that of a string "%s"

'0' != '\0', '1' != 1

In both cases the left-hand value is the integer code for that character (probably 48 and 49 respectively), the right-hand value is the integer zero or one.

 Strings require one byte more to store than their length, arrays can store one fewer characters than their size

This is used to store the stop-byte, integer zero or '\0'.

Utility functions for characters and strings

C provides a number of useful functions for handling characters and strings.

Character-based functions: <ctype.h>

The include file <ctype.h> ("Character TYPE .h") provides a number of useful functions to test, and in two cases below convert, the type of a character. All take a single argument which is a character (not a character string).

"is...()" tests

Useful <ctype.h> functions
Function    Description                                 Example (True)                 Example (False)
isalpha()   Alphabetic (letter: does not include '_')   isalpha('A')                   isalpha('7'), isalpha('_')
isupper()   Upper case letter                           isupper('A')                   isupper('a'), isupper('?')
islower()   Lower case letter                           islower('a')                   islower('A'), islower('?')
isdigit()   Decimal digit                               isdigit('4')                   isdigit('B')
isalnum()   Letter or decimal digit                     isalnum('m'), isalnum('9')     isalnum('?')
isspace()   White space (space, tab, newline, ...)      isspace(' '), isspace('\t')    isspace('G')
ispunct()   Punctuation                                 ispunct('?')                   ispunct('w')

Example

The following function looks at a character string to count the number of letters, digits, punctuation characters and spaces.


#include <stdio.h>
#include <ctype.h>

//
// Count the number of letters, digits, punctuation
// characters and spaces in a well-known phrase
//
void countem(const char string[]) {
  int alphas = 0, digits = 0, spaces = 0, punct = 0, others = 0;
  int i;

  for(i = 0; string[i]; ++i) {
    if (isalpha(string[i]))
      ++alphas;
    else if (isdigit(string[i]))
      ++digits;
    else if (isspace(string[i]))
      ++spaces;
    else if (ispunct(string[i]))
      ++punct;
    else 
      ++others;
  }

  printf("The string: %s\n"
   "contains %d letters, %d digits, %d spaces,\n"
   "%d punctuation characters and %d other characters\n",
   string, alphas, digits, spaces, punct, others);
}

int main() {
  char hello[] = "Hello, world";
  countem(hello);
  return 0;
}


You will note that the character string is declared as "const char" as the string itself is not modified. Here we have passed a (modifiable) character array but we could have just as easily passed a fixed string.

The output is:

The string: Hello, world
contains 10 letters, 0 digits, 1 spaces,
1 punctuation characters and 0 other characters

toupper() and tolower()

The functions toupper() and tolower(), also defined in ctype.h, also take a single character as an argument and return it converted to upper or lower case. Usefully, if the argument is a non-letter, or is already the correct case, they just return the argument.

The following function translates its argument string to upper case.


#include <stdio.h>
#include <ctype.h>

//
// Convert a string to UPPER CASE
//
void shout(char string[]) {
  int i;

  for(i = 0; string[i] != '\0'; ++i) {
    string[i] = toupper(string[i]);
  }
}

int main() {
  char helloworld[16] = "Hello, world";

  printf("The  original  string is: %s\n", helloworld);
  shout(helloworld);
  printf("The upper-case string is: %s\n", helloworld);
 
  return 0;
}


Notice that we have dropped the "const" qualifier to the argument to shout() and we have had to copy our favourite phrase into a (modifiable) character array; calling shout() with a fixed string could have led to a segmentation fault.

The output is:

The  original  string is: Hello, world
The upper-case string is: HELLO, WORLD

Notice how the 'H' and ',' are unchanged.

#include <ctype.h> has various character tests as well as toupper() and tolower().

String utility functions: <string.h>

The file <string.h> provides a number of functions for dealing with strings, not individual characters. In the table below, all strings are guaranteed to be unmodified ("const") except those called dest.

Useful <string.h> functions
Function                                Description
size_t strlen(string)                   String length, eg: strlen("abc") == 3
char * strncpy(dest, source, nbytes)    String copy (see example and warning)
char * strncat(dest, source, nbytes)    Concatenate two strings
int strcmp(string1, string2)            Compare two strings
char * strchr(string, char)             Find char inside string, returns NULL if not found.
char * strstr(string1, string2)         Find string2 inside string1, returns NULL if not found.

strlen()

Note that strlen() returns the length of the string, not the size of the array it is stored in. For example, the following code will print the value three, not twelve:

  char string[12] = "abc";

  // strlen() returns a size_t, for which the format is %zu
  printf("The length of \"%s\" is %zu\n", string, strlen(string));

Note too how we have used the backslash to put a double-quote character inside a quoted string.

strlen(string) returns the length of the string

strncpy()

The strncpy(destination, source, n) function copies at most n bytes of the source string to the destination. The syntax is supposed to be reminiscent of the not-allowed destination = source with a sanity check. It returns the address of the destination string which is occasionally useful if we want to use this as the argument to another function, but not often.

strncpy() warning

strncpy() has a subtle flaw: although it correctly refuses to write over the end of the destination array (good), in the case that the length of the source is n or more the final character written to the destination array is not zero, it is the nth byte of the source. This means that the destination does not have a terminating zero and hence anything that tries to treat it as a C string will run off the end.

One solution is to make the destination array one byte "too big" and initialise it all to zeros. That way the final byte of the destination is always zero, as in this example:


//
// Safer way to use strncpy() to handle overflows
//
#include <stdio.h>
#include <string.h>

#define N 8
int main() {
  char buffer[N+1] = ""; // One byte longer and initialised to all zeros 

  strncpy(buffer, "Hello, world", N);

  printf("The truncated string is: %s\n", buffer);
  return 0;
}

It's worth noting that we have used the fact that when we initialise an array with fewer elements than it needs the rest are all set to zero.

Of course, another way to solve the strncpy() problem is to always call strncpy() with argument N-1, or to write a "wrapper" function called, say mystrncpy(), that always writes a zero to the final byte:

The type "size_t" here means something like "a type of integer large enough to hold the length of a really long string"


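#include <string.h>
char *mystrncpy(char *dest, const char *src, size_t n) {
  if (n == 0)          // No room for anything, not even the zero
    return dest;

  strncpy(dest, src, n);
  dest[n-1] = '\0';    // Always finish with a terminating zero

  return dest;
}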

strncpy() rather unsafely copies a string

strncat()

The strncat() function catenates (joins) the contents of the second argument onto the end of the first, the first character of the second argument over-writing the zero at the end of the first. strncat() does not have the same flaw as strncpy(): it always writes a terminating zero after the characters it appends. It should be remembered, however, that the final argument is the maximum number of characters to be appended, not the maximum final length of the resultant string, and that the terminating zero is written in addition to them, so we must leave one byte spare for it. Try this (assuming N is large enough for the initial string):

  char dest[N] = "And I think to myself, ";
  ...
  strncat(dest, "Hello, world", N - strlen(dest) - 1); // -1 leaves room for the '\0'

Both strncpy() and strncat() have versions without the 'n' and the final maximum-length argument, strcpy() and strcat(). These are best avoided.

Joining them together

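Both strncpy() and strncat() return their first argument, so for example we may use one as the argument to another (this assumes N is large enough that strncpy() does not truncate the text, otherwise the missing terminating zero would cause problems for strncat()):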

  char text[N];
  strncat(strncpy(text, "Hello,", N/2), " world", N/2);

strcmp() and strstr()

The strcmp(str1, str2) function returns the value zero if str1 and str2 are the same, a (strictly) negative integer if str1 comes before str2 in the alphabet and a (strictly) positive integer if str1 comes after str2 in the alphabet. The classic use is to check for equality by seeing if strcmp() returns zero.

As always with arrays we cannot test for string equality using str1 == str2 as this tests to see if the addresses of the two strings are the same, i.e. they both refer to the same character array, rather than two character arrays which contain the same string.

The function strstr(str1, str2) tells us if str1 contains the string str2. (The return value of strstr() can also tell us where in str1 str2 appears.)

Here is an example:


#include <stdio.h>
#include <string.h>

/*
 * Simple demo of strcmp() and strstr() 
 */
int main() {
  char str1[] = "Hello, world", str2[] = "world";

  if (strcmp(str1, str2) == 0)
    printf("The strings \"%s\" and \"%s\" are the same\n", str1, str2);
  else
    printf("The strings \"%s\" and \"%s\" are different\n", str1, str2);

  if (strstr(str1, str2) != 0)
    printf("String: \"%s\" DOES contain the string \"%s\"\n", 
     str1, str2);
  else
    printf("String: \"%s\" does NOT contain the string \"%s\"\n", 
     str1, str2);

  return 0;
}

The output is:

The strings "Hello, world" and "world" are different
String: "Hello, world" DOES contain the string "world"

#include <string.h> has various string functions such as strlen() and strncpy().

  1. Use strcmp() to check two words for equality
  2. To practice using strcmp()
  3. Create a new on-line program in a new window, with a suitable title and opening comment.
  4. Declare two character arrays (one for each word).
  5. Print out a suitable message and read in two words from the keyboard, one into each array.
  6. Use the strcmp() to see if they are the same and print out the result.
    • Don't forget to #include<string.h>
  7. Build & run. Check the output is correct.

Other useful features

String concatenation

Sometimes we wish to use a very long character string that wraps over the side of the page. C solves this for us by the rule that if two strings (not characters!) are separated by white-space they are joined together. Since such strings tend to have new-lines in them, this is a convenient place to break them although it is not compulsory. For example:

  printf("Menu\n\n"
    "1. Hot dog\n"
    "2. Burger\n"
    "3. Cheeseburger\n"
    "4. Veggie-burger\n"
    "5. Double espresso\n"
    "6. Coffee with milk and stuff\n\n");

The above seven lines are interpreted as one huge string. It should be noted that C does not insert spaces when joining strings together, if we want spaces we must do that for ourselves within the individual strings. Also, there are no commas between the strings, if there were they would be treated as seven separate strings, not one large one.

Strings separated by white spaces (no commas!) are joined together to make one large string.

"Printing" into character arrays with snprintf()

The snprintf() function acts like printf() except that it takes two additional arguments before the format: the name of a character array for the output to go into and the maximum number of bytes to be written to it (which is usually just the length of the array). It "prints" the output into the character array rather than to the screen. It is defined inside stdio.h, just like printf().

There is also a function sprintf() (no 'n') that omits the protection of the maximum length. We do not recommend you use it.

The following example demonstrates snprintf() protecting us against trying to write some text into a buffer that is not large enough.


#include <stdio.h>

#define N 8

/*
 * Demonstrate snprintf protecting against a buffer overflow.
 * We have "accidentally" made the buffer too short for the text.
 */
int main() {
  char buffer[N];
  int i;

  snprintf(buffer, N, "Hello, world\n");

  /* Print out the individual bytes for information */
  for(i = 0; i < N -1; ++i)
    printf("Byte %d: %d\t'%c'\n", i, buffer[i], buffer[i]);

  printf("Final byte: %d\n", buffer[N-1]);

  printf("%s", buffer);

  return 0;
}

The output is:

Byte 0: 72      'H'
Byte 1: 101     'e'
Byte 2: 108     'l'
Byte 3: 108     'l'
Byte 4: 111     'o'
Byte 5: 44      ','
Byte 6: 32      ' '
Final byte: 0
Hello, 

Notice that unlike strncpy(), snprintf() does the right thing with the final zero byte.

Reading from strings with sscanf()

Most of the time we want to read in data from the keyboard or from a file stored on the computer. Occasionally, however, we may have a text string that contains the data we want to "read in".

The sscanf() function is exactly like fscanf() except that the first argument is a string or character array which is used as the source of the data, rather than an external file. It can be thought of as the opposite to snprintf() and is also defined inside stdio.h.

Example

char mystring[] = "12 34";
int j, k;

sscanf(mystring, "%d %d", &j, &k);

The variables j and k take their values from the character array mystring[] and hence have their values set to 12 and 34 respectively. There is no input from any file or the keyboard and mystring[] is not altered in any way.

snprintf() and sscanf() "print" to and read from character arrays in the same way as printf() and scanf().

Dynamically allocating strings

This is a fairly advanced topic.

We've seen two problems with our simple reading of a string into a character array: it's hard to know how big to make the character array and scanf("%s") does not handle spaces. We will deal with the first of these now and the second in a later lecture. The general method is:

  • Declare a large temporary character array.
  • Read in the word.
  • Dynamically allocate a character array of the right length for the text we read in.

Since this is a little complicated we shall make this into a function which we can write once, forget about and call whenever we need it.

A good function is either considerably more complicated to implement than to describe or saves the same code being repeated more than once in the program.

The complete code

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Read in a long word, allocate the resulting character array
// Later we will encounter a more useful version of this function
#define BUFLEN 1024
char *readaword(void) {
  char input[BUFLEN], *output = NULL;
  size_t len;

  // The width 1023 (BUFLEN - 1) stops scanf() overflowing input[]
  if ( scanf("%1023s", input) != 1 ) {
    fprintf(stderr, "Out of input!!\n");
    exit(99);
  }

  // Now allocate the final string and copy the input to it
  len = 1 + strlen(input); // +1 for closing '\0'
  output = malloc(len);    // NB: sizeof *output == 1
  if ( output == NULL) {
    fprintf(stderr,"Out of memory!\n");
    exit(98);
  }
  strncpy(output, input, len);
  return output;
}


int main() {
  char *word;

  printf("Please enter a long word\n");
  word = readaword();

  printf("Wow, %s\n\tis a long word!\n", word);
  
  return 0;
} 

 
      

Two-dimensional character arrays are arrays of (writeable) strings

Fixed-length arrays

Given that a character array can be thought of as a writeable string it immediately follows that we can have an array of them:


// Demonstrate an array of writeable strings
#include <stdio.h>
#define STRMAX 64
#define NPL 2 // Number of players
int main() {
  char names[NPL][STRMAX] = { 0 };

  for (int p = 0; p < NPL; ++p) {
    printf("Player %d, please enter your forename\n", p + 1);
    scanf("%63s", names[p]); // At most STRMAX - 1 characters
    printf("Thanks %s.\n", names[p]);
  }
  // Do stuff here...

  return 0;
}


We can have arrays of character arrays, just like any other array type.

  1. Use an array of character arrays
  2. To use arrays of writeable strings.
  3. Edit your previous program so that instead of having two separate character arrays of length LEN there is a single 2xLEN array: char words[2][LEN].
  4. Now read in your two words using a loop that goes from zero to one.
  5. Build & run. Check the output is correct.

  1. Several words
  2. A slightly more advanced case
  3. Convert the character array to an MxLEN array: char words[M][LEN] for some suitable value of M.
  4. Update your loop so that it reads in M words.
  5. Now put your "same word check" inside a double for() loop that compares all different pairs of words.
    Tip: a good way to get all different pairs is to start the inner loop from one more than the current value of the outer loop control variable, eg:
      for (int j = 0; j < M; ++j)
        for (int k = j + 1; k < M; ++k) {
    
    

Dynamic arrays

For situations where we do not know the length of the text (such as people's names) a better approach is to have an array of pointers and to dynamically allocate the character array:

int main() {
  char *name[2] = {NULL, NULL};

  for (int i = 0; i < 2; ++i) {
    printf("Player %d please enter your first name\n", i+1);
    name[i] = readaword();
  }

  printf("Welcome %s and %s.\n", name[0], name[1]);
  
  return 0;
}

 

Truly dynamic arrays

An even better way is to dynamically allocate the pointers too. We can also do a trick a bit like the zero at the end of a text string: we can make the array of pointers one too big and add a NULL pointer at the end. We can then pass the pointer to the names to a function and it can work out for itself where the end of the array is:

void welcome(char **name) {
  for (int i = 0; name[i] != NULL; ++i) 
      printf("Welcome %s.\n", name[i]);
}

int main() {
  char **name = NULL;
  int n;

  printf("How many names are there?\n");
  scanf("%d", &n);
  name = xmalloc((1+n) * sizeof *name);
  name[n] = NULL;

  for (int i = 0; i < n; ++i) {
    printf("Player %d please enter your first name\n", i+1);
    name[i] = readaword();
  }

  welcome(name);

  return 0;
}

 

Summary

The text of each key point is a link to the place in the web page.

Single characters

Character strings

Declaring arrays of characters

Three things to look out for

Utility functions for characters and strings

String utility functions: <string.h>

Other useful features

Two-dimensional character arrays are arrays of (writeable) strings
