Introduction to C
Comments and questions to John Rowe.
This lesson introduces the main concepts of a simple C
program, some of which will be covered in more detail later in
the module.
Brief reflection on the turtle
Review the questions at the end of Mr Turtle.
- What features did you find surprising or fun?
- What features do you think would make it easier to do
complicated things?
C is a general-purpose programming
language which according to one
widely
quoted but not very meaningful index is the second most
popular programming language in the world today. It has a
philosophy of simplicity and consistency.
The purpose of C is to unambiguously
express an algorithm and its data in a way that is
easy for us to understand and modify.
Much more time is spent modifying existing
programs than writing new ones.
C's popularity has spawned several
off-shoots, including C++, C# (C-sharp) and Objective C, a
programming language used on the iPhone and iPad
(although Objective C is now being superseded by
the Swift programming language). C++ and Objective C are,
with very few exceptions, super-sets of the original language,
i.e. any valid C program will also be a valid program in those
languages; C# borrows much of C's syntax but is a separate language.
When looking for information on the Web or buying a course book,
be sure it is about the C programming language, not
C++ or one of the other off-shoots.
On the other hand, we can also use a C++ compiler as a C
compiler provided we are careful only to use its C features, not
any of the C++ extensions. (A compiler
is a program that converts the human-readable program we have
written into the executable program that will run on our
computer.)
- Summarise the previous two sections.
- Add sufficient comments to the previous two sections that
you still understand them when they are minimised.
- From now on you should do this for each
session.
In C, as in many languages, the statements that implement our
algorithm are grouped into functions
each of which implements a well-defined task. (In Logo
these are called "procedures", other names can include
"routines" or "subroutines".) Traditionally, our first C program
looks like this:
/*
 * My first C program. Successfully prints the phrase:
 * "Hello, world"
 */
#include <stdio.h>
int main() {
    printf("Hello, world\n");
    return 0;
}
Though simple, this program illustrates the three components
found in any C program: Comments, pre-processor directives and C
code proper.
The C programming language was developed at Bell
Labs, where they also invented the transistor and discovered the
cosmic background radiation left over from the Big Bang.
The first thing we see is a comment,
which is there purely for our benefit:
Text starting with "/*"
and ending with "*/" is ignored by
the compiler and is called a comment. It may run over several
lines.
You will notice we have made our comment look slightly pretty
by starting each line within the comment with a * and making all
four stars line up vertically. This is quite a popular style but
is not necessary: only the /* */
count. We can use whatever layout we like.
Why comments are useful
Comments are there to give us the context of what's going on and to help us
understand anything that can't be made clear in the program
itself.
Comments at the start of programs and just
before functions are extremely useful as
it's easy to be slightly unclear or ambiguous about
what a function does. Comments inside of functions are less
useful.
Single-line comments
C also allows single line comments starting
with "//". The "//" and everything after it
up to the end of the line is ignored. This is a good way of writing short comments but is a bit inconvenient for longer comments as every line must start with //. Thus we could have equally
written our initial comment:
//
// My first C program. Successfully prints the phrase:
// "Hello, world"
//
Comments are ignored by the computer and are there
to tell us something that is not already
clear from the code.
- Write a comment at the start of a program.
- To remind us of the importance of the initial comment
- Create a new on-line program in a new window. (Try Right-clicking on this link).
- You will need to be able to see both windows at once so if
the compiler started in a new tab follow these instructions to move the new
tab into its own window.
- At the top of your program, add a simple comment to
indicate this is your first set of mini-exercises.
- Press "Build & Run" - you should see the output
"Hello, world" and a message saying that the program exited
with status 0.
- If your program did not run, look back at the previous two
sections and work out why.
Secondly in our program we see the line:
#include <stdio.h>
Lines starting with a hash #
are called pre-processor directives and the
system uses them to create an edited version of the C file
to be given to the compiler.
The word "stdio" is short for STanDard
Input/Output, i.e. a collection of standardised
functions available to us that will read from, and write to, the
screen and/or files on our computer. Somewhere in the depths of
our computer will be a file called stdio.h and this command includes that file in our program when it
is compiled. This file stdio.h contains all the
definitions necessary for us to use these standard input and
output functions (we will see one of them in the main
program). It's normal to put all of our #include
statements at the top of the file.
#include <stdio.h> allows us
to access functions to print out, and read in, data.
We will use this construct quite often as the
standard library is divided into convenient groups, each with
its own include file. This is also the method by which we can
use "third-party" libraries (i.e., software written by other
people) for tasks such as graphics, advanced mathematical
analysis, etc.
It is also used when our program becomes large enough to require
splitting between several files and we need to keep various
definitions, etc. consistent between them.
- Experiment with #include <stdio.h>
- To practice #include and see what happens when we
get it wrong.
- Deliberately misspell "stdio.h" in the
line #include <stdio.h> from the start of your
program. (For example: "studio.h")
- Build & Run. It should fail and a red "error" box appear
at the offending line. Also a yellow Triangle of Peril[TM] will
appear at the "printf()" line to indicate that the
compiler no longer knows exactly what "printf()" is.
- Completely remove the #include <stdio.h>
line from the start of your program.
- Build and run: it should work but with a warning.
- Now type the line back in (type not
Copy-and-Paste!) and check your program will now build. (Mac
users: the # key is Alt-3)
The instructions that make up our program are
conventionally referred to as "code", as in "code of conduct".
Throughout this course we will use the ellipsis ... to
indicate omitted code.
So far we have not done very much but finally we come to the actual C code itself. The construct:
int main() {
    ...
    return 0;
}
indicates that we have written a function called main() which will return an integer value to
whatever called it. The main() function has a
special place in C in that the program runs by calling and
executing main, and hence any functions contained
within it, then quitting.
The value returned by main is then
passed back to whatever ran the program to indicate whether the
program succeeded or failed. The convention is for a program to
return zero upon success and a non-zero value on failure.
The actual content of the function lives in between the
matched pair of braces {},
or curly brackets.
All executable statements live inside a function
and the program starts by executing the contents of the main()
function.
Statements
"Normal" lines of C are referred to as statements.
Our first example contained two statements (printf()
and return), which we will look at below.
Statements "do something" and individual
statements within a { ... }
block all end in a semi-colon. C ignores spaces and new-lines
between words and character strings
so it's fine to split a statement over
several lines.
Statements finish with a semi-colon and long
statements should be split over several lines for clarity.
The first statement:
printf("Hello, world\n");
calls the printf() function which "prints"
the output, not to a printer but to the default output device,
usually the screen. The "f"
at the end of printf stands for "formatted" and we
shall see later how to use it to print numbers, etc to the
screen as well as plain text.
The printf() function prints
output to the screen.
The parentheses
following the word printf enclose the arguments which are passed to the printf
function. In this case there is just one, a
character string enclosed within double-quotes. (They have to be double
quotes; single quotes are used for something else.)
Character strings are enclosed by double-quotes,
not single quotes.
- Make some deliberate mistakes.
- To make it easier to recognise and fix those mistakes when we
make them accidentally.
- When we made the "studio.h" mistake in
the previous mini-exercise the code did not build but gave us
a red error box. Here we will deliberately make some mistakes
to see what happens. This will help us fix
the problem when we make these mistakes by accident.
NB: For each deliberate mistake fix the
previous error before making the next one. Don't worry if you find that the detailed message is sometimes not very clear: what's most useful is that the compiler tells us where the problem is.
- Remove the opening parenthesis (left bracket) in
the "printf()" statement so that it now reads:
printf"Hello, world\n");.
- Build and Run, look at the position of the error box and
the wording of the error and see how you would work out what
the problem was when you do it by accident.
- If you hover your mouse over the error box it will show
you the error(s) relating to that point.
- Fix the previous error and repeat the process in turn for
each of the punctuation characters in the statement.
- What do you think will happen when you remove the comma in "Hello, world\n"? Why?
- When you remove the final semi-colon ";" at the end of
the statement look where the error box appears. Sometimes
a mistake on one line shows up as an error on the next
one: in general the compiler complains at the first point
it can no longer make any sense of the input.
The actual character string itself is fairly straightforward
except for the funny '\n' at the end which stands for
"new line".
There are several of these so-called escape
characters, consisting of a backslash ('\') followed by a
letter. These include '\t' (tab) and '\a' (alert, which sounds
the beep). We also need to use a '\' character when we need a
literal backslash or double quote inside a character string:
printf("\"Hello\", he said\n");
We nearly always want a \n at the end of the string
but we can put \n etc. anywhere in the string and have
as many as we like:
printf("Menu:\n\nBurger\t$1\nPizza\t$1.50\n");
The output is:
Menu:

Burger	$1
Pizza	$1.50
The backslash is used for characters that we would
otherwise not be able to represent, e.g. \n for
new-line and \t for tab.
The printf() statement finishes with a semi-colon
(';').
- printf() and character strings
- To practice printing text to the screen
- Insert a new line after the existing printf()
statement and add a new printf() statement that
prints out the phrase: "My name is: yournamehere\n".
Run your code and see what it produces.
- Type it in by hand, don't use copy and paste. This will
help you remember it.
- When you type the opening "(" the on-line editor
will add the closing ")" for you. This is quite helpful although it can occasionally
be confusing if you are not expecting it.
- Remember to use the double-quote character, not the
single-quote or two single-quotes.
- Don't forget to put in the semi-colon at the end of
your new printf() statement.
- Now remove the \n after the phrase "Hello, world".
What does your output look like now?
- Put the \n back after "Hello, world" and add one
after the colon (:) in the second printf() statement
so it now reads "My name is:\n yournamehere\n". What
does your output look like now?
"Commenting out" statements
A rather unsophisticated use of the //
comment notation is to temporarily remove a line that we
think we might need later. This is often used with diagnostic printf()
statements:
This is a pretty crude technique but we all use
it sometimes! Note that if a statement is over several lines we
either need to comment out each line individually or use
the /* ... */ comment notation.
- Comment out a print statement
- To see what happens
- Comment out one of the printf() statements in the
previous exercise by putting // in front of it. You should see
the line change colour.
- Build & run: the line should no longer be printed.
- Remove the comment // characters, Build & run and check
that the text is now printed.
- Don't use this technique too often!
The second and final statement is:
return 0;
The return statement returns from main
in both senses of the word: the execution of main immediately
stops and the value zero is returned. If we
had put any statements after the return statement
they would never be reached and, if we were using a nice
compiler, it would warn us. Notice again the semi-colon at the
end of the statement.
Making our program easier to
understand
In the preface to this series we emphasised the importance of
avoiding and finding mistakes. Amongst other things this means
having code that is really easy to understand.
Whenever we are faced with a choice, our
first question should always be:
"which choice will
be the clearest and give me the least chance of making a
mistake?".
Consistent layout
Our first example of this is
that our program is clearly and consistently laid out.
When discussing comments we explained how we adopt a consistent
and distinctive style and you will also have seen how in
main() we have:
- Left two blank lines before the start of the function.
- Indented everything inside the braces {}
to the right by the same amount.
We will return to this in two lessons' time
but meanwhile be sure to follow these rules.
Arithmetic and variables
The above code has one huge omission: it has no data, no
numbers, etc. Now that we have learned the basics of what a
simple program looks like it is time to do some useful
calculations.
You should at least be aware that computers use
"binary", or base-2 arithmetic rather than the "decimal" or
base-10 that we are used to, that "binary digits" (zeros or
ones) are called "bits" and that an ordered set of eight bits is
called a "byte", although you will be relieved to hear that you
don't need to be able to do binary arithmetic to program a
computer.
As discussed in the preface, algorithms have words and phrases
which are used as "place-holders" for the actual numbers:
- The cost_of_the_petrol is the cost_per_litre multiplied by the number_of_litres_sold.
- The cost_of_the_baked_beans is the
cost_per_tin multiplied by the number_of_tins_sold.
In computer programming the "place holders" for
holding values are known as variables.
The values of variables will typically change as
the program progresses, just like the running total at a
supermarket checkout.
The above example reminds us that some things (tins of beans,
people) are treated as integer units, whereas others (petrol,
distance) are allowed to be fractions. In the latter case all
measurements are by necessity approximate.
Like most programming languages C reflects this by allowing two
categories of expressions and variables: integers
and floating-point, i.e.
non-integers. (The name "floating-point" comes from the fact
that in numbers such as 1.234 , 12.34
, 123.4 etc. the decimal point "floats" from
left to right.) Floating-point calculations are very important
for scientists and engineers so be sure to remember what the
term means!
Expressions have both a value and a
type.
In general we don't have to worry about this as C "just does
the right thing", but we can occasionally get caught out by
things like accidentally including integer division in
floating-point calculations which we will deal with in the next
lecture.
Floating-point variables are used for
things we measure and integer
variables are used for things we count.
Make a variable an integer only if it is logically
impossible for it to have a fractional value.
Floating-point constants
Scientific notation | Ordinary notation
1.2345E2 | 123.45
1.2345E-2 | 0.012345
-1.0E1 | -10.0
Floating-point constants (such as 9.4 and 11.3) work
just as we would expect; scientific notation is available using
"En" for "ten to the power n", where n
is an integer. ('E' stands for Exponent.)
In the examples above the numbers on the same row have
the same value.
Some well-known floating-point constants: 2.99792458E8,
6.62606957E-34, -1.60217657E-19, 3.14159265358979.
Floating-point variables
The use of four and eight bytes for floats
and doubles respectively is an optional appendix to
the C99 standard. You will find it on everything from an iPhone
upwards, but you may not find it on your mobile phone.
Whenever we take a measurement or use a calculator we should be
familiar with the fact that using a finite number of decimal
places (or binary places for a computer) limits the accuracy of
the calculation. Most computers help deal with this conundrum by
offering a choice of two precisions: single precision
(four bytes per variable) and double precision (eight
bytes per variable). In C these are known, somewhat
inconsistently, as float and double respectively.
A float is a single-precision,
four-byte, floating point variable, a double
is a double-precision, eight-byte, floating point variable.
We suggest using doubles and it's
vaguely useful to remember they use eight bytes each.
C adopts a "better safe than sorry" approach and
by default does most of its arithmetic in double precision
anyway.
Floating-point calculations are always
approximations to the mathematically correct result.
Unlike in our turtle
example, variables
have to be declared before they can
be used. This causes the compiler to set aside some memory to store the
value of that variable. So for example, if
the compiler encounters the declaration double x
it may decide to store x from bytes 600-607 inclusive
(where byte "1" denotes the first byte in the computer's memory,
"2" the second and so on).
Variables must be declared before use.
Variable names
Variables have names which start with a letter (the underscore
"_" counts as a letter) followed by zero or more letters or
digits. Conventionally only lower-case letters (and digits) are
used, with upper-case letters being used for named constants.
The compiler doesn't care
what the variables are called; the names are simply
there to make things clearer for us, and the people who have to
read our code later on.
Variable names start with a letter, then have more
letters, underscores ("_") and digits. They should be all
lower-case.
We can illustrate this with a simple example of a code that calculates a bill. It has three variables.
int main() {
    double materials, labour, total_cost;
    materials = 9.4;
    labour = 11.3;
    total_cost = materials + labour;
    return 0;
}
Step through this code
Seeing what the program does
Clicking on the "Step through this code" link will open a new web page which will simulate what we would do if we were executing the code ourselves armed with a calculator, index cards and a pen and eraser.
- Clicking the "Start program" button at the top right will cause main() to start and its variables to appear. It will also evaluate the first mathematical expression it comes to (in this case the simple constant 9.4).
- Every time we press "Next step" it will evaluate the next expression (with large expressions being divided into sub-expressions).
- Every time it evaluates an expression it will highlight the appropriate part of the code and print the value of the expression
- If the sub-expression is an assignment of a value to a variable it will update the value of that variable's index card (with an arrow from the value box to the index card meant to represent the flow of data).
- If the sub-expression is the value of a variable it will draw a reversed arrow from the index card to show where the value has come from.
- To start again from the beginning just press your browser's "Reload" or "Refresh" button.
- Finally, if you dislike the human "index-card model" and would like to see a more
literal view of how our computers store their data click on the "Show advanced options" button and switch between the two options on the right-hand side (Show index card or Show memory table).
- Step through a code.
- To learn how to understand what our program is doing
- Click on the "Step through this code" link above. It will
open a new tab or window.
You will need to be able to see both windows at once so if it
is in a new tab follow these instructions to move the new
tab into its own window.
- Click on the "Start program" button at the top right of the new
window. A table should appear with three variables and their
values. Note that the values are (extremely!) random.
- Also note that the program has taken its first "real" step,
ie it has started to evaluate the Right Hand Side of "materials = 9.4;". This is obviously
quite easy, it's just 9.4! The value of the last evaluated
expression appears in a blue box near the top of the page.
Notice too that it is preceded by the word "(double)" meaning
that the program is treating it as a double-precision
floating-point number.
- Click "Next step" again. See how it copies the value of the
expression it has just evaluated (9.4) into the value for the
variable "materials".
- Click "Next step" twice more and see it do the same for
"labour" and "11.3".
- Now ask yourself:
- After the program has executed the next statement,
what do you expect to be the value of total_cost?
- How do you think it will arrive at this?
- Click "Next step" again. See how it finds the value of
"materials" by going to the appropriate entry in the table and
retrieving the value.
- Carry on clicking next seeing what happens at each stage.
Was it what you expected?
- Click on your browser's "Reload" button and step through
the program a few more times until you are sure you understand
what the program is doing.
Notes
- Most of the complete examples have a "Step through this code" link just as a side-effect of checking that they are correct. We don't expect you to step through every code, just the ones where we ask you to or are marked as "Key code". But you should know what would happen if you were to. If you're not sure what the code does then step through it to check.
- In practice, if we were mentally stepping through our own code we would not go down to the level of doing the maths (although in this case, given it is so simple, we might). We would just think something like "total_cost is materials plus labour". If we were looking for a mistake in our code and we had written * instead of + we might say "total_cost should be materials plus labour but the code says materials multiplied by labour", which would enable us to fix the error.
Data is stored in numbered locations
This idea, that the values of variables are stored
in numbered locations is
extremely important in computer programming. We are introducing it
now before we really need it so that you can become
familiar with the idea.
Over the next lesson or two make sure you
become comfortable with ideas such as
"we are storing the value of x on card number 12".
We will develop this idea a little more in the next lesson.
We have already seen the
printf() function print a constant string of
characters to the screen. To print expressions such as
numbers to the screen we put a format
specifier in the string. This consists of a percent
character "%" followed by a letter to denote
the type of the value to be printed.
There are three possible formats for printing a floating-point
number depending on how exactly we want it to be displayed
- %e This prints a floating point
number always using scientific notation, i.e. the exponent form, for example 1.234500e+01
- %f This prints a
floating point number in fixed-point form, for example 12.345000
- %g prints a floating
point number making an intelligent choice between "normal" and
scientific formats depending on the value of the arguments.
This is normally the best choice.
Example: printing the total cost
In our previous example we could have printed the total cost
with the following statement:
printf("The cost is: %g\n", total_cost);
Notice
- Inside the format string we see the characters "%g".
- We have added a second argument to printf() , "total_cost",
which is separated from the first argument (the format string)
by a comma.
Arguments to functions are separated by commas.
- As you might expect, the output is the string with "%g" replaced by the value of total_cost
Notice the difference: \n
is used to represent a character which we can't conveniently
type into the string (in this case "new-line"); %g is used to indicate another
argument, in this case a floating-point number.
Example: printing the value of an expression
Although in the last example the value printed was just the
value of a single variable, it is important to realise that we
can print the value of any mathematical expression.
For example, suppose we have two double variables, a and b. We can
print the value of this expression in exactly the same way:
printf("a plus b equals %g\n", a + b);
Notice:
- Inside the format string we see the characters "%g", just like before.
- The second argument to printf() is the
mathematical expression "a + b", which is
separated from the first argument (the format string) by a
comma.
- The expression "a + b" is a single argument: the computer will
calculate the value a + b when the
statement is executed.
C always passes
values of mathematical
expressions to functions, never variables.
- The output is the string with "%g"
replaced by the value of a + b
At first sight the statement "C always passes
values of mathematical
expressions to functions" seems to be contradicted by the fact that the first argument to printf() is a string of characters! But whenever C sees a string of characters in our code it stores the characters in consecutive locations in its memory and replaces the character string by the number of the first of those memory locations, which in our model is an integer. So in our card-index model the human compiler might decide to store the first letter on card 300, the second on card 301, etc., and would then pass the number 300 to printf().
Example: printing more than one number
We can modify the above example to print out the values of a
and b:
printf("%g plus %g equals %g\n", a, b, a + b);
Notice three "%g"s means we need three floating point
arguments following the format string, separated by commas. The
output is the format string with the value of a, the
value of b and the value of a + b replacing
the three occurrences of "%g".
Occurrences of %g in the format string are
replaced by the value of the floating-point expressions
following it.
- Momentum and energy
- Our first simple calculation
- If you have difficulties with this
mini-exercise feel free to do it during the class.
- Press the "New Program" button on the on-line C
compiler to start a new program.
- Write a short comment at the top of your program saying
that it calculates the momentum of an object with known mass
and velocity, in one dimension.
- Now, inside main() declare two variables, one for
the mass and one for the velocity. (Treat velocity as a
scalar.) Ask yourself:
- Is it logically possible for mass and velocity to have
non-integer values? If it is make them a double,
if it is not make them an int. (Do not
ask yourself the question "do I want to give them
non-integer values this time".)
- Now ask: "can I think of two short, snappy variable
names so that when I look at them having read the comment
at the top of the program I will immediately know which is
the mass and which is the velocity?". Use those names for
your variables.
- Now give values to the mass and velocity in the same way as
we gave values to materials and labour in
the previous example. Experiment with using Scientific
Notation (eg 1.23E4).
- Use a printf() statement and "%g" to
print the value of the momentum for this object.
- Finally, modify the printf() statement that prints out
the momentum so that it now prints out all three values: mass,
velocity and momentum.
- Repeat the mega-principle to yourself so many times that you
wake up having dreamed about it!