Computational Physics
Seven plus or minus two
Comments and questions to John Rowe.
In the 1950s George Miller realised that people could typically
remember about 7 (±2) things in their short-term
memory. Interestingly, he also pointed out that seven seems to be a
typical limit in other slightly different memory contexts.
In practice this means that anything you can't express in around seven
independent steps is going to be very difficult.
The square law
An early study by IBM showed that the time taken to get a program
working was proportional to the square of its size. This makes sense if
you think of the major task as being debugging: if the program is
twice the size there will be twice as many bugs and each bug will
take twice as long to find.
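The argument can be written as a rough scaling estimate. The symbols below are introduced here for illustration and do not come from the IBM study: N is the size of the program, B the number of bugs, t the time to find one bug and T the total debugging time.

```latex
B \propto N, \qquad t \propto N
\quad\Longrightarrow\quad
T = B \, t \propto N^2
```

So doubling N doubles both factors and multiplies T by four.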
Aside: Complexity is increasing, not decreasing
- Almost every commercial project is written by a team of
people, not just one person.
- People leave these teams.
- Software projects are often written by teams of people who
have never even met each other.
- Much more time is spent maintaining (adding new features or
fixing old bugs) programs than writing brand-new ones.
- Modern programs have to be internet aware: they have to
interact with other programs written by people you've never met
running on other computers running other operating systems.
Historically, the first response to this was to split code into
self-contained units called functions.
Functions
Functions break large computing tasks into smaller ones, and
enable people to build on what others have done instead of starting
over from scratch.
Kernighan & Ritchie
Functions should be short and sweet, and do just one thing...
and do that well.
Torvalds
- The purpose of a function is to make life easier for
the person who is calling it.
- A function allows the caller to think in terms of
what the function does rather than
how it does it.
- A function allows debugging to proceed one task at a time, thus
linearising the debugging process.
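As a rough sketch of what "linearising" means, take the square law (time proportional to the square of the size) and assume, optimistically, that a program of size N is split into k functions of size s = N/k that can each be debugged on its own. The symbols N, k and s are introduced here for illustration, and the estimate ignores bugs in the interfaces between functions.

```latex
T_{\text{whole}} \propto N^2,
\qquad
T_{\text{split}} \propto k \left(\frac{N}{k}\right)^2 = \frac{N^2}{k} = N s
\quad \text{with } s = \frac{N}{k}
```

If the functions are kept to a fixed size s, the debugging time under these assumptions grows in proportion to N rather than to its square.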
Example
In pairs, consider the following simple program and answer the following
questions:
- What does the program do? How do you know?
- The function I have called xxxx is in fact one of the functions
called from main. Which one is it? How do you know?
After some discussion in pairs, discuss the following questions as a group:
- What do you think the functions read_string_from_screen
and read_numeric_string_from_screen do?
How do you know?
- What are the meanings of arguments to these functions?
#include <stdio.h>
#include "mydefs.h"
int main(void) {
    int action;

    read_data_from_disk();

    /* Loop until get_user_command() returns zero */
    while ((action = get_user_command()) != 0) {
        switch (action) {
        case LOOKUP_ENTRY:
            lookup_phonebook_entry();
            break;
        case ADD_NEW_ENTRY:
            add_new_phonebook_entry();
            break;
        case DELETE_ENTRY:
            delete_phonebook_entry();
            break;
        case SAVE_TO_DISK:
            save_data_to_disk();
            break;
        case QUIT:
            exit_program();
            break;
        }
    }

    return 0;
}
void xxxx(void) {
    char name[MAX_NAME_LEN];
    char number[MAX_NUMBER_LEN];

    if (read_string_from_screen(name, MAX_NAME_LEN) > 0
        && read_numeric_string_from_screen(number, MAX_NUMBER_LEN) > 0)
        store_name_and_number(name, number);
}
You will notice that the main function is nothing but a series
of function calls and so is xxxx. This is quite common.
Let's consider a few points about what the use of functions above
achieves:
- When looking at main we are able to think about the
problem at a conceptual, global level without worrying about any of
the details of implementation. We are concentrating on
what we are doing, not how we are
doing it.
- When looking at xxxx:
- We are still able to think about what we are doing, not how we are
doing it - functions such as read_numeric_string_from_screen
protect us from the gory details.
- We can understand it without having to think about any of the
other functions such as lookup_phonebook_entry or even the
function main from which it is called.
- Each of the functions such as lookup_phonebook_entry:
- Performs
a task which is simple to understand at the level of the
function from which it is called (again, we are talking about
what it does, not how it does it).
- Is easier to describe in terms of what
it does than how it does it, i.e. it is non-trivial.
- Has a small number of arguments (or none
at all) and it is clear exactly what they are.
- Could change the way it does its job without the functions that
call it having to know about it.
There are a couple of other reasons we may want to put something in
a function, both of which apply to read_string_from_screen:
- We know we are likely to use it several times in the program.
- We may need to change it later on.
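To make these two reasons concrete, here is a minimal sketch of what read_string_from_screen might look like. The original does not show its implementation: the signature and return value are inferred from the call in xxxx, and read_string_from_stream is a hypothetical helper added so that the same code can read from any stream.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read one line from `in` into buf (at most
 * maxlen - 1 characters plus the terminating '\0'), strip the trailing
 * newline and return the number of characters stored.
 * Returns 0 for an empty line and -1 on end of file or read error.
 * A line longer than maxlen - 1 characters is cut short; the rest of it
 * stays in the stream.
 */
int read_string_from_stream(FILE *in, char *buf, int maxlen) {
    size_t len;

    assert(in != NULL && buf != NULL && maxlen >= 2);

    if (fgets(buf, maxlen, in) == NULL)
        return -1;

    len = strcspn(buf, "\n");   /* index of the newline, or of '\0' if none */
    buf[len] = '\0';
    return (int)len;
}

/* Assumed interface: the same job for the keyboard. */
int read_string_from_screen(char *buf, int maxlen) {
    return read_string_from_stream(stdin, buf, maxlen);
}
```

Every caller that needs a line of input can reuse this, and if the input method changes later (a different prompt, say, or a graphical dialogue) only the body has to change, not the callers.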
Structuring functions
Imagine your task can be split into three sensible functions,
each of which can be split into two or three subfunctions,
for a total of eight subfunctions.
There are two ways to do this:
Bad structure
- function_1
- function_2
- function_3
- function_4
- function_5
- function_6
- function_7
- function_8
Proper structure
- function_a
  - function_a_part_1
  - function_a_part_2
  - function_a_part_3
- function_b
  - function_b_part_1
  - function_b_part_2
- function_c
  - function_c_part_1
  - function_c_part_2
  - function_c_part_3
Why?
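One way to look at the question is to write the proper structure out in C. This is only a sketch: the function names come from the list above, but their bodies are placeholders, and trace, record and run_task are hypothetical names used purely to show the order of calls.

```c
#include <assert.h>
#include <string.h>

static char trace[64];   /* records the order in which the parts run */

/* Append a step name and a space to the trace. */
static void record(const char *step) {
    assert(strlen(trace) + strlen(step) + 2 <= sizeof trace);
    strcat(trace, step);
    strcat(trace, " ");
}

/* Lowest level: each part does one small job (here it only records its name). */
static void function_a_part_1(void) { record("a1"); }
static void function_a_part_2(void) { record("a2"); }
static void function_a_part_3(void) { record("a3"); }
static void function_b_part_1(void) { record("b1"); }
static void function_b_part_2(void) { record("b2"); }
static void function_c_part_1(void) { record("c1"); }
static void function_c_part_2(void) { record("c2"); }
static void function_c_part_3(void) { record("c3"); }

/* Middle level: each function is two or three calls long. */
static void function_a(void) {
    function_a_part_1();
    function_a_part_2();
    function_a_part_3();
}

static void function_b(void) {
    function_b_part_1();
    function_b_part_2();
}

static void function_c(void) {
    function_c_part_1();
    function_c_part_2();
    function_c_part_3();
}

/* Top level: the whole task reads as three steps. */
void run_task(void) {
    trace[0] = '\0';
    function_a();
    function_b();
    function_c();
}
```

At every level the reader has at most three things to keep in mind, whereas the flat version puts all eight calls into a single function.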
Problems with functions
Trivial functions
We have already stated that functions should be non-trivial.
As an example of a trivial function consider the following:
/*
 * Add two numbers together
 */
float add_two_floats(float num1, float num2) {
    float sum;

    sum = num1 + num2;
    return sum;
}
This is bad because it is simpler and clearer to just write:
z = x + y;
rather than:
z = add_two_floats(x, y);
Functions that do two things
calculate_area_of_a_square_and_check_the_spelling_of_some_text_
and_make_me_a_cup_of_coffee( black, ! sugar );
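A less extreme case, sketched with hypothetical names: a single function that both calculated a mean and printed a report would be doing two things. Split in two, each half can be used and tested without the other.

```c
#include <assert.h>
#include <stdio.h>

/* Return the arithmetic mean of the n values in data (n must be at least 1). */
double mean(const double *data, int n) {
    double sum = 0.0;
    int i;

    assert(data != NULL && n > 0);

    for (i = 0; i < n; ++i)
        sum += data[i];
    return sum / n;
}

/* Print a mean with its label; knows nothing about how it was calculated. */
void print_mean(const char *label, double value) {
    printf("%s: mean = %g\n", label, value);
}
```

A caller that only wants the number calls mean; one that wants the report calls print_mean("heights", mean(heights, n)).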
Side effects
A side effect is where a function that is advertised as doing one
thing also does something else ("I know I'll always want to do it so I
might as well do it here").
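A sketch of a typical side effect, using hypothetical function names: the first function is advertised as returning the median but also reorders the caller's array; the second does the same job on a private copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort: ascending order of doubles. */
static int compare_doubles(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/*
 * Side effect: advertised as "return the median of n values", but it
 * also leaves the caller's array sorted.
 */
double median_sorting_in_place(double *data, size_t n) {
    assert(data != NULL && n > 0);
    qsort(data, n, sizeof data[0], compare_doubles);
    return (n % 2) ? data[n / 2] : 0.5 * (data[n / 2 - 1] + data[n / 2]);
}

/*
 * No side effect: sorts a private copy, so the caller's data is untouched.
 * Stores the median in *result and returns 0, or returns -1 if memory
 * for the copy cannot be allocated.
 */
int median(const double *data, size_t n, double *result) {
    double *copy;

    assert(data != NULL && result != NULL && n > 0);

    copy = malloc(n * sizeof *copy);
    if (copy == NULL)
        return -1;
    memcpy(copy, data, n * sizeof *copy);
    *result = median_sorting_in_place(copy, n);
    free(copy);
    return 0;
}
```

Code that relies on the order of its data keeps working after a call to median, but not after a call to median_sorting_in_place.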
Hidden assumptions
This is the converse of a side effect: we implicitly assume that
something else has already been done without actually saying so.
If this is totally unavoidable then you should:
- document it in the comment at the top of the function
- check inside the function to see if it has been done and if not
warn the user (and probably quit the program).
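The two steps above might look like this in the phonebook program. This is a sketch under stated assumptions: read_data_from_disk here is a stand-in for the real loader, and count_phonebook_entries, data_loaded and number_of_entries are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int data_loaded = 0;        /* becomes 1 once the data has been read */
static int number_of_entries = 0;

/* Stand-in for the real loader: it only pretends to read three entries. */
void read_data_from_disk(void) {
    number_of_entries = 3;
    data_loaded = 1;
}

/*
 * Return the number of entries in the phonebook.
 * ASSUMES: read_data_from_disk() has already been called. If it has not,
 * a warning is printed and the program quits.
 */
int count_phonebook_entries(void) {
    if (!data_loaded) {
        fprintf(stderr, "count_phonebook_entries: no data loaded; "
                        "call read_data_from_disk() first\n");
        exit(EXIT_FAILURE);
    }
    assert(number_of_entries >= 0);   /* internal sanity check */
    return number_of_entries;
}
```

The assumption is now stated in the comment and enforced in the body, so a caller who forgets it finds out immediately rather than getting a wrong answer.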
Interface errors
The problems in the section above are essentially beginners' errors. Interface
errors are the big one.
An interface error is where you are making a slightly
different assumption inside a function than in the function you are
calling it from.
- You may disagree on what exactly the function does.
- You may disagree on what exactly the arguments are.
- You may disagree on some other assumption.
Examples of interface errors
- In a kinetics problem you have a variable called energy:
in one part of the program you assume it is the kinetic energy, in another
the total (kinetic plus gravitational potential) energy.
- In the same problem, h is the height (upwards) but the
acceleration a is taken to be downwards because g is
positive.
- You call a function which has as an argument a heat capacity
c: you assume it wants the capacity at constant volume, but it wants
the capacity at constant pressure.
- In 1999 NASA lost the Mars Climate Orbiter because its navigation
software expected metric units (newton-seconds) while a contractor's
software supplied imperial ones (pound-force seconds)!
Tests for interface errors
- Is the name of the function clear?
- Does the comment at the top of the function tell you
everything you need to know without having to look at
the body of the function?
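As an illustration of both tests, here is a sketch of how the energy example might be written so that the name and the comment settle the questions in advance. The function names, the unit suffixes and the constant G_M_PER_S2 are choices made for this example, not part of any existing program.

```c
#include <assert.h>

#define G_M_PER_S2 9.81   /* magnitude of g in m/s^2, always positive */

/*
 * Return the kinetic energy, in joules, of a body of mass mass_kg
 * (kilograms) moving at speed speed_m_per_s (metres per second).
 * Gravitational potential energy is NOT included.
 */
double kinetic_energy_J(double mass_kg, double speed_m_per_s) {
    assert(mass_kg >= 0.0);
    return 0.5 * mass_kg * speed_m_per_s * speed_m_per_s;
}

/*
 * Return the total (kinetic plus gravitational potential) energy in joules.
 * height_m is measured UPWARDS, in metres, from the level at which the
 * potential energy is taken to be zero.
 */
double total_energy_J(double mass_kg, double speed_m_per_s, double height_m) {
    return kinetic_energy_J(mass_kg, speed_m_per_s)
         + mass_kg * G_M_PER_S2 * height_m;
}
```

A caller can tell from the names alone which energy is returned, and from the comments which way is up and which units are expected.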
Summary
A function should:
- do one thing and do that well.
- be clearly defined and easy to understand.
- let you think of the task at a higher level, in terms
of what you are doing ("read in a positive integer")
and not how you are doing it.
- either:
- enable the code to be simpler and shorter by being called
from several places or
- enable a piece of code that would otherwise violate Miller's
limits to be understood within those limits.