Introduction to Data Analysis and Statistics
This short course is part of the Core Training that must be completed by all research students in the School of Physics. A set of course notes will be issued during the lectures. This course is a broad-brush survey of some of the more common uses and abuses of statistics, and its purpose is to set the context within which specialised techniques exist.
The techniques needed to tackle these problems are discussed in the printed notes which accompanied the lectures.
In a strange species of animal, the female/male ratio is 3/2. CW071211-02.tab is a tab-delimited text file listing the heights of a random sample of 66 males and 99 females.
2. Confidence Intervals
The file CW031205-03.tab contains one hundred independent measurements of the same physical quantity which has a Gaussian distribution. Use:
to estimate, and find (95%) confidence intervals for, the parameters describing the distribution as follows:
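As a rough illustration of what a 95% confidence interval for the parameters of a Gaussian sample looks like, here is a minimal sketch using only the Python standard library. It is not necessarily one of the methods asked for in the question. It assumes the large-sample normal approximation (reasonable for about one hundred points), in which the standard error of the mean is s/sqrt(n) and the standard error of the standard deviation is roughly s/sqrt(2(n-1)). The function name `gaussian_confidence_intervals` and the synthetic data are invented for this example; the contents of CW031205-03.tab are not used.

```python
import math
import random
import statistics
from statistics import NormalDist


def gaussian_confidence_intervals(samples, level=0.95):
    """Approximate confidence intervals for the mean and standard deviation.

    ASSUMPTION: large-sample normal approximation, not an exact
    Student-t or chi-squared interval.
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # uses the n-1 denominator

    # Two-sided quantile of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)

    mean_err = z * std / math.sqrt(n)
    std_err = z * std / math.sqrt(2 * (n - 1))
    return (mean - mean_err, mean + mean_err), (std - std_err, std + std_err)


# Synthetic stand-in for the data file: 100 Gaussian measurements.
# The true mean (10) and standard deviation (2) are arbitrary choices.
rng = random.Random(0)
measurements = [rng.gauss(10.0, 2.0) for _ in range(100)]

mean_ci, std_ci = gaussian_confidence_intervals(measurements)
print("mean: %.3f to %.3f" % mean_ci)
print("std:  %.3f to %.3f" % std_ci)
```

For small samples the normal quantile should be replaced by the appropriate Student-t and chi-squared quantiles, which widen the intervals.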
3. Bayesian Inference
You are playing a game in which the object is to roll three six-sided dice; the highest score wins. Before the game you believe that there is a 3% chance that your opponent is a cheat. The first time you play, your score is 4-3-6, but your opponent's is 6-6-6. What is your revised opinion of your opponent?
Optional Question for film/musical buffs: In Guys and Dolls Act II Scene 3, Nathan Detroit is an unwilling participant in a crap game in a sewer. If he initially estimates there is a 1% chance that Big Jule will cheat, how does he revise his estimate when Big Jule wins on both the first and second roll of his 'lucky' dice?
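The Bayesian update for the three-dice game can be sketched in a few lines. The answer depends entirely on what you assume a cheat would do, and the question leaves that open. The sketch below assumes, purely for illustration, that a cheat rolls 6-6-6 with certainty, while an honest player does so with probability (1/6)^3. The function name `posterior_cheat` is invented for this example.

```python
from fractions import Fraction


def posterior_cheat(prior, p_data_given_cheat, p_data_given_honest):
    """Bayes' theorem for two hypotheses: 'cheat' versus 'honest'."""
    numerator = prior * p_data_given_cheat
    evidence = numerator + (1 - prior) * p_data_given_honest
    return numerator / evidence


prior = Fraction(3, 100)          # 3% prior belief, from the question
p_honest = Fraction(1, 6) ** 3    # fair dice: probability of 6-6-6 is 1/216
p_cheat = Fraction(1)             # ASSUMPTION: a cheat always rolls 6-6-6

posterior = posterior_cheat(prior, p_cheat, p_honest)
print(posterior, float(posterior))  # 648/745, roughly 0.87
```

A less extreme model of cheating (for example, loaded dice that show a six only most of the time) gives a smaller likelihood under the cheat hypothesis and therefore a smaller posterior. Stating that model explicitly is part of the answer.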
4. Maximum-Likelihood Estimators
By representing the estimated mean by the true mean plus a suitable random variable, show that the maximum likelihood estimator for the variance of an unknown Gaussian distribution (i.e. equation 7.12 in the notes) is biased.
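Before attempting the algebra, the bias can be checked numerically. The sketch below assumes that the estimator in question is the usual one with a 1/n denominator about the sample mean (equation 7.12 of the notes is not reproduced here, so this is an assumption). It averages that estimator over many small synthetic samples. The function names and the choices of n = 5 and 20000 trials are arbitrary.

```python
import random


def ml_variance(samples):
    """Variance estimate with a 1/n denominator about the sample mean."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** 2 for x in samples) / n


def average_ml_variance(n, trials, true_mean=0.0, true_sigma=1.0, seed=0):
    """Average the estimator over many independent Gaussian samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.gauss(true_mean, true_sigma) for _ in range(n)]
        total += ml_variance(samples)
    return total / trials


# True variance is 1.0; an unbiased estimator would average close to 1.0.
# With n = 5 the average should instead come out near (n - 1) / n = 0.8.
avg = average_ml_variance(n=5, trials=20000)
print(avg)
```

A simulation only suggests the result; the exercise still asks for the analytic argument showing where the factor (n - 1)/n comes from.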
5. Non-linear Fitting
The file CW071211-04.tab contains twenty independent measurements of the energy E of a spectral line against magnetic field B. The standard deviation associated with the energy at each field is also listed. Two theories have been proposed to fit the data; each has three parameters:
Use your favourite curve-fitting software to fit each theory to the data. Remember to find the covariance matrix in each case. If P is not equal to p, explain why not. Can you decide which theory is correct? How accurate do you consider the parameter estimates to be? How accurate does the software you used imply they are?
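One common way to compare two fitted theories is the weighted chi-squared of each fit, divided by the number of degrees of freedom (here 20 points minus 3 parameters, so 17). The sketch below shows only that bookkeeping and does not perform the fit itself. The model `quadratic_model`, its parameter values, and the synthetic data are hypothetical placeholders, since the two theories' formulas are not reproduced in this text.

```python
def chi_square(fields, energies, sigmas, model, params):
    """Sum of squared residuals, each weighted by its standard deviation."""
    return sum(
        ((e - model(b, *params)) / s) ** 2
        for b, e, s in zip(fields, energies, sigmas)
    )


def reduced_chi_square(chi2, n_points, n_params):
    """Chi-squared per degree of freedom."""
    return chi2 / (n_points - n_params)


def quadratic_model(b, p0, p1, p2):
    """HYPOTHETICAL three-parameter model, standing in for either theory."""
    return p0 + p1 * b + p2 * b * b


# Synthetic data: twenty fields, each energy placed exactly one standard
# deviation above the model, so every point contributes 1 to chi-squared.
fields = [0.1 * i for i in range(20)]
params = (1.0, 0.5, -0.2)   # stands in for best-fit values from your software
sigmas = [0.05] * 20
energies = [quadratic_model(b, *params) + s for b, s in zip(fields, sigmas)]

chi2 = chi_square(fields, energies, sigmas, quadratic_model, params)
reduced = reduced_chi_square(chi2, len(fields), len(params))
print(chi2, reduced)  # about 20 and about 20/17
```

A reduced chi-squared far above 1 suggests that the model or the quoted standard deviations are wrong; one far below 1 suggests overestimated errors. Some fitting packages rescale the covariance matrix by this factor by default, which is worth checking when judging how accurate the software claims the parameters to be.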
6. That's All Folks
There is no question 6! Please hand in your attempts at the above exercises no later than Friday 17th February 2012. Put handwritten answers in my pigeonhole, or, if they are in rtf or pdf, use ELE:
Let me know if you have difficulty with them. CDHW.