Data Analysis, Error and Uncertainty  data analysis using Excel
Produced by Graham Currell, University of the West of England, Bristol, and David Read, University of Southampton, in association with:
● Royal Society of Chemistry, 'Discover Maths for Chemists' website, and
● Essential Mathematics and Statistics for Science, 2nd Edition
Graham Currell and Antony Dowman, WileyBlackwell, 2009
This study unit uses video to demonstrate the use of Excel in the analysis of experimental data and its uncertainty. The Excel files used in the data analysis examples and videos can be downloaded here:
ExcelDataUncert01.xlsx for analyses 1 and 2, and
BeersLaw.xls for analysis 3
The study unit is divided into four main sections:
Introduction  provides an overview of the important methods of data analysis using Excel, together with links to video tutorials on basic skills and selfassessment study guide/tutorials on linear regression.
1. Analysis of replicate data  demonstrates the use of equations, functions and data analysis tools, to interpret the results of repeated measurements of a single experimental value. The data represents replicate measures of the pressure, p, of a gas.
2. Analysis of linear data  demonstrates the use of regression analysis and graphical presentation to interpret the experimental results for a linear relationship between two variables. The data uses the variation of pressure, p, against temperature, T, of an ideal gas.
3. Analysis of linear calibration data  demonstrates the analysis of spectrophotometric data, using correlation coefficients, data residuals, and a calculation of the 95% confidence interval of the measurement of concentration using the calibration line of bestfit.
Introduction
It is possible to:
 Use Excel functions to perform specific calculations
e.g. =SQRT(B4) will calculate the square root of the value in cell B4.
 Write equations directly into Excel cells,
e.g. =B5*B6/SQRT(B4) will multiply the contents of B5 and B6 and divide by the square root of B4.
 Use Data Analysis tools. These are not normally loaded when Excel is first installed, but can be added later (see video) using AddIns.
Additional video help on the use of Excel 2007 is available:
Important notes on calculations in Excel:
 Equations and functions are dynamic. Their results change if the source data is changed, e.g. the value of SQRT(B4) will change if the value in B4 is changed.
 The calculations in the Data Analysis tools are static. They are a singleshot calculation, whose results do NOT change if the source data changes.
Study Guide for the straight line and linear regression:
Analysis 1: Experimental uncertainty (error) in replicate measurements
It is common (and good) practice to obtain repeated (replicate) measurements of the same experimental value, e.g. measurements of gas pressure (in pascals):
p / Pa

5246

5471

5821

5321

5537

5329

This data is analysed using:
Equations and functions to calculate  see video Replicate Measurements A
 Mean value, x, (p171) using function AVERAGE(data)
 Sample standard deviation, s, (p173) using function STDEV(data)
 Sample size, n, using function COUNT(data)
 Standard uncertainty, u, (p221) using the equation, u = s/√n
 Degrees of freedom, df, (p222) using the equation, df = n  1
 tvalue (p222) for 95% confidence using the function = TINV(0.05,df)
 95% Confidence deviation, Cd, (p221) using the equation, Cd = t*s/√n
 95% Confidence interval (p221), minimum and maximum values, using CI = x ± Cd
(NB Do NOT use the function CONFIDENCE for experimental sample data)
(The page numbers given in brackets above refer to the text: Essential Mathematics and Statistics for Science, 2nd Ed, by G Currell and A A Dowman)
The Descriptive Statistics option in Data Analysis tools to  see video Replicate Measurements B
 Calculate the same values as obtained by the equations and functions above
Excel demonstration of the effect of random experimental variations  see video Replicate Measurements C
Analysis 2: Experimental uncertainty (error) in simple linear data plot
A typical set of linear data can be described by the change of the pressure, p, (in pascals) of an ideal gas as a function of the temperature, T, in degrees kelvin.
T /K

298

328

358

388

418

448

478

508

p /Pa

5606

5890

6405

6997

7172

8160

8518

9218

Important Note:
Regression calculations to find the line of ‘bestfit’ assume that the main errors (uncertainties) occur in the yvalues with negligible error in the xvalues.
This data is analysed using:
XY plot of data including  see video Regression Analysis A
 Trendlines projected back to the yaxis
 Forcing the trendline through the origin of the graph
 Error bars based on a specified value
Functions to calculate  see video Regression Analysis A
 Slope of the trendline using SLOPE(ydata,xdata)
 Intercept (on the yaxis) of the trendline using INTERCEPT(ydata,xdata)
 Slope of a trendline which is forced through the origin using LINEST(ydata,xdata,FALSE)
 Standard deviation of the experimental data using STEYX(ydata,xdata)
Data Analysis tools to perform a Regression analysis to  see video Regression Analysis A
 Calculate the same values as obtained by the functions above
 Uncertainties in the values for slope and intercept
Excel demonstration of the effect of random experimental variations, including  see video Regression Analysis B
 Error bars based on a data range
Analysis 3: Experimental uncertainty (error) in using linear calibration
A typical set of linear data can be described by the plot of the absorbance, A, of a solution as a function of its concentration, C, in mg dm^{3}.
C / mg dm^{3}

0 
20 
40 
60 
80 
100 
120 
A

0 
0.267 
0.583 
0.824 
1.120 
1.313 
1.499 
A solution of unknown concentration gives 3 replicate measures of absorbance: 0.763, 0.741 and 0.749, and we use the calibration data to calculate a bestestimate for the true concentration of this solution.
This data is analysed using:
An analysis of Residuals and Corrrelation coefficients to  see video Beers Law v1
 Identify curvature of the calibration data at high concentrations (xvalues)
 Select an appropriate linear calibration range
Functions and equations to calculate  see video Beers Law v2
 Best estimate of the unknown concentration using both a freefit trendline and a trendline forced through the origin.
Functions and equations to calculate  see video Beers Law v3
 Standard uncertainty in the calculated concentration, given by
or
 95% Confidence Interval for the true value of the unknown concentration
(Refer to section 13.3.3 in the text: Essential Mathematics and Statistics for Science, 2nd Ed, by G Currell and A A Dowman)