Data Analysis, Error and Uncertainty - data analysis using Excel
Produced by Graham Currell, University of the
West of England, Bristol and David Read, University of Southampton, in association with:
● Royal Society of Chemistry, 'Discover Maths
for Chemists' website, and
●
Essential Mathematics and Statistics for
Science, 2nd Edition
Graham Currell and Antony Dowman, Wiley-Blackwell, 2009
Return to Excel Tutorial Index
This study unit uses video
to demonstrate the use of Excel in the analysis of experimental data and its
uncertainty. The Excel files used in the data analysis examples and videos
can be downloaded here:
ExcelDataUncert01.xlsx for analyses 1 and
2, and
BeersLaw.xls for analysis 3
The study unit is divided into four main sections:
Introduction - provides an overview of the important methods of data analysis using Excel, together with links to video tutorials on basic skills and self-assessment study guide/tutorials on linear regression.
1. Analysis of replicate data - demonstrates the use of equations, functions and data analysis tools, to interpret the results of repeated measurements of a single experimental value. The data represents replicate measures of the pressure, p, of a gas.
2. Analysis of linear data - demonstrates the use of regression analysis and graphical presentation to interpret the experimental results for a linear relationship between two variables. The data uses the variation of pressure, p, against temperature, T, of an ideal gas.
3. Analysis of linear calibration data - demonstrates the analysis of spectrophotometric data, using correlation coefficients, data residuals, and a calculation of the 95% confidence interval of the measurement of concentration using the calibration line of best-fit.
Introduction
It is possible to:
·
Use Excel functions to perform specific calculations
e.g. =SQRT(B4) will calculate the square root of the value in cell B4.
·
Write equations directly into Excel cells,
e.g. =B5*B6/SQRT(B4) will multiply the contents of B5 and B6 and divide by the
square root of B4.
· Use Data Analysis tools. These are not normally loaded when Excel is first installed, but can be added later (see video) using Add-Ins.
Additional video help on the use of Excel 2007 is available:
Basic skills:
Formatting,
Editing,
Calculations and equations,
Functions,
Addressing,
+ associated
Excel file
XY Graphs: Drawing x-y graph; 'Best-fit' trendlines; Error bars;
Regression & Correlation (Slope & Intercept);
Data Analysis Tools
+ associated
Excel file
Important notes on calculations in Excel:
· Equations and functions are dynamic. Their results change if the source data is changed, e.g. the value of SQRT(B4) will change if the value in B4 is changed.
· The calculations in the Data Analysis tools are static. They are a single-shot calculation, whose results do NOT change if the source data changes.
Self Assessment Study Guide:
· Mathematics of the straight line, 'best-fit' straight line, linear regression and linear calibration
Analysis 1: Experimental uncertainty (error) in replicate measurements
It is common (and good) practice to obtain repeated (replicate) measurements of the same experimental value, e.g. measurements of gas pressure (in pascals):
p / Pa |
5246 |
5471 |
5821 |
5321 |
5537 |
5329 |
This data is analysed using:
Equations and functions to calculate - see video Replicate Measurements A
· Mean value, x, (p171) using function AVERAGE(data)
· Sample standard deviation, s, (p173) using function STDEV(data)
· Sample size, n, using function COUNT(data)
· Standard uncertainty, u, (p221) using the equation, u = s/√n
· Degrees of freedom, df, (p222) using the equation, df = n - 1
· t-value (p222) for 95% confidence using the function = TINV(0.05,df)
· 95% Confidence deviation, Cd, (p221) using the equation, Cd = t*s/√n
· 95% Confidence interval (p221), minimum and maximum values, using CI = x ± Cd
(NB Do NOT use the function CONFIDENCE for experimental sample data)
(The page numbers given in brackets above refer to the text: Essential Mathematics and Statistics for Science, 2nd Ed, by G Currell and A A Dowman)
The Descriptive Statistics option in Data Analysis tools to - see video Replicate Measurements B
· Calculate the same values as obtained by the equations and functions above
Excel demonstration of the effect of random experimental variations - see video Replicate Measurements C
Analysis 2: Experimental uncertainty (error) in simple linear data plot
A typical set of linear data can be described by the change of the pressure, p, (in pascals) of an ideal gas as a function of the temperature, T, in degrees kelvin.
T /K |
298 |
328 |
358 |
388 |
418 |
448 |
478 |
508 |
p /Pa |
5606 |
5890 |
6405 |
6997 |
7172 |
8160 |
8518 |
9218 |
Important Note:
Regression calculations to find the line of ‘best-fit’ assume that the main errors (uncertainties) occur in the y-values with negligible error in the x-values.
This data is analysed using:
X-Y plot of data including - see video Regression Analysis A
· Trendlines projected back to the y-axis
· Forcing the trendline through the origin of the graph
· Error bars based on a specified value
Functions to calculate - see video Regression Analysis A
·
Slope of the trendline using
SLOPE(y-data,x-data)
·
Intercept (on the y-axis) of the trendline using
INTERCEPT(y-data,x-data)
·
Slope of a trendline which is forced through the origin
using
LINEST(y-data,x-data,FALSE)
·
Standard deviation of the experimental data using
STEYX(y-data,x-data)
Data Analysis tools to perform a Regression analysis to - see video Regression Analysis A
· Calculate the same values as obtained by the functions above
· Uncertainties in the values for slope and intercept
Excel demonstration of the effect of random experimental variations, including - see video Regression Analysis B
· Error bars based on a data range
Analysis 3: Experimental uncertainty (error) in using linear calibration
A typical set of linear data can be described by the plot of the absorbance, A, of a solution as a function of its concentration, C, in mg dm^{-3}.
C /mg dm^{-3} |
0 | 20 | 40 | 60 | 80 | 100 | 120 |
A |
0 | 0.267 | 0.583 | 0.824 | 1.120 | 1.313 | 1.499 |
A solution of unknown concentration gives 3 replicate measures of absorbance: 0.763, 0.741 and 0.749, and we use the calibration data to calculate a best-estimate for the true concentration of this solution.
This data is analysed using:
An analysis of Residuals and Corrrelation coefficients to - see video Beers Law v1
· Identify curvature of the calibration data at high concentrations (x-values)
· Select an appropriate linear calibration range
Functions and equations to calculate - see video Beers Law v2
·
Best
estimate
of the unknown
concentration using both a free-fit trendline and a trendline forced
through the origin.
Functions and equations to calculate - see video Beers Law v3
· Standard uncertainty in the calculated concentration, given by
or
· 95% Confidence Interval for the true value of the unknown concentration
(Refer to section 13.3.3 in the text: Essential Mathematics and Statistics for Science, 2nd Ed, by G Currell and A A Dowman)
End of page