Data Analysis, Error and Uncertainty - data analysis using Excel

Produced by Graham Currell, University of the West of England, Bristol and David Read, University of Southampton, in association with:
Royal Society of Chemistry, 'Discover Maths for Chemists' website
, and

Essential Mathematics and Statistics for Science, 2nd Edition
   Graham Currell and Antony Dowman, Wiley-Blackwell, 2009

Return to Excel Tutorial Index

 

This study unit uses video to demonstrate the use of Excel in the analysis of experimental data and its uncertainty. The Excel files used in the data analysis examples and videos can be downloaded here:
ExcelDataUncert01.xlsx for analyses 1 and 2, and

BeersLaw.xls for analysis 3

 

The study unit is divided into four main sections:

 

Introduction - provides an overview of the important methods of data analysis using Excel, together with links to video tutorials on basic skills and self-assessment study guide/tutorials on linear regression.

 

1. Analysis of replicate data - demonstrates the use of equations, functions and data analysis tools, to interpret the results of repeated measurements of a single experimental value. The data represents replicate measures of the pressure, p, of a gas.

 

2. Analysis of linear data - demonstrates the use of regression analysis and graphical presentation to interpret the experimental results for a linear relationship between two variables. The data uses the variation of pressure, p, against temperature, T, of an ideal gas.

 

3. Analysis of linear calibration data - demonstrates the analysis of spectrophotometric data, using correlation coefficients, data residuals, and a calculation of the 95% confidence interval of the measurement of concentration using the calibration line of best-fit.

 


 

Introduction

 

It is possible to:

·         Use Excel functions to perform specific calculations
e.g. =SQRT(B4) will calculate the square root of the value in cell B4.

·         Write equations directly into Excel cells,
e.g. =B5*B6/SQRT(B4) will multiply the contents of B5 and B6 and divide by the square root of B4.

·         Use Data Analysis tools. These are not normally loaded when Excel is first installed, but can be added later (see video) using Add-Ins.

 

Additional video help on the use of Excel 2007 is available:

Basic skills: Formatting, Editing, Calculations and equations, Functions, Addressing,
+
associated Excel file

XY Graphs: Drawing x-y graph; 'Best-fit' trendlines; Error bars;

Regression & Correlation (Slope & Intercept); Data Analysis Tools
+
associated Excel file

 

Important notes on calculations in Excel:

·         Equations and functions are dynamic. Their results change if the source data is changed, e.g. the value of SQRT(B4) will change if the value in B4 is changed.

·         The calculations in the Data Analysis tools are static. They are a single-shot calculation, whose results do NOT change if the source data changes.

 

Self Assessment Study Guide:

·         Mathematics of the straight line, 'best-fit' straight line, linear regression and linear calibration

 

 


 

Analysis 1: Experimental uncertainty (error) in replicate measurements

 

It is common (and good) practice to obtain repeated (replicate) measurements of the same experimental value, e.g. measurements of gas pressure (in pascals):

 

p / Pa

5246

5471

5821

5321

5537

5329

 

This data is analysed using:

 

Equations and functions to calculate - see video Replicate Measurements A

 

·         Mean value, x, (p171) using function AVERAGE(data)

·         Sample standard deviation, s, (p173) using function STDEV(data)

·         Sample size, n, using function COUNT(data)

·         Standard uncertainty, u, (p221) using the equation, u = s/√n

·         Degrees of freedom, df, (p222) using the equation, df = n - 1

·         t-value (p222) for 95% confidence using the function = TINV(0.05,df)

·         95% Confidence deviation, Cd, (p221) using the equation, Cd  = t*s/√n

·         95% Confidence interval (p221), minimum and maximum values, using CI = x ± Cd

(NB Do NOT use the function CONFIDENCE for experimental sample data)

(The page numbers given in brackets above refer to the text: Essential Mathematics and Statistics for Science, 2nd Ed, by G Currell and A A Dowman)

 

The Descriptive Statistics option in Data Analysis tools to - see video Replicate Measurements B

·         Calculate the same values as obtained by the equations and functions above

 

Excel demonstration of the effect of random experimental variations - see video Replicate Measurements C

 

 


 

Analysis 2: Experimental uncertainty (error) in simple linear data plot

 

A typical set of linear data can be described by the change of the pressure, p, (in pascals) of an ideal gas as a function of the temperature, T, in degrees kelvin.

 

T /K

298

328

358

388

418

448

478

508

p /Pa

5606

5890

6405

6997

7172

8160

8518

9218

 

Important Note:

Regression calculations to find the line of ‘best-fit’ assume that the main errors (uncertainties) occur in the y-values with negligible error in the x-values.

 

This data is analysed using:

 

X-Y plot of data including  - see video Regression Analysis A

·         Trendlines projected back to the y-axis

·         Forcing the trendline through the origin of the graph

·         Error bars based on a specified value

 

Functions to calculate  - see video Regression Analysis A

·         Slope of the trendline using
SLOPE(y-data,x-data)

·         Intercept (on the y-axis) of the trendline using
INTERCEPT(y-data,x-data)

·         Slope of a trendline which is forced through the origin using
LINEST(y-data,x-data,FALSE)

·         Standard deviation of the experimental data using
STEYX(y-data,x-data)

 

Data Analysis tools to perform a Regression analysis to  - see video Regression Analysis A

·         Calculate the same values as obtained by the functions above

·         Uncertainties in the values for slope and intercept

 

Excel demonstration of the effect of random experimental variations, including - see video Regression Analysis B 

·         Error bars based on a data range

 


 

Analysis 3: Experimental uncertainty (error) in using linear calibration

 

A typical set of linear data can be described by the plot of the absorbance, A, of a solution as a function of its concentration, C, in mg dm-3.

 

C /mg dm-3

0 20 40 60 80 100 120

A

0 0.267 0.583 0.824 1.120 1.313 1.499

A solution of unknown concentration gives 3 replicate measures of absorbance: 0.763, 0.741 and 0.749, and we use the calibration data to calculate a best-estimate for the true concentration of this solution.

This data is analysed using:

 

An analysis of Residuals and Corrrelation coefficients to  - see video Beers Law v1

·         Identify curvature of the calibration data at high concentrations (x-values)

·         Select an appropriate linear calibration range

 

Functions and equations to calculate  - see video Beers Law v2

·         Best estimate of the unknown concentration using both a free-fit trendline and a trendline forced through the origin.
 

Functions and equations to calculate  - see video Beers Law v3

·         Standard uncertainty in the calculated concentration, given by

Equation gives the standard uncertainty in the x-value of a calibration point

or

Equation gives the approximate standard uncertainty in the x-value of a calibration point

·         95% Confidence Interval for the true value of  the unknown concentration

 

(Refer to section 13.3.3 in the text: Essential Mathematics and Statistics for Science, 2nd Ed, by G Currell and A A Dowman)

 


End of page