Standard Error of the Average

Introduction

Class experiment: Close your eyes and count off 30 seconds, using this online stopwatch to see how accurate your count is.
An experiment consists of the same person repeating this 30-second count five times and recording the actual times shown by the stopwatch:
35.2 30.9 32.6 31.9 30.4
The average of these five timings is 32.2. For this experiment, we can compare this average to the true value of 30. However, for most experiments, we don't know the true value, so we would like a way to estimate the accuracy of our average x = 32.2.
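The average can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original notes.

```python
from statistics import mean

# Batch A: five stopwatch readings (seconds) for a counted 30 seconds
batch_a = [35.2, 30.9, 32.6, 31.9, 30.4]

average = mean(batch_a)

# This difference can only be computed because the true value (30 s) is known here
error_vs_true = average - 30.0

print(f"average = {average:.2f}")
print(f"average - true value = {error_vs_true:.2f}")
```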
timings30.txt contains the results from repeating this experiment 8 times (Batches A through H).
The term standard error for the average, abbreviated SE_ave, is an estimate of the accuracy of the average of an experiment.
We discuss two methods for estimating SE_ave.

Method 1: Long Method

timings30.txt from the Timings of Count to 30 Seconds Example contains the results of repeating the experiment 7 more times. Here are the individual averages for all 8 batches:

32.20 29.20 32.28 31.34 31.74 31.20 28.38 29.24

See timings30.xls:
The mean of these averages is 30.69, and their standard deviation SD+ is 1.52.
This gives the average timing with its error estimate: 30.69 ± 1.52.
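Method 1 can be sketched in Python as follows. The sketch assumes that SD+ is the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes. Because the batch averages listed above are already rounded to two decimals, the recomputed mean may differ from the quoted value in the last digit.

```python
from statistics import mean, stdev

# Averages of the eight batches (A through H), as listed above
batch_averages = [32.20, 29.20, 32.28, 31.34, 31.74, 31.20, 28.38, 29.24]

grand_mean = mean(batch_averages)

# stdev() divides by n - 1, taken here to match the SD+ convention
se_long = stdev(batch_averages)

print(f"average timing: {grand_mean:.2f} ± {se_long:.2f}")
```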
Remarks:
1. To use Method 1, several replications of the original experiment are required. This is often expensive.
2. Compute the SD+ for each of the 8 batches:
  
  1.883 1.967 4.594 1.165 5.009 1.265 3.196 5.227
3. The average of the SDs is 3.03.
4. Individual observations vary more than averages do, so we would normally expect the SD+ of the averages (1.52) to be smaller than the SD+ values of the individual batches, which average 3.03.
5. One reason for the variability in SD+ from batch to batch is that each batch of timings was recorded by a different person. Some people were more accurate in counting off 30 seconds than others.

Method 2: Short Method

The short method uses a remarkable formula for the standard error of the average:

SE_ave = SD_x / √n
The remarkable part is that extra replications of the experiment are not required.
SE_ave indicates how far the average of n measurements can be expected to vary from the true measurement if the experiment were repeated, assuming that the measurements are unbiased and that the SD remains the same for each experiment.
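The formula can be motivated with a short variance calculation. The sketch below assumes that the n measurements x_1, ..., x_n are independent and share a common standard deviation σ. These symbols are introduced here for illustration only; in the notes, SD_x plays the role of the estimate of σ.

```latex
\begin{align*}
\operatorname{Var}(\bar{x})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i)
   = \frac{n\sigma^2}{n^2}
   = \frac{\sigma^2}{n}, \\
\mathrm{SD}(\bar{x}) &= \frac{\sigma}{\sqrt{n}}
  \quad\Longrightarrow\quad
  \mathrm{SE}_{\text{ave}} = \frac{\mathrm{SD}_x}{\sqrt{n}}
  \ \text{ when } \sigma \text{ is estimated by } \mathrm{SD}_x .
\end{align*}
```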
Again, here are the timings from Batch A:

35.2 30.9 32.6 31.9 30.4

They have mean = 32.20 and SD+ = 1.883. To estimate the standard error of the average, use the formula:

SE_ave = SD_x / √n = 1.883 / √5 = 0.842
Compare this value 0.842 from Method 2 with the value 1.52 obtained from Method 1.
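The short method needs only the five timings of a single batch. A minimal Python sketch, again assuming that SD+ is the sample standard deviation with divisor n - 1; the result may differ from a hand calculation in the last digit, depending on where rounding is applied.

```python
from math import sqrt
from statistics import stdev

# Batch A timings (seconds)
batch_a = [35.2, 30.9, 32.6, 31.9, 30.4]

n = len(batch_a)
sd_plus = stdev(batch_a)       # sample SD, divisor n - 1
se_short = sd_plus / sqrt(n)   # SE_ave = SD_x / sqrt(n)

print(f"SD+ = {sd_plus:.3f}")
print(f"SE_ave = {se_short:.3f}")
```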
Method 1 is often more accurate, but Method 2 is easier because extra replications of the experiment are not needed.
Method 2 is usually accurate enough for most data analyses.
One reason that the estimated SE_ave from Method 2 is low is that SD+ for Batch A is lower than for many of the other batches. The batches vary because each one was obtained from a different student, and some students are better than others at counting off 30 seconds.
The value SE_ave is most accurate if the SD+ values for all the batches are about the same.
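To see how much the Method 2 estimate depends on which batch is used, the formula can be applied to every batch in turn. This sketch assumes that the eight SD+ values listed under Method 1 are in batch order A through H; the batch labels in the dictionary are that assumption made explicit.

```python
from math import sqrt

# SD+ of each batch, assumed to be listed in the order A through H
batch_sds = {
    "A": 1.883, "B": 1.967, "C": 4.594, "D": 1.165,
    "E": 5.009, "F": 1.265, "G": 3.196, "H": 5.227,
}
n = 5  # timings per batch

# Method 2 applied to each batch separately
se_by_batch = {batch: sd / sqrt(n) for batch, sd in batch_sds.items()}

for batch, se in se_by_batch.items():
    print(f"Batch {batch}: SE_ave = {se:.3f}")

print(f"smallest = {min(se_by_batch.values()):.3f}, "
      f"largest = {max(se_by_batch.values()):.3f}")
```

The eight estimates fall on both sides of the Method 1 value of 1.52, which illustrates why a single batch with an unusually small SD+ gives a low SE_ave.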

Confidence Interval for the Mean

As we saw earlier in this document, the sample mean will change every time we repeat the experiment.
The standard error of the sample mean is SE_ave = SD+ / √n.
According to the Central Limit Theorem, the sample mean is approximately normally distributed if n ≥ 25, even if the original distribution is not:
- If the measurements are unbiased, one can expect that the true measurement is within 1 SE_ave of x about 68% of the time:
  [x - 1 SE_ave, x + 1 SE_ave]
  is a 68% confidence interval for the true measurement.
- If the measurements are unbiased, one can expect that the true measurement is within 2 SE_ave of x about 95% of the time:
  [x - 2 SE_ave, x + 2 SE_ave]
  is a 95% confidence interval for the true measurement. A more accurate 95% confidence interval is
  [x - 1.96 SE_ave, x + 1.96 SE_ave]
- If the measurements are unbiased, one can expect that the true measurement is within 3 SE_ave of x about 99.7% of the time:
  [x - 3 SE_ave, x + 3 SE_ave]
  is a 99.7% confidence interval for the true measurement.
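The percentages 68%, 95%, and 99.7% are the areas under the standard normal curve within 1, 2, and 3 standard deviations of the mean. The sketch below checks them with `statistics.NormalDist` from the Python standard library; the helper names `normal_coverage` and `confidence_interval` are illustrative and do not come from the original notes.

```python
from statistics import NormalDist


def normal_coverage(k):
    """Probability that a normal variable lies within k SDs of its mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


def confidence_interval(x_bar, se_ave, k):
    """Return the interval [x_bar - k * se_ave, x_bar + k * se_ave]."""
    return (x_bar - k * se_ave, x_bar + k * se_ave)


for k in (1, 1.96, 2, 3):
    print(f"within {k} SE_ave: {100 * normal_coverage(k):.1f}%")
```

Multipliers of 2 and 1.96 both give coverage close to 95%; 1.96 is the value for which the coverage is almost exactly 95%.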
The ideal measurement model assumes that the true measurement is constant (although unknown) in all situations and for all time. If these assumptions are not valid, then one must use a model different from the ideal measurement model.
Use the Micrometer Dataset in paper-thickness.txt, which contains 26 thicknesses of paper. Use column t2, which holds the thicknesses measured by the professor. Use R to compute the sample mean and standard deviation:
x = 0.09456 SD+ = 0.002177
Also compute the standard error of the mean:
SE_mean = SD+ / √n = 0.002177 / √26 = 0.0004269
Now use the standard normal table to find the interval that contains z 95% of the time. We want the value of z such that the area under the normal density curve above [-z, z] is 95%. This means that the area under the curve above (-∞, z] is 97.5%. Looking up 0.975 in the body of the table shows that z = 1.96, which we round to 2, so that the 95% interval for z is approximately [-2, 2].
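The table lookup can be reproduced in software. A minimal sketch using `NormalDist.inv_cdf` from the Python standard library, which returns the z value that has a given area to its left; the variable names are illustrative.

```python
from statistics import NormalDist

central_area = 0.95
# Half of the leftover 5% lies in each tail, so the area to the left of z is 97.5%
left_area = 1 - (1 - central_area) / 2

z_95 = NormalDist().inv_cdf(left_area)

print(f"area to the left = {left_area:.3f}")
print(f"z = {z_95:.2f}")
```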
To compute the 95% confidence interval for the true mean, compute the z-score for the sample mean:
z = (x - μ) / SE_mean = (0.09456 - μ) / 0.0004269
Now μ is unknown, but z is in the interval [-2, 2] 95% of the time, so we can solve this inequality for μ:
-2 ≤ z ≤ 2
-2 ≤ (x - μ) / SE_mean ≤ 2
-2 ≤ (0.09456 - μ) / 0.0004269 ≤ 2
-2 * 0.0004269 ≤ 0.09456 - μ ≤ 2 * 0.0004269
-0.0008538 ≤ 0.09456 - μ ≤ 0.0008538
-0.0008538 - 0.09456 ≤ -μ ≤ 0.0008538 - 0.09456
-0.09541 ≤ -μ ≤ -0.09371
Multiply the inequality by -1, which flips the direction of the ≤ signs:
0.09541 ≥ μ ≥ 0.09371
0.09371 ≤ μ ≤ 0.09541
This means that a 95% confidence interval for μ is [0.09371, 0.09541].
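The whole calculation can be sketched in Python from the summary statistics alone. The sketch takes the reported mean 0.09456, SD+ 0.002177, and the sample size 26 used in the SE calculation above, with the rounded multiplier z = 2. Small differences in the last digit can arise from how the mean is rounded before the arithmetic.

```python
from math import sqrt

x_bar = 0.09456      # sample mean of column t2, as reported above
sd_plus = 0.002177   # sample standard deviation SD+, as reported above
n = 26               # sample size used in the SE calculation above
z = 2                # rounded 95% multiplier (1.96 is more precise)

se_mean = sd_plus / sqrt(n)

# Solving -z <= (x_bar - mu) / se_mean <= z for mu gives x_bar ± z * se_mean
lower = x_bar - z * se_mean
upper = x_bar + z * se_mean

print(f"SE_mean = {se_mean:.7f}")
print(f"95% confidence interval: [{lower:.5f}, {upper:.5f}]")
```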
Later we will see how to use the t-table to obtain a more accurate confidence interval that accounts for the sample size n.