Class experiment: Close your eyes and count off 30 seconds,
using this online stopwatch, to see how accurate your count is.
An experiment consists of the same person repeating this 30-second
count five times and recording the actual times shown by the
stopwatch. Here are the five timings, in seconds, from one such
experiment (Batch A):
35.2
30.9
32.6
31.9
30.4
The average of these five timings is 32.2. For this experiment, we
can compare this average to the true value of 30. However, for most
experiments, we don't know the true value, so we would like a way to
estimate the accuracy of our average x = 32.2.
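As a quick check of the arithmetic, the average of the five timings and its
distance from the target of 30 seconds can be reproduced with a minimal Python
sketch. The variable names are illustrative and are not part of the original
example; only the standard library is used.

```python
from statistics import mean

# The five stopwatch readings (seconds) listed above.
timings = [35.2, 30.9, 32.6, 31.9, 30.4]

# The target count. It is known here, but not in most experiments.
TRUE_VALUE = 30.0

average = mean(timings)
error = average - TRUE_VALUE  # signed distance of the average from the target

print(f"average = {average:.1f}")
print(f"error   = {error:+.1f}")
```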
timings30.xls
(Dataset 4 on the Datasets Page)
contains the results from repeating this experiment 8 times
(Batches A through H).
The term standard error for the average, abbreviated
SEave, is an estimate of the accuracy
of the average of an experiment.
We discuss two methods for estimating SEave.
Method 1: Long Method
timings30.xls from the
Timings of Count to 30 Seconds Example contains
the results of repeating the experiment 7 more times.
Here are the individual averages for all 8 batches:
The mean of these averages is 30.69. Now compute the standard
deviation SD+ of these averages: 1.52.
This gives the average timing with its error estimate: 30.69 ± 1.52.
Remarks:
To use Method 1, several replications of the original
experiment are required. This is often expensive.
Compute the SD+ for each of the 8 groups:
1.883 1.967 4.594 1.165
5.009 1.265 3.196 5.227
The average of the SDs is about 3.04.
The individual observations have more variability
than the averages of the measurements, so we would normally expect
the SD+ of the averages, 1.52, to be smaller than the SD+ values
of the individual batches, which average about 3.04.
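The comparison in this remark can be checked with a short Python sketch. It
uses only the eight SD+ values and the value 1.52 quoted above; the variable
names are assumptions made for illustration.

```python
from statistics import mean

# SD+ of each of the 8 batches, as listed above.
batch_sds = [1.883, 1.967, 4.594, 1.165, 5.009, 1.265, 3.196, 5.227]

# SD+ of the 8 batch averages, from Method 1.
sd_of_averages = 1.52

average_sd = mean(batch_sds)
ratio = average_sd / sd_of_averages  # how much more variable single timings are

print(f"average of the batch SDs   = {average_sd:.3f}")
print(f"ratio to SD+ of the averages = {ratio:.2f}")
```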
One reason for the variability in SD+ from batch to batch is that
each batch of timings is recorded by a different person. Some persons
were more accurate in counting off 30 seconds than others.
Method 2: Short Method
The short method uses a remarkable formula for the standard error of the average:
SEave = SDx / √n
The remarkable part is that extra replications of the experiment are not required.
SEave indicates how far the average can be expected to deviate
from the true measurement when an experiment consisting of n
measurements is repeated, assuming that the measurements
are unbiased and that the SE remains the same for each experiment.
Again, here are the timings from Batch A:
35.2
30.9
32.6
31.9
30.4
They have mean = 32.2 and SD+ = 1.883. To estimate
the standard error of the average, use the formula:
SEave = SDx / √n = 1.883 / √5 = 0.842
Compare this value 0.842 from Method 2 with the value 1.52
obtained from Method 1.
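The short method is easy to reproduce in code. The following is a minimal
sketch, assuming that SD+ denotes the sample standard deviation with divisor
n - 1 (this matches the value listed for the first batch). The variable names
are illustrative. The results should agree with the hand calculation up to
rounding in the last digit.

```python
from math import sqrt
from statistics import mean, stdev

# Timings from Batch A (seconds).
batch_a = [35.2, 30.9, 32.6, 31.9, 30.4]

n = len(batch_a)
sd_plus = stdev(batch_a)    # sample SD: divides by n - 1
se_ave = sd_plus / sqrt(n)  # SEave = SD+ / sqrt(n)

print(f"mean  = {mean(batch_a):.1f}")
print(f"SD+   = {sd_plus:.3f}")
print(f"SEave = {se_ave:.3f}")
```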
Method 1 is often more accurate, but Method 2 is easier because extra
replications of the experiment are not needed.
Method 2 is accurate enough for most data analyses.
One reason that the estimated SEave from Method 2 is low is that
SD+ for Batch A is lower than that of many of the other batches. The reason for the
variability among the batches is that each batch was obtained from a different student.
Some students are better than others at counting to 30 seconds.
The value SEave is most accurate if the SD+ values for all the batches
are about the same.
Confidence Interval for the Mean
As we saw earlier in this document, the sample mean will change every time we repeat
the experiment.
The standard error of the sample mean is
SEave = SD+ / √n.
According to the Central Limit Theorem, the sample mean is approximately
normally distributed if n ≥ 25, even if the original distribution is not:
If the measurements are unbiased,
one can expect that the true measurement is within 1 SEave
of x about 68% of the time:
[x - 1 SEave, x + 1 SEave]
is a 68% confidence interval for the true measurement.
If the measurements are unbiased,
one can expect that the true measurement is within 2 SEave
of x about 95% of the time:
[x - 2 SEave, x + 2 SEave]
is a 95% confidence interval for the true measurement. A more accurate 95%
confidence interval is
[x - 1.96 SEave, x + 1.96 SEave]
If the measurements are unbiased,
one can expect that the true measurement is within 3 SEave
of x about 99.7% of the time:
[x - 3 SEave, x + 3 SEave]
is a 99.7% confidence interval for the true measurement.
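The percentages 68%, 95%, and 99.7% are areas under the standard normal curve
over [-1, 1], [-2, 2], and [-3, 3]. A minimal Python sketch, using the
standard library's NormalDist class, shows where they come from and why 1.96
is the more accurate multiplier for 95%. The helper name coverage is
hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Probability that a standard normal value lies in [-k, k]."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Multipliers of SEave used in the intervals above.
for k in (1, 1.96, 2, 3):
    print(f"within {k} SE: {100 * coverage(k):.2f}%")
```

The multiplier 2 gives slightly more than 95% coverage, which is why 1.96 is
used when more accuracy is wanted.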
The ideal measurement model assumes that the true measurement is
constant (although unknown) in all situations and for all time.
If these assumptions are not valid, then one must use
a model different from the ideal measurement model.
Use the Micrometer Dataset in micrometer.xslx, which contains 26 thickness measurements of paper.
Use column t2, which contains the thicknesses measured by the professor.
Use SPSS to compute the sample mean and standard deviation: x = 0.09546, SD+ = 0.002177
Now use the standard normal table to find the interval that contains z 95% of the time.
We want to find the value of z such that the area under the normal density
curve over [-z, z] is 95%. Since the remaining 5% is split equally between the two
tails, the area under the normal density curve over
(-∞, z] is 97.5%. Looking up 0.975 in the body of the table shows that
z = 1.96, so that the 95% interval for z is [-1.96, 1.96].
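The table lookup can also be done numerically. The sketch below uses the
inverse cumulative distribution function from Python's standard library in
place of the printed table; the function name z_for_confidence is
hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Return z such that the standard normal area over [-z, z] is `level`."""
    upper_area = 1 - (1 - level) / 2  # 0.975 for a 95% interval
    return NormalDist().inv_cdf(upper_area)

z95 = z_for_confidence(0.95)
print(f"z = {z95:.2f}")
```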
To compute the 95% confidence interval for the true mean, first compute the
standard error of the sample mean:
SEmean = SD+ / √n = 0.002177 / √26 = 0.0004269
Then compute the z-score for the sample mean:
z = (x - μ) / SEmean = (0.09546 - μ) / 0.0004269
Now μ is unknown, but z is in the interval [-1.96, 1.96] 95% of the time,
so we can solve this inequality for μ:
-1.96 ≤ z ≤ 1.96
-1.96 ≤ (x - μ) / SEmean ≤ 1.96
-1.96 ≤ (0.09546 - μ) / 0.0004269 ≤ 1.96
-1.96 * 0.0004269 ≤ 0.09546 - μ ≤ 1.96 * 0.0004269
-0.0008367 ≤ 0.09546 - μ ≤ 0.0008367
-0.09630 ≤ -μ ≤ -0.09462
Multiply the inequality by -1, which flips the direction of
the ≤ signs:
0.09630 ≥ μ ≥ 0.09462
0.09462 ≤ μ ≤ 0.09630
This means that a 95% confidence interval for μ is [0.09462, 0.09630].
Check the 95% confidence interval computed by SPSS:
[0.094582, 0.096341], which is close to, but not equal to, the confidence
interval that we computed. Later we will see how to use the t-table to
obtain a more accurate confidence interval that accounts for the
sample size n.
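As a preview, the sketch below compares the two multipliers. It rests on two
assumptions that are not stated in the text above: that SPSS uses the t
critical value with n - 1 = 25 degrees of freedom, and that this value is
about 2.060, as read from a standard t-table. The sample mean is taken as
0.09546, which is the midpoint of the SPSS interval to five decimal places.
The function name interval is hypothetical.

```python
from math import sqrt

n = 26
x_bar = 0.09546     # sample mean (midpoint of the SPSS interval)
sd_plus = 0.002177  # sample standard deviation SD+

Z_CRIT = 1.96   # standard normal table, 95%
T_CRIT = 2.060  # ASSUMED: t-table value for 25 degrees of freedom, 95%

se_mean = sd_plus / sqrt(n)

def interval(critical_value):
    """Return (low, high) = x_bar -/+ critical_value * SEmean."""
    margin = critical_value * se_mean
    return x_bar - margin, x_bar + margin

z_low, z_high = interval(Z_CRIT)
t_low, t_high = interval(T_CRIT)

print(f"z interval: [{z_low:.5f}, {z_high:.5f}]")
print(f"t interval: [{t_low:.5f}, {t_high:.5f}]")
```

Under these assumptions the t interval is slightly wider than the z interval
and lies much closer to the interval reported by SPSS.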