IT 223: Data Analysis
Winter 2006
Assignment 7
Due Friday, March 3, before 11:30pm by DL Web Submission
Overview
This week, you will analyze the performance of a file compression
application of your choice (e.g., WinZip, FastZip, StuffitExpander).
Using your own files, you will estimate the average reduction in file
size (as a percentage) and provide a confidence interval for it.
Specifying your procedure and sample
You may choose the file compression application, but be sure to
specify which one you use.
You will need to create a sample of test files to generate your
results. Select files from your computer that you might realistically
compress. For this assignment, your sample should contain 20 files.
Make sure you specify how you chose your files. It won't be a
random sample, but it should at least be representative of the files
you work with.
For each compressed file, determine the percent reduction in file
size. This can be calculated as
100 * (initialSize - reducedSize) / initialSize.
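As a quick check of the formula above, here is a minimal Python sketch.
The function name percent_reduction and the byte counts in the example
are made up for illustration; substitute the sizes you record for your
own files.

```python
def percent_reduction(initial_size, reduced_size):
    """Return the percent reduction in file size.

    Implements 100 * (initialSize - reducedSize) / initialSize.
    Both sizes must be in the same unit (e.g., bytes).
    """
    if initial_size <= 0:
        raise ValueError("initial_size must be positive")
    return 100 * (initial_size - reduced_size) / initial_size


# Hypothetical example: a 1000-byte file that compresses to 250 bytes.
print(percent_reduction(1000, 250))
```

A negative result means the "compressed" file is larger than the
original, which can happen with files that are already compressed.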
Statistics
You may use SAS or Excel to help you create or determine the following:
- A frequency bar chart of your percent reduction data.
- The average and standard deviation of the sample (percent reduction).
- The estimated mean (i.e. average) file reduction (as a
percentage). Note that this is the same as the average of the sample.
- Your calculation of a 95% confidence interval of the
mean, using the standard deviation of the sample as an estimate for
the standard deviation of the population.
- The 95% confidence interval produced using statistical software.
- How does your calculated confidence interval differ from that
provided by statistical software? Explain the difference.
- A short written summary describing the distribution and the
estimated mean with the 95% confidence interval (created from statistical software).
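The hand calculations listed above can be sketched in Python as a way
to check your arithmetic. This is only an illustration, not a
replacement for SAS or Excel. The function names, the 10-point bin
width, and the data values are assumptions made up for this example.
The interval is computed as mean +/- critical_value * s / sqrt(n), with
the critical value taken here from the standard normal distribution
(about 1.96 for 95%); you can pass a different critical value if your
method calls for one.

```python
import math
import statistics


def mean_confidence_interval(values, critical_value):
    """Return (mean, sample sd, (lower, upper)) for a list of numbers.

    The sample standard deviation (n - 1 denominator) stands in for the
    population standard deviation, as the assignment describes.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    margin = critical_value * sd / math.sqrt(n)
    return mean, sd, (mean - margin, mean + margin)


def frequency_table(values, bin_width=10):
    """Count how many values fall in each bin [lower, lower + bin_width)."""
    counts = {}
    for v in values:
        lower = int(v // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Made-up placeholder data; replace with your own 20 percent reductions.
reductions = [12.5, 48.0, 73.2, 5.1, 66.7, 81.0, 30.4, 55.5]

# Critical value for a 95% interval from the standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)

mean, sd, (lower, upper) = mean_confidence_interval(reductions, z)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")

# Text version of a frequency bar chart: one '*' per file in each bin.
for start, count in frequency_table(reductions).items():
    print(f"{start:>4} to <{start + 10:<4} {'*' * count}")
```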
Submission
Submit a one-document report that responds to the instructions.
Please use a word processor and show your work.
Your report should include the following:
- Your method for collecting file reduction percentages.
- Your answers and written responses for each point listed under Statistics.
- A listing of the data values you collect (may be appended at the end of your report).