CSC 239: Personal Computing for Science
Midterm Project (Due 11:30pm Monday October 17, 2005)
Assignment
This project uses the same data file as homework 3.
Open a new workbook. Download the file cereals.txt and save it in your work directory.
The file contains data on several variables for 77 different brands of cereal.
The variables are
- name: name of cereal,
- calories: calories per serving,
- protein: grams of protein,
- fat: grams of fat,
- sodium: milligrams of sodium,
- carbo: grams of complex carbohydrates,
- sugars: grams of sugars,
- shelf: display shelf (1, 2, or 3, counting from the floor), and
- rating: a rating of the cereals.
Please perform the following steps on the data:
- Import the data into an Excel workbook using a web query.
- Using Excel, draw a histogram for the percentage of calories due to fat. You will
have to calculate this percentage based on the total calories and the grams of fat, where
each gram of fat contains 9 calories. Make sure to choose an appropriate bin range.
[10 points]
- Describe the distribution of the data; is the histogram symmetric or skewed? [2 points]
- Based on the histogram, report the appropriate statistics to describe the center and
spread of the distribution. [3 points]
- Provide an explanation for the distribution of the histogram. (Don't just
provide a description of the histogram.) [5 points]
- Draw a histogram of the grams of carbohydrates in each cereal. [10 points]
- Describe the distribution of the histogram and report the appropriate measures of center
and spread. [5 points]
- Use the normal distribution to answer the following questions:
  - What percentage of cereals have more than 20 grams of carbohydrates? [5 points]
  - What percentage of cereals have between 10 and 20 grams of carbohydrates? [5 points]
  - What is the largest amount of carbohydrates that a cereal can have and still be in the
    bottom 25% of all cereals? [5 points]
- Use the actual counts of the cereals to answer each of the three normal-distribution questions
above, and report whether you think the normal distribution appropriately models the number of
grams of carbohydrates in the cereals. [20 points]
- Using the worktable for computing confidence intervals, compute the 95% confidence interval
for the average carbohydrate content per serving in cereals sold in the US. [5 points]
- Explain to somebody who knows no statistics what the computed result represents. [10 points]
- Give a 98% confidence interval of the average calorie content of the cereals.
A nutritionist claims that cereals contain 110 calories on average.
Is this claim confirmed by your data? [5 points]
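The fat-percentage calculation and the normal-distribution questions above can be sketched outside Excel as a way to check your reasoning. The following is a minimal Python sketch, not the required method: the function name, variable names, and sample values are all hypothetical and are NOT taken from cereals.txt. It also assumes that "between 10 and 20" includes both endpoints, which you should state explicitly in your own answer.

```python
from statistics import NormalDist, mean, stdev

FAT_CALORIES_PER_GRAM = 9  # each gram of fat contains 9 calories (from the assignment)


def percent_calories_from_fat(calories, fat_grams):
    """Percentage of a serving's total calories that come from fat."""
    return 100 * FAT_CALORIES_PER_GRAM * fat_grams / calories


# Hypothetical carbohydrate values for illustration only; NOT from cereals.txt.
carbo = [5, 8, 10.5, 11, 12, 13, 14, 15, 15, 16, 17, 18, 20, 21, 22, 23]

# Fit a normal model using the sample mean and sample standard deviation.
model = NormalDist(mean(carbo), stdev(carbo))

# Answers according to the normal model.
model_above_20 = 100 * (1 - model.cdf(20))
model_10_to_20 = 100 * (model.cdf(20) - model.cdf(10))
model_q25 = model.inv_cdf(0.25)  # cutoff for the bottom 25% under the model

# Answers from the actual counts (assumption: endpoints 10 and 20 are included).
n = len(carbo)
actual_above_20 = 100 * sum(x > 20 for x in carbo) / n
actual_10_to_20 = 100 * sum(10 <= x <= 20 for x in carbo) / n

print("fat % for a 110-calorie serving with 1 g fat:",
      round(percent_calories_from_fat(110, 1), 1))
print("more than 20 g: model", round(model_above_20, 1),
      "vs actual", round(actual_above_20, 1))
print("10 to 20 g:     model", round(model_10_to_20, 1),
      "vs actual", round(actual_10_to_20, 1))
print("25th percentile cutoff under the model:", round(model_q25, 1))
```

If the model percentages and the actual percentages are close, that supports using the normal distribution; large gaps suggest it is a poor fit.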
Make sure that you use good Excel practices: name your input data ranges and use those ranges in your formulas; use
multiple worksheets and name them appropriately; format the sheets so that the information is labeled and
clear. [10 points]
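For the confidence-interval steps, the arithmetic behind an interval for a mean can be sketched as follows. This is only an illustration under stated assumptions: it uses a large-sample z-based interval, whereas the course worktable may use a different critical value (for example, from the t distribution). The function name and the calorie values are hypothetical and are NOT taken from cereals.txt.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_confidence_interval(values, confidence):
    """Large-sample (z-based) confidence interval for the mean of `values`."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return centre - margin, centre + margin


# Hypothetical calorie values for illustration only; NOT from cereals.txt.
calories = [70, 90, 100, 100, 110, 110, 110, 120, 120, 130, 140, 150]

low, high = z_confidence_interval(calories, 0.98)
claimed_mean = 110  # the nutritionist's claim from the assignment

print("98% interval:", round(low, 1), "to", round(high, 1))
print("claim inside interval:", low <= claimed_mean <= high)
```

A claimed average that falls inside the interval is consistent with the data; one that falls outside is not supported at that confidence level. With only a handful of values, as in this sketch, a z-based interval is rough; it is closer to appropriate for a sample as large as the 77 cereals.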
Deliverables
Submit your Excel workbook using Course On-Line.