IT 403 -- Project 2A
Part A: Analyzing a Univariate Dataset
- Collect a univariate dataset containing at
least 15 observations that follows the ideal measurement
model:
- Actual measurement = true measurement + random error
- Here are some suggestions:
- Time for a cup of water to boil on a stove
- Time for your dog to come to you when you call it
- Time to go to work over the same route
- Time for a Red Line train to travel from the Fullerton stop to the Jackson stop
- Time for an email message sent to yourself to arrive
- Number of words on random pages of a book that does not contain figures or tables
- Number of words in newspaper articles
- Your pulse rate measured at random times during the day
- Your weight measured at random times during the day
- Weights of packages of meat of approximately the same weight at a store (you may need to go to more than one store)
- Weight of each potato in a 10 pound bag
- Lengths of "30 second" television commercials in seconds
- Newspaper prices of used cars, which are of the same make and model
- Heights of persons of the same gender in this class
- The roundtrip ping time for an IP packet
- The actual time a spring-loaded kitchen timer takes to "time" 30 seconds.
(Because of random error, there may be more variation than you think.)
- Then use SPSS to answer these questions. Type the answers to the starred
questions at the top of your SPSS output file (.docx file type).
Also make sure that your name is shown at the top of your submission
as well as in the filename.
- Important: Do not sort the data before performing the
analyses.
- *Describe your dataset and the circumstances under which you collected the data.
- Print your dataset.
- *What are Q0, Q1, Q2, Q3, and Q4?
- *What are the sample mean and standard deviation?
- Use SPSS to plot histograms with three different interval widths.
- *Use SPSS to plot a boxplot. What does the boxplot tell you?
- *List any moderate or extreme outliers using the boxplot. See Hint 1 below.
- Create a new column of z-scores in the dataset. See the
Transform Variables section of the Brief SPSS Tutorial.
Print the z-scores.
- *List any moderate or extreme outliers using z-scores. See Hint 2
below.
- *Plot your dataset by observation number. Describe this plot using
the terms unbiased, biased, homoscedastic, and heteroscedastic.
Can you think of any lurking variables, also called
confounding variables, that might cause your dataset
to deviate from an ideal measurement model? A lurking variable is a variable
not included in the dataset that might affect the results.
Hints:
- To find outliers using Q1, Q3, and IQR:
- An extreme outlier is a data point that is more than 3 IQRs below Q1
or more than 3 IQRs above Q3.
- A moderate outlier is a data point that is more than 1.5 IQRs below Q1
or more than 1.5 IQRs above Q3, and is not an extreme outlier.
- To find outliers using z-scores:
- An extreme outlier is a data point that has a z-score of more than
3 or less than -3.
- A moderate outlier is a data point that has a z-score greater than
2 or less than -2, and is not an extreme outlier.
- To plot the dataset by observation number:
- To create the scatterplot of the data points (y-axis) vs. the observation number
(x-axis), select Graphs >> Interactive >>
Scatterplot. Drag the y-variable into the box with the vertical
arrow through it. Drag the observation number variable into the
box with the horizontal arrow through it. Click OK.
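
The two outlier rules in the hints can also be checked by hand or with a short script. The following Python sketch is only an illustrative cross-check, not part of the required SPSS workflow. The function names and the sample data are hypothetical. It assumes quartiles computed with the default method of Python's statistics.quantiles and z-scores based on the sample standard deviation (n - 1 in the denominator). Quartile conventions differ between programs, so the fences may not match the SPSS boxplot exactly.

```python
import statistics


def outliers_by_iqr(data):
    """Classify points using the Q1/Q3/IQR rule from the hints."""
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3.
    q1, _q2, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    found = {"moderate": [], "extreme": []}
    for x in data:
        if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
            found["extreme"].append(x)
        elif x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
            found["moderate"].append(x)
    return found


def outliers_by_z(data):
    """Classify points using the z-score rule from the hints."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (assumption)
    found = {"moderate": [], "extreme": []}
    for x in data:
        z = (x - mean) / sd
        if abs(z) > 3:
            found["extreme"].append(x)
        elif abs(z) > 2:
            found["moderate"].append(x)
    return found


# Hypothetical dataset of 15 observations, kept in collection order (unsorted).
sample = [14, 12, 16, 10, 15, 13, 40, 11, 17, 12, 14, 18, 13, 16, 15]

iqr_result = outliers_by_iqr(sample)
z_result = outliers_by_z(sample)
print("IQR rule:", iqr_result)
print("z-score rule:", z_result)
```

With this made-up dataset, both rules flag only the value 40, and both classify it as extreme. On real data the two rules can disagree, which is why the assignment asks for both lists.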