IT 223 -- Practice Quiz Questions and Answers

The questions and answers for the practice quizzes that have closed will be posted here. The correct answer is marked with an asterisk (*).

--------------------------------------------------

Practice Quiz 1a. Due Apr 15.

1. Who invented the field of exploratory data analysis?
   a. Abraham de Moivre
   b. Karl Gauss
   *c. John Tukey
   d. Robert Fisher

2. For a controlled experiment, the subjects are divided into two groups: the group that received the vaccine and the group that received the placebo. The group that receives the vaccine is called the treatment group. What is the group that receives the placebo called? The __________ group.
   a. antidote
   *b. control
   c. primary
   d. secondary

3. For the sorted dataset 28 58 74 83 100, use the Tukey's Hinges method to compute Q1.
   a. 43
   *b. 58
   c. 66
   d. 74

4. For the sorted dataset 5 28 58 73 83 93 100, use the Tukey's Hinges method to compute Q3.
   a. 78
   b. 83
   *c. 88
   d. 93

5. A histogram is constructed using the bins [0,40), [40,80), [80,120) and the sorted data 5 24 40 58 78 80 87 91 100. How many observations are in the bin [40,80)?
   a. 2
   *b. 3
   c. 4
   d. 6

--------------------------------------------------

Practice Quiz 2a. Due Apr 18.

1. Given the sorted dataset 23 45 76 79 123 145 155 175 183, what is the third quartile (Q3)? Use the Tukey's Hinges method to obtain your answer.
   a. 123
   b. 150
   *c. 155
   d. 165

2. A boxplot drawn by the R boxplot function shows that Q1=1200, Q2=1300, and Q3=1500. Where are the inner fences located?
   a. 300 and 2400
   *b. 750 and 1950
   c. 1000 and 1700
   d. 1200 and 1500

3. A histogram is drawn using the bins [0,1), [1,3), [3,7] with frequencies 20%, 60%, 20%, respectively. What is the median (Q2) of this histogram?
   a. 1.5
   *b. 2.0
   c. 2.25
   d. 2.5

4. The x-value where the slope of a curve is zero is called a(n) ______________.
   a. asymptote
   *b. critical point
   c. inflection point
   d. terminal point

5. The x-value where the curve changes from concave down to concave up or from concave up to concave down is called a(n) ________________.
   a. asymptote
   b. critical point
   *c. inflection point
   d. terminal point

--------------------------------------------------

Practice Quiz 2b. Due Apr 21.

1. Which dataset variable is a continuous variable?
   a. country
   b. gender
   *c. height
   d. occupation

2. Which measure of central tendency is most affected by outliers?
   a. interquartile range
   b. sample median
   *c. sample mean
   d. trimmed mean

3. The data model
      actual measurement = true measurement + random error
   is called the ____________ measurement model.
   a. exact
   *b. ideal
   c. partial
   d. semi-

4. The sample mean and sample standard deviation form a ____________ description of a normal (bell-shaped) histogram.
   a. exact
   b. complete
   c. generous
   *d. parsimonious

5. The R function call hist(x), where x is the data vector, produces a histogram with bins that are ___________.
   a. all-inclusive
   b. left-inclusive
   *c. right-inclusive

--------------------------------------------------

Practice Quiz 3a. Due Apr 25.

1. What is a parsimonious description of a density histogram? A(n) ____________ description.
   a. extravagant
   b. generous
   c. redundant
   *d. terse

2. What does it mean that the normal histogram is ubiquitous in statistics? It is ___________.
   a. abnormal
   b. extraordinary
   *c. pervasive
   d. rare

3. If u is the sum of the squared deviations from the sample mean, which expression represents SD+? Here sqrt(x) represents the square root of x.
   a. u/n
   b. u/(n-1)
   c. sqrt(u/n)
   *d. sqrt(u/(n-1))

4. What does MAD mean?
   *a. Mean absolute deviation
   b. Mean angle dispersion
   c. Measure adjustment direction
   d. Median angle dispersion

5. A histogram consists of two bars. The interval [3,4] contains 0.7 of the observations; the interval (4,5] contains 0.3 of the observations. What is the mean of the histogram?
   a. 3.50
   b. 3.71
   *c. 3.80
   d. 4.00

--------------------------------------------------

Practice Quiz 3b. Due Apr 28.

1. At the Best Films movie theatre chain, the amount that customers spend on concessions is normally distributed with mean=$4.00 and sd=$0.75. What percentage of the customers spend less than $4.00 on concessions?
   a. 16%
   *b. 50%
   c. 60%
   d. 68%

2. At the Best Films movie theatre chain, the amount that customers spend on concessions is normally distributed with mean=$4.00 and sd=$0.75. What percentage of the customers spend between $4.00 and $4.75 on concessions?
   a. 5%
   b. 16%
   *c. 34%
   d. 68%

3. At the Best Films movie theatre chain, the amount that customers spend on concessions is normally distributed with mean=$4.00 and sd=$0.75. What percentage of the customers spend more than $5.50 on concessions?
   *a. 2.5%
   b. 5%
   c. 10%
   d. 32%

4. At the Best Films movie theatre chain, the amount that customers spend on concessions is normally distributed with mean=$4.00 and sd=$0.75. What is the median amount (50th percentile) that customers spend on concessions?
   a. $3.00
   b. $3.25
   *c. $4.00
   d. $4.75

5. At the Best Films movie theatre chain, the amount that customers spend on concessions is normally distributed with mean=$4.00 and sd=$0.75. What is the 75th percentile of the amount that customers spend on concessions?
   a. $4.25
   *b. $4.51
   c. $4.75
   d. $5.01

--------------------------------------------------

Practice Quiz 4a. Due May 16.

1. Which of the following is NOT a name for a regression line?
   a. least squares line
   b. line of averages
   c. linear trend line
   *d. standard deviation line

2. What should the residual plot show for a good regression model?
   a. biased and heteroscedastic
   b. biased and homoscedastic
   c. unbiased and heteroscedastic
   *d. unbiased and homoscedastic

3. What should the normal plot of the residuals show for a good regression model?
   *a. approximately normal
   b. fat tails
   c. skewed to the right
   d. thin tails

4. If x <- 0:4 and y <- c(0, 3, 1, 2, 4), compute the correlation of x and y.
   a. 0.1
   b. 0.4
   *c. 0.7
   d. 0.8

5. If x <- 0:4 and y <- c(0, 3, 1, 2, 4), compute the linear regression equation for predicting y from x.
   a. y = -0.7 * x + 0.8
   b. y = 0.6 * x + 0.8
   *c. y = 0.7 * x + 0.6
   d. y = 0.8 * x + 0.7

--------------------------------------------------

Practice Quiz 4b. Due May 19.

1. If x is the independent variable and y is the dependent variable, and xbar = 10, ybar = 30, SDx = 2, SDy = 8, and r = 0.5, what is the estimated regression equation?
   a. y = 0.5x + 10
   *b. y = 2x + 10
   c. y = 8x - 20
   d. y = 8x - 50

2. If the estimated regression equation is y = 3x + 5, what are the predicted values that correspond to the independent variable values x1=0, x2=1, and x3=3?
   a. 0, 1, 3
   b. 1, 3, 8
   *c. 5, 8, 14
   d. 8, 11, 14

3. For a regression equation (different from the ones in problems 1 and 2), the actual dependent variable values are 38, 54, and 42, and the corresponding predicted values are 30, 60, and 40. What are the estimated residuals?
   a. 1, -1, 1
   b. 4, 7, 1
   *c. 8, -6, 2
   d. 68, 114, 82

4. For a simple regression model, SDy = 15 and r = 0.6. What is the root mean square error (RMSE)?
   a. 6
   b. 9
   *c. 12
   d. 15

5. What is the name of this phenomenon in statistics? In a pre-test, post-test situation, if the pre-test score is high, the post-test score is usually lower, when measured in standard deviations from the mean. If the pre-test score is low, the post-test score is usually higher, when measured in standard deviations from the mean. The name of this phenomenon is the regression _______________.
   a. adjustment
   b. collaboration
   c. conspiracy
   *d. fallacy

--------------------------------------------------

Practice Quiz 5a. Due May 26.

1. The root mean square error (RMSE) is the spread of the residuals around _____________.
   a. a horizontal line with intercept y = ybar
   *b. the regression line
   c. the x-axis
   d. the y-axis

2. SDx and SDy denote the x and y standard deviations. If the correlation between x and y is 1, what is the RMSE equal to?
   *a. 0
   b. 1
   c. SDx
   d. SDy

3. SDx and SDy denote the x and y standard deviations. If the correlation between x and y is 0, what is the RMSE equal to?
   a. 0
   b. 1
   c. SDx
   *d. SDy

4. What is the sample space? It is the set of all __________.
   a. estimated values
   b. events
   c. fair bets
   *d. outcomes

5. What is an event? It is a(n) _____________ of the sample space.
   a. element
   b. expected value
   *c. subset
   d. summary

--------------------------------------------------

Practice Quiz 5b. Due May 26.

1. What condition is required so that P(A and B) = P(A) * P(B)?
   *a. A and B are independent.
   b. A and B are mutually exclusive.
   c. P(A) = P(B)
   d. 0.5 < P(A) and 0.5 < P(B)

2. Which condition is needed to ensure that P(A or B) = P(A) + P(B)?
   a. A and B are independent.
   *b. A and B are mutually exclusive.
   c. P(A) = P(B)
   d. 0.5 < P(A) and 0.5 < P(B)

3. X and Y are random variables. If E(X) = 7000 and E(Y) = 9000, what is E(X + Y)?
   a. 7000
   b. 9000
   c. 14000
   *d. 16000

4. X and Y are random variables. If E(X) = 7000, then what is E(100 * X)?
   a. 7000
   b. 9000
   c. 16000
   *d. 700000

5. Roll a six-sided fair die (probability of each outcome = 1/6) 10 times. What is the probability of obtaining at least one ace (1)?
   a. 16%
   *b. 84%
   c. 167%
   d. 100%

--------------------------------------------------

Practice Quiz 6a. Due June 2.

1. On an assembly line, the probability of finding a defective part is 15%. Out of 12 parts, what is the probability that no parts are defective?
   *a. 14%
   b. 17%
   c. 30%
   d. 86%
   e. 91%

2. On an assembly line, the probability of finding a defective part is 15%. What is the probability that at least one part out of 12 is defective?
   a. 14%
   b. 17%
   c. 30%
   *d. 86%
   e. 91%

3. On an assembly line, the probability of finding a defective part is 15%. What is the probability that exactly one part out of 12 is defective?
   a. 14%
   b. 17%
   *c. 30%
   d. 86%
   e. 91%

4. On an assembly line, the probability of finding a defective part is 15%. What is the probability that exactly 3 out of 12 parts are defective?
   a. 14%
   *b. 17%
   c. 30%
   d. 86%
   e. 91%

5. On an assembly line, the probability of finding a defective part is 15%. What is the probability that 3 or fewer parts out of 12 are defective?
   a. 14%
   b. 17%
   c. 30%
   d. 86%
   *e. 91%

--------------------------------------------------

Practice Quiz 7a. Due June 11.

1. A researcher performs a t-test which tests the null hypothesis that mu = c, where c is a constant. Usually this researcher wants to ___________ the null hypothesis.
   a. accept
   *b. reject

2. The result of a test of significance with p-value = 0.0387 is said to be ________.
   a. highly significant
   b. not significant
   c. satisfactory
   *d. significant

3. The result of a test of significance with p-value = 0.4305 is said to be ________.
   a. highly significant
   *b. not significant
   c. satisfactory
   d. significant

4. The result of a test of significance with p-value = 0.000758 is said to be ________.
   *a. highly significant
   b. not significant
   c. satisfactory
   d. significant

5. A t-test is designed to test the null hypothesis mu = 150. The level of the test is alpha=0.05. The number of observations is 7. Which confidence interval should be used?
   a. [-1.96, 1.96]
   b. [-2.00, 2.00]
   c. [-2.36, 2.36]
   *d. [-2.45, 2.45]

--------------------------------------------------
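
The marked answers for Practice Quiz 6a and for question 5 of Practice Quiz 7a can be checked numerically. The following is a minimal sketch in R, not part of the original quizzes. It assumes that the 12 parts are independent, so the number of defective parts is binomial (the quiz implies this but does not state it), and that the t-test is two-sided. The variable names are illustrative only.

```r
# Practice Quiz 6a: number of defective parts among n = 12,
# ASSUMING independent parts with P(defective) = 0.15.
n <- 12
p <- 0.15

p_none     <- dbinom(0, size = n, prob = p)  # no parts defective
p_at_least <- 1 - p_none                     # at least one defective
p_one      <- dbinom(1, size = n, prob = p)  # exactly one defective
p_three    <- dbinom(3, size = n, prob = p)  # exactly three defective
p_up_to_3  <- pbinom(3, size = n, prob = p)  # three or fewer defective

# Percentages rounded to whole numbers: 14, 86, 30, 17, 91
print(round(100 * c(p_none, p_at_least, p_one, p_three, p_up_to_3)))

# Practice Quiz 7a, question 5: two-sided critical values at alpha = 0.05
# (two-sidedness is an assumption); 7 observations give df = 7 - 1 = 6.
alpha  <- 0.05
t_crit <- qt(1 - alpha / 2, df = 7 - 1)

# Rounded to two decimals: -2.45 and 2.45
print(round(c(-t_crit, t_crit), 2))
```

The five binomial results line up with choices a through e of Practice Quiz 6a, and the t critical values match the marked interval [-2.45, 2.45].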