Stats 250 W15 Exam 2 Solutions
1. Protecting the Environment Part 1 ~ Some people consider themselves “green” meaning they are supportive
when it comes to environmental issues. But how do people really act? American per capita use of energy is
roughly double that of Western Europeans. Would people be willing to pay more for gas to fund environmental
projects? A research team at Michigan State University selected a random sample of 1000 Michigan adults and
asked each if they would support an additional 2% tax on gasoline in order to fund various environmental
projects. For this sample, 450 of the 1000 adults reported a willingness to do so. Create a 90% confidence
interval to estimate the population proportion for all Michigan adults who would support such a gasoline tax.
[3] Note that a general (not conservative) interval was requested here and z* of 1.64 or 1.65 is ok.
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;\rightarrow\; 0.45 \pm 1.645\sqrt{\frac{(0.45)(1-0.45)}{1000}}$$
$$0.45 \pm 1.645(0.0157) \;\rightarrow\; 0.45 \pm 0.0259$$
Final Answer: ____0.4241______ to ____0.4759______
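As a quick check of the arithmetic above, the interval can be recomputed with a short sketch. The variable names are illustrative only, and z* is taken from the standard normal quantile rather than a table, so it is about 1.645.

```python
from math import sqrt
from statistics import NormalDist

x, n = 450, 1000          # supporters and sample size from the problem
conf_level = 0.90

p_hat = x / n
# z* leaves (1 - conf_level)/2 in each tail of N(0,1); about 1.645 here
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
margin = z_star * se

lower, upper = p_hat - margin, p_hat + margin
print(f"{lower:.4f} to {upper:.4f}")
```

Using a table value of 1.64 or 1.65 instead changes the endpoints only in the fourth decimal place.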
2. Protecting the Environment Part 2 ~ A researcher at the University of Minnesota wanted to conduct a similar
study as did Michigan State University to estimate the population proportion of Minnesota adults who would
support an additional 2% tax on gasoline to fund environmental projects.
a. The researcher would like to have a 95% confidence interval estimate with a width of (at most) 8%.
Determine the minimum sample size that would be required.
[3]
Note that a width of 0.08 is the same as 2m (with a margin of error of 0.04).
With z* = 1.96:
$$n = \left(\frac{z^*}{2m}\right)^2 = \left(\frac{1.96}{0.08}\right)^2 = (24.5)^2 = 600.25$$
and ALWAYS ROUND UP, so n = 601.
If z* = 2 is used, then n = 625 is the final answer.
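The sample size calculation can be sketched as follows. The helper `min_sample_size` is a hypothetical name, and exact fractions are used (an implementation choice, not part of the solution) so that a result such as exactly 625 is not pushed up by floating-point noise before rounding up.

```python
from fractions import Fraction
from math import ceil

def min_sample_size(width: str, z_star: str) -> int:
    """Conservative sample size n = (z* / (2m))^2, rounded UP.

    width is the full interval width (2m); inputs are decimal strings
    so the arithmetic is exact.
    """
    m = Fraction(width) / 2                     # margin of error
    n_exact = (Fraction(z_star) / (2 * m)) ** 2
    return ceil(n_exact)

print(min_sample_size("0.08", "1.96"))  # 600.25 rounds up to 601
print(min_sample_size("0.08", "2"))     # exactly 625
```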
[2] b. We are 95% confident that the population proportion of all Minnesota adults who would support an
additional 2% tax on gasoline to fund environmental projects is in the interval 0.40 to 0.48.
Correct Incorrect
[2] c. Based on the same survey results, the width for a 90% confidence interval for the population proportion all
Minnesota adults who would support an additional 2% tax on gasoline to fund environmental projects
would be more than 8%.
Correct Incorrect
[2] d. Consider now using these same Minnesota survey results to test, at a significance level of 0.05, the null
hypothesis H0: p = 0.50 against the majority hypothesis of Ha: p > 0.50.
Then the resulting p-value will be (circle one) greater than less than 0.05,
Note: if selected “LESS THAN” as answer at start of sentence, then must circle “WOULD BE”
to earn 1 of 2 points; if “LESS THAN” and “WOULD NOT BE” both points are lost as not consistent.
c. Based on the pilot study, the decision was made to follow up with the larger scale study. Out of the
300 Cleveland adults that were contacted, 66 reported their roads are in “poor” shape. Report the test
statistic with its symbol, provide the corresponding p-value and a well-labelled sketch showing that p-value.
[6]
$$Z = \frac{0.22 - 0.25}{\sqrt{\frac{(0.25)(0.75)}{300}}} = \frac{-0.03}{0.025} = -1.2$$
and the p-value will be the area to the LEFT of -1.2 under the N(0,1) model.
[Sketch: N(0,1) curve over a Z (values) axis marked at -1.2 and 0; p-value = shaded area to the left of -1.2]
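A minimal sketch of this test statistic and the left-tail area. It assumes a null value of p0 = 0.25 and a lower-tailed alternative, which is what the worked answer of Z = -1.2 and "area to the left" imply; the earlier parts of the question are not shown here.

```python
from math import sqrt
from statistics import NormalDist

x, n = 66, 300      # "poor" responses and sample size from part (c)
p0 = 0.25           # assumed null value, consistent with Z = -1.2

p_hat = x / n                                 # 0.22
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # null standard error in the denominator
p_value = NormalDist().cdf(z)                 # area to the LEFT of z under N(0,1)

print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
```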
One-Sample Test (Test Value = 0)
                                            Mean         95% Confidence Interval of the Difference
              t       df   Sig. (2-tailed)  Difference   Lower     Upper
Difference    2.918   5    .033             .5383        .0641     1.0126
a. Briefly explain why this is a paired design.
[1] Each of the 6 subjects was measured twice OR we have two measurements on each subject.
b. The treatment will be considered successful if the population mean difference in T cell counts is higher after
20 days on blinatumomab over the baseline. For this situation, specify the null and alternative hypotheses.
[3]
H0: _____ µd = 0 _________ Ha: _____ µd > 0__________
c. Use the provided SPSS output to report the test statistic and the exact p-value that corresponds with your
hypotheses in part (b).
[2]
Test statistic: _____ t = 2.918 ________ p-value: ____ 0.033/2 = 0.0165 _______
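Because the alternative is one-sided (upper tail) and t is positive, the p-value is half of the two-tailed Sig. value, about 0.0165. As a rough cross-check that does not rely on SPSS, the upper-tail area of a t(5) model can be approximated numerically. The functions below are illustrative helpers built on Simpson's rule, not a library API.

```python
from math import gamma, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule area on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_one_sided = t_upper_tail(2.918, 5)
print(f"one-sided p-value = {p_one_sided:.4f}")
```

The result agrees with 0.033/2 to the precision SPSS reports, and it is below the 0.05 significance level used in part (d).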
d. Which of the following is the appropriate statistical decision and conclusion at the 5% significance level?
[2] Circle one.
• Reject H0; there is insufficient evidence to say the population mean difference in T cell counts
is higher after 20 days on blinatumomab over the baseline.
• Reject H0; there is sufficient evidence to say the population mean difference in T cell counts
is higher after 20 days on blinatumomab over the baseline.
• Fail to reject H0; there is insufficient evidence to say the population mean difference in T cell counts
is higher after 20 days on blinatumomab over the baseline.
• Fail to reject H0; there is sufficient evidence to say the population mean difference in T cell counts
is higher after 20 days on blinatumomab over the baseline.
e. The researcher remembers there are “assumptions” for any test, to ensure its integrity. She believes the
subjects are representative of the population and can be treated as a random sample; but since her sample
size was small, there is an additional “assumption.” Clearly state that assumption in context.
[2] The assumption: The population of differences in T cells (after 20 days less baseline) should be normal.
OR The differences in T cell counts (after less base) for the population of all subjects should follow
a normal model.
a. A histogram of the 35 observations indicates that the model for the number of daily steps in the population
may not be normal. However, a confidence interval to estimate the population mean can still be
constructed because:
$$\bar{x} \pm t^*[\text{s.e.}(\bar{x})] \;\rightarrow\; 5765 \pm 2.04\left(\frac{800}{\sqrt{35}}\right) \;\rightarrow\; 5765 \pm 2.04(135.2247) \;\rightarrow\; 5765 \pm 275.8584$$
Or 5489.1 to 6040.9
Final Answer: ____ 5489.1 steps ____ to ____ 6040.9 steps _____
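The interval arithmetic can be sketched as below. The inputs (sample mean 5765, standard deviation 800, t* = 2.04) are assumptions recovered from the reported endpoints and the sample size of 35; the original problem statement giving them is not reproduced here.

```python
from math import sqrt

# Assumed summary values, chosen to be consistent with the reported interval
x_bar, s, n = 5765, 800, 35
t_star = 2.04            # assumed table multiplier for 95% confidence

se = s / sqrt(n)         # standard error of the sample mean
margin = t_star * se
lower, upper = x_bar - margin, x_bar + margin

print(f"{lower:.1f} steps to {upper:.1f} steps")
```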
c. Consider the following incorrect statement regarding the 95% confidence level used.
Briefly edit the statement to make it correct so it can be used in the report summary.
Do not rewrite the sentence.
[2]
Final answer: __ p̂ __ (symbol) = __ 0.5364 __
c. One of the conditions for the test to be valid involves having two independent random samples, which is
reasonable from the design of the study. Validate the remaining assumption.
[2]
We need that each sample would be EXPECTED (under the null hypothesis) to have at least 10 Blue/Black
responses and at least 10 White/Gold responses. Using the common estimate from part (b) we have:
250(0.5364) = 134.1 and 250(1 – 0.5364) = 250(0.4636) = 115.9
300(0.5364) = 160.92 and 300(1 – 0.5364) = 300(0.4636) = 139.08
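The four expected counts can be checked against the threshold of 10 with a small sketch. The variable names are illustrative, and the pooled estimate is the value reported in part (b).

```python
p_pooled = 0.5364            # common estimate from part (b)
sample_sizes = (250, 300)    # the two independent samples

expected_counts = []
for n in sample_sizes:
    expected_counts.append(n * p_pooled)         # expected Blue/Black
    expected_counts.append(n * (1 - p_pooled))   # expected White/Gold

# The condition holds only if every expected count is at least 10
condition_met = all(count >= 10 for count in expected_counts)
print([round(c, 2) for c in expected_counts], condition_met)
```

All four counts are well above 10, so the remaining condition is satisfied.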
d. Based on the results, the observed test statistic value is 1.87 with a corresponding p-value of 0.0307.
Which of the following is a correct meaning of this p-value?
[2] Circle one.
• the probability that the null hypothesis is true is 0.0307.
• the probability of seeing a test statistic of 1.87 or more extreme is 0.0307.
• if there is no difference in the population rates for seeing the dress as blue/black,
we would see a test statistic of 1.87 or more extreme with probability 0.0307.
• none of the above statements is a correct and complete meaning for this p-value.
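The stated p-value can be reproduced as the upper-tail area beyond 1.87 under N(0,1), the model that applies when the null hypothesis of no difference is true. This sketch assumes an upper-tailed alternative, which matches the reported value.

```python
from statistics import NormalDist

z_observed = 1.87
# Upper-tail area: probability of a test statistic this large or larger,
# computed under the null (standard normal) model
p_value = 1 - NormalDist().cdf(z_observed)
print(f"p-value = {p_value:.4f}")
```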
H0: Model for Number of Office Hours attended by Students Weekly during the Fall term
X = # Hours 0 1 2 3 4 5
Probability 0.1 0.4 0.2 0.1 0.1 0.1
Mean number of office hours attended weekly = 2 hours, with standard deviation = 1.5 hours
Ha: Model for Number of Office Hours attended by Students Weekly during the Winter term
X = # Hours 0 1 2 3 4 5
Probability 0 0.1 0.1 0.1 0.3 0.4
Mean number of office hours attended weekly = 3.8 hours, with standard deviation = 1.3 hours
a. In the Spring term, a former student of Dr. Z’s stops by his office to pick up his final exam. The student
cannot remember which term he took the class, so Dr. Z decides he will ask the student how many office
hours he attended weekly when in his class. If the student’s response is 4 hours or more, Dr. Z will decide
the student was in the Winter term class. So Dr. Z will be picking between the following hypotheses:
H0: The student was in the Fall term class versus Ha: The student was in the Winter term class
using the decision rule: Reject H0 if the number of office hours attended weekly was at least 4.
i. For this decision rule, find the level of significance, that is, compute α. Apply the DEFINITION of α.
[2]
α = P(Reject H0 when H0 is true) = P(X = 4 or X = 5 under the H0 model) = 0.1 + 0.1 = 0.2
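The two models and the significance level can be verified with a short sketch. The helper `mean_sd` is a hypothetical name; the probabilities are copied from the two tables above.

```python
from math import sqrt

hours = [0, 1, 2, 3, 4, 5]
fall = [0.1, 0.4, 0.2, 0.1, 0.1, 0.1]     # H0 model
winter = [0, 0.1, 0.1, 0.1, 0.3, 0.4]     # Ha model

def mean_sd(values, probs):
    """Mean and standard deviation of a discrete random variable."""
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, sqrt(variance)

fall_mean, fall_sd = mean_sd(hours, fall)
winter_mean, winter_sd = mean_sd(hours, winter)

# alpha = P(reject H0 | H0 true) = P(X >= 4) under the Fall model
alpha = sum(p for x, p in zip(hours, fall) if x >= 4)

print(round(fall_mean, 1), round(fall_sd, 1))
print(round(winter_mean, 1), round(winter_sd, 1))
print(round(alpha, 1))
```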
9. Truck Weight Limit ~ A warehouse has a fleet of small trucks to transport crates of their floor tile. Each truck
can carry a maximum load of 2000 pounds. Suppose that the weight of a standard crate of floor tile is normally
distributed with a mean weight of 480 pounds and a standard deviation of 20 pounds.
What is the probability that a random sample of 4 crates placed in a truck will exceed the maximum load? (Hint:
think about what exceeding the maximum load would imply about the sample mean weight for these 4 crates.)
[4] Using the hint to think about the sample mean … since X=crate weight has a N(480,20) model, the sample
mean (for a random sample of 4 crates) will also have a normal model N(480, 20/sqrt(4)), that is, N(480,10).
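The remaining step follows from the hint: the 4 crates exceed 2000 pounds exactly when their sample mean exceeds 2000/4 = 500 pounds, which is z = (500 - 480)/10 = 2 standard deviations above the mean of the N(480, 10) model. A minimal sketch of that calculation, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 480, 20        # crate weight model N(480, 20)
n, max_load = 4, 2000

threshold_mean = max_load / n                      # total > 2000 means mean > 500
mean_model = NormalDist(mu, sigma / sqrt(n))       # sample mean ~ N(480, 10)
z = (threshold_mean - mu) / mean_model.stdev       # z = 2
prob_exceed = 1 - mean_model.cdf(threshold_mean)   # upper-tail area

print(f"z = {z:.1f}, P(exceed) = {prob_exceed:.4f}")
```

The probability comes out to about 0.0228.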
10. Statistically Significant – A researcher will compare middle school students and high school students on 25 different
Yes/No questions. He will use a 10% significance level to carry out the 25 independent samples z-tests
(one for each question) to compare the two population proportions. If, for each test, the null hypothesis of
no difference in population proportions is actually true, how many decisions are expected to be correct?
[2]
25 different tests each with H0 true and conducted at the 10% level; so expect 10% of the decisions
to be wrong, so 90% of the 25 tests or 22.5 tests are expected to be correct.
Common errors: (1) 2.5 tests = how many tests expected to REJECT H0 (be statistically significant) which
would be an Incorrect decision, (2) rounded to whole value (but expect 22.5), (3) 90% (not how many).
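The expected counts follow directly from the significance level. In this short sketch (variable names are illustrative), each test with a true H0 is wrongly rejected with probability α = 0.10.

```python
n_tests = 25
alpha = 0.10     # P(reject H0 | H0 true) for each test

expected_wrong = n_tests * alpha           # expected Type I errors
expected_correct = n_tests * (1 - alpha)   # expected correct "fail to reject" decisions

print(expected_wrong, expected_correct)
```

An expected value does not need to be a whole number, which is why 22.5 is left unrounded.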
If a CI, then provide the notation for the corresponding parameter the CI is for.
If a HT, then clearly state the appropriate null and alternative hypotheses to be tested.
The last scenario has one additional question, so be sure to answer it too.
a. The Dean of a college wants to learn about the proportion of all students who have a summer internship before they
graduate. His research team takes a representative sample of 200 students who will be graduating this May and
finds that 85 of the 200 sampled students had a previous summer internship.
[2]
b. A sociologist developed a test to measure attitudes about public transportation. She is interested in estimating
the difference between the average score for younger residents (under 25 years old) and the average score for
older residents (50 years or older).
[2]
c. The average age of customers at a local nightclub has been 25 years old. The owner has re-modeled the club
in hopes the new décor will attract an older crowd. She will take a random sample of recent customers and
record their ages to assess if there has been an increase in the average age.
[2]
CI for ____________ OR H0: _____ µ = 25_______ versus Ha: _____ µ > 25___________
The parameter ___ µ ____ represents ___ the POPULATION AVERAGE age of all customers with new décor
__ or the MEAN age of ALL customers that visit the re-modeled club. ____________________________
Common error = population must be of all ages for all RECENT customers (so after remodel).