STAB22 FinalExam 2013F PDF
[...] $\sqrt{0.04} = 0.2 \Rightarrow P(A) = 2 \times 0.2 = 0.4 \Rightarrow P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B) = 0.4 + 0.2 - 0.08 = 0.52$.
2. Historical evidence suggests that SAT scores are normally distributed with mean 1000
and standard deviation 180. What score (approximately) do you have to make to be in
the top 1 percent of all those who are taking this exam?
A) 586
B) 1180
C) 1234
D) 1325
E) 1415
Solution: Let $X \sim N(1000, 180)$ and find $x$ with $P(X \ge x) = 0.01$. Solving $P(Z \ge z) = 0.01$
gives $z = 2.325$, so $x = 1000 + 2.325 \times 180 \approx 1418$. The closest is 1415.
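The cutoff in question 2 can be checked numerically. The sketch below is only an illustration, assuming Python's standard-library `statistics.NormalDist`; the variable names are not part of the exam.

```python
from statistics import NormalDist

# SAT scores modelled as Normal with mean 1000 and standard deviation 180
sat = NormalDist(mu=1000, sigma=180)

# The top 1 percent starts at the 99th percentile
cutoff = sat.inv_cdf(0.99)
print(round(cutoff, 1))
```

The exact quantile differs slightly from the table value z = 2.325, but option E (1415) is still the closest choice.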
3. Which of the following values is closest to the interquartile range for the standard normal
distribution?
A) 0
B) 0.5
C) 1.3
D) 2.5
E) 3
Solution: $Q_1 = P_{25} \approx -0.675$; $Q_3 = P_{75} \approx 0.675$; $IQR = Q_3 - Q_1 = 0.675 + 0.675 = 1.35$.
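As a quick check of question 3, the quartiles of the standard normal distribution can be computed directly. This is a minimal sketch, assuming the standard-library `statistics.NormalDist`; the names `q1`, `q3` and `iqr` are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

q1 = std_normal.inv_cdf(0.25)  # 25th percentile
q3 = std_normal.inv_cdf(0.75)  # 75th percentile
iqr = q3 - q1
print(round(q1, 3), round(q3, 3), round(iqr, 3))
```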
4. The distribution of bachelor's degrees conferred by a local college is listed below, by
major.
Major Frequency
English 2,073
Mathematics 2,164
Chemistry 318
Physics 856
Liberal Arts 1,358
Business 1,676
Engineering 868
Total 9,313
What is the probability that a randomly selected bachelor's degree is not in Mathematics?
A) 0.232
B) 0.768
C) 0.303
D) 0.682
E) 0.889
Solution: $P(\text{not mathematics}) = 1 - P(\text{mathematics}) = 1 - \frac{2164}{9313} = 0.768$.
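The complement rule in question 4 can be reproduced from the frequency table. The following is a small sketch under the assumption that the table is stored as a Python dictionary; the name `degrees` is hypothetical.

```python
# Frequencies copied from the table in question 4
degrees = {
    "English": 2073,
    "Mathematics": 2164,
    "Chemistry": 318,
    "Physics": 856,
    "Liberal Arts": 1358,
    "Business": 1676,
    "Engineering": 868,
}

total = sum(degrees.values())

# Complement rule: P(not Mathematics) = 1 - P(Mathematics)
p_not_math = 1 - degrees["Mathematics"] / total
print(total, round(p_not_math, 3))
```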
5. The two-way table below classifies the members of a fitness club by gender and food
habits. Use this information to answer this question and the next question.
Women Men
Vegetarian 9 3
Non-vegetarian 8 10
If a member is selected at random from the club, what is the probability that this person
is a vegetarian female?
A) 3/4
B) 3/10
C) 1/10
D) 4/9
E) 9/17
Solution: P(Female and vegetarian) = 9/30 = 3/10.
6. Using the information in question 5 above, what is the probability that a randomly
selected member is non-vegetarian?
A) 2/5
B) 1/3
C) 4/15
D) 3/5
E) 8/17
Solution: P(non-vegetarian) = 18/30 = 3/5.
7. Which of the following procedures best describes a method to select a simple random
sample?
A) Randomly select half of the sample from females and the remaining half from
males.
B) Select an individual from every fourth house on a street.
C) Assign each individual in the population a unique number and use a computer
or a random number table to randomly generate numbers for selection.
D) Select every individual with a surname beginning with the letter S.
E) Select every 20th individual from a list of patients registered with a GP.
8. The regression equation relating midterm score (x) and final exam score (y) in a course
is $\hat{y} = 20 + 0.8x$. Which of the following is a valid conclusion based on this information?
A) If Adam has scored 10 points higher than Bob on the midterm, then Adam
will score 8 points higher than Bob on the final exam.
B) Students who score 80 on the midterm are predicted to score 84 on the final.
C) If a student scores 0 on the midterm, then that student will score 20 on the
final.
D) On average, the students' scores on the final exam will only be 80% of what
they were on the midterm.
E) This equation shows that the midterm and the final exam scores are not related.
Solution: $\hat{y} = 20 + 0.8 \times 80 = 84$. Key word: predicted.
9. In a study of acupuncture for treating pain, 100 volunteers were recruited. Half were
randomly assigned to receive acupuncture and the other half to receive a sham acupunc-
ture treatment. The patients were followed for 6 months and the treating physician
measured their degree of pain relief. The patients did not know which treatment they
actually received, but the treating physicians were aware of who was getting acupuncture
and who wasn't. Which of the following best describes this study?
A) This is a retrospective observational study.
B) This is a randomized block design.
C) This is a randomized single-blind experiment.
D) This is a randomized double-blind experiment.
E) This is a prospective observational study.
10. In a regression study, which of the following best describes the square of the correlation
$r^2$ between the two variables?
A) The amount of error that is incurred from using a regression model to make
predictions
B) The amount of the variability explained by the standard deviations of the
response and explanatory variables
C) The proportion of the total variation in the response variable explained by the
straight line relationship with the explanatory variable
D) The sum of squares of the residuals
E) None of the above
11. Consider the random experiment of rolling a six sided die for which the sample space is
S = {1, 2, 3, 4, 5, 6}. We know that each side of the die has probability 1/6 of coming
up. Which of the following pairs of events are independent?
A) A = {1, 2} , B = {1, 2, 3}
B) A = {1, 2}, B = {2, 3}
C) A = {1, 3, 5}, B = {2, 4, 6}
D) A = {1, 2, 3}, B = {3, 4}
E) None of the above
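Solution: For A = {1, 2, 3}, B = {3, 4}: $P(A) = 1/2$ and $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{1/6}{2/6} = 1/2$,
i.e. $P(A \mid B) = P(A)$, and so A and B are independent.
Or: $P(A) = 1/2$, $P(B) = 1/3$ and $P(A \text{ and } B) = P(\{3\}) = 1/6 = 1/2 \times 1/3 = P(A)P(B)$,
and so A and B are independent.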
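The product rule used in question 11 can be applied to all four candidate pairs at once. This is an illustrative sketch using exact fractions; the helper names `prob` and `independent` are assumptions, not part of the exam.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of faces) for a fair six-sided die."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

# The event pairs from options A to D
pairs = {
    "A": ({1, 2}, {1, 2, 3}),
    "B": ({1, 2}, {2, 3}),
    "C": ({1, 3, 5}, {2, 4, 6}),
    "D": ({1, 2, 3}, {3, 4}),
}
results = {label: independent(a, b) for label, (a, b) in pairs.items()}
print(results)
```

Only the pair in option D satisfies the product rule.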
12. Which of the following pairs of events (denoted by A and B) are disjoint and independent?
A) Toss two coins, A = first toss is head, B = second toss is head
B) Roll a die, A = observe an odd number, B = observe an even number
C) Take two cards one by one from a deck of 52 cards, A = first card is red, B =
both cards are red
D) There are three kids in a family, A = two kids are girls, B = two kids are twins
E) None of the above
Solution: Disjoint events can never be independent (unless one event has zero probability).
13. A number that is calculated from complete population data and quantifies a characteristic of
the population is called ...
A) Datum
B) Statistics
C) Population
D) Stratum
E) Parameter
14. In a large national population of post-secondary students, 60 percent attend 4-year
institutions while the remainder attends 2-year institutions. Males make up 42 percent
of post-secondary students. In 4-year institutions, 45 percent of the students are male.
We will select one student at random. What is the probability the selected student is a
male attending a 4-year institution?
A) 0.75
B) 0.25
C) 0.19
D) 0.27
E) 0.64
Solution: Let A = {male}, B = {attends a 4-year institution}. Then P(A) = 0.42, P(B) = 0.6,
P(A|B) = 0.45. Thus, $P(A \text{ and } B) = P(B)P(A \mid B) = 0.6 \times 0.45 = 0.27$.
15. The heights (y) of 50 men and their shoe sizes (x) were obtained. The variable height is
measured in centimetres (cm) and the shoe sizes of these men ranged from 8 to 13. From
these 50 pairs of observations, the least squares regression line predicting height from
shoe size was computed to be $\hat{y} = 130.455 + 4.7498x$. What height would you predict
for a man with a shoe size of 13?
A) 130.46 cm
B) 192.20 cm
C) 182.70 cm
D) This regression line cannot be used to predict the height of a man with a shoe
size of 13
E) None of the above
Solution: $\hat{y} = 130.455 + 4.7498 \times 13 = 192.20$
16. In clinical trials a certain drug has a 10% success rate of curing a known disease. If 15
people are known to have the disease, what is the probability of at least 2 being cured?
A) 0.5491
B) 0.01
C) 0.4509
D) 0.15
E) 0.7564
Solution: Let X be the number of people cured. $P(X \ge 2) = 1 - P(X < 2) =
1 - P(X = 0) - P(X = 1) = 1 - 0.2059 - 0.3432 = 0.4509$ (you can also get this
value from the table).
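The binomial calculation in question 16 can also be done without the table. The sketch below assumes X ~ Binomial(15, 0.10) as in the solution; the helper `binom_pmf` is an illustrative name.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 15, 0.10

# Complement of "fewer than 2 cured"
p_at_least_two = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)
print(round(binom_pmf(0, n, p), 4), round(binom_pmf(1, n, p), 4))
print(round(p_at_least_two, 4))
```

The unrounded result differs from 0.4509 in the fourth decimal place because the table entries are themselves rounded to four decimals.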
17. The manufacturer of a bag of sweets claims that there is a 90% chance that the bag
contains some toffees. If 20 bags are chosen, how many bags with toffees would you
expect to have?
A) 20
B) 2
C) 9
D) 18
E) None
Solution: $E(X) = np = 20 \times 0.90 = 18$
18. Government data show that 10% of males under age 25 are unemployed. A random
sample is taken of 400 males who are in the labor force and under age 25. Find the
probability that the sample unemployment rate is 0.12 or more. (Hint: Use Normal
approximation)
A) 0.1023
B) 0.0918
C) 0.9082
D) 0.5470
E) 0.4530
Solution: $\hat{p} \approx N\left(0.10, \sqrt{\frac{0.10 \times 0.90}{400}}\right) = N(0.10, 0.015)$.
$P(\hat{p} \ge 0.12) = P(Z \ge 1.33) = 1 - 0.9082 = 0.0918$
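The normal approximation in question 18 can be sketched in a few lines. This assumes the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.10, 400

# Standard deviation of the sample proportion under the Normal approximation
se = sqrt(p * (1 - p) / n)

# Standardise the observed proportion 0.12 and take the upper tail
z = (0.12 - p) / se
tail = 1 - NormalDist().cdf(z)
print(round(se, 3), round(z, 2), round(tail, 4))
```

Without rounding z to 1.33 the tail probability comes out marginally below 0.0918, which still matches option B.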
19. In a large mall a survey was taken. It was found that in a random sample of 45 women
over the age of 25, 15 had children. Find the 90% confidence interval for the population
proportion of women over the age of 25 in the mall who have children. (Choose the
closest answer.)
A) (0.0208, 0.0653)
B) (0.4337, 0.6774)
C) (0.3333, 0.5555)
D) (0.1374, 0.1626)
E) (0.2177, 0.4489)
Solution: $\hat{p} = 15/45 = 0.333$, $CI = 0.333 \pm 1.645\sqrt{\frac{0.333(1 - 0.333)}{45}} = (0.2174, 0.4486)$
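The interval in question 19 can be reproduced as follows. This is a sketch of the usual large-sample interval for a proportion, assuming the critical value is taken from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 15, 45
p_hat = successes / n

# Two-sided 90% interval: 5% in each tail, so use the 95th percentile
z_star = NormalDist().inv_cdf(0.95)
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 4), round(upper, 4))
```

Carrying full precision for the sample proportion gives endpoints that agree with option E to four decimals, while rounding it to 0.333 first gives the slightly different endpoints in the solution.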
20. You have sampled 25 students to find the mean SAT scores. A 95% confidence interval
for the mean SAT score is 900 to 1100. Which of the following statements gives a valid
interpretation of this interval?
A) 95% of the 25 students have a mean score between 900 and 1100.
B) 95% of the population of all students have a score between 900 and 1100.
C) If this procedure were repeated many times, 95% of the sample means would
be between 900 and 1100.
D) If this procedure were repeated many times, 95% of the resulting confidence
intervals would contain the true mean SAT score.
E) If 100 samples were taken and a 95% confidence interval was computed, 5 of
them would be in the interval from 900 to 1100.
21. In brief, what does the Central Limit Theorem say?
A) The area under a Normal density curve is one.
B) Measures of central tendency should always be computed with and without
outliers.
C) For sufficiently large sample size, the sampling distribution of $\bar{X}$ is approximately
Normal.
D) Confidence intervals have zero margin of error for large sample sizes.
E) In the long run, the average outcome gets close to the distribution mean.
22. Two independent random variables have the following distributions:
x -1 0 1
p 0.2 0.3 0.5
y 1 2
p 0.3 0.7
Find the mean of $Y - 2X$ (i.e. find $E(Y - 2X)$).
A) 1.1
B) -3.1
C) 0.3
D) 1.7
E) 1.4
Solution: $E(Y - 2X) = E(Y) - 2E(X) = 1.7 - 2 \times 0.3 = 1.1$
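The two means used in question 22 follow directly from the probability tables. A minimal sketch, assuming each distribution is stored as a value-to-probability dictionary (the names `x_dist`, `y_dist` and `mean` are illustrative):

```python
# Distributions copied from question 22
x_dist = {-1: 0.2, 0: 0.3, 1: 0.5}
y_dist = {1: 0.3, 2: 0.7}

def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())

# Linearity of expectation: E(Y - 2X) = E(Y) - 2 E(X)
result = mean(y_dist) - 2 * mean(x_dist)
print(round(mean(x_dist), 2), round(mean(y_dist), 2), round(result, 2))
```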
23. X and Y are two independent random variables and Var(X) = 0.61 and Var(Y) = 0.21.
Find $SD(X - Y)$ (i.e. $\sigma_{X-Y}$).
A) 0.6451
B) 0.6324
C) 0.9055
D) 0.5727
E) 0.3227
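Solution: $SD(X - Y) = \sqrt{Var(X) + Var(Y)} = \sqrt{0.61 + 0.21} = \sqrt{0.82} \approx 0.9055$
[...] $\sqrt{\frac{0.75 \times 0.25}{110}} = 0.04128614119$, $\frac{\hat{p} - 0.75}{\sqrt{(0.75 \times 0.25)/110}} = 0.3302891295$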
31. In a study, a group of researchers measured the blood pressure levels of 474 patients,
recording the blood pressure levels as high blood pressure (HBP), low blood pressure
(LBP) or normal blood pressure (NBP). The patients were also classied into three age
categories: under 30, 30-50, and over 50. The results of this study are summarized in
the table below:
Blood pressure level Under 30 30-50 Over 50 Total
HBP 23 51 73 147
LBP 27 37 31 95
NBP 48 91 93 232
Total 98 179 197 474
Which pie chart below represents the marginal distribution of blood pressure level?
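[Five pie charts, labelled A to E, showing the following percentages]
Pie chart A: NBP 49.0%, LBP 27.6%, HBP 23.5%
Pie chart B: NBP 50.8%, LBP 20.7%, HBP 28.5%
Pie chart C: NBP 47.2%, LBP 15.7%, HBP 37.1%
Pie chart D: NBP 48.9%, LBP 20.0%, HBP 31.0%
Pie chart E: NBP 41.6%, LBP 37.8%, HBP 20.7%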
A) pie chart A
B) pie chart B
C) pie chart C
D) pie chart D
E) pie chart E
Solution: The marginal distribution of blood pressure level as proportions (percentages):
$\frac{147}{474} = 31.0\%$, $\frac{95}{474} = 20.0\%$, and $\frac{232}{474} = 48.9\%$ for HBP, LBP and NBP
respectively.
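The marginal distribution in question 31 is obtained by summing each row of the table and dividing by the grand total. A short sketch, assuming the table is stored as a dictionary of rows (the name `table` is hypothetical):

```python
# Counts by age group (under 30, 30-50, over 50) from question 31
table = {
    "HBP": [23, 51, 73],
    "LBP": [27, 37, 31],
    "NBP": [48, 91, 93],
}

grand_total = sum(sum(row) for row in table.values())

# Marginal distribution of blood pressure level, as percentages
marginal = {
    level: round(100 * sum(row) / grand_total, 1)
    for level, row in table.items()
}
print(grand_total, marginal)
```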
32. In a group of 100 people, 40 own a cat, 25 own a dog, and 15 own a cat and a dog. Find
the probability that a person chosen at random owns a dog, given that the person owns a cat.
A) 0.375
B) 0.6
C) 0.25
D) 0.4
E) 0.15
Solution: $P(\text{dog} \mid \text{cat}) = \frac{P(\text{cat and dog})}{P(\text{cat})} = 15/40 = 0.375$
33. The pie chart below shows the percentage of students in each faculty at a university.
[Pie Chart of Percentage vs Faculty: Science 35.0%, Engineering 10.0%, Medicine 5.0%,
Law 5.0%, Education 5.0%, Business 15.0%, Arts 25.0%]
If the number of students in the faculty of Arts is 3000, then how MANY students are
there in the faculty of Science?
A) 3400
B) 3600
C) 3800
D) 4000
E) 4200
Solution: There are 3000 students in the Arts faculty, which is 25% of all
students in the university. Thus the number of students in the university is $3000 \times 4 =
12000$, and 35% of them, i.e. $12000 \times 0.35 = 4200$, are in the Science faculty.
34. There are ten children in a room. The mean and standard deviation of their ages are 5
years and 1.25 years respectively. If another 5-year-old child enters this room, what will
happen to the mean and standard deviation of the ages of the children in the room?
A) The mean will stay the same but the standard deviation will increase.
B) The mean and standard deviation will both increase.
C) The mean and standard deviation will both decrease.
D) The mean will stay the same but the standard deviation will decrease.
E) The mean and standard deviation will both stay the same.
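The effect described in question 34 can be worked out from the summary statistics alone. The sketch below assumes the stated standard deviation is the sample standard deviation (divisor n - 1); with a divisor of n the numbers change slightly but the direction of the change does not.

```python
from math import sqrt

n, mean, sd = 10, 5.0, 1.25

# Sum of squared deviations implied by the sample SD (n - 1 divisor assumed)
sum_sq_dev = (n - 1) * sd**2

# A new child aged exactly 5 leaves the mean unchanged and adds
# nothing to the sum of squared deviations about that mean
new_n = n + 1
new_mean = (n * mean + 5.0) / new_n
new_sd = sqrt(sum_sq_dev / (new_n - 1))
print(new_mean, round(new_sd, 4))
```

Under this assumption the mean stays at 5 while the standard deviation shrinks, because the same total squared deviation is spread over one more observation.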
35. The heights of women in a certain population have a Normal distribution with mean
64.5 inches and standard deviation 2.5 inches. We select three women at random from
this population. Assume that their heights are independent. Find the probability that
the tallest of these three women will be taller than 67 inches. Which of the following
numbers is closest to this probability?
A) 0.0
B) 0.2
C) 0.4
D) 0.6
E) 0.8
Solution: Note that the event "the tallest is taller than 67 inches" is equivalent to
the event "at least one of them is taller than 67 inches". Denoting the height by X,
$P(X < 67) = 0.841345$ and so the probability that all three will be shorter than 67
inches is $0.841345^3 = 0.5955556572$ and the probability that at least one of them is
taller than 67 inches is $1 - 0.5955556572 = 0.4044443428 \approx 0.4$.
Note: This answer is the same if one uses the 68-95-99.7% rule.
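The complement argument in question 35 can be checked numerically. This is a minimal sketch, assuming the standard-library `statistics.NormalDist` and independent heights as stated in the question.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)

# Probability that one woman is shorter than 67 inches
p_shorter = heights.cdf(67)

# All three shorter (independence), then take the complement
p_all_shorter = p_shorter**3
p_tallest_over_67 = 1 - p_all_shorter
print(round(p_shorter, 6), round(p_tallest_over_67, 4))
```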
36. The unemployment rate (i.e. the percent unemployed in the labour force) in a certain
city is 8%. A random sample of 150 people from the labour force in this city is drawn.
Find the approximate probability that this sample contains fifteen or more unemployed
people. Which of the following numbers is closest to this probability?
A) 0.1
B) 0.2
C) 0.3
D) 0.4
E) 0.5
Solution: Denoting the number of unemployed people in the sample by X, we have
$X \approx N(\mu = np = 150 \times 0.08 = 12,\ \sigma = \ldots$
[...] $\frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{4}} = 10$.
38. The regression equation and the summary statistics given below were obtained from a
regression analysis of the relation between murder rate (y, the number of murders per
1,000,000 inhabitants per annum) and unemployment (x, the percentage of the labour
force unemployed) using data for 20 cities.
Descriptive Statistics: x, y
Variable N Mean StDev Minimum Q1 Q3 Maximum
x 20 deleted 1.207 4.900 6.050 8.125 9.300
y 20 20.57 9.88 5.30 12.88 26.63 40.70
Regression Analysis: y versus x
The regression equation is
y = - 28.5 + 7.08 x
What is the correlation between x and y?
A) 0.035
B) 0.0708
C) 0.122
D) 0.865
E) 0.988
Solution: $b_1 = r\frac{s_y}{s_x} \Rightarrow r = b_1\frac{s_x}{s_y} = 7.08 \times \frac{1.207}{9.88} = 0.8649352227 \approx 0.865$
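The slope-to-correlation conversion in question 38 is a one-line computation. A small sketch using the values from the output above (variable names are illustrative):

```python
# Values read from the regression output in question 38
b1 = 7.08      # fitted slope
s_x = 1.207    # standard deviation of x
s_y = 9.88     # standard deviation of y

# Since b1 = r * s_y / s_x, the correlation is r = b1 * s_x / s_y
r = b1 * s_x / s_y
print(round(r, 3))
```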
39. Using the information in question 38 above, calculate the average unemployment rate
(i.e. the value of $\bar{x}$) of these 20 cities. Note: This value has been deleted in the summary
statistics given in question 38 above but can be calculated from other information given.
Which of the following numbers is closest to this value?
A) 7
B) 14
C) 21
D) 28
E) 49
Solution: $b_0 = \bar{y} - b_1\bar{x} \Rightarrow \bar{x} = \frac{\bar{y} - b_0}{b_1} = \frac{20.57 - (-28.5)}{7.08} = 6.93079096 \approx 7$.
40. One of the cities in the data set used in question 38 above, had unemployment rate 6.0
(i.e. x = 6.0) and murder rate 14.5. What is the residual for this city?
A) 0.52
B) -68.16
C) -27.98
D) 43
E) -96.66
Solution: Residual $= y - \hat{y} = 14.5 - (-28.5 + 7.08 \times 6.0) = 0.52$.
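The calculations in questions 39 and 40 both use the fitted line from question 38. The sketch below is illustrative only; the function name `predict` is an assumption.

```python
b0, b1 = -28.5, 7.08   # intercept and slope from question 38
y_bar = 20.57          # mean murder rate from the summary statistics

def predict(x):
    """Predicted murder rate from the fitted regression line."""
    return b0 + b1 * x

# Question 39: the least squares line passes through (x_bar, y_bar)
x_bar = (y_bar - b0) / b1

# Question 40: residual = observed - predicted for x = 6.0, y = 14.5
residual = 14.5 - predict(6.0)
print(round(x_bar, 4), round(residual, 2))
```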
41. A consumer product agency tests mileage (miles per gallon) for a sample of automobiles
using each of four different types of gasoline. Which of the following statements regarding
this study is true?
A) There are four explanatory variables and one response variable in this study.
B) There is only one explanatory variable with four levels and one response vari-
able in this study.
C) Miles per gallon is the only explanatory variable in this study.
D) This study has a single response variable with four levels.
E) None of the above statements is true.
42. In a certain game of chance, your chances of winning are 0.2 and it costs $1 to play the
game. If you win, you receive $4 (for a net gain of $3). If you lose, you receive nothing
(for a net loss of $1). You are going to play the game nine times. Assume that the
outcomes are independent. Let T be the total net gain from this round of nine games.
Find the standard deviation of T. Choose the closest.
A) $ 1
B) $ 3
C) $ 5
D) $ 7
E) $ 9
Solution: Let X be the number of times you win (i.e. $9 - X$ is the number of times
you lose). Then $X \sim Bin(9, 0.2)$ and the standard deviation of X is $\sqrt{np(1 - p)} = \ldots$
[...] $\sqrt{3^2 + 2^2} = 3.605551275$, $P(Y > X) = P(Y - X > 0) = P\left(Z > \frac{0 - (-3)}{3.605551275} \approx 0.83\right) = 0.2033$.
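For question 42, the total net gain can be written as T = 3X - (9 - X) = 4X - 9, where X ~ Bin(9, 0.2) is the number of wins, so SD(T) = 4 SD(X). The following sketch works through that arithmetic, assuming this is the intended route; it also cross-checks the result from the per-game variance.

```python
from math import sqrt

n, p = 9, 0.2

# Route 1: T = 4X - 9 with X ~ Bin(n, p), so SD(T) = 4 * sqrt(n p (1 - p))
sd_x = sqrt(n * p * (1 - p))
sd_t = 4 * sd_x

# Route 2: per-game net gain is +3 with probability p and -1 otherwise;
# variances of independent games add
game_mean = 3 * p + (-1) * (1 - p)
game_var = (3**2) * p + ((-1) ** 2) * (1 - p) - game_mean**2
sd_t_check = sqrt(n * game_var)
print(round(sd_x, 2), round(sd_t, 2), round(sd_t_check, 2))
```

Both routes give the same standard deviation, and the closest listed option is $5.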
END OF EXAM
Binomial Distribution Table
n k 0.1 0.2 0.3 0.4 0.5 0.6
14 0 0.2288 0.0440 0.0068 8e-04 1e-04 0.0000
14 1 0.3559 0.1539 0.0407 0.0073 9e-04 1e-04
14 2 0.2570 0.2501 0.1134 0.0317 0.0056 5e-04
14 3 0.1142 0.2501 0.1943 0.0845 0.0222 0.0033
14 4 0.0349 0.1720 0.2290 0.1549 0.0611 0.0136
14 5 0.0078 0.0860 0.1963 0.2066 0.1222 0.0408
14 6 0.0013 0.0322 0.1262 0.2066 0.1833 0.0918
14 7 2e-04 0.0092 0.0618 0.1574 0.2095 0.1574
14 8 0.0000 0.0020 0.0232 0.0918 0.1833 0.2066
14 9 0.0000 3e-04 0.0066 0.0408 0.1222 0.2066
14 10 0.0000 0.0000 0.0014 0.0136 0.0611 0.1549
14 11 0.0000 0.0000 2e-04 0.0033 0.0222 0.0845
14 12 0.0000 0.0000 0.0000 5e-04 0.0056 0.0317
14 13 0.0000 0.0000 0.0000 1e-04 9e-04 0.0073
14 14 0.0000 0.0000 0.0000 0.0000 1e-04 8e-04
n k 0.1 0.2 0.3 0.4 0.5 0.6
15 0 0.2059 0.0352 0.0047 5e-04 0.0000 0.0000
15 1 0.3432 0.1319 0.0305 0.0047 5e-04 0.0000
15 2 0.2669 0.2309 0.0916 0.0219 0.0032 3e-04
15 3 0.1285 0.2501 0.1700 0.0634 0.0139 0.0016
15 4 0.0428 0.1876 0.2186 0.1268 0.0417 0.0074
15 5 0.0105 0.1032 0.2061 0.1859 0.0916 0.0245
15 6 0.0019 0.0430 0.1472 0.2066 0.1527 0.0612
15 7 3e-04 0.0138 0.0811 0.1771 0.1964 0.1181
15 8 0.0000 0.0035 0.0348 0.1181 0.1964 0.1771
15 9 0.0000 7e-04 0.0116 0.0612 0.1527 0.2066
15 10 0.0000 1e-04 0.0030 0.0245 0.0916 0.1859
15 11 0.0000 0.0000 6e-04 0.0074 0.0417 0.1268
15 12 0.0000 0.0000 1e-04 0.0016 0.0139 0.0634
15 13 0.0000 0.0000 0.0000 3e-04 0.0032 0.0219
15 14 0.0000 0.0000 0.0000 0.0000 5e-04 0.0047
15 15 0.0000 0.0000 0.0000 0.0000 0.0000 5e-04