Final exam revision paper 2
Question 2
The weights (kg) of a random sample of 50 patients are shown below:
11 15 20 23 23 25 30 30 32 34
38 39 41 41 45 46 47 47 47 48
50 51 51 51 52 53 55 56 57 57
57 58 58 58 58 60 62 63 64 65
65 67 67 68 70 70 73 73 78 79
Also given: $\sum x = 2528$ and $\sum x^2 = 141332$.
a. Calculate the sample mean, variance and standard deviation.
b. Construct the stem and leaf display.
c. Calculate the five-figure summary and hence construct the box plot. Calculate the lower
inner fence and the upper inner fence and hence identify the presence of any outliers.
d. Giving 3 reasons, are the weights positively or negatively skewed?
e. Fill in the values for the three columns in the frequency distribution below.
Frequency distribution of the weights (kg) of 50 patients.
Weights(kg) Frequency Relative frequency Cum. Frequency
10 and under 20
20 and under 30
30 and under 40
40 and under 50
50 and under 60
60 and under 70
70 and under 80
Revision Questions (QBM101) Page 2
f. From the frequency distribution in part (e), calculate the sample mean and standard
deviation.
g. Construct the histogram and ogive based on the frequency distribution table in (e).
h. Based on the ogive in (g), how many patients weigh less than 55 kg?
i. Based on the ogive in (g), how many patients weigh more than 65 kg?
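A minimal sketch of parts (a) and (e), assuming Python and its standard library. The variable names are illustrative and not part of the question. It applies the shortcut formula for the sample variance to the given sums and counts the class frequencies.

```python
import math
import statistics

# The 50 patient weights (kg) from the question.
weights = [
    11, 15, 20, 23, 23, 25, 30, 30, 32, 34,
    38, 39, 41, 41, 45, 46, 47, 47, 47, 48,
    50, 51, 51, 51, 52, 53, 55, 56, 57, 57,
    57, 58, 58, 58, 58, 60, 62, 63, 64, 65,
    65, 67, 67, 68, 70, 70, 73, 73, 78, 79,
]

n = len(weights)
sum_x = sum(weights)                  # matches the given sum of x
sum_x2 = sum(x * x for x in weights)  # matches the given sum of x squared

# Part (a): shortcut formulas from the formula sheet.
mean = sum_x / n
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

# Part (e): frequencies for the classes "10 and under 20", ..., "70 and under 80".
class_lowers = list(range(10, 80, 10))
frequency = [sum(lo <= x < lo + 10 for x in weights) for lo in class_lowers]
relative_frequency = [f / n for f in frequency]
cumulative_frequency = [sum(frequency[: i + 1]) for i in range(len(frequency))]

print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {std_dev:.2f}")
print(frequency, cumulative_frequency)
```

The shortcut result can be cross-checked against `statistics.variance(weights)`, which uses the same n − 1 divisor.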
Question 3
The weekly food expenditures (US $) of a random sample of 30 students are shown below:
Weekly food expenditures (US $) of a random sample of 30 students
21 32 44 47 56 71
24 34 44 50 57 73
28 35 45 50 62 75
30 36 47 51 62 76
32 42 47 55 64 83
Question 4
A statistician would like to study the distribution of monthly sales of outlets for product X.
He selected a random sample of 30 outlets and recorded their monthly sales in $ million. The
summary output of the monthly sales of the 30 outlets is shown below. However, four values marked
a, b, c and d are missing from the summary output.
Summary output
Sales($ Million)
Mean 29.77
Standard Error 2.75
Median 32
Mode 35
Standard Deviation 15.06
Sample Variance a
Kurtosis -0.91
Skewness d
Range b
Minimum 3
Maximum 52
Sum c
Count 30
First Quartile 19
Third Quartile 40
b. Find the values of the three measures of central tendency and interpret their meanings.
c. From the summary statistics, is the value of d positive or negative? Justify your answer.
d. Compute the interval 𝑥̅ ± 1.5𝑠, where 𝑥̅ and 𝑠 are the sample mean and standard deviation
respectively.
e. Giving the reason for the rule used, what percentage of the monthly sales is expected
to lie within the interval from 7.18 to 52.36?
f. Another product Y has a distribution with mean monthly sales of 27.25 ($ million) and a
standard deviation of 2.38 ($ million). Compute the coefficient of variation for each of the two
products and compare their variabilities.
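A short sketch of the arithmetic behind the missing values a, b and c and parts (d) to (f), assuming Python. It also assumes that Chebyshev's theorem is the rule intended in part (e). The variable names are illustrative.

```python
# Values read from the summary output for product X.
mean_x, sd_x, count = 29.77, 15.06, 30
minimum, maximum = 3, 52

variance_a = sd_x ** 2        # sample variance = (standard deviation)^2
range_b = maximum - minimum   # range = maximum - minimum
sum_c = mean_x * count        # sum = mean * count

# Part (d): interval mean +/- 1.5 standard deviations.
k = 1.5
lower, upper = mean_x - k * sd_x, mean_x + k * sd_x

# Part (e): Chebyshev's theorem gives a lower bound of 1 - 1/k^2 for k > 1.
chebyshev_min_share = 1 - 1 / k ** 2

# Part (f): coefficient of variation = standard deviation / mean, as a percentage.
cv_x = sd_x / mean_x * 100
cv_y = 2.38 / 27.25 * 100

print(f"a = {variance_a:.2f}, b = {range_b}, c = {sum_c:.1f}")
print(f"interval = ({lower:.2f}, {upper:.2f}), at least {chebyshev_min_share:.1%}")
print(f"CV X = {cv_x:.2f}%, CV Y = {cv_y:.2f}%")
```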
Question 5
The stem-and-leaf display below shows the daily spending ($) of a sample group of 35 adults.
Stem | Leaf
1 | 12
2 | 234788999
3 | 0233467
4 | 123345
5 | 11345
6 | 0123
7 | 23
Legend: 1|1 means $11
Question 6
The following data show the courses enrolled by 20 HELP students in 2018. B refers to Business,
P refers to Psychology, L refers to Law and E refers to Economics.
B P E P B E B P B L
E P B P L P L B L P
ii. Calculate the relative frequencies and percentage for all the categories
Question 7
A group of 30 runners is randomly chosen, and the time (in minutes) each takes to complete a
marathon run is recorded in the table below:
Time, x        Midpoint, m   Frequency, f   Relative    Cumulative   Cumulative Relative
                                            Frequency   Frequency    Frequency
1 ≤ x < 5      3             2              .           .            .
5 ≤ x < 10     .             3              .           .            .
10 ≤ x < 15    .             .              .           .            .
15 ≤ x < 20    .             10             .           .            .
20 ≤ x < 25    .             8              .           .            .
Total                        30             1.00
Module 2
Question 1
vi. the graduate majors in human resources, given that the graduate is female?
viii. the graduate majors in accounting, given that the graduate is male?
x. Are the events "the graduate is female" and "the graduate majors in human resources" independent?
xi. Are the events "the graduate is male" and "the graduate majors in accounting" mutually exclusive?
Question 2
X -1 0 1 2 3
Probability, p(x) 0.25 0.33 0.17 0.15 0.10
Question 3.
A statistician was employed by a shopping mall to study the number of shops that customers
actually enter. A random sample of 200 mall customers was selected, and the table below shows
the probability distribution of the number of shops that mall customers entered.
iii. a randomly selected customer enters more than 1 but not more than 5 shops.
Question 4
vi) more than 3 but not more than 8 applications will be approved.
Question 5
A used car company determined that 30% of customers made a complaint about the car within
one month of purchase.
ii) fewer than 2 made a complaint within one month of purchase?
iv) more than 2 but fewer than 6 made a complaint within one month of purchase?
v) more than 2 but not more than 6 made a complaint within one month of purchase?
vi) at least 2 but not more than 6 made a complaint within one month of purchase?
b. Find the mean and standard deviation of the above binomial distribution.
Question 6
i) What is the mean number of cars arriving at the petrol station per 5 minutes and per 10
minutes?
ii) Find the probability that in the next 5 minutes, 2 or fewer cars will arrive at the petrol station.
iii) Find the probability that in the next 5 minutes, exactly 2 cars will arrive at the petrol
station.
iv) Find the probability that in the next 5 minutes, 4 or more cars will arrive at the petrol
station.
v) Find the probability that in the next 10 minutes, more than 1 but fewer than 6 cars will arrive
at the petrol station.
vi) Find the mean and standard deviation of the number of cars arriving at the petrol station in
the next 45 minutes.
Question 7
The average number of patients visiting a clinic between 3 p.m. and 7 p.m. is 20 patients.
i) Find the probability that exactly 10 patients visit the clinic on a particular day between 3
p.m. and 5 p.m.
ii) Find the probability that fewer than 4 patients visit the clinic on a particular day between 3
p.m. and 5 p.m.
iii) Find the probability that at least 4 patients visit the clinic on a particular day between 3
p.m. and 4 p.m.
iv) Find the mean and variance of the number of patients visiting between 3 p.m. and 4:45 p.m.
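A minimal sketch of this question, assuming Python and assuming that arrivals follow a Poisson distribution with a constant rate over the four hours (the question text does not state the model explicitly). The helper names `poisson_pmf` and `poisson_cdf` are illustrative.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x), summing the probability mass function from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# 20 patients on average over the 4 hours from 3 p.m. to 7 p.m.
rate_per_hour = 20 / 4

lam_2h = rate_per_hour * 2                       # 3 p.m. to 5 p.m.
p_exactly_10 = poisson_pmf(10, lam_2h)           # part (i)
p_fewer_than_4 = poisson_cdf(3, lam_2h)          # part (ii): P(X <= 3)

lam_1h = rate_per_hour * 1                       # 3 p.m. to 4 p.m.
p_at_least_4 = 1 - poisson_cdf(3, lam_1h)        # part (iii): 1 - P(X <= 3)

lam_105min = rate_per_hour * 1.75                # 3 p.m. to 4:45 p.m.
mean_iv = variance_iv = lam_105min               # part (iv): mean = variance = lambda

print(f"{p_exactly_10:.4f} {p_fewer_than_4:.4f} {p_at_least_4:.4f} {mean_iv}")
```

The key step in each part is rescaling the rate to the length of the interval before applying the formula.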
Question 8
In a textbook, there are 100 misprints distributed randomly and independently throughout the 400
pages.
ii) What is the probability that there are exactly 5 misprints in the next 5 pages?
iii) What is the probability that there are 2 or more misprints in the next 6 pages?
iv) What is the probability that there are fewer than 3 misprints in the next 12 pages?
v) What is the probability that there are more than 2 but fewer than 8 misprints in the next 12
pages?
Question 9
A chocolate waffle factory manufactures chocolate waffles with a mean weight of 100 grams and a
standard deviation of 5 grams. Assume that the weights of the waffles are normally distributed. What
is the probability that a randomly selected waffle
v) Find the minimum weight of the heaviest 10% of the chocolate waffles.
Question 10
The recent average starting salary for new college graduates in computer information systems is
$47,500. Assume salaries are normally distributed with a standard deviation of $4,500.
(i) What is the probability of a new graduate being offered a starting salary in excess of
$55,000?
(ii) What is the probability that a new graduate is offered a starting salary of between
$45,000 and $52,000?
(iii) What is the probability that the mean salary of a random sample of 100 new graduates
exceeds $48,000?
(iv) It was found that the top 5% of the graduates have a starting salary of at least X. Find the
value of X.
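A sketch of the four parts, assuming Python 3.8 or later so that `statistics.NormalDist` is available. The variable names are illustrative. Part (iii) uses the sampling distribution of the mean, whose standard deviation is σ/√n.

```python
import math
from statistics import NormalDist

mu, sigma = 47_500, 4_500
salary = NormalDist(mu=mu, sigma=sigma)

# (i) P(X > 55,000)
p_over_55k = 1 - salary.cdf(55_000)

# (ii) P(45,000 < X < 52,000)
p_between = salary.cdf(52_000) - salary.cdf(45_000)

# (iii) P(sample mean > 48,000) for n = 100: standard error = sigma / sqrt(n)
n = 100
sample_mean = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))
p_mean_over_48k = 1 - sample_mean.cdf(48_000)

# (iv) X such that P(salary >= X) = 0.05, i.e. the 95th percentile
x_top_5 = salary.inv_cdf(0.95)

print(f"{p_over_55k:.4f} {p_between:.4f} {p_mean_over_48k:.4f} {x_top_5:.0f}")
```

Answers read from a printed z-table may differ slightly in the last decimal place because table values are rounded.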
Question 11
The quality of electric light bulbs is measured by their lifetimes in hours. The longer the lifetime
of a bulb, the better its quality. Assume that the lifetimes of electric light bulbs
are normally distributed with a mean lifetime of 1,000 hours and a standard deviation of 100 hours.
i. What is the probability that a randomly selected light bulb will have a lifetime of between 900
hours and 1,050 hours?
ii. What is the probability that a randomly selected light bulb will have a lifetime of less than 900
hours?
iii. Find the maximum lifetime of the bottom 5% of electric light bulbs (those of the worst quality).
Question 12
The weights of soap bars manufactured by a factory are normally distributed with a mean of 2,000
grams and a standard deviation of 20 grams.
i. What is the probability that a randomly selected soap bar will weigh between 1,990 grams
and 2,030 grams?
ii. What is the probability that a randomly selected soap bar will weigh more than 2,020 grams?
iii. What is the probability that the sample mean weight of a random sample of 50 soap bars will be
more than 1,995 grams?
Module 3
Question 1
The police department would like to estimate the average speed of cars passing along a stretch of
highway by installing a speed detector. From a random sample of 30 cars, the sample
mean speed was found to be 110 km per hour. Assume that the population standard deviation
is 10 km per hour.
a. Find and interpret the 95 % confidence interval for the true mean speed of cars passing along
this stretch of highway.
b. What is the minimum sample size so that the sampling error at the 95% confidence level does
not exceed 2 km per hour?
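A minimal sketch of both parts, assuming Python 3.8 or later. It uses the z-interval from the formula sheet because σ is known. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, x_bar, sigma = 30, 110, 10
confidence = 0.95

# Critical value z_{alpha/2}; about 1.96 for 95% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# (a) x_bar +/- z * sigma / sqrt(n)
margin = z * sigma / math.sqrt(n)
ci = (x_bar - margin, x_bar + margin)

# (b) Smallest n with z * sigma / sqrt(n) <= 2, i.e. n >= (z * sigma / E)^2,
# rounded up to the next whole car.
max_error = 2
n_required = math.ceil((z * sigma / max_error) ** 2)

print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f}); minimum n = {n_required}")
```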
Question 2.
In order to estimate the mean time students spend studying per week at home, a random sample of
10 students was selected and the times in hours spent studying per week are shown below:
Estimate and interpret the 95% confidence interval for the true mean time studying per week at
home for all the students.
Question 3
In a sample of 400 shops it was discovered that 136 of them sold carpets at below the list prices,
which had been recommended by the manufacturers.
a. Estimate with 95% confidence, the true proportion of shops selling carpets at below the list
prices recommended by the manufacturers.
b. What size sample would have to be taken in order to estimate the proportion to within
2% with 95% confidence and by using the above sample proportion?
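A sketch of the proportion interval and sample-size calculation, assuming Python 3.8 or later. It reads "within 2%" as a margin of error of 0.02 (two percentage points), which is an interpretation rather than something the question states outright.

```python
import math
from statistics import NormalDist

n, below_list_price = 400, 136
p_hat = below_list_price / n      # sample proportion
q_hat = 1 - p_hat

z = NormalDist().inv_cdf(0.975)   # z_{alpha/2} for 95% confidence

# (a) p_hat +/- z * sqrt(p_hat * q_hat / n)
margin = z * math.sqrt(p_hat * q_hat / n)
ci = (p_hat - margin, p_hat + margin)

# (b) n >= z^2 * p_hat * q_hat / E^2, rounded up.
max_error = 0.02
n_required = math.ceil(z ** 2 * p_hat * q_hat / max_error ** 2)

print(f"p_hat = {p_hat:.2f}, 95% CI: ({ci[0]:.4f}, {ci[1]:.4f}), n = {n_required}")
```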
Question 4
Experience has shown that the scores obtained by students in a particular test are normally
distributed with mean score 70 and variance 36. When the test is taken by a random sample of 49
students, the mean score is 71.5. Assume that the population variance remains at 36.
(i) Is there sufficient evidence, at the 5% significance level, that these students have performed
better than expected?
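A minimal sketch of the one-tailed z-test, assuming Python 3.8 or later and the hypotheses H0: μ = 70 against H1: μ > 70 (the natural reading of "performed better than expected"). The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0, variance, n, x_bar = 70, 36, 49, 71.5
sigma = math.sqrt(variance)
alpha = 0.05

# Test statistic: z = (x_bar - mu_0) / (sigma / sqrt(n))
z_stat = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Upper-tail critical value and p-value.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z_stat)

reject_h0 = z_stat > z_crit
print(f"z = {z_stat:.2f}, critical = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The critical-value and p-value approaches give the same decision, since the p-value is below α exactly when the statistic exceeds the critical value.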
Question 5
The mean time spent by all customers in a coffee house used to be 16 minutes. The coffee house
manager claims that the mean time spent by all customers has changed. The times in minutes spent
by a random sample of 10 customers in the coffee house are shown below.
Time (Minutes)
15.6 16.2 13.9 12.7 22.5 20.5 16.6 17.9 16.4 19.4
i. Given that $\sum x = 171.7$ and $\sum x^2 = 3{,}028$, compute the sample mean and sample
standard deviation.
ii. Do the data provide sufficient evidence to support the manager’s claim at the 0.05 level of
significance?
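A sketch of parts (i) and (ii), assuming Python and a two-tailed t-test of H0: μ = 16 against H1: μ ≠ 16. The standard library has no t-distribution, so the critical value 2.262 (for 9 degrees of freedom and 0.025 in each tail) is hard-coded from a standard t-table. The variable names are illustrative.

```python
import math
import statistics

times = [15.6, 16.2, 13.9, 12.7, 22.5, 20.5, 16.6, 17.9, 16.4, 19.4]
n = len(times)
mu_0 = 16

# (i) Sample mean and sample standard deviation (n - 1 divisor).
x_bar = statistics.mean(times)
s = statistics.stdev(times)

# (ii) Test statistic with s in place of sigma, df = n - 1 = 9.
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

T_CRITICAL = 2.262  # t-table value for df = 9, two-tailed alpha = 0.05
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"mean = {x_bar:.2f}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```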
Question 6
In the past, a manufacturer of computer chips had 90% of the chips produced conforming
to specification. A new quality control manager claimed that the proportion of chips conforming
to specification has significantly increased. To test the claim, a random sample of 100 chips was
drawn from a large production run. It was found that 93 of the chips conformed to specification.
Do the data provide sufficient evidence at the 5% level of significance to support the manager’s
claim? What is the p-value?
Question 7
In the past, 70% of students attended revision classes. It is claimed that the proportion of students
attending revision classes has significantly decreased. To test the claim, a random sample of 100
students was selected. 65 of the 100 students attended revision classes.
iii. Using the p-value calculated in part (ii), will your decision in part (i) change at the 10% level
of significance?
Module 4
Question 1
In a small fishing town, the daily catches were sold locally. Recently, the fishermen have
complained about price fluctuations and reduced catches and have requested that the government
introduce a minimum fish price. It was suspected that fluctuations in fish prices were related to
fish catches. A statistician was asked to study the relationship between daily prices and daily
catches in the fishing town. A random sample of 30 weeks was selected, and the prices of fish
($ per kg) and the daily catches in kilograms were recorded.
The prices range from a low of $3.00 to a high of $17.50 per kg.
The daily catches range from a low of 300 kg to a high of 1,000 kg.
The sample data were analyzed using Excel, and the summary output and appropriate charts
were generated and are provided below. However, because of a printer malfunction, some of the data
values are missing; they are indicated as A, B, C and D.
SUMMARY OUTPUT
Regression Statistics
Multiple R A
R Square 0.9646
Adjusted R Square 0.9634
Standard Error 0.8426
Observations 30
ANOVA
                             df    SS    MS    F    Significance F

                             Coefficients   Standard Error   t Stat    P-value
Intercept                    24.5698        0.5406           45.4453   8.8279E-28
Average Daily Catch (kg.)    -0.0222        0.0008           D         7.326E-22
[Figure: scatter plot with Average Daily Catch (kg.) on the horizontal axis]
ii. Interpret the scatter plot and identify the dependent and the independent variables.
iii. Write down the regression equation and interpret the slope coefficient.
iv. What is the value of the coefficient of determination? Interpret its meaning.
v. Estimate the price for a given day with a daily catch of 850 kg. Comment on its reliability.
vi. Estimate the price for a given day with a daily catch of 1100 kg. Comment on its reliability.
vii. Is there any linear relationship between daily catch and price at the 5% significance level?
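A rough sketch of how the missing values A and D and the two predictions could be obtained from the printed output, assuming Python. Because the printed coefficients are rounded, the t statistic computed from them differs slightly from the one implied by R Square; both are shown. The variable names are illustrative.

```python
import math

r_squared, n = 0.9646, 30
intercept, slope, se_slope = 24.5698, -0.0222, 0.0008

# A: Multiple R is the square root of R Square (Excel reports it as non-negative).
multiple_r = math.sqrt(r_squared)
# The correlation coefficient itself takes the sign of the slope.
r = math.copysign(multiple_r, slope)

# D: t Stat = coefficient / standard error (from the rounded printed values).
t_from_rounded = slope / se_slope
# Cross-check for simple regression: t^2 = R^2 (n - 2) / (1 - R^2).
t_from_r2 = math.copysign(math.sqrt(r_squared * (n - 2) / (1 - r_squared)), slope)

def predict_price(catch_kg: float) -> float:
    """Fitted price from the regression line: price = intercept + slope * catch."""
    return intercept + slope * catch_kg

def within_sample_range(catch_kg: float) -> bool:
    """Predictions are interpolation only inside the observed 300 to 1,000 kg range."""
    return 300 <= catch_kg <= 1000

for catch in (850, 1100):
    print(catch, round(predict_price(catch), 2), within_sample_range(catch))
```

The 1,100 kg prediction lies outside the observed range of catches, so it is an extrapolation and less reliable than the 850 kg prediction.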
Question 2
The management of a chain of package delivery stores would like to develop a model for predicting
the weekly sales ($’000) for individual stores based on the number of customers who made
purchases. A random sample of 20 stores was selected from among all the stores in the chain, with
the following results:
Excel was used to fit the regression line and the output generated follows.
[Figure: scatter plot of Weekly sales ($’000) against No. of customers]
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9549
R Square *
Adjusted R Square 0.9069
Standard Error 0.5016
Observations A
ANOVA
Df SS MS F
Regression 1 B 46.8314 186.1232
Residual C 4.5291 0.2516
Total 19 51.3605
i Identify the independent and dependent variables and interpret the scatter plot.
iv Find the value of the coefficient of determination and interpret its meaning.
vi At the 0.05 level of significance, is there evidence of a linear relationship between weekly
sales for individual stores and the number of customers? Use p-value method.
vii Predict the average weekly sales for stores that have 600 customers. Comment on its
reliability and justify your answer.
viii Predict the average weekly sales for stores that have 1120 customers. Comment on its
reliability and justify your answer.
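A short sketch of how the blanked entries in the summary output relate to the printed ones, assuming Python and the usual ANOVA identities for simple linear regression (SS = MS × df, and the degrees of freedom add up). The variable names are illustrative.

```python
# Values printed in the summary output.
multiple_r = 0.9549
df_total, ss_total = 19, 51.3605
df_regression, ms_regression = 1, 46.8314
ss_residual = 4.5291

observations_a = df_total + 1                      # A: n = total df + 1
ss_regression_b = ms_regression * df_regression    # B: SS = MS * df
df_residual_c = df_total - df_regression           # C: n - 2

# Coefficient of determination: SSR / SST, which should agree with (Multiple R)^2.
r_squared = ss_regression_b / ss_total

# Consistency checks against the printed MS and F values.
ms_residual = ss_residual / df_residual_c
f_stat = ms_regression / ms_residual

print(observations_a, ss_regression_b, df_residual_c)
print(f"R^2 = {r_squared:.4f}, MS residual = {ms_residual:.4f}, F = {f_stat:.2f}")
```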
Statistical Formulae
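$\mu = \dfrac{\sum x}{N}$
$\sigma = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}}$
$\bar{x} = \dfrac{\sum x}{n}$
$s = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$
$\mu = \dfrac{\sum mf}{N}$
$\sigma = \sqrt{\dfrac{\sum m^2 f - \frac{(\sum mf)^2}{N}}{N}}$
$\bar{x} = \dfrac{\sum mf}{n}$
$s = \sqrt{\dfrac{\sum m^2 f - \frac{(\sum mf)^2}{n}}{n - 1}}$
Chebyshev’s theorem:
$1 - \dfrac{1}{k^2}; \quad k > 1$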
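Conditional Probability:
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
Addition Rule:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Multiplication Rule:
$P(A \cap B) = P(A) \cdot P(B \mid A)$
$\mu = \sum x P(x)$
$\sigma = \sqrt{\sum x^2 P(x) - \mu^2}$
$\mu = np, \quad \sigma^2 = npq$
$P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$
$\mu = \lambda, \quad \sigma^2 = \lambda$
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Sampling Distribution of Sample Proportion:
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$
Confidence Interval for Population Mean:
$\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$ if σ is known
$\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$ if σ is unknown
$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
$z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$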
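Sum of Squares:
$SS_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$
$SS_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$
$SS_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$
Regression Coefficients:
$b = \dfrac{SS_{xy}}{SS_{xx}}$
$a = \bar{y} - b\bar{x}$
$s_e = \sqrt{\dfrac{SS_{yy} - b\,SS_{xy}}{n - 2}}$
Coefficient of Determination:
$r^2 = \dfrac{b\,SS_{xy}}{SS_{yy}}$
Coefficient of Correlation:
$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$
$t = \dfrac{b - B}{s_b}$
$s_b = \dfrac{s_e}{\sqrt{SS_{xx}}}$
$df = n - 2$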
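The regression formulas above can be sketched in a few lines of Python. The five (x, y) pairs are made-up illustration data, not taken from any question in this paper, and the function name `simple_regression` is hypothetical.

```python
import math

def simple_regression(xs, ys, slope_h0=0.0):
    """Apply the sum-of-squares formulas from the formula sheet to paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n

    b = ss_xy / ss_xx                               # slope
    a = sum_y / n - b * sum_x / n                   # intercept: y_bar - b * x_bar
    s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))  # standard error of estimate
    r_squared = b * ss_xy / ss_yy                   # coefficient of determination
    r = ss_xy / math.sqrt(ss_xx * ss_yy)            # coefficient of correlation
    s_b = s_e / math.sqrt(ss_xx)                    # standard error of the slope
    t = (b - slope_h0) / s_b                        # test statistic, df = n - 2
    return {"a": a, "b": b, "s_e": s_e, "r2": r_squared, "r": r, "t": t, "df": n - 2}

# Hypothetical data for illustration only.
result = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```

Here `slope_h0` plays the role of B, the hypothesised slope, which is 0 when testing for a linear relationship.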