Exam
Examination questions
1.
a. Below are two sets of graphs constructed by an analyst who wants to highlight some key
statistics pertaining to the beverages industry. The first aims to highlight changes in sales
by category between 2017 and 2018. The second aims to illustrate the distribution of
market size of the key players in the industry.
Figure I
Figure II
[Charts not reproduced. Surviving data labels: "Coffee, 15.2" and "Coffee, 21.4".]
Critically evaluate these graphs and explain how these visualizations can be improved.
(5 marks)
b. A real estate agent has data on 362 properties and computes some descriptive statistics
using the property values (in thousands of dollars).
[Two graphs of property value against age of property. In both, the horizontal axis is "Age of property", running from 0 to 100. The vertical axis runs from 0.00 to 300.00 in the first graph and from 250.00 to 330.00 in the second.]
1. Which graph should be used to illustrate the relationship between property value
and age of the property? Explain your answer. (3 marks)
2. Based on your graph of choice, what can you say about the relationship between
property value and age of the property? Is the relationship causal?
(3 marks)
2. A processed food manufacturing company XYZ has 5 factories across the Colombo district and
employs 600 machine operators working in two shifts (8am to 2pm and 3pm to 9pm). The HR
director of the company wants to know about the time taken by the machine operators in the
factories to travel from home to work and how it is related to worker wellbeing. She assigns two
executives, based in the same factory, with the task of designing a survey to explore this question.
The first executive, who works during the morning shift, proposes surveying the 60 machine
operators working in the room next to her while the second suggests a simple random sample of
60 machine operators be taken from across all five factories and shifts.
Both executives plan to administer the same questionnaire that asks respondents, in addition to
a few background questions, the following questions to measure travel time and worker
wellbeing:
i. Over the last week, on average, how long did you take to travel from your home to work?
ii. On a scale of 0 to 10 (where 10 is the happiest) how satisfied are you with your job?
a. Explain the advantages and disadvantages of the sampling methods proposed by the two
executives. (4 marks)
b. Describe two types of survey errors that may occur when implementing the second
executive’s plan. (4 marks)
c. The sample average time taken to travel from home to work is 48 minutes. Based on this,
the second executive makes the following statement: “The mean travel time from home
to work for machine operators employed in Company XYZ is 48 minutes.”
Critically evaluate this statement. (4 marks)
d. Explain what is meant by the sampling distribution of the mean travel time. What would
happen to this distribution if the sample size were increased to 100 machine operators?
(3 marks)
e. The sample correlation coefficient between job satisfaction rating and travel time is
calculated to be -0.85.
i. Interpret this result. Can we say that there is a cause and effect relationship
between the two variables? Explain your answer. (3 marks)
ii. If you were to plot the two variables in a scatterplot, which would you select as
the dependent variable? Explain your answer. (2 marks)
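The idea behind part (d) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population of 600 travel times is invented (the name `population` and the figures used to generate it are hypothetical, not survey data), and it simply draws repeated simple random samples of 60 and of 100 operators and compares how much the sample means vary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: travel times in minutes for 600 machine operators.
# These values are invented for illustration; they are not the survey data.
population = [max(5.0, random.gauss(48, 20)) for _ in range(600)]


def sample_means(n, repetitions=2000):
    """Means of repeated simple random samples of size n, drawn without replacement."""
    return [sum(random.sample(population, n)) / n for _ in range(repetitions)]


means_60 = sample_means(60)
means_100 = sample_means(100)

# The spread of the sample means approximates the standard error for each sample size.
spread_60 = statistics.stdev(means_60)
spread_100 = statistics.stdev(means_100)

print(f"population mean: {statistics.mean(population):.2f}")
print(f"n = 60:  centre {statistics.mean(means_60):.2f}, spread {spread_60:.2f}")
print(f"n = 100: centre {statistics.mean(means_100):.2f}, spread {spread_100:.2f}")
```

Under these assumptions, both sets of sample means centre on the population mean, and the means from samples of 100 vary less than those from samples of 60.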
3. Retail company X has its head office and warehouse in two separate locations and deliveries are
made from the warehouse to head office several times a day. The logistics manager of the
company wants to outsource the company’s delivery service and wants to choose the faster of
two couriers. Courier A promises the fastest service but charges higher rates than Courier B. To
verify the claim made by Courier A, the manager’s assistant places 10 orders with Courier A and
10 orders with Courier B, at different times, and for each order records the time from when the
package leaves the warehouse to when it is received at head office. He then proceeds to test the
claim made by Courier A.
a. State the null and alternative hypotheses for this test. (2 marks)
b. Explain the risks of type I and type II errors in this scenario. (4 marks)
c. The assistant isn’t sure which statistical test should be used and ends up running all
possible tests using statistical software. The output from those tests is given below.
Output 1
                                    Courier A   Courier B
Mean                                    16.70       18.88
Variance                                 9.58        8.22
Observations                            10.00       10.00
Pooled Variance                          8.90
Hypothesized Mean Difference             0.00
Degrees of freedom                      18.00
t Stat                                  -1.63
Critical value for one-tailed test       1.73
Critical value for two-tailed test       2.10
Output 2
                                    Courier A   Courier B
Mean                                    16.70       18.88
Variance                                 9.58        8.22
Observations                            10.00       10.00
Hypothesized Mean Difference             0.00
Degrees of freedom                      18.00
t Stat                                  -1.63
Critical value for one-tailed test       1.73
Critical value for two-tailed test       2.10
Output 3: F-test for equality of variances (null hypothesis that variances are equal)
                                    Courier A   Courier B
Mean                                    16.70       18.88
Variance                                 9.58        8.22
Observations                            10.00       10.00
Degrees of freedom                       9.00        9.00
F statistic                              1.17
p-value                                  0.41
Explain which sets of statistical output the manager should consider. Based on this, which
courier should the manager select? (6 marks)
d. The manager then realizes that each time the assistant placed an order with Courier A, he
had also placed an order with Courier B, resulting in paired samples. He redoes the test
and obtains the following output.
                                    Courier A   Courier B
Mean                                    16.70       18.88
Variance                                 9.58        8.22
Observations                            10.00       10.00
Hypothesized Mean Difference             0.00
Degrees of freedom                       9.00
t Stat                                  -3.04
P(T<=t) one-tail                         0.01
P(T<=t) two-tail                         0.01
Should the manager change his decision about which courier to use? (3 marks)
e. Should the manager use the output from the paired t-test or from the independent
samples t-test? Explain your answer. (3 marks)
f. Can the manager be certain about his choice? Explain why or why not.
(2 marks)
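The independent-samples figures in question 3(c) can be cross-checked from the summary statistics alone. The sketch below is a hedged illustration rather than the software's own procedure. It uses the standard textbook formulas for the pooled variance, the two-sample t statistic and the variance-ratio F statistic. The unequal-variances (Welch) calculation is included only for comparison; treating it as what an unlabelled output represents is an assumption. All variable names are hypothetical.

```python
import math

# Summary statistics copied from the independent-samples outputs in question 3(c).
mean_a, var_a, n_a = 16.70, 9.58, 10
mean_b, var_b, n_b = 18.88, 8.22, 10

# Pooled variance and the equal-variances two-sample t statistic.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_pooled = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Unequal-variances (Welch) t statistic with its approximate degrees of freedom.
# Assumption: included for comparison only.
se2_a, se2_b = var_a / n_a, var_b / n_b
t_welch = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
df_welch = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))

# F statistic for equality of variances: ratio of the two sample variances.
f_stat = var_a / var_b

print(f"pooled variance: {pooled_var:.2f}")
print(f"t (pooled):      {t_pooled:.2f}")
print(f"t (Welch):       {t_welch:.2f}, df about {df_welch:.1f}")
print(f"F statistic:     {f_stat:.2f}")
```

Because the two samples are the same size, the pooled and Welch t statistics coincide here; only the degrees of freedom differ slightly. The paired test in part (d) cannot be reproduced this way, since it depends on the individual paired differences, which are not given.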