Workshop 03 - S1 - 2020 - Solutions For Business Statistics
WORKSHOP 3 SOLUTIONS
Q3.1
A study of Melbourne’s climate compared annual rainfall values for the 75 years from 1861
to 1935 (“Historical”) with annual rainfall values for the following 75 years 1936 to 2010
(“Recent”).
Using the exhibits below, compare the distribution of yearly rainfall totals for the “Historical”
period with the distribution for the “Recent” period by making appropriate references to the
(i) measures of central location,
(ii) measures of variability, and
(iii) shapes
of the distributions.
Business Statistics
SOLUTION:
Overall, with a lower mean and median, we can conclude that RECENT average annual
rainfall is lower than HISTORICAL average annual rainfall.
(Of course, modal class is dependent on the bins (class intervals) which are used. As usual, for
a numeric variable with many possible values, the mode is of no interest. We would usually
refer to the modal class)
(Historical has a higher minimum and a higher maximum. This fact does not, by itself, assure
a larger range. Example: A: min 1, max 19, range = 18. B: min 4, max 20, range = 16. B has a
higher max and min, but a smaller range.)
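The A/B example above can be checked with a few lines of Python. This is only a minimal sketch: the names `groups` and `value_range` are illustrative, and the numbers are the toy values from the example, not the rainfall data.

```python
# Toy (min, max) pairs taken from the A/B example above (not rainfall data).
groups = {"A": (1, 19), "B": (4, 20)}


def value_range(minimum, maximum):
    """Range = maximum - minimum."""
    return maximum - minimum


ranges = {name: value_range(lo, hi) for name, (lo, hi) in groups.items()}

for name, (lo, hi) in groups.items():
    print(f"{name}: min = {lo}, max = {hi}, range = {ranges[name]}")
```

B has both the higher minimum and the higher maximum, yet A has the larger range, so the range has to be calculated rather than inferred from the extremes.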
Co-eff of variation measures the relative variability (std deviation as a percentage of mean)
and is higher for RECENT rainfall than for HISTORICAL.
Overall then, the above evidence shows that whilst HISTORICAL annual rainfall has a larger
range, RECENT annual rainfall has greater variability.
Shape (S)
Both distributions are unimodal. They are very close to being symmetrical (mean ≈ median)
or, at most, only slightly positively skewed, with the mean being slightly greater than the
median. The difference between the mean and median for RECENT is slightly more pronounced,
suggesting that there was unusually high rainfall in at least one year.
Q3.2
PlanFinan Pty Ltd
PlanFinan is a financial planning organisation. The data in PlanFinan.xlsx was obtained from
PlanFinan’s database of a particular group of clients. Definitions are given for the following
variables:
Sex          1 if the client is male; 0 if female
EducLevel    1: high school incomplete    2: high school complete
             3: undergraduate degree      4: postgraduate degree
Salary       Annual salary ($,000)
Use the following pivot table to analyse the data and report on how sex and educational
level are associated with annual salary for this group of clients.
Exhibit 1
SOLUTION
Step One:
Identify the dependent and independent variables
Dependent variable (DV): Salary ($,000)
Independent variables (IV): Sex (IV1), EducLevel (IV2)
Step Three:
Describe the overall relationship between Salary and Sex (IV1).
This is also true for each level of education. Female clients with a high level of education
have a mean salary ($109,170) which is greater than their male counterparts ($98,910).
Female clients with a lower level of education have a mean salary ($90,070) which is
greater than their male counterparts ($83,980).
Step Four:
Describe the overall relationship between Salary and Education Level (IV2).
This is also true for each sex. Female clients with a high level of education have a mean
salary ($109,170) which is greater than the female clients who have a lower level of
education ($90,070).
Male clients with a high level of education have a mean salary ($98,910) which is
greater than the male clients who have a lower level of education ($83,980).
Overall Conclusion:
Female clients with a high level of education have a higher mean salary ($109,170)
than any other client group.
Male clients with a low level of education have the lowest mean salary ($83,980) of
all client groups.
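The comparisons in Steps Three and Four can be reproduced from the four group means quoted above. The sketch below is illustrative only: the dictionary layout and the names `mean_salary`, `sex_gap` and `educ_gap` are assumptions, and the figures are the means stated in this solution rather than values recomputed from PlanFinan.xlsx.

```python
# Mean salaries ($) quoted in the solution, keyed by (sex, education level).
mean_salary = {
    ("Female", "High"): 109_170,
    ("Male", "High"): 98_910,
    ("Female", "Lower"): 90_070,
    ("Male", "Lower"): 83_980,
}

# Sex gap (female minus male) within each education level.
sex_gap = {
    level: mean_salary[("Female", level)] - mean_salary[("Male", level)]
    for level in ("High", "Lower")
}

# Education gap (high minus lower) within each sex.
educ_gap = {
    sex: mean_salary[(sex, "High")] - mean_salary[(sex, "Lower")]
    for sex in ("Female", "Male")
}

# Groups with the highest and lowest mean salary.
highest = max(mean_salary, key=mean_salary.get)
lowest = min(mean_salary, key=mean_salary.get)

print("Sex gap by education level:", sex_gap)
print("Education gap by sex:", educ_gap)
print("Highest mean salary:", highest, "Lowest mean salary:", lowest)
```

Both gaps are positive in every subgroup, which is what "this is also true for each level of education" and "this is also true for each sex" mean in the steps above.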
Q3.3
Hytex Company
Hytex Company is a direct marketer of electronic equipment and wants to investigate the
efficacy of catalogue mailings to its 1,000 mail order customers. Hytex is currently
sending catalogues to customers who are not married and are renting a house. Catalogue
Marketing.xlsx contains customer demographic attributes including the following:
Use the following pivot table to analyse the data and report on how AmountSpent is related
to these demographic attributes. Is Hytex sending catalogues to the right customers? If not,
to whom should the catalogues be sent?
Exhibit 2
Average of AmountSpent, by OwnHome (rows) and Marital Status (columns)
OwnHome        Not married    Married      Grand Total
Renting        $597.72        $1,339.02    $868.82
OwnHome        $1,015.12      $1,853.45    $1,543.14
Grand Total    $757.81        $1,672.07    $1,216.77
SOLUTION
Step One:
Identify the dependent and independent variables
Dependent variable (DV): AmountSpent ($)
Independent variables (IV): OwnHome (IV1), Marital Status (IV2)
Step Two:
Describe any relevant overall features of AmountSpent.
Step Three:
Describe the overall relationship between AmountSpent and OwnHome (IV1).
This is also true for each marital status. Married homeowners have a mean AmountSpent
($1,853.45) which is greater than that of married renters ($1,339.02).
Homeowners who are not married have a mean AmountSpent ($1,015.12) which is greater than
that of renters who are not married ($597.72).
Step Four:
Describe the overall relationship between AmountSpent and Marital Status (IV2).
This is also true for each level of OwnHome. Married homeowners have a mean
AmountSpent ($1,853.45) which is greater than homeowners who are not married
($1,015.12).
Married renters have a mean AmountSpent ($1,339.02) which is greater than renters
who are not married ($597.72).
Now, it is important to address Hytex’s question: is it sending the catalogues to the right
customers? If not, to whom should Hytex send the catalogues?
Overall Conclusion:
• Married customers who own their own home have a higher mean spend
($1,853.45) than any other customer group.
• Married customers who rent have the next highest mean spend ($1,339.02).
• So, it seems to make good sense for Hytex to send catalogues to these customer
segments instead of sending them to unmarried customers who rent a home.
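The ranking behind this conclusion can be sketched by sorting the four cell means from Exhibit 2. The variable names (`mean_spend`, `ranked`, `current_target`) are illustrative assumptions, and the values are the Exhibit 2 cell means rather than figures recomputed from the workbook.

```python
# Cell means ($) from Exhibit 2, keyed by (OwnHome, Marital Status).
mean_spend = {
    ("Renting", "Not married"): 597.72,
    ("Renting", "Married"): 1339.02,
    ("OwnHome", "Not married"): 1015.12,
    ("OwnHome", "Married"): 1853.45,
}

# The segment currently receiving catalogues, as described in the question.
current_target = ("Renting", "Not married")

# Rank segments from highest to lowest mean AmountSpent.
ranked = sorted(mean_spend.items(), key=lambda item: item[1], reverse=True)

for position, ((home, marital), spend) in enumerate(ranked, start=1):
    marker = "  <- current target" if (home, marital) == current_target else ""
    print(f"{position}. {home}, {marital}: ${spend:,.2f}{marker}")
```

On these means, the segment currently targeted ranks last of the four, while the two married segments rank first and second.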
Q3.4
As part of an Australian Household Expenditure Survey (1988-89), the following data was
collected for 1000 households:
Exhibit 1:
Exhibit 2:
Using the above results, compare the distribution of the variable “Income” for the two groups,
discussing typical values (i.e. “central tendency”), how spread out the values are (“variability”),
and the shape of the distributions.
Comment on what this tells us about the association between income and the consumption
of alcohol.
SOLUTION:
With a lower mean and median, we can conclude that the average weekly household
income for the group that does not consume alcohol is much lower than that of the group
that consumes alcohol.
The group that does not consume alcohol has a clear modal class of $0-$250, whereas the
group that does consume alcohol has no clear modal class ($500-$750 if pushed).
The income range of the group that does not consume alcohol is $150 larger than that of
the group that consumes alcohol.
(Both minimums are the same but Do Not Consume has a higher Maximum, hence a larger
range)
The spread of the middle 50% of the income for the group that consumes alcohol is larger
(more variable) than the group that does not consume alcohol.
The distribution of incomes for households that do not consume alcohol has a lower standard
deviation and interquartile range than the distribution for those that do consume alcohol.
Overall, although the measures of absolute variability (interquartile range and standard
deviation) are higher for the households that consume alcohol, the relative variability as
measured by the coefficient of variation is considerably higher for the income distribution of
households that do not consume alcohol. This is because the standard deviation is about 88%
of the mean, whereas for the income distribution of alcohol-consuming households the
standard deviation is only about 65% of the mean.
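The coefficient of variation comparison above can be illustrated with a short calculation. The means and standard deviations below are hypothetical values chosen only so that the ratios match the 88% and 65% quoted in the text. They are not the figures from the exhibits, and the function name `coefficient_of_variation` is likewise an assumption.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return std_dev / mean * 100


# HYPOTHETICAL summary statistics ($ per week), not taken from the exhibits.
do_not_consume = {"mean": 400.0, "std_dev": 352.0}
consume = {"mean": 800.0, "std_dev": 520.0}

cv_do_not_consume = coefficient_of_variation(
    do_not_consume["std_dev"], do_not_consume["mean"]
)
cv_consume = coefficient_of_variation(consume["std_dev"], consume["mean"])

print(f"Do not consume: SD = {do_not_consume['std_dev']:.0f}, CV = {cv_do_not_consume:.0f}%")
print(f"Consume:        SD = {consume['std_dev']:.0f}, CV = {cv_consume:.0f}%")
```

The sketch shows how a group can have the smaller standard deviation (absolute variability) and still have the larger coefficient of variation (relative variability) when its mean is much lower.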
The distribution of weekly income is skewed to the right and unimodal for both groups. This
means that in both cases, the mean is greater than the median. This suggests that there are
a “few” very large incomes.
Among those who do not consume alcohol, the distribution is more strongly skewed given
that the difference between mean and median is larger.
Overall then, the above evidence shows that the group that does not consume alcohol has a
larger range and greater relative variability in weekly income.
Q3.5
The side by side boxplots below show the distribution of age at marriage of 45 married men
and 38 married women.
(iii) shape
b. Comment on how the age at marriage of men compares to women for the data.
Solution:
a.
i. Mean: Female 22 years vs Male 25 years
Median: Female 21 years vs Male 23 years
The mean and median ages at marriage are both greater for men than for women. Hence,
the overall average age at marriage is greater for men than for women.
ii. The range of age at marriage is greater for men (R= 26 years) than women (R= 22
years).
The IQR is also greater for men (IQR= 11 years) than women (IQR= 8 years). This means
that the middle 50% of men get married between the ages of 20 – 31 years (IQR = 11
years), compared to the middle 50% of women who get married between the ages of
19 – 27 years (IQR = 8 years), which is both younger and less varied.
iii. The distributions of age at marriage are positively skewed for both men and women,
as shown by mean > median. This means that for both men and women, there
must be a small number who married at a much older age, which drags the mean
upwards so that it no longer represents the typical age at marriage.
b. The men, on average, married at an older age and the age at which they married is
more variable, i.e. spread across a wider range of ages.
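The figures quoted in part a can be tied together in a short sketch that computes the IQR from the quartiles and applies the mean-versus-median rule of thumb for skewness. The quartiles are the ends of the middle-50% intervals stated above. The names `summary`, `iqr` and `skew_direction` are illustrative assumptions, and the mean-versus-median comparison is only a rough indicator of shape, not a formal skewness measure.

```python
# Summary statistics (years) quoted in the solution to part a.
summary = {
    "Men": {"mean": 25, "median": 23, "q1": 20, "q3": 31},
    "Women": {"mean": 22, "median": 21, "q1": 19, "q3": 27},
}


def iqr(q1, q3):
    """Interquartile range: the spread of the middle 50% of values."""
    return q3 - q1


def skew_direction(mean, median):
    """Rule of thumb: mean above median suggests positive (right) skew."""
    if mean > median:
        return "positively skewed"
    if mean < median:
        return "negatively skewed"
    return "approximately symmetrical"


results = {
    group: {
        "iqr": iqr(stats["q1"], stats["q3"]),
        "shape": skew_direction(stats["mean"], stats["median"]),
    }
    for group, stats in summary.items()
}

for group, outcome in results.items():
    print(f"{group}: IQR = {outcome['iqr']} years, {outcome['shape']}")
```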