578 Assignment 1 F14 Sol

BA 578 Assignment-Sol- due by Midnight (11:59pm) Monday, Sept 15th, 2014 (Chapters 1, 2, 3 and 4): Total 75 points

True/False (One point each)
Chapter 1
1. An example of a quantitative variable is the telephone number of an individual. FALSE
2. An example of an interval scale variable is the make of a car. FALSE
3. Credit score is an example of an interval scale variable. TRUE There is no intrinsic Zero.
An arbitrary minimum is established. Therefore, it is an interval scale variable.
4. The number of people eating at a local café between noon and 2:00 p.m. is an example of a
discrete variable. TRUE
Chapter 2
5. When establishing the classes for a frequency table it is generally agreed that the more classes
you use the better your frequency table will be. FALSE We try to follow the $2^k$ rule. Having
too many classes is not good.

6. The cumulative distribution function is never decreasing. TRUE It either increases or stays
level, and it becomes flat at the end point.

7. A Histogram is a graphic that is used to depict quantitative data. TRUE Bar Chart is used for
qualitative data.
Chapter 3
8. The Mean is the measure of central tendency that divides a population or sample into two
equal parts (that is two parts with equal frequencies) FALSE It is the median which does that.

9. If there are 7 classes in a frequency distribution then the fourth class necessarily contains the
median. FALSE It depends on the class frequencies

10. The sum of deviations from the mean (taking into account the frequencies) can be negative,
zero or positive. FALSE It is always Zero
11. The median is said to be less sensitive to extreme values. TRUE This statement is a
relative statement (implicitly) comparing Median with the other popular measure of
central tendency, namely, the Mean. But some students read the statement in absolute
terms and answered it wrong although they knew that Median is not sensitive to extreme
values. Therefore, I removed this question from grading.
12. The Empirical Rule is used to describe a population that is not highly skewed. TRUE It is
based on the symmetrical Normal distribution and can be safely applied only for slightly
skewed non-Normal distributions. For highly skewed distribution it is not appropriate.
Chapter 4
13. If events A and B are independent and A is not an impossible event, then P(A/B) is not equal
to zero. TRUE In fact P(A/B) equals P(A) if A and B are independent, which is not zero unless
A is an impossible event.

14. If events A and B are mutually exclusive, then the conditional probability P(A/B) is a
positive number greater than zero but less than 1. FALSE This is obvious from the definition of
mutually exclusive events. If B occurs then A cannot occur. Therefore P(A/B) = 0.
15. The union of events A and B is given by all basic outcomes common to both A and B
FALSE This statement is for Intersection, not for Union.
Multiple Choices (each question carries two points):
Chapter 1
1. Ratio variables have the following unique or special characteristic:
A. Meaningful order
B. Predictable
C. Categorical in nature
D. An inherently defined zero value
2. Which of the following is a quantitative variable?
A. The make of a TV
B. The price of a TV
C. The VIN of a car
D. The rank of a police officer
E. The Drivers License Number
3. Which of the following is a categorical or Nominal variable?
A. The Social Security Number of a person
B. Bank Account Balance
C. Daily Sales in a Store
D. Air Temperature
E. Value of Company Stock
4. The level of Satisfaction in a Consumer survey would represent a(n) ____________ level of
measurement.
A. Nominative
B. Ordinal
C. Interval
D. Ratio
Chapter 2
5. When developing a frequency distribution, the class (group) intervals must be
A. Large
B. Small
C. Mutually exclusive.
D. Whole numbers
E. Equal
Having equal intervals (or nearly equal intervals) is generally (not always) desirable. But
it is not necessary and not even appropriate in some applications. For example, in Income
distribution the classes are arbitrarily formed and are generally unequal. Similarly many
distributions have the lowest and/or highest class with open bounds which make these class
intervals different from other classes.
6. If there are 80 values in a data set, how many classes should be created for a frequency
histogram?
A. 4
B. 5
C. 6
D. 7
E. 8
Just apply the $2^k$ rule: $2^6 = 64$ is less than 80 while $2^7 = 128$ covers it, so 7 classes.
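The 2^k rule mentioned above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and it assumes the rule means "the smallest k with 2^k at least n" (some textbooks require strictly greater; for n = 80 both readings give the same k).

```python
def classes_by_2k_rule(n: int) -> int:
    """Return the smallest k such that 2**k >= n (assumed reading of the 2^k rule)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


num_classes = classes_by_2k_rule(80)
print(num_classes)  # 7, since 2**6 = 64 < 80 <= 2**7 = 128
```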
7. Consider the following frequency distribution from Excel. What is the missing value?
Bin        Frequency   Cumulative %
584        1             4.00%
1774.4     ?            64.00%
2964.8     4            80.00%
4155.2     3            92.00%
5345.6     1            96.00%
More       1           100.00%

A. 4
B. 10
C. 12
D. 15
E. 20
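One way to recover the missing frequency is to back out the total count from the first row and then use the jump in cumulative percentage. The sketch below assumes the Cumulative % column is exact to two decimals; the variable names are illustrative only.

```python
# Cumulative percentages from the Excel output, written as fractions.
cum_first = 0.04    # bin 584, frequency 1
cum_second = 0.64   # bin 1774.4, frequency missing
freq_first = 1

# One observation is 4% of the data, so the total count follows directly.
total_count = round(freq_first / cum_first)

# The second bin accounts for the increase from 4% to 64% of the total.
missing_freq = round((cum_second - cum_first) * total_count)

print(total_count, missing_freq)  # 25 15
```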
Chapter 3
8. In a statistic class, 10 scores were randomly selected with the following results obtained: 75,
74, 77, 77, 71, 70, 65, 78, 67, and 66. What is the Standard deviation?
A. 21.40
B. 23.78
C. 4.88
D. 4.63
E. 214.00
X       X - X-bar    (X - X-bar)^2
75       3            9
74       2            4
77       5           25
77       5           25
71      -1            1
70      -2            4
65      -7           49
78       6           36
67      -5           25
66      -6           36
720      0          214
$s_x^2 = 214 / (10 - 1) = 23.78$; $s_x = \sqrt{23.78} = 4.88$
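The hand calculation above can be checked with a short script. This is a minimal sketch that mirrors the table column by column; the variable names are illustrative, and `statistics.stdev` is used only as a cross-check.

```python
import statistics

scores = [75, 74, 77, 77, 71, 70, 65, 78, 67, 66]

n = len(scores)
mean = sum(scores) / n                               # 720 / 10
sum_sq_dev = sum((x - mean) ** 2 for x in scores)    # last column total
sample_var = sum_sq_dev / (n - 1)                    # divide by n - 1 for a sample
sample_sd = sample_var ** 0.5

print(mean, sum_sq_dev, round(sample_var, 2), round(sample_sd, 2))
# Cross-check against the standard library's sample standard deviation.
print(round(statistics.stdev(scores), 2))
```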
9. According to a survey of the top 10 employers in a major city in the Midwest, a worker
spends an average of 413 minutes a day on the job. Suppose the standard deviation is 26.8
minutes and the time spent is approximately normally distributed. Between what times will
approximately 95.45% of all workers fall?
A. [387.5 438.5]
B. [386.2 439.8]
C. [372.8 453.2]
D. [359.4 466.6]
E. [332.6 493.4]
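Under the empirical rule, about 95.45% of a normal population lies within two standard deviations of the mean. The sketch below applies that to the figures in the question; the variable names are illustrative, and the multiplier of 2 is the assumption being used.

```python
mean_minutes = 413
sd_minutes = 26.8
k = 2  # number of standard deviations that covers about 95.45% under normality

lower = mean_minutes - k * sd_minutes
upper = mean_minutes + k * sd_minutes

print(round(lower, 1), round(upper, 1))  # 359.4 466.6
```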
10. When using Chebyshev's theorem to obtain the bounds for 99.73 percent of the values
in a population, the interval generally will be ___________ the interval obtained for the same
percentage if a normal distribution is assumed (empirical rule).
A. Shorter than
B. Wider than
C. The same as
D. A Subset of
See Instructions. Chebyshev's theorem is more general but less precise than the
empirical rule.
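To see how much wider the Chebyshev interval is, one can solve 1 - 1/k^2 = 0.9973 for k and compare it with the 3 standard deviations used by the empirical rule. This is a rough sketch with a hypothetical helper name, not part of the course instructions.

```python
import math


def chebyshev_k(coverage: float) -> float:
    """Number of standard deviations k for which 1 - 1/k**2 equals the coverage."""
    return 1 / math.sqrt(1 - coverage)


k_chebyshev = chebyshev_k(0.9973)
k_empirical = 3  # the empirical rule puts about 99.73% within 3 standard deviations

print(round(k_chebyshev, 2), k_empirical)
```

With these numbers the Chebyshev half-width is roughly 19 standard deviations versus 3 under normality.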
11. In a hearing test, subjects estimate the loudness (in decibels) of sound and the results are:
68, 67, 70, 71, 67, 75, 69, 62, 80, 73, 68 What is the median?
A. 67
B. 68
C. 69
D. 70
E. 71
Put items in order: 62, 67, 67, 68, 68, 69, 70, 71, 73, 75, 80. The median is the (11 + 1)/2 = 6th item, which is 69.
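The same ordering-and-position step can be written out as follows. It is a small sketch for an odd number of values; `statistics.median` is included as a cross-check.

```python
import statistics

readings = [68, 67, 70, 71, 67, 75, 69, 62, 80, 73, 68]

ordered = sorted(readings)
position = (len(ordered) + 1) // 2    # (11 + 1) / 2 = 6th item, counting from 1
median_value = ordered[position - 1]  # lists are indexed from 0

print(position, median_value)         # 6 69
print(statistics.median(readings))    # 69
```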
12. The numbers of rooms for 15 homes recently sold were: 8, 8, 8, 5, 9, 8, 7, 6, 6, 7, 7, 7, 7, 9, 9
What is the standard deviation?
A. 1.96
B. 1.40
C. 1.31
D. 1.14
E. 1.18
X       X - X-bar    (X - X-bar)^2
5       -2.4         5.76
6       -1.4         1.96
6       -1.4         1.96
7       -0.4         0.16
7       -0.4         0.16
7       -0.4         0.16
7       -0.4         0.16
7       -0.4         0.16
8        0.6         0.36
8        0.6         0.36
8        0.6         0.36
8        0.6         0.36
9        1.6         2.56
9        1.6         2.56
9        1.6         2.56
111      0          19.6
Mean = 111/15 = 7.4. Sample variance $s_x^2 = 19.6 / 14 = 1.4$ and $s_x = \sqrt{1.4} = 1.18$
Chapter 4
13. Two mutually exclusive events having positive probabilities are ______________
dependent.
A. Never
B. Sometimes
C. Always
They are necessarily dependent because the occurrence of one (seriously) affects the probability
of the other (makes it zero). See Instructions on Ch 4 page 4.
14. If P(A) >0 and P(B) > 0 and events A and B are independent, then:
A. P(A) = P(B)
B. P(A|B) = P(A)
C. P(A ∩ B) = 0
D. P(A ∩ B) = P(A) / P(B/A)
E. Both A and C are correct
See My Instructions on Ch 4 page 5. Independence does not imply equality of probabilities. So
the first choice is clearly wrong. The third choice applies to mutually exclusive events not
independent events. The fourth choice is also incorrect because there should be multiplication on
the right hand side not division. So the correct answer is B.

Essay Type Questions (4 points each)
Chapter 2
1. Consider the following data on distances traveled by people to visit the local amusement
park.
Distance Frequency
1-8 miles 15
8-15 miles 14
15-22 miles 10
22-29 miles 8
29-36 miles 3
Expand and construct the table adding columns for relative frequency and cumulative relative
frequency and construct the histogram of frequencies, plot the frequency polygon and the
Ogive curve using Excel.
distance freq rel.fr cum.rel.fr
1-8 15 0.30 0.30
8-15 14 0.28 0.58
15-22 10 0.20 0.78
22-29 8 0.16 0.94
29-36 3 0.06 1.00
total 50 1.00 na
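The two added columns can be reproduced with a few lines of code. This is only a sketch of the arithmetic behind the table (the original work was done in Excel); the variable names are illustrative.

```python
from itertools import accumulate

classes = ["1-8", "8-15", "15-22", "22-29", "29-36"]
freqs = [15, 14, 10, 8, 3]

total = sum(freqs)                                         # 50
rel_freq = [f / total for f in freqs]                      # each class's share
cum_rel_freq = [c / total for c in accumulate(freqs)]      # running share

for label, f, r, c in zip(classes, freqs, rel_freq, cum_rel_freq):
    print(f"{label:>6} {f:>4} {r:6.2f} {c:6.2f}")
```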
The following plots were obtained using simple Excel and Insert/scatter plot functions (without using
analysis Tool Pack)
[Figure: Histogram of Frequency by distance class (1-8, 8-15, 15-22, 22-29, 29-36 miles); bar heights 15, 14, 10, 8, 3]

[Figure: Frequency Polygon of Frequency over the same distance classes; points 15, 14, 10, 8, 3]

[Figure: Ogive Curve of Cumulative Relative Frequency over the same distance classes; points 0.30, 0.58, 0.78, 0.94, 1.00]

2. Math test anxiety can be found throughout the general population. A study of 120 seniors at a
local high school was conducted. The following table was produced from the data. Complete the
missing parts. (Work step by step to solve this puzzle. Round the frequencies to the nearest
whole number.)

               Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious   37-50                     0.20
Anxious        33-36         12
Mild Anxiety   27-32
Relaxed        20-26         24
Very Relaxed   10-19                     0.30
Total

We have to work step by step using our knowledge of Frequency tables to solve this puzzle.
For the first class, Relative Frequency and Cumulative Relative Frequency will be the same.
So we write 0.20 in the first row, last column. Moreover, we find the frequency for this class by
multiplying the relative frequency 0.20 by the total frequency 120 to get 24. Thus, the first row is
completely filled. In the second row we convert the given frequency 12 into a relative frequency
by dividing by 120, which gives 0.10. Therefore, the cumulative relative frequency in the
second row will be 0.20 + 0.10 = 0.30. Thus, the second row is filled too. Next we convert the given
relative frequency in the fifth row into a frequency by multiplying 0.30 by 120 to get 36.
Since the total frequency is given as 120, we can find the remaining frequency for the third
row once we have the frequencies for the other four rows. It is calculated as
120 - (24 + 12 + 24 + 36) = 24. The rest of the story should be clear to you. Just remember that
the total of all frequencies must be the given number 120 and the total of all relative
frequencies must always be 1.

Score Range Frequency Rel frequency Cumulative Rel. freq.
Very anxious 37-50 24 0.20 0.20
Anxious 33-36 12 0.10 0.30
Mild Anxiety 27-32 24 0.20 0.50
Relaxed 20-26 24 0.20 0.70
Very Relaxed 10-19 36 0.30 1.00
Total 120 1.000 NA
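The step-by-step fill-in described above can be traced in code. This is a sketch of the same arithmetic, with illustrative variable names; the given cells are taken from the question table and everything else is derived.

```python
total = 120  # number of seniors in the study

# Cells given in the question.
rel_very_anxious = 0.20
freq_anxious = 12
freq_relaxed = 24
rel_very_relaxed = 0.30

# Relative frequency times total gives the frequency (rounded to a whole number).
freq_very_anxious = round(rel_very_anxious * total)
freq_very_relaxed = round(rel_very_relaxed * total)

# The remaining class takes whatever is left of the total.
freq_mild = total - (freq_very_anxious + freq_anxious + freq_relaxed + freq_very_relaxed)

freqs = [freq_very_anxious, freq_anxious, freq_mild, freq_relaxed, freq_very_relaxed]
rel = [round(f / total, 2) for f in freqs]

cum_rel = []
running = 0
for f in freqs:
    running += f
    cum_rel.append(round(running / total, 2))

print(freqs)
print(rel)
print(cum_rel)
```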


3. The number of items rejected daily by a manufacturer because of defects for the last 30 days
are: 22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4, 19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20,
25, 5, 17, 9. Complete this frequency table for the above data showing columns for Frequency,
Relative Frequency and Cumulative Relative Frequency, and plot the Ogive curve.

Class     Frequency   Relative Frequency   Cum Relative Frequency
4<9       5           0.167                0.167
9<14      6           0.200                0.367
14<19     6           0.200                0.567
19<24     9           0.300                0.867
24<29     4           0.133                1.000
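The class counts can be verified by tallying the 30 values into the same half-open classes (lower bound included, upper bound excluded). This is a minimal sketch with illustrative names; the class edges are read off the table above.

```python
from itertools import accumulate

rejects = [22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4,
           19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9]
edges = [4, 9, 14, 19, 24, 29]

# Count values with lo <= x < hi for each pair of neighbouring edges.
counts = [sum(lo <= x < hi for x in rejects) for lo, hi in zip(edges, edges[1:])]

n = len(rejects)
rel_freq = [round(c / n, 3) for c in counts]
cum_rel_freq = [round(c / n, 3) for c in accumulate(counts)]

print(counts)
print(rel_freq)
print(cum_rel_freq)
```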

[Figure: Ogive of Cumulative Relative Frequency by class (4-9, 9-14, 14-19, 19-24, 24-29); points 0.167, 0.367, 0.567, 0.867, 1]

Chapter 3
4. The following frequency table summarizes the distances in miles of 100 patients from a
regional hospital.
Distance (miles) Frequency
0-4 40
4-8 30
8-12 20
12-16 5
16-20 5

Calculate the sample standard deviation for this data (since it is a case of grouped data with
classes, use group or class midpoints in the formula in place of X values).

Calculate the Sample Mean

Distance   Class Midpoint (M_i)   Frequency (f_i)   f_i*M_i
0-4        2                      40                80
4-8        6                      30                180
8-12       10                     20                200
12-16      14                     5                 70
16-20      18                     5                 90
Total      NA                     100               620

The Sample Mean $\bar{X} = \sum f_i M_i / n = 620 / 100 = 6.2$



Calculate the standard deviation:



Distance   Class Midpoint (M_i)   Frequency (f_i)   Deviation (M_i - X-bar)   Squared Deviation (M_i - X-bar)^2   f_i*(M_i - X-bar)^2
0-4        2                      40                -4.2                      17.64                               705.6
4-8        6                      30                -0.2                      0.04                                1.2
8-12       10                     20                3.8                       14.44                               288.8
12-16      14                     5                 7.8                       60.84                               304.2
16-20      18                     5                 11.8                      139.24                              696.2
Total      NA                     100               NA                        NA                                  1996

Sample Variance, $s^2 = 1996 / (100 - 1) = 20.1616$; Sample Standard Deviation, $s = \sqrt{20.1616} = 4.49$
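The grouped-data calculation can be reproduced as follows. This is a sketch that uses class midpoints in place of the individual values, as the question instructs; the variable names are illustrative.

```python
midpoints = [2, 6, 10, 14, 18]   # class midpoints M_i
freqs = [40, 30, 20, 5, 5]       # class frequencies f_i

n = sum(freqs)                                                # 100 patients
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n       # 620 / 100

# Frequency-weighted sum of squared deviations from the mean.
weighted_ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))

sample_var = weighted_ss / (n - 1)    # sample variance uses n - 1
sample_sd = sample_var ** 0.5

print(round(mean, 1), round(weighted_ss, 1), round(sample_var, 4), round(sample_sd, 2))
```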


5. Use the data in Essay question number 3 above to calculate the sample Mean,
Variance and Standard deviation without grouping the data (that is, as a series of
individual values)



Answer: Mean = 16.033

Variance = 44.102

Standard Deviation = 6.641


Data: 22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4, 19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9

Column1
Mean                  16.033
Standard Error         1.212
Median                17.000
Mode                  21.000
Standard Deviation     6.641
Sample Variance       44.102
Kurtosis              -0.955
Skewness              -0.225
Range                 24.000
Minimum                4.000
Maximum               28.000
Sum                  481.000
Count                 30.000


Using calculator

X       X - X-bar    (X - X-bar)^2
22        5.967        35.601
21        4.967        24.668
8        -8.033        64.534
17        0.967         0.934
25        8.967        80.401
20        3.967        15.734
18        1.967         3.868
19        2.967         8.801
14       -2.033         4.134
13       -3.033         9.201
11       -5.033        25.334
6       -10.033       100.668
21        4.967        24.668
23        6.967        48.534
4       -12.033       144.801
19        2.967         8.801
11       -5.033        25.334
12       -4.033        16.268
16       -0.033         0.001
16       -0.033         0.001
10       -6.033        36.401
28       11.967       143.201
24        7.967        63.468
6       -10.033       100.668
21        4.967        24.668
20        3.967        15.734
25        8.967        80.401
5       -11.033       121.734
17        0.967         0.934
9        -7.033        49.468
Total 481  0.000     1278.967

Sample Variance $= 1278.967 / 29 = 44.102$

Sample Std. Dev. $= \sqrt{44.102} = 6.641$
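As a cross-check on the calculator table, the same three statistics can be obtained from Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) formulas. The variable names are illustrative.

```python
import statistics

data = [22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4,
        19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9]

mean = statistics.mean(data)          # 481 / 30
sample_var = statistics.variance(data)  # sample variance, divides by n - 1
sample_sd = statistics.stdev(data)      # square root of the sample variance

print(round(mean, 3), round(sample_var, 3), round(sample_sd, 3))
```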

Chapter 4
6. At a college, 55 percent of the students are women and 40 percent of the students receive a
grade of C. About 30 percent of the students are female but not C students. Use this contingency
table.
          C      Not C   Total
Female           0.30    0.55
Male
Total     0.40

If a randomly selected student is a C student, what is the probability the student is a male
student?
The completed table is:
          C      Not C   Total
Female    0.25   0.30    0.55
Male      0.15   0.30    0.45
Total     0.40   0.60    1.000

P(M/C) = P(M and C)/P(C) = 0.15/0.40 = 0.375 or 37.5% chance.
Some of you just answered 0.15, but that is the probability of "male and C", not the probability
of "Male" given C.
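The table completion and the conditional probability can be traced numerically. This sketch starts from the three values shown in the contingency table (0.55, 0.40 and the 0.30 for female and not C) and derives the rest; the variable names are illustrative.

```python
p_female = 0.55         # row total for Female
p_c = 0.40              # column total for C
p_female_not_c = 0.30   # joint probability given in the table

p_female_c = p_female - p_female_not_c   # fill the Female row: 0.25
p_male_c = p_c - p_female_c              # fill the C column: 0.15

# Conditional probability: P(M/C) = P(M and C) / P(C)
p_male_given_c = p_male_c / p_c

print(round(p_male_c, 2), round(p_male_given_c, 3))  # 0.15 0.375
```

Note the difference between the two printed numbers: the first is the joint probability, the second is the conditional one the question asks for.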

7. The contingency table about customers of a store who buy cigars and/or beer is given below.
            Beer   No Beer
Cigars             0.20
No cigar    0.10   0.40
Determine the probability that a customer will buy at least one of these items: cigar or beer.
The completed table is:
            Beer   No Beer   Total
Cigars      0.30   0.20      0.50
No cigar    0.10   0.40      0.50
Total       0.40   0.60      1.00
Answer: P(C or B) = P(C) + P(B) - P(C and B) = 0.50 + 0.40 - 0.30 = 0.60 or 60% chance.
You can also obtain the same probability by working with the rule of complements. The
opposite of buying Cigar or Beer or both is neither Cigar nor Beer. The probability for
neither Cigar nor Beer according to the contingency table is 0.40. Therefore, by the rule
of complements, the probability asked is 1- 0.40 = 0.60.
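Both routes to the answer, the addition rule and the rule of complements, can be checked side by side. This is a sketch built from the three joint probabilities given in the question; the variable names are illustrative.

```python
# Joint probabilities given in the question's table.
p_cigar_no_beer = 0.20
p_no_cigar_beer = 0.10
p_neither = 0.40

# The four cells must sum to 1, which gives the missing cell.
p_both = 1 - (p_cigar_no_beer + p_no_cigar_beer + p_neither)   # 0.30

p_cigar = p_both + p_cigar_no_beer   # row total: 0.50
p_beer = p_both + p_no_cigar_beer    # column total: 0.40

# Addition rule: P(C or B) = P(C) + P(B) - P(C and B)
p_union = p_cigar + p_beer - p_both

# Rule of complements: 1 - P(neither)
p_union_by_complement = 1 - p_neither

print(round(p_union, 2), round(p_union_by_complement, 2))  # 0.6 0.6
```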

8. Four employees who work as drive-through attendants at a local fast food restaurant are being
evaluated. As part of a quality improvement initiative and employee evaluation, these workers
were observed over three days. One of the statistics collected is the proportion of times an employee
forgets to include a napkin in the bag. Related information is given in the table.
Worker   Proportion of Dinners Packed   Proportion of forgetting Napkin when packing Dinner
Joe      0.20                           0.05
Jan      0.30                           0.02
Cheryl   0.15                           0.14
Clay     0.35                           0.04

You just purchased a dinner and found that there is no napkin in your bag, what is the probability
that Clay has prepared your order?
Answer: First note that the last column in the above table gives conditional probabilities.
For example 0.05 is the probability of forgetting the napkin given that Joe packed the dinner,
or P(No napkin/Joe). In the question we are given that No napkin has occurred and asked
to find the probability of Clay in light of this result. So here we are asked for the reverse
of the conditional probability given in the table. According to the Instructions for
Chapter 4, this requires the Bayesian rule. Therefore,
P(Clay/ No napkin) = 0.014/0.051 = 0.2745 or 27.45%
The numerator is P(Clay and No napkin)=P(Dinner packed by Clay)*P(No napkin given
that Clay packed Dinner) = 0.35*0.04 = 0.014.
The denominator is P(No napkin)= P(Joe and no napkin)+ P(Jan and No napkin)+
P(Cheryl and No napkin)+ P(Clay and no napkin) = 0.010 + 0.006 + 0.021 + 0.014 = 0.051
as shown in the table below (everything converted to decimals instead of percentage,
because working with percentage is messy):
Worker   Proportion of Dinners   Proportion of forgetting    Joint probability
         Packed by individual    Napkin given the worker     Col. 2 * Col. 3
         workers                 (conditional probability)
Joe      0.20                    0.05                        0.010
Jan      0.30                    0.02                        0.006
Cheryl   0.15                    0.14                        0.021
Clay     0.35                    0.04                        0.014
Total                                                        0.051
This formula is also called the Bayesian rule for probability revision based on the results of
an experiment. Here the prior probability of Clay is 35%, but after noticing that the dinner
had no napkin it has been revised downward to 27.45% (called the revised or posterior
probability), because Clay is one of the least forgetful ones. If the question were for Cheryl
the posterior probability would be higher than the prior probability because she has a very
high chance of forgetting the napkin (14%).
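The probability revision described above can be sketched for all four workers at once. The dictionary layout and variable names are illustrative; the numbers are the priors and conditional probabilities from the table.

```python
# worker: (prior = share of dinners packed, P(no napkin given worker))
workers = {
    "Joe": (0.20, 0.05),
    "Jan": (0.30, 0.02),
    "Cheryl": (0.15, 0.14),
    "Clay": (0.35, 0.04),
}

# Joint probability P(worker and no napkin) = prior * conditional.
joint = {name: prior * p_forget for name, (prior, p_forget) in workers.items()}

# Total probability of no napkin is the sum of the joint probabilities.
p_no_napkin = sum(joint.values())

# Bayes' rule: posterior = joint / total.
posterior = {name: j / p_no_napkin for name, j in joint.items()}

for name in workers:
    print(f"{name:<7} prior={workers[name][0]:.2f} posterior={posterior[name]:.4f}")
```

Running this shows Clay's probability dropping below his prior and Cheryl's rising well above hers, which matches the discussion above.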
