Business Statistics, 4e: by Ken Black


Business Statistics, 4e

by Ken Black
Chapter 12
Analysis of Categorical Data

Discrete Distributions

Business Statistics, 4e, by Ken Black. 2003 John Wiley & Sons.


Learning Objectives
Understand the $\chi^2$ goodness-of-fit test and how to use it.
Analyze data using the $\chi^2$ test of independence.


$\chi^2$ Goodness-of-Fit Test
The $\chi^2$ goodness-of-fit test compares the expected (theoretical) frequencies of categories from a population distribution to the observed (actual) frequencies from a distribution to determine whether there is a difference between what was expected and what was observed.


$\chi^2$ Goodness-of-Fit Test

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

$df = k - 1 - c$
where:
$f_o$ = frequency of observed values
$f_e$ = frequency of expected values
$k$ = number of categories
$c$ = number of parameters estimated from the sample data
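
The formula and degrees of freedom above can be sketched in a few lines of Python. This is a minimal illustration, not code from the textbook: the function names and the die-roll counts are hypothetical and chosen only to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def degrees_of_freedom(k, c=0):
    """df = k - 1 - c, where c parameters were estimated from the sample."""
    return k - 1 - c


# Hypothetical data (not from the slides): 60 rolls of a die,
# tested against the frequencies expected from a fair die.
observed = [8, 12, 9, 11, 10, 10]
expected = [10.0] * 6  # 60 rolls spread evenly over 6 faces

stat = chi_square_statistic(observed, expected)
df = degrees_of_freedom(k=len(observed), c=0)
print(f"chi-square = {stat:.2f}, df = {df}")
```

The statistic is then compared with a table value of the chi-square distribution for the chosen alpha and df, as in the demonstration problems that follow.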


Milk Sales Data for Demonstration Problem 12.1

Month        Gallons
January        1,610
February       1,585
March          1,649
April          1,590
May            1,540
June           1,397
July           1,410
August         1,350
September      1,495
October        1,564
November       1,602
December       1,655
Total         18,447


Hypotheses and Decision Rules for Demonstration Problem 12.1
Ho: The monthly figures for milk sales are uniformly distributed.
Ha: The monthly figures for milk sales are not uniformly distributed.
$\alpha = .01$
$df = k - 1 - c = 12 - 1 - 0 = 11$
$\chi^2_{.01,11} = 24.725$
If $\chi^2_{Cal} > 24.725$, reject Ho.
If $\chi^2_{Cal} \le 24.725$, do not reject Ho.


Calculations for Demonstration Problem 12.1

Month          fo          fe     (fo - fe)^2/fe
January      1,610    1,537.25         3.44
February     1,585    1,537.25         1.48
March        1,649    1,537.25         8.12
April        1,590    1,537.25         1.81
May          1,540    1,537.25         0.00
June         1,397    1,537.25        12.80
July         1,410    1,537.25        10.53
August       1,350    1,537.25        22.81
September    1,495    1,537.25         1.16
October      1,564    1,537.25         0.47
November     1,602    1,537.25         2.73
December     1,655    1,537.25         9.02
Total       18,447   18,447.00        74.38

$f_e = \frac{18447}{12} = 1537.25$

$\chi^2_{Cal} = 74.37$
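
As a cross-check, the calculation for Demonstration Problem 12.1 can be reproduced with a short Python sketch. The variable names are illustrative, and the critical value is simply copied from the slide rather than computed. Carried at full precision the sum is about 74.38; adding the twelve rounded terms in the table gives the slide's 74.37.

```python
# Monthly milk sales in gallons, January through December (from the slide).
gallons = [1610, 1585, 1649, 1590, 1540, 1397,
           1410, 1350, 1495, 1564, 1602, 1655]

total = sum(gallons)              # 18,447
expected = total / len(gallons)   # uniform Ho: 1,537.25 gallons per month

# Sum of (f_o - f_e)^2 / f_e over the twelve months.
chi_square = sum((fo - expected) ** 2 / expected for fo in gallons)
df = len(gallons) - 1 - 0         # no parameters estimated, so c = 0

critical_value = 24.725           # table value for alpha = .01, df = 11
reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.2f}, df = {df}, reject Ho: {reject_h0}")
```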


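Demonstration Problem 12.1: Conclusion

[Figure: $\chi^2$ distribution with df = 11. The nonrejection region lies below the critical value 24.725; the upper-tail rejection region has area 0.01.]

$\chi^2_{Cal} = 74.37 > 24.725$, reject Ho.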


Bank Customer Arrival Data for Demonstration Problem 12.2

Number of Arrivals    Observed Frequencies
        0                      7
        1                     18
        2                     25
        3                     17
        4                     12
        5                      5


Hypotheses and Decision Rules for Demonstration Problem 12.2
Ho: The frequency distribution is Poisson.
Ha: The frequency distribution is not Poisson.
$\alpha = .05$
$df = k - 1 - c = 6 - 1 - 1 = 4$
$\chi^2_{.05,4} = 9.488$
If $\chi^2_{Cal} > 9.488$, reject Ho.
If $\chi^2_{Cal} \le 9.488$, do not reject Ho.


Calculations for Demonstration Problem 12.2: Estimating the Mean Arrival Rate

Number of Arrivals    Observed Frequencies
        X                      f               fX
        0                      7                0
        1                     18               18
        2                     25               50
        3                     17               51
        4                     12               48
        5                      5               25
      Total                   84              192

Mean Arrival Rate:
$\lambda = \frac{\sum fX}{\sum f} = \frac{192}{84} = 2.3$ customers per minute
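
The estimate of the mean arrival rate is a frequency-weighted average, which the following sketch reproduces. The variable names are illustrative. The unrounded ratio is about 2.29, which the slide rounds to 2.3; because this one parameter is estimated from the sample, c = 1 in the degrees of freedom.

```python
# Arrivals per interval and how often each count was observed (from the slide).
arrivals = [0, 1, 2, 3, 4, 5]
frequencies = [7, 18, 25, 17, 12, 5]

n = sum(frequencies)                                         # 84 intervals
sum_fx = sum(x * f for x, f in zip(arrivals, frequencies))   # 192 arrivals
mean_rate = sum_fx / n                                       # about 2.2857

print(f"sum f = {n}, sum fX = {sum_fx}, lambda = {mean_rate:.1f}")
```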


Calculations for Demonstration Problem 12.2: Poisson Probabilities for $\lambda = 2.3$

Number of Arrivals    Expected Probabilities    Expected Frequencies
        X                     P(X)                     nP(X)
        0                    0.1003                     8.42
        1                    0.2306                    19.37
        2                    0.2652                    22.28
        3                    0.2033                    17.08
        4                    0.1169                     9.82
        5                    0.0838                     7.04

$n = \sum f = 84$
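
The table can be regenerated from the Poisson formula $P(X) = \frac{\lambda^X e^{-\lambda}}{X!}$ using only the standard library. This is a sketch under one stated assumption: the last category is treated as the upper tail (5 or more arrivals), since that is what reproduces the listed 0.0838, whereas P(X = 5) alone is about 0.0538. The helper name `poisson_pmf` is illustrative.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of exactly x events when the mean is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 2.3   # mean arrival rate used on the slide
n = 84      # total number of observed intervals

# Probabilities for X = 0..4, then the remaining upper-tail probability
# (assumption: the final category collects 5 or more arrivals).
probs = [poisson_pmf(x, lam) for x in range(5)]
probs.append(1.0 - sum(probs))

expected = [n * p for p in probs]

for label, p, fe in zip(["0", "1", "2", "3", "4", "5+"], probs, expected):
    print(f"{label:>2}  {p:.4f}  {fe:5.2f}")
```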

$\chi^2$ Calculations for Demonstration Problem 12.2

Number of Arrivals    Observed Frequencies    Expected Frequencies
        X                      f                     nP(X)            (fo - fe)^2/fe
        0                      7                      8.42                 0.24
        1                     18                     19.37                 0.10
        2                     25                     22.28                 0.33
        3                     17                     17.08                 0.00
        4                     12                      9.82                 0.48
        5                      5                      7.04                 0.59
      Total                   84                     84.00                 1.74

$\chi^2_{Cal} = 1.74$
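
A short sketch of the same calculation, taking the expected frequencies as rounded on the slide and the critical value from the table lookup quoted earlier. The variable names are illustrative.

```python
# Observed and expected frequencies for 0..5 arrivals (values from the slide).
observed = [7, 18, 25, 17, 12, 5]
expected = [8.42, 19.37, 22.28, 17.08, 9.82, 7.04]

# One (f_o - f_e)^2 / f_e term per category, then their sum.
terms = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_square = sum(terms)

df = len(observed) - 1 - 1   # c = 1: the mean was estimated from the sample
critical_value = 9.488       # table value for alpha = .05, df = 4
reject_h0 = chi_square > critical_value

print([round(t, 2) for t in terms])
print(f"chi-square = {chi_square:.2f}, df = {df}, reject Ho: {reject_h0}")
```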


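Demonstration Problem 12.2: Conclusion

[Figure: $\chi^2$ distribution with df = 4. The nonrejection region lies below the critical value 9.488; the upper-tail rejection region has area 0.05.]

$\chi^2_{Cal} = 1.74 \le 9.488$, do not reject Ho.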


Using a $\chi^2$ Goodness-of-Fit Test to Test a Population Proportion
Ho: $P = .08$
Ha: $P \ne .08$
$\alpha = .05$
$df = k - 1 - c = 2 - 1 - 0 = 1$
$\chi^2_{.05,1} = 3.841$
If $\chi^2_{Cal} > 3.841$, reject Ho.
If $\chi^2_{Cal} \le 3.841$, do not reject Ho.


Using a $\chi^2$ Goodness-of-Fit Test to Test a Population Proportion: Calculations

               fo     fe
Defects        33     16
Nondefects    167    184
n =           200    200

Defects: $f_e = nP = (200)(.08) = 16$
Nondefects: $f_e = n(1 - P) = (200)(.92) = 184$

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(33 - 16)^2}{16} + \frac{(167 - 184)^2}{184} = 18.0625 + 1.5707 = 19.6332$
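
The two-category case can be sketched as follows, with the expected counts derived from the hypothesized proportion. The variable names are illustrative and the critical value is the table value quoted on the previous slide.

```python
n = 200          # sample size
p0 = 0.08        # hypothesized proportion of defects under Ho
defects = 33     # observed defects; the other 167 items are nondefects

observed = [defects, n - defects]
expected = [n * p0, n * (1 - p0)]   # 16 defects and 184 nondefects

chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = len(observed) - 1 - 0          # k = 2 categories, c = 0
critical_value = 3.841              # table value for alpha = .05, df = 1
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.4f}, df = {df}, reject Ho: {reject_h0}")
```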


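Using a $\chi^2$ Goodness-of-Fit Test to Test a Population Proportion: Conclusion

[Figure: $\chi^2$ distribution with df = 1. The nonrejection region lies below the critical value 3.841; the upper-tail rejection region has area 0.05.]

$\chi^2_{Cal} = 19.6332 > 3.841$, reject Ho.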


$\chi^2$ Test of Independence
Used to analyze the frequencies of two
variables with multiple categories to
determine whether the two variables
are independent.
Qualitative Variables
Nominal Data


$\chi^2$ Test of Independence: Investment Example

In which region of the country do you reside?
A. Northeast   B. Midwest   C. South   D. West
Which type of financial investment are you most likely to make today?
E. Stocks   F. Bonds   G. Treasury bills

Contingency Table

                        Type of Financial Investment
Geographic Region       E        F        G        Total
A                                         O13      nA
B                                                  nB
C                                                  nC
D                                                  nD
Total                   nE       nF       nG       N


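$\chi^2$ Test of Independence: Investment Example

If A and F are independent,
$P(A \cap F) = P(A) \cdot P(F)$

$P(A) = \frac{n_A}{N}$, $P(F) = \frac{n_F}{N}$

$P(A \cap F) = \frac{n_A}{N} \cdot \frac{n_F}{N}$

$e_{AF} = N \cdot P(A \cap F) = N \left(\frac{n_A}{N}\right)\left(\frac{n_F}{N}\right) = \frac{n_A n_F}{N}$

Contingency Table

                        Type of Financial Investment
Geographic Region       E        F        G        Total
A                                e12               nA
B                                                  nB
C                                                  nC
D                                                  nD
Total                   nE       nF       nG       N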


$\chi^2$ Test of Independence: Formulas

Expected Frequencies

$e_{ij} = \frac{(n_i)(n_j)}{N}$
where:
$i$ = the row
$j$ = the column
$n_i$ = the total of row $i$
$n_j$ = the total of column $j$
$N$ = the total of all frequencies

Calculated $\chi^2$ (Observed $\chi^2$)

$\chi^2 = \sum\sum \frac{(f_o - f_e)^2}{f_e}$
where:
$df = (r - 1)(c - 1)$
$r$ = the number of rows
$c$ = the number of columns


$\chi^2$ Test of Independence: Gasoline Preference Versus Income Category

Ho: Type of gasoline is independent of income.
Ha: Type of gasoline is not independent of income.

$\alpha = .01$
$r = 4$, $c = 3$
$df = (r - 1)(c - 1) = (4 - 1)(3 - 1) = 6$
$\chi^2_{.01,6} = 16.812$
If $\chi^2_{Cal} > 16.812$, reject Ho.
If $\chi^2_{Cal} \le 16.812$, do not reject Ho.

                              Type of Gasoline
Income                  Regular    Premium    Extra Premium
Less than $30,000
$30,000 to $49,999
$50,000 to $99,000
At least $100,000


Gasoline Preference Versus Income Category: Observed Frequencies

                              Type of Gasoline
Income                  Regular    Premium    Extra Premium    Total
Less than $30,000          85         16            6           107
$30,000 to $49,999        102         27           13           142
$50,000 to $99,000         36         22           15            73
At least $100,000          15         23           25            63
Total                     238         88           59           385


Gasoline Preference Versus Income Category: Expected Frequencies

$e_{ij} = \frac{(n_i)(n_j)}{N}$

$e_{11} = \frac{(107)(238)}{385} = 66.15$

$e_{12} = \frac{(107)(88)}{385} = 24.46$

$e_{13} = \frac{(107)(59)}{385} = 16.40$

Observed frequencies, with expected frequencies in parentheses:

                              Type of Gasoline
Income                  Regular         Premium         Extra Premium    Total
Less than $30,000        85 (66.15)     16 (24.46)       6 (16.40)        107
$30,000 to $49,999      102 (87.78)     27 (32.46)      13 (21.76)        142
$50,000 to $99,000       36 (45.13)     22 (16.69)      15 (11.19)         73
At least $100,000        15 (38.95)     23 (14.40)      25 (9.65)          63
Total                   238             88              59                385
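
The expected frequencies can be generated for every cell at once from the row and column totals. The sketch below uses plain Python lists; the variable names are illustrative.

```python
# Observed frequencies: rows are income categories,
# columns are Regular, Premium, Extra Premium (from the slide).
observed = [
    [85, 16, 6],
    [102, 27, 13],
    [36, 22, 15],
    [15, 23, 25],
]

row_totals = [sum(row) for row in observed]          # n_i
col_totals = [sum(col) for col in zip(*observed)]    # n_j
grand_total = sum(row_totals)                        # N

# e_ij = (n_i)(n_j) / N for every cell of the table.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print("  ".join(f"{e:6.2f}" for e in row))
```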


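Gasoline Preference Versus Income Category: $\chi^2$ Calculation

$\chi^2 = \sum\sum \frac{(f_o - f_e)^2}{f_e}$

$= \frac{(85 - 66.15)^2}{66.15} + \frac{(16 - 24.46)^2}{24.46} + \frac{(6 - 16.40)^2}{16.40}$

$+ \frac{(102 - 87.78)^2}{87.78} + \frac{(27 - 32.46)^2}{32.46} + \frac{(13 - 21.76)^2}{21.76}$

$+ \frac{(36 - 45.13)^2}{45.13} + \frac{(22 - 16.69)^2}{16.69} + \frac{(15 - 11.19)^2}{11.19}$

$+ \frac{(15 - 38.95)^2}{38.95} + \frac{(23 - 14.40)^2}{14.40} + \frac{(25 - 9.65)^2}{9.65}$

$= 70.78$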
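
The twelve terms can be summed programmatically as a check. This sketch recomputes the expected frequencies at full precision, which gives a statistic of about 70.73; the small gap from the slide's 70.78 presumably reflects rounding in the hand calculation and does not affect the decision. The variable names are illustrative, and the critical value is the table value quoted on the slides.

```python
# Observed frequencies: income category (rows) by gasoline type (columns).
observed = [
    [85, 16, 6],
    [102, 27, 13],
    [36, 22, 15],
    [15, 23, 25],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Accumulate (f_o - f_e)^2 / f_e over every cell, with f_e = n_i * n_j / N.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / grand_total
        chi_square += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1)
critical_value = 16.812                              # alpha = .01, df = 6
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, df = {df}, reject Ho: {reject_h0}")
```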

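Gasoline Preference Versus Income Category: Conclusion

[Figure: $\chi^2$ distribution with df = 6. The nonrejection region lies below the critical value 16.812; the upper-tail rejection region has area 0.01.]

$\chi^2_{Cal} = 70.78 > 16.812$, reject Ho.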
