Hypothesis Testing
Level of significance:
The level of significance (LOS) is the maximum probability of
rejecting the null hypothesis when it is true.
It is usually expressed as a percentage and is denoted by α.
For example: a 5% LOS implies that there are about
5 chances out of 100 of rejecting the null hypothesis
when it is true, OR we are about 95% confident
that we will make a correct decision.
Critical Region / Rejection Region:
The region of the sample space in which we reject the null
hypothesis whenever the computed value of the test statistic
falls in it is called the critical or rejection region.
Critical Value:
The critical value is the value of the test statistic that separates
the critical region from the acceptance region. It lies
at the boundary between the regions of acceptance &
rejection.
[Figure: diagrams marking the acceptance region]
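The link between the level of significance and the critical value can be sketched in a few lines. This is only an illustration for the Z (standard normal) case; the function name `z_critical` is made up for this sketch, and it relies on Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive critical value of Z for significance level alpha."""
    # A two-tail test splits alpha equally between the two rejection regions.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.01, 0.05):
    print(f"LOS {alpha:.0%}: two tail = {z_critical(alpha):.2f}, "
          f"one tail = {z_critical(alpha, two_tailed=False):.2f}")
```

Rounded to two decimals, these are the familiar table values 2.58 / 2.33 (1% LOS) and 1.96 / 1.64 (5% LOS).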
No.  Measure              Population   Sample
1    Size                 N            n
2    Mean                 $\mu$        $\bar{X}$
3    Standard Deviation   $\sigma$     S
The following are the steps involved in a test of
significance.
STEP 1: Setting up the hypotheses:
This step involves setting up the null and alternate
hypotheses. For example, to test whether the population mean
is 50, the null & alternate hypotheses may
be formulated as:
$H_0: \mu = 50$ & $H_1: \mu \neq 50$
Test of Significance of Large Sample
A sample is regarded as large only if its size
exceeds 30.
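Step 1: Set up Ho & H1 as follows ($\mu_0$ is the hypothesised population mean):
$H_0: \mu = \mu_0$ & $H_1: \mu \neq \mu_0$ (Two Tail Test)
$H_0: \mu = \mu_0$ & $H_1: \mu > \mu_0$ (Right Tail Test)
$H_0: \mu = \mu_0$ & $H_1: \mu < \mu_0$ (Left Tail Test)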
Step 2: The appropriate test statistic is the Z
test, used when the sample size exceeds 30 or the
population SD is known.
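Step 3: Calculate the Standard Error of the mean as follows:
$SE = \frac{\sigma}{\sqrt{n}}$ (if population SD is known)
Step 4: Calculate the value of Z:
$Z = \frac{\bar{X} - \mu}{SE}$
Step 5: Look for the critical value of Z at the given level of significance:
LOS   Two Tail Test   One Tail Test
1%    2.58            2.33
5%    1.96            1.64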
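STEP 6: Interpretation:
Specify the decision as follows:
If the critical value (table value) is
greater than the computed value (taken in absolute terms), then we
accept the null hypothesis (Ho).
If the critical value (table value) is
less than the computed value, then we
reject the null hypothesis (Ho).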
Illustration 1
A company manufacturing automobile tyres, finds that
tyre life is normally distributed with a mean of 40,000
km and standard deviation of 3,000 km. It is believed
that a change in the production process will result in a
better product and the company has developed a new
tyre. A sample of 100 new tyres has been selected. The
company has found that the mean life of these new
tyres is 40,900 km. Can it be concluded that the new
tyre is significantly better than the old one, using the
significance level of 0.01?
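A minimal sketch of the large-sample Z test applied to Illustration 1. The function name `one_sample_z_test` and its `tail` argument are hypothetical helpers for this sketch; the test is taken to be right-tailed because the question asks whether the new tyre is better.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, alpha, tail="two"):
    """Return (z, critical value, reject H0?) for a large-sample Z test."""
    se = pop_sd / sqrt(n)                 # standard error of the mean
    z = (sample_mean - pop_mean) / se     # computed value of Z
    # Critical value: a two-tail test splits alpha between both tails.
    tail_area = alpha / 2 if tail == "two" else alpha
    critical = NormalDist().inv_cdf(1 - tail_area)
    # Decision: compare in the direction of the alternate hypothesis.
    if tail == "two":
        reject = abs(z) > critical
    elif tail == "right":
        reject = z > critical
    else:  # "left"
        reject = z < -critical
    return z, critical, reject

# Illustration 1: H0: mu = 40,000 against H1: mu > 40,000, LOS = 0.01
z, critical, reject = one_sample_z_test(40_900, 40_000, 3_000, 100, 0.01, tail="right")
print(f"Z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

Here SE = 3,000 / 10 = 300 and Z = 900 / 300 = 3, which exceeds the one-tail table value of 2.33, so Ho is rejected and the new tyre appears significantly better.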
Illustration 2
An insurance agent has claimed that the
average age of policyholders who insure
through him is less than the average for all
other agents, which is 35 years. A random
sample of 40 policy holders who have
insured through him gave an average of 32
years with a standard error of 2 years.
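Standard error of the difference between two means:
$SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ (i) if $\sigma_1$ & $\sigma_2$ are known
$SE = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$ (ii) if $\sigma_1$ & $\sigma_2$ are unknown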
Step 4: Calculation of value of Z
$Z = \frac{\bar{X}_1 - \bar{X}_2}{SE}$
Step 5 : Look for critical value of z at a given level
of significance (5% or 1%) from the normal
distribution table for two tail.
LOS 1% 5%
Two Tail Test 2.58 1.96
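STEP 6: Interpretation:
Specify the decision as follows:
If the critical value (table value) is greater than the
computed value (taken in absolute terms), then we accept the null hypothesis (Ho).
If the critical value (table value) is less than the
computed value, then we reject the null hypothesis (Ho).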
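The two-sample steps above can be sketched as follows. The function name `two_sample_z_test` is hypothetical, and the figures passed to it are invented purely to exercise the code; they do not come from these slides.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean1, mean2, sd1, sd2, n1, n2, alpha=0.05):
    """Two-tail Z test for the difference between two large-sample means."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)        # SE of the difference
    z = (mean1 - mean2) / se                        # Step 4
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # Step 5 (two tail)
    return z, critical, abs(z) > critical           # Step 6

# Hypothetical figures: means 52 and 50, SDs 4 and 3, sample sizes 64 and 36.
z, critical, reject = two_sample_z_test(52, 50, 4, 3, 64, 36)
print(f"Z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

With these made-up numbers SE = sqrt(0.25 + 0.25), so Z is about 2.83, which is larger than 1.96 and leads to rejecting Ho at the 5% level.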
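$SE = \sqrt{\frac{n S^2}{n - 1}}$ ; S = Sample SD
(SE here is the estimate of the population SD obtained from the sample.)
Step 4: Calculation of value of t as follows:
$t = \frac{\bar{x} - \mu}{SE / \sqrt{n}}$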
Step 5: Calculate the degrees of freedom:
v = n - 1
Step 6 : Look for critical value of t at a given level
of significance (5% or 1%) and calculated
degree of freedom
STEP 7: Interpretation:
Specify the decision as follows:
If the critical value (table value) is greater than
computed value then we accept null hypothesis
(Ho).
If the critical value (table value) is less than
computed value then we reject null hypothesis
(Ho).
Q1) A soap manufacturing company was distributing a particular
brand through a large number of retail shops. Before a
heavy advertising campaign, the mean sales per week per shop
was 140 dozens. After the campaign, a sample of 26 shops was
taken & the mean sales was found to be 147 dozens with SD 16.
Can the advertisement be considered effective? (2001 2002)
Q2) Six boys are selected at random from a school and their marks in
mathematics are found to be 63, 63, 64, 66, 60 and 68 out of 100. In the
light of these marks, discuss the general observation that the mean
marks in mathematics in the school were 66.
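A possible working of Q1 and Q2 with the small-sample t test. Several things here are assumptions, not stated in the questions: a 5% level of significance, a right-tail test for Q1 and a two-tail test for Q2, and that the SD of 16 in Q1 was computed with divisor n. The helper name `one_sample_t` is hypothetical; the table values are the usual t-table entries.

```python
from math import sqrt
from statistics import mean, pstdev

def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """Return (t, degrees of freedom); sample_sd is the SD with divisor n."""
    est_sd = sqrt(n * sample_sd ** 2 / (n - 1))          # estimate of population SD
    t = (sample_mean - pop_mean) / (est_sd / sqrt(n))    # computed value of t
    return t, n - 1

# Q1: H0: mu = 140 against H1: mu > 140 (assumed right tail, 5% LOS)
t1, df1 = one_sample_t(147, 140, 16, 26)
T_TABLE_Q1 = 1.708   # t table: 5% one tail, 25 d.o.f.
print(f"Q1: t = {t1:.2f}, d.o.f. = {df1}, reject H0: {t1 > T_TABLE_Q1}")

# Q2: H0: mu = 66 against H1: mu != 66 (assumed two tail, 5% LOS)
marks = [63, 63, 64, 66, 60, 68]
t2, df2 = one_sample_t(mean(marks), 66, pstdev(marks), len(marks))
T_TABLE_Q2 = 2.571   # t table: 5% two tail, 5 d.o.f.
print(f"Q2: t = {t2:.2f}, d.o.f. = {df2}, reject H0: {abs(t2) > T_TABLE_Q2}")
```

Under these assumptions Q1 gives t = 7 / 3.2, above the table value, so the campaign looks effective; Q2 gives |t| of about 1.78, below the table value, so the sample is consistent with a school mean of 66.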
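$t = \frac{\bar{X}_1 - \bar{X}_2}{SE \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
(SE here is the pooled standard deviation of the two samples.)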
Step 5: Calculate the degrees of freedom:
v = n1 + n2 - 2
Step 6 : Look for critical value of t at a given level
of significance (5% or 1%)
Step 7: Interpretation:
Specify the decision as follows:
If the critical value (table value) is greater than
computed value then we accept null hypothesis
(Ho).
If the critical value (table value) is less than
computed value then we reject null hypothesis
(Ho).
ILLUSTRATIONS
Q1) Two groups of students appeared in a test
examination and the marks obtained by them were as
follows:
Ist Group 18 20 36 50 49 36 34 49 41
IInd Group 29 28 26 35 30 44 46 - -
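A sketch of the two-sample t test on the marks above. The question text does not say what is to be tested, so this assumes the aim is to check whether the two group means differ significantly at the 5% level (two tail). The function name `two_sample_t` is hypothetical.

```python
from math import sqrt
from statistics import mean

def two_sample_t(sample1, sample2):
    """Return (t, degrees of freedom) using the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    # Pooled SD: combined squared deviations divided by n1 + n2 - 2.
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_sd = sqrt((ss1 + ss2) / df)
    t = (m1 - m2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, df

group1 = [18, 20, 36, 50, 49, 36, 34, 49, 41]
group2 = [29, 28, 26, 35, 30, 44, 46]
t, df = two_sample_t(group1, group2)
T_TABLE = 2.145   # t table: 5% two tail, 14 d.o.f.
print(f"t = {t:.2f}, d.o.f. = {df}, reject H0: {abs(t) > T_TABLE}")
```

The group means are 37 and 34, and the computed t (about 0.57) is well below the table value, so under these assumptions the difference is not significant.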
$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$
where fo = observed frequency and fe = expected frequency
Properties:
It is a non-parametric test.
It is a continuous probability distribution.
It is not symmetrical. It is skewed to the right.
It has only one parameter, i.e. the degrees of freedom.
Its variance is 2 times the d.o.f. (Variance = 2 d.o.f.)
Conditions for application of the $\chi^2$-test:
Random Samples
Independent Observations
At least 50 Observations.
Test of Independence:
Chi square test is used to examine the association
or independence between two sets of attributes.
For example: to test whether there is any association
between the level of intelligence of fathers & sons or
whether both are independent.
Test of goodness of fit:
Chi square test is also used to determine whether
actual or observed frequencies correspond to any
specified theoretical frequency distribution such
as the Binomial, Poisson & Normal Distributions.
Test of homogeneity
In this test it is determined whether two or more
independent random samples have been drawn
from the same population or not.
Steps Involved In Testing Independence of Attributes:
Step 1 : Set up the hypothesis as follows:
H0 : No association exists between the attributes.
H1 : An association exists between the attributes.
Step 2 : Calculate expected frequencies (fe):
$(f_e)_{ij} = \frac{R_i \times C_j}{n}$
where $R_i$ is the ith row total,
$C_j$ is the jth column total,
n is the sample size
Step 3 : Calculate the value of Chi Square ($\chi^2$):
$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$
Worksheet columns:
fo    fe    fo - fe    (fo - fe)^2    (fo - fe)^2 / fe
Total $\chi^2$ =
Step 4: Calculate the degrees of freedom:
v = (r - 1)(c - 1), where r = number of rows and c = number of columns
Step 5 : Look for the critical value of $\chi^2$ at a given level
of significance (5% or 1%)
STEP 6: Interpretation:
Specify the decision as follows:
If the critical value (table value) is greater than
computed value then we accept null hypothesis
(Ho).
If the critical value (table value) is less than
computed value then we reject null hypothesis
(Ho).
Steps Involved In Testing
Goodness Of Fit:
Step 1 : Set up the hypothesis as follows:
$H_0: f_o = f_e$
$H_1: f_o \neq f_e$
Step 2 : Calculate expected frequencies (fe):
Calculate expected frequencies using an appropriate
theoretical distribution such as:
Binomial: Expected Frequency $= N \cdot P(r) = N \cdot {}^{n}C_{r} \, p^{r} q^{n-r}$
Poisson: Expected Frequency $= N \cdot P(x) = N \cdot \frac{e^{-m} m^{x}}{x!}$
Normal: Expected Frequency $= \frac{\text{Sum of Frequencies}}{\text{No. of Observations}}$
Step 3 : Calculate the value of Chi Square ($\chi^2$):
$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$
Worksheet columns:
fo    fe    fo - fe    (fo - fe)^2    (fo - fe)^2 / fe
Total $\chi^2$ =
Step 4: Calculate the degrees of freedom:
v = n - 1, where n is the number of classes
Step 5 : Look for the critical value of $\chi^2$ at a given level of
significance (5% or 1%).
STEP 6: Interpretation: Specify the decision as follows:
If the critical value (table value) is greater than
computed value then we accept null hypothesis (Ho).
If the critical value (table value) is less than computed
value then we reject null hypothesis (Ho).
Q1) 100 students of management institute obtained
the following grades in the statistics paper.
Grade A B C D E Total
Frequency 15 17 30 22 16 100
Type of labour        Firm
                 A     B     C     D
Skilled         24    24    23    49
Semi Skilled    32    60    37    51
Manual          24    56    40    80
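The steps for testing independence of attributes can be sketched on the table above. The question wording is not shown here, so this assumes the task is to test whether type of labour is associated with the firm, at an assumed 5% level of significance. The function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n   # Step 2: expected frequency
            chi_sq += (fo - fe) ** 2 / fe            # Step 3: add (fo - fe)^2 / fe
    df = (len(table) - 1) * (len(table[0]) - 1)      # Step 4: (r - 1)(c - 1)
    return chi_sq, df

observed = [
    [24, 24, 23, 49],   # Skilled
    [32, 60, 37, 51],   # Semi Skilled
    [24, 56, 40, 80],   # Manual
]
chi_sq, df = chi_square_independence(observed)
CHI_TABLE = 12.592   # chi-square table: 5% LOS, 6 d.o.f.
print(f"chi-square = {chi_sq:.3f}, d.o.f. = {df}, reject H0: {chi_sq > CHI_TABLE}")
```

The computed value (about 12.80) is only slightly above the table value, so the decision to reject Ho here is sensitive to the chosen level of significance.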
Question 4
Five coins were tossed 3200 times and the
number of heads appearing each time is
noted as shown below:
No. of heads 0 1 2 3 4 5
Frequency 80 570 1100 900 500 50
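A goodness-of-fit sketch for Question 4. The question does not state what is to be tested, so this assumes the hypothesis is that the coins are unbiased (binomial with n = 5, p = 0.5) and uses an assumed 5% level of significance. The helper name `chi_square_gof` is hypothetical.

```python
from math import comb

def chi_square_gof(observed, expected):
    """Sum of (fo - fe)^2 / fe over all classes."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [80, 570, 1100, 900, 500, 50]      # frequencies for 0..5 heads
N, n, p = 3200, 5, 0.5                        # tosses, coins, assumed P(head)
# Binomial expected frequency: N * nCr * p^r * q^(n - r)
expected = [N * comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]
chi_sq = chi_square_gof(observed, expected)
df = len(observed) - 1
CHI_TABLE = 11.070   # chi-square table: 5% LOS, 5 d.o.f.
print(f"expected = {expected}")
print(f"chi-square = {chi_sq:.1f}, d.o.f. = {df}, reject H0: {chi_sq > CHI_TABLE}")
```

The expected frequencies come out as 100, 500, 1000, 1000, 500, 100, and the computed chi-square of 58.8 is far above the table value, so under these assumptions the coins do not appear unbiased.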
Question 5
The following table shows the number of
road accidents in a city, that occurred
during various days of a week:
Days Sun Mon Tue Wed Thu Fri Sat
No. of accidents 14 16 8 12 11 9 14
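A short sketch for Question 5, assuming the intended hypothesis is that accidents are spread uniformly over the days of the week, tested at an assumed 5% level of significance (neither is stated in the question as shown).

```python
accidents = {"Sun": 14, "Mon": 16, "Tue": 8, "Wed": 12, "Thu": 11, "Fri": 9, "Sat": 14}

observed = list(accidents.values())
# Under a uniform distribution every day has the same expected frequency.
fe = sum(observed) / len(observed)
chi_sq = sum((fo - fe) ** 2 / fe for fo in observed)
df = len(observed) - 1
CHI_TABLE = 12.592   # chi-square table: 5% LOS, 6 d.o.f.
print(f"fe = {fe}, chi-square = {chi_sq:.2f}, d.o.f. = {df}, "
      f"reject H0: {chi_sq > CHI_TABLE}")
```

With 84 accidents over 7 days the expected frequency is 12 per day, and chi-square = 50 / 12, which is below the table value, so the data are consistent with a uniform spread.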
Question 6
By using Chi square test, find out
whether there is any association
between income level and type of
schooling:
Income    Public School    Govt. School
F - Test
The Analysis of Variance or F test is a technique
developed by R. A. Fisher to test the significance of the
difference among more than two sample means and to
make inferences about whether such samples are
drawn from populations having the same mean.
The F test is based on the ratio of variances rather than the
difference between them.
The F statistic is obtained by taking the ratio of unbiased
estimates of the population variances as follows:
$F = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} = \frac{n_1 S_1^2 / (n_1 - 1)}{n_2 S_2^2 / (n_2 - 1)}$
n1 = size of the sample from the first population.
n2 = size of the sample from the second population.
S1 = standard deviation of the sample from the first population.
S2 = standard deviation of the sample from the second population.
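The variance ratio can be sketched as below. The function name `f_ratio` and the two samples are hypothetical, chosen only so that the arithmetic is easy to follow; the sample SDs are computed with divisor n, as the formula above expects.

```python
from statistics import pstdev

def f_ratio(sample1, sample2):
    """Ratio of the unbiased variance estimates n*S^2/(n-1) of two samples."""
    n1, n2 = len(sample1), len(sample2)
    s1, s2 = pstdev(sample1), pstdev(sample2)   # sample SDs with divisor n
    var1 = n1 * s1 ** 2 / (n1 - 1)
    var2 = n2 * s2 ** 2 / (n2 - 1)
    return var1 / var2

# Hypothetical samples, not taken from the slides.
first = [2, 4, 6, 8, 10]
second = [1, 2, 3, 4, 5]
F = f_ratio(first, second)
print(f"F = {F:.2f}")
```

For these samples the unbiased variance estimates are 10 and 2.5, giving F = 4.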
Assumptions:
Samples are randomly drawn from the
population
CLASSIFICATION MODEL:
(a) One Way Model
(b) Two Way Model
[Figure: an experiment on sales classified by one factor (salesman) or by two factors (salesman and season)]
Practical Steps Involved In Preparation Of ANOVA
Table For One Factor Analysis Of Variance
STEP 1:
We set the null hypothesis and alternate hypothesis as:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \dots = \mu_n$
H1 : At least two means are not equal
STEP 2:
Calculate the sum of observations of each sample, i.e.
$\sum X_1, \sum X_2, \sum X_3, \dots, \sum X_n$.
Now square the observations and obtain their total
for each sample, i.e. $\sum X_1^2, \sum X_2^2, \sum X_3^2, \dots, \sum X_n^2$.
STEP 3:
Calculate the Correction Factor ($T^2/N$) as follows:
$\frac{T^2}{N} = \frac{(\text{Sum of all observations of all samples})^2}{\text{Total no. of observations of all the samples}}$
$\frac{T^2}{N} = \frac{(\sum X_1 + \sum X_2 + \sum X_3 + \dots + \sum X_n)^2}{N}$
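STEP 4:
Calculate the Sum of Squares between samples (SSB) as
follows:
$SSB = \frac{(\text{Sum of observations of sample 1})^2}{\text{No. of items in the column}} + \dots - \text{C.F.}$
$SSB = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \frac{(\sum X_3)^2}{n_3} + \dots + \frac{(\sum X_n)^2}{n_n} - \frac{T^2}{N}$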
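STEP 5:
Calculate the Total Sum of Squares (SST) as follows:
SST = Sum of squares of all observations - C.F.
The ANOVA table is then prepared as follows (SSW = SST - SSB):
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square           F Ratio
Between Samples       SSB              c - 1                MSB = SSB / (c - 1)   F = MSB / MSW
Within Samples        SSW              N - c                MSW = SSW / (N - c)
Total                 SST              N - 1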
Step 8:
Look up the critical value of F at the given level of
significance for (c - 1, N - c) degrees of freedom.
Step 9:
Interpretation:
Compare the computed value of F with the table
value of F for the given level of significance and
interpret the same as follows:
Case (a) :
If Critical Value > Calculated Value; Accept null
hypothesis
Case (b) :
If Critical Value < Calculated Value; Reject null
hypothesis
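The one-factor ANOVA steps can be sketched as a small function. The name `one_way_anova` and the three samples are hypothetical (the data are invented for easy arithmetic), and the 5% table value is the standard F-table entry for (2, 6) degrees of freedom.

```python
def one_way_anova(samples):
    """Return the one-way ANOVA quantities for a list of samples."""
    c = len(samples)                              # number of samples (columns)
    N = sum(len(s) for s in samples)              # total number of observations
    T = sum(sum(s) for s in samples)              # grand total
    cf = T ** 2 / N                               # correction factor T^2 / N
    ssb = sum(sum(s) ** 2 / len(s) for s in samples) - cf   # between samples
    sst = sum(x ** 2 for s in samples for x in s) - cf      # total
    ssw = sst - ssb                               # within samples
    msb = ssb / (c - 1)
    msw = ssw / (N - c)
    return {"SSB": ssb, "SSW": ssw, "SST": sst,
            "MSB": msb, "MSW": msw, "F": msb / msw, "df": (c - 1, N - c)}

# Hypothetical data: three samples of three observations each.
samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
result = one_way_anova(samples)
F_TABLE = 5.14   # F table: 5% LOS, (2, 6) d.o.f.
print(result)
print("reject H0:", result["F"] > F_TABLE)
```

For this made-up data SSB = 6, SSW = 6 and SST = 12, so F = 3 / 1 = 3, which is below the table value and Ho is accepted.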
Practical Steps Involved In Preparation Of
ANOVA Table For Two Way Model
STEP 1:
We set the null hypothesis and alternate hypothesis as:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \dots = \mu_n$
H1 : At least two means are not equal
STEP 2:
Calculate the sum of observations of each row and
each column.
Now square the observations and obtain their total
for each sample.
STEP 3:
Calculate the Correction Factor ($T^2/N$) as follows:
$\frac{T^2}{N} = \frac{(\text{Sum of all observations of all samples})^2}{\text{Total no. of observations of all the samples}}$
$\frac{T^2}{N} = \frac{(\sum X_1 + \sum X_2 + \sum X_3 + \dots + \sum X_n)^2}{N}$
STEP 4:
Calculate the Sum of Squares between columns (SSC) as:
$SSC = \frac{(\text{Total of column 1})^2}{\text{No. of items in the column}} + \dots - \text{C.F.}$
$SSC = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \frac{(\sum X_3)^2}{n_3} + \dots + \frac{(\sum X_n)^2}{n_n} - \frac{T^2}{N}$
STEP 5:
Calculate the Sum of Squares between rows (SSR) as:
$SSR = \frac{(\text{Total of row 1})^2}{\text{No. of items in the row}} + \dots - \text{C.F.}$
$SSR = \frac{(\sum Y_1)^2}{r_1} + \frac{(\sum Y_2)^2}{r_2} + \frac{(\sum Y_3)^2}{r_3} + \dots + \frac{(\sum Y_n)^2}{r_n} - \frac{T^2}{N}$
STEP 6:
Calculate the Total Sum of Squares (SST) as
follows:
SST = Sum of squares of all observations - C.F.
STEP 7:
Calculate the Sum of Squares for the Residual
Error (SSE) as:
SSE = SST - (SSC + SSR)
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                    F Ratio
Between Columns       SSC              c - 1                MSC = SSC / (c - 1)            F1 = MSC / MSE
Between Rows          SSR              r - 1                MSR = SSR / (r - 1)            F2 = MSR / MSE
Residual Error        SSE              (c - 1)(r - 1)       MSE = SSE / ((r - 1)(c - 1))
Total                 SST              rc - 1
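Step 9:
Interpretation:
Compare the computed value of F with the table
value of F for the given level of significance and
interpret the same as follows:
Case (a) :
If Critical Value > Calculated Value; Accept null
hypothesis
Case (b) :
If Critical Value < Calculated Value; Reject null
hypothesis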
Illustration
A tea company appoints four salesmen A, B, C
& D and observes their sales performance in
three seasons of the year, viz. Summer,
Monsoon & Winter. The figures of sales, in lakhs
of Rs, are given in the following table. Carry
out an analysis of variance and interpret.
Season            Salesman
             A     B     C     D
Summer      56    56    41    55
Monsoon     46    48    49    49
Winter      48    49    51    52
Total      150   153   141   156
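A sketch of the two-way ANOVA computation for the tea company table, with one observation per cell. The function name `two_way_anova` is hypothetical, and a 5% level of significance is assumed since the illustration does not specify one; the table values are the standard F-table entries.

```python
def two_way_anova(table):
    """Two-way ANOVA (one observation per cell) for a list of rows."""
    r, c = len(table), len(table[0])
    N = r * c
    T = sum(sum(row) for row in table)
    cf = T ** 2 / N                                             # correction factor
    ssc = sum(sum(col) ** 2 / r for col in zip(*table)) - cf    # between columns
    ssr = sum(sum(row) ** 2 / c for row in table) - cf          # between rows
    sst = sum(x ** 2 for row in table for x in row) - cf        # total
    sse = sst - (ssc + ssr)                                     # residual error
    msc, msr = ssc / (c - 1), ssr / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {"SSC": ssc, "SSR": ssr, "SSE": sse, "SST": sst,
            "F1": msc / mse, "F2": msr / mse}

sales = [
    [56, 56, 41, 55],   # Summer
    [46, 48, 49, 49],   # Monsoon
    [48, 49, 51, 52],   # Winter
]
res = two_way_anova(sales)
F_TABLE_COLUMNS = 4.76   # F table: 5% LOS, (3, 6) d.o.f.
F_TABLE_ROWS = 5.14      # F table: 5% LOS, (2, 6) d.o.f.
print(res)
print("salesmen differ:", res["F1"] > F_TABLE_COLUMNS)
print("seasons differ:", res["F2"] > F_TABLE_ROWS)
```

This gives SSC = 42, SSR = 32, SSE = 136 and SST = 210, so F1 and F2 are both below 1 and neither the salesmen nor the seasons show a significant difference under these assumptions.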
Illustrations
Q1) Three varieties A, B, and C of crops are tested
in a randomised block design with four
replications. The plot yields in kgs. are given in the
following table. Analyse the experimental yield
and state your conclusions at the 5% level of
significance.
Illustration
The probability that a boy will solve the
problem is 3/4 & that a girl will solve the
problem is 4/5. Find the probability that
Illustration
A bag contains 3 black, 4 white and 5 red balls. One
ball is drawn at random. Find the probability that :
a) it is either a black ball or non white ball
b) it is either a white ball or non red ball
c) it is either a red ball or non black ball
Illustration
A die is thrown. What is the probability of getting:
a) a multiple of 2 or 3
b) a multiple of 2 or 4
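The bag and die illustrations above can be checked by simply counting favourable outcomes. This is a sketch assuming every ball and every face is equally likely; the helper name `probability` is hypothetical.

```python
from fractions import Fraction

def probability(outcomes, event):
    """Favourable outcomes divided by total outcomes (all equally likely)."""
    outcomes = list(outcomes)
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

# Bag: 3 black, 4 white and 5 red balls.
bag = ["black"] * 3 + ["white"] * 4 + ["red"] * 5
p_black_or_non_white = probability(bag, lambda b: b == "black" or b != "white")
p_white_or_non_red = probability(bag, lambda b: b == "white" or b != "red")
p_red_or_non_black = probability(bag, lambda b: b == "red" or b != "black")

# Die: faces 1 to 6.
die = range(1, 7)
p_mult_2_or_3 = probability(die, lambda x: x % 2 == 0 or x % 3 == 0)
p_mult_2_or_4 = probability(die, lambda x: x % 2 == 0 or x % 4 == 0)

print(p_black_or_non_white, p_white_or_non_red, p_red_or_non_black)
print(p_mult_2_or_3, p_mult_2_or_4)
```

Counting this way avoids double counting overlapping events: for example, every black ball is already a non-white ball, so part (a) is just 8 out of 12.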
Illustration
X can solve 80 percent of the problems given in a book and
Y can solve 60 percent. What is the probability that:
a) both will solve a problem
b) none will be able to solve a problem
c) the problem will be solved
d) at least one of them will not be able to solve the problem
e) only one of them will solve the problem.
Illustration
X can solve 3 problems out of 5, Y can solve 2 out of 5
and Z can solve 3 out of 4. What is the probability that:
a) the problem will be solved
b) only two of them will solve a problem
c) at least two of them will solve a problem
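The two problem-solving illustrations above can be worked by listing every solved / not solved combination. The sketch assumes the solvers work independently, which the questions imply but do not state; the helper names `outcome_probabilities` and `prob_of` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def outcome_probabilities(solve_probs):
    """Yield (pattern, probability) for each solved/unsolved combination,
    assuming the solvers are independent."""
    for pattern in product([True, False], repeat=len(solve_probs)):
        prob = Fraction(1)
        for solved, p in zip(pattern, solve_probs):
            prob *= p if solved else 1 - p
        yield pattern, prob

def prob_of(solve_probs, condition):
    """Total probability of patterns whose number of solvers meets the condition."""
    return sum(prob for pattern, prob in outcome_probabilities(solve_probs)
               if condition(sum(pattern)))

xy = [Fraction(4, 5), Fraction(3, 5)]                  # X: 80%, Y: 60%
both = prob_of(xy, lambda k: k == 2)
none = prob_of(xy, lambda k: k == 0)
solved = prob_of(xy, lambda k: k >= 1)
at_least_one_fails = prob_of(xy, lambda k: k <= 1)
only_one = prob_of(xy, lambda k: k == 1)
print(both, none, solved, at_least_one_fails, only_one)

xyz = [Fraction(3, 5), Fraction(2, 5), Fraction(3, 4)]  # X, Y, Z
solved_3 = prob_of(xyz, lambda k: k >= 1)
only_two = prob_of(xyz, lambda k: k == 2)
at_least_two = prob_of(xyz, lambda k: k >= 2)
print(solved_3, only_two, at_least_two)
```

Using exact fractions keeps the answers in the same form as a hand calculation, e.g. 12/25 for both X and Y solving the problem.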