
Hypothesis Testing

This document discusses hypothesis testing and key concepts such as:
- The null and alternative hypotheses, with the null being the initial assumption and the alternative being the opposite.
- Hypothesis tests use sample data to evaluate if the null should be rejected or not.
- Tests can be one-tailed if the alternative specifies a direction (< or >) or two-tailed if it does not specify direction (≠).
- Type I errors occur when the null is falsely rejected, while Type II errors are failures to reject a false null. The significance level controls the likelihood of Type I errors.

Uploaded by

Samuel Angelus

Theory of Estimation– Hypothesis Testing

Hypothesis Testing

R.Venkatesakumar
Department of Management Studies
School of Management, Pondicherry University
Puducherry, INDIA
Hypothesis Testing
Basic Concepts / Definitions
Hypothesis Testing

• Hypothesis testing can be used to determine whether a statement about the value of a population parameter should or should not be rejected.
• The null hypothesis, denoted by H0, is a tentative assumption about a population parameter.
• The alternative hypothesis, denoted by Ha (or H1), is the opposite of what is stated in the null hypothesis.
• The hypothesis testing procedure uses data from a sample to test the two competing statements indicated by H0 and Ha (or H1).
An Introduction to Hypothesis Testing

• A hypothesis is an assumption about a population parameter such as a mean or a proportion
  - Example: population mean
    The mean data use for smartphone users is μ = 1.8 gigabytes per day
  - Example: population proportion
    The proportion of cell phone users with 4G contracts is π = 0.92
Stating the Hypothesis

• Every hypothesis test has both a null hypothesis and an alternative hypothesis
• The null hypothesis (H0) represents the status quo
  - States a belief that the population parameter is ≤, =, or ≥ a specific value
  - The null hypothesis is believed to be true unless there is overwhelming evidence to the contrary
• The alternative hypothesis (H1) represents the opposite of the null hypothesis
  - Is believed to be true if the null hypothesis is found to be false
  - The alternative hypothesis always states that the population parameter is >, ≠, or < a specific value
Stating the Hypothesis

• The purpose of hypothesis statements is to draw a conclusion about an unknown population parameter
  - A hypothesis test is often performed to show that a change has occurred from the status quo

H0: the unknown parameter has not changed from the status quo
H1: there has been a change in the desired direction
Developing Null and Alternative
Hypotheses
• It is not always obvious how the null and alternative hypotheses should be formulated.
• Care must be taken to structure the hypotheses appropriately so that the test conclusion provides the information the researcher wants.
• The context of the situation is very important in determining how the hypotheses should be stated.
• In some cases it is easier to identify the alternative hypothesis first. In other cases the null is easier.
• Correct hypothesis formulation will take practice.


One-Tail or Two-Tail Testing
Stating the Hypothesis
• Stating the null and alternative hypotheses depends on the nature of the test and the motivation of the person conducting it
• This will decide whether to use one-tail or two-tail tests

Consider the following cases:

H0: μ ≤ 1.8, H1: μ > 1.8
    This test would be used by someone who thinks that data use has gone up, and wants to support that the average data use is now more than 1.8 gigabytes per day.

H0: μ ≥ 1.8, H1: μ < 1.8
    This would be used by someone who wants to test an assumption that data use has gone down (rejecting the null hypothesis would support the alternative that the average data use is less than 1.8 gigabytes per day).

H0: μ = 1.8, H1: μ ≠ 1.8
    This test would be used by someone who has no specific expectation, but wants to test the assumption that the average data use is 1.8 gigabytes per day.
Two-Tailed Test

A two-tailed test of hypothesis is one in which the alternative hypothesis does not specify departure from H0 in a particular direction and is written with the symbol “≠.”
Two-Tail Hypothesis Tests

• A two-tail hypothesis test is used whenever the alternative hypothesis is expressed as ≠

H0: μ = 1.8
H1: μ ≠ 1.8
We assume that μ = 1.8 unless the sample mean is much higher or much lower than 1.8

[Figure: sampling distribution on the x̄ scale, centered at 1.8 under H0, with a "Reject H0" region in each tail and a "Do not reject H0" region in the middle.]

Basic Idea
[Figure: sampling distribution of sample means centered at μ = 50 under H0, with sample means of 20 and 90 marked in the tails.]
It is unlikely that we would get a sample mean of this value (20 or 90) if in fact μ = 50 were the population mean; therefore, we reject the hypothesis that μ = 50.
Two-Tail Hypothesis Tests-Example
One-Tailed Test

A one-tailed test of hypothesis is one in which the alternative hypothesis is directional and includes the symbol “<” or “>.”
One-Tail Hypothesis Tests

• A one-tail hypothesis test is used when the alternative hypothesis is stated as < or >

Upper tail test:
H0: μ ≤ 1.8
H1: μ > 1.8
We assume that μ = 1.8 unless the sample mean is much higher than 1.8

Lower tail test:
H0: μ ≥ 1.8
H1: μ < 1.8
We assume that μ = 1.8 unless the sample mean is much lower than 1.8

[Figure: two sampling distributions on the x̄ scale, each centered at 1.8 under H0. Upper tail test: "Do not reject H0" on the left and "Reject H0" in the right tail. Lower tail test: "Reject H0" in the left tail and "Do not reject H0" on the right.]

Basic Idea
[Figure: sampling distribution of sample means centered at μ = 50 under H0, with a sample mean of 20 marked in the lower tail.]
It is unlikely that we would get a sample mean of this value (20) if in fact μ = 50 were the population mean; therefore, we reject the hypothesis that μ = 50.
One-Tail Hypothesis Tests-Example
The Logic of Hypothesis Testing

The Logic of Hypothesis Testing
[Textbook approach]
• Start with the assumption that the null hypothesis is true
• Then compute the appropriate test statistic [mostly an error value]
• If the error (test statistic) is larger in magnitude than the norm [table value], reject the null hypothesis
• If the error (test statistic) is smaller than the norm [table value], do not reject the null hypothesis
• The null hypothesis is tested using sample data
  - The sample result either provides enough evidence to reject the null or does not provide enough evidence to reject it
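
The textbook approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name z_test_critical and the sample figures (100 users, sample mean 1.95, σ = 0.5) are hypothetical and not from the slides, and the table value is computed from the standard normal distribution rather than looked up.

```python
from math import sqrt
from statistics import NormalDist


def z_test_critical(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample z-test decided by comparing z with the table (critical) value.

    tail is "two" (H1: mu != mu0), "upper" (H1: mu > mu0) or "lower" (H1: mu < mu0).
    Returns (z, critical value, whether H0 is rejected).
    """
    std_error = sigma / sqrt(n)             # standard error of the mean
    z = (sample_mean - mu0) / std_error     # test statistic
    std_normal = NormalDist()               # mean 0, standard deviation 1
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_crit
    elif tail == "upper":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z > z_crit
    elif tail == "lower":
        z_crit = std_normal.inv_cdf(alpha)
        reject = z < z_crit
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, z_crit, reject


# Hypothetical sample: 100 users averaging 1.95 GB/day, sigma assumed to be 0.5
z, z_crit, reject = z_test_critical(1.95, mu0=1.8, sigma=0.5, n=100)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

With these made-up numbers the test statistic (about 3) lies beyond the two-tail table value (about 1.96), so H0: μ = 1.8 would be rejected.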
The Logic of Hypothesis Testing
[Software / significance approach]
• Start with the assumption that the null hypothesis is true
• Then compute the appropriate test statistic [mostly an error value]
• If the probability (chance) of getting a value at least this extreme is smaller than the significance level α, reject the null hypothesis
• If that probability is higher than α, do not reject the null hypothesis
• The null hypothesis is tested using sample data
  - The sample result either provides enough evidence to reject the null or does not provide enough evidence to reject it
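
A minimal sketch of the significance approach, assuming a z statistic has already been computed. The helper name p_value and the value z = 2.10 are hypothetical, chosen only to illustrate the comparison of a probability with α.

```python
from statistics import NormalDist


def p_value(z, tail="two"):
    """Probability, under H0, of a z statistic at least as extreme as the one observed."""
    std_normal = NormalDist()
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "upper":
        return 1 - std_normal.cdf(z)
    if tail == "lower":
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


alpha = 0.05
z = 2.10                      # hypothetical test statistic
p = p_value(z)                # two-tail probability
print(f"p-value = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

The decision matches the table-value approach: a probability below α corresponds to a test statistic beyond the critical value.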
Type-1 Error / Type-2 Error
Confidence & Significance levels

The Difference Between Type I
and Type II Errors

• Sample evidence is not perfect due to sampling error, so a conclusion about the population can be wrong
• A Type I error occurs when the null hypothesis is rejected when it is true
  - The probability of making a Type I error is known as α, the level of significance
• A Type II error occurs when we fail to reject the null hypothesis when it is not true
  - The probability of making a Type II error is known as β
The Difference Between Type I
and Type II Errors

• Decision Rules for the Two Types of Hypothesis Errors

Possible Hypothesis Test Outcomes

                    Actual State of H0
Decision            H0 is True                  H0 is False
Reject H0           Type I Error                Correct Outcome
                    P(Type I Error) = α
Do Not Reject H0    Correct Outcome             Type II Error
                                                P(Type II Error) = β
The Difference Between Type I
and Type II Errors

• When doing a hypothesis test, decide on a value for α before selecting the sample
• Once α has been set, the value of β can be calculated for a given true value of the parameter
• For a given sample size, reducing the value of α will result in an increase in the value of β (or the opposite, α ↑ → β ↓)
• The only way to reduce both α and β simultaneously is to increase the sample size
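
The trade-off between α and β can be checked numerically. The sketch below assumes a two-tail z-test of H0: μ = 1.8 with known σ; the function name type_ii_error, the true mean of 1.9 and σ = 0.5 are hypothetical values picked for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for a two-tail z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_error = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu0 - z_crit * std_error        # acceptance region for the sample mean
    upper = mu0 + z_crit * std_error
    # Actual distribution of the sample mean when H0 is false
    sampling_dist = NormalDist(mu_true, std_error)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Hypothetical values: H0 mean 1.8, true mean 1.9, sigma 0.5
for n, alpha in [(100, 0.05), (100, 0.01), (400, 0.01)]:
    beta = type_ii_error(1.8, 1.9, 0.5, n, alpha)
    print(f"n = {n}, alpha = {alpha}: beta = {beta:.3f}")
```

Under these assumptions, lowering α from 0.05 to 0.01 at n = 100 raises β, while moving to n = 400 lets both α and β be smaller than in the first case.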

Type-1 Error / Type-2 Error

• Let us consider a 2-tail test

H0: μ = 1.8
H1: μ ≠ 1.8
We assume that μ = 1.8 unless the sample mean is much higher or much lower than 1.8

[Figure: sampling distribution on the x̄ scale, centered at 1.8 under H0, with a central "Do not reject H0" acceptance region and a "Reject H0" region in each tail.]


Type-1 Error / Type-2 Error

[Figure: the same two-tail sampling distribution centered at 1.8, with the central "Do not reject H0" acceptance region labeled with probability β and a "Reject H0" region in each tail.]

                    Actual State of H0
Decision            H0 is True                  H0 is False
Reject H0           Type I Error                Correct Outcome
                    P(Type I Error) = α
Do Not Reject H0    Correct Outcome             Type II Error
                                                P(Type II Error) = β
Type I and Type II Errors

Example-1: In a Production Process

[Table not recovered from the original slide; the marked cells represent a correct decision.]


Type I and Type II Errors

You may view:

• Type I Error: the mistake of taking action when no action is needed.
• Type II Error: the mistake of failing to take action when needed.
Type I and Type II Errors

                        True State of Nature
Conclusion              H0 True                 Ha True
Accept H0               Correct decision        Type II error
(Assume H0 True)                                (probability β)
Reject H0               Type I error            Correct decision
(Assume Ha True)        (probability α)
Type I and Type II Errors

• Type I Error: Rejecting the null hypothesis when it is true.
  - The probability of committing a Type I error is denoted by α.
• Type II Error: Accepting the null hypothesis when it is false.
  - The probability of committing a Type II error is denoted by β.
Rejection Region
The rejection region of a statistical test is
the set of possible values of the test
statistic for which the researcher will reject
H0 in favor of Ha.
Steps for Selecting the Null and
Alternative Hypotheses

1. Select the alternative hypothesis as that which the sampling experiment is intended to establish. The alternative hypothesis will assume one of three forms:
   a. One-tailed, upper-tailed (e.g., Ha: µ > 2,400)
   b. One-tailed, lower-tailed (e.g., Ha: µ < 2,400)
   c. Two-tailed (e.g., Ha: µ ≠ 2,400)
Steps for Selecting the Null and
Alternative Hypotheses

2. Select the null hypothesis as the status quo, that which will be presumed true unless the sampling experiment conclusively establishes the alternative hypothesis.
   • The null hypothesis will be specified as that parameter value closest to the alternative in one-tailed tests and as the complementary (or only unspecified) value in two-tailed tests.
     (e.g., H0: µ = 2,400)
Conditions Required for a Valid Large-Sample Hypothesis Test for µ

1. A random sample is selected from the target population.
2. The sample size n is large (i.e., n ≥ 30). (Due to the Central Limit Theorem, this condition guarantees that the test statistic will be approximately normal regardless of the shape of the underlying probability distribution of the population.)
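
The role of the Central Limit Theorem in condition 2 can be illustrated with a small simulation. This is a sketch only: the exponential population (mean 1, standard deviation 1), the number of repetitions and the fixed seed are all assumptions made for the demonstration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)        # fixed seed so the sketch is repeatable

n = 30                # sample size from the condition above
num_samples = 5000    # number of repeated samples (arbitrary choice)

# Hypothetical skewed population: exponential with mean 1 and standard deviation 1
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

std_error = 1 / sqrt(n)   # theoretical standard error of the mean
# Share of sample means within 1.96 standard errors of the population mean;
# this is close to 0.95 when the sampling distribution is approximately normal
share = sum(abs(m - 1) <= 1.96 * std_error for m in sample_means) / num_samples

print(f"mean of sample means: {mean(sample_means):.3f} (population mean is 1)")
print(f"sd of sample means:   {stdev(sample_means):.3f} (theory: {std_error:.3f})")
print(f"share within 1.96 SE: {share:.3f}")
```

Even though the population is strongly skewed, the sample means cluster around the population mean with a spread close to σ/√n.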
Possible Conclusions for a Test of
Hypothesis

1. If the calculated test statistic falls in the rejection region, reject H0 and conclude that the alternative hypothesis Ha is true.
2. State that you are rejecting H0 at the α level of significance.
3. Remember that the confidence is in the testing process, not the particular result of a single test.
Possible Conclusions for a Test of
Hypothesis

4. If the test statistic does not fall in the rejection region, conclude that the sampling experiment does not provide sufficient evidence to reject H0 at the α level of significance.
   [Generally, we will not “accept” the null hypothesis unless the probability β of a Type II error has been calculated.]
Confidence Level / (Coefficient)

The confidence coefficient is the probability that a randomly selected confidence interval encloses the population parameter; that is, the relative frequency with which similarly constructed intervals enclose the population parameter when the estimator is used repeatedly a very large number of times.

The confidence level is the confidence coefficient expressed as a percentage.
CASE1 One Sample Mean Test
Large samples – with Sigma-P is known

Hypothesis Testing of a Population Mean [σ Known]
CASE1 One Sample Mean Test Exercise1
Large samples – with Sigma-P is known

Mean =
SD-P =
N =
SE =
H0: μ =
H1: μ ≠
α = 0.05
Z =
Zcrit =

[Figure: Z-distribution]
CASE1 One Sample Mean Test Exercise2
Large samples – with Sigma-P is known

CASE1 One Sample Mean Test Exercise3
Large samples – with Sigma-P is known

CASE1 One Sample Mean Test Exercise4
Large samples – with Sigma-P is known

CASE1 One Sample Mean Test Exercise5
Large samples – with Sigma-P is known

CASE1 One Sample Mean Test Exercise6
Large samples – with Sigma-P is known

CASE2 One Sample Mean Test
Large samples – with Sigma-P is unknown

Hypothesis Testing of a Population Mean: σ unknown

CASE2 One Sample Mean Test Exercise1
Large samples – with population SD is unknown

CASE2 One Sample Mean Test Exercise2
Large samples – with population SD is unknown

CASE2 One Sample Mean Test Exercise3
Large samples – with population SD is unknown

CASE3 One Sample Proportion Test
Large samples – with population SD is known/unknown

Hypothesis Testing of a Population Proportion: σp known/unknown
CASE3 One Sample Proportion Test
Large samples – with population SD is known/unknown

Confidence Interval for the Proportion

• The Central Limit Theorem implies a normal model for the sampling distribution of p̂.
• $E(\hat{p}) = p$ and $SE(\hat{p}) = \sqrt{p(1-p)/n}$


CASE3 One Sample Proportion Test
Large samples – with population SD is known/unknown

Sampling Distribution of p̂

1. The mean of the sampling distribution of p̂ is p; that is, p̂ is an unbiased estimator of p.
2. The standard deviation of the sampling distribution of p̂ is $\sqrt{pq/n}$; that is, $\sigma_{\hat{p}} = \sqrt{pq/n}$, where q = 1 – p.
3. For large samples, the sampling distribution of p̂ is approximately normal. A sample size is considered large if both np̂ ≥ 15 and nq̂ ≥ 15.
CASE3 One Sample Proportion Test
Large samples – with population SD is known/unknown

Large-Sample Confidence Interval for p

$$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.

Note: When n is large, p̂ can approximate the value of p in the formula for $\sigma_{\hat{p}}$.
CASE3 One Sample Proportion Test
Large samples – with population SD is known/unknown

• 95% Confidence Interval for p
  - The sample statistic in 95% of samples lies within 1.96 standard errors of the population parameter.
CASE3 One Sample Proportion Test
Large samples – with population SD is known/unknown

Estimation Example Proportion

A random sample of 400 graduates showed 32 went to graduate school.

Set up a 95% confidence interval estimate for p.

$$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}, \qquad \hat{p} = \frac{32}{400} = 0.08$$

$$.08 - 1.96\sqrt{\frac{(.08)(.92)}{400}} \le p \le .08 + 1.96\sqrt{\frac{(.08)(.92)}{400}}$$

$$.053 \le p \le .107$$
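
The arithmetic of this example can be reproduced with a short Python sketch. The function name proportion_ci is hypothetical; the large-sample check uses the np̂ ≥ 15 and nq̂ ≥ 15 rule quoted earlier in the slides, and z is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion p."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Large-sample rule from the slides: both n*p_hat and n*q_hat at least 15
    if n * p_hat < 15 or n * q_hat < 15:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(32, 400)
print(f"{lower:.3f} <= p <= {upper:.3f}")   # 0.053 <= p <= 0.107
```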
CASE3 One Sample Proportion Test Exercise1
Large samples – with population SD is known/unknown

CASE3 One Sample Proportion Test Exercise2
Large samples – with population SD is known/unknown

CASE3 One Sample Proportion Test Exercise3
Large samples – with population SD is known/unknown

CASE4 One Sample Mean Test
Large samples from Finite Population – with population SD is known/unknown

Hypothesis Testing for the Mean: Finite Population, σp known/unknown

CASE4 One Sample Mean Test
Large samples from Finite Population – with population SD is known/unknown

Whenever a finite population is used in the analysis, if the sample size exceeds 5% of the population size, we use the finite population multiplier to control [normalize] the standard error, which would otherwise be the SE for a large population.
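
A minimal sketch of the adjustment, assuming the usual form of the finite population multiplier, √((N − n)/(N − 1)), which the slide does not spell out. The function name standard_error and the values σ = 12, n = 100 and the population sizes are hypothetical.

```python
from math import sqrt


def standard_error(sigma, n, population_size=None):
    """Standard error of the mean, adjusted when the sample exceeds 5% of a finite population."""
    se = sigma / sqrt(n)
    if population_size is not None and n / population_size > 0.05:
        # Finite population multiplier (assumed form): sqrt((N - n) / (N - 1))
        se *= sqrt((population_size - n) / (population_size - 1))
    return se


# Hypothetical values: sigma = 12, n = 100
print(standard_error(12, 100))                         # large population, no adjustment
print(standard_error(12, 100, population_size=1000))   # 10% sampled, multiplier applies
print(standard_error(12, 100, population_size=5000))   # 2% sampled, no adjustment
```

The multiplier is always below 1, so the adjusted standard error is smaller than the large-population value.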

CASE4 One Sample Mean Test Exercise1
Large samples from Finite Population – with population SD is known/unknown

CASE4 One Sample Mean Test Exercise2
Large samples from Finite Population – with population SD is known/unknown

CASE4 One Sample Mean Test Exercise3
Large samples from Finite Population – with population SD is known/unknown

CASE5 One Sample Mean Test
Small samples – with population SD is unknown

Using the Student’s t-distribution


Using the Student’s t-distribution

• The t-distribution is a continuous probability distribution with the following properties:
  - It is bell-shaped and symmetrical around the mean
  - The shape of the curve depends on the degrees of freedom (df), df = n – 1
  - The area under the curve is equal to 1.0
  - The t-distribution is flatter and wider than the normal distribution
  - The critical score for the t-distribution is greater than the critical z-score for the same confidence level
Using the Student’s t-distribution

• Formula for the Approximate Standard Error of the Mean

$$\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}$$

• The Student’s t-distribution is used in place of the normal probability distribution when the sample standard deviation, s, is used in place of the population standard deviation, σ
Using the Student’s t-distribution

• The t-distribution is actually a family of distributions. As the number of degrees of freedom increases, the shape of the t-distribution becomes similar to the normal distribution
  - With more than 100 degrees of freedom (a sample size of more than 101, since df = n – 1), the two distributions are practically identical
Using the Student’s t-distribution

[Figure: the normal distribution plotted against t-distributions with df = 13 and df = 5, on the t axis.]
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
Using the Student’s t-distribution
• Comparing t-scores and z-scores:

Confidence    t          t          t          z
Level         (10 df)    (20 df)    (30 df)

.80           1.372      1.325      1.310      1.28
.90           1.812      1.725      1.697      1.645
.95           2.228      2.086      2.042      1.96
.99           3.169      2.845      2.750      2.575

Note: t → z as n increases
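Using the Student’s t-distribution

• Formulas for the Confidence Interval for the Mean (σ Unknown)

$$UCL_{\bar{x}} = \bar{x} + t_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$$
$$LCL_{\bar{x}} = \bar{x} - t_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$$

where:
$\bar{x}$ = The sample mean
$t_{\alpha/2}$ = The critical t-score
$\hat{\sigma}_{\bar{x}}$ = The approximate standard error of the mean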
Example
Using the Student’s t-distribution

Example: Suppose a sample of size n = 15 has a sample mean of 5.11 and a sample standard deviation s = 0.85. Calculate a 95% confidence interval for the population mean.

1. Find the standard error of the mean: $\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.85}{\sqrt{15}} = 0.2195$
2. Find $t_{\alpha/2}$ for (15 – 1) = 14 df and 95% confidence (from Table): $t_{\alpha/2} = t_{0.025} = 2.145$
3. Calculate the interval endpoints: (next slide)


Example
Using the Student’s t-distribution

Example: (continued) Suppose a sample of size n = 15 has a sample mean of 5.11 and a sample standard deviation s = 0.85. Calculate a 95% confidence interval for the population mean.

3. Calculate the interval endpoints:

$$UCL_{\bar{x}} = \bar{x} + t_{\alpha/2}\,\hat{\sigma}_{\bar{x}} = 5.11 + (2.145)(0.2195) = 5.58$$
$$LCL_{\bar{x}} = \bar{x} - t_{\alpha/2}\,\hat{\sigma}_{\bar{x}} = 5.11 - (2.145)(0.2195) = 4.64$$

• Based on our sample mean of 5.11, we are 95% confident that the population mean is between 4.64 and 5.58
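
The three steps of this example can be checked with a few lines of Python. The critical value 2.145 is simply the table value quoted in the example (the Python standard library has no t-distribution), and the variable names are illustrative.

```python
from math import sqrt

n = 15
sample_mean = 5.11
s = 0.85
t_crit = 2.145    # t for 14 df and 95% confidence, the table value quoted in the example

std_error = s / sqrt(n)             # step 1: approximate standard error of the mean
margin = t_crit * std_error         # step 2: critical t-score times the standard error
lcl = sample_mean - margin          # step 3: lower confidence limit
ucl = sample_mean + margin          #         upper confidence limit

print(f"SE = {std_error:.4f}")                # SE = 0.2195
print(f"95% CI: {lcl:.2f} to {ucl:.2f}")      # 95% CI: 4.64 to 5.58
```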
CASE5 One Sample Mean Test Exercise1
Small samples – with population SD is unknown

CASE5 One Sample Mean Test Exercise2
Small samples – with population SD is unknown

Thank You
