Research Process
[Figure: research process flowchart; the only legible step is "V. Collect data"]
Hypothesis Testing
(a) Null hypothesis and alternative hypothesis:
+ In the context of statistical analysis, we often talk about null
hypothesis and alternative hypothesis.
* According to Fisher, a hypothesis which is tested for plausible
rejection under the assumption that it is true is called the 'Null
Hypothesis'.
* If we are to compare drug A with drug B about its efficacy and if
we proceed on the assumption that both drugs are equally
efficacious, then this assumption is termed as the null hypothesis.
+ Any other rival hypothesis is called the 'Alternative Hypothesis'.
* The null hypothesis is generally symbolized as H0 and the
alternative hypothesis as Ha.
* The null hypothesis and the alternative hypothesis are chosen
before the sample is drawn.
+ Alternative hypothesis is usually the one which one wishes to
prove and the null hypothesis is the one which one wishes to
disprove.
* Thus, a null hypothesis represents the hypothesis we are trying to
reject, and the alternative hypothesis represents all other possibilities.
(b) Type I error and Type II error:
+ In the context of testing of hypotheses, there are basically two
types of errors we can make.
+ We may reject H0 when H0 is true, and we may accept H0 when in
fact H0 is not true. The former is known as a Type I error and the
latter as a Type II error.
+ Type I error means rejection of hypothesis which should have been
accepted and Type II error means accepting the hypothesis which
should have been rejected.
* The probability of a Type I error is denoted by α (alpha), known as
the α error and also called the level of significance of the test; the
probability of a Type II error is denoted by β (beta), known as the β error.
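* The probability of committing a Type I error (the chance of wrongly
rejecting a true null hypothesis) is the significance level α.
* The p-value is the probability, computed on the assumption that the
null hypothesis is true, of obtaining a result at least as extreme as
the one observed.
* When a hypothesis test results in a p-value that is less than the
significance level, the result of the hypothesis test is called
statistically significant.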
+ The conventional range for alpha is between 0.01 and 0.10.
+ Although numbers such as 0.10, 0.05 and 0.01 are values
commonly used for alpha, there is no overriding mathematical
theorem that says these are the only levels of significance that we
can use.
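The comparison of a p-value with the significance level can be sketched in a few lines. This is only an illustration, assuming a two-sided z-test; the function names `two_sided_p_value` and `is_significant` and the observed z values are hypothetical and do not come from the slides.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for an observed standard-normal test statistic z."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def is_significant(z: float, alpha: float = 0.05) -> bool:
    """True when the p-value falls below the chosen significance level alpha."""
    return two_sided_p_value(z) < alpha


# Hypothetical observed statistics, checked at three commonly used alpha levels.
for z in (1.2, 2.1, 2.8):
    p = two_sided_p_value(z)
    decisions = {alpha: is_significant(z, alpha) for alpha in (0.10, 0.05, 0.01)}
    print(f"z = {z:.1f}, p = {p:.4f}, significant at: {decisions}")
```

The same observed statistic can be significant at one level and not at another, which is why the level has to be fixed before the data are examined.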
* Type II error means accepting the hypothesis which should have
been rejected.
* The probability of a type II error is given by the Greek letter beta.
* The probability of avoiding a type II error is called the power of
the hypothesis test, and is denoted by the quantity 1 - β.
+ An attempt to decrease one type of error is accompanied in general
by an increase in the other type of error. The only way to reduce
both types of error is to increase the sample size, which may or
may not be possible.
* The choice of significance level should be based on the
consequences of Type I and Type II errors.
+ If the consequences of a type I error are serious, then a very small
significance level is appropriate.
+ Example 1: Two drugs are being compared for effectiveness in
treating the same condition. Drug 1 is very affordable, but Drug 2
is extremely expensive. The null hypothesis is "both drugs are
equally effective," and the alternate is "Drug 2 is more effective
than Drug 1." In this situation, a Type I error would be deciding
that Drug 2 is more effective, when in fact it is no better than Drug
1, but would cost the patient much more money. That would be
undesirable, so a small significance level is warranted.
* Power is the probability of correctly rejecting a false null
hypothesis.
* Power = 1 - β
* Since power is the probability of correctly rejecting a false null
hypothesis, it is in our best interest to increase power.
+ There are several ways to increase power:
— Increase the sample size.
— Increase the alpha level. You have a better chance of rejecting
the null hypothesis at the 5% level of significance than at the 1% level.
— Consider an alternative that is farther away from the null hypothesis.
$$ P\left\{ \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} > \frac{C - \mu}{\sigma / \sqrt{n}} \right\} $$
σ = standard deviation
n = sample size
X̄ = mean
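The three ways of increasing power listed above can be checked numerically. The sketch below assumes a one-sided (upper-tail) z-test with a known population standard deviation; the function name `power_upper_tail_z` and all numeric inputs are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail_z(mu0: float, mu1: float, sigma: float, n: int, alpha: float) -> float:
    """Power of an upper-tail z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)   # rejection cut-off on the z scale
    shift = (mu1 - mu0) * sqrt(n) / sigma      # distance of the alternative in standard errors
    return 1.0 - std_normal.cdf(z_crit - shift)


# Hypothetical baseline: mu0 = 100, true mean 103, sigma = 10, n = 25, alpha = 0.05.
baseline = power_upper_tail_z(100, 103, 10, 25, 0.05)
larger_sample = power_upper_tail_z(100, 103, 10, 100, 0.05)
smaller_alpha = power_upper_tail_z(100, 103, 10, 25, 0.01)
farther_alternative = power_upper_tail_z(100, 105, 10, 25, 0.05)

print(f"baseline power:           {baseline:.3f}")
print(f"larger sample (n = 100):  {larger_sample:.3f}")
print(f"smaller alpha (0.01):     {smaller_alpha:.3f}")
print(f"farther alternative:      {farther_alternative:.3f}")
```

When the true mean equals the hypothesised mean, the same function returns alpha, which is the probability of a Type I error.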
Confidence interval:
A confidence interval is a range of values within which the
population parameter is expected to occur.
The two confidence intervals that are used extensively are the 95%
and the 99%.
The upper and lower limits for a confidence interval are given by:
$$ \bar{X} \pm z \frac{s}{\sqrt{n}} $$
X̄ = sample mean
z = critical value of z for the chosen level of confidence
s = standard deviation
n = sample size
[Table: Critical Values of z and Levels of Confidence]
The Dean of the Business School wants to estimate the mean
number of hours worked per week by students. A sample of 49
students showed a mean of 24 hours with a standard deviation of 4
hours. Develop a 95 percent confidence interval for the population
mean.
$$ \bar{X} \pm z \frac{s}{\sqrt{n}} = 24 \pm 1.96 \left( \frac{4}{\sqrt{49}} \right) = 24 \pm 1.12 $$
The confidence limits are 22.88 and 25.12.
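The interval for the Dean's sample can be reproduced with the Python standard library. This is a minimal sketch; the function name `z_confidence_interval` is a hypothetical helper, and the z value is computed from the normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean: float, sd: float, n: int, confidence: float = 0.95):
    """Return (lower, upper) limits of mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95 percent
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


# Values from the example: 49 students, mean 24 hours, standard deviation 4 hours.
low, high = z_confidence_interval(24, 4, 49, 0.95)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

Passing `confidence=0.99` gives the wider 99% interval for the same sample.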
TESTS OF HYPOTHESES
Statisticians have developed several tests of hypotheses
(also known as tests of significance), which can be classified as:
(a) Parametric tests or standard tests of hypotheses; and
(b) Non-parametric tests or distribution-free tests of
hypotheses.
+ Parametric tests usually assume certain properties of the parent
population from which we draw samples.
+ Typical assumptions are that the observations come from a normal
population, that the sample size is large, and that conditions hold for
population parameters such as the mean and variance.
* Non-parametric tests do not depend on any assumption about the
parameters of the parent population.
* Non-parametric tests are generally less statistically powerful than
the analogous parametric procedures.
* A non-parametric test will require a slightly larger sample size to
have the same power as the corresponding parametric test.
The basic distinction between parametric and non-parametric tests is:
If the measurement scale is nominal or ordinal, use non-parametric
statistics.
If you are using interval or ratio scales, use parametric
statistics.
Parametric tests can be used only when the distribution of the data is
normal.
If a distribution deviates markedly from normality, you take
the risk that the statistic will be inaccurate. The safest thing to do
is to use a non-parametric statistic.
Analysis Type | Example | Parametric Procedure | Nonparametric Procedure
Compare means between two distinct/independent groups | Is the mean systolic blood pressure (at baseline) for patients assigned to placebo different from the mean for patients assigned to the treatment group? | Two-sample t-test | Wilcoxon rank-sum test
Compare two quantitative measurements taken from the same individual | Was there a significant change in systolic blood pressure between baseline and the six-month follow-up measurement in the treatment group? | Paired t-test | Wilcoxon signed-rank test
Compare means between three or more distinct/independent groups | If our experiment had three groups (e.g., placebo, new drug #1, new drug #2), we might want to know whether the mean systolic blood pressure at baseline differed among the three groups. | Analysis of variance (ANOVA) | Kruskal-Wallis test
Estimate the degree of association between two quantitative variables | Is systolic blood pressure associated with the patient's age? | Pearson coefficient of correlation | Spearman's rank correlation
* The important parametric tests are:
(1) z-test;
(2) t-test;
(3) χ²-test, and
(4) F-test.
* All these tests are based on the assumption of normality,
i.e., the source of data is considered to be normally
distributed.
* The z-test is one of the most frequently used tests in research studies.
* The z-test is generally used for comparing the mean of a sample to
some hypothesised mean for the population in the case of a large
sample, or when the population variance is known.
* This test may also be used for judging the significance of the median,
mode, and coefficient of correlation.
One sample:
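$$ z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}} $$
In case $\sigma_p$ is not known, we use $\sigma_s$ in its place, calculating
$$ \sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} $$
Two samples:
$$ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$
is used when two samples are drawn from the same population. In case $\sigma_p$ is not known, we use
$\sigma_{s_{12}}$ in its place, calculating
$$ \sigma_{s_{12}} = \sqrt{\frac{n_1 (\sigma_{s_1}^2 + D_1^2) + n_2 (\sigma_{s_2}^2 + D_2^2)}{n_1 + n_2}} $$
where $D_1 = (\bar{X}_1 - \bar{X}_{12})$, $D_2 = (\bar{X}_2 - \bar{X}_{12})$ and
$$ \bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} $$
z-test illustration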
A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be reasonably
regarded as a sample from a large population with mean height 67.39 inches and standard deviation
1.30 inches? Test at the 5% level of significance.
Solution: Taking the null hypothesis that the mean height of the population is equal to 67.39 inches,
we can write:
H0: μ_H0 = 67.39"
Ha: μ_H0 ≠ 67.39"
and the given information as X̄ = 67.47", σ_p = 1.30", n = 400. Assuming the population to be
normal, we can work out the test statistic z as under:
$$ z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}} = \frac{67.47 - 67.39}{1.30 / \sqrt{400}} = \frac{0.08}{0.065} = 1.231 $$
Since |z| = 1.231 is less than 1.96, the two-tailed critical value at the 5% level, the null hypothesis
is not rejected: the sample can reasonably be regarded as drawn from this population.
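The arithmetic of the z-test illustration above can be checked with a short script. This is a sketch using only the numbers given in the problem; the function name `one_sample_z` is a hypothetical helper, and the two-tailed critical value is computed from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean: float, hypothesised_mean: float,
                 population_sd: float, n: int) -> float:
    """z statistic for a sample mean against a hypothesised population mean."""
    return (sample_mean - hypothesised_mean) / (population_sd / sqrt(n))


# Figures from the illustration: n = 400, sample mean 67.47, H0 mean 67.39, sd 1.30.
z = one_sample_z(67.47, 67.39, 1.30, 400)
critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed test at the 5% level
reject_null = abs(z) > critical

print(f"z = {z:.3f}, critical value = {critical:.2f}, reject H0: {reject_null}")
```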
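[Figure: William Sealy Gosset]
1. 'Unpaired t test'
2. 'Paired t test'
One sample:
$$ t = \frac{\bar{X} - \mu_{H_0}}{\sigma_s / \sqrt{n}} $$
with d.f. = (n - 1)
Two samples, unpaired:
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sum (X_{1i} - \bar{X}_1)^2 + \sum (X_{2i} - \bar{X}_2)^2}{n_1 + n_2 - 2}} \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$
with d.f. = (n1 + n2 - 2)
Two samples, paired:
$$ t = \frac{\bar{D} - 0}{\sigma_{diff} / \sqrt{n}} $$
with d.f. = (n - 1), where n = number of pairs and σ_diff is the standard deviation of the paired differences.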
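The unpaired and paired t statistics can be sketched as follows. The pooled-variance form for the unpaired case is the standard textbook formula and is assumed here; the function names and all data values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def unpaired_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    pooled_sd = sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (m1 - m2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def paired_t(first, second):
    """t statistic and degrees of freedom for paired measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = (mean(diffs) - 0) / (stdev(diffs) / sqrt(n))  # stdev uses the n - 1 divisor
    return t, n - 1


# Hypothetical data: two independent groups, then before/after readings on 5 subjects.
t_unpaired, df_unpaired = unpaired_t([12, 15, 11, 14, 13], [10, 9, 12, 11, 8])
t_paired, df_paired = paired_t([140, 150, 145, 160, 155], [135, 148, 140, 150, 152])

print(f"unpaired: t = {t_unpaired:.3f}, d.f. = {df_unpaired}")
print(f"paired:   t = {t_paired:.3f}, d.f. = {df_paired}")
```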
The specimens of copper wire drawn from a large lot have the following breaking strengths (in kg
weight):
578, 572, 570, 568, 572, 578, 570, 572, 596, 544
Test (using Student's t-statistic) whether the mean breaking strength of the lot may be taken to be
578 kg weight (test at the 5 per cent level of significance).
Solution: Taking the null hypothesis that the population mean is equal to the hypothesised mean of
578 kg, we can write:
H0: μ = μ_H0 = 578 kg
Ha: μ ≠ μ_H0
As the sample size is small (n = 10) and the population standard deviation is not known, we
shall use the t-test, assuming a normal population, and work out the test statistic t as under:
$$ t = \frac{\bar{X} - \mu_{H_0}}{\sigma_s / \sqrt{n}} $$
To find X̄ and σ_s, we make the following computations:
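S. No. | X_i | (X_i - X̄) | (X_i - X̄)²
1 | 578 | 6 | 36
2 | 572 | 0 | 0
3 | 570 | -2 | 4
4 | 568 | -4 | 16
5 | 572 | 0 | 0
6 | 578 | 6 | 36
7 | 570 | -2 | 4
8 | 572 | 0 | 0
9 | 596 | 24 | 576
10 | 544 | -28 | 784
n = 10 | ΣX_i = 5720 | | Σ(X_i - X̄)² = 1456
$$ \bar{X} = \frac{\sum X_i}{n} = \frac{5720}{10} = 572 \text{ kg} $$
$$ \sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{1456}{10 - 1}} = 12.72 \text{ kg} $$
Hence,
$$ t = \frac{572 - 578}{12.72 / \sqrt{10}} \approx -1.49 $$
Degrees of freedom = (n - 1) = (10 - 1) = 9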
[Table: critical values of Student's t-distribution by degrees of freedom; the values are not recoverable from the source]
For 9 degrees of freedom, the two-tailed critical value of t at the 5 per cent level is 2.262. Since the
observed |t| is smaller than this value, the null hypothesis is not rejected, and the mean breaking
strength of the lot may be taken to be 578 kg weight.
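The copper-wire computation can be reproduced with the Python standard library. This is a minimal sketch: the variable names are hypothetical, and because the standard library has no t-distribution, the two-tailed 5 per cent critical value for 9 degrees of freedom (2.262) is hard-coded as an assumption taken from a standard t-table.

```python
from math import sqrt
from statistics import mean, stdev

# Breaking strengths (kg weight) from the illustration.
strengths = [578, 572, 570, 568, 572, 578, 570, 572, 596, 544]
hypothesised_mean = 578

n = len(strengths)
sample_mean = mean(strengths)
sample_sd = stdev(strengths)  # uses the (n - 1) divisor, as in the worked solution
t = (sample_mean - hypothesised_mean) / (sample_sd / sqrt(n))
degrees_of_freedom = n - 1

# Assumption: value read from a standard t-table (two-tailed, 5%, 9 d.f.).
T_CRITICAL = 2.262
reject_null = abs(t) > T_CRITICAL

print(f"mean = {sample_mean}, sd = {sample_sd:.2f}, t = {t:.3f}, "
      f"d.f. = {degrees_of_freedom}, reject H0: {reject_null}")
```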