
L3-Statistics Tests of Sig-3

The document discusses tests of significance, focusing on types of data, their distribution, and the design format for statistical analysis. It covers various statistical tests including t-tests for independent and paired samples, ANOVA for comparing more than two means, and the importance of normal distribution in these analyses. Additionally, it explains the interpretation of p-values and the implications for hypothesis testing.

Uploaded by

NAHLA ELKHOLY
Copyright
© © All Rights Reserved

Tests of significance

Three important things to be considered:

1- Type of data
2- Distribution of data (shape – type)
3- Format / design of data in the spreadsheet of the software program

1 Prof.Mohsen Gadallah - Stat


Normal Distribution Curve

The standard normal (Z) distribution:
• Using the z score equation:
– Z score = (Value – Mean) / SD
– Z score = (x – x̄) / SD

• The z scores are distributed as a normal distribution (the standard normal curve), which has a mean of 0 and an SD of 1.
• There are three areas on a standard normal curve that all introductory statistics students should know. The first is that the total area below 0.0 is .50, as the standard normal curve is symmetrical like all normal curves. This result generalizes to all normal curves: for any normally distributed variable, half of the area lies below the mean.
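A minimal sketch of the z score equation and of the area below 0.0, using only the Python standard library. The reading of 85 with mean 80 and SD 5 is a hypothetical value chosen for illustration, and the function names are not from the slides.

```python
import math

def z_score(value, mean, sd):
    """Number of standard deviations a value lies above or below the mean."""
    return (value - mean) / sd

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: a reading of 85 where the mean is 80 and the SD is 5.
z = z_score(85.0, 80.0, 5.0)

# Area below z = 0.0, i.e. below the mean of any normal curve.
area_below_mean = standard_normal_cdf(0.0)

print(z, area_below_mean)
```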
I. Comparison between two independent samples using t-test (Student's t-test)

Diastolic blood pressure among normal weight group and obese group

The relation between diastolic blood pressure and weight. Is there a statistical relationship?

           DIAST.1    DIAST.2
1          80.00      84.00
2          84.00      80.00
3          81.00      85.00
4          78.00      88.00
5          83.00      87.00
6          70.00      86.00
7          72.00      84.00
8          78.00      84.00
9          75.00      81.00
10         77.00      82.00
Total N    10         10
Assumptions for using t-test
• Normal distribution of each group
• Type of data is quantitative continuous.

[Histogram of DIAST.1: Mean = 77.8, Std. Dev = 4.52, N = 10]
[Histogram of DIAST.2: Mean = 84.1, Std. Dev = 2.56, N = 10]
How can you decide that the data are normally distributed?

• 1- Test of normality
• 2- Shape of the histogram and normal curve.
• 3- The mean and the standard deviation.

• Methods 2 and 3 give only a rough indication.
First of all, arrange the data according to the format of the stat. package, e.g. SPSS

Study groups and diastolic blood pressure

           GROUPS           DIASTOLI
1          normal weight    80.00
2          normal weight    84.00
3          normal weight    81.00
4          normal weight    78.00
5          normal weight    83.00
6          normal weight    70.00
7          normal weight    72.00
8          normal weight    78.00
9          normal weight    75.00
10         normal weight    77.00
11         obese            84.00
12         obese            80.00
13         obese            85.00
14         obese            88.00
15         obese            87.00
16         obese            86.00
17         obese            84.00
18         obese            84.00
19         obese            81.00
20         obese            82.00
Total N    20               20
Example of output of t-test using SPSS
t-test for two independent means

Group Statistics

           GROUPS           N     Mean       Std. Deviation    Std. Error Mean
DIASTOLI   obese            10    84.1000    2.5582            .8090
           normal weight    10    77.8000    4.5166            1.4283

Independent Samples Test

                                           Levene's Test for
                                           Equality of Variances    t-test for Equality of Means
                                           F        Sig.            t        df        Sig. (2-tailed)
DIASTOLI   Equal variances assumed         2.382    .140            3.838    18        .001
           Equal variances not assumed                              3.838    14.236    .002
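The t and df values in the SPSS output can be checked by hand. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of SPSS. It computes the pooled-variance t (equal variances assumed) and the Welch degrees of freedom (equal variances not assumed) from the two groups listed earlier.

```python
import math
from statistics import mean, variance

normal_weight = [80, 84, 81, 78, 83, 70, 72, 78, 75, 77]
obese = [84, 80, 85, 88, 87, 86, 84, 84, 81, 82]

def pooled_t(a, b):
    """Independent-samples t with pooled variance (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, n1 + n2 - 2

def welch_t(a, b):
    """Independent-samples t without assuming equal variances (Welch)."""
    n1, n2 = len(a), len(b)
    q1, q2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df

t_pooled, df_pooled = pooled_t(obese, normal_weight)
t_welch, df_welch = welch_t(obese, normal_weight)
print(round(t_pooled, 3), df_pooled, round(t_welch, 3), round(df_welch, 3))
```

With equal group sizes the two t values coincide and only the degrees of freedom differ, which is why both rows of the output show the same t.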


Table of t-tests

         Level of significance
df       0.05       0.01       0.001
1        12.706     63.657     636.619
2        4.303      9.925      31.598
3        3.182      5.841      12.941
4        2.776      4.604      8.61
5        2.571      4.032      6.859
10       2.228      3.169      4.587
20       2.086      2.845      3.85
30       2.042      2.75       3.646
50       2.009      2.678      3.496
Inf      1.96       2.58       3.29
P value
• P value can be interpreted as the probability of obtaining the value of the observed statistic (e.g. a difference between the means of two groups, a difference between two proportions), or one of more extreme value, if the null hypothesis is true. So, after collecting and weighing the evidence, we examine the probability of the observed difference (and all the ones that are more extreme), and if this probability is ≤ 0.05 (the accepted probability of a Type I error) we reject the null hypothesis. So, a P value ≤ 0.05 leads to rejection of the null hypothesis, while a P value > 0.05 indicates that the probability of a Type I error would be too high if we rejected it, so we cannot reject the null hypothesis.
II. Comparison between paired data (dependent means) by using paired t-test

• Paired observations = before and after
• Normal distribution
• Numerical quantitative continuous


Diastolic blood pressure after antihistaminic drug

• Example: effect of antihistaminic drug on diastolic blood pressure

           DIAST.before    DIAST.after
1          80.00           84.00
2          84.00           80.00
3          81.00           85.00
4          78.00           88.00
5          83.00           87.00
6          70.00           86.00
7          72.00           84.00
8          78.00           84.00
9          75.00           81.00
10         77.00           82.00
Total N    10              10
Using t-test for dependent means

Paired Samples Statistics

                   Mean       N     Std. Deviation    Std. Error Mean
Pair 1   DIAST.1   77.8000    10    4.5166            1.4283
         DIAST.2   84.1000    10    2.5582            .8090

Paired Samples Test

                               t         df    Sig. (2-tailed)
Pair 1   DIAST.1 - DIAST.2     -3.678    9     .005
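The paired t value can be reproduced from the before and after columns. This is a minimal sketch with the Python standard library; the variable names are illustrative. The test works on the within-subject differences rather than on the two columns separately.

```python
import math
from statistics import mean, stdev

before = [80, 84, 81, 78, 83, 70, 72, 78, 75, 77]
after = [84, 80, 85, 88, 87, 86, 84, 84, 81, 82]

# One difference per subject (before minus after), as in "DIAST.1 - DIAST.2".
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

# t = mean difference divided by its standard error.
t_paired = mean(diffs) / (stdev(diffs) / math.sqrt(n))
df_paired = n - 1
print(round(t_paired, 3), df_paired)
```

The same twenty values gave t = 3.838 with 18 df when treated as two independent groups, so the choice of design changes both the statistic and the degrees of freedom.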


III. Comparison between more than two independent means using Analysis of Variance Test

Assumptions:
1- Normal Dist.
2- Type of Data Continuous.
3- Independent Groups.

      Under weight    Normal weight    Over weight
1     14.5            12.5             14
2     13              13               12
3     12.5            14               13
4     13              13               12
5     14              14.5             13
6     12              15               14
7     14              14               15
8     12              13               10
9     10              13               12
The analysis of the previous table using ANOVA Test followed by Multiple comparison test (LSD)

Descriptives

UNDER_WT
                 N     Mean       Std. Deviation    Minimum    Maximum
Under Weight     9     12.7778    1.3718            10.00      14.50
Normal Weight    9     13.5556    .8457             12.50      15.00
Over weight      9     12.7778    1.4814            10.00      15.00
Total            27    13.0370    1.2704            10.00      15.00


ANOVA

UNDER_WT
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    3.630             2     1.815          1.136    .338
Within Groups     38.333            24    1.597
Total             41.963            26
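The sums of squares and F in the ANOVA table can be reproduced from the three columns of the data table. This is a minimal sketch with the Python standard library; the function name is illustrative.

```python
from statistics import mean

under = [14.5, 13, 12.5, 13, 14, 12, 14, 12, 10]
normal = [12.5, 13, 14, 13, 14.5, 15, 14, 13, 13]
over = [14, 12, 13, 12, 13, 14, 15, 10, 12]

def one_way_anova(*groups):
    """Return (SS between, SS within, F) for independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f

ss_between, ss_within, f_stat = one_way_anova(under, normal, over)
print(round(ss_between, 3), round(ss_within, 3), round(f_stat, 3))
```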


N.B. Do not use t-test after ANOVA

Multiple Comparisons

Dependent Variable: UNDER_WT
LSD
(I) GROUPS       (J) GROUPS       Mean Difference (I-J)    Sig.
Under Weight     Normal Weight    -.7778                   .204
                 Over weight      .0000                    1.000
Normal Weight    Under Weight     .7778                    .204
                 Over weight      .7778                    .204
Over weight      Under Weight     .0000                    1.000
                 Normal Weight    -.7778                   .204
If we change some data for hemoglobin among the under weight group to be as follows

      Under weight    Normal weight    Over weight
1     11              12.5             14
2     13              13               12
3     12.5            14               13
4     13              13               12
5     11              14.5             13
6     12              15               14
7     10              14               15
8     12              13               10
9     10              13               12
Descriptives

VAR00001
                 N     Mean       Std. Deviation    Std. Error    Minimum    Maximum
Under Weight     9     11.7222    1.1487            .3829         10.00      13.00
Normal Weight    9     13.5556    .8457             .2819         12.50      15.00
Over weight      9     12.7778    1.4814            .4938         10.00      15.00
Total            27    12.6852    1.3739            .2644         10.00      15.00

ANOVA

VAR00001
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    15.241            2     7.620          5.406    .012
Within Groups     33.833            24    1.410
Total             49.074            26
Multiple Comparisons

Dependent Variable: VAR00001
LSD
(I) GROUPS       (J) GROUPS       Mean Difference (I-J)    Sig.
Under Weight     Normal Weight    -1.8333*                 .003
                 Over weight      -1.0556                  .071
Normal Weight    Under Weight     1.8333*                  .003
                 Over weight      .7778                    .177
Over weight      Under Weight     1.0556                   .071
                 Normal Weight    -.7778                   .177
*. The mean difference is significant at the .05 level.
IMPORTANT NOTE

• If you perform the usual t-tests after ANOVA, then multiply the P value by the number of comparisons. In the previous example we have three comparisons, so after each comparison multiply the P value by 3. E.g. if the t-test between group 1 and group 2 revealed a P value = 0.04 (significant), then the adjusted P value = 0.04 x 3 = 0.12 (in this case it is not significant).
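The adjustment described in this note is commonly known as the Bonferroni correction. A minimal sketch follows; the function name is illustrative, and capping the result at 1 is an added convention, since a probability cannot exceed 1.

```python
def bonferroni_adjust(p_value, n_comparisons):
    """Multiply a P value by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)

# The example from the note: P = 0.04 with three pairwise comparisons.
adjusted_p = bonferroni_adjust(0.04, 3)
still_significant = adjusted_p <= 0.05
print(round(adjusted_p, 2), still_significant)
```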


IV: One sample t-test
• It is less commonly used in our research.
• It is used when we carry out a study with one sample and we want to compare the results with the known population mean.
• Example: the population mean value of cholesterol level is 180 mg %/ml. A group of 100 adult males with +ve family history of hypertension was studied for cholesterol level. The results are: Mean = 190 and SD = 20.
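The slide gives the summary figures but not the resulting statistic. As a sketch, assuming the usual one-sample formula t = (sample mean - population mean) / (SD / √n), the calculation would run as follows; the function name is illustrative.

```python
import math

def one_sample_t(sample_mean, population_mean, sd, n):
    """One-sample t statistic and its degrees of freedom."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error, n - 1

# Figures from the cholesterol example: mean 190, SD 20, n 100, population mean 180.
t_one, df_one = one_sample_t(190, 180, 20, 100)
print(t_one, df_one)
```

The standard error is 20 / 10 = 2, so t = 10 / 2 = 5 with 99 degrees of freedom. This exceeds the 0.001 critical value listed for 50 df in the table of t-tests (3.496).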
Repeated measures analysis of variance
• For repeated measures of a quantitative continuous variable in the same group, i.e. an extension of the paired t-test, as seen in many clinical trials with follow-up for different periods of time.


V: Non Parametric Analysis
• It is used when the assumption of normal distribution is violated even after trials of transformation.

• It is used when the data are quantitative and discrete, e.g. Apgar Score (0 – 10), Pain Score (1-10).


Mann-Whitney U test (Wilcoxon Rank Sum Test)
Equivalent of t-test for two independent samples

Case Summaries

           ALT.HBV+B    ALT.HBV
1          5.00         4.00
2          10.00        80.00
3          100.00       60.00
4          90.00        190.00
5          60.00        50.00
6          200.00       100.00
7          40.00        4.00
8          30.00        30.00
9          12.00        10.00
10         8.00         2.00
Total N    10           10
The distribution of Data for Group 1
[Histogram of ALT.HBV1: Mean = 55.5, Std. Dev = 61.22, N = 10]

Distribution of Data for Group 2
[Histogram of ALT.HB2: Mean = 53.0, Std. Dev = 59.20, N = 10]
Wilcoxon signed-rank test
Equivalent to paired t-test

Case Summaries for ALT before and after treatment with new drug

           ALT-Before    ALT.After
1          5.00          4.00
2          10.00         80.00
3          100.00        60.00
4          90.00         190.00
5          60.00         50.00
6          200.00        100.00
7          40.00         4.00
8          30.00         30.00
9          12.00         10.00
10         8.00          2.00
Total N    10            10
Example of test result (Mann-Whitney U test)

Ranks

            STUDY.GR     N     Mean Rank    Sum of Ranks
ALT.HBV1    Bilh.+HBV    10    11.10        111.00
            HBV          10    9.90         99.00
            Total        20

Test Statistics (b)

                                      ALT.HBV1
Mann-Whitney U                        44.000
Wilcoxon W                            99.000
Z                                     -.454
Asymp. Sig. (2-tailed)                .650
Exact Sig. [2*(1-tailed Sig.)]        .684 (a)

a. Not corrected for ties.
b. Grouping Variable: STUDY.GR
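The rank sums and U in this output can be reproduced by ranking all twenty ALT values together and giving tied values the average of the ranks they occupy. This is a minimal sketch with the Python standard library; the helper name `midranks` is illustrative.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

bilh_hbv = [5, 10, 100, 90, 60, 200, 40, 30, 12, 8]
hbv = [4, 80, 60, 190, 50, 100, 4, 30, 10, 2]

ranks = midranks(bilh_hbv + hbv)
n1, n2 = len(bilh_hbv), len(hbv)
rank_sum_1 = sum(ranks[:n1])
rank_sum_2 = sum(ranks[n1:])

# U for each group is its rank sum minus the smallest possible rank sum.
u1 = rank_sum_1 - n1 * (n1 + 1) / 2
u2 = rank_sum_2 - n2 * (n2 + 1) / 2
u_stat = min(u1, u2)
print(rank_sum_1, rank_sum_2, u_stat)
```

The smaller rank sum (99) is the value reported as Wilcoxon W, and the smaller U (44) is the value reported as Mann-Whitney U.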
Kruskal-Wallis test
• Equivalent of one way Analysis of Variance Test.
• Just as analysis of variance is a more general form of the t-test, so the Kruskal-Wallis test is an extension of the Mann-Whitney test.


Friedman Test

• The equivalent of repeated measures analysis of variance for ordinal data and for data that are not normally distributed.


VI: Comparison between two Independent proportions
• It is used when the data are qualitative (categorical).
• It is called 2 x 2 Table (= Frequency table) or Cross-tabulation.
• The number of events (e.g. cures) follows a Binomial distribution, so the proportion is approximately normally distributed when the sample is large.
• The test used is the z test.
• Chi square test can also be used.
Example
Two drugs were used in a clinical trial for treatment of tonsillitis.

                  Failed Frequency    Cured Frequency
Drug A (N=50)     10                  40
Drug B (N=50)     4                   46

Z test = 1.44
P value = 0.15
Chi square test (X²) for 2 x 2 table or r x c table
Two drugs were used in a clinical trial for treatment of tonsillitis.

                  Failed Frequency    Cured Frequency
Drug A (N=50)     10                  40
Drug B (N=50)     4                   46

X² = 2.99
P value = 0.084
Some important points for X² test
• If the sample size is NOT large, we use X² with Yates' correction. When the X² test is used for a small sample size, its value tends to be a little too large, so we use the correction to remove the bias. Many statisticians prefer to use X² with Yates' correction.


                  Failed Frequency    Cured Frequency
Drug A (N=50)     10                  40
Drug B (N=50)     4                   46

X² = 2.99, P value = 0.084
X² with Yates' correction = 2.08, P value = 0.15
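Both results can be reproduced with the shortcut formula for a 2 x 2 table, X² = N(|ad - bc|)² / (product of the four marginal totals), where Yates' correction subtracts N/2 from |ad - bc| before squaring. This is a minimal sketch with the Python standard library; the function name is illustrative. With 1 degree of freedom, the P value equals erfc(√(X²/2)).

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic and P value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink the difference by N/2, never below zero.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Drug A: 10 failed, 40 cured. Drug B: 4 failed, 46 cured.
chi2, p_chi2 = chi_square_2x2(10, 40, 4, 46)
chi2_yates, p_yates = chi_square_2x2(10, 40, 4, 46, yates=True)
print(round(chi2, 2), round(p_chi2, 3), round(chi2_yates, 2), round(p_yates, 2))
```

The square root of the corrected statistic is about 1.44, which matches the Z test value quoted for the same table.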
Fisher Exact Test
If any cell has an expected value less than 5, the X² test even with correction is not valid. So, we use an alternative approach for 2 x 2 tables only, known as Fisher's exact test.

                  Failed Frequency    Cured Frequency
Drug A (N=20)     7                   13
Drug B (N=20)     2                   18

X² = 3.58, p = 0.058
Fisher exact p value = 0.13
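The exact P value can be obtained from the hypergeometric probabilities of every table with the same margins. This is a minimal sketch with the Python standard library; it assumes the common two-sided definition that sums the probabilities of all tables no more likely than the observed one, and the function name is illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as unlikely as, or less likely than, the observed one.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Drug A: 7 failed, 13 cured. Drug B: 2 failed, 18 cured.
p_fisher = fisher_exact_two_sided(7, 13, 2, 18)
print(round(p_fisher, 2))
```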
VII. Paired qualitative data
Paired 2 x 2 table

If the subjects are measured twice and the measured characteristic is nominal, we use a test for paired proportions that analyzes the number of disagreements, called the McNemar test.

                     After -ve pain    After +ve pain
Before +ve pain      28                10
Before -ve pain      10                2

McNemar = 20.83, p < 0.001, highly significant
X² = 0.08, p > 0.05, insignificant
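The McNemar value uses only the two disagreement cells: 28 subjects changed from +ve to -ve and 2 changed from -ve to +ve. This is a minimal sketch with the Python standard library, assuming the continuity-corrected form (|b - c| - 1)² / (b + c), which reproduces the 20.83 quoted here; the function name is illustrative.

```python
import math

def mcnemar(b, c, continuity=True):
    """McNemar statistic and P value (1 df) from the two discordant counts."""
    diff = abs(b - c)
    if continuity:
        diff = max(0, diff - 1)
    stat = diff ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# b = +ve before and -ve after; c = -ve before and +ve after.
mcnemar_stat, mcnemar_p = mcnemar(28, 2)
print(round(mcnemar_stat, 2), mcnemar_p < 0.001)
```

The ordinary X² test treats the four cells as independent groups and therefore misses the change within subjects, which is why it reports no significant difference for the same table.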
McNemar = Paired X²

• It is used to compare the diagnosis of the disease by using two methods.

• Also, we can apply the extended McNemar test for a 3 x 3 table or more.
Some examples of using paired data

                       X-ray
Sonar          Normal    Benign    Malignant
Normal         20        5         2
Benign         4         15        1
Malignant      0         5         6


VIII: Correlation and Linear Regression
• In Correlation we are looking for a linear association between two variables (in the same group), and the strength of the association is summarized by the correlation coefficient (r).


[Scatter plot: Diastolic Bl. Pr against Cholesterol, +ve Correlation]
[Scatter plot: Immunity level against No. of Cigarettes/Day, -ve Correlation]
[Scatter plot: Blood Urea against No. of cigarettes/day, No Correlation]


NON-PARAMETRIC MEASURES OF ASSOCIATION

• For nominal data there are many measures, such as the contingency coefficient (after X² test), the phi coefficient and Cohen's kappa.

• For discrete or ordinal data there are also many measures, such as Spearman rank correlation and Kendall's tau.
Linear Regression

• In linear regression we are looking for the dependence of one variable (the dependent variable, e.g. Systolic Blood Pressure) on another, the independent variable (e.g. Cholesterol). BOTH variables are continuous.
• The relation is summarized by a regression equation consisting of a slope and an intercept.


Regression equation

• Y = a + bX
• Y = dependent variable
• X = independent variable
• a = intercept
• b = slope
• The slope represents the amount the dependent variable increases with a unit increase in the independent variable.
• The intercept represents the value of the dependent variable when the independent variable takes the value zero.
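As a sketch of how a and b are obtained, the least-squares estimates are b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄. The cholesterol and diastolic values below are hypothetical, chosen only so that the arithmetic is easy to follow; they are not data from the lecture, and the function name is illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical illustrative data (not from the slides).
cholesterol = [150, 170, 190, 210, 230]   # X
diastolic = [70, 75, 80, 85, 90]          # Y

a, b = fit_line(cholesterol, diastolic)
predicted_at_200 = a + b * 200
print(a, b, predicted_at_200)
```

Here each unit increase in X raises Y by 0.25, and the intercept 32.5 is the fitted Y at X = 0, a value far outside the observed range of X.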
Example for linear line trend

[Scatter plot with trend line: Diastolic Bl. Pr (Y) against Cholesterol (X)]
Multiple linear regression
• In multiple linear regression we are interested in the simultaneous relationship between one dependent variable (diastolic blood pressure) and a number of independent variables (weight, smoking, cholesterol, etc.)

• The equation: Y = a + b1x1 + b2x2 + b3x3 + …

• Note: dependent variable = Y = continuous


Logistic regression
• In logistic regression the dependent variable is binary (qualitative); that is, it can take one of two categories (Cure / Failure). The independent variables can be of any type.
• There are many methods used to carry out the Model of logistic regression, such as:
• 1- Enter Technique (all variables are allowed for the prediction of the outcome).
• 2- Stepwise Technique, which allows only significant predictor variables to be included in the Model.
