
Parametric and Nonparametric Tests


PARAMETRIC&NONPARAMETRIC TESTS Prof Dr Saad Motawea

Tests of significance
By
Prof Dr Saad Motawea
Professor of Community Medicine and Public Health

Intended learning objectives:
By the end of this lecture, the student should be able to:
• Understand normal distribution curve and skewness
• Understand the statistical hypothesis
• Understand The P (probability) value
• List tests of significance
• Select the proper test of significance according to the type of data
• Interpret the results of significance tests.

Normal Distribution Curve


Many biological characteristics e.g. Lengths, weight, H.R., B.P., Hb. etc when plotted
follow the normal distribution curve.
Properties of Normal distribution curve:
1- It is bell - shaped, bilateral and symmetrical, with the peak at the mean which is
located at the midpoint of the base.
2- The mean, median and mode coincide together.
3- The curve has a point of inflexion on both sides of the centre where the curvature
changes from convex upwards to concave upwards.
- The point of inflexion is located at one standard deviation above the center and one
standard deviation below the center.
- Between these 2 points, we find 68.2% of the area under the curve and 68.2% of the
total frequency of the population.
4- The percentage of the area included within other multiples of the S.D. above and
below the mean is always fixed, as shown in figure (1).
Within 2 S.D. above and below the mean (± 2 S.D.), 95.4% of the area is included.
Within ± 3 S.D., 99.7% of the area is included.


Figure. (1): Normal distribution curve

Example:
Let us say that a group of patients enrolling for a trial had a normal distribution for
weight. The mean weight of the patients was 80 kg. For this group, the SD was calculated
to be 5 kg.
1 SD below the average is 80 – 5 = 75 kg.
1 SD above the average is 80 + 5 = 85 kg.
±1 SD will include 68.2% of the subjects, so 68.2% of patients will weigh between 75
and 85 kg.
95.4% will weigh between 70 and 90 kg (±2 SD).
99.7% of patients will weigh between 65 and 95 kg (±3 SD).
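
The percentages in this example can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `fraction_within` is an illustrative choice and is not part of the lecture.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    # For a normal curve, P(|X - mean| <= k*SD) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

mean, sd = 80, 5  # weights from the example, in kg
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    print(f"±{k} SD: {low}-{high} kg covers {fraction_within(k):.1%}")
```

The first line prints 68.3% because the exact figure is 68.27%; the 68.2% quoted in the lecture is the same quantity rounded down.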
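Problem:
Suppose we have some length-of-hospital-stay data with a mean stay of 10 days and an SD of 8
days. Are this mean and SD appropriate measures to use?
No, because these data are not parametric: for parametric data the SD should be less than
one third of the mean, and here 8 is far more than 10/3. Indeed, mean – 2 SD would be
–6 days, an impossible length of stay.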


Skewness
Skewness indicates the extent and direction of asymmetry (the degree of departure from
symmetry) in a distribution.
1- If a distribution is perfectly symmetrical, the measure of skewness is equal to zero.
2- If the distribution is asymmetrical and the tail of the distribution extends in the
direction of the positive values, there is positive skewness (shift to the right): the mean
lies to the right of (is greater than) the median and mode, as shown in figure (2B).
3- If the tail extends in the direction of the negative values, there is negative skewness
(shift to the left): the mean lies to the left of (is less than) the median and mode, as
shown in figure (2A).

Figure (2): A Shift to the right (+ve skewness) B Shift to the left (-ve skewness).
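
As an illustration of these rules, the sketch below computes one common measure of skewness (the mean cubed deviation divided by the cube of the SD) for two small samples. The data and the function name `skewness` are invented for illustration; the lecture does not specify which skewness formula it has in mind.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: mean cubed deviation / SD cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]  # hypothetical data with a long right tail

print(skewness(symmetric))         # zero for a perfectly symmetrical sample
print(skewness(right_tailed) > 0)  # positive skewness
# With a right tail, the mean is pulled above the median.
print(statistics.fmean(right_tailed) > statistics.median(right_tailed))
```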

PARAMETRIC and NON-PARAMETRIC DATA


Parametric data: data which follow the normal distribution curve. As a simple rule, the
SD is less than one third of the mean, i.e. the individuals in the group are homogeneous.
Non-parametric data: data which do not follow the normal distribution curve. As a simple
rule, the SD is greater than one third of the mean, i.e. there is great variation
between the individuals in the group.
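
The rule of thumb above can be written as a one-line check. This is only a sketch of the lecture's screening rule, applied to the weight example and the hospital-stay problem given earlier; the name `looks_parametric` is hypothetical, and the check is not a formal test of normality.

```python
def looks_parametric(mean, sd):
    """Apply the lecture's rule of thumb: SD below one third of the mean."""
    return sd < mean / 3

print(looks_parametric(80, 5))  # patient weights: 5 is less than 80/3
print(looks_parametric(10, 8))  # hospital stay: 8 is more than 10/3
```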


HYPOTHESIS TESTING
The “null hypothesis”
• The “null hypothesis” is a concept. The test method assumes (hypothesizes) that
there is no (null) difference between the groups.
• The result of the statistical test leads us either to accept or to reject that hypothesis.
• The null hypothesis is generally the opposite of what we are actually interested in
finding out.
• If we are interested in whether there is a difference between two treatments, then the
null hypothesis would be that there is no difference, and we would try to disprove this.
Null hypothesis (Ho): no difference exists between groups.

Alternative hypothesis (H1): a difference exists between groups.

Examples:
What is the null hypothesis in a study?
(a) To find out whether use of a new surgical technique reduces rates of wound infection.
(b) To find out whether adenoidectomy reduces absence from school in children with
middle ear disease.
Answers
(a) Rates of wound infection are the same with the new surgical technique as with old
surgical technique.
(b) In children with middle ear disease rates of absence from school are the same after
adenoidectomy as in comparable children who have not had the operation.
• Type I error: rejecting the null hypothesis when it is true,
i.e. stating that there is a difference when the observed difference occurred by chance.

• Type II error: accepting the null hypothesis when it is
false, i.e. stating that there is no difference when such a difference exists.


Errors

Decision      Ho true             Ho false
Reject Ho     Type I error        Correct decision
Accept Ho     Correct decision    Type II error

The P (probability) value


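• The P (probability) value is used when we wish to see how compatible the observed
data are with the null hypothesis.
What does it mean?
The P value gives the probability of obtaining a difference at least as large as the one
observed if the null hypothesis were true, i.e. if chance alone were at work. The
statements below use “happened by chance” as shorthand for this.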
• P = 0.5 means that the probability of the difference having happened by chance is
0.5 in 1, or 50:50.
• P = 0.05 means that the probability of the difference having happened by chance is
0.05 in 1, i.e. 1 in 20.
• P = 0.05 is the figure frequently quoted as being “statistically significant”, i.e.
unlikely to have happened by chance and therefore important.
- However, this is an arbitrary figure.
- If we look at 20 studies, even if none of the treatments work, on average one of the
studies will have a P value of 0.05 or less and so appear significant!
• P = 0.01 is often considered to be “highly significant”.


• It means that the difference will only have happened by chance 1 in 100 times.
This is unlikely, but still possible.
• P = 0.001 is usually considered to be “very highly significant”. It means the
difference will have happened by chance 1 in 1000 times: even less likely, but still
just possible.
The lower the P value, the less likely it is that the difference happened by chance
and so the higher the significance of the finding.
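
The remark about 20 studies can be made concrete with a little arithmetic. The sketch below assumes the 20 studies are independent and that none of the treatments work; both assumptions are added here for illustration.

```python
alpha = 0.05    # significance threshold
n_studies = 20  # independent studies of treatments that do not work

# Average number of studies that appear "significant" purely by chance.
expected_false_positives = n_studies * alpha
# Probability that at least one of the 20 studies appears "significant".
p_at_least_one = 1 - (1 - alpha) ** n_studies

print(expected_false_positives)
print(round(p_at_least_one, 2))
```

On average one study in 20 crosses the threshold, and the chance that at least one does so is roughly 64%.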

Remember that
• P = 0.05 is usually classed as “significant”,
• P = 0.01 as “highly significant”
• P = 0.001 as “very highly significant”.
Example 1: Patients with minor illnesses were randomized to see either Dr Smith or
Dr Jones. Dr Smith ended up seeing 176 patients in the study whereas Dr Jones saw
200 patients (Table 1).
Table 1. Number of patients with minor illnesses seen by two GPs

• In the example above, only two of the sets of data showed a significant difference
between the two GPs.


• Dr Smith’s consultations were very highly significantly shorter than those of Dr Jones.
• Dr Smith’s follow-up rate was significantly higher than that of Dr Jones.
Example 2: What is meant by the P values in the following statements?
(a) In a trial of a new analgesic regimen used in terminal illness, patients reported more
satisfactory pain relief than when receiving conventional treatment (P =0·02).
(b) In a comparison of the workloads of two accident and emergency departments there
was no significant difference in the numbers of hip fractures treated over a 12 month
period (P>0·05).
(c) Advice to mothers in a community not to leave babies face down in their cots was
associated with a significant reduction (P < 0·05) in rates of sudden infant death over two
years as compared with the two previous years.
Answers:
(a) If in general the new analgesic produced no better or worse pain relief than
conventional treatment, the probability of observing a difference in pain relief
between the new analgesic and conventional treatment as large as or larger than that
found in the trial would be 0·02 (2%).
(b) If the underlying workloads of the two departments were identical, the probability of
finding a difference between them in numbers of hip fractures treated over 12 months as
large as or larger than that observed in the study would be more than 0·05 (5%).
(c) If advice not to leave babies face down in their cots had no influence on rates of
sudden infant death, the probability of observing a difference in the rates of sudden death
as large as or larger than that found in the study would be less than 0·05 (5%).
Note that in each of the above examples, the definition of the P value starts with an “if.”


Tests of significance
In order to test a statistical hypothesis, tests of significance are used. They are
mathematical expressions of sample values that provide a basis for testing statistical
hypotheses.
They are statistical tests which enable us to decide whether to reject or accept a hypothesis.
The choice of test depends on whether the data are parametric or non-parametric.
PARAMETRIC TESTS
t TESTS AND OTHER PARAMETRIC TESTS
How important are they?
• Used in one in three published medical papers, they are an important
aspect of medical statistics.
Calculating the t value by hand is laborious; with any statistical computer program
such as SPSS, Epi Info, etc. it is very easy.
(1) Student t-test (two-sample t-test, unpaired t-test):
• Used to compare the means of two independent samples when the variable is quantitative.

Example: Serum cholesterol was determined for 2 groups.
The mean of the first group was 220 mg% with an S.D. of 10; the number of persons in the
first group was 20.
The mean of the second group was 200 mg% with an S.D. of 15; the number of persons was 15.


Determine if there is a significant difference between the serum cholesterol of the two groups.

x̄1 = 220 mg%     x̄2 = 200 mg%
n1 = 20           n2 = 15
S.D1 = 10         S.D2 = 15
- As the variables are quantitative and parametric i.e. S.D. is less than 1/3 of the mean.
We use the Student-t test
- Null Hypothesis states that there is no difference between the groups.
- Alternative Hypothesis states that there is difference between the groups

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}}
= \frac{220 - 200}{\sqrt{\dfrac{(10)^2}{20} + \dfrac{(15)^2}{15}}}
= \frac{20}{\sqrt{5 + 15}} = 4.47$$

- The (t) value obtained (4.47) is called the calculated t.


- The degrees of freedom equal n1 + n2 - 2 = 33
- From the table of (t), at 0.05 probability and 33 degrees of freedom, the critical t value
(tabulated t) is 2.04
It is evident that the calculated (t) is higher than the tabulated value. This means that
there is a significant difference between the two groups, i.e. the Null Hypothesis is rejected
and the Alternative Hypothesis is accepted at the 95% confidence level.
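
The hand calculation for the cholesterol example can be reproduced in a few lines. This is a sketch using only the standard library: the function name `t_from_summary` is hypothetical, and the critical value 2.04 is simply the tabulated figure quoted in the lecture rather than something the code looks up.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic from group means, SDs and sizes, using the lecture's formula."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

t = t_from_summary(220, 10, 20, 200, 15, 15)
df = 20 + 15 - 2
critical_t = 2.04  # tabulated value quoted in the lecture for df = 33, P = 0.05

print(round(t, 2), df)
print("reject H0" if abs(t) >= critical_t else "do not reject H0")
```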

N.B: For any test of significance, the following rules are applicable:
* If the calculated value is equal to or greater than the tabulated value, there is a
significant difference, i.e. the Alternative Hypothesis is accepted.
* If the calculated value is less than the tabulated value, the Null Hypothesis is accepted,
i.e. the difference is insignificant.


Thankfully you do not need to know this mathematical calculation.
Just look for the P value to see how significant the result is.
Remember: The smaller the P value, the stronger the evidence against the “null
hypothesis” and in favour of “the alternative hypothesis”.

EXAMPLE for interpretation:


Two hundred adults seeing an asthma nurse specialist were randomly assigned to either a
new type of bronchodilator or placebo.
After 3 months the peak flow rates in the treatment group had increased by a mean of 96
l/min (SD 58), and in the placebo group by 70 l/min (SD 52).
The null hypothesis is that there is no difference between the bronchodilator and the
placebo. The t statistic is 11.14, resulting in a P value of 0.001.
It is therefore very unlikely (1 in 1000 chance) that the null hypothesis is correct so we
reject the hypothesis and conclude that the new bronchodilator is significantly better
than the placebo.
(2) Paired t-test:
• This test compares paired observations. It is used to compare two sample means from
the same group (for example, before and after treatment) when the variable is quantitative.
(3) Analysis of variance (ANOVA):
• Analysis of variance (ANOVA) (F-test) is used to compare the means of three or more
groups. It may be one-way ANOVA or two-way ANOVA.
(4) Correlation:
• It indicates the degree to which two quantitative variables are related. It is measured
by Pearson’s correlation coefficient (r), which may be +ve, -ve or 0.
The correlation coefficient (r) defines both the strength and the direction of the relation.
(5) Regression:
• If two quantitative variables are correlated, then when one is known the other can be
predicted.

NON-PARAMETRIC TESTS
• Non-parametric statistics are used when the data are not normally distributed and so
are not appropriate for “parametric” tests.
• Rather than comparing the values of the raw data, statisticians “rank” the data and
compare the ranks.
• The “Mann–Whitney U test” corresponds to the unpaired t-test.
• The “Wilcoxon signed rank test” corresponds to the paired t-test.
• The “Kruskal–Wallis test” corresponds to one-way ANOVA: it compares three or more
unmatched groups.
• The “Friedman test” corresponds to repeated-measures ANOVA: it compares three or
more matched groups.
• “Spearman’s rank correlation coefficient” corresponds to Pearson’s correlation
coefficient.
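
To show what “ranking the data” means in practice, the sketch below pools two groups, ranks the values, and computes the Mann–Whitney U statistic from the rank sums. The data and function names are invented for illustration, and the code stops at the U statistic: it does not compute a P value.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) computed from the rank sum of each group."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a

# Hypothetical ages, invented for illustration only.
print(mann_whitney_u([3, 5, 8], [1, 2, 4, 6]))
```

U_a counts the pairs in which a value from the first group exceeds a value from the second, which is why only the order of the values matters and not their size.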

• Do not be put off by the names – go straight to the P value. P = 0.05 is usually
classed as “significant”, P = 0.01 as “highly significant” and P = 0.001 as “very
highly significant”.
Example:
• A statistician used a “Mann–Whitney U test” to test the hypothesis that there is no
difference between the ages of two groups. This gave a U value of 133 200 with a P
value of < 0.001.
• Ignore the actual U value but concentrate on the P value, which suggests a very
highly significant difference between the two groups.
TESTS FOR QUALITATIVE DATA
Chi-square test: it measures the association between qualitative variables. It uses
absolute numbers (frequencies).
Fisher exact test: an alternative to the chi-square test when cell frequencies are less than 5.
The “Mantel Haenszel test” is an extension of the χ2 test that is used to compare
several two-way tables.


Z-test: used to compare 2 groups regarding a qualitative variable. It uses proportions, not
absolute numbers.
Chi-square test (X2)

It is a test used for comparison of qualitative variables.

The following criteria should be satisfied:
1. The data must be in the form of frequencies.
2. The frequency data must have a precise numerical value and must be organised
into categories or groups.
3. The total number of observations must be greater than 20.
4. The expected frequency in any one cell of the table must be greater than 5.
5. If the number in any one cell is less than 5, the Fisher exact test is used instead.
6. X2 can also be used for comparison between more than two groups.

Equation:
$$X^2 = \sum \frac{(O - E)^2}{E}$$

O = observed value, E = expected value

Example:

Two drugs were used in a clinical trial for treatment of tonsillitis.

Type of treatment    Number of cure    Number of failure    Total
Drug A               40                20                   60
Drug B               36                4                    40
Total                76                24                   100

Is Drug B better than Drug A?



Is there a statistical difference between the two drugs regarding cure rate?

To answer the problems above, for example problem 2, we:

1- Assume normal distribution of data
2- Type of data variables = categorical (qualitative)
3- Number of groups = two independent groups
4- Null Hypothesis: there is no difference in cure rate between the two drugs
5- Alternative Hypothesis: there is a statistical difference, Drug B being better
6- Test of significance: X2 test
Calculate the X2 test

Steps of calculation:

a- Calculate the expected frequency (E) for each cell. To find the expected frequency E
of a cell, multiply the row total by the column total and divide by the
grand overall total:
E = row total x column total / grand total

b- For each cell, subtract the expected frequency from the observed frequency (O – E).
c- For each cell, square the result of (O – E) and divide by the expected
frequency E.
d- Add the results calculated in step c for all the cells:
$$X^2 = \sum \frac{(O - E)^2}{E}$$

In our example, the expected frequency values are:

E1 = 76 x 60/100 = 45.6     E2 = 24 x 60/100 = 14.4
E3 = 76 x 40/100 = 30.4     E4 = 24 x 40/100 = 9.6

Type of treatment    Number of cure         Number of failure      Total
Drug A               O1 = 40, E1 = 45.6     O2 = 20, E2 = 14.4     60
Drug B               O3 = 36, E3 = 30.4     O4 = 4, E4 = 9.6       40
Total                76                     24                     100


$$X^2 = \frac{(40 - 45.6)^2}{45.6} + \frac{(20 - 14.4)^2}{14.4} + \frac{(36 - 30.4)^2}{30.4} + \frac{(4 - 9.6)^2}{9.6} = 7.16$$

7- Use the X2 table at degrees of freedom = (rows – 1) x (columns – 1) and level of
significance = 0.05. Note that the degrees of freedom in this example = 1.
If X2 calculated (= 7.16) is equal to or larger than X2 tabulated (in this
example = 3.84), then we reject the Null Hypothesis. If X2 calculated is smaller than X2
tabulated, then we accept the Null Hypothesis.

8- Our results revealed that calculated X2 is larger than tabulated X2 , which means
that the p value is smaller than 0.05 ( P < 0.05) . We conclude that Drug B is better
than Drug A in treatment of Tonsillitis, as the difference is statistically significant.
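
The steps of this worked example can be sketched in code as follows. The function name `chi_square_2x2` is hypothetical. The P value line relies on the fact that, with 1 degree of freedom, the chi-square tail probability equals erfc(√(X2/2)), so it applies only to 2 x 2 tables.

```python
import math

def chi_square_2x2(table):
    """Chi-square statistic and P value (1 degree of freedom) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency = row total x column total / grand total.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square >= x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: Drug A, Drug B; columns: cured, failed.
chi2, p = chi_square_2x2([[40, 20], [36, 4]])
print(round(chi2, 2), p < 0.05)
```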

Example for interpretation:

A group of patients with bronchopneumonia were treated with either amoxicillin or
erythromycin. The results are shown in Table 3.
Table 3. Comparison of effect of treatment of bronchopneumonia with
amoxicillin or erythromycin

                            Amoxicillin    Erythromycin    Total
Improvement at 5 days       144 (60%)      160 (67%)       304 (63%)
No improvement at 5 days    96 (40%)       80 (33%)        176 (37%)
Total                       240 (100%)     240 (100%)      480 (100%)

The results were X2 = 2.3; P = 0.13


Remember, do not worry about the X2 value itself, but see whether it is significant. In
this case P is 0.13, so the difference in treatments is not statistically significant.
Interpretation of results of significance tests
Just look for the P value to see how significant the result is.
Remember: The smaller the P value, the stronger the evidence against the “null
hypothesis” and in favour of “the alternative hypothesis”.


Remember that P = 0.05 is usually classed as “significant”, P = 0.01 as
“highly significant” and P = 0.001 as “very highly significant”.
Selection of the proper test of significance according to the type of data


An Introduction to SPSS
What is SPSS?
• “SPSS” = Statistical Package for the Social Sciences
• A software program that is very user-friendly
What can it do?
• Enables the user to perform advanced statistical analyses (e.g., t-tests) without all
the tedious hand calculations .
• We won’t teach you the formulas. We’ll only teach you how to do a t-test with
SPSS, and how to interpret the results.
The following example uses actual SPSS output.
While you are learning about t-tests, you are also learning to interpret SPSS output!
We want to compare the systolic BP between normal-weight and over-weight subjects.
250 subjects were recruited for each group.
Table: Descriptive statistics of systolic BP by group.

Group            Mean      Std Deviation    Minimum    Maximum    Median
Over-weight      141.65    23.06            90.00      200.00     138.00
Normal-weight    97.12     10.82            80.00      132.00     100.00

SPSS output

Conclusion: there was a significant difference in the systolic BP between the over-weight
and normal-weight groups (p<0.001)


Look at the following and their interpretations
