Math204 NonParThree


NON-PARAMETRIC STATISTICS

TWO DEPENDENT SAMPLES AND MULTIPLE INDEPENDENT SAMPLES

1. TWO DEPENDENT SAMPLES: WILCOXON SIGNED RANK TEST


Data collected from the same experimental units are in general dependent. For example, if data are
collected on two occasions (time 1 and time 2, or before and after treatment) from the same $n$
individuals, then the resulting data samples $(y_{11}, \ldots, y_{n1})$ and $(y_{12}, \ldots, y_{n2})$ are
dependent. Such data are often referred to as paired. We wish to test whether there is a significant
change across the two measurements.
For a parametric test, we typically assume that the within-individual differences
\[ x_i = y_{i1} - y_{i2}, \qquad i = 1, \ldots, n \]
are Normally distributed, and test the hypothesis that the mean difference $\mu$ is zero
\[ H_0 : \mu = 0 \]
using a one-sample Z-test ($\sigma$ known) or T-test ($\sigma$ unknown), with statistic
\[ z = \frac{\bar{x}}{\sigma/\sqrt{n}} \qquad \text{or} \qquad t = \frac{\bar{x}}{s/\sqrt{n}} \]
distributed as Normal(0, 1) or Student($n - 1$) respectively.
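As a minimal sketch of the parametric version, the T statistic for paired data can be computed directly from the within-individual differences. The function name `paired_t_statistic` and the measurements below are hypothetical and serve only as illustration.

```python
import math

def paired_t_statistic(first, second):
    """Return (t, df) for H0: the mean within-individual difference is zero."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(first, second)]  # x_i = y_i1 - y_i2
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance s^2 with the usual n - 1 divisor.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    t = mean / math.sqrt(var / n)
    return t, n - 1

# Hypothetical measurements, for illustration only.
before = [5.0, 6.0, 7.0, 8.0]
after = [4.0, 4.0, 6.0, 6.0]
t, df = paired_t_statistic(before, after)
print(round(t, 3), df)
```

The value of `t` would then be compared with a Student($n - 1$) quantile.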

For a non-parametric test, we can use the Wilcoxon Signed Rank test, which proceeds as follows:
1. Compute the within-individual differences
\[ x_i = y_{i1} - y_{i2}, \qquad i = 1, \ldots, n. \]
If any $x_i = 0$, then that data point is discarded and the sample size adjusted.

2. Sort the absolute values $s_1, \ldots, s_n$ of $x_1, x_2, \ldots, x_n$ into ascending order, and assign
ranks 1 up to $n$. If there are ties, assign average ranks.

3. Form the two rank sums $T_+$ and $T_-$, where
\[ T_+ = \text{Sum of ranks for those } x_i > 0 \]
\[ T_- = \text{Sum of ranks for those } x_i < 0 \]
The test statistic is a function of these rank sums. Heuristically, if $T_+$ is large and $T_-$ is
small, this implies that the experimental units where $y_{i1} > y_{i2}$ have a larger (in magnitude)
difference than those where $y_{i1} < y_{i2}$. This indicates an overall decrease between the first and
second measurements. Conversely, if $T_-$ is large and $T_+$ is small, this implies that the
experimental units where $y_{i2} > y_{i1}$ have a larger (in magnitude) difference than those where
$y_{i2} < y_{i1}$. This indicates an overall increase between the first and second measurements.
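The three steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `average_ranks` and `signed_rank_sums` are hypothetical, the data are made up, and the rounding of differences (the `decimals` parameter) is an added assumption to stop floating-point noise from breaking ties between differences that are equal on paper.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block [start, end] over all values tied with the first one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks

def signed_rank_sums(first, second, decimals=6):
    """Return (T_plus, T_minus, n) after discarding zero differences."""
    diffs = [round(a - b, decimals) for a, b in zip(first, second)]
    diffs = [d for d in diffs if d != 0]             # step 1: drop x_i = 0
    ranks = average_ranks([abs(d) for d in diffs])   # step 2: rank |x_i|
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)   # step 3
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus, len(diffs)

# Hypothetical paired data: differences are -2, 0, 1, 1, -2.
first = [10, 12, 9, 14, 11]
second = [12, 12, 8, 13, 13]
t_plus, t_minus, n_used = signed_rank_sums(first, second)
print(t_plus, t_minus, n_used)
```

A useful check is that the two rank sums always add up to $n(n+1)/2$ for the adjusted sample size $n$.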
We test the null hypothesis
H0 : No change between first and second measurements
against the three alternative hypotheses
(1) Ha : Significant decrease between first and second measurements
(2) Ha : Significant increase between first and second measurements
(3) Ha : Significant change between first and second measurements

To test $H_0$ vs (1), we perform a one-sided test using the statistic $T_-$; the critical value in the
test is denoted $T_0$, and is determined by the table on p. 839 of McClave and Sincich:
If $T_- \leq T_0$, we reject $H_0$ in favour of $H_a$ (1)

To test $H_0$ vs (2), we perform a one-sided test using the statistic $T_+$; the critical value is $T_0$ and
If $T_+ \leq T_0$, we reject $H_0$ in favour of $H_a$ (2)

To test $H_0$ vs (3), we perform a two-sided test using the statistic $T = \min\{T_-, T_+\}$; the critical
value is $T_0$ and
If $T \leq T_0$, we reject $H_0$ in favour of $H_a$ (3)
Notes :
1. The only assumption behind the test is that the difference data xi are drawn independently from
a continuous distribution.

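2. Large Sample Test: For $n \geq 25$, we can use a large sample version of the test based on $T_+$, and
the Z statistic
\[ Z = \frac{T_+ - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} \]
If $H_0$ is true, then $Z$ is approximately distributed as Normal(0, 1), so that the test at
$\alpha = 0.05$ uses the following critical values
For $H_a$ (1) use CR = 1.645 (reject $H_0$ if $Z > 1.645$)
For $H_a$ (2) use CR = −1.645 (reject $H_0$ if $Z < -1.645$)
For $H_a$ (3) use CR = ±1.960 (reject $H_0$ if $|Z| > 1.960$)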
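The large sample test can be sketched as follows. The function names `signed_rank_z` and `reject_h0`, the labels for the alternatives, and the values of $T_+$ and $n$ are hypothetical; the critical values are those quoted above for $\alpha = 0.05$.

```python
import math

def signed_rank_z(t_plus, n):
    """Large sample Z statistic based on T_plus (intended for n >= 25)."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t_plus - mean) / sd

def reject_h0(z, alternative):
    """Decision at alpha = 0.05 for Ha (1) 'decrease', (2) 'increase', (3) 'change'."""
    if alternative == "decrease":   # large T_plus, so large positive Z
        return z > 1.645
    if alternative == "increase":   # small T_plus, so large negative Z
        return z < -1.645
    if alternative == "change":     # two-sided
        return abs(z) > 1.960
    raise ValueError("alternative must be 'decrease', 'increase' or 'change'")

# Hypothetical values: n = 30 non-zero differences with T_plus = 330.
z = signed_rank_z(330, 30)
print(round(z, 3), reject_h0(z, "decrease"), reject_h0(z, "change"))
```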

EXAMPLE 1: Haemodialysis Data


The following data are measurements of the heparin cofactor II (HCII) to plasma protein ratios in a
group of patients at baseline and five months after haemodialysis.
Reference: Toulon, P et al. (1987) Antithrombin III and heparin cofactor II in patients with chronic renal
failure undergoing regular hemodialysis, Thrombosis and Haemostasis, 3;57(3): pp263-8.

Patient   Before   After
           y_i1     y_i2     x_i     s_i    Rank   Ave. Rank
   1       2.11     2.15    -0.04    0.04     3       3.5
   2       1.85     2.11    -0.26    0.26    10      10.0
   3       1.82     1.93    -0.11    0.11     8       8.5
   4       1.75     1.83    -0.08    0.08     6       6.0
   5       1.54     1.90    -0.36    0.36    11      11.0
   6       1.52     1.56    -0.04    0.04     3       3.5
   7       1.49     1.44     0.05    0.05     5       5.0
   8       1.44     1.43     0.01    0.01     1       1.5
   9       1.38     1.28     0.10    0.10     7       7.0
  10       1.30     1.30     0.00    0.00   OMIT     OMIT
  11       1.20     1.21    -0.01    0.01     1       1.5
  12       1.19     1.30    -0.11    0.11     8       8.5

T+ = 13.5
T− = 52.5

From the table on p. 839, for $n = 12 - 1 = 11$, we find that the $\alpha = 0.025/0.05$ (one/two-sided)
significance level critical value is $T_0 = 11$. Thus using $T_+$, we cannot reject $H_0$ in favour of
either alternative (2) or (3), as $T_+ = 13.5 > T_0$. Note that $Z = -1.734$, so if the large sample
approximation were valid (here $n = 11 < 25$), we would be able to reject $H_0$ in favour of (2) at
$\alpha = 0.05$.
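For reference, the quoted value of $Z$ follows from substituting $n = 11$ and $T_+ = 13.5$ into the large sample formula:

```latex
\[
\frac{n(n+1)}{4} = \frac{11 \times 12}{4} = 33,
\qquad
\frac{n(n+1)(2n+1)}{24} = \frac{11 \times 12 \times 23}{24} = 126.5,
\]
\[
Z = \frac{13.5 - 33}{\sqrt{126.5}} \approx \frac{-19.5}{11.247} \approx -1.734 .
\]
```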

2. THREE OR MORE INDEPENDENT SAMPLES:
THE KRUSKAL-WALLIS AND FRIEDMAN TESTS
We now seek non-parametric tests that can be used for multiple independent samples, such as those
found in the Completely Randomized Design (CRD) and Randomized Block Design (RBD) described
in the ANOVA section.
The non-parametric equivalents of the ANOVA F-tests for these two designs are
• The Kruskal-Wallis H test for a Completely Randomized Design
• Friedman’s test for a Randomized Block Design

2.1 Kruskal-Wallis Test


In a CRD, we have $k$ independent groups, corresponding to $k$ different treatments, with sample sizes
$n_1, \ldots, n_k$. Let $n = n_1 + \cdots + n_k$. To compute the test statistic, $H$, we
1. Pool the data, sort them into ascending order, and assign ranks. If there are ties in the data, then
average ranks are used.

2. For $j = 1, \ldots, k$, compute the rank sum $R_j$
\[ R_j = \text{Sum of ranks for data from sample } j. \]
To test the hypothesis
$H_0$ : No difference between the population distributions of the $k$ groups
$H_a$ : At least two population distributions differ
the test statistic is
\[ H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1) \]
If $H_0$ is true, then for large $n$, $H$ is approximately distributed as Chisquared($k - 1$).
Notes :
1. The test assumes that the k samples are independently drawn from continuous populations.

2. For the approximation to be valid, there should be at least five observations in each sample, and
the number of ties should be small.
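The computation of $H$ can be sketched as below. The function name `kruskal_wallis_h` and the data are hypothetical; with only three observations per group the Chisquared approximation would not be appropriate, so the sketch illustrates the arithmetic only.

```python
def kruskal_wallis_h(samples):
    """Return (H, rank_sums) for a list of k samples, using average ranks for ties."""
    # Pool the data, remembering which sample each value came from.
    pooled = [(value, j) for j, sample in enumerate(samples) for value in sample]
    pooled.sort(key=lambda pair: pair[0])
    n = len(pooled)
    rank_sums = [0.0] * len(samples)
    start = 0
    while start < n:
        end = start
        # Extend the block [start, end] over all values tied with the first one.
        while end + 1 < n and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            rank_sums[pooled[pos][1]] += avg_rank
        start = end + 1
    total = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, rank_sums

# Hypothetical data for k = 3 groups.
h, rank_sums = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(rank_sums, round(h, 4))
```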

EXAMPLE 2: Mucociliary efficiency data


The following data are measures of mucociliary efficiency from the rate of removal of dust in normal
subjects (Group 1), subjects with obstructive airway disease (Group 2), and subjects with asbestosis
(Group 3).
Reference: Hollander, M. and Wolfe, D. A. (1973), Nonparametric statistical inference, New
York: John Wiley & Sons, pp. 115-120.

Group    1    1    1    1    1    2    2    2    2    3    3    3    3    3
y      2.9  3.0  2.5  2.6  3.2  3.8  2.7  4.0  2.4  2.8  3.4  3.7  2.2  2.0
Rank     8    9    4    5   10   13    6   14    3    7   11   12    2    1

Hence $R_1 = 36$, $R_2 = 36$ and $R_3 = 33$, and the test statistic is $H = 0.7714$. To complete the test,
we compare with the upper $\alpha = 0.05$ critical value of the Chisquared($k - 1$) = Chisquared(2)
distribution. We have
Chisq_{0.05}(2) = 5.99 > H ∴ No evidence to reject $H_0$
and a p-value of p = 0.680.
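The arithmetic behind these figures, with $n = 14$ and sample sizes $n_1 = 5$, $n_2 = 4$, $n_3 = 5$, can be written out as follows. The p-value uses the fact that the upper tail probability of a Chisquared(2) distribution at $h$ is $e^{-h/2}$.

```latex
\[
H = \frac{12}{14 \times 15}\left(\frac{36^2}{5} + \frac{36^2}{4} + \frac{33^2}{5}\right) - 3 \times 15
  = \frac{12}{210}\,(259.2 + 324 + 217.8) - 45
  = \frac{12 \times 801}{210} - 45 \approx 0.7714,
\]
\[
p = P\left(\chi^2_2 > 0.7714\right) = e^{-0.7714/2} \approx 0.680 .
\]
```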

2.2 Friedman Test
In an RBD, we have $k$ treatment groups, and a blocking factor. For example, we might have $k$ repeated
measurements on the same $b$ experimental units, and $n = bk$ observations in total. To compute the test
statistic, $F_r$, we proceed as follows.
1. Within each block separately, sort the k data values into ascending order, and assign ranks. If
there are ties in the data, then average ranks are used.

2. For $j = 1, \ldots, k$, compute the rank sum $R_j$
\[ R_j = \text{Sum of ranks for data from treatment } j. \]
To test the hypothesis
$H_0$ : No difference between the population distributions of the $k$ treatment groups
$H_a$ : At least two population distributions differ
the test statistic is
\[ F_r = \frac{12}{bk(k+1)} \sum_{j=1}^{k} R_j^2 - 3b(k+1) \]
If $H_0$ is true, then for large $n$, $F_r$ is approximately distributed as Chisquared($k - 1$).
Notes :
1. The test assumes that the data are drawn independently from continuous populations, with
random assignment of treatments within blocks.

2. For the approximation to be valid, it is recommended that b or k is at least five, and the number
of ties should be small.
EXAMPLE 3: Skin potential under hypnosis
A study was conducted to investigate whether hypnosis has the same effect on skin potential for four
different emotions. Eight subjects were asked to display fear, joy, sadness and calmness under
hypnosis, and the resulting skin potential (measured in millivolts) was recorded for each emotion.
Thus in this experiment, $b = 8$ and $k = 4$.

               Fear           Joy          Sadness        Calmness
Subject      y    Rank      y    Rank      y    Rank      y    Rank
   1       23.1    4      22.7    3      22.5    1      22.6    2
   2       57.6    4      53.2    2      53.7    3      53.1    1
   3       10.5    3       9.7    2      10.8    4       8.3    1
   4       23.6    4      19.6    1      21.1    2      21.6    3
   5       11.9    1      13.8    4      13.7    3      13.3    2
   6       54.6    4      47.1    3      39.2    2      37.0    1
   7       21.0    4      13.6    1      13.7    2      14.8    3
   8       20.3    3      23.6    4      16.3    2      14.8    1
Rank Sum          27             20             19             14

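Thus the within-treatment rank sums are
\[ R_1 = 27, \qquad R_2 = 20, \qquad R_3 = 19, \qquad R_4 = 14 \]
and thus
\[ F_r = \frac{12}{8 \times 4 \times 5}\left(27^2 + 20^2 + 19^2 + 14^2\right) - 3 \times 8 \times 5 = 6.45 \]
To complete the test, we compare with the upper $\alpha = 0.05$ critical value of the
Chisquared($k - 1$) = Chisquared(3) distribution. We have
Chisq_{0.05}(3) = 7.81 > $F_r$ ∴ No evidence to reject $H_0$
and a p-value of p = 0.092.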
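The rank sums and the value of $F_r$ in Example 3 can be reproduced with a short sketch. The function name `friedman_statistic` is hypothetical; the data are the skin potential measurements from the table, with columns in the order Fear, Joy, Sadness, Calmness.

```python
def friedman_statistic(blocks):
    """blocks: list of b rows, each holding k treatment values. Returns (Fr, rank_sums)."""
    b = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        # Rank the k values within this block only.
        order = sorted(range(k), key=lambda j: row[j])
        start = 0
        while start < k:
            end = start
            # Extend the block [start, end] over values tied with the first one.
            while end + 1 < k and row[order[end + 1]] == row[order[start]]:
                end += 1
            avg_rank = (start + end) / 2 + 1  # average rank for ties
            for pos in range(start, end + 1):
                rank_sums[order[pos]] += avg_rank
            start = end + 1
    fr = 12 / (b * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * b * (k + 1)
    return fr, rank_sums

# Skin potential data (millivolts): one row per subject.
skin_potential = [
    [23.1, 22.7, 22.5, 22.6],
    [57.6, 53.2, 53.7, 53.1],
    [10.5, 9.7, 10.8, 8.3],
    [23.6, 19.6, 21.1, 21.6],
    [11.9, 13.8, 13.7, 13.3],
    [54.6, 47.1, 39.2, 37.0],
    [21.0, 13.6, 13.7, 14.8],
    [20.3, 23.6, 16.3, 14.8],
]
fr, rank_sums = friedman_statistic(skin_potential)
print(rank_sums, round(fr, 2))
```

Since every block contributes ranks 1 to $k$, the rank sums must total $bk(k+1)/2 = 80$ here, which gives a quick check on the ranking.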
