Goodness of Fit Techniques


STATISTICS: Textbooks and Monographs, Volume 68

GOODNESS-OF-FIT
TECHNIQUES
[Cover figure: an empirical distribution function plot]
edited by
Ralph B. D’Agostino
Michael A. Stephens
GOODNESS-OF-FIT TECHNIQUES

STATISTICS: Textbooks and Monographs

A SERIES EDITED BY

D. B. OWEN, Coordinating Editor

Department of Statistics
Southern Methodist University
Dallas, Texas

Vol. 1: The Generalized Jackknife Statistic, H. L. Gray and W. R. Schucany
Vol. 2: Multivariate Analysis, Anant M. Kshirsagar
Vol. 3: Statistics and Society, Walter T. Federer
Vol. 4: Multivariate Analysis: A Selected and Abstracted Bibliography, 1957-1972, Kocherlakota Subrahmaniam and Kathleen Subrahmaniam (out of print)
Vol. 5: Design of Experiments: A Realistic Approach, Virgil L. Anderson and Robert A. McLean
Vol. 6: Statistical and Mathematical Aspects of Pollution Problems, John W. Pratt
Vol. 7: Introduction to Probability and Statistics (in two parts), Part I: Probability; Part II: Statistics, Narayan C. Giri
Vol. 8: Statistical Theory of the Analysis of Experimental Designs, J. Ogawa
Vol. 9: Statistical Techniques in Simulation (in two parts), Jack P. C. Kleijnen
Vol. 10: Data Quality Control and Editing, Joseph I. Naus (out of print)
Vol. 11: Cost of Living Index Numbers: Practice, Precision, and Theory, Kali S. Banerjee
Vol. 12: Weighing Designs: For Chemistry, Medicine, Economics, Operations Research, Statistics, Kali S. Banerjee
Vol. 13: The Search for Oil: Some Statistical Methods and Techniques, edited by D. B. Owen
Vol. 14: Sample Size Choice: Charts for Experiments with Linear Models, Robert E. Odeh and Martin Fox
Vol. 15: Statistical Methods for Engineers and Scientists, Robert M. Bethea, Benjamin S. Duran, and Thomas L. Boullion
Vol. 16: Statistical Quality Control Methods, Irving W. Burr
Vol. 17: On the History of Statistics and Probability, edited by D. B. Owen
Vol. 18: Econometrics, Peter Schmidt
Vol. 19: Sufficient Statistics: Selected Contributions, Vasant S. Huzurbazar (edited by Anant M. Kshirsagar)
Vol. 20: Handbook of Statistical Distributions, Jagdish K. Patel, C. H. Kapadia, and D. B. Owen
Vol. 21: Case Studies in Sample Design, A. C. Rosander
Vol. 22: Pocket Book of Statistical Tables, compiled by R. E. Odeh, D. B. Owen, Z. W. Birnbaum, and L. Fisher
Vol. 23: The Information in Contingency Tables, D. V. Gokhale and Solomon Kullback
Vol. 24: Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, Lee J. Bain
Vol. 25: Elementary Statistical Quality Control, Irving W. Burr
Vol. 26: An Introduction to Probability and Statistics Using BASIC, Richard A. Groeneveld
Vol. 27: Basic Applied Statistics, B. L. Raktoe and J. J. Hubert
Vol. 28: A Primer in Probability, Kathleen Subrahmaniam
Vol. 29: Random Processes: A First Look, R. Syski
Vol. 30: Regression Methods: A Tool for Data Analysis, Rudolf J. Freund and Paul D. Minton
Vol. 31: Randomization Tests, Eugene S. Edgington
Vol. 32: Tables for Normal Tolerance Limits, Sampling Plans, and Screening, Robert E. Odeh and D. B. Owen
Vol. 33: Statistical Computing, William J. Kennedy, Jr. and James E. Gentle
Vol. 34: Regression Analysis and Its Application: A Data-Oriented Approach, Richard F. Gunst and Robert L. Mason
Vol. 35: Scientific Strategies to Save Your Life, I. D. J. Bross
Vol. 36: Statistics in the Pharmaceutical Industry, edited by C. Ralph Buncher and Jia-Yeong Tsay
Vol. 37: Sampling from a Finite Population, J. Hájek
Vol. 38: Statistical Modeling Techniques, S. S. Shapiro
Vol. 39: Statistical Theory and Inference in Research, T. A. Bancroft and C.-P. Han
Vol. 40: Handbook of the Normal Distribution, Jagdish K. Patel and Campbell B. Read
Vol. 41: Recent Advances in Regression Methods, Hrishikesh D. Vinod and Aman Ullah
Vol. 42: Acceptance Sampling in Quality Control, Edward G. Schilling
Vol. 43: The Randomized Clinical Trial and Therapeutic Decisions, edited by Niels Tygstrup, John M. Lachin, and Erik Juhl
Vol. 44: Regression Analysis of Survival Data in Cancer Chemotherapy, Walter H. Carter, Jr., Galen L. Wampler, and Donald M. Stablein
Vol. 45: A Course in Linear Models, Anant M. Kshirsagar
Vol. 46: Clinical Trials: Issues and Approaches, edited by Stanley H. Shapiro and Thomas H. Louis
Vol. 47: Statistical Analysis of DNA Sequence Data, edited by B. S. Weir
Vol. 48: Nonlinear Regression Modeling: A Unified Practical Approach, David A. Ratkowsky
Vol. 49: Attribute Sampling Plans, Tables of Tests and Confidence Limits for Proportions, Robert E. Odeh and D. B. Owen
Vol. 50: Experimental Design, Statistical Models, and Genetic Statistics, edited by Klaus Hinkelmann
Vol. 51: Statistical Methods for Cancer Studies, edited by Richard G. Cornell
Vol. 52: Practical Statistical Sampling for Auditors, Arthur J. Wilburn
Vol. 53: Statistical Signal Processing, edited by Edward J. Wegman and James G. Smith
Vol. 54: Self-Organizing Methods in Modeling: GMDH Type Algorithms, edited by Stanley J. Farlow
Vol. 55: Applied Factorial and Fractional Designs, Robert A. McLean and Virgil L. Anderson
Vol. 56: Design of Experiments: Ranking and Selection, edited by Thomas J. Santner and Ajit C. Tamhane
Vol. 57: Statistical Methods for Engineers and Scientists, Second Edition, Revised and Expanded, Robert M. Bethea, Benjamin S. Duran, and Thomas L. Boullion
Vol. 58: Ensemble Modeling: Inference from Small-Scale Properties to Large-Scale Systems, Alan E. Gelfand and Crayton C. Walker
Vol. 59: Computer Modeling for Business and Industry, Bruce L. Bowerman and Richard T. O'Connell
Vol. 60: Bayesian Analysis of Linear Models, Lyle D. Broemeling
Vol. 61: Methodological Issues for Health Care Surveys, Brenda Cox and Steven Cohen
Vol. 62: Applied Regression Analysis and Experimental Design, Richard J. Brook and Gregory C. Arnold
Vol. 63: Statpal: A Statistical Package for Microcomputers - PC-DOS Version for the IBM PC and Compatibles, Bruce J. Chalmer and David G. Whitmore
Vol. 64: Statpal: A Statistical Package for Microcomputers - Apple Version for the II, II+, and IIe, David G. Whitmore and Bruce J. Chalmer
Vol. 65: Nonparametric Statistical Inference, Second Edition, Revised and Expanded, Jean Dickinson Gibbons
Vol. 66: Design and Analysis of Experiments, Roger G. Petersen
Vol. 67: Statistical Methods for Pharmaceutical Research Planning, Sten W. Bergman and John C. Gittins
Vol. 68: Goodness-of-Fit Techniques, edited by Ralph B. D'Agostino and Michael A. Stephens

OTHER VOLUMES IN PREPARATION


AN IMPORTANT MESSAGE TO READERS . . .

A Marcel Dekker, Inc. Facsimile Edition contains the exact contents of an original hard cover MDI published work but in a new soft sturdy cover.

Reprinting scholarly works in an economical format assures readers that important information they need remains accessible. Facsimile Editions provide a viable alternative to books that could go "out of print." Utilizing a contemporary printing process for Facsimile Editions, scientific and technical books are reproduced in limited quantities to meet demand.

Marcel Dekker, Inc. is pleased to offer this specialized service to its readers in the academic and scientific communities.
GOODNESS-OF-FIT TECHNIQUES

edited by

Ralph B. D'Agostino
Department of Mathematics
Boston University
Boston, Massachusetts

Michael A. Stephens
Department of Mathematics and Statistics
Simon Fraser University
Burnaby, British Columbia, Canada

MARCEL DEKKER, INC.  New York and Basel


Library of Congress Cataloging-in-Publication Data

Goodness-of-fit techniques.

(Statistics, textbooks and monographs ; vol. 68)

Includes bibliographies and index.
1. Goodness-of-fit tests. I. D'Agostino, Ralph B.
II. Stephens, Michael A., [date]. III. Series:
Statistics, textbooks and monographs ; v. 68.
QA277.G645 1986  519.5'6  86-4571
ISBN: 0-8247-8705-6

COPYRIGHT © 1986 by MARCEL DEKKER, INC. ALL RIGHTS RESERVED

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016

Current printing (last digit):
10 9 8 7 6 5 4 3

PRINTED IN THE UNITED STATES OF AMERICA


To Egon S. Pearson
who helped us both
Preface

From the earliest days of statistics, statisticians have begun their analysis by proposing a distribution for their observations and then, perhaps with somewhat less enthusiasm, have checked on whether this distribution is true. Thus over the years a vast number of test procedures have appeared, and the study of these procedures has come to be known as goodness-of-fit. When several of the present authors met at the Annual Meeting of the American Statistical Association in Boston in 1976 and proposed writing a book on goodness-of-fit techniques, we certainly did not foresee the magnitude of the task ahead. Quite early on we asked Professor E. S. Pearson if he would join us. He declined and stated his view that the time was not yet ripe for a book on the subject. As we, nevertheless, have slowly written it, it has often appeared that his assessment was correct. As fast as we have tried to survey what we know, with every issue the journals produce new papers with new techniques and new information.

However, many colleagues have told us that the time is ready for a major summary of the literature, and for some sorting and sifting to take place. This we have tried to do. The emphasis of this book was determined by the writers to be mostly on the practical side. The intent is to give a survey of the leading methods of testing fit, to provide tables where necessary to make the tests available, to make (where possible) some assessment of the comparative merits of different test procedures, and finally to supply numerical examples to aid in understanding the techniques.

This applied emphasis has led to some difficult decisions. Many goodness-of-fit techniques are supported by elegant mathematics involving combinatorics, analysis, and geometric probability, mostly arising in the distribution theory, both small-sample and asymptotic, or in examining power and efficiency. Furthermore, there are many unsolved problems, especially in discovering the relationships between different approaches, which would require sophisticated mathematics to resolve. However, for the book to be of manageable size, mathematical details have had to be held to the minimum necessary to describe clearly the various techniques. References to fuller mathematical treatments are given throughout the book. We also leave out tests comparing several samples. Although these are often closely related to the one-sample tests of this volume, they are not usually classified as goodness-of-fit tests. Including them here would have made the book too large.

In arranging the book it was necessary to decide whether to collect together all methods of testing for specific famous distributions, such as the normal or the exponential, or whether to group tests according to techniques such as chi-squared tests, empirical distribution function tests, or tests based on probability plotting. In the end, and perhaps because of the fact that many authors were involved, we reached the inevitable compromise to try to do both. In order to make chapters as complete as possible, there is some necessary overlap.
There is also some imbalance with respect to tables. Some major, well-established techniques require quite small tables—surely an attractive feature—while many new and unproven techniques need fairly extensive tables, often based on Monte Carlo studies. Where we have judged the techniques important, either as new methods or to complete a group of existing methods, we have included the necessary tables and, in fact, have considerably extended some of those in the literature. By doing so we hope not only to make the newer techniques available for practical use but also to make the book useful for further research in making the comparisons between methods which we feel are still necessary. On the other hand, we at times only refer to tables for some techniques which have never appeared to win much favor.

As we have surveyed the tests available, it has become clear that much work remains to be done. It sometimes seems that new test statistics, even for standard problems, are invented every day. In goodness-of-fit, where there is a wide range of problems and almost never a best solution, this appears to be easy to do. However, the simple invention of a test statistic is surely not enough. We suggest that, to gain acceptance, new methods should have a clear motivation, be easily understood by the practical statistician, and be well documented. Where new tables are necessary, they should be comprehensive. The day may well come when computer algorithms will replace tables, but for most statisticians this day has not yet arrived. Also new methods should be compared with the array of procedures which often already exist.

Finally, of course, this book is inevitably a reflection of the interests of the editors and the contributors. Although we have tried to cast our net wide, some special techniques for testing fit may seem to have received too much attention, while others have been neglected. For the latter cases it is, we believe, mostly because the practical aspects are not yet sufficiently developed. We have attempted throughout the book to at least summarize the present state of knowledge for these. We hope that by drawing attention to them, we can again encourage further research.

We also acknowledge gratefully Ms. Sylvia Holmes and Mr. Thomas Orowan of Simon Fraser University and Boston University, respectively, for much help with the typing of the manuscript, and the staff of Marcel Dekker, Inc., for their patient editorial work with this volume.

RALPH B. D'AGOSTINO
MICHAEL A. STEPHENS
Acknowledgments

We acknowledge with thanks the permission of numerous publishers, editors, and authors to reprint their tables and graphs. We thank the Addison-Wesley publishers of Reading, Mass., for permission to publish the normal distribution table [from Mosteller, Rourke, and Thomas, Probability with Statistical Applications; Table 1 (adapted)] and the chi-square table [from D. B. Owen, Handbook of Statistical Tables (1962)]. These tables are in the Appendix. We thank the editor of the Annals of Statistics and the Institute of Mathematical Statistics for permission to reprint an adaptation of a table from Lewis (1961). This table is in Chapter 4.

We thank the editor of Biometrika and the Biometrika Trustees for permission to reprint tables and graphs from a number of articles for various chapters in this book. The chapters, authors, and publication years are as follows: Chapter 2, Harter (1961); Chapter 4, Lockhart and Stephens (1985), Pettitt and Stephens (1976), Pettitt (1976, 1977), and Stephens (1977, 1979); Chapter 5, Shapiro and Wilk (1965); Chapter 7, Bowman and Shenton (1975); Chapter 8, Stephens (1966); and Chapter 9, D'Agostino (1971, 1972), D'Agostino and Tietjen (1971, 1973), and D'Agostino and Pearson (1973). Also in Chapter 4 is an adaptation of Table 54 from Biometrika Tables for Statisticians, Volume 2 (1972).

We thank the editor of Technometrics and the American Statistical Association for permission to reprint tables and data: Chapter 4, Proschan (1963), Koziol and Byar (1975), and Pettitt and Stephens (1977); Chapter 5, Shapiro and Wilk (1972); and Chapter 12, Grubbs (1965), Tietjen and Moore (1972), Stefansky (1972), Lund (1975), Galpin and Hawkins (1981), and Jain (1981).

We thank the editor of the Journal of the American Statistical Association and the American Statistical Association for permission to reprint tables from Stephens (1974), and Chandra, Singpurwalla, and Stephens (1981) for Chapter 4.

We thank the editors of the Journal of the Royal Statistical Society and the Royal Statistical Society for permission to reprint tables from Stephens (1970), Pettitt (1977), Burrows (1979), and Stephens (1981) in Chapters 4 and 8. We thank the editor of Communications in Statistics and Marcel Dekker, Inc., for permission to reprint tables from Miller and Quesenberry (1979), and Solomon and Stephens (1983) in Chapter 8. Finally we thank the editors of the Journal of Statistical Computation and Simulation and Gordon and Breach Science Publishers, Inc., for permission to reprint a table from Quesenberry and Miller (1977).
Contents

Preface v
Acknowledgments ix
Contributors xvii

1. OVERVIEW
Ralph B. D'Agostino and Michael A. Stephens

1.1 Goodness-of-Fit Techniques 1
1.2 Objectives of the Book 3
1.3 The Topics of the Book 4

2. GRAPHICAL ANALYSIS
Ralph B. D'Agostino

2.1 Introduction 7
2.2 Empirical Cumulative Distribution Function 8
2.3 General Concepts of Probability Plotting 24
2.4 Normal Probability Plotting 35
2.5 Lognormal Probability Plotting 47
2.6 Weibull Probability Plotting 54
2.7 Other Topics 57
2.8 Concluding Comment 59
References 59


3. TESTS OF CHI-SQUARED TYPE 63
David S. Moore

3.1 Introduction 63
3.2 Classical Chi-Squared Statistics 64
3.3 General Chi-Squared Statistics 75
3.4 Recommendations on Use of Chi-Squared Tests 91
References 93

4. TESTS BASED ON EDF STATISTICS 97
Michael A. Stephens

4.1 Introduction 97
4.2 Empirical Distribution Function Statistics 97
4.3 Goodness-of-Fit Tests Based on the EDF (EDF Tests) 102
4.4 EDF Tests for a Fully Specified Distribution (Case 0) 104
4.5 Comments on EDF Tests for Case 0 106
4.6 Power of EDF Statistics for Case 0 110
4.7 EDF Tests for Censored Data: Case 0 111
4.8 EDF Tests for the Normal Distribution with Unknown Parameters 122
4.9 EDF Tests for the Exponential Distribution 133
4.10 EDF Tests for the Extreme-Value Distribution 145
4.11 EDF Tests for the Weibull Distribution 149
4.12 EDF Tests for the Gamma Distribution 151
4.13 EDF Tests for the Logistic Distribution 156
4.14 EDF Tests for the Cauchy Distribution 160
4.15 EDF Tests for the von Mises Distribution 164
4.16 EDF Tests for Continuous Distributions: Miscellaneous Topics 166
4.17 EDF Tests for Discrete Distributions 171
4.18 Combinations of Tests 176
4.19 EDF Statistics as Indicators of Parent Populations 180
4.20 Tests Based on Normalized Spacings 180
References 185

5. TESTS BASED ON REGRESSION AND CORRELATION 195
Michael A. Stephens

5.1 Introduction 195
5.2 Regression Tests: Models 196
5.3 Measure of Fit 197
5.4 Tests Based on the Correlation Coefficient 198
5.5 The Correlation Tests for the Uniform Distribution with Unknown Limits 199
5.6 The Correlation Test for U(0,1) 201
5.7 Regression Tests for the Normal Distribution 1 201
5.8 Regression Tests Based on Residuals 205
5.9 Tests Based on the Ratio of Two Estimates of Scale 206
5.10 Regression Tests for the Normal Distribution 2 207
5.11 Regression Tests for the Exponential Distribution 215
5.12 Tests Based on the Ratio of Two Estimates of Scale: Further Comments 223
5.13 Regression Tests for Other Distributions: General Comments 224
5.14 Correlation Tests for the Extreme-Value Distribution 225
5.15 Correlation Tests for Other Distributions 225
References 230

6. SOME TRANSFORMATION METHODS IN GOODNESS-OF-FIT 235
Charles P. Quesenberry

6.1 Introduction 235
6.2 Probability Integral Transformations 239
6.3 Some Properties of CPITs 244
6.4 Testing Simple Uniformity 246
6.5 Transformations for Particular Families 252
6.6 Numerical Examples 260
References 275

7. MOMENT (√b1, b2) TECHNIQUES 279
K. O. Bowman and L. R. Shenton

7.1 Introduction 279
7.2 Normal Distribution 280
7.3 Nonnormal Sampling 287
7.4 Moments of Sample Moments 288
7.5 The Correlation Between √b1 and b2 292
7.6 Simultaneous Behavior of √b1 and b2 295
7.7 A Bivariate Model 306
7.8 Experimental Samples 316
References 318

8. TESTS FOR THE UNIFORM DISTRIBUTION 331
Michael A. Stephens

8.1 Introduction 331
8.2 Notation 332
8.3 Transformations to Uniforms 332
8.4 Transformation from Uniforms to Uniforms 333
8.5 Superuniform Observations 334
8.6 Tests Based on the Empirical Distribution Function (EDF) 334
8.7 Regression and Correlation Tests 336
8.8 Other Tests Based on Order Statistics 336
8.9 Statistics Based on Spacings 338
8.10 Statistics for Special Alternatives 345
8.11 The Neyman-Barton Smooth Tests 351
8.12 Components of Test Statistics 355
8.13 The Effect on Test Statistics of Certain Patterns of U-Values 356
8.14 Power of Test Statistics 357
8.15 Statistics for Combining Independent Tests for Several Samples 357
8.16 Tests for a Uniform Distribution with Unknown Limits 360
8.17 Tests for Censored Uniform Samples 361
References 361

9. TESTS FOR THE NORMAL DISTRIBUTION 367
Ralph B. D'Agostino

9.1 Introduction 367
9.2 Complete Random Samples 368
9.3 Classification of Existing Tests 370
9.4 Comparisons of Tests 403
9.5 Recommendations 405
9.6 Tests of Normality on Residuals 406
9.7 Multivariate Normality 409
References 413

10. TESTS FOR THE EXPONENTIAL DISTRIBUTION 421
Michael A. Stephens

10.1 Introduction and Contents 421
10.2 Notation 424
10.3 Tests for Exponentiality: The Four Cases 425
10.4 Applications of the Exponential Distribution 426
10.5 Transformations from Exponentials to Exponentials or to Uniforms 429
10.6 Test Situations and Choice of Procedures 432
10.7 Tests with Origin Known: Groups 1, 2, and 3 435
10.8 Group 1 Tests 435
10.9 Group 2 Tests, Applied to U = JX 438
10.10 The Effect of Zero Values, and of Ties 444
10.11 Group 3 Tests Applied to X' = NX, or to U* = KX 445
10.12 Discussion of the Data Set 451
10.13 Evaluation of Tests for Exponentiality 451
10.14 Tests with Origin and Scale Unknown 455
10.15 Summary 456
References 457

11. ANALYSIS OF DATA FROM CENSORED SAMPLES 461
John R. Michael and William R. Schucany

11.1 Introduction 461
11.2 Probability Plots 463
11.3 Testing a Simple Null Hypothesis 480
11.4 Testing a Composite Hypothesis 487
References 493

12. THE ANALYSIS AND DETECTION OF OUTLIERS 497
Gary L. Tietjen

12.1 Introduction 497
12.2 A Single Outlier in a Univariate Sample 500
12.3 Multiple Outliers in a Univariate Sample 504
12.4 The Identification of a Single Outlier in Linear Models 507
12.5 Multiple Outliers in the Linear Model 516
12.6 Accommodation of Outliers 517
12.7 Multivariate Outliers 520
12.8 Outliers in Time Series 520
References 521

APPENDIX 523

1. Table 1, Cumulative Distribution Function of the Standard Normal Distribution 524
2. Table 2, Critical Values of the Chi-Square Distribution 526
3. Simulated Data Sets 527
4. Real Data Sets 546

INDEX 551
Contributors

K. O. BOWMAN  Engineering Physics and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee

RALPH B. D'AGOSTINO  Department of Mathematics, Boston University, Boston, Massachusetts

JOHN R. MICHAEL*  Quality Assurance Center, Bell Telephone Laboratories, Holmdel, New Jersey

DAVID S. MOORE  Department of Statistics, Purdue University, West Lafayette, Indiana

C. P. QUESENBERRY  Department of Statistics, North Carolina State University, Raleigh, North Carolina

WILLIAM R. SCHUCANY  Department of Statistics, Southern Methodist University, Dallas, Texas

LEONARD R. SHENTON†  Computer Center, University of Georgia, Athens, Georgia

MICHAEL A. STEPHENS  Department of Mathematics and Statistics, Simon Fraser University, Burnaby, British Columbia, Canada

GARY L. TIETJEN  Analysis and Assessment Division, Los Alamos National Laboratory, Los Alamos, New Mexico

*Current affiliation: Westat, Inc., Rockville, Maryland
†Current affiliation: Advanced Computational Methods Center, University of Georgia, Athens, Georgia
GOODNESS-OF-FIT TECHNIQUES
Overview

Ralph B. D'Agostino  Boston University, Boston, Massachusetts

Michael A. Stephens  Simon Fraser University, Burnaby, B.C., Canada

1.1 GOODNESS-OF-FIT TECHNIQUES

This book is devoted to the presentation and discussion of goodness-of-fit techniques. By these we mean methods of examining how well a sample of data agrees with a given distribution as its population. The techniques discussed are almost entirely for univariate data, for which there is a vast literature; methods for multivariate data are much less well developed.

In the formal framework of hypothesis testing the null hypothesis H0 is that a given random variable x follows a stated probability law F(x) (for example, the normal distribution or the Weibull distribution); the random variable may come from a process which is under investigation. The goodness-of-fit techniques applied to test H0 are based on measuring in some way the conformity of the sample data (a set of x-values) to the hypothesized distribution, or, equivalently, its discrepancy from it. The techniques usually give formal statistical tests and the measures of consistency or of discrepancy are test statistics.

The null hypothesis H0 can be a simple hypothesis, when F(x) is specified completely, for example, normal with mean μ = 100 and standard deviation σ = 10; or H0 can give an incomplete specification and will then be a composite hypothesis, for example, when it states only that F(x) is normal with unspecified μ and σ.
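The simple-hypothesis case can be made concrete with a small computation. The sketch below (not part of the original text) tests the fully specified hypothesis just described, normal with mean 100 and standard deviation 10, using the Kolmogorov statistic D, one of the EDF statistics treated in Chapter 4; the sample values are invented for illustration.

```python
# A sketch of a test of a simple (fully specified) hypothesis: H0 says the
# data come from N(mu = 100, sigma = 10). The Kolmogorov statistic D measures
# the largest vertical gap between the empirical CDF and the hypothesized CDF.
# The sample below is illustrative, not from the book.
from statistics import NormalDist

def ks_statistic(x, cdf):
    """Largest gap between the empirical CDF of x and the hypothesized cdf."""
    xs = sorted(x)
    n = len(xs)
    d = 0.0
    for i, xi in enumerate(xs, start=1):
        f = cdf(xi)
        # The empirical CDF jumps from (i - 1)/n to i/n at xi; check both sides.
        d = max(d, f - (i - 1) / n, i / n - f)
    return d

sample = [92.1, 95.7, 97.4, 99.8, 100.2, 101.3, 103.0, 104.6, 108.2, 111.5]
h0 = NormalDist(mu=100, sigma=10)
D = ks_statistic(sample, h0.cdf)
print(round(D, 3))
```

Large values of D lead to rejection, with critical points as given in Chapter 4. With a composite hypothesis the same statistic can be computed after estimating the parameters, but it then requires different critical points, which is part of what makes the composite case harder.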
In most applications of goodness-of-fit techniques, the alternative hypothesis H1 is composite—it gives little or no information on the distribution of the data, and simply states that H0 is false. The major focus is on the measure of agreement of the data with the null hypothesis; in fact, it is usually hoped to accept that H0 is true.

There are several reasons for this. First, the distribution of sample data may throw light on the process that generated the data; if a suggested model for the process is correct, the sample data follow a specific distribution, which can be tested. Also, parameters of the distribution may be connected with important parameters in describing the basic model. Secondly, knowledge of the distribution of data allows for application of standard statistical testing and estimation procedures. For example, if the data follow a normal distribution, inferences concerning the means and variances can be made using t tests, analyses of variances, and F tests; similarly, if the residuals after fitting a regression model are normal, tests may be made on the model parameters. Estimation procedures such as the calculation of confidence intervals, tolerance intervals, and prediction intervals often depend strongly on the underlying distribution. Finally, when a distribution can be assumed, extreme tail percentiles, which are needed, for example, in environmental work, can be computed.
The fact that it is usually hoped to accept the null hypothesis and proceed with other analyses as if it were true sets goodness-of-fit testing apart from most statistical testing procedures. In many testing situations it is rejection of the null hypothesis which appears to prove a point. This might be so, for example, in a test for no treatment effects in a factorial analysis—rejection of H0 indicates one or more treatments to be better than others. Even when one would like to accept a null hypothesis—for example, in a test for no interaction in the above factorial analysis—the statistical test is usually clear and the only problem is with the level of significance. In a test of fit, where the alternative is very vague, the appropriate statistical test will often be by no means clear, and no general theory of Neyman-Pearson type appears applicable in these situations. Thus many different, sometimes elaborate, procedures have been generated to test the same null hypothesis, and the ideas and motivations behind these are diverse. Even when concepts such as statistical power of the procedures are considered it rarely happens that one testing procedure emerges as superior.

It may happen that the alternative hypothesis has some specification, although it could be incomplete; for example, an alternative to the null hypothesis of normality may be that the random variable has positive skewness. When the alternative distribution contains some such specification, tests of fit should be designed to be sensitive to it. Even in these situations uniquely best tests are rarities.
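As an illustration of a statistic directed at the positive-skewness alternative just mentioned, the sketch below (not from the original text) computes the sample skewness √b1 = m3/m2^(3/2), the moment statistic studied in Chapters 7 and 9; both samples are invented for illustration.

```python
# A directional statistic sensitive to the skewness alternative: the sample
# skewness sqrt(b1) = m3 / m2**1.5, built from the second and third central
# moments. Values well above zero point toward positive skewness.
def sample_skewness(x):
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    return m3 / m2 ** 1.5

# An exponential-looking (right-skewed) sample and a symmetric one.
right_skewed = [0.1, 0.3, 0.4, 0.7, 0.9, 1.2, 1.8, 2.6, 4.0, 7.5]
symmetric = [-3.0, -2.0, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 2.0, 3.0]
print(round(sample_skewness(right_skewed), 2),
      round(sample_skewness(symmetric), 2))
```

A test directed at this alternative rejects normality when √b1 is too large; undirected omnibus tests remain the choice when, as is more common, the alternative carries no such specification.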
In addition to formal hypothesis testing procedures, goodness-of-fit techniques also include less formal methods, in particular, graphical techniques. These have a long history in statistical analysis. Graphs are drawn so that adherence to or deviation from the hypothesized distribution results in certain features of the graph. For example, in the probability plot the ordered observations are plotted against functions of the ranks. In such plots a straight line indicates that the hypothesized distribution is a reasonable model for the data, and deviations from the straight line indicate inappropriateness of the model. The type of departure from the straight line may indicate the nature of the true distribution. Historically the straight line has been judged by eye, and it is only recently that more formal techniques have been given.
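One such formal version of judging the straight line is the correlation approach of Chapter 5: pair the ordered observations with quantiles of the plotting positions and summarize straightness by the correlation coefficient. The sketch below (not from the original text) does this for a normal probability plot; the plotting-position formula (i - 0.5)/n and the sample are illustrative choices.

```python
# A numerical version of the probability plot: ordered observations are
# paired with normal quantiles of the plotting positions (i - 0.5)/n, and the
# straightness of the plot is summarized by the correlation coefficient.
# A correlation well below 1 signals departure from the hypothesized model.
from statistics import NormalDist

def normal_plot_correlation(x):
    xs = sorted(x)
    n = len(xs)
    q = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, mq = sum(xs) / n, sum(q) / n
    sxy = sum((a - mx) * (b - mq) for a, b in zip(xs, q))
    sxx = sum((a - mx) ** 2 for a in xs)
    sqq = sum((b - mq) ** 2 for b in q)
    return sxy / (sxx * sqq) ** 0.5

near_normal = [92.1, 95.7, 97.4, 99.8, 100.2, 101.3, 103.0, 104.6, 108.2, 111.5]
print(round(normal_plot_correlation(near_normal), 3))
```

Exactly normal-shaped data give a correlation of 1; how far below 1 the statistic may fall before the model is rejected is the subject of the tables and tests of Chapter 5.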

1.2 OBJECTIVES OF THE BOOK

There are five major objectives of this book. They are:

1. To identify the major theories behind goodness-of-fit techniques;
2. To present an up-to-date picture of the status of these techniques;
3. To give references to the relevant literature;
4. To illustrate with numerical examples; and
5. To make some recommendations on the use of different techniques.

There are several features that bear mention. F irs t, a substantial


number of num erical examples are included. These are fo r the most part
easy to find. In many chapters subsections containing num erical examples
are identified by the letter E before the section number. F o r example, in
Chapter 9, Section E 9 . 3 . 4 . 1.1 contains a num erical example of the Shapiro-
W ilk test for normality.
Second, a set of data sets is used throughout the book. These allow for
comparisons of some of the techniques on the same data se ts. Some of these
data sets are re a l data and others are simulated. The data sets are given in
full in the appendix.
Third, the chapters contain specific recommendations for use of the test methods. Nevertheless, we have avoided the attempt to present final definitive recommendations. The authors of the chapters of this book each have significant expertise, but there is not always complete agreement among them on what is best. As we stated previously, theory does not exist which can identify the uniquely best procedure for most goodness-of-fit situations, and personal opinion and judgment will often enter any consideration. Each author has made recommendations based on his or her understanding and view of the problem.
Fourth, many references are given. There is an enormous literature and we have made no attempt to survey all of it. We have especially avoided heavy mathematical treatment and the details of theorems. A substantial list of references is given with each chapter; these include references to earlier source material and to the theoretical background of the test procedures; it is hoped they will aid the development of further research.
Finally, we recognize that it is impossible to include all goodness-of-fit topics in this survey; our emphasis is largely on the practical aspects of testing. Some techniques are still underdeveloped, and, for example, suggested tests may lack tables for practical application, or enough comparisons have not been made to assess their merits; for these and similar reasons, some subjects have been lightly treated, if at all.
In goodness-of-fit there are many areas with unsolved problems or unanswered questions. Some of the subjects on which there will surely be much work in the future include tests for censored data, especially for randomly censored data, tests based on the empirical characteristic function, tests based on spacings, and tests for multivariate distributions, especially for multivariate normality. Many comparisons between techniques are still needed, and also the exploration of wider questions such as the relationship of formal goodness-of-fit testing (as, indeed, in other forms of testing) to modern, more informal, approaches to statistical analysis where distributional models are not so rigidly specified. We hope this book sets forth the major topics of its subject, and will act as a base from which these and many other questions can be explored.

1.3 THE TOPICS OF THE BOOK

In addition to this chapter the book consists of eleven other chapters. These are divided into three groups. The first consists of Chapters 2 to 7, containing general concepts applicable to testing for a variety of distributions.
Chapter 2 describes graphical procedures for evaluating goodness-of-fit. These are informal procedures based mainly on the probability plot, useful for exploring data and for supplementing the formal testing procedures of the other chapters.
Chapter 3 reviews chi-square-type tests. The classical chi-square goodness-of-fit tests are reviewed first, and then recent developments involving general quadratic forms and nonstandard chi-squared statistics are also discussed.
Chapter 4 presents tests based on the empirical distribution function (edf). These tests include the classical Kolmogorov-Smirnov test and other tests such as the Cramér-von Mises and Anderson-Darling tests. Consideration is given to simple and composite null hypotheses. The normal, exponential, extreme-value, Weibull, and gamma distributions, among other distributions, are given individual discussion.
Chapter 5 deals with tests based on regression and correlation. Some of these procedures can be viewed as arising from computing a correlation coefficient from a probability plot and testing if it differs significantly from unity. Also involved are tests based on comparisons of linear regression estimates of the scale parameter of the hypothesized distribution to the estimate coming from the sample standard deviation. The Shapiro-Wilk test for normality is one such test.
In Chapter 6 transformation techniques are reviewed. Here the data are first transformed to uniformity, and goodness-of-fit tests for uniformity are applied to these transformed data. These techniques can deal with simple and composite hypotheses.
Tests based on the third and fourth sample moments are presented in Chapter 7. These techniques were first developed to test for normality. In Chapter 7 they are extended to nonnormal distributions.
The second group of chapters consists of Chapters 8, 9, and 10. These deal with tests for three distributions—the uniform, the normal, and the exponential—which have played prominent roles in statistical methodology. Many tests for these distributions have been devised, often based on the methods of previous chapters, and they are brought together, for each distribution, in these three chapters.
Chapters 11 and 12 form the last group; they cover extra materials. The problem of analyzing censored data is of great importance and Chapter 11 is devoted to this. Many of the previous chapters have sections on censored data. Chapter 11 collects these together, fills in some omissions, and gives examples; there is also a discussion on probability plotting of censored data.
The final chapter, Chapter 12, is on the analysis and detection of outliers. This material might be considered outside the direct scope of goodness-of-fit techniques; however, it is closely related to them since goodness-of-fit methods are often applied with this problem in mind, so we felt it would be useful to close the book with a chapter on outliers.
Graphical Analysis

Ralph B. D'Agostino, Boston University, Boston, Massachusetts

2.1 INTRODUCTION

The purpose of this chapter is to illustrate the use of graphical techniques as they relate to goodness-of-fit problems. Graphical techniques as presented here are simple tools which can be implemented easily with the use of graph paper or simple computer programs. They are less formal than the numerical techniques that are presented in the following chapters and are great aids in understanding the numerous relationships present in data. For goodness-of-fit problems they can be used in at least two ways:

1. As an exploratory technique. Here the objective is to uncover characteristics of the data that are suggestive of mathematical properties of the underlying phenomena, ranging from incomplete specifications such as symmetry or tail thickness to complete specifications such as normality with specific mean and standard deviation.
2. In conjunction with formal numerical techniques. Here the objective is to test formally a preconceived hypothesis or one suggested by the graphs. The graphs can help reveal departures from the assumed models and statistical distributions. Often they uncover features of the data that were totally unanticipated prior to the analysis. The numerical techniques quantify the information and evidence in the data or graphs and act as a verification of inferences suggested from these. The use of graphs alone may lead to spurious conclusions, and the use of numerical techniques is often essential in order to avoid this.

In general, with goodness-of-fit problems, it is useful for numerical testing to be preceded and supplemented by graphical analysis. In the following we will point out the specific relations between some graphical procedures and those formal numerical tests that quantify the information revealed in the graphs.

This chapter deliberately concerns itself with simple-to-use graphical procedures involving arithmetic or log graph papers, in conjunction possibly with simple arithmetic and table look-ups, or else with procedures involving readily available special probability plotting papers. Further, most of the procedures are or can be easily computerized. The view underlying this approach is that graphical techniques are useful because of their ease and informality. Involved, complicated procedures detract from this usefulness.
This chapter borrows heavily from the works, concepts, and spirit of Wilk and Gnanadesikan (1968), Feder (1974), Daniel (1959), Bliss (1967), W. Nelson and Thompson (1971), W. Nelson (1972), Tukey (1977), and Chambers, Cleveland, Kleiner, and Tukey (1983).

2.2 EMPIRICAL CUMULATIVE DISTRIBUTION FUNCTION

2.2.1 Definition

Say we have a random sample X₁, ..., Xₙ drawn from a distribution with cumulative distribution function (cdf) F; then the empirical cumulative distribution function (ecdf) is defined as

Fn(x) = #(Xi ≤ x)/n,   -∞ < x < ∞   (2.1)

where #(Xi ≤ x) is read, the number of Xi's less than or equal to x. The ecdf is also often called the edf, empirical distribution function. The plot of the ecdf is done on arithmetic graph paper plotting i/n as ordinate against X(i), the i-th ordered value of the sample, as abscissa. Figure 2.1a is an ecdf plot of the data set NOR given in the appendix, which is a random sample of size 100 from the normal distribution with mean 100 and standard deviation 10.
The ecdf plot provides an exhaustive representation of the data. For all x values Fn(x) converges for large samples to F(x), the value of the underlying distribution's cdf at x. This convergence is actually strong convergence, uniformly for all x (Rényi, 1970, p. 400).
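Formula (2.1) is straightforward to compute. As an illustration (ours, not part of the original text; the function name is ours), a minimal Python sketch that returns the ecdf jump points for the first ten NOR observations quoted later in this section:

```python
def ecdf(sample):
    """Jump points (X(i), i/n) of the ecdf of formula (2.1).
    With tied values, the last pair at a given x carries the cdf value."""
    xs = sorted(sample)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

# First ten observations of the NOR data set, as listed in this section:
nor10 = [92.55, 96.20, 84.27, 90.87, 101.58, 106.82, 98.70, 113.75, 98.98, 100.42]
points = ecdf(nor10)
# Smallest observation 84.27 gets Fn = .1; largest, 113.75, gets Fn = 1.0.
```

Plotting these pairs as a step function reproduces an ecdf plot such as Figure 2.1b.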
The use of the ecdf plot does not depend upon any assumptions concerning the underlying parametric distribution, and it has some definite advantages over other statistical devices, viz.:

1. It is invariant under monotone transformations with regard to quantiles. However, its appearance may change.
2. Its complexity is independent of the number of observations.
3. It supplies immediate and direct information regarding the shape of the underlying distribution (e.g., on skewness and bimodality).
4. It is an effective indicator of peculiarities (e.g., outliers).
5. It supplies robust information on location and dispersion.
6. It does not involve grouping difficulties that arise in using, for example, a histogram.
7. It can be used effectively in censored samples.
There is, however, one serious potential drawback with the use of ecdf plots and other graphical techniques which was already mentioned in the last section. They can be sensitive to random occurrences in the data, and sole reliance on them can lead to spurious conclusions. This is especially true if the sample size is small. This warning always should be kept in mind. In the following we will illustrate uses of the ecdf and related graphs. We will also indicate situations where the user may be misled by them and where further clarification or confirmation via other graphical analyses (e.g., probability plotting) or numerical techniques may be needed.
The ecdf is a standard item in a number of computer packages such as the Statistical Package for the Social Sciences (SPSS), the Statistical Analysis System (SAS), and Biomedical Computer Programs (BMDP).

FIGURE 2.1 Empirical distribution function of NOR data set. (a) Ecdf of full data set (n = 100). (b) Ecdf of first ten observations.

Two other technical points are worth mentioning here. First, as defined by formula (2.1) the ecdf is actually a step function with steps or jumps at the values of the variable that occur in the data. Figure 2.1a does not display the ecdf as a step function. Very often it is not displayed as such, especially when the sample size is large and the underlying variable is continuous, as is the case with the NOR data. Figure 2.1b displays the ecdf as a step function for the first ten observations of the NOR data set. The ordered values of these first ten observations along with their ecdf values are:

Ordered observations

Number (i)    Value X(i)    Fn(x) = i/n
 1             84.27          .1
 2             90.87          .2
 3             92.55          .3
 4             96.20          .4
 5             98.70          .5
 6             98.98          .6
 7            100.42          .7
 8            101.58          .8
 9            106.82          .9
10            113.75         1.0

Second, if the data set consists of grouped data and the variable is continuous, then the ecdf should be defined so that the steps occur at the true upper class limits. For example, if the frequency table is

Classes    Frequency
10-13         15
14-17         20
18-21         15

and an observation is categorized in the first class if it is in the interval 9.5 < x ≤ 13.5, and similarly for the other classes, then the ecdf is defined as

x        Fn(x)
13.5      .30
17.5      .70
21.5     1.00
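This grouped-data convention can be sketched in Python (our illustration; the function name is ours), using the frequency table above:

```python
def grouped_ecdf(upper_limits, freqs):
    """Ecdf for grouped continuous data: steps occur at the true upper
    class limits, with height = cumulative frequency / n."""
    n = sum(freqs)
    pts, cum = [], 0
    for limit, f in zip(upper_limits, freqs):
        cum += f
        pts.append((limit, cum / n))
    return pts

# Classes 10-13, 14-17, 18-21 with true upper limits 13.5, 17.5, 21.5
# and frequencies 15, 20, 15 (n = 50), as in the text:
print(grouped_ecdf([13.5, 17.5, 21.5], [15, 20, 15]))
# -> [(13.5, 0.3), (17.5, 0.7), (21.5, 1.0)]
```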

2.2.2 Investigation of Symmetry

Figure 2.2 contains plots of three distributions to illustrate different situations one can encounter in attempting to determine if a distribution is symmetric or skewed. The three distributions are the normal (which is symmetric), the negative exponential (which is positively skewed—i.e., "its upper tail is longer than its lower tail" or "its upper percentage points are farther from the median than are the lower") and the Johnson unbounded SU(1,2) curve (which is negatively skewed—i.e., "its lower tail is longer than its upper tail"). The density functions for these three distributions are, respectively,

f(x) = [1/(σ√(2π))] exp{-½[(x - μ)/σ]²}

f(x) = (1/θ) e^(-x/θ)

and

f(x) = [δ/(σ√(2π) √(1 + ((x - μ)/σ)²))] exp{-½[γ + δ sinh⁻¹((x - μ)/σ)]²}

Here μ, σ, θ, γ, and δ are parameters of the distributions.


FIGURE 2.2 Differentiation of symmetric and skewed distributions (density and cumulative plots): normal (symmetric), negative exponential (positive skew), Johnson SU(1,2) (negative skew).

FIGURE 2.3 Use of ecdfs for investigation of symmetry. (a) Relation of percentiles about the median for symmetric distributions. (b) Ecdfs for three distributions.

If a distribution is symmetric, then in the plot of the population cdf F(x) the distance on the horizontal axis between the median (50th percentile) and any percentile P below the median (0 < P < 50) is equal to the distance from the median to the (100 - P)th percentile. Figure 2.3a represents this relation in diagram form. This relation should be reflected in the ecdf. An examination of the ecdfs given in Figure 2.3b shows it clearly is in the NOR data set and clearly is not for the other two data sets (EXP for the negative exponential distribution and SU(1,2) for the Johnson unbounded distribution). Some rough numerical values from the ecdfs are:

Absolute Values of Distances from Sample Median to Percentiles

Sample percentile
P / (100 - P)     NOR      EXP      SU(1,2)

10                 21       3.5      1.75
90                 25      15.0      1.55

20                 11       2.3       .50
80                  9       6.0       .85

25                 10       2.0       .15
75                  7       3.5       .55

40                  4        .7       .05
60                  3       1.1       .20

If the distribution has positive skewness, the portion of the ecdf for i/n values close to 1 (e.g., greater than .9) will usually be longer and flatter (almost parallel to the horizontal axis) than the rest of the ecdf. Similarly, if the distribution has negative skewness, the long flat portion will lie in the lower end of its ecdf (e.g., i/n values less than .1). The ecdfs from both the EXP and SU(1,2) data sets behave as expected.
Another, more sensitive and informative graph for studying asymmetry is a simple scatter diagram plotting the upper half of the ordered observations against the lower. That is, letting X(1), X(2), ..., X(n) represent the ordered observations, plot X(n) versus X(1), X(n-1) versus X(2), and, in general, X(n+1-i) versus X(i) for i ≤ n/2. Figure 2.4 contains these plots for the NOR, EXP, and SU(1,2) data sets. A negative unit slope indicates symmetry, a negative slope exceeding unity in absolute value indicates positive skewness, and a negative slope less than unity in absolute value indicates negative skewness. Notice how well this technique identifies the behavior of the distribution with respect to symmetry. Note also that not all of the observations are plotted. They are not usually needed for a correct visual identification.
Another useful plotting technique involves plotting the sums X(n+1-i) + X(i) against the differences X(n+1-i) - X(i), which would produce a horizontal configuration for a symmetric distribution (Wilk and Gnanadesikan, 1968). A plot of the (100 - P)th sample percentile versus the Pth sample percentile for 0 < P < 50 is called a symmetry plot and is also useful (Chambers et al., 1983).

FIGURE 2.4 Plot of upper versus lower observations for investigation of symmetry; slope computed on all data. (a) NOR data, slope = -1.06. (b) EXP data, slope = -4.62. (c) SU(1,2) data, slope = -.75.
Formal numerical techniques for investigating and testing for symmetry are often based on the third sample moment statistic. A full treatment of this procedure is given in Chapter 7.
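The upper-versus-lower plot and its slope are simple to compute. A minimal Python sketch (ours, not from the text; an ordinary least-squares slope stands in for the slope reported in Figure 2.4):

```python
def symmetry_pairs(sample):
    """Pairs (X(i), X(n+1-i)) of lower vs. upper ordered observations;
    points with slope near -1 suggest symmetry."""
    xs = sorted(sample)
    n = len(xs)
    return [(xs[i], xs[n - 1 - i]) for i in range(n // 2)]

def fitted_slope(pairs):
    """Least-squares slope of the upper observations on the lower ones."""
    lo = [p[0] for p in pairs]
    up = [p[1] for p in pairs]
    m = len(pairs)
    mlo, mup = sum(lo) / m, sum(up) / m
    num = sum((a - mlo) * (b - mup) for a, b in pairs)
    den = sum((a - mlo) ** 2 for a in lo)
    return num / den

# A perfectly symmetric sample gives slope exactly -1:
sym = [1, 2, 3, 7, 8, 9]
print(fitted_slope(symmetry_pairs(sym)))  # -> -1.0
```

A slope steeper than -1 (larger in absolute value) points to positive skewness, a slope shallower than -1 to negative skewness, matching the interpretation given above.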

2.2.3 Detection of Outliers

Outliers, observations that appear to deviate markedly from other members of the sample (Grubbs, 1969), often can be detected by the use of ecdf plots. They usually appear as one or a cluster of observations separated from the rest of the sample and are identifiable in the ecdf if, in addition to the plot, some knowledge is available concerning the features of the underlying distribution which should be reflected in the data (e.g., the maximum permissible range of the observations or the largest or smallest possible correct values of the observations may be known, or it may be known that the underlying distribution is symmetric).
Figure 2.5 illustrates the use of the ecdf for detecting an outlier. The figure contains two ecdfs. The first (Figure 2.5a) is a plot of the first ten observations of the NOR data set. These observations are: 92.55, 96.20, 84.27, 90.87, 101.58, 106.82, 98.70, 113.75, 98.98, and 100.42. The second ecdf (Figure 2.5b) is a plot of the same data with the last observation, 100.42, replaced by an outlier equal to 140. This example is an exaggeration of what usually happens in practice, but it illustrates well the type of configuration that results in an ecdf plot of a symmetric distribution such as the normal distribution when an outlier or outliers are present. Note if it were not known that the underlying distribution is symmetric or nearly symmetric, it would be impossible to judge if the ecdf of Figure 2.5b represents data with an outlier present or data from a skewed distribution (see, for example, the ecdfs of the SU(1,2) and EXP data sets given in Figure 2.3b).
We will illustrate later in this chapter the use of the probability plotting technique for detecting outliers. Further, Chapter 12 is devoted solely to the problems of detecting and testing for outliers. The formal techniques of that chapter should be used in conjunction with informal graphical techniques.

2.2.4 Mixtures of Distributions—Presence of Contamination

At times we may be dealing with samples that arise as mixtures of two or more distributions. For example, the author once was involved in a study dealing with taking measurements on parasite-transmitting snails obtained from field sampling. There was no nonstatistical way to separate the different generations (i.e., age groups) of snails in the sample. The parameters that were desired were related to age. The author was also involved in another study dealing with oral glucose tolerance test data. In this study it was suggested that there might exist two subpopulations—normals and diabetics. The data set consisted mainly of normals. Again there was no simple nonstatistical way of removing the small "contaminating" subsample of diabetics. In both of these situations the graphical techniques of this chapter proved to be extremely useful.
Unless the component distributions of the mixture are very distinct (e.g., the difference between the means is much larger than the individual distributions' standard deviations), the ecdf of the combined sample may not supply much information to aid in determining if a mixture exists. Figure 2.6 illustrates the problem. It contains separate and combined densities of mixtures of normal distributions. If the component distributions are "close" as in (a) and (b) of Figure 2.6, the combined distribution may very well be unimodal.
Figure 2.7 further illustrates the problem. These are ecdfs from mixtures of two normal distributions. The main underlying distribution is the normal distribution with mean zero and standard deviation unity. However, the sampling was done in such a way that for each observation drawn there was a probability π that the observation would come from the normal distribution with mean 3 and standard deviation unity. The data set for (a) of Figure 2.7 had π = .1 (data set LCN(.10,3) of the appendix) and the set

FIGURE 2.6 Mixtures of normal distributions (separate component densities and combined density). (a) Two equal close components. (b) Two close components. (c) Well separated components.

FIGURE 2.7 Ecdfs of contaminated distributions (main component is standard normal: mean zero and standard deviation unity). (a) Standard normal with 10 percent contamination from normal with mean 3 and standard deviation 1. (b) Standard normal with 20 percent contamination from normal with mean 3 and standard deviation 1.

for (b) of Figure 2.7 had π = .2 (data set LCN(.20,3)). The ecdfs in Figure 2.7 look very much like those that are produced by positively skewed distributions. In fact the population cdfs are positively skewed. The contamination "caused" the skewness.
If the component distributions are "well separated" as in (c) of Figure 2.6, the resulting mixture will be bimodal and, with sufficient data available, the ecdf will show the changes from concavity to convexity to concavity as does the cdf of (c) in Figure 2.6. In general, only under the condition of substantial separation of the components will the ecdf reveal bimodality.
There is an extensive literature on mixtures (see, for example, Johnson and Kotz, 1970, Section 7.2) and the usual procedure is to assume some functional form for the components or for the major distribution or distributions of the components. Specific parametric techniques are then employed to establish if a mixture does exist and to estimate the parameters of the components (e.g., Bliss, 1967, Chapter 7). Given these assumptions about the functional form of the underlying distributions, graphical techniques such as the probability plotting techniques which will be discussed later in this chapter can be very useful in detecting the presence of mixtures even in situations such as those in Figures 2.6a and 2.6b. These probability plotting techniques are the graphical techniques we recommend for use. Other graphical procedures are given in Harding (1949) and Taylor (1965).
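Contaminated samples like LCN(.10,3) are easy to simulate, which makes it possible to see for oneself how mild contamination masquerades as skewness. A minimal Python sketch of the sampling scheme described above (the function name and the fixed seed are ours, chosen only for reproducibility):

```python
import random

def contaminated_normal(n, pi, shift, rng):
    """Each draw comes from N(shift, 1) with probability pi and from the
    standard normal N(0, 1) otherwise -- the scheme behind the LCN data sets."""
    return [rng.gauss(shift if rng.random() < pi else 0.0, 1.0) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = contaminated_normal(1000, 0.10, 3.0, rng)
mean = sum(sample) / len(sample)
# Population mean of the mixture is .9*0 + .1*3 = 0.3, and although both
# components are symmetric, the mixture cdf is positively skewed.
```

An ecdf plot of such a sample should resemble Figure 2.7a, with the long flat upper portion characteristic of positive skewness.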

2.2.5 Assessing Tail Thickness

At times the interest is not in describing the entire distribution of a variable but rather only one or both of the tails of the distribution. For example, the Environmental Protection Agency is often interested in making inferences and issuing standards concerning high concentrations of various pollutants (Curran and Frank, 1975). In such situations it is more important to understand the behavior of the upper tail of the distribution than it is to fit the entire distribution. Although a particular model may adequately describe most of the distribution, it would be useless for predicting maximum or extreme values if the model broke down for the upper percentiles. Also, a model that is not accurate for a large portion of the data may still be useful for predicting upper values if it adequately describes the behavior of the upper percentiles. Bryson (1974) presented a graphical technique applicable to assessing the behavior of the tails of a distribution. The development below is adapted from Curran and Frank (1975).
To be specific, say the interest lies in assessing the behavior of the upper tail. Mathematically, this is equivalent to assessing the thickness of the upper tail or finding a mathematical model which "fits" the upper tail. The most convenient mathematical model is the negative exponential distribution. Here, the probability density function is

f(x) = (1/θ) e^(-x/θ),   θ > 0, x > 0   (2.2a)

and the cumulative distribution function is

F(x) = 1 - e^(-x/θ)   (2.2b)

From this we have

1 - F(x) = e^(-x/θ)

and

ln(1 - F(x)) = -x/θ   (2.2c)


FIGURE 2.8 Relation of lognormal and Weibull distributions to negative exponential on semi-log graph paper (for investigation of tail thickness).

The implication is that if 1 - F(x) is plotted against x on semi-log graph paper the plot will be a straight line (see Figure 2.8). Because of this, it is convenient to use the negative exponential distribution as the reference distribution and compare other distributions to it. The Weibull and lognormal distributions are often the two major distributions of potential interest for this type of problem. The two-parameter Weibull has as its probability density and as its cdf

f(x) = (k/θ)(x/θ)^(k-1) e^(-(x/θ)^k),   θ > 0, x > 0, k > 0   (2.3a)

F(x) = 1 - e^(-(x/θ)^k)   (2.3b)

The lognormal distribution has as its probability density

f(x) = [1/(xσ√(2π))] exp{-½[(ln x - μ)/σ]²},   x > 0   (2.4)

Its cdf, F(x), does not have a closed form representation. Figure 2.8 contains plots of 1 - F(x) versus x on semi-log paper for a negative exponential, a Weibull with k > 1, and a lognormal distribution. Notice the negative exponential produces a straight line, the lognormal distribution curves upward, and the Weibull with k > 1 curves downward. A distribution that curves downward is termed "light tailed." A heavy tailed distribution has a probability density function whose upper tail approaches zero less rapidly than the
FIGURE 2.9 Plots of data sets on semi-log graph paper (for determining tail thickness). (a) EXP data set. (b) WE2 data set.

exponential or, in other words, a heavy tailed distribution has a greater probability of yielding high values. On the other hand, a light tailed distribution has a probability density function whose upper tail approaches zero more rapidly than the exponential and, therefore, is less likely to yield high values. In particular it should be mentioned that all lognormal distributions are heavy tailed and all Weibull distributions with k > 1 are light tailed (when k = 1, the Weibull is the negative exponential distribution). So if these are the two models of interest for the upper tail, an examination of a plot of 1 - Fn(x) (i.e., one minus the ecdf) versus X(i), the ordered observations, will often indicate which is the appropriate model.
The semi-log graph paper used in Figure 2.8 is four cycle paper. Three quarters of the vertical axis concerns only the upper 10⁻¹ to 10⁻⁴ points of the distribution (.90 < F(x) < .9999). So this graph does focus almost exclusively on the upper 10 percent of the distribution. However, it does contain on the vertical axis the rest of the distribution, and plotting the full distribution can cause confusion in attempting to judge the fit of the upper tail. In general, points for which Fn(x) < .5 should not be plotted. For samples as small as 100 the author has found it convenient to use two cycle semi-log paper, define Fn(x) as

Fn(x) = [#(Xi ≤ x) - .5]/n = (i - .5)/n   (2.5)

and plot the data only for Fn(x) ≥ .50. Note in (2.5) i = #(Xi ≤ x).
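Definition (2.5) and the restriction to Fn(x) ≥ .50 can be sketched as follows in Python (our illustration; the function name is ours). The points returned are the ones that would be plotted on semi-log paper:

```python
def upper_tail_points(sample):
    """Points (X(i), 1 - Fn(X(i))) for the upper-tail semi-log plot,
    with Fn(x) = (i - .5)/n as in (2.5), keeping only Fn(x) >= .50."""
    xs = sorted(sample)
    n = len(xs)
    pts = []
    for i, x in enumerate(xs, start=1):
        fn = (i - 0.5) / n
        if fn >= 0.5:
            pts.append((x, 1.0 - fn))
    return pts

# On semi-log paper one plots log(1 - Fn) against x; for exponential data
# these points fall roughly on a straight line of slope -1/theta.
pts = upper_tail_points(list(range(1, 11)))  # toy sample 1, 2, ..., 10
# Only i = 6, ..., 10 survive the Fn >= .50 cut, so 5 points are plotted.
```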

E2.2.5.1 Example

Figure 2.9 contains the above described plots for the EXP and WE2 (Weibull with k = 2) data sets. Consider first the EXP data set plotted in Figure 2.9a. The dots represent the observed values of 1 - Fn(x) for Fn(x) ≥ .50. These appear to lie roughly on a straight line. If the negative exponential distribution is an adequate model for these data, then a straight line for the theoretical exponential as in Figure 2.8 should fit the observed points. To obtain the theoretical line we need θ̂. The parameter θ in (2.2) can be estimated by

θ̂ = -x / ln(1 - Fn(x))   (2.6)

where x represents any value for which the model is supposed to hold. In particular the x for which Fn(x) = .6321, or 1 - Fn(x) = .3679, yields a direct estimate of θ. For the EXP data set the estimate of θ using this or almost any choice of x is approximately 5 (i.e., θ̂ = 5). The line drawn in Figure 2.9a is the line 1 - F(x) for the negative exponential with θ = 5. Except for the last two data points it fits the data well. The inference to be drawn from this exercise is that the negative exponential model is an appropriate model which accounts well for all the data points except possibly the last two. Note that in judging the goodness-of-fit of these points it is the horizontal distances from the points to the line that are important, not the vertical distances. The last data point, in particular, may appear to be further away from the line than might be expected. Such variability in the extreme observation is, however, often observed.
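Formula (2.6) in Python (our sketch; the function name is ours). The choice Fn(x) = .6321 is convenient because ln(1 - .6321) is approximately -1, so θ̂ is approximately the corresponding x itself:

```python
import math

def theta_hat(x, fn_x):
    """Estimate theta of the negative exponential (2.2) from one ecdf point
    (x, Fn(x)) via formula (2.6): theta = -x / ln(1 - Fn(x))."""
    return -x / math.log(1.0 - fn_x)

# At Fn(x) = .6321 the estimate is essentially x itself, as for the EXP data
# where x is about 5:
print(round(theta_hat(5.0, 0.6321), 2))  # -> 5.0
```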
Consider next the WE2 data plotted in Figure 2.9b. Using the x for which Fn(x) = .6321 to obtain θ̂ in (2.6), we obtain θ̂ = .98. The line 1 - F(x) for the negative exponential with θ = .98 is drawn in Figure 2.9b. Notice how it lies above most of the data. Using Figure 2.8 as a guide, this suggests (correctly) that the data are from a distribution with a thinner upper tail than the negative exponential. Also on Figure 2.9b are plotted two other lines representing 1 - F(x) for the negative exponential of (2.2). These arose from solving (2.6) for θ̂ using Fn(x) = .90 and Fn(x) = .95. The estimates of θ are, respectively, .59 and .52. Again the inference is the same, viz., the negative exponential model of (2.2) is not appropriate and the distribution under consideration has a thinner tail than the negative exponential. Note, this inference is correct.
A further examination of the WE2 data plot in Figure 2.9b does reveal that the points do appear to lie on a straight line. The above analysis establishes that the data cannot be explained by a model such as (2.2). They can, however, be explained by a negative exponential model which incorporates a displacement value, viz.,

f(x) = (1/θ) e^(-(x-λ)/θ),   θ > 0, x > λ   (2.7)

where λ is the displacement value. The cdf for this distribution is

F(x) = 1 - e^(-(x-λ)/θ)

If we start with this model then any two distinct x values (or two Fn(x) values) can be used to produce linear equations for λ and θ. The equations are

-ln(1 - Fn(x₁)) θ + λ = x₁
-ln(1 - Fn(x₂)) θ + λ = x₂   (2.8)

Using Fn(x₁) = .50 and Fn(x₂) = .90 in the WE2 data set produces θ̂ = .28, λ̂ = .73. The line of 1 - F(x) for model (2.7) with these parameters is also plotted in Figure 2.9b. This provides an excellent fit. So if we restrict our attention solely to the upper tail, the WE2 data can be well explained by a negative exponential of the form (2.7). Of course, the correct model is the Weibull of (2.3) with k = 2. Completion of the first part of the above analysis would have led correctly to the Weibull model.
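The pair of linear equations (2.8) can be solved directly. A minimal Python sketch (ours; it uses the equivalent form x = λ - θ ln(1 - Fn(x)) implied by (2.2c) and (2.7), and the round-trip values are our own check rather than the WE2 data):

```python
import math

def displaced_exp_fit(x1, fn1, x2, fn2):
    """Solve the two linear equations (2.8) for the displaced negative
    exponential (2.7), given two ecdf points (x1, Fn(x1)) and (x2, Fn(x2))."""
    a1 = -math.log(1.0 - fn1)
    a2 = -math.log(1.0 - fn2)
    theta = (x2 - x1) / (a2 - a1)
    lam = x1 - theta * a1
    return theta, lam

# Round-trip check: quantiles generated by (2.7) with theta = .28 and
# lambda = .73 (the values fitted to WE2 in the text) are recovered exactly.
theta, lam = displaced_exp_fit(0.73 + 0.28 * (-math.log(1 - 0.50)), 0.50,
                               0.73 + 0.28 * (-math.log(1 - 0.90)), 0.90)
```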

2.2.5.2 Extensions

The above material can easily be modified to examine the lower tail of the distribution (viz., by plotting Fn(x) of (2.5) versus the observations on semi-log paper).
Often the normal distribution is used as the reference distribution in discussing tail thickness, and the standardized central fourth moment (the kurtosis measure) is used as the appropriate measure. For these problems, one is usually interested in fitting the complete distribution and not just the tail. We will discuss this tail thickness concept in Section 2.4 below. Also, Chapter 7 will discuss in detail the formal computational procedures associated with this concept.

2 .2 .6 A ssessin g the Fit of the F u ll Distribution

The ecdf can be used also for assessing how well a particular statistical distribution fits the entire data set. The procedure starts by plotting on the

FIGURE 2.10 Comparison of population and empirical cumulative distribution functions for NOR data set. (a) Full data set (n = 100): population mean 100, standard deviation 10. (b) First ten observations: sample mean 98.41, standard deviation 8.28.
24 D^AGOSTINO

same grid of a piece of graph paper the ecdf of the sample and the cdf of the hypothetical distribution. For example, Figure 2.10a contains the ecdf of the NOR data along with the cdf for the normal distribution with mean 100 and standard deviation 10 (i.e., the true underlying distribution). If values of the parameters of the hypothetical distribution are unspecified, these must be estimated for the data set under investigation by means of some procedure such as the method of moments or the method of maximum likelihood, and then the cdf of the hypothetical distribution using these estimates as parameter values is plotted. For an example, Figure 2.10b contains the plot of the ecdf of the first ten observations of the NOR data set along with the cdf of the normal distribution with mean and standard deviation equal to the sample mean and standard deviation, viz., x̄ = 98.41 and s = 8.28.
The next step in the informal graphical analysis involves comparing the two plots (ecdf and cdf) and deciding if they are "close." Usually this informal procedure is the first step in a more elaborate analysis which includes formal numerical techniques referred to as empirical cumulative distribution function techniques or more simply empirical distribution function (EDF) techniques. Chapter 4 contains a detailed account of these techniques.
While the above described graphical procedure has merit, especially when used with the formal numerical EDF techniques, it is deficient as an informal technique in that there are more informative simple graphical techniques, namely those involving probability plotting, which are the subject matter of the remainder of this chapter.

2.3 G E N E R A L C O N C E PTS O F P R O B A B IL IT Y P L O T T IN G

2 .3 .1 Introduction

A major problem with the use of the ecdf plot in attempting to judge visually the correctness of a specific hypothesized distribution is due to the curvature of the ecdf and cdf plots. It is usually very hard to judge visually the closeness of the curved (or step function) ecdf plot to the curved cdf plot. If one is attempting to reach a decision based on visual inspection it is probably easiest to judge if a set of points deviates from a straight line. A probability plot is a plot of the data that offers exactly the opportunity for such a judgment, for it will be a straight line plot, to within sampling error, if the hypothesized distribution is the true underlying distribution. The straight line results from transforming the vertical scale of the ecdf plot to a scale which will produce exactly a straight line if the hypothesized distribution is plotted on the graph.
The principle behind this transformation is simple and is as follows. Say the true underlying distribution depends on a location parameter μ and a scale parameter σ (μ and σ need not be the mean and standard deviation, respectively). The cdf of such a distribution can be written as

F(x) = G((x - μ)/σ) = G(z)    (2.9)

where

z = (x - μ)/σ

is referred to as the standardized variable and G(·) is the cdf of the standardized random variable Z. The ecdf plot is based on plotting F(x) on x. For sample data F(x) is replaced by Fn(x) and the plotted values of x are the observed values of the random variable X. Now if the plot were one of z on x (or equivalently G⁻¹(F(x)) on x, where G⁻¹(·) is the inverse transformation which here transforms F(x) into the corresponding standardized value z), the resulting plot would be the straight line

z = G⁻¹(F(x)) = (x - μ)/σ    (2.10a)

or in terms of x on z

x = μ + zσ    (2.10b)

A probability plot is a plot of

z = G⁻¹(Fn(x)) on x    (2.11a)

where x represents the observed values of the random variable X. Notice F(x) in (2.10a) is replaced by Fn(x) in (2.11a). With observed ordered observations x(1) ≤ ··· ≤ x(n), a probability plot can also be described as a plot of

z_i = G⁻¹(Fn(x(i))) on x(i)    (2.11b)

For probability plotting the ecdf Fn(x) of (2.11a) or Fn(x(i)) of (2.11b) is usually not defined as in (2.1) but rather as either

Fn(x(i)) = p_i = (i - 0.5)/n   for i = 1, ..., n    (2.12)

or more generally as

Fn(x(i)) = p_i = (i - c)/(n - 2c + 1)   for 0 < c < 1    (2.13)

In (2.12) the c of (2.13) is equal to 0.5. See Barnett (1975) and Chapter 11 for further discussion of the selection of c. In the following we will always

use the Fn(x) given by p_i of (2.12). Given that F is the true cdf, the probability plot of (2.11) should be approximately a straight line. In fact there is strong convergence to a straight line for large samples.
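The plotting positions (2.12) and (2.13) and the pairing in (2.11b) are straightforward to compute. A minimal sketch; the standardized exponential inverse cdf G⁻¹(p) = -ln(1 - p) is used purely as an illustration:

```python
import math

def plotting_positions(n, c=0.5):
    # p_i = (i - c) / (n - 2c + 1), equation (2.13); the default c = 0.5
    # gives p_i = (i - 0.5)/n, the choice (2.12) used throughout this chapter.
    return [(i - c) / (n - 2 * c + 1) for i in range(1, n + 1)]

def probability_plot_points(sample, inv_cdf):
    # Pair each ordered observation x_(i) with z_i = G^{-1}(p_i), as in (2.11b).
    xs = sorted(sample)
    ps = plotting_positions(len(xs))
    return [(x, inv_cdf(p)) for x, p in zip(xs, ps)]

# Illustration with a tiny artificial sample:
pts = probability_plot_points([0.8, 0.1, 2.5, 0.4, 1.4],
                              lambda p: -math.log(1.0 - p))
```

The resulting (x, z) pairs are what get placed on arithmetic graph paper.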

E 2.3.2 An Example: Logistic Distribution

As an example of the above consider the problem of investigating the appropriateness of the logistic distribution as the underlying distribution from which the LOG data set was obtained. (The LOG data set was drawn from a logistic distribution and is given in the Appendix.) The cdf of the logistic distribution is

F(x) = [1 + exp{-π(x - μ)/(σ√3)}]⁻¹    (2.14)

Here μ and σ are the mean and standard deviation, respectively. The cdf of the standardized logistic distribution (i.e., of Z = (X - μ)/σ) is

G(z) = [1 + exp(-πz/√3)]⁻¹


FIG U R E 2.11 Logistic probability plot of LO G data set.



TABLE 2.1 Partial Data for Logistic Probability Plot of LOG Data (Plotted in Figures 2.11 and 2.12)

Ordered Observation Number (i) | Fn(x) = p_i = (i - .5)/n | z | Ordered Observation

1 .005 -2.90 51.90
2 .015 -2.31 60.57
3 .025 -2.02 63.35
4 .035 -1.83 65.87
5 .045 -1.68 66.35
6 .055 -1.56 68.44
7 .065 -1.47 74.29
8 .075 -1.39 76.52
9 .085 -1.31 78.32
10 .095 -1.24 78.48
11 .105 -1.18 79.07
12 .115 -1.12 79.32
13 .125 -1.07 81.17
14 .135 -1.02 81.61
15 .145 -.98 82.45

86 .855 .98 113.79


87 .865 1.02 114.97
88 .875 1.07 116.01
89 .885 1.12 116.58
90 .895 1.18 116.99
91 .905 1.24 117.01
92 .915 1.31 118.54
93 .925 1.39 118.92
94 .935 1.47 121.83
95 .945 1.56 123.39
96 .955 1.68 123.58
97 .965 1.83 131.24
98 .975 2.02 132.40
99 .985 2.31 144.28
100 .995 2.90 145.33

Recalling from (2.9) that

F(x) = G((x - μ)/σ) = G(z)

where z = (x - μ)/σ, and solving for z in terms of F(x), we obtain from the above,

z = G⁻¹(F(x)) = (√3/π) ln{F(x)/(1 - F(x))}

Now, according to (2.11) a logistic probability plot consists of plotting on arithmetic graph paper

z = (√3/π) ln{Fn(x)/(1 - Fn(x))}    (2.15)

where Fn(x) is the ecdf defined by (2.12) on one axis (e.g., the vertical axis) versus x on the other (horizontal) axis. Here x represents the observed values in the sample. Figure 2.11 contains an appropriate graph set up for this problem.
Notice in Figure 2.11 there are two alternative ways of labeling the vertical axis. The first way, which is probably the most informative, is to label the axis in terms of Fn(x) (or 100Fn(x), which is the more conventional way). The second way is in terms of the values of the standardized variable z. In Figure 2.11 we have labelled the left vertical axis as 100Fn(x) and the right vertical axis as z. Notice the vertical axis is linear in z. It is not linear in Fn(x).
Figure 2.11 contains the data plotted on it. To make more explicit the actual points plotted on this graph we list in Table 2.1 the values of Fn(x) of (2.12) and z obtained from (2.15) for the first and last fifteen ordered observations. Also listed are the corresponding ordered observations.
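The plotting positions of Table 2.1 can be regenerated directly from (2.12) and (2.15). A short sketch; for example, i = 2 and i = 3 give -2.31 and -2.02, matching the table:

```python
import math

def logistic_z(p):
    # z = (sqrt(3)/pi) * ln(p / (1 - p)), the inverse of the standardized
    # logistic cdf G(z) = [1 + exp(-pi*z/sqrt(3))]^{-1}, as used in (2.15).
    return (math.sqrt(3.0) / math.pi) * math.log(p / (1.0 - p))

# Plotting positions for an n = 100 sample such as the LOG data set:
n = 100
z_vals = [logistic_z((i - 0.5) / n) for i in range(1, n + 1)]
```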

2.3.3 Informal Goodness-of-Fit and Estimation of Parameters

Once the data are plotted the next step is to determine the goodness-of-fit of the data. For a probability plot this means determining if a straight line "fits well" the data. This problem can be approached in a very formal manner and Chapter 5 (Regression Techniques) discusses this approach in detail. For the purposes of this chapter it means drawing a straight line through the points and deciding in an informal manner if the fit is good.

2.3.3.1 First Procedure

The simplest procedure is to draw a line "by eye" through the points. One convenient way to do this is to locate a point on the plot corresponding to

around the 10th percentile (Fn(x) = .10) and another around the 90th percentile (Fn(x) = .90) and connect these two. Figure 2.12a contains such a line for the logistic probability plot of the complete LOG data set. (Notice this is the same plot as Figure 2.11. Here we have the straight line imposed on the graph.) This line fits the data extremely well, accommodating even the extreme points. There are two comments which need mentioning here. The first concerns the non-random pattern of the points about the line. The ordered observations are not independent and the type of pattern shown in Figure 2.12a is to be expected. Second, in judging deviations from the line remember it is the horizontal distances from the points to the line that are important.
After the "by eye" line is drawn it can be used to supply quick estimates of the parameters of the distribution. For example, with the LOG data of Figure 2.12a we can obtain estimates of the mean μ and standard deviation σ by recalling that z = 0 corresponds to the mean μ and z = 1 (86th percentile) corresponds to μ + σ. In Figure 2.12a we have lines extending from z = 0 and 1 to the straight line and down to the x axis. From these we estimate μ̂ = 99 and σ̂ = 17.

2.3.3.2 Other Procedures

A second procedure for obtaining the line and estimates is to recognize from (2.10) that the desired line can be represented by

x = μ + zσ    (2.16)

and estimates of μ and σ can be obtained by using unweighted least squares (simple linear regression). The general solutions for these are

σ̂ = Σ(z - z̄)x / Σ(z - z̄)²   and   μ̂ = x̄ - σ̂z̄    (2.17)

If Σz = 0, then

μ̂ = x̄   and   σ̂ = Σzx / Σz²    (2.18)

For the LOG data set, μ̂ = 99.78 and σ̂ = 16.70.
Still a third procedure, applicable if μ and σ are the mean and standard deviation, as they are in the logistic distribution of (2.14), is to use x̄ and s, the sample mean and standard deviation, as estimates. For the LOG data x̄ = 99.78 and s = 16.67. Notice for the LOG data there is very little difference among the results of these different procedures. The true parameter values are μ = 100 and σ = 18.14.
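The least-squares solution (2.17) is easy to program. A sketch, checked here on a tiny artificial data set lying exactly on a line (not the LOG data):

```python
def ls_location_scale(xs, zs):
    # Unweighted least-squares fit of the line x = mu + z*sigma, as in (2.17):
    #   sigma-hat = sum((z - zbar) * x) / sum((z - zbar)^2)
    #   mu-hat    = xbar - sigma-hat * zbar
    n = len(xs)
    xbar = sum(xs) / n
    zbar = sum(zs) / n
    sigma = (sum((z - zbar) * x for z, x in zip(zs, xs))
             / sum((z - zbar) ** 2 for z in zs))
    mu = xbar - sigma * zbar
    return mu, sigma

# Artificial data exactly on x = 100 + 17z:
mu, sigma = ls_location_scale([83.0, 100.0, 117.0], [-1.0, 0.0, 1.0])
```

When the z values are symmetric, Σz = 0 and the formulas collapse to (2.18).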
More elaborate procedures involve finding the best linear unbiased estimators of μ and σ (see D'Agostino and Lee, 1976, for the logistic distribution). These procedures lead to the regression techniques of Chapter 5.
FIGURE 2.12 Logistic analysis of LOG data set. (a) Full data set (n = 100), line fit "by eye." (b) First ten observations, small sample analysis.

2.3.4 Small Samples

When the size of the sample is small (say 50 or less) the probability plots of z on x as given by (2.11) may display curvature in the tails even if the hypothesized distribution is correct. For these cases the usual recommendation is to use the expected values of the ordered statistics from the standardized distribution of the hypothesized distribution for the plotting positions of the vertical axis. These are used in place of the z of (2.11), which are the percentile points of the standardized distribution. The expected values are defined as follows. Say Z(1) ≤ ··· ≤ Z(n) represent the ordered observations for a sample of size n from a standardized distribution. Then the expected values are defined as E(Z(i)) for i = 1, ..., n, where E represents the expected value operator.
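When tables of the expected values E(Z(i)) are not at hand, they can be approximated by simulation. A sketch, assuming the standardized logistic of (2.14); with n = 10 the smallest expected value comes out near the -1.56 listed in the example below:

```python
import math
import random

def expected_order_statistics(n, inv_cdf, reps=20000, seed=1):
    # Monte Carlo approximation of E(Z_(i)), i = 1, ..., n, for the
    # standardized distribution whose inverse cdf is inv_cdf.
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(reps):
        zs = sorted(inv_cdf(rng.random()) for _ in range(n))
        for i, z in enumerate(zs):
            totals[i] += z
    return [t / reps for t in totals]

def logistic_inv(p):
    # Inverse cdf of the standardized (unit variance) logistic distribution.
    return (math.sqrt(3.0) / math.pi) * math.log(p / (1.0 - p))

e_vals = expected_order_statistics(10, logistic_inv)
```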

E 2.3.4.1 Example

For the logistic distribution the expected values are readily available (Gupta and Shah, 1965, and Gupta, Qureishi and Shah, 1967). However, for this particular distribution there appears to be no reason to use them in plotting. Figure 2.12b contains a logistic probability plot (i.e., a logistic analysis) of the first ten unordered observations of the LOG data. The data along with the expected values of the ordered observations, Fn(x) and z of (2.11) and (2.15) are as follows:

Ordered observations | Expected values of standardized logistic ordered observations | Fn(x) = p_i | z of (2.11) and (2.15)

63.35 -1.56 .05 -1.62
78.32 -.95 .15 -.96
94.63 -.60 .25 -.61
96.91 -.34 .35 -.34
102.97 -.11 .45 -.11
104.47 .11 .55 .11
109.99 .34 .65 .34
111.81 .60 .75 .61
118.54 .95 .85 .96
144.28 1.56 .95 1.62

The differences between the expected values and the z's of (2.15) are not large enough to influence the plots. This is seen clearly in Figure 2.12b. Remember in judging the fit it is the horizontal distance from a point to the line that is important.

FIGURE 2.13 Use of computer scatter diagrams for probability plotting. (a) Logistic analysis of LOG data.

FIGURE 2.13 (b) Logistic analysis of UNI data.



FIGURE 2.13 (c) Uniform analysis of UNI data.

2.3.5 Grouped Data/Ties in Data

For grouped data such as discussed in Section 2.2.1 the simplest procedure for probability plotting is to plot only the data at the true upper class limit for each interval. Of course, use (2.12) for the Fn(x). This is equivalent to representing all the observations in the interval at the upper end points of the intervals. For ungrouped data with ties the simplest procedure is to average the z values for the observations in the ties (see E 2.4.1.1 for a numerical example).
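The averaging of z values over ties can be sketched as follows (the tied sample and plotting positions are hypothetical):

```python
from collections import defaultdict

def averaged_z_for_ties(values, zs):
    # For tied observations, plot each distinct value once against the average
    # of the z plotting positions assigned to its tied copies (Section 2.3.5).
    # zs must be in the order of the sorted observations.
    groups = defaultdict(list)
    for v, z in zip(sorted(values), zs):
        groups[v].append(z)
    return [(v, sum(g) / len(g)) for v, g in sorted(groups.items())]

# Hypothetical tied sample with three plotting positions:
pts = averaged_z_for_ties([1.2, 1.2, 3.4], [-1.0, 0.0, 1.0])
```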

2.3.6 Use of Simple Computer Graphics

Elaborate sophisticated computer graphics are not needed to produce probability plots. Many interactive systems have the capability of ordering the observations of a data set and defining new variables. In such systems a probability plot is simply a scatter diagram with z of (2.11) on the vertical axis and the sample observations on the horizontal axis. Figure 2.13 displays such scatter diagram probability plots. Figure 2.13a is the logistic plot of the LOG data (already plotted in Figures 2.11 and 2.12a). Figure 2.13b is a logistic analysis of the UNI data. The UNI data were drawn from

the uniform distribution defined on the interval 0 to 10. This analysis clearly
indicates lack of fit. Figure 2.13c is a uniform analysis of the UNI data.
That i s , it is a probability plot Investigating if the UNI data w ere drawn from
a uniform distribution. In Chapter 6 techniques are discussed which involve
transform ing the data first to a uniform distribution. In that chapter the
uniform probability plot plays a very important adjunct role in judging
goodness-of-fit. F o r a uniform probability plot the standardized variable z
usually is defined as the uniform distribution on the unit Interval.
In addition to the plotting, many interactive computer program s can also
be used to obtain the estimates of д and <t given by (2 .1 7 ). These a re just the
intercept and slope estimates from a simple linear regression of x on z.
However, the correlation coefficient from this simple linear regression must
be viewed with care in attempting to judge goodness-of-fit. Because of the
matching of the ordered observations with increasing z values both x and z
a re monotonically increasing, so the correlation coefficient w ill be usually
large in magnitude regard less o f how w ell the data fit a straight line. F o r
example, the correlation coefficient for the data of Figure 2 . 13b (logistic
analysis of the UNI data) is . 947. The fit o f these data to a straight line
obviously leaves much to be desired.
In addition to the use of program s as described above to do probability
plotting, many standard software packages ( e . g . , SAS) have specific routines
for probability plotting. These should be used when available.

2.3.7 Summary Comments

As given above a probability plot is a plot of

z_i = G⁻¹(Fn(x(i))) = G⁻¹(p_i) on x(i)    (2.19)

where G⁻¹(·) is the inverse transformation of the standardized distribution of the population (hypothesized distribution) under consideration. We recommend for Fn(x(i))

Fn(x(i)) = p_i = (i - 0.5)/n    (2.20)

In the examples above we have used arithmetic graph paper placing z on the vertical axis and x on the horizontal axis. Of course, it is not incorrect to place x on the vertical axis and z on the horizontal axis. (In Chapter 11 probability plotting is done that way.) Nor is it essential to use arithmetic paper. Many probability plotting papers, which have the axes appropriately labelled, are available commercially. Logistic paper and many other probability papers are available from the Codex Book Company, 74 Broadway in Norwood, Massachusetts.

TABLE 2.2 Plotting Formulas for Some Familiar Distributions (p_i = (i - 0.5)/n)

Distribution | Cdf F(x) | Horizontal Axis | Vertical Axis
Uniform | (x - μ)/σ for μ ≤ x ≤ μ + σ | x(i) | p_i = (i - .5)/n
Normal | Φ((x - μ)/σ) | x(i) | See (2.22) to (2.24)
Lognormal | Φ((ln x - μ)/σ) | ln(x(i)) | See (2.22) to (2.24)
Weibull | 1 - exp{-(x/θ)^k} | ln(x(i)) | ln(-ln(1 - p_i))
Extreme Value | 1 - exp(-exp{(x - μ)/σ}) | x(i) | ln(-ln(1 - p_i))
Logistic | [1 + exp{-π(x - μ)/(σ√3)}]⁻¹ | x(i) | (√3/π) ln(p_i/(1 - p_i))
Exponential | 1 - exp(-x/θ) | x(i) | -ln(1 - p_i)

Once the points are plotted the major task is to judge if the plotted data form a straight line. If they do not, the task is then to decide what are the properties of the underlying distribution or data which cause this nonlinearity. We will now illustrate this probability plotting procedure with normal, lognormal and Weibull plotting. Table 2.2 contains the appropriate formulas for probability plotting for these and other familiar distributions.
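The vertical-axis formulas of Table 2.2 can be collected in one place. A sketch; the normal and lognormal rows are omitted because their vertical axis uses the normal percentiles of (2.22) to (2.24):

```python
import math

# Vertical-axis transforms of Table 2.2, applied to p = (i - 0.5)/n.  The
# horizontal axis is x_(i) in each case, except ln(x_(i)) for the lognormal
# and Weibull rows.
VERTICAL_AXIS = {
    "uniform": lambda p: p,
    "weibull": lambda p: math.log(-math.log(1.0 - p)),
    "extreme_value": lambda p: math.log(-math.log(1.0 - p)),
    "logistic": lambda p: (math.sqrt(3.0) / math.pi) * math.log(p / (1.0 - p)),
    "exponential": lambda p: -math.log(1.0 - p),
}
```

Looking up the transform by name makes it easy to run several candidate analyses (as in Figure 2.13) on the same sample.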

2.4 NORMAL PROBABILITY PLOTTING

2.4.1 Probability Plotting

Normal probability plotting, normal plotting or normal analysis is the plotting of data in order to investigate the goodness-of-fit of the data to the normal distribution with density given by

f(x) = (1/(√(2π)σ)) exp{-(x - μ)²/(2σ²)}    (2.21)
FIGURE 2.14 Normal probability paper.

TABLE 2.3 Plotting Positions z for Normal Probability Plotting (n ≤ 50) (Expected Values of Standard Normal Order Statistics*) (n = Sample Size, i = Observation Number)

i\n 3 4 5 6 7 8 9 10

1 -0.85 -1.03 -1.16 -1.27 -1.35 -1.42 -1.49 -1.54
2 0.00 -0.30 -0.50 -0.64 -0.76 -0.85 -0.93 -1.00
3 0.00 -0.20 -0.35 -0.47 -0.57 -0.66
4 0.00 -0.15 -0.27 -0.38
5 0.00 -0.12

i\n 11 12 13 14 15 16 17 18

1 -1.59 -1.63 -1.67 -1.70 -1.74 -1.77 -1.79 -1.82
2 -1.06 -1.12 -1.16 -1.21 -1.25 -1.28 -1.32 -1.35
3 -0.73 -0.79 -0.85 -0.90 -0.95 -0.99 -1.03 -1.07
4 -0.46 -0.54 -0.60 -0.66 -0.71 -0.76 -0.81 -0.85
5 -0.22 -0.31 -0.39 -0.46 -0.52 -0.57 -0.62 -0.66
6 0.00 -0.10 -0.19 -0.27 -0.34 -0.40 -0.45 -0.50
7 0.00 -0.09 -0.17 -0.23 -0.30 -0.35
8 0.00 -0.08 -0.15 -0.21
9 0.00 -0.07

i\n 19 20 21 22 23 24 25 26

1 -1.84 -1.87 -1.89 -1.91 -1.93 -1.95 -1.97 -1.98
2 -1.38 -1.41 -1.43 -1.46 -1.48 -1.50 -1.52 -1.54
3 -1.10 -1.13 -1.16 -1.19 -1.21 -1.24 -1.26 -1.29
4 -0.89 -0.92 -0.95 -0.98 -1.01 -1.04 -1.07 -1.09
5 -0.71 -0.75 -0.78 -0.82 -0.85 -0.88 -0.91 -0.93
6 -0.55 -0.59 -0.63 -0.67 -0.70 -0.73 -0.76 -0.79
7 -0.40 -0.45 -0.49 -0.53 -0.57 -0.60 -0.64 -0.67
8 -0.26 -0.31 -0.36 -0.41 -0.45 -0.48 -0.52 -0.55
9 -0.13 -0.19 -0.24 -0.29 -0.33 -0.37 -0.41 -0.44
10 0.00 -0.06 -0.12 -0.17 -0.22 -0.26 -0.30 -0.34
11 0.00 -0.06 -0.11 -0.16 -0.20 -0.24
12 0.00 -0.05 -0.10 -0.14
13 0.00 -0.05

*z for order statistic X(i) where i > n/2 is -z of order statistic X(j) where j = n + 1 - i.
(continued)
TABLE 2.3 (continued)

i\n 27 28 29 30 31 32 33 34

1 -2.00 -2.01 -2.03 -2.04 -2.06 -2.07 -2.08 -2.09
2 -1.56 -1.58 -1.60 -1.62 -1.63 -1.65 -1.66 -1.68
3 -1.31 -1.33 -1.35 -1.36 -1.38 -1.40 -1.42 -1.43
4 -1.11 -1.14 -1.16 -1.18 -1.20 -1.22 -1.23 -1.25
5 -0.96 -0.98 -1.00 -1.03 -1.05 -1.07 -1.09 -1.11
6 -0.82 -0.85 -0.87 -0.89 -0.92 -0.94 -0.96 -0.98
7 -0.70 -0.73 -0.75 -0.78 -0.80 -0.82 -0.85 -0.87
8 -0.58 -0.61 -0.64 -0.67 -0.69 -0.72 -0.74 -0.76
9 -0.48 -0.51 -0.54 -0.57 -0.60 -0.62 -0.65 -0.67
10 -0.38 -0.41 -0.44 -0.47 -0.50 -0.53 -0.56 -0.58
11 -0.28 -0.32 -0.35 -0.38 -0.41 -0.44 -0.47 -0.50
12 -0.19 -0.22 -0.26 -0.29 -0.33 -0.36 -0.39 -0.41
13 -0.09 -0.13 -0.17 -0.21 -0.24 -0.28 -0.31 -0.34
14 0.00 -0.04 -0.09 -0.12 -0.16 -0.20 -0.23 -0.26
15 0.00 -0.04 -0.08 -0.12 -0.15 -0.18
16 0.00 -0.04 -0.08 -0.11
17 0.00 -0.04

i\n 35 36 37 38 39 40 41 42

1 -2.11 -2.12 -2.13 -2.14 -2.15 -2.16 -2.17 -2.18
2 -1.69 -1.70 -1.72 -1.73 -1.74 -1.75 -1.76 -1.78
3 -1.45 -1.46 -1.48 -1.49 -1.50 -1.52 -1.53 -1.54
4 -1.27 -1.28 -1.30 -1.32 -1.33 -1.34 -1.36 -1.37
5 -1.13 -1.14 -1.16 -1.17 -1.19 -1.20 -1.22 -1.23
6 -1.00 -1.02 -1.03 -1.05 -1.07 -1.08 -1.10 -1.11
7 -0.87 -0.91 -0.92 -0.94 -0.96 -0.98 -0.99 -1.01
8 -0.79 -0.81 -0.83 -0.85 -0.86 -0.88 -0.90 -0.91
9 -0.69 -0.71 -0.73 -0.75 -0.77 -0.79 -0.81 -0.83
10 -0.60 -0.63 -0.65 -0.67 -0.69 -0.71 -0.73 -0.75
11 -0.52 -0.54 -0.57 -0.59 -0.61 -0.63 -0.65 -0.67
12 -0.44 -0.47 -0.49 -0.51 -0.54 -0.56 -0.58 -0.60
13 -0.36 -0.39 -0.42 -0.44 -0.46 -0.49 -0.51 -0.53
14 -0.29 -0.32 -0.34 -0.37 -0.39 -0.42 -0.44 -0.46
15 -0.22 -0.24 -0.27 -0.30 -0.33 -0.35 -0.37 -0.40
16 -0.14 -0.17 -0.20 -0.23 -0.26 -0.28 -0.31 -0.33
17 -0.07 -0.10 -0.14 -0.16 -0.19 -0.22 -0.25 -0.27
18 0.00 -0.03 -0.07 -0.10 -0.13 -0.16 -0.18 -0.21
19 0.00 -0.03 -0.06 -0.09 -0.12 -0.14
20 0.00 -0.03 -0.06 -0.09
21 0.00 -0.03

*z for order statistic X(i) where i > n/2 is -z of order statistic X(j) where j = n + 1 - i.

TABLE 2.3 (continued)

i\n 43 44 45 46 47 48 49 50

1 -2.19 -2.20 -2.21 -2.22 -2.22 -2.23 -2.24 -2.25
2 -1.79 -1.80 -1.81 -1.82 -1.83 -1.84 -1.85 -1.85
3 -1.55 -1.57 -1.58 -1.59 -1.60 -1.61 -1.62 -1.63
4 -1.38 -1.40 -1.41 -1.42 -1.43 -1.44 -1.45 -1.46
5 -1.25 -1.26 -1.27 -1.28 -1.30 -1.31 -1.32 -1.33
6 -1.13 -1.14 -1.16 -1.17 -1.18 -1.19 -1.21 -1.22
7 -1.02 -1.04 -1.05 -1.07 -1.08 -1.09 -1.11 -1.12
8 -0.93 -0.95 -0.96 -0.98 -0.99 -1.00 -1.02 -1.03
9 -0.84 -0.86 -0.88 -0.89 -0.91 -0.92 -0.94 -0.95
10 -0.76 -0.78 -0.80 -0.81 -0.83 -0.84 -0.86 -0.87
11 -0.69 -0.71 -0.72 -0.74 -0.76 -0.77 -0.79 -0.80
12 -0.62 -0.64 -0.65 -0.67 -0.69 -0.70 -0.72 -0.74
13 -0.55 -0.57 -0.59 -0.60 -0.62 -0.64 -0.66 -0.67
14 -0.48 -0.50 -0.52 -0.54 -0.56 -0.58 -0.59 -0.61
15 -0.42 -0.44 -0.46 -0.48 -0.50 -0.52 -0.53 -0.55
16 -0.36 -0.38 -0.40 -0.42 -0.44 -0.46 -0.48 -0.49
17 -0.29 -0.32 -0.34 -0.36 -0.38 -0.40 -0.42 -0.44
18 -0.23 -0.26 -0.28 -0.30 -0.32 -0.34 -0.36 -0.38
19 -0.17 -0.20 -0.22 -0.25 -0.27 -0.29 -0.31 -0.33
20 -0.12 -0.14 -0.17 -0.19 -0.21 -0.24 -0.26 -0.28
21 -0.06 -0.09 -0.11 -0.14 -0.16 -0.18 -0.21 -0.23
22 0.00 -0.03 -0.06 -0.08 -0.11 -0.13 -0.15 -0.18
23 0.00 -0.03 -0.05 -0.08 -0.10 -0.13
24 0.00 -0.03 -0.05 -0.07
25 0.00 -0.02

*z for order statistic X(i) where i > n/2 is -z of order statistic X(j) where j = n + 1 - i.

This plotting can be achieved by using already prepared normal probability paper such as shown in Figure 2.14 or by using arithmetic paper where the z of (2.19) is approximated by

z = sign(Fn(x) - .5)(1.238t(1 + 0.0262t))    (2.22)

Here

t = {-ln[4Fn(x)(1 - Fn(x))]}^(1/2)    (2.23)

and
FIGURE 2.15 Normal probability plots. (a) NOR data set. (b) Dosimeter data set.

sign(Fn(x) - .5) = +1 if Fn(x) - .5 > 0
                 = -1 if Fn(x) - .5 < 0    (2.24)

This approximation to z is given in Hamaker (1978) and appears to be of sufficient accuracy for plotting. Notice this z function defined by (2.22) and (2.24) can be programmed easily and so permits the use of simple computer graphics for performing normal probability plotting (see Section 2.3.6 for further details).
For small samples (say less than 50 observations) the z of (2.22) should be replaced with the expected values of the order statistics from the standard normal distribution, i.e., the distribution with μ = 0 and σ = 1 (Harter, 1961). Table 2.3 contains these for sample sizes up to 50. The normal probability plots for samples smaller than 25 can show substantial variation and nonlinearity even if the underlying distribution is normal (see, for example, Daniel and Wood, 1971, and Hahn and Shapiro, 1967). We caution the reader against placing too much reliance upon a plot in these situations. Remember, in general and especially for these situations, graphs should be used for informal preliminary judgments and/or as adjuncts to formal numerical techniques. Chapter 9 contains the formal techniques for testing for normality.
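Hamaker's approximation (2.22) to (2.24) takes only a few lines to program. A sketch; for Fn(x) = .90 it returns about 1.284, against the exact normal percentile 1.2816:

```python
import math

def hamaker_z(p):
    # Hamaker's (1978) approximation (2.22)-(2.24) to the standard normal
    # percentile point z such that Phi(z) = p.
    t = math.sqrt(-math.log(4.0 * p * (1.0 - p)))   # (2.23)
    sign = 1.0 if p - 0.5 > 0.0 else -1.0           # (2.24)
    return sign * 1.238 * t * (1.0 + 0.0262 * t)    # (2.22)
```

Applied to p_i = (i - 0.5)/n this gives the vertical-axis values for a normal probability plot on arithmetic paper.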

E 2.4.1.1 Examples

Figure 2.15 contains two normal probability plots. Figure 2.15a is a plot of the NOR data set already extensively discussed in Section 2.2. Figure 2.15b is a plot from a sample of 20 dosimeter readings of benzene (D'Agostino and Gillespie, 1978). A dosimeter is a portable device for measuring a person's exposure to various gases. The dosimeter data are in parts per million (ppm). The frequency distribution and plotting points z are:

Dosimeter Data for Measuring Benzene

Data values (ppm) | Frequencies | Expected values order statistics (z)

.93 3 -1.47
.95 6 -.53
.97 3 .06
.98 1 .32
.99 1 .45
1.01 4 .85
1.05 1 1.41
1.07 1 1.87

20

Notice in Figure 2.15b we plotted only 8 points since only eight different values appeared in the sample. The z values are averaged in the case of the ties. The line drawn in Figure 2.15b is the line x̄ + zs, where x̄ = .98 and s = .04.
For grouped data (i.e., data grouped into frequency classes) only one value per class should be plotted. This plotted value should be the true upper limit of the class (see Section 2.2.1 for an illustration of true upper limits and Section 2.3.5 for more details).

2.4.2 Deviations from Normality

2.4.2.1 Unimodal Distributions

A useful way to distinguish unimodal non-normal distributions from the normal is in terms of the skewness and kurtosis measures defined as

Skewness: √β₁ = E(X - μ)³ / {E(X - μ)²}^(3/2)    (2.25)

and

Kurtosis: β₂ = E(X - μ)⁴ / {E(X - μ)²}²    (2.26)

For the normal distribution √β₁ = 0 and β₂ = 3. The sample estimators of these and the tests of fit based on them are discussed in Chapters 7 and 9. Figure 2.16 contains normal probability plots of four data sets of the appendix which represent various combinations of √β₁ and β₂. Figures 2.16a and 2.16b are plots of symmetric distributions. Notice for β₂ < 3 (UNI data, β₂ = 1.80) the plot is, within sampling error, antisymmetric about the median, being concave for x < median and convex for x > median. For β₂ > 3 (SU(0,2) data, β₂ = 4.51) the plot is again, within sampling error, antisymmetric about the median. Now, however, it is convex for x < median and concave for x > median. (See Figure 2.18 for further illustrations.) Notice also for skewed distributions (Figures 2.16c and 2.16d) the plots are either convex or concave throughout.
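The sample analogues of (2.25) and (2.26), treated formally in Chapters 7 and 9, can be sketched as follows (the three-point sample is purely illustrative):

```python
def sample_skew_kurt(xs):
    # Sample analogues of (2.25) and (2.26) using central moments
    # m_k = sum((x - xbar)^k) / n:
    #   sqrt(b1) = m3 / m2^{3/2}   and   b2 = m4 / m2^2
    n = len(xs)
    xbar = sum(xs) / n
    m2 = sum((x - xbar) ** 2 for x in xs) / n
    m3 = sum((x - xbar) ** 3 for x in xs) / n
    m4 = sum((x - xbar) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# A symmetric toy sample: skewness 0, kurtosis 1.5:
skew, kurt = sample_skew_kurt([1.0, 2.0, 3.0])
```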

2.4.2.2 Outliers, Mixtures and Contamination

Figure 2.17 illustrates the use of normal probability plotting for the detection of outliers and the presence of mixtures (or contamination). For previous discussions see Sections 2.2.3 and 2.2.4, respectively. Figure 2.17a is a plot of the data whose ecdf is given in Figure 2.5b (i.e., first nine observations of the NOR data set plus one outlier equal to 140). Notice how the point

FIGURE 2.16 Normal probability plots for nonnormal unimodal distributions. Symmetric distributions (√β₁ = 0): (a) UNI data, β₂ = 1.80; (b) SU(0,2) data, β₂ = 4.51. Skewed distributions (√β₁ ≠ 0): (c) SU(1,2) data, √β₁ = -.87, β₂ = 5.59; (d) EXP data, √β₁ = 2, β₂ = 9.

FIGURE 2.17 Normal analysis for outliers and mixtures (or contamination). (a) NOR data (n = 10), detection of outlier.

corresponding to the observation 140 is clearly out of line with the rest of the data. In practice the techniques of Chapter 12 should now be used to confirm that this point is an outlier.
Figures 2.17b and 2.17c are normal probability plots of the contaminated normal data sets LCN(.10,3) and LCN(.20,3) whose ecdfs are given, respectively, in Figures 2.7a and 2.7b. The reader should note two important related points concerning both Figures 2.17b and 2.17c. First, both reveal the presence of two straight lines. This is seen, for example, in Figure 2.17b [LCN(.1,3) data set] where one straight line can be fit nicely through the data below the 80th percentile of the sample and a second straight line can be fit through the data from about the 92nd percentile up. The points

from the 80th to the 92nd percentiles represent a contaminated or transition zone where the two distributions cannot be clearly separated. Second, neither plot displays a convex nor a concave pattern throughout. One of these patterns would be the case if we had simply skewed distributions under analysis here as we did in Figures 2.16c and 2.16d. Recall the ecdfs of the LCN(.10,3) and LCN(.20,3) data sets could not be distinguished from those of skewed distributions. With probability plotting the underlying components can surface as straight line segments in the plot, and so do produce a completely different effect than what is produced by a unimodal skewed distribution (see Figure 2.18 for further illustration). Once it is established that there are two or more components in the data, the next step is to estimate the parameters of the components.

FIGURE 2.17 (continued) (c) LCN(.20,3) data, contaminated normal.

The reader is referred to Bliss (1967) and Johnson and Kotz (1970) for further details.

2.4.2.3 Recognizing and Responding to Nonnormality

Figure 2.18 provides guidelines to aid the user in interpreting normal probability plots. Notice in the drawings of Figure 2.18 the empirical cumulative distribution function and/or z scale is on the vertical axis. Some graphs have these on the horizontal axis. The resulting configurations will be different if this is done.

[Figure 2.18 panel indications: √β₁ = 0, β₂ < 3: symmetric, tails thinner than normal. √β₁ = 0, β₂ > 3: symmetric, tails thicker than normal. √β₁ < 0: skewed to left. √β₁ > 0: skewed to right. Mixture of normal. Truncated at left. Truncated at right. Outlier.]

FIGURE 2.18 Indications of nonnormality from the normal probability plots.

2.5 LOGNORMAL PROBABILITY PLOTTING

2.5.1 Probability Plotting for Two Parameter Lognormal

The two parameter lognormal distribution has density

f(x) = (1/(xσ√(2π))) exp{-(ln x - μ)²/(2σ²)},   x > 0    (2.27)

The random variable Y = ln X has a normal distribution with mean μ and standard deviation σ. Probability plotting for this distribution can be achieved in a number of ways: (1) on already prepared lognormal probability paper (Figure 2.19), (2) on normal probability paper such as shown in Figure 2.14 where x of the horizontal axis is replaced by log x, (3) on arithmetic graph paper where z of (2.22) to (2.24) is the variable of the vertical axis and log x is the variable of the horizontal axis, or (4) on semi-log graph paper

FIGURE 2.20 Lognormal probability plots. (a) Total suspended particulates.
(b) CHEN data set.

where the data x is the variable of the log axis and z of (2.22) to (2.24) is
the variable of the equal interval scale axis. In selecting graph paper with a
log scale, the user should select one with enough cycles to accommodate the
data. Graph papers with one to five cycles on the log axis are readily
available. For samples of less than 50 the z of (2.22) to (2.24) should be
replaced with the expected values of the standardized order statistics of
Table 2.3.
    Figure 2.20 contains lognormal probability plots. Figure 2.20a contains
three plots of TSP (total suspended particulates) data from three air quality
monitoring sites near Boston, Massachusetts. Notice this figure uses
lognormal paper. Figure 2.20b is a plot on arithmetic graph paper of the CHEN
data set given in the Appendix. These data are taken from Bliss (1967) and
they are the lethal doses of the drug cinobufagin in 10 (mg/kg), as determined
by titration to cardiac arrest in individual etherized cats. The doses,
ln(doses), frequencies and plotting position z values are:

Dose   ln(Dose)   f     z        Dose   ln(Dose)   f     z

1.26    .231      1   -1.97      2.34    .850      1    .10
1.37    .315      1   -1.52      2.41    .880      1    .20
1.55    .438      1   -1.26      2.56    .940      1    .30
1.71    .536      1   -1.07      2.63    .967      2    .46
1.77    .571      1    -.91      2.67    .982      1    .64
1.81    .593      1    -.76      2.82   1.037      2    .84
1.89    .637      2    -.58      2.84   1.044      1   1.07
1.98    .683      1    -.41      2.99   1.095      1   1.26
2.03    .708      3    -.20      3.65   1.295      1   1.52
2.07    .728      1    .00       3.83   1.343      1   1.97

n = 25

The straight line drawn in Figure 2.20b is the line

    ln(Dose) = μ̂ + zσ̂ = .7972 + z(.2790)

where μ̂ and σ̂ are the sample mean and standard deviation, respectively, of
the logs of the data. For lognormal data the parameters of interest are
usually the geometric mean and geometric standard deviation. For the model
(2.27) these are, respectively,

    e^μ    and    e^σ

Estimates of these based on the data in Figure 2.20b are exp(.7972) = 2.2193
and exp(.2790) = 1.3218. For data plotted directly on lognormal paper
exp(μ) is estimated as the 50th percentile (i.e., x value corresponding to
z = 0) and exp(σ) is estimated as the ratio of the 84th percentile to the 50th

percentile (i.e., the ratio of the x value corresponding to z = 1 to the x
value corresponding to z = 0).
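The log-scale route just described (take logs, compute the sample mean and standard deviation, exponentiate to get the geometric mean and geometric standard deviation) can be sketched in a few lines. This is a minimal illustration, not code from the text; the function name and the n − 1 divisor for the standard deviation are our choices:

```python
import math

def lognormal_fit(data):
    """Estimate the geometric mean and geometric standard deviation of a
    sample assumed to follow the two-parameter lognormal model (2.27)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    # Sample standard deviation of the logs (n - 1 divisor is our choice).
    sigma = math.sqrt(sum((y - mu) ** 2 for y in logs) / (n - 1))
    return math.exp(mu), math.exp(sigma)
```

For data plotted on lognormal paper the same two quantities can instead be read off the plot, as described above, from the 50th and 84th percentiles.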

2.5.1.1 Zero Data Values

At times, when dealing with a set of data that appears to be lognormally
distributed, there may be a subset of these observations that are all equal to
zero. Before the data can be plotted these zeros must be "adjusted." First,
it is possible that they represent a contamination and simply should be
removed. Second, they may reflect a measurement limitation of the
measurement instrument. In this case it may be justified to replace them with
the "least detectable level" of the instrument. If this is not known then it may
be possible to adjust the zeros by adding a small arbitrary constant to them or
to all the data values before they are plotted. Careful consideration should
be given before any of these suggestions are employed.

2.5.2 Three Parameter Lognormal

The three parameter lognormal distribution has density

    f(x) = (1/((x - λ)σ√(2π))) exp{-(ln(x - λ) - μ)²/(2σ²)},    x > λ    (2.28)

FIGURE 2.21 Lognormal plot for three parameter lognormal. • Original data;
× Data - .1. Line is ln(Data - .1) = -1.7025 + 2.2781z; λ̂ = .1, μ̂ = -1.7025,
σ̂ = 2.2781.

A plot of data from this distribution for a lognormal analysis will not produce
a straight line unless the λ value or an estimate of it is subtracted from all
the data. The data in Figure 2.21 illustrate the situation. These data come
from Leidel, Busch and Lynch (1977) and represent readings of hydrogen
fluoride. The dots represent the unadjusted data. These data and the plotting
positions for z are given in the first four columns below:

Data (ppm)    f      z     ln(Data)   ln(Data - .1)

   .11        2   -1.38     -2.21        -4.61
   .12        1    -.79     -2.12        -3.91
   .14        2    -.42     -1.97        -3.22
   .21        1    -.10     -1.56        -2.21
   .33        1     .10     -1.11        -1.47
   .80        1     .31      -.22         -.36
   .91        1     .54      -.09         -.21
  1.30        1     .79       .26          .18
  2.60        1    1.12       .96          .92
 10.00        1    1.63      2.30         2.29

n = 12

Notice how the dots at the lower end bend in a concave manner while those
at the upper end do appear to follow a straight line.
    There are many ways to obtain estimators of λ for this type of data
(Aitchison and Brown, 1957, and Johnson and Kotz, 1970, chapter 14). The
author has found the following two simple informal procedures to be useful
in the graphical stage of analysis. First, note in Figure 2.21 that the lower
end dots do appear to be approaching asymptotically the log value of -2.3.
The antilog of this asymptote (viz., exp(-2.3) = .10) can be used as an
estimate of λ. Second, if we use X_P to represent the Pth sample quantile
(0 < P < 100) then the following should be approximately true

    ln(X_P - λ) + ln(X_{100-P} - λ) = 2 ln(X_50 - λ)    (2.29)

for all P (0 < P < 50). From (2.29) we have as an estimator of λ

    λ̂ = (X_P X_{100-P} - X_50²) / (X_P + X_{100-P} - 2X_50)    (2.30)

The usually recommended value of P is 5, and so the suggested estimator is


    λ̂ = (X_95 X_5 - X_50²) / (X_95 + X_5 - 2X_50)    (2.31)

As a value for the Pth sample quantile, the user can use either the ith order
statistic where

    i = [n(.01P)] + 1    (2.32)

(here [y] is the largest integer in y) or else obtain it directly from a graph.
That is, draw a curve by hand through the data and use as the Pth quantile
the x value on the horizontal axis corresponding to 100F_n(x) = Pth value on
the vertical axis. Applying (2.31) to the data of Figure 2.21 we again obtain
.1 as a good approximation to λ. Figure 2.21 also contains a plot of the data
minus this .1 value (i.e., ln(Data - .1)). This plot is given as x values on
the graph. A straight line now does fit reasonably well these values indicating
the appropriateness of the lognormal distribution.
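The quantile estimator (2.31), with order statistics chosen by (2.32), translates directly into code. A minimal sketch (the function name is ours; tied values are simply repeated in the input list):

```python
def estimate_lambda(data, p=0.05):
    """Quantile-based threshold estimate for a three-parameter lognormal,
    in the form of equation (2.31):
        lambda = (x_p * x_{1-p} - x_50**2) / (x_p + x_{1-p} - 2*x_50).
    Order statistics are picked by i = [n * p] + 1 (equation 2.32)."""
    x = sorted(data)
    n = len(x)

    def quantile(q):
        # [n*q] + 1 in 1-based indexing is int(n*q) in 0-based indexing.
        return x[min(int(n * q), n - 1)]

    xp, x50, xq = quantile(p), quantile(0.5), quantile(1 - p)
    return (xp * xq - x50 ** 2) / (xp + xq - 2 * x50)
```

Applied to the hydrogen fluoride readings above (with each value repeated according to its frequency f), this returns approximately .10, matching the text's estimate of λ.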

2.5.3 Responding to Lognormal Plots

Figure 2.22 provides guidelines to aid the user in interpreting lognormal
probability plots. As with Figure 2.18 the drawings in Figure 2.22 have the
empirical cumulative distribution function and/or z scale on the vertical
axis.

[Figure 2.22 drawings indicate: normal distribution; Weibull distribution;
truncated left; truncated right; truncated both tails; misclassified data left
(possibly normal distribution); misclassified data right (mixture of
distributions); finite tailed distribution; outlier left tail; outlier right tail;
outliers both tails.]

FIGURE 2.22 Indications of non-lognormality from nonlinearity of lognormal
probability plots.

2.6 WEIBULL PROBABILITY PLOTTING

2.6.1 Probability Plotting for Two Parameter Weibull

The two parameter Weibull distribution has density

    f(t) = (k/θ)(t/θ)^{k-1} exp{-(t/θ)^k}    (2.33)

and cdf

    F(t) = 1 - e^{-(t/θ)^k}    (2.34)

where θ, k, t > 0. This is a very versatile distribution and by varying the
parameter k it can assume a large number of different shapes. For example,
when k = 1 the Weibull distribution is the negative exponential distribution,
when k is in the neighborhood of 3.6 the Weibull distribution is similar in
shape to the normal distribution, when k < 3.6 the Weibull has positive
skewness (i.e., √β₁ > 0) and when k > 3.6 it has negative skewness (i.e.,
√β₁ < 0). To illustrate this versatility we have in Figure 2.23a plots of the
Weibull density for four different k values.
    A large number of probability plotting papers are available for a Weibull
analysis (see Nelson and Thompson, 1971). However, Weibull probability
plotting can be achieved also simply by plotting

    z = ln(-ln(1 - F_n(t)))    on    ln t    (2.35)

FIGURE 2.23 Weibull analysis. (a) Weibull densities for different k values
(θ = 1). (b) Weibull analysis of WE2 data.

This follows immediately from the cdf given in (2.34) and the general
procedure for probability plotting described in Section 2.3.1. In particular,
from (2.34) we obtain

    1 - F(t) = exp{-(t/θ)^k}

    ln(1 - F(t)) = -(t/θ)^k

    ln(-ln(1 - F(t))) = k ln t - k ln θ

where

    x = ln t,    μ = ln θ    and    σ = 1/k    (2.36)

Note with Weibull plotting the log of the data (i.e., log t) is put on the
horizontal axis rather than the data values directly. This is needed to obtain
linearity in the plots. Now applying (2.11) or (2.19) we obtain (2.35). Further,
while small sample expected values of order statistics are available (Mann,
1968), Weibull probability plotting usually works well by simply employing
(2.35) for all sample sizes. To illustrate Weibull plotting we present two
plots. Figure 2.23b is a Weibull plot of the WE2 data set (i.e., 100
observations from the Weibull distribution with k = 2 and θ = 1). Figure 2.23c
is a plot of eleven survival times in months of cancer patients who have had
an adrenalectomy. This data set was obtained from a study by Dr. Richard
Oberfield of the Lahey Clinic, Boston. The data are:

(1)    (2)    (3)        (4)      (5)                        (6)
Rank    t     x = ln t   F_n(t)   z = ln(-ln(1 - F_n(t)))    z: Plotting position

 1      6     1.79       .045     -3.07                      -3.07
 2      9     2.20       .136     -1.92                      -1.92
 3     13     2.56       .227     -1.36                      -1.36
 4     18     2.89       .318      -.96                       -.96
 5     22     3.09       .409      -.64                       -.50
 6     22     3.09       .500      -.37                       -.50
 7     36     3.58       .590      -.11                        .02
 8     36     3.58       .682       .14                        .02
 9     37     3.61       .773       .39                        .39
10     41     3.71       .864       .69                        .69
11     52     3.95       .955      1.13                       1.13

Following our previous convention, Figure 2.23c is a plot of the z's of
column (6) on the x's of column (3).
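The plotting coordinates in columns (3) to (5) can be generated directly from equation (2.35). A minimal sketch (the function name is ours; the tabulated F_n values correspond to the plotting position (i - ½)/n, which this sketch assumes):

```python
import math

def weibull_plot_coords(times):
    """Coordinates for a Weibull probability plot: x = ln t against
    z = ln(-ln(1 - Fn(t))), as in equation (2.35), using the plotting
    position Fn(t) = (i - 0.5)/n for the ith order statistic."""
    t = sorted(times)
    n = len(t)
    coords = []
    for i, ti in enumerate(t, start=1):
        fn = (i - 0.5) / n
        coords.append((math.log(ti), math.log(-math.log(1.0 - fn))))
    return coords
```

Run on the eleven survival times, this reproduces columns (3) and (5) of the table; tied observations would still be averaged by hand to get the adjusted plotting positions of column (6).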

The straight line drawn in Figure 2.23b was drawn using the "by eye"
technique described in Section 2.3.3.1. From the line we obtain

    x = ln t = μ̂ + zσ̂ = -.05 + z(.47)

Notice μ̂ and σ̂ are location and scale parameters of the distribution of ln T.
However, they are not the mean and standard deviation. Using (2.36) we
next obtain

    θ̂ = exp(μ̂) = .95    and    k̂ = 1/σ̂ = 2.13

Recall the true parameter values are θ = 1 and k = 2. The informal "by eye"
technique did well in this case.
    Because the sample size is small for the data in Figure 2.23c we use
the unweighted least squares estimates of (2.17) to obtain the estimates of μ
and σ from this data (columns (3) and (5) of the data above were used for the
x's and z's, respectively). The estimates are μ̂ = 3.40 and σ̂ = .55. From
these we obtain θ̂ = exp(3.40) = 29.96 and k̂ = 1/.55 = 1.82.
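The least squares step just used for Figure 2.23c can be sketched as below. Equation (2.17) itself is not reproduced in this chunk, so the ordinary least squares slope and intercept formulas here are an assumption; the back-transformation θ = exp(μ), k = 1/σ is (2.36):

```python
import math

def weibull_fit_from_plot(xs, zs):
    """Fit the plotting line x = mu + sigma*z by unweighted least
    squares (x = ln t, z = ln(-ln(1 - Fn))), then return
    (theta, k) = (exp(mu), 1/sigma) via equation (2.36)."""
    n = len(xs)
    xbar = sum(xs) / n
    zbar = sum(zs) / n
    sigma = (sum((z - zbar) * (x - xbar) for x, z in zip(xs, zs))
             / sum((z - zbar) ** 2 for z in zs))
    mu = xbar - sigma * zbar
    return math.exp(mu), 1.0 / sigma
```

Feeding in columns (3) and (5) of the adrenalectomy table gives μ̂ ≈ 3.40, σ̂ ≈ .55 and hence θ̂ ≈ 29.96, k̂ ≈ 1.82, agreeing with the values quoted above.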

2.6.1.1 Zero Data

As with the lognormal distribution there may be a subset of zero values in
data that otherwise appears to have a Weibull distribution. For an example
dealing with wind speed data see Takle and Brown (1978). The recommended
procedures for dealing with these zero values are exactly those given for the
lognormal in Section 2.5.1.1.

2.6.2 Three Parameter Weibull

The three parameter Weibull has density

    f(t) = (k/θ)((t - λ)/θ)^{k-1} exp{-((t - λ)/θ)^k},    t > λ    (2.37)

Similar to the three parameter lognormal distribution, the λ value must be
subtracted from all the data before a Weibull plot will produce a straight
line. In many cases a value close to the minimum t is adequate as an
estimate of λ. Other more precise techniques can be employed (Johnson and
Kotz, 1970).

2.7 OTHER TOPICS

There are a number of other topics for which, due to space limitations, we
cannot give detailed treatment. Some of these are:

1. Analysis of Residuals from a Model. All of the above material
applies directly if we are dealing with residuals from a model. That is,
say we have a mathematical model

    Y = h(X, β) + η    (2.38)

where Y represents a random variable, X a vector of random variables or
known constants, β a vector of unknown parameters and η an error term.
For example, (2.38) can represent a multiple regression model. If the β
vector is estimated from data, say the estimator is β̂, then the estimator of Y
is

    Ŷ = h(X, β̂)    (2.39)

and

    η̂ = Y - Ŷ    (2.40)

is the residual. While the residuals comprise a dependent sample the
graphical techniques described above can be used to analyze them. Chapter 12
on outliers discusses further the analysis of residuals.
    2. Analysis of Censored Samples. There are no restrictions in the
above which make it necessary for all the data to be available for plotting.
The above techniques can be used on censored data. There are, however,
special concerns which arise with censored data (e.g., see Nelson, 1972)
that require careful and complete discussion. Chapter 11 is devoted solely
to this problem.
    3. Q-Q Plots and P-P Plots. Wilk and Gnanadesikan (1968) discuss in
detail the quantile-quantile probability plots (Q-Q plots) and the percentage-
percentage probability plots (P-P plots). A Q-Q plot is the plot of the
quantiles (or, as we call them above, the percentiles) of one distribution on the
quantiles of a second distribution. If one of these distributions is a
hypothesized theoretical distribution a Q-Q plot is just a probability plot as
developed in Section 2.3. A P-P plot is the plot of the percentages of one
distribution on the percentages of a second. Wilk and Gnanadesikan (1968)
state that while the P-P plots are limited in their usefulness, they are useful
for detecting discrepancies in the middle of a distribution (i.e., about the
median) and also may be useful for multivariate analysis.
    4. Transformation to Normality. In some settings (e.g., analysis of
variance) it is suggested first to transform to normality and then analyze
the transformed data. Box and Cox (1964) suggest a power transformation for
this. Their transformation is as follows:

        x^θ       if θ > 0
    y = log x     if θ = 0    (2.41)
        -x^θ      if θ < 0

Here x refers to the original data and θ is the power exponent. Box and Cox
develop a maximum likelihood estimator for θ. Once it is computed normal
probability plotting as developed in Section 2.4 can be applied directly to the
transformed data y of (2.41). The techniques of Chapter 9 can be used to
test formally the normality of the transformed data.
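The three branches of (2.41) can be sketched as a single function (the function name is ours; estimating θ itself requires the maximum likelihood machinery of Box and Cox and is not attempted here):

```python
import math

def power_transform(x, theta):
    """Power transformation toward normality in the form of (2.41):
    x**theta for theta > 0, log x for theta = 0, and -x**theta for
    theta < 0 (the sign keeps the transformation increasing in x)."""
    if theta > 0:
        return x ** theta
    if theta == 0:
        return math.log(x)
    return -(x ** theta)
```

The transformed values y can then be checked with the normal probability plotting of Section 2.4.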
    5. Probability Plotting for the Gamma Distribution. One important
distribution that does not lend itself immediately to the probability plotting
techniques described above is the gamma distribution. Even with the aid of
transformations it cannot be put in the simple form of a distribution
dependent upon a location and scale parameter. Wilk, Gnanadesikan and Huyett
(1962) present a technique and accompanying tables to handle this situation.
See Chapter 11 for further comments on this.
    6. Multivariate Normality. There has been much attention paid to the
problem of probability plotting for the multivariate normal distribution and
a number of techniques have been suggested (Gnanadesikan, 1973). The
author has found the following technique to be very informative. First
transform the data to principal components and then do univariate normal
probability plotting (Section 2.4) for each component. Each component can be
considered an independent variable. If the original data set is from a
multivariate normal distribution then each component should produce a straight
line in the univariate plots. More will be said about multivariate normality in
Chapter 9.
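The principal components step can be sketched as below, assuming NumPy is available; the function name and the use of an eigendecomposition of the sample covariance matrix are our choices, not a procedure quoted from the text:

```python
import numpy as np

def principal_component_scores(data):
    """Rotate multivariate data (an n-by-p array) to principal
    components, so each column of the result can be examined with a
    univariate normal probability plot (Section 2.4)."""
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # largest component first
    return centered @ eigvecs[:, order]
```

By construction the component scores are uncorrelated, so each column can be plotted as if it were an independent univariate sample.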

2.8 CONCLUDING COMMENT

The aim of this chapter has been to present to the reader simple informal
graphical techniques which can be used in conjunction with the formal
techniques to be discussed in the following chapters. In performing an analysis
we suggest that the reader should draw a graph, examine it and judge if
other graphs are needed. As the formal numerical techniques are being
applied, use the graphs to interpret them and to gain insight into the
phenomenon under investigation.

REFERENCES

Aitchison, J. and Brown, J. A. C. (1957). The Lognormal Distribution.
    Cambridge University Press, London.

Andrews, D. F. (1972). Plots of high dimensional data. Biometrics 28,
    125-136.

Anscombe, F. J. (1973). Graphs in statistical analysis. The Amer.
    Statistician 27(1), 17-21.

Barnett, V. (1975). Probability plotting methods and order statistics.
    J. R. Statist. Soc. C 24, 95-108.

Bliss, C. I. (1967). Statistics in Biology, Vol. I. McGraw-Hill, New York.

Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations.
    J. R. Statist. Soc. B 26, 211-252.

Bryson, M. C. (1974). Heavy-tailed distributions: properties and tests.
    Technometrics 16, 61-68.

Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983).
    Graphical Methods for Data Analysis. Duxbury Press, Boston.

Chernoff, H. and Lieberman, G. (1954). Use of normal probability paper.
    Jour. Amer. Stat. Assoc. 49, 778-785.

Chernoff, H. and Lieberman, G. (1956). The use of generalized probability
    paper for continuous distributions. Annals of Math. Stat. 27, 806-818.

Curran, T. C. and Frank, N. H. (1975). Assessing the validity of the
    lognormal model when predicting maximum air pollution concentrations.
    Presented at the 68th Annual Meeting of the Air Pollution Control
    Association, Boston, Mass.

D'Agostino, R. B. and Lee, A. F. S. (1976). Linear estimation of the
    logistic parameters for complete or tail-censored samples. Jour. Amer.
    Stat. Assoc. 71, 462-464.

D'Agostino, R. B. and Gillespie, J. C. (1978). Comments on the OSHA
    accuracy of measurement requirement for monitoring employee exposure
    to benzene. Amer. Industrial Hygiene Assoc. Journ. 39, 510-513.

Daniel, C. (1959). Use of half-normal plots in interpreting factorial two-
    level experiments. Technometrics 1, 311-341.

Daniel, C. and Wood, F. S. (1971). Fitting Equations to Data. Wiley,
    New York.

Feder, P. I. (1974). Graphical techniques in statistical data analysis.
    Technometrics 16, 287-299.

Gentleman, J. F. (1977). It's all a plot (using interactive computer graphics
    in teaching statistics). The Amer. Statistician 31(4), 166-175.

Gnanadesikan, R. (1973). Graphical methods for informal inference in
    multivariate data analysis. Bull. of the International Stat. Institute.
    Proceedings of the 39th Session.

Gnanadesikan, R. and Lee, E. T. (1970). Graphical techniques for internal
    comparisons amongst equal degree of freedom groupings in multiresponse
    experiments. Biometrika 57, 229-237.

Grubbs, F. E. (1969). Procedures for detecting outlying observations in
    samples. Technometrics 11, 1-21.

Gupta, S. S. and Shah, B. K. (1965). Exact moments and percentage points
    of the order statistics and the distribution of the range from the logistic
    distribution. Ann. Math. Statist. 36, 907-920.

Gupta, S. S., Qureishi, A. S., and Shah, B. K. (1967). Best linear unbiased
    estimators of the parameters of the logistic distribution using order
    statistics. Technometrics 9, 43-56.

Hahn, G. J. and Shapiro, S. S. (1967). Statistical Models in Engineering.
    Wiley, New York.

Hamaker, H. C. (1978). Approximating the cumulative normal distribution
    and its inverse. Applied Statist. 27, 76-79.

Harding, J. P. (1949). The use of probability paper for the graphical
    analysis of polymodal frequency distributions. Jour. of Marine Biology
    Assoc., United Kingdom 28, 141-153.

Harter, H. L. (1961). Expected values of normal order statistics. Biometrika
    48, 151-165.

Johnson, N. L. and Kotz, S. (1970). Continuous Univariate Distributions,
    Vol. I. Wiley, New York.

Kimball, B. F. (1960). On the choice of plotting positions on probability
    paper. Jour. Amer. Stat. Assoc. 55, 546-560.

Leidel, N. A., Busch, K. A., and Lynch, J. R. (1977). Occupational
    Exposure Sampling Strategy Manual. U.S. Dept. of H.E.W.,
    Cincinnati, Ohio.

Mann, N. R. (1968). Results on statistical estimation and hypothesis testing
    with application to the Weibull and extreme-value distributions.
    Aerospace Research Laboratories Report ARL 68-0068, U.S. Air Force,
    Wright-Patterson Air Force Base, Ohio.

Mosteller, F. and Tukey, J. W. (1977). Data Analysis and Regression.
    Addison-Wesley, Reading, Mass.

Nelson, L. S. (1977). Graphical aid for drawing normal distributions. Jour.
    Quality Technology 9, 42-43.

Nelson, W. (1972). Theory and application of hazard plotting for censored
    failure data. Technometrics 14, 945-966.

Nelson, W. and Thompson, V. C. (1971). Weibull probability papers. Jour.
    of Quality Technology 3, 45-50.

Rényi, A. (1970). Probability Theory. North-Holland Publishing, Amsterdam.

Takle, E. S. and Brown, J. M. (1978). Note on the use of Weibull statistics
    to characterize wind-speed data. Journal of Applied Meteorology 17,
    556-559.

Taylor, B. J. R. (1965). The analysis of polymodal frequency distributions.
    Jour. of Animal Ecology 34, 445-452.

Tukey, J. W. (1971). Dynamic and static display of multidimensional data:
    a number of suggestions. Unpublished mimeo report.

Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley, Reading,
    Mass.

Wilk, M. B., Gnanadesikan, R. and Huyett, M. J. (1962). Probability plots
    for the gamma distribution. Technometrics 4, 1-15.

Wilk, M. B. and Gnanadesikan, R. (1964). Graphical methods for internal
    comparisons in multiresponse experiments. Ann. Math. Statist. 35,
    613-631.

Wilk, M. B. and Gnanadesikan, R. (1968). Probability plotting methods for
    the analysis of data. Biometrika 55, 1-19.
Tests of Chi-Squared Type

David S. Moore    Purdue University, West Lafayette, Indiana

3.1 IN TR O D U C TIO N

In the course of his Mathematical Contributions to the Theory of Evolution,
Karl Pearson abandoned the assumption that biological populations are
normally distributed, introducing the Pearson system of distributions to
provide other models. The need to test fit arose naturally in this context,
and in 1900 Pearson invented his chi-squared test. This statistic and others
related to it remain among the most used statistical procedures.
    Pearson's idea was to reduce the general problem of testing fit to a
multinomial setting by basing a test on a comparison of observed cell counts
with their expected values under the hypothesis to be tested. This reduction
in general discards some information, so that tests of chi-squared type are
often less powerful than other classes of tests of fit. But chi-squared tests
apply to discrete or continuous, univariate or multivariate data. They are
therefore the most generally applicable tests of fit.
Modern developments have increased the flexibility of chi-squared tests,
especially when unknown parameters must be estimated in the hypothesized
family. This chapter considers two classes of chi-squared procedures. One,
called "classical" because it contains such familiar statistics as the log
likelihood ratio, Neyman modified chi-squared, and Freeman-Tukey, is
discussed in Section 3.2. The second, consisting of nonnegative definite
quadratic forms in the standardized cell frequencies, is the main subject of
Section 3.3. Other newer developments relevant to both classes of statistics,
especially the use of data-dependent cells, are also treated primarily in 3.3,
while such practical considerations as choice of cells and accuracy of
asymptotic approximate distributions appear in 3.2. Both sections contain a
number of examples.

64 MOORE

Tests of the types considered here are also used in assessing the fit of
models fo r categorical data. The scope of this volume forbids venturing into
this closely related territory. Bishop, Fienberg, and Holland (1975) discuss
the methods of categorical data analysis most closely related to the contents
of this chapter.

3.2 CLASSICAL CHI-SQUARED STATISTICS

To test the simple hypothesis that a random sample X₁, ..., X_n has the
distribution function F(x), Pearson partitioned the range of X_j into M cells,
say E₁, ..., E_M. If N₁, ..., N_M are the observed numbers of X_j's in these
cells, then N_i has the binomial distribution with parameters n and

    p_i = P(X_j falls in E_i) = ∫_{E_i} dF(x)    (3.1)

when the null hypothesis is true. Pearson reasoned that the differences
N_i - np_i between observed and expected cell frequencies express lack of fit
of the data to F, and he sought an appropriate function of these differences
for use as a measure of fit.
Pearson's argument here was in three stages: (i) The quantities N_i - np_i
have in large samples approximately a multivariate normal distribution, and
this distribution is nonsingular if only M - 1 of the cells are considered.
(ii) If Y = (Y₁, ..., Y_p)' has a nonsingular p-variate normal distribution
N_p(μ, Σ), then the quadratic form (Y - μ)'Σ⁻¹(Y - μ) appearing in the
exponent of the density function has the χ²(p) distribution as a function of Y.
Here of course μ is the p-vector of means, and Σ is the p × p covariance
matrix of Y. (iii) Computation shows that if Y = (N₁ - np₁, ..., N_{M-1} - np_{M-1})',
this quadratic form is

    X² = Σ_{i=1}^{M} (N_i - np_i)² / np_i

which therefore has approximately the χ²(M - 1) null distribution in large
samples. This is the Pearson chi-squared statistic.
    This elegant argument will reappear in our survey of recent advances
in chi-squared tests. Pearson reduced the problem of testing fit to the
problem of testing whether a multinomial distribution has cell probabilities p_i
given by (3.1). This problem, and the statistic X², do not depend on whether
F is univariate or multivariate, discrete or continuous. But if F is
continuous, consideration of only the cell frequencies N_i does not fully use the
information available in the observations. Thus the flexibility and relative
lack of power of X² stem from the same source.
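Pearson's statistic is straightforward to compute from the cell counts and the null cell probabilities (3.1). A minimal sketch (the function name is ours):

```python
def pearson_chi_squared(counts, probs):
    """Pearson's X2 = sum over cells of (N_i - n p_i)**2 / (n p_i),
    for observed counts N_i and hypothesized cell probabilities p_i.
    Under the simple null hypothesis X2 is approximately chi-squared
    with M - 1 degrees of freedom in large samples."""
    n = sum(counts)
    return sum((N - n * p) ** 2 / (n * p) for N, p in zip(counts, probs))
```

With four equiprobable cells and counts (30, 20, 25, 25), for example, the statistic is (5² + 5²)/25 = 2, to be referred to the χ²(3) distribution.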
TESTS OF CHI-SQUARED TYPE 65

3.2.2 Composite Hypothesis

It is common to wish to test the composite hypothesis that the distribution
function of the observations X_j is a member of a parametric family {F(·|θ):
θ in Ω}, where Ω is a p-dimensional parameter space. Pearson recommended
estimating θ by an estimator θ_n (a function of X₁, ..., X_n), and
testing fit to the distribution F(·|θ_n). Thus the estimated cell probabilities
become

    p_i(θ_n) = ∫_{E_i} dF(x|θ_n)

and the Pearson statistic is

    X²(θ_n) = Σ_{i=1}^{M} [N_i - np_i(θ_n)]² / np_i(θ_n)

Pearson did not think that estimating θ changes the large sample distribution
of X², at least when θ_n is consistent. In this he was wrong. It was not until
1924 that Fisher showed that the limiting null distribution of X²(θ_n) is not
χ²(M - 1), and that this distribution depends on the method of estimation
used.
    Fisher argued that the appropriate method of estimation is maximum
likelihood estimation based on the cell frequencies N_i. This grouped data
MLE is the solution of the equations

    Σ_{i=1}^{M} (N_i / p_i(θ)) ∂p_i(θ)/∂θ_k = 0,    k = 1, ..., p    (3.2)

obtained by differentiating the logarithm of the multinomial likelihood
function. Fisher noted that the log likelihood ratio statistic

    G² = 2 Σ_{i=1}^{M} N_i log(N_i / np_i)

is asymptotically equivalent to X². He further observed that an estimator
asymptotically equivalent to the grouped data MLE can be obtained by
choosing θ to minimize X²(θ) for the observed N_i. This minimum chi-squared
estimator is the solution of

    Σ_{i=1}^{M} (N_i / p_i(θ))² ∂p_i(θ)/∂θ_k = 0,    k = 1, ..., p    (3.3)

Let us denote either estimator by θ̄_n. Then X²(θ̄_n) is conceptually the
Pearson statistic for testing fit to F(·|θ̄_n), the member of the family {F(x|θ)}
which is closest to the data if the Pearson statistic is used as a measure of
distance. Fisher showed that the Pearson-Fisher statistic X²(θ̄_n) has the
χ²(M - p - 1) distribution under the null hypothesis, no matter what θ in Ω
is the true value. This is the famous "lose one degree of freedom for each
parameter estimated" result.
    Neyman (1949) noted that another estimator asymptotically equivalent to
θ̄_n can be obtained by minimizing the modified chi-squared statistic

    Σ_{i=1}^{M} [N_i - np_i(θ)]² / N_i

This minimum modified chi-squared estimator is the solution of

    Σ_{i=1}^{M} (p_i(θ) / N_i) ∂p_i(θ)/∂θ_k = 0,    k = 1, ..., p    (3.4)

Since for the purposes of large sample theory under the null hypothesis this
estimator is interchangeable with the previous two, call it also θ̄_n to
minimize notation. Neyman's remark is important because equations (3.4) are
more often solvable in closed form than are (3.3) and (3.2).

3.2.2.1 Example

Consider a chi-squared test of fit to the family of density functions

    f(x|θ) = (1/2)(1 + θx),    -1 < x < 1    (3.5)

with Ω = (-1, 1). This family has been used as a model for the distribution of
the cosine of the scattering angle in some beam-scattering experiments in
physics. For cells E_i = (a_{i-1}, a_i] with

    -1 = a₀ < a₁ < ··· < a_M = 1

we have

    p_i(θ) = ∫_{a_{i-1}}^{a_i} f(x|θ) dx
           = (θ/4)(a_i² - a_{i-1}²) + (1/2)(a_i - a_{i-1})

It is easily seen that neither (3.2) nor (3.3) has a closed solution, while
(3.4) has solution

    θ̄_n = -2 [Σ_{i=1}^{M} (a_i - a_{i-1})(a_i² - a_{i-1}²)/N_i] / [Σ_{i=1}^{M} (a_i² - a_{i-1}²)²/N_i]

Substituting this value in the Pearson statistic produces an easily computed
test of fit for the family (3.5) using χ²(M - 2) critical points.
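The closed form for this example can be computed directly. A sketch follows; the explicit ratio is our derivation from (3.4) with ∂p_i/∂θ = (a_i² - a_{i-1}²)/4, since the printed solution is not fully legible in this copy:

```python
def min_modified_chisq_theta(counts, cuts):
    """Minimum modified chi-squared estimate of theta for the family
    f(x|theta) = (1 + theta*x)/2 on (-1, 1), with cells (a_{i-1}, a_i]
    given by cuts = [a_0, ..., a_M] and observed counts N_i.
    Solves equation (3.4) in closed form."""
    num = den = 0.0
    for N, (lo, hi) in zip(counts, zip(cuts, cuts[1:])):
        d1 = hi - lo              # a_i - a_{i-1}
        d2 = hi * hi - lo * lo    # a_i**2 - a_{i-1}**2
        num += d1 * d2 / N
        den += d2 * d2 / N
    return -2.0 * num / den
```

As a sanity check, two cells split at 0 with counts in the ratio p₁ = (2 - θ)/4 to p₂ = (2 + θ)/4 recover the generating θ exactly.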

But even the minimum modified chi-squared estimator must often be
obtained by numerical solution of its defining equations. If cells
E_i = (a_{i-1}, a_i] are used in a chi-squared test of fit to the normal family

    F(x|μ, σ) = Φ((x - μ)/σ),    -∞ < x < ∞

(Φ is the standard normal distribution function), then

    p_i(μ, σ) = Φ((a_i - μ)/σ) - Φ((a_{i-1} - μ)/σ)

It takes only a moment to see that none of the three versions of θ̄_n can be
obtained algebraically, so that recourse to numerical solution is required.
Most computer libraries contain efficient routines using (for example)
Newton's method to accomplish the solution.
    This circumstance calls to mind Fisher's warning that his "lose one
degree of freedom for each parameter estimated" result is not true when
estimators not asymptotically the same as θ̄_n are used. For example, in
testing univariate normality we may not simply use the raw data MLE's

    X̄ = (1/n) Σ_{j=1}^{n} X_j    and    σ̂² = (1/n) Σ_{j=1}^{n} (X_j - X̄)²

in the Pearson statistic. Chernoff and Lehmann (1954) studied the
consequences of using the raw data MLE θ̂_n in the Pearson statistic. They found
that X²(θ̂_n) has as its limiting distribution under F(·|θ) the distribution of

    χ²(M - p - 1) + Σ_{k=1}^{p} λ_k(θ) χ²_k(1)    (3.6)

Here χ²(M - p - 1) and the χ²_k(1) are independent chi-squared random
variables with the indicated numbers of degrees of freedom. The numbers λ_k(θ)
satisfy 0 < λ_k(θ) < 1. So the large sample distribution of X²(θ̂_n) is not χ²
and depends on the true value of θ. All that can be said in general is that the
correct critical points fall between those of χ²(M - p - 1) and those of
χ²(M - 1). These bounds often make X²(θ̂_n) usable in practice, especially
when the number of cells M is large and the number of parameters p is small.

3. 2. 3 A Fam ily of Stàtistics

W e have already mentioned the Pearson chi-squared, modified chi-squared,


and log likelihood ratio statistics. Another statistic recommended by some
statisticians is the Freem an-Tukey statistic

M I I
FT^ = 4 2 { N f - (np )2 }"
1=1

Cressie and Read (1984) have systematized the theory of classical chi-squared
procedures by introducing a class of test statistics based on measures of
divergence between discrete distributions on M points. If q = (q_1, ..., q_M)
and p = (p_1, ..., p_M) are such probability distributions, the directed divergence of order λ of q from p is

I^λ(q : p) = (1/(λ(λ + 1))) Σ_{i=1}^M q_i { (q_i/p_i)^λ - 1 }

I^λ is a metric only for λ = -1/2, but is a useful generalized information
measure of "distance" for all real λ. If N is the vector of cell frequencies N_i,
and p(θ) the vector of probabilities p_i(θ), the Cressie-Read statistics are
the divergences of the empiric distribution N/n from the estimated hypothesized distribution p(θ̂_n),

R^λ(θ̂_n) = 2n I^λ(N/n : p(θ̂_n))

If I^λ is defined by continuity at λ = -1, 0, this class includes X² (λ = 1),
G² (λ = 0), FT² (λ = -1/2) and X²_m (λ = -2).
These statistics are all asymptotically equivalent to X²(θ̂_n) under
F(·|θ₀) for any estimator θ̂_n such that n^{1/2}(θ̂_n - θ₀) is bounded in probability. Moreover, the "minimum distance" estimators of θ derived from the R^λ
TESTS OF CHI-SQUARED TYPE 69

statistics are all asymptotically equivalent under the null hypothesis to
the grouped data MLE and minimum chi-squared estimators. So if θ̄_n is any
of these estimators and λ is any real number, R^λ(θ̄_n) has the χ²(M - p - 1)
limiting null distribution. The Cressie-Read statistics remain asymptotically
equivalent under contiguous alternative distributions, but not under alternatives distant from the hypothesized family.
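The divergence family can be implemented directly from the definition, with the λ = 0 and λ = -1 members taken as the continuity limits mentioned above. The sketch below is our own illustration (names ours), checked against the closed forms for λ = 1 (Pearson) and λ = -1/2 (Freeman-Tukey):

```python
import math

def cressie_read(N, p, lam):
    """Cressie-Read statistic R^lam = 2n * I^lam(N/n : p(theta)).

    N: observed cell frequencies; p: hypothesized cell probabilities.
    lam = 0 and lam = -1 are defined by continuity (G^2 and the
    modified log likelihood ratio).  Cells with N_i = 0 need care
    when lam <= 0; this sketch assumes positive frequencies there.
    """
    n = sum(N)
    if lam == 0:      # log likelihood ratio statistic G^2
        return 2.0 * sum(Ni * math.log(Ni / (n * pi))
                         for Ni, pi in zip(N, p) if Ni > 0)
    if lam == -1:     # modified log likelihood ratio
        return 2.0 * n * sum(pi * math.log(n * pi / Ni)
                             for Ni, pi in zip(N, p))
    c = 2.0 / (lam * (lam + 1.0))
    return c * sum(Ni * ((Ni / (n * pi)) ** lam - 1.0)
                   for Ni, pi in zip(N, p))
```

At λ = 1 the sum telescopes to Σ N_i²/(np_i) - n, the Pearson statistic, and at λ = -1/2 it reduces to 4 Σ (√N_i - √(np_i))², the Freeman-Tukey statistic.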
If the Cressie-Read family is taken as a completion of the class of statistics equivalent to X² in large samples, there remain the practical problems
of use for finite n. How large must n be before the asymptotic distribution
theory is trustworthy? How many cells should be used, and how should they
be chosen? Which of these statistics should be used? We now turn to these
questions.

3 .2 .4 Choosing C ells

An objection to the use of chi-squared tests has been the arbitrariness introduced by the necessity to choose cells. This choice is guided by two considerations: the power of the resulting test, and the desire to use the asymptotic
distribution of the statistic as an approximation to the exact distribution for
sample size n. These issues have been studied in detail for the case of a
simple hypothesis, i.e., the case of testing fit to a completely specified distribution F. Recommendations can be made in this case which may reasonably
be extended to the case of testing fit to a parametric family {F(·|θ)}.
Mann and Wald (1942) initiated the study of the choice of cells in the
Pearson test of fit to a continuous distribution F. They recommended, first,
that the cells be chosen to have equal probabilities under the hypothesized
distribution F. The advantages of such a choice are: (1) The Pearson test
is unbiased. (Mann and Wald proved only local unbiasedness, but Cohen and
Sackrowitz (1975) establish unbiasedness of both X² and G². This is not true
when the cells have unequal probabilities under F.) (2) The distance
sup_x |F_1(x) - F(x)| to the nearest alternative F_1 indistinguishable from F by X²
is maximized (Mann-Wald), and X² maximizes the determinant of the matrix
of second partial derivatives of the power function among all locally unbiased
tests of the same size (Cohen-Sackrowitz). (3) Empirical studies have shown
that the χ² distribution is a more accurate approximation to the exact null
distribution of X², G² and FT² when equiprobable cells are employed (see
Section 3.2.5 for references).
Mann and Wald then made recommendations on the number M of equiprobable cells to be used. Their work rests on large-sample approximations
and on a somewhat complex minimax criterion, so that it is at best a rough
guide in practice. Mann and Wald found that for a sample of size n (large)
and significance level α, one should use approximately

M = 4 [2(n - 1)²/c(α)²]^{1/5}    (3.7)

where c(α) is the upper α-point of the standard normal distribution. The
optimum is quite broad. In particular, the M of (3.7) can be halved with
little effect on power. Retracing the Mann-Wald calculations using better
approximations, as in Schorr (1974), confirms that the "optimum" M is
smaller than the value given by (3.7). Since the exact optimum depends on
the criterion, a choice of error probabilities, and of course on the assumption that the hypothesized F contains no unknown parameters, the practitioner need not go beyond the following recommendation: Choose a number M
of equiprobable cells falling between the value (3.7) for α = 0.05 and half
that value. Since half the value (3.7) is 1.88n^{2/5}, the choice M = 2n^{2/5} is
convenient. This recommendation is not an endorsement of the use of
α = 0.05 (or any fixed α) in tests of fit. Because (3.7) increases slowly
with α, but overstates the number of cells required, the value for α = 0.05
can also be used when larger significance levels are in mind.
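As a quick sketch (ours, with c(0.05) = 1.645 taken as known), the recipe (3.7) and the compromise M = 2n^{2/5} can be computed as:

```python
def mann_wald_M(n, c_alpha=1.645):
    """Mann-Wald recipe (3.7): M = 4 * [2(n-1)^2 / c(alpha)^2]^(1/5),
    where c(alpha) is the upper alpha-point of the standard normal
    distribution (1.645 for alpha = 0.05)."""
    return 4.0 * (2.0 * (n - 1) ** 2 / c_alpha ** 2) ** 0.2

def recommended_M(n):
    """The compromise suggested in the text: M = 2 n^(2/5), i.e. about
    half the Mann-Wald value for alpha = 0.05 (rounding is ours)."""
    return round(2.0 * n ** 0.4)
```

With n = 100 the Mann-Wald value is about 24, and the compromise gives M = 13; the text's examples use M = 25 for computational convenience.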
For small n, accuracy of the χ² approximation to the exact null distribution becomes of paramount concern. We shall see (Section 3.2.5) that the
recommendations above, especially that of equiprobable cells, are sustained
by this concern. When parameters must be estimated, cells equiprobable
under the estimated parameter value can be employed. This requires data-dependent cells, a major modern innovation to be discussed in Section 3.3.1
below. Since an "objective" procedure for choosing cells is desirable, all
examples in this chapter will use equiprobable cells with (3.7) for α = 0.05
as a guide to choosing M.

3 .2 .5 Sm all-Sam ple Distributions

The distribution theory of chi-squared statistics (and most other formal
tests of fit) is a large-sample theory. Indeed, Pearson's discovery of X²
rested on the normal limiting distribution of the cell frequencies. How usable
in practice are critical points or P-values for X² or R^λ obtained from the
chi-squared distribution? Cochran (1954) gave a commonly accepted rule of
thumb: all expected cell frequencies np_i should be at least 1, with at least
80 percent being at least 5. The availability of inexpensive computing has
led to extensive study of this issue in recent years. Several recommended
papers summarizing this work are Roscoe and Byars (1971), Larntz (1978),
Koehler and Larntz (1980), and Read (1984).
Each of these papers has a different emphasis. Roscoe and Byars present a simulation study of the Pearson test of fit to a simple hypothesis and
summarize much earlier work. Larntz (1978) compares the Pearson, log
likelihood ratio and Freeman-Tukey statistics with regard to the accuracy
of the chi-squared approximation. He includes the simple hypothesis case
and four cases in which parameters must be estimated. Koehler and Larntz
(1980) study X² and G² when the number of cells M increases with n rather
than remaining fixed. In this case the limiting distribution is normal rather
than chi-squared when a simple hypothesis is being tested (see Section 3.3.4

below). Read (1984) investigates the R^λ family of statistics for testing fit to
the simple hypothesis of equiprobable cells, and considers the usefulness of
two improved approximations to the exact distribution.
The consensus of these and other studies is that the traditional rule of
thumb is very conservative, especially when the estimated cell probabilities
are not too unequal. Here are the recommendations of Roscoe and Byars for
the Pearson X², which may serve as a guide for practitioners.
1. With equiprobable cells, the average expected cell frequency should
be at least 1 (that is, n ≥ M) when testing fit at the α = 0.05 level; for
α = 0.01, the average expected frequency should be at least 2 (that is,
n ≥ 2M).
2. When cells are not approximately equiprobable, the average expected
frequencies in (1) should be doubled.
3. These recommendations apply when M ≥ 3. For M = 2 (1 degree of
freedom), the chi-squared test should be replaced by the test based on the
exact binomial distribution.
Note that the Roscoe-Byars recommendations are based on the average
rather than the minimum cell expectation. Any such rule may be defeated,
as Koehler and Larntz (1980) remark, by a sufficiently skewed assignment
of cell probabilities. They suggest the guidelines M ≥ 3, n ≥ 10, n²/M ≥ 10
as adequate for use of the χ² approximation to the Pearson statistic. These
are somewhat conservative when, as we recommend, cell probabilities are
approximately equal. The Mann-Wald suggestion (3.7) meets both the Roscoe-Byars and Koehler-Larntz guidelines. Simulations suggest that when these
guidelines are met, the true α for X² is usually slightly less than the nominal α given by χ². But the true α generally exceeds the nominal α for R^λ
with λ not close to 1, often substantially, when approximately equiprobable
cells are employed.
Though these recommendations rest on study of the simple H₀ case,
Larntz (1978) gives some grounds for adopting them when parameters must
be estimated.
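The rules of thumb above are easy to mechanize. The following sketch is our own encoding of the stated guidelines (not code from the text), checking a proposed (n, M) pair:

```python
def koehler_larntz_ok(n, M):
    """Koehler-Larntz guidelines for trusting the chi-squared
    approximation to the Pearson statistic: M >= 3, n >= 10,
    and n^2/M >= 10."""
    return M >= 3 and n >= 10 and n * n / M >= 10

def roscoe_byars_ok(n, M, alpha=0.05, equiprobable=True):
    """Roscoe-Byars rule as stated in the text: average expected
    frequency n/M at least 1 (alpha = 0.05) or 2 (alpha = 0.01),
    doubled when cells are not approximately equiprobable; the rule
    applies for M >= 3 (use the exact binomial test for M = 2)."""
    need = 1.0 if alpha >= 0.05 else 2.0
    if not equiprobable:
        need *= 2.0
    return M >= 3 and n / M >= need
```

For the chapter's running example, n = 100 with M = 25 equiprobable cells satisfies both sets of guidelines.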
The comparative studies of Larntz (1978) and Read (1984) establish
clearly that the χ² approximation is notably more accurate for X² than for
such common competitors as G² and FT². Read, for equiprobable cells,
finds close agreement between the exact and approximate critical levels of R^λ
for 1/3 ≤ λ ≤ 1.5 when n ≤ 20 and 2 ≤ M ≤ 6. Only X² (λ = 1) among the more
common members of the R^λ family falls in this class. Moreover, although
increasing n for fixed M enlarges the class of λ for which the approximation
is reasonable, Read finds that as M increases for fixed n, the error in this
approximation "increases dramatically" for values of λ outside the recommended interval.
Statisticians, including the authors of the papers we have cited, differ
on criteria for an "adequate" large sample approximation. Readers may
therefore want to examine these papers in detail for additional information,
particularly if the use of R^λ statistics other than X² is contemplated.

3 .2 .6 Choosing a Statistic

Since both hypotheses and alternatives of interest for an omnibus test of fit
are very general, it is difficult to give comprehensive recommendations
based on power for choosing among a class of such tests. Asymptotic results
(for the simple H₀ case) are ambiguous. When M is held fixed as n increases,
all R^λ are equivalent against local alternatives, and G² is favored against
distant alternatives (Hoeffding, 1965). But if M increases with n, the limiting
distributions of R^λ vary with λ under both hypothesis and local alternatives,
and X² appears to be favored (Holst 1972, Morris 1975, Cressie and Read
1984).
In many practical situations, power considerations are secondary to the
accuracy of the χ² approximation to the exact null distribution. In such cases,
the Pearson X² is the statistic of choice. Some quite limited computations of
exact power by Koehler and Larntz (1980) and Read (1984) shed some light on
the dependence of power on the alternative hypothesis and on the choice of λ.
Read suggests 1/3 ≤ λ ≤ 2/3 as a compromise with reasonable power against
the alternatives he considers. Again X² fares better than its common competitors G², X²_m and FT².
A different approach that may aid the choosing of a statistic is to examine the type of lack of fit measured by each statistic. The sample measure
of the degree of lack of fit accompanying R^λ(θ̂_n) (which measures the significance of lack of fit) is R^λ(θ̂_n)/n. If G is the true distribution of the observations X_j, all common estimators θ̂_n converge under G to a θ₀ such that
F(·|θ₀) is "closest" to G in some sense. When G is a member of the hypothesized family {F(·|θ): θ in Ω}, this is just consistency of θ̂_n. When G is
not in this family and θ̂_n is the minimum-R^λ estimator, θ₀ is the point
such that p(θ₀) is closest to the vector π₀ of cell probabilities under G by
the discrepancy measure I^λ(π₀ : p(θ)). Moreover, R^λ(θ̂_n)/n converges w.p. 1
to 2I^λ(π₀ : p(θ₀)). For example, X²(θ̂_n)/n converges to

2I¹(π₀ : p(θ₀)) = Σ_{i=1}^M (π_i - p_i)²/p_i

where θ̂_n is the minimum chi-squared estimator, θ₀ is the point closest to G
by the I¹ measure, and p_i = p_i(θ₀). See Moore (1984) for details of these
results.
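The closed form above can be checked numerically against the general divergence at λ = 1 (an illustration; function names ours):

```python
def directed_divergence(q, p, lam):
    """Directed divergence of order lam of q from p:
    I^lam(q:p) = (1/(lam(lam+1))) * sum_i q_i ((q_i/p_i)^lam - 1)."""
    return sum(qi * ((qi / pi) ** lam - 1.0)
               for qi, pi in zip(q, p)) / (lam * (lam + 1.0))

def pearson_limit(pi0, p):
    """Closed form 2 I^1(pi0 : p) = sum (pi0_i - p_i)^2 / p_i, the
    almost-sure limit of X^2(theta_n)/n under a fixed alternative
    with cell probabilities pi0."""
    return sum((a - b) ** 2 / b for a, b in zip(pi0, p))
```

Since Σq_i = Σp_i = 1, the identity 2I¹(q:p) = Σq_i²/p_i - 1 = Σ(q_i - p_i)²/p_i makes the two functions agree exactly.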
A choice of λ can be based on a choice of distance measure, and power
against an alternative of interest will depend on the distance of that alternative from the hypothesis under the given measure. For a specific alternative,
λ can be chosen to maximize the distance of this alternative from {F(·|θ)}.
This generalizes the conclusions of Read (1984). For general alternatives,
we recommend (pending further study) that the Pearson X² statistic be employed in practice when a choice is made among the statistics R^λ. We will
see below that consideration of a broader class of chi-squared-like statistics

will modify this recommendation. But X² will remain the statistic of choice
when the null hypothesis is simple or when minimum chi-squared estimation
is used.

E3.2.7 Examples of the Pearson Test

Because of its relative lack of power, X² cannot be recommended for testing
fit to standard distributions for which special-purpose tests are available,
or for which the special tables of critical points needed to apply tests based
on the empirical distribution function (EDF) when parameters are estimated
have been computed. Testing fit to the family (3.5) is, on the other hand, a
realistic application of the Pearson-Fisher statistic X²(θ̄_n). The examples
below of X² applied to the NOR data set are intended only as illustrations of
the mechanics of applying the test.

3.2.7.1 Example

Since NOR purports to be data simulating a normal sample with μ = 100 and
σ = 10, let us first assess the simulation by testing fit to this specific distribution. The Mann-Wald recipe (3.7) with α = 0.05 and n = 100 gives M = 24.
For computational convenience, we use M = 25 cells chosen to be equiprobable under N(100,100). The cell boundaries are 100 + 10z_i, where z_i
is the 0.04i point from the standard normal table, i = 1, 2, ..., 24. For
example, the 0.04 point is -1.75, so the upper boundary of the leftmost cell
is 100 + (10)(-1.75) = 82.5. Table 3.1 shows the cells and their observed
frequencies. The expected frequencies are all (100)(0.04) = 4. When p_i = 1/M
for all i, we have

X² = (M/n) Σ_{i=1}^M (N_i - n/M)²

So in this example,

X² = ¼ Σ_{i=1}^{25} (N_i - 4)² = 112/4 = 28

The appropriate distribution is χ²(24), and the P-value (attained significance
level) of X² = 28 is 0.260.
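The computation can be reproduced from the frequencies in the left half of Table 3.1. This is a sketch (ours); the P-value uses the closed-form χ² survival function for even degrees of freedom:

```python
import math

# Observed frequencies for the 25 cells equiprobable under N(100,100)
# (left half of Table 3.1).
N = [3, 8, 5, 8, 4, 2, 1, 5, 6, 1, 3, 3, 4,
     2, 2, 7, 7, 3, 1, 2, 4, 6, 6, 4, 3]
n, M = sum(N), len(N)        # n = 100, M = 25, so np_i = 4 for every cell

# With equiprobable cells, X^2 = (M/n) * sum (N_i - n/M)^2.
x2 = (M / n) * sum((Ni - n / M) ** 2 for Ni in N)

def chi2_sf_even(x, df):
    """P(chi2_df > x) for even df via the Poisson tail identity."""
    k, term, total = df // 2, 1.0, 0.0
    for j in range(k):
        if j > 0:
            term *= (x / 2.0) / j
        total += term
    return math.exp(-x / 2.0) * total

p_value = chi2_sf_even(x2, M - 1)   # reference distribution chi2(24)
```

The sum of squared deviations is 112, giving X² = 28 and a P-value of about 0.260, as in the text.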
To test the NOR data for fit to the family of univariate normal distributions, an intuitively reasonable procedure is to estimate μ, σ by X̄, σ̂ and
use cells with boundaries X̄ + z_iσ̂, where z_i are as before. These cells are

TABLE 3.1 Chi-squared Tests for Normality of the NOR Data

          Fit to N(100,100)        Fit to normal family

        Upper                      Upper
Cell    boundary    Frequency      boundary    Frequency

  1       82.5          3            81.2          3
  2       85.9          8            84.8          5
  3       88.3          5            87.3          5
  4       90.1          8            89.2          5
  5       91.6          4            90.7          6
  6       92.9          2            92.1          4
  7       94.2          1            93.5          3
  8       95.3          5            94.6          1
  9       96.4          6            95.8          4
 10       97.5          1            96.9          6
 11       98.5          3            98.0          3
 12       99.5          3            99.0          3
 13      100.5          4           100.1          2
 14      101.5          2           101.1          5
 15      102.5          2           102.2          2
 16      103.6          7           103.3          5
 17      104.7          7           104.5          9
 18      105.8          3           105.6          3
 19      107.1          1           107.0          1
 20      108.4          2           108.3          1
 21      109.9          4           109.9          5
 22      111.7          6           111.8          6
 23      114.1          6           114.3          6
 24      117.5          4           117.8          4
 25        ∞            3             ∞            3

equiprobable under the normal distribution with μ = X̄ and σ = σ̂. It will be
remarked in Section 3.3.1 that the Pearson statistic with these data-dependent cells has the same large sample distribution as if the fixed cell boundaries 100 + 10z_i to which the random boundaries converge were used. This
distribution is not χ²(24), since μ and σ were estimated by their raw data
MLE's X̄ and σ̂ in computing the cell probabilities p_i(X̄,σ̂) = 0.04. The
appropriate distribution has the form (3.6), so that its critical points fall
between those of χ²(24) and χ²(22). Calculation shows that X̄ = 99.54 and
σ̂ = 10.46. The cell boundaries X̄ + z_iσ̂ and the observed cell frequencies
are given at the right of Table 3.1. The observed chi-squared value is
X² = 22, reflecting the somewhat better fit when parameters are estimated
from the data.

from the data. The P -v alu e falls between 0.460 (from (22)) and 0.579
(from (24)).
F o r comparison, the same procedure w as applied to test the LOG data
set fo r normality. In this case, X = 99.84 and a = 16.51, and the observed
chi-squared value using cell boundaries X + is X^ = 31.5. The c o rre ­
sponding P -valu e lies between 0.086 (from X^(22)) and 0.140 (from x^(24)).
Thus this test has correctly concluded that NOR fits the norm al fam ily w ell,
while the fit of LO G is m arginal. Since the logistic distributions a re difficult
to distinguish from the norm al fam ily, this is a pleasing perform ance. In
contrast, the samé procedure with M = 10 has X^ = 9.4 fo r the IXXl data,
so that the P -valu e lies between 0.225 (from X^(T)) and 0.402 (from X^(¾)*
Using three cells gives X^ = 0.98 and again fails to suggest that the LO G
data set is not norm ally distributed. Thus fo r these p articular data, the
la rg e r M suggested by (3.7) produces a m ore sensitive test.
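The mechanics of the procedure just used — estimate (μ, σ), place the boundaries at X̄ + z_iσ̂, and count — can be sketched as follows. This is our illustration; the normal quantile is obtained by bisection on `math.erf` rather than from a table:

```python
import math

def norm_ppf(u):
    """Standard normal quantile by bisection on Phi; adequate for
    table-style accuracy (an illustrative implementation)."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = (lo + hi) / 2.0
        if 0.5 * (1.0 + math.erf(mid / math.sqrt(2.0))) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def equiprobable_boundaries(xs, M):
    """Cell boundaries xbar + z_i * sigma_hat, i = 1..M-1, equiprobable
    under the normal law with the raw-data MLEs plugged in."""
    n = len(xs)
    xbar = sum(xs) / n
    sig = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)  # MLE, divisor n
    return [xbar + norm_ppf(i / M) * sig for i in range(1, M)]

def cell_frequencies(xs, bounds):
    """Count observations in (-inf, b1], (b1, b2], ..., (b_{M-1}, inf)."""
    counts = [0] * (len(bounds) + 1)
    for x in xs:
        counts[sum(1 for b in bounds if x > b)] += 1
    return counts
```

Applied to the NOR data these steps reproduce the right-hand columns of Table 3.1.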

3.2.7.2 Example

The same procedure can be applied to the EMEA data, but a glance shows
that these data as given are discrete and therefore not normal. Indeed, with
15 cells equiprobable under the N(X̄, σ̂²) distribution for these data, X² = 554.
Since the data are grouped in classes centered at integers, a more intelligent
procedure is to use fixed cells of unit width centered at the integers, with
cell probabilities computed from N(X̄, σ̂²). Of course, X̄ and σ̂ from the
grouped data are only approximate. Sheppard's correction for σ̂ improves
the approximation, and gives X̄ = 14.540 and σ̂ = 2.216. Calculating the
cell probabilities and computing the Pearson statistic, we obtain X² = 7.56.
The P-value lies between 0.819 (from χ²(12)) and 0.911 (from χ²(14)), so
that the EMEA data fit the normal family very well indeed. The applicability
of X² to grouped data such as these is an advantage of chi-squared methods.
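Sheppard's correction replaces the grouped-data variance s² by s² - h²/12, where h is the class width. A minimal sketch (ours) for data given as class midpoints and frequencies:

```python
def sheppard_corrected_sd(midpoints, freqs, h):
    """Grouped-data mean and Sheppard-corrected standard deviation.

    Grouping observations into classes of width h inflates the sample
    variance by roughly h^2/12, so sigma^2 is estimated by
    s_grouped^2 - h^2/12 (Sheppard's correction)."""
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
    var = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / n
    return mean, (var - h * h / 12.0) ** 0.5
```

For the EMEA data (unit-width classes, h = 1) this is the correction that yields X̄ = 14.540 and σ̂ = 2.216 in the text.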

3.3 GENERAL CHI-SQUARED STATISTICS

3 .3 .1 Data-Dependent Cells

As already noted in Section 3.2.7, the use of data-dependent cells increases
the flexibility of chi-squared tests, fortunately without increasing their complexity in practice. The essential requirement is that as the sample size
increases, the random cell boundaries must converge in probability to a set
of fixed boundaries. The limiting cells will usually be unknown, since they
depend on the true parameter value θ₀. Random cells are used in chi-squared
tests by "forgetting" that the cells are data-dependent and proceeding as if
fixed cells had been chosen. Since the cell frequencies are no longer multinomial, the theory of such tests is mathematically difficult. But in practice,
the limiting distribution of R^λ with random cells is exactly the same as if the
limiting fixed cells had been used. This is true even when parameters are

estimated. Details and regularity conditions appear in Section 4 of Moore
and Spruill (1975) for k-dimensional rectangular cells. Pollard (1979) has
extended the theory to cells of very general shape. Therefore, any statistic,
such as the Pearson-Fisher X²(θ̄_n), that has a θ₀-free limiting null distribution with fixed cells has that same distribution for any choice of converging
random cells.
A statistic such as the Chernoff-Lehmann X²(θ̂_n), which has a θ₀-dependent limiting null distribution for fixed cells, has in general this same deficiency with random cells. But if the hypothesized family {F(·|θ)} is a
location-scale family, a proper choice of random cells eliminates this θ₀-dependency and also allows cells to be chosen equiprobable under the estimated θ̂_n, thus matching the recommended practice in the simple hypothesis
case. Such cell choices should be made whenever possible. Theorem 4.3 of
Moore and Spruill (1975) is a general account of this. Let us here illustrate
it by returning to the X² statistic for testing univariate normality.
When the parameter θ = (μ,σ) is estimated by θ̂_n = (X̄,σ̂) and cell
boundaries X̄ + z_iσ̂ are used, the estimated cell probabilities are

p_i(X̄,σ̂) = ∫_{X̄+z_{i-1}σ̂}^{X̄+z_iσ̂} (2πσ̂²)^{-1/2} e^{-(t-X̄)²/2σ̂²} dt

         = ∫_{z_{i-1}}^{z_i} (2π)^{-1/2} e^{-u²/2} du

These are not dependent on (X̄,σ̂), and are equiprobable if z_i are the successive i/M points of the standard normal distribution. Since this choice
of cells leaves both N_i and p_i unchanged when any location-scale transformation is applied to all observations X_j, the Pearson statistic (and indeed,
any R^λ) has the same distribution for all (μ,σ). The limiting null distribution
has the form (3.6), but the λ_j are now free of any unknown parameter. Critical points may therefore be computed. Two methods for doing so, and tables
for testing normality, appear in Dahiya and Gurland (1972) and Moore (1971).
Dahiya and Gurland (1973) study the power of this test. The idea of using
random cells in this fashion is due to A. R. Roy (1956) and G. S. Watson
(1957, 1958, 1959). We will refer to the Pearson statistic using the raw data
MLE and random cells as the Watson-Roy statistic. Section 3.2.7.1, an
example in Section 3.2.7, illustrated its use.
Note that the Watson-Roy statistic has a θ-free limiting null distribution
only for location-scale families, that this distribution is not a standard
tabled distribution, and that a separate calculation of critical points is required for testing fit to each location-scale family. These statements are
also true for EDF tests of fit. Since the latter are more powerful, the Watson-Roy statistic has few advantages when F(·|θ) is univariate and continuous.

Nonetheless, data-dependent cells move the cells to the data without essentially changing the asymptotic distribution theory of the chi-squared statistic.
They should be routinely employed in practice, and this is done in most of
the examples in this chapter.
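The invariance argument above can be checked numerically: transforming every observation by x → ax + b (a > 0) transforms X̄ and σ̂ equivariantly, so the cells move with the data and the frequencies N_i are unchanged. A sketch (ours; the z_i and data are arbitrary illustrative values):

```python
import math

def freqs_with_mle_cells(xs, zs):
    """Cell frequencies using random boundaries xbar + z_i * sigma_hat,
    where (xbar, sigma_hat) are the raw-data normal MLEs."""
    n = len(xs)
    xbar = sum(xs) / n
    sig = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)
    bounds = [xbar + z * sig for z in zs]
    counts = [0] * (len(bounds) + 1)
    for x in xs:
        counts[sum(1 for b in bounds if x > b)] += 1
    return counts

# Invariance: any location-scale transformation a*x + b (a > 0) applied
# to all observations leaves every N_i unchanged, because the cell
# boundaries transform exactly the same way as the data.
zs = [-1.28, -0.52, 0.0, 0.52, 1.28]
xs = [2.1, 3.7, 1.2, 5.5, 4.4, 2.9, 3.3, 4.8, 1.9, 3.0]
same = freqs_with_mle_cells(xs, zs) == freqs_with_mle_cells(
    [3.0 * x + 7.0 for x in xs], zs)
```

The equality holds exactly, not just approximately, which is why the null distribution of the Watson-Roy statistic is free of (μ, σ).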

3 .3 .2 General Quadratic Form s

Some of the most useful recent work on chi-squared tests involves the study
of quadratic forms in the standardized cell frequencies other than the sum
of squares used by Pearson. Random cells are commonly recommended in
these statistics, for the reasons outlined in Section 3.3.1, and do not affect
the theory. A statement of the nature and behavior of these general statistics
of chi-squared type is necessarily somewhat complex. Practitioners may
find it helpful to study the examples computed in Section 3.3.3 and in Rao
and Robson (1974) before approaching the summary treatment below.
Random cells should be denoted by E_{ni} in precise notation,
but here the notation E_i for cells and N_i for cell frequencies will be continued. The "cell probabilities" under F(·|θ) are

p_i(θ) = ∫_{E_i} dF(x|θ),   i = 1, ..., M

Denote by V_n(θ) the M-vector of standardized cell frequencies having ith
component

[N_i - np_i(θ)]/(np_i(θ))^{1/2}

If Q_n = Q_n(X_1, ..., X_n) is a possibly data-dependent M × M symmetric
nonnegative definite matrix, the general form of statistic to be considered is

V_n(θ̂_n)' Q_n V_n(θ̂_n)    (3.8)

when θ is estimated by θ̂_n. The Pearson statistic is the special case for
which Q_n = I_M, the M × M identity matrix. The large-sample theory of
these statistics is given in Moore and Spruill (1975). The basic idea is that
of Pearson's proof: Show that V_n(θ̂_n) is asymptotically multivariate normal
(even with random cells) and then apply the distribution theory of quadratic
forms in multivariate normal random variables. All statistics of form (3.8)
have as their limiting null distribution that of a linear combination of independent chi-squared random variables. References on the calculation of such
distributions may be found in Davis (1977).
To avoid the necessity to compute special critical points, it is advantageous to seek statistics (3.8) which have a chi-squared limiting null distribution. This idea is due to D. S. Robson. Rao and Robson (1974) treat the

important case of raw data MLEs. They give the quadratic form in V_n(θ̂_n)
having the χ²(M - 1) limiting null distribution. The appropriate matrix is
Q(θ̂_n), where

Q(θ) = I_M + B(θ)[J(θ) - B(θ)'B(θ)]⁻¹ B(θ)'

J(θ) is the p × p Fisher information matrix for F(·|θ), and B(θ) is the
M × p matrix with (i,j)th entry

p_i(θ)^{-1/2} ∂p_i(θ)/∂θ_j

The Rao-Robson statistic is

R_n = V_n(θ̂_n)' Q(θ̂_n) V_n(θ̂_n)

This test can be used whenever J - B'B is positive definite. Since nJ
is the information matrix from the raw data and nB'B the information matrix
from the cell frequencies, J - B'B is always nonnegative definite. Notice
that R_n is just the Pearson statistic X²(θ̂_n) plus a term that conceptually
builds up the distribution (3.6) to χ²(M - 1). The added term simplifies considerably, since Σ_i ∂p_i/∂θ_j = 0 implies that

V_n'B = ( n^{-1/2} Σ_{i=1}^M (N_i/p_i) ∂p_i/∂θ_1, ..., n^{-1/2} Σ_{i=1}^M (N_i/p_i) ∂p_i/∂θ_p )    (3.9)

and

R_n = X²(θ̂_n) + (V_n'B)(J - B'B)⁻¹(V_n'B)'    (3.10)

all terms being evaluated at θ = θ̂_n. Further simplification can be achieved
in location-scale cases by the use of random cells for which p_i(θ̂_n) = 1/M.
Rao and Robson (1974) give several examples of the use of this statistic,
using random cells in some cases.
Simulations by Rao and Robson show that R_n has generally greater
power than either the Pearson-Fisher or Watson-Roy statistics. Spruill
(1976) gives a theoretical treatment showing that R_n dominates the Watson-Roy statistic for any location-scale family {F(·|θ)}. Since R_n is powerful,
has tabled critical points, and is easy to compute whenever the MLE θ̂_n can
be obtained, it is recommended as a standard chi-squared test of fit.
Moore (1977) gives a general recipe for the quadratic form having the
chi-squared limiting null distribution with maximum degrees of freedom when

nearly arbitrary estimators θ̂_n are used. First compute the limiting multivariate normal law of V_n(θ̂_n), which under F(·|θ₀) has covariance matrix
Σ(θ₀) whose form depends on the large-sample properties of the estimators
θ̂_n. If Σ_n⁻ is a consistent estimator of the generalized inverse Σ(θ₀)⁻, the
desired statistic is V_n(θ̂_n)' Σ_n⁻ V_n(θ̂_n). The derivation of this Wald's method
statistic clearly follows the lines of Pearson's original proof. The statistic
can be computed in closed form more often than might be expected. It is the
Pearson statistic when θ̂_n is the grouped data MLE, the Rao-Robson statistic
when θ̂_n is the raw data MLE, and
can even in some cases be used when the X_j are dependent (Moore, 1982).
LeCam, Mahan and Singh (1983) have studied these statistics in depth, and
show that they have certain asymptotic optimality properties given the choice
of estimator θ̂_n. This strengthens the case for use of the Rao-Robson statistic
when raw data MLEs are chosen.
If (3.6) can be built up to χ²(M - 1), it can also be chopped down to
χ²(M - p - 1). Dzhaparidze and Nikulin (1974) point out that the appropriate
statistic is

Z_n = V_n'(I_M - B(B'B)⁻¹B')V_n

where V_n and B are evaluated at θ = θ̂_n. Z_n has the χ²(M - p - 1) limiting
distribution whenever θ̂_n approaches θ₀ at the usual n^{-1/2} rate, and can therefore be used with any reasonable estimator of θ. Computation of Z_n is again
simplified by (3.9). As might be expected, simulations suggest that Z_n(θ̂_n)
is inferior in power to both the Watson-Roy and Rao-Robson statistics.
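For a single estimated parameter (p = 1) the projection form of Z_n reduces to scalars: Z_n = V'V - (V'B)²/(B'B), which can never exceed the Pearson statistic V'V. A sketch (ours):

```python
import math

def dzhaparidze_nikulin_p1(N, p, dp):
    """Z_n = V'(I - B(B'B)^{-1}B')V for one estimated parameter.

    V_i = (N_i - n p_i)/sqrt(n p_i),  B_i = p_i^{-1/2} dp_i/dtheta,
    so Z_n = V'V - (V'B)^2 / (B'B).  Returns (Z_n, Pearson X^2).
    The dp_i should sum to 0, since the p_i sum to 1 for every theta.
    """
    n = sum(N)
    V = [(Ni - n * pi) / math.sqrt(n * pi) for Ni, pi in zip(N, p)]
    B = [dpi / math.sqrt(pi) for pi, dpi in zip(p, dp)]
    VV = sum(v * v for v in V)                      # Pearson X^2
    VB = sum(v * b for v, b in zip(V, B))
    BB = sum(b * b for b in B)
    return VV - VB * VB / BB, VV
```

Because I - B(B'B)⁻¹B' is a projection, 0 ≤ Z_n ≤ X² always holds, mirroring the "chopping down" of degrees of freedom described above.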

E3.3.3 Examples of General Chi-Squared Tests

3.3.3.1 Example

It is desired to test fit to the negative exponential family

f(x|θ) = θ⁻¹ e^{-x/θ},   0 < x < ∞

where Ω = {θ: 0 < θ < ∞}. Since the MLE of θ, θ̂_n = X̄, is available, the
Rao-Robson statistic is the recommended chi-squared test. When p = 1,
(3.9) and (3.10) reduce to

R_n = Σ_{i=1}^M (N_i - np_i)²/(np_i) + (1/(nD)) { Σ_{i=1}^M (N_i/p_i)(dp_i/dθ) }²

where

D = J - Σ_{i=1}^M p_i⁻¹ (dp_i/dθ)²

and J, p_i, dp_i/dθ are all evaluated at θ = X̄. For a sample of size n = 100,
we will once more use M = 25 equiprobable cells. In this scale-parameter
family, equiprobable cells are achieved by the use of random cell boundaries
of the form z_iX̄. From

p_i(θ) = e^{-z_{i-1}X̄/θ} - e^{-z_iX̄/θ}    (3.11)

the condition p_i(X̄) = 1/25 gives z₀ = 0, z₂₅ = ∞ and

z_i = -log(1 - i/25),   i = 1, ..., 24

Differentiating (3.11) with respect to θ, then substituting θ = X̄, gives

v_i ≡ X̄ (dp_i/dθ) = (1 - i/25) log(1 - i/25) - (1 - (i-1)/25) log(1 - (i-1)/25)


Because of their iterative nature, the quantities v_i are easily computed on a
programmable calculator. The Fisher information is J(θ) = θ⁻², so that

D = X̄⁻² (1 - 25 Σ_{i=1}^{25} v_i²)

Finally

R₁₀₀ = ¼ Σ_{i=1}^{25} (N_i - 4)² + ((25)²/100) (Σ_{i=1}^{25} N_i v_i)² / (1 - 25 Σ_{i=1}^{25} v_i²)

Table 3.2 records z_i and v_i, from which

1 - 25 Σ_{i=1}^{25} v_i² = 0.04255

For the WE2 data set, X̄ = 0.878. The resulting cell boundaries and cell frequencies appear in Table 3.2, and

R₁₀₀ = ¼(351) + (25)²(-0.0519)²/((100)(0.04255))

     = 87.75 + 0.40 = 88.15


TABLE 3.2 The Rao-Robson Test for the Negative Exponential Family,
with 25 Equiprobable Cells

                                 WE2                EXP

  i      z_i        v_i       z_iX̄    N_i       z_iX̄    N_i

  1    .0408     -.0392      0.036     1        0.221     6
  2    .0834     -.0375      0.073     0        0.451     5
  3    .1278     -.0358      0.112     1        0.692     3
  4    .1743     -.0340      0.153     1        0.944     2
  5    .2231     -.0321      0.196     3        1.208     5
  6    .2744     -.0301      0.241     1        1.486     5
  7    .3285     -.0279      0.288     2        1.779     7
  8    .3857     -.0257      0.338     3        2.088     2
  9    .4463     -.0234      0.392     5        2.416     4
 10    .5108     -.0209      0.448     5        2.766     3
 11    .5798     -.0182      0.509     1        3.140     3
 12    .6539     -.0153      0.574     5        3.541     4
 13    .7340     -.0123      0.644     3        3.974     6
 14    .8210     -.0089      0.721     5        4.445     3
 15    .9163     -.0053      0.804     8        4.962     4
 16   1.0216     -.0013      0.897     4        5.532     4
 17   1.1394      .0032      1.000    16        6.170     3
 18   1.2730      .0082      1.118     9        6.893     3
 19   1.4271      .0139      1.253    11        7.728     4
 20   1.6094      .0206      1.413     7        8.715     2
 21   1.8326      .0287      1.609     5        9.923     7
 22   2.1203      .0388      1.861     1       11.481     3
 23   2.5257      .0524      2.217     3       13.676     3
 24   3.2189      .0733      2.826     0       17.430     6
 25      ∞        .1288        ∞       0          ∞       3

This gives a P-value of 3 × 10⁻⁹ using the χ²(24) distribution. In contrast,
the EXP data set has X̄ = 5.415, cell boundaries and frequencies given at
the right of Table 3.2, and

R₁₀₀ = ¼(54) + (25)²(-0.1231)²/((100)(0.04255))

     = 13.5 + 2.23 = 15.73

The P-value from χ²(24) is 0.898.



Table 3.2 reveals an important practical advantage of chi-squared tests,
especially when equiprobable cells are employed: examination of the deviations of the cell frequencies from their common expected value (here 4)
shows clearly the nature of the lack of fit detected by the test. In this case,
the Weibull with power parameter k = 2 has far too few observations in the
lower tail, too many in the middle slope of the density function, and too few
in the extreme upper tail. A glance at graphs of the Weibull and exponential
density functions (e.g., on pp. 379-380 of Derman, Gleser, and Olkin 1973)
shows how accurately the N_i mirror the differences between the two distributions.
As these examples suggest, the Pearson statistic X²(θ̂_n), which is the
first component of R_n, is usually adequate for drawing conclusions when M
is large and p is small. In this example, the critical points of X²(θ̂_n) fall
between those of χ²(23) and those of χ²(24). A reasonable strategy is to
compute X²(θ̂_n) first, completing the computation of R_n only if the results
after the first stage are ambiguous.

3.3.3.2 Example

The BAEN data are to be tested for fit to the double-exponential family

$f(x|\theta) = (2\theta_2)^{-1} e^{-|x - \theta_1|/\theta_2}$,  $-\infty < x < \infty$

$\Omega = \{(\theta_1, \theta_2) : -\infty < \theta_1 < \infty,\ 0 < \theta_2 < \infty\}$

The MLE $\bar\theta_n = (\bar\theta_{1n}, \bar\theta_{2n})$ from a random sample $X_1, \ldots, X_n$ is

$\bar\theta_{1n} = \mathrm{median}(X_1, \ldots, X_n)$

$\bar\theta_{2n} = \frac{1}{n}\sum_{i=1}^{n} |X_i - \bar\theta_{1n}|$

In this location-scale setting, equiprobable cells with boundaries $\bar\theta_{1n} + a_i\bar\theta_{2n}$
will again be employed. Using an even number of cells, say $M = 2\nu$, and
choosing the $a_i$ symmetrically as $a_{\nu+i} = -a_{\nu-i} = c_i$, where

$c_i = -\log(1 - i/\nu)$,  $i = 0, \ldots, \nu$

(in particular, $a_\nu = c_0 = 0$, $a_M = \infty$) gives $p_i(\bar\theta_n) = 1/M$.
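The cell construction can be sketched in a few lines; the helper name is ours, and the numerical check uses the BAEN estimates that appear below ($\bar\theta_{1n} = 10.13$, $\bar\theta_{2n} = 3.36$):

```python
import math

# Equiprobable cell boundaries for the double-exponential test (M = 2*nu cells).
# theta1 = sample median, theta2 = mean absolute deviation from the median.
# (Helper name is ours, used only for illustration.)
def double_exp_boundaries(theta1, theta2, nu):
    c = [-math.log(1.0 - i / nu) for i in range(nu)]    # c_0, ..., c_{nu-1}; c_nu = infinity
    a = [-ci for ci in reversed(c)] + c[1:]             # a_1 < ... < a_{M-1}, with a_nu = 0
    return [theta1 + ai * theta2 for ai in a]

# BAEN example: M = 10 cells, nu = 5
b = double_exp_boundaries(10.13, 3.36, 5)
print([round(x, 3) for x in b])
# [4.722, 7.051, 8.414, 9.38, 10.13, 10.88, 11.846, 13.209, 15.538]
```

These are the nine finite upper cell boundaries listed in Table 3.3.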


Computations similar to those shown in Section 3.3.3.1 yield
TESTS OF CHI-SQUARED TYPE 83

$\partial p_i(\bar\theta_n)/\partial\theta_1 = -1/(M\bar\theta_{2n})$,  $i = 1, \ldots, \nu$

$\qquad\qquad\qquad\; = 1/(M\bar\theta_{2n})$,  $i = \nu + 1, \ldots, M$   (3.12)

$\partial p_i(\bar\theta_n)/\partial\theta_2 = \bar\theta_{2n}^{-1}(c_{k-1}e^{-c_{k-1}} - c_k e^{-c_k}) \equiv v_k/\bar\theta_{2n}$,  $i = \nu + k,\ \nu - k + 1$;  $k = 1, \ldots, \nu$

so that $B'B$ is diagonal.

Since the information matrix is $\theta_2^{-2} I_2$, the matrix $J(\bar\theta_n) - B'B$ has
rank 1 and the Rao-Robson statistic is not defined. (The reason for this
unusual situation is that for this choice of cells, the median is both the raw
data MLE and the grouped data MLE for $\theta_1$.) The Dzhaparidze-Nikulin statistic is

$Z_n = \frac{M}{n}\left[\sum_{i=1}^{M}\left(N_i - \frac{n}{M}\right)^2 - \frac{\left\{\sum_{i=1}^{\nu} v_i (N_{\nu+i} + N_{\nu-i+1})\right\}^2}{2\sum_{i=1}^{\nu} v_i^2}\right]$

This computation was simplified by the fact that $B'B$ is diagonal and the first
term of (3.9) is 0 by (3.12) and the definition of the median.
The BAEN data contain n = 33 observations, for which $\bar\theta_{1n} = 10.13$ and
$\bar\theta_{2n} = 3.36$. Table 3.3 contains $c_i$, upper cell boundaries $\bar\theta_{1n} + c_i\bar\theta_{2n}$, and
cell frequencies $N_i$ for these data. The statistic is, after some arithmetic,

$Z_n = \frac{10}{33}\sum_{i=1}^{10}(N_i - 3.3)^2 - \frac{10}{33}\cdot\frac{(-1.2828)^2}{(2)(0.1574)} = 7.30 - 1.59 = 5.71$

The P-value from $\chi^2(7)$ is 0.426. The Pearson statistic $X^2(\bar\theta_n) = 7.30$ has critical
points falling between those of $\chi^2(7)$ and $\chi^2(8)$, taking advantage of the
fact that the grouped data MLE was used to estimate one of the two unknown
parameters. The corresponding bounds on the P-value are 0.398 and 0.505.
The double exponential model clearly fits the BAEN data very well. Even

TABLE 3.3 Testing the Fit of the BAEN Data
to the Double Exponential Family

Cell    $c_i$      $\bar\theta_{1n} + c_i\bar\theta_{2n}$    $N_i$

1      -1.609     4.722     4
2      -0.916     7.051     7
3      -0.511     8.414     3
4      -0.223     9.380     2
5       0        10.130     1
6       0.223    10.880     3
7       0.511    11.846     4
8       0.916    13.209     3
9       1.609    15.538     4
10      ∞         ∞         2

though an anomaly reduced from 2 to 1 the difference in the degrees of freedom
of the distributions bounding $X^2(\bar\theta_n)$, there is a considerable spread in
the corresponding P-values. This is typical when n (and therefore M) is
small. In examples where the goodness of fit is less clear than here, use of
$R_n$ or $Z_n$ can be essential to a clear conclusion.
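The Pearson component $X^2(\bar\theta_n) = 7.30$ quoted above can be verified directly from the cell frequencies of Table 3.3; the snippet below is our own check:

```python
# Cell frequencies N_i from Table 3.3 (BAEN data, M = 10 equiprobable cells).
N = [4, 7, 3, 2, 1, 3, 4, 3, 4, 2]
n, M = sum(N), len(N)

# Pearson statistic for equiprobable cells: (M/n) * sum (N_i - n/M)^2,
# the first (and here dominant) component of Z_n.
X2 = (M / n) * sum((Ni - n / M) ** 2 for Ni in N)
print(round(X2, 2))   # 7.3, matching the value quoted in the text
```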

3.3.3.3 Example

In testing for multivariate normality, a natural choice of cell boundaries is
the concentric hyperellipses centered at the sample mean and with shape
determined by the inverse of the sample covariance matrix. These are level
surfaces of the multivariate normal density function with parameters estimated.
Equiprobable cells of this form have the advantage of revealing by
the observed cell counts the presence of such common types of departure
from normality as peakedness or heavy tails. Chi-squared statistics in this
setting are computed and applied by Moore and Stubblebine (1981). Here we
consider the special case of testing fit to the circular bivariate normal family,
a common model for "targeting" problems. It represents the effect of
independent normal horizontal and vertical components with equal variances.
The density function is

$f(x, y|\theta) = \frac{1}{2\pi\sigma^2}\exp\left[-\frac{(x - \mu_1)^2 + (y - \mu_2)^2}{2\sigma^2}\right]$,  $-\infty < x, y < \infty$

$\Omega = \{\theta = (\mu_1, \mu_2, \sigma) : -\infty < \mu_1, \mu_2 < \infty,\ 0 < \sigma < \infty\}$

The MLE of $\theta$ from a random sample $(X_1, Y_1), \ldots, (X_n, Y_n)$ is $\bar\theta_n = (\bar\mu_1, \bar\mu_2, \bar\sigma)$, where

$\bar\mu_1 = \bar{X}$,  $\bar\mu_2 = \bar{Y}$,  $\bar\sigma^2 = \frac{1}{2n}\sum_{i=1}^{n}\{(X_i - \bar{X})^2 + (Y_i - \bar{Y})^2\}$

In constructing a test of fit to this family, it is natural to use as cells annuli
centered at $(\bar{X}, \bar{Y})$ with successive radii $c_i\bar\sigma$ for

$0 = c_0 < c_1 < \cdots < c_{M-1} < c_M = \infty$

Thus

$E_i = \{(x, y) : c_{i-1}^2\bar\sigma^2 \le (x - \bar{X})^2 + (y - \bar{Y})^2 < c_i^2\bar\sigma^2\}$

The cell probabilities are

$p_i(\theta) = \iint_{E_i} f(x, y|\theta)\,dx\,dy$

and calculation shows that $p_i(\bar\theta_n) = 1/M$ when

$c_i = \{-2\log(1 - i/M)\}^{1/2}$,  $i = 1, \ldots, M - 1$
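A quick numerical check of this equiprobability (our own sketch, with an illustrative choice of M): under the circular bivariate normal, the squared radius divided by $\sigma^2$ is chi-squared with 2 df, so $P(R > c\sigma) = e^{-c^2/2}$.

```python
import math

# Radii c_i = sqrt(-2 log(1 - i/M)) should give M equiprobable annuli.
M = 8   # illustrative number of cells (our choice)
c = [math.sqrt(-2.0 * math.log(1.0 - i / M)) for i in range(M)]   # c_0 = 0, ..., c_{M-1}
surv = [math.exp(-ci * ci / 2.0) for ci in c] + [0.0]             # P(R > c_i*sigma); c_M = infinity
p = [surv[i] - surv[i + 1] for i in range(M)]                     # cell probabilities
print(all(abs(pi - 1.0 / M) < 1e-12 for pi in p))                 # True
```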

The recommended test is based on the Rao-Robson statistic. Differentiating
$p_i(\theta)$ under the integral sign, then substituting $\theta = \bar\theta_n$, gives

$\frac{\partial p_i(\bar\theta_n)}{\partial\mu_1} = \frac{\partial p_i(\bar\theta_n)}{\partial\mu_2} = 0$

$\frac{\partial p_i(\bar\theta_n)}{\partial\sigma} = \bar\sigma^{-1}\left(c_{i-1}^2 e^{-\frac{1}{2}c_{i-1}^2} - c_i^2 e^{-\frac{1}{2}c_i^2}\right) \equiv v_i/\bar\sigma$

Hence

$B'B = \bar\sigma^{-2}\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & M\sum_i v_i^2 \end{pmatrix}$

The Fisher information matrix for the circular bivariate normal family is
also diagonal,

$J(\theta) = \frac{1}{\sigma^2}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{pmatrix}$

so that $(J - B'B)^{-1}$ is trivially obtained. Moreover, from (3.9) it follows that

$V_n'B = n^{-1/2}\left(0,\ 0,\ M\sum_i N_i v_i/\bar\sigma\right)$

The Rao-Robson statistic is therefore

$R_n = X^2(\bar\theta_n) + \frac{M^2}{n}\cdot\frac{\left(\sum_{i=1}^{M} N_i d_i\right)^2}{1 - M\sum_{i=1}^{M} d_i^2}$

where

$d_i = v_i/2 = \frac{1}{2}\left(c_{i-1}^2 e^{-\frac{1}{2}c_{i-1}^2} - c_i^2 e^{-\frac{1}{2}c_i^2}\right)$

The limiting null distribution is $\chi^2(M - 1)$, while that of the Pearson statistic
$X^2(\bar\theta_n)$ has critical points falling between those of $\chi^2(M - 4)$ and $\chi^2(M - 1)$.
The Rao-Robson correction term will often be necessary for a clear picture
of the fit of this three-parameter family.

3.3.3.4 Example

The negative exponential distribution with density function

$f(x|\theta) = \theta^{-1} e^{-x/\theta}$,  $0 < x < \infty$

$\Omega = \{\theta : 0 < \theta < \infty\}$

is often assumed in life testing situations. Such studies may involve not a full
sample, but rather Type II censored data. That is, order statistics are
observed up to the sample $\alpha$-quantile,

$X_{(1)} \le X_{(2)} \le \cdots \le X_{([n\alpha])}$

where $[n\alpha]$ is the greatest integer in $n\alpha$ and $0 < \alpha < 1$. It is natural to make

use of random cells with sample quantiles $\hat\xi_i = X_{([n\delta_i])}$ as cell boundaries.
Here $\hat\xi_0 = 0$ and

$0 = \delta_0 < \delta_1 < \cdots < \delta_{M-1} = \alpha < \delta_M = 1$

so that the $n - [n\alpha]$ unobserved $X_j$ fall in the rightmost cell. Although the
cell frequencies $N_i$ are now fixed, the general theory of Moore and Spruill
(1975) applies to this choice of cells. A full treatment of this type of problem
is given in Mihalko and Moore (1980). Chi-squared tests are immediately
applicable to data censored at fixed points. We now see that allowing random
cells allows Type II censored data to be handled as well.
The Pearson-Fisher Statistic. Estimate $\theta$ by the grouped data MLE $\hat\theta_n$
found as the solution of (3.2). That equation becomes in this case

$\sum_{i=1}^{M} N_i\,\frac{\hat\xi_{i-1} e^{-\hat\xi_{i-1}/\theta} - \hat\xi_i e^{-\hat\xi_i/\theta}}{e^{-\hat\xi_{i-1}/\theta} - e^{-\hat\xi_i/\theta}} = 0$

which is easily solved iteratively to obtain $\hat\theta_n = \hat\theta_n(\hat\xi_1, \ldots, \hat\xi_{M-1})$. The
statistic is

$X^2(\hat\theta_n) = \sum_{i=1}^{M}\frac{[N_i - n p_i(\hat\theta_n)]^2}{n p_i(\hat\theta_n)}$

where

$N_i = [n\delta_i] - [n\delta_{i-1}]$  (nonrandom)

$p_i(\theta) = e^{-\hat\xi_{i-1}/\theta} - e^{-\hat\xi_i/\theta}$  (random)

The limiting null distribution is $\chi^2(M - 2)$.
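The iterative solution can be sketched with a simple bisection on the score function; the helper below and its toy frequencies are ours, not the EXP data:

```python
import math

# Grouped-data MLE for the exponential model: solve the score equation above by
# bisection.  xi = interior cell boundaries (xi_0 = 0, xi_M = infinity implicit),
# N = cell frequencies with len(N) == len(xi) + 1.  Helper and data are illustrative.
def grouped_exp_mle(xi, N, lo, hi, iters=200):
    def score(theta):
        b = [0.0] + list(xi)
        s = 0.0
        for i in range(len(xi)):                   # cells with finite upper boundary
            e0 = math.exp(-b[i] / theta)
            e1 = math.exp(-b[i + 1] / theta)
            s += N[i] * (b[i] * e0 - b[i + 1] * e1) / (e0 - e1)
        return s + N[-1] * xi[-1]                  # rightmost cell (xi_{M-1}, infinity)
    for _ in range(iters):                         # score decreases in theta here
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Two cells split at xi = 1.0 with 60 and 40 observations: the MLE solves
# exp(-1/theta) = 0.4, i.e. theta = 1/log(2.5).
theta = grouped_exp_mle([1.0], [60, 40], lo=0.1, hi=100.0)
print(round(theta, 4))   # 1.0914
```

For this two-cell case the score equation has the closed-form solution used in the comment, which makes the bisection easy to validate.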


The Wald's Method Statistic. A more powerful chi-squared test can be
obtained by use of the raw data MLE of $\theta$ from the censored sample, namely
(Epstein and Sobel, 1953),

$\bar\theta_n = \frac{1}{[n\alpha]}\left\{\sum_{i=1}^{[n\alpha]} X_{(i)} + (n - [n\alpha]) X_{([n\alpha])}\right\}$

By obtaining the limiting distribution of $V_n(\bar\theta_n)$ and then finding the appropriate
quadratic form, a generalization of the Rao-Robson statistic to censored
samples can be obtained. This is done in Mihalko and Moore (1980).
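The Epstein-Sobel estimator is a one-line computation; the helper and toy numbers below are ours:

```python
# Raw-data MLE of theta from a Type II censored exponential sample:
# the first r = [n*alpha] order statistics are observed (Epstein and Sobel, 1953).
def censored_exp_mle(x_observed, n):
    # x_observed holds the r smallest order statistics, already sorted
    r = len(x_observed)
    total = sum(x_observed) + (n - r) * x_observed[-1]
    return total / r

# Toy case: n = 5 lifetimes, alpha = 0.6, so r = 3 observed values 1.0, 2.0, 3.0
print(censored_exp_mle([1.0, 2.0, 3.0], 5))   # 4.0  (= (1 + 2 + 3 + 2*3)/3)
```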

The resulting statistic for the present example is

$R_n = X^2(\bar\theta_n) + (nD)^{-1}\left\{\sum_{i=1}^{M}\frac{N_i v_i}{p_i(\bar\theta_n)}\right\}^2$

where $N_i$ and $p_i(\theta)$ are as above, and

$v_i = \frac{\hat\xi_{i-1}}{\bar\theta_n} e^{-\hat\xi_{i-1}/\bar\theta_n} - \frac{\hat\xi_i}{\bar\theta_n} e^{-\hat\xi_i/\bar\theta_n}$

$D = 1 - e^{-\hat\xi_{M-1}/\bar\theta_n} - \sum_{i=1}^{M}\frac{v_i^2}{p_i(\bar\theta_n)}$

In the full sample case, $\alpha = 1$, $v_M = N_M = 0$, $\bar\theta_n = \bar{X}$, and the statistic
$R_n$ reduces to the Rao-Robson statistic of Section 3.3.3.1 (with M − 1 cells
bounded by the $\hat\xi_i$).
The motivation for using censored data when lifetimes or survival times
are being measured is apparent from the EXP data set. The sample 80th percentile
is 9.46, while the maximum of the 100 observations is 39.12. The
MLE of $\theta$ from the data censored at $\alpha = 0.8$ is $\bar\theta_n = 5.471$, compared with
the full sample MLE, $\bar{X} = 5.415$. Experience shows that the Roscoe-Byars
guidelines are adequate to ensure accurate critical points from the $\chi^2$
distribution in the present situation, where the $np_i$ are random and unequal.
Tests of the EXP data will therefore be made with (a) the full sample using
10 cells having the sample deciles as boundaries; and (b) the data censored
at $\alpha = 0.8$ using 9 cells with the first 8 sample deciles as boundaries. All
cells except the rightmost in case (b) contain 10 observations. The results
are, for the full sample,

$R_n = 6.132 + 0.220 = 6.352$

with a P-value of 0.704 from $\chi^2(9)$. For the censored sample,

$R_n = 5.153 + 0.065 = 5.218$

with a P-value of 0.734 from $\chi^2(8)$. These results are comparable to those
obtained for the same data in Section 3.3.3.1.

3.3.4 Nonstandard Chi-Squared Statistics

We have considered two classes of "standard" chi-squared statistics, the
Cressie-Read class based on measures of divergence and the Moore-Spruill
class of nonnegative definite quadratic forms. The Pearson $X^2$ is the only
common member of these classes. All of the Cressie-Read statistics are
asymptotically equivalent to $X^2$ under the null hypothesis when the same
(possibly random) cells and the same estimators are used. But different
divergence measures may be sensitive to different types of divergence of $N_i$
from $np_i$, and this fact can be used to choose a statistic when a specific type
of alternative is to be guarded against. The Moore-Spruill statistics differ
in asymptotic behavior under the null hypothesis. The choice of statistics
within this class is most often made to obtain a $\chi^2$ limiting null distribution
for a given estimator. (The Cressie-Read statistics have a $\chi^2$ limiting null
distribution only for estimators equivalent to the grouped-data MLE, a class
that includes all minimum-$R_\lambda$ estimators.)

The theory of these standard chi-squared statistics assumes independent
observations and a fixed number of cells M. Relaxing these assumptions
leads to situations that are incompletely explored, and some other statistics
have also been suggested. In this section we mention a few of these nonstandard
cases.
1. Increasing M with n. Usual practice is to increase the number of
cells M as the sample size n increases (recall the Mann-Wald recommendation
(3.7)). This practice is not explicitly recognized in the standard theory.
The large-sample theory of the usual chi-squared statistics for increasing
M is available in the case of a simple null hypothesis (Holst 1972, Morris
1975, Cressie and Read 1984). The limiting null distributions of the $R_\lambda$ are
normal, with mean and variance depending on $\lambda$. The statistics are therefore
no longer asymptotically equivalent, and $X^2$ is the optimal member of the
class in terms of Pitman efficiency. The behavior of these statistics when
parameters are estimated has not been explored.

Two possible variations in practice suggest themselves. (1) Allow M to
increase with n at a rate faster than the Mann-Wald suggestion $n^{2/5}$. Kempthorne
(1968) proposed the use of the Pearson statistic with M = n equiprobable
cells. Simulation studies suggest that standard statistics with fewer
cells have superior power except against very short-tailed alternatives.
(2) Use a normal rather than a $\chi^2$ approximation for the distribution of standard
statistics. For $X^2$, the $\chi^2$ approximation is generally both adequate in
practice and superior to the normal. The $\chi^2$ is also easier to use, since it
does not require computing the asymptotic mean and variance. For other $R_\lambda$
(such as $G^2$), the $\chi^2$ approximation is much less good, and the normal approximation
may be superior. See Koehler and Larntz (1980). But Read (1984)
gives an adjustment of the $\chi^2$ approximation that is easier to use than the
normal and should also be considered.
2. Dependent observations. Since many data are collected as time
series, tests of fit that assume independence may often be applied to data
that are in fact dependent. Positive dependence among the observations will
cause omnibus tests of fit to reject a true hypothesis about the distribution of
the individual observations too often. That is, positive dependence is confounded
with lack of fit. This is shown in considerable generality for both
chi-squared and EDF tests by Gleser and Moore (1983). If a model for the
dependence is assumed, it may be possible to compute the effect of dependence
or even to construct a valid chi-squared test using the distributional
results in Moore (1982). But in general, data should be checked for serial
dependence before testing fit, as the tests are sensitive to dependence as
well as to lack of fit.
3. Sequentially adjusted cells. By use of the conditional probability
integral transformation (see Chapter 6), O'Reilly and Quesenberry (1973)
obtain particular members of the following class of nonstandard chi-squared
tests. Rather than base cell frequencies on cells $E_i$ (fixed) or $E_i(X_1, \ldots, X_n)$
(data-dependent) into which all of $X_1, \ldots, X_n$ are classified, the cells used
to classify each successive $X_j$ are functions $E_{ij}$ of $X_1, \ldots, X_j$ only. Thus
additional observations do not require reclassification of earlier observations,
as in the usual random cell case. No general theory of chi-squared
statistics based on such sequentially adjusted cells is known. O'Reilly and
Quesenberry obtain by their transformation approach specific functions $E_{ij}$
such that the cell frequencies are multinomially distributed and the Pearson
statistic has the $\chi^2(M - 1)$ limiting null distribution. The transformation
approach requires the computation of the minimum variance unbiased estimator
of $F(\cdot|\theta)$. Testing fit to an uncommon family thus requires the practitioner
to do a hard calculation. Moreover, any test using sequentially
adjusted cells has the disadvantage that the value of the statistic depends on
the order in which the observations were obtained. These are serious barriers
to use.
4. Easterling's approach. Easterling (1976) provides an interesting
approach to parameter estimation based on tests of fit. Roughly speaking,
he advocates replacing the usual confidence intervals for $\theta$ in $F(\cdot|\theta)$ based
on the acceptance regions of a test of

$H_0 : \theta = \theta_0$

$H_1 : \theta \ne \theta_0$

with intervals based on the acceptance regions of tests of fit to completely
specified distributions,

$H_0^* : G(\cdot) = F(\cdot|\theta_0)$

$H_1^* : G(\cdot) \ne F(\cdot|\theta_0)$

In the course of his discussion, Easterling suggests rejecting the family
$\{F(x|\theta) : \theta \in \Omega\}$ as a model for the data if the (say) 50% confidence interval
for $\theta$ based on acceptance regions for $H_0^*$ is empty. This "implicit test of fit"

deserves comment, using the chi-squared case to make some observations
that apply as well when other tests of $H_0^*$ are employed.

Taking then the standard chi-squared statistic for $H_0^*$,

$X^2(\theta) = \sum_{i=1}^{M}\frac{[N_i - n p_i(\theta)]^2}{n p_i(\theta)}$

and denoting by $\chi^2_\alpha(M - 1)$ the upper $\alpha$-point of the $\chi^2(M - 1)$ distribution,
the $(1 - \alpha)$-confidence interval is empty if and only if

$X^2(\theta) > \chi^2_\alpha(M - 1)$ for all $\theta$ in $\Omega$   (3.13)

But if $\tilde\theta_n$ is the minimum chi-squared estimator, (3.13) holds if and only if

$X^2(\tilde\theta_n) > \chi^2_\alpha(M - 1)$   (3.14)

When any $F(x|\theta)$ is true, $X^2(\tilde\theta_n)$ has the $\chi^2(M - p - 1)$ distribution, and the
probability of the event (3.14) can be explicitly computed. It is less than $\alpha$,
but close to $\alpha$ when M is large. Thus Easterling's suggestion essentially
reduces to the use of standard tests of fit with parameters estimated by the
minimum distance method corresponding to the test statistic employed.
Moreover, his method bypasses a proper consideration of the distributional
effects of estimating unknown parameters.

3.4 RECOMMENDATIONS ON USE OF CHI-SQUARED TESTS

Chi-squared tests are generally less powerful than EDF tests and special-purpose
tests of fit. It is difficult to assess the seriousness of this lack of
power from published sources. Comparative studies have generally used the
Pearson statistic rather than the more powerful Watson-Roy and Rao-Robson
statistics. Moreover, such studies have often dealt with problems of parameter
estimation in ways which tend to understate the power of general purpose
tests such as chi-squared and Kolmogorov-Smirnov tests. This is true of the
study by Shapiro, Wilk and Chen (1968), for example. Reliable information
about the power of chi-squared tests for normality can be gained from Table
IV of Rao and Robson (1974) and from Tables 1 and 2 of Dahiya and Gurland
(1973). The former demonstrates strikingly the gain in power (always at
least 40% in the cases considered, and usually much greater) obtained by
abandoning the Pearson-Fisher statistic for more modern chi-squared statistics.
Nonetheless, chi-squared tests cannot in general match EDF and special
purpose tests of fit in power.

This relative lack of power implies three theses on the practical use of
chi-squared techniques. First, chi-squared tests of fit must compete for use
primarily on the basis of flexibility and ease of use. Discrete and/or multivariate
data do not discomfit chi-squared methods, and the necessity to estimate
unknown parameters is more easily dealt with by chi-squared tests
than by other tests of fit.

Second, chi-squared statistics actually having a (limiting) chi-squared
null distribution have a much stronger claim to practical usefulness. Ease
of use requires the ability to obtain (1) the observed value of the test statistic,
and (2) critical points for the test statistic. The calculations required
for (1) in chi-squared statistics are at most iterative solutions of nonlinear
equations and evaluation of quadratic forms, perhaps with matrix expressed
as the inverse of a given symmetric pd matrix. These are not serious barriers
to practical use, given the current availability of computer library
routines. Computation of critical points of an untabled distribution is a much
harder task for a user of statistical methods. Chi-squared and EDF statistics
both have as their limiting null distributions the distributions of linear combinations
of central chi-squared random variables. General statistics of both
classes require a separate table of critical points for each hypothesized
family. The effort needed is justified when the hypothesized family is common,
but should be expended on a test more powerful than chi-squared tests.
In less common cases, or when no more powerful test with $\theta$-free null distribution
is available, there are several chi-squared tests requiring only
tables of the $\chi^2$ distribution. These include the Pearson-Fisher, Rao-Robson,
and Dzhaparidze-Nikulin tests, and others which can be constructed by Wald's
method. Among the chi-squared statistics proposed and studied to date, the
Rao-Robson statistic $R_n$ of (3.10) appears to have generally superior power
and is therefore the statistic of choice for protection against general alternatives.
Computation of $R_n$ in the nonstandard cases most appropriate for chi-squared
tests of fit does require some mathematical work. However, the
Pearson statistic $X^2(\bar\theta_n)$ with raw-data MLEs is the first and usually dominant
component of $R_n$. If $X^2(\bar\theta_n)$ itself lies in the upper tail of the $\chi^2(M - 1)$
distribution, the fit can be rejected without computing $R_n$.

The third thesis rests on the exposition and examples in this chapter.
Chi-squared tests are the most practical tests of fit in many situations. When
parameters must be estimated in non-location-scale families or in uncommon
distributions, when the data are discrete, multivariate, or even censored,
chi-squared tests remain easily applicable.

ACKNOWLEDGMENT

This work was supported in part by National Science Foundation Grant
DMS-8501966.

REFERENCES

Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete Multivariate Analysis. Cambridge: The MIT Press.

Chernoff, H. and Lehmann, E. L. (1954). The use of maximum-likelihood estimates in $\chi^2$ tests for goodness of fit. Ann. Math. Statist. 25, 579-586.

Cochran, W. G. (1954). Some methods of strengthening the common $\chi^2$ tests. Biometrics 10, 417-451.

Cohen, A. and Sackrowitz, H. B. (1975). Unbiasedness of the chi-square, likelihood ratio, and other goodness of fit tests for the equal cell case. Ann. Statist. 3, 959-964.

Cressie, N. and Read, T. R. C. (1984). Multinomial goodness-of-fit tests. J. Roy. Statist. Soc. B46, 440-464.

Dahiya, R. C. and Gurland, J. (1972). Pearson chi-square test of fit with random intervals. Biometrika 59, 147-153.

Dahiya, R. C. and Gurland, J. (1973). How many classes in the Pearson chi-square test? J. Amer. Statist. Assoc. 68, 707-712.

Davis, A. W. (1977). A differential equation approach to linear combinations of independent chi-squares. J. Amer. Statist. Assoc. 72, 212-214.

Derman, C., Gleser, L. J. and Olkin, I. (1973). A Guide to Probability Theory and Application. Holt, Rinehart and Winston, New York.

Dzhaparidze, K. O. and Nikulin, M. S. (1974). On a modification of the standard statistics of Pearson. Theor. Probability Appl. 19, 851-853.

Easterling, R. G. (1976). Goodness-of-fit and parameter estimation. Technometrics 18, 1-9.

Epstein, B. and Sobel, M. (1953). Life testing. J. Amer. Statist. Assoc. 48, 486-502.

Fisher, R. A. (1924). The conditions under which $\chi^2$ measures the discrepancy between observation and hypothesis. J. Roy. Statist. Soc. 87, 442-450.

Gleser, L. J. and Moore, D. S. (1983). The effect of dependence on chi-squared and empiric distribution tests of fit. Ann. Statist. 11, 1100-1108.

Hoeffding, W. (1965). Asymptotically optimal tests for multinomial distributions. Ann. Math. Statist. 36, 369-408.

Holst, L. (1972). Asymptotic normality and efficiency for certain goodness-of-fit tests. Biometrika 59, 127-145.

Kempthorne, O. (1968). The classical problem of inference—goodness of fit. Proc. Fifth Berkeley Symp. Math. Statist. Prob. 1, 235-249.

Koehler, K. J. and Larntz, K. (1980). An empirical investigation of goodness-of-fit statistics for sparse multinomials. J. Amer. Statist. Assoc. 75, 336-344.

Larntz, K. (1978). Small-sample comparisons of exact levels for chi-squared goodness-of-fit statistics. J. Amer. Statist. Assoc. 73, 253-263.

LeCam, L., Mahan, C. and Singh, A. (1983). An extension of a theorem of H. Chernoff and E. L. Lehmann. In: Rizvi, M. H., Rustagi, J. S. and Siegmund, D. (Eds.), Recent Advances in Statistics: Papers in Honor of Herman Chernoff, 303-337. Academic Press, New York.

Mann, H. B. and Wald, A. (1942). On the choice of the number of class intervals in the application of the chi-square test. Ann. Math. Statist. 13, 306-317.

Mihalko, D. and Moore, D. S. (1980). Chi-square tests of fit for Type II censored samples. Ann. Statist. 8, 625-644.

Moore, D. S. (1971). A chi-square statistic with random cell boundaries. Ann. Math. Statist. 42, 147-156.

Moore, D. S. (1977). Generalized inverses, Wald's method and the construction of chi-squared tests of fit. J. Amer. Statist. Assoc. 72, 131-137.

Moore, D. S. (1982). The effect of dependence on chi-squared tests of fit. Ann. Statist. 10, 1163-1171.

Moore, D. S. (1984). Measures of lack of fit from tests of chi-squared type. J. Statist. Planning Inf. 10, 151-166.

Moore, D. S. and Spruill, M. C. (1975). Unified large-sample theory of general chi-squared statistics for tests of fit. Ann. Statist. 3, 599-616.

Moore, D. S. and Stubblebine, J. B. (1981). Chi-square tests for multivariate normality, with application to common stock prices. Comm. Statist. A10, 713-733.

Morris, C. (1975). Central limit theorems for multinomial sums. Ann. Statist. 3, 165-188.

Neyman, J. (1949). Contribution to the theory of the $\chi^2$ test. Proc. Berkeley Symp. Math. Statist. and Prob., 239-273.

O'Reilly, F. J. and Quesenberry, C. P. (1973). The conditional probability integral transformation and applications to obtain composite chi-square goodness-of-fit tests. Ann. Statist. 1, 74-83.

Pollard, D. (1979). General chi-square goodness-of-fit tests with data-dependent cells. Z. Wahrscheinlichkeitstheorie verw. Geb. 50, 317-331.

Rao, K. C. and Robson, D. S. (1974). A chi-square statistic for goodness-of-fit within the exponential family. Comm. Statist. 3, 1139-1153.

Read, T. R. C. (1984). Small sample comparisons for the power divergence goodness-of-fit statistics. J. Amer. Statist. Assoc. 79, 929-935.

Roscoe, J. T. and Byars, J. A. (1971). An investigation of the restraints with respect to sample size commonly imposed on the use of the chi-square statistic. J. Amer. Statist. Assoc. 66, 755-759.

Roy, A. R. (1956). On $\chi^2$ statistics with variable intervals. Technical Report No. 1, Stanford University, Department of Statistics.

Schorr, B. (1974). On the choice of the class intervals in the application of the chi-square test. Math. Operationsforsch. u. Statist. 5, 357-377.

Shapiro, S. S., Wilk, M. B. and Chen, H. J. (1968). A comparative study of various tests for normality. J. Amer. Statist. Assoc. 63, 1343-1372.

Spruill, M. C. (1976). A comparison of chi-square goodness-of-fit tests based on approximate Bahadur slope. Ann. Statist. 4, 409-412.

Watson, G. S. (1957). The chi-squared goodness-of-fit test for normal distributions. Biometrika 44, 336-348.

Watson, G. S. (1958). On chi-square goodness-of-fit tests for continuous distributions. J. Roy. Statist. Soc., Ser. B 20, 44-61.

Watson, G. S. (1959). Some recent results in chi-square goodness-of-fit tests. Biometrics 15, 440-468.
Tests Based on EDF Statistics

Michael A. Stephens   Simon Fraser University, Burnaby, B.C., Canada

4.1 INTRODUCTION

Graphical methods have a wide appeal in deciding if a random sample appears
to come from a given distributional form. Some of these methods have already
been considered in Chapter 2. In this chapter we consider tests of fit
based on the empirical distribution function (EDF). The EDF is a step function,
calculated from the sample, which estimates the population distribution
function. EDF statistics are measures of the discrepancy between the EDF
and a given distribution function, and are used for testing the fit of the sample
to the distribution; this may be completely specified (Case 0 below) or may
contain parameters which must be estimated from the sample.

4.2 EMPIRICAL DISTRIBUTION FUNCTION STATISTICS

4.2.1 The Empirical Distribution Function (EDF)

Suppose a given random sample of size n is $X_1, \ldots, X_n$, and let $X_{(1)} < X_{(2)}
< \cdots < X_{(n)}$ be the order statistics; suppose further that the distribution of
X is F(x). For the present and in most of this chapter we assume this distribution
to be continuous. The empirical distribution function (EDF) is $F_n(x)$
defined by

$F_n(x) = \frac{\text{number of observations} \le x}{n}$,  $-\infty < x < \infty$
98 STEPHENS

TABLE 4.1 Leghorn Chick Data

X^a     Z1^b     Z2^c

156 0.104 0.040
162 0.139 0.060
168 0.180 0.087
182 0.304 0.184
186 0.345 0.221
190 0.388 0.261
190 0.388 0.261
196 0.455 0.329
202 0.523 0.402
210 0.612 0.505
214 0.655 0.557
220 0.716 0.633
226 0.771 0.704
230 0.804 0.747
230 0.804 0.747
236 0.848 0.805
236 0.848 0.805
242 0.885 0.855
246 0.906 0.883
270 0.977 0.976

^a Original values X of weights of 20 chicks, in grams.
^b Values Z1 are given by the Probability Integral Transformation for a test for normality, with given mean 200 and given standard deviation 35, for use in a Case 0 test.
^c Values Z2 are given by the Probability Integral Transformation using sample mean 209.6 and sample standard deviation 30.65, for use in a Case 3 test.
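The Z1 column of Table 4.1 can be reproduced from the standard normal CDF; the sketch below is ours, using `math.erf`:

```python
import math

# Probability Integral Transformation for the Case 0 normal test of Table 4.1:
# Z1 = Phi((x - 200)/35), with Phi the standard normal CDF expressed via erf.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for x in (156, 202, 270):
    print(x, round(phi((x - 200) / 35), 3))
# 156 0.104
# 202 0.523
# 270 0.977
```

These agree with the first, ninth, and last entries of the Z1 column.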
TESTS BASED ON EDF STATISTICS 99

More precisely, the definition is

$F_n(x) = 0$,  $x < X_{(1)}$

$F_n(x) = i/n$,  $X_{(i)} \le x < X_{(i+1)}$,  $i = 1, \ldots, n - 1$   (4.1)

$F_n(x) = 1$,  $X_{(n)} \le x$

Thus $F_n(x)$ is a step function, calculated from the data; as x increases
it takes a step up of height 1/n as each sample observation is reached. For
any x, $F_n(x)$ records the proportion of observations less than or equal to x,
while F(x) is the probability of an observation less than or equal to x. We
can expect $F_n(x)$ to estimate F(x), and it is in fact a consistent estimator of
F(x); as $n \to \infty$, $|F_n(x) - F(x)|$ decreases to zero with probability one.
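Definition (4.1) translates directly into code; the sketch below (ours) evaluates $F_n$ at any point x:

```python
import bisect

# The EDF of definition (4.1): F_n(x) = (number of observations <= x) / n.
def edf(sample):
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect.bisect_right(xs, x) / n

# Tiny illustrative sample (ours): steps of height 1/3 at 1, 2, and 3
F = edf([3.0, 1.0, 2.0])
print(F(0.5), F(1.0), F(2.5), F(3.0))   # 0.0 0.3333333333333333 0.6666666666666666 1.0
```

`bisect_right` counts observations less than or equal to x, which is exactly the numerator in (4.1).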

E4.2.1 Example

Table 4.1 gives the weight X in grams of twenty 21-day-old leghorn chicks,
given by Bliss (1967) and taken from Appendix A. Figure 4.1 gives the EDF

FIGURE 4.1 EDF of X. Also drawn is the normal distribution, mean 200,
variance 1225.

of these sample data. It is clear that the weights have been rounded to the
nearest gram, so that strictly the parent population is discrete, but with
these large numbers this approximation will make negligible difference. The
$F_n(x)$ suggests that F(x) will have the characteristic S-shape of many distributions,
including the normal distribution. A typical normal distribution,
with mean $\mu = 200$ and variance $\sigma^2 = 1225$, has also been drawn in Figure
4.1; this will be used in later work to illustrate tests that the sample of
weights comes from this distribution.

4.2.2 EDF Statistics

A statistic measuring the difference between $F_n(x)$ and F(x) will be called an
EDF statistic; we shall concentrate on seven which have attracted most attention.
They are based on the vertical differences between $F_n(x)$ and F(x), and
are conveniently divided into two classes, the supremum class and the
quadratic class.

The supremum statistics. The first two EDF statistics, $D^+$ and $D^-$, are,
respectively, the largest vertical difference when $F_n(x)$ is greater than F(x),
and the largest vertical difference when $F_n(x)$ is smaller than F(x); formally,
$D^+ = \sup_x\{F_n(x) - F(x)\}$ and $D^- = \sup_x\{F(x) - F_n(x)\}$. The most well-known
EDF statistic is D, introduced by Kolmogorov (1933):

$D = \sup_x |F_n(x) - F(x)| = \max(D^+, D^-)$

A closely related statistic V, given by Kuiper (1960), is useful for observations
on a circle (see Section 4.5.3):

$V = D^+ + D^-$

The quadratic statistics. A second and wide class of measures of discrepancy
is given by the Cramér-von Mises family

$Q = n\int_{-\infty}^{\infty}\{F_n(x) - F(x)\}^2\,\psi(x)\,dF(x)$

where $\psi(x)$ is a suitable function which gives weights to the squared difference
$\{F_n(x) - F(x)\}^2$. When $\psi(x) = 1$ the statistic is the Cramér-von Mises
statistic, now usually called $W^2$, and when $\psi(x) = [\{F(x)\}\{1 - F(x)\}]^{-1}$ the
statistic is the Anderson-Darling (1954) statistic, called $A^2$. A modification
of $W^2$, also devised originally for the circle (see Section 4.5.3), is the Watson
(1961) statistic $U^2$, defined by

$U^2 = n\int_{-\infty}^{\infty}\left\{F_n(x) - F(x) - \int_{-\infty}^{\infty}[F_n(t) - F(t)]\,dF(t)\right\}^2 dF(x)$


4.2.3 Computing Formulas

From the basic definitions of the supremum statistics and the quadratic statistics
given above, suitable computing formulas must be found. This is done
by using the Probability Integral Transformation (PIT), Z = F(X); when F(x)
is the true distribution of X, the new random variable Z is uniformly distributed
between 0 and 1. Then Z has distribution function $F^*(z) = z$, $0 \le z \le 1$.
Suppose that a sample $X_1, \ldots, X_n$ gives values $Z_i = F(X_i)$, $i = 1, \ldots, n$,
and let $F_n^*(z)$ be the EDF of the values $Z_i$.

EDF statistics can now be calculated from a comparison of $F_n^*(z)$ with
the uniform distribution for Z. It is easily shown that, for values z and x
related by z = F(x), the corresponding vertical differences in the EDF diagrams
for X and for Z are equal; that is,

$F_n(x) - F(x) = F_n^*(z) - F^*(z) = F_n^*(z) - z$;

consequently EDF statistics calculated from the EDF of the $Z_i$ compared
with the uniform distribution will take the same values as if they were calculated
from the EDF of the $X_i$, compared with F(x). This leads to the following
formulas for calculating EDF statistics from the Z-values.

The formulas involve the Z-values arranged in ascending order, $Z_{(1)} <
Z_{(2)} < \cdots < Z_{(n)}$. Then, with $\bar{Z} = \sum_i Z_i/n$,

D^ = m ax^{i/n - Z^.^} ; D ” = max.{Z^^^ - ( 1 - l ) / n } ; D = m a x (D ^ ,D *)

V = d "^ + D“

W* = - (21 - l ) / ( 2 n ) Ÿ + l/(12n) ; = W * - n(Z - 0.5)*


(4.2)
A * = -n - (1/n) 2^(21 - l)[lo g Z^j^ + log { 1 -

Another form ula fo r A^ is

A* = -n - (1/n) Z j[(2 i - I) log Z^j^ -b (2n + I - 21) log {1 -

In these e g r e s s io n s , log x means loge x, and the sums and maxima are
over I < I < n. A ll these form ulas are very straightforward to calculate,
particularly with a m odem computer, o r program m able desk calculator.
Note that statistic D “ can be easily m iscalculated, using 1/n instead of
(i - l)/n .
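The computing formulas (4.2) are simple to program; the sketch below is one possible implementation (the function name edf_statistics is ours, not the book's), taking the probability-integral-transformed values Z as input.

```python
import math

def edf_statistics(z):
    """Compute the EDF statistics of formulas (4.2) from Z-values in (0, 1)."""
    z = sorted(z)
    n = len(z)
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    d = max(d_plus, d_minus)
    v = d_plus + d_minus                      # Kuiper statistic
    w2 = sum((zi - (2 * (i + 1) - 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1 / (12 * n)
    zbar = sum(z) / n
    u2 = w2 - n * (zbar - 0.5) ** 2           # Watson statistic
    a2 = -n - sum((2 * (i + 1) - 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i]))
                  for i in range(n)) / n      # Anderson-Darling statistic
    return {"D+": d_plus, "D-": d_minus, "D": d, "V": v,
            "W2": w2, "U2": u2, "A2": a2}
```

For example, the artificial values Z = 0.1, 0.4, 0.7 give D = 0.3, V = 0.4, W² = 0.06, and U² = 0.03.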
102 STEPHENS

4.3 GOODNESS-OF-FIT TESTS BASED ON THE EDF (EDF TESTS)

4.3.1 General Comments

The general test of fit is a test of

    H₀: a random sample of n X-values comes from F(x;θ)

where F(x;θ) is a (continuous) distribution and θ is a vector of parameters.
For example, for a normal distribution under test, θ = (μ,σ²), and for a
Gamma distribution defined as in Section 4.12, θ = (α,β,m). When θ is
fully specified, we call the situation Case 0. Then Z_(i) = F(X_(i);θ) gives a
set Z_(i) which, on H₀, are ordered uniforms, and equations (4.2) are used to
give EDF statistics. On the other hand, F(x;θ) may be defined only as a
member of a family of distributions such as the normal or Gamma, but all
or part of the vector θ may be unknown. As an example of Case 0, the data
in Table 4.1 might be tested to be normal with mean μ = 200 and variance
σ² = 1225, and as an example of the second situation it would be tested only
to be normal, with unknown mean and variance. For the second case it would
be natural to use the sample mean and variance as estimates of the compo-
nents of θ = (μ,σ²).
    For Case 0, distribution theory of EDF statistics is well-developed,
even for finite samples, and tables have been available for some time. When
θ contains one or more unknown parameters, these parameters may be re-
placed by estimates, to give θ̂ as the estimate of θ. Then formulas (4.2)
may still be used to calculate EDF statistics, with Z_(i) = F(X_(i);θ̂). However,
even when H₀ is true, the Z_(i) will now not be an ordered uniform sample,
and the distributions of EDF statistics will be very different from those for
Case 0; they will depend on the distribution tested, the parameters estimated,
and the method of estimation, as well as on the sample size. New points
should then be used for the appropriate test, even for large samples; other-
wise a serious error in significance level will result.

4.3.2 Unknown Location and Scale Parameters

When the unknown components of θ are location or scale parameters, and if
these are estimated by appropriate methods, the distributions of EDF statis-
tics will not depend on the true values of the unknown parameters. Thus per-
centage points for EDF tests for such distributions, for example, the normal,
exponential, extreme-value, and logistic distributions, depend only on the
family tested and on the sample size n. Nevertheless, the exact distributions
of EDF statistics are very difficult to find, and except for the exponential dis-
tribution, Monte Carlo studies have been extensively used to find points for
finite n. Fortunately, for the quadratic statistics W², U², and A², asymptotic
theory is available; furthermore, the percentage points of these statistics for
finite n converge rapidly to the asymptotic points. For the statistics D⁺, D⁻,
D, and V, there is no general asymptotic theory (except for Case 0), and
even asymptotic points must be estimated. This may be done by plotting, for
a fixed α, the Monte Carlo points for samples of size n against m = 1/n,
and then extrapolating to m = 0; alternatively, since the statistics are func-
tions of a process which is asymptotically Gaussian, points may be found by
simulating the Gaussian process. Serfling and Wood (1975) and Wood (1978a)
have obtained asymptotic points by this method. Both techniques are of
course subject to sampling variation and errors due to extrapolation; Chandra,
Singpurwalla, and Stephens (1981) have given some comparisons of the two
methods in obtaining points for tests for the extreme value distribution.
    For the tests corresponding to many distributional families, Stephens
(1970, 1974b, 1977, 1979) has given modifications of the test statistics; if
the statistic is, say, T, the modification is a function of n and T which is
then referred to the asymptotic points of T or of T√n. Asymptotic theory
depends on using asymptotically efficient estimators for the estimates of
unknown components of θ; the asymptotic points given will then be valid for
any such estimators. Points for finite n will depend on which estimators are
used; usually these are maximum likelihood estimators, although an exception
is the Cauchy distribution below. In the test situations in following sections,
percentage points for finite n, using the estimators given, were found from
extensive Monte Carlo studies, often done by the author, although other
studies referenced have also been used. The modifications are then derived
from an examination of how these points, for α = 0.05 say, converge to the
asymptotic point. A feature of the modifications is that, at least in the appro-
priate tail, they do not depend on α; thus when such modifications have been
found, the usual tables of percentage points, with entries for n and α, can
be reduced to one line for each test situation. The modifications hold only if
the estimators given are used. They have been calculated to be most accurate
at about α = 0.05, but usually give good results, for practical purposes,
for α less than about 0.2.

4.3.3 Unknown Shape Parameters

When unknown parameters are not location or scale parameters, for example
when the shape parameter of a Gamma or a Weibull distribution is unknown,
null distribution theory, even asymptotic, when the parameters are esti-
mated, will depend on the true values of these parameters. However, if this
dependence is very slight, a set of tables, to be used with the estimated
value of the shape parameter, can still be valuable (see, for example, Sec-
tion 4.12, concerning tests for the Gamma distribution). Other methods of
dealing with unknown parameters are discussed in Section 4.16.3.
    There is now a vast literature on EDF statistics and tests, and only the
principal references related directly to the tests and tables are included
here. Surveys have been given by Sahler (1968) and by Neuhaus (1979); a com-
prehensive review of the theory, and many references, is given by Durbin
(1973).

We next give tests and applications of EDF statistics for Case 0, and in
subsequent sections give tests for the major distributions.

4.4 EDF TESTS FOR A FULLY SPECIFIED DISTRIBUTION (CASE 0)

The following procedure can now be set out for EDF tests for Case 0, that
is, for the null hypothesis

    H₀: a random sample X_1, ..., X_n comes from F(x;θ), a completely
        specified continuous distribution

(a) Put the X_i in ascending order, X_(1) < X_(2) < ... < X_(n).
(b) Calculate Z_(i) = F(X_(i);θ), i = 1, ..., n.
(c) Calculate the appropriate test statistic using (4.2).
(d) Modify the test statistic as in Table 4.2 using the modifications for the
    upper tail, and compare with the appropriate line of percentage points.
    If the statistic exceeds the value in the upper tail given at level α, H₀ is
    rejected at significance level α.
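As an illustration, steps (a)-(d) can be sketched for a test of normality with μ and σ fully specified. The helper names below are ours; the normal CDF is obtained from the error function, and A² is compared directly with its asymptotic 5% point, 2.492, from Table 4.2, since A² needs no modification for n > 5.

```python
import math

def normal_cdf(x, mu, sigma):
    """Phi((x - mu)/sigma), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def anderson_darling_case0(x, mu, sigma):
    """Steps (a)-(c): order the data, apply the PIT, compute A2 by (4.2)."""
    z = sorted(normal_cdf(xi, mu, sigma) for xi in x)
    n = len(z)
    s = sum((2 * (i + 1) - 1) * (math.log(z[i]) + math.log(1.0 - z[n - 1 - i]))
            for i in range(n))
    return -n - s / n

def edf_test_normal(x, mu, sigma, crit=2.492):
    """Step (d): reject H0 at the 5% level when A2 exceeds 2.492."""
    a2 = anderson_darling_case0(x, mu, sigma)
    return a2, a2 > crit
```

For instance, the artificial sample -1, 0, 1 tested against N(0,1) gives A² ≈ 0.19, well below 2.492, so H₀ is not rejected.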

E4.4.1 Example

Suppose the data in Table 4.1 are to be tested to come from a normal distri-
bution with mean μ = 200 and standard deviation σ = 35. This is the distri-
bution drawn in Figure 4.1. The Probability Integral Transformation gives
the values in column Z_i of Table 4.1. These have then been used to draw the
EDF in Figure 4.2. The calculation of D⁺ and D⁻ is also illustrated in the
figure. Formulas (4.2) give, to 3 d.p.: D⁺ = 0.044, D⁻ = 0.171, D = 0.171,
V = 0.216, for the supremum statistics, and W² = 0.187, U² = 0.051, and
A² = 1.019 for the quadratic class. From Table 4.2 the modified value D*
is found from D* = D(√n + 0.12 + 0.11/√n) and the value is 0.790. Refer-
ence to the percentage points on the same line as the modification (the asymp-
totic percentage points of √nD) shows D to be not significant at the 15% level.
The modified values of the other statistics (using, for example, W* for mod-
ified W²) are: D⁺* = 0.203, D⁻* = 0.790, V* = 1.011, W* = 0.177, U* = 0.048,
A* = 1.019. These values all fall well below the 25% percentage points of the
statistics, so that the hypothesis H₀ will not be rejected.
    It will be observed from the modifications in Table 4.2 that the percent-
age points of W² and U² for finite n converge rapidly in the upper tail to the
asymptotic points, and even if the modifications were not included in the
table, the use of the asymptotic percentage points for n > 20 would give
negligible error in α. Even more striking is the fact that, for n > 3, the
distribution of A² is accurately given by the asymptotic distribution. This
TABLE 4.2 Modifications and Percentage Points for EDF Statistics for Testing a Completely Specified Distribution
(Case 0; Section 4.4)

                                                       Significance level α
Statistic  Modified form T*                      .25    .15    .10    .05    .025   .01    .005   .001

                                  Upper tail percentage points

D⁺ (D⁻)    D⁺(√n + 0.12 + 0.11/√n)              0.828  0.973  1.073  1.224  1.358  1.518  1.628  1.859
D          D(√n + 0.12 + 0.11/√n)               1.019  1.138  1.224  1.358  1.480  1.628  1.731  1.950
V          V(√n + 0.155 + 0.24/√n)              1.420  1.537  1.620  1.747  1.862  2.001  2.098  2.303
W²         (W² - 0.4/n + 0.6/n²)(1.0 + 1.0/n)   0.209  0.284  0.347  0.461  0.581  0.743  0.869  1.167
U²         (U² - 0.1/n + 0.1/n²)(1.0 + 0.8/n)   0.105  0.131  0.152  0.187  0.222  0.268  0.304  0.385
A²         For all n > 5                        1.248  1.610  1.933  2.492  3.070  3.880  4.500  6.000

                                  Lower tail percentage points

D          D(√n + 0.275 - 0.04/√n)              0.610  0.571  0.520  0.481  0.441
V          V(√n + 0.41 - 0.26/√n)               0.976  0.928  0.861  0.810  0.755
W²         (W² - 0.03/n)(1.0 + 0.05/n)          0.054  0.046  0.037  0.030  0.025
U²         (U² - 0.02/n)(1 + 0.35/n)            0.038  0.033  0.028  0.024  0.020
A²         For all n > 5                        0.399  0.346  0.283  0.240  0.201

Adapted from Stephens (1970), with permission of the Royal Statistical Society.

FIGURE 4.2 EDF of Z_i.

was demonstrated by Lewis (1961), and is valid all along the distribution.
    The modified forms are taken from Stephens (1970), where references to
original sources of tables for EDF statistics are given.
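As a check on Example E4.4.1, the upper-tail modified forms of Table 4.2 can be applied directly. The sketch below assumes the Table 4.1 sample size is n = 20 (an assumption on our part, consistent with the modified values quoted in the example).

```python
import math

def modified_statistics(n, d=None, v=None, w2=None, u2=None):
    """Upper-tail modified forms T* from Table 4.2 (Case 0)."""
    rn = math.sqrt(n)
    out = {}
    if d is not None:
        out["D*"] = d * (rn + 0.12 + 0.11 / rn)
    if v is not None:
        out["V*"] = v * (rn + 0.155 + 0.24 / rn)
    if w2 is not None:
        out["W*"] = (w2 - 0.4 / n + 0.6 / n ** 2) * (1.0 + 1.0 / n)
    if u2 is not None:
        out["U*"] = (u2 - 0.1 / n + 0.1 / n ** 2) * (1.0 + 0.8 / n)
    return out
```

With n = 20 and the unmodified values of the example (D = 0.171, V = 0.216, W² = 0.187, U² = 0.051), this reproduces D* = 0.790, V* = 1.011, W* = 0.177, and U* = 0.048 to 3 d.p.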

4.5 COMMENTS ON EDF TESTS FOR CASE 0

4.5.1 Use of the Lower Tail

The test as described above is a one-tail test, using only the upper tail of
the test statistics. This is because, in general, we should expect the differ-
ence between F_n(x) and F(x), or between F_n*(z) and F*(z), to be large when
H₀ is not true. If a test statistic appears to be significant in the lower tail it
suggests that the Z-sample is too regular to be a random uniform sample,
and perhaps the original X-data have been tampered with. Such Z-values are
called superuniform; tests for superuniformity can be made using the modi-
fications and lower tail points also given in Table 4.2. Superuniform obser-
vations can arise in other ways also, particularly in connection with tests for
the exponential distribution, or for randomness of points in time. An interesting
data set which appears to be superuniform is the dates of the kings
and queens of England (Pearson, 1963). Further comments on superuniform-
ity are in Chapters 8 and 10.

4.5.2 Calculation of Significance Levels (p-Levels) for Given Statistics

Suppose a test statistic T takes the value t; the significance level, or p-value,
of the statistic will then be the value p = P(T ≥ t). In some contexts the term
is also applied to the lower tail probability P(T ≤ t), but here q, or q-level,
will be used for this quantity; thus q = 1 - p. It is useful (especially in com-
bining several independent tests, see Sections 4.18 and 8.15) to be able to
calculate the significance level all along the distribution of T, and not merely
in the tails. For A², Case 0, the table of q-values of the asymptotic distri-
bution given by Lewis (1961) is reproduced in Table 4.3. Since A² needs no
modification for sample size greater than three, Table 4.3 may be used to
give q-values for all n > 3. Tables to find p or q in the tails were given for
other EDF statistics by Stephens (1970).

E4.5.2 Example

In example E4.4.1, A² = 1.019; from Table 4.3 the q-value is approximately
0.653, so that p = 0.347.
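A q-value between tabulated entries of Table 4.3 can be obtained by linear interpolation. The sketch below hardcodes only the two table entries needed for Example E4.5.2; a full implementation would carry the whole table.

```python
# Fragment of Table 4.3: pairs (z, q) with q = P(A2 <= z), Case 0.
TABLE_4_3 = [(1.000, 0.6427), (1.050, 0.6680)]

def q_value(a2, table=TABLE_4_3):
    """Linear interpolation of q = P(A2 <= a2) between tabulated points."""
    for (z0, q0), (z1, q1) in zip(table, table[1:]):
        if z0 <= a2 <= z1:
            return q0 + (a2 - z0) * (q1 - q0) / (z1 - z0)
    raise ValueError("a2 outside the tabulated fragment")

def p_value(a2):
    """Upper-tail significance level p = 1 - q."""
    return 1.0 - q_value(a2)
```

For A² = 1.019 this gives q ≈ 0.652 and p ≈ 0.347, matching the example.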

4.5.3 Observations on a Circle

A special problem arises if the observations X are measurements recording
points on a circle. For example, suppose the circumference is of length 1,
and let X be the arc length around the circumference, measured clockwise
from an origin O. Clearly the value of F(x;θ) at a given point x varies with
the choice of O, and the EDF statistics D⁺, D⁻, D, W², and A² take different
values with different choices of origin. However, statistics V and U² do not;
they were introduced by Kuiper and Watson to adapt the statistics D and W²
for this problem, and V and U² should be used for observations recorded on
a circle.

E4.5.3 Example

Consider a small sample of four values, which are to be tested for uniformity
around a circle of unit circumference. With North as origin, and positive
direction clockwise, suppose the X-values are 0.3, 0.4, 0.6, 0.9. If East
were regarded as origin, these values would change to 0.05, 0.15, 0.35,
and 0.65. When the EDF are drawn for these two cases, the values of D⁺,
D⁻, and D are, respectively, 0.15, 0.30, 0.30 and 0.40, 0.05, 0.40, but in
both cases V is 0.45. The corresponding values of W², A², and U² are

TABLE 4.3 Distribution of A², Case 0: The Table Gives q = P(A² ≤ z)

.025 0.0000 1.250 0.7503 3.100 0.9756


.050 0.0000 1.300 0.7677 3.150 0.9770
.075 0.0000 1.350 0.7833 3.200 0.9783
.100 0.0000 1.400 0.7973 3.250 0.9795
.125 0.0003 1.450 0.8111 3.300 0.9807
.150 0.0014 1.500 0.8235 3.350 0.9818
.175 0.0042 1.550 0.8350 3.400 0.9828
.200 0.0096 1.600 0.8457 3.450 0.9837
.225 0.0180 1.650 0.8556 3.500 0.9846
.250 0.0296 1.700 0.8648 3.550 0.9855
.275 0.0443 1.750 0.8734 3.600 0.9863
.300 0.0618 1.800 0.8814 3.650 0.9870
.325 0.0817 1.850 0.8888 3.700 0.9878
.350 0.1036 1.900 0.8957 3.750 0.9884
.375 0.1269 1.950 0.9021 3.800 0.9891
.400 0.1513 2.000 0.9082 3.850 0.9897
.425 0.1764 2.050 0.9138 3.900 0.9902
.450 0.2019 2.100 0.9190 3.950 0.9908
.475 0.2276 2.150 0.9239 4.000 0.9913
.500 0.2532 2.200 0.9285 4.050 0.9917
.525 0.2786 2.250 0.9328 4.100 0.9922
.550 0.3036 2.300 0.9368 4.150 0.9926
.575 0.3281 2.350 0.9405 4.200 0.9930
.600 0.3520 2.400 0.9441 4.250 0.9934
.625 0.3753 2.450 0.9474 4.300 0.9938
.650 0.3930 2.500 0.9504 4.350 0.9941
.675 0.4199 2.550 0.9534 4.400 0.9944
.700 0.4412 2.600 0.9561 4.500 0.9950
.750 0.4815 2.650 0.9586 4.600 0.9955
.800 0.5190 2.700 0.9610 4.700 0.9960
.850 0.5537 2.750 0.9633 4.800 0.9964
.900 0.5858 2.800 0.9654 4.900 0.9968
.950 0.6154 2.850 0.9674 5.000 0.9971
1.000 0.6427 2.900 0.9692 5.500 0.9983
1.050 0.6680 2.950 0.9710 6.000 0.9990
1.100 0.6912 3.000 0.9726 7.000 0.9997
1.150 0.7127 3.050 0.9742 8.000 0.9999
1.200 0.7324

Adapted from Lewis (1961), with permission of the author and of the Institute
of Mathematical Statistics.

0.053, 0.337, 0.043 for North as origin, and 0.203, 1.116, 0.043 for East;
W² and A² change in value but U², like V, remains constant.
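The invariance of V and U² under a change of origin can be checked numerically; the sketch below reuses the computing formulas of (4.2) on the two origin choices of Example E4.5.3 (function names are ours).

```python
def kuiper_v(z):
    """V = D+ + D- from the Z-values, via formulas (4.2)."""
    z = sorted(z)
    n = len(z)
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    return d_plus + d_minus

def watson_u2(z):
    """U2 = W2 - n*(mean(Z) - 0.5)**2, with W2 from formulas (4.2)."""
    z = sorted(z)
    n = len(z)
    w2 = sum((zi - (2 * (i + 1) - 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1 / (12 * n)
    return w2 - n * (sum(z) / n - 0.5) ** 2

north = [0.3, 0.4, 0.6, 0.9]               # origin at North
east = [(x - 0.25) % 1.0 for x in north]   # same points, origin at East
```

Both origins give V = 0.45 and U² ≈ 0.043, while D and W² differ between the two.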

4.5.4 Use of D to Give Confidence Intervals for a Distribution

The EDF, with statistic D, may also be used to provide confidence intervals
for the true distribution function. This is done as follows. Let the critical
value of D, for given n and α, be D_α, and draw a band of vertical height D_α
on either side of F_n(x); this gives a confidence band for the true distribution
function, with confidence level 100(1 - α)%. (Strictly speaking, there should
be a slight modification of the band at the lower and upper tails, because
otherwise the band contains negative values for the distribution, or values
greater than 1, but the modifications are very small for a sample of reason-
able size, say n = 20.)
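A sketch of the band construction: here D_α is approximated by inverting the upper-tail modification of Table 4.2 at the asymptotic point (1.358 for α = 0.05), and the band is clipped to [0, 1] as the parenthetical remark suggests. Function names are ours.

```python
import math

def d_critical(n, dstar=1.358):
    """Approximate D_alpha by inverting the Table 4.2 modification
    D* = D(sqrt(n) + 0.12 + 0.11/sqrt(n)); 1.358 is the 5% point."""
    rn = math.sqrt(n)
    return dstar / (rn + 0.12 + 0.11 / rn)

def confidence_band(x, alpha_point=1.358):
    """(x_(i), lower, upper) band values at each order statistic, clipped to [0, 1]."""
    xs = sorted(x)
    n = len(xs)
    d = d_critical(n, alpha_point)
    band = []
    for i, xi in enumerate(xs, start=1):
        fn = i / n                      # EDF value just to the right of x_(i)
        band.append((xi, max(fn - d, 0.0), min(fn + d, 1.0)))
    return band
```

For n = 20 and α = 0.05 this gives D_α ≈ 0.294, so the 95% band extends about 0.29 above and below the EDF.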

4.5.5 Use of EDF Statistics to Give Confidence Sets for Parameters

When parameters are not known in F(x;θ), confidence sets may be provided
for them by the following device. Suppose θ is a vector of unknown param-
eters, which need not be only location or scale parameters, and suppose
values are given to unknown components of θ to give vector θ₀; then F(x;θ₀)
is completely specified and EDF statistics can be calculated. Suppose T is
such a statistic. The confidence set for θ, derived from T, and with level
100(1 - α)%, includes all those values of θ₀ which make T not significant at
level α. The confidence set is sometimes called a consonance set or region.
Easterling (1976) and Littell and Rao (1978) have investigated the use of A²
and D for finding consonance sets for parameters. The technique affords an
interesting mixture of goodness-of-fit and parameter estimation methods.
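The device can be sketched by direct inversion on a grid: for a normal mean with σ known, keep those μ₀ at which the Case 0 A² (computed as in Section 4.4) is not significant at the 5% level. Names, the grid, and the data below are illustrative assumptions only.

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def a2_case0(x, mu, sigma):
    """Case 0 Anderson-Darling statistic for a fully specified normal."""
    z = sorted(normal_cdf(xi, mu, sigma) for xi in x)
    n = len(z)
    s = sum((2 * (i + 1) - 1) * (math.log(z[i]) + math.log(1.0 - z[n - 1 - i]))
            for i in range(n))
    return -n - s / n

def consonance_set_mean(x, sigma, grid, crit=2.492):
    """All mu0 on the grid at which A2 is not significant at the 5% level."""
    return [mu0 for mu0 in grid if a2_case0(x, mu0, sigma) <= crit]
```

With the artificial sample -1, 0, 1 and σ = 1, the value μ₀ = 0 is retained in the 95% consonance set, while μ₀ = 5 is rejected.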

4.5.6 Other EDF Statistics for Case 0

Many other statistics have been proposed to measure the discrepancy between
F_n(x) and F(x;θ); they are often closely related to the seven statistics dis-
cussed above, and have similar properties.

(a) Anderson and Darling (1952) suggested using a variance-weighted D,
    obtained by incorporating a weight function into the definition of D, much
    as it is included in A². Asymptotics for this statistic have been given by
    Doksum, Fenstad and Aaberge (1977), and tables for finite n by
    Niederhausen (1981).
(b) Suppose s_i = F⁻¹(i/n), and let l_i = i/n - F_n(s_i), where F_n(x) is the EDF
    of the original sample X; Riedwyl (1967) suggested a test statistic based
    on the l_i. On the Z-diagram, this statistic is based on the discrepancy
    between F_n*(z) and F*(z) at equal intervals along the z-axis between 0
    and 1; the statistic has a discrete distribution.
(c) Let δ_i = max{|Z_(i) - (i - 1)/n|, |Z_(i) - i/n|}; the Kolmogorov statistic
    D is then max_i δ_i. Finkelstein and Schafer (1971) have proposed the sta-
    tistic S = Σ_i δ_i and have given a table of percentage points for n up to
    30.
(d) Hegazy and Green (1975) and Green and Hegazy (1976) have discussed
    several statistics calculated from slight modifications of the computing
    formulas in Section 4.2. Berk and Jones (1979) gave other statistics
    based on F_n(x) and similar to the Kolmogorov statistics. Hegazy and
    Green (1975) have demonstrated that their modified statistics can in-
    crease power against certain alternatives, and Berk and Jones showed
    certain optimal properties in the sense of Bahadur efficiency for their
    statistics.
(e) A set of statistics closely related to EDF statistics, although not derived
    from the EDF, is the C and K set, described in Section 8.8.

4.6 POWER OF EDF STATISTICS FOR CASE 0

In Case 0, as we have seen, the final test is that a set of variables Z is uni-
formly distributed U(0,1), and a discussion of power properties of EDF
statistics is therefore deferred to Chapter 8, where tests for uniformity are
discussed in detail. However, for later comparisons in this chapter, we
summarize certain properties of EDF statistics in Case 0 situations.

(a) EDF statistics are usually much more powerful than the Pearson chi-
    square statistic; this might be explained by the fact that for the chi-square
    statistic the data must be grouped, with a resulting loss of information,
    especially for small samples.
(b) The most well-known EDF statistic is D, but it is often much less power-
    ful than the quadratic statistics W² and A².
(c) Statistics D⁺ and D⁻ will be powerful in detecting whether or not the
    Z-set tends to be close to 0 or to 1, respectively; A², W², and D will
    detect either of these two alternatives, and U² and V are powerful in
    detecting a clustering of Z-values at one point, or a division into two
    groups near 0 and 1. In terms of the original observations X, statistics
    D⁺, D⁻, A², W², and D will detect an error in mean in F(x;θ) as speci-
    fied, and U² and V will detect an error in variance.
(d) A² often behaves similarly to W², but is on the whole more powerful for
    tests when F(x;θ) departs from the true distribution in the tails, espe-
    cially when there appear to be too many outlying X-values for the F(x;θ)
    as specified. In goodness-of-fit work, departure in the tails is often im-
    portant to detect, and A² is the recommended statistic.

4.7 EDF TESTS FOR CENSORED DATA: CASE 0

4.7.1 Introduction

If some of the observations X_1, X_2, ..., X_n of a random sample are miss-
ing, the sample is said to be censored. If all observations less than X_(s) are
missing the sample is left-censored, and if all observations greater than X_(r)
are missing, it is right-censored; in either case, the sample is said to be
singly-censored. If observations are missing at both ends, the sample is
doubly-censored.
    Censoring may occur for random values of s or r (Type 1 censoring) or
for fixed values (Type 2 censoring). These may be illustrated by lifetime
measurements X_i of equipment. If the experiment is continued for a fixed
time t, the number of items which fail in that time would be a random vari-
able and the censoring would be Type 1; if, on the other hand, it is decided
to follow the experiment until 20 items have failed, then r is fixed at 20 and
we have Type 2 censoring. Another form of censoring is random censoring,
where, for example, observation X_i may not be known, but it is known that
X_i > T_i, where T_i is another random variable. In the lifetesting experiment,
this could occur if items were removed from the test for reasons other than
because they failed in the manner investigated in the experiment.
    EDF statistics have been adapted for all these forms of censoring. Con-
sider Case 0, where the distribution under test, say F(x), is fully specified.
Then the Probability Integral Transformation may be made for the observa-
tions available, giving a set Z_i = F(X_i) which is itself censored. Suppose
the X-set is right-censored, of Type 1; the values of X are known to be less
than the fixed value X*, and the available Z_(i) are then Z_(1) < Z_(2) < ... < Z_(r)
< t, where t = F(X*). If the censoring is Type 2, there are again r values
Z_(i), with Z_(r) the largest and r fixed.

4.7.2 The Kolmogorov-Smirnov Statistics D⁺, D⁻, and D

The Kolmogorov-Smirnov statistic, modified for Type 1 censored data, is
₁D_t,n, calculated from the EDF F_n(z) of the r ordered Z-values:

    ₁D_t,n = sup_{0 ≤ z ≤ t} |F_n(z) - z|
                                                                        (4.3)
           = max [ max_{1 ≤ i ≤ r} {i/n - Z_(i)}, max_{1 ≤ i ≤ r} {Z_(i) - (i - 1)/n}, t - r/n ]

For Type 2 censored data, the Kolmogorov-Smirnov statistic is

TABLE 4.4 Upper Tail Asymptotic Percentage Points for √nD, W², and A²,
for Type 1 or Type 2 Censored Data from U(0,1) (Section 4.7)

                                Significance level α
  p       0.50    0.25    0.15    0.10    0.05    0.025   0.01    0.005

Statistic √nD
 .2      .4923   .6465   .7443   .8155   .9268  1.0282  1.1505  1.2361
 .3      .5889   .7663   .8784   .9597  1.0868  1.2024  1.3419  1.4394
 .4      .6627   .8544   .9746  1.0616  1.1975  1.3209  1.4696  1.5735
 .5      .7204   .9196  1.0438  1.1334  1.2731  1.3997  1.5520  1.6583
 .6      .7649   .9666  1.0914  1.1813  1.3211  1.4476  1.5996  1.7056
 .7      .7975   .9976  1.1208  1.2094  1.3471  1.4717  1.6214  1.7258
 .8      .8183  1.0142  1.1348  1.2216  1.3568  1.4794  1.6272  1.7306
 .9      .8270  1.0190  1.1379  1.2238  1.3581  1.4802  1.6276  1.7308
1.0      .8276  1.0192  1.1379  1.2238  1.3581  1.4802  1.6276  1.7308

Statistic W²
 .2      .010    .025    .033    .041    .057    .074    .094    .110
 .3      .022    .046    .066    .083    .115    .147    .194    .227
 .4      .037    .076    .105    .136    .184    .231    .295    .353
 .5      .054    .105    .153    .186    .258    .330    .427    .488
 .6      .070    .136    .192    .241    .327    .417    .543    .621
 .7      .088    .165    .231    .286    .386    .491    .633    .742
 .8      .103    .188    .259    .321    .430    .544    .696    .816
 .9      .115    .204    .278    .341    .455    .573    .735    .865
1.0      .119    .209    .284    .347    .461    .581    .743    .869

Statistic A²
 .2      .135    .252    .333    .436    .588    .747    .962   1.129
 .3      .204    .378    .528    .649    .872   1.106   1.425   1.731
 .4      .275    .504    .700    .857   1.150   1.455   1.872   2.194
 .5      .349    .630    .875   1.062   1.419   1.792   2.301    -
 .6      .425    .756   1.028   1.260   1.676   2.112   2.707    -
 .7      .504    .882   1.184   1.451   1.920   2.421   3.083    -
 .8      .588   1.007   1.322   1.623   2.146   2.684   3.419    -
 .9      .676   1.131   1.467   1.798   2.344   2.915   3.698    -
1.0      .779   1.248   1.610   1.933   2.492   3.070   3.880   4.500

Table for √nD adapted from Koziol and Byar (1975), with permission of the
authors and of the American Statistical Association. Tables for W² and A²
adapted from Pettitt and Stephens (1976), with permission of the author and
of the Biometrika Trustees.

    ₂D_r,n = sup_{0 ≤ z ≤ Z_(r)} |F_n(z) - z|
                                                                        (4.4)
           = max_{1 ≤ i ≤ r} max {i/n - Z_(i), Z_(i) - (i - 1)/n}

The one-sided versions of these statistics are denoted by ₁D⁺_t,n, etc. Tables
of percentage points of the null distributions of ₁D_t,n and ₂D_r,n are given by
Barr and Davidson (1973). For both types of censoring, these converge to
one asymptotic distribution, given by Koziol and Byar (1975); points from
this distribution are in the first part of Table 4.4. Dufour and Maag (1978)
gave useful formulas so that the asymptotic distributions could be used with
finite samples. The technique is as follows.
    Suppose the sample is right-censored, and H₀ is

    H₀: the censored sample X_(1) < X_(2) < ... < X_(r) comes from the fully
        specified continuous distribution F(x)

The values Z_(i) = F(X_(i)), i = 1, ..., r, are calculated. The steps in testing
H₀ are then the following.

For Type 1 censoring:

(a) Calculate ₁D_t,n from formula (4.3).
(b) Modify ₁D_t,n to D*_t, calculated from

    D*_t = √n ₁D_t,n + 0.19/√n,   for n ≥ 25 and t ≥ 0.25

(c) Refer to Table 4.4 and reject H₀ at significance level α if D*_t exceeds
    the tabulated value for α.

For Type 2 censoring:

(a) Calculate ₂D_r,n from formula (4.4).
(b) Modify ₂D_r,n to D*_r, calculated from

    D*_r = √n ₂D_r,n + 0.24/√n,   for n ≥ 25 and r/n ≥ 0.4

(c) Refer to Table 4.4 and reject H₀ at significance level α if D*_r exceeds
    the tabulated value at level α.

For values of n < 25 or for censoring more extreme than the ranges
given above, refer to tables of Barr and Davidson (1973) or Dufour and Maag
(1978).
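For Type 2 censoring, formula (4.4) and the modification in step (b) can be sketched as follows (function names are ours; the modification is intended for n ≥ 25 and r/n ≥ 0.4, but the code computes it regardless).

```python
import math

def censored_ks_type2(z_censored, n):
    """Formula (4.4): two-sided KS statistic from the r smallest Z-values
    of a sample of size n (Type 2 right censoring)."""
    z = sorted(z_censored)
    d = 0.0
    for i, zi in enumerate(z, start=1):
        d = max(d, i / n - zi, zi - (i - 1) / n)
    return d

def modified_d_type2(z_censored, n):
    """D*_r = sqrt(n) * D + 0.24/sqrt(n), to be referred to Table 4.4."""
    rn = math.sqrt(n)
    return rn * censored_ks_type2(z_censored, n) + 0.24 / rn
```

For instance, the artificial censored set Z = 0.1, 0.3 with n = 4 gives ₂D_r,n = 0.2.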

In order to approximate the tail areas of the finite sample distribution,
that is, to obtain p-values of test statistics, a relationship between the
asymptotic distributions of one- and two-sided statistics can be used.
The asymptotic distribution of the one-sided test statistic √n(₁D⁺_t,n)
has been given by Schey (1977) as

    lim_{n→∞} P{√n(₁D⁺_t,n) ≤ y} = G_t(y) = Φ(A_t y) - Φ(B_t y) e^(-2y²),   y > 0

where A_t = (t - t²)^(-1/2) and B_t = (2t - 1)A_t. The tail area for the two-sided
statistic is approximated well for significance levels less than 0.20 by dou-
bling the one-sided value. Thus to obtain a p-value, the test statistic ₁D_t,n
or ₂D_r,n is first adapted to obtain D*_t or D*_r as described above, and then
the p-value, for a two-sided test, is well-approximated by p_t = 2{1 - G_t(D*_t)}
or p_r = 2{1 - G_t(D*_r)}. Examples of calculations of Kolmogorov-Smirnov
statistics for censored data are given in Section 11.3.1.
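A sketch of this p-value calculation from Schey's limit: Φ is obtained from the error function, and the two-sided p-value is the doubled one-sided tail. For a Type 2 statistic, our reading is that t would be taken as r/n; function names are ours.

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def schey_G(t, y):
    """Asymptotic CDF G_t(y) of the one-sided censored KS statistic,
    with A_t = (t - t**2)**-0.5 and B_t = (2t - 1)*A_t (0 < t < 1)."""
    a_t = (t - t * t) ** -0.5
    b_t = (2.0 * t - 1.0) * a_t
    return phi(a_t * y) - phi(b_t * y) * math.exp(-2.0 * y * y)

def two_sided_p(t, d_star):
    """Doubling approximation for the two-sided p-value."""
    return 2.0 * (1.0 - schey_G(t, d_star))
```

For example, t = 0.5 and D* = 1.0 give p ≈ 0.18, near the limit of the doubling approximation's stated accuracy.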

4.7.3 Cramér-von Mises Statistics

A second group of statistics for censored samples is of the general Cramér-
von Mises type. Pettitt and Stephens (1976) introduced versions of the
Cramér-von Mises W², Watson U², and Anderson-Darling A² statistics,
obtained (for right-censored data) by modifying the upper limit of integration
in the definitions of these statistics; W² would then become ₁W²_t,n for Type 1
censoring, and ₂W²_r,n for Type 2 censoring. The computing formulas differ
slightly for the two types of censoring. The formulas for Type 2 censoring,
given Z_(1) < ... < Z_(r), are

    ₂W²_r,n = Σ_{i=1}^{r} {Z_(i) - (2i - 1)/(2n)}² + r/(12n²) + (n/3){Z_(r) - r/n}³

    ₂U²_r,n = ₂W²_r,n - nZ_(r){Z̄ - Z_(r)/2}²,   where Z̄ = Σ_{i=1}^{r} Z_(i)/r
                                                                        (4.5)
    ₂A²_r,n = -(1/n) Σ_{i=1}^{r} (2i - 1)[log Z_(i) - log{1 - Z_(i)}] - 2 Σ_{i=1}^{r} log{1 - Z_(i)}
              - (1/n)[(r - n)² log{1 - Z_(r)} - r² log Z_(r) + n² Z_(r)]


For Type 1 right-censored data, suppose t (t < 1) is the fixed censoring
value of Z. This value is added to the sample set, and the statistics are now
calculated by using the above formulas with r replaced by r + 1, and with
Z_(r+1) = t; they will be called ₁W²_t,n, ₁U²_t,n, and ₁A²_t,n. Note that it is
possible to have r = n observations less than t, so that when the value t is
added, the new sample has size n + 1. Statistics ₁W²_t,n and ₂W²_r,n have the
same asymptotic distributions for the two types of censoring; similarly for
the other statistics. Asymptotic points for W² and A² are given in Table 4.4.
Thus the steps in making a test, for right-censored data, are:

(a) Calculate the statistic required as described above.
(b) Refer to Table 4.4 for Type 1 data, and Table 4.5 for Type 2 data.

For Type 1 censored data, Table 4.4 is entered at p = t, for all n. For
Type 2 censoring, Table 4.5 is entered at p = r/n, with appropriate n. H₀
is rejected at significance level α if the statistic is greater than the point
given for level α.
    The asymptotic points in Tables 4.4 and 4.5 are those given by Pettitt
and Stephens (1976), with some additions. Points for finite n have been ob-
tained by extensive Monte Carlo studies, which showed that points for Type 1
censoring converge so rapidly to the asymptotic distribution that a new table
is not needed. Tables for U² are probably not so valuable for censored data
and have been omitted. More extensive tables, including tables for U², are
in Stephens (1986).
    Smith and Bain (1976) have suggested another version of W² for use with
Type 2 right-censored data from the uniform distribution; the statistic, say
W̃²_r,n, is Σ_{i=1}^{r} {Z_(i) - (2i - 1)/(2n)}² + 1/(12n). W̃²_r,n will have the
same asymptotic distribution as ₂W²_r,n, and Smith and Bain (1976) have given
Monte Carlo points for finite n. Some comparisons of statistics for censored
data, including ₂D_r,n, ₂W²_r,n, and ₂A²_r,n, were made by Michael and
Schucany (1979). For the alternatives and censoring factors which were
studied, there were noticeable differences in the sensitivity of ₂W²_r,n and
₂A²_r,n. In general, the statistic ₂A²_r,n displayed the best power. Examples
of calculations of Cramér-von Mises statistics for censored data are given
in Section 11.3.1.
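The Type 2 computing formula for ₂W²_r,n in (4.5) can be sketched as follows (function name ours):

```python
def censored_w2_type2(z_censored, n):
    """Type 2 right-censored Cramer-von Mises statistic of (4.5):
    the sum over the r observed Z's, plus r/(12 n**2), plus the
    boundary term (n/3)*(Z_(r) - r/n)**3."""
    z = sorted(z_censored)
    r = len(z)
    core = sum((zi - (2 * i - 1) / (2 * n)) ** 2
               for i, zi in enumerate(z, start=1))
    return core + r / (12 * n ** 2) + (n / 3) * (z[-1] - r / n) ** 3
```

For instance, the artificial censored set Z = 0.1, 0.3 with n = 4 gives ₂W²_r,n = 0.006; the resulting value would be referred to Table 4.5 at p = r/n.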

4.7.3.1 Left-Censored Data

For left-censored data, the values Z*_(i) = 1 - Z_(n+1-i), i = 1, ..., r, may be
calculated from the r largest observations and the set Z* used in tests for
right-censored data. In Type 1 censoring, the left-censoring value t converts
to t* = 1 - t, to be used as the right-censoring point with the Z* values.

TABLE 4.5 Upper Tail Percentage Points for ₂W²_r,n and ₂A²_r,n,
Type 2 Right-Censored Data from the Uniform U(0,1) Distribution
(Section 4.7.3). The table should be entered at n and at p = r/n.

                              Significance level α

Statistic        n     0.50   0.25   0.15   0.10   0.05   0.025  0.01

20 0.006 0.018 0.038 0.058 0.099 0.152 0.243


2 r ,n 40 0.008 0.018 0.032 0.046 0.084 0.128 0.198
60 0.009 0.020 0.031 0.044 0.074 0.107 0.154
P = 0.2
80 0.009 0.021 0.031 0.043 0.069 0.097 0.136
100 0.009 0.022 0.031 0.043 0.066 0.092 0.127
OO 0.010 0.025 0.031 0.041 0.057 0.074 0.094

10 0.022 0.056 0.101 0.144 0.229 0.313 0.458


20 0.029 0.062 0.095 0.132 0.209 0.297 0.419
40 0.033 0.067 0.100 0.128 0.191 0.267 0.381
P = 0.4 60 0.034 0.070 0.102 0.130 0.189 0.256 0.354
80 0.035 0.071 0.103 0.132 0.187 0.251 0.342
100 0.035 0.072 0.103 0.132 0.187 0.248 0.335
OO 0.037 0.076 0.105 0.135 0.184 0.236 0.307

10 0.053 0.107 0.159 0.205 0.297 0.408 0.547


20 0.062 0.122 0.172 0.216 0.302 0.408 0.538
40 0.067 0.128 0.180 0.226 0.306 0.398 0.522
P = 0.6 60 0.068 0.131 0.184 0.231 0.313 0.404 0.528
80 0.068 0.132 0.186 0.233 0.316 0.407 0.531
100 0.069 0.133 0.187 0.235 0.318 0.409 0.532
∞ 0.070 0.136 0.192 0.241 0.327 0.417 0.539

10 0.085 0.158 0.217 0.266 0.354 0.453 0.593


20 0.094 0.172 0.235 0.289 0.389 0.489 0.623
40 0.099 0.180 0.247 0.303 0.401 0.508 0.651
P = 0.8 60 0.100 0.183 0.251 0.308 0.410 0.520 0.667
80 0.101 0.184 0.253 0.311 0.415 0.526 0.675
100 0.101 0.185 0.254 0.313 0.418 0.529 0.680
∞ 0.103 0.188 0.259 0.320 0.430 0.544 0.700

10 0.094 0.183 0.246 0.301 0.410 0.502 0.645


20 0.109 0.194 0.263 0.322 0.431 0.536 0.675
40 0.112 0.199 0.271 0.330 0.437 0.546 0.701
P = 0.9 60 0.113 0.201 0.273 0.333 0.442 0.553 0.713
80 0.114 0.202 0.274 0.335 0.445 0.558 0.718
100 0.114 0.202 0.275 0.336 0.447 0.561 0.722
∞ 0.115 0.204 0.278 0.341 0.455 0.573 0.735

(continued)
TESTS BASED ON EDF STATISTICS 117

T A B L E 4.5 (continued)

Significance level a

Statistic n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

            10  0.103 0.198 0.266 0.324 0.430 0.534 0.676
₂W²ᵣ,ₙ      20  0.115 0.201 0.275 0.322 0.444 0.551 0.692
            40  0.115 0.205 0.280 0.329 0.448 0.557 0.715
p = 0.95    60  0.116 0.207 0.280 0.338 0.451 0.562 0.724
            80  0.117 0.208 0.281 0.340 0.453 0.566 0.729
           100  0.117 0.208 0.282 0.341 0.454 0.569 0.735
             ∞  0.118 0.208 0.283 0.346 0.460 0.579 0.742

10 0.117 0.212 0.288 0.349 0.456 0.564 0.709


20 0.116 0.212 0.288 0.350 0.459 0.572 0.724
P = 1.0 40 0.115 0.211 0.288 0.350 0.461 0.576 0.731
100 0.115 0.211 0.288 0.351 0.462 0.578 0.736

∞ 0.119 0.209 0.284 0.347 0.461 0.581 0.743


            20  0.107 0.218 0.337 0.435 0.626 0.887 1.278
₂A²ᵣ,ₙ      40  0.119 0.235 0.337 0.430 0.607 0.804 1.111
            60  0.124 0.241 0.341 0.432 0.601 0.785 1.059
p = 0.2     80  0.127 0.243 0.344 0.433 0.598 0.775 1.034
           100  0.128 0.245 0.345 0.434 0.596 0.769 1.019
             ∞  0.135 0.252 0.351 0.436 0.588 0.747 0.962

10 0.214 0.431 0.627 0.803 1.127 1.483 2.080


20 0.241 0.462 0.653 0.824 1.133 1.513 2.011
40 0.261 0.487 0.681 0.843 1.138 1.460 1.903
P = 0.4 60 0.265 0.493 0.686 0.848 1.142 1.458 1.892
80 0.268 0.496 0.688 0.850 1.144 1.457 1.887
100 0.269 0.497 0.689 0.851 1.145 1.457 1.884
∞ 0.275 0.504 0.695 0.857 1.150 1.455 1.872

10 0.354 0.673 0.944 1.174 1.577 2.055 2.774


20 0.390 0.713 0.984 1.207 1.650 2.098 2.688
40 0.408 0.730 1.001 1.229 1.635 2.071 2.671
P = 0.6 60 0.413 0.739 1.011 1.239 1.649 2.084 2.683
80 0.416 0.743 1.017 1.244 1.655 2.091 2.689
100 0.418 0.746 1.020 1.248 1.659 2.095 2.693
∞ 0.425 0.756 1.033 1.260 1.676 2.112 2.707

(continued)

T A B L E 4 . 5 (continued)

Significance level a

Statistic n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

            10  0.503 0.913 1.237 1.498 2.021 2.587 3.254
₂A²ᵣ,ₙ      20  0.547 0.952 1.280 1.558 2.068 2.570 3.420
            40  0.568 0.983 1.321 1.583 2.088 2.574 3.270
p = 0.8     60  0.574 0.991 1.330 1.596 2.107 2.610 3.319
            80  0.578 0.995 1.335 1.603 2.117 2.629 3.344
           100  0.580 0.997 1.338 1.607 2.123 2.640 3.359
             ∞  0.588 1.007 1.350 1.623 2.146 2.684 3.419

10 0.639 1.089 1.435 1.721 2.281 2.867 3.614


20 0.656 1.109 1.457 1.765 2.295 2.858 3.650
40 0.666 1.124 1.478 1.778 2.315 2.860 3.628
p = 0.9 60 0.670 1.128 1.482 1.784 2.325 2.878 3.648
80 0.671 1.130 1.485 1.788 2.330 2.888 3.661
100 0.673 1.131 1.486 1.790 2.332 2.893 3.668
∞ 0.676 1.136 1.492 1.798 2.344 2.915 3.698

            10  0.707 1.170 1.525 1.842 2.390 2.961 3.745
            20  0.710 1.177 1.533 1.853 2.406 2.965 3.750
            40  0.715 1.184 1.543 1.860 2.416 2.968 3.743
p = 0.95    60  0.717 1.186 1.545 1.863 2.421 2.977 3.753
            80  0.718 1.187 1.546 1.865 2.423 2.982 3.760
           100  0.719 1.188 1.547 1.866 2.424 2.984 3.763
             ∞  0.720 1.190 1.550 1.870 2.430 2.995 3.778

p = 1.0  all n  0.775 1.248 1.610 1.933 2.492 3.070 3.880

4 .7 .3 .2 Doubly-Censored Data

F o r doubly-censored data, suppose the values to are available,


S < r . Pettitt and Stephens (1976) defined C ram ^r-von M ises statistics fo r
such data; in term s of the definitions above, the C ram ^r-von M ises statistic
is , for Тзфе 2 censoring,

= i
2 s r ,n 2 r ,n 2 s ,n

sim ilar definitions hold fo r Type I censoring and for and . Pettltt and

Stephens have given asymptotic percentage points for symmetric double-
censoring, where the limits of r/n and s/n are p and q, and p = 1 - q.

4.7.4 Random Censoring

An important type of censoring is random censoring, which can occur as
follows. Suppose a full random sample consists of the values X₁°, X₂°, . . . ,
Xₙ° from a distribution F°(x), and consider a set of censoring variables
T₁, T₂, . . . , Tₙ drawn, independently of each other and of the Xᵢ°-set, from
a censoring distribution F_c(t). Whenever Xᵢ° > Tᵢ, Xᵢ° is replaced by Tᵢ, so
that the available observations are the pairs (Xᵢ, δᵢ) defined as follows, for
i = 1, . . . , n:

    Xᵢ = min(Xᵢ°, Tᵢ)  and  δᵢ = 1 if Xᵢ = Xᵢ°
                             δᵢ = 0 if Xᵢ = Tᵢ

Such data could occur when the Xᵢ° are lifetimes of patients who enter a study of
a certain disease; then if the patient dies from the disease before the study
ends, Xᵢ° is recorded, but if the patient is still alive at the end of the study,
or withdraws, or dies of another cause, the time Tᵢ is recorded for which
he or she was observed. The distribution function F(x) of X is then given by
1 - F(x) = {1 - F°(x)}{1 - F_c(x)}. There has been much recent interest in
testing fit in the presence of random censoring, or in estimating and giving
confidence intervals for F°(x) or the related survival function S°(x) =
1 - F°(x).

4.7.4.1 Estimation of the Distribution Function

An estimate of F°(x) for randomly censored data, analogous to the EDF, is
the Kaplan-Meier (1958) estimate. This is formed from the pairs (Xᵢ, δᵢ) as
follows. First place the pairs in ascending order of the Xᵢ; if Xᵢ = X_(j), de-
fine Rᵢ = j (i.e., the rank of Xᵢ in the ordering). The estimate of F°(x) is
then F̂°Cn(x) defined by

    F̂°Cn(x) = 0,  x < X_(1)

            = 1 - Π_{i: X_(i) ≤ x} {(n - Rᵢ)/(n - Rᵢ + 1)}^δᵢ,  X_(1) ≤ x ≤ X_(n)

            = 1,  x > X_(n)

If no observation is censored, the estimate F̂°Cn(x) becomes the EDF Fₙ(x).
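As a small numerical illustration, the Kaplan-Meier estimate can be computed directly from the (Xᵢ, δᵢ) pairs. This is a minimal sketch (function and variable names are ours, not from the text), assuming no ties among the observations:

```python
def kaplan_meier_cdf(pairs):
    """Kaplan-Meier estimate of F(x), returned as the list of
    (x_(i), F_hat(x_(i))) values at the ordered observations.

    `pairs` is a list of (x_i, delta_i) with delta_i = 1 for an
    uncensored value and 0 for a censored one; ties are assumed absent.
    """
    n = len(pairs)
    ordered = sorted(pairs)          # ascending in x; rank R_i = position i
    surv = 1.0                       # running product of (n - R_i)/(n - R_i + 1)
    result = []
    for rank, (x, delta) in enumerate(ordered, start=1):
        if delta == 1:               # a factor enters only at uncensored points
            surv *= (n - rank) / (n - rank + 1)
        result.append((x, 1.0 - surv))
    return result
```

With no censored values the estimate reduces to the EDF, with jumps of 1/n at each observation.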

Clearly EDF statistics may be defined using F̂°Cn(x) instead of Fₙ(x)
when random censoring is present. For Case 0, to test H₀: that F°(x) is a
completely specified distribution, suppose Z_(i) = F°(X_(i)), and let F̂Cn(z)
be the Kaplan-Meier estimate of the distribution of Z; suppose also that
Z_(0) = 0 and Z_(n+1) = 1. The statistic corresponding to W², say W²C, is
then given by

    W²C = (n/3) Σ_{i=1}^{n+1} [ {F̂Cn(Z_(i-1)) - Z_(i-1)}³ - {F̂Cn(Z_(i-1)) - Z_(i)}³ ]

(Koziol and Green, 1976). If there is no censoring, W²C becomes W². In
general, the null distribution of W²C will depend on the censoring distribu-
tion, although the tested distribution is completely specified. Koziol and
Green (1976) have given asymptotic percentage points of W²C for the specific
censoring model with 1 - F_c(x) = {1 - F°(x)}^β, β a positive constant; β must
be estimated from the proportion censored.

Koziol (1980) and Csörgő and Horváth (1981) have also given tests for
Case 0 with random censoring. Gillespie and Fisher (1979) and Hall and
Wellner (1980) have shown how confidence bands for F°(x) may be constructed
from the Kaplan-Meier estimate; the Hall-Wellner bands reduce to those
given by D (Section 4.5.4) when no censoring is present. The articles quoted
give many references to related work.

4.7.4.2 Replacement of Censored Values

Another possible technique for randomly censored data (Case 0) is to make
the Probability Integral Transformation on the observations above, and
then to replace those values which come from the Tᵢ by new ones so that,
on H₀, the final set of transformed values is U(0,1). Let H₀ be, as before,
that the Xᵢ° come from F°(x), completely specified. The PIT is applied using
F°(x) on the values Xᵢ and Tᵢ; then let uᵢ = F°(Xᵢ), and let tᵢ = F°(Tᵢ).

Suppose F_t(t) is the distribution of the tᵢ, and let G_t(t) = ∫₀ᵗ F_t(s) ds. Then
replace tᵢ by uᵢ* given by G_t(uᵢ*) = (1 - tᵢ)F_t(tᵢ) + G_t(tᵢ); it may be shown
that the resulting combined set consisting of the values u and u* is distrib-
uted, on H₀, as U(0,1). Then any of the many tests for Case 0 above may be
applied to the combined set. Here the censoring distribution of t must be
known to make the transformation; however, it may be possible to replace
F_t(t) and G_t(t) by the EDF of the t-values and its integral (Stephens and
Wagner, 1986). Other methods of analyzing randomly censored data, using,
for example, probability plots, are given in Chapter 11.
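As a numerical sketch of the replacement step, suppose (purely for illustration; this censoring model is not from the text) that the tᵢ are uniform on (0,1), so that F_t(t) = t and G_t(t) = t²/2; the defining equation G_t(uᵢ*) = (1 - tᵢ)F_t(tᵢ) + G_t(tᵢ) then solves to uᵢ* = √(2tᵢ - tᵢ²):

```python
import math

def replace_censored_uniform(t):
    """Replacement value u* for a censored PIT value t, assuming (for
    illustration only) that the censoring distribution of t is U(0,1),
    i.e. F_t(t) = t and G_t(t) = t^2/2.  The defining equation
    G_t(u*) = (1 - t) F_t(t) + G_t(t) reduces to u* = sqrt(2t - t^2)."""
    return math.sqrt(2.0 * t - t * t)
```

A censored value is always pushed upward, u* in [t, 1], which reflects that the unobserved lifetime exceeds the censoring time.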

4.7.5 Rényi Statistics

Rényi (1953) discussed a number of statistics based on the difference between
Fₙ(x) and F(x), or on the ratio of Fₙ(x) to F(x) over a restricted range. These
include

    R₁ = sup_{a ≤ Fₙ(x)} {Fₙ(x) - F(x)}

    R₂ = sup_{a ≤ Fₙ(x)} {Fₙ(x) - F(x)}/F(x)

    R₃ = sup_{a ≤ F(x) ≤ b} |Fₙ(x) - F(x)|/F(x)

    R₄ = sup_{F(x) ≤ b} {Fₙ(x) - F(x)}

    R₅ = sup_P Fₙ(x)/F(x)

    R₆ = inf_P Fₙ(x)/F(x)

where P is the interval 0 < Fₙ(x) ≤ r/n, with r an integer in the range
1 ≤ r ≤ n. Birnbaum and Lientz (1969a,b) have given exact and asymptotic
theory for some of these statistics for Case 0, and have produced tables of
percentage points for R₁, R₂, and R₅; they also gave examples of the use of
the statistics, particularly in giving confidence limits for F(x) over a re-
stricted range. Niederhausen (1981) has given tables of points for variance-
weighted Kolmogorov-Smirnov D, that is, R₃ above but with denominator
[F(x){1 - F(x)}]^{1/2} instead of F(x) (see Section 4.5.6), and for the analogues
of D⁺ and D⁻. Other statistics of Rényi type, or closely related, have been
discussed by a number of authors but, despite the potential applications for
censored data, they have not been much developed for practical use.

4.7.6 Transformation to Complete Samples

Before leaving the subject of censored data, we point out that, for Case 0 in
particular, several techniques are available to transform a censored sample
of uniforms to a complete sample of uniforms. Then Case 0 tests for uniform-
ity, for complete samples, or any of the methods of Chapter 8 may be used
to test H₀. These methods can even be applied when there are blocks of
missing observations. This is essentially a procedure which does not employ
statistics specially adapted for censored data, and it is discussed more fully
in Section 11.3.3.

4.8 EDF TESTS FOR THE NORMAL DISTRIBUTION
WITH UNKNOWN PARAMETERS

We now turn to EDF tests for distributions with one or more parameters
unknown, beginning with the normal distribution.

4.8.1 Tests for Normality, Cases 1, 2, and 3

The null hypothesis is

H₀: a random sample X₁, . . . , Xₙ comes from F(x; θ), the normal dis-
tribution N(μ, σ²), with one or both of θ = (μ, σ²) unknown

Following Stephens (1974b, 1976a), three cases are distinguished accord-
ing to which parameter or parameters are unknown.

Case 1: The variance σ² is known and μ is unknown, estimated by X̄, the
sample mean.
Case 2: The mean μ is known and σ² is unknown, estimated by Σᵢ(Xᵢ - μ)²/n
(= s₁², say).
Case 3: Both μ and σ² are unknown, and are estimated by X̄ and
s² = Σᵢ(Xᵢ - X̄)²/(n - 1).

TABLE 4.6 Upper-Tail Asymptotic Percentage Points for Tests
for Normality with μ Unknown (Section 4.8.1, Case 1) or
σ² Unknown (Section 4.8.1, Case 2)

Significance level α

Statistic .25 .15 .10 .05 .025 .01 .005 .0025

W² Case 1  .094  .117  .134  .165  .197  .238  .270  .302

W² Case 2  .190  .263  .327  .442  .562  .725  .851  .978

U² Case 1  .088  .110  .127  .157  .187  .228  .259  .291

U² Case 2  .085  .105  .122  .151  .180  .221  .252  .284

A² Case 1  .644  .782  .894 1.087 1.285 1.551 1.756 1.964

A² Case 2 1.072 1.430 1.743 2.308 2.898 3.702 4.324 4.954

Adapted from Stephens (1974b), with permission of the American Statistical
Association.
TABLE 4.7 Modifications and Percentage Points for a Test for Normality with μ and σ² Unknown
(Section 4.8.1, Case 3)

Significance level α

Statistic  Modified statistic            .50   .25   .15   .10   .05   .025  .01   .005

Upper tail

D    D(√n - 0.01 + 0.85/√n)               -     -   0.775 0.819 0.895 0.955 1.035
V    V(√n + 0.05 + 0.82/√n)               -     -   1.320 1.386 1.489 1.585 1.693
W²   W²(1.0 + 0.5/n)                    .051  .074  .091  .104  .126  .148  .179  .201
U²   U²(1.0 + 0.5/n)                    .048  .070  .085  .096  .117  .136  .164  .183
A²   A²(1.0 + 0.75/n + 2.25/n²)         .341  .470  .561  .631  .752  .873 1.035 1.159

Lower tail

W²   W²(1.0 + 0.5/n)                    .051  .036  .029  .026  .022  .019  .017
U²   U²(1.0 + 0.5/n)                    .048  .033  .027  .025  .021  .018  .016
A²   A²(1.0 + 0.75/n + 2.25/n²)         .341  .249  .226  .188  .160  .139  .119

Adapted, with additions, from Table 54 of Pearson and Hartley (1972) and from Stephens (1974b), with permission
of the Biometrika Trustees and of the American Statistical Association.

Of these three cases, Case 3 is the most important in most practical
situations. For the three cases, the steps in making the substitution
Z_(i) = F(X_(i); θ̂) are:

(a) Calculate wᵢ, for i = 1, . . . , n, from

    wᵢ = (X_(i) - X̄)/σ   (Case 1)

    wᵢ = (X_(i) - μ)/s₁   (Case 2)

    wᵢ = (X_(i) - X̄)/s   (Case 3)

(b) Calculate Z_(i) = Φ(wᵢ) (i = 1, . . . , n), where Φ(x) denotes the cumulative
probability of a standard normal distribution N(0,1) to the value x,
found from tables or computer routines.
(c) Calculate the test statistics from the formulas (4.2).
(d) For Cases 1 or 2, use Table 4.6. For Case 3, use Table 4.7 and calcu-
late the modified statistic. If the value of the statistic used, or, in
Case 3, its modified value, exceeds the appropriate percentage point at
level α, H₀ is rejected with significance level α.
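The steps above can be sketched in code for Case 3; this is a minimal illustration (function names are ours), assuming the usual computing forms of the EDF statistics in (4.2) and the Case 3 modifications of Table 4.7:

```python
import math

def phi(x):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def case3_statistics(sample):
    """Steps (a)-(c) for Case 3: estimate mu and sigma by xbar and s,
    transform to Z_(i) = Phi(w_i), compute D, W^2, A^2 with the usual
    formulas, then apply the Table 4.7 modifications."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    z = sorted(phi((x - xbar) / s) for x in sample)

    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    d = max(d_plus, d_minus)
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1 / (12 * n)
    a2 = -n - sum((2 * i + 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i]))
                  for i in range(n)) / n

    # Table 4.7 modifications (Case 3)
    return {"D*": d * (math.sqrt(n) - 0.01 + 0.85 / math.sqrt(n)),
            "W2*": w2 * (1.0 + 0.5 / n),
            "A2*": a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)}
```

The modified values are then compared with the Case 3 percentage points of Table 4.7.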

The percentage points given for statistics W², U², and A² are those of
their asymptotic distributions, and can be found theoretically. The points
for D and V (Case 3) are the asymptotic points of √n times the statistic;
these have so far not been found theoretically (but see Nesenko and
Tjurin, 1978) and those given have been obtained by extrapolation of points
for finite n obtained by Monte Carlo studies. The modifications for all the
statistics were calculated from points for finite n obtained by Monte Carlo
methods. The tables now given are extended and revised from previous tables,
for example, those given in Stephens (1974b), quoted also in Pearson and
Hartley (1972). Percentage points for W², U², and A² were calculated by
Stephens (1971, 1974b, 1976a), by Durbin, Knott, and Taylor (1975), and by
Martynov (1976); Monte Carlo studies for D, Case 3, were given by van Soest
(1967), by Lilliefors (1967), and by Stephens (1974b); for D, Cases 1 and 2,
similar studies have been made by van Tilmann-Deutler, Griesenbrock, and
Schwensfeier (1975), for V, Case 3, by Louter and Koerts (1970), and for W²,
Case 3, by van Soest (1967). Asymptotic results were also given by Wood
(1978) and by Nesenko and Tjurin (1978).

No modifications have been calculated for W², U², and A², Cases 1 and 2.
The percentage points for finite n converge rapidly to the asymptotic points,
so that the points given could be used with good accuracy for n > 20. For
other references see Durbin (1973), Stephens (1974b), and Neuhaus (1979).

E4.8.1 Example

We return to the data on weights of chickens given in Table 4.1, and now
suppose that the test is for

H₀: the sample is from a normal distribution but with mean and variance
unknown

The situation is therefore Case 3, and the appropriate estimates for μ and
σ² are given by x̄ = 209.6 and s² = 939.25. The transformations give the val-
ues in column Z₂ of Table 4.1, and Figure 4.3 shows their EDF. Equations
(4.2) give for the test statistics the values: D⁺ = 0.089, D⁻ = 0.104, D =
0.104, V = 0.192, W² = 0.034, U² = 0.034, A² = 0.214. The modified values
are D* = 0.483, V* = 0.906, W²* = 0.035, U²* = 0.034, A²* = 0.223. It can be
seen from Table 4.7, using Case 3 percentage points, that these are not
nearly significant at the 15% level, so that at this level the sample would not
be rejected as coming from a normal population.
FIGURE 4.3 EDF of Z₂.

4.8.2 Significance Levels for Tests of Normality (Case 3)

Tables 4.8 and 4.9 can be used to find the p-level of a test statistic in an
EDF test for normality (Case 3). Table 4.8 is adapted from Pettitt (1977a)
and gives formulas for the percentage point of A², for a sample of size n,
corresponding to a given q-value. The value a for which P(A² < a) = q is
given by a = a∞(1 + b₀/n + b₁/n²), where b₀, b₁, and a∞ are given in the
table against the value of q. Table 4.9 gives formulas for log p in the upper
tail, or log q in the lower tail, for Case 3 tests and for modified values of
W², U², A². They are more accurate in the upper tail (where the modifica-
tions of Table 4.7 are more accurate) but also give good approximations in the
lower tail; these are useful for combining several test results (Section 4.18).

TABLE 4.8 Constants for Calculating the Significance Level
of a Value of A² in a Test for Normality with
Parameters Unknown (Case 3, Section 4.8.2)

                            Asymptotic
q       b₀       b₁         percentage point a∞

.05    -.512     2.10       .1674
.10    -.552     1.25       .1938
.15    -.608     1.07       .2147
.20    -.643      .93       .2333
.25    -.707     1.03       .2509
.30    -.735     1.02       .2681
.35    -.772     1.04       .2853
.40    -.770      .90       .3030
.45    -.778      .80       .3213
.50    -.779      .67       .3405
.55    -.803      .70       .3612
.60    -.818      .58       .3836
.65    -.818      .42       .4085
.70    -.801      .12       .4367
.75    -.800     -.09       .4695
.80    -.756     -.39       .5091
.85    -.749     -.59       .5597
.90    -.750     -.80       .6305
.95    -.795     -.89       .7514
.975   -.881     -.94       .8728
.99   -1.013     -.93      1.0348
.995  -1.063    -1.34      1.1578

Adapted from Pettitt (1977a), with permission of the author
and of the Royal Statistical Society.
TABLE 4.9 Formulas for Significance Levels, Tests for Normality with Parameters Unknown (Case 3, Section 4.8.2)

W², Case 3 (z₁ = 0.0275, z₂ = 0.051, z₃ = 0.092):
    z < z₁:       log q = -13.953 + 775.5z - 12542.61z²
    z₁ ≤ z < z₂:  log q = -5.903 + 179.546z - 1515.29z²
    z₂ ≤ z < z₃:  log p = 0.886 - 31.62z + 10.897z²
    z ≥ z₃:       log p = 1.111 - 34.242z + 12.832z²

U², Case 3 (z₁ = 0.0262, z₂ = 0.048, z₃ = 0.094):
    z < z₁:       log q = -13.642 + 766.31z - 12432.74z²
    z₁ ≤ z < z₂:  log q = -6.3328 + 214.57z - 2022.28z²
    z₂ ≤ z < z₃:  log p = 0.8510 - 32.006z - 3.45z²
    z ≥ z₃:       log p = 1.325 - 38.918z + 16.45z²

A², Case 3 (z₁ = 0.200, z₂ = 0.340, z₃ = 0.600):
    z < z₁:       log q = -13.436 + 101.14z - 223.73z²
    z₁ ≤ z < z₂:  log q = -8.318 + 42.796z - 59.938z²
    z₂ ≤ z < z₃:  log p = 0.9177 - 4.279z - 1.38z²
    z ≥ z₃:       log p = 1.2937 - 5.709z + 0.0186z²

Suppose z is a modified value of W², U², or A² (see Table 4.7). For a given z, find the interval in which z lies. The formula
gives the value of log q (q = lower-tail significance level) or log p (p = upper-tail significance level).

E4.8.2 Example

Suppose, for n = 20, A² were 0.435. Reference to Table 4.8 suggests a
q-level near 0.70. The percentage point for n = 20, q = 0.70 is given approxi-
mately by z = 0.4367(1 - .801/20 + .12/400) = .419. Similar calculations
give the percentage point for n = 20, q = 0.75 to be z ≈ 0.4506. Interpolation
between these values gives q for A² = 0.435 to be about 0.725. To use
Table 4.9 we first calculate modified A² (from Table 4.7) to be 0.454; then
Table 4.9 gives log p ≈ 0.9177 - 4.279(.454) - 1.38(.454)² = -1.309; then
p ≈ 0.270, and q = .730.
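The Table 4.9 route of this example is easy to automate for A²; this is a minimal sketch (the function name is ours), using the A² column of Table 4.9 together with the Table 4.7 modification:

```python
import math

def a2_case3_pvalue(a2, n):
    """Upper-tail significance level p for the A^2 test of normality,
    Case 3: modify A^2 as in Table 4.7, then apply the piecewise
    formulas of the A^2 column of Table 4.9."""
    z = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)   # modified A^2
    if z < 0.200:
        return 1.0 - math.exp(-13.436 + 101.14 * z - 223.73 * z ** 2)
    if z < 0.340:
        return 1.0 - math.exp(-8.318 + 42.796 * z - 59.938 * z ** 2)
    if z < 0.600:
        return math.exp(0.9177 - 4.279 * z - 1.38 * z ** 2)
    return math.exp(1.2937 - 5.709 * z + 0.0186 * z ** 2)
```

For A² = 0.435 with n = 20 this reproduces the p ≈ 0.270 of the example above.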

4.8.3 Related Tests for Normality

Green and Hegazy (1976) have shown that slight modifications of the basic
EDF statistics can improve power in tests for normality against selected
alternatives. Hegazy and Green (1975) have discussed tests based on values
vᵢ = {(X_(i) - X̄)/s} - mᵢ, where mᵢ is the expected value of the i-th order
statistic of a sample of size n from N(0,1).

4.8.4 Tests for Normality with Censored Data

Pettitt (1976) has given percentage points for modified versions of W², U²,
and A², for use in tests of normality with singly- or doubly-censored data.
The parameters μ and σ can be estimated by maximum likelihood or by esti-
mates given by Gupta (1952). Maximum likelihood estimates are complicated
to calculate, and percentage points of the test statistics for finite n appear to
converge more slowly to the asymptotic points when these estimates are
used, so Gupta's estimates are suggested. These are linear combinations of
the available order statistics, for example, μ* = Σ_{i=1}^{r} cᵢX_(i) and
σ* = Σ_{i=1}^{r} bᵢX_(i). For n ≤ 10 Gupta gives coefficients cᵢ and bᵢ (in his
notation) for the most efficient estimates. For n > 10, which would be the
situation most needed in practice, Gupta gives the easily calculated coeffi-
cients

    cᵢ = 1/r - m̄bᵢ   and   bᵢ = (mᵢ - m̄) / Σ_{j=1}^{r} mⱼ(mⱼ - m̄)

where mᵢ is the expected value of the i-th order statistic of a sample of size
n from N(0,1) and where m̄ = Σ_{i=1}^{r} mᵢ/r. Values of mᵢ are tabulated or can
be well approximated (see Section 5.7.2), and the estimates μ* and σ* have
been shown to be asymptotically efficient (Ali and Chan, 1964). These esti-
mates are the same as those obtained by least squares when X_(i) is regressed

TABLE 4.10 Upper Tail Percentage Points for Statistics W² and A² for a
Test for Normality (Parameters Unknown) with Complete or Type 2
Right-Censored Data (Section 4.8.4). p = r/n is the censoring ratio.

Significance level ol

Statistic n 0.5 0.25 0.15 0.10 0.05 0.025 0.01

W²          20  0.001 0.002 0.004 0.006 0.010 0.016 0.024
            40  0.002 0.004 0.006 0.008 0.013 0.021 0.041
p = 0.2     60  0.002 0.004 0.006 0.008 0.014 0.021 0.039
            80  0.001 0.004 0.006 0.008 0.013 0.021 0.035
           100  0.001 0.004 0.006 0.008 0.013 0.020 0.032
             ∞  0.000 0.004 0.006 0.008 0.009 0.017 0.020
10 0.007 0.011 0.014 0.017 0.028 0.041 0.057
20 0.009 0.014 0.019 0.024 0.037 0.055 0.090
40 0.009 0.015 0.020 0.026 0.038 0.057 0.089
P = 0.4 60 0.010 0.015 0.021 0.026 0.036 0.052 0.077
80 0.010 0.016 0.021 0.026 0.036 0.049 0.073
100 0.010 0.016 0.021 0.026 0.035 0.047 0.071
∞ 0.009 0.019 0.021 0.026 0.034 0.038 0.066

10 0.017 0.026 0.033 0.040 0.054 0.075 0.109


20 0.019 0.029 0.037 0.044 0.060 0.080 0.113
40 0.020 0.031 0.040 0.047 0.061 0.077 0.106
P = 0.6 60 0.020 0.031 0.040 0.047 0.060 0.077 0.105
80 0.021 0.032 0.040 0.048 0.061 0.077 0.103
100 0.021 0.032 0.040 0.048 0.061 0.076 0.101
∞ 0.025 0.032 0.044 0.052 0.064 0.074 0.092

10 0.030 0.044 0.054 0.062 0.078 0.094 0.115


20 0.032 0.046 0.057 0.067 0.083 0.100 0.122
40 0.033 0.049 0.060 0.070 0.084 0.101 0.124
P = 0.8 60 0.033 0.049 0.060 0.070 0.086 0.103 0.125
80 0.034 0.050 0.061 0.071 0.087 0.105 0.127
100 0.035 0.050 0.062 0.072 0.089 0.106 0.129
∞ 0.039 0.051 0.069 0.080 0.098 0.114 0.140

10 0.037 0.054 0.066 0.076 0.093 0.110 0.137


20 0.039 0.056 0.069 0.079 0.097 0.113 0.137
40 0.040 0.058 0.072 0.082 0.099 0.116 0.142
P = 0.9 60 0.040 0.058 0.072 0.082 0.100 0.118 0.141
80 0.041 0.059 0.073 0.084 0.102 0.120 0.144
100 0.042 0.060 0.074 0.085 0.103 0.122 0.146
∞ 0.045 0.067 0.082 0.094 0.114 0.135 0.163

(continued)

T A B L E 4.10 (continued)

Significance level a

Statistic 0.5 0.25 0.15 0.10 0.05 0.025 0.01

W² 10 0.042 0.061 0.074 0.084 0.103 0.122 0.145


20 0.043 0.062 0.076 0.087 0.106 0.124 0.147
40 0.044 0.064 0.078 0.089 0.108 0.126 0.154
60 0.044 0.064 0.078 0.089 0.109 0.128 0.152
P = 0.95
80 0.045 0.065 0.079 0.090 0.110 0.130 0.154
100 0.045 0.066 0.080 0.091 0.112 0.132 0.156
∞ 0.049 0.072 0.087 0.099 0.120 0.142 0.172

10 0.049 0.070 0.086 0.098 0.119 0.141 0.167


20 0.049 0.071 0.087 0.100 0.121 0.142 0.171
40 0.050 0.073 0.088 0.101 0.122 0.141 0.169
P = 1.0 60 0.050 0.073 0.088 0.101 0.123 0.144 0.171
80 0.050 0.073 0.089 0.101 0.124 0.146 0.173
100 0.050 0.073 0.089 0.102 0.125 0.146 0.174
∞ 0.051 0.074 0.091 0.104 0.126 0.148 0.179

A² 20 0.015 0.043 0.054 0.061 0.092 0.131 0.182


40 0.028 0.053 0.067 0.079 0.112 0.158 0.253
60 0.035 0.053 0.069 0.084 0.114 0.160 0.246
P = 0.2
80 0.036 0.056 0.073 0.087 0.116 0.159 0.236
100 0.036 0.059 0.075 0.089 0.119 0.158 0.228
∞ 0.030 0.077 0.094 0.099 0.133 0.149 0.185

10 0.063 0.090 0.108 0.121 0.172 0.236 0.319


20 0.072 0.107 0.135 0.162 0.220 0.297 0.439
40 0.078 0.117 0.150 0.177 0.236 0.316 0.433
P = 0.4 60 0.079 0.119 0.148 0.174 0.228 0.299 0.410
80 0.082 0.121 0.153 0.178 0.229 0.292 0.395
100 0.085 0.123 0.157 0.182 0.231 0.288 0.385
∞ 0.106 0.134 0.190 0.215 0.250 0.279 0.340

10 0.111 0.158 0.198 0.233 0.304 0.405 0.592


20 0.122 0.178 0.222 0.259 0.339 0.437 0.607
40 0.130 0.191 0.238 0.278 0.348 0.430 0.570
P = 0 .6 60 0.132 0.193 0.239 0.275 0.348 0.430 0.557
80 0.134 0.196 0.241 0.278 0.350 0.429 0.548
100 0.136 0.198 0.244 0.280 0.351 0.429 0.541
∞ 0.151 0.212 0.261 0.300 0.359 0.426 0.512

(continued)

T A B L E 4.10 (continued)

Significance level a

Statistic n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

A² 10 0.172 0.246 0.302 0.352 0.440 0.542 0.698


20 0.185 0.267 0.330 0.380 0.473 0.574 0.743
40 0.197 0.282 0.344 0.394 0.478 0.575 0.711
P = 0.8 60 0.197 0.284 0.345 0.396 0.482 0.574 0.705
80 0.200 0.288 0.349 0.401 0.489 0.580 0.707
100 0.202 0.291 0.353 0.405 0.494 0.585 0.709
∞ 0.220 0.311 0.380 0.432 0.528 0.619 0.732

10 0.214 0.303 0.368 0.425 0.530 0.642 0.825


20 0.229 0.326 0.397 0.453 0.549 0.654 0.807
40 0.242 0.342 0.414 0.473 0.566 0.669 0.814
P = 0.9 60 0.243 0.343 0.415 0.472 0.571 0.675 0.805
80 0.245 0.348 0.420 0.478 0.579 0.683 0.811
100 0.248 0.352 0.425 0.483 0.585 0.689 0.818
∞ 0.265 0.380 0.456 0.517 0.623 0.729 0.871
10 0.243 0.344 0.414 0.474 0.584 0.696 0.840
20 0.257 0.366 0.440 0.500 0.600 0.708 0.853
40 0.273 0.382 0.456 0.519 0.624 0.721 0.865
60 0.272 0.383 0.459 0.520 0.626 0.733 0.874
P = 0.95
80 0.276 0.388 0.465 0.528 0.633 0.744 0.885
100 0.279 0.392 0.470 0.534 0.640 0.753 0.893
∞ 0.301 0.420 0.502 0.580 0.686 0.802 0.942

10 0.309 0.425 0.511 0.578 0.700 0.818 0.964


20 0.323 0.446 0.530 0.601 0.714 0.831 0.993
40 0.330 0.456 0.541 0.611 0.723 0.833 0.981
P = 1.0 60 0.331 0.458 0.545 0.614 0.734 0.847 0.993
80 0.333 0.460 0.548 0.616 0.740 0.853 1.000
100 0.334 0.461 0.550 0.618 0.742 0.857 1.005
∞ 0.341 0.470 0.561 0.631 0.752 0.873 1.035

Some asymptotic points taken from Pettitt (1976), with permission of the
author and of the Biometrika Trustees.

against mᵢ, i = 1, . . . , r, as, for example, in Section 5.7.3. The steps in
making a test, with right-censored data, are then:

(a) Calculate μ* and σ*.
(b) Find wᵢ = {X_(i) - μ*}/σ*, i = 1, . . . , r.
(c) Calculate Zᵢ = Φ(wᵢ) as in Section 4.8.1, step (b), above.
(d) For Type 2 censored data use the Zᵢ, i = 1, . . . , r, in the formulas of
Section 4.7.3 to obtain statistics ₂W²ᵣ,ₙ, ₂A²ᵣ,ₙ, and ₂U²ᵣ,ₙ.
(e) Refer ₂W²ᵣ,ₙ and ₂A²ᵣ,ₙ to the percentage points in Table 4.10. The
table is entered at p = r/n and at n. The points for finite n were found
by extensive Monte Carlo sampling, using 10,000 samples for each
sample size n, and some asymptotic points were taken from Pettitt
(1976). The test is easily adapted for left-censored data by changing
the sign of all values given and observing that the sample is now right-
censored. Tables for ₂U²ᵣ,ₙ are in Stephens (1986).
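Steps (a) to (c) can be sketched as follows; this is a minimal illustration (names are ours), approximating the normal-order-statistic means mᵢ by the Blom formula Φ⁻¹{(i - 0.375)/(n + 0.25)} rather than tabulated values (an assumption in the spirit of Section 5.7.2):

```python
from statistics import NormalDist

def gupta_estimates(x_censored, n):
    """Gupta's coefficient-based estimates mu*, sigma* from the r smallest
    order statistics of a right-censored normal sample of full size n."""
    x = sorted(x_censored)
    r = len(x)
    # Approximate expected normal order statistics m_i (Blom's formula).
    m = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, r + 1)]
    mbar = sum(m) / r
    denom = sum(mi * (mi - mbar) for mi in m)
    b = [(mi - mbar) / denom for mi in m]       # b_i = (m_i - mbar)/sum m_j(m_j - mbar)
    c = [1.0 / r - mbar * bi for bi in b]       # c_i = 1/r - mbar * b_i
    mu_star = sum(ci * xi for ci, xi in zip(c, x))
    sigma_star = sum(bi * xi for bi, xi in zip(b, x))
    return mu_star, sigma_star

def censored_z_values(x_censored, n):
    """Steps (b) and (c): w_i = (x_(i) - mu*)/sigma*, z_i = Phi(w_i)."""
    mu_star, sigma_star = gupta_estimates(x_censored, n)
    return [NormalDist().cdf((xi - mu_star) / sigma_star)
            for xi in sorted(x_censored)]
```

For a complete sample (r = n) the Blom scores are symmetric about zero, so m̄ = 0 and μ* reduces to the sample mean.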

An example of these tests is given in Example E 1 1 .4 .1 .1.1; the same


data set is used, with the correlation test described in Section 5 .7 , in
E 1 1 .4 .1 .2 .1. F o r these data, the EDF statistics are the m ore sensitive.
It is possible to make a rough test fo r, say, large outliers, mixed with
an otherwise normal sam ple, by firs t testing the whole sample and then the
sample with suspect values rem oved. This procedure would be difficult to
form alize since the two tests would be correlated, and also the censoring
fraction w ill probably be chosen after observing the data.
For Type 1 right-censored data, test statistics ₁W²ᵣ,ₙ, ₁U²ᵣ,ₙ, and ₁A²ᵣ,ₙ
can be found as follows. Suppose the upper censoring value of X is t*, and
let p = Φ{(t* - μ*)/σ*}. The procedure of Section 4.7.3 is followed, for
Type 1 censored data; that is, set Z_(r+1) = p, consider the sample to be
now of size r + 1, and use the formulas of that section to calculate the statis-
tics. Tables can be constructed, by taking Monte Carlo samples from N(0,1),
censoring at t = Φ⁻¹(p), for given p, and calculating the test statistics just
described; however, a correct test cannot be made, since for any given real-
life sample, only an estimate of p will be known, and not the correct p to enter the table.
As the points vary considerably with p, entering the table at an estimate of p
could produce substantial errors. However, Table 4.10 can be used to give
an approximate test, especially for large samples; the asymptotic points are
the same for both types of censoring, and tables for Type 1, produced as
described above, have values close to those for Type 2 censored data. The
same problem will arise for Type 1 censored data from other distributions.

4.8.5 Tests for Normality of Residuals in Regression

Mukantseva (1977), Pierce and Kopecky (1978), and Loynes (1980) have
studied the asymptotic behavior of the EDF of the residuals when a regression
model has been fitted, with the intention of testing these residuals for normal-
ity. If the model, of any order, is correct, the residuals will be normal,
with known mean equal to zero, but with unknown variance. At first sight
this situation would appear to be Case 2 of Section 4.8.1 above, but this is
not so, because the residuals are not independent. However, the above
authors have shown that if EDF statistics are calculated from the residuals,
their asymptotic distributions are the same as for Case 3 above.

As an example, consider simple linear regression using the model
yᵢ = β₀ + β₁xᵢ + εᵢ, i = 1, . . . , n, with εᵢ ~ N(0, σ²). Let β̂₀ and β̂₁ be the
usual least squares estimators, and let σ̂² be the usual estimate of σ² ob-
tained from the error sum of squares in the ANOVA table; if ŷᵢ = β̂₀ + β̂₁xᵢ,
êᵢ = yᵢ - ŷᵢ, then σ̂² = Σᵢ êᵢ²/(n - 2). The studentized residuals are (Pierce
and Gray, 1982, with slight change in notation)

    wᵢ = êᵢ/[σ̂{1 - 1/n - (xᵢ - x̄)²/Sₓₓ}^{1/2}]

where Sₓₓ = Σᵢ(xᵢ - x̄)². Let Z_(i) = Φ(w_(i)); for an approximate test that the
εᵢ are normal, EDF statistics are then calculated from (4.2) and referred
to the asymptotic points for Case 3 in Table 4.7. The modifications given in
the table will not be valid for this problem.

From Monte Carlo studies, Pierce and Gray (1982) concluded the asymp-
totic points can be used with good accuracy for the simple linear regression
model, for n as low as 20. White and MacDonald (1980) gave some results
for multiple regression situations. It seems clear that the tests would be
affected considerably if the experimental model were not correct, for ex-
ample, if the correct regression model were a quadratic function, but only
a linear fit was made; see also comments on the multiple regression situ-
ation by White and MacDonald (1980), and in the discussion to that paper, and
by Pierce and Gray (1982). Wood (1978b) has discussed asymptotic theory
for EDF statistics obtained from residuals in an analysis-of-variance model.
In residual analysis, of course, other questions are also of great importance,
for example, systematic variation of the residuals. For further discussion
see Anscombe and Tukey (1963) or textbooks such as Draper and Smith
(1966) or Kleinbaum and Kupper (1978).
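The studentized residuals above can be computed directly for the simple linear model; a minimal sketch (helper names are ours):

```python
import math

def studentized_residuals(x, y):
    """Least-squares fit of y = b0 + b1*x, then
    w_i = e_i / [sigma_hat * sqrt(1 - 1/n - (x_i - xbar)^2 / Sxx)],
    as in Pierce and Gray (1982)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]     # raw residuals
    sigma2 = sum(ei ** 2 for ei in e) / (n - 2)           # error mean square
    return [ei / math.sqrt(sigma2 * (1 - 1 / n - (xi - xbar) ** 2 / sxx))
            for ei, xi in zip(e, x)]
```

The ordered Φ(w_(i)) values would then be fed to the formulas (4.2) and compared with the Case 3 asymptotic points of Table 4.7, as described above.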

4.9 EDF TESTS FOR THE EXPONENTIAL DISTRIBUTION

4.9.1 Tests for Exponentiality, Cases 1, 2, and 3

The exponential distribution, denoted by Exp(α, β), is the distribution

    F(x; α, β) = 1 - exp{-(x - α)/β},  x > α; β > 0

In this section we consider tests of

Hq S а random sample , Xj^ comes from distribution Exp { a , ß )

As for the test for normality, we can distinguish three cases:

Case 1: the origin or location parameter α is unknown, but β is known;
Case 2: the scale parameter β is unknown, but α is known;
Case 3: both parameters are unknown.

4.9.2 Tests for Case 1

The first method we shall describe for Case 1 uses a special property of the
exponential distribution, as follows. Let Yi = X(i) - X(1), i = 2, . . . , n;
on H0, the Yi will be a random sample from Exp(0, β) (see Section 10.3.1,
Result 2) and, since β is known, a Case 0 test can be made using the n - 1
values of Yi.
Alternatively, α may be estimated unbiasedly by α̂ = X(1) - β/n; this
estimate is derived from the maximum likelihood estimate X(1), and has
variance diminishing as 1/n². Then the Z(i) are found from Z(i) =
1 - exp[-(X(i) - α̂)/β], i = 1, . . . , n, and EDF statistics calculated from the
Z(i) by formulas (4.2) will have their Case 0 distributions asymptotically, so
that the percentage points in Table 4.2 may be used for large samples.
However, in contrast to the previous test procedure, the modifications given
there will not apply, and since the two procedures are likely to have very
similar power properties, the first procedure is more practical for
relatively small samples.
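The spacings device just described is easy to program; the sketch below (our function names; the D and W² formulas are the standard Case 0 definitions of (4.2)) applies a Case 0 test to the n - 1 differences from the minimum:

```python
import math

def case0_edf_stats(z):
    # Kolmogorov D and Cramer-von Mises W^2 from values z, which should be
    # U(0, 1) on the null hypothesis (the standard Case 0 formulas of (4.2))
    z = sorted(z)
    n = len(z)
    dplus = max((i + 1) / n - zi for i, zi in enumerate(z))
    dminus = max(zi - i / n for i, zi in enumerate(z))
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1 / (12 * n)
    return max(dplus, dminus), w2

def case1_exponential_test(x, beta):
    # Y_i = X_(i) - X_(1) are a sample from Exp(0, beta) on H0;
    # beta is known, so the full Probability Integral Transformation applies
    xs = sorted(x)
    y = [xi - xs[0] for xi in xs[1:]]
    z = [1 - math.exp(-yi / beta) for yi in y]
    return case0_edf_stats(z)
```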

4.9.3 Tests for Case 2

For this case (Case 2) suppose first that α is known to be zero. The maximum
likelihood estimate of β is given by β̂ = X̄, where X̄ is the sample mean.
The steps in testing H0 are as follows:

(a) Calculate Z(i) = 1 - exp(-X(i)/X̄), i = 1, . . . , n.
(b) Calculate the EDF statistics from (4.2).
(c) Modify and compare with the percentage points given in Table 4.11, or
    alternatively obtain p-levels from Table 4.12.
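Steps (a)-(c) can be sketched as follows (our function; W² and A² are computed by the usual formulas of (4.2), and the modifications are those of Table 4.11):

```python
import math

def case2_exponential_test(x):
    # (a) Z_(i) = 1 - exp(-X_(i)/xbar), with beta-hat = xbar
    xs = sorted(x)
    n = len(xs)
    xbar = sum(xs) / n
    z = [1 - math.exp(-xi / xbar) for xi in xs]
    # (b) EDF statistics W^2 and A^2 from (4.2)
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1 / (12 * n)
    a2 = -n - sum((2 * i + 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i]))
                  for i in range(n)) / n
    # (c) modify as in Table 4.11
    return {"W2*": w2 * (1 + 0.16 / n), "A2*": a2 * (1 + 0.6 / n)}
```

The modified values would then be compared with the upper-tail points of Table 4.11 (for example, W2* against .222 at the 5% level).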

If the known origin is α = α₀, not zero, the substitution Xi' = Xi - α₀,
i = 1, . . . , n, can be made, and the Xi' tested for Exp(0, β) as just described.
See Result 1 of Section 10.3.1.
The percentage points given are asymptotic points for the statistics W²,
U², and A²; see Stephens (1974b, 1976a). The asymptotic distribution of W²
was earlier tabulated by van Soest (1969), and points for U² and A² have also
been calculated by Durbin, Knott, and Taylor (1975). The modifications were
based on Monte Carlo points for finite n; for D and V, these were extrapolated
TABLE 4.11 Modifications and Percentage Points for EDF Tests for Exponentiality, Case 2: Origin Known,
Scale Unknown (Section 4.9.3)

Upper tail
Significance level α

Statistic
T     Modified form T*                     .25   .20   .15   .10   .05   .025  .01   .005  .0025

D     (D - 0.2/n)(√n + 0.26 + 0.5/√n)                  .926  .995  1.094 1.184 1.298
V     (V - 0.2/n)(√n + 0.24 + 0.35/√n)                 1.445 1.527 1.655 1.774 1.910
W²    W²(1.0 + 0.16/n)                     .116  .130  .148  .175  .222  .271  .338  .390  .442
U²    U²(1.0 + 0.16/n)                     .090  .099  .112  .129  .159  .189  .230  .261  .293
A²    A²(1.0 + 0.6/n)                      .736  .816  .916  1.062 1.321 1.591 1.959 2.244 2.534

(For D and V the five values given are the points for levels .15, .10, .05, .025, and .01.)

Lower tail (asymptotic percentage points)
Significance level α

      .01    .025   .05    .10    .15    .20    .25    .50

W²    .0192  .0233  .0276  .0338  .039   .044   .048   .074
U²    .0172  .0207  .0243  .0293  .0339  .0373  .0409  .0601
A²    .150   .178   .208   .249   .280   .312   .342   .502

Adapted from Table 54 of Pearson and Hartley (1972) and from Stephens (1974b), with permission of the
Biometrika Trustees and of the American Statistical Association.
TABLE 4.12 Formulas for Significance Levels, Tests for Exponentiality, Case 2: Origin Known, Scale Unknown
(Section 4.9.3)*

Statistic W² (z1 = 0.035, z2 = 0.074, z3 = 0.160):
  z < z1:       log q = -11.334 + 459.098z - 5652.1z²
  z1 < z < z2:  log q = -5.779 + 132.89z - 866.58z²
  z2 < z < z3:  log p = 0.586 - 17.87z + 7.417z²
  z > z3:       log p = 0.447 - 16.592z + 4.849z²

Statistic U² (z1 = 0.029, z2 = 0.062, z3 = 0.120):
  z < z1:       log q = -11.703 + 542.5z - 7574.59z²
  z1 < z < z2:  log q = -6.3288 + 178.1z - 1399.49z²
  z2 < z < z3:  log p = 0.8071 - 25.166z + 8.44z²
  z > z3:       log p = 0.7663 - 24.359z + 4.539z²

Statistic A² (z1 = 0.260, z2 = 0.510, z3 = 0.950):
  z < z1:       log q = -12.2204 + 67.459z - 110.3z²
  z1 < z < z2:  log q = -6.1327 + 20.218z - 18.663z²
  z2 < z < z3:  log p = 0.9209 - 3.353z + 0.300z²
  z > z3:       log p = 0.731 - 3.009z + 0.15z²

*Suppose z is a modified value of W², U², or A² (see Table 4.11). For a given z, find the interval in which
z lies. The formula gives the value of log q (q = lower tail significance level) or log p (p = upper tail
significance level).
TESTS BASED ON EDF STATISTICS 137

to obtain asymptotic values. Stephens (1974b) has given references to other
Monte Carlo studies, the first of which, for D only, was given by Lilliefors
(1969). Subsequently, Durbin (1975) has produced exact null distribution
theory for D⁺, D⁻, and D, and has given percentage points for n up to 100; see
also Margolin and Maurer (1976) for work on these statistics. Table 4.11 is
an extended and revised version of that given in Stephens (1974b), making use
of later results where possible. Table 4.12 gives formulas for obtaining
p-values or q-values for a given value z of a modified statistic.
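For machine use, the W² column of Table 4.12 translates directly into code; a sketch (natural logarithms; the constants are transcribed from the table, so they inherit any uncertainty in it):

```python
import math

def w2_case2_significance(z):
    # z is a MODIFIED W^2 value (Table 4.11); returns ("q", lower-tail level)
    # or ("p", upper-tail level), using the W^2 column of Table 4.12
    if z < 0.035:
        return ("q", math.exp(-11.334 + 459.098 * z - 5652.1 * z * z))
    if z < 0.074:
        return ("q", math.exp(-5.779 + 132.89 * z - 866.58 * z * z))
    if z < 0.160:
        return ("p", math.exp(0.586 - 17.87 * z + 7.417 * z * z))
    return ("p", math.exp(0.447 - 16.592 * z + 4.849 * z * z))
```

As a check, z = 0.222, the upper 5% point of Table 4.11, returns p close to 0.05, and z = 0.116 returns p close to 0.25.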

E4.9.3 Example

Proschan (1963, Table 1) has given a number of samples of data, consisting
of intervals between failures of air conditioning equipment in aircraft. We

TABLE 4.13 Original Values X
in a Test for Exponentiality

  X      Z1(a)    Z2(b)

12 0.113 0.094
21 0.189 0.159
26 0.229 0.193
27 0.237 0.200
29 0.252 0.213
29 0.252 0.213
48 0.381 0.327
57 0.434 0.375
59 0.446 0.385
70 0.503 0.439
74 0.522 0.457
153 0.783 0.717
326 0.962 0.932
386 0.979 0.959
502 0.993 0.984

(a) Values Z1 using the Probability Integral
    Transformation with θ given equal to 0.01.
(b) Values Z2 using the transformation with θ
    estimated from the data, i.e., θ̂ = .0083.
Taken from Proschan (1963), with permission
of the author and of the American Statistical
Association.

illustrate the test for exponentiality by taking his aircraft numbered 7910,
for which the 15 intervals are as listed in Table 4.13. In order to emphasize
the contrast between Case 0 of Section 4.6 and the test with unknown
parameter β, we first test the null hypothesis of exponentiality with both
parameters given: α = 0, β = 100. Thus the null hypothesis is that the data
are a random sample from Exp(0, 100).
The transformation Zi = 1 - exp(-Xi/100) gives Z-values listed as Z1
in Table 4.13; on H0, the Zi will be U(0, 1). The values give test statistics
D⁺ = 0.210, D⁻ = 0.161, D = 0.210, V = 0.372, W² = 0.133, U² = 0.130,
A² = 1.055. The modified forms, using Table 4.2, are D⁺* = 0.973, D⁻* =
0.650, D* = 0.846, V* = 1.522, W* = 0.116, U* = 0.130, A* = 1.055. Statistics
V* and U* are almost significant at the 15% level; the others are far
from significant even at this level.
If β is not known in the test for exponentiality, the estimate is β̂ = X̄ =
121.2. With α = 0 and β̂ = 121.2, the Probability Integral Transformation
gives the values in column Z2 of Table 4.13 and the test statistics modified
as in Table 4.11 become: D* = 1.122, V* = 1.661, W* = 0.221, U* = 0.172,
A* = 1.210. All the statistics are now significant at the 10% level, with D*,
V*, U* significant (and W* almost so) at the 5% level. Thus, given the
freedom to choose the parameter, it appears that the assumption of an
exponential parent population is suspect. The comparison with the previous
test for Case 0 may appear paradoxical, since apparently in Case 0 one
makes use of more information; we shall return to this point in Section 4.16.
Many other tests for exponentiality with known origin are given in
Chapter 10. In particular, two other test statistics based on the EDF (D and S*)
are discussed in Section 10.8.1.
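The Case 0 part of this example is easily reproduced; the sketch below recovers D⁺, D⁻, D, and V from the data of Table 4.13 (the results agree with the values quoted above to the rounding used):

```python
import math

# Intervals between failures for aircraft 7910 (Proschan, 1963; Table 4.13)
x = [12, 21, 26, 27, 29, 29, 48, 57, 59, 70, 74, 153, 326, 386, 502]
n = len(x)
# Case 0: alpha = 0, beta = 100 are fully specified
z = sorted(1 - math.exp(-xi / 100.0) for xi in x)
dplus = max((i + 1) / n - zi for i, zi in enumerate(z))
dminus = max(zi - i / n for i, zi in enumerate(z))
d, v = max(dplus, dminus), dplus + dminus
```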

4.9.4 Tests for Case 3

Relatively few tests have been proposed to test for exponentiality with α and β
unknown, probably because Result 2 of Section 10.3.1 can be used to reduce
the test to a test with α = 0. However, this may not always be the best

TABLE 4.14 Modifications and Upper Tail Percentage Points for a Test for
Exponentiality, Case 3: Origin and Scale Unknown (Section 4.9.4)

                                      Significance level α

Statistic   Modification              .25   .15   .10   .05   .025  .01

W²          W²(1 + 2.8/n - 3/n²)      .116  .148  .175  .222  .271  .338
U²          U²(1 + 2.3/n - 3/n²)      .090  .112  .129  .159  .189  .230
A²          A²(1 + 5.4/n - 11/n²)     .736  .916  1.062 1.321 1.591 1.959



TABLE 4.15 Upper Tail Percentage Points for √nD⁺, √nD⁻, √nD, √nV,
W², U², and A², for a Test of Exponentiality, Case 3: Origin and Scale
Unknown (Section 4.9.4)

Upper tail significance level α

n      .25   .15   .10   .05   .025  .01

Statistic √nD⁺

5 .491 .569 .639 .743 .825 .917


10 .580 .674 .745 .851 .952 1.038
15 .610 .700 .768 .872 .978 1.077
20 .624 .716 .785 .894 .995 1.108
25 .635 .725 .799 .909 1.010 1.125
50 .660 .758 .832 .943 1.051 1.163
100 .682 .778 .853 .967 1.074 1.189
∞      .723  .820  .886  .996  1.094 1.211

Statistic √nD⁻

5 .627 .705 .753 .821 .891 .955


10 .671 .761 .825 .916 .993 1.089
15 .688 .783 .842 .933 1.022 1.111
20 .696 .791 .855 .949 1.041 1.132
. 25 .702 .795 .860 .958 1.052 1.149
50 .710 .807 .874 .976 1.072 1.178
100 .717 .814 .879 .984 1.089 1.192
∞      .723  .820  .886  .996  1.094 1.211

Statistic √nD

5 .683 .749 .793 .865 .921 .992


10 .753 .833 .889 .977 1.048 1.119
15 .771 .865 .912 1.002 1.079 1.163
20 .786 .872 .927 1.021 1.099 1.198
25 .792 .878 .936 1.033 1.115 1.215
50 .813 .879 .960 1.061 1.149 1.257
100 .824 .911 .972 1.072 1.171 1.278
∞      .840  .927  .995  1.094 1.184 1.298

Statistic √nV

5 1.098 1.186 1.234 1.314 1.400 1.494


10 1.194 1.294 1.363 1.461 1.556 1.662
15 1.225 1.325 1.392 1.504 1.596 1.701
20 1.245 1.346 1.419 1.536 1.635 1.769

(continued)

TABLE 4.15 (continued)

Upper tail significance level α

n .25 .15 .10 .05 .025 .01

Statistic √nV, continued

25 1.260 1.366 1.438 1.559 1.658 1.796


50 1.292 1.400 1.481 1.600 1.701 1.847
100 1.310 1.419 1.502 1.647 1.740 1.897
∞     1.334 1.445 1.527 1.655 1.774 1.910

Statistic W²

5 .083 .102 .117 .141 .166 .197


10     .097  .122  .142  .176  .211  .259
15 .103 .130 .151 .188 .229 .281
20 .106 .133 .157 .195 .237 .293
25 .107 .135 .160 .199 .247 .301
50 .111 .141 .166 .209 .256 .319
100 .113 .144 .170 .215 .263 .328
∞      .116  .148  .175  .222  .271  .338

Statistic U²

5 .068 .083 .093 .113 .131 .153


10 .075 .094 .108 .131 .155 .187
15 .080 .099 .114 .139 .165 .200
20 .082 .102 .117 .143 .170 .207
25 .083 .104 .119 .146 .173 .212
50 .087 .108 .124 .152 .180 .223
100 .089 .110 .126 .155 .184 .229
∞      .090  .112  .129  .159  .189  .230

Statistic A²

5      .460  .555  .621  .725  .848  .989


10 .545 .660 .747 .920 1.068 1.352
15 .575 .720 .816 1.009 1.198 1.495
20 .608 .757 .861 1.062 1.267 1.580
25 .625 .784 .890 1.097 1.317 1.635
50 .680 .838 .965 1.197 1.440 1.775
100 .710 .875 1.008 1.250 1.510 1.855
∞      .736  .916  1.062 1.321 1.591 1.959

procedure. In this section we give EDF tests for Case 3, using estimates for
both α and β, similar to other EDF tests. The null hypothesis is

H0: the random sample X1, . . . , Xn comes from the distribution
    Exp(α, β), with α, β unknown

The test procedure is as follows:

(a) Calculate estimates β̂ = n(X̄ - X(1))/(n - 1) and α̂ = X(1) - β̂/n.
(b) Calculate W(i) = (X(i) - α̂)/β̂, i = 1, . . . , n.
(c) Calculate Z(i) = 1 - exp(-W(i)), i = 1, . . . , n.
(d) Find the EDF statistics from (4.2); modify W², U², and A² using
    Table 4.14 and compare with the asymptotic percentage points given;
    for D⁺, D⁻, D, and V use Table 4.15 without modification.
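The steps above can be sketched as follows for W² (our function; the modification applied is that of Table 4.14):

```python
import math

def case3_exponential_w2(x):
    xs = sorted(x)
    n = len(xs)
    xbar = sum(xs) / n
    beta = n * (xbar - xs[0]) / (n - 1)        # step (a)
    alpha = xs[0] - beta / n
    w = [(xi - alpha) / beta for xi in xs]     # step (b)
    z = [1 - math.exp(-wi) for wi in w]        # step (c)
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1 / (12 * n)
    return w2 * (1 + 2.8 / n - 3 / n ** 2)     # step (d), Table 4.14
```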

The estimate of α is superefficient and so asymptotic theory is the same as
for Case 2 in the previous section. For finite n, however, the distributions
are different. The modifications for W², U², and A² and the points for D⁺,
D⁻, D, and V were found from extensive Monte Carlo studies (Spinelli and
Stephens, 1987). Van Soest (1969) has simulated the probability distribution
for W², for n = 10 and 20. For comments on power, see Section 10.14.

4.9.5 Tests for Exponentiality with Censored Data

Suppose, for example, in a life-testing experiment, the observations are
recorded only up to a fixed time t (Type 1 censoring) or until a fixed number
r out of n are observed (Type 2 censoring). In either case let X(r) be the
largest order statistic, so that the sample is right-censored. The parameter
β in Exp(0, β) is estimated by

    β̂ = {X(1) + · · · + X(r) + (n - r)t}/r      for Type 1 data

and by

    β̂ = {X(1) + · · · + X(r) + (n - r)X(r)}/r   for Type 2 data

A test for exponentiality Exp(0, β) may then be made as follows:

(a) Calculate Z(i) = 1 - exp{-X(i)/β̂}, i = 1, . . . , r.
(b) For Type 2 censoring, use the Z(i) in the formulas of Section 4.7.3 to
    calculate statistics ₂W²ᵣ,ₙ, ₂U²ᵣ,ₙ, and ₂A²ᵣ,ₙ.

TABLE 4.16 Upper Tail Percentage Points for Statistics ₂W²ᵣ,ₙ and ₂A²ᵣ,ₙ for a
Test for Exponentiality with Unknown Scale Parameter and Known Origin,
for Complete or Right-Censored Data of Type 2 (Section 4.9.5).
p = r/n is the censoring ratio.

Significance level α

Statistic   n    0.50  0.25  0.15  0.10  0.05  0.025  0.01

W²   20   0.005  0.009  0.012  0.014  0.018  0.021  0.025


40 0.005 0.008 0.011 0.013 0.017 0.020 0.025
60 0.005 0.008 0.011 0.013 0.017 0.020 0.026
P = 0.2
80 0.005 0.008 0.011 0.013 0.017 0.020 0.026
100 0.005 0.008 0.011 0.013 0.017 0.020 0.026
∞    0.005  0.008  0.011  0.013  0.016  0.021  0.026
10 0.019 0.030 0.038 0.045 0.055 0.066 0.079
20 0.017 0.028 0.037 0.044 0.056 0.068 0.083
40 0.017 0.028 0.036 0.044 0.056 0.068 0.084
P = 0.4 60 0.017 0.028 0.036 0.044 0.056 0.068 0.085
80 0.017 0.028 0.036 0.044 0.056 0.069 0.086
100 0.017 0.027 0.036 0.043 0.056 0.069 0.086
∞    0.017  0.027  0.036  0.043  0.056  0.070  0.087

10 0.036 0.056 0.072 0.084 0.104 0.124 0.149


20 0.035 0.055 0.071 0.084 0.106 0.131 0.161
40 0.035 0.055 0.072 0.085 0.109 0.132 0.161
P = 0.6 60 0.034 0.056 0.072 0.085 0.109 0.133 0.164
80 0.034 0.056 0.072 0.085 0.109 0.134 0.166
100 0.034 0.056 0.072 0.086 0.109 0.134 0.167
∞    0.034  0.058  0.072  0.086  0.110  0.136  0.171
10 0.055 0.086 0.107 0.126 0.156 0.187 0.229
20 0.055 0.086 0.110 0.130 0.167 0.203 0.253
40 0.055 0.087 0.111 0.131 0.167 0.203 0.253
P = 0.8 60 0.055 0.087 0.112 0.132 0.168 0.205 0.256
80 0.055 0.087 0.112 0.132 0.169 0.206 0.257
100 0.055 0.087 0.112 0.132 0.169 0.206 0.258
∞    0.055  0.087  0.113  0.133  0.170  0.209  0.261
10 0.065 0.100 0.126 0.147 0.182 0.219 0.265
20 0.065 0.102 0.132 0.155 0.194 0.238 0.289
40 0.064 0.102 0.129 0.152 0.193 0.229 0.290
P = 0.9 60 0.064 0.101 0.130 0.153 0.195 0.234 0.294
80 0.065 0.101 0.131 0.154 0.196 0.236 0.297
100 0.065 0.101 0.131 0.155 0.196 0.238 0.298
∞    0.065  0.101  0.132  0.156  0.199  0.243  0.303

(continued)

TABLE 4.16 (continued)

Significance level α

Statistic n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

W²   10   0.070  0.109  0.136  0.160  0.200  0.239  0.292


20 0.070 0.110 0.142 0.166 0.209 0.251 0.313
40 0.069 0.108 0.138 0.161 0.205 0.246 0.304
60 0.069 0.108 0.139 0.163 0.207 0.250 0.313
P = 0.95
80 0.069 0.108 0.139 0.164 0.208 0.252 0.318
100 0.070 0.108 0.140 0.164 0.209 0.254 0.321
∞    0.070  0.109  0.141  0.166  0.212  0.259  0.333

10 0.075 0.116 0.147 0.171 0.216 0.259 0.319


20 0.073 0.115 0.148 0.175 0.221 0.265 0.328
40 0.074 0.115 0.147 0.172 0.218 0.267 0.331
P = 1.0 60 0.074 0.115 0.147 0.173 0.219 0.267 0.334
80 0.074 0.115 0.147 0.173 0.220 0.268 0.336
100 0.074 0.115 0.147 0.173 0.220 0.268 0.337
∞    0.074  0.116  0.148  0.175  0.222  0.271  0.338

A²   20   0.080  0.127  0.161  0.188  0.232  0.271  0.325


40 0.078 0.126 0.161 0.189 0.241 0.292 0.355
60 0.077 0.126 0.164 0.192 0.244 0.300 0.373
P = 0.2
80 0.077 0.126 0.164 0.194 0.249 0.306 0.385
100 0.078 0.126 0.163 0.195 0.252 0.311 0.394
∞    0.078  0.128  0.161  0.200  0.274  0.336  0.438
10 0.158 0.248 0.312 0.363 0.445 0.528 0.671
20 0.157 0.248 0.319 0.379 0.477 0.582 0.719
40 0.157 0.250 0.322 0.382 0.485 0.584 0.736
P = 0.4 60 0.156 0.251 0.324 0.382 0.493 0.605 0.753
80 0.156 0.252 0.326 0.385 0.496 0.611 0.762
100 0.157 0.252 0.326 0.388 0.497 0.614 0.767
∞    0.158  0.255  0.330  0.407  0.501  0.614  0.788

10 0.243 0.373 0.474 0.549 0.684 0.835 1.058


20 0.241 0.375 0.482 0.568 0.721 0.875 1.104
40 0.243 0.385 0.492 0.580 0.733 0.892 1.126
P = 0.6 60 0.244 0.382 0.491 0.580 0.730 0.892 1.126
80 0.244 0.382 0.491 0.580 0.731 0.894 1.128
100 0.244 0.383 0.492 0.581 0.733 0.897 1.130
∞    0.244  0.390  0.494  0.584  0.746  0.914  1.145

(continued)

TABLE 4.16 (continued)

Significance level α

Statistic n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

10 0.337 0.510 0.636 0.740 0.929 1.130 1.434


20 0.337 0.518 0.662 0.773 0.979 1.195 1.512
40 0.344 0.529 0.669 0.782 0.985 1.186 1.465
P = 0.8 60 0.341 0.527 0.670 0.782 0.989 1.207 1.516
80 0.341 0.530 0.670 0.783 0.991 1.214 1.529
100 0.341 0.532 0.671 0.785 0.993 1.218 1.533
∞    0.345  0.549  0.675  0.793  1.003  1.222  1.521

10 0.391 0.580 0.732 0.852 1.059 1.289 1.584


20 0.396 0.611 0.768 0.894 1.117 1.360 1.706
40 0.399 0.608 0.771 0.893 1.117 1.330 1.633
P = 0.9 60 0.398 0.609 0.766 0.897 1.127 1.367 1.699
80 0.398 0.612 0.766 0.900 1.132 1.380 1.720
100 0.399 0.615 0.768 0.902 1.135 1.386 1.728
∞    0.407  0.630  0.781  0.914  1.149  1.394  1.729

10 0.433 0.653 0.800 0.928 1.176 1.422 1.738


20 0.431 0.657 0.822 0.959 1.205 1.466 1.811
40 0.437 0.657 0.824 0.958 1.195 1.432 1.779
60 0.434 0.654 0.824 0.959 1.202 1.447 1.803
P = 0.95
80 0.434 0.657 0.827 0.962 1.208 1.456 1.812
100 0.435 0.660 0.830 0.965 1.212 1.462 1.817
∞    0.444  0.680  0.850  0.983  1.232  1.490  1.830

10 0.485 0.746 0.886 1.017 1.278 1.524 1.894


20 0.488 0.723 0.904 1.052 1.315 1.570 1.924
40 0.494 0.732 0.907 1.049 1.299 1.565 1.933
P = 1.0 60 0.491 0.728 0.905 1.051 1.303 1.570 1.964
80 0.491 0.728 0.906 1.053 1.306 1.574 1.971
100 0.492 0.729 0.907 1.054 1.308 1.576 1.973
∞    0.496  0.736  0.916  1.062  1.321  1.591  1.959

Some asymptotic points taken from Pettitt (1977b), with permission of the
author and of the Biometrika Trustees.

(c) Refer ₂W²ᵣ,ₙ and ₂A²ᵣ,ₙ to the percentage points given in Table 4.16.

Pettitt (1977b) gave asymptotic theory and points for this test; some of the
points have been used in Table 4.16. Tables for ₂U²ᵣ,ₙ are given by Stephens
(1986).
For a test with Type 1 censored data, the test statistic, say ₁W²ᵣ,ₙ,
can be found by setting p = p̂ = 1 - exp(-t/β̂), where t is the censoring
value and β̂ is found as above, and then using the formulas of Section 4.7.3,
with sample size r + 1. For large samples, an approximate test may be
made by referring the statistic to Table 4.16, with entries p = p̂ and n, but
for smaller samples, entering the table at an estimate of p instead of the
true value can produce a considerable error in significance level; see the
comments in Section 4.8.4 on tests for normality.
Another method of treating right-censored data is to use the N-transformation
of Chapter 10 (see Section 10.5.6). This converts a right-censored
exponential sample to a complete exponential sample, and the above tests of
exponentiality for complete samples, or others given in Chapter 10, may
then be used to test H0.
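For Type 2 censored data, step (a) can be sketched as below; the β̂ used is the Type 2 estimate quoted above (X(r) taking the place of t in the Type 1 formula):

```python
import math

def type2_censored_z(x_ordered, n):
    # x_ordered: the r smallest order statistics from a sample of size n
    r = len(x_ordered)
    beta = (sum(x_ordered) + (n - r) * x_ordered[-1]) / r   # Type 2 beta-hat
    return beta, [1 - math.exp(-xi / beta) for xi in x_ordered]
```

The r values Z(i) would then go into the censored-data formulas of Section 4.7.3, and the resulting statistics referred to Table 4.16.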

4.10 EDF TESTS FOR THE EXTREME-VALUE DISTRIBUTION

One form of the extreme-value distribution is

    F(x) = exp[-exp{-(x - α)/β}],   -∞ < x < ∞                        (4.6)

where -∞ < α < ∞ and β > 0.


The distribution of X' = -X gives a second form of the extreme-value
distribution:

    F(x') = 1 - exp[-exp{(x' - α')/β}],   -∞ < x' < ∞                 (4.7)

(here α' = -α above).
The first version (4.6) has a long tail to the right, and (4.7) has a long
tail to the left.
In this section we discuss EDF tests of the null hypothesis

H0: a random sample X1, . . . , Xn comes from distribution (4.6),
    with one or both of the parameters α and β unknown

Three test situations can again be distinguished (Stephens, 1977):



Case 1: β known, α to be estimated.
Case 2: α known, β to be estimated.
Case 3: α and β both unknown, and to be estimated.

We suppose the parameters will be estimated by maximum likelihood;
the estimates, for Case 3, are given by the equations (Johnson and Kotz,
1970, p. 283):

    β̂ = Σᵢ Xᵢ/n - [Σᵢ Xᵢ exp(-Xᵢ/β̂)] / [Σᵢ exp(-Xᵢ/β̂)]              (4.8)

and

    α̂ = -β̂ log [Σᵢ exp(-Xᵢ/β̂)/n]                                     (4.9)

Equation (4.8) is solved iteratively for β̂, and then (4.9) can be solved for α̂.
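Any one-dimensional root finder will do for (4.8); the sketch below uses bisection (our choice of scheme), with the exponentials shifted by the sample minimum to avoid underflow at small trial values of β̂:

```python
import math

def ev_mle_case3(x, iters=200):
    # Solve (4.8) for beta-hat by bisection, then evaluate (4.9) for alpha-hat
    n = len(x)
    xbar = sum(x) / n
    xmin = min(x)

    def g(beta):
        # beta - xbar + sum x_i e^{-x_i/beta} / sum e^{-x_i/beta}; the shift
        # by xmin cancels in the ratio and prevents underflow
        e = [math.exp(-(xi - xmin) / beta) for xi in x]
        return beta - xbar + sum(xi * ei for xi, ei in zip(x, e)) / sum(e)

    lo, hi = 1e-9 * (xbar - xmin) + 1e-12, xbar - xmin   # g(lo) < 0 < g(hi)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2
    # (4.9) in shifted form: alpha = xmin - beta * log(sum e^{-(x_i-xmin)/beta} / n)
    alpha = xmin - beta * math.log(sum(math.exp(-(xi - xmin) / beta)
                                       for xi in x) / n)
    return alpha, beta
```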
In Case 1, β is known; then α̂ is given by (4.9) with β replacing β̂. In Case 2,

TABLE 4.17 Modifications and Upper Tail Percentage Points for Statistics
W², U², and A², for the Extreme-Value or Weibull Distributions
(Sections 4.10, 4.11)

Significance level α

Statistic   Modification          .25   .10   .05   .025  .01

W²
Case 1      W²(1 + 0.16/n)        .116   .175   .222   .271   .338
Case 2      None                  .186   .320   .431   .547   .705
Case 3      W²(1 + 0.2/√n)        .073   .102   .124   .146   .175

U²
Case 1      U²(1 + 0.16/n)        .090   .129   .159   .189   .230
Case 2      U²(1 + 0.16/√n)       .086   .123   .152   .181   .220
Case 3      U²(1 + 0.2/√n)        .070   .097   .117   .138   .165

A²
Case 1      A²(1 + 0.3/n)         .736   1.062  1.321  1.591  1.959
Case 2      None                  1.060  1.725  2.277  2.854  3.640
Case 3      A²(1 + 0.2/√n)        .474   .637   .757   .877   1.038

Taken from Stephens (1977), with permission of the Biometrika Trustees.



TABLE 4.18 Upper Tail Percentage Points for Statistics √nD⁺, √nD⁻,
√nD, and √nV, for Tests for the Extreme-Value or Weibull Distributions
(Sections 4.10, 4.11)

Significance level α

Statistic   n     .10   .05   .025  .01

√nD⁺        10    .872   .969  1.061  1.152
Case 1      20    .878   .979  1.068  1.176
            50    .882   .987  1.070  1.193
            ∞     .886   .996  1.094  1.211

√nD⁻        10    .773   .883   .987  1.103
Case 1      20    .810   .921  1.013  1.142
            50    .840   .950  1.031  1.171
            ∞     .886   .996  1.094  1.211

√nD         10    .934  1.026  1.113  1.206
Case 1      20    .954  1.049  1.134  1.239
            50    .970  1.067  1.148  1.263
            ∞     .995  1.094  1.184  1.298

√nV         10   1.43   1.55   1.65   1.77
Case 1      20   1.46   1.58   1.69   1.81
            50   1.48   1.59   1.72   1.84
            ∞    1.53   1.65   1.77   1.91

√nD⁺        10    .99   1.14   1.27   1.42
Case 2      20   1.00   1.15   1.28   1.43
            50   1.01   1.17   1.29   1.44
            ∞    1.02   1.17   1.30   1.46

√nD⁻        10   1.01   1.16   1.28   1.41
Case 2      20   1.01   1.15   1.28   1.43
            50   1.00   1.14   1.29   1.45
            ∞    1.02   1.17   1.30   1.46

(continued)

TABLE 4.18 (continued)

Significance level α

Statistic n .10 .05 .025 .01

√nD         10   1.14   1.27   1.39   1.52
Case 2      20   1.15   1.28   1.40   1.53
            50   1.16   1.29   1.41   1.53
            ∞    1.16   1.29   1.42   1.53

√nV         10   1.39   1.49   1.60   1.72
Case 2      20   1.42   1.54   1.64   1.76
            50   1.45   1.56   1.67   1.79
            ∞    1.46   1.58   1.69   1.81

√nD⁺        10    .685   .755   .842   .897
Case 3      20    .710   .780   .859   .926
            50    .727   .796   .870   .940
            ∞     .734   .808   .877   .957

√nD⁻        10    .700   .766   .814   .892
Case 3      20    .715   .785   .843   .926
            50    .724   .796   .860   .944
            ∞     .733   .808   .877   .957

√nD         10    .760   .819   .880   .944
Case 3      20    .779   .843   .907   .973
            50    .790   .856   .922   .988
            ∞     .803   .874   .939  1.007

√nV         10   1.287  1.381  1.459  1.535
Case 3      20   1.323  1.428  1.509  1.600
            50   1.344  1.453  1.538  1.639
            ∞    1.372  1.477  1.557  1.671

Taken from Chandra, Singpurwalla, and Stephens (1981), with permission of
the authors and of the American Statistical Association.
The table for √nD, Case 2, has been corrected.

α is known; suppose then that Yj = Xj - α; β̂ is given by solving

    β̂ = {Σⱼ Yⱼ - Σⱼ Yⱼ exp(-Yⱼ/β̂)} / n

The steps in making the test are then:

(a) Estimate unknown parameters as above.
(b) Calculate Z(i) = F(X(i)), i = 1, . . . , n, where F(x) is given by equation
    (4.6), using estimated parameters when necessary.
(c) Use formulas (4.2) to calculate the EDF statistics.
(d) Modify the test statistics as shown in Table 4.17, or use Table 4.18
    and compare with the upper tail percentage points given.

Table 4.17 is taken from Stephens (1977), and Table 4.18 from Chandra,
Singpurwalla, and Stephens (1981).
Case 1 above is equivalent to a test for the exponential distribution on the
transformed variable Y = exp(-X/β). This transformation in (4.6) gives,
for Y, the distribution F(y) = 1 - exp(-δy), y > 0, with δ = exp(α/β). When
β is known, the transformation can be made, and the Y values are then tested
to come from the exponential distribution with origin zero and unknown scale
parameter (Section 4.9.3). The test statistics for the exponential test will
take the same values as those for the Case 1 test in the present section,
except that D⁺ becomes D⁻ and vice versa.
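The algebra behind this equivalence can be checked numerically; in the sketch below the parameter values are arbitrary:

```python
import math

def ev_cdf(x, alpha, beta):
    return math.exp(-math.exp(-(x - alpha) / beta))   # equation (4.6)

alpha, beta, y = 0.7, 2.0, 0.4
delta = math.exp(alpha / beta)
# Y = exp(-X/beta) <= y  iff  X >= -beta*log(y), so the two sides below agree
lhs = 1 - math.exp(-delta * y)                        # F(y) = 1 - exp(-delta*y)
rhs = 1 - ev_cdf(-beta * math.log(y), alpha, beta)
```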

4.11 EDF TESTS FOR THE WEIBULL DISTRIBUTION

4.11.1 Test Situations

The general form of the Weibull distribution W(x; α, β, m) is

    F(x) = 1 - exp{-[(x - α)/β]^m},   x > α; β > 0, m > 1

Here α and β are location and scale parameters, respectively, and m is a
shape parameter; α is called the origin of the distribution. The Weibull
density function is

    f(x) = (m/β)[(x - α)/β]^(m-1) exp{-[(x - α)/β]^m},   x > α
The null hypothesis in this section is

H0: a random sample X1, . . . , Xn comes from the Weibull
    distribution W(x; α, β, m)

4.11.2 Tests When the Location Parameter Is Known;
       Reduction to a Test for the Extreme-Value Distribution

We consider the case where α is known. Suppose its value is zero, so that
H0 becomes

H0a: the set X comes from W(x; 0, β, m)

This distribution is often called the two-parameter Weibull distribution. If α
is not zero, but has value α₀, say, the transformation Xi' = Xi - α₀, i =
1, . . . , n, gives a set X', for which H0a will be true when H0 is true for X;
thus H0a is tested for X'.
In considering H0a we distinguish three cases:

Case 1: m is known and β is unknown;
Case 2: β is known and m is unknown;
Case 3: both m and β are unknown.

For the test of H0a, the tables for the extreme-value distribution tests
may be used. Let Y = -log X in the distribution W(x; 0, β, m); the distribution
for Y becomes

    F(y) = exp{-exp[-(y - φ)/θ]},   -∞ < y < ∞                        (4.10)

with θ = 1/m and φ = -log β. This distribution is the extreme-value
distribution of Section 4.10, and a test of H0a for X may be made by testing
that Y has the extreme-value distribution, with one or both of θ and φ unknown.
The test procedure therefore becomes:

(a) Make the transformation Yi = -log Xi, i = 1, . . . , n.
(b) Arrange the Yi in ascending order (note that if the Xi were given in
    ascending order, the Yi will be in descending order).
(c) Test that the Y-sample is from the extreme-value distribution (4.6) as
    described in Section 4.10.
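The reduction can be checked numerically, and step (a)-(b) coded, as in the sketch below (arbitrary parameter values):

```python
import math

def weibull_to_ev(x):
    # steps (a)-(b): Y_i = -log X_i, returned in ascending order,
    # ready for the extreme-value tests of Section 4.10
    return sorted(-math.log(xi) for xi in x)

# CDF check of (4.10): with theta = 1/m and phi = -log(beta),
# P(Y <= y) = P(X >= exp(-y)) = 1 - W(exp(-y); 0, beta, m)
beta, m, y = 2.5, 1.7, 0.3
lhs = math.exp(-math.exp(-(y - (-math.log(beta))) / (1 / m)))   # F(y) of (4.10)
rhs = math.exp(-((math.exp(-y) / beta) ** m))                   # 1 - W(exp(-y))
```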

In Case 1, m will be known, and so θ will be known in distribution (4.10)
for Y. The test is therefore a Case 1 test as described in Section 4.10. In
Case 2, β is known, and so φ is known in distribution (4.10) for Y. The test
will be a Case 2 test of Section 4.10. In Case 3, both parameters θ and φ
in (4.10) will be unknown, so the test will be a Case 3 test of Section 4.10.
Tables for the rather more unusual cases where α is unknown have been
given by Lockhart and Stephens (1985a).

4.12 EDF TESTS FOR THE GAMMA DISTRIBUTION

In this section we discuss tests of the null hypothesis

H0: a random sample X1, . . . , Xn comes from the Gamma
    distribution G(x; α, β, m) with density

    f(x) = [(x - α)/β]^(m-1) exp{-(x - α)/β} / {β Γ(m)},   x > α; β > 0, m > 0

The location parameter α will be called the origin of the distribution; β and
m are, respectively, scale and shape parameters.

4.12.1 Tests with Known Origin, Cases 1, 2, and 3

We consider the case where α is known. If α = 0, H0 becomes

H0a: the set X comes from G(x; 0, β, m)

If α is not zero, but has value α₀, say, the transformation Xj' = Xj - α₀,
j = 1, . . . , n, is made to give a set X'; then the null hypothesis H0 for set X
reduces to H0a for set X', and H0a is tested using the set X'.
In considering H0a we can distinguish three cases:

Case 1: m is known, and β is unknown;
Case 2: β is known, and m is unknown;
Case 3: both m and β are unknown.

For Cases 2 and 3, distribution theory, even asymptotic theory, when m
is estimated by maximum likelihood or another efficient method, will depend
on the true m; this is because m is not a location or scale parameter
(Section 4.3.3). However, useful approximate tests can still be made as follows.

4.12.2 Tests for Case 1

The steps for making this test are:

(a) Put the sample in ascending order X(1) ≤ · · · ≤ X(n).
(b) Let X̄ be the sample mean, and estimate β by β̂ = X̄/m; β̂ is the maximum
    likelihood estimator of β.
(c) Define

    I(x; m, β) = [1 / {β^m Γ(m)}] ∫₀ˣ t^(m-1) exp(-t/β) dt

TABLE 4.19 Upper Tail Asymptotic Percentage Points
for W², U², and A² in Tests for the Gamma Distribution
(Section 4.12)*

Significance level α

Statistic m .10 .05 .025 .01

W²    1    .175  .222  .271  .338
2 .156 .195 .234 .288
3 .149 .185 .222 .271
4 .146 .180 .215 .262
5 .144 .177 .211 .257
6 .142 .175 .209 .254
8 .140 .173 .205 .250
10 .139 .171 .204 .247
12 .138 .170 .202 .245
15 .138 .169 .201 .244
20 .137 .169 .200 .243
∞    .135  .165  .196  .237

U²    1    .129  .159  .189  .230


2 .129 .158 .188 .228
3 .128 .158 .187 .227
4 .128 .158 .187 .227
5 .128 .158 .187 .227
6 .128 .157 .187 .227
8 .128 .157 .187 .227
10 .128 .157 .187 .227
12 .128 .157 .187 .227
15 .128 .157 .187 .227
20 .128 .157 .187 .227
∞    .128  .157  .187  .227

A²    1   1.062  1.321  1.591  1.959


2 .989 1.213 1.441 1.751
3 .959 1.172 1.389 1.683
4 .944 1.151 1.362 1.648
5 .935 1.139 1.346 1.627
6 .928 1.130 1.335 1.612
8 .919 1.120 1.322 1.595
10 .915 1.113 1.314 1.583
12 .911 1.110 1.310 1.578
15 .908 1.106 1.304 1.570
20 .905 1.101 1.298 1.562
∞    .893  1.087  1.281  1.551

*Parameters: location α known; scale β unknown;
shape m known.

Accurate computer routines now exist for this expression (the incomplete
gamma function). Calculate Z(i) = I(X(i); m, β̂), for i = 1, . . . , n.
(d) Calculate the EDF statistics from the Z(i) using formulas (4.2).
(e) Modify the statistics as follows:

For m = 1, calculate

    W* = W²(1 + 0.16/n);   U* = U²(1 + 0.16/n);   A* = A²(1 + 0.6/n)

For m ≥ 2, calculate

w . . - » • » ) , D. . .
1.8n-l l.Sn-l n^ m^

The modified statistics are then referred to the upper tail percentage points
given in Table 4.19 for the appropriate known value of m. These points are
the asymptotic points for the various distributions; they were given by Pettitt
and Stephens (1983).
The modifications given above are based on Monte Carlo studies for
finite n, and have been designed to be as comprehensive as possible, covering
all values of m and n; when the given percentage points are used at level α
it is believed that the true level of significance will not differ by more than
0.5% for n ≥ 5.
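Step (c) needs the regularized incomplete gamma function; the short power-series routine below is a stand-in for the accurate routines mentioned above (our own sketch, adequate for moderate values of x/β̂, not a production implementation):

```python
import math

def reg_inc_gamma(m, u, terms=500):
    # P(m, u) = u^m e^{-u} / Gamma(m) * sum_{k>=0} u^k / (m (m+1) ... (m+k))
    if u <= 0:
        return 0.0
    total, term = 0.0, 1.0 / m
    for k in range(terms):
        total += term
        term *= u / (m + k + 1)
    return total * math.exp(m * math.log(u) - u - math.lgamma(m))

def gamma_case1_z(x, m):
    # steps (a)-(c): beta-hat = xbar/m, then Z_(i) = I(X_(i); m, beta-hat)
    xs = sorted(x)
    beta = (sum(xs) / len(xs)) / m
    return [reg_inc_gamma(m, xi / beta) for xi in xs]
```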

4.12.3 Application to a Test for the Chi-Square Distribution

The Gamma distribution, with m = r/2 and β = 2, becomes the chi-square
distribution with r degrees of freedom. Thus this Case 1 test can be used to
test that observations Xj, multiplied by an unknown constant, come from a
chi-square distribution with known degrees of freedom. For example, it may
be used to test H0: n independent sample variances s1², . . . , sn², each
calculated from a sample of size k, come from parent populations which are
normal with the same (unknown) variance σ². An application might be to test
for constant variance in an Analysis of Variance with cells each containing k
observations. Other applications are given by Pettitt and Stephens (1983).

4.12.4 Test for Case 2

For this case, the steps are as follows:

(a) Put the sample in ascending order X(1) ≤ · · · ≤ X(n).
(b) Estimate m by solving for m̂ in the equation {Σᵢ log Xᵢ}/n - log β = ψ(m̂),
    where ψ(m) is the digamma function (d/dm) log Γ(m); m̂ is the maximum
    likelihood estimator of m.
(c) Calculate Z(i) = I(X(i); m̂, β), for i = 1, . . . , n.
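Step (b) requires inverting the digamma function; a sketch (the digamma here is a central difference of log Γ, and the bisection exploits the fact that ψ is increasing on (0, ∞); both are our own simplifications):

```python
import math

def digamma(m, h=1e-5):
    # psi(m) = d/dm log Gamma(m), approximated by a central difference
    return (math.lgamma(m + h) - math.lgamma(m - h)) / (2 * h)

def gamma_case2_mhat(x, beta, lo=1e-3, hi=1e3, iters=200):
    # solve (1/n) sum log(X_i / beta) = psi(m-hat) for m-hat by bisection
    target = sum(math.log(xi / beta) for xi in x) / len(x)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)   # bisect on a log scale, m > 0
        if digamma(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```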

TABLE 4.20 Upper Tail Asymptotic Percentage Points for W², U², and A²
in Tests for the Gamma Distribution (Section 4.12)*

Significance level α

Statistic  m    .25   .10   .05   .025  .01   .005

W²    1    .103  .150  .186  .223  .273  .311


2 .099 .143 .176 .210 .256 .291
3 .097 .140 .172 .205 .250 .283
4 .096 .138 .171 .203 .247 .280
5 .096 .138 .169 .202 .245 .278
6 .095 .137 .169 .201 .244 .276
8 .095 .136 .168 .200 .242 .275
10 .095 .136 .167 .199 .241 .274
12 .095 .136 .167 .199 .241 .273
15 .094 .135 .167 .198 .240 .272
20 .094 .135 .166 .198 .240 .272
OO .094 .134 .165 .197 .238 .270

I .090 .129 .159 .189 .230 .262


2 .089 .128 .158 .189 .229 .261
3 .089 .128 .158 .188 .229 .260
4 .089 .128 .158 .188 .229 .260
5 .089 .128 .158 .188 .229 .260
6 .089 .128 .158 .188 .228 .260
8 .089 .128 .157 .188 .228 .260
10 .089 .128 .157 .188 .228 .260
12 .089 .128 .157 .188 .228 .260
15 .089 .128 .157 .188 .228 .260
20 .089 .127 .157 .187 .228 .260
OO .090 .127 .157 .187 .228 .259

A* I .680 .956 1.170 1.390 1.687 1.916


2 .661 .926 1.130 1.338 1.619 1.836
3 .655 .915 1.115 1.320 1.596 1.809
4 .651 .909 1.108 1.310 1.584 1.795
5 .649 .906 1.103 1.305 1.577 1.787
6 .648 .904 1.101 1.301 1.572 1.781
8 .646 .901 1.097 1.297 1.567 1.775
10 .645 .899 1.095 1.294 1.563 1.771
12 .644 .898 1.094 1.293 1.561 1.768
15 .644 .897 1.092 1.291 1.559 1.766
20 .643 .896 1.091 1.289 1.557 1.763
OO .644 .894 1.087 1.285 1.551 1.756

^Parameters: location a known; scale ß known; shape m unknown.


TESTS BASED ON EDF STATISTICS 155

TABLE 4.21 Upper Tail Asymptotic Percentage Points for W², U², and A²
in Tests for the Gamma Distribution (Section 4.12)ᵃ

                         Significance level α

Statistic   m     .25    .10    .05    .025   .01    .005

W²          1     .079   .111   .136   .162   .196   .222
            2     .076   .107   .131   .155   .187   .211
            3     .075   .106   .129   .153   .184   .208
            4     .075   .105   .128   .152   .183   .207
            5     .075   .105   .128   .151   .182   .206
            6     .075   .105   .128   .151   .181   .205
            8     .074   .104   .127   .150   .181   .204
            10    .074   .104   .127   .150   .180   .204
            12    .074   .104   .127   .150   .180   .203
            15    .074   .104   .127   .149   .180   .203
            20    .074   .104   .126   .149   .180   .203
            ∞     .074   .104   .126   .148   .178   .201

U²          1     .071   .098   .119   .141   .169   .190
            2     .070   .097   .118   .139   .166   .187
            3     .070   .097   .118   .138   .165   .186
            4     .070   .097   .117   .138   .165   .186
            5     .069   .097   .117   .138   .165   .185
            6     .069   .097   .117   .138   .165   .185
            8     .069   .096   .117   .137   .164   .185
            10    .069   .096   .117   .137   .164   .185
            12    .069   .096   .117   .137   .164   .185
            15    .069   .096   .117   .137   .164   .185
            20    .069   .096   .117   .137   .164   .185
            ∞     .069   .096   .117   .136   .164   .183

A²          1     .486   .657   .786   .917   1.092  1.227
            2     .477   .643   .768   .894   1.062  1.190
            3     .475   .639   .762   .886   1.052  1.178
            4     .473   .637   .759   .883   1.048  1.173
            5     .472   .635   .758   .881   1.045  1.170
            6     .472   .635   .757   .880   1.043  1.168
            8     .471   .634   .755   .878   1.041  1.165
            10    .471   .633   .754   .877   1.040  1.164
            12    .471   .633   .754   .876   1.039  1.163
            15    .470   .632   .754   .876   1.038  1.162
            20    .470   .632   .753   .875   1.037  1.161
            ∞     .470   .631   .752   .873   1.035  1.159

ᵃParameters: location α known; scale β unknown; shape m unknown.



(d) Calculate the EDF statistics from the Z(i) using formulas (4.2).
(e) Reject H₀ if the value of the statistic used is greater than the value in
    Table 4.20 for the desired significance level α and for the appropriate m̂.
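The Case 2 steps can be sketched numerically; a minimal illustration in Python, assuming SciPy is available (the function name gamma_case2_A2 and the choice of the Anderson-Darling statistic A² are ours; any of the EDF statistics of formulas (4.2) could be computed from the same Z-values):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma
from scipy.stats import gamma as gamma_dist

def gamma_case2_A2(x, beta):
    """Case 2 test for the gamma distribution: scale beta known,
    shape m estimated by maximum likelihood.

    Solves psi(m) = mean(log x) - log(beta) for m_hat, transforms the
    ordered sample by the gamma CDF, and returns (m_hat, A^2).
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    target = np.mean(np.log(x)) - np.log(beta)          # psi(m_hat) must equal this
    m_hat = brentq(lambda m: digamma(m) - target, 1e-6, 1e6)
    z = gamma_dist.cdf(x, a=m_hat, scale=beta)          # Z_(i) = I(X_(i); m_hat, beta)
    i = np.arange(1, n + 1)
    a2 = -n - np.mean((2 * i - 1) * (np.log(z) + np.log1p(-z[::-1])))
    return m_hat, a2
```

The returned m̂ is then also the entry value for Table 4.20.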

4.12.5 Tests for Case 3

The steps in the test are as follows:

(a) Estimate m by solving for m̂ in the equation

    {Σᵢ log Xᵢ}/n − log X̄ = ψ(m̂) − log m̂

    where ψ(m) is the digamma function as above, and estimate β by
    β̂ = X̄/m̂.
(b) Calculate EDF statistics from Z(i) = I(X(i); m̂, β̂), i = 1, . . . , n.
(c) Reject H₀ if the value of the statistic used is greater than the value in
    Table 4.21, for the desired significance level α, and the appropriate m̂.
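Step (a) reduces to a one-dimensional root-finding problem; a short sketch in Python, assuming SciPy (the helper name gamma_case3_mle is ours):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def gamma_case3_mle(x):
    """Case 3 estimates for the gamma distribution (Section 4.12.5):
    solve mean(log x) - log(mean x) = psi(m) - log(m) for m_hat,
    then set beta_hat = mean(x) / m_hat."""
    x = np.asarray(x, dtype=float)
    c = np.mean(np.log(x)) - np.log(np.mean(x))   # < 0 by Jensen's inequality
    m_hat = brentq(lambda m: digamma(m) - np.log(m) - c, 1e-8, 1e8)
    return m_hat, np.mean(x) / m_hat
```

Since ψ(m) − log m increases from −∞ to 0 as m runs over (0, ∞), the root is unique and the wide bracket above is safe.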

4.12.5.1 Comment

The points in Table 4.21 remain remarkably stable as m changes, especially
for U², and accurate results can be expected when m̂ is used for m, except
possibly for small values of m. Note that only asymptotic points are given;
experience with W², U², and A² suggests these will be very good approximations
to the points for finite n, even for quite small n. The points in
Tables 4.20 and 4.21 are taken from Lockhart and Stephens (1985b), where
the asymptotic theory is also developed. A somewhat different treatment
was given much earlier in an unpublished report by Mickey, Mundle, Walker,
and Glinski (1963). The various cases when the origin is not known are much
more unlikely; furthermore, it is often difficult to estimate parameters efficiently.
Tests for these cases have been given by Lockhart and Stephens
(1985b). Tables for the Kolmogorov statistic D, for n = 4(1)10(5)30, have
been given for Cases 1, 2, and 3 above (a different estimate of m is used in
Case 3) by Schneider and Clickner (1976).

4.13 EDF TESTS FOR THE LOGISTIC DISTRIBUTION

In this section we discuss the test of

H₀: a random sample X₁, . . . , Xₙ comes from the logistic distribution

F(x; α, β) = 1/[1 + exp{−(x − α)/β}], −∞ < x < ∞; β > 0

with parameters α or β, or both, unknown


TABLE 4.22 Modifications and Upper Tail Percentage Points for W², U², A² in Tests for the Logistic Distribution
(Section 4.13)ᵃ

                                                  Significance level α

Statistic        Modification                      .25    .10    .05    .025   .01    .005

W²   Case 1   (1.9nW² − 0.15)/(1.8n − 1.0)        .083   .119   .148   .177   .218   .249
     Case 2   (0.88nW² − 0.45)/(0.95n − 1.0)      .184   .323   .438   .558   .721   .847
     Case 3   (nW² − 0.08)/(n − 1.0)              .060   .081   .098   .114   .136   .152

U²   Case 2   (1.6nU² − 0.16)/(1.6n − 1.0)        .080   .116   .145   .174   .214   .246

A²   Case 1   A² + 0.15/n                         .615   .857   1.046  1.241  1.505  1.710
     Case 2   (0.6nA² − 1.8)/(0.6n − 1.0)         1.043  1.725  2.290  2.880  3.685  4.308
     Case 3   A²(1.0 + 0.25/n)                    .426   .563   .660   .769   .906   1.010

ᵃFor U², Cases 1 and 3, use the modifications and percentage points for W², Cases 1 and 3, respectively (see Section 4.13).
Taken from Stephens (1979), with permission of the Biometrika Trustees.

TABLE 4.23 Upper Tail Percentage Points for Statistics D⁺√n,
D√n, and V√n, for Tests for the Logistic Distribution (Section 4.13)

                        Significance level α
Case      n      0.10     0.05     0.025    0.01

Statistic D⁺√n

1         5      0.702    0.758    0.805    0.854
          10     0.730    0.792    0.846    0.913
          20     0.744    0.809    0.867    0.944
          50     0.752    0.819    0.880    0.962
          ∞      0.757    0.826    0.888    0.974

2         5      0.971    1.120    1.239    1.380
          10     0.990    1.143    1.268    1.423
          20     0.999    1.150    1.282    1.444
          50     1.005    1.161    1.290    1.456
          ∞      1.009    1.166    1.297    1.464

3         5      0.603    0.650    0.690    0.735
          10     0.636    0.687    0.736    0.789
          20     0.653    0.705    0.758    0.816
          50     0.663    0.716    0.773    0.832
          ∞      0.669    0.723    0.781    0.842

Statistic D√n

1         5      0.736    0.791    0.845    0.883
          10     0.777    0.837    0.895    0.953
          20     0.800    0.865    0.926    0.997
          50     0.808    0.874    0.937    1.011
          ∞      0.816    0.883    0.947    1.025

2         5      1.108    1.236    1.349    1.474
          10     1.148    1.274    1.388    1.521
          20     1.167    1.294    1.406    1.545
          50     1.179    1.305    1.419    1.559
          ∞      1.187    1.313    1.427    1.568

3         5      0.643    0.679    0.723    0.751
          10     0.679    0.730    0.774    0.823
          20     0.698    0.755    0.800    0.854
          50     0.708    0.770    0.817    0.873
          ∞      0.715    0.780    0.827    0.886

Statistic V√n

1         5      1.369    1.471    1.580    1.658
          10     1.410    1.520    1.630    1.741
          20     1.433    1.550    1.659    1.790
          50     1.447    1.564    1.675    1.815
          ∞      1.454    1.574    1.685    1.832

2         5      1.314    1.432    1.547    1.674
          10     1.372    1.483    1.587    1.711
          20     1.400    1.510    1.607    1.730
          50     1.417    1.525    1.619    1.741
          ∞      1.429    1.535    1.627    1.748

3         5      1.170    1.246    1.299    1.373
          10     1.230    1.311    1.381    1.466
          20     1.260    1.344    1.422    1.514
          50     1.277    1.364    1.448    1.542
          ∞      1.289    1.376    1.463    1.560

Taken from Stephens (1979), with permission of the Biometrika
Trustees.

As in earlier sections, we distinguish three cases:

Case 1: β known, α unknown;
Case 2: α known, β unknown;
Case 3: α and β both unknown.

The parameters are estimated from the data by maximum likelihood.
For Case 3, when both α and β are unknown, the equations for the estimates
α̂, β̂ are

n⁻¹ Σᵢ [1 + exp{(Xᵢ − α̂)/β̂}]⁻¹ = 0.5                                  (4.11)

n⁻¹ Σᵢ {(Xᵢ − α̂)/β̂} [1 − exp{(Xᵢ − α̂)/β̂}] / [1 + exp{(Xᵢ − α̂)/β̂}] = −1   (4.12)

These equations may be solved iteratively; good starting values for α̂ and β̂
are the sample mean X̄ and the sample standard deviation s. In Case 1, α̂
is the solution of equation (4.11), with β replacing β̂. In Case 2, β̂ is the
solution of (4.12), with α replacing α̂. The steps in making the test are then
as follows:

(a) Find estimates of any unknown parameters.
(b) Calculate Z(i) = 1/[1 + exp{−(X(i) − α)/β}], i = 1, . . . , n, with α and β
    replaced by estimates where necessary.
(c) Calculate the EDF statistics from (4.2).
(d) For W², U², and A², modify the statistic as in Table 4.22; reject H₀ if
    the statistic exceeds the percentage point given for the desired significance
    level α. For D⁺, D⁻, D, and V, multiply by √n and use Table 4.23.
    The table for D⁺√n can also be used for D⁻√n.

The asymptotic percentage points for W², U², and A² given in Table 4.22
are based on theoretical work of Stephens (1979), and the modifications have
been derived, as in previous sections, from extensive Monte Carlo studies
for finite n. For each case, 10,000 samples were used to give the percentage
points, for n = 5, 8, 10, 20, and 50. The percentage points for D⁺√n, D√n,
and V√n given in Table 4.23 were derived from the same Monte Carlo
studies.
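Equations (4.11) and (4.12) can be solved numerically; a sketch in Python using SciPy's root finder (the function name logistic_mle is ours; the identity (1 − eʷ)/(1 + eʷ) = −tanh(w/2) is used so that neither equation overflows for large |w|):

```python
import numpy as np
from scipy.optimize import fsolve

def logistic_mle(x):
    """Solve the Case 3 likelihood equations (4.11)-(4.12) for the
    logistic parameters (alpha, beta), starting from the sample mean
    and standard deviation as suggested in Section 4.13."""
    x = np.asarray(x, dtype=float)

    def score(theta):
        a, b = theta
        w = (x - a) / b
        t = np.tanh(w / 2.0)              # = -(1 - e^w)/(1 + e^w)
        eq1 = np.mean(t)                  # equation (4.11), rearranged
        eq2 = np.mean(w * t) - 1.0        # equation (4.12), rearranged
        return [eq1, eq2]

    return fsolve(score, [np.mean(x), np.std(x, ddof=1)])
```

For Case 1 or Case 2 one would hold the known parameter fixed and solve only the corresponding single equation.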

4.14 EDF TESTS FOR THE CAUCHY DISTRIBUTION

The Cauchy distribution has density

f(x; α, β) = β / [π{β² + (x − α)²}], −∞ < x < ∞; β > 0     (4.13)

and distribution function

F(x; α, β) = 1/2 + (1/π) tan⁻¹{(x − α)/β}, −∞ < x < ∞; β > 0     (4.14)

In this section we discuss tests of the null hypothesis

H₀: a random sample X₁, . . . , Xₙ comes from the Cauchy distribution,
with one or both of parameters α and β unknown

As with previous tests, we consider three cases:

Case 1: β known, α unknown;
Case 2: α known, β unknown;
Case 3: both α and β unknown.

For other distributions, the parameters have been estimated by maximum
likelihood; however, for the Cauchy distribution, the likelihood may have

TABLE 4.24 Upper Tail Percentage Points for W² and A²
for Tests for the Cauchy Distribution (Section 4.14)

                    Significance level α

n       .25     .15     .10     .05     .025    .01

Case 1. Statistic W²

5       .208    .382    .667    1.26    1.51    1.61
8       .227    .480    .870    1.68    2.30    2.55
10      .227    .460    .840    1.80    2.60    3.10
12      .220    .430    .770    1.76    2.85    3.65
15      .205    .372    .670    1.59    2.88    4.23
20      .189    .315    .520    1.25    2.65    4.80
25      .175    .275    .420    .870    2.10    4.70
30      .166    .250    .360    .710    1.60    4.10
40      .153    .220    .290    .510    1.50    3.05
50      .145    .200    .260    .400    .70     2.05
100     .130    .170    .210    .270    .35     .60
∞       .115    .146    .173    .216    .260    .319

Case 1. Statistic A²

5       1.19    2.22    3.83    8.00    12.75   17.98
8       1.33    2.62    4.7     10.0    17.4    25.0
10      1.34    2.52    4.5     10.6    18.2    29.0
12      1.31    2.42    4.1     9.9     18.8    32.0
15      1.30    2.15    3.5     8.2     17.2    31.2
20      1.17    1.86    2.8     6.5     14.4    27.5
25      1.12    1.68    2.3     4.7     10.8    23.0
30      1.08    1.55    2.1     3.8     8.2     20.0
40      1.02    1.38    1.8     2.9     5.2     15.5
50      .970    1.29    1.6     2.4     3.8     10
100     .890    1.16    1.4     1.8     2.2     3.5
∞       .834    1.02    1.219   1.519   1.812   2.212

TABLE 4.25 Upper Tail Percentage Points for W² and A²
for Tests for the Cauchy Distribution (Section 4.14)

                    Significance level α

n       .25     .15     .10     .05     .025    .01

Case 2. Statistic W²

5       .199    .236    .261    .338    .437    .590
8       .211    .273    .321    .389    .463    .564
10      .212    .279    .332    .414    .501    .626
12      .212    .281    .337    .433    .525    .661
15      .206    .279    .339    .444    .537    .684
20      .199    .273    .333    .442    .547    .698
25      .194    .268    .328    .437    .551    .704
30      .189    .265    .326    .435    .553    .708
40      .185    .260    .323    .434    .555    .712
50      .183    .258    .321    .433    .557    .714
100     .179    .254    .319    .432    .559    .715
∞       .176    .250    .316    .431    .560    .714

Case 2. Statistic A²

5       .974    1.131   1.239   1.59    2.08    2.84
8       1.085   1.360   1.560   1.88    2.18    2.55
10      1.110   1.414   1.653   2.04    2.38    2.89
12      1.117   1.443   1.710   2.14    2.55    3.15
15      1.117   1.449   1.728   2.22    2.65    3.31
20      1.101   1.444   1.728   2.24    2.73    3.44
25      1.083   1.432   1.727   2.25    2.77    3.50
30      1.064   1.422   1.724   2.25    2.80    3.53
40      1.051   1.41    1.723   2.26    2.82    3.56
50      1.045   1.405   1.722   2.27    2.83    3.59
100     1.038   1.40    1.718   2.28    2.86    3.64
∞       1.034   1.409   1.716   2.283   2.872   3.677

TABLE 4.26 Upper Tail Percentage Points for W² and A²
for Tests for the Cauchy Distribution (Section 4.14)

                    Significance level α

n       .25     .15     .10     .05     .025    .01

Case 3. Statistic W²

5       .167    .242    .305    .393    .445    .481
8       .192    .315    .441    .703    .940    1.13
10      .197    .331    .481    .833    1.201   1.571
12      .194    .329    .487    .896    1.391   1.901
15      .185    .317    .472    .904    1.54    2.33
20      .169    .281    .419    .835    1.63    2.96
25      .154    .253    .366    .726    1.47    3.08
30      .143    .225    .319    .615    1.25    2.90
40      .126    .195    .263    .460    .850    2.17
50      .117    .175    .235    .381    .642    1.56
60      .1097   .160    .211    .330    .508    1.07
100     .098    .135    .174    .2378   .331    .544
∞       .080    .108    .130    .170    .212    .270

Case 3. Statistic A²

5       .835    1.14    1.40    1.77    2.00    2.16
8       .992    1.52    2.06    3.20    4.27    5.24
10      1.04    1.63    2.27    3.77    5.58    7.50
12      1.04    1.65    2.33    4.14    6.43    9.51
15      1.02    1.61    2.28    4.25    7.20    11.50
20      .975    1.51    2.13    4.05    7.58    14.57
25      .914    1.40    1.94    3.57    6.91    14.96
30      .875    1.30    1.76    3.09    5.86    13.80
40      .812    1.16    1.53    2.48    4.23    10.20
50      .774    1.08    1.41    2.14    3.37    7.49
60      .743    1.02    1.30    1.92    2.76    5.32
100     .689    .927    1.14    1.52    2.05    3.30
∞       .615    .780    .949    1.225   1.52    1.90

local maxima, and it may be difficult to find the true maximum. We therefore
find estimates using sums of weighted order statistics. Chernoff, Gastwirth,
and Johns (1967) have given the estimate α̂ = Σⱼ aⱼ X(j) with

aⱼ = sin 4π{j/(n + 1) − 0.5} / [n tan π{j/(n + 1) − 0.5}]

The estimate of β is β̂ = Σⱼ dⱼ X(j) with

dⱼ = 8 tan π{j/(n + 1) − 0.5} / [n sec⁴ π{j/(n + 1) − 0.5}]

These estimates are asymptotically efficient, and asymptotic distributions
of W², U², and A² can be found. The test of H₀ is then as follows:

(a) Estimate parameter α or β or both, as described above.
(b) Calculate Z(i) = F(X(i); α̂, β̂), given in (4.14), with estimates replacing
    unknown parameters.
(c) Use the formulas (4.2) to calculate EDF statistics.
(d) Refer to Tables 4.24, 4.25, or 4.26 to make the test; reject H₀ if the
    test statistic is greater than the value given for n and for the desired
    significance level α.

The points are taken from Stephens (1985), where the asymptotic theory,
and tables for U², are given.
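The weighted-order-statistic estimates are straightforward to compute; a sketch in Python (the function name is ours; for odd n the middle location weight is replaced by its limiting value 4/n, since sin 4πu / tan πu → 4 as u → 0):

```python
import numpy as np

def cauchy_estimates(x):
    """Chernoff-Gastwirth-Johns estimates of the Cauchy location alpha
    and scale beta from weighted order statistics (Section 4.14)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    u = np.arange(1, n + 1) / (n + 1.0) - 0.5
    with np.errstate(divide="ignore", invalid="ignore"):
        a = np.sin(4 * np.pi * u) / (n * np.tan(np.pi * u))
    a[u == 0.0] = 4.0 / n                 # limiting weight at u = 0 (odd n)
    # d_j = 8 tan(pi u) / (n sec^4(pi u)) = 8 tan(pi u) cos^4(pi u) / n
    d = 8 * np.tan(np.pi * u) * np.cos(np.pi * u) ** 4 / n
    return np.sum(a * x), np.sum(d * x)
```

Both weight sequences vanish at the extremes, so the estimates, unlike the sample mean, are not wrecked by the Cauchy distribution's heavy tails.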

4.15 EDF TESTS FOR THE VON MISES DISTRIBUTION

The von Mises distribution is used to describe unimodal data on the circumference
of a circle. Suppose the circle has center O and radius 1, and let a
radius OP be measured by the polar coordinate θ, from ON as origin. Let
θ₀ be the coordinate of a radius OA, and let κ be a positive constant. The
von Mises density is

f(θ; θ₀, κ) = exp{κ cos (θ − θ₀)} / {2π I₀(κ)}, 0 ≤ θ < 2π

Here I₀(κ) is the modified Bessel function of order zero. The distribution
has a mode along OA (that is, at θ = θ₀) and is symmetric around OA; for
κ = 0 the distribution reduces to the uniform distribution around the circle.
Suppose a random sample of values θ₁, θ₂, . . . , θₙ is given, denoting locations
on the circumference of points P₁, P₂, . . . , Pₙ. We discuss the test of

H₀: the random sample of θ-values comes from the von Mises distribution,
with distribution function F(θ; θ₀, κ) = ∫₀^θ f(u; θ₀, κ) du
TABLE 4.27 Upper Tail Percentage Points for U² for Tests
of the von Mises Distribution (Section 4.15)

True                       Significance level α
shape
κ       0.500  0.250  0.150  0.100  0.050  0.025  0.010  0.005

Case 1

0.0     0.047  0.071  0.089  0.105  0.133  0.163  0.204  0.235
0.50    0.048  0.072  0.091  0.107  0.135  0.165  0.205  0.237
1.00    0.051  0.076  0.095  0.111  0.139  0.169  0.209  0.241
1.50    0.053  0.080  0.099  0.115  0.144  0.173  0.214  0.245
2.00    0.055  0.082  0.102  0.119  0.147  0.177  0.217  0.248
2.50    0.056  0.084  0.104  0.121  0.150  0.180  0.220  0.251
3.00    0.057  0.085  0.106  0.122  0.152  0.181  0.222  0.253
3.50    0.058  0.086  0.107  0.123  0.153  0.182  0.223  0.254
4.00    0.058  0.086  0.107  0.124  0.153  0.183  0.224  0.255
10.00   0.059  0.088  0.109  0.126  0.155  0.186  0.227  0.258
∞       0.059  0.089  0.110  0.127  0.157  0.187  0.228  0.259

Case 2

0.0     0.047  0.071  0.089  0.105  0.133  0.163  0.204  0.235
0.50    0.048  0.072  0.091  0.107  0.135  0.165  0.205  0.237
1.00    0.051  0.076  0.095  0.111  0.139  0.169  0.209  0.241
1.50    0.053  0.080  0.100  0.116  0.144  0.174  0.214  0.245
2.00    0.055  0.082  0.103  0.119  0.148  0.177  0.218  0.249
2.50    0.056  0.084  0.105  0.121  0.150  0.180  0.220  0.251
3.00    0.057  0.085  0.105  0.122  0.151  0.181  0.221  0.252
3.50    0.057  0.085  0.106  0.122  0.151  0.181  0.221  0.253
4.00    0.057  0.085  0.106  0.122  0.151  0.181  0.221  0.253
10.00   0.057  0.085  0.105  0.122  0.151  0.180  0.221  0.252
∞       0.057  0.085  0.105  0.122  0.151  0.180  0.221  0.252

Case 3

0.0     0.030  0.040  0.046  0.052  0.061  0.069  0.081  0.090
0.50    0.031  0.042  0.050  0.056  0.065  0.077  0.090  0.100
1.00    0.035  0.049  0.059  0.066  0.079  0.092  0.110  0.122
1.50    0.039  0.056  0.067  0.077  0.092  0.108  0.128  0.144
2.00    0.043  0.061  0.074  0.084  0.101  0.119  0.142  0.159
2.50    0.045  0.064  0.078  0.089  0.107  0.125  0.150  0.168
3.00    0.046  0.066  0.080  0.091  0.110  0.129  0.154  0.173
3.50    0.047  0.067  0.081  0.093  0.112  0.131  0.157  0.176
4.00    0.047  0.067  0.082  0.093  0.113  0.132  0.158  0.178
10.00   0.048  0.068  0.083  0.095  0.115  0.135  0.162  0.182
∞       0.048  0.069  0.084  0.096  0.117  0.137  0.164  0.184

Taken from Lockhart and Stephens (1985c), with permission of the Biometrika
Trustees.

As for other distributions, there are three cases:

Case 1: θ₀ unknown, κ known;
Case 2: θ₀ known, κ unknown;
Case 3: both θ₀ and κ unknown.

Maximum likelihood estimates of θ₀ and of κ are found as follows. Let R̄ be
the vector sum or resultant of vectors OPᵢ, i = 1, . . . , n, and let R be its
length. The estimate θ̂₀ of θ₀ is the direction of R̄, and the estimate κ̂ of κ
is given by solving

I₁(κ̂)/I₀(κ̂) = R/n     (4.15)

where I₁(κ) is the modified Bessel function of order 1. Tables for solving
(4.15) are given in, for example, Biometrika Tables for Statisticians, Vol. 2
(Pearson and Hartley, 1972), and by Mardia (1972).
When OA is known, let X be the component of R̄ on OA; then the estimate
of κ is now given by κ̂₁, obtained by replacing R by X in (4.15).
Since the distribution is on a circle, only U² or V are valid EDF statistics,
of those we have been considering (see Section 4.5.3). Asymptotic null
distributions can be found for U²; because κ is not a scale parameter, the
distribution depends on κ. However, as for the gamma and Weibull distributions,
useful tests are still available.
The steps in making a test of H₀ are then as follows:

(a) For the appropriate case, estimate unknown parameters as described
    above.
(b) Calculate Z(i) = F(θ(i); θ₀, κ), where θ₀ and κ are replaced by estimates
    if necessary.
(c) Calculate U² from formula (4.2).

Refer to the part of Table 4.27 appropriate for the given case, using κ
or κ̂; reject H₀ if U² exceeds the point given for α. The test is approximate,
since asymptotic points are used; however, these are likely to be accurate,
for practical purposes, for n ≥ 20. The points are taken from Lockhart and
Stephens (1985c), where asymptotic theory is also given.
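Equation (4.15) is easily solved numerically in place of table lookup; a Python sketch, assuming SciPy (the function name is ours; ive is the exponentially scaled Bessel function Iₙ(κ)e^{−κ}, used so that the ratio I₁/I₀ does not overflow for large κ):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import ive

def vonmises_estimates(theta):
    """Estimate theta_0 as the direction of the resultant vector and
    kappa by solving I_1(kappa)/I_0(kappa) = R/n, equation (4.15)."""
    theta = np.asarray(theta, dtype=float)
    n = len(theta)
    C, S = np.sum(np.cos(theta)), np.sum(np.sin(theta))
    R = np.hypot(C, S)                        # resultant length
    theta0_hat = np.arctan2(S, C)             # direction of the resultant
    ratio = lambda k: ive(1, k) / ive(0, k)   # I_1/I_0; the scaling cancels
    kappa_hat = brentq(lambda k: ratio(k) - R / n, 1e-8, 1e6)
    return theta0_hat, kappa_hat
```

The ratio I₁(κ)/I₀(κ) increases from 0 to 1 as κ runs over (0, ∞), so the bracket above contains the root whenever 0 < R/n < 1.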

4.16 EDF TESTS FOR CONTINUOUS DISTRIBUTIONS:
MISCELLANEOUS TOPICS

4.16.1 Power of EDF Statistics when Parameters
Are Estimated

In Section 4.6 some comments were made on the power of different EDF statistics
for Case 0, using complete samples, where essentially the final test
is a test for uniformity of the Z-values given by the Probability Integral
Transformation. Different statistics were found to detect different types of
departure from uniformity. When unknown parameters are estimated from
the same sample as is used for the goodness-of-fit test, the differences in
the powers of the statistics tend to become smaller. It appears that fitting
the parameter or parameters makes it possible to adjust the tested distribution
to the sample in such a way that the statistics can detect a departure
from the null distribution with roughly the same efficiency; nevertheless, A²
tends to lead the others, probably because it is effective at detecting departures
in the tails.
Some asymptotic theory is available to examine power, at least for
quadratic statistics. Durbin and Knott (1972) demonstrated a method by which
asymptotic power results could be obtained, and applied it to tests for the
normal distribution with mean 0 and variance 1, that is, Case 0 tests,
against normal alternatives with a shift in mean or a shift in variance.
Stephens (1974a) extended the results to shifts in both mean and variance.
The technique rests on a partition of the appropriate statistics into components
(see also Section 8.12). Durbin, Knott, and Taylor (1975) showed how
the decomposition into components could be done also for the test for normality
with mean and variance unknown (Case 3), or for the exponential test
with scale parameter unknown, and used their method to discuss the asymptotic
power of the components. Stephens (1976b) followed the method and
applied it to tests for the statistics W², U², and A² for these situations. The
overall result, when tests for normality or exponentiality are made with
unknown parameters, is that A² is slightly better than W² for the alternatives
discussed, with U² not far behind W².
The superiority of A² has also been documented by various power studies
based on Monte Carlo sampling. Some of these, in comparisons of tests of
uniformity and normality, are by Stephens (1974b). These power studies also
included the statistics D⁺, D⁻, D, and V.
The most famous statistic, the Kolmogorov-Smirnov D, tends to be weak
in power. Statistics D⁺ and D⁻, on the other hand, often have good power, but
each one against only certain classes of alternatives. For example, in tests
of exponentiality (see Table 10.6, results for Group 1 statistics) D⁺ appears
to be powerful against alternatives with decreasing failure rate and D⁻ is
powerful against alternatives with increasing failure rate. In some applications
the alternative of interest may be clearly identified, and then it will be
possible to identify which statistic to use. However, D⁺ and D⁻ will be biased
when used against the wrong alternatives, so these statistics must be used
with caution. From the power studies for tests for normality and exponentiality
it appears that A² (or W² as second choice) should be the recommended
omnibus test statistic for EDF tests with unknown parameters, with good
power against a wide range of alternatives.

4.16.2 The Effect on Power of Knowing
Certain Parameters

In Section 4.9 above, as an illustration of the test for exponentiality, the
example was worked in the case when the parameter β was known, and also
when it was necessary to estimate it. It is clear from the example (or from
a comparison of Tables 4.11 and 4.14) that when the estimate is very close
to the true value, one has a much more sensitive test using the tables for
the parameters unknown than using the tables for Case 0; in general, the
critical values for rejection are much smaller when parameters must be
estimated. It would quite frequently happen that the estimated value of β
would be close to the true value, and then the practitioner who does not know
β will obtain greater power than if β were known. This appears somewhat
paradoxical, in that usually in statistical testing one assumes that the more
knowledge the better. However, the tests are (ideally) intended as tests for
distributional form, not as tests for parameter values, and some knowledge
of parameters may not be very important in assessing distributional form.
For example, it may be unhelpful to know, and to use, the mean of the true
distribution, when this is not the one tested. Stephens (1974b) and Dyer (1974)
have noted these effects in tests for normality; being given means and variances
changes the test from Case 3 to Case 0, with a consequent loss of
power. On the other hand, Spinelli and Stephens (1987) have shown that in
tests for exponentiality it is better to use the value of the origin, when this
is known, than to estimate it. Note also that in Example E4.9.3 the exponential
form, when β was given, was acceptable, but when the test focused more
on the exponential shape (the main point of the test) and less on the parameter,
the exponential form was rejected. Further work is still needed on what
parametric information is useful and what is not.

4.16.3 Other Techniques for Unknown Parameters

4.16.3.1 Use of Sufficient Statistics

Some other interesting methods have been proposed to deal with unknown
parameters. When sufficient statistics are available for θ, Srinivasan (1970,
1971) has suggested using the Kolmogorov statistic D calculated from a comparison
of Fₙ(x) with the estimate F̂(x; θ̂) obtained by applying the Rao-Blackwell
theorem to F(x; θ̂), where θ̂ is, say, the maximum likelihood
estimator of θ. The resulting tests are asymptotically equivalent to the tests
given in previous sections using F(x; θ̂) itself (Moore, 1973) and can be expected
to have similar properties for finite n. The method will usually lead
to complicated calculations, and has been developed only for tests for normality
(Srinivasan, 1970; see also Kotz, 1973) and for tests of exponentiality (see
Section 10.8.1).

4.16.3.2 The Half-Sample and Related Methods

Another method of eliminating unknown parameters is called the half-sample
method; this can be useful when unknown parameters are not location or
scale. Unknown components in θ are estimated by asymptotically efficient
methods (for example, maximum likelihood) using only half the given sample,
randomly chosen. The estimates, together with any known components, give
an estimate θ̂ of the vector θ. The transformation Z(i) = F(X(i); θ̂), i = 1,
. . . , n, is made, and EDF statistics are calculated from formulas (4.2),
now using the whole sample. A remarkable result is that, asymptotically,
the EDF statistics will have their Case 0 distributions (Section 4.4), although
this will not be true for finite n. Stephens (1978) has examined the half-sample
method applied to tests for normality and exponentiality, to compare
with the techniques given in Sections 4.8 and 4.9. Several points can be made:

(a) The quadratic statistics W², U², and A², as in other situations, appear
    to converge fairly rapidly to their asymptotic distributions; this is probably
    the case for tests for other distributions also, so that for reasonably
    large (say n ≥ 20) samples, the half-sample method could be used
    with the Case 0 asymptotic points.
(b) The half-sample technique is not invariant; different statisticians will
    obtain different values of the estimates, according to the different possible
    random half-samples chosen for estimation, and so will get different
    values of the test statistics.
(c) There is considerable loss in power when the half-sample method is
    used for tests of normality and exponentiality, compared with using EDF
    statistics with parameter estimates obtained from the whole sample, as
    described in Sections 4.8 and 4.9. The powers also tend to vary among
    the different statistics.
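As an illustration, the half-sample method for a test of normality can be sketched in Python with SciPy (the function name and the choice of the Cramér-von Mises statistic W² are ours; any statistic from formulas (4.2) could be used):

```python
import numpy as np
from scipy.stats import norm

def half_sample_w2(x, rng):
    """Half-sample method: estimate mu and sigma from a random half of
    the sample, then compute W^2 from the whole sample; asymptotically
    W^2 then has its Case 0 distribution."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    half = rng.choice(n, size=n // 2, replace=False)
    mu, sigma = x[half].mean(), x[half].std(ddof=1)
    z = np.sort(norm.cdf(x, loc=mu, scale=sigma))
    i = np.arange(1, n + 1)
    # W^2 = sum{z_(i) - (2i-1)/(2n)}^2 + 1/(12n), formula (4.2)
    return np.sum((z - (2 * i - 1) / (2.0 * n)) ** 2) + 1.0 / (12.0 * n)
```

Passing a different rng state generally selects a different half-sample and hence a different value of W², which is exactly the non-invariance of point (b).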

Braun (1980) has also suggested a technique for dealing with unknown
parameters. These are first estimated using the whole sample; then the
sample itself is randomly divided into several groups and a Case 0 test made
on each group separately, using the estimates as though they were true values.
This technique can be expected to be valuable only for large samples;
see Braun (1980).
It seems clear that the above methods should not be preferred to the
techniques previously presented for tests involving unknown location and
scale parameters, where the complete sample is used to estimate parameters,
but they might be useful for tests for distributions involving shape
parameters. More information would be helpful on how the methods compare
with other tests with unknown shape parameters, for example, use of X² or
its improvements discussed in Chapter 3, or with EDF tests such as the tests
for the Weibull or gamma distributions given in Sections 4.11 or 4.12 above.

4.16.4 Tests for Symmetry

Tests have been derived for H₀: X has a symmetric distribution about a
specified median. If the median is a, the transformation X′ᵢ = Xᵢ − a gives a
sample set which, on H₀, will be symmetric with median zero. Hence only
this situation need be considered. The test is not strictly a goodness-of-fit
test, but a test of the very general hypothesis F(x) = 1 − F(−x). A basic technique
is to compare the EDFs of X and −X; the statistics are then based on
ranks. Smirnov (1947) and Butler (1969) suggested a modification of the
Kolmogorov statistic for this problem. Distribution theory of the Butler-Smirnov
test was given by Chatterjee and Sen (1973), and power results
were discussed by Koul and Staudte (1976). Other variations, and methods
of obtaining confidence bands, were illustrated by Doksum, Fenstad, and
Aaberge (1977); these authors find versions of Kolmogorov statistics which
are competitive with EDF statistics and with the Shapiro-Wilk statistic
(Section 5.10.3) when used as tests for normality against gamma and lognormal
alternatives. Review articles on Kolmogorov-type statistics for
symmetry are given by Niederhausen (1982) and Gibbons (1983).
Rothman and Woodroofe (1972) and Srinivasan and Godio (1974) have
given test statistics of Cramér-von Mises type for H₀; Hill and Rao (1977)
showed connections between the two statistics, generalized them, and finally
proposed a statistic T which is based on the generalizations. The statistic
has the property that it takes the same value if Xᵢ is replaced by −Xᵢ or if it
is replaced by 1/Xᵢ for all i. Hill and Rao (1977) gave tables of probabilities
in the upper tail for n²T/4, for n from 10 to 24. Lockhart and McLaren (1985)
have given asymptotic points for this test.
Use of the EDF to estimate the center of symmetry was discussed by
Butler (1969) and by Rao, Schuster, and Littell (1975).

4.16.5 Tests Based on the Empirical
Characteristic Function

Some authors have proposed goodness-of-fit tests based on the empirical
characteristic function (ECF). This is defined, for a random sample X₁, X₂,
. . . , Xₙ, by φₙ(t) = {Σⱼ₌₁ⁿ exp(it Xⱼ)}/n, where here i² = −1; φₙ(t) converges,
as n → ∞, to φ(t), the characteristic function of the distribution F(x) of X,
and the real and imaginary parts of φₙ(t), say Cₙ(t) and Sₙ(t), suitably normalized,
are asymptotically jointly normal. Tests of fit can be based on how
well φₙ(t), Cₙ(t), or Sₙ(t) correspond to hypothesized values (corresponding
to a given distribution) at particular values of t; or they can be of Kolmogorov-Smirnov
type or Cramér-von Mises type, based on

sup_t |φₙ(t) − φ(t)|   or on   ∫ |φₙ(t) − φ(t)|² dG(t),

where G is a suitable measure. Many practical questions remain for such
tests, such as the choice of t-values or of G(t); also, tables rarely exist for
finite n and power studies are often limited, so that more work is needed in
this area. Epps and Pulley (1983) have given a test for normality, with
tables, and references to earlier work.
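A Cramér-von Mises-type version of such a statistic is easy to compute on a grid of t-values; a Python sketch with uniform weight dG(t) = dt on the grid (the function name and the grid choice are ours, purely for illustration):

```python
import numpy as np

def ecf_statistic(x, t, phi):
    """Integrate |phi_n(t) - phi(t)|^2 over the grid t by the trapezoid
    rule; phi is the hypothesized characteristic function."""
    x = np.asarray(x, dtype=float)
    phi_n = np.exp(1j * np.outer(t, x)).mean(axis=1)   # ECF at each t
    d2 = np.abs(phi_n - phi(t)) ** 2
    return float(np.sum((d2[1:] + d2[:-1]) * np.diff(t)) / 2.0)
```

For a test of standard normality, for example, one would take phi(t) = exp(−t²/2); the open questions noted above (choice of grid and of G) remain.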

4.17 EDF TESTS FOR DISCRETE DISTRIBUTIONS

4.17.1 Introduction

The tests given in previous sections have all been developed for various
cases in which the tested distribution F(x) was continuous. Historically, the
test statistics were introduced with this intention, and the field was left
clear to the Pearson chi-square statistic for testing for discrete distributions.
However, an EDF can also be drawn for discrete data and it can be
compared with the cumulative distribution from which the data are supposed
to be drawn; it is then natural to define measures of discrepancy analogous
to the statistics given for continuous distributions. Here we examine tests
based on such measures. A general review of goodness-of-fit tests for discrete
distributions was given by Horn (1977).
Data may appear to be discrete either because the sample genuinely
arises from a discrete distribution like the Binomial or Poisson, for example,
in measurements of counts, or alternatively because originally continuous
data may have been grouped. The grouping may occur because the
unit of measurement is very coarse, for example, when angles are measured
to the nearest 5 degrees, or weights to the nearest pound or gram.
This occurs in the data on leghorn chicks in Table 4.1; the two chicks which
are recorded as having weight 190 gm obviously do not possess exactly equal
weight, but each weighs somewhere between 189.5 and 190.5 gm. With large
amounts of data, grouping may also be done to facilitate display or handling
of the data, and the original values, and therefore some information, may
be lost before a goodness-of-fit test is to be made. This happens with Monte
Carlo sampling, when very many observations will be recorded, and for
ease of tabulation will often be graded into a histogram as they are collected.
Of course, in practice all continuous data are subject to the limits of
accurate measurement, but the inherent grouping may be so fine as to have
negligible effect. This was assumed to be so for the data on chicks when they
were tested for normality in Sections 4.4 and 4.8.
172 STEPHENS

4.17.2 The EDF for Discrete Data: Case 0

Suppose that for discrete data the possible outcomes are divided into k cells
and the null hypothesis is

H₀: P(an observation falls in cell i) = p_i,  i = 1, ..., k

The p_i are assumed given, so that H₀ is completely specified, and the
situation is Case 0 for discrete distributions. The cell boundaries may be
determined by the actual values taken by a random variable X, especially if
there are exactly k of these, or some values may be grouped together, as
in the tail of a Poisson distribution, to give k cells overall. Suppose n inde-
pendent observations are taken, and let O_i be the observed number and E_i
be the expected number (E_i = np_i) in the i-th cell. Define the statistic S by

S = max_{1≤j≤k} | Σ_{i=1}^{j} (O_i − E_i) |

For grouped continuous data, let the cell boundaries, in ascending order, be
c_0, c_1, ..., c_k; cell i contains values X such that c_{i−1} < X ≤ c_i. If O_i and
E_i are the observed and expected values in cell i, the statistic S can be de-
fined as above. Also, an EDF may be defined as

F_n(X) = F_n(c_j),  c_j ≤ X < c_{j+1},  j = 1, ..., k

F_n(X) is the cumulative histogram of the data. The grouped distribution func-
tion F_g(X) may be defined in the same way, by replacing O_j by E_j. Then the
statistic S is equal to

S = n sup_X |F_n(X) − F_g(X)|

and there is an obvious parallel with the Kolmogorov statistic nD. Similarly,
a statistic parallel to W² would be

W² = n^{−1} Σ_{j=1}^{k} { Σ_{i=1}^{j} (O_i − E_i) }²

and it is possible to construct parallels to the other statistics for continuous
distributions.

The value of the statistic S depends on the ordering of the cells, so that
a different ordering will produce a different value for the same data. It is
therefore recommended that S be used when there is a natural ordering of
the categories.

Several authors have discussed the statistic S or the statistics S⁺ and S⁻
defined by

S⁺ = max_{1≤j≤k} Σ_{i=1}^{j} (O_i − E_i)   and   S⁻ = max_{1≤j≤k} { −Σ_{i=1}^{j} (O_i − E_i) }

which are analogous to nD⁺ and nD⁻, and we confine ourselves to tests for
discrete data based on these three statistics.
Pettitt and Stephens (1977) have given exact probabilities for the distri-
bution of S for equal cell probabilities. They also showed how the tables
can be used as good approximations for probability distributions of S for
unequal probabilities per cell, and also to deduce approximate probabilities
for S⁺ or S⁻ (see also Conover, 1972). Table 4.28 is taken from Table 1
of Pettitt and Stephens (1977). The table gives values of P(S > m), for values
of m which give probabilities near the usual test levels. Thus a test of H₀ is
made as follows:

(a) Record the observed number of observations O_i and the expected number
E_i, for all i, i = 1, ..., k.
(b) Calculate T_j = Σ_{i=1}^{j} (O_i − E_i), j = 1, ..., k.
(c) Calculate S⁺ = max_j T_j, or S⁻ = max_j (−T_j), or S = max_j |T_j|; let m
be the value of the test statistic used.
(d) Use Table 4.28 to find p-levels, that is, P(S > m). The p-levels for S⁺
or S⁻, that is, P(S⁺ > m) or P(S⁻ > m), are each approximately
½ P(S > m).
(e) If the p-level for the statistic used is less than the test level α, reject
H₀ at significance level α.

Statistic S gives a two-sided test and statistics S⁺ and S⁻ give one-sided
tests.
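Steps (a)-(c) are easily mechanized; the following is a minimal sketch (the function name is ours), returning S⁺, S⁻, and S from the observed and expected counts.

```python
def discrete_edf_statistics(observed, expected):
    """Cumulative differences T_j = sum_{i<=j} (O_i - E_i), then
    S+ = max_j T_j,  S- = max_j (-T_j),  S = max_j |T_j|."""
    t_values, t = [], 0.0
    for o, e in zip(observed, expected):
        t += o - e
        t_values.append(t)
    s_plus = max(t_values)
    s_minus = max(-v for v in t_values)
    s = max(abs(v) for v in t_values)
    return s_plus, s_minus, s
```

The p-level for S is then read from Table 4.28, and halved for a one-sided test based on S⁺ or S⁻.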

E 4.17.2 Example

The data given in Table 4.29, used by Pettitt and Stephens (1977), are taken
from Siegel (1956). Each of ten subjects was presented with five photographs
of himself, varying in tone (grades 1-5), and was asked to choose the photo-
graph he liked best. The hypothesis tested was that there was no overall
preference for any tone, that is, each tone was equally likely to be chosen.

The values of T_j = Σ_{i=1}^{j} (O_i − E_i) are given in the table. The values of S⁺

TABLE 4.28 Table of Probabilities for EDF Statistic S for a Fully
Specified Discrete Distribution with k Classes (Section 4.17)^a

k = 3
  n = 6    m:  4       3
               .00274  .03567
  n = 9    m:  5       4       3
               .00193  .01656  .12361
  n = 12   m:  6       5       4
               .00109  .00771  .04994
  n = 15   m:  6       5       4
               .00361  .02089  .09181
  n = 18   m:  6       5       4
               .00902  .04005  .13579
  n = 21   m:  7       6       5
               .00402  .01760  .06308
  n = 24   m:  7       6       5
               .00792  .02897  .08824
  n = 27   m:  7       6       5
               .01325  .04245  .11433
  n = 30   m:  8       7       6
               .00609  .02015  .05757

k = 4
  n = 8    m:  4       3
               .01514  .10791
  n = 12   m:  5       4
               .01115  .05974
  n = 16   m:  6       5       4
               .00706  .03299  .12611
  n = 20   m:  7       6       5
               .00424  .01826  .06598
  n = 24   m:  7       6       5
               .01014  .03526  .10519
  n = 28   m:  8       7       6
               .00566  .01914  .05689

k = 5
  n = 10   m:  5       4
               .00477  .04162
  n = 15   m:  6       5       4
               .00584  .03202  .12322
  n = 20   m:  7       6       5
               .00496  .02203  .07617
  n = 25   m:  8       6       5
               .00368  .04717  .13083
  n = 30   m:  8       7       6
               .00946  .02930  .07924

k = 6
  n = 12   m:  6       5       4
               .00173  .01422  .08064
  n = 18   m:  7       6       5
               .00308  .01599  .06435
  n = 24   m:  7       6       5
               .01375  .04695  .13203
  n = 30   m:  8       7       6
               .01071  .03317  .08836

k = 7
  n = 14   m:  6       5       4
               .00511  .02996  .12856
  n = 21   m:  7       6       5
               .00807  .03242  .10550
  n = 28   m:  8       7       6
               .00853  .02828  .08047

k = 8
  n = 16   m:  6       5
               .01122  .05166
  n = 24   m:  8       7       6
               .00410  .01641  .05477

k = 9
  n = 18   m:  7       6       5
               .00406  .02043  .07840
  n = 27   m:  8       7       6
               .00833  .02831  .08210

k = 10
  n = 20   m:  7       6       5
               .00781  .03276  .10909
  n = 30   m:  9       7       6
               .00421  .04365  .11333

^a For given n and k, the table gives values of P(S > m) beneath values of m.
The probabilities given are exact for cells of equal probability. Half the
tabulated probability is a good approximation to P(S⁺ > m) and to P(S⁻ > m).
Taken from Pettitt and Stephens (1977), with permission of the American
Statistical Association.

TABLE 4.29 Data for EDF Test for a Discrete Distribution^a

Tone grade j of       Number choosing   Expected number   T_j = Σ_{i=1}^{j} (O_i − E_i)
chosen photograph     grade j: O_j      E_j

1                     0                 2                 −2
2                     1                 2                 −3
3                     0                 2                 −5
4                     5                 2                 −2
5                     4                 2                  0

^a The first column gives the five grades of tone of a photograph, and the data
in Column 2 are the numbers out of 10 persons in an experiment who chose
the different tone grades. (See Section 4.17.)

and S⁻ are respectively 0 and 5, and the value of S is 5. From Table 4.28,
for n = 10, k = 5, we have P(S > 5) = 0.00477, so S is highly significant,
with p-level less than .005, and H₀ will be rejected. The Pearson statistic

X² = Σ_{i=1}^{k} (O_i − E_i)²/E_i

has the value 11. Using the usual χ²₄ approximation, P(X² > 11) = 0.024,
while by exact enumeration the probability is 0.04. The S statistic thus gives
a much more extreme value than does X², and appears to be more sensitive
in this instance. Pettitt and Stephens have investigated the power of S, espe-
cially against alternatives representing a trend in cell probability as the cell
index i increases, and it appears that for such alternatives, S will often be
more powerful than X².

Note that Case 0 tables for nD should not be used for S, despite the
parallel between the two statistics. Noether (1963) suggested that use of the
nD tables would give a conservative test; Pettitt and Stephens have given
several examples to show this to be true, with the true α-value very different
from the supposed value.
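The arithmetic of this example can be checked in a few lines; an illustrative sketch using the counts of Table 4.29:

```python
observed = [0, 1, 0, 5, 4]   # numbers choosing tone grades 1-5
expected = [2] * 5           # E_i = n * p_i = 10 * (1/5)

# cumulative differences T_j and the EDF statistic S = max_j |T_j|
t_vals, t = [], 0
for o, e in zip(observed, expected):
    t += o - e
    t_vals.append(t)
s = max(abs(v) for v in t_vals)

# Pearson statistic X^2 = sum_i (O_i - E_i)^2 / E_i
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```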
The test for S has been given above for Case 0, where the null hypothesis
is completely specified. The analogue of S is not available for the various
cases where probabilities for each cell must be estimated, for example, in
a test for a Poisson or binomial distribution, where an unknown parameter
must be estimated from the data.

Wood and Altavela (1978) have discussed asymptotic properties of
Kolmogorov-Smirnov statistics D⁺, D⁻, and D when used with discrete dis-
tributions, and have shown how asymptotic percentage points may be
simulated.

4.18 COMBINATIONS OF TESTS

4.18.1 Introduction

Suppose k independent statistical tests are made. It may be that the p-levels
are quite small, but not small enough to be significant. If the k tests are all
tests of similar type (for example, all tests for normality of similar data,
with small samples of each), the results may suggest, overall, that the data
are non-normal, but the samples are too small to detect this. It then becomes
desirable to combine the tests. The general problem of combining tests, even
of different types, has been discussed by many authors; see, for example,
Fisher (1967), Birnbaum (1954), and Volodin (1965). Fisher (1967) suggested
an easy method of combination, based on the p-levels of the k separate test
statistics. In effect, the p-levels are tested for uniformity. This method is
discussed in Section 8.15 and has been used to combine various tests for

normality by Wilk and Shapiro (1968) and by Pettitt (1977a). Volodin (1965)
has discussed tests for one distribution (the normal, exponential, Poisson,
or Weibull) against specific alternatives which are close to the one tested.

4.18.2 Combining EDF Test Statistics: Case 0

In this section we give a method of combining test statistics obtained from
EDF tests. Suppose A² is the test statistic in a Case 0 test, and let test j
give value A²_j; the proposed test statistic is Z_k = Σ_{j=1}^{k} A²_j / k. For Case 0, A²
is chosen as test statistic since, as was stated in Section 4.4, its distribution
function almost does not depend on n. A table of percentage points for Z_k, for
Case 0 tests, is given in Table 4.30. When k is too large for the table, a
good approximation to the percentage point of Z_k is given by the correspond-
ing point for a normal distribution with mean μ and variance σ², where μ
and σ² are given in the last line of the table.

E 4.18.2 Example

Suppose six Case 0 tests for normality are made, and the values of A² are
2.353, 1.526, 0.550, 0.252, 2.981, 2.309. The p-levels of the tests are,
from Table 4.3: 0.06, 0.17, 0.70, 0.97, 0.03, 0.06. The average is
Z_k = 1.662, and reference to Table 4.30 shows Z_k to be significant at the 5%
level. Although only one component test is significant at the 5% level, the
overall combination suggests a total picture of non-normality.
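For the record, the example's computation is as follows (values as quoted in the text; the 5% point 1.578 is the k = 6 entry of Table 4.30):

```python
# A^2 values from the six Case 0 tests for normality in the example
a2 = [2.353, 1.526, 0.550, 0.252, 2.981, 2.309]
k = len(a2)
z_k = sum(a2) / k            # Z_k = (1/k) * sum_j A^2_j

critical_5pct = 1.578        # Table 4.30, k = 6, alpha = .05
significant = z_k > critical_5pct
```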

TABLE 4.30 Table for Combining Tests for k Samples, Case 0
(Section 4.18.2)^a

                        Significance level α
No. of samples k    .25     .10     .05     .025    .01

 2                  1.242   1.705   2.047   2.387   2.838
 3                  1.219   1.582   1.842   2.096   2.427
 4                  1.200   1.506   1.721   1.928   2.195
 6                  1.173   1.414   1.578   1.735   1.934
 8                  1.155   1.358   1.495   1.624   1.786
10                  1.142   1.320   1.439   1.550   1.689
 ∞                  μ = 1.000,  σ² = 0.57974/k

^a Upper tail percentage points of Z_k = Σ_j A²_j/k, where A²_j is the value
of A² for sample j, Case 0. For k > 10, Z_k is approximated by a
normal distribution with μ and σ² as shown.

TABLE 4.31 Table for Combining Tests for k Samples,
Parameters Unknown (Section 4.18.3)^a

                                   Significance level α
Test          No. of samples k    .25     .10     .05     .025    .01

Normal         2                  1.081   1.537   1.878   2.220   2.674
Case 2         3                  1.065   1.422   1.680   1.934   2.265
               4                  1.050   1.350   1.562   1.768   2.035
               6                  1.027   1.262   1.424   1.579   1.777
               8                  1.010   1.208   1.343   1.470   1.631
              10                  0.998   1.172   1.289   1.398   1.535
               ∞                  (See Table 4.30: μ = 0.8649, σ² = 0.5303/k)

Normal         2                  0.455   0.562   0.638   0.710   0.805
Case 3         3                  0.446   0.530   0.588   0.643   0.713
               4                  0.439   0.510   0.559   0.604   0.661
               6                  0.431   0.487   0.524   0.559   0.602
               8                  0.425   0.473   0.504   0.533   0.569
              10                  0.421   0.463   0.490   0.515   0.546
               ∞                  (See Table 4.30: μ = 0.3843, σ² = 0.3615/k)

Exponential    2                  0.723   0.942   1.102   1.260   1.469
Case 1         3                  0.708   0.881   1.003   1.122   1.276
               4                  0.698   0.843   0.945   1.042   1.166
               6                  0.683   0.798   0.876   0.950   1.043
               8                  0.673   0.771   0.836   0.897   0.973
              10                  0.666   0.752   0.809   0.861   0.927
               ∞                  (See Table 4.30: μ = 0.5959, σ² = 0.1392/k)

^a Upper tail percentage points of Z_k = Σ_j A*_j/k, where A*_j is the modified A²
for sample j. Values are given for the test of normality with mean zero and
variance unknown (Case 2 of Section 4.8), the test for normality with mean
and variance unknown (Case 3 of Section 4.8), and for the test for exponen-
tiality (Section 4.9, Case 2).

4.18.3 Combining EDF Test Statistics: Other Cases

The same technique can be applied to combine tests of fit when parameters
are estimated. Here each value of A² will be modified, as described in
previous sections, for the appropriate test, and then the mean of the
modified values A*_j, j = 1, ..., k, will be taken as the overall test statistic.
Upper tail percentage points of Z_k for tests of normality and for tests of
exponentiality are in Table 4.31. For k too large for the table, follow the
same procedure as described in Section 4.18.2, for Table 4.30.

E 4.18.3 Example

Proschan (1963) has given a number of sets of failure times of air-conditioning
equipment for several aircraft. The data for aircraft 7910 have already been
listed in Table 4.13. When EDF tests of exponentiality are made on the sets
for the first 6 aircraft, with α = 0 and β estimated separately for each air-
craft, the values of A* (that is, A² modified as in Table 4.11) are: 0.543,
0.722, 0.763, 1.187, 0.499, 1.175, and the mean A* is 0.814; reference to
Table 4.31 shows this value to be near the 10% point, for k = 6, although
only two of the individual values are significant at the 10% level.

4.18.4 Combining the Standardized Values from Several Tests

Pierce (1978) has suggested the following method of combining tests based
on k samples, for testing H₀: the sample comes from a distribution F(x;θ),
with θ containing unknown location and/or scale parameters α and β. The
true values of these parameters may be different for each test. For sample i,
let α̂_i and β̂_i be the maximum likelihood estimates. Define standardized
values w_ri = (x_ri − α̂_i)/β̂_i, r = 1, ..., n_i, where x_ri, r = 1, ..., n_i, are
the observations in sample i. The proposal of Pierce is that the w_ri for all
the k samples should be pooled to form one large sample of size n = Σ_i n_i.
Pierce showed that the limiting distribution of any EDF statistic calculated
from the combined sample will be the same as its limiting distribution for
one sample. Similar results apply if only one parameter is not known. Pierce
gave results of a Monte Carlo study on tests for normality (Case 3), com-
paring the method, using A² calculated from the k pooled samples of stand-
ardized values, with Fisher's method using the p-levels of k values of A².
The alternative, that is, the true distribution, was a Weibull distribution,
and the pooled method is more sensitive in this study. Quesenberry, Whitaker,
and Dickens (1976) have given another method, somewhat similar but less
direct, of combining tests for normality.
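Pierce's pooling step can be sketched as follows. Here the normal-family MLEs (sample mean, and square root of the biased sample variance) are assumed for the standardization, and the function name is ours; for another hypothesized family the appropriate MLEs of α and β would replace them.

```python
import math

def pool_standardized(samples):
    """Standardize each sample by its own ML estimates of location and
    scale (normal case assumed here), then pool all standardized values
    into one large sample of size n = sum_i n_i."""
    pooled = []
    for x in samples:
        n = len(x)
        a_hat = sum(x) / n                                      # alpha-hat: sample mean
        b_hat = math.sqrt(sum((v - a_hat) ** 2 for v in x) / n) # beta-hat: biased SD
        pooled.extend((v - a_hat) / b_hat for v in x)
    return pooled
```

An EDF statistic such as A² is then computed from the pooled sample and referred to its one-sample limiting distribution.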

4.19 EDF STATISTICS AS INDICATORS OF PARENT POPULATIONS

Statisticians sometimes use goodness-of-fit statistics to decide which popu-
lation appears best suited to describe a data set. EDF statistics may be used
for this purpose; when the different parent populations F_i(x) are fully speci-
fied (Case 0), a typical statistic, say A², is calculated assuming F_i(x) to
be correct, giving value A²_i, and values of A²_i may be directly compared.
A smaller value of A²_i will indicate a better fit than a larger value. However,
when parameters are estimated from the data, the value of A²_i must not be
the indicator, since the distribution of A²_i will now vary with F_i(x); instead,
the p_i-value attached to A²_i will be a suitable indicator, with a larger p_i-
value (measured from the upper tail) indicating a better fit.

4.20 TESTS BASED ON NORMALIZED SPACINGS

4.20.1 Normalized Spacings and the EDF Statistic A²_S

In this section tests are discussed based on the spacings of a sample. Each
spacing is normalized by division by a constant, and a transformation is made
to produce z-values between 0 and 1. We give the test based on the EDF sta-
tistic A²_S, the Anderson-Darling A² calculated from these z-values, and
compare it with tests based on the median or the mean of the values. The
technique affords an interesting method of testing by eliminating location and
scale parameters, rather than directly estimating them, and is based on
tests for the exponential distribution. The tests can be used for censored
data. The general case will be treated of a sample which has been censored
at both ends. Suppose t + 2 successive observations X_(k), X_(k+1), ...,
X_(k+t+1) are given, and the test is of the null hypothesis:

H₀: the original X-sample comes from a continuous distribution F(x;α,β),
where α and β are unknown location and scale parameters

Then values X_(i) can be viewed as being constructed from a random sample W
from F(x;0,1), by the relation X_(i) = α + βW_(i), i = k, k+1, ..., k+t+1.
Suppose m_i = E(W_(i)), and define spacings e_i = X_(k+i) − X_(k+i−1), i = 1, ...,
t+1; these are analogous to the spacings D_i of Section 8.2 or the spacings E_i
of Section 10.5.2. Then normalized spacings are given by

y_i = e_i/(m_{k+i} − m_{k+i−1}),  i = 1, ..., t+1

When the X_(i) are from the exponential distribution Exp(α,β), given in
Section 4.9.1, the y_i will be i.i.d. Exp(0,β). The J transformation of

Section 10.5.4 can then be applied to give values

z_(i) = ( Σ_{j=1}^{i} y_j ) / ( Σ_{j=1}^{t+1} y_j ),  i = 1, ..., t

and the z_(i) will be ordered uniforms, that is, they will be distributed as an
ordered sample of size t from the uniform distribution with limits 0 and 1.
A special case of normalized spacings is the set derived, as in Sec-
tion 10.5.2, by the N-transformation applied to a complete sample of X-
values from Exp(0,β); here an extra spacing is available between X_(1) and
the known lower endpoint α = 0 of the distribution of X. The spacings
are E_1 = X_(1), E_2 = X_(2) − X_(1), E_3 = X_(3) − X_(2), etc., and, for the
Exp(0,1) distribution, m_i − m_{i−1} = 1/(n + 1 − i), with m_0 = 0; thus the
normalized spacings are the values X_j in Section 10.5.2. Tests that the
original sample is exponential can then be based on statistics used for test-
ing that the z_(i) are ordered uniforms, for example, EDF statistics, Case 0
(Section 4.4), or any of the test statistics described in Chapter 8. Tests of
this type, for exponentiality of the original X, are discussed in Chapter 10.

These tests can be adapted to test that X came from a more general
F(x;α,β) as follows. The spacings are calculated as described above, and,
provided values of k_i = m_i − m_{i−1} are known, normalized spacings y_i can
be found; then the transformation to z_(i) can be made. Then, subject to
important conditions particularly affecting the extreme spacings, suitably
separated normalized spacings from any continuous distribution are asymp-
totically independent and exponentially distributed with mean β (see Pyke,
1965, for rigorous and detailed results). The conditions on this result are
sufficiently strong, however, that the transformed values z_(i) must not be
assumed to be distributed as uniform order statistics, even asymptotically,
for the purpose of finding distributions of test statistics. However, asymp-
totic distribution theory of three statistics calculated from the z-values has
been given by Lockhart, O'Reilly, and Stephens (1985, 1986).
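For the complete-sample exponential special case above, the normalized spacings and the J transformation can be sketched as follows (function names are ours):

```python
def normalized_spacings_exponential(x):
    """Normalized spacings for a complete sample from Exp(0, beta):
    m_i - m_{i-1} = 1/(n + 1 - i) with m_0 = 0, so
    y_i = (n + 1 - i) * (X_(i) - X_(i-1)), taking X_(0) = 0."""
    xs = sorted(x)
    n = len(xs)
    y, prev = [], 0.0
    for i, xi in enumerate(xs, start=1):
        y.append((n + 1 - i) * (xi - prev))
        prev = xi
    return y

def j_transform(y):
    """J transformation: z_(i) = (y_1 + ... + y_i) / (y_1 + ... + y_m),
    i = 1, ..., m-1; under H0 the z_(i) are ordered uniforms."""
    total = sum(y)
    z, cum = [], 0.0
    for yi in y[:-1]:
        cum += yi
        z.append(cum / total)
    return z
```

For the general doubly censored case, the divisors 1/(n + 1 − i) would be replaced by the appropriate k_i = m_i − m_{i−1} for the hypothesized family.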

4.20.2 Tests for the Normal, Logistic, and Extreme-Value
Distributions Based on A²_S

We first discuss statistic A²_S; this is the Anderson-Darling statistic A²
defined in equation (4.2), but using the z_(i) above derived from the spacings.
Later we discuss statistics based on the median and the mean of the z-values.
Asymptotic percentage points for A²_S, for use with tests that the X_i are
from one of the normal, logistic, or extreme-value distributions, are given
in Table 4.32; these are taken from Lockhart, O'Reilly, and Stephens (1985,
1986). Thus to make a test for normality, p = k/n and q = (k + t + 1)/n are
found, and Table 4.32 is entered on the line corresponding to p and q; the
null hypothesis that the original X_i are normal is rejected if A²_S exceeds the
percentage point given for the desired test level α.

TABLE 4.32 Asymptotic Percentage Points for A²_S, for Tests for Normal,
Logistic, or Extreme-Value Populations (Section 4.20)^a

Left        Right                      Significance level α
censoring   censoring
point, p    point, q    0.25   0.20   0.15   0.10   0.05   0.025  0.01

Normal Distribution
0           1           0.955  1.066  1.211  1.422  1.798  2.191  2.728
0           0.75        1.056  1.183  1.350  1.592  2.026  2.479  3.100
0           0.50        1.098  1.232  1.409  1.667  2.129  2.612  3.273
0           0.25        1.133  1.273  1.459  1.730  2.215  2.722  3.416
0.25        0.75        1.178  1.324  1.518  1.800  2.306  2.835  3.559
0.25        0.50        1.225  1.381  1.587  1.889  2.430  2.996  3.770

Logistic Distribution
0           1           1.123  1.263  1.448  1.720  2.206  2.716  3.413
0           0.75        1.141  1.281  1.468  1.741  2.230  2.741  3.441
0           0.50        1.178  1.325  1.521  1.806  2.318  2.852  3.584
0           0.25        1.215  1.369  1.574  1.873  2.409  2.969  3.736
0.25        0.75        1.177  1.323  1.517  1.801  2.308  2.838  3.564
0.25        0.50        1.223  1.378  1.584  1.885  2.424  2.989  3.761

Extreme-Value Distribution
0           1           1.016  1.138  1.300  1.535  1.957  2.398  3.000
0           0.75        1.159  1.302  1.492  1.770  2.267  2.787  3.498
0           0.50        1.202  1.354  1.555  1.849  2.376  2.927  3.682
0           0.25        1.229  1.386  1.594  1.898  2.444  3.015  3.797
0.25        1.0         1.027  1.150  1.312  1.549  1.972  2.413  3.018
0.25        0.75        1.187  1.336  1.532  1.819  2.333  2.870  3.605
0.25        0.50        1.231  1.388  1.596  1.901  2.447  3.018  3.800
0.50        1.0         1.051  1.177  1.345  1.589  2.025  2.481  3.105
0.50        0.75        1.224  1.380  1.586  1.887  2.428  2.993  3.767
0.75        1.0         1.081  1.213  1.387  1.641  2.096  2.571  3.222

^a The table is entered at p = k/n and q = (k + t + 1)/n.

Monte Carlo studies suggest that, as in previous sections, the points
may be used to give a good approximate test for, say, n > 20. The tables
can be interpolated for values of p and q not given.

The test for the extreme-value distribution above refers to distribution
(4.7); for a test that X comes from (4.6) the substitution X' = −X must be
made, and the X' tested to come from (4.7). The test may be adapted to a
test that X comes from the two-parameter Weibull distribution W(x;0,β,m)
of Section 4.11 by taking X' = log X and testing that X' comes from (4.7).

In the above tests, values of m_i (or more precisely, of k_i = m_i − m_{i−1})
are needed. For the normal distribution values of m_i have been extensively
tabulated or can be accurately calculated; see Section 5.7.2. For the logistic
distribution, k_i = n/{(i − 1)(n − i + 1)}, i = 2, ..., n. For the extreme-value
distribution, values of m_i have been given by Mann (1968) and by White
(1968); also values of k_i are tabulated, for 3 ≤ n ≤ 25, by Mann, Scheuer,
and Fertig (1973). Alternatively the approximation m_i = log[−log{1 −
(i − 0.5)/(n + 0.25)}] is quite accurate for n > 10 (Lawless, 1982).
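The quantities needed for these tests can be computed directly; a sketch (function names ours) of the Lawless approximation for the extreme-value m_i and the exact logistic k_i:

```python
import math

def extreme_value_m(n):
    """Approximate means of extreme-value order statistics:
    m_i ~ log(-log(1 - (i - 0.5)/(n + 0.25))), quoted above (Lawless, 1982)
    as quite accurate for n > 10."""
    return [math.log(-math.log(1.0 - (i - 0.5) / (n + 0.25)))
            for i in range(1, n + 1)]

def logistic_k(n, i):
    """Exact k_i = m_i - m_{i-1} for the logistic distribution, i = 2, ..., n."""
    return n / ((i - 1) * (n - i + 1))
```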

4.20.3 Tests Based on the Median and the Mean
of the Transformed Spacings

Although they are not properly EDF statistics, it is convenient to discuss
here tests based on the median and the mean of the z_(i). The test based on
the median was introduced by Mann, Scheuer, and Fertig (1973) for the
special problem of testing that an original right-censored sample of t-values
comes from the two-parameter Weibull distribution W(t;0,β,m) (Section
4.11) against the three-parameter alternative W(t;α,β,m), with α > 0. The
test situation arises in reliability theory. To make the test, the transfor-
mation X = log t is made as described above, and the right-censored sample
of X-values is tested to come from the extreme-value distribution (4.7).
Suppose therefore that the r smallest order statistics of an X-sample of
size n are available, giving r − 1 normalized spacings y_i; the statistic
proposed is

S = Σ_{i=s}^{r−1} y_i / T,  where T = Σ_{i=1}^{r−1} y_i

and where s = (r + 1)/2 if r is odd and s = (r + 2)/2 if r is even. It is easily
shown that S = 1 − z_(w), where w = s − 1, so that S is equivalent to z_(w), which
is essentially the median of the z-values. Mann, Scheuer, and Fertig (1973)
give tables of percentage points for S, based on Monte Carlo studies, for
n = 3(1)25, and for r = 3(1)n. For the problem they were studying, the
authors proposed a one-tail test; H₀ is rejected if S is too large (so the
median is too small). An example of the S-test, with censored data, is
given in Section 12.4.1. Mann and Fertig (1975) later modified the S statistic,
by choosing a smaller value of s. They also showed how to obtain confidence
intervals for the origin of a three-parameter Weibull distribution by choosing
those values which gave non-significant test results.

Another possible test statistic for H₀ is Z̄, the mean of the z_(i). Tiku
and Singh (1981) gave a test, also for the extreme-value distribution, with
double censoring; the statistic is their Z*. On H₀, the authors gave an accu-
rate normal approximation for Z*, with mean 1, and with variance depend-
ent on the variances and covariances of standard extreme-value order statis-
tics. It is easily shown that Z* = 2Z̄, so that the Tiku-Singh approximation
can easily be adapted to give percentage points for Z̄. The distributions of S
and Z̄, when the X come from a suitably regular parent population, have
been investigated by Lockhart, O'Reilly, and Stephens (1985, 1986). Both
statistics are asymptotically normal; S₁ = c₁√t (S − 0.5) and Z₁ = c₂√t (Z̄ − 0.5)
are asymptotically N(0,1), where c₁ and c₂ are constants dependent on the
parent population and on the censoring levels.

For example, for the extreme-value test, for complete samples,
c₁ = 2.233, and c₂ = 4.0098. If the z_(i) could be assumed to be ordered
uniforms, S would have a beta distribution and Z̄ would have a distribution
tabulated in Chapter 8; asymptotically these also tend to normal distributions
of the type given above, with c₁ = 2 for S and c₂ = √12 = 3.464 for Z̄.
Mann, Scheuer, and Fertig found the beta distribution to give a good approx-
imation to their Monte Carlo points for small n, and suggested its use for n
beyond their tables; however, comparison of the normal approximations
above shows that use of the beta distribution for large n will give a test
which is too conservative.
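The standardization just described is straightforward to apply; a sketch (function names ours), with the constants c₁ and c₂ (for example, the complete-sample extreme-value values 2.233 and 4.0098) supplied by the caller:

```python
import math

def standardized_statistics(s, z_bar, t, c1, c2):
    """Asymptotic standardization: S1 = c1*sqrt(t)*(S - 0.5) and
    Z1 = c2*sqrt(t)*(Zbar - 0.5) are approximately N(0, 1)."""
    return c1 * math.sqrt(t) * (s - 0.5), c2 * math.sqrt(t) * (z_bar - 0.5)

def normal_upper_tail(z):
    """Upper-tail probability P(N(0,1) > z) via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```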

4.20.4 Power Properties

Monte Carlo power studies by Mann, Scheuer, and Fertig showed that S has
good power properties for the problem they were considering, where S was
used with one tail. Other power studies for the extreme-value test have been
given by Littell, McClave, and Offen (1979), Tiku and Singh (1981), and
Lockhart, O'Reilly, and Stephens (1986). These show that A²_S and Z̄ have
high power, often better than S. For some alternatives, S will be biased;
we return to this point below.

Tiku (1981) has investigated S*, equivalent to Z̄, in testing for normality.
Lockhart, O'Reilly, and Stephens (1985) have made further comparisons,
involving A²_S, S, Z̄, the A² (Case 3) test of Section 4.8, and the Shapiro-Wilk
W test discussed in Chapter 5; here S must be used with two tails to cover
reasonable alternatives. The most powerful tests for normality are given by
W, A² (Case 3), and A²_S; these three give quite similar results.

Lockhart, O'Reilly, and Stephens have also shown that statistics S and Z̄
can give non-consistent tests for some alternatives to the null; for example,
this is the case in testing for normality. Statistic S may also be biased, as
was observed by Tiku and Singh (1981), although for the problem discussed

by Mann, Scheuer, and Fertig (1973) this does not appear to be the case. It
would seem that statistic A²_S, which is consistent, should be preferred to Z̄
and S except perhaps for some situations in which the alternatives to the null
are very carefully specified to avoid problems of bias and non-consistency.
A²_S has good power in studies so far reported. It can be easily calculated
without direct estimation of parameters, can be used for censored data, and
is consistent. These properties suggest that A²_S, and possibly other EDF
statistics found from normalized spacings, might prove useful in other test
situations, rivaling the regular use of EDF statistics as described in the
rest of this chapter.

R E FE R E N C E S

A U , M . M . and Chan, L . K. (1964). O n G u p ta ^ se stim a te so fth e p a ra m e te rs


of the norm al distribution. Biom etrika 51, 498-501.

Anderson, T . W . and Darling, D . A . (1952). Asymptotic theory o f certain


goodn ess-of-fit c riteria based on stochastic p ro cesses. Ann. Math.
Statist. 23, 193-212.

Anderson, T . W . and D arling, D. A . (1954). A test of goodness-of-fit.


J. A m er. Statist. A s s o c . 49, 765-769.

Anscom be, F . J. and Tukey, J. W . (1963). The examination and analysis of


residuals. Technometrics 5, 141-160.

B a r r , D . R. and Davidson, T . (1973). A K olm ogorov-Sm lm ov test fo r cen­


sored sam ples. Technometrics 15, 739-757.

B erk, R . H. and Jones, D. H. (1979). G oodness-of-fit test statistics that domi­


nate the Kolmogorov statistics. Z . W ahrsch. V e rw . Gebiete 47, 47-59.

Birnbaum , A . (1954). Combining independent tests o f significance. J. A m e r.


Statist. A s s o c . 49, 559-574.

Birnbaum, Z . W . and Lientz, B . P . (1969a). Exact distributions fo r some


Renyi type statistics. Applicationes Mathematicae 10, 179-192.

Birnbaum, Z . W . and Lientz, B . P . (1969b). T ables o f critical values of


some Renyl type statistics fo r finite sample s iz e s . J. A m e r. Statist.
A s s o c . 64, 870-877.

B lis s , C . (1967). Statistics in Biology; Statistical Methods fo r R esearch in


the Natural Sciences. New York: M cG raw -H ill.

B lo m , G. (1958). Statistical Estimates and Transform ed Beta V a ria te s.


New Y ork: W iley.
186 STEPHENS

Braun, H. (1980). A simple method for testing goodness of fit in the presence of nuisance parameters. J. Roy. Statist. Soc., B 42, 53-63.

Butler, C. C. (1969). A test for symmetry using the sample distribution function. Ann. Math. Statist. 40, 2209-2210.

Chandra, M., Singpurwalla, N. D., and Stephens, M. A. (1981). Kolmogorov statistics for tests of fit for the extreme-value and Weibull distributions. J. Amer. Statist. Assoc. 76, 729-731.

Chatterjee, S. K. and Sen, P. K. (1973). On Kolmogorov-Smirnov type tests for symmetry. Ann. Inst. Statist. Math. 25, 287-300.

Chernoff, H., Gastwirth, J. L., and Johns, M. V. (1967). Asymptotic distribution of linear combinations of functions of order statistics with applications to estimation. Ann. Math. Statist. 38, 52-73.

Conover, W. J. (1972). A Kolmogorov goodness-of-fit test for discontinuous distributions. J. Amer. Statist. Assoc. 67, 591-596.

Csörgő, S. and Horváth, L. (1981). On the Koziol-Green model for random censorship. Biometrika 68, 391-401.

Doksum, K. A. (1975). Measures of location and asymmetry. Scand. J. Statist. 2, 11-22.

Doksum, K. A., Fenstad, G., and Aaberge, R. (1977). Plots and tests for symmetry. Biometrika 64, 473-487.

Draper, N. R. and Smith, H. (1966). Applied Regression Analysis. New York: Wiley.

Dufour, R. and Maag, U. R. (1978). Distribution results for modified Kolmogorov-Smirnov statistics for truncated or censored data. Technometrics 20, 29-32.

Durbin, J. (1973). Distribution Theory for Tests Based on the Sample Distribution Function. Regional Conference Series in Appl. Math., No. 9, SIAM, Philadelphia, Pa.

Durbin, J. (1975). Kolmogorov-Smirnov tests when parameters are estimated with applications to tests of exponentiality and tests on spacings. Biometrika 62, 5-22.

Durbin, J. and Knott, M. (1972). Components of Cramér-von Mises statistics, I. J. Roy. Statist. Soc., B 34, 290-307.

Durbin, J., Knott, M., and Taylor, C. C. (1975). Components of Cramér-von Mises statistics, II. J. Roy. Statist. Soc., B 37, 216-237.

Dyer, A. R. (1974). Comparisons of tests for normality with a cautionary note. Biometrika 61, 185-189.

Easterling, R. G. (1976). Goodness-of-fit and parameter estimation. Technometrics 18, 1-9.
TESTS BASED ON EDF STATISTICS 187

Epps, T. W. and Pulley, L. B. (1983). A test for normality based on the empirical characteristic function. Biometrika 70, 723-728.

Finkelstein, J. M. and Schafer, R. E. (1971). Improved goodness-of-fit tests. Biometrika 58, 641-645.

Fisher, R. A. (1967). Statistical Methods for Research Workers, 4th Edition. New York: Stechert.

Gibbons, J. D. (1983). Kolmogorov-Smirnov symmetry test. In Encyclopaedia of Statistical Sciences, S. Kotz, N. L. Johnson, and C. B. Read, Eds., 4, 396-398.

Gillespie, M. J. and Fisher, L. (1979). Confidence band for the Kaplan-Meier survival curve estimate. Ann. Statist. 7, 920-924.

Green, J. R. and Hegazy, Y. A. S. (1976). Powerful modified EDF goodness-of-fit tests. J. Amer. Statist. Assoc. 71, 204-209.

Gupta, A. K. (1952). Estimation of the mean and standard deviation of a normal population from a censored sample. Biometrika 39, 266-273.

Hall, W. J. and Wellner, J. A. (1980). Confidence bands for a survival curve from censored data. Biometrika 67, 133-143.

Hegazy, Y. A. S. and Green, J. R. (1975). Some new goodness-of-fit tests using order statistics. Appl. Statist. 24, 299-308.

Hill, D. L. and Rao, P. V. (1977). Tests of symmetry based on Cramér-von Mises statistics. Biometrika 64, 489-494.

Horn, S. D. (1977). Goodness-of-fit tests for discrete data: a review and an application to a health impairment scale. Biometrics 33, 237-248.

Johnson, N. L. and Kotz, S. (1970). Distributions in Statistics, Volume I: Continuous Univariate Distributions. Boston: Houghton Mifflin.

Kaplan, E. L. and Meier, P. (1958). Non-parametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53, 457-481.

Kleinbaum, D. G. and Kupper, L. L. (1978). Applied Regression Analysis and Other Multivariable Methods. Belmont, Calif.: Duxbury Press.

Kolmogorov, A. N. (1933). Sulla determinazione empirica di una legge di distribuzione. Giorn. Ist. Attuari 4, 83-91.

Kotz, S. (1973). Normality versus lognormality with applications. Comm. Statist. 1, 113-132.

Koul, H. L. and Staudte, R. G., Jr. (1976). Power bounds for a Smirnov statistic in testing the hypothesis of symmetry. Ann. Statist. 4, 924-935.

Koziol, J. A. (1980). Goodness-of-fit tests for randomly censored data. Biometrika 67, 693-696.

Koziol, J. A. and Byar, D. P. (1975). Percentage points of the asymptotic distributions of one and two sample K-S statistics for truncated or censored data. Technometrics 17, 507-510.

Koziol, J. A. and Green, S. B. (1976). A Cramér-von Mises statistic for randomly censored data. Biometrika 63, 465-474.

Kuiper, N. H. (1960). Tests concerning random points on a circle. Proc. Koninkl. Neder. Akad. van Wetenschappen, A 63, 38-47.

Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. New York: Wiley.

Lewis, P. A. W. (1961). Distribution of the Anderson-Darling statistic. Ann. Math. Statist. 32, 1118-1124.

Lilliefors, H. W. (1967). On the Kolmogorov-Smirnov test for normality with mean and variance unknown. J. Amer. Statist. Assoc. 62, 399-402.

Lilliefors, H. W. (1969). On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown. J. Amer. Statist. Assoc. 64, 387-389.

Littell, R. C. and Rao, P. V. (1978). Confidence regions for location and scale parameters based on the Kolmogorov-Smirnov goodness-of-fit statistic. Technometrics 20, 23-27.

Littell, R. C., McClave, J. T., and Offen, W. W. (1979). Goodness-of-fit tests for the two parameter Weibull distribution. Commun. Statist. Simulation Comput. B 8, 257-269.

Lockhart, R. A. and McLaren, C. G. (1985). Asymptotic points for a test of symmetry about a specified median. Biometrika 72, 208-210.

Lockhart, R. A. and Stephens, M. A. (1985a). Tests for the Weibull distribution based on the empirical distribution function. Technical Report, Department of Mathematics and Statistics, Simon Fraser University.

Lockhart, R. A. and Stephens, M. A. (1985b). Goodness-of-fit tests for the gamma distribution. Technical Report, Department of Mathematics and Statistics, Simon Fraser University.

Lockhart, R. A. and Stephens, M. A. (1985c). Tests of fit for the von Mises distribution. Biometrika 72 (to appear).

Lockhart, R. A., O'Reilly, F. J., and Stephens, M. A. (1985). Tests of fit based on normalized spacings. Technical Report, Department of Statistics, Stanford University.

Lockhart, R. A., O'Reilly, F. J., and Stephens, M. A. (1986). Tests of fit for the extreme value and Weibull distributions based on normalized spacings. Nav. Res. Logist. Quart. (to appear).

Louter, A. S. and Koerts, J. (1970). On the Kuiper test for normality with mean and variance unknown. Statistica Neerlandica 24, 83-87.

Loynes, R. M. (1980). The empirical distribution function of residuals from generalised regression. Ann. Statist. 8, 285-298.

Mann, N. R. (1968). Results on statistical estimation and hypothesis testing with application to the Weibull and extreme value distributions. Technical Report No. ARL 68-0068, Office of Aerospace Research, United States Air Force, Wright-Patterson Air Force Base, Ohio.

Mann, N. R. and Fertig, K. W. (1975). A goodness-of-fit test for the two parameter versus three parameter Weibull; confidence bounds for threshold. Technometrics 17, 237-245.

Mann, N. R., Scheuer, E. M., and Fertig, K. (1973). A new goodness-of-fit test for the two parameter Weibull or extreme-value distribution. Commun. Statist. 2, 383-400.

Mardia, K. V. (1972). Statistics of Directional Data. New York: Academic Press.

Margolin, B. H. and Maurer, W. (1976). Tests of the Kolmogorov-Smirnov type for exponential data with unknown scale, and related problems. Biometrika 63, 149-160.

Martynov, G. V. (1976). Computation of limit distributions of statistics for normality tests of type ω². Theor. Probability Appl. 21, 1-13.

Michael, J. R. and Schucany, W. R. (1979). A new approach to testing goodness-of-fit for censored samples. Technometrics 21, 435-441.

Mickey, M. R., Mundle, P. B., Walker, D. N., and Glinski, A. M. (1963). Test criteria for Pearson type III distributions. Technical Report ARL 63-100, Aeronautical Research Laboratories, United States Air Force.

Moore, D. S. (1973). A note on Srinivasan's goodness-of-fit test. Biometrika 60, 209-211.

Mukantseva, L. A. (1977). Testing normality in one-dimensional and multi-dimensional linear regression. Theor. Probability Appl. 22, 591-601.

Nesenko, G. A. and Tjurin, Ju. N. (1978). Asymptotics of Kolmogorov's statistic for a parametric family. Soviet Math. Dokl. 19, 516-519.

Neuhaus, G. (1979). Asymptotic theory of goodness-of-fit tests when parameters are present: A survey. Math. Operationsforsch. Statist., Ser. Statistics 10, 479-494.

Niederhausen, H. (1981). Tables of significance points for the variance-weighted Kolmogorov-Smirnov statistics. Technical Report, Department of Statistics, Stanford University.

Niederhausen, H. (1982). Butler-Smirnov test. In Encyclopaedia of Statistical Sciences, S. Kotz, N. L. Johnson, and C. B. Read, Eds., 1, 340-344.

Noether, G. E. (1963). A note on the Kolmogorov-Smirnov statistic in the discrete case. Metrika 7, 115-116.

Pearson, E. S. (1963). Comparison of tests for randomness of points on a line. Biometrika 50, 315-325.

Pearson, E. S. and Hartley, H. O. (1972). Biometrika Tables for Statisticians, Vol. 2. New York: Cambridge University Press.

Pettitt, A. N. (1976). Cramér-von Mises statistics for testing normality with censored samples. Biometrika 63, 475-481.

Pettitt, A. N. (1977a). Testing the normality of several independent samples using the Anderson-Darling statistic. J. Roy. Statist. Soc. C 26, 156-161.

Pettitt, A. N. (1977b). Tests for the exponential distribution with censored data using Cramér-von Mises statistics. Biometrika 64, 629-632.

Pettitt, A. N. and Stephens, M. A. (1976). Modified Cramér-von Mises statistics for censored data. Biometrika 63, 291-298.

Pettitt, A. N. and Stephens, M. A. (1977). The Kolmogorov-Smirnov goodness-of-fit statistic with discrete and grouped data. Technometrics 19, 205-210.

Pettitt, A. N. and Stephens, M. A. (1983). EDF statistics for testing for the Gamma distribution. Technical Report, Department of Statistics, Stanford University.

Pierce, D. A. (1978). Combining evidence from several samples for testing goodness-of-fit to a location-scale family. Technical Report, Department of Statistics, Stanford University.

Pierce, D. A. and Gray, R. J. (1982). Testing normality of errors in regression models. Biometrika 69, 233-236.

Pierce, D. A. and Kopecky, K. J. (1978). Testing goodness-of-fit for the distribution of errors in regression models. Technical Report Symp. 16, Department of Statistics, Stanford University.

Proschan, F. (1963). Theoretical explanation of observed decreasing failure rate. Technometrics 5, 375-383.

Pyke, R. (1965). Spacings. J. Roy. Statist. Soc., B 27, 395-449.

Quesenberry, C. P., Whitaker, T. B., and Dickens, J. W. (1976). On testing normality using several samples: Analysis of peanut aflatoxin data. Biometrics 32, 753-754.

Rao, P. V., Schuster, E. F., and Littell, R. C. (1975). Estimation of shift and center of symmetry based on Kolmogorov-Smirnov statistics. Ann. Statist. 3, 862-873.

Renyi, A. (1953). On the theory of order statistics. Acta Math. Acad. Sci. Hung. 4, 191-227.

Riedwyl, H. (1967). Goodness-of-fit. J. Amer. Statist. Assoc. 62, 390-398.

Rothman, E. D. and Woodroofe, M. (1972). A Cramér-von Mises type statistic for testing symmetry. Ann. Math. Statist. 43, 2035-2038.

Sahler, W. (1968). A survey of distribution-free statistics based on distances between distribution functions. Metrika 13, 149-169.

Schey, H. M. (1977). The asymptotic distribution of the one-sided Kolmogorov-Smirnov statistic for truncated data. Comm. Statist. Theory Methods, A 6, 1361-1366.

Schneider, B. E. and Clickner, R. P. (1976). On the distribution of the Kolmogorov-Smirnov statistic for the gamma distribution with unknown parameters. Mimeo Series No. 36, Department of Statistics, School of Business Administration, Temple University, Philadelphia, Pa.

Serfling, R. J. and Wood, C. L. (1975). On null-hypothesis limiting distributions of Kolmogorov-Smirnov type statistics with estimated location and scale parameters. Technical Report, Florida State University.

Siegel, S. (1956). Non-parametric Statistics for Behavioral Scientists. New York: McGraw-Hill.

Smirnov, N. V. (1947). Akad. Nauk SSR., C. R. (Dokl.) Acad. Sci. URSS 56, 11-14.

Smith, R. M. and Bain, L. J. (1976). Correlation-type goodness-of-fit statistics with censored sampling. Comm. Statist. Theory Methods, A 5, 119-132.

Spinelli, J. J. and Stephens, M. A. (1983). Tests for exponentiality when origin and scale parameters are unknown. Technometrics 29, 471-476.

Srinivasan, R. (1970). An approach to testing the goodness-of-fit of incompletely specified distributions. Biometrika 57, 605-611.

Srinivasan, R. (1971). Tests for exponentiality. Statist. Hefte 12, 157-160.

Srinivasan, R. and Godio, L. B. (1974). A Cramér-von Mises type statistic for testing symmetry. Biometrika 61, 196-198.

Stephens, M. A. (1970). Use of the Kolmogorov-Smirnov, Cramér-von Mises and related statistics without extensive tables. J. Roy. Statist. Soc., B 32, 115-122.

Stephens, M. A. (1971). Asymptotic results for goodness-of-fit statistics when parameters must be estimated. Technical Reports 159, 180, Department of Statistics, Stanford University.

Stephens, M. A. (1974a). Components of goodness-of-fit statistics. Ann. Inst. H. Poincaré, B 10, 37-54.

Stephens, M. A. (1974b). EDF statistics for goodness-of-fit and some comparisons. J. Amer. Statist. Assoc. 69, 730-737.

Stephens, M. A. (1976a). Asymptotic results for goodness-of-fit statistics with unknown parameters. Ann. Statist. 4, 357-369.

Stephens, M. A. (1976b). Asymptotic power of EDF statistics for exponentiality against Gamma and Weibull alternatives. Technical Report No. 297, Department of Statistics, Stanford University.

Stephens, M. A. (1977). Goodness-of-fit for the extreme value distribution. Biometrika 64, 583-588.

Stephens, M. A. (1978). On the half-sample method for goodness-of-fit. J. Roy. Statist. Soc., B 40, 64-70.

Stephens, M. A. (1979). Tests of fit for the logistic distribution based on the empirical distribution function. Biometrika 66, 591-595.

Stephens, M. A. (1985). Tests for the Cauchy distribution based on the EDF. Technical Report, Department of Statistics, Stanford University.

Stephens, M. A. (1986). Goodness-of-fit for censored data. Technical Report, Department of Statistics, Stanford University.

Stephens, M. A. and Wagner, D. (1986). On a replacement method for goodness-of-fit with randomly censored data. Technical Report, Department of Mathematics and Statistics, Simon Fraser University.

Tiku, M. L. (1981). A goodness of fit statistic based on the sample spacings for testing a symmetric distribution against symmetric alternatives. Australian J. Statist. 23, 149-158.

Tiku, M. L. and Singh, M. (1981). Testing the two parameter Weibull distribution. Comm. Statist. A 10, 907-918.

Van Soest, J. (1967). Some experimental results concerning tests of normality. Statistica Neerlandica 21, 91-97.

Van Soest, J. (1969). Some goodness-of-fit tests for exponential distributions. Statistica Neerlandica 23, 41-51.

Van Tilmann-Deutler, D., Griesenbrock, H. F., and Schwensfeier, H. E. (1975). Der Kolmogorov-Smirnov Einstichprobentest auf Normalität bei unbekanntem Mittelwert und unbekannter Varianz. Allgemeines Statistisches Archiv 59, 228-250.

Volodin, I. N. (1965). Testing of statistical hypothesis on the type of distribution by small sample. Kazan. Gos. Univ. Ucen. Zap. 125, 3-23 (Russian). Also in Selected Translations in Mathematical Statistics and Probability, AMS (1973), 11, 1-26.

Watson, G. S. (1961). Goodness-of-fit tests on a circle. Biometrika 48, 109-114.

White, H. and MacDonald, G. M. (1980). Some large-sample tests for non-normality in the linear regression model (with discussion). J. Amer. Statist. Assoc. 75, 16-28.

White, J. S. (1969). The moments of log-Weibull order statistics. Technometrics 11, 373-386.

Wilk, M. B. and Shapiro, S. S. (1968). The joint assessment of normality of several independent samples. Technometrics 10, 825-839.

Wood, C. L. (1978a). On null-hypothesis limiting distributions of Kolmogorov-Smirnov type statistics with estimated location and scale parameters. Comm. Statist., A 7, 1181-1198.

Wood, C. L. (1978b). A large sample Kolmogorov-Smirnov test for normality of experimental error in a randomised block design. Biometrika 65, 673-676.

Wood, C. L. and Altavela, M. M. (1978). Large-sample results for Kolmogorov-Smirnov statistics for discrete distributions. Biometrika 65, 235-239.
Tests Based on Regression and Correlation

Michael A. Stephens, Simon Fraser University, Burnaby, B.C., Canada

5.1 INTRODUCTION

In previous chapters it has been shown how a random sample can be used in a graphical display (for example, on probability paper, or by drawing the EDF) which is then used to indicate whether the sample comes from a given distribution. The techniques make use of the sample values arranged in ascending order, that is, they use the order statistics. In this chapter we examine another graphical method, related to probability plots, in which the order statistics are plotted on the vertical axis of the graph, against T_i, a suitable function of i, on the horizontal axis. A straight line is then fitted to the points, and tests are based on statistics associated with this line. This type of test will be called a regression test; when the test statistic used is the correlation coefficient between X and T, the test will be called a correlation test.

5.1.1 Notation

The following notation is used: X_1, ..., X_n is a random sample from a continuous distribution F_0(x), known or hypothesized; X_(1) < X_(2) < ... < X_(n) are the order statistics. X may refer to the n-vector with components X_i or X_(i) depending on context. F_0(x) is often of the form F(w) with w = (x - α)/β; α is the location parameter and β is the scale parameter. If α = 0, F_0(x) is said to contain a scale parameter only.
If u = F(w), w = F⁻¹(u), so that F⁻¹(·) is the inverse function of F(·); f(w) is the density function corresponding to F(w).


W_1, ..., W_n is a random sample from F(w); W_(1) < W_(2) < ... < W_(n) are the order statistics. T_i, i = 1, ..., n, is used to describe a set of constants; two important sets are T_i = m_i = E(W_(i)), where E denotes expectation, and T_i = H_i = F⁻¹{i/(n + 1)}, i = 1, ..., n; m, H, T are column vectors of length n with components m_i, H_i, T_i, respectively.
Σ or Σ_i denotes the sum over i from 1 to n unless other limits are given; X̄ = Σ_i X_i/n, T̄ = Σ_i T_i/n, etc.; log x means log_e x.

5.1.2 Definitions: Correlation Coefficient

Let X refer to the vector X_(1), ..., X_(n), and T to the vector T_1, ..., T_n; define the sums

S(X,T) = Σ(X_(i) - X̄)(T_i - T̄) = Σ X_(i)T_i - nX̄T̄
S(X,X) = Σ(X_(i) - X̄)² = Σ X_(i)² - nX̄²
S(T,T) = Σ(T_i - T̄)²

and let v(X,T) = S(X,T)/(n - 1); similarly define v(X,X) and v(T,T). The correlation coefficient between X and T is R(X,T) = v(X,T)/{v(X,X)v(T,T)}^(1/2) = S(X,T)/{S(X,X)S(T,T)}^(1/2). The statistic Z(X,T) = n{1 - R²(X,T)} is often used in subsequent sections. S(X,X) may be referred to as S²; similarly v(X,T) or R(X,T) may be referred to as v or R when there is no ambiguity in context. The usual meaning of (sample) correlation, defined for two sets of random variables, is here being extended to define "correlation" between a set of random variables X and a set of constants T, following the same definition.
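These definitions translate directly into code. The sketch below (function and variable names are ours, not the book's) computes R(X,T) and Z(X,T) from the sums S(X,T), S(X,X), and S(T,T):

```python
import math

def sums(X, T):
    """S(X,T), S(X,X), S(T,T) of Section 5.1.2 for paired values X_(i), T_i."""
    n = len(X)
    xbar, tbar = sum(X) / n, sum(T) / n
    SXT = sum((x - xbar) * (t - tbar) for x, t in zip(X, T))
    SXX = sum((x - xbar) ** 2 for x in X)
    STT = sum((t - tbar) ** 2 for t in T)
    return SXT, SXX, STT

def R(X, T):
    """Correlation coefficient R(X,T) = S(X,T)/{S(X,X)S(T,T)}^(1/2)."""
    SXT, SXX, STT = sums(X, T)
    return SXT / math.sqrt(SXX * STT)

def Z(X, T):
    """Z(X,T) = n{1 - R^2(X,T)}; large values of Z signal a poor fit."""
    r = R(X, T)
    return len(X) * (1.0 - r * r)

# An exactly linear relation X_(i) = a + b*T_i gives R = 1 and Z = 0:
T_vals = [1, 2, 3, 4, 5]
X_vals = [2.0 + 0.5 * t for t in T_vals]
```

Because R is location- and scale-free in both arguments, any linear rescaling of either X or T leaves R and Z unchanged, a fact used repeatedly below.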

5.1.3 The Correlation Coefficient for Censored Data

Suppose only a subset of the X_(i) is available, because the data have been censored. Provided the ranks i of the known X_(i) are known also, the corresponding T_i can be paired with the X_(i), and the correlation coefficient calculated as above, with the sums running only over the available values i. The calculation of R(X,T) is thus very easily adapted to all types of censored data.

5.2 REGRESSION TESTS: MODELS

Regression tests arise most naturally when unknown parameters in the tested distribution F_0(x) are location and scale parameters. Suppose F_0(x) is F(w) with w = (x - α)/β, so that α is a location parameter and β a scale parameter,

and suppose any other parameters in F(w) are known. If a sample of values W_i were taken from F(w) with α = 0 and β = 1, we could construct a sample X_i from F_0(x) by calculating

X_i = α + βW_i, i = 1, ..., n    (5.1)

Let m_i = E(W_(i)); then

E(X_(i)) = α + βm_i    (5.2)

and a plot of X_(i) against m_i should be approximately a straight line with intercept α on the vertical axis and slope β. The values m_i are the most natural values to plot along the horizontal axis, but for most distributions they are difficult to calculate. Various authors have therefore proposed alternatives T_i which are convenient functions of i; then (5.2) can be replaced by the model

X_(i) = α + βT_i + ε_i    (5.3)

where ε_i is an "error" which for T_i = m_i will have mean zero. A frequent choice for T_i is H_i defined in Section 5.1.1.
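Model (5.3) can be sketched for a case where H_i has a closed form: the exponential distribution F(w) = 1 - exp(-w), for which H_i = F⁻¹{i/(n+1)} = -log{1 - i/(n+1)}. The code below (our own illustrative construction, not an example from the book) fits the line by ordinary least squares; a "perfect" sample placed exactly at α + βH_i returns α and β exactly:

```python
import math

def H_exponential(n):
    """Plotting positions H_i = F^(-1){i/(n+1)} for F(w) = 1 - exp(-w)."""
    return [-math.log(1.0 - i / (n + 1.0)) for i in range(1, n + 1)]

def fit_line(X, T):
    """Ordinary least squares fit of model (5.3): X_(i) = alpha + beta*T_i + e_i."""
    n = len(X)
    xbar, tbar = sum(X) / n, sum(T) / n
    beta = (sum((x - xbar) * (t - tbar) for x, t in zip(X, T))
            / sum((t - tbar) ** 2 for t in T))
    alpha = xbar - beta * tbar
    return alpha, beta

# A "perfect" exponential sample with alpha = 2, beta = 3 lies exactly on
# the line, so least squares recovers both parameters:
H = H_exponential(50)
X = [2.0 + 3.0 * h for h in H]
a_hat, b_hat = fit_line(X, H)
```

For real data the ordered values scatter about the line, and the test statistics of the following sections measure how far the plot departs from linearity.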

5.3 MEASURES OF FIT

Three main approaches to testing how well the data fit (5.3) can be identified:

(a) A test is based on the correlation coefficient R(X,T) between the paired sets X_i and T_i as defined in Section 5.1.2.
(b) Estimates α̂ and β̂ of the parameters in the line α + βT_i are found by a suitable method, and a test is based on the sum of squares of the residuals X_(i) - X̂_(i), where X̂_(i) = α̂ + β̂T_i. In order to give a scale-free statistic, this must be divided by another quadratic form in the X_(i).
(c) The scale parameter β is estimated as in (b), and the squared value compared with another estimate of β², for example that obtained from the sample variance.

These three methods are closely connected. In particular, when the method of estimation in (b) and (c) is ordinary least squares, the techniques often lead to test statistics equivalent to R². We discuss R²(X,m) in this chapter, for various distributions, beginning with the uniform. R²(X,m) is consistent against all alternatives; methods (b) and (c) can yield statistics which are not consistent against certain classes of alternative. We shall

discuss statistics based on methods (b) and (c) when they arise in connection with tests for the normal and exponential distributions.

5.4 TESTS BASED ON THE CORRELATION COEFFICIENT

5.4.1 The Correlation Coefficient and the ANOVA Table

When ordinary least squares is used to estimate α and β in the line X_(i) = α + βT_i, the estimate of β is β̂ = S(X,T)/S(T,T), and the standard ANOVA table for the model is, with X̂_(i) = α̂ + β̂T_i,

Regression sum of squares:  S²(X,T)/S(T,T)
Error sum of squares (ESS): Σ_i {X_(i) - X̂_(i)}²
Total sum of squares (TSS): S(X,X)

Define

Z(X,T) = n{1 - R²(X,T)}    (5.4)

It is easily shown that Z(X,T), or simply Z, is n(ESS/TSS). Thus Z is equivalent to R², but can also be regarded as based on ESS (Section 5.3, method (b)), or on the ratio of two quadratic forms (method (c)). The interconnections exist because ordinary least squares has been used to estimate α and β.
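The identity Z = n(ESS/TSS) is easy to check numerically; in the sketch below (our own function names and illustrative data), both routes give the same value:

```python
def Z_from_R(X, T):
    """Z = n{1 - R^2(X,T)} computed from the correlation coefficient."""
    n = len(X)
    xbar, tbar = sum(X) / n, sum(T) / n
    SXT = sum((x - xbar) * (t - tbar) for x, t in zip(X, T))
    SXX = sum((x - xbar) ** 2 for x in X)
    STT = sum((t - tbar) ** 2 for t in T)
    return n * (1.0 - SXT * SXT / (SXX * STT))

def Z_from_anova(X, T):
    """Z = n * ESS/TSS with alpha, beta estimated by ordinary least squares."""
    n = len(X)
    xbar, tbar = sum(X) / n, sum(T) / n
    STT = sum((t - tbar) ** 2 for t in T)
    beta = sum((x - xbar) * (t - tbar) for x, t in zip(X, T)) / STT
    alpha = xbar - beta * tbar
    ESS = sum((x - (alpha + beta * t)) ** 2 for x, t in zip(X, T))
    TSS = sum((x - xbar) ** 2 for x in X)
    return n * ESS / TSS

X = [0.12, 0.35, 0.44, 0.58, 0.71, 0.93]
T = [1, 2, 3, 4, 5, 6]
```

Algebraically the agreement is exact: ESS = S(X,X) - S²(X,T)/S(T,T) and TSS = S(X,X), so n(ESS/TSS) = n{1 - R²}.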
When T = m, R² is an appealing statistic for measuring the fit, since if a "perfect" sample is given, that is, a sample whose ordered values fall exactly at their expected values, R²(X,m) will be 1. A test based on R²(X,m) will be one-tail, with small values of R² indicating a poor fit. There are then some advantages in using Z instead of R²: first, high values of Z lead to rejection of a good fit, corresponding to many other goodness-of-fit statistics (such as EDF statistics or Pearson's X²), and second, Z often has an asymptotic distribution, which can be found, and interpolation in tables is easier. We therefore tabulate Z, and insert asymptotic points in the tables where these have been calculated.
R²(X,T) is naturally suited to the model X_(i) = α + βT_i where both α and β are unknown. When one or more of these parameters are known (for example, in many tests for exponentiality α = 0, and in some tests for uniformity α = 0 and β = 1), R² will not necessarily be a good statistic for measuring the fit, and a test based on the residuals (method (b)) will be more natural. These points will be discussed in Sections 5.6.1 and 5.11.4 below.

5.4.2 Consistency of the R(X,m) Test

Sarkadi (1975) showed the consistency of the test based on R(X,m) for testing normality, and more recently Gerlach (1979) has shown consistency for correlation tests based on R(X,m), or equivalently on Z = n{1 - R²(X,m)}, for a wide class of distributions including all the usual continuous distributions. This is to be expected, since for large n we expect a sample to become perfect in the sense of Section 5.4.1. We can expect the consistency property to extend to R(X,T) provided T approaches m sufficiently rapidly for large samples. We now give tests based on R²(X,m), for the uniform distribution, with unknown limits (next section) and with limits 0 and 1 (Section 5.6).

5.5 THE CORRELATION TEST FOR THE UNIFORM DISTRIBUTION WITH UNKNOWN LIMITS

5.5.1 Complete Samples

Suppose U(a,b) refers to the uniform distribution between limits a and b. The test for uniformity is a test of

H_0: a random sample X_1, X_2, ..., X_n comes from U(a,b)

with a, b unknown.
The ordered values X_(i) will be plotted against m_i = i/(n + 1); m_i = E(W_(i)), where W_1, W_2, ..., W_n is a random sample from U(0,1), and m̄ = 0.5; for this distribution m_i = H_i. The correlation test then takes the following steps:

(a) Calculate Z = n{1 - R²(X,m)}.
(b) Compare Z to the percentage points in Table 5.1; reject H_0 if Z exceeds the appropriate value for given n and α. Z has an asymptotic null distribution, found and tabulated by Lockhart and Stephens (1986). The points given for finite n, obtained by Monte Carlo studies, approach the asymptotic points smoothly and rapidly, and interpolation in Table 5.1 is straightforward. Note that since R²(X,T) is scale-free, R²(X,m) is here the same as R²(X,T) with T_i = i.
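Steps (a) and (b) can be sketched as follows (our own illustrative data; the 5% point 1.532 for n = 10 is read from Table 5.1):

```python
def Z_uniform(X):
    """Z = n{1 - R^2(X, m)} for the uniformity test of Section 5.5.1.

    T_i = i is used, which is equivalent to m_i = i/(n+1) since R is
    scale-free.
    """
    n = len(X)
    X = sorted(X)
    T = list(range(1, n + 1))
    xbar, tbar = sum(X) / n, sum(T) / n
    SXT = sum((x - xbar) * (t - tbar) for x, t in zip(X, T))
    SXX = sum((x - xbar) ** 2 for x in X)
    STT = sum((t - tbar) ** 2 for t in T)
    return n * (1.0 - SXT * SXT / (SXX * STT))

# Ten ordered values lying close to a straight line against i:
sample = [0.06, 0.14, 0.27, 0.32, 0.45, 0.51, 0.64, 0.70, 0.84, 0.95]
z = Z_uniform(sample)
reject_at_5pct = z > 1.532      # Table 5.1, n = 10, alpha = 0.05
```

Here the small value of Z reflects the near-linearity of the probability plot, so H_0 is not rejected at the 5% level.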

5.5.2 Tests for Type 2 Censored Samples


Suppose the sample is Type 2 right-censored (see Chapter 11), and only the subset X_(i), i = 1, ..., r, is available. This subset, on H_0, will be uniform with unknown limits a* and b* if a and b are unknown. Thus the subset may be treated as a complete sample of size r and the test above may be used; m_i is then i/(r + 1). Alternatively, R²(X,m) can be calculated as described in Section 5.1.2; since the X_(i) are the first r values out of n, the values m_i

TABLE 5.1 Upper Tail Percentage Points for Z = n{1 - R²(X,m)} for a Test for Uniformity, Full Sample

                      Significance level α
  n      0.5    0.25    0.15    0.10    0.05    0.025   0.01
  4     0.344   0.559   0.734   0.888   1.089   1.238   1.388
  6     0.441   0.703   0.901   1.053   1.325   1.590   1.918
  8     0.495   0.792   1.000   1.163   1.474   1.739   2.100
 10     0.535   0.833   1.068   1.245   1.532   1.846   2.294
 12     0.560   0.864   1.093   1.280   1.628   1.938   2.360
 18     0.605   0.940   1.147   1.385   1.716   2.083   2.503
 20     0.610   0.960   1.200   1.420   1.760   2.140   2.550
 40     0.640   0.980   1.215   1.420   1.762   2.140   2.550
 60     0.648   0.988   1.227   1.420   1.765   2.140   2.550
 80     0.658   0.997   1.228   1.420   1.770   2.140   2.550
  ∞     0.666   0.993   1.234   1.430   1.774   2.129   2.612

are i/(n + 1), i = 1, ..., r, with mean m̄ = (r + 1)/{2(n + 1)}. Because R²(X,m) is scale-free, both these procedures are equivalent to finding the correlation between X_(i) and i, and the two values of R² are identical. Similarly, a left-censored or double-censored sample can be treated; if the ranks of the known X_(i) are s ≤ i ≤ r, the correlation R between X_(i) and i may be found, and Table 5.1 used as though the data were a complete sample of size n = r + 1 - s. Note that this technique cannot be used for a randomly censored sample. Chen (1984) has examined correlation tests for randomly censored data, especially in connection with testing exponentiality.
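The equivalence of the two procedures follows because i/(n + 1) is a linear function of i and R is unchanged by linear transformations of T. A sketch with hypothetical doubly censored data (our own numbers):

```python
def R2(X, T):
    """Squared correlation coefficient between paired X and T values."""
    n = len(X)
    xbar, tbar = sum(X) / n, sum(T) / n
    SXT = sum((x - xbar) * (t - tbar) for x, t in zip(X, T))
    SXX = sum((x - xbar) ** 2 for x in X)
    STT = sum((t - tbar) ** 2 for t in T)
    return SXT * SXT / (SXX * STT)

# Ranks 3..8 of a sample of size n = 10 survive double censoring:
n = 10
ranks = [3, 4, 5, 6, 7, 8]
X_obs = [0.31, 0.38, 0.50, 0.55, 0.67, 0.79]

r2_ranks = R2(X_obs, ranks)                       # correlate with i
r2_m = R2(X_obs, [i / (n + 1.0) for i in ranks])  # correlate with i/(n+1)
```

The two R² values agree to machine precision, so in practice one simply correlates the available X_(i) with their ranks.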

5.5.3 Test for Type 1 Censored Samples

For a Type 1 censored sample, the procedure above does not make use of the censoring values, say A for the lower value and B for the upper. Suppose, for simplicity, A is zero and B is 11. Then it is possible to have, say, five values out of ten, X_(1) to X_(5), which are 0.9, 2.1, 3.1, 3.9, 5.2; these could give a large correlation coefficient, but would be suspect as being a uniform sample because of the large gap between the maximum X_(5) = 5.2 and B = 11. One way to make use of A and B is to include these in the sample, making it now a sample of size 7. On H_0, all 7 values will be uniform between

unknown limits a*, b* and again the complete sample test in Section 5.5.1 may be used. The test as now made combines a test for uniformity of the X_(i) with a test that they are spread over the range (A, B).
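The device of adjoining A and B can be sketched with the numbers above; Z is computed with T_i = i, which is equivalent since R is scale-free:

```python
def Z_stat(X):
    """Z = n{1 - R^2(X, m)} computed with T_i = i (equivalent choices)."""
    n = len(X)
    T = list(range(1, n + 1))
    xbar, tbar = sum(X) / n, sum(T) / n
    SXT = sum((x - xbar) * (t - tbar) for x, t in zip(X, T))
    SXX = sum((x - xbar) ** 2 for x in X)
    STT = sum((t - tbar) ** 2 for t in T)
    return n * (1.0 - SXT * SXT / (SXX * STT))

A, B = 0.0, 11.0
X = [0.9, 2.1, 3.1, 3.9, 5.2]      # the five Type 1 censored values
z_without = Z_stat(X)               # nearly linear against i: small Z
z_with = Z_stat([A] + X + [B])      # the gap before B now inflates Z
```

Without A and B the five values plot almost perfectly against their ranks; adjoining the censoring limits exposes the suspicious gap between X_(5) = 5.2 and B = 11 as a sharp increase in Z.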

5.6 THE CORRELATION TEST FOR U(0,1)

5.6.1 Complete Samples

In many goodness-of-fit situations, a transformation reduces the original test to a test of H_0: that a set of X-values is U(0,1). Examples are the Probability Integral Transformation discussed in Chapter 4, and the J and K transformations in tests for exponentiality discussed in Chapter 10. For this null hypothesis, equation (5.2) reduces to

E(X_(i)) = m_i

where m_i = i/(n + 1); that is, α and β are known to be 0 and 1, respectively.
As in Section 5.5.3, statistic R²(X,m), taken alone, is not a suitable test statistic for H_0; for example, R² could be very close to 1 even if the X-values were uniform on only a small subset of [0,1]. R²(X,m) can be modified for this situation by including 0, 1 as values in the sample, but a more direct use of the known range is to base a test on the residuals v_i = X_(i) - m_i, for example, using M = Σ_i v_i²; then M/n is the statistic T of Section 8.8.1. Other tests based on v_i are given in Section 8.8.1.
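The residual-based alternative can be sketched as follows (our function names and data; here M denotes the sum of squared residuals as above):

```python
def residual_stat(X):
    """M/n where M = sum of squared residuals v_i = X_(i) - i/(n+1)."""
    n = len(X)
    X = sorted(X)
    M = sum((x - i / (n + 1.0)) ** 2 for i, x in enumerate(X, start=1))
    return M / n

# Values uniform only on [0, 0.5] still correlate well with i, but their
# residuals from the U(0,1) line are large, which this statistic detects:
good = [i / 11.0 + 0.01 * (-1) ** i for i in range(1, 11)]
bad = [i / 22.0 for i in range(1, 11)]
```

The second sample is perfectly linear against i (so R² alone would be close to 1), yet its residual statistic is hundreds of times larger than that of the first, illustrating why the residuals, not the correlation, carry the information about the known range.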

5.6.2 Censored Samples from U(0,1)

Suppose the sample is Type 1 left-censored at A and right-censored at B; a test for uniformity should not only test for linearity of the values given in this interval, but also should test that the number of values is reasonable. R²(X,m) alone will not do this, for reasons similar to those given in Section 5.5.3; similar objections would apply if the given values were Type 2 censored. Thus the correlation coefficient alone will not usually be appropriate for testing for U(0,1) with censored samples.
Further illustrations of these difficulties are given in Chapter 11, where other methods are proposed for censored U(0,1) samples.

5.7 REGRESSION TESTS FOR THE NORMAL DISTRIBUTION 1

5.7.1 Tests Based on the Correlation Coefficient

In this section correlation tests are discussed for


202 STEPHENS

H0: a random sample X1, ..., Xn comes from N(μ, σ²),
with both μ and σ unknown    (5.5)

In Sections 5.8 and 5.9 other regression tests for (5.5) are developed,
based on residuals, or on the ratio of two estimates of scale.

For the normal distribution N(μ, σ²), f(w) = (2π)⁻½ exp(-w²/2), with
w = (x - μ)/σ; thus α = μ and β = σ, and the mi are the expected values of
standard normal order statistics. Equation (5.2) becomes

E(X(i)) = μ + σmi    (5.6)

For the normal distribution the mean of the mi is 0, and R²(X,m) can
conveniently be written in vector notation. Let X be the vector
(X(1), X(2), ..., X(n)) as before, and let m be the vector (m1, m2, ..., mn);
let a prime, for example in X' and m', denote the transpose of a vector or
of a matrix. Then, for a complete sample,

R²(X,m) = (m'X)² / {(m'm)S²}    (5.7)

This statistic will be seen later to be identical to the Shapiro-Francia
statistic W', so that, for testing normality, we can refer to R²(X,m) also as W'.

5.7.2 Expected Values of Standard Normal Order Statistics

Values of mi, for the calculation of R²(X,m), must be calculated numerically;
extensive tables have been given by Harter (1961). The approximation

mi = Φ⁻¹{(i - 0.375)/(n + 0.25)}    (5.8)

first suggested by Blom (1958) can also be used; Φ⁻¹(·) is the inverse of the
standard normal cdf, and computer routines exist for this function. Other
formulas given by Hastings (1955) are quoted in Abramowitz and Stegun
(1965, p. 933); an algorithm has been given by Milton and Hotchkiss (1969),
and simpler formulas, suitable for use on pocket calculators, have since
been given by Page (1977) and by Hamaker (1978). Weisberg and Bingham
(1975) have shown that use of Blom's formula and the Milton-Hotchkiss
algorithm for Φ⁻¹(·) to approximate mi makes negligible difference to the
distribution of R²(X,m) = W', which in any case must be found by Monte
Carlo methods. Shapiro and Francia (1972) originally introduced W' as a
replacement for the Shapiro-Wilk statistic W below, for use when n > 50, but
the above articles suggest that W' = R²(X,m) will be useful, because of the
TABLE 5.2 Upper Tail Percentage Points for Z = n{1 - R²(X,m)} for a
Test for Normality with Complete or Type 2 Censored Data; p = censoring
ratio r/n

Significance level α

n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

20 1.52 2.86 3.65 4.16 5.26 6.65 8.13


40 2.68 4.25 5.40 6.27 7.69 9.07 10.76
p = 0.2 60 3.25 5.05 6.34 7.46 9.25 11.00 13.01
80 3.65 5.64 7.08 8.32 10.38 12.50 15.17
100 3.94 6.07 7.64 8.95 11.22 13.64 16.96
10 0.77 1.42 1.80 2.03 2.51 3.17 3.90
20 1.33 2.09 2.64 3.07 3.76 4.46 5.38
40 1.77 2.72 3.42 3.98 4.92 5.97 7.30
p = 0.4
60 2.02 3.04 3.84 4.49 5.65 6.79 8.51
80 2.18 3.30 4.16 4.84 6.13 7.44 9.38
100 2.30 3.51 4.39 5.09 6.46 7.92 10.01
10 0.76 1.19 1.50 1.76 2.18 2.61 3.15
20 1.03 1.57 1.97 2.28 2.83 3.38 4.09
40 1.30 1.93 2.42 2.81 3.47 4.22 5.31
p = 0.6
60 1.42 2.13 2.66 3.08 3.86 4.70 5.97
80 1.52 2.27 2.82 3.29 4.12 5.04 6.36
100 1.60 2.38 2.94 3.43 4.30 5.27 6.63
10 0.65 1.01 1.26 1.45 1.79 2.14 2.59
20 0.84 1.25 1.55 1.79 2.20 2.63 3.27
40 1.01 1.48 1.81 2.09 2.61 3.14 3.85
p = 0.8
60 1.08 1.58 1.95 2.25 2.80 3.38 4.29
80 1.15 1.67 2.06 2.38 2.95 3.56 4.50
100 1.19 1.78 2.14 2.47 3.07 3.68 4.61
10 0.62 0.94 1.17 1.35 1.67 2.01 2.42
20 0.76 1.14 1.40 1.60 1.97 2.33 2.86
40 0.90 1.31 1.61 1.86 2.27 2.70 3.34
p = 0.9 60 0.96 1.40 1.70 1.96 2.38 2.87 3.54
80 1.01 1.46 1.78 2.05 2.50 2.99 3.68
100 1.05 1.50 1.84 2.12 2.60 3.07 3.78
10 0.60 0.92 1.13 1.30 1.62 1.95 2.34
20 0.74 1.11 1.36 1.55 1.89 2.26 2.73
40 0.87 1.26 1.54 1.76 2.14 2.52 3.14
p = 0.95
60 0.92 1.33 1.61 1.85 2.26 2.69 3.27
80 0.97 1.39 1.68 1.92 2.35 2.80 3.39
100 1.01 1.43 1.74 1.98 2.41 2.88 3.48
10 0.59 0.89 1.10 1.26 1.58 1.90 2.27
20 0.76 1.11 1.36 1.56 1.91 2.31 2.81
40 0.91 1.32 1.60 1.83 2.23 2.66 3.30
60 1.01 1.42 1.72 1.96 2.39 2.83 3.43
80 1.06 1.50 1.80 2.05 2.49 2.94 3.54
p = 1.0
100 1.08 1.55 1.86 2.11 2.56 3.01 3.61
400 1.34 1.83 2.17 2.45 2.89 3.36 3.95
600 1.43 1.94 2.26 2.52 2.98 3.42 4.04
1000 1.50 2.00 2.33 2.60 3.11 3.61 4.25

ready availability of good approximations to mi, for smaller n. We have
therefore constructed tables for this test based on R²(X,m), using (5.8) for mi.
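Since (5.8) needs only an inverse normal cdf routine, it is easy to program; a minimal sketch (using Python's standard-library NormalDist, an implementation choice for illustration rather than anything prescribed by the text) is:

```python
from statistics import NormalDist

def blom_scores(n):
    """Approximate m_i = Phi^(-1){(i - 0.375)/(n + 0.25)}, i = 1, ..., n."""
    inv = NormalDist().inv_cdf
    return [inv((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

m = blom_scores(20)
print(round(m[0], 4))   # close to -1.8682, the value m*_1 listed in Table 5.3
```

For n = 20 the first score agrees with the tabulated Blom value 1.8682 in magnitude, and the scores are antisymmetric about zero.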

5.7.3 Correlation Test Based on R²(X,m)

The correlation test using R²(X,m) then takes the following form:

(a) Calculate Z = n{1 - R²(X,m)}, with mi given by (5.8).
(b) Refer Z to Table 5.2. H0 is rejected if Z is greater than the percentage
point for the given values of n and of p = r/n, and significance level α.

Table 5.2 contains percentage points for use with Type 2 censored data.
They have been found by extensive Monte Carlo sampling, using 10,000
samples for each n. Shapiro and Francia (1972) have given points for
W' = R²(X,m), for complete samples only, for n = 35 and for all n between
50 and 99. Fotopoulos, Leslie, and Stephens (1984) have produced
asymptotic theory for Z(X,m), for complete samples.

Tables are not given for Type 1 right-censored data since objections
apply similar to those against EDF statistics. If the upper censoring value
were t, and if p = Φ{(t - μ)/σ}, tables of Z could be given for selected p and n;
however, they would have to be entered, in practice, at p = Φ{(t - μ̂)/σ̂}, and
this could cause an error in the apparent significance level of Z. For large
samples, Z can be calculated from the available observations, and Table 5.2
can be used to give an approximate test.

E5.7.3 Example

Table 5.3 contains 20 values of weights of chicks, taken from Bliss (1976),
already used in Chapter 4 in tests of normality. When these are correlated

TABLE 5.3 Data for Tests of Normality

X:a  156, 162, 168, 182, 186, 190, 190, 196, 202, 210, 214, 220, 226,
230, 230, 236, 236, 242, 246, 270

mi:b  1.86748, 1.40760, 1.13095, 0.92098, 0.74538, 0.59030,
0.44833, 0.31493, 0.18696, 0.06200

mi*:c (n = 20)  1.8682, 1.4034, 1.1281, 0.9191, 0.7441, 0.5895, 0.4478,
0.3146, 0.1868, 0.0619

aThe values X are weights of 20 chicks, in grams.
bValues mi are the exact expected values of standard normal order statistics;
the magnitudes for i = 1, ..., 10 are listed, and by symmetry m(n+1-i) = -mi.
cValues mi* are those given by (5.8).

with mi calculated from (5.8), also given in the table, the value of R is
0.9907, leading to Z = 0.3719. When the weights are correlated against the
exact mi (also given), Z = 0.3720, a negligible difference. Reference to
Table 5.2 shows that Z is not significant even at the α = 0.50 level, so that
normality is not rejected (the same conclusion as in Section 4.8.1). An
example with censored data is in Section 11.4.1.2.
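The calculation in this example can be sketched in a few lines; the following fragment (an editorial illustration, not the authors' code) recomputes Z for the Table 5.3 data using Blom's mi:

```python
from statistics import NormalDist

# Chick weights of Table 5.3 (grams), already ordered.
x = [156, 162, 168, 182, 186, 190, 190, 196, 202, 210, 214, 220,
     226, 230, 230, 236, 236, 242, 246, 270]
n = len(x)
inv = NormalDist().inv_cdf
m = [inv((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]   # Blom's m_i

xbar = sum(x) / n                       # the m_i sum to zero by symmetry
sxx = sum((v - xbar) ** 2 for v in x)   # S^2 of the text
smm = sum(v * v for v in m)
sxm = sum((a - xbar) * b for a, b in zip(x, m))
r2 = sxm ** 2 / (sxx * smm)             # squared correlation R^2(X, m)
z = n * (1 - r2)
print(round(z, 2))                      # close to the Z = 0.3719 of the text
```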

5.7.4 Correlation Test Based on R²(X,H)

Let Hi = Φ⁻¹{i/(n + 1)}; a test could be based on the statistic Z(X,H) =
n{1 - R²(X,H)}, following the procedure given above. This test statistic was
suggested by de Wet and Venter (1972). Use of the Hi instead of mi makes
the distribution theory of Z(X,H) easier than that of Z(X,m), and de Wet and
Venter have given the asymptotic theory for Z(X,H), for full samples.
Stephens (1986) has given tables for Z(X,H), for finite n and for complete
and Type 2 censored samples. These are not included here since computation
of Z(X,H) is no easier than that of Z(X,m) and the corresponding tests can
be expected to have similar properties.

5.7.5 Other Correlation Statistics for Normality

Smith and Bain (1976) have proposed the correlation statistic R(X,K), where
Ki is a close approximation to Hi, given by Abramowitz and Stegun (1965,
p. 933). Smith and Bain have given tables for use when R²(X,K) has been
calculated from Type 2 censored data. Filliben (1975) investigated tests
using Ti = m̃i, the median of the distribution of W(i); m̃i is given by Φ⁻¹(ũi),
where ũi is the median of the i-th order statistic of a uniform sample.
Filliben has given an empirical approximation for ũi which gives a formula
for m̃i similar to that for mi given in (5.8) above; thus R²(X,m̃) is close to
R²(X,m) = W' and has similar power properties (Filliben, 1975). Filliben
also gave tables of critical values of R²(X,m̃).

5.8 REGRESSION TESTS BASED ON RESIDUALS

We next turn to the second method of testing described in Section 5.3, in
which parameters α and β in the model X(i) = α + βTi + εi are estimated
and a test is then based on the residuals. We consider estimates given by
generalized least squares, and suppose T = m. In the notation of Section 5.2,
let vij = E{W(i) - mi}{W(j) - mj}, the covariance of W(i) and W(j). Then,
as before, let X be the column vector with components X(1), ..., X(n), let
m be the column vector with components m1, ..., mn, and let 1 be a column
vector with each component 1. Let V be the matrix V = (vij) and let X' be
the transpose of X; similarly for other vectors and matrices. The generalized
least squares estimates of α and β are

α̂ = -m'GX and β̂ = 1'GX    (5.9)

where

G = V⁻¹(1m' - m1')V⁻¹ / {(1'V⁻¹1)(m'V⁻¹m) - (1'V⁻¹m)²}

For particular distributions, for example for the normal and exponential
distributions, these equations simplify considerably.
The estimate of X(i) given by the line is again X̂(i) = α̂ + β̂mi, and the
sum of squares of residuals, corresponding to the Error Sum of Squares in
the ANOVA table of Section 5.4, is ESS1 = Σi(X(i) - X̂(i))²; a test for fit
may then be based on Z1(X,m) = ESS1/S², where S² = Σi(X(i) - X̄)² as
before. Alternatively, since in generalized least squares analysis the quantity
ESS2 = (X - X̂)'V⁻¹(X - X̂), where X̂ is the column vector with components
X̂(1), ..., X̂(n), is minimized by the parameter estimates α̂ and β̂,
the test might be based on Z2(X,m) = ESS2/S². The test based on Z2 can be
shown to be consistent. Some examination of such tests has been made by
Spinelli (1980), for the exponential and the extreme-value distributions, but
otherwise they have not been much developed.
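The estimates (5.9) can be written directly in matrix form. The sketch below (an editorial illustration assuming numpy; the tridiagonal V and the scores m are invented purely for the check) verifies only the algebra: data lying exactly on the line X(i) = α + βmi must return α and β for any positive-definite V.

```python
import numpy as np

def gls_alpha_beta(x, m, V):
    """Generalized least squares estimates (alpha, beta) of equation (5.9)."""
    x, m = np.asarray(x, float), np.asarray(m, float)
    one = np.ones_like(m)
    Vi = np.linalg.inv(V)
    denom = (one @ Vi @ one) * (m @ Vi @ m) - (one @ Vi @ m) ** 2
    G = Vi @ (np.outer(one, m) - np.outer(m, one)) @ Vi / denom
    return -m @ G @ x, one @ G @ x

# Check: data exactly on the line X_(i) = 2 + 3 m_i must give back (2, 3)
# whatever positive-definite V is supplied (here an arbitrary tridiagonal one).
m = np.array([-1.0, -0.3, 0.0, 0.4, 1.1])
V = np.array([[2., 1, 0, 0, 0], [1, 2, 1, 0, 0], [0, 1, 2, 1, 0],
              [0, 0, 1, 2, 1], [0, 0, 0, 1, 2]])
alpha, beta = gls_alpha_beta(2 + 3 * m, m, V)
print(round(alpha, 6), round(beta, 6))   # 2.0 3.0
```

In practice m and V would be the order-statistic means and covariances for the hypothesized distribution, which is where the real computational work lies.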

5.9 TESTS BASED ON THE RATIO OF TWO ESTIMATES OF SCALE

Finally we turn to the third method of testing the fit of the model (5.2), one
which has been developed by Shapiro and Wilk for testing normality and
exponentiality. The procedure is to compare β̂², where β̂ is the generalized
least squares estimate given in equation (5.9), with the estimate of β²
given by the sample variance; the test statistic is essentially the ratio of
these two estimates. Tests of this type are closely related to those in the
previous section.
For the normal test, this statistic works very well, but in other test
situations, for example for the exponential test, the statistic is inconsistent;
in practical terms there will be certain alternative distributions which will
not be detected even with very large samples (see Section 5.12).
In the case of tests for normality, modifications of the first estimate
of β² above have also been suggested, since the estimate is complicated to
calculate. It is not, of course, necessary to use the particular estimates of
scale given above, and a test can be developed using the ratio of any two
convenient estimates of β². Some comments on the choice of these estimates
are in Section 5.11 below.
In the next section we give tests for normality based on residuals and on
ratios of scale estimates. Regression tests for exponentiality of all types
are included with other tests in Section 5.11, and some general comments
on the techniques are given in Section 5.12.

5.10 REGRESSION TESTS FOR THE NORMAL DISTRIBUTION 2

5.10.1 Tests for Normality Based on Residuals

The model (5.2) can be extended to provide further tests of fit. Suppose the
mi are the expected values of order statistics from a N(0,1) distribution; the
model expressed by (5.2) is then correct only when the sample Xi comes
from a normal distribution. A wide class of alternative distributions can be
specified by supposing that the order statistics in a sample of size n from a
distribution F(x) have expectations which satisfy the model

E(X(i)) = μ + σmi + β2 w2(mi) + β3 w3(mi) + ···    (5.10)

where β2, β3, ... are constants and w2(mi), w3(mi), ... are functions of mi.
By different choices of these functions, the normal model for X is embedded
in various classes of densities. For the appropriate class, for given wj(·),
the estimates β̂j of the constants βj can be derived by generalized least
squares. A test for normality can be based on these estimates by using them
to test H0: β2 = β3 = ··· = 0. Equivalently, tests may be based on the reduction
in error sum of squares between a fit of model (5.2) and the more general
model (5.10) above. LaBrecque (1973, 1977) has developed such tests,
for wj(m) a j-th order polynomial in m, chosen so that the covariance matrix
of the estimates μ̂, σ̂, β̂2, etc., should be diagonal, and has given the
necessary tables of coefficients and significance points. Stephens (1976) has
suggested use of Hermite polynomials for wj(·) and has given asymptotic
theory of the tests based on β̂j, for j ≥ 2.

5.10.2 Test for N(0,1)

When the parameters in N(μ,σ²) are specified, the test based on residuals
takes a simple form. Let X'(i) = (X(i) - μ)/σ; the null hypothesis then reduces
to a test that the X'(i) are N(0,1), and a natural statistic based on residuals
is

Σi {X'(i) - mi}²

De Wet and Venter (1972) have investigated the asymptotic theory of this
statistic (which they call Ln). However, no tables are available for small
sample sizes, and the power properties of the test are not known. The test
based on Ln would be a rival to the many tests available for this case, called
Case 0 in Chapter 4, when parameters are fully specified; for example, the
Xi could be transformed to uniforms (on the null hypothesis) by the Probability
Integral Transformation (Section 4.2.3) and any of the many tests for
U(0,1) could be used.
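A sketch of this residual statistic follows (an editorial illustration only; for simplicity the exact mi are replaced here by the approximation Hi = Φ⁻¹{i/(n + 1)}, an assumption not made by the text, which uses the exact mi):

```python
from statistics import NormalDist

def n01_residual_stat(x, mu, sigma):
    """Sum of squared residuals of the standardized order statistics
    against the scores H_i = Phi^(-1){i/(n+1)}, used here in place of
    the exact m_i."""
    xs = sorted((v - mu) / sigma for v in x)
    n = len(xs)
    inv = NormalDist().inv_cdf
    return sum((xi - inv(i / (n + 1))) ** 2
               for i, xi in enumerate(xs, start=1))

# Values placed exactly at the scores give a statistic of zero; a shift in
# location (here mu misspecified by 1) inflates it.
inv = NormalDist().inv_cdf
perfect = [inv(i / 11) for i in range(1, 11)]
print(n01_residual_stat(perfect, 0.0, 1.0), n01_residual_stat(perfect, 1.0, 1.0))
```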

5.10.3 Tests for Normality Based on the Ratio of Estimates of Scale; the Shapiro-Wilk Test

For the normal distribution, α and β in (5.2) are μ and σ, respectively; the
estimates of these parameters given by (5.9) then become

μ̂ = X̄  and  σ̂ = m'V⁻¹X / (m'V⁻¹m)    (5.11)

The test statistic proposed by Shapiro and Wilk (1965) is

W = R1⁴σ̂² / (R2²S²)    (5.12)

where S² = Σi(X(i) - X̄)² = Σi(Xi - X̄)² as before, and where here R1² = m'V⁻¹m
and R2² = m'V⁻¹V⁻¹m. The factors R1⁴ and R2² ensure that W always takes
values between 0 and 1.

5.10.3.1 Computing Formulas

When σ̂ is substituted in the formula for W, it may be reduced to the following
computing formula. First, let vectors a* and a be defined by

a* = V⁻¹m  and  a = a*/C, where C = (a*'a*)^(1/2) = R2    (5.13)

then

W = (a'X)² / S²    (5.14)

In order to calculate W, the vector a is needed, and this in turn requires
values of m and V⁻¹, derived from V. At the time of the introduction of W,
exact values of V were published only for sample sizes up to n = 20 (Teichroew,
1956; Sarhan and Greenberg, 1962). They have since been calculated for
larger sample sizes (Tietjen, Kahaner, and Beckman, 1978), and an algorithm
for approximating V has been given by Davis and Stephens (1977). For values
of n between 21 and 50, Shapiro and Wilk used approximations for the
components ai of a, and gave a table for sample sizes from n = 3 to 50. These
are given in Table 5.4. The ai are antisymmetric about zero, that is,
a(n+1-i) = -ai, i = 1, ..., r, where r = (n - 1)/2 if n is odd and r = n/2 if
n is even, so that only the positive values are given.

5.10.3.2 Test Procedures

The steps in making the W test for normality, that is, for testing H0: the Xi
are a random sample from N(μ, σ²), with μ, σ unknown, are:

TABLE 5.4 Coefficients for the W Test for Normality

2 3 4 5 6 7 8 9 10

1 .7071 .7071 .6872 .6646 .6431 .6233 .6052 .5888 .5739


2 — .0000 .1677 .2413 .2806 .3031 .3164 .3244 .3291
3 — — — .0000 .0875 .1401 .1743 .1976 .2141
4 — — — — — .0000 .0561 .0947 .1224
5 — — — — — — — .0000 .0399

11 12 13 14 15 16 17 18 19 20

1 .5601 .5475 .5359 .5251 .5150 .5056 .4968 .4886 .4808 .4734
2 .3315 .3325 .3325 .3318 .3306 .3290 .3273 .3253 .3232 .3211
3 .2260 .2347 .2412 .2460 .2495 .2521 .2540 .2553 .2561 .2565
4 .1429 .1586 .1707 .1802 .1878 .1939 .1988 .2027 .2059 .2085
5 .0695 .0922 .1099 .1240 .1353 .1447 .1524 .1587 .1641 .1686

6 .0000 .0303 .0539 .0727 .0880 .1005 .1109 .1197 .1271 .1334
7 — — .0000 .0240 .0433 .0593 .0725 .0837 .0932 .1013
8 — — — — .0000 .0196 .0359 .0496 .0612 .0711
9 — — — — — — .0000 .0163 .0303 .0422
10 — — — — — — — — .0000 .0140

21 22 23 24 25 26 27 28 29 30

1 .4643 .4590 .4542 .4493 .4450 .4407 .4366 .4328 .4291 .4254
2 .3185 .3156 .3126 .3098 .3069 .3043 .3018 .2992 .2968 .2944
3 .2578 .2571 .2563 .2554 .2543 .2533 .2522 .2510 .2499 .2487
4 .2119 .2131 .2139 .2145 .2148 .2151 .2152 .2151 .2150 .2148
5 .1736 .1764 .1787 .1807 .1822 .1836 .1848 .1857 .1864 .1870

6 .1399 .1443 .1480 .1512 .1539 .1563 .1584 .1601 .1616 .1630
7 .1092 .1150 .1201 .1245 .1283 .1316 .1346 .1372 .1395 .1415
8 .0804 .0878 .0941 .0997 .1046 .1089 .1128 .1162 .1192 .1219
9 .0530 .0618 .0696 .0764 .0823 .0876 .0923 .0965 .1002 .1036
10 .0263 .0368 .0459 .0539 .0610 .0672 .0728 .0778 .0822 .0862

11 .0000 .0122 .0228 .0321 .0403 .0476 .0540 .0598 .0650 .0697
12 — — .0000 .0107 .0200 .0284 .0358 .0424 .0483 .0537
13 — — — — .0000 .0094 .0178 .0253 .0320 .0381
14 — — — — — — .0000 .0084 .0159 .0227
15 — — — — — — — — .0000 .0076

(continued)

TABLE 5.4 (continued)

31 32 33 34 35 36 37 38 39 40

1 .4220 .4188 .4156 .4127 .4096 .4068 .4040 .4015 .3989 .3964
2 .2921 .2898 .2876 .2854 .2834 .2813 .2794 .2774 .2755 .2737
3 .2475 .2463 .2451 .2439 .2427 .2415 .2403 .2391 .2380 .2368
4 .2145 .2141 .2137 .2132 .2127 .2121 .2116 .2110 .2104 .2098
5 .1874 .1878 .1880 .1882 .1883 .1883 .1883 .1881 .1880 .1878

6 .1641 .1651 .1660 .1667 .1673 .1678 .1683 .1686 .1689 .1691
7 .1433 .1449 .1463 .1475 .1487 .1496 .1505 .1513 .1520 .1526
8 .1243 .1265 .1284 .1301 .1317 .1331 .1344 .1356 .1366 .1376
9 .1066 .1093 .1118 .1140 .1160 .1179 .1196 .1211 .1225 .1237
10 .0899 .0931 .0961 .0988 .1013 .1036 .1056 .1075 .1092 .1108

11 .0739 .0777 .0812 .0844 .0873 .0900 .0924 .0947 .0967 .0986
12 .0585 .0629 .0669 .0706 .0739 .0770 .0798 .0824 .0848 .0870
13 .0435 .0485 .0530 .0572 .0610 .0645 .0677 .0706 .0733 .0759
14 .0289 .0344 .0395 .0441 .0484 .0523 .0559 .0592 .0622 .0651
15 .0144 .0206 .0262 .0314 .0361 .0404 .0444 .0481 .0515 .0546

16 .0000 .0068 .0131 .0187 .0239 .0287 .0331 .0372 .0409 .0444
17 — — .0000 .0062 .0119 .0172 .0220 .0264 .0305 .0343
18 — — — — .0000 .0057 .0110 .0158 .0203 .0244
19 — — — — — — .0000 .0053 .0101 .0146
20 — — — — — — — — .0000 .0049

41 42 43 44 45 46 47 48 49 50

1 .3940 .3917 .3894 .3872 .3850 .3830 .3808 .3789 .3770 .3751
2 .2719 .2701 .2684 .2667 .2651 .2635 .2620 .2604 .2589 .2574
3 .2357 .2345 .2334 .2323 .2313 .2302 .2291 .2281 .2271 .2260
4 .2091 .2085 .2078 .2072 .2065 .2058 .2052 .2045 .2038 .2032
5 .1876 .1874 .1871 .1868 .1865 .1862 .1859 .1855 .1851 .1847

6 .1693 .1694 .1695 .1695 .1695 .1695 .1695 .1693 .1692 .1691
7 .1531 .1535 .1539 .1542 .1545 .1548 .1550 .1551 .1553 .1554
8 .1384 .1392 .1398 .1405 .1410 .1415 .1420 .1423 .1427 .1430
9 .1249 .1259 .1269 .1278 .1286 .1293 .1300 .1306 .1312 .1317
10 .1123 .1136 .1149 .1160 .1170 .1180 .1189 .1197 .1205 .1212

(continued)

TABLE 5.4 (continued)

41 42 43 44 45 46 47 48 49 50

11 .1004 .1020 .1035 .1049 .1062 .1073 .1085 .1095 .1105 .1113
12 .0891 .0909 .0927 .0943 .0959 .0972 .0986 .0998 .1010 .1020
13 .0782 .0804 .0824 .0842 .0860 .0876 .0892 .0906 .0919 .0932
14 .0677 .0701 .0724 .0745 .0765 .0783 .0801 .0817 .0832 .0846
15 .0575 .0602 .0628 .0651 .0673 .0694 .0713 .0731 .0748 .0764

16 .0476 .0506 .0534 .0560 .0584 .0607 .0628 .0648 .0667 .0685
17 .0379 .0411 .0442 .0471 .0497 .0522 .0546 .0568 .0588 .0608
18 .0283 .0318 .0352 .0383 .0412 .0439 .0465 .0489 .0511 .0532
19 .0188 .0227 .0263 .0296 .0328 .0357 .0385 .0411 .0436 .0459
20 .0094 .0136 .0175 .0211 .0245 .0277 .0307 .0335 .0361 .0386

21 .0000 .0045 .0087 .0126 .0163 .0197 .0229 .0259 .0288 .0314
22 — — .0000 .0042 .0081 .0118 .0153 .0185 .0215 .0244
23 — — — — .0000 .0039 .0076 .0111 .0143 .0174
24 — — — — — — .0000 .0037 .0071 .0104
25 — — — — — — — — .0000 .0035

Taken from Shapiro and Wilk (1965), with permission of the authors and of
the Biometrika Trustees.

(a) Calculate Y = Σ(i = 1 to r) a(n+1-i){X(n+1-i) - X(i)}, where r = (n - 1)/2
    if n is odd, and r = n/2 if n is even.
(b) Calculate W = Y²/S².
(c) If W is less than the value given in the lower tail in Table 5.5 for the
    appropriate values of n and α, reject H0 at level α.

The exact distribution of W under the null hypothesis will depend on n,
but not on the true values of μ and σ. This distribution is not known, and
Shapiro and Wilk provided Monte Carlo percentage points for use with the
test. Table 5.5 contains these percentage points of W, for sample sizes
n ≤ 50. Shapiro and Wilk (1968) gave an approximation to the null distribution
of W.
The test is made in the lower tail of W, because extensive Monte Carlo
studies by Shapiro and Wilk suggested that when the sample is not from a
normal distribution, low values of W will usually result; this is so because
W is approximately R²(X,m) for the normal case (see Section 5.10.4).

TABLE 5.5 Percentage Points for the W Test for Normality

Significance level α

Low er tail Upper tail

n 0.01 0.02 0.05 0.10 0.50 0.10 0.05 0.02 0.01

3 0.753 0.756 0.767 0.789 0.959 0.998 0.999 1.000 1.000


4 .687 .707 .748 .792 .935 .987 .992 .996 .997
5 .686 .715 .762 .806 .927 .979 .986 .991 .993

6 0.713 0.743 0.788 0.826 0.927 0.974 0.981 0.986 0.989


7 .730 .760 .803 .838 .928 .972 .979 .985 .988
8 .749 .778 .818 .851 .932 .972 .978 .984 .987
9 .764 .791 .829 .859 .935 .972 .978 .984 .986
10 .781 .806 .842 .869 .938 .972 .978 .983 .986

11 0.792 0.817 0.850 0.876 0.940 0.973 0.979 0.984 0.986


12 .805 .828 .859 .883 .943 .973 .979 .984 .986
13 .814 .837 .866 .889 .945 .974 .979 .984 .986
14 .825 .846 .874 .895 .947 .975 .980 .984 .986
15 .835 .855 .881 .901 .950 .975 .980 .984 .987

16 0.844 0.863 0.887 0.906 0.952 0.976 0.981 0.985 0.987

17 .851 .869 .892 .910 .954 .977 .981 .985 .987
18 .858 .874 .897 .914 .956 .978 .982 .986 .988
19 .863 .879 .901 .917 .957 .978 .982 .986 .988
20 .868 .884 .905 .920 .959 .979 .983 .986 .988
21 0.873 0.888 0.908 0.923 0.960 0.980 0.983 0.987 0.989
22 .878 .892 .911 .926 .961 .980 .984 .987 .989
23 .881 .895 .914 .928 .962 .981 .984 .987 .989
24 .884 .898 .916 .930 .963 .981 .984 .987 .989
25 .888 .901 .918 .931 .964 .981 .985 .988 .989
26 0.891 0.904 0.920 0.933 0.965 0.982 0.985 0.988 0.989
27 .894 .906 .923 .935 .965 .982 .985 .988 .990
28 .896 .908 .924 .936 .966 .982 .985 .988 .990
29 .898 .910 .926 .937 .966 .982 .985 .988 .990
30 .900 .912 .927 .939 .967 .983 .985 .988 .990
31 0.902 0.914 0.929 0.940 0.967 0.983 0.986 0.988 0.990
32 .904 .915 .930 .941 .968 .983 .986 .988 .990
33 .906 .917 .931 .942 .968 .983 .986 .989 .990
34 .908 .919 .933 .943 .969 .983 .986 .989 .990
35 .910 .920 .934 .944 .969 .984 .986 .989 .990

(continued)

T A B L E 5.5 (continued)

Significance level a

Low er tail Upper tail

n 0.01 0.02 0.05 0.10 0.50 0.10 0.05 0.02 0.01

36 0.912 0.922 0.935 0.945 0.970 0.984 0.986 0.989 0.990


37 .914 .924 .936 .946 .970 .984 .987 .989 .990
38 .916 .925 .938 .947 .971 .984 .987 .989 .990
39 .917 .927 .939 .948 .971 .984 .987 .989 .991
40 .919 .928 .940 .949 .972 .985 .987 .989 .991

41 0.920 0.929 0.941 0.950 0.972 0.985 0.987 0.989 0.991


42 .922 .930 .942 .951 .972 .985 .987 .989 .991
43 .923 .932 .943 .951 .973 .985 .987 .990 .991
44 .924 .933 .944 .952 .973 .985 .987 .990 .991
45 .926 .934 .945 .953 .973 .985 .988 .990 .991

46 0.927 0.935 0.945 0.953 0.974 0.985 0.988 0.990 0.991


47 .928 .936 .946 .954 .974 .985 .988 .990 .991
48 .929 .937 .947 .954 .974 .985 .988 .990 .991
49 .929 .937 .947 .955 .974 .985 .988 .990 .991
50 .930 .938 .947 .955 .974 .985 .988 .990 .991

Taken from Shapiro and Wilk (1965), with permission of the authors
and of the Biometrika Trustees.

E5.10.3.3 Example

For the data of Table 5.3, Y in Step (a) above is (using coefficients from
Table 5.4) Y = 0.4734(270 - 156) + 0.3211(246 - 162) + ··· = 131.95. S² is
17845 and W = 0.976. Reference to Table 5.5 shows W to be significant at
approximately the 10% level, upper tail, indicating a very good normal fit.
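The arithmetic of this example can be sketched as follows (an editorial illustration, not the authors' code), using the n = 20 coefficients of Table 5.4:

```python
# Chick weights of Table 5.3 and the n = 20 coefficients of Table 5.4.
x = [156, 162, 168, 182, 186, 190, 190, 196, 202, 210, 214, 220,
     226, 230, 230, 236, 236, 242, 246, 270]
a = [0.4734, 0.3211, 0.2565, 0.2085, 0.1686,
     0.1334, 0.1013, 0.0711, 0.0422, 0.0140]

n = len(x)
y = sum(ai * (x[n - 1 - i] - x[i]) for i, ai in enumerate(a))   # step (a)
xbar = sum(x) / n
s2 = sum((v - xbar) ** 2 for v in x)                            # S^2
w = y * y / s2                                                  # step (b)
print(round(y, 2), round(w, 3))   # 131.95 0.976, as in the example
```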

5.10.4 The Shapiro-Francia Test

A test similar to W, but for use with n > 50, was later suggested by Shapiro
and Francia (1972). This is based on the observation of Gupta (1952) that
the estimate σ̂ is almost the same if V⁻¹ is ignored in (5.11); the test statistic
then given by Shapiro and Francia is

W' = (m'X)² / {(m'm)S²}

As has already been observed, W' is equivalent to the sample correlation
statistic R²(X,m) given by equation (5.7) of Section 5.7.1. A justification
for the equivalence, for large n, of W and W' has been given by Stephens
(1975), who showed that, for large n, V⁻¹m ≈ 2m; then R1²/n and R2²/n in
Section 5.10.3 will have limits 2 and 4, and W reduces to W' = R²(X,m).

5.10.5 The d'Agostino Test

In the above tests, σ̂ is given by a linear combination of the order statistics.
The coefficients are difficult to calculate for W, and in W' the formula is
replaced by an easier one. Other linear combinations of order statistics can
also be used to estimate σ, and one in particular, proposed by Downton
(1966), has been used by d'Agostino to provide a further adaptation of the
Shapiro-Wilk statistic. d'Agostino's statistic is given by

D = [Σi X(i){i - ½(n + 1)}] / (Sn^(3/2))    (5.15)

The statistic is easier to calculate than W or W' but must be used with
both tails; d'Agostino has shown by Monte Carlo studies that alternative
distributions may produce large or small values of the statistic. D'Agostino
(1971, 1972, 1973) tabled percentage points for a standardized value of D,
given by Y = {D - (2√π)⁻¹}√n / 0.02998598, and gave power studies (see
Chapter 9). Note that D can be regarded as a correlation statistic, based
on the correlation R(X,T) where now Ti = {i - ½(n + 1)}; however, the fact
that Ti is not near mi means that D can take significantly large or small
values when the sample is not normal.
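A sketch of D and the standardized value Y follows, applied to the chick weights of Table 5.3 (an editorial illustration only; a complete test would refer Y, with both tails, to d'Agostino's tables):

```python
from math import pi, sqrt

x = [156, 162, 168, 182, 186, 190, 190, 196, 202, 210, 214, 220,
     226, 230, 230, 236, 236, 242, 246, 270]   # Table 5.3, ordered
n = len(x)
xbar = sum(x) / n
s = sqrt(sum((v - xbar) ** 2 for v in x))      # S of the text
t = sum((i - (n + 1) / 2) * xi for i, xi in enumerate(x, start=1))
d = t / (s * n ** 1.5)                         # statistic (5.15)
y = (d - 1 / (2 * sqrt(pi))) * sqrt(n) / 0.02998598
print(round(d, 4), round(y, 2))
```

For these data Y is small in magnitude, consistent with the good normal fit found by the other statistics above.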

5.10.6 Power Studies for Regression Tests of Normality

Shapiro and Wilk (1965) and Shapiro, Wilk, and Chen (1968) have given
extensive Monte Carlo studies to compare W with other test statistics for
normality. Their studies indicate that W is a powerful omnibus test, particularly
when compared with the Pearson chi-square tests and against other older
tests, for example, those based on skewness √b1 and kurtosis b2, or
u = range/(standard deviation). Stephens (1974) also gave comparisons,
particularly of W with EDF statistics, and pointed out that low power results
for the latter, given in the papers quoted above, are based on non-comparable
test situations. Nevertheless, over a wide range of alternative distributions,
W gives slightly better power than EDF statistics A² and W², and considerably
better than the Kolmogorov D, or the Pearson X² or Pearson-Fisher X²
discussed in Chapter 3. For large samples, Stephens (1974) also compared
W, W', and D for power over a wide range of alternatives. The power of W'
is marginally less than that of W when W is available, and that of D is
smaller again. Thus the power drops as the statistic becomes easier to
calculate. Dyer (1974) has shown that W' has good power properties. For
large samples these studies are effectively showing the value of the correlation
coefficients R²(X,m) or R²(X,H) as test statistics for normality. Of
these statistics, it would appear to be best to use the Shapiro-Wilk W for
small samples, and Z(X,m) for larger samples (n > 50), but further
comparisons would be useful.

5.11 REGRESSION TESTS FOR THE EXPONENTIAL DISTRIBUTION

5.11.1 Introduction

In this and the following sections we discuss tests of

H0: a random sample of X-values comes from Exp(α, β),

that is, the distribution

F0(x) = 1 - exp{-(x - α)/β},  x ≥ α, β > 0    (5.16)

Parameter α is the origin of the distribution, and β is the scale parameter.
First suppose that both α and β are unknown.

5.11.2 Correlation Coefficient Tests

For the exponential distribution, mi = Σ(j = 1 to i) (n - j + 1)⁻¹, and so the mi
can be calculated without numerical integration. The test statistic
Z(X,m) = n{1 - R²(X,m)}, for either complete or right-censored samples of
Type 2, is referred to Table 5.6. Points in this table were found from Monte
Carlo samples, using 10,000 samples for each n.
Also, for the exponential distribution, Hi = -log{1 - i/(n + 1)}; the
correlation test statistic using H is then Z(X,H) = n{1 - R²(X,H)} and is
referred to Table 5.7. The points differ only slightly from those for Z(X,m)
except when p, the censoring ratio, is near 1. Although Spinelli and Stephens
(1983; see Section 5.11.6) found Z(X,H) less sensitive than Z(X,m) for
complete samples, we include the table because Hi can be calculated more
easily than mi. It might also be true that the power of Z(X,H) improves for
right-censored samples, where the influential tail observations are not
available. Smith and Bain (1976) have given a table of points for Z(X,H)/n,
and Lockhart and Stephens (1986) have discussed asymptotic theory for both
Z(X,m) and Z(X,H). Examples of the use of these statistics are given in
Chapter 10.
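Both sets of exponential scores are trivial to compute; the sketch below (an editorial illustration with made-up data, not an example from the book) evaluates Z(X,m) and Z(X,H):

```python
from math import log

def exp_scores(n):
    """Exact m_i = sum over j <= i of 1/(n - j + 1), standard exponential."""
    m, total = [], 0.0
    for j in range(1, n + 1):
        total += 1.0 / (n - j + 1)
        m.append(total)
    return m

def z_stat(x, scores):
    """Z = n{1 - R^2} for the ordered sample against the given scores."""
    xs, n = sorted(x), len(x)
    xbar, sbar = sum(xs) / n, sum(scores) / n
    sxx = sum((v - xbar) ** 2 for v in xs)
    smm = sum((v - sbar) ** 2 for v in scores)
    sxm = sum((a - xbar) * (b - sbar) for a, b in zip(xs, scores))
    return n * (1 - sxm ** 2 / (sxx * smm))

n = 10
h = [-log(1 - i / (n + 1)) for i in range(1, n + 1)]        # the H_i above
data = [0.1, 0.4, 0.6, 1.0, 1.3, 1.8, 2.3, 2.9, 3.9, 5.5]  # illustrative
print(round(z_stat(data, exp_scores(n)), 3), round(z_stat(data, h), 3))
```

A sample lying exactly on a line in either set of scores gives Z = 0; in practice the computed Z would be referred to Table 5.6 or Table 5.7.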

TABLE 5.6 Upper Tail Percentage Points for Z = n{1 - R²(X,m)} for a
Test for Exponentiality, Parameters Unknown, for Complete or Type 2
Right-Censored Data; p = r/n is the censoring ratio

Significance level α

n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

20 1.74 2.83 3.68 4.40 5.44 6.24 6.97


40 2.53 3.95 5.00 5.87 7.32 8.81 10.71
p = 0.2 60 2.80 4.32 5.44 6.34 7.90 9.47 11.64
80 2.95 4.51 5.68 6.58 8.20 9.81 12.13
100 3.05 4.63 5.82 6.74 8.39 10.03 12.44
10 0.86 1.41 1.82 2.16 2.68 3.09 3.50
20 1.25 1.98 2.51 2.96 3.73 4.49 5.46
40 1.48 2.26 2.84 3.34 4.13 4.96 6.15
p = 0.4
60 1.54 2.33 2.93 3.42 4.23 5.08 6.27
80 1.57 2.38 2.98 3.47 4.28 5.14 6.34
100 1.59 2.40 3.01 3.49 4.31 5.18 6.37
10 0.75 1.22 1.55 1.81 2.25 2.65 3.19
20 0.95 1.47 1.86 2.17 2.69 3.25 4.03
40 1.06 1.60 2.02 2.36 2.93 3.54 4.36
p = 0.6
60 1.09 1.65 2.05 2.39 2.96 3.56 4.37
80 1.11 1.66 2.07 2.40 2.98 3.58 4.39
100 1.12 1.66 2.08 2.41 2.99 3.59 4.40
10 0.65 1.04 1.32 1.55 1.93 2.31 2.83
20 0.79 1.22 1.55 1.81 2.27 2.76 3.46
40 0.87 1.32 1.67 1.94 2.43 2.85 3.56
p = 0.8
60 0.90 1.35 1.69 1.97 2.45 2.92 3.63
80 0.91 1.37 1.71 1.99 2.46 2.95 3.66
100 0.92 1.37 1.72 2.00 2.47 2.97 3.68
10 0.62 1.00 1.27 1.49 1.85 2.23 2.68
20 0.79 1.21 1.53 1.79 2.25 2.74 3.45
40 0.86 1.32 1.66 1.94 2.42 2.96 3.67
P = 0.9
60 0.89 1.35 1.69 1.97 2.46 2.99 3.70
80 0.91 1.36 1.70 1.99 2.49 3.01 3.71
100 0.92 1.37 1.71 2.00 2.50 3.02 3.72
10 0.63 1.03 1.31 1.53 1.89 2.24 2.67
20 0.81 1.26 1.59 1.89 2.40 2.89 3.73
40 0.93 1.41 1.80 2.11 2.71 3.26 4.06
P = 0.95 60 0.97 1.47 1.85 2.16 2.73 3.30 4.14
80 0.99 1.50 1.88 2.18 2.74 3.32 4.18
100 1.00 1.52 1.89 2.20 2.74 3.34 4.20
10 0.64 1.05 1.34 1.56 1.92 2.25 2.67
20 0.92 1.46 1.87 2.20 2.77 3.35 4.10
40 1.26 2.00 2.58 3.05 3.94 4.92 6.42
P = 1.0
60 1.47 2.32 2.95 3.52 4.67 5.94 8.01
80 1.64 2.58 3.30 3.96 5.25 6.81 9.33
100 1.78 2.78 3.57 4.30 5.70 7.49 10.35
TESTS BASED ON REGRESSION AND CORRELATION 217

TABLE 5.7 Upper Tail Percentage Points for Z = n{1 - R²(X,H)} for a
Test for Exponentiality, Parameters Unknown, for Complete or Type 2
Right-Censored Data; p = r/n is the censoring ratio

Significance level α

n 0.50 0.25 0.15 0.10 0.05 0.025 0.01

20 2.13 3.38 4.33 5.12 6.40 7.59 8.94
40 2.68 4.14 5.25 6.15 7.61 9.06 11.10
P = 0.2
60 2.89 4.42 5.57 6.47 8.04 9.63 11.87
80 3.00 4.57 5.73 6.64 8.27 9.92 12.27
100 3.06 4.66 5.83 6.74 8.40 10.11 12.52
10 0.86 1.41 1.82 2.16 2.69 3.08 3.48
20 1.25 1.98 2.51 2.96 3.73 4.49 5.39
40 1.48 2.25 2.84 3.33 4.13 4.96 6.14
P = 0.4
60 1.54 2.33 2.93 3.42 4.23 5.08 6.27
80 1.57 2.38 2.98 3.46 4.28 5.14 6.33
100 1.59 2.40 3.00 3.49 4.31 5.18 6.37
10 0.75 1.22 1.54 1.80 2.25 2.65 3.18
20 0.95 1.47 1.85 2.17 2.70 3.27 4.01
40 1.06 1.61 2.02 2.36 2.93 3.53 4.36
P = 0.6
60 1.09 1.64 2.05 2.39 2.96 3.56 4.37
80 1.11 1.66 2.07 2.40 2.98 3.58 4.39
100 1.11 1.67 2.08 2.41 2.99 3.58 4.40
10 0.65 1.03 1.31 1.53 1.93 2.31 2.81
20 0.79 1.22 1.55 1.81 2.28 2.78 3.48
40 0.86 1.32 1.67 1.94 2.44 2.96 3.56
P = 0.8
60 0.90 1.35 1.69 1.97 2.46 2.96 3.64
80 0.92 1.37 1.71 1.99 2.47 2.96 3.69
100 0.93 1.37 1.71 2.00 2.48 2.97 3.71
10 0.62 0.99 1.25 1.47 1.86 2.25 2.75
20 0.78 1.20 1.53 1.81 2.31 2.80 3.57
40 0.85 1.32 1.66 1.94 2.43 2.99 3.75
P = 0.9
60 0.89 1.35 1.69 1.98 2.48 3.03 3.77
80 0.91 1.36 1.71 1.99 2.50 3.05 3.77
100 0.92 1.37 1.72 2.00 2.51 3.06 3.78
10 0.62 1.02 1.29 1.51 1.90 2.28 2.74
20 0.80 1.25 1.61 1.88 2.43 3.05 3.96
40 0.92 1.41 1.82 2.15 2.78 3.41 4.29
P = 0.95
60 0.96 1.48 1.87 2.19 2.79 3.40 4.30
80 0.98 1.51 1.89 2.20 2.79 3.40 4.31
100 1.00 1.53 1.90 2.22 2.79 3.40 4.31
10 0.63 1.01 1.30 1.54 1.94 2.31 2.74
20 0.88 1.44 1.89 2.27 2.99 3.71 4.76
40 1.19 1.99 2.69 3.33 4.57 5.90 7.86
P = 1.0
60 1.39 2.33 3.20 3.96 5.52 7.33 9.85
80 1.55 2.63 3.62 4.49 6.35 8.52 11.52
100 1.67 2.86 3.94 4.90 7.00 9.44 12.83

5.11.3 Tests Based on the Residuals

As was described before, tests may also be based on the ESS of Section
5.4.1, divided by a quadratic form in the observations. When simple least
squares is used, and when the divisor is S², regression on m yields
n(ESS/S²) = Z(X,m) as test statistic, and regression on H yields Z(X,H).
The divisor S² is an estimate of nβ², but for the exponential distribution
it might be better to use the estimate nX̄², since X̄ is a sufficient
estimator of β. The corresponding test statistic ESS/(nX̄²) does not
appear to have been investigated.
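As a numerical check on the identity n(ESS/S²) = Z(X,m), the sketch below (ours, not the book's code) computes both sides for a simulated exponential sample. It assumes S² denotes the corrected sum of squares Σ(X_i - X̄)², and uses m_i = Σ_{k=1}^{i} 1/(n - k + 1), the expected values of standard exponential order statistics.

```python
import random

def exp_order_means(n):
    # m_i = sum_{k=1}^{i} 1/(n - k + 1): expected values of the order
    # statistics of a standard exponential sample of size n
    m, s = [], 0.0
    for k in range(1, n + 1):
        s += 1.0 / (n - k + 1)
        m.append(s)
    return m

def ess_and_z(x):
    """Return (n * ESS / S^2, Z(X, m)) for simple least squares of the
    ordered observations on m; S^2 is the corrected sum of squares."""
    n = len(x)
    xs = sorted(x)
    m = exp_order_means(n)
    xbar, mbar = sum(xs) / n, sum(m) / n
    sxm = sum((a - xbar) * (b - mbar) for a, b in zip(xs, m))
    smm = sum((b - mbar) ** 2 for b in m)
    sxx = sum((a - xbar) ** 2 for a in xs)     # S^2
    ess = sxx - sxm * sxm / smm                # residual sum of squares
    z = n * (1.0 - sxm * sxm / (sxx * smm))    # n{1 - R^2(X, m)}
    return n * ess / sxx, z

random.seed(1)
sample = [random.expovariate(1.0) for _ in range(25)]
ratio, z = ess_and_z(sample)   # the two quantities agree
```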

5.11.4 Tests Based on Residuals, When the Origin Is Known

Situations often arise in testing for exponentiality where α is known; usually
α = 0. If α is α0, the substitution X′ = X - α0 will reduce the test to a test
of H0, with α = 0, on the X′ values. Thus we can suppose the null hypothesis
is

H0: a set of values of X is from Exp(0, β)

In the test statistics R²(X,m) or R²(X,H) the fact that α is 0 is not used;
however, the line E(X(i)) = βm_i can be fitted to the pairs {m_i, X(i)} and a
test statistic can be constructed using the ESS divided by a suitable estimate
of β², similar to those in Section 5.11.3 above. If α and β were both known,
a natural statistic on these lines would be M² = Σ_i (X′(i) - m_i)², where
X′(i) = (X(i) - α)/β; this is analogous to the corresponding statistic of
Section 5.10.2. Lockhart and Stephens (1986) have studied statistics based
on residuals.

5.11.5 Tests Based on the Ratio of Two Estimates of Scale

5.11.5.1 The Shapiro-Wilk Test, for Origin and Scale Unknown

For the exponential distribution, the estimates in (5.9) become

α̂ = {nX(1) - X̄}/(n - 1)  and  β̂ = n(X̄ - X(1))/(n - 1)   (5.17)

and the comparison of β̂² with the sample variance leads to the Shapiro-Wilk
(1972) statistic

WE = n(X̄ - X(1))² / {(n - 1)S²}   (5.18)

Thus the test for exponentiality with origin and scale unknown is as follows:

(a) Calculate WE from (5.18).

(b) Refer WE to Table 5.8, using, in general, a two-tail test.

Shapiro and Wilk pointed out that, in general, WE will give a two-tail test,
since for alternative distributions WE may take either low or high values.
Shapiro and Wilk (1972) gave a table of percentage points for WE based
on Monte Carlo studies; Table 5.8 is adapted from their table. Currie (1980)
has since calculated the points by numerical methods. Points for WE can also
be found from those of Greenwood's statistic based on uniform spacings (see
Section 10.9.3.2). The test is discussed in Section 5.12.
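A direct computation of WE from (5.18) takes only a few lines. The sketch below is ours; it takes S² to be the corrected sum of squares Σ(X_i - X̄)² (the divisor used in Section 5.11.3), under which the upper-tail points of Table 5.8 approach 1 as they should.

```python
def shapiro_wilk_we(x):
    """W_E of equation (5.18): n(xbar - x_(1))^2 / {(n - 1) S^2},
    with S^2 the corrected sum of squares about the mean."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((v - xbar) ** 2 for v in x)
    return n * (xbar - min(x)) ** 2 / ((n - 1) * s2)
```

The result is referred to Table 5.8, in general with a two-tail test.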

5.11.5.2 Adaptation of the Shapiro-Wilk Statistic for Known Origin

It is often required to test for F(x) in (5.16) with α known; for the present,
suppose α = 0. The estimate of β in the new model E(X(i)) = βm_i now becomes
β̂ = X̄, the same as the maximum likelihood estimate, and the corresponding
Shapiro-Wilk statistic would be based on X̄²/S²; Hahn and Shapiro (1967)
have proposed WE0 = S²/(nX̄)², and have given percentage points derived
from Monte Carlo methods. Stephens (1978) later gave a test statistic, here
called W̃E, which, for sample size n, has the same distribution as WE for
sample size n + 1. Statistics WE0 and W̃E are in fact equivalent, and both
are equivalent to Greenwood's statistic based on spacings (see Section
10.9.3.2). The test based on W̃E will be given here since no new tables are
necessary for its use. Suppose, returning to the general situation, that the
known origin has value α = α0. The steps in the test are as follows:

(a) Let z_i = X_i - α0, for i = 1, 2, ..., n.

(b) Calculate A = Σ_{i=1}^{n} z_i and B = Σ_{i=1}^{n} z_i².

(c) Calculate W̃E = A²/[n{(n + 1)B - A²}].

(d) Refer W̃E to Table 5.8, using the percentage points given for sample
size n + 1, and using a two-tail test.

W̃E can also be calculated by adding one extra value X_{n+1} equal to α0 to the
given sample of X-values of size n, and then using all n + 1 values to calculate
WE from (5.18). This is a useful device if a computer program is already
available for WE. Stephens (1978) showed that, for most alternatives, use of
W̃E gives greater power than using WE as though α were not known.
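The steps (a)-(c) and the add-one-observation device are easy to verify numerically; the sketch below (ours, with assumed function names) computes W̃E both ways, again taking S² in (5.18) to be the corrected sum of squares.

```python
def we_from_518(x):
    # W_E of equation (5.18); S^2 is the corrected sum of squares
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((v - xbar) ** 2 for v in x)
    return n * (xbar - min(x)) ** 2 / ((n - 1) * s2)

def we_known_origin(x, alpha0=0.0):
    # Steps (a)-(c): z_i = x_i - alpha0, A = sum z_i, B = sum z_i^2,
    # then tilde-W_E = A^2 / [n {(n + 1) B - A^2}]
    z = [v - alpha0 for v in x]
    n = len(z)
    a = sum(z)
    b = sum(v * v for v in z)
    return a * a / (n * ((n + 1) * b - a * a))

x = [0.3, 1.1, 0.7, 2.4, 0.5]
direct = we_known_origin(x, 0.0)
via_device = we_from_518(x + [0.0])   # add one extra observation at alpha0
```

A little algebra confirms the device: with the extra observation at α0, the n + 1 values have mean A/(n + 1), smallest value α0, and corrected sum of squares B - A²/(n + 1), and (5.18) then reduces to A²/[n{(n + 1)B - A²}].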
Shapiro and Wilk (1972) also discussed the situation when it is desired to
test the exponential distributional form, and at the same time to test that

TABLE 5.8 Percentage Points for WE and W̃E for Testing Exponentiality

Significance level α

Lower tail / Upper tail

n 0.005 0.01 0.025 0.05 0.10 0.50 0.10 0.05 0.025 0.01 0.005
3 .252 .254 .260 .270 .292 .571 .971 .993 .998 .9997 .9999
4 .124 .130 .143 .160 .189 .377 .751 .858 .924 .968 .984
5 .085 .091 .105 .119 .144 .288 .555 .668 .759 .860 .919
6 .061 .067 .080 .096 .117 .228 .429 .509 .584 .678 .750
7 .051 .059 .070 .081 .097 .187 .347 .416 .485 .571 .643
8 .045 .051 .061 .071 .085 .163 .293 .350 .403 .485 .543
9 .040 .044 .054 .063 .075 .142 .255 .301 .345 .402 .443
10 .037 .040 .049 .057 .068 .123 .218 .253 .288 .339 .370
12 .031 .036 .041 .049 .057 .101 .172 .202 .236 .272 .298
14 .027 .031 .036 .043 .050 .085 .142 .165 .186 .213 .232
16 .023 .028 .033 .037 .044 .073 .119 .136 .154 .177 .193
18 .021 .025 .029 .033 .039 .064 .102 .116 .131 .148 .167
20 .020 .023 .026 .030 .035 .057 .088 .100 .112 .129 .137
25 .017 .019 .022 .025 .029 .045 .067 .075 .084 .093 .100
30 .015 .016 .019 .021 .024 .036 .054 .059 .064 .072 .079
35 .013 .014 .017 .019 .021 .031 .044 .049 .054 .059 .064
40 .012 .013 .015 .016 .019 .027 .038 .041 .045 .050 .051
45 .011 .012 .013 .015 .017 .024 .033 .036 .039 .042 .045
50 .010 .011 .012 .014 .015 .021 .029 .032 .034 .036 .039
55 .009 .010 .012 .013 .014 .019 .026 .028 .030 .032 .034
60 .009 .010 .011 .012 .013 .018 .023 .025 .027 .029 .031
65 .008 .009 .010 .011 .012 .016 .022 .023 .025 .027 .028
70 .008 .008 .009 .010 .011 .015 .019 .021 .022 .024 .026
75 .007 .008 .009 .010 .011 .014 .018 .019 .021 .022 .023
80 .007 .008 .008 .009 .010 .013 .017 .018 .019 .021 .021
85 .007 .007 .008 .009 .009 .012 .016 .017 .017 .019 .019
90 .006 .007 .008 .008 .009 .012 .015 .016 .016 .018 .018
95 .006 .007 .007 .008 .008 .011 .014 .015 .015 .016 .017
100 .006 .006 .007 .007 .008 .010 .013 .014 .015 .015 .016

Adapted from Shapiro and Wilk (1972), with permission of the authors and of the American Statistical Association.

α = α0, where α0 is a given constant. The test statistic for α, say U, is
based on the smallest observation. When the observations are exponential,
U is distributed independently of WE, and Shapiro and Wilk combine the two
statistics by Fisher's method (Section 8.15).

5.11.5.3 The Jackson Statistic

A statistic suggested by Jackson (1967) is effectively a comparison between
the slope of the regression line through the origin, when X(i) is plotted
against m_i and the covariance of the X-values is ignored, and X̄, the
maximum likelihood estimator of β. The statistic is

J = Σ_i m_i X(i) / (nX̄)

In general, J is used as a two-tail test. Jackson has given moments of J,
and percentage points for n = 5, 10, 15, 20. These are based on curve-fitting,
using the moments.
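A sketch of the computation (ours; it takes m_i = Σ_{k=1}^{i} 1/(n - k + 1), the expected standard-exponential order statistics, and the function name is an assumption):

```python
def jackson_j(x):
    """Jackson's statistic J = sum_i m_i X_(i) / (n Xbar), where the
    m_i are expected standard-exponential order statistics."""
    n = len(x)
    xs = sorted(x)
    m, s = [], 0.0
    for k in range(1, n + 1):
        s += 1.0 / (n - k + 1)   # m_i built up cumulatively
        m.append(s)
    return sum(mi * xi for mi, xi in zip(m, xs)) / (n * (sum(xs) / n))
```

Note that J is invariant under a change of scale, as a test statistic for β should be, and since Σ_i m_i = n, a sample of equal values gives J = 1 exactly.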

5.11.5.4 The de Wet-Venter Statistic

De Wet and Venter (1973) have devised a statistic dependent on the ratio of
two asymptotically efficient estimators of β. It is straightforward to apply
their method to the exponential test with α = 0, for a complete sample; the
test statistic VE involves the quantities X(i)/H_i, where
H_i = -log{1 - i/(n + 1)}. De Wet and Venter have given asymptotic null
distribution theory for VE.

5.11.6 Comparison of Regression Tests for Exponentiality

Spinelli and Stephens (1987) have reported results on power studies to
compare R²(X,m) and R²(X,H) in tests for exponentiality, with EDF
statistics, and with WE; for these studies both α and β were assumed
unknown. On the whole, the correlation statistics R²(X,m) and R²(X,H)
were less effective than EDF statistics or than WE, particularly for large
sample sizes. For tests for this distribution (in contrast to tests for
normality) we might also expect some difference between R²(X,m) and R²(X,H);
this emerges clearly from the studies, with R²(X,H) less powerful overall
than R²(X,m). Statistic WE has good power over a wide range of alternatives,
although it will have lower power against alternatives with coefficient of
variation equal to 1 (see Section 5.12).

5.12 TESTS BASED ON THE RATIO OF TWO ESTIMATES OF SCALE: FURTHER COMMENTS

In Section 5.3 it was noted that Sarkadi (1975), for normality, and Gerlach
(1979) more generally, have proved the consistency of tests based on the
correlation coefficient R(X,m). The value of R(X,m) will, loosely speaking,
approach 1 as the fit gets better. The test based on R(X,m) is then a one-tail
test; the null hypothesis is rejected only for small values of R, or for large
values of Z = n(1 - R²(X,m)).
This consistency does not necessarily extend to correlation statistics
R(X,T) when T is not m. For example, tests based on the ratio of the
generalized least squares estimate of β with an estimate obtained from the
sample variance can be put in terms of correlation statistics, but will not
generally be consistent. For the Shapiro-Wilk test for normality in Section
5.10.3, σ̂/S is equivalent to the correlation coefficient R(X,T) where T_i is
the i-th component of T = V⁻¹m. Then if T is proportional to m, or very
nearly so, the graph of X(i) against T_i will be approximately a straight
line; R(X,T) will be a good measure of fit, and low values of R(X,T) will
lead to rejection of the normal model. For the normal distribution, T is
nearly proportional to m, since Vm ≈ m/2, implying V⁻¹m ≈ 2m (Stephens,
1975), with the approximation becoming better for large n; then the
Shapiro-Wilk W approaches the Shapiro-Francia W′, and this is the same as
the correlation statistic R²(X,m), which is consistent.
However, the normal case is exceptional. Even for other symmetric
distributions V⁻¹m will not as a rule be proportional to m, even
asymptotically, and the situation is more complicated for non-symmetric
distributions, such as the exponential. For such distributions the test based
on the ratio of β̂² to S² is equivalent to the correlation R²(X,T) with the
vector T′ = 1′V⁻¹(1m′ - m1′)V⁻¹, and for most distributions this vector will
not even be close to m. For example, for the exponential distribution, with
a sample of size n, T is proportional to a vector with one component equal
to -(n - 1), and the other n - 1 components all equal to 1. A plot of X(i)
against T_i, even for a "perfect" exponential sample with X(i) = m_i, would
not be close to a straight line, and the value of R(X,T) would not be close
to 1. The statistic R(X,T) then gives no indication of the fit in the sense
that a large value indicates a good fit and a small value a bad fit. In
practical terms, this means that for WE, equivalent to R²(X,T), both tails
are needed to make the test. Also, the test statistic will not be consistent.
For the exponential distribution, the coefficient of variation CV = σ/μ
is 1, and for large samples nWE converges in probability to 1/CV² = 1;
however, there are other distributions for which nWE would also converge
to 1 (for example, the Beta distribution Beta(x; p, q) defined in Section
8.8.2, with p < 1 and q = p(p + 1)/(1 - p)), and WE will not detect these
alternatives. Nevertheless, when the alternatives to exponentiality are
identified and exclude such distributions, WE can be powerful, as was
reported in Section 5.11.6; see also Section 10.14.
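The Beta family just mentioned can be checked directly: for Beta(p, q) the mean is p/(p + q) and the variance is pq/{(p + q)²(p + q + 1)}, so CV² = q/{p(p + q + 1)}, and the choice q = p(p + 1)/(1 - p) makes this exactly 1. A small sketch of the check (ours):

```python
def beta_cv(p, q):
    # Coefficient of variation of Beta(p, q): mean p/(p+q),
    # variance pq / {(p+q)^2 (p+q+1)}
    mean = p / (p + q)
    var = p * q / ((p + q) ** 2 * (p + q + 1))
    return var ** 0.5 / mean
```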
For the uniform distribution, the Shapiro-Wilk method gives a statistic WU
based on (n - 1){X(n) - X(1)}²; this also can be shown to be inconsistent.
Similar objections apply to the corresponding test for the extreme-value
distribution, which was examined by Spinelli (1980).
In general, consistency of a test based on the ratio of two estimates of
scale must depend critically on how these are chosen, and this question
deserves closer examination. It is intuitively reasonable that efficient
estimates, or estimates which are at least asymptotically efficient, will give
better tests. The ESS, after fitting the line (5.3), may not be asymptotically
efficient for the chosen T vector, nor will be the sample variance in the
denominator, except in the test for normality; this will affect the consistency
of tests which use these estimates. De Wet and Venter (1973) have devised
a general procedure for tests of distributions when only the scale is unknown,
in which the ratio of two asymptotically efficient estimates is used. The
authors gave asymptotic theory, and illustrated with tests for the Gamma
distribution. The specific case for the exponential distribution is given in
Section 5.11.5 above. However, computing formulas for the test statistic
for other Gamma distributions, and tables for finite n, are not available.
Practical aspects of the method also remain to be explored, such as how the
power is influenced by which estimates are used in the test ratio.

5.13 REGRESSION TESTS FOR OTHER DISTRIBUTIONS: GENERAL COMMENTS

The various techniques above for tests of normality and exponentiality have
not been so extensively developed for other distributions, and the tests to
follow, for the extreme-value, logistic, and Cauchy distributions, are all
based on the simple correlation coefficients R²(X,H). H is used for
computational simplicity rather than m, although for some distributions (for
example, the extreme-value; see Lawless, 1982) good approximations to m
have been found and so tests could be developed; there may be some difference
in sensitivity between tests based on R²(X,H) and those based on R²(X,m),
as was found for the exponential distribution (Section 5.11.6). On this
question more work needs to be done; also more comparisons are needed
between correlation tests and others. The tables given in connection with
the tests below were found from Monte Carlo studies with 10,000 samples for
each n. The tests are given for Type 2 censored data; objections to tests
for Type 1 censored data are similar to those given in Section 5.7.3 in
connection with tests for normality.

5.14 CORRELATION TESTS FOR THE EXTREME-VALUE DISTRIBUTION

5.14.1 Version 1

The null hypothesis is

H0: a random sample of X-values comes from

F0(x; α, β) = exp[-exp{-(x - α)/β}], -∞ < x < ∞, β > 0   (5.21)

The test given is for complete or Type 2 right-censored samples; for
left-censored data see Section 5.14.2. For this distribution
H_i = -log[-log{i/(n + 1)}]. This version of the extreme-value distribution
has a long tail to the right and is used to model data in, for example,
reliability studies. The test statistic Z = n{1 - R²(X,H)} can be calculated
from the r smallest observations; H0 is rejected if Z exceeds the appropriate
value in Table 5.9. The table is entered, for Type 2 censored data, at
p = r/n and at n.

5.14.2 Version 2

Another version of the extreme-value distribution is

F0(x′; α, β) = 1 - exp[-exp{(x′ - α)/β}], -∞ < x′ < ∞, β > 0   (5.22)

This is the distribution of X′ = -X, where X has distribution (5.21). For this
distribution H_i = log[-log{1 - i/(n + 1)}]. The test statistic
Z = n{1 - R²(X,H)}, calculated from the r smallest observations (Type 2
censoring), is referred to Table 5.10. For left-censored data from (5.21),
the signs can be changed and tested as right-censored data from (5.22), and
vice versa. These tables can also be used to give a test for the two-parameter
Weibull distribution (density W(x; 0, β, m) of Section 4.11, also given in
Equation (10.4)). This is a distribution with a long right tail, also used in
reliability and survival studies. Here, left-censored data values are
transformed by Y(i) = -log(X(i)) and the Y-values tested to come from (5.21),
since they will be right-censored; similarly, right-censored Weibull test
data are transformed by Y(i) = log(X(i)) and tested to come from (5.22).
Gerlach (1979) has considered the test for (5.22) based on R²(X,m); the test
is shown to be consistent and tables are given to make the test; tables for
Z(X,m) are given by Stephens (1986). An example is given in Section 11.4.1.3.
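The Weibull transformation just described can be sketched in a few lines (our illustration, not the book's code; the function name is an assumption). For a complete Weibull sample, Y = log X is tested to come from (5.22) with H_i = log[-log{1 - i/(n + 1)}].

```python
import math, random

def z_ev2(y, r=None):
    """Z = n{1 - R^2(Y, H)} for version (5.22) of the extreme-value
    distribution: H_i = log[-log{1 - i/(n + 1)}]."""
    n = len(y)
    r = n if r is None else r
    ys = sorted(y)[:r]
    h = [math.log(-math.log(1.0 - i / (n + 1.0))) for i in range(1, r + 1)]
    ybar, hbar = sum(ys) / r, sum(h) / r
    syh = sum((a - ybar) * (b - hbar) for a, b in zip(ys, h))
    syy = sum((a - ybar) ** 2 for a in ys)
    shh = sum((b - hbar) ** 2 for b in h)
    return n * (1.0 - syh * syh / (syy * shh))

# Two-parameter Weibull data: Y = log X should follow (5.22)
random.seed(2)
weibull_sample = [random.weibullvariate(2.0, 1.5) for _ in range(40)]
z = z_ev2([math.log(v) for v in weibull_sample])   # refer to Table 5.10
```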

5.15 CORRELATION TESTS FOR OTHER DISTRIBUTIONS

5.15.1 The Logistic Distribution

For the logistic distribution, F(x) = 1/[1 + exp{-(x - α)/β}], (-∞ < x < ∞,
β > 0), and H_i = log{i/(n + 1 - i)}, i = 1, ..., n. The statistic

TABLE 5.9 Upper Tail Percentage Points for Z = n{1 - R²(X,H)} for a
Test for the Extreme-Value Distribution, Equation (5.21), Parameters
Unknown, for Complete or Type 2 Right-Censored Data; p = r/n is the
censoring ratio

Significance level α

n .50 .25 .15 .10 .05 .025 .01

20 1.62 2.81 3.52 4.07 4.90 5.69 7.19
40 2.58 4.12 5.18 6.09 7.48 8.92 10.47
P = 0.2
60 3.01 4.68 5.89 6.90 8.75 10.45 13.02
80 3.33 5.11 6.45 7.59 9.71 11.79 14.50
100 3.56 5.42 6.88 8.12 10.42 12.84 15.47
10 0 . 8X 1.38 1.75 2.04 2.52 2.93 3.39
20 1.27 1.98 2.51 2.94 3.76 4.56 5.46
40 1.62 2.47 3.09 3.66 4.66 5.59 6.85
P = 0.4
60 1.77 2.71 3.39 3.93 4.92 5.88 7.35
80 1.88 2.86 3.58 4.16 5.19 6.25 7.72
100 1.95 2.95 3.72 4.33 5.41 6.55 7.99
10 0.74 1.17 1.49 1.75 2.16 2.61 3.18
20 0.98 1.49 1.88 2.20 2.72 3.28 4.03
40 1.15 1.73 2.15 2.49 3.08 3.77 4.66
P = 0.6
60 1.23 1.82 2.25 2.61 3.26 3.92 4.77
80 1.28 1.89 2.34 2.71 3.35 4.04 4.91
100 1.32 1.94 2.41 2.78 3.41 4.12 5.03
10 0.64 0.99 1.25 1.46 1.79 2.14 2.58
20 0.79 1.19 1.48 1.71 2.14 2.58 3.29
40 0.90 1.32 1.63 1.85 2.27 2.70 3.32
P = 0.8
60 0.94 1.37 1.68 1.94 2.38 2.79 3.37
80 0.97 1.40 1.72 1.98 2.41 2.82 3.37
100 0.99 1.42 1.74 1.99 2.41 2.84 3.35
10 0.61 0.93 1.24 1.37 1.71 2.08 2.51
20 0.74 1.13 1.42 1.64 2.03 2.44 3.05
40 0.84 1.23 1.53 1.77 2.17 2.59 3.14
P = 0.9
60 0.88 1.28 1.57 1.80 2.19 2.59 3.17
80 0.91 1.31 1.59 1.81 2.20 2.60 3.18
100 0.92 1.32 1.60 1.82 2.20 2.60 3.18
10 0.61 0.94 1.23 1.41 1.76 2.13 2.60
20 0.75 1.14 1.44 1.68 2.11 2.57 3.20
40 0.85 1.28 1.60 1.84 2.28 2.73 3.33
P = 0.95
60 0.90 1.33 1.63 1.88 2.30 2.74 3.39
80 0.93 1.35 1.65 1.89 2.31 2.75 3.43
100 0.95 1.36 1.66 1.90 2.32 2.75 3.45
10 0.61 0.95 1.23 1.45 1.81 2.18 2.69
20 0.82 1.30 1.69 2.03 2.65 3.36 4.15
40 1.04 1.67 2.23 2.66 3.63 4.78 6.42
P = 1.0
60 1.20 1.93 2.57 3.18 4.33 5.69 7.79
80 1.32 2.14 2.87 3.55 4.92 6.54 8.86
100 1.41 2.30 3.09 3.82 5.38 7.22 9.67

TABLE 5.10 Upper Tail Percentage Points for Z = n{1 - R²(X,H)} for a
Test for the Extreme-Value Distribution, Equation (5.22), Parameters
Unknown, for Complete or Type 2 Right-Censored Data; p = r/n is the
censoring ratio

Significance level α

n .50 .25 .15 .10 .05 .025 .01

20 1.52 2.79 3.54 4.02 4.69 6.24 7.72
40 2.76 4.43 5.71 6.66 8.22 9.82 11.33
P = 0.2
60 3.45 5.58 7.21 8.41 10.64 12.83 15.79
80 3.98 6.48 8.46 10.08 13.03 16.06 20.00
100 4.37 7.16 9.45 11.46 15.06 18.91 23.56
10 0.76 1.42 1.77 2.01 2.40 3.15 3.96
20 1.38 2.19 2.80 3.28 4.12 4.89 5.69
40 1.92 3.11 4.05 4.86 6.44 7.94 9.72
P = 0.4
60 2.32 3.79 5.04 6.06 7.99 10.09 13.05
80 2.63 4.32 5.80 7.10 9.65 12.39 15.93
100 2.86 4.72 6.37 7.92 11.09 14.39 18.25
10 0.77 1.23 1.56 1.83 2.25 2.63 3.10
20 1.11 1.80 2.31 2.73 3.48 4.27 5.28
40 1.49 2.44 3.25 3.93 5.33 6.67 8.57
P = 0.6
60 1.77 2.94 3.94 4.79 6.46 8.51 11.21
80 1.99 3.30 4.48 5.53 7.68 10.13 13.23
100 2.15 3.57 4.90 6.12 8.74 11.44 14.74
10 0.68 1.08 1.38 1.61 2.02 2.43 2.90
20 0.93 1.50 1.95 2.35 3.04 3.81 4.88
40 1.22 2.00 2.69 3.27 4.46 5.69 7.51
P = 0.8
60 1.42 2.35 3.19 3.96 5.35 7.15 9.43
80 1.59 2.54 3.59 4.49 6.21 8.31 11.04
100 1.71 2.85 3.89 4.88 6.92 9.20 12.30
10 0.64 1.02 1.30 1.54 1.93 2.27 2.80
20 0.86 1.38 1.81 2.17 2.87 3.59 4.51
40 1.11 1.82 2.44 2.99 4.10 5.27 7.04
P = 0.9
60 1.29 2.13 2.86 3.57 4.88 6.45 8.63
80 1.43 2.37 3.21 4.01 5.59 7.46 9.97
100 1.53 2.56 3.48 4.35 6.17 8.24 11.01
10 0.62 0.98 1.26 1.49 1.86 2.22 2.74
20 0.84 1.33 1.75 2.10 2.76 3.51 4.36
40 1.07 1.73 2.32 2.82 3.93 5.05 6.78
P = 0.95
60 1.23 2.01 2.73 3.38 4.61 6.11 8.22
80 1.36 2.25 3.04 3.79 5.27 7.04 9.43
100 1.46 2.43 3.28 4.08 5.82 7.77 10.38
10 0.61 0.95 1.23 1.45 1.81 2.18 2.69
20 0.82 1.30 1.69 2.03 2.65 3.36 4.15
40 1.04 1.67 2.23 2.66 3.63 4.78 6.42
P = 1.0
60 1.26 1.93 2.57 3.18 4.33 5.69 7.79
80 1.35 2.14 2.87 3.55 4.92 6.54 8.86
100 1.40 2.30 3.09 3.82 5.38 7.22 9.67

TABLE 5.11 Upper Tail Percentage Points for Z = n{1 - R²(X,H)} for a
Test for the Logistic Distribution, Parameters Unknown, for Complete or
Type 2 Right-Censored Data; p = r/n is the censoring ratio

Significance level α

n .50 .25 .15 .10 .05 .025 .01

20 1.56 2.85 3.57 4.03 4.78 6.09 7.35


40 2.75 4.47 5.70 6.65 8.29 9.73 11.13
P = 0.2 60 3.37 5.40 6.92 8.21 10.37 14.48 15.32
80 3.88 6.25 8.08 9.71 12.67 15.40 19.35
100 4.29 6.94 9.03 10.95 14.70 17.96 22.78
10 0.79 1.43 1.78 2.00 2.37 3.01 3.76
20 1.37 2.18 2.78 3.26 4.10 4.90 5.71
40 1.94 3.08 4.01 4.79 6.25 7.93 9.87
P = 0.4
60 2.23 3.62 4.75 5.73 7.66 9.68 12.99
80 2.50 4.08 5.43 6.66 9.01 11.70 15.61
100 2.72 4.44 5.97 7.43 10.12 13.48 17.68
10 0.77 1.23 1.57 1.84 2.26 2.68 3.16
20 1.10 1.76 2.24 2.63 3.41 4.19 5.02
40 1.46 2.34 3.07 3.72 5.07 6.37 8.38
P = 0.6
60 1.66 2.69 3.56 4.40 5.99 7.72 10.43
80 1.84 2.99 4.00 4.95 6.86 9.07 12.14
100 1.98 3.24 4.35 5.38 7.57 10.20 13.48
10 0.68 1.07 1.36 1.58 1.99 2.34 2.81
20 0.91 1.43 1.85 2.20 2.86 3.54 4.43
40 1.16 1.84 2.42 2.94 3.99 5.15 6.99
P = 0.8
60 1.30 2.09 2.76 3.39 4.62 6.01 8.15
80 1.42 2.29 3.05 3.76 5.21 6.90 9.27
100 1.51 2.44 3.27 4.05 5.68 7.64 10.18
10 0.66 1.04 1.33 1.54 1.96 2.32 2.82
20 0.85 1.34 1.71 2.05 2.63 3.29 4.20
40 1.07 1.68 2.18 2.64 3.50 4.59 6.14
P = 0.9
60 1.18 1.89 2.48 3.02 4.02 5.31 7.17
80 1.28 2.04 2.70 3.30 4.47 5.97 8.07
100 1.35 2.15 2.86 3.51 4.82 6.50 8.79
10 0.65 1.03 1.31 1.52 1.94 2.31 2.82
20 0.85 1.33 1.71 2.03 2.57 3.17 4.00
40 1.05 1.64 2.12 2.51 3.30 4.27 5.71
P = 0.95
60 1.17 1.83 2.38 2.84 3.76 4.89 6.49
80 1.25 1.96 2.56 3.10 4.12 5.42 7.28
100 1.31 2.06 2.89 3.28 4.39 5.83 7.92
10 0.65 1.02 1.29 1.51 1.93 2.31 2.84
20 0.90 1.42 1.84 2.19 2.78 3.42 4.20
40 1.20 1.90 2.46 2.94 3.76 4.64 5.94
P = 1.0
60 1.38 2.20 2.88 3.40 4.38 5.37 6.99
80 1.52 2.42 3.15 3.74 4.83 5.93 7.73
100 1.62 2.59 3.35 3.99 5.16 6.34 8.26

TABLE 5.12 Upper Tail Percentage Points for Z = n{1 - R²(X,H)} for a
Test for the Cauchy Distribution, Parameters Unknown, for Complete or
Type 2 Right-Censored Data; p = r/n is the censoring ratio

Significance level α

n .50 .25 .15 .10 .05 .025 .01

20 1.19 2.02 3.94 5.20 6.64 7.39 9.39
40 3.23 5.85 6.99 8.10 10.45 12.61 15.35
P = 0.2
60 4.90 8.85 10.78 12.03 14.22 17.40 20.91
80 6.76 12.00 14.91 16.70 18.99 22.12 26.71
100 8.50 14.85 18.76 21.18 23.60 26.19 31.84
10 0.67 1.17 2.01 2.57 3.21 3.84 4.83
20 1.65 2.99 3.64 3.97 5.11 6.15 7.44
40 3.48 6.12 7.64 8.64 9.71 10.66 13.09
P = 0.4
60 5.17 9.09 11.73 13.25 15.10 16.20 18.39
80 7.05 12.39 15.92 18.11 20.95 22.35 24.16
100 8.79 15.48 19.68 22.55 26.27 28.11 29.42
10 0.91 1.61 1.94 2.12 2.80 3.45 4.12
20 1.83 3.16 3.94 4.45 5.06 5.63 6.74
40 3.66 6.35 8.13 9.29 10.74 11.52 12.61
P = 0.6
60 5.40 9.51 12.30 14.10 16.51 17.78 18.51
80 7.27 12.90 16.62 19.15 22.49 24.20 25.24
100 8.99 16.00 20.51 23.73 27.88 29.97 31.56
10 1.10 1.86 2.37 2.68 3.05 3.72 4.67
20 2.07 3.57 4.58 5.27 6.12 6.59 7.27
40 3.94 6.96 8.97 10.47 12.28 13.21 13.78
P = 0.8
60 5.69 10.19 13.40 15.48 18.19 19.78 20.68
80 7.63 13.65 17.74 20.51 24.43 26.47 27.54
100 9.42 16.81 21.53 24.96 30.08 32.45 33.59
10 1.34 2.48 2.99 3.35 3.97 4.36 5.08
20 2.39 4.26 5.52 6.43 7.45 8.06 9.56
40 4.45 7.94 10.38 12.11 14.11 15.24 15.87
P = 0.9
60 6.39 11.39 14.92 17.48 20.42 21.93 22.85
80 8.36 14.92 19.57 23.03 26.87 28.94 30.17
100 10.10 18.07 23.71 27.99 32.59 35.24 36.76
10 1.50 2.78 3.45 3.86 4.49 4.86 5.32
20 2.84 5.24 6.77 7.81 8.98 9.78 12.35
40 5.27 9.66 12.43 14.32 16.61 17.78 20.82
P = 0.95
60 7.48 13.31 17.89 20.15 23.30 24.97 26.12
80 9.55 17.02 22.34 26.09 30.13 32.25 33.42
100 11.31 20.28 26.68 31.33 36.16 38.65 40.43
10 1.74 3.19 4.02 4.51 5.08 5.42 5.58
20 4.08 7.35 9.08 10.19 11.42 12.05 12.42
40 8.85 15.78 19.56 21.67 24.00 25.18 25.82
P = 1.0
60 13.79 24.29 30.00 33.13 36.33 38.06 38.92
80 18.58 32.50 39.91 44.19 46.46 50.66 51.97
100 22.72 39.66 48.46 53.80 59.12 61.70 63.50

Z = n{1 - R²(X,H)} is found from a complete or Type 2 right-censored
sample, and referred to Table 5.11. The hypothesis that the sample comes
from F(x) is rejected at the level α if Z exceeds the given percentage point.

5.15.2 The Cauchy Distribution

For the Cauchy distribution, F(x) = 0.5 + [tan⁻¹{(x - α)/β}]/π,
(-∞ < x < ∞, β > 0), and H_i = tan(π[{i/(n + 1)} - 0.5]), i = 1, ..., n.
The statistic Z = n{1 - R²(X,H)} is found from a complete or Type 2
right-censored sample and referred to Table 5.12.
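The logistic and Cauchy tests differ from those of Section 5.14 only in the definition of H_i. A sketch collecting the two definitions with a generic Z (our illustration; the function names are assumptions):

```python
import math

def h_logistic(i, n):
    # H_i = log{i / (n + 1 - i)} for the logistic distribution
    return math.log(i / (n + 1.0 - i))

def h_cauchy(i, n):
    # H_i = tan(pi [i/(n + 1) - 0.5]) for the Cauchy distribution
    return math.tan(math.pi * (i / (n + 1.0) - 0.5))

def z_corr(x, h_fun, r=None):
    # Z = n{1 - R^2(X, H)} from the r smallest observations
    n = len(x)
    r = n if r is None else r
    xs = sorted(x)[:r]
    h = [h_fun(i, n) for i in range(1, r + 1)]
    xbar, hbar = sum(xs) / r, sum(h) / r
    sxh = sum((a - xbar) * (b - hbar) for a, b in zip(xs, h))
    sxx = sum((a - xbar) ** 2 for a in xs)
    shh = sum((b - hbar) ** 2 for b in h)
    return n * (1.0 - sxh * sxh / (sxx * shh))
```

The result would be referred to Table 5.11 (logistic) or Table 5.12 (Cauchy); both H definitions are antisymmetric about the median, as the symmetry of these distributions requires.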

5.15.3 The Exponential Power Distribution

Smith and Bain (1976) have given tables of null critical values of
1 - R²(X,H) for the exponential power distribution,
F(x) = 1 - exp[1 - exp{(x - α)/β}], (α < x < ∞, β > 0). These are for
Type 2 right-censored data, with r/n = 0.5, 0.75, and 1, and for
n = 8, 20, 40, 60, and 80.

REFERENCES

Abramowitz, M. and Stegun, I. A. (eds.) (1965). Handbook of Mathematical
Functions (National Bureau of Standards). New York: Dover Publications.

Bliss, C. (1967). Statistics in Biology: Statistical Methods for Research in
the Natural Sciences. New York: McGraw-Hill.

Blom, G. (1958). Statistical Estimates and Transformed Beta Variates.
New York: Wiley.

Chen, C-H. (1984). A correlation goodness-of-fit test for randomly censored
data. Biometrika 71, 315-322.

Currie, I. D. (1980). The upper tail of the distribution of W-exponential.
Scand. J. Statist., 147-149.

D'Agostino, R. B. (1971). An omnibus test of normality for moderate and
large sample sizes. Biometrika 58, 341-348.

D'Agostino, R. B. (1972). Small sample probability points for the D test of
normality. Biometrika 59, 219-221.

D'Agostino, R. B. (1973). Monte Carlo comparison of the W′ and D test of
normality for N = 100. Commun. Statist., 545-551.

Davis, C. S. and Stephens, M. A. (1977). The covariance matrix of normal
order statistics. Commun. Statist.-Simula. Computa. B6, 135-149.

De Wet, T. (1974). Rates of convergence of linear combinations of order
statistics. S. Afr. Statist. J. 8, 35-43.

De Wet, T. and Venter, J. H. (1972). Asymptotic distributions of certain
test criteria of normality. S. Afr. Statist. J. 6, 135-149.

De Wet, T. and Venter, J. H. (1973). A goodness-of-fit test for a scale
parameter family of distributions. S. Afr. Statist. J. 7, 35-46.

Downton, F. (1966). Linear estimates with polynomial coefficients.
Biometrika 53, 129-141.

Dyer, A. R. (1974). Comparisons of tests for normality with a cautionary
note. Biometrika 61, 185-189.

Filliben, J. J. (1975). The probability plot correlation coefficient test for
normality. Technometrics 17, 111-117.

Fotopoulos, S., Leslie, J. R., and Stephens, M. A. (1984). Approximations
for expected values of normal order statistics with an application to
goodness-of-fit. Technical report, Department of Statistics, Stanford
University.

Gerlach, B. (1979). A consistent correlation-type goodness-of-fit test; with
application to the two-parameter Weibull distribution. Math. Operations-
forsch. Statist. Ser. Statist. 10, 427-452.

Gupta, A. K. (1952). Estimation of the mean and standard deviation of a
normal population from a censored sample. Biometrika 39, 266-273.

Hahn, G. J. and Shapiro, S. S. (1967). Statistical Models in Engineering.
New York: Wiley.

Hamaker, H. C. (1978). Approximating the cumulative normal distribution
and its inverse. Appl. Statist. 27, 76-77.

Harter, H. L. (1961). Expected values of normal order statistics.
Biometrika 48, 151-165.

Hastings, C. Jr. (1955). Approximations for Digital Computers. Princeton:
Princeton University Press.

Jackson, O. A. Y. (1967). An analysis of departure from the exponential
distribution. J. Roy. Statist. Soc. B 29, 540-549.

Labrecque, J. F. (1973). New goodness-of-fit procedures for the case of
normality. Ph.D. Thesis, SUNY at Buffalo, Department of Statistics.

Labrecque, J. F. (1977). Goodness-of-fit tests based on nonlinearity in
probability plots. Technometrics 19, 292-306.

Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data.
New York: Wiley.

Lockhart, R. A. and Stephens, M. A. (1986). Correlation tests of fit.
Technical Report, Department of Mathematics and Statistics, Simon Fraser
University.
232 STEPHENS

Milton, R. С. and Hotchkiss, R. (1969). C o m p u terevalu ation ofth en orm al


and inverse normal distribution function. Technometrics 11, 817-822.

Page, E . (1977). Approximations to the cumulative norm al function and its


inverse for use on a pocket calculator. App. Statist. 26, 75-76.

Sarhan, A . E . and Greenberg, B . G. (1962). Contributions to O rd er Statis­


tic s. New York: W iley.

Sarkadi, K. (1975). The consistency of the Shapiro-Francia test. Biom etrika


6 2 , 445-450.

Shapiro, S. S. and Francia, R. S. (1972). A p p ro x im a te a n a ly siso fv a ria n c e


test fo r normality. J. A m er. Statist. A s s o c . 67, 215-216.

Shapiro, S. S. and W ilk, M . B . (1965). An analysis of variance test fo r


normality (complete sam ples). Biom etrika 52, 591-611.

Shapiro, S. S. and W ilk, M . B . (1968). Approximations fo r the null d istri­


bution of the W statistic. Technometrics 10, 861-866.

Shapiro, S. S. and W ilk, M . B . (1972). An analysis of variance test fo r the


exponential distribution (complete sam ples). Technometrics 14, 355-
370.

Shapiro, S. S ., W ilk, M . B . and Chen, H. J. (1968). A comparative study


of various tests for normality. J. A m er. Statist. A sso c . 63, 1343-1372.

Smith, R. M . and Bain, L . J. (1976). Correlation-type goodness-of-fit


statistics with censored sampling. Comm. Statist. Theory Methods A
5, 119-132.

Spinelli, J. (1980). Contributions to goodness-of-fit. M .S c . T h esis, Depart­


ment of Mathematics, Simon F r a s e r University.

Spinelli, J. J. and Stephens, M. A. (1987). Tests for exponentiality when


origin and scale param eters are unknown. Technometrics 29, 471-
476.

Stephens, M. A . (1974). EDF statistics for goodness-of-fit and some com ­


parisons. J. A m er. Statist. A s s o c . 69, 730-737.

Stephens, M . A . (1975). Asymptotic properties for covariance m atrices of


ord er statistics. Biom etrika 62, 23-28.

Stephens, M. A . (1976). Extensions to the W -test for normality. Technical


Report, Department of Statistics, Stanford University.

Stephens, M . A . (1978). On the W test for exponentiality with origin known.


Technometrics 20, 33-35.

Stephens, M. A . (1986). G oodness-of-fit fo r censored data. Technical


report. Department of Statistics, Stanford University.
TESTS BASED ON REGRESSION AND CORRELATION 233

Telchroew, D. (1956). T ables of expected values of o rd er statistics and


products of o rd er statistics fo r sam ples of size 20 and less from the
norm al distribution. Ann. Math. Statist. 27, 410-426.

Tietjen, G . L . , Kahaner, D . K. and Beckman, R . J. (1978). Variances and


covariances o f the norm al o rd e r statistics fo r sample sizes 2-50.
Selected Tables in Mathematical Statistics 5 (D. B . Owen and R . E .
Odeh, e d s .), Providence: A m ericanM athem aticalSocieIy.

W e is b e rg , S. and Bingham, C . (1975). An approximate analysis of variance


test fo r non-normality suitable fo r machine calculation. Technometrics
17, 133-134.
Some Transformation Methods in Goodness-of-Fit*
C. P. Quesenberry, North Carolina State University, Raleigh, North Carolina

6.1 INTRODUCTION

6.1.1 Hypothesis Testing Problems

In this chapter let X1, X2, . . . , Xn be independently and identically distributed (i.i.d.) real-valued random variables with a common continuous distribution function (df) F. The classic simple goodness-of-fit problem is to test

H0: F = F0
(6.1)
H1: F ≠ F0

where F0 is a specific continuous distribution function. The hypothesis testing problem of (6.1) is often not a very useful model in practice. It is more meaningful in many instances to test that the distribution function F has some specified functional form without assuming that the values of all parameters are known.
Let θ = (θ1, . . . , θp) be a vector of real-valued parameters and ℱ0 = {Fθ : θ ∈ Ω} be a parametric class of probability distribution functions. Moreover, we assume that Ω is a natural parameter space, i.e., that it contains all points θ for which Fθ is a continuous probability distribution function. Then the composite goodness-of-fit problem is to test

H0: F ∈ ℱ0
(6.2)
H1: F ∉ ℱ0

*Work supported in part by National Science Foundation Grant MCS76-82652.


Let τ = (τ1, . . . , τq) be another vector of real-valued parameters and ℱ1 = {Fτ : τ ∈ Ω1} be another parametric class of continuous distribution functions. The classes ℱ0 and ℱ1 are called separate families (Cox 1961) if the density of an arbitrary member of either class cannot be obtained as the limit of a sequence of densities from the other class. If ℱ0 and ℱ1 are separate families, then the separate families testing problem is to test

H0: F ∈ ℱ0
(6.3)
H1: F ∈ ℱ1

In words, this testing problem is to test that the sample is from a member of one class of distributions against the composite alternative hypothesis that it is from a certain alternative class. As a particular example let ℱ0 be the df's of the scale parameter exponential class of densities given by (6.24) for θ unknown and μ = 0, and let ℱ1 be the df's of the lognormal class of densities given by (6.35) for μ = 0 and σ² unknown. For this choice of ℱ0 and ℱ1, (6.3) is to test that the sample is from a scale-parameter exponential distribution against the alternative that it is from a shape-parameter lognormal distribution.
Suppose there are available k independent samples:

X11, X12, . . . , X1n1
X21, X22, . . . , X2n2
. . .
(6.4)
Xk1, Xk2, . . . , Xknk

The several samples goodness-of-fit problem is to test

H0: Xij ~ Fθi ∈ ℱ0; i = 1, . . . , k; j = 1, . . . , ni
(6.5)
H1: Negation of H0

In words, this null hypothesis is that all random variables (rv's) have df's of the same functional form; however, the parameters may change from sample to sample. The testing problem of (6.5) is an important generalization of the classical single sample goodness-of-fit composite null hypothesis testing problem of (6.2). The several samples null hypothesis can also be considered against a corresponding several samples separate families hypothesis testing problem as follows:

H0: Xij ~ Fθi ∈ ℱ0; i = 1, . . . , k; j = 1, . . . , ni
(6.6)
H1: Xij ~ Fτi ∈ ℱ1; i = 1, . . . , k; j = 1, . . . , ni

In the following sections we shall consider some techniques for testing the five types of problems described above. The approach used here may be described as follows for the classical goodness-of-fit problem of display (6.2). The sample X1, . . . , Xn is transformed to a set of values U1, . . . , UN (N = n - p, the number of observations minus the number of parameters) in such a way that when H0 is true, then U1, . . . , UN are independently and identically distributed uniform random variables on the (0,1) interval, i.e., i.i.d. U(0,1) rv's. The composite null hypothesis H0 of (6.2) is then replaced by the surrogate simple null hypothesis that the U's are i.i.d. U(0,1) rv's.
The reader who is interested only in the methodology of this approach may wish to go directly to Section 6.4, which considers studying the uniformity of the transformed values; and then to Section 6.5, where the formulas are given for the transformations for a number of families of distributions. Numerical examples are given in Section 6.6 to illustrate the application of this approach to some data sets.

6.1.2 The Transformations Approach

In this section we consider again a single sample X1, . . . , Xn with parent df F that is assumed under a null hypothesis to be a member of the parametric class ℱ0 of the last subsection. Consider a set of transformations of the structure

U1 = h1(X1, . . . , Xn)
U2 = h2(X1, . . . , Xn)
. . .
(6.7)
UN = hN(X1, . . . , Xn)

where N < n. Each hi for i ∈ {1, . . . , N} is a real-valued measurable function. Let PF denote the probability distribution of (U1, . . . , UN) induced from the parent distribution with df F, and recall that a test is similar on H0 of (6.2) if it has constant probability of rejecting for every F ∈ ℱ0. The following are three properties of the U-transformations of (6.7) which we list for consideration.

(1) There exists a probability distribution Q such that PF = Q for every F ∈ ℱ0, i.e., (U1, . . . , UN) has the same distribution for every F ∈ ℱ0.

(2) If PF = Q, then F ∈ ℱ0, i.e., (U1, . . . , UN) has the characterizing distribution Q only if F is a member of the class ℱ0.
(3) There exists a test based on (U1, . . . , UN) for testing the null hypothesis H0 of (6.2) against a particular simple alternative hypothesis that has the same power as the most powerful similar test based on (X1, . . . , Xn) for this same testing problem.

Condition (1) is very important because it assures that every size α test based on (U1, . . . , UN) is also a similar size α test for the same null hypothesis.
Condition (2) is a theoretically interesting property of transformations of the structure of (6.7); however, it should be pointed out that this type of characterization property of transformations has not played an important role in the goodness-of-fit field. The apparent reason why this is so involves the following considerations. The actual test statistic will itself be a real-valued function of the values (U1, . . . , UN) of (6.7), i.e., the test statistic is obtained by composition of a real-valued function with the transformations of (6.7). Unless the distribution of the test statistic is also a characterization of the distribution Q in property (2), then property (2) for the transformations of (6.7) is of little relevance. In other words, it matters little whether characterization is lost in the first or second step of the transformations. In this context, it should be observed that most of the goodness-of-fit statistics that are important in applied statistics do not characterize a null class of distributions. As a particular example, consider the chi-squared test statistic. Although chi-squared test statistics do not characterize null hypothesis classes of distributions, they have, of course, been and are of great importance in applied statistics (see Chapter 3).
A number of transformations of the form of (6.7) have been given in the literature for particular parametric families. David and Johnson (1948) considered the probability integral transformation when parameters are replaced by estimates. They showed that the transformed values are dependent, and for many location-scale parameter families that the transformed random variables have distributions that do not depend upon the values of the parameters.
A number of writers have given transformations for particular families of the structure of (6.7) that satisfy condition (1) for transformations. Sarkadi (1960, 1965) gives transformations for the three univariate normal families. Durbin (1961) has proposed a transformation approach that eliminates the nuisance parameters by introducing a further randomization step. Störmer (1964) gives a method for transforming a sample from a N(μ, σ²) distribution to a sample of n - 2 values from a N(0,1) distribution. A number of transformations of the structure of (6.7) and satisfying property (1) have been considered in the literature for one and two parameter exponential classes. Two of these transformations are considered by Seshadri, Csörgö and Stephens (1969), and one is shown to have property (2), also. Csörgö and Seshadri (1970), Csörgö and Seshadri (1971), and Csörgö, Seshadri, and

Yalovsky (1973) have considered this transformations approach, and have given transformations for some particular normal, exponential, and gamma families of distributions with properties (1) and (2). A number of writers have considered the "recursive residuals" for normal regression models, which are of the structure of (6.7) when the variance is known. We shall discuss these in subsection 6.5.7 below.
A general theory for obtaining transformations such that the transformed U's are i.i.d. U(0,1) random variables is given by O'Reilly and Quesenberry (1973) and extended to additional classes by Quesenberry (1975). These authors call the transformations involved conditional probability integral transformations (CPIT's), and we shall in this chapter consider only transformations obtained by this approach. We do not claim that the transformations and resulting tests and other analyses obtained by this approach have advantages over all other transformations for particular classes of distributions. Indeed, in some cases they will be found to give statistics equivalent to those of some other approaches.

6.2 PROBABILITY INTEGRAL TRANSFORMATIONS

6.2.1 Classical Probability Integral Transformations

The well-known probability integral transformation theorem due to R. A. Fisher (1930) can be stated as follows.

Theorem 6.1

If X is a real-valued random variable with continuous df F, then U = F(X) has a uniform distribution on the interval (0,1), i.e., U is a U(0,1) rv.
(Historical aside: Actually, Fisher did not explicitly discuss the transformation in this paper; rather, he used the fact that a continuous distribution function F(T; θ) of a statistic T is a U(0,1) rv in deriving fiducial limits for a parameter. He also used the result in Theorem 6.1 in Statistical Methods for Research Workers (Fisher 1932), in his method for combining tests of significance; again, without explicit discussion of the transformation itself.)

Thus if X1, . . . , Xn are i.i.d. with continuous common df F, then U1 = F(X1), . . . , Un = F(Xn) are i.i.d. U(0,1) random variables. An important generalization of this basic theorem due to Rosenblatt (1952) is given in the next theorem.
If F is a multivariate distribution function we denote by F(·), F(· | ·), etc., the usual marginal and conditional df's.

Theorem 6.2

If (Y1, . . . , Ym) is a vector of m random variables with absolutely continuous multivariate distribution function F, then the m random variables

U1 = F(Y1), U2 = F(Y2 | Y1), . . . , Um = F(Ym | Y1, . . . , Ym-1)

are i.i.d. U(0,1) rv's.

As a simple example to illustrate Theorem 6.2, let m = 2 and (Y1, Y2) have joint density function

f(y1, y2) = exp(-y2) for 0 < y1 < y2 < +∞

Then

F(y1) = 1 - exp(-y1) for y1 > 0

and

F(y2 | y1) = 1 - exp(y1 - y2), 0 < y1 < y2 < ∞

Theorem 6.2 says that

U1 = F(Y1) = 1 - exp(-Y1) and U2 = F(Y2 | Y1) = 1 - exp(Y1 - Y2)

for 0 < Y1 < Y2 < ∞ are i.i.d. U(0,1) rv's. This result is easily verified directly. Applying the standard transformation of densities gives

h(u1, u2) = exp(-y2) · exp(y2) = 1 for 0 < u1 < 1, 0 < u2 < 1
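This construction is easy to check by simulation. The sketch below (not from the text) exploits the fact that the joint density above factors as exp(-y1) · exp(-(y2 - y1)), so Y1 and Y2 - Y1 can be simulated as independent Exp(1) variables; the two transformed coordinates should then behave like independent U(0,1) values.

```python
import math
import random

random.seed(42)

# f(y1, y2) = exp(-y2) on 0 < y1 < y2 factors as exp(-y1) * exp(-(y2 - y1)),
# so Y1 and Y2 - Y1 are independent Exp(1) variables.
n = 20000
u1, u2 = [], []
for _ in range(n):
    y1 = random.expovariate(1.0)
    y2 = y1 + random.expovariate(1.0)
    u1.append(1.0 - math.exp(-y1))       # U1 = F(Y1)
    u2.append(1.0 - math.exp(y1 - y2))   # U2 = F(Y2 | Y1)

# Under Theorem 6.2 both coordinates are U(0,1) and independent;
# check the first moments and the sample covariance.
m1 = sum(u1) / n
m2 = sum(u2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(u1, u2)) / n
print(round(m1, 2), round(m2, 2), round(cov, 4))
```

Both sample means should be near 1/2 and the covariance near 0, consistent with i.i.d. U(0,1) coordinates.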

The generality with which both of the preceding theorems hold should be carefully noted. We shall apply these in particular for cases when F is already explicitly a conditional distribution function.

6.2.2 Conditional Probability Integral Transformations: CPIT's

The model assumptions we make in this chapter are more restrictive than those in O'Reilly and Quesenberry (1973) (O-Q), but they are sufficiently general to cover many important cases. We assume that the parametric class ℱ0 of distribution functions corresponds to a continuous exponential class (cf. Zacks, 1971, Section 2.5), and that Tn is a p-component vector that is a sufficient and complete statistic for θ = (θ1, . . . , θp). Denote by F̃n(x1, . . . , xn) the distribution function of (X1, . . . , Xn) given the statistic Tn. Then F̃n(x1), F̃n(x2 | x1), . . . , F̃n(xn | x1, . . . , xn-1) are the marginal and conditional distribution functions obtained from F̃n(x1, . . . , xn). The following theorem is a direct consequence of Theorem 2.3 of O-Q and can be obtained from Theorem 6.2 above.

Theorem 6.3

The (n - p) random variables

U1 = F̃n(X1), U2 = F̃n(X2 | X1), . . . , Un-p = F̃n(Xn-p | X1, . . . , Xn-p-1)
(6.8)

are i.i.d. U(0,1) rv's.

We note that the assumption that (X1, . . . , Xn) are i.i.d. is not necessary to obtain Theorem 6.3. It is sufficient to require that (X1, . . . , Xn) have a full-rank absolutely continuous distribution. We give results below in subsection 6.5.3 obtained by applying Theorem 6.3 to the order statistics of a sample from an exponential distribution.
A sequence (Tn)n≥1 of statistics is said to be doubly transitive if each of the pairs of values (Tn, Xn+1) and (Tn+1, Xn+1) can be computed from the other. For example, if Tn = X̄n = (X1 + · · · + Xn)/n then (Tn+1, Xn+1) = ((nX̄n + Xn+1)/(n + 1), Xn+1), and (Tn, Xn+1) = (((n + 1)X̄n+1 - Xn+1)/n, Xn+1); and the sample mean X̄n is doubly transitive.
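As a minimal numerical sketch of the sample-mean example (the data values below are arbitrary, chosen only for illustration), the two update formulas can be checked directly:

```python
# Check that the sample mean is doubly transitive: (T_n, X_{n+1}) and
# (T_{n+1}, X_{n+1}) determine each other.
x = [3.1, 0.7, 5.2, 2.4]        # arbitrary illustrative data
n = 3
t_n = sum(x[:n]) / n            # T_n = mean of the first n observations
x_next = x[n]

# Forward: compute T_{n+1} from (T_n, X_{n+1}).
t_next = (n * t_n + x_next) / (n + 1)
assert abs(t_next - sum(x) / (n + 1)) < 1e-12

# Backward: recover T_n from (T_{n+1}, X_{n+1}).
t_n_back = ((n + 1) * t_next - x_next) / n
assert abs(t_n_back - t_n) < 1e-12
print("double transitivity verified")
```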
The i.i.d. assumptions on (X1, . . . , Xn) are necessary, in general, for the next theorem, which is obtained from Corollary 2.1 of O-Q. Now, it is often rather difficult to apply Theorem 6.3 directly to obtain explicit formulas for particular parametric families. The next theorem greatly simplifies the task of deriving the actual transformations for many important parametric families.

Theorem 6.4

If (Tn)n≥1 is doubly transitive, then the (n - p) random variables

Uj-p = F̃j(Xj), j = p + 1, . . . , n
(6.9)

are i.i.d. U(0,1) rv's, where F̃j denotes the conditional df of Xj given Tj.
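As an illustration of Theorem 6.4, consider the scale-parameter exponential family, for which Tj = X1 + · · · + Xj is complete and sufficient (p = 1) and the conditional df of Xj given Tj has a known closed form: given Sj = X1 + · · · + Xj, the ratio Xj/Sj has df 1 - (1 - u)^(j-1). The sketch below uses this standard result (the chapter's own formulas appear in Section 6.5) to compute the n - 1 uniform residuals; note that the unknown scale θ cancels out.

```python
import random

def exponential_cpit(x):
    """CPIT uniform residuals for the scale-parameter exponential family.

    Uses U_{j-1} = 1 - (1 - X_j / S_j)**(j - 1), j = 2, ..., n, where
    S_j = X_1 + ... + X_j; this is the closed form of the conditional
    df of X_j given the sufficient statistic S_j (a standard result).
    """
    u, s = [], 0.0
    for j, xj in enumerate(x, start=1):
        s += xj
        if j >= 2:
            u.append(1.0 - (1.0 - xj / s) ** (j - 1))
    return u

random.seed(7)
theta = 4.2                                  # unknown scale; cancels out
x = [random.expovariate(1.0 / theta) for _ in range(5000)]
u = exponential_cpit(x)
print(len(u), round(sum(u) / len(u), 2))     # n - 1 values, mean near 0.5
```

Whatever the true scale θ, the resulting u's are i.i.d. U(0,1) under the exponential null hypothesis, which is exactly the distribution-free property that Theorem 6.4 guarantees.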

6.2.3 Conditional Probability Integral Transformations, Truncation Parameter Families

Some important classes of distributions which are not covered by the transformation theory of the preceding section are the truncation parameter families considered by Quesenberry (1975). For these families we assume that the parent density defined on an interval (a, b), finite or infinite, is of one of the three forms:

f(x; μ1, μ2, θ) = c(μ1, μ2, θ)h(x; θ), a < μ1 < x < μ2 < b   (6.10)
f1(x; μ, θ) = c1(μ, θ)h1(x; θ), a < μ < x < b   (6.11)
f2(x; μ, θ) = c2(μ, θ)h2(x; θ), a < x < μ < b   (6.12)

Here a and b are known constants; μ, μ1, and μ2 are truncation parameters; θ is a p-component parameter vector; and h(x; θ), h1(x; θ), and h2(x; θ) are positive, continuous, and integrable functions over the intervals (μ1, μ2), (μ, b), and (a, μ), respectively.
For X1, . . . , Xn a sample we now set out a particular transformation to another set of values. Let r denote the antirank of X(n), i.e., r must satisfy Xr = X(n); and put

W1 = X1, . . . , Wr-1 = Xr-1, Wr = Xr+1, . . . , Wn-1 = Xn

Then we shall call W1, . . . , Wn-1 the sample with X(n) deleted. In words: W1, . . . , Wn-1 are the sample members that are less than the largest order statistic and subscripted in the same order.
Next, let r denote the antirank of X(1), i.e., r must satisfy Xr = X(1). Define W1, . . . , Wn-1 in terms of this Xr in the same manner as above, and these values will be called the sample with X(1) deleted.
Finally, let r1 and r2 denote the antiranks of X(1) and X(n), respectively; and put m1 = min{r1, r2} and m2 = max{r1, r2}. Put

W1 = X1, . . . , Wm1-1 = Xm1-1, Wm1 = Xm1+1, . . . , Wm2-2 = Xm2-1,
Wm2-1 = Xm2+1, . . . , Wn-2 = Xn

These values W1, . . . , Wn-2 are called the sample with X(1) and X(n) deleted.
For X1, . . . , Xn a sample from the density f of (6.10) and W1, . . . , Wn-2 the sample with X(1) and X(n) deleted, the next theorem is from Quesenberry (1975).

Theorem 6.5

For fixed (X(1), X(n)) = (x(1), x(n)), the members W1, . . . , Wn-2 of the deleted sample are conditionally independent, identically distributed, continuous rv's with common density function

g(w; θ) = h(w; θ) I(x(1), x(n))(w) / ∫_{x(1)}^{x(n)} h(w; θ) dw   (6.13)

where I(x(1), x(n))(w) = 1 if x(1) < w < x(n), and is otherwise zero.

Let Tn-2 be a complete sufficient statistic [a function of (W1, . . . , Wn-2)] for θ in the family of g(w; θ), and let G̃n-2 be the corresponding Rao-Blackwell (MVU) estimating df of the df G(w; θ) corresponding to the density function g(w; θ).
From Theorem 6.4 above or from Theorem 2 of Quesenberry (1975), the next result follows.

Theorem 6.6

If (Tj)j≥1 is doubly transitive then the (n - p - 2) rv's

Uj-p = G̃j(Wj), j = p + 1, . . . , n - 2   (6.14)

are i.i.d. U(0,1) rv's.

Transformations of samples from distributions with densities of the form of f1 of (6.11) or f2 of (6.12) can be obtained in a similar fashion. If the sample is from f1 we fix X(1) = x(1) and let W1, . . . , Wn-1 be the sample with X(1) deleted. Then for fixed X(1) = x(1) these values are conditionally independent, identically distributed, continuous rv's with common density function

g1(w; θ) = h1(w; θ) I(x(1), b)(w) / ∫_{x(1)}^{b} h1(w; θ) dw   (6.15)

Finally, if the sample is from f2 we fix X(n) = x(n) and let W1, . . . , Wn-1 be the sample with X(n) deleted. For fixed X(n) = x(n) these values are conditionally independent, identically distributed, continuous rv's with common density function

g2(w; θ) = h2(w; θ) I(a, x(n))(w) / ∫_{a}^{x(n)} h2(w; θ) dw   (6.16)

In each of these two cases we apply Theorem 6.4 to the Wj's to produce (n - p - 1) i.i.d. U(0,1) rv's. The transformations for some particular families of truncation parameter distributions will be given in Section 6.5.
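A simple special case is the uniform family U(0, μ), which has the form of f2 in (6.12) with h2 ≡ 1 and no θ (so p = 0). Conditionally on X(n) = x(n), the sample with X(n) deleted is i.i.d. U(0, x(n)), and Uj = Wj/X(n) gives n - 1 i.i.d. U(0,1) values. The following is a sketch of this standard special case (the chapter's own formulas for truncation families appear in Section 6.5):

```python
import random

def uniform_truncation_cpit(x):
    """CPIT for the U(0, mu) truncation family (form (6.12), h2 = 1, p = 0).

    Conditionally on X(n) = x(n), the sample with X(n) deleted is
    i.i.d. U(0, x(n)), so the ratios W_j / x(n) are i.i.d. U(0,1).
    """
    xmax = max(x)
    r = x.index(xmax)            # antirank of X(n)
    w = x[:r] + x[r + 1:]        # the sample with X(n) deleted
    return [wj / xmax for wj in w]

random.seed(11)
mu = 9.3                         # unknown truncation parameter
x = [random.uniform(0.0, mu) for _ in range(4000)]
u = uniform_truncation_cpit(x)
print(len(u), abs(sum(u) / len(u) - 0.5) < 0.02)
```

With p = 0 the theorem's count n - p - 1 is n - 1, matching the number of values returned; the unknown μ never enters the computation.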

6.3 SOME PROPERTIES OF CPIT'S

6.3.1 Notation and Terminology

In this section we give a largely descriptive account of some of the more important properties of CPIT's, and for this we shall use some notation and terminology that is not used elsewhere in this chapter.
Denote by 𝒫j a parametric class of probability measures, corresponding to the class of df's ℱj of Section 6.1, for j = 0, 1. The members of these classes are probability measures on a Borel set 𝒳 of real numbers, and let ℬ denote the Borel subsets of 𝒳. Let (𝒳^n, ℬ^n) be the usual product space, and

𝒫j^n = {P^n : P^n = P × · · · × P, P ∈ 𝒫j}, j = 0, 1

Let g: 𝒳 → 𝒳 be a strictly increasing 1-to-1 function and g^n be defined by g^n(x1, . . . , xn) = (gx1, . . . , gxn). For each function g assume there exists a function ḡ: Ω → Ω such that P_ḡθ^n(X^n ∈ g^n A) = P_θ^n(X^n ∈ A) for every A ∈ ℬ^n, where X^n = (X1, . . . , Xn). Let G denote a transformation group on 𝒳, G^n the corresponding transformation group on 𝒳^n, and Ḡ the corresponding transformation group on Ω. A transformation group on a space is said to be transitive if its maximal invariant is constant on the space. The U-values of Theorem 6.4 can be expressed as functions on 𝒳^n. The next three theorems are from Quesenberry and Starbuck (1976) (Q-S). In the following, U' = (U1, . . . , Un-p).

Theorem 6.7

If G is a transformation group of strictly increasing functions on 𝒳 that induces a transitive group Ḡ on Ω, and conditions for Theorem 6.4 are satisfied, then U is equivalent to an invariant statistic, i.e.,

U(X1, . . . , Xn) = U(gX1, . . . , gXn) a.s., for every g ∈ G

From the distributional result of Theorem 6.3 and Basu (1955, 1960), the next result follows.

Theorem 6.8

For T a complete sufficient statistic for ℱ0 and U as given in Theorem 6.3, T and U are independent vectors.

The following result from Q-S shows that the U-transformations of Section 6.2 are efficient from the power viewpoint.

Theorem 6.9

A most powerful similar test exists for testing H0 of (6.2) against a simple alternative F = F1, and this test can be expressed as a function of (U1, . . . , Un-p) only.

The import of this result is that all information in the sample about the class of df's ℱ0 is also in the U-values U1, . . . , Un-p. This result and Theorem 6.8 show that the CPIT transformations may be regarded as a technique whereby the information in the sample (X1, . . . , Xn) can be partitioned into two vectors (U, T): the vector T contains all information about the parameters (θ1, . . . , θp), the vector U contains all information about the class ℱ0, and T and U are independent. Thus the values (U1, . . . , Un-p) may be used to make inferences about the class of distributions (is it normal? exponential?), the statistic(s) T may be used to make inferences within the class about parameter values (estimate the mean?), and the independence of U and T can be exploited to assess overall error rates.

6.3.2 Sequential Nature of CPIT's

From the nature of the transformations in (6.8) and (6.9) it is apparent that, in general, the vector U of transformed values is not a symmetric function of the X sample. That is, the vector U is not invariant under permutations of the observations. For those cases when the transformations are not permutation invariant, this property requires consideration on a number of points.
One point of concern when the u's are not permutationally invariant is that a goodness-of-fit test or other analysis may lead to a conclusion that depends upon the presumably irrelevant indexing of the X's. If two randomly selected orderings are used to compute U and then a particular goodness-of-fit test (such as the Neyman smooth test discussed below) is computed on the two U vectors, the statistics are identically distributed and dependent, but nonidentical. If we consider these two test statistics, then the situation is similar to that when more than one competing test statistic is computed for the same testing problem. In particular, it is common practice today to compute a number of the goodness-of-fit test statistics, discussed elsewhere in this book, for each sample. A quantity of relevance here is the probability that two tests will agree in their conclusions. Quesenberry and Dietz (1983) considered this probability for Neyman smooth tests made on the U's from random permutations of a sample. They gave empirical evidence that these agreement probabilities are very high in many cases of interest and are bounded below by the value 2/3 for all cases considered.
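The order dependence is easy to exhibit numerically. The sketch below (an illustration, not the text's own code) applies the scale-exponential CPIT formula Uj-1 = 1 - (1 - Xj/Sj)^(j-1), a standard closed form for that family, to the same sample under two different orderings; both resulting vectors are i.i.d. U(0,1) under H0, yet they are not equal.

```python
import random

def exp_cpit(x):
    # Scale-exponential CPIT: U_{j-1} = 1 - (1 - X_j/S_j)**(j-1),
    # with S_j the running sum (a standard closed form; see Section 6.5).
    u, s = [], 0.0
    for j, xj in enumerate(x, start=1):
        s += xj
        if j >= 2:
            u.append(1.0 - (1.0 - xj / s) ** (j - 1))
    return u

random.seed(3)
x = [random.expovariate(1.0) for _ in range(50)]

perm = x[:]
random.shuffle(perm)            # a second, randomly permuted ordering

u_a, u_b = exp_cpit(x), exp_cpit(perm)
# Same data, two orderings: both u-vectors are valid i.i.d. U(0,1)
# sets under H0, but as vectors they differ.
print(len(u_a), u_a != u_b)
```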
It is possible to obtain CPIT transforms that are invariant under permutations of the original sample. To obtain these transforms recall that in order to apply the approach of Rosenblatt it was necessary to assume only that the rv's (X1, . . . , Xn) had a full-rank continuous distribution. Thus to obtain permutation invariant transformations we can first transform an i.i.d. sample to its order statistics, say (X(1), . . . , X(n)), and then make CPIT transformations using Theorem 6.3 on these order statistics. However, for this case Theorem 6.4 is not applicable and it is a difficult task to find the transforms in practice. They have been found only for two rather simple cases of exponential and uniform distributions, that will be considered further below.
The discussion thus far has assumed that the sample is entirely symmetric of given size n. In practice, we often have data that can be naturally ordered by some variable. Perhaps the most common ordering variable is time, and when this is so the observations themselves may arrive sequentially in time. When the data are ordered by time, or some other variable, then the CPIT transformations approach has an especially strong appeal, for it will allow the analyst to design tests and other analysis techniques specifically to detect misspecifications in the i.i.d. model that are related to the ordering of the data. For example, one or more of the parameters might change with time. Some problems of this type are classical ones in statistics; there is a large literature concerned with slippage of normal means and with detecting heteroscedasticity of normal variances. However, we shall not consider these problems in this chapter.

6.4 TESTING SIMPLE UNIFORMITY

6.4.1 Introductory Remarks

The transformations of Section 6.2 can be used to construct similar α tests for the testing problems of (6.2) and (6.3) by making size α tests of the surrogate null hypothesis that the U-values are themselves independently and identically distributed as uniform random variables on the interval (0,1), i.e., are i.i.d. U(0,1) rv's. These transformed values of the U's should be studied with care because they contain all test information in the sense of Theorem 6.9. It should also be observed that when the null hypothesis H0: F ∈ ℱ0 fails, the distribution of the U's may fail to be i.i.d. U(0,1) in many ways. They may no longer be independent, nor identically distributed, nor uniformly distributed. Moreover, if the model properties of independence or identical distributions of the observations as assumed by the formal goodness-of-fit and separate hypothesis testing problems of (6.2)-(6.6) are violated, then this will also result in transformed values that are not, in general, i.i.d. U(0,1) rv's. Thus we can use these transformed values to study the validity of other model specifications, in addition to violations of an assumed parametric density functional form. Now, the choice of tests to be made will, of course, depend upon the type of violation of model assumptions of concern. In some problems, we may not have reason to be concerned about particular types of model violations, and we would like to perform an analysis with good sensitivity against a wide range of alternatives. Such an analysis can be made by studying the transformed values to determine if these are feasible values for i.i.d. U(0,1) rv's. The analyses which we propose here for this purpose include graphical methods, and two omnibus goodness-of-fit tests on the transformed U values.
Subsequently, we shall use lower case u's and write u = (u1, . . . , uN) for both the random variables and their observed values. Here N = n - p, the number of observations minus the number of parameters in the model. Also, when we consider models with normal distributions of the errors in subsections 6.5.5 and 6.5.7 (linear regression models) below, it will be seen that the values uj are closely related to quantities usually called residuals from the least squares fitted lines from these models. For this reason we shall call u the vector of uniform residuals from the parametric model, in general.
Next put

zj = Φ⁻¹(uj), j = 1, . . . , N   (6.17)

for Φ a N(0,1) df, and Φ⁻¹ its inverse. By the inverse of Theorem 6.1, when the uj's are i.i.d. U(0,1) rv's, then the zj's are i.i.d. N(0,1) rv's. Thus we can also test (6.3) by testing the surrogate null hypothesis that the zj's are i.i.d. N(0,1) rv's. We shall consider further tests based on the zj's below when we consider particular parametric classes. We shall call the values zj normal uniform (NU) residuals. A principal reason for considering these NU residuals is that the problem of testing normality is the most extensively studied problem in the goodness-of-fit area (see Chapter 9 of this book), and we will recommend tests of normality below. Hester and Quesenberry (1984) have found that a test based on these NU residuals has attractive power properties for testing for heteroscedasticity, i.e., for either increasing or decreasing variance for ordered regression data.
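The transformation (6.17) can be sketched in Python using statistics.NormalDist for Φ⁻¹; the uniform residuals are simulated here as stand-ins for the output of a CPIT.

```python
import random
from statistics import NormalDist, fmean, stdev

random.seed(5)
nd = NormalDist()                  # standard N(0,1), supplies Phi^{-1}

# Suppose u_1, ..., u_N are the uniform residuals from some CPIT;
# here we simulate i.i.d. U(0,1) values to stand in for them.
u = [random.random() for _ in range(2000)]

# (6.17): z_j = Phi^{-1}(u_j) -- the normal uniform (NU) residuals.
z = [nd.inv_cdf(uj) for uj in u]

# Under H0 the z_j behave like i.i.d. N(0,1) draws: mean near 0,
# standard deviation near 1.
print(abs(fmean(z)) < 0.1, abs(stdev(z) - 1.0) < 0.1)
```

Any standard test of normality can then be applied to the z's in place of a uniformity test on the u's.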

6. 4. 2 Graphical Methods fo r Symmetric Samples

Graphical methods are very useful for studying the uniformity of the values u_1, ..., u_N on the unit interval. As a first step the data can be plotted on the (0,1) interval; or, if N is large, it will be more convenient to partition the (0,1) interval into a number of subintervals of equal length and to construct a histogram on these subintervals. The data pattern on the unit interval, or the shape of the histogram constructed, conveys important information about the shape of the parent density from which the data were drawn relative to the shapes of the densities of the null hypothesis class. We next consider this in more detail.

248 QUESENBERRY

In order to study the significance of particular patterns of the u-values on the unit interval, we first suppose that the u's were obtained from the classical probability integral transformation of Theorem 6.1 by transforming a sample x_1, ..., x_n using a continuous df F_0(x) with corresponding density function f_0(x). However, if x_1, ..., x_n is actually a sample from a parent distribution with df F_1(x) and density function f_1(x), then the u's will not constitute a set of i.i.d. U(0,1) rv's, unless F_0 = F_1. Let (a,b) be an interval on the real line where f_0(x) < f_1(x) for every x in (a,b). Then the expected number of u's in the interval (F_0(a), F_0(b)) will be greater than under uniformity. Conversely, if f_0(x) > f_1(x) for x in the interval (c,d), then the expected pattern of points in (F_0(c), F_0(d)) will be more sparse than under uniformity. Thus the scatter pattern of u_1, ..., u_N on the unit interval should be interpreted as follows. If the data are too dense on an interval (F_0(a), F_0(b)), this implies that the true density function f_1 exceeds the density function f_0 on the interval (a,b); conversely, when the data are too sparse on (F_0(a), F_0(b)), the true density f_1 is less than f_0 on (a,b).
Next, suppose that the u's were obtained from a sample x_1, ..., x_n by the CPIT transformations of Theorem 6.4 for a parametric class. Then the data patterns have the same interpretations as for the classical probability integral transformations just discussed. This is true because the transforming function used for the ith observation is an estimator of the parent df. Indeed, this transforming function is a minimum variance unbiased (MVU) estimator of the parent df based on (x_1, ..., x_i) whenever the parent df is a member of the class. (See Seheult and Quesenberry (1971).)
If u_(1), ..., u_(N) are the order statistics of a sample from a U(0,1) parent, then u_(i) is a beta random variable with parameters (i, N - i + 1). Thus u_(i) has mean and variance given by

    i/(N + 1)   and   i(N - i + 1)/{(N + 1)²(N + 2)}

respectively, and its distribution function is the incomplete beta function. If the points (u_(i), i/(N + 1)) are plotted on Cartesian axes, these points should approximate the line g(u) = u for 0 < u < 1. Quesenberry and Hales (1980) have given graphs called concentration bands for N = 2, 5, 10, 15, 20, 30, 40, 50, 60, 80, 100, 150, 200, 300, 500 that are helpful guides for judging the significance of these uniform probability plots.
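A uniform probability plot of this kind is easy to construct. The following sketch is ours, not part of the chapter; it returns the plotted pairs (u_(i), i/(N + 1)), which can then be drawn with any plotting package and compared with the line g(u) = u.

```python
def uniform_plot_points(u_values):
    """Pairs (u_(i), i/(N+1)) for a uniform probability plot (subsection 6.4.2).

    Under uniformity the points should lie close to the 45-degree line;
    the concentration bands of Quesenberry and Hales (1980) calibrate how
    far from the line the points may plausibly stray.
    """
    n = len(u_values)
    return [(u, i / (n + 1)) for i, u in enumerate(sorted(u_values), start=1)]

pts = uniform_plot_points([0.1, 0.9, 0.5, 0.3, 0.7])  # N = 5, ordinates i/6
```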
The discussion made above in this subsection allows us to anticipate the pattern of points that the uniform residuals will make on the unit interval when a particular alternative density is considered. For a given data set this allows a direct interpretation of the data patterns observed in uniform and NU residuals, both in histograms and in uniform probability plots. Some examples of data patterns in uniform probability plots will be given in the numerical examples of Section 6.6.

6.4.3 Test Statistics for Uniformity and Normality

The statistical literature contains a large number of goodness-of-fit test statistics which can be used to test that a set of values u_1, ..., u_N constitutes a sample from a U(0,1) distribution. We are here interested in tests which have good power against a wide range of alternative shapes. Monte Carlo power studies of tests of uniformity on which we shall base our choice of test statistics have been given by Stephens (1974), Quesenberry and Miller (1977) (Q-M), and Miller and Quesenberry (1979). We recommend two test statistics for use in testing uniformity of the u-values, and our choices are the statistic of Watson (1961) and the smooth test of Neyman (1937).

TABLE 6.1 Empirical Critical Values for U²_MOD

N \ α    0.1      0.05     0.01
 2       0.164    0.181    0.195
 3       0.158    0.191    0.242
 4       0.152    0.187    0.254
 5       0.152    0.185    0.256
 6       0.151    0.187    0.260
 7       0.151    0.187    0.262
 8       0.152    0.188    0.262
 9       0.152    0.189    0.270
10       0.151    0.185    0.265
 ∞       0.152    0.187    0.267
The Watson statistic was found in Q-M to have good power against a number of classes of alternatives, even for small sample sizes. This statistic is given by

    U²_N = 1/(12N) + Σ_{i=1}^{N} {u_(i) - (2i - 1)/(2N)}² - N(ū - 0.5)²      (6.18)

where ū = (u_1 + ··· + u_N)/N. Stephens (1970) found empirically that a linear function of U²_N given by

    U²_MOD = {U²_N - 0.1/N + 0.1/N²}{1 + 0.8/N}                             (6.19)

has critical values that are approximately constant in N for N > 10. Table 6.1 gives approximate significance points for U²_MOD that were obtained by Q-M by Monte Carlo methods, and the Stephens approximation for N > 10.
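Equations (6.18) and (6.19) translate directly into code. The sketch below is an illustration of ours, not part of the original chapter; it computes U²_N and the modified statistic U²_MOD, whose value can then be referred to Table 6.1.

```python
def watson_u2_mod(u_values):
    """Watson's U^2_N of (6.18) and Stephens' modification U^2_MOD of (6.19)."""
    n = len(u_values)
    u_sorted = sorted(u_values)
    u_bar = sum(u_values) / n
    u2 = (1.0 / (12 * n)
          + sum((u_sorted[i - 1] - (2 * i - 1) / (2 * n)) ** 2
                for i in range(1, n + 1))
          - n * (u_bar - 0.5) ** 2)
    return (u2 - 0.1 / n + 0.1 / n ** 2) * (1.0 + 0.8 / n)

# Perfectly regular u's give a very small value of the statistic:
stat = watson_u2_mod([(2 * i - 1) / 20 for i in range(1, 11)])
```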
In a classic paper Neyman (1937) proposed a test statistic that is defined as follows. Let π_r denote the rth degree normalized Legendre polynomial, the first five of which are, for 0 ≤ u ≤ 1,

    π_0(u) = 1

    π_1(u) = √12 (u - 1/2)

    π_2(u) = √5 {6(u - 1/2)² - 1/2}

    π_3(u) = √7 {20(u - 1/2)³ - 3(u - 1/2)}

    π_4(u) = 210(u - 1/2)⁴ - 45(u - 1/2)² + 9/8

Then put

    v_r = Σ_{j=1}^{N} π_r(u_j),   r = 1, ..., k

and

    p²_k = (1/N) Σ_{r=1}^{k} v_r²

Miller and Quesenberry (1979) (M-Q) showed that each of the tests for k = 2, 3, 4 has good power against a number of classes of alternatives. Neyman showed that p²_k has an asymptotic χ²(k) distribution under the uniformity null hypothesis and a noncentral χ² distribution under the alternative. Following recommendations of M-Q we shall use p²_4 as a general omnibus test statistic for the uniformity null hypothesis. Table 6.2 gives some significance points for p²_4 obtained by M-Q by Monte Carlo methods.
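The statistic p²_4 is also simple to compute directly from the polynomials listed above. The following sketch is ours; it returns p²_k for k = 4, to be compared with Table 6.2 or with χ²(4) points.

```python
import math

def neyman_p4(u_values):
    """Neyman's smooth statistic p^2_4: v_r = sum_j pi_r(u_j) and
    p^2_4 = (1/N)(v_1^2 + ... + v_4^2), for the normalized Legendre
    polynomials pi_1, ..., pi_4 on (0,1)."""
    pi_r = [
        lambda u: math.sqrt(12) * (u - 0.5),
        lambda u: math.sqrt(5) * (6 * (u - 0.5) ** 2 - 0.5),
        lambda u: math.sqrt(7) * (20 * (u - 0.5) ** 3 - 3 * (u - 0.5)),
        lambda u: 210 * (u - 0.5) ** 4 - 45 * (u - 0.5) ** 2 + 9 / 8,
    ]
    n = len(u_values)
    v = [sum(p(u) for u in u_values) for p in pi_r]
    return sum(vr ** 2 for vr in v) / n
```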
We shall consider only the statistics U²_MOD and p²_4 for testing uniformity because they appear, on the basis of the above noted power studies, to be two of the best general omnibus tests of uniformity. In those studies it was noted that one weakness shared by many goodness-of-fit tests (including p²_4) is that for small sample sizes (sometimes for N as large as 15), and some important alternatives, the tests are biased. We have not observed a case where U²_MOD gives a biased test for any sample size, nor one for which p²_4 gives a biased test for N larger than 10. Thus we recommend only U²_MOD if N ≤ 10, and we will compute both U²_MOD and p²_4 for larger sample sizes. A practical advantage of p²_4 is that it has an approximate χ²(4) distribution as N increases, and so its observed significance level, or p-value, is easily evaluated (see Table 6.2). These points will be illustrated in the numerical examples of Section 6.6.
Instead of (or in addition to) the graphs and tests using the uniform residuals, we can make graphs and tests using the NU residuals defined in (6.17) above. For most purposes, we feel that the graphs of uniform residuals are more easily interpreted than those of NU residuals. It seems to this writer that it is a bit easier to judge if a histogram agrees with an assumed uniform parent than with a normal-shaped parent. A probability plot of the NU residuals requires normal probability paper, whereas the uniform residuals require no special paper. Thus the uniform residuals are more convenient to plot with widely available software.

TABLE 6.2 Empirical Critical Values for p²_4

N \ α    0.1     0.05    0.01
 2       7.19    9.52   16.14
 3       7.34    9.51   15.80
 4       7.46    9.50   15.43
 5       7.53    9.49   15.12
 6       7.57    9.48   14.86
 7       7.60    9.47   14.65
 8       7.62    9.47   14.47
 9       7.63    9.46   14.32
10       7.64    9.46   14.19
11       7.65    9.45   14.09
12       7.65    9.45   14.00
13       7.66    9.44   13.93
14       7.66    9.44   13.87
15       7.66    9.43   13.82
16       7.66    9.43   13.78
17       7.66    9.43   13.74
18       7.67    9.42   13.71
19       7.67    9.42   13.69
20       7.67    9.42   13.67
30       7.68    9.40   13.58
40       7.68    9.40   13.52
50       7.69    9.40   13.48
 ∞       7.78    9.49   13.28
Aside from graphs of NU residuals, we can also make omnibus tests of a wide variety of parametric models by testing the normality of these residuals. There are a number of reasons to consider tests on NU residuals in place of, or in addition to, tests on the uniform residuals. The problem of testing normality for a simple random sample is the most extensively studied problem in the goodness-of-fit field, and excellent tests for this problem are readily available (see Chapter 9 of this handbook). Also, for at least some of the many important problems of testing null hypothesis parametric models with normally distributed errors (such as the multiple samples problems, the regression models of Section 6.5.7, or the ANOVA model of Section 6.5.8), the NU residuals have a useful tendency to retain data patterns from the original data. For example, Hester and Quesenberry (1984) have exploited this property to construct efficient tests for heteroscedasticity for normal regression models.
Any of the multitude of goodness-of-fit tests for the completely specified null hypothesis testing problem of (6.1) may be used to test normality of the NU residuals. However, it has been shown by both Stephens (1974) and Dyer (1974) that the test for composite normality usually has better power against most alternatives, even when the parameters are known. We have done some simulation work in studying the efficiency of the Anderson-Darling (AD) test on NU residuals, and it appears that this characteristic of goodness-of-fit tests, showing better power when parameters are not assumed known, can be expected to obtain here also. Finally, we must choose the particular tests of normality to make on the NU residuals. At the present state of knowledge, we feel that reasonable choices of test statistics would be the Shapiro-Wilk test or the AD test.
Another point that should be recalled in interpreting the analysis of residuals discussed here is the following. We generally consider the goodness-of-fit hypothesis testing problem as stated in display (6.2), and this means, of course, that we assume that the observations are i.i.d. However, with real data it will often be the case that we cannot really be sure that these assumptions are valid, and thus in practice in these cases the classical goodness-of-fit null hypothesis of (6.2) should be expanded to include the i.i.d. assumptions. That is, it is desirable to validate the entire model, including these sampling assumptions.

6.5 TRANSFORMATIONS FOR PARTICULAR FAMILIES

6.5.1 Introduction

In this section the CPIT transformations are given for a number of parametric families of probability distributions. Lower case letters are used to denote both random variables and their observed values here and in the next section, which considers numerical examples. In all cases we assume that a sample x_1, ..., x_n is available from an unspecified member of the class of distributions under consideration. We shall denote by (w_1, ..., w_{n-1}) the sample with x_(n) deleted, and use similar notation for the other cases when x_(1) is deleted or both x_(1) and x_(n) are deleted. (See Section 6.2.3 for details of this transformation.)

6.5.2 Uniform Distributions

Densities:

    f(x; μ_1, μ_2) = {1/(μ_2 - μ_1)} I_(μ_1, μ_2)(x)                        (6.20)

for -∞ < μ_1 < μ_2 < ∞.

Case I, μ_1 = μ_10 known, μ_2 unknown.
For w_1, ..., w_{n-1} the sample with x_(n) deleted put

    u_j = (w_j - μ_10)/(x_(n) - μ_10),   j = 1, ..., n - 1                  (6.21)

Case II, μ_1 unknown, μ_2 = μ_20 known.
For w_1, ..., w_{n-1} the sample with x_(1) deleted put

    u_j = (μ_20 - w_j)/(μ_20 - x_(1)),   j = 1, ..., n - 1                  (6.22)

Case III, both μ_1 and μ_2 unknown.
For w_1, ..., w_{n-2} the sample with both x_(1) and x_(n) deleted put

    u_j = (w_j - x_(1))/(x_(n) - x_(1)),   j = 1, ..., n - 2                (6.23)

For these transformations for uniform distributions, we note that the lack of invariance with respect to permutations of the x's cited in subsection 6.3.2 does not obtain. That is, two permutations of the x's will give the same values for the elements of the u vector, though not in the same order. Thus the ordered values of the components of u will be the same, and any analysis or test that is symmetric in the components of u will be unaffected by a permutation of the x's.
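For Case III, the transformation (6.23) amounts to deleting the two extreme observations and rescaling the rest. A minimal sketch of ours, assuming the extremes are attained by single observations:

```python
def uniform_cpit_case3(x):
    """Eq. (6.23): with both endpoints unknown, delete x_(1) and x_(n) and put
    u_j = (w_j - x_(1)) / (x_(n) - x_(1)) for the n - 2 remaining values."""
    lo, hi = min(x), max(x)
    w = list(x)
    w.remove(lo)   # delete one occurrence of x_(1)
    w.remove(hi)   # delete one occurrence of x_(n)
    return [(wj - lo) / (hi - lo) for wj in w]

u = uniform_cpit_case3([2.0, 10.0, 4.0, 6.0])
```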

6.5.3 Exponential Distributions

Densities:

    f(x; μ, θ) = (1/θ) exp{-(x - μ)/θ} I_(μ,∞)(x)                           (6.24)

for -∞ < μ < ∞ and θ > 0.

Case I, μ unknown, θ = θ_0 known.
Let w_1, ..., w_{n-1} be the sample with x_(1) deleted.

    u_j = 1 - exp{-(w_j - x_(1))/θ_0},   j = 1, ..., n - 1                  (6.25)

Note that the remarks about the effect of permuting the x's on the u's above also obtain for these u's, viz., if the x's are permuted we still get the same set of values {u_1, ..., u_{n-1}} from (6.25). For Cases II and III that follow, we give the transforms obtained from the sample x_1, ..., x_n from Theorem 6.4, as (6.26), and, also, transforms obtained by applying Theorem 6.3 directly to the order statistics (x_(1), ..., x_(n)), as (6.27); these latter formulas were obtained by O'Reilly and Stephens (1984).

Case II, θ unknown, μ = μ_0 known.
Put x'_i = x_i - μ_0, i = 1, ..., n.

    u_{j-1} = 1 - {(x'_1 + ··· + x'_{j-1})/(x'_1 + ··· + x'_j)}^{j-1},   j = 2, ..., n      (6.26)

Next, put y_i = x_(i) - μ_0, i = 1, ..., n, and y_0 = 0.

    u_j = 1 - { [1 - (n - j + 1)y_j/(y_j + ··· + y_n)] / [1 - (n - j + 1)y_{j-1}/(y_j + ··· + y_n)] }^{n-j},   j = 1, ..., n - 1      (6.27)

Case III, μ and θ unknown.
First, let w_1, ..., w_{n-1} be the sample with x_(1) deleted and put w'_i = w_i - x_(1), i = 1, ..., n - 1. Then the transformations for this case are given by (6.26) with x'_i replaced by w'_i and n by n - 1.
Next, put y_{i-1} = x_(i) - x_(1), i = 1, ..., n. Then the transformations for this case are given by (6.27) with these y's, and n replaced by n - 1.
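The partial-sum form (6.26) is easily computed recursively. The sketch below is ours; it handles Case II directly, and Case III follows by first deleting x_(1), subtracting it, and replacing n by n - 1, as described above.

```python
def exponential_cpit(x, mu0):
    """Eq. (6.26) for the exponential family, theta unknown, mu = mu0 known:
    u_{j-1} = 1 - (S_{j-1}/S_j)^(j-1), where S_j = x'_1 + ... + x'_j and
    x'_i = x_i - mu0."""
    xp = [xi - mu0 for xi in x]
    u, s = [], xp[0]
    for j in range(2, len(xp) + 1):
        s_prev, s = s, s + xp[j - 1]
        u.append(1.0 - (s_prev / s) ** (j - 1))
    return u

u = exponential_cpit([1.0, 1.0, 2.0], 0.0)
```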

6.5.4 Pareto Distributions

Densities:

    f(x; α, γ) = (α γ^α / x^{1+α}) I_(γ,∞)(x),   α > 0, γ > 0               (6.28)

If the transformed values ln x_i, i = 1, ..., n, are considered, then these transformed values are a sample from the exponential family of (6.24) with θ = 1/α and μ = ln γ.

Case I, γ unknown, α = α_0 known.
Let w_1, ..., w_{n-1} be the sample with x_(1) deleted.

    u_j = 1 - (x_(1)/w_j)^{α_0}                                             (6.29)

for j = 1, ..., n - 1.

Case II, α unknown, γ = γ_0 known.
Put x'_i = ln x_i - ln γ_0, i = 1, ..., n. Then the CPIT transforms are given by (6.26) using these x'_i's.
Put y_i = ln x_(i) - ln γ_0, i = 1, ..., n. Then the order statistics CPIT is given by (6.27) using these y's.

Case III, α and γ unknown.
Let w_1, ..., w_{n-1} be the sample with x_(1) deleted, as above in Case I, and put x'_i = ln w_i - ln x_(1) for i = 1, ..., n - 1. Then the CPIT transforms are given by (6.26) using these x'_i's and n replaced by n - 1.
To obtain the order statistics transforms for this case put y_{j-1} = ln x_(j) - ln x_(1), j = 1, ..., n, and use these y's in (6.27) with n replaced by n - 1.
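Since ln x_i - ln γ_0 reduces the Pareto model to the exponential model, Case II can be handled by a log transform followed by (6.26). A small sketch of ours:

```python
import math

def pareto_cpit_case2(x, gamma0):
    """Pareto, alpha unknown, gamma = gamma0 known: apply (6.26) to
    x'_i = ln x_i - ln gamma0, which is a sample from the exponential
    family (6.24) with mu = 0 and theta = 1/alpha."""
    xp = [math.log(xi) - math.log(gamma0) for xi in x]
    u, s = [], xp[0]
    for j in range(2, len(xp) + 1):
        s_prev, s = s, s + xp[j - 1]
        u.append(1.0 - (s_prev / s) ** (j - 1))
    return u
```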

6.5.5 Normal Distributions

Densities:

    f(x; μ, σ²) = (1/(σ√(2π))) exp{-(x - μ)²/2σ²}                           (6.30)

for -∞ < μ < ∞ and σ² > 0.

Case I, μ unknown, σ² = σ²_0 known.
Let x̄_j = (x_1 + ··· + x_j)/j for j = 1, ..., n, and let Φ denote the df of a N(0,1) distribution.

    u_{j-1} = Φ{[(j - 1)/j]^{1/2}(x_j - x̄_{j-1})/σ_0}                       (6.31)

for j = 2, ..., n.

Case II, σ² unknown, μ = μ_0 known.
For this case only we put

    S²_j = (1/j) Σ_{i=1}^{j} (x_i - μ_0)²,   j = 1, ..., n

and let G_v denote a Student-t distribution function with v degrees of freedom. Then

    u_{j-1} = G_{j-1}{(x_j - μ_0)/S_{j-1}},   j = 2, ..., n                 (6.32)

This can be generalized somewhat, as follows. Suppose that S²_ν is a mean square estimator of σ² that is independent of the x's, and that νS²_ν/σ² has a χ²(ν) distribution. Then we put

    S*_{j-1} = {[(j - 1)S²_{j-1} + νS²_ν]/(ν + j - 1)}^{1/2}

and

    u_{j-1} = G_{ν+j-1}{(x_j - μ_0)/S*_{j-1}},   j = 2, ..., n              (6.32')

Case III, μ and σ² unknown.
For this case put

    x̄_j = (1/j) Σ_{i=1}^{j} x_i,   S²_j = {1/(j - 1)} Σ_{i=1}^{j} (x_i - x̄_j)²

Then

    u_{j-2} = G_{j-2}{[(j - 1)/j]^{1/2}(x_j - x̄_{j-1})/S_{j-1}},   j = 3, ..., n      (6.33)

Again, let S²_ν be an independent mean square estimator of σ² such that νS²_ν/σ² is a χ²(ν) rv. Then put

    S*_{j-1} = {[(j - 2)S²_{j-1} + νS²_ν]/(ν + j - 2)}^{1/2}

and

    u_{j-2} = G_{ν+j-2}{[(j - 1)/j]^{1/2}(x_j - x̄_{j-1})/S*_{j-1}},   j = 3, ..., n   (6.33')

In computing the above quantities it is helpful to use the following updating formulas. For x̄_j as defined above, and for

    (SS)_j = Σ_{i=1}^{j} (x_i - x̄_j)²

the updating formulas are

    x̄_j = [(j - 1)x̄_{j-1} + x_j]/j,   (SS)_j = (SS)_{j-1} + [j/(j - 1)](x_j - x̄_j)²      (6.34)

Youngs and Cramer (1972) and Chan, Golub, and LeVeque (1983) discussed these formulas; the latter authors were primarily concerned with numerical accuracy.
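Case III of (6.33), together with the updating formulas (6.34), gives a simple one-pass algorithm. The sketch below is ours, not from the chapter; it uses a Simpson-rule integration of the Student-t density as a stand-in for G_v (in practice a library routine such as scipy.stats.t.cdf would be used), and it assumes the data are non-degenerate so that S_{j-1} > 0.

```python
import math

def t_cdf(t, df, steps=1000):
    """Student-t df G_df(t), by Simpson integration of the density
    (a stand-in for a library routine such as scipy.stats.t.cdf)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    f = lambda x: c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    b = abs(t)
    h = b / steps
    area = (f(0.0) + f(b)
            + 4 * sum(f((2 * i - 1) * h) for i in range(1, steps // 2 + 1))
            + 2 * sum(f(2 * i * h) for i in range(1, steps // 2))) * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def normal_cpit_case3(x):
    """Eq. (6.33): u_{j-2} = G_{j-2}{ sqrt((j-1)/j)(x_j - xbar_{j-1})/S_{j-1} },
    j = 3, ..., n, with xbar_j and (SS)_j carried along by the updating
    formulas (6.34)."""
    u = []
    xbar, ss = x[0], 0.0                                   # xbar_1 and (SS)_1
    for j in range(2, len(x) + 1):
        xj = x[j - 1]
        if j >= 3:
            s = math.sqrt(ss / (j - 2))                    # S_{j-1}
            u.append(t_cdf(math.sqrt((j - 1) / j) * (xj - xbar) / s, j - 2))
        xbar_new = ((j - 1) * xbar + xj) / j               # (6.34)
        ss += (j / (j - 1)) * (xj - xbar_new) ** 2         # (6.34)
        xbar = xbar_new
    return u
```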

6.5.6 Lognormal Distributions

Densities:

    f(x; μ, σ²) = (xσ√(2π))⁻¹ exp{-(ln x - μ)²/2σ²}                         (6.35)

for -∞ < μ < ∞ and σ² > 0.
The values ln x_1, ..., ln x_n constitute a sample from the normal family of (6.30), and transformations for the three cases can be obtained by replacing (x_1, ..., x_n) by (ln x_1, ..., ln x_n) in the transformations for the corresponding normal cases above.

6.5.7 Normal Linear Regression Models

For y = (y_1, ..., y_n)' a vector of independent rv's, X_n an n × p matrix of full rank (n > p), and β = (β_1, ..., β_p)' a vector of p parameters, we consider the family of normal distributions

    y_j ~ N(x'_j β, σ²),   j = 1, ..., n                                    (6.36)

for x'_j the jth row of X_n. The matrix X_j denotes the matrix consisting of the first j rows of X_n, and y_(j) = (y_1, ..., y_j)' the corresponding subvector of y. We shall give transformations here for two cases, viz., for σ² known and unknown.

Case I, σ² = σ²_0 known.
For this case it is readily verified that the statistic t_j = X'_j y_(j) is a complete sufficient statistic for β. Moreover, the UMVU estimating distribution function is itself the df of a normal distribution with

    mean = x'_j (X'_j X_j)⁻¹ X'_j y_(j),   variance = σ²_0 [1 - x'_j (X'_j X_j)⁻¹ x_j]

Using this result with Theorem 6.4 gives the following results. Put

    A_j = {y_j - x'_j (X'_j X_j)⁻¹ X'_j y_(j)} / {σ_0 [1 - x'_j (X'_j X_j)⁻¹ x_j]^{1/2}}      (6.37)

and

    u_{j-p} = Φ(A_j),   j = p + 1, ..., n                                   (6.38)

Certain well-known updating formulas [Plackett (1950), Bartlett (1951)] are very convenient for the computation of A_j as well as for the development below. For

    b_j = (X'_j X_j)⁻¹ X'_j y_(j),  the least squares estimate of β from the first j observations

    S²_j = y'_(j) [I - X_j (X'_j X_j)⁻¹ X'_j] y_(j),  the usual least squares sum of squares for residuals from the first j observations,

the updating formulas are

    (X'_j X_j)⁻¹ = (X'_{j-1} X_{j-1})⁻¹ - {(X'_{j-1} X_{j-1})⁻¹ x_j x'_j (X'_{j-1} X_{j-1})⁻¹} / {1 + x'_j (X'_{j-1} X_{j-1})⁻¹ x_j}      (6.39)

    b_j = b_{j-1} + (X'_j X_j)⁻¹ x_j (y_j - x'_j b_{j-1})                   (6.40)

    S²_j = S²_{j-1} + σ²_0 A²_j                                             (6.41)

Using these relations, A_j can be written in the alternative form

    A_j = (y_j - x'_j b_{j-1}) / {σ_0 [1 + x'_j (X'_{j-1} X_{j-1})⁻¹ x_j]^{1/2}}      (6.37')

Put w_j = σ_0 A_j. These w_j's are the quantities sometimes called recursive residuals, and they have been considered by a number of writers including Hedayat and Robson (1970); Brown, Durbin, and Evans (1975); and, recently, Galpin and Hawkins (1984). These writers have generally not assumed that σ² is known. From the form (6.37') the w_j's can be shown to be i.i.d. N(0, σ²), which also follows easily from the CPIT results given above.

Case II, σ² unknown.
From O-Q, Example 4.3, we put

    B_j = (j - p - 1)^{1/2} (y_j - x'_j b_j) / {[1 - x'_j (X'_j X_j)⁻¹ x_j] S²_j - (y_j - x'_j b_j)²}^{1/2}      (6.42)

and

    u_{j-p-1} = G_{j-p-1}(B_j),   j = p + 2, ..., n                         (6.43)

Using the updating formulas again, we express B_j in the alternative form

    B_j = (j - p - 1)^{1/2} (y_j - x'_j b_{j-1}) / {[1 + x'_j (X'_{j-1} X_{j-1})⁻¹ x_j] S²_{j-1}}^{1/2}      (6.42')

Note that the quantity (y_j - x'_j b_{j-1}) in the numerator of (6.37') and (6.42') is the residual of y_j from the least squares line fitted using the first j - 1 points. This is a normal rv with mean zero, and by examining (6.37') and (6.42'), we see that A_j and B_j are the standardized and Studentized forms of this rv, respectively. Moreover, by again using the updating formulas, we can show that

    B_j = (j - p - 1)^{1/2} A_j / (A²_{p+1} + ··· + A²_{j-1})^{1/2} = (j - p - 1)^{1/2} w_j / (w²_{p+1} + ··· + w²_{j-1})^{1/2}      (6.44)

for w_j the recursive residuals defined above.
It should be noted that the formulas given above in Section 6.5.5 for the univariate normal distribution are, of course, special cases of the formulas given in this section for the univariate regression model.
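The recursions (6.39)-(6.40) make the recursive residuals w_j of (6.37') computable in a single pass. The following plain-Python sketch is ours (a real implementation would normally use numpy); it assumes X has full rank and that its first p rows form a nonsingular matrix.

```python
import math

def recursive_residuals(X, y):
    """Recursive residuals w_j = (y_j - x_j' b_{j-1}) /
    sqrt(1 + x_j'(X'_{j-1} X_{j-1})^{-1} x_j), j = p+1, ..., n, as in (6.37');
    b and (X'X)^{-1} are carried along with the updating formulas (6.39)-(6.40).
    Under model (6.36) the w_j are i.i.d. N(0, sigma^2)."""
    n, p = len(X), len(X[0])
    # (X_p' X_p)^{-1} by Gauss-Jordan elimination on the first p rows.
    A = [[sum(X[k][i] * X[k][j] for k in range(p)) for j in range(p)]
         for i in range(p)]
    M = [[float(i == j) for j in range(p)] for i in range(p)]
    for col in range(p):
        piv_row = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv_row] = A[piv_row], A[col]
        M[col], M[piv_row] = M[piv_row], M[col]
        piv = A[col][col]
        A[col] = [a / piv for a in A[col]]
        M[col] = [m / piv for m in M[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[col])]
                M[r] = [m - f * mc for m, mc in zip(M[r], M[col])]
    b = [sum(M[i][j] * sum(X[k][j] * y[k] for k in range(p)) for j in range(p))
         for i in range(p)]
    w = []
    for j in range(p, n):                  # 0-based rows p+1, ..., n
        xj = X[j]
        Mx = [sum(M[i][k] * xj[k] for k in range(p)) for i in range(p)]
        c = sum(xj[i] * Mx[i] for i in range(p))       # x_j' M x_j
        resid = y[j] - sum(xj[i] * b[i] for i in range(p))
        w.append(resid / math.sqrt(1.0 + c))
        for i in range(p):                 # update M, eq. (6.39)
            for k in range(p):
                M[i][k] -= Mx[i] * Mx[k] / (1.0 + c)
        Mx_new = [sum(M[i][k] * xj[k] for k in range(p)) for i in range(p)]
        for i in range(p):                 # update b, eq. (6.40)
            b[i] += Mx_new[i] * resid
    return w
```

With the mean-only model (a single column of ones) this reproduces the univariate normal Case I quantities of (6.31), as the closing remark above notes.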

6.5.8 Normal Analysis of Variance Model

The material in this section is largely from Quesenberry, Giesbrecht, and Burns (1983) (QGB). We consider k mutually independent samples, x_ij, i = 1, ..., k, j = 1, ..., n_i, as in (6.4), and the family of distributions

    x_ij ~ N(μ_i, σ²),   i = 1, ..., k                                      (6.45)

In words, this is a fixed-effects, one-way, normal errors, analysis-of-variance model. Define

    n = n_1 + ··· + n_k,   ν_ij = n_1 + ··· + n_{i-1} + j - i - 1

    x̄_ij = (x_i1 + ··· + x_ij)/j,   SS_ij = Σ_{m=1}^{j} (x_im - x̄_ij)²

and for ν_ij > 0,

    A_ij = [(j - 1)/j]^{1/2}(x_ij - x̄_{i(j-1)}) / {(SS_{1n_1} + ··· + SS_{(i-1)n_{i-1}} + SS_{i(j-1)})/ν_ij}^{1/2}      (6.46)

(Remark: Readers familiar with the QGB paper will note that the updating formulas in (6.34) above have been used to rewrite the formula given for A_ij in that paper in a more convenient form.)

Case I, σ² = σ²_0 known.

    u_ij = Φ{[(j - 1)/j]^{1/2}(x_ij - x̄_{i(j-1)})/σ_0}                      (6.47)

for i = 1, ..., k and j = 2, ..., n_i.

Case II, σ² unknown.

    u_ij = G_{ν_ij}(A_ij)                                                   (6.48)

for i = 1 and j = 3, ..., n_1, and i = 2, ..., k, j = 2, ..., n_i.

A case can arise which is, in a sense, intermediate between Cases I and II above. Suppose there is an external mean square estimator S²_ν of σ² available such that νS²_ν/σ² is a χ²(ν) rv. Then put

    S*_ij = {(SS_{1n_1} + ··· + SS_{(i-1)n_{i-1}} + SS_{i(j-1)} + νS²_ν)/(ν_ij + ν)}^{1/2}

and

    u_ij = G_{ν_ij+ν}{[(j - 1)/j]^{1/2}(x_ij - x̄_{i(j-1)})/S*_ij}           (6.49)

The question can be raised as to what is to be gained by using (6.49) rather than (6.48) since, under the null hypothesis model, we obtain the same number of i.i.d. U(0,1) rv's. The answer is that when some of the model assumptions are incorrect the anomalous data patterns resulting should be more distinct and analyses based on them more sensitive. Another point is that even under the null hypothesis the dependence of the u's upon the ordering of the data is weakened. This point can be seen by observing in (6.49) that as ν → ∞ the distribution function in (6.49) converges to Φ, and thus (6.47) is the limiting case of (6.49).
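When σ² is known, (6.47) is a direct within-sample analogue of (6.31). A sketch of ours, using the standard normal df from the Python standard library:

```python
import math
from statistics import NormalDist

def anova_cpit_sigma_known(samples, sigma0):
    """Eq. (6.47): u_ij = Phi{ sqrt((j-1)/j)(x_ij - xbar_{i(j-1)}) / sigma0 },
    computed within each of the k samples for j = 2, ..., n_i."""
    phi = NormalDist().cdf
    u = []
    for x in samples:
        xbar = x[0]                          # running mean xbar_{i(j-1)}
        for j in range(2, len(x) + 1):
            u.append(phi(math.sqrt((j - 1) / j) * (x[j - 1] - xbar) / sigma0))
            xbar = ((j - 1) * xbar + x[j - 1]) / j
    return u

u = anova_cpit_sigma_known([[0.0, 2.0], [5.0, 5.0]], 1.0)
```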

6.6 NUMERICAL EXAMPLES

6.6.1 Introduction

In this section we use some of the transformations of the last section in numerical examples. In all of these examples the computations were performed using programs written by the author for this purpose.

6.6.2 Salinity Data

A large scale program by North Carolina State University to study environmental impact in the Cape Fear Estuary includes a sampling over time by weeks of the larval density in the estuary. Accompanying data include salinity measured in parts per thousand (ppt) at the time of collection of the larval specimens. The salinity data consist of several samples; each sample consists of measurements made over a relatively short time period (approximately 24 hours), and the samples are taken at intervals of at least one month apart. It appears reasonable to consider each sample to be an independent random sample from a common parent distribution, and to assume that the functional form of the parent density is the same for all samples. However, the means and variances can be expected to vary widely from sample to sample.
There are available six samples, given in Table 6.3, for studying the functional form of the parent distributions.

TABLE 6.3

                     Sample Number
Obs.     1      2      3      4      5      6
 1     18.3   19.8   11.1   27.0   13.1    9.1
 2     16.9   22.0   12.7   26.4   12.6   13.6
 3     15.9   22.5   10.2   28.5   18.7   16.4
 4     16.0   21.3    8.6   25.5   19.4   14.3
 5     15.8   20.0    7.9   28.1   17.4   14.0
 6     14.5   19.1   10.9   26.3   16.9   10.2
 7     18.6   18.8   13.7   26.6   15.5   11.7
 8     21.8   22.4   12.4   24.5   16.7   12.6
 9     18.3   22.9   12.9   27.8   17.7   12.0
10     18.5   21.5   10.1   28.7          11.1
11     17.1   21.4    9.9   28.7
12     16.6   20.9          26.8

Normal Analysis

We consider the null hypothesis H_0 of (6.5) with F_i the df of a N(μ_i, σ²_i) distribution; i.e., we consider the null hypothesis that each sample is from a normal parent, but allow each of these parents to have different means and variances (cf. Quesenberry, Whitaker, and Dickens (1976)).
We have transformed each of the samples of Table 6.3 using the transformations of equation (6.33). Since a sample of size n_i gives n_i - 2 transformed values, there are a total of N = n_1 + ··· + n_6 - 12 = 54 transformed u-values. These 54 u-values are given in Table 6.4, and are graphed against the expected values (i/(N + 1)) = (i/55) in Figure 6.1.

TABLE 6.4 Pooled and Ranked u-Values, Normal Analysis

.0380  .1431  .2625  .3487  .4572  .5908  .6861  .7918  .9080
.0734  .1630  .2676  .3884  .4716  .6030  .6989  .8066  .9106
.0988  .1823  .2892  .3894  .4779  .6196  .7223  .8197  .9274
.1012  .1972  .2918  .3971  .5638  .6288  .7607  .8680  .9366
.1159  .2177  .3139  .4201  .5690  .6380  .7797  .8770  .9765
.1233  .2450  .3473  .4364  .5705  .6641  .7908  .8926  .9922

U²_MOD = 0.013     p²_4 = 1.631

FIGURE 6.1 Salinity data. Normal analysis.

The values of the modified Watson statistic U²_MOD and the Neyman smooth statistic p²_4 are both much too small for significance at the 10 percent level. The p²_4 statistic is approximately a χ²(4) rv and the observed significance level is

    P(p²_4 > 1.631) = P(χ²(4) > 1.631) = 0.80

Consider Figure 6.1 and recall that the points should approximate the line if the underlying parent family is normal. The fit is excellent, and from this and the small values of U²_MOD and p²_4 we conclude that normal distributions fit these data very well, indeed.
Even though we have concluded that normal distributions fit these data well, we shall in the remainder of this subsection consider fitting two other two-parameter families to the data in order to illustrate the use of the transformations.

Uniform Analysis

We have transformed each of the samples of Table 6.3 using the transformations of (6.23) for a two-parameter uniform family. The 54 u-values obtained are given in Table 6.5 and are graphed against the null hypothesis expected values in Figure 6.2. This uniform probability plot is typical of the pattern we obtain when normal data are transformed using the transformations for a uniform family. This S-shaped pattern can be anticipated from the discussion of data patterns in subsection 6.4.2, since both distributions are symmetric and, on the range of the sample, the normal density has thinner tails near the extremes and is higher than the uniform density in the center.

TABLE 6.5 Pooled and Ranked u-Values, Uniform Analysis

.0732  .2055  .3288  .3973  .5122  .5517  .6324  .7500   .8621
.0735  .2381  .3448  .4265  .5172  .5616  .6341  .7759   .8780
.1207  .2439  .3562  .4286  .5205  .5952  .6585  .7805   .8971
.1507  .2740  .3562  .4524  .5205  .6029  .6712  .7857   .9024
.1781  .2877  .3793  .4795  .5476  .6098  .7059  .8276   .9525
.1918  .2927  .3966  .5000  .5479  .6164  .7123  .8571  1.0000

p²_4 = 6.610

FIGURE 6.2 Salinity data. Uniform analysis.

The value of U²_MOD in Table 6.5 falls between the upper 10 and 5 percent points, and the observed significance level of p²_4 is

    P(p²_4 > 6.610) = P(χ²(4) > 6.610) = .158

From these statistics and from Figure 6.2 it is clear that the uniform distributions do not fit the data well, and certainly not as well as normal distributions. It should be borne in mind here that the uniform distribution as an alternative to normality is one that is rather difficult to detect.

Exponential Analysis

We have transformed each of the samples of Table 6.3 using the transformations of (6.27) for two-parameter exponential distributions; see Case III in subsection 6.5.3. The 54 pooled and ranked values obtained from these transformations are given in Table 6.6. These values are plotted in Figure 6.3 against the expected values. This probability plot is typical of the pattern obtained when normal data are transformed using exponential family transformations; recall the discussion in subsection 6.4.2.

TABLE 6.6 Pooled and Ranked u-Values, Exponential Analysis

.1199  .3843  .4254  .4857  .5489  .6009  .6753  .7266  .7952
.1780  .3871  .4318  .5182  .5550  .6186  .6870  .7301  .8066
.1790  .3967  .4397  .5313  .5617  .6189  .7073  .7574  .8102
.2864  .4138  .4546  .5322  .5719  .6294  .7131  .7606  .8233
.3345  .4146  .4684  .5365  .5926  .6439  .7174  .7619  .9134
.3539  .4175  .4771  .5386  .6000  .6725  .7256  .7794  .9242

U²_MOD = .747     p²_4 = 28.260

This value of U²_MOD is highly significant since the upper 1 percent point for U²_MOD is .267. Moreover, P(p²_4 > 28.260) = P(χ²(4) > 28.260) = 0.00001. From these values and Figure 6.3 it is clear that the two-parameter exponential distributions fit these data very poorly.

6.6.3 Simulated Data

The appendix gives three samples of 100 observations each that have been drawn from normal, exponential, and uniform distributions. These samples are named NOR, NEX, and UNI, which we shall abbreviate further to N, E, and U in this section. For each of these three samples we have computed 98 u-values using the appropriate transformations for two-parameter normal, exponential, and uniform families. This gives nine sets of u-values. We have partitioned the (0,1) interval into ten subintervals of equal lengths, and denote by M_i the number of u's that fall in the subinterval ((i - 1)/10, i/10] for i = 1, ..., 10 for each set of u-values. These values and those for U²_MOD, p²_4, and the observed significance level of p²_4, called Sig., are given in Table 6.7. The label (N,E) on row 4 means the normal sample was subjected to exponential transformations, for example.
The first three rows of Table 6.7 give results when the sample from a distribution is subjected to the transformations for a class to which that distribution belongs. The expected cell frequencies are 9.8, and none of the test statistics is significant at any of the usual levels. The other six rows all give level .05 significant tests for both U²_MOD and p²_4. All except (U,N) have observed significance levels for both statistics that are very small, indeed. A more detailed analysis could be carried out by plotting the 98 u-values for each case as was done in the last example in Figures 6.1, 6.2, and 6.3.

TABLE 6.7 Analysis of Simulated Samples

        M1  M2  M3  M4  M5  M6  M7  M8  M9  M10  U²_MOD    p²_4      Sig.
(N,N)   12  12   5  10   7   8  17   3  13   11   .068      0.818    .94
(U,U)   10  10  14  10   5   9   8   8  11   13   .040      1.547    .82
(E,E)   13  11  10  10  10   8  10   5  13    8   .093      3.521    .47
(N,E)    0   3   4   9  12  15  27  20   8    0  1.587     63.041   2.4E(-12)
(U,N)   14  15   9  11   9   7   1   8  18    6   .221     13.174    .01
(E,U)   54  21  12   7   3   0   1   0   0    0  3.340    281.416   4.5E(-12)
(N,U)    3   8  13  12  18  19   9  10   5    1   .770     30.046   4.8E(-6)
(U,E)    8   5   8  10   9  13  13  21  11    0   .416     22.083   1.9E(-4)
(E,N)    0  12  27  17   7   8   5   7   6    9   .767     42.259   1.5E(-8)

TABLE 6.8

              Sample A                               Sample B
          Ranked u-values                        Ranked u-values
Data   Normal  Uniform  Exponential    Data   Normal  Uniform  Exponential
210                                    196
190                                    236
182    .060    .053     .111           246    .022    .147     .184
230    .067    .105     .191           187    .024    .216     .281
236    .077    .228     .356           193    .083    .224     .318
214    .145    .263     .386           231    .122    .319     .411
246    .244    .298     .404           199    .161    .405     .418
186    .365    .298     .500           147    .165    .422     .456
168    .419    .351     .563           177    .225    .457     .489
162    .524    .404     .604           232    .263    .491     .496
196    .563    .474     .625           155    .273    .491     .531
226    .731    .509     .715           195    .296    .543     .534
156    .789    .561     .758           179    .372    .560     .562
202    .825    .614     .758           208    .422    .569     .590
190    .868    .649     .761           167    .445    .595     .616
236    .891    .649     .768           130    .616    .629     .643
220    .901    .702     .777           225    .621    .672     .654
320    .909    .702     .798           183    .724    .819     .667
242    .918    .754     .801           187    .727    .871     .669
270    .980    .789     .868           203    .809    .879     .735
                                       156    .822    .914     .744

6.6.4 The Bliss Data

The appendix gives data from Bliss (1967) on the body weight in grams of
21-day-old white leghorn chicks at two dosage levels of vitamin D. We have
randomized each of the samples labeled series A and series B. Table 6.8
gives the data in the order in which it was analyzed here. Table 6.8 gives
also the transformed u-values when the samples are subjected to two-
parameter normal, uniform, and exponential analyses. The test statistics
U²_MOD and P4², and the observed significance level, Sig., of P4², have
been computed for each sample as well as for the pooled samples and are
given in Table 6.9. Graphs of the pooled u-values are given in Figures 6.4,
6.5, and 6.6 for the normal, uniform, and exponential classes, respectively.
268 QUESENBERRY

TABLE 6.9 Test Statistics for Bliss Data

Transformation   Sample   U²_MOD   P4²      Sig.

Normal           A        .128      4.707   .32
                 B        .060      4.329   .36
                 Pooled   .068      3.257   .52

Uniform          A        .110      4.347   .36
                 B        .135      4.269   .37
                 Pooled   .200      7.634   .11

Exponential      A        .224      9.852   .04
                 B        .423     15.550   .004
                 Pooled   .470     18.333   .001
FIGURE 6.5 Bliss data. Uniform analysis.



TABLE 6.10 Ranked u-Values for Fisher Data

.0001  .0870  .3685  .6630
.0107  .1419  .4275  .7599
.0163  .1532  .4384  .7848
.0414  .2273  .5188  .9046
.0635  .2275  .5899  .9640
.0636  .2702  .5946  .9842
.0768  .2940  .6403

U²_MOD = 0.117     P4² = 12.16

The probability plot in Figure 6.6 as well as the test statistics for the
exponential case in Table 6.9 easily eliminate the exponential class from
consideration for fitting these data. Comparison of the graphs in Figures
6.4 and 6.5 suggests that the normal class fits better than the uniform class,
and this conclusion is supported by the values of U²_MOD and P4² computed for
the pooled u-values in Table 6.9. The U²_MOD statistic is significant at the
.05 level for the uniform class but is much too small for significance at even
the .10 level for the normal class. The probability plot of Figure 6.4 suggests
that the true distribution (assuming that the i.i.d. assumptions hold,
of course) has a density with slightly thicker tails than a normal density, but
it does look fairly symmetric. Figure 6.5 suggests that the true density has
thinner tails on its interval of support (where the density is positive) than the
tails of the uniform density on its interval of support, but again the density
appears to be symmetric.

6.6.5 Regression Data

In this section we give three examples to illustrate the use of the transforma­
tions of (6.37) to test the regression model assumptions of (6.36). It should
be borne in mind that we are testing all of the assumptions of the normal
linear regression model.

Fisher's Data

In this example we consider data given by Fisher (1958, p. 137) on the
relative effects of two nitrogenous fertilizers in maintaining yields in bushels of
wheat over a period of 30 years. Fisher uses a simple linear normal regres­
sion model for these data and suggests that a more complex (curved) regres­
sion line might give a better fit.
We have applied the regression transformations of (6.43) to these data
in the order that they are given by Fisher. Here n = 30 and p = 2, and the
number of u-values is the number of observations less the number of param­
eters in the model (two regression coefficients and the variance), i.e.,
N = n - p - 1 = 30 - 3 = 27. The ranked u-values and test statistics U²_MOD
and P4² are given in Table 6.10, and the ranked u-values are plotted against
expected values in Figure 6.7.
The value U²_MOD = 0.117 is much less than the upper 10 percent point
of 0.152 given in Table 6.1, and is not significant at this level. Using the
chi-squared approximation to the distribution of P4², we obtain P(P4² > 12.16) =
P(χ²(4) > 12.16) = 0.016, and this value is significant at the 2 percent level
but not at 1 percent. The graph of Figure 6.7 also raises doubts about the
appropriateness of the normal linear model. This pattern suggests that the
error distribution is not symmetric and is skewed to the right.
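The chi-squared arithmetic here is easy to check: for an even number of degrees of freedom 2k, the survival function has the closed form P(χ²(2k) > x) = e^(-x/2) Σ_{j=0}^{k-1} (x/2)^j/j!. A quick check in Python (the helper name is ours, not from the text):

```python
import math

def chi2_sf_even_df(x, df):
    """Survival function P(chi-square(df) > x) for even df,
    via the closed form exp(-x/2) * sum (x/2)**j / j!."""
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j)
                                 for j in range(df // 2))

# P4^2 = 12.16 on 4 df, as for the Fisher data above:
p_fisher = chi2_sf_even_df(12.16, 4)
# P4^2 = 9.464 on 4 df, as quoted for the Snedecor-Cochran data:
p_sc = chi2_sf_even_df(9.464, 4)
```

With 4 degrees of freedom this reproduces the 0.016 obtained here, and the .05 level quoted for P4² = 9.464 in the Snedecor-Cochran example.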

TABLE 6.11 U-Values for Wallace-Snedecor Data

.029  .266  .687  .933
.034  .350  .717  .961
.106  .534  .819  .992
.200  .603  .826
.262  .641  .886

U²_MOD = .063     P4² = 2.389

FIGURE 6.8 Wallace-Snedecor data. Regression analysis.



TABLE 6.12 U-Values for Snedecor-Cochran Data

.057  .147  .211  .453
.101  .179  .258  .498
.138  .201  .429  .631

U²_MOD = 0.225     P4² = 9.464

Wallace and Snedecor Data

Wallace and Snedecor (1931) give 25 observations on a dependent variable Y,
and five independent variables (X1, X2, X3, X4, X5). Ostle (1954, p. 220) also
gives these data and uses them in an example to illustrate a standard least
squares analysis of a normal multiple linear regression model. We have
applied the transformations of (6.43) to these data in the order given by Ostle.
Here n = 25 and the number of independent variables is 5, so the number of
u-values is 25 - 7 = 18.
The analysis is given in Table 6.11 and Figure 6.8. The values of
U²_MOD and P4² are much less than their upper 10 percent points, and the pat­
tern in Figure 6.8 does not give cause to suspect the model.

Snedecor-Cochran Data

Snedecor and Cochran (1967, p. 140) give the initial weight x and the gain in
weight y of 15 female rats on a high protein diet, from the 24th to 84th day
of age, and suggest considering a normal linear regression model. The ranked
u-values for these data are given in Table 6.12 and are plotted against ex­
pected values in Figure 6.9.
The number of observations here, 15, is rather small and gives only 12
transformed u-values for testing the model. With so little data only rather
large violations of the model will be detected. Recall that for this small
number of u's many tests are biased for some alternatives. The U²_MOD
statistic is unbiased for all known cases and is our preferred test statistic
here. The observed value of U²_MOD = .225 falls between the upper 5
and 1 percent values, and P(P4² > 9.464) = .05. The pattern of points in
Figure 6.9 also raises doubts about the adequacy of fitting these data with a
normal simple linear regression model. We conclude, even with so few ob­
servations, that the normal simple linear regression model assumptions
are unwarranted.

6.6.6 Further Examples in the Literature

Some of the transformations of Section 6.5 have been used in numerical
examples in the literature. Quesenberry, Whitaker, and Dickens (1976) used
the transformations (6.33) to study the normality of 11 samples of peanut
aflatoxin data. Quesenberry, Giesbrecht, and Burns (1983) used the trans­
formations (6.33) and (6.48) to study the analysis of variance model assump­
tions for small samples (size 4) of tall fescue obtained as part of a forage
management study. The variable considered was neutral detergent fiber.
The CPIT transformations for multivariate normal models were given
by Rincón-Gallardo, Quesenberry, and O'Reilly (1979). Applications of these
formulas for multiple multivariate samples were considered by Rincón-
Gallardo and Quesenberry (1982).

(Author's note: This chapter was written in 1977 and revised in 1984.)

REFERENCES

Bartlett, M. S. (1951). An inverse matrix adjustment arising in discriminant
analysis. Ann. Math. Stat. 22, 107-111.

Basu, D. (1955, 1960). On statistics independent of a complete sufficient
statistic. Sankhya 15, 377-380; 20, 223-226.

Bliss, C. I. (1946). Collaborative comparison of three ratios for the chick
assay of vitamin D. J. Assoc. Off. Agr. Chemists 29, 396-408.

Brown, R. L., Durbin, J., and Evans, J. M. (1975). Techniques for testing
the constancy of the regression relationship over time. J. R. Statist.
Soc. B37, 149-192.

Chan, T. F., Golub, G. H., and LeVeque, R. J. (1983). Algorithms for
computing the sample variance: analysis and recommendations. Amer­
ican Statist. 37(3), 242-247.

Cox, D. R. (1961). Tests of separate families of hypotheses. Proceedings
of the 4th Berkeley Symposium on Mathematical Statistics and Proba­
bility, Vol. 1. Berkeley: University of California Press, 105-123.

Csörgö, M. and Seshadri, V. (1970). On the problem of replacing composite
hypotheses by equivalent simple ones. Rev. Int. Statist. Inst. 38, 351-
368.

Csörgö, M. and Seshadri, V. (1971). Characterizing the Gaussian and expo­
nential laws via mappings onto the unit interval. Z. Wahrscheinlichkeits­
theorie und Verw. Gebiete 18, 333-339.

Csörgö, M., Seshadri, V. and Yalovsky, M. (1973). Some exact tests for
normality in the presence of unknown parameters. J. Roy. Statist. Soc.
B35, 507-522.

David, F. N. and Johnson, N. L. (1948). The probability integral transfor­
mation when parameters are estimated from the sample. Biometrika
35, 182-192.

Durbin, J. (1961). Some methods of constructing exact tests. Biometrika
48, 41-55.

Dyer, A. R. (1974). Hypothesis testing procedures for separate families of
hypotheses. Jour. Am. Stat. Assn. 69, 140-145.

Fisher, R. A. (1930). Inverse probability. Proceedings of the Cambridge
Philosophical Society 26(4), 528-535.

Fisher, R. A. (1932). Statistical Methods for Research Workers. 4th Ed.,
Oliver and Boyd, Ltd.

Galpin, J. S. and Hawkins, D. M. (1984). The use of recursive residuals in
checking model fit in linear regression. American Statist. 38(2), 94-105.

Hedayat, A. and Robson, D. S. (1970). Independent stepwise residuals for
testing homoscedasticity. Jour. Am. Stat. Assn. 65, 1573-1581.

Hester, R. A., Jr. and Quesenberry, C. P. (1984). Analyzing uniform
residuals for heteroscedasticity. Institute of Statistics Mimeo Series
No. 1639, North Carolina State University.

Miller, F. L., Jr. and Quesenberry, C. P. (1979). Power studies of some
tests for uniformity, II. Commun. Statist. B8(3), 271-290.

Neyman, Jerzy (1937). 'Smooth' test for goodness of fit. Skandinavisk Aktu-
arietidskrift 20, 149-199.

O'Reilly, F. and Quesenberry, C. P. (1973). The conditional probability
integral transformation and applications to obtain composite chi-square
goodness of fit tests. Ann. Statist. 1, 74-83.

O'Reilly, F. J. and Stephens, M. A. (1984). Characterizations of goodness
of fit tests. J. R. Statist. Soc. B44(3), 353-360.

Ostle, B. (1954). Statistics in Research. Ames: Iowa State University Press.

Plackett, R. L. (1950). Some theorems in least squares. Biometrika 37,
149-157.

Quesenberry, C. P. (1975). Transforming samples from truncation parame­
ter distributions to uniformity. Comm. in Statist. 4, 1149-1155.

Quesenberry, C. P. and Dietz, E. J. (1983). Agreement probabilities for
some CPIT-Neyman smooth tests. J. Statist. Comput. Simul. 17, 125-
131.

Quesenberry, C. P., Giesbrecht, F. G. and Burns, J. C. (1983). Some
methods for studying the validity of model assumptions for multiple
samples. Biometrics 39, 735-739.

Quesenberry, C. P. and Hales, C. (1980). Concentration bands for uniform­
ity plots. J. Statist. Comput. Simul. 11, 41-53.

Quesenberry, C. P. and Miller, F. L., Jr. (1977). Power studies of some
tests for uniformity. J. Statist. Comput. Simul. 5, 169-191.

Quesenberry, C. P. and Starbuck, R. R. (1976). On optimal tests for sepa­
rate hypotheses and conditional probability integral transformations.
Comm. in Statist. A5(6), 507-524.

Quesenberry, C. P., Whitaker, T. B. and Dickens, J. W. (1976). On testing
normality using several samples: an analysis of peanut aflatoxin data.
Biometrics 32(4), 753-759.

Rincón-Gallardo, S. and Quesenberry, C. P. (1982). Testing multivariate
normality using several samples: applications techniques. Commun.
Statist.—Theor. Meth. 11(4), 343-358.

Rincón-Gallardo, S., Quesenberry, C. P. and O'Reilly, F. J. (1979). Con­
ditional probability integral transformations and goodness-of-fit tests
for multivariate normal distributions. Ann. Statist. 7(5), 1052-1057.

Rosenblatt, M. (1952). Remarks on a multivariate transformation. Ann.
Math. Stat. 23, 470-472.

Sarkadi, K. (1960). On testing for normality. Publications of the Mathemat­
ical Institute of the Hungarian Academy of Sciences, Vol. 3, 269-275.

Sarkadi, K. (1965). On testing for normality. Proceedings of the 5th Berkeley
Symposium on Mathematical Statistics and Probability. Berkeley: Uni­
versity of California Press, 373-387.

Seheult, A. H. and Quesenberry, C. P. (1971). On unbiased estimation of
density functions. Ann. Math. Stat. 42, 1434-1438.

Seshadri, V., Csörgö, M. and Stephens, M. A. (1969). Tests for the expo­
nential distribution using Kolmogorov-type statistics. J. Roy. Statist.
Soc. B31, 499-509.

Snedecor, G. W. and Cochran, W. G. (1967). Statistical Methods. 6th Ed.,
Ames: Iowa State University Press.

Stephens, M. A. (1970). Use of the Kolmogorov-Smirnov, Cramér-von Mises
and related statistics without extensive tables. J. R. Statist. Soc. B32,
115-122.

Stephens, M. A. (1974). EDF statistics for goodness of fit and some compar­
isons. Jour. Am. Stat. Assn. 69, 730-737.

Störmer, Horand (1964). Ein Test zum Erkennen von Normalverteilungen.
Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 2, 420-428.

Wallace, H. A. and Snedecor, G. W. (1930). Correlation and machine calcu­
lation. (Revised edition) Iowa State College Official Publication, Vol.
No. 4.

Watson, G. S. (1961). Goodness-of-fit tests on a circle. Biometrika 48,
109-114.

Youngs, E. A. and Cramer, E. M. (1971). Some results relevant to choice
of sum and sum-of-product algorithms. Technometrics 13, 657-665.

Zacks, S. (1971). The Theory of Statistical Inference. New York: Wiley.


Moment (√b1, b2) Techniques

K. O. Bowman   Oak Ridge National Laboratory, Oak Ridge, Tennessee

L. R. Shenton   University of Georgia, Athens, Georgia

7.1 INTRODUCTION

For the random sample X1, . . . , Xn, with mean

    m1 = (1/n) Σ_{j=1}^{n} X_j

we define the central moments

    m_i = (1/n) Σ_{j=1}^{n} (X_j - m1)^i,   i = 2, 3, 4              (7.1)

The sample skewness (√b1) and kurtosis (b2) are defined as

    √b1 = m3/m2^(3/2)
                                                                     (7.2)
    b2 = m4/m2²

and it is readily seen that they are invariant under origin and scale changes.
The corresponding measures for a specified density are denoted by √β1
and β2. For a normal distribution √β1 = 0, β2 = 3, and in random samples
from it there may be wide variations from these values, especially for small
samples (n < 25). Moreover, the sample may arise from some nonnormal

distribution, such as a uniform, negative exponential, or Weibull, etc. Sym­
metric distributions (or those with non-zero densities extending over negative
and positive variate values) are likely to produce samples with small skew­
ness, whereas distributions corresponding to positive valued random vari­
ables (such as the negative exponential) are likely to produce samples with
large skewness. In sampling from fairly symmetric distributions, one might
expect the kurtosis to reflect the nonnormality. Thus a combination of the
test statistics √b1 and b2 might provide a more comprehensive test than
either taken by itself.
This chapter will be mainly concerned with tests of goodness of fit based
on √b1, b2 in sampling from the normal or other distributions such as mem­
bers of the Pearson system.
The reader may be reminded that of the classical test statistics, Student's
t, F-ratio, correlation coefficient (and perhaps mean and variance), the skew­
ness and kurtosis statistics are the only ones whose distributions in normal
sampling are still not known exactly. However, Mulholland (1977) has
arrived at an approximation to the null distribution of √b1 for samples of at
most 25; this is undoubtedly a breakthrough, although the mathematical ex­
pressions are very complicated, and it seems unlikely that the method can
be applied to sampling from more general populations. In nonnormal sampling
very few exact results are known for the distributions of √b1 and b2 or their
joint distribution.

7.2 NORMAL DISTRIBUTION

Early work goes back to Karl Pearson (1902) who gave in general sampling
expressions for the dominant terms (n⁻¹ asymptotics) in the variances of b1
and b2, and also the correlation ρ(b1, b2). The idea was to use, for example,
√Var(b1) and √Var(b2) as error assessments, but it was far too early in
the development of the subject to consider questions of the validity of the
asymptotics or their uses.
A quarter of a century later an important development came from
E. S. Pearson (1930) who used the work of Fisher (1928) and Wishart (1930)
on k-statistics to develop a Taylor series expansion (in terms of the k-statis­
tic discrepancies k_i - κ_i, i = 2, 3, 4) for √b1 and b2. For example, defining

    y = √{(n - 1) b1/(n - 2)}

Pearson showed for the second and fourth central moments of y (in normal
sampling) that

    μ2(y) = 1 + 6/n + 22/n² + 70/n³ + ···
and

    β2(y) = 3 + 1056/n² + 24132/n³ + ···

the odd moments being zero. He developed similar expressions for the 2nd,
3rd, and 4th central moments of b2, along with E(b2) = 3(n - 1)/(n + 1). To
damp out higher order terms, Pearson used samples of n > 50 for √b1,
n > 100 for b2 so as to assess the lower and upper 1% and 5% of the distri­
butions in normal sampling. Thirty or so years later (Pearson, 1965), he
gave a set of "accepted" percentage points; for a sample of 50 there is no
change in the third d.p. entries for √b1 at the 1%, 5% levels; for b2 and
n = 100 there is no change in the second d.p. entries; in all, quite a remark­
able achievement.
The next step forward came from Fisher (1930) who showed that in
normal sampling the standardized moments m_r/m2^(r/2), r ≥ 3, are distributed
independently of the second moment m2 (Fisher used k-statistic notation).
Thus, for example, E(b1) = E(m3²)/E(m2³) follows from the independence of
m3²/m2³ and m2; here E means the mathematical expectation operator. In
this manner, the exact moments of √b1 and b2 can be found. In fact, Fisher
derived the first six cumulants of √b1, Hsu and Lawley (1939) the fifth and
sixth moments of b2. Later, Geary and Worlledge (1947) gave the seventh
noncentral moment of b2. Some of the coefficients are quite large, one
being a 13-digit integer multiplied by 25515 (the whole expression has had
scant usage to date).
Knowing exact moments, it was a natural development to search for
approximating distributions, reaching out toward percentage points of the
distributions. Four-moment fits were studied by Pearson (1963) and at the
time he had the choice of the Pearson system, the Gram-Charlier series system
based on the normal, and the Johnson S_U translation system (Johnson and
Kotz, 1970). For n ≥ 30, the Student-t density gave an acceptable approxi­
mation for √b1, the criterion being the closeness of agreement between the
standardized sixth and eighth moments for the model and the true values.
Johnson's S_U, although troublesome to fit, seemed to be equally acceptable
to Pearson (1963, p. 106). Recently, D'Agostino (1970) has shown that
Johnson's S_U for n ≥ 8 gives a very acceptable and simple approximation;
in fact

    Z = δ ln {Y/α + √(1 + (Y/α)²)}                                   (7.3)

is approximately a standard normal variate with zero mean and unit standard
deviation, where

    β2(√b1) = 3(n² + 27n - 70)(n + 1)(n + 3)/{(n - 2)(n + 5)(n + 7)(n + 9)}

    W² = -1 + √{2(β2(√b1) - 1)}

    δ = 1/√(ln W),   α = √{2/(W² - 1)}

    Y = √b1 √{(n + 1)(n + 3)/(6n - 12)}
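A direct implementation of (7.3) is straightforward; the sketch below (function names are ours) computes the constants of the S_U fit and the approximate normal deviate from an observed √b1 and n:

```python
import math

def su_constants(n):
    """Constants of Johnson's S_U fit to sqrt(b1) in normal sampling, per (7.3)."""
    beta2 = (3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))
    delta = 1.0 / math.sqrt(math.log(math.sqrt(w2)))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    return beta2, w2, alpha, delta

def z_sqrtb1(sqrt_b1, n):
    """Approximate N(0,1) deviate Z of (7.3) for an observed sqrt(b1)."""
    _, _, alpha, delta = su_constants(n)
    y = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    return delta * math.log(y / alpha + math.sqrt(1.0 + (y / alpha) ** 2))
```

For n = 20 this reproduces the constants quoted later in Example 7.2.3.2: β2(√b1) = 3.5778, W² = 1.2706, α = 2.7187, δ = 2.8899.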


FIGURE 7.1 Contours for the K² test; n = 30(5)65, 85, 100, 120, 150, 200,
250, 300, 500, 1000; n = 20, 25 based on K² (normal sampling).

For the kurtosis the problem is more difficult because the statistic b2
is one-sided (its range being from 0 to ∞) and very skew in general; thus for
n = 25, √β1(b2) = 1.75, β2(b2) = 9.90 in comparison to √β1(√b1) = 0,
β2(√b1) = 3.58. Briefly, Pearson Types VI and IV (or Johnson's S_U) can be
regarded as acceptable approximations to the distributions of b2 for n ≥ 40
according to unpublished work of C. T. Hsu (quoted by Pearson). Hsu pointed
out that for n ≥ 30, the (β1, β2) points for b2 were close to the Type V line
(Pearson, 1963, p. 106). Noting this, Anscombe and Glynn (1975) suggest a
linear function of a reciprocal of a χ²-variate as an approximation to b2 for
n ≥ 30 or so; they do not make any comparisons with Johnson's S_U approxi­
mation. A brief description of the Pearson system is given in Appendix 1.

7.2.1 Omnibus Tests

It is fairly obvious that the behavior of √b1, b2 springs from the values of
√β1, β2 for the population and the sample size. For example, sampling from
a uniform population (√β1 = 0, β2 = 1.8) is not likely to produce large values
of √b1, whereas sampling from a negative exponential (√β1 = 2, β2 = 9)
could result in large √b1 and b2. Thus the skewness and kurtosis statistics
are in general correlated (see 7.5), and although for normal sampling the
correlation is zero, they are still dependent variables (E(√b1 b2) = 0, but
E(b1b2) ≠ E(b1)E(b2)). Put otherwise, there will be situations in which √b1
will dominate the test decision about normality, b2 playing a minor role, and
vice versa. For example, monthly rainfall amounts in certain climates are
well fitted by a negative exponential distribution so that one might expect the
skewness to play a major role in testing for non-normality.
The use of both skewness and kurtosis as a test statistic arose from a
study of D'Agostino and Pearson (1973); background theory is given in 7.6.

7.2.2 The K² Test Statistic

Calculate √b1 and b2 as defined in (7.2). If a pocket computer is available,
it is a simple matter to input X1, . . . , Xn and evaluate the first four sample
moments along with √b1 and b2. We assume for the omnibus test that n ≥ 20.
Plot the couplet (√b1, b2) on the 90% and 95% contour charts (Figures 7.1(a),
7.1(b)). If the point is internal to the appropriate contour (approximate inter­
polation for n should not result in much loss of accuracy in the decision
process), then accept the hypothesis of normality; the reader should be re­
minded that this procedure involves only one test for normality and is not
necessarily error free. Note also that the contours of acceptance are sym­
metric with respect to √b1, the negative half of the diagrams being omitted.
Obviously, the sign of √b1 (m3 may be negative) plays no part in testing
normality.

7.2.3 Numerical Examples

7.2.3.1 Example

Sample values: X_i = i, i = ±1, ±2, . . . , ±m

    (A) m = 10                 (B) m = 25
        n = 20                     n = 50
        m1 = 0                     m1 = 0
        m2 = 38.5                  m2 = 221
        m3 = 0                     m3 = 0
        m4 = 2533.3                m4 = 86145.8
        √b1 = 0                    √b1 = 0
        b2 = 1.71                  b2 = 1.76

Conclusion

(A) is borderline at the 90% level whereas (B) is significant.

Comment: The samples nearly follow discrete uniform distributions,
so the nearness of b2 to 1.8 is not surprising.
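The arithmetic in case (A) can be reproduced with a few lines of Python (the helper name is ours):

```python
def sample_moments(xs):
    """Central moments m1..m4 of (7.1) and kurtosis b2 = m4/m2**2 of (7.2)."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum((x - m1) ** 2 for x in xs) / n
    m3 = sum((x - m1) ** 3 for x in xs) / n
    m4 = sum((x - m1) ** 4 for x in xs) / n
    return m1, m2, m3, m4, m4 / m2 ** 2

# Case (A): X_i = i for i = +/-1, ..., +/-10, so n = 20
xs_a = [i for i in range(-10, 11) if i != 0]
m1, m2, m3, m4, b2 = sample_moments(xs_a)
```

Case (B) works the same way with m = 25.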

7.2.3.2 Example

Sample values are the first 20 of each of the first four data sets in the
appendix.

          NOR         UNI        EXP        LOG

m1        100.37      6.13       4.29       100.90
m2        88.8095     8.1844     16.8497    306.6382
m3        316.32      -8.4149    92.1552    487.247
m4        16803.04    108.81     1317.94    323979.6
√b1       0.378       -0.359     1.33       0.09
b2        2.130       1.624      4.642      3.45

K²  90%   NSᵇ         Sᵃ         Sᵃ         NSᵇ
    95%   NSᵇ         Sᵃ         Sᵃ         NSᵇ

ᵃS = significant.
ᵇNS = not significant.

Comment

It is interesting to see how the single tests using 'v/bj ,Ьз separately would
p erform - Using the D'Agostino approximation fo r N/bj under normality, we
have fo r n = 20 ,

/?2 = 3.5778, = 1.2706, a = 2.7187, ô = 2.8899

and

= 1.2856 Sinh (0.3460 Z) (Z G N (0 , 1))

The .90 and .95 levels of N/bj are 0.589 and 0.772, respectively. Thus E X P
is significant, and the other three not significant. A s fo r Ьг, using D'Agostino
and Pearson (1973, 1974), we find the approximate levels (n = 20):

P .01 .025 .10 .15 .20 .85 .95

Ьг 1.64 1,73 1.94 2.04 2.12 3.40 4.18

W e now see that fo r NOR and UNI the interest is in values of Ьг significantly
lo w er than 3, whereas fo r E X P and LO G , the interest lies in values o f Ьг
la r g e r than 3; thus the tests become directional, a concept introduced by
Pearson . C learly Рг(Ьг < 2.13) is about .20, Рг(Ьг < 1.54) < 0.01,
Рг(Ьг > 3.40) = 0.15, and Рг(Ьг > 4.15) = 0.05 approx. F o r the four cases,
the single test sum m aries a re:

NOR U NI EXP LO G

^ГЬl NS NS S NS

NS S S NS

in good agreement with the omnibus test.
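The quoted .90 and .95 levels of √b1 follow by inverting the fitted relation √b1 = 1.2856 sinh(0.3460 Z) at the corresponding standard normal percentiles; a quick check (the percentile values 1.2816 and 1.6449 are standard ones, not from the text):

```python
import math

# Fitted S_U relation for n = 20: sqrt(b1) = 1.2856 * sinh(0.3460 * Z),
# evaluated at the N(0,1) percentiles for the .90 and .95 levels.
levels = {p: 1.2856 * math.sinh(0.3460 * z)
          for p, z in [(0.90, 1.2816), (0.95, 1.6449)]}
```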

7.2.3.3 Example

Sample values, series A from BLISS (appendix)

    n = 20
    m1 = 209.6
    m2 = 892.24
    m3 = -1201.25
    m4 = 1758724
    √b1 = -0.045
    b2 = 2.21

Conclusion

Not significant at the 90% level for the three tests. Note that under normal
sampling, √b1 has σ = 0.473, β2 = 3.58, and b2 has mean 2.714, σ = 0.761,
√β1 = 1.738, and β2 = 8.54. Evidently, there is a very good chance that
-0.05 < b1 < 0.05. For the kurtosis, using a 4-moment Pearson curve as an
approximant, Pr(2.18 < b2 < 3.05) = 0.5 approx., so the observed value is
quite acceptable.

7.2.3.4 Example

Sample values, first 50 of SU(0, 3) (appendix)

    n = 50
    m1 = -0.024
    m2 = 0.1084
    m3 = -0.00118
    m4 = 0.0401
    √b1 = -0.03
    b2 = 3.42

Conclusion

Not significant for the single tests nor the omnibus. Note that under normal­
ity, √b1 has σ = 0.326, β2 = 3.45, and b2 has mean 2.882, σ = 0.598, √β1 =
1.582, and β2 = 8.42. Clearly, the observed √b1 is not significant. A
4-moment Pearson curve for b2 gives Pr(b2 > 3.63) = 0.1 approx., so that
the observed value is quite acceptable.

7.2.3.5 Example

Sample values are rainfall amounts at Tifton, Georgia for the month of
January over 30 years (1928-1957). Data from U.S. Department of Commerce,
Weather Bureau.

    Rainfall:   1.36   4.25   2.52   4.03   3.50   3.82
    (inches)    5.11   2.09   3.18   4.90   1.44   0.72
                5.41   2.41   2.77   3.94   1.20   5.42
                2.54   5.27   1.74   4.84   1.27   3.35
                6.06   5.55   5.02   5.94   2.57   0.81

    n = 30
    m1 = 3.4343
    m2 = 2.6857
    √b1 = -0.05
    b2 = 1.71

Conclusion

The omnibus test rejects at 90%. Also РгФ г < 1*73) = 0.025, so the kurtosls
test rejects.
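The rainfall summary statistics can be reproduced directly from the listed values; a sketch (variable names ours):

```python
rain = [1.36, 4.25, 2.52, 4.03, 3.50, 3.82,
        5.11, 2.09, 3.18, 4.90, 1.44, 0.72,
        5.41, 2.41, 2.77, 3.94, 1.20, 5.42,
        2.54, 5.27, 1.74, 4.84, 1.27, 3.35,
        6.06, 5.55, 5.02, 5.94, 2.57, 0.81]

n = len(rain)                               # 30 Januaries
m1 = sum(rain) / n
m2 = sum((x - m1) ** 2 for x in rain) / n
m3 = sum((x - m1) ** 3 for x in rain) / n
m4 = sum((x - m1) ** 4 for x in rain) / n
sqrt_b1 = m3 / m2 ** 1.5                    # sample skewness, (7.2)
b2 = m4 / m2 ** 2                           # sample kurtosis, (7.2)
```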

7.2.4 Further Comments

There is a computerized version of the test in Appendix 2. This could be
included as a subroutine in a statistical data package and does not depend on
graphical displays.
Also, we describe in 7.6 the construction of another set of test contours
for normality based on a model for the joint distribution of √b1 and b2. Over­
all, this new set of contours leads to about the same decision as the K² set.
Given the mean, s.d., skewness, and kurtosis of a distribution, approx­
imate percentage points may be quickly evaluated using a Pearson curve fit
(Bowman and Shenton, 1979a, 1979b). The standardized percentiles are given
by rational fraction approximants and are suitable for use on a portable
calculator.

7.3 NONNORMAL SAMPLING

The state of the art for this case is far less developed than for normal sam­
pling. Contours of acceptance under various hypotheses are shown below in
Figures 7.3, 7.4, 7.5, 7.7, and 7.8; for each of these the sample values
√b1, b2 are plotted and judged against the appropriate percent and sample
size. In particular, Figure 7.8 shows 90, 95, and 99 percent contours for
samples of 75 from a skew Type I distribution.
An overview of 90% contours (n = 200) for several Pearson populations
is shown in Figure 7.9. Note the significant changes in area enclosed for a
normal (√β1 = 0, β2 = 3), uniform (√β1 = 0, β2 = 1.8), and a population close
to the Type III line (circled as 8). It is evident that discrimination between
populations using an omnibus test based on √b1 and b2 raises many unsolved
problems.
For a discussion of various aspects of this situation, the reader should
turn to 7.6. The remainder of this chapter deals with theoretical aspects of
the omnibus contours, including evaluation of the moments of √b1 and b2,
their correlation, and the construction of equivalent normal deviates (based
on Johnson's S_U system) for √b1 and b2.

7.4 MOMENTS OF SAMPLE MOMENTS

7.4.1 Series Developments

The independence property of the skewness (and kurtosis) and variance
breaks down in nonnormal sampling so that it is no longer possible to ex­
press, for example, E(√b1) as the ratio of E(m3) to E(m2^(3/2)). Thus we have
to resort to multivariate Taylor expansions. The general problem of evalu­
ating moments of functions of moments has been structured on a recursive
basis by the present authors and D. Sheehan (1971). The flavor of the results
is contained in the following cases:

(a) Univariate: Moments of the mean m1

If

    A_s = E(m1 - μ1')^s
                                                                     (7.4)
    a_s = E(X - μ1')^s

then

    A_s = Σ_{r=1}^{s} A_s^(r) n^(-r)                                 (7.5)

where A_s^(r) refers to the coefficient of n^(-r) in A_s, and A_s^(k) = 0 for k > s,
k < 0. In particular

    A2 = a2/n,   A3 = a3/n²

    A4 = 3a2²/n² + (a4 - 3a2²)/n³
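The particular values above can be checked exactly for a two-point population: if X is Bernoulli(p), then n m1 is binomial, so E(m1 - μ1')⁴ can be summed directly over the binomial distribution and compared with A4 = 3a2²/n² + (a4 - 3a2²)/n³, where a2 = pq and a4 = pq(p³ + q³). A sketch (not from the text):

```python
from math import comb

def a4_of_mean_exact(n, p):
    """E[(m1 - p)**4] for n Bernoulli(p) trials, by direct summation
    over the binomial distribution of the sample total."""
    q = 1.0 - p
    return sum(comb(n, k) * p**k * q**(n - k) * (k / n - p) ** 4
               for k in range(n + 1))

def a4_of_mean_formula(n, p):
    """A4 = 3*a2**2/n**2 + (a4 - 3*a2**2)/n**3 with a2 = pq, a4 = pq(p**3+q**3)."""
    q = 1.0 - p
    a2 = p * q
    a4 = p * q * (p ** 3 + q ** 3)
    return 3 * a2 ** 2 / n ** 2 + (a4 - 3 * a2 ** 2) / n ** 3
```

The two agree to rounding error for any n and p, confirming that the n⁻³ term carries the excess a4 - 3a2².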

(b) Bivariate

    A_{s+1,t}^(k) = Σ_{λ=0}^{s} Σ_{μ=0}^{t} δ_{λ,μ} C(s,λ) C(t,μ)
                        a_{s+1-λ, t-μ} A_{λ,μ}^(k+λ+μ-s-t)           (7.6a)

    A_{s,t+1}^(k) = Σ_{λ=0}^{s} Σ_{μ=0}^{t} δ_{λ,μ} C(s,λ) C(t,μ)
                        a_{s-λ, t+1-μ} A_{λ,μ}^(k+λ+μ-s-t)           (7.6b)

where [(s + t + 1)/2] ≤ k ≤ s + t, A_{λ,μ}^(k) = 0 for k < 0 or k > s + t, δ_{λ,μ} = 1
unless λ = μ = 0 when δ_{0,0} = 0, and [x] indicates the largest integer ≤ x.
A and a have similar meanings as in (a); for example,

    A_{r,s} = E{(m1' - μ1')^r (m2' - μ2')^s}

    a_{r,s} = E{(X - μ1')^r (X² - μ2')^s}

would be a possibility, or similar expressions involving linear sums of non­
central sample moments and the corresponding expressions for samples of
n = 1.
Using this approach, we have set up, using a computer, the first eight
moments of √b1 and the first six moments of b2 up to and including the term
in n⁻⁸ in the sample size. Moments of √b1 involve four-dimensional arrays
(corresponding to the first three sample moments and the sample size); sim­
ilarly those of b2 involve five-dimensional arrays. As for the moments of
the population sampled, the first 40 are needed for the eight moments of √b1,
and the first 44 for the six moments of b2; it is preferable to set up recursive
schemes for these.

7.4.2 Illustrations (Bowman and Shenton, 1975a)

(a) Population: Uniform Distribution (√β1 = 0, β2 = 1.8)

    (μ_{2s+1}(√b1) = 0 by symmetry)

    μ2(√b1) ~ 2.0571/n + 2.2629/n² + 1.8042/n³ + ··· + 5.2943E05/n⁸

    μ4(√b1) ~ 1.2696E01/n² + 4.7687E01/n³ + ··· + 4.6669E05/n⁸

    μ6(√b1) ~ 1.3058E02/n³ + 1.0406E04/n⁴ + ··· + 1.0269E06/n⁸

    μ8(√b1) ~ 1.8804E03/n⁴ + 2.5831E04/n⁵ + ··· + 1.8986E07/n⁸
290 BOWMAN AND SHENTON

(b) Population: Pearson Type I (√β1 = 0.2, β2 = 3.1)

    E(√b1) ~ 0.2 - 1.4211/n + 6.5992/n² + 3.9728E01/n³ + 3.9971E02/n⁴
                 + 5.9343E03/n⁵ + 1.14875E05/n⁶ + 2.7672E06/n⁷ + 8.0796E07/n⁸
    Var(√b1) ~ 6.7573/n + 6.1810E01/n² + 6.9918E02/n³ + . . . + 8.5309E09/n⁸
    μ3(√b1) ~ 5.9716E01/n² + 3.0573E03/n³ + 1.0889E05/n⁴ + . . . + 3.1041E11/n⁸
    μ4(√b1) ~ 1.3698E02/n² + 1.5310E03/n³ + 3.1549E05/n⁴ + . . . + 5.5277E12/n⁸

(c) Population: Normal Mixture pN(ν, σ²) + (1 - p)N(ν′, σ²)

(√β1 = 1.0, β2 = 4.0; p = 0.8706, ν = -0.2961, ν′ = 1.9917, σ² = 0.4103)

    E(b2) ~ 4 - 2.3142/n + 9.7524E01/n² + 9.6665E02/n³ + . . . + 4.0862E08/n⁸
    μ4(b2) ~ 8.5535E03/n² + 5.2878E05/n³ + 6.0489E06/n⁴ + . . . + 5.3930E12/n⁸
    μ6(b2) ~ 2.2836E06/n³ + 5.6775E08/n⁴ + . . . + 2.5635E15/n⁸

(d) Population: Pearson Type I, √β1 = 1.0, β2 = 4.0

    E(b2√b1) ~ 4 - 9.7966/n + 4.9612E02/n² + 6.1251E03/n³ + . . . + 5.7413E07/n⁸
    Var(b2√b1) ~ 5.6987E02/n + 3.9918E03/n² + 4.9240E05/n³ + . . . + 3.4572E10/n⁸
    μ3(b2√b1) ~ 2.9913E05/n² + 1.0895E07/n³ + . . . + 8.0078E11/n⁸
    μ4(b2√b1) ~ 9.7426E05/n² + 3.1439E08/n³ + . . . + 9.9575E13/n⁸

7.4.3 Safe Sample Sizes

Our early work in using these expansions to approximate the distributions of, for example, √b1 and b2, relied on inflating the sample size to damp out higher order terms. Thus in the tables of Bowman and Shenton (1975a), safe sample sizes are indicated for each moment, using the rather arbitrary rule that the critical sample size is one which adjusts the size of the highest order term relative to the lowest order term to be approximately one-tenth. For example, in sampling from Type I with √β1 = 0.6, β2 = 3.2, the safe sample size for E(√b1) is n = 10, and

    E(√b1) ~ 0.6 - 0.3082 + 0.0444 + 0.0227 - 0.0302
                 + 0.0175 + 0.0186 - 0.0606 + 0.0414        (7.7)

Similarly, in sampling from Pearson Type I with √β1 = 1.4, β2 = 3.4, the critical size is n = 100 for μ6(b2), and

    μ6(b2) ~ 3.4419 + 3.7942 + 2.5590 + 1.4111 + 0.7088 + 0.3417        (7.8)

Clearly, in both cases the sample sizes are only just adequate to damp out the n⁻⁸ terms. Pearson (1930) used rather similar damping factors; for example, for normal samples he gave

    σ(√b1) ~ 0.3464(1 - .0600 + .0024 - .0001)        (7.9)

when n = 50, and

    β2(b2) ~ 3 + 5.4000 - 2.0196 + 0.4704        (7.10)

when n = 100. In the case of σ it looks as if a smaller sample size could have been used.
For our illustrations in 7.4.2 the safe sample sizes are

(a) 8, 8, 10, 18
(b) 18, 28, 28, 61, 86
(c) 21, 43, 102
(d) 36, 57, 72, 179

Evidently, using the series indiscriminately would be disastrous.
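The one-tenth rule is easy to mechanize. A sketch (ours; `safe_sample_size` is a hypothetical helper, and the 0.1 ratio is the arbitrary rule quoted above):

```python
def safe_sample_size(coeffs, ratio=0.1, n_max=100_000):
    """coeffs[k-1] is the coefficient of n**-k in a moment series.
    Return the smallest n at which the highest order term is no more
    than `ratio` times the lowest order term (the one-tenth rule)."""
    s = len(coeffs)
    for n in range(2, n_max + 1):
        if abs(coeffs[-1]) / n**s <= ratio * abs(coeffs[0]) / n:
            return n
    return None

# safe_sample_size([1.0, 1.0]) -> 10
```

For a two-term series with equal coefficients the rule gives n = 10; a wildly growing final coefficient forces a larger safe size, exactly as in the illustrations above.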

7.4.4 Rational Fraction and Other Approximations

Asymptotic or slowly convergent series may be approximated by the ratio of polynomials in the variable (in our case, n the sample size), and there has been a resurgence of interest in the last decade in the subject, basically initiated by Padé (his thesis was published in 1892). Briefly, the domain of convergence of Padé approximants (which include Stieltjes continued fractions as a special case) is generally more extensive than is the case for series developments; for series in 1/n, this suggests the possibility that smaller values of n may hold for Padé approximants. Genuinely divergent series (or what appear to be so from the pattern of the first few terms) seem to be quite common in statistics; at least that is our experience, but from a knowledge of a few terms (8, 15, or perhaps 30), one must not expect to arrive at a precise answer; rather, one looks for an optimum assessment.
For fuller accounts of the general Padé approach the reader is referred to Baker (1965, 1970, 1975). An extensive bibliography on Padé approximation and related matters is given by Brezinski (1977, 1978, 1980, 1981). General comments and cautionary remarks on summing divergent series are given by Van Dyke (1974), and problems of error analysis for convergent and divergent series are discussed by Olver (1974).
An interesting account of the properties of continued fractions (a special case of the Padé table) is given by Henrici (1976) who, among other things, pays much attention to the rate of convergence; continued fraction developments for rapidly divergent series fairly frequently converge, if only slowly, a remarkable property. A brief account of Padé methods with special reference to statistical series is given in Bowman and Shenton (1984).
Discussion of divergent series for moments of statistics is to be found in Shenton and Bowman (1977a), and Bowman and Shenton (1978, 1983a, 1983b, 1984). A summation algorithm due to Levin (1973) has turned out to be successful (Bowman et al. 1978b, and Bowman and Shenton 1983c) with series of alternating sign and moderately divergent (as for example with the factorial series 1 - x·1! + x²·2! - x³·3! + · · ·). Cases considered include the standard deviation from exponential, logistic, rectangular, and half-Gaussian populations.
Padé algorithms have been used to find the low order moments of √b1 and b2 required in the Su approximations.
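To illustrate the kind of summation involved, a diagonal Padé approximant can be formed directly from the series coefficients. The sketch below is ours, not the authors' program; it sums the factorial series at x = 0.1, where the corresponding Stieltjes integral ∫0^∞ e^(-t)/(1 + xt) dt has the value 0.91563 . . . .

```python
import numpy as np
from math import factorial

def pade_diag(c, m):
    """[m/m] Pade approximant of sum c[k] x**k from coefficients c[0..2m].
    Returns a callable approximating P(x)/Q(x)."""
    # Denominator Q(x) = 1 + b1 x + ... + bm x**m from the linear system
    # sum_j b_j c[m+k-j] = -c[m+k],  k = 1..m  (with b0 = 1).
    C = np.array([[c[m + k - j] for j in range(1, m + 1)]
                  for k in range(1, m + 1)], dtype=float)
    rhs = -np.array([c[m + k] for k in range(1, m + 1)], dtype=float)
    b = np.concatenate(([1.0], np.linalg.solve(C, rhs)))
    # Numerator coefficients a_i = sum_j b_j c[i-j], i = 0..m.
    a = np.array([sum(b[j] * c[i - j] for j in range(min(i, m) + 1))
                  for i in range(m + 1)])
    return lambda x: np.polyval(a[::-1], x) / np.polyval(b[::-1], x)

# Coefficients (-1)**k k! of the factorial series
c = [(-1)**k * factorial(k) for k in range(9)]
approx = pade_diag(c, 4)   # [4/4] approximant from nine coefficients
```

Here `approx(0.1)` agrees with the Stieltjes value to well within a part in a thousand.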

7.5 THE CORRELATION BETWEEN √b1 AND b2

7.5.1 Asymptotic Correlation

Early work goes back to Pearson (1905), who gave, for general sampling, the first-order asymptotics for μ2(√b1), μ2(b2), and the correlation
    R(b1, b2) = μ11(b1, b2)/[μ2(b1) μ2(b2)]^(1/2)        (7.11)

where the first-order numerator is a polynomial in β1, . . . , β6, with

    β3 = μ3μ5/μ2⁴,  β4 = μ6/μ2³,  β5 = μ3μ7/μ2⁵,  β6 = μ8/μ2⁴

Note, as far as first-order terms go, this is the same as the correlation between √b1 and b2.

7.5.2 More Exact Results

Further coefficients of higher powers of n⁻¹ can be used in

    ρ(√b1, b2) = {E(b2√b1) - E(b2)E(√b1)}/{σ(√b1)σ(b2)}        (7.12)

by the method of moments of sample moments. Each of the four terms in (7.12) is developed as a series in descending powers of n, and a summation technique applied in each case (7.4.2(d) gives examples of moment series for the statistic b2√b1). Due to digital programming complexities the series for E(b2√b1) could only be taken to a lower order than the others; those for √b1 and b2 were taken as far as n⁻⁸. A selection of results is given in Table 7.1.
The surprising feature is the largeness of the correlation, especially at parameter points (√β1, β2) not in the neighborhood of the normal point (0, 3.0). Even at (0.2, 3.0) there is a correlation of 0.43. Note that in normal sampling the correlation is zero, but b1 and b2 are correlated. Thus (Shenton and Bowman, 1975) in normal sampling,

    cov(b1, b2) = 216n(n - 2)(n - 3)/{(n + 1)²(n + 3)(n + 5)(n + 7)}        (7.13)

and the correlation is

    ρ(b1, b2) = [27(n² - 9)(n + 9)/{(n + 7)(n³ + 37n² + 11n - 313)}]^(1/2)        (n > 3)        (7.14)

Returning to Table 7.1, also note that sample size beyond 50 only changes the correlation slightly. More extensive tabulations (Bowman and Shenton, 1975a, Table 5) suggest that in sampling from Pearson Type I distributions with samples n > 50, the correlation between √b1 and b2 is 0.8 or more if the skewness (√β1) of the sampled population exceeds around 0.7. This property shows that in sampling from Pearson Type I distributions the dot diagram of (√b1, b2) will consist of an elongated narrow band for √b1 > 0.7 or so (doubtless, if β2 is also large and the Pearson curve considered is Type III or Type IV, then the narrow band may broaden, but we have insufficient data to be sure of this). The dot diagrams of the couplets (√b1, b2) in Figures 7.3, 7.4, and 7.5, below, support these properties. (Note that a limited investigation of a similar grid of values (β1, β2) for normal mixtures showed no significant change in the pattern of behavior of the correlation coefficient.)
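Equation (7.13) is easily evaluated exactly; a sketch (ours, the function name illustrative) which also exhibits the n⁻² decay, cov(b1, b2) ≈ 216/n² for large n:

```python
def cov_b1_b2(n):
    # Exact covariance of b1 and b2 in normal sampling, Eq. (7.13)
    return 216 * n * (n - 2) * (n - 3) / ((n + 1)**2 * (n + 3) * (n + 5) * (n + 7))

# cov_b1_b2(50) is about 0.0564, and n**2 * cov_b1_b2(n) tends to 216
```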
TABLE 7.1 Covariance and Correlation Between √b1 and b2 in Pearson Sampling

                                n
√β1    β2           50      75     100     250     500    1000
0.20 1.20 (a) 0.0344 0.0218 0.0159 0.0061 0.0030 0.0015


(b) 0.6963 0.7478 0.7788 0.8467 0.8740 0.8887
0.20 1.80 (a) 0.0229 0.0150 0.0112 0.0044 0.0022 0.0011
(b) 0.5141 0.5274 0.5343 0.5473 0.5518 0.5541
0.20 3.00 (a) 0.0871 0.0665 0.0534 0.0242 0.0126 0.0064
(b) 0.4325 0.4573 0.4699 0.4930 0.5007 0.5046
0.40 1.20 (a) 0.0816 0.0509 0.0370 0.0140 0.0069 0.0034
(b) 0.8876 0.9192 0.9363 0.9698 0.9818 0.9880
0.40 3.00 (a) 0.1459 0.1085 0.0859 0.0379 0.0195 0.0099
(b) 0.6962 0.7142 0.7226 0.7369 0.7414 0.7435
0.60 1.40 (a) 0.1377 0.0851 0.0616 0.0232 0.0114 0.0056
(b) 0.9379 0.9571 0.9669 0.9850 0.9912 0.9943
0.60 3.00 (a) 0.1671 0.1191 0.0921 0.0387 0.0197 0.0099
(b) 0.8176 0.8252 0.8287 0.8345 0.8363 0.8372
0.60 3.40 (a) 0.2697 0.2163 0.1772 0.0837 0.0444 0.0229
(b) 0.8431 0.8317 0.8359 0.8447 0.8473 0.8485
0.80 1.80 (a) 0.1959 0.1210 0.0876 0.0330 0.0162 0.0080
(b) 0.9559 0.9688 0.9750 0.9860 0.9897 0.9915
0.80 2.40 (a) 0.1306 0.0850 0.0630 0.0246 0.0122 0.0061
(b) 0.9188 0.9263 0.9301 0.9367 0.9388 0.9399
0.80 3.00 (a) 0.1674 0.1133 0.0853 0.0342 0.0171 0.0085
(b) 0.8778 0.8813 0.8831 0.8867 0.8879 0.8885
0.80 3.80 (a) 0.3178 0.3219 0.2725 0.1346 0.0726 0.0378
(b) — 0.9063 0.8918 0.8927 0.8936 0.8939
1.00 3.00 (a) 0.1920 0.1257 0.0932 0.0365 0.0181 0.0090
(b) 0.9271 0.9325 0.9352 0.9402 0.9418 0.9426
1.00 4.00 (a) 0.4262 0.3322 0.2693 0.1239 0.0649 0.0332
(b) 0.9117 0.9141 0.9151 0.9166 0.9170 0.9172
1.40 3.00 (a) 0.7534 0.4229 0.2947 0.1049 0.0506 0.0249
(b) 0.9688 0.9813 0.9867 0.9949 0.9973 0.9984
1.40 4.00 (a) 0.4001 0.2602 0.1922 0.0746 0.0369 0.0183
(b) 0.9546 0.9601 0.9628 0.9676 0.9691 0.9699

(a) Covariance; (b) correlation. Moments are based on asymptotic series.
—: E(b2) not well approximated at this point for n = 50.

7.6 SIMULTANEOUS BEHAVIOR OF √b1 AND b2

7.6.1 Johnson's Su Distributions and √b1 and b2

We have given in (7.3) the Johnson Su transformation for √b1 under normality due to D'Agostino. The transformation works well for n > 8 or 9.
We have tested out the Su system (Shenton and Bowman, 1975) as an approximant to the distributions of √b1 and b2 in nonnormal sampling, including normal mixtures of two components and other distributions. Su gives excellent results even for relatively small samples (n > 30 or so) and for distributions (determined by specified values of √β1 and β2) for which 0 ≤ √β1 ≤ 1.2 and 1.2 ≤ β2 ≤ 4.0 approximately. The kurtosis statistic is less well fitted, due no doubt to its one-tailed nature. But in general, for moderate sized samples, Su provides an acceptable fit.

7.6.2 The Su Transformation

Following Johnson's notation (1949), Su has density

    p(y) = {δ/(2π)^(1/2)}(1 + y²)^(-1/2) exp(-½z²)        (7.15)

where

    z = γ + δ ln{y + (1 + y²)^(1/2)}

so that

    y = sinh{(z - γ)/δ}

and z is N(0,1). The first four moments are, defining ln ω = 1/δ², Ω = γ/δ,

    μ1′(y) = -ω^(1/2) sinh Ω
    μ2(y) = (ω - 1){ω cosh(2Ω) + 1}/2
    μ3(y) = -(ω - 1)² ω^(1/2) {ω(ω + 2) sinh(3Ω) + 3 sinh Ω}/4        (7.16)
    μ4(y) = (ω - 1)² {d4 cosh(4Ω) + d2 cosh(2Ω) + d0}

where d4 = ω²(ω⁴ + 2ω³ + 3ω² - 3)/8, d2 = ω²(ω + 2)/2, d0 = 3(2ω + 1)/8.
Suppose now that X is a statistic (or random variable) with mean K1′ and cumulants K2, K3, and K4. From the standardized cumulants Ks/K2^(s/2), s = 3, 4, determine γ, δ by matching these with the skewness and kurtosis of Su (quite accurate rational fraction approximations to ω and Ω have been given (Bowman and Shenton, 1980a) in terms of √β1, β2, capable of implementation on a portable calculator, preferably programmable). Similarly, solutions are available for the neighboring SB system (Bowman and Shenton, 1980b); otherwise, see Johnson (1965) or Pearson and Hartley (1972) for facilitating tables. Define t = (X - ξ)/λ, and determine ξ, λ by setting μ1′(t) = μ1′(y) and μ2(t) = μ2(y). Thus

    λ² = K2/μ2(y)
    ξ = K1′ - λμ1′(y)

and the first four moments of t are now those of y given in (7.16).
In particular there are the approximate transformations

    √b1 = ξ1 + λ1 sinh{(X_S(√b1) - γ1)/δ1}
    b2 = ξ2 + λ2 sinh{(X_S(b2) - γ2)/δ2}

where X_S(√b1), X_S(b2) are equivalent normal deviates based on Su. To determine the eight parameters ξ1, λ1, . . . , δ2 we need the first four moments of √b1 and b2 (mean, variance, √β1, β2 in standardized moments). These are determined by the Taylor series approach (see 7.4), and to set up the two sets of moments we require 40 moments of the population sampled in order to take the moment series out to terms of order n⁻⁸ (36 for √b1, 40 for b2). The series are summed either by the "safe sample size" technique or the Padé algorithm. Malfunctions of the summation technique can sometimes be detected by lack of smoothness in assessments of a moment for a set of equally spaced sample sizes.
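The moment formulas (7.16) can be verified directly by simulating y = sinh{(z - γ)/δ}. The sketch below is ours; the choice γ = 0.5, δ = 2 is arbitrary and illustrative.

```python
import numpy as np

def su_moments(gamma, delta):
    # Mean and central moments mu2, mu3, mu4 of y = sinh((z - gamma)/delta),
    # z ~ N(0,1), per (7.16) with omega = exp(1/delta**2), Omega = gamma/delta.
    w, O = np.exp(1.0 / delta**2), gamma / delta
    m1 = -np.sqrt(w) * np.sinh(O)
    m2 = 0.5 * (w - 1) * (w * np.cosh(2 * O) + 1)
    m3 = -0.25 * (w - 1)**2 * np.sqrt(w) * (w * (w + 2) * np.sinh(3 * O)
                                            + 3 * np.sinh(O))
    d4 = w**2 * (w**4 + 2 * w**3 + 3 * w**2 - 3) / 8
    d2 = 0.5 * w**2 * (w + 2)
    d0 = 3 * (2 * w + 1) / 8
    m4 = (w - 1)**2 * (d4 * np.cosh(4 * O) + d2 * np.cosh(2 * O) + d0)
    return m1, m2, m3, m4

rng = np.random.default_rng(0)
y = np.sinh((rng.standard_normal(1_000_000) - 0.5) / 2.0)
m1, m2, m3, m4 = su_moments(0.5, 2.0)
# m1, m2, m3, m4 agree with the simulated moments of y to Monte Carlo error
```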

7.6.3 Omnibus Test Contours

The tests we shall consider are:

Normal sampling:
    A: K_S² = X_S²(√b1) + X_S²(b2)
    B: K_e² = X_S²(√b1) + X_e²(b2)
    C: K_S² and K_e², treated as χ²-variates with 2 d.f., will be labelled χ_S² and χ_e² respectively

Nonnormal sampling:
    D: K_SR² = {X_S²(√b1) - 2R X_S(√b1)X_S(b2) + X_S²(b2)}/(1 - R²),
       where R = ρ(X_S(√b1), X_S(b2))
    E: K_SR², treated as χ²(ν = 2), will be called χ_SR²

(In the sequel we shall also describe a bivariate model for the joint distribution of √b1, b2 from which acceptance contours are constructed.)
Case C is the D'Agostino-Pearson test, a fairly obvious concept since the components are squares of approximate normal variates, so that χ² percentage values can be used to test significance. They used extensive simulations to assess the distribution of b2 (sample sizes n = 20(5)50, 100, 200) from which X_e(b2), the equivalent normal deviate for b2, could be estimated at various probability levels. However, the statistics X_S(√b1) and X_e(b2) are not independent (D'Agostino and Pearson, 1974); and whereas this perhaps has little effect on the test contours for large samples, it does have to be taken into account for most applications.
The test statistics in A, B, and D are to be regarded as mappings, so that probability levels are to be discovered by Monte Carlo simulations. A is a quick and easy statistic to simulate if we want a rough test; all that we require are the Johnson transforms for √b1 and b2 followed by simulation of K_S² (see Appendix). B is an improvement on A and C, since it uses the empirical equivalent normal deviate X_e(b2). D is intended as an approximate testing statistic but is still fairly complicated to construct. First of all, for a given sample size, we must carry out simulations to assess the correlation between X_S(√b1) and X_S(b2), since this correlation is intractable mathematically. Next, we simulate K_SR² to determine a set of percentage points, and finally, we map the regions in the (√b1, b2) plane. For present purposes, the χ_SR² version is adequate.
In the next section we set out some supporting material for the statistics involved in A, B, and C. Readers who prefer to follow the main development may move to 7.6.5.
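The combination step in D is elementary once the deviates and their correlation R are in hand. A sketch (ours; the function names are illustrative, and R would come from the simulations just described):

```python
from math import exp

def k_sr2(x, y, R):
    # Quadratic form in two correlated, approximately standard normal
    # deviates; chi-square with 2 d.f. when (x, y) is bivariate normal
    # with correlation R.
    return (x * x - 2.0 * R * x * y + y * y) / (1.0 - R * R)

def p_value(k2):
    # Upper tail of chi-square with 2 d.f.: P(K^2 > k2) = exp(-k2/2)
    return exp(-0.5 * k2)
```

With x = X_S(√b1), y = X_S(b2), the approximate significance level is `p_value(k_sr2(x, y, R))`.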

7.6.4 Comments on Moments of Statistics in the Test Statistics in Normal Sampling

Moments of √b1, b2, etc., under normality are given in Tables 7.2 and 7.3 (Bowman and Shenton, 1975b). X_S(√b1) has moments very close to the normal even for n = 20. X_S(b2) is more discrepant but is satisfactory for n > 50; the improvement using X_e(b2) is evident, especially for the smaller samples. The theoretical moments of √b1 and b2 are derived from the Taylor series developments and show gratifying closeness to the simulation results. Those for the K²-type statistics (A and B), on the assumption of independence, should be μ = 2, σ = 2, √β1 = 2, β2 = 9. The discrepancies (Table 7.2) are marked for n small and persist for √β1 and β2 even at n = 100.
The upper percentage points in Table 7.3 of X_S(√b1), X_S(b2), and X_e(b2) are close to the normal values. However, for the lower percentage points, whereas the agreement for X_S(√b1) is satisfactory, that for X_S(b2) is quite discrepant, especially for small samples; by contrast the percentage points for X_e(b2) are in good agreement for all sample sizes studied. Thus K_S² will give too much weight to large discrepancies, especially for small samples.
TABLE 7.2 Moments of √b1, b2 and Related Variates in Normal Sampling Based on 50,000 Simulations*

Sample size                          Moment parameters
    n     Variate        Mean            σ              √β1            β2

   20     √b1         0.000 (0.000)  0.472 (0.473)  -0.001 (0.000)   3.64 (3.58)
          X_S(√b1)    0.001          0.998          -0.002           3.03
          b2          2.708 (2.714)  0.767 (0.761)   1.840 (1.738)   9.52 (8.54)
          X_S(b2)    -0.018          1.033          -0.257           4.09
          X_e(b2)    -0.008          1.000           0.040           3.22
          K_S²        2.063          2.653           5.012          72.38
          K_e²        1.995          2.441           4.297          43.38

   25     √b1        -0.003 (0.000)  0.437 (0.435)  -0.015 (0.000)   3.59 (3.58)
          X_S(√b1)   -0.006          1.003          -0.004           3.00
          b2          2.765 (2.769)  0.740 (0.731)   1.834 (1.747)   9.41 (8.90)
          X_S(b2)    -0.015          1.026          -0.149           3.51
          X_e(b2)    -0.009          1.005           0.053           3.18
          K_S²        2.058          2.505           3.385          21.63
          K_e²        2.016          2.433           4.078          37.37

   50     √b1         0.000 (0.000)  0.327 (0.326)   0.002 (0.000)   3.57 (3.45)
          X_S(√b1)    0.001          1.001           0.001           3.06
          b2          2.880 (2.882)  0.609 (0.598)   1.678 (1.582)   8.90 (8.42)
          X_S(b2)    -0.042          1.011           0.055           2.89
          X_e(b2)    -0.020          1.011           0.057           3.06
          K_S²        2.026          2.291           3.300          22.36
          K_e²        2.024          2.358           3.683          29.80

  100     √b1         0.000 (0.000)  0.238 (0.238)  -0.016 (0.000)   3.30 (3.28)
          X_S(√b1)    0.002          1.001          -0.015           3.01
          b2          2.939 (2.941)  0.461 (0.455)   1.324 (1.277)   6.99 (6.77)
          X_S(b2)    -0.009          1.010           0.057           2.82
          X_e(b2)    -0.011          1.012           0.007           2.96
          K_S²        2.023          2.198           3.075          20.51
          K_e²        2.027          2.231           3.030          19.68

*Parenthetic entries refer to theoretical values of the moment parameters. Most of the simulation for (√b1, b2) and the related statistics was carried out on an IBM System 360 Model 91, with occasional checks on a CDC 6400 system. The basic uniform variates were generated by a multiplicative congruential method; recommended starting values and multipliers are given in a Computer Technology Center Report, Union Carbide, Oak Ridge, Tennessee, by J. G. Sullivan and B. Coveyou. Pseudo-random normal deviates were derived from the uniform variates using a rejection method attributed to von Neumann (Kahn, 1956, p. 39). This method sets up x_i = log(1/u_i) (i = 1, 2) and accepts x1 if (x1 - 1)² < 2x2, where u1 and u2 are identically and independently distributed uniform variates on (0,1); a normal variate follows by giving x1 an equal chance of being positive or negative.
TABLE 7.3 Percentage Points of √b1, b2 and Related Variates in Normal Sampling Based on 50,000 Simulations

Sample                              Percentage points
 size
   n    Variate        1%       5%      10%      90%      95%      99%

  20    √b1         -1.151   -0.772   -0.587    0.583    0.767    1.151
        X_S(√b1)    -2.326   -1.644   -1.278    1.271    1.636    2.328
        b2           1.642    1.828    1.948    3.657    4.143    5.394
        X_S(b2)     -2.631   -1.717   -1.299    1.261    1.631    2.359
        X_e(b2)     -2.316   -1.651   -1.289    1.262    1.618    2.333
        K_S²         0.020    0.101    0.208    4.702    6.787   12.756
        K_e²         0.020    0.103    0.214    4.473    6.352   11.654

  25    √b1         -1.065   -0.713   -0.543    0.544    0.710    1.052
        X_S(√b1)    -2.338   -1.650   -1.282    1.284    1.643    2.313
        b2           1.722    1.912    2.028    3.675    4.137    5.381
        X_S(b2)     -2.601   -1.704   -1.303    1.264    1.634    2.382
        X_e(b2)     -2.321   -1.670   -1.289    1.268    1.634    2.370
        K_S²         0.020    0.101    0.212    4.695    6.728   12.44
        K_e²         0.021    0.103    0.217    4.514    6.354   11.66

  50    √b1         -0.789   -0.530   -0.408    0.408    0.535    0.795
        X_S(√b1)    -2.329   -1.636   -1.282    1.282    1.647    2.345
        b2           1.964    2.143    2.251    3.631    4.000    4.925
        X_S(b2)     -2.305   -1.707   -1.358    1.251    1.623    2.327
        X_e(b2)     -2.311   -1.671   -1.310    1.283    1.645    2.355
        K_S²         0.021    0.106    0.217    4.515    6.146   11.15
        K_e²         0.021    0.107    0.217    4.529    6.259   11.30

 100    √b1         -0.569   -0.389   -0.301    0.300    0.389    0.561
        X_S(√b1)    -2.332   -1.642   -1.285    1.281    1.640    2.305
        b2           2.180    2.342    2.439    3.527    3.796    4.399
        X_S(b2)     -2.257   -1.671   -1.327    1.298    1.671    2.330
        X_e(b2)     -2.367   -1.675   -1.323    1.291    1.665    2.335
        K_S²         0.023    0.110    0.221    4.513    6.018   10.36
        K_e²         0.022    0.107    0.215    4.552    6.133   10.49

Standard
normal              -2.326   -1.645   -1.282    1.282    1.645    2.326
χ² (ν = 2)           0.020    0.103    0.211    4.605    5.992    9.210

But in general there is little to choose between K_S² and K_e² for n greater than about 100.
The changes in the upper percentiles of K_S² as the sample size increases are shown in Table 7.4. For a sample of 1,000, the 99% value is still somewhat larger than the corresponding value for χ²(ν = 2).
It is thought that the K_S² and possibly K_e² approximation deteriorates from the chi-squared for percentage points more extreme than those studied here.
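The χ²(ν = 2) reference values used in Tables 7.3 and 7.4 follow from the closed-form distribution function F(x) = 1 - e^(-x/2), so the P-level percentage point is -2 ln(1 - P). A sketch (ours):

```python
from math import log

def chi2_2_point(p):
    # p-quantile of chi-square with 2 d.f.; F(x) = 1 - exp(-x/2),
    # so the percentage point is -2 ln(1 - p).
    return -2.0 * log(1.0 - p)

# chi2_2_point reproduces the chi-square (nu = 2) row of Table 7.3,
# e.g. the 95% point 5.991 and the 99% point 9.210.
```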

TABLE 7.4 Upper Percentage Points for K_S² for Large Samples Based on 50,000 Simulations

Sample size
    n         90%      95%      99%

150 4.497 5.950 10.010


200 4.507 5.997 9.944
250 4.522 5.961 9.737
300 4.561 6.021 9.972
500 4.595 6.029 9.574
1000 4.629 6.032 9.444
7.6.5 Omnibus Contours for A, B, C (Normal Sampling)

The form of the joint distribution f(√b1, b2) and the test contours may be gained from a dot diagram of 5,000 runs of samples of 20 (Figure 7.2). Superimposed on the diagram are the 90, 95, and 99% contours based on K_S², K_e², and χ²(ν = 2). The parabolic shape adumbrated in the dot diagram is striking and (from unpublished graphs) becomes less sharp for larger samples (this feature will be mentioned in the sequel). It is also to be noted that there is evidence that the conditional distribution of √b1 for b2 small is unimodal, whereas as b2 increases it becomes markedly bimodal. Contours at 90% and 95% levels of acceptance for numerous sample sizes are given in Figure 7.1. To use them, merely compute √b1, b2 and enter and interpret with respect to the appropriate sample size contour. Slightly improved contours are given below in Figures 7.6b and 7.6c, constructed from an entirely different model.
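The two statistics needed to enter the charts are √b1 = m3/m2^(3/2) and b2 = m4/m2², computed from the sample central moments m_k (divisor n). A sketch (ours; the function name is illustrative):

```python
import numpy as np

def skew_kurt(x):
    """Return (sqrt(b1), b2) from the sample central moments (divisor n)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2, m3, m4 = (d**2).mean(), (d**3).mean(), (d**4).mean()
    return m3 / m2**1.5, m4 / m2**2

# skew_kurt([1, 2, 3, 4, 5]) -> (0.0, 1.7)
```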

FIGURE 7.2 Dot diagram and 90, 95, and 99 percent contours in normal sampling for n = 20 (K_S² solid; K_e² dashed; χ² dot-dash).

7.6.6 Omnibus Contours Under Nonnormal Sampling

Omnibus contours D (see 7.6.3) and dot diagrams of 1,000 points as illustrations have been constructed in the following cases (the assumption that K_SR² is approximately a χ²-variate (ν = 2) was made for simplicity):

    Pearson Type I:                  √β1 = 2/7    β2 = 33/14    n = 100   (Figure 7.3A)
    Normal Mixture (2 components):   √β1 = 2/7    β2 = 33/14    n = 100   (Figure 7.3B)
    Normal Mixture:                  √β1 = 1      β2 = 2.6      n = 50    (Figure 7.4)
    Normal Mixture:                  density 0.9 N(0,1) + 0.1 N(3,1)    n = 100   (Figure 7.5)

FIGURE 7.4 90, 95, and 99 percent contours (χ_SR²) for the normal mixture √β1 = 1.0, β2 = 2.6, n = 50.

FIGURE 7.5 90, 95, and 99 percent contours (χ_SR²) for the normal mixture f(·) = 0.9 N(0,1) + 0.1 N(3,1); n = 100.

TABLE 7.5 Number of Couplets Outside the 90, 95, 99% χ_SR²-Contours for Five Populations

                                                  Number of (√b1, b2) points
                                                  in 1000 outside the contour
Figure   Population*   √β1    β2      n     R       90%     95%     99%

7.3A     Type I        2/7   33/14   100  0.571      89      46       9
7.3B     N.M.          2/7   33/14   100  0.470      85      41       6
 —       N.M.          1.0    2.2     70  0.998      67      37      16
7.4      N.M.          1.0    2.6     50  0.979      97      48      15
 —       Uniform       0.0    1.8     50  0.000     100      57      16

*Type I has indices 2 and 3; N.M. = Normal Mixture.

The odd parameter points in the Type I case appear because they correspond to a density f(x) = kx(1 - x)², chosen for ease of simulation.

Comments

(1) The contours do not seem to change markedly for different populations (Figure 7.3).
(2) The remarks on the effect of the correlation between √b1 and b2 made in 7.5.2 are illustrated in Figures 7.4 and 7.5.
(3) A visual count of the number of couplets outside the three percentage levels for five populations for the χ_SR² contours is given in Table 7.5. The agreement is satisfactory.

A different set of contours based on another model is now described.

7.7 A BIVARIATE MODEL

7.7.1 Genesis

We noticed (7.6.5) that in normal sampling there is evidence (Figure 7.2) that the contours of equal probability lie on parabolic arcs and that the omnibus contours have the wrong concavity for b2 > 3 or so. Again there is evidence of a change from unimodality to bimodality of the √b1 arrays for given b2 as b2 increases. So this suggests considering a model with the Su transformed normal curve for the marginal of √b1. As for the conditional distribution of b2 given √b1, we carried out simulations of 50,000 samples of 10, 20, 30, and 50, from which the b2-arrays appeared to have a gamma-type density bounded by b2 = 1 + b1 (for all distributions for which four moments exist, and for samples, b2 > 1 + b1). In addition, the means of the b2 arrays (√b1 constant) formed a parabolic arc almost parallel to the bounding parabola. Moreover, the mean and variance of the b2 arrays increased with increasing |√b1| whereas the skewness decreased.

7.7.2 The Model for General Sampling

We define a bivariate product density by (Shenton and Bowman, 1977)

    f(√b1, b2) = w(√b1) g(b2 | √b1)        (7.17)

where w(·) refers to the Su approximation to the marginal probability density function of √b1, and g(·|·) to a conditional gamma distribution. Specifically,

    w(√b1) = δ exp(-½z²)/{(2π)^(1/2) [λ² + (√b1 - ξ)²]^(1/2)}        (7.18)

    g(b2 | √b1) = k {k(b2 - 1 - b1)}^(θ(√b1)-1) (1/Γ{θ(√b1)}) exp{-k(b2 - 1 - b1)}

where

(1) z = γ + δ sinh⁻¹{(√b1 - ξ)/λ},
(2) k > 0, θ(√b1) > 0 for all real √b1,
(3) ξ, λ, γ, and δ are the parameters associated with Johnson's Su for the distribution of √b1.

For our present study we assume a quadratic form for θ(·), namely,

    θ(√b1) = a + b√b1 + cb1        (c ≥ 0)

so that the unknowns in the model (bearing in mind (3)) are now a, b, c, and k. The simplest method to determine these (there are several) is to use E(b2), var(b2), E(b2√b1), and E(b2b1); examples of the latter are given in 7.4. Define

    μr,s = E{(√b1 - E(√b1))^r (b2 - E(b2))^s}        (7.19)
    νr,s = E{(√b1)^r b2^s}

Then from (7.17), after simplification,

    ν01 = 1 + a/k + (b/k)ν10 + (1 + c/k)ν20
    μ11 = (b/k)μ20 + (1 + c/k)(ν30 - ν20ν10)
    μ02 = (1 + c/k)²(ν40 - ν20²) + 2(1 + c/k)(b/k)(ν30 - ν20ν10)        (7.20)
          + (b/k)²μ20 + (a + bν10 + cν20)/k²
    μ21 = (b/k)μ30 + (1 + c/k)(μ40 - μ20² + 2ν10μ30)

For example, in deriving μ02 = ν02 - ν01², we use

    ν02 = ∫ (from -∞ to ∞) w(√b1) d(√b1) ∫ (from 1+b1 to ∞) {(b2 - 1 - b1) + (1 + b1)}² g(b2 | √b1) db2

The equations (7.20) can be solved explicitly and in sequence:

    r = (μ20μ21 - μ30μ11)/(μ20D - μ30α3)
    q = (μ11D - μ21α3)/(μ20D - μ30α3)        (7.21)
    p = ν01 - qν10 - rν20
    k = {p - 1 + qν10 + (r - 1)ν20}/{μ02 - q²μ20 - 2rqα3 - r²(ν40 - ν20²)}

where

    p = 1 + a/k,  q = b/k,  r = 1 + c/k
    α3 = ν30 - ν20ν10,  D = μ40 - μ20² + 2ν10μ30

In this way, having determined k, we deduce a, b, and c from p, q, r.
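The solution scheme (7.21) is easily programmed. The sketch below is ours; it feeds the scheme Fisher's normal-sampling moments quoted in 7.7.3 and reproduces the n = 50 entries of Table 7.6.

```python
def solve_7_21(nu10, nu20, nu30, nu40, mu20, mu30, mu40, nu01, mu02, mu11, mu21):
    # Sequential solution (7.21); returns the model parameters (a, b, c, k).
    a3 = nu30 - nu20 * nu10                        # alpha_3
    D = mu40 - mu20**2 + 2 * nu10 * mu30
    den = mu20 * D - mu30 * a3
    r = (mu20 * mu21 - mu30 * mu11) / den
    q = (mu11 * D - mu21 * a3) / den
    p = nu01 - q * nu10 - r * nu20
    k = (p - 1 + q * nu10 + (r - 1) * nu20) / (
        mu02 - q**2 * mu20 - 2 * r * q * a3 - r**2 * (nu40 - nu20**2))
    return k * (p - 1), k * q, k * (r - 1), k     # a, b, c, k

# Normal-sampling moments (Fisher, 1930; see 7.7.3) for n = 50:
n = 50
mu20 = 6 * (n - 2) / ((n + 1) * (n + 3))
mu40 = mu20**2 * (3 + 36 * (n - 7) * (n**2 + 2 * n - 5)
                  / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
nu01 = 3 * (n - 1) / (n + 1)
mu02 = 24 * n * (n - 2) * (n - 3) / ((n + 1)**2 * (n + 3) * (n + 5))
mu21 = 216 * n * (n - 2) * (n - 3) / ((n + 1)**2 * (n + 3) * (n + 5) * (n + 7))
# Under normality nu10 = mu30 = mu11 = 0, so nu20 = mu20 and nu40 = mu40.
a, b, c, k = solve_7_21(0.0, mu20, 0.0, mu40, mu20, 0.0, mu40, nu01, mu02, 0.0, mu21)
# (a, b, c, k) matches the n = 50 row of Table 7.6: 12.184, 0, 7.493, 7.311
```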


To determine the param eters in the model, we must find the Su for
and in addition E (Ьз), E(N/bib 2), у а г Ь з , and

Ma = E(bib2> - E(bi)E(b2> - 2E(^Л),)E(^Гb,b2) + 2E(b2)(EN/bi)^

These in the general case (nonnormal sampling) are found by the series
approach described in 7.3.
Note that since g (b 2 1^/bj) is the conditional distribution of b 2 , it follows
that

E(b2lN/bi) = P + qN/bi + rbi


(7.22)
var (b 2 |N/bi) = (a + bN/bj + cbi)/k^

so that since c > 0 (see 7 .7 .2 (2) and r = I + c/k, the conditional mean and
variance ultimately Increase with I Ф > 1 1.
Again since the correlation between bj and b 2 is given by

Pibi.bj) =(Ma+2Kiÿiu)/{(7(bi)ff(b2)} (7.23)

and since the moments involved here have been used to determine the model
param eters, it follows that the model exactly reproduces r { \ I b i , b j ) and
r ( b j , b j ) ; it does not, however, respond directly to the skewness and k u r-
tosis of b - , .

7.7.3 The Model Under Normality

In this case it is known (Fisher, 1930) that

    μ20 = 6(n - 2)/{(n + 1)(n + 3)}
    β2(√b1) = 3 + 36(n - 7)(n² + 2n - 5)/{(n - 2)(n + 5)(n + 7)(n + 9)}
    ν01 = 3(n - 1)/(n + 1)
    μ02 = 24n(n - 2)(n - 3)/{(n + 1)²(n + 3)(n + 5)}
    μ21 = 216n(n - 2)(n - 3)/{(n + 1)²(n + 3)(n + 5)(n + 7)}

so that, after simplification from (7.21),

    a = (n - 2)(n + 5)(n + 7)(n² + 27n - 70)/(6Δ)
    b = 0
    c = (n - 7)(n + 5)(n + 7)(n² + 2n - 5)/(6Δ)        (7.24)
    k = (n + 5)(n + 7)(n³ + 37n² + 11n - 313)/(12Δ)
    Δ = (n - 3)(n + 1)(n² + 15n - 4)

Since c > 0 for the existence of the density, we must have n > 7, in which case there is always a unique solution. It is also evident that as the sample size increases a and c → n/6, whereas k → n/12, so that p and r approach 3.
Param eters o f the model a re shown in Table 7 . 6 and comparisons of the con­
ditional means and variances (theory versus simulation) in T able 7 . 7 .
Com parison of the omnibus test contours K| (equivalent to K| fo r n > 25)
and those fo r the bivariate model fo r n = 100 a re given in Figure 7.6a. The
new contours a re m ore responsive to the bim odallly property noticed in the
a rra y s fo r Ьг > 3. Theoretically, the \ I b i a rra y s fo r given Ьг from the
model have a density of the form

ф(х) = c e ^ (d^ - x^)^^^^ ^/Г(а + bx^) (x^ < d^, a ,b > 0; x = N/bi)

and fo r certain param eter combinations this can show multimodality.


A visual count of the Monte C arlo coi^lets (NZb^, Ьз) outside the 99, 95,
and 90% bivariate contours gave 11, 52, and 94 occurrences. Another run
fo r 1000 sam ples o f n = 50 gave 11, 60, and 100 occurrences fo r the same
three levels.
Contours of 90 and 95% in the N/bi, b 2 plane fo r sam ples n = 20(5) 65,
75, 85, 100, 120, 150, 200, 250, 300, 500 and 1000 are shown in Figures
7.6b and 7.6c.
A very interesting study by Tietjen and Low (1975) displays several
three dimensional plots of the joint distribution o f ^ Ih i and h z , along with
a set of contours at the 95% level, sample sizes ranging from 4 to 50, the

TABLE 7.6 Parameters of the Model in Normal Sampling

   n        a         c         k        p      r

  10      5.385     0.774     5.045    2.06   1.15
  30      8.797     4.208     5.778    2.52   1.73
  50     12.184     7.493     7.311    2.67   2.02
 100     20.578    15.763    11.395    2.81   2.38
TABLE 7.7 Conditional Means and Variances for the Model in Normal Sampling*

                          Sample size n = 10                 Sample size n = 50
Range of √b1         E(b2|√b1)   Var(b2|√b1)    c       E(b2|√b1)   Var(b2|√b1)    c

-0.02 < √b1 < 0.02   2.07 (2.11)  0.21 (0.21)  2695    2.67 (2.68)  0.23 (0.21)  5108
 0.18 < √b1 < 0.22   2.11 (2.14)  0.21 (0.19)  2581    2.75 (2.78)  0.23 (0.22)  3747
 0.38 < √b1 < 0.42   2.25 (2.30)  0.22 (0.21)  1999    2.99 (3.04)  0.25 (0.26)  1885
 0.58 < √b1 < 0.62   2.48 (2.58)  0.22 (0.24)  1411    3.40 (3.48)  0.28 (0.29)   645
 0.78 < √b1 < 0.82   2.81 (2.90)  0.23 (0.23)   822    3.96 (4.04)  0.32 (0.33)   188
 0.98 < √b1 < 1.02   3.22 (3.41)  0.24 (0.24)   481    4.69 (4.46)  0.37 (0.16)    32

*Parenthetic entries refer to Monte Carlo simulations, the number of samples drawn in each range being c.

FIGURE 7.6a Normal sampling. Contours of 90, 95, and 99 percent content, n = 100. (Remarks on the test samples are given in Sec. 7.8.)

FIGURE 7.6b Normal sampling, bivariate contours, 90 percent level, n = 20(5)65, 75, 85, 100, 120, 150, 200, 250, 300, 500, and 1000.

FIGURE 7.6c Normal sampling, bivariate contours, 95 percent level.


TABLE 7.8 Illustrations of the Bivariate Model for √b1 and b2*

                      Moments                               Coefficients
              √b1           b2           Joint         Su for √b1     Gamma         Regression

Pearson Type I (√β1 = 0.4, β2 = 2.8, n = 50)
          μ1′ = 0.3660   μ1 = 2.7187   μ11 = 0.1095   δ =  4.766   a =  15.110   E(b2|√b1) =
          μ2  = 0.0752   μ2 = 0.3280   μ21 = 0.0280   γ = -1.711   b =   1.207   2.32 + 0.11√b1 + 1.70b1
          √β1 = 0.2260                                λ =  1.198   c =   8.011
          β2  = 3.2510                                ξ = -0.083   k =  11.412

Pearson Type I (√β1 = 0, β2 = 1.8 (rectangular), n = 50)
          μ1′ = 0        μ1 = 1.8513   μ11 = 0        δ =  6.422   a =  22.957   E(b2|√b1) =
          μ2  = 0.0421   μ2 = 0.0345   μ21 = 0.0049   γ =  0       b =   0       1.80 + 1.31b1
          √β1 = 0                                     λ =  1.301   c =   9.071
          β2  = 3.1006                                ξ =  0       k =  28.839

Normal mixture (two components, equal variances; √β1 = 0.2, β2 = 1.2, n = 75)
          μ1′ = 0.2037   μ1 = 1.2517   μ11 = 0.0232   δ =  5.590   a =  32.202   E(b2|√b1) =
          μ2  = 0.0503   μ2 = 0.0178   μ21 = 0.0063   γ = -0.819   b =   0.010   1.15 + 1.08b1
          √β1 = 0.0806                                λ =  1.220   c =  17.432
          β2  = 3.1431                                ξ =  0.021   k = 211.431

*Numerical results are rounded to three or four decimal places for convenience from machine output in double precision.

results being mainly based on Monte Carlo simulations. Their contours do have shapes similar to concentric parabolic arcs, although the gradient change is sharp at the intersections.

7.7.4 The Model Under Nonnormality

Illustrations of the parameters in a few cases for the bivariate model are given in Table 7.8.
In Figures 7.4 and 7.7 there is a comparison of the X²SR contours and bivariate contours for a normal mixture population. There is little to choose between the two. Again, an example of the bivariate contours for another

FIGURE 7.7 Comparison of bivariate contours and omnibus contours (Figure 7.4) for a normal mixture, √β1 = 1.0, β2 = 2.6, n = 50; 90, 95, and 99 percent levels.

FIGURE 7.8 Population: normal mixture, √β1 = 0.2, β2 = 1.2, n = 75; 90, 95, and 99 percent bivariate contours (r(√b1, b2) = 0.75).

normal mixture is given in Figure 7.8. In each figure there are 1000 simulated couplets (√b1, b2), and there is good evidence of the agreement between the contours and the trend and density of the dots.

7.8 EXPERIMENTAL SAMPLES

The editors have provided 17 random samples of 100 from specified populations for discussion. The corresponding values of √b1, b2 are plotted in Figure 7.6a. If we work at the 99% level, we should reject populations 2, 3, 5, 11, 12, 14, 15, and 17 immediately without further complicated calculations. This brings out the striking simplicity of the omnibus test approach.
Moreover, if we had some prior knowledge of the population sampled, we could (if it is specified by (√β1, β2)) construct omnibus test regions from X²SR or the bivariate model to assess significance. For example, sample 12 is drawn from a normal mixture with √β1 = 0.80, β2 = 4.02; the X²SR contours are shown in Figure 7.5. The sample values √b1 = 0.73, b2 = 3.49 are well inside the 90% contour.

In Figure 7.9, we give illustrations of the 90% bivariate contours for n = 200 in sampling from Pearson Type I distributions (the moderately large sample of 200 is used merely to contain significant pictures on the scale used). Note that for symmetric populations the size of the acceptance region decreases as β2 decreases. Moreover, there is a transition to an elongated shape, especially for √β1 large and β2 − β1 − 1 small; however, this elongated shape broadens considerably as β2 increases. The actual area contained in the contours (those for 95, 99% not shown) turns out to be (see Figure 7.9 for the distribution numbers):

Distribution    90%       95%       99%

1               0.0815    0.1064    0.1646
2               0.1201    0.1566    0.2418
3               0.7718    1.0161    1.6046
4               0.4413    0.5771    0.8970
5               0.9312    1.2492    8.8236
6               0.0899    0.1179    0.1846
7               0.2313    0.3024    0.4695
8               —         —        16.2689

These confirm fairly obvious notions in attempting to discriminate between distributions (Pearson Type I in this case) using √b1 and b2. For

FIGURE 7.9 Examples of different shapes of bivariate contours of 90 percent content for several Type I populations (n = 200); numbers 1-8 (small circles) indicate (√β1, β2) of population sampled.

example, there should be no problem (with moderate sample sizes) in deciding between a normal distribution and symmetric alternatives for which β2 < 1.8 (distributions 3, 2, 1). However, with moderate skewness, the regions overlap, especially for moderate values of β2 (i.e., distributions not close to the "impossible" boundary) and very emphatically in the vicinity of the Type III boundary.
For comments on the tests described here and other tests for departures from normality from the power point of view, the reader is referred to Pearson, D'Agostino, and Bowman (1977).

REFERENCES

Anscombe, F. J. and Glynn, W. J. (1975). Distribution of the kurtosis statistic b2 for normal samples. Tech. Rep. 37, Department of Statistics, Yale University.

Baker, G. A., Jr. (1965). The theory and application of the Padé approximant method. In: Brueckner, K. A. (Ed.), Advances in Theoretical Physics 1, 1-58. Academic Press, New York.

Baker, G. A., Jr. (1975). Essentials of Padé Approximants. Academic Press, New York.

Baker, G. A., Jr. and Gammel, J. L. (Eds.) (1970). The Padé Approximant in Theoretical Physics. Academic Press, New York.

Bowman, K. O., Beauchamp, J. J., and Shenton, L. R. (1977). The distribution of the t-statistic under non-normality. Internat. Stat. Rev. 45, 233-242.

Bowman, K. O., Lam, H. K., and Shenton, L. R. (1978a). Characteristics of moment series for Student's t and other statistics in non-normal sampling. Proc. Amer. Statist. Assoc., 206-211.

Bowman, K. O., Lam, H. K., and Shenton, L. R. (1978b). Levin's algorithm applied to some divergent series. Unpublished.

Bowman, K. O. and Shenton, L. R. (1975a). Tables of moments of the skewness and kurtosis statistics in non-normal sampling. Union Carbide Nuclear Division Report UCCND-CSD-8.

Bowman, K. O. and Shenton, L. R. (1975b). Omnibus test contours for departures from normality based on √b1 and b2. Biometrika 62, 243-250.

Bowman, K. O. and Shenton, L. R. (1978). Asymptotic series and Stieltjes continued fraction for a gamma function ratio. J. Comput. Appl. Math. 4(2), 105-111.

Bowman, K. O. and Shenton, L. R. (1979a). Approximate percentage points for Pearson distributions. Biometrika 66(1), 145-151.

Bowman, K. O. and Shenton, L. R. (1979b). Further approximate Pearson percentage points and Cornish-Fisher. Comm. Statist. B—Simulation Comput. B8(3), 231-234.

Bowman, K. O. and Shenton, L. R. (1980a). Evaluation of the parameters of SU by rational fractions. Comm. Statist. B—Simulation Comput. B9(2), 127-132.

Bowman, K. O. and Shenton, L. R. (1980b). Explicit approximate solutions to SB. Comm. Statist. B—Simulation Comput. B10(1), 1-15.

Bowman, K. O. and Shenton, L. R. (1983a). Moment series for estimators of the parameters of a Weibull density. Proceedings of Computer Science and Statistics: 14th Symposium on the Interface. Springer-Verlag, New York, 184-186.

Bowman, K. O. and Shenton, L. R. (1983b). The Distribution of the Standard Deviation and Skewness in Gamma Sampling—A New Look at a Craig-Pearson Study. ORNL/CSD-109.

Bowman, K. O. and Shenton, L. R. (1983c). Levin's summation algorithm. Encyclopedia of Statistical Sciences, Vol. 4 (Samuel Kotz and N. L. Johnson, Eds.). Wiley, New York, 611-617.

Bowman, K. O. and Shenton, L. R. (1986). The dark side of asymptotics and moment series. Sankhya 48, Series A. (In press.)

Brezinski, C. (1977). A Bibliography on Padé Approximation and Related Subjects. Université des Sciences et Techniques de Lille.

Brezinski, C. (1978). A Bibliography on Padé Approximation and Related Subjects, Addendum 1. Université des Sciences et Techniques de Lille.

Brezinski, C. (1980). A Bibliography on Padé Approximation and Related Subjects, Addendum 2. Université des Sciences et Techniques de Lille.

Brezinski, C. (1981). A Bibliography on Padé Approximation and Related Subjects, Addendum 3. Université des Sciences et Techniques de Lille.

D'Agostino, R. B. (1970). Transformations to normality of the null distribution of g1. Biometrika 57, 679-681.

D'Agostino, R. B. and Pearson, E. S. (1973). Tests for departures from normality. Empirical results for the distribution of b2 and √b1. Biometrika 60, 613-622.

D'Agostino, R. B. and Pearson, E. S. (1974). Correction and amendment. Tests for departures from normality. Empirical results for the distribution of b2 and √b1. Biometrika 61, 647.

Elderton, W. P. and Johnson, N. L. (1969). Systems of Frequency Curves. Cambridge University Press, London.

Fisher, R. A. (1928). Moments and product moments of sampling distributions. Proc. Lond. Math. Soc. (Series 2) 30, 199-238.

Fisher, R. A. (1930). The moments of the distribution for normal samples of measures of departures from normality. Proc. Roy. Soc., Series A 130, 17-28.

Geary, R. C. and Worlledge, J. P. G. (1947). On the computation of universal moments of tests of statistical normality derived from samples drawn at random from a normal universe. Application to the calculation of the seventh moment of b2. Biometrika 34, 98-110.

Henrici, P. (1976). Applied and Computational Complex Analysis, Vol. 2. Wiley, New York.

Hsu, C. T. and Lawley, D. N. (1939). The derivation of the fifth and sixth moments of b2 in samples from a normal population. Biometrika 31, 238-248.

Johnson, N. L. (1949). Systems of frequency curves generated by methods of translation. Biometrika 36, 149-176.

Johnson, N. L. (1965). Tables to facilitate fitting SU frequency curves. Biometrika 52, 547-558.

Johnson, N. L. and Kotz, S. (1970). Distributions in Statistics—Continuous Univariate Distributions, Vol. 1. Houghton Mifflin, Boston.

Kahn, H. (1956). Application of Monte Carlo. Rand Corporation, Santa Monica.

Levin, David (1973). Development of non-linear transformations for improving convergence of sequences. Internat. J. Comput. Math. 3, 371-388.

Mulholland, H. P. (1977). On the null distribution of √b1 for samples of size at most 25, with tables. Biometrika 64, 401-409.

Olver, F. W. J. (1974). Asymptotics and Special Functions. Academic Press, New York.

Pearson, E. S. (1930). A further development of tests for normality. Biometrika 22, 239-249.

Pearson, E. S. (1963). Some problems arising in approximating to probability distributions, using moments. Biometrika 50, 95-111.

Pearson, E. S. (1965). Tables of percentage points of √b1 and b2 in normal samples: a rounding off. Biometrika 52, 282-285.

Pearson, E. S., D'Agostino, R. B., and Bowman, K. O. (1977). Tests of departure from normality: comparison of powers. Biometrika 64, 231-246.

Pearson, E. S. and Hartley, H. O. (1972). Biometrika Tables for Statisticians, Vol. 2. Cambridge University Press, London.

Pearson, K. (1905). On the general theory of skew correlation and non-linear regression. Drapers' Company Memoirs, Biometric Series II.

Shenton, L. R. and Bowman, K. O. (1975). Johnson's SU and the kurtosis statistics. J. Amer. Statist. Assoc. 70, 220-228.

Shenton, L. R. and Bowman, K. O. (1977a). A new algorithm for summing divergent series. J. Comput. Appl. Math. 3, 35-51.

Shenton, L. R. and Bowman, K. O. (1977b). A bivariate model for the distribution of √b1 and b2. J. Amer. Statist. Assoc. 72, 206-211.

Shenton, L. R., Bowman, K. O., and Sheehan, D. (1971). Sampling moments of moments associated with univariate distributions. J. Roy. Statist. Soc. B 33, 444-457.

Tietjen, G. L. and Lowe, V. W., Jr. (1975). Some aspects of the joint distribution of skewness and kurtosis in normal samples. (Private communication.)

Van Dyke, M. (1974). Analysis and improvement of perturbation series. Quart. J. Mech. Appl. Math. 27, 424-450.

Wishart, J. (1930). The derivation of certain high-order sampling product-moments from a normal population. Biometrika 22, 224.

APPENDIX 1

The main Pearson distribution types (Elderton and Johnson, 1969) are:

Type       Equation                                              Limits for x

I          f(x) = y0 (1 + x/a1)^m1 (1 − x/a2)^m2                 −a1 < x < a2

III        f(x) = y0 x^m exp(−x/a)                               x > 0

IV         f(x) = y0 (1 + x²/a²)^(−m) exp{−b arctan(x/a)}        −∞ < x < ∞

V          f(x) = y0 x^(−m) exp(−a/x)                            0 < x < ∞

VI         f(x) = y0 x^(−m1) (x − a)^m2                          a < x < ∞

Restrictions on the parameters are omitted, but in all cases the probability integrals must exist. As for the structures, Type I can be bell-shaped, J-shaped, or U-shaped, the range being finite. The other types have unlimited ranges and generally, at most, one mode.

APPENDIX 2

A2.1 A Computer Version of the K² Test Under Normality

Let Xs(√b1) and Xs(b2) be the normal deviates corresponding to the skewness (√b1) and kurtosis (b2), respectively. Then K² is

K² = Xs²(√b1) + Xs²(b2)    (A1)

We shall need the moments of √b1 and b2.
The first four moments of √b1 are

μ1'(√b1) = 0
μ2(√b1) = 6(n − 2)/{(n + 1)(n + 3)}
μ3(√b1) = 0    (A2)
β2(√b1) = 3(n² + 27n − 70)(n + 1)(n + 3)/{(n − 2)(n + 5)(n + 7)(n + 9)}

Similarly, those for b2 are

μ1'(b2) = 3(n − 1)/(n + 1)

μ2(b2) = 24n(n − 2)(n − 3)/{(n + 1)²(n + 3)(n + 5)}

μ3(b2) = 1728n(n − 2)(n − 3)(n² − 5n + 2)/{(n + 1)³(n + 3)(n + 5)(n + 7)(n + 9)}    (A3)

μ4(b2) = 1728n(n − 2)(n − 3)π(n)/{(n + 1)⁴(n + 3)(n + 5)(n + 7)(n + 9)(n + 11)(n + 13)}

where π(n) = n⁵ + 207n⁴ − 1707n³ + 4105n² − 1902n + 720.
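These moment formulas are easy to script for checking; a minimal sketch (function names are ours, not the chapter's):

```python
def sqrtb1_moments(n):
    """Null (normal-sampling) mean, variance, and beta2 of sqrt(b1), per (A2)."""
    mu2 = 6.0 * (n - 2) / ((n + 1) * (n + 3))
    beta2 = 3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3) / (
        (n - 2) * (n + 5) * (n + 7) * (n + 9))
    return 0.0, mu2, beta2

def b2_moments(n):
    """Null mean and second, third, fourth central moments of b2, per (A3)."""
    mu1 = 3.0 * (n - 1) / (n + 1)
    mu2 = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    mu3 = 1728.0 * n * (n - 2) * (n - 3) * (n * n - 5 * n + 2) / (
        (n + 1) ** 3 * (n + 3) * (n + 5) * (n + 7) * (n + 9))
    pi_n = n ** 5 + 207 * n ** 4 - 1707 * n ** 3 + 4105 * n ** 2 - 1902 * n + 720
    mu4 = 1728.0 * n * (n - 2) * (n - 3) * pi_n / (
        (n + 1) ** 4 * (n + 3) * (n + 5) * (n + 7) * (n + 9) * (n + 11) * (n + 13))
    return mu1, mu2, mu3, mu4
```

For n = 10 this gives μ2(√b1) = 48/143 ≈ 0.3357 and μ1'(b2) = 27/11 ≈ 2.4545; as n grows, μ4(b2)/μ2²(b2) tends to 3, as it should for an asymptotically normal statistic.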

A2.2 General Case for SU

To fit a Johnson SU to a statistic T (or random variable), let its moment parameters be μ1'(T), μ2(T), √β1(T), and β2(T), where

√β1(T) = μ3(T)/{μ2(T)}^(3/2)

β2(T) = μ4(T)/{μ2(T)}²
MOMENT (s lh i, b j) TECHNIQUES 323

For simplicity in the following, we shall write √β1, β2 for √β1(T), β2(T), respectively.

A2.2.1 Quick Approximate Solution

We define w = exp(1/δ²), Ω = γ/δ. Using β1 and β2, compute

φ(β1, β2) = Π1(β1, γ2)/Π2(β1, γ2)    (A4)

where, for i = 1, 2, with γ2 = β2 − 3,

Πi(β1, γ2) = Σ over 0 ≤ r + s ≤ 3 of c_rs^(i) β1^r γ2^s

The parameters {c_rs^(i)} are given in Table A1.
Then an approximation to w is

w* = √[√{2β2 − 2β1 φ(β1, β2) − 2} − 1]    (A5)

(For 3 < β2 < 75 and β1 bounded by the lognormal line in the (β1, β2) plane, the error for w* is 0.006%.)

TABLE A1 Coefficients c_rs^(i) a

r s    Numerator                    Denominator

0 0     1.333848465690817            1.0
1 0    −5.455870858760243 (1)       −3.621879838877379 (1)
0 1     4.120727348534858 (1)        2.677096382861022 (1)
2 0     4.557065299738849 (2)        2.215014552020006 (2)
1 1    −7.219603313144450 (2)       −3.727562379795881 (2)
0 2     2.459166955114775 (2)        1.266073347716621 (2)
3 0    −3.989549653042761 (4)       −9.985226235607946 (5)
2 1     1.018227677593445 (3)        2.517660829639101 (4)
1 2    −8.686256219859072 (4)       −2.555913695361249 (4)
0 3     2.000771886697820 (4)        6.266097479093280 (5)

a Parenthetic entries taken negatively give the power of 10 by which the corresponding entry is to be multiplied.

Now compute Ω* from

sinh Ω* = −√[{−D(w*) − 1 − w*}/(2w*)]    (A6)

where

D(w) = [B(w) + √{B²(w) − 4A(w)C(w)}]/{2A(w)}

and

A(w) = 2(β2 − w⁴ − 2w³ − 3w² + 3)

B(w) = 4(w² − 1)(w² + 2w + 3)    (A7)

C(w) = (w² − 1)²(w² + 2w + 3)

(sinh⁻¹ x = ln{x + √(x² + 1)}, x real; the sign of sinh Ω* is taken opposite to that of √β1.) Go to A2.2.6 to determine ξ and λ and complete the approximate solution.
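The quadratic (A6)-(A7) can be scripted directly. The sketch below is ours, not the chapter's; the root of the quadratic and the sign conventions (−D inside the square root, negative sinh Ω for √β1 > 0) were chosen so as to reproduce the worked examples of A2.2.5:

```python
import math

def omega_from_w(w, beta2):
    """Solve (A6)-(A7) for Omega given w and beta2, for the sqrt(beta1) > 0 case."""
    A = 2.0 * (beta2 - w**4 - 2*w**3 - 3*w**2 + 3)
    B = 4.0 * (w**2 - 1) * (w**2 + 2*w + 3)
    C = (w**2 - 1)**2 * (w**2 + 2*w + 3)
    D = (B + math.sqrt(B * B - 4 * A * C)) / (2 * A)
    s2 = max(0.0, (-D - 1 - w) / (2 * w))   # sinh^2(Omega); guard against rounding
    return -math.asinh(math.sqrt(s2))
```

At the Example 1 solution of A2.2.5 (w = 1.596005601, β2 = 14.4765) it returns Ω ≈ −0.905956, and at the symmetric Example 2 solution (w = 1.470468517, β2 = 6) it returns Ω ≈ 0.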

A2.2.2 Exact Solution from Bivariate Iteration

(1) Determine w.
(2) Determine Ω.
(3) Go to A2.2.6.

Starting value for w

Compute

w1 = √[√(2β2 − 2) − 1]

u1 = 1 + ∛{P + √Q} − ∛{−P + √Q}    (P = 2β2 + 7, Q = (4β2/3 + 3)³ + P²)

A = 1 − √(u1 − 2),    B = u1/2 − √(u1 − 2)/2 − 1/√(u1 − 2)    (A8)

w2 = [−A + √(A² − 4B)]/2

Then a starting value is

w0 = (w1 + w2)/2

A better starting value in general is that given in (A5). Note that if (β1, β2) is close to the lognormal line, the iterative process may abort (this line is defined by w = 1 + β1/(w + 2)² and β2 = w⁴ + 2w³ + 3w² − 3). At this stage check whether there is, in any event, a Johnson SU curve to match (β1, β2).
There is no solution if

β1 > β*

where, using (A8),

β* = (w2 − 1)(w2 + 2)²

A2.2.3 The Iterates for w and Ω

(1) From (A6), using w0 for w*, compute sinh Ω0 and Ω0, with the restrictions:

(i) Ω0 is taken to be < 0 if √β1 > 0.
(ii) Ω0 is taken to be > 0 if √β1 < 0.
(iii) Ω itself is taken to be zero if β1 = 0; in this case

w = √{√(2β2 − 2) − 1}

and, from A2.2.6, ξ = μ1'(T), λ = √{2μ2(T)/(w² − 1)}.

(2) Using w0, Ω0, compute √β1*(w0, Ω0) from

√β1*(w, Ω) = {½w(w − 1)}^(1/2) [w(w + 2) sinh 3Ω + 3 sinh Ω]/[w cosh 2Ω + 1]^(3/2)    (A10)

Solve for a new w (say w1) the equation

φ(β1, β2; w) = φ(β1*, β2; Ω0)

where

φ(β1, β2; w) = {β2 − ½(w⁴ + 2w² + 3)}/β1

Thus

w1 = √[√{2β2 − 2β1 φ(β1*, β2; Ω0) − 2} − 1]    (A11)

Return this new value to (1) and continue the cycle to meet whatever tolerance is deemed reasonable; for example, at the sth cycle, demand

|ws − ws−1| < ε    (ε = 10⁻⁶ or so)

One could also require a tolerance based on the larger of |ws − ws−1| and |Ωs − Ωs−1|.
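The complete cycle of A2.2.3 can be sketched in code as follows (our implementation, not the chapter's; omega_from_w solves (A6)-(A7), and the update is (A10)-(A11); the √β1 > 0 case is assumed):

```python
import math

def omega_from_w(w, beta2):
    # (A6)-(A7): Omega given w, for the sqrt(beta1) > 0 case
    A = 2.0 * (beta2 - w**4 - 2*w**3 - 3*w**2 + 3)
    B = 4.0 * (w**2 - 1) * (w**2 + 2*w + 3)
    C = (w**2 - 1)**2 * (w**2 + 2*w + 3)
    D = (B + math.sqrt(B * B - 4 * A * C)) / (2 * A)
    return -math.asinh(math.sqrt(max(0.0, (-D - 1 - w) / (2 * w))))

def fit_su_w_omega(beta1, beta2, w0, tol=1e-9, itmax=100):
    """Bivariate iteration of A2.2.3 for (w, Omega)."""
    w = w0
    for _ in range(itmax):
        om = omega_from_w(w, beta2)
        # (A10): beta1 implied by the current (w, Omega)
        b1_star = (math.sqrt(0.5 * w * (w - 1))
                   * (w * (w + 2) * math.sinh(3 * om) + 3 * math.sinh(om))
                   / (w * math.cosh(2 * om) + 1) ** 1.5) ** 2
        # phi evaluated at (beta1*, beta2; Omega0), then (A11) gives the new w
        phi = (beta2 - 0.5 * (w**4 + 2 * w**2 + 3)) / b1_star
        w_new = math.sqrt(math.sqrt(2 * beta2 - 2 * beta1 * phi - 2) - 1)
        if abs(w_new - w) < tol:
            w = w_new
            break
        w = w_new
    return w, omega_from_w(w, beta2)
```

Running it on Example 1 of A2.2.5 (√β1 = 2.1680, β2 = 14.4765, w0 = 1.758256474) reproduces the printed answer w = 1.596005601, Ω = −0.905955881, including the first iterate w1 = 1.579960.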

A2.2.4 Examples

Check a few cases using the final computed values of w and Ω in (7.16) to determine √β1 and β2.

A2.2.5 Illustrations

Example 1. Computed on HP97 (see ESS, Vol. 4, p. 312)

A random sample of 15 is drawn from a population with density

f(x) = x^(p−1) e^(−x/a)/{a^p Γ(p)}    (x > 0; p, a > 0)

p̂ is a maximum likelihood estimator of p and the solution of

ln p̂ − ψ(p̂) = ln(Arithmetic Mean/Geometric Mean)

Given the moments of p̂ are

μ1' = 1.2044
μ2 = 0.2299
√β1 = 2.1680
β2 = 14.4765

Then

w2 = 1.469166402
√β* = 2.376230596 > √β1, so there is a solution
w0 = 1.758256474; iterate until |ws − ws−1| < 0.000000001

Iterates:
w1 = 1.579959821
w2 = 1.597436626
w3 = 1.595876511
w4 = 1.596017234
w5 = 1.596004551
w6 = 1.596005695
w7 = 1.596005591
w8 = 1.596005602
w9 = 1.596005601
w10 = 1.596005601
w11 = 1.596005601
Ω = −0.905955881

μ1' = 1.204400000
μ2 = 0.229900000
λ = 0.358105885
ξ = 0.736127663

Feed-back:
w = 1.596005601
Ω = −0.905955881
μ1(y) = 1.307636524
μ2(y) = 1.792734830
√β1 = 2.167999999
β2 = 14.476500001

Example 2. HP97 (see ESS, Vol. 4, p. 312)

The density of a random variable X is f(x) = ½ exp(−|x|), −∞ < x < ∞, with moments μ1' = 0, μ2 = 2, √β1 = 0, and β2 = 6.

Then SU is given by

β2 = 6.000000000
w0 = 1.315956418
w = 1.470468517
δ = 1.610431098
μ1' = 0.000000000
μ2 = 2.000000000
λ = 1.855132999
Ω = 0.000000000
ξ = 0.000000000

Feed-back:
w = 1.470468517
Ω = 0.000000000
√β1 = 0.000000000
μ2(y) = 0.581138830
β1 = 0.000000000
β2 = 5.999999997

A2.2.6 The Scale and Origin Parameters

Use the first two moments of the statistic T, μ1'(T) and μ2(T), to determine

λ = √[2μ2(T)/{(w − 1)(w cosh 2Ω + 1)}]

ξ = μ1'(T) + λ√w sinh Ω

and

Z = γ + δ sinh⁻¹{(T − ξ)/λ}
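A sketch of A2.2.6 in code (our naming; δ = 1/√(ln w) and γ = Ωδ follow from w = exp(1/δ²) and Ω = γ/δ in A2.2.1):

```python
import math

def su_scale_origin(mu1, mu2, w, om):
    """lambda and xi from the first two moments of T and the fitted (w, Omega)."""
    lam = math.sqrt(2.0 * mu2 / ((w - 1.0) * (w * math.cosh(2.0 * om) + 1.0)))
    xi = mu1 + lam * math.sqrt(w) * math.sinh(om)
    return lam, xi

def su_deviate(t, lam, xi, w, om):
    """Z = gamma + delta * asinh((T - xi)/lambda)."""
    delta = 1.0 / math.sqrt(math.log(w))
    gamma = om * delta
    return gamma + delta * math.asinh((t - xi) / lam)
```

With μ1' = 1.2044, μ2 = 0.2299, w = 1.596005601, Ω = −0.905955881 (Example 1 of A2.2.5) this gives λ ≈ 0.358106 and ξ ≈ 0.736128, as in the worked example.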

A2.3 The Test

Construct K² = Xs²(√b1) + Xs²(b2). Bowman and Shenton (1975b) have accumulated extensive simulation results for K² for n = 25 (5) 100 (10) 250, 300, 500, 1000. A quadratic regression for each of the percentage points 90, 95, and 99 has been worked out to provide acceptance levels for K² under the null hypothesis for n = 30 (1) 5000. The quadratic regressions are, for 30 ≤ n ≤ 1000:

Fl = 4.50225 + 5.39716 X IQ -^ n + 8.59940 X 10’ ®n^

95% F2 = 6.22848 + 1.52485 X 1 0 "½ + 1.38568 X 10"^n^ (A12)

F3 = 10.8375 + 2.80482 X 10"^n + 4.97077 X lO-'^n^

Hence for a given value of n, √b1, b2, proceed as follows:

(1) Compute δ and α from (7.3), thence Y1, and finally Xs(√b1).

(2) Compute the corresponding quantities for b2, and thence Xs(b2).

(3) Construct K² = Xs²(√b1) + Xs²(b2) and compare with the levels F1, F2, F3 for the given n.

It is thought that there may be a use for this computerized version as a subroutine in a data analysis package, since the omnibus charts are visual aids. There would be an automatic printout of significance at the three levels.
Lastly, note that the SU transformation for the skewness statistic √b1 under a normal population assumption is given explicitly (in terms of the sample size n) from the parameters

Y1 = √{(n + 1)(n + 3)b1/(6n − 12)}

β2(√b1) = 3(n² + 27n − 70)(n + 1)(n + 3)/{(n − 2)(n + 5)(n + 7)(n + 9)}

W = √{√(2β2 − 2) − 1}

δ = 1/√(ln W)

α = √{2/(W² − 1)}

Then

Xs(√b1) = δ ln{Y1/α + √(1 + (Y1/α)²)}

is an approximate normal variate.
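In code, the transformation reads as below (a sketch; we take δ = 1/√(ln W), consistent with w = exp(1/δ²) in A2.2.1):

```python
import math

def xs_sqrtb1(rootb1, n):
    """Approximate normal deviate for sqrt(b1) under normal sampling."""
    y = rootb1 * math.sqrt((n + 1.0) * (n + 3.0) / (6.0 * (n - 2.0)))
    beta2 = 3.0 * (n * n + 27.0 * n - 70.0) * (n + 1.0) * (n + 3.0) / (
        (n - 2.0) * (n + 5.0) * (n + 7.0) * (n + 9.0))
    w2 = math.sqrt(2.0 * beta2 - 2.0) - 1.0          # W squared
    delta = 1.0 / math.sqrt(math.log(math.sqrt(w2)))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    return delta * math.log(y / alpha + math.sqrt((y / alpha) ** 2 + 1.0))
```

The deviate is zero at √b1 = 0 and increases monotonically with √b1, as a normalizing transformation of a symmetric statistic should.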


8

Tests for the Uniform Distribution

Michael A. Stephens, Simon Fraser University, Burnaby, B.C., Canada

8.1 INTRODUCTION

The uniform distribution plays a special role in goodness-of-fit testing. It sometimes arises, as for most tests, in the natural occurrence of certain types of data; but it appears also because there exist many ways of transforming a given sample of X-values, from a distribution other than the uniform, to produce a set of U-values which is uniformly distributed between 0 and 1. This distribution is written U(0,1). A test that the X-sample comes from a certain distribution can then be reduced to a test that the U-sample comes from U(0,1). For most of this chapter we therefore discuss tests of

H0: a complete random sample of n U-values comes from U(0,1)    (8.1)

Tests for a uniform distribution with unknown limits are given in Section 8.16, and tests for censored uniforms are summarized in Section 8.17. An enormous literature exists on the properties of a uniform sample, its order statistics, and its spacings, and no attempt will be made to reference this entire literature. Two articles with extensive references are those by Barton and Mallows (1965) and Pyke (1965). References given in subsequent sections will mostly be concerned with a particular test or group of tests, and the provision of tables.

332 STEPHENS

8.2 NOTATION

The uniform, or rectangular, distribution for a random variable U' with support over a < u' < b has the density f(u') = 1/(b − a), a < u' < b, and this density will be denoted by U(a,b). The transformation U = (U' − a)/(b − a) yields a random variable U with density U(0,1). The distribution function of U is

F(u) = u,   0 ≤ u ≤ 1    (8.2)

The distribution in (8.2) will be called the standard uniform distribution; U' or U will be described as a uniform random variable, or simply as a uniform, and a random sample from U(0,1) will be called a uniform sample, with notation Ui, i = 1, ..., n; when these are placed in ascending order the order statistics U(1) < U(2) < ··· < U(n) will be called ordered uniforms. Either Ui or U(i) might be regarded as the components of a vector U. It is convenient to define U(0) = 0 and U(n+1) = 1. The notation will also be extended to sample values which are to be tested to be U(0,1).
Spacings. When a uniform sample is ordered, the spacings Di are defined by:

Di = U(i) − U(i−1),   i = 1, ..., n + 1    (8.3)

In particular, D1 = U(1) and Dn+1 = 1 − U(n). An important result is that the spacings Di have the distribution of a sample from the exponential distribution F(u) = 1 − exp(−u/β), (u > 0), with β = 1/(n + 1), conditional on the constraint Σi Di = 1. Spacings are discussed in Section 8.9.
Many of the statistics to follow involve calculations of maxima, or of sums. Unless otherwise stated, the index used (usually i) in these expressions will run from 1 to n.
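The spacings (8.3) in code (a short sketch; the sample is the one used throughout this chapter, listed in Table 8.1):

```python
def spacings(u_sorted):
    """D_i = U_(i) - U_(i-1), i = 1..n+1, with U_(0) = 0 and U_(n+1) = 1."""
    ext = [0.0] + list(u_sorted) + [1.0]
    return [ext[i] - ext[i - 1] for i in range(1, len(ext))]

u = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
d = spacings(u)
```

For this sample the n + 1 = 11 spacings begin .004, .300, .308, ... and sum to 1, as they must.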

8.3 TRANSFORMATIONS TO UNIFORMS

Some of the most important transformations from a non-uniform distribution to a uniform distribution U(0,1) are:

a) The Probability Integral Transformation, and the related half-sample method, considered in Chapter 4;
b) The Conditional Probability Integral Transformation discussed in Chapter 6;
c) The J and K transformations, discussed in Sections 10.5.4 and 10.5.5, which take exponential random variables to uniform random variables.
TESTS FOR THE UNIFORM DISTRIBUTION 333

There are also methods by which uniform random variables U may themselves be transformed to a new set of values U' or U'' which also has the U(0,1) distribution. Two of these are transformations G and W below.

8.4 TRANSFORMATIONS FROM UNIFORMS TO UNIFORMS

8.4.1 The G-Transformation

Suppose U(i), i = 1, ..., n, is an ordered random sample from U(0,1), and let Dk, k = 1, 2, ..., n + 1, be the spacings. Suppose further that these spacings are themselves ordered, so that D(1), D(2), ..., D(n+1) constitutes the ordered spacings, and define D(0) = 0. Then construct new variables D'r from

D'r = (n + 2 − r)(D(r) − D(r−1)),   r = 1, ..., n + 1    (8.4)

The set {D'r} is another set of unordered uniform spacings (Sukhatme, 1937). From these, we can clearly build up another ordered uniform sample:

U'(j) = Σ from i = 1 to j of D'i,   j = 1, ..., n

This transformation, which we call the G transformation, takes a set U of ordered uniforms and produces another set U' of ordered uniforms; we can write U' = GU. A useful purpose of the G transformation is to increase the power of a test for uniformity. Suppose the parent population of the U-values is not uniform, but close to uniform. Durbin (1961) showed that, loosely speaking, transformation G makes large spacings larger and small spacings smaller so that, when the transformation is applied to the U-set, the resulting set U' will usually appear further removed from uniform. A test for uniformity would then be more powerful applied to U' than to U. There is, of course, a limit to this argument: if the original sample U were far removed from uniform, the transformation will possibly make the resulting U' appear more like uniform, and power would be lost. A more complete discussion of how the spacings might look before G loses power rather than gains it is in Seshadri, Csörgő, and Stephens (1969). In practical terms, G will increase power if the set tested might be thought to be close to the uniform distribution, but not if it is far away.
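A sketch of G in code (our implementation of (8.4); the telescoping identity Σr D'r = Σr D(r) = 1 guarantees the output is again an ordered point set in (0,1)):

```python
def g_transform(u_sorted):
    """Sukhatme/Durbin G transformation: ordered uniforms -> ordered uniforms."""
    n = len(u_sorted)
    ext = [0.0] + list(u_sorted) + [1.0]
    d = sorted(ext[i] - ext[i - 1] for i in range(1, len(ext)))  # ordered spacings
    d = [0.0] + d                                                # prepend D_(0) = 0
    dprime = [(n + 2 - r) * (d[r] - d[r - 1]) for r in range(1, n + 2)]
    out, s = [], 0.0
    for j in range(n):          # U'_(j) = cumulative sums of the new spacings
        s += dprime[j]
        out.append(s)
    return out

u2 = g_transform([.004, .304, .612, .748, .771, .806, .850, .885, .906, .977])
```

Applied to the Table 8.1 sample, the first few transformed values are .044, .214, .232, ...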

8.4.2 The W Transformation

Another transformation to produce uniforms from uniforms is the W transformation. This is given by

U''i = {U(i)/U(i+1)}^i,   i = 1, ..., n

with U(n+1) defined to be 1. The resulting U'' set is a uniform sample, not ordered. Formally, we may write U'' = WU. Wang and Chang (1977) have discussed the use of W in connection with tests for the exponential distribution (see Chapter 10); it arises also in the work of O'Reilly and Stephens (1982).
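A sketch of W in code, assuming the transformation has the form U''i = {U(i)/U(i+1)}^i with U(n+1) = 1 (the standard ratio-of-order-statistics construction; check against Wang and Chang (1977) before relying on it):

```python
def w_transform(u_sorted):
    """W transformation: ordered uniforms -> (unordered) uniforms."""
    n = len(u_sorted)
    ext = list(u_sorted) + [1.0]                      # append U_(n+1) = 1
    return [(ext[i - 1] / ext[i]) ** i for i in range(1, n + 1)]

r = w_transform([0.25, 0.5, 0.75])
```

For the toy ordered sample (0.25, 0.5, 0.75) this gives 0.5, (2/3)², and 0.75³ = 0.421875.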

8.5 SUPERUNIFORM OBSERVATIONS

When a test is made for uniformity, the alternative is often that the sample comes from a distribution which gives spacings more irregular than those from a uniform sample. The implication for testing is that for most test statistics a one-tail test is used. There are, however, some occasions when a sample should be tested as too regular to be uniform; such a sample will be called superuniform. Stuart (1954) entertainingly refers to superuniform observations as "too good to be true." One situation in which one wishes to detect superuniformity arises when there is a suspicion that values unhelpful to a certain hypothesis have been deliberately removed from the sample; hopefully this situation will be rare. However, there are other cases where superuniform observations can arise quite naturally, with important implications in test situations. They can occur, for example, when transformations are made (for example, the J transformation of Chapter 10; see Section 10.6).

8.6 TESTS BASED ON THE EMPIRICAL DISTRIBUTION FUNCTION (EDF)

Tests based on the EDF of a set of observations are discussed extensively in Chapter 4. Those which apply to a test of uniformity are the Case 0 tests of Section 4.4. In the present context, the values U(i) will replace the Z(i) in equations (4.2) and subsequent equations of Sections 4.5 and 4.7.
If it is desired to guard against superuniform observations, the lower tail of EDF statistics should be used. Significance levels for a given value of a test statistic may be found from Tables 4.2 and 4.3, as described in Section 4.5.1.
Statistics V and U² were introduced as omnibus tests for observations on a circle (see Section 4.5.3), but they may be used in the same way as the other statistics, for testing H0 in (8.1) above. The various EDF statistics can be expected to possess different powers when the true population is not the uniform distribution. Power properties are discussed further in Sections 8.13 and 8.14.

E8.6 Example

In Table 8.1 are given 10 values of U(i), the order statistics of a random sample of U-values; these values will be subjected to the various tests for the uniform distribution U(0,1) to be given throughout this chapter. Also included in the table are the EDF statistics calculated from equations (4.2). The modified values of the statistics, for Case 0, are then:

D+* = .270, D−* = 1.258, D* = 1.258, V* = 1.573, W²* = .582, U²* = .217, A² = 2.865

Reference to Table 4.2.1 shows the significance levels corresponding to these values to be approximately:

D+: > .25; D−: .045; D: .09; V: .13; W²: .025; U²: .03; A²: .03

The fact that W², U², and A² all have much the same significance levels is unusual; it reflects the fact that the sample is highly non-uniform in several features. The fact that D− is much more significant than D+ shows that the drift in the values is toward 1 rather than 0. This is also shown by the mean Ū = .686.
Another example of a test for U(0,1) is given in Section 4.4. Here the test is a Case 0 test for normality; because the distribution tested is completely specified, the test reduces to a test that the values of Zi given in Table 4.1 are uniform. Also, in Chapter 10, tests for exponentiality are

TABLE 8.1 Set of Values U(i) and Derived Statistics

1 Values U: .004, .304, .612, .748, .771, .806, .850, .885, .906, .977
2 EDF statistics (Section 8.6): D+: .096; D−: .448; D: .448; V: .544; W²: .554; U²: .207; A²: 2.865
3 Mean: Ū = .686
4 Values vi (Section 8.8.1): −.087, .122, .339, .384, .316, .261, .214, .158, .088, .068
5 Statistics (Section 8.8.1): C+: .384; C−: .087; C: .384; K: .471
6 Spacings Di: .004, .300, .308, .136, .023, .035, .044, .035, .021, .071, .023
7 Statistics: G(10) = .214; Q = 0.361
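The EDF statistics in row 2 can be reproduced with the Case 0 formulas of Chapter 4; a sketch (using the standard forms, e.g. W² = Σi {U(i) − (2i − 1)/(2n)}² + 1/(12n)):

```python
def edf_stats(u_sorted):
    """Case 0 EDF statistics D+, D-, D, V, and W-squared for ordered uniforms."""
    n = len(u_sorted)
    dplus = max((i + 1) / n - u for i, u in enumerate(u_sorted))
    dminus = max(u - i / n for i, u in enumerate(u_sorted))
    w2 = sum((u - (2 * i + 1) / (2.0 * n)) ** 2
             for i, u in enumerate(u_sorted)) + 1 / (12.0 * n)
    return dplus, dminus, max(dplus, dminus), dplus + dminus, w2

u = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
dp, dm, d, v, w2 = edf_stats(u)
```

This gives D+ = .096, D− = .448, D = .448, V = .544, and W² ≈ .554, matching row 2 of the table.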

converted, by transforming the data by transformations J and K, to tests for uniformity.

8.7 REGRESSION AND CORRELATION TESTS

The expected value of U(i), on H0, is mi = i/(n + 1). A useful technique for testing uniformity is to plot the ordered observations against mi, and then to see how well these fit a straight line. When the limits of the distribution are not known, the simple correlation coefficient may be used (see Section 5.5). When the limits are known to be (0,1), the straight line, say L, will join the origin O to the point P = (1,1). Quesenberry and Hales (1980), using the beta distribution of a typical order statistic, have given a useful pictorial method of testing, by giving bands around L in which the observations can be expected to lie. Test statistics may also be based on the deviations from L, that is, on the vi in the next section (see Section 5.6).

8.8 OTHER TESTS BASED O N ORDER STATISTICS

8 . 8 . 1 Comparisons of O rder Statistics with


Expected Values

A measure of the displacement of each variable U(i) from its expected value
m_i = i/(n + 1) is

v_i = U(i) - m_i   (8.5)

and statistics for testing H0 can be based on the v_i. Some statistics which
have been suggested are

C⁺ = max_i v_i;   C⁻ = max_i (-v_i);
C = max (C⁺, C⁻);   K = C⁺ + C⁻;   (8.6)
T₁ = Σ_i v_i²/n;   T₂ = Σ_i |v_i|/n

The test procedure is to calculate the v_i from (8.5), then the statistic
from (8.6), and refer to the upper tail of the appropriate null distribution.
For the same reasons as for EDF statistics, only the upper tail is usually
used in the test for uniformity; for superuniform observations, the lower
tail would be used. Tables for significance points for C⁺, C⁻, and C have
been given by Durbin (1969a), and tables for K by Brunk (1962) and by
Stephens (1969b). From the exact points for each n, Stephens (1970) calcu-
lated modified forms, similar to those for the EDF statistics in Section 8.6,
so that the C and K statistics may be modified and used with only the asymp-
totic points; these points are the same as for the corresponding EDF statis-
tics. The modified forms and the asymptotic points are given in Table 8.2.
Johannes and Rasche (1980) have recently given more detailed modifications
for C⁺, C⁻, and C.
The C-statistics arise in a natural way when testing the periodogram in
time-series analysis; see, for example, Durbin (1969a, 1969b). Statistic T₁
arises if the U(i) are plotted against m_i and a test is based on the sum of
squares of residuals (see Section 5.6).
Hegazy and Green (1975) have given moments and percentage points for
statistics T₁ and T₂ above, and also for the statistics T₁' and T₂', calculated
using the same formulas as for T₁ and T₂, but with v_i = U(i) - (i - 1)/(n - 1);
they also gave some power studies for these statistics.
The statistics discussed in this section have much in common with EDF
statistics, and overall they have much the same power properties.

E 8.8.1 Example

The values v_i for the data set in Table 8.1 are included in the table together
with the statistics C⁺, C⁻, C, and K. When these are modified as in
Table 8.2, the significance levels of the test statistics are: C⁺: .01;
C⁻: ≈ .25; C: .02; K: ≈ .20.
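The unmodified statistics in this example can be verified directly from (8.5) and (8.6); a minimal computational sketch:

```python
def c_statistics(u):
    """Compute v_i = U(i) - i/(n + 1) and the statistics of (8.6)."""
    u = sorted(u)
    n = len(u)
    v = [u[i - 1] - i / (n + 1) for i in range(1, n + 1)]
    c_plus = max(v)
    c_minus = max(-x for x in v)
    return c_plus, c_minus, max(c_plus, c_minus), c_plus + c_minus

# The U-set of Table 8.1
U = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
c_plus, c_minus, c, k = c_statistics(U)
# Matches Table 8.1: C+ = .384, C- = .087, C = .384, K = .471
```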

TABLE 8.2 Modifications and Upper and Lower Tail Percentage Points for
Statistics C⁺, C⁻, C, and K (Section 8.8.1)

                                                           Significance level α
Statistic T   Modified form T*                             0.15    0.10    0.05    0.025   0.01

Upper tail
C⁺   (C⁺ + 0.4/n)(√n + 0.2 + 0.68/√n)                      0.973   1.073   1.224   1.358   1.518
C⁻   (C⁻ + 0.4/n)(√n + 0.2 + 0.68/√n)                      0.973   1.073   1.224   1.358   1.518
C    (C + 0.4/n)(√n + 0.2 + 0.68/√n)                       1.138   1.224   1.358   1.480   1.628
K    {K - 1/(n + 1)}{√(n + 1) + 0.1555 + 0.24/√(n + 1)}    1.537   1.620   1.747   1.862   2.001

Lower tail
C    (C + 0.5/n)(√n + 0.44 - 0.32/√n)                      0.610   0.571   0.520   0.481   0.441
K    {K - 1/(n + 1)}{√(n + 1) + 0.41 - 0.26/√(n + 1)}      0.976   0.928   0.861   0.810   0.755

Adapted from Stephens (1970), with permission of the Royal Statistical
Society.

8.8.2 Statistics Based on One Order Statistic

On H0, the order statistic U(r) has a beta distribution Beta(x; p, q) with
density

f(x) = [Γ(p + q)/{Γ(p)Γ(q)}] x^(p-1) (1 - x)^(q-1),   0 < x < 1

where Γ(p) is the gamma function; for statistic U(r) the parameters are
p = r and q = n - r + 1. Thus in a sample of size n = 2k + 1, the median
U(k+1) has density

f(x) = [n!/{k! k!}] x^k (1 - x)^k,   0 < x < 1

with mean 0.5 and variance 1/{4(n + 2)}.


The following result connects the Beta(x; p, q) and F(2p, 2q) distributions.
If x has the Beta(x; p, q) distribution, then y defined by y = qx/{p(1 - x)}
has the F(2p, 2q) distribution; equivalently, z = 1/y = p(1 - x)/(qx) has the
F(2q, 2p) distribution.
It follows that Z = {(n - r + 1)U(r)}/{r(1 - U(r))} has an F(s, t) distribution,
with degrees of freedom s = 2r and t = 2(n - r + 1). It is therefore possible
to base a test for uniformity on one of the order statistics, rejecting if Z is
too large or too small compared with the F-distribution percentage points;
the median Ũ (with r = n/2 or r = (n + 1)/2) has sometimes been sug-
gested. The test is useful if, for some reason, not all of the sample values
were known, but only, for example, the smallest values; but in general it
appears to be a relatively weak method of testing for uniformity, given a full
sample. This is revealed, for example, in Chapter 10, where tests for expo-
nentiality are converted to tests for uniformity and then the mean Ū or the
median Ũ used as test statistics: by and large, Ū performs much better
than Ũ.
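The mean and variance quoted above for the median can be checked numerically; the sketch below (illustrative code, not part of the original text) integrates the median's Beta density by Simpson's rule for n = 5, where the variance should be 1/{4(n + 2)} = 1/28:

```python
import math

def median_density(x, n):
    """Density of the median U(k+1) of n = 2k + 1 uniforms: Beta(k+1, k+1)."""
    k = (n - 1) // 2
    return math.factorial(n) / (math.factorial(k) ** 2) * x ** k * (1 - x) ** k

def moment(n, power, steps=2000):
    """Composite Simpson's rule for the integral of x**power times the density."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        a = j * h
        b = a + h
        mid = a + h / 2
        fa = a ** power * median_density(a, n)
        fm = mid ** power * median_density(mid, n)
        fb = b ** power * median_density(b, n)
        total += h / 6 * (fa + 4 * fm + fb)
    return total

n = 5
mean = moment(n, 1)
var = moment(n, 2) - mean ** 2   # should equal 1/{4(n + 2)} = 1/28
```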

8.9 STATISTICS BASED ON SPACINGS

Another group of statistics for testing H0 is based on the spacings D_i defined
in Section 8.2. These tests have often been introduced in connection with
testing for exponentiality rather than as direct tests for uniformity. The
reason for this is that an exponential set of n values can be transformed,
by means of the J and K transformations of Sections 10.5.4 and 10.5.5, into
the n spacings produced by n - 1 ordered uniforms U(i). Some tests for
exponentiality are tests based on these spacings, using the distribution theory
of spacings between uniforms. The treatment of spacings statistics will
therefore be shared with Chapter 10. In this section we discuss only the
Greenwood statistic and some modifications.

8.9.1 The Greenwood Statistic

This statistic was itself introduced by Greenwood (1946) in connection with
testing that the intervals between events (the incidence of a contagious dis-
ease) were exponential, that is, that the times of the events constituted a
Poisson process (see Chapter 10), but it has also received attention specif-
ically as a test for uniformity. The statistic is

G(n) = Σ_{i=1}^{n+1} D_i²

Distributional properties of G(n) were investigated by Moran (1947, 1951);
recently, percentage points have been given by Burrows (1978), Hill (1979;
see the correction 1981), Currie (1981), and Stephens (1981). Percentage
points for nG(n) based on these results are given in Table 8.3. Large values
of G(n) will indicate highly irregular spacings and small values will indicate
superuniform observations; G(n) is well suited to detect superuniformity. For
large n, nG(n) is approximately normally distributed, with mean μ = 2n/(n + 2)
and variance σ² = 4/n, but this limiting distribution is attained very slowly.
The expected value of D_i is 1/(n + 1), and a possible test statistic might
be the dispersion of the spacings defined by

G*(n) = Σ_i {D_i - 1/(n + 1)}²

However, it is easily shown that G*(n) is G(n) - 1/(n + 1), so that G*(n) is
equivalent to G(n). Several other statistics are also equivalent to G(n); see
Section 10.9.3.2.
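Greenwood's statistic is straightforward to compute from the sorted sample; a minimal sketch, applied to the Table 8.1 data:

```python
def greenwood(u):
    """Greenwood's statistic G(n): the sum of the n + 1 squared spacings."""
    pts = [0.0] + sorted(u) + [1.0]
    d = [b - a for a, b in zip(pts, pts[1:])]
    return sum(x * x for x in d)

# The U-set of Table 8.1
U = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
g = greenwood(U)   # about 0.214, as in Table 8.1
```

For perfectly regular (superuniform) points the statistic attains its minimum 1/(n + 1), illustrating why small values indicate superuniformity.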

8.9.1.1 The Greenwood Statistic Adapted for Censored Data

Suppose a sample is given, right-censored at value U(r), and define
H_r(n) = Σ_{i=1}^{r} D_i². Kimball (1947, 1949) discussed distribution theory and
gave moments of H_r(n). Clearly H_r(n) could be used as a test statistic for
uniformity; also, because of the exchangeability of uniform spacings, the D_i
in H_r(n) could be replaced by any set of available spacings between adjacent
order statistics, even if some of the U(i) were missing. A further modifi-
cation is

G_r(n) = H_r(n) + (1 - U(r))²/(n - r + 1)   (8.7)

TABLE 8.3 Upper and Lower Percentage Points for nG(n) (Section 8.9.1)

Sample      Lower tail                        Upper tail
size        significance level α              significance level α
n           .01     .025    .05     .10       .10     .05     .025    .01

2 0.672 0.680 0.694 0.722 1.381 1.539 1.673 1.780


3 0.776 0.796 0.825 0.870 1.635 1.852 2.075 2.269
4 0.855 0.885 0.923 0.974 1.800 2.037 2.311 2.560
5 0.919 0.954 0.997 1.050 1.915 2.160 2.461 2.737
6 0.973 1.009 1.055 1.112 1.995 2.246 2.559 2.849
7 1.017 1.060 1.104 1.162 2.053 2.306 2.615 2.921
8 1.055 1.095 1.145 1.205 2.097 2.349 2.670 2.967
9 1.088 1.129 1.180 1.241 2.131 2.381 2.700 2.997
10 1.117 1.159 1.211 1.272 2.157 2.404 2.717 3.008
12 1.198 1.234 1.272 1.326 2.204 2.441 2.683 3.015
14 1.233 1.272 1.312 1.368 2.227 2.457 2.691 3.014
16 1.263 1.304 1.346 1.403 2.242 2.464 2.691 3.003
18 1.288 1.332 1.375 1.433 2.251 2.466 2.685 2.988
20 1.311 1.356 1.400 1.459 2.258 2.465 2.677 2.970
25 1.358 1.405 1.451 1.510 2.265 2.456 2.651 2.920
30 1.395 1.444 1.490 1.549 2.265 2.443 2.624 2.873
40 1.453 1.502 1.548 1.605 2.258 2.415 2.573 2.790
50 1.495 1.544 1.589 1.644 2.248 2.389 2.531 2.723
60 1.529 1.577 1.621 1.674 2.238 2.367 2.495 2.669
80 1.579 1.625 1.666 1.716 2.220 2.331 2.441 2.587
100 1.616 1.659 1.698 1.745 2.205 2.304 2.400 2.528
200 1.714 1.750 1.781 1.818 2.159 2.226 2.289 2.371
500 1.811 1.836 1.858 1.884 2.107 2.147 2.183 2.228

Adapted from Burrows (1979) and from Stephens (1981), with permission of
the first author and of the Royal Statistical Society.

This reduces to G(n) above when r = n. Lurie, Hartley, and Stroud (1974)
investigated the statistic S² = (n + 2){(n + 1)G_r(n) - 1} and gave moments and
some null percentage points obtained by curve-fitting. For complete samples
S² had previously been discussed by Hartley and Pfaffenberger (1972). The
statistic is clearly equivalent as a test statistic to G_r(n), and the moments
of S² can be used to give moments of G_r(n). The moments of H_r(n) and of
G_r(n) have been used by the author to fit Pearson curves to the null distri-
bution to give percentage points for G_r(n). These percentage points are given
in Stephens (1986).
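As a check on (8.7), G_r(n) reduces to G(n) when r = n; the sketch below (function names are illustrative) verifies this for the Table 8.1 data:

```python
def censored_greenwood(u, r):
    """G_r(n) of (8.7): H_r(n) plus the correction for the censored tail."""
    u = sorted(u)
    n = len(u)
    pts = [0.0] + u
    h_r = sum((pts[i] - pts[i - 1]) ** 2 for i in range(1, r + 1))
    return h_r + (1 - u[r - 1]) ** 2 / (n - r + 1)

# The U-set of Table 8.1; with r = n the statistic reduces to G(n) = .214
U = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
g_full = censored_greenwood(U, len(U))
```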

TABLE 8.4 Upper Tail Percentage Points of the Q Statistic (Section 8.9.2)

Significance level α

n      .50     .10     .05     .01     .001

2 .659 .811 .859 .932 .977


3 .527 .691 .736 .831 .920
4 .447 .586 .635 .727 .829
5 .388 .505 .551 .642 .739

6 .343 .442 .433 .573 .691


7 .307 .393 .429 .512 .622
8 .278 .355 .387 .463 .551
9 .254 .322 .350 .423 .506
10 .234 .294 .319 .378 .461

11 .217 .272 .294 .351 .434


12 .202 .251 .272 .318 .392
13 .189 .234 .253 .298 .371
14 .177 .220 .237 .279 .348
15 .168 .206 .222 .259 .321

16 .159 .195 .209 .245 .294


17 .150 .184 .197 .230 .278
18 .143 .174 .187 .218 .268
19 .137 .166 .177 .206 .257
20 .131 .158 .168 .196 .246

21 .125 .151 .162 .187 .232


22 .120 .144 .154 .178 .218
23 .115 .138 .147 .169 .206
24 .111 .133 .141 .163 .192
25 .107 .128 .136 .156 .189

26 .103 .123 .131 .148 .182


27 .100 .119 .126 .144 .173
28 .097 .114 .121 .138 .166
29 .094 .111 .117 .134 .158
30 .091 .107 .114 .130 .155

(continued)

TABLE 8.4 (continued)

Significance level α

n .50 .10 .05 .01 .001

31 .088 .104 .110 .125 .148


32 .086 .101 .106 .120 .142
33 .083 .097 .103 .116 .135
34 .081 .095 .100 .112 .131
35 .079 .092 .097 .110 .128

36 .077 .090 .095 .107 .124


37 .075 .087 .092 .103 .122
38 .073 .085 .089 .100 .118
39 .071 .083 .087 .097 .114
40 .070 .081 .085 .095 .112

41 .068 .079 .083 .092 .110


42 .067 .077 .081 .090 .107
43 .065 .075 .079 .088 .103
44 .064 .073 .077 .086 .101
45 .062 .072 .075 .084 .098

46 .061 .070 .074 .082 .094


47 .060 .069 .072 .080 .092
48 .059 .067 .070 .078 .091
49 .058 .066 .069 .077 .090
50 .057 .065 .068 .075 .086

55 .052 .059 .061 .068 .078


60 .048 .054 .056 .062 .070
65 .044 .050 .052 .057 .064
70 .041 .046 .048 .052 .059
75 .038 .043 .045 .048 .054

80 .036 .040 .042 .045 .051


85 .034 .038 .039 .042 .047
90 .032 .036 .037 .040 .044
95 .031 .034 .035 .038 .041
100 .029 .032 .033 .036 .040

Taken from Quesenberry and Miller (1977), with permission of the authors
and publishers. Copyright © Gordon and Breach Science Publishers, Inc.

8.9.2 Statistics Related to Greenwood's Statistic

An adaptation of G(n) has been proposed by Quesenberry and Miller (1977).
This is the statistic

Q = Σ_{i=1}^{n+1} D_i² + Σ_{i=1}^{n} D_i D_{i+1}

and H0 is rejected if Q is too large. Tables of percentage points for Q are
given in Table 8.4, taken from Quesenberry and Miller and based on Monte
Carlo studies. Q is designed to take into account the pattern of the spacings
(specifically, the autocorrelation) as well as their sizes, and could be a
useful statistic in analyzing series of events, where autocorrelation some-
times plays a part (see Section 10.6.2).

E 8.9.2 Example

The values of the spacings D_i for the data set U in Table 8.1 are also given
in the table. From these are calculated Greenwood's statistic G(10) = 0.214,
and Q = 0.361. Reference to Table 8.3 shows G(10) to be significant at about
the upper 10% level, and reference to Table 8.4 shows Q to be significant at
about the upper 6% level.
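The value of Q in this example can be reproduced from the spacings; a minimal sketch:

```python
def quesenberry_miller(u):
    """Q: squared spacings plus products of adjacent spacings."""
    pts = [0.0] + sorted(u) + [1.0]
    d = [b - a for a, b in zip(pts, pts[1:])]
    return sum(x * x for x in d) + sum(a * b for a, b in zip(d, d[1:]))

# The U-set of Table 8.1
U = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
q = quesenberry_miller(U)   # about 0.361, as in the example
```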

8.9.3 Other Statistics Calculated from Spacings

Other statistics based on spacings are Moran's M and the Kendall-Sherman
statistic K (Sections 10.9.3.4 and 10.9.3.5); EDF statistics (Section
10.9.3.6) and I_n(p) (Section 10.9.3.7); and statistics given in Section
10.11.4. Note that these are all defined for n spacings, not n + 1 spacings,
and the formulas must be adapted accordingly.

E 8.9.3 Example

For the U set in Table 8.1, Moran's statistic is 14.81 (Section 10.9.3.4
with n = 11); c = 1.18, so M(10)/c = 12.57. This is significant at the 25%
level when compared with χ²(10).

8.9.4 Higher Order Spacings and Gaps

There has recently been interest in k-spacings, defined by D_ki = U(ki) - U(ki-k);
these are the spacings between the observations, taken k at a time. This use
of spacings suppresses some of the information in the sample, but Hartley
and Pfaffenberger (1972) suggested that k-spacings might be useful in tests
for large samples. Del Pino (1979) discussed statistics of the form
W = Σ_i h(nD_ki) where h(·) is an appropriate function, summed over the range
of i for fixed k (for simplicity, suppose k divides n + 1 and let
f = (n + 1)/k; then the range of i is 1 ≤ i ≤ f). Del Pino showed that, by the
criteria of asymptotic relative efficiency, h(x) = x² gives an optimum sta-
tistic W; let this be W₂. Del Pino also argued for the utility of such statistics
for large samples; see also Darling (1963) and Weiss (1957a, 1957b) for more
general considerations involving spacings.
Cressie (1976, 1977a, 1978, 1979) and Deken (1980) have considered
test statistics which are functions of m-th order gaps G_i^(m) = U(i+m) - U(i),
for m a fixed integer, and 0 ≤ i ≤ n + 1 - m; as before, U(0) = 0 and
U(n+1) = 1. For m = 1, G_i^(1) = D_{i+1}; for higher m, the G_i^(m) contain over-
lapping sets of D_i, in contrast with k-spacings above. Deken defined G_i^(p+1)
as a p-stretch, and gave distribution theory for the maximum p-stretch.
Solomon and Stephens (1981) gave percentage points for n = 5 and 10 and
made a comparison with an approximation given by Deken. Cressie (1977a,
1978) has also studied the minimum p-stretch and the minimum gap G_i^(m);
one might suppose the minimum p-stretch to be useful in detecting a "bump"
in an otherwise uniform density, and this would be valuable in studying the
times of a series of events (see Chapter 10); however, Cressie (1978) found
the minimum gap of either type to be less powerful against a specific bump
alternative than L_r^(m) = Σ_i log G_i^(m) (r = n + 1 - m), or its parallel for
disjoint spacings, S_r^(m) = Σ_i log D_mi (r = [(n + 1)/m] - 1). Cressie (1978)
discussed these statistics, showing asymptotic normality and giving some
Monte Carlo power results. For m = 1, L_r^(m) is essentially Moran's statistic
M (Section 10.9.3.4); for m > 1, L_r^(m) may be useful in overcoming the
difficulties of M with very small values (Section 10.10). Tables of the null
distributions of L_r^(m) and S_r^(m) are given by McLaren and Stephens (1985).
An interesting justification for using high-order spacings comes from
considerations of entropy, which, under certain conditions, characterizes
a distribution. Vasicek (1976) introduced an estimate of entropy and used
it to produce a consistent test for normality; the estimate, adapted for the
uniform distribution, is

H(m, n) = n⁻¹ Σ_{i=1}^{n} log [{n/(2m)}(U(i+m) - U(i-m))]

where now U(r) = U(1) if r < 1 and U(r) = U(n) if r > n. There are clearly
close connections between L_r^(m) and H(m, n). Dudewicz and van der Meulen
(1981) have proposed H(m, n) as a test statistic for uniformity, and have
given tables of percentage points, derived from Monte Carlo methods, for
n = 10, 20, 30, 40, 50, 100 and for various values of m from m = 1 to
m = M, with M becoming larger with n. They also show asymptotic normality

for H(m, n), established by the relationship with L_r^(m), but both these statis-
tics attain the asymptotic normality only very slowly; this appears to be a
feature of spacings statistics. Dudewicz and van der Meulen also give power
results; H(m, n) appears to be particularly good against alternatives
with a high density near 0.5 and for m ≈ 0.4n. The m to give best power
varies with the alternative.
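Vasicek's estimate, in the form given above, is simple to compute; the sketch below (illustrative code) uses the clamping convention for indices outside 1, ..., n, and checks that a perfectly regular sample gives H(m, n) near 0, the entropy of U(0, 1):

```python
import math

def vasicek_h(u, m):
    """Entropy estimate H(m, n); order-statistic indices are clamped to 1..n."""
    u = sorted(u)
    n = len(u)
    def at(r):
        return u[min(max(r, 1), n) - 1]
    total = sum(math.log(n / (2 * m) * (at(i + m) - at(i - m)))
                for i in range(1, n + 1))
    return total / n

# Perfectly regular points: H(m, n) is close to 0 (the entropy of U(0, 1))
regular = [i / 101 for i in range(1, 101)]
h = vasicek_h(regular, 2)
```

Departures from uniformity drive the estimate toward large negative values.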
Cressie (1979) discussed statistics of the form H = Σ_i h(nG_i^(m)), and
found that, with h(x) = x², the resulting statistic, Σ_i (nG_i^(m))², has
higher asymptotic relative efficiency than del Pino's W₂. Note that both
these statistics can be regarded as extensions of Greenwood's statistic to
higher-order spacings and gaps. For m = 2 the statistic is closely related,
and asymptotically equivalent, to the Quesenberry-Miller statistic Q above.
McLaren and Stephens (1985) have given percentage points for it. Del Pino
(1979) and Cressie (1979) proved asymptotic normality for statistics W and H,
and Holst (1979) gave the mean and variance of H, but tables for
finite n are not yet available for these statistics. Greenwood's statistic
itself converges only slowly to its asymptotic distribution, and this may be
the case also for these related statistics. McLaren and Stephens (1985) have
given power studies for the statistics L_r^(m) and S_r^(m), for m = 1, 2, and 3. These
included alternatives with spacings derived from Gamma or Weibull variates;
the L-class was better than the S-class, and power decreased with m.
Another statistic to detect bumps is the scan statistic; this is S(L), the
maximum number of observations (out of n) falling into a window of length L,
as the window travels along the interval (0, 1). The statistic has been studied
by Wallenstein and Naus (1974), who gave the null distribution for finite n,
and by Cressie (1977b, 1980), who gave asymptotic theory; see these articles
also for references to earlier work by Naus. Ajne (1968) discussed the scan
statistic on the circle, with circumference 1 and L = 0.5.
Much interesting work has been done on the gaps and scan statistics (the
papers quoted give many earlier references), but more is needed to make
them of practical use as test statistics and to compare them with other tests
for uniformity.

8.10 STATISTICS FOR SPECIAL ALTERNATIVES

The statistics so far considered have been based on various methods of
relating the order statistics or their spacings to the pattern expected of them.
Many other test statistics for uniformity, usually fairly simple functions of
the U_i, arise when special distributions are regarded as the alternative if H0
is not true, and likelihood ratio methods are used to find test statistics. Some
of these are discussed in this section.

8.10.1 The Statistic Ū

Suppose the alternative distribution to H0 is the density

f(u) = k e^(ku)/(e^k - 1),   0 < u < 1   (8.8)

This is a truncated exponential distribution which reduces to the uniform
density when k = 0. Thus a test for uniformity becomes a test for k = 0, and
the likelihood ratio method gives the test statistic T = Σ_i U_i, or equivalently
Ū = T/n.
The null distribution of Ū is well known, although its form is quite com-
plicated. Lower tail percentage points are in Table 8.5, adapted from

TABLE 8.5 Lower Tail Percentage Points for Ū (Section 8.10.1)

Significance level α

n      0.25    .15     .10     .05     .025    .01     .005

4 0.399 0.346 0.312 0.262 0.221 0.176 0.148


5 0.410 0.363 0.332 0.287 0.250 0.208 0.181
6 0.419 0.376 0.347 0.306 0.271 0.232 0.207
7 0.425 0.385 0.359 0.320 0.288 0.251 0.227
8 0.430 0.393 0.368 0.332 0.301 0.266 0.244
9 0.434 0.399 0.376 0.341 0.312 0.279 0.257
10 0.438 0.404 0.382 0.350 0.322 0.290 0.269
12 0.443 0.413 0.393 0.363 0.337 0.308 0.289
14 0.447 0.419 0.401 0.373 0.349 0.322 0.304
16 0.451 0.425 0.407 0.381 0.359 0.333 0.316
18 0.454 0.429 0.412 0.388 0.367 0.343 0.327
20 0.456 0.433 0.417 0.394 0.374 0.351 0.335
25 0.461 0.440 0.426 0.405 0.387 0.366 0.352
30 0.464 0.445 0.432 0.413 0.397 0.378 0.365
35 0.467 0.449 0.437 0.420 0.404 0.387 0.375
40 0.469 0.453 0.441 0.425 0.411 0.394 0.383
45 0.471 0.455 0.445 0.429 0.416 0.400 0.390
50 0.472 0.458 0.448 0.433 0.420 0.405 0.395
60 0.475 0.461 0.452 0.439 0.427 0.414 0.404
70 0.477 0.464 0.456 0.443 0.432 0.420 0.411
80 0.478 0.467 0.459 0.447 0.437 0.425 0.417
90 0.479 0.468 0.461 0.450 0.440 0.429 0.422
100 0.481 0.470 0.463 0.453 0.443 0.433 0.426

Adapted from Stephens (1966), with permission of the Biometrika Trustees.



Stephens (1966); if z_α is the given point for level α, the corresponding upper
tail point, that is, for level 1 - α, is 1 - z_α. For large n (n > 20) the dis-
tribution of Ū is well approximated by the normal distribution with mean 0.5
and variance 1/(12n). The distribution (8.8) occurs in connection with
points U obtained from a renewal process with a trend (see Section 10.9.1).

E 8.10.1 Example
The mean of the U-set in Table 8.1 is 0.686, so 1 - Ū = .314. Reference to
Table 8.5 gives a p-level equal to 0.02 (one-tail) or 0.04 (two-tail).
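For n > 20 the normal approximation may replace Table 8.5; applied, purely as an illustration, to the Table 8.1 data (where n = 10 is really too small for it), it gives a two-tail p-value close to the 0.04 obtained from the exact table:

```python
import math

def ubar_normal_test(u):
    """Normal approximation (mean 0.5, variance 1/(12n)) for Ubar."""
    n = len(u)
    ubar = sum(u) / n
    z = (ubar - 0.5) / math.sqrt(1 / (12 * n))
    p_two_tail = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return ubar, z, p_two_tail

# The U-set of Table 8.1
U = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
ubar, z, p = ubar_normal_test(U)
```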

8.10.2 The Statistic P

Suppose the alternative distribution to H0 is the density

f(u) = (k + 1)u^k,   k > -1,   0 < u < 1,

which reduces to the uniform density when k = 0. This family of densities is
sometimes referred to as the Lehmann family. The likelihood ratio test
statistic for a test for k = 0 against k ≠ 0 is P/2 where

P = -2 Σ_i log U_i

On H0, P has the χ² distribution with 2n degrees of freedom. The test of H0
is the test that k = 0; against the alternative k > 0, low values of P will be
regarded as significant, and H0 will be rejected if P is less than the appro-
priate percentage point in the lower tail of χ²(2n); against
the alternative -1 < k < 0, high values of P will be significant, and H0 will
be rejected if P exceeds the upper tail percentage point of χ²(2n). For the most
general test of H0, that k = 0 against the alternative k ≠ 0, a two-tail test
will be used. P has often been used to combine several tests of significance
by Fisher's method (see Section 8.15 below).
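P itself is a one-line computation; as a sketch (this section gives no worked example, so the Table 8.1 data are reused purely for illustration):

```python
import math

def fisher_p(u):
    """P = -2 sum(log U_i); chi-squared with 2n df on H0."""
    return -2.0 * sum(math.log(x) for x in u)

# Illustration with the U-set of Table 8.1 (n = 10, so 20 df)
U = [.004, .304, .612, .748, .771, .806, .850, .885, .906, .977]
p_stat = fisher_p(U)   # compare with chi-squared(20) percentage points
```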

8.10.3 Statistics for the Circle or the Sphere

Suppose points P_i, i = 1, ..., n are marked on the circumference of a circle
of radius 1, and it is desired to test the null hypothesis H0 that the points P_i
are uniformly distributed around the circle. Let O be the center of the circle
and let N be the north pole; let θ be the angle between ON and OP, where P
is a typical point on the circle. A common distribution used for describing
a unimodal population around the circle is the von Mises distribution (Sec-
tion 4.15) for which the density is

f(θ) = exp {κ cos (θ - θ₀)}/{2πI₀(κ)},   0 ≤ θ < 2π,   κ > 0

This density is symmetric, with a mode along the line OA with coordinate θ₀,
and is increasingly clustered around OA as κ becomes larger; when κ = 0 the
distribution is uniform around the circle. I₀(κ) refers to the Bessel function
with imaginary argument, of order zero. For a von Mises alternative, the
null hypothesis H0 is equivalent to

κ = 0, against the alternative H1: κ > 0

When the modal vector OA is not known, the likelihood ratio procedure gives
a test statistic which is the length R of the resultant, or vector sum, R, of
the vectors OP_i, i = 1, ..., n. In the more unlikely event that, on the alter-
native, the modal vector OA is known, the component of R along OA, called X,
is the test statistic.
The distributions of R and X are very complicated for points on a circle;
they have been studied by Greenwood and Durand (1955) and Durand and
Greenwood (1957), who have given some percentage points. Stephens (1969a)
has given a table of upper tail percentage points for testing H0, for both R
and X. For large samples, 2R²/n has the χ²(2) distribution and X has the nor-
mal distribution with mean 0 and variance n/2. These statistics arise also
in a totally different context, when EDF statistics are partitioned
into components (Section 8.12).
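The resultant length R is easily computed from the angles θ_i (in radians); a minimal sketch, with two extreme configurations as checks:

```python
import math

def resultant_length(angles):
    """Length R of the vector sum of unit vectors at the given angles."""
    c = sum(math.cos(t) for t in angles)
    s = sum(math.sin(t) for t in angles)
    return math.hypot(c, s)

# Evenly spread directions: no preferred direction, so R is essentially 0
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
r_spread = resultant_length(spread)

# Identical directions: R attains its maximum value n
r_clustered = resultant_length([1.0] * 5)
chi2_stat = 2 * r_clustered ** 2 / 5   # large-sample statistic 2R^2/n
```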
Further applications of the uniform distribution arise in studying random-
ness of directions on a sphere, against various alternatives. Suppose the
sphere has center O, and radius 1, and let a typical point P on the sphere be
located by spherical polar coordinates (θ, φ). If P is uniformly distributed
on the surface of the sphere, cos θ is uniformly distributed between -1 and 1.
Again, the von Mises distribution, with density per unit area proportional to
exp (κ cos ψ), is the most important for unimodal data; here ψ is the angle
between OP and the modal vector OA. Likelihood ratio tests for uniformity
(κ = 0) against a von Mises distribution (κ > 0) (also called the Fisher distri-
bution on the sphere) lead again to the length R of the resultant R of a sample
of n vectors, as test statistic; when the modal vector of the alternative is
known, the test statistic is the component X of R on this vector, as for the
circle. Stephens (1964) has given tables of percentage points for R and for X.
For large n, 3R²/n is approximately χ²(3) distributed, and X is normal, with
mean 0 and variance n/3.
Other alternatives to randomness have been proposed to describe natural
data, among them, densities for which the probability per unit area is pro-
portional to

f₁(ψ) = e^(κ|cos ψ|),   f₂(ψ) = e^(κ sin ψ),   f₃(ψ) = e^(κ cos²ψ),   0 ≤ ψ ≤ π

For f₁(ψ) and f₂(ψ), κ > 0; for f₃(ψ), κ is any constant. The densities are
all symmetric about the axis OA, are either bimodal or equatorial, and
all reduce to the uniform density when κ = 0. In a test for uniformity of
directions against the alternatives given above, when OA is a known axis,
the null hypothesis is H0: κ = 0, against the alternative κ ≠ 0. Likelihood
ratio test statistics are, respectively,

L₁ = Σ_i |V_i|/n;   L₂ = Σ_i (1 - V_i²)^(1/2)/n;   L₃ = Σ_i V_i²/n

where V_i = cos ψ_i has, on the null hypothesis, the uniform distribution
between -1 and 1.
L₁ has the same distribution as Ū, considered in Section 8.10.1 above, and
L₂ and L₃ have the same distributions as

Q = Σ_i (1 - U_i²)^(1/2)/n

and

T = Σ_i U_i²/n

where the U_i are U(0, 1). Further, the statistic

S² = Σ_i (U_i - 0.5)²/n

which is a measure of the dispersion of the U_i, has the same distribution as
T/4. Significance points for Ū are in Table 8.5; points for Q and T have been
given by Stephens (1966), and the applications to tests for directions are dis-
cussed further in that reference. When OA is not known for the distributions
above, the tests for uniformity become more complicated. The statistics Ū,
S², and T will appear again in the next section in connection with Neyman-
Barton tests, and with partitioning the Anderson-Darling statistic into
components.

8.10.3.1 Ajne's Statistic


Ajne (1968) suggested a test statistic for uniformity on the circumference (of
length 1) of a circle, which has optimum properties against the alternative
density f₁(x) = r (0 < x < 1/2), f₁(x) = s (1/2 < x < 1), where r and s are con-
stants. The test statistic is

A = (1/n) ∫₀¹ {N(x) - n/2}² dx

where N(x) is the number of observations falling in the semicircle (x, x + 1/2).
Computing formulas have been given by Watson (1967) and by Stephens (1969c).

Suppose the observations are U(1) < U(2) < ··· < U(n), measured around the
circumference. Then A may be computed from

A = n/4 - (2/n)Z

where

Z = Σ_{j=2}^{n} Σ_{i=1}^{j-1} u_ij

with

u_ij = min {U(j) - U(i), 1 - (U(j) - U(i))}

Watson (1967) gave the asymptotic null distribution of A.


Stephens (1969c) has given the moments, some exact distribution theory,
and percentage points for A; also given are some power studies which com-
pare A with U² and V and which suggest that, in practice, the gain in using
A when it is optimal is small compared with the loss when it is not.
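The pairwise computing formula can be sketched directly; in the balanced four-point configuration below every semicircle contains exactly n/2 points, so A = 0, while coincident points attain the maximum n/4:

```python
def ajne_a(u):
    """Ajne's A from the pairwise formula A = n/4 - (2/n)Z."""
    u = sorted(u)
    n = len(u)
    z = 0.0
    for j in range(1, n):
        for i in range(j):
            d = u[j] - u[i]
            z += min(d, 1 - d)   # circular distance between the two points
    return n / 4 - 2 * z / n

a_balanced = ajne_a([0.125, 0.375, 0.625, 0.875])   # every semicircle holds n/2
a_clustered = ajne_a([0.2, 0.2, 0.2, 0.2])          # maximum, n/4
```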

8.10.3.2 Omnibus Tests

Omnibus tests are not designed for specific alternatives, but it is convenient
to mention several of these, especially for the circle, before leaving this
section. EDF statistics U² and V (Chapter 4; U(i) here replaces Z(i) in Equa-
tions (4.2)) were designed for the circle because they do not depend on the
origin of U; Watson (1976) gave another statistic derived from the EDF, and
Darling (1982, 1983) has recently given the asymptotic points. Ajne (1968)
also gave another statistic for the circle. A review of tests for uniformity on
the hypersphere is given by Prentice (1978); see also Beran (1968) and Giné
(1975). Such tests are sometimes derived in very general terms, and often
give the test statistics in this section when particular cases are taken.

8.11 THE NEYMAN-BARTON SMOOTH TESTS

Another application of likelihood ratio methods yields the Neyman-Barton
tests. Neyman (1937) considered the problem of testing for uniformity U(0, 1),
against an alternative density of the form

f(u) = c(θ) exp {Σ_{j=1}^{k} θ_j π_j(u)},   0 < u < 1,   k = 1, 2, ...   (8.9)

where the π_j(u) are the Legendre polynomials, θ is a vector of parameters with
components θ₁, ..., θ_k, and c(θ) is the normalizing constant. The Legendre
polynomials are orthonormal on the interval (0, 1). By varying k, the density
may be made to approximate a given density, and it also varies smoothly
from the uniform distribution as the θ_j take increasingly large values. The
test for uniformity of U then reduces to testing the null hypothesis

H0: θ_j = 0 for all j

Neyman found an appropriate statistic, based on likelihood ratio methods,
for testing this null hypothesis. For given k, the test statistic is calculated
as follows.

(a) Let

v_j = (1/√n) Σ_{i=1}^{n} π_j(U_i),   j = 1, ..., k   (8.10)

In these calculations, π_j(U) is best expressed in terms of y = U - 0.5.
For the first four polynomials,

π₁(U) = 2√3 y;   π₂(U) = √5(6y² - 0.5);
π₃(U) = √7(20y³ - 3y);   π₄(U) = 3(70y⁴ - 15y² + 0.375)

(b) The Neyman statistic of order k is

N_k = Σ_{j=1}^{k} v_j²   (8.11)

The null hypothesis of uniformity will be rejected for large values of N_k; for
large n, on H0, N_k is asymptotically distributed as χ²(k). The tests based on
N_k are consistent and asymptotically unbiased. Neyman showed that, asymp-
totically, the v_j are independent N(0, 1) variables on H0. David (1939) further


showed that the as 3nnptotic distributions w ere very good approximations
to the finite-n distributions for n > 20. Note that Vi and are respectively

equivalent to the sample mean Ü and sample variance (U^ -0 .5 )V n = S^;

thus the statistic N 3 is a combination of these two basic statistics, and as


such has an intuitive appeal for testing uniformity. Furtherm ore, Locke and
S purrier (1978) and M ille r and Quesenberry (1979) have recently shown N 3
to be an effective statistic against a wide range of alternatives. In Table 8.6
upper tall percentage points for N 3 a re given; these w ere obtained by fitting
Pearson curves to the moments, and are taken from Solomon and Stephens
(1983). M ille r and Quesenberry also recommend N 4 against some alterna­
tives; they give tables of N ^ , N 3 , N 3 , and N 4 based on Monte Carlo studies.
T h eir tables for N 3 and N 4 are also given in Table 8 . 6 . The quantities vj
arise again in the next section, in connection with decomposing the EDF
statistic into components. Further discussion of the Neyman tests is in
Pearson (1938) and David (1939).
Barton (1953) considered a slightly different class of alternatives given by

    f(u) = 1 + Σ_{j=1}^k θ_j π_j(u),   0 ≤ u ≤ 1, k = 1, 2, ...

A restriction must now be placed on the θ_j to ensure that the density is always positive. The same statistic as above, N_k, may again be used to test for uniformity against this family of alternatives. Some asymptotic power calculations can also be made. Barton (1955, 1956) has investigated the application of these statistics when the data have been grouped or are discrete, and also the situation when the Uᵢ are not uniform, but have been obtained by the Probability Integral Transformation applied to a distribution with estimated parameters. This situation has also been examined by Thomas and Pierce (1979) and by Bargal and Thomas (1983). For these problems there are some interesting connections with the Pearson test.
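As an illustration of the positivity restriction, the sketch below (my own, not from the text) evaluates Barton's alternative density with the first two polynomials given above; with k = 1 only, positivity on [0,1] forces |θ₁| ≤ 1/√3, since π₁ attains ±√3 at the end points.

```python
import math

def barton_density(u, theta):
    """Barton's alternative density 1 + sum_j theta_j * pi_j(u), sketched
    with the first two polynomials pi_1, pi_2 from the text (y = u - 0.5)."""
    y = u - 0.5
    pi = [2 * math.sqrt(3) * y, math.sqrt(5) * (6 * y * y - 0.5)]
    return 1.0 + sum(t * p for t, p in zip(theta, pi))

# boundary case of the k = 1 restriction: density exactly 0 at u = 0
theta_max = 1 / math.sqrt(3)
assert abs(barton_density(0.0, [theta_max])) < 1e-12
assert barton_density(0.5, [theta_max]) == 1.0   # pi_1(0.5) = 0
```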

E 8.11 Example

For the data in Table 8.1, Neyman's statistic N₂ = 6.437 and is significant at about the 4% level. The individual components v₁² and v₂² (equivalent to Ū and S²) have significance levels p = 0.04 and p = 0.12, respectively.
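The calculation behind this example can be sketched as follows (a minimal illustration, not the book's code; the Table 8.1 data are not reproduced here):

```python
import math

def neyman_n2(u):
    """Neyman's smooth statistic N2 = v1^2 + v2^2 for a sample u from U(0,1),
    computing v_j from equation (8.10) with the first two polynomials and
    y = u - 0.5. On H0, N2 is approximately chi-squared with 2 df."""
    n = len(u)
    y = [ui - 0.5 for ui in u]
    v1 = sum(2 * math.sqrt(3) * yi for yi in y) / math.sqrt(n)
    v2 = sum(math.sqrt(5) * (6 * yi * yi - 0.5) for yi in y) / math.sqrt(n)
    return v1, v2, v1 * v1 + v2 * v2
```

For instance, a sample shifted toward 0 gives a large |v₁|, and hence a large N₂, reflecting the equivalence of v₁ with the sample mean.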
TESTS FOR THE UNIFORM DISTRIBUTION 353

TABLE 8.6 Upper Tail Percentage Points for N₂, N₃, and N₄ (Section 8.11)

Significance level a

n 0.5 0.25 0.1 0.05 0.025 0.01 0.005

Statistic N₂

2 1.587 2.244 4.023 5.903 7.771 10.012 11.530


3 1.589 2.565 4.013 5.682 7.372 9.717 11.526
4 1.530 2.712 4.116 5.566 7.287 9.643 11.472
5 1.491 2.763 4.227 5.573 7.226 9.517 11.340
6 1.464 2.776 4.316 5.618 7.148 9.384 11.214
7 1.445 2.774 4.382 5.640 7.096 9.326 11.100
8 1.438 2.777 4.421 5.683 7.110 9.276 11.030
9 1.434 2.777 4.453 5.735 7.142 9.208 10.940
10 1.432 2.772 4.476 5.774 7.167 9.265 10.870
11 1.429 2.779 4.489 5.790 7.174 9.173 10.820
12 1.420 2.766 4.486 5.822 7.198 9.170 10.770
14 1.406 2.736 4.517 5.897 7.311 9.235 10.735
16 1.403 2.740 4.527 5.908 7.319 9.233 10.720
18 1.402 2.744 4.536 5.918 7.327 9.235 10.716
20 1.400 2.746 4.542 5.925 7.332 9.234 10.706
25 1.398 2.751 4.554 5.937 7.341 9.230 10.684
30 1.396 2.755 4.562 5.947 7.348 9.230 10.677
35 1.395 2.757 4.568 5.962 7.352 9.226 10.662
40 1.394 2.759 4.573 5.958 7.357 9.230 10.666
45 1.393 2.760 4.576 5.961 7.357 9.221 10.645
50 1.392 2.762 4.579 5.964 7.360 9.223 10.646
60 1.391 2.763 4.584 5.969 7.364 9.224 10.644
80 1.390 2.766 4.589 5.974 7.367 9.218 10.627
100 1.390 2.768 4.592 5.979 7.370 9.220 10.626
∞ 1.386 2.773 4.605 5.991 7.378 9.210 10.597

Statistic N₃

2 5.59 7.40 13.50


3 5.75 7.48 12.87
4 5.91 7.53 12.45
5 5.99 7.57 12.15
6 6.04 7.60 11.95
7 6.07 7.63 11.81
8 6.10 7.65 11.71
9 6.11 7.67 11.65
10 6.12 7.68 11.60

(continued)

TABLE 8.6 (continued)

Significance level a

n 0.1 0.05 0.01

11 6.13 7.69 11.57


12 6.13 7.70 11.55
14 6.14 7.72 11.52
16 6.14 7.73 11.51
18 6.14 7.73 11.50
20 6.15 7.74 11.50
30 6.16 7.75 11.49
40 6.17 7.76 11.49
50 6.18 7.76 11.48
∞ 6.25 7.81 11.35

Statistic N₄

2 7.19 9.52 16.14


3 7.34 9.51 15.80
4 7.46 9.50 15.43
5 7.53 9.49 15.12
6 7.57 9.48 14.86
7 7.60 9.47 14.65
8 7.62 9.47 14.47
9 7.63 9.46 14.32
10
11 7.65 9.45 14.09
12 7.65 9.45 14.00
14 7.66 9.44 13.87
16 7.66 9.43 13.78
18 7.67 9.42 13.71
20 7.67 9.42 13.67
30 7.68 9.40 13.58
40 7.68 9.40 13.52
50 7.69 9.40 13.48
∞ 7.78 9.49 13.28

Adapted from Miller and Quesenberry (1979) and from Solomon and Stephens (1983), by courtesy of the authors and of Marcel Dekker, Inc.

8.12 COMPONENTS OF TEST STATISTICS

In the expression (8.11) for N_k, the individual term v_j may be regarded as a component of the entire statistic N_k. Asymptotically these components are independently normally distributed with mean 0 and variance 1. For finite n, their distributions could, in principle, be examined, from the formulas for v_j, or from approximations using the moments (see, for example, David, 1939), although for finite n the v_j are not independent. As David (1939) suggests, against certain alternatives, use of one of the individual components will prove more powerful than use of the entire statistic N_k. In recent years EDF statistics have also been partitioned into components along similar lines. For example, the EDF statistic W² can be written

    W² = Σ_{j=1}^∞ z_j²/λ_j

where

    z_j = √(2/n) Σ_{i=1}^n cos(jπU_i)        (8.12)

and where the λ_j are weights (Durbin and Knott, 1972; see also Schoenfeld, 1977). Suppose vᵢ = jπUᵢ, i = 1, 2, ..., n. Starting at the point (1,0) in the usual rectangular coordinates, vᵢ can be recorded on the circumference of the unit circle, centered at the origin O, and with radius 1. Let Pᵢ be the point on the circle corresponding to vᵢ, and let R_j be the resultant (vector sum) of the vectors OPᵢ, i = 1, ..., n. Component z_j is proportional to X_j, the length of the projection of R_j on the x-axis. When the Uᵢ are U(0,1), the vᵢ will be uniform on (0, jπ) and the distributions of X_j are the same as those discussed in connection with directions in Section 8.10.3. Stephens (1974a) has shown that the components of U² are proportional to R_j, the length of R_j, also discussed in Section 8.10.3. For A², the components are proportional to the v_j in the Neyman-Barton test statistics; thus the sum of the first k components of A² is related to Neyman's N_k in that they use the same components v_j, j = 1, ..., k, defined in equation (8.10), but with different weights. Against some alternatives one or two components of, say, W² or A² may be more powerful, as a test statistic for uniformity of the Uᵢ, than the entire statistic W² or A². Durbin and Knott (1972) have demonstrated this for a test of normality N(0,1), in which the Uᵢ are obtained by the Probability Integral Transformation, against alternatives involving either a shift in mean or a shift in variance. The first component alone, for example, is better than W² in detecting the shift in mean. However, Stephens (1974a) has shown that the first component is insensitive to an alternative where both mean and variance have been changed; for such an alternative, at least the first two components would be needed; this is roughly the same as using the first component of U². By expanding an alternative density into a series, using appropriate orthogonal functions, it should be possible to suggest which departures are detected by which components, and then perhaps decide how many to use to get best power, but this will be difficult in the usual situation where the alternative distribution to the null is not clearly known. Similar remarks apply to the other statistics partitioned into components.
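The components in equation (8.12) are easy to compute directly; the following sketch (my own illustration, with the λ_j weights omitted) returns the first few z_j:

```python
import math

def w2_components(u, k=4):
    """First k components z_j of W^2 from equation (8.12):
    z_j = sqrt(2/n) * sum_i cos(j*pi*u_i).
    On H0 each z_j is asymptotically N(0,1)."""
    n = len(u)
    return [math.sqrt(2.0 / n) * sum(math.cos(j * math.pi * ui) for ui in u)
            for j in range(1, k + 1)]
```

For a sample clustered at 0.5, z₁ is near zero (cos(π/2) = 0) while z₂ is strongly negative, matching the remark that the second component reacts to variance-type departures.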

8.13 THE EFFECT ON TEST STATISTICS OF CERTAIN PATTERNS OF U-VALUES

In the next section we discuss the power of the various test statistics in this chapter. However, before this, some general observations can be made on the appearance of the U-set and its effect on different test statistics. If the U-set is truly uniform, it should be scattered more or less evenly along the interval (0,1). If the alternative to uniformity makes the values tend toward 0, there will be a high value for D⁺; if they tend toward 1, there will be a high value for D⁻. In either case D will be large and perhaps significant. The statistics W² and A² will also detect a shift of values toward 0 or 1. If the U-set has been obtained from the Probability Integral Transformation (Chapter 4), from a Case 0 test that the X-set is from a completely specified F(x), a set of U-values tending toward 0 or toward 1 will suggest that the hypothesized F(x) has an incorrect mean (it may of course also have other incorrect parameters or be of incorrect shape). If the U-set tends to cluster at some point in the interval, or to divide into two groups toward 0 and 1, the statistics V and U² will be large and will tend to show significance. This indicates that the variance of the hypothesized F(x) is too large or too small. The statistic P = −2 Σᵢ log Uᵢ, like D⁺ and D⁻, also indicates which way the points have moved; if they have moved closer to 0, P will be large, and if closer to 1, P will be small. The value of P is very much more dependent on low values of Uᵢ than on high values, because log u, when u is nearly 1, is nearly 0, while as u approaches 0, log u becomes very large and negative. We shall see later that this has some importance in methods of combining tests for several samples. Among the other statistics, clearly Ū or U(n/2) might have some power against an error in mean, but not against an error in variance of the tested distribution. The same applies to the first component v₁ or z₁ in the decomposition of both the Neyman tests and W² or A²; in turn, z₂ in equation (8.12) will not be sensitive to an error in mean (Durbin and Knott, 1972; Stephens, 1974a). Greenwood's statistic G(n) = Σᵢ Dᵢ² takes its smallest value 1/(n + 1) when all spacings are equal, that is, when the values Uᵢ are superuniform; large values of G(n) will occur with widely varying patterns of U-values.
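These qualitative remarks can be checked numerically. The sketch below (my own; it assumes D⁺, D⁻ and G(n) are defined as in the earlier sections of the chapter, with G(n) the sum of squared spacings including the two end gaps) computes three of the statistics discussed:

```python
def pattern_statistics(u):
    """D+, D- and Greenwood's G(n) for a sample u from (0,1)."""
    u = sorted(u)
    n = len(u)
    dplus = max((i + 1) / n - ui for i, ui in enumerate(u))   # EDF above the line
    dminus = max(ui - i / n for i, ui in enumerate(u))        # EDF below the line
    pts = [0.0] + u + [1.0]                                   # spacings with end gaps
    g = sum((b - a) ** 2 for a, b in zip(pts, pts[1:]))
    return dplus, dminus, g

# superuniform (equally spaced) values attain the minimum G(n) = 1/(n + 1)
dp, dm, g = pattern_statistics([0.2, 0.4, 0.6, 0.8])
assert abs(g - 1 / 5) < 1e-12
```

Shifting all the values toward 0 inflates dplus, while clustering them near one point inflates g, in line with the discussion above.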

8.14 POWER OF TEST STATISTICS

A number of studies have been made on tests for uniformity, including those by Stephens (1974b), Quesenberry and Miller (1977), Locke and Spurrier (1978), and Miller and Quesenberry (1979).

In general it can be said that, among the EDF statistics, the quadratic statistics appear to be more powerful than the supremum class; when the discrepancy between the EDF Fₙ(u) and the theoretical distribution F(u) = u is used all along the interval 0 < u < 1, it appears that better use is made of the information in the sample than by using only the maximum discrepancy. When the basic problem is to test an X-set for a distribution F(x), so that the observations Uᵢ have been obtained by the Probability Integral Transformation, W² and A² (especially A²) will detect shifts in the mean of the hypothesized distribution from the true mean, and U² is effective at detecting shifts in variance. A² is also especially good at detecting irregularity in the tails of the distribution. Among other tests for uniformity, statistics obtained by the likelihood-ratio method are most powerful against their respective families of alternatives, as would be expected. Against a decreasing or increasing density, Ū alone is very efficient, and against distributional alternatives in which the mean is near the uniform mean of 0.5, but the variance is changed either because the distribution is unimodal and symmetric, or U-shaped and symmetric, the quantity S² above is a powerful statistic. For unimodal nonsymmetric distributions, these statistics lose some of their efficiency. The effect of the good performance of these relatively simple statistics means that the Neyman statistic N₂ in Section 8.11, which combines them both, is effective for a wide range of alternatives to uniformity (Locke and Spurrier, 1978; Miller and Quesenberry, 1979). Although the two components in N₂ occur again in A², the presence of further components, and the different weightings, sometimes make A² less effective than N₂; in a similar way, N₃ or N₄ can be less effective than N₂, whenever adding further components "dilutes" the power of the first two (Miller and Quesenberry, 1979). This is similar to the situation for EDF statistics (Section 8.12).

8.15 STATISTICS FOR COMBINING INDEPENDENT TESTS FOR SEVERAL SAMPLES

The uniform distribution has traditionally played an important role in combining test statistics for several samples. This is probably because of the general use of Fisher's method, which is based on the p-levels of the component tests.
To fix ideas, suppose k tests are to be made of null hypotheses H₀₁, H₀₂, ..., H₀k. Let H₀ be the composite hypothesis that all H₀ᵢ are true; if any one is not true, H₀ should be rejected. Let Tᵢ be the test statistic used for H₀ᵢ, and suppose the test is an upper tail test. When the test is made, let Tᵢ take the value tᵢ, and suppose pᵢ is the significance level (often called the p-level) of this value, that is, when H₀ᵢ is true, Pr(Tᵢ ≥ tᵢ) = pᵢ. When H₀ᵢ is true, pᵢ is U(0,1), and when k independent tests are made of the k null hypotheses above, we should obtain a random sample of k values of pᵢ from U(0,1). Thus all k null hypotheses are tested simultaneously by testing if the pᵢ appear to be such a uniform sample. This of course can be done by any of the methods described in this chapter. Fisher (1967) suggested the statistic P = −2 Σᵢ log pᵢ, already discussed in Section 8.10.2. Effectively the same idea had been put forward before by Karl Pearson, who suggested using the product of the pᵢ. For a summary of early work on some of the problems discussed in this section see Pearson (1938). Note that if qᵢ = 1 − pᵢ, qᵢ could replace pᵢ in P, since clearly qᵢ is also U(0,1). Finally, let rᵢ be the minimum of pᵢ and qᵢ, formally written rᵢ = min(pᵢ, qᵢ); it is easily proved that, when pᵢ is U(0,1), rᵢ has the uniform distribution with limits 0 and 0.5, so that P could be calculated using 2rᵢ, or 1 − 2rᵢ = |qᵢ − pᵢ|, instead of pᵢ.
Thus we have possible statistics

    P₁ = −2 Σᵢ log pᵢ ;    P₂ = −2 Σᵢ log qᵢ

    P₃ = −2 Σᵢ log 2rᵢ ;   P₄ = −2 Σᵢ log (1 − 2rᵢ)

On H₀, each of these statistics has the χ²₂ₖ distribution. An important question in making the overall test is which of these statistics to use. Fisher advocated P₁, with significance for large values, and this suggestion appears to have been generally accepted, although Pearson (1938) raised the possibility of using P₂. Littell and Folks (1971, 1973) have shown that P₁ has desirable properties from the point of view of Bahadur efficiency.
It has already been shown that P₁ is the likelihood ratio statistic, for a test of k = 0 against the alternative density f₁(u) = (k + 1)uᵏ, 0 ≤ u ≤ 1, k > −1; for k > 0, P₁ will be declared significant for small values and for −1 < k < 0, P₁ will be significant for large values. Similarly, P₂ is the likelihood ratio statistic for the alternative density f₂(u) = (k + 1)(1 − u)ᵏ, 0 ≤ u ≤ 1, k > −1, with P₂ significant for small values when k > 0 and for large values when −1 < k < 0. Thus Fisher's use of P₁, with significance for large values, would imply that the alternative distribution for the pᵢ values will give small values of pᵢ large probability, but will allow some values to be close to one (the density f₁(u), with −1 < k < 0, gives non-zero density at u = 1).

Thus P₁ can be expected to be powerful if some of the component hypotheses H₀ᵢ, i = 1, ..., k, were true (giving possibly a high p-value) and some were not true. Another possibility is that, when H₀ is rejected, all H₀ᵢ are false together so that small p-values are likely in every test (for example, the tests might all be tests of normality, and it is felt that all the samples are likely to be non-normal if any of them are); then statistic P₂, used with significance in the lower tail, might be more effective.

E 8.15.1 Example

Suppose five independent tests, for example, that five small samples are each from the normal distribution with mean zero and variance one, give p-values .15, .20, .28, .16, .25, so that each test is not significant at the 10% level. Then P₁ = −2 Σᵢ log pᵢ = 15.99; this value is exactly significant at the 10% level for χ²₁₀. This follows the usual procedure as suggested by Fisher. However, if we use the q-values .85, .80, .72, .84, .75 and calculate P₂ = −2 Σᵢ log qᵢ we obtain P₂ = 2.352. This is significant at the 1% level in the lower tail of χ²₁₀; the sample gives greater significance using P₂ than using P₁.
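The arithmetic of this example can be reproduced in a few lines (a sketch; the comparison with the χ²₁₀ percentage points is done against tables, as in the text):

```python
import math

def fisher_p1_p2(p):
    """P1 = -2*sum(log p_i) and P2 = -2*sum(log q_i) with q_i = 1 - p_i;
    on H0 both are chi-squared with 2k degrees of freedom (k = len(p))."""
    p1 = -2.0 * sum(math.log(pi) for pi in p)
    p2 = -2.0 * sum(math.log(1.0 - pi) for pi in p)
    return p1, p2

p1, p2 = fisher_p1_p2([0.15, 0.20, 0.28, 0.16, 0.25])
# p1 is about 16.0 and p2 about 2.35, matching Example E 8.15.1
```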

E 8.15.2 Example

For this example we take Fisher's first illustration of his test (Fisher, 1967). In this example three tests of fit yielded p-values of .145, .263, .087, and P₁ is 11.42. In the upper tail of χ²₆, this is significant at approximately the 7.5% level. The q-values are .855, .737, .913, and P₂ is 1.105. This value is significant at the 2.5% level in the lower tail of χ²₆. Again the value of P₂ is more significant than the value of P₁ in its appropriate tail.

8.15.3 Use of Fisher's Test with Two-Tail Component Tests

If all the component tests in H₀ were two-tail tests, either P₃ or P₄ above should be used as test statistics. For the same reasons that P₂ can be preferred to P₁ for one-tail tests, P₄ might be better than P₃ for two-tail tests. When some of the component hypotheses H₀ᵢ are tested by one-tail tests and some by two-tail tests, the formula for P could be

    P₅ = −2 Σᵢ log uᵢ

with uᵢ = qᵢ for one-tail tests, and uᵢ = 1 − 2rᵢ for two-tail tests. Alternatively P₆ = −2 Σᵢ log uᵢ could be used with uᵢ = pᵢ for one-tail tests and uᵢ = 2rᵢ for two-tail tests. P₅ and P₆ will again have the χ²₂ₖ distribution when all H₀ᵢ are true.

E 8.15.3 Example

Suppose five independent tests are to be made that five samples from normal distributions have means 0, against the alternative that the means are not 0. Thus five t-tests will be used, with significance in either tail for each test. Suppose the significance levels, measured all from the upper tail (for this we use the temporary notation pᵢ*), are pᵢ* = .15, .04, .75, .92, .07; thus only the second sample would be declared significant using a two-tail 10% level for tᵢ. The corresponding values of rᵢ are .15, .04, .25, .08, .07 and these give the value P₃ = 16.44. Using 1 − 2rᵢ instead of 2rᵢ we have P₄ = 2.92. P₃ is significant at the 10% level of χ²₁₀ (upper tail), and P₄ at the 2% level (lower tail).
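The two-tail combination in this example can be sketched as follows (my own illustration):

```python
import math

def combine_two_tail(p_upper):
    """P3 = -2*sum(log 2r_i) and P4 = -2*sum(log(1 - 2r_i)), where
    r_i = min(p_i, 1 - p_i) and the p_i are upper-tail levels of
    two-tail component tests; both are chi-squared(2k) on H0."""
    r = [min(p, 1.0 - p) for p in p_upper]
    p3 = -2.0 * sum(math.log(2.0 * ri) for ri in r)
    p4 = -2.0 * sum(math.log(1.0 - 2.0 * ri) for ri in r)
    return p3, p4

p3, p4 = combine_two_tail([0.15, 0.04, 0.75, 0.92, 0.07])
# p3 is about 16.44 and p4 about 2.92, as in Example E 8.15.3
```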

8.15.4 Possible Misuse of Fisher's Test

It is possible to misuse Fisher's test, in the situation where some of the component tests are two-tail tests, by using rᵢ when 2rᵢ should be used. This is especially easily done when the results of two-tail tests are sometimes reported, using expressions such as "the lower tail p-value equals 0.11," or "the upper tail p-value equals 0.35." The statistician might then wrongly use log 0.11 and log 0.35 in the calculation of, say, P₃, and obtain false levels of significance for the test of the overall hypothesis H₀. As an example, consider Example E 8.15.3 above in which all tests are two-tail tests, and the third and fourth tests, for example, could have been reported as significant at the .25 and .08 levels in the lower tail. Then if the values of rᵢ instead of 2rᵢ are used in calculating P₃ (which is the same as P₆ since all tests are two-tail), we have P₃ = 23.37. This is spuriously highly significant; P₃ is at the 1% level of χ²₁₀.
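The size of the spurious effect is easy to quantify (an observation of mine, not from the text): replacing each 2rᵢ by rᵢ adds exactly 2 log 2 per test to P₃, here 2 × 5 × log 2 ≈ 6.93, which carries 16.44 up to 23.37.

```python
import math

r = [0.15, 0.04, 0.25, 0.08, 0.07]                    # r_i from Example E 8.15.3
p3_correct = -2.0 * sum(math.log(2.0 * ri) for ri in r)
p3_misused = -2.0 * sum(math.log(ri) for ri in r)     # r_i used where 2*r_i belongs
# the misuse inflates P3 by exactly 2*k*log(2)
assert abs(p3_misused - p3_correct - 2 * len(r) * math.log(2)) < 1e-9
```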

8.16 TESTS FOR A UNIFORM DISTRIBUTION WITH UNKNOWN LIMITS

Until now, tests in this chapter have been tests for a U(0,1) distribution; they can be used easily for a test for U(a,b), where a and b are known, by making the transformation given in Section 8.2. If the limits of the distribution are not known, other procedures are available. Two of these are:

(a) Suppose the ordered sample is U′(1) ≤ U′(2) ≤ ··· ≤ U′(n), and define

    Uᵢ = (U′(i+1) − U′(1)) / (U′(n) − U′(1)),   i = 1, ..., (n − 2).

A test of H₀′: the U′-set is an ordered sample from U(a,b), becomes a test of H₀: the U-sample, of size n − 2, is an ordered sample from U(0,1).
(b) Use of the correlation test of Section 5.5.1.
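Procedure (a) can be sketched in a few lines (a minimal illustration of the transformation, assuming the construction given above):

```python
def rescale_unknown_limits(u_prime):
    """Map an ordered sample with unknown limits a, b onto n - 2 values
    U_i = (U'_(i+1) - U'_(1)) / (U'_(n) - U'_(1)), i = 1, ..., n - 2,
    which on H0 form an ordered U(0,1) sample (procedure (a))."""
    v = sorted(u_prime)
    lo, hi = v[0], v[-1]
    return [(x - lo) / (hi - lo) for x in v[1:-1]]

# the extremes serve as the estimated limits and are then dropped
assert rescale_unknown_limits([2.0, 3.0, 5.0, 10.0]) == [0.125, 0.375]
```

The resulting sample of size n − 2 can then be tested by any of the U(0,1) methods of this chapter.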

8.17 TESTS FOR CENSORED UNIFORM SAMPLES

Tests for censored data are described in several other chapters and examples are given in Chapter 11. Tests specifically for censored uniforms are

(a) EDF tests: Sections 4.7.3-4.7.6, and 12.3;
(b) Correlation tests: Sections 5.5, 5.6, and 12.3;
(c) Spacings tests based on Greenwood's statistic: Section 8.9.1.

REFERENCES

Ajne, B. (1968). A simple test for uniformity of a circular distribution. Biometrika, 55, 343-354.

Bargal, A. I. and Thomas, D. R. (1983). Smooth goodness-of-fit tests for the Weibull distribution with singly censored data. Commun. Statist. Theor. Meth., 12, 1431-1447.

Barton, D. E. (1953). On Neyman's smooth test of goodness-of-fit and its power with respect to a particular system of alternatives. Skand. Aktuarietidskr., 36, 24-63.

Barton, D. E. (1955). A form of Neyman's psi-2 test of goodness-of-fit applicable to grouped and discrete data. Skand. Aktuarietidskr., 1-17.

Barton, D. E. (1956). Neyman's psi-2 test of goodness-of-fit when the null hypothesis is composite. Skand. Aktuarietidskr., 39, 216-245.

Barton, D. E. and Mallows, C. L. (1965). Some aspects of the random sequence. Ann. Math. Statist., 36, 236-260.

Beran, R. J. (1968). Testing for uniformity on a compact homogeneous space. J. Appl. Prob., 5, 177-195.

Brunk, H. D. (1962). On the range of the difference between hypothetical distribution function and Pyke's modified empirical distribution function. Ann. Math. Statist., 33, 525-532.

Burrows, P. M. (1979). Selected percentage points of Greenwood's statistic. J. Roy. Statist. Soc., A, 142, 256-258.

Cressie, N. (1976). On the logarithms of high order spacings. Biometrika, 63, 343-355.

Cressie, N. (1977a). The minimum of higher order gaps. Australian J. Statist., 19, 132-143.

Cressie, N. (1977b). On some properties of the scan statistic on the circle and the line. J. Appl. Prob., 14, 272-283.

Cressie, N. (1978). Power results for tests based on high order gaps. Biometrika, 65, 214-218.

Cressie, N. (1979). An optimal statistic based on higher order gaps. Biometrika, 66, 619-627.

Cressie, N. (1980). Asymptotic distribution of the scan statistic under uniformity. Ann. Prob., 9, 828-840.

Currie, I. D. (1981). Further percentage points for Greenwood's statistic. J. Roy. Statist. Soc., A, 144, 360-363.

Darling, D. A. (1953). On a class of problems related to the random division of an interval. Ann. Math. Statist., 239-253.

Darling, D. A. (1982). On the supremum of a certain Gaussian process. Ann. Prob., 11, 803-806.

Darling, D. A. (1983). On the asymptotic distribution of Watson's statistic. Ann. Statist., 11, 1263-1266.

David, F. N. (1939). On Neyman's "smooth" test for goodness of fit. Biometrika, 31, 191-199.

Deken, J. G. (1980). Exact distributions for gaps and stretches. Technical Report, Department of Statistics, Stanford University.

Del Pino, G. E. (1979). On the asymptotic distribution of k-spacings with applications to goodness-of-fit tests. Ann. Statist., 7, 1058-1065.

Dudewicz, E. J. and Van Der Meulen, E. C. (1981). Entropy-based tests of uniformity. J. Amer. Statist. Assoc., 76, 967-974.

Durand, D. and Greenwood, J. A. (1957). Random unit vectors II: Usefulness of Gram-Charlier and related series in approximating distributions. Ann. Math. Statist., 28, 978-986.

Durbin, J. (1961). Some methods of constructing exact tests. Biometrika, 48, 41-55.

Durbin, J. (1969a). Test for serial correlation in regression analysis based on the periodogram of least-squares residuals. Biometrika, 56, 1-16.

Durbin, J. (1969b). Tests of serial independence based on the cumulated periodogram. Bull. Inst. Internat. Statist., 42, 1039-1048.

Durbin, J. and Knott, M. (1972). Components of Cramer-von Mises statistics, I. J. Roy. Statist. Soc., B, 34, 290-307.

Durbin, J., Knott, M., and Taylor, C. C. (1975). Components of Cramer-von Mises statistics, II. J. Roy. Statist. Soc., B, 216-237.

Fisher, R. A. (1967). Statistical Methods for Research Workers. 4th Edition. New York: Stechert.

Giné, E. (1975). Invariant tests for uniformity on compact Riemannian manifolds based on Sobolev norms. Ann. Statist., 3, 1243-1266.

Greenwood, J. A. and Durand, D. (1955). The distribution of length and components of the sum of N random unit vectors. Ann. Math. Statist., 233-246.

Greenwood, M. (1946). The statistical study of infectious disease. J. Roy. Statist. Soc., A, 109, 85-110.

Hartley, H. O. and Pfaffenberger, R. C. (1972). Quadratic forms in order statistics used as goodness-of-fit criteria. Biometrika, 59, 605-611.

Hegazy, Y. A. S. and Green, J. R. (1975). Some new goodness-of-fit tests using order statistics. Appl. Statist., 24, 299-308.

Hill, I. D. (1979). Approximating the distribution of Greenwood's statistic with Johnson distributions. J. Roy. Statist. Soc., A, 142, 378-380. Corrigendum (1981). J. Roy. Statist. Soc., A, 144, 388.

Holst, L. (1979). Asymptotic normalities of sum-functions of spacings. Ann. Prob., 7, 1066-1072.

Johannes, J. M. and Rasche, R. H. (1980). Additional information on significance values for Durbin's C⁺, C⁻ and C statistics. Biometrika, 67, 511-514.

Kimball, B. F. (1947). Some basic theorems for developing tests of fit for the case of the non-parametric probability distribution function, I. Ann. Math. Statist., 18, 540-548.

Kimball, B. F. (1950). On the asymptotic distribution of the sum of powers of unit frequency differences. Ann. Math. Statist., 263-271.

Littell, R. C. and Folks, J. L. (1971). Asymptotic optimality of Fisher's method of combining independent tests. J. Amer. Statist. Assoc., 66, 802-806.

Littell, R. C. and Folks, J. L. (1973). Asymptotic optimality of Fisher's method of combining independent tests, II. J. Amer. Statist. Assoc., 68, 193-194.

Locke, C. and Spurrier, J. D. (1978). On tests of uniformity. Comm. Statist. Theory Methods, A7, 241-258.

Lurie, D., Hartley, H. O., and Stroud, M. R. (1974). A goodness of fit test for censored data. Comm. Statist., 3, 745-753.

McLaren, C. G. and Stephens, M. A. (1985). Percentage points and power for spacings statistics for testing uniformity. Technical Report, Department of Mathematics and Statistics, Simon Fraser University.

Miller, F. L. and Quesenberry, C. P. (1979). Power studies of some tests for uniformity, II. Comm. Statist. Simula. Comput., B, 8, 271-290.

Moran, P. A. P. (1947). The random division of an interval—Part I. J. Roy. Statist. Soc., B, 9, 92-98.

Moran, P. A. P. (1951). The random division of an interval—Part II. J. Roy. Statist. Soc., B, 13, 147-150.

Neyman, J. (1937). "Smooth" tests for goodness-of-fit. Skand. Aktuarietidskr., 20, 149-199.

O'Reilly, F. J. and Stephens, M. A. (1982). Characterizations and goodness of fit tests. J. Roy. Statist. Soc.,

Pearson, E. S. (1938). The probability integral transformation for testing goodness-of-fit and combining independent tests of significance. Biometrika, 30, 134-148.

Prentice, M. J. (1978). On invariant tests of uniformity for directions and orientations. Ann. Statist., 6, 169-176.

Pyke, R. (1965). Spacings. J. Roy. Statist. Soc., B, 395-449.

Quesenberry, C. P. and Hales, S. (1980). Concentration bands for uniformity plots. J. Statist. Comput. Simul., 11, 41-53.

Quesenberry, C. P. and Miller, F. L. Jr. (1977). Power studies of some tests for uniformity. J. Statist. Comput. Simulation, 5, 169-191.

Schoenfeld, D. A. (1977). Asymptotic properties of tests based on linear combinations of the orthogonal components of Cramer-von Mises statistics. Ann. Statist., 5, 1017-1026.

Schoenfeld, D. A. (1980). Tests based on linear combinations of the orthogonal components of the Cramer-von Mises statistic when parameters are estimated. Ann. Statist., 8, 1017-1022.

Seshadri, V., Csorgo, M., and Stephens, M. A. (1969). Tests for the exponential distribution using Kolmogorov-type statistics. J. Roy. Statist. Soc., B, 499-509.

Smirnov, N. V. (1947). Akad. Nauk SSR, C. R. (Dokl.) Akad. Sci. URSS, 56, 11-14.

Solomon, H. and Stephens, M. A. (1981). Tests for uniformity: Greenwood's test and tests based on gaps and stretches. Technical Report 311, Dept. of Statistics, Stanford University.

Solomon, H. and Stephens, M. A. (1983). On Neyman's statistic for testing uniformity. Comm. Statist. Simulation Comput., 12, 127-134.

Stephens, M. A. (1964). The testing of unit vectors for randomness. J. Amer. Statist. Assoc., 59, 160-167.

Stephens, M. A. (1966). Statistics connected with the uniform distribution: percentage points and applications to testing for randomness of directions. Biometrika, 53, 235-239.

Stephens, M. A. (1969a). Tests for randomness of directions against two circular alternatives. J. Amer. Statist. Assoc., 64, 280-289.

Stephens, M. A. (1969b). Results from the relation between two statistics of the Kolmogorov-Smirnov type. Ann. Math. Statist., 40, 1833-1837.

Stephens, M. A. (1969c). A goodness-of-fit statistic for the circle, with some comparisons. Biometrika, 56, 161-168.

Stephens, M. A. (1970). Use of the Kolmogorov-Smirnov, Cramer-von Mises and related statistics without extensive tables. J. Roy. Statist. Soc., B, 32, 115-122.

Stephens, M. A. (1974a). Components of goodness-of-fit statistics. Ann. Inst. H. Poincare, B, 10, 37-54.

Stephens, M. A. (1974b). EDF statistics for goodness-of-fit and some comparisons. J. Amer. Statist. Assoc., 69, 730-737.

Stephens, M. A. (1976). Asymptotic power of EDF statistics for exponentiality against Gamma and Weibull alternatives. Technical Report No. 297, Department of Statistics, Stanford University.

Stephens, M. A. (1981). Further percentage points for Greenwood's statistic. J. Roy. Statist. Soc., A, 144, 364-366.

Stephens, M. A. (1986). Goodness-of-fit for censored data. Technical Report, Department of Statistics, Stanford University.

Stuart, A. (1954). Too good to be true. Appl. Statist., 3, 29-32.

Sukhatme, P. V. (1937). Tests of significance for samples of the chi-square population with two degrees of freedom. Annals of Eugenics London, 52-56.

Thomas, D. R. and Pierce, D. A. (1979). Neyman's smooth goodness-of-fit test when the hypothesis is composite. J. Amer. Statist. Assoc., 74, 441-445.

Vasicek, O. (1976). A test for normality based on sample entropy. J. Roy. Statist. Soc., B, 54-59.

Wallenstein, S. R. and Naus, J. I. (1974). Probabilities for the size of largest clusters and smallest intervals. J. Amer. Statist. Assoc., 69, 690-695.

Wang, Y. H. and Chang, S. A. (1977). A new approach to the nonparametric tests of exponential distribution with unknown parameters. The Theory and Application of Reliability, 2, 235-258. New York: Academic Press.

Watson, G. S. (1967). Some problems in the statistics of directions. Bulletin of the Int. Statist. Inst. Conference, 36th Session, Sydney, Australia.

Watson, G. S. (1976). Optimal invariant tests for uniformity. Studies in Probability and Statistics. Amsterdam: North-Holland, 121-127.

Weiss, L. (1957a). The convergence of certain functions of sample spacings. Ann. Math. Statist., 778-782.

Weiss, L. (1957b). The asymptotic power of certain tests of fit based on sample spacings. Ann. Math. Statist., 28, 783-788.
9
Tests for the Normal Distribution

Ralph B. D'Agostino  Boston University, Boston, Massachusetts

9.1 INTRODUCTION

The single most used distribution in statistical analysis is the normal distribution. Its uses can be classified in two sets. The first relates to the class of statistics which are taken to be normally distributed due to the applicability of large sample theorems such as the Central Limit Theorem (Rao, 1973, Chapter 2), the Delta Theorems (Rao, 1973, Chapter 6), and theorems related to the asymptotic distribution of linear functions of order statistics (Chernoff, Gastwirth, and Johns, 1967). The second set relates to situations where the normal distribution is assumed to be the appropriate mathematical model for the underlying phenomenon under investigation. The applied literature is replete with examples of this latter class where, for example, the normal distribution or the related lognormal distribution (i.e., logs of data are normally distributed) are used as models for cadmium and lead levels in the blood of children (Smith, Temple, and Reading, 1976), the distribution of hydrologic runoff (Kottegoda and Yevjevich, 1977), body discomfort and transmissibility scores (Griffin and Whitham, 1978), levels of toxic gases to which workers are exposed (D'Agostino and Gillespie, 1978; and Smith, Wagner, and Moore, 1978), weights of mammary tumors in rats (Fredholm, Gunnarsson, Jensen, and Muntzing, 1978), nuclear cross sections data (Richert, Simbel, and Weidenmuller, 1975), radio scintillation data (Rino, Livingston, and Whitney, 1976), earnings and wages (White and Olson, 1981), and the distributions of air pollutants (Larson, 1971, and Hunt, 1972). This chapter deals with this second class of use and discusses goodness-of-fit tests designed to test formally the appropriateness or adequacy of the normal distribution as a model for the underlying phenomenon

from which data were generated. These tests complement the informal graphical techniques already discussed in Chapter 2 (see Sections 2.4 and 2.5).
This chapter will focus on tests applicable to complete samples. Techniques based on incomplete or censored samples are discussed elsewhere: in Chapter 11, Analysis of Data from Censored Samples, and also in Chapters 3, 4, and 5. We start by discussing tests that assume a complete random sample is available for analysis. These tests occupy the major portion of the chapter and are its primary interest. Tests applicable to residuals and tests for multivariate normality will also be discussed.

9.2 COMPLETE RANDOM SAMPLES

Until stated otherwise we assume the following. Let X_1, X_2, . . . , X_n be a random sample of size n from a population with probability density function (pdf) f(x) and cumulative distribution function (cdf) F(x). The pdf and cdf of the normal distribution are, respectively, φ(x) and Φ(x). Our null hypothesis is

H_0: f(x) = φ(x) or F(x) = Φ(x)   (9.1)

9.2.1 Null Hypothesis

The pdf of the normal distribution is given by

φ(x) = (1/(σ√(2π))) exp[-(x - μ)²/(2σ²)]   (9.2)

(-∞ < x < ∞;  -∞ < μ < ∞;  σ > 0)

In testing for departures from the normal distribution the null hypothesis of (9.1), H_0, is that the random variable X under consideration is distributed as a normal variable, or in other words, X has a probability density function given by (9.2). If, further, specific values of both the mean and standard deviation, μ and σ, of (9.2) are specified by the null hypothesis (e.g., X is normally distributed with μ = 500 and σ = 100), then the null hypothesis is a simple hypothesis. This means the null hypothesis concerns itself with only one particular distribution. If either μ or σ is not specified completely, then the null hypothesis under consideration is a composite hypothesis. This chapter deals mainly with the composite null hypothesis with both μ and σ unknown. In most applications prior knowledge of μ or σ is not available. If it is available, it usually is of no help, from a power point of view, in judging goodness-of-fit (see, for example, Chapter 4 on EDF tests).

9.2.2 Alternative Hypothesis

The alternative hypothesis, H_1, usually employed in these testing situations is the composite hypothesis that X is not normally distributed. Directions of nonnormality or alternative distributions are only rarely considered (e.g., by Uthoff, 1970, 1973). In this chapter alternatives to normality are limited to the following: (1) X is nonnormal and no prior information is available concerning alternative distributions, or (2) X is nonnormal and information is available concerning the deviations from normality in terms of skewness and/or tail thickness or peakedness as measured, for example, by the kurtosis coefficient β₂. For a random variable X the skewness and kurtosis coefficients are, respectively,

√β₁ = E(X - μ)³/σ³   (9.3)

and

FIGURE 9.1 Illustration of distributions with √β₁ ≠ 0 and β₂ ≠ 3. (a) Distributions differing in skewness: A. √β₁ > 0, B. √β₁ = 0, C. √β₁ < 0. (b) Distributions differing in kurtosis: A. β₂ = 3, B. β₂ ≠ 3.

β₂ = E(X - μ)⁴/σ⁴   (9.4)

where E represents the expected value operator. For the normal distribution √β₁ = 0 and β₂ = 3. In the following, β₂ will often be used to refer to the "tail thickness" of an alternative distribution, with β₂ > 3 indicating a thick-tailed distribution and β₂ < 3 indicating a thin-tailed distribution. The reader is referred to Chapter 2 on graphical analysis and Chapter 7 on moment techniques for further details of these coefficients. Figure 9.1 contains illustrations of distributions with √β₁ ≠ 0 and β₂ ≠ 3.
The next two sections consist of classifying and investigating the relative merits of the tests. The reader who is interested in our final recommendations may want to go directly to Section 9.5 and then read Sections 9.3 and 9.4 selectively.
The objective of the next three sections is to classify and review a selective number of the various available tests (some of which have already been presented in the above chapters), discuss their relative merits, and then make recommendations concerning which should be used in practice. We have attempted to select, in an objective manner, tests which are serious contenders for use or else have a long historical usage behind them. However, such a selection involves personal judgments and is not entirely objective. There are no definitive answers as to which tests are best, and the tests we have selected are sure to have excluded the favorite tests of some researchers. We have attempted to select and recommend tests that are as good as or better than other existing tests. There may exist other tests not recommended which are also good. In anticipation of this omission let us give our apologies here.

9.3 CLASSIFICATION OF EXISTING TESTS

For the purposes of this chapter tests for normality can be grouped into five categories: chi-square tests, empirical distribution function tests, moment tests, regression tests, and miscellaneous tests. We now discuss these groups.

9.3.1 Chi-Square Type Tests

These well-known tests, first developed by Karl Pearson, are suitable for simple or composite null hypotheses and are discussed exhaustively in Chapter 3. For the normal distribution the mechanics of these consist of discretizing the hypothesized distribution (with known or estimated parameters) into a multinomial distribution of k cells, counting the observed number of observations in each cell and contrasting these, via a chi-square statistic or a likelihood ratio statistic, with the expected number of observations for each

cell. The latter expected values are computed assuming the data did arise from a normal distribution. Of particular interest to testing for normality are the articles of Chernoff and Lehmann (1954) and Watson (1957). In the former article it is shown that the use of the sample mean and standard deviation based on ungrouped data to obtain the expected values results in the observed chi-square statistic being asymptotically distributed as

χ²(k - 3) + α₁χ²(1) + α₂χ²(1)   (9.5)

In (9.5) χ²(ν) represents a chi-square variable with ν degrees of freedom; all these chi-square variables are independent and 0 < α_i < 1. The often quoted k - 3 degrees of freedom is incorrect. The Watson (1957) article describes how switching appropriately to fixed cell probabilities can result in obtaining explicit formulas for the α₁ and α₂ of (9.5). While the chi-square tests are of historical interest and are continuously being modified, we agree with Professor D. S. Moore, the author of Chapter 3, that they should not be recommended for use in testing for departures from normality when the full ungrouped sample of data is available. Other procedures to be discussed below are more powerful. In the cases where the full sample is not available (i.e., data are censored or truncated) or where the data are grouped into classes (see Section 3.2.7, Example 2) these procedures are of use. We refer the reader to Chapter 3 for further details. The remainder of this chapter will not contain any further discussion of these tests.

9.3.2 Tests Based on the Empirical Distribution Function (EDF)

Chapter 4 above discussed in detail the concept and applications of the tests based on the empirical distribution function (EDF). Basically, for the normal distribution these tests involve measuring the discrepancy between the cumulative distribution function

Φ(x) = ∫_{-∞}^{x} (1/(σ√(2π))) exp[-½((t - μ)/σ)²] dt   (9.6)

of the normal distribution and the empirical distribution function

F_n(x) = #(X_i ≤ x)/n   (9.7)

of the sample. The μ and σ of (9.6) often are not specified and are replaced by the sample mean X̄ and standard deviation S, where

X̄ = ΣX/n  and  S = [Σ(X - X̄)²/(n - 1)]^(1/2)   (9.8)

9.3.2.1 Simple Null Hypothesis

Many tests have been developed for this situation. Some prominent ones are the Kolmogorov (1933)-Smirnov (1939) test, the Kuiper V test (1960), Pyke's C test (1959), Brunk's B test (1962), Durbin's D test (1961), the Cramer-von Mises W² test (1928), Durbin's M test (1973), Watson's U² test (1961), the Anderson-Darling A² test (1954), Fisher's π and π′ tests (1928), and the Hartley-Pfaffenberger test (1972). See Chapter 4 for details on some of these.

9.3.2.2 Anderson-Darling Test for the Composite Hypothesis

Some of the above tests have been modified to apply to the composite null hypothesis of normality with μ and σ unknown (Stephens, 1974, and Green and Hegazy, 1976). In Chapter 4, Section 4.2, formulas are given for the Kolmogorov-Smirnov D test, the Kuiper V test, the Cramer-von Mises W² test, the Watson U² test, and the Anderson-Darling A² test, and in Section 4.8 the application of these to the normal distribution is described in detail. For the purposes of this chapter we now present, in the notation of this chapter, the procedure for performing the Anderson-Darling A² test, which is the EDF test we recommend for use.

(1) Arrange the sample in ascending order, X_(1) ≤ X_(2) ≤ . . . ≤ X_(n).

(2) Calculate standardized values Y_(i), where

Y_(i) = (X_(i) - X̄)/S  for i = 1, . . . , n   (9.9)

(3) Calculate P_i for i = 1, . . . , n, where

P_i = Φ(Y_(i))   (9.10)

Φ(y) in (9.10) represents the cdf of the standard normal distribution and P_i is the cumulative probability corresponding to the standard score Y_(i) of (9.9). P_i can be found from standard normal tables as given in the Appendix or by use of the following approximation due to Hastings (1955). For Y_(i) such that 0 ≤ Y_(i) < ∞ define y = Y_(i) and compute

Q_i = 1 - ½(1 + c₁y + c₂y² + c₃y³ + c₄y⁴)⁻⁴   (9.11)

where

c₁ = 0.196854,  c₃ = 0.000344
c₂ = 0.115194,  c₄ = 0.019527

Here P_i of (9.10) is equal to Q_i of (9.11). For Y_(i) such that -∞ < Y_(i) < 0, define y = -Y_(i) and compute Q_i of (9.11). Here P_i of (9.10) is equal to 1 - Q_i of (9.11).
(4) Compute the Anderson-Darling statistic

A² = -[Σ_{i=1}^{n} (2i - 1){log P_i + log(1 - P_{n+1-i})}]/n - n   (9.12)

where log is log base e.

(5) Compute the modified statistic

A* = A²(1.0 + 0.75/n + 2.25/n²)   (9.13)

(6) Reject the null hypothesis of normality if A* exceeds 0.631, 0.752, 0.873, 1.035, and 1.159 at levels of significance 0.10, 0.05, 0.025, 0.01, and 0.005, respectively.

The above procedure is valid for samples of size n ≥ 8.
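For readers computing A* by machine, the six steps can be sketched as follows (a minimal Python illustration, not code from this chapter; the function name is ours, and P_i is obtained from the Hastings approximation of (9.11) rather than from tables):

```python
import math

def anderson_darling_normal(x):
    """Anderson-Darling A* statistic for the composite hypothesis of
    normality (mu and sigma unknown), following steps (1)-(6)."""
    n = len(x)
    xs = sorted(x)                                          # step (1)
    xbar = sum(xs) / n
    s = math.sqrt(sum((v - xbar) ** 2 for v in xs) / (n - 1))
    y = [(v - xbar) / s for v in xs]                        # step (2), eq. (9.9)

    c1, c2, c3, c4 = 0.196854, 0.115194, 0.000344, 0.019527
    def norm_cdf(z):                                        # step (3), eqs. (9.10)-(9.11)
        q = 1.0 - 0.5 * (1 + c1 * abs(z) + c2 * abs(z) ** 2
                         + c3 * abs(z) ** 3 + c4 * abs(z) ** 4) ** -4
        return q if z >= 0 else 1.0 - q
    p = [norm_cdf(z) for z in y]

    a2 = -sum((2 * i - 1) * (math.log(p[i - 1]) + math.log(1.0 - p[n - i]))
              for i in range(1, n + 1)) / n - n             # step (4), eq. (9.12)
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)            # step (5), eq. (9.13)
```

Applied to the ten NOR observations of Table 9.1 this returns A* ≈ 0.198, below the 0.631 point at the 0.10 level, so normality is not rejected (step (6)).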
E 9.3.2.2.1 Numerical example of Anderson-Darling test. Table 9.1 contains numerical examples employing the first ten observations of the NOR and EXP data sets (of the Appendix). As is to be expected, the Anderson-Darling test accepts the hypothesis of normality for the NOR data and rejects it for the EXP data at level of significance 0.05.
The P-level or descriptive level of significance for the Anderson-Darling test can be obtained by use of Table 4.8. The reader is referred to Chapter 4, Section 4.8.2 for details.

9.3.2.3 Transformation Methods for the Composite Hypothesis

Csorgo, Seshadri, and Yalovsky (1973) present another approach for applying EDF tests to the composite null hypothesis (μ and σ unknown). In this approach the n (> 4) observations are first transformed into n - 2 independent observations free of unknown parameters. Then EDF tests (e.g., A²) are applied to these. O'Reilly and Quesenberry (1973) and Quesenberry (1975) present a general theory for obtaining transformations such that the transformed variables are independent uniform random variables. Chapter 6 of this book is devoted to transformation techniques. With use of these it is now possible to apply any EDF test as a test for deviations from normality. However, these transformation procedures require randomization of the data. To many users this is considered an undesirable feature.

TABLE 9.1 Numerical Examples of Anderson-Darling A² Test

  i    X_(i)    Y_(i)    P_i     1 - P_i   log P_i   log(1 - P_i)   (2i - 1){log P_i + log(1 - P_{n+1-i})}

NOR data (Normal, μ = 100, σ = 10);  X̄ = 98.414, S = 8.277

  1    84.27   -1.71    .0436    .9564    -3.1327    -0.0446      -6.5685
  2    90.87   -0.91    .1814    .8186    -1.7071    -0.2002     -10.7358
  3    92.55   -0.71    .2389    .7611    -1.4317    -0.2730     -12.3790
  4    96.20   -0.27    .3936    .6064    -0.9324    -0.5002     -12.8506
  5    98.70    0.03    .5120    .4880    -0.6694    -0.7174     -12.7800
  6    98.98    0.07    .5279    .4721    -0.6388    -0.7506     -14.9182
  7   100.42    0.24    .5948    .4052    -0.5195    -0.9034     -13.2561
  8   101.58    0.38    .6480    .3520    -0.4339    -1.0441     -10.6035
  9   106.82    1.02    .8461    .1539    -0.1671    -1.8715      -6.2441
 10   113.75    1.85    .9678    .0322    -0.0327    -3.4358      -1.4687
                                                         Sum:   -101.8045

EXP data (Exponential, μ = 5.0);  X̄ = 4.257, S = 5.169

  1     0.06   -0.81    .2090    .7910    -1.5654    -0.2345      -6.3937
  2     0.37   -0.75    .2266    .7734    -1.4846    -0.2570      -8.9076
  3     0.44   -0.74    .2296    .7704    -1.4714    -0.2608     -12.8460
  4     0.89   -0.65    .2578    .7422    -1.3556    -0.2981     -14.8029
  5     2.17   -0.40    .3446    .6554    -1.0654    -0.4225     -13.8663
  6     2.63   -0.31    .3783    .6217    -0.9721    -0.4753     -15.3406
  7     4.69    0.08    .5319    .4681    -0.6313    -0.7591     -12.0822
  8     6.48    0.43    .6664    .3336    -0.4059    -1.0978     -10.0005
  9     8.15    0.75    .7734    .2266    -0.2570    -1.4846      -8.7380
 10    16.69    2.41    .9920    .0080    -0.0080    -4.8283      -4.6075
                                                         Sum:   -107.5853

NOR data:  A² = 101.8045/10 - 10 = 0.18045
           A* = A²(1 + 0.75/10 + 2.25/100) = 0.198044  (Accept normality for NOR data)

EXP data:  A² = 107.5853/10 - 10 = 0.75853
           A* = A²(1 + 0.75/10 + 2.25/100) = 0.832487  (Reject normality for EXP data at 0.05 level of significance)

9.3.2.4 Components of an EDF Statistic

Durbin, Knott, and Taylor (1975) have employed a procedure from which it is possible to express the test statistic of an EDF test as a weighted linear function of independent chi-square variables, each with one degree of freedom. This permits computation of asymptotic significance points. These are similar to those obtained by Monte Carlo procedures and presented by Stephens in Chapter 4 of the present volume.
The reader is referred to Chapter 4 for further discussion of EDF tests.

9.3.3 Moment Tests and Related Tests

Chapter 7, Section 7.2 discusses moment tests as they apply to the normal distribution. The modern theory of tests for normality can be regarded as having been initiated by Karl Pearson (1895), who recognized that deviations from normality could be characterized by the standardized third and fourth moments of a distribution. To be more explicit, as previously discussed in Section 9.2.2, the normal distribution with density given by (9.2) has as its standardized third and fourth moments, respectively,

√β₁ = E(X - μ)³/σ³ = 0   (9.14)

and

β₂ = E(X - μ)⁴/σ⁴ = 3   (9.15)

The third standardized moment √β₁ characterizes the skewness of a distribution. If a distribution is symmetric about its mean μ, as is the normal distribution, √β₁ = 0. Values of √β₁ ≠ 0 indicate skewness and so nonnormality. The fourth standardized moment β₂ characterizes the kurtosis or peakedness of a distribution. For the normal distribution, β₂ = 3, and values of β₂ ≠ 3 indicate nonnormality. β₂ is also useful as an indicator of tail thickness: values of β₂ > 3 indicate distributions with "thicker" than normal tails, and values of β₂ < 3 indicate distributions with "thinner" than normal tails.
Pearson suggested that in the sample, the standardized third and fourth moments given by

√b₁ = m₃/m₂^(3/2)   (9.16)

and

b₂ = m₄/m₂²   (9.17)

where

m_k = Σ(X - X̄)^k/n,  k > 1   (9.18)

and

X̄ = ΣX/n   (9.19)

could be used to judge departures from normality. He found the first approximation (i.e., to n⁻¹) to the variances and covariances of √b₁ and b₂ for samples drawn at random from any population, and assuming that √b₁ and b₂ were distributed jointly with bivariate normal probability, constructed equal
probability ellipses. From these, approximate assessments could be made of whether the sample deviated too greatly from normality. For situations where √b₁ and b₂ deviated substantially from expectation under normality, K. Pearson developed his elaborate system of Pearson curves. These could be used as possible alternative distributions for the populations under investigation. (See Elderton and Johnson (1969) for a full discussion of Pearson curves. Chapter 7 also makes extensive use of them.)
A number of investigators have concerned themselves with obtaining correct significance points for √b₁ and b₂. The reader is referred to Chapter 7, Section 7.2 for a brief review of the history. In the following we present the present state of the field.
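As a small illustration of definitions (9.16)-(9.19), the two sample statistics can be computed directly from the moments m_k (a minimal Python sketch; the function name is ours):

```python
def sample_b1_b2(x):
    """Sample skewness sqrt(b1) = m3/m2**1.5 and kurtosis b2 = m4/m2**2,
    with m_k = sum((X - Xbar)**k)/n as in eqs. (9.16)-(9.19)."""
    n = len(x)
    xbar = sum(x) / n
    m = lambda k: sum((v - xbar) ** k for v in x) / n   # central moment m_k
    return m(3) / m(2) ** 1.5, m(4) / m(2) ** 2
```

For the ten EXP observations of Table 9.4 this gives sqrt(b1) ≈ 1.49 and b2 ≈ 4.33, matching the table entries.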

9.3.3.1 Third Standardized Moment √b₁

The √b₁ test can be applied for all sample sizes n ≥ 5.

9.3.3.1.1 Monte Carlo points for n = 5 to 35

D'Agostino and Tietjen (1973) presented simulation probability points applicable for n = 5 to 35, valid for two-sided tests (i.e., H₁: nonnormality with √β₁ ≠ 0) for levels of significance α = 0.002, 0.01, 0.02, 0.05, 0.10, and 0.20, and for one-sided tests (i.e., H₁: √β₁ > 0 or H₁: √β₁ < 0) for levels of

TABLE 9.2 Probability Points of √b₁ for n = 5 to 35 (Monte Carlo Points)

Two-sided significance levels

        0.20    0.10    0.05    0.02    0.01    0.002

One-sided significance levels

  n     0.10    0.05    0.025   0.01    0.005   0.001

5 0.819 1.058 1.212 1.342 1.396 1.466


6 0.805 1.034 1.238 1.415 1.498 1.642
7 0.787 1.008 1.215 1.432 1.576 1.800
8 0.760 0.991 1.202 1.455 1.601 1.873
9 0.752 0.977 1.189 1.408 1.577 1.866
10 0.722 0.950 1.157 1.397 1.565 1.887
11 0.715 0.929 1.129 1.376 1.540 1.924

13 0.688 0.902 1.099 1.312 1.441 1.783


15 0.648 0.862 1.048 1.275 1.462 1.778
17 0.629 0.820 1.009 1.188 1.358 1.705
20 0.593 0.777 0.951 1.152 1.303 1.614
23 0.562 0.743 0.900 1.119 1.276 1.555
25 0.543 0.714 0.876 1.073 1.218 1.468
30 0.510 0.664 0.804 0.985 1.114 1.410
35 0.474 0.624 0.762 0.932 1.043 1.332

Taken from D'Agostino and Tietjen (1973) with permission of the Biometrika Trustees.
significance α = 0.001, 0.005, 0.01, 0.025, 0.05, and 0.10. These points are given here in Table 9.2. Mulholland (1977) gives good approximations for n = 4 to 25.

9.3.3.1.2 S_U approximation

D'Agostino (1970) further showed that the null distribution of √b₁ can be well approximated by a Johnson S_U curve. The approximation is given as follows:

(1) Compute √b₁ from the sample data.

(2) Compute

Y = √b₁ [(n + 1)(n + 3)/(6(n - 2))]^(1/2)   (9.20)

β₂ = 3(n² + 27n - 70)(n + 1)(n + 3)/[(n - 2)(n + 5)(n + 7)(n + 9)]   (9.21)

W² = -1 + {2(β₂ - 1)}^(1/2)   (9.22)

δ = 1/√(log W)   (9.23)

α = {2/(W² - 1)}^(1/2)   (9.24)

(3) Compute

Z = δ log[Y/α + {(Y/α)² + 1}^(1/2)]   (9.25)

Z of (9.25) is approximately a standard normal variable with mean zero and variance unity. Once Z of (9.25) is computed, rejection or acceptance is decided by reference to any table of the standard normal distribution (such as given in the Appendix). This transformation is applicable for any sample size n ≥ 8. Further, with it both one-sided and two-sided tests with any desired levels of significance can be performed. For example, for a two-sided test with a 0.05 level of significance reject H₀ if |Z| > 1.96.
Table 9.3 contains critical values of √b₁ computed from this S_U approximation for n ≥ 36.
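Steps (1)-(3) translate directly into code (a minimal Python sketch under our own naming; pass the sample value of √b₁ and the sample size n):

```python
import math

def skewness_z_su(root_b1, n):
    """Z statistic for the sqrt(b1) test via the Johnson S_U approximation,
    eqs. (9.20)-(9.25); intended for n >= 8. Refer Z to a standard
    normal table."""
    y = root_b1 * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))          # (9.20)
    beta2 = (3.0 * (n ** 2 + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2.0) * (n + 5) * (n + 7) * (n + 9)))                 # (9.21)
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))                            # (9.22)
    delta = 1.0 / math.sqrt(math.log(math.sqrt(w2)))                      # (9.23)
    alpha = math.sqrt(2.0 / (w2 - 1.0))                                   # (9.24)
    return delta * math.log(y / alpha + math.sqrt((y / alpha) ** 2 + 1))  # (9.25)
```

As a check against Table 9.3, the tabled two-sided 0.05 point for n = 100 is 0.470, and skewness_z_su(0.470, 100) ≈ 1.96.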

9.3.3.1.3 t approximation

D'Agostino and Tietjen (1973) investigated a t approximation to the null distribution of √b₁. It also requires n ≥ 8 and appears to be as good as the S_U approximation. It is given as follows:
Compute

T = √b₁ [ν/(ν - 2)]^(1/2)/σ(√b₁)   (9.26)

where
TABLE 9.3 Probability Points of √b₁ for n ≥ 36
(S_U Approximation Points)

Two-sided significance levels

        0.20    0.10    0.05    0.02    0.01

One-sided significance levels

  n     0.10    0.05    0.025   0.01    0.005

36 0.469 0.614 0.747 0.912 1.032


37 0.464 0.607 0.738 0.901 1.019
38 0.459 0.600 0.730 0.891 1.007
39 0.454 0.594 0.722 0.881 0.996
40 0.449 0.588 0.714 0.871 0.985

41 0.445 0.581 0.707 0.861 0.974


42 0.440 0.576 0.699 0.852 0.963
43 0.436 0.570 0.692 0.843 0.953
44 0.432 0.564 0.685 0.835 0.943
45 0.428 0.559 0.678 0.826 0.934

46 0.424 0.553 0.672 0.818 0.924


47 0.420 0.548 0.666 0.810 0.915
48 0.416 0.543 0.659 0.803 0.906
49 0.412 0.538 0.653 0.795 0.898
50 0.409 0.534 0.648 0.788 0.889

51 0.405 0.529 0.642 0.781 0.881


52 0.402 0.525 0.636 0.774 0.873
53 0.399 0.520 0.631 0.767 0.865
54 0.395 0.516 0.626 0.760 0.858
55 0.392 0.512 0.620 0.754 0.850

56 0.389 0.508 0.615 0.748 0.843


57 0.386 0.504 0.610 0.742 0.836
58 0.383 0.500 0.606 0.736 0.829
59 0.380 0.496 0.601 0.730 0.822
60 0.378 0.492 0.596 0.724 0.816

61 0.375 0.489 0.592 0.718 0.809


62 0.372 0.485 0.588 0.713 0.803
63 0.370 0.482 0.583 0.708 0.797
64 0.367 0.478 0.579 0.702 0.791
65 0.365 0.475 0.575 0.697 0.785

66 0.362 0.472 0.571 0.692 0.779


67 0.360 0.468 0.567 0.687 0.774
68 0.357 0.465 0.563 0.683 0.768
69 0.355 0.462 0.559 0.678 0.763
70 0.353 0.459 0.556 0.673 0.758

TABLE 9.3 (continued)

T w o-sided significance levels

0.20 0.10 0.05 0.02 0.01

One-sided significance levels

n 0.10 0.05 0.025 0.01 0.005

71 0.351 0.456 0.552 0.669 0.752


72 0.348 0.453 0.548 0.664 0.747
73 0.346 0.451 0.545 0.660 0.742
74 0.344 0.448 0.541 0.656 0.737
75 0.342 0.445 0.538 0.651 0.733

76 0.340 0.442 0.535 0.647 0.728


77 0.338 0.440 0.532 0.643 0.723
78 0.336 0.437 0.528 0.639 0.719
79 0.334 0.435 0.525 0.635 0.714
80 0.332 0.432 0.522 0.632 0.710

81 0.330 0.430 0.519 0.628 0.706


82 0.329 0.427 0.516 0.624 0.701
83 0.327 0.425 0.513 0.621 0.697
84 0.325 0.422 0.510 0.617 0.693
85 0.323 0.420 0.507 0.613 0.689

86 0.322 0.418 0.505 0.610 0.685


87 0.320 0.416 0.502 0.607 0.681
88 0.318 0.413 0.499 0.603 0.677
89 0.317 0.411 0.497 0.600 0.674
90 0.315 0.409 0.494 0.597 0.670

91 0.313 0.407 0.491 0.594 0.666


92 0.312 0.405 0.489 0.590 0.663
93 0.310 0.403 0.486 0.587 0.659
94 0.309 0.401 0.484 0.584 0.656
95 0.307 0.399 0.481 0.581 0.652

96 0.306 0.397 0.479 0.578 0.649


97 0.304 0.395 0.477 0.575 0.646
98 0.303 0.393 0.474 0.573 0.642
99 0.302 0.391 0.472 0.570 0.639
100 0.300 0.390 0.470 0.567 0.636

102 0.297 0.386 0.465 0.562 0.630


104 0.295 0.383 0.461 0.556 0.624
106 0.292 0.379 0.457 0.551 0.618
108 0.290 0.376 0.453 0.546 0.612
110 0.287 0.373 0.449 0.541 0.607

(continued)
TABLE 9.3 (continued)

Tw o-sided significance levels

0.20 0.10 0.05 0.02 0.01

One-sided significance levels

n 0.10 0.05 0.025 0.01 0.005

112 0.285 0.369 0.445 0.536 0.601


114 0.283 0.366 0.441 0.532 0.596
116 0.280 0.363 0.438 0.527 0.591
118 0.278 0.360 0.434 0.523 0.586
120 0.276 0.358 0.431 0.519 0.581

122 0.274 0.355 0.427 0.514 0.576


124 0.272 0.352 0.424 0.510 0.571
126 0.270 0.349 0.421 0.506 0.567
128 0.268 0.347 0.417 0.502 0.562
130 0.266 0.344 0.414 0.499 0.558

132 0.264 0.342 0.411 0.495 0.554


134 0.262 0.339 0.408 0.491 0.550
136 0.260 0.337 0.405 0.488 0.546
138 0.258 0.335 0.403 0.484 0.542
140 0.257 0.332 0.400 0.481 0.538

142 0.255 0.330 0.397 0.477 0.534


144 0.253 0.328 0.394 0.474 0.530
146 0.252 0.326 0.392 0.471 0.526
148 0.250 0.324 0.389 0.468 0.523
150 0.249 0.322 0.387 0.465 0.519

155 0.245 0.317 0.381 0.457 0.511


160 0.241 0.312 0.375 0.450 0.503
165 0.238 0.307 0.369 0.443 0.495
170 0.234 0.303 0.364 0.437 0.488
175 0.231 0.299 0.359 0.430 0.481

180 0.228 0.295 0.354 0.425 0.474


185 0.225 0.291 0.349 0.419 0.467
190 0.222 0.287 0.345 0.413 0.461
195 0.219 0.284 0.340 0.408 0.455
200 0.217 0.280 0.336 0.403 0.449

210 0.212 0.274 0.328 0.393 0.439


220 0.207 0.267 0.321 0.384 0.428
230 0.203 0.262 0.314 0.376 0.419
240 0.199 0.256 0.307 0.368 0.410
250 0.195 0.251 0.301 0.361 0.402


TABLE 9.3 (continued)

Tw o-sided significance levels

0.20 0.10 0.05 0.02 0.01

One-sided significance levels

n 0.10 0.05 0.025 0.01 0.005

275 0.186 0.240 0.287 0.344 0.383


300 0.178 0.230 0.275 0.329 0.366
325 0.172 0.221 0.265 0.316 0.352
350 0.165 0.213 0.255 0.305 0.339
375 0.160 0.206 0.247 0.294 0.327

400 0.155 0.200 0.239 0.285 0.317


425 0.151 0.194 0.232 0.277 0.307
450 0.146 0.188 0.225 0.269 0.299
475 0.143 0.184 0.219 0.262 0.291
500 0.139 0.179 0.214 0.255 0.283

550 0.133 0.171 0.204 0.243 0.270


600 0.127 0.164 0.195 0.233 0.258
650 0.122 0.157 0.188 0.224 0.248
700 0.118 0.152 0.181 0.216 0.239
750 0.114 0.146 0.175 0.208 0.231

800 0.110 0.142 0.169 0.202 0.224


850 0.107 0.138 0.164 0.196 0.217
900 0.104 0.134 0.160 0.190 0.211
950 0.101 0.130 0.155 0.185 0.205
1000 0.099 0.127 0.152 0.180 0.200

1200 0.090 0.116 0.138 0.165 0.182


1400 0.084 0.107 0.128 0.152 0.169
1600 0.078 0.101 0.120 0.143 0.158
1800 0.074 0.095 0.113 0.134 0.149
2000 0.070 0.090 0.107 0.127 0.141

2500 0.063 0.080 0.096 0.114 0.126


3000 0.057 0.073 0.088 0.104 0.115
3500 0.053 0.068 0.081 0.096 0.107
4000 0.050 0.064 0.076 0.090 0.100
4500 0.047 0.060 0.072 0.085 0.094

5000 0.044 0.057 0.068 0.081 0.089


10000 0.031 0.040 0.048 0.057 0.063

ν = (4β₂ - 6)/(β₂ - 3)   (9.27)

where β₂ is given by (9.21), and

σ(√b₁) = [6(n - 2)/((n + 1)(n + 3))]^(1/2)   (9.28)

Under the null hypothesis T of (9.26) is approximately a t variable with ν, given by (9.27), degrees of freedom. Interpolation in standard t tables appears adequate for judging significance of test results.

9.3.3.1.4 Normal approximation

The normal approximation given by

Z = √b₁ [(n + 1)(n + 3)/(6(n - 2))]^(1/2)   (9.29)

appears to be valid for n ≥ 150.

E 9.3.3.1.5 Numerical examples of √b₁ test. Table 9.4 contains the first ten observations from five data sets given in the Appendix. These are the uniform (UNI), two Johnson Unbounded distributions (SU(0,2) and SU(1,2)), the negative exponential (EXP), and the normal (NOR). The population values of √β₁ and β₂ are included in the table. The √b₁ statistic has been computed for all five data sets. Employing a two-sided test, the 0.05 critical value of √b₁ obtained from Table 9.2 is 1.157. Rejection of the null hypothesis only occurs for the negative exponential data set (EXP).
While we anticipate computation of √b₁ will mainly be done via a computer, a computational formula may still be useful. One such formula is

√b₁ = √n Σ(X - X̄)³/[Σ(X - X̄)²]^(3/2)
    = [n² ΣX³ - 3n ΣX ΣX² + 2(ΣX)³]/[n ΣX² - (ΣX)²]^(3/2)   (9.30)
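The raw-sum form of (9.30) can be sketched in Python (the function name is ours); it avoids forming the deviations X - X̄ explicitly:

```python
def sqrt_b1_raw_sums(x):
    """Computational formula (9.30): sqrt(b1) from the raw power sums
    of the data."""
    n = len(x)
    s1 = sum(x)                       # sum of X
    s2 = sum(v * v for v in x)        # sum of X**2
    s3 = sum(v ** 3 for v in x)       # sum of X**3
    num = n * n * s3 - 3.0 * n * s1 * s2 + 2.0 * s1 ** 3
    return num / (n * s2 - s1 * s1) ** 1.5
```

Both forms of (9.30) are algebraically identical; for the EXP data of Table 9.4 this reproduces sqrt(b1) ≈ 1.49.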

9.3.3.1.6 Recommendations for use of √b₁

The S_U approximation given by (9.20) to (9.25) is adequate for n ≥ 8. For computerization of the √b₁ test, we recommend its use. For n = 5, 6, and 7, Table 9.2 must be used. For table look-ups we recommend Table 9.2 for n = 5 to 35 and Table 9.3 for n ≥ 36. Also, for n ≥ 150 the simple normal approximation of (9.29) can be used.

TABLE 9.4 Numerical Examples for Moment and Related Tests†

                Johnson      Johnson      Negative        Normal:
                Unbounded    Unbounded    Exponential,    μ = 100,
     Uniform    (0,2)        (1,2)        Mean 5          σ = 10

     UNI        SU(0,2)      SU(1,2)      EXP             NOR

      8.10       0.10        -0.41         8.15           92.55
      2.06      -0.31        -0.91         4.69           96.20
      1.60      -0.09        -0.63         2.17           84.27
      8.87      -0.58        -1.25         0.37           90.87
Data  9.90       1.15         0.50        16.69          101.58
      6.58       0.17        -0.34         0.06          106.82
      8.68      -1.39        -2.46         6.48           98.70
      7.31      -0.14        -0.68         2.63          113.75
      2.85      -0.31        -0.97         0.44           98.98
      6.09       0.68         0.13         0.89          100.42

√β₁   0          0           -0.87         2               0
β₂    1.80       4.51         5.59         9               3

X̄     6.204     -0.072       -0.702        4.257          98.414
S     3.009      0.688        0.807        5.169           8.277
√b₁  -0.49      -0.06        -0.72         1.49ᵃ           0.16
b₂    1.75       3.11         3.58         4.33ᵇ           2.76
a     0.86       0.74         0.73         0.77            0.76
U     2.76       3.69         3.67         3.22            3.56

†Data sets are the first ten observations for the UNI, SU(0,2), SU(1,2), EXP, and NOR data sets of the Appendix.
Note that
  a = Geary's statistic = (Σ|X - X̄|/n)/√m₂
  U = David et al. (1954) statistic = (sample range)/S
  S = [n/(n - 1)]^(1/2) m₂^(1/2)
ᵃReject null hypothesis of normality at 0.02 level of significance.
ᵇReject null hypothesis of normality at 0.10 level of significance.

TABLE 9.5 Probability Points of b₂ for n = 7 to 200

Sample size n                         Percentiles

Part 1 (n = 7 to 20)

      1     2     2.5   5     10    20    80    90    95    97.5  98    99

 7   1.25  1.30  1.34  1.41  1.53  1.70  2.78  3.20  3.55  3.85  3.93  4.23
 8   1.31  1.37  1.40  1.46  1.58  1.75  2.84  3.31  3.70  4.09  4.20  4.53
 9   1.35  1.42  1.45  1.53  1.63  1.80  2.98  3.43  3.86  4.28  4.41  4.82
10   1.39  1.45  1.49  1.56  1.68  1.85  3.01  3.53  3.95  4.40  4.55  5.00
12   1.46  1.52  1.56  1.64  1.76  1.93  3.06  3.55  4.05  4.56  4.73  5.20
15   1.55  1.61  1.64  1.72  1.84  2.01  3.13  3.62  4.13  4.66  4.85  5.30
20   1.64  1.71  1.73  1.83  1.95  2.12  3.20  3.68  4.18  4.68  4.87  5.38

Part 2 (n = 20 to 100)

      0.5   1     2.5   5     10    15    20    80    85    90    95    97.5  99    99.5

20   1.58  1.64  1.73  1.83  1.95  2.04  2.12  3.20  3.40  3.68  4.18  4.68  5.38  5.91
25   1.66  1.72  1.82  1.92  2.03  2.12  2.20  3.24  3.43  3.69  4.15  4.63  5.29  5.81
30   1.73  1.79  1.89  1.98  2.10  2.19  2.26  3.26  3.44  3.69  4.12  4.57  5.20  5.69
35   1.78  1.84  1.94  2.03  2.15  2.24  2.31  3.28  3.45  3.68  4.09  4.51  5.12  5.58

40   1.83  1.89  1.99  2.07  2.19  2.28  2.35  3.29  3.45  3.66  4.06  4.46  5.04  5.48
45   1.87  1.93  2.03  2.11  2.23  2.31  2.38  3.29  3.44  3.65  4.02  4.41  4.96  5.38
50   1.91  1.96  2.06  2.15  2.26  2.34  2.41  3.29  3.44  3.63  4.00  4.36  4.88  5.28
55   1.94  2.00  2.09  2.18  2.29  2.37  2.44  3.29  3.43  3.62  3.97  4.32  4.81  5.19
60 1.97 2.03 2.12 2.21 2.32 2.39 2.46 3.29 3.43 3.60 3.94 4.28 4.75 5.11
65 2.00 2.05 2.15 2.23 2.34 2.41 2.48 3.28 3.42 3.59 3.91 4.24 4.69 5.03
70 2.02 2.07 2.17 2.25 2.36 2.43 2.50 3.28 3.41 3.58 3.89 4.20 4.64 4.97
75 2.05 2.10 2.19 2.27 2.38 2.45 2.51 3.28 3.41 3.57 3.87 4.17 4.59 4.90

80 2.07 2.12 2.21 2.29 2.39 2.46 2.53 3.27 3.40 3.56 3.85 4.14 4.54 4.84
85 2.08 2.14 2.22 2.31 2.41 2.48 2.54 3.27 3.39 3.55 3.83 4.11 4.50 4.79
90 2.10 2.16 2.24 2.32 2.43 2.49 2.55 3.27 3.39 3.54 3.81 4.08 4.46 4.74
95 2.11 2.17 2.26 2.34 2.44 2.50 2.56 3.27 3.38 3.53 3.80 4.05 4.43 4.70
100 2.13 2.19 2.27 2.35 2.45 2.52 2.57 3.26 3.37 3.52 3.78 4.03 4.39 4.66

P art 3 (n = 100 to 200)

100 2.13 2.19 2.27 2.35 2.45 2.52 2.57 3.26 3.37 3.52 3.78 4.03 4.39 4.66
110 2.15 2.22 2.30 2.37 2.47 2.53 2.59 3.26 3.37 3.51 3.75 3.99 4.32 4.58
120 2.18 2.24 2.32 2.39 2.49 2.55 2.61 3.25 3.35 3.49 3.72 3.95 4.26 4.52

130 2.20 2.26 2.34 2.41 2.51 2.57 2.63 3.25 3.34 3.47 3.70 3.92 4.21 4.46
140 2.22 2.28 2.36 2.43 2.52 2.58 2.64 3.25 3.33 3.46 3.67 3.89 4.17 4.41
150 2.24 2.30 2.37 2.45 2.54 2.60 2.65 3.24 3.33 3.45 3.65 3.86 4.13 4.36

160 2.26 2.32 2.39 2.46 2.55 2.61 2.66 3.24 3.32 3.44 3.63 3.83 4.09 4.31
170 2.28 2.33 2.40 2.48 2.56 2.62 2.67 3.23 3.32 3.43 3.62 3.81 4.06 4.27
180 2.29 2.35 2.41 2.49 2.57 2.63 2.68 3.23 3.31 3.42 3.60 3.79 4.03 4.23

190 2.31 2.36 2.43 2.50 2.58 2.64 2.69 3.22 3.30 3.41 3.58 3.77 4.00 4.19
200 2.32 2.37 2.44 2.51 2.59 2.65 2.70 3.22 3.30 3.40 3.57 3.75 3.98 4.16

Adapted from D'Agostino and Tietjen (1971) and D'Agostino and Pearson (1973), with permission of the Biometrika Trustees.

386 D’AGOSTINO

FIGURE 9.2a Empirical cumulative distribution of b2 (P < 0.55).

FIGURE 9.2b Empirical cumulative distribution of b2 (0.55 < P < 0.95).


TESTS FOR THE NORMAL DISTRIBUTION 387

FIGURE 9.2c Empirical cumulative distribution of b2 (P > 0.975).



Finally, either a one sided or a two sided test can be used. If the direction of the skewness is anticipated (i.e., √β1 > 0 or √β1 < 0), a one sided test should be used.

9.3.3.2 Fourth Standardized Moment b2

9.3.3.2.1 Monte Carlo points for n = 7 to 200

D'Agostino and Tietjen (1971) presented simulation probability points applicable for n = 7 to 50. Later D'Agostino and Pearson (1973) extended these results to n = 200. Table 9.5 and Figure 9.2 contain probability points valid for n = 7 to 200 and curves of the probability distributions (empirical probability integral) for n = 20 to 200, respectively.

9.3.3.2.2 Anscombe and Glynn approximation

Anscombe and Glynn (1983) showed that the results of Table 9.5 and Figure 9.2 for n ≥ 20 can be adequately approximated, when the first three moments of the distribution of b2 have been determined, by fitting a linear function of the reciprocal of a variable and then using the Wilson-Hilferty transformation. Their approximation is computed as follows:

(1) Compute b2 from the sample data.


(2) Compute the mean and variance of b2:

E(b2) = 3(n − 1)/(n + 1)   (9.31)

and

var(b2) = 24n(n − 2)(n − 3) / [(n + 1)²(n + 3)(n + 5)]   (9.32)

(3) Compute the standardized value of b2:

x = (b2 − E(b2)) / √var(b2)   (9.33)

(4) Compute the third standardized moment of b2:

√β1(b2) = [6(n² − 5n + 2) / ((n + 7)(n + 9))] √[6(n + 3)(n + 5) / (n(n − 2)(n − 3))]   (9.34)

(5) Compute

A = 6 + (8/√β1(b2)) [2/√β1(b2) + √(1 + 4/β1(b2))]   (9.35)

(6) Compute

Z = { (1 − 2/(9A)) − [ (1 − 2/A) / (1 + x√(2/(A − 4))) ]^(1/3) } / √(2/(9A))   (9.36)

Z of (9.36) is approximately a standard normal variable with mean zero and variance unity.
The approximation given by (9.31) to (9.36) can be used to test directly null hypotheses concerning b2 for two sided or one sided alternatives. For example, for testing at level of significance 0.05

H0 : Normality

versus the one sided composite alternative

H1 : Nonnormality with β2 > 3

one would reject H0 if Z of (9.36) exceeded 1.645. For

H1 : Nonnormality with β2 < 3

one would reject H0 if Z of (9.36) were smaller than −1.645.
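Steps (1) through (6) are straightforward to program. The following is a minimal sketch of (9.31) to (9.36), taking b2 and n directly; the function name is ours:

```python
import math

def anscombe_glynn_z(b2, n):
    """Approximate standard normal deviate for b2, following (9.31)-(9.36)."""
    mean = 3.0 * (n - 1) / (n + 1)                                            # (9.31)
    var = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))   # (9.32)
    x = (b2 - mean) / math.sqrt(var)                                          # (9.33)
    sb1 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
           * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))    # (9.34)
    a = 6.0 + (8.0 / sb1) * (2.0 / sb1 + math.sqrt(1.0 + 4.0 / sb1 ** 2))     # (9.35)
    z = ((1.0 - 2.0 / (9.0 * a))
         - ((1.0 - 2.0 / a) / (1.0 + x * math.sqrt(2.0 / (a - 4.0)))) ** (1.0 / 3.0))
    return z / math.sqrt(2.0 / (9.0 * a))                                     # (9.36)
```

For n = 100 the null mean of b2 is 3(99)/101 ≈ 2.94; samples with b2 well above (below) this give large positive (negative) Z.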

9.3.3.2.3 Normal approximation

The normal approximation given by

(b2 − E(b2)) / √var(b2)   (9.37)

is valid only for extremely large n values (i.e., well over 1000). It should not be used.

9.3.3.2.4 Bowman and Shenton's Su approximation

Appendix 2 of Chapter 7 gives details of a computer program for finding an Su approximation to b2. This can be used to yield tests for n > 40.

E 9.3.3.2.5 Numerical examples of b2 test

Table 9.4 contains numerical examples for the five data sets already used for √b1. As before, only the EXP data lead to rejection with a two tailed test. Here the level of significance is 0.10.

9.3.3.2.6 Recommendation for use of b2

Table 9.5 or Figure 9.2 can be used for 7 ≤ n ≤ 200. The Anscombe and Glynn or Bowman and Shenton approximations can be used for n > 20. Both require computation; the former requires less, giving an explicit solution.
Again, as with √b1, when knowledge is available concerning the alternative (i.e., β2 > 3 or β2 < 3) a one sided test should be used.

9.3.3.3 Omnibus Tests Based on Moments

The √b1 test is excellent for detecting nonnormality due to skewness (√β1 ≠ 0). The b2 test is primarily directed at detecting nonnormality due to nonnormal kurtosis or nonnormal tail thickness (β2 ≠ 3). A number of investigators have worked on combining these tests to produce an omnibus test of normality.

9.3.3.3.1 The R-test

The simplest omnibus test consists of performing the √b1 test at level α1 and the b2 test at level α2 and rejecting normality if either test leads to rejection. The overall level of significance α for these two tests combined would then be, by Bonferroni's inequality,

α ≤ α1 + α2   (9.38)

Pearson, D'Agostino, and Bowman (1977) showed that if α1 = α2 = 2α*, a good approximation to the overall level of significance is

α = 4(α* − (α*)²)   (9.39)

(9.39) would hold exactly if √b1 and b2 were independent. They are uncorrelated but not independent, and use of (9.39) to determine the overall level of significance produces a conservative test. Tables of corrected values are given in Pearson et al. (1977) for n = 20, 50, and α = 0.05 and 0.10.
In order to use this test one can determine α* from

2α* = 1 − (1 − α)^(1/2)   (9.40)

where α is the desired overall level. Note that 2α* is the level of the individual tests.
The term R test was given to the above omnibus procedure because it
can be viewed as employing rectangular coordinates for rejection of normality.
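For example, to run the R test at an overall level α = 0.05, (9.40) gives the level at which each individual test is run, and (9.39) recovers the approximate overall level. A sketch (function names ours):

```python
import math

def individual_level(alpha):
    """Level 2*alpha_star for each of the sqrt(b1) and b2 tests, from (9.40)."""
    return 1.0 - math.sqrt(1.0 - alpha)

def overall_level(alpha_star):
    """Approximate overall level 4(alpha_star - alpha_star^2) of (9.39)."""
    return 4.0 * (alpha_star - alpha_star ** 2)

two_alpha_star = individual_level(0.05)  # run each component test at this level
```

With α = 0.05 each test is run at level 2α* ≈ 0.0253; note that (9.39) then returns 0.05 exactly, since 4(α* − (α*)²) = 1 − (1 − 2α*)².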

9.3.3.3.2 D'Agostino-Pearson chi square test


D'Agostino and Pearson (1973) suggested the statistic

K² = X²(√b1) + X²(b2)   (9.41)

as an omnibus test, where X(√b1) and X(b2) are standardized normal equivalent deviates; K² can be viewed as a chi square variable with two degrees of freedom. This test was developed assuming √b1 and b2 were independent. They are not. However, as Bowman and Shenton point out in Chapter 7, they are uncorrelated and nearly independent. So K² is approximately a chi square variable with two degrees of freedom. For n ≥ 100 the chi square distribution approximation presents no problem.
The test statistic of (9.41) is trivial to employ given the above material. X(√b1), the normal deviate for √b1, can be found using the Su approximation of (9.20) to (9.25), and X(b2), the normal deviate for b2, can be found using the Anscombe and Glynn approximation of (9.31) to (9.36).
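Given the two normal equivalent deviates, K² and its p-value are immediate; for two degrees of freedom the chi square upper tail is exp(−K²/2). A sketch (names ours):

```python
import math

def k_squared(z_sqrt_b1, z_b2):
    """K^2 of (9.41) from the two normal equivalent deviates."""
    return z_sqrt_b1 ** 2 + z_b2 ** 2

def k_squared_p_value(k2):
    """Upper tail of a chi square with 2 df: P(chi2_2 > k2) = exp(-k2/2)."""
    return math.exp(-k2 / 2.0)
```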

9.3.3.3.3 Bowman-Shenton chi square test

In Chapter 7 Bowman and Shenton review their Ks² test, which has the same format as (9.41) except that their approximation to X(b2) involves use of Johnson Su curves (see also Bowman and Shenton (1975)). They also present in Figure 7.1 contours which allow for exact level of significance tests at α = 0.05 and 0.10 for the Ks² test. These contours are for n = 25 to 1000.

9.3.3.3.4 Other omnibus tests

Bowman and Shenton (1975) also suggested

(√b1)²/σ1² + (b2 − 3)²/σ2²   (9.42)

where σ1² = 6/n and σ2² = 24/n. Asymptotically (9.42) would be distributed as a chi square variable with two degrees of freedom if the null hypothesis of normality were true. Due to the slow convergence of b2 to normality this test is not useful.
Cox and Hinkley (1974) suggested

max (|√b1|/σ1, |b2 − 3|/σ2)   (9.43)

Another possibility is

−2 log P(√b1) − 2 log P(b2)   (9.44)

where P(√b1) and P(b2) are the probability integral transforms of √b1 and b2. This statistic would be approximately a chi square variable with four degrees of freedom.
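As a sketch (names ours), (9.44) and its reference chi square with four degrees of freedom, whose upper tail has the closed form exp(−x/2)(1 + x/2), can be computed as:

```python
import math

def combined_log_statistic(p_sqrt_b1, p_b2):
    """Statistic (9.44): -2 log P(sqrt(b1)) - 2 log P(b2)."""
    return -2.0 * math.log(p_sqrt_b1) - 2.0 * math.log(p_b2)

def chi2_4df_p_value(x):
    """Upper tail of a chi square with 4 df: exp(-x/2) * (1 + x/2)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)
```

For instance, two component p-values of 0.05 combine to a statistic of about 11.98 and an overall p-value of about 0.017.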

9.3.3.3.5 Recommendations for use of omnibus tests

Figure 7.1 gives the omnibus test, Ks², for n = 25 to 1000 and α = 0.05 and 0.10. It is preferred to other approximations. For other situations there is little to choose between K² of (9.41) and Ks² of Chapter 7. (The reader is referred to Chapter 7 for numerical examples.) Both the K² and Ks² tests can be programmed easily. See Section 9.3.3.3.2 for K².

The R test is not as powerful as K² or Ks² but requires in many cases only trivial interpolations in Tables 9.2, 9.3, and 9.5. As a quick test it has much to recommend it.

9.3.3.4 Related Tests

9.3.3.4.1 Geary's tests

A number of tests are related to moment tests and are of historical interest. Most noticeable are Geary's tests involving the ratio of the mean deviation to the standard deviation,

w = Σ|X − X̄| / √(n Σ(X − X̄)²)   (9.45)

and

a = (Σ|X − X̄| / n) / √m2   (9.46)

where m2 = Σ(X − X̄)²/n.
(see Geary (1935)). Tables of a are published in Pearson and Hartley (1972). D'Agostino (1970) showed that

√n (a − 0.7979) / 0.2123   (9.47)

can be considered as a standard normal variable with mean zero and variance unity for n > 41. Recently Gastwirth and Owen (1977) discussed optimal features of a.
Geary (1947), in one of the most distinguished papers on tests of normality, considered tests of the form

a(c) = Σ|X − X̄|^c / (n m2^(c/2))   for c ≥ 1   (9.48)

Note a(1) = a of (9.46), and a(4) = b2. Geary discussed optimal properties of b2 in this framework. Table 9.4 contains numerical examples of Geary's a test.
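Geary's a of (9.46) and the standardized value (9.47) can be sketched as follows (function names ours):

```python
import math

def geary_a(x):
    """Geary's a of (9.46): mean deviation over the square root of m2."""
    n = len(x)
    xbar = sum(x) / n
    mean_dev = sum(abs(v - xbar) for v in x) / n
    m2 = sum((v - xbar) ** 2 for v in x) / n
    return mean_dev / math.sqrt(m2)

def geary_z(x):
    """Approximate standard normal deviate (9.47); intended for n > 41."""
    return math.sqrt(len(x)) * (geary_a(x) - 0.7979) / 0.2123
```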

9.3.3.4.2 Sample range test

David, Hartley, and Pearson (1954) presented a test defined as

u = (X(n) − X(1)) / S   (9.49)

that is, the ratio of the sample range to the sample standard deviation. Probability points of u are given in Pearson and Hartley (1972). Table 9.4 contains numerical examples of this u test.
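The u statistic of (9.49) is equally simple; the sketch below (names ours) assumes S is computed with the usual n − 1 divisor:

```python
import math

def range_test_u(x):
    """u of (9.49): sample range over S, with S^2 = sum((X - Xbar)^2)/(n - 1)."""
    n = len(x)
    xbar = sum(x) / n
    s = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
    return (max(x) - min(x)) / s
```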

9.3.4 Regression Tests

Clearly, the recent interest in tests of normality is due mainly to the exciting work of S. S. Shapiro and M. B. Wilk (1965). Their test for normality and the tests that have resulted as modifications and extensions of it are called regression and correlation tests (see Chapter 5). These terms are used in that these tests can be viewed as originally arising from considering a linear model

X(i) = μ + σ EZ(i) + εi   (9.50)

and estimating, in particular, the parameter σ by a regression technique. In (9.50), X(i) is the ith order statistic from the observed sample of size n, EZ(i) is the expected value of the ith order statistic from a sample of size n drawn from the standard normal distribution (mean zero and variance unity), and εi is a random error term. Sections 5.7, 5.9, and 5.10 contain a discussion of these tests. It is suggested that the reader read these sections in addition to the following.

9.3.4.1 Shapiro-Wilk Test: An Omnibus Test

In (9.50) the best linear unbiased estimate of σ from the Gauss-Markov theorem is

σ̂ = c'V⁻¹X / (c'V⁻¹c)   (9.51)

where c is the vector of expected values of the n order statistics from the standard normal distribution and V is the covariance matrix of the εi in (9.50). (Note σ̂ is a regression estimate.) The Shapiro-Wilk W statistic is basically the ratio of σ̂² to S², the sample variance. In particular,

W = (Σ ai X(i))² / ((n − 1)S²) = (Σ ai X(i))² / Σ(X − X̄)²   (9.52)

where

a' = c'V⁻¹ / (c'V⁻²c)^(1/2)   (9.53)

The ai are optimal weights for the weighted least squares esti-


TABLE 9.6 Numerical Examples for Shapiro-Wilk W Test and D'Agostino D Test†

Shapiro-Wilk

                 NOR Data                EXP Data
    ai         X(i)     aiX(i)       X(i)     aiX(i)

 -0.5739      84.27    -48.363       0.06     -0.034
 -0.3291      90.87    -29.905       0.37     -0.122
 -0.2141      92.55    -19.815       0.44     -0.094
 -0.1224      96.20    -11.775       0.89     -0.109
 -0.0399      98.70     -3.938       2.17     -0.087
  0.0399      98.98      3.949       2.63      0.105
  0.1224     100.42     12.291       4.69      0.574
  0.2141     101.58     21.748       6.48      1.387
  0.3291     106.82     35.154       8.15      2.682
  0.5739     113.75     65.281      16.69      9.578

Σ aiX(i)               24.627                 13.880
Σ(X − X̄)²            616.554                240.498

W = (Σ aiX(i))² / Σ(X − X̄)²  =  0.984        0.801‡

D'Agostino

                 NOR Data                EXP Data
 i − (n+1)/2   X(i)    product       X(i)    product

   -4.5       84.27   -379.215       0.06     -0.270
   -3.5       90.87   -318.045       0.37     -1.295
   -2.5       92.55   -231.375       0.44     -1.100
   -1.5       96.20   -144.300       0.89     -1.335
   -0.5       98.70    -49.350       2.17     -1.085
    0.5       98.98     49.490       2.63      1.315
    1.5      100.42    150.630       4.69      7.035
    2.5      101.58    253.950       6.48     16.200
    3.5      106.82    373.870       8.15     28.525
    4.5      113.75    511.875      16.69     75.105

T = Σ (i − (n+1)/2) X(i)   217.53            123.095
Σ(X − X̄)²                616.554            240.498

D = T / (n √(n Σ(X − X̄)²))  =  0.27703      0.25101

†Data sets are the first ten observations of the NOR and EXP data sets of the Appendix.
‡For the Shapiro-Wilk W test, reject the null hypothesis at the 0.02 level of significance if W < 0.806. So reject for the negative exponential at the 0.02 level.
For the D test, reject the null hypothesis at the 0.05 level of significance if the observed D < 0.2513, the lower tail critical value of D (see text). So reject at the 0.05 level.

mator of σ given that the population is normally distributed. W can also be viewed as the R² (square of the correlation coefficient) obtained from a normal probability plot (see Section 2.4), and thus the notion of a correlation test.
The ai values for n = 3 to 50 were given by Shapiro and Wilk (1965) and are presented in this book in Table 5.4. Because W is similar to an R² value, large values (i.e., values close to one) indicate normality and small values indicate nonnormality. Thus values in the lower tail of the null distribution of W are used for rejection. Table 5.5 gives the critical values of W for n = 3 to 50.

E 9.3.4.1.1 Numerical example of W test

Table 9.6 contains two numerical examples of the Shapiro-Wilk test. The first ten observations of the NOR data and the EXP data are used. The EXP data lead to rejection at the 0.02 level of significance. The weights ai come from Table 5.4. Chapter 5 contains other numerical examples.
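The W computation in Table 9.6 can be reproduced directly; the following sketch uses the tabled weights ai and the NOR observations (variable names are ours):

```python
# Weights a_i for n = 10 as listed in Table 9.6 (originally from Table 5.4),
# and the first ten NOR observations, already ordered.
a = [-0.5739, -0.3291, -0.2141, -0.1224, -0.0399,
      0.0399,  0.1224,  0.2141,  0.3291,  0.5739]
x = [84.27, 90.87, 92.55, 96.20, 98.70,
     98.98, 100.42, 101.58, 106.82, 113.75]

numerator = sum(ai * xi for ai, xi in zip(a, x)) ** 2   # (sum a_i X_(i))^2
xbar = sum(x) / len(x)
ss = sum((v - xbar) ** 2 for v in x)                    # sum (X - Xbar)^2
w = numerator / ss                                      # W of (9.52), about 0.984
```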

9.3.4.2 D'Agostino's D Test

The W test requires a different set of ai weights for each sample size n. A modification was presented by D'Agostino (1971) which does not require any tables of weights. It is given as follows:

D = T / (n² √m2)   (9.54)

where

T = Σ (i − (n + 1)/2) X(i)   (i = 1, ..., n)   (9.55)

The statistic D is equal, up to a constant, to the ratio of Downton's (1966) linear estimator of the standard deviation to the sample standard deviation. The expected value of D is approximately 1/(2√π) = 0.28209479 and the standard deviation of D is asymptotically

0.02998598 / √n   (9.56)

An approximate standardized variable is thus

Y = √n (D − 0.28209479) / 0.02998598   (9.57)

TABLE 9.7 Probability Points of D'Agostino's D Test for n = 10 to 2000 (Y statistic of (9.57))

Percentiles

n     0.5    1.0    2.5    5      10     90     95     97.5   99     99.5

10   -4.66  -4.06  -3.25  -2.62  -1.99  0.149  0.235  0.299  0.356  0.385
12   -4.63  -4.02  -3.20  -2.58  -1.94  0.237  0.329  0.381  0.440  0.479
14   -4.57  -3.97  -3.16  -2.53  -1.90  0.308  0.399  0.460  0.515  0.555
16   -4.52  -3.92  -3.12  -2.50  -1.87  0.367  0.459  0.526  0.587  0.613
18   -4.47  -3.87  -3.08  -2.47  -1.85  0.417  0.515  0.574  0.636  0.667
20   -4.41  -3.83  -3.04  -2.44  -1.82  0.460  0.565  0.628  0.690  0.720

22   -4.36  -3.78  -3.01  -2.41  -1.81  0.497  0.609  0.677  0.744  0.775
24   -4.32  -3.75  -2.98  -2.39  -1.79  0.530  0.648  0.720  0.783  0.822
26   -4.27  -3.71  -2.96  -2.37  -1.77  0.559  0.682  0.760  0.827  0.867
28   -4.23  -3.68  -2.93  -2.35  -1.76  0.586  0.714  0.797  0.868  0.910
30   -4.19  -3.64  -2.91  -2.33  -1.75  0.610  0.743  0.830  0.906  0.941

32   -4.16  -3.61  -2.88  -2.32  -1.73  0.631  0.770  0.862  0.942  0.983
34   -4.12  -3.59  -2.86  -2.30  -1.72  0.651  0.794  0.891  0.975  1.02
36   -4.09  -3.56  -2.85  -2.29  -1.71  0.669  0.816  0.917  1.00   1.05
38   -4.06  -3.54  -2.83  -2.28  -1.70  0.686  0.837  0.941  1.03   1.08
40   -4.03  -3.51  -2.81  -2.26  -1.70  0.702  0.857  0.964  1.06   1.11

42   -4.00  -3.49  -2.80  -2.25  -1.69  0.716  0.875  0.986  1.09   1.14
44   -3.98  -3.47  -2.78  -2.24  -1.68  0.730  0.892  1.01   1.11   1.17
46   -3.95  -3.45  -2.77  -2.23  -1.67  0.742  0.908  1.02   1.13   1.19
48   -3.93  -3.43  -2.75  -2.22  -1.67  0.754  0.923  1.04   1.15   1.22
50   -3.91  -3.41  -2.74  -2.21  -1.66  0.765  0.937  1.06   1.18   1.24
60   -3.81  -3.34  -2.68  -2.17  -1.64  0.812  0.997  1.13   1.26   1.34
70   -3.73  -3.27  -2.64  -2.14  -1.61  0.849  1.05   1.19   1.33   1.42
80   -3.67  -3.22  -2.60  -2.11  -1.59  0.878  1.08   1.24   1.39   1.48
90   -3.61  -3.17  -2.57  -2.09  -1.58  0.902  1.12   1.28   1.44   1.54
100  -3.57  -3.14  -2.54  -2.07  -1.57  0.923  1.14   1.31   1.48   1.59

150  -3.409 -3.009 -2.452 -2.004 -1.520 0.990  1.233  1.423  1.623  1.746
200  -3.302 -2.922 -2.391 -1.960 -1.491 1.032  1.290  1.496  1.715  1.853
250  -3.227 -2.861 -2.348 -1.926 -1.471 1.060  1.328  1.545  1.779  1.927
300  -3.172 -2.816 -2.316 -1.906 -1.456 1.080  1.357  1.578  1.826  1.983
350  -3.129 -2.781 -2.291 -1.888 -1.444 1.096  1.379  1.610  1.863  2.026

400  -3.094 -2.753 -2.270 -1.873 -1.434 1.108  1.396  1.633  1.893  2.061
450  -3.064 -2.729 -2.253 -1.861 -1.426 1.119  1.411  1.652  1.918  2.090
500  -3.040 -2.709 -2.239 -1.850 -1.419 1.127  1.423  1.668  1.938  2.114
550  -3.019 -2.691 -2.226 -1.841 -1.413 1.135  1.434  1.682  1.957  2.136
600  -3.000 -2.676 -2.215 -1.833 -1.408 1.141  1.443  1.694  1.972  2.154
650  -2.984 -2.663 -2.206 -1.826 -1.403 1.147  1.451  1.704  1.986  2.171
700  -2.969 -2.651 -2.197 -1.820 -1.399 1.152  1.458  1.714  1.999  2.185
750  -2.956 -2.640 -2.189 -1.814 -1.395 1.157  1.465  1.722  2.010  2.199
800  -2.944 -2.630 -2.182 -1.809 -1.392 1.161  1.471  1.730  2.020  2.211
850  -2.933 -2.621 -2.176 -1.804 -1.389 1.165  1.476  1.737  2.029  2.221

900  -2.923 -2.613 -2.170 -1.800 -1.386 1.168  1.481  1.743  2.037  2.231
950  -2.914 -2.605 -2.164 -1.796 -1.383 1.171  1.485  1.749  2.045  2.241
1000 -2.906 -2.599 -2.159 -1.792 -1.381 1.174  1.489  1.754  2.052  2.249
1500 -2.845 -2.549 -2.123 -1.765 -1.363 1.194  1.519  1.793  2.103  2.309
2000 -2.807 -2.515 -2.101 -1.750 -1.353 1.207  1.536  1.815  2.132  2.342

Adapted from D'Agostino (1971) and (1972) with permission of the Biometrika Trustees.

If the null hypothesis of normality is false, Y will tend to differ from zero. Simulation studies by D'Agostino (1971) indicated that for alternative distributions with kurtosis less than the normal (β2 < 3), Y tends to be greater than zero. For alternative distributions with β2 > 3, Y tends to be less than zero. So in order to guard against all possibilities a two sided test needs to be employed. This procedure produces an omnibus test. The statistic can also be used for a one sided test for directional alternatives (i.e., either β2 < 3 or β2 > 3).
D ’Agostino (1971) gave a table of percentile points fo r Y based on
C orn lsh -F lsh er e:q>ansions for n = 50 to 1000. D ’Agostino (1972) later gave
improved points fo r n = 50 to 100 based on Pearson curves and extensive
simulations. Table 9.7 contains probability points for n = 10 to 2000 based
on these and other w ork.
F o r n > 1000 a C o m ish -F ish e r езфапе!оп should be adequate to obtain
critical values of D . The expansion using the first four cumulants is as
fo llo w s. If Dp and Zp are the IOOP percentile points (0 < P < I) of D and
the standard normal distribution, respectively, then the C o m ish -F ish er
e 3q>ansion fo r Dp in term s of Zp is

D_P = E(D) + √var(D) V_P   (9.58)

where

V_P = Z_P + γ1(Z_P² − 1)/6 + γ2(Z_P³ − 3Z_P)/24 − γ1²(2Z_P³ − 5Z_P)/36   (9.59)

Here

E(D) = [1/(2√π)] √((n − 1)/n) [1 + 1/(4(n − 1)) + 1/(32(n − 1)²) − 5/(128(n − 1)³)]   (9.60)

√var(D) = 0.02998598/√n   (9.61)

γ1 = −8.5836542/√n   (9.62)

and

γ2 = 114.732/n   (9.63)

Note that with the Cornish-Fisher expansion of (9.58) to (9.63) there is no need to transform to Y of (9.57).
Finally, because the range of D is small, it should be calculated to five decimal places.
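As a quick consistency check of (9.58) to (9.63), the sketch below (names ours) computes the 95th percentile of D for n = 2000 and converts it to the Y scale of (9.57); the result is close to the tabled value 1.536 in Table 9.7.

```python
import math

def d_percentile(z_p, n):
    """Cornish-Fisher percentile D_P of (9.58)-(9.63); z_p is Z_P."""
    g1 = -8.5836542 / math.sqrt(n)                        # (9.62)
    g2 = 114.732 / n                                      # (9.63)
    v = (z_p + g1 * (z_p ** 2 - 1) / 6.0
         + g2 * (z_p ** 3 - 3 * z_p) / 24.0
         - g1 ** 2 * (2 * z_p ** 3 - 5 * z_p) / 36.0)     # (9.59)
    mean = (math.sqrt((n - 1) / n) / (2.0 * math.sqrt(math.pi))
            * (1 + 1 / (4.0 * (n - 1))
               + 1 / (32.0 * (n - 1) ** 2)
               - 5 / (128.0 * (n - 1) ** 3)))             # (9.60)
    return mean + (0.02998598 / math.sqrt(n)) * v         # (9.58) with (9.61)

d95 = d_percentile(1.6449, 2000)                          # Z_0.95 = 1.6449
y95 = math.sqrt(2000) * (d95 - 0.28209479) / 0.02998598   # on the Y scale of (9.57)
```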

E 9.3.4.2.1 Numerical examples of D'Agostino's test

Table 9.6 contains two numerical examples of the D'Agostino D test. As with the Shapiro-Wilk test, the first ten observations of the NOR and EXP data sets are used. With n = 10 and a level of significance of 0.05, one rejects normality if Y < −3.25 or Y > 0.299. Using

D = 0.28209479 + 0.02998598 Y/√n

one rejects if D < 0.2513 or D > 0.2849. The NOR data set does not lead to rejection at the 0.05 level. EXP does.
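The D entry of Table 9.6 can be reproduced from (9.54), (9.55), and (9.57); the sketch below (names ours) assumes the sample is already ordered:

```python
import math

def dagostino_d(x):
    """D of (9.54) and (9.55); x must be the ordered sample."""
    n = len(x)
    t = sum((i - (n + 1) / 2.0) * xi for i, xi in enumerate(x, start=1))  # (9.55)
    xbar = sum(x) / n
    m2 = sum((v - xbar) ** 2 for v in x) / n
    return t / (n ** 2 * math.sqrt(m2))                                   # (9.54)

def dagostino_y(x):
    """Approximate standardized Y of (9.57)."""
    n = len(x)
    return math.sqrt(n) * (dagostino_d(x) - 0.28209479) / 0.02998598

nor = [84.27, 90.87, 92.55, 96.20, 98.70,
       98.98, 100.42, 101.58, 106.82, 113.75]   # first ten NOR observations
```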

9.3.4.3 Shapiro-Francia's W' Test

Shapiro and Francia (1972) addressed the problem of the a weights of the Shapiro-Wilk test by noting that for large samples the ordered observations may be treated as if they were independent. With this, the a weights of (9.53) can be replaced by

b' = c' / (c'c)^(1/2)   (9.64)

and the W statistic of (9.52) can be replaced by

W' = (b'X)² / Σ(X − X̄)² = (Σ bi X(i))² / Σ(X − X̄)²   (9.65)

Recall from Section 9.3.4.1 that c is the vector of the expected values of the n order statistics from the standard normal distribution. Values of c are readily available (Harter, 1961) for n up to 400.
Shapiro and Francia (1972) supplied the weights bi for W' and critical values for n = 35, 50, 51(2)99. Pearson, D'Agostino, and Bowman (1977) noted that these critical values were calculated via simulations from only 1000 samples. They reevaluated percentage points for n = 99, 100, and 125 based on 50,000 simulations. A comparison indicated that the Shapiro-Francia values in the lower tail were higher than they should be. This would result in producing actual levels of significance larger than indicated by the Shapiro-Francia tables. Further, it would indicate in power studies that the test was more powerful than it actually is.

9.3.4.4 Weisberg-Bingham's Test and Asymptotic Extensions

Weisberg and Bingham (1975) suggested replacing the b of (9.64) with

d = c̃ / (c̃'c̃)^(1/2)   (9.66)

where the elements of the vector c̃ are

c̃i = Φ⁻¹[(i − 3/8)/(n + 1/4)]   for i = 1, ..., n   (9.67)

and Φ⁻¹(p) is the inverse of the standard normal cumulative distribution function. The resulting statistic is denoted by W̃ and is given as

W̃ = (c̃'X)² / (c̃'c̃ Σ(X − X̄)²) = (Σ di X(i))² / Σ(X − X̄)²   (9.68)

The approximation c̃ to c was suggested by Blom (1958), and its use in (9.68) results in the null distribution of W̃ being very close to that of W', at least for n = 5, 20, 35, where the authors made a comparison. They suggested using W̃ in place of W'. This removes the need to have tables of weights for computation of the test statistic. With use of W̃ they suggested use of the critical values of W'.
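A sketch of the W̃ computation, with Φ⁻¹ supplied by Python's `statistics.NormalDist`; the function name and data layout are ours, and the data are the first ten NOR and EXP observations used earlier:

```python
from statistics import NormalDist

def w_tilde(x):
    """Weisberg-Bingham W-tilde of (9.68) for an ordered sample x."""
    n = len(x)
    inv = NormalDist().inv_cdf
    c = [inv((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]  # Blom scores (9.67)
    num = sum(ci * xi for ci, xi in zip(c, x)) ** 2
    cc = sum(ci ** 2 for ci in c)
    xbar = sum(x) / n
    ss = sum((v - xbar) ** 2 for v in x)
    return num / (cc * ss)

nor_data = [84.27, 90.87, 92.55, 96.20, 98.70,
            98.98, 100.42, 101.58, 106.82, 113.75]
exp_data = [0.06, 0.37, 0.44, 0.89, 2.17,
            2.63, 4.69, 6.48, 8.15, 16.69]
```

As with W, values near one suggest normality; the EXP sample gives a visibly smaller W̃ than the NOR sample.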
Stephens in Chapter 5 of this book discusses use of

Z(X, c̃) = n(1 − W̃)   (9.69)

and supplies a table of critical values for n ≤ 1000 (see Section 5.7.3 and Table 5.2, where the statistic (9.69) is written as Z(X, m)). Royston (1982) gave an extension of the Shapiro-Wilk test similar to (9.69) for n ≤ 2000. He presented a statistic of the form

(1 − W)^λ   (9.70)

where λ is a function of sample size.

9.3.4.5 Other Extensions/Modifications of the Shapiro-Wilk Test

A number of other investigators have considered extending and modifying the Shapiro-Wilk test. Most noticeable are the works of Filliben (1975) and La Brecque (1977). Filliben's test is exactly the correlation coefficient between the ordered observations X(i) and the order statistic medians Mi from the standard normal distribution. It can be viewed in the context of the last section (9.3.4.4), where the weights a of the W statistic are replaced with functions of medians of the order statistics. Filliben gives weights and significance levels for n ≤ 100.

La Brecque (1977) extended the W test by augmenting (9.50) to detect nonlinearity in normal probability plots. The reader is referred to Section 5.10.1 for a further comment on this test.
Also of interest here is the work of Puri and Rao (1976). They wrote the expected value of the ith order statistic as

E(X(i)) = γ1 + γ2(ci − λ) + γ3(ci² − μci) + ···   (9.71)

where λ and μ were selected so as to provide orthogonal polynomials. When the underlying distribution is normal, γ1 = μ, γ2 = σ, γ3 = γ4 = ··· = 0. The Shapiro-Wilk test is basically a test of γ2 = σ. Puri and Rao investigated whether a better test could be developed by incorporating γ3 and γ4 into the test. They ultimately concluded that tests using jointly W and a skewness test would be more efficient for testing normality versus skewed distributions (i.e., √β1 ≠ 0) and that a test using jointly W and a kurtosis type test would be more efficient for testing normality versus distributions with nonnormal kurtosis (i.e., β2 ≠ 3).
More on other extensions of the W test is given in Chapter 5.

9.3.5 Miscellaneous Tests

There is a plethora of other tests of normality, too numerous to mention in detail. Some selected ones we now discuss briefly.

9.3.5.1 Locke and Spurrier's U-Statistic Test

Locke and Spurrier (1976, 1977) used the theory of U-statistics to develop tests of normality. They showed that both the √b1 test and D'Agostino's D test can be generated from this theory. They also developed other new tests.

9.3.5.2 The Gap Test

Andrews, Gnanadesikan, and Warner (1971, 1972) developed a gap test for normality. Gaps gi are defined as

gi = (X(i+1) − X(i)) / (c(i+1) − c(i))   (9.72)

where c(i) is the expected value of the ith order statistic from the standard normal distribution. If the null hypothesis of normality is true, the gi of (9.72) are independent exponential variables. Specific types of deviations from normality reflect themselves in deviations from exponentiality of the gi. Andrews et al. gave an omnibus test for normality that is distributed under the null hypothesis approximately as a chi square variable with two degrees of freedom.
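A sketch of the gaps (9.72). Exact expected order statistics c(i) would come from tables such as Harter (1961); as an assumption of ours, the sketch substitutes Blom's approximation (9.67):

```python
from statistics import NormalDist

def gaps(x):
    """Standardized gaps g_i of (9.72) for an ordered sample x."""
    n = len(x)
    inv = NormalDist().inv_cdf
    # Blom approximation to the expected normal order statistics c_(i)
    c = [inv((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return [(x[i + 1] - x[i]) / (c[i + 1] - c[i]) for i in range(n - 1)]
```

For a sample of size n this yields n − 1 nonnegative gaps, which would then be examined for departures from exponentiality.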

9.3.5.3 Likelihood Ratio Tests/Specific Alternatives

Dumonceaux, Antle, and Haas (1973) looked at the theory of likelihood ratio tests to develop tests of normality versus some specific alternative distributions (Cauchy, exponential, and double exponential). Critical values of the tests are given. For the double exponential the likelihood ratio test is similar to Geary's a of Section 9.3.3.4.1. Hogg (1972) presented the family of distributions of the form

f(x; θ) = exp(−|x|^θ) / (2Γ(θ⁻¹ + 1)),   −∞ < x < ∞   (9.73)

For θ = 2, x is normal; for θ = 1, x is double exponential; and as θ → ∞, x tends to the uniform distribution. For testing θ = θ1 versus θ = θ2, Geary's a test is shown to have optimal properties for θ1 = 1 versus θ2 = 2, and for testing θ1 = 2 versus θ2 = 4, the b2 test has optimal properties.

9.3.5.4 Tiku's Tests

Tiku (1974) presented tests of normality based on the ratio of estimates of σ from trimmed samples to the sample standard deviation. These tests are suitable for specific alternative hypotheses stated in terms of the skewness of the alternative. Percentage points are given.

9.3.5.5 Spiegelhalter's Combination of Test Statistics

Spiegelhalter (1977) used the theory of most powerful location and scale invariant tests to develop tests of normality against the uniform and the double exponential distributions. He then suggested the sum of these two as a combined test statistic. A Bayesian argument was presented to justify the combination. Asymptotically the two components are equal to Geary's a test of (9.46) and the David, Hartley, and Pearson u test of (9.49). Critical points were given in the article for n ≤ 100 for levels of significance 0.05 and 0.10.

9.3.5.6 Other Tests

Other tests of interest are the tests based on the independence of the sample mean and standard deviation (Lin and Mudholkar, 1980), the test based on the empirical characteristic function (Hall and Welsh, 1983), and the squeeze test (Burch and Parsons, 1976). The term squeeze was derived from the method used to perform the test, whereby data points plotted on the appropriate probability paper are squeezed between parallel rules.

9.4 COMPARISONS OF TESTS

9.4.1 Power Studies

There are a large number of tests for judging normality or departures from normality. There is no one test that is optimal for all possible deviations from normality. The procedure usually adopted to investigate the sensitivity of these tests is to perform power studies in which the tests are applied to a wide range of nonnormal populations for a variety of sample sizes. A number of such studies have been undertaken. The major ones, in order of completeness and importance, are Pearson, D'Agostino, and Bowman (1977), Shapiro, Wilk, and Chen (1968), Saniga and Miles (1979), Stephens (1974), D'Agostino (1971), Filliben (1975), and D'Agostino and Rosman (1974). Other useful, but less major, studies are Dyer (1974), Prescott (1976), Prescott (1978), Tiku (1974), and Locke and Spurrier (1976).
Presentation of the results of the above power studies produces a number of difficulties. In order to be most informative we will first present results indexed by skewness (√β1) and kurtosis (β2) and then indexed by specific tests. The former comparisons are mainly from Pearson et al. (1977). The latter are summarized from all of the above articles.

9.4.1.1 Power Results for Skewed Alternatives (√β1 ≠ 0)

The Shapiro-Wilk test (Section 9.3.4.1) and the Shapiro-Francia extensions (Sections 9.3.4.3 and 9.3.4.4) are very sensitive omnibus tests against skewed alternatives (H1 : Nonnormality with √β1 ≠ 0). For many skewed alternatives they are clearly the most powerful. When we have prior grounds for believing that, if the population is not normal, it will be positively skewed (√β1 > 0), directional tests are very powerful. Directional tests refer here to the √b1 test (Section 9.3.3.1) based on the upper tail of its distribution and the R test (Section 9.3.3.3.1) employing a one-sided √b1 test and a two-sided b2 test. For negatively skewed alternatives (√β1 < 0) the lower tail of √b1 should be employed.

9.4.1.2 Power Results for Symmetric Distributions with Nonnormal Kurtosis (β2 ≠ 3)

9.4.1.2.1 Platykurtic alternatives (β2 < 3)

When the omnibus tests are applied without directional knowledge of the alternative distribution, there is very little to choose between the powers of K² (Section 9.3.3.3.2) and the R test (Section 9.3.3.3.1) when applied to platykurtic populations. The Shapiro-Wilk W test is on the whole more powerful than these. The D'Agostino D test (Section 9.3.4.2) does not fit consistently into the comparison. In general there is usually some other test more powerful than it.
When the direction of β2 is known (β2 < 3), the lower tail b2 test is more powerful than the K², R, W, and D tests.

9.4.1.2.2 Leptokurtic alternatives (β2 > 3)

The powers of the omnibus tests are broadly in the following order of descending power: K², R, D, and W. For very long tailed populations (e.g., β2 > 36), D'Agostino's D test is best.
When √β1 = 0 and β2 > 3 is the known direction of the alternative, there is no clear preponderance of the upper tail b2 test over the lower tail D'Agostino's D test. However, these tests are more powerful than the omnibus tests K², R, D, and W.

9 .4 .1 .3 Pow er Results fo r Specific Tests

1. The Shapiro-W ilk W test and the Shapiro Francia extension are very
sensitive omnibus tests. F o r many skewed populations they are clearly the
most powerful. When = O and /?2 > 3 a number o f other tests are more
powerful.
2. N/bj and b 2 have excellent sensitivity over a wide range of alternative
distributions which deviate from normality with respect to skewness and
kurtosis, respectively. A s n gets large the N/bj test has no power fo r symmetric
alternatives. When directional information is available ( e . g . , sjß i > O o r
ß 2 < 3) appropriate one sided versions of these tests are very powerful. In
most cases studied in the literature they are usually most powerful.
3. The D'Agostino-Pearson K² of Section 9.3.3.3.2 or, because of its equivalency to K², the Bowman-Shenton Ks² of Section 9.3.3.3.3 are sensitive to a wide range of nonnormal populations. They can be considered omnibus tests. For skewed alternatives the Shapiro-Wilk W test is usually more powerful. Also for symmetric alternatives with β2 < 3 the W test is often more powerful. For symmetric alternatives with β2 > 3, b2 is often most powerful.
4. The R test of Section 9.3.3.3.1 is also an omnibus test. Its power usually does not exceed that of K².
5. The most powerful EDF test appears to be the Anderson-Darling A² (Section 9.3.2.2). It is at times presented as being similar in power to the Shapiro-Wilk W test. However, it has not been studied as extensively as either the moments tests or the regression tests. More power studies are required to compare it more fully to the W, K², R, and D'Agostino's D tests.
6. While D'Agostino's D test is an omnibus test, it has best power for distributions with β2 > 3. Other tests are better than it for skewed alternatives.
7. Geary's a test (Section 9.3.3.4.1) has good power for symmetric alternatives with β2 > 3. However, b2 is usually better. For skewed alternatives W is generally superior.
8. The Kolmogorov-Smirnov test has poor power in comparison to the many tests described in detail in this chapter.
9. The chi-square test is in general not a powerful test of normality.
TESTS FOR THE NORMAL DISTRIBUTION 405

9.4.2 Effects of Ties Due to Grouping

Results of power studies are not the only means for judging or comparing the normality tests. In practice the data may often involve ties, either because available figures have been rounded for grouping purposes or because measurements cannot be carried out beyond a certain degree of accuracy. It may not be desirable to reject the null hypothesis just because the data contain these ties or are grouped. Use of the data as if the underlying population were normal may not present any problems if the true population is approximately normal and the resulting data contain ties. (Research is needed on this point.) Pearson et al. (1977) investigated the effect which ties and the grouping of data have on four tests of normality, √b1, D, W, and W*.
For judging the effect what matters is the ratio, say f, of the standard deviation of the distribution to the rounding interval, i.e., the interval between the nearest possible readings or observations left after rounding. Pearson et al. (1977) considered the effect of grouping on √b1 and W for f = 3, 5, and 10 and n = 20 and 50, and for D and W* for f = 3, 5, 8, and 10 and n = 100. The present author also considered √b1 for n = 100, D for n = 20 and 50, and b2 for n = 20, 50, and 100.
The effect of grouping on √b1 and b2 was not significant. That is, grouping did not produce differences between the actual and the declared or nominal level of significance. The effect on D was to make the test slightly conservative. That is, the actual level of significance was slightly smaller than the declared or nominal level. For the W test the effect was significant for f = 3 and 5. Here the actual level of significance significantly exceeded the nominal level. For f = 10, the effect was minimal. The statistic W* was extremely unsatisfactory. Usually the actual level of significance exceeded the nominal level by substantial amounts (e.g., 30%), even for f = 10.
The above results suggest that W or its derivatives as given in Sections 9.3.4.4 and 9.3.4.5 and Chapter 5 must be used with caution if there are multiple ties.
Pearson et al. (1977) did not consider the effects of ties on the EDF tests. Until the effects are investigated they should be used with caution on data containing ties.

9.5 RECOMMENDATIONS

Attempting to make final recommendations is an unwelcome and near impossible task involving the imposition of personal judgments. Still a set of well justified recommendations can be made at this time. We make the following.
1. A detailed graphical analysis involving normal probability plotting should always accompany a formal test of normality. Section 2.4 gives the necessary steps for this. It is not clear that standard statistical software packages give useful probability plots. However, Chapter 2 explains in detail

how to employ the computer for a good graphical analysis. A detailed examination of the probability plot should be undertaken.
2. The omnibus tests, the Shapiro-Wilk W test and its extensions (Sections 9.3.4.1, 9.3.4.3, and 9.3.4.4), the D'Agostino-Pearson K² test or the Bowman-Shenton version Ks² (Sections 9.3.3.3.2 and 9.3.3.3.3), and the Anderson-Darling edf test A² (Section 9.3.2.2) appear to be the best omnibus tests available. The Shapiro-Wilk type tests are probably overall most powerful. However, due to the problem with ties (Section 9.4.2) and the fact that it gives as a by-product no numerical indications of the nonnormality, the test based jointly on the very informative √b1 and b2 statistics may be preferred by many. The √b1 and b2 statistics can be very useful for indicating the type of nonnormality. Also they can be useful for judging if nonnormality will affect any inferences to be made with the data (e.g., if a t test is to be applied to the data or a prediction is to be made).
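To make the computation of these two statistics concrete, here is a minimal sketch (not from the text) of √b1 and b2 from their moment definitions; the data and function name are illustrative only.

```python
import numpy as np

def sqrt_b1_and_b2(x):
    """Sample skewness sqrt(b1) = m3 / m2**1.5 and kurtosis b2 = m4 / m2**2,
    where m_k is the k-th central sample moment."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2, m3, m4 = (np.mean(d ** k) for k in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2

rng = np.random.default_rng(0)
sb1, b2 = sqrt_b1_and_b2(rng.normal(size=1000))
# For normal data sqrt(b1) is near 0 and b2 near 3.
```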
3. The R test (Section 9.3.3.3.1) and D'Agostino's D test (Section 9.3.4.2) can be used as omnibus tests. They are convenient and easy to use. The tests of point 2 are probably more powerful.
4. If the direction of the alternative to normality is known (e.g., √β1 > 0 or β2 > 3), then the directional versions of the √b1, b2, and D'Agostino D tests should be used.
5. For testing for normality, the Kolmogorov-Smirnov test is only a historical curiosity. It should never be used. It has poor power in comparison to the above procedures.
6. For testing for normality, when a complete sample is available the chi-square test should not be used. It does not have good power when compared to the above tests.

9.6 TESTS OF NORMALITY ON RESIDUALS

Chapters 4 and 5 discussed the application of edf and regression tests on residuals. Much of that discussion involved tests of normality. The reader is referred to those chapters. Anscombe and Glynn (1983) also discussed the application of b2 on residuals. These attempts only represent the beginnings. There is much research that needs to be done.
By residuals we mean the following. A mathematical model of the form

Y = f(β, X) + ε    (9.74)

is under investigation. X represents a vector of variables, β a vector of unknown coefficients to be estimated, Y represents the dependent variable, and ε represents a random error. Based on a sample of size n, the β are estimated by β̂ and the residual for each observation is defined as

ε̂i = Yi − Ŷi    (9.75)

where

Ŷi = f(β̂, Xi)    (9.76)

for i = 1, . . . , n. Intuitively, if the sample employed to estimate the β consists of a large number of observations in comparison to the dimension of the β, then the tests of normality given in Section 9.3 above applied directly to the residuals of (9.75) should be approximately correct, even if no adjustments are made for the statistical dependencies among the residuals and the unequal variances that usually exist with residuals. Unfortunately the correct implementation of this intuition needs substantial work. We now discuss two attempts at its implementation.

9.6.1 Residuals from a Linear Regression

White and MacDonald (1980) considered the linear regression model

Y = Xβ + ε    (9.77)

and the residuals

ε̂ = Y − Ŷ    (9.78)

Here n independent observations are drawn and the resulting design matrix is of full rank. The vector β is of dimension k.
White and MacDonald showed that under general conditions the √b1, b2, D'Agostino's D, and Shapiro-Francia W' tests computed on the residuals ε̂1, ε̂2, . . . , ε̂n have asymptotically the same distributions as if computed on n independent identically distributed normal errors ε in (9.77). Further, for n = 20, 35, 50, and 100 they performed simulations to judge the effects of using the residuals on the null distributions of the test statistics √b1, b2, D, R, W, and W'. They also examined how the statistics behaved for nonnormal errors ε in model (9.77). For all this simulation the dimension k of the vector β was 4.
In general they showed that for the cases examined computation of the tests on the residuals did not invalidate them. Overall, D'Agostino's D test produced the best agreement between the test based on the correlated residuals of (9.78) and the one on the independent errors of (9.77); √b1 and b2 exhibited the next best behavior, followed by W' and W.
Weisberg (1980) in a comment on the White and MacDonald (1980) article emphasized that their results were limited and that n, the sample size, k, the dimension of β, and the design matrix can all influence the validity of the tests of normality on residuals from a linear regression analysis. He demonstrated this with examples for n = 20 using the W test. Unfortunately n = 20 and the W test comprised the weakest combination in the White and MacDonald work. It would have been better had he examined the R or D tests.

Appropriate practical advice here seems to be that n should be large (say n > 50) and k reasonably small (e.g., 5 or less) before the formal significance levels of the tests on residuals can be taken as appropriate. Further work is needed to clarify the real impact of varying k, n, and the design matrix.
In applying the results of White and MacDonald it should be emphasized that the computations of the test statistics and the corresponding table lookups to determine statistical significance are carried out directly on the n residuals as if they constituted an independent sample of size n. No adjustments are made for k. Also note that, because the mean of the residuals is zero, the sample mean need not be explicitly computed for √b1 and b2.
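As a sketch of this recipe under the stated advice (n large, k small), the following illustrative Python fragment fits a hypothetical regression by least squares and applies the √b1 and b2 tests (scipy's skewtest and kurtosistest) directly to the n residuals, with no adjustment for k; the data are simulated, not from the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k = 100, 4                       # n large, k small, per the advice above
X = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(k - 1)])
beta = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta + rng.normal(size=n)   # normal errors, as under the null

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat            # treated as an i.i.d. sample of size n

# Univariate moment tests applied directly; no adjustment for k.
_, p_skew = stats.skewtest(resid)      # test based on sqrt(b1)
_, p_kurt = stats.kurtosistest(resid)  # test based on b2
```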

9.6.2 Residuals from an Autoregressive Model

Alexa Beiser (1985) in an unpublished work has considered first order autoregressive models of the form

Yt = μ + ρ(Yt−1 − μ) + εt    (9.79)

where the εt are independent normal variables. She has shown that √b1 and b2 computed on the residuals produce valid levels of significance for n ≥ 50 and ρ ≤ 0.9. In fact, this procedure appeared to be more appropriate than computing √b1 and b2 on the Yt directly, incorporating adjustments for the dependencies of the observations.
In her work the ρ of (9.79) is estimated as

ρ̂ = [ Σ from t=2 to n of (Yt − Ȳ)(Yt−1 − Ȳ) ] / [ Σ from t=1 to n of (Yt − Ȳ)² ]

where

Ȳ = (1/n) Σ from t=1 to n of Yt

The statistics √b1 and b2 are computed on the n − 1 residuals

ε̂t = (Yt − Ȳ) − ρ̂(Yt−1 − Ȳ)    (9.80)

for t = 2, . . . , n. Note the tests are employed as if a sample of n − 1 is available, not n.
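Beiser's procedure can be sketched as follows (simulated data; scipy's skewtest and kurtosistest stand in for the √b1 and b2 tests, and all names are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, mu, rho = 200, 5.0, 0.5
y = np.empty(n)
y[0] = mu
for t in range(1, n):               # simulate model (9.79) with normal errors
    y[t] = mu + rho * (y[t - 1] - mu) + rng.normal()

d = y - y.mean()
rho_hat = np.sum(d[1:] * d[:-1]) / np.sum(d ** 2)   # estimator of rho above
resid = d[1:] - rho_hat * d[:-1]                    # the n - 1 residuals (9.80)

# Moment tests applied to the n - 1 residuals as an ordinary sample.
_, p_skew = stats.skewtest(resid)
_, p_kurt = stats.kurtosistest(resid)
```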

9.7 MULTIVARIATE NORMALITY

An excellent review article exists for tests of multivariate normality (Gnanadesikan, 1977). We will not attempt to discuss these tests in the detail given in that treatment. Rather this section will only be a brief overview.

9.7.1 Univariate Tests for Marginal Normality

In practice one rarely performs solely a multivariate analysis. Rather, a multivariate analysis is usually one stage in an analysis to be supplemented by univariate analyses considering each variable separately. These univariate analyses often can detect sufficiently what the multivariate analysis contains. At times they are more informative due to their specific attention to each variable. In that view, although normality of each marginal variable does not imply joint normality, the presence of many types of nonnormality is often shown in the marginal distributions. Also if there is multivariate normality then necessarily each marginal distribution will be normal. Detection of one marginal that is nonnormal indicates the multivariate distribution is nonnormal.
Given the above, it is reasonable to test each marginal distribution using the univariate tests discussed in Section 9.3 and recommended in Section 9.5. In order to guard against an inflation of the Type I error it is probably sensible to use Bonferroni's inequality for determining the overall level of significance. Thus if we had a p dimensional distribution under consideration, each marginal should be tested at the

α/p

level of significance. Here α is the desired overall level of significance. For example, if p = 5 and we desired to have an overall α = 0.05, then each marginal should be tested at the 0.05/5 = 0.01 level of significance.
In these assessments of marginal normality normal probability plotting should be employed.
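The Bonferroni step can be sketched as follows; the data are hypothetical, and scipy's normaltest (an implementation of the D'Agostino-Pearson K² statistic) stands in for whichever univariate test is chosen:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p, alpha = 200, 5, 0.05
X = rng.normal(size=(n, p))        # rows are observations, columns variables

level = alpha / p                  # Bonferroni per-marginal level: 0.05/5 = 0.01
pvals = [stats.normaltest(X[:, j]).pvalue for j in range(p)]
nonnormal = [j for j, pv in enumerate(pvals) if pv < level]
# Any marginal in `nonnormal` implies the joint distribution is nonnormal,
# with overall Type I error at most alpha.
```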

9.7.2 Generalization of Univariate Procedures

9.7.2.1 Mardia's Tests

Mardia (1970) proposed tests of multivariate normality via multivariate measures of skewness and kurtosis. These moments have the maximum effect on the distribution of Hotelling's T² under nonnormality (Mardia, 1975).
Let X1, . . . , Xn be a random sample of vectors of p components from a population with mean vector μ and covariance matrix Σ. Suppose X̄ and S denote the sample mean vector and covariance matrix, respectively. Mardia based his test on the following measures of multivariate skewness and kurtosis.

b1,p = (1/n²) Σi Σj rij³    (9.81)

and

b2,p = (1/n) Σi rii²    (9.82)

where

rij = (Xi − X̄)'S⁻¹(Xj − X̄)    (9.83)

Here X̄ is the mean vector and S the sample covariance matrix. Significance points obtained from simulations are given in Mardia (1970) and Mardia (1975). Mardia and Foster (1983) discussed omnibus multivariate tests based on b1,p and b2,p.
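A direct transcription of (9.81)-(9.83) into code might look as follows (illustrative data; the maximum likelihood covariance with divisor n is assumed):

```python
import numpy as np

def mardia(X):
    """Mardia's multivariate skewness b_{1,p} (9.81) and kurtosis b_{2,p}
    (9.82), with r_ij = (X_i - Xbar)' S^{-1} (X_j - Xbar) as in (9.83)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    D = X - X.mean(axis=0)
    S = D.T @ D / n                  # ML covariance (divisor n) assumed
    R = D @ np.linalg.inv(S) @ D.T   # matrix of the r_ij
    b1p = np.sum(R ** 3) / n ** 2
    b2p = np.sum(np.diag(R) ** 2) / n
    return b1p, b2p

rng = np.random.default_rng(4)
b1p, b2p = mardia(rng.normal(size=(500, 3)))
# Under normality b_{2,p} should be near p(p + 2), here 3 * 5 = 15.
```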

9.7.2.2 Malkovich and Afifi's Tests

Malkovich and Afifi (1973) proposed multivariate skewness and kurtosis tests using Roy's union-intersection principle. Multivariate skewness was given by them as

β1(c) = {E[(c'X − E(c'X))³]}² / {var(c'X)}³    (9.84)

and multivariate kurtosis by

β2(c) = E[(c'X − E(c'X))⁴] / {var(c'X)}²    (9.85)

for some vector c. Roy's principle could lead naturally to appropriate rejection rules. These were in the forms: reject for skewness if

max over c of {β1(c)} > k1    (9.86)

where k1 produces an α level test, and reject for kurtosis if

max over c of {β2(c)} or min over c of {β2(c)}    (9.87)

fall outside the interval (k2, k3), where these k's produce an α level test.
Machado (1983) found the asymptotic distributions of the statistics in (9.86) and (9.87) for p = 2, 3, and 4. As with the univariate case, the statistics approach their asymptotic behavior very slowly. Machado used the Johnson Su approximation suggested by D'Agostino for √b1 for (9.86) and the Anscombe and Glynn b2 approximation for (9.87) to obtain null distributions

valid for n ≥ 25 and p = 2, 3, and 4. However, he mentioned overestimation for n < 100 and underestimation for n > 100. He did not give any corrections. More details or work are needed here.
Malkovich and Afifi (1973) also suggested a multivariate generalization of the Shapiro-Wilk W test.
The tests of Malkovich and Afifi appear to require considerable computation.

9.7.3 Solely Multivariate Procedures

9.7.3.1 Directional Normality

Andrews, Gnanadesikan, and Warner (1971) defined the scaled residuals as

Zi = S^(−1/2)(Xi − X̄)    (9.88)

for i = 1, . . . , n, where S^(−1/2) is the inverse of the symmetric square root of the covariance matrix S. They also defined a normalized weighted sum of the Zi,

d̂a = Σi wi^a Zi / ||Σi wi^a Zi||    (9.89)

Here d̂a is a vector,

wi = ||Zi||    (9.90)

||Z|| denotes the Euclidean norm, or length, of the vector Z, and a is a constant to be chosen.
For a = −1, d̂a is a function only of the orientation of the Zi's, while for a = 1, d̂a becomes sensitive to the observations distant from the mean. More generally, for a > 0 the vector d̂a will tend to point toward any clustering of observations far from the mean, while for a < 0 the vector d̂a will point in the direction of any abnormal clustering near the center of gravity of the data.
The projections d̂a'Zi, for a given value of a, can be regarded as a univariate sample. Any univariate test of normality can now be employed. The value of a can be selected to be sensitive to certain types of nonnormality. Because of the data-dependence of the approach, the procedure can only be used as a guide. The formal significance levels do not apply.
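A sketch of the construction, under the assumption that S^(−1/2) is computed from the eigendecomposition of S (the data and the choice a = 1 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 2))          # hypothetical bivariate sample

Xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
vals, vecs = np.linalg.eigh(S)         # S symmetric positive definite
S_inv_half = vecs @ np.diag(vals ** -0.5) @ vecs.T
Z = (X - Xbar) @ S_inv_half            # scaled residuals, as in (9.88)

def d_hat(Z, a):
    """Normalized weighted sum (9.89) with weights w_i = ||Z_i||**a."""
    w = np.linalg.norm(Z, axis=1) ** a
    s = (w[:, None] * Z).sum(axis=0)
    return s / np.linalg.norm(s)

d1 = d_hat(Z, 1.0)                     # sensitive to points far from the mean
proj = Z @ d1                          # projections: a univariate sample
```

Any univariate normality test can then be applied to `proj`, keeping in mind that the formal significance levels do not apply.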

9.7.3.2 Radius and Angles Decompositions

Consider again the scaled residuals

Zi = S^(−1/2)(Xi − X̄)    (9.91)

for i = 1, . . . , n. Under the null hypothesis, the scaled residuals are approximately spherically symmetrically distributed. The squared radii, or squared lengths, of the Zi,

ri² = Zi'Zi = (Xi − X̄)'S⁻¹(Xi − X̄)    (9.92)

will have approximately a chi-squared distribution with p degrees of freedom. Here p is the dimension of Zi. For the bivariate case, define θi to be the angle Zi makes with a prescribed line. The θi are then approximately uniformly distributed on (0, 2π) under the null hypothesis, and the ri and θi are approximately independent. For moderate sample sizes the dependence should be negligible. Probability plots of the ri and θi can be used to evaluate multivariate normality.

9.7.3.2.1 Bivariate case

Order the n squared radii r²(1) ≤ · · · ≤ r²(n) and plot these against the corresponding expected values for the cdf from a chi-square distribution with two degrees of freedom. Similarly for θi/2π plot the ordered values against the expected values of the cdf of a uniform distribution. Both of these plots should be linear under the null hypothesis of normality.
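These two plots can be sketched as follows (illustrative data; only the plotting positions are computed, the actual plotting being left to any graphics routine):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 200
X = rng.normal(size=(n, 2))                      # hypothetical bivariate data

Xbar = X.mean(axis=0)
Sinv = np.linalg.inv(np.cov(X, rowvar=False))
D = X - Xbar
r2 = np.einsum('ij,jk,ik->i', D, Sinv, D)        # squared radii, as in (9.92)
theta = np.arctan2(D[:, 1], D[:, 0]) % (2 * np.pi)  # angle with the x-axis

# Plotting positions: ordered values against expected quantiles.
pp = (np.arange(1, n + 1) - 0.5) / n
chi2_q = stats.chi2.ppf(pp, df=2)                # abscissae for the r^2 plot
r2_sorted = np.sort(r2)
u_sorted = np.sort(theta / (2 * np.pi))          # compare with pp itself
# Both (chi2_q, r2_sorted) and (pp, u_sorted) should be nearly straight lines.
```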

9.7.3.2.2 Higher dimensional data (p > 2)

For higher dimensional data the appropriate chi-square distribution for the squared radius plot is the chi-square with p degrees of freedom. For the angles there are p − 1 plots. See Andrews et al. (1974) for details of these plots.

9.7.3.3 Other Procedures

There are still other procedures for testing multivariate normality. Some of these are:
1. The nearest distance test of Andrews et al. (1974),
2. The maximum curvature test of Cox and Small (1978), and
3. The Dahiya and Gurland (1973) generalized minimum chi-square technique applicable for bivariate normality.

9.7.4 Power of Multivariate Normality Tests

Very little has been done by way of power studies for multivariate normality tests. Malkovich and Afifi (1973) have undertaken a small study. More is needed.

9.7.5 Recommendations

It is inappropriate at this stage to give detailed recommendations. Much more research is needed. All the procedures reviewed above have merit. Personally we have found the univariate tests for marginal normality (Section 9.7.1) in conjunction with Mardia's test (Section 9.7.2.1) to be very useful. For bivariate normality the radius and angle procedure of Section 9.7.3.2.1 has good merit.

REFERENCES

This reference list contains two sets of references. The first set is a brief set of references in which the normal distribution is used as a mathematical model for real world data. The second set is the major set of references and it is on the statistical methods of tests of normality.

Applications of the Normal Distribution

D'Agostino, R. B. and Gillespie, J. C. (1978). Comments on the OSHA accuracy of measurement requirements for monitoring employee exposure to benzene. Amer. Industrial Hygiene Assoc. Jour. 39, 510-513.

Fredholm, B., Gunnarson, K., Jensen, G., and Muntzing, J. (1978). Mammary tumor inhibition and subacute toxicity in rats of prednimustine and of its molecular components chlorambucil and prednisolone. Acta Pharmacol. et Toxicol. 42, 159-163.

Griffin, M. J. and Whitham, E. M. (1978). Individual variability and its effect on subjective and biodynamic response to whole body vibration. Jour. of Sound and Vibration 58, 239-250.

Hunt, W. F. J. (1972). The precision associated with the frequency of log-normally distributed air pollutant measurements. Jour. Air Pollution Cont. Ass. 22(9), 687.

Kottegoda, N. T. and Yevjevich, V. (1977). Preservation of correlation in generated hydrologic samples through two-station models. Jour. of Hydrology 33, 99-121.

Larson, R. I. (1971). A mathematical model for relating air quality measurement to air quality standards. Publication No. AP-89, U.S. EPA, Research Triangle Park, N.C.

Rino, C. L., Livingston, R. C., and Whitney, H. E. (1976). Some new results on the statistics of radio wave scintillation: 1. Empirical evidence for gaussian statistics. Jour. of Geophysical Research 81, 2051-2064.

Richert, J., Simbel, M. H., and Weidenmuller, H. A. (1975). Statistical theory of nuclear cross section fluctuation. Z. Physik 273, 195-203.

Smith, T. J., Temple, A. R., and Reading, J. C. (1976). Cadmium, lead and copper blood levels in normal children. Clinical Toxicology 9, 75-87.

Smith, T. J., Wagner, W. L., and Moore, D. E. (1978). Distribution of SO2 for people with chronic exposure. Jour. of Occupational Medicine 2(2), 83-87.

White, H. and Olson, L. (1981). Conditional distributions of earnings, wages, and hours for blacks and whites. Jour. of Econometrics 17, 263-285.

Statistical Methods of Tests of Normality

Andrews, D. F., Gnanadesikan, R., and Warner, J. L. (1971). Transformations of multivariate data. Biometrics 27, 825-840.

Andrews, D. F., Gnanadesikan, R., and Warner, J. L. (1972). Methods for assessing multivariate normality. Unpublished memorandum.

Anscombe, F. J. and Glynn, W. J. (1983). Distribution of the kurtosis statistic b2 for normal samples. Biometrika 70, 227-234.

Beiser, A. (1985). Distributions of √b1 and b2 for autoregressive errors. Unpublished thesis, Boston University.

Blom, G. (1958). Statistical Estimates and Transformed Beta-Variables. Wiley, New York.

Bowman, K. O. and Shenton, L. R. (1975). Omnibus test contours for departures from normality based on √b1 and b2. Biometrika 62, 243-250.

Brunk, H. D. (1962). On the range of the difference between hypothetical distribution function and Pyke's modified empirical distribution function. Ann. Math. Statist. 33, 525-532.

Burch, C. R. and Parsons, I. T. (1976). "Squeeze" significance tests. Appl. Statist., 287-291.

Chernoff, H., Gastwirth, J. L., and Johns, M. V. (1967). Asymptotic distribution of linear combinations of functions of order statistics with applications to estimation. Ann. Math. Statist. 38, 52-72.

Chernoff, H. and Lehmann, E. L. (1954). The use of the maximum likelihood estimates on tests for goodness of fit. Ann. Math. Statist. 25, 579-586.

Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall, London.

Cox, D . R. and Small, N . J. H. (1978). Testing multivariate normality.


Biom etrika 65, 263-272.

C ram er, H. (1928). On the composition of elementary e r r o r s . Second paper:


statistical applications. Skand. Aktvarietidskr. 11, 141-180.

C so rg o , M . , Seshadri, V . , and Yalovsky, M . (1973). Some exact tests for


normality in the presence o f unknovm param eters. J. Roy. Statist. Soc.
B 507-522.

D'Agostino, R. B. (1971). An omnibus test of normality for moderate and large sample size. Biometrika 58, 341-348.

D'Agostino, R. B. (1972). Small sample probability points for the D test of normality. Biometrika 59, 219-221.

D'Agostino, R. B. and Pearson, E. S. (1973). Testing for departures from normality. I. Fuller empirical results for the distribution of b2 and √b1. Biometrika 60, 613-622.

D'Agostino, R. B. and Rosman, B. (1974). The power of Geary's test of normality. Biometrika 61, 181-184.

D'Agostino, R. B. and Tietjen, G. L. (1971). Simulation probability points for b2 for small samples. Biometrika 58, 669-672.

D'Agostino, R. B. and Tietjen, G. L. (1973). Approaches to the null distribution of √b1. Biometrika 60, 169-173.

Dahiya, R. C. and Gurland, J. (1973). A test of fit for bivariate distributions. J. Roy. Statist. Soc. B 35, 452-465.

David, H. A., Hartley, H. O., and Pearson, E. S. (1954). The distribution of the ratio, in a single normal sample, of range to standard deviation. Biometrika 41, 482-493.

Downton, F. (1966). Linear estimates with polynomial coefficients. Biometrika 53, 129-141.

Dumonceaux, R., Antle, C. E., and Haas, G. (1973). Likelihood ratio test for discrimination between two models with unknown location and scale parameters. Technometrics 15, 19-31.

Durbin, J. (1961). Some methods for constructing exact tests. Biometrika 48, 41-55.

Durbin, J. (1973). Weak convergence of the sample distribution function when parameters are estimated. Ann. Stat. 1, 279.

Durbin, J., Knott, M., and Taylor, C. C. (1975). Components of Cramer-von Mises statistics. II. J. Roy. Statist. Soc. B 37, 216-237.

Dyer, A. R. (1974). Comparisons of tests for normality with a cautionary note. Biometrika 61, 185-189.

Elderton, W. P. and Johnson, N. L. (1969). Systems of Frequency Curves. Cambridge University Press, Cambridge.
Cam bridge University P r e s s , Cam bridge.

Epps, T. W. and Pulley, L. B. (1983). A test for normality based on the empirical characteristic function. Biometrika 70, 723-726.

Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics 17, 111-117.

Fisher, R. A. (1928). On a property connecting the χ² measure of discrepancy with the method of maximum likelihood. Reproduced in Contributions to Mathematical Statistics (1950). Wiley, New York.

Gastwirth, J. L. and Owens, M. G. B. (1977). On classical tests of normality. Biometrika 64, 135-139.

Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality. Biometrika 27, 310-332.

Geary, R. C. (1947). Testing for normality. Biometrika 34, 209-242.

Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observations. Wiley, New York.

Green, J. R. and Hegazy, Y. A. S. (1976). Powerful modified EDF goodness-of-fit tests. J. Amer. Statist. Ass. 71, 204-209.

Hall, P. and Welsh, A. H. (1983). A test of normality based on the empirical characteristic function. Biometrika 70, 485-489.

Harter, H. L. (1961). Expected values of normal order statistics. Biometrika 48, 151-165.

Hartley, H. O. and Pfaffenberger, R. C. (1972). Quadratic forms in order statistics used as goodness-of-fit criteria. Biometrika 59, 605-611.

Hastings, C. (1955). Approximations for Digital Computers. Princeton University Press, Princeton, N.J.

Hegazy, Y. A. S. and Green, J. R. (1975). Some new goodness-of-fit tests using order statistics. Appl. Statist. 24, 297-308.

Hogg, R. V. (1972). More light on the kurtosis and related statistics. J. Amer. Statist. Ass. 67, 422-424.

Kolmogorov, A. (1933). Sulla determinazione empirica di una legge di distribuzione. Gior. Ist. Ital. Attuari 4, 83-91.

Kuiper, N. H. (1960). Tests concerning random points on a circle. Proc. Koninkl. Neder. Akad. van Wetenschappen A 63, 38-47.

La Brecque, J. (1977). Goodness-of-fit tests based on nonlinearity in probability plots. Technometrics 19, 293-306.

Lin, C. and Mudholkar, G. S. (1980). A simple test for normality against asymmetric alternatives. Biometrika 67, 455-461.

Locke, C. and Spurrier, J. D. (1976). The use of U-statistics for testing normality against non-symmetric alternatives. Biometrika 63, 143-147.

Locke, C. and Spurrier, J. D. (1977). The use of U-statistics for testing normality against alternatives with both tails heavy or both tails light. Biometrika 64, 638-640.

Machado, S. G. (1983). Two statistics for testing for multivariate normality. Biometrika 70, 713-718.

Malkovich, J. F. and Afifi, A. A. (1973). On tests for multivariate normality. J. Amer. Statist. Ass. 68, 176-179.

Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika 57, 519-530.

Mardia, K. V. (1974). Applications of some measures of multivariate skewness and kurtosis for testing normality and robustness studies. Sankhya A 36, 115-128.

Mardia, K. V. (1975). Assessment of multivariate normality and the robustness of Hotelling's T² test. Appl. Statist. 24, 163-171.

Mardia, K. V. and Foster, K. (1983). Omnibus tests of multinormality based on skewness and kurtosis. Comm. in Statist. 12, 207-221.

Mulholland, H. P. (1977). On the null distribution of √b1 for samples of size at most 25, with tables. Biometrika 64, 401-409.

O'Reilly, F. and Quesenberry, C. P. (1973). The conditional probability integral transformation and application to composite chi-square goodness of fit test. Ann. Statist., 74-83.

Pearson, E. S. (1930). A further development of tests for normality. Biometrika 22, 239.

Pearson, E. S., D'Agostino, R. B., and Bowman, K. O. (1977). Tests for departure from normality. Comparison of powers. Biometrika 64, 231-246.

Pearson, E. S. and Hartley, H. O. (1972). Biometrika Tables for Statisticians, Vols. I and II. Cambridge University Press, Cambridge.

Pearson, K. (1895). Contributions to the mathematical theory of evolution. Philosophical Transactions of the Royal Society, London, 91, 343.

Pearson, K. (1900). On a criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen in random sampling. Phil. Mag. 5th Ser., 157-175.

Prescott, P. (1976). Comparison of tests for normality using stylized surfaces. Biometrika 63, 285-289.

Puri, M. L. and Rao, C. R. (1976). Augmenting Shapiro-Wilk test for normality. Contributions to Applied Statistics. Birkhauser, Berlin, 129-139.

Pyke, R. (1959). The supremum and infimum of the Poisson process. Ann. Math. Statist. 30, 568-576.

Quesenberry, C. P. (1975). Transforming samples from truncation parameter distributions to uniformity. Comm. in Statist. 4, 1149-1155.

Rao, C. R. (1973). Linear Statistical Inference and Its Applications (2nd edition). Wiley, New York.

Royston, J. P. (1982). An extension of Shapiro and Wilk's W test for normality to large samples. Appl. Statist. 31, 115-124.

Saniga, E. M. and Miles, J. A. (1979). Power of some standard goodness-of-fit tests of normality against asymmetric stable alternatives. J. Amer. Statist. Ass. 74, 861-865.

Shapiro, S. S. and Francia, R. S. (1972). An analysis of variance test for normality. J. Amer. Statist. Ass. 67, 215-216.

Shapiro, S. S. and Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika 52, 591-611.

Shapiro, S. S., Wilk, M. B., and Chen, H. J. (1968). A comparative study of various tests for normality. J. Amer. Statist. Ass. 63, 1343-1372.

Spiegelhalter, D. J. (1977). A test for normality against symmetric alternatives. Biometrika 64, 415-418.

Smirnov, N. V. (1939). Sur les écarts de la courbe de distribution empirique (Russian/French summary). Rec. Math. 6, 3-26.

Stephens, M. A. (1974). EDF statistics for goodness-of-fit and some comparisons. J. Amer. Statist. Ass. 69, 730-737.

Tiku, M. L. (1974). A new statistic for testing for normality. Comm. Statist. 3, 223-232.

Uthoff, V. (1970). An optimum test property of two well-known statistics. J. Amer. Statist. Ass. 65, 1597-1600.

Uthoff, V. (1973). The most powerful scale and location invariant test of the normal versus the double exponential. Ann. Statist. 1, 170-174.

Watson, G. S. (1957). The goodness-of-fit test for normal distributions. Biometrika 44, 336-348.

Watson, G. S. (1961). Goodness-of-fit tests on a circle. Biometrika 48, 109-114.

Weisberg, S. (1980). Comment on some large sample tests for nonnormality in the linear regression model. J. Amer. Statist. Ass. 75, 28-31.
TESTS FOR THE NORMAL DISTRIBUTION 419

W eisberg, S. and Binham, C . (1975). An approximate analysis of variance


test for non-normality suitable fo r machine calculation. Technometrics
133-134.

White, H. and MacDonald, G . M . (1980). Some la rge-sam p le tests fo r non­


normality in the regression m odel. J. A m er. Statist. A s s . 75, 16-28.
10
Tests for the Exponential Distribution

Michael A. Stephens   Simon Fraser University, Burnaby, B.C., Canada

10.1 INTRODUCTION AND CONTENTS

The exponential distribution is probably the one most used in statistical work after the normal distribution. It has important connections with life testing, reliability theory, and the theory of stochastic processes, and is closely related to several other well-known distributions with statistical applications, for example, the gamma and the Weibull distributions.
The general form of the exponential distribution is

F(x; α, β) = 1 - exp{-(x - α)/β},  x > α,   (10.1)

where α and β are constants, with β positive. The notation Exp(α, β) will be used to refer to F(x; α, β), or to a sample from it, and α will be called the origin of the distribution; the mean of Exp(α, β) is α + β and the variance is β². If a random sample X₁, ..., Xₙ is Exp(α, β), the ordered sample X₍₁₎ < X₍₂₎ < ··· < X₍ₙ₎ will be called an ordered Exp(α, β) sample. As in other chapters, the notation U(0, 1) will refer to a uniform distribution from 0 to 1, and a sample from U(0, 1) will be called a uniform sample or, if placed in ascending order, an ordered uniform sample.
In this chapter we discuss tests of the null hypothesis that a random sample X₁, ..., Xₙ is Exp(α, β), with possibly α, or β, or both, unknown. This gives four possible cases:

Case 0: where α and β are both known;

Case 1: where α is unknown and β is known;


Case 2: where α is known and β is unknown;

Case 3: where both α and β are unknown.

The null hypotheses corresponding to these four cases will be called H₀₀, H₀₁, H₀₂, and H₀₃.
Case 0 can be easily handled: the Probability Integral Transform (PIT, Section 4.2.3) gives n values Zᵢ = F(Xᵢ; α, β) which, on H₀₀, are uniform U(0, 1), and these can be tested by any of the methods of Section 4.4 or Chapter 8. An example of a Case 0 test for exponentiality is given in Section 4.9. The data are the 15 values of X given in Table 10.1; they are times to failure of air-conditioning equipment in aircraft, given by Proschan (1963). We shall use these data throughout the chapter to illustrate test procedures.
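The Case 0 computation is easy to sketch. Below is a minimal illustration (ours, not from the text) of the PIT step applied to the Table 10.1 failure times, with α = 0 and a purely hypothetical known value β = 100; on H₀₀ the resulting Z-values are U(0, 1) and could be passed to any test of Chapter 8.

```python
import math

def pit_exponential(xs, alpha, beta):
    """Probability Integral Transform for Exp(alpha, beta):
    Z_i = F(X_i; alpha, beta). On H00 the Z_i are uniform U(0, 1)."""
    return [1.0 - math.exp(-(x - alpha) / beta) for x in xs]

# Times to failure from Table 10.1
x = [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326]

# beta = 100 is purely illustrative; in Case 0 both parameters must be known.
z = pit_exponential(x, alpha=0.0, beta=100.0)
```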
Mathematical properties of the exponential distribution can be used to change Case 1 to Case 0, and to change Case 2 and Case 3 to the special test of Case 2 with α = 0. Most of the tests in the literature have been proposed for this situation, so we shall reserve the notation H₀ for the hypothesis:

H₀: a random sample of size n comes from Exp(0, β), with β unknown.

Tests for this hypothesis are those discussed throughout most of this chapter, although we return to Case 3 in Section 10.14.
A large number of test procedures have been given for H₀. One reason for this is that, again because of mathematical properties of the exponential distribution, it is possible to transform a sample of n X-values from Exp(0, β) in several useful ways; one transformation (N below) takes the sample to a new n-sample X' which is also Exp(0, β), and another transformation (J) takes X to a set of n - 1 ordered uniforms U. Further, J can be applied to the X' set, to give a set of n - 1 uniforms U'; we call the conversion of X to U' the K-transformation on X. Thus tests of H₀ on X can become tests of H₀ on X', or tests of uniformity on U or on U'. Furthermore, the different transformations have useful interpretations, depending on the original motivation for testing the X sample, and on the alternative distributions to Exp(0, β) that the X might have. Two of the most important applications of Exp(0, β) variables are to modelling time intervals between events, or to modelling lifetimes, or times to failure, in survival analysis and reliability theory. A particular application will tend to lead to a particular group of tests. In a general way, the J transformation, and various tests on U, will arise naturally in connection with a series of events, and the N and K transformations, with tests on X' or on U', will arise in tests on lifetimes. This is partly because the properties of X' and U' are influenced by whether or not the true distribution of X, if not exponential, has a decreasing or increasing failure rate.
The overall plan of this chapter is therefore as follows. After a section on notation, we show how other cases are reduced to Case 0 or to a test of H₀. The applications of the exponential distribution are discussed, followed by the details of the N, J, and K transformations, and some of their properties. Then we turn to tests of H₀. With the potential applications in mind, these are roughly grouped into three groups, as follows: Group 1, those applied to X; Group 2, those applied to U; Group 3, those applied to X' or to U'. The tests for the three groups occupy Sections 10.7 to 10.11. The test statistics discussed are almost always presented in the context in which they were first suggested, although, obviously, any test first suggested for the X set could equally be applied to set X', and vice versa. There will be some inevitable overlap with other chapters, particularly Chapter 8, containing tests for uniformity. A few statistics are repeated in Chapter 8 (they differ slightly because in Chapter 8 it is natural to calculate the statistics from n uniforms, but in this chapter they are found from m = n - 1 uniforms).

TABLE 10.1 Set of Observations X and Derived Values

X^a     T^b      U^c     D^d     X'^e    U'^f
 74       74     .041    .041    180     .099
 57      131     .072    .031    126     .17
 48      179     .098    .026     65     .20
 29      208     .114    .016     12     .21
502      710     .390    .276     22     .22
 12      722     .397    .007      0     .22
 70      792     .435    .038    171     .32
 21      813     .447    .012     72     .36
 29      842     .463    .016     14     .36
386     1228     .675    .212     66     .40
 59     1287     .708    .033     20     .41
 27     1314     .722    .014    316     .58
153     1467     .806    .084    519     .87
 26     1493     .821    .015    120     .94
326     1819    1.000    .179    116    1.00
Total: X = 1819; X' = 1819

^a Times to failure of air-conditioning equipment for an aircraft.
^b Partial sums of Xᵢ (Section 10.4).
^c Values of T divided by the largest value (1819), i.e., values U derived from X by the J-transformation (Section 10.5).
^d Spacings between the U-values.
^e Normalized spacings given by the N transformation (Section 10.5).
^f Values U' obtained from X' by the J transformation, or from X by the K transformation (Section 10.5).
Data taken from Proschan (1963), by permission of the author and of the American Statistical Association.

One group which is treated mainly in this chapter is the group of tests based on spacings between uniforms: such spacings are exponentials, and these tests have arisen mostly in connection with tests of H₀.
If a general random sample is given, with no details as to context, and a test of H₀ is required, the question of making a transformation or not becomes one of obtaining the best power for a test procedure against a class of alternatives. Some studies on power of tests are reported in Section 10.13, for both omnibus tests and one-sided tests. One feature of these studies is that they reveal much similarity in power of many of the test procedures, when applied to a general random sample. The user will therefore often be guided by personal preference.
The value of a test statistic often gives information on the set from which it is calculated, and this in turn may sometimes be interpreted in terms of the X-sample. Thus, in modern statistics, it is common practice to calculate several statistics, and to use them to analyze features of the given data, rather than rigorously to apply significance tests. This approach is taken to illustrate the tests, applied to the data set in Table 10.1, in Section 10.12. In Section 10.14 we return to Case 3 tests.

10.2 NOTATION

The notation used in this chapter, apart from that already described, is listed in this section.

X: the original data set X₁, X₂, ..., Xₙ; i is the index of Xᵢ.

n = size of set X; m = n - 1.

DFR, IFR: decreasing failure rate, increasing failure rate (Section 10.4.5).

DFR (IFR) sample: a random sample from a DFR (IFR) distribution (Section 10.4.5).

Cᵥ: coefficient of variation (Section 10.4.5).

Transformations J, K, N: see Section 10.5.

X': a set of size n, derived from X by transformation N.

U: a set of size m, derived from X by transformation J.

U': a set of size m, derived from X by transformation K.

E: spacings between exponentials X (Section 10.5.2).

D: spacings between set U (Section 10.9.3).

D': spacings between set U' (Section 10.11.4).

Group 1: tests using X (Section 10.8).

Group 2: tests using U (Section 10.9).

Group 3: tests using X' or U' (Section 10.11).

log x means logₑ x.

Significant tail of a test statistic: see Section 10.13.1.

10.3 TESTS FOR EXPONENTIALITY: THE FOUR CASES

10.3.1 Four Results

In this section we show how Case 1 can be reduced to Case 0, and Cases 2 and 3 to a test of H₀. These employ the following properties of a sample from Exp(α, β).

Result 1. If Xᵢ, i = 1, ..., n, is a random sample from Exp(α, β), the set Yᵢ given by Yᵢ = Xᵢ - α, i = 1, ..., n, is a random sample from Exp(0, β).

Result 2. If X₍ᵢ₎, i = 1, ..., n, is an ordered sample from Exp(α, β), the Y-sample obtained from Y₍ᵢ₎ = X₍ᵢ₊₁₎ - X₍₁₎, i = 1, ..., n - 1, is an ordered sample of size n - 1 from Exp(0, β). This result can be successively applied to give

Result 3. If X₍ᵢ₎, i = 1, ..., n, is an ordered sample from Exp(α, β), the Z-sample obtained from

Z₍ᵢ₎ = X₍ᵢ₊ᵣ₎ - X₍ᵣ₎,  i = 1, ..., n - r,   (10.2)

where r is fixed, 1 ≤ r ≤ n - 1, is an ordered sample of size n - r from Exp(0, β).

Result 4. If X is Exp(0, β), Y = 2X/β has the χ² distribution with 2 degrees of freedom.

10.3.2 Tests for Cases 1, 2, and 3

10.3.2.1 Tests for Case 1: α is not known but β is known

For this case, Result 2 above can be used to change the original X-sample to a Y-sample of size n - 1 which, on H₀₁, will be Exp(0, β), with β known, and the test becomes a Case 0 test on the Y set.

10.3.2.2 Tests for Case 2: α is known, and β is unknown

For this case, when the known value of α is not zero, Result 1 can be used to produce a Y-sample which, on H₀₂, will be Exp(0, β). Thus the Case 2 test with α known is reduced to a test of H₀, applied to the Y-set.
10.3.2.3 Tests for Case 3: both parameters unknown

This test, of H₀₃: that a given set X is from Exp(α, β) with both parameters unknown, can be transformed by use of Results 2 or 3 above to a test that the Y-set or Z-set is Exp(0, β), that is, a test of H₀ applied to set Y or to set Z. Result 2 will be used if a complete X-sample of size n is available; the Y-sample is then of size n - 1. Result 3 will be useful if the first r - 1 ordered observations of the X-set are not available, or if, for some reason, they are suspected to be outliers.
This use of Results 2 and 3 has been a generally accepted way to handle tests with unknown α, but it may not be the best way, and we return to tests of Case 3 in Section 10.14.
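As a small sketch of the Case 3 reduction (assuming nothing beyond Result 2), the computation amounts to ordering the sample and subtracting the smallest value; the Table 10.1 data are used for illustration.

```python
def reduce_case3(xs):
    """Result 2: order the sample and subtract the smallest value.
    On H03 the n - 1 differences X_(i+1) - X_(1) form an ordered
    Exp(0, beta) sample, so any test of H0 can be applied to them."""
    s = sorted(xs)
    return [s[i] - s[0] for i in range(1, len(s))]

# Times to failure from Table 10.1
x = [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326]
y = reduce_case3(x)  # 14 ordered values, starting 9, 14, 15, ...
```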

10.4 APPLICATIONS OF THE EXPONENTIAL DISTRIBUTION

10.4.1 The Poisson Process

Suppose a series of events is recorded, starting at time T₀ = 0; the events occur at times T₁, T₂, ..., with T₀ < T₁ < T₂ < ··· < Tₙ. Consider the intervals between events, Xᵢ, defined by

Xᵢ = Tᵢ - Tᵢ₋₁,  i = 1, ..., n.

If the events at times Tᵢ are from a Poisson process, the variables Xᵢ will be independently distributed Exp(0, β), where β is a positive constant. A test that the process generating the events is Poisson can therefore be based on a test that the intervals Xᵢ are Exp(0, β). The Poisson process is discussed in many textbooks; see, for example, Cox and Lewis (1966, Chapter 2). If the Tᵢ are recorded on a horizontal time axis, the Xᵢ are the spacings between the Tᵢ.
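The connection between exponential intervals and Poisson event times can be sketched by simulation (our own illustration; the rate 0.5, horizon 100, and seed are arbitrary choices): summing independent exponential intervals produces event times of a Poisson process.

```python
import random

def poisson_event_times(rate, t_end, seed=1):
    """Simulate a Poisson process on (0, t_end] by summing
    independent exponential intervals with mean 1/rate."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # interval X_i, mean 1/rate
        if t > t_end:
            return times
        times.append(t)

events = poisson_event_times(rate=0.5, t_end=100.0)
# Recover the intervals X_i = T_i - T_(i-1); these are the spacings
intervals = [b - a for a, b in zip([0.0] + events, events)]
```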

10.4.2 Models for Time to Failure

The second application of Exp(0, β) is to model the lifetime, or time to failure, of, say, a piece of apparatus, such as the air-conditioning equipment for which 15 values of X are given in Table 10.1.
Suppose the item is immediately replaced whenever it fails, and let Xᵢ be the lifetimes of successive items. If times Tₙ are calculated, given by

Tₙ = X₁ + X₂ + ··· + Xₙ

these times will be the times of failure for the overall equipment (here an aircraft).

By comparison with the preceding paragraph it may be seen that, whenever the Xᵢ are Exp(0, β), the times Tᵢ can be regarded as a realization of a Poisson process.

E 10.4.2.1 Example

Values Tᵢ, derived from the Xᵢ, are shown in Table 10.1.

10.4.3 A Lifetesting Experiment

The lifetimes Xᵢ of equipment in the example above were obtained by using the units successively in an aircraft as required. If it is desired to test that lifetimes are exponential, the test will be accelerated, if the units are available and expendable, by making a laboratory experiment in which they are all put into use at the same time T₀ = 0, and times to failure are recorded either until all units fail, or until a fixed time T_f is reached. Suppose the units are numbered 1 to n, and let Xᵢ be the lifetime of unit number i; notice that in general the labelling of the Xᵢ will be quite arbitrary. Times will be recorded as failures occur, and these times give the order statistics T₀ < X₍₁₎ < X₍₂₎ < ··· < T_f of the X sample. If only r items fail in time T_f, only the first r order statistics of the sample will be known. The sample is then said to be right-censored (see Section 4.7 and Chapter 12).
Note that times Tᵢ* calculated from the order statistics by T₁* = X₍₁₎, T₂* = X₍₁₎ + X₍₂₎, etc., will not be times in a Poisson process as described in Section 10.4.2 above, because the order statistics of an exponential random sample are not themselves exponentially distributed. However, transformation N below changes ordered Exp(0, β) variables into random Exp(0, β) variables, and the above construction can then be used to create times in a Poisson process.

10.4.4 Alternative Distributions Used in Reliability Theory

In order to compare various test procedures, it will be useful first to discuss distributions alternative to the exponential, which are used in reliability analysis as models for lifetime data. Two of the most important of these are the gamma and Weibull distributions. The most general forms of these distributions are given in Sections 4.11 and 4.12. Here we are interested only in the distributions with origin zero. The gamma distribution then has density

f_G(x) = {β^m Γ(m)}⁻¹ x^(m-1) e^(-x/β),  x > 0,   (10.3)

where m and β are positive constants. When the constant m = 1, the distribution reduces to the exponential. The density is infinite at x = 0 when m < 1, and is zero when m > 1.

The Weibull distribution, with origin zero, has density

f_W(x) = (m/β)(x/β)^(m-1) exp{-(x/β)^m},  x > 0,   (10.4)

with m and β positive constants. For m < 1, the density at x = 0 is infinite, and when m > 1 it is zero; when m = 1, the distribution reduces to the exponential. In shape the gamma and Weibull distributions are somewhat similar.

10.4.5 Properties of Distributions: Coefficient of Variation, and Failure or Hazard Rate

Two useful parameters in describing distributions are the coefficient of variation Cᵥ, and the failure rate. The Cᵥ is defined as σ/μ, where μ and σ² are the mean and variance of the distribution. A small Cᵥ suggests that the variable X has fairly constant values, but a large Cᵥ suggests they will be widely spread relative to the size of the mean. For the gamma distribution, μ = mβ and σ² = mβ²; thus Cᵥ = m^(-1/2).
The failure rate, or hazard rate, of X is defined as

h(x) = f(x) / {1 - F(x)},  x > 0,   (10.5)

where f(x) and F(x) are the density and distribution functions of X, assumed continuous. If F(x) is a distribution of lifetimes, the quantity h(x) dx may be interpreted as the probability of failing in time dx at x, given that failure has not occurred up to x. For the exponential distribution Exp(0, β), h(x) = 1/β, a constant; for the gamma and the Weibull distributions, the failure rate increases steadily with x for m > 1, and decreases for 0 < m < 1; for m = 1 they both reduce to the exponential distribution with constant failure rate. Abbreviations IFR and DFR are commonly used for increasing or decreasing failure rate. A distribution with IFR (often called an IFR distribution; a sample from such a distribution will be called an IFR sample) will have Cᵥ < 1, and a DFR distribution will have Cᵥ > 1. We can summarize results for the gamma and Weibull distributions in the following small table.

Parameter value      Cᵥ         Failure rate
m < 1                Cᵥ > 1     DFR
m = 1 (Exp(0, β))    Cᵥ = 1     Constant = 1/β
m > 1                Cᵥ < 1     IFR
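For the Weibull density (10.4), substituting f and F into (10.5) gives the closed form h(x) = (m/β)(x/β)^(m-1), and the IFR/DFR behaviour in the table above can be checked numerically; the parameter values below are illustrative only.

```python
def weibull_hazard(x, m, beta):
    """Hazard of the Weibull density (10.4):
    h(x) = f(x) / (1 - F(x)) = (m / beta) * (x / beta) ** (m - 1)."""
    return (m / beta) * (x / beta) ** (m - 1.0)

xs = (0.5, 1.0, 2.0)
ifr = [weibull_hazard(x, m=2.0, beta=1.0) for x in xs]    # m > 1: increasing
dfr = [weibull_hazard(x, m=0.5, beta=1.0) for x in xs]    # m < 1: decreasing
const = [weibull_hazard(x, m=1.0, beta=2.0) for x in xs]  # m = 1: constant 1/beta
```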

10.5 TRANSFORMATIONS FROM EXPONENTIALS TO EXPONENTIALS OR TO UNIFORMS

10.5.1 Introduction

We now describe the three important transformations which are often made to the data set X under test for Exp(0, β). These are:

1) transformation N, which transforms an ordered exponential sample Exp(0, β), of size n, to a random Exp(0, β) sample of size n;
2) transformation J, which transforms a random Exp(0, β) sample of size n into an ordered uniform U(0, 1) sample of size m = n - 1;
3) transformation K, which, like J, transforms a random Exp(0, β) sample of size n into an ordered uniform U(0, 1) sample of size m = n - 1.

10.5.2 The N Transformation from Exponentials to Exponentials: Normalized Spacings

Suppose X₍ᵢ₎, i = 1, ..., n, is an ordered sample from Exp(0, β), and define the spacings between the exponentials by

Eᵢ = X₍ᵢ₎ - X₍ᵢ₋₁₎,  i = 1, ..., n,

with X₍₀₎ = 0. The new set Xᵢ', defined by

Xᵢ' = (n + 1 - i)Eᵢ,  i = 1, ..., n,

will be independently and identically distributed Exp(0, β). For more precise conditions on this result see, for example, Seshadri, Csörgo, and Stephens (1969). This transformation will be called transformation N and we write X' = NX. The values Xᵢ' are called the normalized spacings of the original set Xᵢ: Eᵢ has expected value β/(n + 1 - i), so that lᵢ = Eᵢ/(expectation of Eᵢ) = Xᵢ'/β; the lᵢ are sometimes called leaps. For further discussion of normalized spacings see Section 4.20.
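The formula above is a direct computation; the sketch below applies it to the Table 10.1 data and reproduces the X' column (180, 126, 65, ...).

```python
def normalized_spacings(xs):
    """Transformation N: X'_i = (n + 1 - i) * E_i, where the E_i are
    the spacings of the ordered sample and X_(0) = 0. On H0 the X'_i
    are independent Exp(0, beta)."""
    s = sorted(xs)
    n = len(s)
    prev, out = 0.0, []
    for i, v in enumerate(s, start=1):
        out.append((n + 1 - i) * (v - prev))
        prev = v
    return out

# Times to failure from Table 10.1
x = [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326]
x_prime = normalized_spacings(x)  # 180, 126, ..., as in the X' column
```

Note that the X' values sum to the same total (1819) as the X values, the identity stated later as (10.7).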

10.5.3 The "Total Time on Test" Statistic

Suppose the transformed sample X' is used to construct a Poisson process realization as described in Section 10.4.2:

T₁' = X₁',  T₂' = X₁' + X₂',  ...,  Tᵣ' = X₁' + X₂' + ··· + Xᵣ'.

It may easily be shown that Tᵣ' is also, in terms of the original X-values,

Tᵣ' = X₍₁₎ + X₍₂₎ + ··· + X₍ᵣ₎ + (n - r)X₍ᵣ₎.   (10.6)

In the context of the lifetesting experiment discussed in Section 10.4.3, the first r terms on the right side of (10.6) are the times for which the first r failed items were working successfully, and the last term is the time so far spent working by those n - r items which have not yet failed. Thus Tᵣ' is interpretable as the total time on test till the r-th failure. At time X₍ₙ₎ all test items have failed, and then

Tₙ' = X₁' + ··· + Xₙ' = X₁ + ··· + Xₙ = Tₙ.   (10.7)

10.5.4 The J Transformation from Exponentials to Uniforms

A result in Section 8.2 states that the n + 1 spacings between a sample of size n from U(0, 1) are each Exp(0, β) with β = 1/(n + 1); the spacings are not a random sample, but are conditional on their sum being fixed. This being so, a sample of ordered uniforms can be produced from an exponential sample as follows:

a) Let X₁, X₂, ..., Xₙ be a random sample from Exp(0, β), and let Tⱼ = X₁ + ··· + Xⱼ, j = 1, ..., n.
b) Define U₍ⱼ₎ = Tⱼ/Tₙ, j = 1, ..., n - 1.

Result 1. The U₍ⱼ₎ are distributed as the order statistics of a random sample U, of size n - 1, from U(0, 1).

Note that if the definition were extended to j = n, the value of U₍ₙ₎ would be identically 1. Thus the transformation produces n - 1 ordered uniforms from n original observations; a "degree of freedom" has been lost in eliminating the unknown parameter β. The above transformation from X to U will be called the transformation J, and we write U = JX. If the Tⱼ are the times in a Poisson process, as described in Section 10.4.2, the U₍ᵢ₎ above are the values Tᵢ/Tₙ, i = 1, ..., n - 1.
Result 1 was obtained by dividing the times in a Poisson process by the time of the last event observed. If the process is observed to a fixed time, we have a second result:
Result 2. If a Poisson process is observed from time zero to a fixed time T_f, with n events at 0 < T₁ < T₂ < ··· < Tₙ < T_f occurring in that time, the values U₍ᵢ₎ = Tᵢ/T_f, i = 1, ..., n, will give a sample U of n ordered uniforms.
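Steps a) and b) above can be sketched directly; applied to the Table 10.1 data in their natural time order, this reproduces the U column (.041, .072, ...).

```python
def j_transform(xs):
    """Transformation J: partial sums T_j of the sample divided by T_n.
    On H0 the n - 1 ratios are an ordered uniform U(0, 1) sample."""
    partial, total = [], 0.0
    for v in xs:
        total += v
        partial.append(total)
    return [t / total for t in partial[:-1]]

# Times to failure from Table 10.1 (natural time order preserved)
x = [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326]
u = j_transform(x)  # .041, .072, ..., matching the U column of Table 10.1
```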

10.5.5 The K Transformation from Exponentials to Uniforms

A sample of X-values, all positive, could first be transformed to a new set X' by transformation N, and then to a set U' by applying transformation J to X'. On H₀, the set U' will be n - 1 ordered uniforms. The combination of N and J is equivalent to the following transformation, which we call K.

a) Let X₍₁₎ < X₍₂₎ < ··· < X₍ₙ₎ be the order statistics of a sample from Exp(0, β), and let Tₙ = X₁ + ··· + Xₙ, as before.
b) Write X₍₀₎ = 0 and let Eᵢ = X₍ᵢ₎ - X₍ᵢ₋₁₎; calculate

Xᵢ' = (n + 1 - i)Eᵢ,  i = 1, ..., n.

c) Calculate Tⱼ' = X₁' + ··· + Xⱼ', and U₍ⱼ₎' = Tⱼ'/Tₙ, j = 1, ..., n - 1.

Result. The set U' is a sample of n - 1 ordered uniforms from U(0, 1).

Note also that, from (10.7), Tₙ' = Tₙ. Following earlier notation we write U' = KX.

E 10.5.5 Example

In Table 10.1 the values X' are given, obtained by application of transformation N to the set X; also given are the values U and U' obtained from application of transformations J and K.
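Steps a) to c) can be sketched in one function; applied to the Table 10.1 data this reproduces the U' column (.099, .17, ...).

```python
def k_transform(xs):
    """Transformation K = J applied after N: normalized spacings X'_i
    of the ordered sample, then U'_j = T'_j / T'_n, j = 1, ..., n - 1."""
    s = sorted(xs)
    n = len(s)
    prev, sp = 0.0, []
    for i, v in enumerate(s, start=1):
        sp.append((n + 1 - i) * (v - prev))  # X'_i
        prev = v
    total = sum(sp)  # T'_n, equal to T_n by (10.7)
    t, out = 0.0, []
    for e in sp[:-1]:
        t += e
        out.append(t / total)
    return out

# Times to failure from Table 10.1
x = [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326]
u_prime = k_transform(x)  # .099, .17, ..., matching the U' column of Table 10.1
```

Note that, unlike J, the result does not depend on the original ordering of the sample, since the data are sorted first.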

10.5.6 The N and K Transformations with Censored Data

The N and K transformations can be used with a censored sample. When only the r smallest values of the original set X are available, the r values X₁', X₂', ..., Xᵣ' of set X' given by the N transformation can still be obtained; the J transformation can be applied to these to produce r - 1 ordered uniforms. This is equivalent to the following useful result. Suppose we calculate, using the notation of Section 10.5.5, values

U₍ⱼ₎' = Tⱼ'/Tᵣ',  j = 1, ..., r - 1.

Result. The set U' is a complete sample of r - 1 ordered uniforms from U(0, 1).

Also, since the Xᵢ' are a random sample from Exp(0, β), the value of (X₁' + ··· + Xᵣ')/r = Tᵣ'/r gives an estimate of β; this estimate is often used with censored data.
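The censored computation is the same as K with the partial sums stopped at the r-th failure. The sketch below (our own illustration) takes the 8 smallest values of the Table 10.1 data as a hypothetical right-censored sample with n = 15 units on test; Tᵣ' = 648, so β is estimated by 648/8 = 81.

```python
def censored_k_uniforms(xs_r, n):
    """Censored version of K: from the r smallest of n observations,
    form normalized spacings X'_1, ..., X'_r, then U'_j = T'_j / T'_r,
    j = 1, ..., r - 1; on H0 these are r - 1 ordered uniforms.
    Also returns the usual estimate of beta, T'_r / r."""
    s = sorted(xs_r)
    r = len(s)
    prev, sp = 0.0, []
    for i, v in enumerate(s, start=1):
        sp.append((n + 1 - i) * (v - prev))
        prev = v
    total = sum(sp)  # T'_r, the total time on test at the r-th failure
    t, u = 0.0, []
    for e in sp[:-1]:
        t += e
        u.append(t / total)
    return u, total / r

# The 8 smallest values of the Table 10.1 data, with n = 15 on test
x_r = [12, 21, 26, 27, 29, 29, 48, 57]
u_cens, beta_hat = censored_k_uniforms(x_r, n=15)  # beta_hat = 81
```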

10.6 TEST SITUATIONS AND CHOICE OF PROCEDURES

10.6.1 General Comments

The decision whether or not to test H₀ directly on the X-set, or to use one of the transformations to X' or to U or U', is related to the three test situations: tests on a general random sample, with no particular context given; tests on intervals between events (where the indexing of the intervals might be important); and tests on lifetimes. We first observe an important difference between transformations J and K: the J transformation preserves the original indexing of the Xᵢ, while the K transformation does not; that is to say, in making transformation K (or N) the Xᵢ are first put in ascending order, and the original labelling is irrelevant. Most tests based directly on the Xᵢ (Group 1 below) will also involve putting them in order and losing the original indexing.
In tests on a series of events, the original indexing will probably be important (for example, one wants to know if the intervals are getting longer or shorter as time passes), and then more information will be given by using J followed by tests on U.
On the other hand, when the labelling of the Xᵢ is not important, there are some disadvantages to J. Consider, for example, the lifetimes of equipment in the laboratory experiment described in Section 10.4.3. There is no significance (presumably) in the index i attached to lifetime Xᵢ, and so different statisticians could label the X-set differently; when this is done, the J transformation will produce different values U, and when tests for uniformity are applied to the U-set, different conclusions will be reached.
In contrast, use of the K transformation always gives the same set U', and tests based on U' will, for all statisticians, give the same results. The same holds true of tests based on the X' themselves, and of those tests on the X-set, such as EDF or regression tests, where the observations are first ordered. This invariance is usually considered a desirable property for a test procedure.
Another feature of J is that it can produce superuniform U, that is, a set U which is too evenly spaced to be considered uniform (see Sections 8.5 and 4.5.1). This will occur, sometimes, when the X-set comes from an IFR alternative. Many test procedures are not set up to detect superuniforms; for example, EDF tests or regression tests using the upper tail (as is customary) will not detect them. Several of the test statistics to follow will detect superuniforms, or EDF tests or regression tests may be used with the lower tail, but then power is lost when the tests are used with two tails against DFR alternatives.
The K transformation produces U' values which will never typically be superuniform; in fact, they will drift toward 0 for DFR samples and toward 1 for IFR samples, so the pattern of the U' set gives information about the alternative. Finally, K may be used on a censored sample. For these reasons K is recommended rather than J to apply to a general random sample. These points were discussed further by Seshadri, Csörgo, and Stephens (1969), and we return to them again when we discuss power in Section 10.13.
We now pose two contrasting questions: why transform the data at all, or, at the other extreme, why not apply several transformations, one after the other?
In considering these questions, when the indexing is not important, the tester will be most interested in getting good power for a test procedure, with possibly a particular class of alternatives in mind. We shall see in Section 10.13 that, in practice, tests on the original set X, and tests on U', often give much the same power; furthermore, some tests on X also give information on the parent population (IFR or DFR).
On the other hand, if K = JN is a good thing, why not apply, say, N several times to X, to get sets X' = NX, then X'' = NX' = NNX, etc., and test the final set for exponentiality? Equally, one could apply G and W (uniforms-to-uniforms transformations given in Chapter 8) to uniforms U or U', to obtain data sets represented symbolically by, for example, U₂ = GGJX, or U₃ = WGJX, or U₄ = GWGJX, all of which, on H₀, should be uniform and can be tested by the methods of Chapter 8. One good reason for not repeating such transformations as G and W ad absurdum must be that they will appear to a practical statistician to be arbitrary and unmotivated, and to produce data sets which are far removed from the original X; then if H₀ is rejected because, say, the final data set has too many values close to zero, it will be difficult or impossible to interpret this phenomenon in terms of properties of the original X set. Other practical reasons exist too; it is pointed out in Chapter 8 that application of G to a nonuniform set may often increase the power of a subsequent test for uniformity, but repeated applications may decrease it. No doubt for these reasons tests of H₀ in the literature, such as those given below, have been confined, if a transformation of this type is used at all, to one application of N, J, or K. Some tests involving the set X* = WX, that is, one application of W to X, have been discussed by Wang and Chang (1977). Other aspects of transformations have been discussed by O'Reilly and Stephens (1983); there are interesting connections between the CPIT of Chapter 6 and transformations J, K, G, and W.
We now discuss in greater detail tests on events and tests on lifetimes.

10.6.2 Tests on a Series of Events

It has been suggested above that in tests on events, the natural index of Xᵢ will play an important role. Consider tests for the Poisson process, against the alternative of trend.
In the Poisson process, events occur at a constant rate as time passes; this leads to the intervals between events being independent and exponential. More precisely, let λ(t) dt be the probability of an event occurring in the interval dt at time t; for the Poisson process λ(t) is constant. An obvious alternative to the Poisson process is the model for which λ(t) increases or decreases with t, that is, events occur more quickly or more slowly as time passes. There is then said to be a trend in the rate of occurrence. A possible model for trend considered, for example, by Cox (1955) is to suppose λ(t) = Ce^(kt); if k is positive (negative) there is an increasing (decreasing) rate of events, and if k = 0 the rate is constant. Another model for trend, with λ(t) depending on a parameter a > 1, was considered by Bartholomew (1956), who found a sequential test for randomness against this model for trend.
If events occur more quickly, the Intervals Xj between events w ill b e ­
come shorter on average; thus the X j, as naturally indexed in time, become
sm a lle r. If the J transformation is applied to the Xj as naturally indexed,
the U(i) values w ill tend toward I. If events occur m ore slowly, the tend
toward zero. Thus trend w ill be detected by the statistics which detect move­
ments of the U -values toward O or I . Another alternative to a constant
a rriv a l rate for events is the possibility that they occur periodically; then
the intervals between events are of fairly constant length. The coefficient of
variation Cy of the Xj w ill be sm all, and the U(J) from the J transformation
w ill be super uniform . Statistics to detect periodicity of events must be well
adapted to detect superuniformity of the U (j ). In contrast, if there is a wide
disparity between the Intervals X j, when the longer intervals appear too long
compared with the shorter intervals, the Cy of the Xj w ill be large.
It is possible that the intervals between events, even if exponentially
distributed, are not independent. Lack of independence is usually difficult to
detect, and the appropriate test statistic will depend on how the intervals
are related. Here the indexing of Xi will again be important; as naturally
indexed in time, the Xi might, for example, be tested for autocorrelation.
An interesting example of events for which the J transform produces
superuniform U(i), probably because of lack of independence, is the events
recorded by the dates Ti marking the reigns of kings and queens of England
(see Pearson (1963)). For further remarks on the problems of correlated
intervals see Lewis (1965).
It might be decided to base a test for uniformity of the Ui on the spacings
Di between the Ui. The spacing Di is Xi/Tn, so that the remarks above
concerning the indexing of the Xi will apply also to the spacings Di. The
sizes of the spacings, as naturally indexed by time, will be important both in
tests for trend or for independence, whereas the variance of the spacings will
be important in measuring periodicity or great disparity between the time
intervals.

10.6.3 Tests on Lifetime Data

Suppose the original Xi are lifetimes. Application of the N transformation
will produce a set X'i which will be Exp(0, β) if the Xi are Exp(0, β);
however, if the Xi are from an IFR or DFR distribution, the X'i will not be
exponential, and the indexing of the X'i will become important. Suppose the
true lifetime distribution of X has an increasing failure rate. Then a random
TESTS FOR THE EXPONENTIAL DISTRIBUTION 435

sample X, when placed in order, should exhibit smaller spacings for large
values of X than those given by the exponential distribution with the same
mean lifetime. Since the X'i are normalized values of these spacings, that
is, the spacings multiplied by a factor, the X'i for large i will on the whole
be smaller than expected. This is formally stated as follows (see, for
example, Barlow, Bartholomew, Bremner, and Brunk, 1972).

Result. If the lifetime distribution of X is IFR (DFR) the X'i are
stochastically decreasing (increasing) in i = 1, ..., n.

From this result have come many ideas for testing that the original
sample X is Exp(0, β), against IFR or DFR alternatives, using tests based
on X'i or U'i. A general discussion, with examples, is in Epstein (1960a,
1960b).
Finally, we should note that powerful tests on lifetimes will be powerful
tests on any general random sample, regardless of its source. In the power
studies of Section 10.13, it turns out that it is useful to divide alternatives
into DFR and IFR classes; this division is naturally meaningful in tests on
lifetimes, but it tends also to classify alternatives by CV value, and by
their skewness and kurtosis compared with the exponential.

10.7 TESTS WITH ORIGIN KNOWN: GROUPS 1, 2, AND 3

After the preceding discussion we classify test procedures for H0 into three
broad groups:

Group 1. Tests for exponentiality using the basic data set X.

Group 2. Tests based on the transformation U = JX, with a subsequent test
for uniformity of U.

Group 3. Tests based on the transformations X' = NX, or U' = KX, followed
by a test for exponentiality of the X', or a test for uniformity of U'.

It is clear from earlier discussion that J will often be applied to intervals
between events, leading to tests of Group 2, and N and K will be applied
to lifetime or failure-time data, leading to tests in Group 3. Group 2 tests
have been called uniform conditional tests by Lewis (1965) and by Cox and
Lewis (1966).

10.8 GROUP 1 TESTS

Here the set X will be tested directly for Exp(0, β), with no transformation
N, J, or K. Tests available include the Pearson chi-square test and modern
adaptations, discussed in Chapter 3, EDF tests and regression tests,
discussed in Chapters 4 and 5, and tests based on sample moments. No further
comments will be offered on chi-square tests, but the other tests will be
reviewed below, in the special context of tests for exponentiality.

10.8.1 EDF Tests

(a) The direct EDF test, in which β is estimated by X̄, is described in
Section 4.9.
(b) A variation of standard EDF procedures has been suggested by Srinivasan
(1970, 1971) (see Section 4.16.3); for the exponential distribution the
calculations are easy. The transformation is made to a Z-set by

  Z(i) = 1 - (1 - X(i)/T)^{n-1},  i = 1, ..., n

where T = Σ_{j=1}^n Xj. The Kolmogorov statistic D̂ is then calculated from
the Z(i) using equations (4.2); large values of D̂ are significant. Schafer,
Finkelstein, and Collins (1972) have given tables of Monte Carlo
significance points. The transformation to Z used in this test and the
transformation Z(i) = 1 - exp(-X(i)/X̄), used in the direct EDF test, are
very close, and Moore (1973) showed the two tests to be asymptotically
equivalent. Power studies show that for small samples they have very similar
properties also.
(c) Another test based on the EDF has been proposed by Finkelstein and
Schafer (1971). The Z(i) are calculated as for the direct EDF test, that
is, from Z(i) = 1 - exp(-X(i)/X̄), i = 1, ..., n; then δi is defined as
max{|Z(i) - (i - 1)/n|, |Z(i) - i/n|}, i = 1, ..., n, and the test statistic
is S* = Σ_{i=1}^n δi. Finkelstein and Schafer have provided tables for S*
based on Monte Carlo studies.
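As a sketch of these calculations (our own helper names, not the book's code), the Srinivasan Z-set, the Kolmogorov statistic computed from an ordered Z-set, and the Finkelstein-Schafer S* might be written as follows:

```python
import math

def srinivasan_z(x):
    """Z-set for Srinivasan's modified EDF test:
    Z(i) = 1 - (1 - X(i)/T)^(n-1), with T the sum of all observations."""
    n = len(x)
    t = sum(x)
    return [1.0 - (1.0 - xi / t) ** (n - 1) for xi in sorted(x)]

def kolmogorov_d(z):
    """Kolmogorov statistic D = max(D+, D-) from an ordered Z-set."""
    n = len(z)
    dplus = max((i + 1) / n - zi for i, zi in enumerate(z))
    dminus = max(zi - i / n for i, zi in enumerate(z))
    return max(dplus, dminus)

def finkelstein_schafer(x):
    """S* = sum over i of max(|Z(i) - (i-1)/n|, |Z(i) - i/n|),
    with Z(i) = 1 - exp(-X(i)/Xbar) as in the direct EDF test."""
    n = len(x)
    xbar = sum(x) / n
    z = [1.0 - math.exp(-xi / xbar) for xi in sorted(x)]
    return sum(max(abs(zi - i / n), abs(zi - (i + 1) / n))
               for i, zi in enumerate(z))
```

Large values of `kolmogorov_d(srinivasan_z(x))` and of `finkelstein_schafer(x)` are significant, referred to the Monte Carlo tables cited above.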

10.8.2 Regression Tests

Most regression and correlation tests described in Chapter 5 are devised for
Case 3, where both location and scale parameters are unknown. However,
two tests are designed specifically for testing H0. These are

a) Stephens' W_S, described in Section 5.11.5;

b) Jackson's test, described in Section 5.11.5.

10.8.3 Tests Based on s Sample Moments

Gurland and Dahiya (1970) and Dahiya and Gurland (1972) have discussed a
general method of deriving a test statistic based on s sample moments.
Suppose the r-th sample moment is m_r = Σ_{j=1}^n (Xj)^r / n; when the
Dahiya-Gurland method is applied to the test for Exp(0, β) we have (Currie
and Stephens, 1984)

  Q1 = C1 = n{-1 + m2/(2 m1²)}² = n{-1 + m'2/m1²}²/4, where m'2 = m2 - (m1)²

  Q2 = Q1 + C2, with C2 = n{1 - m2/m1² + m3/(3 m2 m1)}²

  Q3 = Q2 + C3, with C3 = n{-1 + 3 m2/(2 m1²) - m3/(2 m1³) + m4/(4 m3 m1)}²

Asymptotically Q_t has a χ²_t distribution on H0, but the convergence is
quite slow. Statistic Q1 is equivalent to Greenwood's statistic, discussed in
Section 8.9.1 and in Section 10.9.3 below; percentage points for Q2 and Q3,
for finite n, obtained by Monte Carlo sampling, have been given by Currie and
Stephens (1984). Tests based on Q_t are upper-tail tests. Currie and Stephens
have given power studies for n = 20, against a wide range of alternatives.
Dahiya and Gurland (1972) have given power studies, for n = 50 and 100, and
for gamma and Weibull alternatives.
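A minimal sketch of Q1 (our own code; the second form, written with the sample coefficient of variation, anticipates the equivalence shown in Section 10.9.3 below):

```python
def q1_statistic(x):
    """Dahiya-Gurland Q1 = n{-1 + m2/(2 m1^2)}^2, built from the
    first two sample moments m1 and m2."""
    n = len(x)
    m1 = sum(x) / n
    m2 = sum(xi * xi for xi in x) / n
    return n * (-1.0 + m2 / (2.0 * m1 * m1)) ** 2

def q1_from_cv(x):
    """The same statistic written as n(CV^2 - 1)^2 / 4, where CV is
    the sample coefficient of variation (biased central moment m'2)."""
    n = len(x)
    m1 = sum(x) / n
    m2c = sum((xi - m1) ** 2 for xi in x) / n   # m'2 = m2 - m1^2
    cv2 = m2c / (m1 * m1)
    return n * (cv2 - 1.0) ** 2 / 4.0
```

The two forms agree identically, since m2/(2 m1²) - 1 = (CV² - 1)/2.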

TABLE 10.2 Tests for the Set X of Table 10.1

Group 1 tests

EDF and regression statistics applied directly to the X-values

X̄ = 121.3, S.D.(X) = 154.3, Coefficient of Variation = 1.272

Direct EDF test statistics (Section 10.8.1): β̂ = X̄ = 121.2; n = 15

Statistics followed by approximate significance levels p, when p < 0.10:

D⁺ = 0.277, D⁻ = 0.132, D = 0.277 (p = 0.04), V = 0.409

W² = 0.219 (p = 0.05), U² = 0.170 (p = 0.04), A² = 1.163 (p = 0.075)

D̂ = 0.292 (p = 0.04), S* = 1.9970 (p = 0.045)

Regression statistics and p-values for set X (Section 10.8.2)

W_E = 0.0384 (p = 0.04, lower tail, so p = 0.080, 2-tail)

W_S = 0.0397 (p = 0.075, lower tail, so p = 0.15, 2-tail)

J = 2.039

R(X,m) = .958, Z = 15{1 - R²(X,m)} = 1.24 (p = .25)
R(X,H) = .950, Z = 15{1 - R²(X,H)} = 1.47 (p = .15)

Values of Dahiya-Gurland statistics (Section 10.8.3)

Q1 = 0.977 (p = 0.32)
Q2 = 3.384 (p = 0.18)
Q3 = 5.081 (p = 0.16)

E 10.8.3 Example

Values of EDF statistics for the X-data of Table 10.1 are given in Table 10.2,
Part 1. For these data β̂ = X̄ = 121.2; thus, following Section 4.9, we
calculate

  Z(1) = 1 - exp(-12/121.2) = 0.094,

and so on. The values of Z are given in Table 4.13, column 3. Then equations
(4.2) give D⁺ = 0.277 and the other values given in Table 10.2. Some
p-levels of the statistics (used with upper tail only) are recorded. Several
are significant at the 5% level, suggesting rejection of H0. D̂ and S*
(Section 10.8.1) are also given.
Values of regression statistics are given in Part 2 of Table 10.2. The
values of R(X,m) and R(X,H) (Section 5.11) are, respectively, 0.958 and
0.950; then Z(X,m) in Section 5.11.2 is 15{1 - (0.958)²} = 1.24 and Z(X,H)
is 15{1 - (0.950)²} = 1.47. These are, respectively, significant at p = 0.25
and p = 0.15 when referred to Tables 5.6 and 5.7. The value of the
Shapiro-Wilk statistic W_E, which is designed for use when the origin a is
not known, is included for comparison. In Part 3 of Table 10.2 are given the
values of the Dahiya and Gurland statistics Q1, Q2, and Q3. The p-values
have been found from the Currie-Stephens tables.

10.9 GROUP 2 TESTS, APPLIED TO U = JX

10.9.1 Tests Based Directly on the U-Values

As was stated before, a number of tests for a series of events are based on
making transformation J and then testing that the n - 1 values U are U(0,1).
Because these tests have been suggested in this connection they are reviewed
here; there will necessarily be some overlap with Chapter 8.
Important Note: In Chapter 8 it is natural to assume that the U set has
n values; when tests of Chapter 8 are applied to set U (or to set U' after
the K transformation) the value n must be replaced by m (= n - 1) in the
formulas for test statistics, and in using the tables.
a) EDF tests. EDF tests (Case 0) can be used on the U-set, as described
in Chapter 8. Statistics D⁺ and D⁻ are well adapted to detect a shift of U
toward 0 or 1, that is, to detect trend in events (Section 10.6.2). W² and
A² can also be expected to be effective for these alternatives. Notice,
however, that as customarily used (employing only the upper tail of test
statistics for significance) EDF statistics will not detect superuniform U;
thus they will not detect periodicity in events (Section 10.6.2), or the
occasions when the J transform can produce superuniformity (Section 10.6.2),
unless test statistics are referred to the lower tail of the relevant null
distribution.

b) The statistic Ū. A simple statistic for testing uniformity is the mean
Ū = Σ_{i=1}^{n-1} Ui/(n - 1). Percentage points for Ū are given in Table 8.5;
the table must be entered at sample size m = n - 1. For m > 15, the quantity
P = (Ū - 0.5)(12m)^{1/2} will have approximately the standard normal
distribution.
c) Statistics based on U(r). The order statistic U(r) has a beta
distribution (Section 8.8), and the function of U(r) given by
Z_r = (n - r)U(r)/{r(1 - U(r))} has an F_{p,q} distribution with p = 2r and
q = 2(n - r) degrees of freedom. In particular the median Ũ, where r is n/2
or (n + 1)/2, has been proposed as a test statistic for uniformity.
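Both statistics are simple to compute; a sketch (our own helper names, with the normal approximation for the mean and the F-ratio for an order statistic):

```python
import math

def u_bar_z(u):
    """Mean uniformity test: returns (Ubar, P) where
    P = (Ubar - 0.5) * sqrt(12 m) is approximately N(0,1) for m > 15."""
    m = len(u)
    ubar = sum(u) / m
    return ubar, (ubar - 0.5) * math.sqrt(12.0 * m)

def order_statistic_f(u, r):
    """Z_r = (n - r) U(r) / {r (1 - U(r))}; on H0 this has the F
    distribution with 2r and 2(n - r) df (here len(u) = n - 1)."""
    n = len(u) + 1
    ur = sorted(u)[r - 1]
    return (n - r) * ur / (r * (1.0 - ur))
```

For a perfectly symmetric uniform-looking sample the mean statistic is near zero and the median F-ratio near one, so neither flags a departure.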

10.9.2 Application to Tests for Trend

When the model for trend in a series of events is λ(t) = ce^{kt}, as discussed
in Section 10.6.2, the values U(i), i = 1, ..., n - 1, instead of being
ordered uniforms, will be an ordered sample from the density

TABLE 10.3 Tests for the Set X of Table 10.1

Group 2 tests

J transformation followed by tests for uniformity on 14 values of U

Values of Test Statistics

D⁺ = 0.180, D⁻ = 0.105, D = 0.180, V = 0.285

W² = 0.106, U² = 0.059, A² = 0.729, S* = 1.466

None of the above is significant at the 20% level upper tail, or the 15%
level lower tail.

Statistics Ū = 0.442, U(7) = 0.447, U(8) = 0.463, Neyman N₂ = 0.722

None of the above is significant at the 20% level (1 tail), or the 40% level
(2 tails).

R(U,m) = 0.978; Z = 14{1 - R²(U,m)} = 0.61 (p > 0.50)

Statistics based on the 15 spacings Di

Moran M = 19.232, M/c = 16.329 (p = 0.30 in the upper tail of χ²₁₄)

Greenwood G(14) = 0.167, 14G(14) = 2.338 (p = 0.075, upper tail)

Kendall-Sherman K = 0.470

Quesenberry-Miller Q = 0.193 (p = 0.35, upper tail)

Lorenz L14(.5) = 0.106 (p = 0.1, lower tail)


  f(u) = k e^{ku} / (e^k - 1),   0 < u < 1

Cox (1955) suggested that the test for k = 0 (Poisson process) against k ≠ 0
should be based on Ū, which is the likelihood ratio statistic. Large values
of Ū will indicate k > 0, that is, events are occurring more rapidly with
increasing time and the U(i) are tending to drift toward 1; similarly, low Ū
indicates that events are happening less often, and the U(i) are moving
toward zero. Thus a one-tail test is used if the direction of trend is known,
but in general a two-tail test is required. The median Ũ will also detect
movements of the U-values toward 0 or 1. Note that neither the mean Ū nor
the median Ũ will detect superuniform observations, nor, in general, the
case where there is excessive variation among the intervals between events.

E 10.9.2 Example

Table 10.3, Part 1, shows the values of EDF statistics calculated, following
Section 4.4, from the 14 values in the U-set. Also shown are the values of
U(7), U(8), and Ū. On H0, U(7) has the beta(x; 7,8) distribution
(Section 8.8), and U(8) has the beta(x; 8,7) distribution. Tables of this
distribution, and Table 8.5 for Ū, give the approximate p-levels shown. The
correlation R(U,m) (Section 5.6) is 0.978, with a p-value greater than 0.5
(Table 5.2); note that R(U,m) has weaknesses as a test statistic
(Section 5.6).

10.9.3 Statistics Based on the Spacings Between the Ui

The spacings between the U-set are defined by

  Di = U(i) - U(i-1),  i = 1, ..., n,  where U(0) ≡ 0 and U(n) ≡ 1;

thus n - 1 uniforms give n spacings. The spacings are connected with the
original observations Xi by Di = Xi/Tn, i = 1, ..., n, where
Tn = Σ_{j=1}^n Xj. Basic articles for work on spacings are by Pyke (1965,
1972). Many test statistics for exponentiality have been based on the values
Xi, divided by Tn to eliminate the scale parameter β. These statistics are
therefore calculated, in effect, from the values Di, and the associated tests
can be regarded as tests for uniformity of the set U, based on the spacings.
Test statistics of this type are discussed both in this chapter and also in
Chapter 8.

10.9.3.1 Greenwood's Statistic

The first spacings statistic which we discuss was introduced by Greenwood
(1946) in connection with tests on a series of events; specifically, on the
incidence of a contagious disease. The statistic here is G(n - 1) = Σ Di²,
the argument n - 1 referring to the fact that G(n - 1) is calculated from
n - 1 uniforms, giving n spacings. For use in the present application we
have

  G(n - 1) = Σ_{i=1}^n Di²

To make a test, (n - 1)G(n - 1) is referred to Table 8.3 using the percentage
points for sample size m = n - 1. Small values of G(n - 1) will detect
superuniform values Ui, or excessively regular spacings between events, such
as would occur if they were periodic. Large values of G(n - 1) will detect if
the intervals are too disperse, for example, if the long intervals are too
long compared with the short intervals.

E 10.9.3.1 Example

From the Di of Table 10.1, G(14) = (.041)² + (.031)² + ⋯ + (.179)² = .167.
Thus 14G(14) = 2.34, and reference to Table 8.3 shows this to be significant
at p = 0.075, upper tail.

10.9.3.2 Equivalence of Greenwood's Statistic and Other Statistics

Greenwood's statistic will detect unusual dispersion of the spacings. The
mean value of each spacing is 1/n, so the dispersion could be measured by

  V = Σ_{i=1}^n (Di - 1/n)²

a statistic studied by Kimball (1947). It is easily shown that
V = G(n - 1) - 1/n. Also, in terms of the original Xi, V can be written

  V = Σ_{i=1}^n (Xi - X̄)² / Tn² = S²/(nX̄)²

where S² = Σ_{i=1}^n (Xi - X̄)². Moments of the X set are m1 = X̄ and
m'2 = S²/n, and the sample coefficient of variation CV is √(m'2)/m1; thus V
is m'2/(n m1²) = CV²/n. Note also that Q1 of Section 10.8.3 is calculated
from CV². To summarize, the following relations hold:

  G(n - 1) = V + 1/n = CV²/n + 1/n

and

  Q1 = n{nG(n - 1) - 2}²/4

Thus V, CV, and Q1 are all equivalent to G(n - 1) as test statistics. Use of
the upper tail of Q1 is equivalent to a two-tail test based on G(n - 1); the
two tails contain unequal probabilities for finite n, converging slowly to
equal probabilities as n increases.
Furthermore, V is the same as the regression statistic W_E0 (Section
5.11.5); Stephens' W_S (Section 5.11.5) is also related to G(n - 1) by

  (W_S)^{-1} = n(n + 1){G(n - 1)} - n

Tests using the upper tail of G(n - 1) or of W_E0 are equivalent to tests
using the lower tail of W_S, and vice versa.
Thus, several statistics which have been derived from very different
approaches all turn out to be equivalent to Greenwood's G(n - 1).
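The identities relating Greenwood's statistic, Kimball's V, and the coefficient of variation are easy to verify numerically; a small sketch (our own code, with arbitrary data):

```python
def spacings(x):
    """Spacings D_i = X_i / T_n from the raw intervals."""
    t = sum(x)
    return [xi / t for xi in x]

def greenwood(d):
    """Greenwood's statistic G(n-1) = sum of squared spacings."""
    return sum(di * di for di in d)

# Numerical check of G(n-1) = V + 1/n = CV^2/n + 1/n
x = [3.0, 1.0, 4.0, 1.5, 5.0]
n = len(x)
d = spacings(x)
g = greenwood(d)
v = sum((di - 1.0 / n) ** 2 for di in d)          # Kimball's V
xbar = sum(x) / n
cv2 = sum((xi - xbar) ** 2 for xi in x) / n / xbar ** 2   # squared CV
```

Both identities hold to rounding error for any positive data set, since they are algebraic, not distributional, facts.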

10.9.3.3 Other Spacings Statistics

A number of statistics have been devised which are directly related to
Greenwood's statistic; most of these have been discussed in connection with
testing for uniformity and are included in Chapter 8. The Quesenberry-Miller
statistic Q (Section 8.9.2) might be useful in detecting autocorrelation in a
series of events; so, also, might statistics based on high-order gaps
(Section 8.9.4). We now continue with four tests based on spacings which
have been developed specifically in connection with tests for exponentiality
on X or X'. They are defined in terms of both Di and the original Xi.

10.9.3.4 Statistic M (Moran, 1951)

This statistic is

  M(n - 1) = -2 Σ_{i=1}^n log(nDi)

           = -2 Σ_{i=1}^n log(Xi/X̄)

           = -2{Σ_{i=1}^n log Xi} + 2n log X̄

When X is Exp(0, β), the distribution of 2X/β is χ²₂ (Result 4 of Section
10.3.1); thus the Xi can be regarded as sample variances from normal
samples with true variance β, based on two degrees of freedom. M(n - 1) is
then equivalent to Bartlett's (1934) statistic to test that such samples come
from populations with the same variance; on H0, the distribution of
M(n - 1)/c, where c = 1 + (n + 1)/(6n), is approximately χ² with n - 1
degrees of freedom. As a general test for exponentiality, M(n - 1) is
two-tailed.
Moran (1951) showed that M(n - 1) is the asymptotically most powerful
test against gamma alternatives (see also Shorack, 1972), and Bartholomew
(1957) showed it to be a strong test against the Weibull alternative. Cox and
Lewis (1966, Chapter 6), Bartholomew (1957), and Jackson (1967), among
other authors, have referred to the effect on M(n - 1) of inaccurate
measurement of the values Xi, particularly for small values; a small
inaccuracy in Xi produces a big error in log Xi. Difficulties due to small or
zero values are discussed in Section 10.10. Bartholomew (1956) has based a
sequential test for exponentiality on M(n - 1).
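A sketch of the computation (our own code; note the warning above that very small X-values make log Xi unstable, and see Section 10.10 for zero values):

```python
import math

def moran_statistic(x):
    """M(n-1) = -2 * sum(log(Xi / Xbar)); on H0, M/c is approximately
    chi-square with n - 1 df, where c = 1 + (n + 1)/(6n)."""
    n = len(x)
    xbar = sum(x) / n
    m = -2.0 * sum(math.log(xi / xbar) for xi in x)
    c = 1.0 + (n + 1.0) / (6.0 * n)
    return m, m / c
```

M(n - 1) is zero when all observations are equal (perfectly regular spacings) and grows with dispersion of the Xi about their mean, hence the two-tailed use described above.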

10.9.3.5 The Kendall-Sherman Statistic

Kendall (1946) suggested a statistic for testing the randomness of events in
time. This is

  K(n - 1) = (1/2) Σ_{i=1}^n |Di - 1/n|

so that K(n - 1) is based on a comparison of all the Di with the common
expected value 1/n. Another form of K is

  K(n - 1) = Σ_{i=1}^n |Xi - X̄| / (2nX̄)

This statistic, introduced at about the same time as Greenwood's G(n - 1),
has many similar properties. It measures the dispersion of the Di, and small
values will detect superuniform Ui, or periodicity in events. The statistic
K(n), that is, derived from n uniforms, and n + 1 spacings, was discussed
by Sherman (1950, 1957), who gave its null distribution, moments, and
upper-tail percentage points for n ≤ 20. Bartholomew (1954) fitted an
F-approximation to the null distribution of a function of K(n).
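The two forms of K are algebraically identical, which a short sketch (our own code) makes easy to check:

```python
def kendall_sherman(x):
    """K(n-1) = sum|Xi - Xbar| / (2 n Xbar), the raw-data form."""
    n = len(x)
    xbar = sum(x) / n
    return sum(abs(xi - xbar) for xi in x) / (2.0 * n * xbar)

def kendall_sherman_spacings(x):
    """The same statistic as (1/2) sum|Di - 1/n|, via Di = Xi / Tn."""
    n = len(x)
    t = sum(x)
    return 0.5 * sum(abs(xi / t - 1.0 / n) for xi in x)
```

The equality follows from Di - 1/n = (Xi - X̄)/(nX̄), so both routines return the same value for any positive sample.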

10.9.3.6 EDF Tests for Spacings

When there are n - 1 values Ui, giving rise to n spacings Di, the marginal
distribution of any one spacing is

  F_D(x; n) = P(Di ≤ x) = 1 - (1 - x)^{n-1},  0 < x < 1

This is a fully specified distribution, and it might be thought that EDF
tests, Case 0 (Section 4.4), could be made. The Probability Integral
Transformation would be Z(i) = 1 - (1 - D(i))^{n-1}, where the D(i) are the
ordered spacings, and EDF statistics could be calculated from the Z(i). In
particular, suppose D̂ is the Kolmogorov statistic. Because the spacings are
not independent (their sum is 1), the Z(i) are not ordered uniforms, so
Case 0 tables cannot be used. However, fortuitously, the Z(i) are exactly
those which arise in Srinivasan's test (Section 10.8.1), and D̂ will be the
same as D̂ of that section, and will be referred to the tables referenced
there.

10.9.3.7 Test Based on the Lorenz Curve

Let p be a value between 0 and 1, and let r = [np], that is, the greatest
integer less than or equal to np. The Lorenz curve statistic, derived from
the n ordered spacings D(i), is

  L_n(p) = Σ_{i=1}^r D(i)

Gail and Gastwirth (1978) proposed L_n(0.5) as a test statistic for H0, and
they gave tables for a two-tail test, for values 2 ≤ n ≤ 40, and a normal
approximation for larger values of n. They also gave values of the
asymptotic relative efficiency (ARE) of this test, compared with that based
on using the maximum likelihood estimate of the shape parameter a, for both
gamma and Weibull alternatives, and some power studies.
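A minimal sketch of L_n(p) (our own code): the statistic is just the sum of the r smallest spacings, so perfectly regular spacings give L_n(0.5) = 0.5, its largest possible value.

```python
def lorenz_statistic(x, p=0.5):
    """Gail-Gastwirth Lorenz statistic L_n(p): the sum of the
    r = [np] smallest spacings D(i), where Di = Xi / Tn."""
    n = len(x)
    t = sum(x)
    d = sorted(xi / t for xi in x)
    r = int(n * p)          # greatest integer <= np (p >= 0)
    return sum(d[:r])
```

Small values of L_n(0.5) indicate that the smallest spacings carry very little of the total, i.e. wide disparity among the intervals.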

E 10.9.3.7 Example

Part 2 of Table 10.3 shows the values of some of the above statistics, based
on the 15 spacings Di given in Table 10.1. For example, Moran's statistic is
M(14) = -2[log(15 × .041) + log(15 × .031) + ⋯ + log(15 × .179)] = 19.232;
c is then 1 + 16/90 = 1.178, so M/c = 16.329. This must be compared to χ²₁₄,
to give a p-level of 0.30, approximately.

10.10 THE EFFECT OF ZERO VALUES, AND OF TIES

It may be that a value of X is recorded as zero; if this is so, the value of
A² in the Group 1 EDF tests, and the value of Moran's statistic M(n - 1) in
the previous section, will become infinite and H0 will automatically be
rejected. Clearly, if a set X to be tested to be Exp(0, β) contains one or
more values which are recorded as zero, the reason is that a true small value
has been rounded to zero, and a correction can be applied. Suppose the
rounding interval is d; for the Moran statistic, Gail and Ware (1978) have
shown that an adequate correction is to replace the zero by d/4, for d up to
0.2 times the mean of the exponential distribution. Thus for a mean life of 5
(say hours), the values should be recorded to at least the nearest hour, and
then a zero would be replaced by 0.25. In many practical situations,
measurements will be made to at least this level of accuracy, and then no
correction will be needed.
The problem arises again if there are ties in the X-set (as there are in
Table 10.1), and if tests are to be applied to the transformed values X' or
U'. This is because two equal values in the X-set give a zero in the X' set.
Pyke (1965) has given a correction to separate two X-values recorded as
equal. Nevertheless, even if corrections are used for zero values of X or
X', significant values of A² or M(n - 1) should be examined carefully to see
if they are due only to one or two excessively small values in the X or X'
set, and, if so, why these are so small.

10.11 GROUP 3 TESTS APPLIED TO X' = NX, OR TO U' = KX

10.11.1 Introduction

Clearly, after transformations N or K, tests of H0 for X, such as those
given above, may equally be applied to X', and tests for uniformity for U
can be applied to U'. Good reasons exist for making these transformations,
particularly when the original X are lifetimes. These come from the results
in Section 10.6.3, namely, that if the distribution of X is not exponential
but is IFR, the X'i are stochastically decreasing with i, while if X is DFR
the X'i are stochastically increasing with i. These properties have led to
further tests being proposed in connection with lifetime data X. The new
tests are functions of X'i/T'n, that is, of the spacings D'i between the U'
set, defined as were the Di from Ui in Section 10.9.3. Tests on U' are
discussed in the next section, followed by the new group of tests based
on D'.

10.11.2 Direct Tests for Exponentiality on the Transformed X'

EDF and regression tests, applied directly to set X', have not been much
emphasized, perhaps because the X' would first be ordered, and the
information given by the indexing of the X' is then lost. Other tests on the
X' themselves have been proposed by Epstein (1960a, 1960b). These make use
of Result 4 of Section 10.3.1, but now applied to X'; on H0, yi = 2X'i/β has
the χ²₂ distribution, and the yi are independent. Therefore ratios of
independent sums of yi, times a constant, have the F distribution. Epstein
has suggested tests based on such partial sums, to test if the value of β
has changed, or to test if the time to the first failure is significantly
longer or smaller than expected if all failure times come from the same
exponential distribution. The times yi can also be divided into groups, and
the several groups tested to see if they have a common β; for example,
Bartlett's test that several normal samples have the same population
variance, or other well-known tests of this hypothesis, can be adapted to
make a test for common β. When the yi values are divided into only two
groups, Y1 = Σ_{i=1}^r yi and Y2 = Σ_{i=r+1}^n yi, it is easily seen that
tests based on the ratio Y1/Y2 are equivalent to tests based on U'(r), to be
discussed below. As can be seen, much of the emphasis in Epstein (1960a,
1960b) is on tests for β; however, there is a fine line between such tests
for a parameter and tests of fit, and several of Epstein's illustrations may
be viewed as tests of fit.

E 10.11.2 Example

Values of EDF statistics and regression statistics, calculated from the X'
given in Table 10.1, are given in Table 10.4.

TABLE 10.4 Tests for the Set X of Table 10.1

Group 3 tests

Statistics calculated from set X'

Mean = 121.3, S.D. = 138.6, coefficient of variation = 1.143

Direct EDF test statistics: D⁺ = 0.167, D⁻ = 0.082, D = 0.167,
V = 0.250, W² = 0.050, U² = 0.041, A² = corrected A² = 0.462 (see
Section 10.10), S* = 1.151, D̂ = 0.177

None of the above is significant at the 25% level, upper tail; W² and U² are
significant at approximately the 25% level, lower tail.

Regression statistics

W_E = 0.058 (p = 0.20 lower tail), R(Z,m) = 0.98, Z = 0.54 (p > 0.50)

W_S = 0.049 (p = 0.12 lower tail), R(Z,H) = 0.97, Z = 0.90 (p = 0.35)

J = 1.957

Greenwood G(14) = .148; 14G(14) = 2.071 (p > 0.10 upper tail)

Dahiya-Gurland Q2 = 0.569 (p = 0.48); Q3 = 0.680 (p > 0.50)

10.11.3 Tests Based on the U' Values

Since on H0 the n - 1 U'i values should be uniform U(0,1), tests can be
based on testing this hypothesis concerning U'. For IFR alternatives, the
U'i move toward 1, and for DFR alternatives, they move toward zero. EDF
statistics (Case 0), or the statistics Ū' or U'(r) for some r, might be
expected to be useful in detecting such alternatives. Of the EDF statistics,
D⁺ will be significant when the U'i move near zero, and D⁻ when they move
near 1. Statistics W² and A², and to a lesser extent D, will detect either
of these alternatives.

E 10.11.3 Example

Values of EDF statistics based on the U' derived from the data set X in
Table 10.1 are given in Table 10.5. The statistics are found by using U'(i)
in equations (4.2), and p-values are found from Tables 4.2.1 and 4.2.2.

10.11.3.1 Tests Based on Ū' or U'(r)

The simplest test for the uniformity of the U'-set is based on Ū', the mean
of the n - 1 values, or equivalently, on their sum S = (n - 1)Ū'; this
statistic was suggested by Lewis (1965). Some algebra will show that, in
terms of the original X-values,

  S = 2n - 2 Σ_{i=1}^n i X(i) / Tn

Ū' will tend to be large for an original IFR sample, and to be small for a
DFR sample. Thus Ū' or S can be used as a one-tail test to guard against
alternatives with IFR or DFR, but as a statistic against unknown or general
alternatives it will be two-tailed. Percentage points for Ū' = S/(n - 1) are
given in Table 8.5; the table must be entered for sample size m = n - 1.
For m > 15, (Ū' - 0.5)(12m)^{1/2} will have approximately the standard
normal distribution.
Lewis (1965) also suggested the statistic U'(r) as test statistic, with r
a suitable integer. The statistic given by Z_r = (n - r)U'(r)/{r(1 - U'(r))}
has, on H0, the F_{p,q} distribution with p = 2r and q = 2(n - r) degrees of
freedom. A commonly suggested statistic is the median U'(r), with
r = (n + 1)/2 or n/2. For IFR alternatives, U'(r) can be expected to be
large, so that Z_r is significant in the upper tail of F_{p,q}; for DFR
alternatives Z_r will be significant in the lower tail. This statistic was
again examined by Gnedenko, Belyayev, and Solovyev (1969), by Fercho and
Ringer (1972), and by Tiku, Rai, and Mead (1974); the statistic y of Tiku,
Rai, and Mead (1974, Section 4), designed for testing H0, is equivalent
to U'(r).
In Table 10.5 are given values of Ū' and U'(7) and U'(8) for the U' set
derived from U of Table 10.1; p-values are found from Table 8.5 and the
beta(x; 7,8) and beta(x; 8,7) distributions (Sections 8.8.2 and 8.10.1).

TABLE 10.5 Tests for the Set X of Table 10.1

Group 3 tests

Statistics for uniformity calculated from the 14 values of U'

Statistics followed by approximate significance levels p in parentheses:

D⁺ = 0.374 (p = 0.015), D⁻ = 0.099 (p > 0.25)

D = 0.374 (p = 0.03), V = 0.473 (p = 0.025)

W² = 0.417 (p = 0.07), U² = 0.227 (p = 0.02)

A² = 1.894 (p = 0.10), S* = 2.433 (p = 0.09)

Statistics Ū' = 0.383 (p = 0.07 lower tail, p = 0.14 2-tail)

U'(7) (p = 0.12 lower tail)

U'(8) = 0.356 (p = 0.09 lower tail)

Neyman N₂ = 2.558 (p = 0.30)

R(U',m) = 0.90, Z = 2.62 (p < 0.01)

Statistics based on the 15 spacings D'i

Moran M = 25.579, M/c = 21.714 (p = 0.09 upper tail)
(a zero spacing corrected, see Section 10.10)

Greenwood G(14) = 0.148, 14G(14) = 2.072 (p > 0.10 upper tail)

Quesenberry-Miller Q = 0.237 (p = 0.05 upper tail)

S*1 = -0.60; S*2 = 1.20; S*3 = 0.24 (Section 10.11.4)

10.11.4 Tests Based on the Spacings Between the U' Set

All the tests in Section 10.9.3 for uniformity of U, based on the spacings
D, can of course be applied to the new spacings D' calculated from the U':
D'i = U'(i) - U'(i-1), i = 1, ..., n, with U'(0) ≡ 0 and U'(n) ≡ 1. Some new
statistics have also been proposed for the set X', based essentially on
the D'i.

10.11.4.1 The "Cumulative Total Time on Test" Statistic

The total time on test statistic was defined in Section 10.5.3 as
T'r = Σ_{i=1}^r X'i. Suppose, for given k, we define

  V_k = {Σ_{r=1}^{k-1} T'r} / T'n

V_k is called the k-th cumulative total time on test statistic. When k = n,
V_n = Σ_{r=1}^{n-1} U'(r), since U'(r) = T'r/T'n. Thus V_n = (n - 1)Ū'.
Another formula for V_n is

  V_n = Σ_{i=1}^n (n - i)X'i / T'n = n - Σ_{i=1}^n iX'i / T'n

In terms of the spacings D'i, this is

  V_n = Σ_{i=1}^n (n - i)D'i = n - Σ_{i=1}^n iD'i
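The two forms of V_n can be cross-checked numerically; a sketch (our own code, with an arbitrary set of normalized spacings X'):

```python
def ttt_vn(x_prime):
    """V_n = sum_{i=1}^{n} (n - i) X'_i / T'_n, the n-th cumulative
    total time on test statistic; equals (n - 1) times the mean U'."""
    n = len(x_prime)
    tn = sum(x_prime)
    return sum((n - (i + 1)) * xi for i, xi in enumerate(x_prime)) / tn

# cross-check: V_n = sum of the n - 1 values U'(r) = T'_r / T'_n
xp = [2.0, 1.0, 3.0, 2.0]
tn, s, cum = sum(xp), 0.0, []
for xi in xp:
    s += xi
    cum.append(s)
vn_direct = sum(c / tn for c in cum[:-1])
```

For these four values both routes give V_n = 11/8, illustrating that the spacings form and the cumulative form are the same statistic.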

Another group of tests, proposed fo r the set X* by Bickel and Doksum (1969)
includes

S*_1 = -Σ_{i=1}^{n} i X'_i / {(n + 1)T'_n} = -Σ_{i=1}^{n} i D'_i / (n + 1)

S*_2 = Σ_{i=1}^{n} X'_i H_i / T'_n = Σ_{i=1}^{n} D'_i H_i,  where H_i = -log(1 - i/(n + 1))

S*_3 = Σ_{i=1}^{n} X'_i (-log H_i) / T'_n = -Σ_{i=1}^{n} D'_i log H_i

Recall, in these formulas, that T'_n = Σ_{i=1}^{n} X'_i (Equation 10.7). Statistics S*_2 and S*_3 have a resemblance to the regression statistics of Chapter 5, but there is an important difference: the set X'_i are not ordered in the above formulas, and they would be for regression statistics.
Statistic S*_3 is asymptotically most powerful against Weibull alternatives for X, and statistics S*_1 and S*_2 against two other alternatives (the Makeham and linear failure rate alternatives) discussed by Bickel and Doksum. Other statistics may be derived using the properties of X'_i. For example, on average, X'_j < X'_i for j > i if the distribution is IFR, and a test could be based on
450 STEPHENS

the number of reversals, that is, the number of occasions when this inequality is realized, for all pairwise comparisons. Alternatively, values X'_i could be plotted against i, or against n - i + 1; the slope of the regression line would, if the original X were Exp(0, β), be zero, but if the X were from an IFR distribution, it would be negative for the first plot and positive for the second. (The slope has the same sign as Σ_i i D'_i - n/2.) Proschan and Pyke (1967) and Bickel and Doksum (1969) have also investigated tests based on the ranks of the X'_i.
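A small Python sketch of these quantities (the spacings D'_i below are invented) computes S*_1, S*_2, and S*_3 as defined above, together with the pairwise reversal count just described:

```python
import math

dp = [0.3, 0.1, 0.25, 0.35]        # hypothetical spacings D'_i (they sum to 1)
n = len(dp)
H = [-math.log(1.0 - i / (n + 1.0)) for i in range(1, n + 1)]

s1 = -sum(i * d for i, d in zip(range(1, n + 1), dp)) / (n + 1)
s2 = sum(d * h for d, h in zip(dp, H))
s3 = -sum(d * math.log(h) for d, h in zip(dp, H))

# Reversal count: occasions with D'_j < D'_i for j > i (equivalently for the
# X', since each D'_i is proportional to X'_i).
reversals = sum(1 for i in range(n) for j in range(i + 1, n) if dp[j] < dp[i])
```

Under the null hypothesis S*_1 has mean close to -1/2, which is why the observed value -0.60 in Table 10.5 is only mildly extreme.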

E 10.11.4 Example

Values of the Greenwood G(14), Moran M, and Quesenberry-Miller Q, calculated from the D'_i, are given in Table 10.5. These tend to have lower p-levels than the corresponding statistics based on the D_i, in Table 10.3. The X' set, and hence the D' set, contains a zero, and the Moran statistic (also the Anderson-Darling A²) has been calculated using the correction suggested in Section 10.10; the zero value of X' has been given value 0.25, since the rounding interval is 1.

10.11.5 The Equivalence of Ū' and Other Test Statistics

Several of the statistics given in Section 10.11.4 are equivalent to the statistic Ū' discussed in Section 10.11.3. Since V_n = (n - 1)Ū', V_n is the same as S of Section 10.11.3. Also, V_n and S*_1 are related by V_n = n + (n + 1)S*_1. Thus, V_n, S, and S*_1 are all equivalent to Ū' as test statistics.
Another statistic equivalent to Ū' is the Gini statistic G_n, discussed by Gail and Gastwirth (1975). G_n is related to the Lorenz curve discussed in Section 10.9.3.8 above, and, like the Lorenz curve, derives from concepts used in economics. The Gini index for a distribution is twice the area between the population Lorenz curve y = L(p) and the line y = p. The Gini statistic G_n derived from this index can be calculated in two ways. In terms of the original X_i, G_n is

G_n = Σ_{i=1}^{n} Σ_{j=1}^{n} |X_i - X_j| / {2(n - 1)T_n};

this may be shown to be the same as

G_n = Σ_{i=1}^{n-1} i X'_{i+1} / {(n - 1)T'_n} = Σ_{i=1}^{n-1} i D'_{i+1} / (n - 1).

Singpurwalla has shown (see Gail and Gastwirth, 1975) that G_n = 1 - Ū', so that G_n too is equivalent to Ū' as a test statistic.
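The two formulas for G_n and Singpurwalla's identity can be verified numerically; the Python below is a sketch with an invented sample:

```python
x = [1.2, 0.4, 2.7, 0.9, 1.8]      # hypothetical original observations X_i
n = len(x)
Tn = sum(x)

# First formula: double sum of |X_i - X_j| over all pairs (i, j).
g1 = sum(abs(a - b) for a in x for b in x) / (2.0 * (n - 1) * Tn)

# Second formula: via the normalized spacings X'_i = (n - i + 1)(X_(i) - X_(i-1)).
xs = sorted(x)
xprime = [(n - i) * (xs[i] - (xs[i - 1] if i > 0 else 0.0)) for i in range(n)]
g2 = sum(i * xprime[i] for i in range(1, n)) / ((n - 1) * Tn)

# Singpurwalla's identity G_n = 1 - U'bar, with U'_(r) = T'_r / T'_n.
T = [sum(xprime[:r + 1]) for r in range(n)]
ubar = sum(T[r] / Tn for r in range(n - 1)) / (n - 1)
```

Both formulas give the same value, and it equals 1 - ubar; note also that the normalized spacings sum back to T'_n = T_n.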

10.12 DISCUSSION OF THE DATA SET

The values of all the various statistics can now be used to give an overall assessment of the X-set in Table 10.1.
a) The direct tests on X in Group 1 (Table 10.2) point towards rejection of H₀: that the X are Exp(0, β), with the large value of D+ indicating that there are too many small values of X compared with large values. This implies a DFR population for X (see next section and Table 10.6). However, transformation J, from X to U, gives little information, although Greenwood's statistic is quite large (0.167), implying high dispersion among X-values.
b) The values X are lifetimes, and the discussion in this chapter suggests that tests on X' and on U' will be informative. The high value of D+ for the EDF tests on U' (Table 10.5) means that the U' set tends toward zero, and this is confirmed by the low values of Ū', U'_(7) and U'_(8). These low values arise because the normalized spacings X'_i are, on the whole, increasing with i, implying that the original X are from a DFR distribution (Section 10.6.3). Thus the Group 1 tests and the tests on U' give supporting conclusions.
c) The indexing of set X' is giving information about the parent population for X. Then Moran's or Greenwood's statistics found from the spacings D'_i, which are, in effect, symmetric functions of the X'_i (in which the indexing is lost), are not significant. Neither are direct EDF tests on X', based on first ordering the X'-set, so that the indexing is again lost.
d) The lack of significance of the statistics where indexing is lost in set X' indicates how the indexing can be important. Here, when the X' are only regarded as a random sample, as in c) above, they are acceptably exponential, and one would then accept that the original X-set is exponential; however, when the indexing of the X'_i gives information, as it does in the tests on U', the indications are that the X-set comes from a DFR distribution.

10.13 EVALUATION OF TESTS FOR EXPONENTIALITY

10.13.1 Omnibus Tests

The author (Stephens, 1978, 1986) has conducted a large power study on the various test procedures for H₀, using Monte Carlo samples of sizes n = 10 and n = 20 from a wide range of alternatives. Tables 10.6 and 10.7 give a selection of these results for n = 20. These permit some comparisons between tests, applied to a general random sample, when the special considerations of preserving indexes, etc., are not important.
TABLE 10.6 Power Studies for Tests of Exponentiality, Origin Known (n = 20)

omnibus tests: Power results


Group 1: EDF tests on X | Group 1: sample moments | Group 2: based on U | Group 3: EDF tests on U' | Group 3
D+ D- D V W² A² | D S* J | Q1 Q2 Q3 | M G K L | D+ D- D V W² U² A² | Ū' U'_(n/2)

IFR
Alternatives

I 70 55 51 63 55 62 49 64 56 51 42 35 71 57 57 61 I 67 54 34 57 35 60 61 25

U(0,1) 16 82 69 79 81 75 79 63 83 93 85 69 54 56 87 72 71 0 80 75 50 80 48 89 83 73

Weib (1.5)^a 1 70 55 51 61 55 59 50 65 64 58 45 34 67 63 62 62 0 71 56 30 61 31 61 64 36

I 43 29 26 33 28 28 23 32 37 29 18 11 27 32 33 29 I 38 28 16 30 17 31 33 27

DFR
Alternatives

69 I 57 69 62 51 75 61 62 53 51 67 72 80 47 6 70 64 I 59 40 65 40 75 65 47

Weib (0.8)^a 39 3 21 21 26 22 35 30 27 29 27 36 40 32 25 33 34 42 1 30 19 33 22 34 34 26

lognor (1)^b 18 18 21 22 23 35 23 22 24 23 24 24 23 14 25 19 12 33 6 23 30 23 33 22 18 18

½Cauchy^d 73 2 66 59 69 61 69 70 75 77 79 79 56 73 72 61 79 1 73 66 74 67 72 72 63

^a Weib (m) refers to density (10.4) with β = 1.
^b lognor (m) refers to density f(x) = const exp{-(log x)²/(2m²)}, x > 0.
^c ½N: X is |Y|, where Y = N(0,1).
^d ½Cauchy: X is |Y|, where Y has the Cauchy distribution, median 0.
TABLE 10.7 Power Studies for Tests of Exponentiality, Origin Known (n = 20)

5% one-sided tests: Power results


Group 1: EDF tests on X | Group 1: sample moments | Group 2: based on U | Group 3: EDF tests on U' | Group 3
D+ D- D V W² A² | D S* J | Q1 Q2 Q3 | M G K L | D+ D- D V W² U² A² | Ū' U'_(n/2)

IFR
Alternatives
Significant tail L L L L U U U

0 53 39 36 47 41 43 32 47 56 24 10 5 71 57 57 61 0 52 37 21 44 20 42 61 25
x|
U(0,1) 2 67 54 66 66 61 64 48 73 93 60 27 12 56 87 72 71 0 75 63 35 68 34 82 83 72

Weib (1.5)^a 0 53 38 33 46 39 42 33 51 64 27 10 5 67 63 62 62 0 52 39 18 44 19 47 64 34

0 31 17 17 23 20 19 14 23 37 10 3 I 27 32 33 29 0 28 18 8 18 8 22 33 27

DFR
Alternatives
Significant tail U U U U L L L

60 0 45 39 54 44 71 51 52 53 45 56 61 80 47 66 70 62 0 50 31 55 30 69 65 47
χ²₁
Weib (0.8)^a 24 0 16 13 20 15 26 22 21 29 23 30 31 32 25 33 34 30 1 22 14 23 13 27 34 26

lognor (1)^b 11 11 13 15 15 16 14 15 15 23 19 18 18 14 25 19 12 14 9 15 21 14 20 13 18 18

½Cauchy^d 65 1 58 52 65 53 62 61 61 75 72 74 75 56 73 72 61 73 1 66 62 70 62 68 7 63

^a Weib (m) refers to density (10.4) with β = 1.
^b lognor (m) refers to density f(x) = const exp{-(log x)²/(2m²)}, x > 0.
^c ½N: X is |Y|, where Y = N(0,1).
^d ½Cauchy: X is |Y|, where Y has the Cauchy distribution, median 0.

Roughly speaking, for a sample as large as n = 20, a statistic falls into one or other tail (called the significant tail) according to whether the parent population is IFR or DFR. This is, therefore, a natural way to divide the alternatives; it also coincides with C_v < 1 (IFR) and C_v > 1 (DFR) for most populations.
We first look for good omnibus tests, that is, tests which will declare significance for the whole range of (IFR and DFR) alternatives. Statistics to be compared are: Group 1, using the upper tail for significance for EDF and Gurland-Dahiya statistics calculated from the X; Group 2, statistics derived from U, using two tails for the statistics Moran M, Greenwood G, Kendall-Sherman K, and Lorenz L, because samples from IFR alternatives are likely to be significant in one tail (superuniforms U) and those from DFR alternatives in the other tail; and Group 3, EDF statistics (upper tail) and Ū' and U'_(n/2), two-tail. The Jackson statistic J in Group 1 is also two-tail; other correlation statistics treated in Chapter 5 cannot rightfully be compared because they assume unknown origin, except for W_E, which is equivalent to Greenwood's. Recall also that several other statistics are equivalent to Greenwood's (Section 10.9.3), several to Ū' (Section 10.11.5), and several to U'_(n/2) (Section 10.11.3).
It is obviously impossible to give best procedures against all alternatives, but some salient features emerge from the power results (Stephens, 1978, 1986).

(a) As omnibus tests against both IFR and DFR alternatives (Table 10.6), there is not much to choose between the following sets of statistics: Group 1, A² or W² or S* or J; Group 2, Moran M, Kendall-Sherman K, and Lorenz L; and Group 3, A² or W² or Ū'. Perhaps the most striking feature of the power results is how similar they are for these statistics. Any of them, at the preference of the user, should do well to provide omnibus tests.
    It is noteworthy that statistic Ū' has very high power; after transformation K, Ū' is, of course, easily calculated and has a null distribution which converges quickly to the normal. Note that statistic U'_(n/2), by contrast, often gives poor power.
(b) Transformation J gives good results when followed by two-tail statistics M, G, K, or L; recall that Ū', U'_(n/2), and EDF statistics (as usually used) cannot detect the possibility of superuniforms.
(c) For gamma and Weibull alternatives, Moran M is very good (often best, as expected), but loses power against some other alternatives; recall that there can be problems with small values. The Lorenz L and Kendall-Sherman K compare well with M overall. For these reasons M might be regarded as a "risky" statistic compared to others in (a) above (see the discussion in Section 10.10).

10.13.2 One-Sided or Directional Tests

Since many statistics have a significant tail for IFR alternatives and the opposite tail significant for DFR alternatives, such statistics can be used with one tail only if it is desired to guard against only one of these types of alternative. The tests must be used with care, since they will be biased against the other alternative family. Table 10.7 gives power comparisons when some statistics are used with one tail only; the size of the test is now 5%. For a fair comparison, statistics always used with one tail only, such as EDF or sample moment statistics, should now be compared for test size 5%. The significant tail of one-sided statistics is indicated in the table.
When the direction of the alternative is known, the Group 3 statistic Ū' is again effective, and now is overall better than A² in Group 3 or Group 1. The Greenwood, Sherman, and Lorenz statistics compare with Ū' but on occasion are less powerful; U'_(n/2) is poor in terms of power. Moran's statistic is again best of all against gamma and Weibull alternatives, but drops behind Ū' for other alternatives. Again, EDF statistics (Group 1) and EDF statistics (Group 3) show remarkably similar results. The power results reported form part of a larger study, and values are available (showing similar trends) for n = 10 and n = 50.

10.14 TESTS WITH ORIGIN AND SCALE UNKNOWN

We now return to tests of exponentiality for which both α and β in Exp(α, β) are unknown. There are several ways of dealing with two unknown parameters:
(a) In Section 10.3.2 it was shown how such a test situation could be reduced to a test of H₀, by making the transformation to m = n - 1 new variables Y_(i) = X_(i+1) - X_(1), i = 1, ..., m; H₀ may then be tested on the m values in set Y using any of the methods so far given. Several authors have suggested this as a way of dealing with the unknown α; for example, L(p) of Gail and Gastwirth (1978) is explicitly derived in this way, and statistic γ of Tiku, Rai, and Mead (1974, Section 2) can be derived by applying the K transformation to set Y to give n - 2 values U', and obtaining γ as statistic U'_(n-r-1), r = [n + 1]/2.
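Method (a) amounts to the simple reduction sketched below in Python (the sample values are invented); the set Y absorbs the unknown origin because each value is measured from the smallest observation:

```python
x = [3.1, 2.4, 5.0, 2.9, 4.2]                   # hypothetical sample, origin α unknown
xs = sorted(x)
y = [xs[i] - xs[0] for i in range(1, len(xs))]  # m = n - 1 values Y_(i) = X_(i+1) - X_(1)
```

Adding any constant to every X_i leaves the set Y unchanged, so a test of H₀ applied to Y does not depend on α.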
However, the transformation to Y may not be the best way to handle unknown α. It may be better, for example, to use any of the following methods:
(b) EDF tests on X, with α and β both estimated from the data, as described in Section 4.9.4.
(c) The Shapiro-Wilk statistic W_E given in Section 5.11.5.
(d) The correlation statistics R²(X, m) or R²(X, H) (Section 5.11.2).
Spinelli and Stephens (1987) have compared powers of these several techniques, using samples of size n = 20 and 10% tests, and have given a number of tables. Tiku, Rai, and Mead (1974) have also given power studies comparing their statistic with the Shapiro-Wilk W_E. Spinelli and Stephens

used only Group 1 EDF tests on set Y. From the various comparisons the following points emerge:

(1) EDF statistics on set Y are slightly less powerful than direct EDF statistics where α and β are both estimated, as in (b) above, except possibly for alternatives with a very high probability of small values (e.g., χ²₁ or Weibull (0.5)).
(2) The direct EDF statistics W² and A² (method (b) above) and the Shapiro-Wilk W_E give similar power results, but other correlation statistics have lower power.
(3) The results of Tiku, Rai, and Mead suggest that U'_(n-r-1), with r = [(n + 1)/2] (equivalent to their statistic T_E), has better power than has the median U'_(n/2) in tests of H₀ when α was known to be zero, but is still not as powerful overall as W² or A² in direct EDF tests. These two are, therefore, the recommended statistics for omnibus tests, since there is a problem of consistency with W_E (see Section 5.12).
(4) The significant tail for W_E or T_E is the same as that for W², and opposite to that for G. These statistics, and also D+ or D- for direct EDF statistics, can be used as one-tail tests on one-sided families of alternatives (DFR or IFR), with a consequent increase in power.
(5) Spinelli and Stephens showed that when α is known, it is best, on the whole, to use this fact, and therefore to apply the tests given earlier in the chapter. Note that the opposite effect has been observed in connection with EDF tests for normality (Section 4.16.2).

10.15 SUMMARY

From the plethora of tests and power results which have been given in this chapter, it would be useful to extract a simple strategy for the practical statistician to follow, but this is not easy. Perhaps we should summarize by simply repeating four themes which have surfaced throughout the chapter:
(a) Test statistics should be regarded as giving information about the data and their parent population, rather than as tools for formal testing procedures.
(b) If the data are intervals between events, and if the times of these events are known, the natural questions to ask will be more readily answered by converting these times to the U-set via the J-transformation, and looking at the configuration of the U.
(c) If the data are lifetimes, one must ask what alternative populations are of interest. Information on IFR or DFR parent populations can be deduced from the spacings between the X; this leads naturally to the N-transformation (which gives the normalized spacings) or the K-transformation, with tests to follow on the U'-set. This approach has been much advanced, especially as the T'_i which lead to the U' have the "total time on test" interpretation; however, very similar information is given by direct EDF tests on the original X,

or, for the important gamma or Weibull alternatives, and if the measurements are non-zero, by Moran's statistic.
(d) For data from other sources, referred to above as "a general random sample," it may still be of interest to classify possible alternative populations as IFR or DFR, which is roughly equivalent to shorter-tail or longer-tail, or to low-C_v or to high-C_v; then comments in (c) will still apply.

REFERENCES

Barlow, R. E., Bartholomew, D. J., Bremner, J. M., and Brunk, H. D. (1972). Statistical Inference Under Order Restrictions. New York: Wiley.

Bartholomew, D. J. (1954). Note on the use of Sherman's statistic as a test for randomness. Biometrika 41, 556-558.

Bartholomew, D. J. (1956). Tests for randomness of a series of events when the alternative is a trend. J. Roy. Statist. Soc., B 18, 234-239.

Bartholomew, D. J. (1957). Testing for departure from the exponential distribution. Biometrika 44, 253-257.

Bartlett, M. S. (1934). The problem in statistics of testing several variances. Proc. Camb. Phil. Soc. 30, 164-169.

Bickel, P. J. and Doksum, K. A. (1969). Tests for monotone failure based on normalized spacings. Ann. Math. Statist. 40, 1212-1235.

Cox, D. R. (1955). Some statistical methods connected with series of events. J. Roy. Statist. Soc., B 17, 129-164.

Cox, D. R. and Lewis, P. A. W. (1966). Statistical Analysis of Series of Events. London: Methuen.

Currie, I. and Stephens, M. A. (1984). On sample moments and tests of fit. Technical Report: Department of Statistics, Stanford University.

Dahiya, R. C. and Gurland, J. (1972). Goodness of fit tests for the gamma and exponential distributions. Technometrics 14, 791-801.

Epstein, B. (1960a). Tests for the validity of the assumption that the underlying distribution of life is exponential. Part I. Technometrics 2, 83-101.

Epstein, B. (1960b). Tests for the validity of the assumption that the underlying distribution of life is exponential. Part II. Technometrics 2, 167-183.

Fercho, W. W. and Ringer, L. J. (1972). Small sample power of some tests of the constant failure rate. Technometrics 14, 713-724.

Finkelstein, J. M. and Schafer, R. E. (1971). Improved goodness-of-fit tests. Biometrika 58, 641-645.

Gail, M. H. and Gastwirth, J. L. (1975). A scale-free goodness-of-fit test for the exponential distribution based on the Gini statistic. J. Roy. Statist. Soc., B 40, 350-357.

Gail, M. H. and Gastwirth, J. L. (1978). A scale-free goodness-of-fit test for the exponential distribution based on the Lorenz curve. J. Amer. Statist. Assoc. 73, 787-793.

Gail, M. H. and Ware, J. (1978). On the robustness to measurement error of tests of exponentiality. Biometrika 65, 305-309.

Gnedenko, B. V., Belyayev, Yu. K., and Solovyev, A. D. (1969). Mathematical Methods of Reliability Theory. New York: Academic Press.

Greenwood, M. (1946). The statistical study of infectious disease. J. Roy. Statist. Soc., A 109, 85-110.

Gurland, J. and Dahiya, R. C. (1970). A test of fit for continuous distributions based on generalised minimum chi-squared. Statistical Papers in Honor of G. W. Snedecor, T. A. Bancroft, Editor, 115-127. Iowa State University Press.

Jackson, O. A. Y. (1967). An analysis of departure from the exponential distribution. J. Roy. Statist. Soc., B 29, 540-549.

Kendall, M. G. (1946). Discussion of Professor Greenwood's paper. J. Roy. Statist. Soc. 109, 103-105.

Kimball, B. F. (1947). Some basic theorems for developing tests of fit for the case of the non-parametric probability distribution function, I. Ann. Math. Statist. 18, 540-548.

Lewis, P. A. W. (1965). Some results on tests for Poisson processes. Biometrika 52, 67-77.

Moore, D. S. (1973). A note on Srinivasan's goodness-of-fit test. Biometrika, 209-211.

Moran, P. A. P. (1951). The random division of an interval—Part II. J. Roy. Statist. Soc., B 13, 147-150.

O'Reilly, F. J. and Stephens, M. A. (1982). Characterizations and goodness of fit tests. J. Roy. Statist. Soc.

Pearson, E. S. (1963). Comparison of tests for randomness of points on a line. Biometrika 50, 315-325.

Proschan, F. (1963). Theoretical explanation of observed decreasing failure rate. Technometrics 5, 375-383.

Proschan, F. and Pyke, R. (1967). Tests for monotone failure rate. Proceedings Fifth Berkeley Symposium, 3, 293-312.

Pyke, R. (1965). Spacings. J. Roy. Statist. Soc., B 27, 395-449.

Pyke, R. (1972). Spacings revisited. Proceedings Sixth Berkeley Symposium, 1, 417-427.

Schafer, R. E., Finkelstein, J. M., and Collins, J. (1972). On a goodness-of-fit test for the exponential distribution with mean unknown. Biometrika 59, 222-224.

Seshadri, V., Csörgő, M., and Stephens, M. A. (1969). Tests for the exponential distribution using Kolmogorov-type statistics. J. Roy. Statist. Soc., B 31, 499-509.

Sherman, B. (1950). A random variable related to spacings of sample values. Ann. Math. Statist. 21, 339-361.

Sherman, B. (1957). Percentages of the ω̄ statistic. Ann. Math. Statist. 28, 259-261.

Shorack, G. R. (1972). The best test of exponentiality against gamma alternatives. J. Amer. Statist. Assoc. 67, 213-214.

Spinelli, J. J. and Stephens, M. A. (1987). Tests for exponentiality when origin and scale parameters are unknown. Technometrics 29, 471-476.

Srinivasan, R. (1970). An approach to testing the goodness-of-fit of incompletely specified distributions. Biometrika 57, 605-611.

Srinivasan, R. (1971). Tests for exponentiality. Statist. Hefte 12, 157-160.

Stephens, M. A. (1978). Goodness of fit tests with special reference to tests for exponentiality. Technical Report, Department of Statistics, Stanford University.

Stephens, M. A. (1986). Power studies for tests for exponentiality. Technical Report, Department of Statistics, Stanford University.

Tiku, M. L., Rai, K., and Mead, E. (1974). A new statistic for testing exponentiality. Comm. Statist. 3, 485-493.

Wang, Y. H. and Chang, S. A. (1977). A new approach to the nonparametric tests of exponential distribution with unknown parameters. The Theory and Application of Reliability 2, 235-258. New York: Academic Press.
11
Analysis of Data from Censored Samples

John R. Michael  Bell Telephone Laboratories, Holmdel, New Jersey*

William R. Schucany  Southern Methodist University, Dallas, Texas

11.1 INTRODUCTION

In this chapter we consider a variety of techniques which are appropriate as tests of fit when only a certain portion of the random sample from a continuous underlying distribution is available. The censoring or deletion of observations can occur in several ways. The type or manner of censoring determines the appropriate method of analysis.
The most common and simple censoring schemes involve a planned limit either to the magnitude of the variables or to the number of order statistics which can be observed. These are called singly Type 1 and Type 2 censored data, respectively. The number of small (or large) order statistics which will be observed in Type 1 censoring is a random variable. In life testing applications it is quite common for an experiment to produce a Type 1 right censored sample by having n items placed on test and recording the values 0 < Y_(1) < ··· < Y_(r) of the failure times which are observed up to a fixed test time. (In this chapter observations will be referred to as Y, rather than X, since in plotting techniques we shall wish to plot observations on the vertical, or y-axis.) Data arising from such a procedure are occasionally also referred to as being truncated. If the life test is planned to continue until a fixed number, r, of failures occur, then the resulting failure data are Type 2 right censored. As another example, if one records only the 10 largest independent competitive bids on an oil lease, the observed sample is singly Type 2 censored on the left. Types 1 and 2 censoring are sometimes referred to as time censoring and failure censoring, respectively.
In the more complicated situation in which the variables are subject to different censoring limits the sample is said to be multiply censored. If the

*Current affiliation: Westat Inc., Rockville, Maryland.


462 MICHAEL AND SCHUCANY

differing censoring limits are preplanned, as would result from placing items on a life test at different starting times with a single fixed termination time for the test, the data are progressively censored (Type 1). Samples which are progressively censored (Type 2) occur less often in practice but could result, again in life testing, if the units are put on test at the same time and then selected fixed numbers of (randomly chosen) unfailed items are removed from test immediately after different preplanned numbers of failures have occurred.
The unplanned type of censored data which arises most often in practice is randomly time censored or arbitrarily right censored data. The larger values (again usually in life testing) are not observed due to random censoring times which are statistically independent of the variable of interest (usually failure times). If some of the units are accidentally lost, destroyed, or removed from the study prior to the measurement of the variable (failure time) and if these independent censoring times are recorded, then the data can still be analyzed for goodness of fit. In certain situations competing modes of failure will produce randomly censored data (see Example 11.2.3.2). Combinations of multiply right and left censored data can also arise in practice (see Section 11.2.4).
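The two simplest schemes are easy to mimic in a simulation. The Python sketch below (with invented parameters) draws exponential lifetimes and applies Type 1 censoring at a fixed test time and Type 2 censoring at the r-th failure:

```python
import random

random.seed(1)                                    # reproducible illustration
n, test_time, r = 10, 1.0, 4
lifetimes = sorted(random.expovariate(1.0) for _ in range(n))

type1 = [t for t in lifetimes if t <= test_time]  # number observed is random
type2 = lifetimes[:r]                             # exactly the r smallest observed
```

Under Type 1 censoring the number of observations, len(type1), is a random variable; under Type 2 censoring it is fixed at r.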
The graphical technique of examining probability plots (Chapter 2) adapts quite easily to the censored sample situation. Subjective impressions should be formed with somewhat more caution than in the complete sample case, but the computational aspects are essentially unchanged. Probability plots are discussed in Section 11.2.
When the null distribution is completely specified, the probability integral transformation (see Section 4.2.3) may be employed to reduce the problem to a test for uniformity. Section 11.3 presents a number of examples of standard EDF (Chapter 4) goodness-of-fit statistics which have been modified in a straightforward fashion to accommodate a censored uniform sample. Adaptations for correlation (Chapter 5) and spacings (Chapter 8) statistics are also discussed. For Type 2 censored samples a transformation of the uniform order statistics is described which makes it possible to analyze the data as if it were a complete random sample.
In testing fit, it is a common situation for the null hypothesis to be composite; the hypothesized parent population is not completely specified, but only the form F(x|θ) of the cumulative distribution function (cdf) is given. Here θ is an indexing parameter; it may be a vector of several components, some known and some unknown. One very natural approach which has been taken in the complete sample case is to replace the unknown components in θ by efficient estimators (for example, the m.l.e. θ̂) and then to calculate a statistic based on F(x|θ̂) as if it were the completely specified distribution function. This has been done, for example, in many of the tests in Chapters 4 and 5. Censoring presents an extra complication for this approach simply because of increased complexity of efficient estimators of θ. A variety of results for the composite hypothesis problem are examined in the final
ANALYSIS OF DATA FROM CENSORED SAMPLES 463

Section 11.4. Adaptations of the chi-square procedure are not covered in this chapter. For some discussion on this topic, see Section 3.4.2.

11.2 PROBABILITY PLOTS

Probability plotting has been described in Chapter 2 as a valuable technique for assessing goodness of fit with complete samples. This extends naturally to incomplete samples for most types of censoring. Even in the case of multiple censoring a probability plot can often be constructed quickly using only ordinary graph paper and a hand calculator.
In Section 11.2.1, the construction of probability plots for complete samples is reviewed. The method is extended to singly-censored samples in Section 11.2.2, to multiply right-censored samples in Section 11.2.3, and to other types of censoring in Sections 11.2.4-11.2.6. An easy-to-use summary of the steps required in constructing a probability plot is given in Section 11.2.7.

11.2.1 Complete Samples

Let Y_(1), Y_(2), ..., Y_(n) be a complete ordered random sample of size n and let F(y|μ,σ) be the corresponding cdf, where μ and σ are unknown location and scale parameters, respectively. (Note that μ and σ are not necessarily the mean and standard deviation.) When there is no ambiguity F(y|μ,σ) will be shortened to F(·) or F.
Since μ and σ are location and scale parameters, we can write (as was done in Formula (2.9))

F(y|μ,σ) = G((y - μ)/σ) = G(z)    (11.1)

where Z = (Y - μ)/σ is referred to as the standardized variable and G(z), also referred to as G(·) or G, is the cdf of the standardized random variable. Using obvious notation, it follows that, using E for expectation or mean,

E{Y_(i)} = μ + σ m_i

where Z_(i) is the ith order statistic from the standardized distribution, and m_i is E{Z_(i)}. Similarly, for 0 < p_i < 1,

p_i-th quantile of F(y|μ,σ) = μ + σ{p_i-th quantile of G(z)}
                            = μ + σ[G⁻¹(p_i)]

where G⁻¹ is the inverse function of G.



We can regard Y_(i) as an estimate of its mean, or of the p_i-th quantile of F(y|μ,σ), where p_i is an appropriate probability. In constructing a probability plot we could plot the Y_(i) on the y-axis versus m_i on the x-axis. If the sample is in fact from F(y|μ,σ) then the points will tend to fall on a straight line with intercept μ and slope σ. We then test our distributional assumption by visually judging the degree of linearity of the plotted points. Methods based on regression and correlation are discussed in Chapter 5.
It should be noted that if the null hypothesis is simple, that is, the values of all distributional parameters are specified beforehand, we can plot the Y_(i) against their hypothesized means and then judge whether the plotted points fall near a straight line with intercept 0 and slope 1.
A drawback to using means of order statistics is that they are often difficult to compute. Quantiles, on the other hand, are easy to compute as long as F is easy to invert. A plot of the sample quantiles Y_(i) versus theoretical quantiles of G is a probability plot as defined in Chapter 2; it is also called a quantile-quantile or Q-Q plot (Wilk and Gnanadesikan, 1968). However, the plots will be different from those in Chapter 2, where the observations were plotted on the horizontal or x-axis; here they are plotted on the vertical or y-axis. Special probability plotting paper is available for many families of distributions, but as was stated in Chapter 2 no special graph paper is required if F can be inverted in closed form or if standard quantiles are available from tables or approximations. Often a scientific calculator and ordinary graph paper are all that one needs.
Table 11.1 lists the c d fs of some common fam ilies of distributions
along with the form ulas required to construct probability plots. The reader
is re fe rre d to Chapter 2 for further discussion of these distributions. In
this context the w ill be referred to as quantile probabilities.
There is much discussion in the literature over the best choice of quantile probabilities for Q-Q plots (see Kimball (1960) and Barnett (1975)). A frequently used formula is given by p_i = (i - c)/(n - 2c + 1), where c is some constant satisfying 0 ≤ c < 1. The choices c = 0 and c = 0.5 (see Chapter 2) are both popular. Here we use c = 0.3175 since the resulting probabilities closely approximate medians of uniform (0,1) order statistics (Filliben, 1975). This choice has the attractive invariance property that if p_i is the median of the ith order statistic from the uniform (0,1) distribution, then G^{-1}(p_i) is the median of Z(i) and F^{-1}(p_i) is the median of Y(i), for any continuous F. Medians may also be preferred as measures of central tendency since the distributions of most order statistics are skewed. In the examples that follow we will adhere to the convention of choosing c = 0.3175 unless stated otherwise. Thus we will plot the points

{G^{-1}(p_i), Y(i)}    (11.2)

where p_i = (i - 0.3175)/(n + 0.365). The particular choice of quantile probabilities is not crucial since for any reasonably large sample different choices
ANALYSIS OF DATA FROM CENSORED SAMPLES 465

TABLE 11.1 CDFs and Plotting Formulas for Selected Families of Distributions

Distribution^a   F(y)                                 Abscissa                       Ordinate

Uniform          (y - μ)/α                            p_i                            Y(i)

Normal           Φ[(y - μ)/σ]                         Φ^{-1}(p_i)                    Y(i)

Lognormal        Φ{[log(y) - μ]/σ}                    Φ^{-1}(p_i)                    log Y(i)

Exponential      1 - exp[-(y - μ)/σ]                  log[1/(1 - p_i)]               Y(i)

Extreme-value    1 - exp[-exp((y - μ)/σ)]             log{log[1/(1 - p_i)]}          Y(i)

Weibull          1 - exp[-(y/σ)^m]                    log{log[1/(1 - p_i)]}          log Y(i)

Laplace          (1/2)exp[(y - μ)/σ],  y ≤ μ          log(2p_i),  p_i ≤ 1/2          Y(i)
                 1 - (1/2)exp[-(y - μ)/σ],  y > μ     log[1/(2 - 2p_i)],  p_i > 1/2

Logistic         1/[1 + exp(-(y - μ)/σ)]              log[p_i/(1 - p_i)]             Y(i)

Cauchy           1/2 + (1/π)arctan[(y - μ)/σ]         tan[π(p_i - 1/2)]              Y(i)

^a Support of each distribution is (-∞ < y < ∞) except for the uniform (μ < y < μ + α), lognormal (y > 0), exponential (y > μ), and Weibull (y > 0).

will have little effect on the appearance of the main body of the plot. There may be some noticeable differences, however, for extreme order statistics from long-tailed distributions. (The reader should note that in Chapter 2 the p_i of (11.2) was symbolized by Fn(y), the empirical distribution function.)
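For a complete sample the plotting positions and abscissas above reduce to a few lines of code. The following sketch (Python; the function names are ours, not the chapter's) computes p_i = (i - c)/(n - 2c + 1) with Filliben's c = 0.3175 and the points (11.2) when G is taken as the standard normal:

```python
from statistics import NormalDist

def plotting_positions(n, c=0.3175):
    """Quantile probabilities p_i = (i - c)/(n - 2c + 1), i = 1, ..., n."""
    return [(i - c) / (n - 2 * c + 1) for i in range(1, n + 1)]

def normal_qq_points(sample):
    """Points {G^{-1}(p_i), Y(i)} of (11.2) with G the standard normal cdf."""
    y = sorted(sample)                 # Y(1) <= ... <= Y(n)
    p = plotting_positions(len(y))
    g = NormalDist()                   # standardized distribution G
    return [(g.inv_cdf(pi), yi) for pi, yi in zip(p, y)]
```

A straight line fitted to these points has intercept near μ and slope near σ when the normal model holds; note that p_i + p_{n+1-i} = 1, so the normal abscissas are symmetric about zero.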

E 11.2.1.1 Uncensored Normal Example

Data for this example consist of the first 40 values from the NOR data set which were simulated from the normal distribution with μ = 100 and σ = 10. A normal probability plot is shown in Figure 11.1. The normal distribution provides a good fit to the data. Note that the intercept and slope of a straight line drawn through the points provide estimates of the theoretical mean and standard deviation. (The reader should compare Figure 11.1 to Figure 2.15

FIGURE 11.1 Normal probability plot of the first 40 observations from the NOR data set.

where the full NOR data set is plotted with X and Y axes interchanged from Figure 11.1.)

11.2.2 Singly-Censored Samples

The method of the previous section can be applied directly in any situation where the data consist of some known subset of order statistics from a random sample. This is because the available Y(i) are still sample quantiles from the complete sample and appropriate quantiles of G can be calculated as before. Although only a portion of the observations from the hypothetical complete sample can be plotted, the plotted positions of the uncensored points are the same as when the complete sample is available. The only difference

is that points corresponding to censored observations do not appear. The simplest example of this is the case of a singly-censored sample.

E 11.2.2.1 Right-Censored Normal Example

Data for this example consist of the smallest 20 values among the first 40 values listed in the NOR data set. A normal probability plot is shown in Figure 11.2. This plot is merely an enlargement of the lower portion of the plot shown in Figure 11.1.

The plotting procedure is the same for Type 1 as for Type 2 singly-censored samples; however, with Type 1 censoring there is one additional piece of information, namely the censoring time, that can be represented graphically. Suppose we observe the r smallest observations from a random

FIGURE 11.2 Normal probability plot of the smallest 20 of the first 40 values listed in the NOR data set.
468 MICHAEL AND SCHUCANY

sample of size n. For a location and scale family we plot the points (G^{-1}(p_i), Y(i)) for i = 1, 2, ..., r. Now suppose that the censoring is Type 1 and that the observations are all those that are less than some predetermined value t; thus Y(r+1) must be greater than t. This additional information can be given by plotting the point {G^{-1}(p_{r+1}), t} with a symbol such as an arrow pointing up, thus indicating the range of possible values for Y(r+1). Nelson (1973) illustrates this technique.

11.2.3 Multiply Right-Censored Samples

The method of probability plotting extends easily to multiply right-censored samples; however, the computation of quantile probabilities is more complicated. For ease of explanation we will first consider the special case of progressive Type 1 censoring, but the methodology can be applied to any multiply right-censored sample. Suppose we place n units on test, using several different starting times, and terminate the experiment at time t. Now let Y(1) < Y(2) < ··· < Y(n) denote the ordered lifetimes of the n units, some of which are failure times and some of which may be censoring times. If we observe r failures, then (n - r) units are still operating at time t. In this case the observed time to failure Y(i) does not necessarily represent the ith largest observation from the hypothetical complete sample, and Y(i) cannot be regarded as a sample quantile from the complete sample (unless Y(1), Y(2), ..., Y(n) are all failure times).

We still wish to plot the r failure times against theoretical quantiles from G. The question now becomes, what proportion of the population falls below Y(i), or equivalently, what is the value of F(Y(i) | μ, σ)? Kaplan and Meier (1958) discuss the maximum likelihood nonparametric estimator of F for the case of a multiply right-censored (Type 1) sample. If S is the set of subscripts corresponding to those units which fail during the course of the experiment, then the Kaplan-Meier (K-M) estimator is given by

F̂(y) = 1 - ∏_{j∈S, Y(j)≤y} (n - j)/(n - j + 1)
(This estimator is undefined for y > Y(n) if Y(n) is not a failure time.) In the case of a complete sample the K-M estimator reduces to the familiar EDF Fn(y) = (the number of Y(j) ≤ y)/n, discussed in Chapter 4. The estimated probability at the point Y(i) provided by the Kaplan-Meier estimator is given by

p_i(K-M) = 1 - ∏_{j∈S, j≤i} (n - j)/(n - j + 1)    (11.3)

for i ∈ S. Herd (1960) and Johnson (1964) propose the similar quantile probabilities

p_i(H-J) = 1 - ∏_{j∈S, j≤i} (n - j + 1)/(n - j + 2)    (11.4)

for i ∈ S. Implicit in the work of Nelson (1972) are the quantile probabilities

p_i(N) = 1 - ∏_{j∈S, j≤i} exp[-1/(n - j + 1)]    (11.5)

for i ∈ S. Nelson refers to his method as (cumulative) hazard plotting, but it is equivalent to probability plotting with the above special choice of quantile probabilities. An algebraic comparison reveals that p_i(K-M) > p_i(N) > p_i(H-J) for all i ∈ S. For a discussion of the properties of the Kaplan-Meier estimator see Peterson (1977). Results by Breslow and Crowley (1974) apply to the Kaplan-Meier estimator and the estimator implicit in the work of Nelson. See Gaver and Miller (1983) for a discussion of the jackknife technique for approximate confidence intervals in this setting. For a complete sample the formulas (11.3), (11.4), and (11.5) for quantile probabilities reduce to i/n, i/(n + 1), and [1 - exp(-s_i)], respectively, where s_i = Σ_{j=1}^{i} (n - j + 1)^{-1}. The choice of probabilities given by

p_i(c) = 1 - [(n - c + 1)/(n - 2c + 1)] ∏_{j∈S, j≤i} (n - j - c + 1)/(n - j - c + 2)    (11.6)

reduces to (i - c)/(n - 2c + 1) with a complete sample. As a special case, p_i(c) = p_i(H-J) when c = 0. In the examples that follow we will remain consistent with Section 11.2.1 and use (11.6) with c = 0.3175 unless stated otherwise. Again, for purposes of assessing goodness of fit the particular formulation for quantile probabilities is of little consequence.
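The four plotting-position formulas (11.3)-(11.6) differ only in the factor accumulated over the uncensored ranks j ≤ i. A minimal sketch (Python; the function name is ours, not the chapter's):

```python
import math

def censored_positions(n, S, c=0.3175):
    """Quantile probabilities for a multiply right-censored sample of size n.

    S is the set of 1-based ranks of the uncensored (failure) observations.
    Returns dicts over i in S for the K-M (11.3), H-J (11.4), Nelson (11.5),
    and p_i(c) (11.6) choices.
    """
    km, hj, nel, pc = {}, {}, {}, {}
    prod_km = prod_hj = prod_n = prod_c = 1.0
    for j in range(1, n + 1):
        if j in S:
            prod_km *= (n - j) / (n - j + 1)
            prod_hj *= (n - j + 1) / (n - j + 2)
            prod_n *= math.exp(-1.0 / (n - j + 1))
            prod_c *= (n - j - c + 1) / (n - j - c + 2)
            km[j] = 1 - prod_km
            hj[j] = 1 - prod_hj
            nel[j] = 1 - prod_n
            pc[j] = 1 - (n - c + 1) / (n - 2 * c + 1) * prod_c
    return km, hj, nel, pc
```

For a complete sample (S = {1, ..., n}) the first three reduce to i/n, i/(n + 1), and 1 - exp(-s_i), and the ordering p_i(K-M) > p_i(N) > p_i(H-J) holds rank by rank; with n = 100 and rank 1 uncensored, the four values reproduce the first row of Table 11.2 (0.010, 0.010, 0.010, 0.007).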

E 11.2.3.1 Multiply Right-Censored Example

Data for this example consist of the 100 observations from the WE2 data set which were simulated from the Weibull distribution with σ = 1 and m = 2. The data were censored as follows: observations among the first, second, third, and fourth sets of 25 were recorded that were less than 1, 0.75, 0.50, and 0.25, respectively. This type of progressive Type 1 censoring could have occurred if four sets of 25 devices were placed on test at times 0, 0.25, 0.50, and 0.75 with the experiment terminating at time 1. The 100 values

TABLE 11.2 Progressively Censored Data from the WE2 Data Set

Quantile Probabilities

i Failure time K-M N H-J c = 0.3175

1 0.09 0.010 0.010 0.010 0.007
2 0.14 0.020 0.020 0.020 0.017
3 0.16 0.030 0.030 0.030 0.027
4 0.18 0.040 0.040 0.040 0.037
5 0.18 0.050 0.050 0.050 0.047
6 0.20 0.060 0.060 0.059 0.057
30 0.27 0.073 0.073 0.072 0.070
31 0.30 0.086 0.086 0.086 0.083
32 0.32 0.100 0.099 0.099 0.096
33 0.33 0.113 0.112 0.112 0.109
34 0.33 0.126 0.125 0.125 0.122
35 0.34 0.139 0.139 0.138 0.136
36 0.34 0.153 0.152 0.151 0.149
37 0.36 0.166 0.165 0.164 0.162
38 0.38 0.179 0.178 0.177 0.175
39 0.40 0.192 0.191 0.190 0.188
40 0.42 0.206 0.204 0.203 0.201
41 0.43 0.219 0.218 0.216 0.215
42 0.47 0.232 0.231 0.229 0.228
43 0.49 0.245 0.244 0.242 0.241
61 0.51 0.264 0.262 0.261 0.260
62 0.56 0.283 0.281 0.279 0.278
63 0.62 0.302 0.300 0.298 0.297
64 0.65 0.321 0.318 0.316 0.316
65 0.68 0.340 0.337 0.335 0.334
66 0.71 0.359 0.356 0.353 0.353
67 0.74 0.377 0.375 0.372 0.371
85 0.76 0.416 0.412 0.409 0.409
86 0.78 0.455 0.450 0.446 0.447
87 0.92 0.494 0.488 0.483 0.485
88 0.93 0.533 0.526 0.520 0.522
89 0.95 0.572 0.564 0.556 0.560
90 0.97 0.611 0.602 0.593 0.598

FIGURE 11.3 Weibull probability plot of progressively censored data from the WE2 data set.

(censored and failed) were ranked from smallest to largest. The 33 failure times are listed in Table 11.2 along with four different choices of quantile probabilities. Of the 67 censored devices, 23, 17, 17, and 10 devices had censoring times of 0.25, 0.50, 0.75, and 1.00, respectively. One purpose of this example is to show how close the agreement can be for different choices of quantile probabilities. Note also the relationship p_i(K-M) > p_i(N) > p_i(H-J). A Weibull probability plot for the data using p_i(c) with c = 0.3175 is shown in Figure 11.3.

The remarks made in Section 11.2.1 and Chapter 2 concerning the interpretation of probability plots with complete samples hold also for the case of multiple censoring. However, in the case of a multiply right-censored sample, the effect of censoring is to increase the variability on the right-hand side of the plot.

TABLE 11.3 Life Data for Mechanical Device

          Failure Mode      Quantile Probabilities

i Time    A    B            Device    A    B

1 1.151 X 0.017 0.017


2 1.170 X 0.042 0.042
3 1.248 X 0.066 0.066
4 1.331 X 0.091 0.091
5 1.381 X 0.116 0.116
6 1.499 X 0.141 0.020
7 1.508 X 0.166 0.141
8 1.534 X 0.190 0.167
9 1.577 X 0.215 0.192
10 1.584 X 0.240 0.218
11 1.667 X 0.265 0.052
12 1.695 X 0.289 0.084
13 1.710 X 0.314 0.116
14 1.955 X 0.339 0.246
15 1.965 X 0.364 0.149
16 2.013 X 0.389 0.276
17 2.051 X 0.413 0.305
18 2.076 X 0.438 0.334
19 2.109 X 0.463 0.187
20 2.116 X 0.488 0.365
21 2.119 X 0.512 0.396
22 2.135 X 0.537 0.228
23 2.197 X 0.562 0.269
24 2.199 X 0.587 0.430
25 2.227 X 0.611 0.313
26 2.250 X 0.636 0.466
27 2.254 X 0.661 0.360
28 2.261 X 0.686 0.505
29 2.349 X 0.711 0.544
30 2.369 X 0.735 0.415
31 2.547 X 0.760 0.470
32 2.548 X 0.785 0.524
33 2.738 X 0.810 0.597
34 2.794 X 0.834 0.586
35 2.883 (Working)
36 2.883 (Working)
37 2.910 X 0.870 0.675
38 3.015 X 0.905 0.763
39 3.017 X 0.941 0.851
40 3.793 (Working)

For Type 2 multiple right censoring consider the following simple situation. We place n units on a life test and when the rth unit fails we remove all but a fraction φ of the remaining working units. We then observe the failure times of those units not removed. In this situation the p_i(c) values can be obtained from (11.6), where Y(1) < ··· < Y(r) are the first r failure times, Y(r+1) = ··· = Y(r+(n-r)(1-φ)) are the censoring times of the removed items, and Y(r+(n-r)(1-φ)+1), ..., Y(n) are the failure times of the items that were not removed. The set S in formula (11.6) consists of the indices of the first r failure times and the last (n - r)φ failure times, and a probability plot can be drawn as described above. More elaborate Type 2 multiply right-censored samples are handled in the obvious manner.
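Under this scheme the index set S is determined by n, r, and φ alone. A small sketch (Python; the function name is ours, and (n - r)(1 - φ) is assumed to be an integer):

```python
def type2_index_set(n, r, phi):
    """Ranks of the uncensored observations when, at the r-th failure,
    all but a fraction phi of the remaining (n - r) working units are
    removed: S = {1, ..., r} U {r + (n - r)(1 - phi) + 1, ..., n}."""
    removed = round((n - r) * (1 - phi))  # assumed integer in this scheme
    return set(range(1, r + 1)) | set(range(r + removed + 1, n + 1))
```

With n = 20, r = 10, and φ = 0.5, five units are removed at the tenth failure and S = {1, ..., 10} ∪ {16, ..., 20}; these ranks can then be fed to (11.6).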

E 11.2.3.2 Competing Modes Example

Data for this example consist of the lifetimes, measured in millions of operations, of 40 mechanical devices. The devices were placed on test at different times, and three were still working at the end of the experiment. The data are presented in Table 11.3. Only two modes of failure were observed: either component A failed or component B failed. These two components are identical in construction, but they are subject to different stresses when the device is operated. Thus their life distributions need not be identical. Quantile probabilities are given in Table 11.3 for the device as a whole, component A, and component B under the columns headed "Device," "A," and "B," respectively. The data for the device are multiply censored since the 35th, 36th, and 40th ordered lifetimes are incomplete. In addition, observations on component A are censored by failures of component B and vice versa. This is an example of random censoring caused by competing modes of failure.

Probability plots for the individual components were constructed using several common life distributions. The lognormal distribution seemed to offer the best fit. Lognormal probability plots are shown for components A and B in Figure 11.4(A). The intercepts and slopes of the two lines suggested by the plots appear to be different. This raises the possibility that, while the life distributions of the two components may be of the same family, the distribution parameters may be different. A distracting feature is the noticeable gap near the center of the plots. The natural tendency is to expect too much orderliness and to declare that something unusual has occurred. But such anomalies frequently arise by chance and should not be taken too seriously. The reader is referred to Hahn and Shapiro (1967), pages 264-265, for an example of a plot in which the same unusual feature has arisen by chance.

If the life distributions for components A and B are independent and lognormal, then the life of the device is distributed as the minimum of two lognormal random variables. For illustration we assume the equality of parameters. The cdf of the device is then given by

F(y | μ, σ) = 1 - {1 - Φ([log(y) - μ]/σ)}²

A probability plot for this distribution is constructed by plotting the points

{Φ^{-1}[1 - (1 - p_i)^{1/2}], log(Y(i))}    (11.7)

Such a plot is shown in Figure 11.4(B). If it is desired to fit different sets of parameters to the individual components, we can always estimate them using, say, the method of maximum likelihood. The estimated cdf of the device, however, would then be difficult to invert. One way around this is to estimate the probability integral transformation with F(· | μ̂1, μ̂2, σ̂1, σ̂2) and plot F(Y(i) | μ̂1, μ̂2, σ̂1, σ̂2) versus the p_i. This approach is described more fully in

FIGURE 11.4(A) Lognormal probability plots for components A and B.

FIGURE 11.4(B) Probability plots for the mechanical device where the assumed distribution is the minimum of 2 i.i.d. lognormals.

Section 11.3. Note finally that the derivation of special theoretical quantiles given in (11.7) would not have been necessary if we had modeled the lifetimes of the components as exponential, extreme-value, or Weibull random variables. This is because the minimum of any number of independent identically distributed random variables from one of these families is also of the same family.
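The inversion behind (11.7) is easy to verify numerically. A sketch (Python; function names are ours), assuming standard-normal Φ:

```python
import math
from statistics import NormalDist

def min2_lognormal_abscissa(p):
    """Theoretical quantile Phi^{-1}[1 - (1 - p)^(1/2)] from (11.7),
    to be plotted against log(Y(i))."""
    return NormalDist().inv_cdf(1.0 - math.sqrt(1.0 - p))

def min2_lognormal_cdf(y, mu=0.0, sigma=1.0):
    """Device cdf F(y | mu, sigma) = 1 - {1 - Phi([log(y) - mu]/sigma)}^2."""
    phi = NormalDist().cdf((math.log(y) - mu) / sigma)
    return 1.0 - (1.0 - phi) ** 2
```

For μ = 0 and σ = 1 the abscissa x of (11.7) satisfies F(exp(x)) = p, which is exactly the check below.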
Although the development in the last example is somewhat speculative in nature, it does serve to illustrate the versatility and usefulness of probability plotting, as well as its subjective and limited interpretability.

11.2.4 Other Types of Multiple Censoring

There are other more complicated types of multiple censoring which can arise in practice. A few of these will be discussed below. The thought to

keep in mind is that a meaningful probability plot can always be constructed as long as the parent cdf can be estimated.

Occasionally, data arise which are multiply left-censored. If the observations are all multiplied by -1 then the resulting values can be viewed as being multiply right-censored. We can now determine quantile probabilities using the formulas of Section 11.2.3. In terms of the subscripts of the original observations, the probabilities p_i(c) given by the formula

p_i(c) = [(n - c + 1)/(n - 2c + 1)] ∏_{j∈S, j≥i} (j - c)/(j - c + 1)    (11.8)

reduce to (i - c)/(n - 2c + 1) with complete samples.


A m ore complicated situation can occur when the data are both multiply
righ t- and multiply left-censored. If all of the left-censored observations
are not le ss than all of the right-censored observations, then quantile proba­
bilities can no longer be calculated using a simple form ula. But appropriate
probabilities can still be determined as long as the cdf can be estimated
nonparam etrically. Turnbull (1976) shows how to calculate the maximum
likelihood nonparametric estimate of the cdf when the data are arbitrarily
right and left-censored, grouped and truncated.
Quantal response data occurs when each observation is either righ t- o r
left-cen sored. In the following example the sample size is so sm all that firm
inferences cannot be drawn; however, the example does show how quantal
response data can a rise , and does serve to illustrate how to construct a
probability plot with such data.

E 11.2.4.1 Quantal Response Example

It is desired to investigate the nature of the distribution of the shelf life of a certain electronic set. A total of 47 sets are involved in the study. After some number of days on the shelf the ith set is tested and is found to be either good or bad. The set is never observed again. Thus a good set constitutes a right-censored observation whereas a bad set constitutes a left-censored observation. The numbers of days on the shelf at the times of test are as follows, with failures indicated by an asterisk: 20, 22, 23, 25, 26, 27, 28, 29*, 30, 31, 37, 37, 37, 41, 42, 43, 62, 69, 69, 78, 92, 92, 93, 114, 117, 124*, 128*, 130, 136, 151, 211, 226, 231, 242, 244, 244, 244, 244, 245*, 245, 245, 250, 259*, 259, 287, 317, and 340 days. Using the recursive algorithm given by Turnbull (1976), the maximum likelihood nonparametric estimate of the cdf is found to be

Fn(y) =  0          for -∞ < y ≤ 28
         undefined  for 28 < y < 29
         .056       for 29 ≤ y ≤ 117
         undefined  for 117 < y < 124
         .143       for 124 ≤ y ≤ 244
         undefined  for 244 < y < 245
         .222       for 245 ≤ y ≤ 340
         undefined  for 340 < y

Four values of y were selected for purposes of probability plotting: 28.5, 120.5, 244.5, and 340. The first three are the midpoints of the three closed

FIGURE 11.5 Probability plots for the shelf life of electronic sets (gamma shape parameter = 1, 2, and 4; origin for gamma plots is shown as V).

intervals which are assigned probability, and the last is the largest value for which Fn(y) is defined. The four probabilities used are 0.028, 0.099, 0.127, and 0.222. The first three are the midpoints of the jumps and the fourth is equal to Fn(340). Probability plots are shown in Figure 11.5 for four families commonly used to model lifetimes. The lognormal, gamma (with origin 0 and shape near 1), and Weibull distributions all appear to fit the data well. These results are not inconsistent since the gamma distribution described (exponential distribution with origin zero) is a member of the Weibull family, and the lower portions of the Weibull and lognormal cdfs are very similar.

Any conclusions, however, are highly tentative because of the small sample size and the severity of the censoring. If we use a jackknife technique or the theory of m.l.e. (see Turnbull, 1976) to estimate the variances of probabilities assigned to each of the four intervals, it then appears that none of the models considered in Figure 11.5 can be soundly rejected.
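Given the estimated probabilities, the four plotting points are formed exactly as in Table 11.1. A sketch for the lognormal family (Python; the (y, p) pairs are the values quoted above):

```python
import math
from statistics import NormalDist

# (y, p) pairs: three jump midpoints of the Turnbull estimate, plus Fn(340)
points = [(28.5, 0.028), (120.5, 0.099), (244.5, 0.127), (340.0, 0.222)]

# Lognormal plot from Table 11.1: abscissa Phi^{-1}(p_i), ordinate log(y)
lognormal_points = [(NormalDist().inv_cdf(p), math.log(y)) for y, p in points]
```

Near-linearity of these four points is what the lognormal panel of Figure 11.5 displays; the other families in Table 11.1 change only the abscissa transformation.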
Grouping is perhaps the most common form of censoring encountered in practice. Each grouped observation is both right- and left-censored. Quantile probabilities can be calculated using the formula for a complete sample. One approach to constructing a probability plot is to represent each observation with the endpoint (or midpoint) of the interval in which it falls. The resulting plot will have a stairstep appearance with the number of steps equal to the number of groups. One advantage of this approach is that the sample size is evident. A simplification is to plot only one point per group.

11.2.5 Proportional Hazards

A quasi-nonparametric method for analyzing survival data was proposed by Cox (1972, 1975). The method is parametric in that it is assumed that the hazard functions for the observations are all proportional. But the method is nonparametric in that no prior restrictions are placed on the form of the hazards (and hence the cdfs). The cdf for a particular observation is estimated using all the data. This estimate then provides the appropriate quantile probabilities for purposes of probability plotting.

The Weibull (or exponential) family is the only family for which it makes sense to construct a probability plot after having assumed the proportional hazards model. This is because, for the Weibull family, a multiplicative change in the hazard function is equivalent to a change in the scale parameter. Thus it does not matter which cdf is estimated since the resulting probability plots will differ only in the labeling of their axes, and not in the degree of linearity of the plotted points.

11.2.6 Superposition of Renewal Processes

Finally, a very different situation will be described which perhaps stretches the definition of the term "censoring." Suppose we have n units that all begin

operation at the same time. If a unit fails, it is instantly replaced with a new unit. It is assumed that the lifetimes of the original and replacement units are independent and identically distributed with cdf F. The exact times of failures are known but not the identities of the failed units. Except for the first failure, then, we cannot be sure of the ages of the failed units. We thus observe a superposition of renewal processes. The failure times are not censored here, but the identities and therefore the ages of the failed units are censored in a sense.

Trindade and Haugh (1979) describe a method for the nonparametric estimation of F in the above situation. The renewal function, M, is estimated using a straightforward nonparametric method. The parent cdf is then estimated by exploiting the relationship of F to M through the fundamental renewal equation. For any particular set of points in time, the estimate of F provides appropriate probabilities for determining corresponding theoretical standard quantiles for purposes of probability plotting. Again we will emphasize that a meaningful probability plot can always be constructed as long as the parent cdf can be estimated using a nonparametric method.

11.2.7 Summary of Steps in Constructing a Probability Plot

Below are given the steps required in constructing a probability plot with uncensored, singly-censored, multiply right-censored, and multiply left-censored data. The user must provide a value of the constant c with 0 ≤ c < 1. The values c = 0.3175 and c = 0.5 are popular.

(1) Let Y(1), Y(2), ..., Y(n) denote n ordered observations, some of which may be censored, and let S be the set of subscripts corresponding to the observations in the ordered list that are not censored.

(2) Determine quantile probabilities p_i for each i ∈ S using one of the following formulas:

p_i = (i - c)/(n - 2c + 1),  for complete or singly-censored samples

p_i = 1 - [(n - c + 1)/(n - 2c + 1)] ∏_{j∈S, j≤i} (n - j - c + 1)/(n - j - c + 2),  for multiply right-censored samples    (11.9)

p_i = [(n - c + 1)/(n - 2c + 1)] ∏_{j∈S, j≥i} (j - c)/(j - c + 1),  for multiply left-censored samples

(3) Enter Table 11.1 and find the line corresponding to the hypothesized family of distributions. Plot the entry under "abscissa" versus the entry under "ordinate" for each i ∈ S.
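Step (2) can be sketched as a single routine covering the three cases of (11.9) (Python; the function name and argument convention are ours):

```python
def quantile_probabilities(n, S=None, c=0.3175, left=False):
    """Quantile probabilities p_i of (11.9).

    S is the set of 1-based uncensored ranks; S = None means a complete or
    singly-censored sample.  left=True selects the left-censored formula.
    """
    if S is None:  # complete or singly-censored
        return {i: (i - c) / (n - 2 * c + 1) for i in range(1, n + 1)}
    lead = (n - c + 1) / (n - 2 * c + 1)
    p, prod = {}, 1.0
    if not left:   # multiply right-censored: product over j in S, j <= i
        for j in range(1, n + 1):
            if j in S:
                prod *= (n - j - c + 1) / (n - j - c + 2)
                p[j] = 1 - lead * prod
    else:          # multiply left-censored: product over j in S, j >= i
        for j in range(n, 0, -1):
            if j in S:
                prod *= (j - c) / (j - c + 1)
                p[j] = lead * prod
    return p
```

When S = {1, ..., n}, both products telescope and all three branches return (i - c)/(n - 2c + 1).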

11.3 TESTING A SIMPLE NULL HYPOTHESIS

For this section it is assumed that the hypothesis of interest is H0: the sampled population has the completely specified absolutely continuous cdf F(y). As in other chapters, this situation is called Case 0. For most of the discussion the data at hand will consist of a singly right-censored sample (Type 1 or Type 2), that is, the set of r smallest order statistics Y(1), ..., Y(r). The probability integral transformation U(i) = F(Y(i)), i = 1, ..., r, can be applied, and an equivalent test of fit is that the U(1) < ··· < U(r) are the r smallest order statistics of a random sample of size n from the uniform (0,1) distribution. If the data are Type 1 censored at y = y* and if t = F(y*), then r is a random variable giving the number of order statistics for the uniform random sample which are less than t; if the data are Type 2 censored, then r is fixed in advance.

Many of the methods which have been discussed in earlier chapters have been adapted to accommodate censoring of both types. These include censored versions of EDF statistics (Section 4.7), correlation-type tests (Section 5.5), and tests based on spacings (Section 8.9). Later we examine procedures in which the order statistics U(i), i = 1, ..., r, are transformed to new values which under H0 are distributed as a complete uniform sample. Then any of the many tests for uniformity for a complete sample (Chapter 8) may be applied to the transformed values.

11.3.1 EDF Statistics

In Chapter 4, censored versions of EDF statistics were introduced. We will now illustrate the use of these statistics by applying them directly to a censored sample.

TABLE 11.4 Hypothetical Survival Data and Transformed Observations

i    Y(i)    U(i)      Z(i)

1    .1      .00995    .03979
2    .2      .01980    .07918
3    .3      .02955    .11815
4    .4      .03921    .15681
5    .7      .06761    .27038
6    1.0     .09516    .38056
7    1.4     .13064    .52245

E 11.3.1.1 Exponential Example

This is an example of Type 1 censoring. Barr and Davidson (1973) give the smallest 7 observations in a Type 1 censored sample of size n = 20. The hypothesized null distribution is the exponential distribution F(y) = 1 - exp(-y/10), y > 0, with a censoring value at y* = 2.2. Table 11.4 gives the values Y(i), with the values U(i) given by U(i) = F{Y(i)}, i = 1, ..., 7. The Type 1 censoring value for u is then t = F(2.2) = 0.1975. The Z(i) values shown here are first discussed in Section 11.3.3.3.

In Section 4.7.2, the Kolmogorov-Smirnov statistics 1D_{t,n} and 2D_{r,n} for censored data of Types 1 and 2, respectively, are defined; it is also shown how these may be transformed and referred to the asymptotic distribution tabulated in Table 4.4. Alternatively, the exact tables for finite n, given by Barr and Davidson (1973), may be used. Working from the values U(i) of Table 11.4, the statistic 1D_{t,n} is found to be 7/20 - 0.131 = 0.219, with t = 0.1975 and n = 20. Direct interpolation in the tables of Barr and Davidson gives the approximate significance level p = 0.11. Alternatively, use of the formulas of Dufour and Maag (1978) (Section 4.7) yields the modified statistic D* = 4.472(0.219) + 0.19/4.472 = 1.022. Reference to the asymptotic points in Table 4.4 then gives a p-value of approximately 0.10. If the data had been Type 2 censored, the formulas would give a modified statistic D* = 1.033 with p-value approximately 0.095. Percentage points of the asymptotic distribution of the Kuiper statistic V_{t,n} = 1D+_{t,n} + 1D-_{t,n} are derived and tabled under Type 1 right censoring by Koziol (1980a).
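The arithmetic above can be reproduced directly. A sketch (Python; the supremum is computed from the usual one-sided maxima of the EDF over [0, t], and the modification D* = √n·D + 0.19/√n is inferred from the quoted arithmetic, so treat its form as an assumption):

```python
import math

def censored_ks(u, n, t):
    """1D_{t,n}: sup over [0, t] of |Fn(u) - u| for the r smallest of n
    uniform (0,1) order statistics u[0] < ... < u[r-1], censored at t."""
    r = len(u)
    d_plus = max((i + 1) / n - ui for i, ui in enumerate(u))
    d_minus = max(max(ui - i / n for i, ui in enumerate(u)), t - r / n)
    return max(d_plus, d_minus)

def modified_ks(d, n):
    """Modified statistic D* = sqrt(n) D + 0.19 / sqrt(n) (assumed form)."""
    return math.sqrt(n) * d + 0.19 / math.sqrt(n)
```

For the U(i) of Table 11.4 with n = 20 and t = 0.1975 this returns D = 0.35 - 0.13064, which rounds to the 0.219 of the example, and D* close to 1.022.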
In Section 4.7.3 two types of Cramér-von Mises statistics for censored data are given. The first type, denoted by 1W²_{t,n} and 1A²_{t,n}, is for right-censored data of Type 1. The second is for Type 2 censoring. Both were derived by Pettitt and Stephens (1976a, b) by adapting the complete-sample definitions of these statistics. For the U(i) of Table 11.4, we have the Type 1 statistics, with 1A²_{t,n} = 1.214; referring to Table 4.4 with t = 0.2 gives p-values 0.008 and 0.005, respectively, for the W² and A² statistics. If the data are treated as Type 2 censored, the test statistics become 2W²_{r,n} = 0.057 and 2A²_{r,n} = 0.863; referring to Table 4.5 with p = r/n = .35 gives approximate p-values of 0.25 and 0.08.

E 1 1 .3 .1 .2 Uniform Example

This is an example of Type 2 censoring. Consider the first n = 25 values


from the UNI data set, rescaled to the unit interval. Simpóse that the sm all­
est r = 15 values from this set are available and are to be tested fo r fit as
uniform o rd er statistics. The observed value o f ^D = 0.216 and (^/ñ) D
2 r ,n ' '2 r ,n
= 1.08; then D * (Section 4. 7) is 1.08 + . 24/5 = 1.128 and reference to the
tabulated asymptotic distribution gives a p-value of 0.1296.
The p-value may also be computed from the form ula for the asymptotic
distribution given by Schey (1977) and quoted in Section 4. 7. 2. The value of

t = 15/25 = 0.60; this gives At = 2.041, Bt = 0.408 in the notation of Section 4.7.2, and then Gt(1.128) is 0.9362. The observed significance level for the two-sided test is then approximately 2(1 - 0.936) = 0.128.

Comment A. Use of the censoring information

When the observations U(i) are to be tested for uniformity, the value of t, or of r (whichever is given), is important, in addition to the values U(i). Thus, for example, in Table 11.4, there are 7 observations out of 20 below t = 0.1975, a number larger than expected. If the sample had been Type 2 censored, we could observe that the largest observation U(7) is 0.131, much smaller than the expected value 0.333. These facts are implicitly used in calculating the EDF statistics. Also, the value of U(r), although not the value of t, is used in the censored version of spacings statistics (Section 8.9).

Comment B. Random censoring

Extensions of EDF statistics to situations involving randomly censored data generally involve a Kaplan-Meier estimator for the true distribution function. For versions of the Kolmogorov-Smirnov, Kuiper, and Cramér-von Mises statistics see Koziol (1980b), Nair (1981), or Fleming et al. (1980), who obtain asymptotic distributions and examine the adequacy of small-sample approximations.
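The Kaplan-Meier (product-limit) estimator mentioned above can be sketched in a few lines. This is a minimal illustration for right-censored data, not the specific versions studied in the papers cited; the function name is ours.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function
    from possibly right-censored data; events[i] is 1 for an observed
    failure, 0 for a censored time.  Returns (time, S(time)) pairs at
    the distinct failure times."""
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv, out, i = 1.0, [], 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = sum(1 for tt, e in pairs if tt == t and e == 1)
        ties = sum(1 for tt, e in pairs if tt == t)
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk   # multiply in the factor (1 - d/n)
            out.append((t, surv))
        n_at_risk -= ties
        i += ties
    return out

# Small illustration: failures at 1, 3, 4; a censored time at 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

An EDF-type statistic for randomly censored data is then built by comparing 1 minus this estimate with the hypothesized cdf.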

11.3.2 Correlation Statistics

The statistics in this class, as discussed in Chapter 5, basically focus upon the strength of the pattern of linear association which is present in probability plots (see Chapter 2 and Section 11.2). Suppose all the observations U(i) are known between U(s) and U(r). These may be plotted against mi = i/(n + 1), i = s, ..., r, and the coefficient R(X,m) may be calculated as described in Section 5.1.2. Because R is scale-free, the value obtained is the same as if the U(i) were correlated with i, from 1 to r - s + 1. The U(i) are a subset of order statistics from a uniform (0,1) sample and will themselves be uniform between limits which may or may not be known. In either case, the distribution of R² so calculated is the same under H₀ as that of R² for a full sample of size r - s + 1. Thus Table 5.1 may be used to make a test. The weakness in this test procedure is that it does not make use of any Type 1 censoring values. In Chapter 5 it is shown how this may be overcome, by including the censoring limits in the observed sample. The value of R² calculated from these values will have the same null distribution as R² calculated from a complete uniform sample of size r - s + 3, so that again Table 5.1 can be used.

E 11.3.2.1 Exponential Example Revisited

For the r = 7 values of U(i) in Table 11.4 the correlation coefficient R(X,m) = 0.964, which yields T = r(1 - R²(X,m)) = 0.49. Reference to Table 5.1 shows that this value is not significant even at the 50% level. If the endpoints s = 0 and t = 0.1975 are included, then T = 1.071, which is significant at the .20 level approximately.
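Because R is invariant under linear transformations of either coordinate, the quoted values can be checked from the rescaled values V(i) = U(i)/0.1975 listed in Example E 11.3.3.1, correlated against the index i (a linear function of the plotting positions i/(n + 1)). A minimal sketch:

```python
import math

def corr(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# V(i) = U(i)/0.1975 for the r = 7 values of Table 11.4; rescaling the
# data and replacing i/(n+1) by i are both linear changes, so R is the
# same as for the U(i) against i/21.
v = [0.050, 0.100, 0.150, 0.199, 0.342, 0.482, 0.661]
r = len(v)
R = corr(v, list(range(1, r + 1)))
T = r * (1 - R ** 2)
print(R, T)  # R is near the 0.964 of the text, T near 0.49
```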

11.3.3 Transformations to Enable Complete-Sample Tests

11.3.3.1 Conditioning on the Censoring Values

When all the values from U(s) to U(r) (s < r) are available, the test of H₀: that these are a subset from an ordered uniform sample, can be changed to a test for a complete sample. There are several ways to do this. The simplest method is as follows. For Type 1 censoring suppose the lower censoring value is A and the upper censoring value is B, and let R = B - A; then under H₀ the values V(i) = (U(i) - A)/R, i = 1, ..., r - s + 1, will be a complete ordered uniform sample on the unit interval and can be so tested. For Type 2 censoring, under H₀ the values U(s+1), ..., U(r-1) will be distributed as a complete ordered sample from the uniform distribution between limits A = U(s) and B = U(r). The transformation V(i) = (U(s+i) - U(s))/R can be made for i = 1, ..., n*, where n* = r - 1 - s and R = B - A; the V(i), i = 1, ..., n*, can then be tested for uniformity between 0 and 1.

E 11.3.3.1 Exponential Example Again

Consider, again, the data of Table 11.4. The upper (Type 1) censoring value is t = 0.1975; thus we can first transform the U(i) as V(i) = U(i)/0.1975 and then test the V(i) for uniformity between 0 and 1. The V(i) are then 0.050, 0.100, 0.150, 0.199, 0.342, 0.482, and 0.661. The EDF statistics are D⁺ = 0.375, D⁻ = 0.050, D = 0.375, V = 0.426, W² = 0.413, U² = 0.085, and A² = 2.107. Reference to Case 0 tables (Table 4.2) gives p-values of 0.21 for D, 0.07 for W², and 0.08 for A².
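These full-sample (Case 0) EDF statistics can be reproduced from the standard computing formulas of Chapter 4; the sketch below assumes the usual forms for D⁺, D⁻, V, W², U², and A².

```python
import math

def edf_statistics(v):
    """Case 0 EDF statistics for ordered values v treated as a complete
    uniform (0,1) sample (standard computing formulas of Chapter 4)."""
    n = len(v)
    d_plus = max((i + 1) / n - vi for i, vi in enumerate(v))
    d_minus = max(vi - i / n for i, vi in enumerate(v))
    d = max(d_plus, d_minus)
    kuiper = d_plus + d_minus                       # Kuiper's V
    w2 = sum((vi - (2 * i + 1) / (2 * n)) ** 2      # Cramér-von Mises W²
             for i, vi in enumerate(v)) + 1 / (12 * n)
    u2 = w2 - n * (sum(v) / n - 0.5) ** 2           # Watson's U²
    a2 = -n - sum((2 * i + 1) * (math.log(v[i]) + math.log(1 - v[n - 1 - i]))
                  for i in range(n)) / n            # Anderson-Darling A²
    return d_plus, d_minus, d, kuiper, w2, u2, a2

v = [0.050, 0.100, 0.150, 0.199, 0.342, 0.482, 0.661]
dp, dm, d, V, w2, u2, a2 = edf_statistics(v)
# Compare D = 0.375, V = 0.426, W² = 0.413, U² = 0.085, A² = 2.107 above;
# V can differ by one in the last digit because the V(i) here are rounded.
print(round(d, 3), round(V, 3), round(w2, 3), round(u2, 3), round(a2, 3))
```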

11.3.3.2 Handling Blocks of Missing Observations

Suppose censoring occurs in a uniform sample other than at the ends; for example, U(r) and U(r+q) might be known, but the q - 1 observations in between are not known. A spacing U(j+q) - U(j) is called a q-spacing. Now suppose that S is a q-spacing covering unknown observations, and let its length be d. Keeping in mind the exchangeability of uniform spacings (see Chapter 8) we exchange S with the set of all spacings to the right of S. Under H₀ the new sample U(1), U(2), ..., U(r), U*(r+1), ..., U*(n-q+1), where U*(j) = U(j+q) - d, j = r + 1, ..., n - q + 1 (with U(n+1) taken to be 1), will be distributed as an ordered uniform sample which is right-censored at U*(n-q+1). The process may be repeated if there is more than one such spacing. The method can be used only if it is known how many values are missing in the spacings. Thus a uniform (0,1) sample with known blocks of missing observations can be transformed to behave like a right (or left) censored sample. Techniques for this simpler kind of censoring can then be used.
484 MICHAEL AND SCHUCANY

E 11.3.3.2.1 Example from Chapter 4

In Table 4.13, a set of 15 values for Z is given which are distributed uniformly on (0,1) under the null hypothesis that the original set X (also given there) is exponential. Suppose the four values 0.237, 0.252, 0.252, 0.381 are lost from the Z set. Then 0.434 - 0.229 is a 5-spacing of length d = 0.205. We subtract d from all the values of Z starting with .446 to obtain 0.113, 0.189, 0.229, 0.241 (= 0.446 - d), 0.298, 0.317, 0.578, 0.757, 0.774, 0.778 (= 0.983 - d), 0.795 (= 1.0 - d). These 11 values can be analyzed as being right censored (Type 2) at 0.795, and thus can be tested by any of the methods of Section 11.3.1. Alternatively, they can be transformed to be a complete sample, as in Section 11.3.3.1 above, or by another method to be described after the next example.
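The exchange of a q-spacing with the spacings to its right can be coded directly. In the sketch below the unshifted Z values beyond the gap are reconstructed by adding d = 0.205 back to the shifted values quoted above (only 0.446, 0.983, and 1.0 are stated explicitly in the text, so this reconstruction is an assumption), and the function name is ours.

```python
def close_gap(values, left_idx):
    """Exchange the q-spacing that begins at values[left_idx] (and ends
    at the next listed value) with all spacings to its right: everything
    beyond the spacing is shifted left by its length d, and 1 - d becomes
    the new right-censoring point, the missing block now lying above it."""
    d = values[left_idx + 1] - values[left_idx]
    return (values[: left_idx + 1]
            + [x - d for x in values[left_idx + 2:]]
            + [1.0 - d])

# Observed Z values of the example (0.237, 0.252, 0.252, 0.381 are lost
# between 0.229 and 0.434).
z = [0.113, 0.189, 0.229, 0.434, 0.446, 0.503, 0.522,
     0.783, 0.962, 0.979, 0.983]
print([round(x, 3) for x in close_gap(z, 2)])
```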

E 11.3.3.2.2 Exponential Example Modified

These various techniques may be combined to handle blocks of missing observations within, say, right-censored data. Thus suppose the values in the U-set of Table 11.4, which are Type 1 right censored at t = 0.1975, are, in fact, the values U(1), U(2), U(3), U(4), U(8), U(9), U(10); that is, the values U(5), U(6), U(7) constitute a block of missing observations. First the set of U's may be transformed to a uniform sample as described in Section 11.3.3.1 above to give a new set V(i) = U(i)/0.1975. The values are those given in Example E 11.3.3.1, but they now represent the order statistics with indices 1, 2, 3, 4, 8, 9, 10. There is thus a 4-spacing of length d = 0.342 - 0.199 = 0.143 between V(4) and V(8). Following the steps of this section, new values V* are found to be V*(5) = V(9) - 0.143 = 0.339, V*(6) = 0.518, and V*(7) = 1 - d = 0.857. These 7 values are to be treated as a right-censored sample of size 7, now of Type 2, with n = 10 (the 7 given values plus the 3 missing in the 4-spacing). Since the lower end-point of the distribution is known to be zero, the values 0.050, 0.100, 0.150, 0.199, 0.339, 0.518 can be divided by 0.857 and then tested for uniformity on the unit interval.
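The whole pipeline of this example fits in a few lines (a sketch; variable names are ours):

```python
# Rescaled values V(i) = U(i)/0.1975 with order-statistic indices
# 1, 2, 3, 4, 8, 9, 10 in a sample of n = 10.
v = [0.050, 0.100, 0.150, 0.199, 0.342, 0.482, 0.661]
d = v[4] - v[3]                       # 4-spacing (V(4), V(8)): d = 0.143
shifted = v[:4] + [x - d for x in v[5:]] + [1.0 - d]
print([round(x, 3) for x in shifted])

# Lower end-point known to be zero: divide by the censoring value and
# test the remaining six values for uniformity on (0, 1).
rescaled = [x / shifted[-1] for x in shifted[:-1]]
```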

11.3.3.3 M ore Powerful Transform ations

A disadvantage of the above method of transforming to a complete sample for Type 2 censoring is that the resulting test examines the values of the U(i) relative to U(s) and U(r) but takes no account of whether these values themselves are too large or too small. (See Comment A in Section 11.3.1.)
Michael and Schucany (1979) propose a modification of the above technique by which a subset of r uniform order statistics can be transformed monotonically to behave like a complete sample of size r from the U(0,1) distribution. For definiteness the result is presented here in terms of right censorship; however, the technique can be applied to any
kind of Type 2 censoring. For example, a q-spacing representing a block of (q - 1) missing observations in a sample of size n can be shrunk to a 1-spacing in a "complete" sample of size n - q + 1. The relative spacings between consecutive order statistics are not affected by this transformation.
Let U(1), U(2), ..., U(r) be the smallest r order statistics from a random sample of size n from the uniform (0,1) distribution, and let B*(·) denote the cdf of U(r), which is known to have the beta (r, n - r + 1) distribution (see Section 8.8.2). If Z(1), Z(2), ..., Z(r) are defined by

Z(i) = U(i) h_{r,n}(U(r))          (11.10)

where h_{r,n}(x) = {B*(x)}^(1/r)/x, then the Z(i), i = 1, ..., r, are distributed like a complete uniform (0,1) sample of size r. The proof is straightforward by change of variable.
The computations for the transformation are easily performed on a scientific calculator since the beta cdf can be expressed as the binomial sum

B*(x) = Σ_{i=r}^{n} C(n,i) x^i (1 - x)^(n-i)

Any standard goodness-of-fit test for uniformity (Chapter 8) may now be applied to the transformed observations. The Anderson-Darling statistic is recommended because of its sensitivity to departures from uniformity in the tails of the distribution. The reason why this is important is best presented by illustration.

E 11.3.3.3.1 Artificial Uniform Right-Censored Sample

Three artificial but informative examples of the transformation are shown in Figure 11.6, where the smallest five of nine observations are plotted both before and after the transformation. In each example the values of the U(i) were artificially chosen to satisfy U(i)/U(5) = i/5 = E(U(i)/U(5) | U(5)). The values for U(5) were chosen to be .500, .103, and .897, which correspond, respectively, to the .500, .001, and .999 quantiles of the beta (5,5) distribution, which is the null distribution of U(5) when testing the hypothesis of uniformity for the U(i). Note the manner in which small and large values of U(5) affect the appearance of the transformed points. Small values of U(5) lead to small values of Z(5) which, in turn, will inflate most reasonably formulated goodness-of-fit statistics. But if Z(5) is large, the departure from uniformity may appear less pronounced; however, Z(5) will be very close to 1 and this will inflate a statistic like the Anderson-Darling statistic, which is sensitive to such an apparent departure from uniformity.

a. U(5) = .500
b. U(5) = .103
c. U(5) = .897
[dot plots of the U(i) and Z(i) for each panel omitted]

FIGURE 11.6 Examples of the transformation with r = 5, n = 9.

E 11.3.3.3.2 Exponential Example

Consider again the values U(i), i = 1, ..., 7, given in Table 11.4. Using the transformation above, we first compute the scale factor h = h_{7,20}(U(7)):

h = [B*(0.13064)]^(1/7)/0.13064 = 3.9991

where B*(·) is the beta (7,14) cdf. The values Z(i) = hU(i) are given in Table 11.4. The Cramér-von Mises statistic W², calculated from the seven values of Z(i) by the full-sample formulas (Equation 4.2), is 0.673, and the Anderson-Darling statistic A² = 3.404. These have approximate p-values of 0.035 and 0.02. The Kolmogorov-Smirnov statistic is D = 0.47755, which has a p-value of approximately 0.056. The p-values using the transformed Z-values are lower than those using the statistics directly adapted for censoring (see Section 11.3.1).
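The scale factor is easy to evaluate with the binomial form of B* given earlier; a sketch (function names are ours):

```python
import math

def beta_cdf_via_binomial(x, r, n):
    """B*(x) = P(U(r) <= x) for the r-th of n uniform order statistics,
    written as the binomial tail sum quoted in the text."""
    return sum(math.comb(n, i) * x ** i * (1 - x) ** (n - i)
               for i in range(r, n + 1))

def h_factor(x, r, n):
    """Scale factor h(r,n)(x) = {B*(x)}**(1/r) / x of Equation (11.10)."""
    return beta_cdf_via_binomial(x, r, n) ** (1 / r) / x

# Exponential example: r = 7 of n = 20, U(7) = 0.13064.
h = h_factor(0.13064, 7, 20)
print(round(h, 4))  # close to the 3.9991 quoted in the text
```

The transformed values Z(i) = h·U(i) are then tested with any full-sample procedure.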

Comment

This transformation technique for goodness-of-fit analysis of censored samples has some advantages over the other procedures which have been proposed for this problem. No new or additional tables of critical points are required. Any subset of order statistics can be analyzed. The power of the Anderson-Darling statistic based on the transformed sample appears to be generally greater than that of existing methods in the presence of left or right censorship. A minor disadvantage is the slight increase in computation to evaluate the scaling factor, h_{r,n}(U(r)). The technique can be extended to all kinds of Type 2 censoring, even progressive censoring. For details and asymptotic results see Michael and Schucany (1979).

11.4 TESTING A COMPOSITE HYPOTHESIS

In this section the hypothesis of interest is that the sampled population has an absolutely continuous cdf F(y|θ), where θ is a vector of unknown (nuisance) parameters. Typically the censored data at hand must be a singly-censored sample if published tables of critical points are to be used. For more complicated types of censoring, such as multiple right censoring, little work has been done. For a particular set of data, it may be possible to modify a standard statistic and then estimate certain percentiles, or the observed significance level, using simulation techniques. When the censoring is Type 2, test statistics can often be constructed which have parameter-free null distributions. When the censoring is Type 1, statistics with asymptotically parameter-free distributions are a possibility.

11.4.1 Omnibus Tests

Turnbull and Weiss (1978) present an omnibus test for a composite null hypothesis based on the generalized likelihood ratio statistic. Their procedure is appropriate for discrete or grouped data and accommodates multiple censoring by employing the Kaplan-Meier estimate to maximize the alternative likelihood. In less complicated cases of Type 1 or 2 censoring several standard goodness-of-fit statistics have been modified to test a composite null hypothesis.

11.4.1.1 EDF Statistics for Censored Data with Unknown Parameters

Modifications of EDF statistics which accommodate certain types of censoring when the null hypothesis is simple were discussed in Section 11.3.1.1. Similar modifications for use in testing normality with unknown parameters, or exponentiality with unknown scale, are given in Sections 4.8.4 and 4.9.5.

E 11.4.1.1.1 Normal example

The data consist of the smallest 20 values among the first 40 values listed in the NOR data set. We wish to test that the underlying distribution is normal. Gupta's estimates (Gupta, 1952) here are μ̂ = 98.233 and σ̂ = 9.444. Relevant calculations are given in Table 11.5. The value of the Cramér-von Mises statistic is found to be, using Section 4.7.3,

₂W²_{20,40} = 0.02512 + 20/{12(40)²} - (40/3)(0.5 - 0.53741)³ = 0.02686

TABLE 11.5 Steps in Calculating ₂W²_{20,40} for the Smallest 20 Order Statistics Among the First 40 Observations in the NOR Data Set, where Z(i) = Φ((Y(i) - μ̂)/σ̂)

 i    Y(i)    (i - 0.5)/n    Z(i)      [Z(i) - (i - 0.5)/n]²
 1    79.43   0.0125         0.02323   0.00012
 2    83.53   0.0375         0.05974   0.00049
 3    83.67   0.0625         0.06152   0.00000
 4    84.27   0.0875         0.06962   0.00032
 5    85.29   0.1125         0.08525   0.00074
 6    87.83   0.1375         0.13531   0.00000
 7    89.00   0.1625         0.16411   0.00000
 8    89.90   0.1875         0.18878   0.00000
 9    90.03   0.2125         0.19252   0.00040
10    90.87   0.2375         0.21778   0.00039
11    91.46   0.2625         0.23662   0.00067
12    92.02   0.2875         0.25529   0.00104
13    92.45   0.3125         0.27014   0.00179
14    92.55   0.3375         0.27364   0.00408
15    95.45   0.3625         0.38411   0.00047
16    96.13   0.3875         0.41188   0.00059
17    96.20   0.4125         0.41477   0.00001
18    98.70   0.4375         0.51972   0.00676
19    98.98   0.4625         0.53152   0.00476
20    99.12   0.4875         0.53741   0.00249
                             Sum:      0.02512

Referring to Table 4.5, we find that the observed value is smaller than the .15 point which, by interpolation, is approximately 0.03. The value of ₂A²_{20,40} is 0.233; this is significant at about the .10 level.
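The computation can be checked directly from the Z(i) column of Table 11.5; small differences in the last digit arise because the table's squared terms are rounded before summing.

```python
# Z(i) column of Table 11.5 (Z(i) = Phi((Y(i) - 98.233)/9.444)).
z = [0.02323, 0.05974, 0.06152, 0.06962, 0.08525, 0.13531, 0.16411,
     0.18878, 0.19252, 0.21778, 0.23662, 0.25529, 0.27014, 0.27364,
     0.38411, 0.41188, 0.41477, 0.51972, 0.53152, 0.53741]
r, n = 20, 40
# Sum of squared deviations from the plotting positions (2i-1)/(2n).
ss = sum((zi - (2 * i + 1) / (2 * n)) ** 2 for i, zi in enumerate(z))
# Censored Cramér-von Mises statistic of Section 4.7.3.
w2 = ss + r / (12 * n ** 2) - (n / 3) * (r / n - z[-1]) ** 3
print(round(ss, 5), round(w2, 5))  # near 0.02512 and 0.02686
```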
EDF tests for exponentiality with an unknown scale parameter are set out in Section 4.9. Note that use of the N transformation of Chapter 10 (see Section 10.5.6) converts a right-censored exponential sample to a complete sample of exponentials, with the same scale, and then any of the tests of Chapter 10 can be used. This property is explored in a test based on leaps, in Section 11.4.1.3 below.

11.4.1.2 Correlation Statistics

Consider again the sample correlation coefficient between the Y(i) and a set of constants Ki, denoted, as in Chapter 5, by R(Y,K). Because R(Y,K) is invariant with respect to a linear transformation of the Y(i), it follows that its null distribution does not depend on location or scale parameters of the distribution. This makes it a useful statistic for testing fit. Suppose F(y) is the cdf of Y for location parameter 0 and scale parameter 1, and let F⁻¹(·) be its inverse. Suitable sets of constants are then Ki = mi, where mi is the expected value of the ith order statistic of a sample of size n from F(y), or Ki = Hi = F⁻¹(i/(n + 1)). The statistic Z = n{1 - R²(Y,K)} has been discussed in Chapter 5, and percentage points have been given for censored versions of Z.
Chen (1984) presents a correlation statistic as an omnibus test for the composite hypothesis of exponentiality in the presence of random censoring. Asymptotic distributions are derived under a particular censorship model; the test is quite robust provided that less than 40% of the observations are censored.

E 11.4.1.2.1 Normal example revisited

Consider again the smallest 20 values among the first 40 values in the NOR data set. When testing for normality using R(Y,K), we obtain Z = 0.035, which falls just below the .50 point. Thus on the basis of the statistic Z we cannot reject the hypothesis of normality at the usual levels. Another example of a correlation-type statistic is given in the next section.

11.4.1.3 Statistics Based on Spacings and on Leaps

Spacings between ordered uniforms were defined in Section 10.9.3. Similarly, spacings can be defined between order statistics Y(i) of a sample from any distribution. If the distribution has no lower limit, the first spacing will be D(1) = Y(2) - Y(1), and so on. Similarly, leaps ℓ(i) can be defined by ℓ(i) = D(i)/E(D(i)). An important property of leaps is that, for continuous distributions, they will (under regularity conditions) be asymptotically exponentially distributed with mean 1. Then a test for a given distribution with unknown location and scale parameters is reduced to a test similar to a test for exponentiality of the ℓ(i). The test will not be exactly the same as a test for exponentiality because the ℓ(i) do not become an independent sample, even asymptotically. We illustrate the technique with an example given by Mann, Scheuer, and Fertig (1973), who created a test for the extreme-value distribution by using leaps; the test is for right-censored data and can be adapted to a test for the Weibull distribution. Both these features are illustrated in the example.

E 11.4.1.3.1 Weibull Example

The following values are t(i), i = 1, ..., 15, the first 15 order statistics of a sample of 22 t-values; the null hypothesis is H₀: the t-sample comes from a two-parameter Weibull distribution (Section 10.4.4), against a three-parameter Weibull (with positive origin) as alternative. Values are: 15.5, 15.6, 16.5, 17.5, 19.5, 20.6, 22.8, 23.1, 23.5, 24.5, 26.5, 26.5, 32.7, 33.8, 33.9. The steps in making the test are as follows. First find X(i) = log t(i), i = 1, ..., 15; H₀ then reduces to a test that the X(i) are from an extreme-value distribution (equation 4.7 or 5.22) with unknown location and scale parameters. The denominator E(D(i)) of the leap ℓ(i) depends on the unknown scale, so test statistics are calculated from normalized spacings. These are defined by y(i) = (X(i+1) - X(i))/(m(i+1) - m(i)), where the m(i) are the expected values of the order statistics from an extreme-value distribution with location 0 and scale 1. Tests based on normalized spacings, including the Mann, Scheuer, and Fertig statistic S, are discussed in Section 4.20. From the data above, the 14 normalized spacings are .0063, .1070, .1640, .3912, .2408, .5177, .0752, .1089, .2858, .5724, 0.0, 1.6651, .2676, .0241.
In the notation of Section 4.20, these give 13 values z(i) = Σ_{j=1}^{i} y(j) / Σ_{j=1}^{14} y(j); these are .0014, .0256, .0626, .1510, .2054, .3224, .3394, .3640, .4286, .5579, .5579, .9341, .9946. The Mann-Scheuer-Fertig statistic is S = 1 - z(7) = 0.661; the authors suggest a one-tail test, and reference to their tables shows S to be significant at about the 11% level. Tiku and Singh (1981) proposed using the mean z̄ of the z(i), and Lockhart, O'Reilly, and Stephens (1985) have suggested A², the Anderson-Darling statistic calculated from the z(i). These statistics also are discussed in Section 4.20. For the data above, z̄ = 0.380 and A² = 1.878; these are significant at about the 4% and 5% levels, respectively. These statistics appear to be more sensitive than S.
Mann and Fertig (1975) consider ratios of other sums of leaps as well as ratios of weighted sums of leaps, and describe how their approach can be extended to progressively censored samples. For further discussion see Section 4.20.
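The z(i), S, z̄, and A² can be reproduced from the quoted normalized spacings (a sketch using the standard Anderson-Darling computing formula):

```python
import math

# The 14 normalized spacings quoted above.
y = [0.0063, 0.1070, 0.1640, 0.3912, 0.2408, 0.5177, 0.0752,
     0.1089, 0.2858, 0.5724, 0.0, 1.6651, 0.2676, 0.0241]
total = sum(y)
z, s = [], 0.0
for yi in y[:-1]:                       # z(i), i = 1, ..., 13
    s += yi
    z.append(s / total)

S = 1 - z[6]                            # Mann-Scheuer-Fertig, 1 - z(7)
zbar = sum(z) / len(z)                  # Tiku-Singh mean statistic
n = len(z)
a2 = -n - sum((2 * i + 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i]))
              for i in range(n)) / n    # Anderson-Darling from the z(i)
print(round(S, 3), round(zbar, 3), round(a2, 3))
```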
We can use this example also to illustrate the use of correlation statistics for the extreme-value distribution. The test is for distribution (5.22), which has a short tail to the right and a long left tail. The r = 15 values of X(i) are tested to correlate with Hi = log[-log{1 - i/(n + 1)}], i = 1, ..., 15, n = 22; then R = 0.9446 and Z = r{1 - R²(X,H)} = 1.616. Interpolation in Table 5.10, for n = 22 and p = r/n = 0.68, shows Z to be significant at approximately the 0.25 level.
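The correlation test can be verified from the t-values themselves (a sketch; the helper function is ours):

```python
import math

def corr(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

t = [15.5, 15.6, 16.5, 17.5, 19.5, 20.6, 22.8, 23.1, 23.5, 24.5,
     26.5, 26.5, 32.7, 33.8, 33.9]
n, r = 22, len(t)
x = [math.log(v) for v in t]
H = [math.log(-math.log(1 - i / (n + 1))) for i in range(1, r + 1)]
R = corr(x, H)
Z = r * (1 - R ** 2)
print(R, Z)  # near the 0.9446 and 1.616 quoted in the text
```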
It might be useful also to illustrate a danger which may arise in testing for the extreme-value distribution. For a full sample, it does not matter whether one takes X = log t, where t is Weibull data, and tests that X is from (5.22), or takes X′ = -log t and tests that X′ is from (5.21); the same value of the correlation coefficient is obtained by both methods, and both recommendations are seen in the literature. However, for a censored sample, it is important to follow the correct procedure: for right-censored Weibull data, take X = log t and test for right-censored data from (5.22), as in the example above, and for left-censored Weibull data, take X′ = -log t and test for right-censored data from (5.21). This second test for Weibull is probably less likely to occur in practice, and the Mann, Scheuer, and Fertig test is not set up for this case, although it could be adapted.
Two tests for the two-parameter exponential that can be used with doubly censored samples have been presented by Brain and Shapiro (1983). These tests combine the properties of spacings and of the correlation statistic to have good sensitivity to alternatives with monotone and nonmonotone hazard functions, respectively. Still other related work on statistics based on spacings may be found in Mehrotra (1982). Some statistics based on modified leaps have been studied by Tiku (1980, 1981).

11.4.2 Alternative Families of Distributions

Typically when testing for goodness of fit we assume only that the underlying cdf is absolutely continuous. Occasionally we may wish to limit our choices to, say, two families of distributions. In particular we may wish to test the composite null hypothesis

H₁: F(y) = F₁(y|θ₁)

against the composite alternate hypothesis

H₂: F(y) = F₂(y|θ₂)

where θ₁ and θ₂ are unknown (nuisance) parameters. Because we have narrowed the set of alternate distributions considerably, we should be able to tailor tests to the specific hypothesis of interest which are more powerful than omnibus goodness-of-fit tests. There have been several approaches to this problem.
Let fᵢ(y|θᵢ) be the probability density function for family i, i = 1, 2. We will denote by Lᵢ the sample likelihood under Hᵢ after θᵢ has been replaced with its maximum likelihood estimator θ̂ᵢ. This maximized likelihood is then

Lᵢ = Π_{j=1}^{n} fᵢ(Y(j); θ̂ᵢ)

for the complete sample Y(1), ..., Y(n). We will denote the ratio of maximized likelihoods by

RML = L₁/L₂

Cox (1961, 1962) formulates a test of H₁ versus H₂ which is based upon the statistic

T = ln (RML) - E[ln (RML)]

where E is the expectation under the null hypothesis, H₁. For complete samples the large-sample distribution of T is approximated using maximum likelihood theory. Hoadley (1971) extends maximum likelihood theory to situations which include censoring. Thus valid approximations to the distribution of T are also possible with censored samples.
For location-scale families with pdfs fᵢ((y - μᵢ)/σᵢ)/σᵢ, Lehmann (1959)
shows that the uniformly most powerful invariant (under linear transformations) test is based upon the (Lehmann) ratio of integrals

LRI = I₁/I₂

where

Iⱼ = ∫_{-∞}^{∞} ∫_{0}^{∞} fⱼ(vy(1) + u, ..., vy(n) + u) dv du

The RML statistic and some modified versions are discussed in a series of papers: Antle (1972, 1973, 1975); Dumonceaux, Antle, and Haas (1973); Dumonceaux and Antle (1973); Klimko and Antle (1975); Kotz (1973). Percentage points are given for the null distribution of RML for comparisons involving a number of different families of distributions. In some cases, the LRI and RML tests coincide. In others, the RML test is almost as powerful as the LRI test. The authors make use of the fact that the distribution of the RML statistic is parameter-free whenever the families to be compared are both location-scale families. This result appears to hold for any Type 2 censored sample. The only tables of critical points which have been constructed for use with censored samples appear in Antle (1975) and apply to the situation where one is testing the null hypothesis that the underlying distribution is Weibull (or extreme-value) against the alternate hypothesis of lognormality (or normality), and vice versa.

E 11.4.2.1 Lognormal vs. Weibull Example

We once more consider the smallest 20 values among the first 40 values listed in the NOR data set. We first exponentiated the data, and then proceeded to test the lognormal against the Weibull family. An interactive procedure was used to determine the value of RML. Entries in Table IX of Antle (1975) must be compared to the observed ratio, which here was determined to be 1.063. This value is just above the 95 percent point, and so we have the surprising result that we can reject the (true) hypothesis of normality in favor of the extreme-value distribution at the 0.05 level of significance.
Finally, a somewhat different approach to this general problem deserves mention. Farewell and Prentice (1977) construct a three-parameter family of generalized gamma distributions which includes the Weibull, lognormal, and gamma families as special cases. Likelihood ratio tests using asymptotic likelihood results are recommended which can accommodate censoring as well as regression variables.

REFERENCES

Antle, C. E. (1972). Choice of model for reliability studies and related topics. ARL 72-0108, Aerospace Research Laboratories, Wright-Patterson AFB, Ohio.

Antle, C. E. (1973). Choice of model for reliability studies and related topics, II. ARL 73-0121, Aerospace Research Laboratories, Wright-Patterson AFB, Ohio.

Antle, C. E. (1975). Choice of model for reliability studies and related topics, III. ARL 75-0133, Aerospace Research Laboratories, Wright-Patterson AFB, Ohio.

Barnett, V. (1975). Probability plotting methods and order statistics. J. R. Statist. Soc. C 24, 95-108.

Barr, D. R. and Davidson, T. (1973). A Kolmogorov-Smirnov test for censored samples. Technometrics 15, 739-757.

Brain, C. W. and Shapiro, S. S. (1983). A regression test for exponentiality: Censored and complete samples. Technometrics 25, 69-76.

Breslow, N. and Crowley, J. (1974). A large sample study of the life table and product limit estimates under random censorship. Ann. Statist. 2, 437-453.

Chen, C. H. (1984). A correlation goodness-of-fit test for randomly censored data. Biometrika 71, 315-322.

Cox, D. R. (1961). Tests of separate families of hypotheses. Proc. 4th Berkeley Symp. 1, 105-123.

Cox, D. R. (1962). Further results on tests of separate families of hypotheses. J. R. Statist. Soc. B 24, 406-424.

Cox, D. R. (1972). Regression models and life tables (with discussion). J. R. Statist. Soc. B 34, 187-220.

Cox, D. R. (1975). Partial likelihood. Biometrika 62, 269-276.

Dufour, R. and Maag, U. R. (1978). Distribution results for modified Kolmogorov-Smirnov statistics for truncated or censored samples. Technometrics 20, 29-32.

Dumonceaux, R. and Antle, C. E. (1973). Discrimination between the lognormal and Weibull distributions. Technometrics 15, 923-926.

Dumonceaux, R., Antle, C. E., and Haas, G. (1973). Likelihood ratio test for discrimination between two models with unknown location and scale parameters. Technometrics 15, 19-27.

Farewell, V. T. and Prentice, R. L. (1977). A study of distributional shape in life testing. Technometrics 19, 69-75.

Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics 17, 111-117.

Fleming, T. R., O'Fallon, J. R., O'Brien, P. C., and Harrington, D. P. (1980). Modified Kolmogorov-Smirnov test procedures with application to arbitrarily right-censored data. Biometrics 36, 607-625.

Gaver, D. P. and Miller, Jr., R. G. (1983). Jackknifing the Kaplan-Meier survival estimator for censored data: Simulation results and asymptotic analysis. Commun. Statist. 12, 1701-1718.

Gupta, A. K. (1952). Estimation of the mean and standard deviation of a normal population from a censored sample. Biometrika 39, 260-273.

Hahn, G. J. and Shapiro, S. S. (1967). Statistical Methods in Engineering. Wiley, New York.

Herd, G. R. (1960). Estimation of reliability from incomplete data. Proceedings of the Sixth National Symposium on Reliability and Quality Control, 202-217.

Hoadley, B. (1971). Asymptotic properties of maximum likelihood estimators for the independent not identically distributed case. Ann. Math. Statist. 42, 1977-1991.

Johnson, L. G. (1964). The Statistical Treatment of Fatigue Experiments. Elsevier, New York.

Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53, 457-481.

Kimball, B. F. (1960). On the choice of plotting positions on probability paper. J. Amer. Statist. Assoc. 55, 546-560.

Klimko, L. A. and Antle, C. E. (1975). Tests for normality versus lognormality. Commun. Statist. 4, 1009-1019.

Kotz, S. (1973). Normality versus lognormality with applications. Commun. Statist. 1, 113-132.

Koziol, J. A. (1980a). Percentage points of the asymptotic distributions of one and two sample Kuiper statistics for truncated or censored data. Technometrics 22, 437-442.

Koziol, J. A. (1980b). Goodness-of-fit tests for randomly censored data. Biometrika 67, 693-696.

Lehmann, E. L. (1959). Testing Statistical Hypotheses. Wiley, New York.

Mann, N. R., Scheuer, E. M., and Fertig, K. W. (1973). A new goodness-of-fit test for the two-parameter Weibull or extreme-value distribution with unknown parameters. Commun. Statist. 2, 383-400.

Mann, N. R. and Fertig, K. W. (1975). A goodness-of-fit test for the two-parameter vs. three-parameter Weibull; confidence bounds for threshold. Technometrics 17, 237-245.

Mehrotra, K. G. (1982). On goodness-of-fit tests based on spacings for Type II censored samples. Commun. Statist. 11, 869-878.

Michael, J. R. (1977). Goodness of fit: Type II censoring, influence functions. Unpublished Ph.D. dissertation, Southern Methodist University.

Michael, J. R. and Schucany, W. R. (1979). A new approach to testing goodness of fit for censored samples. Technometrics 21, 435-441.

Nair, V. N. (1981). Plots and tests for goodness of fit with randomly censored data. Biometrika 68, 99-103.

Nelson, W. (1972). Theory and applications of hazard plotting for censored failure data. Technometrics 14, 945-966.

Nelson, W. (1973). Analysis of residuals from censored data. Technometrics 15, 697-715.

Peterson, A. V. (1977). Expressing the Kaplan-Meier estimator as a function of empirical subsurvival functions. J. Amer. Statist. Assoc. 72, 854-858.

Pettitt, A. N. and Stephens, M. A. (1976a). Cramér-von Mises type statistics for the goodness of fit of censored data—simple hypothesis. Technical Report No. 229, Department of Statistics, Stanford University.

Pettitt, A. N. and Stephens, M. A. (1976b). Modified Cramér-von Mises statistics for censored data. Biometrika 63, 291-298.

Schey, H. M. (1977). The asymptotic distribution for the one-sided Kolmogorov-Smirnov statistic for truncated data. Commun. Statist. A6, 1361-1366.

Tiku, M. L. (1980). Goodness-of-fit statistics based on spacings of complete or censored samples. Australian J. of Statist. 22, 260-275.

Tiku, M. L. (1981). A goodness-of-fit statistic based on the sample spacings for testing a symmetric distribution against symmetric alternatives. Australian J. of Statist. 23, 149-158.

Trindade, D. C. and Haugh, L. D. (1979). Nonparametric estimation of a lifetime distribution via the renewal function. Technical Report 19.0463, IBM.

Turnbull, B. W. (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Statist. Soc. B 38, 290-295.

Turnbull, B. W. and Weiss, L. (1978). A likelihood ratio statistic for testing goodness of fit with randomly censored data. Biometrics 34, 367-375.
496 MICHAEL AND SCHUCANY

Wilk, M. B. and Gnanadesikan, R. (1968). Probability plotting methods for the analysis of data. Biometrika 55, 1-17.
12
The Analysis and Detection of Outliers
G. L. Tietjen, Los Alamos National Laboratory, Los Alamos, New Mexico

12.1 INTRODUCTION

The term "outlier" (straggler, sport, maverick, flyer, or a wild, aberrant, discordant, or anomalous observation) has at best a subjective definition. It is an observation "so far separated . . . from the remainder that [it] gives rise to the question of whether [it is] not from a different population" (Kendall and Buckland, 1957), or "one that appears to deviate markedly from other members of the sample" (Grubbs, 1959). We shall adopt in this chapter the definition given by Beckman and Cook (1983): a discordant observation is one that appears surprising or discrepant to the investigator; a contaminant is one that does not come from the target population; an outlier is either a contaminant or a discordant observation.
In order for an observation to "appear surprising" to the investigator, he must have in mind some model of the data (symmetry, normality, upper bounds) which he is applying. We shall discuss here only the underlying assumption of normality, since there is very little theory for any other case. The assumption of normality should not be taken lightly: the investigator needs some experience with his data-generating process in order to decide whether the assumption holds. Lacking experience, he needs to ask himself whether the assumption is theoretically reasonable; he must bear in mind that there are numerous sets of data with genuinely skewed distributions (in which case the outlier theory is not applicable unless the data can be transformed to normality) and other instances where the data arise from a mixture of distributions. Gumbel (1960) has stated that "The rejection of outliers on a purely statistical basis is and remains a dangerous procedure. Its very existence may be a proof that the underlying population is, in reality, not what it was assumed to be."


Before beginning his search for outliers, the experimenter will need to ask himself why he is looking for outliers. It may be that he wishes only to estimate the mean and variance of his population. In that case, however dangerous the procedure, it may be more dangerous to do nothing. Anscombe (1960) has noted that "No observations are absolutely trustworthy," and that "one sufficiently erroneous reading can wreck the whole of a statistical analysis." A set of bivariate data is especially sensitive to outliers, and one observation can easily change the correlation coefficient from .01 to .99. A second important reason for looking for outliers is that interest may be centered in the outliers themselves. In prospecting for uranium, the prospector is interested only in the discordant observation; he is not at all interested in estimating the average background of a region. Beckman and Cook (1983) cite the search for the Russian satellite which crashed in Canada as a similar instance. Barnett (1978) discusses a court case of doubtful paternity where the mother gave birth to a child 349 days after the father went overseas. Is this gestation period an outlier or is it within the range of variation? The main interest here is not in estimating the mean background or the standard deviation of the human gestation period. A third reason for looking for outliers is for the information they may yield about the data-gathering process. Kruskal (1960) pursues this issue: "An apparently wild observation is a signal that says: 'Here is something from which we may learn a lesson, perhaps of a kind not anticipated beforehand and perhaps more important than the main object of the study.' Examples of such serendipity have been frequently discussed—one of the most popular is Fleming's recognition of the virtue of penicillin . . . . Much depends on what we are after . . . ."
Kruskal cites an example of five determinations of the concentration of a chemical in a certain mixture, one of which is badly out of line. It is determined that the outlier stemmed from a miscalibration affecting only the one observation. If the objective is to estimate the concentration of that particular mixture, the outlier could be forgotten or a correction made. If the goal is to investigate the method of measurement, the presence of the outlier "tells us something about the frequency and magnitude of serious errors in the method." If finding an outlier results in correcting a flaw in the measurement process, its discovery will be worthwhile. When an unusual observation is encountered, we should ask: (1) What was the likelihood, before taking the measurement, that something would go wrong with the experiment and that it would be wild? (2) Is there any evidence, other than its magnitude, that something did go wild? Can we check the notebooks to see whether the procedure was carried out properly or that the results were recorded correctly? What is done with an outlier may depend upon the answers to these questions. We agree with Kruskal (1960a) that "it is of great importance to preach the doctrine that apparent outliers should always be reported, even when one feels that their causes are known or when one rejects them for whatever good rule or reason. The immediate pressures of practical statistical analysis are almost uniformly in the direction of suppressing announcement of observations that do not fit the pattern; we must maintain a strong sea-wall against these pressures."
Anscombe (1960) has identified three sources of error in any measurement process: (1) the inherent variability in the experimental units themselves, (2) the error in the measuring instruments, and (3) execution error, or any discrepancy between what we intend to do and what is actually done. The latter may include measuring a subject not belonging to the population, measuring some characteristic other than the one intended, or selecting a biased sample. "If we could be sure," he continues, "that an outlier was caused by a large measurement or execution error which could not be rectified, we should be justified in discarding the observation and all memory of it. The act of observation would have failed; there would be nothing to report. Such an observation could just be described as spurious." [Following the Kruskal doctrine, we would report such a value, tell what caused the error, then forget about it.] In some cases, measurement or execution errors may be giving us measurements which are not extreme. In an interlaboratory experiment, for example, one laboratory may be reporting the mean of several measurements while another is reporting single observations. The means will tend to fall toward the center of the data rather than toward the extremes. Goodness-of-fit techniques, rather than outlier methods, should be resorted to in these cases.
Having given some thought to the objectives of an analysis, we need to realize that there are two principal methods of dealing with outliers: identification and accommodation. If the outliers are detected or identified, they may be treated in one of several ways:

1. Omit the outliers and treat the reduced sample as a "new" sample.
2. Omit the outliers and treat the reduced sample as a censored sample.
3. Winsorize the outliers, i.e., replace them with the value of the nearest "good" observation. This at least preserves the direction of measurement.
4. Ask the experimenter to take additional observations to replace the outliers.
5. Present one analysis including the outliers and another excluding them. If the results are very different, view the conclusions cautiously.

Accommodation of the outliers without previously identifying them falls into the area of robust estimation. This may take the form of using trimmed means or Winsorized means, using the median instead of the mean (an extreme form of the other two), or it may involve the use of a weighted estimator (omission and trimming correspond to zero weights, Winsorization to others, and Huber's M-estimation to still others). Estimation of the variance in these circumstances is a very different matter from estimation of the mean and may be much more difficult.
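As a minimal sketch of options 1 and 3 above (trimming versus symmetric Winsorization), the two helper functions and the small data set below are invented for illustration and are not part of the original text.

```python
def trimmed_mean(data, k=1):
    """Drop the k smallest and k largest values, then average the rest."""
    x = sorted(data)[k:len(data) - k]
    return sum(x) / len(x)

def winsorized_mean(data, k=1):
    """Replace the k smallest and k largest values with their nearest
    remaining neighbors (symmetric Winsorization), then average."""
    x = sorted(data)
    n = len(x)
    x[:k] = [x[k]] * k              # pull low extremes up to x[k]
    x[n - k:] = [x[n - k - 1]] * k  # pull high extremes down
    return sum(x) / n

sample = [3.1, 2.8, 3.0, 2.9, 3.2, 9.7]   # one aberrant value, 9.7
print(trimmed_mean(sample), winsorized_mean(sample))
```

Both estimates land near 3.05, while the raw mean is dragged above 4 by the single aberrant value; the two estimators differ only in whether the extremes are discarded or replaced.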

12.2 A SINGLE OUTLIER IN A UNIVARIATE SAMPLE

We begin first with the identification of a single outlier in a univariate sample of size n. Without going into the long history of work in this area (see Beckman and Cook, 1983), the best one can do is to obtain the mean (x̄) and the standard deviation (s) of the entire sample, and calculate the extreme studentized residual Tn = (x(n) - x̄)/s, where x(n) is the single largest suspect observation. If the least observation is suspect, T1 = (x̄ - x(1))/s. If the statistic is larger than the critical value given by Grubbs (1959) in Table 12.1, the suspect observation is not regarded as being part of the underlying normal population. If the population standard deviation σ is considered "known" (from considerable experience), one may use the fourth and fifth columns of Table 12.1 as critical values. If the standard deviation is estimated independently of the present sample, the second part of Table 12.1 should be used for critical values.
The test we have given is a one-sided test. To use it appropriately, we must decide, in advance, whether the outliers will occur only on the high side or only on the low side. Alternatively, we may have decided that we were interested only in outliers on the high side or on the low side. If we do not know in advance whether the outlier will occur on the high side or on the low side, we should use a two-sided test. For a two-sided test at the α-level of significance, we calculate both test statistics and compare the maximum of the two to the tabled critical value for α/2.

E12.2.1 Example

In this and the following examples, sample sizes were chosen partly for convenience in computation; the tests may not be as powerful as one would like. A set of eight mass spectrometer measurements were made on a single sample of a particular isotope of uranium. The data, arranged in order, are as follows: 199.31, 199.53, 200.19, 200.82, 201.92, 201.95, 202.18, 245.57. Experience has shown that outliers usually occur on the high side. Assuming normality, can the largest observation be rejected as an outlier? We calculate x̄ = 206.43, s = 15.85, and Tn = (245.57 - 206.43)/15.85 = 2.47. Since this is greater than the 5% critical value of 2.03 from Table 12.1, we reject the hypothesis (i.e., we decide that 245.57 is an outlier).
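The calculation in this example can be sketched in a few lines. The function below is illustrative (its name is not from the text); the 5% critical value 2.03 for n = 8 is read from Table 12.1.

```python
import statistics

def grubbs_T(data, high_side=True):
    """Extreme studentized residual for the single most suspect observation."""
    xbar = statistics.mean(data)
    s = statistics.stdev(data)          # sample standard deviation, n - 1 divisor
    if high_side:
        return (max(data) - xbar) / s   # Tn = (x(n) - xbar)/s
    return (xbar - min(data)) / s       # T1 = (xbar - x(1))/s

# Mass spectrometer data from Example E12.2.1
x = [199.31, 199.53, 200.19, 200.82, 201.92, 201.95, 202.18, 245.57]
T = grubbs_T(x)                         # about 2.47
print(T > 2.03)                         # exceeds the 5% value for n = 8, so flag 245.57
```

Note that the statistic is bounded above by roughly (n - 1)/sqrt(n), so for n = 8 even a wildly discrepant value cannot push Tn much past 2.47; this is why the tabled critical values grow so slowly with n.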

Anscombe (1970) saw no particular reason for treating outliers as a hypothesis-testing problem, partly because significance levels can be swamped by the assumptions made. He advocated a data-analysis approach in which we treat a rejection rule like a homeowner's fire insurance policy. A fire occurs when an observation is spurious (comes from a different population). Before buying fire insurance, we should ask: (1) What is the premium? (2) How much protection does the policy give when there is a fire? (3) How much danger is there of a fire? Reduced to statistical terms, the premium measures the
TABLE 12.1 Critical Values for Grubbs' One-Outlier Statistic Tn

Std Dev Calculated from Sample Std Dev Known

n 5% 2.5% 1% 5% 1%

3 1.15 1.15 1.15 1.74 2.22

4 1.46 1.48 1.49 1.94 2.43

5 1.67 1.71 1.75 2.08 2.57

6 1.82 1.89 1.94 2.18 2.68

7 1.94 2.02 2.10 2.27 2.76

8 2.03 2.13 2.22 2.33 2.83

9 2.11 2.21 2.32 2.39 2.88

10 2.18 2.29 2.41 2.44 2.93

11 2.23 2.36 2.48 2.48 2.97

12 2.29 2.41 2.55 2.52 3.01

13 2.33 2.46 2.61 2.56 3.04

14 2.37 2.51 2.66 2.59 3.07

15 2.41 2.55 2.71 2.62 3.10

16 2.44 2.59 2.75 2.64 3.12

17 2.47 2.62 2.79 2.67 3.15

18 2.50 2.65 2.82 2.69 3.17

19 2.53 2.68 2.85 2.71 3.19

20 2.56 2.71 2.88 2.73 3.21

21 2.58 2.73 2.91 2.75 3.22

22 2.60 2.76 2.94 2.77 3.24

23 2.62 2.78 2.96 2.78 3.26

24 2.64 2.80 2.99 2.80 3.27

25 2.66 2.82 3.01 2.81 3.28

30 2.75 2.91 3.10

35 2.82 2.98 3.18

40 2.87 3.04 3.24

45 2.92 3.09 3.29

50 2.96 3.13 3.34

(continued)


TABLE 12.1 (continued)

Std Dev Calculated from Sample

n 5% 2.5% 1%

60 3.03 3. 20 3. 41

70 3.09 3. 26 3. 47

80 3.14 3. 31 3. 52

90 3.18 3. 35 3. 56

100 3.21 3. 38 3. 60

Critical Values for T, Std Dev Independently Estimated

n 3 4 5 6 7 8 9 10 12

d.f. 1% points

10 2.78 3.10 3.32 3.48 3.62 3.73 3.82 3.90 4.04

11 2.72 3.02 3.24 3.39 3.52 3.63 3.72 3.79 3.93

12 2.67 2.96 3.17 3.32 3.45 3.55 3.64 3.71 3.84

13 2.63 2.92 3.12 3.27 3.38 3.48 3.57 3.64 3.76

14 2.60 2.88 3.07 3.22 3.33 3.43 3.51 3.58 3.70

15 2.57 2.84 3.03 3.17 3.29 3.38 3.46 3.53 3.65

16 2.54 2.81 3.00 3.14 3.25 3.34 3.42 3.49 3.60

17 2.52 2.79 2.97 3.11 3.22 3.31 3.38 3.45 3.56

18 2.50 2.77 2.95 3.08 3.19 3.28 3.35 3.42 3.53

19 2.49 2.75 2.93 3.06 3.16 3.25 3.33 3.39 3.50

20 2.47 2.73 2.91 3.04 3.14 3.23 3.30 3.37 3.47

24 2.42 2.68 2.84 2.97 3.07 3.16 3.23 3.29 3.38

30 2.38 2.62 2.79 2.91 3.01 3.08 3.15 3.21 3.30

40 2.34 2.57 2.73 2.85 2.94 3.02 3.08 3.13 3.22

60 2.29 2.52 2.68 2.79 2.88 2.95 3.01 3.06 3.15

120 2.25 2.48 2.62 2.73 2.82 2.89 2.95 3.00 3.08
∞ 2.22 2.43 2.57 2.68 2.76 2.83 2.88 2.93 3.01

(continued)

TABLE 12.1 (continued)

Critical Values for T, Std Dev Independently Estimated

n 3 4 5 6 7 8 9 10 12

d.f. 5% points

10 2.01 2.27 2.46 2.60 2.72 2.81 2.89 2.96 3.08

11 1.98 2.24 2.42 2.56 2.67 2.76 2.84 2.91 3.03

12 1.96 2.21 2.39 2.52 2.63 2.72 2.80 2.87 2.98

13 1.94 2.19 2.36 2.50 2.60 2.69 2.76 2.83 2.94

14 1.93 2.17 2.34 2.47 2.57 2.66 2.74 2.80 2.91

15 1.91 2.15 2.32 2.45 2.55 2.64 2.71 2.77 2.88

16 1.90 2.14 2.31 2.43 2.53 2.62 2.69 2.75 2.86

17 1.89 2.13 2.29 2.42 2.52 2.60 2.67 2.73 2.84

18 1.88 2.11 2.28 2.40 2.50 2.58 2.65 2.71 2.82

19 1.87 2.11 2.27 2.39 2.49 2.57 2.64 2.70 2.80

20 1.87 2.10 2.26 2.38 2.47 2.56 2.63 2.68 2.78

24 1.84 2.07 2.23 2.34 2.44 2.52 2.58 2.64 2.74

30 1.82 2.04 2.20 2.31 2.40 2.48 2.54 2.60 2.69

40 1.80 2.02 2.17 2.28 2.37 2.44 2.50 2.56 2.65

60 1.78 1.99 2.14 2.25 2.33 2.41 2.47 2.52 2.61

120 1.76 1.96 2.11 2.22 2.30 2.37 2.43 2.48 2.57

∞ 1.74 1.94 2.08 2.18 2.27 2.33 2.39 2.44 2.52

inflation in the mean square error (MSE) of an estimator of location when in fact all the observations are from the underlying population (by falsely rejecting the hypothesis a fraction of the time, the MSE of our estimator is larger than it would be if we had no rejection rule). Protection measures the reduction in the mean square error of the estimator when there are outliers present, i.e., we get a smaller MSE using the rejection rule than not using it. Guttman has stated that "while the above concepts of premium and protection are relevant and appealing, numerical computation turns out to be quite difficult." Nevertheless, considerable work has been done in measuring and comparing the premium and protection of different rejection rules.

The null hypothesis in identifying outliers is that all the observations come from a normal population; rejection of the hypothesis can mean many things. Since only the extreme observations have been tested, we see that outlier-detection statistics would not be very useful as tests of normality (goodness-of-fit tests would be more appropriate). Two models for generating a population containing outliers are widely used. The first is the mean-shift model, where n - k observations are from a N(μ, σ²) population and k from a N(μ + λ, σ²) population. In practice this is done by generating all n from the first population and adding λ to the first k of these. If some of the first k are below the mean, adding λ will make them close to the mean, so that they will be well "hidden" among the others. This naturally limits the power of any outlier procedure. An erroneous method of contamination is to add λ to the largest k of the n observations generated from the N(μ, σ²) population. This creates a non-normal truncated distribution for the bulk of the data, and makes it easy for any outlier test to perform well because of the large gap in the data. The second model for contamination contains k observations from a N(μ, λσ²) population, and is called the variance-shift model.
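A sketch of the correct mean-shift contamination scheme described above (the shift λ is added to the first k values drawn, not to the k largest); the function name and its defaults are invented for illustration.

```python
import random

def contaminate_mean_shift(n, k, lam, mu=0.0, sigma=1.0, seed=0):
    """Mean-shift model: draw all n values from N(mu, sigma^2), then add
    lam to the first k of them.  Adding lam to the k LARGEST values instead
    would truncate the bulk of the data and overstate any test's power."""
    rng = random.Random(seed)
    x = [rng.gauss(mu, sigma) for _ in range(n)]
    return [v + lam if i < k else v for i, v in enumerate(x)]

sample = contaminate_mean_shift(n=20, k=2, lam=4.0)
print(len(sample))
```

Because contaminants that started below the mean are pulled back toward it, some of the k shifted values may be indistinguishable from the clean observations, which is exactly the "hiding" effect noted in the text.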

12.3 MULTIPLE OUTLIERS IN A UNIVARIATE SAMPLE

When there is a possibility of more than one outlier in the sample, complications quickly arise. Grubbs (1950) derived exact critical values for the two largest (or two smallest) outliers, but did not obtain critical values for the largest and smallest observations. Tietjen and Moore (1972) extended Grubbs' critical values, by simulation, for up to 10 outliers. The statistic used was the ratio of the sum of squares in a sample which omits the outliers to the sum of squares for the complete sample; this statistic was called Lk. Tietjen and Moore obtained another statistic, Ek, which took the same form as Lk, but the numerator was based on omitting the k observations most extreme from the mean (from either or both ends). The test was based on the assumption that k was known, and in practice k was determined by looking at the data. Since one does not anticipate any outliers in a sample, k could not be known in advance, and looking at the sample interfered with the α-level in an unknown way. Furthermore, if k were determined automatically, Ek could pick the wrong observations to test. (In a sample of size 12, let 10 values be from a N(0,1), one be at 10, and one be at 100. Since the mean is close to 10, the two most extreme observations are the smallest one and the one at 100. Clearly the smallest observation is not an outlier.) The last problem is remedied simply by picking out the single observation furthest from the mean as outlier candidate #1, then omitting it from the sample and picking the observation farthest from the new mean as outlier candidate #2, etc.

Yet another problem arises in trying to choose a value for k, the number of observations to be tested as outliers. If there are two large observations which are nearly equal, and if one uses a one-outlier test on the largest, the test will usually fail to reject because the second outlier masks the presence

TABLE 12.2 Critical Values for Rosner's ESD Statistic. Sample Estimates of 10, 5, and 1% Points for ESD1, . . . , ESDk for Selected N

α α

N .10 .05 .01 N .10 .05 .01

k = 2 (for ESD1, ESD2) 40 3.01 3.17 3.52


2.64 2.77 2.98
10 2.39 2.55
2.17 2.32 45 3.17 3 .5 7
2.82 3.05
11 2.45 2.62
2.23 2.41 50 3.10 3.27 3.61
2.72 2.85 3.08
12 2.50 2.71
2.27 2.49 60 3.15 3.34 3.70
2.77 2.90 3.17
13 2.57 2.84
2.31 2.56 80 3.28 3.45 3.80
2.85 2.97 3.23
14 2.62 2.86
2.39 2.61 100 3.34 3.52 3.87
2.92 3.03 3.28
15 2.65 2.91
2.42 2.66 ,
k = 3 (for ESD1, ESD2, ESD3)
16 2.70 2.95
20 2.76 2.88 3.13
2.44 2.64
2.47 2.60 2.83
17 2.75 3.03 2.34 2.45 2.68
2.48 2.65
30 2.97 3.12 3.41
18 2.79 3.08 2.61 2.73 3.01
J2.46 2.68 2.44 2.56 2.75

19 2.80 3.10 40 3.07 3.22 3.58


2.49 2.71 2.69 2.81 3.03
2.52 2.62 2.82
20 2.69 2.83 3.09
2.41 2.52 2.76 50 3.18 3.34 3.68
2.76 2.89 3.15
25 2.99 3.34
2.58 2.68 2.89
2.62 2.82
60 3.26 3.42 3.75
30 2.89 3.05 3.35
2.83 2.95 3.20
2.55 2.67 2.92
2.64 2.73 2.95
35 3.09 3.41
80 3.32 3.49 3.85
2.74 2.96
2.90 3.03 3.27
2.71 2.81 3.01

(continued)

TABLE 12.2 (continued)

a a

.10 .05 .01 N .10 .05 .01

100 3.44 3.60 3.97 к = 5 (for ESDi, ESD 2 , ESD 3,


2.97 3.10 3.34 ESD 4 , ESDg )
2.77 2.86 3.06
20 2.85 2.97 3.10
2.55 2.65 2.89
k = 4 (for ESD1, ESD2, ESD3, ESD4)
2.40 2.51 2.69
20 2.81 2.95 3.20 2.33 2.42 2.61
2.51 2.63 2.83 2.27 2.37 2.57
2.38 2.49 2.68
30 3.05 3.19 3.48
2.29 2.39 2.58
2.67 2.78 3.03
30 3.02 3.16 3.48 2.51 2.60 2.80
2.65 2.77 3.02 2.42 2.51 2.74
2.48 2.59 2.79
2.39 2.49 2.70
40 3.16 3.31 3.63
40 3.14 3.32 3.64 2.76 2.88 3.13
2.74 2.86 3.10 2.59 2.69 2.89
2.57 2.67 2.87 2.46 2.55 2.74
2.45 2.55 2.74 2.39 2.47 2.65

50 3.24 3.40 3.74 50 3.28 3.45 3.77


2.81 2.93 3.18 2.84 2.96 3.21
2.62 2.72 2.92 2.65 2.74 2.94
2.50 2.59 2.78 2.52 2.61 2.79
2.44 2.52 2.70
60 3.31 3.48 3.82
2.85 2.98 3.20 60 3.34 3.51 3.81
2.67 2.77 2.97 2.88 3.01 3.24
2.54 2.63 2.82 • 2.68 2.77 2.96
2.56 2.65 2.83
80 3.40 3.57 3.91
2.48 2.56 2.72
2.94 3.05 3.31
2.74 2.84 3.04 80 3.44 3.61 3.93
2.61 2.69 2.87 2.98 3.11 3.36
2.77 2.86 3.08
100 3.47 3.64 3.96
2.63 2.72 2.89
3.00 3.13 3.34
2.54 2.62 2.76
2.79 2.89 3.06
2.66 2.74 2.90 100 3.53 3.70 4.01
3.04 3.16 3.42
2.81 2.91 3.10
2.68 2.77 2.93
2.59 2.67 2.84

of the first; the procedure cannot get started. Thus the repeated application of a single-outlier test can easily fail. Furthermore, if there is only one large outlier and one uses a test for two outliers, the test is likely to reject H0 and claim that there are two outliers, a phenomenon known as swamping. Rosner (1975) devised a procedure which successfully overcame masking but is still subject to some swamping. Let I0 be the full data set and It+1 be the set obtained by omitting from It the point most extreme from the mean of It. Let k be an upper bound on the number of outliers in the sample. Apply a one-outlier test in succession to I0, I1, . . . , Ik-1, and let the last significant result be for Im-1. Decide that the m observations omitted from Im are outliers. The critical values have to hold simultaneously for the several tests, hence are difficult to generate. It should be easier to estimate an upper bound k than the exact number of outliers, but the amount of swamping will depend on how badly we estimate it. Despite the swamping, we recommend Rosner's test for several outliers if the α-level is important to maintain. We cannot state how effective Lk or Ek might be because there is no objective way of deciding upon a value of k. The best tables are given by Jain (1981) as Table 12.2.

E12.3.1 Example

Twenty laboratories did an analysis on a single blood sample for lead content. Assuming the data are normally distributed, are there any outliers? .000, .015, .016, .022, .022, .023, .026, .027, .027, .028, .028, .031, .032, .033, .035, .037, .038, .041, .056, .058.
Using Rosner's ESD procedure, we set 20% as an upper limit on the number of outliers and check for up to 4 outliers. I0 is the full set of data (x̄ = .0298, s = .0131), I1 the set with .000 omitted (x̄ = .0313, s1 = .0114), I2 the set with .000 and .058 omitted (x̄ = .0298, s2 = .0097), I3 the set with .000, .058, and .056 omitted (x̄ = .0283, s3 = .0073), and I4 the set with .000, .058, .056, and .015 omitted. We calculate the one-outlier statistic for each set, obtaining R1 = 2.27 (for set I0), R2 = 2.34, R3 = 2.70, R4 = 1.82. From Table 12.2 we obtain the 5% critical values for R1, R2, R3, and R4 of 2.95, 2.63, 2.49, and 2.39, respectively. R3 is the only one significant; hence we declare .000, .056, and .058 to be outliers.
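The sequential deletion in Rosner's procedure can be sketched as below, using the example's data; the function name is invented, and the critical values are the 5% points read from Table 12.2 for n = 20, k = 4.

```python
import statistics

def esd_outliers(data, k_upper, critical_values):
    """Rosner's ESD procedure: repeatedly delete the point farthest from the
    current mean, computing R_j = max|x - xbar|/s on each successive set;
    the LAST significant R_j fixes how many of the deletions are outliers.
    critical_values[j] is the tabled value for R_{j+1} (e.g., Table 12.2)."""
    x = list(data)
    R, removed = [], []
    for _ in range(k_upper):
        xbar = statistics.mean(x)
        s = statistics.stdev(x)
        far = max(x, key=lambda v: abs(v - xbar))   # most extreme point
        R.append(abs(far - xbar) / s)
        removed.append(far)
        x.remove(far)
    last = -1
    for j, (r, c) in enumerate(zip(R, critical_values)):
        if r > c:
            last = j                                 # keep the LAST exceedance
    return removed[:last + 1], R

# Blood-lead data from Example E12.3.1; 5% critical values for n = 20, k = 4
lead = [.000, .015, .016, .022, .022, .023, .026, .027, .027, .028,
        .028, .031, .032, .033, .035, .037, .038, .041, .056, .058]
outliers, R = esd_outliers(lead, 4, [2.95, 2.63, 2.49, 2.39])
print(outliers)   # .000, .058, .056 are flagged, matching the example
```

Taking the last rather than the first exceedance is what defeats masking: an early R_j deflated by a second outlier is overruled once that outlier has been deleted.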

12.4 THE IDENTIFICATION OF A SINGLE OUTLIER IN LINEAR MODELS

In the univariate case the residuals are correlated and have a common variance. In the regression case, the residuals are also correlated, but each residual has its own variance which depends, to some extent, on the arrangement of the x-values. Let Y = Xβ + ε be the linear model in which Y is the (n × 1) vector of responses, X an (n × p) matrix of known constants, β a (p × 1) vector of unknown parameters, and ε an (n × 1) vector of normally

TABLE 12.3 Critical Values for a Single Outlier in Linear Models

(α = .10)

n 1 2 3 4 5 6 8 10 15 25

5 1.87

6 2.00 1.89

7 2.10 2.02 1.90

8 2.18 2.12 2.03 1.91

9 2.24 2.20 2.13 2.05 1.92

10 2.30 2.26 2.21 2.15 2.06 1.92

12 2.39 2.37 2.33 2.29 2.24 2.17 1.93

14 2.47 2.45 2.42 2.39 2.36 2.32 2.19 1.94

16 2.53 2.51 2.50 2.47 2.45 2.42 2.34 2.20


18 2.58 2.57 2.56 2.54 2.52 2.50 2.44 2.35

20 2.63 2.62 2.61 2.59 2.58 2.56 2.52 2.46 2.11


25 2.72 2.72 2.71 2.70 2.69 2.68 2.66 2.63 2.50

30 2.80 2.79 2.79 2.78 2.77 2.77 2.75 2.73 2.66 2.13

35 2.86 2.85 2.85 2.85 2.84 2.84 2.82 2.81 2.77 2.55

40 2.91 2.91 2.90 2.90 2.90 2.89 2.88 2.87 2.84 2.72

45 2.95 2.95 2.95 2.95 2.94 2.94 2.93 2.93 2.90 2.82

50 2.99 2.99 2.99 2.99 2.98 2.98 2.98 2.97 2.95 2.89

60 3.06 3.06 3.05 3.05 3.05 3.05 3.05 3.04 3.03 3.00

70 3.11 3.11 3.11 3.11 3.11 3.11 3.10 3.10 3.09 3.07

80 3.16 3.16 3.16 3.15 3.15 3.15 3.15 3.15 3.14 3.12

90 3.20 3.20 3.19 3.19 3.19 3.19 3.19 3.19 3.18 3.17

100 3.23 3.23 3.23 3.23 3.23 3.23 3.23 3.22 3.22 3.21

(continued)

TABLE 12.3 (continued)


(α = .05)

n 1 2 3 4 5 6 8 10 15 25

5 1.92

6 2.07 1.93

7 2.19 2.08 1.94

8 2.28 2.20 2.10 1.94

9 2.35 2.29 2.21 2.10 1.95

10 2.42 2.37 2.31 2.22 2.11 1.95

12 2.52 2.49 2.45 2.39 2.33 2.24 1.96

14 2.61 2.58 2.55 2.51 2.47 2.41 2.25 1.96

16 2.68 2.66 2.63 2.60 2.57 2.53 2.43 2.26

18 2.73 2.72 2.70 2.68 2.65 2.62 2.55 2.44

20 2.78 2.77 2.76 2.74 2.72 2.70 2.64 2.57 2.15

25 2.89 2.88 2.87 2.86 2.84 2.83 2.80 2.76 2.60

30 2.96 2.96 2.95 2.94 2.93 2.93 2.90 2.88 2.79 2.17

35 3.03 3.02 3.02 3.01 3.00 3.00 2.98 2.97 2.91 2.64

40 3.08 3.08 3.07 3.07 3.06 3.06 3.05 3.03 3.00 2.84

45 3.13 3.12 3.12 3.12 3.11 3.11 3.10 3.09 3.06 2.96

50 3.17 3.16 3.16 3.16 3.15 3.15 3.14 3.14 3.11 3.04

60 3.23 3.23 3.23 3.23 3.22 3.22 3.22 3.21 3.20 3.15

70 3.29 3.29 3.28 3.28 3.28 3.28 3.27 3.27 3.26 3.23

80 3.33 3.33 3.33 3.33 3.33 3.33 3.32 3.32 3.31 3.29

90 3.37 3.37 3.37 3.37 3.37 3.37 3.36 3.36 3.36 3.34

100 3.41 3.41 3.40 3.40 3.40 3.40 3.40 3.40 3.39 3.38

(continued)

TABLE 12.3 (continued)


(α = .01)

n 1 2 3 4 5 6 8 10 15 25

5 1.98

6 2.17 1.98

7 2.32 2.17 1.98

8 2.44 2.32 2.18 1.98

9 2.54 2.44 2.33 2.18 1.99

10 2.62 2.55 2.45 2.33 2.18 1.99

12 2.76 2.70 2.64 2.56 2.46 2.34 1.99

14 2.86 2.82 2.78 2.72 2.65 2.57 2.35 1.99

16 2.95 2.92 2.88 2.84 2.79 2.73 2.58 2.35

18 3.02 3.00 2.97 2.94 2.90 2.85 2.75 2.59

20 3.08 3.06 3.04 3.01 2.98 2.95 2.87 2.76 2.20


25 3.21 3.19 3.18 3.16 3.14 3.12 3.07 3.01 2.78

30 3.30 3.29 3.28 3.26 3.25 3.24 3.21 3.17 3.04 2.21
35 3.37 3.36 3.35 3.34 3.34 3.33 3.30 3.28 3.19 2.81

40 3.43 3.42 3.42 3.41 3.40 3.40 3.38 3.36 3.30 3.08

45 3.48 3.47 3.47 3.46 3.46 3.45 3.44 3.43 3.38 3.23

50 3.52 3.52 3.51 3.51 3.51 3.50 3.49 3.48 3.45 3.34

60 3.60 3.59 3.59 3.59 3.58 3.58 3.57 3.56 3.54 3.48

70 3.65 3.65 3.65 3.65 3.64 3.64 3.64 3.63 3.61 3.57

80 3.70 3.70 3.70 3.70 3.69 3.69 3.69 3.68 3.67 3.64

90 3.74 3.74 3.74 3.74 3.74 3.74 3.73 3.73 3.72 3.70

100 3.78 3.78 3.78 3.77 3.77 3.77 3.77 3.77 3.76 3.74

n = number of observations
q = number of independent variables (including count for intercept if fitted)

TABLE 12.4 Critical Values for Balanced Two-Way and Three-Way Layouts

R C 3 4 5 6 7 8 9 10

Critical Values for the Two-Way Layout

(asterisks denote theoretically exact values)

α = 0.01
3 .66033*

4 .67484* .66511*

5 .66434* .63995* .60797*

6 .64597* .61302* .57774* .54628*

7 .62576* .58767* .55080* .51901* .49193

8 .60584* .56463* .52707* .49538 .46870 .44599

9 .58696* .54386* .50611* .47475 .44857 .42641 .40736

10 .56935* .52516* .48750 .45658 .43094 .40931 .39079

α = 0.05

3 .64810*

4 .64512* .62066*

5 .62415* .58971* .55513*

6 .60008* .56079* .52491* .49459

7 .57666* .53513* .49897 .46899 .44396

8 .55498* .51256* .47660 .44715 .42273 .40213

9 .53521* .49265 .45712 .42827 .40447 .38447 .36736

10 .51724 .47498 .43998 .41175 .38856 .36911 .35251

Critical Values for the Three-Way Layout

(asterisks denote theoretically exact values; differences (× 10^-5) between upper and lower bounds are given in parentheses after the bound when necessary)

α = 0.01

3 3 .50294*

4 3 .48778*

4 .46011* .42529

(continued)

TABLE 12.4 (continued)

R C 3 4 5 6 7 8 9 10

5 3 .46503*

4 .43209 .39515

5 .40257 .36509 .33582

6 3 .44276

4 .40755 .37024

5 .37783 .34087 .31267

6 .35352 .31756 .29060 .26969

7 3 .42260

4 .38649 .34951

5 .35709 .32100 .29387

6 .33341 .29858 .27279 .25290

7 .31397 .28044 .25585 .23697 .22191

8 3 .40468

4 .36836 .33200

5 .33949 .30436 .27824

6 .31647 .28278 .25804 .23903

7 .29770 .26538 .24186 .22386 .20953

8 .28203 .25099 .22852 .21138 .19776 .18660

9 3 .38877

4 .35261 .31697

5 .32434 .29018 .26498

6 .30198 .26937 .24557 .22734

7 .28383 .25264 .23005 .21281 .19911

8 .26872 .23882 .21728 .20089 .18788 .17723

9 .25591 .22715 .20653 .19086 .17845 .16829 .15977

10 3 .37461

4 .33878 .30391

5 .31115 .27791 .25355

(continued)

TABLE 12.4 (continued)

R C 4 5 6 7 8 9 10

.25779 .23484 .21730

.24166 .21991 .20334 .19019

.22835 .20763 .19190 .17942 .16921

.21714 .19731 .18228 .17038 .16065 .15250

.20751 .18847 .17406 .16265 .15333 .14553

α = 0.05

3 3 .47790*

4 3 .45465*

4 .42314 .38800

5 3 .42912

4 .39495 .35936

5 .36652 .33144 .30467

3 .40625

4 .37136 .33624

5 .34338 .30930 .28370

6 .32096 .28814 .26380 .24500

3 .38640

4 .35157 .31724

5 .32426 .29127 .26675

6 .30259 .27101 .24779 .22993

7 .28496 .25468 .23259 .21565 .20214

3 .36920

4 .33476 .30131

5 .30817 .27625 .25270

6 .28723 .25679 .23455 .21750

7 .27025 .24116 .22004 .20389 . 19104

8 .25614 .22824 .20808 . 19271 . 18049 . 17046

(continued)

TABLE 12.4 (continued)

R C 10

9 3 .35418 (1)

4 .32028 .28770

5 .29441 .26347 .24079

6 .27414 .24474 .22337 .20702

7 .25776 .22972 .20945 .19399 .18170

8 .24417 .21732 .19800 .18329 .17161 .16204

9 .23266 .20686 .18836 .17430 .16314 .15401 .14634

10 3 .34094 (1)

4 .30765 .27590

5 .28246 .25244 .23054

6 .26280 .23435 .21375 .19801

7 .24696 .21987 .20036 .18550 .17369

8 .23384 .20794 .18935 .17522 .16401 .15484

9 .22275 .19788 .18009 .16659 .15589 .14713 .13979

10 .21320 .18924 .17415 .15920 .14894 .14055 .13351 .12750

distributed errors with mean zero and variance σ²I. If Ŷ is the vector of fitted
values, e = Y - Ŷ is the vector of residuals. Letting V = (v_ij) = X(X'X)⁻¹X',
s² = e'e/(n - p) and var(e_i) = s²(1 - v_ii). The ith studentized residual is
r_i = e_i/√(var e_i), and the maximum of the absolute values of the r_i will be de­
noted by r_max. If r_max is greater than some critical value h_α, the observation
which gave rise to r_max is declared to be an outlier. Much of the early work in
this area is due to Srikantan (1961), but the best tables to date are those of
Lund (1975), Table 12.3.
In cases where we "know" σ² or have an independent estimate of it,
we may use this knowledge in calculating r_max, but should use the critical
values given by Joshi (1972).
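The computation of the studentized residuals and of r_max can be sketched in a few lines; a minimal numpy version (the function name and example data are illustrative, not from the text):

```python
import numpy as np

def max_studentized_residual(X, y):
    """Studentized residuals for the linear model y = Xb + e, and the
    largest absolute value r_max used in the outlier test."""
    n, p = X.shape
    V = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix V = (v_ij)
    e = y - V @ y                             # residuals
    s2 = e @ e / (n - p)                      # residual mean square
    r = e / np.sqrt(s2 * (1 - np.diag(V)))    # studentized residuals
    return r, np.max(np.abs(r))
```

The returned r_max would then be compared with the critical value h_α from Table 12.3.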
For a certain large class of designed experiments the residuals have a
common variance. This class of experiments includes all balanced factorial
arrangements, balanced incomplete blocks, and Latin squares. For these
arrangements we fit the model and find the ith residual. The ith normalized
residual is z_i = e_i/(Σ e_j²)^(1/2), and the maximum of the absolute values of the z_i
is called the maximum normed residual, MNR. Important work in this area

was done by Stefansky (1972), but Galpin and Hawkins (1981) have a better
and more extensive set of tables (Table 12.4). If the MNR exceeds the tabulated
critical value, the observation which gave rise to it is judged to be
an outlier. The statistic given by C. Daniel (1960) in his work on half-normal
plots is equivalent to this test.

E12.4.1 Example

Snedecor and Cochran (1967) give an example of the effect of organic phosphorus
(x₂) and inorganic phosphorus (x₁) on the yield of corn (y) on 18 Iowa soils. The
data are shown below:

Soil
sample      y      x₁      x₂      ŷ       e       r

1 64 0.4 53 61.56 2.44 0.14

2 60 0.4 23 58.96 1.04 0.06

3 71 3.1 19 63.45 7.55 0.42

4 61 0 .6 34 60.27 0.73 0.04

5 54 4.7 24 66.74 -12.74 -0.67

6 77 1.7 65 64.93 12.07 0.79

7 81 9.4 44 76.89 4.11 0.21


8 93 10.1 31 77.01 15.99 0.81

9 93 11.6 29 79.53 13.47 0.70

10 51 12.6 58 83.83 -32.83 -1.72

11 76 10.9 37 78.97 -2.97 -0.15

12 96 23.1 46 101.58 -5.58 -0.29

13 77 23.1 50 101.93 -24.93 -1.29

14 93 21.6 44 98.72 -5.72 -0.29

15 95 23.1 56 102.45 -7.45 -0.39

16 54 1.9 36 62.77 -8.77 -0.45

17 168 26.8 58 109.24 58.76 3.18

18 99 29.9 51 114.18 -15.18 -0.84

Snedecor and Cochran thought the residual on soil 17 was "suspiciously
large." Using the above test for one outlier, we fit the equation E(y) = α +
β₁x₁ + β₂x₂ to the data and obtain the predicted values shown above. The

residuals are also shown, as well as the studentized residuals. The value
of r_max is 3.18, and from Table 12.3 with q = 3 (the number of parameters)
this is significant at the .05 level; hence observation 17 is declared an outlier.

E12.4.2 Example

The following two-way layout was given by Daniel (1960):

Data Fitted Residuals

35 32 40 37    33 34 39 38    +2 -2 +1 -1

29 29 36 34    29 30 35 34    +0 -1 +1 +0

25 29 20 30    23 24 29 28    +2 +5 -9 +2

19 25 35 25    23 24 29 28    -4 +1 +6 -3

22 20 29 29    22 23 28 27    +0 -3 +1 +2

The residual in the third row and third column seems large.
For these data, Stefansky's maximum normed residual is 9/√202 =
.6332. Consulting Table 12.4 we find a 5% critical value of .590 and a 1%
critical value of .640; hence the observation in the third row and third col­
umn would be judged an outlier at the 5% level but not at the 1% level.
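The arithmetic can be checked directly from the residual table above; a short numpy sketch:

```python
import numpy as np

# Residuals (data minus fitted values) from Daniel's two-way layout
e = np.array([[ 2, -2,  1, -1],
              [ 0, -1,  1,  0],
              [ 2,  5, -9,  2],
              [-4,  1,  6, -3],
              [ 0, -3,  1,  2]], dtype=float)

# Maximum normed residual: largest |e_i| over the root sum of squares
mnr = np.max(np.abs(e)) / np.sqrt(np.sum(e ** 2))
print(round(mnr, 4))   # 9/sqrt(202) = .6332
```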

12.5 MULTIPLE OUTLIERS IN THE LINEAR MODEL

In the simple linear regression case, the observation with the largest residual
may not be the one with the largest test statistic, r_max, so that judging resid­
uals by eye begins to fail as a tool. In a two-way table, Gentleman and Wilk
(1975) declare that "in the null case of no outliers, the residuals behave much
like a normal sample. When one outlier is present, the direct statistical
treatment of residuals provides a complete basis for data-analytic judgements,
especially through judicious use of probability plots. When two outliers are
present, however, the resulting residuals will often not have any noticeable
statistical peculiarities." These authors devised a test statistic Q_k = e'e - e*'e*,
where "e*'e* is the sum of squares of revised residuals resulting from fitting
the basic model to the data remaining after the omission of k data points"
and e'e is the sum of squares of residuals obtained by fitting the model to
all of the data. They envisioned the computation of Q_k for each of the
(n choose k) possible data partitions, and used the largest of the Q_k to identify the "k most
likely outlier subset." Such a procedure can be computationally awesome.
Methods of reducing the labor have been devised, but they are still formidable,
since one first chooses a maximum value for k (no easy task) and then proceeds.

If the largest Q_k is not significant, the value of k is reduced by 1 and the process
repeated. A statistic useful in judging Q_k is F = (n - p - k)Q_k / k(e'e - Q_k),
with k and n - p - k degrees of freedom. The authors suggest that a plot be
made of the largest Q_k values against typical values obtained as medians
from 10 Monte Carlo trials.
John and Draper (1978) showed that Q_k is "the sum of squares of k suc­
cessive revised normalized uncorrelated residuals," and that one need only
examine the subset of Q_k which arose from the subset of the larger resid­
uals. In practice they suggested examining a plot of the original residuals
to see if one outlier appeared to be present. If so, that value was omitted,
treated as a missing value, and estimated as usual (by minimizing the
error sum of squares). After estimation, the same process was repeated
to see if a second outlier could be detected by examining the residuals. Con­
tinuing this process, John and Draper (1980) show how to conduct a three-
stage test for up to three outliers, using simulated critical values.
A good example would take up more space than we have available, hence
the reader is referred to the papers mentioned above.
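A brute-force version of the Gentleman-Wilk computation can be sketched as follows; this is a numpy illustration under the definitions in the text (the function name is ours), and for realistic n and k the enumeration is exactly the computational burden the text warns about:

```python
import numpy as np
from itertools import combinations

def gentleman_wilk_qk(X, y, k):
    """Q_k = e'e - e*'e* for every size-k subset of observations:
    the drop in the residual sum of squares when the subset is
    omitted and the model refitted."""
    def rss(Xs, ys):
        b = np.linalg.lstsq(Xs, ys, rcond=None)[0]
        res = ys - Xs @ b
        return res @ res

    ete = rss(X, y)                      # e'e from the full fit
    n = len(y)
    qk = {}
    for subset in combinations(range(n), k):
        keep = np.setdiff1d(np.arange(n), subset)
        qk[subset] = ete - rss(X[keep], y[keep])
    return ete, qk
```

The subset with the largest Q_k is the "k most likely outlier subset," and the F statistic F = (n - p - k)Q_k / k(e'e - Q_k) can be formed from the returned quantities.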

12.6 ACCOMMODATION OF OUTLIERS

We have thus far concentrated on detecting or identifying outliers. We shall
now discuss the use of robust regression techniques to get around the outliers
without detecting them and without allowing them to "devastate" our analyses.
Besides the use of trimmed means, Winsorized means, and medians, there
is a set of techniques known as Huber M-estimation. Since the derivative of
the log likelihood for any density f(x_i; θ) is Σ f′(x_i - θ)/f(x_i - θ),
we set this equal to zero and have as a result that the maximum likelihood
estimator of θ is θ̂, the solution of Σψ(x_i - θ̂) = 0, where ψ(x) = -f′(x)/f(x).
Since f(x) is not known, we can choose a ψ(x) which will have suitable robust
estimation properties. Many forms of ψ(x) have been suggested, but we shall
confine ourselves to one. Andrews (1974) suggested a sine-wave function for
ψ(x), and showed that it could be carried out by using any iterative weighted
least squares procedure in which the model for the data is y_i = x_i′β + e_i, the
least squares estimate of β is b, and the ith residual is r_i = y_i - x_i′b. To do
robust regression, we start with an initial estimate of β, denoted by b₀, ob­
tained as described below. From this estimate we calculate the residuals r_i.
We let s = median|r_i|, and solve, by weighted least squares, the system of p
equations Σ w_i x_ij (y_i - x_i′b) = 0, where the summation runs from i = 1 to n and
j = 1, 2, . . . , p. The weights are w_i = sin(r_i/s)/r_i if |r_i/s| < 1.5π and
zero otherwise. The solution of the equations provides a new estimate b₁,
from which we obtain new residuals and new weights and get another esti­
mate b₂. This continues until the b_j converge. On the last iteration, very
small or zero weights for some residuals indicate that they are outliers or
nearly so. By letting x be a vector of 1's, the model will do for the univari­
ate case as well. In most cases a starting value b₀, obtained by solving the

original equations for β, will do nicely. Andrews has a rather involved alter­
native when the data are "far from Gaussian."
I would recommend the use of robust regression to accompany the usual
parametric regression procedure. If the answers are quite different, there
is an indication of outliers or influential observations, and the situation
should be studied further. Andrews also suggests that the procedure
may give better results after a few iterations than if it is carried to con­
vergence.
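The iteration described above can be sketched as follows; a minimal numpy version, assuming the weight function as stated in the text (w_i = sin(r_i/s)/r_i inside the window |r_i/s| < 1.5π, zero outside; the fixed iteration count is our simplification of "until the b_j converge"):

```python
import numpy as np

def andrews_robust_fit(X, y, n_iter=20):
    """Robust regression by iteratively reweighted least squares with
    Andrews' sine-wave weights (a sketch, not Andrews' exact procedure)."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]   # b0: ordinary least squares
    for _ in range(n_iter):
        r = y - X @ b                          # current residuals
        s = np.median(np.abs(r))               # robust scale s = median|r_i|
        z = r / s
        # sine-wave weights: sin(z)/z inside the window, zero outside;
        # np.sinc(z/pi) = sin(z)/z and handles z = 0 safely.  sin(z)/z
        # differs from sin(r/s)/r only by the constant 1/s, which does
        # not change the weighted least squares solution.
        w = np.where(np.abs(z) < 1.5 * np.pi, np.sinc(z / np.pi), 0.0)
        sw = np.sqrt(w)
        b = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return b
```

Observations whose final weight is zero or nearly zero are the outlier candidates.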

E12.6.1 Example

A set of data from Brownlee (1965) on observations from a plant for the oxidation
of ammonia to nitric acid is given below.

                Stack                Cooling water          Acid
Observation     loss      Air flow   inlet temperature      concentration
number          y         x₁         x₂                     x₃

1 42 80 27 89
2 37 80 27 88
3 37 75 25 90
4 28 62 24 87
5 18 62 22 87
6 18 62 23 87
7 19 62 24 93
8 20 62 24 93
9 15 58 23 87
10 14 58 18 80
11 14 58 18 89
12 13 58 17 88
13 11 58 18 82
14 12 58 19 93
15 8 50 18 89
16 7 50 18 86
17 8 50 19 72
18 8 50 19 79
19 9 50 20 80
20 15 56 20 82
21 15 70 20 91

Daniel and Wood (1971) did careful work on this problem. Their fit to
the original data was E(y) = -39.9 + .72x₁ + 1.30x₂ - .15x₃. After much
consideration and plotting of the data, they discarded observations 1, 3, 4,
and 21 as outliers, and refitted the equation, obtaining E(y) = -37.6 + .80x₁
+ .58x₂ - .07x₃. A robust regression yields E(y) = -37.2 + .82x₁ + .52x₂
- .07x₃, and deletion of the four points does not alter the coefficients. The
residuals from the four fits are shown below. The size of the residuals for
points 1, 3, 4, and 21 indicates, somewhat subjectively, that they are outliers.

Residuals

                           Least squares                Robust fit c = 1.5
Observation
number     Response     with 1,3,4,21    without     with 1,3,4,21    without

1 42 3.24 6.08 6.11 6.11


2 37 -1.92 1.15 1.04 1.04
3 37 4.56 6.44 6.31 6.31
4 28 5.70 8.18 8.24 8.24
5 18 -1.71 -0.67 -1.24 -1.24
6 18 -3.01 -1.25 -0.71 -0.71
7 19 -2.39 -0.42 -0.33 -0.33
8 20 -1.39 0.58 0.67 0.67
9 15 -3.14 -1.06 -0.97 -0.97
10 14 1.27 0.35 0.14 0.14
11 14 2.64 0.96 0.79 0.79
12 13 2.78 0.47 0.24 0.24
13 11 -1.43 -2.51 -2.71 -2.71
14 12 -0.05 -1.34 -1.44 -1.44
15 8 2.36 1.34 1.33 1.33
16 7 0.91 0.14 0.11 0.11
17 8 -1.52 -0.37 -0.42 -0.42
18 8 -0.46 0.10 0.08 0.08
19 9 -0.60 0.59 0.63 0.63
20 15 1.41 1.93 1.87 1.87
21 15 -7.24 -8.63 -8.91 -8.91

12.7 MULTIVARIATE OUTLIERS

A generalization of the univariate technique has been used here. Let S be
the sample covariance matrix for all the data and S_i be the covariance matrix
obtained by deleting k observations from the sample. (The possible
ways of choosing the k outliers are indexed by i.) Wilks (1963) suggested
forming R_i = |S_i|/|S| and using the minimum of the R_i as a test statistic,
and gave some critical values for one and two outliers. His test is recom­
mended.
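Wilks' ratio can be sketched directly from the definition; a numpy illustration (the function name and toy data are ours):

```python
import numpy as np
from itertools import combinations

def wilks_min_ratio(data, k=1):
    """R_i = |S_i|/|S| for every way of deleting k observations;
    the minimum ratio flags the most outlying subset."""
    detS = np.linalg.det(np.cov(data, rowvar=False))
    n = len(data)
    best_subset, best_ratio = None, np.inf
    for subset in combinations(range(n), k):
        keep = np.setdiff1d(np.arange(n), subset)
        Ri = np.linalg.det(np.cov(data[keep], rowvar=False)) / detS
        if Ri < best_ratio:
            best_subset, best_ratio = subset, Ri
    return best_subset, best_ratio
```

The minimum R_i would then be compared with Wilks' critical values.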

12.8 OUTLIERS IN TIME SERIES

Fox (1972) was apparently the first to take into account the correlations be­
tween successive observations in processing time series for outliers. He
considered a type I outlier as one in which a gross error of observation or
recording affects only a single observation. The type II outlier occurs
when a single "innovation" is extreme and will affect that observation and
the subsequent ones.
As a model for the type I outlier, Fox used a stationary pth order auto­
regressive process in which the qth observation has Δ added to it. The null
hypothesis is H₀: Δ = 0, and the alternative, H₁: Δ ≠ 0. Using an asymptotic
expression for the elements of the inverse of the covariance matrix W,
Fox obtained the likelihood ratio criterion

           (y - Δ̂)' Ŵ₁⁻¹ (y - Δ̂)
Λ_{q,n} = ------------------------                    (12.8.1)
                y' Ŵ₀⁻¹ y

where Ŵ₁⁻¹ is the estimated inverse under H₁, Ŵ₀⁻¹ is the estimated inverse
under H₀, and Δ̂, the displacement in the qth observation, is a vector of
zeros except for the qth component, which is estimated asymptotically.
The distribution of Λ_{q,n} is not known except when W is known. Using
simulation, Fox compared the distribution of Λ_{q,n} with (a) the distribution
obtained by replacing W⁻¹ by Ŵ⁻¹ and (b) the distribution obtained by assum­
ing that W is known instead of estimated in Eq. (12.8.1). The distributions
in the three cases were very close, hence Fox recommended that we act as
though the estimated W's were known in order to obtain the distribution (and
critical values) of Λ_{q,n}. Where the position of the outlier is not known, Fox
simulates the situation and obtains a small table of critical values.
For the type II outlier, Fox again employs a likelihood ratio criterion
and obtains an approximate distribution for both cases (position known and
unknown). In the case where the type of outlier is unknown, he suggests
seeing whether one can detect the effect of the observation on subsequent
observations. If not, the type I outlier is assumed. The above approach is

shown to be superior to the one in which the observations are considered to
be independent, and we recommend its use. Note that it has the characteris­
tic form of a (weighted) sum of squares omitting outliers divided by a weighted
sum of squares for the total sample.

REFERENCES

Andrews, D. F. (1971). Significance tests based on residuals, Biometrika
58, 139-148.

Andrews, D. F. (1974). A robust method for multiple linear regression,
Technometrics 16, 523-531.

Anscombe, F. J. (1960). Rejection of outliers, Technometrics 2, 123-147.

Barnett, V. and Lewis, T. (1978). Outliers in Statistical Data. New York:
Wiley.

Beckman, R. J. and Cook, R. D. (1983). Outlier....s, Technometrics 25,
119-149.

Daniel, C. (1960). Locating outliers in factorial experiments, Technometrics
2, 149-156.

Fox, A. J. (1972). Outliers in time series, Journal of the Royal Statistical
Society, Ser. B 34, 350-363.

Galpin, J. S. and Hawkins, D. M. (1981). Rejection of a single outlier in
two- or three-way layouts, Technometrics 23, 65-70.

Gentleman, J. F. and Wilk, M. B. (1975a). Detecting outliers in a two-way
table: I. Statistical behavior of residuals, Technometrics 17, 1-14.

Gentleman, J. F. and Wilk, M. B. (1975b). Detecting outliers II. Supple­
menting the direct analysis of residuals, Biometrics 31, 387-410.

Grubbs, F. E. (1950). Sample criteria for testing outlying observations,
Annals of Mathematical Statistics 21, 27-58.

Grubbs, F. E. (1969). Procedures for detecting outlying observations in
samples, Technometrics 11, 1-21.

Guttman, I. (1973). Premium and protection of several procedures for deal­
ing with outliers when sample sizes are moderate to large, Techno­
metrics 15, 385-404.

Guttman, I. and Smith, D. E. (1969). Investigation of rules for dealing with
outliers in small samples from the normal distribution I: Estimation of
the mean, Technometrics 11, 527-550.

Guttman, I. and Smith, D. E. (1971). Investigation of rules for dealing with
outliers in small samples from the normal distribution II: Estimation
of the variance, Technometrics 13, 101-111.

Jain, R. B. (1981). Percentage points of many-outlier detection procedures,
Technometrics 23, 71-75.

John, J. A. and Draper, N. R. (1978). On testing for two outliers or one
outlier in two-way tables, Technometrics 20, 69-78.

Joshi, P. C. (1972). Some slippage tests of mean for a single outlier in
linear regression, Biometrika 59, 109-120.

Kruskal, W. H. (1960). Some remarks on wild observations, Technometrics
2, 1-3.

Lund, R. E. (1975). Tables for an approximate test for outliers in linear
regressions, Technometrics 17, 473-476.

Rosner, B. (1975). On the detection of many outliers, Technometrics 17,
221-227.

Rosner, B. (1977). Percentage points for the RST many outlier procedure,
Technometrics 19, 307-312.

Srikantan, K. S. (1961). Testing for a single outlier in a regression model,
Sankhya, Ser. A 23, 251-260.

Stefansky, W. (1971). Rejecting outliers by maximum normal residual,
Annals of Mathematical Statistics 42, 35-45.

Tietjen, G. L. and Moore, R. H. (1972). Some Grubbs-type statistics for
the detection of several outliers, Technometrics 14, 583-597.

Wilks, S. S. (1963). Multivariate statistical outliers, Sankhya 25, 407-426.


Appendix

1. Table 1, Cumulative Distribution Function of the Standard Normal Distribution

2. Table 2, Critical Values of the Chi-Square Distribution

3. Simulated Data Sets

Set Name       Distribution                                √β₁      β₂

NOR            Normal, μ = 100, σ = 10                     .0       3.

UNI            Uniform on interval 0 to 10                 .0       1.80
EXP            Negative Exponential, μ = 5                 2.       9.
LOG            Logistic, μ = 100                           .0       4.2
WE.5           Weibull, k = .5                             6.62     87.72
WE2            Weibull, k = 2                              .63      3.25
SU(1,2)        Johnson Unbounded (1,2)                     -.87     5.57
SU(0,3)        Johnson Unbounded (0,3)                     .0       3.53
SU(0,2)        Johnson Unbounded (0,2)                     .0       4.51
SB(0,2)        Johnson Bounded (0,2)                       .0       2.63
SB(0,.5)       Johnson Bounded (0,.5)                      .0       1.63
LCN(.05,3)     Contaminated Normal, p = .05, μ = 3         .68      4.35
LCN(.1,3)      Contaminated Normal, p = .1, μ = 3          .80      4.02
LCN(.2,3)      Contaminated Normal, p = .2, μ = 3          .68      3.09
SB(1,1)        Johnson Bounded (1,1)                       .73      2.91
SB(1,2)        Johnson Bounded (1,2)                       .28      2.77
SB(.533,.5)    Johnson Bounded (.533,.5)                   .65      2.13

4. Real Data Sets

BUS Data Set


CHEN Data Set
BLAC Data Set
EMEA Data Set
BAEN Data Set

524 APPENDIX

1. TABLE 1 Cumulative Distribution Function of the Standard Normal Distribution

Areas under the standard normal curve (areas to the left)

z 0 1 2 3 4 5 6 7 8 9

-3.0* .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
-2 .9 .0019 .0018 .0017 .0017 .0016 .0016 .0015 .0015 .0014 .0014
-2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0020 .0020 .0019
-2 .7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
-2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
-2 .5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
-2 .4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
-2 .3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
-2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
-2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
-2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
-1 .9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
-1.8 .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
-1 .7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
-1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
-1 .5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
-1 .4 .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681
-1 .3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
-1.2 .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
-1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
-1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
-.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
-.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
-.7 .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
-.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
-.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
-.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
-.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
-.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
-.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
-.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641

*For Z < -4 the areas are 0 to four decimal places.



TABLE 1 (continued)

z 0 1 2 3 4 5 6 7 8 9

.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0† .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990

†For Z > 4 the areas are 1 to four decimal places.


Adapted from Probability with Statistical Applications, second edition, by
F. Mosteller, R. E. K. Rourke, and G. B. Thomas, Jr. Reading, Mass.:
Addison-Wesley, 1970, p. 473.

2. TABLE 2 Critical Values of the Chi-Square Distribution

d.f. .995 .99 .975 .95 .90 .10 .05 .025 .01 .005

1 .00 .00 .00 .00 .02 2.71 3.84 5.02 6.63 7.88
2 .01 .02 .05 .10 .21 4.61 5.99 7.38 9.21 10.60
3 .07 .11 .22 .35 .58 6.25 7.81 9.35 11.34 12.84
4 .21 .30 .48 .71 1.06 7.78 9.49 11.14 13.28 14.86
5 .41 .55 .83 1.15 1.61 9.24 11.07 12.83 15.09 16.75
6 .68 .87 1.24 1.64 2.20 10.64 12.59 14.45 16.81 18.55
7 .99 1.24 1.69 2.17 2.83 12.02 14.07 16.01 18.48 20.28
8 1.34 1.65 2.18 2.73 3.49 13.36 15.51 17.54 20.09 21.96
9 1.73 2.09 2.70 3.33 4.17 14.68 16.92 19.02 21.67 23.59
10 2.16 2.56 3.25 3.94 4.87 15.99 18.31 20.48 23.21 25.19
11 2.60 3.05 3.82 4.57 5.58 17.28 19.68 21.92 24.72 26.76
12 3.07 3.57 4.40 5.23 6.30 18.55 21.03 23.34 26.22 28.30
13 3.57 4.11 5.01 5.89 7.04 19.81 22.36 24.74 27.69 29.82
14 4.07 4.66 5.63 6.57 7.79 21.06 23.68 26.12 29.14 31.32
15 4.60 5.23 6.26 7.26 8.55 22.31 25.00 27.49 30.58 32.80
16 5.14 5.81 6.91 7.96 9.31 23.54 26.30 28.85 32.00 34.27
17 5.70 6.41 7.56 8.67 10.09 24.77 27.59 30.19 33.41 35.72
18 6.26 7.01 8.23 9.39 10.86 25.99 28.87 31.53 34.81 37.16
19 6.84 7.63 8.91 10.12 11.65 27.20 30.14 32.85 36.19 38.58
20 7.43 8.26 9.59 10.85 12.44 28.41 31.41 34.17 37.57 40.00
21 8.03 8.90 10.28 11.59 13.24 29.62 32.67 35.48 38.93 41.40
22 8.64 9.54 10.98 12.34 14.04 30.81 33.92 36.78 40.29 42.80
23 9.26 10.20 11.69 13.09 14.85 32.01 35.17 38.08 41.64 44.18
24 9.89 10.86 12.40 13.85 15.66 33.20 36.42 39.36 42.98 45.56
25 10.52 11.52 13.12 14.61 16.47 34.38 37.65 40.65 44.31 46.93
26 11.16 12.20 13.84 15.38 17.29 35.56 38.89 41.92 45.64 48.29
27 11.81 12.88 14.57 16.15 18.11 36.74 40.11 43.19 46.96 49.65
28 12.46 13.56 15.31 16.93 18.94 37.92 41.34 44.46 48.28 50.99
29 13.12 14.26 16.05 17.71 19.77 39.09 42.56 45.72 49.59 52.34
30 13.79 14.95 16.76 18.49 20.60 40.26 43.77 46.98 50.89 53.67
50 27.99 29.71 32.36 34.76 37.69 63.17 67.50 71.42 76.15 79.49
100 67.33 70.06 74.22 77.93 82.36 118.5 124.3 129.6 135.8 140.2
500 422.3 429.4 439.9 449.1 459.9 540.9 553.1 563.9 576.5 585.2
1000 888.6 898.8 914.3 927.6 943.1 1058 1075 1090 1107 1119

Adapted from D. B. Owen, Handbook of Statistical Tables, Reading, Mass.:
Addison-Wesley, 1962. Courtesy of the Atomic Energy Commission.

3. Simulated Data Sets

There are 17 simulated data sets, each of 100 observations. Throughout the
book these have been used to illustrate and compare procedures.

Set Name    Distribution    (√β₁, β₂)

NOR        Normal distribution, μ = 100, σ = 10                    (0, 3)

UNI        Uniform distribution, f(x) = 1/10 for 0 < x < 10        (0, 1.80)

EXP        Negative exponential, f(x) = (1/5)e^(-x/5) for x > 0    (2, 9)

LOG        Logistic, μ = 100                                       (0, 4.2)

WE.5       Weibull, F(x) = 1 - e^(-x^.5)                           (6.62, 87.72)

WE2        Weibull, F(x) = 1 - e^(-x^2)                            (.63, 3.25)

SU(1,2)    Johnson Unbounded (γ,δ):                                (-.87, 5.59)
           γ + δ sinh⁻¹(x) is standard normal (-∞ < x < ∞)
           γ = 1, δ = 2; √β₁ = -.87, β₂ = 5.59

SU(0,3)    Johnson Unbounded (0,3)                                 (0, 3.53)

SU(0,2)    Johnson Unbounded (0,2)                                 (0, 4.51)
           γ = 0, δ = 2; √β₁ = 0, β₂ = 4.51

SB(0,2)    Johnson Bounded (γ,δ):                                  (0, 2.63)
           γ + δ ln(x/(1 - x)) is standard normal (0 < x < 1)



Set Name    Distribution                                           (√β₁, β₂)

SB(0,.5)    Johnson Bounded (0,.5)                                 (0, 1.63)
            γ = 0, δ = .5; √β₁ = 0, β₂ = 1.63

The above distributions were the first ones generated. To increase the
number of skewed distributions, the following were added.

Location Contaminated Normals

pN(μ,1) + (1 - p)N(0,1)

The three included here are:

                                 Data Set
p        μ       √β₁     β₂      Name

.05      3       .68     4.35    LCN(.05,3)

.10      3       .80     4.02    LCN(.10,3)
.20      3       .68     3.09    LCN(.20,3)

The last three simulated data sets are from the Johnson Bounded SB(γ,δ) distri­
bution, where SB(γ,δ) is defined above. The three samples here are from:

                                 Data Set
γ        δ       √β₁     β₂      Name

1        1       .73     2.91    SB(1,1)

1        2       .28     2.77    SB(1,2)
.533     .5      .65     2.13    SB(.533,.5)

NOR Data Set    Normal Distribution    μ = 100    √β₁ = 0
                                       σ = 10     β₂ = 3

No.    Observation    No.    Observation    No.    Observation    No.    Observation

1 92.55 26 102.56 51 111.38 76 88.13

2 96.20 27 79.43 52 103.22 77 102.98

3 84.27 28 105.48 53 113.17 78 103.71

4 90.87 29 85.29 54 108.39 79 95.14

5 101.58 30 83.53 55 103.60 80 85.71

6 106.82 31 104.21 56 103.90 81 103.56

7 98.70 32 100.75 57 89.35 82 89.44

8 113.75 33 92.02 58 124.60 83 88.26

9 98.98 34 100.10 59 104.34 84 97.80

10 100.42 35 87.83 60 85.29 85 97.33

11 118.52 36 89.00 61 97.78 86 103.90

12 89.90 37 108.67 62 109.76 87 96.38

13 92.45 38 103.09 63 94.92 88 94.33

14 115.92 39 99.12 64 95.12 89 99.62

15 103.61 40 91.46 65 88.56 90 95.94

16 96.13 41 125.28 66 115.95 91 104.89

17 95.45 42 91.45 67 100.79 92 83.34

18 108.52 43 92.56 68 104.87 93 87.04

19 112.69 44 102.66 69 95.89 94 89.80

20 90.03 45 101.91 70 110.72 95 83.07

21 111.56 46 76.35 71 86.28 96 112.14

22 109.26 47 111.30 72 107.97 97 113.90

23 83.67 48 89.33 73 117.23 98 100.46

24 112.97 49 79.89 74 104.12 99 110.39

25 116.87 50 110.17 75 95.97 100 98.43



UNI Data Set    Uniform Distribution on Interval 0 to 10    √β₁ = 0
                                                            β₂ = 1.80

No.    Observation    No.    Observation    No.    Observation    No.    Observation

1 8.10 26 6.54 51 3.93 76 4.26

2 2.06 27 8.24 52 0.08 77 3.32

3 1.60 28 9.12 53 3.51 78 9.29

4 8.87 29 0.31 54 0.44 79 2.57

5 9.90 30 2.63 55 1.22 80 0.55

6 6.58 31 6.20 56 1.12 81 6.53

7 8.68 32 5.47 57 2.34 82 2.33

8 7.31 33 7.80 58 1.86 83 9.01

9 2.85 34 1.30 59 8.35 84 7.86

10 6.09 35 9.39 60 3.53 85 7.06

11 6.10 36 8.67 61 5.05 86 8.54

12 2.94 37 1.87 62 5.28 87 9.71

13 1.85 38 6.67 63 6.87 88 8.49

14 9.04 39 5.90 64 2.96 89 2.08

15 9.38 40 0.15 65 2.35 90 0.50

16 7.30 41 3.91 66 4.02 91 3.54

17 2.11 42 8.87 67 1.44 92 3.75

18 4.55 43 2.50 68 9.63 93 9.46

19 7.66 44 7.49 69 9.44 94 0.04

20 9.63 45 0.55 70 5.44 95 7.79

21 9.48 46 5.25 71 3.71 96 8.08

22 5.31 47 5.61 72 4.21 97 3.60

23 5.76 48 1.00 73 2.22 98 8.85

24 9.66 49 3.23 74 2.87 99 1.50

25 4.37 50 1.05 75 0.72 100 0.18



EXP Data Set    Negative Exponential with Mean = 5    √β₁ = 2
                                                      β₂ = 9

No.    Observation    No.    Observation    No.    Observation    No.    Observation

1 8.15 26 11.89 51 1.27 76 5.19

2 4.69 27 7.26 52 1.56 77 0.26

3 2.17 28 14.71 53 16.81 78 9.46

4 0.37 29 0.23 54 6.07 79 0.95

5 16.69 30 1.21 55 3.89 80 0.51

6 0.06 31 0.18 56 9.60 81 1.39

7 6.48 32 1.24 57 3.12 82 3.74

8 2.63 33 12.94 58 4.16 83 4.37

9 0.44 34 4.78 59 0.07 84 3.87

10 0.89 35 18.53 60 1.67 85 5.40

11 6.96 36 9.20 61 3.80 86 2.41

12 5.15 37 1.65 62 1.52 87 5.93

13 9.78 38 2.20 63 2.79 88 39.12

14 6.47 39 1.13 64 0.36 89 1.05

15 0.99 40 5.20 65 4.49 90 0.47

16 7.70 41 14.74 66 9.76 91 9.57

17 1.61 42 2.86 67 2.37 92 8.29

18 1.68 43 0.19 68 9.91 93 3.79

19 0.92 44 0.08 69 6.60 94 2.35

20 1.87 45 3.22 70 0.17 95 1.09

21 14.80 46 1.21 71 14.68 96 4.19

22 9.96 47 3.51 72 3.72 97 12.21


23 25.92 48 5.67 73 6.92 98 1.57

24 3.37 49 10.50 74 2.53 99 3.52

25 2.76 50 10.45 75 4.77 100 0.48



LOG Data Set    Logistic    μ = 100    √β₁ = 0
                            β₂ = 4.2

No.    Observation    No.    Observation    No.    Observation    No.    Observation

1 96.91 26 86.98 51 112.50 76 98.51

2 109.99 27 79.23 52 109.82 77 107.91

3 102.97 28 110.70 53 94.66 78 132.40

4 118.54 29 98.58 54 107.08 79 103.32

5 63.35 30 76.52 55 108.22 80 116.01

6 94.63 31 93.44 56 81.61 81 111.18

7 144.28 32 89.81 57 102.90 82 65.87

8 104.47 33 100.62 58 85.94 83 96.30

9 111.81 34 108.75 59 66.35 84 83.74

10 78.32 35 103.91 60 97.12 85 91.97

11 109.91 36 87.71 61 90.09 86 94.95

12 98.07 37 145.33 62 111.92 87 98.95

13 82.45 38 121.83 63 83.89 88 98.21

14 114.97 39 99.52 64 77.45 89 98.71

15 103.08 40 116.58 65 74.29 90 108.88

16 78.48 41 106.05 66 102.90 91 68.44

17 97.45 42 92.55 67 113.41 92 118.92

18 107.64 43 79.07 68 104.37 93 117.01

19 83.73 44 111.59 69 100.46 94 89.22

20 116.99 45 103.18 70 104.14 95 123.39

21 103.82 46 105.03 71 51.90 96 85.30

22 131.24 47 101.19 72 105.34 97 123.58

23 95.86 48 102.81 73 108.94 98 113.79

24 111.90 49 106.17 74 103.43 99 102.86

25 60.57 50 112.12 75 81.17 100 88.22



WE.5 Data Set    Weibull with k = .5    √β₁ = 6.62
                                        β₂ = 87.72

No.    Observation    No.    Observation    No.    Observation    No.    Observation

1 .30 26 .06 51 2.26 76 .39

2 1.72 27 .01 52 1.69 77 1.36

3 .73 28 1.86 53 .21 78 10.75

4 4.00 29 .39 54 1.23 79 .76

5 .00 30 .01 55 1.41 80 3.19

6 .21 31 .17 56 .02 81 1.96

7 19.71 32 .09 57 .72 82 .00

8 .89 33 .52 58 .05 83 .28

9 2.10 34 1.50 59 .00 84 .03

10 .01 35 .82 60 .31 85 .14

11 1.71 36 .07 61 .10 86 .22

12 .36 37 20.64 62 2.12 87 .41

13 .03 38 5.24 63 .03 88 .37

14 2.89 39 .45 64 .01 89 .40

15 .74 40 3.36 65 .01 90 1.52

16 .01 41 1.08 66 .72 91 .00

17 .33 42 .15 67 2.47 92 4.13

18 1.31 43 .01 68 .87 93 3.49

19 .03 44 2.05 69 .51 94 .09

20 3.48 45 .75 70 .85 95 5.91

21 .81 46 .95 71 .00 96 .04

22 10.03 47 .57 72 .99 97 6.00

23 .26 48 .71 73 1.53 98 2.57

24 2.12 49 1.10 74 .77 99 .72

25 .00 50 2.17 75 .02 100 .07



WE2 Data Set    Weibull with k = 2    √β₁ = .63
                                      β₂ = 3.25

No.    Observation    No.    Observation    No.    Observation    No.    Observation

1 .74 26 .49 51 1.23 76 .79

2 1.15 27 .34 52 1.14 77 1.08

3 .92 28 1.17 53 .68 78 1.81

4 1.41 29 .79 54 1.05 79 .93

5 .16 30 .30 55 1.09 80 1.34

6 .68 31 .65 56 .38 81 1.18

7 2.11 32 .56 57 .92 82 .18

8 .97 33 .85 58 .47 83 .72

9 1.20 34 1.11 59 .18 84 .42

10 .33 35 .95 60 .75 85 .61

11 1.14 36 .51 61 .56 86 .69

12 .78 37 2.13 62 1.21 87 .80

13 .40 38 1.51 63 .43 88 .78

14 1.30 39 .82 64 .32 89 .79

15 .93 40 1.35 65 .27 90 1.11

16 .33 41 1.02 66 .92 91 .20

17 .76 42 .62 67 1.25 92 1.43

18 1.07 43 .34 68 .97 93 1.37

19 .42 44 1.20 69 .85 94 .54


20 1.37 45 .93 70 .96 95 1.56
21 .95 46 .99 71 .09 96 .45

22 1.78 47 .87 72 1.00 97 1.56

23 .71 48 .92 73 1.11 98 1.27


24 1.21 49 1.02 74 .94 99 .92

25 .14 50 1.21 75 .36 100 .52



SU(1,2) Data Set Johnson Unbounded (1,2) √β₁ = .87
β₂ = 5.59

No. Observation No. Observation No. Observation No. Observation

1 -.41 26 -.10 51 -1.00 76 -.47
2 -.90 27 .11 52 -.89 77 -.82
3 -.63 28 -.93 53 -.34 78 -1.88
4 -1.25 29 -.47 54 -.78 79 -.64
5 .50 30 .18 55 -.83 80 -1.15
6 -.34 31 -.30 56 .05 81 -.95
7 -2.46 32 -.19 57 -.63 82 .44
8 -.68 33 -.54 58 -.07 83 -.39
9 -.97 34 -.85 59 .43 84 -.01
10 .13 35 -.66 60 -.42 85 -.25
11 -.90 36 -.13 61 -.20 86 .35
12 -.45 37 -2.51 62 -.98 87 -.48
13 .02 38 -1.40 63 -.02 88 -.46
14 -1.10 39 -.50 64 .16 89 -.48
15 -.63 40 -1.17 65 .24 90 -.85
16 .13 41 -.74 66 -.63 91 .38
17 -.43 42 -.27 67 -1.04 92 -1.27
18 -.81 43 .11 68 -.68 93 -1.19
19 -.01 44 -.96 69 -.54 94 -.17
20 -1.19 45 -.64 70 -.67 95 -1.47
21 -.66 46 -.71 71 .76 96 -.06
22 -1.83 47 -.56 72 -.72 97 -1.48
23 -.38 48 -.62 73 -.86 98 -1.05
24 -.98 49 -.75 74 -.64 99 -.62
25 .56 50 -.98 75 .09 100 -.14



SU(0,3) Data Set Johnson Unbounded (0, 3) √β₁ = 0
β₂ = 3.5

No. Observation No. Observation No. Observation No. Observation

1 .06 26 .27 51 -.26 76 .03
2 -.21 27 .42 52 -.20 77 -.16
3 -.06 28 -.22 53 .11 78 -.63
4 -.38 29 .03 54 -.15 79 -.07
5 .70 30 .47 55 -.17 80 -.33
6 .11 31 .14 56 .37 81 -.23
7 -.83 32 .21 57 -.06 82 .66
8 -.09 33 -.01 58 .29 83 .08
9 -.24 34 -.18 59 .65 84 .33
10 .43 35 -.08 60 .06 85 .17
11 -.20 36 .25 61 .20 86 .11
12 .04 37 -.85 62 -.25 87 .02
13 .36 38 -.44 63 .33 88 .04
14 -.31 39 .01 64 .45 89 .03
15 -.06 40 -.34 65 .51 90 -.18
16 .43 41 -.13 66 -.06 91 .61
17 .05 42 .15 67 -.28 92 -.38
18 -.16 43 .42 68 -.09 93 -.35
19 .33 44 -.24 69 -.01 94 .22
20 -.35 45 -.07 70 -.09 95 -.47
21 -.08 46 -.10 71 .89 96 .30
22 -.61 47 -.02 72 -.11 97 -.47
23 .09 48 -.06 73 -.19 98 -.28
24 -.24 49 -.13 74 -.07 99 -.06
25 .75 50 -.25 75 .40 100 .24

SU(0,2) Johnson Unbounded (0, 2) √β₁ = 0
β₂ = 4.51

No. Observation No. Observation No. Observation No. Observation

1 .10 26 .41 51 -.39 76 .05
2 -.31 27 .65 52 -.31 77 -.25
3 -.09 28 -.33 53 .17 78 -1.01
4 -.58 29 .04 54 -.22 79 -.10
5 1.15 30 .73 55 -.26 80 -.50
6 .17 31 .21 56 .57 81 -.35
7 -1.39 32 .32 57 -.09 82 1.07
8 -.14 33 -.02 58 .44 83 .12
9 -.37 34 -.27 59 1.05 84 .51
10 .68 35 -.12 60 .09 85 .25
11 -.31 36 .38 61 .31 86 .16
12 .06 37 -1.42 62 -.37 87 .03
13 .55 38 -.68 63 .50 88 .06
14 -.47 39 .01 64 .70 89 .04
15 -.10 40 -.52 65 .80 90 -.28
16 .67 41 -.19 66 -.09 91 .98
17 .08 42 .23 67 -.42 92 -.59
18 -.24 43 .65 68 -.14 93 -.53
19 .51 44 -.36 69 -.01 94 .34
20 -.53 45 -.10 70 -.13 95 -.73
21 -.12 46 -.16 71 1.51 96 .46
22 -.97 47 -.04 72 -.17 97 -.74
23 .13 48 -.09 73 -.28 98 -.43
24 -.37 49 -.19 74 -.11 99 -.09
25 1.23 50 -.38 75 .62 100 .37



SB(0, 2) Data Set Johnson Bounded (0, 2) √β₁ = 0
β₂ = 2.63

No. Observation No. Observation No. Observation No. Observation

1 .52 26 .60 51 .41 76 .51

2 .42 27 .65 52 .42 77 .44

3 .48 28 .42 53 .54 78 .29

4 .37 29 .51 54 .45 79 .47

5 .73 30 .66 55 .44 80 .38

6 .54 31 .55 56 .63 81 .42

7 .24 32 .58 57 .48 82 .72

8 .47 33 .50 58 .60 83 .53

9 .41 34 .43 59 .71 84 .62

10 .65 35 .47 60 .52 85 .56

11 .42 36 .59 61 .58 86 .54

12 .52 37 .24 62 .41 87 .51

13 .63 38 .35 63 .62 88 .51

14 .39 39 .50 64 .66 89 .51

15 .48 40 .38 65 .68 90 .43

16 .65 41 .45 66 .48 91 .70

17 .52 42 .56 67 .40 92 .36

18 .44 43 .65 68 .47 93 .38

19 .62 44 .41 69 .50 94 .58

20 .38 45 .48 70 .47 95 .34

21 .47 46 .46 71 .77 96 .61

22 .30 47 .49 72 .46 97 .34

23 .53 48 .48 73 .43 98 .40

24 .41 49 .45 74 .47 99 .48

25 .74 50 .41 75 .64 100 .59



SB(0, .5) Johnson Bounded (0, .5) √β₁ = 0


β₂ = 1.6

No. Observation No. Observation No. Observation No. Observation

1 .60 26 .83 51 .18 76 .55

2 .23 27 .92 52 .23 77 .27

3 .41 28 .21 53 .66 78 .03

4 .10 29 .54 54 .29 79 .40

5 .98 30 .94 55 .27 80 .13

6 .66 31 .69 56 .90 81 .20

7 .01 32 .78 57 .41 82 .98

8 .36 33 .48 58 .85 83 .61

9 .19 34 .25 59 .98 84 .88

10 .93 35 .38 60 .59 85 .73

11 .23 36 .82 61 .77 86 .65

12 .56 37 .01 62 .19 87 .53

13 .89 38 .07 63 .87 88 .56

14 .14 39 .51 64 .93 89 .54

15 .40 40 .12 65 .95 90 .25

16 .93 41 .32 66 .41 91 .97

17 .58 42 .72 67 .16 92 .10

18 .28 43 .92 68 .37 93 .12

19 .88 44 .19 69 .49 94 .79


20 .12 45 .40 70 .37 95 .06

21 .38 46 .35 71 .99 96 .86

22 .03 47 .46 72 .34 97 .06

23 .63 48 .41 73 .25 98 .16

24 .19 49 .32 74 .39 99 .41

25 .98 50 .19 75 .91 100 .81



LCN(.05,3) Data Set Contaminated Normal √β₁ = .68
(p = .05, μ = 3) β₂ = 4.35
.95N(0,1) + .05N(3,1)

No. Observation No. Observation No. Observation No. Observation

1 .19 26 -.76 51 .60 76 3.77
2 -.19 27 .33 52 -.50 77 -.46
3 1.96 28 -.51 53 .69 78 1.40
4 -2.26 29 -.18 54 .28 79 .13
5 -.72 30 1.83 55 1.15 80 -.08
6 -.61 31 .61 56 -.24 81 .71
7 1.05 32 .97 57 -.02 82 1.14
8 -.19 33 1.^7 58 -1.31 83 -1.21
9 .16 34 -.82 59 1.82 84 -.32
10 .98 35 -.03 60 .48 85 .00
11 -.24 36 2.41 61 -1.52 86 .58
12 .26 37 -.55 62 .84 87 -.24
13 2.07 38 1.17 63 -.77 88 -1.88
14 1.22 39 -.49 64 -1.17 89 .44
15 .09 40 -.21 65 -.65 90 .27
16 .41 41 2.31 66 .79 91 .66
17 -.04 42 .23 67 -.97 92 -.51
18 -.24 43 .50 68 -.14 93 -1.24
19 -2.30 44 .07 69 .20 94 -1.19
20 .03 45 .08 70 .19 95 .27
21 -.38 46 1.74 71 .20 96 .13
22 1.23 47 -1.02 72 .50 97 1.68
23 -.20 48 -1.35 73 -.13 98 -.05
24 -.07 49 -1.36 74 -2.25 99 -.30
25 -.38 50 -.18 75 1.77 100 -1.70
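The √β₁ and β₂ values quoted for the contaminated normal sets follow directly from the central moments of the two-component mixture. A minimal sketch of that check (the function name mixture_shape is illustrative, not from the text):

```python
def mixture_shape(p, mu):
    """Population (sqrt(beta1), beta2) of the mixture (1-p)N(0,1) + pN(mu,1)."""
    weights, centers = (1.0 - p, p), (0.0, mu)
    m = sum(w * c for w, c in zip(weights, centers))  # mixture mean
    def moment(k):
        # kth central moment; a N(d,1) component contributes
        # mu2 = d^2+1, mu3 = d^3+3d, mu4 = d^4+6d^2+3 about the mixture mean
        forms = {2: lambda d: d * d + 1,
                 3: lambda d: d ** 3 + 3 * d,
                 4: lambda d: d ** 4 + 6 * d * d + 3}
        return sum(w * forms[k](c - m) for w, c in zip(weights, centers))
    mu2, mu3, mu4 = moment(2), moment(3), moment(4)
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2

# The three LCN parameter choices reproduce the tabulated shape values
for p, mu in [(.05, 3), (.10, 3), (.20, 3)]:
    print(p, [round(v, 2) for v in mixture_shape(p, mu)])
```

Running this recovers the (√β₁, β₂) pairs given in the headings of the three LCN data sets.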

LCN(.10,3) Contaminated Normal √β₁ = .8
(p = .10, μ = 3) β₂ = 4.02
.90N(0,1) + .10N(3,1)

No. Observation No. Observation No. Observation No. Observation

1 -.78 26 .96 51 -2.01 76 .78
2 -1.98 27 -.81 52 -1.12 77 .24
3 .57 28 -.04 53 .18 78 -.40
4 2.10 29 .88 54 .31 79 -1.42
5 .36 30 .07 55 3.54 80 .37
6 2.33 31 3.78 56 .00 81 -.47
7 .99 32 2.61 57 1.66 82 -.09
8 1.23 33 3.93 58 .74 83 .11
9 -1.05 34 -.06 59 .87 84 -1.24
10 .00 35 1.48 60 -1.02 85 3.67
11 -1.05 36 -.92 61 .67 86 .67
12 -.55 37 .87 62 3.03 87 .70
13 -.26 38 -2.60 63 3.42 88 -1.43
14 .09 39 .28 64 -1.95 89 -1.70
15 .08 40 .26 65 .83 90 .33
16 -.81 41 -.13 66 -.67 91 .44
17 3.30 42 -.69 67 -1.71 92 -1.00
18 -1.21 43 -2.13 68 -Л 0 93 .49
19 -1.31 44 -.27 69 3.54 94 -.10
20 .07 45 .49 70 .26 95 -.10
21 .71 46 .48 71 .32 96 1.84
22 .89 47 .02 72 .04 97 -.99
23 -1.54 48 .69 73 .16 98 .15
24 2.45 49 -.25 74 -.51 99 1.67
25 -.91 50 -.84 75 .52 100 1.30

LCN(.20,3) Data Set Contaminated Normal √β₁ = .68
(p = .20, μ = 3) β₂ = 3.09
.80N(0,1) + .20N(3,1)

No. Observation No. Observation No. Observation No. Observation

1 .09 26 -.95 51 1.42 76 4.12
2 -.24 27 2.03 52 1.17 77 3.57
3 1.44 28 -.87 53 2.00 78 .13
4 1.25 29 -1.61 54 .26 79 -1.55
5 2.24 30 2.93 55 -1.87 80 -.49
6 .16 31 -1.14 56 .80 81 .47
7 .05 32 -1.05 57 3.45 82 1.08
8 -.62 33 -.32 58 -.29 83 -.59
9 3.44 34 1.12 59 -.27 84 .64
10 -.04 35 -.25 60 1.29 85 1.49
11 -.41 36 1.12 61 -.04 86 2.39
12 1.49 37 1.31 62 .78 87 -.43
13 -.87 38 -.31 63 .62 88 .46
14 -2.61 39 -1.17 64 .79 89 -.68
15 .08 40 .05 65 .43 90 -.08
16 -.83 41 -.02 66 2.83 91 -.16
17 .32 42 3.86 67 .69 92 4.80
18 .43 43 .09 68 .55 93 1.71
19 2.86 44 -.34 69 .35 94 3.59
20 1.40 45 -1.05 70 1.78 95 2.25
21 -.08 46 -2.20 71 .02 96 -.71
22 2.03 47 -.22 72 2.01 97 -.53
23 .31 48 2.49 73 3.86 98 -1.82
24 4.58 49 -1.41 74 .06 99 -.03
25 .07 50 .49 75 -.54 100 5.04

SB(1,1) Data Set Johnson Bounded (1,1) √β₁ = .73
β₂ = 2.91

No. Observation No. Observation No. Observation No. Observation

1 .31 26 .45 51 .15 76 .29

2 .17 27 .55 52 .17 77 .18

3 .23 28 .16 53 .34 78 .06

4 .11 29 .29 54 .19 79 .23

5 .72 30 .59 55 .18 80 .12

6 .34 31 .36 56 .52 81 .16

7 .04 32 .41 57 .23 82 .70

8 .22 33 .26 58 .46 83 .32

9 .15 34 .18 59 .70 84 .49

10 .57 35 .22 60 .31 85 .38

11 .17 36 .44 61 .40 86 .34

12 .29 37 .04 62 .15 87 .28

13 .51 38 .09 63 .49 88 .29

14 .13 39 .27 64 .58 89 .29

15 .23 40 .12 65 .61 90 .18

16 .56 41 .20 66 .23 91 .68

17 .30 42 .37 67 .14 92 .11

18 .19 43 .56 68 .22 93 .12

19 .49 44 .15 69 .26 94 .42

20 .12 45 .23 70 .22 95 .09

21 .22 46 .21 71 .80 96 .47

22 .06 47 .25 72 .21 97 .09

23 .32 48 .24 73 .17 98 .14

24 .15 49 .20 74 .23 99 .24

25 .75 50 .15 75 .54 100 .43



SB(1,2) Data Set Johnson Bounded (1,2) √β₁ = .28
β₂ = 2.77

No. Observation No. Observation No. Observation No. Observation

1 .35 26 .39 51 .41 76 .49

2 .26 27 .46 52 .42 77 .43

3 .29 28 .46 53 .33 78 .44

4 .21 29 .43 54 .29 79 .17

5 .36 30 .31 55 .27 80 .57

6 .24 31 .37 56 .22 81 .37

7 .46 32 .39 57 .15 82 .37

8 .42 33 .68 58 .38 83 .34

9 .21 34 .61 59 .55 84 .25

10 .43 35 .51 60 .22 85 .34

11 .42 36 .37 61 .25 86 .28

12 .42 37 .23 62 .21 87 .45

13 .26 38 .27 63 .32 88 .39

14 .22 39 .32 64 .37 89 .47

15 .29 40 .33 65 .53 90 .45

16 .46 41 .29 66 .29 91 .33

17 .39 42 .34 67 .30 92 .44

18 .38 43 .30 68 .41 93 .65

19 .29 44 .24 69 .32 94 .43

20 .45 45 .30 70 .37 95 .26

21 .37 46 .42 71 .22 96 .50

22 .31 47 .44 72 .24 97 .44

23 .31 48 .33 73 .24 98 .44


24 .27 49 .45 74 .43 99 .53

25 .38 50 .33 75 .26 100 .68



SB(.535, .5) Data Set Johnson Bounded (.535, .5) √β₁ = .65
β₂ = 2.13

No. Observation No. Observation No. Observation No. Observation

1 .07 26 .00 51 .02 76 .03

2 .03 27 .01 52 .02 77 .15

3 .45 28 .05 53 .62 78 .05

4 .80 29 .49 54 .05 79 .41

5 .14 30 .32 55 .82 80 .15

6 .02 31 .02 56 .91 81 .25

7 .16 32 .24 57 .39 82 .39

8 .22 33 .01 58 .04 83 .11

9 .12 34 .00 59 .14 84 .08

10 .56 35 .30 60 .44 85 .43

11 .41 36 .07 61 .69 86 .55

12 .03 37 .20 62 .01 87 .30

13 .93 38 .33 63 .71 88 .42

14 .56 39 .23 64 .36 89 .93

15 .66 40 .55 65 .01 90 .95

16 .03 41 .04 66 .78 91 .04

17 .23 42 .10 67 .81 92 .05

18 .57 43 .15 68 .22 93 .30

19 .07 44 .42 69 .49 94 .63

20 .39 45 .26 70 .02 95 .72

21 .41 46 .06 71 .03 96 .98

22 .08 47 .52 72 .06 97 .15

23 .56 48 .25 73 .01 98 .09

24 .07 49 .22 74 .25 99 .09

25 .45 50 .15 75 .36 100 .08



4. Real Data Sets

Five real data sets are given here. They can be used to illustrate and compare various techniques; some are used in the text.

BLIS Data Set Data Set for Two Independent Normal Populations

Body weight in grams of white Leghorn chicks at 21 days; from two laboratories in a collaborative vitamin D assay (Bliss's data)

Series A Series B

Weight Frequency Weight Frequency

156 1 130 1
162 1 147 1
168 1 155 1
182 1 156 1
186 1 167 1
190 2 177 1
196 1 179 1
202 1 183 1
210 1 187 2
214 1 193 1
220 1 195 1
226 1 196 1
230 2 199 1
236 2 203 1
242 1 208 1
246 1 225 1
270 1 231 1
Total 20 232 1
236 1
246 1
Total 21

Source: Bliss, C. I. (1946). Collaborative comparison of three ratios for chick assay of vitamin D. J. Assoc. Off. Agr. Chem. 396-408; given in Bliss, C. I. (1967). Statistics in Biology, vol. 1, New York: McGraw-Hill, p. 108.

CHEN Data Set Data Set for Lognormal Population

Lethal dose of the drug cinobufagin in 10 (mg/kg), as determined by titration to cardiac arrest in individual etherized cats (Chen et al., 1931)

Dose log Dose f Dose log Dose f

1.26 0.100 1 2.34 0.369 1
1.37 0.137 1 2.41 0.382 1
1.55 0.190 1 2.56 0.408 1
1.71 0.233 1 2.63 0.420 2
1.77 0.248 1 2.67 0.427 1
1.81 0.258 1 2.82 0.450 2
1.89 0.276 2 2.84 0.453 1
1.98 0.297 1 2.99 0.476 1
2.03 0.308 3 3.65 0.562 1
2.07 0.316 1 3.83 0.583 1
Total 25

Source: Chen, K. K., H. Jensen, and A. L. Chen (1931). The pharmacological action of the principles isolated from ch'an su, the dried venom of the Chinese toad. J. Pharmacol. Expt. Therap. 43, 13-50. Given in Bliss, C. I. (1967). Statistics in Biology, vol. 1, New York: McGraw-Hill, p. 114.

BLAC Data Set Data Set for Testing for Poisson Model

Density of Eryngium Maritimum (Blackman's data)

Number of plants
per quadrat square Frequency

0 16
1 41
2 49
3 20
4 14
5 5
6 1
7 1
8 and over 0
Total 147

Source: Blackman, G. E. (1935). A study by statistical methods of the distribution of species in grassland associations. Ann. Bot. 49, 749-777. Given in K. Mather (1966). Statistical Analysis in Biology. London: Methuen, p. 37.
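Blackman's frequency table is the natural input for a Pearson chi-squared test of the Poisson model. A hedged sketch of that computation (the pooling of sparse cells that a formal test would require, and the open "8 and over" class, are deliberately left out here):

```python
import math

# Observed plants-per-quadrat frequencies (Blackman's data above)
observed = {0: 16, 1: 41, 2: 49, 3: 20, 4: 14, 5: 5, 6: 1, 7: 1}
n = sum(observed.values())                           # 147 quadrats
lam = sum(k * f for k, f in observed.items()) / n    # fitted Poisson mean (MLE)

# Expected counts under Poisson(lam) for the listed cells
expected = {k: n * math.exp(-lam) * lam ** k / math.factorial(k)
            for k in observed}
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
print(round(lam, 2), round(chi2, 2))
```

Since the mean is estimated from the data, the statistic would be referred to a chi-squared distribution with (number of cells after pooling) − 2 degrees of freedom.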

EMEA Data Set Data Set for Testing for Normality

Distribution of the Heights of Maize Plants (in decimeters)
(Emerson and East's data)

Height of plants
in dms
(class center) Frequency

7 1
8 3
9 4
10 12
11 25
12 49
13 68
14 95
15 96
16 78
17 53
18 26
19 16
20 3
21 1
530

x̄ = 14.5396

Σ(x − x̄)²/(N − 1) = 4.9936

4.9936 − .0833 (Sheppard's correction) = 4.9103

s = √4.9103 = 2.2159

Source: Emerson, R. A., and E. M. East (1913). Inheritance of quantitative characters in maize, Neb. Exp. Stat. Res. Bull. 2. Given in K. Mather (1966). Statistical Analysis in Biology. London: Methuen, p. 29.

BAEN Data Set Data Set for Testing for Double Exponential (Laplace)

Differences in flood stages for two stations on the Fox River in Wisconsin (Bain and Engelhardt's data)

1.97, 1.96, 3.60, 3.80, 4.79, 5.66, 5.76, 5.78, 6.27, 6.30, 6.78, 7.65,
7.84, 7.99, 8.51, 9.18, 10.13, 10.24, 10.25, 10.43, 11.45, 11.48, 11.75,
11.81, 12.34, 12.78, 13.06, 13.29, 13.98, 14.18, 14.40, 16.22, 17.06

Source: Bain, L. J., and M. Engelhardt (1973). Interval estimation for the two-parameter double exponential distribution. Technometrics 15, 875-887, p. 885.
Index

Accommodation of outliers, 499, 517 [b 2 statistics]


Alternative fam ilies, 491-492 Bowman and Shenton’s approxi­
Anderson-D arling test (see edf sta­ mation, 283, 322
tistic and edf tests) D'Agostino and P earson ’s sim u­
A N O V A , test fo r constant variance, lations, 388
153 D'Agostino and Tietjen’s sim u­
Asymmetry, 13 lations, 388
Asymptotics, 280, 291 Beta distribution, 338
transformation to F , 338
√b₁ statistic (see also normality, 42, 279, 375
tests of), 14, 283, 288, 289, Beta random variables, 248
290, 292, 293, 294, 295, 296, β₂, 23, 42, 279, 375
297, 302, 306, 321, 375, 376- Biased tests, 250
381 Bivariate contours, 312, 315, 316,
D'Agostino's approximation, 281, 317
285, 377 Bivariate model, 306, 311, 314, 317
normal approximation, 382 BMDP (Biomedical Computer Pro­
t approximation, 377 grams P), 9
Johnson's SU transformation, 281, Borel subsets, 244
285, 377
b 2 statistics (see also normality, Cauchy distribution, edf test for, 160
tests of), 283, 286, 288, 290, Censored data, 86-88
292, 293, 294, 295, 296, 297, confidence bands for, 120
302, 306, 322, 375, 388-390 edf statistics, and, 111-122, 480-
Anscombe and Glynn's approxi­ edf statistics, and, 111-122, 480-
mation, 388 481


[Censored data] Components of uniformity test sta­


Greenwood statistic, 339 tistics, 355
left censoring, 476, 478, 479 Composite goodness-of-fit problem
multiply censored, 468-473 ( see hypothesi s , composite)
progressively, 462, 468-471 Computer version of K| test, 322
random censoring, 119 Concentration bands, 248
replacement method, 120 Conditional probability integral
right censoring, 461, 462, 467, transformation (see C P IT and
468, 469, 476, 478, 480, 483 transformation)
singly-C en sored sam ples, 466-468 Conditioning, 483
test for exponential distribution, Confidence bands, censored data,
141 120
test for normality, 128 Confidence intervals using
transformation to complete sample, Kolmogorov D, 109
121 Consistency, 223
type I censoring, 112, 200, 461, Shapiro-W ilk W , 223
467-468 Shapiro-Wilk W^, lack of con­
type II censoring, 113, 199, 461, sistency, 223
467-468 S h a p iro -F ra n c ia W ', 223
Central limit theorem, 367 Consonance set, 109
Chi-squared statistic: Contaminant, 497
Chernoff-Lehmann, 67, 73-75, 76 Cornish-Fisher transformation, 398
Dzhaparidze-Nikulin, 79, 83 Correlation coefficient (see also
Freeman-Tukey, 68 correlation statistics and cor­
log likelihood, 65, 68 relation tests), 336
modified, 66 ANOVA, and, 198
Pearson, 64, 67, 72, 73 censored data, 196
Pearson-Fisher, 66, 87 consistent, 197
power of, 69, 72, 78, 91 Correlation between √b₁ and b₂, 292,
Rao-Robson, 78, 79-82, 85-86, 92 293, 294
Wald's method, 79, 87 Correlation intervals, 434
W atson-Roy, 76 Correlation statistics (see also c o r­
Choice of cells in chi-square, 69-71 relation tests), 482
Coefficient of variation, 424, 428, Correlation tests (see also regression
435. 457 tests), 195, 223
Circle, tests for observations on a. Cauchy distribution, for the, 230
107, 347 censored samples, 199, 200
omnibus tests. 350 exponential distribution, for the,
Combination of tests, 176-179, 357 215
Combining independent tests for exponential power distribution, for
several samples. 357 the, 230
Competing models, 4 73-474 extreme value distribution, for the.
Complete sufficient statistic. 243 225
Component distributions, 17 logistic distribution, for the, 225
Components of edf statistics, 374 normal distribution, for the. 201

[Correlation tests] [Data sets fo r examples]


R* (X, l i ) , based on, 205 [NO R ] 394-395, 399, 465-466,
R^(X ,m ), based on, 204 467, 487, 489, 492
uniform distribution, for the, 199, SB(0, .5 ), 316
201 S B (0 ,2 ), 316
U (0 ,1 ), 201 SB(.533, .5 ), 316
Covariance of order statistics for the S B (1 ,1 ), 316
normal distribution, 208 S B (1,2), 316
Cramér-von Mises statistic (see edf SU(0,2), 42-43, 316, 382, 383,
statistics and tests) 389, 392
CPIT, 240, 254, 433 SU(0,3), 286, 316
exponential distribution, 253-254 SU(1,2), 12-14, 15, 316, 382, 383,
lognormal distribution, 256-257 389, 392
normal analysis of variance, 259- UNI, 32-34, 42-43, 266, 285, 316,
260 382, 383, 389, 392, 481
normal distribution, 255-256 WE.5, 316
normal linear regression, 257-259 WE2, 20-22, 55-56, 80-81, 316,
Pareto distribution, 254-255 469-471
uniform distribution, 252-253 Delta theorems, 367
Contamination, 15, 42 Dependent observations, 89
cdf. 8, 19, 22, 24-25, 28, 35, 64. D FR (decreasing failure ra te ). 167.
235, 236. 239 422. 428, 432, 435, 447, 451,
Cumulative distribution function 454, 456
(see cdf) Directed divergence, 68
Cumulative total time on test statis­ Discordant observation, 497
tic. 448 Discrete distributions, edf tests for,
171-176
Data-dependent cells (in chi-squared). Distribution function (see cdf)
75, 80. 82-84, 84-86, 87-88 df (see cdf)
Data sets for examples: Divergent se rie s, 292
B A E N . 82-84 Double trans it iv ity, 241
BLIS. 98-99, 125. 204. 213, 267.
285 ecdf, 8-10
CHEN, 49-50 plotting, 8
EMEA, 75 edf statistics (see also edf tests) :
E X P , 12-14. 15, 20-21. 43, 81, 88. A n d erso n -D arlin gA ^ , 100, 101
266, 285, 316, 373, 382, 383, components, 164
389, 392. 394-395, 399 confidence sets, use of edf statis­
LOG, 26-31. 32-33, 75, 285, 316 tics for. 109
L C N (.0 5 ,3 ). 316 C ram er-von M ises , 100, 101
L C N (.1 ,3 ), 16. 44-45. 316 indicators of the parent population,
L C N (.2 .3 ), 17, 44-46, 316 as, 180
NOR, 8-9. 12-14, 15, 23-24, 40- quadratic statistics, 100
41, 42. 44, 73-74. 266, 285, Kolmogorov D, 100, 101
316, 373. 382, 383. 389, 392. Kuiper V, 100. 101

[edf statistics] [edf tests]


Supremum statistics, 100 von M ises distribution, for, 164-
W atson U ^. 100, 101 166
edf techniques, 24 W eibull distribution, for the, 149-
edf tests (see also edf statistics), 150
102-184 E m pirical characteristic function,
Ander son-D a r ling, 4, 372-373, 485 tests based on, 170
Cauchy distribution, for the, 160 E m pirical cumulative distribution
chi square distribution, for the, 153 function (see ecdf and edf)
combining edf tests statistics, 177- Entropy, 344
179 Equivalent normal deviates, 287,
use of standardized values, 179 296, 297, 300, 322, 328
Cram ^r-von M ises, 4. 480 Estimates of scale, tests based on,
discrete distributions, test for, 206
171-176 Exchangeability, 483
em pirical characteristic function, Expected values of ordered statis­
tests based on. 170 tics. 31
exponential distribution, for. 133- standard normal, 37-39. 41. 202
145 Exploratory technique, 7
extreme value distribution, for. Exponential distributlon :
145-149 gamma distribution as an alterna­
fully specified distribution, for a tive, 454, 457
(Case 0), 104-122 J transformation, 422, 424, 430
censored data, tests on. 111-122 K transformation. 422, 424. 431,
circle, observations on a. 107, 481 433. 445
power. HO N transformation. 422. 424, 429,
gamma distribution, for the. 151- 431, 445
156 test situations, 432
half sample method. 169 test on interval between events.
Kolm ogorov-Smirnov. 4, 111, 481 432
Kuiper. 481 tests on lifetimes, 432
logistic distribution, for the. 156- W eibull distribution as an alterna­
160 tive, 449. 454, 457
Mann. Scheuer, and Fertig. 183 Exponential distribution, applica­
median and mean of transformed tions of, 426
SpacingS, tests based on. 183 lifetime experiments. 427
normality, tests for, 122-133 Poisson process, 426, 433, 434
censored data. 128 reliability theory, 427
power. 166-168 time to failure, 426
Renyi statistics. 121 Exponential distribution, tests of
symmetry, for, 170 the. (origin known). 435-455
unknown shape param eter in edf. Bickel and Doksum tests. 449
103 censored data, on, 141
using normalized spacings. 180 correlation tests. 215
power. 184 edf tests, 133-145. 436. 438

[Exponential distribution, tests of [Exponential distribution, tests of


the, (origin known)] the, (origin and scale unknown))
gamma distribution as an alterna­ power comparison, 455-456
tive, 454, 457 Shaplro-W ilk W g , 455
Greenwood statistics, 217, 439, Extrem e tail percentiles, 2
440-441, 454 Extrem e value distribution, edf
Gurland and Dahiya moment tests, tests for, 145-149
436, 454 using normalized spacings, 181
Jackson J, 436, 454
Kendall-Sherman К, 439, 443, 454 F ailu re censoring, 461
linear failure rate alternatives, 449 Failu re rate, 422, 428
Lorenz curve, tests based on, 444, Fiducial lim its, 239
454 F ish er distribution on the sphere,
Makeham alternative, 449 348
mean U statistic, 439, 447, 454 F ish er, R. A . , 280, 281
median U statistic, 439, 447, 454 F is h e r's P , 347, 358, 359, 360
M oran M , 442, 454 Form al num erical techniques, 7
omnibus tests, 451, 454 F orm al statistical tests, I
power study, 222, 451-455
ratio of two estimates of scale, Gamma distribution, edf tests, 151-
tests based on, 218 156
regression tests, 215, 436, 438 Gaps, 343
deW et-Venter, 222 Generalized least squares, 206
Jackson J, 222 Generalized likelihood ratio statis­
power of tests, 222 tic, 487
Shaplro-W ilk W g , 218 Geometric mean, 50
Stephens W s , 219 Geometric standard deviation, 50
residuals, tests based on, 218 G ra m -C h a rlie r system, 281
sample moment tests, 436, 454 Graphical display of a random
tests based on J transformation, sam ple, 195
438-444 Graphical techniques, 2, 7
tests based on K transformation, Graph paper (see also probability
445-451 plotting papers) :
tests based on N transformation, arithmetic, 8, 39
445-451 four cycle, 21
tests fo r trend, 439 log, 8
Shaplro-W ilk W g , 218, 438 sem i-lo g, 19, 21
Stephens W s, 219, 436, 454 Greenwood statistic (see also edf
W eibull as an alternative, 449, tests, exponential distribution,
454, 457 tests of the, and uniform d istri­
Exponential distribution, tests of bution, test of the), equivalent
the, (origin and scale unknown), to other statistics, 441
455-456 Grouped data, 10, 33, 65, 171, 405,
correlation statistics, 455 478
edf statistics, 455

Half sample method, 169 Leaps, 429, 489-491


Hazard plotting, 469 Life testing, 461
Hazard rate, 428 Lifetim e data, tests on, 434
Heavy tailed distribution, 19 Lifetim e, model for the, 426
Heteroscedasticity, 246 Light tailed distribution, 21
Higher ord er spaeings, 343 Linear funetions of order statlstios,
Hypothesis: 367
alternative, I, 2, 369, 372, 373 Linear models, outlier Identification,
eomposite, I , 4, 235, 368, 372, 507, 516
373, 487-492 Loglikellhood ratio statistic, 65, 68,
null, I, 2, 102, 104, 122, 368, 358
372, 422, 479 Logistic distribution:
simple, I, 4, 235, 368, 372, 479 edf test for, 156-160
surrogate, 237 using normalized spaeings, 181
probability plotting, 26-32
Identifieation of outliers, 499 regression test for, 224
Independent tests, 359, 360 Lognorm al distribution:
IFR (inereasing failure rate), 167, censored data, 492
422, 428, 432, 435, 447, 450, probability plotting, 47-54, 473-
454, 456 474
Intervals between events, 456 three param eter, 51
Invarianee, 432 two param eter, 47
Invariate : Lorenz curve, test based on, 444,
maximal, 244 454
permutation, 245
Iteration, 324 M axim al invariant, 244
M L E (maximum likelihood estim ator),
Jaekknife teehnlque, 469 65, 79, 84
Johnson’s S translation system, 281, raw data, 67, 87
287, 294, 297, 306, 307, 322 Mean Ü ,^347, 439
J transformation, 332, 422, 424, 430 Median Û, 338, 439
K transformation, 422, 424, 431, Maximum value, 18
433, 445 Minimum chi-squared estim ator, 65
K aplan -M eler estimator, 119, 468 Minimum modified chi-squared esti­
Kolm ogorov-Sm irnov statlstle D mator, 66-67
(see edf tests and edf statistlos) Mixtures of distributions, 15-18, 42
к statistles, 280 param etric techniques, 18
K^ test (see D ’Agostino and Pearson Moments of sample moments:
omnibus test under normality, bivariate, 288
test for) norm al mixture, 290, 304, 305,
K| test: 314, 315, 316
normality, for (see normality, Pearson Type I distribution, 290
tests for) uniform distribution, 289
nonnormal sampling, 296 univariate, 288
kurtosis (see also and Ь з ), 435 Monte C arlo simulations, 315

Most powerful sim ilar test, 245 [N orm ality, tests of]
M ultivariate normality, tests of: G eary’s a, 392
directional normality, 411 test, 283, 297, 390, 403-406
generalization of univariate tests, Kg test, 282, 283, 286, 296, 300,
409-411 301, 309, 322, 328, 391, 403, 406
Machado, 410 LaBrecque, 400
MaIkovich and A fifi, 410 likelihood ratio, 400
M ardia, 409 Locke and S p u rrier’s U statistic,
maximum curvature test, 412 401
nearest distance test, 412 moment, 375
radius and angle decomposition, 411 Neyman smooth test, 249, 261
solely multivariate procedures, norm alized spacings, using, 181
411-413 omnibus tests based on moments,
univariate procedures, 409 390-391
power studies, 214, 403-404
Negatively skewed, 11 R test, 390
Neym an-Pearson theory, 2 regression tests, 201-205, 393-
Nonparametric estimator of the cdf, 401
468, 476 residuals, on, 406-408
Norm al mixture, 290, 304, 305, 314, autoregressive, 408
315, 316 linear regression , 132, 407
Norm ality, multivariate (see multi­ Royston, 400
variate normality, tests of) sam ple range, 392
Norm ality, tests of: Shapiro-Francia W ’ , 213, 223,
Anderson-Darling, 372-374 399, 403-406
n/Ь^, 376-381, 403-406 Shapiro-W ilk W , 3, 4, 206, 208,
Bowman and Shenton’s omnibus test 211, 252, 393, 403-406
(see Kg test under normality, Spiegelhalter, 402
tests of) third standardized moment, 376-
b2, 388-390, 403-406 381, 403-406
chi-squared type, 370-371 tie s , effect of, 405
comparison of, 403-405 Tiku, 402
correlation coefficient tests, 201- l^^tsonU^, 249, 261
205 W eisberg-B in gh am , 399
D ’Agostino and Pearson ’s omnibus, N transformation, 422, 424, 429,
283, 297, 390, 403-406 431
D ’Agostino’s D, 212, 395-399, N U residuals (norm al uniform
403-406 resid u als), 247, 250, 251
effects of ties on, 405
edf, 122-133, 214, 371-374 Omnibus tests (see also normality,
edf for simple null hypothesis, 372 tests of), 251, 283, 285, 315,
entropy, 344 390-391, 486
Filliben, 400 Omnlbustest contours, 296-297,
fourth standardized moment, 388- 302, 315
390, 403-406 Ordinary least sq u ares, 197, 198

Outliers, 14-15, 42, 497 Pow er transformation, 58


accommodation of, 499, 517 P - P plots, 58
identification of, 499 P IT (see probability Integral trans­
multiple outliers in a univariate formation)
sample, 504 Probability integral transformation,
multiple outliers in the linear 101, 207, 239-246, 332, 357,
model, 516 422, 480
single outlier in the linear model, classical, 247
507 conditional, 240, 254
time series, in, 520 Probability plot, 2, 4, 25, 34, 412
Probability plotting, 24-57, 463-479
Padé approximations, 291, 292 CPIT analysis, 261-271
Param eter estimation from plots: gamma distribution, 59, 475, 478
best linear unbiased, 29 general concepts, 24, 34
informal, 26 normal distribution, 35-47, 465-
unweighted least squ ares, 29 467
Patterns of U -valu es, 356 lognormal distribution, 47-54,
Peakedness, 369, 375 473-474
Pearson, 280, 281 logistic, 26-28
Pearson curve (see Pearson system) Weibull, 54-57, 469-471
Pearson populations (see Pearson Probability plotting papers, 8, 34
system) normal probability paper, 39
Pearson system, 280, 281, 286, logistic, 34
287, 398 lognormal, 50
Pearson Type I distribution, 293, Weibull, 54
304, 306, 317 Proportional hazards, 478-479
Periodicity of events, 434, 438 P -statistic, 358
Permutation invariant transform a­
tions, 246 Quantile probabilities, 464
Poisson process, 426, 433 Quadratic form , 197
independence of intervals, 434 Quadratic statistics (see edf statis­
periodic alternatives, 434 tics and edf te sts), 357
trend alternatives, 433 Q -Q plots, 58, 462
Positively skewed, 11, 17 Quantile-quantile plots (see Q -Q
Pow er (see also power entries under plots)
chi-squared tests, edf tests,
exponential distribution tests, R ao-Blackw ell estimate, 243
normal distribution tests, and R ao-Blackw ell theorem, 168
uniform distribution tests), 2, 69, Ratio of maximum likelihood, 491
72, 78 Rational fraction approxlmants, 287
Pow er of a test (see power entries Recursive residuals, 258-259
under chi-squared tests, edf Regression tests (see also c o rre ­
tests, exponential distribution lation te sts), 195
tests, normal distribution tests, Cauchy distribution, for the, 224
and uniform distribution tests) deWet and Venter’s general p ro­
cedure fo r tests, 224

[R egression tests] [Spacings]


exponential distribution, for the, 215 between the U set, 440
extreme value distribution, for the, correlated, 434
224 norm alized spacings, 429, 456
logistic distribution, fo r the, 224 unordered uniform spacings, 333
norm al distribution, fo r the, 201- Spheres:
215 distributions on, 348
residuals, based on, 205 tests on, 347
Reliability theory, 427 omnibus tests, 350
Renewal processes, 478 Standardized moments (see also
Residuals, 197 /Зг, ^/Fl, b j ) , 281
tests fo r normality of, 132, 207, SAS (Statistical Analysis System ), 9
406-408 SPSS (Statistical Package for the
Renyi statistics, 121 Social Sciences), 9
R eversals, 450 Statistics fo r the c ircle o r the sphere,
Robust regression , 517 347
Step function, 10
Safe sample size, 290, 301 Stieltjes continued fractions, 291
Sample, delete, 242 Stretches, 344
Scaled residuals, 412 Sufficient statistics, 168
Scan statistic, 345 Summation technique, 296
on the circle, 345 Superuniform observations, 106,
Scatter d iagram , 33 334, 432, 434, 438
Separate fam ilies, 236 Supremum class, 357
several sam ples, 236 Symmetry, 11-14
testing problem , 236 tests for, 170
Series of events, tests on a, 433 Symmetry plots, 14
Several sam ples goodness-of-fit
problem , 236 T ail thickness, 18-20, 23, 369-370
Shapiro-W ilk (see tests fo r exponen­ T aylor expansions, 288
tial distribution and normal T aylor se rie s, 297
distribution) Tests of normality (see normality,
Significant tall of a test statistic, tests of)
425, 454 T ies in data, 33, 405, 444
Sim ilar test, 237 Tim e censoring, 461
Simple goodness-of-fit problem (see Tim e se rie s, outliers in, 520
hypothesis, simple) Total time on test statistic, 429
Simultaneous behavior of \/Ц and Ь з , Total time on test till rth failure,
295 430
Skewness (see also and ^/Vl), 2, Transform ation :
13, 435 censored data, on, 431
Slippage, 246 conditional probability integral
Spacing, 483 (see СРГГ)
Spacings, 332, 424, 429, 440, 442, 456 exponential, to, 422
autocorrelation, 343 N transformation, 422, 424, 429,
445

[Transformation]
  normality, to, 58
  permutation invariant, 246
  probability integral, 239-246
  uniform, to, 332
    exponential to uniform, 332
    G transformation, 333, 433
    J transformation, 332, 422, 424, 430, 433, 438
    K transformation, 332, 422, 424, 433, 445
    uniform to uniform, 333
    W transformation, 333, 433
Transformation group, 244
Transformation to normality, 58
Transition zone, 45
Transitive, 244
Truncated data, 461
Truncation parameter families, 241
Unbiasedness, 69
Uniform conditional test, 435
Uniform distribution and sample:
  ordered uniform sample, 332, 421
  standard uniform, 332
  uniform random variable, 332
  uniform sample, 332, 421
Uniform distribution, tests of the, 331
  Ajne's statistic, 349
  Anderson-Darling, 252
  C-Class, 336
  censored uniforms, 331, 361
  circle or sphere, on, 347
  components of test statistics, 355
  correlation tests, 336
  effect on due to patterns of U-statistics, 356
  edf tests, 334
  entropy, 344
[Uniform distribution, tests of the]
  Fisher's P, 347
  Greenwood statistic, 339, 440
    adapted to censored data, 339
  likelihood ratio methods, 345
  Moran's statistic, 343
  Neyman (Neyman-Barton) smooth tests, 249, 351, 357
  order statistics, tests based on, 336
  P², 250
  power of tests, 357
  Quesenberry and Miller, 343
    unknown limits, 360
  regression tests, 336
  scan statistics, 345
  Shapiro-Wilk, 224
  U statistics, 346
  Watson U², 249
Uniformity, tests of (see uniform distribution, tests of)
Uniform residuals, 247
Unknown shape parameters, 103
Unweighted least squares (see also ordinary least squares), 29
Updating formulas, 257
Variance, sample, 197
von Mises distribution, 347
  edf test for, 164-166
Weibull distribution:
  censored data, 492
  edf tests, 149-150
  probability plotting, 469-473
  three parameter, 57
  two parameter
Zero data values, 51, 57, 444
about the book . . .

Conveniently grouping methods by techniques, such as chi-squared and empirical distribution


function, and also collecting methods of testing for specific famous distributions, this useful
reference is the first comprehensive review of the extensive literature on the subject. It surveys
the leading methods of testing fit . . . provides tables to make the tests available . . . assesses
the comparative merits of different test procedures . . . and supplies numerical examples to aid
in understanding these techniques.

Goodness-of-Fit Techniques shows how to apply the techniques . . . emphasizes testing for the
three major distributions, normal, exponential, and uniform . . . discusses the handling of cen­
sored data . . . and contains over 650 bibliographic citations that cover the field.

Illustrated with tables and drawings, this volume is an ideal reference for mathematical and
applied statisticians, and biostatisticians; professionals in applied science fields, including psy­
chologists, biometricians, physicians, and quality control and reliability engineers; advanced
undergraduate- and graduate-level courses on goodness-of-fit techniques; and professional sem-
inarr. and symposia on applied statistics, quality control, and reliability.

about the editors . . .

Ralph B. D’Agostino is Professor of Mathematics and Statistics in Boston University’s Mathe­
matics Department, Professor of Public Health in Boston University’s School of Public Health,
and Lecturer in Law at Boston University’s Law School. With extensive experience as a con­
sultant to firms and research groups, such as Lever Brothers, American Institute for Research
in the Social Sciences, and United Brands Company, he is also a consultant to the Biometrics
Division of the Food and Drug Administration and Special Research Scientist at Boston City
Hospital. The author or coauthor of over 60 articles and book chapters on statistics, tests of
normality, and biostatistics, Dr. D’Agostino is the coauthor of one book, Factor Analysis, and
editor of the journal, Emergency Health Services Review, as well as editorial board member of
Statistics in Medicine. He is a member of the American Statistical Association, American Asso­
ciation for Quality Control, Institute of Mathematical Statistics, and American Public Health
Association. Dr. D’Agostino received the A.B. (1962) and A.M. (1964) degrees from Boston
University, and Ph.D. degree (1968) from Harvard University.

Michael A. Stephens is Professor of Mathematics and Statistics at Simon Fraser University in
Burnaby, British Columbia, Canada. Prior to that he taught at several universities, including
McGill, Nottingham, McMaster, and Toronto, and was a visiting professor at Stanford, Wisconsin-
Madison, and Grenoble, as well as consultant to medical groups, government agencies, and pri­
vate companies. The author or coauthor of some 65 articles on the analysis of directional data,
continuous proportions, curve-fitting, and tests of fit, Dr. Stephens was President of the Statis­
tical Society of Canada in 1983. He is a Fellow of the Royal Statistical Society, and his honors
include membership in the International Statistical Institute, and fellowships of the American
Statistical Association and the Institute of Mathematical Statistics. Dr. Stephens received the
B.Sc. degree (1948) from Bristol University and A.M. degree (1949) in physics from Harvard
University, where he was the first Frank Knox Fellow, and Ph.D. degree (1962) from the Uni­
versity of Toronto.

Printed in the United States of America ISBN: 0-8247-8705-

marcel dekker, inc. / new york • basel • hong kong
