


Ilya Gertsbakh

Measurement Theory
for Engineers

With 25 Figures and 49 Tables

Springer
Ilya Gertsbakh

Ben Gurion University of the Negev


Department of Mathematics
84105 Beer-Sheva
Israel

ISBN 978-3-642-05509-6 ISBN 978-3-662-08583-7 (eBook)


DOI 10.1007/978-3-662-08583-7

Cataloging-in-Publication Data applied for.


Bibliographic information published by Die Deutsche Bibliothek. Die Deutsche Bibliothek lists
this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on
the Internet at <http://dnb.ddb.de>.
This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilm or in other ways, and storage in data banks. Duplication
of this publication or parts thereof is permitted only under the provisions of the German Copyright
Law of September 9, 1965, in its current version, and permission for use must always be obtained
from Springer-Verlag. Violations are liable to prosecution under German Copyright Law.

http://www.springer.de
© Springer-Verlag Berlin Heidelberg 2003
Originally published by Springer-Verlag Berlin Heidelberg New York in 2003.
Softcover reprint of the hardcover 1st edition 2003
The use of general descriptive names, registered names, trademarks, etc. in this publication does
not imply, even in the absence of a specific statement, that such names are exempt from the
relevant protective laws and regulations and therefore free for general use.
Typesetting: Data conversion by author
Cover design: Medio, Berlin
Printed on acid-free paper  62/3020 hu - 5 4 3 2 1 0
To my wife Ada
Preface

The material in this book was first presented as a one-semester graduate course
in Measurement Theory for M.Sc. students of the Industrial Engineering De-
partment of Ben Gurion University in the 2000/2001 academic year.
The book is devoted to various aspects of the statistical analysis of data
arising in the process of measurement. We would like to stress that the book is
devoted to general problems arising in processing measurement data and does
not deal with various aspects of special measurement techniques. For example,
we do not go into the details of how special physical parameters, say ohmic
resistance or temperature, should be measured. We also omit the accuracy
analysis of particular measurement devices.
The Introduction (Chapter 1) gives a general and brief description of the
measurement process, defines the measurand and describes different kinds of
the measurement error.
Chapter 2 is devoted to the point and interval estimation of the popula-
tion mean and standard deviation (variance). It also discusses the normal and
uniform distributions, the two most widely used distributions in measurement.
We give an overview of the basic rules for operating with means and variances
of sums of random variables. This information is particularly important for
combining measurement results obtained from different sources. There is a
brief description of graphical tools for analyzing sample data. This chapter also
presents the round-off rules for data presentation.
Chapter 3 contains traditional material on comparing means and variances
of several populations. These comparisons are typical for various applications,
especially in comparing the performance of new and existing technological pro-
cesses. We stress how the statistical procedures are affected by measurement
errors. This chapter also contains a brief description of Shewhart charts and a
discussion on the influence of measurement errors on their performance.
When we measure the output parameter of a certain process, there are two
main sources of variability: that of the process and that due to the errors
introduced by the measurement process. One of the central issues in statistical
measurement theory is the estimation of the contribution of each of these two
sources to the overall variability. This problem is studied in the framework
of ANOVA with random effects in Chapter 4. We consider one-way ANOVA,
repeatability and reproducibility studies in the framework of two-way ANOVA
and hierarchical design of experiments.
Very often, it is not possible to carry out repeated measurements, for ex-
ample, in destructive testing. Then a special design of experiment is necessary
to process the measurement data. Two models dealing with this situation are
considered in Chapter 4: one is a combination of two hierarchical designs, and
another is the Grubbs model.
There are two principal types of measurements: direct and indirect. For
example, measuring the voltage by a digital voltmeter can be viewed as a direct
measurement: the scale reading of the instrument gives the desired result. On
the other hand, when we want to measure the specific weight of some material,
there is no such device whose reading would give the desired result. Instead, we
have to measure the weight W and the volume V, and express the specific weight
ρ as their ratio: ρ = W/V. This is an example of an indirect measurement. The
question of principal importance is estimating the uncertainty of the measure
of specific weight introduced by the uncertainties in measuring W and V. The
uncertainty in indirect measurements is estimated by using the so-called error
propagation formula. Its derivation and use are described in Chapter 5.
Chapter 6 is devoted to calibration of measurement instruments. In many
instances, we are not able to measure directly the parameter of interest, x.
Instead, we are able to measure some other parameter, say y, which is related to
x via some unknown relationship y = φ(x). When we observe the value y = y₀,
we must find out the corresponding value of x as x₀ = φ⁻¹(y₀). The statistician
is faced with the problem of estimating the unknown function φ(·) and the
uncertainty in the value of x₀. These problems are known as "calibration" and
"inverse regression". We present the results for linear calibration curves with
equal and variable response variances, and with uncertainties in both variables
x and y.
It is a well-known fact from measurement practice that if similar samples
are analyzed independently by several laboratories, the results might be very
diverse. Chapter 7 deals with statistical analysis of data obtained by several
laboratories, so-called "collaborative studies". The purpose of these is to locate
laboratories which have large systematic and/or random measurement errors.
Chapter 8 is devoted to the study of two special situations arising in mea-
surements. One is when the measurand variability is of the same magnitude as
the measurement instrument scale unit. Then the repeated measurements are
either identical or differ only by one scale unit. The second is measurements
under constraints on the measured parameters.
All chapters, except Chapter 7, have exercises, most with answers and solu-
tions, to enable the reader to measure his/her progress.
The book assumes that the reader already has some acquaintance with prob-
ability and statistics. It would also be highly desirable to have some knowledge
of the design of experiments. In particular, I assume that such notions as mean,
variance, density function, independent random variables, confidence interval,
the t-test and ANOVA, are familiar to the reader.


I believe that the book might serve as a text for graduate engineering stu-
dents interested in measurements, as well as a reference book for researchers and
engineers in industry working in the field of quality control and in measurement
laboratories.
I would like to express my deep gratitude to Dr. E. Tartakovsky for many
valuable discussions on measurement and for introducing me to some aspects of
statistical measurement theory.

Ilya Gertsbakh
Beersheva, October 2002
Contents

Preface

1 Introduction
1.1 Measurand and Measurement Errors
1.2 Exercises

2 Mean and Standard Deviation
2.1 Estimation of Mean and Variance
2.2 How to Round off the Measurement Data
2.3 Normal Distribution in Measurements
2.4 The Uniform Distribution
2.5 Dixon's Test for a Single Outlier
2.6 Using Range to Estimate σ
2.7 Confidence Interval for the Population Mean
2.8 Control Charts for Measurement Data
2.9 Exercises

3 Comparing Means and Variances
3.1 t-test for Comparing Two Population Means
3.2 Comparing More Than Two Means
3.3 Comparing Variances
3.4 Exercises

4 Sources of Uncertainty: Process and Measurement Variability
4.1 Introduction: Sources of Uncertainty
4.2 Process and Measurement Variability: One-way ANOVA with Random Effects
4.3 Hierarchical Measurement Design
4.3.1 The Mathematical Model of Hierarchical Design
4.3.2 Testing the Hypotheses σ_A = 0, σ_B(A) = 0
4.4 Repeatability and Reproducibility Studies
4.4.1 Introduction and Definitions
4.4.2 Example of an R&R Study
4.4.3 Confidence Intervals for R&R Parameters
4.5 Measurement Analysis for Destructive Testing
4.6 Complements and Exercises

5 Measurement Uncertainty: Error Propagation Formula
5.1 Introduction
5.2 Error Propagation Formula
5.3 EPF for Particular Cases of Y = f(X₁, ..., X_n)
5.4 Exercises

6 Calibration of Measurement Instruments
6.1 Calibration Curves
6.1.1 Formulation of the Problem
6.1.2 Linear Calibration Curve, Equal Response Variances
6.2 Calibration Curve for Nonconstant Response Variance
6.2.1 The Model and the Formulas
6.2.2 Uncertainty in x* for the Weighted Case
6.3 Calibration Curve When Both Variables Are Subject to Errors
6.3.1 Parameter Estimation
6.3.2 Uncertainty in x When Both Variables Are Subject to Errors
6.4 Exercises

7 Collaborative Studies
7.1 Introduction: The Purpose of Collaborative Studies
7.2 Ruggedness (Robustness) Test
7.3 Elimination of Outlying Laboratories
7.4 Homogeneity of Experimental Variation

8 Measurements in Special Circumstances
8.1 Measurements With Large Round-off Errors
8.1.1 Introduction and Examples
8.1.2 Estimation of μ and σ by Maximum Likelihood Method: y* = 0 or 1
8.1.3 Estimation of μ and σ: y* = 0, 1, 2
8.2 Measurements with Constraints
8.3 Exercises

Answers and Solutions to Exercises

Appendix A: Normal Distribution
Appendix B: Quantiles of the Chi-Square Distribution
Appendix C: Critical Values of the F-distribution

References

Index
Chapter 1

Introduction: Measurand
and Measurement Errors

Truth lies within a little and certain compass, but error is immense.

Bolingbroke

1.1 Measurand and Measurement Errors


Our starting point will be a well-defined physical object which is characterized
by one or more properties, each of a quantitative nature.
Consider for example a cylinder steel shaft, a rolled steel sheet, and a speci-
men made of silver. A complete characterization of any of these objects demands
an infinite amount of information. We will be interested, for the sake of sim-
plicity, in a single (i.e. one-dimensional) parameter or quantity. For example we
will be interested
a. in the diameter of the cylinder shaft measured in its midsection;
b. in the thickness of the rolled steel sheet, or in the chromium proportion
in the steel;
c. in a physical constant termed the specific weight, measured in grams per
cubic centimeter of silver.
The quantity whose value we want to evaluate is called the measurand. So
the diameter of the cylinder shaft in its midsection is the measurand in example
a. The proportion of chromium in the rolled steel sheet is the measurand for
example b. The measurand in example c is the specific weight of silver.
We define measurement as the assignment of a number to the measurand,
using special technical means (measuring instruments) and a specified technical
procedure. For example, we use a micrometer to read the diameter of the shaft,
according to the accepted rules for operating the micrometer. The measurement
result is 7.252 mm. Chemical analysis carried out according to specified rules
establishes that the rolled steel sheet has 2.52% chromium. Weighing the piece
of silver and measuring its volume produces the result 10.502 g/cm³. The word
"measurement" is also used to describe the set of operations carried out to
determine the measurand.
Note that we restrict our attention in this book to measuring well-defined
physical parameters and do not deal with measurements related to psychology
and sociology, such as IQ, level of prosperity, or inflation.
Suppose now that we repeat the measurement process. For example, we mea-
sure the shaft midsection diameter several times; or repeat the chemical analy-
sis for percentage of chromium several times, taking different samples from the
metal sheet; or repeat the measurements of weight and volume needed to calcu-
late the specific weight of silver. The fundamental fact is that each repetition
of the measurement, as a rule, will produce a different result.
We can say that the measurement results are subject to variations. An equi-
valent statement is that the result of any measurement is subject to uncertainty.
In principle, we can distinguish two sources of uncertainty: the variation created
by the changes in the measurand and the variations which are intrinsic to the
measurement instrument and/or to the measurement process.
As a rule, we ignore the possibility of an interaction between the measurand
and the measurement instrument or assume that the effect of this interaction on
the measurement result is negligible. For example, in the process of measuring
the diameter, we apply a certain pressure which itself may cause a deformation
of the object and change its diameter. A good example of such interaction is
measuring blood pressure: most people react to the measurement procedure
with a rise in blood pressure.
One source of uncertainty (variability) is the measurand itself. Consider for
example the cross-section of the cylindric shaft. It is not an ideal circle, but
similar, in exaggerated form, to Fig. 1.1. Obviously, measurements in different
directions will produce different results, ranging, say from 10.252 mm to 10.257
mm.
We can try to avoid the "multiplicity" of results by redefining the measurand:
suppose that the diameter is defined as the average of two measurements taken in
perpendicular directions as shown in Fig. 1.1. Then the newly defined diameter
will vary from one shaft to another due to the variations in the process of shaft
production.
The situation with the rolled steel sheet is similar. Specimens for the purpose
of establishing the proportion of chromium are taken from different locations
on the sheet, and/or from different sheets. They will contain, depending on the
properties of the production process, different amounts of chromium ranging
from, say, 2.5% to 2.6%.
Figure 1.1. The diameter D of the shaft depends on α.

The situation with the specific weight of silver is somewhat different. We


postulate that the specific weight of silver is not subject to change and is a
physical constant. (Of course, this physical constant is defined for a specified
range of ambient temperatures and possibly other environmental factors, such
as altitude.) If we repeat the process of measuring the weight and the volume
of the same piece of silver, under prescribed stable environmental conditions,
we nevertheless will get different results.
How can this be explained? Let us examine the measurement process. Here
we have what are called indirect measurements. There is no instrument which
can provide the value of specific weight. To obtain the specific weight, we have
to measure the weight and the volume of our specimen. The repeated mea-
surements of the weight, even if carried out on the same measurement device,
will have small variations, say of magnitude 0.0001 g. Random vibrations, tem-
perature changes, reading errors etc. are responsible for these variations. The
repeated measurements of the specimen volume are also subject to variations,
which may depend on the measurement method. It may happen that these vari-
ations are of magnitude, say, 0.0001 cm³. Thus the repeated measurements of
the specific weight, which is the ratio weight/volume, will also vary from mea-
surement to measurement. Note that the measurand itself remains constant,
unchanged.
In measuring the diameter of the cylindrical shaft, the measurement results
are influenced both by the variations in the "theoretical" diameter, and by
inaccuracies caused by the measurement itself.
The relationship between the "true" value of the measurand and the result
of the measurement can be expressed in the following form:
y = μ + ε,  (1.1.1)

where μ is the true value of the physical constant and ε is the measurement
error.
It should be stressed that (1.1.1) is valid for the situation when there exists
a well-defined "true" value of the measurand μ. How can we extend (1.1.1) to the
situation in which the measurand itself is subject to variations, as it is in our
shaft diameter and chromium content examples?
Let us adopt the approach widely used in statistics. Introduce the notion
of a population as the totality of all possible values of the shaft diameters.
(We have in mind a specified shaft production process operating under fixed,
controlled conditions.) Fisher (1941) calls this totality a "hypothetical infinite
population" (see Mandel (1994), p. 7). The value D of the diameter of each
particular shaft therefore becomes a random variable. To put it simply, the
result of each particular measurement is unpredictable; it is subject to random
variations. What we observe as the result of measuring several randomly chosen
shafts becomes a random sample taken from the above infinite population which
we create in our mind.
How, then, do we define the measurand? The measurand, by our definition,
is the mean value of this population, or more formally, the mean value μ of the
corresponding random variable D:

μ = E[D].  (1.1.2)

Here E[·] denotes the expectation (the mean value).


Some authors define the measurand as "a value of a physical quantity to
be measured" (see Rabinovich 2000, p. 287). Our definition is in fact wider,
and it also includes population parameters, e.g. the population mean. On the
other hand, we can say that the population mean is considered as a "physical
quantity".
The purpose of measuring is not always to assign a single numerical value
to the measurand. Sometimes the desired measurement result is an interval.
Consider for example measuring the content of a toxic element, say lead, in
drinking water. The quantity of practical interest is the mean lead content.
Since this varies in time and space, in addition to having a point estimate of
the mean value, the water specialists would also like to have an interval-type
estimate of the mean, e.g. a confidence interval for the mean. We recall that
the 1 − 2α confidence interval, by definition, contains the unknown parameter
of interest with probability P = 1 − 2α.
It is always possible to represent the measurement result Y as

Y = μ + X,  (1.1.3)

where the random variable X reflects all uncertainty of the measurement result.
We can go a step further and try to find out the structure of the random
variable X. We can say that X consists of two principal parts: one, X₀, re-
flects the variations of the measurand in the population, i.e. its deviations from
the mean value μ. Imagine that our measurement instruments are free of any
errors, i.e. they are absolutely accurate. Then we would observe in repeated
measurements only the realizations of the random variable μ + X₀.
The second principal component of X is the "pure" measurement error ε.
This is introduced by the measurement instrument and/or by the whole mea-
surement process. Rabinovich (2000, p. 286) defines the error of measurement
as "the deviation of the result of a measurement from the true value of the
measurand".
Now the measurement result can be represented in the following form:

Y = μ + X₀ + ε.  (1.1.4)
What can be said about the random variable ε? There is an ample literature
devoted to the study of the structure of this random variable in general, and
for specific measurement instruments in particular. Rabinovich (2000) assumes
that ε consists of elementary errors of different types.
The first type is the so-called absolutely constant error that remains the same in
repeated measurements performed under the same conditions. By their nature
these errors are systematic. For example, a temperature meter has a nonlinear
characteristic of a thermocouple which is assumed to be linear, and thus a fixed
error of, say, 1°C is present in each temperature measurement in the interval
500-600°C. So, ε₁ = 1.
The second type is conditionally constant error. This may be partially sys-
tematic and partially random. It remains constant for a fixed instrument, but
varies from instrument to instrument. An example is the bias of the zero point of
the scale in spring-type weights. A particular instrument may add some weight
ε₂ to any measurement (weighing) result. The value of the bias changes from
instrument to instrument, and certain limits for this error can usually be given
in advance.
The third type of error is random error. In the course of repeated mea-
surements carried out on the same object, by the same operator, under stable
conditions, using the same instrument, random errors vary randomly, in an
unpredictable way. Operator errors, round-off errors, errors caused by small
changes in environmental conditions (e.g. temperature changes, vibrations) are
examples of random errors.
Measurement errors themselves are subject to changes due to instrument
mechanical wear-out and time drift of other parameters. For example, certain
electric measurements are carried out using a built-in source of electricity. A
change in the voltage might cause the appearance of an additional error in
measurements.

Remark 1
For the purpose of measurement data processing, it is often convenient to de-
compose the measurement error ε into two components ε₁ and ε₂ which we will
call "systematic" and "random", respectively. By definition, the systematic
component remains constant for repeated measurements of the same object car-
ried out on the same measurement instrument under the same conditions. The
random component produces replicas of independent identically distributed ran-
dom variables for repeated measurements made by the same instrument under
identical conditions.
In theoretical computations, it is often assumed that the first component has
a uniform distribution, whose limits can be established from the certificate of
the measurement instrument or from other sources. The random component is
modeled typically by a zero-mean normal random variable; its variance is either
known from experience and/or certificate instrument data, or is estimated from
the results of repeated measurements.

Suppose that we carry out a series of repeated measurements of the same
measurand μ on the same instrument, by the same operator and under the same
environmental conditions. The result of the ith measurement will be

yᵢ = μ + ε₁ + ε₂,ᵢ, i = 1, ..., n.  (1.1.5)

Here ε₁ is some unknown quantity which represents the "systematic" error, for
example, an error of the measurement instrument. ε₁ does not change from mea-
surement to measurement. ε₂,ᵢ, i = 1, ..., n, are independent random variables
and they represent the random errors.
Suppose that we take the average of these n measurements. Then we have

ȳ = μ + ε₁ + (Σᵢ₌₁ⁿ ε₂,ᵢ)/n.  (1.1.6)

The systematic error is not "filtered out" by the averaging. However, the ratio
ε̄ = Σᵢ₌₁ⁿ ε₂,ᵢ/n tends with probability one to zero as n increases. This fact
is known in probability theory as the law of large numbers. An explanation of
this phenomenon lies in the fact that the variance of ε̄ tends to zero as n goes
to infinity.
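
A small simulation (our illustration, with made-up error sizes) shows this numerically: averaging suppresses the random component but leaves the systematic one untouched.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, eps1 = 10.0, 0.3    # measurand and a fixed systematic error (made-up values)

for n in (10, 1000, 100000):
    # (1.1.5) with normally distributed random errors eps2
    y = mu + eps1 + rng.normal(0.0, 0.5, size=n)
    print(n, round(y.mean(), 3))

# The averages approach mu + eps1 = 10.3, not mu: the law of large numbers
# removes the random errors but cannot "filter out" the systematic error.
```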
Equation (1.1.1) is a particular case of (1.1.4). From now on, we use the
following formula:

Y = μ + X,  (1.1.7)

where μ is the measurand, and the random variable X represents the entire
uncertainty of the measurement, i.e. the random and systematic measurement
errors, random variations in the population around μ, etc.

Remark 2
In the literature, we very often meet the notions of "precision" and "accuracy"
(e.g. Miller and Miller 1993, p. 20). The easiest way to explain these notions
is to refer to the properties of the random variable X in (1.1.7). Let δ be the
mean of X and σ its standard deviation (then σ² is the variance of X. This fact
will be denoted as X ~ (δ, σ²).) Then measurements with small δ are called
"accurate"; measurements with small σ are called "precise". It is very often
said that the presence of δ ≠ 0 signifies so-called systematic error.
Suppose, for example, that we measure the shaft diameter by a micrometer


which has a shifted zero point, with a positive shift of 0.020. The measurement
results are the following: 10.024, 10.024, 10.025, 10.023, 10.025. This situation
reflects precise but nonaccurate measurements.

Remark 3
There is a scientific discipline called metrology which studies, among other issues,
the properties of measurement instruments and measurement processes. One of
the central issues in metrology is the analysis of measurement errors. Generally
speaking, the measurement error is the difference between the ''true" value of
the measurand (which is assumed to be constant) and the measurement result.
Even for a relatively simple measurement instrument, say a digital ohmmeter,
it is a quite involved task to find all possible sources of errors in measuring the
ohmic resistance and to estimate the limits of the total measurement error. We
will not deal with such analysis. The interested reader is referred to Rabinovich
(2000).
The limits of maximal error of a measurement instrument (including sys-
tematic and random components) usually are stated in the certificate of the
instrument, and we may assume in analyzing the measurement data that this
information is available.
The data on instrument error (often termed "accuracy") is given in an ab-
solute or relative form. For example, the certificate of a micrometer may state
that the maximal measurement error does not exceed 0.001 mm (1 micron).
This is an example of the "absolute form". The maximal error of a voltmeter
is typically given in a relative form. For example, its certificate may state that
the error does not exceed 0.5% of a certain reference value.

1.2 Exercises
1. Take a ruler and try to measure the area of a rectangular table in cm² in your
room. Carry out the whole experiment 5 times and write down the results. Do
they vary? Describe the measurand and the sources of variability of the results.
How would you characterize the measurement process in terms of accuracy and
precision?

2. During your evening walk, measure the distance in steps from your house
to the nearest grocery store. Repeat the experiment five times and analyze the
results. Why do the results vary?

3. Take your weight in the morning, before breakfast, four days in a row.
Analyze the results. What is the measurand in this case? Are the measurements
biased?
4. Consider using a beam balance to weigh an object. An unknown weight x
is put on the left balance platform and equalized with a known weight P₁. The
lengths of the balance arms may not be exactly equal, and thus a systematic
error appears in the measurement result. Let l₁ and l₂ be the unknown lengths
of the left and right arm, respectively. Gauss suggested a method of eliminating
the systematic error by weighing first on the left platform, and then on the
right platform. Suppose that the balancing weights are P₁ and P₂, respectively.
Derive the formula for estimating the weight x.
Hint: x·l₁ = P₁·l₂. When the weight x is put on the right platform, x·l₂ = P₂·l₁.
Derive that x = √(P₁·P₂).
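
As a quick numeric illustration of this double-weighing scheme (our addition; the arm lengths and true weight are made-up values):

```python
import math

x_true, l1, l2 = 10.0, 1.00, 1.02   # hypothetical true weight and arm lengths

P1 = x_true * l1 / l2   # x on the left platform:  x*l1 = P1*l2
P2 = x_true * l2 / l1   # x on the right platform: x*l2 = P2*l1

# Each single weighing is biased, but the geometric mean cancels the bias:
print(round(P1, 3), round(P2, 3))   # 9.804 10.2, both biased
print(math.sqrt(P1 * P2))           # 10.0 (up to float rounding), Gauss estimate
```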
Chapter 2

Measuring Population
Mean and Standard
Deviation

There is no such thing as chance. We have invented this
word to express the known effect of every unknown cause.

Voltaire

2.1 Estimation of Mean and Variance


Example 2.1.1: The weight of Yarok apples
Apples of a popular brand called "Yarok" are automatically sorted by a special
device which controls their size and eliminates those with surface defects. The
population in our case is the totality of weight measurements of the daily batch
of apples sorted by the device. The experiment consists of weighing a random
sample of 20 apples taken from the batch. The results are summarized in Table
2.1.

Figure 2.1. Representation of apple weights in the form of a dot diagram

Table 2.1: The weights of 20 apples

i Weight in grams i Weight in grams


1 168 11 164
2 169 12 162
3 165 13 164
4 167 14 166
5 158 15 167
6 166 16 168
7 163 17 169
8 166 18 165
9 165 19 170
10 163 20 167

It is a golden rule of statistical analysis first of all to try to "see" the data.
Simple graphs help to accomplish this task. First, we represent the data in the
form of a so-called dot-diagram; see Fig. 2.1.

Figure 2.2. Box and whisker plot for the data of Example 2.1.1
A second very popular and useful type of graph is the so-called box and whisker
plot. This is composed of a box and two whiskers. The box encloses the middle
half of the data. It is bisected by a line at the value of the median. The vertical
lines at the top and the bottom of the box are called the whiskers, and they
indicate the range of "typical" data values. Whiskers always end at the value
of an actual data point and cannot be longer than 1.5 times the size of the box.
Extreme values are displayed as "*" for possible outliers and "o" for probable
outliers. Possible outliers are values that are outside the box boundaries by
more than 1.5 times the size of the box. Probable outliers are values that are
outside the box boundaries by more than 3 times the size of the box.
In our example, there is one possible outlier: 158 grams. In Sect. 2.5, we
describe a formal procedure for identifying a single outlier in a sample.
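
The outlier rule just described can be reproduced with matplotlib's Tukey-style boxplot (an added sketch; whis=1.5 corresponds to the 1.5-box-lengths convention above):

```python
import matplotlib.pyplot as plt

# Apple weights from Table 2.1
weights = [168, 169, 165, 167, 158, 166, 163, 166, 165, 163,
           164, 162, 164, 166, 167, 168, 169, 165, 170, 167]

fig, ax = plt.subplots()
# Whiskers end at the last data point within 1.5 box lengths (IQR);
# points beyond the whiskers are drawn as individual outlier markers.
ax.boxplot(weights, whis=1.5)
ax.set_ylabel("Weight, g")
plt.show()   # the single flagged point is 158 g, as noted in the text
```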
The mathematical model corresponding to the above example is the follow-
ing:

Y = μ + X,  (2.1.1)

where μ is the mean value for the weight of the population of Yarok apples (the
measurand), and the random variable X represents random fluctuation of the
weight around its mean. X also incorporates the measurement error.
X is a random variable with mean δ and variance σ². We use the notation

X ~ (δ, σ²).  (2.1.2)

Consequently, Y has mean μ + δ and variance σ²:

Y ~ (μ + δ, σ²).  (2.1.3)

Expression (2.1.3) shows that the measurement results include a constant error
δ, which might be for example the result of a systematic bias in the apple
weighing device. In principle, we cannot "filter out" this systematic error by
statistical means alone. We proceed further by assuming that the weighing
mechanism has been carefully checked and the bias has been eliminated, i.e. we
assume that δ = 0.

Remark 1
In chemical measurements, the following method of subtraction is used to elim-
inate the systematic bias in weighing operations. Suppose we suspect that the
weighing device has a systematic bias. Put the object of interest on a special
pad and measure the weight of the object together with the pad. Then weigh
the pad separately, and subtract its weight from the previous result. In this
way, the systematic error which is assumed to be the same in both readings will
be canceled out.

Recall that σ, the square root of the variance, is called the standard deviation.
σ is the most popular and important measure of uncertainty which characterizes
the spread around the mean value in the population of measurement results. Our
purpose is to estimate the mean weight μ (the measurand) and σ, the measure
of uncertainty. It is important to mention here that σ (unlike σ²) is measured
in the same units as μ. So, in our example σ is expressed in grams.
We use capital italic letters X, Y, Z, ... to denote random variables and the
corresponding small italic letters x, y, z, ... to denote the observed values of
these random variables.
Let yᵢ be the observed values of apple weights, i = 1, ..., 20. In measure-
ments, the unknown mean μ is usually estimated by the sample mean ȳ:

ȳ = Σᵢ₌₁²⁰ yᵢ / 20.  (2.1.4)

(Often, the word sample average or simply average is used for the sample mean.)
The standard deviation σ is estimated by the following quantity s:

s = √( Σᵢ₌₁²⁰ (yᵢ − ȳ)² / 19 ).  (2.1.5)

An estimate of a parameter is often denoted by the same letter with a "hat".
So, μ̂ is used to denote an estimate of μ.
For our example we obtain the following results:

ȳ = μ̂ = 165.6;
s = 2.82.
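
These estimates are easy to verify directly from the Table 2.1 data (a check added here for illustration):

```python
import statistics

weights = [168, 169, 165, 167, 158, 166, 163, 166, 165, 163,
           164, 162, 164, 166, 167, 168, 169, 165, 170, 167]

y_bar = statistics.mean(weights)     # sample mean, (2.1.4)
s = statistics.stdev(weights)        # sample standard deviation, (2.1.5)
print(round(y_bar, 1), round(s, 2))  # 165.6 2.82
```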
In general, for a sample {y₁, y₂, ..., yₙ} the estimate of the mean (i.e. the
sample mean) is defined as

ȳ = Σᵢ₌₁ⁿ yᵢ / n,  (2.1.6)

and the estimate of the standard deviation is defined as

s = √( Σᵢ₌₁ⁿ (yᵢ − ȳ)² / (n − 1) ).  (2.1.7)

The sample mean is an unbiased (if δ = 0) and consistent estimate of the
parameter μ. Recall that the term "consistency" means that the estimate has a
variance and bias which both tend to zero as the sample size n goes to infinity.
The sample variance is computed by the formula

s² = Σᵢ₌₁ⁿ (yᵢ − ȳ)² / (n − 1).  (2.1.8)

It is an unbiased and consistent estimate of σ². Using formula (2.1.7) for es-
timating σ introduces a bias, which tends to zero as the sample size goes to
infinity.
Sometimes we have a sample of only two observations, {y₁, y₂}. In this case,
(2.1.8) takes the form

s² = (y₁ − y₂)² / 2.  (2.1.9)
Remark 2: Sheppard's correction

The result of weighing apples was given as a discrete integer variable, while the
weight is a continuous variable. Suppose that the weight reading is provided by
a digital device which automatically gives the weight in grams as an integer. The
weighing device perceives the "true" weight y and rounds it off to the nearest
integer value. What the device does can formally be described as adding to the
weight y some quantity γ, such that y + γ is an integer. If the scale unit is h (in
our case h is 1 gram), then γ lies in the interval [−h/2, h/2]. A quite reasonable
assumption is that γ is uniformly distributed in the interval [−h/2, h/2]. We use
the following notation: γ ~ U(−h/2, h/2). We will say more about the uniform
distribution in Sect. 2.4. Let us recall that the variance of γ is

Var[γ] = h²/12,  (2.1.10)

and that the mean value of γ is zero: E[γ] = 0.
Suppose that we observe a sample from the population of Y = X + γ. μ_X and
μ_Y are the mean values of X and Y, respectively. Since E[γ] = 0, μ_X = μ_Y = μ.
The sample mean of the random variable Y, ȳ, is therefore an unbiased estimate of
μ.
How can we estimate, from the sample of the observed Y values, the variance of
the "unobserved" random variable X? Under quite wide assumptions regarding
the distribution of X, it can be shown that

Var[Y] ≈ Var[X] + h²/12,  (2.1.11)

and therefore

Var[X] ≈ Var[Y] − h²/12.  (2.1.12)

Thus, an estimate of Var[X] can be obtained from the formula

s²_X = s²_Y − h²/12.  (2.1.13)

The term h²/12 in this formula is called Sheppard's correction; see Cramer
(1946, Sect. 27.9). The applicability of (2.1.13) rests on the assumption that γ
has a uniform distribution and the density function for X is "smooth".
Let us compute this correction for our apple example. By (2.1.5), s²_Y =
2.82² = 7.95. h²/12 = 1/12 = 0.083. By (2.1.13) we obtain s²_X = 7.95 − 0.083 =
7.867, and s_X = 2.80. We see that in our example the Sheppard correction
makes a negligible contribution.
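
In code, the correction (2.1.13) amounts to one line (added illustration):

```python
import math

s_y = 2.82   # sample standard deviation of the rounded readings
h = 1.0      # scale step, 1 gram

var_x = s_y**2 - h**2 / 12                 # Sheppard's correction, (2.1.13)
print(round(var_x, 3), round(math.sqrt(var_x), 2))
# 7.869 2.81 (the text, rounding intermediate results, reports 7.867 and 2.80)
```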

Properties of Sample Mean, Sample Variance and the Variance of a
Sum of Random Variables: a Reminder

Suppose that the weighing experiment is repeated m times in similar conditions,
i.e. we take m random samples of 20 apples and for each of them compute the
values of ȳ and s. For each sample we will obtain different results, reflecting the
typical situation in random sampling.
The apple weights y₁, ..., y₂₀ are the observed values of random variables
Y₁, Y₂, ..., Y₂₀. These random variables are identically distributed and therefore
have the same mean value μ and the same standard deviation σ. Moreover, we
assume that these random variables are independent. The observed sample
means ȳ(j), j = 1, 2, ..., m, are the observed values of the random variable

Ȳ = Σᵢ₌₁²⁰ Yᵢ / 20.  (2.1.14)

This random variable is called an estimator of μ.
Of great importance is the relationship between Ȳ and μ. The mean value
of Ȳ equals μ:

E[Ȳ] = μ,  (2.1.15)

which means that "on average", the values of the estimator coincide with μ. In
statistical terminology, the sample mean Ȳ is an unbiased estimator of μ. The
observed (i.e. the sample) value of the estimator is called an estimate.
How close will the observed value of Ȳ be to μ? This is a question of
fundamental importance. The simplest answer can be derived from the following
formula:

Var[Ȳ] = σ²/20,  (2.1.16)

i.e. the variance of Ȳ is inversely proportional to the sample size n = 20. As the
sample size n increases, the variance of Ȳ decreases and the estimator becomes
closer to the unknown value μ. This property defines a so-called consistent
estimator. (More formally, as n → ∞, Ȳ tends to μ with probability 1.)
Let n be the sample size. It can be proved that the quantity

S² = Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² / (n − 1)  (2.1.17)

is an unbiased estimator of the variance σ², i.e. E[S²] = σ².
We have already mentioned that the square root of S²,

S = √( Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² / (n − 1) ),  (2.1.18)

is used as an estimator of the standard deviation σ. Unfortunately, S is not
an unbiased estimator of σ, but in measurement practice we usually use this
estimator. As n grows, the bias of S becomes negligible.
If the distribution of Yᵢ is known, it becomes possible to evaluate the bias
of the estimator (2.1.18) and to eliminate it by means of a correction factor. In
Sect. 2.6 we consider this issue for the normal distribution.
Readers of the statistical literature are often confused by the similarity of
the notation for random variables (estimators) and their observed sample values
(estimates). Typically, we use capital letters, like Ȳ, S² and S, for random
variables and/or estimators, as in (2.1.14), (2.1.17) and (2.1.18). When the
random variables Yᵢ in these expressions are replaced by their observed (i.e.
sample) values, the corresponding observed values of Ȳ, S² and S (i.e. their
estimates) will be denoted by lower-case letters ȳ, s², s, as in (2.1.4), (2.1.8)
and (2.1.7).
A very important subject is computation of the mean and variance of a sum
of random variables. Let Yᵢ ~ (μᵢ, σᵢ²), i = 1, 2, ..., k, be random variables.
Define a new random variable W = Σᵢ₌₁ᵏ aᵢYᵢ, where the aᵢ are any numbers. It is
easily proved that

E[W] = Σᵢ₌₁ᵏ aᵢμᵢ.  (2.1.19)

The situation for the variance is more complex:

Var[W] = Σᵢ₌₁ᵏ aᵢ²σᵢ² + 2 Σᵢ<ⱼ aᵢaⱼ Cov[Yᵢ, Yⱼ],  (2.1.20)

where Cov[·] is the covariance of Yᵢ and Yⱼ,

Cov[Yᵢ, Yⱼ] = E[(Yᵢ − μᵢ)(Yⱼ − μⱼ)].  (2.1.21)

It is important to note that if Yᵢ and Yⱼ are independent, then their covari-
ance is zero. In most applications we will deal with the case of independent
random variables Y₁, ..., Y_k. Then (2.1.20) becomes

Var[W] = Σᵢ₌₁ᵏ aᵢ²σᵢ².  (2.1.22)

This formula has important implications. Suppose that all Yᵢ ~ (μ, σ²). Set
aᵢ = 1/k in (2.1.22). Then we obtain that

Var[ Σᵢ₌₁ᵏ Yᵢ / k ] = σ²/k.  (2.1.23)

Formula (2.1.16) is a particular case of (2.1.23) for k = 20.
Now consider the case that all Yᵢ have the same mean μ but different vari-
ances: Yᵢ ~ (μ, σᵢ²). (The Yᵢ are assumed to be independent.) Let us compute
the mean and the variance of the random variable

W = a₁Y₁ + a₂Y₂ + ... + a_k Y_k,  (2.1.24)

where all aᵢ are nonnegative and sum to 1:

Σᵢ₌₁ᵏ aᵢ = 1,  aᵢ ≥ 0.  (2.1.25)
Obviously, from (2.1.22) it follows that

E[W] = μ,  Var[W] = Σᵢ₌₁ᵏ aᵢ²σᵢ².  (2.1.26)

It follows from (2.1.26) that W remains an unbiased estimator of μ.
Suppose that the choice of the "weights" aᵢ (under the constraint Σ aᵢ = 1)
is up to us. How should the aᵢ be chosen to minimize the variance of W? The
answer is given by the following formula (we omit its derivation):

aᵢ* = (1/σᵢ²) / Σⱼ₌₁ᵏ (1/σⱼ²),  i = 1, ..., k.  (2.1.27)
The following example demonstrates the use of this formula in measurements.

Example 2.1.2: Optimal combination of measurements from different instru-
ments
The same unknown voltage was measured by three different voltmeters A, B
and C. The results obtained were V_A = 9.55 V; V_B = 9.50 V and V_C = 9.70
V. The voltmeter certificates claim that the measurement error does not exceed
the value of Δ_A = 0.075 V, Δ_B = 0.15 V and Δ_C = 0.35 V, for A, B and C,
respectively.
We assume that the measurement error V₀ for a voltmeter has a uniform
distribution on (−Δ, Δ). Then the variance of the corresponding measurement
error is equal to

Var[V₀] = (2Δ)²/12 = Δ²/3.  (2.1.28)

Denote by V_A, V_B, V_C the random measurement results of the voltmeters. The
corresponding variances are: Var[V_A] = 1.88·10⁻³, Var[V_B] = 0.0075 and
Var[V_C] = 0.041.
According to (2.1.27), the optimal weights are: a_A* = 0.77, a_B* = 0.19, a_C* =
0.04. Therefore, the minimal-variance unbiased estimate of the unknown voltage
is

V* = 0.77·9.55 + 0.19·9.5 + 0.04·9.7 = 9.546.

The corresponding variance is

Var[V*] = 0.77²·0.00188 + 0.19²·0.0075 + 0.04²·0.041 = 0.00145,

and the standard deviation is σ_V* = 0.038.
Combining the results with equal weights would produce a considerably less
accurate estimate, with variance equal to (1/9)(0.00188 + 0.0075 + 0.041) = 0.005598.
The corresponding estimate of the standard deviation is σ_V* = 0.075, twice as large as the
previous estimate.
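
A short sketch reproducing Example 2.1.2 numerically (our addition):

```python
# Inverse-variance weighting of independent measurements, eqs. (2.1.27)-(2.1.28)
readings = [9.55, 9.50, 9.70]      # voltmeters A, B, C
deltas = [0.075, 0.15, 0.35]       # certified maximal errors

variances = [d**2 / 3 for d in deltas]        # uniform error model, (2.1.28)
inv = [1 / v for v in variances]
weights = [w / sum(inv) for w in inv]         # optimal weights, (2.1.27)

v_star = sum(w * r for w, r in zip(weights, readings))
var_star = sum(w**2 * v for w, v in zip(weights, variances))
print([round(w, 2) for w in weights])         # [0.77, 0.19, 0.04]
print(round(v_star, 3), round(var_star, 5))   # 9.546 0.00145
```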
Let Y be any random variable with mean μ and standard deviation σ. In
most cases which will be considered in the context of measurements, we deal
with positive random variables, such as concentration, weight, size and velocity.
Assume, therefore, that Y is a positive random variable.
An important characterization of Y is its coefficient of variation, defined as
the ratio of σ over μ. We denote it as c.v.(Y):

c.v.(Y) = σ/μ.  (2.1.29)

A small c.v. means that the probability mass is closely concentrated around the
mean value.
Suppose we have a random sample of size n from a population with mean μ
and standard deviation σ. Consider now the sample mean Ȳ. As follows from
(2.1.23), it has standard deviation equal to σ/√n and mean μ. Thus,

c.v.(Ȳ) = c.v.(Y)/√n.  (2.1.30)

The standard deviation of Ȳ in a sample of size n is σ_ȳ = σ/√n. It is called
the standard error of the mean. The corresponding estimate is

s_ȳ = s/√n.  (2.1.31)

In words: the standard deviation of the sample mean equals the estimated
population standard deviation divided by the square root of the sample size.
For the apple data in Example 2.1.1, s_ȳ = 2.8/√20 = 0.63.

2.2 How to Round off the Measurement Data


Nowadays, everybody uses calculators or computers for computations. They
provide results with at least 4 or 5 significant digits. Retaining all these
digits in presenting the measurement results (e.g. estimates of means and stan-
dard deviations) might create a false impression of extremely accurate measure-
ments. How many digits should be retained in presenting the results?
Let us adopt the following rules, as in Taylor (1997) and Rabinovich (2000):
1. Experimental uncertainties should always be rounded to two significant
digits.
2. In writing the final result subject to uncertainty, the last significant figure
of this result should coincide with the last significant digit of the uncertainty.
3. Any numbers to be used in subsequent calculations should normally retain
at least one significant figure more than is finally justified.

For example, suppose that in calculating the specific weight, we have ob-
tained the weight 54.452 g, the volume 5.453 cm³, and the overall uncertainty
of the specific weight (an estimate of standard deviation) was calculated as
0.002552.
First, following rule 1, we round off the uncertainty to 0.0026. After calcu-
lating the specific weight as 54.452/5.453 = 9.985696, we keep only 5 significant
digits (rule 2) and present the final result in the form: 9.9857 ± 0.0026.
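
The two rounding rules can be sketched in code as follows (our illustration; the helper functions are not from the book):

```python
import math

def round_uncertainty(u):
    """Rule 1: keep two significant digits of the uncertainty."""
    exponent = math.floor(math.log10(abs(u)))
    return round(u, -exponent + 1)

def report(value, u):
    """Rule 2: round the value to the last digit of the rounded uncertainty."""
    u2 = round_uncertainty(u)
    decimals = -math.floor(math.log10(u2)) + 1
    return f"{round(value, decimals)} ± {u2}"

print(report(54.452 / 5.453, 0.002552))   # 9.9857 ± 0.0026
```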
Another important issue is the accuracy used in recording the original mea-
surements. Suppose that a digital voltmeter has a discrete scale with step h. If
it gives a reading of 12.36 V, h = 0.01.
When a measurement instrument is not of digital type, the readings are made
from a scale with marks. Typically, we round off the measurement results to
the nearest half of the interval between the adjacent scale marks. For example,
a spring weight has a scale with marks at every 10 g, and we record the weight
by rounding the reading to the nearest 5 g. In this case h = 5 g.
The value of h depends on the measurement instrument's precision. The
desired precision must depend on the spread of the measurement results in the
population.
Suppose that the distribution of random weights is such that all weights lie
between 998 and 1012 g. Weighing on a scale with step h = 10 g would result in
reading the weight either as 1000 or 1010 g. This will introduce an error which
is quite large relative to the standard deviation in the population.
In the extreme case when the instrument scale is too "rough", it might
happen that all our readings are identical. Then our estimate of the population
standard deviation will be zero, certainly an absurd result.
A practical rule for the desired relationship between the scale step h and the
population standard deviation σ is that the scale step should not exceed one half
of the standard deviation in the population (see Cochran and Cox 1957, Sect.
3.3).
Let us check our apple example. We have h = 1 gram. The estimated
standard deviation is s = 2.8. Half of this is greater than h, and therefore the
measurement device is accurate enough.

Definition 2.2.1. A measurement procedure with δ = s/h ≥ 2 is called regular.

A measurement scheme with δ < 0.5 we call special. To put it simply, special
measurements are made using a measuring instrument with scale step greater
than two population standard deviations.
If a measurement is special, then the rules for computing the estimates of
population mean and standard deviation must be modified. We will describe
these rules in Chapter 8, for the case of sampling from a normal population.
As a rule, we will deal only with regular measurements. It is characteristic of
special measurements that all measurement results take on usually not more
than two or three adjacent numerical values on the scale of the measurement
instrument.
2.3 Normal Distribution in Measurements

Let us return to (1.1.4), which represents the measurement result as a random
variable. We did not make any assumptions concerning the distribution of Y.
In fact, to compute estimates of means and standard deviations we can manage
without these assumptions. However, there are more delicate issues, such as
identifying outliers, constructing confidence intervals and testing hypotheses,
which require assumptions regarding the distribution function of the random
variables involved.
Most of our further analysis is based on the assumption that the random
variable Y follows the so-called normal distribution. Later we will give reasons
for this assumption. The normal distribution is specified by two parameters,
which we denote by μ and σ. If Y has a normal distribution, we denote it by
Y ~ N(μ, σ²). Let us for convenience recall the basic facts regarding the normal
distribution.
Let Z ~ N(μ, σ²). Then the density function of Z is

f_Z(v) = (1/(√(2π)·σ)) e^{−(v−μ)²/(2σ²)}, −∞ < v < ∞;  (2.3.1)

μ may be any real number and σ any positive number. The parameters μ and
σ have the following meaning for the normal distribution:

E[Z] = μ, Var[Z] = σ².

It is more convenient to work with the so-called standard normal distribution
having μ = 0 and σ = 1 (Fig. 2.3). So, if Z₀ ~ N(0, 1), then

f₀(v) = (1/√(2π)) e^{−v²/2}.  (2.3.2)

The cumulative probability function

P(Z₀ ≤ t) = ∫₋∞ᵗ f₀(v) dv = Φ(t)  (2.3.3)

gives the area under the standard normal density from −∞ to t. It is called the
normalized Gauss function. (In the European literature, the name Gauss-Laplace
function is often used.) The table of this function is presented in Appendix A.
It is important to understand the relationship between Z ~ N(μ, σ²) and
Z₀ ~ N(0, 1). The reader will recall that if Y ~ N(μ, σ²), then the mean value
of Y equals μ, and the variance of Y equals σ². Now

Z = μ + σ·Z₀, or Z₀ = (Z − μ)/σ.  (2.3.4)

In simple terms, the random variable with normal distribution N(μ, σ²) is ob-
tained from the "standard" random variable Z₀ ~ N(0, 1) by a linear transfor-
mation: multiplication by the standard deviation σ and addition of μ.
Figure 2.3. Standard normal density

For a better understanding of the normal density function, let us recall how
the probability mass is distributed around the mean value. This is demonstrated
by Fig. 2.4. In practical work with the normal distribution we often need to use
so-called quantiles.

Normal Quantiles
Definition 2.3.1. Let α be any number between 0 and 1, and let Y be a random
variable with density f_Y(t). Then the number t_α which satisfies the equality

α = ∫₋∞^{t_α} f_Y(t) dt  (2.3.5)

is called the α-quantile of Y.

In simple terms, t_α is a number to the left of which lies the probability mass
of size α. Table 2.2 presents the most useful quantiles of the standard normal
distribution. For this distribution, the α-quantiles are usually denoted as z_α.
Note a useful property of these quantiles: for any α,

z_{1−α} = −z_α.  (2.3.6)

How can we calculate the quantile of an arbitrary normal distribution? Suppose
Y ~ N(μ, σ²). Then the α-quantile of Y, denoted t_α^Y, can be expressed via the
standard normal quantile z_α as follows:

t_α^Y = μ + σ·z_α.  (2.3.7)
Table 2.2: Quantiles of the standard normal distribution

α       z_α
0.001   −3.090
0.010   −2.326
0.025   −1.960
0.050   −1.645
0.100   −1.282
0.900    1.282
0.950    1.645
0.975    1.960
0.990    2.326
0.999    3.090

Figure 2.4. The probability mass under the normal density curve

Normal Plot
A simple graphical tool can be used to check whether a random sample is taken
from a normal population. This is a so-called normal probability plot.
Example 2.1.1 revisited. To construct the normal plot, we first have to order the
weights, and assign to the ith ordered observation y₍ᵢ₎ the value p₍ᵢ₎ = (i − 0.5)/n;
see Table 2.3. We now plot the points (y₍ᵢ₎, p₍ᵢ₎). Figure 2.5 shows the normal
Table 2.3: The ordered sampie of 20 apples

i Weight in grams P(i) i Weight in grams P(i)


1 158 0.025 11 166 0.525
2 162 0.075 12 166 0.575
3 163 0.125 13 167 0.625
4 163 0.175 14 167 0.675
5 164 0.225 15 167 0.725
6 164 0.275 16 168 0.775
7 165 0.325 17 168 0.825
8 165 0.375 18 169 0.875
9 165 0.425 19 169 0.925
10 166 0.475 20 170 0.975

plot for the apple weights. If the points on the plot form a reasonably straight
line, then this is considered as a confirmation of normality for the population
from which the sample has been taken.
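
A sketch of this construction (our addition; on "probability paper" the vertical axis is stretched by the normal quantile function, which is what norm.ppf does here):

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

weights = np.sort([168, 169, 165, 167, 158, 166, 163, 166, 165, 163,
                   164, 162, 164, 166, 167, 168, 169, 165, 170, 167])
n = len(weights)
p = (np.arange(1, n + 1) - 0.5) / n     # plotting positions, as in Table 2.3

# If the sample is normal, the ordered data plotted against the normal
# quantiles of p should fall near a straight line.
plt.plot(weights, norm.ppf(p), "o")
plt.xlabel("Ordered weight, g")
plt.ylabel("Standard normal quantile of p(i)")
plt.show()
```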

The Origin of the Normal Distribution

To explain the use of the normal distribution in measurements, let us quote Box
and Luceno (1997, p. 51).
"Why should random errors tend to have a distribution that is approximated
by the normal rather than by some other distribution? There are many different
explanations - one of which relies on the central limit effect. This central limit
effect does not apply only to averages. If the overall error e is an aggregate of
a number of component errors e = ε₁ + ε₂ + ... + εₙ, such as sampling errors,
measurement errors or manufacturing variations, in which no one component
dominates, then almost irrespective of the distribution of the individual compo-
nents, it can be shown that the distribution of the aggregated error e will tend
to the normal as the number of components gets larger".
A different and very remarkable argument is due to James Clerk Maxwell.
We describe it following Krylov (1950, Sect. 121) and Box and Luceno (1997).
Suppose that we perform a large number N of measurements, each of which
is carried out with a random error ε. Let us consider the number of times
(out of N) that the random error falls into a small interval [x, x + dx]. It is
natural to assume that this number is proportional to dx; proportional to N;
and dependent on x. Denoting this number by ds, we have

ds = N f(x) dx.  (2.3.8)

Thus the probability that the random error is between x and x + dx is

dp = f(x) dx.  (2.3.9)
Figure 2.5. The normal plot for the data of Example 2.1.1
The function f(x) is what we used to call the density function. Now it is
natural to postulate that this function is even, i.e. f(x) = f(−x), which means
that the distribution of errors is symmetric around the origin x = 0. Without
loss of generality, we may assume that f(x) = φ(x²).
The second postulate is that φ(·) is a decreasing function of x, i.e. larger
errors are less probable than smaller ones. Suppose now that we carry out a
measurement whose result is a pair of numbers (x, y). In other words, let us
assume that we have a two-dimensional measurement error (ε₁, ε₂). A good way
to imagine this is to consider shooting at a target, and interpret ε₁ and ε₂ as
the horizontal and vertical deviations from the center of the target, respectively.
Put the center of the target at the point (x = 0, y = 0).
The third postulate is the following: the random errors ε₁ and ε₂ are in-
dependent. This means that the probability dp of hitting a target at a small
square [x, x + dx] × [y, y + dy] can be represented in the following product form:

dp = φ(x²)φ(y²) dx dy.  (2.3.10)

Now rotate the target around the origin in such a way that the x-axis goes
through the point (x, y). Then it can be shown, based on the third postulate,

that the probability of hitting the target at the distance r = √(x² + y²) from
the origin depends on x and y through r only. In other words, the joint density
of ε1, ε2 must depend only on r = √(x² + y²). Thus we arrive at the following
principal relationship:

φ(x²)φ(y²) = φ(x² + y²)·φ(0).   (2.3.11)

The solution of (2.3.11) is beyond the standard calculus course. Let us accept
without proof the fact that (2.3.11) determines a unique function which has the
following general form: φ(x²) = C·exp(Ax²). Since φ(·) must be decreasing (the
second postulate), A is negative. Denote it by -1/2σ². The constant C must
be chosen such that the integral of φ from minus to plus infinity is 1. It turns
out that C = 1/(√(2π)σ). Thus we arrive at the expression

f(x) = φ(x²) = (1/(√(2π)σ)) e^(-x²/(2σ²)),   (2.3.12)

which is the normal density centered at zero.
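For completeness, here is a brief sketch (our addition; the standard functional-
equation argument, not spelled out in the original text) of why (2.3.11) admits
only this exponential form. Writing g(u) = ln φ(u), equation (2.3.11) becomes

g(x²) + g(y²) = g(x² + y²) + g(0),

so h(u) = g(u) - g(0) satisfies Cauchy's equation h(u + v) = h(u) + h(v) for
u, v ≥ 0. Since φ(·) is monotone, so is h, and the only monotone solutions of
Cauchy's equation are linear: h(u) = Au. Hence φ(u) = φ(0)e^(Au) = C·exp(Au).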

2.4 The Uniform Distribution

The second distribution widely used in measurements is called the rectangular
or uniform distribution.
The density of this distribution has a rectangular form:

f_X(x) = 1/(b - a),  x ∈ [a, b],   (2.4.1)

with f(x) = 0 outside [a, b]. The mean value of X is E[X] = (b + a)/2 and the
variance is Var[X] = (b - a)²/12.
Often the interval [a, b] is symmetric with respect to zero: [a, b] = [-Δ, Δ],
i.e. the density is constant and differs from zero in a symmetric interval around
zero. The length of this interval is 2Δ. We use the shorthand notation X ~
U(a, b) to say that X has density (2.4.1).
Let us recall that the first two moments of X ~ U(-Δ, Δ) are

E[X] = 0,   (2.4.2)

Var[X] = Δ²/3.   (2.4.3)

The standard deviation is

σ_X = Δ/√3.   (2.4.4)

Now suppose that the random variable X is a sum of several independent
random variables X1, X2, ..., Xk, each of which has a uniform distribution:
Xi ~ U(-Δi, Δi). The variance of a sum of independent random variables is
the sum of the component variances. Thus,

Var[X] = Σ_{i=1}^{k} (2Δi)²/12 = Σ_{i=1}^{k} Δi²/3.   (2.4.5)

The standard deviation of X is

σ_X = √((Δ1² + ... + Δk²)/3).   (2.4.6)

Table 2.4: Atomic weights

Element   Atomic weight   Quoted uncertainty   Standard uncertainty
C         12.011          ±0.001               0.00058
H         1.00794         ±0.00007             0.000040
O         15.9994         ±0.0003              0.00017
K         39.0983         ±0.001               0.00058
An important and useful fact is that if the number of summands k is 6 or
more and the Δi are of the same magnitude, then the distribution of X may be
quite well approximated by a normal distribution. So, for k ≥ 6,

X ~ N(0, σ_X²),   (2.4.7)

where σ_X is given by (2.4.6).
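A quick numerical check of (2.4.5)-(2.4.7) (a minimal sketch in Python; the
half-widths Δi below are arbitrary illustrative values, not taken from any
example in this book):

    import numpy as np

    rng = np.random.default_rng(0)
    deltas = np.array([0.5, 0.8, 1.0, 0.6, 0.9, 0.7])  # half-widths Delta_i (illustrative)

    # X = X_1 + ... + X_k with X_i ~ U(-Delta_i, Delta_i), k = 6 summands
    x = sum(rng.uniform(-d, d, size=100_000) for d in deltas)

    sigma_theory = np.sqrt(np.sum(deltas**2) / 3)      # formula (2.4.6)
    print(x.std(), sigma_theory)                       # the two values agree closely

A histogram of x is already very close to the N(0, σ_X²) density curve,
illustrating the normal approximation (2.4.7).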
Let us consider several typical cases of the use of the uniform distribution
in measurement practice.

Example 2.4.1: Round-off errors

Suppose that the scale of a manometer is graduated in units of 0.1 atmosphere.
The pressure reading is therefore rounded to the closest multiple of 0.1. This in-
troduces a round-off error which may be assumed to have a uniform distribution
in the interval [-0.05, 0.05].

Example 2.4.2: Atomic weights

Table 2.4 presents the atomic weights of four elements, together with the un-
certainties in their measurements. Eurachem (2000) explains for the data of
Table 2.4 that "for each element, the standard uncertainty is found by treat-
ing the quoted uncertainty as forming the bounds of a rectangular distribution.
The corresponding standard uncertainty is therefore obtained by dividing these
values by √3" (see (2.4.4)).

Example 2.4.3: Specification limits

Eurachem (2000) describes how to handle errors arising due to the geometric
imperfection of chemical glassware instruments. For example, the manufacturer
of a 10 ml pipette guarantees that the volume of the liquid in the pipette filled
up to the 10 ml mark with "absolute" accuracy may deviate from this value by
not more than ±0.012 ml. In simple terms, the error in the liquid volume due
to the geometric imperfection of the glassware is X ~ U(-0.012, 0.012).

2.5 Dixon's Test for a Single Outlier

Looking at the box and whisker plot in Fig. 2.2, we see that the smallest ob-
servation y(1) = 158 may possibly be an outlier. Below we describe a simple
test due to Dixon; see Bolshev and Smirnov (1965, p. 89). The test is used to
identify a single outlier in a sample from a normal population.
Suppose we observe a random sample {y1, ..., yn}. Denote the kth ordered
observation as y(k). Then the smallest and the largest observations are denoted
as y(1) and y(n), respectively. Our null hypothesis H0 is the following: all
observations belong to a normal population with unknown parameters (μ, σ²).
The alternative hypothesis H1 is that the smallest observation comes from a
population with distribution N(μ - d, σ²), where d is some unknown positive
constant. The test statistic Q is the following:

Q = (y(2) - y(1)) / (y(n) - y(1)).   (2.5.1)

The difference y(n) - y(1) is called the observed or sample range.
If Q exceeds the critical value Cn given in Table 2.5, H0 is rejected (at
significance level α = 0.05) in favor of the alternative H1. Let us apply Dixon's
test to the apple data. The value of Q is Q = (162 - 158)/(170 - 158) = 0.333. It
is greater than the critical value Cn = 0.300 for n = 20. Thus, we conclude that
the value y(1) = 158 is an outlier.
For the alternative H1* that the largest observation comes from a population
N(μ + d, σ²), where d > 0, we use a similar statistic

Q* = (y(n) - y(n-1)) / (y(n) - y(1)).   (2.5.2)

We reject the null hypothesis in favor of H1*, at significance level α = 0.05, if
Q* exceeds the critical value Cn in Table 2.5.
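Both statistics are easy to program; below is a minimal sketch in Python (the
apple weights are the ordered sample from the normal-plot table above, and the
critical values are a few entries of Table 2.5):

    def dixon(sample):
        # Dixon's statistics: Q for the smallest suspect, Q* for the largest
        y = sorted(sample)
        r = y[-1] - y[0]                       # sample range
        return (y[1] - y[0]) / r, (y[-1] - y[-2]) / r

    C = {10: 0.412, 20: 0.300, 30: 0.260}      # 0.05-critical values, Table 2.5

    apples = [158, 162, 163, 163, 164, 164, 165, 165, 165, 166,
              166, 166, 167, 167, 167, 168, 168, 169, 169, 170]

    q, q_star = dixon(apples)
    print(q)   # 0.333 > C[20] = 0.300, so y(1) = 158 is flagged as an outlier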
Remark 1: Interpretation of outliers
The easiest decision is just to delete the outlying observation from the sample.
The right way to deal with an outlier is to try to find the reason for its ap-
pearance. It might be an error in data recording, some fault of the measuring
device or some cause related to the production process itself. The latter is
most important for the process study and should receive the greatest attention
of the statistician.

Table 2.5: Dixon's statistic for significance level α = 0.05 (Bolshev and Smirnov,
p. 328)

n    Cn       n    Cn
3    0.941    13   0.361
4    0.765    14   0.349
5    0.642    15   0.338
6    0.560    16   0.329
7    0.507    17   0.320
8    0.468    18   0.313
9    0.437    19   0.306
10   0.412    20   0.300
11   0.392    25   0.277
12   0.376    30   0.260

It is interesting to check the normal plot without the outlier y = 158. The
normal plot without the "outlier" in Fig. 2.6 looks "more normal".

[Figure 2.6 here: normal probability plot of the 19 remaining observations;
horizontal axis, weight from 158 to 172; vertical axis, normal probability scale
from 0.02 to 0.99.]

Figure 2.6. The normal plot for 19 observations without the outlier

Table 2.6: Coefficients for computing estimates of σ based on sample range

n    An      un
2 1.128 1.155
3 1.693 1.732
4 2.059 2.078
5 2.326 2.309
6 2.534 2.474
7 2.704 2.598
8 2.847 2.694
9 2.970 2.771
10 3.078 2.834
12 3.258 2.931
14 3.407 3.002
16 3.532 3.057
18 3.640 3.099
19 3.689 3.118
20 3.735 3.134

2.6 Using Range to Estimate σ


If a sample of size n ≤ 10 is drawn from a normal population, then its range,
i.e. the largest observation minus the smallest one, can serve as a quick and
quite accurate estimate of σ. The modus operandi is very simple: to obtain the
estimate of σ, the sample range must be divided by the coefficient An taken from
Table 2.6:

σ̂ = (sample range)/An.   (2.6.1)

Let us apply (2.6.1) to the Yarok example, after deleting the outlier y(1) =
158. Using Table 2.6, we have σ̂ = 8/A19 = 8/3.689 = 2.17. Note that the
estimate of σ based on 19 observations computed by (2.1.7) is s = 2.24, close
enough to the previous result.
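In code, the two estimates compared above can be obtained as follows (a minimal
sketch; A19 = 3.689 is taken from Table 2.6):

    import numpy as np

    # the 19 apple weights with the outlier 158 removed
    y = np.array([162, 163, 163, 164, 164, 165, 165, 165, 166, 166,
                  166, 167, 167, 167, 168, 168, 169, 169, 170])

    sigma_range = (y.max() - y.min()) / 3.689  # range estimate (2.6.1), A_19
    s = y.std(ddof=1)                          # S-estimator (2.1.7)
    print(sigma_range, s)                      # approx. 2.17 and 2.24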

Properties of σ-estimators
Suppose that our observations {y1, ..., yn} are drawn from a uniform population
Y ~ U(a, b). It turns out also for this case that there is an easy way to estimate
the standard deviation using the sample range.
It is easily proved that the mean value of the range for a sample drawn from
a uniform population equals

E[Y(n) - Y(1)] = (b - a)(n - 1)/(n + 1).   (2.6.2)

Compare this formula with the formula for the standard deviation:

σ_Y = (b - a)/√12   (2.6.3)

and derive that

σ_Y = E[Y(n) - Y(1)](n + 1)/(√12(n - 1)).   (2.6.4)

It follows therefore that an estimate of σ_Y can be obtained by the formula

σ̂_Y = (Y(n) - Y(1))/un,   (2.6.5)

where un = √12(n - 1)/(n + 1).
The coefficients un are given in Table 2.6. It is surprising that for small n
they are quite close to the values An. This means that for estimating σ in small
samples (n ≤ 10), assuming a normal distribution or a uniform distribution
would produce quite similar results.
We see that there are two different estimators of the standard deviation: the
S-estimator,

S = √( Σ_{i=1}^{n} (Yi - Ȳ)² / (n - 1) ),   (2.6.6)

see (2.1.7), and the estimator (2.6.1) based on range,

σ̂ = (Y(n) - Y(1))/An.

The latter is recommended for a normal sample, when the coefficients An are
taken from Table 2.6, while the former is universal and does not need any
distributional assumptions regarding the random variables involved. Suppose
that we may assume normality. Which of the above two estimators is preferable?
Let us note first that the estimator S is biased, i.e. its mean value is not equal
to σ. To make this estimator unbiased it must be multiplied by a coefficient

bn = Γ((n - 1)/2)√(n - 1) / (Γ(n/2)√2),   (2.6.7)

where Γ(·) is the gamma function. More details can be found in Bolshev and
Smirnov (1965, p. 60). The values of bn are given in Table 2.7.
Suppose that the S-estimator is multiplied by bn. Let us now compare
S* = bnS with σ̂ of (2.6.1). Both formulas define unbiased estimators. They can
be compared in terms of their variance. The ratio en = Var[S*]/Var[σ̂] is called
efficiency. Bolshev and Smirnov (1965, p. 60) present the following values of en:
e2 = 1, e5 = 0.96, e10 = 0.86, e15 = 0.77. We see that for small n ≤ 10, the use
of σ̂ leads to a rather small loss of efficiency, i.e. Var[σ̂] is only slightly larger
than Var[S*]. This fact justifies the use of the estimator based on range in the
case of small samples.
In addition, let us mention the theoretical fact that for the normal case, the
S*-estimator is the minimal-variance unbiased estimator, i.e. it has the smallest
possible variance among all possible unbiased estimators of σ.

Table 2.7: The correction factor bn


n bn
2 1.253
3 1.128
4 1.085
5 1.064
10 1.028
15 1.018
20 1.013
30 1.009
50 1.005

Table 2.8: γn(α) = tα(n - 1)/√n for α = 0.05 and α = 0.025

Sample size n   γn(0.05)   γn(0.025)   Sample size n   γn(0.05)   γn(0.025)

3    1.686   2.484   13   0.494   0.604
4    1.176   1.591   14   0.473   0.577
5    0.953   1.241   15   0.455   0.554
6    0.823   1.050   16   0.438   0.533
7    0.734   0.925   17   0.423   0.514
8    0.670   0.836   18   0.410   0.497
9    0.620   0.769   19   0.398   0.482
10   0.580   0.715   20   0.387   0.468
11   0.546   0.672   25   0.342   0.413
12   0.518   0.635   30   0.310   0.373

2.7 Confidence Interval for the Population Mean

Suppose we draw a sample of size n from a normal population. It is well known
that the 1 - 2α confidence interval for the population mean μ is the following:

[ȳ - tα(n - 1)·s/√n,  ȳ + tα(n - 1)·s/√n],   (2.7.1)

where tα(n - 1) is the α-critical value of the t-statistic with n - 1 degrees of
freedom. To simplify the computation procedure, we present in Table 2.8 the
values of γn(α) = tα(n - 1)/√n for various values of n and for α equal to 0.05
and 0.025.
For the Yarok example, after deleting the outlier, the sample mean is 166.00,
n = 19, s = 2.24. Let us keep two significant digits for s (put s = 2.2) and
round off the final computation result to 0.1.
Table 2.9: Measurement results for ten specimens

i    Di       Yi       Δi = Yi - Di
1    10.533   10.545   0.012
2    9.472    9.476    0.004
3    9.953    9.960    0.007
4    10.823   10.830   0.007
5    8.734    8.736    0.002
6    10.700   10.706   0.006
7    9.620    9.630    0.010
8    10.580   10.582   0.002
9    10.546   10.555   0.011
10   9.518    9.520    0.002

Thus, using Table 2.8, for n = 19 and 1 - 2α = 0.9, we have γ19(0.05) = 0.398,
and the 0.90 confidence interval for the population mean is

[166.00 - 0.398·2.2, 166.00 + 0.398·2.2] = [165.1, 166.9].
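The factor γn(α) of Table 2.8 is simply tα(n - 1)/√n, so the interval is easily
computed directly; a minimal sketch using scipy:

    import numpy as np
    from scipy import stats

    n, ybar, s, alpha = 19, 166.00, 2.2, 0.05
    gamma = stats.t.ppf(1 - alpha, n - 1) / np.sqrt(n)  # gamma_n(alpha), cf. Table 2.8
    print(gamma)                                        # approx. 0.398
    print(ybar - gamma * s, ybar + gamma * s)           # approx. 165.1 and 166.9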

Let us recall that confidence intervals can be used to test a hypothesis re-
garding the mean value in the following way. Suppose that we want to test the
null hypothesis H0 that the population mean equals μ = μ0 against a two-sided
alternative μ ≠ μ0, at the significance level 0.05. We proceed as follows. Con-
struct, on the basis of the data, a 0.95 confidence interval for μ. If this interval
covers the value μ0, the null hypothesis is not rejected. Otherwise, it is rejected
at the significance level 0.05.

Remark 1
We draw the reader's attention to an important issue which is typically not very
well stressed in courses on statistics. If the sample {y1, ..., yn} is obtained as a
result of measurements with a constant but unknown bias δ, then the confidence
interval (2.7.1) will be in fact a confidence interval for the sum μ + δ.

Example 2.7.1: Testing a measurement instrument for bias

In order to test whether there is a systematic error in measuring the diameter of
a cylindrical shaft by a workshop micrometer, ten specimens were manufactured.
Each was measured twice. The first measurement was made by a highly accurate
"master" instrument in the laboratory. This instrument is assumed to have a
zero bias. The second measurement was made by the workshop micrometer.
Denote by Di and Yi the measurement results for the ith specimen made
by the master and the workshop micrometer, respectively. The measurement
results are presented in Table 2.9. The average value of the difference Δi =

Yi - Di is 0.0063, and the sample standard deviation for Δi is s = 0.0038. If
the workshop micrometer has zero bias, the mean of Δi must be zero. So, our
null hypothesis H0 is E[Δi] = μΔ = 0.
Let us construct a 0.9 confidence interval for the mean value μΔ of the differ-
ences Δi. From Table 2.8, γ10(0.05) = 0.580. Thus the confidence interval is

[0.0063 - 0.580 × 0.0038, 0.0063 + 0.580 × 0.0038] = [0.0041, 0.0085].

This interval does not contain zero, and thus the null hypothesis is rejected.
The estimate of the bias of the workshop micrometer is μ̂Δ = 0.0063. In words:
on average, the workshop micrometer measurements are larger than the true
value by approximately 0.0063 mm.
The probability model for the above measurement scheme is as follows. Di
is considered as the true, "absolutely accurate" value of the diameter of the ith
specimen. The measurement result is given by

Yi = Di + μΔ + εi,   (2.7.2)

where μΔ is the bias of the workshop micrometer, and εi is the random zero-
mean measurement error. By our assumption, therefore, the workshop micro-
meter has a constant bias during the measurement experiment.

2.8 Control Charts for Measurement Data

Control charts were first invented by W.A. Shewhart in the 1930s for monitoring
production processes and for discovering deviations from their normal course.
These deviations in their most basic form are of two types - changes in the
process mean value and changes in the standard deviation. By a "production
process" we mean here a one-dimensional parameter, say the diameter of a rod
produced on an automatic machine.
In a normal, stable situation the rod diameter variations are described by
a random variable Y with mean μ and standard deviation σ. These variations
are caused by so-called common causes. Beauregard et al. (1992, p. 15) say
that the "common cause of variation refers to chronic variation that seems to
be 'built-in' to the process. It's always been there and it will always be there
unless we change the process".
Shifts in the mean value and changes in the standard deviation result from
the action of so-called special causes, such as change of adjustment, deviation
in the technological process, sudden changes in the measurement process etc.
According to Beauregard et al., "special causes of variation are due to acute
or short-term influences that are not normally a part of the process as it was
intended to operate".
We explain the Shewhart chart for measurement data using an example
borrowed from Box and Luceno (1997, p. 57).
Table 2.10 shows measurements of the diameter of rods taken from a produc-
tion line. The target value of the diameter is T = 0.876 inch. A sample of four

Table 2.10: Rod diameter measurements over 20 hours


Hour i   D1    D2    D3    D4    Average   Range
1 -8 3 12 -9 -0.50 21
2 -3 7 -9 20 3.75 29
3 7 -5 7 -8 0.25 15
4 18 -23 -1 -18 -6.00 41
5 0 5 10 -11 1.00 21
6 1 -9 5 9 1.50 18
7 5 -2 25 -3 6.25 28
8 7 4 -11 -14 -3.50 21
9 9 -19 7 19 4.00 38
10 11 28 11 7 14.25 21
11 0 0 5 12 4.25 12
12 4 -6 14 5 4.25 20
13 2 1 21 0 6.00 21
14 28 13 -9 -4 7.00 37
15 26 2 7 -10 6.25 36
16 -15 11 0 7 0.75 26
17 14 16 7 23 15.00 16
18 19 11 29 18 19.25 18
19 16 20 30 17 20.75 14
20 -4 24 9 26 13.75 30

rods was randomly chosen and measured every hour, and the table shows data
over a particular period of 20 hours of production. The figures in this table are
the deviations of the rod diameters from the value T1 = 0.870 in thousandths
of an inch. Each row in the table corresponds to one hour. The last two columns
give the hour average and hour range for the four observations.1

Two graphs are of the utmost importance in Shewhart's methodology. The
first, the "X bar chart" (see Fig. 2.7), is a representation of the sample means
for each hour.

1 This material is borrowed from George Box and Alberto Luceno, Statistical Control by
Monitoring and Feedback Adjustment (1997), and is used by permission of John Wiley &
Sons, Inc., Copyright ©1997.

[Figure 2.7 here: X bar chart of the hourly sample means against case number,
with center line at 5.9125 and control limits at 23.506 and -11.681.]

Figure 2.7. The X bar chart for data in Table 2.10 produced by Statistix. The
zero line corresponds to T1 = 0.870.

X Bar Chart
The line 5.9125 (near the mark "6") is the estimated mean value of the process.
This value is the average μ̂ of the hour averages in Table 2.10. The lines 23.506
and -11.681 (near the marks "24" and "-12", respectively) are called control
limits and correspond to ±3σ̂_D/√4 deviations from the estimated mean value.
σ̂_D is the estimate of the standard deviation of the rod diameter. We divide
it by √4 because we are plotting ±3 standard deviations for the average of 4
observations.
An important statistical fact is that the sample means, due to the central
limit effect, closely follow the normal distribution, even if the rod diameters
themselves may not be normally distributed.
How do we estimate the standard deviation σ_D? The practice is to use the
average range. From the Range column of Table 2.10 one can calculate that the
average range is R̄ = 24.15. Now from Table 2.6 obtain the coefficient An for
the sample size n = 4: A4 = 2.059. Then, as already explained in Sect. 2.6, the
estimate of σ_D will be

σ̂_D = R̄/A4 = 24.15/2.059 = 11.73.   (2.8.1)

The estimate of the standard deviation of the sample averages is √4 = 2 times
smaller and equals 5.86. We can round this result to 5.9.
When the averages go outside of the control limits, actions must be taken
to establish the special cause for that.
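The chart constants quoted above can be reproduced directly from the last two
columns of Table 2.10; a minimal sketch in Python:

    import numpy as np

    averages = np.array([-0.50, 3.75, 0.25, -6.00, 1.00, 1.50, 6.25, -3.50,
                         4.00, 14.25, 4.25, 4.25, 6.00, 7.00, 6.25, 0.75,
                         15.00, 19.25, 20.75, 13.75])
    ranges = np.array([21, 29, 15, 41, 21, 18, 28, 21, 38, 21,
                       12, 20, 21, 37, 36, 26, 16, 18, 14, 30])

    center = averages.mean()                 # 5.9125
    sigma_D = ranges.mean() / 2.059          # R-bar / A_4, formula (2.8.1)
    ucl = center + 3 * sigma_D / np.sqrt(4)  # upper control limit
    lcl = center - 3 * sigma_D / np.sqrt(4)  # lower control limit
    print(center, lcl, ucl)                  # approx. 5.91, -11.68, 23.51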
The principal assumption of Shewhart's charts is that the observations taken
each hour are, for a state of control, sample values of identically distributed

independent random variables. Deviations from these assumptions manifest
themselves in the appearance of abnormalities in the behaviour of the X bar
plot.
For example, the Western Electric Company adopted the following rules for
an action signal (see Box and Luceno (1997)):
Rule 1: A single point lies beyond the three-sigma limits.
Rule 2: Two out of three consecutive points lie beyond the two-sigma limits.
Rule 3: Four out of five consecutive points lie beyond the one-sigma limits.
Rule 4: Eight consecutive points lie on one side of the target value.
Nelson (1984) contains a collection of stopping rules based on various abnormal
patterns in the behavior of X bar charts.

Range charts
The other graph which is important in Shewhart's methodology is the 'range
chart' or 'R chart'. Range charts are meant for detecting changes in the process
variability. They are constructed similarly to the X bar charts; see Fig. 2.8.
Each hour, the actual observed range of the sample of n = 4 rod diameters is
plotted on the chart.
Denote by σ̂_R the estimate of the standard deviation of the range (for n = 4).
The distribution of the range is only approximately normal, but nevertheless
±2σ̂_R or ±3σ̂_R limits are used to supply action limits and warning limits.

[Figure 2.8 here: R chart of the hourly sample ranges against case number;
center line 24.150, upper control limit 55.115, lower limit 0.]

Figure 2.8. The R chart for the data in Table 2.10. The average range is
R̄ = 24.15. The upper control limit equals 55.115 = 24.150 + 3·10.32.
For the normal distribution, the mean range and the standard deviation of
the range (for fixed sample size n) depend only on σ and are proportional to σ,

Table 2.11: Probabilistic characteristics of normal ranges, σ = 1; Bolshev and
Smirnov (1965, p. 58)

n    E[Rn]   σ_Rn    kR(n)   UCL = E[Rn] + 3σ_Rn   P0 = P(Rn > UCL)
4    2.059   0.880   2.34    4.699                 0.0049
5    2.326   0.864   2.69    4.918                 0.0046
6    2.534   0.848   2.99    5.078                 0.0045
7    2.704   0.833   3.25    5.203                 0.0044
8    2.847   0.820   3.47    5.307                 0.0043
9    2.970   0.808   3.68    5.394                 0.0044
10   3.078   0.797   3.86    5.469                 0.0044

the standard deviation of the population. σ̂_R can be obtained from the average
range R̄ by the following formula:

σ̂_R = R̄/kR(n),   (2.8.2)

where kR(n) is the coefficient in the fourth column of Table 2.11.
In our example, the average of the last column of Table 2.10 is 24.15, kR(4) =
2.34 and

σ̂_R = 24.15/2.34 = 10.32.   (2.8.3)

If the average range minus 3σ̂_R goes below zero, the lower control limit is
replaced by zero.

Influence of Measurement Errors on the Performance of R Charts

The above description of Shewhart's charts is quite basic. There is a huge
literature devoted to the control and monitoring of changes in the process which
might be revealed by these charts; see Box and Luceno (1997) and Beauregard
et al. (1992). In practice, it is most important to find the so-called assignable
causes responsible for the particular deviation beyond the control limits on the
X bar and/or the R chart.
The literature on statistical process control typically devotes little attention
to the fact that the measurement results reflect both the drift and the variability
of the controlled process and the systematic and random measurement errors.
Formally, the measurement result Y might be represented in the following form:

Y = μ + δpr + εpr + δinstr + εinstr,   (2.8.4)

where μ is the process target (mean) value, δpr is the shift of the process from
μ, εpr is the random deviation of the process around μ + δpr due to common
causes, and δinstr and εinstr are the systematic bias and random error of the
measurement instrument, respectively.

Suppose that statistical process control is applied to a process with δpr = 0
and δinstr = 0, i.e. the process has no shift and the measurement instrument
has no systematic bias. Then

Y = μ + εpr + εinstr.   (2.8.5)

Assuming that εpr and εinstr are independent, we have

σ_Y = √(σ_pr² + σ_m²),   (2.8.6)

where σ_pr and σ_m are the standard deviations of εpr and εinstr, respectively.
If, for example, γ = σ_m/σ_pr = 1/7, then σ_Y = √(σ_pr² + σ_pr²/49) ≈ 1.01σ_pr.
Thus the overall standard deviation will increase by 1% only. Assuming that
a 1% increase is admissible, we can formulate the following rule of thumb: the
standard deviation of the random measurement error must not exceed one sev-
enth of the standard deviation caused by the common causes in the controlled
process.

There are situations, especially in chemical measurements, for which typi-


cally the measurement error has the same magnitude as the process variation;
see Dechert et al. (2000). These situations demand special attention since the
presence of large measurement errors may change the performance characteris-
tics of the control charts.
Suppose that the X bar and the R charts were designed using properly ad-
justed and highly accurate measurement instruments and/or measurement pro-
cedures. In other words, let us assume that the control limits were established
for a process in control, where the measurement instruments had a negligible
δinstr and σ_m. Afterwards, in the course of work, the measurement instrument
gradually loses its proper adjustment and develops a systematic error, say a
positive bias Δ > 0. Then, even in the absence of any special cause, the X bar
chart will reveal a high number of upper control limit crossings. Similarly, an
increase in σ_m, e.g. due to mechanical wearout of the measurement instrument,
will lead to an increase in the probability of crossing the control limits, both
for the X bar and the R charts. It should be kept in mind, therefore, that the
loss of accuracy of a measurement instrument may be the assignable cause for
the crossing of control limits in Shewhart's charts.
Note also that the appearance of instrument bias in measurements will not
affect the range of the sample, and thus the R chart will be insensitive ("ro-
bust") to the systematic error in the measurement process. If the R chart looks
"normal", but the X bar chart signals trouble, a reasonable explanation might
be that this trouble is caused by a process drift or by measurement instrument
bias.
Any measurement instrument and/or measurement procedure based on us-
ing measurement instruments must be subject to periodic calibration, i.e. in-
spections and check-ups using more accurate and precise instruments. This

calibration is similar in principle to statistical process control. In particular,
the measurement instruments must be checked periodically to discover changes
in the adjustment (e.g. the presence of a systematic bias) and to discover a loss
of precision, i.e. an increase in the standard deviation of random measurement
errors. The particular details of how to carry out these inspections depend on
the type of the instrument, its precision standards, etc.; see Morris (1997).
Let us investigate the influence of measurement errors on the performance of
the R chart. This material is somewhat theoretical in nature and the formal
reasoning could be omitted at a first reading. We assume that the process
measurement model is the following:

Y = μ + εpr + εinstr,   (2.8.7)

where εpr ~ N(0, σ_pr²) and εinstr ~ N(0, σ_m²) are independent. Expression
(2.8.7) means that we assume no systematic process shift and no measurement
bias: δpr = δinstr = 0. In fact, this assumption is not restrictive since R charts
are insensitive to process and measurement bias.
Let us consider the probability of crossing the upper control limit in the R
chart:

P0 = P(Rn > E[Rn] + 3σ_Rn),

where Rn is the random range in the sample of n normally distributed mea-
surement results, E[Rn] is the corresponding mean range and σ_Rn is the range
standard deviation.
The most important characteristic of the performance of a control chart is
the average run length (ARL), defined as the mean number of samples until the
first crossing of the three-sigma control limit. The ARL is expressed in terms
of P0 as follows:

ARL = 1/P0.   (2.8.8)
To compute P0 note that Rn/σ is distributed as the random range for a sample
of size n which is taken from a standard normal population N(0,1). If we divide
E[Rn] and σ_Rn by σ, we will obtain the mean range and the standard deviation
of the range, respectively, for N(0,1). Therefore, to compute P0 we can use the
standard normal distribution. Table 2.11 gives the P0 values for sample sizes
n = 4-10. For example, for n = 4, P0 ≈ 0.0049 and ARL ≈ 205.
Denote

γ = σ_m/σ_pr.   (2.8.9)

It follows from (2.8.7) that now the measurement Y ~ N(μ, σ_pr²(1 + γ²)).
Suppose that γ = 0.37. Let us find out the corresponding ARL for n = 4.
The control limit equals 4.699σ_pr (see Table 2.11, column 5). This limit was
established for the ideal situation where σ_m = 0, or γ = 0. Now the sample
range of the random variable Y has increased by a factor √(1 + γ²).

It can be shown that now the probability of crossing the control limit equals

P_γ = P(Rn > 4.699/√(1 + γ²)).   (2.8.10)

For our example, √(1 + γ²) = √(1 + 0.37²) = 1.066, and we have to find P(Rn >
4.699/1.066) ≈ P(Rn > 4.41). This probability can be found using special
tables; see Dunin-Barkovsky and Smirnov (1956, p. 514). The desired quantity
is ≈ 0.01, from which it follows that ARL ≈ 100. Thus, if the standard deviation
of the measurement instrument is about 37% of the process standard deviation,
the ARL of the R chart is roughly halved.

Performance of X Bar Chart in the Presence of Measurement Bias

Let us investigate the influence of the measurement bias δinstr and variance σ_m²
on the probability of crossing the control limits LCL = μ - 3σ_pr/√n and
UCL = μ + 3σ_pr/√n.
This material, like the material at the end of the previous subsection, is of
a theoretical nature and can be omitted when first reading the book. Only the
final numerical illustrations are important.
Suppose that in (2.8.4) the process shift δpr = 0, and the measurement
instrument introduces a systematic error δinstr. Then Y ~ N(μ + δinstr, σ_pr² +
σ_m²). The average Ȳ of n observations is then a normally distributed random
variable with mean μ + δinstr and standard deviation σ_Ȳ = σ_pr√(1 + γ²)/√n.
Therefore

Z0 = (Ȳ - μ - δinstr)/σ_Ȳ ~ N(0, 1).   (2.8.11)
Now the probability that the sample average is between LCL and UCL equals

P(LCL < Ȳ < UCL)   (2.8.12)
  = P( (-3σ_pr - δinstr√n)/(σ_pr√(1 + γ²)) < Z0 < (3σ_pr - δinstr√n)/(σ_pr√(1 + γ²)) ).

After some simple algebra we obtain that the probability that Ȳ is outside the
control limits equals

P(Ȳ < LCL or Ȳ > UCL) = Φ( (-3 - Δ√n)/√(1 + γ²) ) + Φ( (Δ√n - 3)/√(1 + γ²) ),   (2.8.13)

where Δ = δinstr/σ_pr and Φ(·) is the distribution function of the standard
normal variable; see (2.3.3).

Example 2.8.1: The ARL for the X bar chart in the presence of instrument bias
Assume that the sample size is n = 4, γ = σ_m/σ_pr = 0.2 and Δ = 0.5. Thus we
assume that the measurement bias is half of σ_pr, and the standard deviation of
the measurement error is 20% of the process standard deviation.

Substituting these values into (2.8.13) we obtain, using the table in Appendix
A and taking Φ(-3.92) ≈ 0, that

P(Ȳ < LCL or Ȳ > UCL) = Φ(-3.92) + Φ(-1.96) ≈ 1 - 0.975 = 0.025.

The ARL is therefore 1/0.025 = 40. Let us compare this with the ARL for an
"ideal" measurement instrument (δinstr = σ_m = 0).
The probability that the averages go outside the three-sigma limits equals
2Φ(-3) = 2(1 - 0.9986) = 0.0028. The corresponding ARL = 1/0.0028 ≈ 360.
We see therefore that the presence of δinstr = 0.5σ_pr and σ_m = 0.2σ_pr reduces
the ARL by a factor of 9.
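Formula (2.8.13) and the resulting ARL are easy to evaluate numerically; a
minimal sketch using scipy:

    from math import sqrt
    from scipy.stats import norm

    def arl_xbar(delta, gamma, n=4):
        # ARL of the X bar chart for bias delta = d_instr/sigma_pr and
        # gamma = sigma_m/sigma_pr, computed from (2.8.13)
        c = sqrt(1 + gamma**2)
        p_out = (norm.cdf((-3 - delta * sqrt(n)) / c)
                 + norm.cdf((delta * sqrt(n) - 3) / c))
        return 1 / p_out

    print(arl_xbar(0.5, 0.2))  # approx. 40
    print(arl_xbar(0.0, 0.0))  # approx. 370 (the 360 above uses the rounded 0.9986)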

2.9 Exercises
1. Below are the results of 15 replicate determinations of nitrate ion concentra-
tion in mg/ml in a particular water specimen:
0.64, 0.64, 0.63, 0.63, 0.62, 0.65, 0.66, 0.63, 0.60, 0.64, 0.65, 0.66, 0.61, 0.62, 0.63.

a. Find the estimate μ̂ of the mean concentration.
b. Estimate the standard deviation via formula (2.1.7) and via the sample
range. For the latter divide the sample range by A15 = 3.472.
c. Would you consider the smallest result as an outlier (use Dixon's test)?
d. Construct a 95% confidence interval for the mean concentration.
e. Using the rule of Sect. 2.2, check whether the original data are recorded
with sufficient accuracy.

2. It is known that the weight of Yarok apples is normally distributed with


mean 150 g and standard deviation 15 g. Find the probability that the weight
of a given apple will be inside the interval [145,158].

3. In order to avoid systematic error in the process of weighing, the following
method is used in analytic chemistry. First, a vessel containing a liquid is
weighed, and then the weight of the empty vessel is subtracted from the previous
result. Suppose that the random weighing error has a standard deviation σ =
0.0005 g. What will be the standard deviation of the weight difference?
Solution: Using (2.1.22), with a1 = 1, a2 = -1, we obtain that the standard
deviation of the weight difference is 0.0005 × √2 ≈ 0.0007.

4. Stigler (1977) presents Simon Newcomb's measurements of the speed of light
carried out in 1882. The data are presented as deviations in nanoseconds from
24800, the time for the light to travel 7442 meters. Table 2.12 gives the first 30
measurements (out of 66):1

1 Reprinted with permission from Andrew Gelman, John B. Carlin, Hal S. Stern and Donald
Rubin, Bayesian Data Analysis (2000), p. 70, Copyright CRC Press, Boca Raton, Florida

Table 2.12: Newcomb's light speed data (Stigler 1977)

28 26 33 24 34 -44 27 16 40 -2
29 22 24 21 25 30 23 29 31 19
24 20 36 32 36 28 25 21 28 29

Assume the normal model and check whether the lowest measurement -44
might be considered as an outlier.
Solution. Dixon's statistic Q = (-2 - (-44))/(40 - (-44)) = 42/84 = 0.5 > C30 =
0.260; see Table 2.5.

5. In order to check the titration procedure for a potential bias in measuring
NaCl concentration in water, a standard solution with 0.5% concentration of
NaCl was tested 7 times, and the following percentage results were obtained:
0.51, 0.55, 0.53, 0.50, 0.56, 0.53, 0.57.
Test the null hypothesis that the concentration is 0.5% at the significance level
0.05. Use for this purpose the 0.95 confidence interval for the mean concentra-
tion. Estimate the measurement bias.
Solution. The average concentration is 0.536, s = 0.026. The 95% confidence
interval [0.512, 0.560] does not contain the point 0.5. The estimate of the bias
is δ̂ = 0.536 - 0.500 = 0.036 ≈ 0.04.
Chapter 3

Comparing Means and


Variances

Our discontent is from comparison.

J. Norris

3.1 t-test for Comparing Two Population Means

In this section we consider a very important and widely used test in statistics
called the t-test, which is designed to compare the mean values in two inde-
pendent normally distributed populations A and B. Let us first consider an
example.

Example 3.1.1: Dextroamphetamine excretion by children1

This example (Devore, 1982, p. 292) gives data on the amount of a special
chemical substance called dextroamphetamine (DEM) excreted by children. It
is assumed that children with organically related disorders produce more
DEM than children with nonorganic disorders. To test this assumption,
two samples of children were chosen and both were given a drug containing
DEM. These samples were compared by the percentage of drug recovery seven
hours after its administration.

1 Borrowed from Probability and Statistics for Engineering and the Sciences, 1st Edition
by J. Devore. ©1982. Reprinted with permission of Brooks/Cole, a division of Thomson
Learning: www.thomsonrights.com. Fax 800 730-2215.

[Figure 3.1 here: box and whisker plots of samples A and B; vertical axis,
percentage of drug recovery from 12 to 30.]

Figure 3.1. Box and whisker plots for samples A and B, Example 3.1.1

Sample A (nonorganic disorders): 15.59, 14.76, 13.32, 12.45, 12.79.
Sample B (organic disorders): 17.53, 20.60, 17.62, 28.93, 27.10.
Figure 3.1 shows the box and whisker plot for both samples. We see that the
samples differ considerably in their mean values and in their standard deviations.
We assume therefore that populations A and B are distributed as N(μA, σA²)
and N(μB, σB²), respectively, with σA ≠ σB.
Let us define the null hypothesis H0: μA = μB and the alternatives
H-: μA < μB,
H+: μA > μB and
H*: μA ≠ μB.
The test is based on the following statistic:

T = (X̄A - X̄B) / √(sA²/nA + sB²/nB),   (3.1.1)

where X̄A and X̄B are the sample means, and sA² and sB² are the sample
variances, computed by formulas (2.1.6) and (2.1.8), respectively; nA and nB
are the sample sizes.
Define also the number of degrees of freedom ν:

ν = (sA²/nA + sB²/nB)² / ( (sA²/nA)²/(nA - 1) + (sB²/nB)²/(nB - 1) ).   (3.1.2)

The value of ν must be rounded to the nearest integer.



Table 3.1: Critical values tα(ν) of the t-distribution

ν    α = 0.05   α = 0.025   α = 0.01   α = 0.005


1 6.314 12.706 31.821 63.657
2 2.920 4.303 6.965 9.925
3 2.353 3.182 4.451 5.841
4 2.132 2.776 3.747 4.604
5 2.015 2.571 3.365 4.032
6 1.943 2.447 3.143 3.707
7 1.895 2.365 2.998 3.499
8 1.860 2.306 2.896 3.355
9 1.833 2.262 2.821 3.250
10 1.812 2.228 2.764 3.169
11 1.796 2.201 2.718 3.106
12 1.782 2.179 2.681 3.055
13 1.771 2.160 2.650 3.012
14 1.761 2.145 2.624 2.977
15 1.753 2.131 2.602 2.947
16 1.746 2.120 2.583 2.921
17 1.740 2.110 2.567 2.898
18 1.734 2.101 2.552 2.878
19 1.729 2.093 2.539 2.861
20 1.725 2.086 2.528 2.845
25 1.708 2.060 2.485 2.787
30 1.697 2.042 2.457 2.750
40 1.684 2.021 2.423 2.704

Denote by tα(ν) the α-critical value of the t-distribution with ν degrees
of freedom. We assume that the reader is familiar with this distribution, often
referred to as Student's distribution. If a random variable T0 has a t-distribution
with ν degrees of freedom, the probability that T0 exceeds tα(ν) is equal to
α. The t-distribution is symmetric with respect to zero, and therefore the
probability that T0 is less than -tα(ν) also equals α. Table 3.1 gives the values
of tα(ν). We reject H0 in favor of H- at the significance level α if the value of
T computed by (3.1.1) is less than or equal to -tα(ν). We reject H0 in favor of
H+ at level α if the value of T is greater than or equal to tα(ν). We reject H0
in favor of H* at level α if the absolute value of T is greater than or equal to
tα/2(ν).

Example 3.1.1 continued

Let us complete the calculations for Example 3.1.1. We have X̄A = 13.782, sA =
1.34, nA = 5. Similarly, X̄B = 22.356, sB = 5.35, nB = 5. (Observe that the

populations differ quite significantly in the values of their estimated standard
deviations.) By (3.1.1) the test statistic is

T = (13.782 - 22.356) / √(1.34²/5 + 5.35²/5) = -3.477.

Now let us compute ν. By (3.1.2),

ν = (1.34²/5 + 5.35²/5)² / ( 1.34⁴/(5²·4) + 5.35⁴/(5²·4) ) = 4.5 ≈ 5.

We will check H0 against H-: μA < μB at significance α = 0.01. We
see from Table 3.1 that T = -3.477 is less than -t0.01(5) = -3.365 and thus
we reject the null hypothesis. We confirm therefore the assumption that DEM
excretion for nonorganic disorders is smaller than for organic disorders.
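This version of the t-test (often called the Welch or Satterthwaite test) is
available in statistical software; a minimal sketch with scipy (the one-sided
alternative argument requires scipy 1.6 or later):

    from scipy import stats

    A = [15.59, 14.76, 13.32, 12.45, 12.79]   # nonorganic disorders
    B = [17.53, 20.60, 17.62, 28.93, 27.10]   # organic disorders

    t, p = stats.ttest_ind(A, B, equal_var=False, alternative='less')
    print(t, p)   # t approx. -3.477, one-sided p approx. 0.01: H0 is rejected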

Remark 1
What if the normality assumption regarding both populations is not valid? It
turns out that the t-test used in the above analysis is not very sensitive to
deviations from normality. More on this issue can be found in Box (1953), and
an enlightening discussion can be found in Scheffe (1956, Chapter 10). It is
desirable to use, in addition to the t-test, the so-called nonparametric approach,
such as a test based on ranks. We discuss in Sect. 3.2 a test of this kind which
is used to compare mean values in two or more populations.

Remark 2
It is quite typical of chemical measurements for there to be a bias or a systematic
error caused by a specific operation method in a specific laboratory. The bias
may appear as a result of equipment adjustment (e.g. a shift of the zero point),
operational habits of the operator, properties of the reagents used, etc. It is vitally
important for the proper comparison of means to ensure that the two sets of
measurements (sample A and sample B) have the same systematic errors. Then,
in the formula for the T-statistic, the systematic errors in both samples cancel
out. Interestingly, this comment is rarely made in describing the applications
of the t-test.
In order to ensure the equality of systematic errors, the processing of samples
A and B must be carried out on the same equipment, in the same laboratory,
possibly by the same operator, and within a relatively short period of time.
These conditions are referred to in measurement practice as "repeatability con-
ditions"; see clause 4.2.1, "Two groups of measurements in one laboratory", of
the British Standard BS ISO 5725-6 (1994).

Remark 3
Is there a way to estimate bias? In principle, yes. Prepare a reference sample
with a known quantity of the chemical substance. Divide this sample into n
portions, and carry out, under repeatability conditions, the measurements for
each of these n portions. Calculate the mean quantity x̄n. If x0 is the known
content of the chemical substance, then the difference Δ̂ = x0 - x̄n is an estimate
of the measurement bias. Construct, for example, the confidence interval for the
mean difference E[Δ] = μ and check whether it contains the zero point. If not,
assume that the bias does exist and Δ̂ is its estimate.

Suppose that we can assume that the variances in populations A and B are
equal: σA = σB. Then the testing procedure for hypotheses about μA - μB is
similar to that described above, with some minor changes. The test statistic
will be

T = (X̄A - X̄B) / (sp√(1/nA + 1/nB)),   (3.1.3)

where sp (the square root of the pooled sample variance) is defined as

sp = √( ((nA - 1)sA² + (nB - 1)sB²) / (nA + nB - 2) ),   (3.1.4)

in which sA² and sB² are the sample variances for samples A and B, respectively,
calculated as in (2.1.8). The number of degrees of freedom for this version of
the t-test is ν = nA + nB - 2.

Paired Experiment for Comparing Means

Suppose that we wish to compare two wear-resistant materials, A and B. For
this purpose, we prepare two samples of 8 specimens from the above materials.
Suppose that testing these specimens is done by two laboratories, 1 and 2,
and that each of the laboratories tests 8 specimens. One way of organizing
the experiment is to give sample A to lab 1 and sample B to lab 2. Suppose
that the results differ significantly, and that sample A is definitely better, i.e.
shows less wear. Would the results of this experiment be conclusive in favor of
material A, if the corresponding t-test indicates that the wear in sample A is
significantly smaller than in sample B? On one hand yes, but on the other there
remain some doubts that possibly lab 1 applied a smaller friction force than lab
2, or used different abrasive materials, etc. So, it is desirable to organize the
experiment in such a way that the possible differences in the testing procedure
are eliminated.
An efficient way to achieve this goal is to redesign the experiment in the
following way. Choose randomly 8 pairs of specimens, each pair containing one
specimen from sample A and one from sample B. Organize the experiment in
such a way that each pair is tested in identical conditions. Compute for each
pair the difference in the amount of wear between the two specimens. In this
way, the differences resulting from the testing conditions will be eliminated.
What will be measured is the "pure" difference in the behaviour of the materials.
This type of experiment is called "pairwise blocking".

Table 3.2: Experiment 1: Average weight loss in % for material A

Driver 1 2 3 4 5 6 7 8
Weight loss 5.0 7.5 8.0 6.5 10.0 9.5 12.5 11.0

Table 3.3: Experiment 2: Average weight loss in % for material B

Driver       1    2    3    4    5     6    7     8
Weight loss  5.5  8.5  9.0  7.0  11.0  9.5  13.5  11.5

To make the advantages of pairwise blocking more obvious, let us consider an


imaginary experiment which is a variation on the famous statistical experiment
for comparing the wear of shoe soles made from two different materials.

Example 3.1.2: Comparing the wear of two materials for disk brake shoes
First, consider a "traditional" two-sample approach. Two groups of n = 8
similar cars (same model, same production year) are chosen. The first (group
A) has front wheel brake shoes made of material A (each car has two pairs
of shoes, one pair for each front wheel). Each particular shoe is marked and
weighed before its installation in the car. Eight drivers are chosen randomly
from the pool of drivers available, and each driver is assigned to a certain car.
In the first experiment, each driver drives his car for 1000 miles. Afterwards
the brake shoes are removed, their weight is compared to the initial weight,
the relative loss of weight for each shoe is calculated, and the average loss of
weight for all four shoes for each car is recorded. In the second experiment,
the same cars (each with the same driver) are equipped with brake shoes made
of material B, and the whole procedure is repeated. To exclude any driver-
material interaction, the drivers do not know which material is used in each
experiment.
The results obtained for Experiments 1 and 2 are presented in Tables 3.2
and 3.3.
It is obvious from Fig. 3.2 that material B shows greater wear than material
A. Moreover, this claim is not only true "on average", but also holds for each
car.
The amount of wear has a large variability caused by shoe material nonho-
mogeneity as well as by the variations in driving habits of different drivers and
differences in the roads driven by the cars. From the statistical point of view,
the crucial fact is that this variability hides the true difference in the wear.
To show this, let us apply the above-described t-test to test the null hypoth-
esis μA = μB against the obvious alternative μA < μB. (Here index A refers to
Experiment 1 and index B to Experiment 2.)

Table 3.4: The difference of wear measured by average weight loss in percentage

Driver       1    2    3    4    5    6    7    8
Weight loss  0.4  1.1  1.6  0.6  0.9  0.1  1.2  0.4

[Figure 3.2 here: weight loss in % (horizontal axis, 5 to 14) for each driver in
Experiment 1 and Experiment 2.]

Figure 3.2. Comparison of wear for Experiments 1 and 2

We calculate that X̄A = 8.75, X̄B = 9.44, sA² = 6.07 and sB² = 6.53. We have
nA = nB = 8, ν = 14. By (3.1.1) we obtain

T = (8.75 - 9.44) / √(6.07/8 + 6.53/8) = -0.55.

The 0.05-critical value t0.05(14) = 1.761. We must reject the null hypothesis
only if the computed value of T is smaller than -1.761. This is not the case,
and therefore our t-test fails to confirm that material B shows greater wear
than material A.
Now let us consider a more clever design for the whole experiment. The disk
brake has two shoes. Let us make one shoe from material A and the other from
material B. Suppose that 8 cars take part in the experiment, with the same
drivers as before. Each car wheel is equipped with shoes made of different
materials (A and B), and the location (outer or inner) of the A shoe is chosen
randomly.
For each car, we record the difference in the average weight loss for shoes
of materials A and B, i.e. we record the average weight loss of the B shoes
minus the average weight loss of the A shoes. Table 3.4 shows simulated results
for this imaginary experiment. For each car, the data in this table were obtained
as the difference between the previously observed values perturbed by adding
random numbers.
There is no doubt that these data point to a significant advantage of
material A over material B. The calculations show that the average difference
in wear is d̄ = 0.79 and the sample variance is sd² = 0.25. Let us construct a
95% confidence interval for the mean weight loss difference μd; see Sect. 2.7.
The confidence interval has the form

(d̄ - t0.025(7)·sd/√8, d̄ + t0.025(7)·sd/√8)
= (0.79 - 2.365·0.5/2.83, 0.79 + 2.365·0.5/2.83) = (0.37, 1.21).

We obtain therefore that the confidence interval does not contain zero, and
the null hypothesis must certainly be rejected.
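The same conclusion follows from a one-sample t-test applied to the differences
of Table 3.4; a minimal sketch with scipy:

    import numpy as np
    from scipy import stats

    d = np.array([0.4, 1.1, 1.6, 0.6, 0.9, 0.1, 1.2, 0.4])   # Table 3.4

    print(d.mean(), d.std(ddof=1))       # approx. 0.79 and 0.50
    t, p = stats.ttest_1samp(d, popmean=0.0)
    print(t, p)                          # t approx. 4.46, two-sided p approx. 0.003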

3.2 Comparing More Than Two Means

In this section we consider the comparison of several mean values by means of a
simple yet efficient procedure called the Kruskal-Wallis (KW) test. Unlike the
tests considered so far, the KW test is based not on the actual observations but
on their ranks. It belongs to the so-called nonparametric methods.
Let us remind the reader what we mean by ranks. Suppose that we have a
sample of 8 values: {12, 9, 14, 17, 7, 25, 5, 18}. Now put these values in increasing
order:

(5, 7, 9, 12, 14, 17, 18, 25)

and number them, from left to right. The rank of any observation will be its
ordinal number. So, the rank of 5, r(5), is 1, the rank of 7 is r(7) = 2, etc.
How do we assign the ranks if several observations are tied, i.e. are equal to
each other, as in the following sample: {12, 12, 14, 17, 7, 25, 5, 18}?
Order the sample, assign an ordinal number to each observation, and define
the rank of a tied observation as the corresponding average rank:
Ordered sample:   5, 7, 12, 12, 14, 17, 18, 25;
Ordinal numbers:  1, 2, 3, 4, 5, 6, 7, 8;
Ranks:            1, 2, 3.5, 3.5, 5, 6, 7, 8.
Notation for the KW Test
We consider I samples, numbered i = 1, 2, ..., I; sample i has Ji observations.
N = J1 + ... + JI is the total number of observations; Xij is the jth observation
in the ith sample.
It is assumed that the ith sample comes from the population described by
the random variable

Xi = μi + εij, i = 1, ..., I, j = 1, 2, ..., Ji,   (3.2.1)

where all random variables εij have the same continuous distribution. Without
loss of generality, it can be assumed that μi is the mean value of Xi.
It is important to stress that the Kruskal-Wallis procedure does not demand
that the random samples be drawn from normal populations. It suffices to

demand that the populations involved have the same distribution but may differ
in their location.
Suppose that all Xij values are pooled together and ranked in increasing
order. Denote by Rij the rank of the observation Xij. Denote by Ri· the total
rank (the sum of the ranks) of all observations belonging to sample i. R̄i· =
Ri·/Ji is the average rank of sample i.
The null hypothesis H0 is that all μi are equal:

μ1 = μ2 = ... = μI.   (3.2.2)

In view of the assumption that all εij have the same distribution, the null
hypothesis means that all I samples belong to the same population.
If H0 is true, one would expect the values of R̄i· to be close to each other
and hence close to the overall average

R̄·· = (R1· + ... + RI·)/N = (N + 1)/2.   (3.2.3)

An appropriate criterion for measuring the overall closeness of the sample rank
averages to R̄·· is a weighted sum of the squared differences (R̄i· - R̄··)².
The Kruskal-Wallis statistic is given by the following expression:

KW = (12/(N(N + 1))) Σ_{i=1}^{I} Ji (R̄i· - (N + 1)/2)².   (3.2.4)

The use of the KW test is based on the following:

Proposition 3.2.1
When the null hypothesis is true and either I = 3, Ji ≥ 6, i = 1, 2, 3,
or I > 3, Ji ≥ 5, i = 1, 2, ..., I, then KW has approximately a chi-square
distribution with ν = I - 1 degrees of freedom (see Devore 1982, p. 597).

The alternative hypothesis for which the KW test is most powerful is the
following:
H*: not all population means μ1, ..., μI are equal.

Since KW is zero when all R̄i· are equal and is large when the samples are
shifted with respect to each other, the null hypothesis is rejected for large values
of KW. According to the above Proposition 3.2.1, the null hypothesis is rejected
at significance level α if KW > χα²(ν), where χα²(ν) is the 1 - α quantile of
the chi-square distribution with ν degrees of freedom.

Remark 1: Chi-square distribution

Let us remind the reader that the chi-square distribution is defined as the dis-
tribution of a sum of squares of several standard normal random variables.
We say that the random variable G has a chi-square distribution with k
degrees of freedom if G = X1² + ... + Xk², where all Xi are independent random

Table 3.5: The values of χα²(ν)

d.f. ν   α = 0.1   α = 0.05   α = 0.025   α = 0.01   α = 0.005


1 2.706 3.841 5.024 6.635 7.879
2 4.605 5.991 7.378 9.210 10.597
3 6.251 7.815 9.348 11.345 12.838
4 7.779 9.488 11.143 13.277 14.860
5 9.236 11.070 12.833 15.086 16.750
6 10.645 12.592 14.449 16.812 18.548
7 12.017 14.067 16.013 18.475 20.278
8 13.362 15.507 17.535 20.090 21.955
9 14.684 16.919 19.023 21.666 23.589
10 15.987 18.307 20.483 23.209 25.188
11 17.275 19.675 21.920 24.725 26.757
12 18.549 21.026 23.337 26.217 28.300
13 19.812 22.362 24.736 27.688 29.819
14 21.064 23.685 26.119 29.141 31.319
15 22.307 24.996 27.488 30.578 32.801

variables distributed as N(0, 1). The corresponding critical values χα²(ν) are
defined as follows:

P(G > χα²(ν)) = α.   (3.2.5)

Table 3.5 gives the critical values χα²(ν) for the KW test. A more complete
table of the quantiles of the chi-square distribution is presented in Appendix B.
Let us consider an example.
Example 3.2.1: Silicon wafer planarity measurements1
Silicon wafers undergo a special chemical-mechanical planarization proce-
dure in order to achieve ultra-flat wafer surfaces. To control the process per-
formance, a sample of wafers from one batch was measured at nine sites. Table
3.6 presents a fragment of a large data set, for five wafers and for four sites.
Assuming that the wafer thickness at different sites can differ only by a shift
parameter, let us check the null hypothesis H0 that the thickness has the same
mean value at all four sites.
The mean ranks are 4.6, 8.0, 13.8 and 15.6 for sites 1 through 4, respectively.
The KW statistic equals 11.137 on ν = I - 1 = 3 degrees of freedom. From
Table 3.5, it

1 Source: Arnon M. Hurwitz and Patrick D. Spagon, "Identifying sources of variation", pp.
105-114, in the collection by Veronica Czitrom and Patrick D. Spagon, Statistical Case Studies
for Industrial Process Improvement ©1997. Borrowed with the kind permission of the ASA
and SIAM.

Table 3.6: Thickness (in angstroms) for four sites on the wafer

Wafer Site 1 Site 2 Site 3 Site 4


1 3238.8 3092.7 3320.1 3487.8
2 3201.4 3212.6 3365.6 3291.3
3 3223.7 3320.8 3406.2 3336.8
4 3213.1 3300.0 3281.6 3312.5
5 3123.6 3277.1 3289.1 3369.6

exceeds the critical value χ0.025²(3) = 9.348, the corresponding P-value being
near 0.01. We reject the null hypothesis at α = 0.05.
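The whole computation is a single call in scipy; a minimal sketch on the data
of Table 3.6:

    from scipy import stats

    site1 = [3238.8, 3201.4, 3223.7, 3213.1, 3123.6]
    site2 = [3092.7, 3212.6, 3320.8, 3300.0, 3277.1]
    site3 = [3320.1, 3365.6, 3406.2, 3281.6, 3289.1]
    site4 = [3487.8, 3291.3, 3336.8, 3312.5, 3369.6]

    kw, p = stats.kruskal(site1, site2, site3, site4)
    print(kw, p)   # KW approx. 11.14, p approx. 0.011: H0 rejected at alpha = 0.05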

How do we proceed if the null hypothesis is rejected? The statistician's first
task is to find those samples which do not show significant dissimilarities with
respect to their mean values. For this purpose a statistical procedure called
"multiple comparisons" is used (see Devore 1982, p. 598). Multiple comparisons
are based on pairwise comparisons between all pairs of the I samples involved.
To avoid tedious calculations, the use of statistical software is recommended.
Every statistical package has an option for multiple comparisons in the Kruskal-
Wallis procedure. Using Statistix reveals that samples 1, 2 and 3, or alternatively
samples 2, 3, 4, may be considered, at significance level α = 0.05, as having the
same mean values. The box and whisker plot in Fig. 3.3 also suggests that all
four samples cannot be treated as coming from the same population.
The next step in data analysis would be an investigation of the production
process. Finding and eliminating the factors responsible for large dissimilarities
in the wafer thickness at different sites would be a joint undertaking by engineers
and statisticians.
[Figure 3.3 here: box and whisker plots of wafer thickness for sites 1-4;
vertical axis from 3090 to 3490 angstroms.]

Figure 3.3. Box and whisker plot of wafer thickness at sites 1, 2, 3 and 4.

Table 3.7: Critical values of Hartley's statistic, for a = 0.05 (Sachs 1972, Table
151)

n-1 k=2 k=3 k=4 k=5 k=6 k=7 k=8 k=9


2 39.0 87.5 142 202 266 333 403 475
3 15.40 27.8 39.2 50.7 62.0 72.9 83.5 93.9
4 9.60 15.5 20.6 25.2 29.5 33.6 37.5 41.1
5 7.15 10.8 13.7 16.3 18.7 20.8 22.9 24.7
6 5.82 8.38 10.4 12.1 13.7 15.0 16.3 17.5
7 4.99 6.94 8.44 9.70 10.8 11.8 12.7 13.5
8 4.43 6.00 7.18 8.12 9.03 9.78 10.5 11.1
9 4.03 5.34 6.31 7.11 7.80 8.41 8.95 9.45
10 3.72 4.85 5.67 6.34 6.92 7.42 7.87 8.28
12 3.28 4.16 4.79 5.30 5.72 6.09 6.42 6.72

3.3 Comparing Variances


We present in this section two popular tests for the hypothesis of equality of
several variances.

Hartley's Test
Assume that we have k samples of equal size n drawn from independent normal populations. Let s₁², s₂², ..., s_k² be the estimates of the corresponding variances computed using (2.1.8).
To test the null hypothesis H0: σ₁² = σ₂² = ... = σ_k², we have to compute the following statistic due to Hartley:

F̂_max = max_i s_i² / min_i s_i².    (3.3.1)

The critical values of the F̂_max statistic are given in Table 3.7. The null hypothesis is rejected (at the significance level α = 0.05) in favor of the alternative that at least one of the variances is different from the others, if the observed value of Hartley's statistic exceeds the corresponding critical value in Table 3.7.

Example 3.3.1: Testing the equality of variances for wafer thickness data
Using formula (2.1.8), we compute the variances for the thickness of sites 1 through 4. In this example, the number of samples k = 4, and the sample size n = 5. We obtain the following results:
s₁² = 2019, s₂² = 8488, s₃² = 2789, s₄² = 5985.

Hartley's statistic equals F̂_max = 8488/2019 = 4.2. The corresponding critical value for n − 1 = 4 and k = 4 is 20.6. Since the critical value exceeds the observed value of the statistic, the null hypothesis is not rejected.
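The test itself is a one-line computation. A sketch, using the site variances computed above:

```python
# Sketch: Hartley's F_max statistic for the site variances of Example 3.3.1.
variances = {"site1": 2019, "site2": 8488, "site3": 2789, "site4": 5985}

f_max = max(variances.values()) / min(variances.values())
print(round(f_max, 1))  # 4.2, well below the critical value 20.6 of Table 3.7
```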

Bartlett's Test
A widely used procedure is Bartlett's test. Unlike Hartley's test, it is applicable for samples of unequal size. The procedure involves computing a statistic whose distribution is closely approximated by the chi-square distribution with k − 1 degrees of freedom, where k is the number of random samples from independent normal populations.
Let n₁, n₂, ..., n_k be the sample sizes, and N = n₁ + ... + n_k. The test statistic is

B = (log_e 10) · q/c,    (3.3.2)

where

q = (N − k) log₁₀ s_p² − Σ_{i=1}^{k} (n_i − 1) log₁₀ s_i²,    (3.3.3)

c = 1 + (1/(3(k − 1))) (Σ_{i=1}^{k} (n_i − 1)⁻¹ − (N − k)⁻¹),    (3.3.4)

and s_p² is the so-called pooled variance of all sample variances s_i², given by

s_p² = Σ_{i=1}^{k} (n_i − 1)s_i² / (N − k).    (3.3.5)

The quantity q is large when the sample variances s_i² differ greatly and is equal to zero when all s_i² are equal. We reject H0 at the significance level α if

B > χ²_α(k − 1).    (3.3.6)

Here χ²_α(k − 1) is the 1 − α quantile of the chi-square distribution with k − 1 degrees of freedom. These values are given in Table 3.5.
It should be noted that Bartlett's test is very sensitive to deviations from normality and should not be applied if the normality assumption of the populations involved is doubtful.

Example 3.3.1 continued

By (3.3.5), s_p² = 4820.2. By (3.3.3), q = 1.104, and by (3.3.4), c = 1.104. Thus the test statistic

B = (log_e 10) · 1.104/1.104 = 2.303.

From Table 3.5 we see that the critical value of chi-square for k − 1 = 3 degrees of freedom and α = 0.05 is 7.81. Thus the null hypothesis is not rejected.
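The same conclusion can be reached directly from the raw data. A sketch using SciPy, whose bartlett function returns a statistic equivalent to B of (3.3.2), computed with natural logarithms:

```python
# Sketch: Bartlett's test on the raw site data of Table 3.6 via SciPy;
# the returned statistic agrees with B = 2.303 computed by hand above.
from scipy import stats

sites = [
    [3238.8, 3201.4, 3223.7, 3213.1, 3123.6],
    [3092.7, 3212.6, 3320.8, 3300.0, 3277.1],
    [3320.1, 3365.6, 3406.2, 3281.6, 3289.1],
    [3487.8, 3291.3, 3336.8, 3312.5, 3369.6],
]

B, p = stats.bartlett(*sites)
print(f"B = {B:.3f}, p = {p:.3f}")  # B ~ 2.30: H0 is not rejected
```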

Table 3.8: Diastolic blood pressure measurement results

person no Left hand Right hand


1 72 74
2 75 75
3 71 72
4 77 78
5 78 76
6 80 82
7 76 76
8 75 78

3.4 Exercises
1. Consider the first and the fourth site data in Table 3.6. Use the t-test to check the hypothesis that μ₁ = μ₄.

2. Use Hartley's test to check the null hypothesis on the equality of variances
in five populations of day 1 through day 5; see Table 4.1.

3. It is assumed that the diastolic blood pressure is smaller, on average, on the


left hand of men than on their right hand. To check this assumption, the blood
pressure was measured in a sample of 8 healthy men of similar age. Table 3.8
presents the measurement results.
Use the data in this table to check the null hypothesis that the mean blood
pressure on both hands is the same.
Hint: Compute the differences "right hand pressure minus left hand pressure"
and construct a confidence interval for the mean value of these differences.

4. The paper by Bisgaard (2002) presents data on the tensile strength of high
voltage electric cables. Each cable is composed of 12 wires. To examine the
tensile strength, nine cables were sampled from those produced. For each cable,
a small piece of each of the 12 wires was subjected to a tensile strength test.
The data are shown in Table 3.9. 1 The box and whisker plot for these data is
shown in Fig. 3.4.
a. Compute the cable tensile strength variances and use Hartley's test to check the null hypothesis that all nine samples are drawn from populations with equal variances.

1 Reprinted from Quality Engineering (2002), Volume 14(4), p. 680, by courtesy of Marcel Dekker, Inc.

Figure 3.4. Box and whisker plot for cable tensile strength data
Hint: max s_i² = 48.4, min s_i² = 10.1. Hartley's test statistic is 4.8, which is smaller than the 0.05 critical level for k = 9 samples and n − 1 = 11, which lies between 8.28 (n − 1 = 10) and 6.72 (n − 1 = 12); see Table 3.7.

b. Use the Kruskal-Wallis procedure to test the null hypothesis that the
mean strength for all nine cables is the same.
Solution. The average ranks for samples 1 through 9 are as shown below:

R̄₁ = 36.3, R̄₂ = 26.9, R̄₃ = 28.3, R̄₄ = 44.3, R̄₅ = 67.6,
R̄₆ = 61.4, R̄₇ = 63.2, R̄₈ = 74.5, R̄₉ = 88.0.

The Kruskal-Wallis statistic (3.2.4) is equal to 45.4, which is far above the 0.005 critical value for ν = I − 1 = 8 degrees of freedom. Thus, we definitely reject the null hypothesis.
c. Using multiple comparisons, determine groups of cables which are similar
with respect to their mean tensile strength.
Solution. Statistix produces the following result at α = 0.05:

There are 3 groups in which the means are not significantly different from
one another.
Group 1: 9,8,7,6,5;
Group 2: 8,5,7,6,4,1;
Group 3: 1,2,3,4,5,6,7.
Bisgaard (2002) notes "that a more detailed examination of the manufacturing
process revealed that the cables had been manufactured from raw materials

Table 3.9: Tensile strength of 12 wires for each of nine cables (Bisgaard 2002,
p.680)
Wire no 1 2 3 4 5 6 7 8 9
1 345 329 340 328 347 341 339 339 342
2 327 327 330 344 341 340 340 340 346
3 335 332 325 342 345 335 342 347 347
4 338 348 328 350 340 336 341 345 348
5 330 337 338 335 350 339 336 350 355
6 334 328 332 332 346 340 342 348 351
7 335 328 335 328 345 342 347 341 333
8 340 330 340 340 342 345 345 342 347
9 337 345 336 335 340 341 341 337 350
10 342 334 339 337 339 338 340 346 347
11 333 328 335 337 330 346 336 340 348
12 335 330 329 340 338 347 342 345 341

taken from two different lots, cable Nos. 1-4 having been made from lot A and
cable Nos. 5-9 from lot B".
Chapter 4

Sources of Uncertainty:
Process and Measurement
Variability

Happy the man, who, studying Nature's laws,
through known effects can trace the secret cause.

Virgil

4.1 Introduction: Sources of Uncertainty


Suppose that we have a production process which is monitored by taking samples
and measuring the parameters of interest. The results of these measurements
are not constant; they are subject to uncertainty. Two main factors are respon-
sible for this uncertainty: changes in the process itself (process variability) and
measurement errors (measurement process variability). Suppose, for example,
that we are interested in controlling the magnesium content in steel rods. In
the normal course of rod production, the magnesium content will vary due to
variations in the chemical content of raw materials, "normal" deviations in the
parameters of the technological process, temperature variations, etc. So, even an
"ideal" laboratory which does absolutely accurate measurements would obtain
variable results. In real life there is no such thing as an "ideal" measurement
laboratory. Suppose that we prepare several specimens which have practically
the same magnesium content. This can be done, for example, by crushing the
steel into powder and subsequent mixing. The results of the chemical analysis

for magnesium content will be different for different samples. The variability is introduced by the operator, measurement instrument bias and "pure" measurement errors, i.e. the uncertainty built into the chemical measurement procedure.
In this chapter we describe several statistical models which allow us to estimate separately the two main contributions to the overall variability: production process variability and measurement process variability. The first model, described in the next section, is a one-way ANOVA with random effects.

4.2 Process and Measurement Variability:


One-way ANOVA with Random Effects
Example 4.2.1: Titanium content in heat-resistant steel
A factory produces a heat-resistant steel alloy with high titanium (Ti) content. To monitor the process, each day several specimens of the alloy are prepared from the production line. Seven samples are prepared from these specimens and sent for analysis to the local laboratory to establish the Ti content. The experiment is designed and organized in such a way that all seven samples from the daily production must theoretically have the same Ti content.
The results are tabulated in Table 4.1 and a box and whisker plot is given in Fig. 4.1. One can see that there are variations between the samples on a given day, and that there is a considerable day-to-day variability of the average Ti content.
The production manager claims that the variations in the measurement results on each of the five days prove that the measurement process is unstable and erroneous. The laboratory chief, on the other hand, claims that the measurements are quite accurate and that the Ti content varies because the production process is not stable enough. As evidence he points to the day-to-day variations in the average Ti content.
Theoretically, in the absence of measurement errors, all seven samples taken from a single day's batch must have the same percentage of titanium, and if the process itself is stable, the day-to-day variation of titanium content must also be very small.
The purpose of our analysis is the investigation of the day-to-day variability and the variability due to the uncertainty introduced by the measurement process. Our analysis is in fact an application of so-called one-way random factor (or random effects) analysis of variance (ANOVA). It is very important to use accurate notation and the appropriate probability model to describe the variation in the results.

Table 4.1: Ti contents of steel rods (×0.01%)

Sample No.   Day 1    Day 2    Day 3    Day 4    Day 5
1            180      172      172      183      173
2            178      178      162      188      163
3            173      170      160      167      168
4            178      178      163      180      170
5            173      165      165      173      157
6            175      170      165      172      174
7            173      177      153      180      171

Day aver.    175.71   172.86   162.86   177.57   168.00
             s₁ = 2.93  s₂ = 4.98  s₃ = 5.76  s₄ = 7.23  s₅ = 6.06

Figure 4.1. The box and whisker plot for the data in Example 4.2.1

The Model
Let i be the day (batch) number, i = 1, 2, ..., I, and let j be the measurement number (i.e. the sample number) within a batch, j = 1, 2, ..., J. X_ij denotes the measurement result for sample j of batch i. We assume that

X_ij = μ + A_i + ε_ij.    (4.2.1)

Table 4.2: The data in matrix form

Sample No   Batch 1   Batch 2   ...   Batch I
1           x_11      x_21      ...   x_I1
2           x_12      x_22      ...   x_I2
...         ...       ...       ...   ...
J           x_1J      x_2J      ...   x_IJ
Mean        x̄_1·      x̄_2·      ...   x̄_I·

Here A_i is the random batch deviation from the overall mean μ. It changes randomly from batch to batch (from day to day), but remains constant for all samples within one day, i.e. A_i remains constant for all measurements made on the samples prepared from the material produced during one day. ε_ij is the random measurement error.
The model (4.2.1) is the so-called single random effect (or single random factor) model. An excellent source on ANOVA with random effects is Sahai and Ageel (2000). The mathematical model of a single-factor model with random effects is described there on pp. 11 and 24.
Table 4.2 presents the data in matrix form. The measurement results from batch (day) i are positioned in the ith column, and x_ij is the measurement result from sample j of batch i.
Further analysis is based on the following assumptions:
(i) The A_i are assumed to be randomly distributed with mean zero and variance σ_A².
(ii) The ε_ij are assumed to be randomly distributed with mean zero and variance σ_e².
(iii) For any pair of observations, the corresponding measurement errors are uncorrelated; any A_i is uncorrelated with any other A_j, i ≠ j, and with any measurement error ε_kl.
It follows from (4.2.1) that

Var[X_ij] = σ_A² + σ_e².    (4.2.2)


Thus the total variance of the measurement results is a sum of two variances, one due to the batch-to-batch variance σ_A² and the other due to the measurement error variance σ_e². The main goal of our analysis is to estimate the components of the variance, σ_A² and σ_e².

Estimation of σ_A and σ_e

First, define the ith batch sample mean

x̄_i· = (x_i1 + x_i2 + ... + x_iJ)/J    (4.2.3)

and the overall sample mean

x̄_·· = (x̄_1· + ... + x̄_I·)/I.    (4.2.4)

Define the sum of squares of the batch sample mean deviations from the overall mean, ss_A, according to the following formula:

ss_A = J Σ_{i=1}^{I} (x̄_i· − x̄_··)².    (4.2.5)

This is the between-batch variation.
The second principal quantity is the within-batch variation ss_e:

ss_e = Σ_{i=1}^{I} Σ_{j=1}^{J} (x_ij − x̄_i·)² = (J − 1)(s₁² + ... + s_I²),    (4.2.6)

where

s_i² = Σ_{j=1}^{J} (x_ij − x̄_i·)² / (J − 1).    (4.2.7)

Now we are ready to present the formulas for the point estimates of σ_e and σ_A:

σ̂_e = √(ss_e/(I(J − 1)));    (4.2.8)

σ̂_A = √((ss_A/(I − 1) − σ̂_e²)/J).    (4.2.9)

If the expression under the square root in (4.2.9) is negative, the estimate of σ_A is set to zero.
Let us outline the theory behind these estimates. Let X̄_i· be the random variable expressing the average for day i, i = 1, ..., I:

X̄_i· = (X_i1 + X_i2 + ... + X_iJ)/J.    (4.2.10)

Denote

X̄_·· = (X̄_1· + ... + X̄_I·)/I,    (4.2.11)

the mean of all observations.



Table 4.3: Analysis of variance for model (4.2.1)

Source of   Degrees of   Sum of    Mean     Expected       F-value
variation   freedom      squares   square   mean square
Between     I − 1        SS_A      MS_A     σ_e² + Jσ_A²   MS_A/MS_e
Within      I(J − 1)     SS_e      MS_e     σ_e²
Total       IJ − 1       SS_T

Denote by MS_A and MS_e the so-called mean squares:

MS_A = SS_A/(I − 1),  MS_e = SS_e/(I(J − 1)).    (4.2.12)

Then it can be proved that the expectations of the mean squares are equal to

E[MS_e] = E[Σ_{i=1}^{I} Σ_{j=1}^{J} (X_ij − X̄_i·)² / (I(J − 1))] = σ_e²    (4.2.13)

and

E[MS_A] = E[J Σ_{i=1}^{I} (X̄_i· − X̄_··)² / (I − 1)] = σ_e² + Jσ_A².    (4.2.14)

Table 4.3 summarizes this information.
Point estimates σ̂_e and σ̂_A in (4.2.8), (4.2.9) are obtained by replacing X_ij, X̄_i· and X̄_·· by the observed values x_ij, x̄_i· and x̄_··, respectively. Section 2.5 of Sahai and Ageel (2000) contains more details on the point estimation of σ_A and σ_e.
It is important to mention that (4.2.13) and (4.2.14) are true without assuming any specific form of a distribution for the random variables A_i and ε_ij.

Example 4.2.1 concluded. Additional statistical checks

Let us complete the computations for the titanium example. Here I = 5 and J = 7. From Table 4.1 it follows that

x̄_·· = (175.71 + 172.86 + 162.86 + 177.57 + 168.00)/5 = 171.40,    (4.2.15)

ss_A = 7((175.71 − 171.40)² + ... + (168.00 − 171.40)²) = 1002.9,    (4.2.16)

ss_e = 6(2.93² + 4.98² + 5.76² + 7.23² + 6.06²) = 933.4.    (4.2.17)


Table 4.4: Optimal number of measurements J* per batch for N = 60

θ     0.025   0.05   0.1   0.2   0.4   0.8   1.0
J*    30      12     10    6     4     3     3

Now by (4.2.8) and (4.2.9),

σ̂_e = √(933.4/30) = 5.58 ≈ 5.6;    (4.2.18)

σ̂_A = √((1002.9/4 − 5.58²)/7) = 5.6.    (4.2.19)

We see, therefore, that the measurement errors and the day-to-day variability of the production process make equal contributions to the total variability of the measurement results.

If the computation results show that σ̂_A is very small, then it is a good idea to test the null hypothesis that σ_A = 0. The corresponding testing procedure is carried out under the normality assumption regarding all random variables involved, A_i and ε_ij. This procedure is based on the so-called F-ratio defined in Table 4.3, and it works as follows.
Compute the ratio

F = (ss_A/(I − 1)) / (ss_e/(I(J − 1))).    (4.2.20)

If F exceeds the α-critical value of the F-statistic with I − 1, I(J − 1) degrees of freedom, F_{I−1,I(J−1)}(α), then the null hypothesis σ_A = 0 is rejected at the significance level α in favor of the alternative σ_A > 0.
For Example 4.2.1, F = 8.06. From Appendix C we see that for α = 0.01, F_{4,30}(0.01) = 4.02. Since this number is smaller than the computed F-value, we reject the null hypothesis at level α = 0.01.
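All the computations of this example, (4.2.8), (4.2.9) and (4.2.20), can be reproduced with a few lines of code. A sketch in Python with NumPy (our choice of tool, not the book's):

```python
# Sketch: one-way random-effects ANOVA for the Ti data of Table 4.1.
import numpy as np

data = np.array([  # columns: days 1..5; rows: samples 1..7
    [180, 172, 172, 183, 173],
    [178, 178, 162, 188, 163],
    [173, 170, 160, 167, 168],
    [178, 178, 163, 180, 170],
    [173, 165, 165, 173, 157],
    [175, 170, 165, 172, 174],
    [173, 177, 153, 180, 171]], dtype=float)

J, I = data.shape                 # J = 7 samples per batch, I = 5 batches
batch_means = data.mean(axis=0)
grand_mean = batch_means.mean()

ss_A = J * ((batch_means - grand_mean) ** 2).sum()   # (4.2.5), ~1002.9
ss_e = ((data - batch_means) ** 2).sum()             # (4.2.6), ~933.4

sigma_e = np.sqrt(ss_e / (I * (J - 1)))                          # (4.2.8)
sigma_A = np.sqrt(max(0.0, (ss_A / (I - 1) - sigma_e**2) / J))   # (4.2.9)
F = (ss_A / (I - 1)) / (ss_e / (I * (J - 1)))                    # (4.2.20)
print(sigma_e, sigma_A, F)        # ~5.6, ~5.6, ~8.06
```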

Optimal Number of Measurements per Batch

The following issue is of theoretical and practical interest. Suppose that we decide to make in total N measurements, where N = I × J, I being the number of batches and J being the number of measurements per batch. Suppose that our purpose is the estimation with maximal precision (i.e. with minimal variance) of the ratio θ = σ_A²/σ_e². The variance in estimating θ depends on the choice of I and J (subject to the constraint on their product I × J = N). Table 4.4 shows the optimal number of measurements per batch J* depending on the assumed value of θ, for N = 60.
Returning to Example 4.2.1, θ̂ ≈ 1. The best allocation of N = 60 experiments would be to take J* = 3 measurements per batch and to take 20 batches.

The general rule is the following: if σ_e ≫ σ_A, i.e. θ is small, it is preferable to take many observations per batch and a few batches. If σ_A ≥ σ_e, it is preferable to take many batches, with few observations per batch. Suppose that for our example we choose I × J = 36. The optimal allocation design would be to take I = 12 batches and J = 3 measurements per batch.

Concluding remarks
In the model considered in this section, apart from the measurement errors, the only additional source of result variability is the process batch-to-batch variation. This was reflected in the choice of model, the one-factor ANOVA with random effects. In production and measurement practice, we meet more complicated situations. Assume, for example, that the batches are highly nonhomogeneous, and each batch is divided into several samples within which the product is expected to have relatively small variations. Formalization of this situation leads to a two-factor hierarchical (or nested) design. This will be considered in the next section.
It happens quite frequently that there are several, most often two, principal factors influencing the measurement process variability. A typical situation with two random factors is described in Sect. 4.4. There we will consider a model with two random sources influencing measurement result variability, one of which is the part-to-part variation, and the other the variability brought into the measurement process by using different operators to carry out the measurements.

4.3 Hierarchical Measurement Design

Our assumption in Example 4.2.1 was that a single batch is completely homogeneous with regard to the titanium content. This, in fact, is valid only in rather special circumstances. For example, each day's production consists of several separately produced portions (e.g. each portion is produced on a different machine), which then go through a mechanical mill and are thoroughly mixed together. So, each batch given to the lab is homogenized. This is a clever technique which helps to avoid machine-to-machine variability. However, the production process does not always allow such homogenization and/or it is in our interest to discover and to evaluate the variability introduced by different machines.
Consider, for example, the following situation. Each day a long metal rod is produced; along its length there might be some variations in titanium content. We might be interested in estimating these variations and in separating them from the daily batch-to-batch variations.
The following design of experiment allows us to investigate the contribution to the total variability of the following three sources: the day-to-day variation; the sample-to-sample variation within one day; and the variability caused by

measurement errors.
We will refer to each day's output as a batch. In our example the batch will be a long metal rod. We analyze I batches. For the analysis, J samples are randomly taken from each batch. In our case it might be a collection of J small specimens cut randomly along the whole length of the rod. Each sample is measured K times in the lab to determine its titanium content. The variation in these K measurements from one sample is introduced solely by the variations in the measurement process. If the measurement process were "ideal", all K measurements would be identical.
Schematically, this measurement design can be presented in the form of a "tree", as shown in Fig. 4.2. In the literature it is often called a "nested" or "hierarchical" design.

"} SAMPLES
PER BATCH

0 Q 0 0 0 0 0 0 0 0 0 K
0 0 0 0 0 0 0 0 0 0 0 MEASUREMENTS
0 0 0 0 0 0 0 0 0 0 0 PER SAMPlE
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0

Figure 4.2. A hierarchical design of measurements

4.3.1 The Mathematical Model of Hierarchical Design

We number the batches i = 1, ..., I, the samples within each batch j = 1, ..., J, and the measurements (tests) within each sample k = 1, 2, ..., K. X_ijk is the measurement result of the kth measurement in the jth sample of the ith batch. We assume that

X_ijk = μ + A_i + B_j(i) + ε_ijk,    (4.3.1)

where μ is the overall mean value, A_i represents the random contribution to μ due to the ith batch, B_j(i) is the random contribution to μ due to sample j nested

within batch i, and ε_ijk is the error term that reflects the measurement error for the kth measurement of the jth sample in the ith batch.
We make the following assumptions:
(i) The A_i are zero-mean random variables with variance σ_A².
(ii) The B_j(i) are zero-mean random variables with variance σ_B².
(iii) The ε_ijk are zero-mean random variables with variance σ_e².
(iv) The A_i and B_j(i) are mutually uncorrelated; the ε_ijk are uncorrelated between themselves and uncorrelated with the A_i and B_j(i).
Our purpose is to estimate σ_e, σ_A and σ_B. We need some notation. The observed value of the random variable X_ijk will be denoted as x_ijk. Let x̄_ij· be the sample average of all K measurements in the jth sample of batch i:

x̄_ij· = (x_ij1 + ... + x_ijK)/K.    (4.3.2)

Let x̄_i·· be the sample average of all measurements in the ith batch:

x̄_i·· = (x̄_i1· + ... + x̄_iJ·)/J.    (4.3.3)

Let x̄_··· be the overall average:

x̄_··· = (x̄_1·· + ... + x̄_I··)/I = Σ_{i=1}^{I} Σ_{j=1}^{J} Σ_{k=1}^{K} x_ijk/(IJK).    (4.3.4)

We will need the following three sums of squares:

ss_e = Σ_{i=1}^{I} Σ_{j=1}^{J} Σ_{k=1}^{K} (x_ijk − x̄_ij·)²,    (4.3.5)

which is called the sum of squares for the pure measurement error;

ss_B(A) = K Σ_{i=1}^{I} Σ_{j=1}^{J} (x̄_ij· − x̄_i··)²,    (4.3.6)

the so-called between-sample sum of squares, and

ss_A = K·J Σ_{i=1}^{I} (x̄_i·· − x̄_···)²,    (4.3.7)

the between-batch sum of squares. Here are the formulas for σ̂_e, σ̂_B and σ̂_A:

σ̂_e = √(ss_e/(IJ(K − 1)));    (4.3.8)

σ̂_B = √((ss_B(A)/(I(J − 1)) − σ̂_e²)/K);    (4.3.9)

σ̂_A = √((ss_A/(I − 1) − σ̂_e² − Kσ̂_B²)/(JK)).    (4.3.10)

Table 4.5: Analysis of variance for model (4.3.1)

Source of    Degrees of   Sum of     Mean      Expected mean square
variation    freedom      squares    square
Due to A     I − 1        ss_A       MS_A      σ_e² + Kσ_B² + JKσ_A²
B within A   I(J − 1)     ss_B(A)    MS_B(A)   σ_e² + Kσ_B²
Error        IJ(K − 1)    ss_e       MS_e      σ_e²
Total        IJK − 1      ss_T

Note that if the expression under the square root in (4.3.9) or (4.3.10) is negative, then the corresponding estimate is set to zero. Table 4.5 summarizes the information used for the point estimation of σ_e, σ_A and σ_B.
It is worth noting that the formulas for the point estimates (4.3.8)-(4.3.10) are derived without using any assumptions regarding the form of the distribution of the random variables A_i, B_j(i), ε_ijk. The normality of their distributions must be assumed at the stage of testing hypotheses regarding the parameters involved.

Example 4.3.1: Hierarchical measurement scheme 1

The data in Table 4.6 are borrowed from Box (1998, p. 174). We have I = 5 batches, J = 2 samples per batch and K = 2 measurements per sample. To compute the first sum of squares, ss_e, let us recall a useful shortcut: if there are two observations in a sample, say x₁ and x₂, then the sum of squares of the deviations from their average is (x₁ − x₂)²/2. Thus

ss_e = (74.1 − 74.3)²/2 + ... + (78.2 − 78.0)²/2 = 0.98.

Then by (4.3.8), σ̂_e = √(0.98/10) = 0.313 ≈ 0.31.
Since there are two samples per batch, we can apply the same shortcut to compute ss_B(A):

ss_B(A) = 2((74.2 − 68.0)²/2 + ... + (81.9 − 78.1)²/2) = 234.0,

and thus by (4.3.9), σ̂_B = 4.83 ≈ 4.8.


The batch averages are

x̄_1·· = (74.2 + 68.0)/2 = 71.1,
x̄_2·· = (75.1 + 71.5)/2 = 73.3,
x̄_3·· = (59.0 + 63.4)/2 = 61.2,
x̄_4·· = (82.0 + 69.8)/2 = 75.9,
1 Reprinted from Quality Engineering (1998-99), Volume 11(1), p. 174, by courtesy of Marcel Dekker, Inc.

Table 4.6: Titanium content ×0.01%

Batch     1      1      2      2      3      3      4      4      5      5
Sample    1      2      1      2      1      2      1      2      1      2
          74.1   68.2   75.4   71.5   59.4   63.2   81.7   69.9   81.7   78.2
          74.3   67.8   74.8   71.5   58.6   63.6   82.3   69.7   82.1   78.0
Average   74.2   68.0   75.1   71.5   59.0   63.4   82.0   69.8   81.9   78.1

x̄_5·· = (81.9 + 78.1)/2 = 80.0,

and the overall average (mean) is

x̄_··· = (71.1 + 73.3 + 61.2 + 75.9 + 80.0)/5 = 72.3.

Then, by (4.3.7),

ss_A = 2·2((71.1 − 72.3)² + ... + (80.0 − 72.3)²) = 791.6.

Now by (4.3.10),

σ̂_A = √((791.6/4 − 0.098 − 2 × 4.83²)/(2 × 2)) = 6.14 ≈ 6.1.

The total variance of X_ijk is Var[X_ijk] = σ_A² + σ_B² + σ_e². Its estimate is 6.14² + 4.83² + 0.31² = 61.1. The greatest contribution to this quantity is due to the batch-to-batch variation, about 62%. The second largest is the sample-to-sample variability within one batch, about 38%. The variability due to measurement errors makes a negligible contribution to the total variability, about 0.2%.
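A sketch of the same computations in Python/NumPy, following (4.3.5)-(4.3.10); the three-dimensional array layout is our own device:

```python
# Sketch: variance-component estimates for the nested design of Example 4.3.1.
import numpy as np

# 5 batches, J = 2 samples per batch, K = 2 measurements per sample
x = np.array([[[74.1, 74.3], [68.2, 67.8]],
              [[75.4, 74.8], [71.5, 71.5]],
              [[59.4, 58.6], [63.2, 63.6]],
              [[81.7, 82.3], [69.9, 69.7]],
              [[81.7, 82.1], [78.2, 78.0]]])
I, J, K = x.shape

m_ij = x.mean(axis=2)           # sample means
m_i = m_ij.mean(axis=1)         # batch means
m = m_i.mean()                  # overall mean

ss_e = ((x - m_ij[..., None]) ** 2).sum()          # (4.3.5), ~0.98
ss_BA = K * ((m_ij - m_i[:, None]) ** 2).sum()     # (4.3.6), ~234.0
ss_A = K * J * ((m_i - m) ** 2).sum()              # (4.3.7), ~791.6

s_e2 = ss_e / (I * J * (K - 1))
s_B2 = max(0.0, (ss_BA / (I * (J - 1)) - s_e2) / K)
s_A2 = max(0.0, (ss_A / (I - 1) - s_e2 - K * s_B2) / (J * K))
print(np.sqrt([s_e2, s_B2, s_A2]))   # ~0.31, ~4.8, ~6.1
```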

4.3.2 Testing the Hypotheses σ_A = 0, σ_B = 0

Hypothesis testing in the random effects model rests on the assumption that all random variables involved are normally distributed.
To test the null hypothesis σ_A² = 0 against the alternative σ_A² > 0 it is necessary to compute the statistic

F_A = (ss_A/(I − 1)) / (ss_B(A)/(I(J − 1))).    (4.3.11)

The null hypothesis is rejected at significance level α if the computed value of F_A exceeds F_{I−1,I(J−1)}(α), which is the α-critical value of the F distribution with ν₁ = I − 1 and ν₂ = I(J − 1) degrees of freedom. These critical values are presented in Appendix C.
Similarly, to test the null hypothesis σ_B² = 0 against the alternative σ_B² > 0, it is necessary to compute the statistic

F_B = (ss_B(A)/(I(J − 1))) / (ss_e/(IJ(K − 1))).    (4.3.12)

The null hypothesis is rejected at significance level α if the computed value of F_B exceeds F_{I(J−1),IJ(K−1)}(α), which is the α-critical value of the F distribution with ν₁ = I(J − 1) and ν₂ = IJ(K − 1) degrees of freedom.
As an exercise, check that for Example 4.3.1 the null hypotheses for σ_A² and σ_B² are rejected at significance levels 0.1 and 0.05, respectively.

4.4 Repeatability and Reproducibility Studies

4.4.1 Introduction and Definitions
The 1994 edition of US National Institute of Standards and Technology (NIST) Technical Note 1297, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, defines the repeatability of measurement results as "closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement".
It goes on to say: "These conditions, called repeatability conditions, include the same measurement procedure, the same observer, the same measuring instrument, used under the same conditions, the same location, repetition over a short period of time" (clause D.1.1.2). Repeatability may be expressed quantitatively in terms of the dispersion (variance) characteristics of the measurement results carried out under repeatability conditions.
The reproducibility of measurement results is defined as "closeness of the agreement between the results of measurements of the same measurand carried out under changed conditions of measurement". The changed conditions may include "principle of measurement, method of measurement, different observer, measuring instrument, location, conditions of use, time. Reproducibility may be expressed quantitatively in terms of the dispersion (variance) characteristics of the results" (clause D.1.1.3).
In all our previous examples, we spoke about "measurements" without specifying the exact measurement conditions, tools and methods. The reader probably understood that all measurements were carried out on a single measurement device, by a single operator, within a short period of time and under permanent, unchanged conditions. In practice, however, all these limitations are rarely satisfied. Measurements are performed on several measurement devices, by several different operators, over a protracted period of time and under variable conditions. All that, of course, increases the variability of the results. Repeatability and reproducibility (R&R) studies are meant to investigate the contribution of various factors to the variability of measurement results. While in Sects. 4.2 and 4.3 our attention was focused on separating the process variability from the variability introduced by the measurements, in this section we focus on the measurement process and study various sources of variability in the measurement process itself. Often in the literature the subject of this chapter is called "gauge capability studies". The word "gauge" means a measuring tool or a means of making an estimate or judgment.

Table 4.7: Measured weights (in grams) of I = 2 pieces of paper taken by J = 5 operators

Item Operator 1 Operator 2 Operator 3 Operator 4 Operator 5


1 3.481 3.448 3.485 3.475 3.472
1 3.477 3.472 3.464 3.472 3.470
1 3.470 3.470 3.477 3.473 3.474
2 3.258 3.254 3.256 3.249 3.241
2 3.254 3.247 3.257 3.238 3.250
2 3.258 3.239 3.245 3.240 3.254

4.4.2 Example of an R&R Study

Example 4.4.1: Weight of paper measured by five operators
This example, borrowed from Vardeman and VanValkenburg (1999), presents a typical framework for R&R studies. Five operators, chosen randomly from the pool of operators available, carried out weight measurements of two items (two pieces of paper). Each operator took three measurements of each item. The measured weights in grams are presented in Table 4.7. 2
There are four sources of the result variability: items, operators, item-operator interactions and measurements. The variation within one cell of Table 4.7, where the item and the operator are the same, reflects the repeatability. The variation due to the operator and the item-operator interaction reflects the reproducibility.
Figure 4.3 illustrates the data structure in general form. It is convenient to represent the measurement data in the form of a matrix. This has I rows, J columns and I × J cells. Here I is the number of items, J is the number of operators. The (i, j)th cell contains K measurements carried out by the same operator on the same item. In Example 4.4.1, I = 2, J = 5, K = 3.

The Model
To give the notions of repeatability and reproducibility more accurate definitions, we need a mathematical model describing the formal structure of the measurement results.
The kth measurement result in the (i, j)th cell is considered as a random variable which will be denoted as X_ijk. We assume that it has the following form:

X_ijk = μ + A_i + B_j + (AB)_ij + ε_ijk.    (4.4.1)

2 Reprinted with permission from Technometrics. Copyright 1999 by the American Statistical Association. All rights reserved.

Fig. 4.3. Graphical representation of a two-factor balanced design model

Here μ is the overall mean; A_i is the random contribution to μ due to item i; B_j is the random contribution to μ due to operator j; (AB)_ij is the random term describing the joint contribution to μ of item i and operator j, which is called the random i-j interaction term; ε_ijk is the "pure" measurement error in the kth measurement for a fixed combination of i and j.
Formally, (4.4.1) is called a two-way crossed design with random factors (or random effects); see Sahai and Ageel (2000, Chapt. 4).
The observed values of X_ijk will be denoted as x_ijk. For example, the observed K measurement results in the (i, j)th cell will be denoted as x_ij1, ..., x_ijK.
We make the following assumptions regarding the random variables A_i, B_j, (AB)_ij and ε_ijk:
(i) The A_i are zero-mean random variables with variance σ_A².
(ii) The B_j are zero-mean random variables with variance σ_B².
(iii) The (AB)_ij are zero-mean random variables with variance σ_AB².
(iv) The ε_ijk are zero-mean random variables with variance σ_e².
(v) All the above-mentioned random variables are mutually uncorrelated.
Assumptions (i)-(v) allow us to obtain point estimates of the variances involved and of some functions of them which are of interest in repeatability and reproducibility studies.
To obtain confidence intervals and to test hypotheses regarding the variances, the following assumption will be made:
(vi) All random variables involved, A_i, B_j, (AB)_ij and ε_ijk, are normally distributed.

Repeatability is measured by σ_e. We use the notation σ_repeat = σ_e. Reproducibility is measured by the square root of the sum of the variance components σ_B² + σ_AB². The quantity

σ_repro = √(σ_B² + σ_AB²)    (4.4.2)

is the standard deviation of measurements made by many operators measuring the same item in the absence of repeatability variation. In R&R studies, an important parameter is

σ_R&R = √(σ_e² + σ_B² + σ_AB²),    (4.4.3)

which reflects the standard deviation of all measurements made on the same item. Finally, another parameter of interest, which is analogous to σ_repro, is

σ_items = √(σ_A² + σ_AB²).    (4.4.4)

Point Estimation of σ_repeat, σ_repro, σ_R&R and σ_items

As already mentioned, there is no need for assumption (vi) in obtaining the point estimates.
The observed measurement results in the (i, j)th cell will be denoted as x_ijk, k = 1, ..., K; X_ijk will denote the corresponding random measurement result. The sample mean of one cell is denoted as x̄_ij·:

x̄_ij· = Σ_{k=1}^{K} x_ijk/K.    (4.4.5)

The sample mean of all cell averages in row i is denoted as x̄_i··:

x̄_i·· = (x̄_i1· + ... + x̄_iJ·)/J.    (4.4.6)

By analogy, the average of all cell means in the jth column is denoted as x̄_·j·:

x̄_·j· = (x̄_1j· + ... + x̄_Ij·)/I.    (4.4.7)

The observed value of the mean of all cell means is denoted as x̄_···:

x̄_··· = Σ_{j=1}^{J} x̄_·j·/J = Σ_{i=1}^{I} x̄_i··/I.    (4.4.8)

First, define the following four sums of squares based on the observed values x_ijk of the random variables X_ijk:

ss_A = J·K·Σ_{i=1}^{I} (x̄_i·· − x̄_···)²,    (4.4.9)

Table 4.8: Analysis of variance for model (4.4.1)

Source of     Degrees of       Sum of    Mean      Expected mean square
variation     freedom          squares   square
Due to A      I − 1            SS_A      MS_A      σ_e² + Kσ_AB² + JKσ_A²
Due to B      J − 1            SS_B      MS_B      σ_e² + Kσ_AB² + IKσ_B²
Interaction   (I − 1)(J − 1)   SS_AB     MS_AB     σ_e² + Kσ_AB²
A × B
Error         IJ(K − 1)        SS_e      MS_e      σ_e²
Total         IJK − 1          SS_T

ss_B = I·K·Σ_{j=1}^{J} (x̄_·j· − x̄_···)²,    (4.4.10)

ss_AB = K Σ_{i=1}^{I} Σ_{j=1}^{J} (x̄_ij· − x̄_i·· − x̄_·j· + x̄_···)²,    (4.4.11)

ss_e = Σ_{i=1}^{I} Σ_{j=1}^{J} Σ_{k=1}^{K} (x_ijk − x̄_ij·)².    (4.4.12)

Let us now define four random sums of squares SS_A, SS_B, SS_AB and SS_e by replacing in (4.4.9)-(4.4.12) the lower-case italic letters x_ijk, x̄_ij·, x̄_i··, x̄_·j· and x̄_··· by the corresponding capital italic letters which denote random variables. In this replacement operation, X̄_ij· is the random mean of all observations in cell (i, j); X̄_i·· is the random mean of all observations in row i; X̄_·j· is the random mean of all observations in column j; X̄_··· denotes the random mean of all I × J × K observations. The properties of these random sums of squares are summarized in Table 4.8.
Replacing the expected values of SS_e, SS_A, SS_B and SS_AB by their observed values ss_e, ss_A, ss_B, ss_AB, and the variances σ_e², σ_A², σ_B², σ_AB² by their estimates, we arrive at the following four equations for the point estimates of the variances in the model:

ss_e/(IJ(K − 1)) = σ̂_e²,    (4.4.13)

ss_A/(I − 1) = σ̂_e² + Kσ̂_AB² + JKσ̂_A²,    (4.4.14)

ss_B/(J − 1) = σ̂_e² + Kσ̂_AB² + IKσ̂_B²,    (4.4.15)

ss_AB/((I − 1)(J − 1)) = σ̂_e² + Kσ̂_AB².    (4.4.16)

Combining the variance estimates, we arrive after a little algebra at the following estimates of the parameters which are of interest in R&R studies:

σ̂_repro² = max(0, ss_B/(KI(J − 1)) + ss_AB/(IK(J − 1)) − ss_e/(IJ(K − 1)K)),    (4.4.17)

σ̂_R&R = √(ss_B/(IK(J − 1)) + ss_AB/(IK(J − 1)) + ss_e/(KIJ)),    (4.4.18)

σ̂_items² = max(0, ss_A/(K(I − 1)J) + ss_AB/(JK(I − 1)) − ss_e/(KIJ(K − 1))).    (4.4.19)

Example 4.4.1 continued

Let us return to the data in Table 4.7. In our case I = 2, J = 5 and K = 3. We omit the routine computations, which we carried out with the two-way ANOVA procedure in Statistix, and present the results:

ss_A = 0.37185;    (4.4.20)
ss_B = 0.000502;
ss_AB = 0.000176;
ss_e = 0.00102.

Using formulas (4.4.13) and (4.4.17)-(4.4.19), we find that

σ̂_repeat = 0.0071;    (4.4.21)
σ̂_repro = 0.0034;
σ̂_R&R = 0.0079;
σ̂_items = 0.16.
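These point estimates can also be reproduced from the raw data of Table 4.7. A sketch with NumPy; the sums of squares follow (4.4.9)-(4.4.12) and the estimates follow (4.4.13) and (4.4.17)-(4.4.19):

```python
# Sketch: R&R point estimates for the paper-weight data of Table 4.7.
import numpy as np

x = np.array([  # shape (I, J, K): 2 items, 5 operators, 3 measurements
  [[3.481, 3.477, 3.470], [3.448, 3.472, 3.470], [3.485, 3.464, 3.477],
   [3.475, 3.472, 3.473], [3.472, 3.470, 3.474]],
  [[3.258, 3.254, 3.258], [3.254, 3.247, 3.239], [3.256, 3.257, 3.245],
   [3.249, 3.238, 3.240], [3.241, 3.250, 3.254]]])
I, J, K = x.shape

m_ij = x.mean(axis=2)
m_i = x.mean(axis=(1, 2))
m_j = x.mean(axis=(0, 2))
m = x.mean()

ss_A = J * K * ((m_i - m) ** 2).sum()                            # (4.4.9)
ss_B = I * K * ((m_j - m) ** 2).sum()                            # (4.4.10)
ss_AB = K * ((m_ij - m_i[:, None] - m_j[None, :] + m) ** 2).sum()  # (4.4.11)
ss_e = ((x - m_ij[..., None]) ** 2).sum()                        # (4.4.12)

s_repeat = np.sqrt(ss_e / (I * J * (K - 1)))
s_repro = np.sqrt(max(0.0, ss_B / (I*K*(J-1)) + ss_AB / (I*K*(J-1))
                        - ss_e / (I*J*K*(K-1))))
s_RR = np.sqrt(ss_B / (I*K*(J-1)) + ss_AB / (I*K*(J-1)) + ss_e / (I*J*K))
s_items = np.sqrt(max(0.0, ss_A / (K*(I-1)*J) + ss_AB / (J*K*(I-1))
                        - ss_e / (K*I*J*(K-1))))
print(s_repeat, s_repro, s_RR, s_items)  # ~0.0071, ~0.0034, ~0.0079, ~0.16
```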

4.4.3 Confidence Intervals for R&R Parameters

This subsection contains rather theoretical material, and the reader can skip the theory at first reading and go directly to the formulas presenting the final results.
We will use the following notation: V ~ χ²(ν) means that the random variable V has a chi-square distribution with ν degrees of freedom. Denote by q_β(ν) the β-quantile of the chi-square distribution with ν degrees of freedom. Thus, if V ~ χ²(ν) then

P(V ≤ q_{1−β}(ν)) = 1 − β,    (4.4.22)
P(V ≤ q_β(ν)) = β,    (4.4.23)
P(q_β(ν) < V ≤ q_{1−β}(ν)) = 1 − 2β.    (4.4.24)

The quantiles of the chi-square distribution are presented in Appendix B for β = 0.01, 0.025, 0.95, 0.975, 0.99.
All our derivations are based on the following claim proved in statistics courses.

Claim 4.4.1
If all random variables in model (4.4.1) have normal distributions, then

SS_A/(σ_e² + Kσ_AB² + JKσ_A²) ~ χ²(I − 1),    (4.4.25)

SS_B/(σ_e² + Kσ_AB² + IKσ_B²) ~ χ²(J − 1),    (4.4.26)

SS_AB/(σ_e² + Kσ_AB²) ~ χ²((I − 1) × (J − 1)),    (4.4.27)

SS_e/σ_e² ~ χ²(IJ(K − 1)).    (4.4.28)

All χ²(·) random variables in (4.4.25)-(4.4.28) are independent.

An immediate application of this claim is the 1 − 2β confidence interval on σ_e. It follows from (4.4.28) that

P(q_β(IJ(K − 1)) ≤ SS_e/σ_e² ≤ q_{1−β}(IJ(K − 1))) = 1 − 2β,    (4.4.29)

P(√(SS_e/q_{1−β}(IJ(K − 1))) ≤ σ_e ≤ √(SS_e/q_β(IJ(K − 1)))) = 1 − 2β.    (4.4.30)

For the data in Example 4.4.1, ss_e = 0.00102, IJ(K − 1) = 20, q_0.025(20) = 9.591, q_0.975(20) = 34.17. By (4.4.30), the 95% confidence interval on σ_e is

[0.0055, 0.010].
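A sketch of this interval computation, using SciPy's chi-square quantile function in place of the tables of Appendix B:

```python
# Sketch: the 95% chi-square confidence interval (4.4.30) for sigma_e.
from scipy.stats import chi2

ss_e, nu = 0.00102, 20            # nu = I*J*(K - 1) = 2*5*2
lower = (ss_e / chi2.ppf(0.975, nu)) ** 0.5
upper = (ss_e / chi2.ppf(0.025, nu)) ** 0.5
print(f"[{lower:.4f}, {upper:.4f}]")   # ~[0.0055, 0.0103]
```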

The confidence interval for σ_repro will be obtained using the so-called M-method developed in reliability theory (see Gertsbakh 1989, Chap. 5). This method works as follows.
Suppose that the random set S_{1−2β} covers, in the three-dimensional parametric space, the point σ² = (σ_B², σ_AB², σ_e²) with probability 1 − 2β. Let m(S_{1−2β}) and M(S_{1−2β}) be the minimum and the maximum of the function ψ(σ²) = σ_B² + σ_AB² on this set. Then the following implication holds:

(σ_B², σ_AB², σ_e²) ∈ S_{1−2β} ⟹ m(S_{1−2β}) ≤ ψ(σ²) ≤ M(S_{1−2β}).    (4.4.31)

Therefore,

P(m(S_{1−2β}) ≤ σ_repro² ≤ M(S_{1−2β})) ≥ 1 − 2β.    (4.4.32)

The choice of the confidence set S_{1−2β} is most important. We choose it according to (4.4.26) of Claim 4.4.1 for I = 2, J = 5, K = 3, β = 0.05. Note that

P(SS_B/q_0.95(4) ≤ σ_e² + 3σ_AB² + 6σ_B² ≤ SS_B/q_0.05(4)) = 0.90.    (4.4.33)

Thus, the desired 90% confidence set is defined by the inequalities

ss_B/q_0.95(4) ≤ σ_e² + 3σ_AB² + 6σ_B² ≤ ss_B/q_0.05(4).    (4.4.34)

It is easy to demonstrate that the function σ_AB² + σ_B² has the following maximal and minimal values on this set:

maximal value M = ss_B/(q_0.05(4)·3) and
minimal value m = ss_B/(q_0.95(4)·6).

Therefore, (m, M) is the interval which contains σ_AB² + σ_B² = σ_repro² with probability at least 0.90.
Now substitute ss_B = 0.000502, q_0.95(4) = 9.488, q_0.05(4) = 0.711 and obtain that the confidence interval for σ_repro is

(√m, √M) = [0.003, 0.015].
Vardeman and VanValkenburg (1999) suggest constructing the confidence intervals for σ_R&R, σ_repro and σ_items using the approximate values of the standard errors of the corresponding parameter estimates. This method is based on the error propagation formula, and we will explain in the next chapter how this method works.

4.5 Measurement Analysis for Destructive Testing

Pull Strength Repeatability: Experiment Description
We have seen so far that the estimation of measurement repeatability is based on repeated measurements carried out under similar conditions (the same item, the same operator, the same instrument, etc.). There are situations in which it is not possible to repeat the measurements. A typical case is an experiment which destroys the measured specimen. For example, measuring the strength or fatigue limit of a mechanical construction destroys the construction. In order to evaluate the repeatability it is necessary to design the experiment in a special way. The following interesting and instructive example is described by Mitchell et al. (1997).
In a hardware production process, a gold wire is used to connect the integrated circuit (IC) to the lead frame. Ultrasonic welding is applied to bond the wire. Quality suffers if the IC and the frame are not well connected. The

Table 4.9: Pull strength in grams for 10 units with 5 wires each

wire Un1 Un2 Un3 Un4 Un5 Un6 Un7 Un8 Un9 Un10
1 11.6 9.7 11.3 10.1 10.9 9.7 9.6 10.7 10.8 9.9
2 11.3 11.2 11.0 10.2 10.8 11.0 10.3 11.2 10.6 9.4
3 10.3 9.8 10.2 10.8 10.3 10.3 9.2 9.9 9.2 11.3
4 11.7 11.0 11.1 8.6 11.7 10.8 9.1 9.7 10.6 10.8
5 10.3 9.6 11.4 10.8 10.6 9.3 10.2 9.7 10.5 11.3

bond pull test is used to pull on the wire to determine the strength required to disconnect the wire from the lead frame.
Two experiments were designed to establish the pull strength repeatability σ_e². In the first experiment, σ_e² was confounded with σ_w², the variability created by the wire position on the production unit. In the second experiment, σ_e² was confounded with the variability σ_u² caused by different production units. The information combined from the two experiments enables separate estimation of σ_e², σ_w², σ_u², as well as of the variability σ_oper² due to different operators.

Experiment 1: Description and Data

Ten units were selected randomly from a batch of units produced under identical conditions. Five wire positions were selected randomly and one operator carried out the pull test for each wire. (The wire positions were the same for all units.) The experiment results are presented in Table 4.9. 1
This experiment in fact has a hierarchical design; see Fig. 4.2. The units play the role of batches, and the wires play the role of the samples within a batch. The only difference is that there are no repeated measurements of the same sample. In the notation of Sect. 4.3, I = 10, J = 5 and K = 1.
The random pull strength X_ij of the jth wire in the ith unit is modeled as

X_ij = μ + A_i + B_j(i) + ε_ij.    (4.5.1)

Here μ is the overall mean value. A_i is the random contribution due to the variability between units, and is a zero-mean random variable with variance σ_u². B_j(i) is the random contribution due to the wire position within a unit, and is a zero-mean random variable with variance σ_w². ε_ij is the random contribution to the pull strength for a fixed unit and a fixed wire position. It is assumed that the ε_ij are zero-mean random variables with variance σ_e².

1 Source: Teresa Mitchell, Victor Hegemann and K.C. Liu, "GRR methodology for destructive testing and quantitative assessment of gauge capability", pp. 47-59, in the collection by Veronica Czitrom and Patrick D. Spagon, Statistical Case Studies for Industrial Process Improvement ©1997. Borrowed with the kind permission of the ASA and SIAM.

Experiment 1: Calculations
Let us modify appropriately the formulas (4.3.5)-(4.3.7) for our data. Obviously, ss_e = 0. Also

ss_B(A) = Σ_{i=1}^{10} Σ_{j=1}^{5} (x_ij − x̄_i·)²,    (4.5.2)

where x_ij is the observed pull strength for unit i and wire j, and x̄_i· is the average pull strength for unit i. Finally,

ss_A = 5·Σ_{i=1}^{10} (x̄_i· − x̄_··)²,    (4.5.3)

where x̄_·· is the overall average pull strength.
Now we have the following two equations (compare with Table 4.5 and set σ_A = σ_u, σ_B = σ_w):

ss_B(A)/(I(J − 1)) = σ̂_e² + σ̂_w²;    (4.5.4)

ss_A/(I − 1) = σ̂_e² + σ̂_w² + Jσ̂_u².    (4.5.5)

It is easy to find the values of ss_A and ss_B(A), for example by using the one-way ANOVA procedure in Statistix. The results are below:
ss_A = 8.12, ss_B(A) = 19.268.
From (4.5.4) and (4.5.5), we find that

σ̂_e² + σ̂_w² = 0.482;    (4.5.6)
σ̂_u² = 0.084.    (4.5.7)

Experiment 2: Description and Data

Thirty units were chosen randomly from the same production lot as in Experiment 1. Three randomly chosen operators carried out the pull strength test. Each operator pulled one wire on each of ten units. The wires subjected to the pulling test had the same positions on each unit. The pull strength measurement results are presented in Table 4.10. 2
Experiment 2 also has a hierarchical design. Operators play the role of batches, and the units are the samples within a batch. Because of the nature of the experiment, repeated measurements for the same sample are not possible.
2 Source: Teresa Mitchell, Victor Hegemann and K.C. Liu, "GRR methodology for destructive testing and quantitative assessment of gauge capability", p. 52, in the collection by Veronica Czitrom and Patrick D. Spagon, Statistical Case Studies for Industrial Process Improvement ©1997. Borrowed with the kind permission of the ASA and SIAM.

Table 4.10: Pull strength in grams for one wire on each of ten units

Unit Operator 1 Operator 2 Operator 3


1 10.3 9.3 11.1
2 10.9 11.4 10.4
3 11.0 10.1 11.5
4 11.7 9.1 11.1
5 9.8 10.0 11.6
6 10.5 9.9 9.7
7 10.5 11.1 10.9
8 10.8 10.9 9.8
9 10.8 10.3 11.4
10 10.3 9.9 10.4

In this experiment I = 3, J = 10 and K = 1. Denote by Y_ij the measured pull strength recorded by operator i on unit j. For each i, j ranges from 1 to 10:

Y_ij = μ* + C_i + A_j(i) + ε*_ij,    (4.5.8)

where μ* is the overall mean, C_i is the variability due to the operator, A_j(i) is the variability due to the unit for a fixed operator, and ε*_ij is the random repeatability contribution to the pull strength for a fixed wire, operator and unit. (Imagine that there exist several absolutely identical copies of the wire bond in a fixed position for each i and j. Each such bond would produce an independent replica of the random variable ε*_ij.) Our assumptions are that all random variables involved have zero mean value and the following variances:

Var[C_i] = σ_oper²;
Var[A_j(i)] = σ_u², equal to the corresponding variance in Experiment 1;
Var[ε*_ij] = σ_e², equal to the variance of ε_ij in Experiment 1.

It is important to note that all units in both experiments are produced under identical conditions, so that the tool repeatability expressed as σ_e² is the same in both experiments.
Similarly to Experiment 1, we will use the following two sums of squares:

ss_A(O) = Σ_{i=1}^{3} Σ_{j=1}^{10} (y_ij − ȳ_i·)²;    (4.5.9)

ss_O = 10 Σ_{i=1}^{3} (ȳ_i· − ȳ_··)².    (4.5.10)
i=1

Here ȳ_i· is the mean of all measurements for operator i, and ȳ_·· is the mean of all 30 observations. Now we have the following two equations (compare with Table 4.5):

ss_A(O)/(I(J − 1)) = σ̂_e² + σ̂_u²;    (4.5.11)

ss_O/(I − 1) = σ̂_e² + σ̂_u² + Jσ̂_oper².    (4.5.12)

Calculations Completed
It is easy to compute, using single-factor ANOVA, that ss_A(O) = 11.55 and ss_O = 1.92. Substituting into (4.5.11) and (4.5.12), we obtain that

σ̂_e² + σ̂_u² = 0.428;    (4.5.13)
σ̂_oper² = 0.053.    (4.5.14)

Combining this with the result of Experiment 1, we obtain the following estimates of all variances and standard deviations appearing in the model:

σ̂_e² = 0.346;  σ̂_e = 0.59;
σ̂_u² = 0.084;  σ̂_u = 0.29;
σ̂_oper² = 0.053;  σ̂_oper = 0.23;
σ̂_w² = 0.136;  σ̂_w = 0.37.

Mitchell et al. (1997) were interested in estimating the parameter σ_meas = √(σ_e² + σ_oper²). This estimate is σ̂_meas = √(0.346 + 0.053) = 0.63.

4.6 Complements and Exercises

1. Range method for estimating repeatability
The following simple procedure produces good point estimates of σ_e in R&R studies; see Montgomery and Runger (1993). Compute, for each item i and each operator j, the range r_{i,j} of the K measurements in the (i, j)th cell; see Table 4.7. For example, for i = 2, j = 1, r_{2,1} = 0.004. Compute the mean range over all I × J cells:

R̄ = Σ_{i=1}^{I} Σ_{j=1}^{J} r_{i,j} / (IJ).    (4.6.1)

Estimate σ_e as

σ̂_e* = R̄/A_K,    (4.6.2)

Table 4.11: Running times of 20 mechanical time fuses measured by operators stopping two independent clocks (Kotz and Johnson 1983, p. 542)

Fuse No First instrument (sec) Second instrument (sec)


1 4.85 5.09
2 4.93 5.04
3 4.75 4.95
4 4.77 5.02
5 4.67 4.90
6 4.87 5.05
7 4.67 4.90
8 4.94 5.15
9 4.85 5.08
10 4.75 4.98
11 4.83 5.04
12 4.92 5.12
13 4.74 4.95
14 4.99 5.23
15 4.88 5.07
16 4.95 5.23
17 4.95 5.16
18 4.93 5.11
19 4.92 5.11
20 4.89 5.08

where A_K is the constant from Table 2.6 and K is the number of measurements made by one operator. In the notation of Table 2.6, K = n = 3. In our case, we must take A₃ = 1.693.
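A sketch of the range method applied to the data of Table 4.7 (the array is the same I × J × K layout used earlier; A₃ = 1.693 as stated above):

```python
# Sketch: the range estimate (4.6.2) of sigma_e for the Table 4.7 data.
import numpy as np

x = np.array([
  [[3.481, 3.477, 3.470], [3.448, 3.472, 3.470], [3.485, 3.464, 3.477],
   [3.475, 3.472, 3.473], [3.472, 3.470, 3.474]],
  [[3.258, 3.254, 3.258], [3.254, 3.247, 3.239], [3.256, 3.257, 3.245],
   [3.249, 3.238, 3.240], [3.241, 3.250, 3.254]]])

ranges = x.max(axis=2) - x.min(axis=2)   # r_ij for each (item, operator) cell
sigma_e_star = ranges.mean() / 1.693     # (4.6.2) with A_3 = 1.693
print(f"{sigma_e_star:.4f}")             # ~0.0070, close to the ANOVA 0.0071
```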

2. Grubbs's method of analysing measurement data when repeated measurements are not available
The data in Table 4.11 give the burning time in seconds of 20 shell fuses. For each shell, the burning time was measured by two independent instruments. For obvious reasons, these are not repeated measurements. 1
Let us formulate the model corresponding to this measurement scheme. Denote by Y_i1 the first instrument's reading for fuse i. We assume that

Y_i1 = X_i + β₁ + ε_i1,    (4.6.3)

1 Borrowed from S. Kotz and N.L. Johnson (eds.), Encyclopedia of Statistical Sciences, Vol. 3, p. 543; copyright ©1983 John Wiley & Sons, Inc. This material is used by permission of John Wiley & Sons, Inc.

where X_i is the true burning time of the ith fuse, β₁ is an unknown constant (instrument 1 bias), and ε_i1 is the random error of instrument 1. It is assumed that X_i and ε_i1 are independent random variables,

X_i ~ N(μ, σ_x²),  ε_i1 ~ N(0, σ_e1²).    (4.6.4)

Similarly, the reading of the second instrument for the ith fuse is

Y_i2 = X_i + β₂ + ε_i2,    (4.6.5)

where β₂ is an unknown constant bias of instrument 2, and ε_i2 is the random measurement error of instrument 2. It is assumed that X_i and ε_i2 are independent,

ε_i2 ~ N(0, σ_e2²).    (4.6.6)

Find estimates of |β₁ − β₂|, σ_x², σ_e1² and σ_e2².
It may seem that there are not enough data for this estimation. The trick is to involve two additional statistics: the sum and the difference of the readings of the instruments.

Solution

Denote SUM_i = Y_i1 + Y_i2 and DIFF_i = Y_i1 − Y_i2. It is easy to prove that

Var[Y_i1] = σ_x² + σ_e1²,    (4.6.7)

Var[Y_i2] = σ_x² + σ_e2²,    (4.6.8)

Var[SUM] = 4σ_x² + σ_e1² + σ_e2²,    (4.6.9)

Var[DIFF] = σ_e1² + σ_e2².    (4.6.10)

It follows from these formulas that

σ_x² = (Var[SUM] − Var[DIFF])/4,    (4.6.11)

β₁ − β₂ = E[DIFF],    (4.6.12)

and

σ_e1² = Var[Y_i1] − σ_x²,    (4.6.13)

σ_e2² = Var[Y_i2] − σ_x².    (4.6.14)
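A sketch implementing these estimators for the data of Table 4.11; NumPy is assumed, and the sample variances and the mean difference replace the theoretical moments:

```python
# Sketch: Grubbs's estimates (4.6.11)-(4.6.14) for the fuse data of Table 4.11.
import numpy as np

y1 = np.array([4.85, 4.93, 4.75, 4.77, 4.67, 4.87, 4.67, 4.94, 4.85, 4.75,
               4.83, 4.92, 4.74, 4.99, 4.88, 4.95, 4.95, 4.93, 4.92, 4.89])
y2 = np.array([5.09, 5.04, 4.95, 5.02, 4.90, 5.05, 4.90, 5.15, 5.08, 4.98,
               5.04, 5.12, 4.95, 5.23, 5.07, 5.23, 5.16, 5.11, 5.11, 5.08])

var_sum = np.var(y1 + y2, ddof=1)
var_diff = np.var(y1 - y2, ddof=1)
s_x2 = (var_sum - var_diff) / 4                 # (4.6.11)
s_e1 = np.var(y1, ddof=1) - s_x2                # (4.6.13)
s_e2 = np.var(y2, ddof=1) - s_x2                # (4.6.14)
theta = (y1 - y2).mean()                        # estimate of beta1 - beta2
print(s_x2, s_e1, s_e2, theta)
```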

3. For the data in Table 4.11, estimate σ_x, σ_e1, σ_e2 and θ = β₁ − β₂.

Good sources on Grubbs's estimation method are the papers by Grubbs (1948, 1973); see also Kotz and Johnson (1983, p. 542). For more than two

instruments a separate treatment is needed, but the principle of using pairwise differences and sums remains the same.

4. For the data of Table 4.11, construct the 95% confidence interval for the parameter θ.
Hint: Use the statistics of the running time differences.

5. Compute the variance of SS_e in Example 4.4.1.

Solution. According to (4.4.28), SS_e/σ_e² has a chi-square distribution with ν = IJ(K − 1) degrees of freedom. Recall that if Y has a chi-square distribution with ν degrees of freedom, then Var[Y] = 2ν. Therefore,

Var[SS_e/σ_e²] = 2ν,

and

Var[SS_e] = 2IJ(K − 1)σ_e⁴.    (4.6.15)

6. Compute the variances of SS_A, SS_B and SS_AB in Example 4.4.1.

Answer:

Var[SS_A] = 2(I − 1)(σ_e² + Kσ_AB² + JKσ_A²)²;    (4.6.16)

Var[SS_B] = 2(J − 1)(σ_e² + Kσ_AB² + IKσ_B²)²;    (4.6.17)

Var[SS_AB] = 2(I − 1)(J − 1)(σ_e² + Kσ_AB²)².    (4.6.18)

7. For Example 4.3.1, check the null hypotheses for σ_A² and σ_B² at significance level 0.05.
Chapter 5

Measurement Uncertainty:
Error Propagation Formula

If a man will begin with certainty, he shall end
in doubts; but if he will be content with doubts,
he shall end in certainties.

Francis Bacon, The Advancement of Learning

5.1 Introduction
So far we have dealt with various aspects of uncertainty in measuring a single quantity. In Sect. 4.4 it was a measurement of weight; in Sect. 4.5 we analyzed results from measuring pull strength. In most real-life situations, the measurement process involves several quantities whose measurement results are subject to uncertainty. To clarify the exposition, let us consider a rather simple example.

Example 5.1.1: Measuring specific weight

Suppose that there is a need to know the specific weight of a certain steel used in metallic constructions. For this purpose, a cylindrical specimen is prepared from the steel, and the weight and volume of this specimen are measured. Denote by D the diameter of the specimen and by L its length. Then the volume of the specimen is V = πD²L/4. If the weight of the specimen is W, then the specific weight ρ equals

ρ = W/V = 4W/(πD²L).    (5.1.1)

Note that all the quantities W, D, L in this formula are obtained as the result of a measurement, i.e. they are subject to random errors. To put it more formally, they are random variables. Therefore the specific weight ρ is also a random variable. We are interested in the variance of ρ, Var[ρ], or in its standard deviation σ_ρ = √(Var[ρ]).
An approximate value of σ_ρ can be found by means of a formula known as the error propagation formula, which is widely used in measurements. Let us first derive it, and then apply it to our example.

5.2 Error Propagation Formula

Let a random variable Y be expressed as a known function f(·) of several other independent random variables X₁, X₂, ..., X_n:

Y = f(X₁, X₂, ..., X_n).    (5.2.1)

Let μ_i and σ_i be the mean and standard deviation, respectively, of X_i. To obtain an approximate expression for Var[Y] we will use an approach known as the delta method; see Taylor (1997, p. 146).
Let us expand Y as a Taylor series around the mean values μ_i of the X_i, i = 1, ..., n:

Y = f(μ₁, μ₂, ..., μ_n) + Σ_{i=1}^{n} ∂f/∂x_i |_{x_i=μ_i} · (X_i − μ_i) + ...    (5.2.2)

In this formula the dots represent the higher-order terms. The partial derivatives are evaluated at the mean values of the respective random variables.
Now ignore all higher-order terms and write:

Y ≈ f(μ₁, μ₂, ..., μ_n) + Σ_{i=1}^{n} ∂f/∂x_i |_{x_i=μ_i} · (X_i − μ_i).    (5.2.3)

Let us compute the variance of the right-hand side of (5.2.3). Denote the partial derivatives by f_i′. Since the terms in the sum are independent, we can use formula (2.1.22):

Var[Y] ≈ Σ_{i=1}^{n} (f_i′)² Var[X_i].    (5.2.4)

From this it follows that

σ_Y ≈ √(Σ_{i=1}^{n} (f_i′)² Var[X_i]).    (5.2.5)

Expression (5.2.3) implies that

μ_Y = E[Y] ≈ f(μ₁, ..., μ_n).    (5.2.6)

We are already familiar with the notion of the coefficient of variation of a positive random variable Y, defined as σ_Y/μ_Y; see (2.1.29). Assuming that Y is positive, in measurement practice we often use the term relative standard deviation (RSD) instead of coefficient of variation. Let us write the formula for the RSD of Y:

σ_Y/μ_Y ≈ √(Σ_{i=1}^{n} (f_i′)² (σ_i/μ_Y)²).    (5.2.7)

The use of (5.2.5) and (5.2.7) is a valid operation if the RSDs of the random variables X_i, σ_i/μ_i, are small, say no greater than 0.05. In other words, it will be assumed that

σ_i/μ_i ≤ 0.05.    (5.2.8)

In measurements, this is usually a valid assumption.
In measurement practice, we do not know, however, the mean values of the random variables Y, X₁, ..., X_n. What we do know are their observed values, which we denote as y (or ŷ) and x₁, ..., x_n. In the case where the relative standard deviations of all random variables are small, it is a valid operation to replace the theoretical means μ_i by x_i. Thus, we arrive at the following version of (5.2.7):

σ_y/ŷ ≈ √(Σ_{i=1}^{n} (f̂_i′)² (σ_i/ŷ)²),    (5.2.9)

where f̂_i′ is the partial derivative of f(·) with respect to x_i, evaluated at the observed values x_i, i = 1, ..., n. (5.2.9) is known as the error propagation formula (EPF). Its equivalent version is

σ_y ≈ √(Σ_{i=1}^{n} (f̂_i′)² σ_i²).    (5.2.10)

In practice, we do not usually know the variances σ_i² either, and use their estimates σ̂_i². Then the previous formula takes the form

σ_y ≈ √(Σ_{i=1}^{n} (f̂_i′)² σ̂_i²).    (5.2.11)

Example 5.1.1 continued.


The following measurement results of W, D and L were obtained: W = 25.97
gram; D = 1.012 cm; L = 4.005 cm. Assume that it is known from previous
experience that the standard deviations in measuring the weight W and the
quantities D, L are as follows:
σ_W = 0.05 g; σ_D = 0.002 cm; σ_L = 0.004 cm.
Find the value of ρ and its standard deviation σ_ρ.
Solution
Here ρ plays the role of Y, and W, L, D play the role of X_1, X_2, X_3. The function
f(W, L, D) is

ρ = 4W/(πD²L).    (5.2.12)

The estimated value of ρ is ρ̂ = 4 · 25.97/(π · 1.012² · 4.005) = 8.062 g/cm³.
Compute the partial derivatives ∂ρ/∂W, ∂ρ/∂L, ∂ρ/∂D and evaluate them
at the observed values of W, D and L:

f̂_W′ = ∂ρ/∂W = 4/(πD²L);    (5.2.13)
f̂_L′ = ∂ρ/∂L = −4W/(πD²L²);    (5.2.14)
f̂_D′ = ∂ρ/∂D = −8W/(πLD³).

Substituting into these formulas the observed values of W, D, L, we obtain:

(f̂_W′)² = 0.096;    (5.2.15)
(f̂_L′)² = 4.052;    (5.2.16)
(f̂_D′)² = 253.8.

By (5.2.11) the standard deviation of ρ is

σ_ρ ≈ √(0.096 · 0.05² + 4.052 · 0.004² + 253.8 · 0.002²) = 0.036.    (5.2.17)

It is a common practice to present the computation result in the form ρ̂ ± σ_ρ:
8.062 ± 0.036 g/cm³.
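The whole computation is easy to verify numerically. The following short Python sketch (added here as an illustration; it is not part of the original text) evaluates the partial derivatives of (5.2.12) at the observed values and applies (5.2.11):

import math

# Observed values and standard deviations from Example 5.1.1
w, d, l = 25.97, 1.012, 4.005           # W in g, D and L in cm
s_w, s_d, s_l = 0.05, 0.002, 0.004      # sigma_W, sigma_D, sigma_L

rho = 4 * w / (math.pi * d**2 * l)      # estimate of the specific weight

# Partial derivatives of rho = 4W/(pi D^2 L) at the observed values
f_w = 4 / (math.pi * d**2 * l)          # (5.2.13)
f_l = -4 * w / (math.pi * d**2 * l**2)  # (5.2.14)
f_d = -8 * w / (math.pi * l * d**3)

# Error propagation formula (5.2.11)
s_rho = math.sqrt((f_w*s_w)**2 + (f_l*s_l)**2 + (f_d*s_d)**2)
print(f"rho = {rho:.3f} +- {s_rho:.3f} g/cm^3")   # rho = 8.062 +- 0.036

Running it reproduces the values 0.096, 4.052 and 253.8 for the squared derivatives, and the final result 8.062 ± 0.036 g/cm³.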

5.3 EPF for Particular Cases of Y = f(X_1, ..., X_n)


Case 1. EPF for Y = Σ_{i=1}^n c_i X_i
In this case f is a linear function of its arguments. The Taylor series expansion
produces the same result as formula (2.1.22):

σ_Y = √Var[Y] = √( Σ_{i=1}^n c_i² σ_i² ),    (5.3.1)

where σ_i² = Var[X_i]. (Recall that the X_i are assumed to be independent random
variables.) If the σ_i² are replaced by their estimates σ̂_i², the analogue of (5.2.11)
is

σ_y = √Var[Y] ≈ √( Σ_{i=1}^n c_i² σ̂_i² ).    (5.3.2)

Case 2. EPF for Y = f(X_1, X_2) = X_1/(X_1 + X_2)

In this case,

f_1′ = ∂f/∂X_1 = X_2/(X_1 + X_2)², f_2′ = ∂f/∂X_2 = −X_1/(X_1 + X_2)².    (5.3.3)

If we observe X_1 = x_1, X_2 = x_2, and σ̂_i² is the estimate of σ_i², i = 1, 2, and
f̂ = x_1/(x_1 + x_2), then (5.2.11) takes the form

σ_y ≈ √( σ̂_1² x_2² + σ̂_2² x_1² ) / (x_1 + x_2)².    (5.3.4)

Case 3. EPF for Y = f(X_1, ..., X_k, X_{k+1}, ..., X_n) = Π_{i=1}^k X_i / Π_{i=k+1}^n X_i
Very often the function f(·) is a ratio of products of random variables. For
this particular case, one can check that

(f_i′)² = f²/X_i², i = 1, ..., n.

Replace the X_i by their observed values x_i and denote f(x_1, ..., x_n) = f̂.
Now formula (5.2.11) takes the following elegant form:

σ_y ≈ |f̂| · √( Σ_{i=1}^n σ̂_i²/x_i² ).    (5.3.5)

Another form of this formula is

σ_y/|f̂| ≈ √( Σ_{i=1}^n (σ̂_i/x_i)² ).    (5.3.6)

In words: the relative standard deviation (RSD) of the resulting variable Y is
approximately equal to the square root of the sum of squares of the RSDs of the
variables X_1, ..., X_n.
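Since (5.3.6) is used constantly in practice, it is convenient to wrap it in a small helper. The Python sketch below is a hypothetical illustration (the function name and interface are mine, not from the text):

import math

def rsd_of_ratio(values, sds):
    # RSD of a product/ratio of independent variables by (5.3.6):
    # the squared RSDs of the individual factors simply add.
    return math.sqrt(sum((s / x)**2 for x, s in zip(values, sds)))

# Quick check with Exercise 1 of Sect. 5.4 (rho = W/V); the standard
# deviations are recovered from the given RSDs 0.01 and 0.025:
print(rsd_of_ratio([13.505, 3.12], [0.13505, 0.078]))   # approx 0.027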
Case 4. EPF for Y = f(X_1, X_2, X_3) = √(X_1/C_1 + X_2/C_2 + X_3/C_3)
In this case,

f_i′ = (1/C_i)/(2Y),    (5.3.7)

i = 1, 2, 3. Put σ_i² = Var[X_i]. Then we obtain that

σ_Y ≈ √( σ_1²/C_1² + σ_2²/C_2² + σ_3²/C_3² ) / (2|Y|).    (5.3.8)

In practice, we must replace Y by its estimate (its observed value) ŷ, and σ_i²
by their estimates σ̂_i². Then

σ_y ≈ √( σ̂_1²/C_1² + σ̂_2²/C_2² + σ̂_3²/C_3² ) / (2|ŷ|).    (5.3.9)

We will apply this formula for approximate computation of the standard
deviation of σ̂_R&R defined by (4.4.18); see Exercise 4 in the next section.

Remarks
1. How do we obtain the relative standard deviations of the quantities entering
the formula for σ_y? Generally, there are three ways: statistical experiment,
i.e. previous repeatability/reproducibility studies; certified standards data for
the measuring instrument and/or from manufacturer warranties; expert opin-
ion/analyst judgments.
Useful sources on the methodology and practice of establishing uncertainty
are Eurachem (2000) and Kragten (1994).
Sometimes, the certificate data say that the measurement error of a certain
device has a specific distribution, for example a uniform one in an interval of
given length Δ. Then the corresponding standard deviation is equal to σ =
Δ/√12.
2. It is important to give the following warnings. Often, the EPF underesti-
mates the RSD of the final result. This happens because certain factors which
in reality influence the result and may increase the uncertainty are omitted
from the formula y = f(x_1, ..., x_n). So, for example, the result may depend
on the variations of the ambient temperature during the experiment, while the
temperature is omitted from the input variables on which the output Y depends.
Another reason for underestimation of the uncertainty is underestimation of
component variability. Consider, for example, measuring the concentration c of
some chemical agent, c = W/V, where W is the weight of the material and V is
the volume of the solution. When we measure the weight of the material to be
dissolved, we may take into consideration only the variability introduced by the
weighing process itself and ignore the fact that the agent is not 100% pure. In
fact, the content of the agent has random variations, and an additional statistical
experiment should be carried out to establish the variability introduced by this
factor.

5.4 Exercises

1. Specific weight is computed as ρ = W/V, where W is the weight measurement
and V is the volume measurement. The measurement results are w = 13.505
g, v = 3.12 cm³. The relative standard deviations of the weight and volume
measurements are 0.01 and 0.025, respectively. Find the RSD of the specific
weight.

Answer: By (5.3.6), σ_ρ/ρ̂ ≈ √(0.025² + 0.01²) = 0.027.

2. The volume V of a rectangular prism equals V_1 V_2 V_3, where V_1 and V_2 are the
dimensions of its base and V_3 is its height. Each of these dimensions is measured
with an RSD of 0.01. Find the relative error of the volume measurement.

Answer: √(3 · 0.01²) ≈ 0.017.

3. Potassium hydrogen phthalate has the following chemical formula: C₈H₅O₄K.
Its atomic weight equals 8·W_C + 5·W_H + 4·W_O + W_K, where W_C, W_H, W_O, W_K
are the atomic weights of the elements C, H, O, K, respectively; see Table 2.4. The
standard deviation of the estimate of the atomic weight of each element is given
in the third column of Table 2.4. For example, 0.00058 = √Var[W_C].
Find the approximate standard deviation (standard error) in estimating the
atomic weight of potassium hydrogen phthalate.

Solution: The relevant expression is (5.3.2):

σ_W ≈ √(64 · 0.00058² + 25 · 0.000040² + 16 · 0.00017² + 0.000058²) = 0.0047.

4. Estimation of the standard error of σ̂_R&R

The estimate of σ_R&R is defined by (4.4.18). If we treat σ̂_R&R as a random
variable, then the corresponding formula will be the following:

σ̂_R&R = √( SS_B/(IK(J−1)) + SS_AB/(IK(J−1)) + SS_e/(IJK) );    (5.4.10)

see Case 4 of Sect. 5.3.
Calculate an approximate value of the standard error of σ̂_R&R. Take I =
2, J = 5, K = 3 and σ̂_R&R = 0.0079. Use the results of Example 4.4.1.

Solution. The main difficulty is obtaining estimates of the variances of SS_e, SS_AB
and SS_B. For this purpose, we must use formulas (4.6.15), (4.6.17), (4.6.18).
These contain σ_e², σ_B², σ_AB².

By (4.6.15),

Var[SS_e] ≈ 40 · σ̂_e⁴ = 40 · (0.0071)⁴ = 1.02 · 10⁻⁷.    (5.4.11)

To obtain an estimate of σ²_AB, use the following formula which is derived
from (4.4.16):

σ̂²_AB = max(0, (SS_AB/4 − σ̂_e²)/3).

Substituting into this formula SS_AB = 0.0000176 and σ̂_e² = 0.0071², we see that
σ̂²_AB = 0. Now by (4.4.15), σ̂²_B = 1.25 · 10⁻⁵, and by (4.6.17), (4.6.18),

Var[SS_B] ≈ 8 · (0.0071² + 6 · 1.25 · 10⁻⁵)² = 1.26 · 10⁻⁷, and

Var[SS_AB] ≈ 8 · 0.0071⁴ = 2.03 · 10⁻⁸.

Now substitute all estimates into the formula

St. dev. of σ̂_R&R ≈ √( Var[SS_B]/24² + Var[SS_AB]/24² + Var[SS_e]/30² ) / (2σ̂_R&R).

The final result is 0.0012 and this is the approximate value of the standard
deviation (standard error) of σ̂_R&R.
The above calculations are an implementation of the method suggested by
Vardeman and VanValkenburg (1999) based on the use of EPF.
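The arithmetic of this solution is easy to retrace; here is a minimal Python sketch (my own check, with all numbers taken from the solution above):

import math

sig_e2 = 0.0071**2                          # estimate of sigma_e^2
sig_B2 = 1.25e-5                            # estimate of sigma_B^2
var_SSe = 40 * sig_e2**2                    # (4.6.15): 1.02e-7
var_SSB = 8 * (sig_e2 + 6*sig_B2)**2        # (4.6.17): 1.26e-7
var_SSAB = 8 * sig_e2**2                    # (4.6.18): 2.03e-8

s_RR = 0.0079                               # estimate of sigma_R&R
se = math.sqrt(var_SSB/24**2 + var_SSAB/24**2 + var_SSe/30**2) / (2*s_RR)
print(round(se, 4))                         # 0.0012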
Chapter 6

Calibration of Measurement Instruments

A little inaccuracy sometimes saves tons of explanation.

Saki

6.1 Calibration Curves


6.1.1 Formulation of the Problem
The best way to formulate the problem of calibration is to consider a typical
example. Suppose that we are interested in measuring glucose concentration in
mg/dl in certain substances. For this purpose, a spectrophotometric method is
used. Without going into detail, let us note that the response of the measure-
ment device y is the so-called absorbance, which depends on the concentration
x. So, the measurement instrument provides us with a value y, the response,
when the substance has glucose concentration x. One might say that each mea-
surement instrument is characterized by its own relationship between x and y.
Formally, it seems that we can write
y = f(x), (6.1.1)
where the function f(·) represents this relationship.
It is important to note that what happens in measurements is not a purely
deterministic relationship like (6.1.1). Suppose that we prepare ten portions
of a substance which contains exactly 50 mg/dl of glucose. Then we measure
the response of our instrument to these ten specimens of the substance and we
observe different values of the absorbance. Why does this happen? There might
be many reasons: measurement errors, small variations in the environmental
conditions which influence the absorbance, operator error, etc. So, in fact, the
response at concentration x is a random variable Y. Assuming that x is known
without error, we represent the mathematical model of this situation as follows:

Y = f(x) + ε,    (6.1.2)

where ε is a zero-mean random variable which describes the deviation of the
actual response Y from its "theoretical" value f(x). Then, the mean value of
the "output" Y, given the "input" x, equals f(x). Formally,

E[Y] = f(x).    (6.1.3)
Suppose that we know the relationship y = f(x). This is called the cal-
ibration curve. We are given some substance with an unknown concentration of
glucose. We observe the instrument response Y*, i.e. we observe a single value
of the absorbance. Our task is to determine the true content x*. Figure 6.1
illustrates this typical situation.

Figure 6.1. Example of a calibration curve


Assume for a moment that y = f(x) is a totally deterministic and known
relationship. Then the problem looks simple: x* will be the solution of the
equation y* = f(x*). Mathematicians would write this via the inverse function:

x* = f⁻¹(y*).    (6.1.4)

In real life, our situation is complicated by the fact that we do not know the
exact calibration curve f(x). The best we can achieve is a more or less accurate
estimate of it. Let us use the notation f̂(·) for this estimate. Thus, we will be
solving the equation

y* = f̂(x*)    (6.1.5)

with respect to x*, and this solution will involve random errors. In other words,
the determination of x* involves uncertainty. This uncertainty depends on the
uncertainty involved in constructing the calibration curve.

Table 6.1: Absorbance as a function of glucose in serum

Concentration of glucose in mg/dl, x    Absorbance, y
0                                       0.050
50                                      0.189
100                                     0.326
150                                     0.467
200                                     0.605
400                                     1.156
600                                     1.704

Now we are able to formulate our task as follows:

1) Using the data on the glucose concentration x and the respective ab-
sorbances Y(x), construct the calibration curve f̂(x).
2) Estimate the uncertainty in solving the inverse problem, i.e. estimate the
uncertainty in determining x* for given Y* = y*.

6.1.2 Linear Calibration Curve, Equal Response Variances

Example 6.1.1: Measuring glucose concentration¹
Mandel (1991, p. 74) reports an experiment set up to construct a calibration
curve for an instrument designed to estimate glucose concentration in serum.
For this purpose, a series of samples with exactly known glucose concentration
were analyzed on the instrument, and the so-called absorbance was measured
for each sample. The results are presented in Table 6.1.
The data are a set of pairs [x_i, y_i], i = 1, 2, ..., n, where the x_i are known quan-
tities, i.e. nonrandom variables, and the y_i are the observed values of random vari-
ables, the responses.
Our principal assumption is that there exists the following relationship be-
tween the concentration x and the response (absorbance) Y:

Y = α + βx + ε.    (6.1.6)

Note that we use capital Y for the random response which depends on x and is
influenced also by the random measurement error ε. For the ith measurement, the
relationship (6.1.6) can be rewritten as

Y_i = α + βx_i + ε_i.    (6.1.7)

¹ Reprinted from John Mandel, Evaluation and Control of Measurements (1991), by cour-
tesy of Marcel Dekker, Inc.

We assume that ε_i ~ N(0, σ²), and that the random variables ε_i are independent
for i = 1, ..., n.
It follows from (6.1.7) that the mean response E[Y_i] at point x_i equals
f(x_i) = α + βx_i. This is our "ideal" calibration curve. It is assumed there-
fore that f(x) is a straight line.
Another point of principal importance is that the variance of the measurement
errors σ² does not depend on x_i. This is the so-called equal response variance
case. We will comment later on how a calibration curve might be constructed
when the error variances depend on the input variable values x. In further exposition
we denote by y_i the observed value of Y_i.

Constructing the Calibration Curve

For our model (6.1.7), we have to find the values of two parameters determining
the linear relationship, α (the intercept) and β (the slope). These are found
using the so-called least squares principle. According to this, we take as estimates
of α and β those values α̂ and β̂ which minimize the following sum of squares:

Sum of Squares = Σ_{i=1}^n (y_i − α − βx_i)².    (6.1.8)

The geometry of this approach is illustrated in Fig. 6.2.

Finding the estimates of α and β is a routine task: compute the partial
derivatives of the "Sum of Squares" with respect to α and β, equate them to
zero, and then solve the equations. The details of the solution can be found in
any statistics course; we will present only the results.
First, define the following quantities, which are all we need to extract from
the data:

x̄ = Σ_{i=1}^n x_i/n;    (6.1.9)

ȳ = Σ_{i=1}^n y_i/n;    (6.1.10)

S_xx = Σ_{i=1}^n (x_i − x̄)²;    (6.1.11)

S_yy = Σ_{i=1}^n (y_i − ȳ)²;    (6.1.12)

S_xy = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ).    (6.1.13)

Now the estimates of α and β are:

β̂ = S_xy/S_xx,    (6.1.14)

α̂ = ȳ − β̂x̄.    (6.1.15)

In addition, we are able to find an estimate of σ²:

σ̂² = Σ_{i=1}^n (y_i − α̂ − β̂x_i)² / (n − 2).    (6.1.16)

Figure 6.2. The estimates α̂ and β̂ minimize Σ_{i=1}^n Δ_i².
Example 6.1.1 continued
For the data of Example 6.1.1 we have
x̄ = 214.29; ȳ = 0.6424; S_xx = 273,570; S_xy = 754.26; S_yy = 2.0796;
β̂ = 0.00276; α̂ = 0.0516; σ̂ = 0.00188.
Using Statistix, we obtain all the desired results automatically, as shown in Fig.
6.3; see Analytical Software (2000).
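For readers without Statistix at hand, the same estimates follow directly from (6.1.9)–(6.1.16). A small Python sketch (added here as an illustration) with the data of Table 6.1:

import math

x = [0, 50, 100, 150, 200, 400, 600]
y = [0.050, 0.189, 0.326, 0.467, 0.605, 1.156, 1.704]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n
Sxx = sum((xi - xbar)**2 for xi in x)
Sxy = sum((xi - xbar)*(yi - ybar) for xi, yi in zip(x, y))

beta = Sxy / Sxx                          # (6.1.14)
alpha = ybar - beta * xbar                # (6.1.15)
rss = sum((yi - alpha - beta*xi)**2 for xi, yi in zip(x, y))
sigma = math.sqrt(rss / (n - 2))          # (6.1.16)
print(alpha, beta, sigma)                 # approx 0.0516, 0.00276, 0.00188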

Estimating x* from the Observed Response y*

Suppose we test a certain substance with an unknown value of x. Our instrument
provides us with an observed value of the response y*. Then the unknown value
of the independent variable x, x*, will be obtained by solving the equation

y* = α̂ + β̂x*.    (6.1.17)

Obviously,

x* = (y* − α̂)/β̂.    (6.1.18)

More convenient will be an equivalent formula:

x* = x̄ + (y* − ȳ)/β̂,    (6.1.19)

which one can obtain by substituting (6.1.15) for α̂. It follows from (6.1.19)
that

(x* − x̄)² = (y* − ȳ)²/β̂².    (6.1.20)
UNWEIGHTED LEAST SQUARES LINEAR REGRESSION OF ABSORB

PREDICTOR
VARIABLES    COEFFICIENT    STD ERROR    STUDENT'S T         P

CONSTANT     0.05163        0.00105           49.28      0.0000
CONCENTR     0.00276        3.593E-06        767.30      0.0000

R-SQUARED           1.0000    RESID. MEAN SQUARE (MSE)   3.532E-06
ADJUSTED R-SQUARED  1.0000    STANDARD DEVIATION         0.00188

SOURCE       DF    SS           MS           F           P
REGRESSION    1    2.07954      2.07954      588753.23   0.0000
RESIDUAL      5    1.766E-05    3.532E-06
TOTAL         6    2.07956

CASES INCLUDED 7    MISSING CASES 0
(Simple regression plot: ABSORB versus CONCENTR.)
Figure 6.3. The printout and regression plot produced by Statistix.
Suppose that in Example 6.1.1 we observe the absorbance y* = 0.147. Then
by (6.1.19),
x* = 214.29 + (0.147 - 0.6424)/0.00276 = 34.80.

Uncertainty in Estimating x*
An important practical issue is estimating the uncertainty introduced into the
value of x* by our measurement procedure. Let us have a closer look at (6.1.19).
The right-hand side of it depends on y*, ȳ and β̂. Imagine that we repeated
the whole experiment, for the same fixed values of x_1, ..., x_n. Then we would
observe values of the response variable y different from the previous ones be-
cause of the random errors ε_i. Therefore, we will obtain different values of ȳ and
different values of the estimate β̂. In addition, the response of the measurement
instrument to the same unknown quantity of x will also be different because y*
is subject to random measurement error.
Formally speaking, x* is a random variable depending on the random variables
ȳ, β̂ and y*. To keep the notation simple, we use the same small italic letters
for these random variables.
Our goal is to obtain an approximate expression for the variance of x*. In
practice, the following approximate formula for the variance of x* is used:

Var[x*] ≈ (x* − x̄)² [ (1 + 1/n)/(y* − ȳ)² + 1/(β̂²S_xx) ] · σ̂².    (6.1.21)

(The corresponding expression in Mandel (1991, p. 81) contains a misprint.
Expression (6.1.21) coincides with (5.9) in Miller and Miller (1993, p. 113).)
Let us present the derivation of this formula. It is an application of the
error propagation formula developed in Chap. 5. Although this derivation is
instructive, the reader may skip it and simply use the final result. To derive
(6.1.21) we need to assume that ε_i ~ N(0, σ²).

Derivation of (6.1.21)
The following statistical facts are important:
1. y* is a random variable with variance σ², the same variance possessed by all
observations for the various x-values.
2. y* and ȳ are independent random variables since they depend on independent
sets of observations.
3. The variance of ȳ is σ²/n.
4. The variance of β̂ is

Var[β̂] = σ²/S_xx.    (6.1.22)

5. Under the normality assumptions, y* − ȳ and β̂ are independent random
variables.
Properties 4 and 5 are established in statistics courses.
Expression (6.1.19) is of type Z = C + U/V, where U and V are independent
random variables. Since C is a nonrandom quantity, it does not influence the
variance. The expression U/V is an example of Case 3 in Sect. 5.3. Using (5.3.5),
we obtain that

Var[U/V] ≈ (U²/V²) · [ Var[U]/U² + Var[V]/V² ].    (6.1.23)

Put U = y* − ȳ and V = β̂. Then U²/V² = (x* − x̄)²; see (6.1.20). Thus

Var[x*] ≈ (x* − x̄)² [ (Var[y*] + Var[ȳ])/(y* − ȳ)² + Var[β̂]/β̂² ].    (6.1.24)

Replacing the random variables by their observed values, we arrive at the desired
formula:

Var[x*] ≈ (x* − x̄)² [ (1 + 1/n)/(y* − ȳ)² + 1/(β̂²S_xx) ] · σ̂².    (6.1.25)

Example 6.1.1 concluded

Returning to Example 6.1.1, the corresponding numerical result is

Var[x*] ≈ (34.80 − 214.29)² · ( (1 + 1/7)/(0.147 − 0.6424)² + 1/(0.00276² · 273,570) ) × 3.532 · 10⁻⁶ = 0.584,

and

σ* = √Var[x*] ≈ 0.77.
Suppose we think that Var[x*] = 0.584 is too large and we would like to
carry out a new experiment to reduce it. One way of doing so is to divide the
substance with the unknown concentration of glucose into several, say k, portions
and to measure the response for each of these k portions. Then we would observe
not a single value of y*, but a sample of k values {y_1*, ..., y_k*}. Denote by ȳ*
the corresponding mean value:

ȳ* = Σ_{i=1}^k y_i*/k.    (6.1.26)

Now we will use the value ȳ* instead of y*. What will be gained by this? Note
that the variance of ȳ* is k times smaller than the variance of y*. Thus, formula
(6.1.21) will take the form

Var[x*] ≈ (x* − x̄)² [ (1/k + 1/n)/(ȳ* − ȳ)² + 1/(β̂²S_xx) ] · σ̂².    (6.1.27)

For example, let k = 4. Then, with the same numerical values for all other
variables, we will obtain Var[x*] ≈ 0.237, σ* ≈ 0.49, quite a considerable
reduction.
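Both the inversion step (6.1.19) and the uncertainty formula (6.1.27) are equally short in code. The following sketch (an illustration, with the numbers of this example) reproduces Var[x*] = 0.584 for k = 1 and 0.237 for k = 4:

def x_star_and_var(y_star, k, xbar, ybar, beta, Sxx, sigma2, n):
    # Invert the calibration line (6.1.19) and apply (6.1.27);
    # k is the number of replicate responses averaged into y_star.
    x_star = xbar + (y_star - ybar) / beta
    var = (x_star - xbar)**2 * (
        (1/k + 1/n) / (y_star - ybar)**2 + 1 / (beta**2 * Sxx)) * sigma2
    return x_star, var

print(x_star_and_var(0.147, 1, 214.29, 0.6424, 0.00276, 273570, 3.532e-6, 7))
# approx (34.8, 0.584)
print(x_star_and_var(0.147, 4, 214.29, 0.6424, 0.00276, 273570, 3.532e-6, 7)[1])
# approx 0.237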

6.2 Calibration Curve for Nonconstant Response Variance

6.2.1 The Model and the Formulas
In our principal model (6.1.6) it was assumed that the ε_i are zero-mean ran-
dom variables with constant variance. That is to say, it was assumed that the
response variance does not depend on the value of the input variable x. In most
technical applications, the variability of the response does depend on the value
of x. For example, for some instruments, the response variability increases with
the input variable.
So, let us now assume that

Var[ε_i] = σ²(x_i), i = 1, ..., n.    (6.2.1)

The case of nonconstant variances is referred to as heteroscedasticity.

How might one discover this phenomenon? A good piece of advice is to use
visual analysis of residuals. The residual at point x_i is the difference between the
observed value y_i and the corresponding value on the calibration curve α̂ + β̂x_i.
Formally, the residual r_i is defined as

r_i = y_i − α̂ − β̂x_i.    (6.2.2)

If the residuals behave, for example, as shown in Fig. 6.4, then obviously the
variability of the response is increasing with x.

Figure 6.4. The variability of residuals increases with x.

We will assume now that the variance of the response at the input value x_i
has the following form:

σ²(x_i) = K_v/w_i,    (6.2.3)

where K_v is some positive constant and the w_i, the so-called weights, are assumed
to be known exactly or up to a constant multiple. In practice, the weights are
either known from previous experience, or on the basis of a specially designed
experiment. This experiment must include observing several responses at the
x_i values, i = 1, ..., n.
Transformation is often used to obtain a linear calibration curve. Then it
may happen that the transformed responses exhibit heteroscedasticity. The
form of the transformation dictates the choice of weights. We will not go into
the details of finding the weights. There is an ample literature on this issue; see
Madansky (1988, Chapter 2).
One way of obtaining weights is to observe, say, m responses at each x_i and
set the weights to be equal to the inverse of the observed sample variances:

w_i = 1/s_i².    (6.2.4)

So, let us assume that in addition to the data [x_i, y_i, i = 1, 2, ..., n] we have
a collection of weights

[w_i, i = 1, 2, ..., n].    (6.2.5)
In the statistical literature, the procedure of finding the calibration curve
when Var[ε_i] is not constant is called weighted regression. The estimates of the
parameters α and β are found by minimizing the following weighted sum of
squares:

Weighted Sum of Squares = Σ_{i=1}^n w_i(y_i − α − βx_i)².    (6.2.6)

We see therefore that the squared deviation of α + βx_i from y_i is "weighted"
in inverse proportion to the variance of y_i. Thus, less accurate observations get
smaller weight.
Now define the following quantities:

x̄ = Σ_{i=1}^n w_i x_i / Σ_{i=1}^n w_i;    (6.2.7)

ȳ = Σ_{i=1}^n w_i y_i / Σ_{i=1}^n w_i;    (6.2.8)

S_xx = Σ_{i=1}^n w_i(x_i − x̄)²;    (6.2.9)

S_yy = Σ_{i=1}^n w_i(y_i − ȳ)²;    (6.2.10)

S_xy = Σ_{i=1}^n w_i(x_i − x̄)(y_i − ȳ).    (6.2.11)

Table 6.2: Time for water to boil as a function of the amount of water

i    x_i, cm³    t_i, sec    Estimated variance s_i²    Weight w_i
1    100         35          12.5                       0.08
2    200         63          6.1                        0.16
3    300         95          4.5                        0.22
4    400         125         2.0                        0.50
5    500         154         2.0                        0.50

The formulas for the estimates of α and β remain the same:

β̂ = S_xy/S_xx,    (6.2.12)

α̂ = ȳ − β̂x̄.    (6.2.13)

The estimate of σ²(x_i) is now

σ̂²(x_i) = K̂_v/w_i,    (6.2.14)

where K̂_v is defined as

K̂_v = Σ_{i=1}^n w_i(y_i − α̂ − β̂x_i)² / (n − 2).    (6.2.15)

Example 6.2.1: Calibration curve with variable σ²(x_i)

I decided to investigate how long it takes to boil water in my automatic coffee
pot. The time depends on the amount of water. I poured 100, 200, 300, 400
and 500 cm³ of water, and measured the boiling time in seconds. The results
are presented in Table 6.2.
When I repeated the experiment, I noticed relatively large variations in the
boiling time for a small amount of water, while for a large amount of water the
boiling time remains almost constant. I decided to repeat the whole experiment
in order to estimate the variance of the boiling time. The estimated variances
s_i² are given in the fourth column of Table 6.2. The weights w_i were taken as
w_i = 1/s_i².
Figure 6.5 gives the Statistix printout for the weighted regression. The
estimates are α̂ = 4.4708, β̂ = 0.29991 and the estimate of K_v is 0.22644.

WEIGHTED LEAST SQUARES LINEAR REGRESSION OF TIME

WEIGHTING VARIABLE: WEIGHT

PREDICTOR
VARIABLES    COEFFICIENT    STD ERROR    STUDENT'S T         P

CONSTANT     4.45037        1.33483            3.33      0.0446
WATER        0.29996        0.00335           89.46      0.0000

R-SQUARED           0.9996    RESID. MEAN SQUARE (MSE)   0.22942
ADJUSTED R-SQUARED  0.9995    STANDARD DEVIATION         0.47897

SOURCE       DF    SS         MS         F          P
REGRESSION    1    1836.02    1836.02    8003.03    0.0000
RESIDUAL      3    0.68825    0.22942
TOTAL         4    1836.71

CASES INCLUDED 5    MISSING CASES 0

(Simple regression plot: TIME versus WATER.)
Figure 6.5. The printout and calibration curve for the data in Table 6.2 with
95% confidence belt.

6.2.2 Uncertainty in x* for the Weighted Case

Now suppose that we observe a single value of the response y*. Then, exactly
as in the "regular" (nonweighted) case, the estimate of x* is obtained from the
formula

x* = x̄ + (y* − ȳ)/β̂.    (6.2.16)

Note that in this formula, x̄, ȳ and β̂ are computed from (6.2.7), (6.2.8) and
(6.2.12).
After calculating x* from (6.2.16), we need to establish the uncertainty for
x*. Assume that we know the weight w* which corresponds to the calculated
value of x*. w* can be obtained by means of a specially designed experiment or,
more simply, by interpolating between the weights w_r and w_{r+1} corresponding
to the nearest neighbors of x* from the left and from the right.
The formula for the uncertainty in x* is similar to (6.1.21), with obvious
changes following from introducing weights.
Note that the expression for Var[β̂] is the same as (6.1.22), with obvious
changes in the expression for S_xx; see, for example, Hald (1952, Chapt. 18,
Sect. 6):

Var[x*] ≈ (x* − x̄)² [ (1/w* + 1/Σ_{i=1}^n w_i)/(y* − ȳ)² + 1/(β̂²S_xx) ] · K̂_v.    (6.2.17)

Note that this formula reduces to (6.1.21) if we set all weights w_i ≡ 1.

Example 6.2.1 continued

First compute x̄ = 380.8 and ȳ = 118.7 by (6.2.7) and (6.2.8). The value of β̂
is in the printout: β̂ = 0.29991. Let y* = 75 sec. Then it follows from (6.2.16)
that x* = 235.2. We take the corresponding weight w* = 0.18, by interpolating
between the weights 0.16 and 0.22. From the printout we know that K̂_v =
0.22644. From Table 6.2 it follows that Σ_{i=1}^5 w_i = 1.46 and S_xx = 20,263. Now

Var[x*] ≈ (235.2 − 380.8)² [ (1/0.18 + 1/1.46)/(75 − 118.7)² + 1/(0.3² · 20,263) ] · 0.2264 = 18.3.

Summing up, x* ± √Var[x*] = 235.2 ± 4.3.
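The weighted computations are a one-for-one translation of (6.2.7)–(6.2.17). The Python sketch below (an illustration; the tiny discrepancies with the printout come from the rounding already present in the text) reproduces the numbers of this example:

import math

x = [100, 200, 300, 400, 500]
t = [35, 63, 95, 125, 154]
w = [0.08, 0.16, 0.22, 0.50, 0.50]
n = len(x)

sw = sum(w)
xbar = sum(wi*xi for wi, xi in zip(w, x)) / sw      # (6.2.7): 380.8
ybar = sum(wi*ti for wi, ti in zip(w, t)) / sw      # (6.2.8): 118.7
Sxx = sum(wi*(xi - xbar)**2 for wi, xi in zip(w, x))
Sxy = sum(wi*(xi - xbar)*(ti - ybar) for wi, xi, ti in zip(w, x, t))

beta = Sxy / Sxx                                    # (6.2.12)
alpha = ybar - beta*xbar                            # (6.2.13)
Kv = sum(wi*(ti - alpha - beta*xi)**2
         for wi, xi, ti in zip(w, x, t)) / (n - 2)  # (6.2.15)

y_star, w_star = 75, 0.18
x_star = xbar + (y_star - ybar)/beta                # (6.2.16)
var = (x_star - xbar)**2 * (
    (1/w_star + 1/sw)/(y_star - ybar)**2 + 1/(beta**2*Sxx)) * Kv  # (6.2.17)
print(x_star, math.sqrt(var))                       # approx 235.2, 4.3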

6.3 Calibration Curve When Both Variables Are Subject to Errors

6.3.1 Parameter Estimation
In this section we assume that x is also subject to experimental error. This is a
more realistic situation than the previously considered cases where the xs were
assumed to be known without errors.
Consider, for example, the situation described in Example 6.1.1. In mea-
suring absorbance, the solution of glucose is obtained by dissolving a certain
amount of reagent in water. The exact amount of the reagent and the volume
(or the weight) of the solution are known with error resulting from measurement
errors in weighing and determining the volume. Thus the input value of x fed

into the spectrometer for calibration purposes is also subject to uncertainty, or
to put it simply, to experimental error.
In this section our assumptions are the following:
1. The data is a set of pairs [x_i, y_i, i = 1, 2, ..., n].
2. All values of x_i are observations (sample values) of random variables X_i,
with mean E[X_i] and variance Var[X_i] = σ_x². Note that we treat here only the
case of constant variances, i.e. σ_x² = const.
3. All values of y_i are observations (sample values) of random variables Y_i,
with mean E[Y_i] and variance Var[Y_i] = σ_y². Again, we consider only the case
of constant variances σ_y² = const.
4. X_i, Y_i, i = 1, ..., n, are mutually independent random variables.
5. The ratio of the variances

λ = σ_y²/σ_x²    (6.3.1)

is assumed to be a known quantity. A separate experiment must be carried out
to estimate λ.
6. It is assumed that there is a linear relationship between the mean value
of the input variable X and the mean value of the output variable Y, i.e. we
assume that

E[Y_i] = α + βE[X_i].    (6.3.2)
Our purpose is to find the "best" fit of our data to a linear relationship
Y = a+ßx. To proceed further, we need to put Xi and JIi into similar conditions,
Le. balance the variances. This will be done by introducing a new random
variable

1i* = ~. (6.3.3)

Now obviously, Var[1i*] = Var[Xi]. Our "new" data set is now [Xi,Yt =
Yi/.j)., i = 1,2, ... , n].
It remains true that the mean values of 1i* and the mean value of Xi are
linearly related to each other:
(6.3.4)

Obviously, a and ß are expressed through a* and ß* as

a=a*~, ß=ß*~. (6.3.5)


What is the geometry behind fitting a linear relationship to our data set? In
Sect. 6.1, for the "standard" linear regression curve, we minimized the sum of
squared distances of y_i from the hypothetical regression line. There, Y and X
played different roles, since the x_i were assumed to be known without errors. Now
the situation has changed, and both variables are subject to errors. The principle

of fitting the data set to the linear calibration curve will now be minimizing
the sum of squares of the distances of the points (x_i, y_i*) to the hypothetical
calibration curve y* = α* + β*x. This is illustrated in Fig. 6.6.

Figure 6.6. Constructing the calibration curve by minimizing Σ d_i².

The formula expressing the sum of squares of the distances of the experi-
mental points (x_i, y_i*) to the calibration curve y* = α* + β*x is given by the
following formula known from analytic geometry:

φ(α*, β*) = Σ_{i=1}^n (α* + β*x_i − y_i*)² / (1 + (β*)²).

Now substitute into this formula y_i* = y_i/√λ. After simple algebra we obtain:

ψ(α*, β*) = Σ_{i=1}^n (α* + β*x_i − y_i/√λ)² / (1 + (β*)²).    (6.3.6)

The estimates of α* and β*, denoted as α̂* and β̂* respectively, minimize
the expression (6.3.6). To find them, we have to solve the set of two equations:

∂ψ/∂α* = 0, ∂ψ/∂β* = 0.    (6.3.7)

Let us omit the technical details and present the final result. In the original
coordinates (x, y), the estimates of α and β, denoted as α̂ and β̂, are as follows:

β̂ = ( S_yy − λS_xx + √( (λS_xx − S_yy)² + 4λS_xy² ) ) / (2S_xy),    (6.3.8)

α̂ = ȳ − β̂x̄.    (6.3.9)

Table 6.3: Engine temperature in °C at two locations

i    T_b(i)    T_c(i)
1    73        79
2    132       83
3    211       121
4    254       159
5    305       167

Note that x̄, ȳ, S_xx, S_xy, S_yy are defined by (6.1.9)–(6.1.13), exactly as in the case
of nonrandom xs.
The results (6.3.8) and (6.3.9) can be found in Mandel (1991, Chapt. 5).
Standard texts on measurements and on regression, such as Miller and Miller
(1993), typically do not consider the case of random errors in both variables.

Remark 1.
Suppose λ → ∞. This corresponds to the situation when the xs are measured
without error. If we investigate the formula for β̂ when λ goes to infinity, we
obtain, after some algebra, the usual formula β̂ = S_xy/S_xx. The proof is left
as an exercise.

Example 6.3.1: Relationship between engine temperatures at two locations

An experiment is set up on an engine testing bed to establish the relationship
between the temperature T_b of the crankshaft bearing and the temperature of
the engine cap T_c. The usual engine temperature measurements are made only
from the engine cap, and it is important to evaluate T_b
from the reading of T_c.
Table 6.3 presents the measurement data obtained in the experiment.
It is assumed that there is a linear relationship between the true temperature
at the bearing and the true temperature at the engine cap. It is assumed also
that the variance of the temperature reading inside the engine is four times
as large as the variance in measuring T_c. In our notation, it is assumed that
λ = 1/4 = 0.25. Our goal is to estimate the parameters α and β in the assumed
linear relationship

T_c = α + β·T_b.    (6.3.10)

Solution. Let us return to familiar notation. Denote x_i = T_b(i) and y_i = T_c(i).
Then it follows from Table 6.3 that
x̄ = 195.0, ȳ = 121.8, S_xx = 34,690, S_yy = 6764.8, S_xy = 14,820.
Now by (6.3.8) and (6.3.9) we find that α̂ = 36.0 and β̂ = 0.44. Thus, the best
fit for the experimental data is the relationship

y = 36 + 0.44x, or T_c = 36 + 0.44·T_b.    (6.3.11)

For example, suppose the temperature on the cap is 180°C. Then solving (6.3.11)
for T_b, we obtain T_b = (180 − 36)/0.44 = 327.3°C.
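A direct implementation of (6.3.8) and (6.3.9) confirms these estimates. Below is a Python sketch (an illustration with the data of Table 6.3; the variable names are mine):

import math

tb = [73, 132, 211, 254, 305]      # x: bearing temperature T_b
tc = [79, 83, 121, 159, 167]       # y: cap temperature T_c
lam = 0.25                         # lambda = sigma_y^2 / sigma_x^2
n = len(tb)

xbar = sum(tb)/n
ybar = sum(tc)/n
Sxx = sum((v - xbar)**2 for v in tb)
Syy = sum((v - ybar)**2 for v in tc)
Sxy = sum((u - xbar)*(v - ybar) for u, v in zip(tb, tc))

# (6.3.8) and (6.3.9)
beta = (Syy - lam*Sxx + math.sqrt((lam*Sxx - Syy)**2 + 4*lam*Sxy**2)) / (2*Sxy)
alpha = ybar - beta*xbar
print(alpha, beta)                 # approx 36.0, 0.44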

6.3.2 Uncertainty in x When Both Variables Are Subject to Errors

Suppose that we have a single observation on the response, which we denote as
y. Then the corresponding value of x, denoted as x̂, will be found from the
calibration curve y = α̂ + β̂x by the following formula:

x̂ = (y − α̂)/β̂.    (6.3.12)

Following Mandel (1991, Sect. 5.6), define the following quantities:

d_i = y_i − α̂ − β̂x_i;    (6.3.13)

D² = Σ_{i=1}^n d_i²/(n − 2);    (6.3.14)

(6.3.15)

Q = λ²S_xx + 2λβ̂S_xy + β̂²S_yy.    (6.3.16)

We present the following result by Mandel (1991, p. 93):

(6.3.17)

6.4 Exercises
1. Derive the formula (6.2.17).

2. The following table is based on data on carbon metrology given in Pankratz
(1997).¹
x_i is the assigned carbon content in ppma (parts per million atomic) in
specially prepared specimens. y_i is the average carbon content computed from
five measurements of specimen i carried out during the 6th day of the

¹ Source: Peter P. Pankratz, "Calibration of an FTIR Spectrometer for Measuring Carbon",
in the collection by Veronica Czitrom and Patrick D. Spagon, Statistical Case Studies for
Industrial Process Improvement ©1997. Borrowed with the kind permission of the ASA and
SIAM.

experimentation (Pankratz 1997, p. 32). The sample variance s_i² was computed
from these five measurements, and the weight was set up as w_i = 0.001 × (1/s_i²).

i    x_i       y_i       Sample variance s_i²    Weight w_i
1    0.0158    0.0131    0.000066                15
2    0.0299    0.0180    0.00008                 12.5
3    0.371     0.3678    0.00022                 4.5
4    1.1447    0.9750    0.00076                 1.3
5    2.2403    2.1244    0.0027                  0.4

Compute the parameters of the calibration line y = α + βx using weighted
regression.

Answer: Using the weighted regression procedure in Statistix, we obtain the
following estimates: α̂ = 0.0416; β̂ = 0.8827; K̂_v = 0.15264.

3. Suppose we observe y* = 0.4 in Exercise 2. Find the corresponding value of
x* and the approximate value of √Var[x*].

4. Investigate (6.3.8) when λ → ∞ and prove that β̂ → S_xy/S_xx.

Solution: Represent the square root in (6.3.8) as λS_xx√(1 − γ), where γ =
2S_yy/(λS_xx) − S_yy²/(λ²S_xx²) − 4S_xy²/(λS_xx²), and use the approximation √(1 − γ) ≈
1 − γ/2. Simplify the whole expression and let λ → ∞.
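Spelling the hint out (a step-by-step version added for completeness): since

λS_xx · γ/2 = S_yy − S_yy²/(2λS_xx) − 2S_xy²/S_xx,

we get

β̂ ≈ ( S_yy − λS_xx + λS_xx(1 − γ/2) ) / (2S_xy) = ( S_yy − λS_xx·γ/2 ) / (2S_xy)
  = ( S_yy²/(2λS_xx) + 2S_xy²/S_xx ) / (2S_xy) = S_yy²/(4λS_xxS_xy) + S_xy/S_xx,

and the first term vanishes as λ → ∞, leaving β̂ → S_xy/S_xx.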

5. Suppose that we observe, at each x_i, k_i values of the response variable Y_i and
use for fitting the regression line the data [x_i, ȳ_i, i = 1, ..., n], where ȳ_i is the
average of the k_i observations. Argue that this case can be treated as a weighted
regression with w_i = k_i.

6. Suppose that in Example 6.3.1 we observe y = 180. Compute Var[x̂] using
(6.3.17). Take D² = 146.4.
Chapter 7

Collaborative Studies

The only way you can sometimes achieve a meeting
of minds is by knocking a few heads together.

14,000 Quips and Quotes for Writers and Speakers

7.1 Introduction: The Purpose of Collaborative Studies
So far we have dealt with various measurement problems which arise in a sin-
gle measurement laboratory and which can be resolved within that laboratory.
There often arise, however, measurement-related problems whose analysis and
resolution involve several measurement laboratories. A typical example is the
implementation of a new technology and/or product by enterprises which have
different geographical locations. Suppose a pilot plant located in Texas, USA,
develops a chemical process for producing new artificial fertilizer. An important
property of this new product is that a certain chemical agent, which is respon-
sible for underground water pollution, is present in very small amounts. Later
on, the new process will be implemented in eight branches of the pilot plant
which are located in different parts of the USA and elsewhere. The quality of
the fertilizer produced in each branch will be controlled by the local laboratory.
We are faced with a new and unexpected phenomenon: all branches produce
the fertilizer by the same technology, each branch has a seemingly stable produc-
tion process, all laboratories follow the same instructions, but the measurement
results obtained by different laboratories show a large disparity.
Miller and Miller (1993, p. 85) present two examples of such disparity: "In
one study the level of polyunsaturated fatty acids in a sample of palm oil re-
ported by 16 different laboratories varied from 5.5% to 15% .... Determination

Table 7.1: Results from collaborative tests reported by 11 labs

Lab    1     2     3     4     5     6     7     8     9     10    11
y      16.0  16.1  16.3  17.1  16.5  16.5  16.7  16.9  16.5  16.5  16.5
x      16.0  15.8  16.0  16.8  16.4  16.2  16.7  16.6  16.3  16.5  16.2

of the percentage of aluminium in a sample of limestone in ten laboratories


produced values ranging from 1.11% to 1.9%."
In principle, there are two main reasons for the large disparity in the results:
"usual" variations caused by measurement errors, and the presence of systematic
errors. To clarify the second source of disparity, let us consider a fragment from
Table 1 in Youden and Steiner (1975, p. 73).1
Table 7.1 presents measurement results of the same chemical characteristic
made by 11 laboratories. Each received two specimens with the same amount of
the chemical agent. Denote by x and y the smallest and the largest observation,
respectively.
Youden and Steiner (1975) suggest analyzing the results by means of an
x–y scatter plot shown in Fig. 7.1. The (x, y) points lie quite close to a line
with slope 45°. The regression line of y on x is y = 1.06 + 0.95x.
(Scatter plot of LARGEST versus SMALLEST; both axes run from 15.0 to 17.5.)
Figure 7.1. Scatter plot of y versus x for the data in Table 7.1.

¹ Reprinted, with permission, from the Statistical Manual of the Association of Official
Analytical Chemists (1975). Copyright, 1975, by AOAC INTERNATIONAL.

The explanation is that both measurement results of laboratory i have the
same bias δ_i. Suppose that the true content is 16.2. Then most of the labora-
tories have a positive systematic error.
What is the reason for this systematic error? Probably, deviations by each
local laboratory from the "standard" measurement procedure developed and
recommended by the pilot plant laboratory.
The purpose of collaborative studies (CSs) is twofold. First, it is desirable to
discover the factors which might cause the bias in the measurement process and
eliminate them. This is mainly achieved by carrying out a specially designed
experiment which will be described in the next section (the so-called "rugged-
ness" test). Second, analysis needs to be carried out to locate the laboratories
which have excessive random and/or systematic errors. To achieve this goal, all
laboratories taking part in the CS carry out measurements on a specially se-
lected set of specimens. (All laboratories receive identical sets.) An independent
organization or the pilot laboratory performs the data analysis.

7.2 Ruggedness (Robustness) Test

After development of a new product and/or a new technology, the pilot lab-
oratory which initiates the CS carries out a specially designed experiment in
order to reveal the factors which in the course of the experiment might greatly
influence the measurements of the new product parameters and cause large bias
in the measurement results.
Suppose that a technological process has been developed for producing a new
chemical substance from plants. An important characteristic of this substance
is its purity, mainly affected by small amounts of pesticides in it. Accurate
measurement of the amount of pesticides is crucial for monitoring the production
process.
Suppose that the pilot laboratory has established the following list of factors
which may influence the measurement results: reagent purity (factor A); cata-
lyst presence (factor B); reaction time (factor C); ambient temperature (factor
D); air humidity (factor E); spectrograph type (factor F); and water quality
(factor G).
The next step is crucial: a series of eight experiments is carried out with a
special choice of factor combinations for each experiment; each factor appears
at one of two levels shown in Table 7.2, denoted by either a capital or lower-case
letter. These factor combinations are organized in a special way called an ortho-
gonal design, shown in Table 7.3. Thus, for example, experiment 4 is carried out
using the following combination of factors: low reagent purity (A), no catalyst
(b), reaction time 10 min (c), 20°C ambient temperature (d), etc. The observed
result was y₄ = 0.730. Of course, in all experiments the measurements were
carried out on a specimen with the same amount of impurity.

Table 7.2: Potential influencing factors on measurement results

Factor    Description             Level 1 coding    Level 2 coding
1         Reagent purity          Low → A           High → a
2         Catalyst presence       Yes → B           No → b
3         Reaction time           20 min → C        10 min → c
4         Ambient temp.           25°C → D          20°C → d
5         Air humidity            Low → E           High → e
6         Type of spectrograph    New type → F      Old type → f
7         Water quality           Regular → G       Distilled → g

Table 7.3: Factor combinations in the orthogonal design

Experiment i    Factor levels    Measurement result y_i
1               A B C D E F G    y₁ = 0.836
2               A B c D e f g    y₂ = 0.858
3               A b C d E f g    y₃ = 0.776
4               A b c d e F G    y₄ = 0.730
5               a B C d e F g    y₅ = 0.770
6               a B c d E f G    y₆ = 0.742
7               a b C D e f G    y₇ = 0.823
8               a b c D E F g    y₈ = 0.883

We assume that the following simple additive model of factor influence on
the measurement result is valid:

y₁ = μ + A + B + C + D + E + F + G + ε₁,
y₂ = μ + A + B + c + D + e + f + g + ε₂,
y₃ = μ + A + b + C + d + E + f + g + ε₃,
y₄ = μ + A + b + c + d + e + F + G + ε₄,
y₅ = μ + a + B + C + d + e + F + g + ε₅,    (7.2.1)
y₆ = μ + a + B + c + d + E + f + G + ε₆,
y₇ = μ + a + b + C + D + e + f + G + ε₇,
y₈ = μ + a + b + c + D + E + F + g + ε₈.

Here μ is the overall mean value; the ε_i are assumed to be independent zero-mean
measurement errors with variance σ₀². The special structure of equations (7.2.1)
allows an extremely simple procedure for calculating the contribution of each
of the factors. The rule is as follows. To obtain an estimate of the contribution
of a single factor in the form of A − a, i.e. in the form of the difference in the
measurement results when factor A is replaced by a, add all the y_i containing
the capital-letter level of the factor and subtract the y_i which contain the lower-case
level of the factor. Then divide the result by 4. For example, factor B is
at Level 1 in experiments 1, 2, 5, 6 and at Level 2 in experiments 3, 4, 7, 8; see
Table 7.3. Then

B − b = ( y₁ + y₂ + y₅ + y₆ − (y₃ + y₄ + y₇ + y₈) ) / 4.    (7.2.2)

Let us proceed with a numerical example.

Example 7.2.1: Factors influencing the measurement results in Table 7.3

The calculations reveal the following contributions of the factors:
A − a = −0.0045;
B − b = −0.0015;
C − c = −0.002;
D − d = 0.096;
E − e = 0.014;
F − f = 0.005;
G − g = −0.039.

Figure 7.2. Dot diagram for the effects of factors A, B, ..., G (horizontal scale: factor effects × 100).

The dot diagram in Fig. 7.2 shows that only two factors, D and G, have
a significant influence on the measurement result. The measurement result is
especially sensitive to the ambient temperature. Increasing it by 5°C gives an
increase in the result of ≈ 0.1. So it is recommended to control carefully the
ambient temperature and keep it near to 20°C. In addition, the use of regular water
in the course of the reaction causes a decrease in the result by 0.039. To reduce
disparity in measurement results from different laboratories, it is recommended
that all laboratories control this factor too and use only distilled water.
Another valuable conclusion from the above experiment is that the measure-
ment results are robust (insensitive) with respect to changes in the remaining
five factors.
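The contrast computations are mechanical and easy to check in code. The Python sketch below (an added illustration) encodes the design of Table 7.3 as ±1, one column per factor A, ..., G, and reproduces the seven effects:

design = [                          # +1 = capital-letter level, -1 = lower-case level
    [+1, +1, +1, +1, +1, +1, +1],   # experiment 1: A B C D E F G
    [+1, +1, -1, +1, -1, -1, -1],   # experiment 2: A B c D e f g
    [+1, -1, +1, -1, +1, -1, -1],   # experiment 3: A b C d E f g
    [+1, -1, -1, -1, -1, +1, +1],   # experiment 4: A b c d e F G
    [-1, +1, +1, -1, -1, +1, -1],   # experiment 5: a B C d e F g
    [-1, +1, -1, -1, +1, -1, +1],   # experiment 6: a B c d E f G
    [-1, -1, +1, +1, -1, -1, +1],   # experiment 7: a b C D e f G
    [-1, -1, -1, +1, +1, +1, -1],   # experiment 8: a b c D E F g
]
y = [0.836, 0.858, 0.776, 0.730, 0.770, 0.742, 0.823, 0.883]

for j, name in enumerate("ABCDEFG"):
    effect = sum(design[i][j] * y[i] for i in range(8)) / 4
    print(f"{name} - {name.lower()}: {effect:+.4f}")   # D: +0.0955, G: -0.0390, ...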
The example considered here is the so-called 2³ factorial design. There is a
tremendous literature on using designed experiments to reveal the influencing
factors, including their possible interactions; see e.g. Montgomery (2001).

7.3 Elimination of Outlying Laboratories

Recommendations based on long experience say that the desirable number of
laboratories in the CS must be ten or more. Each laboratory must receive
three, four or five samples which correspond to different levels of the measurand
(say, concentration). In some situations, this level may be known exactly. All
laboratories receive identical sets of samples. It is advised for each laboratory
to carry out at least two independent measurements on each sample.
Table 7.4 summarizes the results of a CS carried out by 11 laboratories.
Each laboratory received three samples and measured each sample twice.¹
We describe below a simple statistical procedure based on ranking the sums
of replicates. Its purpose is to identify the outlying laboratories, i.e. the labo-
ratories with large systematic biases; see Youden and Steiner (1975).
1. Take the sum of both replicates for all laboratories and for all samples.
2. Rank the sums within one sample for all laboratories.
3. Compute the total rank of each laboratory as the sum of its ranks for each
sample.

¹ Reprinted from the Statistical Manual of Official Analytical Chemists (1975). Copyright,
1975, by AOAC INTERNATIONAL.

Table 7.4: Results from collaborative tests reported by 11 laboratories on three
samples; Youden and Steiner 1975, p. 73

Lab    Sample 1      Sample 2      Sample 3
1      21.1  21.4    12.7  12.9    16.0  16.0
2      21.4  21.6    13.2  13.0    16.1  15.8
3      20.8  20.7    13.1  12.8    16.3  16.0
4      21.9  21.6    13.5  13.1    17.1  16.8
5      21.0  20.9    12.9  13.0    16.5  16.4
6      20.9  20.4    12.8  12.7    16.5  16.2
7      21.2  20.9    12.8  12.7    16.7  16.7
8      22.0  21.1    13.0  12.9    16.6  16.9
9      20.7  21.0    12.6  12.9    16.3  16.5
10     20.9  21.3    12.1  12.8    16.5  16.7
11     21.1  20.6    13.0  12.8    16.5  16.2

Table 7.5: Ranking the sums of replicates from Table 7.4

Lab    Sample 1    Rank    Sample 2    Rank    Sample 3    Rank    Total rank
1      42.5        8       25.6        5       32.0        2       15
2      43.0        9       26.2        10      31.9        1       20
3      41.5        2       25.9        8       32.3        3       13
4      43.5        11      26.6        11      33.9        11      33
5      41.9        5       25.9        8       32.9        7       20
6      41.3        1       25.5        3       32.7        4.5     8.5
7      42.1        6       25.5        3       33.4        9       18
8      43.1        10      25.9        8       33.5        10      28
9      41.7        3.5     25.5        3       32.8        6       12.5
10     42.2        7       24.9        1       33.2        8       16
11     41.7        3.5     25.8        6       32.7        4.5     14

Table 7.6: Criterion for rejecting a low or a high ranking laboratory score with
0.05 probability of wrong decision; Youden and Steiner 1975, p. 85

No. of Laboratories    3 Samples    4 Samples    5 Samples
6                      3  18        5  23        7   28
7                      3  21        5  27        8   32
8                      3  24        6  30        9   36
9                      3  27        6  34        9   41
10                     4  29        7  37        10  45
11                     4  32        7  41        11  49
12                     4  35        7  45        11  54
13                     4  38        8  48        12  58
14                     4  41        8  52        12  63
15                     4  44        8  56        13  67
16                     4  47        9  59        13  71
17                     5  49        9  63        14  76

Table 7.5 presents the ranking results of the data from Table 7.4. (In the case
of tied ranks, each result obtains the average of all tied ranks.)
4. Compare the highest and the lowest rank with the lower and upper critical
limits. These limits are given in Table 7.6. They are in fact 0.05 critical values,
i.e. they might be exceeded by chance, in the case of no consistent differences be-
tween laboratories, with probability 0.05.² If a certain laboratory shows a rank
total lying outside one of the limits, this laboratory must be eliminated from
the CS.
From Table 7.5 we see that the total rank of laboratory 4 exceeds the
critical value of 32 for 11 laboratories and three samples. Thus we conclude
that this laboratory has too large a systematic error and we decide to exclude
it from further studies. Note also that the x–y plots (we presented only one, in
Figure 7.1, for Sample 3) point to laboratory 4 as having the largest bias.
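The ranking procedure itself is easy to automate. The following Python sketch (an illustration; it relies on scipy's rankdata, which assigns average ranks to ties, just as in Table 7.5) recomputes the total ranks from the sums of replicates:

from scipy.stats import rankdata

# Sums of the two replicates from Table 7.4, one tuple per laboratory
sums = {
    1: (42.5, 25.6, 32.0), 2: (43.0, 26.2, 31.9), 3: (41.5, 25.9, 32.3),
    4: (43.5, 26.6, 33.9), 5: (41.9, 25.9, 32.9), 6: (41.3, 25.5, 32.7),
    7: (42.1, 25.5, 33.4), 8: (43.1, 25.9, 33.5), 9: (41.7, 25.5, 32.8),
    10: (42.2, 24.9, 33.2), 11: (41.7, 25.8, 32.7),
}
labs = sorted(sums)
total = dict.fromkeys(labs, 0.0)
for j in range(3):                       # rank within each sample
    for lab, r in zip(labs, rankdata([sums[lab][j] for lab in labs])):
        total[lab] += r
print(total)   # laboratory 4 totals 33, above the upper limit 32 of Table 7.6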

7.4 Homogeneity of Experimental Variation

Let us number the laboratories in the CS by i = 1, ..., I and the samples by j =
1, ..., J. The cell (i, j) contains K observations, denoted as x_ijk, k = 1, ..., K.
For the data in Table 7.4, we had at the beginning of the CS I = 11, J = 3, and
K = 2 observations in each cell. After deleting laboratory 4, we preserve the
numbers of the remaining laboratories but keep in mind that we now have 10
laboratories.

² Reprinted from the Statistical Manual of Official Analytical Chemists (1975). Copyright,
1975, by AOAC INTERNATIONAL.

Table 7.7: k*-statistics for 10 laboratories and three samples

Lab No    R_i,1    R_i,2    R_i,3    R_i,1/R̄₁    R_i,2/R̄₂    R_i,3/R̄₃    Row sum
1         0.3      0.2      0.0      0.83         0.87         0.0          1.70
2         0.2      0.2      0.3      0.56         0.87         1.5          2.93
3         0.1      0.3      0.3      0.28         1.30         1.5          3.08
5         0.1      0.1      0.1      0.28         0.43         0.5          1.21
6         0.5      0.1      0.3      1.39         0.43         1.5          3.33
7         0.3      0.1      0.0      0.83         0.43         0.0          1.27
8         0.9      0.1      0.3      2.50         0.43         1.5          4.43
9         0.3      0.3      0.2      0.83         1.30         1.0          3.14
10        0.4      0.7      0.2      1.11         3.04         1.0          5.15
11        0.5      0.2      0.3      1.39         0.87         1.5          3.76
R̄_j       0.36     0.23     0.2

Denote by s_ij the standard deviation among replicates within cell (i, j) and
by s_j the pooled value of the replication standard deviation for sample j.
Mandel (1991, p. 176) suggested studying the experimental cell variability
by using the so-called k-statistic which he defined as

k_ij = s_ij/s_j.    (7.4.1)

We will estimate s_ij via the range of the replicates in the cell (i, j), and s_j
via the average range of all cells in sample j. To compute k_ij, denote by R_i,j the
range of the replicates in the cell (i, j), and by R̄_j the average range for sample
j. For our case of K = 2 observations per cell, R_i,j is the largest observation
minus the smallest in the cell. The value of the k-statistic for cell (i, j) we define
as

k*_ij = R_i,j/R̄_j.    (7.4.2)

Columns 5, 6, 7 of Table 7.7 present the values of k*_ij for the data in Table 7.4.
For visual analysis of the data in Table 7.7 we suggest the display shown in
Figure 7.3. The area (or the height) of the shaded triangle in the (i, j)th cell is
proportional to the value of k*_ij. It is instructive to analyze the display column
by column and row by row. The column analysis may give rise to the question
whether the standard deviation in cell (8,1) or (10,2) is unusually high. This
question may be answered on the basis of tables presented by Mandel (1991,
pp. 237–239). One will find that the critical value of the k-statistic for 10
laboratories and two replicates is 2.32, at the 1% significance level. Thus we
may conclude that the within-cell variability in cells (8,1) and (10,2) is "too
large". This does not mean that the measurement results should be ignored
and the measurements repeated. These results must be marked for a more thorough
investigation of the measurement conditions in the corresponding laboratories
and samples.

Figure 7.3. Display of the k*_ij statistics for the data of Table 7.7.

The row-by-row analysis may serve as a comparison between laboratories.
A row with high values of k*_ij, j = 1, 2, 3, demonstrates that laboratory i has
unusually high standard deviations. This is not a reason for excluding this
laboratory from the CS, but rather a hint to investigate the potential sources of
large within-cell variability in this laboratory. The last column in our display
shows the row sums Σ_{j=1}^3 k*_ij. For the data in the display, there is a suspicion
that the measurement conditions in laboratories 8 and 10 cause the appearance
of greater variability.
Chapter 8

Measurements in Special Circumstances

The cause is hidden, but the result is known.

Ovid, Metamorphoses

8.1 Measurements with Large Round-off Errors

8.1.1 Introduction and Examples
For the sake of simplicity, assume that the measurement results are integers on a
unit-step grid, i.e. the measurement results may be ..., −2, −1, 0, 1, 2, ...
In the notation of Sect. 2.2, the interval between instrument scale readings is
h = 1.
Consider, for example, a situation where the measurand equals μ = 0.3 and
the measurement errors are normally distributed with mean zero and standard
deviation σ = 0.25. So, the measurement result Y (before the rounding-off) is

Y = μ + ε,    (8.1.1)

where the measurement error ε ~ N(0, σ²).
99.7% of the probability mass of Y is concentrated in the interval μ ± 3σ,
i.e. in the interval [−0.45, 1.05]. Taking into account the rounding-off, the prob-
ability of obtaining the result 0 is

Φ((0.5 − μ)/σ) − Φ((−0.5 − μ)/σ) = Φ(0.8) − Φ(−3.2) = 0.7881 − 0.0007 = 0.7874,    (8.1.2)

and the probability of obtaining 1 is

Φ((1.5 − μ)/σ) − Φ((0.5 − μ)/σ) = Φ(4.8) − Φ(0.8) = 1.0000 − 0.7881 = 0.2119.    (8.1.3)

It is easy to calculate that with probability 0.0007 the measurement result will
be −1.
Let us denote by y* the measurement result after the rounding-off, i.e. the
observed measurement result:

Y* = Y + η,    (8.1.4)

where η is the round-off error. In our example, all values of Y which lie in the
interval (−0.5, 0.5] are automatically rounded to y* = 0. All values of Y which
lie in the interval (0.5, 1.5] are automatically rounded to y* = 1, and all values
of Y in the interval (−1.5, −0.5] produce y* = −1.
In terms of Sect. 2.2, we have in this example a situation with a small σ/h
ratio: σ/h = 0.25/1 = 0.25. We termed this situation a special measurement
scheme.
We consider in this section measurements with σ/h ≤ 0.5, i.e. the situa-
tions with σ ≤ 0.5. Our purpose remains estimating the measurand μ and the
standard deviation σ of the measurement error. The presence of large round-
off errors (relative to the standard deviation of the measurement error σ) may
invalidate our "standard" estimation methods. Let us consider a numerical
example.

Example 8.1.1: Bias in averaging

Suppose μ = 0.35 and σ = 0.25. Then we obtain
P₀ = P(Y* = 0) = Φ((0.5 − 0.35)/0.25) − Φ((−0.5 − 0.35)/0.25)
= Φ(0.6) − Φ(−3.4) = 0.7257 − 0.0003 = 0.7254,
P₁ = P(Y* = 1) = Φ((1.5 − 0.35)/0.25) − Φ((0.5 − 0.35)/0.25)
= Φ(4.6) − Φ(0.6) = 1 − 0.7257 = 0.2743.
Thus the mean of the random variable Y* is

E[Y*] ≈ 0.7254 · 0 + 1 · 0.2743 = 0.2743,    (8.1.5)

which is biased with respect to the true value of μ = 0.35.

Suppose that we carry out a series of n independent measurements of the
same measurand and n is "large". Then on n₀ ≈ n · P₀ occasions we will have the
result y* = 0 and on n₁ ≈ n · P₁ occasions we will have the result y* = 1. Since
P₀ + P₁ ≈ 1, the average will be μ̂ = n₁/n ≈ 0.2743.
This is a biased estimate, and most striking is the fact that with the increase
in the sample size n, the sample average approaches the mean of Y*, and thus
the increase in the sample size does not reduce the bias. Formally speaking, our
estimate of μ is not consistent. The standard remedy of increasing the sample
size to obtain a more accurate estimate of the measurand simply does not work
in our new situation with rounding-off.
The bias in this example is not very large. Suppose for a moment that
μ = 0.4 and σ = 0.05. It is easy to compute that practically all the probability
mass of Y* is concentrated in two points, 0 and 1, and that P(Y* = 1) ≈ 0.023.
This gives the mean of Y* ≈ 0.023. The estimate of μ is biased by 0.38.
Consider another example related to estimating σ.

Example 8.1.2: Bias in estimating (1


Consider Y '" N(,." = 0.5,u = 0.25). Practically all measurement results of Y
are concentrated on the support [I' - 3 . (1, I' + 3 . (1] = [-0.25,1.25], and we
observe y* equal to either 0 or 1 with probabilities 0.5.
Consider using ranges to estimate the standard deviation. The theory which
is the basis for obtaining the coefficients An in Table 2.6 presumes that the data
are taken from anormal sampie.
Suppose we have a sampie of n = 5. The probability that all measure-
ment results will be either y* = 0 or y* = 1 is 2 . (0.5)S = 0.0625. This is
the probability of having the estimate of u equal zero. With complementary
probability 0.9375 the observed range will be 1, and in that case the estimate
a = I/As ~ 0.43. On average, our estimate will be 0·0.0625+0.43·0.9375 ~ 0.4,
which is considerably biased with respect to the true value 0.25.
We see, therefore, that rounding-off invalidates the range method, which theoretically, in the absence of round-off, produces unbiased estimates.
Let us try another estimate of σ based on computing the variance of the two-point discrete distribution of the observed random variable Y*.
Suppose we have n = 5 replicates of Y*. With probability 2/32 we will observe either 0 on five occasions or 1 on five occasions. The corresponding variance estimate is zero. With probability 10/32, we will observe (4,1) or (1,4), and the variance estimate will be 4/25. With probability 20/32, we will observe (3,2) or (2,3), and the variance estimate will be 6/25. Thus, the expected value of the estimate will be

0 · (2/32) + (4/25) · (10/32) + (6/25) · (20/32) = 0.2.

The estimate of the standard deviation is σ̂ = √0.2 ≈ 0.447, almost twice the true standard deviation.
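This enumeration can be checked directly: the number of ones among the n = 5 rounded results is binomial with p = 0.5, and a sample with k ones has (biased) variance (k/5)(1 − k/5). A minimal Mathematica sketch of the check:

    probs = Table[Binomial[5, k]*(0.5)^5, {k, 0, 5}];   (* binomial probabilities *)
    vars  = Table[(k/5)*(1 - k/5), {k, 0, 5}];          (* variance of a 0-1 sample with k ones *)
    expected = probs.vars    (* 0.2 *)
    Sqrt[expected]           (* 0.447, against the true sigma = 0.25 *)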

8.1.2 Estimation of μ and σ by the Maximum Likelihood Method: y* = 0 or 1
Let the total number of observations be n, n = n0 + n1, where n0 and n1 are the numbers of observations of the points 0 and 1, respectively. Suppose that all measurements produce the same result, say n = n0. Then our only conclusion will be that the estimate of the mean is μ̂ = 0.

Regarding σ, we can say only that the length of the support of the random variable ε does not exceed h = 1. In practical terms, this means that σ < 1/√12.
From the measurement point of view, the instrument in that case is not accurate enough and should be replaced by one which has a finer scale (i.e. a smaller value of h).
Now let us consider the case where the observations are located at two points, i.e. both n0 ≥ 1 and n1 ≥ 1. Our entire information is, therefore, the two numbers n0 and n1.

Estimation of μ When σ Is Known


Assume σ is known from previous measurement experience. Note that there is always a "naive" estimator of μ based on relative frequencies. We denote it μ_n:

μ_n = (0 · n0 + 1 · n1)/n = n1/n.     (8.1.6)

We believe that the following maximum likelihood estimate of μ, denoted as μ_ML, has better properties than μ_n.
Step 1. Write down the so-called likelihood function which in fact is the probability of observing the sample which we already have:

Lik = [P(Y* = 0)]^n0 · [P(Y* = 1)]^n1,     (8.1.7)

where

P(Y* = 0) = Φ((0.5 − μ)/σ) − Φ((−0.5 − μ)/σ),     (8.1.8)
P(Y* = 1) = Φ((1.5 − μ)/σ) − Φ((0.5 − μ)/σ).

Note that Lik depends on μ only, since σ is assumed to be known.
Step 2. Find the maximum likelihood estimate of μ. The maximum likelihood estimate μ_ML is that value of μ which maximizes the likelihood function, see DeGroot (1975, p. 338). In practice, it is more convenient to maximize the logarithm of the likelihood function (which will produce the same result):

μ_ML = Argmax_μ Log[Lik].     (8.1.9)

The estimation of μ_ML needs special computer software. Below we demonstrate an example using Mathematica; see Wolfram (1999).

Example 8.1.3: Maximum likelihood estimate of μ

Suppose that our data are n0 = 4, n1 = 1. From previous experience it is assumed that σ = 0.3. All computations are presented in the printout in Fig. 8.1. We see, therefore, that μ_ML = 0.256. Note that the naive estimate is μ_n = 0.2.

In[1]:= <<Statistics`ContinuousDistributions`
        ndist = NormalDistribution[0, 1];
        σ = 0.3;
        n0 = 4; n1 = 1;
        p0 = CDF[ndist, (0.5 - μ)/σ] - CDF[ndist, (-0.5 - μ)/σ];
        p1 = CDF[ndist, (1.5 - μ)/σ] - CDF[ndist, (0.5 - μ)/σ];
        Lik = (p0)^n0 * (p1)^n1;
        FindMinimum[-Log[Lik], {μ, 0.4}]
        Plot[Log[Lik], {μ, 0.1, 0.7}]

Out[7]= {2.53282, {μ → 0.256461}}

[Plot of Log[Lik] as a function of μ over 0.1 ≤ μ ≤ 0.7]

Figure 8.1. Mathematica printout of the program for computing μ_ML

Maximum Likelihood When Both μ and σ Are Unknown


Our data remain the same as in the previous case: we observe the result y* = 0 n0 times, and y* = 1 n1 = n − n0 times, n0 > 0, n1 > 0.
Let us first try the method of maximum likelihood also in the case of two unknown parameters. The maximum likelihood function has the same form as (8.1.7), only now both μ and σ are unknown:

Lik = [Φ((0.5 − μ)/σ) − Φ((−0.5 − μ)/σ)]^n0     (8.1.10)
      × [Φ((1.5 − μ)/σ) − Φ((0.5 − μ)/σ)]^n1.

Now we have to find the maximum of Log[Lik] with respect to μ and σ.
The numerical investigation of the likelihood function (8.1.10) has revealed that the surface Lik = ψ(μ, σ) has a long and flat ridge in the area of the maximum. This can be well seen from the contour plot of ψ(μ, σ) in Fig. 8.2. In that situation, the "FindMinimum" operator produces results which, in fact, are not minimum points. Moreover, these results depend on the starting point of the minimum search.

<<Statistics`ContinuousDistributions`
ndist = NormalDistribution[0, 1];
n0 = 3; n1 = 7;
p0 = CDF[ndist, (0.5 - μ)/σ] - CDF[ndist, (-0.5 - μ)/σ];
p1 = CDF[ndist, (1.5 - μ)/σ] - CDF[ndist, (0.5 - μ)/σ];
Lik = (p0)^n0 * (p1)^n1;
ContourPlot[Log[Lik], {μ, 0.48, 0.72}, {σ, 0.05, 0.25}, Contours → 39]

[Contour plot of Log[Lik] in the (μ, σ) plane]

Figure 8.2. The contour plot of the likelihood function (8.1.10); n0 = 3, n1 = 7


For example, when the starting point is (μ = 0.6, σ = 0.1), Mathematica finds (μ_ML = 0.55, σ_ML = 0.1); when the starting point is (μ = 0.7, σ = 0.2), the results are (μ_ML = 0.59, σ_ML = 0.18). The values of ψ(μ_ML, σ_ML) coincide in these two cases up to six significant digits.
What we suggest to do in this unfortunate situation is to use for σ a naive frequency estimator with the Sheppard correction; see Sect. 2.2:

σ̂_n = √(n0·n1/n² − 1/12).     (8.1.11)

Afterwards, substitute σ̂_n into the expression (8.1.10) and optimize the likelihood function with respect to μ only. This becomes a well-defined problem. We present in Table 8.1 the calculation results in a ready-to-use form for all cases of n = 10 measurements with n0 > 0, n1 > 0.

Table 8.1: Estimates μ_ML and σ̂_n for n0 + n1 = 10

n0   n1   μ_ML    σ̂_n
1    9    0.605   0.082
2    8    0.731   0.28
3    7    0.675   0.36
4    6    0.591   0.40
5    5    0.50    0.41
6    4    0.409   0.40
7    3    0.325   0.36
8    2    0.269   0.28
9    1    0.395   0.082
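Any row of Table 8.1 can be reproduced by combining (8.1.11) with a one-dimensional likelihood maximization. A minimal sketch patterned on the printout in Fig. 8.1 (the case n0 = 3, n1 = 7 is taken for illustration; the names mu and sigma are ad hoc):

    <<Statistics`ContinuousDistributions`
    ndist = NormalDistribution[0, 1];
    n0 = 3; n1 = 7; n = n0 + n1;
    sigma = Sqrt[n0*n1/(n*n) - 1/12.]    (* (8.1.11): about 0.36 *)
    p0 = CDF[ndist, (0.5 - mu)/sigma] - CDF[ndist, (-0.5 - mu)/sigma];
    p1 = CDF[ndist, (1.5 - mu)/sigma] - CDF[ndist, (0.5 - mu)/sigma];
    FindMinimum[-(n0*Log[p0] + n1*Log[p1]), {mu, n1/n}]
    (* should reproduce the row n0 = 3, n1 = 7 of Table 8.1: mu -> 0.675 *)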

8.1.3 Estimation of μ and σ: y* = 0, 1, 2

Suppose that the measurand μ = 0.95, and the measurement error ε ~ N(0, σ²), with σ = 0.45. Then the ±3σ zone around μ is [−0.4, 2.3]. Hence three values of y* are likely to be observed: 0, 1 or 2. Suppose that we made n measurements, observing y* = 0 on n0 occasions, y* = 1 on n1 occasions and y* = 2 on n2 = n − n0 − n1 occasions, such that n0, n1, n2 are all nonzero. Our purpose remains to estimate μ and σ.
Let us present first the naive estimates based on the first two moments of the discrete random variable Y*:

E[Y*] = 1 · P(Y* = 1) + 2 · P(Y* = 2).     (8.1.12)

Replacing the probabilities by the relative frequencies, we arrive at the following estimate of μ:

μ̂ = n1/n + 2 · n2/n.     (8.1.13)

For the estimate of σ we suggest the estimate of the standard deviation of Y* with the Sheppard correction; see Sect. 2.2. The variance of Y* equals

Var[Y*] = E[(Y*)²] − (E[Y*])².     (8.1.14)

From here we arrive at the following estimate of σ:

σ̂ = √(1²·n1/n + 2²·n2/n − (n1/n + 2·n2/n)² − 1/12).     (8.1.15)

The maximum likelihood estimates of μ and σ are obtained by maximizing with respect to μ and σ the following expression for the likelihood function:

Lik = [P(Y* = 0)]^n0 · [P(Y* = 1)]^n1 · [P(Y* = 2)]^n2,     (8.1.16)

where

P(Y* = 0) = Φ((0.5 − μ)/σ) − Φ((−0.5 − μ)/σ),     (8.1.17)
P(Y* = 1) = Φ((1.5 − μ)/σ) − Φ((0.5 − μ)/σ),     (8.1.18)
P(Y* = 2) = Φ((2.5 − μ)/σ) − Φ((1.5 − μ)/σ).     (8.1.19)



Table 8.2: Estimates of μ and σ for n0 + n1 + n2 = 10

n0   n1   n2   μ_ML    σ_ML   μ̂       σ̂
3    6    1    0.800   0.52   0.800   0.53
2    7    1    0.898   0.46   0.900   0.45
3    5    2    0.901   0.63   0.900   0.64
2    6    2    1.000   0.56   1.000   0.56

An interesting empirical fact is that the maximum likelihood estimates μ_ML and σ_ML practically coincide with the naive estimates μ̂, σ̂, as Table 8.2 shows.
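The whole procedure for the three-point case, the naive estimates (8.1.13) and (8.1.15) followed by the two-parameter maximization of (8.1.16), can be sketched as follows (the first row of Table 8.2 serves as the data; the names are ad hoc):

    <<Statistics`ContinuousDistributions`
    ndist = NormalDistribution[0, 1];
    n0 = 3; n1 = 6; n2 = 1; n = n0 + n1 + n2;
    muN = N[n1/n + 2*n2/n]                          (* (8.1.13): 0.80 *)
    sigmaN = Sqrt[n1/n + 4.*n2/n - muN^2 - 1/12]    (* (8.1.15): 0.53 *)
    p[k_] := CDF[ndist, (k + 0.5 - mu)/sigma] - CDF[ndist, (k - 0.5 - mu)/sigma]
    FindMinimum[-(n0*Log[p[0]] + n1*Log[p[1]] + n2*Log[p[2]]),
                {mu, muN}, {sigma, sigmaN}]
    (* should give mu -> 0.800, sigma -> 0.52, the first row of Table 8.2 *)

Starting the search from the naive values keeps FindMinimum away from the flat-ridge difficulties described in Sect. 8.1.2.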

8.2 Measurements with Constraints

We will consider this subject on the basis of an example involving measuring angles in a triangle.
Suppose that we made independent measurements of the angles α, β and γ in a triangle. We have obtained α1, α2, ..., αk for angle α; β1, β2, ..., βk for angle β, and γ1, γ2, ..., γk for angle γ.
Of course our estimates α̂, β̂, γ̂ of the angles must satisfy the natural constraint

α̂ + β̂ + γ̂ = 180°.     (8.2.1)

The approach to finding the "best" estimates of the angles is the following: we are looking for such values of α̂, β̂ and γ̂ which minimize the expression

Σ_{i=1}^k (αi − α̂)² + Σ_{i=1}^k (βi − β̂)² + Σ_{i=1}^k (γi − γ̂)²     (8.2.2)

subject to the constraint (8.2.1).
In order to take the constraint into account, let us express γ̂ from (8.2.1) as γ̂ = 180 − α̂ − β̂ and substitute it into (8.2.2). Then our task is to minimize the expression

D(α̂, β̂) = Σ_{i=1}^k (αi − α̂)² + Σ_{i=1}^k (βi − β̂)² + Σ_{i=1}^k (γi − (180 − α̂ − β̂))².     (8.2.3)

To solve this problem, we have to solve the following system of equations:

∂D/∂α̂ = 0,  ∂D/∂β̂ = 0.     (8.2.4)



After simple algebra, these equations take the following form:

2α̂ + β̂ = 180 − γ̄ + ᾱ,     (8.2.5)
α̂ + 2β̂ = 180 + β̄ − γ̄,     (8.2.6)

where ᾱ = Σ_{i=1}^k αi/k, β̄ = Σ_{i=1}^k βi/k and γ̄ = Σ_{i=1}^k γi/k are the respective average values.
It is easy to solve these equations and to obtain that

α̂ = 60 − γ̄/3 − β̄/3 + 2ᾱ/3,     (8.2.7)
β̂ = 60 − γ̄/3 + 2β̄/3 − ᾱ/3.     (8.2.8)

These two equations must be complemented by a third one, which is obtained by substituting the estimates for α̂ and β̂ into the constraint. This equation is

γ̂ = 60 + 2γ̄/3 − β̄/3 − ᾱ/3.     (8.2.9)

Let us now estimate the variance of the angle estimates. Assume that all angles are measured with errors having equal variances σ². Then

Var[ᾱ] = σ²/k = Var[β̄] = Var[γ̄].     (8.2.10)

Since the measurements are independent, it follows from (8.2.7) that

Var[α̂] = Var[γ̄]/9 + Var[β̄]/9 + 4·Var[ᾱ]/9.     (8.2.11)

Using (8.2.10), we obtain that

Var[α̂] = 2σ²/(3k).     (8.2.12)

By symmetry, the variances of β̂ and γ̂ are the same.
It is worth noting that the individual measurements αi, βi, γi do not appear in the results, only the averages ᾱ, β̄ and γ̄. We would obtain exactly the same result if the expression (8.2.3) were written as

(ᾱ − α̂)² + (β̄ − β̂)² + (γ̄ − γ̂)²     (8.2.13)

and minimized subject to the same constraint (8.2.1). This fact has the following geometric interpretation: we minimize the distance from the point (ᾱ, β̄, γ̄) to the hyperplane α̂ + β̂ + γ̂ = 180°.
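Note that (8.2.7)-(8.2.9) can be summarized in one rule: each sample average is corrected by one third of the amount by which ᾱ + β̄ + γ̄ exceeds 180°. A minimal numerical sketch of this rule (the data are those of Exercise 1 in Sect. 8.3, converted to decimal degrees):

    (* averages of the two measurements of each angle *)
    abar = (30 + 30/60. + 29 + 56/60.)/2;
    bbar = (56 + 25/60. + 56 + 45/60.)/2;
    gbar = (93 + 50/60. + 94 + 14/60.)/2;
    excess = abar + bbar + gbar - 180;              (* 50' = 0.8333 degrees *)
    {ahat, bhat, ghat} = {abar, bbar, gbar} - excess/3
    (* {29.94, 56.31, 93.76}, i.e. about 29°56', 56°18', 93°45'; *)
    (* the corrected estimates sum to 180 exactly                *)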

8.3 Exercises

1. In measuring the triangle angles, the following results were obtained:

α1 = 30°30′; α2 = 29°56′;
β1 = 56°25′; β2 = 56°45′;
γ1 = 93°50′; γ2 = 94°14′.

Find the "best" estimates of α, β, γ.

2. For the data of 1, find the estimate of √Var[α̂].

3. In measuring the ohmic resistance of a specimen using a digital instrument, the following results (in ohms) were obtained:

10.1, 10.2, 10.2, 10.1, 10.1.

It is known that the instrument has measurement error ε ~ N(0, σ²) with σ = 0.03. Find the maximum likelihood estimate μ_ML of the ohmic resistance.
Hint. Multiply all measurement results by a factor of 10. This will also increase σ by a factor of 10. Put the zero of the scale at the point 101. Then there are three measurements equal to zero, n0 = 3, and two equal to 1, n1 = 2. Now we are in the situation of Example 8.1.3. On the new scale μ_ML = 0.256, and in the original scale the estimate of the resistance is 10.1 + 0.0256 ≈ 10.13.

4. Suppose that the random variable X ~ U(−0.7, 1.3). The measurements are made on a grid ..., −2, −1, 0, 1, 2, ... with corresponding round-off. Find the mean and the standard deviation of the observed measurement results.

5. Five weight measurements are made on a digital instrument with scale step h = 1 gram, and the following results obtained: 110 gram, 111 gram (3 times) and 112 gram. Suppose that the weight is an unknown constant μ and the measurement error has a normal distribution N(0, σ²). Find the maximum likelihood estimates of μ and σ.
Answers and Solutions to Exercises

Chapter 2
1, a. x̄ = 0.634;
1, b. s = 0.017; the estimate of σ using the sample range is σ̂ = 0.06/3.472 = 0.017;
1, c. Q = (0.61 − 0.60)/0.06 = 0.167 < 0.338. The minimal sample value is not an outlier.
1, d. The 0.95 confidence interval on μ is [0.6246, 0.6434].
1, e. h = 0.01, s/h = 1.7. It is satisfactory.

2. Solution. Y ~ N(150, σ = 15). P(145 < Y < 148) = P((145 − 150)/15 < (Y − 150)/15 < (148 − 150)/15) = Φ(−0.133) − Φ(−0.333) = Φ(0.333) − Φ(0.133) = 0.630 − 0.553 = 0.077.

Chapter 3
1. The sample averages are x̄1 = 3200.1, x̄4 = 3359.6. The sample variances are s1² = 2019.4, s4² = 5984.6. Assuming equal variances, the T-statistic for testing μ1 = μ4 against μ1 < μ4 equals −3.99, which is smaller than −t0.005(8). We reject the null hypothesis.
2. For Table 4.1 we obtain the maximal variance for day 4, s4² = 52.29, and the minimal, s1² = 8.51, for day 1. Hartley's statistic equals 6.10, which is less than the 0.05 critical value of 33.6 for 4 samples and 7 observations in each sample. The null hypothesis is not rejected.

3. For Table 3.8 data, the 95% confidence interval on the mean difference is [0.02, 2.98]. Since it does not contain zero, we reject the null hypothesis.

Chapter 4
1. The average range per cell is 0.0118. It gives σ̂_e = 0.0070. This is in close agreement with σ̂_e = 0.0071 obtained in Sect. 4.4.

3. σ̂_x = 0.092, σ̂_e1 = 2.8·10⁻², σ̂_e2 = 2.1·10⁻². The estimate of b is b̂ = −0.21.

4. The 95% confidence interval on b is [0.194, 0.227].
7. F_A = (791.6/4)/(234.0/5) = 4.23, which is less than F4,5(0.05) = 5.19. Thus we do not reject the null hypothesis at the significance level 0.05, but we do reject it at the level 0.1, for which F4,5(0.1) = 3.52.
F_B = (234/5)/(0.98/10) = 477.6 ≫ F5,10(0.05) = 3.33. Thus the null hypothesis is certainly rejected.

Chapter 6
1. Exactly as in the case of constant variance,

Var[x*] ≈ (x* − x̄)² · [ (Var[y*] + Var[ȳ]) / (y* − ȳ)² + Var[β̂] / β̂² ].

Now use the following formulas for the variance estimates:

V̂ar[y*] = Kv/w*;
V̂ar[ȳ] = (by (6.2.8)) = Σ_{i=1}^n wi² (Kv/wi) / (Σ_{i=1}^n wi)² = Kv / Σ_{i=1}^n wi;
V̂ar[β̂] = Kv/Sxx,

and substitute them in the expression for Var[x*].

3. Solution. x̄ = 0.0938; ȳ = 0.1244; Sxx = 3.4355; Sxy = 3.0324; Syy = 3.1346; Σ_{i=1}^n wi = 33.7. x* = (0.4 − 0.0416)/0.8827 = 0.406.
To obtain the weight w* for x*, interpolate linearly between w3 and w4: w* ≈ 4.4. Then using (6.2.17) with Kv = 0.15264, obtain that

Var[x*] ≈ (0.406 − 0.0938)² · ( (1/4.4 + 1/33.7)/(0.4 − 0.1244)² + 1/(0.8827² · 3.4355) ) · 0.15264 = 0.0559.

Thus the result is: 0.406 ± 0.236.

5. If ȳi = (yi1 + ... + yiki)/ki, then Var[ȳi] = Var[yi1]/ki. Compare this with (6.2.3): wi = ki.

6. Compute from the data in Table 6.3 that D² = Σ_{i=1}^n (yi − 36 − 0.44xi)²/3 = 146.4. Use (6.3.15) and (6.3.16) to calculate that P = 0.444 and Q = 6738.2. Substitute the values of D², λ, β̂, P, Q, ŷ = 180 and ȳ into (6.3.17). The result is Var[x̂] ≈ 964.1.

Chapter 8
1. Answer: α̂ = 29°56′; β̂ = 56°18′.
2. Solution. V̂ar[α] = (α1 − α2)²/2 = 34²/2 = 578 (in minutes of arc squared). Similarly, V̂ar[β] = 20²/2 = 200, and V̂ar[γ] = 24²/2 = 288.
The pooled variance estimate of σ² is σ̂² = (578 + 200 + 288)/3 = 355.3. By (8.2.12) with k = 2, Var[α̂] = σ²/3, so that V̂ar[α̂] = 355.3/3 = 118.4 and √V̂ar[α̂] ≈ 11′.

4. Solution. With probability 0.1 the result will be −1, with probability 0.5 it will be zero, and with the remaining probability 0.4 it will be +1. So the measurement result will be a discrete random variable Y*, with E[Y*] = 0.1·(−1) + 0.5·0 + 0.4·1 = 0.3 and Var[Y*] = 0.1·(−1)² + 0.4·1² − 0.3² = 0.5 − 0.09 = 0.41; the standard deviation is √0.41 ≈ 0.64.

5. Solution. Consider expression (8.1.16) with n0 = 2, n1 = 6, n2 = 2. Set the zero at 110 gram. This case is presented in Table 8.2, and the results are μ_ML = 1, σ_ML = 0.56. Note that the likelihood function for the data n0 = 1, n1 = 3, n2 = 1 is the square root of the likelihood function (8.1.16), and thus it will be maximized at the same values. Finally, μ_ML = 111 gram, σ_ML = 0.56 gram.
Appendix A: Normal Distribution

Φ(x) = (1/√(2π)) ∫_{−∞}^{x} exp[−v²/2] dv

Hundredth parts of x
x 0 1 2 3 4 5 6 7 8 9
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7703 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987
For negative values of x, Φ(x) = 1 − Φ(−x). For example, let x = −0.53. Then
Φ(−0.53) = 1 − Φ(0.53) = 1 − 0.7019 = 0.2981.
Appendix B: Quantiles of the Chi-Square Distribution

Let V ~ χ²(ν). Then P(V ≤ q_β(ν)) = β.

ν    q0.01(ν)   q0.025(ν)   q0.05(ν)   q0.95(ν)   q0.975(ν)   q0.99(ν)
1 0.000157 0.000982 0.00393 3.841 5.024 6.635
2 0.0201 0.0506 0.103 5.991 7.378 9.210
3 0.115 0.216 0.352 7.815 9.348 11.345
4 0.297 0.484 0.711 9.488 11.143 13.277
5 0.554 0.831 1.145 11.070 12.833 15.086
6 0.872 1.237 1.635 12.592 14.449 16.812
7 1.239 1.690 2.167 14.067 16.013 18.475
8 1.646 2.180 2.733 15.507 17.535 20.090
9 2.088 2.700 3.325 16.919 19.023 21.666
10 2.558 3.247 3.940 18.307 20.483 23.209
11 3.053 3.816 4.575 19.675 21.920 24.725
12 3.571 4.404 5.226 21.026 23.337 26.217
13 4.107 5.009 5.892 22.362 24.736 27.688
14 4.660 5.629 6.571 23.685 26.119 29.141
15 5.229 6.262 7.261 24.996 27.488 30.578
16 5.812 6.908 7.962 26.296 28.845 32.000
17 6.408 7.564 8.672 27.587 30.191 33.409
18 7.015 8.231 9.390 28.869 31.526 34.805
19 7.633 8.907 10.117 30.144 32.852 36.191
20 8.260 9.591 10.851 31.410 34.170 37.566
21 8.897 10.283 11.591 32.671 35.479 38.932
22 9.542 10.982 12.338 33.924 36.781 40.289
23 10.196 11.689 13.091 35.172 38.076 41.638
24 10.856 12.401 13.848 36.415 39.364 42.980
25 11.524 13.120 14.611 37.652 40.646 44.314
26 12.198 13.844 15.379 38.885 41.923 45.642
27 12.879 14.573 16.151 40.113 43.195 46.963
28 13.565 15.308 16.928 41.337 44.461 48.278
29 14.256 16.047 17.708 42.557 45.772 49.588
30 14.953 16.791 18.493 43.773 46.979 50.892
Appendix C: Critical Values Fν1,ν2(α) of the F-Distribution

                                      ν1
ν2  α      1      2      3      4      5      6      7      8      9     10     15     20
1 0.050 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 245.9 248.0
1 0.025 647.8 799.5 864.2 899.6 921.8 937.1 948.2 956.7 963.3 968.6 984.9 997.2
1 0.010 4052 4999 5403 5625 5764 5859 5928 5981 6022 6056 6157 6209
1 0.005 16211 19999 21615 22500 23056 23437 23715 23925 24091 24224 24630 24836
2 0.050 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.43 19.45
2 0.025 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40 39.43 39.45
2 0.010 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39 99.40 99.43 99.45
2 0.005 198.5 199.0 199.2 199.2 199.3 199.3 199.4 199.4 199.4 199.4 199.4 199.4
3 0.050 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.70 8.66
3 0.025 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42 14.25 14.17
3 0.010 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35 27.23 26.87 26.69
3 0.005 55.55 49.80 47.47 46.19 45.39 44.84 44.43 44.13 43.88 43.69 43.08 42.78
4 0.050 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.86 5.80
4 0.025 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84 8.66 8.56
4 0.010 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55 14.20 14.02
4 0.005 31.33 26.28 24.26 23.15 22.46 21.97 21.62 21.35 21.14 20.97 20.44 20.17
5 0.050 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.62 4.56
5 0.025 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62 6.43 6.33
5 0.010 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05 9.72 9.55
5 0.005 22.78 18.31 16.53 15.56 14.94 14.51 14.20 13.96 13.77 13.62 13.15 12.90
6 0.050 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 3.94 3.87
6 0.025 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46 5.27 5.17
6 0.010 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.56 7.40
6 0.005 18.63 14.54 12.92 12.03 11.46 11.07 10.79 10.57 10.39 10.25 9.81 9.59
7 0.050 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.51 3.44
7 0.025 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.76 4.57 4.47
7 0.010 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62 6.31 6.16
7 0.005 16.24 12.40 10.88 10.05 9.52 9.16 8.89 8.68 8.51 8.38 7.97 7.75
8 0.050 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.22 3.15 N
8 0.025 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30 4.10 4.00
8 0.010 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81 5.52 5.36
8 0.005 14.69 11.04 9.60 8.81 8.30 7.95 7.69 7.50 7.34 7.21 6.81 6.61
9 0.050 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.01 2.94
9 0.025 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96 3.77 3.67
9 0.010 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26 4.96 4.81
9 0.005 13.61 10.11 8.72 7.96 7.47 7.13 6.88 6.69 6.54 6.42 6.03 5.83
10 0.050 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.85 2.77
10 0.025 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72 3.52 3.42
10 0.010 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85 4.56 4.41
10 0.005 12.83 9.43 8.08 7.34 6.87 6.54 6.30 6.12 5.97 5.85 5.47 5.27
11 0.050 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.72 2.65
11 0.025 6.72 5.26 4.63 4.28 4.04 3.88 3.76 3.66 3.59 3.53 3.33 3.23
11 0.010 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 4.54 4.25 4.10
11 0.005 12.23 8.91 7.60 6.88 6.42 6.10 5.86 5.68 5.54 5.42 5.05 4.86
12 0.050 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.62 2.54
12 0.025 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37 3.18 3.07
12 0.010 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30 4.01 3.86
12 0.005 11.75 8.51 7.23 6.52 6.07 5.76 5.52 5.35 5.20 5.09 4.72 4.53
13 0.050 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.53 2.46
13 0.025 6.41 4.97 4.35 4.00 3.77 3.60 3.48 3.39 3.31 3.25 3.05 2.95
13 0.010 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 4.10 3.82 3.66
13 0.005 11.37 8.19 6.93 6.23 5.79 5.48 5.25 5.08 4.94 4.82 4.46 4.27
14 0.050 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.46 2.39
14 0.025 6.30 4.86 4.24 3.89 3.66 3.50 3.38 3.29 3.21 3.15 2.95 2.84
14 0.010 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03 3.94 3.66 3.51
14 0.005 11.06 7.92 6.68 6.00 5.56 5.26 5.03 4.86 4.72 4.60 4.25 4.06
15 0.050 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.40 2.33
15 0.025 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06 2.86 2.76
15 0.010 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.52 3.37
15 0.005 10.80 7.70 6.48 5.80 5.37 5.07 4.85 4.67 4.54 4.42 4.02 3.88
References

Analytical Software. 2000. Statistix 7, User's Manual. Analytical Software, Tallahassee, FL.

Beauregard, M., R. Mikulak and B. Olson. 1992. A Practical Guide to Quality Improvement. Van Nostrand Reinhold.

Bisgaard, S. 2002. An investigation involving the comparison of several objects. Quality Engineering, 14(4), 685-689.

Bolshev, L.N. and N.V. Smirnov. 1965. Tables of Mathematical Statistics. Nauka (in Russian).

Box, G. 1953. Non-normality and tests on variances. Biometrika, 40, 318-335.

Box, G. 1998. Multiple sources of variation: variance components. Quality Engineering, 11(1), 171-174.

Box, G. and A. Luceno. 1997. Statistical Control by Monitoring and Feedback Adjustment. John Wiley & Sons, Inc.

Cochran, W.G. and G.M. Cox. 1957. Experimental Designs, 2nd ed. John Wiley & Sons, Inc.

Cramer, H. 1946. Mathematical Methods of Statistics. Princeton University Press, Princeton.

Czitrom, V. and P.D. Spagon. 1997. Statistical Case Studies for Industrial Process Improvement. SIAM-ASA.

Dechert, J. et al. 2000. Statistical process control in the presence of large measurement variation. Quality Engineering, 12(3), 417-423.

DeGroot, M.H. 1975. Probability and Statistics, 2nd ed. Addison-Wesley.

Devore, J.L. 1982. Probability and Statistics for Engineering and Sciences. Brooks/Cole Publishing Company, Monterey, California.

Dunin-Barkovsky, I.V. and N.V. Smirnov. 1955. Theory of Probability and Mathematical Statistics for Engineering. Gosizdat, Moscow (in Russian).

Eurachem. 2000. Quantifying Uncertainty in Chemical Measurement. Eurachem Publications.

Fisher, R.A. 1941. Statistical Methods for Research Workers. G.E. Stechert & Co., New York.

Gelman, A., J.B. Carlin, H.S. Stern and D. Rubin. 1995. Bayesian Data Analysis. Chapman & Hall/CRC.

Gertsbakh, I.B. 1989. Statistical Reliability Theory. Marcel Dekker, Inc.

Grubbs, F.E. 1948. On estimating precision of measuring instruments and product variability. Journal of the American Statistical Association, 43, 243-264.

Grubbs, F.E. 1973. Comparison of measuring instruments. Technometrics, 15, 53-66.

Hald, A. 1952. Statistical Theory with Engineering Applications. Wiley, New York - London.

Hurwitz et al. 1997. Identifying the sources of variation in wafer planarization process, in V. Czitrom and P.D. Spagon (eds), Statistical Case Studies for Industrial Process Improvement, pp. 106-113. Society for Industrial and Applied Mathematics and American Statistical Association.

Kotz, S. and N.L. Johnson (eds). 1983. Encyclopedia of Statistical Sciences, Vol. 3, pp. 542-549. Wiley.

Kragten, J. 1994. Calculating standard deviations and confidence intervals with a universally applicable spreadsheet technique. Analyst, 119, 2161-2166.

Krylov, A.N. 1950. Lectures on Approximate Calculations. Gosizdat, Moscow and Leningrad (in Russian).

Madansky, A. 1988. Prescriptions for Working Statisticians. Springer, New York.

Mandel, J. 1991. Evaluation and Control of Measurements. Marcel Dekker, Inc.

Miller, J.C. and J.N. Miller. 1993. Statistics for Analytical Chemistry, 3rd ed. Ellis Horwood PTR Prentice Hall, New York.

Mitchell, T., V. Hegeman and K.C. Liu. 1997. GRR methodology for destructive testing, in V. Czitrom and P.D. Spagon (eds), Statistical Case Studies for Industrial Process Improvement, pp. 47-59. Society for Industrial and Applied Mathematics and American Statistical Association.

Montgomery, D.C. 2001. Design and Analysis of Experiments, 5th ed. John Wiley & Sons, Inc.

Montgomery, D.C. and G.C. Runger. 1993. Gauge capability and designed experiments. Part I: basic methods. Quality Engineering, 6(1), 115-135.

Morris, A.S. 1997. Measurement and Calibration Requirements for Quality Assurance ISO 9000. John Wiley & Sons, Inc.

Nelson, L.S. 1984. The Shewhart control chart - tests for special causes. Journal of Quality Technology, 16, 237-239.

NIST Technical Note 1297. 1994. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results.

Pankratz, P.C. 1997. Calibration of an FTIR spectrometer for measuring carbon, in V. Czitrom and P.D. Spagon (eds), Statistical Case Studies for Industrial Process Improvement, pp. 19-37. Society for Industrial and Applied Mathematics and American Statistical Association.

Rabinovich, S. 2000. Measurement Errors and Uncertainties: Theory and Practice, 2nd ed. Springer, New York.

Sachs, L. 1972. Statistische Auswertungsmethoden, 3rd ed. Springer, Berlin Heidelberg New York.

Sahai, H. and M.I. Ageel. 2000. The Analysis of Variance. Birkhäuser, Boston Basel Berlin.

Scheffe, H. 1960. The Analysis of Variance. John Wiley & Sons, Inc., New York.

Stigler, S.M. 1977. Do robust estimators work with real data? (with discussion). Annals of Statistics, 45, 179-183.

Taylor, J.R. 1997. An Introduction to Error Analysis. University Science Books, Sausalito, California.

Vardeman, S.B. and E.S. Van Valkenburg. 1999. Two-way random-effects analyses and gauge R&R studies. Technometrics, 41(3), 202-211.

Wolfram, S. 1999. The Mathematica Book, 4th ed. Cambridge University Press.

Youden, W.J. and E.H. Steiner. 1975. Statistical Manual of the Association of Official Analytical Chemists: Statistical Techniques for Collaborative Studies. AOAC International, Arlington, VA.
Index

absence of repeated measurements
- Grubbs's method 83
absolutely constant error 5
accuracy 6
additive model of factor influence 117
analysis of residuals 103
analysis of variance for nested design 69
ANOVA
- nested design 67
- with random effects 60
assignable cause
- measurement errors 37
assignable causes 36
average 12
- run length (ARL) 38
Bartlett's test 55
between-batch variation 63
bias
- elimination 8
- in averaging 124
- in estimating σ 125
- in measurements 41
- - estimation 31, 47
- - estimation of mean 31
- in S-estimator 29
- in weighing 11
box and whisker plot 11, 44, 61
calibration 95
- curve 95, 97
- mathematical model 96
- measurement error 96
- of measurement instruments 37
- of spectrometer 112
- uncertainty in estimating x 100
chi-square distribution 51, 77
- 1 − α quantile 55
- origin 51
- quantiles 77
- variance 85
coefficient of variation 16, 89
- of sample mean 17
collaborative studies (CS) 112
combining two hierarchical designs 80
common cause 32
conditionally constant error 5
confidence interval 4
- use for hypotheses testing 30
confidence
- interval and hypotheses testing 31
- interval for mean value 50
- interval on instrument bias 85
- set on parameters 78
consistent estimator 14
control charts 32
- estimation of σ 34
- stopping rules 35
control limits 34
covariance 15
CS studies
- homogeneity of experimental variation 120
CS, systematic errors 114
CS, visual analysis of cell variability 121
day-to-day variability 60
delta method 88
destructive testing
- repeatability 78
determining the outlying laboratory 120
discrete measurement scale 123
Dixon's statistic
- table 28
Dixon's test for an outlier 26
dot diagram 10
elementary error 5
elimination of outlying laboratories 118
EPF
- for product-form function 91
- in inverse regression 101
- methodology 92
- particular forms 90
- various forms 89
equivariance in calibration 97
error propagation formula (EPF) 88
- use in confidence intervals 78
estimate 15
- of σ_A, σ_e 63
- of σ² in regression 99
- of standard deviation 12
- of regression parameters 110
estimation of σ using range 28
estimation of variances in one-way ANOVA 62
estimator 14
- of σ 29
- - efficiency 29
- of mean 14
-, biased 14
-, unbiased 14
expectation 4
factorial design 118
Gauss function 19
Gauss-Laplace function 19
geometry of the best fit
- errors in both variables 108
Grubbs's method
- destructive testing 83
Hartley's test 54
hierarchical design 80
influencing factors 118
input, response 96
instrument bias 32
instrument certificate 7
interaction of measurand and the instrument 2
inverse regression 99
- repeated observations 102
Kruskal-Wallis statistic 51, 58
Kruskal-Wallis test 50
law of large numbers 6
least square principle 97
linear calibration curve 97
linear regression
- equal variances 98
- parameter estimation 98
linear regression with errors in x and y 107
M-method for confidence estimation 77
Mandel's k-statistic 121
Mathematica 127
maximum likelihood
- estimate 132
- estimate of μ 126
- method 125
- - estimation of μ and σ 125
mean
- confidence interval 40
- of the sum of random variables 15
- value of population 4, 11
measurand 1
measure of repeatability and reproducibility 74
measurement 1
- process uncertainty 60
- regular 18
- scale step 18
- special 18
- uncertainty 2
- under constraints 131
- with constraints 130
measurement error 3
- absolute 7
- bias, random error 36
- performance of control charts 38
- relative 7
measuring
- blood pressure 56
- cable tensile strength 58
- pull strength 79, 80
method of subtraction 11
metrology 7
minimal-variance unbiased estimator of σ 29
multiple comparisons 53
naive estimators of μ and σ 130
nested design
- hypotheses testing 70
- mathematical model 67
- model, assumptions 67
- model, estimation 67
- parameter estimation 69, 70
- testing hypotheses 71
nested or hierarchical design 67
nonparametric comparison of means 50
nonparametric tests 46
normal density
- origin 24
- postulates 22
normal density function 19
normal distribution 19
- cumulative probability function 19
- mean and variance 19
- parameters 19
normal distribution, origin 22
normal distribution, postulates 23
normal plot 21, 28
normal probability paper 22
normal probability plot 22
normal quantiles 20, 21
observed values of estimators 15
one-way ANOVA
- testing hypotheses 65
one-way ANOVA with random effect 60
- assumptions 62
- model 62
optimal choice of weights 16
optimal number of samples per batch 65
organization of CS 118
orthogonal design 115
outlier 11, 26, 40
paired experiment 47
pairwise blocking 47
- in CS 115
pairwise differences 85
performance of control charts 38
point estimation in R&R studies 76
pooled sample variance 47
pooled variance 55
population 4
population mean 30
- confidence interval 30
precision 6
process variability 60
pure measurement error 5
quantiles of chi-square distribution 51
R charts 35
R&R studies 71
- ANOVA model 75
- confidence intervals 76
- estimation, hypotheses testing 73
- model, assumptions 73
- point estimation 76
- sources of variability 72
random error 5, 6
random sample 4
random variable 4
range 26, 34
- uniform population 29
range method
- repeatability 82
range to estimate σ 29
ranking laboratories in CS 120
ranks for tied observations 50
ranks, definition 50
regression
- errors in both variables 108
- heteroscedasticity 103
regression with errors in x and y 110
relative standard deviation 89, 93
repeatability 71
- conditions 46
- of pull strength 80
repeated measurements 2
reproducibility 71
residuals and response variability 103
round-off error 13, 123
round-off rules 17
RSD, underestimation in EPF 92
ruggedness (robustness) test 115
sample
- average 12
- mean 12
- range 26, 28, 40
- variance 12
Sheppard correction 130
Sheppard's correction 13
Shewhart's charts 32
significance level 30
single-factor ANOVA 82
small σ/h ratio 124
sources of uncertainty
- measurement errors 60
special cause 32
special measurement scheme 124
specific weight 1, 2
specification limits 25
speed of light, measurements 41
standard deviation 11
- of random error 37
- of range 35
standard error 93
- in estimating atomic weight 93
- of σ_R&R 93
- of the mean 17
standard normal density 19
standard normal distribution 19
Statistix for regression 99
Statistix software 53
Statistix, use for ANOVA 76
sum of squares 130
systematic and random errors 6
systematic error 6, 7, 11
- in measurements 40
t-distribution
- critical values 45
t-statistic
- critical values 30
- degrees of freedom 30
t-test
- degrees of freedom 44
- equal variances 47
- null hypothesis, alternatives 45
- statistic 44
testing hypotheses in ANOVA 85
two-factor balanced model
- random effects 72
two-sample t-test 43
unbiased estimator of variance 14
uncertainty
- in x for given y 96
- in x in calibration 111
- in measurements 4, 87
- in measuring specific weight 87
- of atomic weights 25
- of the measurement 6
uniform density 24
uniform distribution 13, 24, 132, 135
- mean and variance 13, 24
- of round-off errors 25
use of calibration curve 96
using ranges to estimate repeatability 82
using ranks in CS 120
variance 11
- unbiased and consistent estimate 12
variance of a sum of random variables 15, 16
variance of sum of squares in ANOVA 85
wear of brake shoes 48
weighing using beam balance 8
weighted regression 104, 105, 112
- uncertainty in x* 106
weighted sum of squares 104
within-batch variation 63
X bar chart 34
- performance 39
