Ilya Gertsbakh
Measurement Theory
for Engineers
Springer
© Springer-Verlag Berlin Heidelberg 2003
Originally published by Springer-Verlag Berlin Heidelberg New York in 2003.
Softcover reprint of the hardcover 1st edition 2003
The use of general descriptive names, registered names, trademarks, etc. in this publication does
not imply, even in the absence of a specific statement, that such names are exempt from the
relevant protective laws and regulations and therefore free for general use.
Typesetting: Data conversion by author
Cover design: Medio, Berlin
Printed on acid-free paper
To my wife Ada
Preface
The material in this book was first presented as a one-semester graduate course
in Measurement Theory for M.Sc. students of the Industrial Engineering De-
partment of Ben Gurion University in the 2000/2001 academic year.
The book is devoted to various aspects of the statistical analysis of data
arising in the process of measurement. We would like to stress that the book is
devoted to general problems arising in processing measurement data and does
not deal with various aspects of special measurement techniques. For example,
we do not go into the details of how special physical parameters, say ohmic
resistance or temperature, should be measured. We also omit the accuracy
analysis of particular measurement devices.
The Introduction (Chapter 1) gives a general and brief description of the
measurement process, defines the measurand and describes different kinds of
the measurement error.
Chapter 2 is devoted to the point and interval estimation of the popula-
tion mean and standard deviation (variance). It also discusses the normal and
uniform distributions, the two most widely used distributions in measurement.
We give an overview of the basic rules for operating with means and variances
of sums of random variables. This information is particularly important for
combining measurement results obtained from different sources. There is a
brief description of graphical tools for analyzing sample data. This chapter also
presents the round-off rules for data presentation.
Chapter 3 contains traditional material on comparing means and variances
of several populations. These comparisons are typical for various applications,
especially in comparing the performance of new and existing technological pro-
cesses. We stress how the statistical procedures are affected by measurement
errors. This chapter also contains a brief description of Shewhart charts and a
discussion on the influence of measurement errors on their performance.
When we measure the output parameter of a certain process, there are two
main sources of variability: that of the process and that due to the errors
introduced by the measurement process. One of the central issues in statistical
measurement theory is the estimation of the contribution of each of these two
sources to the overall variability. This problem is studied in the framework
of ANOVA with random effects in Chapter 4. We consider one-way ANOVA,
Ilya Gertsbakh
Beersheva, October 2002
Contents
Preface vii
1 Introduction 1
1.1 Measurand and Measurement Errors 1
1.2 Exercises 7
Index 147
Chapter 1
Introduction: Measurand
and Measurement Errors
Truth lies within a little and certain compass, but error is immense.
Bolingbroke
procedure. For example, we use a micrometer to read the diameter of the shaft,
according to the accepted rules for operating the micrometer. The measurement
result is 7.252 mm. Chemical analysis carried out according to specified rules
establishes that the rolled steel sheet has 2.52% chromium. Weighing the piece
of silver and measuring its volume produces the result 10.502 g/cm³. The word
"measurement" is also used to describe the set of operations carried out to
determine the measurand.
Note that we restrict our attention in this book to measuring well-defined
physical parameters and do not deal with measurements related to psychology
and sociology, such as IQ, level of prosperity, or inflation.
Suppose now that we repeat the measurement process. For example, we mea-
sure the shaft midsection diameter several times; or repeat the chemical analy-
sis for the percentage of chromium several times, taking different samples from the
metal sheet; or repeat the measurements of weight and volume needed to calcu-
late the specific weight of silver. The fundamental fact is that each repetition
of the measurement, as a rule, will produce a different result.
We can say that the measurement results are subject to variations. An equi-
valent statement is that the result of any measurement is subject to uncertainty.
In principle, we can distinguish two sources of uncertainty: the variation created
by the changes in the measurand and the variations which are intrinsic to the
measurement instrument and/or to the measurement process.
As a rule, we ignore the possibility of an interaction between the measurand
and the measurement instrument or assume that the effect of this interaction on
the measurement result is negligible. For example, in the process of measuring
the diameter, we apply a certain pressure which itself may cause a deformation
of the object and change its diameter. A good example of such interaction is
measuring blood pressure: most people react to the measurement procedure
with a rise in blood pressure.
One source of uncertainty (variability) is the measurand itself. Consider for
example the cross-section of the cylindrical shaft. It is not an ideal circle, but
similar, in exaggerated form, to Fig. 1.1. Obviously, measurements in different
directions will produce different results, ranging, say from 10.252 mm to 10.257
mm.
We can try to avoid the "multiplicity" of results by redefining the measurand:
suppose that the diameter is defined as the average of two measurements taken in
perpendicular directions as shown in Fig. 1.1. Then the newly defined diameter
will vary from one shaft to another due to the variations in the process of shaft
production.
The situation with the rolled steel sheet is similar. Specimens for the purpose
of establishing the proportion of chromium are taken from different locations
on the sheet, and/or from different sheets. They will contain, depending on the
properties of the production process, different amounts of chromium ranging
from, say, 2.5% to 2.6%.
It should be stressed that (1.1.1) is valid for the situation when there exists
a well-defined "true" value of the measurand μ. How can we extend (1.1.1) to the
situation in which the measurand itself is subject to variations, as it is in our
shaft diameter and chromium content examples?
Let us adopt the approach widely used in statistics. Introduce the notion
of a population as the totality of all possible values of the shaft diameters.
(We have in mind a specified shaft production process operating under fixed,
controlled conditions.) Fisher (1941) calls this totality a "hypothetical infinite
population" (see Mandel (1994), p. 7). The value D of the diameter of each
particular shaft therefore becomes a random variable. To put it simply, the
result of each particular measurement is unpredictable, it is subject to random
variations. What we observe as the result of measuring several randomly chosen
shafts becomes a random sample taken from the above infinite population which
we create in our mind.
How, then, do we define the measurand? The measurand, by our definition,
is the mean value of this population, or more formally, the mean value μ of the
corresponding random variable D:

Y = μ + X, (1.1.3)

where the random variable X reflects all the uncertainty of the measurement result.
We can go a step further and try to find out the structure of the random
variable X. We can say that X consists of two principal parts: one, X₀, reflects
the variations of the measurand in the population, i.e. its deviations from
the mean value μ. Imagine that our measurement instruments are free of any
errors, i.e. they are absolutely accurate. Then we would observe in repeated
measurements only the realizations of the random variable μ + X₀.
Remark 1
For the purpose of measurement data processing, it is often convenient to de-
compose the measurement error ε into two components ε₁ and ε₂ which we will
call "systematic" and "random", respectively. By definition, the systematic
component remains constant for repeated measurements of the same object car-
ried out on the same measurement instrument under the same conditions. The
random component produces replicas of independent identically distributed
random variables for repeated measurements made by the same instrument under
identical conditions.
In theoretical computations, it is often assumed that the first component has
a uniform distribution, whose limits can be established from the certificate of
the measurement instrument or from other sources. The random component is
modeled typically by a zero-mean normal random variable; its variance is either
known from experience and/or the instrument certificate data, or is estimated from
the results of repeated measurements.
Remark 2
In the literature, we very often meet the notions of "precision" and "accuracy"
(e.g. Miller and Miller 1993, p. 20). The easiest way to explain these notions
is to refer to the properties of the random variable X in (1.1.7). Let δ be the
mean of X and σ its standard deviation (then σ² is the variance of X; this fact
will be denoted as X ~ (δ, σ²)). Then measurements with small δ are called
"accurate"; measurements with small σ are called "precise". It is very often
said that the presence of δ ≠ 0 signifies a so-called systematic error.
Remark 3
There is a scientific discipline called metrology which studies, among other issues,
the properties of measurement instruments and measurement processes. One of
the central issues in metrology is the analysis of measurement errors. Generally
speaking, the measurement error is the difference between the "true" value of
the measurand (which is assumed to be constant) and the measurement result.
Even for a relatively simple measurement instrument, say a digital ohmmeter,
it is a quite involved task to find all possible sources of errors in measuring the
ohmic resistance and to estimate the limits of the total measurement error. We
will not deal with such analysis. The interested reader is referred to Rabinovich
(2000).
The limits of maximal error of a measurement instrument (including sys-
tematic and random components) usually are stated in the certificate of the
instrument, and we may assume in analyzing the measurement data that this
information is available.
The data on instrument error (often termed "accuracy") is given in an ab-
solute or relative form. For example, the certificate of a micrometer may state
that the maximal measurement error does not exceed 0.001 mm (1 micron).
This is an example of the "absolute form". The maximal error of a voltmeter
is typically given in a relative form. For example, its certificate may state that
the error does not exceed 0.5% of a certain reference value.
1.2 Exercises
1. Take a ruler and try to measure the area of a rectangular table in cm² in your
room. Carry out the whole experiment 5 times and write down the results. Do
they vary? Describe the measurand and the sources of variability of the results.
How would you characterize the measurement process in terms of accuracy and
precision?
2. During your evening walk, measure the distance in steps from your house
to the nearest grocery store. Repeat the experiment five times and analyze the
results. Why do the results vary?
3. Take your weight in the morning, before breakfast, four days in a row.
Analyze the results. What is the measurand in this case? Are the measurements
biased?
Chapter 2

Measuring Population Mean and Standard Deviation
Figure 2.1. Representation of apple weights in the form of a dot diagram (horizontal scale from 155 to 170 grams)
It is a golden rule of statistical analysis first of all to try to "see" the data.
Simple graphs help to accomplish this task. First, we represent the data in the
form of a so-called dot-diagram; see Fig. 2.1.
Figure 2.2. Box and whisker plot for the data of Example 2.1.1 (vertical scale from 158 to 170 grams)
A second very popular and useful type of graph is the so-called box and whisker
plot. This is composed of a box and two whiskers. The box encloses the middle
half of the data. It is bisected by a line at the value of the median. The vertical
lines at the top and the bottom of the box are called the whiskers, and they
indicate the range of "typical" data values. Whiskers always end at the value
of an actual data point and cannot be longer than 1.5 times the size of the box.
Extreme values are displayed as "*" for possible outliers and "0" for probable
outliers. Possible outliers are values that are outside the box boundaries by
more than 1.5 times the size of the box. Probable outliers are values that are
outside the box boundaries by more than 3 times the size of the box.
In our example, there is one possible outlier - 158 grams. In Sect. 2.5, we
describe a formal procedure for identifying a single outlier in a sample.
The mathematical model corresponding to the above example is the follow-
ing:

Y = μ + X, (2.1.1)

where μ is the mean value for the weight of the population of Yarok apples (the
measurand), and the random variable X represents the random fluctuation of the
weight around its mean. X also incorporates the measurement error.
X is a random variable with mean δ and variance σ². We use the notation

X ~ (δ, σ²). (2.1.2)

Consequently, Y has mean μ + δ and variance σ²:

Y ~ (μ + δ, σ²). (2.1.3)

Expression (2.1.3) shows that the measurement results include a constant error
δ, which might be for example the result of a systematic bias in the apple
weighing device. In principle, we cannot "filter out" this systematic error by
statistical means alone. We proceed further by assuming that the weighing
mechanism has been carefully checked and the bias has been eliminated, i.e. we
assume that δ = 0.
Remark 1
In chemical measurements, the following method of subtraction is used to elim-
inate the systematic bias in weighing operations. Suppose we suspect that the
weighing device has a systematic bias. Put the object of interest on a special
pad and measure the weight of the object together with the pad. Then weigh
the pad separately, and subtract its weight from the previous result. In this
way, the systematic error which is assumed to be the same in both readings will
be canceled out.
Recall that σ, the square root of the variance, is called the standard deviation.
σ is the most popular and important measure of uncertainty which characterizes
the spread around the mean value in the population of measurement results. Our
purpose is to estimate the mean weight μ (the measurand) and σ, the measure
of uncertainty. It is important to mention here that σ (unlike σ²) is measured
in the same units as μ. So, in our example σ is expressed in grams.
We use capital italic letters X, Y, Z, ... to denote random variables and the
corresponding small italic letters x, y, z, ... to denote the observed values of
these random variables.
Let y_i be the observed values of apple weights, i = 1, ..., 20. In measure-
ments, the unknown mean μ is usually estimated by the sample mean ȳ:

ȳ = Σ_{i=1}^{20} y_i / 20. (2.1.4)

(Often, the word sample average or simply average is used for the sample mean.)
The standard deviation σ is estimated by the following quantity s:

s = √( Σ_{i=1}^{20} (y_i − ȳ)² / 19 ). (2.1.5)

An estimate of a parameter is often denoted by the same letter with a "hat".
So, μ̂ is used to denote an estimate of μ.
For our example we obtain the following results:

ȳ = μ̂ = 165.6;
s = 2.82.
In general, for a sample {y₁, y₂, ..., y_n} the estimate of the mean (i.e. the
sample mean) is defined as ȳ = (y₁ + ⋯ + y_n)/n.
For data recorded with a scale step h, the corrected variance estimate is

s_x² = s_y² − h²/12. (2.1.13)

The term h²/12 in this formula is called Sheppard's correction; see Cramer
(1946, Sect. 27.9). The applicability of (2.1.13) rests on the assumption that the
rounding error γ has a uniform distribution and that the density function of X is "smooth".
Let us compute this correction for our apple example. By (2.1.5), s_y² =
2.82² = 7.95. With h = 1, h²/12 = 1/12 = 0.083. By (2.1.13) we obtain s_x² =
7.95 − 0.083 = 7.867, and s_x = 2.80. We see that in our example the Sheppard
correction makes a negligible contribution.
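As an aside added for illustration, the computations of this example are easy to reproduce in Python. The sketch below uses a hypothetical sample of 20 weights (the actual data of Example 2.1.1 are not reproduced here) and a scale step h = 1 g.

```python
import numpy as np

# Hypothetical sample of 20 apple weights in grams (illustration only)
y = np.array([165, 164, 166, 167, 163, 166, 168, 165, 166, 164,
              167, 165, 166, 163, 168, 165, 166, 167, 164, 166])

y_bar = y.mean()              # sample mean, formula (2.1.4)
s = y.std(ddof=1)             # sample standard deviation with divisor n - 1, (2.1.5)

h = 1.0                       # scale step of the weighing device
s2_x = s**2 - h**2 / 12       # Sheppard's correction, (2.1.13)
print(y_bar, s, np.sqrt(s2_x))
```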
values of ȳ and s. For each sample we will obtain different results, reflecting the
typical situation in random sampling.
The apple weights y₁, ..., y₂₀ are the observed values of random variables
Y₁, Y₂, ..., Y₂₀. These random variables are identically distributed and therefore
have the same mean value μ and the same standard deviation σ. Moreover, we
assume that these random variables are independent. The observed sample
means ȳ(j), j = 1, 2, ..., m, are the observed values of the random variable

Ȳ = Σ_{i=1}^{20} Y_i / 20. (2.1.14)

This random variable is called an estimator of μ.
Of great importance is the relationship between Ȳ and μ. The mean value
of Ȳ equals μ, and Var[Ȳ] = σ²/20, i.e. the variance of Ȳ is inversely proportional to the
sample size n = 20. As the sample size n increases, the variance of Ȳ decreases and the estimator becomes
closer to the unknown value μ. This property defines a so-called consistent
estimator. (More formally, as n → ∞, Ȳ tends to μ with probability 1.)
Let n be the sample size. It can be proved that the quantity

S² = Σ_{i=1}^{n} (Y_i − Ȳ)² / (n − 1) (2.1.17)

is an unbiased estimator of the variance σ², i.e. E[S²] = σ².
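A short simulation, added here as an illustration of these two facts, shows that Var[Ȳ] behaves like σ²/n and that E[S²] stays close to σ²; the population parameters below are assumed for the experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 165.0, 2.8          # assumed population mean and standard deviation

for n in (20, 80, 320):
    samples = rng.normal(mu, sigma, size=(20_000, n))
    y_bar = samples.mean(axis=1)          # 20,000 replicas of the sample mean
    s2 = samples.var(axis=1, ddof=1)      # 20,000 replicas of S^2, formula (2.1.17)
    # Var[Y_bar] should be close to sigma^2/n, and E[S^2] close to sigma^2
    print(n, y_bar.var(), sigma**2 / n, s2.mean())
```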
We have already mentioned that the square root of S² serves as an estimate
of the standard deviation σ.

Var[ Σ_{i=1}^{k} Y_i / k ] = σ²/k. (2.1.23)
a_i* = (1/σ_i²) / Σ_{j=1}^{k} (1/σ_j²), i = 1, ..., k. (2.1.27)
The following example demonstrates the use of this formula in measurements.
Let Y be any random variable with mean μ and standard deviation σ. In
most cases which will be considered in the context of measurements, we deal
with positive random variables, such as concentration, weight, size and velocity.
Assume, therefore, that Y is a positive random variable.
An important characterization of Y is its coefficient of variation, defined as
the ratio of σ to μ. We denote it as c.v.(Y):

c.v.(Y) = σ/μ. (2.1.29)

A small c.v. means that the probability mass is closely concentrated around the
mean value.
Suppose we have a random sample of size n from a population with mean μ
and standard deviation σ. Consider now the sample mean Ȳ. As follows from
(2.1.23), it has standard deviation equal to σ/√n and mean μ. Thus,

c.v.(Ȳ) = c.v.(Y)/√n. (2.1.30)

In words: the standard deviation of the sample mean equals the estimated
population standard deviation divided by the square root of the sample size.
For the apple data in Example 2.1.1, s_μ̂ = 2.8/√20 = 0.63.
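The last computation, reproduced as a short sketch:

```python
import math

s, n = 2.8, 20
# Estimated standard deviation of the sample mean, s / sqrt(n)
print(round(s / math.sqrt(n), 2))   # 0.63, as in the text
```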
For example, suppose that in calculating the specific weight, we have ob-
tained the weight 54.452 g, the volume 5.453 cm³, and the overall uncertainty
A measurement scheme with σ/h < 0.5 we call special. To put it simply, special
measurements are made using a measuring instrument with a scale step h greater
than two population standard deviations.
If a measurement is special, then the rules for computing the estimates of
population mean and standard deviation must be modified. We will describe
these rules in Chapter 8, for the case of sampling from a normal population.
As a rule, we will deal only with regular measurements. It is characteristic of
special measurements that the measurement results usually take on no more
than two or three adjacent numerical values on the scale of the measurement
instrument.
μ may be any real number and σ any positive number. The parameters μ and
σ have the following meaning for the normal distribution:

E[Z] = μ, Var[Z] = σ².

It is more convenient to work with the so-called standard normal distribution
having μ = 0 and σ = 1 (Fig. 2.3). So, if Z₀ ~ N(0, 1), then

f₀(v) = (1/√(2π)) e^{−v²/2}. (2.3.2)

The cumulative probability function

Φ(t) = ∫_{−∞}^{t} f₀(v) dv (2.3.3)

gives the area under the standard normal density from −∞ to t. It is called the
normalized Gauss function. (In European literature, the name Gauss-Laplace
function is often used.) The table of this function is presented in Appendix A.
It is important to understand the relationship between Z ~ N(μ, σ²) and
Z₀ ~ N(0, 1). The reader will recall that if Z ~ N(μ, σ²), then the mean value
of Z equals μ, and the variance of Z equals σ². Now

Z = μ + σ·Z₀, or Z₀ = (Z − μ)/σ. (2.3.4)

In simple terms, a random variable with normal distribution N(μ, σ²) is ob-
tained from the "standard" random variable Z₀ ~ N(0, 1) by a linear transfor-
mation: multiplication by the standard deviation σ and addition of μ.
Figure 2.3. Standard normal density
For a better understanding of the normal density function, let us recall how
the probability mass is distributed around the mean value. This is demonstrated
by Fig. 2.4. In practical work with the normal distribution we often need to use
so-called quantiles.
Normal Quantiles
Definition 2.3.1 Let α be any number between 0 and 1, and let Y be a random
variable with density f_Y(t). Then the number t_α which satisfies the equality

α = ∫_{−∞}^{t_α} f_Y(t) dt (2.3.5)

is called the α-quantile of Y.
α        z_α
0.001 -3.090
0.010 -2.326
0.025 -1.960
0.050 -1.645
0.100 -1.282
0.900 1.282
0.950 1.645
0.975 1.960
0.990 2.326
0.999 3.090
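The table of z_α can be reproduced with the quantile (inverse distribution) function of the standard normal law; a small sketch using scipy:

```python
from scipy.stats import norm

# alpha-quantiles z_alpha of N(0, 1), as in the table above
for alpha in (0.001, 0.010, 0.025, 0.050, 0.100,
              0.900, 0.950, 0.975, 0.990, 0.999):
    print(f"{alpha:.3f}  {norm.ppf(alpha): .3f}")
```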
Figure 2.4. The probability mass under the normal density curve
Normal Plot
A simple graphical tool can be used to check whether a random sample is taken
from a normal population. This is the so-called normal probability plot.
Example 2.1.1 revisited. To construct the normal plot, we first have to order the
weights, and assign to the ith ordered observation y_(i) the value p_(i) = (i − 0.5)/n;
see Table 2.3. We now plot the points (y_(i), p_(i)). Figure 2.5 shows the normal
plot for the apple weights. If the points on the plot form a reasonably straight
line, then this is considered as a confirmation of the normality of the population
from which the sample has been taken.
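A sketch of the construction in Python; the weights below are hypothetical, and plotting y_(i) against the normal quantiles of p_(i) is equivalent to using probability-scaled paper.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# Hypothetical ordered weights standing in for the data of Example 2.1.1
y = np.sort([158, 161, 163, 163, 164, 164, 165, 165, 165, 166,
             166, 166, 166, 167, 167, 168, 168, 169, 170, 172])
n = len(y)
p = (np.arange(1, n + 1) - 0.5) / n     # plotting positions p_(i) = (i - 0.5)/n

# Near-normal data should produce an approximately straight line
plt.plot(y, norm.ppf(p), "o")
plt.xlabel("ordered observations y_(i)")
plt.ylabel("standard normal quantile of p_(i)")
plt.show()
```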
Figure 2.5. The normal plot for the data of Example 2.1.1 (probability scale from 0.02 to 0.99 versus weights from 158 to 172 grams)
The function f(x) is what we used to call the density function. Now it is
natural to postulate that this function is even, i.e. f(x) = f(−x), which means
that the distribution of errors is symmetric around the origin x = 0. Without
loss of generality, we may assume that f(x) = φ(x²).
The second postulate is that φ(·) is a decreasing function of x, i.e. larger
errors are less probable than smaller ones. Suppose now that we carry out a
measurement whose result is a pair of numbers (x, y). In other words, let us
assume that we have a two-dimensional measurement error (ε₁, ε₂). A good way
to imagine this is to consider shooting at a target, and to interpret ε₁ and ε₂ as
the horizontal and vertical deviations from the center of the target, respectively.
Put the center of the target at the point (x = 0, y = 0).
The third postulate is the following: the random errors ε₁ and ε₂ are in-
dependent. This means that the probability dp of hitting the target in a small
rectangle [x, x + dx] × [y, y + dy] can be represented in the following product form:

dp = φ(x²)φ(y²) dx dy. (2.3.10)
Now rotate the target around the origin in such a way that the x-axis goes
through the point (x, y). Then it can be shown, based on the third postulate,
The solution of (2.3.11) is beyond the standard calculus course. Let us accept
without proof the fact that (2.3.11) determines a unique function which has the
following general form: φ(x²) = C·exp(Ax²). Since φ(·) must be decreasing (the
second postulate), A is negative. Denote it by −1/(2a²). The constant C must
be chosen such that the integral of φ from minus to plus infinity is 1. It turns
out that C = 1/(√(2π)a). Thus we arrive at the expression
with f(x) = 0 outside [a, b]. The mean value of X is E[X] = (b + a)/2 and the
variance is Var[X] = (b − a)²/12.
Often the interval [a, b] is symmetric with respect to zero: [a, b] = [−Δ, Δ],
i.e. the density is constant and differs from zero in a symmetric interval around
zero. The length of this interval is 2Δ. We use the shorthand notation X ~
U(a, b) to say that X has density (2.4.1).
Let us recall that the first two moments of X ~ U(−Δ, Δ) are

E[X] = 0, (2.4.2)
Var[X] = Δ²/3. (2.4.3)
Xi"" U( -~i, ~i)· The variance of sum of independent random variables is the
sum of component variances. Thus,
4 A2
L L
k k A2
Var[X] = _ui = ~. (2.4.5)
i=1 12 i=1 3
The standard deviation of X is
volume of the liquid in the pipette filled up to the 10 ml mark with "absolute"
accuracy may deviate from this value by not more than ±0.012 ml. In simple
terms, the error in the liquid volume due to the geometric imperfection of the
glassware is X ~ U(−0.012, 0.012).
Q = (Y_(2) − Y_(1)) / (Y_(n) − Y_(1)). (2.5.1)
Table 2.5: Dixon's statistic for significance level α = 0.05 (Bolshev and Smirnov, p. 328)

n     Q_0.05      n     Q_0.05
3 0.941 13 0.361
4 0.765 14 0.349
5 0.642 15 0.338
6 0.560 16 0.329
7 0.507 17 0.320
8 0.468 18 0.313
9 0.437 19 0.306
10 0.412 20 0.300
11 0.392 25 0.277
12 0.376 30 0.260
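Statistic (2.5.1) is simple enough to compute directly; the sketch below uses a hypothetical sample with a suspiciously low value.

```python
import numpy as np

def dixon_q(sample):
    """Dixon's statistic (2.5.1): gap between the two smallest values over the range."""
    y = np.sort(np.asarray(sample, dtype=float))
    return (y[1] - y[0]) / (y[-1] - y[0])

data = [158, 164, 165, 165, 166, 166, 167, 168, 169, 170]   # hypothetical sample
print(dixon_q(data))   # 0.5 > 0.412, the 0.05 critical value for n = 10 (Table 2.5)
```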
It is interesting to check the normal plot without the outlier y = 158. The
normal plot without the "outlier" in Fig. 2.6 looks "more normal".
Figure 2.6. The normal plot for the data of Example 2.1.1 without the outlier (probability scale from 0.02 to 0.99)
Table 2.6: The coefficients A_n and u_n

n      A_n      u_n
2 1.128 1.155
3 1.693 1.732
4 2.059 2.078
5 2.326 2.309
6 2.534 2.474
7 2.704 2.598
8 2.847 2.694
9 2.970 2.771
10 3.078 2.834
12 3.258 2.931
14 3.407 3.002
16 3.532 3.057
18 3.640 3.099
19 3.689 3.118
20 3.735 3.134
σ̂ = (Y_(n) − Y_(1)) / A_n. (2.6.1)
Let us apply (2.6.1) to the Yarok example, after deleting the outlier y_(1) =
158. Using Table 2.6, we have σ̂ = 8/A_19 = 8/3.689 = 2.17. Note that the estimate of σ
based on 19 observations computed by (2.1.7) is s = 2.24, close enough to the
previous result.
Properties of σ-estimators
Suppose that our observations {y₁, ..., y_n} are drawn from a uniform population,
Y ~ U(a, b). It turns out that for this case too there is an easy way to estimate
the standard deviation using the sample range.
It is easily proved that the mean value of the range for a sample drawn from
a uniform population equals

E[Y_(n) − Y_(1)] = (b − a)(n − 1)/(n + 1). (2.6.2)
Compare this formula with the formula for the standard deviation:

σ_Y = (b − a)/√12, (2.6.3)

and derive that

σ_Y = E[Y_(n) − Y_(1)](n + 1)/(√12(n − 1)). (2.6.4)

It follows therefore that an estimate of σ_Y can be obtained by the formula

σ̂_Y = (Y_(n) − Y_(1))/u_n, (2.6.5)

where u_n = √12(n − 1)/(n + 1).
The coefficients u_n are given in Table 2.6. It is surprising that for small n
they are quite close to the values A_n. This means that for estimating σ in small
samples (n ≤ 10), assuming a normal distribution or a uniform distribution would
produce quite similar results.
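Both range-based estimators are one-liners; a sketch, with the A_n and u_n values taken from Table 2.6 and a simulated sample:

```python
import numpy as np

A_n = {4: 2.059, 5: 2.326, 10: 3.078, 19: 3.689, 20: 3.735}   # from Table 2.6

def sigma_hat_normal(sample):
    """Estimate sigma by (2.6.1): sample range divided by A_n (normal population)."""
    y = np.asarray(sample, dtype=float)
    return (y.max() - y.min()) / A_n[len(y)]

def sigma_hat_uniform(sample):
    """Estimate sigma by (2.6.5): range divided by u_n = sqrt(12)(n - 1)/(n + 1)."""
    y = np.asarray(sample, dtype=float)
    u_n = np.sqrt(12) * (len(y) - 1) / (len(y) + 1)
    return (y.max() - y.min()) / u_n

y = np.random.default_rng(1).normal(165, 2.8, size=20)   # simulated sample
print(sigma_hat_normal(y), sigma_hat_uniform(y), y.std(ddof=1))
```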
We see that there are two different estimators of standard deviation: the
S-estimator (2.7.1), and the estimator based on the sample range.
i      Y_i        D_i        Δ_i = D_i − Y_i
1 10.533 10.545 0.012
2 9.472 9.476 0.004
3 9.953 9.960 0.007
4 10.823 10.830 0.007
5 8.734 8.736 0.002
6 10.700 10.706 0.006
7 9.620 9.630 0.010
8 10.580 10.582 0.002
9 10.546 10.555 0.011
10 9.518 9.520 0.002
round off the final computation result to 0.1. Thus, using Table 2.8, for n = 19
and 1 − 2α = 0.9, we have γ_19(0.05) = 0.398. Thus the 0.90 confidence interval
for the sample mean is

[166.00 − 0.398 · 2.2, 166.00 + 0.398 · 2.2] = [165.1, 166.9].
Let us recall that confidence intervals can be used to test a hypothesis re-
garding the mean value in the following way. Suppose that we want to test the
null hypothesis H₀ that the population mean equals μ = μ₀ against a two-sided
alternative μ ≠ μ₀, at the significance level 0.05. We proceed as follows. Con-
struct, on the basis of the data, a 0.95 confidence interval for μ. If this interval
covers the value μ₀, the null hypothesis is not rejected. Otherwise, it is rejected
at the significance level 0.05.
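Under the normality assumption, the factor γ_n(α) of Table 2.8 can be reproduced as the Student t quantile t_{1−α}(n − 1) divided by √n; for n = 19 and α = 0.05 this gives 1.734/√19 ≈ 0.398, matching the value used above. A sketch:

```python
import numpy as np
from scipy.stats import t

def mean_ci(y_bar, s, n, conf=0.90):
    """Two-sided confidence interval for the mean; gamma_n(alpha) = t/sqrt(n)."""
    gamma = t.ppf(1 - (1 - conf) / 2, df=n - 1) / np.sqrt(n)
    return y_bar - gamma * s, y_bar + gamma * s

print(mean_ci(166.00, 2.2, 19))   # approximately [165.1, 166.9], as in the text
```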
Remark 1
We draw the reader's attention to an important issue which is typically not very
well stressed in courses on statistics. If the sample {y₁, ..., y_n} is obtained as a
result of measurements with a constant but unknown bias δ, then the confidence
interval (2.7.1) will be in fact a confidence interval for the sum μ + δ.
rods was randomly chosen and measured every hour, and the table shows data
over a particular period of 20 hours of production. The figures in this table are
the deviations of the rod diameters from the value T = 0.870, in thousandths of an
inch. Each row in the table corresponds to one hour. The last two columns give
the hour average and hour range for the four observations.¹
¹This material is borrowed from George Box and Alberto Luceño, Statistical Control by
Monitoring and Feedback Adjustment (1997), and is used by permission of John Wiley & Sons,
Inc., Copyright ©1997.
Figure 2.7. The X bar chart for the data in Table 2.10 produced by Statistix (center line 5.9125, control limits 23.506 and −11.681; horizontal axis: case number). The zero line corresponds to T = 0.870.
X Bar Chart
The line 5.9125 (near the mark "6") is the estimated mean value of the process.
This value is the average of the hour averages in Table 2.10. The lines 23.506
and −11.681 (near the marks "24" and "−12", respectively) are called control
limits and correspond to ±3σ_D/√4 deviations from the estimated mean value.
σ_D is the standard deviation of the rod diameter. We divide
by √4 because we are plotting ±3 standard deviations for the average of 4
observations.
An important statistical fact is that the sample means, due to the central
limit effect, closely follow the normal distribution, even if the rod diameters
themselves may not be normally distributed.
How do we estimate the standard deviation σ_D? The practice is to use the
average range. From the Range column of Table 2.10 one can calculate that the
average range is r̄ = 24.15. Now from Table 2.6 obtain the coefficient A_n for
the sample size n = 4: A_4 = 2.059. Then, as already explained in Sect. 2.6, the
estimate of σ_D will be

σ̂_D = r̄/A_4 = 24.15/2.059 = 11.73. (2.8.1)

The estimate of the standard deviation of the sample averages is √4 = 2 times smaller
and equals 5.86. We can round this result to 5.9.
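The control limits of Figure 2.7 follow from this arithmetic; a short sketch reproducing them:

```python
import numpy as np

A_4 = 2.059        # coefficient from Table 2.6 for samples of size n = 4
r_bar = 24.15      # average hourly range (Table 2.10)
center = 5.9125    # average of the hourly means

sigma_D = r_bar / A_4                 # (2.8.1): 11.73
sigma_mean = sigma_D / np.sqrt(4)     # std. deviation of the 4-observation averages
print(center + 3 * sigma_mean,        # upper control limit, about 23.5
      center - 3 * sigma_mean)        # lower control limit, about -11.7
```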
When the averages go outside the control limits, action must be taken
to establish the special cause.
The principal assumption of Shewhart's charts is that the observations taken
each hour are, for a state of control, sample values of identically distributed
random variables.
Range charts
The other graph which is important in Shewhart's methodology is the 'range
chart' or 'R chart'. Range charts are meant for detecting changes in the process
variability. They are constructed similarly to the X bar charts; see Fig. 2.8.
Each hour, the actual observed range of the sample of n = 4 rod diameters is
plotted on the chart.
Denote by σ̂_R the estimate of the standard deviation of the range (for n = 4).
The distribution of the range is only approximately normal, but nevertheless
±2σ̂_R or ±3σ̂_R limits are used to supply action limits and warning limits.
Figure 2.8. The R chart for the data in Table 2.10 (horizontal axis: case number). The average range is r̄ = 24.15. The upper control limit equals 55.115 = 24.150 + 3 · 10.32; the lower limit is zero.
For the normal distribution, the mean range and the standard deviation of
the range (for fixed sample size n) depend only on σ and are proportional to σ,
the standard deviation of the population. σ̂_R can be obtained from the average
range r̄ by the following formula:

σ̂_R = r̄/k_R(n). (2.8.2)

If the average range minus 3σ̂_R goes below zero, the lower limit is replaced by zero.
σ_Y = √(σ_pr² + σ_m²), (2.8.6)

where σ_pr and σ_m are the standard deviations of ε_pr and ε_instr, respectively.
If for example γ = σ_m/σ_pr = 1/7, then σ_Y = √(σ_pr² + σ_pr²/49) ≈ 1.01σ_pr.
Thus the overall standard deviation will increase by 1% only. Assuming that a
1% increase is admissible, we can formulate the following rule of thumb: The
standard deviation of the random measurement error must not exceed one sev-
enth of the standard deviation caused by the common causes in the controlled
process.
It follows from (2.8.7) that now the measurement Y ~ (μ, σ_pr²(1 + γ²)).
Suppose that γ = 0.37. Let us find out the corresponding ARL for n = 4.
The control limit equals 4.699 × σ_pr (see Table 2.11, column 5). This limit was
established for the ideal situation where σ_m = 0, i.e. γ = 0. Now the sample
range of the random variable Y has increased by a factor √(1 + γ²).
It can be shown that now the probability of crossing the control limit equals
P(R_n > 4.699/√(1 + γ²)).
For our example, √(1 + γ²) = √(1 + 0.37²) = 1.066, and we have to find P(R_n >
4.699/1.066) ≈ P(R_n > 4.41). This probability can be found using special
tables; see Dunin-Barkovsky and Smirnov (1956, p. 514). The desired quantity
is ≈ 0.01, from which it follows that ARL ≈ 100. Thus, if the standard deviation
of the measurement instrument is about 37% of the process standard deviation,
the ARL of the R chart decreases by a factor of approximately 0.5.
where β = δ_instr/σ_pr and Φ(·) is the distribution function of the standard
normal variable; see (2.3.3).
Example 2.8.1: The ARL for the X bar chart in the presence of instrument bias
Assume that the sample size is n = 4, γ = σ_m/σ_pr = 0.2 and β = 0.5. Thus we
assume that the measurement bias is half of σ_pr, and the standard deviation of
the measurement error is 20% of the process standard deviation.
Substituting these values into (2.8.13) we obtain, using the table in Appendix
A and taking Φ(3.92) = 1.000, that

P(Ȳ < LCL or Ȳ > UCL) = Φ(−1.96) − Φ(−3.92) ≈ 1 − 0.975 = 0.025.

The ARL is therefore 1/0.025 = 40. Let us compare this with the ARL for an
"ideal" measurement instrument (δ_instr = σ_m = 0).
The probability that the averages go outside the three-sigma limits equals 2Φ(−3) =
2(1 − 0.9986) = 0.0028. The corresponding ARL = 1/0.0028 ≈ 360. We see
therefore that the presence of δ_instr = 0.5σ_pr and σ_m = 0.2σ_pr reduces the ARL
by a factor of about 9.
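The two ARL values can be reproduced by a direct normal-tail computation. The sketch below illustrates the reasoning of this section and is not the book's formula (2.8.13) itself:

```python
from scipy.stats import norm

def arl_xbar(gamma, beta, n=4):
    """ARL of a 3-sigma X bar chart with instrument noise gamma = sigma_m/sigma_pr
    and bias beta = delta_instr/sigma_pr (a sketch of the computation in the text)."""
    infl = (1 + gamma**2) ** 0.5      # inflation of the standard deviation of Y
    shift = beta * n**0.5 / infl      # bias in units of the std. deviation of Y-bar
    p_out = norm.cdf(-3 / infl - shift) + norm.sf(3 / infl - shift)
    return 1 / p_out

print(arl_xbar(0.2, 0.5))   # about 40, as computed above
print(arl_xbar(0.0, 0.0))   # about 360 for an ideal instrument
```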
2.9 Exercises
1. Below are the results of 15 replicate determinations of nitrate ion concentra-
tion in mg/ml in a particular water specimen:
0.64,0.64,0.63,0.63,0.62,0.65,0.66,0.63,0.60, 0.64, 0.65, 0.66, 0.61, 0.62, 0.63.
¹Reprinted with permission from Andrew Gelman, John B. Carlin, Hal S. Stern and Donald
Rubin, Bayesian Data Analysis (2000), p. 70, Copyright CRC Press, Boca Raton, Florida.
28 26 33 24 34 -44 27 16 40 -2
29 22 24 21 25 30 23 29 31 19
24 20 36 32 36 28 25 21 28 29
Assume the normal model and check whether the lowest measurement -44
might be considered as an outlier.
Solution. Dixon's statistic Q = (−2 − (−44))/(40 − (−44)) = 42/84 = 0.50 > C_30 =
0.260; see Table 2.5.
J. Norris
¹Borrowed from Probability and Statistics for Engineering and the Sciences, 1st Edition,
by J. Devore. ©1982. Reprinted with permission of Brooks/Cole, a division of Thomson
Learning: www.thomsonrights.com. Fax 800 730-2215.
Figure 3.1. Box and whisker plots for samples A and B, Example 3.1.1 (vertical scale from 12 to 30)
Remark 1
What if the normality assumption regarding both populations is not valid? It
turns out that the t-test used in the above analysis is not very sensitive to
deviations from normality. More on this issue can be found in Box (1953), and
an enlightening discussion can be found in Scheffe (1956, Chapter 10). It is
desirable to use, in addition to the t-test, the so-called nonparametric approach,
such as a test based on ranks. We discuss in Sect. 3.2 a test of this kind which
is used to compare mean values in two or more populations.
Remark 2
It is quite typical of chemical measurements for there to be a bias or a systematic
error caused by a specific operation method in a specific laboratory. The bias
may appear as a result of equipment adjustment (e.g. a shift of the zero point),
operational habits of the operator, properties of reagents used, etc. It is vitally
important for the proper comparison of means to ensure that the two sets of
measurements (sample A and sample B) have the same systematic errors. Then,
in the formula for the T-statistic, the systematic errors in both samples cancel
out. Interestingly, this comment is rarely made in describing the applications
of the t-test.
In order to ensure the equality of systematic errors, the processing of samples
A and B must be carried out on the same equipment, in the same laboratory,
possibly by the same operator, and within a relatively short period of time.
These conditions are referred to in measurement practice as "repeatability con-
ditions"; see clause 4.2.1, "Two groups of measurements in one laboratory", of
the British Standard BS ISO 5725-6 (1994).
Remark 3
Is there a way to estimate the bias? In principle, yes. Prepare a reference sample
with a known quantity of the chemical substance. Divide this sample into n
portions, and carry out, under repeatability conditions, the measurements for each
Suppose that we can assume that the variances in populations A and B are
equal: σ_A = σ_B. Then the testing procedure for hypotheses about μ_A − μ_B is
similar to that described above, with some minor changes. The test statistic
will be

T = (X̄_A − X̄_B) / (s_p √(1/n_A + 1/n_B)), (3.1.3)

where s_p (called the pooled sample standard deviation) is defined by

s_p² = ((n_A − 1)s_A² + (n_B − 1)s_B²) / (n_A + n_B − 2).
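Before turning to the example, here is a sketch of the pooled t-test on two hypothetical samples; scipy's ttest_ind with equal_var=True implements exactly this statistic.

```python
import numpy as np
from scipy import stats

# Hypothetical samples A and B obtained under repeatability conditions
a = np.array([25.1, 24.8, 25.3, 25.0, 24.9, 25.2])
b = np.array([25.6, 25.4, 25.9, 25.5, 25.7, 25.3])

print(stats.ttest_ind(a, b, equal_var=True))   # pooled-variance t-test

# The same statistic from formula (3.1.3):
sp = np.sqrt(((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
             / (len(a) + len(b) - 2))
print((a.mean() - b.mean()) / (sp * np.sqrt(1 / len(a) + 1 / len(b))))
```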
Table 3.2: Weight loss (%) in Experiment 1

Driver      1    2    3    4    5     6    7     8
Weight loss 5.0  7.5  8.0  6.5  10.0  9.5  12.5  11.0

Table 3.3: Weight loss (%) in Experiment 2

Driver      1    2    3    4    5     6    7     8
Weight loss 5.5  8.5  9.0  7.0  11.0  9.5  13.5  11.5
Example 3.1.2: Comparing the wear of two materials for disk brake shoes
First, consider a "traditional" two-sample approach. Two groups of n = 8
similar cars (same model, same production year) are chosen. The first (group
A) has front wheel brake shoes made of material A (each car has two pairs
of shoes, one pair for each front wheel). Each particular shoe is marked and
weighed before its installation in the car. Eight drivers are chosen randomly
from the pool of drivers available, and each driver is assigned to a certain car.
In the first experiment, each driver drives his car for 1000 miles. Afterwards
the brake shoes are removed, their weight is compared to the initial weight,
the relative loss of weight for each shoe is calculated, and the average loss of
weight for all four shoes of each car is recorded. In the second experiment,
the same cars (each with the same driver) are equipped with brake shoes made
of material B, and the whole procedure is repeated. To exclude any driver-
material interaction, the drivers do not know which material is used in each
experiment.
The results obtained for Experiments 1 and 2 are presented in Tables 3.2
and 3.3.
It is obvious from Fig. 3.2 that material B shows greater wear than material
A. Moreover, this claim is not only true "on average", but also holds for each
car.
The amount of wear has a large variability caused by shoe material nonho-
mogeneity as well as by the variations in the driving habits of different drivers and
differences in the roads driven by the cars. From the statistical point of view,
the crucial fact is that this variability hides the true difference in the wear.
To show this, let us apply the above-described t-test to test the null hypoth-
esis μ_A = μ_B against the obvious alternative μ_A < μ_B. (Here index A refers to
Experiment 1 and index B to Experiment 2.)
Table 3.4: The difference of wear measured by average weight loss in percentage

Driver      1    2    3    4    5    6    7    8
Weight loss 0.4  1.1  1.6  0.6  0.9  0.1  1.2  0.4
Figure 3.2. Comparison of weight loss (%) for Experiments 1 and 2, driver by driver
The 0.05-critical value t_0.05(14) = 1.761. We must reject the null hypothesis
only if the computed value of T is smaller than −1.761. This is not the case,
and therefore our t-test fails to confirm that material B shows greater wear than
material A.
Now let us consider a more clever design for the whole experiment. The disk
brake has two shoes. Let us make one shoe from material A and the other from
material B. Suppose that 8 cars take part in the experiment, and that the same
drivers take part with their cars. Each car wheel is equipped with shoes made
of different materials (A and B), and the location (outer or inner) of the A shoe
is chosen randomly.
For each car, we record the difference in the average weight loss for shoes
of materials A and B, i.e. we record the average weight loss of the B shoes minus
the average weight loss of the A shoes. Table 3.4 shows simulated results for this
imaginary experiment. For each car, the data in this table were obtained as
the difference between the previously observed values perturbed by adding random
numbers.
There is no doubt that these data point to a significant advantage of
material A over material B. The calculations show that the average difference
in wear is d̄ = 0.79 and the sample variance s_d² = 0.25. Let us construct a 95%
confidence interval for the mean weight loss difference μ_d; see Sect. 2.7. The
confidence interval has the form [d̄ − t_0.025(7)·s_d/√8, d̄ + t_0.025(7)·s_d/√8].
Ranks
Consider the ordered sample

(5, 7, 9, 12, 14, 17, 18, 25)

and number its elements from left to right. The rank of any observation will be its
ordinal number. So, the rank of 5 is r(5) = 1, the rank of 7 is r(7) = 2, etc.
How do we assign the ranks if several observations are tied, i.e. are equal to each
other, as in the following sample: {12, 12, 14, 17, 7, 25, 5, 18}?
Order the sample, assign an ordinal number to each observation, and define
the rank of a tied observation as the corresponding average rank:
Ordered sample: 5, 7, 12, 12, 14, 17, 18, 25;
Ordinal numbers: 1, 2, 3, 4, 5, 6, 7, 8;
Ranks: 1, 2, 3.5, 3.5, 5, 6, 7, 8.
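The same tie-handling rule is implemented by scipy's rankdata; a one-line check of the example above:

```python
from scipy.stats import rankdata

print(rankdata([12, 12, 14, 17, 7, 25, 5, 18]))
# [3.5 3.5 5. 6. 2. 8. 1. 7.]: the tied 12s share the average rank (3 + 4)/2 = 3.5
```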
Notation for the KW Test
We consider I samples, numbered i = 1, 2, ..., I; sample i has J_i observations.
N = J_1 + ⋯ + J_I is the total number of observations; X_ij is the jth observation
in the ith sample.
It is assumed that the ith sample comes from the population described by the
random variable

X_ij = μ_i + ε_ij, i = 1, ..., I, j = 1, 2, ..., J_i, (3.2.1)

where all random variables ε_ij have the same continuous distribution. Without
loss of generality, it can be assumed that μ_i is the mean value of X_ij.
It is important to stress that the Kruskal-Wallis procedure does not demand
that the random samples be drawn from normal populations. It suffices to
demand that the populations involved have the same distribution but may differ
in their location.
Suppose that all X_ij values are pooled together and ranked in increasing
order. Denote by R_ij the rank of the observation X_ij. Denote by R_i. the total
rank (the sum of the ranks) of all observations belonging to sample i; R̄_i. =
R_i./J_i is the average rank of sample i.
The null hypothesis H₀ is that all μ_i are equal:

μ_1 = μ_2 = ⋯ = μ_I. (3.2.2)

In view of the assumption that all ε_ij have the same distribution, the null
hypothesis means that all I samples belong to the same population.
If H₀ is true, one would expect the values of R̄_i. to be close to each other
and hence close to the overall average

R̄.. = (R_1. + ⋯ + R_I.)/N = (N + 1)/2. (3.2.3)

An appropriate criterion for measuring the overall closeness of the sample rank
averages to R̄.. is a weighted sum of the squared differences (R̄_i. − R̄..)².
The Kruskal-Wallis statistic is given by the following expression:

KW = (12/(N(N + 1))) Σ_{i=1}^{I} J_i (R̄_i. − (N + 1)/2)². (3.2.4)
Proposition 3.2.1
When the null hypothesis is true and either I = 3, J_i ≥ 6, i = 1, 2, 3,
or I > 3, J_i ≥ 5, i = 1, 2, ..., I, then KW has approximately a chi-square
distribution with ν = I − 1 degrees of freedom (see Devore 1982, p. 597).
The alternative hypothesis for which the KW test is most powerful is the
following:
H*: not all population means μ_1, ..., μ_I are equal.
Since KW is zero when all R̄_i. are equal and is large when the samples are
shifted with respect to each other, the null hypothesis is rejected for large values
of KW. According to the above Proposition 3.2.1, the null hypothesis is rejected
at significance level α if KW > χ²_α(ν), where χ²_α(ν) is the 1 − α quantile of the
chi-square distribution with ν degrees of freedom.
variables distributed as N(0, 1). The corresponding critical values are defined
as follows:

α = P(χ²(ν) > χ²_α(ν)). (3.2.5)
Table 3.5 gives the critical values χ²_α(ν) for the KW test. A more complete
table of the quantiles of the chi-square distribution is presented in Appendix B.
Let us consider an example.
Example 3.2.1: Silicon wafer planarity measurements¹
Silicon wafers undergo a special chemical-mechanical planarization proce-
dure in order to achieve ultra-flat wafer surfaces. To control the process per-
formance, a sample of wafers from one batch was measured at nine sites. Table
3.6 presents a fragment of a large data set, for five wafers and for four sites.
Assuming that the wafer thickness at different sites can differ only by a shift
parameter, let us check the null hypothesis H₀ that the thickness has the same
mean value at all four sites.
The mean ranks are 4.6, 8.0, 13.8 and 15.6 for sites 1 through 4, respectively.
The KW statistic equals 11.137 on ν = I − 1 = 3 degrees of freedom. From Table 3.5, it
follows that the null hypothesis is rejected at significance level 0.05.
¹Source: Arnon M. Hurwitz and Patrick D. Spagon, "Identifying sources of variation", pp.
105-114, in the collection by Veronica Czitrom and Patrick D. Spagon, Statistical Case Studies
for Industrial Process Improvement ©1997. Borrowed with the kind permission of the ASA
and SIAM.
Table 3.6: Thickness (in angstroms) for four sites on the wafer
Figure 3.3. Box and whisker plot of wafer thickness at sites 1, 2, 3 and 4 (vertical scale from 3090 to 3330 angstroms)
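A sketch of the test on hypothetical site data (the actual Table 3.6 values are not reproduced here); scipy's kruskal computes the statistic (3.2.4), with a correction for ties:

```python
from scipy.stats import kruskal, chi2

# Hypothetical thickness readings (angstroms), five wafers at each of four sites
site1 = [3087, 3095, 3102, 3110, 3098]
site2 = [3110, 3125, 3118, 3130, 3121]
site3 = [3150, 3162, 3158, 3170, 3165]
site4 = [3168, 3180, 3175, 3190, 3182]

kw, p_value = kruskal(site1, site2, site3, site4)
print(kw, p_value)
print(chi2.ppf(0.95, df=3))   # 7.81, the 0.05 critical value for nu = I - 1 = 3
```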
Table 3.7: Critical values of Hartley's statistic for α = 0.05 (Sachs 1972, Table 151)
Hartley's Test
Assume that we have k samples of equal size n drawn from independent normal
populations. Let s_1², s_2², ..., s_k² be the estimates of the corresponding variances
computed using (2.1.8).
To test the null hypothesis H₀: σ_1² = σ_2² = ⋯ = σ_k², we have to compute
the following statistic due to Hartley:

F_max = max s_i² / min s_i². (3.3.1)

The critical values of the F_max statistic are given in Table 3.7. The null hy-
pothesis is rejected (at the significance level α = 0.05) in favor of the alternative
that at least one of the variances is different from the others, if the observed value of
Hartley's statistic exceeds the corresponding critical value in Table 3.7.
Example 3.3.1: Testing the equality of variances for the wafer thickness data
Using formula (2.1.8), we compute the variances for the thickness of sites 1
through 4. In this example, the number of samples is k = 4, and the sample size
is n = 5. We obtain the following results:

s_1² = 2019, s_2² = 8488, s_3² = 2789, s_4² = 5985,

so that F_max = 8488/2019 ≈ 4.2.
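The computation in two lines:

```python
variances = [2019, 8488, 2789, 5985]      # s_i^2 for sites 1 through 4
print(max(variances) / min(variances))    # Hartley's statistic (3.3.1): about 4.2
```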
Bartlett's Test
A widely used procedure is Bartlett's test. Unlike Hartley's test, it is applicable
for samples of unequal size. The procedure involves computing a statistic whose
distribution is closely approximated by the chi-square distribution with k − 1
degrees of freedom, where k is the number of random samples from independent
normal populations.
Let n_1, n_2, ..., n_k be the sample sizes, and N = n_1 + ⋯ + n_k. The test
statistic is

B = (log_e 10) · q/C, (3.3.2)

where

q = (N − k) log_10 s_p² − Σ_{i=1}^{k} (n_i − 1) log_10 s_i², (3.3.3)

C = 1 + (1/(3(k − 1))) ( Σ_{i=1}^{k} (n_i − 1)⁻¹ − (N − k)⁻¹ ), (3.3.4)

and s_p² is the so-called pooled variance of all sample variances s_i², given by

s_p² = Σ_{i=1}^{k} (n_i − 1)s_i² / (N − k). (3.3.5)

The quantity q is large when the sample variances s_i² differ greatly and is equal
to zero when all s_i² are equal. We reject H₀ at the significance level α if

B > χ²_α(k − 1). (3.3.6)

Here χ²_α(k − 1) is the 1 − α quantile of the chi-square distribution with k − 1
degrees of freedom. These values are given in Table 3.5.
It should be noted that Bartlett's test is very sensitive to deviations from
normality and should not be applied if the normality assumption for the popu-
lations involved is doubtful.
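Bartlett's test is available directly in scipy; its statistic is the same as (3.3.2), since multiplying a decimal logarithm by log_e 10 converts it to a natural logarithm. A sketch on hypothetical samples of unequal sizes:

```python
from scipy.stats import bartlett

# Hypothetical samples from three normal populations
g1 = [10.1, 10.4, 9.8, 10.2, 10.0]
g2 = [9.5, 9.9, 10.6, 10.3, 9.7, 10.1]
g3 = [10.8, 10.2, 10.5, 10.9]

stat, p_value = bartlett(g1, g2, g3)   # chi-square with k - 1 = 2 degrees of freedom
print(stat, p_value)
```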
3.4 Exercises
1. Consider the first and the fourth site data in Table 3.6. Use the t-test
to check the hypothesis that μ_1 = μ_4.
2. Use Hartley's test to check the null hypothesis on the equality of variances
in five populations of day 1 through day 5; see Table 4.1.
4. The paper by Bisgaard (2002) presents data on the tensile strength of high
voltage electric cables. Each cable is composed of 12 wires. To examine the
tensile strength, nine cables were sampled from those produced. For each cable,
a small piece of each of the 12 wires was subjected to a tensile strength test.
The data are shown in Table 3.9.¹ The box and whisker plot for these data is
shown in Fig. 3.4.
a. Compute the cable tensile strength variances and use Hartley's test to
check the null hypothesis that all nine samples are drawn from populations with
equal variances.
¹Reprinted from Quality Engineering (2002), Volume 14(4), p. 680, by courtesy of Marcel
Dekker, Inc.
Figure 3.4. Box and whisker plot for the cable tensile strength data (vertical scale from 325 to 355; cables N1-N9)
Hint: max s_i² = 48.4, min s_i² = 10.1. Hartley's test statistic is 4.8, smaller
than the 0.05-critical level for k = 9 samples and n − 1 = 11 degrees of freedom,
which lies between 6.72 and 8.28; see Table 3.7.
b. Use the Kruskal-Wallis procedure to test the null hypothesis that the
mean strength for all nine cables is the same.
Solution. The Kruskal-Wallis statistic (3.2.4), computed from the average ranks
of samples 1 through 9, is equal to 45.4, which is far above the 0.005
critical value for ν = I − 1 = 8 degrees of freedom. Thus, we definitely reject
the null hypothesis.
c. Using multiple comparisons, determine groups of cables which are similar
with respect to their mean tensile strength.
Solution. Statistix produces the following result at α = 0.05:
There are 3 groups in which the means are not significantly different from
one another.
Group 1: 9,8,7,6,5;
Group 2: 8,5,7,6,4,1;
Group 3: 1,2,3,4,5,6,7.
Bisgaard (2002) notes "that a more detailed examination of the manufacturing
process revealed that the cables had been manufactured from raw materials
Table 3.9: Tensile strength of 12 wires for each of nine cables (Bisgaard 2002,
p. 680)
Wire no.  Cable: 1   2   3   4   5   6   7   8   9
1 345 329 340 328 347 341 339 339 342
2 327 327 330 344 341 340 340 340 346
3 335 332 325 342 345 335 342 347 347
4 338 348 328 350 340 336 341 345 348
5 330 337 338 335 350 339 336 350 355
6 334 328 332 332 346 340 342 348 351
7 335 328 335 328 345 342 347 341 333
8 340 330 340 340 342 345 345 342 347
9 337 345 336 335 340 341 341 337 350
10 342 334 339 337 339 338 340 346 347
11 333 328 335 337 330 346 336 340 348
12 335 330 329 340 338 347 342 345 341
taken from two different lots, cable Nos. 1-4 having been made from lot A and
cable Nos. 5-9 from lot B".
Chapter 4
Sources of Uncertainty: Process and Measurement Variability
Virgil
for magnesium content will be different for different samples. The variability is
introduced by the operator, measurement instrument bias and "pure" measure-
ment errors, i.e. the uncertainty built into the chemical measurement procedure.
In this chapter we describe several statistical models which allow us to esti-
mate separately the two main contributions to the overall variability: produc-
tion process variability and measurement process variability. The first model,
described in the next section, is a one-way ANOVA with random effects.
Figure 4.1. The box and whisker plot for the data in Example 4.2.1 (35 cases, days 1-5; vertical scale from 153 to 188)
The Model
Let i be the day (batch) number, i = 1, 2, ..., I, and let j be the measurement
number (i.e. the sample number) within a batch, j = 1, 2, ..., J. X_ij denotes the
measurement result for sample j of batch i:

X_ij = μ + A_i + ε_ij. (4.2.1)

Here A_i is the random batch deviation from the overall mean μ. It changes ran-
domly from batch to batch (from day to day), but remains constant for all
samples within one day, i.e. A_i remains constant for all measurements made on
the samples prepared from the material produced during one day. ε_ij is the
random measurement error.
The model (4.2.1) is the so-called single random effect (or single random
factor) model. An excellent source on ANOVA with random effects is Sa-
hai and Ageel (2000). The mathematical model of a single-factor design with
random effects is described there on pp. 11 and 24.
Table 4.2 presents the data in matrix form. The measurement results from
batch (day) i are positioned in the ith column, and X_ij is the measurement
result from sample j of batch i.
Further analysis is based on the following assumptions:
(i) The A_i are assumed to be randomly distributed with mean zero and
variance σ_A².
(ii) The ε_ij are assumed to be randomly distributed with mean zero and
variance σ_e².
(iii) For any pair of observations, the corresponding measurement errors are
uncorrelated; any A_i is uncorrelated with any other A_j, i ≠ j, and with any
measurement error ε_kl.
It follows from (4.2.1) that E[X_ij] = μ and Var[X_ij] = σ_A² + σ_e².
Estimation of σ_A and σ_e
Define the sum of squares of the batch sample mean deviations from the
overall mean, ss_A, according to the following formula:

ss_A = J Σ_{i=1}^{I} (X̄_i. − X̄..)², (4.2.5)

and the within-batch sum of squares ss_e = Σ_{i=1}^{I} (J − 1) s_i², where

s_i² = Σ_{j=1}^{J} (X_ij − X̄_i.)² / (J − 1). (4.2.7)
Now we are ready to present the formulas for the point estimates of σ_e and σ_A:

σ̂_e = √( ss_e / (I(J − 1)) ), (4.2.8)

σ̂_A = √( (ss_A/(I − 1) − σ̂_e²) / J ). (4.2.9)

If the expression under the square root in (4.2.9) is negative, the estimate of σ_A
is set to zero.
Let us outline the theory behind these estimates. Let X̄_i. be the random
variable expressing the average for day i, i = 1, ..., I, and let X̄.. be the mean of
all observations. Denote by MS_A and MS_e the so-called mean squares; in
particular,

E[MS_A] = E[ Σ_{i=1}^{I} Σ_{j=1}^{J} (X̄_i. − X̄..)² / (I − 1) ] = σ_e² + Jσ_A². (4.2.14)
If the computation results show that σ̂_A is very small, then it is a good idea to test the null hypothesis that σ_A = 0. The corresponding testing procedure is carried out under the normality assumption regarding all random variables involved, A_i and ε_ij. This procedure is based on the so-called F-ratio defined in Table 4.3, and it works as follows.
Compute the ratio
$$F = \frac{ss_A/(I-1)}{ss_e/(I(J-1))}. \qquad (4.2.20)$$
If F exceeds the α-critical value of the F-statistic with I-1, I(J-1) degrees of freedom, F_{I-1, I(J-1)}(α), then the null hypothesis σ_A = 0 is rejected at the significance level α in favor of the alternative σ_A > 0.
For Example 4.2.1, F = 8.06. From Appendix C we see that for α = 0.01, F_{4,30}(0.01) = 4.02. Since this number is smaller than the computed F-value, we reject the null hypothesis at level α = 0.01.
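Continuing the sketch above, the F-test can be carried out in a modern Mathematica session, where the F distribution is built in; the commented value reproduces the Appendix C entry:

(* F-test of H0: sigma_A = 0, using ssA, ssE, nI, nJ from the sketch above. *)
f = (ssA/(nI - 1))/(ssE/(nI (nJ - 1))) // N;              (* Eq. (4.2.20) *)
fcrit = Quantile[FRatioDistribution[nI - 1, nI (nJ - 1)], 0.99];
(* e.g. Quantile[FRatioDistribution[4, 30], 0.99] gives 4.02, as in the text *)
If[f > fcrit, "reject H0 at level 0.01", "do not reject H0"]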
Concluding remarks

In the model considered in this section, apart from the measurement errors, the only additional source of result variability is the process batch-to-batch variation. This was reflected in the choice of model, the one-factor ANOVA with random effects. In production and measurement practice, we meet more complicated situations. Assume, for example, that the batches are highly nonhomogeneous, and each batch is divided into several samples within which the product is expected to have relatively small variations. Formalization of this situation leads to a two-factor hierarchical (or nested) design. This will be considered in the next section.
It happens quite frequently that there are several, most often two, principal factors influencing the measurement process variability. A typical situation with two random factors is described in Sect. 4.4. There we will consider a model with two random sources influencing measurement result variability, one of which is the part-to-part variation, and the other the variability brought into the measurement process by using different operators to carry out the measurements.
4.3 Hierarchical Measurement Design
We will refer to each day's output as a batch. In our example the batch will be a long metal rod. We analyze I batches. For the analysis, J samples are randomly taken from each batch. In our case it might be a collection of J small specimens cut randomly along the whole length of the rod. Each sample is measured K times in the lab to determine its titanium content. The variation in these K measurements from one sample is introduced solely by the variations in the measurement process. If the measurement process were "ideal", all K measurements would be identical.
Schematically, this measurement design can be presented in the form of a "tree", as shown in Fig. 4.2. In the literature it is often called a "nested" or "hierarchical" design.
"} SAMPLES
PER BATCH
0 Q 0 0 0 0 0 0 0 0 0 K
0 0 0 0 0 0 0 0 0 0 0 MEASUREMENTS
0 0 0 0 0 0 0 0 0 0 0 PER SAMPlE
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
The model is
$$X_{ijk} = \mu + A_i + B_{j(i)} + \epsilon_{ijk}, \qquad (4.3.1)$$
where μ is the overall mean, A_i is the random contribution of batch i, B_{j(i)} is the random contribution of sample j within batch i, and ε_ijk is the error term that reflects the measurement error for the kth measurement of the jth sample in the ith batch.
We make the following assumptions:
(i) The A_i are zero-mean random variables with variance σ_A².
(ii) The B_{j(i)} are zero-mean random variables with variance σ_B².
(iii) The ε_ijk are zero-mean random variables with variance σ_e².
(iv) The A_i and B_{j(i)} are mutually uncorrelated; the ε_ijk are uncorrelated between themselves and uncorrelated with A_i and B_{j(i)}.
Our purpose is to estimate σ_e, σ_A and σ_B. We need some notation. The observed value of the random variable X_ijk will be denoted by x_ijk. Let x̄_ij· be the sample average of all K measurements in the jth sample of batch i. With ss_e, ss_B(A) and ss_A denoting the within-sample, between-sample-within-batch and between-batch sums of squares defined in (4.3.5)-(4.3.7), the point estimates are
$$\hat\sigma_e = \sqrt{\frac{ss_e}{IJ(K-1)}}; \qquad (4.3.8)$$
$$\hat\sigma_B = \sqrt{\Big(\frac{ss_{B(A)}}{I(J-1)} - \hat\sigma_e^2\Big)\Big/K}; \qquad (4.3.9)$$
$$\hat\sigma_A = \sqrt{\Big(\frac{ss_A}{I-1} - K\hat\sigma_B^2 - \hat\sigma_e^2\Big)\Big/(JK)}. \qquad (4.3.10)$$
Note that if the expression under the square root in (4.3.9) or (4.3.10) is negative, then the corresponding estimate is set to zero. Table 4.5 summarizes the information used for the point estimation of σ_e, σ_A and σ_B.
It is worth noting that the formulas for the point estimates (4.3.8)-(4.3.10) are derived without using any assumptions regarding the form of the distribution of the random variables A_i, B_{j(i)}, ε_ijk. The normality of their distribution must be assumed at the stage of testing hypotheses regarding the parameters involved.
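For a concrete illustration, here is a sketch of (4.3.8)-(4.3.10) applied, for simplicity, to just the first two batches of the table below (I = 2 batches, J = 2 samples, K = 2 replicates):

(* Variance component estimates for the nested design, Eqs. (4.3.8)-(4.3.10).
   x[[i, j, k]] = k-th measurement of sample j in batch i. *)
x = {{{74.1, 74.3}, {68.2, 67.8}}, {{75.4, 74.8}, {71.5, 71.5}}};
{nI, nJ, nK} = Dimensions[x];
cellM = Map[Mean, x, {2}];                 (* sample means x_ij. *)
batchM = Mean /@ cellM;                    (* batch means  x_i.. *)
grand = Mean[Flatten[x]];
ssE = Total[Map[Total[(# - Mean[#])^2] &, x, {2}], 2];
ssBA = nK Total[(cellM - batchM)^2, 2];
ssA = nJ nK Total[(batchM - grand)^2];
sigE = Sqrt[ssE/(nI nJ (nK - 1))]                         (* Eq. (4.3.8) *)
sigB = Sqrt[Max[0, (ssBA/(nI (nJ - 1)) - sigE^2)/nK]]     (* Eq. (4.3.9) *)
sigA = Sqrt[Max[0, (ssA/(nI - 1) - nK sigB^2 - sigE^2)/(nJ nK)]]  (* (4.3.10) *)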
For the data of Example 4.3.1 (shown in the table below),
$$ss_{B(A)} = 2\Big(\frac{(74.2-68.0)^2}{2} + \cdots + \frac{(81.9-78.1)^2}{2}\Big) = 234.0.$$
Batch     1     1     2     2     3     3     4     4     5     5
Sample    1     2     1     2     1     2     1     2     1     2
         74.1  68.2  75.4  71.5  59.4  63.2  81.7  69.9  81.7  78.2
         74.3  67.8  74.8  71.5  58.6  63.6  82.3  69.7  82.1  78.0
Average  74.2  68.0  75.1  71.5  59.0  63.4  82.0  69.8  81.9  78.1
The Model

To give the notions of repeatability and reproducibility more accurate definitions, we need a mathematical model describing the formal structure of the measurement results.
The kth measurement result in the (i, j)th cell is considered as a random variable which will be denoted by X_ijk. We assume that it has the following form:
$$X_{ijk} = \mu + A_i + B_j + (AB)_{ij} + \epsilon_{ijk}. \qquad (4.4.1)$$
²Reprinted with permission from Technometrics. Copyright 1999 by the American Statistical Association. All rights reserved.
Here μ is the overall mean; A_i is the random contribution to μ due to item i; B_j is the random contribution to μ due to operator j; (AB)_ij is the random term describing the joint contribution to μ of item i and operator j, which is called the random i-j interaction term. ε_ijk is the "pure" measurement error in the kth measurement for a fixed combination of i and j.
Formally, (4.4.1) is called a two-way crossed design with random factors (or random effects); see Sahai and Ageel (2000, Chapt. 4).
The observed values of X_ijk will be denoted by x_ijk. For example, the observed K measurement results in the (i, j)th cell will be denoted by x_ij1, ..., x_ijK.
We make the following assumptions regarding the random variables A_i, B_j, (AB)_ij and ε_ijk:
(i) The A_i are zero-mean random variables with variance σ_A².
(ii) The B_j are zero-mean random variables with variance σ_B².
(iii) The (AB)_ij are zero-mean random variables with variance σ_AB².
(iv) The ε_ijk are zero-mean random variables with variance σ_e².
(v) All the above-mentioned random variables are mutually uncorrelated.
Assumptions (i)-(v) allow us to obtain point estimates of the variances involved and of some functions of them which are of interest in repeatability and reproducibility studies.
To obtain confidence intervals and to test hypotheses regarding the variances, the following assumption will be made:
(vi) All random variables involved, A_i, B_j, (AB)_ij and ε_ijk, are normally distributed.
$$\sigma_{\mathrm{repro}} = \sqrt{\sigma_B^2 + \sigma_{AB}^2} \qquad (4.4.2)$$
is the standard deviation of measurements made by many operators measuring the same item in the absence of repeatability variation. In R&R studies, an important parameter is
$$\sigma_{R\&R} = \sqrt{\sigma_e^2 + \sigma_B^2 + \sigma_{AB}^2}, \qquad (4.4.3)$$
which reflects the standard deviation of all measurements made on the same item. Finally, another parameter of interest, which is analogous to σ_repro, is
$$\sigma_{\mathrm{items}} = \sqrt{\sigma_A^2 + \sigma_{AB}^2}. \qquad (4.4.4)$$
First, define the following four sums of squares based on the observed values of the random variables X_ijk:
$$ss_A = JK\sum_{i=1}^{I}(\bar x_{i\cdot\cdot} - \bar x_{\cdot\cdot\cdot})^2, \qquad (4.4.9)$$
$$ss_B = IK\sum_{j=1}^{J}(\bar x_{\cdot j\cdot} - \bar x_{\cdot\cdot\cdot})^2, \qquad (4.4.10)$$
$$ss_{AB} = K\sum_{i=1}^{I}\sum_{j=1}^{J}(\bar x_{ij\cdot} - \bar x_{i\cdot\cdot} - \bar x_{\cdot j\cdot} + \bar x_{\cdot\cdot\cdot})^2, \qquad (4.4.11)$$
$$ss_e = \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K}(x_{ijk} - \bar x_{ij\cdot})^2. \qquad (4.4.12)$$
Let us now define four random sums of squares SS_A, SS_B, SS_AB and SS_e by replacing in (4.4.9)-(4.4.12) the lower-case letters x_ijk, x̄_i··, x̄_·j· and x̄_··· by the corresponding capital letters which denote random variables. In this replacement operation, X̄_ij· is the random mean of all observations in cell (i, j); X̄_i·· is the random mean of all observations in row i; X̄_·j· is the random mean of all observations in column j; X̄_··· denotes the random mean of all I × J × K observations. The properties of these random sums of squares are summarized in Table 4.8.
Replacing the expected values of SS_e, SS_A, SS_B and SS_AB by their observed values ss_e, ss_A, ss_B, ss_AB, and the variances σ_e², σ_A², σ_B², σ_AB² by their estimates, we arrive at the following four equations for the point estimates of the variances in the model:
$$\frac{ss_e}{IJ(K-1)} = \hat\sigma_e^2, \qquad (4.4.13)$$
$$\frac{ss_A}{I-1} = \hat\sigma_e^2 + K\hat\sigma_{AB}^2 + JK\hat\sigma_A^2, \qquad (4.4.14)$$
$$\frac{ss_B}{J-1} = \hat\sigma_e^2 + K\hat\sigma_{AB}^2 + IK\hat\sigma_B^2, \qquad (4.4.15)$$
$$\frac{ss_{AB}}{(I-1)(J-1)} = \hat\sigma_e^2 + K\hat\sigma_{AB}^2. \qquad (4.4.16)$$
Combining the variance estimates, we arrive after a little algebra at the following estimates of the parameters which are of interest in R&R studies:
$$\hat\sigma_{\mathrm{repro}}^2 = \max\Big(0,\ \frac{ss_B}{IK(J-1)} + \frac{ss_{AB}}{IK(J-1)} - \frac{ss_e}{IJK(K-1)}\Big), \qquad (4.4.17)$$
$$\hat\sigma_{R\&R} = \sqrt{\frac{ss_B}{IK(J-1)} + \frac{ss_{AB}}{IK(J-1)} + \frac{ss_e}{IJK}}, \qquad (4.4.18)$$
$$\hat\sigma_{\mathrm{items}}^2 = \max\Big(0,\ \frac{ss_A}{JK(I-1)} + \frac{ss_{AB}}{JK(I-1)} - \frac{ss_e}{IJK(K-1)}\Big). \qquad (4.4.19)$$
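Once the four observed sums of squares are available, the computation is mechanical. A minimal sketch, written as a reusable function of the sums of squares (4.4.9)-(4.4.12) and the design dimensions:

(* R&R point estimates from the observed sums of squares; i0, j0, k0 are
   the numbers of items, operators and replicates. Returns estimates of
   {sigma_e, sigma_repro, sigma_R&R, sigma_items}. *)
rr[ssA_, ssB_, ssAB_, ssE_, i0_, j0_, k0_] :=
 Module[{se2, repro2, items2},
  se2 = ssE/(i0 j0 (k0 - 1));                                 (* (4.4.13) *)
  repro2 = Max[0, ssB/(i0 k0 (j0 - 1)) + ssAB/(i0 k0 (j0 - 1)) -
      ssE/(i0 j0 k0 (k0 - 1))];                               (* (4.4.17) *)
  items2 = Max[0, ssA/(j0 k0 (i0 - 1)) + ssAB/(j0 k0 (i0 - 1)) -
      ssE/(i0 j0 k0 (k0 - 1))];                               (* (4.4.19) *)
  {Sqrt[se2], Sqrt[repro2],
   Sqrt[ssB/(i0 k0 (j0 - 1)) + ssAB/(i0 k0 (j0 - 1)) + ssE/(i0 j0 k0)],
   Sqrt[items2]}]                                             (* (4.4.18) *)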
For the example of this section, ss_AB = 0.000176 and ss_e = 0.00102. Using formulas (4.4.13) and (4.4.17)-(4.4.19), we find the corresponding point estimates.
The quantiles of the chi-square distribution are presented in Appendix B for β = 0.01, 0.025, 0.95, 0.975, 0.99.
All our derivations are based on the following claim, proved in statistics courses.

Claim 4.4.1
If all random variables in model (4.4.1) have a normal distribution, then SS_e/σ_e² has a chi-square distribution with IJ(K-1) degrees of freedom, and analogous statements hold for SS_AB and SS_B. For the data of the example this leads to the confidence interval [0.0055, 0.010] for σ_e.
The confidence interval for σ_repro will be obtained using the so-called M-method developed in reliability theory (see Gertsbakh 1989, Chap. 5). This method works as follows.
Suppose that the random set S_{1-2β} covers in the three-dimensional parametric space the point σ² = (σ_B², σ_AB², σ_e²) with probability 1 - 2β. Let m(S_{1-2β}) and M(S_{1-2β}) be the minimum and the maximum of the function ψ(σ²) = σ_B² + σ_AB² on this set. Then the following implication holds:
$$(\sigma_B^2, \sigma_{AB}^2, \sigma_e^2) \in S_{1-2\beta} \ \Rightarrow\ m(S_{1-2\beta}) \le \psi(\sigma^2) \le M(S_{1-2\beta}). \qquad (4.4.31)$$
Therefore,
$$P\{m(S_{1-2\beta}) \le \sigma_{\mathrm{repro}}^2 \le M(S_{1-2\beta})\} \ge 1 - 2\beta. \qquad (4.4.32)$$
Therefore, (m, M) is the interval which contains σ_AB² + σ_B² = σ_repro² with probability at least 0.90.
Now substitute ss_B = 0.000502, q_0.95(4) = 9.488, q_0.05(4) = 0.711 and obtain that the confidence interval for σ_repro is
(√m, √M) = [0.003, 0.015].
Vardeman and VanValkenburg (1999) suggest constructing the confidence intervals for σ_R&R, σ_repro and σ_items using the approximate values of the standard errors of the corresponding parameter estimates. This method is based on the error propagation formula, and we will explain in the next chapter how this method works.
Table 4.9: Pull strength in grams for 10 units with 5 wires each
wire  Un1   Un2   Un3   Un4   Un5   Un6   Un7   Un8   Un9   Un10
1 11.6 9.7 11.3 10.1 10.9 9.7 9.6 10.7 10.8 9.9
2 11.3 11.2 11.0 10.2 10.8 11.0 10.3 11.2 10.6 9.4
3 10.3 9.8 10.2 10.8 10.3 10.3 9.2 9.9 9.2 11.3
4 11.7 11.0 11.1 8.6 11.7 10.8 9.1 9.7 10.6 10.8
5 10.3 9.6 11.4 10.8 10.6 9.3 10.2 9.7 10.5 11.3
bond pull test is used to pull on the wire to determine the strength required to disconnect the wire from the lead frame.
Two experiments were designed to establish the pull strength repeatability σ_e². In the first experiment, σ_e² was confounded with σ_w², the variability created by the wire position on the production unit. In the second experiment, σ_e² was confounded with the variability σ_u² caused by different production units. The information combined from the two experiments enables separate estimation of σ_e², σ_u² and σ_w², as well as of the variability σ_oper² due to different operators.
¹Source: Teresa Mitchell, Victor Hegemann and K.C. Liu, "GRR methodology for destructive testing and quantitative assessment of gauge capability", pp. 47-59, in the collection by Veronica Czitrom and Patrick D. Spagon, Statistical Case Studies for Industrial Process Improvement, ©1997. Borrowed with the kind permission of the ASA and SIAM.
Experiment 1: Calculations

Let us modify appropriately the formulas (4.3.5)-(4.3.7) for our data. Obviously, ss_e = 0. Also
$$ss_{B(A)} = \sum_{i=1}^{10}\sum_{j=1}^{5}(x_{ij} - \bar x_{i\cdot})^2, \qquad (4.5.2)$$
where x_ij is the observed pull strength for unit i and wire j, and x̄_i· is the average pull strength for unit i. Finally,
$$ss_A = 5\sum_{i=1}^{10}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})^2, \qquad (4.5.3)$$
where x̄_·· is the overall average pull strength.
Now we have the following two equations (compare with Table 4.5 and set σ_A = σ_u, σ_B = σ_w):
$$\frac{ss_{B(A)}}{10\cdot 4} = \hat\sigma_e^2 + \hat\sigma_w^2, \qquad (4.5.4)$$
$$\frac{ss_A}{9} = \hat\sigma_e^2 + \hat\sigma_w^2 + 5\hat\sigma_u^2. \qquad (4.5.5)$$
It is easy to find the values of ss_A and ss_B(A), for example by using the one-way ANOVA procedure in Statistix. The results are below:
ss_A = 8.12, ss_B(A) = 19.268.
From (4.5.4) and (4.5.5), we find that
$$\hat\sigma_e^2 + \hat\sigma_w^2 = 0.482; \qquad (4.5.6)$$
$$\hat\sigma_u^2 = 0.084. \qquad (4.5.7)$$
Table 4.10: Pull strength in grams for one wire on each of ten units

The model for the second experiment is
$$Y_{ij} = \mu^* + C_i + A_{j(i)} + \epsilon^*_{ij}, \qquad (4.5.8)$$
where μ* is the overall mean, C_i is the variability due to the operator, A_{j(i)} is the variability due to the unit for a fixed operator, and ε*_ij is the random repeatability contribution to the pull strength for a fixed wire, operator and unit. (Imagine that there exist several absolutely identical copies of the wire bond on a fixed position for each i and j. Each such bond would produce an independent replica of the random variable ε*_ij.) Our assumptions are that all random variables involved have zero mean value and the following variances:
Var[C_i] = σ_oper²;
Var[A_{j(i)}] = σ_u², equal to the corresponding variance in Experiment 1;
Var[ε*_ij] = σ_e², equal to the variance of ε_ij in Experiment 1.
It is important to note that all units in both experiments are produced in identical conditions, so that the tool repeatability expressed as σ_e² is the same in both experiments.
Similarly to Experiment 1, we will use the following two sums of squares:
$$ss_{A(O)} = \sum_{i=1}^{3}\sum_{j=1}^{10}(y_{ij} - \bar y_{i\cdot})^2; \qquad (4.5.9)$$
$$ss_O = 10\sum_{i=1}^{3}(\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2. \qquad (4.5.10)$$
Here ȳ_i· is the mean of all measurements for operator i, and ȳ_·· is the mean of all 30 observations. Now we have the following two equations (compare with Table 4.5):
$$\frac{ss_{A(O)}}{3\cdot 9} = \hat\sigma_e^2 + \hat\sigma_u^2, \qquad (4.5.11)$$
$$\frac{ss_O}{2} = \hat\sigma_e^2 + \hat\sigma_u^2 + 10\hat\sigma_{\mathrm{oper}}^2. \qquad (4.5.12)$$
Calculations Completed
It is easy to compute, using single-factor ANOVA, that ss_A(O) = 11.55 and ss_O = 1.92. Substituting into (4.5.11) and (4.5.12), we obtain that
$$\hat\sigma_e^2 + \hat\sigma_u^2 = 0.428, \qquad (4.5.13)$$
$$\hat\sigma_{\mathrm{oper}}^2 = 0.053. \qquad (4.5.14)$$
Combining the results of the two experiments, we obtain:
σ̂_e² = 0.346, σ̂_e = 0.59;
σ̂_u² = 0.084, σ̂_u = 0.29;
σ̂_oper² = 0.053, σ̂_oper = 0.23;
σ̂_w² = 0.136, σ̂_w = 0.37.
Mitchell et al. (1997) were interested in estimating the parameter σ_meas = √(σ_e² + σ_oper²). This estimate is σ̂_meas = √(0.346 + 0.053) = 0.63.
The reading of the first instrument for the ith fuse is
$$Y_{i1} = X_i + \beta_1 + \epsilon_{i1}, \qquad (4.6.3)$$
where X_i is the true burning time of the ith fuse, β₁ is an unknown constant (instrument 1 bias), and ε_i1 is the random error of instrument 1. It is assumed that X_i and ε_i1 are independent random variables,
$$X_i \sim N(\mu, \sigma_x^2), \qquad \epsilon_{i1} \sim N(0, \sigma_{e1}^2). \qquad (4.6.4)$$
Similarly, the reading of the second instrument for the ith fuse is
$$Y_{i2} = X_i + \beta_2 + \epsilon_{i2}, \qquad (4.6.5)$$
where β₂ is an unknown constant bias of instrument 2, and ε_i2 is the random measurement error of instrument 2. It is assumed that X_i and ε_i2 are independent,
$$\epsilon_{i2} \sim N(0, \sigma_{e2}^2). \qquad (4.6.6)$$
Find estimates of |β₁ - β₂|, σ_x², σ_e1² and σ_e2².
It may seem that there are not enough data for this estimation. The trick is to involve two additional statistics: the sum and the difference of the readings of the instruments.
Solution
For the differences D_i = Y_i1 - Y_i2 and the sums S_i = Y_i1 + Y_i2 we have
$$E[D_i] = \beta_1 - \beta_2, \qquad \mathrm{Var}[D_i] = \sigma_{e1}^2 + \sigma_{e2}^2, \qquad (4.6.12)$$
$$\mathrm{Var}[S_i] = 4\sigma_x^2 + \sigma_{e1}^2 + \sigma_{e2}^2, \qquad (4.6.13)$$
and, in addition,
$$\mathrm{Var}[Y_{i1}] - \mathrm{Var}[Y_{i2}] = \sigma_{e1}^2 - \sigma_{e2}^2. \qquad (4.6.14)$$
Replacing the theoretical moments by their sample counterparts gives the desired estimates.
4. For the data of Table 4.11, construct the 95% confidence interval for the parameter β₁ - β₂.
Hint: Use the statistics of the running time differences.
6. Prove that
$$\mathrm{Var}[SS_e/\sigma_e^2] = 2\nu, \ \text{where } \nu = IJ(K-1), \quad \text{and} \quad \mathrm{Var}[SS_e] = 2IJ(K-1)\sigma_e^4. \qquad (4.6.15)$$
7. For Example 4.3.1, check the null hypotheses for σ_A² and σ_B² at significance level 0.05.
Chapter 5

Measurement Uncertainty: Error Propagation Formula
5.1 Introduction

So far we have dealt with various aspects of uncertainty in measuring a single quantity. In Sect. 4.4 it was a measurement of weight; in Sect. 4.5 we analyzed results from measuring pull strength. In most real-life situations, the measurement process involves several quantities whose measurement results are subject to uncertainty. To clarify the exposition, let us consider a rather simple example: the specific weight p of a material is determined from a cylindric specimen by measuring its weight W, its diameter D and its length L, so that p = 4W/(πD²L).
Note that all quantities in this formula, W, D, L, are obtained as the result of a measurement, i.e. they are subject to random errors. To put it more formally, they are random variables. Therefore the specific weight p is also a random variable. We are interested in the variance of p, Var[p], or in its standard deviation σ_p = √Var[p].
An approximate value of σ_p can be found by means of a formula known as the error propagation formula (EPF), which is widely used in measurements. Let us first derive it, and then apply it to our example.
Let Y = f(X₁, ..., X_n), and let μ_i = E[X_i]. Expanding f in a Taylor series around the point (μ₁, ..., μ_n), we obtain
$$Y = f(\mu_1, \ldots, \mu_n) + \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}\Big|_{x=\mu}\cdot(X_i - \mu_i) + \cdots. \qquad (5.2.2)$$
In this formula the dots represent the higher-order terms. The partial derivatives are evaluated at the mean values of the respective random variables.
Now ignore all higher-order terms and write:
$$Y \approx f(\mu_1, \ldots, \mu_n) + \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}\Big|_{x=\mu}\cdot(X_i - \mu_i). \qquad (5.2.3)$$
Let us compute the variance of the right-hand side of (5.2.3). Denote the partial derivatives by f_i'. Since the terms in the sum are independent, we can use formula (2.1.22):
$$\mathrm{Var}[Y] \approx \sum_{i=1}^{n}(f_i')^2\,\mathrm{Var}[X_i]. \qquad (5.2.4)$$
The use of (5.2.5) and (5.2.7) is a valid operation if the RSDs (relative standard deviations) of the random variables X_i, σ_i/μ_i, are small, say no greater than 0.05. In other words, it will be assumed that
$$\frac{\sigma_i}{\mu_i} \le 0.05. \qquad (5.2.8)$$
In measurements, this is usually a valid assumption.
In measurement practice we do not know, however, the mean values of the random variables Y, X₁, ..., X_n. What we do know are their observed values, which we denote as y and x₁, ..., x_n. In the case where the relative standard deviations of all random variables are small, it is a valid operation to replace the theoretical means μ_i by x_i. Thus, we arrive at the following version of (5.2.7), with the derivatives evaluated at the observed values:
$$\sigma_Y \approx \sqrt{\sum_{i=1}^{n}(f_i')^2\,\sigma_i^2}. \qquad (5.2.9)$$
In practice, we do not usually know the variances σ_i² either, and use their estimates σ̂_i². Then the previous formula takes the form
$$\hat\sigma_Y \approx \sqrt{\sum_{i=1}^{n}(f_i')^2\,\hat\sigma_i^2}. \qquad (5.2.11)$$
Suppose it is known from experience that the standard deviations in measuring the weight W and the quantities D, L are as follows:
σ_W = 0.05 g; σ_D = 0.002 cm; σ_L = 0.004 cm.
Find the value of p and its standard deviation σ_p.

Solution
Here p plays the role of Y, and W, L, D play the role of X₁, X₂, X₃. The function f(W, L, D) is
$$p = \frac{4W}{\pi D^2 L}. \qquad (5.2.12)$$
The estimated value of p is p̂ = 4·25.97/(π·1.012²·4.005) = 8.062 g/cm³.
Compute the partial derivatives ∂p/∂W, ∂p/∂L, ∂p/∂D and evaluate them at the observed values of W, D and L:
$$f_W' = \frac{\partial p}{\partial W} = \frac{4}{\pi D^2 L}; \qquad (5.2.13)$$
$$f_L' = \frac{\partial p}{\partial L} = \frac{-4W}{\pi D^2 L^2}; \qquad (5.2.14)$$
$$f_D' = \frac{\partial p}{\partial D} = \frac{-8W}{\pi L D^3}.$$
Substituting into these formulas the observed values of W, D, L, we obtain:
$$(f_W')^2 = 0.096; \qquad (5.2.15)$$
$$(f_L')^2 = 4.052; \qquad (5.2.16)$$
$$(f_D')^2 = 253.8.$$
By (5.2.11) the standard deviation of p̂ is
$$\hat\sigma_p \approx \sqrt{0.096\cdot 0.05^2 + 4.052\cdot 0.004^2 + 253.8\cdot 0.002^2} \approx 0.036\ \text{g/cm}^3.$$

5.3 EPF for Particular Cases

Case 1. Y = c₁X₁ + ⋯ + c_nX_n, a linear function of independent variables, so that f_i' = c_i and
$$\sigma_Y = \sqrt{\mathrm{Var}[Y]} = \sqrt{\sum_{i=1}^{n}c_i^2\sigma_i^2}, \qquad (5.3.1)$$
where σ_i² = Var[X_i]. (Recall that the X_i are assumed to be independent random variables.) If the σ_i² are replaced by their estimates σ̂_i², the analogue of (5.2.11) is
$$\hat\sigma_Y \approx \sqrt{\sum_{i=1}^{n}c_i^2\hat\sigma_i^2}. \qquad (5.3.2)$$
Case 3. EPF for $Y = f(X_1, \ldots, X_k, X_{k+1}, \ldots, X_n) = \prod_{i=1}^{k}X_i \big/ \prod_{i=k+1}^{n}X_i$

Very often the function f(·) is a ratio of products of random variables. For this particular case, one can check that
$$(f_i')^2 = \frac{y^2}{x_i^2}, \quad i = 1, \ldots, n, \qquad (5.3.5)$$
so that the squared relative standard deviation of Y is approximately the sum of the squared relative standard deviations of the X_i:
$$\Big(\frac{\sigma_Y}{y}\Big)^2 \approx \sum_{i=1}^{n}\Big(\frac{\sigma_i}{x_i}\Big)^2. \qquad (5.3.6)$$
For the case $Y = \sqrt{\sum_i c_i X_i}$ one obtains
$$f_i' = \frac{c_i}{2Y}. \qquad (5.3.7)$$
Remarks
1. How do we obtain the relative standard deviations of the quantities entering the formula for σ_Y? Generally, there are three ways: statistical experiment, i.e. previous repeatability/reproducibility studies; certified standards data for the measuring instrument and/or from manufacturer warranties; expert opinion/analyst judgments.
Useful sources on the methodology and practice of establishing uncertainty are Eurachem (2000) and Kragten (1994).
Sometimes, the certificate data say that the measurement error of a certain device has a specific distribution, for example a uniform one in an interval of given length Δ. Then the corresponding standard deviation is equal to σ = Δ/√12.
2. It is important to give the following warnings. Often, the EPF underestimates the RSD of the final result. This happens because certain factors which in reality influence the result and may increase the uncertainty are omitted from the formula y = f(x₁, ..., x_n). So, for example, the result may depend on the variations of the ambient temperature during the experiment, and the temperature is omitted from the input variables on which the output Y depends.
Another reason for underestimation of the uncertainty is underestimation of component variability. Consider, for example, measuring the concentration c of some chemical agent, c = W/V, where W is the weight of the material and V is the volume of the solution. When we measure the weight of the material to be dissolved, we may take into consideration only the variability introduced by the weighing process itself and ignore the fact that the agent is not 100% pure. In fact, the content of the agent has random variations, and an additional statistical experiment should be carried out to establish the variability introduced by this factor.
5.4 Exercises

By (4.4.16),
$$\hat\sigma_{AB}^2 = \max\big(0,\ (ss_{AB}/4 - \hat\sigma_e^2)/3\big).$$
Substituting into this formula ss_AB = 0.0000176 and σ̂_e² = 0.0071², we see that σ̂_AB² = 0. Now by (4.4.15) the remaining component is estimated in the same way. The final result is 0.0012, and this is the approximate value of the standard deviation (standard error) of σ̂_R&R.
The above calculations are an implementation of the method suggested by Vardeman and VanValkenburg (1999), based on the use of the EPF.
Chapter 6

Calibration of Measurement Instruments
Suppose we repeatedly measure the absorbance at a fixed glucose concentration x and observe different values of the absorbance. Why does this happen? There might be many reasons: measurement errors, small variations in the environmental conditions which influence the absorbance, operator error, etc. So, in fact, the response at concentration x is a random variable Y. Assuming that x is known without error, we represent the mathematical model of this situation as follows:
$$Y = f(x) + \epsilon, \qquad (6.1.2)$$
where ε is a zero-mean random variable which describes the deviation of the actual response Y from its "theoretical" value f(x). Then the mean value of the "output" Y, given the "input" x, equals f(x). Formally,
$$E[Y] = f(x). \qquad (6.1.3)$$
Suppose that we know the relationship y = f(x). This is called the calibration curve. We are given some substance with unknown concentration of glucose. We observe the instrument response y*, i.e. we observe a single value of the absorbance. Our task is to determine the true content x*. Figure 6.1 illustrates this typical situation.
Figure 6.1. The calibration curve f(x) and the unknown concentration x*.
Concentration of glucose in mg/dl, x    Absorbance, y
0        0.050
50       0.189
100      0.326
150      0.467
200      0.605
400      1.156
600      1.704
The standard statistical model of the calibration experiment is
$$Y_i = \alpha + \beta x_i + \epsilon_i, \quad i = 1, \ldots, n. \qquad (6.1.7)$$
¹Reprinted from John Mandel, Evaluation and Control of Measurements (1991), by courtesy of Marcel Dekker, Inc.
We assume that ε_i ∼ N(0, σ²), and that the random variables ε_i are independent for i = 1, ..., n.
It follows from (6.1.7) that the mean response E[Y_i] at point x_i equals f(x_i) = α + βx_i. This is our "ideal" calibration curve. It is assumed therefore that f(x) is a straight line.
Another point of principal importance is that the variance of the measurement errors σ² does not depend on x_i. This is the so-called equal response variance case. We will comment later on how a calibration curve might be constructed when the error variances depend on the input variable values x. In the further exposition we denote by y_i the observed value of Y_i.
Introduce the standard notation:
$$\bar x = \sum_{i=1}^{n}x_i/n; \qquad (6.1.9)$$
$$\bar y = \sum_{i=1}^{n}y_i/n; \qquad (6.1.10)$$
$$S_{xx} = \sum_{i=1}^{n}(x_i - \bar x)^2; \qquad (6.1.11)$$
$$S_{yy} = \sum_{i=1}^{n}(y_i - \bar y)^2; \qquad (6.1.12)$$
$$S_{xy} = \sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y). \qquad (6.1.13)$$
The least squares estimates of the slope and the intercept are
$$\hat\beta = S_{xy}/S_{xx}, \qquad (6.1.14)$$
$$\hat\alpha = \bar y - \hat\beta\bar x. \qquad (6.1.15)$$
In addition, we are able to find an estimate of σ²:
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n}(y_i - \hat\alpha - \hat\beta x_i)^2}{n-2} = \frac{S_{yy} - \hat\beta S_{xy}}{n-2}. \qquad (6.1.16)$$
Figure 6.2. The estimates α̂ and β̂ minimize the sum of squared vertical deviations Σ_{i=1}^n Δ_i².
Example 6.1.1 continued
For the data of Example 6.1.1 we have
x̄ = 214.29; ȳ = 0.6424; S_xx = 273,570; S_xy = 754.26; S_yy = 2.0796;
β̂ = 0.00276; α̂ = 0.0516; σ̂ = 0.00188.
Using Statistix, we obtain all the desired results automatically, as shown in Fig. 6.3; see Analytical Software (2000).
PREDICTOR
VARIABLES    COEFFICIENT   STD ERROR   STUDENT'S T   P

SOURCE       DF   SS          MS          F           P
REGRESSION    1   2.07954     2.07954     588753.23   0.0000
RESIDUAL      5   1.766E-05   3.532E-06
TOTAL         6   2.07956
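For readers who do not use Statistix: a minimal sketch of the same fit in modern Mathematica (the data are those of the glucose example):

(* Least squares calibration line for the glucose data. *)
glucose = {{0, 0.050}, {50, 0.189}, {100, 0.326}, {150, 0.467},
   {200, 0.605}, {400, 1.156}, {600, 1.704}};
lm = LinearModelFit[glucose, t, t];
Normal[lm]                       (* about 0.0516 + 0.00276 t *)
Sqrt[lm["EstimatedVariance"]]    (* about 0.00188, the estimate of sigma *)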
Figure 6.3. The fitted calibration line for the glucose data (absorbance versus concentration).
Uncertainty in Estimating x*

An important practical issue is estimating the uncertainty introduced into the value of x* by our measurement procedure. Let us have a closer look at (6.1.19). The right-hand side of it depends on y*, ȳ and β̂. Imagine that we repeated the whole experiment, for the same fixed values of x₁, ..., x_n. Then we would observe values of the response variable y different from the previous ones because of the random errors ε_i. Therefore, we will obtain different values of ȳ and different values of the estimate β̂. In addition, the response of the measurement instrument to the same unknown quantity x* will also be different, because y* is subject to random measurement error.
Formally speaking, x* is a random variable depending on the random variables ȳ, β̂ and y*. To keep the notation simple, we use the same small italic letters for these random variables.
Our goal is to obtain an approximate expression for the variance of x*. In practice, the following approximate formula for the variance of x* is used:
$$\mathrm{Var}[x^*] \approx (x^* - \bar x)^2\Big[\frac{1 + 1/n}{(y^* - \bar y)^2} + \frac{1}{\hat\beta^2 S_{xx}}\Big]\hat\sigma^2. \qquad (6.1.21)$$
Derivation of (6.1.21)
The following statistical facts are important:
1. y* is a random variable with variance σ², the same variance possessed by all observations for the various x-values.
2. y* and ȳ are independent random variables, since they depend on independent sets of observations.
3. The variance of ȳ is σ²/n.
4. The variance of β̂ is
$$\mathrm{Var}[\hat\beta] = \frac{\sigma^2}{S_{xx}}. \qquad (6.1.22)$$
Applying the error propagation formula to x* = x̄ + (y* - ȳ)/β̂, we obtain
$$\mathrm{Var}[x^*] \approx (x^* - \bar x)^2\Big[\frac{\mathrm{Var}[y^*] + \mathrm{Var}[\bar y]}{(y^* - \bar y)^2} + \frac{\mathrm{Var}[\hat\beta]}{\hat\beta^2}\Big]. \qquad (6.1.24)$$
Replacing the random variables by their observed values, we arrive at the desired formula:
$$\widehat{\mathrm{Var}}[x^*] \approx (x^* - \bar x)^2\Big[\frac{1 + 1/n}{(y^* - \bar y)^2} + \frac{1}{\hat\beta^2 S_{xx}}\Big]\hat\sigma^2. \qquad (6.1.25)$$
For the glucose example, with y* = 0.147 and x* = 34.80,
$$\mathrm{Var}[x^*] \approx (34.80 - 214.29)^2\Big(\frac{1 + 1/7}{(0.147 - 0.6424)^2} + \frac{1}{0.00276^2\cdot 273{,}570}\Big)\cdot 3.532\cdot 10^{-6} = 0.584,$$
and
$$\sigma^* = \sqrt{\mathrm{Var}[x^*]} \approx 0.77.$$
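A sketch of this computation in Mathematica; all numbers are those of the glucose example:

(* Inverse prediction x* and its approximate variance, Eq. (6.1.25). *)
{b, xbar, ybar} = {0.00276, 214.29, 0.6424};
{sxx, s2, n, ystar} = {273570, 3.532*10^-6, 7, 0.147};
xstar = xbar + (ystar - ybar)/b                     (* about 34.80 *)
varx = (xstar - xbar)^2*((1 + 1/n)/(ystar - ybar)^2 +
     1/(b^2 sxx))*s2                                (* about 0.584 *)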
Suppose we think that Var[x*] = 0.584 is too large and we would like to carry out a new experiment to reduce it. One way of doing so is to divide the substance with unknown concentration of glucose into several, say k, portions and to measure the response for each of these k portions. Then we would observe not a single value of y*, but a sample of k values {y₁*, ..., y_k*}. Denote by ȳ* the corresponding mean value:
$$\bar y^* = \sum_{i=1}^{k}y_i^*/k. \qquad (6.1.26)$$
Now we will use the value ȳ* instead of y*. What will be gained by this? Note that the variance of ȳ* is k times smaller than the variance of y*. Thus, formula (6.1.21) will take the form
$$\mathrm{Var}[x^*] \approx (x^* - \bar x)^2\Big[\frac{1/k + 1/n}{(\bar y^* - \bar y)^2} + \frac{1}{\hat\beta^2 S_{xx}}\Big]\hat\sigma^2. \qquad (6.1.27)$$
For example, let k = 4. Then, with the same numerical values for all other variables, we will obtain Var[x*] ≈ 0.237, σ* ≈ 0.49, quite a considerable reduction.
Fit the calibration line to the data and compute the fitted values and the residuals:
$$\hat y_i = \hat\alpha + \hat\beta x_i, \qquad (6.2.1)$$
$$r_i = y_i - \hat y_i. \qquad (6.2.2)$$
If the residuals behave, for example, as shown in Fig. 6.4, then obviously the variability of the response is increasing with x.

Figure 6.4. A plot of the residuals r_i indicating response variance increasing with x.

We will assume now that the variance of the response at the input value x_i has the following form:
$$\sigma^2(x_i) = \frac{K_v}{w_i}, \qquad (6.2.3)$$
where K_v is some positive constant and the w_i, the so-called weights, are assumed to be known exactly or up to a constant multiple. In practice, the weights are either known from previous experience, or found on the basis of a specially designed experiment. This experiment must include observing several responses at the x_i values, i = 1, ..., n.
A transformation is often used to obtain a linear calibration curve. Then it may happen that the transformed responses exhibit heteroscedasticity. The form of the transformation dictates the choice of weights. We will not go into the details of finding the weights; there is an ample literature on this issue, see Madansky (1988, Chapter 2).
One way of obtaining weights is to observe, say, m responses at each x_i and set the weights to be equal to the inverse of the observed sample variances:
$$w_i = \frac{1}{s_i^2}. \qquad (6.2.4)$$
So, let us assume that in addition to the data [x_i, y_i, i = 1, 2, ..., n] we have a collection of weights
$$[w_i,\ i = 1, 2, \ldots, n]. \qquad (6.2.5)$$
In the statistical literature, the procedure of finding the calibration curve when Var[ε_i] is not constant is called weighted regression. The estimates of the parameters α and β are found by minimizing the following weighted sum of squares:
$$\text{Weighted Sum of Squares} = \sum_{i=1}^{n}w_i(y_i - \alpha - \beta x_i)^2. \qquad (6.2.6)$$
The solution has the same form as before, with weighted analogues of the means and the sums of squares:
$$\bar x = \frac{\sum_{i=1}^{n}w_i x_i}{\sum_{i=1}^{n}w_i}; \qquad (6.2.7)$$
$$\bar y = \frac{\sum_{i=1}^{n}w_i y_i}{\sum_{i=1}^{n}w_i}; \qquad (6.2.8)$$
$$S_{xx} = \sum_{i=1}^{n}w_i(x_i - \bar x)^2; \qquad (6.2.9)$$
$$S_{yy} = \sum_{i=1}^{n}w_i(y_i - \bar y)^2; \qquad (6.2.10)$$
$$S_{xy} = \sum_{i=1}^{n}w_i(x_i - \bar x)(y_i - \bar y). \qquad (6.2.11)$$
Table 6.2: Time for water to boil as a function of the amount of water

i   x_i (cm³)   t_i (sec)   Estimated variance s_i²   Weight w_i
1   100          35         12.5                      0.08
2   200          63          6.1                      0.16
3   300          95          4.5                      0.22
4   400         125          2.0                      0.50
5   500         154          2.0                      0.50
PREDICTOR
VARIABLES    COEFFICIENT   STD ERROR   STUDENT'S T   P

SOURCE       DF   SS        MS        F         P
REGRESSION    1   1836.02   1836.02   8003.03   0.0000
RESIDUAL      3   0.68825   0.22942
TOTAL         4   1836.71
Figure 6.5. The printout and calibration curve for the data in Table 6.2 with 95% confidence belt (time versus amount of water).
Note that in this formula x̄, ȳ and β̂ are computed from (6.2.7), (6.2.8) and (6.2.12).
After calculating x* from (6.2.16), we need to establish the uncertainty for x*. Assume that we know the weight w* which corresponds to the calculated value of x*. w* can be obtained by means of a specially designed experiment or, more simply, by interpolating between the weights w_r and w_{r+1} corresponding to the nearest neighbors of x* from the left and from the right.
The formula for the uncertainty in x* is similar to (6.1.21), with obvious changes following from introducing the weights. Note that the expression for Var[β̂] is the same as (6.1.22), with the obvious change in the expression for S_xx; see, for example, Hald (1952, Chapt. 18, Sect. 6):
$$\mathrm{Var}[x^*] \approx (x^* - \bar x)^2\Big[\frac{1/w^* + 1/\sum w_i}{(y^* - \bar y)^2} + \frac{1}{\hat\beta^2 S_{xx}}\Big]\hat\sigma^2. \qquad (6.2.17)$$
Note that this formula reduces to (6.1.21) if we set all weights w_i ≡ 1.
For the example of this section,
$$\mathrm{Var}[x^*] \approx (235.2 - 380.8)^2\Big[\frac{1/0.18 + 1/1.46}{(75 - 118.7)^2} + \frac{1}{0.32^2\cdot 20{,}263}\Big]\cdot 0.2264 = 18.3.$$
Summing up, x* ± √Var[x*] = 235.2 ± 4.3.
Suppose now that the x-values are also measured with random errors, and that the ratio of the error variances,
$$\lambda = \frac{\mathrm{Var}[Y_i]}{\mathrm{Var}[X_i]}, \qquad (6.3.1)$$
is known. Define the rescaled responses
$$Y_i^* = \frac{Y_i}{\sqrt\lambda}. \qquad (6.3.3)$$
Now, obviously, Var[Y_i*] = Var[X_i]. Our "new" data set is now [x_i, y_i* = y_i/√λ, i = 1, 2, ..., n].
It remains true that the mean values of Y_i* and the mean value of X_i are linearly related to each other:
$$E[Y_i^*] = \alpha^* + \beta^* E[X_i]. \qquad (6.3.4)$$
The principle of fitting the data set to the linear calibration curve will now be minimizing the sum of squares of the distances of the points (x_i, y_i*) to the hypothetical calibration curve y* = α* + β*x. This is illustrated in Fig. 6.6.

Figure 6.6. Constructing the calibration curve by minimizing the sum of squared distances Σ d_i².

The formula expressing the sum of squares of the distances of the experimental points (x_i, y_i*) to the calibration curve y* = α* + β*x is known from analytic geometry:
$$\sum_{i=1}^{n}d_i^2 = \frac{\sum_{i=1}^{n}(y_i^* - \alpha^* - \beta^* x_i)^2}{1 + (\beta^*)^2}.$$
Now substitute into this formula y_i* = y_i/√λ. After simple algebra we obtain the estimates
$$\hat\beta = \frac{S_{yy} - \lambda S_{xx} + \sqrt{(S_{yy} - \lambda S_{xx})^2 + 4\lambda S_{xy}^2}}{2S_{xy}}, \qquad (6.3.8)$$
$$\hat\alpha = \bar y - \hat\beta\bar x. \qquad (6.3.9)$$
Table 6.3
i    x_i   y_i
1     73    79
2    132    83
3    211   121
4    254   159
5    305   167
Note that x̄, ȳ, S_xx, S_xy, S_yy are defined by (6.1.9)-(6.1.13), exactly as in the case of nonrandom x's.
The results (6.3.8) and (6.3.9) can be found in Mandel (1991, Chapt. 5). Standard texts on measurements and on regression, such as Miller and Miller (1993), typically do not consider the case of random errors in both variables.
Remark 1.
Suppose λ → ∞. This corresponds to the situation when the x's are measured without error. If we investigate the formula for β̂ when λ goes to infinity, we obtain, after some algebra, the usual formula β̂ = S_xy/S_xx. The proof is left as an exercise.
(6.3.15)
(6.3.17)
6.4 Exercises
1. Derive the formula (6.2.17).

i   x_i   y_i   Sample variance   Weight w_i

5. Suppose that we observe, at each x_i, k_i values of the response variable Y_i and use for fitting the regression line the data [x_i, ȳ_i, i = 1, ..., n], where ȳ_i is the average of the k_i observations. Argue that this case can be treated as a weighted regression with w_i = k_i.
Chapter 7

Collaborative Studies

Table 7.1¹
Laboratory    1     2     3     4     5     6     7     8     9     10    11
y            16.0  16.1  16.3  17.1  16.5  16.5  16.7  16.9  16.5  16.5  16.5
x            16.0  15.8  16.0  16.8  16.4  16.2  16.7  16.6  16.3  16.5  16.2
Figure 7.1. Scatter plot of y versus x for the data in Table 7.1.
¹Reprinted, with permission, from the Statistical Manual of the Association of Official Analytical Chemists (1975). Copyright, 1975, by AOAC INTERNATIONAL.
$$\begin{aligned}
Y_1 &= \mu + A + B + C + D + E + F + G + \epsilon_1,\\
Y_2 &= \mu + A + B + c + D + e + f + g + \epsilon_2,\\
Y_3 &= \mu + A + b + C + d + E + f + g + \epsilon_3,\\
Y_4 &= \mu + A + b + c + d + e + F + G + \epsilon_4,\\
Y_5 &= \mu + a + B + C + d + e + F + g + \epsilon_5,\\
Y_6 &= \mu + a + B + c + d + E + f + G + \epsilon_6,\\
Y_7 &= \mu + a + b + C + D + e + f + G + \epsilon_7,\\
Y_8 &= \mu + a + b + c + D + E + F + g + \epsilon_8.
\end{aligned} \qquad (7.2.1)$$
Here μ is the overall mean value; the ε_i are assumed to be independent zero-mean measurement errors with variance σ₀². The special structure of equations (7.2.1) allows an extremely simple procedure for calculating the contribution of each of the factors. The rule is as follows. To obtain an estimate of the contribution of a single factor in the form of A - a, i.e. in the form of a difference in the measurement results when the factor A is replaced by a, add all the y_i containing the capital-letter level of the factor and subtract the y_i which contain the small-letter level of the factor. Then divide the result by 4. For example, factor B is at Level 1 in experiments 1, 2, 5, 6 and at Level 2 in experiments 3, 4, 7, 8; see Table 7.3. Then
$$\widehat{B - b} = \frac{y_1 + y_2 + y_5 + y_6 - (y_3 + y_4 + y_7 + y_8)}{4}. \qquad (7.2.2)$$
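All seven contrasts can be computed at once from the sign pattern of (7.2.1); a minimal sketch, in which y1, ..., y8 stand for the eight measurement results:

(* Effect contrasts for the ruggedness design (7.2.1). Row r of signs
   gives the +/- levels of factors A-G in experiment r. *)
signs = {{1, 1, 1, 1, 1, 1, 1}, {1, 1, -1, 1, -1, -1, -1},
   {1, -1, 1, -1, 1, -1, -1}, {1, -1, -1, -1, -1, 1, 1},
   {-1, 1, 1, -1, -1, 1, -1}, {-1, 1, -1, -1, 1, -1, 1},
   {-1, -1, 1, 1, -1, -1, 1}, {-1, -1, -1, 1, 1, 1, -1}};
y = {y1, y2, y3, y4, y5, y6, y7, y8};
effects = (Transpose[signs] . y)/4   (* estimates of A - a, ..., G - g *)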
Let us proceed with a numerical example. The estimated effects include:
$$\widehat{D - d} = 0.096; \quad \widehat{E - e} = 0.014; \quad \widehat{F - f} = 0.005; \quad \widehat{G - g} = -0.039.$$

Figure 7.2. Dot diagram for the effects of factors A, B, ..., G (effects × 100).

The dot diagram in Fig. 7.2 shows that only two factors, D and G, have a significant influence on the measurement result. The measurement result is especially sensitive to the ambient temperature: increasing it by 5°C gives an increase in the result of ≈ 0.1. So it is recommended to control the ambient temperature carefully and keep it near 20°C. In addition, use of regular water in the course of the reaction causes a decrease in the result by 0.039. To reduce disparity in measurement results from different laboratories, it is recommended that all laboratories control this factor too and use only distilled water.
Another valuable conclusion from the above experiment is that the measurement results are robust (insensitive) with respect to changes in the remaining five factors.
The example considered here is based on the so-called 2³ factorial design. There is a tremendous literature on using designed experiments to reveal the influencing factors, including their possible interactions; see e.g. Montgomery (2001).
Table 7.6: Criterion for rejecting a low or a high ranking laboratory score with 0.05 probability of wrong decision; Youden and Steiner 1975, p. 85

Table 7.5 presents the ranking results of the data from Table 7.4. (In the case of tied ranks, each result obtains an average of all tied ranks.)
4. Compare the highest and the lowest rank with the lower and upper critical limits. These limits are given in Table 7.6. They are in fact 0.05 critical values, i.e. they might be exceeded by chance, in the case of no consistent differences between laboratories, with probability 0.05.² If a certain laboratory shows a rank total lying outside one of the limits, this laboratory must be eliminated from the CS.
From Table 7.5 we see that the total rank of laboratory 4 exceeds the critical value of 32 for 11 laboratories and three samples. Thus we conclude that this laboratory has too large a systematic error, and we decide to exclude it from further studies. Note also that the x-y plots (we presented only one, in Figure 7.1, for Sample 3) point to laboratory 4 as having the largest bias.
Denote by s_ij the standard deviation among replicates within cell (i, j), and by s_j the pooled value of the replication standard deviation for sample j.
Mandel (1991, p. 176) suggested studying the experimental cell variability by using the so-called k-statistic, which he defined as
$$k_{ij} = \frac{s_{ij}}{s_j}. \qquad (7.4.1)$$
We will estimate s_ij via the range of the replicates in the cell (i, j), and s_j via the average range of all cells in sample j. To compute k_ij, denote by R_ij the range of the replicates in the cell (i, j), and by R̄_j the average range for sample j. For our case of K = 2 observations per cell, R_ij is the largest observation minus the smallest in the cell. The value of the k-statistic for cell (i, j) we define as
$$k_{ij} = \frac{R_{ij}}{\bar R_j}. \qquad (7.4.2)$$
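The range-based k-statistics are easily computed for a whole table at once; a sketch with a small hypothetical table of duplicate measurements (2 laboratories, 2 samples):

(* k-statistics (7.4.2) from ranges; cells[[i, j]] is the pair of
   replicates of laboratory i on sample j (hypothetical numbers). *)
cells = {{{16.0, 16.1}, {15.8, 16.0}}, {{16.3, 17.1}, {16.0, 16.8}}};
ranges = Map[Max[#] - Min[#] &, cells, {2}];
rbar = Mean /@ Transpose[ranges];           (* average range per sample j *)
kstat = Transpose[Transpose[ranges]/rbar]   (* kstat[[i, j]] = R_ij/Rbar_j *)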
Columns 5, 6, 7 of Table 7.7 present the values of k_ij for the data in Table 7.4.
For a visual analysis of the data in Table 7.7 we suggest the display shown in Figure 7.3. The area (or the height) of the shaded triangle in the (i, j)th cell is proportional to the value of k_ij. It is instructive to analyze the display column by column and row by row. The column analysis may give rise to the question whether the standard deviation in cell (8, 1) or (10, 2) is unusually high. This question may be answered on the basis of tables presented by Mandel (1991, pp. 237-239). One will find that the critical value of the k-statistic for 10 laboratories and two replicates is 2.32, at the 1% significance level. Thus we may conclude that the within-cell variability in cells (8, 1) and (10, 2) is "too large". This does not mean that the measurement results should be ignored
and the measurements repeated. These results must be marked for a more thorough investigation of the measurement conditions for the corresponding laboratories and samples.
The row-by-row analysis may serve as a comparison between laboratories. A row with high values of k_ij, j = 1, 2, 3, demonstrates that laboratory i has unusually high standard deviations. This is not a reason for excluding this laboratory from the CS, but rather a hint to investigate the potential sources of large within-cell variability in this laboratory. The last column in our display shows the row sums Σ_{j=1}³ k_ij. For the data in the display, there is a suspicion that the measurement conditions in laboratories 8 and 10 cause the appearance of greater variability.
Chapter 8

Measurements in Special Circumstances

i.e. in the interval [-0.45, 1.05]. Taking into account the rounding-off, the probability of obtaining the result 0 is
$$\Phi\big((0.5 - \mu)/\sigma\big) - \Phi\big((-0.5 - \mu)/\sigma\big) = \Phi(0.8) - \Phi(-3.2) = 0.7881 - 0.0007 = 0.7874, \qquad (8.1.2)$$
estimate of μ is not consistent. The standard remedy of increasing the sample size to obtain a more accurate estimate of the measurand simply does not work in our new situation with rounding-off.
The bias in this example is not very large. Suppose for a moment that μ = 0.4 and σ = 0.05. It is easy to compute that practically all the probability mass of Y* is concentrated in two points, 0 and 1, and that P(Y* = 1) ≈ 0.023. This gives the mean of Y* ≈ 0.023. The estimate of μ is biased by 0.38.
Consider another example related to estimating σ.
Regarding σ, we can say only that the length of the support of the random variable ε does not exceed h = 1. In practical terms, this means that σ < 1/√12.
From the measurement point of view, the instrument in that case is not accurate enough and should be replaced by one which has a finer scale (i.e. a smaller value of h).
Now let us consider the case where the observations are located at two points, i.e. both n₀ ≥ 1 and n₁ ≥ 1. Our entire information is, therefore, the two numbers n₀ and n₁.
Step 1. Write down the likelihood function
$$Lik = \big[P(Y^* = 0)\big]^{n_0}\cdot\big[P(Y^* = 1)\big]^{n_1},$$
where
$$P(Y^* = 0) = \Phi\big((0.5 - \mu)/\sigma\big) - \Phi\big((-0.5 - \mu)/\sigma\big), \qquad (8.1.8)$$
$$P(Y^* = 1) = \Phi\big((1.5 - \mu)/\sigma\big) - \Phi\big((0.5 - \mu)/\sigma\big).$$
Note that Lik depends on μ only, since σ is assumed to be known.
Step 2. Find the maximum likelihood estimate of μ. The maximum likelihood estimate μ_ML is that value of μ which maximizes the likelihood function; see DeGroot (1975, p. 338). In practice, it is more convenient to maximize the logarithm of the likelihood function (which will produce the same result):
$$\mu_{ML} = \mathrm{Argmax}_{\mu}\,\mathrm{Log}[Lik]. \qquad (8.1.9)$$
In[1]:= << Statistics`ContinuousDistributions`
        ndist = NormalDistribution[0, 1];
        σ = 0.3;
        n0 = 4; n1 = 1;
        p0 = CDF[ndist, (0.5 - μ)/σ] - CDF[ndist, (-0.5 - μ)/σ];
        p1 = CDF[ndist, (1.5 - μ)/σ] - CDF[ndist, (0.5 - μ)/σ];
        Lik = (p0)^n0*(p1)^n1;
        FindMinimum[-Log[Lik], {μ, 0.4}]
        Plot[Log[Lik], {μ, 0.1, 0.7}]
Figure 8.1. The plot of Log[Lik] as a function of μ.
Now we have to find the maximum of Log[Lik] with respect to μ and σ.
The numerical investigation of the likelihood function (8.1.10) has revealed that the surface Lik = ψ(μ, σ) has a long and flat ridge in the area of the maximum. This can be well seen from the contour plot of ψ(μ, σ) in Fig. 8.2. In that situation, the "FindMinimum" operator produces results which, in fact, are not minimum points. Moreover, these results depend on the starting point of the minimum search.
<< Statistics`ContinuousDistributions`
ndist = NormalDistribution[0, 1];
n0 = 3; n1 = 7;
p0 = CDF[ndist, (0.5 - μ)/σ] - CDF[ndist, (-0.5 - μ)/σ];
p1 = CDF[ndist, (1.5 - μ)/σ] - CDF[ndist, (0.5 - μ)/σ];
Lik = (p0)^n0*(p1)^n1;
ContourPlot[Log[Lik], {μ, 0.48, 0.72}, {σ, 0.05, 0.25}, Contours -> 39]
Figure 8.2. The contour plot of Log[Lik] as a function of μ and σ.
A practical remedy is to fix σ at its naive estimate
$$\tilde\sigma_n = \sqrt{\frac{n_0\,n_1}{n\cdot n} - \frac{1}{12}}. \qquad (8.1.11)$$
Afterwards, substitute σ̃_n into the expression (8.1.10) and optimize the likelihood function with respect to μ only. This becomes a well-defined problem. We present in Table 8.1 the calculation results in a ready-to-use form for all cases of n = 10 measurements with n₀ > 0, n₁ > 0.
Suppose now that the result 0 was observed on n₀ occasions, the result 1 on n₁ occasions and the result 2 on n₂ = n - n₀ - n₁ occasions, such that n₀, n₁, n₂ are all nonzero. Our purpose remains to estimate μ and σ.
Let us present first the naive estimates based on the first two moments of the discrete random variable Y*:
$$E[Y^*] = 1\cdot P(Y^* = 1) + 2\cdot P(Y^* = 2). \qquad (8.1.12)$$
Replacing the probabilities by the relative frequencies, we arrive at the following estimate of μ:
$$\hat\mu = \frac{n_1}{n} + \frac{2n_2}{n}. \qquad (8.1.13)$$
For the estimate of σ we suggest the estimate of the standard deviation of Y* with the Sheppard correction; see Sect. 2.2. The variance of Y* equals
$$\mathrm{Var}[Y^*] = E[(Y^*)^2] - (E[Y^*])^2. \qquad (8.1.14)$$
From here we arrive at the following estimate of σ:
$$\hat\sigma = \sqrt{1^2\cdot\frac{n_1}{n} + 2^2\cdot\frac{n_2}{n} - \Big(\frac{n_1}{n} + \frac{2n_2}{n}\Big)^2 - \frac{1}{12}}. \qquad (8.1.15)$$
The maximum likelihood estimates of μ and σ are obtained by maximizing with respect to μ and σ the following expression for the likelihood function:
$$Lik = \big[P(Y^* = 0)\big]^{n_0}\cdot\big[P(Y^* = 1)\big]^{n_1}\cdot\big[P(Y^* = 2)\big]^{n_2}, \qquad (8.1.16)$$
where the probabilities P(Y* = j), j = 0, 1, 2, are computed analogously to (8.1.8).
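The naive estimates (8.1.13) and (8.1.15) require only the three counts; a minimal sketch with hypothetical counts:

(* Naive moment estimates of mu and sigma from counts n0, n1, n2. *)
{n0, n1, n2} = {3, 5, 2}; n = n0 + n1 + n2;
mu = n1/n + 2 n2/n // N                                      (* Eq. (8.1.13) *)
sigma = Sqrt[n1/n + 4 n2/n - (n1/n + 2 n2/n)^2 - 1/12] // N  (* Eq. (8.1.15) *)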
where ᾱ = Σ_{i=1}^k α_i/k, β̄ = Σ_{i=1}^k β_i/k and γ̄ = Σ_{i=1}^k γ_i/k are the respective average values.
It is easy to solve these equations and to obtain that
$$\hat\alpha = \bar\alpha + \frac{180° - \bar\alpha - \bar\beta - \bar\gamma}{3}, \qquad (8.2.7)$$
and similarly for β̂ and γ̂.
Let us now estimate the variance of the angle estimates. Assume that all angles are measured with errors having equal variances σ². Then
$$\mathrm{Var}[\bar\alpha] = \sigma^2/k = \mathrm{Var}[\bar\beta] = \mathrm{Var}[\bar\gamma]. \qquad (8.2.10)$$
Since the measurements are independent, it follows from (8.2.7) that
$$\mathrm{Var}[\hat\alpha] = \frac{2\sigma^2}{3k}. \qquad (8.2.12)$$
The same estimates are obtained if the sum of squares
$$(\bar\alpha - a)^2 + (\bar\beta - b)^2 + (\bar\gamma - c)^2 \qquad (8.2.13)$$
is minimized subject to the same constraint (8.2.1). This fact has the following geometric interpretation: we minimize the distance from the point (ᾱ, β̄, γ̄) to the hyperplane a + b + c = 180°.
8.3 Exercises
1. In measuring the triangle angles, the following results were obtained:
α₁ = 30°30′; α₂ = 29°56′;
β₁ = 56°25′; β₂ = 56°45′;
γ₁ = 93°50′; γ₂ = 94°14′.

3. Five measurements of an ohmic resistance gave 10.1 Ohm (three times) and 10.2 Ohm (twice). It is known that the instrument has measurement error ε ∼ N(0, σ²) with σ = 0.03. Find the maximum likelihood estimate μ_ML of the ohmic resistance.
Hint. Multiply all measurement results by a factor of 10. This will also increase σ by a factor of 10. Put the zero of the scale at the point 101. Then there are three measurements equal to zero, n₀ = 3, and two equal to 1, n₁ = 2. Now we are in the situation of Example 8.1.2. On the new scale μ_ML = 0.256, and in the original scale the estimate of the resistance is 10.1 + 0.0256 ≈ 10.13.
4. Suppose that the random variable X ∼ U(-0.7, 1.3). The measurements are made on a grid ..., -2, -1, 0, 1, 2, ... with corresponding round-off. Find the mean and the standard deviation of the observed measurement results.
5. Five weight measurements are made on a digital instrument with scale step h = 1 gram, and the following results obtained: 110 gram, 111 gram (3 times) and 112 gram. Suppose that the weight is an unknown constant μ and the measurement error has a normal distribution N(0, σ²). Find the maximum likelihood estimates of μ and σ.
Answers and Solutions to Exercises

Chapter 2
1, a. μ̂ = 0.634.
1, b. s = 0.017; the estimate of σ using the sample range is σ̂ = 0.06/3.472 = 0.017.
1, c. Q = (0.61 - 0.60)/0.06 = 0.167 < 0.338. The minimal sample value is not an outlier.
1, d. The 0.95 confidence interval on μ is [0.6246, 0.6434].
1, e. h = 0.01, s/h = 1.7. It is satisfactory.
2. Solution. Y ∼ N(150, σ = 15). P(145 < Y < 148) = P((145 - 150)/15 < (Y - 150)/15 < (148 - 150)/15) = Φ(-0.133) - Φ(-0.333) = Φ(0.333) - Φ(0.133) = 0.630 - 0.553 = 0.077.
Chapter 3
1. The sample averages are x̄₁ = 3200.1, x̄₄ = 3359.6. The sample variances are s₁² = 2019.4, s₄² = 5984.6. Assuming equal variances, the T-statistic for testing μ₁ = μ₄ against μ₁ < μ₄ equals -3.99, which is smaller than -t₀.₀₀₅(8). We reject the null hypothesis.
2. For Table 4.1 we obtain the maximal variance for day 4, s₄² = 52.29, and the minimal, s₁² = 8.51, for day 1. Hartley's statistic equals 6.10, which is less than the 0.05 critical value of 33.6 for 4 samples and 7 observations in each sample. The null hypothesis is not rejected.
Chapter 4
1. The average range per cell is 0.0118. It gives σ̂_e = 0.0070. This is in close agreement with σ̂_e = 0.0071 obtained in Sect. 4.4.
Chapter 6
1. Exactly as in the case of constant variance, Var[β̂] = K_v/S_xx. Estimate the unknown quantities and substitute them in the expression for Var[x*].
5. If ȳ_i = (y_{i1} + ... + y_{ik_i})/k_i, then Var[ȳ_i] = Var[Y_{i1}]/k_i. Compare this with (6.2.3): w_i = k_i.
6. Compute from the data in Table 6.3 that D² = Σ_{i=1}^5 (y_i - 36 - 0.44x_i)²/3 = 146.4. Use (6.3.15) and (6.3.16) to calculate that P = 0.444 and Q = 6738.2. Substitute the values of D², λ, P, Q, ȳ = 180 and y* into (6.3.17). The result is Var[x] ≈ 964.1.
Chapter 8
1. Answer: α̂ = 29°56′; β̂ = 56°18′.
2. Solution. V̂ar[α] = (α₁ - α₂)²/2 = 34²/2 = 578. Similarly, V̂ar[β] = 20²/2 = 200, and V̂ar[γ] = 24²/2 = 288.
4. Solution. With probability 0.1 the result will be -1, with probability 0.5 it will be zero, and with the remaining probability 0.4 it will be +1. So, the measurement result will be a discrete random variable Y*, with E[Y*] = 0.1·(-1) + 0.5·0 + 0.4·1 = 0.3. Var[Y*] = 0.1·(-1)² + 0.4·1² - 0.3² = 0.5 - 0.09 = 0.41.
Appendix A: The Standard Normal Distribution Function
$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp(-v^2/2)\,dv$$
Hundredth parts of x
x 0 1 2 3 4 5 6 7 8 9
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7703 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987
For negative values of x, Φ(x) = 1 - Φ(-x). For example, let x = -0.53. Then Φ(-0.53) = 1 - Φ(0.53) = 1 - 0.7019 = 0.2981.
Appendix B: Quantiles of the Chi-Square Distribution

Appendix C: Critical Values of the F-Distribution

ν2   α      ν1: 1    2      3      4      5      6      7      8      9      10     15     20
1 0.050 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 245.9 248.0
1 0.025 647.8 799.5 864.2 899.6 921.8 937.1 948.2 956.7 963.3 968.6 984.9 997.2
1 0.010 4052 4999 5403 5625 5764 5859 5928 5981 6022 6056 6157 6209
1 0.005 16211 19999 21615 22500 23056 23437 23715 23925 24091 24224 24630 24836
2 0.050 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.43 19.45
2 0.025 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40 39.43 39.45
2 0.010 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39 99.40 99.43 99.45
2 0.005 198.5 199.0 199.2 199.2 199.3 199.3 199.4 199.4 199.4 199.4 199.4 199.4
3 0.050 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.70 8.66
3 0.025 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42 14.25 14.17
3 0.010 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35 27.23 26.87 26.69
3 0.005 55.55 49.80 47.47 46.19 45.39 44.84 44.43 44.13 43.88 43.69 43.08 42.78
4 0.050 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.86 5.80
4 0.025 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84 8.66 8.56
4 0.010 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55 14.20 14.02
4 0.005 31.33 26.28 24.26 23.15 22.46 21.97 21.62 21.35 21.14 20.97 20.44 20.17
5 0.050 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.62 4.56
5 0.025 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62 6.43 6.33
5 0.010 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05 9.72 9.55
5 0.005 22.78 18.31 16.53 15.56 14.94 14.51 14.20 13.96 13.77 13.62 13.15 12.90
6 0.050 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 3.94 3.87
6 0.025 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46 5.27 5.17
6 0.010 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.56 7.40
6 0.005 18.63 14.54 12.92 12.03 11.46 11.07 10.79 10.57 10.39 10.25 9.81 9.59
7 0.050 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.51 3.44
7 0.025 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.76 4.57 4.47
7 0.010 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62 6.31 6.16
7 0.005 16.24 12.40 10.88 10.05 9.52 9.16 8.89 8.68 8.51 8.38 7.97 7.75
8 0.050 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.22 3.15
8 0.025 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30 4.10 4.00
8 0.010 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81 5.52 5.36
8 0.005 14.69 11.04 9.60 8.81 8.30 7.95 7.69 7.50 7.34 7.21 6.81 6.61
9 0.050 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.01 2.94
9 0.025 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96 3.77 3.67
9 0.010 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26 4.96 4.81
9 0.005 13.61 10.11 8.72 7.96 7.47 7.13 6.88 6.69 6.54 6.42 6.03 5.83
10 0.050 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.85 2.77
10 0.025 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72 3.52 3.42
10 0.010 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85 4.56 4.41
10 0.005 12.83 9.43 8.08 7.34 6.87 6.54 6.30 6.12 5.97 5.85 5.47 5.27
11 0.050 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.72 2.65
11 0.025 6.72 5.26 4.63 4.28 4.04 3.88 3.76 3.66 3.59 3.53 3.33 3.23
11 0.010 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 4.54 4.25 4.10
11 0.005 12.23 8.91 7.60 6.88 6.42 6.10 5.86 5.68 5.54 5.42 5.05 4.86
12 0.050 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.62 2.54
12 0.025 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37 3.18 3.07
12 0.010 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30 4.01 3.86
12 0.005 11.75 8.51 7.23 6.52 6.07 5.76 5.52 5.35 5.20 5.09 4.72 4.53
13 0.050 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.53 2.46
13 0.025 6.41 4.97 4.35 4.00 3.77 3.60 3.48 3.39 3.31 3.25 3.05 2.95
13 0.010 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 4.10 3.82 3.66
13 0.005 11.37 8.19 6.93 6.23 5.79 5.48 5.25 5.08 4.94 4.82 4.46 4.27
14 0.050 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.46 2.39
14 0.025 6.30 4.86 4.24 3.89 3.66 3.50 3.38 3.29 3.21 3.15 2.95 2.84
14 0.010 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03 3.94 3.66 3.51
14 0.005 11.06 7.92 6.68 6.00 5.56 5.26 5.03 4.86 4.72 4.60 4.25 4.06
15 0.050 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.40 2.33
15 0.025 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06 2.86 2.76
15 0.010 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.52 3.37
15 0.005 10.80 7.70 6.48 5.80 5.37 5.07 4.85 4.67 4.54 4.42 4.02 3.88
References

Devore, J.L. 1982. Probability and Statistics for Engineering and the Sciences. Brooks/Cole Publishing Company, Monterey, California.

Miller, J.C. and J.N. Miller. 1993. Statistics for Analytical Chemistry, 3rd ed. Ellis Horwood PTR Prentice Hall, New York.

Mitchell, T., Hegemann, V. and K.C. Liu. 1997. GRR methodology for destructive testing, in V. Czitrom and P.D. Spagon (eds), Statistical Case Studies for Industrial Process Improvement, pp. 47-59. Society for Industrial and Applied Mathematics and American Statistical Association.

NIST Technical Note 1297. 1994. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results.

Pankratz, P.C. 1997. Calibration of an FTIR spectrometer for measuring carbon, in V. Czitrom and P.D. Spagon (eds), Statistical Case Studies for Industrial Process Improvement, pp. 19-37. Society for Industrial and Applied Mathematics and American Statistical Association.

Rabinovich, S. 2000. Measurement Errors and Uncertainties: Theory and Practice, 2nd ed. Springer, New York.

Vardeman, S.B. and E.S. VanValkenburg. 1999. Two-way random-effects analyses and gauge R&R studies. Technometrics, 41(3), 202-211.

Wolfram, Stephen. 1999. The Mathematica Book, 4th ed. Cambridge University Press.

Youden, W.J. and E.H. Steiner. 1975. Statistical Manual of the Association of Official Analytical Chemists: Statistical Techniques for Collaborative Studies. AOAC International, Arlington, VA.
Index

sample
- average 12
- mean 12
- range 26, 28, 40
- variance 12
Sheppard's correction 13, 130
Shewhart's charts 32
significance level 30
single-factor ANOVA 82
small σ/h ratio 124
sources of uncertainty
- measurement errors 60
special cause 32
special measurement scheme 124
specific weight 1, 2
specification limits 25
speed of light, measurements 41
standard deviation 11
- of random error 37
- of range 35
standard error 93
- in estimating atomic weight 93
- of σ_R&R 93
- of the mean 17
standard normal density 19
standard normal distribution 19
Statistix for regression 99
Statistix software 53
Statistix, use for ANOVA 76
sum of squares 130
systematic and random errors 6
systematic error 6, 7, 11
- in measurements 40
t-test
- degrees of freedom 44
- equal variances 47
- null hypothesis, alternatives 45
- statistic 44
testing hypotheses in ANOVA 85
two-factor balanced model
- random effects 72
two-sample t-test 43
unbiased estimator of variance 14
uncertainty
- in measurements 4, 87
- in measuring specific weight 87
- in x for given y 96
- in x in calibration 111
- of atomic weights 25
- of the measurement 6
uniform density 24
uniform distribution 13, 24, 132, 135
- mean and variance 13, 24
- of round-off errors 25
use of calibration curve 96
using ranges to estimate repeatability 82
using ranks in CS 120
variance 11
- unbiased and consistent estimate 12
variance of a sum of random variables 15, 16
variance of sum of squares in ANOVA 85
X bar chart 34
- performance 39