611–632, 2007
© The Author 2007. Published by Oxford University Press
on behalf of the British Occupational Hygiene Society
doi:10.1093/annhyg/mem045
The purpose of this study was to compare the performance of several methods for statistically
612 P. Hewett and G. H. Ganser
the exposures span several orders of magnitude; or the sample time is short or the flow rate is low,
resulting in a small sample volume. Published censored data analysis (CDA) methods fall into four
general categories: substitution methods, log-probit regression (LPR) methods, maximum likelihood
estimation (MLE) methods and non-parametric (NP) methods. Within each category or family, there are
several variations, usually developed to reduce transformation bias (discussed later) or for the
situation where it is suspected that the underlying distribution departs significantly from the
assumed lognormal distribution in the hope of reducing the bias or improving overall accuracy
(defined as bias plus precision) when estimating the ‘mean concentration’ (Helsel, 2005). This has
resulted in numerous peer-reviewed articles that

Assuming that the underlying exposure profile is reasonably lognormal, which CDA methods should be
considered?
Which method should be used for complex-censored datasets and/or when we suspect that the underlying
exposure distribution departs significantly from the single lognormal assumption?
Is there an ‘omnibus’ method that should be the first choice, regardless of the sample size,
(observed) percent censored, complexity of censoring or variability in the data?

To address these questions, we estimated the bias and root mean square error (rMSE) for each of the
CDA methods when estimating a commonly used compliance statistic (i.e. the 95th percentile) and the
exposure profile mean (often used in environmental
For example, we will address the analysis of both lognormal and contaminated lognormal exposure
profiles, examine the effect of LODs in ranges of 1–50% and 50–80% of the underlying distribution
and cover sample sizes ranging from 5 to 100. Furthermore, we will contrast and compare the most
common examples from all four families of CDA methods, nearly all of which have been addressed by
one or more previous papers. However, we feel these repeats are justified as our results provide
interesting and sometimes surprising comparisons between the various methods.
We purposefully did not address confidence intervals in this study. Neither did the majority of the
previous investigators. Some authors have examined and promoted the use of the bootstrap or
jackknife methods for devising confidence intervals (Shumway

tive bias for the 95th percentile (which is calculated from the sample GM and sample GSD)). LOD/2
substitution appears to be the CDA method of choice in the epidemiological literature whenever
large, complex-censored datasets are used to construct a job-exposure matrix (Hornung and Reed,
1990; Glass and Gray, 2001). When estimating the true geometric mean (GM) and GSD, Hornung and Reed
(1990) recommended using LOD/√2 substitution whenever it was suspected that the underlying GSD is
<3, and LOD/2 substitution otherwise. But when estimating the mean, they recommended the LOD/2
method provided the percent censoring was <50%. Their paper is frequently referenced to justify
using the LOD/2 substitution method.
All of the substitution methods are biased, and this bias will be a function of the true GSD, the
true per-
the log-transformed data); the minimum variance unbiased estimator equation (Mulhausen and Damiano,
1998) is typically used with complete (i.e. uncensored) samples to reduce this bias] since the mean
can be estimated using the simple arithmetic mean.
With LPRr, the missing values, represented by the non-detects, are predicted using the ‘initial’
values of the sample GM and sample GSD determined using LPR. The ‘final’ sample estimates are
calculated by combining the predicted values and the detects and analyzing the dataset in the
conventional fashion for the sample GM and GSD (and from these calculating the sample 95th
percentile) and for the sample arithmetic mean (thus avoiding transformation bias).
In principle, neither the LPR nor the LPRr method should be applied to complex-censored datasets.
Helsel and Cohn (1988) devised an ad hoc method

predicted using the initial sample GM and GSD and then combined with the detects. The final sample
GM and GSD, as well as the simple arithmetic mean (thus avoiding transformation bias), are
calculated in the conventional manner using the combined dataset.
While in principle either MLE or MLEr can be applied to complex-censored datasets, we decided to
create an additional variation (MLErm) which is identical to LPRrm, except that the MLE method,
rather than the LPR method, is used to generate the initial estimates of the sample GM and GSD.
Otherwise, all other calculations are the same as those used for the LPRrm method (see the citations
for LPRrm, such as Helsel, 2005).
The last variation on the MLE method is that devised by Succop et al. (2004). The MLE method is used
to derive initial estimates of the sample GM
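The LPRr procedure described above (initial LPR fit, prediction of the non-detects, conventional statistics on the combined dataset) can be illustrated with a minimal sketch for a single LOD. It is not the authors' program: the function names are hypothetical, the Blom plotting position is an assumption (the text does not say which plotting position was used), and the 95th percentile is taken as GM × GSD^1.645 under the lognormal assumption.

```python
import math
from statistics import NormalDist, mean, stdev

def normal_score(rank, n):
    # Assumption: Blom plotting position (rank - 0.375) / (n + 0.25).
    return NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))

def lpr_fit(detects, n_nondetects):
    """Regress ln(detect) on normal scores; returns (ln GM, ln GSD)."""
    n = len(detects) + n_nondetects
    y = [math.log(x) for x in sorted(detects)]
    # With a single LOD the non-detects occupy the lowest ranks 1..n_nondetects.
    z = [normal_score(r, n) for r in range(n_nondetects + 1, n + 1)]
    z_bar, y_bar = mean(z), mean(y)
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    return y_bar - slope * z_bar, slope

def lprr_estimates(detects, n_nondetects):
    """Fill in the non-detects from the initial LPR fit, then summarize."""
    n = len(detects) + n_nondetects
    ln_gm, ln_gsd = lpr_fit(detects, n_nondetects)
    predicted = [math.exp(ln_gm + ln_gsd * normal_score(r, n))
                 for r in range(1, n_nondetects + 1)]
    combined = predicted + list(detects)
    logs = [math.log(x) for x in combined]
    gm, gsd = math.exp(mean(logs)), math.exp(stdev(logs))
    # 95th percentile from the final GM and GSD (lognormal assumption).
    x95 = gm * gsd ** NormalDist().inv_cdf(0.95)
    return {"mean": mean(combined), "gm": gm, "gsd": gsd, "x95": x95}
```

The arithmetic mean is computed directly from the combined values, which is how the text describes the robust variants avoiding transformation bias. At least two detects are needed for the regression.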
(where n = sample size and p = proportion: for example, p = 0.95 for the 95th percentile) equals an
integer; in these cases, the empirical cdf method assigns the proportion to the ranked value for
that integer, while the KM method assigns the proportion to the next higher ranked value]. Its main
advantage is the ability to estimate the mean in the presence of non-detects, without relying upon a
distributional assumption. The KM method is available in many statistics programs, but because it
was originally intended for right-censored datasets, the exposure data must be ‘flipped’ before
analysis (i.e. the left-censored dataset must be converted to a right-censored dataset). Helsel
(2005) provided an example of the calculations and recommended it in preference to ‘all other
methods’ whenever the (observed) percent censored is <50%. To implement this method, we wrote com-

of variability (i.e. four coefficients of variation) for each of the four types of distributions.
For each of the 16 scenarios, 500 random datasets were generated. The rMSE summary statistics were
calculated across all 16 simulations, making the results difficult to interpret for any one method
and scenario, but the authors concluded that the LPR and MLE methods were, across all the
distributions, the preferred methods for estimating the mean and ‘median and interquartile range’,
respectively.
Helsel and Cohn (1988) extended the work of Gilliom and Helsel (1986) by considering multiple LODs
and adding the LPRrm method to the methods tested. They looked at only one sample size (n = 25) and
three LODs, set at 20%, 40% and 80% of the underlying distribution. Roughly one-third of the
measurements were assigned to each LOD. (If a randomly
on the mean exposure. While they allowed that the interpretation of the power curves was subjective,
they concluded that ‘in general, the (KM) test seems better’ than the MLE method. She (1997)
compared the KM method to the LOD/2 substitution, LPR and MLE methods. She generated 1000 datasets
of size n = 21 where each dataset had three censoring points randomly assigned from 10% to 80% of
the underlying distribution (in increments of 10%). The bias and rMSE results varied, with the LOD/2
substitution occasionally performing better than the KM method. However, She rejected the LOD/2 as a
valid method based on the perception that it ‘has no statistical theoretical basis’. She concluded
that the KM method ‘performs as well as or better than’ the MLE, LPR or LOD/2 substitution methods,
making it an ‘attractive alternative . . . because it is non-

US agencies and organizations have published several monographs on the analysis of environmental
data. The Environmental Protection Agency (EPA, 2006) offered the following general recommendations:
if the percent censored is <15%, use substitution with zero, LOD/2, or the LOD, or use the MLE
method,
for 15–50% censored, use the MLE method, and
for 50–90% censored, calculate the NP exceedance fraction for the limit.
The US Geological Survey (Helsel and Hirsch, 2002) published a guide to statistical methods in which
the substitution methods were recognized to have good overall accuracy (i.e. low rMSE), but were not
recommended because they
were compared to the true values. After the generation and analysis of 100 000 artificial-censored
datasets, the average bias and rMSE (an estimate of the overall accuracy, discussed later) were
calculated for each parameter.
To compare the methods, we devised the following three simulations (Table 1):
Simulation 1: n ranged between 20 and 100, the true percent censored ranged between 1% and 50% and
the true GSD ranged between 1.2 and 4,
Simulation 2: n ranged between 20 and 100, the true percent censored ranged between 50% and 80% and
the true GSD ranged between 1.2 and 4, and
Simulation 3: n ranged between 5 and 19, the true percent censored ranged between 1% and 50%

For Simulation 1, we generated 100 000 artificial datasets from censored lognormal or censored
contaminated lognormal distributions. The sample size for each dataset was randomly varied (using
the uniform distribution) between 20 and 100 (inclusive). The percentage of the distribution that
was censored was also randomly varied (using the uniform distribution), between 1% and 50%
(inclusive). The laboratory LOD was then set at the concentration in the distribution corresponding
to the percent censored.
Simulation 1 was repeated for each of four scenarios (see Table 1) and for each of the CDA methods.
In Scenario I, a single lognormal distribution was assumed as well as a single laboratory. The GM
was fixed at 1, while the GSD for the distribution was randomly varied between 1.2 and 4 (inclusive)
using the uniform distribution. In Scenario II, a single lognor-
used, as just described, but with three laboratories and three LODs, as was described for Scenario
II.
Simulation 2 was identical to Simulation 1 above, except that the percentage censored varied between
50% and 80% (inclusive). Simulation 3 was also identical to Simulation 1, except that the sample
sizes were allowed to vary between 5 and 19 (inclusive), rather than 20 and 100.
For each randomly generated dataset, the computer program did the following:
determined if the dataset was invalid,
determined if the dataset was completely uncensored,
applied standard statistical methods to each valid, uncensored dataset, and
applied the selected CDA method to each valid, censored dataset.

standard simple arithmetic mean formula. For the KM method, the mean was estimated without variation
from the procedure outlined in Helsel (2005). However, we estimated the mean regardless of the
actual fraction of censored data [Helsel (2005) recommended that the KM method should not be used to
estimate the mean whenever the dataset is >50% censored]. The KM and NP methods were used to
estimate the 95th percentile in those simulations where the sample size was 20 or greater.
Once the sample 95th percentile and mean were calculated, the differences between the sample
estimates and the true values were determined. After all 100 000 datasets were generated, the
program calculated the average bias for each of the parameters across all 100 000 datasets:

\[ \mathrm{Bias} = (\bar{x} - \theta); \]
Table 2. Simulation 1, Scenario I: single lognormal distribution and a single laboratory where the laboratory LOD is 1–50% of
the true distribution; 20 ≤ n ≤ 100; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLErm        0.4       MLEmpv       22.4      MLErm        0.0       MLE          17.9
LPRr         0.4       Sub LOD/√2   22.6      MLEr         0.1       LPR          18.4
Sub LOD/2    0.5       MLErm        23.0      LPRr         0.1       MLEr         19.5
MLEr         0.5       Sub LOD/2    23.1      MLEmpv       0.1       MLErm        19.6
MLE          0.6       MLEr         23.2      LPRrm        0.2       LPRr         19.6
LPRrm        1.4       MLE          23.3      LPR          0.5       LPRrm        19.8
LPR          2.5       LPRr         23.8      MLE          0.5       MLEmpv       19.8
KM           2.8       LPRrm        24.1      Sub LOD/√2   0.6       Sub LOD/√2   19.9
MLEmpv       6.0       Sub LOD      24.2      Sub LOD/2    1.8       Sub LOD/2    20.4
Sub LOD/√2   7.8       LPR          24.9      Sub LOD      4.2       KM           20.5
Sub LOD      13.5      KM           35.8      KM           4.2       Sub LOD      20.5
NP           15.2      NP           50.6
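The Simulation 1, Scenario I procedure behind Table 2 can be sketched in a few lines for the substitution methods and the mean. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, errors are expressed as a percentage of the true mean (an assumption suggested by the "Bias (%)" column headings), rMSE is taken as the root of the mean squared percentage error, and the screening of invalid datasets is omitted because its criteria are not given here.

```python
import math
import random
from statistics import NormalDist, mean

def simulate_dataset(rng):
    """One dataset: GM = 1, single laboratory, single LOD."""
    n = rng.randint(20, 100)            # sample size, inclusive
    gsd = rng.uniform(1.2, 4.0)         # true GSD
    frac = rng.uniform(0.01, 0.50)      # true fraction censored
    sigma = math.log(gsd)
    # LOD = concentration at the chosen quantile of the true distribution.
    lod = math.exp(sigma * NormalDist().inv_cdf(frac))
    x = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
    true_mean = math.exp(0.5 * sigma ** 2)  # lognormal mean when GM = 1
    return x, lod, true_mean

def substitute(x, lod, divisor):
    """Replace each non-detect with LOD / divisor (1, sqrt(2) or 2)."""
    return [xi if xi >= lod else lod / divisor for xi in x]

def bias_and_rmse(n_datasets=2000, divisor=2.0, seed=1):
    """Average percentage bias and rMSE of the substituted arithmetic mean."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_datasets):
        x, lod, true_mean = simulate_dataset(rng)
        estimate = mean(substitute(x, lod, divisor))
        errors.append(100.0 * (estimate - true_mean) / true_mean)
    bias = mean(errors)
    rmse = math.sqrt(mean(e * e for e in errors))
    return bias, rmse
```

Calling `bias_and_rmse(divisor=1.0)`, `bias_and_rmse(divisor=math.sqrt(2))` and `bias_and_rmse(divisor=2.0)` with the same seed compares LOD, LOD/√2 and LOD/2 substitution on identical datasets; the values will not reproduce the table exactly because far fewer datasets are generated.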
Table 4. Simulation 1, Scenario III: a contaminated lognormal distribution and a single laboratory where the laboratory LOD is
1–50% of the true distribution; 20 ≤ n ≤ 100; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
LPRr         0.2       MLEmpv       24.1      MLErm        0.1       MLE          18.2
MLEr         0.3       Sub LOD/√2   24.4      MLEmpv       0.1       LPR          18.7
MLE          0.5       Sub LOD/2    24.4      LPRr         0.1       MLErm        21.4
MLErm        1.2       MLErm        24.7      MLEr         0.2       MLEmpv       21.5
LPRrm        1.3       MLE          24.9      LPRrm        0.4       Sub LOD/√2   21.8
Sub LOD/2    1.6       MLEr         25.2      Sub LOD/√2   1.1       MLEr         21.8
LPR          1.8       LPRr         25.7      Sub LOD/2    1.4       Sub LOD/2    21.9
KM           4.6       Sub LOD      26.0      LPR          1.5       LPRrm        21.9
MLEmpv       7.2       LPRrm        26.5      MLE          2.3       LPRr         22.6
Sub LOD/√2   9.0       LPR          26.9      Sub LOD      4.3       KM           22.7
Sub LOD      14.4      KM           39.9      KM           4.4       Sub LOD      22.9
NP           19.7      NP           64.4
Table 5. Simulation 1, Scenario IV: a contaminated lognormal distribution and three laboratories where the LOD for each
laboratory fell in the range of 1–50% of the true distribution; 20 ≤ n ≤ 100; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLE          0.7       Sub LOD/√2   23.8      MLErm        0.0       LPR          16.8
Sub LOD/2    1.3       MLErm        24.1      LPRrm        0.2       MLE          18.1
LPRrm        2.2       Sub LOD/2    24.2      MLEmpv       0.4       LPRrm        21.4
MLEr         3.2       LPR          24.3      Sub LOD/√2   1.0       LPRr         21.5
MLErm        3.3       LPRr         24.3      Sub LOD/2    1.3       MLEr         21.6
KM           4.4       MLEr         24.3      MLEr         1.4       MLEmpv       21.8
MLEmpv       5.4       LPRrm        24.6      KM           1.6       KM           21.9
LPRr         7.9       MLE          24.7      LPR          1.9       Sub LOD/√2   21.9
LPR          7.9       Sub LOD      24.8      MLE          2.4       MLErm        21.9
Sub LOD/√2   8.7       MLEmpv       25.3      LPRr         3.1       Sub LOD/2    22.3
Sub LOD      13.3      KM           39.7      Sub LOD      4.3       Sub LOD      22.4
NP           19.5      NP           62.7
Table 7. Simulation 2, Scenario II: single lognormal distribution and three laboratories where the LOD for each laboratory fell
in the range of 50–80% of the true distribution; 20 ≤ n ≤ 100; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLE          0.2       MLErm        22.9      MLE          0.0       MLE          19.0
KM           2.9       LPRr         22.9      MLErm        0.1       MLErm        19.6
LPRrm        3.1       LPRrm        24.1      LPRrm        0.6       LPRrm        19.8
LPR          4.2       Sub LOD      24.3      Sub LOD/2    0.8       MLEr         19.9
LPRr         4.6       MLE          24.9      MLEmpv       1.7       MLEmpv       20.6
MLEr         4.8       Sub LOD/√2   26.3      MLEr         3.4       Sub LOD/2    21.4
MLErm        5.4       LPR          26.6      Sub LOD/√2   12.1      Sub LOD/√2   23.9
NP           15.0      MLEr         26.7      LPR          17.9      LPR          25.8
Sub LOD      17.8      Sub LOD/2    27.2      KM           19.8      KM           28.1
Sub LOD/2    19.0      MLEmpv       29.5      LPRr         21.9      LPRr         30.6
Sub LOD/√2   19.8      KM           34.1      Sub LOD      30.5      Sub LOD      37.2
MLEmpv       21.8      NP           51.2
Comparison of several methods for analyzing censored data 621
Table 8. Simulation 2, Scenario III: a contaminated lognormal distribution and a single laboratory where the laboratory LOD is
50–80% of the true distribution; 20 ≤ n ≤ 100; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLE          0.9       MLErm        26.7      Sub LOD/2    0.2       MLE          20.1
MLEr         1.0       MLE          27.7      LPRr         0.9       LPR          21.0
MLErm        1.2       MLEr         27.9      MLEr         1.0       MLErm        21.6
LPRr         3.0       Sub LOD      28.2      MLErm        1.1       MLEr         21.6
LPRrm        3.8       Sub LOD/√2   29.1      MLE          2.3       LPRr         21.8
KM           4.5       LPRr         29.5      LPR          3.2       MLEmpv       21.8
LPR          5.4       Sub LOD/2    29.7      MLEmpv       3.2       LPRrm        22.3
NP           19.7      LPRrm        30.3      LPRrm        3.8       Sub LOD/2    22.7
Sub LOD      21.1      MLEmpv       31.3      Sub LOD/√2   12.4      Sub LOD/√2   25.9
Sub LOD/2    21.2      LPR          31.8      Sub LOD      29.9      Sub LOD      39.0
Sub LOD/√2   22.1      KM           40.0      KM           30.0      KM           39.1
MLEmpv       22.7      NP           62.3
Table 10. Simulation 3, Scenario I: single lognormal distribution and a single laboratory where the laboratory LOD is 1–50% of
the true distribution; 5 ≤ n ≤ 19; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLEmpv       1.3       Sub LOD/√2   54.3      LPRrm        0.0       MLE          38.9
Sub LOD/√2   1.8       Sub LOD/2    58.3      MLErm        0.0       LPR          40.2
MLE          3.7       Sub LOD      60.9      MLEr         0.2       LPRr         41.2
MLErm        4.1       MLEmpv       61.3      MLEmpv       0.2       MLErm        41.8
LPRr         4.4       MLE          63.4      Sub LOD/√2   0.7       MLEr         42.0
MLEr         5.4       MLErm        63.7      LPR          0.8       KM           42.1
Sub LOD/2    6.8       MLEr         66.7      LPRr         0.9       Sub LOD/√2   42.7
Sub LOD      7.9       LPRr         69.0      Sub LOD/2    1.9       Sub LOD/2    42.9
LPRrm        11.5      LPRrm        94.8      MLE          2.1       MLEmpv       43.1
LPR          12.6      LPR          110.5     KM           4.0       LPRrm        43.2
                                              Sub LOD      4.4       Sub LOD      45.6
Table 11. Simulation 3, Scenario II: single lognormal distribution and three laboratories where the LOD for each laboratory fell
in the range of 1–50% of the true distribution; 5 ≤ n ≤ 19; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLErm        0.9       Sub LOD      50.8      MLErm        0.1       MLE          38.0
MLEmpv       0.8       Sub LOD/√2   56.0      MLEmpv       0.3       LPR          38.0
LPRr         1.0       MLErm        59.6      LPRrm        0.4       MLErm        41.0
Sub LOD/√2   1.2       MLEmpv       60.2      Sub LOD/√2   0.8       MLEmpv       41.7
MLEr         2.9       Sub LOD/2    60.6      MLEr         1.5       MLEr         42.1
MLE          3.1       MLE          60.7      LPR          1.6       Sub LOD      42.5
LPR          3.4       LPRr         62.3      KM           1.6       LPRrm        42.7
LPRrm        4.1       MLEr         63.8      Sub LOD/2    1.8       LPRr         42.8
Sub LOD      6.8       LPRrm        72.3      MLE          2.3       KM           44.0
Sub LOD/2    7.2       LPR          90.6      LPRr         4.0       Sub LOD/2    44.9
                                              Sub LOD      4.2       Sub LOD/√2   45.6
Table 13. Simulation 3, Scenario IV: a contaminated lognormal distribution and three laboratories where the LOD for each
laboratory fell in the range of 1–50% of the true distribution; 5 ≤ n ≤ 19; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLEmpv       0.3       Sub LOD      54.8      MLErm        0.2       LPR          38.5
MLErm        0.8       Sub LOD/√2   57.4      LPR          0.2       MLE          38.9
LPRr         0.8       MLE          63.7      MLEmpv       0.2       LPRr         46.2
Sub LOD/√2   2.3       Sub LOD/2    64.3      LPRrm        0.7       MLErm        46.5
MLE          2.3       MLEmpv       65.5      Sub LOD/√2   0.9       Sub LOD      46.5
MLEr         2.9       MLErm        66.4      Sub LOD/2    1.3       Sub LOD/√2   47.4
LPR          3.9       LPRr         66.8      MLEr         1.4       Sub LOD/2    47.8
LPRrm        4.9       MLEr         70.6      KM           1.7       MLEmpv       47.9
Sub LOD/2    6.0       LPRrm        84.0      LPRr         3.8       MLEr         47.9
Sub LOD      7.5       LPR          89.1      MLE          4.0       LPRrm        48.3
                                              Sub LOD      4.2       KM           49.2
minus several percentage points. Consequently, we feel that the bias and rMSE estimates in the
tables are reliable.
When comparing methods, our view is that there is little practical difference between methods having
an absolute bias that differs by only 1% or so, or a rMSE that differs by only 2% or so.
Furthermore, it is worth mentioning that these are composite results for a wide range of
distributions and censoring points created using the simulation parameter ranges specified in
Table 1. It is likely that the rankings would change if the methods were challenged with a specific
distribution, a specific LOD and a specific sample size. Our view is that a composite analysis is
more informative regarding the performance that one should expect in the long run from each method.
Which comparison metric should be used: the bias

other Simulation–Scenario combinations. Other investigators may have a different experience. For
example, perhaps only a single laboratory is used, but the sample sizes are typically <20,
suggesting that the Simulation 3 results will be more informative. Or perhaps the datasets are
always complex and the lognormal distribution assumption is always in doubt, resulting from the
combination of data from dispersed areas and/or time periods, in which case the Scenario IV
simulations are of interest.

Substitution methods
While the substitution methods have often been condemned (She, 1997; Helsel, 2005), their use
continues. Our results, as expected, show that there is good reason for not using the LOD
substitution method, as it consistently ranked near the bottom
LPR-based methods
While there were exceptions, the LPR-based methods tended to be in the middle to top half of the
bias and rMSE rankings for Simulations 1 and 2. The LPR-based methods appear to be fairly robust
when confronted with multiple LODs and/or contaminated distributions. The LPRrm method tended to
have lower bias than the LPR and LPRr methods in the multiple LOD scenarios (Scenarios II and IV) in
Simulations 1 and 2. Overall, the LPR-based methods were slightly lower in the rankings than the
MLE-based methods, although exceptions frequently occurred. The LPRrm method, which was designed
solely with the intention of estimating the mean from complex datasets and when the single lognormal
distribution assumption is in doubt, consistently had low bias when estimating the mean in all three
simula-

rMSE for the 95th percentile and low bias for the mean.
Overall, our choice would be either of the MLE or MLEr methods when estimating the 95th percentile
and the MLE or MLErm methods when estimating the mean. Since both the MLEr and MLErm methods require
additional manipulations, we would, if confined to a single choice, select the standard MLE method.
Regarding the mean, the standard MLE method almost always had the lowest rMSE, regardless of the
simulation or scenario. For the 95th percentile, the MLE method consistently appeared in the top
half of the rankings for both bias and rMSE, particularly for the severely censored scenarios
(Simulation 2; Tables 6–9) and appeared to be surprisingly robust when confronted with contaminated
distributions (Scenarios III and IV) and complex-
method ‘seems better’ than the MLE method, but their conclusion was not free of equivocation and was
focused on hypothesis testing rather than parameter estimation (as is the focus of our study). In
the other study, the investigator (She, 1997) found that the LOD/2 substitution method often
outperformed the KM method, but recommended the KM method over the substitution method because the
LOD/2 method ‘has no statistical theoretical basis’.
Interestingly, as we programmed the KM method, we immediately recognized that when estimating the
mean exposure the KM method is mathematically identical to the worst of the substitution methods
(LOD substitution), which was demonstrated in the results (e.g. see the tables for Scenarios I and
III), resulting in a strong positive bias for the mean. If the non-detects are internal, that is,
bounded on both

for the KM to be successfully applied the censoring must be random: ‘. . .the probability that the
measurement of an object is censored cannot depend on the value of censored variable’. Schmoyer et
al. (1996) recognized this essential requirement when in their computer simulations the authors
assumed that the censoring point or LOD was a random variable and generated a random LOD for each
random measurement. However, with occupational exposure data (and we suspect environmental data as
well), the probability that a measurement will be censored does indeed depend on the true ‘value’ or
concentration: the censoring point is relatively fixed and any true concentration below that
censoring point will result in a LOD measurement. These considerations make it difficult to envision
an occupational scenario where the KM method could be applied, even if it
Table 14. Simulation using Simulation 1, Scenario IV parameters, comparing the eight 95th percentile (quantile) estimation
methods presented by Hyndman and Fan (1996)
95th Percentile
Method                          Bias (%)  Method                          rMSE
Q1: empirical CDF               3.3       Q4: weighted average 1          30.0
Q2: empirical CDF w/averaging   3.9       Q7: weighted average 3          30.1
Q7: weighted average 3          4.1       Q3: closest value               32.0
Q4: weighted average 1          4.8       Q2: empirical CDF w/averaging   37.6
Q3: closest value               4.9       Q1: empirical CDF               38.8
Q5: Cleveland                   5.8       Q5: Cleveland                   39.9
Q8: median quartile             10.1      Q8: median quartile             47.6
Q6: weighted average 2          19.4      Q6: weighted average 2          63.7
the bias and rMSE values for the NP quantile methods were virtually identical to those in Table 14.
For the LPR- and MLE-based methods, the bias and rMSE values were again superior, little worse than
the values in Table 5. However, when we increased the sample size range from 20–100 to 100–1000,
holding all other Simulation 1, Scenario IV parameters the same, the bias and rMSE values for the NP
quantile methods tended to approach those of the LPR- and MLE-based methods. This suggests that the
NP methods could be applied to contaminated distributions, but only for very large sample sizes.
In summary, the NP methods, while they have the advantage of being applicable to any underlying
exposure profile, whether or not that profile is close to some assumed distribution function, do not
perform as well as the parametric methods for right-skewed

For these larger sample sizes, the bias and rMSE results for the LPR- and MLE-based methods tend to
converge, suggesting that method-related differences should be a minor consideration when estimating
either the 95th percentile or the mean. When estimating the 95th percentile in the contaminated
distribution scenarios, the NP quantile methods tended to slightly outperform the MLE-based methods
in terms of bias, but not rMSE, suggesting that the NP methods should not in such instances be
considered an automatic alternative to the higher order methods when estimating the 95th percentile
for large datasets. For sample sizes <100, the LPR- and MLE-based methods are sufficiently robust to
be preferable to the NP quantile methods.
One of the themes of this paper is that there will be datasets where the unimodal, lognormal
distribution assumption is inappropriate, but to date, there is little guidance on how to make this
determination. Certainly, subjective graphical techniques (log-probability plots and histograms) are
helpful, but thus far all of the objective goodness-of-fit procedures assume or require a complete
or uncensored dataset. A censored data goodness-of-fit method, perhaps consisting of both a
subjective graphical test and an objective statistical test, where the outcome is a decision to use
a standard versus robust version of a method (e.g. MLE versus MLEr or MLErm) might prove to be
useful.
None of the methods studied here accounts for the residual information in a complex dataset: the LOD
and the laboratory in use for each measurement

effect of measurement error. In our own computer simulations, we assumed (i) that the randomly
generated exposures were measured precisely and (ii) that the measurements were not rounded or
truncated. Therefore, our conclusions, and those of the referenced studies, are strictly applicable
to ideal exposure measurement systems.
A sampling and analytical method produces an estimate of the true concentration. The accuracy
(referring to the combination of bias and precision) of these estimates varies with the mass
collected, the analyte and the analytical method. Due to variability in the sampling pump flow rate
and manufacturing variation in the sampling device (e.g. the sampling device used to obtain a
respirable dust sample), the mass collected will be an estimate of the true mass per unit volume at
the location sampled. An analyti-
by adding measurement error to the true concentration using the following equation:

\[ x' = x\,\bigl(1 + Z_r\,\mathrm{CV}_t(x)\bigr), \]

where x′ = measured concentration, x = true concentration (a random value generated from a lognormal
distribution), Z_r = random Z-value and CV_t(x) = the sampling and analytical method CV_t at x (see
Appendix 2).
The results are presented in Tables 15–17. Interestingly, while the bias tended to increase for all
methods, the rMSE often changed little or even decreased. (After considering the issue, we
recognized that since the CVt increases with decreasing concentration, it is more likely that a true
non-detect will be ‘measured’ as a detect and less likely that a detect will be measured as a
non-detect. This results in

normal distribution function, the actual exposure profile for any particular workplace has an upper
boundary and is an unknown function of the physical parameters of the workplace and work practices
of the employees, and not a function of the GM and GSD. Even if the underlying exposure profile is
reasonably lognormal, we will never know the true GSD or the true percentage of the underlying
distribution that lies below the field LOD, so that overall it can be difficult to determine which
of the simulation and scenario combinations devised here best fits our situation.
Our results show that for the simulations and scenarios postulated an ‘omnibus’ CDA method does not
yet exist. In our view, a ‘preferred’ CDA method is one that has both low bias and rMSE for the
exposure profile parameter of interest and is robust when-
Table 15. Simulation 1, Scenario I (with measurement error): single lognormal distribution and a single laboratory where the
laboratory LOD is 1–50% of the true distribution; 20 ≤ n ≤ 100; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
MLErm        1.7       MLEmpv       22.5      MLE          0.4       MLE          17.9
LPRr         2.5       Sub LOD/√2   22.8      Sub LOD/√2   0.8       LPR          18.4
MLEr         2.8       MLErm        22.9      LPRrm        1.0       MLErm        19.6
MLE          2.8       Sub LOD      23.7      MLEmpv       1.1       LPRrm        19.7
Sub LOD/2    3.5       MLE          23.7      MLErm        1.1       MLEr         19.7
LPRrm        3.5       MLEr         23.7      MLEr         1.2       MLEmpv       19.7
MLEmpv       3.6       LPRr         24.1      LPRr         1.5       LPRr         19.9
LPR          4.3       LPRrm        24.7      LPR          1.5       Sub LOD/√2   20.0
KM           4.7       Sub LOD/2    24.9      Sub LOD/√2   1.7       Sub LOD/2    20.6
Sub LOD/√2   5.3       LPR          25.2      Sub LOD      5.3       KM           20.8
Sub LOD      11.4      KM           34.1      KM           5.3       Sub LOD      20.9
NP           16.9      NP           52.3
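The measurement-error variant of the simulations perturbs each true concentration with x′ = x(1 + Z_r·CV_t(x)), where CV_t(x) is the concentration-dependent total coefficient of variation given in Appendix 2. A minimal sketch follows, using the respirable dust values from that appendix (CVpump = 0.05, CVsampler = 0.05, σmass = 0.010 mg, 1.7 Lpm, 480 min). The function and constant names are hypothetical, and x is assumed to be in mg/m³.

```python
import math
import random

CV_PUMP = 0.05             # pump flow-rate variability
CV_SAMPLER = 0.05          # sampler (cyclone) variability
SIGMA_MASS_MG = 0.010      # overall weighing standard deviation, mg
FLOW_M3_PER_MIN = 0.0017   # 1.7 Lpm expressed in m^3/min
TIME_MIN = 480.0           # averaging time, min

def cv_total(x):
    """Total CV at true concentration x (mg/m^3)."""
    # Analytical component: sigma_mass divided by the mass collected (Q * T * x).
    cv_analysis = SIGMA_MASS_MG / (FLOW_M3_PER_MIN * TIME_MIN * x)
    return math.sqrt(CV_PUMP ** 2 + CV_SAMPLER ** 2 + cv_analysis ** 2)

def measure(x, rng):
    """Apply x' = x * (1 + Z * CV_t(x)) with Z drawn from a standard normal."""
    z = rng.gauss(0.0, 1.0)
    return x * (1.0 + z * cv_total(x))

if __name__ == "__main__":
    rng = random.Random(0)
    for conc in (0.01, 0.1, 1.0, 10.0):
        print(conc, round(cv_total(conc), 3), measure(conc, rng))
```

At high concentrations `cv_total` flattens toward √(0.05² + 0.05²) ≈ 0.071, while at low concentrations the analytical term dominates. The sketch does not guard against negative measured values, which this model can produce when CV_t(x) is large.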
Table 17. Simulation 3, Scenario I (with measurement error): single lognormal distribution and a single laboratory where the
laboratory LOD is 1–50% of the true distribution; 5 ≤ n ≤ 19; 1.2 ≤ GSD ≤ 4
95th Percentile                               Mean
Method       Bias (%)  Method       rMSE      Method       Bias (%)  Method       rMSE
Sub LOD/√2   0.8       Sub LOD      52.4      Sub LOD/2    0.9       MLE          38.5
MLEmpv       3.7       Sub LOD/√2   55.6      LPRrm        0.9       LPR          40.5
MLE          6.1       Sub LOD/2    59.7      MLE          1.0       MLErm        41.7
Sub LOD      6.1       MLEmpv       61.0      MLEr         1.0       MLEr         41.9
LPRr         6.4       MLE          62.2      MLErm        1.1       LPRr         41.9
MLErm        6.6       MLEr         65.0      MLEmpv       1.3       MLEmpv       42.1
MLEr         7.2       MLErm        65.8      Sub LOD/√2   1.8       LPRrm        42.9
Sub LOD/2    9.8       LPRr         70.9      LPR          2.0       Sub LOD      43.2
LPRrm        13.1      LPRrm        85.6      LPRr         2.2       KM           43.2
LPR          13.9      LPR          96.2      KM           5.2       Sub LOD/√2   43.7
                                              Sub LOD      5.3       Sub LOD/2    44.0
apart from any opinions regarding the preferred method, ease of use and accessibility are bound to

methods did consistently well in all of the scenarios, our table consists primarily of the MLE-based meth-
Hyndman and Fan (1996)  Systat Version 11              SAS Version 9        Intermediate calculations^a       95th Percentile calculation
quantile method
Q1                      Empirical CDF                  QNTLDEF 3            k = 0.95n, i = Floor(k)           If (k − i) > 0 then X0.95 = x_{i+1}; if (k − i) = 0 then X0.95 = x_i
Q2                      Empirical CDF with averaging   QNTLDEF 5 (default)  k = 0.95n, i = Floor(k)           If (k − i) > 0 then X0.95 = x_{i+1}; if (k − i) = 0 then X0.95 = ½(x_{i+1} + x_i)
Q3                      Closest value                  QNTLDEF 2            k = 0.95n, i = Round(k)           X0.95 = x_i
Q4                      Weighted average 1             QNTLDEF 1            k = 0.95n, i = Floor(k)           X0.95 = x_i + (k − i)(x_{i+1} − x_i)
Q5                      Cleveland’s method (default)                        k = 0.95n + ½, i = Floor(k)       X0.95 = x_i + (k − i)(x_{i+1} − x_i)
Q6                      Weighted average 2             QNTLDEF 4            k = 0.95(n + 1), i = Floor(k)     X0.95 = x_i + (k − i)(x_{i+1} − x_i)
Q7                      Weighted average 3                                  k = 0.95(n − 1) + 1, i = Floor(k) X0.95 = x_i + (k − i)(x_{i+1} − x_i)
Q8                                                                          k = 0.95(n + ⅓) + ⅓, i = Floor(k) X0.95 = x_i + (k − i)(x_{i+1} − x_i)
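The weighted-average rows of the table above share one interpolation formula and differ only in how k is computed. A minimal sketch for Q4, Q6 and Q7 on a complete (uncensored) sample is shown below. The function names are hypothetical, and clamping the order-statistic index to the sample range is an assumption made here for small n; the table does not say how that case is handled.

```python
import math

def order_stat(xs, i):
    """1-based order statistic of sorted xs, clamped to the sample range."""
    return xs[min(max(i, 1), len(xs)) - 1]

def interpolate(data, k):
    """X = x_i + (k - i)(x_{i+1} - x_i) with i = Floor(k)."""
    xs = sorted(data)
    i = math.floor(k)
    x_i = order_stat(xs, i)
    return x_i + (k - i) * (order_stat(xs, i + 1) - x_i)

def x95_q4(data):
    return interpolate(data, 0.95 * len(data))            # k = 0.95n

def x95_q6(data):
    return interpolate(data, 0.95 * (len(data) + 1))      # k = 0.95(n + 1)

def x95_q7(data):
    return interpolate(data, 0.95 * (len(data) - 1) + 1)  # k = 0.95(n - 1) + 1

if __name__ == "__main__":
    sample = list(range(1, 22))  # 21 evenly spaced values
    print(x95_q4(sample), x95_q6(sample), x95_q7(sample))
```

For the 21-value sample the three definitions give approximately 19.95, 20.9 and 20.0. The spread between them shrinks as n grows.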
Fig. 1. Total coefficient of variation (CVt) calculated as a function of true concentration when
sampling respirable dust (i.e. RPM).

\[ \mathrm{CV}_t(x) = \sqrt{\mathrm{CV}_{\mathrm{pump}}^2 + \mathrm{CV}_{\mathrm{sampler}}^2 + \left(\frac{\sigma_{\mathrm{mass}}}{Q\,T\,x}\right)^2}, \]

where σmass = the standard deviation of the analytical system; Q = flowrate; T = averaging time for
the measurement and x = true concentration.
Using the sampling of respirable dust (respirable particulate mass, RPM) as an example (and ignoring
any uncorrectable particle size distribution effects), the total coefficient of variation at
different concentrations of RPM can be estimated. Although recent studies (Kogut et al., 1997) have
shown slightly lower values, the CVpump has traditionally been given a value of 0.05. The CVsampler
for the Dorr-Oliver 10-mm nylon cyclone has been estimated by Kogut et al. (1997) to be 0.023, but
in this example we will use 0.05 as reported by Bartley et al. (1994). The σmass will vary with the
analyte and analytical method. For a single weighing of a filter used in respirable dust sampling, a
typical σmass is 0.005 mg. Let us assume that the mass collected on each filter is also blank
corrected (i.e. each sample filter has a matching blank filter). Since both the sample filter and
the matched blank are pre- and post-weighed, a total of four weighings are required to estimate the
mass collected on the sample filter, resulting in an overall σmass of 0.010 mg. Finally, the flow
rate and averaging time will be set at the standard values of 1.7 Lpm (for RPM) and 480 min,
resulting in the following equation:

\[ \mathrm{CV}_t(x) = \sqrt{0.05^2 + 0.05^2 + \left(\frac{0.010\ \mathrm{mg}}{0.0017\ \mathrm{m^3\,min^{-1}} \times 480\ \mathrm{min}} \cdot \frac{1}{x}\right)^2}. \]

Figure 1 shows the relationship between the CVt and the true RPM concentration. At the higher
concentrations, the CVt is relatively constant. Sufficient mass is collected that the contribution
of the CVanalysis becomes insignificant compared to the fixed variability due to the sampling pump
and sampler. At low concentrations, the CVanalysis predominates and steadily increases with
decreasing collected mass. (The curve does not remain flat forever as the concentration increases.
Very high concentrations will tend to result in overloaded samplers, which will drive the CVt
upwards.)
A CVt curve can be determined for any analyte and sampling method, and will have a shape similar to
that in Fig. 1. According to NIOSH (Abell and Kennedy, 1997), a reasonably accurate method should
have a true CVt that is <0.128 over the range of 10–200% of the exposure limit. However, at the
method’s field LOD, that is, the laboratory LOD di-

Eduard W. (2002) Estimation of mean and standard deviation (letter to the editor). Am Ind Hyg Assoc J; 63: 4.
El-Shaarawi AH, Esterby SR. (1992) Replacement of censored observations by a constant: an evaluation. Water Res; 26: 835–44.
Environmental Protection Agency (EPA). (2006) Data quality assessment: statistical methods for practitioners, EPA QA/G-9S. Washington, DC: Environmental Protection Agency.
Finkelstein MM, Verma DK. (2001) Exposure estimation in the presence of nondetectable values: another look (see AIHAJ 63:4 2002 for letters to the editor). AIHAJ; 62: 195–8.
Frome EL, Wambach PF. (2005) Statistical methods and software for the analysis of occupational exposure data with non-detectable values. Oak Ridge, TN: Oak Ridge National Laboratory ORNL/TM-2005/52.
Gibbons RD, Coleman DE. (2001) Statistical methods for detection and quantification of environmental contamination. New York: John Wiley and Sons, Inc.
Gilbert RO. (1987) Statistical methods for environmental pollution monitoring. New York: Van Nostrand Reinhold.