Assuring The Quality of Test and Calibration Results
7.5.1 Use of certified reference materials
Spikes are widely used for method validation and calibration in chemistry and
microbiology. They provide a reasonable alternative to certified references, if the
spiking
material is adequately authenticated, ideally by certification of its purity. On the
face of it,
a spike has the advantage that the laboratory can spike into a matrix which is
absolutely
typical of its normal sample stream. The counter argument is to question whether
a
material spiked into the sample artificially is really present in the same distribution
and
speciation as the actual target. The strength of this argument depends on the
matrix. A
metal ion spiked into a water sample might well be regarded as a valid approach,
but a
pesticide spiked into a food sample may be questioned on the grounds that the
pesticide
in real samples was, perhaps, systemically absorbed by the crop used to make the
food
and so may be bound into the cell structure. However, in complex matrices the
spike
may be the only alternative, however imperfect it may be suspected to be.
A spike is generated by taking a real sample and adding a known amount of the
target in
question. Ideally, the base sample for the spike should have little or none of the
target
present before spiking. If this is not possible, the spike level should be large
compared to
the natural level present. Of course, the natural level must be known in this
instance. The
spike must be thoroughly mixed and distributed homogeneously throughout the
matrix.
The spike does not provide true traceability but it can be reasonably assumed that
laboratories which are able to demonstrate good recoveries of spikes have good
accuracy
and hence will tend to agree.
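The recovery check described here can be sketched numerically. The 80–120 % acceptance band below is purely an illustrative assumption; acceptable recovery in practice depends on the analyte, the level and the matrix:

```python
def spike_recovery(spiked_result, base_result, amount_added):
    """Percent recovery of a spike: the fraction of the added target
    actually found, as a percentage. All values in the same units
    (e.g. mg/L)."""
    return 100.0 * (spiked_result - base_result) / amount_added

# Illustrative figures (not from the text): base sample 0.2 mg/L,
# spike adds 2.0 mg/L, spiked sample measures 2.1 mg/L.
recovery = spike_recovery(2.1, 0.2, 2.0)   # about 95 %
acceptable = 80.0 <= recovery <= 120.0     # assumed acceptance band
```

Note that if the base level is large compared to the spike, the subtraction magnifies the uncertainty of both measurements, which is why the text recommends a base sample with little or none of the target present.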
The use of spikes is especially important where laboratories are carrying out tests
in
complex matrices which may affect the results. Examples are water analysis where
matrix
effects are common and microbiology where components of the sample may well
affect
the viability of organisms. The spike, at the very least, demonstrates that the
laboratory
would detect the material or organism being sought if it were present.
Working standards are compared with the reference regularly; the reference itself is, perhaps, calibrated externally once a year.
Quality control samples are used in exactly the same way as spikes and CRMs.
They are
merely samples for which the laboratory has established values and acceptance
limits.
They are tested along with unknown samples as a performance check. The
laboratory
may establish the values of the analytical quality control samples by repeated
testing, but
they should, ideally, be confirmed by at least two other laboratories.
If possible, quality control samples should be calibrated against CRMs. In this
instance
they become transfer standards, and the quality control sample provides
traceability. This
strategy is frequently adopted when expensive CRMs are needed since the
laboratory can
use the quality control sample routinely and check it only occasionally against the
expensive CRM.
ISO Guide 35 is a useful source of information on procedures for validating in-house
quality control samples and confirming their homogeneity. Although, strictly
speaking,
the Guide is intended to refer to the production of CRMs, similar principles apply
to the
production of in-house reference materials for use as quality control samples.
As with the spikes, quality control samples which are not calibrated against CRMs
do not
provide traceability in themselves but demonstrate consistency of performance on
the
part of a laboratory. Such consistency, when combined with satisfactory results
from
interlaboratory exercises showing that the laboratory normally agrees with its
peers,
comes a very close second to establishing true traceability and is, in many
situations, the
only possible option.
The laboratory will need to have a policy on the use of quality control samples and
for
evaluating and responding to quality control results. Guidance on this is given in
section
7.6.
Laboratories using spikes and other forms of analytical quality control samples are
effectively monitoring the consistency of their own performance. They should have
a
very good picture of their internal reproducibility. If they perform well on spikes
and on
any CRMs, they also have every reason to believe that their results are accurate.
Nonetheless, it is in the interest of any laboratory to test this assumption from time
to
time by exchanging samples with other laboratories and comparing results.
Such exercises are a very effective extension to the internal quality control
programme of
laboratories. They also provide an element of traceability when CRMs are not
available
since the more laboratories agree on results, and the wider the range of samples on which they agree, the more certain everyone can be of the accuracy of the collective results. This
is further reinforced if agreement spans several analytical methods for the same
determinand.
Interlaboratory studies may be informal, in that a group of laboratories will
exchange
samples on an ad hoc basis, or may be formal exercises organised by a third party
who
circulates performance indicators. Irrespective of how it is done, the crucial part of
the
exercise is that the laboratory uses the data. This means reacting positively to any
results
which indicate that it is not performing as well as the other participants and
carrying out
remedial action.
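Formal schemes commonly circulate z-scores as the performance indicator. The sketch below uses the conventional reading (|z| ≤ 2 satisfactory, |z| ≥ 3 unsatisfactory); this convention and the example figures are illustrative assumptions, not a description of any particular scheme:

```python
def z_score(result, assigned_value, sigma_pt):
    """Proficiency-testing z-score: the laboratory's result relative to
    the assigned value, scaled by the standard deviation set for the
    exercise."""
    return (result - assigned_value) / sigma_pt

def interpretation(z):
    """Conventional reading of a z-score."""
    if abs(z) <= 2.0:
        return "satisfactory"
    if abs(z) < 3.0:
        return "questionable - investigate"
    return "unsatisfactory - remedial action required"

# Illustrative round: assigned value 10.0, target sigma 0.5.
z = z_score(11.2, 10.0, 0.5)   # 2.4, a questionable result
```

Reacting to a questionable score before it becomes an unsatisfactory one is exactly the kind of positive use of the data that the text calls for.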
Participation in appropriate interlaboratory proficiency schemes will normally be
required
by accreditation bodies, and some bodies actually operate their own schemes.
Although
interlaboratory comparison is listed in ISO 17025 as only one of several quality
maintenance options, most accreditation bodies will insist on its use wherever
possible. If
there are schemes available in a laboratory's sphere of activity or opportunities for
ad hoc
exchanges with other laboratories operating in the field, the accreditation
applicant will
have to provide good reasons to the accreditation body for not participating in
these
activities. The accreditation body simply sees inter-comparisons as the most
stringent test
of a laboratory and wants to see the laboratory subject itself to such a test.
The recently changed world security situation has resulted in severe difficulties in
shipping samples for proficiency testing across national boundaries, especially by
airfreight. This has, unfortunately, coincided with an increase in the insistence, by
accreditation bodies and particularly by the regional laboratory accreditation
conferences,
on proficiency testing as an activity for accredited laboratories. As a result it is
becoming
almost essential for any national accreditation body to ensure that adequate
proficiency
testing is available within the country before it can seek international recognition
through
MRAs and conference membership.
Accreditation will not normally be conditional upon any particular level of
performance
in interlaboratory comparison, but what will be required is for the laboratory to
have a
documented procedure for evaluating the results from its participation and for
responding to any problems revealed. There must also be records showing that the
results were evaluated and what action was taken to remedy problems.
The accreditation body will not withdraw accreditation on the basis of isolated
instances
of poor interlaboratory proficiency performance. However, a laboratory which is consistently a poor performer can expect its accreditation to be questioned unless it demonstrates effective remedial action.
In the case of many methods, neither certified reference materials nor effective
spikes are
available. There may, however, be what are often referred to as consensus
standards,
recognised by all parties concerned. These would include many industry standards,
such
as those used in, for example, petroleum source rock analysis or colour fastness
measurements. Such standards may not be traceable in a strict sense but are used
to
ensure consistency of data within the industry sector and hence form a basis for
agreement when testing against product quality standards.
Another approach, well established in analytical chemistry, is also recognised as a means of providing confidence in results: determinations by different methods which lead to comparable answers are generally persuasive.
Repeat determinations are also used to provide confidence in results. Such
confidence
may be misplaced since errors may be repeated, especially systematic errors
resulting
from poor design of the system or errors in making up reagents. Where repeat
determinations can be valuable is if samples are retained over a relatively long
timescale
and then resubmitted, ideally blind to testing personnel, to check the consistency
of the
data.
Correlation of results from different characteristics of an item is also mentioned as
a
quality assurance method in ISO 17025. The extent to which this is relevant will
depend
not only on the type of testing being carried out but also on whether the client has
requested a range of tests rather than an isolated test. There is no explicit
requirement for
the laboratory to do additional tests to provide data for such correlations. Most
laboratories will scrutinise data in this way where they can, for example in water
analysis
to confirm that the pH, hardness, alkalinity, conductivity and dissolved solids present a
consistent picture. This type of scrutiny should, however, be systematised and
there
should be guidelines in the methods documentation on the criteria to be used in
the
assessment so that it is made consistently.
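One such cross-check can be sketched using the rule of thumb that, for natural waters, dissolved solids (mg/L) are typically around 0.55–0.7 times the conductivity (µS/cm). The band used below is an illustrative assumption; the actual criteria would be set in the method documentation, as the text recommends:

```python
def tds_conductivity_consistent(tds_mg_l, conductivity_us_cm,
                                low=0.55, high=0.70):
    """Flag water results whose TDS/conductivity ratio falls outside
    the band expected for natural waters (the band here is an assumed
    example, not a universal criterion)."""
    ratio = tds_mg_l / conductivity_us_cm
    return low <= ratio <= high

# Illustrative results: 325 mg/L dissolved solids, 500 uS/cm conductivity
# give a ratio of 0.65, inside the assumed band.
ok = tds_conductivity_consistent(325.0, 500.0)
```

A failed check does not say which determination is wrong; it simply flags the pair for investigation before the results are reported.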
Empirical, method-defined measurements also fall into the same category. Measurements of fat or fibre, for example, are clearly
clearly
method-dependent since they do not measure precisely defined chemical species. If
fat
is determined by weighing the material extracted with pentane, for example, the
definition we are adopting for fat is that material which is extractable by pentane.
In these cases, the correct result is defined in terms of a reference method which
is
tightly specified, and traceability effectively means traceability to the reference
method.
Laboratories using methods other than the reference method should calibrate
them
against the reference method from time to time and would be under an obligation
to
demonstrate that any method which they choose to adopt gives data comparable to
that
from the reference method.
Even where targets are clearly defined, it may be necessary to agree on a
reference
method. This will arise when there are several methods which typically return
different
results.
Ideally, the technical problems implied by the inconsistency between methods
should be
resolved, the best method chosen and the rest discarded. Sometimes, however, it is
not
possible to come to a definitive answer on which method is technically superior
and,
especially where enforcement is the issue, one method is more or less arbitrarily
defined
as giving the correct result in order to solve the impasse.
Under these circumstances, interlaboratory calibrations are essential since the
method
defines the reference values and this is meaningless unless all participating
laboratories
are able to produce results which agree when all use the reference method.
The next section gives an introduction to the use of control charts which might be adopted by a laboratory which is new to the area of statistical quality control.
A common rule of thumb is to review the data every sixty points. If 1-6 (inclusive) points are found outside the 2σ limits, then the indication is that the limits are satisfactory. If more points are found outside, then the limits are optimistic and either they should be revised or the method investigated in order to bring it back to the level of performance required. If no points are found outside 2σ, then new limits should be set reflecting the
enhanced performance. Some laboratories take the view in this last case that they
will not
reduce the limits since this will result in increased rejection of data. In these
circumstances, the decision on whether to revise to tighter control limits will be
determined by whether the un-revised limits are acceptable in the sense that they
indicate
that the method is fit for purpose.
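The review rule of thumb above can be sketched as follows; the thresholds simply restate the figures given in the text (sixty points, 1–6 outside the 2σ limits):

```python
def review_limits(qc_results, mean, sigma):
    """Apply the rule of thumb: over a batch of about sixty QC results,
    count points outside the 2-sigma warning limits and judge whether
    the limits are still realistic."""
    outside = sum(1 for x in qc_results if abs(x - mean) > 2.0 * sigma)
    if outside == 0:
        return "no points outside 2-sigma: consider tighter limits"
    if outside <= 6:
        return "limits satisfactory"
    return "limits optimistic: revise limits or investigate the method"
```

As the text notes, a laboratory may choose not to tighten limits after a clean run, provided the existing limits still demonstrate fitness for purpose.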
There is no simple answer to how frequently quality control items should be run.
[Control chart: individual test results (X) plotted against date of test; expected value 1.0, warning limits at ±2σ (0.4 and 1.6), action limits at ±3σ (0.1 and 1.9).]
Where possible, it is best to use a CRM as the quality control standard, since this not only checks consistency but also gives information on overall error.
A method which can only be controlled by a high frequency of quality control
checks