Bland-Altman Analysis
Review Article
ARTICLE INFO

Keywords:
Bland-Altman analysis
Limits of agreement
Correlation analysis
Biostatistics

ABSTRACT

The rapid increase in the number of new laboratory methods has led to the necessity of reliable verification methods. Validation of a new measurement method for application to medical practice requires comparison with gold standard techniques. The Bland-Altman analysis is a frequently applied technique in studies that investigate the agreement between two methods of the same medical measurement. In this review, potential areas of usage of Bland-Altman analysis are elaborated from a clinical viewpoint, and possible pitfalls in study designs are discussed from a statistical perspective.
https://doi.org/10.1016/j.tjem.2018.09.001
Received 1 September 2018; Received in revised form 10 September 2018; Accepted 11 September 2018
Available online 17 September 2018
2452-2473/© 2018 Emergency Medicine Association of Turkey. Production and hosting by Elsevier B.V. on behalf of the Owner. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/).
N.Ö. Doğan Turkish Journal of Emergency Medicine 18 (2018) 139–141
Table 1
Dataset for potassium levels in venous blood gases and blood electrolyte work-up.
Potassium level (mEq/L) (obtained from venous blood gas analysis) | Potassium level (mEq/L) (obtained from blood electrolyte levels) | Mean potassium level (mEq/L) | Difference between potassium levels (mEq/L)
good agreement between the two methods.4 Moreover, data which seem to be in poor agreement can produce quite high correlations.

3. Analysis of the differences between variables

Bland and Altman quantified the difference between measurements using a graphical method. They drew a scatterplot in which the X-axis represented the average [(K1 + K2)/2], and the Y-axis represented the difference (K1 – K2) of two measurements. After the graph is drawn, the mean bias (mean of K1 – K2) and its limits of agreement (LOA) should be quantified. Using statistical software, a one-sample t-test can be performed to calculate the mean bias and its SD. To represent the mean bias and limits of agreement, we need only the mean of the differences between the measurement methods and its standard deviation, obtained from the one-sample t-test. Secondly, the data points can be bounded using ±2 standard deviations (SD) to show the interval expected to contain 95% of the differences (precisely defined: mean ± 1.96 standard deviations) for normally distributed data. An ideal agreement is zero difference between measurements. Thus, the average difference and its limits are found near zero in this setting.

For our dataset, the mean difference (mean bias) was 0.012 with an SD of 0.260. A scatterplot should be drawn to understand the dispersion of the variables, using the X-axis (average) and Y-axis (difference). The LOA can be drawn manually if the statistical software does not display them automatically. In our dataset, the upper limit can be calculated using mean + 1.96 × SD (0.012 + 1.96 × 0.260 = 0.522) and the lower limit using mean – 1.96 × SD (0.012 – 1.96 × 0.260 = –0.498). An appropriate statement for the manuscript would be the following: The Bland-Altman plot showed the mean bias ± SD between the first and second potassium levels as 0.012 ± 0.260 mEq/L, and the limits of agreement were −0.498 and 0.522 (Fig. 1).

The scatterplot can be evaluated according to the dispersion of the points. With good agreement, the scatter is small, and the points lie relatively close to the line representing the mean bias. As quantifiable measures, the mean bias and limits of agreement give information about the utility of the new measurement method. Regarding our dataset, the two methods can be used interchangeably, as the limits of agreement span approximately 1 mEq/L of potassium.

4. Clinical implication and potential areas of usage

Only a clinician who uses the test results in a clinical setting can decide whether the mean bias and LOA are acceptable. For instance, a mean bias of 0.2 mEq/L is obviously acceptable for potassium levels. However, 3 mEq/L is too broad and can lead to lethal complications if the actual potassium value is higher in the biochemistry panel.

Bland-Altman analysis has been used in many method comparisons in the literature. It may be used to compare two new measurement methods or one measurement method against a reference standard. The measurement variables should be continuous (not categorical), such as hemoglobin level (g/dl), anti-HCV antibody titer or the size of a tumor (cm). The Bland-Altman method is a popular approach, and reports include, but are not limited to, comparisons of two hemodynamic measurements,6 end-tidal carbon dioxide measurement methods,7,8 different electrolyte level measurement methods,9 self-assessed general well-being scores,10 and the performance of different computed tomography technologies in evaluating pulmonary nodules.11

5. Pitfalls in Bland-Altman analysis

One of the critical problems in Bland-Altman analysis is the need to meet the assumption of normal distribution. The continuous measurement variables need not be normally distributed, but their differences should be. If the assumption of normal distribution is not met, the data may be logarithmically transformed.4 The data may be tested against the normal distribution using classical methods such as the Shapiro-Wilk test or Kolmogorov-Smirnov test. Visual evaluation of the histogram plot may not be adequate.

Another problem arises from the sample size. Studies comparing methods of measurement should be adequately sized to conclude that the effects are universally valid. If the sample size is not adequate, it is possible to find a low mean bias and narrow limits of agreement when comparing two methods.12 Such methods cannot be recommended for general use without verification by the results of other studies. To calculate the sample size, the maximum allowed difference, derived from other studies, should be provided.

Some authors argue that regression analysis can also be performed
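
As a rough check of the arithmetic in Section 3, the following Python sketch recomputes the limits of agreement from the reported mean bias (0.012) and SD (0.260). The function names `limits_of_agreement` and `bland_altman`, and the paired values in the usage lines, are illustrative assumptions and not part of the original analysis; the article's own dataset (Table 1) is not reproduced here.

```python
from statistics import mean, stdev


def limits_of_agreement(mean_bias, sd, z=1.96):
    """Return (lower, upper) limits of agreement: mean bias -/+ z * SD."""
    half_width = z * sd
    return mean_bias - half_width, mean_bias + half_width


def bland_altman(k1, k2, z=1.96):
    """Summarise paired measurements k1 and k2 (equal-length sequences)."""
    if len(k1) != len(k2) or len(k1) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(k1, k2)]        # Y-axis: K1 - K2
    means = [(a + b) / 2 for a, b in zip(k1, k2)]  # X-axis: (K1 + K2) / 2
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences (n - 1 denominator)
    lower, upper = limits_of_agreement(bias, sd, z)
    return {"means": means, "diffs": diffs, "bias": bias,
            "sd": sd, "lower": lower, "upper": upper}


# Reproduce the limits reported in the text from the summary statistics.
lower, upper = limits_of_agreement(0.012, 0.260)
print(f"LOA: {lower:.3f} to {upper:.3f}")  # LOA: -0.498 to 0.522

# Hypothetical paired potassium values (mEq/L), for illustration only.
venous_gas = [4.1, 3.8, 5.0, 4.4]
electrolyte_panel = [4.0, 4.0, 4.9, 4.5]
summary = bland_altman(venous_gas, electrolyte_panel)
print(f"bias = {summary['bias']:.3f}, SD = {summary['sd']:.3f}")
```

The `means` and `diffs` lists returned by `bland_altman` are the X and Y coordinates of the scatterplot described above, so they can be passed to any plotting tool, with horizontal lines drawn at `bias`, `lower`, and `upper`.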