001
Institute of Applied Chemistry and Pharmaceutical Analysis, Faculty of Pharmacy, University of “Ss Cyril and Methodius”,
Majka Tereza 47, 1000 Skopje, Republic of Macedonia
Abstract
The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using non-parametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences between the errors (presented through the relative residuals, RR) were obtained. The significance of the differences in the RR was estimated using the Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
simplest model that adequately describes the concentration-response relationship should be used” (EMA 2011, FDA 2001), the question is how to select the adequate regression model. Statistical criteria such as the coefficient of determination (R²) are only informative and not relevant, since different regression models give similar values for R² and, in addition, acceptable R² values may be obtained despite impaired accuracy (Hartmann et al., 1998; Castillo & Castells, 2001; De Souza & Junqueira, 2005; Stockl et al., 2009). The approach based on comparison of the sum of relative residuals (SRR) obtained for each regression model is reported by several authors (Lang & Bolton, 1991; Wieling et al., 1996; Almeida et al., 2002). According to this approach, the regression model with the lowest SRR should be chosen as adequate. Singtoroj et al. (2006) proposed a strategy based on a non-parametric test of ranks for evaluation of regression models in terms of the SRR obtained from the calibration curve fit and the calibration curve predictability.

An issue that arises is whether these approaches enable selection of the simplest regression model, especially in cases when only slight differences between the SRR are obtained. The objective of our work is to present a statistical approach for selection of an adequate regression model during bioanalytical method validation. The proposed approach is based on two non-parametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples.

Experimental

Sample preparation and chromatographic conditions

The data that were statistically evaluated were derived from the validation of the HPLC-MS/MS method for ibuprofen (IBP) quantification in human plasma. Briefly, the calibration standards (CS) and quality control (QC) samples were prepared by spiking 50 µl of working standard solutions of rac-IBP into 950 µl of blank human plasma. The plasma samples, containing ketoprofen as an internal standard (IS), were extracted using an LLE procedure. The analysis was conducted on a TSQ Quantum Discovery Max triple quadrupole mass spectrometer (Thermo Scientific, USA). Chromatographic separation was achieved on a Lux Cellulose 3 chromatographic column (Phenomenex, 250 x 4.6 mm) using 0.1 % (v/v) acetic acid in a mixture of methanol/water (90:10, v/v) as a mobile phase. Analyses were conducted at a flow rate of 0.6 mL/min and the injection volume was 10 µl. The mass spectrometer was operated in the selected reaction monitoring (SRM) mode using negative electrospray ionization. Ibuprofen and ketoprofen were quantified using the transitions m/z 205.1→161.3 and m/z 253.2→209.2, respectively.

Data analysis

A six-point calibration curve was constructed from the peak area ratio (peak area IBP / peak area IS) vs. concentration in a range from 0.1 - 50 mg/L, over three days. QC samples at concentrations of 0.1 mg/L (LLOQ), 0.2 mg/L (LQC), 20 mg/L (MQC) and 40 mg/L (HQC), in five replicates for each QC level, were analyzed along with the calibration curves. Five linear and five quadratic regression models were evaluated during the selection of the adequate curve fit.

The CS and QC data were fitted on OLS, weighted linear (1/x, 1/x², 1/y and 1/y²) and on quadratic models (non-weighted and weighted) using the calibration curve options in the MS data system (Xcalibur™ Data system, Thermo Scientific, USA). The relative residuals (RR) were calculated based on the back-calculated concentration obtained from each regression model and the nominal concentration (Equation 1).

Eq. 1

The SRR was computed from the average RR (%) obtained from each calibration level and every QC level (Equation 2).

Eq. 2

The test of homoscedasticity was carried out using the F-test and the obtained experimental value was compared with the tabled one, for a level of significance of 0.05.

The non-parametric statistical tests (One sample rank test and Wilcoxon signed rank test for two independent groups of samples) were calculated using Microsoft Excel (Microsoft Corporation).

Results and discussion

The range of the calibration curve for bioanalytical methods should be established to allow adequate description of the pharmacokinetics of the analyte of interest (EMA 2011). Therefore, before defining the concentration range, it is essential to have information about the expected concentrations of the analyte in the biological matrix. Considering that different dosage strengths (200 mg, 400 mg, 800 mg etc.) of IBP are available on the market and the information obtained from the literature data (Canaparo et al., 2000; Bonato et al., 2003; Szeitz et al., 2010), a 500-fold concentration range was required (0.1 - 50 mg/L).

The data of CS and QC samples obtained during the validation process were fitted to the OLS (linear non-weighted) regression model, as this model is the starting point for the selection of the adequate calibration curve. Given the wide concentration range, heteroscedasticity of the data was expected. In order to confirm this expectation, a test for homoscedasticity was carried out. The F-test revealed that the variance is not evenly distributed over the
whole concentration range, since the experimental F-value (F = 75.4) was significantly higher than the tabled one (Ftab = 9.28). The heteroscedasticity of the data was further confirmed from the RR of the CS samples obtained from the three-day calibration curves (Fig. 1). The RR obtained from the lowest concentration was 930% compared with 5% RR for the highest concentration. The RR data clearly showed that fitting the linear non-weighted model generates inaccurate results, especially in the lower concentration range, indicating the need for regression models different from the linear non-weighted one. Therefore four linear WLS and five quadratic models (one non-weighted and four weighted) were further constructed on the same data set.

The calculated regression models were evaluated according to the traditional approach based on the SRR (Almeida et al., 2002), with a slight modification of the calculation of the SRR. In our investigation the SRR calculation was based on errors obtained from the CS and the QC samples, whereas in the literature data the SRR was based on the errors obtained just from the CS. The SRR generated from each regression model is presented in Table 1. The SRR obtained using quadratic models (non-weighted and weighted) were lower than the SRR obtained from the linear models. In addition, all quadratic models generate a similar percent of error (presented through SRR), which is not the case for linear models, where large differences between the errors were observed. The OLS and weighted 1/x and 1/y linear regression models generate large error, whereas the 1/x² and 1/y² weighted linear models gave similar error to the corresponding quadratic weighted models. Considering that quadratic models are more complex than weighted linear ones, additional investigation should be carried out before selecting a quadratic model as an adequate regression model.

Statistical tests were applied for evaluation of the significance of the difference between the errors obtained with quadratic and linear 1/x² and 1/y² models. The use of a statistical approach for the selection of the adequate regression model is important, since the selection of a more complex regression model should be justified. The proposed statistical approach was based on non-parametric tests because the obtained data were independent, their distribution was not Gaussian and generally these statistical tests are less affected by outlying values.

Reg. model                  0.1 mg/L       0.635 mg/L     6.35 mg/L      12.5 mg/L      25.4 mg/L      50.8 mg/L      Rank sum
                            RR (%)  Rank   RR (%)  Rank   RR (%)  Rank   RR (%)  Rank   RR (%)  Rank   RR (%)  Rank
Linear non-weighted (OLS)   922.55  10     147.40  10     8.52    6      9.22    8      7.64    10     2.59    6      50
Linear 1/x                  24.51   8      11.65   9      17.99   9      10.78   9      5.41    8      7.72    8      51
Linear 1/x²                 1.31    5      7.30    6      9.73    7      2.14    6      2.77    7      7.90    9      40
Linear 1/y                  32.35   9      10.50   8      18.83   10     11.43   10     6.08    9      6.87    7      53
Linear 1/y²                 2.29    6      9.61    7      11.04   8      3.85    7      1.31    6      12.69   10     44
Quadratic non-weighted      15.13   7      6.30    5      2.87    5      1.55    3      0.30    1      0.08    1      22
Quad. 1/x                   0.98    3.5    0.89    4      2.71    4      1.33    1      1.14    4      0.31    3      19.5
Quad. 1/x²                  0.65    1.5    0.26    1      1.93    1      1.90    5      0.36    2      0.83    5      15.5
Quad. 1/y                   0.98    3.5    0.58    3      2.56    3      1.37    2      0.93    3      0.22    2      16.5
Quad. 1/y²                  0.65    1.5    0.37    2      2.05    2      1.81    4      1.22    5      0.59    4      18.5

Table 3. One sample rank test - Ranking of the regression models based on the accuracy and precision of the QC samples
Quad. 1/x²  17.81   6.5   0.00    1.5   -8.33   6.5   1.91    3     4.46    3     3.65    7     8.14    7.5   2.84    9     44
Quad. 1/y   17.50   4.5   0.54    4     -6.29   4     1.91    3     4.51    5     3.65    7     8.15    9     2.83    7     43.5
Quad. 1/y²  17.50   4.5   0.54    4     -8.96   8     2.95    9     4.54    6     3.65    7     8.17    10    2.83    7     55.5

Table 4. One sample rank test - Final ranking of the regression models

A non-parametric One sample rank test based on ranking of the regression models in terms of the SRR obtained from the calibration curve fit and the calibration curve predictability was reported by Singtoroj et al. (2006). The SRR of the CS obtained from each calibration level over three days were used for the assessment of the calibration curve fit, and the calibration curve predictability was evaluated through the accuracy and precision obtained from four independent QC levels. Afterwards, the investigated regression models were ranked according to the SRR obtained from the CS (Table 2) and QC samples (Table 3). The final rank was obtained as a sum of the ranks of the calibration curve fit and their predictability (Table 4). The OLS, 1/x and 1/y weighted linear regression models did not meet the acceptance criteria of ±15 % and ±20 % for LLOQ (Table 2 and Table 3) given by the EMA guide for bioanalytical
method validation (EMA 2011) and consequently were not included in the further evaluation. According to this statistical test, the selection of the regression model is based on the final rank, so the quadratic 1/x² model, ranked 1, should be selected as adequate (Table 4).

The data shown in Table 2 and Table 3 indicated that the weighted 1/x² and 1/y² linear models also fulfill the EMA criteria of ±15 % and ±20 % and gave similar SRR compared to the quadratic models, using the traditional approach. However, these weighted linear models had final ranks 5 and 7, respectively, while the corresponding quadratic models (1/x² and 1/y²) were ranked 1 and 6, respectively (Table 4). This evaluation indicated that One sample rank test could not give justification for the selection of the weighted quadratic over the weighted linear regression models. In order to evaluate the significance of the differences between the errors obtained using weighted linear and quadratic regression models, the non-parametric Wilcoxon signed rank test for two independent groups of samples was applied. The Wilcoxon signed rank test was performed using data for RR obtained by the linear 1/x² and 1/y² models as the first group and the quadratic 1/x² and 1/y² models as the second group of samples. This test analyses not just the direction of the differences between the RR, but also takes into account the magnitude of the observed differences. The data evaluated in the two matched groups were the average RR obtained from each calibration level during three days and the average RR obtained from the replicates of four QC sample levels. Afterwards the differences between the RR were computed and ranked, and depending on the observed difference a sign (“+” or “-”) was attached to each rank (Table 5). The null hypothesis (H0) was that there is no difference between the RR generated with linear and quadratic models, versus the alternative hypothesis (H1) that there is a difference between the RR obtained with the investigated regression models. The test statistic for the Wilcoxon signed rank test is W, defined as the smaller of W+ (sum of positive ranks) and W- (sum of negative ranks). W+ was found to be 130 and W- was 80. The critical value of W for n = 20 with α = 0.05 is 52. The decision rule is to reject H0 if W ≤ 52. Given that 80 > 52, the null hypothesis could not be rejected.

The results derived using the Wilcoxon signed rank test showed that although quadratic models generated smaller error than the weighted 1/x² or 1/y² linear models, the differences between the errors were not statistically significant. Considering the request of the regulatory agencies that the simplest model that adequately describes the data should be chosen, the proposed statistical approach gave justification for the selection of the linear 1/x² model as an adequate regression model for the IBP calibration curve.

Conclusion

The present investigation showed that during the selection of a regression model for bioanalytical assays with a broad concentration range, a statistical approach based on non-parametric tests should be used. The results obtained within this paper showed that One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences between the SRR were obtained. The proposed statistical approach, based on the Wilcoxon signed rank test wherein linear and quadratic regression models were evaluated as two separate groups, allowed estimation of whether the differences in errors are statistically significant. The application of this simple non-parametric statistical test provides statistical justification of the choice of an adequate regression model.

References

Almeida, A.M., Castel-Branco, M.M., Falco, A.C., 2002. Linear regression for calibration lines revisited: weighting schemes for bioanalytical methods. J. Chromatogr. B 774, 215-222.
Bonato, P.S., Perpetua, M., Del Lama, F.M., de Carvalho, R., 2003. Enantioselective determination of ibuprofen in plasma by high-performance liquid chromatography-electrospray mass spectrometry. J. Chromatogr. B 796, 413-420.
Canaparo, R., Muntoni, E., Zara, G.P., Della Pepa, C., Berno, E., Costa, M., Eandi, M., 2000. Determination of ibuprofen in human plasma by high-performance liquid chromatography: validation and application in pharmacokinetic study. Biomed. Chromatogr. 14, 219-226.
Castillo, M.A., Castells, R.C., 2001. Initial evaluation of quantitative performance of chromatographic methods using replicates at multiple concentration. J. Chromatogr. A 921, 121-133.
De Souza, S.V.C., Junqueira, R.G., 2005. A procedure to assess linearity by ordinary least squares method. Anal. Chim. Acta 552, 25-35.
European Medicines Agency, 2011. Guideline on validation of bioanalytical methods. Committee for Medicinal Products for Human Use (CHMP). http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2011/08/WC500109686.pdf.
Guidance for Industry, Bioanalytical Method Validation (Center for Drug Evaluation and Research, U.S. Food and Drug Administration, U.S. Department of Health and Human Services, Rockville, Maryland, May 2001).
Hartmann, C., Smeyers-Verbeke, J., Massart, D.L., McDowall, R.D., 1998. Validation of bioanalytical chromatographic methods. J. Pharm. Biomed. Anal. 17, 193-218.
Hubert, Ph., Nguyen-Huu, J., Boulanger, B., Chapuzet, E., Cohen, N., Compagnon, A., Dewe, W., Feinberg, M., Laurentie, M., Mercier, N., Muzard, G., Valat, L., Rozet, E., 2007. Harmonization of strategies for the validation of quantitative analytical procedure. A SFSTP proposal. Part III. J. Pharm. Biomed. Anal. 45, 82-96.
Kimanani, E.K., 1998. Bioanalytical calibration curves: proposal for statistical criteria. J. Pharm. Biomed. Anal. 16, 1117-1124.
Lang, J.R., Bolton, S., 1991. A comprehensive method validation strategy for bioanalytical applications in the pharmaceutical industry - 1. Experimental consideration. J. Pharm. Biomed. Anal. 9, 357-361.
Peters, F.T., Maurer, H.H., 2007. Systematic comparison of bias and precision data obtained with multiple-point and one-point calibration in six validated multianalyte assays for quantification of drugs in human plasma. Anal. Chem. 79, 4967-4976.
Rozet, E., Marini, R.D., Ziemons, E., Boulanger, B., Hubert, Ph., 2011. Advances in validation, risk and uncertainty assessment of bioanalytical methods. J. Pharm. Biomed. Anal. 55 (4), 848-858.
Singtoroj, T., Tarning, J., Annerberg, A., Ashton, M., Bergqvist, Y., White, N.J., Lindegardh, N., Day, N.P.J., 2006. A new approach to evaluate regression models during validation of bioanalytical assays. J. Pharm. Biomed. Anal. 41, 219-227.
Szeitz, A., Edginton, A.N., Peng, H.T., Cheung, B., Riggs, K.W., 2010. A validated enantioselective assay for the determination of ibuprofen in human plasma using ultra performance liquid chromatography tandem mass spectrometry (UPLC-MS/MS). AJAC 2, 47-58.
Stockl, D., D’Hondt, H., Thienpont, L.M., 2009. Method validation across the disciplines-critical investigation of major validation criteria and associated experimental protocols. J. Chromatogr. B 877, 2180-2190.
Tellinghuisen, J., 2008. Weighted least squares in calibration: the problem with using “quality coefficients” to select weighting formulas. J. Chromatogr. B 872, 162-166.
Wieling, J., Hendriks, G., Tamminga, W.J., Hempenius, J., Mensink, C.K., Oosterhuis, B., Jonkman, J.H., 1996. Rational experimental design for bioanalytical methods validation. Illustration using an assay method for total captopril in plasma. J. Chromatogr. A 730 (1-2), 381-394.
Summary
Key words: regression model, bioanalytical method, non-parametric statistical tests, validation.
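
The relative-residual bookkeeping and the Wilcoxon signed rank decision rule described in Results and discussion can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the Excel calculation used in the study. The RR is assumed to be the percent deviation of the back-calculated from the nominal concentration, the SRR is assumed to be the sum of absolute RR values, and all function and variable names are hypothetical. The example data are only the six CS-level RR values reported for the linear 1/x² and quadratic 1/x² models, so the resulting statistic (n = 6) illustrates the mechanics and does not reproduce the reported n = 20 analysis (W+ = 130, W- = 80).

```python
"""Sketch: relative residuals, SRR and the Wilcoxon signed rank statistic."""

# Average RR (%) per CS level, copied from the CS ranking table in the text.
LINEAR_1X2_RR = [1.31, 7.30, 9.73, 2.14, 2.77, 7.90]
QUAD_1X2_RR = [0.65, 0.26, 1.93, 1.90, 0.36, 0.83]


def relative_residual(back_calculated, nominal):
    """Percent deviation from nominal (assumed definition of RR)."""
    return 100.0 * (back_calculated - nominal) / nominal


def sum_relative_residuals(rr_values):
    """SRR taken as the sum of absolute RR values (an assumption)."""
    return sum(abs(rr) for rr in rr_values)


def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(group_a, group_b):
    """Return (W+, W-, n) for paired values; zero differences are dropped."""
    diffs = [a - b for a, b in zip(group_a, group_b) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


def reject_null(w_plus, w_minus, critical_value):
    """Decision rule from the text: reject H0 if min(W+, W-) <= critical value."""
    return min(w_plus, w_minus) <= critical_value


if __name__ == "__main__":
    print("SRR linear 1/x2:", sum_relative_residuals(LINEAR_1X2_RR))
    print("SRR quad 1/x2:", sum_relative_residuals(QUAD_1X2_RR))
    print("W+, W-, n:", wilcoxon_signed_rank(LINEAR_1X2_RR, QUAD_1X2_RR))
    # Values reported in the text: W+ = 130, W- = 80, critical value 52 (n = 20).
    print("Reject H0 for reported values:", reject_null(130, 80, 52))
```

The critical value is passed in explicitly because it depends on n and α and is normally read from a table. With the values reported in the text, `reject_null(130, 80, 52)` returns `False`, which matches the conclusion that the difference between the linear and quadratic models is not statistically significant.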