Critical Appraisal of Studies Using Laboratory Animal Models
Preclinical assessment of interventions and a great deal of pathogenic mechanism research is conducted using animal models. Our understanding of the working of biological systems, such as the immune and cardiovascular systems, is often based upon the results of research on animals. Many pharmaceutical interventions in human health are first tested in animal studies for efficacy and safety. As such, animal research is considered critical to the scientific endeavor. However, how well animal studies achieve this goal is a subject of debate (van der Worp et al. 2010).

Although it frequently seems to researchers that publication is an endpoint, it is only an intermediary step in the scientific process (Sargeant and O’Connor 2013). When authors describe results, they often make the inference that the results from the study animals (study population) represent the results expected from the population the study animals came from …

…lated to animal studies in preclinical medicine (Higgins et al. 2011).

Critical appraisal should be differentiated from an assessment of comprehensive reporting. Comprehensive reporting simply assesses whether the study is reported in a manner that includes the important components of a study. There is substantial evidence that reporting of preclinical studies is less than comprehensive or has low reproducibility (Kilkenny et al. 2009; Prinz et al. 2011; Steward et al. 2012). Based on this evidence, there have been calls for improved reporting (Begley and Ellis 2012; Landis et al. 2012; van der Worp and Macleod 2011). Further, there are several guidelines outlining aspects of study design, analysis, and reporting that should be included in any research report; examples include the ARRIVE Guidelines (Kilkenny et al. 2010) and the “Guidance for the Description of Animal Research in Scientific Publications” (National Research Council [US] Institute for Laboratory Animal Research 2011). The target audience for these checklists or guidelines is generally authors, and the aim is to provide guidance on how to present the study. Many, but not all, of the items in these checklists are related to enabling critical appraisal by the end-user, and for this reason these checklists are sometimes used for critical appraisal. This is inappropriate: assessing comprehensive reporting requires an assessment of the presence or absence of an item, …

Annette M. O’Connor, BVSc, MVSc, DVSc, FANZCVS, is a veterinarian and Professor of Epidemiology at the Veterinary Medical Research Institute, College of Veterinary Medicine, Iowa State University, Ames, Iowa. Jan M. Sargeant, DVM, PhD, is a veterinarian and Professor of Epidemiology in the Department of Population Medicine and Director of the Centre for Public Health and Zoonoses at the Ontario Veterinary College, University of Guelph, Guelph, Ontario, Canada.

Address correspondence and reprint requests to Annette M. O’Connor, Veterinary Medical Research Institute, Bld 4, College of Veterinary Medicine, Iowa State University, Ames, IA 50010 or email oconnor@iastate.edu.
Table 1 Sample form that might be used to document the approach to critical appraisal of a laboratory animal study designed to compare outcomes among groups

| Issue | Question asked | Options for assessment | Design approaches that address the issue | Relevant ARRIVE guideline items |
|---|---|---|---|---|
| External validity | “If the study was conducted in a manner that suggests little internal bias, will it be useful for the ‘next step’ because the population is relevant to ‘the next step?’” | No—don’t assess; Yes—continue to assess internal validity | Inclusion criteria for relevant populations, housing, and intervention used | 7, 8, and 9 |
| Internal validity (using risk-of-bias domains from Higgins et al. 2011) | | | | |
| Selection bias | “Are the groups comparable such that an observed difference is likely attributable to the treatment rather than a confounder?” | Yes—low ROB; Unclear—unclear ROB; No—high ROB | Blinded allocation to group, restriction, randomization, restricted randomization | 6, 8, 10, and 11 |
| Performance bias | “Was the approach to husbandry the same for all treatment groups and was caregiving done without knowledge of the treatment group?” | Yes—low ROB; Unclear—unclear ROB; No—high ROB | Blinding of caregivers, use of multiple cages per treatment | 6, 9, and 13 |
| Detection bias | “Was the approach to assessing the outcomes the same in both groups and done without knowledge of the group?” | Yes—low ROB; Unclear—unclear ROB; No—high ROB | Blinding of outcome assessors, use of repeatable and objective outcome measures | 6, 9, and 13 |
| Attrition bias | “Was the loss of animals from the groups minimal and unrelated to the treatment groups?” | Yes—low ROB; Unclear—unclear ROB; No—high ROB | Minimization of loss to follow-up and complete reporting of loss to follow-up for each treatment group | 13 to 15 |
| Reporting bias | “Were the results of all outcome variables assessed reported completely?” | Yes—low ROB; Unclear—unclear ROB; No—high ROB | Comprehensive reporting and a well-designed study protocol | 12 to 17 |
| Random error | | | | |
| Test-level error | “Is there a low probability that chance played a role in the observed difference?” | Yes—low risk of random error in the test; Unclear—unclear risk of random error in the test; No—high risk of random error in the test | The exact p-value and the 95% confidence interval | 10, 13, and 16 |

Note: ROB, risk of bias. (Table continues)
Table 1 (continued)

| Issue | Question asked | Options for assessment | Design approaches that address the issue | Relevant ARRIVE guideline items |
|---|---|---|---|---|
| Study-level error | “Did the authors limit the number of hypothesis tests conducted to those the study was designed (powered) to assess?” | Yes—low risk of random error in the study; Unclear—unclear risk of random error in the study; No—high risk of random error in the study | The power of the study and the number of the tests … to enhance transparency | … |

…was specifically designed—i.e., the basis for the power calculation). For the critical appraiser, if reporting suggests a large number of hypothesis tests, the risk of random error in the study increases. This is the area where reporting bias affects the critical appraiser. If the authors do not report all outcomes assessed, the person appraising the study cannot accurately gauge the role of random error due to multiplicity in the study results. For example, if a researcher tests and publishes 20 outcomes and only one is statistically …
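The multiplicity concern in the 20-outcome example above can be made concrete with a short calculation. Assuming independent tests, each at a nominal significance level of 0.05 (an assumption made here purely for illustration; real outcomes are rarely independent), the chance that at least one outcome appears “significant” when no true effect exists is:

```python
# Probability of at least one false-positive result across many
# hypothesis tests, assuming independent tests at nominal alpha.
# (Illustrative assumption only; correlated outcomes change the numbers.)

def family_wise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

print(round(family_wise_error_rate(1), 2))   # one pre-specified outcome -> 0.05
print(round(family_wise_error_rate(20), 2))  # 20 outcomes, as in the example -> 0.64
```

With 20 unadjusted outcomes, a single “significant” result is therefore unsurprising under chance alone, which is exactly why an appraiser needs to know how many outcomes were assessed, not just how many were reported.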
Table 2 Completed critical appraisal form for a hypothetical study using examples in the text

| Issue | Question asked | Assessment | Justification |
|---|---|---|---|
| … | … | … | …differences. |
| Detection bias | “Was the approach to assessing the outcomes the same in both groups and done without knowledge of the group?” | No—high ROB | Example 3: The outcome assessors are clearly aware of the treatment assignment and the approach to measurement is highly subjective, so these factors could be the cause of the observed differences between groups rather than the treatment. |
| Attrition bias | “Was the loss of animals from the groups minimal and unrelated to the treatment groups?” | No—high ROB | Example 4: Some animals are missing from one group and no explanation is provided. |
| Reporting bias | “Were the results of all outcome variables assessed reported completely?” | Unclear—unclear ROB | Example 5: Several measured outcomes are not reported; this suggests incomplete reporting and may indicate a non-significant finding. |
| Random error | | | |
| Test level | “Is there a low probability that chance played a role in the observed difference?” | Yes—low risk of random error | Example 6: The p-value is very small, suggesting the observed difference is rare under the null hypothesis. |
| Study level | “Did the authors limit the number of hypothesis tests conducted to those the study was designed (powered) to assess?” | No—high risk of random error | Example 8: The authors appear to have conducted 9 × 7 = 63 hypothesis tests without adjustment for multiplicity, and only found one significant outcome. It is unclear if this is a primary (important) outcome. |
| Conclusion | | Low internal validity | |
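The judgment in Example 8 can be checked with the same chance-alone logic. If 63 tests are each run at a nominal alpha of 0.05 (independence is again assumed here only for illustration, not stated in the hypothetical study), the number of false positives expected by chance and the probability of seeing at least one are:

```python
# Rough check of Example 8: 9 x 7 = 63 unadjusted hypothesis tests.
# Independent tests at alpha = 0.05 are assumed only for illustration.
alpha = 0.05
n_tests = 9 * 7  # 63 tests, as in the hypothetical study

expected_false_positives = n_tests * alpha   # mean count of chance "hits"
p_at_least_one = 1 - (1 - alpha) ** n_tests  # chance of >= 1 false positive

print(f"{expected_false_positives:.2f}")  # 3.15
print(f"{p_at_least_one:.2f}")            # 0.96
```

Roughly three significant results would be expected by chance alone, so finding exactly one among 63 unadjusted tests is entirely consistent with random error; this is why the form assigns a high risk of random error at the study level.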