Sharma 2021 Clinical Significance
Key words: Bias; biostatistics; clinical significance; research design; sample size; statistical significance
DOI: 10.4103/sja.sja_158_21
How to cite this article: Sharma H. Statistical significance or clinical significance? A researcher’s dilemma for appropriate interpretation of research results. Saudi J Anaesth 2021;15:431-4.
Hunny Sharma
Department of Community and Family Medicine, All India Institute of Medical Sciences, Raipur, Chhattisgarh, India
Address for correspondence: Dr. Hunny Sharma, MD 264, Phase 4, Near AIIMS Residential Complex, Kabir Nagar, Raipur ‑ 492 099,
Chhattisgarh, India. E‑mail: smilerecoverydc@gmail.com
tests to assess whether new therapies or treatment protocols are better in clinical practice than the usual approaches or methods. Researchers should understand the importance of both statistical and clinical significance.[3]

From a clinical point of view, a statistically significant difference among groups is not of prime importance. If a well‑conducted study shows a difference between treatment options in two groups, what matters most is whether that difference is clinically important.[4] Since sample size and measurement variability can easily influence the statistical results, a nonsignificant outcome does not imply that the new therapy or treatment protocol is not clinically useful.[5,6]

Hence, this review aims to impart knowledge about the “P” value and its importance in biostatistics, and highlights the difference between statistical and clinical significance for appropriate interpretation of research results.

What does P value infer?
In simpler terms, the P value tests all hypotheses about how the data were produced (the whole model), not just the targeted hypothesis that it is intended to test (such as a null hypothesis).[7]

The P value is the probability that, if every model assumption, including the test hypothesis, were correct, the chosen test statistic would have been at least as large as its observed value.[7]

The most common threshold value for “P” in biomedical literature is 0.05 (or 5%), and most often the P value is reduced to a dichotomy in which results are considered “statistically significant” when P falls on or below a cut‑off (usually 0.05) and are otherwise declared “nonsignificant”.[7]

Why are “P” values not enough?
According to Ron Wasserstein, ASA’s executive director, the P value was never meant to substitute for scientific reasoning, which is of greater interest. The P value, a number that ranges from zero to one and is judged against a threshold value, indicates how likely a difference between the groups at least as large as the one observed would be if chance alone were at work. A well‑reasoned and scientifically driven explanation will always remain the basis of reporting scientific outcomes.[8]

On what factors does the “P” value depend?
It should be borne in mind that the “P” value only represents the extent to which the data are inconsistent or incompatible with a given statistical model (i.e., the null hypothesis). Hence, it only addresses whether the null hypothesis is rejected or not, rather than focusing on the research hypothesis. From a statistical point of view, it measures the strength of evidence against the null hypothesis.[9]

With the advancement in biostatistics, it is now clear that the “P” value can easily be affected by various factors such as sample size, the magnitude of the relationship, and error. Each of these factors can work independently or in combination to distort study findings based on “P” values.[10]

(1) Effect of error on “P” values
In general, two types of error, systematic and random, affect the “P” value.

“Systematic errors,” that is, “non‑random errors” of significant magnitude, distort the research results in a specific direction or can alter the observed association in either direction. This type of error generally occurs when a single examiner takes the measurements, leading to an unintended bias that shifts the results toward his/her expectations, or it may result from modification of the measuring technique. Hence, systematic error is a consistent flaw in the measurement of a variable due to methodological error, leading to underestimation or overestimation of measurements. The extent of systematic error can be determined by re‑examining and re‑measuring a sufficient number of individuals (e.g., 20%, not always applicable) with the same material and method and assessing the agreement between the two sets of measurements. Statistical methods such as the paired t‑test, the intraclass correlation, and the Bland‑Altman method can also help in the determination of systematic errors.[10‑12]

A “random error” is defined as variability in the data that cannot be explained. Random errors of high magnitude mean poor reproducibility of the measurements, which may call the methodology and the examiners’ ability into question. This error occurs randomly across the population, ultimately distorting the results. Random errors can be minimized by taking a large number of samples or measurements. Let us understand this with the example of measuring the Mid‑Upper Arm Circumference (MUAC) of a population. While measuring the MUAC of each individual in the population, random error may show up as measured values that are smaller or larger than the actual MUAC. This may be a result of how the tape was held while taking the measurement, at what position it was measured (ideally midway between the olecranon process and the acromion), and which researcher took the measurement. Random error can be reduced by incorporating a large number of samples or measurements, that is, the more study
432 Saudi Journal of Anesthesia / Volume 15 / Issue 4 / October-December 2021
Sharma: Understanding clinical and statistical significance
participants are included in these measurements, the smaller the effect of random error will become.[10]

(2) Effect of sample size on “P” values
It is well known that the P value depends on the sample size to a large extent. The larger the sample size, the smaller the variability of the measurement or data and the more precise the measurement for the study population. With an increase in sample size, the magnitude of random error decreases and the study is more likely to find a significant relationship if it exists.[10]

(3) Effect of magnitude of relationship between groups on “P” values
The P value also relies on the magnitude of the difference or relationship between the groups compared. In simpler terms, if the magnitude of the difference between the two groups is more substantial, it will be easier to detect and will have a smaller P value.[10]

What are the American Statistical Association (ASA) principal statements on statistical significance and P values?
On 8th March 2016, in view of the growing concern about misuse and misinterpretation of P values, the ASA issued six principal statements to improve the conduct and interpretation of quantitative research and to increase research reproducibility. The six principal statements regarding significance and the P value are as follows:
1. The P value shows the extent of incompatibility of the data with the stated statistical model.[8]
2. The P value is neither a measure of the probability that the studied hypothesis is true nor a representation of the probability that the study data were produced by random chance alone.[8]
3. It is extremely important to note that any business model, policy decision, or conclusion related to any scientific study or experiment should not be based only on whether the P value passes a specific threshold.[8]
4. It is the moral duty of authors and researchers to report the research or experimental findings in full and with transparency.[8]
5. A P value represents neither the importance of the research results nor the effect size of the study.[8]
6. The P value by itself does not give a sufficient measure of evidence regarding a model or “hypothesis”.[8]

What are clinically significant outcomes?
The term “clinically significant” can be used for research in which clinically relevant results or outcomes are used to assess the effectiveness or efficacy of a treatment modality. “Clinically significant” findings are those that improve the patient’s quality of life and make him/her feel and function well.[13]

Clinically significant findings are those which improve medical care, resulting in improvement of the individual’s physical function, mental status, and ability to engage in social life. Improvement of quality of life in medical care has both objective and subjective components. Objective improvement covers performance status, duration of remission of disease, and prolongation of life‑span, while subjective improvement in quality of life covers improved mood, attitude, physical and social activity, a feeling of general wellbeing, and the alleviation of distressing symptoms like pain, weakness, and discomfort.

Statistically significant results do not necessarily mean that the results are clinically relevant or lead to improvement in the quality of life of the individuals. Many outcomes can be statistically significant but not relevant from a clinical point of view. Hence, clinicians and researchers should give importance to both statistical and clinical significance.[13]

A clinically relevant intervention is one whose benefits outweigh the associated costs, harms, and inconveniences for the individuals for whom it is intended. The main difference between statistical and clinical significance is that clinical significance concerns whether the difference between the two groups or the two treatment modalities matters in practice, while statistical significance only indicates whether the analysis found a mathematically significant result.

Different studies can have similar statistical significance but differ greatly in clinical significance. Let’s consider an example of two different chemotherapy agents for cancer. The first study estimates that Drug A (less expensive than the usual chemotherapeutic agents) increases the survival of treated patients by five years (P = 0.01, alpha = 0.05). Similarly, a second study estimates that Drug B (more expensive than the usual chemotherapeutic agents) increases the survival of treated patients by a mere five months (P = 0.01, alpha = 0.05). In both cases, the statistical test is significant, but Drug B increases survival by only five months, which is not clinically significant as compared to Drug A, which increases survival by five years, nor is it useful in terms of cost‑effectiveness and superiority when compared to already available chemotherapeutic agents.[14,15]
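
The dependence of the P value on sample size and on the magnitude of the difference, described in points (2) and (3), can be illustrated with a short calculation. The following is a minimal sketch and is not taken from the article: it assumes a two‑sided z‑test on the difference between two group means, with a known common standard deviation and equal group sizes. The function names and all numbers are hypothetical.

```python
import math


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided P value for a difference of two group means (z-test).

    Assumes a known common standard deviation `sd` and equal group sizes,
    a simplification chosen for illustration only.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def standardized_difference(mean_diff, sd):
    """Effect size as the mean difference in SD units (independent of n)."""
    return mean_diff / sd


# Hypothetical numbers: the same 2-unit difference (SD 10), growing sample size.
for n in (20, 200, 2000):
    p = two_sided_p(2.0, 10.0, n)
    d = standardized_difference(2.0, 10.0)
    print(f"n per group = {n:5d}  effect size = {d:.2f}  P = {p:.3g}")

# Hypothetical numbers: the same sample size (50 per group), larger difference.
for diff in (2.0, 8.0):
    print(f"difference = {diff:.0f}  P = {two_sided_p(diff, 10.0, 50):.3g}")
```

In this sketch the effect size stays the same while P shrinks as the sample grows, so a small P value can accompany a difference that may still be too small to matter clinically. This is why the effect size should be reported alongside the P value.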
Conclusion
From the above description of statistically significant and clinically significant results, it is clear that both notions have an importance of their own. Statistically significant results may not be of clinical importance and, vice versa, results which are of clinical importance may not be statistically significant. It is high time that researchers, journal editors, and readers took a keen interest in looking beyond the threshold “P” value and also considered the results from a clinical point of view, rather than assessing the worth of research by the “P” values alone. All researchers should also take into account the design, sample size, effect size, bias incorporated, and reproducibility of the study while analyzing the study results. An aware researcher with a logical and critical mind is in the best position to evaluate research results and apply them to the practice of evidence‑based medicine. Logically, discussion of clinically significant research results will increase discussion and understanding of new treatment modalities and will help in the upliftment of evidence‑based practice.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

References
1. Arnold LD, Braganza M, Salih R, Colditz GA. Statistical trends in the Journal of the American Medical Association and implications for training across the continuum of medical education. PLoS One 2013;8:e77301.
2. Grabowski B. “P<0.05” might not mean what you think: American Statistical Association clarifies P values. J Natl Cancer Inst 2016;108:djw194.
3. Page P. Beyond statistical significance: Clinical interpretation of rehabilitation research literature. Int J Sports Phys Ther 2014;9:726‑36.
4. Bhandari M, Joensson A. Part ΠC: Understanding Treatment Effects. Clinical Research for Surgeons. Thieme Electronic Book Library. 333, Seventh Avenue, New York, NY 10001, USA: Thieme Publisher; 2009. p. 139‑44.
5. Sullivan GM, Feinn R. Using effect size‑or why the P value is not enough. J Grad Med Educ 2012;4:279‑82.
6. Batterham AM, Hopkins WG. Making meaningful inferences about magnitudes. Int J Sports Physiol Perform 2006;1:50‑7.
7. Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. Eur J Epidemiol 2016;31:337‑50.
8. “P‑Values under Question.” American Psychological Association. Available from: https://www.apa.org/science/about/psa/2016/03/p-values. [Last accessed: 30 December 2020].
9. Nahm FS. What the P values really tell us. Korean J Pain 2017;30:241‑2.
10. Thiese MS, Ronna B, Ott U. P value interpretations and considerations. J Thorac Dis 2016;8:E928‑31.
11. Houston WJ. The analysis of errors in orthodontic measurements. Am J Orthod 1983;83:382‑90.
12. Cançado RH, Lauris JR. Error of the method: What is it for? Dental Press J Orthod 2014;19:25‑6.
13. Armijo‑Olivo S. The importance of determining the clinical significance of research results in physical therapy clinical research. Braz J Phys Ther 2018;22:175‑6.
14. Tenny S, Abdelgawad I. Statistical Significance. [Updated 2019 May 13]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2019. Available from: https://www.ncbi.nlm.nih.gov/books/NBK459346/.
15. Man‑Son‑Hing M, Laupacis A, O’Rourke K, Molnar FJ, Mahon J, Chan KB, et al. Determination of the clinical importance of study results. J Gen Intern Med 2002;17:469‑76.