How To Calculate Sample Size For Observational and Experimental Nursing Research Studies?
All content following this page was uploaded by Suresh K Sharma on 18 February 2020.
REVIEW ARTICLE
How to calculate sample size for observational and experimental nursing
research studies?
Suresh Kumar Sharma, Shiv Kumar Mudgal, Kalpana Thakur, Rakhi Gaur
Department of Nursing, College of Nursing, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
Correspondence to: Shiv Kumar Mudgal, E-mail: peehupari05@gmail.com
ABSTRACT
Sample size estimation is one of the most crucial parts of the research process because an adequate sample helps to produce
reliable results and improves the generalizability of study findings. To use sample size calculation formulas efficiently, a
researcher must understand the significance level, the study's power, the effect size, the margin of error, the underlying
event rate in the population, and the design effect. There are different sample size formulas for the different types
of variables measured in distinct study designs, namely descriptive, epidemiological, comparative, and interventional
research studies, which are covered in this article. The review authors searched online and grey literature related to sample
size and read extensively. Two authors extracted and compiled the information related to the topic.
KEY WORDS: Sample Size Calculation; Observational Studies; Experimental Studies; Case–Control and Cohort Studies
Website: www.njppp.com
A researcher needs to have the following information in hand for estimating the sample size for a particular study:[8-10]
National Journal of Physiology, Pharmacy and Pharmacology Online 2020. © 2020 Shiv Kumar Mudgal, et al. This is an Open Access article distributed under the terms of the Creative
Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), allowing third parties to copy and redistribute the material in any medium or format and to
remix, transform, and build upon the material for any purpose, even commercially, provided the original work is properly cited and states its license.
set to be rejected and is always used in conjunction with an alternative hypothesis (significant difference). Sometimes, the null hypothesis cannot be rejected; this does not mean that it is true, only that we were unable to produce enough evidence to reject the stated null hypothesis. Thus, the investigator must have a clearly defined null hypothesis in hand.

Acceptable significance level
The significance level is denoted by α and is the probability of rejecting a null hypothesis that is really true; in other words, α = P (Type I error). Conventionally, a 5% (α = 0.05) or 1% (α = 0.01) level of significance is used by biomedical researchers, which means that the researcher accepts a 5% or 1% probability that the observed results are due to chance and not to the intervention. The corresponding confidence levels (confidence interval [CI]) are: (a) 95% CI for the 5% (α = 0.05) level of significance and (b) 99% CI for the 1% (α = 0.01) level of significance.

Study power
It is the probability of detecting a significant difference when it really exists. An increase in statistical power decreases the probability of a Type II error (β), that is, it reduces the risk of false-negative results; power is therefore denoted as 1−β. In most clinical trials, a power of 0.8 (80%) or greater is considered appropriate to find a statistically significant difference. A power of 80% means there is a 20% chance that we may fail to identify a significant difference even though it really exists.

Expected effect size
It refers to the magnitude of the relationship between two variables as it occurs in the population. For example, if the average hemoglobin rise with one type of diet is 5 g/dl and with another is 2 g/dl, then the absolute effect size is 5 − 2 = 3 g/dl. Thus, effect size is the calculated difference between the measured effects of the interventional and control groups. Pilot studies and previously reported data can be used to estimate the effect size. Cohen's guide for effect size, which is preferred by many social scientists, states that an effect size <0.1 is considered a small effect, an effect size between 0.3 and 0.5 a medium effect, and an effect size of more than 0.5 a large effect. Hence, an effect size of 0.5 is commonly used, as it reflects a moderate to large difference. Nevertheless, it is important to mention that effect size and sample size are inversely related: when the effect size is large, the research needs a smaller sample size, and vice versa.

Underlying event rate in the population
It is essential to consider the prevalence rate or baseline event rate of the condition in the study population while calculating the sample size. It is usually estimated from previous literature, including observational cohorts. For example, when studying the association of alcohol and liver disease, the prevalence rate of liver disease in the study population should be known before the study.

Margin of error
It is the random sampling error, that is, the extent to which sample results are likely to vary from the population value. For example, suppose the prevalence of anemia in the study sample is 40% and we set the margin of error at 5%; this means that the prevalence of anemia in the population would be expected to lie within 40 ± 5%, i.e., between 35% and 45%.

Standard deviation (SD) in the population
A researcher must anticipate the population variance of the given outcome variable, which is expressed by means of the SD. For a homogenous population, a smaller sample size will be needed because the variance or SD will be smaller. For example, for studying the effect of an exercise regimen on blood glucose, suppose we include a population with blood glucose ranging from 150 to 350 mg/dl. We would require a larger number of samples to find differences among interventions because the SD in this group will be larger. However, if we consider a sample from a population with blood glucose readings between 150 and 250 mg/dl, the researcher obtains a more homogeneous group, which decreases the SD and therefore the number of samples needed for the study.

One tail and two tail inferential statistical test
The choice of a one-tail or two-tail test depends on the objective of the study. If the research hypothesis is that a new drug is more effective in reducing blood pressure (BP), then a one-tail test could be sufficient to test the hypothesis; but if it is not certain whether the new drug is more or less effective in lowering BP than the existing drug, then it is always better to use a two-tail test. Inputs for the one-tail and two-tail tests are the same except for the critical ratio (Z value), which differs between the one-tail test (Z1-α) and the two-tail test (Z1-α/2), as depicted in Table 1.

Table 1: Z values
Level of significance (level of confidence, %)    Two-tailed (Z1−α/2)
0.05 (95)                                          1.96
0.01 (99)                                          2.58
0.001 (99.9)                                       3.29
Power of the test                                  Z value (Z1−β)
0.80                                               0.84
0.90                                               1.28
0.95                                               1.65
0.99                                               2.33

Design effect (DEFF)
The sample size calculation formulas provided in this article help to estimate an adequate sample size when simple random
sampling technique is assumed to be used in the study. However, whenever simple random sampling cannot be used, the calculated sample size may not be adequate; to overcome this problem, the calculated sample size has to be adjusted by the DEFF. The DEFF is the ratio of the expected variance under cluster random sampling to the expected variance under simple random sampling. The DEFF is generally ≥1, and in cluster designs DEFF = 2 is commonly assumed.

$$\mathrm{DEFF} = 1 + \delta\,(n - 1)$$

(Here, δ = intraclass correlation; n = common size of the cluster)

EXAMPLES FOR SAMPLE SIZE ESTIMATION FOR OBSERVATIONAL AND EXPERIMENTAL STUDIES[11-23]

Estimation of Sample Size for Cross-sectional or Descriptive Research Studies
These studies or surveys are generally conducted to find out, observe, describe, and document aspects of a situation as it naturally occurs. They are not used to identify the cause of something, such as the reason for an epidemic, and researchers do not manipulate variables. For example, a researcher might collect cross-sectional data on past alcohol habits and current diagnoses of liver disease.

Example I: Sample size when data are on a nominal/ordinal scale and proportion is one of the parameters

$$n = \frac{(Z_{1-\alpha/2})^{2}\; p\; q}{d^{2}}$$

n = Desired sample size
Z1−α/2 = Critical value, a standard value for the corresponding level of confidence (at 95% CI or 5% level of significance (type I error) it is 1.96, and at 99% CI it is 2.58)
p = Expected prevalence, based on previous research
q = 1 − p
d = Margin of error or precision

A researcher wants to carry out a descriptive study to understand the prevalence or proportion of diabetes mellitus (DM) among adults in a city. A previous study stated that the prevalence of diabetes in the adult population was 40%. At 95% CI and 5% margin of error, calculate the sample size required to conduct the new research.

On applying:

$$n = \frac{(1.96)^{2} \times 0.4 \times 0.6}{(0.05)^{2}}$$

(n) = 368.79
(n) = 369+37 (considering 10% dropout of study participants)
Sample size (n) = 406
Z1-α/2 = 1.96
p = 40% = 0.4
q = 1−0.4 = 0.6
d = 5% = 0.05

Hence, for conducting a new cross-sectional study to identify the prevalence or proportion of DM among adults, a minimum of 406 subjects will be required.

Example II: Sample size when the mean is the parameter of the study or data are on an interval/ratio scale

$$n = \frac{(Z_{1-\alpha/2})^{2}\; \sigma^{2}}{d^{2}}$$

n = Desired number of samples
Z1-α/2 = Standardized value for the corresponding level of confidence (at 95% CI it is 1.96, and at 99% CI or 1% type I error it is 2.58)
d = Margin of error or rate of precision
σ = SD, based on a previous study or pilot study

Suppose a researcher wants to know the average hemoglobin level among adults in the city at 95% CI with a margin of error of 2 g/dl. From a previous study, the SD of hemoglobin level among adults was found to be 4.5 g/dl. How many study subjects will be required to conduct a new study?

On applying:

$$n = \frac{(1.96)^{2} \times (4.5)^{2}}{(2)^{2}}$$

(n) = 19.45
(n) = 19+2 (considering 10% dropout of study participants)
Sample size (n) = 21
Z1-α/2 = 1.96
σ = 4.5 g/dl
d = 2 g/dl

Hence, for conducting a new cross-sectional study to estimate the average hemoglobin level among adults, a minimum of 21 subjects will be required.
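The two cross-sectional formulas and the design-effect adjustment can be sketched in Python as follows. This is a minimal illustration rather than the authors' own software: the function names are hypothetical, the Z values are derived with the standard library's `statistics.NormalDist`, and the dropout helper assumes the convention of the worked examples (round to the nearest whole subject, then add 10%); many researchers would round up instead.

```python
from statistics import NormalDist


def z_two_tailed(alpha: float) -> float:
    """Critical value Z(1 - alpha/2) of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def n_for_proportion(z: float, p: float, d: float) -> float:
    """Unrounded n = z^2 * p * (1 - p) / d^2 for estimating a proportion."""
    return z ** 2 * p * (1.0 - p) / d ** 2


def n_for_mean(z: float, sigma: float, d: float) -> float:
    """Unrounded n = z^2 * sigma^2 / d^2 for estimating a mean."""
    return z ** 2 * sigma ** 2 / d ** 2


def design_effect(icc: float, cluster_size: int) -> float:
    """DEFF = 1 + icc * (cluster_size - 1), icc being the intraclass correlation."""
    return 1.0 + icc * (cluster_size - 1)


def with_dropout(n: float, dropout: float = 0.10) -> int:
    """Assumed convention: round n to the nearest subject, then add the dropout share."""
    base = round(n)
    return base + round(base * dropout)


if __name__ == "__main__":
    z = round(z_two_tailed(0.05), 2)  # two-decimal table value for a 95% CI
    print(with_dropout(n_for_proportion(z, p=0.40, d=0.05)))
    print(with_dropout(n_for_mean(z, sigma=4.5, d=2.0)))
    # Under cluster sampling, the unadjusted n is multiplied by the design effect.
    # icc=0.05 and cluster_size=21 are made-up illustrative inputs.
    print(n_for_proportion(z, p=0.40, d=0.05) * design_effect(icc=0.05, cluster_size=21))
```

Keeping the unrounded value separate from the dropout adjustment makes it easy to swap in a different rounding rule or dropout rate.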
is 5%. Find out the sample size for the upcoming survey?

On applying the above formula:

$$n = \frac{2500}{1 + 2500 \times (0.05)^{2}}$$

(n) = 344.83
(n) = 345+34 (considering 10% dropout of study participants)
Sample size (n) = 379
N = 2500
d = 5% = 0.05

Hence, for conducting a new cross-sectional study to assess the prevalence of regular foot care among adults with diabetes, a minimum of 379 subjects will be required.

the study. He assumes that the expected proportion in the case group is 40% and in the control group is 30%, and decides to have the same number of subjects in both groups. Find out the optimum sample size for each group in the study.

$$n = \frac{r + 1}{r} \times \frac{\bar{p}\,(1 - \bar{p})\,(Z_{1-\beta} + Z_{1-\alpha/2})^{2}}{(p_{1} - p_{2})^{2}}$$

$$n = \frac{1 + 1}{1} \times \frac{0.35\,(1 - 0.35)\,(0.84 + 1.96)^{2}}{(0.40 - 0.30)^{2}}$$

(n) = 2*178.36
(n) = 356.72
(n) = 357+36 (considering 10% dropout of study participants)
Sample size (n) = 393
p1 = 40% = 0.4
p2 = 30% = 0.3
r = 1
Z1-β = 0.84
Z1-α/2 = 1.96
p = (0.4+0.3)/2 = 0.35

$$n = \frac{r + 1}{r} \times \frac{\sigma^{2}\,(Z_{1-\beta} + Z_{1-\alpha/2})^{2}}{d^{2}}$$

$$n = \frac{1 + 1}{1} \times \frac{(18)^{2}\,(0.84 + 1.96)^{2}}{(10)^{2}}$$

(n) = 50.80
(n) = 51+5 (considering 10% dropout of study participants)
Sample size (n) = 56
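The finite-population and case-control calculations can be sketched in Python as follows. The function names and the default Z values (1.96 for a two-sided 5% significance level, 0.84 for 80% power) are illustrative assumptions, and the pooled proportion is taken as the simple average of p1 and p2, which matches the worked example only when both groups are the same size (r = 1).

```python
def n_finite_population(population: int, d: float) -> float:
    """Unrounded n = N / (1 + N * d^2) for a known, finite population of size N."""
    return population / (1.0 + population * d ** 2)


def n_case_control_proportion(p1: float, p2: float, r: float = 1.0,
                              z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded n per group when the outcome is a proportion.

    r is the ratio of controls to cases; p_bar is the simple average of the
    two expected proportions (an assumption that fits r = 1).
    """
    p_bar = (p1 + p2) / 2.0
    numerator = p_bar * (1.0 - p_bar) * (z_beta + z_alpha) ** 2
    return (r + 1.0) / r * numerator / (p1 - p2) ** 2


def n_case_control_mean(sigma: float, d: float, r: float = 1.0,
                        z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded n per group when the outcome is a mean with SD sigma and difference d."""
    return (r + 1.0) / r * sigma ** 2 * (z_beta + z_alpha) ** 2 / d ** 2


if __name__ == "__main__":
    print(n_finite_population(2500, d=0.05))
    print(n_case_control_proportion(p1=0.40, p2=0.30))
    print(n_case_control_mean(sigma=18, d=10))
```

The functions return unrounded values so that rounding and the 10% dropout allowance can be applied afterwards in whichever way the study protocol prescribes.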
σ = 18
d = 10

Therefore, a researcher will require a minimum of 56 subjects in the case group as well as in the control group.

Sample Size Estimation for Cohort Studies
A cohort study is a longitudinal research study that includes a category of people who share the same characteristic, typically those who experienced a common event in a selected period, such as disease or education. There is no control group, and no intervention or treatment is given to the patients, which is why it differs from randomized controlled trials. For example, a researcher asks study subjects to record their eating practices over a period of time and then examines the correlation between eating practices and sleep pattern.

Example I: Sample size estimation for independent cohort studies

$$n = \frac{\left[ Z_{1-\alpha/2} \sqrt{(1 + 1/m)\,\bar{p}\,(1 - \bar{p})} + Z_{1-\beta} \sqrt{p_{0}(1 - p_{0})/m + p_{1}(1 - p_{1})} \right]^{2}}{(p_{0} - p_{1})^{2}}$$

$$n = \frac{\left[ 1.96 \sqrt{(1 + 1/1) \times 0.25 \times 0.75} + 0.84 \sqrt{0.2 \times 0.8/1 + 0.3 \times 0.7} \right]^{2}}{(0.2 - 0.3)^{2}}$$

(n) = [1.2002+0.5110]²/0.01
(n) = 292.82
(n) = 293+29 = 322 (considering 10% dropout of study participants)
Sample size (n) = 322
r = 1
Z1-β = 0.84
Z1-α/2 = 1.96
p0 = 0.2
p1 = 0.3
m = 1
p = (0.3 + 1 × 0.2)/(1 + 1) = 0.25

Therefore, a researcher will require a minimum of 322 subjects for the study.

Sample size estimation for comparative studies
It is the study design in which a comparison is made between two or more groups on the basis of selected attributes such as knowledge, perception, and attitude. A multidisciplinary approach is best used for this type of research. In case of comparative studies, the sample size is given by:

$$n = \frac{p_{1}(1 - p_{1}) + p_{2}(1 - p_{2})}{(p_{1} - p_{2})^{2}} \times C$$
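The independent-cohort calculation can be sketched in Python as follows. This is a hedged illustration: it assumes the usual textbook form of the formula, in which the second square root is taken over the sum p0(1−p0)/m + p1(1−p1), and the function name and default Z values (1.96 and 0.84) are illustrative choices.

```python
from math import sqrt


def n_independent_cohort(p0: float, p1: float, m: float = 1.0,
                         z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded sample size for comparing event rates in two independent cohorts.

    p0, p1: expected event proportions in the two groups
    m:      number of subjects in the p0 group per subject in the p1 group
    """
    # Pooled proportion, weighted by the allocation ratio m.
    p_bar = (p1 + m * p0) / (1.0 + m)
    # Significance-level term, evaluated at the pooled proportion.
    term_alpha = z_alpha * sqrt((1.0 + 1.0 / m) * p_bar * (1.0 - p_bar))
    # Power term, evaluated at the two group-specific proportions.
    term_beta = z_beta * sqrt(p0 * (1.0 - p0) / m + p1 * (1.0 - p1))
    return (term_alpha + term_beta) ** 2 / (p0 - p1) ** 2


if __name__ == "__main__":
    print(n_independent_cohort(p0=0.2, p1=0.3, m=1))
```

Splitting the numerator into its two terms makes it easy to compare each intermediate value with a hand calculation.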
S² = Pooled SD (both comparison groups)
p = Response rate of standard intervention
p0 = Response rate of new intervention

A researcher wants to assess the difference in the effectiveness of treatment A (new intervention) and treatment B (standard intervention) for the treatment of stroke in a two-month protocol.

Researcher assumes all parameters as:
p = 0.30; p0 = 0.45; α = 0.05 (95% CI); β = 0.20 (80% power); Z1-α = 1.645, Z1-β = 0.84, Z1-α/2 = 1.96; d = 0.16; δ = 0.21, δ0 = 0.10

a. Non-inferiority trial:

$$n = 2 \times \left( \frac{Z_{1-\beta} + Z_{1-\alpha}}{\delta_{0}} \right)^{2} \times p\,(1 - p)$$

$$n = 2 \times \left( \frac{0.84 + 1.645}{0.10} \right)^{2} \times 0.3 \times (1 - 0.3)$$

n = 259.36+26 (considering 10% dropout)
Sample size (n) = 285
p = 0.30
Z1-β = 0.84
Z1-α = 1.645
δ0 = 0.10

b. Equivalence trial

$$n = 2 \times \left( \frac{Z_{1-\beta} + Z_{1-\alpha/2}}{\delta_{0}} \right)^{2} \times p\,(1 - p)$$

$$n = 2 \times \left( \frac{0.845 + 1.96}{0.10} \right)^{2} \times 0.3 \times (1 - 0.3)$$

n = 330.46+33 (considering 10% dropout)
Sample size (n) = 363
p = 0.30
Z1-α/2 = 1.96
Z1-β = 0.845
δ0 = 0.10

Example II: Sample size to rule out the difference (effect size) between two groups (on the basis of difference in the mean or for continuous variables).
A researcher wants to see the variation in the effectiveness of drug-X (new drug) and drug-Y (standard drug) prescribed for diabetes treatment. Change in blood glucose level (mg/dl) from baseline is the primary outcome, at 95% CI and 80% power of the study. The researcher assumes all parameters as:
• Mean change of blood glucose level in the new drug treatment group is 20 mg/dl
• Mean change of blood glucose level in the standard treatment group is 16 mg/dl
• S (pooled SD of both comparison groups) = 10 mg/dl
• δ (actual margin between the two interventions) = 4 and δ0 (clinically allowable difference) = 2.

a. Non-inferiority trial:

$$n = 2 \times \left( \frac{Z_{1-\beta} + Z_{1-\alpha}}{\delta_{0}} \right)^{2} \times S^{2}$$

$$n = 2 \times \left( \frac{0.84 + 1.645}{2} \right)^{2} \times 10^{2}$$

(n) = 308.76+30 (considering 10% dropout)
Sample size (n) = 339

b. Equivalence trial

$$n = 2 \times \left( \frac{Z_{1-\beta} + Z_{1-\alpha/2}}{\delta_{0}} \right)^{2} \times S^{2}$$

$$n = 2 \times \left( \frac{0.845 + 1.96}{2} \right)^{2} \times 10^{2}$$

n = 393.40+39 (considering 10% dropout)
Sample size (n) = 432
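The non-inferiority and equivalence formulas share one structure, n = 2 × ((Z1-β + Z)/δ0)² × variance, where Z is Z1-α or Z1-α/2 and the variance term is p(1 − p) for proportions or S² for means. A minimal Python sketch of that shared structure follows; the function names are hypothetical, and the inputs in the usage section are the parameter values stated in the two examples (including Z1-β = 0.845 in the equivalence substitutions).

```python
def n_margin_trial(z_alpha: float, z_beta: float, margin: float, variance: float) -> float:
    """Unrounded n per group = 2 * ((z_beta + z_alpha) / margin)^2 * variance."""
    return 2.0 * ((z_beta + z_alpha) / margin) ** 2 * variance


def binary_variance(p: float) -> float:
    """Variance term p * (1 - p) for a response rate p."""
    return p * (1.0 - p)


if __name__ == "__main__":
    # Proportions: p = 0.30, allowable margin delta0 = 0.10
    print(n_margin_trial(1.645, 0.84, 0.10, binary_variance(0.30)))   # non-inferiority
    print(n_margin_trial(1.96, 0.845, 0.10, binary_variance(0.30)))   # equivalence
    # Means: pooled SD S = 10 mg/dl, allowable margin delta0 = 2 mg/dl
    print(n_margin_trial(1.645, 0.84, 2.0, 10.0 ** 2))                # non-inferiority
    print(n_margin_trial(1.96, 0.845, 2.0, 10.0 ** 2))                # equivalence
```

Passing the variance term in explicitly keeps a single function usable for both binary and continuous outcomes.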
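c. Clinical superiority trial (proportions)

$$n = 2 \times \left( \frac{Z_{1-\beta} + Z_{1-\alpha}}{\delta - \delta_{0}} \right)^{2} \times p\,(1 - p)$$

c. Clinical superiority trial (means):

$$n = 2 \times \left( \frac{Z_{1-\beta} + Z_{1-\alpha}}{\delta - \delta_{0}} \right)^{2} \times SD^{2}$$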
0.84 + 1.6452
0.84 + 1.6452 (n ) = 2 * *10
2