ResearchMethodologyModule3 1
Module-3: Meaning of Research Design, Need for Research Design, Features of a Good Design,
Important Concepts Relating to Research Design, Different Research Designs, Basic Principles of
Experimental Designs, Important Experimental Designs, Design of Sample Surveys: Introduction,
Sample Design, Sampling and Non-sampling Errors, Sample Survey versus Census Survey, Types
of Sampling Designs.
(a) The sampling design which deals with the method of selecting items to be observed for the given study;
(b) The observational design which relates to the conditions under which the observations are to be made;
(c) The statistical design which concerns the question of how many items are to be observed and how the
information and data gathered are to be analysed; and
(d) The operational design which deals with the techniques by which the procedures specified in the sampling,
statistical and observational designs can be carried out.
Module-3 Research Design
The important features of a research design are as under:
(i) It is a plan that specifies the sources and types of information relevant to the research problem.
(ii) It is a strategy specifying which approach will be used for gathering and analysing the data.
(iii) It also includes the time and cost budgets since most studies are done under these two constraints.
If the research study happens to be an exploratory or a formulative one, the research design most appropriate
must be flexible enough to permit the consideration of many different aspects of a phenomenon. But when the
purpose of a study is accurate description of a situation or of an association between variables (or in what are
called the descriptive studies), accuracy becomes a major consideration and a research design which
minimises bias and maximises the reliability of the evidence collected is considered a good design. Studies
involving the testing of a hypothesis of a causal relationship between variables require a design which will
permit inferences about causality in addition to the minimisation of bias and maximisation of reliability.
Any given research may have in it elements of two or more of these types of study; it is only on the basis of its
primary function that a study can be categorised. Besides, the availability of time, money, skills of the research staff and the means of
obtaining the information must be given due weightage while working out the relevant details of the research
design such as experimental design, survey design, sample design and the like.
2. Extraneous variable: Independent variables that are not related to the purpose of the study, but may affect
the dependent variable are termed as extraneous variables. Whatever effect is noticed on dependent variable
as a result of extraneous variable(s) is technically described as an ‘experimental error’. A study must always
be so designed that the effect upon the dependent variable is attributed entirely to the independent variable(s),
and not to some extraneous variable or variables.
Ex: The impact of learning format/teaching style on exam performance.
independent variable-> learning format/teaching style
dependent variable-> exam performance
extraneous variable-> quality of lecture.
3. Control: One important characteristic of a good research design is that it minimises the influence or effect of
extraneous variable(s). In experimental research, the term ‘control’ refers to restraining experimental
conditions.
4. Confounded relationship: When the dependent variable is not free from the influence of extraneous
variable(s), the relationship between the dependent and independent variables is said to be confounded by an
extraneous variable(s).
7. Experimental and control groups: In an experimental hypothesis-testing research when a group is exposed
to usual conditions, it is termed a ‘control group’, but when the group is exposed to some novel or special
condition, it is termed an ‘experimental group’. In the above illustration, the Group A can be called a control
group and the Group B an experimental group.
8. Treatments: The different conditions under which experimental and control groups are put are usually
referred to as ‘treatments’. In the illustration taken above, the two treatments are the usual studies programme
and the special studies programme. Similarly, if we want to determine through an experiment the comparative
impact of three varieties of fertilizers on the yield of wheat, in that case the three varieties of fertilizers will be
treated as three treatments.
9. Experiment: The process of examining the truth of a statistical hypothesis, relating to some research
problem, is known as an experiment. For example, we can conduct an experiment to examine the usefulness
of a certain newly developed drug. Experiments can be of two types viz., absolute experiment and
comparative experiment. If we want to determine the impact of a fertilizer on the yield of a crop, it is a case of
absolute experiment; but if we want to determine the impact of one fertilizer as compared to the impact of
some other fertilizer, our experiment then will be termed as a comparative experiment.
10. Experimental unit(s): The pre-determined plots or the blocks, where different treatments are used, are
known as experimental units.
2. After-only with control design: In this design two groups or areas (test area and control area) are selected
and the treatment is introduced into the test area only. The dependent variable is then measured in both the
areas at the same time. Treatment impact is assessed by subtracting the value of the dependent variable in the
control area from its value in the test area. This can be exhibited in the following form:
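As a minimal numeric sketch of this subtraction (the function name and the figures are hypothetical, chosen only for illustration), the treatment impact in an after-only with control design can be computed as:

```python
def after_only_effect(test_value: float, control_value: float) -> float:
    """Treatment impact = dependent variable in the test area
    minus dependent variable in the control area (same time point)."""
    return test_value - control_value


# Hypothetical measurements taken in both areas at the same time.
effect = after_only_effect(test_value=120.0, control_value=100.0)
print(effect)
```

Because no measurement is taken before the treatment, this estimate assumes the two areas were comparable to begin with.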
3. Before-and-after with control design: In this design two areas are selected and the dependent variable is
measured in both the areas for an identical time-period before the treatment. The treatment is then introduced
into the test area only, and the dependent variable is measured in both for an identical time-period after the
introduction of the treatment. The treatment effect is determined by subtracting the change in the dependent
variable in the control area from the change in the dependent variable in test area. This design can be shown in
this way:
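The calculation described for the before-and-after with control design can be sketched as follows. The function name and the four measurements are assumptions made for illustration only:

```python
def before_after_with_control_effect(
    test_before: float,
    test_after: float,
    control_before: float,
    control_after: float,
) -> float:
    """Treatment effect = change in the test area minus change in the control area."""
    change_in_test = test_after - test_before
    change_in_control = control_after - control_before
    return change_in_test - change_in_control


# Hypothetical measurements over identical time-periods before and after treatment.
effect = before_after_with_control_effect(
    test_before=100.0, test_after=130.0,
    control_before=90.0, control_after=100.0,
)
print(effect)
```

Subtracting the change in the control area removes whatever change would have occurred even without the treatment.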
4. Completely randomized design (C.R. design): Involves only two principles viz., the principle of replication
and the principle of randomization of experimental designs. It is the simplest possible design and its procedure
of analysis is also easier. The essential characteristic of the design is that subjects are randomly assigned to
experimental treatments (or vice-versa). For instance, if we have 10 subjects and if we wish to test 5 under
treatment A and 5 under treatment B, the randomization process gives every possible group of 5 subjects
selected from a set of 10 an equal opportunity of being assigned to treatment A and treatment B.
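The random assignment in this example can be sketched in a few lines. This is only an illustration: the subject labels 1 to 10, the function name and the seed are assumptions, and any sound random mechanism would serve equally well.

```python
import random
from math import comb


def completely_randomized_assignment(subjects, group_size, seed=None):
    """Randomly pick `group_size` subjects for treatment A; the rest get treatment B."""
    rng = random.Random(seed)
    group_a = rng.sample(subjects, group_size)
    group_b = [s for s in subjects if s not in group_a]
    return group_a, group_b


subjects = list(range(1, 11))  # 10 hypothetical subjects
group_a, group_b = completely_randomized_assignment(subjects, 5, seed=42)

# Number of distinct groups of 5 that can be drawn from 10 subjects;
# randomization gives each of them the same chance of receiving treatment A.
possible_groups = comb(10, 5)
print(group_a, group_b, possible_groups)
```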
The two groups (experimental and control groups) of such a design are given different treatments of the
independent variable. This design of experiment is quite common in research studies concerning behavioural
sciences. The merit of such a design is that it is simple and randomizes the differences among the sample
items. But the limitation of it is that the individual differences among those conducting the treatments are not
eliminated, i.e., it does not control the extraneous variable.
(ii) Random replications design: The limitation of the two-group randomized design is usually eliminated
within the random replications design. In the illustration just cited above, the teacher differences on the
dependent variable were ignored, i.e., the extraneous variable was not controlled. But in a random replications
design, the effect of such differences is minimised (or reduced) by providing a number of repetitions for each
treatment. Each repetition is technically called a ‘replication’.
Random replication design serves two purposes viz., it provides controls for the differential effects of the
extraneous independent variables and secondly, it randomizes any individual differences among those
conducting the treatments.
5. Randomized block design (R.B. design) is an improvement over the C.R. design. In the R.B. design the
principle of local control can be applied along with the other two principles of experimental designs. In the
R.B. design, subjects are first divided into groups, known as blocks, such that within each group the subjects
are relatively homogeneous in respect to some selected variable. The variable selected for grouping the
subjects is one that is believed to be related to the measures to be obtained in respect of the dependent
variable. The number of subjects in a given block would be equal to the number of treatments and one subject
in each block would be randomly assigned to each treatment. In general, blocks are the levels at which we
hold the extraneous factor fixed, so that its contribution to the total variability of data can be measured.
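A minimal sketch of the assignment step in an R.B. design is given below. The block names, subject labels and treatment labels are hypothetical; the only feature taken from the text is that each block holds as many subjects as there are treatments and that one subject per block is randomly assigned to each treatment.

```python
import random


def randomized_block_assignment(blocks, treatments, seed=None):
    """Within every block, randomly assign one subject to each treatment."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, block_subjects in blocks.items():
        if len(block_subjects) != len(treatments):
            raise ValueError("each block needs as many subjects as treatments")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # randomization is done separately inside each block
        assignment[block_name] = dict(zip(block_subjects, shuffled))
    return assignment


# Hypothetical blocks of relatively homogeneous subjects (e.g. grouped by I.Q.).
blocks = {
    "low": ["s1", "s2", "s3"],
    "medium": ["s4", "s5", "s6"],
    "high": ["s7", "s8", "s9"],
}
treatments = ["T1", "T2", "T3"]
plan = randomized_block_assignment(blocks, treatments, seed=1)
for block_name, mapping in plan.items():
    print(block_name, mapping)
```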
Let us illustrate the R.B. design with the help of an example. Suppose four different forms of a standardised
test in statistics were given to each of five students (selected one from each of the five I.Q. blocks) and
following are the scores which they obtained.
The purpose of this randomization is to take care of possible extraneous factors (such as fatigue).
6. Latin square design (L.S. design) is an experimental design very frequently used in agricultural research.
The L.S. design is used when there are two major extraneous factors such as the varying soil fertility and
varying seeds. For instance, suppose an experiment is to be conducted to judge the effects of five different varieties
of fertilizers on the yield of a certain crop, say wheat. In such a case the varying fertility of the
soil in different blocks in which the experiment has to be performed must be taken into consideration;
otherwise the results obtained may not be very dependable because the output happens to be the effect not
only of fertilizers, but it may also be the effect of fertility of soil. Similarly, there may be impact of varying
seeds on the yield.
The Latin-square design is one wherein each fertilizer, in our example, appears five times but is used only
once in each row and in each column of the design. In other words, the treatments in an L.S. design are so
allocated among the plots that no treatment occurs more than once in any one row or any one column. The
two blocking factors may be represented through rows and columns (one through rows and the other through
columns). The following is a diagrammatic form of such a design in respect of, say, five types of fertilizers,
viz., A, B, C, D and E and the two blocking factors viz., the varying soil fertility and the varying seeds:
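One simple way to build an arrangement with this property is to shift the list of treatments cyclically from row to row. The sketch below assumes that construction purely for illustration (many other valid 5 × 5 arrangements exist) and also checks the defining property that no treatment repeats in any row or column.

```python
def cyclic_latin_square(treatments):
    """Build an n x n Latin square by shifting the treatment list one place per row."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """True if every treatment occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


# Rows and columns stand for the two blocking factors (soil fertility and seeds).
square = cyclic_latin_square(["A", "B", "C", "D", "E"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

In practice the rows, columns and treatment labels of such a square would typically be randomised before it is laid out in the field.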
7. Factorial designs: Factorial designs are used in experiments where the effects of varying more than one
factor are to be determined. They are especially important in several economic and social phenomena where
usually a large number of factors affect a particular problem. Factorial designs can be of two types: (i) simple
factorial designs and (ii) complex factorial designs.
(i) Simple factorial designs: In case of simple factorial designs, we consider the effects of varying two factors
on the dependent variable, but when an experiment is done with more than two factors, we use complex
factorial designs. Simple factorial design is also termed a ‘two-factor-factorial design’, whereas complex
factorial design is known as a ‘multi-factor-factorial design’.
Illustration 1: (2 × 2 simple factorial design).
A 2 × 2 simple factorial design can graphically be depicted as follows:
In this design the extraneous variable to be controlled by homogeneity is called the control variable and the
independent variable, which is manipulated, is called the experimental variable. Then there are two treatments
of the experimental variable and two levels of the control variable. As such there are four cells into which the
sample is divided. The column means in the given design are termed the main effects for treatments without
taking into account any differential effect that is due to the level of the control variable. Similarly, the row
means in the said design are termed the main effects for levels without regard to treatment.
The following examples make clear the interaction
effect between treatments and levels. The data obtained in case of two (2 × 2) simple factorial studies may be
as given in Fig. 3.9.
The graph relating to Study I indicates that there is an interaction between the treatment and the level which,
in other words, means that the treatment and the level are not independent of each other. The graph relating to
Study II shows that there is no interaction effect which means that treatment and level in this study are
relatively independent of each other.
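The idea of interaction can also be checked numerically from the four cell means of a 2 × 2 design. The sketch below uses invented cell means (they are not the data of Fig. 3.9) and hypothetical labels: treatments A and B, levels I and II. If the difference between treatments is the same at both levels, the interaction contrast is zero and the plotted lines are parallel.

```python
def treatment_means(cell_means):
    """Main effects for treatments: mean of each treatment across both levels."""
    return {
        t: (cell_means[(t, "I")] + cell_means[(t, "II")]) / 2
        for t in ("A", "B")
    }


def interaction_contrast(cell_means):
    """(A - B at level I) minus (A - B at level II); zero means no interaction."""
    diff_level_1 = cell_means[("A", "I")] - cell_means[("B", "I")]
    diff_level_2 = cell_means[("A", "II")] - cell_means[("B", "II")]
    return diff_level_1 - diff_level_2


# Hypothetical cell means, keyed by (treatment, level).
with_interaction = {("A", "I"): 15.0, ("B", "I"): 25.0,
                    ("A", "II"): 25.0, ("B", "II"): 15.0}
without_interaction = {("A", "I"): 10.0, ("B", "I"): 20.0,
                       ("A", "II"): 20.0, ("B", "II"): 30.0}

print(interaction_contrast(with_interaction))      # non-zero: lines cross
print(interaction_contrast(without_interaction))   # zero: lines are parallel
print(treatment_means(without_interaction))
```

Note that in the first hypothetical study the two treatment means are equal even though the treatments behave very differently at each level, which is why main effects alone can mislead when interaction is present.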
A 2 × 2 simple factorial design may also be of the type having two experimental variables or two control variables. For example, a college
teacher compared the effect of the class size as well as the introduction of the new instruction technique on the
learning of research methodology.
But if the teacher uses a design for comparing males and females and the senior and junior students in the
college as they relate to the knowledge of research methodology, in that case we will have a 2 × 2 simple
factorial design wherein both the variables are control variables as no manipulation is involved in respect of
both the variables.
(ii) Complex factorial designs: Experiments with more than two factors at a time involve the use of complex
factorial designs. A design which considers three or more independent variables simultaneously is called a
complex factorial design. In case of three factors with one experimental variable having two treatments and
two control variables, each of which has two levels, the design used will be termed a 2 × 2 × 2 complex
factorial design which will contain a total of eight cells as shown below in Fig. 3.13.
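The eight cells of such a design are simply all combinations of the two treatments with the two levels of each control variable. A short enumeration, with hypothetical labels for the treatments and levels, makes the count explicit:

```python
from itertools import product

# Hypothetical labels: one experimental variable and two control variables.
treatments = ["Treatment A", "Treatment B"]
control_1_levels = ["Level I", "Level II"]
control_2_levels = ["Level I", "Level II"]

# Every cell is one (treatment, level of control 1, level of control 2) combination.
cells = list(product(treatments, control_1_levels, control_2_levels))
for number, cell in enumerate(cells, start=1):
    print(number, cell)
print(len(cells))
```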
Design of Sample Surveys
Introduction
A complete enumeration of all the items in the ‘population’ is known as a census inquiry or census survey.
Ex: To calculate the average per capita income of the people. A census survey is impossible when the
population is infinite, since it is not possible to examine every item in the population. It is then better to resort
to a sample survey, i.e., to select only a few items or respondents. The selected respondents are meant to be
representative of the total population and together constitute the ‘sample’; the selection process is called the
‘sampling technique’, and the survey so conducted is called a ‘sample survey’.
Sample Design
A sample design is a definite plan for obtaining a sample from a given population. Sample design may as
well lay down the number of items to be included in the sample. Sample design is determined before the data
is collected. There are many sample designs from which a researcher can choose.
The main steps of sampling design are as follows:
i) Objective: Define the objectives of survey in clear and concrete terms. The objectives must be in
accordance with money, manpower and time limit available for the survey.
ii) Population: What should be the population? This question should be answered.
iii) Sampling unit and frame: Sampling unit may be a geographical one such as state, district, village or a
construction unit such as house, flat, etc., or it may be a social unit such as family, club, school, etc. The list of
sampling units is called the sampling frame.
iv) Size of sample: The size of the sample should be neither excessively large nor too small. It should be
optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness,
reliability and flexibility. Budgetary constraint must be considered when we decide the sample size.
v) Parameters of interest: Statistical constants of the population are called parameters, e.g., population
mean, population proportion, etc.
vi) Data collection: No irrelevant information should be collected and no essential information should be
discarded.
vii) Non-respondents: Non-response tends to change the results. The reason for the non-response should
be recorded by the investigator.
viii) Selection of proper sampling design: The researcher must decide the type of sample he will use i.e., he
must decide about the technique to be used in selecting the items for the sample.
ix) Organizing field work: There should be efficient supervisory staff and trained personnel for the field work.
x) Pilot survey: Try out the research design on a small scale before going to the field.
xi) Budgetary constraint: Cost considerations, from a practical point of view, have a major impact upon
decisions relating to not only the size of sample but also the type of sample.
sampling. Under quota sampling the interviewers are simply given quotas to be filled from the different
strata, with some restrictions on how they are to be filled. Quota samples are essentially judgement samples
and inferences drawn on their basis are not amenable to statistical treatment in a formal way.
Probability sampling: Probability sampling is also known as ‘random sampling’ or ‘chance sampling’. Under
this sampling design, every item of the universe has an equal chance of inclusion in the sample. It is, so to say,
a lottery method in which individual units are picked up from the whole group not deliberately but by some
mechanical process. Random sampling ensures the law of Statistical Regularity which states that if on an
average the sample chosen is a random one, the sample will have the same composition and characteristics as
the universe.
Keeping this in view we can define a simple random sample (or simply a random sample) from a finite
population as a sample which is chosen in such a way that each of the ${}^{N}C_{n}$ possible samples has the same
probability, $1/{}^{N}C_{n}$, of being selected. To make this clearer, take a finite population consisting of
six elements (say a, b, c, d, e, f), i.e., N = 6. Suppose that we want to take a sample of size n = 3 from it. Then
there are ${}^{6}C_{3} = 20$ possible distinct samples of the required size, and they consist of the elements abc, abd,
abe, abf, acd, ace, acf, ade, adf, aef, bcd, bce, bcf, bde, bdf, bef, cde, cdf, cef, and def. If we choose one of
these samples in such a way that each has the probability 1/20 of being chosen, we will then call this a random
sample.
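The list of possible samples in this example can be reproduced mechanically. The sketch below enumerates every sample of size 3 from the six elements and attaches the equal selection probability to each; the variable names are our own.

```python
from fractions import Fraction
from itertools import combinations

population = ["a", "b", "c", "d", "e", "f"]  # N = 6
sample_size = 3                              # n = 3

# Every distinct sample of size n, written as a string such as "abc".
samples = ["".join(combo) for combo in combinations(population, sample_size)]

# Under simple random sampling each sample is equally likely.
probability_each = Fraction(1, len(samples))

print(len(samples))
print(samples)
print(probability_each)
```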
This procedure will also result in the same probability for each possible sample. We can verify this by taking
the above example. Since we have a finite population of 6 elements and we want to select a sample of size 3,
the probability that the first draw yields one of the three elements of a given sample is 3/6, the probability that
the second draw yields one of its two remaining elements is 2/5 (the first element drawn is not replaced) and similarly the
probability that the third draw yields its last element is 1/4. Because the draws are made without replacement, each of
these is a conditional probability given the earlier draws, and the joint probability of the three elements which
constitute our sample is their product: 3/6 × 2/5 × 1/4 = 1/20. This verifies our earlier calculation.
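The same product can be checked with exact fractions and compared with the count of possible samples. This is only a verification sketch; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def probability_of_specific_sample(population_size: int, sample_size: int) -> Fraction:
    """Multiply the successive draw probabilities n/N, (n-1)/(N-1), ... , 1/(N-n+1)."""
    probability = Fraction(1)
    for draw in range(sample_size):
        probability *= Fraction(sample_size - draw, population_size - draw)
    return probability


p = probability_of_specific_sample(population_size=6, sample_size=3)
print(p)                             # product 3/6 x 2/5 x 1/4
print(p == Fraction(1, comb(6, 3)))  # agrees with 1 / (number of possible samples)
```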
COMPLEX RANDOM SAMPLING DESIGNS
Probability sampling under restricted sampling techniques, as stated above, may result in complex random
sampling designs.
(i) Systematic sampling: In some instances, the most practical way of sampling is to select every i-th item on a
list. Sampling of this type is known as systematic sampling. An element of randomness is introduced into this
kind of sampling by using random numbers to pick the unit with which to start. For instance, if a 4 per cent
sample is desired, the first item would be selected randomly from the first twenty-five and thereafter every
25th item would automatically be included in the sample. Systematic sampling has certain plus points. It can
be taken as an improvement over a simple random sample inasmuch as the systematic sample is spread more
evenly over the entire population.
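The 4 per cent example can be sketched as follows, assuming a hypothetical list of 1,000 numbered items. The random start falls among the first 25 items and every 25th item after it is taken.

```python
import random


def systematic_sample(items, interval, seed=None):
    """Pick a random start within the first `interval` items, then every interval-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # index 0 .. interval-1, i.e. one of the first `interval` items
    return items[start::interval]


population = list(range(1, 1001))  # hypothetical list of 1,000 numbered items
sample = systematic_sample(population, interval=25, seed=7)

print(len(sample))   # 4 per cent of 1,000 items
print(sample[:5])
```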
(ii) Stratified sampling: If a population from which a sample is to be drawn does not constitute a
homogeneous group, stratified sampling technique is generally applied in order to obtain a representative
sample. Under stratified sampling the population is divided into several sub-populations that are individually
more homogeneous than the total population (the different sub-populations are called ‘strata’) and then we
select items from each stratum to constitute a sample. The following three questions are highly relevant in the
context of stratified sampling:
(a) How to form strata?
(b) How should items be selected from each stratum?
(c) How many items should be selected from each stratum, or how should the sample size be allocated to each stratum?
Regarding the first question, we can say that the strata should be formed on the basis of common characteristic(s) of
the items to be put in each stratum. This means that the various strata should be formed in such a way as to ensure
elements being most homogeneous within each stratum and most heterogeneous between the different strata.
Thus, strata are purposively formed and are usually based on past experience and personal judgement of the
researcher. In respect of the second question, we can say that the usual method, for selection of items for the
sample from each stratum, resorted to is that of simple random sampling. Systematic sampling can be used if
it is considered more appropriate in certain situations. Regarding the third question, we usually follow the
method of proportional allocation under which the sizes of the samples from the different strata are kept
proportional to the sizes of the strata.
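Proportional allocation amounts to giving each stratum the share n × N_i / N of the sample, where N_i is the stratum size, N the population size and n the total sample size. The sketch below uses hypothetical stratum sizes and a hypothetical sample size of 30.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Allocate the sample to strata in proportion to stratum size: n_i = n * N_i / N."""
    population_size = sum(stratum_sizes.values())
    return {
        name: sample_size * size / population_size
        for name, size in stratum_sizes.items()
    }


# Hypothetical strata sizes (N = 8000) and total sample size n = 30.
strata = {"stratum_1": 4000, "stratum_2": 2400, "stratum_3": 1600}
allocation = proportional_allocation(strata, sample_size=30)
print(allocation)
```

When the shares do not come out as whole numbers they have to be rounded, taking care that the rounded values still add up to n.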
(iii) Cluster sampling: If the total area of interest happens to be a big one, a convenient way in which a sample
can be taken is to divide the area into a number of smaller non-overlapping areas and then to randomly select
a number of these smaller areas (usually called clusters), with the ultimate sample consisting of all (or samples
of) units in these small areas or clusters. Suppose we want to estimate the proportion of machine parts in an
inventory which are defective. Also assume that there are 20000 machine parts in the inventory at a given
point of time, stored in 400 cases of 50 each. Now using a cluster sampling, we would consider the 400 cases
as clusters and randomly select ‘n’ cases and examine all the machine parts in each randomly selected case.
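The machine-parts example can be simulated in a few lines. The inventory below is generated artificially (the 5 per cent defect rate, the choice of n = 20 cases and the seeds are all assumptions made for illustration); only the structure of 400 cases of 50 parts each comes from the example.

```python
import random


def cluster_sample_proportion(cases, n_cases, seed=None):
    """Randomly select n_cases whole cases and inspect every part in them."""
    rng = random.Random(seed)
    chosen = rng.sample(range(len(cases)), n_cases)
    inspected = [part for index in chosen for part in cases[index]]
    proportion_defective = sum(inspected) / len(inspected)
    return proportion_defective, chosen


# Artificial inventory: 400 cases x 50 parts, True marks a defective part.
inventory_rng = random.Random(0)
cases = [[inventory_rng.random() < 0.05 for _ in range(50)] for _ in range(400)]

estimate, chosen = cluster_sample_proportion(cases, n_cases=20, seed=1)
print(len(chosen) * 50)  # number of parts actually examined
print(estimate)
```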
(iv) Area sampling: If clusters happen to be some geographic subdivisions, in that case cluster sampling is
better known as area sampling. In other words, cluster designs, where the primary sampling unit represents a
cluster of units based on geographic area, are distinguished as area sampling. The plus and minus points of
cluster sampling are also applicable to area sampling.
(v) Multi-stage sampling: Multi-stage sampling is a further development of the principle of cluster sampling.
Suppose we want to investigate the working efficiency of nationalised banks in India and we want to take a
sample of a few banks for this purpose. The first stage is to select large primary sampling units such as states in a
country. Then we may select certain districts and interview all banks in the chosen districts. This would
represent a two-stage sampling design with the ultimate sampling units being clusters of districts.
(vi) Sampling with probability proportional to size: In case the cluster sampling units do not have the same
number or approximately the same number of elements, it is considered appropriate to use a random selection
process where the probability of each cluster being included in the sample is proportional to the size of the
cluster. For this purpose, we have to list the number of elements in each cluster irrespective of the method of
ordering the cluster.
The following are the number of departmental stores in 15 cities: 35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44,
33, 29 and 28. If we want to select a sample of 10 stores, using cities as clusters and selecting within clusters
proportional to size, how many stores from each city should be chosen? (Use a starting point of 10).
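Solution: Let us put the information as under (Table 4.1):
Table 4.1
City number | No. of departmental stores | Cumulative total | Sample
1  | 35 | 35  | 10
2  | 17 | 52  |
3  | 10 | 62  | 60
4  | 32 | 94  |
5  | 70 | 164 | 110, 160
6  | 28 | 192 |
7  | 26 | 218 | 210
8  | 19 | 237 |
9  | 26 | 263 | 260
10 | 66 | 329 | 310
11 | 37 | 366 | 360
12 | 44 | 410 | 410
13 | 33 | 443 |
14 | 29 | 472 | 460
15 | 28 | 500 |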
Since in the given problem we have 500 departmental stores from which we have to select a sample of 10
stores, the appropriate sampling interval is 500/10 = 50. As we have to use the starting point of 10, we
successively add increments of 50 till 10 numbers have been selected. The numbers thus obtained are: 10, 60,
110, 160, 210, 260, 310, 360, 410 and 460, which are shown in the last column of the table (Table 4.1)
against the corresponding cumulative totals. From this we can say that two stores should be selected randomly
from city number five and one each from city numbers 1, 3, 7, 9, 10, 11, 12 and 14. This sample of 10 stores is
the sample with probability proportional to size.
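The working of this example can be reproduced with a short script. It uses the store counts, sample size and starting point given in the problem; the function and variable names are our own, and locating each selected number with a binary search on the cumulative totals is just one convenient way to do the lookup.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

stores_per_city = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28]


def pps_systematic(sizes, sample_size, start):
    """Select clusters with probability proportional to size via cumulative totals."""
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] // sample_size          # 500 / 10 = 50
    picks = [start + k * interval for k in range(sample_size)]
    # A pick belongs to the first city whose cumulative total is >= the pick.
    cities = [bisect_left(cumulative, pick) + 1 for pick in picks]
    return cumulative, picks, Counter(cities)


cumulative, picks, stores_to_select = pps_systematic(stores_per_city, 10, start=10)
print(cumulative)
print(picks)
for city, count in sorted(stores_to_select.items()):
    print(f"city {city}: select {count} store(s)")
```

Larger cities span a wider range of cumulative totals, so they are more likely to catch one or more of the equally spaced selection numbers; this is why city number five, with 70 stores, is hit twice.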
(vii) Sequential sampling: This is a somewhat complex sample design. The ultimate size of the
sample under this technique is not fixed in advance, but is determined according to mathematical decision
rules on the basis of information yielded as survey progresses. This is usually adopted in case of acceptance
sampling plan in context of statistical quality control. When a particular lot is to be accepted or rejected on the
basis of a single sample, it is known as single sampling; when the decision is to be taken on the basis of two
samples, it is known as double sampling and in case the decision rests on the basis of more than two samples
but the number of samples is certain and decided in advance, the sampling is known as multiple sampling. But
when the number of samples is more than two but it is neither certain nor decided in advance, this type of
system is often referred to as sequential sampling.