Simple Random Sampling
2.1 INTRODUCTION
Simple random sampling is the sampling technique in which every item of the
population has an equal chance of being included in the sample. The selection
is thus free from personal bias, because the investigator exercises no
preference in the choice of items. Since the selection of items depends
entirely on chance, this method is also described as probability sampling.
Random sampling is sometimes referred to as representative sampling. If the
sample is chosen at random and the sample size is sufficiently large, it will
tend to represent all groups in the population. An element of randomness must
be introduced in the final selection of items; otherwise bias is likely to
enter and make the sample unrepresentative.
Methods of selecting a simple random sample are explained in Section 2.2.
Section 2.3 describes the properties of simple random sampling. Simple random
sampling of attributes is introduced in Section 2.4, and Section 2.5 briefly
describes the determination of sample size for a specified precision.
Objectives
After studying this unit, you should be able to:
describe simple random sampling;
explain the methods of SRSWR and SRSWOR;
explain and derive the properties of simple random sampling;
calculate the variance of the sample mean;
describe SRS for attributes and its properties; and
determine an appropriate sample size for a specified precision.
Statistical Techniques
2.2 METHODS OF SELECTION OF A SAMPLE
A random sample is obtained by a method of selection in which every item has
an equal chance of being selected. The sample obtained depends not only on the
method of selection but also on the sample size and the nature of the
population.
Simple Random Sampling
If a sample of n units is selected at random from a population of size N, the
method is known as simple random sampling. As the name suggests, the required
number of units is selected purely at random from the target population. The
units may be drawn with replacement (simple random sampling with replacement,
SRSWR) or without replacement (simple random sampling without replacement,
SRSWOR). In either case, a simple random sample can be selected by any of the
following methods:
1. By Lottery method;
2. By ‘Mechanical Randomization’ or ‘Random Numbers’ method; and
3. By Computer Random Number Generation method.
2.2.1 Lottery Method
This is a very popular method of selecting a random sample, in which all
items of the population are numbered or named on separate slips of paper.
The slips should be identical in size, color and shape. They are folded and
mixed in a container, box or drum, and the number of slips required for the
desired sample size is then drawn blindfolded. The selection of items thus
depends entirely on chance. For example, suppose we want to select n
candidates out of N. We assign the numbers 1 to N, one to each candidate, and
write these numbers on N slips made as homogeneous as possible. The slips are
put in a bag and thoroughly shuffled, and n slips are drawn one by one. The n
candidates corresponding to the numbers on the drawn slips constitute a
random sample.
If we draw a slip, note the number written on it, return the slip to the bag
and then draw the next one, the slips so drawn constitute a sample selected by
SRSWR. If a slip that has already been drawn is not returned to the bag for
the subsequent draws, the method is called SRSWOR.
This method is very popular in lottery draws, where a decision about prizes
is to be made. However, when adopting the lottery method it is essential that
the slips be as homogeneous as possible in size, shape and color; otherwise
personal prejudice and bias may affect the result.
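The two variants of the lottery draw can be imitated on a computer. The sketch below is only an illustration: the function name `lottery_draw`, the population size 20, the sample size 5 and the seed are all hypothetical choices, and Python's standard `random` module stands in for the bag of slips.

```python
import random

def lottery_draw(population_size, sample_size, with_replacement, seed=None):
    """Draw slip numbers from 1..N, imitating the lottery method."""
    rng = random.Random(seed)                     # seed only makes the draw repeatable
    slips = list(range(1, population_size + 1))   # one slip per unit
    if with_replacement:
        # SRSWR: each slip goes back into the bag, so repeats are possible.
        return [rng.choice(slips) for _ in range(sample_size)]
    # SRSWOR: a drawn slip stays out of the bag, so all numbers are distinct.
    return rng.sample(slips, sample_size)

srswor = lottery_draw(20, 5, with_replacement=False, seed=1)
srswr = lottery_draw(20, 5, with_replacement=True, seed=1)
print("SRSWOR:", srswor)
print("SRSWR: ", srswr)
```

An SRSWOR sample never contains the same slip twice, whereas an SRSWR sample may.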
2.2.2 Random Number Method
The lottery method discussed above becomes quite cumbersome when the
population is very large. An alternative is to use a table of random numbers,
which is the most practical and inexpensive method of selecting a random
sample. Random number tables are constructed so that each of the digits
0, 1, 2, …, 9 appears independently and with approximately the same
frequency.
If we have to select a sample from a population of size N (≤ 99), the digits
can be combined two by two to give pairs from 00 to 99. The random sample is
drawn in the following steps:
1. Identify the N units in the population with the numbers from 1 to N;
2. Select at random any page of the random number tables, pick up the
numbers in any row, column or diagonal at random, and discard any
number that is greater than N; and
3. The population units corresponding to the numbers selected in step 2
constitute the random sample.
The sample may be selected with or without replacement. In sampling with
replacement, a number occurring more than once is accepted, and a unit is
repeated as many times as its number occurs. In sampling without replacement,
a number (or remainder) that has already occurred is omitted at every
subsequent stage. Numbering the units from 00 onwards and using remainders
has the advantage that no random number is wasted during the selection, which
saves time and labor. For example, suppose a population consists of 20 units
and a sample of size 5 is to be selected. Since 20 is a two-digit number, the
units are numbered 00, 01, …, 19. Five random numbers are taken from a
two-digit random number table, say 85, 63, 52, 34 and 46. Dividing 85 by 20
leaves the remainder 5, so the unit with serial number 05 is selected.
Similarly, dividing 63, 52, 34 and 46 by 20 leaves the remainders 3, 12, 14
and 6, respectively. Hence the selected units are those with serial numbers
05, 03, 12, 14 and 06, and these units constitute the sample.
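The remainder approach in the example above can be sketched in a few lines. The function name `select_by_remainder` is hypothetical. The check that 100 is a multiple of the population size is an assumption added here, not a rule from the text: it ensures that every remainder arises from the same number of two-digit values.

```python
def select_by_remainder(random_numbers, population_size):
    """Map two-digit random numbers (00-99) to serial numbers 0..N-1."""
    # Assumption of this sketch: 100 must be a multiple of N so that each
    # remainder is produced by the same count of two-digit numbers.
    if 100 % population_size != 0:
        raise ValueError("remainders would not be equally likely")
    return [number % population_size for number in random_numbers]

units = select_by_remainder([85, 63, 52, 34, 46], 20)
print(units)  # [5, 3, 12, 14, 6]
```

For sampling without replacement one would additionally skip any remainder that has already been selected.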
A number of random number tables are available, such as:
1. Tippett's Random Number Tables
These tables consist of 10,400 four-digit numbers, giving in all 10,400 × 4,
i.e. 41,600 digits.
a = multiplier (an integer)
$$z_1 = (5 \times 4 + 7) \bmod 8 = 3$$
$$z_2 = (5 \times 3 + 7) \bmod 8 = 6$$
$$z_3 = (5 \times 6 + 7) \bmod 8 = 5$$
$$z_4 = (5 \times 5 + 7) \bmod 8 = 0$$
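The four values above follow the recurrence of a linear congruential generator, in which each value is (multiplier × previous value + increment) mod modulus. The following sketch reproduces them. The multiplier 5, the constant 7, the modulus 8 and the starting value 4 are read off the worked lines; the function name `lcg` and the parameter names `increment` and `modulus` are assumptions made for illustration.

```python
def lcg(seed, multiplier, increment, modulus, count):
    """Return `count` successive values of z_i = (a*z_{i-1} + c) mod m."""
    values = []
    z = seed
    for _ in range(count):
        z = (multiplier * z + increment) % modulus
        values.append(z)
    return values

print(lcg(seed=4, multiplier=5, increment=7, modulus=8, count=4))  # [3, 6, 5, 0]
```

With these particular constants the sequence runs through all eight residues 0 to 7 before returning to the starting value 4.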
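$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i \quad \text{(population mean)}$$
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad \text{(sample mean)}$$
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 \quad \text{(population mean square)}$$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 \quad \text{(sample mean square)}$$
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 \quad \text{(population variance)}$$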
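As a small illustration of this notation, the population quantities can be computed with Python's standard `statistics` module: `variance` uses the divisor N − 1 and so corresponds to the mean square, while `pvariance` uses the divisor N. The seven values are the population of the worked example later in this unit; exact fractions are used only to avoid rounding.

```python
from fractions import Fraction
from statistics import mean, pvariance, variance

population = [Fraction(v) for v in range(1, 8)]  # values 1, 2, ..., 7

X_bar = mean(population)        # population mean
S2 = variance(population)       # population mean square, divisor N - 1
sigma2 = pvariance(population)  # population variance, divisor N

print(X_bar, S2, sigma2)  # 4 14/3 4
```

The two measures of spread are linked by σ² = (N − 1)S²/N.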
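Let $E_r$ be the event that a specified unit is selected at the $r$th draw. Then
$$P(E_r) = P\{\text{the specified unit is not selected at any of the previous } (r-1) \text{ draws and is then selected at the } r\text{th draw}\}$$
$$= \prod_{i=1}^{r-1} P(\text{it is not selected at the } i\text{th draw}) \times P(\text{it is selected at the } r\text{th draw})$$
$$P(E_r) = \frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdot\frac{N-3}{N-2}\cdots\frac{N-r+1}{N-r+2}\cdot\frac{1}{N-r+1}$$
$$P(E_r) = \frac{1}{N}$$
That means
$$P(E_r) = \frac{1}{N} = P(E_1)$$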
Theorem 2: The probability that a specified unit is selected in a sample of
size n is $\frac{n}{N}$.
Proof: A specified unit can be included in a sample of size n in n mutually
exclusive ways, namely by being selected at the $r$th draw
($r = 1, 2, \ldots, n$), and the probability that it is selected at the $r$th
draw is
$$P(E_r) = \frac{1}{N}; \quad r = 1, 2, 3, \ldots, n$$
Therefore, the probability that a specified unit is included in the sample is
the sum of the probabilities of its selection at the 1st draw, 2nd draw, …,
nth draw. Thus, by the addition theorem of probability, we get
$$\sum_{r=1}^{n} P(E_r) = \sum_{r=1}^{n}\frac{1}{N} = \frac{n}{N}$$
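The inclusion probability n/N can be checked by brute force for a small population. The sketch below assumes that all $^{N}C_n$ unordered SRSWOR samples are equally likely and counts the share that contain a given unit. The function name and the sizes N = 7, n = 2 are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def inclusion_probability(population_size, sample_size, unit):
    """Share of all equally likely SRSWOR samples that contain `unit`."""
    labels = range(1, population_size + 1)
    samples = list(combinations(labels, sample_size))
    containing = sum(1 for sample in samples if unit in sample)
    return Fraction(containing, len(samples))

print(inclusion_probability(7, 2, unit=3))  # 2/7
```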
Proof: The first unit can be drawn from the N units in $^{N}C_1 = N$ ways. The
second unit can also be drawn in N ways, because the first selected unit is
returned to the population before the next draw, and so on up to the
selection of the nth unit. Hence the total number of possible samples is
$$\left({}^{N}C_1\right)^n = N^n$$
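Proof: We have
$$E(\bar{x}) = E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n}E\left[\sum_{i=1}^{N} a_i X_i\right]$$
where $a_i = 1$ if the $i$th unit is included in the sample and $a_i = 0$ if
the $i$th unit is not included in the sample, so that
$$E(a_i) = 1\cdot\frac{n}{N} + 0\cdot\left(1 - \frac{n}{N}\right) = \frac{n}{N}$$
Hence,
$$E(\bar{x}) = \frac{1}{n}\sum_{i=1}^{N}\frac{n}{N}X_i = \frac{1}{N}\sum_{i=1}^{N}X_i = \bar{X}$$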
Theorem 5: In SRSWR, the sample mean $\bar{x}$ is an unbiased estimator of the
population mean $\bar{X}$.
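Proof: We have
$$\begin{aligned}
E(\bar{x}) &= E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] \\
&= \frac{1}{n}\sum_{i=1}^{n} E(x_i) \\
&= \frac{1}{n}\sum_{i=1}^{n} \bar{X} \\
&= \frac{1}{n}\cdot n\bar{X} = \bar{X}
\end{aligned}$$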
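Unbiasedness of the sample mean under both schemes can be illustrated by listing every possible sample of a tiny population and averaging the sample means. This is a sketch under the assumption that all samples of a scheme are equally likely: unordered samples are used for SRSWOR and ordered samples for SRSWR. The population {2, 3, 4} is the one used in exercise E1, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def expected_sample_mean(population, n, with_replacement):
    """Average of the sample mean over all equally likely samples."""
    if with_replacement:
        samples = list(product(population, repeat=n))  # N**n ordered samples
    else:
        samples = list(combinations(population, n))    # NCn unordered samples
    means = [Fraction(sum(sample), n) for sample in samples]
    return sum(means) / len(means)

population = [2, 3, 4]
print(expected_sample_mean(population, 2, with_replacement=False))  # 3
print(expected_sample_mean(population, 2, with_replacement=True))   # 3
```

Both averages equal the population mean (2 + 3 + 4)/3 = 3.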
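$$\begin{aligned}
s^2 &= \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i^2 + \sum_{i\neq j=1}^{n} x_i x_j\right)\right] \\
&= \frac{1}{n-1}\left[\left(1 - \frac{1}{n}\right)\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\sum_{i\neq j=1}^{n} x_i x_j\right] \\
&= \frac{1}{n-1}\cdot\frac{n-1}{n}\sum_{i=1}^{n} x_i^2 - \frac{1}{n(n-1)}\sum_{i\neq j=1}^{n} x_i x_j \\
&= \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \frac{1}{n(n-1)}\sum_{i\neq j=1}^{n} x_i x_j
\end{aligned}$$
Hence,
$$E(s^2) = \frac{1}{n}E\left[\sum_{i=1}^{n} x_i^2\right] - \frac{1}{n(n-1)}E\left[\sum_{i\neq j=1}^{n} x_i x_j\right] \tag{1}$$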
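We have
$$E\left[\sum_{i=1}^{n} x_i^2\right] = E\left[\sum_{i=1}^{N} a_i X_i^2\right] = \sum_{i=1}^{N} E(a_i)\,X_i^2 \tag{2}$$
where
$$a_i = \begin{cases} 1 & \text{if the } i\text{th unit is included in the sample} \\ 0 & \text{if the } i\text{th unit is not included in the sample} \end{cases} \tag{3}$$
Therefore,
$$E\left[\sum_{i=1}^{n} x_i^2\right] = \frac{n}{N}\sum_{i=1}^{N} X_i^2 \tag{4}$$
and
$$E\left[\sum_{i\neq j=1}^{n} x_i x_j\right] = E\left[\sum_{i\neq j=1}^{N} a_i a_j X_i X_j\right] = \sum_{i\neq j=1}^{N} E(a_i a_j)\,X_i X_j \tag{5}$$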
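where $a_i$ and $a_j$ are defined in equation (3).
Therefore,
$$\begin{aligned}
E(a_i a_j) &= 1\cdot P(a_i a_j = 1) + 0\cdot P(a_i a_j = 0) \\
&= P(a_i = 1 \cap a_j = 1) \\
&= P(a_i = 1)\cdot P(a_j = 1 \mid a_i = 1) \\
&= \frac{n(n-1)}{N(N-1)}
\end{aligned} \tag{6}$$
because
$$P(a_i = 1) = P(i\text{th unit is included in the sample}) = \frac{n}{N}$$
and
$$P(a_j = 1 \mid a_i = 1) = P(j\text{th unit is included in the sample given that the } i\text{th unit is included in the sample}) = \frac{n-1}{N-1}$$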
Substituting in equation (5), we get
$$E\left[\sum_{i\neq j=1}^{n} x_i x_j\right] = \frac{n(n-1)}{N(N-1)}\sum_{i\neq j=1}^{N} X_i X_j \tag{7}$$
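Substituting from equations (4) and (7) in equation (1), we get
$$\begin{aligned}
E(s^2) &= \frac{1}{N}\sum_{i=1}^{N} X_i^2 - \frac{1}{N(N-1)}\sum_{i\neq j=1}^{N} X_i X_j \\
&= \frac{1}{N-1}\left[\frac{N-1}{N}\sum_{i=1}^{N} X_i^2 - \frac{1}{N}\sum_{i\neq j=1}^{N} X_i X_j\right] \\
&= \frac{1}{N-1}\left[\left(1 - \frac{1}{N}\right)\sum_{i=1}^{N} X_i^2 - \frac{1}{N}\sum_{i\neq j=1}^{N} X_i X_j\right] \\
&= \frac{1}{N-1}\left[\sum_{i=1}^{N} X_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} X_i^2 + \sum_{i\neq j=1}^{N} X_i X_j\right)\right] \\
&= \frac{1}{N-1}\left[\sum_{i=1}^{N} X_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} X_i\right)^2\right] \\
&= \frac{1}{N-1}\left[\sum_{i=1}^{N} X_i^2 - N\bar{X}^2\right] = S^2
\end{aligned}$$
$$E(s^2) = S^2$$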
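The result E(s²) = S² can be confirmed numerically on a small population by averaging s² over every SRSWOR sample. This is an illustrative sketch, assuming all $^{N}C_n$ samples are equally likely. The population 1, …, 7 is the one used in this unit's worked example; the sample size 3 and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from statistics import variance  # divisor (number of values - 1)

def expected_s2(population, n):
    """Average of the sample mean square over all SRSWOR samples."""
    values = [Fraction(v) for v in population]
    s2_values = [variance(sample) for sample in combinations(values, n)]
    return sum(s2_values) / len(s2_values)

population = [1, 2, 3, 4, 5, 6, 7]
S2 = variance([Fraction(v) for v in population])  # population mean square
print(expected_s2(population, 3) == S2)  # True
```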
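But
$$\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 = \sum_{i=1}^{N} X_i^2 - N\bar{X}^2$$
so that
$$\sum_{i=1}^{N} X_i^2 = \sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 + N\bar{X}^2$$
$$\sum_{i=1}^{N} X_i^2 = (N-1)S^2 + N\bar{X}^2$$
Therefore, from equation (4),
$$E\left[\sum_{i=1}^{n} x_i^2\right] = n\left[\frac{N-1}{N}S^2 + \bar{X}^2\right] \tag{10}$$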
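Also from equation (7), we have
$$\begin{aligned}
E\left[\sum_{i\neq j=1}^{n} x_i x_j\right] &= \frac{n(n-1)}{N(N-1)}\sum_{i\neq j=1}^{N} X_i X_j \\
&= \frac{n(n-1)}{N(N-1)}\left[\left(\sum_{i=1}^{N} X_i\right)^2 - \sum_{i=1}^{N} X_i^2\right] \\
&= \frac{n(n-1)}{N(N-1)}\left[N^2\bar{X}^2 - (N-1)S^2 - N\bar{X}^2\right] \\
&= \frac{n(n-1)}{N(N-1)}\left[N(N-1)\bar{X}^2 - (N-1)S^2\right] \\
&= n(n-1)\left[\bar{X}^2 - \frac{S^2}{N}\right]
\end{aligned} \tag{11}$$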
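Substituting from equations (10) and (11) in equation (9), we get
$$\begin{aligned}
E(\bar{x}^2) &= \frac{1}{n}\left[\left(1 - \frac{1}{N}\right)S^2 + \bar{X}^2\right] + \frac{n-1}{n}\left[\bar{X}^2 - \frac{S^2}{N}\right] \\
&= \frac{1}{n}\bar{X}^2 + \left(1 - \frac{1}{n}\right)\bar{X}^2 + \frac{1}{n}\left(1 - \frac{1}{N}\right)S^2 - \frac{1}{N}\left(1 - \frac{1}{n}\right)S^2 \\
&= \bar{X}^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S^2
\end{aligned} \tag{12}$$
Substituting from equation (12) in equation (8), we get
$$\mathrm{Var}(\bar{x}) = \bar{X}^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S^2 - \bar{X}^2$$
$$\mathrm{Var}(\bar{x}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^2$$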
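$$\mathrm{Var}_{SRSWR}(\bar{x}) > \mathrm{Var}_{SRSWOR}(\bar{x})$$
Proof: We have
$$\mathrm{Var}_{SRSWR}(\bar{x}) = \frac{N-1}{nN}S^2$$
and
$$\mathrm{Var}_{SRSWOR}(\bar{x}) = \frac{N-n}{nN}S^2$$
Therefore,
$$\begin{aligned}
\mathrm{Var}_{SRSWR}(\bar{x}) - \mathrm{Var}_{SRSWOR}(\bar{x}) &= \frac{N-1}{nN}S^2 - \frac{N-n}{nN}S^2 \\
&= \frac{1}{Nn}S^2\left[(N-1) - (N-n)\right] \\
&= \frac{1}{Nn}S^2\,(n-1) \\
&= \frac{n-1}{nN}S^2 > 0 \quad \text{for } n > 1
\end{aligned}$$
That implies
$$\mathrm{Var}_{SRSWR}(\bar{x}) > \mathrm{Var}_{SRSWOR}(\bar{x})$$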
That is, the variance of the sample mean is larger in SRSWR than in SRSWOR.
In other words, SRSWOR provides a more efficient estimator of the population
mean than SRSWR.
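The two variance formulas and their difference can be evaluated directly. The sketch below simply codes the expressions (N − 1)S²/(nN) and (N − n)S²/(nN). The function names are hypothetical, and the inputs S² = 14/3, N = 7 and n = 2 are taken from the worked example of this unit.

```python
from fractions import Fraction

def var_mean_srswr(S2, N, n):
    """Variance of the sample mean under SRSWR: (N - 1) S^2 / (n N)."""
    return Fraction(N - 1, n * N) * S2

def var_mean_srswor(S2, N, n):
    """Variance of the sample mean under SRSWOR: (N - n) S^2 / (n N)."""
    return Fraction(N - n, n * N) * S2

S2, N, n = Fraction(14, 3), 7, 2
gap = var_mean_srswr(S2, N, n) - var_mean_srswor(S2, N, n)
print(var_mean_srswr(S2, N, n), var_mean_srswor(S2, N, n), gap)  # 2 5/3 1/3
```

The gap equals (n − 1)S²/(nN), which is positive whenever n > 1.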
2.3.1 Merits and Demerits of Simple Random Sampling
Merits
Simple random sampling has the following merits:
1. In simple random sampling each unit of the population has an equal chance
of being included in the sample; and
2. The efficiency of the estimates can be assessed in simple random sampling,
because all the estimates are derived using probability theory.
Demerits
Despite these merits, simple random sampling has some demerits too, viz.
1. An up-to-date frame of the population is required in simple random sampling;
2. Administrative inconvenience arises in simple random sampling if some of
the selected units are spread over a wide area, so collecting information
from these units may be a problem; and
3. SRS generally requires a larger sample size than other sampling methods
for a fixed level of precision.
$$\mathrm{Var}_{SRSWR}(\bar{x}) > \mathrm{Var}_{SRSWOR}(\bar{x})$$
Solution: We have
X = 1, 2, 3, 4, 5, 6, 7
$$\bar{X} = \frac{1 + 2 + 3 + 4 + 5 + 6 + 7}{7} = 4$$
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 = \frac{1}{6}\left(9 + 4 + 1 + 0 + 1 + 4 + 9\right) = \frac{28}{6} = 4.667$$
All possible samples of size 2 are as follows:
No.   Sample   x̄     x̄ − X̄   (x̄ − X̄)²
1     (1,2)    1.5    -2.5     6.25
2     (1,3)    2.0    -2.0     4.00
3     (1,4)    2.5    -1.5     2.25
4     (1,5)    3.0    -1.0     1.00
5     (1,6)    3.5    -0.5     0.25
6     (1,7)    4.0     0       0
7     (2,3)    2.5    -1.5     2.25
8     (2,4)    3.0    -1.0     1.00
9     (2,5)    3.5    -0.5     0.25
10    (2,6)    4.0     0       0
11    (2,7)    4.5    +0.5     0.25
12    (3,4)    3.5    -0.5     0.25
13    (3,5)    4.0     0       0
14    (3,6)    4.5    +0.5     0.25
15    (3,7)    5.0    +1.0     1.00
16    (4,5)    4.5    +0.5     0.25
17    (4,6)    5.0    +1.0     1.00
18    (4,7)    5.5    +1.5     2.25
19    (5,6)    5.5    +1.5     2.25
20    (5,7)    6.0    +2.0     4.00
21    (6,7)    6.5    +2.5     6.25
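$$\sum_{i=1}^{21}\bar{x}_i = 84.0 \quad \text{and} \quad \sum_{i=1}^{21}\left(\bar{x}_i - \bar{X}\right)^2 = 35.00$$
$$E(\bar{x}) = \frac{1}{{}^{N}C_n}\sum_{i=1}^{{}^{N}C_n}\bar{x}_i = \frac{84}{21} = 4 = \bar{X}$$
$$\mathrm{Var}(\bar{x}) = \frac{1}{{}^{N}C_n}\sum_{i=1}^{{}^{N}C_n}\left(\bar{x}_i - \bar{X}\right)^2 = \frac{1}{21}(35.00) = 1.667$$
$$\sigma^2 = \frac{N-1}{N}S^2 = \frac{6}{7}\times 4.667 = 4.000$$
Verification: In SRSWOR the variance of the sample mean is given by
$$\mathrm{Var}(\bar{x}) = \frac{N-n}{Nn}S^2 = \frac{7-2}{7\times 2}\times 4.667 = 1.667$$
In SRSWR the variance of the sample mean is
$$\mathrm{Var}(\bar{x}) = \frac{\sigma^2}{n} = \frac{4.000}{2} = 2.000$$
which is larger than the SRSWOR value 1.667.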
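The figures of this example can be reproduced by enumerating the samples instead of tabulating them by hand. In the sketch below the 21 unordered samples represent SRSWOR and the 49 ordered samples represent SRSWR, each set being treated as equally likely. The helper name is hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3, 4, 5, 6, 7]
n = 2

def mean_and_variance_of_sample_mean(samples):
    """Expectation and variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(sample), n) for sample in samples]
    expectation = sum(means) / len(means)
    var = sum((m - expectation) ** 2 for m in means) / len(means)
    return expectation, var

e_wor, var_wor = mean_and_variance_of_sample_mean(list(combinations(population, n)))
e_wr, var_wr = mean_and_variance_of_sample_mean(list(product(population, repeat=n)))
print(e_wor, var_wor)  # 4 5/3
print(e_wr, var_wr)    # 4 2
```

Here 5/3 ≈ 1.667 is the SRSWOR variance obtained from the table, and 2 is the SRSWR variance σ²/n.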
E1) Draw all possible samples of size 2 from the population {2, 3, 4} and
verify that $E(\bar{x}) = \bar{X}$. Also find the variance of $\bar{x}$.
E2) How many random samples of size 5 can be drawn from a population of
size 10 if sampling is done with replacement?
$$\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 = 45$$
Let us consider an SRSWOR sample of size n from this population. If a is the
number of units in the sample possessing the given attribute, then
$$\sum_{i=1}^{n} x_i = a$$
and
$$\sum_{i=1}^{N} X_i = A,$$
the number of units in the population possessing the given attribute.
Thus,
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{A}{N} = \pi$$
and
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{a}{n} = p$$
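Similarly,
$$\sum_{i=1}^{N} X_i^2 = A = N\pi$$
and
$$\sum_{i=1}^{n} x_i^2 = a = np$$
$$\begin{aligned}
S^2 &= \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 = \frac{1}{N-1}\left[\sum_{i=1}^{N} X_i^2 - N\bar{X}^2\right] \\
&= \frac{1}{N-1}\left[N\pi - N\pi^2\right] = \frac{N\pi(1-\pi)}{N-1}
\end{aligned}$$
Similarly,
$$\begin{aligned}
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right] \\
&= \frac{1}{n-1}\left[np - np^2\right] = \frac{npq}{n-1}
\end{aligned}$$
where $q = 1 - p$.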
Proof: We have
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{a}{n} = p$$
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{A}{N} = \pi$$
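Theorem 11: In SRSWOR, show that
$$\mathrm{Var}(p) = \frac{N-n}{N-1}\cdot\frac{\pi(1-\pi)}{n}$$
Proof: We have
$$\begin{aligned}
\mathrm{Var}(p) &= \mathrm{Var}(\bar{x}) \\
&= \frac{N-n}{nN}S^2 \\
&= \frac{N-n}{nN}\cdot\frac{N\pi(1-\pi)}{N-1} \\
&= \frac{N-n}{N-1}\cdot\frac{\pi(1-\pi)}{n}
\end{aligned}$$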
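The formula of Theorem 11 can be compared with a direct enumeration for a small 0/1 population. In this sketch a unit is coded 1 if it has the attribute and 0 otherwise. The population of five units with two having the attribute, the sample size 3 and the function names are all hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def var_p_formula(N, n, pi):
    """Var(p) = (N - n)/(N - 1) * pi(1 - pi)/n for SRSWOR."""
    return Fraction(N - n, N - 1) * pi * (1 - pi) / n

def var_p_enumerated(indicators, n):
    """Variance of the sample proportion over all equally likely SRSWOR samples."""
    props = [Fraction(sum(sample), n) for sample in combinations(indicators, n)]
    centre = sum(props) / len(props)
    return sum((p - centre) ** 2 for p in props) / len(props)

indicators = [1, 1, 0, 0, 0]  # hypothetical population: A = 2 of N = 5 units
pi = Fraction(sum(indicators), len(indicators))
print(var_p_formula(5, 3, pi), var_p_enumerated(indicators, 3))  # 1/25 1/25
```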
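$$P\left(\left|\bar{x} - \bar{X}\right| < d\right) = 1 - \alpha \tag{15}$$
or
$$P\left(\left|\bar{x} - \bar{X}\right| \geq d\right) = \alpha \tag{16}$$
$$Z = \frac{\bar{x} - E(\bar{x})}{SE(\bar{x})} = \frac{\bar{x} - \bar{X}}{\sqrt{\mathrm{Var}(\bar{x})}} = \frac{\bar{x} - \bar{X}}{S\sqrt{\dfrac{1}{n} - \dfrac{1}{N}}} \tag{17}$$
$$P\left(|Z| \geq 1.96\right) = 0.05$$
$$P\left(\frac{\left|\bar{x} - \bar{X}\right|}{S\sqrt{\dfrac{1}{n} - \dfrac{1}{N}}} \geq 1.96\right) = 0.05$$
$$P\left(\left|\bar{x} - \bar{X}\right| \geq 1.96\,S\sqrt{\frac{1}{n} - \frac{1}{N}}\right) = 0.05$$
$$d = 1.96\,S\sqrt{\frac{1}{n} - \frac{1}{N}}$$
$$\frac{d^2}{S^2(1.96)^2} = \frac{1}{n} - \frac{1}{N}$$
$$n = \frac{NS^2(1.96)^2}{Nd^2 + S^2(1.96)^2} = \frac{3.84\,NS^2}{3.84\,S^2 + Nd^2} \tag{18}$$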
This formula gives the sample size in SRSWOR for estimating population
mean with confidence level 95 % and margin of error d, provided n is large.
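Formula (18) is easy to evaluate numerically. The sketch below codes it with the normal critical value 1.96 as a default. The function name and the inputs N = 1000, S² = 25 and d = 1 are hypothetical. Rounding the result up to the next whole unit is an assumption made here so that the margin of error does not exceed d.

```python
import math

def sample_size_srswor(N, S2, d, z=1.96):
    """Sample size n = N S^2 z^2 / (N d^2 + S^2 z^2), rounded up."""
    n = N * S2 * z**2 / (N * d**2 + S2 * z**2)
    return math.ceil(n)  # rounding up is an assumption of this sketch

n_required = sample_size_srswor(N=1000, S2=25.0, d=1.0)
print(n_required)  # 88
```

In practice S² is unknown and has to be replaced by an estimate, for instance s² from a pilot sample.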
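$$t = \frac{\bar{x} - \bar{X}}{S\sqrt{\dfrac{1}{n} - \dfrac{1}{N}}}$$
If $t_\alpha$ is the critical value of t for (n−1) df at level of significance $\alpha$, then
$$P\left(\left|\bar{x} - \bar{X}\right| \geq S\sqrt{\frac{1}{n} - \frac{1}{N}}\cdot t_\alpha\right) = \alpha \tag{19}$$
$$d = S\sqrt{\frac{1}{n} - \frac{1}{N}}\cdot t_\alpha$$
$$\frac{d^2}{S^2 t_\alpha^2} = \frac{1}{n} - \frac{1}{N}$$
$$n = \frac{NS^2 t_\alpha^2}{Nd^2 + S^2 t_\alpha^2} = \frac{S^2 t_\alpha^2}{d^2 + \dfrac{S^2 t_\alpha^2}{N}}$$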
2.6 SUMMARY
In this unit, we have discussed:
1. The simple random sampling;
2. The method of SRSWR and SRSWOR;
3. The properties of simple random sampling;
4. Method of finding the variance of the estimate of the sample mean;
5. The simple random sampling for attribute and its properties; and
6. The sample size determination for specific precision.
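$$\sum_{i=1}^{3}\bar{x}_i = 9$$
$$E(\bar{x}) = \frac{1}{{}^{N}C_n}\sum_{i=1}^{{}^{N}C_n}\bar{x}_i = \frac{9}{3} = 3$$
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{1}{3}(2 + 3 + 4) = \frac{9}{3} = 3$$
Therefore,
$$E(\bar{x}) = \bar{X}$$
Again,
$$V(\bar{x}) = \frac{N-n}{N\cdot n}S^2$$
where
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 = \frac{1}{2}\left[(2-3)^2 + (3-3)^2 + (4-3)^2\right] = 1$$
Therefore,
$$\mathrm{Var}(\bar{x}) = \frac{3-2}{3\times 2}\times 1 = \frac{1}{6} = 0.167$$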
E2) The first unit can be drawn from 10 units in $^{10}C_1 = 10$ ways. Since
sampling is done with replacement, the second unit can also be drawn in
$^{10}C_1 = 10$ ways, and so on up to the selection of the 5th unit. Thus the
total number of ways is $10 \times 10 \times 10 \times 10 \times 10 = 10^5$.
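E3) We have
$$\sum_{i=1}^{n} x_i = 48$$
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{10}\times 48 = 4.8$$
So
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
$$s^2 = \frac{1}{9}\times 36 = 4$$
which is the estimated value of $S^2$.
Therefore,
$$\text{Variance}(\bar{x}) = \frac{N-n}{Nn}S^2 = \frac{50-10}{50\times 10}\times 4 = \frac{16}{50} = 0.32$$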
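E4) In SRSWOR the number of samples is $^{N}C_n = {}^{3}C_2 = 3$, and the
samples with their means are (8, 12) with mean 10, (8, 16) with mean 12 and
(12, 16) with mean 14, so that
$$\sum_{i=1}^{3}\bar{x}_i = 36$$
Therefore,
$$E(\bar{x}) = \frac{1}{{}^{N}C_n}\sum_{i=1}^{{}^{N}C_n}\bar{x}_i = \frac{1}{3}\sum_{i=1}^{3}\bar{x}_i = \frac{1}{3}\times 36 = 12$$
Again,
$$\bar{X} = \frac{8 + 12 + 16}{3} = 12$$
Therefore,
$$E(\bar{x}) = \bar{X}$$
Again, the estimator of the population mean is the sample mean, and so its
variance is
$$\mathrm{Var}(\bar{x}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^2$$
where
$$\begin{aligned}
S^2 &= \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2 \\
&= \frac{1}{3-1}\left[(8-12)^2 + (12-12)^2 + (16-12)^2\right] \\
&= \frac{1}{2}\left[16 + 0 + 16\right] = 16
\end{aligned}$$
Therefore,
$$\mathrm{Var}(\bar{x}) = \left(\frac{1}{2} - \frac{1}{3}\right)\times 16 = \frac{3-2}{6}\times 16 = \frac{16}{6} = \frac{8}{3} = 2.67$$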
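E5) We have
$$\mathrm{Var}(\bar{x}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^2$$
and
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 = \frac{1}{9}\times 45 = 5$$
which is the estimated value of $S^2$.
Therefore,
$$\mathrm{Var}(\bar{x}) = \left(\frac{1}{10} - \frac{1}{100}\right)\times 5 = \frac{9}{100}\times 5 = 0.45$$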