Sampling CH-8
8. SYSTEMATIC SAMPLING
8.1. Definition of Systematic Sample and Selection Procedures
Systematic sampling is a commonly used selection procedure. It means selecting units at a fixed
interval from a list, starting from a randomly determined point. In other words, units are selected
from a list using a selection interval, say k, so that every k-th element on the list, following a
random start chosen between 1 and k, is included in the sample. Systematic sampling is normally
carried out as follows, depending on the case.
8.1.1. When the interval (k) is an integer or assumes a whole number
For example, consider the farm survey in which n = 8,000 farms are to be selected using
systematic sampling from N = 144,000 farms. The selection interval is
$k = \frac{N}{n} = \frac{144{,}000}{8{,}000} = 18.$
We must find a random start number between 1 and 18 inclusive, and then take every 18th farm
thereafter. If the random start number is 12, then the units in the sample correspond to the
farms numbered 12, 30, 48, 66, 84, …, 143994. Once the sampling interval (k) is determined, the
random selection of the starting point determines the whole sample. In this case there are 18
possible samples that can be chosen, one for each start j = 1, 2, …, 18.
For this design the sampling fraction is
$f = \frac{1}{k} = \frac{1}{N/n} = \frac{n}{N} = \frac{1}{18}.$
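The selection rule for an integer interval can be sketched in a few lines of Python. This is only an illustration under the assumption that N is an exact multiple of n; the function name `systematic_sample` and its `start` parameter are hypothetical names introduced here, not part of the notes.

```python
import random

def systematic_sample(N, n, start=None):
    """Return the unit numbers (1..N) of a systematic sample.

    Assumes k = N / n is a whole number, as in the farm example.
    """
    if N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n                          # selection interval
    if start is None:
        start = random.randint(1, k)    # random start, 1..k inclusive
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    # Take the start and every k-th unit after it.
    return [start + i * k for i in range(n)]

sample = systematic_sample(144_000, 8_000, start=12)
print(sample[:5], sample[-1])   # [12, 30, 48, 66, 84] 143994
print(len(sample))              # 8000
```

Fixing `start=12` reproduces the farm numbers listed above; leaving it out draws one of the 18 possible samples at random.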
1. Choose a random number j between 1 and N, where N is the number of elements in the
population.
2. Compute the quotient j/k, where k is the sampling interval. Express this quotient as an
integer plus a remainder.
3. If the remainder is 0, take a systematic sample of 1 in k elements in the usual way,
beginning with element k. If the remainder is nonzero, say m, take a systematic sample of
1 in k elements, beginning with element m.
Instead of taking a random number between 1 and 6 to start the systematic sampling, we must take
a random number j between 1 and 26, and j will have 26 possible values.
Let j be 8. We then divide 8 by 6 and determine the remainder, i.e., $\frac{j}{k} = \frac{8}{6} = 1\tfrac{2}{6}$, so the
remainder is 2 (m = 2). We take a systematic sample of 1 in 6, starting with the 2nd employee,
and obtain 2, 8, 14, 20 on the list.
When j = 24, the quotient $\frac{j}{k} = \frac{24}{6} = 4$, and the remainder is m = 0. In this case we take a
systematic sample of 1 in 6, starting with element k = 6. Then we obtain 6, 12, 18, 24 on the list.
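The remainder rule in the three steps above can be sketched as follows. The helper names `start_from_remainder` and `one_in_k` are hypothetical, chosen only for this illustration; the sketch assumes units are numbered 1 to N.

```python
import random

def start_from_remainder(j, k):
    """Starting element implied by random number j and interval k."""
    m = j % k                  # remainder of j / k
    return k if m == 0 else m  # remainder 0 means start at element k

def one_in_k(start, k, N):
    """Unit numbers start, start + k, ... that do not exceed N."""
    return list(range(start, N + 1, k))

print(start_from_remainder(8, 6))    # 2
print(start_from_remainder(24, 6))   # 6
print(one_in_k(6, 6, 26))            # [6, 12, 18, 24]

# In practice j is drawn at random between 1 and N inclusive.
j = random.randint(1, 26)
print(j, start_from_remainder(j, 6))
```

Drawing j from 1 to N rather than from 1 to k is what distinguishes this procedure from the integer-interval case.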
Example: For illustration, let the population size be N = 14 and the sample size n = 5. The
value of the interval is $k = \frac{N}{n} = \frac{14}{5} = 2\tfrac{4}{5} \approx 3.$
We choose a random start j between 1 and 14. Let j = 7. Then a systematic sample of 1 in 3 would
produce a sample with the following serial numbers.
For i = 0, 1, 2, 3, 4 and j = 7, each number is j + ik when this does not exceed N, and j + ik − N
otherwise.
7+0x3=7, if i=0
7+1x3=10, if i=1
7+2x3=13, if i=2
7+3x3-14=2, if i=3
7+4x3-14=5, if i=4
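The wrap-around rule used in this example can be written compactly with modular arithmetic. The following is a minimal sketch; the function name `circular_systematic_sample` is a label introduced here for illustration.

```python
def circular_systematic_sample(N, n, k, j):
    """Serial numbers j + i*k for i = 0..n-1, wrapping past N back to 1.

    Equivalent to using j + i*k when it is at most N and j + i*k - N
    otherwise, provided the sample does not wrap around more than once.
    """
    return [((j - 1 + i * k) % N) + 1 for i in range(n)]

print(circular_systematic_sample(N=14, n=5, k=3, j=7))   # [7, 10, 13, 2, 5]
```

Shifting to zero-based positions before taking the remainder keeps unit N from being mapped to 0.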
If the units are listed randomly, this is similar to simple random sampling. But if the units are
listed in order by a specific attribute such as age group or income group, then this is similar to
stratified sampling. Such an ordering tends to reduce the sampling error relative to a randomly
ordered list, because each group is represented in the sample in roughly the same proportion as in
the underlying population.
Several formulas have been developed for the variance of $\bar{y}_{sy}$, the mean of a systematic sample.
However, a systematic sample can be treated just like a simple random sample for the purpose
of analysis, provided that certain conditions are met.
The basic requirement is that the list used as the sampling frame must not have any intrinsic
regularity or periodicity of its own.
$\bar{Y} = \frac{\sum_{j=1}^{k}\sum_{r=1}^{n} y_{jr}}{N} = \frac{\sum_{j=1}^{k} \bar{y}_j}{k}$, the mean of sample means
$S^2 = \frac{\sum_{j=1}^{k}\sum_{r=1}^{n} (y_{jr} - \bar{Y})^2}{N-1} = \frac{\sum_{j=1}^{k}\sum_{r=1}^{n} (y_{jr} - \bar{Y})^2}{nk-1}$
Variance of the sample means: $S_a^2 = \frac{\sum_{j=1}^{k} (\bar{y}_j - \bar{Y})^2}{k-1}$
As in cluster sampling, the parameter $\rho_w$ is the correlation coefficient between pairs of units
that are in the same systematic sample. The variance of the means in terms of $\rho_w$ is
$S_a^2 = \frac{(N-1)\,S^2\,[1 + (n-1)\rho_w]}{n^2 (k-1)}.$
In systematic sampling, the estimated mean, denoted by $\bar{y}_{sy}$, is simply one of the $\bar{y}_j$'s,
depending on which random number was chosen to start sampling. Therefore,
$\bar{y}_{sy} = \frac{\sum_{r=1}^{n} y_{jr}}{n} = \bar{y}_j.$
Theorem 8.1: Let $y_{jr}$ denote the r-th element of the j-th systematic sample with N = nk, where
j = 1, 2, …, k and r = 1, 2, …, n. Then the mean of a systematic sample, $\bar{y}_{sy}$, is an unbiased
estimate of $\bar{Y}$, and its variance is
$V(\bar{y}_{sy}) = \frac{\sum_{j=1}^{k} (\bar{y}_j - \bar{Y})^2}{k} = \frac{k-1}{k} S_a^2$, where $S_a^2 = \frac{\sum_{j=1}^{k} (\bar{y}_j - \bar{Y})^2}{k-1}$.
Prove this theorem.
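A small numerical check can make the statement of Theorem 8.1 concrete before attempting the proof. The sketch below enumerates all k systematic samples of a made-up population (the twelve values in `y` are hypothetical and chosen only for illustration) and uses exact fractions to avoid rounding error. It illustrates the theorem and is not a proof.

```python
from fractions import Fraction

def mean(values):
    """Exact arithmetic mean of a list of integers."""
    return Fraction(sum(values), len(values))

# Hypothetical population with N = n * k = 4 * 3 = 12 units.
y = [3, 7, 4, 9, 12, 5, 8, 15, 10, 6, 11, 14]
k = 3

# The j-th systematic sample takes every k-th unit from position j.
samples = [y[j::k] for j in range(k)]
sample_means = [mean(s) for s in samples]

Y_bar = mean(y)                                   # population mean
expected_mean = sum(sample_means) / k             # E(y_sy): each start has probability 1/k
var_sy = sum((m - Y_bar) ** 2 for m in sample_means) / k
S_a2 = sum((m - Y_bar) ** 2 for m in sample_means) / (k - 1)

print(expected_mean == Y_bar)                     # True (unbiasedness)
print(var_sy == Fraction(k - 1, k) * S_a2)        # True
```

Because each of the k starts is equally likely, the expectation of the sample mean is just the plain average of the k sample means, which is what `expected_mean` computes.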
If $\rho_w < -\frac{1}{N-1}$, then $\bar{y}_{sy}$ is more precise than the mean of a simple random sample. This can
occur when the population is ordered with respect to the variable.
If $\rho_w = -\frac{1}{N-1}$, or $\rho_w = 0$ for large N, then they are equally efficient. This can occur when
the population's elements are arranged in random order with respect to the variable under
study.
If $\rho_w > -\frac{1}{N-1}$, or $\rho_w > 0$ for large N, then $\bar{y}_{sy}$ is less efficient. This can happen when
the population's elements are arranged in a list demonstrating a high degree of periodicity.
$V(\hat{Y}_{sy}) = \frac{N^2 (k-1)}{k\,m\,(m-1)} \sum_{i=1}^{m} (\bar{y}_i - \bar{y}_{sy})^2$
$V(\hat{Y}_{sy}) = \frac{(300)^2 (10-1)}{10 \times 3 \times 2} \left[(10.055 - 9.83)^2 + (9.755 - 9.83)^2 + (9.68 - 9.83)^2\right] = 1063.125$
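The arithmetic above can be reproduced with a short function. This is a sketch under one assumption that the numbers themselves suggest: $\bar{y}_{sy}$ is taken as the average of the m sample means (here (10.055 + 9.755 + 9.68)/3 = 9.83). The function name is hypothetical.

```python
def var_total_from_sample_means(N, k, sample_means):
    """Variance estimate of the total from m systematic sample means.

    Implements N^2 (k - 1) / (k m (m - 1)) * sum (ybar_i - ybar_sy)^2,
    with ybar_sy assumed to be the mean of the m sample means.
    """
    m = len(sample_means)
    if m < 2:
        raise ValueError("at least two sample means are needed")
    y_bar_sy = sum(sample_means) / m
    ss = sum((yb - y_bar_sy) ** 2 for yb in sample_means)
    return N ** 2 * (k - 1) / (k * m * (m - 1)) * ss

v = var_total_from_sample_means(N=300, k=10, sample_means=[10.055, 9.755, 9.68])
print(round(v, 3))   # 1063.125
```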
Suppose a population has a linear trend that can be expressed in the form $y_i = i$. That is,
$y_1 = 1, y_2 = 2, y_3 = 3, \ldots, y_i = i, \ldots, y_N = N$, for a population of size N.
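$\sum_{i=1}^{N} y_i = \sum_{i=1}^{N} i = \frac{N(N+1)}{2}$, $\quad \sum_{i=1}^{N} y_i^2 = \sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6}$
$V(y) = S^2 = \frac{1}{N-1}\left(\sum_{i=1}^{N} y_i^2 - \frac{1}{N}\Big(\sum_{i=1}^{N} y_i\Big)^2\right) = \frac{1}{N-1}\left(\frac{N(N+1)(2N+1)}{6} - \frac{1}{N}\Big(\frac{N(N+1)}{2}\Big)^2\right)$
$S^2 = \frac{N(N+1)}{12}$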
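Similarly, the means of the systematic samples, $\bar{y}_j$, are in increasing order and 1 unit apart
from each other: $\bar{y}_j = j + \frac{(n-1)k}{2}$ for j = 1, 2, …, k, the k possible systematic samples.
Because adding the same constant to every mean does not change their variance, the means can
be treated as 1, 2, …, k when computing $V(\bar{y}_{sy})$.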
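$V(\bar{y}_{sy}) = \frac{1}{k} \sum_{j=1}^{k} (\bar{y}_j - \bar{Y})^2$
$V(\bar{y}_{sy}) = \frac{1}{k}\left(\sum_{j=1}^{k} \bar{y}_j^2 - \frac{1}{k}\Big(\sum_{j=1}^{k} \bar{y}_j\Big)^2\right) = \frac{1}{k}\left(\frac{k(k+1)(2k+1)}{6} - \frac{1}{k}\Big(\frac{k(k+1)}{2}\Big)^2\right) = \frac{k^2-1}{12}$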
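$V(\bar{y}_{sy}) \le V(\bar{y})$, where $V(\bar{y})$ is the variance of the mean of a simple random sample, if and
only if $\frac{k^2-1}{12} \le \frac{(k-1)(nk+1)}{12}$. If n = 1, they are equal; otherwise the systematic
sample is more efficient.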
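The two closed forms for the linear-trend population can be checked by brute force. The sketch below builds $y_i = i$ for a chosen n and k, enumerates every systematic sample, and compares the results with the formulas. It assumes the usual without-replacement expression $V(\bar{y}) = \frac{N-n}{N}\cdot\frac{S^2}{n}$ for the simple random sample mean, which is not written out in this section; the function name is hypothetical.

```python
from fractions import Fraction

def linear_trend_variances(n, k):
    """Return (S^2, V(ybar_sy), V(ybar_srs)) for y_i = i with N = n * k."""
    N = n * k
    y = list(range(1, N + 1))
    Y_bar = Fraction(sum(y), N)
    S2 = sum((v - Y_bar) ** 2 for v in y) / (N - 1)
    # Means of the k possible systematic samples.
    means = [Fraction(sum(y[j::k]), n) for j in range(k)]
    v_sy = sum((m - Y_bar) ** 2 for m in means) / k
    # Assumed SRS variance of the mean (sampling without replacement).
    v_srs = Fraction(N - n, N) * S2 / n
    return S2, v_sy, v_srs

n, k = 4, 5
S2, v_sy, v_srs = linear_trend_variances(n, k)
print(S2, v_sy, v_srs)                            # 35 2 7
print(v_sy == Fraction(k * k - 1, 12))            # True
print(v_srs == Fraction((k - 1) * (n * k + 1), 12))  # True
```

With n = 4 and k = 5 the systematic sample has variance 2 against 7 for simple random sampling, which matches the inequality above.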
b) Periodicity: