
Sampling Theory: Two Stage Sampling (Sub Sampling)

Two stage sampling involves sampling units in two stages: 1. At the first stage, a sample of clusters is selected from the population. 2. At the second stage, a sample of elements is selected from each cluster selected in the first stage. This allows distributing a given number of elements over more clusters for higher precision than taking a single stage sample. Expectations under two stage sampling depend on both stages, as the second stage is conditional on selection in the first stage.

Uploaded by

Baba

Sampling Theory

MODULE X
LECTURE - 33
TWO STAGE SAMPLING
(SUB SAMPLING)

DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR

In cluster sampling, all the elements in the selected clusters are surveyed. Moreover, the efficiency of cluster sampling depends on the size of the cluster: as the size increases, the efficiency decreases. This suggests that higher precision can be attained by distributing a given number of elements over a large number of clusters than by taking a small number of clusters and enumerating all elements within them. This is achieved in subsampling.

In subsampling

• Divide the population into clusters.

• Select a sample of clusters [first stage].

• From each of the selected clusters, select a sample of a specified number of elements [second stage].

The clusters which form the units of sampling at the first stage are called the first stage units, and the units or groups of units within the clusters, which form the units of sampling at the second stage, are called the second stage units or subunits.

The procedure is generalized to three or more stages and is then termed multistage sampling.

For example, in a crop survey

• villages are the first stage units,

• fields within the villages are the second stage units and

• plots within the fields are the third stage units.

In another example, to obtain a sample of fish from a commercial fishery,

• first take a sample of boats and

• then take a sample of fish from each selected boat.

Two stage sampling with equal first stage units
Assume that
• the population consists of NM elements.
• the NM elements are grouped into N first stage units of M second stage units each (i.e., N clusters, each of size M).
• a sample of n first stage units is selected (i.e., choose n clusters).
• a sample of m second stage units is selected from each selected first stage unit (i.e., choose m units from each cluster).
• units at each stage are selected by simple random sampling without replacement (SRSWOR).
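
The selection scheme just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `two_stage_sample`, the 0-based indexing and the `seed` argument are assumptions made for the example, not part of the lecture.

```python
import random

def two_stage_sample(N, M, n, m, seed=None):
    """Select n of N clusters, then m of M units in each, both by SRSWOR.

    Returns a dict mapping each selected cluster index to the sorted list of
    selected unit indices inside it (all indices are 0-based).
    """
    if not (1 <= n <= N and 1 <= m <= M):
        raise ValueError("need 1 <= n <= N and 1 <= m <= M")
    rng = random.Random(seed)
    # First stage: n clusters without replacement.
    clusters = sorted(rng.sample(range(N), n))
    # Second stage: m units without replacement inside every selected cluster.
    return {i: sorted(rng.sample(range(M), m)) for i in clusters}

# Hypothetical layout: 10 clusters of 6 units, take 4 clusters and 3 units each.
sample = two_stage_sample(N=10, M=6, n=4, m=3, seed=1)
for cluster, units in sample.items():
    print(cluster, units)
```

Calling the function with m = M returns every unit of each selected cluster, which is the cluster sampling special case.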

Cluster sampling is a special case of two stage sampling in the sense that, from a population of N clusters of equal size M, a sample of n clusters is chosen and every element of each selected cluster is observed (m = M).

If further M = m = 1, we get SRSWOR.

If n = N, we have the case of stratified sampling.

$y_{ij}$ : value of the characteristic under study for the $j$th second stage unit of the $i$th first stage unit; $i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, M$.

$\bar{Y}_i = \frac{1}{M} \sum_{j=1}^{M} y_{ij}$ : mean per second stage unit of the $i$th first stage unit in the population.

$\bar{Y} = \frac{1}{MN} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} = \frac{1}{N} \sum_{i=1}^{N} \bar{Y}_i = \bar{Y}_{MN}$ : mean per second stage unit in the population.

$\bar{y}_i = \frac{1}{m} \sum_{j=1}^{m} y_{ij}$ : mean per second stage unit in the $i$th first stage unit in the sample.

$\bar{y} = \frac{1}{mn} \sum_{i=1}^{n} \sum_{j=1}^{m} y_{ij} = \frac{1}{n} \sum_{i=1}^{n} \bar{y}_i = \bar{y}_{mn}$ : mean per second stage unit in the sample.
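
As a small numerical illustration of this notation, the sketch below computes the population and sample means for a made-up population. The data, the chosen clusters and the chosen units are all hypothetical; exact fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction

# Hypothetical population: N = 3 first stage units with M = 4 units each.
y = [[2, 4, 6, 8],
     [1, 3, 5, 7],
     [10, 10, 12, 12]]
N, M = len(y), len(y[0])

Y_bar_i = [Fraction(sum(row), M) for row in y]   # population cluster means
Y_bar = sum(Y_bar_i) / N                         # population mean per unit

# Hypothetical two stage sample: clusters 0 and 2, two units from each.
sample = {0: [0, 3], 2: [1, 2]}                  # cluster -> unit positions
n, m = len(sample), 2

# Sample mean of each selected cluster, then the overall sample mean.
y_bar_i = {i: Fraction(sum(y[i][j] for j in js), m) for i, js in sample.items()}
y_bar_mn = sum(y_bar_i.values()) / n

print("cluster means:", Y_bar_i, "population mean:", Y_bar)
print("sample cluster means:", y_bar_i, "sample mean:", y_bar_mn)
```

Here the population mean is 20/3 while this particular sample gives 8; a single sample need not reproduce the population mean.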

Advantages

The principal advantage of two stage sampling is that it is more flexible than one stage sampling. It reduces to one stage sampling when m = M, but unless this is the best choice of m, we have the opportunity of taking some smaller value that appears more efficient. As usual, this choice reduces to a balance between statistical precision and cost. When the second stage units within a first stage unit agree very closely, considerations of precision suggest a small value of m. On the other hand, it is sometimes as cheap to measure the whole of a unit as to sample it, for example, when the unit is a household and a single respondent can give data as accurate as those obtained from all the members of the household.
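
The precision side of this balance can be illustrated with the variance expression $\frac{1}{n}\left(\frac{1}{m}-\frac{1}{M}\right)S_w^2+\left(\frac{1}{n}-\frac{1}{N}\right)S_b^2$ obtained later in these notes. The sketch below is only an illustration: the population layout and the values of $S_w^2$ and $S_b^2$ are invented, and costs are ignored.

```python
def var_two_stage(n, m, N, M, Sw2, Sb2):
    """Var(ybar_mn) = (1/n)(1/m - 1/M) Sw2 + (1/n - 1/N) Sb2."""
    return (1 / n) * (1 / m - 1 / M) * Sw2 + (1 / n - 1 / N) * Sb2

N, M = 100, 20          # hypothetical population layout
Sw2, Sb2 = 4.0, 25.0    # hypothetical within and between mean squares

# Three ways of spending the same total of n * m = 100 second stage units.
for n, m in [(5, 20), (10, 10), (20, 5)]:
    print(f"n={n:2d}, m={m:2d}, variance={var_two_stage(n, m, N, M, Sw2, Sb2):.4f}")
```

With these assumed values the within-cluster variation is small relative to the between-cluster variation, so spreading the same number of elements over more clusters with a smaller m gives the smaller variance.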

A pictorial scheme of two stage sampling is as follows:

Population (MN units): N clusters, each of M units
First stage sample: n clusters (n small), each of M units
Second stage sample: m units from each of the n selected clusters, mn units in all

Note: The expectations under the two stage sampling scheme depend on the stages. For example, the expectation at the second stage depends on the first stage, in the sense that a second stage unit can be in the sample only if its first stage unit was selected at the first stage.

To calculate the average

• First average the estimator over all the second stage selections that can be drawn from a fixed set of n units that the plan selects.

• Then average over all the possible selections of n units by the plan.

In case of two stage sampling,
\[
E(\hat{\theta}) = E_1\left[E_2(\hat{\theta})\right]
\]
where $E$ is the average over all samples, $E_1$ is the average over all first stage samples and $E_2$ is the average over all possible second stage selections from a fixed set of units.

In case of three stage sampling,
\[
E(\hat{\theta}) = E_1\left[E_2\left\{E_3(\hat{\theta})\right\}\right].
\]
To calculate the variance, we proceed as follows:
In case of two stage sampling,
\[
Var(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = E_1 E_2 (\hat{\theta} - \theta)^2 .
\]

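Consider
\[
\begin{aligned}
E_2(\hat{\theta} - \theta)^2 &= E_2(\hat{\theta}^2) - 2\theta E_2(\hat{\theta}) + \theta^2 \\
&= \left[\left\{E_2(\hat{\theta})\right\}^2 + V_2(\hat{\theta})\right] - 2\theta E_2(\hat{\theta}) + \theta^2 .
\end{aligned}
\]

Now average over the first stage selection as
\[
\begin{aligned}
E_1 E_2(\hat{\theta} - \theta)^2 &= E_1\left[E_2(\hat{\theta})\right]^2 + E_1\left[V_2(\hat{\theta})\right] - 2\theta E_1 E_2(\hat{\theta}) + E_1(\theta^2) \\
&= \left\{E_1\left[E_2(\hat{\theta})\right]^2 - \theta^2\right\} + E_1\left[V_2(\hat{\theta})\right] \qquad \text{(using } E_1 E_2(\hat{\theta}) = \theta \text{)}
\end{aligned}
\]
so that
\[
Var(\hat{\theta}) = V_1\left[E_2(\hat{\theta})\right] + E_1\left[V_2(\hat{\theta})\right].
\]

In case of three stage sampling,
\[
Var(\hat{\theta}) = V_1\left[E_2\left\{E_3(\hat{\theta})\right\}\right] + E_1\left[V_2\left\{E_3(\hat{\theta})\right\}\right] + E_1\left[E_2\left\{V_3(\hat{\theta})\right\}\right].
\]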
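
The two stage decomposition can be checked numerically by listing every possible sample of a tiny population. The sketch below does this for the sample mean as the estimator; the population values and the sizes n = m = 2 are hypothetical, and exact fractions are used so that the identity holds without rounding.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical population: N = 3 clusters of M = 3 units; sample n = 2, m = 2.
y = [[1, 2, 6], [4, 4, 7], [0, 5, 10]]
n, m = 2, 2

def mean(xs):
    xs = list(xs)
    return sum(xs, Fraction(0)) / len(xs)

def variance(xs):
    """Variance of equally likely values (divisor = number of values)."""
    xs = list(xs)
    mu = mean(xs)
    return mean((x - mu) ** 2 for x in xs)

def estimates_given(clusters):
    """All values of the sample mean over second stage samples, first stage fixed."""
    per_cluster = [[mean(y[i][j] for j in js)
                    for js in combinations(range(len(y[i])), m)]
                   for i in clusters]
    return [mean(combo) for combo in product(*per_cluster)]

# One list of estimates per possible first stage sample.
cond = [estimates_given(c) for c in combinations(range(len(y)), n)]

total_var = variance(e for es in cond for e in es)   # Var(theta_hat)
v1_e2 = variance(mean(es) for es in cond)            # V1[E2(theta_hat)]
e1_v2 = mean(variance(es) for es in cond)            # E1[V2(theta_hat)]
print(total_var, v1_e2, e1_v2, total_var == v1_e2 + e1_v2)
```

Pooling all estimates with equal weight is valid here because every first stage sample is equally likely and admits the same number of second stage samples.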

Estimation of population mean

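Consider $\bar{y} = \bar{y}_{mn}$ as an estimator of the population mean $\bar{Y}$.

Bias

Consider
\[
\begin{aligned}
E(\bar{y}) &= E_1\left[E_2(\bar{y}_{mn})\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2(\bar{y}_i \mid i)\right] && \text{(as 2nd stage is dependent on 1st stage)} \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} \bar{Y}_i\right] && \text{(as } \bar{y}_i \text{ is unbiased for } \bar{Y}_i \text{ due to SRSWOR)} \\
&= \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i \\
&= \bar{Y} .
\end{aligned}
\]

Thus $\bar{y}_{mn}$ is an unbiased estimator of the population mean.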
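
A quick simulation can make the unbiasedness tangible. The sketch below repeatedly draws two stage SRSWOR samples from a made-up population and compares the average of the estimates with the population mean; the data, the seed and the number of replications are arbitrary choices for illustration.

```python
import random

def ybar_mn(y, n, m, rng):
    """Draw one two stage SRSWOR sample from y and return its mean."""
    clusters = rng.sample(range(len(y)), n)                       # first stage
    cluster_means = [sum(rng.sample(y[i], m)) / m for i in clusters]  # second stage
    return sum(cluster_means) / n

rng = random.Random(2024)
# Hypothetical population: N = 5 clusters, M = 4 units each.
y = [[3, 5, 7, 9], [2, 2, 4, 8], [10, 11, 13, 14], [0, 1, 1, 6], [5, 6, 8, 9]]
Y_bar = sum(map(sum, y)) / (len(y) * len(y[0]))

reps = 20000
avg = sum(ybar_mn(y, n=2, m=2, rng=rng) for _ in range(reps)) / reps
print(f"population mean = {Y_bar:.3f}, average of estimates = {avg:.3f}")
```

The two printed values should agree up to simulation error, which shrinks as the number of replications grows.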

Variance
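\[
\begin{aligned}
Var(\bar{y}) &= E_1\left[V_2(\bar{y} \mid i)\right] + V_1\left[E_2(\bar{y} \mid i)\right] \\
&= E_1\left[V_2\left\{\frac{1}{n}\sum_{i=1}^{n} \bar{y}_i \mid i\right\}\right] + V_1\left[E_2\left\{\frac{1}{n}\sum_{i=1}^{n} \bar{y}_i \mid i\right\}\right] \\
&= E_1\left[\frac{1}{n^2}\sum_{i=1}^{n} V(\bar{y}_i \mid i)\right] + V_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2(\bar{y}_i \mid i)\right] \\
&= E_1\left[\frac{1}{n^2}\sum_{i=1}^{n} \left(\frac{1}{m} - \frac{1}{M}\right) S_i^2\right] + V_1\left[\frac{1}{n}\sum_{i=1}^{n} \bar{Y}_i\right] \\
&= \frac{1}{n^2}\sum_{i=1}^{n} \left(\frac{1}{m} - \frac{1}{M}\right) E_1(S_i^2) + V_1(\bar{y}_c) \\
&\qquad \text{(where } \bar{y}_c \text{ is based on cluster means as in cluster sampling)} \\
&= \frac{1}{n^2}\, n \left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + \frac{N-n}{Nn} S_b^2 \\
&= \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right) S_b^2
\end{aligned}
\]
where
\[
S_w^2 = \frac{1}{N}\sum_{i=1}^{N} S_i^2 = \frac{1}{N(M-1)}\sum_{i=1}^{N}\sum_{j=1}^{M} (y_{ij} - \bar{Y}_i)^2 ,
\qquad
S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N} (\bar{Y}_i - \bar{Y})^2 .
\]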
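
The closed form above can be compared with a brute force computation on a tiny population. In the sketch below, `var_formula` evaluates the expression in terms of $S_w^2$ and $S_b^2$, while `var_enumerated` lists every possible two stage sample; both function names and the data are hypothetical, and the code assumes equal cluster sizes with N ≥ 2 and M ≥ 2.

```python
from fractions import Fraction
from itertools import combinations, product

def mean(xs):
    xs = list(xs)
    return sum(xs, Fraction(0)) / len(xs)

def var_formula(y, n, m):
    """(1/n)(1/m - 1/M) S_w^2 + (1/n - 1/N) S_b^2 for equal sized clusters."""
    N, M = len(y), len(y[0])
    Yi = [mean(row) for row in y]
    Y = mean(Yi)
    Si2 = [sum((v - Yi[i]) ** 2 for v in y[i]) / (M - 1) for i in range(N)]
    Sw2 = mean(Si2)
    Sb2 = sum((a - Y) ** 2 for a in Yi) / (N - 1)
    return (Fraction(1, n) * (Fraction(1, m) - Fraction(1, M)) * Sw2
            + (Fraction(1, n) - Fraction(1, N)) * Sb2)

def var_enumerated(y, n, m):
    """Exact variance of the sample mean over all equally likely samples."""
    N, M = len(y), len(y[0])
    # Every possible subsample mean inside each cluster.
    sub_means = [[mean(y[i][j] for j in js) for js in combinations(range(M), m)]
                 for i in range(N)]
    # Every first stage sample combined with every second stage selection.
    ests = [mean(combo)
            for cl in combinations(range(N), n)
            for combo in product(*(sub_means[i] for i in cl))]
    mu = mean(ests)
    return mean((e - mu) ** 2 for e in ests)

y = [[1, 2, 6], [4, 4, 7], [0, 5, 10]]   # hypothetical N = 3, M = 3
print(var_formula(y, 2, 2), var_enumerated(y, 2, 2))
```

Setting m = M in `var_formula` removes the within-cluster term and leaves the cluster sampling variance, while n = N removes the between-cluster term, matching the special cases noted earlier.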

Estimate of variance

An unbiased estimator of the variance of $\bar{y}$ can be obtained by replacing $S_b^2$ and $S_w^2$ by their unbiased estimators in the expression of the variance of $\bar{y}$.

Consider an estimator of
\[
S_w^2 = \frac{1}{N}\sum_{i=1}^{N} S_i^2
\]
where
\[
S_i^2 = \frac{1}{M-1}\sum_{j=1}^{M} (y_{ij} - \bar{Y}_i)^2
\]
as
\[
s_w^2 = \frac{1}{n}\sum_{i=1}^{n} s_i^2
\]
where
\[
s_i^2 = \frac{1}{m-1}\sum_{j=1}^{m} (y_{ij} - \bar{y}_i)^2 .
\]

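So
\[
\begin{aligned}
E(s_w^2) &= E_1 E_2(s_w^2 \mid i) \\
&= E_1 E_2\left[\frac{1}{n}\sum_{i=1}^{n} s_i^2 \mid i\right] \\
&= E_1 \frac{1}{n}\sum_{i=1}^{n} \left[E_2(s_i^2 \mid i)\right] \\
&= E_1 \frac{1}{n}\sum_{i=1}^{n} S_i^2 && \text{(as SRSWOR is used)} \\
&= \frac{1}{n}\sum_{i=1}^{n} E_1(S_i^2) \\
&= \frac{1}{n}\sum_{i=1}^{n} \left[\frac{1}{N}\sum_{i=1}^{N} S_i^2\right] \\
&= \frac{1}{N}\sum_{i=1}^{N} S_i^2 \\
&= S_w^2
\end{aligned}
\]
so $s_w^2$ is an unbiased estimator of $S_w^2$.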

Consider
\[
s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n} (\bar{y}_i - \bar{y})^2
\]
as an estimator of
\[
S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N} (\bar{Y}_i - \bar{Y})^2 .
\]

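So
\[
\begin{aligned}
E(s_b^2) &= \frac{1}{n-1} E\left[\sum_{i=1}^{n} (\bar{y}_i - \bar{y})^2\right] \\
(n-1)E(s_b^2) &= E\left[\sum_{i=1}^{n} \bar{y}_i^2 - n\bar{y}^2\right] \\
&= E\left[\sum_{i=1}^{n} \bar{y}_i^2\right] - nE(\bar{y}^2) \\
&= E_1\left[E_2\left(\sum_{i=1}^{n} \bar{y}_i^2\right)\right] - n\left[Var(\bar{y}) + \{E(\bar{y})\}^2\right] \\
&= E_1\left[\sum_{i=1}^{n} E_2(\bar{y}_i^2 \mid i)\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n} S_w^2 + \bar{Y}^2\right] \\
&= E_1\left[\sum_{i=1}^{n} \left\{Var(\bar{y}_i) + \left(E(\bar{y}_i)\right)^2\right\}\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n} S_w^2 + \bar{Y}^2\right] \\
&= E_1\left[\sum_{i=1}^{n} \left\{\left(\frac{1}{m} - \frac{1}{M}\right) S_i^2 + \bar{Y}_i^2\right\}\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n} S_w^2 + \bar{Y}^2\right] \\
&= nE_1\left[\frac{1}{n}\sum_{i=1}^{n} \left\{\left(\frac{1}{m} - \frac{1}{M}\right) S_i^2 + \bar{Y}_i^2\right\}\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n} S_w^2 + \bar{Y}^2\right] \\
&= n\left[\left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{N}\sum_{i=1}^{N} S_i^2 + \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i^2\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n} S_w^2 + \bar{Y}^2\right] \\
&= n\left[\left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i^2\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n} S_w^2 + \bar{Y}^2\right] \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + \frac{n}{N}\sum_{i=1}^{N} \bar{Y}_i^2 - n\bar{Y}^2 - n\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + \frac{n}{N}\left[\sum_{i=1}^{N} \bar{Y}_i^2 - N\bar{Y}^2\right] - n\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + \frac{n}{N}(N-1) S_b^2 - n\left(\frac{1}{n} - \frac{1}{N}\right) S_b^2 \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + (n-1) S_b^2 .
\end{aligned}
\]
\[
\Rightarrow E(s_b^2) = \left(\frac{1}{m} - \frac{1}{M}\right) S_w^2 + S_b^2
\]
or
\[
E\left[s_b^2 - \left(\frac{1}{m} - \frac{1}{M}\right) s_w^2\right] = S_b^2 .
\]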

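Thus
\[
\begin{aligned}
\widehat{Var}(\bar{y}) &= \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right) \hat{S}_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right) \hat{S}_b^2 \\
&= \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right) s_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)\left[s_b^2 - \left(\frac{1}{m} - \frac{1}{M}\right) s_w^2\right] \\
&= \frac{1}{N}\left(\frac{1}{m} - \frac{1}{M}\right) s_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right) s_b^2 .
\end{aligned}
\]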
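
The final expression can be evaluated directly from sample data. The following is a minimal sketch, assuming the sample is supplied as a list of n lists with m observations each; the function name `estimate_mean_and_variance` and the numbers in the example are made up for illustration.

```python
from statistics import mean, variance

def estimate_mean_and_variance(sample, N, M):
    """sample: list of n lists, each holding the m observed y values of one
    selected first stage unit. Returns (sample mean, estimated variance)."""
    n, m = len(sample), len(sample[0])
    if n < 2 or m < 2:
        raise ValueError("need n >= 2 and m >= 2 to estimate both components")
    ybar_i = [mean(unit) for unit in sample]
    ybar = mean(ybar_i)
    sw2 = mean(variance(unit) for unit in sample)   # (1/n) * sum of s_i^2
    sb2 = variance(ybar_i)                          # divisor n - 1
    # (1/N)(1/m - 1/M) s_w^2 + (1/n - 1/N) s_b^2
    var_hat = (1 / N) * (1 / m - 1 / M) * sw2 + (1 / n - 1 / N) * sb2
    return ybar, var_hat

# Hypothetical sample: n = 2 clusters, m = 2 units each, from N = 3, M = 4.
ybar, var_hat = estimate_mean_and_variance([[2.0, 8.0], [10.0, 12.0]], N=3, M=4)
print(ybar, var_hat)
```

`statistics.variance` uses the divisor (number of values − 1), which matches the definitions of $s_i^2$ and $s_b^2$ above.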
