Sampling Two Stage Sampling

- Two stage sampling divides a population into clusters, selects a sample of clusters at the first stage, and then selects a sample of elements from each selected cluster at the second stage.
- It is more flexible than one-stage sampling and can achieve higher precision by distributing elements over more clusters.
- The mean per second stage unit in the sample is an unbiased estimator of the population mean. Its variance is the sum of a within-cluster component (from subsampling inside the selected clusters) and a between-cluster component (from the variation between cluster means).

Uploaded by

Fitri Juanda

Chapter 10

Two Stage Sampling (Subsampling)

In cluster sampling, all the elements in the selected clusters are surveyed. Moreover, the efficiency of cluster sampling depends on the size of the cluster: as the cluster size increases, the efficiency decreases. This suggests that higher precision can be attained by distributing a given number of elements over a large number of clusters rather than by taking a small number of clusters and enumerating all elements within them. This is achieved in subsampling.

In subsampling
- Divide the population into clusters.
- Select a sample of clusters [first stage].
- From each of the selected clusters, select a sample of the specified number of elements [second stage].

The clusters, which form the units of sampling at the first stage, are called the first stage units. The units or groups of units within the clusters, which form the units of sampling at the second stage, are called the second stage units or subunits.

The procedure is generalized to three or more stages and is then termed as multistage sampling.

For example, in a crop survey

- villages are the first stage units,
- fields within the villages are the second stage units and
- plots within the fields are the third stage units.

In another example, to obtain a sample of fishes from a commercial fishery

- first, take a sample of boats and
- then take a sample of fishes from each selected boat.

Two stage sampling with equal first stage units:

Assume that
- The population consists of $NM$ elements.
- The $NM$ elements are grouped into $N$ first stage units of $M$ second stage units each (i.e., $N$ clusters, each cluster of size $M$).
- A sample of $n$ first stage units is selected (i.e., choose $n$ clusters).
- A sample of $m$ second stage units is selected from each selected first stage unit (i.e., choose $m$ units from each cluster).
- Units at each stage are selected with SRSWOR.

Sampling Theory | Chapter 10 | Two Stage Sampling (Subsampling) | Shalabh, IIT Kanpur

Cluster sampling is a special case of two stage sampling in the sense that, from a population of $N$ clusters of equal size $m = M$, a sample of $n$ clusters is chosen.
If further $M = m = 1$, we get SRSWOR.
If $n = N$, we have the case of stratified sampling.
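The selection scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `two_stage_sample` and `sample_mean`, the `seed` argument and the toy population are hypothetical and do not come from the notes.

```python
import random

def two_stage_sample(clusters, n, m, seed=None):
    """Select n clusters by SRSWOR, then m elements by SRSWOR within each.

    clusters: list of N lists, each holding the M values y_ij of one cluster.
    Returns a dict mapping each selected cluster index to its m sampled values.
    """
    rng = random.Random(seed)
    # First stage: n of the N first stage units, without replacement.
    chosen = rng.sample(range(len(clusters)), n)
    # Second stage: m of the M second stage units inside each chosen cluster.
    return {i: rng.sample(clusters[i], m) for i in sorted(chosen)}

def sample_mean(sample):
    """Mean per second stage unit in the sample (mean of the cluster means)."""
    cluster_means = [sum(ys) / len(ys) for ys in sample.values()]
    return sum(cluster_means) / len(cluster_means)

# Hypothetical population: N = 5 clusters of M = 4 elements each.
population = [[10 * i + j for j in range(4)] for i in range(5)]
sample = two_stage_sample(population, n=3, m=2, seed=1)
```

Calling the function with `m` equal to the cluster size reproduces cluster sampling, and calling it with `n` equal to the number of clusters gives a stratified sample, in line with the special cases noted above.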

$y_{ij}$: value of the characteristic under study for the $j$th second stage unit of the $i$th first stage unit; $i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, M$.

$\bar{Y}_i = \frac{1}{M}\sum_{j=1}^{M} y_{ij}$: mean per second stage unit of the $i$th first stage unit in the population.

$\bar{Y} = \frac{1}{MN}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij} = \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i = \bar{Y}_{MN}$: mean per second stage unit in the population.

$\bar{y}_i = \frac{1}{m}\sum_{j=1}^{m} y_{ij}$: mean per second stage unit in the $i$th first stage unit in the sample.

$\bar{y} = \frac{1}{mn}\sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij} = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_i = \bar{y}_{mn}$: mean per second stage unit in the sample.

Advantages:

The principal advantage of two stage sampling is that it is more flexible than one-stage sampling. It reduces to one-stage sampling when $m = M$, but unless this is the best choice of $m$, we have the opportunity of taking some smaller value that appears more efficient. As usual, this choice reduces to a balance between statistical precision and cost. When the second stage units within a first stage unit agree very closely, considerations of precision suggest a small value of $m$. On the other hand, it is sometimes as cheap to measure the whole of a unit as to sample it, for example, when the unit is a household and a single respondent can give data as accurate as all the members of the household.

A pictorial scheme of two stage sampling is as follows:

[Figure: Population of $MN$ units arranged in $N$ clusters (large in number) of $M$ units each. First stage sample: $n$ clusters (small in number) of $M$ units each. Second stage sample: $m$ units from each of the $n$ selected clusters, $mn$ units in all.]

Note: The expectations under two stage sampling scheme depend on the stages. For example, the
expectation at second stage unit will be dependent on first stage unit in the sense that second stage unit
will be in the sample provided it was selected in the first stage.

To calculate the average
- First average the estimator over all the second stage selections that can be drawn from a fixed
set of n units that the plan selects.
- Then average over all the possible selections of n units by the plan.

In the case of two stage sampling,

$$E(\hat{\theta}) = E_1[E_2(\hat{\theta})],$$

where $E$ is the average over all samples, $E_1$ is the average over all first stage samples, and $E_2$ is the average over all possible second stage selections from a fixed set of units.

In the case of three stage sampling,

$$E(\hat{\theta}) = E_1\left[E_2\left\{E_3(\hat{\theta})\right\}\right].$$

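To calculate the variance, we proceed as follows:

In the case of two stage sampling,

$$Var(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = E_1 E_2(\hat{\theta} - \theta)^2.$$

Consider

$$\begin{aligned}
E_2(\hat{\theta} - \theta)^2 &= E_2(\hat{\theta}^2) - 2\theta E_2(\hat{\theta}) + \theta^2 \\
&= \left\{E_2(\hat{\theta})\right\}^2 + V_2(\hat{\theta}) - 2\theta E_2(\hat{\theta}) + \theta^2.
\end{aligned}$$

Now average over the first stage selection, using $E_1 E_2(\hat{\theta}) = \theta$:

$$\begin{aligned}
E_1 E_2(\hat{\theta} - \theta)^2 &= E_1\left\{E_2(\hat{\theta})\right\}^2 + E_1\left[V_2(\hat{\theta})\right] - 2\theta E_1 E_2(\hat{\theta}) + E_1(\theta^2) \\
&= E_1\left[\left\{E_2(\hat{\theta})\right\}^2 - \theta^2\right] + E_1\left[V_2(\hat{\theta})\right] \\
Var(\hat{\theta}) &= V_1\left[E_2(\hat{\theta})\right] + E_1\left[V_2(\hat{\theta})\right].
\end{aligned}$$

In the case of three stage sampling,

$$Var(\hat{\theta}) = V_1\left[E_2\left\{E_3(\hat{\theta})\right\}\right] + E_1\left[V_2\left\{E_3(\hat{\theta})\right\}\right] + E_1\left[E_2\left\{V_3(\hat{\theta})\right\}\right].$$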

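Estimation of population mean:
Consider $\bar{y} = \bar{y}_{mn}$ as an estimator of the population mean $\bar{Y}$.

Bias:
Consider

$$\begin{aligned}
E(\bar{y}) &= E_1\left[E_2(\bar{y}_{mn})\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2(\bar{y}_i \mid i)\right] \quad \text{(as the 2nd stage is dependent on the 1st stage)} \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} \bar{Y}_i\right] \quad \text{(as } \bar{y}_i \text{ is unbiased for } \bar{Y}_i \text{ due to SRSWOR)} \\
&= \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i \\
&= \bar{Y}.
\end{aligned}$$

Thus $\bar{y}_{mn}$ is an unbiased estimator of the population mean.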

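Variance

$$\begin{aligned}
Var(\bar{y}) &= E_1\left[V_2(\bar{y} \mid i)\right] + V_1\left[E_2(\bar{y} \mid i)\right] \\
&= E_1\left[V_2\left(\frac{1}{n}\sum_{i=1}^{n}\bar{y}_i \,\Big|\, i\right)\right] + V_1\left[E_2\left(\frac{1}{n}\sum_{i=1}^{n}\bar{y}_i \,\Big|\, i\right)\right] \\
&= E_1\left[\frac{1}{n^2}\sum_{i=1}^{n} V(\bar{y}_i \mid i)\right] + V_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2(\bar{y}_i \mid i)\right] \\
&= E_1\left[\frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{1}{m} - \frac{1}{M}\right)S_i^2\right] + V_1\left[\frac{1}{n}\sum_{i=1}^{n}\bar{Y}_i\right] \\
&= \frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{1}{m} - \frac{1}{M}\right)E_1(S_i^2) + V_1(\bar{y}_c) \\
&\qquad \text{(where } \bar{y}_c \text{ is based on cluster means as in cluster sampling)} \\
&= \frac{1}{n^2}\, n\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \frac{N-n}{Nn}S_b^2 \\
&= \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2
\end{aligned}$$

where

$$S_w^2 = \frac{1}{N}\sum_{i=1}^{N} S_i^2 = \frac{1}{N(M-1)}\sum_{i=1}^{N}\sum_{j=1}^{M}\left(y_{ij} - \bar{Y}_i\right)^2,$$

$$S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(\bar{Y}_i - \bar{Y}\right)^2.$$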
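The unbiasedness of $\bar{y}_{mn}$ and the variance formula above can be checked exactly on a small population by listing every possible two stage sample. The sketch below does this with exact fractions; the population values and the helper names `mean` and `mean_square` are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def mean_square(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    values = list(values)
    centre = mean(values)
    return sum(((v - centre) ** 2 for v in values), Fraction(0)) / (len(values) - 1)

# Hypothetical population: N = 3 clusters of M = 3 elements each.
population = [[1, 2, 6], [4, 4, 7], [0, 5, 10]]
N, M, n, m = 3, 3, 2, 2

cluster_means = [mean(c) for c in population]
Y_bar = mean(cluster_means)
S_w2 = mean(mean_square(c) for c in population)   # within-cluster mean square
S_b2 = mean_square(cluster_means)                 # between-cluster mean square

# With equal cluster sizes every two stage sample is equally likely,
# so a plain average over all of them gives the exact expectation.
estimates = []
for chosen in combinations(range(N), n):
    for parts in product(*(combinations(population[i], m) for i in chosen)):
        estimates.append(mean(mean(part) for part in parts))

expected = mean(estimates)
variance = mean((e - expected) ** 2 for e in estimates)
formula = (Fraction(1, n) * (Fraction(1, m) - Fraction(1, M)) * S_w2
           + (Fraction(1, n) - Fraction(1, N)) * S_b2)
```

Here `expected` should coincide with `Y_bar` and `variance` with `formula`. Exact fractions are used so that the comparison is not blurred by floating point rounding.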

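Estimate of variance
An unbiased estimator of the variance of $\bar{y}$ can be obtained by replacing $S_b^2$ and $S_w^2$ by their unbiased estimators in the expression of the variance of $\bar{y}$.

Consider an estimator of

$$S_w^2 = \frac{1}{N}\sum_{i=1}^{N} S_i^2, \quad \text{where } S_i^2 = \frac{1}{M-1}\sum_{j=1}^{M}\left(y_{ij} - \bar{Y}_i\right)^2,$$

as

$$s_w^2 = \frac{1}{n}\sum_{i=1}^{n} s_i^2, \quad \text{where } s_i^2 = \frac{1}{m-1}\sum_{j=1}^{m}\left(y_{ij} - \bar{y}_i\right)^2.$$

So

$$\begin{aligned}
E(s_w^2) &= E_1 E_2\left(s_w^2 \mid i\right) \\
&= E_1 E_2\left[\frac{1}{n}\sum_{i=1}^{n} s_i^2 \,\Big|\, i\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2(s_i^2 \mid i)\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} S_i^2\right] \quad \text{(as SRSWOR is used)} \\
&= \frac{1}{n}\sum_{i=1}^{n} E_1(S_i^2) \\
&= \frac{1}{n}\sum_{i=1}^{n}\left[\frac{1}{N}\sum_{i=1}^{N} S_i^2\right] \\
&= \frac{1}{N}\sum_{i=1}^{N} S_i^2 \\
&= S_w^2
\end{aligned}$$

so $s_w^2$ is an unbiased estimator of $S_w^2$.

Consider

$$s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\bar{y}_i - \bar{y}\right)^2$$

as an estimator of

$$S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(\bar{Y}_i - \bar{Y}\right)^2.$$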

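So

$$E(s_b^2) = \frac{1}{n-1}E\left[\sum_{i=1}^{n}\left(\bar{y}_i - \bar{y}\right)^2\right]$$

$$\begin{aligned}
(n-1)E(s_b^2) &= E\left[\sum_{i=1}^{n}\bar{y}_i^2 - n\bar{y}^2\right] \\
&= E\left[\sum_{i=1}^{n}\bar{y}_i^2\right] - nE(\bar{y}^2) \\
&= E_1\left[E_2\left(\sum_{i=1}^{n}\bar{y}_i^2 \,\Big|\, i\right)\right] - n\left[Var(\bar{y}) + \left\{E(\bar{y})\right\}^2\right] \\
&= E_1\left[\sum_{i=1}^{n}E_2\left(\bar{y}_i^2 \mid i\right)\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n}S_w^2 + \bar{Y}^2\right] \\
&= E_1\left[\sum_{i=1}^{n}\left\{Var(\bar{y}_i) + \left(E(\bar{y}_i)\right)^2\right\}\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n}S_w^2 + \bar{Y}^2\right] \\
&= E_1\left[\sum_{i=1}^{n}\left\{\left(\frac{1}{m} - \frac{1}{M}\right)S_i^2 + \bar{Y}_i^2\right\}\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n}S_w^2 + \bar{Y}^2\right] \\
&= nE_1\left[\frac{1}{n}\sum_{i=1}^{n}\left\{\left(\frac{1}{m} - \frac{1}{M}\right)S_i^2 + \bar{Y}_i^2\right\}\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n}S_w^2 + \bar{Y}^2\right] \\
&= n\left[\left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{N}\sum_{i=1}^{N}S_i^2 + \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i^2\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n}S_w^2 + \bar{Y}^2\right] \\
&= n\left[\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i^2\right] - n\left[\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right)\frac{1}{n}S_w^2 + \bar{Y}^2\right] \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \frac{n}{N}\sum_{i=1}^{N}\bar{Y}_i^2 - n\bar{Y}^2 - n\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \frac{n}{N}\left[\sum_{i=1}^{N}\bar{Y}_i^2 - N\bar{Y}^2\right] - n\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \frac{n}{N}(N-1)S_b^2 - n\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 \\
&= (n-1)\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + (n-1)S_b^2.
\end{aligned}$$

$$\Rightarrow E(s_b^2) = \left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + S_b^2$$

$$\text{or } E\left[s_b^2 - \left(\frac{1}{m} - \frac{1}{M}\right)s_w^2\right] = S_b^2.$$

Thus

$$\begin{aligned}
\widehat{Var}(\bar{y}) &= \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)\hat{S}_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)\hat{S}_b^2 \\
&= \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)s_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)\left[s_b^2 - \left(\frac{1}{m} - \frac{1}{M}\right)s_w^2\right] \\
&= \frac{1}{N}\left(\frac{1}{m} - \frac{1}{M}\right)s_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)s_b^2.
\end{aligned}$$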

Allocation of sample to the two stages: Equal first stage units:
The variance of the sample mean in the case of two stage sampling is

$$Var(\bar{y}) = \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2.$$

It depends on $S_b^2$, $S_w^2$, $n$ and $m$. The cost of surveying the units in the two stage sample also depends on $n$ and $m$.

Case 1. When cost is fixed

We find the values of $n$ and $m$ so that the variance is minimum for a given cost.

(I) When cost function is $C = kmn$

Let the cost of the survey be proportional to the sample size as

$$C = knm$$

where $C$ is the total cost and $k$ is a constant.

When the cost is fixed as $C = C_0$, substituting $m = \dfrac{C_0}{kn}$ in $Var(\bar{y})$, we get

$$\begin{aligned}
Var(\bar{y}) &= \frac{1}{n}\left(S_b^2 - \frac{S_w^2}{M}\right) - \frac{S_b^2}{N} + \frac{1}{n}\,\frac{kn}{C_0}S_w^2 \\
&= \frac{1}{n}\left(S_b^2 - \frac{S_w^2}{M}\right) - \left(\frac{S_b^2}{N} - \frac{kS_w^2}{C_0}\right).
\end{aligned}$$

This variance is a monotonic decreasing function of $n$ if $\left(S_b^2 - \dfrac{S_w^2}{M}\right) > 0$. The variance is minimum when $n$ assumes the maximum value, i.e.,

$$\hat{n} = \frac{C_0}{k} \quad \text{corresponding to } m = 1.$$

If $\left(S_b^2 - \dfrac{S_w^2}{M}\right) < 0$ (i.e., the intraclass correlation is negative for large $N$), then the variance is a monotonic increasing function of $n$. It reaches its minimum when $n$ assumes the minimum value, i.e., $\hat{n} = \dfrac{C_0}{kM}$ (i.e., no subsampling).

(II) When cost function is $C = k_1 n + k_2 mn$
Let the cost $C$ be fixed as $C_0 = k_1 n + k_2 mn$, where $k_1$ and $k_2$ are positive constants. The terms $k_1$ and $k_2$ denote the costs per unit of observation at the first and second stages, respectively. Minimize the variance of the sample mean under two stage sampling with respect to $m$, subject to the restriction $C_0 = k_1 n + k_2 mn$.

We have

$$C_0\left[Var(\bar{y}) + \frac{S_b^2}{N}\right] = k_1\left(S_b^2 - \frac{S_w^2}{M}\right) + k_2 S_w^2 + mk_2\left(S_b^2 - \frac{S_w^2}{M}\right) + \frac{k_1 S_w^2}{m}.$$

When $\left(S_b^2 - \dfrac{S_w^2}{M}\right) > 0$, then

$$C_0\left[Var(\bar{y}) + \frac{S_b^2}{N}\right] = \left[\sqrt{k_1\left(S_b^2 - \frac{S_w^2}{M}\right)} + \sqrt{k_2 S_w^2}\right]^2 + \left[\sqrt{mk_2\left(S_b^2 - \frac{S_w^2}{M}\right)} - \sqrt{\frac{k_1 S_w^2}{m}}\right]^2$$

which is minimum when the second term on the right-hand side is zero. So we obtain

$$\hat{m} = \sqrt{\frac{k_1}{k_2}\,\frac{S_w^2}{\left(S_b^2 - \dfrac{S_w^2}{M}\right)}}.$$

The optimum $n$ follows from $C_0 = k_1 n + k_2 mn$ as

$$\hat{n} = \frac{C_0}{k_1 + k_2\hat{m}}.$$

When $\left(S_b^2 - \dfrac{S_w^2}{M}\right) \le 0$, then

$$C_0\left[Var(\bar{y}) + \frac{S_b^2}{N}\right] = k_1\left(S_b^2 - \frac{S_w^2}{M}\right) + k_2 S_w^2 + mk_2\left(S_b^2 - \frac{S_w^2}{M}\right) + \frac{k_1 S_w^2}{m}$$

is minimum if $m$ is the greatest attainable integer. Hence in this case, when

$$C_0 \ge k_1 + k_2 M: \quad \hat{m} = M \text{ and } \hat{n} = \frac{C_0}{k_1 + k_2 M}.$$

If $C_0 < k_1 + k_2 M$, then $\hat{m} = \dfrac{C_0 - k_1}{k_2}$ and $\hat{n} = 1$.

If $N$ is large, then $S_w^2 \approx S^2(1 - \rho)$ and

$$S_b^2 - \frac{S_w^2}{M} \approx \rho S^2,$$

$$\hat{m} \approx \sqrt{\frac{k_1}{k_2}\left(\frac{1}{\rho} - 1\right)}.$$
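The allocation rules for the cost function $C_0 = k_1 n + k_2 mn$ can be collected into one small function. This is a sketch under stated assumptions: the name `optimal_allocation` is hypothetical, the results are continuous values that would still need rounding to integers, and clamping $\hat{m}$ to the range $[1, M]$ is a practical guard added here that the notes do not discuss.

```python
import math

def optimal_allocation(C0, k1, k2, Sb2, Sw2, M):
    """Return (m_hat, n_hat) minimising Var(y_bar) for C0 = k1*n + k2*m*n."""
    A = Sb2 - Sw2 / M
    if A > 0:
        # Interior optimum: m_hat = sqrt(k1 * Sw2 / (k2 * A)).
        m_hat = math.sqrt(k1 * Sw2 / (k2 * A))
        # Added guard (not in the notes): keep m_hat within [1, M].
        m_hat = min(max(m_hat, 1.0), float(M))
        return m_hat, C0 / (k1 + k2 * m_hat)
    # A <= 0: take m as large as the budget allows.
    if C0 >= k1 + k2 * M:
        return float(M), C0 / (k1 + k2 * M)
    return (C0 - k1) / k2, 1.0

# Hypothetical inputs: Sb2 - Sw2/M is close to 1, so m_hat is close to 8
# and n_hat is close to 240 / (16 + 8) = 10.
m_hat, n_hat = optimal_allocation(C0=240, k1=16, k2=1, Sb2=1.04, Sw2=4, M=100)
```

A larger first stage cost $k_1$ relative to $k_2$ pushes $\hat{m}$ up, which matches the intuition that expensive clusters should be used more intensively once selected.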

Case 2: When variance is fixed
Now we find the sample sizes when the variance is fixed, say at $V_0$.

$$V_0 = \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2$$

$$\Rightarrow n = \frac{S_b^2 + \left(\dfrac{1}{m} - \dfrac{1}{M}\right)S_w^2}{V_0 + \dfrac{S_b^2}{N}}.$$

So

$$C = kmn = km\left[\frac{S_b^2 - \dfrac{S_w^2}{M}}{V_0 + \dfrac{S_b^2}{N}}\right] + \frac{kS_w^2}{V_0 + \dfrac{S_b^2}{N}}.$$

If $\left(S_b^2 - \dfrac{S_w^2}{M}\right) > 0$, $C$ attains its minimum when $m$ assumes the smallest integral value, i.e., 1.

If $\left(S_b^2 - \dfrac{S_w^2}{M}\right) < 0$, $C$ attains its minimum when $\hat{m} = M$.

Comparison of two stage sampling with one stage sampling

One stage sampling procedures are comparable with two stage sampling procedures when either
(i) sampling $mn$ elements in one single stage, or
(ii) sampling $\dfrac{mn}{M}$ first stage units as clusters without sub-sampling.
We consider both cases.

Case 1: Sampling $mn$ elements in one single stage

The variance of the sample mean based on
- $mn$ elements selected by SRSWOR (one stage) is given by

$$V(\bar{y}_{SRS}) = \left(\frac{1}{mn} - \frac{1}{MN}\right)S^2$$

- two stage sampling is given by

$$V(\bar{y}_{TS}) = \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2.$$

The intraclass correlation coefficient is

$$\rho = \frac{\left(\dfrac{N-1}{N}\right)S_b^2 - \dfrac{S_w^2}{M}}{\left(\dfrac{NM-1}{NM}\right)S^2} = \frac{M(N-1)S_b^2 - NS_w^2}{(MN-1)S^2}; \quad -\frac{1}{M-1} \le \rho \le 1 \qquad (1)$$

and using the identity

$$\sum_{i=1}^{N}\sum_{j=1}^{M}\left(y_{ij} - \bar{Y}\right)^2 = \sum_{i=1}^{N}\sum_{j=1}^{M}\left(y_{ij} - \bar{Y}_i\right)^2 + \sum_{i=1}^{N}\sum_{j=1}^{M}\left(\bar{Y}_i - \bar{Y}\right)^2$$

$$(NM-1)S^2 = (N-1)MS_b^2 + N(M-1)S_w^2 \qquad (2)$$

where $\bar{Y} = \dfrac{1}{MN}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}$ and $\bar{Y}_i = \dfrac{1}{M}\sum_{j=1}^{M} y_{ij}$.

Now we need to find $S_b^2$ and $S_w^2$ from (1) and (2) in terms of $S^2$. From (1), we have

$$S_w^2 = \left(\frac{N-1}{N}\right)MS_b^2 - \left(\frac{MN-1}{N}\right)\rho S^2. \qquad (3)$$

Substituting it in (2) gives

$$\begin{aligned}
(NM-1)S^2 &= (N-1)MS_b^2 + N(M-1)\left[\left(\frac{N-1}{N}\right)MS_b^2 - \left(\frac{MN-1}{N}\right)\rho S^2\right] \\
&= (N-1)MS_b^2 + (M-1)(N-1)MS_b^2 - \rho(M-1)(MN-1)S^2 \\
&= (N-1)MS_b^2\left[1 + (M-1)\right] - \rho(M-1)(MN-1)S^2 \\
&= (N-1)M^2S_b^2 - \rho(M-1)(MN-1)S^2
\end{aligned}$$

$$\Rightarrow S_b^2 = \frac{(MN-1)S^2}{M^2(N-1)}\left[1 + (M-1)\rho\right].$$

Substituting it in (2) gives

$$\begin{aligned}
N(M-1)S_w^2 &= (NM-1)S^2 - (N-1)MS_b^2 \\
&= (NM-1)S^2 - (N-1)M\left[\frac{(MN-1)S^2}{M^2(N-1)}\left\{1 + (M-1)\rho\right\}\right] \\
&= (NM-1)S^2\left[\frac{M - 1 - (M-1)\rho}{M}\right] \\
&= (NM-1)S^2\,\frac{(M-1)(1-\rho)}{M}
\end{aligned}$$

$$\Rightarrow S_w^2 = \left(\frac{MN-1}{MN}\right)S^2(1-\rho).$$

Substituting $S_b^2$ and $S_w^2$ in $Var(\bar{y}_{TS})$,

$$V(\bar{y}_{TS}) = \left(\frac{MN-1}{MN}\right)\frac{S^2}{mn}\left[1 - \frac{m(n-1)}{M(N-1)} + \rho\left\{\frac{N-n}{N-1}\,\frac{m}{M}(M-1) - \frac{M-m}{M}\right\}\right].$$

When the subsampling rate $\dfrac{m}{M}$ is small, $MN - 1 \approx MN$ and $M - 1 \approx M$, then

$$V(\bar{y}_{SRS}) \approx \frac{S^2}{mn}$$

$$V(\bar{y}_{TS}) \approx \frac{S^2}{mn}\left[1 + \rho\left(\frac{N-n}{N-1}\,m - 1\right)\right].$$

The relative efficiency of two stage sampling in relation to one stage sampling by SRSWOR is

$$RE = \frac{Var(\bar{y}_{TS})}{Var(\bar{y}_{SRS})} = 1 + \rho\left(\frac{N-n}{N-1}\,m - 1\right).$$

If $N - 1 \approx N$ and the finite population correction is ignorable, then $\dfrac{N-n}{N-1} \approx \dfrac{N-n}{N} \approx 1$, and

$$RE = 1 + \rho(m-1).$$
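The approximate ratio $RE$ is easy to evaluate numerically. The following sketch assumes a small subsampling rate $m/M$, as in the approximation above; the function name `relative_efficiency` and the example values of $\rho$, $m$, $N$ and $n$ are hypothetical.

```python
def relative_efficiency(rho, m, N=None, n=None):
    """Approximate Var(two stage) / Var(SRSWOR of mn elements).

    If N and n are omitted, the finite population correction is ignored
    and the simpler form 1 + rho * (m - 1) is returned.
    """
    if N is None or n is None:
        return 1 + rho * (m - 1)
    return 1 + rho * ((N - n) / (N - 1) * m - 1)

# Hypothetical values: intraclass correlation 0.05, 11 units per cluster.
re_simple = relative_efficiency(0.05, 11)
re_fpc = relative_efficiency(0.05, 11, N=200, n=20)
```

A value above 1 means the two stage sample has the larger variance for the same number of elements, which is the usual situation when $\rho > 0$.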

Case 2: Comparison with cluster sampling

Suppose a random sample of $\dfrac{mn}{M}$ clusters, without further subsampling, is selected.
The variance of the sample mean of the equivalent $mn/M$ clusters is

$$Var(\bar{y}_{cl}) = \left(\frac{M}{mn} - \frac{1}{N}\right)S_b^2.$$

The variance of the sample mean under two stage sampling is

$$Var(\bar{y}_{TS}) = \frac{1}{n}\left(\frac{1}{m} - \frac{1}{M}\right)S_w^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2.$$

So $Var(\bar{y}_{cl})$ exceeds $Var(\bar{y}_{TS})$ by

$$\frac{1}{n}\left(\frac{M}{m} - 1\right)\left(S_b^2 - \frac{1}{M}S_w^2\right)$$

which is approximately

$$\frac{1}{n}\left(\frac{M}{m} - 1\right)\rho S^2 \quad \text{for large } N \text{ and } \left(S_b^2 - \frac{S_w^2}{M}\right) > 0,$$

where

$$S_b^2 = \frac{MN-1}{M(N-1)}\,\frac{S^2}{M}\left[1 + \rho(M-1)\right]$$

$$S_w^2 = \frac{MN-1}{MN}\,S^2(1-\rho).$$

So the smaller $m/M$ is, the larger the reduction in the variance of the two stage sample over a cluster sample.
When $\left(S_b^2 - \dfrac{S_w^2}{M}\right) < 0$, subsampling will lead to a loss in precision.

Two stage sampling with unequal first stage units:
Consider two stage sampling when the first stage units are of unequal size and SRSWOR is employed at each stage.
Let
$y_{ij}$: value of the $j$th second stage unit of the $i$th first stage unit.

$M_i$: number of second stage units in the $i$th first stage unit $(i = 1, 2, \ldots, N)$.

$M_0 = \sum_{i=1}^{N} M_i$: total number of second stage units in the population.

$m_i$: number of second stage units to be selected from the $i$th first stage unit, if it is in the sample.

$m_0 = \sum_{i=1}^{n} m_i$: total number of second stage units in the sample.

$$\bar{y}_{i(m_i)} = \frac{1}{m_i}\sum_{j=1}^{m_i} y_{ij}$$

$$\bar{Y}_i = \frac{1}{M_i}\sum_{j=1}^{M_i} y_{ij}$$

$$\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i$$

$$\bar{Y} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M_i} y_{ij}}{\sum_{i=1}^{N} M_i} = \frac{\sum_{i=1}^{N} M_i\bar{Y}_i}{\bar{M}N} = \frac{1}{N}\sum_{i=1}^{N} u_i\bar{Y}_i$$

$$u_i = \frac{M_i}{\bar{M}}$$

$$\bar{M} = \frac{1}{N}\sum_{i=1}^{N} M_i$$

The pictorial scheme of two stage sampling with unequal first stage units is as follows:

[Figure: Population of $\bar{M}N$ units arranged in $N$ clusters of $M_1, M_2, \ldots, M_N$ units. First stage sample: $n$ clusters (small in number) of $M_1, M_2, \ldots, M_n$ units. Second stage sample: $m_1, m_2, \ldots, m_n$ units from the $n$ selected clusters.]

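Now we consider different estimators for the estimation of the population mean.
1. Estimator based on the first stage unit means in the sample:

$$\hat{\bar{Y}} = \bar{y}_{S2} = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i(m_i)}$$

Bias:

$$\begin{aligned}
E(\bar{y}_{S2}) &= E\left[\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i(m_i)}\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2\left(\bar{y}_{i(m_i)}\right)\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n}\bar{Y}_i\right] \quad \text{[since a sample of size } m_i \text{ is selected out of } M_i \text{ units by SRSWOR]} \\
&= \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i \\
&= \bar{Y}_N \\
&\ne \bar{Y}.
\end{aligned}$$

So $\bar{y}_{S2}$ is a biased estimator of $\bar{Y}$ and its bias is given by

$$\begin{aligned}
Bias(\bar{y}_{S2}) &= E(\bar{y}_{S2}) - \bar{Y} \\
&= \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i - \frac{1}{N\bar{M}}\sum_{i=1}^{N} M_i\bar{Y}_i \\
&= -\frac{1}{N\bar{M}}\left[\sum_{i=1}^{N} M_i\bar{Y}_i - \frac{1}{N}\left(\sum_{i=1}^{N}\bar{Y}_i\right)\left(\sum_{i=1}^{N} M_i\right)\right] \\
&= -\frac{1}{N\bar{M}}\sum_{i=1}^{N}\left(M_i - \bar{M}\right)\left(\bar{Y}_i - \bar{Y}_N\right).
\end{aligned}$$

This bias can be estimated by

$$\widehat{Bias}(\bar{y}_{S2}) = -\frac{N-1}{N\bar{M}(n-1)}\sum_{i=1}^{n}\left(M_i - \bar{m}\right)\left(\bar{y}_{i(m_i)} - \bar{y}_{S2}\right)$$

where $\bar{m}$ is taken here to be the mean of the $M_i$ over the selected first stage units (the notes do not define it explicitly). This can be seen as follows:

$$\begin{aligned}
E\left[\widehat{Bias}(\bar{y}_{S2})\right] &= -\frac{N-1}{N\bar{M}}E_1\left[\frac{1}{n-1}\sum_{i=1}^{n} E_2\left\{\left(M_i - \bar{m}\right)\left(\bar{y}_{i(m_i)} - \bar{y}_{S2}\right) \mid n\right\}\right] \\
&= -\frac{N-1}{N\bar{M}}E_1\left[\frac{1}{n-1}\sum_{i=1}^{n}\left(M_i - \bar{m}\right)\left(\bar{Y}_i - \bar{y}_n\right)\right] \\
&= -\frac{1}{N\bar{M}}\sum_{i=1}^{N}\left(M_i - \bar{M}\right)\left(\bar{Y}_i - \bar{Y}_N\right) \\
&= \bar{Y}_N - \bar{Y}
\end{aligned}$$

where $\bar{y}_n = \dfrac{1}{n}\sum_{i=1}^{n}\bar{Y}_i$.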

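An unbiased estimator of the population mean $\bar{Y}$ is thus obtained as

$$\bar{y}_{S2} + \frac{N-1}{N\bar{M}}\,\frac{1}{n-1}\sum_{i=1}^{n}\left(M_i - \bar{m}\right)\left(\bar{y}_{i(m_i)} - \bar{y}_{S2}\right),$$

where $\bar{m}$ is the mean of the $M_i$ over the selected first stage units.
Note that the bias arises because the first stage units are of unequal size, so the probability of selection of a second stage unit varies from one first stage unit to another.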
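The bias of $\bar{y}_{S2}$ depends only on population quantities, so it can be evaluated directly for a small example. The sketch below compares the direct difference $\bar{Y}_N - \bar{Y}$ with the covariance-type expression $-\frac{1}{N\bar{M}}\sum_{i}(M_i - \bar{M})(\bar{Y}_i - \bar{Y}_N)$ using exact fractions; the function name `bias_of_mean_of_means` and the populations are hypothetical.

```python
from fractions import Fraction

def bias_of_mean_of_means(population):
    """Return (direct, formula) values of the bias of y_S2.

    population: list of N clusters, each a list of its y_ij values.
    direct  = Y_N - Y_bar
    formula = -(1 / (N * M_bar)) * sum((M_i - M_bar) * (Y_i - Y_N))
    """
    N = len(population)
    sizes = [len(c) for c in population]
    M_bar = Fraction(sum(sizes), N)
    cluster_means = [Fraction(sum(c), len(c)) for c in population]
    Y_N = sum(cluster_means, Fraction(0)) / N
    Y_bar = Fraction(sum(sum(c) for c in population), sum(sizes))
    direct = Y_N - Y_bar
    cross = sum(((Mi - M_bar) * (Yi - Y_N)
                 for Mi, Yi in zip(sizes, cluster_means)), Fraction(0))
    formula = -cross / (N * M_bar)
    return direct, formula

# Hypothetical population with unequal cluster sizes 2, 4 and 3.
unequal = bias_of_mean_of_means([[2, 4], [1, 3, 5, 7], [10, 20, 30]])
# With equal cluster sizes the bias vanishes.
equal = bias_of_mean_of_means([[2, 4], [1, 3], [10, 20]])
```

The two entries of each returned pair should agree, and both are zero whenever all $M_i$ are equal, which illustrates that the bias comes from the inequality of the first stage unit sizes.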

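Variance:

$$\begin{aligned}
Var(\bar{y}_{S2}) &= Var\left[E(\bar{y}_{S2} \mid n)\right] + E\left[Var(\bar{y}_{S2} \mid n)\right] \\
&= Var\left[\frac{1}{n}\sum_{i=1}^{n}\bar{Y}_i\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} Var\left(\bar{y}_{i(m_i)} \mid i\right)\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + E\left[\frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \frac{1}{Nn}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2
\end{aligned}$$

where

$$S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(\bar{Y}_i - \bar{Y}_N\right)^2$$

$$S_i^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}\left(y_{ij} - \bar{Y}_i\right)^2.$$

The MSE can be obtained as

$$MSE(\bar{y}_{S2}) = Var(\bar{y}_{S2}) + \left[Bias(\bar{y}_{S2})\right]^2.$$

Estimation of variance:
Consider the mean square between cluster means in the sample

$$s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\bar{y}_{i(m_i)} - \bar{y}_{S2}\right)^2.$$

It can be shown that

$$E(s_b^2) = S_b^2 + \frac{1}{N}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2.$$

Also

$$s_i^2 = \frac{1}{m_i - 1}\sum_{j=1}^{m_i}\left(y_{ij} - \bar{y}_{i(m_i)}\right)^2$$

$$E(s_i^2) = S_i^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}\left(y_{ij} - \bar{Y}_i\right)^2.$$

So

$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2\right] = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2.$$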

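Thus

$$E(s_b^2) = S_b^2 + E\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2\right]$$

and an unbiased estimator of $S_b^2$ is

$$\hat{S}_b^2 = s_b^2 - \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2.$$

So an estimator of the variance can be obtained by replacing $S_b^2$ and $S_i^2$ by their unbiased estimators, with $\frac{1}{N}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2$ estimated by its sample counterpart $\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2$, as

$$\begin{aligned}
\widehat{Var}(\bar{y}_{S2}) &= \left(\frac{1}{n} - \frac{1}{N}\right)\hat{S}_b^2 + \frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2 \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)s_b^2 + \frac{1}{Nn}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2.
\end{aligned}$$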

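2. Estimation based on first stage unit totals:

$$\hat{\bar{Y}} = \bar{y}_{S2}^{*} = \frac{1}{n}\sum_{i=1}^{n}\frac{M_i\bar{y}_{i(m_i)}}{\bar{M}} = \frac{1}{n}\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)}$$

where $u_i = \dfrac{M_i}{\bar{M}}$.

Bias

$$\begin{aligned}
E(\bar{y}_{S2}^{*}) &= E\left[\frac{1}{n}\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)}\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} u_i E_2\left(\bar{y}_{i(m_i)} \mid i\right)\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} u_i\bar{Y}_i\right] \\
&= \frac{1}{N}\sum_{i=1}^{N} u_i\bar{Y}_i \\
&= \bar{Y}.
\end{aligned}$$

Thus $\bar{y}_{S2}^{*}$ is an unbiased estimator of $\bar{Y}$.

Variance:

$$\begin{aligned}
Var(\bar{y}_{S2}^{*}) &= Var\left[E(\bar{y}_{S2}^{*} \mid n)\right] + E\left[Var(\bar{y}_{S2}^{*} \mid n)\right] \\
&= Var\left[\frac{1}{n}\sum_{i=1}^{n} u_i\bar{Y}_i\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} u_i^2\,Var\left(\bar{y}_{i(m_i)} \mid i\right)\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_b^{*2} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2
\end{aligned}$$

where

$$S_i^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}\left(y_{ij} - \bar{Y}_i\right)^2$$

$$S_b^{*2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(u_i\bar{Y}_i - \bar{Y}\right)^2.$$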

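3. Estimator based on ratio estimator:

$$\hat{\bar{Y}} = \bar{y}_{S2}^{**} = \frac{\sum_{i=1}^{n} M_i\bar{y}_{i(m_i)}}{\sum_{i=1}^{n} M_i} = \frac{\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)}}{\sum_{i=1}^{n} u_i} = \frac{\bar{y}_{S2}^{*}}{\bar{u}_n}$$

where $u_i = \dfrac{M_i}{\bar{M}}$ and $\bar{u}_n = \dfrac{1}{n}\sum_{i=1}^{n} u_i$.

This estimator can be seen as arising from the ratio method of estimation as follows:

Let

$$y_i^{*} = u_i\bar{y}_{i(m_i)}, \qquad x_i^{*} = \frac{M_i}{\bar{M}}, \quad i = 1, 2, \ldots, N$$

be the values of the study variable and the auxiliary variable in reference to the ratio method of estimation. Then

$$\bar{y}^{*} = \frac{1}{n}\sum_{i=1}^{n} y_i^{*} = \bar{y}_{S2}^{*}$$

$$\bar{x}^{*} = \frac{1}{n}\sum_{i=1}^{n} x_i^{*} = \bar{u}_n$$

$$\bar{X}^{*} = \frac{1}{N}\sum_{i=1}^{N} x_i^{*} = 1.$$

The corresponding ratio estimator of $\bar{Y}$ is

$$\hat{\bar{Y}}_R = \frac{\bar{y}^{*}}{\bar{x}^{*}}\bar{X}^{*} = \frac{\bar{y}_{S2}^{*}}{\bar{u}_n}\cdot 1 = \bar{y}_{S2}^{**}.$$

So the bias and mean squared error of $\bar{y}_{S2}^{**}$ can be obtained directly from the results for the ratio estimator.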
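To make the three estimators concrete, the sketch below computes $\bar{y}_{S2}$, $\bar{y}_{S2}^{*}$ and $\bar{y}_{S2}^{**}$ from one observed two stage sample. It assumes that the cluster sizes $M_i$ of the selected clusters and the population mean cluster size $\bar{M}$ are known; the function name `estimators`, the argument layout and the sample values are hypothetical.

```python
def estimators(sample, sizes, M_bar):
    """Return (y_S2, y_S2_star, y_S2_ratio) for one two stage sample.

    sample: dict {cluster index i: list of the m_i observed values}
    sizes:  dict {cluster index i: M_i} for the selected clusters
    M_bar:  mean cluster size in the population (assumed known)
    """
    n = len(sample)
    ybar = {i: sum(v) / len(v) for i, v in sample.items()}   # cluster sample means
    u = {i: sizes[i] / M_bar for i in sample}                # u_i = M_i / M_bar
    y_s2 = sum(ybar.values()) / n                            # mean of cluster means
    y_star = sum(u[i] * ybar[i] for i in sample) / n         # based on unit totals
    u_bar = sum(u.values()) / n
    y_ratio = y_star / u_bar                                 # ratio-type estimator
    return y_s2, y_star, y_ratio

# Hypothetical sample: clusters 0 and 2 selected, two units observed in each.
y_s2, y_star, y_ratio = estimators(
    sample={0: [2, 4], 2: [10, 30]},
    sizes={0: 2, 2: 3},
    M_bar=3,
)
```

The ratio form needs only the sizes of the selected clusters, since $\bar{M}$ cancels between numerator and denominator, whereas $\bar{y}_{S2}^{*}$ requires $\bar{M}$ itself. When all clusters have the same size the three values coincide.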

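Recall that in the ratio method of estimation, the bias and MSE of the ratio estimator up to the second order of approximation are

$$Bias(\hat{\bar{Y}}_R) = \frac{N-n}{Nn}\,\bar{Y}\left(C_x^2 - \rho\,C_x C_y\right) = \bar{Y}\left[\frac{Var(\bar{x})}{\bar{X}^2} - \frac{Cov(\bar{x}, \bar{y})}{\bar{X}\bar{Y}}\right]$$

$$MSE(\hat{\bar{Y}}_R) = Var(\bar{y}) + R^2\,Var(\bar{x}) - 2R\,Cov(\bar{x}, \bar{y})$$

where $R = \dfrac{\bar{Y}}{\bar{X}}$.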
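Bias:
The bias of $\bar{y}_{S2}^{**}$ up to the second order of approximation is

$$Bias(\bar{y}_{S2}^{**}) = \bar{Y}\left[\frac{Var(\bar{x}_{S2}^{*})}{\bar{X}^2} - \frac{Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*})}{\bar{X}\bar{Y}}\right]$$

where $\bar{x}_{S2}^{*}$ is the mean of the auxiliary variable, defined similarly to $\bar{y}_{S2}^{*}$ as $\bar{x}_{S2}^{*} = \dfrac{1}{n}\sum_{i=1}^{n} u_i\bar{x}_{i(m_i)}$.

Now we find $Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*})$.

$$\begin{aligned}
Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*}) &= Cov\left[E\left(\frac{1}{n}\sum_{i=1}^{n} u_i\bar{x}_{i(m_i)}\right), E\left(\frac{1}{n}\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)}\right)\right] + E\left[Cov\left(\frac{1}{n}\sum_{i=1}^{n} u_i\bar{x}_{i(m_i)}, \frac{1}{n}\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)}\right)\right] \\
&= Cov\left[\frac{1}{n}\sum_{i=1}^{n} u_i E(\bar{x}_{i(m_i)}), \frac{1}{n}\sum_{i=1}^{n} u_i E(\bar{y}_{i(m_i)})\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} u_i^2\,Cov\left(\bar{x}_{i(m_i)}, \bar{y}_{i(m_i)} \mid i\right)\right] \\
&= Cov\left[\frac{1}{n}\sum_{i=1}^{n} u_i\bar{X}_i, \frac{1}{n}\sum_{i=1}^{n} u_i\bar{Y}_i\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_{bxy}^{*} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}
\end{aligned}$$

where

$$S_{bxy}^{*} = \frac{1}{N-1}\sum_{i=1}^{N}\left(u_i\bar{X}_i - \bar{X}\right)\left(u_i\bar{Y}_i - \bar{Y}\right)$$

$$S_{ixy} = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}\left(x_{ij} - \bar{X}_i\right)\left(y_{ij} - \bar{Y}_i\right).$$

Similarly, $Var(\bar{x}_{S2}^{*})$ can be obtained by replacing $y$ by $x$ in $Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*})$ as

$$Var(\bar{x}_{S2}^{*}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_{bx}^{*2} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ix}^2$$

where

$$S_{bx}^{*2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(u_i\bar{X}_i - \bar{X}\right)^2$$

$$S_{ix}^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}\left(x_{ij} - \bar{X}_i\right)^2.$$

Substituting $Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*})$ and $Var(\bar{x}_{S2}^{*})$ in $Bias(\bar{y}_{S2}^{**})$, we obtain the approximate bias as

$$Bias(\bar{y}_{S2}^{**}) \approx \bar{Y}\left[\left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{S_{bx}^{*2}}{\bar{X}^2} - \frac{S_{bxy}^{*}}{\bar{X}\bar{Y}}\right) + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(\frac{S_{ix}^2}{\bar{X}^2} - \frac{S_{ixy}}{\bar{X}\bar{Y}}\right)\right].$$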

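Mean squared error

$$MSE(\bar{y}_{S2}^{**}) \approx Var(\bar{y}_{S2}^{*}) - 2R^{*}\,Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*}) + R^{*2}\,Var(\bar{x}_{S2}^{*})$$

$$Var(\bar{y}_{S2}^{*}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_{by}^{*2} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{iy}^2$$

$$Var(\bar{x}_{S2}^{*}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_{bx}^{*2} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ix}^2$$

$$Cov(\bar{x}_{S2}^{*}, \bar{y}_{S2}^{*}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_{bxy}^{*} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}$$

where

$$S_{by}^{*2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(u_i\bar{Y}_i - \bar{Y}\right)^2$$

$$S_{iy}^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}\left(y_{ij} - \bar{Y}_i\right)^2$$

$$R^{*} = \frac{\bar{Y}}{\bar{X}}, \quad \text{which equals } \bar{Y} \text{ when } \bar{X} = 1.$$

Thus

$$MSE(\bar{y}_{S2}^{**}) \approx \left(\frac{1}{n} - \frac{1}{N}\right)\left(S_{by}^{*2} - 2R^{*}S_{bxy}^{*} + R^{*2}S_{bx}^{*2}\right) + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(S_{iy}^2 - 2R^{*}S_{ixy} + R^{*2}S_{ix}^2\right).$$

Also

$$MSE(\bar{y}_{S2}^{**}) \approx \left(\frac{1}{n} - \frac{1}{N}\right)\frac{1}{N-1}\sum_{i=1}^{N} u_i^2\left(\bar{Y}_i - R^{*}\bar{X}_i\right)^2 + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(S_{iy}^2 - 2R^{*}S_{ixy} + R^{*2}S_{ix}^2\right).$$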

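Estimate of variance
Consider

$$s_{bxy}^{*} = \frac{1}{n-1}\sum_{i=1}^{n}\left(u_i\bar{y}_{i(m_i)} - \bar{y}_{S2}^{*}\right)\left(u_i\bar{x}_{i(m_i)} - \bar{x}_{S2}^{*}\right)$$

$$s_{ixy} = \frac{1}{m_i - 1}\sum_{j=1}^{m_i}\left(x_{ij} - \bar{x}_{i(m_i)}\right)\left(y_{ij} - \bar{y}_{i(m_i)}\right).$$

It can be shown that

$$E\left(s_{bxy}^{*}\right) = S_{bxy}^{*} + \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}$$

$$E(s_{ixy}) = S_{ixy}.$$

So

$$E\left[\frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ixy}\right] = \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}.$$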

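Thus

$$\hat{S}_{bxy}^{*} = s_{bxy}^{*} - \frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ixy}$$

$$\hat{S}_{bx}^{*2} = s_{bx}^{*2} - \frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ix}^2$$

$$\hat{S}_{by}^{*2} = s_{by}^{*2} - \frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{iy}^2.$$

Also

$$E\left[\frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ix}^2\right] = \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ix}^2$$

$$E\left[\frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{iy}^2\right] = \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{iy}^2.$$

A consistent estimator of the MSE of $\bar{y}_{S2}^{**}$ can be obtained by substituting the unbiased estimators of the respective statistics in $MSE(\bar{y}_{S2}^{**})$ as

$$\begin{aligned}
\widehat{MSE}(\bar{y}_{S2}^{**}) &\approx \left(\frac{1}{n} - \frac{1}{N}\right)\left(s_{by}^{*2} - 2r^{*}s_{bxy}^{*} + r^{*2}s_{bx}^{*2}\right) + \frac{1}{nN}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(s_{iy}^2 - 2r^{*}s_{ixy} + r^{*2}s_{ix}^2\right) \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)\frac{1}{n-1}\sum_{i=1}^{n} u_i^2\left(\bar{y}_{i(m_i)} - r^{*}\bar{x}_{i(m_i)}\right)^2 + \frac{1}{nN}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(s_{iy}^2 - 2r^{*}s_{ixy} + r^{*2}s_{ix}^2\right)
\end{aligned}$$

where $r^{*} = \dfrac{\bar{y}_{S2}^{*}}{\bar{x}_{S2}^{*}}$.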

280 pages
Linear Regression
No ratings yet
Linear Regression
14 pages
Assignment-Based Subjective Questions/Answers
No ratings yet
Assignment-Based Subjective Questions/Answers
3 pages
ECON1203 Course Outline
No ratings yet
ECON1203 Course Outline
21 pages
Q4 Statistics and Probability 11 - Module 1
No ratings yet
Q4 Statistics and Probability 11 - Module 1
18 pages
LAMPIRAN B_MINOR (BI)
No ratings yet
LAMPIRAN B_MINOR (BI)
19 pages
Bowerman Regression CHPT 1
100% (1)
Bowerman Regression CHPT 1
18 pages
ARCH Effect Explained (Excel)
100% (2)
ARCH Effect Explained (Excel)
7 pages
CHAPTER 6lesson 4
No ratings yet
CHAPTER 6lesson 4
11 pages
4.1.2.4 Lab - Simple Linear Regression in Python
No ratings yet
4.1.2.4 Lab - Simple Linear Regression in Python
6 pages
Priv - Outliers Omitted - NGO Still Sign
No ratings yet
Priv - Outliers Omitted - NGO Still Sign
8 pages
Chi Square
No ratings yet
Chi Square
2 pages
Multiple linear regression
No ratings yet
Multiple linear regression
39 pages
Exam SRM Sample Questions 2
No ratings yet
Exam SRM Sample Questions 2
60 pages
Worksheet 2.2
No ratings yet
Worksheet 2.2
7 pages
Pengaruh Kualitas Produk Dan Harga Terhadap Keputusan Pembelian Mobil Daihatsu Grand Max Pick Up
No ratings yet
Pengaruh Kualitas Produk Dan Harga Terhadap Keputusan Pembelian Mobil Daihatsu Grand Max Pick Up
12 pages
Keller SME 12e PPT CH06
No ratings yet
Keller SME 12e PPT CH06
31 pages
Least Squares Matrix Form PDF
No ratings yet
Least Squares Matrix Form PDF
16 pages
50 Important Statistics' Q & A To Crack DS Interview
No ratings yet
50 Important Statistics' Q & A To Crack DS Interview
14 pages


For example, in a crop survey

In another example, to obtain a sample of fishes from a commercial fishery

Two stage sampling with equal first stage units:

unit; $i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, m$.

[Figure: population (MN units) divided into clusters; clusters selected at the first stage.]

In the case of two stage sampling,

In case of three stage sampling,

To calculate the variance, we proceed as follows:

Now average over first stage selection as

In case of three stage sampling,

$= E_1\left[E_2(\bar{y}_{im} \mid i)\right]$ (as $\bar{y}_{im}$ is unbiased for $\bar{Y}_i$ due to SRSWOR)

Thus $\bar{y}_{mn}$ is an unbiased estimator of the population mean.
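
The unbiasedness claim can be checked numerically on a small population. The following is a minimal sketch, assuming equal-sized clusters and SRSWOR at both stages; the function names and the toy data are hypothetical and not part of the original notes. It computes the exact expectation of the mean per second stage unit by enumerating every possible sample, first within clusters ($E_2$) and then over clusters ($E_1$).

```python
import random
from itertools import combinations
from statistics import mean


def two_stage_mean(clusters, n, m, rng=random):
    """Draw one two stage sample and return the mean per second stage unit.

    clusters: list of N first stage units, each a list of M values
    (hypothetical toy data, assumed equal-sized).
    """
    # First stage: SRSWOR of n clusters out of N.
    chosen = rng.sample(range(len(clusters)), n)
    # Second stage: SRSWOR of m elements within each chosen cluster.
    cluster_means = [mean(rng.sample(clusters[i], m)) for i in chosen]
    return mean(cluster_means)


def expected_two_stage_mean(clusters, n, m):
    """Exact expectation of the estimator, by enumerating all samples."""
    # E2: average the subsample mean over every subsample of size m.
    e2 = [mean(mean(s) for s in combinations(c, m)) for c in clusters]
    # E1: average over every possible choice of n clusters.
    return mean(
        mean(e2[i] for i in fsu)
        for fsu in combinations(range(len(clusters)), n)
    )


clusters = [[1, 2, 3, 4], [2, 4, 6, 8], [5, 5, 7, 9]]  # N = 3, M = 4
population_mean = mean(y for c in clusters for y in c)
print(expected_two_stage_mean(clusters, n=2, m=2), population_mean)
```

The two printed values should agree up to floating-point error, while a single call to `two_stage_mean` gives one random realisation of the estimator.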

estimators in the expression of the variance of $\bar{y}_{mn}$.

so $s_w^2$ is an unbiased estimator of $S_w^2$.

Case 1. When cost is fixed

(I) When cost function is C = kmn

n assumes maximum value, i.e.,

The optimum $n$ follows from $C_0 = k_1 n + k_2 mn$ as
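
As an illustration of how the cost constraint fixes $n$ once $m$ is chosen, here is a minimal sketch that solves $C_0 = k_1 n + k_2 mn$ for $n$ and rounds down to a whole number of clusters. The function name and the numbers in the usage line are hypothetical; the choice of the optimum $m$ itself is not covered here.

```python
import math


def first_stage_size(c0, k1, k2, m):
    """Largest whole n satisfying k1*n + k2*m*n <= c0 for a given m.

    c0: total budget, k1: cost per first stage unit,
    k2: cost per second stage unit, m: subsample size per cluster.
    """
    # Solving c0 = k1*n + k2*m*n for n gives n = c0 / (k1 + k2*m).
    return math.floor(c0 / (k1 + k2 * m))


# Hypothetical budget of 1000 with k1 = 10, k2 = 2, m = 20.
n = first_stage_size(1000, 10, 2, 20)
print(n, 10 * n + 2 * 20 * n)
```

Rounding down keeps the realised cost within the budget; rounding to the nearest integer could overshoot it.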

is minimum if m is the greatest attainable integer. Hence in this case, when

Comparison of two stage sampling with one stage sampling

Case 1: Sampling mn elements in one single stage

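$\sum_{i=1}^{N}\sum_{j=1}^{M} (y_{ij} - \bar{Y})^2 = \sum_{i=1}^{N}\sum_{j=1}^{M} (y_{ij} - \bar{Y}_i)^2 + \sum_{i=1}^{N}\sum_{j=1}^{M} (\bar{Y}_i - \bar{Y})^2$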
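
The split of the total sum of squares into a within-cluster part and a between-cluster part can be verified on toy data. This is a minimal sketch with hypothetical names and values; the between-cluster term is written as cluster size times the squared deviation of the cluster mean, which equals the double sum over $i$ and $j$.

```python
from statistics import mean


def ss_decomposition(clusters):
    """Return (total, within, between) sums of squares for a list of clusters."""
    grand = mean(y for c in clusters for y in c)
    # Total: deviations of every element from the population mean.
    total = sum((y - grand) ** 2 for c in clusters for y in c)
    # Within: deviations of every element from its own cluster mean.
    within = sum((y - mean(c)) ** 2 for c in clusters for y in c)
    # Between: cluster-mean deviations, counted once per element in the cluster.
    between = sum(len(c) * (mean(c) - grand) ** 2 for c in clusters)
    return total, within, between


total, within, between = ss_decomposition([[1, 2, 3, 4], [2, 4, 6, 8], [5, 5, 7, 9]])
print(total, within + between)
```

The two printed numbers should match up to floating-point error.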

 ( N  1) MSb2 [1  ( M  1)]   M ( M  1)( MN  1) S 2

Substituting $S_b^2$ and $S_w^2$ in $Var(\bar{y}_{TS})$

Case 2: Comparison with cluster sampling

$M_i$: number of second stage units in the $i$th first stage unit $(i = 1, 2, \ldots, N)$.

[Figure: population (MN units) divided into clusters; clusters selected at the first stage.]

So $\bar{y}_{S2}$ is a biased estimator of $\bar{Y}$ and its bias is given by

The MSE can be obtained as

$MSE(\bar{y}_{S2}) = Var(\bar{y}_{S2}) + \left[Bias(\bar{y}_{S2})\right]^2$.
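
To make the bias and MSE relation concrete, here is a minimal sketch under a stated assumption: the estimator is taken to be the unweighted mean of the sampled cluster means, with SRSWOR at both stages, so that its expectation is the simple average of the cluster means. The function names, the toy clusters and the variance value are hypothetical.

```python
from statistics import mean


def bias_of_unweighted_mean(clusters):
    """Bias of the unweighted mean of cluster means (assumed estimator).

    Its expectation is the average of the cluster means, which differs
    from the population mean when cluster sizes are unequal.
    """
    mean_of_cluster_means = mean(mean(c) for c in clusters)
    population_mean = mean(y for c in clusters for y in c)
    return mean_of_cluster_means - population_mean


def mse(variance, bias):
    """Mean squared error as variance plus squared bias."""
    return variance + bias ** 2


clusters = [[1, 2], [3, 4, 5, 6]]  # unequal sizes: M_1 = 2, M_2 = 4
b = bias_of_unweighted_mean(clusters)
print(b, mse(2.0, b))  # 2.0 is a made-up variance, for illustration only
```

With equal cluster sizes the bias computed this way is zero, which matches the equal first stage unit case.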

2. Estimation based on first stage unit totals:

The corresponding ratio estimator of Y is

$Var(\bar{x}^{*}_{S2})$, $Cov(\bar{x}^{*}_{S2}, \bar{y}^{*}_{S2})$

Now we find $Cov(\bar{x}^{*}_{S2}, \bar{y}^{*}_{S2})$.

It can be shown that

respective statistics in $MSE(\bar{y}^{**}_{S2})$ as

