Lecture 6-3 - Simple Random Sampling


Simple Random Sampling

Professor Ron Fricker
Naval Postgraduate School
Monterey, California

Reading Assignment:
Scheaffer, Mendenhall, Ott, & Gerow, Chapter 4

2/1/13

Goals for this Lecture

•  Define simple random sampling (SRS) and discuss how to draw one
•  Horvitz-Thompson estimation and SRS
–  The finite population correction (fpc)
•  Defining estimators for means, totals, and proportions
•  Sample size calculations


Definition

•  Simple random sampling (SRS) occurs when every sample of size n (from a population of size N) has an equal chance of being selected
–  This is not how we will actually draw such a sample, just how it’s defined
•  Note it is not defined as each element having an equal chance of being selected
–  That can occur with more complex designs, particularly stratified designs
•  An example…

Example

•  Consider a population consisting of 90 men and 10 women, so N=100, where we want to sample n=10 individuals
–  With SRS, we can get samples of all men or all women
•  We could also draw a stratified sample, where via SRS we sample nine men and (separately) via SRS one woman
–  Here each person has probability 1/10 of being sampled, but not all groups of 10 can be sampled
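
The contrast between the two designs in this example can be checked by counting. The sketch below is an illustration rather than part of the lecture: it relies only on Python's standard `math.comb`, and the variable names are made up for this example.

```python
from math import comb

N_MEN, N_WOMEN, n = 90, 10, 10
N = N_MEN + N_WOMEN

# SRS: every one of the C(100, 10) possible samples is equally likely.
srs_samples = comb(N, n)
p_all_men = comb(N_MEN, n) / srs_samples      # roughly 0.33
p_all_women = comb(N_WOMEN, n) / srs_samples  # only one such sample exists

# Stratified design: 9 of the 90 men and 1 of the 10 women, each drawn by SRS.
stratified_samples = comb(N_MEN, 9) * comb(N_WOMEN, 1)
p_man_included = 9 / N_MEN      # 1/10
p_woman_included = 1 / N_WOMEN  # 1/10

print(f"P(all men) under SRS:   {p_all_men:.4f}")
print(f"P(all women) under SRS: {p_all_women:.3e}")
print(f"Share of SRS samples the stratified design can produce: "
      f"{stratified_samples / srs_samples:.4f}")
```

Both designs give each person a 1/10 chance of inclusion, but the stratified design can only ever produce samples with exactly nine men and one woman, so it is not SRS.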


How to Draw a SRS

•  Easiest way:
–  Assign every element in the sampling frame a uniformly distributed random number (say between 0 and 1)
–  Sort the list according to the random numbers
•  Either ascending or descending, doesn’t matter
–  Then take the first n elements
•  Don’t try to actually generate all possible combinations of n elements out of N…
•  Chapter 4 describes other manual ways to do this using tables of random numbers
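
The assign, sort, and take-first-n recipe above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sampling frame fits in memory as a list; the function name `draw_srs` and the frame of `unit_…` labels are hypothetical.

```python
import random

def draw_srs(frame, n, seed=None):
    """Draw a simple random sample of size n by sorting on random keys."""
    if not 0 <= n <= len(frame):
        raise ValueError("n must be between 0 and the frame size")
    rng = random.Random(seed)
    # Step 1: attach a Uniform(0, 1) random number to every element.
    keyed = [(rng.random(), element) for element in frame]
    # Step 2: sort by the random number (ascending here; direction is irrelevant).
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: keep the first n elements.
    return [element for _, element in keyed[:n]]

frame = [f"unit_{i:03d}" for i in range(1, 101)]  # hypothetical sampling frame
sample = draw_srs(frame, n=10, seed=1)
print(sample)
```

In practice the standard library's `random.sample(frame, n)` also draws without replacement; the version above is spelled out only to mirror the steps on the slide.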

Example

[Figure: example list shown in two panels, UNSORTED and SORTED]


Note the Difference

•  So, notice that giving every element in the population an equal chance of selection like this results in a SRS
•  Which is probably why SRS is often mistakenly defined this way
•  But remember that other non-SRS methods can also result in every element having an equal chance of being selected
–  For example, stratified sampling when each stratum’s sample size is proportional to the stratum’s size


Horvitz-Thompson Under SRS

•  Under SRS, each sampling unit has probability n/N of being selected
•  Estimating µ with Horvitz-Thompson estimator, we have
$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{n}\frac{1}{\pi_i}\,y_i = \frac{1}{N}\sum_{i=1}^{n}\frac{1}{n/N}\,y_i = \frac{1}{N}\sum_{i=1}^{n}\frac{N}{n}\,y_i = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}$$
–  Same as Stats 101!
•  If population is infinite, standard error of $\bar{y}$ is estimated the same way too: $\hat{\sigma}_{\bar{y}} = s/\sqrt{n}$
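
A quick numerical check of the identity above: with every inclusion probability set to n/N, the Horvitz-Thompson estimate collapses to the ordinary sample mean. The function name, the population size, and the data values below are hypothetical, chosen only for illustration.

```python
def horvitz_thompson_mean(sample, inclusion_probs, N):
    """Horvitz-Thompson estimate of the population mean."""
    return sum(y / pi for y, pi in zip(sample, inclusion_probs)) / N

N = 50                                   # hypothetical population size
sample = [12.0, 15.5, 9.0, 11.5, 14.0]   # hypothetical observed values
n = len(sample)
pis = [n / N] * n                        # under SRS every unit has pi_i = n/N

ht = horvitz_thompson_mean(sample, pis, N)
y_bar = sum(sample) / n
print(ht, y_bar)  # the two agree up to floating-point rounding
```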


But What If Population Is Finite?

•  It can be shown (see Appendix A of SMO&G) that for finite populations,
$$E(S^2) = \frac{N}{N-1}\,\sigma^2$$
•  So, an unbiased estimate for the variance of the sample mean is:
$$\widehat{\mathrm{Var}}(\bar{Y}) = \left(\frac{N-n}{N}\right)\frac{s^2}{n}$$
•  And thus the estimated standard error is:
$$\mathrm{s.e.}(\bar{Y}) = \sqrt{1-\frac{n}{N}} \times \frac{s}{\sqrt{n}}$$
–  The factor involving n/N is the “finite population correction” or fpc
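
As a sketch of how the standard error formula above might be applied from summary statistics, the helper below takes the sample standard deviation and sizes as inputs. The function name `se_mean` and the numbers in the usage lines are assumptions made for illustration.

```python
from math import sqrt

def se_mean(s, n, N=None):
    """Estimated standard error of the sample mean.

    s: sample standard deviation, n: sample size,
    N: population size (None means treat the population as infinite).
    """
    se = s / sqrt(n)
    if N is not None:
        se *= sqrt(1 - n / N)  # finite population correction
    return se

# Hypothetical numbers: s = 10, n = 100.
print(se_mean(10, 100))          # infinite population
print(se_mean(10, 100, N=400))   # finite population, n/N = 0.25
```

When n equals N (a census) the corrected standard error is zero, which is the sensible limit: nothing is left to estimate.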

Finite Population Correction

•  Note that failure to use the finite population correction (fpc) results in standard errors that are too large
–  Confidence intervals will be (erroneously) too big
–  Hypothesis tests will be (erroneously) less powerful
•  For a survey with sample size less than 5 percent of population, can ignore the fpc
–  It will have negligible effect
•  If sample size larger than 5 percent, use fpc to get more precise results – a good thing!
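
To see why 5 percent is a reasonable cutoff, the short sketch below tabulates the multiplier $\sqrt{1 - n/N}$ for a few sampling fractions. The population size of 1000 is an arbitrary choice for illustration; only the ratio n/N matters.

```python
from math import sqrt

def fpc(n, N):
    """Multiplier applied to the infinite-population standard error."""
    return sqrt(1 - n / N)

for fraction in (0.01, 0.05, 0.10, 0.25, 0.50):
    factor = fpc(fraction * 1000, 1000)  # N = 1000 is an arbitrary choice
    print(f"n/N = {fraction:.2f}: fpc = {factor:.3f}, "
          f"standard error shrinks by {100 * (1 - factor):.1f}%")
```

At a 5 percent sampling fraction the standard error changes by only about 2.5 percent, while at 50 percent it shrinks by roughly 29 percent.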

Example: Margin of Error Estimates

•  For various sample sizes, margins of error for an infinite-sized population and one with N=300
–  Binary question
–  Conservative p=0.5 assumption


Another Example

•  Survey asks a binary yes/no question
–  Estimate the proportion of respondents who say “yes” with a confidence interval (N=300 and n=200)
–  If 100 of the 200 say “yes,” population point estimate is 50% ($\hat{p} = 0.5$)
•  Calculating the 95% confidence interval:
–  Incorrect interval without fpc: (43%, 57%)
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.5 \pm 1.96\sqrt{\frac{0.25}{200}} = 0.5 \pm 0.07$$
–  Correct interval with fpc: (46%, 54%)
$$\hat{p} \pm 1.96\sqrt{\left(1-\frac{n}{N}\right)\left(\frac{\hat{p}(1-\hat{p})}{n}\right)} = 0.5 \pm 1.96\sqrt{\left(\frac{1}{3}\right)\left(\frac{0.25}{200}\right)} = 0.5 \pm 0.04$$
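
The two intervals above can be reproduced with a small helper. This is a sketch under the slide's own assumptions (normal approximation, z = 1.96, variance p̂(1−p̂)/n); the function name `proportion_ci` is made up for this example.

```python
from math import sqrt

def proportion_ci(p_hat, n, N=None, z=1.96):
    """Confidence interval for a proportion, with the fpc when N is given."""
    var = p_hat * (1 - p_hat) / n
    if N is not None:
        var *= 1 - n / N  # finite population correction
    half_width = z * sqrt(var)
    return p_hat - half_width, p_hat + half_width

p_hat = 100 / 200
print(proportion_ci(p_hat, n=200))          # without fpc
print(proportion_ci(p_hat, n=200, N=300))   # with fpc
```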

Where Does the FPC Come From?

•  In an infinite population, if we sample two observations then $\mathrm{Cov}(Y_i, Y_j) = 0$
–  Doesn’t really matter whether we sample with replacement or not
•  For a finite population, when we sample without replacement,
$$\mathrm{Cov}(Y_i, Y_j) = -\frac{1}{N-1}\,\sigma^2$$
•  Picking one observation affects the rest, so there is correlation!
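
One way to see how this negative covariance produces the correction is to expand the variance of the sample mean directly. The derivation below is a sketch that assumes each $Y_i$ has variance $\sigma^2$ and each pair has the covariance given above.

```latex
\begin{aligned}
\mathrm{Var}(\bar{Y})
  &= \frac{1}{n^2}\left[\sum_{i=1}^{n}\mathrm{Var}(Y_i)
     + \sum_{i \neq j}\mathrm{Cov}(Y_i, Y_j)\right] \\
  &= \frac{1}{n^2}\left[n\sigma^2 - n(n-1)\frac{\sigma^2}{N-1}\right] \\
  &= \frac{\sigma^2}{n}\left(1 - \frac{n-1}{N-1}\right)
   = \left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}
\end{aligned}
```

Replacing $\sigma^2$ with its unbiased estimate $\frac{N-1}{N}s^2$ (from $E(S^2) = \frac{N}{N-1}\sigma^2$) gives $\left(\frac{N-n}{N}\right)\frac{s^2}{n}$, the variance estimate stated earlier.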


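Mean Estimation Summary

•  Estimator for the mean:
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
•  Variance of $\bar{y}$:
$$\widehat{\mathrm{Var}}(\bar{y}) = \left(1-\frac{n}{N}\right)\frac{s^2}{n}$$
•  Bound on the error of estimation (margin of error):
$$2\sqrt{\widehat{\mathrm{Var}}(\bar{y})} = 2\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2}{n}}$$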


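Estimating Totals

•  Estimator for the total:
$$\hat{\tau} = N \times \bar{y} = \frac{N}{n}\sum_{i=1}^{n} y_i$$
•  Variance of $\hat{\tau}$:
$$\widehat{\mathrm{Var}}(\hat{\tau}) = \widehat{\mathrm{Var}}(N\bar{y}) = N^2\left(1-\frac{n}{N}\right)\frac{s^2}{n}$$
•  Bound on the error of estimation (margin of error):
$$2\sqrt{\widehat{\mathrm{Var}}(\hat{\tau})} = 2N\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2}{n}}$$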
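
The total and its bound can be computed from raw data as follows. This is a minimal sketch: the function name `estimate_total` and the four data values are hypothetical, and `statistics.variance` is used because it returns the sample variance with divisor n − 1.

```python
from math import sqrt
from statistics import mean, variance

def estimate_total(sample, N):
    """Return (tau_hat, bound) for a population total under SRS."""
    n = len(sample)
    y_bar = mean(sample)
    s2 = variance(sample)                  # sample variance, divisor n - 1
    tau_hat = N * y_bar
    var_tau = N**2 * (1 - n / N) * s2 / n  # includes the fpc
    return tau_hat, 2 * sqrt(var_tau)

tau_hat, bound = estimate_total([2, 4, 6, 8], N=20)  # hypothetical data
print(f"total = {tau_hat} +/- {bound:.1f}")
```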


Estimating Proportions

•  Estimator for the proportion:
$$\hat{p} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
•  Variance of $\hat{p}$:
$$\mathrm{Var}(\hat{p}) = \left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n}$$
•  Bound on the error of estimation (margin of error):
$$2\sqrt{\mathrm{Var}(\hat{p})} = 2\sqrt{\left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n}}$$


Sample Size Calculations (w/ fpc) for Estimating Means

•  Typically, we want to determine a sample size to achieve a particular margin of error B
•  So, solving the following for n
$$2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} = B$$
gives
$$n = \frac{N\sigma^2}{B^2(N-1)/4 + \sigma^2}$$
•  This is the number of respondents required
–  Will need to inflate to account for nonrespondents
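
A sketch of this calculation in Python, with a back-substitution check that the resulting n really does give margin of error B. The function names and the planning values (N = 1000, a guessed variance of 100, B = 1) are assumptions for illustration, not values from the lecture.

```python
from math import ceil, sqrt

def sample_size_mean(N, sigma2, B):
    """Respondents needed so the margin of error for the mean is B."""
    return N * sigma2 / (B**2 * (N - 1) / 4 + sigma2)

def margin_of_error_mean(N, sigma2, n):
    """Margin of error implied by n respondents (the equation being solved)."""
    return 2 * sqrt((N - n) / (N - 1) * sigma2 / n)

# Hypothetical planning values: N = 1000 units, guessed variance 100, B = 1.
n_exact = sample_size_mean(N=1000, sigma2=100, B=1)
print(n_exact, ceil(n_exact))                    # round up to a whole respondent
print(margin_of_error_mean(1000, 100, n_exact))  # recovers B up to rounding
```

In practice σ² is unknown at the planning stage, so it has to come from a pilot study, a previous survey, or an educated guess.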

Sample Size Calculations (w/ fpc) for Estimating Totals

•  Proceed as before, but use the expression for the margin of error for totals
•  That is, solve the following for n
$$2N\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} = B$$
gives
$$n = \frac{N\sigma^2}{B^2(N-1)/(4N^2) + \sigma^2}$$
•  Again, don’t forget to inflate this to account for the nonresponse rate
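
The totals formula is the means formula with B replaced by B/N, since a margin of B on the total is a margin of B/N on the mean. The sketch below uses hypothetical planning values to show this; the function name is illustrative.

```python
def sample_size_total(N, sigma2, B):
    """Respondents needed so the margin of error for the total is B."""
    return N * sigma2 / (B**2 * (N - 1) / (4 * N**2) + sigma2)

# Hypothetical planning values: N = 1000 units, guessed variance 100.
# A margin of B = 1000 on the total is a margin of B/N = 1 on the mean.
n_exact = sample_size_total(N=1000, sigma2=100, B=1000)
print(n_exact)
```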

Sample Size Calculations (w/ fpc) for Estimating Proportions

•  Again proceed as before, but use the expression for proportions
•  That is, solve the following for n
$$2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{p(1-p)}{n}} = B$$
gives
$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)}$$
•  And again, don’t forget to inflate this to account for the nonresponse rate

Power Calculations Example

•  Back to survey with N=300, where we guess that p=50% (most conservative assumption)
•  What sample size do we need to achieve a margin of error of 3%?
$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} = \frac{300 \times 0.5(1-0.5)}{0.03^2(300-1)/4 + 0.5(1-0.5)} = 236.4$$
•  So, need responses from 237 out of the 300
–  If 80% response rate, must sample 237/0.8=297!
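
The arithmetic on this slide can be reproduced with a small function. This is a sketch of the formula as stated above; the name `sample_size_proportion` is made up for this example.

```python
from math import ceil

def sample_size_proportion(N, p, B):
    """Respondents needed for margin of error B on a proportion (with fpc)."""
    return N * p * (1 - p) / (B**2 * (N - 1) / 4 + p * (1 - p))

n_exact = sample_size_proportion(N=300, p=0.5, B=0.03)
respondents = ceil(n_exact)  # round up: a fraction of a respondent is not possible
print(round(n_exact, 1), respondents)
```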

For Our Project

•  Same assumptions:
–  Binary question
–  p=0.5
•  If we’re going to survey ~900 people out of 1500, might as well do them all?
–  Plus, 1500 gives some insurance if response rate < 0.7


Doing the Calculations Directly

•  First, we need this many respondents for a 3% margin of error:
$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} = \frac{1500 \times 0.5(1-0.5)}{0.03^2(1500-1)/4 + 0.5(1-0.5)} = 638.5$$
•  Then, accounting for nonresponse:
$$638.5 / 0.7 = 912.1$$
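
The nonresponse adjustment is a single division, but it is easy to forget to round up afterwards. The helper below is a sketch; the name `invitations_needed` is hypothetical, and it assumes the response rate is known in advance and applies uniformly.

```python
from math import ceil

def invitations_needed(respondents_required, response_rate):
    """Number of people to sample so expected respondents meet the target."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return ceil(respondents_required / response_rate)

print(invitations_needed(638.5, 0.7))  # N = 1500 project, 70% response rate
print(invitations_needed(237, 0.8))    # N = 300 example, 80% response rate
```

Rounding 912.1 up gives 913 people to sample, in the same way that 237/0.8 was rounded up to 297 in the N=300 example.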



Sample Size Calculations (w/out fpc) for Estimating Proportions

•  Similar to what we were doing, but margin of error expression does not include fpc
–  Choose B, the margin of error
–  Then, $B = 2\sqrt{\hat{p}(1-\hat{p})/n}$
–  Algebra gives required sample size:
$$n = \frac{4\hat{p}(1-\hat{p})}{B^2}$$
•  Can simplify further:
–  Estimate p using worst case: ½
–  Then, $n = 1/B^2$


Example

•  National poll of likely voters for candidate “X”
–  Desire 3% margin of error
•  Then $n = 1/B^2 = 1/0.03^2 = 1{,}111.1$
•  If expect a 70% response rate, then sample 1,111.1/0.7=1,587.3 or 1,588 likely voters
•  Compare to fpc-based calculation:
$$n = \frac{300{,}000{,}000 \times 0.5(1-0.5)}{0.03^2(300{,}000{,}000-1)/4 + 0.5(1-0.5)} = 1{,}111.1$$
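
To see the two formulas converge as the population grows, the sketch below evaluates both for a 3% margin of error at several population sizes. The function names are illustrative, and the intermediate population size of 100,000 is an arbitrary addition for comparison.

```python
def n_without_fpc(B, p=0.5):
    """Required respondents ignoring the fpc; equals 1/B^2 when p = 1/2."""
    return 4 * p * (1 - p) / B**2

def n_with_fpc(N, B, p=0.5):
    """Required respondents using the fpc-based formula."""
    return N * p * (1 - p) / (B**2 * (N - 1) / 4 + p * (1 - p))

B = 0.03
print(f"no fpc:          {n_without_fpc(B):,.1f}")
for N in (300, 1_500, 100_000, 300_000_000):
    print(f"N = {N:>11,}: {n_with_fpc(N, B):,.1f}")
```

For small populations the fpc-based requirement is far below 1/B², while for a national population the two agree to one decimal place.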


How Does That Work?

$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)}$$
$$= \frac{4\left(\frac{N}{N-1}\right)p(1-p)}{B^2 + 4\,\frac{p(1-p)}{N-1}}$$
$$= \frac{\left(\frac{N}{N-1}\right)}{B^2 + \frac{1}{N-1}} \quad (\text{for } p = 1/2)$$
$$\approx \frac{1}{B^2} \quad \text{for large } N$$

Take-Aways

•  With SRS and sample size less than 5% of population, proceed using “Stats 101” methods
–  Means, totals, proportions
–  Can use standard statistical software
•  With SRS, if n > 0.05N, then be sure to use finite population correction
–  Reported results more precise (and correct)
–  Either need to use special software or manually adjust the reported standard errors


What We Have Covered

•  Defined simple random sampling (SRS) and discussed how to draw one
•  Discussed Horvitz-Thompson estimation and SRS
–  Defined the finite population correction (fpc)
•  Defined estimators for means, totals, and proportions, including their standard errors
•  Discussed sample size calculations
