
Sampling Theory PPT 1 1

This document provides an introduction to sampling theory. It discusses the purposes of statistical surveys and collecting data from populations. The key methods of collecting data are the census method (complete enumeration) and sampling method. While the census method is more accurate, sampling is more practical for large populations as it reduces costs and effort. There are two main types of errors in surveys: sampling errors, which occur due to making inferences about a population based on a sample, and non-sampling errors, which are mistakes made in data collection. The document then classifies sampling techniques as either non-probability or probability sampling and provides examples of techniques in each category such as convenience sampling, quota sampling, and simple random sampling. It notes advantages of probability sampling include

SAMPLING THEORY

an introduction

Dr. Mathachan Pathiyil


Associate Professor
Department of Statistics
Nirmala College, Muvattupuzha
Statistical Surveys
• The purpose of a survey is the collection of information to
satisfy a definite need.
• The need to collect data arises in all walks of life.
• The data we need may be about
1. the population (total, sex, age, migration, rate of growth,
literacy, religion … )
2. labour (no. of employees, hrs. of work, wages, strikes,
unemployment … )
3. agriculture ( area under diff. crops, forests, agriculture income,
manures, cultivation practices … )
4. Industry ( turn over, production, capital investment, water
consumption, pollution … )
5. Trade ( wholesale/ retail prices, demand, profit/ loss … )
etc.
• A statistical survey is a sort of investigation carried out
by an agency or individual to study the nature of the
unknown characteristics of a population.
• We undertake a survey for a variety of purposes.
However, in most cases our interest is
concentrated on four important unknown values (or
parameters) of the population under study.
1. the population total
2. the population mean
3. the population proportion
4. the population ratio
Proportion : whole and part ( proportion of smokers,
males, defectives, distinctions … )
Ratio : part and part ( sex ratio, import/ export ratio,
birth/ death ratio … )
Population : Aggregate of all objects about which
we want to collect information ( houses in an area,
students in a class, fishes in a lake, viewers of a
specific T. V. programme, normal population with
mean 50 and SD 5 …)
Characteristic : Any aspect of the population about
which we want to collect information ( colour,
height, life length, yield, political affiliation,
income, employment … )
Characteristics are of two types – Variables and Attributes

Two ways of collecting information –
Census method (complete enumeration method)
and Sampling method.
• Census – Method of collecting data from each and
every unit of the population.
Merits of census method
• The results are more representative, accurate and reliable
• The results are free from sampling errors
• A census data may be used as a basis for various other surveys
…..
However, despite these advantages, the census method is not
popularly used in practice.
• Effort, money, time required for completing census is very large.
• There is no way of checking the error in the data except through a
re- survey or sample checks.
• Census is practically impossible for a researcher or a small
organization.
• If the population is infinite or the enumeration is destructive in
nature, census cannot be used.
Sampling – Method of collecting data from a
representative part of the population only
Merits of sampling include
• Reduced cost, time and labour
• Greater scope (needs fewer trained investigators, lower
administrative cost, less equipment … )
• Greater accuracy
• If the population is hypothetical or infinite, only sampling is possible
• It is always possible to determine the extent of sampling error
Demerits of sampling method are
• If a proper choice of the sampling method is not made, the results may be
misleading
• The chances of sampling errors are great in sampling
• When the population is small, we can’t use sampling
• When the information is needed from each and every unit in the
population (voters list preparation, income-tax assessment, college
admissions … ), sampling cannot be used.
• Neither sampling nor census admit universal
application.
• Census and sampling will produce identical conclusions
when the population is perfectly homogenous.
• It is a curious fact that the results from a carefully
planned, well executed sample survey are expected to
be more accurate than those from a census survey.
• The aim of sampling theory is to make sampling more
effective so that the answer to a particular question is
given in a quick, valid, efficient and economical way.
Errors in Surveys
• Two major types of errors can arise when a survey is conducted to
make observations on a characteristic defined over the population:
sampling errors and non-sampling errors
• Sampling error refers to the error arising due to drawing
inferences about the population on the basis of few observations
taken from it.
• This error is inherent and unavoidable in any sample survey. It can
be decreased by increasing the sample size .
• The standard error (S. E.) is inversely proportional to the square root of the sample size.
• Sampling errors are absent in census surveys.
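The inverse square-root relationship can be illustrated numerically. The following is a minimal sketch, assuming the usual form S.E. = sigma / sqrt(n) for the sample mean (ignoring any finite population correction); the function name `standard_error` and the value of `sigma` are illustrative and not part of the slides.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean, assuming S.E. = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 5.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    # Each fourfold increase in n halves the standard error.
    print(n, standard_error(sigma, n))
```

Under this assumption, halving the sampling error requires four times as many observations, which is why the error can be reduced but never removed by enlarging the sample.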
A few reasons for sampling errors are – faulty selection of the
sample (purposive or judgment sampling, use of an inappropriate
sampling scheme like srs for heterogeneous populations … ),
substitution (when difficulties arise, the investigator may substitute a
convenient member of the population), faulty identification of the
• Non-sampling errors are more serious and are due to
mistakes made in the acquisition of data .
• This is present in both sample surveys and census surveys.
• It can occur at any stage of its planning, execution and
analysis.
A few reasons for non-sampling errors are – faulty planning or
definitions(faulty objectives, faulty questionnaire, errors in
measurements, lack of trained investigators … ), errors due to
non response (not at homes, unable to answer, refuses to
answer the questions … ), response errors (respondent may
misunderstand a question and may furnish false data, prestige
bias, investigator bias … ), errors in coverage(inclusion/exclusion
of units which are to be excluded/included in a survey … ),
compiling errors(errors in coding, editing, tabulation … ),
publication errors (errors in printing, presentation … ) …
Classification of Sampling Techniques

Sampling Techniques
• Non-probability sampling
– Convenience sampling
– Judgmental sampling
– Quota sampling
– Snowball sampling
• Probability sampling
– Simple random sampling
– Stratified sampling
– Systematic sampling
– Cluster sampling
– PPS sampling
Non-probability sampling and probability sampling
• Non probability sampling - Method of selecting
samples in which the choice of selection of units
into the sample depends entirely on the judgment
of the sampler (investigator).
• Probability sampling – Scientific method of
selecting samples from the population. In this
procedure , each unit in the population has a
definite pre assigned non zero probability of
being selected into the sample.
Non probability sampling
• Convenience sampling
Attempts to obtain a sample of convenient elements.
Usually the sample is restricted to a part of the population
that is readily available.
Often, respondents are selected because they happen to
be in the right place at the right time.

– use of students or members of social organizations


– mall intercept interviews without qualifying the
respondents
– fruits on the top of the containers
– “people on the street” interviews
• Judgmental sampling
method of sampling in which the sample elements
from the population are selected based on the judgment
of the researcher.

– party members selected in voting behavior research


– expert witnesses used in court
– purchase engineers selected in industrial marketing
research
• Quota sampling
may be viewed as two-stage restricted judgmental sampling.
– The first stage consists of developing control categories, or
quotas, of population elements.
– In the second stage, sample elements are selected based on
convenience or judgment.

Control characteristic (Sex)   Population composition (Percentage)   Sample composition (Number)
Male                           48                                    480 (48%)
Female                         52                                    520 (52%)
Total                          100                                   1000
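The first stage of quota sampling, turning population percentages into quotas, can be sketched as follows. This is only an illustration using the figures from the table; the function name `quota_counts` is hypothetical.

```python
def quota_counts(population_percentages, sample_size):
    """Convert population percentages for a control characteristic into quotas."""
    return {category: round(sample_size * pct / 100)
            for category, pct in population_percentages.items()}

# Control characteristic: sex, with the percentages from the table.
quotas = quota_counts({"Male": 48, "Female": 52}, 1000)
print(quotas)
```

With percentages that do not divide the sample size evenly, the rounded quotas may not add up exactly to the sample size and would need a manual adjustment.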
• Snowball sampling
an initial group of respondents is selected, usually at
random.
– After being interviewed, these respondents are
asked to identify others who belong to the target
population of interest.
– Subsequent respondents are selected based on the
referrals.
Disadvantages of non probability sampling
All the non-probability sampling procedures suffer from the
drawbacks of favoritism, personal biases, prejudices ... of
the investigator.
Only if the investigator is well experienced and impartial
can we expect satisfactory results in this case.
Probability Sampling Schemes
• Simple Random Sampling
• Stratified Random Sampling
• Systematic Sampling
• Cluster Sampling
• PPS Sampling

Advantages of probability sampling
• The sample will be representative of the population with
respect to the variables of interest.
• Probability samples are more accurate than non-probability
samples (They remove conscious and unconscious sampling bias )
• Probability samples permit the development of the theory for
the estimation of population parameters.
• Probability samples allow us to determine the accuracy of the
sample estimates.
• For any survey there are three important stages
– Planning, execution and Analysis.
• Execution of the survey is 100% practical work.
Sampling theory plays no role in the execution
of the survey.
• It gives importance to the other two stages
namely, planning and analysis.
Principal steps in sample surveys
A sample survey may be considered as an organized fact-finding
procedure. While developing a sampling design (planning,
execution and analysis), we must pay attention to the
following points.
• Stating clearly the objectives of the survey
• Define the population to be sampled (covered)
• Definition of the sampling units and the preparation of the
sampling frame
• Deciding the data to be collected
• Methods of collecting the data
• Preparation of the questionnaire
• Selection of the sample
• Organization of the field work
• Summary and analysis of the data collected
The Language of Sampling

• Population: the theoretical aggregation of specified elements
for a given survey, defined by time and space
• Sample element: a case or a single unit that is selected from
a population and measured in some way for the study (e.g., a
person, thing ...).
• Sampling frame: a specific list containing the names or
addresses of all elements in the population. From this, the
researcher selects units to create the study sample.
• Sample: a set of cases or units that is drawn from a
population and used to make conclusions(generalizations or
inferences) about the unknown aspects of the population
• Estimator: a calculating scheme or formula (statistic) for
obtaining an appropriate value of the population parameter
based on the sample observations.
• Estimate: the particular value of an estimator with respect to a given sample
• Expected value: average value of all possible estimates
based on an estimator from repeated trials of a sampling
scheme.
• Bias : difference between the expected value of the
estimator and the true value of the parameter
• Precision : measures the closeness of the estimator with
its expected value. Variance of an estimator is usually used
to measure precision.
• Accuracy : refers to the closeness of the estimate and the
true value of the parameter. Mean Square Error of the
estimator is used to measure accuracy
An estimator is both accurate and precise only if its bias is
negligible; for an unbiased estimator the Mean Square Error equals the variance.
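The link between bias, precision and accuracy can be written as one identity. This is the standard decomposition of the mean square error, stated here for a generic estimator $\hat{\theta}$ of a parameter $\theta$ (both symbols are notation introduced for this illustration, not taken from the slides).

```latex
\mathrm{MSE}(\hat{\theta})
  = E\big[(\hat{\theta} - \theta)^2\big]
  = \underbrace{E\big[(\hat{\theta} - E\hat{\theta})^2\big]}_{\mathrm{Var}(\hat{\theta})}
  + \underbrace{\big(E\hat{\theta} - \theta\big)^2}_{\mathrm{Bias}(\hat{\theta})^2}
```

When the bias is zero the two measures coincide, so a small variance then also means a small mean square error.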
Simple Random Sampling (srs)
• It is the simplest method of probability sampling
• The sample is drawn unit by unit with equal probability of
selection for each unit at each draw.
• If the unit selected is returned to the population after
enumeration, before the next draw, the procedure of selection is
called srswr (simple random sampling with replacement).
• If the unit selected is removed from the population after
enumeration, before the next draw, the procedure of selection is
called srswor (simple random sampling without replacement).
• If N is the population size and n is the size of the sample we select,
there are $N^n$ possible srswr samples and $\binom{N}{n}$ possible srswor samples.
Since the srswor sample provides more precise and accurate estimates of the
population parameters than the srswr sample, we always prefer
srswor samples.
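The two counts can be checked by brute-force enumeration for a small population. The sketch below assumes that srswr samples are counted as ordered draws with repeats allowed and srswor samples as unordered sets; the values of `N` and `n` are hypothetical.

```python
import math
from itertools import combinations, product

N, n = 5, 2  # hypothetical population and sample sizes
units = range(1, N + 1)

srswr_samples = list(product(units, repeat=n))  # ordered draws, repeats allowed
srswor_samples = list(combinations(units, n))   # unordered sets, no repeats

print(len(srswr_samples), N ** n)            # both equal N^n
print(len(srswor_samples), math.comb(N, n))  # both equal N choose n
```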
Procedures of selecting a srs
• Define the population and select a suitable sampling frame
• Assign each element a number from 1 to N
• Generate n different random numbers between 1 and N
• The numbers generated denote the elements that should be included in the sample
• The lottery method and the random number table method are the two procedures available for selecting an srs
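On a computer, a pseudo-random number generator can stand in for the lottery or the table. The following is a rough sketch using Python's standard `random` module; the function name `draw_srs` and its parameters are illustrative assumptions.

```python
import random

def draw_srs(N, n, with_replacement=False, seed=None):
    """Return n labels from 1..N chosen with equal probability at each draw."""
    rng = random.Random(seed)
    if with_replacement:
        # srswr: the same label may be drawn more than once.
        return [rng.randint(1, N) for _ in range(n)]
    # srswor: n different labels.
    return rng.sample(range(1, N + 1), n)

srswor_labels = draw_srs(100, 10, seed=1)
srswr_labels = draw_srs(100, 10, with_replacement=True, seed=1)
print(sorted(srswor_labels))
print(sorted(srswr_labels))
```

The labels returned are then matched against the sampling frame to identify the units to be enumerated.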
Lottery Method
• This is the most popular and simplest method. In this method all
the items of the population are numbered on separate slips of
paper of same size, shape and colour. They are folded and
stored in a container. Shuffle them thoroughly. Slips are then
drawn at random one by one till the required number or units
are selected into the sample.
Table of Random numbers
As the lottery method cannot be used when the population is
large, the alternative is to use a table of
random numbers. There are several standard tables of random
numbers.
1. Tippett’s table
2. Fisher and Yates’ table
3. Kendall and Smith’s table

Selection of a srs using random number tables
• Identify and define the population.
• Determine the desired sample size.
• Assign all individuals on the list a consecutive
number from 1 to the population size.
• Select an arbitrary number in the table of random
numbers.
• For the selected number, locate the unit in the
population bearing it and select that unit into the sample.
• Go to the next number in the column of the table
and repeat the above step until the desired
number of individuals has been selected for the
sample.
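Walking down a column of a random number table can be sketched as below. Two details are assumptions not spelled out in the steps: numbers larger than the population size are skipped, and a number that has already been chosen is skipped (as appropriate for srswor). The entries in `column` are made up for illustration and are not taken from any published table.

```python
def select_from_table(table_column, N, n):
    """Walk down a column of random numbers, keeping labels in 1..N not yet chosen."""
    chosen = []
    for number in table_column:
        if 1 <= number <= N and number not in chosen:
            chosen.append(number)
        if len(chosen) == n:
            break
    if len(chosen) < n:
        raise ValueError("column exhausted before n units were selected")
    return chosen

# Hypothetical two-digit entries, NOT from a real random number table.
column = [73, 12, 98, 45, 12, 3, 60, 27, 88, 41]
selected = select_from_table(column, N=50, n=4)
print(selected)  # 73, 98 and 60 exceed N; the second 12 is a repeat
```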
Estimation of Parameters using srswor
• Let $(y_1, y_2, \ldots, y_n)$ be the srswor sample of observations taken from
the population under study. Then the sample mean is
$$\bar{y}_{srs} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
It is an unbiased and consistent estimator of the population mean $\bar{Y}$.
• If N is the population size and
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2$$
is the sample variance, then an unbiased estimator of the variance of the
estimator is
$$\left(\frac{N-n}{Nn}\right)s^2 .$$
• An estimate of the population total Y is $N\bar{y}$ and an unbiased
estimator of its variance is
$$N^2\left(\frac{N-n}{Nn}\right)s^2 .$$
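The srswor estimators above can be computed directly from a sample. This is a minimal sketch using Python's `statistics` module; the function name `srswor_estimates`, the sample values and the population size are all hypothetical.

```python
from statistics import mean, variance

def srswor_estimates(sample, N):
    """Estimated mean and total, with estimated variances, from an srswor sample."""
    n = len(sample)
    y_bar = mean(sample)                # sample mean
    s2 = variance(sample)               # sample variance with divisor n - 1
    var_mean = (N - n) / (N * n) * s2   # estimated variance of the sample mean
    total = N * y_bar                   # estimate of the population total
    var_total = N ** 2 * var_mean       # estimated variance of the total
    return y_bar, var_mean, total, var_total

sample = [4, 6, 8, 10, 12]  # hypothetical observations
y_bar, var_mean, total, var_total = srswor_estimates(sample, N=50)
print(y_bar, var_mean, total, var_total)
```

The factor (N - n) / N shrinks towards zero as the sample approaches the whole population, which reflects the absence of sampling error in a census.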
Advantages of srs
• easy to conduct
• strategy requires minimum knowledge of
the population to be sampled
• simple estimators for the parameters
….
Drawbacks of srs

• When the population is not homogeneous with respect
to the characteristic under survey, the srs need
not be a good representative of it.
Homogeneous and heterogeneous populations
• If all members of a population are identical, the
population is considered to be homogeneous. That is,
the characteristics of any one individual in the
population would be the same as the characteristics of
any other individual (little or no variation among
individuals).
• When individual members of a population are different
from each other, the population is considered to be
heterogeneous (having significant variation among
individuals).
Eg. Students in a school, students in a college,
workers in a factory …
Stratified Sampling
• When the population is heterogeneous in nature, the
stratified random sampling is used.
• Two-step process in which the population is partitioned
into subpopulations, or strata.
• Strata should be mutually exclusive and collectively
exhaustive, so that every population element is
assigned to one and only one stratum and no population
element is omitted.
• Elements are selected from each stratum by a random
sampling procedure, usually srs. They are then pooled together to
get the stratified sample.
• A major objective of stratified sampling is to increase
precision without increasing cost.
• The elements within a stratum should be as
homogeneous as possible, but the elements in
different strata should be as heterogeneous as
possible.
• The stratification variables (Auxiliary variables)
should also be closely related to the characteristic
of interest (Survey variable or Study variable).
– If height is the study variable,
weight or age may be taken as the auxiliary variable
– If volume of timber is the study variable,
girth or height of the trees may be taken as the auxiliary variable

Procedure for Drawing a Stratified Random Sample
• Define the population and the sampling frame
• Select the stratification variable(s) and the number of strata, L
• Divide the entire population into L non overlapping subdivisions
called strata based on the classification variable
• In each stratum, number the elements from 1 to $N_h$ (the population
size of stratum h)
• Determine the sample size of each stratum, $n_h$, where
$$\sum_{h=1}^{L} n_h = n$$
• From each stratum, select a simple random sample of $n_h$ units
and pool them together to get the required stratified sample.
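The drawing procedure can be sketched in a few lines. The example below assumes the strata are already formed and held as lists of unit labels; the function name `draw_stratified_sample`, the stratum names and the sizes are illustrative only.

```python
import random

def draw_stratified_sample(strata, sample_sizes, seed=None):
    """Draw an srswor of n_h units from each stratum and pool the results.

    strata: dict mapping stratum label -> list of units
    sample_sizes: dict mapping stratum label -> n_h
    """
    rng = random.Random(seed)
    pooled = []
    for h, units in strata.items():
        n_h = sample_sizes[h]
        # Keep the stratum label with each unit so estimates can be weighted later.
        pooled.extend((h, unit) for unit in rng.sample(units, n_h))
    return pooled

strata = {
    "small": list(range(1, 61)),    # N_1 = 60
    "medium": list(range(61, 91)),  # N_2 = 30
    "large": list(range(91, 101)),  # N_3 = 10
}
sample_sizes = {"small": 6, "medium": 3, "large": 1}
stratified_sample = draw_stratified_sample(strata, sample_sizes, seed=7)
print(stratified_sample)
```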
Reasons for stratification

• Administrative convenience
• When sampling problems differ markedly in
different parts of the population(In surveying
factories, firms may be grouped into large/
medium/small, individual/group, private/govt. … )
• When stratification produces a gain in precision in
the estimates of the parameters of the whole
population (when the population is highly
heterogeneous)
Estimation of Parameters using a Stratified sample
• Let the population of size N be divided into L strata, each of
size $N_h$, h = 1, 2, …, L, so that $N = \sum_{h=1}^{L} N_h$.
• We take random samples of size $n_h$ from the hth stratum in
the population, h = 1, 2, …, L, so that $n = \sum_{h=1}^{L} n_h$.
• Suppose $y_{hi}$, i = 1, 2, …, $n_h$, h = 1, 2, …, L, denotes the ith observation
in the sample taken from the hth stratum in the population.
• Now let
$$\bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi}, \quad h = 1, 2, \ldots, L$$
be the sample mean from the hth stratum in the population.
• Then an unbiased estimator of the population mean $\bar{Y}$ is
$$\bar{y}_{st} = \frac{1}{N}\sum_{h=1}^{L} N_h \bar{y}_h$$
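The stratified estimator of the population mean is a weighted average of the stratum sample means, with weights N_h / N. A small sketch with made-up figures follows; the function name `stratified_mean` and the data are hypothetical.

```python
def stratified_mean(stratum_sizes, stratum_samples):
    """Weighted mean (1/N) * sum of N_h * y_bar_h over all strata."""
    N = sum(stratum_sizes.values())
    weighted_sum = 0.0
    for h, N_h in stratum_sizes.items():
        y_h = stratum_samples[h]
        y_bar_h = sum(y_h) / len(y_h)  # sample mean of stratum h
        weighted_sum += N_h * y_bar_h
    return weighted_sum / N

sizes = {"A": 60, "B": 30, "C": 10}                        # N_h
samples = {"A": [10, 12, 14], "B": [20, 22], "C": [40]}    # y_hi
y_st = stratified_mean(sizes, samples)
print(y_st)
```

Here the stratum means are 12, 21 and 40, so the estimate is (60 × 12 + 30 × 21 + 10 × 40) / 100 = 17.5.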
Allocation of sample size in different strata
• Once the sampling strategy is fixed as stratified
random sampling, there arises the question of
deciding the sample size, $n_h$, for the hth stratum, h =
1, 2, …, L in the population.
• The following are the important methods of
allocation in stratified sampling
1. equal allocation
2. proportional allocation
3. optimum allocation
Equal Allocation: $n_h = \frac{n}{L}$, h = 1, 2, …, L
Proportional Allocation: $n_h = \frac{N_h}{N}\, n$, h = 1, 2, …, L
Optimum Allocation
In optimum allocation procedures, we resort to
conditional minimization techniques. Here we consider
linear cost functions. The standard procedures are
• Minimizing the variance of the estimator for a given total
cost of the survey
• Minimizing the total cost of the survey for a given variance of
the estimator
• Minimizing the variance of the estimator for a given
sample size (Neyman Optimum Allocation)
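The three allocation rules can be compared on a small example. The equal and proportional rules follow the formulas given earlier; for Neyman allocation the slides give no formula, so the sketch assumes the commonly cited textbook form in which n_h is proportional to N_h × S_h, where S_h is the standard deviation within stratum h. All function names and figures are hypothetical.

```python
def equal_allocation(n, L):
    """n_h = n / L for every stratum."""
    return [n / L] * L

def proportional_allocation(n, sizes):
    """n_h = (N_h / N) * n."""
    N = sum(sizes)
    return [n * N_h / N for N_h in sizes]

def neyman_allocation(n, sizes, std_devs):
    """Assumed textbook form: n_h = n * N_h * S_h / sum(N_h * S_h)."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [600, 300, 100]  # N_h, hypothetical
std_devs = [2, 4, 10]    # S_h, hypothetical
n = 100

equal = equal_allocation(n, len(sizes))
proportional = proportional_allocation(n, sizes)
neyman = neyman_allocation(n, sizes, std_devs)
print(equal, proportional, neyman)
```

The results are generally fractional and would be rounded to whole numbers in practice. In this example the small but highly variable third stratum receives far more units under the Neyman rule than under proportional allocation.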
Advantages of Stratification

• Administrative convenience
• Samples are more representative
• Estimation with greater accuracy
• Stratification makes it possible to use
different sampling designs in different strata
Systematic Sampling
• A sampling technique in which only the first unit is
selected randomly and the rest are selected
automatically according to a predefined pattern
• The sample is chosen by selecting a random starting
point and then picking every kth element in succession
from the sampling frame.
• The sampling interval, k, is determined by dividing the
population size N by the required sample size n and
rounding it to the nearest integer.
For example, suppose there are 5,000 elements in the
population and a sample of 50 is desired. In this case
the sampling interval, k, is 100. A random number (r,
the random start) between 1 and 100 is selected. If, for
example, r = 23, the sample consists of elements 23, 123,
223, 323, 423, 523, and so on up to 4923.
Procedure for Drawing a Systematic Sample
• Define the population and select a suitable
sampling frame
• Each element is assigned a number from 1 to N
• Determine the sampling interval k (k=N/n). If k
is a fraction, round it to the nearest integer
• Select a random number, r, between 1 and k
• Now the elements with the following numbers
will constitute the systematic random sample
r, r+k,r+2k,r+3k,r+4k,...,r+(n-1)k
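The procedure can be sketched as follows, using the figures from the earlier example (N = 5000, n = 50). The function name `systematic_labels` is hypothetical, and dropping labels that exceed N after rounding k is an assumption of this sketch rather than a rule stated in the slides.

```python
import random

def systematic_labels(r, k, n, N):
    """Labels r, r+k, ..., r+(n-1)k, dropping any that exceed N."""
    return [r + j * k for j in range(n) if r + j * k <= N]

N, n = 5000, 50
k = round(N / n)          # sampling interval; Python rounds exact halves to even
r = random.randint(1, k)  # random start between 1 and k
sample = systematic_labels(r, k, n, N)

# Reproducing the worked example with random start r = 23.
example = systematic_labels(23, k, n, N)
print(example[:6], example[-1])
```

Only the random start is left to chance; once r is fixed, the whole sample is determined.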
Things to remember before using
Systematic sampling scheme
• Efficiency of systematic sampling depends on the
order of arrangement of the units in the population
• If the units in the population show an
increasing/decreasing trend along with the increase
in magnitude of their labels, the systematic sample
means will also show the same tendency (rank lists,
salary lists …)
• If the population is almost periodic/cyclic in nature,
then the efficiency of systematic sampling depends
on the value of k, the sampling interval (Rain fall,
market days, peak traffic hrs. … )
Cluster Sampling
• The target population is first divided into
mutually exclusive and collectively exhaustive
subpopulations, or clusters.
• Then a random sample of clusters is selected,
based on a probability sampling technique such
as srs.
• For each selected cluster, either all the elements
are enumerated (single stage) or a random
sample of elements is drawn from the selected
cluster and only those elements are enumerated (two-stage).
• Ideally, each cluster should be a small-scale
representation of the population.
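Single-stage and two-stage cluster sampling differ only in what happens inside the selected clusters. The sketch below assumes clusters are chosen by srswor and, in the two-stage case, that the same number of elements is drawn from each chosen cluster; the function name `cluster_sample` and the data are hypothetical.

```python
import random

def cluster_sample(clusters, m, n_per_cluster=None, seed=None):
    """Select m clusters by srswor, then enumerate or subsample their elements.

    clusters: dict mapping cluster label -> list of elements
    n_per_cluster: None for single-stage, an int for two-stage subsampling
    """
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), m)
    result = {}
    for c in chosen:
        elements = clusters[c]
        if n_per_cluster is None:
            result[c] = list(elements)  # single stage: every element
        else:
            size = min(n_per_cluster, len(elements))
            result[c] = rng.sample(elements, size)  # two-stage: a subsample
    return result

# Five hypothetical batches of eight students each.
clusters = {f"batch_{i}": [f"student_{i}_{j}" for j in range(1, 9)]
            for i in range(1, 6)}
single_stage = cluster_sample(clusters, m=2, seed=3)
two_stage = cluster_sample(clusters, m=2, n_per_cluster=3, seed=3)
print(single_stage)
print(two_stage)
```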
Types of Cluster Sampling
Cluster Sampling
• Single-Stage Sampling
• Two-Stage Sampling
• Multistage Sampling

Example 1
Population : People in Kerala
Cluster : Districts/ villages/ panchayats/ municipalities
Elements : Houses/ schools/ individuals

Example 2
Population : Students in a college
Cluster : Courses/ batches
Elements : Each student
THANK YOU
