Statistics and Probability4
Probability
Random Sampling, Parameter and Statistic, and Sampling Distribution of
Statistics
Objectives
After going through the learning activities, you are
expected to:
1. illustrate random sampling (M11/12SP-IIId-2);
2. distinguish between parameter and statistic
(M11/12SP-IIId-3); and
3. identify sampling distribution of statistics (sample
mean) (M11/12SP-IIId-4).
Random Sampling
• Selection of n elements from a population of N
elements, which is the subject of an
investigation or experiment.
• All subjects in the population listed in the study
have the same chance of being chosen, using the
appropriate technique.
• The conclusions of the hypothesis tests applied
to the selected sample are then taken to apply to
the entire population as well.
Population
The whole group under study or
investigation. In research, the population
does not always refer to people.
A group containing elements of anything
you want to study, such as objects, events,
organizations, countries, species,
organisms, etc.
Sample
A subset taken from a population,
either by random sampling or by non-
random sampling.
A representation of the population,
from which it is hoped that valid conclusions
about the population can be drawn.
Population and the Sample
PROBABILITY SAMPLING
• A probability sampling scheme is one in which every
unit in the population has a chance (greater than
zero) of being selected in the sample, and this
probability can be accurately determined.
• When every element in the population does have the
same probability of selection, this is known as an
'equal probability of selection' (EPS) design. Such
designs are also referred to as 'self-weighting'
because all sampled units are given the same weight.
• Probability sampling includes:
• Simple Random (Lottery) Sampling,
• Systematic Sampling,
• Stratified Random Sampling,
• Cluster Sampling,
• Multistage Sampling.
SIMPLE RANDOM (LOTTERY) SAMPLING
• Applicable when population is small, homogeneous
& readily available
• All subsets of the frame are given an equal
probability. Each element of the frame thus has
an equal probability of selection.
• It provides for greatest number of possible
samples. This is done by assigning a number to
each unit in the sampling frame.
• A table of random numbers or a lottery system is
used to determine which units are to be selected.
• Estimates are easy to calculate.
• Simple random sampling is always an EPS design, but not
all EPS designs are simple random sampling.
• Disadvantages
• If the sampling frame is large, this method is impracticable.
• Minority subgroups of interest in the population may not be
present in the sample in sufficient numbers for study.
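The lottery procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame of 20 students, the function name `simple_random_sample`, and the seed are all hypothetical, and the standard-library `random` module stands in for a table of random numbers.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, lottery style."""
    rng = random.Random(seed)
    # Assign a number (1..N) to each unit in the sampling frame.
    numbered = dict(enumerate(frame, start=1))
    # Draw n distinct numbers; every subset of size n is equally likely.
    drawn = rng.sample(sorted(numbered), n)
    return [numbered[i] for i in drawn]

# Hypothetical frame of N = 20 students.
frame = [f"student_{i:02d}" for i in range(1, 21)]
sample = simple_random_sample(frame, n=5, seed=1)
print(sample)
```

Because every subset of five students has the same chance of being drawn, each individual student also has the same chance (5 in 20) of being selected.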
REPLACEMENT OF SELECTED UNITS
• Sampling schemes may be without replacement
('WOR' - no element can be selected more than once
in the same sample) or with replacement ('WR' - an
element may appear multiple times in the same sample).
• For example, if we catch fish, measure them, and
immediately return them to the water before
continuing with the sample, this is a WR design,
because we might end up catching and measuring the
same fish more than once. However, if we do not
return the fish to the water (e.g. if we eat the fish),
this becomes a WOR design.
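The difference between the two designs can be seen in a short Python sketch. The pond of eight fish and the seed are hypothetical; `random.sample` draws without replacement, while `random.choices` draws with replacement.

```python
import random

rng = random.Random(7)
pond = [f"fish_{i}" for i in range(1, 9)]  # hypothetical pond of 8 fish

# WOR: each caught fish is kept, so no fish can appear twice.
wor_sample = rng.sample(pond, k=5)

# WR: each fish is returned before the next catch, so repeats are possible.
wr_sample = rng.choices(pond, k=5)

print("WOR:", wor_sample)
print("WR: ", wr_sample)
```

A WOR sample can never be larger than the population, whereas a WR sample can be of any size.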
SYSTEMATIC SAMPLING
• Systematic sampling relies on arranging the target
population according to some ordering scheme and then
selecting elements at regular intervals through that ordered
list.
• Systematic sampling involves a random start and then
proceeds with the selection of every kth element from then
onwards. In this case, k=(population size/sample size).
• It is important that the starting point is not automatically
the first in the list, but is instead randomly chosen from
within the first to the kth element in the list.
• A simple example would be to select every 10th name from
the telephone directory (an 'every 10th' sample, also
referred to as 'sampling with a skip of 10').
As described above, systematic sampling is an EPS method,
because all elements have the same probability of selection (in
the example given, one in ten). It is not 'simple random sampling'
because different subsets of the same size have different
selection probabilities - e.g. the set {4,14,24,...,994} has a one-in-
ten probability of selection, but the set {4,13,24,34,...} has zero
probability of selection.
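A minimal Python sketch of the procedure, assuming a hypothetical ordered list of 1,000 units and a sample size that divides the population size evenly (the function name `systematic_sample` is illustrative only):

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth unit of an ordered frame after a random start."""
    N = len(frame)
    k = N // n  # sampling interval; assumes N is a multiple of n
    rng = random.Random(seed)
    start = rng.randint(1, k)  # random start between the 1st and kth unit
    # Positions start, start + k, start + 2k, ... (1-based).
    return [frame[i - 1] for i in range(start, N + 1, k)]

frame = list(range(1, 1001))  # hypothetical ordered list of 1000 units
sample = systematic_sample(frame, n=100, seed=3)
print(sample[:5])
```

Only k = 10 different samples can ever be produced, one for each possible start. That is why a set such as {4,13,24,34,...} can never be selected, even though each individual unit has a one-in-ten chance.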
• ADVANTAGES:
• Sample easy to select
• Suitable sampling frame can be identified easily
• Sample evenly spread over entire reference
population
• DISADVANTAGES:
• Sample may be biased if hidden periodicity in
population coincides with that of selection.
• Difficult to assess precision of estimate from one
survey.
STRATIFIED SAMPLING
Where population embraces a number of distinct
categories, the frame can be organized into separate
"strata." Each stratum is then sampled as an independent
sub-population, out of which individual elements can be
randomly selected.
• Every unit in a stratum has same chance of being selected.
• Using same sampling fraction for all strata ensures
proportionate representation in the sample.
• Adequate representation of minority subgroups of
interest can be ensured by stratification & varying
sampling fraction between strata as required.
• Finally, since each stratum is treated as an independent
population, different sampling approaches can be applied to
different strata.
• Drawbacks to using stratified sampling:
• First, the sampling frame of the entire population has to be
prepared separately for each stratum.
• Second, when examining multiple criteria, stratifying variables
may be related to some, but not to others, further complicating
the design, and potentially reducing the utility of the strata.
• Finally, in some cases (such as designs with a large number of
strata, or those with a specified minimum sample size per group),
stratified sampling can potentially require a larger sample than
would other methods.
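Proportionate stratified sampling, in which the same sampling fraction is used in every stratum, can be sketched as follows. The two strata, their sizes, the 10% fraction, and the function name `stratified_sample` are assumptions made for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = round(len(units) * fraction)  # sample size for this stratum
        # Each stratum is sampled as an independent sub-population.
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical population of 100 students in two strata.
strata = {
    "grade_11": [f"g11_{i}" for i in range(60)],
    "grade_12": [f"g12_{i}" for i in range(40)],
}
sample = stratified_sample(strata, fraction=0.10, seed=5)
print({name: len(units) for name, units in sample.items()})
```

With a 10% fraction the sample holds 6 and 4 students, mirroring the 60:40 split in the population. To secure enough units from a small subgroup, a larger fraction could be passed for that stratum alone.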
CLUSTER SAMPLING
• Cluster sampling is an example of 'two-stage sampling'.
• In the first stage, a sample of areas is chosen;
• In the second stage, a sample of respondents within those
areas is selected.
• The population is divided into clusters of homogeneous units,
usually based on geographical contiguity.
• Sampling units are groups rather than individuals.
• A sample of such clusters is then selected.
• In one-stage cluster sampling, all units from the selected
clusters are studied.
• Advantages :
• Cuts down on the cost of preparing a sampling frame.
• This can reduce travel and other administrative costs.
• Disadvantages: sampling error is higher than for a simple random
sample of the same size.
• Often used to evaluate vaccination coverage in EPI (Expanded
Programme on Immunization) surveys.
• Identification of clusters
– List all cities, towns, villages & wards of cities in the
target area under study, with their populations.
– Calculate the cumulative population & divide by 30; this gives
the sampling interval.
– Select a random no. less than or equal to the sampling interval,
having the same no. of digits. This locates the 1st cluster.
– Random no. + sampling interval = population no. of the 2nd cluster.
– 2nd cluster + sampling interval = 3rd cluster.
– Last or 30th cluster = 29th cluster + sampling interval.
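The steps above can be sketched in Python. This is a rough illustration, not a survey protocol: the 40 villages and their populations are made up, the function name `select_clusters` is hypothetical, and the sketch assumes that each selection number is matched to the first community whose cumulative population reaches it.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def select_clusters(populations, n_clusters=30, seed=None):
    """Pick clusters at a fixed interval along the cumulative population.

    populations: list of (community name, population) pairs.
    """
    names = [name for name, _ in populations]
    cumulative = list(accumulate(pop for _, pop in populations))
    interval = cumulative[-1] // n_clusters  # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # random no. <= sampling interval
    # 1st cluster: start; each later cluster: previous no. + interval.
    targets = [start + j * interval for j in range(n_clusters)]
    # Assumed rule: a number belongs to the first community whose
    # cumulative population is at least that number.
    chosen = [names[bisect_left(cumulative, t)] for t in targets]
    return chosen, interval

# Hypothetical target area of 40 villages with made-up populations.
communities = [(f"village_{i}", 400 + 150 * (i % 7)) for i in range(1, 41)]
chosen, interval = select_clusters(communities, seed=2)
print(interval, chosen[:3])
```

A community whose population exceeds the sampling interval can be picked more than once, so larger communities contribute more clusters.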
MULTI-STAGE SAMPLING
• Complex form of cluster sampling in which two or
more levels of units are embedded one in the
other.
• First stage: a random sample of districts is chosen in
all states.
• Second stage: a random sample of talukas or villages is
chosen within those districts.
• Third stage: the units will be houses.
• All ultimate units (houses, for instance) selected
at the last step are surveyed.
• This technique is essentially the process of taking random
samples of preceding random samples.
• Not as effective as true random sampling, but probably solves
more of the problems inherent to random sampling.
• An effective strategy because it banks on multiple
randomizations. As such, extremely useful.
• Multistage sampling is used frequently when a complete list of all
members of the population does not exist or is inappropriate.
• Moreover, by avoiding the use of all sample units in all selected
clusters, multistage sampling avoids the large, and perhaps
unnecessary, costs associated with traditional cluster sampling.
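The idea of taking random samples of preceding random samples can be sketched as nested draws. The three levels used here (districts, villages, houses), the sizes at each level, and the function name `multistage_sample` are assumptions chosen to echo the example above.

```python
import random

def multistage_sample(districts, n_districts, n_villages, n_houses, seed=None):
    """Sample districts, then villages within them, then houses within those."""
    rng = random.Random(seed)
    surveyed = []
    # Stage 1: random sample of districts.
    for d in rng.sample(sorted(districts), n_districts):
        villages = districts[d]
        # Stage 2: random sample of villages within each chosen district.
        for v in rng.sample(sorted(villages), n_villages):
            # Stage 3: random sample of houses within each chosen village.
            for h in rng.sample(villages[v], n_houses):
                surveyed.append((d, v, h))
    return surveyed

# Hypothetical area: 4 districts, 5 villages each, 20 houses per village.
districts = {
    f"district_{d}": {
        f"village_{d}_{v}": [f"house_{d}_{v}_{h}" for h in range(1, 21)]
        for v in range(1, 6)
    }
    for d in range(1, 5)
}
surveyed = multistage_sample(districts, n_districts=2, n_villages=3,
                             n_houses=4, seed=11)
print(len(surveyed))
```

Only 24 of the 400 houses are visited, and a list of houses is needed only for the six selected villages rather than for the whole area.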