Day6 STATwithMehedi

Introduction to Statistics for Research

Day 6 Lecture Sheet

Learning Objectives:

1. Understand what sampling is and its importance.
2. Define population, sampling frame, and sample.
3. Learn steps in the sampling process.
4. Compare sampling and census methods.
5. Describe probability and non-probability sampling techniques.
6. Define and understand sampling error and standard error.

What Is Sampling?
Sampling is the statistical process of selecting a subset (called a “sample”) of a
population of interest for purposes of making observations and statistical inferences
about that population.

Social science research is generally about inferring patterns of behaviors within specific
populations. We cannot study entire populations because of feasibility and cost
constraints, and hence, we must select a representative sample from the population of
interest for observation and analysis. It is extremely important to choose a sample that
is truly representative of the population so that the inferences derived from the sample
can be generalized back to the population of interest.

It is necessary to go through some important keywords before we discuss sampling
further.

Population: The population refers to the entire group of individuals, items, or entities
that possess certain characteristics and are of interest to the researcher. It represents
the larger group from which a sample is drawn and about which inferences are to be
made.

Sampling Frame: The sampling frame is a list or an operational definition of the target
population from which the sample will be drawn. It serves as a practical way of
identifying and accessing the individuals or items that make up the population.
Ideally, the sampling frame should accurately represent the population and include all its
members. However, in practice, limitations such as incomplete lists or inaccuracies may
exist.

Sample: A sample is a subset of the population that is selected for study. It is chosen to
represent the population and provide insights into its characteristics or behaviors.
Sampling involves selecting a smaller group of individuals or items from the population
with the aim of making generalizations or conclusions about the population as a whole.

Steps in the Sampling Process:

Step 1: Define the Target Population

Step 2: Choose a Sampling Frame

Step 3: Choose a Sample Using a Well-defined Sampling Method
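
The three steps can be sketched in a few lines of Python. This is only an illustrative sketch: the student IDs, the 50 students missing from the frame, and the sample size of 50 are all hypothetical values chosen for the example.

```python
import random

# Step 1: define the target population (hypothetical: 1,000 student IDs).
population = [f"student_{i:04d}" for i in range(1000)]

# Step 2: choose a sampling frame. We assume the available list misses
# the last 50 students, since real frames are often incomplete.
sampling_frame = population[:950]

# Step 3: draw the sample from the frame with a well-defined method
# (here, simple random sampling without replacement).
rng = random.Random(42)  # fixed seed so the draw is reproducible
sample = rng.sample(sampling_frame, k=50)

print(len(population), len(sampling_frame), len(sample))  # 1000 950 50
```

Note that the 50 students absent from the frame can never be selected, which is why an incomplete frame limits how well the sample represents the population.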

Sampling vs Census
The opposite of sampling is complete counting, which is also known as a census. A
census is a systematic and comprehensive survey conducted to gather data about
every individual or item in the population.

Sampling and census are compared side by side in the following table:

| Aspect | Sampling | Census |
| --- | --- | --- |
| Definition | Selecting a subset of individuals/items from the population. | Collecting data from every individual/item in the entire population. |
| Method | Various techniques such as random sampling, stratified sampling, etc. | Direct surveying or observing every member of the population. |
| Advantages | Cost-effective, time-efficient, feasible for large populations. | Comprehensive, accurate, useful for small populations. |
| Representativeness | Aims to represent population characteristics accurately. | Guarantees complete representation of the population. |
| Resource Usage | Requires fewer resources in terms of time, money, and manpower. | Demands more resources due to surveying every member. |
| Accuracy | Introduces potential for sampling error. | Provides accurate data for the entire population. |
| Feasibility | Preferred for large populations or limited resources. | More feasible for smaller populations or when accuracy is critical. |

Advantages and Disadvantages of Sampling over Complete-counting

Advantages of Sampling

1. Cost-effectiveness: Sampling is typically less expensive than conducting a
complete census, especially when dealing with large populations or extensive
data sets. It requires fewer resources in terms of time, manpower, and money.
2. Time efficiency: Sampling allows researchers to gather data quickly since they
don't need to collect information from every single member of the population.
3. Feasibility: In some cases, it may be logistically difficult or practically impossible
to conduct a complete census, especially when dealing with a very large or
dispersed population.
4. Minimizes respondent burden: Sampling reduces the burden on respondents
since only a subset of the population needs to be surveyed or studied.
5. Accuracy: When properly designed and executed, sampling techniques can
yield accurate and reliable results that closely approximate those of a complete
census.

Disadvantages of Sampling

1. Sampling error: There is always a risk of sampling error, which occurs when the
sample data differs from the true characteristics of the population. This error can
lead to incorrect conclusions or generalizations.
2. Bias: If the sample is not representative of the population, it can introduce bias
into the results, leading to skewed or inaccurate findings.
3. Generalizability: While sampling can provide valuable insights, the extent to
which the findings can be generalized to the entire population may be limited,
particularly if the sample is not truly representative.
4. Complexity in design: Designing an effective sampling strategy requires careful
consideration of factors such as sample size, sampling method, and sampling
frame. This complexity can sometimes lead to errors or difficulties in
implementation.
5. Undercoverage: Some segments of the population may be harder to reach or
less likely to participate in a sampling survey, which can affect the
representativeness of the sample.

Sampling Techniques
Sampling techniques can be grouped into two broad categories: probability (random)
sampling and non-probability sampling. Probability sampling is ideal if generalizability
of results is important for your study, but there may be unique circumstances where
non-probability sampling can also be justified.

Probability Sampling
Probability sampling is a technique in which every unit in the population has a chance
(non-zero probability) of being selected in the sample, and this chance can be
accurately determined.
All probability sampling techniques have two attributes in common:
1. Every unit in the population has a known non-zero probability of being sampled,
and
2. The sampling procedure involves random selection at some point.

The different types of probability sampling techniques include:

(1)Simple Random Sampling
In this technique, all possible subsets of a given size from a population (more
accurately, from a sampling frame) are given an equal probability of being selected.
This is the simplest of all probability sampling techniques, and that simplicity is also
its strength. Because the sampling frame is not subdivided or partitioned, the sample is
unbiased and the inferences are the most generalizable among all probability sampling
techniques.
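
A small simulation can make the "equal probability" idea concrete. The sketch below assumes a hypothetical frame of N = 10 units labelled A to J and repeatedly draws simple random samples of size n = 2 with Python's standard `random.sample`; under simple random sampling each unit should appear in roughly n/N = 20% of the samples.

```python
import random
from collections import Counter

frame = list("ABCDEFGHIJ")  # hypothetical sampling frame of N = 10 units
n, draws = 2, 20_000        # sample size and number of repeated samples
rng = random.Random(2024)   # fixed seed for reproducibility

counts = Counter()
for _ in range(draws):
    # One simple random sample of size n, drawn without replacement.
    counts.update(rng.sample(frame, n))

# Share of samples in which each unit was included.
inclusion_freq = {unit: counts[unit] / draws for unit in frame}
for unit, freq in inclusion_freq.items():
    print(unit, round(freq, 3))
```

Every unit ends up with an inclusion frequency close to 0.2, which is the known, non-zero selection probability that defines probability sampling.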

(2)Systematic Sampling
In this technique, the sampling frame is ordered according to some criteria and
elements are selected at regular intervals through that ordered list.

Systematic sampling involves a random start and then proceeds with the selection of
every kth element from that point onwards, where k = N/n is the ratio of the sampling
frame size N to the desired sample size n. This ratio is formally called the sampling
ratio (it is also commonly known as the sampling interval). It is important that the
starting point is not automatically the first in the list, but is instead randomly chosen
from within the first k elements on the list. The sample is representative of the
population, at least on the basis of the sorting criterion.
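
The procedure can be sketched as follows. The function name `systematic_sample` and the frame of 100 numbered units are assumptions made for illustration; the sketch uses integer division for k, so it behaves most cleanly when N is a multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units at a fixed interval k after a random start."""
    k = len(frame) // n          # interval k = N / n (integer part)
    rng = random.Random(seed)
    start = rng.randrange(k)     # random start within the first k elements
    return [frame[start + i * k] for i in range(n)]

frame = list(range(100))         # hypothetical ordered frame, N = 100
chosen = systematic_sample(frame, n=10, seed=8)
print(chosen)
```

With N = 100 and n = 10 the interval is k = 10, so consecutive selected units are always exactly 10 positions apart and only the starting position is random.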

(3)Stratified Sampling
In stratified sampling, the sampling frame is divided into homogeneous and
non-overlapping subgroups (called “strata”), and a simple random sample is drawn
within each subgroup.
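
A minimal sketch of this idea is shown below. The helper `stratified_sample`, the student list, and the two strata ("undergraduate" and "postgraduate") are hypothetical, and the sketch draws the same number of units from every stratum, which is only one of several possible allocation choices.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Split units into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # non-overlapping subgroups
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical frame: 200 students, every fourth one a postgraduate.
students = [(f"student_{i:03d}", "postgraduate" if i % 4 == 0 else "undergraduate")
            for i in range(200)]

by_level = stratified_sample(students, stratum_of=lambda s: s[1],
                             n_per_stratum=10, seed=5)
for level, members in by_level.items():
    print(level, len(members))
```

Because each stratum is sampled separately, small subgroups such as the postgraduates are guaranteed to appear in the sample.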

(4)Cluster Sampling
If you have a population dispersed over a wide geographic region, it may not be
feasible to conduct a simple random sampling of the entire population. In such a case, it
may be reasonable to divide the population into “clusters” (usually along geographic
boundaries), randomly sample a few clusters, and measure all units within those clusters.
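
The following sketch illustrates the idea with made-up data: ten hypothetical villages act as clusters, each containing twenty households. Three villages are chosen at random and every household in them is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: 10 villages (clusters) of 20 households each.
villages = {f"village_{v}": [f"v{v}_household_{h}" for h in range(20)]
            for v in range(10)}

surveyed = cluster_sample(villages, n_clusters=3, seed=13)
print(sorted(surveyed))                          # names of the selected villages
print(sum(len(h) for h in surveyed.values()))    # 60 households in total
```

Randomness enters only at the cluster level, which is what keeps fieldwork concentrated in a few locations.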

Multi-stage Sampling
The probability sampling techniques described previously are all examples of
single-stage sampling techniques. Depending on your sampling needs, you may
combine these single-stage techniques to conduct multi-stage sampling. For instance,
you can stratify a list of businesses based on firm size, and then conduct systematic
sampling within each stratum. This is a two-stage combination of stratified and
systematic sampling.
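
The business example can be sketched as below. The firm names, the three size classes, and the choice of five firms per stratum are hypothetical, and `systematic_sample` is an illustrative helper, not a library function.

```python
import random
from collections import defaultdict

def systematic_sample(frame, n, rng):
    """Every kth unit after a random start within the first k units."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

# Hypothetical list of 300 businesses with a firm-size label.
sizes = ["small"] * 150 + ["medium"] * 100 + ["large"] * 50
businesses = [(f"firm_{i:03d}", size) for i, size in enumerate(sizes)]

# Stage 1: stratify the list by firm size.
strata = defaultdict(list)
for name, size in businesses:
    strata[size].append(name)

# Stage 2: systematic sampling within each stratum.
rng = random.Random(11)
two_stage = {size: systematic_sample(sorted(names), n=5, rng=rng)
             for size, names in strata.items()}
print({size: len(names) for size, names in two_stage.items()})
```

Each stratum uses its own interval (30 for small firms, 20 for medium, 10 for large), so all three size classes contribute exactly five firms.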

Non-Probability Sampling
Non-probability sampling is a sampling technique in which some units of the population
have zero chance of selection or where the probability of selection cannot be accurately
determined. Typically, units are selected based on certain non-random criteria, such as
quota or convenience. Because selection is non-random, non-probability sampling does
not allow the estimation of sampling errors, and may be subject to sampling bias.
Therefore, information from such a sample cannot be reliably generalized back to the
population.

Types of non-probability sampling techniques include:

(1)Convenience Sampling
Also called accidental or opportunity sampling, this is a technique in which a sample is
drawn from that part of the population that is close to hand, readily available, or
convenient. This type of sampling is most useful for pilot testing, where the goal is
instrument testing or measurement validation rather than obtaining generalizable
inferences.

(2)Quota Sampling
In this technique, the population is segmented into mutually-exclusive subgroups (just
as in stratified sampling), and then a non-random set of observations is chosen from
each subgroup to meet a predefined quota.
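
A minimal sketch of quota filling is given below. The respondents, the two subgroups ("F" and "M"), and the quota of two per subgroup are hypothetical; people are accepted in the order they happen to arrive, which is exactly the non-random element of the technique.

```python
def quota_sample(arrivals, group_of, quotas):
    """Accept respondents in arrival order until each quota is filled."""
    counts = {group: 0 for group in quotas}
    chosen = []
    for person in arrivals:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            chosen.append(person)
            counts[group] += 1
        if counts == quotas:      # every quota met, stop recruiting
            break
    return chosen

# Hypothetical respondents in the order they were encountered.
arrivals = [("p1", "F"), ("p2", "F"), ("p3", "M"),
            ("p4", "F"), ("p5", "M"), ("p6", "M")]

chosen = quota_sample(arrivals, group_of=lambda p: p[1], quotas={"F": 2, "M": 2})
print([name for name, _ in chosen])  # ['p1', 'p2', 'p3', 'p5']
```

Respondent p4 is skipped because the quota for group F is already full, and p6 is never reached. No selection probability can be attached to any individual, which is why sampling error cannot be estimated.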

(3)Expert Sampling
This is a technique where respondents are chosen in a non-random manner based on
their expertise on the phenomenon being studied.

(4)Snowball Sampling
In snowball sampling, you start by identifying a few respondents that match the criteria
for inclusion in your study, and then ask them to recommend others they know who also
meet your selection criteria. Although this method rarely leads to representative
samples, it may sometimes be the only way to reach hard-to-reach populations or to
proceed when no sampling frame is available.
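
The referral process can be sketched as a simple breadth-first walk over "who recommends whom". The respondents A to E and their referrals are invented for illustration, and `snowball_sample` is a hypothetical helper.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from seed respondents and follow their referrals."""
    chosen = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(chosen) < max_size:
        person = queue.popleft()
        chosen.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:   # do not recruit anyone twice
                seen.add(contact)
                queue.append(contact)
    return chosen

# Hypothetical referral network: each respondent names people they know.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["A"]}

print(snowball_sample(["A"], referrals, max_size=4))  # ['A', 'B', 'C', 'D']
```

Who ends up in the sample depends entirely on the starting respondents and their social ties, which is why the result is rarely representative.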

Sampling Error
A sampling error is the deviation between a sample result and the true population
value that arises because the sample used in a study does not perfectly represent the
entire population. Even randomized samples have some degree of sampling error
because they only approximate the population from which they are drawn.

This error can be categorized into four types:

1. Population-Specific Error: This occurs when a researcher does not understand
whom to survey.
2. Selection Error: It happens when participants self-select into the survey, so that
only those interested in the survey respond to the questions. Researchers can
mitigate this by encouraging participation.
3. Sample Frame Error: This error arises when a sample is selected from the
wrong population data.
4. Non-Response Error: It occurs when selected participants do not respond to the
survey.

To calculate the overall sampling error, use the following formula:

$$\text{Sampling Error} = Z \times \frac{\sigma}{\sqrt{n}}$$

Where:
- $Z$ represents the Z-score value based on the confidence interval (approximately
1.96 for a 95% confidence interval).
- $\sigma$ is the population standard deviation.
- $n$ is the sample size.
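
As a worked example, the sketch below assumes the formula is the Z-score multiplied by the population standard deviation and divided by the square root of the sample size. The numbers used (a standard deviation of 15 and sample sizes of 100 and 400) are hypothetical.

```python
import math

def sampling_error(z, sigma, n):
    """Z-score times the population standard deviation over sqrt(n)."""
    return z * sigma / math.sqrt(n)

# 95% confidence (Z of about 1.96), hypothetical sigma = 15.
print(round(sampling_error(1.96, 15, 100), 2))  # 2.94
print(round(sampling_error(1.96, 15, 400), 2))  # 1.47
```

Quadrupling the sample size from 100 to 400 only halves the sampling error, because the sample size enters the formula through its square root.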

The prevalence of sampling errors can be reduced by increasing the sample size. As
the sample size increases, the sample gets closer to the actual population, which
decreases the potential for deviations from the actual population. Consider that the
average of a sample of 10 varies more than the average of a sample of 100. Steps can
also be taken to ensure that the sample adequately represents the entire population.
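
The claim about samples of 10 versus 100 can be checked with a short simulation. The population below is hypothetical (10,000 values drawn from a normal distribution with mean 50 and standard deviation 10), and the function name is illustrative.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for reproducibility
# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(n, draws=2000):
    """Standard deviation of the means of many random samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

spread_10 = spread_of_sample_means(10)
spread_100 = spread_of_sample_means(100)
print(round(spread_10, 2), round(spread_100, 2))
```

The means of the size-10 samples are spread out roughly three times as widely as the means of the size-100 samples, in line with the square-root relationship between sample size and sampling error.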

Researchers might attempt to reduce sampling errors by replicating their study. This
could be accomplished by taking the same measurements repeatedly, using more than
one subject or multiple groups, or by undertaking multiple studies.

Random sampling is an additional way to minimize the occurrence of sampling errors,
because it replaces haphazard selection with a defined, chance-based procedure. For
example, rather than choosing participants to be interviewed haphazardly, a researcher
might pick a random starting name and then select every 10th name on the list (a
systematic sample with a random start).

Sampling Distribution
A sampling distribution is a probability distribution of a statistic (such as the mean,
variance, or proportion) obtained from a large number of samples drawn from a specific
population. It describes how the statistic varies from sample to sample and provides
insight into the properties of the statistic under repeated sampling.

Here are some key points about sampling distributions:

1. Foundation in Probability:
- The sampling distribution is based on the concept of repeated sampling. If you take
multiple samples from the same population and calculate a statistic for each sample, the
distribution of these statistics forms the sampling distribution.

2. Central Limit Theorem (CLT):


- The CLT is a fundamental theorem in statistics that states that, regardless of the
population distribution, the sampling distribution of the sample mean (or sum) will tend
to be normally distributed if the sample size is sufficiently large.
- This is particularly useful because it allows statisticians to make inferences about
population parameters even when the population distribution is not normal.

3. Standard Error:
- The standard error is a measure of the variability of the sampling distribution. It
indicates how much the sample statistic is expected to fluctuate from sample to sample.

- For the sample mean, the standard error is $\sigma/\sqrt{n}$, where $\sigma$ is the
population standard deviation and $n$ is the sample size.

Example:
Consider a population with a known mean $\mu$ and standard deviation $\sigma$. If you
take multiple random samples of size $n$ from this population and calculate the mean
for each sample, you will get a distribution of sample means. This distribution of sample
means is the sampling distribution of the sample mean.

- Mean of the Sampling Distribution: The mean of the sampling distribution of the
sample mean is equal to the population mean $\mu$.
- Standard Error: The standard deviation of the sampling distribution (standard error) is
$\sigma/\sqrt{n}$.

Here is a simple illustration:

1. Imagine a population with a certain distribution (e.g., uniform, normal, skewed).
2. Draw multiple random samples of the same size from this population.
3. Calculate the statistic of interest (e.g., mean) for each sample.
4. Plot the distribution of these sample statistics. This plot represents the sampling
distribution.
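
These four steps can be simulated directly. The sketch below assumes a hypothetical, strongly skewed population (exponential values), samples of size 40 drawn with replacement, and 5,000 repetitions. It then compares the simulated sampling distribution with the population mean and with the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed for reproducibility

# Step 1: a hypothetical skewed population (exponential distribution).
population = [rng.expovariate(1.0) for _ in range(50_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Steps 2 and 3: draw many samples of the same size and record each mean.
n = 40
sample_means = [statistics.fmean(rng.choices(population, k=n))
                for _ in range(5000)]

# Step 4: summarise the sampling distribution instead of plotting it.
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = sigma / math.sqrt(n)

print(round(mu, 3), round(mean_of_means, 3))
print(round(theoretical_se, 3), round(sd_of_means, 3))
```

The mean of the sample means sits very close to the population mean, and their standard deviation is close to the theoretical standard error, even though the population itself is far from normal. A histogram of `sample_means` would look roughly bell-shaped.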

The sampling distribution provides a theoretical foundation for statistical inference,
allowing researchers to generalize findings from a sample to the broader population with
quantifiable confidence.

Sampling Error vs Standard Error
Standard error and sampling error are related concepts in statistics, but they refer to
different aspects of the variability and accuracy of sample estimates.

- Concept:
- Standard Error: Measures the variability of the sample statistic over repeated
samples.
- Sampling Error: Measures the deviation of a sample statistic from the population
parameter for a single sample.

- Calculation:
- Standard Error: Calculated using the standard deviation of the population (or
sample) and the sample size.
- Sampling Error: Calculated as the difference between the sample statistic and the
population parameter.

- Usage:
- Standard Error: Used in inferential statistics to make generalizations about the
population, such as constructing confidence intervals and hypothesis testing.
- Sampling Error: Specific to the difference observed in a particular sample compared
to the population, illustrating the accuracy of that single sample.

Visual Representation

Imagine you have a population and you take multiple samples from it:

1. Standard Error: If you plot the sample means from each sample, the standard error
is the standard deviation of these sample means. It shows how much the sample means
vary from each other.

2. Sampling Error: For each sample, the difference between the sample mean and the
population mean is the sampling error. It shows how far each sample mean is from the
true population mean.
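
The distinction can also be seen numerically. In the sketch below the population (20,000 values spread uniformly between 0 and 100) and the sample size of 25 are hypothetical. The standard error is a single number for the sampling design, while the sampling error changes with every sample drawn.

```python
import math
import random
import statistics

rng = random.Random(3)  # fixed seed for reproducibility
# Hypothetical population of 20,000 values.
population = [rng.uniform(0, 100) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)
n = 25

# Standard error: expected variability of the sample mean across samples.
standard_error = sigma / math.sqrt(n)

# Sampling error: sample mean minus population mean, one value per sample.
sampling_errors = []
for _ in range(5):
    sample = rng.sample(population, n)
    sampling_errors.append(statistics.fmean(sample) - mu)

print("standard error:", round(standard_error, 2))
print("sampling errors:", [round(e, 2) for e in sampling_errors])
```

The individual sampling errors differ in size and sign from sample to sample, and the standard error describes their typical magnitude.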

Session Summary

➔ Explained the concept of sampling and its importance in research.
➔ Defined key terms: population, sampling frame, and sample.
➔ Outlined the steps in the sampling process.
➔ Compared the advantages and disadvantages of sampling versus census.
➔ Described various probability and non-probability sampling techniques.
➔ Defined sampling error and standard error, and discussed their impacts.

References
Agresti, A., Franklin, C., & Klingenberg, B. (2022). Statistics: The Art and Science of
Learning from Data. Pearson.

(n.d.). Investopedia. Retrieved June 4, 2024, from
https://www.investopedia.com/terms/s/samplingerror.asp

Course Instructor

S. M. Mehedi Hasan

Student, BBA 3rd Year 1st Semester, Department of Marketing, University of Rajshahi
