Chapter one_Sampling

Uploaded by

jiranusmotuma
Copyright
© © All Rights Reserved
Chapter One

Sampling and Sampling Distribution

1.1. Sampling

Definition: Sampling is the process of selecting a number of individuals for a study in such a way
that the individuals represent the larger group from which they were selected. Sampling is a
method of gathering information about a population by studying a representative subset of the
population, called a sample.

A sample is selected, evaluated and studied in an effort to gain information about the larger
population from which the sample was drawn. A sample represents a population, and
information obtained from a sample is generalized to be true for the entire population from
which it was drawn. A well-selected sample can provide information comparable to that
obtained by a census. The sampling frame is a list of all elements or other units containing the
elements in a population. Can the data gathered from the sample be used to make inferences
about the population? Statistically speaking, yes.

However, different samples generally yield different values of a statistic. A statistic is
therefore itself a random variable, because its value varies from one sample to another.

• Some terms used in sampling

1. Parameter – a numerical characteristic of a population. E.g., total annual GDP or exports.
Also, the mean µ or standard deviation σ of a probability distribution are termed parameters.

2. Statistic – a numerical characteristic of a sample. E.g., the monthly unemployment rate, the
sample mean x̄, or the sample standard deviation s.

3. Sampling distribution of a statistic: the probability distribution of the statistic.

4. Sampling unit: the ultimate unit to be sampled, that is, an element of the population to be
sampled.

Examples: If somebody studies the socio-economic status of households, the household is
the sampling unit.

If one studies the performance of freshman students in some college, the student is the
sampling unit.

5. Sampling frame: the list of all elements in the population under study.

Examples: List of households.

List of students in the registrar's office.

Advantages of sampling

Studying a sample instead of a population can have the following advantages:


1. Cost – Samples can be studied at much lower cost. The smaller number of units or
individuals involved in a sample requires less time and money to evaluate. Samples
can provide affordable, accurate, and useful information in cases where a census
would cost more than the value of the information obtained.
2. Time – Samples can be evaluated more quickly than a population. If a decision had to
wait for the results of a census, a critical advantage might be missed, or the
information might be made obsolete by events or changes that took place while the
data were being collected and analyzed.
3. Accuracy – Any time data are collected, there is a chance for errors to occur. Errors
of measurement, incorrect recording of data, transposition of digits, recording of
information in the wrong area of a form, and errors in entering data into a computer
can all influence the accuracy of results. A sample can provide a data set that is small
enough to monitor carefully and can permit careful training and supervision of data
gatherers and handlers.
4. Scope of information – A sample survey can cover a greater variety of information
than is practicable in a complete census, which is constrained by factors such as the
limited number of trained personnel and equipment. When evaluating a smaller group,
it is sometimes possible to gather more extensive information on each unit evaluated.

Errors in Sampling: There are two types of errors

1. Sampling error: the discrepancy between the population value and the sample
value.
• It may be aggravated by inappropriate sampling techniques.
2. Non-sampling errors: errors due to procedural bias, such as:
• incorrect responses
• measurement errors
• errors at different stages in processing the data.

Types of sampling techniques

There are two categories of sample designs, namely,

Probability (or random) sampling and Non-probability sampling

1. Probability Sampling: In this sub-section, we introduce important sampling methods
which incorporate randomization, which means that the selection is not consciously
influenced by human choice. The major principle of these designs is to avoid bias in the
selection procedure and to achieve the maximum precision for a given outlay of
resources. The main types of probability sampling designs are:
1. Simple random sampling,
2. Systematic sampling,
3. Stratified sampling,
4. Cluster or multi-stage sampling.

1. Simple random sampling: a method of sampling in which every possible sample has an
equal chance of selection. Let n denote the number of subjects in the sample. This
number is called the sample size. A simple random sample of n subjects from a population
is one in which each possible sample of that size has the same probability (chance) of
being selected.

Why is it a good idea to use random sampling?

Because every member of the population has the same chance of inclusion in the sample,
random sampling is fair. This reduces the chance that the sample is seriously biased in
some way, which would lead to inaccurate inferences about the population.

Most inferential statistical methods assume randomization of the sort provided by random
sampling.

How to select a simple random sample

1. Lottery system: The researcher writes the name or number of each item in the sampling
frame on a slip of paper or a card, places the slips in a container, and draws them one
after the other until the required sample size is reached. Each slip corresponds to one
subject or item. To ensure a bias-free selection, the slips must be well mixed: shuffle the
cards or the slips of paper before each draw.
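
The lottery procedure can be imitated in software. The following is a minimal sketch, assuming a hypothetical frame of 20 labelled households; the function name `lottery_draw` and the labels are illustrative and not part of the notes.

```python
import random

def lottery_draw(frame, n, seed=None):
    """Draw n items without replacement, one slip at a time."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)   # seed only makes the illustration repeatable
    slips = list(frame)         # one slip per item in the sampling frame
    drawn = []
    for _ in range(n):
        rng.shuffle(slips)      # mix the remaining slips before each draw
        drawn.append(slips.pop())
    return drawn

# Hypothetical sampling frame of 20 households
households = [f"household_{i:02d}" for i in range(1, 21)]
sample = lottery_draw(households, 10, seed=1)
print(sample)
```

Because each slip is removed once drawn, no household can appear twice in the sample.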

Advantages of the lottery system

□ It is independent of the properties of the population.

□ It is a very reliable method of selecting random samples.

□ It eliminates selection bias.

Disadvantages of the lottery system

□ It is time-consuming and cumbersome when the population is large.

□ It cannot be used when the population is infinite.

2. Table of random numbers

Random numbers are numbers that are computer generated according to a scheme whereby
each digit is equally likely to be any of the integers 0, 1, 2, …, 9 and does not depend on the
other digits generated.

o The numbers fluctuate according to no set pattern. Any particular digit has the
same chance of being a 0, 1, 2, …, or 9.

o The numbers are chosen independently, so any digit chosen has no influence on
any other selection. If the first digit in a row of the table is 9, for instance, the
next digit is still just as likely to be a 9 as a 0 or a 1 or any other digit.

o Random numbers are available in published tables and can be generated with
software and many statistical calculators.

Example: Suppose you want to select a simple random sample of 10 households from a total
of 20 households to study their socio-economic status. The sampling frame is a directory of
these households. You can select the households by using two-digit random numbers to
identify them, as follows:

(1) Assign the numbers 01 to 20 to the households in the directory, using 01 for the first
household in the list, 02 for the second household, and so on.

(2) Starting at any point in the random number table, choose successive two-digit numbers
until you obtain 10 distinct numbers between 01 and 20.

(3) Include in the sample the households whose assigned numbers equal the random
numbers selected.

(4) For example, using the first row of the table, the first 5 two-digit random numbers
are 10, 15, 01, 02 and 14. Notice that we skipped the numbers which are greater than 20,
since no household in the directory has an assigned number greater than 20.

(5) After using the first row of the table, move to the next row of numbers and continue.
Note: The column (or row) from which you begin selecting the numbers does not matter,
since the numbers have no set pattern.
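
The steps above can be sketched in code. This is an illustration only: it assumes a software random-digit generator stands in for the printed table, and the function name `sample_from_random_digits` is hypothetical.

```python
import random

def sample_from_random_digits(frame_size, n, seed=None):
    """Read fixed-width random numbers, skipping out-of-range values and repeats."""
    if n > frame_size:
        raise ValueError("cannot select more distinct units than the frame holds")
    rng = random.Random(seed)
    width = len(str(frame_size))   # two digits for a frame of 20 units
    chosen = []
    while len(chosen) < n:
        # Build one number digit by digit, each digit equally likely to be 0..9
        number = int("".join(str(rng.randint(0, 9)) for _ in range(width)))
        # Skip 00, numbers above the frame size, and numbers already selected
        if 1 <= number <= frame_size and number not in chosen:
            chosen.append(number)
    return chosen

selected = sample_from_random_digits(frame_size=20, n=10, seed=42)
print([f"{k:02d}" for k in selected])
```

Most of the two-digit numbers generated are discarded, which mirrors the skipping described in step (4).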

2. Systematic random sample: Another method of random sampling is to choose every k-th
item from the list, starting from a randomly chosen entry among the first k items on the
list. This is called systematic sampling. The number k is called the skip number. A random
start determines the first element of the sample, and the rest of the sample is selected
systematically at every k-th interval, where k = N/n.
Example: The following Fig. shows how to sample every fourth item, starting from item
2, resulting in a sample of size n = 20 items from a list of N = 78 items. A systematic
sample of n items from a population of N items requires that the skip number be
approximately N/n. In sampling from a sampling frame, it is simpler to select a
systematic random sample than a simple random sample because it uses only one random
number.

Example: Suppose we want a systematic random sample of 100 households, to study the saving
habits of the community, from a population of 30 000 households listed in a directory.
Here, n = 100 and N = 30 000, and so k = 30 000/100 = 300.

• The population size is 300 times the sample size. Therefore we have to select one of
every 300 households.

• We select one household at random from the first 300 in the directory and then take
every 300th household after it. This produces a sample of size 100.

• The first three digits in the random number table are 104, which falls between
001 and 300, so we first select the household numbered 104.

• The numbers of the other households selected are 104 + 300 = 404, 404 + 300 = 704, 704
+ 300 = 1004, 1004 + 300 = 1304, and so on. The 100th household selected, number
104 + 99 × 300 = 29 804, is listed among the last 300 names in the directory.
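
The selection in this example can be reproduced with a short sketch. The function name `systematic_sample` is illustrative; the random start is passed in explicitly so that the value 104 from the example can be reused, whereas in practice it would be drawn at random from 1 to k.

```python
def systematic_sample(N, n, start):
    """Return the unit numbers chosen by taking every k-th unit after a start."""
    k = N // n                       # skip number
    if not 1 <= start <= k:
        raise ValueError("the start must lie among the first k units")
    return [start + i * k for i in range(n)]

selected = systematic_sample(N=30_000, n=100, start=104)
print(selected[:5])    # [104, 404, 704, 1004, 1304]
print(selected[-1])    # 29804
```

Only one random number, the start, is needed; every other selection follows from it.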

3. Stratified random sample: Another probability sampling method, useful in social
science research for studies comparing groups, is stratified random sampling.

Stratified random sampling divides the population into smaller, non-overlapping
subgroups called strata, according to some chosen classification category such as age,
gender, geographic location, and so on. A sub-sample is then selected from each stratum
by simple random sampling.

Example: Taxpayer A sells three types of computer equipment: laptops, desktops and
network servers. The auditor decides to stratify the total population of sales by type of
computer equipment, since that tends to create more homogeneous sub-populations.

There were $2,000,000 total laptop sales during the period, $3,000,000 total desktop
sales and $5,000,000 server sales. The auditor should allocate the total sample size as
follows: 100 from laptop sales, 150 from desktop sales, and 200 from network server sales.
The results were as follows:

For example, a population may consist of males and females who are smokers or
non-smokers.

The researcher will want to include in the sample people from each group, that is, males
who smoke, males who do not smoke, females who smoke, and females who do not
smoke. To accomplish this selection, the researcher divides the population into four
subgroups and then selects a random sample from each subgroup. This method ensures
that the sample is representative on the basis of the characteristics of gender and
smoking. Stratified random sampling is called proportional if the sampled strata
proportions are the same as those in the entire population.

For example, if 90% of the population of interest is men and 10% are women, then the
sampling is proportional if the sample size for men is nine times the sample size for
women. Stratified random sampling is called disproportional if the sampled strata
proportions differ from the population proportions. This is useful when the population
size for a stratum is relatively small. A group that comprises a small part of the
population may not have enough representation in a simple random sample to allow
precise inferences.

Example: Suppose we want to estimate the smallpox vaccination rate among employees in a
university, and we know that our target population (those individuals we are trying to
study) is 55% male and 45% female. Suppose our budget only allows a sample of size
200. To ensure the correct gender balance, we could sample 110 males and 90 females.
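
The proportional allocation in this example can be computed as follows. This is a sketch under stated assumptions: the function name `proportional_allocation` is hypothetical, and the largest-remainder rounding rule is one reasonable choice for cases where the shares do not divide evenly, not something prescribed by the notes.

```python
def proportional_allocation(stratum_sizes, total_n):
    """Split total_n across strata in proportion to their sizes."""
    total = sum(stratum_sizes.values())
    # Integer part of each stratum's exact share
    allocation = {s: size * total_n // total for s, size in stratum_sizes.items()}
    # Hand any leftover units to the strata with the largest remainders
    leftover = total_n - sum(allocation.values())
    by_remainder = sorted(
        stratum_sizes,
        key=lambda s: (stratum_sizes[s] * total_n) % total,
        reverse=True,
    )
    for s in by_remainder[:leftover]:
        allocation[s] += 1
    return allocation

# Population shares in percent, as in the example above
allocation = proportional_allocation({"male": 55, "female": 45}, total_n=200)
print(allocation)   # {'male': 110, 'female': 90}
```

A simple random sample of the allocated size would then be drawn within each stratum.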

4. Cluster random sampling: Simple, systematic, and stratified random sampling are often
difficult to implement, because they require a complete sampling frame. Such lists are
easy to obtain when sampling cities or hospitals, for example, but more difficult to obtain
when sampling individuals or families. A cluster sample is a sample obtained by selecting
a preexisting or natural group, called a cluster, and using the members in the cluster for
the sample. Cluster samples are essentially strata consisting of geographical regions. We
divide a region (say, a city) into sub-regions (say, blocks, subdivisions, or schools).
Cluster sampling is used in large geographic samples where no list is available of all the
units in the population but the population boundaries can be well defined. Cluster
sampling is useful when

• The population frame and stratum characteristics are not readily available.

• It is too expensive to obtain a simple or stratified sample.

• The cost of obtaining data increases sharply with distance.

• Some loss of reliability is acceptable.

Cluster sampling is cheap and quick. It is often reasonably accurate because people in the
same neighborhood tend to be similar in income, ethnicity, educational background, and so
on.

Example 1: The most common cluster used in research is a geographical cluster. For example, a
researcher wants to survey the monthly income of households in Ethiopia. He can divide the
entire population (the population of Ethiopia) into different clusters (cities). Then the
researcher selects a number of clusters, depending on his research, through simple or
systematic random sampling. Then, from the selected clusters (randomly selected cities), the
researcher can either include all the households as subjects or select a number of subjects
from each cluster through simple or systematic random sampling.

Example 2: To obtain information about the drug habits of all high school students in a state,
you could obtain a list of all the school districts in the state and select a simple random
sample of school districts. Then, within each selected school district, list all the high
schools and select a simple random sample of high schools. Within each selected high school,
list all high school classes, and select a simple random sample of classes. Then use the high
school students in those classes as your sample.
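
The staged selection in Example 2 can be sketched as nested simple random samples. Everything in the frame below (district, school, class and student labels, and the counts at each stage) is made up for illustration.

```python
import random

def multistage_sample(frame, n_districts, n_schools, n_classes, seed=None):
    """Sample districts, then schools, then classes; keep every student in chosen classes."""
    rng = random.Random(seed)
    students = []
    for district in rng.sample(sorted(frame), n_districts):
        schools = frame[district]
        for school in rng.sample(sorted(schools), n_schools):
            classes = schools[school]
            for cls in rng.sample(sorted(classes), n_classes):
                students.extend(classes[cls])   # all students of a selected class
    return students

# Hypothetical frame: 4 districts, 3 schools each, 5 classes each, 2 students each
frame = {
    f"D{d}": {
        f"D{d}-S{s}": {
            f"D{d}-S{s}-C{c}": [f"D{d}-S{s}-C{c}-student{i}" for i in range(2)]
            for c in range(5)
        }
        for s in range(3)
    }
    for d in range(4)
}

sample = multistage_sample(frame, n_districts=2, n_schools=2, n_classes=3, seed=0)
print(len(sample))   # 2 districts * 2 schools * 3 classes * 2 students = 24
```

Note that a list is needed only for the units selected at each stage, never for all students in the state.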

Example: What is the difference between a stratified sample and a cluster sample?

Solution: A stratified sample uses every stratum. The strata are usually groups we want to
compare. By contrast, a cluster sample uses a sample of the clusters, rather than all of them. In
cluster sampling, clusters are merely ways of easily identifying groups of subjects. The goal is
not to compare the clusters but to use them to obtain a sample. Most clusters are not represented
in the eventual sample.

2. Non-probability sampling: Non-probability sampling designs select samples in ways that
do not embody randomness. The selection of the elements in the sample rests
solely on personal judgment. The chance of selecting an element cannot be determined.
For this reason, there is no means of measuring the risk of drawing erroneous conclusions
from non-probability samples. Thus the reliability of the results (i.e. the sampling error)
cannot be assessed, and the results cannot be used to draw valid conclusions about the
population.

The main methods of non-probability sampling are Convenience, Judgmental and Quota
Sampling.

1. Convenience sample: a non-probability sampling technique where samples are
selected from the population only because they are conveniently available to the
researcher. The sample is taken from a group of people who are easy to contact or to
reach. Researchers choose these samples just because they are easy to recruit, without
considering whether the sample represents the entire population.

Example:

1. An example of convenience sampling would be using student volunteers known to
the researcher. Researchers can send the survey to students belonging to a particular
school, college, or university, who then act as the sample.
2. An accounting professor who wants to know how many MBA students would take a
summer elective in international accounting can just survey the class she is
currently teaching. The students polled may not be representative of all MBA
students, but an answer (although imperfect) will be available immediately.
2. Judgment sample: Judgment sampling is a non-probability sampling method where the
researcher selects units to be sampled based on his own existing knowledge or his
professional judgment. The sample is obtained by personal judgment and some prior
knowledge of the population. Researchers choose only those people whom they deem
fit to participate in the research study.
For example, if a sample of ten students is to be selected from a class of sixty for
analyzing the spending habits of the students, the investigator would select the 10
students who are, in his opinion, the best representatives of the class.
3. Quota Sampling: Quota sampling is a special kind of judgment sampling, in which the
interviewer chooses a certain number of people in each category (e.g., men/women).
Quota sampling involves first classifying the population into non-overlapping
sub-populations, called strata. The sample is then obtained by selecting the individual
elements from each stratum based on a specified quota. In quota sampling the selection
of the sample is made by the interviewer, who has been given quotas to fill from
specified sub-groups of the population. For example, in a radio listening survey the
interviewer may be told to interview 500 people living in a particular area, and that of
every 100 persons interviewed, 60 are to be housewives, 25 farmers and 15 children
under the age of 15.

Example: A researcher wants to survey individuals about which smartphone brand they prefer
to use. He or she considers a sample size of 500 respondents and is only interested in
surveying ten states in the US. The quotas are:

Gender: 250 male and 250 female; Age: 100 respondents in each of the age groups 16-20,
21-30, 31-40, 41-50, and 51+

Assignment I

Determine the sampling method used in each scenario, and explain when each type of
probability and non-probability sampling should be used.

1. From a list containing the names of 500 members of an alumni association, a sample of
size 50 is obtained by including every 10th person in the list in the sample.
2. The students in a given school are classified according to grade level. Twenty students
from each group will be randomly chosen to participate in a study involving students’
study habits.
3. All the students who belong to ten chosen sections in a certain school will participate in a
study designed to improve students’ critical thinking skills.
4. A researcher is interested in studying the effects of diet on the attention span of third-
grade students in a large city. There are 1,500 third-graders attending the elementary
schools in the city. The researcher selects 150 of these third graders, 30 each in five
different schools, as a sample for study.
5. An administrator in a large urban high school is interested in student opinions on a new
counseling program in the district. There are six high schools and some 14,000 students
in the district. From a master list of all students enrolled in the district schools, the
administrator selects a sample of 1,400 students (350 from each of the four grades, 9–12)
to whom he plans to mail a questionnaire asking their opinion of the program.
6. The principal of an elementary school wants to investigate the effectiveness of a new
U.S. history textbook used by some of the teachers in the district. Out of a total of 22
teachers who are using the text, she selects a sample of 6. She plans to compare the
achievement of the students in these teachers’ classes with those of another 6 teachers
who are not using the text.

1.2 Sampling Distribution

In inferential statistics, we want to use characteristics of the sample (i.e. a statistic) to estimate
the characteristics of the population (i.e. a parameter).

A sampling distribution is the probability distribution of a statistic obtained from a large
number of samples drawn from a specific population. It describes the range of values the
statistic could take and how frequently each value would occur.

The sampling distribution of X̄ is the probability distribution of all possible values the random
variable X̄ may assume when a sample of size n is taken from a specified population.

Steps for the construction of Sampling Distribution of the mean

1. From a finite population of size N, draw all possible samples of size n.

2. Calculate the mean for each sample.

3. Summarize the means obtained in step 2 in terms of a frequency distribution or relative
frequency distribution.

• Example: Suppose we have a population of size N = 5, consisting of the ages of five
children: 6, 8, 10, 12, and 14.

a. What is the population mean?

b. What is the sampling distribution of the sample means for samples of size 2?

c. What is the mean of the sampling distribution?

Solutions:

a. Population mean = (6 + 8 + 10 + 12 + 14)/5 = 10.

b. To get the sampling distribution of the sample mean, use the combination rule: NCn = 5C2 = 10.
There are therefore 10 possible samples of size 2 that can be drawn from the population, each
with its own sample mean:

Sample      Mean
(6, 8)      7
(6, 10)     8
(6, 12)     9
(6, 14)     10
(8, 10)     9
(8, 12)     10
(8, 14)     11
(10, 12)    11
(10, 14)    12
(12, 14)    13

The sample mean takes the values 7, 8, 9, 10, 11, 12 and 13 with relative frequencies
0.1, 0.1, 0.2, 0.2, 0.2, 0.1 and 0.1.

c. The mean of the sampling distribution is
(7 + 8 + 9 + 10 + 9 + 10 + 11 + 11 + 12 + 13)/10 = 100/10 = 10, which equals the population mean.
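
The three construction steps can be checked by brute force. The sketch below enumerates every sample of size 2 from the five ages, assuming sampling without replacement as implied by the combination rule 5C2; exact fractions are used so that no rounding enters the result.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

ages = [6, 8, 10, 12, 14]
n = 2

# Steps 1 and 2: every possible sample of size n and its mean
sample_means = [Fraction(sum(s), n) for s in combinations(ages, n)]

# Step 3: frequency and relative frequency of each distinct mean
distribution = Counter(sample_means)
for mean in sorted(distribution):
    freq = distribution[mean]
    print(mean, freq, Fraction(freq, len(sample_means)))

# Mean of the sampling distribution
mean_of_means = sum(sample_means) / len(sample_means)
print(mean_of_means)   # 10
```

The printed mean of the sample means matches the population mean of 10.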

Relationships between Population Parameters and the Sampling Distribution of the Sample
Mean

The expected value of the sample mean is equal to the population mean:

$$E(\bar{X}) = \mu_{\bar{X}} = \mu_X$$

The variance of the sample mean is equal to the population variance divided by the sample
size:

$$V(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}$$

The standard deviation of the sample mean, known as the standard error of the mean, is
equal to the population standard deviation divided by the square root of the sample size:

$$SD(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$
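
These relationships can be verified numerically. The sketch below reuses the five ages 6, 8, 10, 12 and 14 and assumes the draws are independent, that is, sampling with replacement, so that all 25 ordered samples of size 2 are equally likely. That assumption is ours: when a small finite population is sampled without replacement, the variance of the sample mean is smaller than the population variance divided by n.

```python
import math
from itertools import product
from statistics import fmean, pvariance

population = [6, 8, 10, 12, 14]
n = 2

# All equally likely ordered samples of size n, drawn with replacement
means = [fmean(sample) for sample in product(population, repeat=n)]

mu = fmean(population)            # population mean
sigma2 = pvariance(population)    # population variance

expected_mean = fmean(means)      # E(X-bar)
variance_of_mean = pvariance(means)          # V(X-bar)
standard_error = math.sqrt(variance_of_mean) # SD(X-bar)

print(expected_mean, mu)
print(variance_of_mean, sigma2 / n)
print(standard_error, math.sqrt(sigma2) / math.sqrt(n))
```

Each printed pair should agree up to floating-point rounding.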

Assignment II

1. Write a short note on the sampling techniques (probability and non-probability).

2. Discuss the sampling technique you have been assigned (or have chosen).

3. Taking a real-life problem in your field of study, discuss how to select a sample using the
sampling technique you have chosen.

4. Submit the report on the due date.

5. Make a presentation of your work.
