
Statistics Group 1

The document discusses sampling distributions and the central limit theorem. It defines key terms like population, sample, parameter, and statistic. It explains how to calculate the mean and variance of the sampling distribution of the sample mean. When samples are drawn from a large population, the sampling distribution of the sample means will be approximately normal, even if the original population is not, according to the central limit theorem.
Copyright © All Rights Reserved

Sampling and Sampling Distribution
Objectives
By the end of the chapter, the students must be able to:

• Compute the mean of a sampling distribution
• Compute the variance of a sampling distribution
• Solve problems applying the Central Limit Theorem
• Identify sampling distributions of statistics (sample mean)
• Find the mean and variance of the sampling distribution of the sample mean
• Define the sampling distribution of the sample mean for a normal population when the variance is (a) known and (b) unknown
• Illustrate the Central Limit Theorem
• Define the sampling distribution of the sample mean using the Central Limit Theorem
Simple Random Sampling
It is often necessary to study and make inferences about a large group or population by taking a small subset of it. The researcher usually does not have the time or enough funds to study all the individual elements of the population. So, taking a subset that presents the general traits or characteristics of the large group is necessary. This subset is called a sample.
To study the shopping habits of those going to a certain mall over the Christmas season, a sample of 100 shoppers on a particular day can already be representative.
Voter preferences in an upcoming election across the country can be sampled through 1,000 respondents chosen online across the country.
The resistance to heat of chicks in a farm can be studied using a sample of 100 chosen chicks. The results from the study can hold true for the whole population of chicks in the farm.
A common method of sampling is simple random sampling. For example, for a batch of 1,000 students, those with class numbers ending with the unit digit 8 can be chosen as elements of the subset. As this method can still generate a big subset, another random sampling can further be done to get a more manageable size for the sample.
Simple random sampling can be done with replacement or without replacement. If you pick a student with a class number ending in eight and return the name to the list for a chance of being selected again, it is simple random sampling with replacement (SRSWR). If you pick a student with a class number ending in eight and do not return the student’s name to the list, it is simple random sampling without replacement (SRSWOR).
Lotteries are also used to do random sampling. This is like the traditional picking of names or numbers from a fishbowl holding hundreds of them. In SRSWR, every element in the original population has an equal chance of being chosen on every draw.

Computers are now able to generate random samples quickly. Available population data can be gathered, and small samples from these can be generated in the blink of an eye.
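As a minimal sketch of how a computer can draw such samples, the Python standard library's random module offers one call for each scheme. The roster of class numbers 1 to 1,000 and the sample size of 10 are assumptions made for illustration only.

```python
import random

# Hypothetical roster: class numbers 1..1000 stand in for the batch of students.
population = list(range(1, 1001))
rng = random.Random(42)  # fixed seed so the draw is repeatable

# SRSWOR: a chosen student is not returned to the list, so no repeats occur.
srswor = rng.sample(population, k=10)

# SRSWR: a chosen student is returned to the list, so repeats are possible.
srswr = rng.choices(population, k=10)

print("Without replacement:", srswor)
print("With replacement:   ", srswr)
```

With only 10 draws from 1,000 students a repeat is unlikely but possible in the with-replacement sample; it can never happen in the without-replacement sample.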
Parameter Versus Statistic
The distinction between a parameter and a statistic is clear-cut. A statistic is a measurement that describes a sample. A parameter is a measurement that describes the whole population. If a bunch of lanzones taken from a big box is 90% sweet, then the value 90% is a statistic. It describes a sample, the bunch of lanzones.
If all the applicants taking the Senior High School Entrance Test were asked how they feel and 75% said they are nervous, then 75% is a parameter; it describes the whole group of test-takers, which is the population. If 800 of the 1,200 test-takers above said they have eaten breakfast well, then 66.67% is a parameter.
Consider each of the following situations, then determine whether it describes a statistic or a parameter.

1. All 550 churchgoers were asked about their preferences for a barangay venue for the Christmas celebration. Sixty percent preferred the barangay hall to serve as the venue.
2. There were 2,140 marathon runners who signed up, and 100% drank at least 3 bottles of water along the course.
3. Thirty students out of the whole batch had a mean percentage score of 82% in the diagnostic test.
4. Ten of the basketball games played during the year had a mean total score of 152 between the opposing teams.
5. The average height of 25 students from the Grade 6 batch was determined to be 5’2”.
Answers:
1. All 550 churchgoers are the population. 60% is a parameter.
2. The 2,140 marathon runners are the population. 100% is a parameter.
3. The thirty students are a sample. 82% is a statistic.
4. The 10 basketball games are a sample. The mean score of 152 is a statistic.
5. The 25 students are a sample. The average height of 5’2” is a statistic.
Identifying Sampling Distributions
Consider a large population of prospective senior high school students. They are given a test on abstract reasoning. A sample of 20 students is taken, and the mean of their abstract reasoning test scores is computed. More samples of 20 students are then taken repeatedly, and the mean and standard deviation of each sample are determined.
The distribution of the means of these numerous samples is approximately normal. This illustrates the Central Limit Theorem (CLT). In short, if the random samples are large and many of them are taken, the mean of the sample means will equal the population mean. Also, the standard deviation of the sample means around the population mean will equal the population standard deviation divided by the square root of the sample size n.
Mean of the Sampling Distribution of the Means

The mean of the population, $\mu$, is also the mean of the sampling distribution of the sample means, $\mu_{\bar{x}}$, for samples taken from that population.
Therefore, $\mu_{\bar{x}} = \mu$.
It is, of course, possible to take several samples from a population. The mean of each sample can be determined. The probability distribution of these means is the sampling distribution of the sample mean.

Likewise, the variance of the sampling distribution of the mean can also be determined.
Variance of the Sampling Distribution of the Means
The variance of the sampling distribution of the means is given by the formula
$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$
where $\sigma_{\bar{x}}^2$ is the variance of the sampling distribution of the mean; $\sigma^2$ is the population variance; and $n$ is the sample size.
From the formula $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$, it can be seen that the variance approaches zero as $n$ increases. The larger the sample size, the narrower the spread of the sample means becomes, and the closer the sample means lie to the population mean.
Take a look at an example in finding the mean and the
variance of the sampling distribution of the sample mean.
EXAMPLE 1: Our hypothetical population contains the scores 4, 6, 7, and 9. Determine the mean and variance of the sampling distribution of the sample mean, given that samples contain two scores drawn from the population with replacement.
SOLUTION: The possible samples are given.

Draw  Sample
1     (4, 4)
2     (4, 6)
3     (4, 7)
4     (4, 9)
5     (6, 4)
6     (6, 6)
7     (6, 7)
8     (6, 9)
9     (7, 4)
10    (7, 6)
11    (7, 7)
12    (7, 9)
13    (9, 4)
14    (9, 6)
15    (9, 7)
16    (9, 9)
The means of the samples are then determined.

Draw  Sample   $\bar{x}$
1     (4, 4)   4.0
2     (4, 6)   5.0
3     (4, 7)   5.5
4     (4, 9)   6.5
5     (6, 4)   5.0
6     (6, 6)   6.0
7     (6, 7)   6.5
8     (6, 9)   7.5
9     (7, 4)   5.5
10    (7, 6)   6.5
11    (7, 7)   7.0
12    (7, 9)   8.0
13    (9, 4)   6.5
14    (9, 6)   7.5
15    (9, 7)   8.0
16    (9, 9)   9.0
Given the population mean $\mu = \frac{4+6+7+9}{4}$ or 6.5, we complete the table with columns for $\bar{x}-\mu$, the deviations of the sample means from the population mean, and for $(\bar{x}-\mu)^2$.
Draw Sample $\bar{x}$ $\bar{x}-\mu$ $(\bar{x}-\mu)^2$
1 (4, 4) 4.0 -2.5 6.25
2 (4, 6) 5.0 -1.5 2.25
3 (4, 7) 5.5 -1.0 1.00
4 (4, 9) 6.5 0 0
5 (6, 4) 5.0 -1.5 2.25
6 (6, 6) 6.0 -0.5 0.25
7 (6, 7) 6.5 0 0
8 (6, 9) 7.5 1.0 1.00
9 (7, 4) 5.5 -1.0 1.00
10 (7, 6) 6.5 0 0
11 (7, 7) 7.0 0.5 0.25
12 (7, 9) 8.0 1.5 2.25
13 (9, 4) 6.5 0 0
14 (9, 6) 7.5 1.0 1.00
15 (9, 7) 8.0 1.5 2.25
16 (9, 9) 9.0 2.5 6.25
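The mean of the sampling distribution of the means is given by the formula
$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n}$$
where $n$ here counts the possible samples, not the size of each sample.

In our example, $\sum \bar{x} = 104$ and $n = 16$. Thus,
$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n} = \frac{104}{16} = 6.5$$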
The variance of the sampling distribution of the means is given by the formula
$$\sigma_{\bar{x}}^2 = \frac{\sum (\bar{x} - \mu)^2}{n}$$

In our example, $\sum (\bar{x} - \mu)^2 = 26$ and $n = 16$. Thus,
$$\sigma_{\bar{x}}^2 = \frac{\sum (\bar{x} - \mu)^2}{n} = \frac{26}{16} = 1.625$$
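The table-based computation in Example 1 can be cross-checked by enumerating the samples in code. The sketch below is one way to do it, not the method used in the slides; the variable names are illustrative. It also compares the result with the population variance divided by the sample size of 2.

```python
from itertools import product

population = [4, 6, 7, 9]
n = 2  # sample size (two scores per sample)

# All ordered samples of size n drawn with replacement: 4**2 = 16 of them.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

mu = sum(population) / len(population)                 # population mean
mean_of_means = sum(sample_means) / len(sample_means)  # mean of the sampling distribution
var_of_means = sum((m - mu) ** 2 for m in sample_means) / len(sample_means)
pop_var = sum((x - mu) ** 2 for x in population) / len(population)

print(len(samples))    # 16
print(mean_of_means)   # 6.5
print(var_of_means)    # 1.625
print(pop_var / n)     # 1.625
```

The last two lines agree, which is the relation between the variance of the sampling distribution and the population variance divided by the sample size.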
Given a population with known mean and known variance, the sampling distribution of the sample means approaches a normal distribution even if the population itself is not normally distributed. A sufficiently large sample size (30 or more) brings about this approach to a normal distribution. Smaller sample sizes can also give a good approximation when the population itself is close to normal.
For populations of unknown variance, it is still possible to get the population mean, and this equals the mean of the sampling distribution of the means.
Let us use again our hypothetical population (4,
6, 7, 9) and the table of values previously generated.
We can make a tally of the means $\bar{x}$ and generate the following frequency distribution table.

$\bar{x}$   Frequency
4.0   |
4.5
5.0   ||
5.5   ||
6.0   |
6.5   ||||
7.0   |
7.5   ||
8.0   ||
8.5
9.0   |
To compare with the normal curve, we can convert the frequency table above to a probability table and construct the probability histogram accordingly.

$\bar{x}$   Frequency   Probability
4.0   |      1/16
4.5
5.0   ||     1/8
5.5   ||     1/8
6.0   |      1/16
6.5   ||||   1/4
7.0   |      1/16
7.5   ||     1/8
8.0   ||     1/8
8.5
9.0   |      1/16
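Probability Histogram
[Figure: probability histogram of $\bar{x}$ for n = 2. Horizontal axis: sample means from 4 to 9 in steps of 0.5. Bar heights: 1/16 at 4, 6, 7, and 9; 2/16 at 5, 5.5, 7.5, and 8; 4/16 at 6.5.]
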
Even at this small sample size, the histogram is symmetric about the population mean of 6.5 and peaks there. The sampling distribution begins to approximate the normal distribution.
Try a sampling distribution for $\bar{x}$ of the same population when the sample size is n = 3. What do you notice? Describe the probability histogram. How does it compare with the probability histogram for a sample size n = 2?
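For readers who want to explore this with a computer, the following sketch builds the probability table of the sample mean for any sample size by enumerating all samples drawn with replacement. The function name sampling_distribution and the text histogram are illustrative choices, not part of the original material.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sampling_distribution(population, n):
    """Return {sample mean: probability} for samples of size n drawn with replacement."""
    counts = Counter(Fraction(sum(s), n) for s in product(population, repeat=n))
    total = len(population) ** n
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}

pop = [4, 6, 7, 9]
for n in (2, 3):
    dist = sampling_distribution(pop, n)
    print(f"n = {n}")
    for m, p in dist.items():
        # Crude text histogram: one '#' per sample that produced this mean.
        bar = "#" * int(p * len(pop) ** n)
        print(f"  {float(m):5.2f}  {str(p):>5}  {bar}")
```

Exact fractions are used so that the probabilities for n = 2 match the table above without rounding.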
Example 2: Given the population (18,20,24,28).
Determine the mean and variance of the sampling
distribution of the sample means, given that samples
contain two scores drawn from the population with
replacement.
Solution: The list of samples is given below.

Draw  Sample     Draw  Sample
1     (18,18)    9     (24,24)
2     (18,20)    10    (24,18)
3     (18,24)    11    (24,20)
4     (18,28)    12    (24,28)
5     (20,20)    13    (28,28)
6     (20,18)    14    (28,18)
7     (20,24)    15    (28,20)
8     (20,28)    16    (28,24)
The means of the samples are as follows.
Draw Sample $\bar{x}$
1 (18,18) 18
2 (18,20) 19
3 (18,24) 21
4 (18,28) 23
5 (20,20) 20
6 (20,18) 19
7 (20,24) 22
8 (20,28) 24
9 (24,24) 24
10 (24,18) 21
11 (24,20) 22
12 (24,28) 26
13 (28,28) 28
14 (28,18) 23
15 (28,20) 24
16 (28,24) 26
The population mean is determined to be $\frac{18+20+24+28}{4}$ or 22.5.
We then compute the deviations of the sample means from the population mean, $(\bar{x} - \mu)$, and then the squares of these deviations, $(\bar{x} - \mu)^2$.
Draw Sample $\bar{x}$ $(\bar{x} - \mu)$
1 (18,18) 18 -4.5
2 (18,20) 19 -3.5
3 (18,24) 21 -1.5
4 (18,28) 23 0.5
5 (20,20) 20 -2.5
6 (20,18) 19 -3.5
7 (20,24) 22 -0.5
8 (20,28) 24 1.5
9 (24,24) 24 1.5
10 (24,18) 21 -1.5
11 (24,20) 22 -0.5
12 (24,28) 26 3.5
13 (28,28) 28 5.5
14 (28,18) 23 0.5
15 (28,20) 24 1.5
16 (28,24) 26 3.5
The mean of the sampling distribution of the mean is given by the formula
$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n}$$
In our example, $\sum \bar{x} = 360$ and $n = 16$. Therefore,
$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n} = \frac{360}{16} = 22.5$$
The variance of the sampling distribution of the means is given by the formula
$$\sigma_{\bar{x}}^2 = \frac{\sum (\bar{x} - \mu)^2}{n}$$
We then compute $(\bar{x} - \mu)^2$, our 5th column.

Draw  Sample    $\bar{x}$   $(\bar{x} - \mu)$   $(\bar{x} - \mu)^2$
1     (18,18)   18    -4.5   20.25
2     (18,20)   19    -3.5   12.25
3     (18,24)   21    -1.5   2.25
4     (18,28)   23    0.5    0.25
5     (20,20)   20    -2.5   6.25
6     (20,18)   19    -3.5   12.25
7     (20,24)   22    -0.5   0.25
8     (20,28)   24    1.5    2.25
9     (24,24)   24    1.5    2.25
10    (24,18)   21    -1.5   2.25
11    (24,20)   22    -0.5   0.25
12    (24,28)   26    3.5    12.25
13    (28,28)   28    5.5    30.25
14    (28,18)   23    0.5    0.25
15    (28,20)   24    1.5    2.25
16    (28,24)   26    3.5    12.25
In our example, $\sum (\bar{x} - \mu)^2 = 118$ and $n = 16$. Thus,
$$\sigma_{\bar{x}}^2 = \frac{\sum (\bar{x} - \mu)^2}{n} = \frac{118}{16} = 7.375$$
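As a cross-check on this result, the same value should follow from the earlier relation between the variance of the sampling distribution and the population variance. In the worked equations below, $n = 2$ denotes the size of each sample (two scores), not the 16 possible samples:

$$\sigma^2 = \frac{(18-22.5)^2 + (20-22.5)^2 + (24-22.5)^2 + (28-22.5)^2}{4} = \frac{20.25 + 6.25 + 2.25 + 30.25}{4} = \frac{59}{4} = 14.75$$

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{14.75}{2} = 7.375$$

Both routes give 7.375.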
Given a population with known mean and known
variance, the sampling distribution of the sample
means approaches a normal distribution even if the
population itself does not show a normal distribution.
Consider the table again.
We can tally the means and determine the following frequency distribution table.
$\bar{x}$ Frequency
18 I
19 II
20 I
21 II
22 II
23 II
24 III
25
26 II
27
28 I
Again, we convert the frequency table to a probability table.

$\bar{x}$   Frequency   Probability
18   I     1/16
19   II    1/8
20   I     1/16
21   II    1/8
22   II    1/8
23   II    1/8
24   III   3/16
25
26   II    1/8
27
28   I     1/16
Our probability histogram follows.
PROBABILITY HISTOGRAM
[Figure: probability histogram of $\bar{x}$. Horizontal axis: sample means from 18 to 28.]

As shown, the graph starts to approach a normal distribution.


With our examples above, we now state the Central Limit Theorem.
Central Limit Theorem
For a population with a finite mean $\mu$ and a finite non-zero variance $\sigma^2$, the sampling distribution of the mean approaches a normal distribution with a mean of $\mu$ and a variance of $\frac{\sigma^2}{n}$ as the sample size $n$ increases.
The following illustrates the sampling distribution of the mean approaching a normal distribution as $n$ increases.

[Figures: Distribution of Sample Mean for $n = 5$, $n = 15$, and $n = 25$]
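The behavior pictured for n = 5, 15, and 25 can be imitated with a small simulation. The sketch below assumes a right-skewed exponential population with mean 1 and variance 1, which is a choice made here for illustration and not the population behind the original figures. It reports the mean, variance, and skewness of the simulated sample means.

```python
import random
from statistics import fmean, pvariance

def simulate_sample_means(n, trials=20000, seed=0):
    """Draw `trials` samples of size n from an exponential(1) population; return their means."""
    rng = random.Random(seed)
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

def skewness(xs):
    """Population skewness; a value near 0 indicates a symmetric distribution."""
    m = fmean(xs)
    sd = pvariance(xs, m) ** 0.5
    return fmean(((x - m) / sd) ** 3 for x in xs)

results = {}
for n in (5, 15, 25):
    means = simulate_sample_means(n)
    results[n] = (fmean(means), pvariance(means), skewness(means))
    m, v, s = results[n]
    print(f"n={n:2d}  mean={m:.3f}  variance={v:.4f}  "
          f"sigma^2/n={1 / n:.4f}  skewness={s:.3f}")
```

Under these assumptions the mean of the sample means should stay close to 1, the variance should track 1/n, and the skewness should shrink toward 0 as n grows, which is the pattern the theorem describes.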
From the Central Limit Theorem, we also state that the distribution of random sample means approaches a normal distribution with a mean $\mu$ and a standard deviation of $\frac{\sigma}{\sqrt{n}}$ as $n$ increases.
The standard error of the mean is the standard deviation of
the sampling distribution of the sample means. While a
sample mean estimates the population mean, different
samples can have different means. We, thus, get the standard
deviation of these sample means and use this for computing
the z-score for the sampling distribution.
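In symbols, the standard error and the z-score described above can be written as follows. The notation $\sigma_{\bar{x}}$ for the standard error follows the variance formula used earlier in this chapter.

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}, \qquad z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

For instance, with the figures of Example 6 further on ($\mu = 73$, $\sigma = 12$, $n = 100$, $\bar{x} = 70$):

$$\sigma_{\bar{x}} = \frac{12}{\sqrt{100}} = 1.2, \qquad z = \frac{70 - 73}{1.2} = -2.5$$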
Example 3:
Suppose a population has mean 80 and standard deviation 10. We then take a sample of 90 cases, and the mean of this sample is 82. How frequently does the mean of a sample of 90 cases exceed the population mean by 2 or more points?
Solution:
Thus, the probability that the sample mean exceeds the population mean by 2 or more points is 0.0287, or about 3 times out of every 100 samples.
Example 4:
The mean height of the Grade V students is 148 cm with a standard deviation of 8 cm. A sample of 30 students is taken, and the mean height of the sample is 145 cm. What is the probability that a sample of 30 students has a mean height that falls 3 cm or more below the population mean?
Solution:
The probability is the area 0.0202. Therefore, there is about a 2% probability that a sample of 30 students has a mean height that falls 3 cm or more below the population mean.
Example 5:
The mean weight of a banana is 92 g with a standard deviation of 6 g. A sample of 36 is taken from a basket, and the mean weight of the sample is 94 g. What is the probability that a sample of 36 has a mean weight that exceeds the population mean by 2 g or more?
Solution:
The probability that the mean weight of the sample exceeds the population mean by 2 g or more is 0.0228 or 2.28%.
Example 6:
What is the probability that the sample mean is 70 or lower
for a sample of 100 taken from a population with mean and
standard deviation of 73 and 12, respectively?
Solution:
Using Table II in the appendix, the z-score of -2.5 indicates a probability of 0.0062 to the left of 70. So, the sample mean is 70 or lower about 6.2 times out of 1,000.
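The stated answers to Examples 3 to 6 can be reproduced with a short script. This is a sketch assuming the normal approximation from the Central Limit Theorem and one-tailed areas; the helper name prob_sample_mean_below is hypothetical. Because the script uses unrounded z-scores while a printed table rounds z to two decimals, the last digit can differ slightly from the table-based values.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(sample mean <= x_bar) under the normal approximation from the CLT."""
    standard_error = sigma / sqrt(n)
    z = (x_bar - mu) / standard_error
    return NormalDist().cdf(z)

# Example 3: area to the right of 82 (mu = 80, sigma = 10, n = 90)
p3 = 1 - prob_sample_mean_below(82, 80, 10, 90)
# Example 4: area to the left of 145 (mu = 148, sigma = 8, n = 30)
p4 = prob_sample_mean_below(145, 148, 8, 30)
# Example 5: area to the right of 94 (mu = 92, sigma = 6, n = 36)
p5 = 1 - prob_sample_mean_below(94, 92, 6, 36)
# Example 6: area to the left of 70 (mu = 73, sigma = 12, n = 100)
p6 = prob_sample_mean_below(70, 73, 12, 100)

for label, p in [("Example 3", p3), ("Example 4", p4),
                 ("Example 5", p5), ("Example 6", p6)]:
    print(f"{label}: {p:.4f}")
```

Each result is the area in a single tail. A two-sided question, such as a difference of 2 or more points in either direction, would double the one-tailed area.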
Thank You for
listening!
