Statistics and Probability-Module 2


11

Statistics and Probability


for Senior High School
Quarter 1 – Module 2
The Normal Distribution

Locally Developed Self-Learning Material


Statistics and Probability – Grade 11


Quarter 2 – Lesson 1: The Normal Distribution and the Standard Normal Distribution
Lesson 2: Sampling and Sampling Distribution

Development Team of the Module

Writer: April Joy V. Albior


Editors: Danica Mae B. Cleopas
Kimberly D. Miraflor
Rachel R. Puno
Reviewers: Ruel D. Emberga
Corazon B. Dumlao, EdD
Ramil G. Gonzales, PhD
Roderick A. Tadeo, PhD

Cover Illustrator: Gamaliel R. Paz Jr.

Management Team: Leilani S. Cunanan, CESO V


Maylene M. Minimo, EdD, CESE
Ariel C. Lansang
Jose C. Tala. EdD

Pre-Assessment

Read each item carefully and choose the letter of the correct answer. Write your
answers on a separate sheet of paper.

1. What is the total area under the standard normal distribution curve?
a. 0 c. 2
b. 1 d. 3

2. About which of the following is the normal curve symmetric?


a. Mean c. Variance
b. Standard Deviation d. Sample Mean

3. Approximately what percentage of normally distributed data values will fall within
1 standard deviation above or below the mean?
a. 68% c. 99.7%
b. 95% d. 34%

4. Which is not a property of the standard normal distribution?


a. It’s symmetric about the mean. c. It’s bell-shaped.
b. It’s uniform. d. It’s unimodal.

5. Find the area under the standard normal distribution curve between z=0 and
z=1.77.
a. 0.5000 c. 0.4616
b. 0.9616 d. 0.5394

6. Find the area under the standard normal distribution curve to the right of 𝑧 = 0.29.
a. 0.3859 c. 0.6141
b. 0.3959 d. 0.3895

7. Find the area under the standard normal distribution curve to the left of 𝑧 = −0.75.
a. 0.7734 c. 0.2626
b. 0.7334 d. 0.2266

8. Find the probability 𝑃(−1.96 < 𝑧 < 1.96)


a. 0.9550 c. 0.0450
b. 0.9950 d. 0.9500

9. Find the probability 𝑃(𝑧 < −1.77)


a. 0.0375 c. 0.9616
b. 0.0384 d. 0.9625

10. Find the probability 𝑃(𝑧 > 2.33)


a. 0.9901 c. 0.0099
b. 0.9904 d. 0.0909

11. If the area under a normal curve is 0.4177, between which two z-values does the
area lie?
a. 0 and -1.39 c. -1.39 and 1.39
b. 1 and 1.39 d. 0 and

12. Which of the following is a subset of a population?
a. statistic c. variable
b. parameter d. sample

13. Which of the following is a sampling method in which every element of the
population has the same probability of being selected for inclusion in the sample?
a. systematic random sampling c. clustered sampling
b. simple random sampling d. stratified random sampling

14. Which of the following is a sampling method in which the whole population is
subdivided into clusters, or groups, and then either random samples are collected from
each selected group or all members of the selected groups are taken?
a. systematic random sampling c. clustered sampling
b. simple random sampling d. stratified random sampling

15. Which of the following sampling techniques is done by selecting every 𝑘th element
in your population list?
a. systematic random sampling c. clustered sampling
b. simple random sampling d. stratified random sampling

Lesson 1: The Normal Distribution and the Standard Normal Distribution

About the Lesson


In the previous module, it has been established that random variables can either be
discrete or continuous. Also, we have focused on the discrete random variable and its
probability distribution. This time, we will examine more closely the continuous random
variable and its distribution. Many continuous variables have distributions that are usually
bell-shaped, and these are called approximately normally distributed variables.

Objective/Learning Competency
In this lesson you are expected to:
1. illustrate a normal random variable and its characteristics. (M11/12SP-IIIc-1)
2. identify regions under the normal curve corresponding to different standard
normal values. (M11/12SP-IIIc-3)
3. convert a normal random variable to a standard normal variable and vice
versa. (M11/12SP-IIIc-4)
4. compute probabilities and percentiles using the standard normal table.
(M11/12SP-IIId-1)

Lesson Proper

I. Activity

Activity 1.

Figure 1
Observe the four histograms above. What happens to the shape of the histogram as the
sample size is increased and the class width is decreased further?
______________________________________________________________________
__________________________________________________________________

II. Questions to Ponder


1. What characterizes a normal distribution?

III. Example and Discussion

NORMAL DISTRIBUTION

In mathematics, curves can be represented by equations. For example, the equation
of the circle is $x^2 + y^2 = r^2$, where r is the radius. A circle can be used to represent many
physical objects, such as a wheel or a gear. Even though it is not possible to manufacture
a wheel that is perfectly round, the equation and the properties of a circle can be used
to study many aspects of the wheel, such as area, velocity, and acceleration. In a similar
manner, the theoretical curve, called a normal distribution curve, can be used to study
many variables that are not perfectly normally distributed but are nevertheless
approximately normal.

If a random variable has a probability distribution whose graph is continuous,
bell-shaped, and symmetric, it is called a normal distribution. The graph is called a normal
distribution curve. The distribution is also known as a bell curve or a Gaussian distribution
curve, named after Carl Friedrich Gauss, the German mathematician who derived its
equation.

It is important to note that no variable fits a normal distribution perfectly, since a normal
distribution is a theoretical distribution. However, a normal distribution can be used to
describe many variables, because the deviations from a normal distribution are very small.

Examples:
1. Suppose a researcher selects a random sample of 100 adult women, measures their
heights, and constructs a histogram. The researcher gets a graph similar to Figure 1(a)
in the beginning activity. Now, if the researcher increases the sample size and decreases
the width of the classes, the histogram will look like the ones in Figure 1(b) and (c).
Finally, if it were possible to measure exactly the heights of all adult Filipino women
and plot them, the histogram would approach what is called a normal distribution
curve, as shown in Figure 1(d).

2. Consider the rate at which a bag of popcorn pops inside a microwave. For the first
few minutes, nothing happens, and then after a while, a few kernels start popping.
This increases to the point at which you hear most of the kernels popping, and then
gradually decreases again until just a kernel or two pops. This simple scenario
depicts a normal distribution.

3. When you measure the heights, shoe sizes or the width of the hands of the students
in a class, in most cases, you will probably find that there are a couple of students
with very low measurements, and a couple with very high measurements, with the
majority of the students centered on a particular value. These values typically follow
a normal distribution.

The normal distribution is an extremely important concept in Statistics. It occurs often in
the data that we collect from the natural world. It is also a critical component of many of
the theoretical ideas that form the foundation of Statistics.

The Properties of the Theoretical Normal Distribution

1. A normal distribution curve is bell-shaped.

2. The mean, median, and mode are equal and are located at the center of the
distribution.

Figure 2

3. A normal distribution curve is unimodal (i.e., it has only one mode).

4. The curve is symmetric about the mean, which is equivalent to saying that its shape is
the same on both sides of a vertical line passing through the center.

Figure 3

5. The curve is continuous; that is, there are no gaps or holes. For each value of X, there is
a corresponding value of Y.

6. The curve never touches the x-axis. Theoretically, no matter how far in either direction
the curve extends, it never meets the x-axis, but it gets increasingly close.

7. The total area under a normal distribution curve is equal to 1.00, or 100%. This fact may
seem unusual, since the curve never touches the x axis, but one can prove it
mathematically by using calculus. (The proof is beyond the scope of this text.)

8. The area under the part of a normal curve that lies within 1 standard deviation of the
mean is approximately 0.68, or 68%; within 2 standard deviations, about 0.95, or 95%; and
within 3 standard deviations, about 0.997, or 99.7%. See Figure 4 below which shows the
area in each region.

Figure 4
The values given in property no. 8 follow the Empirical rule for data. One must know
the above properties to solve problems involving distributions that are approximately
normal.
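
The areas quoted in property no. 8 can be checked numerically. The following is a minimal sketch, assuming the standard normal cumulative area is computed from the error function in Python's `math` module; the helper names `standard_normal_cdf` and `area_within` are illustrative and not part of the module.

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_within(k: float) -> float:
    """Area within k standard deviations of the mean (between -k and +k)."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)

# Empirical rule: about 68%, 95%, and 99.7%
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas round to 0.68, 0.95, and 0.997, which agrees with the Empirical rule.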

Graphs of distributions can have many shapes. When the data values are evenly
distributed about the mean, a distribution is said to be a symmetric distribution. (A normal
distribution is symmetric, as in Figure 4.) When the majority of the data values fall to the left
or right of the mean, the distribution is said to be skewed. When the majority of the data
values fall to the right of the mean, the distribution is said to be a negatively or left-skewed
distribution (Figure 5). The mean is to the left of the median, and the mean and the median
are to the left of the mode. When the majority of the data values fall to the left of the
mean, a distribution is said to be a positively or right-skewed distribution (Figure 6). The
mean falls to the right of the median, and both the mean and the median fall to the right of
the mode. The “tail” of the curve indicates the direction of skewness (right is positive, left is
negative).

Figure 5 Figure 6

Activity 2

Answer the following questions:

1. What is the mean of the distribution? _____


2. What is the median of the distribution? ______
3. What is the mode of the distribution? ______
4. What is the standard deviation of the
distribution? ______

Question: What if we want to find the area of the shaded region in the previous activity?

Since each normally distributed variable has its own mean and standard deviation,
as stated earlier, the shape and location of these curves will vary. In practical
applications, then, you would have to have a table of areas under the curve for each
variable. To simplify this situation, statisticians use what is called the standard normal
distribution.

The Standard Normal Distribution

The standard normal distribution is a normal distribution with a mean of 0 and a
standard deviation of 1.

The values under the curve indicate the proportion of area in each section. For
example, the area between the mean and 1 standard deviation above or below the
mean is about 0.3413, or 34.13%.

All normally distributed variables can be transformed into the standard normally
distributed variable by using the formula for the standard score:

$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$ or $z = \frac{X - \mu}{\sigma}$

The area under a normal distribution curve is used to solve practical application
problems, such as finding the lifespan of certain calculator batteries, or mean mass of a
bag of chips. Once the X values are transformed using the above formulas, then they are
called z-values. The z-value or the z-score is actually the number of standard deviations
that a particular X value is away from the mean. The z-table found in the appendix of this
module gives the area (to four decimal places) under the standard normal curve for any
z-value from -3.49 to 3.49.
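
The conversion between X values and z-values can be sketched in a few lines of Python. This is only an illustration: the function names `z_score` and `x_value` and the numbers used (a mean of 50 and a standard deviation of 10) are hypothetical and do not come from the module.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Convert a normal value X to a standard score: z = (X - mean) / std_dev."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

def x_value(z: float, mean: float, std_dev: float) -> float:
    """Convert a standard score back to the original scale: X = z * std_dev + mean."""
    return z * std_dev + mean

# Hypothetical variable with mean 50 and standard deviation 10
print(z_score(65, 50, 10))    # 65 is 1.5 standard deviations above the mean
print(x_value(-2.0, 50, 10))  # the value 2 standard deviations below the mean
```

The second function is the same formula solved for X, which is the "vice versa" direction of the conversion.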

Finding the Area under a Standard Normal Curve

When we want to find the area under a curve, it is useful to draw the normal
curve and shade the corresponding area. After doing this, we may refer to the z-table for
the z-scores corresponding to the areas under the curve.

There are three cases that can happen:

1. We look for the area to the left of any z-value

To find this area, look up the z-value in the z-table and use the area given.

2. We look for the area to the right of any z-value

To find this area, look up the z-value in the z-table and subtract the area from 1.

3. We look for the area between two z-values

To find this area, look up both z-values in the z-table and subtract the
corresponding areas.

The z-table gives the area under the normal distribution to the left of any z-value
which is given in two decimal places. For example, the area to the left of a z-value of 1.39
is found by looking up 1.3 in the left column and 0.09 in the top row. Where the row and
column lines meet gives an area of 0.9177. (See figure below).

Examples:
1. Find the area under the standard normal distribution curve to the left of 𝑧 = 2.09.
(First case)
Solution:

Draw the figure with the desired area, and look up the area in the z-table. It is 0.9817.
Hence, 98.17% of the area is to the left of 𝑧 = 2.09.

2. Find the area under the standard normal distribution curve to the right of 𝑧 =
−1.14 (Second case)
Solution:
Draw the figure with the desired area, and look up the area for 𝑧 = −1.14 in the z-table.
It is 0.1271. Subtract it from 1.0000: 1.0000 − 0.1271 = 0.8729. Hence, 87.29% of the area
under the standard normal distribution curve is to the right of 𝑧 = −1.14.

3. Find the area under the standard normal distribution curve between 𝑧 = 1.62
and 𝑧 = −1.35. (Third Case)
Solution:
Draw the figure with the desired area. Since the area desired is between two given
z-values, we look up the areas corresponding to the two z-values and subtract the smaller
area from the larger area. (Do not subtract the z-values.) The area for 𝑧 = 1.62 is 0.9474,
and the area for 𝑧 = −1.35 is 0.0885. The area between the two z-values is
0.9474 − 0.0885 = 0.8589, or 85.89%.
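
The three cases can also be checked with software instead of the z-table. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper names `area_left`, `area_right`, and `area_between` are illustrative only.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)

def area_left(z: float) -> float:
    """Case 1: area to the left of z (what the z-table lists)."""
    return STANDARD_NORMAL.cdf(z)

def area_right(z: float) -> float:
    """Case 2: area to the right of z, which is 1 minus the table area."""
    return 1.0 - STANDARD_NORMAL.cdf(z)

def area_between(z1: float, z2: float) -> float:
    """Case 3: area between two z-values (larger area minus smaller area)."""
    low, high = sorted((z1, z2))
    return STANDARD_NORMAL.cdf(high) - STANDARD_NORMAL.cdf(low)

print(round(area_left(2.09), 4))            # Example 1
print(round(area_right(-1.14), 4))          # Example 2
print(round(area_between(1.62, -1.35), 4))  # Example 3
```

Rounded to four decimal places, these results agree with the table values 0.9817, 0.8729, and 0.8589 obtained in the three examples.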

The Normal Curve as a Probability Distribution Curve

A normal distribution curve can be used as a probability distribution curve for
normally distributed variables. Recall that a normal distribution is a continuous distribution,
as opposed to a discrete probability distribution. The fact that it is continuous means that
there are no gaps in the curve. In other words, for every z value on the x axis, there is a
corresponding height, or frequency, value.

The area under the standard normal distribution curve can also be thought of as a
probability or as the proportion of the population with a given characteristic. That is, if it
were possible to select a z-value at random, the probability of choosing one, say,
between 0 and 2.00 would be the same as the area under the curve between 0 and 2.00. In
this case, the area is 0.4772. Therefore, the probability of randomly selecting a z-value
between 0 and 2.00 is 0.4772.

The problems involving probability are solved in the same manner as the previous
examples involving areas in this section. For example, if the problem is to find the
probability of selecting a z-value between 2.25 and 2.94, solve it by using the method
shown in the third case. For probabilities, a special notation is used to denote the
probability of a standard normal variable z. For example, if the problem is to find the
probability of any z value between 0 and 2.32, this probability is written as 𝑃(0 < 𝑧 < 2.32).

Note: In a continuous distribution, the probability of any exact z-value is 0 since the area
would be represented by a vertical line above the value. But vertical lines in theory have
no area. So 𝑃(𝑎 ≤ 𝑧 ≤ 𝑏) = 𝑃(𝑎 < 𝑧 < 𝑏).

Examples:

1. Find the probability for each. (Assume this is a standard normal distribution.)
a. 𝑃(0 < 𝑧 < 2.53) b. 𝑃(𝑧 < 1.73) c. 𝑃(𝑧 > 1.98)
Solution:
a. 𝑃(0 < 𝑧 < 2.53) is used to find the area under the standard normal distribution curve
between 𝑧 = 0 and 𝑧 = 2.53. First, draw the curve and shade the desired area. Second,
find the area in the z-table for 𝑧 = 2.53. It is 0.9943. Third, find the area in the z-table
for 𝑧 = 0. It is 0.5000. Finally, subtract the two areas:
0.9943 − 0.5000 = 0.4943.
Hence, the probability is 0.4943, or 49.43%.

b. 𝑃(𝑧 < 1.73) is used to find the area under the
standard normal distribution curve to the left of 𝑧 = 1.73.
First, draw the curve and shade the desired area.
Second, find the area in the z-table corresponding to 1.73.
It is 0.9582. Hence, the probability of obtaining a z-value
less than 1.73 is 0.9582, or 95.82%

c. 𝑃(𝑧 > 1.98) is used to find the area under the standard
normal distribution curve to the right of 𝑧 = 1.98.
First, draw the curve and shade the desired area. Second,
find the area in the z-table. It is 0.9761. Finally, subtract
this area from 1.0000. It is 1.0000 - 0.9761 = 0.0239.
Hence, the probability of obtaining a z-value
greater than 1.98 is 0.0239, or 2.39%

Applications of Normal Distribution

1. An adult has on average 5.2 liters of blood. Assume the variable is normally distributed
and has a standard deviation of 0.3. Find the percentage of people who have less than
5.4 liters of blood in their system.
Solution:

First, draw a normal curve and shade the desired area. Second, find the z-value
corresponding to 5.4:

$z = \frac{X - \mu}{\sigma} = \frac{5.4 - 5.2}{0.3} = \frac{0.2}{0.3} \approx 0.67$

Hence, 5.4 is 0.67 of a standard deviation above the mean. Finally, find the
corresponding area in the z-table. The area under the standard normal curve to the left
of 𝑧 = 0.67 is 0.7486. Therefore, about 74.86% of adults have less than 5.4 liters of blood
in their system.

2. A desktop PC uses 120 watts of electricity per hour based on 4 hours of use per day.
Assume the variable is approximately normally distributed and the standard deviation is 6.
If 500 PCs are selected, approximately how many will use less than 106 watts of power?
Solution:

First, draw a normal curve and shade the desired area. Second, find the z-value
corresponding to 106:

$z = \frac{X - \mu}{\sigma} = \frac{106 - 120}{6} \approx -2.33$

Finally, find the area in the z-table corresponding to 𝑧 = −2.33. It is 0.0099. To find the
number of PCs that use less than 106 watts, multiply 500 × 0.0099 = 4.95, or about 5 PCs.
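
Applications 1 and 2 follow the same two steps: convert X to a z-value, then read the area to the left. A minimal Python sketch of that procedure is shown below, assuming `statistics.NormalDist` stands in for the z-table and the z-value is rounded to two decimal places as the table requires. The function name `area_below` is an illustrative choice.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)

def area_below(x: float, mean: float, std_dev: float):
    """Return (z, area to the left of z), rounding z to two decimals like a z-table."""
    z = round((x - mean) / std_dev, 2)
    return z, STANDARD_NORMAL.cdf(z)

# Application 1: blood volume, mean 5.2 liters, standard deviation 0.3
z_blood, p_blood = area_below(5.4, 5.2, 0.3)
print(z_blood, round(p_blood, 4))

# Application 2: PC power use, mean 120 watts, standard deviation 6, 500 PCs
z_pc, p_pc = area_below(106, 120, 6)
expected_pcs = 500 * round(p_pc, 4)
print(z_pc, round(p_pc, 4), expected_pcs)
```

Because the z-value is rounded before the area is computed, the results match the table-based answers (0.7486 and 0.0099). Skipping the rounding step would give slightly different areas.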

3. For a medical study, a researcher wishes to select people in the middle 60% of the
population based on blood pressure. Assuming that blood pressure readings are normally
distributed and the mean systolic blood pressure is 120 and the standard deviation is 8, find
the upper and lower readings that would qualify people to participate in the study.
Solution:

First, draw a normal curve and shade the desired area. The cut-off points are shown in
the accompanying figure. Two values are needed, one above the mean and one below the
mean. Second, find the z-values. The middle 60% places 30% (0.3000) of the area on each
side of the mean, so the area to the left of the positive z-value is
0.5000 + 0.3000 = 0.8000. The z-value with area to the left closest to 0.8000 is 0.84.

Finally, calculate the X-values. Solving $z = \frac{X - \mu}{\sigma}$ for X gives $X = z\sigma + \mu$.

$X_1 = z\sigma + \mu = (0.84)(8) + 120 = 126.72$

The area to the left of the negative z-value is 20%, or 0.2000. The z-value with area to the
left closest to 0.2000 is −0.84.

$X_2 = z\sigma + \mu = (-0.84)(8) + 120 = 113.28$

Therefore, the middle 60% will have blood pressure readings of 113.28 < 𝑋 < 126.72.
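
Finding cut-off values for a middle percentage is the reverse of the earlier problems: the area is known and the z-value is wanted. The sketch below assumes the inverse lookup is done with `NormalDist.inv_cdf` from Python's `statistics` module rather than by scanning the z-table. The function name `middle_interval` is illustrative.

```python
from statistics import NormalDist

def middle_interval(fraction: float, mean: float, std_dev: float):
    """Return (lower, upper) X-values that enclose the middle `fraction` of the area."""
    tail = (1.0 - fraction) / 2.0             # area left over in each tail
    z = NormalDist().inv_cdf(1.0 - tail)      # positive z with area 1 - tail to its left
    return mean - z * std_dev, mean + z * std_dev  # X = z * sigma + mu for -z and +z

lower, upper = middle_interval(0.60, 120, 8)
print(round(lower, 2), round(upper, 2))
```

The software uses an unrounded z-value (about 0.8416 instead of 0.84), so its cut-offs differ from the table-based 113.28 and 126.72 by about 0.01.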

Post Assessment

Study the following problems and use the properties of the normal distribution to answer
them.

1. On a nationwide Math test, the mean score was 65 and the standard deviation was
10. If you scored 81, what is your z-score?
2. Find the area under the standard normal distribution curve:
a. Between 𝑧 = 0 and 𝑧 = 0.98
b. To the right of 𝑧 = 0.29
c. To the left of 𝑧 = −1.39
3. Find the following probabilities:
a. 𝑃(0 < 𝑧 < 0.92)
b. 𝑃(𝑧 < −1.77)
c. 𝑃(𝑧 > 2.51)
4. The average early-bird special admission price for a movie is P180.00. If the
distribution of movie admission charges is approximately normal with a standard
deviation of P20.00, what is the probability that a randomly selected admission
charge is less than P130.00?
5. Entry to a certain university in Metro Manila is determined by a national test. Suppose
the scores on this test are normally distributed with a mean of 500 and a standard
deviation of 100. Pedro wants to be admitted to this university and he knows he
must score better than at least 70% of the students who took the test. Pedro takes
the test and scores 585. Will he be admitted to this university? Justify your answer.

Lesson 2: Sampling and Sampling Distribution

About the Lesson


Some research aims to study, describe, and infer patterns of behavior, properties,
and characteristics of a population. Sometimes researchers intend to study a very large
population, but studying every member is often infeasible, impractical, or inconvenient.
That is why we must select a representative sample from the population.

In this lesson, we discuss sampling techniques that help researchers select samples that
support valid inferences about the population from which the samples came.

Objective/Learning Competency
In this lesson you are expected to:
1. illustrate random sampling. (M11/12SP-IIId-2)
2. distinguish between parameter and statistic. (M11/12SP-IIId-3)
3. identify sampling distributions of statistics (sample mean). (M11/12SP-IIId-4)
4. find the mean and variance of the sampling distribution of the sample mean.
(M11/12SP-IIId-5)

Lesson Proper

I. Activity 1

Determine whether each statement is true or false. Write T if the statement is true and F
otherwise, on the space provided before the number.
_______1. Parameter is a measure that describes a population.
_______2. Statistic is a measure that describes a sample.
_______3. An example of a parameter is 𝑥̅ .
_______4. An example of a statistic is 𝜇.
_______5. The given value in “50% of the Philippine senators agreed to support a specific
bill” is a statistic.

II. Questions to Ponder


1. What is sampling?
2. How many subjects will I need to complete a viable study, and how will I
select them?
3. What is the difference between a statistic and a parameter?

III. Example and Discussion

RANDOM SAMPLING
What is sampling?

Sampling is the process of selecting units (e.g., people, organizations) from a
population of interest so that by studying the sample, we may fairly generalize our results
back to the population from which they were chosen.
Example 1: We want to test the effects of a new teaching strategy to high school
students. It is impossible to select the entire population of high school students. Hence, we
need to take a statistical sample from which to perform the experiment, gather data, and
generalize or extrapolate back to the population as a whole.

Advantages of Sampling
• It involves a smaller number of subjects, which reduces investment in time and
money.
• Sampling can actually be more accurate than studying an entire population,
because it affords researchers a lot more control over the subjects. Large studies
can bury interesting correlations amongst the ‘noise’.
• Statistical manipulations are much easier with smaller data sets, and it is easier to
avoid human error when inputting and analyzing the data.

Disadvantages of Sampling
• There is room for potential bias in the selection of suitable subjects for the research.
This may be because the researcher selects subjects that are more likely to give the
desired results, or that the subjects tend to select themselves.
For example, if an opinion poll company canvasses opinion by phoning people
between 9 am and 5 pm, it is going to miss people who are out working, which can
invalidate its results. These are called determining factors, and they also include poor
experiment design, confounding variables, and human error.
• Sampling requires a knowledge of statistics, and the entire design of the experiment
depends upon the exact sampling method required.

Types of Sampling Methods

1. Simple Random Sampling


Each element of the population has an equal chance of being selected. There are
no rules that dictate where and how you will start the selection process, as long as you do
not intentionally look for a specific number. In this method, the samples can be selected
through:
a. Lottery Method

Every member is assigned a unique number. These numbers are put in a jar and
thoroughly mixed. After that, the researcher picks some numbers without looking at them,
and the people with those numbers are included in the study.

b. Use of Table of Random Numbers

This table consists of a series of digits (0-9) that are generated randomly. The
numbers are arranged in rows and columns and can be read in any direction. All the digits
are equally probable.
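
The lottery method can be imitated on a computer with a pseudorandom number generator. The sketch below is one possible illustration using Python's `random` module. The function name `simple_random_sample`, the population of 50 numbered members, and the seed value are all hypothetical choices made only for this example.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw `sample_size` distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: members numbered 1 to 50, as in the lottery method
members = list(range(1, 51))
chosen = simple_random_sample(members, 10, seed=2024)
print(sorted(chosen))
```

`random.sample` draws without replacement, so no member can be picked twice. This matches drawing numbered slips from a jar.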

To determine the sample size needed for a given population size, different formulas
can be used, one of which is Slovin’s Formula.

Slovin’s Formula: $n = \frac{N}{1 + Ne^2}$

where: 𝑛 = sample size
𝑒 = margin of error
𝑁 = population size

Example: 𝑛 = ?; 𝑒 = 0.05; 𝑁 = 1000

$n = \frac{N}{1 + Ne^2} = \frac{1000}{1 + 1000(0.05)^2} = \frac{1000}{1 + 1000(0.0025)} = \frac{1000}{3.5} \approx 286$ (sample size)
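
As a quick check of the arithmetic, Slovin's formula in its standard form, $n = \frac{N}{1 + Ne^2}$, can be written as a short Python function. This is a sketch only: the function name `slovin_sample_size` is illustrative, and rounding the result up to the next whole subject is an assumption made here.

```python
import math

def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Slovin's formula n = N / (1 + N * e**2), rounded up to a whole subject."""
    if population_size <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("need N > 0 and 0 < e < 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

print(slovin_sample_size(1000, 0.05))  # 1000 / 3.5 = 285.71..., rounded up to 286
```

Without the "1 +" in the denominator, the result would be $1/e^2$ regardless of the population size, so that term is what ties the sample size to $N$.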

2. Systematic Random Sampling

This can be done by listing all the elements in the population and selecting every
kth element in the list. It is as precise as simple random sampling and is often used on
long population lists. To determine the interval to be used in identifying the samples
who will participate in the study, use the formula $k = \frac{N}{n}$ (population size divided by
sample size).

Example:

If the population size is N = 2000 and the sample size is n = 500, then
$k = \frac{N}{n} = \frac{2000}{500} = 4$. Use a table of random numbers to determine the starting point for
selecting every 4th subject. With the list of the 2000 subjects in the sampling frame, go to
the starting point and select every 4th name on the list until the sample size is reached.
You will probably have to return to the beginning of the list to complete the selection of
the sample.
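
The selection of every kth element can be sketched as follows. The function name `systematic_sample` is illustrative. The sketch also assumes the random starting point is drawn from the first k positions of the list, so that the selection finishes without wrapping around to the beginning.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every k-th element, where k = N // n, from a random start in the first k."""
    k = len(population) // sample_size   # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)             # random index among 0, 1, ..., k - 1
    return [population[start + i * k] for i in range(sample_size)]

# Population of 2000 subjects and a sample of 500, as in the example (k = 4)
subjects = list(range(1, 2001))
sample = systematic_sample(subjects, 500, seed=7)
print(sample[:5], len(sample))
```

If the starting point may be anywhere in the list, as the example above allows, the index would need to wrap around with a modulo operation.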

3. Stratified Random Sampling

This can be done by first dividing the elements in the population into strata and then
randomly selecting samples from each stratum, ensuring that each stratum is represented
in the sample in proportion to its share of the total population. Sampling fraction: n/N
(desired sample size divided by the population size).

Example:

Assume you have a population of 1000 students with 500 from grade school, 300
from high school, and 200 from senior high school. Determine how many samples you
need, using Slovin’s Formula or any other formula for computing the sample size. In this
example, suppose a total sample size of 400 is required. To get the number of samples
from each stratum, divide 400 by 1000; the answer is 0.4. Multiply 0.4 by the number of
students in each stratum (e.g., 0.4 × 500 grade school students = 200).
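
The proportional allocation in this example can be computed with a small helper. This is a minimal sketch using the stratum sizes and the total sample size of 400 from the example. The function name `proportional_allocation` is illustrative.

```python
def proportional_allocation(strata_sizes: dict, sample_size: int) -> dict:
    """Allocate the sample to strata using the sampling fraction n / N."""
    population_size = sum(strata_sizes.values())
    fraction = sample_size / population_size  # 400 / 1000 = 0.4 in the example
    return {name: round(size * fraction) for name, size in strata_sizes.items()}

strata = {"grade school": 500, "high school": 300, "senior high school": 200}
allocation = proportional_allocation(strata, 400)
print(allocation)
```

Rounding each stratum separately can make the allocations add up to slightly more or less than the target for other inputs, so the total should be checked before drawing the samples.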

4. Clustered Sampling

Clustered sampling is a multistage sampling method adopted when it is either impossible
or impractical to compile an exhaustive list of the elements found in the target population.
The whole population is subdivided into clusters, or groups, and random samples are then
collected from each selected group.

Example:

A researcher wants to conduct a survey about the academic performance of high school
students in the municipality of Alubijid. He can divide the entire population into different
clusters (barangays). Then, the researcher selects a number of barangays, depending on
his research, through simple or systematic random sampling. The researcher could draw
random samples from the selected barangays through simple random sampling or take
all the students in them.
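
The two stages described in this example can be sketched in Python. Everything named in the sketch is hypothetical: the function `cluster_sample`, the barangay labels, and the student identifiers are invented only to show the procedure.

```python
import random

def cluster_sample(clusters: dict, n_clusters: int, per_cluster=None, seed=None) -> dict:
    """Stage 1: randomly pick clusters. Stage 2: take all members or a random subset."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None or per_cluster >= len(members):
            result[name] = list(members)                      # take them all
        else:
            result[name] = rng.sample(members, per_cluster)   # simple random sample
    return result

# Hypothetical clusters: four barangays with ten students each
barangays = {f"Barangay {c}": [f"{c}{i}" for i in range(1, 11)] for c in "ABCD"}
picked = cluster_sample(barangays, n_clusters=2, per_cluster=3, seed=1)
print(picked)
```

Leaving `per_cluster` as `None` corresponds to taking every student in the selected barangays.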

Activity 2:

Identify the type of sampling method used by the researcher in each situation.

1. A researcher chose the participants of his study by selecting every 8th member of
the population.

2. A researcher interviewed all the teachers in each of 15 randomly selected private
schools in Cagayan de Oro City.

3. A researcher interviewed people from each barangay in the municipality of Alubijid
for his research on population.

4. A researcher is doing research on the students’ reaction to the newly
implemented curriculum in mathematics and interviewed every 5th student
entering the gate of the school.

5. A researcher randomly selected 15 barangays in a town for her study. She did this
by writing the name of each barangay on a piece of paper, which she folded and
put in a bowl, and then drawing 15 pieces of paper from the bowl.

PARAMETER AND STATISTIC

Parameters are important components of any statistical analysis. In simple words, a
parameter is a numerical quantity that describes a population. This means
that it is a measure that characterizes or tells something about the whole population.

What is the difference between a statistic and a parameter?

A statistic, on the other hand, is a numerical quantity that describes a sample.

Examples of parameters are the population mean $\mu$, the population variance $\sigma^2$, and the
population standard deviation $\sigma$. Examples of statistics are the sample mean $\bar{x}$, the
sample variance $s^2$, and the sample standard deviation $s$.

Example:

A researcher wants to estimate the average monthly allowance of the Grade 11
female students. From a random sample of 50 female students, the researcher obtains a
sample mean monthly allowance of 75 pesos.

Parameter: The average monthly allowance of all Grade 11 female students.

Statistic: The average monthly allowance of 75 pesos from a sample of 50 Grade 11
female students.

Activity 3:

Read each statistical study below. Identify the parameter and the statistic for each
of the following:

Example:
A researcher wants to estimate the average monthly allowance of the
Grade 11 female students. From a random sample of 50 female students, the
researcher obtains a sample mean monthly allowance of 75 pesos.

Parameter: The average monthly allowance of all Grade 11 female students.


Statistic: The average monthly allowance of 75 pesos from a sample of 50 Grade 11
female students.

1. A teacher wants to determine the average score in the first periodic examination
in General Mathematics of his 5 classes. From a random sample of 120 students, the
teacher obtains an average score of 84.
Parameter: _________________________________________________________________________
Statistic: _________________________________________________________________________

2. A teacher wants to know the average hours spent on social media of his advisory
class. He randomly selected and asked 35 students and found out that they spend an
average of 3 hours per day on social media.
Parameter: _________________________________________________________________________
Statistic: _________________________________________________________________________

Sampling Distribution of Sample Means

A statistic (such as the sample mean or sample standard deviation) is computed
from a sample. Since a sample is random, every statistic is a random variable. It varies
from sample to sample in a way that cannot be predicted with certainty. Since a sample
mean is a random variable, it has a mean, a standard deviation, and a probability
distribution. The probability distribution of a statistic is called a sampling distribution.

A sampling distribution of sample means is a probability distribution that describes
the probability for each mean of all samples with the same sample size 𝑛.

Finding the Mean and Variance of the Sampling Distribution of Sample Means

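The following formulas are needed to compute the mean, variance, and standard
deviation of a population and of the sampling distribution of sample means. Here $N$ is the
population size; in the formulas for the sampling distribution, $n$ is the number of possible
samples.

| | Population | Sampling Distribution of Sample Means |
|---|---|---|
| Mean | $\mu = \frac{\sum x}{N}$ | $\mu_{\bar{x}} = \frac{\sum \bar{x}}{n}$ |
| Variance | $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ | $\sigma^2_{\bar{x}} = \frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n}$ |
| Standard Deviation | $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ | $\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n}}$ |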

Sampling with Replacement

Example 1: Consider a population consisting of the values 2, 5, and 8.
a. Compute the population mean.
b. Compute the population variance.
c. Find the population standard deviation.
d. List all the possible samples of size n = 2 with replacement and their corresponding
means.
e. Find the mean of the sampling distribution of sample means.
f. Find the variance of the sampling distribution of sample means.
g. Find the standard deviation of the sampling distribution of sample means.

Solution:
a. Compute the population mean.
$$\mu = \frac{\sum x}{N} = \frac{2 + 5 + 8}{3} = \frac{15}{3} = 5$$
Hence, the population mean is 5.

b. Compute the population variance.

$x$     $x - \mu$     $(x - \mu)^2$
2       -3            9
5       0             0
8       3             9
                      $\sum (x - \mu)^2 = 18$

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{18}{3} = 6$$
Hence, the population variance is 6.

c. Find the population standard deviation.

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{6} \approx 2.45$$
Hence, the population standard deviation is 2.45.
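
Parts a to c can be checked with Python's standard `statistics` module. This is only a verification sketch, not part of the module's method; the variable names are chosen here for illustration. Note that `pvariance` and `pstdev` divide by N, which matches the population formulae used above.

```python
from statistics import mean, pvariance, pstdev

population = [2, 5, 8]  # the values used in the solution above

mu = mean(population)            # population mean
sigma2 = pvariance(population)   # population variance (divides by N)
sigma = pstdev(population)       # population standard deviation

print(mu, sigma2, round(sigma, 2))
```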

d. List all the possible samples of size n = 2 with replacement and their corresponding
means.

Observation   Sample   $\bar{x}$
1             2, 2     2
2             2, 5     3.5
3             2, 8     5
4             5, 2     3.5
5             5, 5     5
6             5, 8     6.5
7             8, 2     5
8             8, 5     6.5
9             8, 8     8

e. Find the mean of the sampling distribution of sample means.

Observation   Sample   $\bar{x}$
1             2, 2     2
2             2, 5     3.5
3             2, 8     5
4             5, 2     3.5
5             5, 5     5
6             5, 8     6.5
7             8, 2     5
8             8, 5     6.5
9             8, 8     8
                       $\sum \bar{x} = 45$

$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n} = \frac{45}{9} = 5$$
Hence, the mean of the sampling distribution of sample means is 5.

f. Find the variance of the sampling distribution of sample means.

Observation   Sample   $\bar{x}$   $\bar{x} - \mu_{\bar{x}}$   $(\bar{x} - \mu_{\bar{x}})^2$
1             2, 2     2           -3                          9
2             2, 5     3.5         -1.5                        2.25
3             2, 8     5           0                           0
4             5, 2     3.5         -1.5                        2.25
5             5, 5     5           0                           0
6             5, 8     6.5         1.5                         2.25
7             8, 2     5           0                           0
8             8, 5     6.5         1.5                         2.25
9             8, 8     8           3                           9
                       $\sum \bar{x} = 45$                     $\sum (\bar{x} - \mu_{\bar{x}})^2 = 27$

$$\sigma^2_{\bar{x}} = \frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n} = \frac{27}{9} = 3$$
Hence, the variance of the sampling distribution of sample means is 3.

g. Find the standard deviation of the sampling distribution of sample means.

$$\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n}} = \sqrt{3} \approx 1.73$$
Hence, the standard deviation of the sampling distribution of sample means is 1.73.
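
The enumeration in parts d to g can be reproduced with a short Python sketch. It assumes, as the table does, that order matters, so (2, 5) and (5, 2) count as separate samples; `itertools.product` is used here as one convenient way to list them, and the variable names are illustrative.

```python
from itertools import product
from statistics import mean, pvariance, pstdev

population = [2, 5, 8]
n = 2  # size of each sample

# Every ordered sample of size n drawn with replacement: 3**2 = 9 samples
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

mu_xbar = mean(sample_means)        # mean of the sample means
var_xbar = pvariance(sample_means)  # variance (divides by the number of samples)
sd_xbar = pstdev(sample_means)      # standard deviation of the sample means

for s, m in zip(samples, sample_means):
    print(s, m)
print(mu_xbar, var_xbar, round(sd_xbar, 2))
```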

Sampling without Replacement

Example 2: Consider a population consisting of the values 1, 3, and 5.

a. Compute the population mean.
b. Compute the population variance.
c. Find the population standard deviation.
d. List all the possible samples of size n = 2 without replacement and their
corresponding means.
e. Find the mean of the sampling distribution of sample means.
f. Find the variance of the sampling distribution of sample means.
g. Find the standard deviation of the sampling distribution of sample means.

Solution:
a. Compute the population mean.
$$\mu = \frac{\sum x}{N} = \frac{1 + 3 + 5}{3} = \frac{9}{3} = 3$$
Hence, the population mean is 3.

b. Compute the population variance.

$x$     $x - \mu$     $(x - \mu)^2$
1       -2            4
3       0             0
5       2             4
                      $\sum (x - \mu)^2 = 8$

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{8}{3} \approx 2.67$$
Hence, the population variance is 2.67.

c. Find the population standard deviation.

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{2.67} \approx 1.63$$
Hence, the population standard deviation is 1.63.

d. List all the possible samples of size n = 2 without replacement and their corresponding
means.

Observation   Sample   $\bar{x}$
1             1, 3     2
2             1, 5     3
3             3, 1     2
4             3, 5     4
5             5, 1     3
6             5, 3     4

e. Find the mean of the sampling distribution of sample means.

Observation   Sample   $\bar{x}$
1             1, 3     2
2             1, 5     3
3             3, 1     2
4             3, 5     4
5             5, 1     3
6             5, 3     4
                       $\sum \bar{x} = 18$

$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n} = \frac{18}{6} = 3$$
Hence, the mean of the sampling distribution of sample means is 3.

f. Find the variance of the sampling distribution of sample means.

Observation   Sample   $\bar{x}$   $\bar{x} - \mu_{\bar{x}}$   $(\bar{x} - \mu_{\bar{x}})^2$
1             1, 3     2           -1                          1
2             1, 5     3           0                           0
3             3, 1     2           -1                          1
4             3, 5     4           1                           1
5             5, 1     3           0                           0
6             5, 3     4           1                           1
                       $\sum \bar{x} = 18$                     $\sum (\bar{x} - \mu_{\bar{x}})^2 = 4$

$$\sigma^2_{\bar{x}} = \frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n} = \frac{4}{6} \approx 0.67$$
Hence, the variance of the sampling distribution of sample means is 0.67.

g. Find the standard deviation of the sampling distribution of sample means.

$$\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n}} = \sqrt{0.67} \approx 0.82$$
Hence, the standard deviation of the sampling distribution of sample means is 0.82.
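
The same check works for sampling without replacement. In the sketch below, `itertools.permutations` lists ordered samples with no repeated element, which matches the six rows of the table (1, 3 and 3, 1 are counted separately). The variable names are illustrative assumptions.

```python
from itertools import permutations
from statistics import mean, pvariance, pstdev

population = [1, 3, 5]
n = 2  # size of each sample

# Ordered samples of size n with no element reused: 3 * 2 = 6 samples
samples = list(permutations(population, n))
sample_means = [sum(s) / n for s in samples]

mu_xbar = mean(sample_means)        # mean of the sample means
var_xbar = pvariance(sample_means)  # variance (divides by the number of samples)
sd_xbar = pstdev(sample_means)      # standard deviation of the sample means

print(samples)
print(mu_xbar, round(var_xbar, 2), round(sd_xbar, 2))
```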

Try to think about the answers to these questions:

1. What do you notice about the population mean and the mean of the sampling
distribution of sample means? How do you compare them?
2. How do you compare the population variance and the variance of the sampling
distribution of sample means?

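Let us summarize the examples above by comparing the means and variances of the
population and of the sampling distribution of the sample means.

                                             Example 1               Example 2
                                             (with replacement)      (without replacement)
Population mean                              5                       3
Mean of the sampling distribution            5                       3
Population variance                          6                       2.67
Variance of the sampling distribution        3                       0.67

In both examples, the mean of the sampling distribution equals the population mean, while
the variance of the sampling distribution is smaller than the population variance.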

We can summarize the properties of the sampling distribution below.

If all possible samples of size 𝑛 are drawn from a population of size N with
mean 𝜇 and variance 𝜎², then the sampling distribution of the sample means has the
following properties.

With Replacement
• The mean of the sampling distribution of means is equal to the mean of the
population.
$$\mu_{\bar{x}} = \mu$$
• The variance of the sampling distribution of means is equal to the population
variance divided by the sample size n. That is,
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$
• The standard deviation of the sampling distribution of means is equal to the
population standard deviation divided by the square root of the sample size n.
That is,
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Without Replacement
• The mean of the sampling distribution of means is equal to the mean of the
population.
$$\mu_{\bar{x}} = \mu$$
• The variance of the sampling distribution of means is equal to the population
variance divided by the sample size n, multiplied by the finite population
correction factor (N − n)/(N − 1). That is,
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \left(\frac{N - n}{N - 1}\right)$$
• The standard deviation of the sampling distribution of means is equal to the
population standard deviation divided by the square root of the sample size n,
multiplied by the square root of the finite population correction factor. That is,
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N - n}{N - 1}}$$
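
The properties above can be turned into a small Python helper and checked against the two worked examples. This is a minimal sketch; the function names `sampling_variance` and `sampling_sd` are made up for illustration, and passing `N=None` is simply the convention chosen here to mean "with replacement."

```python
from math import sqrt

def sampling_variance(sigma2, n, N=None):
    """Variance of the sampling distribution of sample means.

    sigma2: population variance, n: sample size.
    N=None means sampling with replacement; otherwise N is the
    population size and the factor (N - n) / (N - 1) is applied.
    """
    variance = sigma2 / n
    if N is not None:
        variance *= (N - n) / (N - 1)
    return variance

def sampling_sd(sigma2, n, N=None):
    """Standard deviation of the sampling distribution of sample means."""
    return sqrt(sampling_variance(sigma2, n, N))

# Example 1: population variance 6, n = 2, with replacement
print(sampling_variance(6, 2))
# Example 2: population variance 8/3, n = 2, N = 3, without replacement
print(round(sampling_variance(8 / 3, 2, N=3), 2))
```

Both calls agree with the values obtained earlier by listing every sample (3 and 0.67), which is the point of the properties: the enumeration is not needed once the population variance is known.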
Activity 4.

Solve the following problems. Use the properties of the sampling distribution of
sample means.

1. If a population has a mean of 5.7, what is the mean of the sampling
distribution of its means?
2. If a population has a variance of 7.4, what is the variance of the
sampling distribution of its means? The samples have size 2, and all
possible samples are drawn with replacement.
3. If a population has a standard deviation of 3.2, what is the standard
deviation of the sampling distribution of its means? The samples have
size 4, and all possible samples are drawn with replacement.
4. If a population of size 4 has a variance of 6.8, what is the variance
of the sampling distribution of its means? The samples have size 3, and
all possible samples are drawn without replacement.
5. If a population of size 3 has a standard deviation of 2.4, what is the standard
deviation of the sampling distribution of its means? The samples have size 2,
and all possible samples are drawn without replacement.

Summary

The graph of a normally distributed continuous random variable is called the normal distribution curve.
This curve is bell-shaped, unimodal, continuous, asymptotic to the x-axis, and symmetric
about the mean. The mean, median, and mode are equal and are located at the center
of the distribution. The total area under a normal distribution curve is equal to 1.00, or 100%.
The area under the part of a normal curve that lies within 1 standard deviation of the mean
is approximately 0.68, or 68%; within 2 standard deviations, about 0.95, or 95%; and within
3 standard deviations, about 0.997, or 99.7%.
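
The three percentages above can be confirmed numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of a printed z-table; the helper name `area_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

Rounded to the precision used in the text, the areas come out to about 0.68, 0.95, and 0.997.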

To find the area under a normal curve, we use the standard normal
distribution. The standard normal distribution is a normal distribution with a mean of 0 and
a standard deviation of 1. Any normally distributed variable can be transformed into the
standard normal variable by using the formula for the standard score:
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad \text{or} \quad z = \frac{X - \mu}{\sigma}$$
After computing the z-score using the formula above, we refer to a z-table to find the
area of regions under the normal curve.
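
As a rough illustration of this two-step process, the Python sketch below computes a z-score and then looks up areas with `NormalDist` from the standard library instead of a printed z-table. The score of 85, the mean of 70, and the standard deviation of 10 are hypothetical numbers, and `z_score` is a helper name invented here.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Hypothetical data: a score of 85 from a distribution with mean 70 and sd 10
z = z_score(85, 70, 10)

area_left = standard_normal.cdf(z)   # area to the left of z
area_right = 1 - area_left           # area to the right of z
area_from_zero = area_left - 0.5     # area between 0 and z (for z > 0)

print(z, round(area_left, 4), round(area_right, 4), round(area_from_zero, 4))
```

The same lookup reproduces table values such as the area between z = 0 and z = 1.77, which is about 0.4616.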

The area under the standard normal distribution curve can also be thought of as a
probability or as the proportion of the population with a given characteristic. In solving
problems involving probability, we undergo the same process as finding the areas under
the normal curve.

Sampling is the process of selecting units (e.g., people, organizations) from a
population of interest so that by studying the sample, we may fairly generalize our results
back to the population from which they were chosen. Obtaining a sample from the
population is usually more practical since it involves a smaller number of subjects, reduces
time, money, and effort, and tends to produce data that are more accurate and easier to
manipulate. Types of sampling include simple random sampling, systematic sampling,
stratified random sampling, and clustered sampling.

A parameter is a measure that characterizes a population while a statistic is a
measure that characterizes a sample. The probability distribution of a statistic is called a
sampling distribution. A sampling distribution of sample means is a probability distribution
that describes the probability for each mean of all samples with the same sample size 𝑛.

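To find the mean and variance of a sampling distribution of sample means, the
formulae are given below:

                     Population                                       Sampling Distribution of Sample Means
Mean                 $\mu = \frac{\sum x}{N}$                         $\mu_{\bar{x}} = \frac{\sum \bar{x}}{n}$
Variance             $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$          $\sigma^2_{\bar{x}} = \frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n}$
Standard Deviation   $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$     $\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n}}$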

Post Assessment

1. Identify whether the given value is a parameter or a statistic.

a. The researcher found out that the 29 senior high school teachers of a certain
school spend an average of 2 hours preparing their lessons.
b. Based on a sample of 900 elementary students, it was found out that 30% of them
do not know multiplication facts.
c. Based on a sample of 1,200 surveyed students, it was found out that 20% of
them needed financial assistance.
d. A teacher surveyed all 50 students under his advisory class about their
learning styles and found out that most of them are visual learners.
e. The Statistics teacher wants to know the average score of the students in
the final exam. He randomly selected 35 students and obtained an average
score of 43.

2. Consider all samples of size 5 from this population:

2, 5, 7, 9, 10, 11, 12

a. Compute the population mean.
b. Compute the population variance.
c. Compute the population standard deviation.
d. Compute the mean of the sampling distribution of the sample means and
compare it to the mean of the population.
e. Compute the variance of the sampling distribution of the sample means.
f. Compute the standard deviation of the sampling distribution of the
sample means.

Answer Key

Pre-Assessment
1. B     6. A     11. A
2. A     7. D     12. D
3. A     8. D     13. B
4. B     9. B     14. D
5. C     10. C    15. A

Lesson 1
Activity 1 (answers may vary)
Activity 2
1. 150
2. 150
3. 150
4. 30

Lesson 2
Activity 1
1. T
2. T
3. F
4. F
5. T
Activity 2
1. Systematic random sampling
2. Simple random sampling
3. Clustered sampling
4. Systematic random sampling
5. Simple random sampling
Activity 3
1. P: The average score of 5 Gen Math classes in the 1st Periodical Test
   S: The average of 84 from a sample of 120 students
2. P: The average hours spent on social media by the students in the advisory class
   S: The average of 3 hours spent on social media by the 35 students
Activity 4
1. 5.7
2. 3.7
3. 1.6
4. 0.76
5. 1.2
Appendix
The Standard Normal Distribution (z-table)
