SLG 4.2 Measures of Variability
TARGET
HOOK
The previous module discussed the measures of center. Now consider the quiz scores of
two sets of students, data set 1: 2, 2, 3, 3, 3, 4, 4 and data set 2: 1, 1, 3, 3, 3, 5, 5. If you
compute the mean, median, and mode of each data set, you will find that all three are equal
to 3 in both. However, the raw data show that the two data sets are not the same. So how do
we differentiate them?
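The claim that the two data sets share the same center can be checked with a short Python sketch. It uses the standard `statistics` module; the variable and function names are chosen here only for illustration.

```python
from statistics import mean, median, mode

# Quiz scores from the two sets of students
data_set_1 = [2, 2, 3, 3, 3, 4, 4]
data_set_2 = [1, 1, 3, 3, 3, 5, 5]

def center_summary(data):
    """Return the (mean, median, mode) of a list of numbers."""
    return mean(data), median(data), mode(data)

summary_1 = center_summary(data_set_1)
summary_2 = center_summary(data_set_2)

print(summary_1)
print(summary_2)
```

Both summaries come out with a mean, median, and mode of 3, even though the scores themselves differ.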
In this module, we introduce the measures of variability or spread of data.
You may visit the link below to watch a video on the measures of variability.
https://www.khanacademy.org/math/probability/data-distributions-a1/summarizing-spread-distributions/v/range-variance-and-standard-deviation-as-measures-of-dispersion
IGNITE
Aside from the center of a data set, the spread is also an important attribute of a distribution.
This learning module discusses three measures of variability: the range, the variance, and the
standard deviation.
The range states how far apart the extreme values in the data set are. It is the difference
between the highest value and the lowest value in the data set, and is represented by the letter $R$.
$$R = \text{highest value} - \text{lowest value}$$
Example 1: Consider the quiz scores of two sets of students, data set 1: 2, 2, 3, 3, 3, 4, 4 and data set
2: 1, 1, 3, 3, 3, 5, 5. Find the range of each data set.
Solution:
For data set 1, the range is $R = 4 - 2 = 2$, while for data set 2, the range is $R = 5 - 1 = 4$.
The range of data set 2 is twice that of data set 1, so, based on the range, the
second set is more spread out than the first.
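The range computation in Example 1 can be sketched in Python as follows. The helper name `value_range` is an illustrative choice, not part of any library.

```python
def value_range(data):
    """Range R = highest value - lowest value."""
    return max(data) - min(data)

data_set_1 = [2, 2, 3, 3, 3, 4, 4]
data_set_2 = [1, 1, 3, 3, 3, 5, 5]

range_1 = value_range(data_set_1)  # 4 - 2
range_2 = value_range(data_set_2)  # 5 - 1

print(range_1, range_2)
```

Because the range uses only the two extreme values, it ignores how the remaining values are distributed between them.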
Another way to describe the variability of the values in a data set is by comparing how far
they are from the middle value. This is what the next measure of variability, the variance,
captures: it is the average of the squared differences of the data points from the mean.
The standard deviation is the square root of the variance.
The variance of a population is represented by the symbol $\sigma^2$, where $\sigma$ is the
lowercase Greek letter sigma. The formula is given by
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
where
$x$ = values in the data set
$\mu$ = population mean
$N$ = population size
Solution:
First, the mean is computed. This is a population since all the employees in the small company
are being considered.
$$\mu = \frac{55 + 31 + 56 + 27 + 33 + 41 + 49 + 52}{8} = \frac{344}{8} = 43$$
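The population variance is then computed as
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{(55 - 43)^2 + (31 - 43)^2 + (56 - 43)^2 + \cdots + (52 - 43)^2}{8} = \frac{934}{8} = 116.75$$
The population variance is 116.75. The population standard deviation is the square root of this value:
$$\sigma = \sqrt{\sigma^2} = \sqrt{116.75} \approx 10.805$$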
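The computation above can be reproduced with a minimal Python sketch that follows the population formula step by step. The function name `population_variance` is an illustrative choice; the standard library also offers `statistics.pvariance` and `statistics.pstdev` for the same purpose.

```python
from math import sqrt

def population_variance(data):
    """Average of the squared differences from the population mean."""
    mu = sum(data) / len(data)                           # population mean
    return sum((x - mu) ** 2 for x in data) / len(data)  # divide by N

# The eight values used in the solution above
values = [55, 31, 56, 27, 33, 41, 49, 52]

variance = population_variance(values)
std_dev = sqrt(variance)

print(variance, round(std_dev, 3))
```

The result agrees with the hand computation: a variance of 116.75 and a standard deviation of about 10.805.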
Observe that, based on the formula, the variance is the average of the squared distances from
the mean. This means that if the values are close to the mean, the variance will be small, while
values distant from the mean will increase the variance.
Squared distances are used instead of the plain differences, $x - \mu$, because the positive and
negative differences from the mean cancel each other out, so their sum is always zero.
Squaring eliminates the negative signs. Thus, the variance is always non-negative.
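The cancellation described above can be written out as a short derivation. It uses only the definition of the population mean from this module, $\mu = \frac{\sum x}{N}$, which gives $\sum x = N\mu$:

```latex
\sum (x - \mu) = \sum x - N\mu = N\mu - N\mu = 0
```

For the eight values used in the solution above, the differences from the mean of 43 are 12, −12, 13, −16, −10, −2, 6, and 9, which add up to 0.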
On the other hand, the square root of the variance is taken because of the units.
The unit of the variance is the square of the unit of the original data. By taking the square
root, the standard deviation has the same unit as the data. Also, since the standard deviation is a
measure of distance, it can never be negative.
Take note that this formula only applies when the data come from all the individuals in the
group. This collection of all individuals is called the population. In the previous example, the data are
from all the employees in the company, which is the population of interest, and thus the formula is
appropriate. However, when the population contains many individuals and collecting data
from all of them is too difficult or wasteful, data can instead be collected from
a subset of the population, which is called the sample. For example, a group of students wishing to form
a Kpop club would want to know how much time, on average, all the students in school spend listening
to Kpop. It is impractical to ask everyone in school, so they select a smaller set of students and ask
them instead. This smaller group is a sample.
For data coming from a sample, it might seem that the variance can be computed with the same
formula as the population variance. However, a variance computed from a sample using
that formula usually underestimates the population variance, which is unfortunate because the
goal of finding the sample variance is typically to estimate the population variance. It can be proven, though not
here, that dividing by $n - 1$ instead of $n$ corrects this underestimate. The sample variance $s^2$ and the
sample standard deviation $s$ are therefore computed as
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
where
$x$ = individual value
$\bar{x}$ = sample mean
$n$ = sample size
$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
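The effect of dividing by $n - 1$ can be illustrated, though not proven, with a small Python sketch. It treats quiz data set 2 as a whole population and lists every possible ordered sample of size 2 drawn with replacement. The sample size and the with-replacement sampling are assumptions made here only to keep the example small.

```python
from itertools import product
from statistics import pvariance, variance

# Quiz data set 2, treated here as an entire population (an assumption)
population = [1, 1, 3, 3, 3, 5, 5]
sample_size = 2  # hypothetical choice, kept small for illustration

# Every ordered sample of size 2 drawn with replacement: 7 * 7 = 49 samples
samples = list(product(population, repeat=sample_size))

# pvariance divides by n; variance divides by n - 1
avg_divide_by_n = sum(pvariance(s) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(variance(s) for s in samples) / len(samples)

true_variance = pvariance(population)

print(true_variance, avg_divide_by_n, avg_divide_by_n_minus_1)
```

Under these assumptions, the average of the divide-by-$n$ values falls below the population variance, while the average of the divide-by-$(n - 1)$ values matches it.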
Example 3: The following data give the numbers of vehicular accidents that occurred on a national
highway in a city in each of the past 8 weeks.
6 3 7 1 14 3 8 7
Calculate the variance and standard deviation, and give your interpretation.
Solution:
First, the mean is computed. This is a sample since it is a subset of the population of all weekly
vehicular accidents on the highway.
$$\bar{x} = \frac{6 + 3 + 7 + 1 + 14 + 3 + 8 + 7}{8} = \frac{49}{8} = 6.125$$
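The sample variance is then computed as
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(6 - 6.125)^2 + (3 - 6.125)^2 + \cdots + (7 - 6.125)^2}{7} = \frac{112.875}{7} = 16.125$$
and the sample standard deviation is
$$s = \sqrt{s^2} = \sqrt{16.125} \approx 4.02$$
Thus, the weekly number of vehicular accidents on the highway over the past 8 weeks differs
from the mean ($\bar{x} = 6.125$) by about 4.02 accidents, on average.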
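The values in Example 3 can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample formulas with $n - 1$ in the denominator. The variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Weekly numbers of vehicular accidents over the past 8 weeks
accidents = [6, 3, 7, 1, 14, 3, 8, 7]

x_bar = mean(accidents)          # sample mean
s_squared = variance(accidents)  # sample variance (divides by n - 1)
s = stdev(accidents)             # sample standard deviation

print(x_bar, s_squared, round(s, 2))
```

The output agrees with the hand computation: a mean of 6.125, a variance of 16.125, and a standard deviation of about 4.02.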
NAVIGATE
It is your turn to work on word problems. Please solve the following word problems in your
notebook. For both problems, compute the (a) range, (b) variance, and (c) standard deviation of the
data. Interpret the values obtained.
1. The following data give the hourly wage rates (in Philippine pesos) of all six employees of a
small startup company.
250 180 180 150 130 100
2. The lowest recorded temperatures (in °C) in the first ten months of 2019 are as follows.
16.5, 17.2, 25.6, 27.8, 32.1, 33.2, 26.9, 28.0, 27.8, 24.7
KNOT
In summary, the measures of variability are the range, the variance, and the standard deviation. The range
measures the difference between the maximum and the minimum values, while the variance and the standard
deviation indicate the spread of the data based on the distances of the values from the mean. A larger
variance or standard deviation means the data are more dispersed. The variance and standard deviation
are also used as indicators of the consistency of a variable: larger values mean the data points lie farther
from the mean, that is, there is more variation in the data. These measures will play an important role in
later lessons.
Prepared by: Jose Mari E. Ortega
Position: Special Science Teacher IV
Campus: PSHS-SMC
Reviewed by: Jennifer Ann L. de los Reyes
Position: Special Science III
Campus: PSHS-EVC