BComp3 Module 5 Measures of Variability
LAGUNA UNIVERSITY
College of Entrepreneurship, Management and Accountancy
Bachelor of Science in Accountancy
Module 5
Lesson 1
Measures of Variability
I. Learning Outcomes
1. Solve for range, quartile deviation, mean absolute deviation, and standard deviation for
grouped and ungrouped data.
2. Solve for quartile, decile and percentile for grouped and ungrouped data.
3. Use a calculator and MS Excel for faster and easier computation.
II. Introduction
Descriptive measures that are used to indicate the amount of variation in a data set
are called measures of variability, dispersion, or spread. When descriptive statistics are
presented, there is usually at least one measure of central tendency and at least one
measure of variability reported. The measures of dispersion to be discussed are the range,
mean absolute deviation, quartile deviation, interquartile range, variance, and standard
deviation.
Definition of Variability
The most common measures of dispersion or variability of scores are the following:
1. The Range
- The range is the simplest and the easiest of the measures of dispersion.
- It simply measures the distance between the highest score and the lowest score.
- It is considered as the least satisfactory measure of dispersion because it does not
tell anything about the scores between these two extremes.
5. The Variance
- It is the average of the squared deviations from the mean.
- The lesser the value of the measure, the more consistent, the more homogeneous
and the less scattered are the observations in the set of data.
- If there is a large amount of variation, then on average, the data values will be far
from the mean. Hence, the standard deviation (SD) will be large.
- If there is only a small amount of variation, then on average, the data values will be
close to the mean. Hence, the SD will be small.
Computation of Range, Quartile Deviation, Mean Absolute Deviation, and Standard Deviation
from Ungrouped Data
The range of a dataset is the difference between the largest and smallest values in that
dataset. For example, in the two datasets below, dataset 1 has a range of 38 – 20 = 18 while
dataset 2 has a range of 52 – 11 = 41. Dataset 2 has a broader range and, hence, more
variability than dataset 1.
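As a minimal sketch of the range computation, the Python snippet below uses two hypothetical datasets. Only the extreme values (20 and 38, 11 and 52) come from the text; the values in between and the helper name `data_range` are assumptions made for illustration.

```python
# Hypothetical datasets: only the extremes (20 and 38, 11 and 52) come from
# the text; the values in between are made up for illustration.
dataset_1 = [20, 24, 27, 30, 33, 38]
dataset_2 = [11, 19, 28, 35, 44, 52]


def data_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


range_1 = data_range(dataset_1)
range_2 = data_range(dataset_2)
print(range_1, range_2)
```

Because the range depends only on the two extremes, changing any of the middle values leaves both results unchanged.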
Interquartile range is defined as the difference between the upper and lower
quartile values in a set of data. It is commonly referred to as IQR and is used as a measure of
spread and variability in a data set.
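The following sketch shows one way to obtain the IQR in Python. The observations are hypothetical, and the quartiles are taken with `statistics.quantiles` using its "inclusive" method; textbooks and software use several quartile conventions, so other methods may give slightly different values. The quartile deviation is computed as half of the IQR.

```python
import statistics

# Hypothetical observations, chosen so the quartiles fall exactly on data values.
values = [1, 3, 5, 7, 9, 11, 13, 15, 17]

# With n=4, statistics.quantiles returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

iqr = q3 - q1                  # interquartile range: upper minus lower quartile
quartile_deviation = iqr / 2   # quartile deviation: half of the IQR

print(q1, q3, iqr, quartile_deviation)
```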
Example Problem
You grow 20 crystals from a solution and measure the length of each crystal in millimeters.
Here is your data:
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
1. Calculate the mean of the data. Add up all the numbers and divide by the total number
of data points.
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
2. Subtract the mean from each data point (or the other way around, if you prefer; you
will be squaring this number, so it does not matter if it is positive or negative), then
square the result.
$(9 - 7)^2 = (2)^2 = 4$
$(2 - 7)^2 = (-5)^2 = 25$
$(5 - 7)^2 = (-2)^2 = 4$
$(4 - 7)^2 = (-3)^2 = 9$
$(12 - 7)^2 = (5)^2 = 25$
$(7 - 7)^2 = (0)^2 = 0$
$(8 - 7)^2 = (1)^2 = 1$
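The two steps above can be sketched in Python for all 20 crystals. The data are the lengths listed in the sum of step 1. The final steps (summing the squares, dividing by n − 1, and taking the square root) are not shown in the text; they follow the usual sample standard deviation and are an assumption about how the example continues.

```python
import math

# Crystal lengths in millimeters, taken from the sum in step 1.
lengths = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

n = len(lengths)
mean = sum(lengths) / n  # step 1: 140 / 20

# Step 2: squared deviation of every data point from the mean.
squared_deviations = [(x - mean) ** 2 for x in lengths]

# Assumed remaining steps: sum the squares, divide by n - 1, take the root.
sum_of_squares = sum(squared_deviations)
sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(mean, sum_of_squares, round(sample_variance, 2), round(sample_sd, 2))
```

Dividing by n instead of n − 1 in the last step would give the population variance rather than the sample variance.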
In a frequency distribution table, the range is the difference between the upper limit of
the highest class interval and the lower limit of the lowest class interval.
The quartiles may be determined from grouped data in the same way as the median, except
that in place of n/2 we use n/4. To calculate quartiles from grouped data, we first form a
cumulative frequency column. Quartiles for grouped data are calculated from the following
formulae:
$Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - C.F\right)$
$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C.F\right)$
$Q_2 = l + \frac{h}{f}\left(\frac{2n}{4} - C.F\right)$ = Median.
Where,
l = lower class boundary of the class containing the quartile, i.e. the class corresponding to the
cumulative frequency in which n/4 or 3n/4 lies
h = class interval size of the class containing the quartile.
f = frequency of the class containing the quartile.
n = number of values, or the total frequency.
C.F = cumulative frequency of the class preceding the class containing the quartile.
For Example:
We will calculate the quartiles from the frequency distribution for the weight of 120 students as
given in the following Table 18;
Table 18
∑f = n = 120
i. The first quartile $Q_1$ is the value of the $(n/4)$th = $(120/4)$th item, or the 30th item from the lower end. From
Table 18 we see that the cumulative frequency of the third class is 22 and that of the fourth class is
50. Thus $Q_1$ lies in the fourth class, i.e. 140 – 149.
ii. The third quartile $Q_3$ is the value of the $(3n/4)$th = $(3 \times 120/4)$th item, or the 90th item from the lower end. The
cumulative frequency of the fifth class is 75 and that of the sixth class is 93. Thus, $Q_3$ lies in the
sixth class, i.e. 160 – 169.
Conclusion
From $Q_1$ and $Q_3$ we conclude that 25% of the students weigh 142.36 pounds or less and 75% of
the students weigh 167.83 pounds or less.
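The two quartiles in the conclusion can be checked with a short Python sketch. It assumes the usual grouped-data form, quartile = l + (h / f)(k·n/4 − C.F), with class boundaries 139.5 and 159.5 and a class width of 10; these values are inferred from the class limits 140 – 149 and 160 – 169 and are not stated in the text. The class frequencies are taken as differences of the cumulative frequencies quoted above, and the function name `grouped_quartile` is hypothetical.

```python
def grouped_quartile(l, h, f, n, cf, k):
    """k-th quartile (k = 1, 2, 3) from grouped data: l + (h / f) * (k*n/4 - cf)."""
    return l + (h / f) * (k * n / 4 - cf)


n = 120  # total frequency

# Q1 class 140-149: assumed boundary 139.5, width 10, f = 50 - 22, C.F = 22.
q1 = grouped_quartile(l=139.5, h=10, f=50 - 22, n=n, cf=22, k=1)

# Q3 class 160-169: assumed boundary 159.5, width 10, f = 93 - 75, C.F = 75.
q3 = grouped_quartile(l=159.5, h=10, f=93 - 75, n=n, cf=75, k=3)

print(round(q1, 2), round(q3, 2))
```

Under these assumptions the rounded results agree with the 142.36 and 167.83 pounds quoted in the conclusion.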
When working with data that are grouped into categories or intervals, the variance and
standard deviation are again obtained using the deviations about the mean and the squared
values of these. But in this case, the squared differences about the mean are weighted
by the number of times each occurs.
The computation is essentially the same as that for ungrouped data, except that X is
not the value of an item but rather the class mark of each class interval.
$s = \sqrt{\frac{\sum f (X - \bar{x})^2}{n - 1}}$
where,
s = sample standard deviation
$\sum$ = sum of...
$\bar{x}$ = sample mean
f = frequency
n = number of scores in sample.
where,
f = frequency
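A minimal Python sketch of the grouped-data computation follows. The frequency table is hypothetical, and the sketch assumes the sample form of the standard deviation, in which each squared deviation of a class mark from the mean is weighted by its frequency and the weighted sum is divided by n − 1.

```python
import math

# Hypothetical frequency table: class marks X and their frequencies f.
class_marks = [5, 15, 25, 35]
frequencies = [2, 3, 4, 1]

n = sum(frequencies)  # total number of scores

# Mean of grouped data: sum of f * X divided by n.
mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n

# Squared deviations of the class marks, each weighted by its frequency.
weighted_ss = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, frequencies))

sample_variance = weighted_ss / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(mean, weighted_ss, round(sample_sd, 2))
```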
However, in statistics, we are usually presented with a sample from which we wish to
estimate (generalize to) a population, and the standard deviation is no exception to this.
Therefore, if all you have is a sample, but you wish to make a statement about the population
standard deviation from which the sample is drawn, you need to use the sample standard
deviation.
Q. A teacher sets an exam for their pupils. The teacher wants to summarize the results
the pupils attained as a mean and standard deviation. Which standard deviation should
be used?
A. Population standard deviation. Why? Because the teacher is only interested in this
class of pupils' scores and nobody else.
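Q. A researcher has recruited males aged 45 to 65 years old for an exercise training
study to investigate risk markers for heart disease (e.g., cholesterol). Which standard
deviation would most likely be used?
A. Sample standard deviation. Why? Because the researcher wants to generalize the
results from the recruited participants to the population of males aged 45 to 65 years old.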
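To see how the choice between the two standard deviations affects the result, the sketch below applies Python's `statistics.pstdev` (population, divides by n) and `statistics.stdev` (sample, divides by n − 1) to the same scores. The scores are hypothetical.

```python
import statistics

# Hypothetical exam scores for one class of eight pupils.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Population SD: the class is the whole group of interest.
population_sd = statistics.pstdev(scores)

# Sample SD: the scores are treated as a sample from a larger population.
sample_sd = statistics.stdev(scores)

print(population_sd, round(sample_sd, 3))
```

The sample standard deviation is always the larger of the two for the same data, and the gap shrinks as the number of observations grows.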
(Note: Mean $\bar{x} = \frac{\sum fx}{n}$)
https://statistics.laerd.com/statistical-guides/measures-of-spread-standard-deviation.php
Youtube tutorial https://www.youtube.com/watch?v=9i2gNbvA0dQ
Sample problem: