Measures of Grouped and Ungrouped Data Se 201
center of a set of data, arranged in an increasing or decreasing order of magnitude. The most
commonly used are the mean, median, and mode.
Population mean
If the set of data $x_1, x_2, x_3, \ldots, x_N$, not necessarily all distinct, represents a finite population of size
$N$, then the population mean is
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Example 1: The following are the scores of 11 graduating students in Math 112 Real Analysis II.
Compute the mean.
65 76 78 83 90 86 45 52 37 56 72
Solution:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{65+76+78+83+90+86+45+52+37+56+72}{11} = \frac{740}{11} \approx 67.3$$
Example 2: The Intelligence Quotients (IQs) of five members of a family are 112, 125, 130, 98,
and 96. Find the mean IQ.
Solution:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{112+125+130+98+96}{5} = \frac{561}{5} = 112.2$$
Sample mean
If the set of data $x_1, x_2, x_3, \ldots, x_n$, not necessarily all distinct, represents a finite sample of size $n$,
then the sample mean is
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Example 3: On a midterm exam in Elementary Statistics, 7 students obtained the following grades:
89, 82, 76, 79, 65, 92, and 54. Treating these results as a sample, compute the mean.
Solution:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{89+82+76+79+65+92+54}{7} = \frac{537}{7} \approx 76.71$$
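The arithmetic in Examples 1 and 3 can be checked with a few lines of Python. This is only a sketch: the helper name `mean` and the variable names are illustrative and do not come from the notes.

```python
# Scores from Example 1 (treated as a population) and grades from Example 3 (a sample).
math112_scores = [65, 76, 78, 83, 90, 86, 45, 52, 37, 56, 72]
midterm_grades = [89, 82, 76, 79, 65, 92, 54]


def mean(values):
    """Arithmetic mean: the sum of the observations divided by their count."""
    return sum(values) / len(values)


population_mean = mean(math112_scores)  # 740 / 11
sample_mean = mean(midterm_grades)      # 537 / 7

print(round(population_mean, 1))  # 67.3
print(round(sample_mean, 2))      # 76.71
```

The computation is identical for a population and a sample; only the notation ($\mu$ and $N$ versus $\bar{x}$ and $n$) differs.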
Median
The median of a set of observations arranged in an increasing or decreasing order of magnitude is
the middle value when the number of observations is odd or the arithmetic mean of the two middle
values when the number of observations is even.
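The definition above translates directly into code. The following is a minimal sketch; the function name `median` is illustrative, and the even-sized list is simply the Example 2 IQs with the largest value dropped, used here only to exercise the second branch.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: take the single middle value.
        return ordered[mid]
    # Even number of observations: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


iq_scores = [112, 125, 130, 98, 96]    # Example 2, five observations
four_iqs = [112, 125, 98, 96]          # illustrative even-sized subset

print(median(iq_scores))  # 112
print(median(four_iqs))   # 105.0
```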
Mode
The mode of a set of observations is that value which occurs most often or with the greatest
frequency. The mode does not always exist. This is certainly true when all observations occur with
the same frequency.
Example 6: The numbers of participants present during a five-day workshop-seminar were as
follows: 40, 46, 50, 49, and 46. Find the mode.
The mode is 46.
Example 7: The numbers of incorrect answers on a true-false competency test for a random sample
of 15 students were recorded as follows: 2, 1, 3, 0, 1, 3, 3, 6, 0, 3, 3, 5, 2, 1, 4, and
2. Find the mode.
The mode is 3.
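A frequency count makes the mode easy to find, and also shows when no mode exists. The sketch below assumes a non-empty list and uses an illustrative helper name, `modes`; it returns every value tied for the highest frequency, or an empty list when all distinct values occur equally often.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when every value is equally frequent."""
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # all observations occur with the same frequency: no mode
    return sorted(v for v, c in counts.items() if c == highest)


participants = [40, 46, 50, 49, 46]                                   # Example 6
incorrect = [2, 1, 3, 0, 1, 3, 3, 6, 0, 3, 3, 5, 2, 1, 4, 2]          # Example 7, as listed

print(modes(participants))  # [46]
print(modes(incorrect))     # [3]
print(modes([1, 2, 3]))     # []
```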
Remarks:
Mean
1. The mean is the most commonly used measure of location in statistics.
2. It is easy to calculate and it employs all the variable information.
3. The disadvantage of the mean is that it is adversely affected by extreme values.
Median
1. The median is easy to compute if the number of observations is relatively small.
2. It is not affected by extreme values.
Mode
1. The mode is the least used measure of the three.
2. Its value is almost useless for small sets of data.
3. It requires no calculation.
4. It can be used for both quantitative and qualitative data.
The measures of central location by themselves do not give an adequate description of our data. We
also need to know how the observations spread out from the average.
The most important statistics for measuring the variability of a set of data are the range and the
variance.
Range
The range of a set of data is the difference between the largest and smallest number in a set.
Example 8: The range in Example 2 is 130 − 96 = 34.
Example 9: The range in Example 1 is 90 − 37 = 53.
The range is very simple to compute. However, it is a poor measure of variation, particularly if
the size of the sample is large. It only considers the extreme values and it tells us nothing about
the distribution of numbers in between.
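The following sketch reproduces Examples 8 and 9 and illustrates the weakness just described. The two five-value lists are made up for illustration only: they share the same extremes, so the range cannot tell them apart even though one is tightly clustered and the other is evenly spread.

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


example_1 = [65, 76, 78, 83, 90, 86, 45, 52, 37, 56, 72]
example_2 = [112, 125, 130, 98, 96]

print(data_range(example_1))  # 53
print(data_range(example_2))  # 34

# Hypothetical data sets (not from the notes) with identical extremes.
clustered = [37, 62, 63, 64, 90]
spread_out = [37, 50, 63, 77, 90]
print(data_range(clustered), data_range(spread_out))  # 53 53
```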
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
For the scores in Example 1, with $\mu = 740/11 \approx 67.27$:
x_i     x_i − μ                   (x_i − μ)^2
65      65 − 67.27 = −2.27        (−2.27)^2 = 5.1529
76      76 − 67.27 = 8.73         (8.73)^2 = 76.2129
78      78 − 67.27 = 10.73        (10.73)^2 = 115.1329
83      83 − 67.27 = 15.73        (15.73)^2 = 247.4329
90      90 − 67.27 = 22.73        (22.73)^2 = 516.6529
86      86 − 67.27 = 18.73        (18.73)^2 = 350.8129
45      45 − 67.27 = −22.27       (−22.27)^2 = 495.9529
52      52 − 67.27 = −15.27       (−15.27)^2 = 233.1729
37      37 − 67.27 = −30.27       (−30.27)^2 = 916.2729
56      56 − 67.27 = −11.27       (−11.27)^2 = 127.0129
72      72 − 67.27 = 4.73         (4.73)^2 = 22.3729
Total                             3106.1819
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{3106.1819}{11} = 282.3802$$
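The population variance of the Example 1 scores can be recomputed as a check. This sketch keeps the mean unrounded, so the sum of squared deviations may differ from the hand computation in the last decimal place; the cross-check uses `statistics.pvariance` from the Python standard library.

```python
import math
import statistics

scores = [65, 76, 78, 83, 90, 86, 45, 52, 37, 56, 72]  # Example 1

N = len(scores)
mu = sum(scores) / N                                 # population mean, 740 / 11
squared_deviations = [(x - mu) ** 2 for x in scores]
sigma_squared = sum(squared_deviations) / N          # divide by N for a population

print(round(sigma_squared, 4))  # 282.3802

# The standard library computes the same quantity.
assert math.isclose(sigma_squared, statistics.pvariance(scores))
```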
Sample variance
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
or, in computational form,
$$s^2 = \frac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n(n - 1)}$$
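The definitional form and the computational form of the sample variance (the latter with the sum of the observations squared in the numerator) give the same result. The sketch below checks this on the Example 3 grades; applying the formulas to that example is an illustration added here, not a computation carried out in the notes.

```python
import math

grades = [89, 82, 76, 79, 65, 92, 54]  # Example 3, treated as a sample

n = len(grades)
x_bar = sum(grades) / n

# Definitional form: squared deviations from the sample mean, divided by n - 1.
s2_definition = sum((x - x_bar) ** 2 for x in grades) / (n - 1)

# Computational form: [n * sum(x^2) - (sum x)^2] / [n (n - 1)].
s2_shortcut = (n * sum(x * x for x in grades) - sum(grades) ** 2) / (n * (n - 1))

s = math.sqrt(s2_shortcut)  # sample standard deviation

print(round(s2_definition, 2))  # 178.57
print(round(s2_shortcut, 2))    # 178.57
print(round(s, 2))              # 13.36
```

The computational form avoids computing the mean first, which is convenient by hand; in floating-point code the definitional form is usually the numerically safer choice.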
The standard deviation, the positive square root of the variance, provides a method for converting
observed variances to standard form so that they can be more easily understood and compared. The
variance and standard deviation provide the most powerful estimate of variation because they
consider the value of each score.
Remarks:
1. The range is the least reliable of the measures and is used only when one is in a hurry to
get a measure of variability. It may be used for ordinal, interval, or ratio data.
2. The most important measures of variability are the standard deviation and its square, the
variance. The variance is the average of the squared deviations around the mean.
In statistical analysis, measures of central tendency play a crucial role in summarizing and
describing a dataset. These measures provide insights into the central or typical value around which
the data tends to cluster. When dealing with grouped data, where individual values are organized
into intervals or classes, it becomes essential to adapt central tendency measures to this format.
$$\text{Mean} = \frac{\sum f_i x_i}{n}$$
where: $f_i$ = the frequency of class interval $i$
$x_i$ = the midpoint of class interval $i$
$\sum f_i x_i$ = the sum of the products of the frequency and midpoint of each class interval
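$$\text{Median} = L_1 + \left(\frac{\frac{n}{2} - cf}{f}\right) c$$
where: $L_1$ = the lower boundary of the median class
$cf$ = the cumulative frequency of the classes below the median class
$f$ = the frequency of the median class
$c$ = the class width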
$$s^2 = \frac{n \sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}{n(n - 1)}$$
The quartile deviation is used when the median is used as the average and when the data depart
noticeably from normality. It is used for ordinal data.
The quartile deviation, $Q$, is frequently called the semi-interquartile range. It is half of the distance
between the two quartile points, $Q_1$ and $Q_3$.
In symbols:
$$Q = \frac{Q_3 - Q_1}{2}$$
where
$$Q_1 = L_1 + \left(\frac{\frac{n}{4} - cf}{f}\right) c, \qquad Q_3 = L_1 + \left(\frac{\frac{3n}{4} - cf}{f}\right) c$$
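DECILES are the score points which divide the distribution into ten equal parts.
$$D_1 = L_1 + \left(\frac{\frac{n}{10} - cf}{f}\right) c, \quad D_2 = L_1 + \left(\frac{\frac{2n}{10} - cf}{f}\right) c, \quad \ldots, \quad D_9 = L_1 + \left(\frac{\frac{9n}{10} - cf}{f}\right) c$$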
PERCENTILES are the score points which divide the distribution into 100 equal parts.
Percentiles are useful in cases where comparison between individual scores relative to their
position in the entire group is a major concern. For example, a student who surpassed 90% of all
the examinees has a percentile rank of 90, and a student in the top 20% of the class has a percentile
rank of at least 80.
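$$P_1 = L_1 + \left(\frac{\frac{n}{100} - cf}{f}\right) c, \quad P_2 = L_1 + \left(\frac{\frac{2n}{100} - cf}{f}\right) c, \quad \ldots, \quad P_{99} = L_1 + \left(\frac{\frac{99n}{100} - cf}{f}\right) c$$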
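The median, quartile, decile, and percentile formulas all share one pattern: locate the class containing the target position $kn/m$, then interpolate within it. The sketch below captures that pattern in a single hypothetical helper, `grouped_quantile`. It assumes equal class widths and classes listed from lowest to highest, and it uses the class frequencies from the exercise that follows.

```python
def grouped_quantile(classes, k, parts, width):
    """Interpolated k-th of `parts` quantile for grouped data.

    classes: (lower_boundary, frequency) pairs in ascending order.
    Implements L1 + ((k*n/parts - cf) / f) * c.
    """
    n = sum(f for _, f in classes)
    target = k * n / parts
    cf = 0  # cumulative frequency of the classes below the current one
    for lower, f in classes:
        if f > 0 and cf + f >= target:
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("k must satisfy 0 < k <= parts")


# Lower class boundaries and frequencies, lowest class first, width c = 5.
classes = [(15.5, 10), (20.5, 18), (25.5, 12), (30.5, 8), (35.5, 2)]

print(round(grouped_quantile(classes, 1, 2, 5), 2))     # median: 24.67
print(round(grouped_quantile(classes, 1, 4, 5), 2))     # Q1: 21.19
print(round(grouped_quantile(classes, 3, 4, 5), 2))     # Q3: 29.46
print(round(grouped_quantile(classes, 9, 10, 5), 3))    # D9: 33.625
print(round(grouped_quantile(classes, 90, 100, 5), 3))  # P90: 33.625
```

Passing `k` and `parts` as integers, rather than a single decimal fraction, keeps the target position exact for cases such as $9n/10$.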
Complete the table and find the mean, median, mode, variance, standard deviation, $Q_1$, $Q_3$, and $Q$.
Make the bar chart, histogram, frequency polygon, and ogive.
Solution:
Class       f     Class          Midpoint   f_i x_i   f_i x_i^2   cf
interval          boundaries     x_i
36-40       2     35.5-40.5      38         76        2888        50
31-35       8     30.5-35.5      33         264       8712        48
26-30       12    25.5-30.5      28         336       9408        40
21-25       18    20.5-25.5      23         414       9522        28
16-20       10    15.5-20.5      18         180       3240        10
Total       50                              1270      33770
a) mean = $\frac{\sum f_i x_i}{n} = \frac{1270}{50} = 25.4$
b) median = $L_1 + \left(\frac{\frac{n}{2} - cf}{f}\right) c = 20.5 + \left(\frac{\frac{50}{2} - 10}{18}\right)(5) = 24.67$
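c) mode = $L_1 + \left(\frac{d_1}{d_1 + d_2}\right) c = 20.5 + \left(\frac{8}{8 + 6}\right)(5) = 23.36$
where $d_1 = 18 - 10 = 8$ is the frequency of the modal class minus the frequency of the class below it,
and $d_2 = 18 - 12 = 6$ is the frequency of the modal class minus the frequency of the class above it.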
e) variance
$$s^2 = \frac{n \sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}{n(n - 1)} = \frac{50(33770) - (1270)^2}{50(49)} = \frac{1688500 - 1612900}{2450} = \frac{75600}{2450} = 30.86$$
f) SD
$$s = \sqrt{30.86} \approx 5.555$$
g) $Q_1 = L_1 + \left(\frac{\frac{n}{4} - cf}{f}\right) c = 20.5 + \left(\frac{\frac{50}{4} - 10}{18}\right)(5) = 20.5 + 0.69 = 21.19$
h) $Q_3 = L_1 + \left(\frac{\frac{3n}{4} - cf}{f}\right) c = 25.5 + \left(\frac{\frac{3}{4}(50) - 28}{12}\right)(5) = 25.5 + 3.96 = 29.46$
i) $Q = \frac{Q_3 - Q_1}{2} = \frac{29.46 - 21.19}{2} = 4.135$
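The grouped mean, mode, variance, and standard deviation in the solution can be recomputed from the class midpoints and frequencies. This is a sketch under the same assumptions as the worked solution (equal class width of 5, a single modal class); the variable names are illustrative, and a neighbouring class that does not exist is treated as having frequency 0.

```python
import math

# (class midpoint, frequency) pairs from the table, lowest class first.
table = [(18, 10), (23, 18), (28, 12), (33, 8), (38, 2)]
width = 5

n = sum(f for _, f in table)
sum_fx = sum(f * x for x, f in table)        # 1270
sum_fx2 = sum(f * x * x for x, f in table)   # 33770

mean = sum_fx / n
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
sd = math.sqrt(variance)

# Mode: interpolate inside the class with the highest frequency.
freqs = [f for _, f in table]
i = freqs.index(max(freqs))
below = freqs[i - 1] if i > 0 else 0
above = freqs[i + 1] if i < len(freqs) - 1 else 0
d1 = freqs[i] - below
d2 = freqs[i] - above
lower_boundary = table[i][0] - width / 2     # midpoint minus half the class width
mode = lower_boundary + d1 / (d1 + d2) * width

print(mean)                # 25.4
print(round(mode, 2))      # 23.36
print(round(variance, 2))  # 30.86
print(round(sd, 3))        # 5.555
```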