Modules: Weeks 1-8, 2nd Quarter
Second Quarter
(Week 1)
Averages occur regularly in daily life and are an important tool in statistics. A well-chosen average is a single number about which the values of a given data set are centered. There are several types of averages, also called measures of central tendency: the mean, the median, and the mode.
MEAN
The mean is the most commonly used measure of central tendency. When we speak of an average, we usually refer to the mean.
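Formula: $\bar{X} = \frac{\sum X}{N}$
Illustrative Example 1:
Number of education plans: 25, 18, 36, 13, 22. Find the mean.
$\bar{X} = \frac{\sum X}{N} = \frac{25 + 18 + 36 + 13 + 22}{5} = \frac{114}{5} = 22.8$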
Illustrative Example 2:
A researcher collects data on the ages of recipients of doctoral degrees in science and engineering, and his study yields the following: 44, 24, 28, 43, 36, 41, 33, 27, 37, 37. Determine the average age of the recipients.
The mean is determined by finding the sum of the ages and then dividing by the total number of recipients.
$\bar{X} = \frac{\sum X}{N} = \frac{44 + 24 + 28 + 43 + 36 + 41 + 33 + 27 + 37 + 37}{10} = \frac{350}{10} = 35$
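The computation of the mean for ungrouped data can be sketched in a few lines of Python. This is a minimal illustration, not part of the module; the function name `mean` and the variable names are chosen here for convenience.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Data from the two illustrative examples.
plans = [25, 18, 36, 13, 22]
ages = [44, 24, 28, 43, 36, 41, 33, 27, 37, 37]

print(mean(plans))  # 22.8
print(mean(ages))   # 35.0
```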
Grouped Mean (Class Mark Formula)
For grouped data, the midpoints of the classes are used as the values of x. The following are the steps in solving for the mean of grouped data.
1. Find the midpoint of each class. Place them in a column.
2. Multiply the frequency by the midpoint for each class. Place the products in another column.
3. Find the sum of the resulting column in step 2.
4. Divide the sum obtained in step 3 by the total number of frequencies. That is,
Formula: $\bar{X} = \frac{\sum fx}{N}$
where:
f = class frequency
x = class midpoint
N = sum of frequencies
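$\bar{X} = \frac{\sum fx}{N} = \frac{1502}{50} = 30.04$
The same mean can be obtained from an assumed mean (AM):
Formula: $\bar{X} = AM + \frac{(\sum fd)(i)}{N}$
where d is the deviation of each class from the class containing the assumed mean, counted in class intervals, and i is the class interval.
$\bar{X} = 28 + \frac{(34)(3)}{50}$
$\bar{X} = 28 + 2.04$
$\bar{X} = 30.04$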
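The two grouped-mean formulas can be checked against each other with a short Python sketch. The frequency table below is hypothetical (it is not the table behind the 30.04 result, which is not reproduced in this module text), and the function names are assumptions made for illustration. Classes are assumed to have equal width.

```python
def grouped_mean(classes):
    """Class mark formula: sum(f * x) / N, where x is the class midpoint.

    classes: list of (lower_limit, upper_limit, frequency) tuples.
    """
    n = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / n


def grouped_mean_coded(classes, assumed_index):
    """Assumed mean formula: AM + (sum(f * d) * i) / N.

    d is the number of class steps away from the class at assumed_index.
    """
    lo, hi, _ = classes[assumed_index]
    assumed_mean = (lo + hi) / 2
    width = classes[1][0] - classes[0][0]  # class interval i
    n = sum(f for _, _, f in classes)
    total_fd = sum(f * (idx - assumed_index)
                   for idx, (_, _, f) in enumerate(classes))
    return assumed_mean + total_fd * width / n


# Hypothetical frequency table, for illustration only.
table = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]

print(grouped_mean(table))           # 21.25
print(grouped_mean_coded(table, 2))  # 21.25
```

Both functions return the same value for any choice of `assumed_index`, which mirrors the agreement of the two results (30.04) in the worked example.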
Characteristics of the mean:
Below are the characteristics of the mean of any distribution:
1. The mean is the most appropriate measure of central tendency when the data are in the interval or ratio scale.
2. The mean lies between the largest and smallest values or measurements.
3. There is only one value for the mean for a given set of values or measurements.
4. The mean is easily influenced by extreme values because all values contribute to the average. If there are extremely high values, the mean tends to be high also. If there are extremely low values, the mean tends to be low also.
Statistics
Second Quarter
(Week 2)
MEDIAN
The median is the middle value of a given set of measurements, provided that the values or measurements are arranged in an array. An array is an arrangement of values in increasing or decreasing order.
Example 1:
The following are the ages of the mathematics teachers in San Juan Elementary School: 21, 23, 32, 28, 25, 50, 48. Compute the median.
Step 1: We first arrange the data in an array: 21, 23, 25, 28, 32, 48, 50.
Step 2: Then, we get the middle value, which is 28. Hence, the median is 28. Observe that in this example we have an odd number of values (7 values). Thus, there is only one middlemost value.
Step 3: Interpret the result: The median age of the teachers is 28.
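The procedure for the median of ungrouped data can be sketched in Python as follows. The function name `median` is an assumption for illustration. The sketch also handles an even number of values by averaging the two middlemost values, using the eight English test scores from the next example.

```python
def median(values):
    """Middle value of the sorted data; the mean of the two middle values if n is even."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)  # arrange the data in an array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


teacher_ages = [21, 23, 32, 28, 25, 50, 48]
english_scores = [10, 12, 12, 14, 15, 16, 18, 20]

print(median(teacher_ages))    # 28
print(median(english_scores))  # 14.5
```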
Example 2:
In an English test, 8 students obtained the following scores: 10, 15, 12, 18, 16, 20, 12, 14. Find the median.
Step 1: We first arrange the scores in an array, that is, 10, 12, 12, 14, 15, 16, 18, 20.
Step 2: Since the number of scores is even (8 scores), there are two middlemost scores, namely 14 and 15. To get the median, we get the mean of the two middlemost scores. Thus, the median is (14 + 15) / 2 = 14.5.
Step 3: Interpret the result. The median score of the 8 students in the English test is 14.5.
Grouped Median
Here are the steps in finding the median for any grouped data.
1. Make a table of cumulative frequencies.
2. Divide N, the total number of frequencies, by 2 to get the halfway point.
3. Locate the median class in the cumulative frequency column.
4. Substitute in the formula.
Formula: $\tilde{X} = LB_{MC} + \left(\frac{\frac{N}{2} - {<}cf}{f_{MC}}\right) i$
where:
$LB_{MC}$ = lower boundary of the median class
<cf = cumulative frequency of the class preceding the median class
$f_{MC}$ = frequency of the median class
N = sum of frequencies
i = class interval
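Example 1. Find the median of the given set of data.
Class Interval   frequency   LB      <cf
53-64            6           52.5    6
65-76            12          64.5    18
77-88            25          76.5    43
89-100           18          88.5    61
101-112          14          100.5   75
113-124          5           112.5   80
N = 80
Formula: $\tilde{X} = LB_{MC} + \left(\frac{\frac{N}{2} - {<}cf}{f_{MC}}\right) i$
$\tilde{X} = 76.5 + \left(\frac{\frac{80}{2} - 18}{25}\right) 12$
$\tilde{X} = 76.5 + \left(\frac{40 - 18}{25}\right) 12$
$\tilde{X} = 76.5 + \left(\frac{22}{25}\right) 12$
$\tilde{X} = 76.5 + 10.56$
$\tilde{X} = 87.06$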
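Example 2. Find the median of the given set of data.
Class Interval   frequency   LB     <cf
16-23            1           15.5   1
24-31            3           23.5   4
32-39            6           31.5   10
40-47            12          39.5   22
48-55            10          47.5   32
56-63            8           55.5   40
N = 40
Formula: $\tilde{X} = LB_{MC} + \left(\frac{\frac{N}{2} - {<}cf}{f_{MC}}\right) i$
$\tilde{X} = 39.5 + \left(\frac{\frac{40}{2} - 10}{12}\right) 8$
$\tilde{X} = 39.5 + \left(\frac{10}{12}\right) 8$
$\tilde{X} \approx 39.5 + 6.67$
$\tilde{X} \approx 46.17$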
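The grouped median steps can be sketched in Python as shown below. This is an illustrative sketch under two stated assumptions: class limits are whole numbers (so the lower boundary is the lower limit minus 0.5), and all classes share the same width. The function name `grouped_median` is hypothetical.

```python
def grouped_median(classes):
    """Median of grouped data: LB + ((N/2 - cf) / f) * i.

    classes: list of (lower_limit, upper_limit, frequency), in increasing order.
    """
    n = sum(f for _, _, f in classes)
    half = n / 2
    width = classes[1][0] - classes[0][0]  # class interval i
    cf_before = 0                          # cumulative frequency before the current class
    for lower, _, f in classes:
        if cf_before + f >= half:          # this is the median class
            lower_boundary = lower - 0.5   # assumes whole-number class limits
            return lower_boundary + (half - cf_before) / f * width
        cf_before += f
    raise ValueError("frequency table is empty")


table_80 = [(53, 64, 6), (65, 76, 12), (77, 88, 25),
            (89, 100, 18), (101, 112, 14), (113, 124, 5)]
table_40 = [(16, 23, 1), (24, 31, 3), (32, 39, 6),
            (40, 47, 12), (48, 55, 10), (56, 63, 8)]

print(round(grouped_median(table_80), 2))  # 87.06
print(round(grouped_median(table_40), 2))  # 46.17
```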
Characteristics of the median:
1. The median is the most appropriate measure of central tendency for ordinal data.
2. The median lies between the highest and lowest measurements.
3. There is only one value for the median in a given set of measurements.
4. The median is not influenced by extreme values.
5. The median is used when the middle value is desired. It is the value where 50% or half of the
distribution lies above it and 50% lies below it.
Statistics
Second Quarter
(Week 3)
MODE
Another measure of central tendency is the mode. The mode is the value which occurs most frequently in a set of measurements or values. It is the least commonly used among the three measures of central tendency. However, it is very useful as a measure of popularity. For example, we might be interested in determining the most popular TV show, the most preferred brand of toothpaste, the favorite ice cream, or the best-selling brand of shoes. In these situations, the mode is the most appropriate measure of central tendency.
The mode for ungrouped data is easy to find. It is just the value or measurement which occurs the greatest number of times. In other words, it is the most popular value. A distribution may have only one mode. In this case, the distribution is said to be unimodal. Data that have two values for the mode are said to be bimodal. It is also possible for a set of data to be multimodal, with more than two modes. If every value in the set of data occurs only once, then the set of data has no mode.
Example 1: The data on the number of times 10 mothers go to market every week are shown below.
Mother                              A  B  C  D  E  F  G  H  I  J
No. of times mother goes to market  2  1  3  3  1  3  2  3  3  2
Find the mode.
Solution: The mode is 3. This means that the most common number of trips to the market among the mothers is three times a week.
Example 2: Find the mode of the following measurements: 20, 15, 20, 14, 18, 15, 6
Solution: The modes are 20 and 15. Hence, the set of data is bimodal.
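Finding the mode of ungrouped data amounts to counting how often each value occurs. A minimal Python sketch, with the hypothetical function name `modes`, returns every value that ties for the highest count and an empty list when each value occurs only once.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often; [] if every value occurs once."""
    if not values:
        raise ValueError("modes requires at least one value")
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so the data set has no mode
    return sorted(v for v, c in counts.items() if c == highest)


market_trips = [2, 1, 3, 3, 1, 3, 2, 3, 3, 2]
measurements = [20, 15, 20, 14, 18, 15, 6]

print(modes(market_trips))  # [3]
print(modes(measurements))  # [15, 20]
```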
Below are the steps in finding the mode of grouped data:
1. Find the modal class. This is the class interval with the highest frequency.
2. Use the formula to find the mode.
Formula: $\hat{X} = LB_{Md} + \left(\frac{f_m - f_a}{2f_m - f_a - f_b}\right) i$
where:
$LB_{Md}$ = lower boundary of the modal class
$f_m$ = frequency of the modal class
$f_a$ = frequency of the class above the modal class in the table
$f_b$ = frequency of the class below the modal class in the table
i = class interval
It is important to note that the formula for the mode given above holds only for unimodal distributions. For multimodal distributions, the rough mode is given by the formula
Let us use the distribution of scores of 40 students in Mathematics to illustrate how to compute the mode for grouped data.
Example 1:
Find the mode of the data whose frequency distribution is given below.
Class Interval   frequency   LB     <cf
16-23            1           15.5   1
24-31            3           23.5   4
32-39            6           31.5   10
40-47            12          39.5   22   (modal class)
48-55            10          47.5   32
56-63            8           55.5   40
N = 40
Formula: $\hat{X} = LB_{Md} + \left(\frac{f_m - f_a}{2f_m - f_a - f_b}\right) i$
$\hat{X} = 39.5 + \left(\frac{12 - 6}{2(12) - 6 - 10}\right) 8$
$\hat{X} = 39.5 + \left(\frac{6}{24 - 16}\right) 8$
$\hat{X} = 45.5$
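Example 2:
Class Interval   frequency
40-44            3
45-49            10
50-54            13   (modal class)
55-59            9
60-64            8
65-69            7
N = 50
Formula: $\hat{X} = LB_{Md} + \left(\frac{f_m - f_a}{2f_m - f_a - f_b}\right) i$
$\hat{X} = 49.5 + \left(\frac{13 - 10}{2(13) - 10 - 9}\right) 5$
$\hat{X} = 49.5 + \left(\frac{3}{26 - 19}\right) 5$
$\hat{X} \approx 49.5 + 2.14$
$\hat{X} \approx 51.64$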
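The grouped mode formula can be sketched in Python as follows. The function name `grouped_mode` is hypothetical. The sketch assumes whole-number class limits (lower boundary = lower limit minus 0.5), equal class widths, a single modal class, and a neighbor frequency of 0 when the modal class is the first or last class. Substituting the lower boundary 49.5 and class interval 5 of the second table gives about 51.64, which lies inside the modal class 50-54.

```python
def grouped_mode(classes):
    """Mode of grouped data: LB + ((fm - fa) / (2*fm - fa - fb)) * i.

    classes: list of (lower_limit, upper_limit, frequency), in increasing order.
    """
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                       # position of the modal class
    f_m = freqs[m]
    f_a = freqs[m - 1] if m > 0 else 0                # class above (preceding) in the table
    f_b = freqs[m + 1] if m + 1 < len(freqs) else 0   # class below (following) in the table
    lower_boundary = classes[m][0] - 0.5              # assumes whole-number class limits
    width = classes[1][0] - classes[0][0]             # class interval i
    return lower_boundary + (f_m - f_a) / (2 * f_m - f_a - f_b) * width


scores_40 = [(16, 23, 1), (24, 31, 3), (32, 39, 6),
             (40, 47, 12), (48, 55, 10), (56, 63, 8)]
scores_50 = [(40, 44, 3), (45, 49, 10), (50, 54, 13),
             (55, 59, 9), (60, 64, 8), (65, 69, 7)]

print(grouped_mode(scores_40))            # 45.5
print(round(grouped_mode(scores_50), 2))  # 51.64
```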
Characteristics of the mode:
1. The mode is the most appropriate measure of central tendency when the data are nominal in scale.
2. The mode is the least reliable among the three measures of central tendency because its value is undefined in some distributions.
3. The mode is used when we want to find the value which occurs most often.
4. The mode is a quick approximation of the average. The mode is sometimes referred to as an inspection average.
Statistics
Second Quarter
(Week 4)
Example 1:
The midrange is the sum of the lowest value, 35, and the highest value, 160, divided by 2.
$MR = \frac{35 + 160}{2}$
$MR = 97.5$
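As a quick check of the midrange computation, here is a minimal Python sketch. The function name `midrange` is an assumption for illustration; only the lowest and highest values stated in the example are used.

```python
def midrange(values):
    """Midrange: (lowest value + highest value) / 2."""
    if not values:
        raise ValueError("midrange requires at least one value")
    return (min(values) + max(values)) / 2


print(midrange([35, 160]))  # 97.5
```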
Weighted Mean
This is used to find the mean of a data set whose values are not equally represented. The weighted mean is found by multiplying each value by its corresponding weight and dividing the sum of the products by the sum of the weights.
Formula: $\bar{X} = \frac{\sum wX}{\sum w}$
Example:
A recent survey of a new cola reported the following percentages of people who liked the taste. Find the weighted mean of the percentages.
No.   % who liked the taste   Number surveyed
1     40                      1000
2     30                      3000
3     50                      800
Each percentage is weighted by the number of people surveyed:
$\bar{X} = \frac{(40)(1000) + (30)(3000) + (50)(800)}{1000 + 3000 + 800}$
$\bar{X} = \frac{170000}{4800}$
$\bar{X} \approx 35.4\%$
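The weighted mean can be sketched in Python as below. The sketch assumes that the second column of the cola survey holds the percentages (the values) and that the third column holds the number of people surveyed (the weights), so that the result is a weighted mean of the percentages. The function name `weighted_mean` is hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


percent_liked = [40, 30, 50]       # values: percentage who liked the taste
num_surveyed = [1000, 3000, 800]   # weights: assumed to be the number surveyed

print(round(weighted_mean(percent_liked, num_surveyed), 1))  # 35.4
```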
Statistics
Second Quarter
(Week 5)
The previous sections focused on averages or measures of central tendency. The averages are supposed to be the central scores of a given set of data. However, not all features of a given data set may be reflected by the averages. For example, two different groups of 5 students are given identical quizzes in Math. The results are shown below.
Group 1   Group 2
14        5
13        19
18        18
14        14
11        14

           Group 1   Group 2
Mean       14        14
Median     14        14
Mode       14        14
Midrange   14.5      12
The mean, median, and mode of the two groups are identical. But intuitively, the two groups show an obvious difference. Group 2 has more widely scattered data than Group 1. This characteristic, called variability, is not reflected by the averages. The three basic measures of variation are the range, the variance, and the standard deviation.
RANGE
The range is the simplest measure of variation to calculate. It is the difference between the largest and the smallest value in a data set. For Group 1, the range is 18 - 11 = 7. The range for Group 2 is 19 - 5 = 14. A much larger range suggests greater variation or dispersion. The range has the disadvantage of being influenced by extreme values called outliers. Another disadvantage is that it is based on two values only; all the other values in the set are ignored.
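Variance: $s^2 = \frac{n\sum x^2 - \left(\sum x\right)^2}{n(n-1)}$
Standard deviation: $s = \sqrt{\frac{n\sum x^2 - \left(\sum x\right)^2}{n(n-1)}}$
Group 1   Group 2
14        5
13        19
18        18
14        14
11        14
Example 1: From the data in the table above, identify which group is more consistent.
Group 1: $\sum x = 70$ and $\sum x^2 = 1006$, so $s^2 = \frac{5(1006) - 70^2}{5(4)} = \frac{130}{20} = 6.5$ and $s \approx 2.55$.
Group 2: $\sum x = 70$ and $\sum x^2 = 1102$, so $s^2 = \frac{5(1102) - 70^2}{5(4)} = \frac{610}{20} = 30.5$ and $s \approx 5.52$.
Interpretation: The standard deviations of Group 1 and Group 2 are 2.55 and 5.52, respectively. Since the value for Group 1 is smaller than the value for Group 2, we can say that the scores of the 5 members of Group 1 are more consistent than those of Group 2.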
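The range, variance, and standard deviation of the two groups can be verified with a short Python sketch. It uses the shortcut formula with n(n - 1) in the denominator; the function names are assumptions for illustration.

```python
import math


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def sample_variance(values):
    """Sample variance: (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance requires at least two values")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


group1 = [14, 13, 18, 14, 11]
group2 = [5, 19, 18, 14, 14]

print(value_range(group1), value_range(group2))          # 7 14
print(sample_variance(group1), sample_variance(group2))  # 6.5 30.5
print(round(sample_std(group1), 2))                      # 2.55
print(round(sample_std(group2), 2))                      # 5.52
```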
Example: For 108 randomly selected junior high students, the following IQ frequency distribution was obtained. Find the variance and standard deviation.
Class limits   Frequency
90-98          6
99-107         22
108-116        43
117-125        28
126-134        9
Step 1: Make a table. Find the midpoint of each class. Multiply the midpoint by the frequency for each class.
Step 2: Multiply the frequency by the square of the midpoint for each class.
Step 3: Find the sums of the f, fx, and $fx^2$ columns, then substitute them in the formula.
Class limits   f     x     fx      x^2     fx^2
90-98          6     94    564     8836    53016
99-107         22    103   2266    10609   233398
108-116        43    112   4816    12544   539392
117-125        28    121   3388    14641   409948
126-134        9     130   1170    16900   152100
Total          108         12204           1387854
For variance: $s^2 = \frac{n\sum fx^2 - \left(\sum fx\right)^2}{n(n-1)} = \frac{108(1387854) - (12204)^2}{108(108-1)} \approx 82.26$
For standard deviation: $s = \sqrt{82.26} \approx 9.07$
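The three steps for grouped data can be sketched in Python and checked against the IQ distribution. The function name `grouped_variance` is hypothetical. With the sums 12204 and 1387854 shown in the substitution, the quotient comes out to about 82.26.

```python
import math


def grouped_variance(classes):
    """Sample variance of grouped data using class midpoints.

    classes: list of (lower_limit, upper_limit, frequency).
    Formula: (n * sum(f * x^2) - (sum(f * x))^2) / (n * (n - 1)).
    """
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))


iq_table = [(90, 98, 6), (99, 107, 22), (108, 116, 43),
            (117, 125, 28), (126, 134, 9)]

variance = grouped_variance(iq_table)
print(round(variance, 2))             # 82.26
print(round(math.sqrt(variance), 2))  # 9.07
```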
Second Quarter
(Week 6)
COEFFICIENT OF VARIATION
The standard deviation measures absolute variability and not relative variability. It can only be used to compare two samples that have the same unit of measure. A statistic that allows us to compare two data sets that have different units of measurement is called the coefficient of variation. It expresses the standard deviation as a percentage of the mean.
For samples: $CV = \frac{s}{\bar{x}} \cdot 100\%$
For populations: $CV = \frac{\sigma}{\mu} \cdot 100\%$
The data set with the larger CV is more variable.
Example: The average score of the students in one English class is 110, with a standard deviation of 5; the average score of students in a History class is 106, with a standard deviation of 4. Which class is more variable in terms of score?
For English: $CV = \frac{5}{110} \cdot 100\% \approx 4.55\%$
For History: $CV = \frac{4}{106} \cdot 100\% \approx 3.77\%$
Interpretation: Since the coefficient of variation for the English class is larger, the scores in the English class are more variable than the scores in the History class.
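The comparison of the two classes can be reproduced with a minimal Python sketch. The function name `coefficient_of_variation` is an assumption for illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """CV: the standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


english_cv = coefficient_of_variation(5, 110)
history_cv = coefficient_of_variation(4, 106)

print(round(english_cv, 2))     # 4.55
print(round(history_cv, 2))     # 3.77
print(english_cv > history_cv)  # True: the English scores are more variable
```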
MEASURES OF POSITION
There are times when we want to know the position of a value relative to the other observations in a data set. For instance, suppose you took a 100-item test. You might want to know how your score of 88 compares to the scores of the others.
STANDARD SCORES OR Z SCORES
A z score measures the distance between an observation and the mean, measured in units of standard deviation. Suppose that a student got a grade of 78 in her Math test and 55 in her Science test. The scores cannot be compared directly since the exams may not be equivalent in terms of the number of questions, the value of each question, and so on. But the relative positions of the scores can be compared using the z scores.
The standard score is obtained by subtracting the mean from the value or observation and dividing the result by the standard deviation. The formula is
$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \bar{x}}{s}$
If the z score is positive, the score is above the mean. If the z score is 0, the score is the same as the mean. If the z score is negative, the score is below the mean.
Example 1: An IQ test has a mean of 105 and a standard deviation of 20. Find the corresponding z score for each IQ.
a) 88 b) 122 c) 110
Solution:
a) $z = \frac{x - \bar{x}}{s} = \frac{88 - 105}{20} = -0.85$
b) $z = \frac{x - \bar{x}}{s} = \frac{122 - 105}{20} = 0.85$
c) $z = \frac{x - \bar{x}}{s} = \frac{110 - 105}{20} = 0.25$
Example 2: Which of the following exam grades has a better relative position?
A grade of 43 on an Algebra test with $\bar{x}$ = 40 and s = 3
or
A grade of 75 on a Geometry test with $\bar{x}$ = 72 and s = 5?
Solution:
For a grade of 43: $z = \frac{x - \bar{x}}{s} = \frac{43 - 40}{3} = 1$
For a grade of 75: $z = \frac{x - \bar{x}}{s} = \frac{75 - 72}{5} = 0.6$
Interpretation: Since the z score for the Algebra test is larger, the relative position in the Algebra test is higher than the relative position in the Geometry test.
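The z score computations in both examples can be checked with a minimal Python sketch. The function name `z_score` is an assumption for illustration.

```python
def z_score(value, mean, std_dev):
    """Standard score: (value - mean) / standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Example 1: IQ test with mean 105 and standard deviation 20.
for iq in (88, 122, 110):
    print(iq, round(z_score(iq, 105, 20), 2))  # -0.85, 0.85, 0.25

# Example 2: compare relative positions on two different tests.
algebra = z_score(43, 40, 3)
geometry = z_score(75, 72, 5)
print(algebra, geometry)   # 1.0 0.6
print(algebra > geometry)  # True: the Algebra grade has the better position
```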
Second Quarter
(Week 7)
QUARTILE
A quartile is a measure of relative standing. Let x1, x2, ..., xn be a set of n measurements arranged in order of magnitude. The first quartile, Q1, is the value of x that exceeds one-fourth of the measurements and is less than the remaining three-fourths. The second quartile, Q2, is the median. The third quartile, Q3, is the value of x that exceeds three-fourths of the measurements and is less than the remaining one-fourth.
Rules:
When the measurements are arranged in order of magnitude, that is increasing or decreasing.
PERCENTILE
Percentiles are position measures used in educational and health-related fields to indicate the position of an individual in a group. They are symbolized by P1, P2, P3, ..., P99 and divide the distribution into 100 groups.
Solution: Arrange the data in order from lowest to highest. Then compute $c = \frac{np}{100}$, where n is the total number of values and p is the percentile.
$c = \frac{10(25)}{100} = 2.5$, which is rounded up to 3. The 25th percentile is the 3rd value, which is 5.
DECILE
Deciles divide the distribution into tenths or 10 equal parts. A data set has nine deciles, which are denoted by D1, D2, ..., D9. The first decile, D1, is the number that divides the bottom 10% of the data from the top 90%. To obtain the deciles, divide the data set into tenths and then determine the numbers dividing the tenths.
Note that the second quartile, fifth decile, and fiftieth percentile of a data set are all the same and all equal to the median.
Median = Q2 = D5 = P50. Similarly, Q1 = P25, D1 = P10, and Q3 = P75.
Example 6: Find the value corresponding to the 6th decile for the given data set.
Since D6 is equivalent to P60, we use the formula for percentiles.
$c = \frac{np}{100} = \frac{10(60)}{100} = 6$
Since c is a whole number, the value lies halfway between the 6th and 7th values. Halfway between 79 and 80 is (79 + 80)/2 = 79.5. Hence, 79.5 corresponds to the 60th percentile.
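The position rule used in the percentile and decile examples (round c up when it is not a whole number; average the c-th and (c+1)-th values when it is) can be sketched in Python. The data set below is hypothetical, chosen only so that its third ordered value is 5, since the original data sets are not reproduced in this module text. The function names are assumptions, and p is assumed to be a whole number from 1 to 99.

```python
def percentile_value(values, p):
    """Value at the p-th percentile using c = n * p / 100.

    If c is not whole, round up and take that ordered value.
    If c is whole, average the c-th and (c+1)-th ordered values.
    """
    if not 0 < p < 100:
        raise ValueError("p must be between 1 and 99")
    ordered = sorted(values)
    whole, remainder = divmod(len(ordered) * p, 100)  # integer arithmetic avoids float error
    if remainder:
        return ordered[whole]  # rounding c up gives position whole + 1 (index whole)
    return (ordered[whole - 1] + ordered[whole]) / 2


def decile_value(values, d):
    """The d-th decile is the (10 * d)-th percentile."""
    return percentile_value(values, 10 * d)


# Hypothetical data set of 10 values, for illustration only.
data = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

print(percentile_value(data, 25))  # 5 (c = 2.5, rounded up to the 3rd value)
print(decile_value(data, 6))       # 11.0 (c = 6, halfway between the 6th and 7th values)
```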
(Week 8)
GROUPED DATA
For grouped data, the quartiles, deciles, or percentiles can be determined using the following formula:
$LB + \left(\frac{kn - cf}{f}\right) i$
where:
k = $\frac{a}{4}$ for quartiles, $\frac{a}{10}$ for deciles, $\frac{a}{100}$ for percentiles
a = the ath quartile, decile, or percentile
LB = lower boundary of the class that contains the kn-th observation
cf = cumulative frequency of the class preceding that class
f = frequency of that class
i = class interval
Example 7: Find the third quartile, 4th decile, and 70th percentile for the given frequency distribution below.
Class Interval   f    LB     <cf
53-63            6    52.5   6
64-74            12   63.5   18
75-85            25   74.5   43
86-96            28   85.5   71
Solution:
$Q_i = Q_3$: a = 3, k = $\frac{3}{4}$. Since kn = (3/4)(90) = 67.5, the class that contains the 68th observation is the class with interval 86-96.
$Q_3 = LB + \left(\frac{kn - cf}{f}\right) i = 85.5 + \left(\frac{67.5 - 43}{28}\right) 11 \approx 95.1$
$D_i = D_4$: a = 4, k = $\frac{4}{10}$. Since kn = (4/10)(90) = 36, the class that contains the 36th observation is the class with interval 75-85.
$D_4 = LB + \left(\frac{kn - cf}{f}\right) i = 74.5 + \left(\frac{36 - 18}{25}\right) 11 \approx 82.4$
$P_i = P_{70}$: a = 70, k = $\frac{70}{100}$. Since kn = (70/100)(90) = 63, the class that contains the 63rd observation is the class with interval 86-96.
$P_{70} = LB + \left(\frac{kn - cf}{f}\right) i = 85.5 + \left(\frac{63 - 43}{28}\right) 11 \approx 93.4$
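The grouped formula for quartiles, deciles, and percentiles can be sketched in Python. Only the first four classes of the frequency distribution appear in this module text, so the sketch takes the total n = 90 as an explicit argument instead of summing the frequencies; that parameter, the function name `grouped_quantile`, and the whole-number class limits (lower boundary = lower limit minus 0.5) are assumptions made for illustration.

```python
def grouped_quantile(classes, n, k):
    """Quantile of grouped data: LB + ((k*n - cf) / f) * i.

    classes: list of (lower_limit, upper_limit, frequency), in increasing order.
    n: total number of observations.
    k: a/4 for quartiles, a/10 for deciles, a/100 for percentiles.
    """
    target = k * n
    width = classes[1][0] - classes[0][0]  # class interval i
    cf_before = 0                          # cumulative frequency before the current class
    for lower, _, f in classes:
        if cf_before + f >= target:        # class containing the (k*n)-th observation
            return (lower - 0.5) + (target - cf_before) / f * width
        cf_before += f
    raise ValueError("target position lies beyond the classes supplied")


# The four classes visible in the table; n = 90 is taken from the solution.
visible_classes = [(53, 63, 6), (64, 74, 12), (75, 85, 25), (86, 96, 28)]

print(round(grouped_quantile(visible_classes, 90, 3 / 4), 1))     # 95.1
print(round(grouped_quantile(visible_classes, 90, 4 / 10), 1))    # 82.4
print(round(grouped_quantile(visible_classes, 90, 70 / 100), 1))  # 93.4
```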