Unit 4: Descriptive Statistics



Introduction
In the previous units, statistics was described as having two branches:
descriptive statistics and inferential statistics. Descriptive statistics refers to
the methods of collecting, classifying, graphing, and averaging data with the
objective of simply describing the properties or characteristics of the data on
hand. Thus, the task of the statistician in this area is to select a few procedures,
do some averaging, and eventually identify significant features of the given data.
This unit focuses on the commonly used descriptive statistics: measures of central
tendency, measures of position, measures of variation, and measures of skewness.
Both manual computation and the use of digital technology are introduced in
this unit.

Learning Outcomes
At the end of this unit, you are expected to:
1. Determine when to use the appropriate measure of central tendency,
position, variability, and skewness.
2. Compute and interpret the mean, median, mode, quartile, decile,
percentile, range, quartile deviation, mean deviation, variance, standard
deviation, and skewness of a given set of data.
3. Use digital technology to compute measures of central tendency, position,
variation, and skewness.


Activating Prior Knowledge


Teacher Juan needs help! Who should be the valedictorian of the class, Maria or
Peter? Write your answer on the box provided and show your solution or reasons
for your decision. Note: There should be only one valedictorian.


Topic 1. Measures of Central Tendency

Learning Objectives
At the end of the lesson, you are expected to:
1. Calculate the measures of central tendency such as mean, median, and
mode.
2. Provide a sound interpretation of these measures.
3. Discuss the properties of these measures.

Presentation of Content
Measures of Central Tendency
A measure of central tendency is a value at the center or middle of a data
set.
‘The Average’
There are three common measures of central tendency: mean, median, and
mode.
Mean
The most reliable and the most sensitive measure of central tendency.
It is the most widely used measure.
It is commonly known as the “average” although the median and the mode
are also known as averages.
Mean for Ungrouped Data
It comes in two forms: 1) Simple Mean and 2) Weighted Mean
Simple Mean
The simple mean is obtained by adding all the values or observations of a certain
variable and dividing the sum by the total number of values, cases, or observations.
Formula:
$$\bar{x} = \frac{\text{sum of all the values in the distribution}}{\text{number of values in the distribution}} = \frac{\sum x}{n}$$


Example: The scores of 15 students in a 25-item test are 15, 18, 17, 16, 19, 21,
18, 23, 24, 18, 16, 17, 20, 21, and 19. Determine the mean score of the
students.

Solution:
$$\bar{x} = \frac{\sum x}{n}$$
$$\bar{x} = \frac{15 + 18 + 17 + 16 + 19 + 21 + 18 + 23 + 24 + 18 + 16 + 17 + 20 + 21 + 19}{15} = \frac{282}{15}$$
$$\bar{x} = 18.80$$
The performance of the class/group (15 students) in the 25-item test can be
represented by their mean score which is 18.80.
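
The computation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and are not part of the module.

```python
# Scores of the 15 students in the 25-item test (from the example above)
scores = [15, 18, 17, 16, 19, 21, 18, 23, 24, 18, 16, 17, 20, 21, 19]

# Simple mean: sum of all the values divided by the number of values
total = sum(scores)
n = len(scores)
simple_mean = total / n

print(total, n, round(simple_mean, 2))
```

The same pattern works for any ungrouped data set, such as the daily sales in the exercise that follows.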

Solve and Interpret!


The daily sales of ABC Enterprises for the first 7 days of a certain month are
shown below.
Php 5286, Php 10826, Php 2580, Php 6386, Php 4560, Php 3635, Php 8625
Determine the daily mean sales of the store for the first seven days.

Weighted Mean
Formula:
$$\bar{x} = \frac{\sum xw}{\sum w}$$
where
x = the item value
w = the weight associated with x
Example: The following represents the final grades obtained by a CTEd
student on one summer term:
Chemistry (3 units) ---------88
Statistics (5 units) ---------93
NSTP (2 units) ---------89
Find the weighted average of the student.


Solution:
$$\bar{x} = \frac{\sum xw}{\sum w}$$
$$\bar{x} = \frac{93(5) + 88(3) + 89(2)}{10} = \frac{907}{10}$$
$$\bar{x} = 90.7$$
The student’s performance for the summer term can be represented by the
average/mean of his grades in the three subjects which is 90.7.

Example: The following represents the responses of 50 randomly chosen
respondents in one item of a research questionnaire:
Very Strongly Agree (5) - - - 17
Strongly Agree (4) - - - 11
Agree (3) - - - 9
Disagree (2) - - - 12
Strongly Disagree (1) - - - 1
Find the weighted response of the respondents.

Solution:
$$\bar{x} = \frac{\sum xw}{\sum w}$$
$$\bar{x} = \frac{5(17) + 4(11) + 3(9) + 2(12) + 1(1)}{50} = \frac{181}{50}$$
$$\bar{x} = 3.62 \text{ (Strongly Agree)}$$

Table of Interpretation for a 5 point Likert Scale


4.20 – 5.00 Very Strongly Agree
3.40 – 4.19 Strongly Agree
2.60 – 3.39 Agree
1.80 – 2.59 Disagree
1.00 – 1.79 Strongly Disagree
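
Both weighted-mean examples above can be reproduced with one small function. The sketch below is illustrative: the names `weighted_mean`, `LIKERT_BANDS`, and `interpret_likert` are assumptions introduced here, and the bands are copied from the interpretation table above.

```python
def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# (lower bound of the band, verbal interpretation), highest band first
LIKERT_BANDS = [
    (4.20, "Very Strongly Agree"),
    (3.40, "Strongly Agree"),
    (2.60, "Agree"),
    (1.80, "Disagree"),
    (1.00, "Strongly Disagree"),
]


def interpret_likert(score):
    """Map a weighted mean on the 5-point scale to its verbal interpretation."""
    for lower, label in LIKERT_BANDS:
        if score >= lower:
            return label
    raise ValueError("score is below the scale minimum of 1.00")


# Grades weighted by units: Chemistry, Statistics, NSTP
grade_mean = weighted_mean([88, 93, 89], [3, 5, 2])

# Likert values weighted by the number of respondents
likert_mean = weighted_mean([5, 4, 3, 2, 1], [17, 11, 9, 12, 1])

print(round(grade_mean, 2), round(likert_mean, 2), interpret_likert(likert_mean))
```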


Mean for Grouped Data


Grouped data are data presented in a frequency distribution table (FDT).
Formula:
$$\bar{x} = \frac{\sum fx}{n}$$
where
f = the frequency of each class
x = the midpoint of each class
n = the total number of frequencies or sample size

Steps in getting the mean of a grouped data


1. Get the midpoint of each class
2. Multiply each midpoint by its corresponding frequency
3. Get the sum of the products in step 2
4. Divide the sum obtained in step 3 by the total number of frequencies. The
result shall be rounded off to two decimal places.

Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute the value of the mean.

Solution:
Step 1. Get the midpoint for each class. The midpoints are shown in the 3rd
column.


Step 2. Multiply each midpoint by its corresponding frequency. The products are
shown in the 4th column.

Step 3. Get the sum of the products in step 2.

Step 4. Divide the result in step 3 by the sample size. The result is the mean of the
distribution. Hence,
$$\bar{x} = \frac{\sum fx}{n}$$
$$\bar{x} = \frac{3{,}174}{60}$$
$$\bar{x} = 52.90$$

The performance of the class/group (60 students) in a statistics class can be
represented by their mean score which is 52.90.
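
The four steps can be traced in Python. The frequency table for this example is not reproduced in the text, so the class limits and frequencies below are an assumption: they were chosen to be consistent with the figures quoted in this unit (n = 60, a class width of 12, and ∑fx = 3,174). Treat them as illustrative, not as the module's actual table.

```python
# ASSUMPTION: class limits and frequencies reconstructed to match the totals
# quoted in the text; the original table is not available here.
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]

# Step 1: midpoint of each class
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Step 2: multiply each midpoint by its frequency
products = [f * x for f, x in zip(freqs, midpoints)]

# Step 3: sum of the products
sum_fx = sum(products)

# Step 4: divide by the total number of frequencies
n = sum(freqs)
grouped_mean = sum_fx / n

print(n, sum_fx, round(grouped_mean, 2))
```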

Solve! Consider the frequency distribution of the ages of 75 mayors. Compute
the mean age of the mayors.


The Mean is used…


For interval and ratio measurements
When there are no extreme values in a distribution since it is easily
affected by extremely high or extremely low scores
When higher statistical computations are wanted
When the greatest reliability of the measure of central tendency is wanted
since its computations include all the given values
---------------------------------------------------------------------------------------------------
Median
A positional measure that divides the set of data exactly into two parts.
It is the score/observation that is centrally located between the highest and
the lowest observation.
Determined by rearranging the data into an array.

Median for Ungrouped Data


How to solve for the median?
To find the median, first sort the values (arrange them in order), then follow one
of these procedures:
1. If the number of values is odd, the median is the number located in the
exact middle of the list.
2. If the number is even, the median is found by computing the mean of the
two middle scores.

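Formula:
$$\tilde{x} = x_{\left(\frac{n+1}{2}\right)} \quad \text{if } n \text{ is odd}$$
$$\tilde{x} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} \quad \text{if } n \text{ is even}$$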

Example: The following are the daily wages of seven (7) employees of a
certain food chain in Tuguegarao City:
Php108, Php120, Php154, Php118, Php125, Php164, Php135
Determine the median wage of these employees.

Solution:
The value of n = 7, which is odd. Arrange the values in order of magnitude and use
$\tilde{x} = x_{\left(\frac{n+1}{2}\right)}$:
Php108, Php118, Php120, Php125, Php135, Php154, Php164
$$\tilde{x} = x_{\left(\frac{7+1}{2}\right)} = x_4$$
The 4th value in the array is Php125.
$$\tilde{x} = 125$$

Example: The following values are the number of students of the first 8 classes
in a certain college taken for inspection:
21, 25, 26, 30, 36, 39, 42, 55
Determine the median.

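Solution:
The values are already arranged in order of magnitude. Since n = 8 is even,
we use
$$\tilde{x} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$$
21, 25, 26, 30, 36, 39, 42, 55
$$\tilde{x} = \frac{x_{\left(\frac{8}{2}\right)} + x_{\left(\frac{8}{2}+1\right)}}{2} = \frac{x_4 + x_5}{2}$$
$$\tilde{x} = \frac{30 + 36}{2}$$
$$\tilde{x} = 33$$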
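
The odd and even cases can be sketched as a single Python function. The function name `median_ungrouped` is an assumption made for illustration; it sorts the data first and then applies the two rules stated above.

```python
def median_ungrouped(values):
    """Median of ungrouped data: middle value if n is odd,
    mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("values must not be empty")
    mid = n // 2
    if n % 2 == 1:
        # ordered[mid] is the ((n + 1) / 2)th value when counting from 1
        return ordered[mid]
    # ordered[mid - 1] and ordered[mid] are the (n/2)th and (n/2 + 1)th values
    return (ordered[mid - 1] + ordered[mid]) / 2


wages = [108, 120, 154, 118, 125, 164, 135]        # n = 7 (odd)
class_sizes = [21, 25, 26, 30, 36, 39, 42, 55]     # n = 8 (even)

print(median_ungrouped(wages), median_ungrouped(class_sizes))
```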


Median for Grouped Data


Formula:
$$\tilde{x} = x_{lb} + \left(\frac{\frac{n}{2} - cumf_b}{f_m}\right)c$$

where
$x_{lb}$ = lower boundary of the median class
$n$ = sample size
$cumf_b$ = cumulative frequency before the median class
$f_m$ = the frequency of the median class
$c$ = class width

Steps in getting the median of grouped data

1. Solve for $\frac{n}{2}$.
2. Determine the value of $cumf_b$.
3. Determine the median class.
4. Determine the lower boundary and the frequency of the median class and
the size of the class interval.
5. Substitute the values obtained in steps 1-4 into
$$\tilde{x} = x_{lb} + \left(\frac{\frac{n}{2} - cumf_b}{f_m}\right)c$$

How to determine $cumf_b$ and the median class?

• Construct a less-than cumulative frequency ($<cumf$) column.
• Divide the total number of frequencies by 2 to get $\frac{n}{2}$.
• The value $\frac{n}{2}$ is used to determine the cumulative frequency before
the median class, denoted by $cumf_b$.
• $cumf_b$ is the highest value under the $<cumf$ column that is less
than $\frac{n}{2}$.
• The median class is the interval that contains the median, that is, the
interval where the $\left(\frac{n}{2}\right)^{th}$ value is located.
• Among the entries under the $<cumf$ column that are greater than $\frac{n}{2}$, the
smallest one marks the median class. The frequency of that class is $f_m$.
• If a distribution contains an interval whose cumulative frequency is
exactly $\frac{n}{2}$, then the upper boundary of that class is the median and no
interpolation is needed.

Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute the value of the median.

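Solution (step by step):

1. $\frac{n}{2} = \frac{60}{2} = 30$
2. $cumf_b = 19$
3. median class: 47 − 58
4. $x_{lb} = 46.5$; $f_m = 19$; $c = 12$
5. Substitute into the formula:
$$\tilde{x} = x_{lb} + \left(\frac{\frac{n}{2} - cumf_b}{f_m}\right)c$$
$$\tilde{x} = 46.5 + \left(\frac{\frac{60}{2} - 19}{19}\right)12$$
$$\tilde{x} = 53.45$$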
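
The procedure can be sketched in Python as follows. Two assumptions apply: the frequency table is reconstructed to agree with the values quoted in the solution ($cumf_b = 19$, $f_m = 19$, $c = 12$, n = 60), and class limits are taken to be whole numbers so that each lower boundary is the lower limit minus 0.5. The function name `grouped_median` is illustrative.

```python
def grouped_median(classes, freqs):
    """Median of grouped data by interpolation within the median class.

    classes: list of (lower_limit, upper_limit) with whole-number limits
    freqs:   frequency of each class, in the same order
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("the distribution is empty")
    half = n / 2
    cum_before = 0  # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum_before + f >= half:
            lower_boundary = lower - 0.5
            width = upper - lower + 1
            return lower_boundary + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("median class not found")


# ASSUMPTION: table reconstructed to match the figures quoted in the text
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]

median_score = grouped_median(classes, freqs)
print(round(median_score, 2))
```

Because the comparison uses `>=`, a class whose cumulative frequency is exactly n/2 returns its upper boundary, which matches the rule that no interpolation is needed in that case.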


Solve! Consider the frequency distribution of the ages of 75 mayors. Compute
the median of the mayors’ age.

The Median is used…


For ordinal and ranked measurements
When there are extreme values, thus the distribution is markedly skewed
For an open-ended distribution; that is, the lowest or the highest class
interval or both are not fully defined (e.g., 50 and below or 100 and above)
When one desires to know whether the cases fall within the upper half
or the lower half of a distribution.

---------------------------------------------------------------------------------------------------
Mode
The most popular score.
The score having the highest frequency.
The most frequently occurring score.
The least reliable measure of central tendency.

Mode for Ungrouped Data


How to solve for the mode?
The value of the mode can be obtained through inspection, thus, no
computation is needed.
The mode may or may not exist (no mode). If it exists, there can be more
than one value.


A set of data is said to be


Unimodal or monomodal if it has only one mode.
Example: 33, 35, 35, 38, 40, 46
Its mode is 35.
Bimodal if it has two modes.
Example: 33, 35, 35, 38, 40, 40, 46
Its modes are 35 and 40.
Multimodal if it has more than two modes.
Example: 33, 35, 35, 38, 40, 40, 46, 46, 51, 58, 58, 60
Its modes are 35, 40, 46 and 58.
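
Finding the mode by inspection amounts to counting how often each value occurs. The sketch below uses Python's `collections.Counter`. The function name `modes` is illustrative, and treating a data set in which every value occurs exactly once as having no mode is a convention assumed here.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes; an empty list means 'no mode'."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    highest = max(counts.values())
    # ASSUMPTION: if no value repeats (and there are several values), report no mode
    if highest == 1 and len(counts) > 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


unimodal = modes([33, 35, 35, 38, 40, 46])
bimodal = modes([33, 35, 35, 38, 40, 40, 46])
multimodal = modes([33, 35, 35, 38, 40, 40, 46, 46, 51, 58, 58, 60])

print(unimodal, bimodal, multimodal)
```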

Mode for Grouped Data


Formula:
$$\hat{x} = x_{lb} + \left(\frac{d_1}{d_1 + d_2}\right)c$$
where
$x_{lb}$ = lower boundary of the modal class
$d_1$ = difference of the frequency of the modal class and the
frequency of the interval preceding the modal class
$d_2$ = difference of the frequency of the modal class and the
frequency of the interval after the modal class
$c$ = class width

Steps in getting the mode of grouped data

1. Determine the modal class. The modal class is the interval that contains
the highest frequency in the distribution.
2. Get the value of $d_1$.
3. Get the value of $d_2$.
4. Get the lower boundary of the modal class ($x_{lb}$).
5. Apply the formula by substituting the values obtained in the preceding
steps.
$$\hat{x} = x_{lb} + \left(\frac{d_1}{d_1 + d_2}\right)c$$

Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute the value of the mode.

Solution (step by step):

1. modal class: 47 − 58
2. $d_1 = 19 - 11 = 8$
3. $d_2 = 19 - 14 = 5$
4. $x_{lb} = 46.5$
5. Apply the formula by substituting the values obtained in the preceding
steps.
$$\hat{x} = x_{lb} + \left(\frac{d_1}{d_1 + d_2}\right)c$$
$$\hat{x} = 46.5 + \left(\frac{8}{8 + 5}\right)12$$
$$\hat{x} = 53.88$$
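
A minimal Python sketch of the same steps follows. The frequency table is an assumption reconstructed from the figures in this unit; the result depends only on the modal class and its two neighbors (frequencies 11, 19, and 14), which the solution above does state. Whole-number class limits are assumed, and the name `grouped_mode` is illustrative.

```python
def grouped_mode(classes, freqs):
    """Mode of grouped data: x_lb + (d1 / (d1 + d2)) * c."""
    # Step 1: modal class = class with the highest frequency (first one if tied)
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    f_modal = freqs[i]
    f_before = freqs[i - 1] if i > 0 else 0
    f_after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = f_modal - f_before   # step 2
    d2 = f_modal - f_after    # step 3
    if d1 + d2 == 0:
        raise ValueError("d1 + d2 is zero; the formula is undefined")
    lower, upper = classes[i]
    lower_boundary = lower - 0.5   # step 4
    width = upper - lower + 1
    return lower_boundary + d1 / (d1 + d2) * width   # step 5


# ASSUMPTION: table reconstructed to match the figures quoted in the text
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]

mode_score = grouped_mode(classes, freqs)
print(round(mode_score, 2))
```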

Solve! Consider the frequency distribution of the ages of 75 mayors. Compute
the mode of the mayors’ age.


The Mode is used…


For nominal and categorical data
When a rough or quick estimate of a central value is wanted
When the most popular or the most typical case or value in a distribution
is wanted

---------------------------------------------------------------------------------------------------
Comparison of Mean, Median, and Mode

Mean
The mean always exists in any distribution. This implies that for any set of
data, the mean can always be computed.
The value of the mean in any distribution is unique. This implies that for
any distribution, there is only one possible value of the mean.
In the computation for this measure, it takes into consideration all the
values in the distribution.

Median
Like the mean, the median also exists in any distribution.
The value of the median is also unique.
This is a positional measure.

Mode
It does not always exist.
If the mode exists, it is not always unique.
In determining the value of the mode, it does not take into account all the
values in the distribution.


Application
Solve the given problem below. Show all pertinent solutions on the box provided.

The hourly rates (in pesos) of the 10 employees of a certain company were
recorded and are shown below.
44.17, 44.17, 38, 39.25, 18, 15, 57.17, 65.25, 44.17, 39.5

1. Determine the value of the mean.

Solution box

2. Determine the value of the median.

Solution box

3. Determine the value of the mode.

Solution box


4. If the value 39.25 was recorded in error and the actual value is 49.25,
which measure(s) of central tendency will be affected? Support your
answer.

Solution/Reasoning box

The ages of 210 qualified voters in a certain barangay were taken and are shown
below. Compute the value of the mean, median, and mode. Show all your pertinent
solutions on the box provided.

Solution box for Mean


Solution box for Median

Solution box for Mode

Feedback
Solve the following problems. Show all pertinent solutions on the box provided.

The NCEE scores of 12 students in a certain college were taken and are shown
below.
93, 65, 87, 56, 99, 76, 58, 87, 76, 93, 68, 69

1. Determine the value of the mean.

Solution box


2. Determine the value of the median.

Solution box

3. Determine the value of the mode.

Solution box

The daily mean sales of ABC Store for the month of November were computed
at Php 5,386.65. When the figures were reviewed, it was found that on
November 21 the actual sales were Php 6,389 but were erroneously recorded as
Php 3,689. Assuming that the store was open for 20 days in November,
determine the store’s correct daily mean sales for November.

Solution box


The results of an IQ test of a group of students in a certain college were taken
and are presented in a frequency distribution.

Compute the value of the mean.

Compute the value of the median.

Compute the value of the mode.


Topic 2. Measures of Variation

Learning Objectives
At the end of the lesson, you are expected to:
1. Calculate the measures of variation such as range, semi inter-quartile
range or quartile deviation, mean deviation, variance, and the standard
deviation.
2. Provide a sound interpretation of these measures.
3. Use digital technology in calculating measures of variation.

Presentation of Content
Measures of Variation

It is used to describe the degree to which scores or observations are
scattered or dispersed.
It is used to determine the degree of consistency and homogeneity of
scores.

Comparing Measures of Central Tendency and Measures of Variation

Central tendency describes the central point of the distribution, and
variability describes how the scores are scattered around that central point.
Together, central tendency and variability are the two primary values that
are used to describe a distribution of scores.

They have the same mean (C+), yet they are shaped differently.

Question: Who has a better class performance, Cruz or Perez?


Answer: Professor Cruz (red line). Though the two classes have the same mean, it
shows in the graph that the class of Professor Cruz is less scattered around the
mean.


Five measures of variation

Measure of Variation                               Symbol
Range                                              $R$
Semi inter-quartile range or Quartile deviation    $Q$
Mean deviation                                     $MD$
Variance                                           $s^2$
Standard deviation                                 $s$

How to interpret?

The lesser the value of the measure, the more consistent, the more
homogeneous and the less scattered are the observations in the set of data.

Range
The range is the simplest measure of variation to calculate.
It is the difference between the largest and the smallest value in a given
data set.
A much larger range suggests greater variation or dispersion.

Range for Ungrouped Data


Formula:
$$R = HO - LO$$

where
R = range value
HO = highest observation
LO = lowest observation

Example:
Find the range of each of the two groups of scores.
$$R_A = HO - LO = 35 - 10 = 25$$
$$R_B = HO - LO = 30 - 15 = 15$$
$$R_B < R_A$$
This implies that the scores in group B are less scattered than the
scores in group A.


Range in Microsoft Excel

Range for Grouped Data


Formula:
$$R = HO_{UB} - LO_{LB}$$

where
R = range value
$HO_{UB}$ = upper boundary of the highest class interval
$LO_{LB}$ = lower boundary of the lowest class interval


Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute the value of the range.

Solution:
$$R = HO_{UB} - LO_{LB}$$
$$R = 94.5 - 10.5$$
$$R = 84$$

Properties of Range
The value is always affected by extreme values.
In the process of computing the value of the range, not all values are
considered.
The range does not consider the variation of the items relative to the
central value of the distribution.
---------------------------------------------------------------------------------------------------
Semi Inter-Quartile Range or Quartile Deviation
It indicates the distance we need to go above and below the median to include the
middle 50% of the scores. It is based on the range of the middle 50% of the
scores, instead of the range of the entire set.
Semi Inter-Quartile Range or Quartile Deviation for Ungrouped Data
Formula:
$$Q = \frac{Q_3 - Q_1}{2}$$

where
$Q$ = value of the quartile deviation
$Q_3$ = value of the 3rd quartile
$Q_1$ = value of the 1st quartile


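Example: Using the given data 6, 8, 10, 12, 12, 14, 15, 16, 20, find the quartile
deviation.

$$Q_1 = \left[\frac{1}{4}(9) + \left(1 - \frac{1}{4}\right)\right]^{th} \text{score} = \left[\frac{12}{4}\right]^{th} \text{score}$$
$$Q_1 = 3^{rd} \text{ score}$$
$$Q_1 = 10$$

$$Q_3 = \left[\frac{3}{4}(9) + \left(1 - \frac{3}{4}\right)\right]^{th} \text{score} = \left[\frac{28}{4}\right]^{th} \text{score}$$
$$Q_3 = 7^{th} \text{ score}$$
$$Q_3 = 15$$

$$Q = \frac{Q_3 - Q_1}{2}$$
$$Q = \frac{15 - 10}{2}$$
$$Q = 2.5$$
The quartile deviation is 2.5: going about 2.5 units above and below the median
covers roughly the middle 50% of the scores.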
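
The position rule used above can be sketched in Python. The helper `score_at_position` is a hypothetical name introduced here. It computes the position as 1 + k(n − 1)/parts, which is algebraically the same as (k/parts)n + (1 − k/parts), and interpolates between neighboring scores when the position is not a whole number.

```python
def score_at_position(values, k, parts):
    """Value at the [(k/parts) * n + (1 - k/parts)]th position of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("values must not be empty")
    position = 1 + k * (n - 1) / parts   # 1-based, may be fractional
    whole = int(position)
    fraction = position - whole
    if whole >= n:                       # position is the last score
        return float(ordered[-1])
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


data = [6, 8, 10, 12, 12, 14, 15, 16, 20]

q1 = score_at_position(data, 1, 4)   # 3rd score
q3 = score_at_position(data, 3, 4)   # 7th score
quartile_deviation = (q3 - q1) / 2

print(q1, q3, quartile_deviation)
```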

Quartile Deviation in Microsoft Excel


Semi Inter-Quartile Range or Quartile Deviation for Grouped Data


Formula:
$$Q = \frac{Q_3 - Q_1}{2}$$

where
$Q$ = value of the quartile deviation
$Q_3$ = value of the 3rd quartile
$Q_1$ = value of the 1st quartile


Example: Suppose the performance of 100 faculty members of a certain college
was measured and is presented in a frequency distribution as follows:

Solution:
$$Q_1 = x_{lb} + \left(\frac{\frac{n}{4} - cumf_b}{f_{Q_1}}\right)c$$
$$Q_1 = 78.5 + \left(\frac{\frac{100}{4} - 13}{13}\right)4$$
$$Q_1 = 82.19$$

$$Q_3 = x_{lb} + \left(\frac{\frac{3n}{4} - cumf_b}{f_{Q_3}}\right)c$$
$$Q_3 = 90.5 + \left(\frac{\frac{3(100)}{4} - 69}{19}\right)4$$
$$Q_3 = 91.76$$

$$Q = \frac{Q_3 - Q_1}{2}$$
$$Q = \frac{91.76 - 82.19}{2}$$
$$Q \approx 4.79$$
The larger the value of Q, the more dispersed the scores in the middle 50% of
the distribution. On the other hand, if Q is small, the scores in the middle 50%
of the distribution are less dispersed. The dispersion is measured around the
median.


Mean Deviation
Mean Deviation for Ungrouped Data
Formula:
$$MD = \frac{\sum |x - \bar{x}|}{n}$$

where
$MD$ = mean deviation value
$x$ = individual score
$\bar{x}$ = sample mean
$n$ = number of cases

Steps:
1. Solve for the mean.
2. Subtract the mean from each score.
3. Take the absolute value of each difference in step 2.
4. Solve for the mean deviation using the formula:
$$MD = \frac{\sum |x - \bar{x}|}{n}$$

Example: Find the mean deviation of the scores of 10 students in a
Mathematics test. Given the scores: 35, 30, 26, 24, 20, 18, 18, 16, 15, 10.

$$\bar{x} = \frac{\sum x}{n}$$
$$\bar{x} = \frac{212}{10}$$
$$\bar{x} = 21.2$$
$$MD = \frac{\sum |x - \bar{x}|}{n}$$
$$MD = \frac{60.4}{10}$$
$$MD = 6.04$$
The mean deviation of the 10 scores is 6.04. This means that, on
average, the scores deviate from the mean of 21.2 by 6.04 points.
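
The intermediate sum 60.4 is not itemized above, so a short Python sketch may help to verify it. The variable names are illustrative.

```python
scores = [35, 30, 26, 24, 20, 18, 18, 16, 15, 10]
n = len(scores)

# Step 1: mean
mean_score = sum(scores) / n

# Steps 2 and 3: absolute deviation of each score from the mean
abs_deviations = [abs(x - mean_score) for x in scores]

# Step 4: mean deviation
sum_abs = sum(abs_deviations)
mean_deviation = sum_abs / n

print(round(mean_score, 2), round(sum_abs, 2), round(mean_deviation, 2))
```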


Mean Deviation for Grouped Data


Formula:
$$MD = \frac{\sum f|x - \bar{x}|}{n}$$

where
$MD$ = mean deviation value
$x$ = the midpoint of each class
$\bar{x}$ = the mean of the distribution
$n$ = total number of frequencies

Steps:
1. Compute the value of the mean.
2. Get the deviation of each midpoint using the expression $x - \bar{x}$ and take
its absolute value, $|x - \bar{x}|$.
3. Multiply each absolute deviation by its corresponding frequency.
4. Add the results in step 3.
5. Divide the sum in step 4 by n.

Example:
Step 1. Compute the value of the mean.

$$\bar{x} = \frac{\sum fx}{n} = \frac{3174}{60} = 52.90$$


Step 2. Construct the deviation column 𝑥 − 𝑥̅ .

Step 3. Convert the deviations to positive deviations.

Step 4. Multiply the positive deviations by their corresponding frequencies. The
products shall be added and the result shall be divided by the sample size.

$$MD = \frac{\sum f|x - \bar{x}|}{n}$$
$$MD = \frac{750.4}{60}$$
$$MD = 12.51$$


Variance
It is a measure of variability that uses all the data.
It is the average of the squared differences between the observations and the
mean value.

Variance for Ungrouped Data


Formula:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n}$$

where
$x$ = the individual values in the distribution
$\bar{x}$ = the mean of the distribution
$n$ = the sample size

Steps:
1. Compute the value of the mean.
2. Get the deviation of each value from the mean.
3. Square the deviations.
4. Calculate the sum of the squared deviations.
5. Divide the sum by the total number of values.

Example: Compute the value of the variance of the following measurements.
13, 5, 7, 9, 10, 17, 15, 12

Solution:
$$\bar{x} = \frac{\sum x}{n}$$
$$\bar{x} = \frac{88}{8}$$
$$\bar{x} = 11$$

$$s^2 = \frac{\sum (x - \bar{x})^2}{n}$$
$$s^2 = \frac{114}{8}$$
$$s^2 = 14.25$$


Variance in Microsoft Excel

Variance for Grouped Data


Formula:
$$s^2 = \frac{\sum f(x - \bar{x})^2}{n}$$

where
$f$ = frequency of each class interval
$x$ = midpoint of each class interval
$\bar{x}$ = mean
$n$ = sample size

Steps:
1. Compute the value of the mean.
2. Determine the deviation 𝑥 − 𝑥̅ by subtracting the mean from the midpoint
of each class interval.
3. Square the deviations obtained in step 2.
4. Multiply the frequencies by their corresponding squared deviations.
5. Add the results in step 4.
6. Divide the result in step 5 by the sample size.


Example:
$$\bar{x} = \frac{\sum fx}{n} = \frac{3174}{60} = 52.90$$

$$s^2 = \frac{\sum f(x - \bar{x})^2}{n}$$
$$s^2 = \frac{16406.40}{60}$$
$$s^2 = 273.44$$
---------------------------------------------------------------------------------------------------
Standard Deviation
It is the square root of the variance.
It is the most commonly used measure of variation. It indicates how
closely the values of a given data set are clustered around the mean.
A lower value of the standard deviation means that the values of that given
data set are spread over a smaller range around the mean.
On the other hand, a large value of the standard deviation means that the
values of that data set are spread over a larger range around the mean.

Standard Deviation for Ungrouped Data


Formula:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$


Example: Compute the value of the standard deviation of the following
measurements.
13, 5, 7, 9, 10, 17, 15, 12

Solution:
$$\bar{x} = \frac{\sum x}{n}$$
$$\bar{x} = \frac{88}{8}$$
$$\bar{x} = 11$$

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
$$s = \sqrt{\frac{114}{8}}$$
$$s = 3.77$$
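
The variance and standard deviation of these measurements can be checked in Python. The formulas in this unit divide by n, so the sketch compares the manual result with `statistics.pvariance` and `statistics.pstdev`, which also divide by n. The functions `statistics.variance` and `statistics.stdev` divide by n − 1 instead and would give different values.

```python
from statistics import pstdev, pvariance

data = [13, 5, 7, 9, 10, 17, 15, 12]
n = len(data)

# Manual computation following the steps in this unit
mean_value = sum(data) / n
squared_deviations = [(x - mean_value) ** 2 for x in data]
sum_squared = sum(squared_deviations)
variance = sum_squared / n          # divides by n, as in the formula above
std_dev = variance ** 0.5

# Cross-check with the standard library (both divide by n)
library_variance = pvariance(data)
library_std_dev = pstdev(data)

print(sum_squared, variance, round(std_dev, 2))
```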

Standard Deviation in Microsoft Excel


Standard Deviation for Grouped Data


Example:
$$\bar{x} = \frac{\sum fx}{n} = \frac{3174}{60} = 52.90$$

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n}}$$
$$s = \sqrt{\frac{16406.40}{60}}$$
$$s = 16.54$$
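
The grouped-data results in this topic (mean deviation 12.51, variance 273.44, and standard deviation 16.54) can be reproduced together in Python. The frequency table itself is not reproduced in the text, so the table below is an assumption: its frequencies were chosen to agree with the quoted sums ∑fx = 3174, ∑f|x − x̄| = 750.4, and ∑f(x − x̄)² = 16406.40.

```python
from math import sqrt

# ASSUMPTION: table reconstructed to match the sums quoted in the text
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]

midpoints = [(lower + upper) / 2 for lower, upper in classes]
n = sum(freqs)
mean_value = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Sum of f * |x - mean| and sum of f * (x - mean)^2 over all classes
sum_abs = sum(f * abs(x - mean_value) for f, x in zip(freqs, midpoints))
sum_squared = sum(f * (x - mean_value) ** 2 for f, x in zip(freqs, midpoints))

mean_deviation = sum_abs / n
variance = sum_squared / n
std_dev = sqrt(variance)

print(round(mean_deviation, 2), round(variance, 2), round(std_dev, 2))
```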
---------------------------------------------------------------------------------------------------
Coefficient of Variation
The standard deviation measures absolute variability, not relative variability.
It can only be used to compare two samples that have the same unit of measure.
The coefficient of variation allows us to compare two data sets that have
different units of measurement.
It expresses the standard deviation as a percentage of the mean.
The smaller the value of the coefficient of variation, the more homogeneous
the scores in a particular group.
The higher the value of the coefficient of variation, the more dispersed the
scores in a particular distribution.


Formula:
$$CV = \left(\frac{s}{\bar{x}}\right) 100\%$$
where
$\bar{x}$ = mean value
$s$ = standard deviation

Example: The average score of the students in one English class is 110, with a
standard deviation of 5; the average score of students in a Mathematics class is
106, with a standard deviation of 4. Which class is more variable in terms of
score?

Solution:
English class: $CV = \left(\frac{5}{110}\right) 100\% = 4.55\%$
Math class: $CV = \left(\frac{4}{106}\right) 100\% = 3.77\%$

The scores in the Math class are less scattered than the scores in the English
class.
The scores in the English class are more spread out than the scores in the Math
class.
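
The comparison can be sketched in Python as follows. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Return the standard deviation as a percentage of the mean."""
    if mean_value == 0:
        raise ValueError("the mean must not be zero")
    return std_dev / mean_value * 100


cv_english = coefficient_of_variation(5, 110)
cv_math = coefficient_of_variation(4, 106)

# The class with the larger CV is the more variable one
more_variable = "English" if cv_english > cv_math else "Math"

print(round(cv_english, 2), round(cv_math, 2), more_variable)
```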


Application
Solve the following problems. Show all pertinent solutions inside the box.

The numbers of minutes required by a group of 10 college students to finish a
test in physics are:
12, 14, 15, 10, 16, 18, 19, 20, 24, 25

1. Determine the value of range.


Solution box

2. Determine the value of quartile deviation.

Solution box

3. Determine the value of mean deviation.

Solution box

4. Determine the value of variance and standard deviation.

Solution box


Consider the examination results of 60 students in a Statistics class.

Compute the value of range.

Compute the value of quartile deviation


Compute the value of mean deviation.

Compute the value of variance and standard deviation.


Feedback
Solve the following problems. Show all pertinent solutions inside the box.
The following are the hourly rates of 12 employees in a certain fast-food chain in
Tuguegarao City.
Php 26.30, Php 45.25, Php 18.25, Php 13.50, Php 18.60, Php 25.60, Php 55.81,
Php 13.50, Php 13.50, Php 18.25, Php 18.60, Php 25.60

1. Determine the value of range.


Solution box

2. Determine the value of quartile deviation.

Solution box

3. Determine the value of mean deviation.

Solution box

4. Determine the value of variance and standard deviation.

Solution box


A researcher is conducting an investigation on the income of the alumni of
a certain university years after graduation. The monthly incomes of 200
respondents were taken and are presented in a distribution as shown
below.

Compute the value of range.

Compute the value of quartile deviation


Compute the value of mean deviation.

Compute the value of variance and standard deviation.


Topic 3. Measures of Position (Quantiles)

Learning Objectives
At the end of the lesson, you are expected to:
1. Calculate the measures of position (quantiles) such as quartile, decile, and
percentile.
2. Provide a sound interpretation of these measures.
3. Use a digital technology in calculating measures of position (quantiles).

Presentation of Content

Quantiles
They are treated as extensions of the concept of the median.
Whereas the median divides the given distribution into two equal parts,
quantiles divide the distribution into:

4 equal parts (quartiles)
10 equal parts (deciles)
100 equal parts (percentiles)

Quartile
Quartiles refer to the values that divide the distribution into four equal
parts. There are three quartiles represented by 𝑄1 , 𝑄2 ,and 𝑄3 .

Quartile for Ungrouped Data


Formula:
$$Q_k = \left[\frac{k}{4}n + \left(1 - \frac{k}{4}\right)\right]^{th} \text{score}$$
where
$k$ = 1, 2, 3
$n$ = number of cases


Example: Using the given data below:
6, 8, 10, 12, 12, 14, 15, 16, 20
Find $Q_1$, $Q_2$, and $Q_3$.

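Solution (for $Q_1$):

$$Q_1 = \left[\frac{1}{4}(9) + \left(1 - \frac{1}{4}\right)\right]^{th} \text{score} = \left[\frac{12}{4}\right]^{th} \text{score}$$
$$Q_1 = 3^{rd} \text{ score}$$
The value of $Q_1$ is 10, which is the 3rd score in the distribution. Therefore, 25% of
the scores are below 10.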

Quartile in Microsoft Excel


Quartile for Grouped Data


Formula:
$$Q_k = x_{lb} + \left(\frac{\frac{kn}{4} - cumf_b}{f_{Q_k}}\right)c$$
where
$x_{lb}$ = lower boundary of the kth quartile class
$cumf_b$ = cumulative frequency before the kth quartile class
$f_{Q_k}$ = frequency of the kth quartile class
$c$ = class width

Steps:
To compute the value of $Q_k$, follow the procedure used in computing the value of
the median.
1. Get $\frac{kn}{4}$.
2. Get the value of the cumulative frequency before the kth quartile class.
3. Determine the kth quartile class.
4. Determine the lower boundary of the kth quartile class.
5. Get the frequency of the kth quartile class.
6. Substitute all the values in the formula:
$$Q_k = x_{lb} + \left(\frac{\frac{kn}{4} - cumf_b}{f_{Q_k}}\right)c$$

Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute $Q_1$, $Q_2$, and $Q_3$.

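STEPS (for $Q_1$):
1. $\frac{kn}{4} = \frac{(1)(60)}{4} = 15$
2. $cumf_b = 8$
3. 1st quartile class: 35 − 46
4. $x_{lb} = 34.5$; $f_{Q_1} = 11$; $c = 12$
5. Substitute into the formula:
$$Q_k = x_{lb} + \left(\frac{\frac{kn}{4} - cumf_b}{f_{Q_k}}\right)c$$
$$Q_1 = 34.5 + \left(\frac{\frac{(1)(60)}{4} - 8}{11}\right)12$$
$$Q_1 = 42.14$$

Therefore, 25% of the scores of 60 students who took the Statistics test are less
than 42.14.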


Decile
Deciles refer to the values that divide the distribution into 10 equal parts.
There are nine deciles represented by 𝐷1 , 𝐷2 , 𝐷3, . . . 𝐷9

Decile for Ungrouped Data


Formula:
$$D_k = \left[\frac{k}{10}n + \left(1 - \frac{k}{10}\right)\right]^{th} \text{score}$$
where
$k$ = 1, 2, 3, 4, 5, 6, 7, 8, 9
$n$ = number of cases

Example: Using the given data below:
6, 8, 10, 12, 12, 14, 15, 16, 20
Find $D_1, D_2, D_3, \ldots, D_9$.

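Solution (for $D_6$):

$$D_6 = \left[\frac{6}{10}(9) + \left(1 - \frac{6}{10}\right)\right]^{th} \text{score} = \left[\frac{58}{10}\right]^{th} \text{score}$$
$$D_6 = 5.8^{th} \text{ score}$$
The 5.8th score lies between the 5th and 6th scores: it is the 5th score plus 80%
of the difference between the 6th and 5th scores.

$$D_6 = 5^{th} \text{ score} + 0.80\,(6^{th} \text{ score} - 5^{th} \text{ score})$$
$$D_6 = 12 + 0.80(14 - 12)$$
$$D_6 = 13.60$$

Therefore, 60% of the scores in the distribution are less than 13.60.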


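Solution (for $D_9$):

$$D_9 = \left[\frac{9}{10}(9) + \left(1 - \frac{9}{10}\right)\right]^{th} \text{score} = \left[\frac{82}{10}\right]^{th} \text{score}$$
$$D_9 = 8.2^{th} \text{ score}$$
$$D_9 = 8^{th} \text{ score} + 0.20\,(9^{th} \text{ score} - 8^{th} \text{ score})$$
$$D_9 = 16 + 0.20(20 - 16)$$
$$D_9 = 16.80$$

Therefore, 90% of the scores in the distribution are less than 16.80.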

Decile for Grouped Data


Formula:
$$D_k = x_{lb} + \left(\frac{\frac{kn}{10} - cumf_b}{f_{D_k}}\right)c$$
where
$x_{lb}$ = lower boundary of the kth decile class
$cumf_b$ = cumulative frequency before the kth decile class
$f_{D_k}$ = frequency of the kth decile class
$c$ = class width

Steps:
To compute the value of $D_k$, follow the procedure used in computing the value of
the median.
1. Get $\frac{kn}{10}$.
2. Get the value of the cumulative frequency before the kth decile class.
3. Determine the kth decile class.
4. Determine the lower boundary of the kth decile class.
5. Get the frequency of the kth decile class.
6. Substitute all the values in the formula:

\[ D_k = x_{lb} + \left( \frac{\frac{kn}{10} - cumf_b}{f_{D_k}} \right) c \]
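
The steps above can be sketched as a short Python routine that scans a frequency table for the decile class. The function name and the table layout are hypothetical. Only the rows for 35−46 and 47−58, and the total of 60, are taken from the worked examples in this lesson; the remaining rows are made-up values chosen so that the cumulative frequencies agree with those examples, because the original table is not reproduced in this text.

```python
def grouped_quantile_from_table(k, classes, parts=10):
    """classes: (lower_limit, upper_limit, frequency) tuples in ascending order."""
    n = sum(freq for _, _, freq in classes)
    target = k * n / parts                      # step 1: kn/10
    cum_before = 0                              # step 2: cumulative frequency before the class
    for lower, upper, freq in classes:
        if cum_before + freq >= target:         # step 3: first class that reaches kn/10
            lower_boundary = lower - 0.5        # step 4: assumes whole-number class limits
            class_size = upper - lower + 1
            # steps 5-6: substitute into the formula
            return lower_boundary + (target - cum_before) / freq * class_size
        cum_before += freq
    raise ValueError("k must be between 1 and parts - 1")


table = [
    (11, 22, 3),    # hypothetical row
    (23, 34, 5),    # hypothetical row
    (35, 46, 11),   # from the worked examples
    (47, 58, 19),   # from the worked examples
    (59, 70, 14),   # hypothetical row
    (71, 82, 6),    # hypothetical row
    (83, 94, 2),    # hypothetical row
]
d3 = grouped_quantile_from_table(3, table)
print(round(d3, 2))  # 45.41
```

The result depends only on the decile class and the cumulative frequency before it, so the made-up rows do not affect the value as long as their totals are unchanged.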


Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute $D_1, D_2, D_3, \ldots, D_9$. (Only $D_3$ is worked out below.)

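STEPS:
1. $\frac{kn}{10} = \frac{3(60)}{10} = 18$
2. $cumf_b = 8$
3. 3rd decile class: 35 − 46
4. $x_{lb} = 34.5$; $f_{D_3} = 11$; $c = 12$
5. Substitute into the formula:

\[ D_k = x_{lb} + \left( \frac{\frac{kn}{10} - cumf_b}{f_{D_k}} \right) c \]

\[ D_3 = 34.5 + \left( \frac{\frac{(3)(60)}{10} - 8}{11} \right) 12 \]

\[ D_3 = 45.41 \]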

Therefore, 30% of the scores of 60 students who took the Statistics test are less
than 45.41.
---------------------------------------------------------------------------------------------------
Percentile
Percentiles refer to the values that divide the distribution into 100 equal
parts. There are 99 percentiles, denoted $P_1, P_2, P_3, \ldots, P_{99}$.

Percentile for Ungrouped Data


Formula:

\[ P_k = \left[ \frac{k}{100}\,n + \left( 1 - \frac{k}{100} \right) \right]^{\text{th}} \text{ score} \]
where
k----------1,2,3,4,5, . . . 99
n----------number of cases
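
The position formula can be regrouped to show what it does. The short derivation below is added for illustration and is not taken from the module: collecting the terms in $k$ shows that the position starts at the 1st score and moves $\frac{k}{100}$ of the way toward the $n$th score.

```latex
\[
\frac{k}{100}\,n + \left( 1 - \frac{k}{100} \right)
  = 1 + \frac{k}{100}\,n - \frac{k}{100}
  = 1 + \frac{k}{100}\,(n - 1)
\]
```

For instance, with $n = 9$ and $k = 99$ the position is $1 + 0.99(8) = 8.92$. The same regrouping applies to quartiles and deciles with 4 or 10 in place of 100.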


Example: Using the given data below:

6, 8, 10, 12, 12, 14, 15, 16, 20

Find $P_1, P_2, P_3, \ldots, P_{99}$. (The solution for $P_{99}$ is shown.)

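Solution: ($P_{99}$)

\[ P_{99} = \left[ \frac{99}{100}(9) + \left( 1 - \frac{99}{100} \right) \right]^{\text{th}} \text{ score} \]

\[ P_{99} = \left[ \frac{892}{100} \right]^{\text{th}} \text{ score} \]

\[ P_{99} = 8.92^{\text{th}} \text{ score} \]

The value of $P_{99}$ is the 8th score plus 92% of the difference between the
9th and 8th scores.

\[ P_{99} = 8^{\text{th}} \text{ score} + 0.92\,(9^{\text{th}} \text{ score} - 8^{\text{th}} \text{ score}) \]

\[ P_{99} = 16 + 0.92(20 - 16) \]

\[ P_{99} = 19.68 \]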

Therefore, 99% of the scores in the distribution are less than 19.68.

Percentile in Microsoft Excel


Percentile for Grouped Data


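Formula:

\[ P_k = x_{lb} + \left( \frac{\frac{kn}{100} - cumf_b}{f_{P_k}} \right) c \]

where
$x_{lb}$ -- lower boundary of the kth percentile class
$cumf_b$ -- cumulative frequency before the kth percentile class
$f_{P_k}$ -- frequency of the kth percentile class
$c$ -- class size
$n$ -- number of cases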

Steps:
To compute the value of $P_k$, follow the procedure used in computing the value of
the median.
1. Get $\frac{kn}{100}$.
2. Get the value of the cumulative frequency before the kth percentile class.
3. Determine the kth percentile class.
4. Determine the lower boundary of the kth percentile class.
5. Get the frequency of the kth percentile class.
6. Substitute all the values in the formula:

\[ P_k = x_{lb} + \left( \frac{\frac{kn}{100} - cumf_b}{f_{P_k}} \right) c \]

Example: Consider the frequency distribution of the examination scores of the
sixty students in a statistics class. Compute $P_{43}$.

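STEPS:
1. $\frac{43n}{100} = \frac{43(60)}{100} = 25.8$
2. $cumf_b = 19$
3. 43rd percentile class: 47 − 58
4. $x_{lb} = 46.5$; $f_{P_{43}} = 19$; $c = 12$
5. Substitute into the formula:

\[ P_k = x_{lb} + \left( \frac{\frac{kn}{100} - cumf_b}{f_{P_k}} \right) c \]

\[ P_{43} = 46.5 + \left( \frac{\frac{(43)(60)}{100} - 19}{19} \right) 12 \]

\[ P_{43} = 50.8 \]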

Therefore, 43% of the scores of 60 students who took the Statistics test are less
than 50.8.
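
One way to see why this interpretation holds is to run the formula in reverse: estimate how many scores fall below the computed value, assuming the scores are spread evenly within the percentile class. The Python sketch below is an added illustration; the helper name `count_below` is hypothetical, and the inputs are the values from steps 1 to 4.

```python
def count_below(x, lower_boundary, cum_freq_before, class_freq, class_size):
    """Estimated number of scores below x, assuming an even spread within the class."""
    return cum_freq_before + class_freq * (x - lower_boundary) / class_size


# Values from the worked example: x_lb = 46.5, cumf_b = 19, f = 19, c = 12, n = 60
p43 = 46.5 + ((43 * 60 / 100 - 19) / 19) * 12
below = count_below(p43, 46.5, 19, 19, 12)
print(round(p43, 2), round(below, 1), round(100 * below / 60, 1))  # 50.79 25.8 43.0
```

About 25.8 of the 60 scores, or 43%, are estimated to lie below the computed value, which is the statement made in the conclusion.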


Application
Solve the following problems. Show all pertinent solutions inside the box.

Consider the following measurements:

87, 94, 36, 56, 54, 76, 87, 85, 68, 56, 78, 88

1. Determine the value of Q3.

Solution box

2. Determine the value of D7.

Solution box

3. Determine the value of P78.

Solution box


The efficiency ratings of 155 faculty members of a certain college were taken and
are shown below.

Compute the value of Q1.


Compute the value of D4.

Compute the value of P48.


Feedback
Solve the following problems. Show all pertinent solutions inside the box.

The efficiency ratings of 12 employees of a certain department were taken
and are shown below.

81, 86, 68, 69, 78, 93, 81, 83, 71, 88, 95, 83

1. Determine the value of Q2.

Solution box

2. Determine the value of D9.

Solution box

3. Determine the value of P67.

Solution box


The ages of residents of a certain zone in a barangay were taken and are shown
below.

Compute the value of Q2.


Compute the value of D1.

Compute the value of P86.


Summary
• A measure of central tendency is a location measure that pinpoints the
center or middle value.
• The three common measures of central tendency are the mean, median,
and mode.
• Each measure of central tendency has its own properties that serve as the basis
for determining when to use it appropriately.
• A measure of variation is used to further describe the distribution of the data
set.
• Absolute measures of variation include the range, semi-interquartile range or
quartile deviation, mean deviation, variance, and standard deviation.
• A relative measure of variation is provided by the coefficient of variation.
• There are other measures of location that could further describe the
distribution of the data set.
• Quartiles, deciles, and percentiles are measures of location that divide the
distribution into 4, 10, and 100 equal parts, respectively.

Reflection
Congratulations! You are done with the fourth unit of this module. Now, go back
to the activities and lessons you have taken in this unit and answer the following
questions. Limit your answers for each question to 5 to 10 sentences only.
1. What made you successful/unsuccessful with this unit of the module?
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________

2. Which of the activities in this unit did you enjoy most? Explain.
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________


____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________

3. What did you learn that was unexpected?


____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________

