Measure of Location (Final)

This document introduces descriptive statistics and measures of central tendency. It discusses the mean, median and mode as measures of central tendency. It provides the mathematical formulas to calculate the mean for both ungrouped and grouped data. It also discusses how to calculate the median for grouped data. The key differences between calculating the mean of ungrouped versus grouped data and how to interpret these measures are explained.

Uploaded by

Nada Imran

Introduction to Statistics
Muhammad Irfan Malik, Ph.D
School of Social Sciences & Humanities (S3H),
National University of Science and Technology (NUST), Islamabad
Email: irfanmalik@s3h.nust.edu.pk
• What are descriptive measures?
• Types of descriptive measures
• What is a measure of central tendency or measure of location?
• What are the measures of central tendency?
• How to find these measures?
• What are the advantages and disadvantages of these measures?
• Shape of distribution and relationship between mean, median and mode
• Application on real world data
• Interpretation
Descriptive Measures
Numbers that are used to describe the data set are
called descriptive measures.
Types
Measures of Location (Mean, Median, Mode)

Measures of Dispersion (Range, Variance, Standard Deviation)

Measures of Position (Quartiles, Percentiles)


Measure of central tendency
A measure of central tendency (also referred to as a measure of center or central location) is a measure that attempts to describe the whole set of data with a single value that represents the middle or center of its distribution.

Measures of central tendency are often called averages.


Measures of central tendency
The measures are the arithmetic mean, the median and the mode.
Arithmetic Mean (Mean)
The arithmetic mean (mean) of a data set is obtained by dividing the sum of values by the number of values. It is usually denoted by $\mu$ (population mean) and $\bar{X}$ (sample mean).
Properties of Mean
• It need not be an element of the collection.
• It need not be an integer even if all the elements of the collection are integers.
• It lies somewhere between the smallest and largest values in the collection.
• It need not be halfway between the two extremes; in general, it is not true that half the elements in a collection are above the mean.
• If the collection consists of values of a variable measured in specified units, then the mean has the same units.
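Mathematical formula of mean for ungrouped (raw) data

Population Mean
$$\mu = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$

Sample Mean
$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$

$X_i$ represents the value of the $i$-th observation.
$n$ represents the number of observations in the sample ($N$ in the population).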
How to find mean of ungrouped (raw) data?
According to the formula given in the last slide, we need to perform the following steps to find the mean:
1. Count the number of observations.
2. Find the total of all the observations.
3. Divide the sum of all the observations by the number of observations.
4. The result is the mean.
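The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_raw` and the sample scores are hypothetical and do not come from the slides.

```python
def mean_raw(values):
    """Arithmetic mean of ungrouped (raw) data."""
    n = len(values)          # step 1: count the observations
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    total = sum(values)      # step 2: total of all the observations
    return total / n         # step 3: divide the total by the count


# Hypothetical scores, used only to illustrate the steps.
scores = [4, 8, 15, 16, 23, 42]
print(mean_raw(scores))  # 18.0
```

Note that the result (18.0) is not one of the observations, which matches the first property of the mean listed earlier.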
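Mathematical formula of mean for grouped data

Population Mean
$$\mu = \frac{\sum_{i=1}^{k} f_i X_i}{N}, \qquad N = \sum_{i=1}^{k} f_i$$

Sample Mean
$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + f_3 X_3 + \dots + f_k X_k}{n} = \frac{\sum_{i=1}^{k} f_i X_i}{n}, \qquad n = \sum_{i=1}^{k} f_i$$

$k$ represents the number of classes, $f_i$ represents the frequency of class $i$, and $X_i$ represents the mid point of class $i$.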
How to find mean of grouped data?
We need to perform the following steps according to the formula given in the last slide:
• Make the table given below.
• Find the mid point of each class and place it in column C.
• Multiply column B by column C and place the result in column D.
• Find the sum of column B and the sum of column D.
• Divide the sum of column D by the sum of column B.
• The result is the mean for grouped data.
These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50 states.

Classes    Frequency    Class Boundaries
100-104        2         99.5-104.5
105-109        8        104.5-109.5
110-114       18        109.5-114.5
115-119       13        114.5-119.5
120-124        7        119.5-124.5
125-129        1        124.5-129.5
130-134        1        129.5-134.5
Mean of the ungrouped (raw) data
The 50 raw observations $X_i$ are listed by observation number $i$ (1: 112, 2: 100, ..., 49: 111, 50: 114).

$$\sum_{i=1}^{50} X_i = 5705, \qquad n = 50$$
$$\bar{X} = \frac{5705}{50} = 114.1$$

Mean of the grouped data

k    Classes    Frequency (f)    Class Boundaries    Mid Point (x)      fx
1    100-104         2            99.5-104.5         (100+104)/2=102    2×102=204
2    105-109         8           104.5-109.5         (105+109)/2=107    8×107=856
3    110-114        18           109.5-114.5         (110+114)/2=112    18×112=2016
4    115-119        13           114.5-119.5         (115+119)/2=117    13×117=1521
5    120-124         7           119.5-124.5         (120+124)/2=122    7×122=854
6    125-129         1           124.5-129.5         (125+129)/2=127    1×127=127
7    130-134         1           129.5-134.5         (130+134)/2=132    1×132=132
Sum                 50                                                  5710

$$\sum_{i=1}^{7} f_i = 50, \qquad \sum_{i=1}^{7} f_i X_i = 5710$$
$$\bar{X} = \frac{5710}{50} = 114.2$$
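As a cross-check of the grouped calculation, here is a minimal Python sketch that recomputes the mean from the class limits and frequencies of the temperature table. The function name `grouped_mean` and the tuple layout are assumptions made for this illustration.

```python
def grouped_mean(classes):
    """Mean of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples.
    The mid point of each class stands in for every value in that class.
    """
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("the frequency table is empty")
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / n


# Record high temperatures (°F), taken from the frequency table.
temperature_classes = [
    (100, 104, 2), (105, 109, 8), (110, 114, 18), (115, 119, 13),
    (120, 124, 7), (125, 129, 1), (130, 134, 1),
]
print(grouped_mean(temperature_classes))  # 114.2
```

The sketch replaces every observation in a class with the class mid point, which is why a grouped mean is generally only an approximation of the raw-data mean.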

Why are the mean of the ungrouped data and the mean of the grouped data different?
The energy consumption of natural gas (in billions of Btu) by the 50 states and the District of Columbia.

• Find the mean of the given data on the consumption of natural gas.
• Construct a frequency distribution for the given data set (use 9 classes).
• Plot the histogram using relative frequency.
• Find the approximate value of the mean.
• Find the mean of the frequency distribution.
• Compare the results.
Median
The median is the middle or central value of an arranged data set.

Some important properties of Median
• Unlike the mean, the median does not depend upon all the observations.
• It is fixed by its position.
• Outliers and skewed data have less impact on the median.

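Median for ungrouped (raw) data

When the number of observations $n$ is odd, the median is the $\left(\frac{n+1}{2}\right)^{th}$ observation.

When the number of observations $n$ is even, the median is
$$\text{Median} = \frac{1}{2}\left[\left(\frac{n}{2}\right)^{th} \text{observation} + \left(\frac{n}{2}+1\right)^{th} \text{observation}\right]$$

Note: the data must be arranged in order before these positions are counted.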
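A minimal Python sketch of the median for raw data follows. It uses the usual rule: the middle observation when n is odd, and the average of the two middle observations when n is even. The function name `median_raw` and the sample values are hypothetical.

```python
def median_raw(values):
    """Median of ungrouped (raw) data."""
    ordered = sorted(values)     # the data must be arranged first
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n+1)/2)-th observation, which is index n//2 (0-based)
        return ordered[mid]
    # even n: average of the (n/2)-th and (n/2 + 1)-th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_raw([7, 3, 9]))     # 7
print(median_raw([7, 3, 9, 1]))  # 5.0
```

Python lists are indexed from 0, so the k-th observation of the formula sits at index k - 1.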
Mathematical formula of Median for grouped data

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right)$$

Where:
$l$ is the lower class boundary of the median class
$h$ is the width or class interval of the median class
$f$ is the frequency of the median class
$n$ is the total number of observations in the data set, or the sum of the frequency column
$C$ is the cumulative frequency of the class preceding the median class.
How to find median of grouped data?
According to the formula given in the previous slide, we need to perform the following steps:

Find the class boundaries if not given.

Find the median class:
a. Compute $\frac{n}{2}$.
b. Look for $\frac{n}{2}$ in the cumulative frequency column, starting from the first class and moving down.
c. The median class is the first class whose cumulative frequency is equal to or greater than $\frac{n}{2}$.

Once we have the median class, we can use the formula for the median given in the previous slide.
These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50 states.

Classes    Frequency    Class Boundaries
100-104        2         99.5-104.5
105-109        8        104.5-109.5
110-114       18        109.5-114.5
115-119       13        114.5-119.5
120-124        7        119.5-124.5
125-129        1        124.5-129.5
130-134        1        129.5-134.5
Median of the ungrouped (raw) data, arranged in ascending order:

no    X      no    X      no    X
 1   100     21   112     41   119
 2   104     22   113     42   120
 3   105     23   113     43   120
 4   105     24   114     44   120
 5   105     25   114     45   120
 6   106     26   114     46   121
 7   107     27   114     47   122
 8   108     28   114     48   122
 9   109     29   115     49   127
10   109     30   116     50   134
11   110     31   116
12   110     32   117
13   110     33   117
14   110     34   117
15   110     35   118
16   111     36   118
17   111     37   118
18   112     38   118
19   112     39   118
20   112     40   118

n = 50, which is an even number, so

$$\text{Median} = \frac{1}{2}\left[\left(\frac{n}{2}\right)^{th} \text{observation} + \left(\frac{n}{2}+1\right)^{th} \text{observation}\right]$$

$$\left(\frac{50}{2}\right)^{th} \text{observation} = 25^{th} \text{ obs} = 114$$
$$\left(\frac{50}{2}+1\right)^{th} \text{observation} = 26^{th} \text{ obs} = 114$$
$$\text{Median} = \frac{114 + 114}{2} = 114$$
Median Class

k    Classes    Frequency    Class Boundaries    Cf
1    100-104        2         99.5-104.5          2     2 < 25, condition not satisfied, continue
2    105-109        8        104.5-109.5         10    10 < 25, condition not satisfied, continue
3    110-114       18        109.5-114.5         28    28 ≥ 25, condition satisfied, stop
4    115-119       13        114.5-119.5         41
5    120-124        7        119.5-124.5         48
6    125-129        1        124.5-129.5         49
7    130-134        1        129.5-134.5         50
Sum                50

$$\text{Median} = \left(\frac{n}{2}\right)^{th} \text{observation} = \left(\frac{50}{2}\right)^{th} \text{observation} = 25^{th} \text{ observation}$$

We cannot read off the 25th observation directly because the data are grouped. To find the median, we first need to find the class in which the 25th observation falls. To do so, go down the cf column from the first cell and stop at the first cell in which cf is greater than or equal to 25. The class associated with this cell is the median class.

Now
$$l = 109.5, \quad f = 18, \quad h = 114.5 - 109.5 = 5, \quad C = 10$$

$$\text{Median} = 109.5 + \frac{5}{18}\left(\frac{50}{2} - 10\right) = 109.5 + \frac{5}{18}(15) = 109.5 + 4.17 = 113.67$$
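The search for the median class and the interpolation formula can be sketched together in Python. This is an illustrative sketch only: the function name `grouped_median` and the `(lower_boundary, upper_boundary, frequency)` layout are assumptions, while the numbers come from the temperature table.

```python
def grouped_median(rows):
    """Median of grouped data: l + (h / f) * (n / 2 - C).

    rows: list of (lower_boundary, upper_boundary, frequency) tuples,
    ordered from the lowest class to the highest.
    """
    n = sum(f for _, _, f in rows)
    if n == 0:
        raise ValueError("the frequency table is empty")
    half = n / 2
    cumulative = 0                       # C: cf of the preceding classes
    for lower, upper, f in rows:
        if cumulative + f >= half:       # first class with cf >= n/2
            h = upper - lower            # class width
            return lower + (h / f) * (half - cumulative)
        cumulative += f
    raise ValueError("no median class found")


temperature_rows = [
    (99.5, 104.5, 2), (104.5, 109.5, 8), (109.5, 114.5, 18),
    (114.5, 119.5, 13), (119.5, 124.5, 7), (124.5, 129.5, 1),
    (129.5, 134.5, 1),
]
print(round(grouped_median(temperature_rows), 2))  # 113.67
```

The loop stops at the class 110-114 (cf = 28 ≥ 25), with C = 10 carried over from the two earlier classes.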

Why are the median of the ungrouped data and the median of the grouped data different?
Ages of the Vice Presidents at the time of their death. The ages at the time of death of those Vice Presidents of the United States who have passed away are listed below.

• Find the median age at the time of death of the Vice Presidents.
• Use the data to construct a frequency distribution. Use 6 classes.
• Plot the histogram and comment on the symmetry.
• Draw the ogive using percentages instead of frequencies and find the approximate age at death below which 50% of the ages of the Vice Presidents fall.
• Using the frequency distribution, find the median class.
• Find the median age at the time of death using the frequency distribution.
• Comment on both values of the median.
Mode
The mode is the value that appears most frequently in a data set.
Some important properties of Mode
• The mode is a value that appears more than once.
• The mode is not calculated from all the observations in a data set.
• It is not affected by extreme observations (extreme minimum or extreme maximum).
• Sometimes it may not be possible to calculate the mode.

If the data have one mode, the data set is called unimodal; if the data have two modes, it is called bimodal; and if the data have more than two modes, it is called multimodal.
Mode for ungrouped (raw) data
Find the most common value(s).
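A short Python sketch of this rule, using `collections.Counter` from the standard library. The function name `modes` and the sample lists are hypothetical. Returning an empty list when no value repeats is an assumption that follows the property that a mode appears more than once.

```python
from collections import Counter


def modes(values):
    """Return every most common value, sorted; [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Assumption: a data set with no repeated value has no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([1, 2, 2, 3]))     # [2]     unimodal
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  bimodal
print(modes([1, 2, 3]))        # []      no mode
```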
Mode for grouped data

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$$

Where:
$l$ is the lower class boundary of the modal class
$h$ is the width or class interval of the modal class
$f_m$ is the frequency of the modal class
$f_1$ is the frequency of the class preceding the modal class
$f_2$ is the frequency of the class following the modal class

The modal class is the class having the maximum frequency.


These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50 states.

Classes    Frequency    Class Boundaries
100-104        2         99.5-104.5
105-109        8        104.5-109.5
110-114       18        109.5-114.5
115-119       13        114.5-119.5
120-124        7        119.5-124.5
125-129        1        124.5-129.5
130-134        1        129.5-134.5
Mode of the ungrouped (raw) data: count how many times each value appears among the 50 arranged observations.

X      Count      X      Count
100      1        114      5
104      1        115      1
105      3        116      2
106      1        117      3
107      1        118      6
108      1        119      1
109      2        120      4
110      5        121      1
111      2        122      2
112      4        127      1
113      2        134      1

Mode = 118, because it appears 6 times in the data set, more often than any other number.
k    Classes    Frequency    Class Boundaries
1    100-104        2         99.5-104.5
2    105-109        8        104.5-109.5
3    110-114       18        109.5-114.5      (modal class)
4    115-119       13        114.5-119.5
5    120-124        7        119.5-124.5
6    125-129        1        124.5-129.5
7    130-134        1        129.5-134.5
Sum                50

The class with the maximum frequency contains the mode and is called the modal class.

Now
$$l = 109.5, \quad f_m = 18, \quad f_1 = 8, \quad f_2 = 13, \quad h = 114.5 - 109.5 = 5$$

$$\text{Mode} = 109.5 + \frac{18 - 8}{(18 - 8) + (18 - 13)} \times 5 = 109.5 + \frac{10}{15}(5) = 109.5 + 3.33 = 112.83$$
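The grouped-mode calculation can be sketched in Python as below. The sketch assumes the common interpolation formula Mode = l + (f_m − f_1) / ((f_m − f_1) + (f_m − f_2)) × h, which reproduces the worked figures here (10/15 × 5). The function name `grouped_mode`, the row layout, and the choice to treat a missing neighbouring class as frequency 0 are all assumptions.

```python
def grouped_mode(rows):
    """Mode of grouped data by interpolation inside the modal class.

    rows: list of (lower_boundary, upper_boundary, frequency) tuples,
    ordered from the lowest class to the highest.
    """
    if not rows:
        raise ValueError("the frequency table is empty")
    freqs = [f for _, _, f in rows]
    m = freqs.index(max(freqs))          # first class with the maximum frequency
    lower, upper, f_m = rows[m]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f_1 = freqs[m - 1] if m > 0 else 0
    f_2 = freqs[m + 1] if m < len(rows) - 1 else 0
    d1, d2 = f_m - f_1, f_m - f_2
    if d1 + d2 == 0:
        raise ValueError("the mode is not defined for a flat table")
    return lower + d1 / (d1 + d2) * (upper - lower)


temperature_rows = [
    (99.5, 104.5, 2), (104.5, 109.5, 8), (109.5, 114.5, 18),
    (114.5, 119.5, 13), (119.5, 124.5, 7), (124.5, 129.5, 1),
    (129.5, 134.5, 1),
]
print(round(grouped_mode(temperature_rows), 2))  # 112.83
```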

Why are the mode of the ungrouped data and the mode of the grouped data different?
Enrollments for Selected Independent Religiously Controlled 4-Year Colleges. Listed below are the enrollments for selected independent religiously controlled 4-year colleges that offer bachelor’s degrees only.

• Construct a grouped frequency distribution with six classes.
• Draw the histogram for the frequency distribution and identify the modal class.
• Find the mode using the formula.
Weighted Mean

A weighted mean is the mean of a data set whose entries have varying weights.

Mathematical formula
$$\bar{x}_w = \frac{\sum x w}{\sum w}$$

where $w$ is the weight of each entry $x$.


Assessment Type        Score (x)    Weight (w)    x×w
Assignment                25           10          250
Quiz                      37           10          370
Class participation       15            5           75
Mid Term                  50           25         1250
Final Term                89           50         4450
Sum                      216          100         6395

$$\bar{x} = \frac{\sum x}{n} = \frac{216}{5} = 43.2 \qquad \bar{x}_w = \frac{\sum x w}{\sum w} = \frac{6395}{100} = 63.95$$
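The comparison between the simple mean and the weighted mean can be checked with a small Python sketch. The function name `weighted_mean` is hypothetical. The scores and weights are the ones in the assessment table, whose x×w column sums to 6395.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(x * w) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# Assignment, Quiz, Class participation, Mid Term, Final Term
scores = [25, 37, 15, 50, 89]
weights = [10, 10, 5, 25, 50]

print(sum(scores) / len(scores))       # 43.2  (simple mean)
print(weighted_mean(scores, weights))  # 63.95 (weighted mean)
```

The weighted mean is higher than the simple mean because the final term, the highest score, carries half of the total weight.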
The shapes of distribution and relationship between Mean, Median and Mode

A frequency distribution is symmetric when a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images.

A frequency distribution is skewed if the “tail” of the graph elongates more to one side than to the other. A distribution is skewed left (negatively skewed) if its tail extends to the left. A distribution is skewed right (positively skewed) if its tail extends to the right.

A frequency distribution is uniform (or rectangular) when all entries, or classes, in the distribution have equal or approximately equal frequencies. A uniform distribution is also symmetric.
