LS 01 - Basic Concept - Dispersion
It is difficult to define statistics in a few words, since its dimension, scope, function, use and
importance are constantly changing over time. No formal definition has emerged so far, and perhaps
no definition is beyond controversy.
According to Fisher (1947)¹, the science of statistics is essentially a branch of applied mathematics
and may be regarded as mathematics applied to observational data.
Croxton and Cowden (1948) defined statistics as the subject of collection, presentation and
analysis of numerical data.
As Yule and Kendall (1950) opined, statistics means quantitative data that are affected to a
marked extent by a multiplicity of causes.
The American Heritage Dictionary defines statistics as: “The mathematics of the collection,
organization and interpretation of numerical data, especially the analysis of population
characteristics by inference from sampling.”
¹ R. A. Fisher (1890-1962) is known as the father of statistics.
STA 101_Introduction to Statistics
LS01_Basic Concept, Central Tendency and Dispersion
Government often conducts experiments to aid in the development of public policy and
social programs. Such experiments include:
o Consumer price
o Fluctuations in the economy
o Employment patterns
o Population trends
o Opinion polls.
3. Scientific research:
Statistical methods are used to enhance the validity of inference in all fields of science,
including medical science, for example:
o Radio carbon dating to estimate the risk of earthquakes.
o Clinical trials to investigate the effectiveness of new treatments.
o Field experiments to evaluate the irrigation methods.
o Measurements of water quality
Population:
A set of all values or elements defined on some common characteristics is called a population.
Iftekhar M S Kalam
Assistant Professor, MNS
Example: If we want to study the average weight of the students of the 1st semester of BBA, then the
set that consists of the weights of all 1st semester BBA students is the population in this case.
Parameter
A parameter is a numerical measure that describes a characteristic of a population.
Sample:
A small and (desirably) representative part of a population is known as a sample.
In many situations it is impossible or impractical to study the whole population. In such cases
only a small, representative part of the population is examined, and inferences about the
population are drawn by analyzing that part. Such a part of the population is known as a sample.
Statistic:
A statistic is a numerical measure that describes a characteristic of a sample.
Variable:
The measurement of elements of a population having certain characteristics may vary from
element to element either in magnitude or in quality. These measurable characteristics are called
variables.
Thus a measurable characteristic that can vary from element to element within its domain is
called a variable. Usually we denote variables by capital letters and their values by small
letters.
Example: Height, weight, age, SSC and HSC marks, family size, sex, etc. are some variables of 1st
semester BBA students of BRAC University.
Types of Variables
There are two basic types of variables -
1. Qualitative variable (also known as categorical variable or attribute)
A qualitative variable is one for which numerical measurement is not possible. In other
words, when the characteristic being studied is nonnumeric, it is called a qualitative variable
or an attribute.
For example: Hair color (brown, black, white etc.), religion (Muslim, Hindu, etc.), sex (male,
female), home district (Dhaka, Rajshahi, Bogra etc.), occupational status (employed,
unemployed, self-employed, others) etc.
An individual is simply assigned to one of several mutually exclusive categories on
the basis of observation of the individual. Qualitative observations can neither be
meaningfully ordered nor physically measured; they can only be classified and then
enumerated.
In dealing with the qualitative data, researchers are usually interested in how many or what
proportion fall in each category.
For Example:
- What percentage of students of BRAC University are from an English medium background?
- What proportion of people opted in favor of construction of the new Airport?
- How many Muslims and how many Hindus are there in Bangladesh?
Variable
  - Qualitative variable
  - Quantitative variable
      - Discrete variable
      - Continuous variable
Exercise:
a. Classify each variable as qualitative or quantitative:
i. Marital status of nurses in a hospital
ii. Time it takes to run a marathon
iii. Weights of lobsters in a tank in a restaurant
iv. Colors of automobiles in a shopping centre parking lot
v. Ages of people living in a personal care home
Data:
Numerical facts gathered from a statistical investigation are called data.
In a statistical analysis, once a specific problem and field of enquiry have been identified, the
first task is to collect data, the raw material of statistics.
"Data" is in fact the plural form of ‘datum’. A single piece of information about a phenomenon of
interest is called a datum, so data are a collection of such items.
Example: If we are interested in the height of the students of the 1st semester of BBA at BU, then a
single value (that is, the height of one student) is called a datum, and the set of all height
values is the data.
Primary data:
A data set is said to be primary data if it is obtained from an investigation conducted for the
first time. Thus the data collected for the first time by the investigator as original data are
known as primary data.
Secondary data:
When a statistical analysis is conducted on a data set available from a prior investigation, that
data set is called secondary data.
Example: National income data collected by the government are primary data for the government, but
they become secondary data for those who later use them.
Raw data:
In any statistical investigation, data when first collected usually appear in raw form, where the
information has been recorded merely in the arbitrary order in which it happened to occur. This is
known as the raw data set.
Raw data collected for a statistical investigation cannot by themselves convey the summary
information that, although preliminary, is necessary for analysis with more advanced statistical
methods. So it is necessary to represent the raw data in a way that enables us to extract
preliminary ideas about the variable(s) under study, to obtain some summary measures and to
perform further statistical analysis.
Dealing with Raw Data: How to prepare data for further Statistical operation
In the next few segments we discuss some statistical techniques that are commonly used to
condense raw data and prepare it for further statistical application.
The most frequently used methods for data condensation and/or representation are
i. Classification
ii. Tabulation
iii. Graphical representation
Classification:
Classification is the process of arranging the data values of a variable in groups or classes
according to their similarities or to our interest. It is the first step towards further
processing: a heterogeneous mass of data is divided into a number of homogeneous groups and
subgroups by their respective characteristics.
Purpose of classification:
Classification is necessary to serve the following purpose:
i. To eliminate unnecessary details.
ii. To bring out clearly the points of similarity and dissimilarity.
iii. To enable one to form a mental picture of the object.
iv. To enable one to make comparisons.
v. To pinpoint the most significant features of the data at a glance.
vi. To enable a statistical treatment of the collected data.
A statistical table is the logical listing of collected data in vertical columns and horizontal rows of
numbers with sufficient explanatory and qualifying words, terms and statements in the form of
titles, headings and notes which make clear the full meaning of data and their origin.
Frequency distribution:
The number of times a particular value of a certain variable occurs in a set of observations is called
the frequency of that value and the manner in which the frequencies are distributed in the
different classes is known as the frequency distribution of the values of that variable.
Table 01: Frequency distribution of number of children per family

Number of children    Frequency
0                     10
1                     27

Table 02: Frequency distribution of height of trees in Sundarban

Height class          Frequency
0 - 50                1000
50 - 100              2735
Class limit:
Class limits are the highest and the lowest values that can be included in the class.
For example if we consider the class 50 – 100, here 50 is the lower limit and 100 is the upper limit.
In such case no values greater than 100 shall fall into that class. Similarly no values less than 50
shall fall into that class either.
Class interval:
The difference between the upper limit and the lower limit of a class is called the class interval.
For example the class interval of the class ‘50 – 100’ is 50.
Class frequency:
The number of observations falling within a particular class is called its frequency or class
frequency.
Class midpoint = $\frac{l + s}{2}$, where l = upper limit of the class and s = lower limit of the class.
Class limit   Class mid value   Frequency   Relative frequency   Cumulative frequency   Cumulative relative frequency
0 - 10        5                 4           0.148                4                      0.148
10 - 20       15                8           0.296                4+8                    0.444
20 - 30                         5                                4+8+5
30 - 40                         4
40 - 50                         3
50 - 60                         2
60 - 70       65                1
Total
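The remaining cells of the table follow mechanically from the class limits and the frequencies. The following is a minimal Python sketch of that bookkeeping, assuming the frequencies 4, 8, 5, 4, 3, 2, 1 read from the table; the function name `frequency_table` and the dictionary keys are illustrative choices, not part of the lecture.

```python
# Class limits and frequencies as read from the partially filled table.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
frequencies = [4, 8, 5, 4, 3, 2, 1]


def frequency_table(classes, frequencies):
    """Return one row per class with mid value, frequency, relative frequency,
    cumulative frequency and cumulative relative frequency."""
    total = sum(frequencies)
    rows = []
    cumulative = 0
    for (lower, upper), f in zip(classes, frequencies):
        cumulative += f  # running total of frequencies up to this class
        rows.append({
            "class": f"{lower} - {upper}",
            "mid": (lower + upper) / 2,       # class midpoint
            "f": f,
            "rel": f / total,                 # relative frequency
            "cum": cumulative,                # cumulative frequency
            "cum_rel": cumulative / total,    # cumulative relative frequency
        })
    return rows


table = frequency_table(classes, frequencies)
for row in table:
    print(f"{row['class']:>8} {row['mid']:>5.0f} {row['f']:>3} "
          f"{row['rel']:.3f} {row['cum']:>3} {row['cum_rel']:.3f}")
```

The first two printed rows should agree with the filled-in rows of the table (0.148 and 0.444), which is a quick way to check the remaining rows.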
Exercise:
The following information, extracted from a survey of a Microfinance institution (MFI) represents
the amount of loan request of 50 potential borrowers from any particular branch of that MFI.
1850 9250 6100 4500 5100 1800 6100 6500 6999 6780
3100 7475 6400 4950 8789 6100 6480 7050 9900 4790
4400 7900 6900 3865 5556 4859 6999 6780 8050 9900
5600 6600 9980 4800 8855 5550 1200 4790 6500 8050
3858 7300 8050 6200 7155 4980 8050 6480 7050 1500
For the given data construct a suitable frequency distribution table featuring the following
components
i. Class mid value ii. Tally Bars iii. Frequency
iv. Relative frequency v. Cumulative frequency vi. Cumulative relative frequency
The following summary table shows where the people surveyed prefer to do their banking.
Table 1: Table of percentage distribution of banking preference of the customer of BANK XYZ
Example 1:
Summary table of levels of Risk of Mutual Funds.
A sample of 868 mutual funds was selected and assessed in order to categorize the risk associated
with the customers’ investments in mutual funds. Of the 868 mutual funds, 202 funds are classified
as low-risk funds, 311 funds are classified as average-risk funds and the remaining 355 funds are
categorized as high-risk. Hence the summary table of levels of risk of mutual funds is given
below.
Table 2: Frequency and Percentage Summary Table Pertaining to Risk Level for 868 Mutual Funds
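The percentage column of such a summary table is each category frequency divided by the total, times 100. A small sketch in Python, using the three counts quoted in Example 1 (the helper name `percentage_summary` is hypothetical):

```python
# Frequencies quoted in Example 1 for the 868 mutual funds.
risk_counts = {"Low": 202, "Average": 311, "High": 355}


def percentage_summary(counts):
    """Convert category frequencies into percentages of the total."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


percentages = percentage_summary(risk_counts)
for category, n in risk_counts.items():
    print(f"{category:<8} {n:>4} {percentages[category]:6.2f}%")
print(f"{'Total':<8} {sum(risk_counts.values()):>4} "
      f"{sum(percentages.values()):6.2f}%")
```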
Figure 1 displays the bar chart of people’s banking preferences as given in table 1. A bar chart
allows researchers to compare the percentages in different categories. In figure 1, respondents
are most likely to bank in person at a branch or on the internet, followed by drive-through
service at a branch and ATM. Very few respondents mentioned automated or live telephone.
[Figure 1: Bar chart of banking preference. Horizontal axis: ATM, Automated or live telephone,
Drive-through service at branch, In person at branch, Internet. Vertical axis: percentage, 0 to 45.]
Example 2:
Bar Chart of levels of risk of Mutual Funds.
Construct a bar chart for the levels of risk of mutual funds (based on data shown in table 2) and
interpret the result.
[Figure: Bar chart of level of risk, with one bar each for High, Average and Low.]
In table 1 of this lecture 16% of the respondents stated that they prefer to bank using ATM. Thus
in constructing the pie chart, the 360 degrees that make up a circle are multiplied by 0.16,
resulting in a slice of the pie that takes up 57.6 degrees of the 360 degrees of the circle. In
this figure, banking in person at the branch takes 41% of the pie and automated or live telephone
takes only 2%.
[Figure: Pie chart of banking preference. Visible slice labels include Drive-through service at
branch 17% and In person at branch 41%.]
In general, the angle of a slice is
$\theta^{\circ} = \frac{f}{N} \times 360^{\circ}$
where f is the frequency (or percentage) of the category and N is the total.
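The slice-angle rule can be checked with a few lines of Python. This is only a sketch: the percentages are the four shares quoted in the text (the Internet share is not quoted there, so it is left out), and `pie_angle` is an illustrative name.

```python
def pie_angle(frequency, total):
    """Angle in degrees of the pie slice for one category: (f / N) * 360."""
    return frequency / total * 360


# Percentage shares quoted in the text for table 1 (N = 100 percent).
shares = {
    "ATM": 16,
    "Drive-through service at branch": 17,
    "In person at branch": 41,
    "Automated or live telephone": 2,
}

angles = {category: pie_angle(share, 100) for category, share in shares.items()}
for category, angle in angles.items():
    print(f"{category:<34} {angle:6.1f} degrees")
```

The ATM slice comes out at about 57.6 degrees, matching the worked figure in the paragraph above.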
Exercise:
Using data given in table 2 construct a pie chart for the levels of risk of mutual funds and interpret the
results.
2. A qualitative variable with three classes (X, Y and Z) is measured for each of 20 units randomly
sampled from a target population. The data (observed class for each unit) are listed below.
Y X X Z X Y Y Y X X
Z X Y Y X Z Y Y Y X
a. Compute the frequency for each of the three classes.
b. Compute the relative frequency for each of the three classes.
c. Display the results, part a, in a frequency bar graph.
d. Display the results, part b, in a pie chart.
3. Assume telecommunication companies in Bangladesh spent about BDT 300 million on advertising.
The spending is as follows:
Media          Amount (BDT millions)   Percentage (%)
Radio          20                      6.67
Internet       30                      10.00
Cinema         5                       1.67
Direct mail    15                      5.00
Magazines      35                      11.67
Newspapers     65                      21.67
Outdoor        45                      15.00
TV             35                      11.67
Other          50                      16.67
Total          300                     100
a. Construct a bar chart and a pie chart.
b. Which graphical method do you think is best to portray these data?
4. The international Rhino Federation estimates that there are 25280 rhinoceroses living in the wild in
Africa and Asia. A breakdown of the number of rhinos of each species is reported in the
accompanying table.
5. The following data set represents the scores on intelligence quotient (IQ) examinations of 40 sixth-
grade students at a particular school:
114 122 103 118 99 105 134 125 117 106
109 104 111 127 133 111 117 103 120 98
100 130 141 119 128 106 109 115 113 121
100 130 125 117 119 113 104 108 110 102
i. Organize the data in classes such as 90 – 100, 100 – 110 and so on.
ii. Present the data set in a frequency histogram.
iii. Determine the mean scores on intelligence quotient (IQ) examinations.
iv. Determine the proportion of scores above the average scores.
A stem and leaf plot is a graphical technique for representing quantitative data that can be used
to examine the shape of a frequency distribution, the range of the values and the point of
concentration of the values. It is, in essence, a display technique taken from the area of
statistics called exploratory data analysis (EDA).
Tukey (1977) first proposed the technique. It allows us to use the information contained in a
frequency distribution to show
- The range of scores
- The concentration of scores
- The shape of the distribution
- The presence of any specific values or scores not represented in the entire data set
- Whether there are any stray or extreme values in the distribution.
Example:
1. The following data represent the marks obtained by 20 students in a statistics test.
84 17 78 45 47 53 76 54 75 22
66 65 55 54 51 33 39 19 54 72
Use the stem and leaf plot to display the data.
[Two displays: the stem and leaf plot for the given data, and the stem and leaf plot obtained
after arranging the leaves in order.]
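As a rough illustration of how the ordered display can be built, the sketch below takes the tens digit of each mark as the stem and the units digit as the leaf. It assumes two-digit values, as in this example; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Marks of the 20 students from the example.
marks = [84, 17, 78, 45, 47, 53, 76, 54, 75, 22,
         66, 65, 55, 54, 51, 33, 39, 19, 54, 72]


def stem_and_leaf(values):
    """Group values by stem (tens digit) with leaves (units digits) in order.

    Assumes non-negative two-digit integers, as in the marks example.
    """
    plot = defaultdict(list)
    for value in sorted(values):          # sorting first keeps leaves ordered
        plot[value // 10].append(value % 10)
    return dict(plot)


plot = stem_and_leaf(marks)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Reading the printed display, the longest row (stem 5) shows where the marks are concentrated, and the first and last rows give the range.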
2. Form an ordered array, given the following data from a sample of n=8 midterm exam scores in
math:
63 99 68 72 79 83 71 62
3. Form a stem and leaf display, given the following data from a sample of n=7 midterm exam
scores in physics:
70 44 79 88 83 73 84
Usually, in a histogram:
- The variable of interest is displayed or plotted along the horizontal (X) axis.
- Frequency or the percentage of the values per class is displayed or plotted along the
vertical (Y) axis.
Example:
[Figure: Histogram of frequency by age group. Horizontal axis: Age group (0-5, 5-10, ..., 60-65,
65+). Vertical axis: Frequency, 0 to 3000.]
[Figure: Histogram of frequency by age group. Horizontal axis: Age Group (0-5, 5-10, ..., 60-65,
65+). Vertical axis: Frequency, 300 to 1500.]
The Frequency Polygon
In constructing a frequency polygon, the mid values of the class intervals of the frequency
distribution are placed on the horizontal (X) axis and the corresponding frequencies are
represented on the vertical (Y) axis. The coordinate points thus obtained are joined by straight
lines. The leftmost point is joined to the mid value of the immediately preceding interval and the
rightmost point is joined to the mid value of the immediately following interval. The polygon
thus obtained is known as a frequency polygon.
Table 1.4: Frequency distribution of male and female by age group

Marks     Midvalue   Frequency
40-50     45         2
50-60     55         6
60-70     65         8
70-80     75         3
80-90     85         2
90-100    95         1

Figure 1.4: Frequency distribution of marks obtained by students of STA 101 of section 2 (Spring 2011)
[Figure: Frequency polygon. Horizontal axis: Midvalues of exam mark group (45 to 95). Vertical
axis: Frequency (0 to 8).]
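The points that make up the polygon can be listed directly from the mid values and frequencies of Table 1.4. The sketch below adds the two closing points, at the mid values of the preceding and following intervals, with frequency zero. It assumes equal class widths, and `polygon_points` is an illustrative name; plotting the points is left to any charting tool.

```python
# Mid values and frequencies from Table 1.4.
midvalues = [45, 55, 65, 75, 85, 95]
frequencies = [2, 6, 8, 3, 2, 1]


def polygon_points(midvalues, frequencies):
    """Return (x, y) points of a frequency polygon, closed on both ends.

    Assumes equally wide classes, so the closing points sit one class
    width before the first mid value and one after the last.
    """
    width = midvalues[1] - midvalues[0]
    xs = [midvalues[0] - width] + list(midvalues) + [midvalues[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))


points = polygon_points(midvalues, frequencies)
for x, y in points:
    print(x, y)
```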
A percentage polygon is formed by letting the midpoint of each class represent the data in that
class and then connecting the sequence of midpoints at their respective class percentages. Table
1.5 and figure 1.5 illustrate the construction of the percentage polygon.
Figure 1.5: Comparison of percentage distribution of grades obtained by students of STA 101 (Spring 2011)
[Figure: Percentage polygon. Horizontal axis: Mid values of the exam mark group (45 to 95).
Vertical axis: Percentage of students (0.0 to 40.0).]
Cross Tabulations:
The study of patterns that may exist between two or more categorical variables is common in
practice. Often these patterns can be explained by cross-tabulating the data. One can present
cross tabulations in tabular form (contingency tables) or graphical form (side-by-side charts).
columns are called cells. Depending on the type of contingency table constructed, the cells for each
row-column combination contain the frequency, the percentage of the overall total, the percentage
of the row total, or the percentage of the column total.
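To make the cell contents concrete, here is a minimal sketch that cross-tabulates sex against religion. For brevity it uses only the first ten records of the socio-economic survey table that appears in the exercise later in this lecture, so the counts are not those of the full table; the names `cross_tabulate` and `cells` are illustrative.

```python
from collections import Counter

# (sex, religion) for records 1-10 of the survey table in the later exercise.
records = [
    ("M", "Islam"), ("F", "Hindu"), ("M", "Buddha"), ("M", "Christian"),
    ("F", "Hindu"), ("M", "Islam"), ("M", "Islam"), ("M", "Hindu"),
    ("F", "Buddha"), ("F", "Islam"),
]
religions = ["Islam", "Hindu", "Christian", "Buddha"]


def cross_tabulate(pairs):
    """Count how many records fall in each (row, column) cell."""
    return Counter(pairs)


cells = cross_tabulate(records)
overall = sum(cells.values())
for sex in ("M", "F"):
    row = [cells[(sex, religion)] for religion in religions]
    row_total = sum(row)
    # Frequency, percentage of overall total, and percentage of row total.
    for religion, count in zip(religions, row):
        print(f"{sex} {religion:<10} {count:>2} "
              f"{100 * count / overall:5.1f}% of all "
              f"{100 * count / row_total:5.1f}% of row")
```

Percentages of the column total follow the same pattern, dividing each count by the sum over both sexes for that religion.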
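Figure 1.6.1: Frequency distribution of religion by sex
[Figure: Side-by-side bar chart. Horizontal axis: Muslim, Hindu, Christian, Buddha, Others. Bars:
Male, Female. Vertical axis: Frequency (0 to 30).]
Figure 1.6.2: Frequency distribution of sex by religion
[Figure: Side-by-side bar chart. Horizontal axis: Male, Female. Bars: Muslim, Hindu, Christian,
Buddha, Others. Vertical axis: Frequency (0 to 30).]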
A useful way to visually display the results of cross-classification data is by constructing a
side-by-side bar chart. Figure 1.6.1 and figure 1.6.2 use the data from table 1.6.
Example & Exercise:
A sample of 500 shoppers was selected in a large metropolitan area to determine various
information concerning consumer behavior. Among the questions asked was “Do you enjoy
shopping for clothing?” The results are summarized in the following cross-classified table:
No 104 36 140
Exercise:
The following table represents the information of 50 individuals collected in a socio-economic
survey. Using the information given in table 1 answer question A - D
Table 1: Summary information of 50 individuals
Sl. # Sex Religion Previous month’s Income Division Marital Status
1 M Islam 1500 Dhaka Married
2 F Hindu 3100 Rajshahi Married
3 M Buddha 4400 Sylhet Married
4 M Christian 5600 Khulna Unmarried
5 F Hindu 3858 Dhaka Divorced
6 M Islam 9250 Rajshahi Married
7 M Islam 7475 Chittagong Married
8 M Hindu 7900 Khulna Unmarried
9 F Buddha 6600 Rangpur Divorced
10 F Islam 7300 Dhaka Unmarried
11 M Islam 6100 Barishal Married
12 M Buddha 6400 Rajshahi Married
13 M Christian 6900 Sylhet Married
14 F Islam 9980 Khulna Unmarried
15 M Islam 8050 Dhaka Divorced
16 M Christian 4500 Rajshahi Married
17 M Islam 4950 Chittagong Married
18 M Hindu 3865 Dhaka Unmarried
19 F Hindu 4800 Rajshahi Divorced
20 M Buddha 6200 Sylhet Unmarried
21 F Islam 5100 Barishal Married
22 M Islam 8789 Rajshahi Married
23 M Christian 5556 Sylhet Married
24 F Islam 8855 Khulna Unmarried
25 M Buddha 7155 Dhaka Divorced
26 M Islam 1800 Rajshahi Married
27 F Islam 6100 Chittagong Married
28 M Christian 4859 Khulna Married
29 M Islam 5550 Rangpur Married
30 F Christian 4980 Dhaka Unmarried
31 M Hindu 6100 Barishal Divorced
32 F Islam 6480 Rajshahi Married
33 M Christian 6999 Sylhet Married
34 M Islam 1200 Khulna Unmarried
35 F Christian 8050 Dhaka Divorced
36 F Hindu 6500 Rajshahi Unmarried
37 M Christian 7050 Chittagong Married
38 F Islam 6780 Khulna Married
39 M Hindu 4790 Rangpur Married
40 M Buddha 6480 Barishal Married
Question A:
i. How many variables are listed in table I?
ii. Mention the variable name listed in Table I.
Question B:
Construct a frequency distribution table to represent the summary information of the variable
“Division” and determine the proportion of respondents from Dhaka.
Question C:
Complete the following table # 3 and answer (a) & (b)
             Religion
Sex          Islam   Hindu   Christian   Buddha   Total
Male
Female
Total
Question D:
Complete the following table # 4 and answer a), b) & c)
Table 4: Frequency distribution of previous month’s income
Income Group    Tally   Frequency   Relative frequency   Cumulative relative frequency
Below 2000
2000 - 4000
4000 - 6000
6000 - 8000
8000 - 10000
a) What proportion (percentage) of people had a previous month’s income between 2000 and 6000?
b) What proportion (percentage) of people had a previous month’s income of less than 4000?
The arithmetic mean of a series of observations is obtained by adding the values of the
observations and then dividing the sum by the number of observations.
Example:
Banglatel is studying the number of minutes used by clients in a particular cell phone rate plan. A
random sample of 12 clients showed the following number of minutes used last month.
90 77 94 89 119 112
91 110 92 100 113 83
What is the mean (arithmetic mean) number of minutes used?
Answer:
Average use of the rate plan:
$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{90 + 77 + \dots + 91 + \dots + 113 + 83}{12} = 97.5$
Thus the arithmetic mean number of minutes used last month by the sample of cell phone users is
97.5 minutes.
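The same calculation as a short Python sketch, using the twelve values from the Banglatel example (the function name `arithmetic_mean` is illustrative):

```python
# Minutes used last month by the 12 sampled clients.
minutes = [90, 77, 94, 89, 119, 112,
           91, 110, 92, 100, 113, 83]


def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


mean_minutes = arithmetic_mean(minutes)
print(mean_minutes)  # 97.5
```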
Exercise:
1. “Dolphine Autos” employed 12 sales people. The number of new cars sold last month by the
respective sales people were as given in the following table:
15 23 10 4 18 8
10 28 13 19 14 12
Determine the average number of cars sold by the sales people. Also determine the proportion
of sales people performing below average.
2. During the last month Shameem Refrigeration and Air Conditioning Company completed 129
different assignments for their clients and earned mean revenue of 13449 tk per assignment. If
the managing director wants to know the total revenue for the month, can you compute it? What
is it?
3. Following data represents the battery life (in shots) for a sample of 12 three-pixel digital
cameras:
300 180 380 260 35 380
85 170 460 120 110 240
Determine the average number of shots taken for each battery. Also determine the proportion
of batteries performing above average.
Formula for Grouped Data
Values:       $x_1, x_2, \dots, x_k$
Frequencies:  $f_1, f_2, \dots, f_k$
Such that $f_1 + f_2 + \dots + f_k = n$, then the AM, denoted by $\bar{x}$, is defined as
$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \dots + f_k x_k}{n}; \quad (i = 1, 2, \dots, k)$
Example & Exercise:
Calculate the mean for the following frequency distribution for n=100:
Class interval Frequency
0-10 10
10-20 20
20-30 40
30-40 20
40-50 10
Answer:
Calculation:
Class interval   Frequency ($f_i$)         Mid value ($x_i$)   $f_i \times x_i$
0-10             10                        5                   50
10-20            20                        15
20-30            40
30-40            20
40-50            10                        45                  450
Total            $\sum_{i=1}^{k=5} f_i$                        $\sum_{i=1}^{k=5} f_i x_i$
Arithmetic mean:
$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \dots + f_k x_k}{n} = \frac{50 + \dots + 450}{100} = ??$
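One way to check the completed table is to let a short program compute the mid values and the products. The sketch below applies the grouped-data formula to the five classes of this example; `grouped_mean` is an illustrative name.

```python
# (lower limit, upper limit, frequency) for each class in the example.
classes = [(0, 10, 10), (10, 20, 20), (20, 30, 40), (30, 40, 20), (40, 50, 10)]


def grouped_mean(classes):
    """Arithmetic mean of grouped data: sum(f_i * x_i) / n,
    where x_i is the class mid value and n is the total frequency."""
    n = sum(f for _, _, f in classes)
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return weighted_sum / n


mean_value = grouped_mean(classes)
print(mean_value)  # 25.0
```

Because the distribution is symmetric about the 20-30 class, the result coincides with that class's mid value.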
Step 2  Arrange the values in ascending order: 2, 7, 12, 17, 19, 21, 34
Step 3  Median: Me = value of the $\frac{n+1}{2}$th observation
        = value of the $\frac{7+1}{2}$th observation
        = value of the 4th observation = 17
Step 4  Median age of the family is 17 years
Example:
The ages of a family of eight members are given as 12, 7, 2, 34, 17, 40, 21 and 19. Find the median
age.
Step 1  Count the total number of elements, n = ? Here n = 8; 8 is an even number.
Step 2  Arrange the values in ascending order: 2, 7, 12, 17, 19, 21, 34, 40
Step 3  Median: Me = AM of the values of the $\frac{n}{2}$th and $(\frac{n}{2} + 1)$th observations
        = AM of the values of the _ _ _ and _ _ _ observations
        = $\frac{\_\_ + \_\_}{2}$ = ?
Step 4  Median age of the family is ? ? ? years
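The two cases (odd and even n) can be sketched in one small Python function, which may be used to check the blanks above once they have been filled in by hand. The function name `median` and the variable names are illustrative.

```python
def median(values):
    """Median of raw data: the middle value for odd n, or the arithmetic
    mean of the two middle values for even n."""
    ordered = sorted(values)          # Step 2: arrange in ascending order
    n = len(ordered)                  # Step 1: count the observations
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # the (n + 1) / 2 th observation
    # Mean of the (n / 2)th and (n / 2 + 1)th observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


ages_seven = [2, 7, 12, 17, 19, 21, 34]        # seven-member family
ages_eight = [12, 7, 2, 34, 17, 40, 21, 19]    # eight-member family

median_seven = median(ages_seven)
median_eight = median(ages_eight)
print(median_seven, median_eight)
```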
Formula for Grouped Data
Me is given by the formula,
$Me = L_0 + \frac{\frac{n}{2} - F_{-Me}}{f_{Me}} \times W_{Me}$
Where
Me = Median
$L_0$ = Lower limit of the median class
$F_{-Me}$ = Cumulative frequency of the pre-median class
$f_{Me}$ = Frequency of the median class
$W_{Me}$ = Width of the median class
n = Total number of observations
The MEDIAN CLASS is the class that contains the $\frac{n}{2}$th observation of the given data.
Example: Table 1.6 displays summary information on the parents of 50 students. Compute the median
income of the parents.

Table 1.6: Income distribution of the students of ECO 202
Income of parent (in thousand taka)   Frequency
Below 20                              3
20 - 40                               4
40 - 60                               6
60 - 80                               8
80 - 100                              12
100 - 120                             10
120 and over                          7
Total                                 50

Hints:
Step 1: Compute the cumulative frequencies.
Step 2: Determine $\frac{n}{2}$, one half of the total number of cases.
Step 3: Locate the median class.
Step 5: Sum the frequencies of all the classes prior to the median class. This is $F_{-Me}$.
You now have all the quantities needed to compute the median, so compute the median. …
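The hints can be followed step by step in code. The sketch below applies the grouped-median formula to Table 1.6. Two assumptions are made only to give the open-ended classes numeric limits: "Below 20" is treated as 0 - 20 and "120 and over" as 120 - 140. Neither affects the result here, because the median class is an interior class. The name `grouped_median` is illustrative.

```python
# (lower limit, upper limit, frequency), income in thousand taka.
# The limits 0 and 140 for the open-ended classes are assumptions.
income_classes = [
    (0, 20, 3), (20, 40, 4), (40, 60, 6), (60, 80, 8),
    (80, 100, 12), (100, 120, 10), (120, 140, 7),
]


def grouped_median(classes):
    """Median of grouped data: L0 + ((n/2 - F) / f) * W, where F is the
    cumulative frequency before the median class."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0                       # cumulative frequency so far
    for lower, upper, f in classes:
        if cumulative + f >= half:       # this class contains the n/2-th case
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("no observations")


median_income = grouped_median(income_classes)
print(round(median_income, 2))
```

Here n/2 = 25, the cumulative frequencies are 3, 7, 13, 21, 33, ..., so the median class is 80 - 100 and the median is 80 + (25 - 21) / 12 × 20, roughly 86.67 thousand taka.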
Exercise:
1. The following table gives the data pertaining to kilowatt hours of electricity consumed by 100
randomly selected flat owners of Japan garden city.
Consumption (in K-watt hours)   0-100   100-200   200-300   300-400   400-500
No. of users                    6       25        36        20        13
Calculate
i. Mean consumption of electricity
ii. Median use of electricity
iii. Standard deviation of electricity consumption
iv. Skewness of electric consumption.
2. The following data represent the amount (in thousand taka) of loan required by people of
two different upazillas. Using the median, comment on which upazilla has the greater average
demand for loans.
Upazilla 1 42 12 26 18 9 35 28 39 8
Upazilla 2 8 15 10 18 22 20 26 42 35
$Mo = L_0 + \frac{f_0 - f_{-1}}{(f_0 - f_{-1}) + (f_0 - f_1)} \times W$
Where, Mo = Mode
$L_0$ = Lower limit of the modal class
$f_0$ = Frequency of the modal class
$f_{-1}$ = Frequency of the pre-modal class
$f_1$ = Frequency of the post-modal class
W = Width of the modal class
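As an illustration of the formula, the sketch below computes the mode of the electricity consumption distribution given in the median exercises (frequencies 6, 25, 36, 20, 13 over classes of width 100). The function name `grouped_mode` is illustrative, and treating a missing neighbouring class as having frequency zero is an assumption of this sketch, not something stated in the lecture.

```python
# (lower limit, upper limit, frequency), consumption in kilowatt hours.
consumption_classes = [
    (0, 100, 6), (100, 200, 25), (200, 300, 36), (300, 400, 20), (400, 500, 13),
]


def grouped_mode(classes):
    """Mode of grouped data:
    L0 + (f0 - f_prev) / ((f0 - f_prev) + (f0 - f_next)) * W."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))              # position of the modal class
    lower, upper, f0 = classes[i]
    # Assumption: a neighbour that does not exist counts as frequency 0.
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0
    return lower + (f0 - f_prev) / ((f0 - f_prev) + (f0 - f_next)) * (upper - lower)


mode_consumption = grouped_mode(consumption_classes)
print(round(mode_consumption, 2))
```

The modal class is 200 - 300, so the mode is 200 + 11 / (11 + 16) × 100, roughly 240.74 kilowatt hours.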
Exercise:
1. The frequency distribution below represents the weights in pounds of a sample of packages carried
last month by a small airfreight company.
Class Frequency Class Frequency
10.0 – 11.0 1 15.0 – 16.0 11
11.0 – 12.0 4 16.0 – 17.0 8
12.0 – 13.0 6 17.0 – 18.0 7
13.0 – 14.0 8 18.0 – 19.0 6
14.0 – 15.0 12 19.0 – 20.0 2
Find the mean, median and mode.
2. Suppose that 100 students are enrolled in a statistics class and the following are the test scores
received by them:
77 44 49 33 38 76 68 68 39 44
29 41 32 45 83 58 73 47 40 26
34 47 66 53 55 58 49 45 61 41
54 50 51 66 80 73 57 61 56 50
38 45 51 44 41 68 45 92 43 12
59 36 55 47 61 53 32 65 51 33
59 55 43 66 44 41 25 39 72 37
55 92 83 77 45 62 45 36 78 48
45 82 71 48 46 69 38 72 56 64
37 16 44 57 63 71 40 64 57 51
3. The following data set represents the record high temperatures in degree Fahrenheit (℉ ) for each
of the 50 US states:
112 100 117 106 114 118 105 110 109 112
110 118 117 116 118 112 114 114 105 109
116 112 114 115 118 117 118 92 106 110
88 108 110 121 113 120 119 111 104 111
107 113 98 117 105 110 118 112 114 114
i. Construct a suitable frequency distribution table using interval 85 – 95, 95 – 105 and so
on.
ii. Determine the modal temperature.
iii. Determine the proportion of states having temperature that is more than modal
temperature.
4. The data given represent the ages of patients admitted to a small hospital on February 28, 2004.
85 75 66 43 40 41 88 80
56 56 67 69 89 83 65 53
75 74 87 83 52 44 48 49
i. Construct a frequency distribution table.
ii. Compute the sample mean, median and mode from the frequency distribution table.
iii. Compute the sample mean, median and mode from the raw data.
8.3 9.6 9.5 9.1 8.8 11.2 7.7 10.1 9.9 10.8
10.2 8.0 8.4 8.1 11.6 9.6 8.8 8.0 10.4 9.8
9.2 6.5 8.9 7.4 12.5 13.8 8.6 11.2 10.5 11.2
Organize this information into a stem-leaf display. Hence answer the following
a. How many rates are less than 9.0?
6. 168 handloom factories have the following distribution of average number of workers in various
income groups:
Income Groups:               800 - 1000   1000 - 1200   1200 - 1400   1400 - 1600   1600 - 1800
Number of firms:             40           32            26            28            42
Average Number of Workers:   8            12            8             8             4
Find the mean salary paid to the workers.
Answer: 1228.84
J K Sharma, 91
7. A class of 50 students sits for a class test. The following table gives result of the students who
passed the examination:
Marks: 40 50 60 70 80 90
Number of Students: 8 10 9 6 4 3
If the mean marks for all the students were 51.6, find out the mean marks of the students who
failed.
Answer: 21Marks
J K Sharma, 93
8. The average dividend declared by a group of 10 chemical companies was 18 percent. Later on it
was discovered that one correct figure, 12, was misread as 22. Find the correct average dividend.
Answer: 17 percent
J K Sharma, 93
9. A company wants to pay bonus to members of the staff. The following “Table 1” demonstrates the
amount to be paid as bonus and” table 2” represents the actual amount of salary drawn by the
employees of that company:
Table 1: Monthly Bonus Policy
Monthly salary (in tk.)   Bonus
3000 - 4000               1000
4000 - 5000               1200
5000 - 6000               1400
6000 - 7000               1600
7000 - 8000               1800
8000 - 9000               2200
9000 - 10000              2300
10000 - 11000             2400

Table 2: Monthly Salary
3250    3780    4200    4550    6600
6200    6800    7250    3630    8320
9420    9520    8000    10020   10280
11000   6100    6250    7630    3820
5400    4630    5780    7230    6900
J K Sharma, 93
11. There are two units of a garment in two different cities employing 760 and 800 persons,
respectively. The arithmetic means of monthly salaries paid to persons in these two units are tk
18750 and tk. 16950 respectively. Find the combined arithmetic mean of salaries of the
employees in both the units.
Answer: tk. 17827 (appx.)
J K Sharma, 96
12. An investor buys Tk. 12000 worth of shares of a company each month. During the first 5 months
he bought the shares at a price of tk. 100, tk. 120, tk. 150, Tk. 200 and tk. 240 per share
respectively. After 5 months what is the average price paid for the shares by the investor.
Answer: tk. 146.34 (appx.)
J K Sharma, 99
13. The mean yearly salary paid to all employees in a company is tk. 2400000. The mean yearly
salaries paid to male and female employees are tk. 2500000 and tk. 1900000 respectively.
Determine the percentages of male and female employees in the company.
Answer: Male 83.33% and Female 16.67%
J K Sharma, 97
14. The mean monthly salary paid to 100 employees of a company was Tk. 5000. The mean
monthly salaries paid to male and female employees were Tk. 5200 and Tk. 4200 respectively.
Determine the percentage of males and females employed by the company.
Answer: Male 80% and Female 20%
J K Sharma,127
15. A charitable organization decided to give old-age pension to people over sixty years of age. The
scales of pension were fixed as follows (see Table 1) and the ages of persons who secured the
pension are given in Table 2:
Table 1: Pension policy
Age Group    Pension/Month
60 – 65      200
65 – 70      250
70 – 75      300
75 – 80      350
80 – 85      400
Table 2: Ages of persons who secured the pension
74    76    60    83    67
71    84    68    74    81
75    61    61    66    79
62    69    67    72    64
63    72    78    64    73
Determine:
i. How much money would the organization need to pay by way of pension?
ii. What shall be the average pension payable per person and the standard deviation?
16. In 2014, a person spends Tk. 1800 monthly on average for the first four months and Tk. 2000
monthly for the next eight months, and saves Tk. 5600 in that year. Determine the person's
average monthly income.
17. The average of 11 results is 60. If the average of first 6 results is 58 and that of the last six is 63,
find the sixth result.
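Several of the exercises above (11, 12 and 14) rest on three small formulas: the combined mean of groups, the harmonic mean for equal amounts spent at different prices, and the share of a group recovered from the group means. The sketch below is only an illustration of those formulas; the function names are hypothetical and not part of the course material.

```python
def combined_mean(sizes, means):
    """Combined arithmetic mean of several groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def harmonic_mean(values):
    """Harmonic mean: number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1 / v for v in values)


def share_of_first_group(overall_mean, mean_1, mean_2):
    """Proportion p of group 1 that solves p*mean_1 + (1 - p)*mean_2 = overall_mean."""
    return (overall_mean - mean_2) / (mean_1 - mean_2)


# Exercise 11: two units with 760 and 800 employees.
print(round(combined_mean([760, 800], [18750, 16950])))        # 17827
# Exercise 12: the same amount is spent each month, so the harmonic mean applies.
print(round(harmonic_mean([100, 120, 150, 200, 240]), 2))       # 146.34
# Exercise 14: percentage of male employees.
print(round(share_of_first_group(5000, 5200, 4200) * 100, 2))   # 80.0
```

The harmonic mean is used in exercise 12 because the investor spends a fixed amount each month, so more shares are bought when the price is low.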
To explain: Suppose Shumi's Hot Cake offers three different kinds of burger packages, small,
medium and large, for Tk. 100, Tk. 125 and Tk. 150. Of the last 10 burger packages sold, 3 were small,
4 were medium and 3 were large. To find the mean price of the last 10 burger packages sold we can
use the usual formula of the arithmetic mean as follows:
$$\bar{X} = \frac{100+100+100+125+125+125+125+150+150+150}{10} = \frac{1250}{10} = 125$$
The mean selling price of the last 10 burger packages sold is Tk. 125.
An easier way to find the mean selling price is to determine the weighted mean. In this method we
multiply each observation by the number of times it occurs, as described below:
$$\bar{X}_w = \frac{3(100) + 4(125) + 3(150)}{3 + 4 + 3} = \frac{1250}{10} = 125$$
In this case the weights are frequency counts. However, any measure of importance could be used
as a weight. In general, the weighted mean of $X_1, X_2, \ldots, X_n$ with weights $W_1, W_2, \ldots, W_n$ is
$$\bar{X}_w = \frac{\sum (WX)}{\sum W} = \frac{W_1 X_1 + W_2 X_2 + \ldots + W_n X_n}{W_1 + W_2 + \ldots + W_n}$$
Example:
Madina Construction Company pays its part-time employees on an hourly basis. For the different levels
of employee the hourly rates are Tk. 50, Tk. 75 and Tk. 90. There are 260 hourly employees, 140 of
whom are paid at the Tk. 50 rate, 100 at the Tk. 75 rate and 20 at the Tk. 90 rate. What is the mean
hourly rate paid to the employees?
Answer:
To find the mean hourly rate, we multiply each of the hourly rates by the number of employees
earning that rate as follows:
$$\bar{X}_w = \frac{\sum (WX)}{\sum W} = \frac{140(50) + 100(75) + 20(90)}{140 + 100 + 20} = \frac{16300}{260} = \text{Tk. } 62.69$$
The weighted mean hourly wage is Tk. 62.69 or Tk. 63.00 (approximately).
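The weighted mean can be checked with a few lines of code. This is a minimal sketch; the function name weighted_mean is an assumption made for illustration, and the data are the two examples given above.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(W * X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Burger packages: prices weighted by the number of packages sold.
print(weighted_mean([100, 125, 150], [3, 4, 3]))               # 125.0
# Madina Construction Company: hourly rates weighted by number of employees.
print(round(weighted_mean([50, 75, 90], [140, 100, 20]), 2))   # 62.69
```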
Quartile:
If the items in a series are arranged in ascending order of their magnitudes, then those values of the
variable that divide the total frequency into four equal parts are called quartiles. For the first
quartile $Q_1$, one fourth of the observations is less than $Q_1$ and three fourths are greater than $Q_1$.
Problem:
For the following data compute the three quartiles.
99 75 84 33 45 66 97 69 55 61
72 91 74 93 54 76 62 91 77 68
Answer:
Arrange the data in ascending order:
33 45 54 55 61 62 66 68 69 72
74 75 76 77 84 91 91 93 97 99
Hints:
First find the median ($Q_2$). Here n = 20, so
Median ($Q_2$) = AM of the values of the $(n/2)$th and $(n/2 + 1)$th observations
= AM of the values of the 10th and 11th observations
= $\frac{72 + 74}{2} = 73$
1st quartile, $Q_1$ = median of the 1st half of observations = ? ? ?
3rd quartile, $Q_3$ = median of the 2nd half of observations = ? ? ?
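The hint above (median first, then the median of each half) can be sketched in code. The function names are hypothetical, and for an odd number of observations this sketch assumes the middle value is left out of both halves, which is one convention among several.

```python
def median(sorted_values):
    """Median of a list that is already sorted in ascending order."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]
    # Assumption: with an odd count the middle value belongs to neither half.
    upper = values[half:] if len(values) % 2 == 0 else values[half + 1:]
    return median(lower), median(values), median(upper)


marks = [99, 75, 84, 33, 45, 66, 97, 69, 55, 61,
         72, 91, 74, 93, 54, 76, 62, 91, 77, 68]
q1, q2, q3 = quartiles(marks)
print(q2)  # 73.0, the median found in the hint
```

Printing q1 and q3 as well gives a way to check the two values left as an exercise.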
Arithmetic mean
  Merits:
    1. Rigidly defined.
    2. Easy to understand and calculate.
  Demerits:
    1. Cannot be defined graphically.
    2. Cannot be used in case of qualitative ...
Median
  Merits:
    ... values.
    4. Can be calculated in the case of the data with open-end class.
    5. Can be defined graphically.
  Demerits:
    3. Not easy for algebraic treatment.
    4. For calculating median it is necessary to arrange the data in either ascending or
       descending order.
Mode
  Merits:
    ... of a distribution.
    2. Not at all affected by extreme values.
    3. Can be calculated in the case of the data ...
  Demerits:
    ... bimodal or multimodal distribution.
    2. Not based on all observations.
    3. Not suitable for further algebraic ...
data that the first group consists of students of near-average intelligence and the 2nd group is made up
of very bright and very dull students. It is evident that the distributions of both groups have the same
AM, but they differ in variation from X̄; such variation is measured by a measure of dispersion.
On the other hand, it is often necessary to compare the dispersion of two or more frequency
distributions whose variables are expressed in different units. In such a case a relative measure is
calculated by dividing the absolute measure of dispersion by a measure of central tendency. The
resultant numerical value is a relative measure of dispersion.
Different types of absolute and relative measures of dispersion are listed below:
Range and Coefficient of Range:
The range is the difference between the largest and the smallest values in a data set. If
$X_l$ and $X_s$ are the largest and the smallest values respectively in a set, then the range "R" is
defined as
$$R = X_l - X_s$$
For grouped data the range is taken either as the difference between the upper boundary of the last
class and the lower boundary of the first class or as the difference between the highest and the
lowest mid-values.
The coefficient of dispersion corresponding to the range is called the coefficient of range and it is
obtained by
$$\text{Coefficient of range} = \frac{X_l - X_s}{X_l + X_s}$$
where $X_l$ = largest value and $X_s$ = smallest value.
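As a small illustration of the two formulas, the sketch below computes the range and the coefficient of range for ungrouped data. The function names are assumptions made here, and the data are the 20 values used in the quartile problem earlier.

```python
def value_range(data):
    """Range R = largest value - smallest value."""
    return max(data) - min(data)


def coefficient_of_range(data):
    """Coefficient of range = (largest - smallest) / (largest + smallest)."""
    largest, smallest = max(data), min(data)
    return (largest - smallest) / (largest + smallest)


marks = [99, 75, 84, 33, 45, 66, 97, 69, 55, 61,
         72, 91, 74, 93, 54, 76, 62, 91, 77, 68]
print(value_range(marks))           # 66  (99 - 33)
print(coefficient_of_range(marks))  # 0.5 (66 / 132)
```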
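Quartile Deviation and Coefficient of Quartile Deviation:
Quartiles divide the observations into four equal parts when the observations are arranged in order
of magnitude. Therefore $Q_2 - Q_1$ and $Q_3 - Q_2$ give us some measure of dispersion. The AM of
these two is the quartile deviation (QD):
$$QD = \frac{(Q_2 - Q_1) + (Q_3 - Q_2)}{2} = \frac{Q_3 - Q_1}{2}$$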
$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
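The quartile deviation, taken here as half the distance between the third and the first quartile, and its coefficient can be sketched as follows. The function names and the quartile values Q1 = 15 and Q3 = 22 are illustrative assumptions.

```python
def quartile_deviation(q1, q3):
    """Quartile deviation: half the distance between Q3 and Q1."""
    return (q3 - q1) / 2


def coefficient_of_qd(q1, q3):
    """Coefficient of QD = (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


# Illustrative quartiles only.
print(quartile_deviation(15, 22))           # 3.5
print(round(coefficient_of_qd(15, 22), 4))  # 0.1892
```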
If $X_1, X_2, \ldots, X_N$ denote the values of N observations then the mean deviation about an average
(or measure of central tendency) A is defined as
$$MD = \frac{1}{N} \sum |X_i - A| = \frac{1}{N} \sum |D_i|$$
where $D_i = X_i - A$.
In case of a frequency distribution
$$MD = \frac{1}{N} \sum f_i |X_i - A| = \frac{1}{N} \sum f_i |D_i|$$
where $N = \sum f_i$.
The coefficient of dispersion corresponding to mean deviation is known as coefficient of mean
deviation and is obtained by dividing mean deviation by the particular average used in computing
mean deviation.
That is, coefficient of mean deviation, $\text{Co. MD} = \frac{MD}{\text{Particular Average}}$.
Thus if the mean deviation has been computed from the AM, then $\text{Co. MD} = \frac{MD}{AM}$.
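A minimal sketch of the mean deviation and its coefficient is given below. The function name, the optional freqs argument and the five data values are assumptions chosen for illustration; passing frequencies gives the frequency-distribution form of the formula.

```python
def mean_deviation(values, about, freqs=None):
    """Mean deviation about the average `about`: sum(f * |X - A|) / sum(f)."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    n = sum(freqs)
    return sum(f * abs(x - about) for x, f in zip(values, freqs)) / n


data = [2, 4, 6, 8, 10]            # illustrative values
am = sum(data) / len(data)         # arithmetic mean = 6.0
md = mean_deviation(data, about=am)
print(md)                          # 2.4
print(round(md / am, 4))           # 0.4, the coefficient of mean deviation
```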
Population Variance:
The formula for computing the variance of a set of population observations is given below:
Case 1:
If $X_1, X_2, \ldots, X_N$ are the N values of a population of size N, then the population variance,
commonly designated as $\sigma^2$, is defined as
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
where $\mu$ = mean of the distribution.
Problem:
Suppose the 10 students of a population obtained the marks given below in an examination. Find
the variance of the given data.
13 15 14 16 2 8 9 23 28 12
Answer:
For the required solution please complete the following steps and table:
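Case 2:
If $X_1, X_2, \ldots, X_k$ occur with frequencies $f_1, f_2, \ldots, f_k$ respectively, then the variance
of the distribution will be
$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N}$$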
Problem:
Suppose the 40 students of a population obtained the marks given in the table below in an
examination. Find the variance of the given data.
Xi    15    20    25    30    35
fi     6     8    15     7     4
Answer:
For the required solution please complete the following steps and table:
$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N} = ?$$
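The population variance formula, with and without frequencies, can be sketched in one small function. The name population_variance and its freqs argument are assumptions made for this illustration; the data are those of the two problems above, and running the code gives values against which the completed tables can be checked.

```python
def population_variance(values, freqs=None):
    """Population variance: sum(f * (X - mu)^2) / N, with N = sum(f)."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    n = sum(freqs)
    mu = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mu) ** 2 for x, f in zip(values, freqs)) / n


# Ungrouped data: marks of the population of 10 students.
marks = [13, 15, 14, 16, 2, 8, 9, 23, 28, 12]
print(population_variance(marks))

# Frequency distribution: marks of the population of 40 students.
x = [15, 20, 25, 30, 35]
f = [6, 8, 15, 7, 4]
print(population_variance(x, f))
```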
Sample Variance:
The formula for computing the variance of a set of sample observations is given below:
Case 1:
If $X_1, X_2, \ldots, X_n$ are the n values of a sample of size n, then the sample variance, commonly
designated as $s^2$, is defined as
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{x})^2}{n - 1}$$
where $\bar{x}$ = sample mean of the distribution.
Problem:
Suppose a sample of 10 students obtained the marks given below in an examination. Find the
variance of the given data.
13 15 14 16 2 8 9 23 28 12
Answer:
For the required solution please complete the following steps and table:
Step 1: First find the AM of the given values. Sample mean (AM), $\bar{x}$ = ? ?
Step 2: Then complete the following table:
Xi    (Xi − x̄)    (Xi − x̄)²
13
15
14
16
2
8
9
23
28
12
Problem:
Suppose a sample of 40 students obtained the marks given in the table below in an examination.
Find the variance of the given data.
Xi    15    20    25    30    35
fi     6     8    15     7     4
Answer: For the required solution please complete the following steps and table:
$$s^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{x})^2}{n - 1} = ?$$
where $n = \sum_{i=1}^{k} f_i$.
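The sample variance differs from the population variance only in the divisor n - 1. The sketch below uses a hypothetical function name, sample_variance, and cross-checks it against statistics.variance from the Python standard library, which also divides by n - 1.

```python
import math
import statistics


def sample_variance(values, freqs=None):
    """Sample variance: sum(f * (X - x_bar)^2) / (n - 1), with n = sum(f)."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / (n - 1)


marks = [13, 15, 14, 16, 2, 8, 9, 23, 28, 12]
x = [15, 20, 25, 30, 35]
f = [6, 8, 15, 7, 4]

# Expand the frequency table into raw observations for the cross-check.
expanded = [value for value, count in zip(x, f) for _ in range(count)]

print(math.isclose(sample_variance(marks), statistics.variance(marks)))    # True
print(math.isclose(sample_variance(x, f), statistics.variance(expanded)))  # True
```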
Standard deviation:
The standard deviation of a given data is obtained by taking the square root of the corresponding
variance value.
That is, coefficient of variation $(CV) = \frac{SD(X)}{AM(X)}$.
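The two definitions above can be sketched as follows: the standard deviation as the square root of the variance, and the coefficient of variation as SD divided by AM. The function names are hypothetical, the population form of the variance is assumed, and the eight data values are purely illustrative.

```python
import math


def population_sd(data):
    """Standard deviation as the square root of the population variance."""
    n = len(data)
    mu = sum(data) / n
    variance = sum((x - mu) ** 2 for x in data) / n
    return math.sqrt(variance)


def coefficient_of_variation(data):
    """CV = SD / AM (often also quoted as a percentage)."""
    return population_sd(data) / (sum(data) / len(data))


values = [2, 4, 4, 4, 5, 5, 7, 9]        # illustrative data, mean 5, variance 4
print(population_sd(values))             # 2.0
print(coefficient_of_variation(values))  # 0.4
```

Because the CV is a ratio without units, it can be compared across variables measured in different units.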
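Population (ungrouped data):
$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right]$$
Population (frequency distribution):
$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{N}\right]$$
Sample (ungrouped data):
$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
Sample (frequency distribution):
$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right]$$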
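$$\sigma_{12} = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$$
Where,
$\sigma_{12}$ = combined standard deviation of the two groups,
$n_1, n_2$ = sizes and $\sigma_1^2, \sigma_2^2$ = variances of the two groups,
$d_1 = \bar{x}_1 - \bar{x}_{12}$ and $d_2 = \bar{x}_2 - \bar{x}_{12}$, with combined mean $\bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$.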
Question:
From the analysis of monthly wages paid to employees in two service organizations X and Y, the
following results were obtained:
                                         Organization X    Organization Y
Number of wage-earners                   550               650
Average monthly wages                    5000              4500
Variance of the distribution of wages    900               1600
a. Which organization pays a larger amount as monthly wages?
b. Determine the combined variance of all the employees taken together?
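Part (b) applies the combined variance formula. The sketch below takes each d as the difference between a group mean and the combined mean, which is the usual reading of that formula; the function name and argument order are assumptions made here.

```python
import math


def combined_mean_and_variance(n1, mean1, var1, n2, mean2, var2):
    """Combined mean and variance of two groups from their summary figures."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean  # deviation of group 1 mean from combined mean
    d2 = mean2 - combined_mean  # deviation of group 2 mean from combined mean
    combined_var = (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / (n1 + n2)
    return combined_mean, combined_var


# Organizations X and Y from the table above.
mean_xy, var_xy = combined_mean_and_variance(550, 5000, 900, 650, 4500, 1600)
print(mean_xy, var_xy, math.sqrt(var_xy))
```

For part (a), the total wage bill of each organization is the number of wage-earners multiplied by the average monthly wage.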
Question:
For a group of 50 male workers, the mean and standard deviation of their monthly wages are Tk. 6300 and
Tk. 900 respectively. For a group of 40 female workers, these are Tk. 5400 and Tk. 600 respectively. Find
the standard deviation of monthly wages for the combined group of workers.
Answer: tk. 900
J K Sharma, 151
Problem:
The following data give the number of passengers travelling by airplane from one city to another in
one week.
115 122 129 113 119 124 132 120 110 116
Calculate the mean and standard deviation and determine the percentage of cases that lie between
$\mu \pm \sigma$ and $\mu \pm 2\sigma$.
Solution:
The calculations for the mean and standard deviation are given in the following table:
x      x − μ    (x − μ)²
115
122
129
…
…
110
116
$$\mu = \frac{\sum x}{N} = \frac{????}{??} = 120 \quad \text{and} \quad \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \; ??? \; = 43.6$$
Therefore, $\sigma = \sqrt{\sigma^2} = \sqrt{43.6} = 6.60$
The percentages of cases that lie within the given limits are as follows:
Interval | Values within interval | Percentage of population within interval | Percentage falling outside
μ ± σ = 120 ± 6.60 = 113.40 and 126.60 | 115, 116, 119, 120, 122, 124 | 60% | 40%
μ ± 2σ = 120 ± 2(6.60) = 106.80 and 133.20 | 110, 113, 115, 116, 119, 120, 122, 124, 129, 132 | 100% | Nil
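The counting behind such a table can be reproduced with a short sketch. The function name share_within is an assumption made here; it treats the ten passenger counts as a population and counts the values that fall inside the interval from μ - kσ to μ + kσ.

```python
import math


def share_within(data, k):
    """Values inside mu ± k*sigma and their percentage of all observations."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)  # population SD
    lower, upper = mu - k * sigma, mu + k * sigma
    inside = sorted(x for x in data if lower <= x <= upper)
    return inside, 100 * len(inside) / n


passengers = [115, 122, 129, 113, 119, 124, 132, 120, 110, 116]
for k in (1, 2):
    inside, percent = share_within(passengers, k)
    print(f"mu ± {k} sigma: {inside} -> {percent}% inside, {100 - percent}% outside")
```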
1. An advertising company is looking for a group of extras to shoot a sequence for a movie. The ages
of the first 20 candidates to be interviewed are
50 56 44 49 52 57 56 57 56 59
54 55 61 60 51 59 62 52 54 49
The director of the movie wants men whose ages are tightly grouped around 55 years. Being a
statistics buff of sorts, the director suggests that a standard deviation of 3 years would be
acceptable. Does this group of extras qualify?
2. The normal daily high temperatures (in degrees Fahrenheit) in January for 10 selected cities are as
follows.
50, 37, 29, 54, 30, 61, 47, 38, 34, 61
The normal monthly precipitation (in inches) for these same 10 cities is listed below:
4.8, 2.6, 1.5, 1.8, 1.8, 3.3, 5.1, 1.1, 1.8, 2.5
Which variable represents greater relative variability?
3. A collar manufacturer is considering the production of new collars to attract young men. The
following statistics of neck circumference are available, based on measurements of a typical group
of students of a particular university:
Mid values (in inches): 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0
Number of students: 2 16 36 60 76 37 18 3 2
Compute the standard deviation and use the criterion $\bar{x} \pm 3\sigma$, where $\sigma$ is the standard
deviation and $\bar{x}$ is the arithmetic mean, to determine the largest and smallest size of the collar he
should make in order to meet the needs of practically all the customers, bearing in mind that collars
are worn on average half an inch longer than the neck size.
Answer: 12.2 and 16.4 inches
J K Sharma, 155
4. ANIK Electronics is considering employing one of two training programs. Two groups
were trained for the same task. Group 1 was trained by program A, group 2 by program B.
For the first group, the times required to train the employees had an average of 32.11
hours and a variance of 68.09. In the second group, the average was 19.75 and the
variance was 71.14. Which training program has less relative variability in its
performance?
5. The administrator of a Georgia hospital surveyed the number of days 200 randomly
chosen patients stayed in the hospital following an operation. The data are:
Hospital Stay in days Number of patients
1–3 18
4–6 90
7–9 44
10 – 12 21
13 – 15 9
16 – 18 9
19 – 21 4
22 – 24 5
6. The manager of Nando's Chicken has just received two dozen tomatoes from her supplier,
but she is not ready to accept them. She knows from the invoice that the average weight
is 7.5 ounces, but she insists that all be of uniform weight. She will accept them only if the
average weight is 7.5 ounces and the standard deviation is less than 0.5 ounce. Here are
the weights of the tomatoes.
6.3 7.2 7.3 8.1 7.8 6.8 7.5 7.8
7.2 7.5 8.1 8.2 8.0 7.4 7.6 7.7
7.6 7.4 7.5 8.4 7.4 7.6 6.2 7.4
What would be the manager’s decision and why?
7. Students' ages in the regular daytime MBA program and the evening program of BRAC
University are described by these two samples:
Regular MBA 23 29 27 22 24 21 25 27 24 26
Evening MBA 27 34 30 29 28 30 34 35 28 29
If homogeneity of the class is a positive factor in learning, use a measure of relative variability to
suggest which of the two groups will be easier to teach?
8. In two factories A and B engaged in the same industry, the average monthly wages and
standard deviations are as follows:
Factory    Average monthly wages (Tk.)    S.D. of wages (Tk.)    No. of wage earners
A          4600                           500                    100
B          4900                           400                    80
Determine
i. Which factory A or B pays larger amount as monthly wages?
ii. Which factory shows greater variability in the distribution of wages?
iii. What is the mean wage of all workers in two factories taken together?
Skewness:
The term skewness means the lack of symmetry. The skewness may be either positive or negative.
When the skewness is positive the associated distribution is called positively skewed. When the
skewness is negative the associated distribution is negatively skewed.
$$Sk_p = \frac{\text{Mean} - \text{Mode}}{SD} = \frac{3(\text{Mean} - \text{Median})}{SD}$$
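The two forms of the coefficient can be sketched as below. The function names and the summary figures (mean 50, median 47, mode 41, SD 6) are hypothetical; they were chosen so that mode = 3(median) - 2(mean), the relation under which the two forms give the same value.

```python
def skewness_from_mode(mean, mode, sd):
    """Coefficient of skewness using the mode: (Mean - Mode) / SD."""
    return (mean - mode) / sd


def skewness_from_median(mean, median, sd):
    """Coefficient of skewness using the median: 3 * (Mean - Median) / SD."""
    return 3 * (mean - median) / sd


# Hypothetical summary figures, for illustration only.
print(skewness_from_mode(50, 41, 6))     # 1.5
print(skewness_from_median(50, 47, 6))   # 1.5
```

A positive value indicates a positively skewed distribution, and a negative value a negatively skewed one.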
Kurtosis:
There is considerable variation among symmetrical distributions. For instance, they can differ
markedly in terms of peakedness. This is what we call kurtosis. Kurtosis, as defined by Spiegel
(Spiegel: Theory and Problems of Statistics), is the degree of peakedness of a distribution, usually
taken in relation to a normal distribution.
A curve having relatively higher peak than the normal curve, is known as leptokurtic.
A curve, which is neither too peaked nor too flat topped, is known as mesokurtic.
A curve that is more flat topped than the normal curve is called platykurtic.
Question:
If for a distribution Mean = 18, Median = 32 and Mode = 36, then the distribution is
____________ skewed.
a. Positively    b. Symmetrically    c. None    d. Negatively
Merits Demerits
Range
  ...
Quartile deviation
  Merits:
    ... dispersion.
    It is applicable in open-end class.
    Easy to understand and compute.
    It is ...
  Demerits:
    ... 25% and last 25% of observations.
    Very much affected by sampling fluctuations.
Mean deviation
  Merits:
    It considers all observations.
    Less affected by extreme values.
  Demerits:
    ... treatment.
    The greatest drawback of this method ...
Variance
  Merits:
    Less affected by sampling fluctuations.
    Suitable for further algebraic treatment.
  Demerits:
    ... class.
Box Plot:
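A box plot is a graphic display that shows the general shape of a variable's distribution. It is based
on five statistics: the minimum value, $Q_1$ (the first quartile), the median, $Q_3$ (the third quartile) and
the maximum value.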
Example:
Pizza Hut offers free delivery of its pizza within 15 miles. Mr. Rahman the owner wants some
information on the time it takes for delivery. How long does a typical delivery take? Within what
range of times will most deliveries be completed? For a sample of 20 deliveries, he determined the
following information:
Minimum value = 13 minutes
Q1 = 15 minutes
Median = 18 minutes
Q3= 22 minutes
Maximum value = 30 minutes
Develop a box plot for the delivery times. What conclusions can you make about the
delivery times?
Solution:
In order to draw box plot follow the steps mentioned below:
Step 1: Create an appropriate scale along the horizontal axis.
2 These horizontal lines outside of the box are sometimes called "whiskers" because they look a bit like a
cat's whiskers.
3 The inter quartile range is the distance between the first and the third quartile.
minutes is longer than the dashed line from the left of 15 minutes ($Q_1$) to the minimum
value of 13 minutes.
– The median is not in the center of the box. The distance from the first quartile to the
median is smaller than the distance from the median to the third quartile.
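The two observations about the delivery times follow from simple differences between the five statistics. The sketch below, with a hypothetical function name, computes the box width (the inter quartile range), the two whisker lengths and the position of the median inside the box for the Pizza Hut figures.

```python
def box_plot_summary(minimum, q1, median, q3, maximum):
    """Lengths that determine the look of a box plot."""
    return {
        "iqr": q3 - q1,                  # width of the box
        "left_whisker": q1 - minimum,    # from the minimum to Q1
        "right_whisker": maximum - q3,   # from Q3 to the maximum
        "q1_to_median": median - q1,
        "median_to_q3": q3 - median,
    }


summary = box_plot_summary(13, 15, 18, 22, 30)
print(summary)
# {'iqr': 7, 'left_whisker': 2, 'right_whisker': 8, 'q1_to_median': 3, 'median_to_q3': 4}

right_tail_longer = summary["right_whisker"] > summary["left_whisker"]
median_nearer_q1 = summary["q1_to_median"] < summary["median_to_q3"]
print(right_tail_longer and median_nearer_q1)  # True: both signs suggest positive skewness
```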
Question:
Construct a box plot for the data given below and hence comment on the skewness of the
distribution:
99 75 84 33 45 66 97 69 55 61
72 91 74 93 54 76 62 91 77 68
Miscellaneous Exercise
Question 1:
Average mark obtained by 15 students was 10 and the average mark obtained by 10 students was 15.
What was the average mark obtained by all students?
a. 10 b. 8 c. 12 d. 15 e. 11
Answer: c
Question 2:
Study the following histogram and hence determine the modal class and what proportion of students get
marks below 80.
[Figure: histogram of the marks of the students. Classes on the horizontal axis: below 50, 50-60,
60-70, 70-80, 80-90, 90+. Vertical axis scale: 0 to 6.]
Question 3:
A school had 100 students aged 20 years on an average. At the end of the year, 20 students aged 22 years
on an average left and 25 students of 18 years on an average joined the school. What is the average age of
the present students of the school?
a. 20.14 b. 19.14 c. 22.14 d. 22 e. None
Answer: b
Question 4:
A group of students has hired a bus for Tk. 3000 for going to a picnic. They had an understanding that
each participant would share the charge in equal amounts. But because of 10 students not turning up, the
charge per student increased by taka 10 over the initial estimates. What is the number of students who
originally registered for the picnic?
Answer: 60
Question 5:
Salman bought 500 shares of company “X” at tk. 600 and 2 months later bought another 250 shares of the
same company at tk. 560. At what price should he purchase additional 250 shares in order to have an
average price of tk. 580 per share?
Answer: Tk. 560