Scope and Limitations of Statistics
Dr. T N KAVITHA
Assistant Professor of Mathematics
SCSVMV
Here we discuss the Importance, Scope, and Limitations of
Statistics.
The scope of statistics was limited in ancient times, when
governments used statistics for the purpose of
administration alone.
Gradually, the subject became more popular and
its applications more extensive. There is now hardly
any field of human activity where statistics is not used.
It is used by economists, businessmen, scientists,
administrators, etc.
What is the Scope of Statistics?
All economic plans are formulated on the
basis of statistical data. The success of a
plan is also evaluated with the help of
statistics.
Economic problems such as production,
consumption, wages, prices, profits,
unemployment, poverty, etc. can be
expressed numerically.
Statistics are very useful to businessmen. They
help businessmen in formulating business
policies and forecasting future
trends.
Efficient administration cannot be conceived
without statistics. Since their origin, statistics
have been used to collect
information regarding military and fiscal
policies.
Statistical methods are extensively used in
every type of research work. Whether it is
agriculture, health, or social science,
statistics help in carrying out different types
of research.
Data Collection Methods
In statistics, data collection is the process of
gathering information from all the relevant
sources to find a solution to the research problem.
It helps to evaluate the outcome of the problem.
Data collection methods allow a person to
reach an answer to the relevant question.
Most organizations use data collection
methods to make assumptions about future
probabilities and trends.
Data collection methods can be classified into two types, namely
• Primary Data Collection methods
• Secondary Data Collection methods
Primary data or raw data is a type of information
that is obtained directly from the first-hand
source through experiments, surveys, or
observations. The primary data collection
method is further classified into two types.
They are
Quantitative Data Collection Methods
Qualitative Data Collection Methods
Quantitative Data Collection Methods
It is based on mathematical calculations using
various formats like close-ended questions and
measures such as the mean, median or mode. This method is
cheaper than qualitative data collection methods,
and it can be applied in a short duration of time.
Qualitative data collection does not involve any
mathematical calculations.
This method is closely associated with elements
that are not quantifiable. Qualitative data
collection methods include interviews,
questionnaires, observations, case studies etc.
There are several methods to collect this type of
data. They are
• Observation Method
• Questionnaire Method
• Interview Method
The different types of observations are:
Structured and unstructured observation
Controlled and uncontrolled observation
Participant, non-participant and disguised
observation
In the interview method, data are collected in terms of oral or
verbal responses. It is carried out in two ways:
Personal Interview – In this method, a person
known as an interviewer is required to ask questions
face to face to the other person. The personal
interview can be structured or unstructured, direct
investigation, focused conversation etc.
Telephonic Interview – In this method, an
interviewer obtains information by contacting people
on the telephone to ask the questions or views orally.
In the questionnaire method, a set of questions is mailed to
the respondent.
The respondent should read, reply and subsequently return
the questionnaire.
The questions are printed in a definite order on
the form.
A good survey should have the following
features:
Short and simple
Should follow a logical sequence
Provide adequate space for answers
Avoid technical terms
Should have good physical appearance such as
colour, quality of the paper to attract the
attention of the respondent
The schedule method is similar to the questionnaire
method with a slight difference. Enumerators
are specially appointed for the
purpose of filling in the schedules. The enumerator explains the
aims and objects of the investigation and may
remove any misunderstandings that have come up.
Enumerators should be trained to perform their
job with hard work and patience.
Secondary data is data collected by someone
other than the actual user. It means that the
information is already available, and someone
analyses it. Sources of secondary data include
magazines, newspapers, books, journals etc. It
may be either published data or unpublished data.
Published data:
• Government publications
• Public records
• Historical and statistical documents
• Business documents
• Technical and trade journals
Unpublished data:
• Diaries
• Letters
• Unpublished biographies etc.
Types of Classification & Tabulation
Classification is the grouping of related facts/data into different
classes according to certain common
characteristics.
Bases of data classification: broadly, there are four
bases
1. Geographical
2. Chronological or Temporal
3. Qualitative
4. Quantitative
Geographical classification, i.e. area-wise:
• Total population of India by states and by
districts
• Number of deaths due to COVID-19 by country
• Deaths in Tamil Nadu by districts
Example: People by place of residence, Rural-Urban,
Male-Female, Illiterate-Literate, ...
Quantitative: on the basis of quantitative class
intervals.
For example, the students of a college may be classified
according to weight.
Mean (Individual Observations):
$\text{Mean} = A + \dfrac{\sum d}{n}$,
where $d = x - A$ and $A$ is the assumed mean.
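The assumed-mean shortcut can be checked against the direct average with a short Python sketch. The function name `mean_by_assumed_mean` and the choice A = 30 are illustrative assumptions; the ten values are the observations used in the exercise in these notes.

```python
def mean_by_assumed_mean(observations, assumed_mean):
    """Mean = A + (sum of d) / n, where d = x - A."""
    n = len(observations)
    deviations = [x - assumed_mean for x in observations]
    return assumed_mean + sum(deviations) / n


# Observations from the exercise in these notes.
data = [25, 32, 28, 34, 24, 31, 36, 27, 29, 30]
A = 30  # assumed mean, chosen arbitrarily for illustration

shortcut = mean_by_assumed_mean(data, A)
direct = sum(data) / len(data)
print(shortcut, direct)  # both are (up to rounding) 29.6
```

Any value of A gives the same mean; a value close to the centre of the data only keeps the deviations small and the hand arithmetic easy.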
Mode = the item that occurs the greatest
number of times.
Relation Between Mean, Median and Mode
If the value of the mode is equal to the value of
the median and the mean, the data set is called
symmetrical. For moderately skewed data sets, there is a
simple approximate relationship between the three M's (mean,
median and mode):
Mean - Mode =3 (Mean – Median)
(OR)
Mode = Mean - 3 Mean + 3 Median
(OR)
Mode = 3 Median - 2 Mean
Find the AM, median and mode of the following
set of observations:
25, 32, 28, 34, 24, 31, 36, 27, 29, 30.
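One possible way to work this exercise is with Python's standard `statistics` module. This is a sketch rather than the method intended in the notes. The variable names are illustrative, and the last line applies the relation Mode = 3 Median - 2 Mean only as a rough estimate.

```python
from statistics import mean, median, multimode

observations = [25, 32, 28, 34, 24, 31, 36, 27, 29, 30]

am = mean(observations)            # arithmetic mean
med = median(observations)         # average of the two middle values (n is even)
modes = multimode(observations)    # every value tied for the highest count

# If all values occur equally often, there is no single most frequent item.
has_unique_mode = len(modes) == 1

# Rough estimate from the relation Mode = 3 Median - 2 Mean.
empirical_mode = 3 * med - 2 * am

print(am, med, has_unique_mode, empirical_mode)
```

Because every observation occurs exactly once, no single value qualifies as the mode here, and the empirical relation is the only way to attach a number to it.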
Find the mean, median and
mode for the following data:
Class(x) frequency(f)
0-10 20
10-20 5
20-30 3
30-40 8
40-50 10
50-60 35
60-70 10
70-80 4
80-90 3
90-100 2
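The grouped-data exercise can be sketched in Python as follows. The notes do not show the grouped formulas, so the sketch assumes the usual textbook ones: class mid-points for the mean, linear interpolation inside the median class, and Mode = L + (f1 - f0) / (2 f1 - f0 - f2) × h for the modal class. All function names are illustrative.

```python
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
freqs = [20, 5, 3, 8, 10, 35, 10, 4, 3, 2]


def grouped_mean(classes, freqs):
    """Mean using class mid-points as representative values."""
    n = sum(freqs)
    return sum(f * (lo + hi) / 2 for (lo, hi), f in zip(classes, freqs)) / n


def grouped_median(classes, freqs):
    """Median = L + (N/2 - cf) / f * h, interpolating in the median class."""
    n = sum(freqs)
    cumulative = 0
    for (lo, hi), f in zip(classes, freqs):
        if cumulative + f >= n / 2:
            return lo + (n / 2 - cumulative) / f * (hi - lo)
        cumulative += f


def grouped_mode(classes, freqs):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    lo, hi = classes[i]
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)


print(grouped_mean(classes, freqs))    # 43.0
print(grouped_median(classes, freqs))  # about 51.14
print(grouped_mode(classes, freqs))    # 55.0
```

The median and modal class are both 50-60, the class with frequency 35.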
Measures of Dispersion
• Range
• Quartile Deviation
• Mean Deviation
• Standard Deviation
Range
Range = L - S
L = largest value
S = smallest value
$\text{Coefficient of Range} = \dfrac{L - S}{L + S}$
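As a quick illustration, the range and its coefficient can be computed with the short Python sketch below. The function names are illustrative, and the data are the ten observations used in the exercise in these notes.

```python
def value_range(values):
    """Range = L - S (largest minus smallest value)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of Range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


data = [25, 32, 28, 34, 24, 31, 36, 27, 29, 30]
print(value_range(data))           # 36 - 24 = 12
print(coefficient_of_range(data))  # 12 / 60 = 0.2
```

The coefficient is a pure number, so it can be used to compare the spread of series measured in different units.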
Quartile Deviation
Mean Deviation (Individual Observations)
Mean Deviation (Discrete series)
Mean Deviation (Continuous series)
Standard Deviation (Individual Observations)
Standard Deviation (Discrete series)
Standard Deviation (Continuous series)
Coefficient of Variation
Range
Geometric Mean, Weighted
Arithmetic Mean & Harmonic Mean
A simple way to define the harmonic mean is to
call it the reciprocal of the arithmetic mean of
the reciprocals of the observations. The most
important condition is that none of the
observations is zero.
If all the observations taken by a variable are
equal to a constant, say k (k ≠ 0), then the harmonic mean of the
observations is also k.
For positive observations, the harmonic mean has the
least value when compared to the geometric mean and the
arithmetic mean:
A.M. >= G.M. >= H.M.
• A harmonic mean is rigidly defined
• It is based upon all the observations
• The fluctuations of the observations do not
affect the harmonic mean
• More weight is given to smaller items
• Not easily understandable
• Difficult to compute
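The definition and the ordering of the three means can be illustrated with a small Python sketch. The sample values are hypothetical and the helper name `harmonic_mean_by_definition` is an assumption. `statistics.harmonic_mean` and `statistics.geometric_mean` are standard-library functions (Python 3.8+) used here for comparison.

```python
from statistics import geometric_mean, harmonic_mean, mean


def harmonic_mean_by_definition(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when an observation is zero")
    return len(values) / sum(1 / v for v in values)


values = [2, 4, 8]  # hypothetical positive observations

am = mean(values)
gm = geometric_mean(values)
hm = harmonic_mean_by_definition(values)

print(am, gm, hm)                  # A.M. >= G.M. >= H.M.
print(harmonic_mean(values))       # library result, for comparison
print(harmonic_mean_by_definition([5, 5, 5]))  # constant series: H.M. equals the constant
```

The three means coincide only when all observations are equal, as in the constant series on the last line.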
The Geometric Mean is a special type of average
where we multiply the numbers together and
then take a square root (for two numbers), cube
root (for three numbers) etc.
Geometric mean = G.M.
$= \left(x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N}$
A geometric mean is a mean or average which
shows the central tendency of a set of numbers
by using the product of their values. For a set of
n observations, the geometric mean is the nth root
of their product. The geometric mean G.M. for a
set of numbers $x_1, x_2, \ldots, x_n$ is given as
$\text{G.M.} = (x_1 \cdot x_2 \cdots x_n)^{1/n}$
or, $\text{G.M.} = \left(\prod_{i=1}^{n} x_i\right)^{1/n} = \sqrt[n]{x_1 x_2 \cdots x_n}$.
The geometric mean of two numbers, say x and
y, is the square root of their product x × y. For
three numbers, it will be the cube root of their
product, i.e., $(xyz)^{1/3}$.
To make the calculation easier and less time
consuming, we use logarithms in the
calculation of geometric means.
Since $\text{G.M.} = (x_1 \cdot x_2 \cdots x_n)^{1/n}$,
taking logarithms on both sides, we have
$\log \text{G.M.} = \frac{1}{n} \log(x_1 \cdot x_2 \cdots x_n)$
or, $\log \text{G.M.} = \frac{1}{n}(\log x_1 + \log x_2 + \cdots + \log x_n)$
or, $\log \text{G.M.} = \frac{1}{n} \sum_{i=1}^{n} \log x_i$
or, $\text{G.M.} = \text{antilog}\left(\frac{1}{n} \sum_{i=1}^{n} \log x_i\right)$.
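The logarithmic route to the geometric mean can be sketched in Python as follows. The function name and the sample values are illustrative assumptions. Base-10 logarithms are used, so the antilog is a power of 10.

```python
import math


def geometric_mean_via_logs(values):
    """G.M. = antilog((1/n) * sum(log x_i)), using base-10 logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("all observations must be positive")
    n = len(values)
    log_gm = sum(math.log10(v) for v in values) / n
    return 10 ** log_gm  # the antilog step


print(geometric_mean_via_logs([2, 8]))     # square root of 16, i.e. about 4
print(geometric_mean_via_logs([1, 3, 9]))  # cube root of 27, i.e. about 3
```

Summing logarithms also avoids the very large products that direct multiplication produces for long series.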
For a grouped frequency distribution, the geometric
mean G.M. is
$\text{G.M.} = \left(x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N}$,
where $N = \sum_{i=1}^{n} f_i$.
Taking logarithms on both sides, we get
$\log \text{G.M.} = \frac{1}{N}(f_1 \log x_1 + f_2 \log x_2 + \cdots + f_n \log x_n)$
$= \frac{1}{N} \left[\sum_{i=1}^{n} f_i \log x_i\right]$.
x 2 4 5 8
f 3 3 2 2
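For the small frequency table above (x = 2, 4, 5, 8 with f = 3, 3, 2, 2), the grouped geometric mean can be computed with the following Python sketch. The function name is an illustrative assumption, and base-10 logarithms are used.

```python
import math


def grouped_geometric_mean(values, freqs):
    """log G.M. = (1/N) * sum(f_i * log x_i), with N = sum(f_i)."""
    if any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    total = sum(freqs)
    log_gm = sum(f * math.log10(x) for x, f in zip(values, freqs)) / total
    return 10 ** log_gm


x = [2, 4, 5, 8]
f = [3, 3, 2, 2]

gm = grouped_geometric_mean(x, f)
print(round(gm, 2))  # about 3.9
```

Here N = 10, so the result is the tenth root of 2³ × 4³ × 5² × 8² = 819200.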
• The logarithm of geometric mean is the
arithmetic mean of the logarithms of given
values
• If all the observations assumed by a variable
are constants, say K >0, then the G.M. of the
observation is also K
• The geometric mean of the ratio of two
variables is the ratio of the geometric means of
the two variables
• The geometric mean of the product of two
variables is the product of their geometric means
Suppose $G_1$ and $G_2$ are the geometric means of two
series of sizes $n_1$ and $n_2$ respectively. The geometric
mean G of the combined groups is given by:
$\log G = \dfrac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}$
or, $G = \text{antilog}\left[\dfrac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}\right]$
In general, for k geometric means $G_i$ of series of sizes $n_i$, i = 1 to k,
we have
$G = \text{antilog}\left[\dfrac{n_1 \log G_1 + n_2 \log G_2 + \cdots + n_k \log G_k}{n_1 + n_2 + \cdots + n_k}\right]$
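The combined formula can be checked numerically with a short Python sketch. The two series are hypothetical and the function names are assumptions. The check confirms that combining the group means gives the same value as pooling the raw observations.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def combined_geometric_mean(gms, sizes):
    """G = antilog(sum(n_i * log G_i) / sum(n_i)), using base-10 logs."""
    total = sum(sizes)
    log_g = sum(n * math.log10(g) for g, n in zip(gms, sizes)) / total
    return 10 ** log_g


series_a = [2, 8]     # hypothetical series, G1 is about 4
series_b = [1, 3, 9]  # hypothetical series, G2 is about 3

g1, g2 = geometric_mean(series_a), geometric_mean(series_b)
combined = combined_geometric_mean([g1, g2], [len(series_a), len(series_b)])
pooled = geometric_mean(series_a + series_b)

print(combined, pooled)  # the two values agree up to rounding
```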
• A geometric mean is based upon all the
observations
• It is rigidly defined
• The fluctuations of the observations do not
affect the geometric mean
• It gives more weight to small items
• A geometric mean is not easily
understandable by a non-mathematical
person
• If any of the observations is zero, the
geometric mean becomes zero
• If any of the observations is negative, the
geometric mean may become imaginary or lose its meaning
Weighted Mean
Weighted Mean is an average computed by
giving different weights to some of the
individual values. If all the weights are equal,
then the weighted mean is the same as the
arithmetic mean.
Question: Suppose that a marketing firm conducts a survey of
1,000 households to determine the average number of TVs each
household owns. The data show a large number of households
with two or three TVs and a smaller number with one or four.
Every household in the sample has at least one TV and no
household has more than four. Find the mean number of TVs per
household.
Number of TVs per household:   1     2     3     4
Number of households:          73    378   459   90
Solution:
As many of the values in this data set are repeated
multiple times, you can easily compute the sample
mean as a weighted mean. Follow these steps to
calculate the weighted arithmetic mean:
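Step 1: Assign a weight to each value in the
dataset:
• x1 = 1, w1 = 73
• x2 = 2, w2 = 378
• x3 = 3, w3 = 459
• x4 = 4, w4 = 90
Step 2: Compute the numerator of the weighted mean
formula.
Multiply each value by its weight and then add
the products together:
(1)(73) + (2)(378) + (3)(459) + (4)(90) = 73 + 756 + 1377 + 360 = 2566
Dividing by the total weight, 73 + 378 + 459 + 90 = 1000, gives a
weighted mean of 2566 / 1000 = 2.566 TVs per household.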
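The weighted-mean calculation for the TV survey can be sketched in Python as follows. The function name is an illustrative assumption; the values and weights are taken from the table in the question.

```python
def weighted_mean(values, weights):
    """Weighted mean = sum(w_i * x_i) / sum(w_i)."""
    numerator = sum(w * x for x, w in zip(values, weights))
    return numerator / sum(weights)


tvs_per_household = [1, 2, 3, 4]
households = [73, 378, 459, 90]  # weights: number of households per value

numerator = sum(w * x for x, w in zip(tvs_per_household, households))
print(numerator)                                     # 2566
print(sum(households))                               # 1000
print(weighted_mean(tvs_per_household, households))  # 2.566
```

If all four weights were equal, the same function would return the ordinary arithmetic mean of 1, 2, 3 and 4, which is 2.5.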
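Regression:
$b_{yx} = 5/4 = 1.25$
$b_{xy} = 9/20 = 0.45$
$r = \pm\sqrt{b_{yx} \cdot b_{xy}} = \pm\sqrt{1.25 \times 0.45} = \pm\sqrt{0.5625} = \pm 0.75$
Since both regression coefficients are positive, r = +0.75.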
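The step from the two regression coefficients to the correlation coefficient can be sketched in Python. The function name is an illustrative assumption. The sign rule relies on the standard fact that r carries the same sign as the two regression coefficients.

```python
import math


def correlation_from_regression(b_yx, b_xy):
    """r = sqrt(b_yx * b_xy), signed like the regression coefficients."""
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    r = math.sqrt(product)
    return -r if b_yx < 0 else r


b_yx = 5 / 4   # 1.25
b_xy = 9 / 20  # 0.45

r = correlation_from_regression(b_yx, b_xy)
print(round(r, 4))  # 0.75
```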
Rank Correlation:
Rank correlation for Repeated ranks: