Maths 1


Noida Institute of Engineering and Technology, Greater Noida

Statistical Technique 1

Unit: 1
Subject Name: Mathematics-IV
Subject Code: AAS0402
B Tech 4th Sem

Dr. Kunti Mishra
NIET, Gr Noida
Department of Mathematics

2/14/2023 Faculty Name Dr. Kunti Mishra Unit 1 1


Brief Introduction of Faculty

Dr. Kunti Mishra


Assistant Professor
Department of Mathematics

Qualifications :
M.Sc.(Maths), M. Tech.(Gold Medalist) in Applied and
Computational Mathematics, Ph.D

Ph.D. Thesis : Some Investigations in Fractal Theory


Total Number of Research Papers:15
Area of Interests: Fixed Point Theory, Fractals
Teaching Experience: 9 years



Evaluation Scheme



Syllabus

Unit-I (Statistical Techniques-I)


Introduction: Measures of central tendency: Mean, Median, Mode, Moment,
Skewness, Kurtosis, Curve Fitting, Method of least squares, Fitting of straight
lines, Fitting of second degree parabola, Exponential curves, Correlation and
Rank correlation, Linear regression, nonlinear regression and multiple linear
regression.
Unit-II (Statistical Techniques-II)
Testing a Hypothesis, Null hypothesis, Alternative hypothesis, Level of
significance, Confidence limits, p-value, Test of significance of difference of
means, Z-test, t-test and Chi-square test, F-test, ANOVA: One way and Two
way. Statistical Quality Control (SQC), Control Charts, Control Charts for
variables (Mean and Range Charts), Control Charts for Attributes (p, np and C
charts).



Syllabus

Unit III (Probability and Random Variable)


Random Variable: Definition of a Random Variable, Discrete Random
Variable, Continuous Random Variable, Probability mass function, Probability
Density Function, Distribution functions.
Multiple Random Variables: Joint density and distribution Function,
Properties of Joint Distribution function, Marginal density Functions,
Conditional Distribution and Density, Statistical Independence, Central Limit
Theorem (Proof not expected).
Unit IV (Expectations and Probability Distribution)
Operation on One Random Variable – Expectations: Introduction, Expected
Value of a Random Variable, Mean, Variance, Moment Generating
Function, Binomial, Poisson, Normal, Exponential distribution.



Syllabus

Unit V (Wavelets and applications and Aptitude-IV)


Wavelet Transform, wavelet series. Basic wavelets
(Haar/Shannon/Daubechies), orthogonal wavelets, multi-resolution analysis,
reconstruction of wavelets and applications.
Number System, Permutation & Combination, Probability, Function, Data
Interpretation, Syllogism.



Branch Wise Application

 Data Analysis
 Artificial intelligence
 Network and Traffic modeling

Course Objectives

• The objective of this course is to familiarize the students with statistical
techniques. It aims to present the students with standard concepts and tools
at an intermediate to advanced level that will serve them well in
tackling a variety of problems in the discipline.
The students will learn to:
• Understand the concept of correlation, moments, skewness and kurtosis and
curve fitting.
• Apply the concept of hypothesis testing and statistical quality control to
create control charts.
• Remember the concept of probability to evaluate probability distributions.
• Understand the concept of Mathematical Expectations and Probability
Distribution.
• Remember the concept of Wavelet Transform and Solve the problems of
Number System, Permutation & Combination, Probability, Function, Data
Interpretation, Syllogism.

Course Outcomes

CO1: Understand the concept of correlation, moments, skewness and
kurtosis and curve fitting.
CO2: Apply the concept of hypothesis testing and statistical quality control to
create control charts.
CO3: Remember the concept of probability to evaluate probability
distributions.
CO4: Understand the concept of Mathematical Expectations and Probability
Distribution.
CO5: Remember the concept of Wavelet Transform and Solve the problems of
Number System, Permutation & Combination, Probability, Function, Data
Interpretation, Syllogism.



Program Outcomes



PSOs

PSO Program Specific Outcomes(PSOs)

PSO1 The ability to identify, analyze real world problems and design
their ethical solutions using artificial intelligence, robotics,
virtual/augmented reality, data analytics, block chain technology,
and cloud computing
PSO2 The ability to design and develop the hardware sensor devices and
related interfacing software systems for solving complex
engineering problems.
PSO3 The ability to understand inter disciplinary computing techniques
and to apply them in the design of advanced computing.
PSO4 The ability to conduct investigation of complex problem with the
help of technical, managerial, leadership qualities, and modern
engineering tools provided by industry sponsored laboratories.



CO-PO Mapping(CO1)

Sr. No  Course Outcome  PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12
1       CO1             H    H    H    H    L    L    L    L    L    L     L     M
2       CO2             H    H    H    H    L    L    L    L    L    L     M     M
3       CO3             H    H    H    H    L    L    L    L    L    L     M     M
4       CO4             H    H    H    H    L    L    L    L    L    L     L     M
5       CO5             H    H    H    H    L    L    L    L    L    L     M     M

*L= Low *M= Medium *H= High



CO-PSO Mapping(CO2)

CO PSO1 PSO2 PSO3 PSO4

CO.1 H L M L

CO.2 L M L M

CO.3 M M M M

CO.4 H M M M

CO.5 H M M M

*L= Low *M= Medium *H= High



Program Educational Objectives(PEOs)

PEO-1: To have an excellent scientific and engineering breadth so as to
comprehend, analyze, design and provide sustainable solutions for real-life
problems using state-of-the-art technologies.
PEO-2: To have a successful career in industries, to pursue higher studies or to
support entrepreneurial endeavors and to face the global challenges.
PEO-3: To have effective communication skills, professional attitude,
ethical values and a desire to learn specific knowledge in emerging trends,
technologies for research, innovation and product development and
contribution to society.
PEO-4: To have life-long learning for up-skilling and re-skilling for a successful
professional career as engineer, scientist, entrepreneur and bureaucrat for
the betterment of society.



Result Analysis

Branch  Semester  Section  No. of enrolled students  No. passed  % passed
CS      IV        A        67                        65          97%
IOT     IV        A        49                        45          91.83%



End Semester Question Paper Template

Link:100 Marks Question Paper Template.docx



Prerequisite and Recap (CO1)

 Knowledge of Maths 1 B.Tech.


 Knowledge of Maths 2 B.Tech.
 Knowledge of Permutation and Combination.



Brief Introduction about the Subject with Videos

• We will discuss properties of complex function (limits, continuity,
differentiability, analyticity and integration)
• In 3rd module we will discuss application of partial differential
equations
• In 4th module we will discuss numerical methods for solving algebraic
equations, system of linear equations, definite integral and 1st order
ordinary differential equation.
• In 5th module we will discuss aptitude part.
• https://youtu.be/iUhwCfz18os
• https://youtu.be/ly4S0oi3Yz8
• https://youtu.be/f8XzF9_2ijs



Unit Content

• Introduction
• Measures of central tendency: Mean, Median, Mode.
• Moment
• Skewness
• Kurtosis
• Curve Fitting
• Method of least squares
• Fitting of straight lines
• Fitting of second degree parabola
• Exponential curves
• Correlation and Rank correlation,
• Linear regression
• Nonlinear regression
• Multiple linear regression



Unit Objectives(CO1)

• The objective of this course is to familiarize the engineers with the concept
of statistical techniques.
• It aims to equip the students with standard concepts and tools from
B. Tech to deal with advanced level of mathematics and applications that
would be essential for their disciplines.

Topic objectives (CO1)

Measures of central tendency


• To present a brief picture of data- It helps in giving a brief
description of the main feature of the entire data.
• Essential for comparison- It helps in reducing the data to a single
value which is used for doing comparative studies.
• Helps in decision making- Most companies use measures of
central tendency to plan and develop their business.
• Formulation of policies- Many governments rely on these measures
while forming policies.



Measures of Central Tendency (CO1)

 Measures of Central Tendency or Averages:


Definition: According to Prof. Bowley, averages are “statistical
constants which enable us to comprehend in a single effort the
significance of the whole.”
Types of Measures of Central Tendency: There are five types of
measures of central tendency:
 Arithmetic Mean or Simple Mean
 Median
 Mode
 Geometric Mean
 Harmonic Mean



Arithmetic Mean (CO1)

 Arithmetic Mean
Definition
Arithmetic mean of a set of observations is their sum divided by the
number of observations, e.g., the arithmetic mean $\bar{x}$ of n observations
$x_1, x_2, \ldots, x_n$ is given by:
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i \]
• In case of the frequency distribution $x_i \mid f_i$, i = 1, 2, ..., n, where
$f_i$ is the frequency of the variable $x_i$,
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \quad \text{where } \sum_{i=1}^{n} f_i = N \]



Arithmetic Mean(CO1)

In case of grouped or continuous frequency distribution, x is taken as the
mid-value of the corresponding class.
Example: Find the arithmetic mean of the following frequency distribution:
x: 1 2 3 4 5 6 7
f: 5 9 12 17 14 10 6
Solution:
Computation of mean
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \quad \text{where } \sum_{i=1}^{n} f_i = N \]



Arithmetic Mean(CO1)

By using the formula with $\sum_{i=1}^{n} f_i = N = 73$ and $\sum_{i=1}^{n} f_i x_i = 299$,
\[ \text{Mean} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i = \frac{299}{73} \approx 4.10 \]
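The computation above can be checked with a few lines of Python. This is only a minimal sketch: the function name `frequency_mean` is an illustrative choice, not something defined in these slides.

```python
def frequency_mean(values, freqs):
    """Arithmetic mean of a frequency distribution: sum(f * x) / sum(f)."""
    total_freq = sum(freqs)                                   # N
    weighted_sum = sum(f * x for x, f in zip(values, freqs))  # sum of f_i * x_i
    return weighted_sum / total_freq


# Data from the worked example above
x = [1, 2, 3, 4, 5, 6, 7]
f = [5, 9, 12, 17, 14, 10, 6]

mean = frequency_mean(x, f)
print(sum(f), sum(fi * xi for xi, fi in zip(x, f)), mean)
```

For a grouped distribution the same function can be used by passing the class mid-values as `values`.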



Daily Quiz (CO1)

Example: Calculate the mean for the following frequency distribution:

Class interval  0-8  8-16  16-24  24-32  32-40  40-48
Frequency       8    7     16     24     15     7

Solution: Arithmetic mean = 1956/77 ≈ 25.40
Example: The average salary of male employees in a firm was Rs.
5,200 and that of females was Rs. 4,200. The mean salary of all the
employees was Rs. 5,000. Find the percentage of male and female
employees.



Median(CO1)

 Median:
Definition: Median of a distribution is the value of the variable which
divides it into two equal parts.
It is the value such that the number of observations above it is equal to
the number of observations below it. The median is thus a positional
average.
 Ungrouped Data:
• If the number of observations is odd then median is the middle
value after the values have been arranged in ascending or descending
order of magnitude.
• In case of even number of observations, there are two middle
terms and median is obtained by taking the arithmetic mean of
middle terms.



Median(CO1)

Example
1. Median of Values 25, 20, 15, 35, 18. Median: 20
2. Median of Values 8, 20, 50, 25, 15, 30. Median: 22.5

 Discrete Frequency Distribution


In this case the median is obtained by considering the cumulative
frequencies. The steps involved are:
i. Find $N/2$, where $N = \sum_{i=1}^{n} f_i$.
ii. See the cumulative frequency (c.f.) just greater than $N/2$.
iii. The corresponding value of x is the median.



Median(CO1)

Example: Obtain the median for the following frequency distribution:


x: 1 2 3 4 5 6 7 8 9
f: 8 10 11 16 20 25 15 9 6
Solution:
i. Find $\frac{N}{2} = \frac{8+10+11+16+20+25+15+9+6}{2} = \frac{120}{2} = 60$, where $N = \sum_{i=1}^{n} f_i$.
ii. See the cumulative frequency (c.f.) just greater than $N/2$.
iii. The corresponding value of x is the median.

The cumulative frequencies are 8, 18, 29, 45, 65, 90, 105, 114, 120.
Here N = 120. The cumulative frequency just greater than N/2 = 60 is 65, and
the value of x corresponding to 65 is 5. Therefore, the median is 5.
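The three steps can be sketched in Python as follows. The function name `discrete_median` is hypothetical, and the sketch assumes the values of x are already in ascending order; it follows the "just greater than N/2" rule stated above, so a cumulative frequency exactly equal to N/2 is skipped.

```python
from itertools import accumulate


def discrete_median(values, freqs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    half = sum(freqs) / 2                      # step i: N / 2
    for x, cf in zip(values, accumulate(freqs)):
        if cf > half:                          # step ii: first c.f. just greater than N / 2
            return x                           # step iii: corresponding value of x
    raise ValueError("empty frequency distribution")


# Data from the worked example above
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
f = [8, 10, 11, 16, 20, 25, 15, 9, 6]

median = discrete_median(x, f)
print(list(accumulate(f)), median)
```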



Median(CO1)

 Continuous Frequency Distribution


In this case, the class corresponding to the c.f. just greater than $N/2$ is called the
median class and the value of median is obtained by the formula:
\[ \text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - c\right) \]
where
• l is the lower limit of the median class,
• f is the frequency of the median class,
• h is the magnitude of the median class,
• c is the c.f. of the class preceding the median class,
• $N = \sum_{i=1}^{n} f_i$



Daily Quiz(CO1)

Example: Find the median wage for the following distribution.

Wages (Rs.)   No. of workers
2000-3000     3
3000-4000     5
4000-5000     20
5000-6000     10
6000-7000     5

Solution: The median wage is Rs. 4,675.
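A short Python sketch of the median formula for a continuous distribution, applied to the wage data above. The function name `grouped_median` is illustrative, and the sketch assumes equal class widths and classes listed in ascending order.

```python
def grouped_median(lower_limits, width, freqs):
    """Median = l + (h / f) * (N / 2 - c) for a continuous frequency distribution."""
    half = sum(freqs) / 2
    cum = 0                                   # c: c.f. of the classes before the current one
    for l, f in zip(lower_limits, freqs):
        if cum + f > half:                    # current class is the median class
            return l + (width / f) * (half - cum)
        cum += f
    raise ValueError("empty frequency distribution")


# Wage distribution from the quiz above
lower_limits = [2000, 3000, 4000, 5000, 6000]
workers = [3, 5, 20, 10, 5]

median_wage = grouped_median(lower_limits, 1000, workers)
print(median_wage)
```

Here N = 43, so N/2 = 21.5; the median class is 4000-5000 with c = 8 and f = 20.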



Mode(CO1)

 Mode:
• Mode is the value which occurs most frequently in a set of
observations and around which the other items of the set cluster
densely.
• It is the point of maximum frequency or the point of greatest
density.
• In other words the mode or modal value of the distribution is that
value of the variate for which frequency is maximum.
Calculation of Mode
 In case of discrete distribution: Mode is the value of x
corresponding to the maximum frequency. However, in any one (or more) of
the following cases, the value of the mode is determined by the method of
grouping:
i. If the maximum frequency is repeated.
ii. If the maximum frequency occurs at the very beginning or at the
end of the distribution.
iii. If there are irregularities in the distribution.
 In case of continuous frequency distribution: mode is given by
the formula
\[ \text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h \]
where $l$ is the lower limit, $h$ the width and $f_m$ the frequency of the
modal class, and $f_1$ and $f_2$ are the frequencies of the classes preceding and
succeeding the modal class respectively. While applying the above
formula it is necessary to see that the class intervals are of the same
size.



Mode(CO1)

 For a symmetrical distribution, mean, median and mode coincide.


When the mode is ill-defined and the method of grouping also fails,
its value can be ascertained by the formula
Mode = 3 Median − 2 Mean
This measure is called the empirical mode.
Q. Calculate the mode from the following frequency distribution.
Size (x)       4  5  6  7  8   9   10  11  12  13
Frequency (f)  2  5  8  9  12  14  14  15  11  13

Solution: Method of Grouping :



Mode(CO1)

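Column 1 holds the frequencies. Columns 2 and 3 group them in twos (starting
from the first and the second item); columns 4, 5 and 6 group them in threes
(starting from the first, second and third item). Each total is written against
the first item of its group.

Size (x)   1    2    3    4    5    6
4          2    7         15
5          5         13        22
6          8    17                  29
7          9         21   35
8          12   26             40
9          14        28             43
10         14   29        40
11         15        26        39
12         11   24
13         13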



Mode(CO1)

Column   Maximum frequency   Size of item having max. frequency
1        15                  11
2        29                  10, 11
3        28                  9, 10
4        40                  10, 11, 12
5        40                  8, 9, 10
6        43                  9, 10, 11

Since the item 10 occurs the maximum number of times, i.e. 5 times, the
mode is 10.
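The grouping and analysis tables can be reproduced with a short Python sketch. The function name `grouping_mode` and the way ties are handled (every group that reaches the column maximum is tallied) are assumptions made for illustration, not part of the slides.

```python
def grouping_mode(values, freqs):
    """Return (mode, tally) using the method of grouping.

    Columns 1-6: single frequencies, pairs starting at the 1st and 2nd item,
    and triples starting at the 1st, 2nd and 3rd item.
    """
    n = len(freqs)
    tally = {v: 0 for v in values}
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]  # (group size, start index)
    for size, start in columns:
        # Only complete groups are formed; leftover items at the end are ignored.
        groups = [
            (sum(freqs[i:i + size]), values[i:i + size])
            for i in range(start, n - size + 1, size)
        ]
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:
                for v in members:
                    tally[v] += 1      # each item in a maximal group scores once
    mode = max(tally, key=tally.get)
    return mode, tally


# Data from the question above
sizes = list(range(4, 14))
freqs = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13]

mode, tally = grouping_mode(sizes, freqs)
print(mode, tally)
```

The tally counts how often each size appears in the analysis table, so the size with the largest count is taken as the mode.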



Mode(CO1)

Q. Find the mode of the following:

Marks              0-5  6-10  11-15  16-20  21-25  26-30  31-35  36-40  41-45
No. of candidates  7    10    16     32     24     18     10     5      1

Solution: Here the greatest frequency 32 lies in the class 16-20. Hence the
modal class is 16-20. But the actual limits of this class are 15.5-20.5.
$l = 15.5,\ f_m = 32,\ f_1 = 16,\ f_2 = 24,\ h = 5$



Mode(CO1)

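\[
\begin{aligned}
\text{Mode} &= l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h \\
&= 15.5 + \frac{32 - 16}{64 - 16 - 24} \times 5 \\
&= 15.5 + \frac{16}{24} \times 5 \\
&= 15.5 + \frac{10}{3} \\
&= 18.83 \text{ marks}
\end{aligned}
\]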
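The same substitution can be checked in Python. This is a minimal sketch; `grouped_mode` is an illustrative name, and the caller is assumed to have already identified the modal class and its true lower limit.

```python
def grouped_mode(lower, width, f_m, f_1, f_2):
    """Mode = l + (f_m - f_1) / (2 f_m - f_1 - f_2) * h for equal class widths."""
    return lower + (f_m - f_1) / (2 * f_m - f_1 - f_2) * width


# Values from the worked example above (modal class 15.5-20.5)
mode_marks = grouped_mode(lower=15.5, width=5, f_m=32, f_1=16, f_2=24)
print(round(mode_marks, 2))
```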



Daily Quiz(CO1)

Q.1 Calculate the mean, median and mode of the following data:

Wages (in Rs)   0-20  20-40  40-60  60-80  80-100  100-120  120-140
No. of workers  6     8      10     12     6       5        3



Recap(CO1)

 Measures of central tendency


 Mean
 Mode
 Median



Topic Objective (CO1)

Moments
• In mathematical statistics, moments involve a basic calculation. These
calculations can be used to find a probability distribution's mean,
variance, and skewness.



Moments (CO1)

• Moments: The moments of a distribution are the arithmetic means of
the various powers of the deviations of items from some given
number.
• Moments about the mean (central moments)
• Moments about any arbitrary number (raw moments)
• Moments about the origin



Central Moments (CO1)

 Moment about mean (central moment):


• For an Individual Series: If $x_1, x_2, \ldots, x_n$ are the values of the
variable under consideration, the $r$th moment $\mu_r$ about the mean $\bar{x}$ is
defined as
\[ \mu_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r ; \quad r = 0, 1, 2, \ldots \]

• For a Frequency Distribution: If $x_1, x_2, \ldots, x_n$ are the values of a
variable $x$ with the corresponding frequencies $f_1, f_2, \ldots, f_n$
respectively, then the $r$th moment $\mu_r$ about the mean $\bar{x}$ is defined as
\[ \mu_r = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^r ; \quad r = 0, 1, 2, \ldots \]
where $N = \sum_{i=1}^{n} f_i$.
In particular,
\[ \mu_0 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^0 = \frac{1}{N}\sum_{i=1}^{n} f_i = \frac{N}{N} = 1 \]
Note. In case of a frequency distribution with class intervals, the values
of 𝑥 are the midpoints of the intervals.
Example 1. Find the first four moments about the mean for the following
individual series.
x: 3 6 8 10 18
Solution: Calculation of Moments



Central Moments (CO1)

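For any distribution, $\mu_0 = 1$. For r = 1,
\[ \mu_1 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x}) = 0 \]
For any distribution, $\mu_1 = 0$. For r = 2,
\[ \mu_2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{128}{5} = 25.6 \]
Therefore, for any distribution, $\mu_2$ coincides with the variance of the
distribution.
Similarly,
\[ \mu_3 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3 = \frac{486}{5} = 97.2 \]
\[ \mu_4 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^4 = \frac{7940}{5} = 1588 \]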



Central Moments (CO1)

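Now
\[ \bar{x} = \frac{\sum x}{n} = \frac{45}{5} = 9 \]
\[ \mu_1 = \frac{\sum (x - \bar{x})}{n} = \frac{0}{5} = 0 \]
\[ \mu_2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{128}{5} = 25.6 \]
\[ \mu_3 = \frac{\sum (x - \bar{x})^3}{n} = \frac{486}{5} = 97.2 \]
\[ \mu_4 = \frac{\sum (x - \bar{x})^4}{n} = \frac{7940}{5} = 1588 \]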
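The four central moments of the series 3, 6, 8, 10, 18 can be verified with a small Python sketch. The helper name `central_moment` is illustrative only.

```python
def central_moment(data, r):
    """r-th moment about the mean of an individual series: (1/n) * sum((x - mean)^r)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n


# Individual series from Example 1
series = [3, 6, 8, 10, 18]

moments = [central_moment(series, r) for r in range(1, 5)]
print(sum(series) / len(series), moments)
```

The deviations from the mean 9 are -6, -3, -1, 1 and 9, whose squares, cubes and fourth powers sum to 128, 486 and 7940 respectively.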



Central Moments (CO1)

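For any distribution, $\mu_0 = 1$. For r = 1,
\[ \mu_1 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x}) = \frac{1}{N}\sum_{i=1}^{n} f_i x_i - \bar{x}\,\frac{1}{N}\sum_{i=1}^{n} f_i = \bar{x} - \bar{x} = 0 \]
For any distribution, $\mu_1 = 0$. For r = 2,
\[ \mu_2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = (\text{S.D.})^2 = \text{Variance} \]
Therefore, for any distribution, $\mu_2$ coincides with the variance of the
distribution.
Similarly,
\[ \mu_3 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^3, \qquad \mu_4 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^4 \]
and so on.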



Central Moments (CO1)

• Example: Find $\mu_1, \mu_2, \mu_3, \mu_4$ for the following frequency distribution.

Marks            5-15  15-25  25-35  35-45  45-55  55-65
No. of students  10    20     25     20     15     10

• Sol. Calculation of Moments



Central Moments (CO1)

Here $\bar{x} = 34$, so $x - \bar{x} = x - 34$.

Marks   No. of students (f)  Mid-point (x)  fx    x − 34  f(x − 34)  f(x − 34)^2  f(x − 34)^3  f(x − 34)^4
5-15    10                   10             100   -24     -240       5760         -138240      3317760
15-25   20                   20             400   -14     -280       3920         -54880       768320
25-35   25                   30             750   -4      -100       400          -1600        6400
35-45   20                   40             800   6       120        720          4320         25920
45-55   15                   50             750   16      240        3840         61440        983040
55-65   10                   60             600   26      260        6760         175760       4569760
Total   N = 100                             3400          0          21400        46800        9671200



Central Moments (CO1)

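\[ \bar{x} = \frac{\sum f x}{N} = \frac{3400}{100} = 34 \]
\[ \mu_1 = \frac{\sum f (x - \bar{x})}{N} = \frac{0}{100} = 0 \]
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{N} = \frac{21400}{100} = 214 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{N} = \frac{46800}{100} = 468 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{N} = \frac{9671200}{100} = 96712 \]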
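The table and the resulting moments can be cross-checked with a frequency-weighted version of the central-moment formula. This is a sketch only; `freq_central_moments` is a hypothetical helper that takes the class mid-points as the values of x.

```python
def freq_central_moments(midpoints, freqs, max_order=4):
    """Return (mean, [mu_1, ..., mu_max_order]) for a frequency distribution."""
    n_total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n_total
    moments = [
        sum(f * (x - mean) ** r for x, f in zip(midpoints, freqs)) / n_total
        for r in range(1, max_order + 1)
    ]
    return mean, moments


# Marks distribution from the example above (mid-points of 5-15, 15-25, ...)
midpoints = [10, 20, 30, 40, 50, 60]
students = [10, 20, 25, 20, 15, 10]

mean, (mu1, mu2, mu3, mu4) = freq_central_moments(midpoints, students)
print(mean, mu1, mu2, mu3, mu4)
```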



Raw Moments (CO1)

• Moments about an arbitrary number (Raw Moments):
• If $x_1, x_2, x_3, \ldots, x_n$ are the values of a variable $x$ with the
corresponding frequencies $f_1, f_2, f_3, \ldots, f_n$ respectively, then the
$r$th moment $\mu'_r$ about the number $x = A$ is defined as
\[ \mu'_r = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^r ; \quad r = 0, 1, 2, \ldots \]
where $N = \sum_{i=1}^{n} f_i$.
For $r = 0$,
\[ \mu'_0 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^0 = 1 \]



Raw Moments (CO1)

For $r = 1$,
\[ \mu'_1 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A) = \frac{1}{N}\sum_{i=1}^{n} f_i x_i - \frac{A}{N}\sum_{i=1}^{n} f_i = \bar{x} - A \]
For $r = 2$,
\[ \mu'_2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^2 \]
For $r = 3$,
\[ \mu'_3 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^3 \]
and so on.
In calculation work, if we find that there is some common factor $h\,(>1)$
in the values of $x - A$, we can ease our calculation work by defining
$u = \frac{x - A}{h}$.
In that case, we have
\[ \mu'_r = \frac{1}{N}\left(\sum_{i=1}^{n} f_i u_i^r\right) h^r ; \quad r = 0, 1, 2, \ldots \]



Moments about the origin (CO1)

 Moments about the Origin:


If $x_1, x_2, \ldots, x_n$ are the values of a variable $x$ with corresponding
frequencies $f_1, f_2, \ldots, f_n$ respectively, then the $r$th moment about the
origin $v_r$ is defined as
\[ v_r = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^r ; \quad r = 0, 1, 2, \ldots \]
where $N = \sum_{i=1}^{n} f_i$.
For $r = 0$, $v_0 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^0 = \frac{N}{N} = 1$
For $r = 1$, $v_1 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i = \bar{x}$
For $r = 2$, $v_2 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^2$ and so on.



Relations (CO1)

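• Relation between $\mu_r$ and $\mu'_r$:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 \]
\[ \mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 \]
\[ \mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 \]

• Relation between $v_r$ and $\mu_r$:
\[ v_1 = \bar{x} \]
\[ v_2 = \mu_2 + \bar{x}^2 \]
\[ v_3 = \mu_3 + 3\mu_2\bar{x} + \bar{x}^3 \]
\[ v_4 = \mu_4 + 4\mu_3\bar{x} + 6\mu_2\bar{x}^2 + \bar{x}^4 \]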



Karl Pearson’s Coefficients(CO1)

 Karl Pearson’s 𝜷, 𝜸 Coefficients:


Karl Pearson defined the following four coefficients based upon the
first four moments of a frequency distribution about its mean:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} \qquad (\beta\text{-coefficients}) \]
\[ \gamma_1 = +\sqrt{\beta_1}, \qquad \gamma_2 = \beta_2 - 3 \qquad (\gamma\text{-coefficients}) \]
The practical use of these coefficients is to measure the skewness and
kurtosis of a frequency distribution. These coefficients are pure
numbers independent of units of measurement.



Karl Pearson’s Coefficients(CO1)

Example 1: The first three moments of a distribution about the value
“2” of the variable are 1, 16 and −40. Show that the mean is 3, the variance
is 15 and $\mu_3 = -86$.
Solution: We have $A = 2$, $\mu'_1 = 1$, $\mu'_2 = 16$ and $\mu'_3 = -40$.
We have that $\mu'_1 = \bar{x} - A \implies \bar{x} = \mu'_1 + A = 1 + 2 = 3$.
\[ \text{Variance} = \mu_2 = \mu'_2 - (\mu'_1)^2 = 16 - (1)^2 = 15 \]
\[ \mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = -40 - 3(16)(1) + 2(1)^3 = -40 - 48 + 2 = -86. \]



Karl Pearson’s Coefficients(CO1)

Example 2: The first four moments of a distribution about the value “35”
are −1.8, 240, −1020 and 144000. Find the values of $\mu_1, \mu_2, \mu_3, \mu_4$.
Solution: $\mu_1 = 0$
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 240 - (-1.8)^2 = 236.76 \]
\[ \mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = -1020 - 3(240)(-1.8) + 2(-1.8)^3 = 264.336 \]
\[ \mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 144000 - 4(-1020)(-1.8) + 6(240)(-1.8)^2 - 3(-1.8)^4 = 141290.11. \]
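The conversion from raw moments to central moments lends itself to a direct check in Python. The sketch below applies the relations between $\mu_r$ and $\mu'_r$ to the raw moments of Example 2; the function name `raw_to_central` is an illustrative choice.

```python
def raw_to_central(m1, m2, m3, m4):
    """Convert raw moments about A (m1..m4) into central moments (mu2, mu3, mu4)."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Raw moments about A = 35 from Example 2
A = 35
raw = (-1.8, 240, -1020, 144000)

mu2, mu3, mu4 = raw_to_central(*raw)
mean = A + raw[0]          # since mu'_1 = mean - A
print(mean, mu2, mu3, mu4)
```

The first central moment is always zero, so it is not returned; the mean follows from $\mu'_1 = \bar{x} - A$.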



Karl Pearson’s Coefficients(CO1)

Example 3: Calculate the variance and third central moment from the
following data.
x_i: 0 1 2 3 4 5 6 7 8
f_i: 1 9 26 59 72 52 29 7 1
Solution: Calculation of Moments, taking u = (x − A)/h with A = 4, h = 1.

x      f        u    fu    fu^2   fu^3
0      1        -4   -4    16     -64
1      9        -3   -27   81     -243
2      26       -2   -52   104    -208
3      59       -1   -59   59     -59
4      72       0    0     0      0
5      52       1    52    52     52
6      29       2    58    116    232
7      7        3    21    63     189
8      1        4    4     16     64
Total  N = 256       -7    507    -37



Karl Pearson’s Coefficients(CO1)

\[ \mu'_1 = \frac{\sum f u}{N}\, h = \frac{-7}{256} = -0.02734 \]
\[ \mu'_2 = \frac{\sum f u^2}{N}\, h^2 = \frac{507}{256} = 1.9805 \]
\[ \mu'_3 = \frac{\sum f u^3}{N}\, h^3 = \frac{-37}{256} = -0.1445 \]
Moments about Mean:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 1.9805 - (-0.02734)^2 = 1.97975 \]
Variance = 1.97975
Also
\[ \mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = -0.1445 - 3(1.9805)(-0.02734) + 2(-0.02734)^3 = 0.0178997 \]
Third central moment = 0.0178997.
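The step-deviation shortcut used in this example can be sketched in Python as below. The function name `step_deviation_moments` is hypothetical. Because the code carries full floating-point precision, its results differ slightly in the later decimal places from the hand computation, which rounds the raw moments first.

```python
def step_deviation_moments(values, freqs, A, h):
    """Raw moments about A via u = (x - A) / h, then variance and third central moment."""
    n_total = sum(freqs)
    u = [(x - A) / h for x in values]
    # mu'_r = h^r * sum(f * u^r) / N for r = 1, 2, 3
    raw = [
        h ** r * sum(f * ui ** r for f, ui in zip(freqs, u)) / n_total
        for r in (1, 2, 3)
    ]
    m1, m2, m3 = raw
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    return raw, mu2, mu3


# Data from Example 3
x = list(range(9))
f = [1, 9, 26, 59, 72, 52, 29, 7, 1]

raw, variance, third_central = step_deviation_moments(x, f, A=4, h=1)
print(raw, variance, third_central)
```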



Daily Quiz(CO1)

Q1. The first four moments of a distribution are 3,
10.5, 40.5, 168. Comment upon the nature of the distribution.
Q2. For a distribution, the mean is 10, variance is 16, $\gamma_1$ is 1 and $\beta_2$ is 4.
Find the first four moments about the origin.



Recap(CO1)

 Measures of central tendency


 Moment



Topic objective (CO1)

Skewness
• It tells us whether the distribution is symmetrical or not
• It gives us an idea about the nature and degree of concentration of
observations about the mean
• The empirical relation of mean, median and mode are based on a
moderately skewed distribution



Skewness(CO1)

 Skewness:
• It means lack of symmetry.
• It gives us an idea about the shape of the curve which we can draw with
the help of the given data.
• A distribution is said to be skewed if:
i. Mean, median and mode fall at different points, i.e.,
Mean ≠ Median ≠ Mode;
ii. Quartiles are not equidistant from the median; and
iii. The curve drawn with the help of the given data is not symmetrical but
stretched more to one side than to the other.



Skewness(CO1)

Symmetrical Distribution
A symmetric distribution is a type of distribution where the left side of the
distribution mirrors the right side. In a symmetric distribution, the mean,
mode and median all fall at the same point.



Skewness(CO1)

Measures of Skewness:
The measures of skewness are:
• Sk = M − Md,
• Sk = M − Mo,
• Sk = (Q3 − Md) − (Md − Q1),
where M is the mean, Md the median, Mo the mode, Q1 the first quartile
and Q3 the third quartile of the distribution.
These are the absolute measures of skewness.
• Coefficients of Skewness: For comparing two series we do
not calculate these absolute measures but we calculate the relative measures
called the coefficients of skewness, which are pure numbers independent of
units of measurement.



Skewness(CO1)

The following are the coefficients of skewness:
• Prof. Karl Pearson’s Coefficient of Skewness,
• Prof. Bowley’s Coefficient of Skewness,
• Coefficient of Skewness based upon Moments.
Prof. Karl Pearson’s Coefficient of Skewness:
Definition
• It is defined as:
\[ S_{K_p} = \frac{\text{A.M.} - \text{Mode}}{\text{S.D.}} = \frac{M - M_o}{\sigma} \]
where σ is the standard deviation of the distribution. If the mode is
ill-defined, then using the empirical relation
Mo = 3Md − 2M (Mode = 3 Median − 2 Mean), for a moderately asymmetrical
distribution, we have
\[ S_{K_p} = \frac{3(M - M_d)}{\sigma} \]
• From the above two formulas, we observe that Sk = 0 if M = Mo = Md.
• Hence for a symmetrical distribution, mean, median and mode coincide.
• Skewness is positive if M > Mo or M > Md, and negative if M <
Mo or M < Md.
• Limits are: |Sk| ≤ 3 or −3 ≤ Sk ≤ 3.
• However, in practice, these limits are rarely attained.



Skewness(CO1)

Coefficient of Skewness based upon Moments
Definition
It is defined as:
\[ \gamma_1 = \frac{\mu_3}{\sqrt{\mu_2^3}} \]
where $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$ are Pearson’s coefficients.
Sk = 0 if either $\beta_1 = 0$ or $\beta_2 = -3$. Since $\beta_2 = \mu_4/\mu_2^2$ cannot be
negative, Sk = 0 if and only if $\beta_1 = 0$.
Thus for a symmetrical distribution $\beta_1 = 0$.
In this respect $\beta_1$ is taken as a measure of skewness.



Skewness(CO1)

• The coefficient of skewness based upon moments is to be regarded as
without sign.
• The Pearson’s and Bowley’s coefficients of skewness can be positive as
well as negative.
• Positively Skewed Distribution: The skewness is
positive if the longer tail of the distribution lies towards the higher
values of the variate (the right), i.e., if the curve drawn
with the help of the given data is stretched more to the right than
to the left.



Skewness(CO1)

 Negatively Skewed Distribution:


The skewness is negative if the longer tail of the distribution lies towards
the lower values of the variate (the left), i.e., if the curve drawn with the
help of the given data is stretched more to the left than to the right.



Skewness(CO1)

Pearson’s $\beta_1$ and $\gamma_1$ Coefficients:
\[ \gamma_1 = \sqrt{\beta_1} = \pm\frac{\mu_3}{\sqrt{\mu_2^3}} \]

Q1. Karl Pearson’s coefficient of skewness of a distribution is 0.32, its
standard deviation is 6.5 and mean is 29.6. Find the mode of the
distribution.
Solution: Given that $S_{K_p} = 0.32$, σ = 6.5, mean = 29.6.
\[ S_{K_p} = \frac{\text{A.M.} - \text{Mode}}{\text{S.D.}} \]
\[ 0.32 = \frac{29.6 - \text{Mode}}{6.5} \implies \text{Mode} = 27.52 \]
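Karl Pearson’s coefficient and the rearrangement used in Q1 can be written as a short Python sketch. All three function names are illustrative and not taken from the slides.

```python
def pearson_skewness_mode(mean, mode, sd):
    """Karl Pearson's coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Form used when the mode is ill-defined: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


def mode_from_skewness(mean, sd, sk):
    """Rearranged first formula: mode = mean - sk * sd."""
    return mean - sk * sd


# Values from Q1 above
mode = mode_from_skewness(mean=29.6, sd=6.5, sk=0.32)
print(round(mode, 2), round(pearson_skewness_mode(29.6, mode, 6.5), 2))
```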



Topic objective (CO1)

Kurtosis
• Describe the concepts of kurtosis
• Explain the different measures of kurtosis
• Explain how kurtosis describes the shape of a distribution.



Kurtosis (CO1)

 Kurtosis
• If we know the measures of central tendency, dispersion and skewness, we
still cannot form a complete idea about the distribution. Let us consider the
figure in which all the three curves A, B, and C are symmetrical about the
mean and have the same range.



Kurtosis (CO1)

Definition: Kurtosis is also known as Convexity of the Frequency Curve, due
to Prof. Karl Pearson.
• It enables us to have an idea about the flatness or peakedness of the
frequency curve.
• It is measured by the coefficient $\beta_2$ or its derived coefficient
$\gamma_2 = \beta_2 - 3$, where:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \]
• Curve of the type A which is neither flat nor peaked is called the normal
curve or mesokurtic curve and for such curve $\beta_2 = 3$, i.e., $\gamma_2 = 0$.
• Curve of the type B which is flatter than the normal curve is known as
platykurtic curve and for such curve $\beta_2 < 3$, i.e., $\gamma_2 < 0$.



Kurtosis (CO1)

• Curve of the type C which is more peaked than the normal curve is called leptokurtic
curve and for such curve $\beta_2 > 3$, i.e., $\gamma_2 > 0$.
Q2. For a distribution, the mean is 10, variance is 16, $\gamma_1$ is +1 and $\beta_2$ is 4. Comment
about the nature of distribution. Also find third central moment.
Solution: With $\mu_2 = 16$,
\[ \gamma_1 = 1 = \pm\frac{\mu_3}{\sqrt{4096}} \implies \mu_3 = 64 \]
\[ \beta_2 = 4 = \frac{\mu_4}{256} \implies \mu_4 = 1024 \]
Since $\gamma_1 = +1$, the distribution is moderately positively skewed, i.e.,
if we draw the curve of the given distribution, it will have a longer tail towards the right.
Further, since $\beta_2 = 4 > 3$, the distribution is leptokurtic, i.e.,
it will be slightly more peaked than the normal curve.



Kurtosis (CO1)

Example 3: The first four moments about the working mean 28.5 of a distribution
are 0.294, 7.144, 42.409 and 454.98. Calculate the first four moments about the mean.
Also evaluate $\beta_1$ and $\beta_2$ and comment upon the skewness and kurtosis of
the distribution.
Solution: $\mu'_1 = 0.294$, $\mu'_2 = 7.144$, $\mu'_3 = 42.409$, $\mu'_4 = 454.98$
Moments about the mean:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 7.0576 \]
\[ \mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 36.1588 \]
\[ \mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 408.7896 \]



Kurtosis (CO 1)

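\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 3.7193, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 8.207 \]
Skewness: $\mu_3$ is positive and
$\gamma_1 = +\sqrt{\beta_1} = 1.9285$, so the distribution is positively skewed.
Kurtosis: $\beta_2 = 8.207 > 3$, so the distribution is leptokurtic.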
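The β and γ coefficients and the kurtosis classification can be sketched in Python as follows. The function names are hypothetical, and giving $\gamma_1$ the sign of $\mu_3$ is a convention assumed here so that negative skewness is reported as negative.

```python
import math


def pearson_coefficients(mu2, mu3, mu4):
    """Return (beta1, beta2, gamma1, gamma2) from the central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma1 = math.copysign(math.sqrt(beta1), mu3)  # sign taken from mu3 (assumed convention)
    gamma2 = beta2 - 3
    return beta1, beta2, gamma1, gamma2


def kurtosis_type(beta2, tol=1e-9):
    """Classify a curve as mesokurtic, platykurtic or leptokurtic from beta2."""
    if abs(beta2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if beta2 > 3 else "platykurtic"


# Central moments from Example 3
beta1, beta2, gamma1, gamma2 = pearson_coefficients(7.0576, 36.1588, 408.7896)
print(round(beta1, 4), round(beta2, 3), round(gamma1, 4), kurtosis_type(beta2))
```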



Daily Quiz(CO1)

Q1. Find all four central moments and discuss skewness and kurtosis
for the following distribution:

Range of expenditures  2-4  4-6  6-8  8-10  10-12
No. of families        38   292  389  212   69

2/14/2023 Faculty Name Dr. Kunti Mishra Unit 1 81


Weekly Assignment(CO1)

Q1. The first four moments of a distribution about $x = 4$ are 1, 4, 10 and 45. Find the first four moments about the mean. Discuss the skewness and kurtosis, and comment upon the nature of the distribution.
Q2. Define the mode and calculate it for the following distribution of monthly rent paid by libraries in Karnataka:

Monthly rent       500-1000   1000-1500   1500-2000   2000-2500   2500-3000   3000 & above
No. of libraries   5          10          8           16          14          12

Q3. Write short notes on:
i. Range  ii. Interquartile range  iii. Mean deviation  iv. Standard deviation  v. Variance
Q4. Explain the measures of dispersion, and find the range and coefficient of range for the following data: 20, 35, 25, 30, 15.


Recap (CO1)

• Moments
• Relation between $v_r$ and $\mu_r$
• Relation between $\mu_r$ and $\mu'_r$
• Skewness
• Kurtosis


Topic objectives(CO1)

Curve Fitting
• The objective of curve fitting is to find the parameters of a
mathematical model that describes a set of data in a way that
minimizes the difference between the model and the data.



Curve Fitting (CO1)

• Curve Fitting: Curve fitting means expressing the relationship between two variables by an algebraic equation. It enables us to represent the relationship by simple algebraic expressions, e.g. polynomials, exponential or logarithmic functions. It is also used to estimate the values of one variable corresponding to specified values of the other variable.

• Method of Least Squares: The method of least squares provides a unique set of values for the constants and hence suggests a curve of best fit to the given data.


Curve Fitting (CO1)

• Fitting a Straight Line: Let $(x_i, y_i)$, $i = 1, 2, \ldots, n$, be $n$ pairs of observations of related data, and let the line to be fitted be
$y = a + bx$   (1)
The normal equations are
$\sum y = na + b\sum x$   (2)
$\sum xy = a\sum x + b\sum x^2$   (3)
When the values of $x$ are equally spaced with interval $h$, the arithmetic is simplified by a change of origin and scale:
If $n$ is odd, $u = \dfrac{x - (\text{middle term})}{h}$
If $n$ is even, $u = \dfrac{x - (\text{mean of the two middle terms})}{\frac{1}{2}h}$


Curve Fitting (CO1)

Q. Fit a straight line to the following data by the method of least squares.

x   0   1     2     3     4
y   1   1.8   3.3   4.5   6.3

Sol. Let the straight line fitted to the given data be
$y = a + bx$   (1)
Then the normal equations are
$\sum y = na + b\sum x$   (2)
$\sum xy = a\sum x + b\sum x^2$   (3)
where $n = 5$ is the number of observations.


Curve Fitting (CO1)

Here $\sum x = 10$, $\sum y = 16.9$, $\sum x^2 = 30$ and $\sum xy = 47.1$, with 5 observations. Equations (2) and (3) give
$16.9 = 5a + 10b$
$47.1 = 10a + 30b$
Solving, we get $a = 0.72$, $b = 1.33$.
The required line is $y = 0.72 + 1.33x$.
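The normal equations (2) and (3) can be solved in closed form. The following is a minimal Python sketch that reproduces the worked example; the helper name `fit_line` is an illustrative choice.

```python
def fit_line(xs, ys):
    """Least squares line y = a + b*x obtained from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)
    a = (sy - b * sx) / n
    return a, b

a, b = fit_line([0, 1, 2, 3, 4], [1, 1.8, 3.3, 4.5, 6.3])
print(round(a, 2), round(b, 2))  # 0.72 1.33
```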


Curve Fitting (CO1)

• Fitting of an Exponential Curve
Let $y = ae^{bx}$.
Taking logarithms on both sides, we get
$\log_{10} y = \log_{10} a + bx\log_{10} e$
i.e. $Y = A + BX$   (1)
where $Y = \log_{10} y$, $A = \log_{10} a$, $B = b\log_{10} e$, $X = x$.
The normal equations for (1) are
$\sum Y = nA + B\sum X$ and $\sum XY = A\sum X + B\sum X^2$
Solving these, we get $A$ and $B$.
Then $a = \operatorname{antilog} A$ and $b = \dfrac{B}{\log_{10} e}$.
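The linearisation above can be sketched in Python as follows. The data here are hypothetical: they are generated from $y = 2e^{0.5x}$ only to show that the procedure recovers $a$ and $b$; the function name `fit_exponential` is also an assumption.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) by a least squares line through (x, log10 y)."""
    n = len(xs)
    Y = [math.log10(y) for y in ys]
    sx, sY = sum(xs), sum(Y)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * v for x, v in zip(xs, Y))
    B = (n * sxY - sx * sY) / (n * sxx - sx**2)
    A = (sY - B * sx) / n
    # a = antilog A, b = B / log10(e)
    return 10**A, B / math.log10(math.e)

xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]  # hypothetical, noise-free data
a, b = fit_exponential(xs, ys)
print(round(a, 6), round(b, 6))
```

Note that this minimises the squared errors in $\log_{10} y$, not in $y$ itself, so with noisy data the result can differ from a direct non-linear fit.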


Curve Fitting (CO1)

• Fitting of the Curve $y = ax^b$
Let $y = ax^b$.
Taking logarithms on both sides, we get
$\log_{10} y = \log_{10} a + b\log_{10} x$
i.e. $Y = A + BX$   (1)
where $Y = \log_{10} y$, $A = \log_{10} a$, $B = b$, $X = \log_{10} x$.
The normal equations for (1) are
$\sum Y = nA + B\sum X$ and $\sum XY = A\sum X + B\sum X^2$
Solving these gives $A$ and $B$, and then $a = \operatorname{antilog} A$, $b = B$.


Curve Fitting (CO1)

Example. Use the method of least squares to fit the curve
$y = \dfrac{c_0}{x} + c_1\sqrt{x}$
to the following table of values:

x   0.1   0.2   0.4   0.5   1   2
y   21    11    7     6     5   6

Solution: Let the given curve be $y = \dfrac{c_0}{x} + c_1\sqrt{x}$.
The normal equations are
$\sum \dfrac{y}{x} = c_0\sum \dfrac{1}{x^2} + c_1\sum \dfrac{1}{\sqrt{x}}$
$\sum y\sqrt{x} = c_0\sum \dfrac{1}{\sqrt{x}} + c_1\sum x$


Curve Fitting (CO1)

x       y     y/x      y√x        1/√x       1/x^2
0.1     21    210      6.64078    3.16228    100
0.2     11    55       4.91935    2.23607    25
0.4     7     17.5     4.42719    1.58114    6.25
0.5     6     12       4.24264    1.41421    4
1       5     5        5          1          1
2       6     3        8.48528    0.70711    0.25
Total
4.2     56    302.5    33.71524   10.10081   136.5

$302.5 = 136.5c_0 + 10.10081c_1$


Curve Fitting (CO1)

$33.71524 = 10.10081c_0 + 4.2c_1$
Solving the two normal equations, we have
$c_0 = 1.97327$, $c_1 = 3.28182$
Hence the curve is
$y = \dfrac{1.97327}{x} + 3.28182\sqrt{x}$
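The same computation can be sketched in Python for any model of the form $y = c_0 f_0(x) + c_1 f_1(x)$. This is a minimal sketch, assuming two basis functions; the helper name `fit_two_basis` is illustrative.

```python
import math

def fit_two_basis(xs, ys, f0, f1):
    """Least squares fit of y = c0*f0(x) + c1*f1(x) via its two normal equations."""
    s00 = sum(f0(x) ** 2 for x in xs)
    s01 = sum(f0(x) * f1(x) for x in xs)
    s11 = sum(f1(x) ** 2 for x in xs)
    t0 = sum(y * f0(x) for x, y in zip(xs, ys))
    t1 = sum(y * f1(x) for x, y in zip(xs, ys))
    det = s00 * s11 - s01**2
    c0 = (t0 * s11 - t1 * s01) / det
    c1 = (s00 * t1 - s01 * t0) / det
    return c0, c1

xs = [0.1, 0.2, 0.4, 0.5, 1, 2]
ys = [21, 11, 7, 6, 5, 6]
c0, c1 = fit_two_basis(xs, ys, lambda x: 1 / x, math.sqrt)
print(round(c0, 4), round(c1, 4))
```

With $f_0 = 1/x$ and $f_1 = \sqrt{x}$ the sums `s00`, `s01`, `s11` are exactly $\sum 1/x^2$, $\sum 1/\sqrt{x}$ and $\sum x$ from the table.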


Daily Quiz(CO1)

Q Fit a second degree parabola to the following data-

𝑥 0 1 2 3 4
𝑓 1 0 3 10 21



Recap (CO1)

• Moments
• Relation between $v_r$ and $\mu_r$
• Relation between $\mu_r$ and $\mu'_r$
• Skewness and kurtosis
• Curve fitting


Topic objective (CO1)

Correlation
• Identify the direction and strength of a correlation between two factors.
• Compute and interpret the Pearson correlation coefficient and test for
significance.
• Compute and interpret the coefficient of determination.
• Compute and interpret the Spearman correlation coefficient and test for
significance.



Correlation (CO1)

• Correlation: In a bivariate distribution we are interested in finding out whether there is any correlation between the two variables under study.
• If a change in one variable affects a change in the other variable, the variables are said to be correlated.
• Positive Correlation:
• If the two variables deviate in the same direction, i.e., if an increase (or decrease) in one results in a corresponding increase (or decrease) in the other, the correlation is said to be direct or positive.
• For example, the correlation between (i) the heights and weights of a group of persons, and (ii) income and expenditure, is positive.


Correlation (CO1)

• Negative Correlation:
• If the two variables deviate in opposite directions, i.e., if an increase (or decrease) in one results in a corresponding decrease (or increase) in the other, the correlation is said to be inverse or negative.
• For example, the correlation between (i) the price and demand of a commodity, and (ii) the volume and pressure of a perfect gas, is negative.
• Perfect Correlation:
• Correlation is said to be perfect if the deviation in one variable is followed by a corresponding and proportional deviation in the other.


Correlation (CO1)

Correlation Coefficient:
• The correlation coefficient due to Karl Pearson is defined as a measure of the intensity or degree of linear relationship between two variables.
• Karl Pearson's Correlation Coefficient: Karl Pearson's correlation coefficient between two variables $X$ and $Y$, denoted by $r(X, Y)$ or $r_{XY}$, is a measure of the linear relationship between them and is defined as
$r(X, Y) = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X\sigma_Y}$
• If $(x_i, y_i)$, $i = 1, 2, \ldots, n$, is the bivariate distribution, then
$\mathrm{Cov}(X, Y) = E[\{X - E(X)\}\{Y - E(Y)\}]$


Correlation (CO1)

Karl Pearson's Coefficient of Correlation (or Product Moment Correlation Coefficient)
The correlation coefficient between two variables $x$ and $y$, usually denoted by $r(x, y)$ or $r_{xy}$, is a numerical measure of the linear relationship between them and is defined as
$r_{xy} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$
$= \dfrac{\frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2 \cdot \frac{1}{n}\sum (y_i - \bar{y})^2}}$
$= \dfrac{\frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x\sigma_y}$
so that
$r_{xy} = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{n\sigma_x\sigma_y}$
or
$r(x, y) = \dfrac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$
Here $n$ is the number of pairs of values of $x$ and $y$.
Note: The correlation coefficient is independent of change of origin and scale.
Let us define two new variables $u$ and $v$ as
$u = \dfrac{x - a}{h}, \quad v = \dfrac{y - b}{k}$
where $a$, $b$, $h$, $k$ are constants with $h, k > 0$. Then $r_{xy} = r_{uv}$, where
$r(u, v) = \dfrac{n\sum uv - \sum u\sum v}{\sqrt{[n\sum u^2 - (\sum u)^2][n\sum v^2 - (\sum v)^2]}}$


Correlation (CO1)

Q. Find the coefficient of correlation between the values of $x$ and $y$:

x   1   3    5    7    8    10
y   8   12   15   17   18   20

Sol. Here $n = 6$. The table is as follows.

x     y     x^2    y^2    xy
1     8     1      64     8
3     12    9      144    36
5     15    25     225    75
7     17    49     289    119
8     18    64     324    144
10    20    100    400    200

$\sum x = 34$, $\sum y = 90$, $\sum x^2 = 248$, $\sum y^2 = 1446$, $\sum xy = 582$


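Correlation (CO1)

Karl Pearson's coefficient of correlation is given by
$r(x, y) = \dfrac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$
$r(x, y) = \dfrac{6 \times 582 - 34 \times 90}{\sqrt{[6 \times 248 - (34)^2][6 \times 1446 - (90)^2]}} = \dfrac{432}{\sqrt{332 \times 576}} = 0.9879$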
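The computational formula for $r$ can be sketched in Python as below. The function name `pearson_r` is illustrative; the second call uses the data $x$ = 10, 14, ..., 30 and $y$ = 18, 12, 24, 6, 30, 36, which is worked by hand with coded variables in this unit.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r from the raw sums n, Σx, Σy, Σx², Σy², Σxy."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))

r1 = pearson_r([1, 3, 5, 7, 8, 10], [8, 12, 15, 17, 18, 20])
r2 = pearson_r([10, 14, 18, 22, 26, 30], [18, 12, 24, 6, 30, 36])
print(round(r1, 4), round(r2, 4))
```

Because $r$ is unchanged by a change of origin and scale, applying `pearson_r` to the raw values gives the same result as applying it to coded values $u$ and $v$.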
Q. Find the coefficient of correlation for the following table:

x   10   14   18   22   26   30
y   18   12   24   6    30   36

Solution: Let $u = \dfrac{x - 22}{4}$, $v = \dfrac{y - 24}{6}$.


Correlation (CO1)

x     y     u     v     u^2   v^2   uv
10    18    -3    -1    9     1     3
14    12    -2    -2    4     4     4
18    24    -1    0     1     0     0
22    6     0     -3    0     9     0
26    30    1     1     1     1     1
30    36    2     2     4     4     4

Total: $\sum u = -3$, $\sum v = -3$, $\sum u^2 = 19$, $\sum v^2 = 19$, $\sum uv = 12$


Correlation (CO1)

Here $n = 6$, $\bar{u} = \dfrac{1}{n}\sum u = \dfrac{1}{6}(-3) = -\dfrac{1}{2}$ and $\bar{v} = \dfrac{1}{n}\sum v = \dfrac{1}{6}(-3) = -\dfrac{1}{2}$.
Then
$r_{uv} = \dfrac{n\sum uv - \sum u\sum v}{\sqrt{[n\sum u^2 - (\sum u)^2][n\sum v^2 - (\sum v)^2]}}$
$= \dfrac{6 \times 12 - (-3)(-3)}{\sqrt{[6 \times 19 - (-3)^2][6 \times 19 - (-3)^2]}} = \dfrac{63}{\sqrt{105 \times 105}} = 0.6$

• Calculation of the coefficient of correlation for a bivariate frequency distribution
• If the bivariate data on $x$ and $y$ are presented in a two-way correlation table and $f$ is the frequency of a particular rectangle in the correlation table, then
$r_{xy} = \dfrac{\sum fxy - \frac{1}{n}\sum fx\sum fy}{\sqrt{[\sum fx^2 - \frac{1}{n}(\sum fx)^2][\sum fy^2 - \frac{1}{n}(\sum fy)^2]}}$
where $n = \sum f$ is the total frequency.
Since change of origin and scale does not affect the coefficient of correlation, $r_{xy} = r_{uv}$, where the new variables $u$, $v$ are suitably chosen.
Q. The following table gives, according to age, the frequency of marks obtained by 100 students in an intelligence test:


Correlation (CO1)

Marks \ Age   18   19   20   21   Total
10-20         4    2    2    -    8
20-30         5    4    6    4    19
30-40         6    8    10   11   35
40-50         4    4    6    8    22
50-60         -    2    4    4    10
60-70         -    2    3    1    6
Total         19   22   31   28   100

Calculate the coefficient of correlation between age and intelligence.
Solution: Let age and intelligence be denoted by $x$ and $y$ respectively.


Correlation (CO1)

Mid value   Marks (y)   x=18   x=19   x=20   x=21   f     u=(y-45)/10   fu    fu^2   fuv
15          10-20       4      2      2      -      8     -3            -24   72     30
25          20-30       5      4      6      4      19    -2            -38   76     20
35          30-40       6      8      10     11     35    -1            -35   35     9
45          40-50       4      4      6      8      22    0             0     0      0
55          50-60       -      2      4      4      10    1             10    10     2
65          60-70       -      2      3      1      6     2             12    24     -2
Total f                 19     22     31     28     100                 -75   217    59
v = x - 20              -2     -1     0      1
fv                      -38    -22    0      28     -32
fv^2                    76     22     0      28     126
fuv                     56     16     0      -13    59
Correlation (CO1)

Let us define two new variables $u$ and $v$ as $u = \dfrac{y - 45}{10}$, $v = x - 20$. Then
$r_{xy} = r_{uv} = \dfrac{\sum fuv - \frac{1}{n}\sum fu\sum fv}{\sqrt{[\sum fu^2 - \frac{1}{n}(\sum fu)^2][\sum fv^2 - \frac{1}{n}(\sum fv)^2]}}$
$= \dfrac{59 - \frac{1}{100}(-75)(-32)}{\sqrt{[217 - \frac{1}{100}(-75)^2][126 - \frac{1}{100}(-32)^2]}} = \dfrac{59 - 24}{\sqrt{\dfrac{643}{4} \times \dfrac{2894}{25}}} \approx 0.257$
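The table sums can be reproduced with a short Python sketch. The layout of `freq` (rows are mark classes, columns are ages, with 0 for the empty cells) is an assumption read off from the row and column totals of the table above.

```python
import math

ages = [18, 19, 20, 21]
mids = [15, 25, 35, 45, 55, 65]  # mid values of the mark classes
freq = [
    [4, 2, 2, 0],
    [5, 4, 6, 4],
    [6, 8, 10, 11],
    [4, 4, 6, 8],
    [0, 2, 4, 4],
    [0, 2, 3, 1],
]

n = s_u = s_v = s_uu = s_vv = s_uv = 0
for y, row in zip(mids, freq):
    u = (y - 45) / 10          # coded marks
    for x, f in zip(ages, row):
        v = x - 20             # coded age
        n += f
        s_u += f * u
        s_v += f * v
        s_uu += f * u * u
        s_vv += f * v * v
        s_uv += f * u * v

r = (s_uv - s_u * s_v / n) / math.sqrt((s_uu - s_u**2 / n) * (s_vv - s_v**2 / n))
print(n, s_u, s_v, s_uu, s_vv, s_uv, round(r, 4))
```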


Rank Correlation (CO1)

RANK CORRELATION:
Definition: Suppose $n$ individuals are ranked according to two characteristics $A$ and $B$. Assuming that no two individuals are bracketed equal in either classification, each of the rank variables $X$ and $Y$ takes the values $1, 2, \ldots, n$. The rank correlation coefficient between $A$ and $B$ is denoted by $r$ and is given by
$r = 1 - \dfrac{6\sum D_i^2}{n(n^2 - 1)}$
where $D_i$ is the difference between the two ranks of the $i$-th individual.


Rank Correlation (CO1)

Question. Compute the rank correlation coefficient for the following data.

Person            A   B    C   D   E   F   G   H   I   J
Rank in maths     9   10   6   5   7   2   4   8   1   3
Rank in physics   1   2    3   4   5   6   7   8   9   10

Sol. Here the ranks are given and $n = 10$.


Rank Correlation (CO1)

Person   R1   R2   D = R1 - R2   D^2
A        9    1    8             64
B        10   2    8             64
C        6    3    3             9
D        5    4    1             1
E        7    5    2             4
F        2    6    -4            16
G        4    7    -3            9
H        8    8    0             0
I        1    9    -8            64
J        3    10   -7            49

$\sum D^2 = 280$


Rank Correlation (CO1)

$r = 1 - \dfrac{6\sum D^2}{n(n^2 - 1)} = 1 - \dfrac{6 \times 280}{10(100 - 1)} = 1 - 1.697 = -0.697$
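When the ranks are already given and there are no ties, the formula is a one-line computation. A minimal Python sketch follows, with an illustrative function name `spearman_from_ranks`.

```python
def spearman_from_ranks(r1, r2):
    """Spearman's rank correlation for untied ranks: 1 - 6*ΣD² / (n(n² - 1))."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n * n - 1))

maths = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
physics = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
r = spearman_from_ranks(maths, physics)
print(round(r, 3))  # -0.697
```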
Uses:
• It is used for finding correlation coefficient if we are dealing with
qualitative characteristics which cannot be measured quantitatively but
can be arranged serially.
• It can also be used where actual data are given.
• In case of extreme observations, Spearman’s formula is preferred to
Pearson’s formula.
Limitations:
• It is not applicable in the case of bivariate frequency distribution.



Tied Correlation(CO1)

• For n > 30, this formula should not be used unless the ranks are given,
since in the contrary case the calculations are quite time-consuming.

TIED RANKS: If some of the individuals receive the same rank in a ranking of merit, they are said to be tied.
• Let us suppose that $m$ of the individuals, say the $(k + 1)$th, $(k + 2)$th, ..., $(k + m)$th, are tied.
• Then each of these $m$ individuals is assigned a common rank, which is the arithmetic mean of the ranks $k + 1, k + 2, \ldots, k + m$.
• For each group of $m_j$ tied individuals, the term $\frac{1}{12}m_j(m_j^2 - 1)$ is added to $\sum D^2$:
$r = 1 - \dfrac{6\left[\sum D^2 + \frac{1}{12}m_1(m_1^2 - 1) + \frac{1}{12}m_2(m_2^2 - 1) + \cdots\right]}{n(n^2 - 1)}$


Tied Correlation (CO1)

Question: Obtain the rank correlation coefficient for the following data:

x   68   64   75   50   64   80   75   40   55   64
y   62   58   68   45   81   60   68   48   50   70

Solution: Here marks are given, so first write down the ranks.


Tied Correlation (CO1)

X               68   64   75    50   64   80   75    40   55   64   Total
Y               62   58   68    45   81   60   68    48   50   70
Rank in X (x)   4    6    2.5   9    6    1    2.5   10   8    6
Rank in Y (y)   5    7    3.5   10   1    6    3.5   9    8    2
D = x - y       -1   -1   -1    -1   5    -5   -1    1    0    4    0
D^2             1    1    1     1    25   25   1     1    0    16   72

Tied values: in X, 75 occurs 2 times and 64 occurs 3 times; in Y, 68 occurs 2 times.


Tied Correlation (CO1)

$r = 1 - \dfrac{6\left[\sum D^2 + \frac{1}{12}m_1(m_1^2 - 1) + \frac{1}{12}m_2(m_2^2 - 1) + \frac{1}{12}m_3(m_3^2 - 1)\right]}{n(n^2 - 1)}$
$= 1 - \dfrac{6\left[72 + \frac{1}{12} \cdot 2(2^2 - 1) + \frac{1}{12} \cdot 3(3^2 - 1) + \frac{1}{12} \cdot 2(2^2 - 1)\right]}{10(10^2 - 1)}$
$= 1 - \dfrac{6 \times 75}{990} = \dfrac{6}{11} = 0.545$
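The tied-rank procedure can be sketched in Python as below. It is a minimal sketch assuming that rank 1 goes to the largest value, as in the worked example; the function names are illustrative.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share the mean rank.

    Returns the list of ranks and the sizes of the tied groups.
    """
    ordered = sorted(values, reverse=True)
    ranks, tie_sizes = {}, []
    for v in set(values):
        count = ordered.count(v)
        first = ordered.index(v) + 1          # best rank occupied by this value
        ranks[v] = first + (count - 1) / 2    # mean of the occupied ranks
        if count > 1:
            tie_sizes.append(count)
    return [ranks[v] for v in values], tie_sizes

def tied_rank_correlation(xs, ys):
    """Spearman's coefficient with the m(m² - 1)/12 correction for each tie."""
    rx, tx = average_ranks(xs)
    ry, ty = average_ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = sum(m * (m * m - 1) / 12 for m in tx + ty)
    return 1 - 6 * (d2 + correction) / (n * (n * n - 1))

x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
r = tied_rank_correlation(x, y)
print(round(r, 3))  # 0.545
```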


Daily Quiz(CO1)

Q1. Find the rank correlation coefficient for the following data:
𝑥 23 27 28 28 29 30 31 33 35 36

𝑦 18 20 22 27 21 29 27 29 28 29



Recap(CO1)

 Correlation
 Karl Pearson coefficient of correlation
 Rank Correlation
 Tied Rank



Topic objectives (CO1)

Regression
• To explain the variation in the dependent variable on the basis of the variation in the independent variables, and to predict the values of the dependent variable.


Regression Analysis (CO1)

• REGRESSION ANALYSIS:
• Regression measures the nature and extent of correlation. Regression is the estimation or prediction of unknown values of one variable from known values of another variable.
Difference between curve fitting and regression analysis: The only fundamental difference, if any, between problems of curve fitting and regression is that in regression any of the variables may be considered as independent or dependent, while in curve fitting one variable cannot be dependent.
Curve of regression and regression equation:
• If two variates $x$ and $y$ are correlated, i.e., there exists an association or relationship between them, then the points of the scatter diagram will be more or less concentrated round a curve. This curve is called the curve of regression, and the relationship is said to be expressed by means of curvilinear regression.
• The mathematical equation of the regression curve is called the regression equation.

The following types of regression are discussed here:
• Linear Regression
• Non-linear Regression
• Multiple Linear Regression


Linear Regression (CO1)

• LINEAR REGRESSION:
• When the points of the scatter diagram are concentrated round a straight line, the regression is called linear and this straight line is known as the line of regression.
• Regression is called non-linear if there exists a relationship other than a straight line between the variables under consideration.


Linear Regression (CO1)

LINES OF REGRESSION: A line of regression is the straight line which gives the best fit, in the least squares sense, to the given data.
Let
$y = a + bx$   (1)
be the equation of the regression line of $y$ on $x$. The normal equations are
$\sum y = na + b\sum x$   (2)
$\sum xy = a\sum x + b\sum x^2$   (3)
Solving (2) and (3) for $a$ and $b$, we get
$b = \dfrac{\sum xy - \frac{1}{n}\sum x\sum y}{\sum x^2 - \frac{1}{n}(\sum x)^2} = \dfrac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2}$   (4)


Linear Regression (CO1)

$a = \dfrac{\sum y}{n} - b\dfrac{\sum x}{n} = \bar{y} - b\bar{x}$   (5)
Eq. (5) gives $\bar{y} = a + b\bar{x}$.
Hence the line $y = a + bx$ passes through the point $(\bar{x}, \bar{y})$.
Putting $a = \bar{y} - b\bar{x}$ in the equation $y = a + bx$, we get
$y - \bar{y} = b(x - \bar{x})$   (6)
Eq. (6) is called the regression line of $y$ on $x$. Here $b$ is called the regression coefficient of $y$ on $x$ and is usually denoted by $b_{yx}$:
$y - \bar{y} = b_{yx}(x - \bar{x})$, where $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$


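Linear Regression (CO1)

Similarly, the regression line of $x$ on $y$, $x = a + by$, can be written as
$x - \bar{x} = b_{xy}(y - \bar{y})$
where $b_{xy}$ is the regression coefficient of $x$ on $y$ and is given by
$b_{xy} = \dfrac{n\sum xy - \sum x\sum y}{n\sum y^2 - (\sum y)^2}$
or $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$, where the terms have their usual meanings.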

USE OF REGRESSION ANALYSIS:
A) In the field of business this tool of statistical analysis is widely used. Businessmen are interested in predicting future production, consumption, investment, prices, profits, sales, etc.
B) In the field of economic planning and sociological studies, projections of population, birth rates, death rates and other similar variables are of great use.


Linear Regression (CO1)

In the regression line of $y$ on $x$, $\bar{x}$ and $\bar{y}$ are the mean values, while
$b_{yx} = \dfrac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2}$
In Eq. (3), shifting the origin to $(\bar{x}, \bar{y})$, we get
$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2$
$\Rightarrow nr\sigma_x\sigma_y = a(0) + bn\sigma_x^2$
$\Rightarrow b = r\dfrac{\sigma_y}{\sigma_x}$
where $r$ is the coefficient of correlation and $\sigma_x$, $\sigma_y$ are the standard deviations of the $x$ and $y$ series respectively.
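The two regression coefficients can be computed directly from the raw sums. The sketch below is illustrative only: it reuses the data $x$ = 1, 3, 5, 7, 8, 10 and $y$ = 8, 12, 15, 17, 18, 20 from the correlation example of this unit, and the function name is an assumption.

```python
import math

def regression_coefficients(xs, ys):
    """Return the means and the regression coefficients b_yx and b_xy."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    byx = num / (n * sxx - sx**2)   # slope of the line of y on x
    bxy = num / (n * syy - sy**2)   # slope (in y) of the line of x on y
    return sx / n, sy / n, byx, bxy

xs = [1, 3, 5, 7, 8, 10]
ys = [8, 12, 15, 17, 18, 20]
x_bar, y_bar, byx, bxy = regression_coefficients(xs, ys)
# r has the common sign of the two coefficients and r² = b_yx * b_xy
r = math.copysign(math.sqrt(byx * bxy), byx)
print(f"y - {y_bar:.2f} = {byx:.4f} (x - {x_bar:.2f})")
print(f"x - {x_bar:.2f} = {bxy:.4f} (y - {y_bar:.2f})")
print(round(r, 4))
```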


Regression Analysis Properties (CO1)

Properties of Regression Coefficients:
Property 1. The correlation coefficient is the geometric mean of the regression coefficients.
Proof: The coefficients of regression are $\dfrac{r\sigma_y}{\sigma_x}$ and $\dfrac{r\sigma_x}{\sigma_y}$.
G.M. between them $= \sqrt{\dfrac{r\sigma_y}{\sigma_x} \times \dfrac{r\sigma_x}{\sigma_y}} = \sqrt{r^2} = r$ = coefficient of correlation, where $r$ takes the common sign of the two regression coefficients.

Property 2. If one of the regression coefficients is greater than unity, the other must be less than unity.
Proof: The two regression coefficients are $b_{yx} = \dfrac{r\sigma_y}{\sigma_x}$ and $b_{xy} = \dfrac{r\sigma_x}{\sigma_y}$.
Let $b_{yx} > 1$; then $\dfrac{1}{b_{yx}} < 1$.
Since $b_{yx} \cdot b_{xy} = r^2 \le 1$,
$b_{xy} \le \dfrac{1}{b_{yx}} < 1$
Similarly, if $b_{xy} > 1$, then $b_{yx} < 1$.

Property 3. The arithmetic mean of the regression coefficients is greater than the correlation coefficient.
Proof: We have to prove that
$\dfrac{b_{yx} + b_{xy}}{2} > r$
i.e. $r\dfrac{\sigma_y}{\sigma_x} + r\dfrac{\sigma_x}{\sigma_y} > 2r$
i.e. (for $r > 0$) $\sigma_x^2 + \sigma_y^2 > 2\sigma_x\sigma_y$
i.e. $(\sigma_x - \sigma_y)^2 > 0$, which is true whenever $\sigma_x \ne \sigma_y$.

Property 4. Regression coefficients are independent of the origin but not of the scale.
Proof: Let $u = \dfrac{x - a}{h}$, $v = \dfrac{y - b}{k}$, where $a$, $b$, $h$ and $k$ are constants. Then
$b_{yx} = \dfrac{r\sigma_y}{\sigma_x} = r \cdot \dfrac{k\sigma_v}{h\sigma_u} = \dfrac{k}{h} \cdot \dfrac{r\sigma_v}{\sigma_u} = \dfrac{k}{h}b_{vu}$
Similarly, $b_{xy} = \dfrac{h}{k}b_{uv}$.
Thus $b_{yx}$ and $b_{xy}$ are both independent of $a$ and $b$ but not of $h$ and $k$.


Regression Analysis Properties (CO1)

Property 5. The correlation coefficient and the two regression coefficients have the same sign.
Proof: Regression coefficient of $y$ on $x$: $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$
Regression coefficient of $x$ on $y$: $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$
Since $\sigma_x$ and $\sigma_y$ are both positive, $b_{yx}$, $b_{xy}$ and $r$ have the same sign.

• Angle Between Two Lines of Regression:
If $\theta$ is the acute angle between the two regression lines in the case of two variables $x$ and $y$, show that
$\tan\theta = \dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$, where $r$, $\sigma_x$, $\sigma_y$ have their usual meanings.
Explain the significance of the formula when $r = 0$ and $r = \pm 1$.
Proof: The equations of the lines of regression of $y$ on $x$ and of $x$ on $y$ are
$y - \bar{y} = \dfrac{r\sigma_y}{\sigma_x}(x - \bar{x})$ and $x - \bar{x} = \dfrac{r\sigma_x}{\sigma_y}(y - \bar{y})$
Their slopes are $m_1 = \dfrac{r\sigma_y}{\sigma_x}$ and $m_2 = \dfrac{\sigma_y}{r\sigma_x}$.
$\tan\theta = \pm\dfrac{m_2 - m_1}{1 + m_2 m_1} = \pm\dfrac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y^2}{\sigma_x^2}}$
$= \pm\dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_y}{\sigma_x} \cdot \dfrac{\sigma_x^2}{\sigma_x^2 + \sigma_y^2} = \pm\dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$
Since $r^2 \le 1$ and $\sigma_x$, $\sigma_y$ are positive, the sign is chosen so that $\tan\theta$ is positive for the acute angle:
$\tan\theta = \dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$
When $r = 0$, $\theta = \dfrac{\pi}{2}$, so the two lines of regression are perpendicular to each other. Hence the estimated value of $y$ is the same for all values of $x$, and vice versa.
When $r = \pm 1$, $\tan\theta = 0$, so that $\theta = 0$ or $\pi$. Hence the lines of regression coincide and there is perfect correlation between the two variates $x$ and $y$.
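The angle formula and its two limiting cases can be checked with a short Python sketch. The function name `regression_angle` is illustrative, and $|r|$ is used so that the acute angle is returned for negative $r$ as well.

```python
import math

def regression_angle(r, sx, sy):
    """Acute angle (in radians) between the two lines of regression."""
    if r == 0:
        return math.pi / 2          # lines are perpendicular
    t = (1 - r * r) / abs(r) * (sx * sy) / (sx * sx + sy * sy)
    return math.atan(t)

print(regression_angle(0, 2.0, 3.0))   # pi/2: perpendicular lines
print(regression_angle(1, 2.0, 3.0))   # 0.0: the lines coincide
theta = regression_angle(0.96, 1.25, 1.0)
print(round(math.degrees(theta), 2))
```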


Linear Regression (CO1)

Q. The equations of two regression lines, obtained in a correlation analysis of 60 observations, are
$5x = 6y + 24$ and $1000y = 768x - 3608$.
What is the correlation coefficient? Show that the ratio of the coefficient of variability of $x$ to that of $y$ is $\dfrac{5}{24}$. What is the ratio of the variances of $x$ and $y$?
Solution: The regression line of $x$ on $y$ is
$5x = 6y + 24$, i.e. $x = \dfrac{6}{5}y + \dfrac{24}{5}$, so $b_{xy} = \dfrac{6}{5}$.
The regression line of $y$ on $x$ is
$1000y = 768x - 3608$, i.e. $y = 0.768x - 3.608$, so $b_{yx} = 0.768$.
Therefore
$r\dfrac{\sigma_x}{\sigma_y} = \dfrac{6}{5}$   (3)
$r\dfrac{\sigma_y}{\sigma_x} = 0.768$   (4)
Multiplying equations (3) and (4), we get
$r^2 = 0.9216 \Rightarrow r = 0.96$
taking the positive root because both regression coefficients are positive.
Dividing (3) by (4), we get
$\dfrac{\sigma_x^2}{\sigma_y^2} = \dfrac{6}{5} \times \dfrac{1}{0.768} = 1.5625$
which is the ratio of the variances of $x$ and $y$. Taking the square root, we get
$\dfrac{\sigma_x}{\sigma_y} = 1.25 = \dfrac{5}{4}$
Since the regression lines pass through the point $(\bar{x}, \bar{y})$, we have
$5\bar{x} = 6\bar{y} + 24$
$1000\bar{y} = 768\bar{x} - 3608$
Solving these equations for $\bar{x}$ and $\bar{y}$, we get $\bar{x} = 6$, $\bar{y} = 1$.
Coefficient of variability of $x$ $= \dfrac{\sigma_x}{\bar{x}}$
Coefficient of variability of $y$ $= \dfrac{\sigma_y}{\bar{y}}$
Required ratio $= \dfrac{\sigma_x}{\bar{x}} \times \dfrac{\bar{y}}{\sigma_y} = \dfrac{\bar{y}}{\bar{x}} \cdot \dfrac{\sigma_x}{\sigma_y} = \dfrac{1}{6} \times \dfrac{5}{4} = \dfrac{5}{24}$
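The steps of this solution can be traced in a few lines of Python. This is a sketch under the assumption that the two lines are first rewritten as $x = b_{xy}y + c_x$ and $y = b_{yx}x + c_y$; the variable names are illustrative.

```python
import math

# x on y:  5x = 6y + 24          ->  x = bxy*y + cx
bxy, cx = 6 / 5, 24 / 5
# y on x:  1000y = 768x - 3608   ->  y = byx*x + cy
byx, cy = 768 / 1000, -3608 / 1000

r = math.copysign(math.sqrt(bxy * byx), bxy)   # r² = b_xy * b_yx
variance_ratio = bxy / byx                     # σx² / σy²
sd_ratio = math.sqrt(variance_ratio)           # σx / σy

# Both lines pass through (x̄, ȳ): substitute the second line into the first
x_bar = (bxy * cy + cx) / (1 - bxy * byx)
y_bar = byx * x_bar + cy

cv_ratio = sd_ratio * y_bar / x_bar            # (σx/x̄) / (σy/ȳ)
print(round(r, 2), round(variance_ratio, 4), round(x_bar, 4), round(y_bar, 4), round(cv_ratio, 4))
```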


Non-Linear Regression (CO1)

• Non-linear Regression:
Let
$y = a + bx + cx^2$
be a second degree parabolic curve of regression of $y$ on $x$. The normal equations are
$\sum y = na + b\sum x + c\sum x^2$
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$
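The three normal equations form a 3×3 linear system. The Python sketch below solves it by Cramer's rule; the data are hypothetical (generated from $y = 1 + 2x + 3x^2$) and the helper names `det3`, `solve3` and `fit_parabola` are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("normal equations are singular")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for row in range(3):
            mc[row][col] = rhs[row]
        solution.append(det3(mc) / d)
    return solution

def fit_parabola(xs, ys):
    """Least squares fit of y = a + b*x + c*x² from the three normal equations."""
    s = [sum(x**k for x in xs) for k in range(5)]            # Σx⁰ ... Σx⁴
    t = [sum(y * x**k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    return solve3(m, t)

xs = [-2, -1, 0, 1, 2]
ys = [9, 2, 1, 6, 17]   # hypothetical data lying exactly on y = 1 + 2x + 3x²
a, b, c = fit_parabola(xs, ys)
print(a, b, c)
```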


Multiple Linear Regression (CO1)

• Multiple Linear Regression:
Here the dependent variable is a function of two or more independent variables. Consider such a linear function:
$y = a + bx + cz$
The normal equations are
$\sum y = ma + b\sum x + c\sum z$
$\sum xy = a\sum x + b\sum x^2 + c\sum xz$
$\sum yz = a\sum z + b\sum xz + c\sum z^2$
where $m$ is the number of observations. Solving these equations gives the values of $a$, $b$ and $c$; the resulting linear function $y = a + bx + cz$ is called the regression plane.


Multiple Linear Regression (CO1)

Q. Obtain a regression plane by using multiple linear regression to fit the data given below.

x   1    2    3    4
y   12   18   24   30
z   0    1    2    3

Sol. Let $y = a + bx + cz$ be the required regression plane, where $a$, $b$, $c$ are the constants to be determined from the following equations:
$\sum y = ma + b\sum x + c\sum z$
$\sum xy = a\sum x + b\sum x^2 + c\sum xz$
$\sum yz = a\sum z + b\sum xz + c\sum z^2$
Here $m = 4$, $\sum x = 10$, $\sum y = 84$, $\sum z = 6$, $\sum x^2 = 30$, $\sum z^2 = 14$, $\sum xz = 20$, $\sum xy = 240$ and $\sum yz = 156$. Substitution yields
$84 = 4a + 10b + 6c$
$240 = 10a + 30b + 20c$
$156 = 6a + 20b + 14c$
These equations are satisfied by $a = 10$, $b = 2$, $c = 4$. (In this data $z = x - 1$, so the third equation is the second minus the first and the solution is not unique.)
Hence a regression plane fitting the data is
$y = 10 + 2x + 4z$
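The normal equations of the regression plane can be set up and solved as sketched below in Python. The first data set is hypothetical (generated from $y = 1 + 2x + 3z$); the helper names are illustrative. The sketch also shows that the data of the worked example give a singular system, because there $z = x - 1$.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_plane(xs, zs, ys):
    """Least squares plane y = a + b*x + c*z via Cramer's rule on the normal equations."""
    m_obs = len(xs)
    sx, sz, sy = sum(xs), sum(zs), sum(ys)
    sxx = sum(x * x for x in xs)
    szz = sum(z * z for z in zs)
    sxz = sum(x * z for x, z in zip(xs, zs))
    sxy = sum(x * y for x, y in zip(xs, ys))
    szy = sum(z * y for z, y in zip(zs, ys))
    m = [[m_obs, sx, sz], [sx, sxx, sxz], [sz, sxz, szz]]
    rhs = [sy, sxy, szy]
    d = det3(m)
    if d == 0:
        raise ValueError("x and z are linearly dependent; the plane is not unique")
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for row in range(3):
            mc[row][col] = rhs[row]
        coeffs.append(det3(mc) / d)
    return coeffs

# Hypothetical data lying exactly on y = 1 + 2x + 3z
a, b, c = fit_plane([1, 2, 3, 4, 5], [2, 1, 4, 3, 6], [9, 8, 19, 18, 29])
print(a, b, c)

# Data of the worked example: z = x - 1, so the system is singular
try:
    fit_plane([1, 2, 3, 4], [0, 1, 2, 3], [12, 18, 24, 30])
except ValueError as err:
    print("singular:", err)
```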




Daily Quiz (CO1)

Q1. Two lines of regression are given by $7x - 16y + 9 = 0$ and $-4x + 5y - 3 = 0$, and $\mathrm{var}(x) = 16$. Calculate
(i) the mean of $x$ and $y$
(ii) the variance of $y$
(iii) the correlation coefficient.


Weekly Assignment (CO1)

Q1. Fit a straight line trend by the method of least squares to the following data:

Year         1979   1980   1981   1982   1983   1984
Production   5      7      9      10     12     17

Q2. From the following data, calculate Karl Pearson's coefficient of skewness:

Marks less than   10   20   30   40    50    60    70
No. of students   10   30   60   110   150   180   200

Q3. Write the regression equations of X on Y and of Y on X for the following data:

X   1   2   3   4   5
Y   2   4   5   3   6

Q4. Fit a straight line trend by the method of least squares to the following data:

Year                           2012   2013   2014   2015   2016   2017
Sales of T.V. sets (in '000)   7      10     12     14     17     24


Faculty Video Links, Youtube & NPTEL Video Links and Online Courses Details

Suggested Youtube/other Video Links:
https://youtu.be/wWenULjri40
https://youtu.be/mL9-WX7wLAo
https://youtu.be/nPsfqz9EljY
https://youtu.be/nqPS29IvnHk
https://youtu.be/aaQXMbpbNKw
https://youtu.be/wDXMYRPup0Y
https://youtu.be/m9a6rg0tNSM
https://youtu.be/Qy1YAKZDA7k
https://youtu.be/s94k4H6AE54
https://youtu.be/lBB4stn3exM
https://youtu.be/0WejW9MiTGg
https://youtu.be/QAEZOhE13Wg
https://youtu.be/ddYNq1TxtM0
https://youtu.be/YciBHHeswBM
https://youtu.be/VCJdg7YBbAQ
https://youtu.be/yhzJxftDgms


MCQ (CO1)

Q1. Which one is true?
i. Correlation helps to determine the validity of a test.
ii. Correlation helps to determine the reliability of a test.
iii. Correlation indicates the nature of the relationship between two variables.
iv. All of the above
Q2. Which one is true?
i. If $b_{xy} > 1$, then $b_{yx} < 1$.
ii. $\dfrac{b_{yx} + b_{xy}}{2} > r$
iii. $\dfrac{b_{yx} + b_{xy}}{4} > 2r$
iv. If $b_{yx} > 1$, then $b_{xy} < 1$.


MCQ (CO1)

Q3. The sum of squares of the items is 2430, the mean is 7 and N = 12. Find the variance.
i. 176.5
ii. 12.38
iii. 153.26
iv. 14
Q4. Calculate the standard deviation of the following data: 9, 8, 6, 5, 8, 6
i. 2
ii. 3
iii. 1.414
iv. 2.414


Glossary (CO1)

Q1. An incomplete distribution is given below:

x   10-20   20-30   30-40   40-50   50-60   60-70   70-80
f   12      30      X       65      Y       25      18

Given that the median value is 46 and N = 229, find:
i. X
ii. Y
iii. Mean
iv. Mode
Pick the correct option from the glossary:
a. 45.82
b. 33.5
c. 46.07
d. 45


Glossary (CO1)

Q2. For the following:
i. Equation of the line of y on x
ii. Regression coefficient of x on y
iii. Correlation coefficient
iv. Equation of the line of x on y
Pick the correct option from the glossary:
a. $x - \bar{x} = b_{xy}(y - \bar{y})$
b. $r(x, y)$
c. $y - \bar{y} = b_{yx}(x - \bar{x})$
d. $b_{xy}$


Old Question Papers

First Sessional Set-1 (CSE,IT,CS,ECE,IOT).docx


Second Sessional Set-2 (CSE,IT,CS,ECE,IOT).docx
Maths IV PUT.docx
Maths IV final paper_2022.pdf



Expected Questions for University Exam (CO1)

Q1. Obtain the normal equations by the method of least squares for the curve $y = c_0 x + \dfrac{c_1}{x}$. Fit it to the following data:

x   0.1   0.2   0.4   0.5   1   2
y   21    11    7     6     5   6

Q2. Find the multiple linear regression of $x$ on $y$ and $z$ from the data relating to three variables:

x   7   12   17   20
y   4   7    9    12
z   1   2    5    8

Q3. If $\theta$ is the angle between the two lines of regression, then express $\tan\theta$ in terms of the correlation coefficient $r$. Explain the significance when $r = 0$ and $r = \pm 1$.
Q4. Two lines of regression are given by $7x - 16y + 9 = 0$ and $-4x + 5y - 3 = 0$, and $\mathrm{var}(x) = 16$. Calculate (i) the mean of $x$ and $y$, (ii) the S.D. of $y$, (iii) the correlation coefficient.
Expected Questions for University Exam (CO1)

Q5. An incomplete distribution of families according to their expenditure per week is given below. The median and mode for the distribution are Rs 25 and Rs 24 respectively. Calculate the missing frequencies.

Expenditure       0-10   10-20   20-30   30-40   40-50
No. of families   14     ?       27      ?       15

Q6. The first four moments of a distribution about 2 are 1, 2.5, 5.5 and 16 respectively. Calculate the four moments about the mean and about the origin.


Recap (CO1)

We discussed the following topics:


 Measures of central tendency – mean, median,
mode
 Moment
 Skewness
 Kurtosis
 Curve fitting
 Least squares principles of curve fitting
 Correlation
 Regression analysis



