TCH442E Quantitative Methods For Finance

This document provides an overview of the course "Quantitative Methods for Finance" taught by Dr. Quyen Do Nguyen. The course aims to apply econometrics and quantitative methods to examine relationships between variables in finance. It will cover linear and non-linear regression models, hypothesis testing, instrumental variables, panel data models, discrete choice models, and time series models. Students will learn these techniques and apply them to finance research. Assessment includes class attendance, a mid-term group work, and a final exam. The document discusses different data types like cross-section, time series, and panel data and introduces the linear regression model and ordinary least squares estimation method.



TCH442E

QUANTITATIVE
METHODS FOR FINANCE

Dr. Quyen Do Nguyen


quyendn@ftu.edu.vn

Quantitative methods for Finance

In Economics, Econometrics has been described as the discipline that "aim[s] to give empirical content to economic relations." (The New Palgrave: A Dictionary of Economics)

Applied Econometrics is the application of mathematics and statistical methods to data in order to:
• Investigate the existence of relationships between variables.
• Measure the strength of these relationships.

Quantitative methods for Finance is the application of econometrics in examining the relationship between variables in the finance area.


Course learning outcomes

• Become familiar with basic econometric techniques to conduct and evaluate a wide range of applied econometric research, and apply them in the finance research area.

• On successful completion of this module you should:
  • Have developed a deeper knowledge of quantitative methods and applied econometrics and their usefulness for understanding economic relationships.
  • Have applied quantitative methods and econometrics in finance.
  • Have acquired a basic knowledge of Stata.

Outline and Textbooks


Main topics:
• Linear and Non-Linear Regression Models.
• Hypothesis Testing.
• Instrumental Variables (IV).
• Panel Data Models.
• Discrete Choice Models.
• Time Series Models.

Textbooks:
W. H. Greene. Econometric Analysis, 5th Edition, Prentice Hall
International, 2003.
J. Wooldridge. Introductory Econometrics: A Modern Approach, 4th
Edition, South-Western Cengage Learning, 2006. (Simple/Less Math)

Structure and Assessment

• 15 sessions of lectures and seminars.
• Attendance checking (10%).
• Mid-term group work (30%).
• Final exam (60%).

Data

Data used by econometricians can be:
• Experimental (e.g. controlled experiment).
• Observational (e.g. survey).

Data are typically organised in different forms:
• Cross-section data.
• Time series data.
• Panel data.

Cross Section Data


Year Id Income Hours
2000 1 x x
2000 2 x x
2000 3 x x
2001 4 x x
2001 5 x x
2001 6 x x
2002 7 x x
2002 8 x x
2002 9 x x

Several individuals in each period but different individuals in different periods.

Time Series Data


Year Income Hours

2000 x x

2001 x x

2002 x x

Only one individual in each period and the same individual is followed over time.


Panel Data
Year id Income Hours
2000 1 x x
2000 2 x x
2000 3 x x
2001 1 x x
2001 2 x x
2001 3 x x
2002 1 x x
2002 2 x x
2002 3 x x

Several individuals in each period and the individuals are followed over time.
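For readers working outside Stata, a minimal pandas sketch (an illustrative choice, not part of the course materials) of the panel layout above, and of how the cross-section and time-series layouts relate to it:

```python
# Sketch: the panel layout above as a pandas DataFrame indexed by (id, year).
# The values are placeholders, as in the slide's 'x' cells.
import pandas as pd

panel = pd.DataFrame({
    "year":   [2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002],
    "id":     [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "income": [None] * 9,
    "hours":  [None] * 9,
}).set_index(["id", "year"]).sort_index()

# A cross-section is one year of this panel; a time series is one id over all years.
cross_section_2000 = panel.xs(2000, level="year")
time_series_id1 = panel.xs(1, level="id")
```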

LINEAR
REGRESSION
MODEL


The Linear Regression Model

The linear regression model is used to study the relationship between a set of variables and to make inference on the average effect of a given variable on another.

Inference includes:
• Estimation.
• Hypothesis testing.
• Confidence intervals.

Estimation with a linear regression is used to compute the average effect on a given outcome Y of a unit change in a given variable X by fitting a straight line to data on Y and X.


The Linear Regression Model with One Variable

$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n$

• $X$ is the independent or explanatory variable.
• $Y$ is the dependent variable.
• $\beta_0$ is the intercept.
• $\beta_1$ is the slope.
• $u_i$ is the error term.

The regression error consists of all factors affecting Y other than X, including possible errors in the measurement of Y.
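As a concrete illustration of this data-generating process, here is a minimal Python sketch that simulates data from the one-variable model; the coefficient values, error distribution, and sample size are illustrative assumptions, not values from the course.

```python
# Minimal sketch of the data-generating process Y_i = beta0 + beta1 * X_i + u_i.
# All numbers below (coefficients, sample size, distributions) are assumptions
# chosen purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100
beta0, beta1 = 2.0, 0.5                        # hypothetical intercept and slope

X = rng.normal(loc=10.0, scale=2.0, size=n)    # explanatory variable
u = rng.normal(loc=0.0, scale=1.0, size=n)     # error term, mean zero given X
Y = beta0 + beta1 * X + u                      # dependent variable
```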


An Application: Class Size Data


Research question: what is the average effect of class size on
students’ performance?
Suppose that we can measure students’ performance by test
scores and class size by the student-teacher ratio (STR).
(This example is based on 420 California School Districts in 1998)


An Application: Class Size Data


We can write the corresponding linear regression model:

$TestScore = \beta_0 + \beta_1 \, STR + u$

$\beta_1$ = slope of the regression line $= \dfrac{\Delta TestScore}{\Delta STR}$ = change in test score for a unit change in STR.

• $\beta_0$ and $\beta_1$ are unknown.
• We can estimate them with data. How? With the ORDINARY LEAST SQUARES (OLS) estimator.

Ordinary Least Squares Estimator

The OLS estimator minimizes the sum of the squared distances between the observed $y_i$ and the estimated regression line, i.e. the sum of squared residuals.

An estimate of the unknown parameters $\beta_0$ and $\beta_1$ is found by solving the following minimization problem:

$\min_{\hat\beta_0, \hat\beta_1} \sum_{i=1}^{n} \left( y_i - (\hat\beta_0 + \hat\beta_1 x_i) \right)^2$

Computing the Betas

The OLS estimator minimizes the sum of squared residuals:

$\min_{\hat\beta_0, \hat\beta_1} \sum_i \hat u_i^2 = \sum_i (y_i - \hat y_i)^2 = \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$

First derivatives (set equal to zero):

$\dfrac{\partial \sum_i \hat u_i^2}{\partial \hat\beta_0} = -2 \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$

$\dfrac{\partial \sum_i \hat u_i^2}{\partial \hat\beta_1} = -2 \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)\, x_i = 0$


Computing the Betas

The two equations above can be re-written as:

$\sum_i y_i = n \hat\beta_0 + \hat\beta_1 \sum_i x_i$

$\sum_i y_i x_i = \hat\beta_0 \sum_i x_i + \hat\beta_1 \sum_i x_i^2$

These two equations, known as the normal equations, can be solved to obtain:

$\hat\beta_1 = \dfrac{n \sum_i y_i x_i - \sum_i y_i \sum_i x_i}{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2} = \dfrac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}$

$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$
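A minimal Python/numpy sketch of these closed-form formulas, applied to simulated data; the data and true coefficients are illustrative assumptions, not the course's example.

```python
# Sketch: OLS slope and intercept from the closed-form (normal-equation) solution.
# Simulated data; the true line y = 2 + 0.5x is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10.0, 2.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)

beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

# Cross-check against numpy's built-in least-squares fit (slope first, then intercept).
slope_check, intercept_check = np.polyfit(x, y, deg=1)
print(beta0_hat, beta1_hat)      # should agree with intercept_check, slope_check
```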

The OLS regression line

Estimated slope: $\hat\beta_1 = -2.28$
Estimated intercept: $\hat\beta_0 = 698.9$
Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \, STR$

Interpreting the Estimates

$\widehat{TestScore} = 698.9 - 2.28 \, STR$

• The slope implies that districts with one more student per teacher on average have test scores that are 2.28 points lower. That is, $\dfrac{\Delta TestScore}{\Delta STR} = -2.28$.

• The intercept implies that districts with zero STR would have a (predicted) test score of 698.9. The intercept is not always economically meaningful.


Predicted Values & Residuals

One of the districts in the data set is Antelope, CA, for which
STR = 19.33 and Test Score = 657.8
Predicted value: Yˆ
Antelope= 698.9 – 2.2819.33 = 654.8
Residual: uˆAntelope = 657.8 – 654.8 = 3.0 22
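The same calculation as a short Python sketch, using the estimated coefficients reported on the slides (698.9 and −2.28); the variable names are illustrative.

```python
# Sketch: predicted value and residual for Antelope, CA, using the slide's estimates.
beta0_hat, beta1_hat = 698.9, -2.28    # estimated intercept and slope from the slides

str_antelope = 19.33                   # student-teacher ratio
score_antelope = 657.8                 # observed test score

y_hat = beta0_hat + beta1_hat * str_antelope   # predicted value, about 654.8
residual = score_antelope - y_hat              # residual, about 3.0
print(round(y_hat, 1), round(residual, 1))
```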


Summary and Next Steps


• Linear regression model with one variable.
• OLS estimator.

Questions: How well does the model fit the data? Is OLS a “good” estimator, and what are its properties?

• Measures of fit.
• Sampling distribution of OLS estimator.


Measures of Fit

How well does the regression line fit the data?

• Root Mean Squared Error (RMSE): the square root of the average squared distance of a data point from the fitted regression line, i.e. it measures the standard deviation of the regression residuals.

$RMSE = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} \hat u_i^2}$

• $R^2$: a measure of the fraction of the variance of Y that is explained by X, i.e. how well the model interpolates the data points. It ranges between zero (no fit) and one (perfect fit).


Derivation of R²

$R^2$ measures the fraction of the sample variance of Y that is “explained” by the regression model.

• $\mathrm{Var}(Y_i) = \mathrm{Var}(\hat Y_i) + \mathrm{Var}(\hat u_i)$

• Total sum of squares = “Explained” SS + “Residual” SS

$R^2 = 1 - \dfrac{RSS}{TSS} = \dfrac{ESS}{TSS} = \dfrac{\sum_{i=1}^{n} (\hat Y_i - \overline{\hat Y})^2}{\sum_{i=1}^{n} (Y_i - \bar Y)^2}$

• $R^2 = 0$ means ESS = 0; $R^2 = 1$ means ESS = TSS.

• $0 \le R^2 \le 1$
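A minimal numpy sketch of these fit measures; the data and fitted line are simulated for illustration, not the California school data.

```python
# Sketch: RMSE and R^2 for a one-variable OLS fit on simulated data.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(10.0, 2.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)   # illustrative true line

beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()
resid = y - (beta0_hat + beta1_hat * x)

rmse = np.sqrt(np.mean(resid ** 2))        # standard deviation of the residuals
rss = np.sum(resid ** 2)                   # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)          # total sum of squares
r2 = 1.0 - rss / tss                       # fraction of the variance of y explained
print(rmse, r2)
```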

Example of R²

$\widehat{TestScore} = 698.9 - 2.28 \, STR, \qquad R^2 = 0.05$

STR explains a small fraction of the variation in test scores (5%).
Does this mean that STR is not important?


Adjusted R²

Problem: adding a regressor never lowers the $R^2$ (and typically raises it), even when the regressor adds little real explanatory power, so the $R^2$ can inflate the apparent explanatory power of the model.

Adjusted $R^2$:

$\bar R^2 = 1 - \dfrac{n-1}{n-k-1} \cdot \dfrac{SSR}{TSS}$

where k is the number of regressors.

Adding a regressor lowers SSR but also lowers (n − k − 1), so the adjusted $R^2$ rises only when the improvement in fit outweighs the penalty for the extra regressor. A parsimonious model is preferred!
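A small sketch of this adjustment as a helper function; the variable names and the example values in the comment are illustrative and assume the quantities computed in the earlier fit sketch.

```python
# Sketch: adjusted R^2 from the residual and total sums of squares,
# the sample size n, and the number of regressors k (excluding the intercept).
def adjusted_r2(rss: float, tss: float, n: int, k: int) -> float:
    return 1.0 - (n - 1) / (n - k - 1) * (rss / tss)

# Example (hypothetical numbers): with n = 200 observations and k = 1 regressor,
# adjusted_r2(rss, tss, n=200, k=1) is slightly below the ordinary R^2.
```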


The Least Squares Properties

Under some assumptions, the OLS estimator yields estimates of $\beta_0$ and $\beta_1$ that are:

• Unbiased, i.e. the estimates are on average equal to the true value of $\beta$.

• Consistent, i.e. as the sample size increases, the estimates converge in probability to the true value of $\beta$.

• Efficient, i.e. the estimates have the smallest variance among all linear unbiased estimators.


The Least Squares Assumptions

$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n$

1. The conditional distribution of u given X has mean zero, that is, E(u|X = x) = 0.
(This holds, in particular, if X and u are independently distributed.)
• This implies that the estimated betas are unbiased.

2. $(X_i, Y_i)$, i = 1, …, n, are i.i.d.
• This is true if observations on X and Y are drawn by random sampling.

3. X and Y have finite fourth moments.
• This implies that large outliers in X and/or Y are rare, which is important since large outliers can result in meaningless values of the estimates.


Assumption 1: E(u|X = x) = 0

For any given value of X, the mean value of u is zero.

Example: $TestScore_i = \beta_0 + \beta_1 STR_i + u_i$

Question: is E(u|X = x) = 0 a plausible assumption?


Assumption 2: $(X_i, Y_i)$ are i.i.d.

Random variables are independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.

This assumption holds by definition when X and Y are sampled by simple random sampling.

However, in reality this assumption is often violated: non-i.i.d. sampling typically occurs when data are recorded over time (time series and/or panel data).

Assumption 3: $E(X^4) < \infty$ and $E(Y^4) < \infty$

The fourth moment is a measure of whether the distribution is tall and skinny or rather short and compact when compared to a normal distribution with the same variance. Bounded fourth moments mean that large outliers are rare.


The Sampling Distribution of $\hat\beta_1$

The OLS estimator is computed from a sample of data. Hence it is necessary to:

• Quantify the sampling uncertainty associated with $\hat\beta_1$.

• Construct a confidence interval for $\hat\beta_1$.

• Use $\hat\beta_1$ to test hypotheses such as $\beta_1 = 0$.


The Sampling Distribution of $\hat\beta_1$

Under the three OLS assumptions that we discussed, the exact (finite-sample) distribution of $\hat\beta_1$ is such that:

• $E(\hat\beta_1) = \beta_1$ (that is, $\hat\beta_1$ is unbiased).

• $\hat\beta_1 \xrightarrow{p} \beta_1$ (that is, $\hat\beta_1$ is consistent).

• $\mathrm{Var}(\hat\beta_1) = \dfrac{1}{n} \cdot \dfrac{\mathrm{var}[(X_i - \mu_X) u_i]}{\sigma_X^4}$

• When n is large, $\dfrac{\hat\beta_1 - E(\hat\beta_1)}{\sqrt{\mathrm{var}(\hat\beta_1)}} \sim N(0, 1)$.

Other than the first and second moments, the exact distribution of $\hat\beta_1$ is complicated and depends on the joint distribution of (X, u).


The larger the sample (large n), the smaller the variance of $\hat\beta_1$.
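A Monte Carlo sketch of this point: under assumed values for the true coefficients and error distribution, repeatedly re-estimating $\hat\beta_1$ on fresh samples of different sizes shows the estimates centred on the true slope, with a spread that shrinks as n grows.

```python
# Sketch: simulated sampling distribution of beta1_hat for two sample sizes.
# The true coefficients and distributions below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1 = 2.0, 0.5

def ols_slope(n: int) -> float:
    """Draw one sample of size n from the model and return the OLS slope."""
    x = rng.normal(10.0, 2.0, size=n)
    y = beta0 + beta1 * x + rng.normal(0.0, 1.0, size=n)
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

reps = 2000
small_n = np.array([ols_slope(25) for _ in range(reps)])
large_n = np.array([ols_slope(400) for _ in range(reps)])

# Both sets of estimates are centred near the true slope (unbiasedness), but the
# variance is far smaller for n = 400, roughly in proportion to 1/n.
print(small_n.mean(), large_n.mean())
print(small_n.var(), large_n.var())
```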


Summary
Linear regression model with one variable:
• OLS estimator.
• Measures of fit.
• Sampling distribution of OLS estimator.

Next:
• Linear regression model with many variables.


The Sampling Distribution of $\hat\beta_1$: Technical Appendix

$Y_i = \beta_0 + \beta_1 X_i + u_i$

$\bar Y = \beta_0 + \beta_1 \bar X + \bar u$

so $Y_i - \bar Y = \beta_1 (X_i - \bar X) + (u_i - \bar u)$

Thus,

$\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2} = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)\left[ \beta_1 (X_i - \bar X) + (u_i - \bar u) \right]}{\sum_{i=1}^{n} (X_i - \bar X)^2}$


$\hat\beta_1 = \beta_1 \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar X)}{\sum_{i=1}^{n} (X_i - \bar X)^2} + \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^{n} (X_i - \bar X)^2}$

so

$\hat\beta_1 - \beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^{n} (X_i - \bar X)^2}.$

Now

$\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u) = \sum_{i=1}^{n} (X_i - \bar X) u_i - \left[ \sum_{i=1}^{n} (X_i - \bar X) \right] \bar u$

$\qquad = \sum_{i=1}^{n} (X_i - \bar X) u_i - \left[ \sum_{i=1}^{n} X_i - n \bar X \right] \bar u = \sum_{i=1}^{n} (X_i - \bar X) u_i$


Substituting $\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u) = \sum_{i=1}^{n} (X_i - \bar X) u_i$ into the expression for $\hat\beta_1 - \beta_1$:

$\hat\beta_1 - \beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^{n} (X_i - \bar X)^2}$

so

$\hat\beta_1 - \beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2}$


Is $\hat\beta_1$ unbiased?

$E(\hat\beta_1) - \beta_1 = E\left[ \dfrac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} \right] = E\left[ E\left( \dfrac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} \,\middle|\, X_1, \dots, X_n \right) \right] = 0$

• If Assumption 1 holds, then $E(\hat\beta_1) = \beta_1$.

• Hence, $\hat\beta_1$ is an unbiased estimator of $\beta_1$.

Derivation of Var($\hat\beta_1$)

Let us write

$\hat\beta_1 - \beta_1 = \dfrac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{\frac{n-1}{n} s_X^2} = \dfrac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\frac{n-1}{n} s_X^2}$

where $v_i = (X_i - \bar X) u_i$. If n is large, $s_X^2 \approx \sigma_X^2$ and $\frac{n-1}{n} \approx 1$, so

$\hat\beta_1 - \beta_1 \approx \dfrac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\sigma_X^2}$


$\hat\beta_1 - \beta_1 \approx \dfrac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\sigma_X^2}$

so

$\mathrm{var}(\hat\beta_1 - \beta_1) = \mathrm{var}(\hat\beta_1) = \dfrac{\mathrm{var}(v)/n}{(\sigma_X^2)^2}$

so

$\mathrm{var}(\hat\beta_1) = \dfrac{1}{n} \cdot \dfrac{\mathrm{var}[(X_i - \mu_X) u_i]}{\sigma_X^4}.$

Hence:
• Var($\hat\beta_1$) is inversely proportional to n.


When n is large, $\frac{1}{n} \sum_{i=1}^{n} v_i$ is approximately distributed $N(0, \sigma_v^2 / n)$.

Hence,

$\hat\beta_1 \sim N\!\left( \beta_1, \; \dfrac{\sigma_v^2}{n \, \sigma_X^4} \right)$, where $v_i = (X_i - \mu_X) u_i$.

