TCH442E Quantitative Methods For Finance

This document provides an overview of the course "Quantitative Methods for Finance" taught by Dr. Quyen Do Nguyen. The course aims to apply econometrics and quantitative methods to examine relationships between variables in finance. It will cover linear and non-linear regression models, hypothesis testing, instrumental variables, panel data models, discrete choice models, and time series models. Students will learn these techniques and apply them to finance research. Assessment includes class attendance, a mid-term group work, and a final exam. The document discusses different data types like cross-section, time series, and panel data and introduces the linear regression model and ordinary least squares estimation method.



TCH442E

QUANTITATIVE
METHODS FOR FINANCE

Dr. Quyen Do Nguyen


quyendn@ftu.edu.vn

Quantitative methods for Finance

In Economics, Econometrics has been described as the discipline that "aim[s] to give empirical content to economic relations." (The New Palgrave: A Dictionary of Economics)

Applied Econometrics is the application of mathematics and statistical methods to data in order to:
• Investigate the existence of relationships between variables.
• Measure the strength of these relationships.

Quantitative methods for Finance is the application of econometrics in examining the relationship between variables in the finance area.


Course learning outcomes

• Become familiar with basic econometric techniques to conduct and evaluate a wide range of applied econometric research, and apply them in the finance research area.

• On successful completion of this module you should:
  • Have developed a deeper knowledge of quantitative methods and applied econometrics and their usefulness for understanding economic relationships.
  • Have applied quantitative methods and econometrics in finance.
  • Have acquired a basic knowledge of Stata.

Outline and Textbooks


Main topics:
• Linear and Non-Linear Regression Models.
• Hypothesis Testing.
• Instrumental Variables (IV).
• Panel Data Models.
• Discrete Choice Models.
• Time Series Models.

Textbooks:
W. H. Greene. Econometric Analysis, 5th Edition, Prentice Hall
International, 2003.
J. Wooldridge. Introductory Econometrics: A Modern Approach, 4th
Edition, South-Western Cengage Learning, 2006. (Simple/Less Math)

Structure and Assessment

• 15 sessions of lectures and seminars.
• Attendance checking (10%).
• Mid-term group work (30%).
• Final exam (60%).

Data

Data used by econometricians can be:
• Experimental (e.g. controlled experiment).
• Observational (e.g. survey).

Data are typically organised in different forms:
• Cross-section data.
• Time series data.
• Panel data.

Cross Section Data


Year Id Income Hours
2000 1 x x
2000 2 x x
2000 3 x x
2001 4 x x
2001 5 x x
2001 6 x x
2002 7 x x
2002 8 x x
2002 9 x x

Several individuals in each period but different individuals in different periods.

Time Series Data


Year Income Hours

2000 x x

2001 x x

2002 x x

Only one individual in each period and the same individual is followed over time.


Panel Data
Year id Income Hours
2000 1 x x
2000 2 x x
2000 3 x x
2001 1 x x
2001 2 x x
2001 3 x x
2002 1 x x
2002 2 x x
2002 3 x x

Several individuals in each period and the individuals are followed over time.
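For readers working outside Stata, a minimal pandas sketch (an illustrative choice, not part of the course materials) of the panel layout above, and of how the cross-section and time-series layouts relate to it:

```python
# Sketch: the panel layout above as a pandas DataFrame indexed by (id, year).
# The values are placeholders, as in the slide's 'x' cells.
import pandas as pd

panel = pd.DataFrame({
    "year":   [2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002],
    "id":     [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "income": [None] * 9,
    "hours":  [None] * 9,
}).set_index(["id", "year"]).sort_index()

# A cross-section is one year of this panel; a time series is one id over all years.
cross_section_2000 = panel.xs(2000, level="year")
time_series_id1 = panel.xs(1, level="id")
```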

LINEAR
REGRESSION
MODEL


The Linear Regression Model

The linear regression model is used to study the relationship between a set of variables and to make inference on the average effect of a given variable on another.

Inference includes:
• Estimation.
• Hypothesis testing.
• Confidence intervals.

Estimation with a linear regression is used to compute the average effect on a given outcome Y of a unit change in a given variable X by fitting a straight line to data on Y and X.


The Linear Regression Model with One Variable

$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n$

• $X$ is the independent or explanatory variable.
• $Y$ is the dependent variable.
• $\beta_0$ is the intercept.
• $\beta_1$ is the slope.
• $u_i$ is the error term.

The regression error consists of all factors affecting Y other than X, including possible errors in the measurement of Y.
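As a concrete illustration of this data-generating process, here is a minimal Python sketch that simulates data from the one-variable model; the coefficient values, error distribution, and sample size are illustrative assumptions, not values from the course.

```python
# Minimal sketch of the data-generating process Y_i = beta0 + beta1 * X_i + u_i.
# All numbers below (coefficients, sample size, distributions) are assumptions
# chosen purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100
beta0, beta1 = 2.0, 0.5                        # hypothetical intercept and slope

X = rng.normal(loc=10.0, scale=2.0, size=n)    # explanatory variable
u = rng.normal(loc=0.0, scale=1.0, size=n)     # error term, mean zero given X
Y = beta0 + beta1 * X + u                      # dependent variable
```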


An Application: Class Size Data


Research question: what is the average effect of class size on
students’ performance?
Suppose that we can measure students’ performance by test
scores and class size by the student-teacher ratio (STR).
(This example is based on 420 California School Districts in 1998)


An Application: Class Size Data


We can write the corresponding linear regression model:

$TestScore = \beta_0 + \beta_1 \, STR + u$

$\beta_1$ = slope of the regression line $= \dfrac{\Delta TestScore}{\Delta STR}$ = change in test score for a unit change in STR.

• $\beta_0$ and $\beta_1$ are unknown.
• We can estimate them with data. How? With the ORDINARY LEAST SQUARES (OLS) estimator.

Ordinary Least Squares Estimator

The OLS estimator minimizes the sum of the squared distances between the observed $y_i$ and the estimated regression line, i.e. the sum of squared residuals.

An estimate of the unknown parameters $\beta_0$ and $\beta_1$ is found by solving the following minimization problem:

$\min_{\hat\beta_0, \hat\beta_1} \sum_{i=1}^{n} \left( y_i - (\hat\beta_0 + \hat\beta_1 x_i) \right)^2$

Computing the Betas

The OLS estimator minimizes the sum of squared residuals:

$\min_{\hat\beta_0, \hat\beta_1} \sum_i \hat u_i^2 = \sum_i (y_i - \hat y_i)^2 = \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$

First derivatives (set equal to zero):

$\dfrac{\partial \sum_i \hat u_i^2}{\partial \hat\beta_0} = -2 \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$

$\dfrac{\partial \sum_i \hat u_i^2}{\partial \hat\beta_1} = -2 \sum_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)\, x_i = 0$


Computing the Betas

The two equations above can be re-written as:

$\sum_i y_i = n \hat\beta_0 + \hat\beta_1 \sum_i x_i$

$\sum_i y_i x_i = \hat\beta_0 \sum_i x_i + \hat\beta_1 \sum_i x_i^2$

These two equations, known as the normal equations, can be solved to obtain:

$\hat\beta_1 = \dfrac{n \sum_i y_i x_i - \sum_i y_i \sum_i x_i}{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2} = \dfrac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}$

$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$
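A minimal Python/numpy sketch of these closed-form formulas, applied to simulated data; the data and true coefficients are illustrative assumptions, not the course's example.

```python
# Sketch: OLS slope and intercept from the closed-form (normal-equation) solution.
# Simulated data; the true line y = 2 + 0.5x is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10.0, 2.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)

beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

# Cross-check against numpy's built-in least-squares fit (slope first, then intercept).
slope_check, intercept_check = np.polyfit(x, y, deg=1)
print(beta0_hat, beta1_hat)      # should agree with intercept_check, slope_check
```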

The OLS regression line

Estimated slope: $\hat\beta_1 = -2.28$
Estimated intercept: $\hat\beta_0 = 698.9$
Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \, STR$

Interpreting the Estimates

$\widehat{TestScore} = 698.9 - 2.28 \, STR$

• The slope implies that districts with one more student per teacher on average have test scores that are 2.28 points lower. That is, $\dfrac{\Delta TestScore}{\Delta STR} = -2.28$.

• The intercept implies that districts with zero STR would have a (predicted) test score of 698.9. The intercept is not always economically meaningful.


Predicted Values & Residuals

One of the districts in the data set is Antelope, CA, for which
STR = 19.33 and Test Score = 657.8
Predicted value: Yˆ
Antelope= 698.9 – 2.2819.33 = 654.8
Residual: uˆAntelope = 657.8 – 654.8 = 3.0 22
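The same calculation as a short Python sketch, using the estimated coefficients reported on the slides (698.9 and −2.28); the variable names are illustrative.

```python
# Sketch: predicted value and residual for Antelope, CA, using the slide's estimates.
beta0_hat, beta1_hat = 698.9, -2.28    # estimated intercept and slope from the slides

str_antelope = 19.33                   # student-teacher ratio
score_antelope = 657.8                 # observed test score

y_hat = beta0_hat + beta1_hat * str_antelope   # predicted value, about 654.8
residual = score_antelope - y_hat              # residual, about 3.0
print(round(y_hat, 1), round(residual, 1))
```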


Summary and Next Steps


• Linear regression model with one variable.
• OLS estimator.

Questions: How well does the model fit the data? Is OLS a “good” estimator, and what are its properties?

• Measures of fit.
• Sampling distribution of OLS estimator.


Measures of Fit

How well does the regression line fit the data?

• Root Mean Squared Error (RMSE): the square root of the average squared distance of a data point from the fitted regression line, i.e. it measures the standard deviation of the regression residuals.

$RMSE = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} \hat u_i^2}$

• $R^2$: a measure of the fraction of the variance of Y that is explained by X, i.e. how well the model interpolates the data points. It ranges between zero (no fit) and one (perfect fit).


Derivation of R²

$R^2$ measures the fraction of the sample variance of Y that is “explained” by the regression model.

• $\mathrm{Var}(Y_i) = \mathrm{Var}(\hat Y_i) + \mathrm{Var}(\hat u_i)$

• Total sum of squares = “Explained” SS + “Residual” SS

$R^2 = 1 - \dfrac{RSS}{TSS} = \dfrac{ESS}{TSS} = \dfrac{\sum_{i=1}^{n} (\hat Y_i - \overline{\hat Y})^2}{\sum_{i=1}^{n} (Y_i - \bar Y)^2}$

• $R^2 = 0$ means ESS = 0; $R^2 = 1$ means ESS = TSS.

• $0 \le R^2 \le 1$
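A minimal numpy sketch of these fit measures; the data and fitted line are simulated for illustration, not the California school data.

```python
# Sketch: RMSE and R^2 for a one-variable OLS fit on simulated data.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(10.0, 2.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)   # illustrative true line

beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()
resid = y - (beta0_hat + beta1_hat * x)

rmse = np.sqrt(np.mean(resid ** 2))        # standard deviation of the residuals
rss = np.sum(resid ** 2)                   # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)          # total sum of squares
r2 = 1.0 - rss / tss                       # fraction of the variance of y explained
print(rmse, r2)
```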

Example of R²

$\widehat{TestScore} = 698.9 - 2.28 \, STR, \qquad R^2 = 0.05$

STR explains a small fraction of the variation in test scores (5%).
Does this mean that STR is not important?


Adjusted R²

Problem: adding a regressor never lowers the $R^2$ (and typically raises it), even when the regressor adds little real explanatory power, so the $R^2$ can inflate the apparent explanatory power of the model.

Adjusted $R^2$:

$\bar R^2 = 1 - \dfrac{n-1}{n-k-1} \cdot \dfrac{SSR}{TSS}$

where k is the number of regressors.

Adding a regressor lowers SSR but also lowers (n − k − 1), so the adjusted $R^2$ rises only when the improvement in fit outweighs the penalty for the extra regressor. A parsimonious model is preferred!
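A small sketch of this adjustment as a helper function; the variable names and the example values in the comment are illustrative and assume the quantities computed in the earlier fit sketch.

```python
# Sketch: adjusted R^2 from the residual and total sums of squares,
# the sample size n, and the number of regressors k (excluding the intercept).
def adjusted_r2(rss: float, tss: float, n: int, k: int) -> float:
    return 1.0 - (n - 1) / (n - k - 1) * (rss / tss)

# Example (hypothetical numbers): with n = 200 observations and k = 1 regressor,
# adjusted_r2(rss, tss, n=200, k=1) is slightly below the ordinary R^2.
```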


The Least Squares Properties

Under some assumptions, the OLS estimator yields estimates of $\beta_0$ and $\beta_1$ that are:

• Unbiased, i.e. the estimates are on average equal to the true value of $\beta$.

• Consistent, i.e. as the sample size increases, the estimates converge in probability to the true value of $\beta$.

• Efficient, i.e. the estimates have the smallest variance among all linear unbiased estimators.


The Least Squares Assumptions

$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n$

1. The conditional distribution of u given X has mean zero, that is, E(u|X = x) = 0.
(This holds, in particular, if X and u are independently distributed.)
• This implies that the estimated betas are unbiased.

2. $(X_i, Y_i)$, i = 1, …, n, are i.i.d.
• This is true if observations on X and Y are drawn by random sampling.

3. X and Y have finite fourth moments.
• This implies that large outliers in X and/or Y are rare, which is important since large outliers can result in meaningless values of the estimates.


Assumption 1: E(u|X = x) = 0

For any given value of X, the mean value of u is zero.

Example: $TestScore_i = \beta_0 + \beta_1 STR_i + u_i$

Question: is E(u|X = x) = 0 a plausible assumption?


Assumption 2: $(X_i, Y_i)$ are i.i.d.

Random variables are independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.

This assumption holds by definition when X and Y are sampled by simple random sampling.

However, in reality this assumption is often violated: non-i.i.d. sampling typically occurs when data are recorded over time (time series and/or panel data).

Assumption 3: $E(X^4) < \infty$ and $E(Y^4) < \infty$

The fourth moment is a measure of whether the distribution is tall and skinny or rather short and compact when compared to a normal distribution with the same variance. Bounded fourth moments mean that large outliers are rare.


The Sampling Distribution of $\hat\beta_1$

The OLS estimator is computed from a sample of data. Hence it is necessary to:

• Quantify the sampling uncertainty associated with $\hat\beta_1$.

• Construct a confidence interval for $\hat\beta_1$.

• Use $\hat\beta_1$ to test hypotheses such as $\beta_1 = 0$.


The Sampling Distribution of $\hat\beta_1$

Under the three OLS assumptions that we discussed, the exact (finite-sample) distribution of $\hat\beta_1$ is such that:

• $E(\hat\beta_1) = \beta_1$ (that is, $\hat\beta_1$ is unbiased).

• $\hat\beta_1 \xrightarrow{p} \beta_1$ (that is, $\hat\beta_1$ is consistent).

• $\mathrm{Var}(\hat\beta_1) = \dfrac{1}{n} \cdot \dfrac{\mathrm{var}[(X_i - \mu_X) u_i]}{\sigma_X^4}$

• When n is large, $\dfrac{\hat\beta_1 - E(\hat\beta_1)}{\sqrt{\mathrm{var}(\hat\beta_1)}} \sim N(0, 1)$.

Other than the first and second moments, the exact distribution of $\hat\beta_1$ is complicated and depends on the joint distribution of (X, u).


The larger the sample (large n), the smaller the variance of $\hat\beta_1$.
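A Monte Carlo sketch of this point: under assumed values for the true coefficients and error distribution, repeatedly re-estimating $\hat\beta_1$ on fresh samples of different sizes shows the estimates centred on the true slope, with a spread that shrinks as n grows.

```python
# Sketch: simulated sampling distribution of beta1_hat for two sample sizes.
# The true coefficients and distributions below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1 = 2.0, 0.5

def ols_slope(n: int) -> float:
    """Draw one sample of size n from the model and return the OLS slope."""
    x = rng.normal(10.0, 2.0, size=n)
    y = beta0 + beta1 * x + rng.normal(0.0, 1.0, size=n)
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

reps = 2000
small_n = np.array([ols_slope(25) for _ in range(reps)])
large_n = np.array([ols_slope(400) for _ in range(reps)])

# Both sets of estimates are centred near the true slope (unbiasedness), but the
# variance is far smaller for n = 400, roughly in proportion to 1/n.
print(small_n.mean(), large_n.mean())
print(small_n.var(), large_n.var())
```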


Summary
Linear regression model with one variable:
• OLS estimator.
• Measures of fit.
• Sampling distribution of OLS estimator.

Next:
• Linear regression model with many variables.


The Sampling Distribution of $\hat\beta_1$: Technical Appendix

$Y_i = \beta_0 + \beta_1 X_i + u_i$

$\bar Y = \beta_0 + \beta_1 \bar X + \bar u$

so $Y_i - \bar Y = \beta_1 (X_i - \bar X) + (u_i - \bar u)$

Thus,

$\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2} = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)\left[ \beta_1 (X_i - \bar X) + (u_i - \bar u) \right]}{\sum_{i=1}^{n} (X_i - \bar X)^2}$


$\hat\beta_1 = \beta_1 \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar X)}{\sum_{i=1}^{n} (X_i - \bar X)^2} + \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^{n} (X_i - \bar X)^2}$

so

$\hat\beta_1 - \beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^{n} (X_i - \bar X)^2}.$

Now

$\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u) = \sum_{i=1}^{n} (X_i - \bar X) u_i - \left[ \sum_{i=1}^{n} (X_i - \bar X) \right] \bar u$

$\qquad = \sum_{i=1}^{n} (X_i - \bar X) u_i - \left[ \sum_{i=1}^{n} X_i - n \bar X \right] \bar u = \sum_{i=1}^{n} (X_i - \bar X) u_i$


Substituting $\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u) = \sum_{i=1}^{n} (X_i - \bar X) u_i$ into the expression for $\hat\beta_1 - \beta_1$:

$\hat\beta_1 - \beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^{n} (X_i - \bar X)^2}$

so

$\hat\beta_1 - \beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2}$


Is $\hat\beta_1$ unbiased?

$E(\hat\beta_1) - \beta_1 = E\left[ \dfrac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} \right] = E\left[ E\left( \dfrac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} \,\middle|\, X_1, \dots, X_n \right) \right] = 0$

• If Assumption 1 holds, then $E(\hat\beta_1) = \beta_1$.

• Hence, $\hat\beta_1$ is an unbiased estimator of $\beta_1$.

Derivation of Var($\hat\beta_1$)

Let us write

$\hat\beta_1 - \beta_1 = \dfrac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{\frac{n-1}{n} s_X^2} = \dfrac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\frac{n-1}{n} s_X^2}$

where $v_i = (X_i - \bar X) u_i$. If n is large, $s_X^2 \approx \sigma_X^2$ and $\frac{n-1}{n} \approx 1$, so

$\hat\beta_1 - \beta_1 \approx \dfrac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\sigma_X^2}$


$\hat\beta_1 - \beta_1 \approx \dfrac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\sigma_X^2}$

so

$\mathrm{var}(\hat\beta_1 - \beta_1) = \mathrm{var}(\hat\beta_1) = \dfrac{\mathrm{var}(v)/n}{(\sigma_X^2)^2}$

so

$\mathrm{var}(\hat\beta_1) = \dfrac{1}{n} \cdot \dfrac{\mathrm{var}[(X_i - \mu_X) u_i]}{\sigma_X^4}.$

Hence:
• Var($\hat\beta_1$) is inversely proportional to n.


When n is large, $\frac{1}{n} \sum_{i=1}^{n} v_i$ is approximately distributed $N(0, \sigma_v^2 / n)$.

Hence,

$\hat\beta_1 \sim N\!\left( \beta_1, \; \dfrac{\sigma_v^2}{n \, \sigma_X^4} \right)$, where $v_i = (X_i - \mu_X) u_i$.

