Econ-607 - Unit2-W1-3
\[
y_i = \beta_1 + \beta_2 x_{i,2} + \beta_3 x_{i,3} + \cdots + \beta_k x_{i,k} + e_i; \quad i = 1, 2, \ldots, n
\]
\[
= 1\cdot\beta_1 + x_{i,2}\beta_2 + x_{i,3}\beta_3 + \cdots + x_{i,k}\beta_k + e_i
= \begin{pmatrix} 1 & x_{i,2} & \cdots & x_{i,k} \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix} + e_i
= x_i'\beta + e_i
\]
Stacking all n observations in a column vector gives:
\[
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
= \begin{pmatrix} x_1'\beta + e_1 \\ x_2'\beta + e_2 \\ \vdots \\ x_n'\beta + e_n \end{pmatrix}
= \begin{pmatrix} x_1'\beta \\ x_2'\beta \\ \vdots \\ x_n'\beta \end{pmatrix}
+ \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}
= \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix}\beta
+ \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}
= x\beta + e
\]
\[
\underset{(n\times 1)}{y} = \underset{(n\times k)}{x}\;\underset{(k\times 1)}{\beta} + \underset{(n\times 1)}{e} \qquad (\text{observables: } y, x)
\]
Minimizing \(S(\beta) = (y - x\beta)'(y - x\beta)\) with respect to \(\beta\):
\[
\frac{\partial S(\beta)}{\partial \beta} = -2x'y + 2x'x\beta,
\qquad
\left.\frac{\partial S(\beta)}{\partial \beta}\right|_{\beta = \hat\beta} = 0
\]
\[
\Rightarrow\; -2x'y + 2x'x\hat\beta = 0
\;\Rightarrow\; \hat\beta = (x'x)^{-1}x'y
\]
\(\hat\beta\) is a linear estimator.
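As a minimal numerical sketch of this formula (simulated data; the names and dimensions are illustrative, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # first column: constant
y = x @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# OLS: beta_hat = (x'x)^{-1} x'y (solving the normal equations beats an explicit inverse)
beta_hat = np.linalg.solve(x.T @ x, x.T @ y)
print(beta_hat)
```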
Assume that \(E(e\,|\,x) = 0\), and consider any linear estimator \(\tilde\beta = \left[B + (x'x)^{-1}x'\right]y\). Then
\[
E\left(\tilde\beta\,\middle|\,x\right) = \left[B + (x'x)^{-1}x'\right]x\beta = Bx\beta + \beta = \beta \;\;\forall\,\beta \iff Bx = 0
\]
So, if we assume
a) \(E(e\,|\,x) = 0\), then any other linear unbiased estimator has the property that \(Bx = 0\), and \(\hat\beta\) is unbiased since \(B = 0\) (hence \(Bx = 0\)) for \(\hat\beta\). If we restrict ourselves to the class of all linear unbiased estimators \(\tilde\beta\), we need an assumption on \(\operatorname{var}(e)\).
\[
\operatorname{var}\left(\tilde\beta\,\middle|\,x\right)
= E\left[\left(\tilde\beta - E(\tilde\beta\,|\,x)\right)\left(\tilde\beta - E(\tilde\beta\,|\,x)\right)'\,\middle|\,x\right]
= E\left[\left(B + (x'x)^{-1}x'\right)ee'\left(x(x'x)^{-1} + B'\right)\,\middle|\,x\right]
\]
\[
= \left(B + (x'x)^{-1}x'\right)E\left(ee'\,\middle|\,x\right)\left(x(x'x)^{-1} + B'\right)
= \sigma^2 BB' + \sigma^2(x'x)^{-1}, \quad \text{as } Bx = 0 \text{ and } x'B' = 0
\]
This exceeds \(\sigma^2(x'x)^{-1}\) by the positive semi-definite matrix \(\sigma^2 BB'\), which is minimized by choosing \(B = 0\), i.e., \(\tilde\beta = \hat\beta\).
If \(\tilde\beta\) is any other linear unbiased estimator of \(\beta\), then \(\operatorname{var}(\tilde\beta\,|\,x) - \operatorname{var}(\hat\beta\,|\,x)\) is a positive semi-definite matrix, so for any linear combination,
\[
\operatorname{var}\left(\lambda'\hat\beta\,\middle|\,x\right) \le \operatorname{var}\left(\lambda'\tilde\beta\,\middle|\,x\right):
\]
\[
\operatorname{var}\left(\lambda'\tilde\beta\,\middle|\,x\right) - \operatorname{var}\left(\lambda'\hat\beta\,\middle|\,x\right)
= \lambda'\operatorname{var}\left(\tilde\beta\,\middle|\,x\right)\lambda - \lambda'\operatorname{var}\left(\hat\beta\,\middle|\,x\right)\lambda
= \lambda'\left[\operatorname{var}\left(\tilde\beta\,\middle|\,x\right) - \operatorname{var}\left(\hat\beta\,\middle|\,x\right)\right]\lambda \ge 0
\]
N.B. \(\lambda'\hat\beta\) is an estimator of \(\lambda'\beta\).
Example
An example of a PD matrix: \(A = \begin{pmatrix} 4 & 1 & 0 \\ 1 & 5 & 1 \\ 0 & 1 & 1 \end{pmatrix}\).
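A quick numerical check of positive definiteness (a sketch: the Cholesky factorization exists if and only if the matrix is PD):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 1.0],
              [0.0, 1.0, 1.0]])

print(np.linalg.eigvalsh(A))   # all eigenvalues positive => A is PD
np.linalg.cholesky(A)          # would raise LinAlgError if A were not PD
```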
Decomposition of the sum of squares:
\[
y'y = y'Py + y'My
\]
\[
y'y - \tfrac{1}{n}y'ii'y = y'Py - \tfrac{1}{n}y'ii'y + y'My
\]
\[
y'\left(I_n - \tfrac{1}{n}ii'\right)y = y'Py - \tfrac{1}{n}y'Pii'Py + y'My
\]
where \(P = x(x'x)^{-1}x'\) and \(Pi = i\), as \(i = xJ\) where \(J = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\) with the \(0\) block of dimension \((k-1)\times 1\).
\[
y'\left(I_n - \tfrac{1}{n}ii'\right)y = y'P\left(I_n - \tfrac{1}{n}ii'\right)Py + y'My
\;\Rightarrow\; y'Ay = y'PAPy + y'My, \quad \text{where } A = I_n - \tfrac{1}{n}ii'.
\]
Assumptions implicit so far:
i). The model is correctly specified.
ii). \(\operatorname{rank}(x) = k\) (full column rank) and \(x\) is fixed (non-stochastic).
iii). \(E(e\,|\,x) = 0\).
iv). \(\operatorname{var}(e\,|\,x) = \sigma^2 I_n\).
Residuals:
\[
\hat e = y - x\hat\beta, \quad \text{where } \hat\beta = (x'x)^{-1}x'y
\]
\[
= y - x(x'x)^{-1}x'y
= \left(I_n - x(x'x)^{-1}x'\right)y
= (I_n - P)y = My, \quad \text{where } M = I_n - P
\]
\[
\Rightarrow\; \hat e = My = M(x\beta + e) = Me, \quad \text{as } Mx = 0
\]
\[
\Rightarrow\; E(\hat e\,|\,x) = E(My\,|\,x) = E(Me\,|\,x) = M\,E(e\,|\,x) = 0
\]
\[
\operatorname{var}(\hat e\,|\,x) = E\left(\hat e\hat e'\,\middle|\,x\right) = E\left(Mee'M'\,\middle|\,x\right) = M\,E\left(ee'\,\middle|\,x\right)M = \sigma^2 M
\]
\[
\operatorname{tr}(M) = \operatorname{tr}\left(I_n - x(x'x)^{-1}x'\right) = \operatorname{tr}(I_n) - \operatorname{tr}\left(x(x'x)^{-1}x'\right)
= n - \operatorname{tr}\left((x'x)^{-1}x'x\right) = n - \operatorname{tr}(I_k) = n - k = \operatorname{rank}(M)
\]
\[
E\left(\hat e'\hat e\,\middle|\,x\right) = E\left(e'Me\,\middle|\,x\right) = E\left(\operatorname{tr}(e'Me)\,\middle|\,x\right), \text{ as } e'Me \text{ is a scalar,}
\]
\[
= \operatorname{tr}\left(E\left(Mee'\,\middle|\,x\right)\right) = \operatorname{tr}\left(M\,E\left(ee'\,\middle|\,x\right)\right) = \operatorname{tr}\left(M\sigma^2 I_n\right) = \sigma^2\operatorname{tr}(M) = \sigma^2(n - k)
\]
\(\operatorname{var}(y\,|\,x) = \sigma^2 I_n\), \(\operatorname{var}(e\,|\,x) = \sigma^2 I_n\).
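A short sketch of these facts on simulated data (illustrative values): residuals via \(M\), \(\operatorname{tr}(M) = n - k\), and \(S^2 = \hat e'\hat e/(n-k)\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = x @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

P = x @ np.linalg.solve(x.T @ x, x.T)   # projection matrix P = x(x'x)^{-1}x'
M = np.eye(n) - P
e_hat = M @ y                           # residuals e_hat = My
print(np.trace(M), n - k)               # tr(M) = n - k
s2 = e_hat @ e_hat / (n - k)            # S^2, unbiased for sigma^2
print(s2)
```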
v). \(e\,|\,x \sim N(0, \sigma^2 I_n)\).
\[
\hat\beta = (x'x)^{-1}x'y = (x'x)^{-1}x'(x\beta + e) = \beta + (x'x)^{-1}x'e
\]
\[
\Rightarrow\; e\,|\,x \sim N(0, \sigma^2 I_n) \;\Rightarrow\; \hat\beta\,|\,x \sim N\left(\beta, \sigma^2(x'x)^{-1}\right)
\]
\[
\frac{1}{\sigma}\left(\hat\beta - \beta\right) = (x'x)^{-1}x'\left(\frac{1}{\sigma}e\right)
\quad \text{and} \quad
\frac{1}{\sigma^2}\hat e'\hat e = \frac{1}{\sigma^2}e'Me = \left(\frac{1}{\sigma}e\right)'M\left(\frac{1}{\sigma}e\right)
\]
are independent, as \((x'x)^{-1}x'M = 0\) and \(\frac{1}{\sigma}e \sim N(0, I_n)\).
\(\Rightarrow\; \hat\beta\) and \(S^2\) are independent.
3). Distribution of \(S^2\):
\[
\frac{(n-k)S^2}{\sigma^2} = \left(\frac{1}{\sigma}e\right)'M\left(\frac{1}{\sigma}e\right) \sim \chi^2(n-k),
\quad \text{i.e., } \left(\frac{1}{\sigma}e\right)'M\left(\frac{1}{\sigma}e\right) \sim \chi^2\left(\operatorname{rank}(M)\right).
\]
\[
H_0: R\beta = r
\]
\[
\hat\beta\,|\,x \sim N\left(\beta, \sigma^2(x'x)^{-1}\right)
\;\Rightarrow\; R\hat\beta\,|\,x \sim N\left(R\beta, \sigma^2 R(x'x)^{-1}R'\right)
\]
and so, under \(H_0: R\beta = r\),
\[
R\hat\beta - r\,\big|\,x \sim N\left(0, \sigma^2 R(x'x)^{-1}R'\right)
\]
A useful result: if \((x - \mu) \sim N(0, V)\) with \(V\) positive definite, write \(V = PP'\). Then
\[
(x - \mu)'V^{-1}(x - \mu) = (x - \mu)'(P')^{-1}P^{-1}(x - \mu) = y'y,
\quad \text{where } y = P^{-1}(x - \mu).
\]
\[
y = P^{-1}(x - \mu) \sim N\left(0, P^{-1}V(P')^{-1}\right) = N(0, I_J)
\;\Rightarrow\; (x - \mu)'V^{-1}(x - \mu) = y'y \sim \chi^2(J)
\]
Returning to \(R\hat\beta - r\,\big|\,x \sim N\left(0, \sigma^2 R(x'x)^{-1}R'\right)\):
\[
\Rightarrow\; \left(R\hat\beta - r\right)'\left[\sigma^2 R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right) \sim \chi^2(q), \text{ under } H_0
\]
\[
\Rightarrow\; \frac{1}{\sigma^2}\left(R\hat\beta - r\right)'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right) \sim \chi^2(q) \tag{2.1.1}
\]
Here we are using the fact that if \((x'x)^{-1}\) is positive definite and \(\operatorname{rank}(R) = q\), then \(R(x'x)^{-1}R'\) is positive definite and thus invertible.
We cannot use (2.1.1) directly for a test because \(\sigma^2\) is unknown; but if we divide by the degrees of freedom \(q\), we have a \(\chi^2(q)/q\) variate, independent of \(\left[(n-k)S^2/\sigma^2\right]/(n-k) \sim \chi^2(n-k)/(n-k)\). The ratio is free of \(\sigma^2\):
\[
F = \frac{\left(R\hat\beta - r\right)'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right)/q}{S^2} \sim F(q, n-k) \text{ under } H_0.
\]
Special case: \(H_0: \beta_j = 0\); \(q = 1\); \(R = \begin{pmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{pmatrix} = e_j'\), with the 1 in the \(j\)th position.
\[
R(x'x)^{-1}R' = e_j'(x'x)^{-1}e_j = \left(x'x\right)^{-1}_{jj}, \text{ the } j\text{th principal diagonal element of } (x'x)^{-1}.
\]
\[
\frac{\left[R(x'x)^{-1}R'\right]^{-1}}{qS^2} = \frac{1}{S^2\left(x'x\right)^{-1}_{jj}} = \frac{1}{\left[\widehat{\operatorname{s.e.}}\left(\hat\beta_j\right)\right]^2}, \quad \text{s.e. = standard error.}
\]
So, we can test \(H_0: \beta_j = 0\) by calculating
\[
\frac{\hat\beta_j}{\widehat{\operatorname{s.e.}}\left(\hat\beta_j\right)} \sim t(n-k) \text{ under } H_0.
\]
If the alternative is \(H_A: \beta_j \ne 0\), we use a two-tailed test, with the \(2\frac{1}{2}\%\) point of the \(t\) statistic being approximately 2 (or see a table).
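A sketch of the t-ratio computation on simulated data (illustrative; the true third coefficient is 0 here, so its t-ratio should be small):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = x @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

xtx_inv = np.linalg.inv(x.T @ x)
beta_hat = xtx_inv @ x.T @ y
e_hat = y - x @ beta_hat
s2 = e_hat @ e_hat / (n - k)
se = np.sqrt(s2 * np.diag(xtx_inv))   # standard errors from S^2 (x'x)^{-1}_jj
t_stats = beta_hat / se               # each ~ t(n-k) under H0: beta_j = 0
print(t_stats)
```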
Standard regression output might look like the following table:
[Table: Equation for US investment, 1968-82 — columns: Variable, Coefficient, Standard error, t-ratio, p-value; the entries are not reproduced here.]
Joint test of a subset of coefficients, \(H_0: \beta_2 = 0\), where \(\beta_2\) contains \(J\) coefficients, so that \(R = \begin{pmatrix} 0 & I_J \end{pmatrix}\) and \(r = 0\). Then
\[
R(x'x)^{-1}R' = V_{22}, \quad R\hat\beta - r = \hat\beta_2 \quad \text{if } \hat\beta = \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix},
\]
where \(V_{22}\) is the corresponding block of \((x'x)^{-1}\). So,
\[
F = \frac{\hat\beta_2' V_{22}^{-1}\hat\beta_2}{JS^2} \sim F(J, n-k) \text{ under } H_0: \beta_2 = 0.
\]
To make progress with this, we need to be able to partition the inverse of \((x'x)\). Alternatively,
iii). The hypothesis \(R\beta = r\) can also be tested using:
\[
F = \frac{\left(\tilde e'\tilde e - \hat e'\hat e\right)/J}{\hat e'\hat e/(n-k)} = \frac{(RRSS - URSS)/J}{URSS/(n-k)} \sim F(J, n-k),
\quad \text{where } S^2 = \frac{\hat e'\hat e}{n-k}.
\]
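A sketch of the RRSS/URSS form of the F test, assuming the restricted model simply drops the last J regressors (an illustrative special case of \(R\beta = r\)):

```python
import numpy as np

def rss(x, y):
    """Residual sum of squares from regressing y on x."""
    beta_hat, *_ = np.linalg.lstsq(x, y, rcond=None)
    e = y - x @ beta_hat
    return e @ e

rng = np.random.default_rng(3)
n, k, J = 100, 3, 2
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = x @ np.array([1.0, 0.0, 0.0]) + rng.normal(size=n)   # H0 is true here

urss = rss(x, y)           # unrestricted: all k regressors
rrss = rss(x[:, :1], y)    # restricted: constant only (J slopes set to zero)
F = ((rrss - urss) / J) / (urss / (n - k))
print(F)                   # ~ F(J, n-k) under H0
```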
R² and R̄²
\[
R^2 = 1 - \frac{\hat e'\hat e}{\sum_{i=1}^n (y_i - \bar y)^2} = 1 - \frac{\hat e'\hat e}{y'Ay},
\quad \text{where } A = I_n - \tfrac{1}{n}ii', \; i = \begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix}', \; \hat y = x\hat\beta.
\]
\[
\bar R^2 = 1 - \frac{\hat e'\hat e/(n-k)}{y'Ay/(n-1)}
\]
\(\bar R^2\) attempts to avoid the automatic increase in \(R^2\) as regressors are added; adding variables increases \(\bar R^2\) only if \(F(\beta_2 = 0) > 1\).
Note that \(\sum_{i=1}^n (y_i - \bar y)^2\) is the RRSS obtained under the restriction that all slope coefficients are zero (regression on the constant alone), so
\[
F = \frac{(RRSS - URSS)/(k-1)}{URSS/(n-k)} \sim F(k-1, n-k).
\]
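A sketch of \(R^2\) and \(\bar R^2\) computed from the same ingredients (illustrative simulated data):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 100, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = x @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(x, y, rcond=None)
e_hat = y - x @ beta_hat
tss = np.sum((y - y.mean()) ** 2)   # y'Ay, total sum of squares about the mean
r2 = 1 - (e_hat @ e_hat) / tss
r2_adj = 1 - (e_hat @ e_hat / (n - k)) / (tss / (n - 1))
print(r2, r2_adj)
```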
Restricted least squares: minimize
\[
S(\beta) = (y - x\beta)'(y - x\beta) \quad \text{subject to } R\beta = r.
\]
The Lagrangian is given by:
\[
L(\beta, \lambda) = (y - x\beta)'(y - x\beta) + 2\lambda'(r - R\beta)
\]
\[
\left.\frac{\partial L(\beta, \lambda)}{\partial\beta}\right|_{\tilde\beta, \tilde\lambda} = -2x'y + 2x'x\tilde\beta - 2R'\tilde\lambda = 0 \tag{2.2.1}
\]
\[
\left.\frac{\partial L(\beta, \lambda)}{\partial\lambda}\right|_{\tilde\beta, \tilde\lambda} = 2\left(r - R\tilde\beta\right) = 0 \tag{2.2.2}
\]
From (2.2.1), \(\tilde\beta = \hat\beta + (x'x)^{-1}R'\tilde\lambda\); premultiplying by \(R\) and using (2.2.2),
\[
r = R\hat\beta + R(x'x)^{-1}R'\tilde\lambda
\;\Rightarrow\; \tilde\lambda = \left[R(x'x)^{-1}R'\right]^{-1}\left(r - R\hat\beta\right)
\]
\[
\Rightarrow\; \tilde\beta = \hat\beta + (x'x)^{-1}R'\left[R(x'x)^{-1}R'\right]^{-1}\left(r - R\hat\beta\right)
\]
Note that if \(R\hat\beta = r\), then \(\tilde\beta = \hat\beta\). One can find \(E(\tilde\beta)\) and \(\operatorname{var}(\tilde\beta)\), and establish that \(\operatorname{var}(\hat\beta) - \operatorname{var}(\tilde\beta)\) is psd; but \(\tilde\beta\) is biased unless \(R\beta = r\).
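A sketch of the restricted estimator formula (the restriction used here is hypothetical, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 100, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = x @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

R = np.array([[0.0, 1.0, 1.0]])   # illustrative restriction: beta_2 + beta_3 = 1.5
r = np.array([1.5])

xtx_inv = np.linalg.inv(x.T @ x)
beta_hat = xtx_inv @ x.T @ y
# beta_tilde = beta_hat + (x'x)^{-1}R'[R(x'x)^{-1}R']^{-1}(r - R beta_hat)
middle = np.linalg.inv(R @ xtx_inv @ R.T)
beta_tilde = beta_hat + xtx_inv @ R.T @ middle @ (r - R @ beta_hat)
print(beta_tilde, R @ beta_tilde)   # R beta_tilde = r holds exactly
```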
One can compare \(MSE(\tilde\beta)\) and \(MSE(\hat\beta)\):
\[
MSE(\tilde\beta) = E\left[\left(\tilde\beta - \beta\right)\left(\tilde\beta - \beta\right)'\right]
= E\left\{\left[\tilde\beta - E(\tilde\beta) + E(\tilde\beta) - \beta\right]\left[\tilde\beta - E(\tilde\beta) + E(\tilde\beta) - \beta\right]'\right\}
\]
\[
= \operatorname{var}(\tilde\beta) + \left[E(\tilde\beta) - \beta\right]\left[E(\tilde\beta) - \beta\right]'
\]
A condition for \(MSE(\tilde\beta) - MSE(\hat\beta)\) to be nsd is
\[
\lambda = (R\beta - r)'\left[R(x'x)^{-1}R'\right]^{-1}(R\beta - r)/\sigma^2 < 1.
\]
\[
\tilde e = y - x\hat\beta + x\hat\beta - x\tilde\beta = \hat e + x\left(\hat\beta - \tilde\beta\right)
\]
\[
\Rightarrow\; \tilde e'\tilde e = \hat e'\hat e + \left(\hat\beta - \tilde\beta\right)'x'x\left(\hat\beta - \tilde\beta\right), \quad \text{as } x'\hat e = 0
\]
But
\[
\hat\beta - \tilde\beta = -(x'x)^{-1}R'\left[R(x'x)^{-1}R'\right]^{-1}\left(r - R\hat\beta\right)
\]
\[
\Rightarrow\; \tilde e'\tilde e - \hat e'\hat e = \left(r - R\hat\beta\right)'\left[R(x'x)^{-1}R'\right]^{-1}\left(r - R\hat\beta\right)
\]
Relaxing the assumptions:
i). \(x\) may be stochastic.
ii). \(e\) may be non-normal, and may be correlated.

Convergence in Probability:
By Chebyshev's inequality,
\[
P\left\{\mu - \varepsilon < \bar X_n < \mu + \varepsilon\right\}
= P\left\{\left|\bar X_n - \mu\right| < \varepsilon\right\} \ge 1 - \frac{\sigma^2}{n\varepsilon^2}
\]
So for any \(\delta > 0\) there is an \(n_0\) such that for all \(n > n_0\),
\[
P\left\{\left|\bar X_n - \mu\right| < \varepsilon\right\} > 1 - \delta,
\quad \text{i.e., } \lim_{n\to\infty} P\left\{\left|\bar X_n - \mu\right| < \varepsilon\right\} = 1.
\]
We write \(\operatorname{plim}\bar X_n = \mu\).
We say that \(\bar X_n\) is a consistent estimator of \(\mu\).
Consider an alternative estimator of \(\mu\), denoted \(m_n\), such that \(E(m_n) = \mu + \frac{c}{n}\), where \(c\) is a constant. This is obviously biased in small samples, but \(\lim_{n\to\infty} E(m_n) = \mu\), i.e., \(m_n\) is asymptotically unbiased.
Chebyshev's Theorem:
\[
P\left\{\left|m_n - \left(\mu + \tfrac{c}{n}\right)\right| \ge \lambda\sqrt{\operatorname{var}(m_n)}\right\} \le \frac{1}{\lambda^2}
\]
Letting \(\varepsilon = \lambda\sqrt{\operatorname{var}(m_n)}\), we have
\[
P\left\{\left|m_n - \left(\mu + \tfrac{c}{n}\right)\right| \ge \varepsilon\right\} \le \frac{\operatorname{var}(m_n)}{\varepsilon^2}
\;\Rightarrow\; \lim_{n\to\infty} P\left\{\left|m_n - \left(\mu + \tfrac{c}{n}\right)\right| \ge \varepsilon\right\} = 0, \text{ provided } \operatorname{var}(m_n) \to 0.
\]
Sufficient conditions for consistency:
i). Asymptotic unbiasedness.
ii). Variance \(\to 0\) as \(n \to \infty\).
Properties of plim:
\[
\operatorname{plim} X^2 = \left(\operatorname{plim} X\right)^2, \qquad \operatorname{plim} XY = \left(\operatorname{plim} X\right)\left(\operatorname{plim} Y\right)
\]
Example
Consistency of the least squares estimator \(\hat\beta\):
\[
\hat\beta = (x'x)^{-1}x'y = \beta + (x'x)^{-1}x'e = \beta + \left(x'x/n\right)^{-1}\left(x'e/n\right) = \beta + A^{-1}B,
\]
where \(A = \dfrac{x'x}{n}\), \(B = \dfrac{x'e}{n}\), and
\[
\operatorname{plim}\frac{x'x}{n} = \Sigma \quad \text{(positive definite)}.
\]
Example
If \(x\) contains the constant term:
\[
\operatorname{plim}\frac{x'e}{n} =
\begin{pmatrix}
\operatorname{plim}\sum e_i/n \\
\operatorname{plim}\sum x_{i,2}e_i/n \\
\vdots \\
\operatorname{plim}\sum x_{i,k}e_i/n
\end{pmatrix},
\quad \text{i.e., } \operatorname{plim}\frac{x'e}{n} = 0
\]
\[
\Rightarrow\; \operatorname{plim}\hat\beta = \beta + \operatorname{plim}\left(A^{-1}\right)\left(\operatorname{plim} B\right)
= \beta + \left(\operatorname{plim} A\right)^{-1}\left(\operatorname{plim} B\right) = \beta + \Sigma^{-1}\cdot 0 = \beta
\]
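A small Monte Carlo sketch of this consistency result (simulated data; the t-distributed errors are an illustrative choice showing that normality is not needed):

```python
import numpy as np

rng = np.random.default_rng(6)
beta = np.array([1.0, 2.0])

for n in (50, 500, 5000, 50000):
    x = np.column_stack([np.ones(n), rng.normal(size=n)])
    e = rng.standard_t(df=5, size=n)    # non-normal errors
    y = x @ beta + e
    beta_hat = np.linalg.solve(x.T @ x, x.T @ y)
    print(n, beta_hat)                  # beta_hat approaches (1, 2) as n grows
```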
Convergence in distribution
Suppose \(X \sim N(\mu, \sigma^2)\); then \(\bar X_n \sim N(\mu, \sigma^2/n)\). But as \(n \to \infty\), the distribution of \(\bar X_n\) is degenerate, i.e., it collapses around the point \(\mu\). However, consider
\[
Z_n = \sqrt{n}\left(\bar X_n - \mu\right) \;\Rightarrow\; E(Z_n) = 0, \quad \operatorname{var}(Z_n) = \sigma^2.
\]
Thus the density function (distribution) of \(Z_n\) is \(N(0, \sigma^2)\), which is independent of \(n\), and hence the finite sample distributions are the same as the limiting distribution.
Often, small sample distributions cannot be derived or are difficult to calculate, but limiting distributions based on standardized variates such as \(Z_n\) are available.
Convergence in probability gives consistency: \(\operatorname{plim}\hat\theta = \theta\).

Definition
The likelihood of an observation of \(X\) is the value of the density function at \(x\), \(f(x, \theta)\), where \(f\) depends on \(\theta\) (parameter).
Clarification: the density \(f(x, \theta)\) is regarded as a function of \(x\) for a given \(\theta\); the likelihood regards \(f(x, \theta)\) as a function of \(\theta\) for the observed \(x\).
\[
\text{Model}: \quad y = x\beta + u \tag{2.4.1}
\]
By the change-of-variables formula,
\[
f_Y(y) = f_U(u(y))\left|\frac{\partial u}{\partial y'}\right| = f_U(u(y))\,|I_n| = f_U(u(y)).
\]
Assuming \(u \sim N(0, \sigma^2 I_n)\), we have
\[
f_U(u) = \left(2\pi\sigma^2\right)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}u'u\right) \tag{2.4.2}
\]
\[
\Rightarrow\; f_Y(y) = \left(2\pi\sigma^2\right)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}(y - x\beta)'(y - x\beta)\right)
\]
\[
\ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}(y - x\beta)'(y - x\beta)
\]
\[
\left.
\begin{aligned}
\left.\frac{\partial\ln L}{\partial\beta}\right|_{\hat\beta, \hat\sigma^2}
&= -\frac{1}{2\hat\sigma^2}\left(-2x'y + 2x'x\hat\beta\right)
= \frac{1}{\hat\sigma^2}\left(x'y - x'x\hat\beta\right) = 0 \\
\left.\frac{\partial\ln L}{\partial\sigma^2}\right|_{\hat\beta, \hat\sigma^2}
&= -\frac{n}{2\hat\sigma^2} + \frac{1}{2\hat\sigma^4}\left(y - x\hat\beta\right)'\left(y - x\hat\beta\right) = 0
\end{aligned}
\right\} \tag{2.4.3}
\]
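A numerical check of (2.4.3) as a sketch (using scipy's general-purpose optimizer as a convenience, not something from the notes): maximizing \(\ln L\) reproduces the OLS coefficients, with \(\hat\sigma^2_{ML} = \hat e'\hat e/n\) (divisor \(n\), not \(n-k\)).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
n, k = 200, 2
x = np.column_stack([np.ones(n), rng.normal(size=n)])
y = x @ np.array([1.0, 2.0]) + rng.normal(size=n)

def neg_log_lik(theta):
    beta, log_s2 = theta[:k], theta[k]       # sigma^2 = exp(log_s2) keeps it positive
    u = y - x @ beta
    return 0.5 * n * (np.log(2 * np.pi) + log_s2) + 0.5 * (u @ u) / np.exp(log_s2)

res = minimize(neg_log_lik, x0=np.zeros(k + 1))
beta_ols = np.linalg.solve(x.T @ x, x.T @ y)
e_hat = y - x @ beta_ols
print(res.x[:k], beta_ols)                   # ML and OLS beta coincide
print(np.exp(res.x[k]), e_hat @ e_hat / n)   # sigma^2_ML = e'e/n
```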
Asymptotic distribution of the ML estimator:
\[
\sqrt{n}\left(\hat\theta_n - \theta_0\right) \xrightarrow{d} N(0, V),
\quad \text{where } V = \left[\lim_{n\to\infty}\left(-E\left(\frac{1}{n}\frac{\partial^2\ln L}{\partial\theta\,\partial\theta'}\right)\right)\right]^{-1}
\]
Theorem
\(\operatorname{var}(\hat\theta_n) - I(\theta_0)^{-1}\) is positive semi-definite, where
\[
I(\theta_0) = E\left(\frac{\partial\ln L}{\partial\theta}\frac{\partial\ln L}{\partial\theta'}\right) = -E\left(\frac{\partial^2\ln L}{\partial\theta\,\partial\theta'}\right).
\]
If \(\operatorname{var}(\hat\theta_n) = I(\theta_0)^{-1}\), then \(\hat\theta_n\) is said to be efficient.
N.B. \(\operatorname{var}(\hat\theta_n) - I(\theta_0)^{-1} \ge 0\) in the sense that the difference is a psd \((k\times k)\) matrix.
Now \(\hat\beta_{ML}\) is unbiased but \(\hat\sigma^2_{ML}\) is biased. So \(\begin{pmatrix} \hat\beta_{ML} \\ \hat\sigma^2_{ML} \end{pmatrix}\) is a biased estimator of \(\begin{pmatrix} \beta \\ \sigma^2 \end{pmatrix}\).
Thus, \(\operatorname{var}(\hat\beta) = \operatorname{var}(\hat\beta_{ML}) = \sigma^2(x'x)^{-1} = MVB\). So \(\hat\beta\) and \(\hat\beta_{ML}\) are both efficient estimators of \(\beta\).
Since \(\hat\sigma^2_{ML}\) is a biased estimator of \(\sigma^2\), the theorem is not really relevant or applicable to it.
But \(\operatorname{var}(S^2) = \dfrac{2\sigma^4}{n-k} > \dfrac{2\sigma^4}{n}\) for finite \(n\), so \(S^2\) is not an efficient estimator of \(\sigma^2\). In fact, there is no unbiased estimator of \(\sigma^2\) that attains the MVB.
Hypothesis Testing: the Likelihood Ratio Principle

Example
\[
y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + e_i
\]
\[
L(\theta) = \prod_{i=1}^n f(X_i\,|\,\theta); \quad \ell(\theta) = \ln L(\theta) = \sum_{i=1}^n \ln f(X_i\,|\,\theta)
\]
With \(e \sim N(0, \sigma^2 I_n)\),
\[
\ell(\theta) = \ell(\beta, \sigma^2) = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}(y - x\beta)'(y - x\beta)
\]
So,
\[
S(\theta) = \frac{\partial\ell(\theta)}{\partial\theta}
= \begin{pmatrix} \dfrac{\partial\ell(\theta)}{\partial\beta} \\[2mm] \dfrac{\partial\ell(\theta)}{\partial\sigma^2} \end{pmatrix}
= \begin{pmatrix} \dfrac{1}{\sigma^2}\left(x'y - x'x\beta\right) \\[2mm] -\dfrac{n}{2\sigma^2} + \dfrac{1}{2\sigma^4}(y - x\beta)'(y - x\beta) \end{pmatrix}
\]
\[
\Rightarrow\; E(H) = \begin{pmatrix} -\dfrac{1}{\sigma^2}x'x & 0 \\ 0' & \dfrac{n}{2\sigma^4} - \dfrac{n\sigma^2}{\sigma^6} \end{pmatrix}
= \begin{pmatrix} -\dfrac{1}{\sigma^2}x'x & 0 \\ 0' & -\dfrac{n}{2\sigma^4} \end{pmatrix}
\]
If \(\frac{1}{n}x'x \xrightarrow{p} Q\), a positive definite matrix, then
\[
\lim_{n\to\infty}\left[\frac{1}{n}I_n(\theta)\right]^{-1}
= \begin{pmatrix} \sigma^2\left(\lim\limits_{n\to\infty}\frac{1}{n}x'x\right)^{-1} & 0 \\ 0' & 2\sigma^4 \end{pmatrix}
\]
If \(H_0: \underset{(J\times 1)}{C(\theta)} = 0\), with \(\theta\) \((k\times 1)\), and \(\tilde L_R\) is the likelihood maximized subject to the restriction while \(\hat L_U\) is the unrestricted maximum, then
\[
LR = 2\ln\lambda = 2\left(\ln\hat L_U - \ln\tilde L_R\right) = 2\cdot\frac{n}{2}\ln\frac{\tilde\sigma^2}{\hat\sigma^2},
\quad \text{where } \lambda = \left(\frac{\hat\sigma^2}{\tilde\sigma^2}\right)^{-n/2} = \left(\frac{\hat e'\hat e}{\tilde e'\tilde e}\right)^{-n/2}
\]
\[
\lambda^{2/n} - 1 = \frac{\tilde e'\tilde e}{\hat e'\hat e} - 1 = \frac{\tilde e'\tilde e - \hat e'\hat e}{\hat e'\hat e}
\;\Rightarrow\; \frac{n-k}{J}\left(\lambda^{2/n} - 1\right) = \frac{\left(\tilde e'\tilde e - \hat e'\hat e\right)/J}{\hat e'\hat e/(n-k)} = F \sim F(J, n-k)
\]
\[
\text{i.e., } \lambda^{2/n} = 1 + \frac{JF}{n-k}
\;\Rightarrow\; \frac{2}{n}\ln\lambda = \ln\left(1 + \frac{JF}{n-k}\right)
\;\Rightarrow\; LR = 2\ln\lambda = n\ln\left(1 + \frac{JF}{n-k}\right)
\]
\[
W = C(\hat\theta)'\left[\widehat{\operatorname{var}}\left(C(\hat\theta)\right)\right]^{-1}C(\hat\theta) \xrightarrow{d} \chi^2(J) \text{ under } H_0
\]
\[
\widehat{\operatorname{var}}(\hat\theta) = \frac{1}{n}\left[\lim_{n\to\infty}\frac{1}{n}I_n(\theta)\right]^{-1}
= \frac{1}{n}\begin{pmatrix} \sigma^2\left(\lim\limits_{n\to\infty}\frac{1}{n}x'x\right)^{-1} & 0 \\ 0' & 2\sigma^4 \end{pmatrix}
\]
and we use
\[
\begin{pmatrix} \sigma^2(x'x)^{-1} & 0 \\ 0' & 2\sigma^4/n \end{pmatrix}.
\]
For the linear hypothesis \(C(\theta) = R\beta - r\),
\[
\Rightarrow\; W = \left(R\hat\beta - r\right)'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right)/\hat\sigma^2
\]
and, since \(W = \frac{n}{n-k}JF\) with \(n/(n-k) \to 1\) as \(n \to \infty\), \(W \xrightarrow{d} \chi^2(J)\) under \(H_0\). Moreover,
\[
LR = n\ln\left(1 + \frac{JF}{n-k}\right) \le \frac{nJF}{n-k} = W,
\]
so \(LR < W\), but \(LR \to W\) as \(n \to \infty\).
The Lagrange Multiplier (score) test: maximize \(\ell(\theta)\) subject to \(C(\theta) = 0\), with Lagrangian \(L(\theta, \lambda) = \ell(\theta) - \lambda'C(\theta)\):
\[
\left.\frac{\partial L}{\partial\theta}\right|_{\tilde\theta, \tilde\lambda}
= \left.\frac{\partial\ell(\theta)}{\partial\theta}\right|_{\tilde\theta, \tilde\lambda} - c'\tilde\lambda = 0,
\quad \text{where } c = \frac{\partial C(\theta)}{\partial\theta'}
\]
The F.O.C. imply
\[
c'(\tilde\theta)\tilde\lambda = S(\tilde\theta), \quad \text{where } S(\tilde\theta) = \frac{\partial\ell(\tilde\theta)}{\partial\theta} \text{ is the score vector.}
\]
If \(\tilde\theta\) is close to \(\hat\theta\), we expect \(S(\tilde\theta) \approx S(\hat\theta) = 0\), so we can base a test on
\[
LM = S(\tilde\theta)'\left[\widehat{\operatorname{var}}\left(S(\tilde\theta)\right)\right]^{-1}S(\tilde\theta) \xrightarrow{d} \chi^2(J) \text{ under } H_0.
\]
Theorem
In general,
i). As \(E(S(\theta)) = 0\), \(E\left(S(\tilde\theta)\right) = 0\) under \(H_0\).
ii). As \(E(S(\theta)) = 0\), \(\operatorname{var}(S(\theta)) = E\left[S(\theta)S(\theta)'\right] = I_n(\theta)\).
So,
\[
LM = \tilde\lambda'c(\tilde\theta)\left[I_n(\tilde\theta)\right]^{-1}c(\tilde\theta)'\tilde\lambda
= \frac{1}{n}S(\tilde\theta)'\left[\lim_{n\to\infty}\frac{1}{n}I_n(\tilde\theta)\right]^{-1}S(\tilde\theta)
\]
But \(\tilde e = y - x\tilde\beta = y - x\hat\beta + x\left(\hat\beta - \tilde\beta\right) = \hat e + x\left(\hat\beta - \tilde\beta\right)\)
\[
\Rightarrow\; x'\tilde e = x'\hat e + x'x\left(\hat\beta - \tilde\beta\right) = x'x\left(\hat\beta - \tilde\beta\right), \quad \text{as } x'\hat e = 0
\]
\[
\Rightarrow\; x'x\left(\hat\beta - \tilde\beta\right) = R'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right)
\]
\[
\Rightarrow\; x'\tilde e = R'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right) \tag{2.5.2}
\]
\[
LM = \frac{1}{n}S(\tilde\theta)'\left[\lim_{n\to\infty}\frac{1}{n}I_n(\tilde\theta)\right]^{-1}S(\tilde\theta),
\quad \text{with } S(\tilde\theta) = \begin{pmatrix} \frac{1}{\tilde\sigma^2}x'\tilde e \\ 0 \end{pmatrix}
\text{ and } \lim_{n\to\infty}\frac{1}{n}I_n(\tilde\theta) = \begin{pmatrix} \frac{1}{\tilde\sigma^2}\frac{x'x}{n} & 0 \\ 0' & \frac{1}{2\tilde\sigma^4} \end{pmatrix}
\]
Substituting (2.5.2) for \(x'\tilde e\),
\[
LM = \frac{1}{n}
\begin{pmatrix} \frac{1}{\tilde\sigma^2}\left(R\hat\beta - r\right)'\left[R(x'x)^{-1}R'\right]^{-1}R & 0 \end{pmatrix}
\begin{pmatrix} \tilde\sigma^2\left(\frac{x'x}{n}\right)^{-1} & 0 \\ 0' & 2\tilde\sigma^4 \end{pmatrix}
\begin{pmatrix} \frac{1}{\tilde\sigma^2}R'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right) \\ 0 \end{pmatrix}
\]
\[
\Rightarrow\; LM = \left(R\hat\beta - r\right)'\left[R(x'x)^{-1}R'\right]^{-1}\left(R\hat\beta - r\right)/\tilde\sigma^2 \tag{2.5.4}
\]
Now, as \(\hat\sigma^2 < \tilde\sigma^2\), \(LM < W\). Moreover,
\[
LM = \frac{\tilde e'\tilde e - \hat e'\hat e}{\tilde e'\tilde e/n} = n\left(1 - \frac{\hat e'\hat e}{\tilde e'\tilde e}\right)
\]
\[
LR = -2\ln\left(\frac{\hat e'\hat e}{\tilde e'\tilde e}\right)^{n/2} = n\ln\frac{\tilde e'\tilde e}{\hat e'\hat e}
\;\Rightarrow\; e^{LR/n} = \frac{\tilde e'\tilde e}{\hat e'\hat e}
\]
Therefore,
\[
LM = n\left(1 - e^{-LR/n}\right)
\;\Rightarrow\; e^{-LR/n} = 1 - \frac{LM}{n}
\;\Rightarrow\; \frac{LR}{n} = -\ln\left(1 - \frac{LM}{n}\right) > \frac{LM}{n}
\;\Rightarrow\; LR > LM
\]
Moreover, as \(n \to \infty\), \(LR \to LM\).
Therefore, for linear hypotheses in the linear regression model, we have: \(W \ge LR \ge LM\).
This ranking does not carry over to non-linear restrictions. However, if we have linear restrictions plus a quadratic likelihood, then it follows that \(W = LR = LM\). Note that in regression we have a likelihood quadratic in \(\beta\) but not in \(\sigma^2\), so equality holds if \(\sigma^2\) is known or as \(n \to \infty\).
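A sketch verifying the ordering \(W \ge LR \ge LM\) on simulated data with one hypothetical linear restriction:

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 100, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = x @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)
R, r = np.array([[0.0, 1.0, -1.0]]), np.array([2.0])   # H0: beta_2 - beta_3 = 2

xtx_inv = np.linalg.inv(x.T @ x)
beta_hat = xtx_inv @ x.T @ y
e_hat = y - x @ beta_hat
beta_tilde = beta_hat + xtx_inv @ R.T @ np.linalg.solve(R @ xtx_inv @ R.T,
                                                        r - R @ beta_hat)
e_tilde = y - x @ beta_tilde

s2_hat, s2_tilde = e_hat @ e_hat / n, e_tilde @ e_tilde / n   # ML variance estimates
W = n * (s2_tilde - s2_hat) / s2_hat
LR = n * np.log(s2_tilde / s2_hat)
LM = n * (s2_tilde - s2_hat) / s2_tilde
print(W, LR, LM)   # W >= LR >= LM
```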
\[
u_t = \rho u_{t-1} + e_t \tag{2.6.1}
\]
Define the lag operator \(L\) by \(Lx_t = x_{t-1}\), so that
\[
L^2 x_t = x_{t-2}, \;\ldots,\; L^j x_t = x_{t-j}, \quad L^0 x_t = x_t
\]
\[
\Rightarrow\; (1 - \rho L)u_t = e_t
\;\Rightarrow\; u_t = \frac{e_t}{1 - \rho L} = \left(1 + \rho L + \rho^2 L^2 + \cdots\right)e_t
\]
Or,
\[
u_t = e_t + \rho e_{t-1} + \rho^2 e_{t-2} + \cdots \tag{2.6.2}
\]
\[
E(u_t u_{t-1}) = \rho\sigma^2, \quad E(u_t u_{t-2}) = \rho^2\sigma^2, \;\ldots,\; E(u_t u_{t-j}) = \rho^j\sigma^2
\]
\[
\text{i.e., } E(uu') = \sigma^2
\begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\
\rho & 1 & \rho & \cdots & \rho^{T-2} \\
\rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1
\end{pmatrix} \tag{2.6.4}
\]
\[
E(uu') = V \tag{2.6.5}
\]
More generally, assume
\[
E(u) = 0, \qquad E(uu') = \sigma^2\Omega.
\]
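A sketch that builds the AR(1) matrix in (2.6.4) up to the \(\sigma^2\) factor (\(\Omega\) with typical element \(\rho^{|t-s|}\)):

```python
import numpy as np

def ar1_omega(T, rho):
    """Correlation matrix of an AR(1) process: Omega[t, s] = rho^|t-s|."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :])

print(ar1_omega(5, 0.8))
```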
\[
\hat\beta = (x'x)^{-1}x'y = \beta + (x'x)^{-1}x'u
\]
Thus \(E(\hat\beta) = \beta\), since \(E(u) = 0\), i.e., \(\hat\beta\) is still unbiased. But
\[
\operatorname{var}(\hat\beta) = E\left[\left(\hat\beta - E(\hat\beta)\right)\left(\hat\beta - E(\hat\beta)\right)'\right]
= E\left[(x'x)^{-1}x'uu'x(x'x)^{-1}\right]
\]
\[
= (x'x)^{-1}x'E(uu')x(x'x)^{-1}
= (x'x)^{-1}x'\left(\sigma^2\Omega\right)x(x'x)^{-1}
\]
\[
\Rightarrow\; \operatorname{var}(\hat\beta) = \sigma^2(x'x)^{-1}\left(x'\Omega x\right)(x'x)^{-1} \tag{2.6.6}
\]
Although \(\hat\beta\) is still unbiased, the usual expression for \(\operatorname{var}(\hat\beta)\) is no longer appropriate, so inferences based on it (t statistics, F statistics, etc.) will be misleading or incorrect.
Consider transforming the model:
\[
Wy = Wx\beta + Wu \tag{2.6.7}
\]
such that:
\[
E(Wu) = 0, \qquad E(Wuu'W') = \sigma^2 W\Omega W' \tag{2.6.8}
\]
If we could choose \(W\) such that \(W\Omega W' = I_T\), then we could apply OLS to (2.6.7).

Theorem
If \(\Omega\) is a symmetric positive definite matrix, then we can find a matrix \(P\) such that:
\[
\Omega = PP' \tag{2.6.9}
\]
This suggests \(P^{-1}\Omega(P')^{-1} = I_T\), and that we could take \(W = P^{-1}\), so that
\[
\Omega^{-1} = (P')^{-1}P^{-1} = W'W.
\]
The resulting (GLS) estimator is
\[
\hat\beta_G = (x'W'Wx)^{-1}(x'W'Wy) = \left(x'\Omega^{-1}x\right)^{-1}x'\Omega^{-1}y
\]
\[
\operatorname{var}(\hat\beta_G) = E\left[\left(\hat\beta_G - E(\hat\beta_G)\right)\left(\hat\beta_G - E(\hat\beta_G)\right)'\right]
= E\left[\left(x'\Omega^{-1}x\right)^{-1}x'\Omega^{-1}uu'\Omega^{-1}x\left(x'\Omega^{-1}x\right)^{-1}\right]
\]
\[
= \left(x'\Omega^{-1}x\right)^{-1}x'\Omega^{-1}E(uu')\,\Omega^{-1}x\left(x'\Omega^{-1}x\right)^{-1}
= \left(x'\Omega^{-1}x\right)^{-1}x'\Omega^{-1}\left(\sigma^2\Omega\right)\Omega^{-1}x\left(x'\Omega^{-1}x\right)^{-1}
\]
\[
\Rightarrow\; \operatorname{var}(\hat\beta_G) = \sigma^2\left(x'\Omega^{-1}x\right)^{-1}
\]
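A sketch of GLS via the factorization \(\Omega = PP'\) of the Theorem, with \(W = P^{-1}\) (an AR(1) \(\Omega\) is the illustrative case; the data are simulated):

```python
import numpy as np

rng = np.random.default_rng(9)
T, rho = 200, 0.8
x = np.column_stack([np.ones(T), rng.normal(size=T)])

idx = np.arange(T)
omega = rho ** np.abs(idx[:, None] - idx[None, :])   # AR(1) Omega as in (2.6.4)
P = np.linalg.cholesky(omega)                        # Omega = PP'
u = P @ rng.normal(size=T)                           # autocorrelated errors
y = x @ np.array([1.0, 2.0]) + u

W = np.linalg.inv(P)                                 # W = P^{-1}, so W Omega W' = I_T
wx, wy = W @ x, W @ y
beta_gls = np.linalg.solve(wx.T @ wx, wx.T @ wy)     # OLS on the transformed model
beta_ols = np.linalg.solve(x.T @ x, x.T @ y)
print(beta_gls, beta_ols)
```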
OLS minimizes
\[
S = (y - x\beta)'(y - x\beta),
\]
while GLS minimizes
\[
S_G = (y - x\beta)'\Omega^{-1}(y - x\beta):
\]
\[
\left.\frac{\partial S_G}{\partial\beta}\right|_{\hat\beta_G} = -2x'\Omega^{-1}y + 2x'\Omega^{-1}x\hat\beta_G = 0
\;\Rightarrow\; x'\Omega^{-1}x\hat\beta_G = x'\Omega^{-1}y
\]
Or \(\hat\beta_G = \left(x'\Omega^{-1}x\right)^{-1}x'\Omega^{-1}y\).
A test of \(H_0: R\beta = r\) is given by:
\[
F = \frac{\left(R\hat\beta_G - r\right)'\left[R\left(x'\Omega^{-1}x\right)^{-1}R'\right]^{-1}\left(R\hat\beta_G - r\right)}{qS^2} \sim F(q, T-k),
\]
where
\[
S^2 = \frac{\left(y - x\hat\beta_G\right)'\Omega^{-1}\left(y - x\hat\beta_G\right)}{T-k} = \frac{\hat u_G'\Omega^{-1}\hat u_G}{T-k}.
\]
The following assumptions are called regularity conditions:
i). \(\dfrac{\partial}{\partial\theta}\log f(x;\theta)\) exists for all \(x\) and \(\theta\).
ii).
\[
\frac{\partial}{\partial\theta}\int\!\cdots\!\int \prod_{i=1}^n f(x_i;\theta)\,dx_1\cdots dx_n
= \int\!\cdots\!\int \frac{\partial}{\partial\theta}\prod_{i=1}^n f(x_i;\theta)\,dx_1\cdots dx_n
\]