Ch. 3 - Systems of Equations
3.1 Introduction
So far we have considered the estimation of models consisting of a single equation. There are many cases, however, in which such equations are not determined by themselves, but rather simultaneously with other equations. In these cases it makes more sense to estimate all the equations jointly. Consider, for example, that we want to estimate the household demand functions for gas and groceries. These decisions are normally made simultaneously given an expenditure constraint, so it stands to reason that the unobserved effects on the demand for each good must be related, i.e., the errors must be correlated across equations.
In this chapter we introduce a modeling framework for multiple equations that can be used for different applications, and then consider different estimation methods for different cases. Let there be m equations that we want to estimate, which we can write as
$$y_1 = X_1\beta_1 + \varepsilon_1$$
$$y_2 = X_2\beta_2 + \varepsilon_2$$
$$\vdots$$
$$y_m = X_m\beta_m + \varepsilon_m. \tag{3.1}$$
There are m equations and n observations to estimate each equation in (3.1). Each equation in this system has its own set of $K_i$ regressors, for $i = 1, \dots, m$, so the coefficient vector of each equation also has its own size ($K_i$). For the seemingly unrelated regressions (SUR) model, we assume strict exogeneity, i.e. $E[\varepsilon_i \mid X_1, X_2, \dots, X_m] = 0$ for $i = 1, \dots, m$, and homoskedasticity within each equation, i.e. $E[\varepsilon_i\varepsilon_i' \mid X_1, X_2, \dots, X_m] = \sigma_{ii} I_n$. The errors, although uncorrelated across observations by the assumption of homoskedasticity within each equation, are correlated across equations. This means that for observations t and s in equations i and j, respectively, $E[\varepsilon_{ti}\varepsilon_{sj} \mid X_1, \dots, X_m] = \sigma_{ij}$ if $t = s$ and $0$ otherwise.
Let the $m \times m$ covariance matrix of the disturbances for the $t$th observation be
$$\Omega = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_{mm} \end{bmatrix}. \tag{3.2}$$
Since we do not know $\Sigma$, we need to estimate it and use the FGLS estimator. Remember that $\Sigma = \Omega \otimes I_n$, so in reality all we need is an estimate of $\Omega$. To get such an estimate, all we need to do is run OLS equation by equation and use the vectors of residuals of each equation, $\hat\varepsilon_i$, to estimate each element of $\Omega$ as
$$\hat\sigma_{ij} = \frac{\hat\varepsilon_i'\hat\varepsilon_j}{n}. \tag{3.7}$$
Footnote 1: The Kronecker product between matrices $A$ and $B$, denoted $A \otimes B$, is defined such that each element in matrix $A$ is multiplied by the matrix $B$.
Footnote 2: The rule is that $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$, so since $I_n^{-1} = I_n$, $\Sigma^{-1} = \Omega^{-1} \otimes I_n$.
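As a quick sanity check of these Kronecker-product rules, here is a small numpy sketch; the matrix entries and sizes are made up purely for illustration:

```python
import numpy as np

# Check that (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}, so Sigma^{-1} = Omega^{-1} ⊗ I_n.
Omega = np.array([[2.0, 0.5],
                  [0.5, 1.0]])      # a 2x2 cross-equation covariance (illustrative)
I_n = np.eye(4)                     # pretend n = 4 observations

Sigma = np.kron(Omega, I_n)
lhs = np.linalg.inv(Sigma)
rhs = np.kron(np.linalg.inv(Omega), I_n)
print(np.allclose(lhs, rhs))        # True
```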
Equation (3.7) does not correct for the degrees of freedom lost because of the number
of regressors used. Notice that equations i and j have Ki and Kj regressors, respectively,
where Ki does not necessarily equal Kj . Two possibilities that are unbiased when i = j
are
$$\hat\sigma_{ij}^{*} = \frac{\hat\varepsilon_i'\hat\varepsilon_j}{\left[(n - K_i)(n - K_j)\right]^{1/2}} \tag{3.8a}$$
and
$$\hat\sigma_{ij}^{**} = \frac{\hat\varepsilon_i'\hat\varepsilon_j}{n - \max(K_i, K_j)}. \tag{3.8b}$$
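A small sketch of the three residual-based estimates of $\sigma_{ij}$ just discussed; the function name and arguments are hypothetical, and the inputs are assumed to be the OLS residual vectors of equations i and j:

```python
import numpy as np

def sigma_hat(e_i, e_j, K_i, K_j, correction=None):
    """Estimate sigma_ij from OLS residuals e_i, e_j (length-n vectors).

    correction=None   -> equation (3.7): divide by n
    correction="geom" -> equation (3.8a): divide by sqrt((n - K_i)(n - K_j))
    correction="max"  -> equation (3.8b): divide by n - max(K_i, K_j)
    """
    n = len(e_i)
    num = e_i @ e_j
    if correction == "geom":
        return num / np.sqrt((n - K_i) * (n - K_j))
    if correction == "max":
        return num / (n - max(K_i, K_j))
    return num / n
```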
Notice that FGLS does not require that $\hat\Sigma$ be unbiased, only consistent, which would be the case if we just used the estimates in equation (3.7). Using those estimates, then
$$\hat\Sigma = \hat\Omega \otimes I_n = \begin{bmatrix} \hat\sigma_{11} & \hat\sigma_{12} & \cdots & \hat\sigma_{1m} \\ \hat\sigma_{21} & \hat\sigma_{22} & \cdots & \hat\sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \hat\sigma_{m1} & \hat\sigma_{m2} & \cdots & \hat\sigma_{mm} \end{bmatrix} \otimes I_n, \tag{3.9}$$
so the FGLS estimator takes the same form as the GLS estimator in equation (3.5), with $\hat\Sigma$ in place of $\Sigma$. For the maximum likelihood estimator, let $\sigma^{ij}$ be the $ij$th element of $\Omega^{-1}$. The first-order condition of the log-likelihood with respect to $\sigma^{ij}$ is
$$\frac{\partial \ln L}{\partial \sigma^{ij}} = 0 \;\Rightarrow\; \frac{n}{2}\sigma_{ij} - \frac{1}{2}\left(y_i - X_i\beta_i\right)'\left(y_j - X_j\beta_j\right) = 0,$$
so that $n\sigma_{ij} = \varepsilon_i'\varepsilon_j$.
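The following is a minimal numpy sketch of the (non-iterated) FGLS estimator for the SUR system, assuming the data are held as a list of response vectors `ys` and a matching list of regressor matrices `Xs`; these names and the data layout are placeholders, not from the text:

```python
import numpy as np

def sur_fgls(ys, Xs):
    """One-step FGLS for a SUR system with m equations and n observations each."""
    m, n = len(ys), len(ys[0])
    # Step 1: OLS equation by equation, keep the residuals.
    resid = []
    for y, X in zip(ys, Xs):
        b = np.linalg.lstsq(X, y, rcond=None)[0]
        resid.append(y - X @ b)
    E = np.column_stack(resid)                     # n x m matrix of residuals
    Omega_hat = (E.T @ E) / n                      # sigma_ij estimates as in (3.7)
    # Step 2: GLS on the stacked system with Sigma_hat = Omega_hat ⊗ I_n.
    X_big = np.zeros((m * n, sum(X.shape[1] for X in Xs)))
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * n:(i + 1) * n, col:col + X.shape[1]] = X
        col += X.shape[1]
    y_big = np.concatenate(ys)
    Sigma_inv = np.kron(np.linalg.inv(Omega_hat), np.eye(n))
    XtSi = X_big.T @ Sigma_inv
    beta = np.linalg.solve(XtSi @ X_big, XtSi @ y_big)
    V = np.linalg.inv(XtSi @ X_big)                # estimated covariance of beta
    return beta, V, Omega_hat
```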
Iterated FGLS
Maybe we should have covered this in the previous chapter together with FGLS and heteroskedasticity; what I present now is also valid for iterated FGLS in that case. We have just mentioned that iterated FGLS converges to MLE, and this may be confusing because when deriving both estimators we came up with the same mathematical expression for both the coefficients and the covariances of the errors. The issue at hand is that, when not iterated, FGLS uses OLS residuals to estimate the $\hat\sigma_{ij}$. Even though they are consistent estimators, MLE provides better estimates of the covariances of the errors. However, iterated FGLS may converge to MLE, so let us explain how iterating FGLS works. It follows these steps (a numerical sketch appears after the list):
1. estimate each equation by OLS and obtain the residuals $\hat\varepsilon_i$;
2. use the residuals to compute the $\hat\sigma_{ij}$ as in equation (3.7);
3. use the $\hat\sigma_{ij}$ to form $\hat\Omega$ and estimate the FGLS model;
4. use the FGLS residuals to recompute the $\hat\sigma_{ij}$ and repeat steps 3 and 4 until the estimates converge.
Footnote 3: Let $A$ be an $m \times m$ matrix and $B$ an $n \times n$ matrix. Then $|A \otimes B| = |A|^n |B|^m$, and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.
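A sketch of the iteration, under the same assumed data layout as above (a list `ys` of response vectors and a list `Xs` of regressor matrices); `gls_step` is a hypothetical helper that performs one GLS pass for a given $\Omega$:

```python
import numpy as np

def gls_step(ys, Xs, Omega):
    """One GLS pass on the stacked SUR system for a given Omega; returns the new
    coefficient vector and the Omega estimate implied by the resulting residuals."""
    m, n = len(ys), len(ys[0])
    X_big = np.zeros((m * n, sum(X.shape[1] for X in Xs)))
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * n:(i + 1) * n, col:col + X.shape[1]] = X
        col += X.shape[1]
    y_big = np.concatenate(ys)
    Si = np.kron(np.linalg.inv(Omega), np.eye(n))
    beta = np.linalg.solve(X_big.T @ Si @ X_big, X_big.T @ Si @ y_big)
    resid = (y_big - X_big @ beta).reshape(m, n)   # row i = residuals of equation i
    return beta, (resid @ resid.T) / n

def iterated_fgls(ys, Xs, tol=1e-8, max_iter=100):
    # Steps 1-2: with Omega = I the GLS pass is just OLS equation by equation,
    # and its residuals give the first estimate of Omega.
    beta, Omega = gls_step(ys, Xs, np.eye(len(ys)))
    for _ in range(max_iter):                      # steps 3-4: FGLS, then update Omega
        beta, Omega_new = gls_step(ys, Xs, Omega)
        if np.max(np.abs(Omega_new - Omega)) < tol:
            return beta, Omega_new
        Omega = Omega_new
    return beta, Omega
```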
A particular case arises when we are using the same K variables to estimate each of the
m equations. In that case X1 = X2 = · · · = Xm = Z, so that
$$X = \begin{bmatrix} Z & 0 & \cdots & 0 \\ 0 & Z & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Z \end{bmatrix} = I_m \otimes Z. \tag{3.14}$$
$$\hat\beta_{GLS} = \begin{bmatrix} (Z'Z)^{-1}Z' & 0 & \cdots & 0 \\ 0 & (Z'Z)^{-1}Z' & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & (Z'Z)^{-1}Z' \end{bmatrix} y = \begin{bmatrix} (Z'Z)^{-1}Z'y_1 \\ (Z'Z)^{-1}Z'y_2 \\ \vdots \\ (Z'Z)^{-1}Z'y_m \end{bmatrix} = \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \\ \vdots \\ \hat\beta_m \end{bmatrix}.$$
Notice that even though the efficient estimates of the coefficients are the same as those obtained via OLS equation by equation, the covariance matrix of the coefficient estimates still has to account for the correlation across equations. Equation (3.6) simplifies to
$$V\!\left[\hat\beta_{GLS} \mid Z\right] = \Omega \otimes (Z'Z)^{-1}. \tag{3.16}$$
Footnote 4: Some additional rules of Kronecker products are useful here: $(A \otimes B)' = A' \otimes B'$ and $(A \otimes B)(C \otimes D) = AC \otimes BD$.
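A quick simulated check (illustrative numbers only) that with identical regressors the stacked GLS estimates coincide with equation-by-equation OLS; under (3.16) the SUR structure then matters only for the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, m = 200, 3, 2
Z = rng.normal(size=(n, K))                        # same regressors in both equations
Omega = np.array([[1.0, 0.6],
                  [0.6, 1.5]])                     # cross-equation error covariance
E = rng.multivariate_normal(np.zeros(m), Omega, size=n)
Y = np.column_stack([Z @ rng.normal(size=K) + E[:, j] for j in range(m)])

# OLS equation by equation: (Z'Z)^{-1} Z'y_j for each j.
B_ols = np.linalg.solve(Z.T @ Z, Z.T @ Y)

# Stacked GLS with X = I_m ⊗ Z and Sigma = Omega ⊗ I_n.
X = np.kron(np.eye(m), Z)
y = Y.T.reshape(-1)                                # stack y_1, then y_2
Si = np.kron(np.linalg.inv(Omega), np.eye(n))
b_gls = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)

print(np.allclose(b_gls, B_ols.T.reshape(-1)))     # True: identical point estimates
```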
We now consider how to set up tests of hypotheses (restrictions) about the population coefficients. We consider two types of tests: one based on an F (Wald-type) statistic, and another based on the likelihood ratio. As usual, whether you use one or the other depends on whether you want to estimate the restricted model or not. The Wald test is based on statistics calculated from the results of the unrestricted model, while the likelihood ratio test is based on statistics from the estimation of both the unrestricted and restricted models.
Wald Test
The above statistic needs $\Sigma$ for its estimation, which is unknown. Using the FGLS $\hat\Sigma$ from equation (3.9), and since the denominator converges to one, in large samples the statistic will behave the same as
$$\hat F = \frac{1}{J}\left(R\hat\beta_{FGLS} - q\right)'\left[R\,\hat V\!\left[\hat\beta_{FGLS} \mid X\right]R'\right]^{-1}\left(R\hat\beta_{FGLS} - q\right). \tag{3.18}$$
You can compare this to a critical F statistic with J degrees of freedom in the numerator and mn − K degrees of freedom in the denominator (see footnote 5). Because we are using $\hat\Sigma$, even with normally distributed errors, the F distribution is only valid approximately. In general, the statistic $F[J, k]$ converges to $J^{-1}\chi^2(J)$ as $k \to \infty$. So
$$J\hat F = \left(R\hat\beta_{FGLS} - q\right)'\left[R\,\hat V\!\left[\hat\beta_{FGLS} \mid X\right]R'\right]^{-1}\left(R\hat\beta_{FGLS} - q\right) \tag{3.19}$$
follows a $\chi^2(J)$ distribution. This is basically a Wald statistic that measures the distance between $R\hat\beta_{FGLS}$ and $q$. Both statistics are valid asymptotically, but (3.18) may perform better in small samples.
Footnote 5: $K$ here represents the total number of coefficients estimated in the system of equations.
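A minimal sketch of the statistic in (3.19), taking as given the FGLS estimates and their estimated covariance matrix; the argument names are placeholders:

```python
import numpy as np
from scipy import stats

def wald_test(R, q, beta_hat, V_hat):
    """Wald statistic for H0: R beta = q; asymptotically chi-squared with J dof."""
    d = R @ beta_hat - q
    W = d @ np.linalg.solve(R @ V_hat @ R.T, d)    # (Rb - q)'[R V R']^{-1}(Rb - q)
    J = R.shape[0]
    return W, J, stats.chi2.sf(W, J)               # statistic, dof, p-value
```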
Likelihood Ratio Test
The general procedure for a likelihood ratio test is to run both the unrestricted and restricted models using maximum likelihood, calculate the likelihood of each of the two, and form a test statistic based on the ratio of the likelihood of the restricted model to the likelihood of the unrestricted model. You can estimate the SUR model by maximum likelihood in any econometric software that maximizes the log-likelihood function in (3.12), under the assumption that the errors are normally distributed. Alternatively, you could run iterated FGLS on each model until convergence, but there is no guarantee that either estimation will actually converge.
In either case, the estimates of the covariance matrix in equation (3.2) are formed with the residuals of the corresponding model. Let the restricted model be identified by R and the unrestricted model be identified by U. The likelihood ratio statistic is then
$$\lambda_{LR} = n \ln\frac{\left|\hat\Omega_R\right|}{\left|\hat\Omega_U\right|} = n\left(\ln\left|\hat\Omega_R\right| - \ln\left|\hat\Omega_U\right|\right). \tag{3.20}$$
Test of Specification
Under the assumption of normality of the errors, we can perform a test of whether the seemingly unrelated equations estimation is appropriate or not. We would like to test the null hypothesis that OLS equation by equation is appropriate against the alternative that the seemingly unrelated equations model is the correct specification.
If OLS equation by equation holds, the covariance matrix for an observation would be a diagonal matrix with the variances of the disturbances of the respective equations on the diagonal. Let $\hat\Omega_O$ represent the estimate of this covariance matrix,
then
$$\hat\Omega_O = \begin{bmatrix} \hat\sigma_{11} & 0 & \cdots & 0 \\ 0 & \hat\sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \hat\sigma_{mm} \end{bmatrix}, \tag{3.21}$$
so its determinant equals the product of the diagonal elements of the matrix, and its log-determinant thus equals
$$\ln\left|\hat\Omega_O\right| = \sum_{i=1}^{m} \ln\left(\hat\varepsilon_i'\hat\varepsilon_i / n\right). \tag{3.22}$$
That is, the log-determinant is the sum of the natural logarithms of the OLS error-variance estimates of each equation.
The OLS equation by equation estimation is the restricted model. This is because we are restricting $\sigma_{ij}$ to be zero for every $i \neq j$ in equation (3.2). The unrestricted model is thus the SUR model. Letting U represent this model, the likelihood ratio test statistic for the specification test is
$$\lambda_{LR} = n\left(\ln\left|\hat\Omega_O\right| - \ln\left|\hat\Omega_U\right|\right) = n\left[\sum_{i=1}^{m}\ln\left(\hat\varepsilon_i'\hat\varepsilon_i/n\right) - \ln\left|\hat\Omega_U\right|\right], \tag{3.23}$$
and $\lambda_{LR} \xrightarrow{a} \chi^2\left[m(m-1)/2\right]$ (see footnote 6).
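A sketch of the specification test in (3.22)-(3.23). It assumes you already have the n x m matrices of OLS residuals (restricted model) and SUR/ML residuals (unrestricted model); the variable names are placeholders:

```python
import numpy as np
from scipy import stats

def lr_diagonality_test(E_ols, E_sur):
    """LR test of H0: Omega is diagonal (OLS equation by equation is adequate)."""
    n, m = E_ols.shape
    ln_det_O = np.sum(np.log(np.sum(E_ols**2, axis=0) / n))    # equation (3.22)
    _, ln_det_U = np.linalg.slogdet((E_sur.T @ E_sur) / n)     # ln|Omega_U|
    lam = n * (ln_det_O - ln_det_U)                            # equation (3.23)
    dof = m * (m - 1) // 2
    return lam, dof, stats.chi2.sf(lam, dof)
```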
3.3 Singular Systems
Many applications of the multivariate regression model have been in the context of systems of demand equations, either commodity demands or factor demands in studies of production. In principle each is basically an application of the SUR model we have just covered, but with additional constraints or characteristics that need to be accounted for. For example, the systems are generally constrained across equations, and in many of these models the covariance matrix of the disturbances $\Omega$ is singular. We consider two cases: the Cobb-Douglas and the translog cost functions.
Footnote 6: The only unrestricted parameters under OLS equation by equation in equation (3.2) are the diagonal elements of the matrix. There are $m$ diagonal elements, and thus $m \times m - m = m(m-1)$ off-diagonal elements. Since $\Omega$ is a symmetric matrix, i.e. $\sigma_{ij} = \sigma_{ji}$, there are thus $m(m-1)/2$ restrictions.
Profit maximization with an exogenously determined output price calls for the firm to
maximize output for a given cost level of C (or minimize costs for a given output Q).
The maximization problem has the following Lagrangean
$$\Lambda = \alpha_0 \prod_{i=1}^{M} x_i^{\alpha_i} + \lambda\left(C - p'x\right), \tag{3.25}$$
where $p$ is the vector of $M$ factor prices. The first-order conditions thus are
$$\frac{\partial \Lambda}{\partial x_i} = 0 \;\Rightarrow\; \frac{\alpha_i Q}{x_i} - \lambda p_i = 0 \;\Rightarrow\; p_i x_i = \frac{\alpha_i Q}{\lambda},$$
and
$$\frac{\partial \Lambda}{\partial \lambda} = 0 \;\Rightarrow\; C - p'x = 0 \;\Rightarrow\; C = p'x = \sum_{i=1}^{M} p_i x_i.$$
Solving this system of equations yields the factor demands, $x_i^*(Q, p)$, and $\lambda^*(Q, p)$. The total cost of production is
$$\sum_{i=1}^{M} p_i x_i^* = \sum_{i=1}^{M} \frac{\alpha_i Q}{\lambda^*},$$
which means that the cost share allocated to the $i$th factor is
$$s_i = \frac{p_i x_i^*}{\sum_{i=1}^{M} p_i x_i^*} = \frac{\alpha_i}{\sum_{i=1}^{M} \alpha_i} = \beta_i. \tag{3.26}$$
By construction, $\sum_{i=1}^{M} \beta_i = 1$ and $\sum_{i=1}^{M} s_i = 1$. The cost shares will also add up to 1 in the data, which implies that $\sum_{i=1}^{M} \varepsilon_i = 0$ at every data point (see footnote 7), making the system of equations singular. To further understand this point, disregard equation (3.27) and concentrate on the system of share equations represented by (3.28). Let $\varepsilon = [\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_M]'$. Letting $i$ be a column of ones, what we have discussed means that $\varepsilon'i = \sum_{i=1}^{M}\varepsilon_i = 0$. This implies that $E[\varepsilon\varepsilon'i] = \Omega i = 0$, which means that $\Omega$ is singular, and thus non-invertible, so we cannot apply SUR to estimate the system of equations presented here.
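A short numerical illustration of the point: if the disturbances of the M share equations sum to zero at every observation, their estimated covariance matrix has rank M − 1 and cannot be inverted (the numbers below are simulated purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n, M = 500, 3
E = rng.normal(size=(n, M - 1))
E = np.column_stack([E, -E.sum(axis=1)])    # impose that the errors sum to 0 in each row
Omega_hat = (E.T @ E) / n
print(np.linalg.matrix_rank(Omega_hat))     # M - 1 = 2: Omega_hat is singular
```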
The solution is to drop one of the equations and impose the adding-up restriction $\beta_M = 1 - \beta_1 - \cdots - \beta_{M-1}$ on the cost function:
$$\ln C = \beta_0 + \beta_q \ln Q + \sum_{i=1}^{M-1} \beta_i \ln p_i + (1 - \beta_1 - \cdots - \beta_{M-1}) \ln p_M + \varepsilon_c;$$
$$\ln C - \ln p_M = \beta_0 + \beta_q \ln Q + \sum_{i=1}^{M-1} \beta_i (\ln p_i - \ln p_M) + \varepsilon_c;$$
$$\ln\frac{C}{p_M} = \beta_0 + \beta_q \ln Q + \sum_{i=1}^{M-1} \beta_i \ln\frac{p_i}{p_M} + \varepsilon_c.$$
Dropping the $M$th share equation, the system to estimate is
$$\ln\frac{C}{p_M} = \beta_0 + \beta_q \ln Q + \sum_{i=1}^{M-1} \beta_i \ln\frac{p_i}{p_M} + \varepsilon_c, \tag{3.29}$$
$$s_i = \beta_i + \varepsilon_i, \quad i = 1, \cdots, M-1. \tag{3.30}$$
Footnote 7: For example, for every firm we observe.
Specifying output with a Cobb-Douglas functional form restricts the elasticity of substitution between the factors to be equal to 1. Let output be, for now, defined by a general function $Q = f(x)$, where $x$ is the vector of factors. The solution of the cost minimization problem for a given output level $Q$ will give the factor demands $x_i^*(Q, p)$, where $p$ is the vector of factor prices, for $i = 1, \cdots, M$. The cost function would thus be
$$C(Q, p) = \sum_{i=1}^{M} p_i x_i^*(Q, p). \tag{3.31}$$
If we can assume that there are constant returns to scale, then $C(Q, p)/Q = c(p)$, where $c(p)$ is the per-unit (average) cost. We get the cost-minimizing factor demands by applying Shephard's lemma:
$$x_i^*(Q, p) = \frac{\partial C(Q, p)}{\partial p_i} = \frac{Q\,\partial c(p)}{\partial p_i}. \tag{3.32}$$
If instead we differentiate the cost function logarithmically, we obtain the cost-minimizing factor cost shares
$$s_i = \frac{\partial \ln C(Q, p)}{\partial \ln p_i} = \frac{p_i x_i}{C(Q, p)}. \tag{3.33}$$
With constant returns to scale $C(Q, p) = Q\,c(p)$, so $\ln C(Q, p) = \ln Q + \ln c(p)$, and
$$s_i = \frac{\partial \ln c(p)}{\partial \ln p_i}. \tag{3.34}$$
The purpose of many empirical studies is to determine the elasticities of factor substitution, $\theta_{ij}$, and the own-price elasticities of factor demand, $\eta_{ii}$. The elasticities of substitution are given by
$$\theta_{ij} = \frac{c\,\dfrac{\partial^2 c}{\partial p_i \partial p_j}}{\dfrac{\partial c}{\partial p_i}\dfrac{\partial c}{\partial p_j}}, \tag{3.35a}$$
and (3.35b) gives the corresponding own-price elasticities. So by suitably specifying the cost function and the cost shares we have an $M$ or $M+1$ equation model that we can use to estimate the quantities in equations (3.35a) and (3.35b).
Let $\beta_i$, for $i = 1, \cdots, M$, represent the first derivatives, and $\delta_{ij}$, for $i = 1, \cdots, M$ and $j = 1, \cdots, M$, represent the second derivatives. The function can be expressed as
$$\ln c(p) = \beta_0 + \sum_{i=1}^{M} \beta_i \ln p_i + \frac{1}{2}\sum_{i=1}^{M}\sum_{j=1}^{M} \delta_{ij} \ln p_i \ln p_j. \tag{3.37}$$
Converting from unit cost to total cost we have the translog cost function
$$\ln C = \beta_0 + \beta_Q \ln Q + \frac{1}{2}\delta_{QQ}(\ln Q)^2 + \sum_{i=1}^{M}\beta_i \ln p_i + \frac{1}{2}\sum_{i=1}^{M}\sum_{j=1}^{M}\delta_{ij}\ln p_i \ln p_j + \sum_{i=1}^{M}\delta_{Qi}\ln Q \ln p_i. \tag{3.38}$$
If all $\delta$ coefficients are zero, we would have the Cobb-Douglas cost function in equation (3.27). Applying Shephard's lemma, we get the cost shares
$$\frac{\partial \ln C}{\partial \ln p_i} = S_i = \beta_i + \delta_{Qi}\ln Q + \delta_{ii}\ln p_i + \sum_{j\neq i}\delta_{ij}\ln p_j. \tag{3.39}$$
Linear homogeneity of the cost function in prices requires
$$\sum_{i=1}^{M}\beta_i = 1, \tag{3.40a}$$
$$\sum_{i=1}^{M}\delta_{Qi} = 0, \tag{3.40b}$$
$$\sum_{i=1}^{M}\delta_{ij} = \sum_{j=1}^{M}\delta_{ij} = \sum_{i=1}^{M}\sum_{j=1}^{M}\delta_{ij} = 0. \tag{3.40c}$$
In addition, it should be clear from equation (3.36) that we also need to impose the constraint $\delta_{ij} = \delta_{ji}$, since $\partial^2 \ln c/(\partial \ln p_i\,\partial \ln p_j) = \partial^2 \ln c/(\partial \ln p_j\,\partial \ln p_i)$.
To estimate the parameters we can use SUR estimation, imposing the restrictions in equations (3.40a) to (3.40c) and the symmetry restriction $\delta_{ij} = \delta_{ji}$, and solving the problem of singularity of the disturbance covariance matrix of the share equations by dropping one of the equations and adapting the translog cost function accordingly. Estimation must now be done via maximum likelihood (iterated FGLS) to ensure invariance of the parameter estimates with respect to the choice of the omitted share equation.
For the translog cost function we can see if there are scale economies or not with a simple measure. Let SCE be
$$SCE = 1 - \frac{\partial \ln C}{\partial \ln Q}. \tag{3.41}$$
For positive SCE numbers there are economies of scale, for negative numbers there are diseconomies of scale, and when SCE = 0 we have constant returns to scale. Furthermore, the translog cost function has easy ways to estimate the elasticities of factor substitution:
$$\theta_{ij} = \frac{\delta_{ij} + s_i s_j}{s_i s_j}, \tag{3.42a}$$
$$\theta_{ii} = \frac{\delta_{ii} + s_i(s_i - 1)}{s_i^2}. \tag{3.42b}$$
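A small sketch that turns estimated translog coefficients into the elasticities in (3.42a)-(3.42b); `delta` is assumed to be the symmetric M x M matrix of estimated δ's and `s` the vector of fitted cost shares at the evaluation point, both placeholders for your own estimates:

```python
import numpy as np

def substitution_elasticities(delta, s):
    """Elasticities of factor substitution from translog estimates, (3.42a)-(3.42b)."""
    M = len(s)
    theta = np.empty((M, M))
    for i in range(M):
        for j in range(M):
            if i == j:
                theta[i, i] = (delta[i, i] + s[i] * (s[i] - 1)) / s[i] ** 2
            else:
                theta[i, j] = (delta[i, j] + s[i] * s[j]) / (s[i] * s[j])
    return theta
```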
The assignment for Seemingly Unrelated Regression is based on Christensen and Greene (1976). The idea is to estimate the same models that they estimate there, test the different restrictions of the models, and test whether there are economies of scale present in the industry with the results of each estimated model. Their study uses a translog cost function system to analyze the electrical power sector. They consider three factors (inputs): labor, L, capital, K, and fuel, F. Using the factor letters as subindexes for the different parameters, the translog cost function can be written as
$$\ln C = \beta_0 + \beta_Q \ln Q + \delta_{QQ}\frac{(\ln Q)^2}{2} + \beta_L \ln p_L + \beta_K \ln p_K + \beta_F \ln p_F + \delta_{LL}\frac{(\ln p_L)^2}{2} + \delta_{KK}\frac{(\ln p_K)^2}{2} + \delta_{FF}\frac{(\ln p_F)^2}{2} + \delta_{QL}\ln Q\ln p_L + \delta_{QK}\ln Q\ln p_K + \delta_{QF}\ln Q\ln p_F + \delta_{LK}\ln p_L\ln p_K + \delta_{LF}\ln p_L\ln p_F + \delta_{KF}\ln p_K\ln p_F + \varepsilon_C. \tag{3.43}$$
$$\beta_L + \beta_K + \beta_F = 1. \tag{3.45a}$$
Dropping the fuel share equation and using the restrictions above to adjust the model accordingly, the model then is
$$\ln\frac{C}{p_F} = \beta_0 + \beta_Q \ln Q + \delta_{QQ}\frac{(\ln Q)^2}{2} + \beta_L \ln\frac{p_L}{p_F} + \beta_K \ln\frac{p_K}{p_F} + \delta_{LL}\frac{\left[\ln(p_L/p_F)\right]^2}{2} + \delta_{KK}\frac{\left[\ln(p_K/p_F)\right]^2}{2} + \delta_{QL}\ln Q\ln\frac{p_L}{p_F} + \delta_{QK}\ln Q\ln\frac{p_K}{p_F} + \delta_{LK}\ln\frac{p_L}{p_F}\ln\frac{p_K}{p_F} + \varepsilon_C, \tag{3.46}$$
$$s_L = \beta_L + \delta_{LL}\ln\frac{p_L}{p_F} + \delta_{QL}\ln Q + \delta_{LK}\ln\frac{p_K}{p_F} + \varepsilon_L, \tag{3.47a}$$
$$s_K = \beta_K + \delta_{KK}\ln\frac{p_K}{p_F} + \delta_{QK}\ln Q + \delta_{LK}\ln\frac{p_L}{p_F} + \varepsilon_K. \tag{3.47b}$$
This is Model A in the paper. To estimate this model you use SUR, but you need to impose the cross-equation restrictions to ensure that the coefficients that are supposed to be the same in the different equations are actually the same, e.g. $\delta_{LK}$ has to be the same in all three equations.
The assignment is to estimate the models (A through F) and to test whether the restrictions in models B, C, D, E, and F hold by using likelihood ratio tests. You are also responsible for proving in your write-up that, by removing $s_F$ from the model and using the restrictions, the modified model is the one expressed in equations (3.46) to (3.47b).
3.5 Simultaneous Equations Models
We now consider another case where several equations are better estimated as a system of equations. Simultaneous equations models (SEM) are developed because many times
when applying economic theory we have models that rely on several equations to be
solved together. For example, a simple market equilibrium needs three equations to be
solved together to get the equilibrium price and quantity: the market demand equation,
the market supply equation, and the market clearing condition.
These models will have variables that will be solved for in the model, the endogenous
variables, and variables that are needed to estimate the endogenous variables but that
the model does not need to solve for, i.e. the exogenous variables. If we express the
model as derived from economic theory, the model will be in its structural form. As we
transform the model to express the endogenous variables in terms of the exogenous ones,
we then have the reduced form equations.
To understand this concept consider the following structural equations for a market
equilibrium,
$$q_d = \alpha_0 + \alpha_1 p + \alpha_2 z + \varepsilon_d \quad \text{(demand)}, \tag{3.48a}$$
$$q_s = \beta_0 + \beta_1 p + \beta_2 z + \varepsilon_s \quad \text{(supply)}, \tag{3.48b}$$
$$q_d = q_s = q \quad \text{(equilibrium)}, \tag{3.48c}$$
where q is the quantity of the good in the market, p is the price, and z is the price of a
related good that we are assuming affects both the supply and the demand (an example
could be oil that is needed for the production of electricity and that also affects the
consumption of other goods in different ways). This model solves for both price and
quantity together, so the only exogenous variable in this model is z. Solving for q and
p in terms of z we get the reduced form of the model
$$q = \frac{\alpha_1\beta_0 - \alpha_0\beta_1}{\alpha_1 - \beta_1} + \frac{\alpha_1\beta_2 - \alpha_2\beta_1}{\alpha_1 - \beta_1}z + \frac{\alpha_1\varepsilon_s - \beta_1\varepsilon_d}{\alpha_1 - \beta_1} = \pi_{11} + \pi_{21}z + \nu_q, \tag{3.49a}$$
$$p = \frac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1} + \frac{\beta_2 - \alpha_2}{\alpha_1 - \beta_1}z + \frac{\varepsilon_s - \varepsilon_d}{\alpha_1 - \beta_1} = \pi_{12} + \pi_{22}z + \nu_p. \tag{3.49b}$$
Our purpose with these models is to estimate the reduced form of the model and then calculate the estimates of the structural equations from the estimates of the parameters of the reduced-form equations. This poses a problem, since the number of parameters in the reduced-form equations is usually smaller than the number of parameters in the structural equations. Notice that there are only 4 parameters to estimate in equations (3.49a) and (3.49b), but there are 6 parameters in equations (3.48a) and (3.48b), so it is impossible to identify the structural parameters.
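A quick numerical check of the mapping from structural to reduced-form parameters in (3.49a)-(3.49b), using made-up structural values:

```python
# Illustrative structural parameters (made up): demand q = a0 + a1*p + a2*z + e_d,
# supply q = b0 + b1*p + b2*z + e_s, with a1 < 0 < b1.
a0, a1, a2 = 10.0, -1.2, 0.5
b0, b1, b2 = 2.0, 0.8, -0.3

pi11 = (a1 * b0 - a0 * b1) / (a1 - b1)   # intercept of the q equation
pi21 = (a1 * b2 - a2 * b1) / (a1 - b1)   # slope on z in the q equation
pi12 = (b0 - a0) / (a1 - b1)             # intercept of the p equation
pi22 = (b2 - a2) / (a1 - b1)             # slope on z in the p equation

# Four reduced-form parameters, six structural ones: the structural model cannot
# be recovered from (pi11, pi21, pi12, pi22) alone.
print(pi11, pi21, pi12, pi22)
```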
In this section we consider a general setup for these models, how to estimate the
coefficients of the reduced form equations, and the identification issues of the structural
parameters.
Let y represent the endogenous variables, x represent the exogenous ones, and i represent
an observation with i = 1, 2, · · · , n. The general structural form would be
$$\gamma_{11}y_{i1} + \gamma_{21}y_{i2} + \cdots + \gamma_{M1}y_{iM} + \beta_{11}x_{i1} + \beta_{21}x_{i2} + \cdots + \beta_{K1}x_{iK} = \varepsilon_{i1},$$
$$\gamma_{12}y_{i1} + \gamma_{22}y_{i2} + \cdots + \gamma_{M2}y_{iM} + \beta_{12}x_{i1} + \beta_{22}x_{i2} + \cdots + \beta_{K2}x_{iK} = \varepsilon_{i2},$$
$$\vdots$$
$$\gamma_{1M}y_{i1} + \gamma_{2M}y_{i2} + \cdots + \gamma_{MM}y_{iM} + \beta_{1M}x_{i1} + \beta_{2M}x_{i2} + \cdots + \beta_{KM}x_{iK} = \varepsilon_{iM}.$$
Since this is a linear system of equations, in order to be able to find the solutions for the
M endogenous variables, there must be M equations. This makes the system complete.
We can write the system in matrix notation
$$\begin{bmatrix} y_1 & y_2 & \cdots & y_M \end{bmatrix}_i \begin{bmatrix} \gamma_{11} & \gamma_{12} & \cdots & \gamma_{1M} \\ \gamma_{21} & \gamma_{22} & \cdots & \gamma_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{M1} & \gamma_{M2} & \cdots & \gamma_{MM} \end{bmatrix} + \begin{bmatrix} x_1 & x_2 & \cdots & x_K \end{bmatrix}_i \begin{bmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1M} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{K1} & \beta_{K2} & \cdots & \beta_{KM} \end{bmatrix} = \begin{bmatrix} \varepsilon_1 & \varepsilon_2 & \cdots & \varepsilon_M \end{bmatrix}_i,$$
or, more compactly,
$$y_i'\Gamma + x_i'B = \varepsilon_i'. \tag{3.50}$$
Looking at the matrices of the parameters, Γ and B, we can see that each column
is the vector of the coefficients of a particular equation, and each row is the vector of
the coefficients for a given variable across equations. Now, to express the endogenous
variables in terms of the exogenous variables, we transform equation (3.50) to get the reduced form
$$y_i' = -x_i'B\Gamma^{-1} + \varepsilon_i'\Gamma^{-1} = x_i'\Pi + \nu_i'. \tag{3.51}$$
For this solution to exist, the model must satisfy the completeness condition for
simultaneous equations systems: Γ must be nonsingular.
The structural disturbances are assumed to be uncorrelated across observations, $E[\varepsilon_i\varepsilon_j' \mid x_i, x_j] = 0$ for $i \neq j$, and the covariance matrix of the structural disturbances, $\Sigma$, is related to the covariance matrix of the reduced-form disturbances, $\Omega$, by $\Sigma = \Gamma'\Omega\Gamma$.
3.5.2 Identification
Earlier I mentioned that even though we estimate the reduced form system of equations, the purpose really is to measure the parameters of the structural model. The problem that arises is that one reduced form can represent different structural models (economic theories). When more than one theory is consistent with the same reduced form, the theories are said to be observationally equivalent. The problem this causes is that without more information about the theory, and thus the structural form, we cannot estimate the structural parameters from the parameters estimated in the reduced form. It is this additional information that allows us to identify the model. The additional information about the theory comes in the following forms:
Normalization: Because we have a dependent variable for each equation, we can normalize each structural form equation so that the coefficient on its dependent variable equals 1.
Identities: In some models, variable definitions or equilibrium conditions imply that all
the coefficients of a particular equation are known.
Exclusions: Omitting variables from equations places zeros on B and Γ.
Linear Restrictions: If we know that the structural parameters follow some restrictions, these restrictions will help too in ruling out false structures.
Restrictions on the Disturbance Covariance Matrix: Knowing whether the structural disturbances are correlated or uncorrelated with each other.
Order Condition: Notice that in the structural form each equation may include endogenous variables as explanatory variables. The order condition is that in
each equation the number of exogenous variables of the whole system that are not
included in the equation is at least as large as the number of endogenous variables
included as explanatory variables in the equation. The idea is that the exogenous
variables that are left out are going to be used as instruments for the endogenous
variables in the equation, therefore the equation must be at least just-identified for
us to be able to estimate it. If you have more endogenous variables in the equation
than exogenous variables that are not included, then the equation is not identified.
If you have fewer endogenous variables in the equation than excluded exogenous variables, you then have an over-identified equation. This is equivalent to what
we saw when we talked about endogeneity and the IV and 2SLS estimators. The
order condition is only a necessary condition, i.e. it is necessary for the system
to be identified that each equation satisfies the order condition, but it is not a
sufficient condition.
Rank Condition: The rank condition states that for each equation, each of the variables excluded from the equation must appear in at least one other equation (no zero columns), and at least one of the variables excluded from the equation must appear in each of the other equations (no zero rows). This is equivalent to saying that the matrix of the coefficients, in the other equations, of the variables excluded from a given equation must have full row rank. This is a sufficient condition for identification.
To illustrate how to test for these two conditions, consider a system of three equations with three endogenous variables, $y_1$, $y_2$, and $y_3$, and three exogenous variables, $x_1$, $x_2$, and $x_3$. To test whether the order and rank conditions are met, we build a matrix of coefficients, with the rows containing the equations and the columns the variables:

        y1     y2     y3     x1     x2     x3
Eq. 1    1      0      0    β21      0    β41
Eq. 2    0      1    γ32    β12    β32      0
Eq. 3  γ13      0      1    β23      0    β43
Let us start with equation 1. To test the order condition, the number of endogenous explanatory variables in the equation has to be less than or equal to the number of excluded exogenous variables. We see that there are no coefficients on other endogenous variables (no γs), and there is one zero in the exogenous variables (one β is missing), so equation 1 is over-identified. We now need to test the rank condition for this equation.
To do this we check which columns have zeros in equation 1 ($y_2$, $y_3$, and $x_2$) and form a matrix with the coefficients in those columns from the other equations:
$$\begin{bmatrix} 1 & \gamma_{32} & \beta_{32} \\ 0 & 1 & 0 \end{bmatrix}.$$
First, no rows are all zeros, and second, the two rows are linearly independent (there is no multiple of the first row that equals the second), so the matrix has full row rank (2), and the equation is identified.
Let us now consider equation 2. The number of endogenous variables in the equation is equal to the number of excluded exogenous variables, 1, so the equation is just-identified and satisfies the order condition. The relevant matrix to test the rank condition is
$$\begin{bmatrix} 1 & \beta_{41} \\ \gamma_{13} & \beta_{43} \end{bmatrix}.$$
As long as the coefficients in one row are not proportional to the other row, the matrix
will have full row rank, 2, and be identified. If there is a multiple that can transform
row 1 into row 2, then the matrix will not have full row rank, and the equation would
not be identified.
Finally, consider equation 3. It includes one endogenous explanatory variable ($y_1$) and excludes one exogenous variable ($x_2$), so the order condition is satisfied with equality. The variables excluded from equation 3 are $y_2$ and $x_2$, and the matrix of their coefficients in the other equations is
$$\begin{bmatrix} 0 & 0 \\ 1 & \beta_{32} \end{bmatrix}.$$
We can see that this matrix does not have full row rank, since all we need to do is multiply the first column by $\beta_{32}$ to obtain two identical columns. The rank of the matrix is 1 (it also has a zero row), and equation 3 is not identified.
The case of equation 3 clearly illustrates how satisfying the order condition is not
sufficient to ensure identification. Equation 3 is just-identified according to the order
condition, but fails to be identified because it fails to satisfy the rank condition.
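For a numerical version of these checks, one can plug arbitrary nonzero values into the unknown coefficients and compute matrix ranks; the particular numbers below are placeholders:

```python
import numpy as np

# Placeholder values for the structural coefficients that appear in the submatrices.
g32, b32, g13, b41, b43 = 0.7, 1.3, -0.4, 2.0, 0.9

# Equation 1: coefficients on its excluded variables (y2, y3, x2) in equations 2 and 3.
M1 = np.array([[1.0, g32, b32],
               [0.0, 1.0, 0.0]])
print(np.linalg.matrix_rank(M1))    # 2: full row rank, equation 1 is identified

# Equation 2: coefficients on its excluded variables (y1, x3) in equations 1 and 3.
M2 = np.array([[1.0, b41],
               [g13, b43]])
print(np.linalg.matrix_rank(M2))    # 2 unless the rows happen to be proportional

# Equation 3: coefficients on its excluded variables (y2, x2) in equations 1 and 2.
M3 = np.array([[0.0, 0.0],
               [1.0, b32]])
print(np.linalg.matrix_rank(M3))    # 1: the rank condition fails
```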
For estimation purposes, let Xj represent the exogenous variables in equation j, and Yj
the endogenous variables in the right hand side of equation j. We can then write the
general model to be estimated as:
$$y_j = X_j\delta_j + Y_j\gamma_j + \varepsilon_j = Z_j\beta_j + \varepsilon_j, \tag{3.52}$$
where $Z_j = (X_j\ Y_j)$ and $\beta_j = (\delta_j'\ \gamma_j')'$. Just like in SUR estimation, we can estimate the system of equations either equation by equation, using limited information estimators, or estimate all equations simultaneously, using full information estimators. The major difference between these estimators and the SUR estimators is that we now have to account for the endogeneity of $Y_j$.
When you estimate the models equation by equation, since there are endogenous variables, OLS will provide a biased and inconsistent estimate, so the consistent estimator is 2SLS (which translates into the IV estimator in the just-identified case) for each equation:
$$\hat\beta_{j,2SLS} = \left(\hat Z_j'Z_j\right)^{-1}\hat Z_j'y_j = \left[Z_j'X(X'X)^{-1}X'Z_j\right]^{-1}Z_j'X(X'X)^{-1}X'y_j, \tag{3.53}$$
where X = (Xj X−j ) is the matrix of all exogenous variables in the system. The
asymptotic variance estimate is
$$\hat V\!\left[\hat\beta_{j,2SLS}\right] = \hat\sigma_{jj}\left(\hat Z_j'\hat Z_j\right)^{-1} = \hat\sigma_{jj}\left[Z_j'X(X'X)^{-1}X'Z_j\right]^{-1}, \tag{3.54}$$
where
$$\hat\sigma_{jj} = \frac{\left(y_j - Z_j\hat\beta_j\right)'\left(y_j - Z_j\hat\beta_j\right)}{n},$$
which uses the original variables $Z_j$, not the predicted ones $\hat Z_j$.
Note the role of the order condition for identification. It requires that the number of exogenous variables that appear elsewhere in the model be at least as large as the number of endogenous variables that appear in the equation. This is because we are predicting $Z_j = (X_j\ Y_j)$ using $X = (X_j\ X_{-j})$, which means that there must be at least as many variables in $X_{-j}$ as in $Y_j$, which is the order condition.
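A minimal numpy sketch of equation-by-equation 2SLS as in (3.53)-(3.54); `Z_j` is assumed to collect the included regressors $(X_j\ Y_j)$ of equation j and `X` all exogenous variables of the system:

```python
import numpy as np

def tsls(y_j, Z_j, X):
    """2SLS for one equation: instruments are all exogenous variables in X."""
    Z_hat = X @ np.linalg.solve(X.T @ X, X.T @ Z_j)   # first-stage fitted values of Z_j
    beta = np.linalg.solve(Z_hat.T @ Z_j, Z_hat.T @ y_j)
    e = y_j - Z_j @ beta                              # residuals use the original Z_j
    sigma_jj = (e @ e) / len(y_j)
    V = sigma_jj * np.linalg.inv(Z_hat.T @ Z_j)       # (3.54), since Zhat'Zhat = Zhat'Z_j
    return beta, V
```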
We are now going to estimate the coefficients using all the equations together. Let us
formulate the whole system as
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} Z_1 & 0 & \cdots & 0 \\ 0 & Z_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Z_M \end{bmatrix}\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_M \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{bmatrix}, \tag{3.55a}$$
or
$$y = Z\beta + \varepsilon, \tag{3.55b}$$
where $E[\varepsilon \mid X] = 0$ and $E[\varepsilon\varepsilon' \mid X] = \Sigma = \Omega \otimes I_n$, i.e. homoskedastic within each equation.
The OLS estimator $\hat\beta = (Z'Z)^{-1}Z'y$ is equation-by-equation OLS and is inconsistent. But even if it were consistent, we know from SUR that it would be inefficient compared to an estimator that uses the cross-equation correlations of the disturbances. For the first issue, inconsistency, we need an IV-based estimator; for the second issue, inefficiency, we use a GLS approach. The three-stage least squares (3SLS) estimator combines the two, in three stages:
1st Stage: Estimate the reduced form in equation (3.51) by OLS (equation by equation) and compute $\hat Y_j$ for each equation. Notice that this is similar to the first stage of 2SLS.
2nd Stage: Compute $\hat\beta_{j,2SLS}$ for each equation by running OLS on each equation, replacing $Y_j$ by $\hat Y_j$ from stage 1. Then $\hat\Omega$ can be formed with
$$\hat\sigma_{ij} = \frac{\left(y_i - Z_i\hat\beta_{i,2SLS}\right)'\left(y_j - Z_j\hat\beta_{j,2SLS}\right)}{n}.$$
3rd Stage: Run FGLS of $y$ on $\hat Z = (X\ \hat Y)$ to get the 3SLS estimator
$$\hat\beta_{3SLS} = \left[\hat Z'\left(\hat\Omega \otimes I_n\right)^{-1}\hat Z\right]^{-1}\hat Z'\left(\hat\Omega \otimes I_n\right)^{-1}y. \tag{3.56}$$
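The three stages translate into a short numpy sketch; as before, `ys` and `Zs` are assumed lists with one entry per equation, and `X` holds all exogenous variables of the system:

```python
import numpy as np

def three_sls(ys, Zs, X):
    """3SLS as in equation (3.56): 2SLS residuals give Omega, then FGLS on Z_hat."""
    m, n = len(ys), len(ys[0])
    P = X @ np.linalg.inv(X.T @ X) @ X.T               # projection onto the instruments
    # Stages 1-2: 2SLS equation by equation, keep residuals to estimate Omega.
    resid = []
    for y, Z in zip(ys, Zs):
        b = np.linalg.solve(Z.T @ P @ Z, Z.T @ P @ y)
        resid.append(y - Z @ b)
    E = np.column_stack(resid)
    Omega = (E.T @ E) / n
    # Stage 3: FGLS on the stacked system, replacing each Z_j by its projection P Z_j.
    Zhat = np.zeros((m * n, sum(Z.shape[1] for Z in Zs)))
    col = 0
    for i, Z in enumerate(Zs):
        Zhat[i * n:(i + 1) * n, col:col + Z.shape[1]] = P @ Z
        col += Z.shape[1]
    y_big = np.concatenate(ys)
    Si = np.kron(np.linalg.inv(Omega), np.eye(n))
    beta = np.linalg.solve(Zhat.T @ Si @ Zhat, Zhat.T @ Si @ y_big)
    return beta, Omega
```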
In Stata, the command for 3SLS is reg3. This command is only valid for homoskedastic errors, as we have assumed in the presentation above, which of course is problematic when heteroskedasticity is present.
The assignment this time is to estimate the model discussed in Cameron and Trivedi (2009, section 6.6). Estimate each equation independently using 2SLS (remember that the instruments for each equation are the exogenous variables in the other equation that are not included in that equation), and test for endogeneity, the validity of the instruments, and the presence of weak instruments, as you did in the endogeneity assignment. Then estimate the whole system using 3SLS (basically do the same as they do in the book). 3SLS is supposed to be more efficient than 2SLS equation by equation, so make sure you compare the results of both estimations in that sense. Remember that although the purpose of the assignment is to do 3SLS, the write-up has to be a report on the estimation of both equations: you must analyze what the estimated coefficients mean and whether the coefficients differ between the two models. Finally, check whether the equations in the following model are identified:
$$\begin{bmatrix} y_1 & y_2 & y_3 & y_4 \end{bmatrix}\begin{bmatrix} 1 & \gamma_{12} & 0 & 0 \\ \gamma_{21} & 1 & \gamma_{23} & \gamma_{24} \\ 0 & \gamma_{32} & 1 & \gamma_{34} \\ \gamma_{41} & \gamma_{42} & 0 & 1 \end{bmatrix} + \begin{bmatrix} x_1 & x_2 & x_3 & x_4 & x_5 \end{bmatrix}\begin{bmatrix} 0 & \beta_{12} & \beta_{13} & \beta_{14} \\ \beta_{21} & 1 & 0 & \beta_{24} \\ \beta_{31} & \beta_{32} & \beta_{33} & 0 \\ 0 & 0 & \beta_{43} & \beta_{44} \\ 0 & \beta_{52} & 0 & 0 \end{bmatrix} = \begin{bmatrix} \varepsilon_1 & \varepsilon_2 & \varepsilon_3 & \varepsilon_4 \end{bmatrix}$$
Footnote 10: For more information on cmp go to http://ideas.repec.org/c/boc/bocode/s456882.html.
Footnote 11: See Greene (2012, p. 319) for what a recursive system is.
Bibliography
Greene, William H., Econometric Analysis, 7th ed., Upper Saddle River, NJ: Prentice Hall, 2012.