Graybill Intro
1.1 Matrices
The product AB of an m × n matrix A = (a_ij) and an n × r matrix B = (b_ij) is defined as the matrix C with pqth element equal to Σ_{s=1}^n a_ps b_sq. A diagonal matrix D = (d_ij) is a square matrix whose off-diagonal elements are all zero; that is, if D = (d_ij), then d_ij = 0 if i ≠ j. A matrix A is symmetric if it equals its transpose A'; that is, A' = A.
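The element-by-element definition of the product can be checked with a short Python sketch (the helper name `mat_mult` is ours, introduced only for illustration):

```python
def mat_mult(A, B):
    """Product C = AB, with pqth element equal to sum_s a_ps * b_sq."""
    n = len(B)  # inner dimension: columns of A must equal rows of B
    assert all(len(row) == n for row in A)
    return [[sum(A[p][s] * B[s][q] for s in range(n))
             for q in range(len(B[0]))]
            for p in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = mat_mult(A, B)  # [[19, 22], [43, 50]]
```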
• Theorem 1.1 The transpose of A' equals A; that is, (A')' = A.
• Theorem 1.2 The inverse of A⁻¹ is A; that is, (A⁻¹)⁻¹ = A.
• Theorem 1.3 The operations of inversion and transposition can be permuted; that is, (A')⁻¹ = (A⁻¹)'.
• Theorem 1.4 (AB)' = B'A'.
• Theorem 1.5 (AB)⁻¹ = B⁻¹A⁻¹ if A and B are each nonsingular.
• Theorem 1.6 A scalar commutes with every matrix; that is, kA = Ak.
• Theorem 1.7 For any matrix A we have IA = AI = A.
• Theorem 1.8 Diagonal matrices of the same dimension are commutative.
• Theorem 1.9 The product of two diagonal matrices D1 and D2 is diagonal; that is, D = D1D2 = D2D1, where D is diagonal. The ith diagonal element of D is the product of the ith diagonal elements of D1 and D2.
• Theorem 1.10 If A is a nonsingular matrix and if the equation AX = Y holds, then X = A⁻¹Y.
• Theorem 1.11 The rank of the product AB of the two matrices A and B is less than or equal to the rank of A and less than or equal to the rank of B.
• Theorem 1.12 The rank of the sum of A and B is less than or equal to the rank of A plus the rank of B.
• Theorem 1.13 If A is an n × n matrix and if |A| ≠ 0, then the rank of A is n. (|A| is the determinant of the matrix A.)
• Theorem 1.14 The rank of A is m if and only if the rows (columns) of A are linearly independent. (A is m × m.)
• Theorem 1.15 The rank of A is s if and only if A has exactly s linearly independent rows (columns). (A is m × m.)
• Theorem 1.16 If A'A = 0, then A = 0.
• Theorem 1.20 The rank of AA' equals the rank of A'A equals the rank of A equals the rank of A'.
1.2 Quadratic Forms

If Y is an n × 1 vector with ith element y_i and if A is an n × n matrix with ijth element equal to a_ij, then the quadratic form Y'AY is defined as Σ_{i=1}^n Σ_{j=1}^n y_i y_j a_ij. The rank of the quadratic form Y'AY is defined as the
rank of the matrix A. The quadratic form Y'AY is said to be positive definite if and only if Y'AY > 0 for all vectors Y where Y ≠ 0. A quadratic form Y'AY is said to be positive semidefinite if and only if Y'AY ≥ 0 for all Y, and Y'AY = 0 for some vector Y ≠ 0. The matrix A of a quadratic form Y'AY is said to be positive definite (semidefinite) when the quadratic form is positive definite (semidefinite). If C is an n × n matrix such that C'C = I, then C is said to be an orthogonal matrix, and C' = C⁻¹.
Consider the transformation from the vector Z to the vector Y by the matrix P such that Y = PZ. Then Y'AY = (PZ)'A(PZ) = Z'P'APZ. Thus, by the transformation Y = PZ, the quadratic form Y'AY is transformed into the quadratic form Z'(P'AP)Z.
• Theorem 1.21 If P is a nonsingular matrix and if A is positive
definite (semidefinite), then P'AP is positive definite (semidefinite).
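A small numeric sketch (the matrices A, P, and the vector Z below are our own, chosen only for illustration) confirms that the transformation Y = PZ carries the quadratic form Y'AY into Z'(P'AP)Z:

```python
def mat_mult(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def quad_form(Y, A):
    """Y'AY = sum_i sum_j y_i y_j a_ij for a vector Y given as a flat list."""
    n = len(Y)
    return sum(Y[i] * Y[j] * A[i][j] for i in range(n) for j in range(n))

A = [[2, 1], [1, 3]]   # a symmetric matrix
P = [[1, 1], [0, 1]]   # a nonsingular transformation matrix
Z = [1, 2]
Y = [sum(P[i][j] * Z[j] for j in range(2)) for i in range(2)]  # Y = PZ

Pt = [list(r) for r in zip(*P)]           # P'
PtAP = mat_mult(mat_mult(Pt, A), P)       # P'AP

assert quad_form(Y, A) == quad_form(Z, PtAP)  # Y'AY = Z'(P'AP)Z
```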
4 LINEAR STATISTICAL MODELS
• Theorem 1.34 Let C1 be a p × n matrix consisting of p rows c_1, c_2, ..., c_p of an n × n orthogonal matrix C. Then c_i c_j' = 1 if i = j and c_i c_j' = 0 if i ≠ j (i, j = 1, 2, ..., p); that is, C1C1' = I. Conversely, if the rows of a p × n matrix C1 satisfy these conditions, there exist n − p additional rows C2 such that the n × n matrix formed from the rows of C1 and C2 is orthogonal.
1.3 Determinants
If

A = [A11  A12]
    [A21  A22]

where A11 and A22 are square matrices, and if A12 = 0 or A21 = 0, then |A| = |A11||A22|.
• Theorem 1.44 If A1 and A2 are symmetric and A2 is positive definite and if A1 − A2 is positive semidefinite (or positive definite), then |A1| ≥ |A2|.
The trace of a matrix A, written tr(A), is defined as the sum of the diagonal elements of A; that is, tr(A) = Σ_{i=1}^n a_ii.
• Theorem 1.45 tr(AB) = tr(BA).
Proof: The iith element of AB is Σ_k a_ik b_ki; so tr(AB) = Σ_i Σ_k a_ik b_ki. Similarly, tr(BA) is equal to Σ_i Σ_k b_ik a_ki. But it is clear that Σ_i Σ_k a_ik b_ki = Σ_i Σ_k b_ik a_ki; therefore, tr(AB) = tr(BA).
• Theorem 1.46 tr(ABC) = tr(CAB) = tr(BCA); that is, the trace of the product of matrices is invariant under any cyclic permutation of the matrices.
Proof: By Theorem 1.45, tr[(AB)C] = tr[C(AB)].
• Theorem 1.47 tr(I) = n, where I is an n × n identity matrix.
• Theorem 1.48 If C is an orthogonal matrix, tr(C'AC) = tr(A).

Proof: By Theorem 1.46, tr(C'AC) = tr(ACC') = tr(A).
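Theorems 1.45 and 1.48 are easy to check numerically. The sketch below (example matrices are our own) uses an exact orthogonal matrix built from the 3-4-5 right triangle, with `fractions.Fraction` so that C'C = I holds exactly rather than to rounding error:

```python
from fractions import Fraction as F

def mat_mult(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert trace(mat_mult(A, B)) == trace(mat_mult(B, A))   # Theorem 1.45

# An exact orthogonal matrix: rows are orthonormal (3/5, 4/5 from 3-4-5).
C = [[F(3, 5), F(4, 5)], [F(-4, 5), F(3, 5)]]
Ct = [list(r) for r in zip(*C)]                          # C'
assert mat_mult(Ct, C) == [[1, 0], [0, 1]]               # C'C = I
assert trace(mat_mult(mat_mult(Ct, A), C)) == trace(A)   # Theorem 1.48
```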
It is sometimes advantageous to partition a matrix into submatrices. A matrix A might be partitioned in many ways; for example, it might be partitioned into submatrices as follows:

A = [A11  A12]
    [A21  A22]
where A11 is m1 × m1, A12 is m1 × m2, A21 is m2 × m1, and A22 is m2 × m2, and where m1 + m2 = m if A is m × m. If a partitioned matrix A is to be multiplied by a matrix B, the dimensions of the matrices and of the submatrices must be such that they will multiply. For example, if B is an m × p matrix such that

B = [B11  B12]
    [B21  B22]

where B11 is of dimension m1 × p1, B12 is m1 × p2, B21 is m2 × p1, and B22 is m2 × p2, then the product AB is as follows:

AB = [A11B11 + A12B21    A11B12 + A12B22]
     [A21B11 + A22B21    A21B12 + A22B22]
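A quick sketch with an arbitrary 3 × 3 example (our own, partitioned with m1 = 1, m2 = 2) confirms that the blockwise product agrees with the ordinary product:

```python
def mat_mult(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]

A = [[1, 2, 3],
     [4, 6, 7],
     [5, 8, 9]]
B = [[1, 0, 2],
     [3, 1, 0],
     [1, 2, 1]]

# Partition both matrices with m1 = 1, m2 = 2:
A11, A12 = [[1]], [[2, 3]]
A21, A22 = [[4], [5]], [[6, 7], [8, 9]]
B11, B12 = [[1]], [[0, 2]]
B21, B22 = [[3], [1]], [[1, 0], [2, 1]]

TL = mat_add(mat_mult(A11, B11), mat_mult(A12, B21))  # A11B11 + A12B21
TR = mat_add(mat_mult(A11, B12), mat_mult(A12, B22))  # A11B12 + A12B22
BL = mat_add(mat_mult(A21, B11), mat_mult(A22, B21))  # A21B11 + A22B21
BR = mat_add(mat_mult(A21, B12), mat_mult(A22, B22))  # A21B12 + A22B22

# Reassemble the blocks and compare with the ordinary product:
blockwise = [TL[0] + TR[0]] + [bl + br for bl, br in zip(BL, BR)]
assert blockwise == mat_mult(A, B)
```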
If

A = [A11  A12]        and        B = A⁻¹ = [B11  B12]
    [A21  A22]                             [B21  B22]

where A11 and B11 are each m1 × m1 and A22 and B22 are each m2 × m2, then, since AB = I,

[A11  A12] [B11  B12]   =   [I  0]
[A21  A22] [B21  B22]       [0  I]

and

A11B11 + A12B21 = I        A11B12 + A12B22 = 0

From the second of these equations, A12 = −A11B12B22⁻¹. Substituting this value for A12 into the first equation, we get

A11B11 − A11B12B22⁻¹B21 = I

and, since A and B are positive definite matrices and since A11 and B22 are principal minors of A and B, A11 and B22 are nonsingular, and it follows that A11⁻¹ = B11 − B12B22⁻¹B21.
• Theorem 1.50 If

A = [A11  A12]
    [A21  A22]

where A22 is nonsingular, then |A| = |A22||A11 − A12A22⁻¹A21|.

Proof: |A| can be written as |A| = |A22||A||B|, where

B = [I            0    ]
    [−A22⁻¹A21    A22⁻¹]

since |B| = |A22⁻¹| = 1/|A22|.
MATHEMATICAL CONCEPTS 9
The corresponding submatrices are such that they multiply; so

AB = [A11 − A12A22⁻¹A21    A12A22⁻¹]
     [0                    I       ]

and hence

|A| = |A22||AB| = |A22||A11 − A12A22⁻¹A21|

as was to be shown.
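Theorem 1.50 can be verified exactly on a small example (the 4 × 4 matrix below, partitioned into 2 × 2 blocks, is our own choice), using `fractions.Fraction` to avoid rounding:

```python
from fractions import Fraction as F

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small M)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def mat_mult(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_sub(A, B):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(A, B)]

def inv2(M):
    """Exact inverse of a 2x2 matrix."""
    d = F(det(M))
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

A11 = [[1, 2], [3, 4]]
A12 = [[0, 1], [1, 0]]
A21 = [[2, 0], [0, 2]]
A22 = [[3, 1], [1, 2]]                      # nonsingular: |A22| = 5
A = [r1 + r2 for r1, r2 in zip(A11, A12)] + [r1 + r2 for r1, r2 in zip(A21, A22)]

# |A| = |A22| * |A11 - A12 A22^{-1} A21|
schur = mat_sub(A11, mat_mult(mat_mult(A12, inv2(A22)), A21))
assert det(A) == det(A22) * det(schur)
```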
1.4 Linear Equations

Consider the set of n linear equations in m unknowns

a11 x1 + a12 x2 + ... + a1m xm = y1
a21 x1 + a22 x2 + ... + a2m xm = y2
.......................................
an1 x1 + an2 x2 + ... + anm xm = yn

which may be written AX = Y. For a given set of coefficients a_ij (that is, a matrix A) and a given vector Y, the following situations must be considered:

1. The equations have no solution. In this case no vector X satisfies the equations, and the set of equations is said to be inconsistent.
2. There is a unique vector X that satisfies the equations.
3. If more than one such vector exists, then an infinite number of vectors X satisfy the equations.

These situations can be examined by comparing the rank of A with the rank of the
augmented matrix B = (A, Y), which is the matrix A with the vector
Y joined to it as the (m + 1)st column; that is to say,
B = [a11  a12  ...  a1m  y1]
    [ .    .         .    .]
    [an1  an2  ...  anm  yn]

A necessary and sufficient condition for the equations to be consistent (that is, for at least one solution to exist) is that the rank of the coefficient matrix A be equal to the rank of the augmented matrix B = (A, Y).
• Theorem 1.52 If p(A) = p(B) = p, where p(A) denotes the rank of A and B is the augmented matrix (A, Y), then the equations are consistent; m − p of the unknowns can be assigned any given values, and the remaining p of the unknowns will be uniquely determined.
It is essential that the m − p unknowns that are assigned given values be chosen such that the matrix of the coefficients of the remaining p unknowns has rank p.
• Theorem 1.53 If A is an n × m matrix of rank n, then there is at least one vector X that satisfies AX = Y.
For example, consider the two equations in two unknowns

2x1 + 2x2 = 6
x1 + x2 = 3

This can be written in matrix form as

[2  2] [x1]   [6]
[1  1] [x2] = [3]

The rank of the coefficient matrix and the rank of the augmented matrix are both 1; the equations are therefore consistent, and, since m − p = 2 − 1 = 1, one of the unknowns can be assigned any given value and the other is then uniquely determined.
1.5 Matrix Derivatives

Let A be a p × 1 vector with ith element a_i, and let X be a p × 1 vector of variables x_i. If Z = A'X = Σ_{i=1}^p a_i x_i, the derivative of Z with respect to the vector X, written ∂Z/∂X, is defined as the p × 1 vector whose ith element is ∂Z/∂x_i, which equals a_i; so ∂Z/∂X = A.
• Theorem 1.55 Let A be a p × 1 vector, let B be a q × 1 vector, and let X be a p × q matrix whose pq elements are independent, and let Z = A'XB = Σ_{i=1}^p Σ_{j=1}^q a_i x_ij b_j. Then ∂Z/∂X, the p × q matrix whose ijth element is ∂Z/∂x_ij, is given by

∂Z/∂x_ij = a_i b_j        so        ∂Z/∂X = AB'

• Theorem 1.56 Let X be a p × 1 vector, let A be a p × p symmetric matrix, and let Z = X'AX. Then ∂Z/∂a_ij = 2x_i x_j if i ≠ j (remembering that a_ij = a_ji) and ∂Z/∂a_ii = x_i²; that is, ∂Z/∂A = 2XX' − D(XX'), where D(XX') is the diagonal matrix whose diagonal elements are the diagonal elements of XX'.
• Theorem 1.57 Let X be a p × 1 vector and let A be a p × p symmetric matrix such that Z = X'AX; then ∂Z/∂X = 2AX.

Proof: Z = Σ_{m=1}^p Σ_{n=1}^p x_m x_n a_mn; so

∂Z/∂x_i = 2x_i a_ii + 2 Σ_{m≠i} x_m a_mi = 2 Σ_{m=1}^p a_im x_m

which is twice the ith element of the vector AX. Therefore, ∂Z/∂X = 2AX.

These derivatives will be useful in proving some of the theorems that follow.
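Theorem 1.57 can be checked numerically by comparing 2AX with a central-difference gradient of Z = X'AX (the matrix A and vector X below are our own example values):

```python
def quad(X, A):
    """Z = X'AX for a flat-list vector X and symmetric matrix A."""
    n = len(X)
    return sum(X[i] * A[i][j] * X[j] for i in range(n) for j in range(n))

A = [[2, 1], [1, 3]]   # symmetric
X = [1.0, 2.0]
h = 1e-6

# Central-difference approximation to the gradient dZ/dX:
grad = []
for i in range(len(X)):
    Xp = X[:]; Xp[i] += h
    Xm = X[:]; Xm[i] -= h
    grad.append((quad(Xp, A) - quad(Xm, A)) / (2 * h))

# Theorem 1.57: the gradient equals 2AX.
two_AX = [2 * sum(A[i][j] * X[j] for j in range(len(X))) for i in range(len(X))]
assert all(abs(g - t) < 1e-6 for g, t in zip(grad, two_AX))
```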
1.6 Idempotent Matrices

A square matrix A is said to be idempotent if A² = A. If λ is a characteristic root of an idempotent matrix A and X is the corresponding characteristic vector, then

AX = λX        and        (A² − A)X = 0

so (λ² − λ)X = 0 and, since X ≠ 0, every characteristic root of A is either 0 or 1.

If A is also symmetric, then A = A² = A'A; so

a_ii = Σ_{j=1}^n a_ij a_ji = Σ_{j=1}^n a_ij²

But if a_ii = 0, then a_ij = 0 (j = 1, 2, ..., n); that is, the elements of the ith row of A are all zero. But A' = A; so the elements of the ith column are also all zero.
• Theorem 1.63 If A is an idempotent matrix of rank r, then tr(A) = r.

Proof: By Theorem 1.31, there exists an orthogonal matrix P such that P'AP = E, where E is a diagonal matrix whose first r diagonal elements are 1 and whose remaining diagonal elements are 0. But tr(P'AP) = tr(A); thus tr(A) = tr(P'AP) = tr(E) = r.
• Theorem 1.64 If A is an idempotent matrix and B is an idempotent matrix and if AB = BA, then AB is idempotent.

Proof: If AB = BA, then (AB)(AB) = A(BA)B = A(AB)B = (AA)(BB) = AB.
• Theorem 1.65 If A is idempotent and P is orthogonal, then P'AP is idempotent.

Proof: (P'AP)(P'AP) = P'A(PP')AP = P'(AA)P = P'AP.
• Theorem 1.66 If A is idempotent and A + B = I, then B is idempotent and AB = BA = 0.

Proof: B = I − A. We get

B² = (I − A)(I − A) = I − IA − AI + A² = I − 2A + A = I − A = B

Thus B² = B, and B is idempotent. If we multiply A + B = I on the left by A, we get A² + AB = A, or AB = A − A² = 0; multiplying A + B = I on the right by A, the result BA = 0 follows.
• Theorem 1.67 Let A1, A2, ..., Am be p × p symmetric matrices. There exists an orthogonal matrix P such that P'A1P, P'A2P, ..., P'AmP are each diagonal if and only if A_iA_j = A_jA_i for every pair i, j.

This is contained in Theorem 1.32; because of its importance it is stated as a separate theorem.
• Theorem 1.68 Let A1, A2, ..., Am be p × p symmetric matrices, and let B = Σ_{i=1}^m A_i. Consider the following three conditions:

(1) Each A_i is idempotent.
(2) B = Σ_{i=1}^m A_i is idempotent.
(3) A_iA_j = 0 for all i ≠ j.

Then any two of the three conditions imply the third.
Proof: Suppose conditions (1) and (2) hold. Since B is symmetric and idempotent of rank r, there exists an orthogonal matrix P such that

P'BP = [I_r  0]
       [0    0]

Writing B_i = P'A_iP, each B_i is symmetric and idempotent, and

P'BP = Σ_{i=1}^m P'A_iP

so we have I_r = Σ_{i=1}^m B_i*, where B_i* denotes the leading r × r submatrix of B_i (the remaining blocks of each B_i vanish, since each B_i is positive semidefinite and the corresponding blocks of the sum are zero). Each B_i* is symmetric and idempotent; so there is an orthogonal matrix C such that

C'B_i*C = [I_t  0]
          [0    0]

where t is the rank of B_i*. Transforming the identity I_r = Σ B_i* by C and using the idempotence of each term gives, for j ≠ i,

C'B_i*C C'B_j*C = 0

which implies B_iB_j = 0 and hence A_iA_j = 0; condition (3) is satisfied.
Next suppose conditions (1) and (3) hold. Then

B² = (Σ_{i=1}^m A_i)² = Σ_{i=1}^m A_i² + Σ_{i≠j} A_iA_j = Σ_{i=1}^m A_i = B

We have shown that the sum B is idempotent, and condition (2) is satisfied.
Finally, suppose conditions (2) and (3) hold. Since A_iA_j = A_jA_i = 0 for all i ≠ j, the matrices commute pairwise; so, by Theorem 1.67, there is an orthogonal matrix P such that P'A_1P, P'A_2P, ..., P'A_mP are each diagonal (since A_iA_j = 0). It follows that P'BP = Σ P'A_iP is also diagonal, and, since B is idempotent, every diagonal element of P'BP is either 0 or 1. By condition (3) we know that (P'A_iP)(P'A_jP) = 0 for i ≠ j; so, if the kth diagonal element of P'A_iP is not zero, the kth diagonal element of P'A_jP must be zero for all j ≠ i. Hence each nonzero diagonal element of P'BP comes from exactly one of the P'A_iP, and each diagonal element of each P'A_iP is either 0 or 1. The characteristic roots of A_i are displayed down the diagonal of P'A_iP, and, since these roots are either 0 or 1, each A_i is idempotent; condition (1) is satisfied.
In this situation the orthogonal matrix P can be chosen so that the nonzero elements of the P'A_iP occupy disjoint blocks of the diagonal; that is,

(a) P'BP = [I_r  0]        where the rank of B is r
           [0    0]

(b) P'A_1P = [I_r1  0  0]        P'A_2P = [0  0     0]        ...        P'A_mP = [0  0     0]
             [0     0  0]                 [0  I_r2  0]                            [0  I_rm  0]
             [0     0  0]                 [0  0     0]                            [0  0     0]

(with the identity block I_ri of P'A_iP moving successively down the diagonal), where r_i is the rank of A_i and r = r_1 + r_2 + ... + r_m.
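A minimal sketch of Theorem 1.68, using two complementary rank-1 projections (our own example matrices) that satisfy all three conditions at once:

```python
from fractions import Fraction as F

def mat_mult(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

h = F(1, 2)
A1 = [[h, h], [h, h]]      # symmetric idempotent, rank 1
A2 = [[h, -h], [-h, h]]    # symmetric idempotent, rank 1
B = [[A1[i][j] + A2[i][j] for j in range(2)] for i in range(2)]

assert mat_mult(A1, A1) == A1                  # condition (1) for A1
assert mat_mult(A2, A2) == A2                  # condition (1) for A2
assert mat_mult(A1, A2) == [[0, 0], [0, 0]]    # condition (3)
assert B == [[1, 0], [0, 1]]                   # B = A1 + A2 = I: condition (2)
# rank(A1) + rank(A2) = 1 + 1 = 2 = rank(B), as in the block display above.
```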
*1.7 Maxima, Minima, and Jacobians
• Theorem 1.70 If y = f(x1, x2, ..., xn) is a function of n variables and if f is continuous and has partial derivatives of all orders, then y can attain its maxima and minima only at the points where

∂y/∂x1 = ∂y/∂x2 = ... = ∂y/∂xn = 0
• Theorem 1.71 If f(x1, x2, ..., xn) is such that ∂f/∂x1 = ∂f/∂x2 = ... = ∂f/∂xn = 0 at a point, then f has a minimum at that point if the matrix K whose ijth element is ∂²f/∂x_i∂x_j, evaluated at the point, is positive definite (and a maximum if K is negative definite).
Sometimes the variables x_i are not independent but are subject to constraints. For example, suppose it is necessary to minimize the function f(x1, x2, ..., xn) subject to the condition h(x1, x2, ..., xn) = 0. Since the x_i are not independent, Theorems 1.70 and 1.71 will not necessarily give the desired result. If the equation h(x1, x2, ..., xn) = 0 could be solved for one of the x_i and this value substituted into f(x1, x2, ..., xn), then Theorems 1.70 and 1.71 could be applied.
As an example, suppose we want to find the minimum of

f = x1² + x2² − 2x1 − 6x2

Using Theorem 1.70, we get

∂f/∂x1 = 2x1 − 2 = 0
∂f/∂x2 = 2x2 − 6 = 0

so x1 = 1 and x2 = 3. The matrix of second partial derivatives is

K = [∂²f/∂x1²     ∂²f/∂x1∂x2]   =   [2  0]
    [∂²f/∂x2∂x1   ∂²f/∂x2²  ]       [0  2]

K is positive definite; so f has a minimum at the point x1 = 1, x2 = 3.
If f is to be minimized subject to the constraint x1 + x2 = 1, we proceed as follows. Solving the constraint for x1 gives x1 = 1 − x2, and substituting into f gives

f = (1 − x2)² + x2² − 2(1 − x2) − 6x2 = 2x2² − 6x2 − 1

Then

∂f/∂x2 = 4x2 − 6 = 0

so x2 = 3/2 and x1 = 1 − x2 = −1/2. Since ∂²f/∂x2² = 4 > 0, K consists of a single positive element and is positive definite; the minimum of f subject to the constraint is at the point x1 = −1/2, x2 = 3/2.

An alternative procedure, useful if the constraint equation is difficult to solve for one of the variables, is the method of Lagrange multipliers.
• Theorem 1.72 If f(x1, x2, ..., xn) and the constraint h(x1, x2, ..., xn) = 0 are such that all first partial derivatives are continuous, then the maximum or minimum of f(x1, x2, ..., xn) subject to the constraint h(x1, x2, ..., xn) = 0 can occur only at a point where the derivatives of F = f(x1, x2, ..., xn) − λh(x1, x2, ..., xn) vanish; i.e., where

∂F/∂x1 = ∂F/∂x2 = ... = ∂F/∂xn = ∂F/∂λ = 0
Suppose g(x1, x2, ..., xn) is a frequency-density function of the continuous variables x1, x2, ..., xn; that is,

(1) g(x1, x2, ..., xn) ≥ 0

(2) ∫ ... ∫ g(x1, x2, ..., xn) dx1 dx2 ... dxn = 1, the integral extending from −∞ to ∞ in each variable

If x_i = h_i(y1, y2, ..., yn), i = 1, 2, ..., n, is the transformation from the variables x_i to the new variables y_i, the frequency-density function in terms of the new variables y1, y2, ..., yn is given by

k(y1, y2, ..., yn) = g(h1, h2, ..., hn)|J|
(we shall assume certain regularity conditions on the transformation equations). The symbol |J| denotes the absolute value of the Jacobian of the transformation. The Jacobian is the determinant of a matrix K whose ijth element is ∂x_i/∂y_j.
For example, if

f(x1, x2) = (1/π) e^{−x1² − x2²}        −∞ < x_i < ∞

is a frequency-density function and we want to transform to the new variables y1, y2 and find the corresponding frequency-density function k(y1, y2), where the transformation is

x1 = 4y1 + y2
x2 = 2y1 − y2

we have

K = [∂x1/∂y1  ∂x1/∂y2]   =   [4   1]
    [∂x2/∂y1  ∂x2/∂y2]       [2  −1]

J = −6 and |J| = 6. So we have

k(y1, y2) = (6/π) e^{−(4y1 + y2)² − (2y1 − y2)²}
Thus it is quite clear that the Jacobian will play an important part in the theory of probability distributions.
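The role of |J| can be checked numerically for a linear transformation like the one in the example (we assume x1 = 4y1 + y2, x2 = 2y1 − y2): the transformed density integrates to 1 only when the factor |J| = 6 is included:

```python
import math

# Jacobian matrix of x1 = 4 y1 + y2, x2 = 2 y1 - y2 is K = [[4, 1], [2, -1]].
J = 4 * (-1) - 1 * 2       # determinant: J = -6, so |J| = 6
assert J == -6

def k(y1, y2):
    """Transformed density (6/pi) * exp(-(4y1+y2)^2 - (2y1-y2)^2)."""
    x1, x2 = 4 * y1 + y2, 2 * y1 - y2
    return abs(J) / math.pi * math.exp(-x1 * x1 - x2 * x2)

# Midpoint-rule integration over a grid wide enough to capture the mass:
step = 0.01
total = sum(k(-2 + (i + 0.5) * step, -4 + (j + 0.5) * step)
            for i in range(400) for j in range(800)) * step * step
assert abs(total - 1) < 1e-3   # k integrates to 1, as a density must
```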
• Theorem 1.74 If a set of transformations is given by x_i = h_i(y1, y2, ..., yn), i = 1, 2, ..., n, with inverse transformations y_i = φ_i(x1, x2, ..., xn), i = 1, 2, ..., n, then the Jacobian J of the transformation from the x_i to the y_i is the reciprocal of the Jacobian of the inverse transformation; that is, J = 1/J*, where J* is the determinant of the matrix whose ijth element is ∂y_i/∂x_j. (This is useful if J itself were difficult to obtain directly.)