Systems and Matrices
, to be the matrix in $M_{n,m}(\mathbb{F})$ whose $ij$-th entry is $\overline{a_{ji}}$. (Of course, if $\mathbb{F} = \mathbb{R}$ then $A^* = A^t$.)
Example: Let
\[
A = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}
\quad\text{and}\quad
C = \begin{pmatrix} i & 0 & 3 \\ 1 & 1+i & 0 \end{pmatrix}.
\]
Then
\[
AC = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} i & 0 & 3 \\ 1 & 1+i & 0 \end{pmatrix}
= \begin{pmatrix} 2+i & 2+2i & 3 \\ i & 0 & 3 \end{pmatrix},
\qquad
C^t = \begin{pmatrix} i & 1 \\ 0 & 1+i \\ 3 & 0 \end{pmatrix}
\quad\text{and}\quad
C^* = \begin{pmatrix} -i & 1 \\ 0 & 1-i \\ 3 & 0 \end{pmatrix}.
\]
Note that CA is not defined!
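The computations in this example can be checked in a few lines of code. Below is a minimal sketch in pure Python (complex numbers are built in; the helper names `matmul`, `transpose` and `conj_transpose` are ours, not from any library):

```python
# Pure-Python helpers reproducing the example: the product AC, the
# transpose C^t and the conjugate transpose C* of a complex matrix.

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def conj_transpose(A):
    return [[x.conjugate() for x in row] for row in transpose(A)]

A = [[1, 2], [1, 0]]
C = [[1j, 0, 3], [1, 1 + 1j, 0]]

print(matmul(A, C))        # AC, as in the example
print(transpose(C))        # C^t
print(conj_transpose(C))   # C*
```

Trying to form `matmul(C, A)` would raise an error, matching the remark that CA is not defined.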
Remark 2.2. It is often convenient to interpret the product of matrices in the following way. Let $A \in M_{m,k}(\mathbb{F})$ and $B \in M_{k,n}(\mathbb{F})$. Denote by $a_i$ ($1 \le i \le m$) the $i$-th row of $A$ and by $b_j$ ($1 \le j \le n$) the $j$-th column of $B$. Then
\[
AB = \begin{pmatrix} Ab_1 & Ab_2 & \dots & Ab_n \end{pmatrix},
\]
and also
\[
AB = \begin{pmatrix} a_1 B \\ a_2 B \\ \vdots \\ a_m B \end{pmatrix}.
\]
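Remark 2.2 is easy to test numerically. The sketch below (with two small matrices chosen by us purely for illustration) checks that the $j$-th column of $AB$ is $Ab_j$ and the $i$-th row of $AB$ is $a_iB$:

```python
# A small numerical check of Remark 2.2 on a hypothetical 2x3 and 3x2 pair.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def column(M, j):
    return [row[j] for row in M]

A = [[1, 2, 0], [3, 1, 4]]
B = [[2, 1], [0, 1], [5, 2]]
AB = matmul(A, B)

for j in range(2):
    # A b_j, computed as a 2x1 product, equals column j of AB
    Abj = matmul(A, [[x] for x in column(B, j)])
    assert [r[0] for r in Abj] == column(AB, j)

for i in range(2):
    # a_i B, computed as a 1x2 product, equals row i of AB
    assert matmul([A[i]], B)[0] == AB[i]
```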
4 LINEAR EQUATIONS, MATRICES AND DETERMINANTS
Proposition 2.3. The product of matrices satisfies the following properties:
i) for every $\lambda$ in the field, $(\lambda A)B = \lambda(AB) = A(\lambda B)$;
ii) associativity: $(AB)C = A(BC)$;
iii) left distributivity: $A(B + C) = AB + AC$;
iv) right distributivity: $(A + B)C = AC + BC$;
where, in each case, $A$, $B$ and $C$ are assumed to be of the appropriate sizes.
Proof. We only prove (iii). Let $n$ be the number of columns of $A$ (= number of rows of $B$ and $C$). Then the $ij$-th entry of $A(B+C)$ is
\[
\sum_{k=1}^{n} a_{ik}(b_{kj} + c_{kj})
= \sum_{k=1}^{n} a_{ik} b_{kj} + \sum_{k=1}^{n} a_{ik} c_{kj}
= \text{the $ij$-th entry of $AB$} + \text{the $ij$-th entry of $AC$}
= \text{the $ij$-th entry of $AB + AC$}.
\]
Using the terminology introduced in Definition 2.1, the system of linear equations of the previous section can be written as
\[
Ax = b,
\]
where
\[
A = \begin{pmatrix} 0 & 2 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 0 & 8 \end{pmatrix},
\qquad
x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 10 \\ 9 \\ 11 \end{pmatrix}.
\]
Clearly, all the information about the initial system is still contained in the matrices $A$ and $b$, and one can carry out the Gaussian elimination process directly on the augmented matrix $(A|b)$. The process translates to the matrix setting as follows:
Let us suppose there are no zero columns in $(A|b)$; otherwise, discard them. First, by interchanging rows if necessary, make sure that the first entry of $(A|b)$ is $\ne 0$. Then multiply the (new) first row by the inverse of the (new) first entry, and use it as a pivot to annihilate all the other entries in the first column. Apply the same process to the (new) submatrix of $(A|b)$ obtained by discarding the first row. Keep repeating the whole process until it is no longer possible.
Clearly, the process just described can be carried out on any matrix. For convenience, we shall refer to it also as Gaussian elimination.
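The process just described can be sketched in code. The following is one possible implementation in Python, using exact rational arithmetic; it makes each pivot equal to 1 and annihilates all other entries in the pivot's column, as in the description above (the function name is ours):

```python
from fractions import Fraction

# A sketch of the Gaussian elimination process described above, with
# exact arithmetic.  The matrix is given as a list of rows.

def gaussian_elimination(M):
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0                                # current pivot row
    for c in range(cols):
        # find a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                     # nothing to do in this column
        M[r], M[pivot] = M[pivot], M[r]  # interchange rows
        inv = 1 / M[r][c]
        M[r] = [inv * x for x in M[r]]   # make the pivot equal to 1
        for i in range(rows):            # annihilate the other entries
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M
```

For instance, `gaussian_elimination([[2, 1, 5], [1, 3, 5]])` reduces the augmented matrix of the system $2x + y = 5$, $x + 3y = 5$ and reads off the solution $x = 2$, $y = 1$.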
Thus far, we have defined the sum and product of matrices, and the product of a matrix by a scalar. Moreover, if we have two matrices $A$ and $B$ of the same size we can also define their difference $A - B := A + (-1 \cdot B)$.
What about the quotient of matrices?
To answer this question, let us first look at the quotient $m/n$ of two natural numbers. One possible way of interpreting it is as the product $m \cdot n^{-1}$, where $n^{-1}$ is defined by the identity $n \cdot n^{-1} = 1$. Clearly, to carry this idea to the matrix setting all that is needed is a notion of inverse for matrices, and in turn, a candidate for the role of the number 1. It turns out that, for each size $n$, there is a natural candidate for this role, namely, the matrix
\[
I_n = \begin{pmatrix}
1 & 0 & \dots & 0 \\
0 & 1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & 1
\end{pmatrix} \in M_n(\mathbb{F}).
\]
We shall refer to $I_n$ as the identity matrix of size $n$. You can check that, for every $m \in \mathbb{N}$ and any pair of matrices $A \in M_{m,n}(\mathbb{F})$ and $B \in M_{n,m}(\mathbb{F})$, one has that $AI_n = A$ and $I_n B = B$. In what follows, we will simply write $I$ if the size $n$ is clear from context.
We can now define the notion of inverse for matrices as follows.
Definition 2.4. A matrix $A \in M_n(\mathbb{F})$ is said to be invertible if there exists a matrix $A^{-1} \in M_n(\mathbb{F})$, so that
\[
AA^{-1} = I = A^{-1}A,
\]
where $I$ stands for the identity matrix. The matrix $A^{-1}$ is called the inverse of $A$.
Remark 2.5. Notice that if we have a system of linear equations in matrix form, $Ax = b$ say, and $A$ happens to be invertible, then $Ax = b \Rightarrow A^{-1}(Ax) = A^{-1}b \Rightarrow (A^{-1}A)x = A^{-1}b \Rightarrow Ix = A^{-1}b \Rightarrow x = A^{-1}b$. In the next section, we will deduce a formula for $A^{-1}$ in terms of the entries of $A$, which, combined with the latter, will give a formula for $x$. Unfortunately, this formula for the inverse, though elegant, won't provide an efficient way of computing it.
Two natural questions can be raised at this point:
1) How do we know whether a given square matrix has an inverse or not?
2) Knowing that the inverse of a matrix exists, is there any efficient way to compute it?
We will concentrate in this section on answering the second question and postpone dealing with the first until the next section.
Let us start by noting that the operations (which from now onwards we shall refer to as elementary row transformations) of
i) multiplying a row of a matrix by a number different from zero;
ii) exchanging two rows of a matrix; and
iii) adding to a row of a matrix another row of the same matrix multiplied by a number;
can be seen as the result of multiplying the given matrix on the left by a matrix of one of the following types:
\[
\begin{pmatrix}
1 & & & & \\
& \ddots & & & \\
& & \lambda & & \\
& & & \ddots & \\
& & & & 1
\end{pmatrix},
\qquad
\begin{pmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 0 & \cdots & 1 & \\
& & \vdots & & \vdots & \\
& & 1 & \cdots & 0 & \\
& & & & & \ddots
\end{pmatrix}
\qquad\text{or}\qquad
\begin{pmatrix}
1 & & & & \\
& \ddots & & \lambda & \\
& & \ddots & & \\
& & & \ddots & \\
& & & & 1
\end{pmatrix},
\]
where $\lambda \ne 0$. Matrices of these types are called elementary.
Rather than giving a rigorous proof of this fact (which wouldn't be very illuminating), let us illustrate it by means of a concrete example.
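In code, the three types of elementary matrices can be generated as follows (a sketch, with the rationals as the field and helper names of our own choosing):

```python
from fractions import Fraction

# Generators for the three types of elementary matrices.  Multiplying a
# matrix on the left by each of them performs the corresponding
# elementary row transformation.

def identity(n):
    return [[Fraction(i == j) for j in range(n)] for i in range(n)]

def scale_row(n, i, lam):        # type (i): multiply row i by lam != 0
    E = identity(n)
    E[i][i] = Fraction(lam)
    return E

def swap_rows(n, i, j):          # type (ii): exchange rows i and j
    E = identity(n)
    E[i][i] = E[j][j] = Fraction(0)
    E[i][j] = E[j][i] = Fraction(1)
    return E

def add_multiple(n, i, j, lam):  # type (iii): add lam * (row j) to row i
    E = identity(n)
    E[i][j] = Fraction(lam)
    return E

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]
```

For instance, `matmul(swap_rows(3, 0, 1), A)` interchanges the first two rows of a $3 \times 3$ matrix `A`.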
Example: Below is a sequence of elementary row transformations reducing the matrix
\[
A = \begin{pmatrix} 0 & 1 & 0 \\ 2 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\]
to the identity matrix. Each transformation is followed by its corresponding interpretation as multiplication on the left by an elementary matrix, $E_i$.
Interchange the first and the second rows of $A$:
\[
\underbrace{\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_1}
\overbrace{\begin{pmatrix} 0 & 1 & 0 \\ 2 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}}^{A}
= \overbrace{\begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix}}^{A_1}
\]
Multiply the first row of $A_1$ by $\frac{1}{2}$:
\[
\underbrace{\begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_2}
\overbrace{\begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix}}^{A_1}
= \overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix}}^{A_2}
\]
Add to the third row of $A_2$ its first row multiplied by $-1$:
\[
\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}}_{E_3}
\overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix}}^{A_2}
= \overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 1 & -\frac{1}{2} \end{pmatrix}}^{A_3}
\]
Add to the third row of $A_3$ its second row multiplied by $-1$:
\[
\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}}_{E_4}
\overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 1 & -\frac{1}{2} \end{pmatrix}}^{A_3}
= \overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{2} \end{pmatrix}}^{A_4}
\]
Multiply the third row of $A_4$ by $-2$:
\[
\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}}_{E_5}
\overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{2} \end{pmatrix}}^{A_4}
= \overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}^{A_5}
\]
Add to the first row of $A_5$ its third row multiplied by $-\frac{1}{2}$:
\[
\underbrace{\begin{pmatrix} 1 & 0 & -\frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_6}
\overbrace{\begin{pmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}^{A_5}
= \overbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}^{A_6}
\]
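The whole example can be verified numerically: multiplying $A$ on the left by $E_1, \dots, E_6$ in order does produce the identity. (The matrices below are transcribed from the example above.)

```python
from fractions import Fraction

# Checking the example: the six elementary matrices, applied in order,
# reduce A to the identity matrix.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

F = Fraction
A  = [[0, 1, 0], [2, 0, 1], [1, 1, 0]]
E1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
E2 = [[F(1, 2), 0, 0], [0, 1, 0], [0, 0, 1]]
E3 = [[1, 0, 0], [0, 1, 0], [-1, 0, 1]]
E4 = [[1, 0, 0], [0, 1, 0], [0, -1, 1]]
E5 = [[1, 0, 0], [0, 1, 0], [0, 0, -2]]
E6 = [[1, 0, F(-1, 2)], [0, 1, 0], [0, 0, 1]]

M = A
for E in (E1, E2, E3, E4, E5, E6):
    M = matmul(E, M)

print(M)  # the 3x3 identity matrix
```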
Remark 2.6. One can also perform elementary transformations on the columns of a matrix. It is not hard to see that the latter can be realized as multiplication on the right by elementary matrices.
In the last example, we were able to reduce the matrix $A$ to the identity by means of row transformations of the types indicated earlier. This is in fact possible whenever the matrix $A$ is invertible. To see why, note that if we apply Gaussian elimination to an invertible matrix we should obtain, by the end of this process, an upper triangular matrix (i.e., a matrix all of whose entries below the main diagonal are zero) with ones on the main diagonal. Otherwise, some of the last rows would have to be zero, which is impossible because:
- each time we perform an elementary row transformation on the rows of $A$, it amounts to multiplying by an elementary matrix;
- elementary matrices are invertible and the product of invertible matrices is invertible (see Tutorial 1, Exercise 3.c.1), so the matrix obtained by the end of the Gaussian elimination process must be invertible; and
- an invertible matrix must have at least one non-zero entry in each row and each column (see Tutorial 1, Exercise 6).
Then, it is not hard to see that from this upper triangular matrix it is always possible to reach the identity by means of elementary row transformations, as in the last example.
Thus, if $A$ is invertible then there is a sequence of elementary matrices $E_1, \dots, E_n$, so that $E_n \cdots E_1 A = I$. Note also that $A^{-1} = E_n \cdots E_1$, for
\[
E_n \cdots E_1 = E_n \cdots E_1 I = E_n \cdots E_1 (AA^{-1}) = (E_n \cdots E_1 A)A^{-1} = IA^{-1} = A^{-1}.
\]
The last observation provides a method for computing the inverse of an invertible matrix, and hence, an answer to Question 2 above. Let us explain it by means of an example.
Example: Find the inverse of the matrix
\[
A = \begin{pmatrix} 1 & 0 & 1 \\ 2 & 1 & 3 \\ 1 & 1 & 0 \end{pmatrix}.
\]
Solution: Form the augmented matrix
\[
(A|I) = \left(\begin{array}{ccc|ccc}
1 & 0 & 1 & 1 & 0 & 0 \\
2 & 1 & 3 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1
\end{array}\right).
\]
Then perform elementary row transformations on $(A|I)$ until the first block gets transformed into $I$. By the previous discussion, this will amount to multiplying $A$ and $I$ simultaneously on the left by a finite sequence of elementary matrices $E_1, \dots, E_n$, and so, by the end of this process we will have
\[
\big(\,\underbrace{E_n \cdots E_1 A}_{I} \;\big|\; \underbrace{E_n \cdots E_1}_{A^{-1}}\,\big),
\]
i.e., the matrix resulting in the second block will be precisely the inverse of $A$.
A possible sequence of transformations for the given matrix is:
\[
(A|I) \to
\left(\begin{array}{ccc|ccc}
1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & -2 & 1 & 0 \\
0 & 1 & -1 & -1 & 0 & 1
\end{array}\right)
\to
\left(\begin{array}{ccc|ccc}
1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & -2 & 1 & 0 \\
0 & 0 & -2 & 1 & -1 & 1
\end{array}\right)
\to
\left(\begin{array}{ccc|ccc}
1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & -2 & 1 & 0 \\
0 & 0 & 1 & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2}
\end{array}\right)
\to
\left(\begin{array}{ccc|ccc}
1 & 0 & 0 & \frac{3}{2} & -\frac{1}{2} & \frac{1}{2} \\
0 & 1 & 0 & -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \\
0 & 0 & 1 & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2}
\end{array}\right).
\]
So,
\[
A^{-1} = \begin{pmatrix}
\frac{3}{2} & -\frac{1}{2} & \frac{1}{2} \\
-\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \\
-\frac{1}{2} & \frac{1}{2} & -\frac{1}{2}
\end{pmatrix}.
\]
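The $(A|I)$ method is straightforward to implement. Below is a sketch in Python with exact fractions; applied to the matrix $A$ of this example it reproduces the inverse just computed (the function name is ours):

```python
from fractions import Fraction

# A sketch of the (A|I) method: run Gauss-Jordan elimination on A while
# applying every row transformation to a copy of I alongside it.
# Raises ValueError if A turns out not to be invertible.

def inverse(A):
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]            # the augmented matrix (A|I)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[c], M[pivot] = M[pivot], M[c]         # interchange rows
        inv = 1 / M[c][c]
        M[c] = [inv * x for x in M[c]]          # make the pivot 1
        for r in range(n):                      # annihilate the other entries
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]               # the right-hand block

A = [[1, 0, 1], [2, 1, 3], [1, 1, 0]]
print(inverse(A))   # the inverse computed in the example above
```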
The relationship between invertible and elementary matrices can be summarized as follows.
Proposition 2.7. A matrix is invertible if and only if it is a product of elementary matrices.
Proof. Since elementary matrices are invertible and the product of invertible matrices is invertible, a product of elementary matrices is invertible. Conversely, if $A$ is invertible then there are elementary matrices $E_1, \dots, E_n$ such that $E_n \cdots E_1 A = I \Rightarrow A = E_1^{-1} \cdots E_n^{-1}$ (again, because elementary matrices are invertible). Since the inverse of an elementary matrix is again elementary, we are done.
3. Determinants
Let us start with the following.
Definition 3.1 (Permutations). A permutation of a set $S$ is a bijective map $\sigma : S \to S$. A permutation that exchanges two elements of a set and leaves all the others invariant is called a transposition. Given $x, y \in S$, the transposition that exchanges $x$ and $y$ is commonly denoted by $(x, y)$.
Every permutation $\sigma$ can be represented as a composition of transpositions. Although this representation is not unique, it can be shown that the parity of the number of transpositions figuring in any representation of $\sigma$ is always the same. We define $\mathrm{sgn}(\sigma)$ to be $1$ if the number of transpositions figuring in any representation of $\sigma$ is even and $-1$ if it is odd.
Example: Let
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 1 & 2 & 4 & 3 \end{pmatrix},
\]
where each element in the second row is the image under $\sigma$ of the element right above it in the first row, so $\sigma(1) = 5$, $\sigma(2) = 1$, $\sigma(3) = 2$, $\sigma(4) = 4$ and $\sigma(5) = 3$. Then
\[
\sigma = (1,5)(5,3)(3,2) \quad\text{and also}\quad \sigma = (3,4)(2,4)(1,3)(3,4)(5,2)(2,4)(4,1).
\]
The number of transpositions in the first decomposition of $\sigma$ is 3 and in the second is 7. Both are odd, so $\mathrm{sgn}(\sigma) = -1$.
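The sign of a permutation can be computed without exhibiting a decomposition explicitly: a cycle of length $L$ is a product of $L - 1$ transpositions, so it suffices to count cycles. A sketch (representing a permutation as a dictionary is our choice):

```python
# sgn(sigma) via cycle counting: each cycle of length L contributes
# L - 1 transpositions, so a cycle of even length flips the sign.

def sign(perm):
    seen, sgn = set(), 1
    for start in perm:
        if start in seen:
            continue
        # walk the cycle containing `start`
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        if (length - 1) % 2 == 1:
            sgn = -sgn
    return sgn

sigma = {1: 5, 2: 1, 3: 2, 4: 4, 5: 3}
print(sign(sigma))   # -1
```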
Definition 3.2 (Determinant). Let $A \in M_n(\mathbb{F})$. The determinant of $A$, denoted $\det A$, is defined by the formula
\[
\det A := \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1)1}\, a_{\sigma(2)2} \dots a_{\sigma(n)n},
\]
where $S_n$ stands for the family of all permutations of the ordered set $\{1, 2, \dots, n\}$.
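Definition 3.2 translates directly into code, although the resulting algorithm sums $n!$ terms and is only practical for very small $n$. A sketch:

```python
from itertools import permutations

# A direct (and very inefficient) implementation of Definition 3.2.
# Here a permutation is a tuple p with p[i] = sigma(i), 0-indexed.

def sign(perm):
    seen, sgn = set(), 1
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:          # walk the cycle containing start
            seen.add(x)
            x = perm[x]
            length += 1
        if (length - 1) % 2 == 1:     # cycle of length L ~ L-1 transpositions
            sgn = -sgn
    return sgn

def prod_entries(A, p):
    result = 1
    for j, i in enumerate(p):         # the factor a_{sigma(j) j}
        result *= A[i][j]
    return result

def det(A):
    n = len(A)
    # sum over all sigma in S_n of sgn(sigma) a_{sigma(1)1} ... a_{sigma(n)n}
    return sum(sign(p) * prod_entries(A, p) for p in permutations(range(n)))

print(det([[1, 2], [3, 4]]))          # 1*4 - 3*2 = -2
```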
Proposition 3.3 (Properties of the det function). Let $A \in M_n(\mathbb{F})$. Then
i) If $B$ is obtained from $A$ by multiplying one column of $A$ by a scalar $\lambda$ then $\det B = \lambda \det A$, i.e.,
\[
\det \underbrace{\begin{pmatrix}
a_{11} & \cdots & \lambda a_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & \lambda a_{2i} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & \lambda a_{ni} & \cdots & a_{nn}
\end{pmatrix}}_{B}
= \lambda \det \underbrace{\begin{pmatrix}
a_{11} & \cdots & a_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2i} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & a_{ni} & \cdots & a_{nn}
\end{pmatrix}}_{A}.
\]
(In particular, if $A$ has a zero column then $\det A = 0$.)
ii) Suppose $B$ and $C$ differ from $A$ only by the $i$-th column, and that the $i$-th column of $A$ is the sum of the $i$-th columns of $B$ and $C$. Then $\det A = \det B + \det C$, i.e.,
\[
\det \underbrace{\begin{pmatrix}
a_{11} & \cdots & b_{1i} + c_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & b_{2i} + c_{2i} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & b_{ni} + c_{ni} & \cdots & a_{nn}
\end{pmatrix}}_{A}
= \det \underbrace{\begin{pmatrix}
a_{11} & \cdots & b_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & b_{2i} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & b_{ni} & \cdots & a_{nn}
\end{pmatrix}}_{B}
+ \det \underbrace{\begin{pmatrix}
a_{11} & \cdots & c_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & c_{2i} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & c_{ni} & \cdots & a_{nn}
\end{pmatrix}}_{C}.
\]
iii) If $B$ is obtained from $A$ by exchanging two columns then $\det B = -\det A$, i.e.,
\[
\det \underbrace{\begin{pmatrix}
a_{11} & \cdots & a_{1j} & \cdots & a_{1i} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2j} & \cdots & a_{2i} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & a_{nj} & \cdots & a_{ni} & \cdots & a_{nn}
\end{pmatrix}}_{B}
= -\det \underbrace{\begin{pmatrix}
a_{11} & \cdots & a_{1i} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2i} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & a_{ni} & \cdots & a_{nj} & \cdots & a_{nn}
\end{pmatrix}}_{A}.
\]
(In particular, if $A$ has two equal columns then $\det A = 0$.)
iv) $\det I = 1$.
v) $\det$ is the only function from $M_n(\mathbb{F})$ to $\mathbb{F}$ with all the above properties!
Proof. (i) $\det B = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1)1} \dots (\lambda a_{\sigma(i)i}) \dots a_{\sigma(n)n} = \lambda \det A$.
(ii) Its proof is similar to that of (i), so we leave it to you.
(iii) Suppose $B$ is obtained from $A$ by exchanging its $i$-th and $j$-th columns. For each $\sigma \in S_n$ define $\tau := \sigma(i,j)$. Then $\tau(i) = \sigma(j)$, $\tau(j) = \sigma(i)$ and $\tau(k) = \sigma(k)$ for any other $k$ different from $i$ and $j$.
\[
\begin{aligned}
\det B &= \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1)1} \dots a_{\sigma(i)j} \dots a_{\sigma(j)i} \dots a_{\sigma(n)n} \\
&= \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1)1} \dots a_{\sigma(j)i} \dots a_{\sigma(i)j} \dots a_{\sigma(n)n} \\
&= -\sum_{\tau \in S_n} \mathrm{sgn}(\tau)\, a_{\tau(1)1} \dots a_{\tau(i)i} \dots a_{\tau(j)j} \dots a_{\tau(n)n} = -\det A,
\end{aligned}
\]
where the last equality follows on noting that the map $S_n \to S_n$, $\sigma \mapsto \tau$, is a bijection.
(iv) Simply note that the only summand different from zero in the formula for $\det I$ is the one corresponding to $\sigma = \mathrm{id}$.
(v) The proof of this part boils down to showing that if $f : M_n(\mathbb{F}) \to \mathbb{F}$ is a function that satisfies (i)–(iv) then, for every $A \in M_n(\mathbb{F})$, the value $f(A)$ is completely determined by these properties. Let us illustrate the idea in the case of $2 \times 2$ matrices. In this case, we have that
\[
\begin{aligned}
f\left(\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\right)
&= f\left(\begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix}\right) + f\left(\begin{pmatrix} 0 & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\right) \\
&= f\left(\begin{pmatrix} a_{11} & a_{12} \\ 0 & 0 \end{pmatrix}\right) + f\left(\begin{pmatrix} a_{11} & 0 \\ 0 & a_{22} \end{pmatrix}\right)
 + f\left(\begin{pmatrix} 0 & a_{12} \\ a_{21} & 0 \end{pmatrix}\right) + f\left(\begin{pmatrix} 0 & 0 \\ a_{21} & a_{22} \end{pmatrix}\right) \\
&= a_{11}a_{12}\, f\left(\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}\right) + a_{11}a_{22}\, f\left(\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right)
 + a_{21}a_{12}\, f\left(\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\right) + a_{21}a_{22}\, f\left(\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}\right) \\
&= a_{11}a_{22} - a_{21}a_{12},
\end{aligned}
\]
where the first two equalities follow from (ii), the third equality follows from (i), and the last follows from (iii) and (iv). This shows that, for $2 \times 2$ matrices, there can only be one function $f$ with the above properties.
For matrices of other sizes one can proceed in a similar way.
Also, as a direct consequence of Definition 3.2, we have the following.
Proposition 3.4. $\det A^t = \det A$ $(A \in M_n(\mathbb{F}))$.
Proof.
\[
\begin{aligned}
\det A^t &= \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{1\sigma(1)}\, a_{2\sigma(2)} \dots a_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma^{-1}(\sigma(1))\sigma(1)}\, a_{\sigma^{-1}(\sigma(2))\sigma(2)} \dots a_{\sigma^{-1}(\sigma(n))\sigma(n)} \\
&= \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma^{-1}(1)1}\, a_{\sigma^{-1}(2)2} \dots a_{\sigma^{-1}(n)n} \\
&= \sum_{\sigma^{-1} \in S_n} \mathrm{sgn}(\sigma^{-1})\, a_{\sigma^{-1}(1)1}\, a_{\sigma^{-1}(2)2} \dots a_{\sigma^{-1}(n)n} = \det A,
\end{aligned}
\]
where the last equality follows from the facts that the inversion map $S_n \to S_n$, $\sigma \mapsto \sigma^{-1}$, is a bijection and that $\mathrm{sgn}(\sigma) = \mathrm{sgn}(\sigma^{-1})$ $(\sigma \in S_n)$. You should verify both facts.
Remark 3.5. It follows easily from this last result that Proposition 3.3 still holds if one replaces "column" by "row" everywhere in it!
Example: Assuming $abc \ne 0$, show that
\[
\det \begin{pmatrix} bc & ca & ab \\ a & b & c \\ a^2 & b^2 & c^2 \end{pmatrix}
= \det \begin{pmatrix} 1 & 1 & 1 \\ a^2 & b^2 & c^2 \\ a^3 & b^3 & c^3 \end{pmatrix}.
\]
By Proposition 3.3, one successively finds
\[
\det \begin{pmatrix} bc & ca & ab \\ a & b & c \\ a^2 & b^2 & c^2 \end{pmatrix}
= abc \det \begin{pmatrix} a^{-1} & b^{-1} & c^{-1} \\ a & b & c \\ a^2 & b^2 & c^2 \end{pmatrix}
= bc \det \begin{pmatrix} 1 & b^{-1} & c^{-1} \\ a^2 & b & c \\ a^3 & b^2 & c^2 \end{pmatrix}
= c \det \begin{pmatrix} 1 & 1 & c^{-1} \\ a^2 & b^2 & c \\ a^3 & b^3 & c^2 \end{pmatrix}
= \det \begin{pmatrix} 1 & 1 & 1 \\ a^2 & b^2 & c^2 \\ a^3 & b^3 & c^3 \end{pmatrix}.
\]
Exercise: Verify that the determinant of an elementary matrix is different from zero.
Another important property of the det function is the following.
Corollary 3.6. $\det(AB) = \det A \det B$ $(A, B \in M_n(\mathbb{F}))$. In particular, if $A$ is invertible then $\det A \ne 0$ and $\det(A^{-1}) = (\det A)^{-1}$.
Sketch of the proof. If $\det A \ne 0$ then define $\varphi : M_n(\mathbb{F}) \to \mathbb{F}$, $X \mapsto (\det A)^{-1}\det(AX)$. Noting that the $i$-th column of $AX$ is the product of $A$ with the $i$-th column of $X$ (see Remark 2.2), it is easy to verify that $\varphi$ satisfies (i)–(iv) of Proposition 3.3, so, by the final part of the same proposition, $\det B = \varphi(B) = (\det A)^{-1}\det(AB)$.
If $\det A = 0$ then $A$ cannot be invertible (use Proposition 2.7). Let $\widetilde{A}$ be the matrix obtained by applying the Gaussian elimination process to $A$, so $\widetilde{A} = E_n \cdots E_1 A$ for some elementary matrices $E_1, \dots, E_n$. Then $A$ non-invertible $\Rightarrow$ $\widetilde{A}$ non-invertible $\Rightarrow$ $\widetilde{A}$ has zero rows $\Rightarrow$ $\widetilde{A}B$ has zero rows $\Rightarrow$ $\det(\widetilde{A}B) = 0$ $\Rightarrow$ $\det(AB) = 0$.
If $A$ is invertible then $AA^{-1} = I$ and, by the previous part, $\det A \det(A^{-1}) = \det I = 1$, so $\det(A^{-1}) = (\det A)^{-1}$.
It is apparent, however, that for practical purposes, the formula given in Definition 3.2 is not particularly useful. So, how do we compute the determinant of a matrix?
Definition 3.7 (Minors, cofactors and adjugate). Let $A \in M_n(\mathbb{F})$. The $ij$-th minor of $A$, denoted $M_{ij}(A)$, is defined to be the determinant of the $(n-1) \times (n-1)$ matrix obtained from $A$ by deleting its $i$-th row and its $j$-th column. The $ij$-th cofactor of $A$, denoted $C_{ij}(A)$, is then defined by $C_{ij}(A) := (-1)^{i+j} M_{ij}(A)$. The matrix $(C_{ij}(A))$ is called the cofactors matrix of $A$. Lastly, the adjugate of $A$, denoted $\mathrm{adj}(A)$, is defined to be the transpose of the cofactors matrix, i.e., $\mathrm{adj}(A) = (C_{ij}(A))^t$.
Example: Let
\[
A = \begin{pmatrix} 1 & 0 & 2 \\ 3 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Then
\[
M_{13}(A) = \det \begin{pmatrix} 3 & 1 \\ 0 & 1 \end{pmatrix} = 3,
\qquad
C_{21}(A) = (-1)^{2+1} M_{21}(A) = -\det \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix} = 2
\]
and
\[
\mathrm{adj}(A) = \begin{pmatrix} -1 & 2 & -2 \\ 0 & 0 & 5 \\ 3 & -1 & 1 \end{pmatrix}.
\]
Proposition 3.8. Let $A \in M_n(\mathbb{F})$. Then
\[
\det A = \sum_{k=1}^{n} a_{ik}\, C_{ik}(A) = \sum_{k=1}^{n} a_{kj}\, C_{kj}(A) \qquad (1 \le i, j \le n).
\]
Sketch of the proof. By Proposition 3.3, it suffices to show that both expressions on the right-hand side of the last formula satisfy (i)–(iv) of the same proposition.
Example: Let $A$ be the matrix of the previous example. Then
\[
\det A = a_{11} C_{11}(A) + a_{12} C_{12}(A) + a_{13} C_{13}(A)
= 1 \cdot \det \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
- 0 \cdot \det \begin{pmatrix} 3 & 1 \\ 0 & 0 \end{pmatrix}
+ 2 \cdot \det \begin{pmatrix} 3 & 1 \\ 0 & 1 \end{pmatrix}
= -1 + 6 = 5.
\]
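The cofactor expansion of Proposition 3.8 gives a simple recursive determinant. A sketch (expanding along the first row, as in this example; the helper names are ours):

```python
# Recursive Laplace expansion along the first row, with minors obtained
# by deleting one row and one column.

def minor_matrix(A, i, j):
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    # expand along row 0: the sum of a_{1k} * C_{1k}(A)
    return sum((-1) ** k * A[0][k] * det(minor_matrix(A, 0, k))
               for k in range(n))

A = [[1, 0, 2], [3, 1, 1], [0, 1, 0]]
print(det(A))   # 5
```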
Example: Show that the determinant of an upper triangular matrix is the product of its diagonal entries.
We prove this by induction on the size of the matrix. The statement is trivially true for $n = 1$. Suppose it has been established for $n = k$, and let $A = (a_{ij}) \in M_{k+1}(\mathbb{F})$ be upper triangular. By Proposition 3.8, $\det A = a_{k+1,k+1}\, C_{k+1,k+1}(A) = a_{k+1,k+1}\, M_{k+1,k+1}(A)$. The matrix obtained from $A$ by deleting its $(k+1)$-th row and its $(k+1)$-th column is a $k \times k$ upper triangular matrix, so, by the induction hypothesis, $M_{k+1,k+1}(A) = a_{11} a_{22} \dots a_{kk}$. Whence the desired result.
Lemma 3.9. For every $A \in M_n(\mathbb{F})$,
\[
(4)\qquad A\,\mathrm{adj}(A) = (\det A)\, I = \mathrm{adj}(A)\, A.
\]
Proof. Let $C = A\,\mathrm{adj}(A)$. Then
\[
c_{ij} = \sum_{k=1}^{n} a_{ik}\, C_{jk}(A) =
\begin{cases}
\det A, & \text{if } i = j, \\
0, & \text{if } i \ne j,
\end{cases}
\]
i.e., $C = (\det A)\, I$. Similarly, one shows that $\mathrm{adj}(A)\, A = (\det A)\, I$.
Proposition 3.10. Let $A \in M_n(\mathbb{F})$. If $\det A \ne 0$ then $A$ is invertible and
\[
A^{-1} = (\det A)^{-1}\, \mathrm{adj}(A).
\]
Proof. If $\det A \ne 0$ then, dividing (4) by $\det A$, one obtains that $A\big((\det A)^{-1}\mathrm{adj}(A)\big) = I = \big((\det A)^{-1}\mathrm{adj}(A)\big)A$, i.e., $A$ is invertible with inverse $(\det A)^{-1}\mathrm{adj}(A)$.
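Proposition 3.10 can be turned into code directly. The sketch below builds the adjugate from the cofactor definitions and divides by the determinant (exact fractions; helper names ours):

```python
from fractions import Fraction

# The inverse via the adjugate: A^{-1} = (det A)^{-1} adj(A).

def minor_matrix(A, i, j):
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k] * det(minor_matrix(A, 0, k))
               for k in range(len(A)))

def adjugate(A):
    n = len(A)
    # transpose of the cofactors matrix: adj(A)_{ij} = C_{ji}(A)
    return [[(-1) ** (i + j) * det(minor_matrix(A, j, i)) for j in range(n)]
            for i in range(n)]

def inverse(A):
    d = Fraction(det(A))
    if d == 0:
        raise ValueError("det A = 0: Proposition 3.10 does not apply")
    return [[c / d for c in row] for row in adjugate(A)]
```

Applied to the matrix of the adjugate example above, `adjugate` reproduces the matrix computed there.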
Remark 3.13. It is not hard to verify that the area (resp. the volume) of the parallelogram (resp. parallelepiped) spanned by a pair of vectors $(x_1, y_1)$ and $(x_2, y_2)$ in $\mathbb{R}^2$ (resp. by vectors $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ and $(x_3, y_3, z_3)$ in $\mathbb{R}^3$) equals
\[
\left| \det \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix} \right|
\qquad
\left(\text{resp. } \left| \det \begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{pmatrix} \right|\,\right).
\]
By analogy, for $n > 3$, the volume of the $n$-dimensional parallelepiped spanned by a set of vectors $(v_{11}, v_{21}, \dots, v_{n1}), (v_{12}, v_{22}, \dots, v_{n2}), \dots, (v_{1n}, v_{2n}, \dots, v_{nn})$ in $\mathbb{R}^n$ is defined by the formula
\[
\left| \det \begin{pmatrix}
v_{11} & v_{12} & \dots & v_{1n} \\
v_{21} & v_{22} & \dots & v_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
v_{n1} & v_{n2} & \dots & v_{nn}
\end{pmatrix} \right|,
\]
so, not surprisingly, determinants arise naturally in the study of multiple integrals, most notably in the Change of Variables Theorem.
4. Polynomials of matrices
Recall that a polynomial with coefficients in $\mathbb{F}$ is an expression of the form
\[
a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0,
\]
where $a_0, a_1, \dots, a_n \in \mathbb{F}$. The numbers $a_0, a_1, \dots, a_n$ are called the coefficients of the polynomial and the largest $m$ for which $a_m \ne 0$ is called its degree. The letter $x$ is called the variable or indeterminate of the polynomial. We shall use the notation $\mathbb{F}[x]$ to denote the collection of all polynomials in the variable $x$ with coefficients in $\mathbb{F}$.
One could think of a polynomial as an infinite (formal) sum of the form $a_0 + a_1 x + \dots + a_n x^n + \dots$ in which all but finitely many $a_i$'s are zero. So the $i$-th coefficient of a polynomial of degree $n$ is zero for every $i > n$. With this point of view in mind one can then define two polynomials $p(x) = a_0 + a_1 x + \dots + a_n x^n$ and $q(x) = b_0 + b_1 x + \dots + b_m x^m$ in $\mathbb{F}[x]$ to be equal if they have the same coefficients. Furthermore, one can define
- their sum, denoted $(p+q)(x)$, to be the polynomial whose $i$-th coefficient is $a_i + b_i$; and
- their product, denoted $(pq)(x)$, to be the polynomial whose $i$-th coefficient is $\sum_{k=0}^{i} a_k b_{i-k}$.
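The two definitions above translate directly into code, representing a polynomial by its coefficient list $[a_0, a_1, \dots, a_n]$. A sketch:

```python
# Sum and product of polynomials given by their coefficient lists,
# lowest degree first.

def poly_add(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    # the i-th coefficient of pq is sum_{k=0}^{i} a_k b_{i-k}
    result = [0] * (len(p) + len(q) - 1)
    for k, a in enumerate(p):
        for m, b in enumerate(q):
            result[k + m] += a * b
    return result

# (1 + x)(1 - x) = 1 - x^2
print(poly_mul([1, 1], [1, -1]))   # [1, 0, -1]
```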
Recall also that a number $\lambda \in \mathbb{F}$ is called a root of a polynomial $p(x) \in \mathbb{F}[x]$ if $p(\lambda) = 0$, i.e., if the final result of substituting $x$ by $\lambda$ in the expression for $p$ and performing all the computations is zero.
Given a matrix $A \in M_n(\mathbb{F})$ we define, for every $i \in \mathbb{N}$, the $i$-th power of $A$, denoted $A^i$, by
\[
A^i := \underbrace{AA \dots A}_{i \text{ times}}.
\]
We also define $A^0 := I$. Then, for any polynomial $p(x) = a_n x^n + \dots + a_1 x + a_0 \in \mathbb{F}[x]$, we define $p(A)$ by
\[
p(A) := a_n A^n + \dots + a_1 A + a_0 I.
\]
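Evaluating $p(A)$ amounts to accumulating $a_i A^i$ while maintaining the running power of $A$. A sketch (coefficients listed from $a_0$ upward; helper names ours):

```python
# Evaluate p(A) = a_n A^n + ... + a_1 A + a_0 I for a square matrix A,
# with coefficients given as [a_0, a_1, ..., a_n].

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_at_matrix(coeffs, A):
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, A)   # next power of A
    return result
```

For instance, `poly_at_matrix([0, 0, 1], A)` computes $A^2$.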
Definition 4.1 (Characteristic polynomial). Let $A \in M_n(\mathbb{F})$. The characteristic polynomial of $A$, denoted $p_A$, is defined to be the polynomial $\det(A - xI) \in \mathbb{F}[x]$.
It follows easily from the definition of determinant that the degree of $p_A$ is equal to the size of $A$.
Theorem 4.2 (Cayley–Hamilton Theorem). Let $A \in M_n(\mathbb{F})$. Then $p_A(A) = 0$.
Proof. Let $\mathrm{adj}(A - xI) = \sum_{i=0}^{n-1} x^i B_i$. Then, by Lemma 3.9,
\[
(5)\qquad p_A(x) I = \det(A - xI)\, I = (A - xI)\left(\sum_{i=0}^{n-1} x^i B_i\right)
= -x^n B_{n-1} + \sum_{i=1}^{n-1} x^i (AB_i - B_{i-1}) + AB_0.
\]
Let $p_A(x) = a_n x^n + \dots + a_1 x + a_0$. It follows from (5) that
\[
a_n I = -B_{n-1}, \qquad a_i I = AB_i - B_{i-1} \;\; (1 \le i \le n-1) \qquad\text{and}\qquad a_0 I = AB_0.
\]
From these last one obtains that
\[
\begin{aligned}
a_n A^n &= -A^n B_{n-1}, \\
a_{n-1} A^{n-1} &= A^n B_{n-1} - A^{n-1} B_{n-2}, \\
a_{n-2} A^{n-2} &= A^{n-1} B_{n-2} - A^{n-2} B_{n-3}, \\
&\;\;\vdots \\
a_i A^i &= A^{i+1} B_i - A^i B_{i-1}, \\
&\;\;\vdots \\
a_1 A &= A^2 B_1 - A B_0, \\
a_0 I &= A B_0.
\end{aligned}
\]
Sum up all these identities. The terms on the right-hand sides cancel out while the terms on the left-hand sides add up to $p_A(A)$. The theorem is proved.
Example: The matrix
\[
A = \begin{pmatrix} 1 & 2 \\ 3 & -1 \end{pmatrix}
\]
has characteristic polynomial
\[
p_A(\lambda) = \det(A - \lambda I) = \det \begin{pmatrix} 1-\lambda & 2 \\ 3 & -1-\lambda \end{pmatrix} = (1-\lambda)(-1-\lambda) - 6 = \lambda^2 - 7,
\]
and
\[
p_A(A) = A^2 - 7I = \begin{pmatrix} 1 & 2 \\ 3 & -1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & -1 \end{pmatrix} - 7 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 0.
\]
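A quick numerical check of the example (and of the Cayley–Hamilton Theorem for this particular matrix):

```python
# For A = [[1, 2], [3, -1]], p_A(x) = x^2 - 7, so A^2 - 7I should vanish.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, -1]]
A2 = matmul(A, A)
pA = [[A2[i][j] - 7 * (i == j) for j in range(2)] for i in range(2)]
print(pA)   # [[0, 0], [0, 0]]
```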
One can go even further and define power series of matrices. For instance, one can define
\[
\exp(A) := \sum_{i=0}^{\infty} \frac{1}{i!} A^i \qquad (A \in M_n(\mathbb{F})).
\]
The computation of $\exp(A)$ is essential in solving systems of linear differential equations.
For example, the general solution of the system of equations
x