7 Diagonalization and Quadratic Forms
Diagonalization
Recall the definition of a diagonal matrix from Section 1.6.
A^7 = AAAAAAA (the product of seven copies of A).
Exercise. With A as above, work out A^16. Then try to do it directly,
without the “diagonalization”!
and one can check (exercise) that

    P^{-1} A P = [ 3  0 ; 0  −1 ].
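Diagonalization makes powers cheap: only the diagonal entries need to be raised to a power. A minimal numpy sketch, assuming a hypothetical invertible P (chosen here only for illustration) and the diagonal D = diag(3, −1) from above:

    import numpy as np

    # Hypothetical choice of P; A is then built so that P^{-1} A P = diag(3, -1).
    P = np.array([[1.0, 1.0],
                  [1.0, -1.0]])
    D = np.diag([3.0, -1.0])
    A = P @ D @ np.linalg.inv(P)

    # A^16 via diagonalization: raise only the diagonal entries to the 16th power.
    A16_diag = P @ np.diag([3.0**16, (-1.0)**16]) @ np.linalg.inv(P)

    # Direct computation for comparison.
    A16_direct = np.linalg.matrix_power(A, 16)

    print(np.allclose(A16_diag, A16_direct))   # True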
(iii) A is symmetric.
Proof. (i) ⇔ (ii) This follows from Theorems 6.6, 6.8 and 7.4 (exercise:
write down exactly how).
(i) ⇒ (iii) If (i) holds, say P^{-1}AP = D is diagonal with P orthogonal,
then we have A = PDP^{-1} = PDP^T. Clearly D is symmetric, so

    A^T = (PDP^T)^T = (P^T)^T D^T P^T = PDP^T = A.
Quadratic Forms
Definition 7.8. A quadratic form in n variables is a function f : R^n → R
of the form

    f(x) = f(x1, ..., xn) = Σ_{1≤i≤j≤n} cij xi xj.    (∗)
1. f(x1) = x1^2
4. f(x1, ..., xn) = x1^2 + x2^2 + ··· + xn^2 = ⟨v|v⟩, where v = (x1, ..., xn). So
the Euclidean inner product (see Chapter 6) gives rise to a quadratic form.
If we set aii = cii for i = 1, ..., n and aij = (1/2)cij for 1 ≤ i < j ≤ n, then
(∗) becomes
    f(x) = Σ_{i=1}^{n} aii xi^2 + Σ_{1≤i<j≤n} 2 aij xi xj.
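The coefficients aij assemble into a symmetric matrix A with f(x) = x^T Ax, as used in Theorem 7.11 below. A short numpy sketch (the coefficients here are made up purely for illustration):

    import numpy as np

    def form_matrix(c, n):
        """Symmetric matrix A of f(x) = sum_{i<=j} c[i,j] x_i x_j:
        a_ii = c_ii and a_ij = a_ji = c_ij / 2 for i < j (0-based indices)."""
        A = np.zeros((n, n))
        for (i, j), cij in c.items():
            if i == j:
                A[i, i] = cij
            else:
                A[i, j] = A[j, i] = cij / 2
        return A

    # Illustrative form: f(x1, x2) = x1^2 + 4 x1 x2 - 3 x2^2.
    A = form_matrix({(0, 0): 1.0, (0, 1): 4.0, (1, 1): -3.0}, 2)

    x = np.array([2.0, -1.0])
    print(np.isclose(x @ A @ x, x[0]**2 + 4*x[0]*x[1] - 3*x[1]**2))   # True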
Theorem 7.11. (The Principal Axes Theorem) Every quadratic form f can
be diagonalized.
More specifically, if f(x) = x^T Ax is a quadratic form in x = (x1, ..., xn)^T,
then there exists an orthogonal matrix Q such that, in the new variables
y = Q^T x, the form becomes

    f(x) = λ1 y1^2 + ... + λn yn^2,

where λ1, ..., λn are the eigenvalues of A.
From the first part of the course we know that Q is the matrix whose
columns are unit eigenvectors of the matrix A of f .
and so we have

    Q = [ 1/√3  −2/√6    0   ]
        [ 1/√3   1/√6   1/√2 ]
        [ 1/√3   1/√6  −1/√2 ].
If we set y = Q^T x, we get

    y1 = (1/√3)(x1 + x2 + x3),   y2 = (1/√6)(−2x1 + x2 + x3),   y3 = (1/√2)(x2 − x3).
Then, expressed in terms of the variables y1, y2 and y3, the quadratic form
becomes 2y1^2 − y2^2 − y3^2.
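The quadratic form being diagonalized here is consistent with f(x) = 2x1x2 + 2x1x3 + 2x2x3, whose matrix has 0 on the diagonal and 1 elsewhere; assuming that form, a quick numpy check of the change of variables:

    import numpy as np

    A = np.ones((3, 3)) - np.eye(3)    # assumed matrix: f(x) = 2x1x2 + 2x1x3 + 2x2x3

    Q = np.array([[1/np.sqrt(3), -2/np.sqrt(6),  0.0],
                  [1/np.sqrt(3),  1/np.sqrt(6),  1/np.sqrt(2)],
                  [1/np.sqrt(3),  1/np.sqrt(6), -1/np.sqrt(2)]])

    print(np.allclose(Q.T @ Q, np.eye(3)))   # Q is orthogonal
    print(np.round(Q.T @ A @ Q, 10))         # diag(2, -1, -1), so f = 2y1^2 - y2^2 - y3^2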
An immediate consequence of the Principal Axes Theorem is the follow-
ing:
Theorem 7.14. Let f (x) = xT Ax be a quadratic form with matrix A. Then
f is positive definite if and only if all the eigenvalues of A are positive.
Proof. By the Principal Axes Theorem, there exists an orthogonal matrix Q
such that

    f(x) = λ1 y1^2 + ... + λn yn^2,

where y = (y1, ..., yn)^T = Q^T x and λ1, ..., λn are the eigenvalues of A. If all the
λi are positive then f(x) > 0 except when y = 0. But this happens if and
only if x = 0, because Q^T is invertible. Therefore f is positive definite.
On the other hand if one of the eigenvalues λi ≤ 0, letting y = ei and
x = Qy we get f (x) = λi ≤ 0 and so f is not positive definite.
We say that a symmetric matrix A is positive definite if the associated
quadratic form
f (x) = xT Ax
is positive definite.
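Theorem 7.14 gives a practical test: compute the eigenvalues of the symmetric matrix A and check that they are all strictly positive. A small numpy sketch (the test matrices are made up for illustration):

    import numpy as np

    def is_positive_definite(A, tol=1e-12):
        # eigvalsh is the eigenvalue routine for symmetric (Hermitian) matrices
        return bool(np.all(np.linalg.eigvalsh(A) > tol))

    print(is_positive_definite(np.array([[2.0, -1.0], [-1.0, 2.0]])))   # True  (eigenvalues 1, 3)
    print(is_positive_definite(np.array([[1.0,  2.0], [ 2.0, 1.0]])))   # False (eigenvalues -1, 3)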
The Principal Axes Theorem has important applications in geometry.
8 Vector Spaces
Definition and Examples
In the first part of the course we’ve looked at properties of the real n-space
Rn . We also introduced the idea of a field K in Section 3.1 which is any set
with two binary operations + and × satisfying the 9 field axioms. R is an
example of a field but there are many more, for example C, Q and Zp (p a
prime, with modulo p addition and multiplication).
1.(i) u + v = v + u for all u, v ∈ V
The elements of V are called vectors and the elements of K are called
scalars. We sometimes refer to V as a K−space.
Examples 8.2. 1. For all n ≥ 1, Rn with the usual addition and scalar
multiplication is a vector space over R. More generally, let
    K^n = { (x1, ..., xn)^T | xi ∈ K }

and define

    (x1, ..., xn)^T + (y1, ..., yn)^T = (x1 + y1, ..., xn + yn)^T,
    c(x1, ..., xn)^T = (cx1, ..., cxn)^T.

Then K^n is a vector space over K.
2. The set Mmn (R) of all m × n matrices with entries in R with addition
of matrices and scalar multiplication is a vector space over R. More
generally, let
    Mmn(K) = { A = (aij) | aij ∈ K, 1 ≤ i ≤ m, 1 ≤ j ≤ n }
be the set of m × n matrices with entries in K, and define addition and
scalar multiplication entrywise:

    (aij) + (bij) = (aij + bij),   c(aij) = (caij),

where aij, bij, c ∈ K. Then Mmn(K) is a vector space over K.
We write Mn (K) = Mnn (K).
3. Let Pn denote the set of all polynomials of degree ≤ n with real coefficients:

    Pn = { a0 + a1x + ... + anx^n | a0, ..., an ∈ R },

and define

    Σ_{i=0}^{n} ai x^i + Σ_{i=0}^{n} bi x^i = Σ_{i=0}^{n} (ai + bi) x^i,

    c(Σ_{i=0}^{n} ai x^i) = Σ_{i=0}^{n} (c ai) x^i.

Then Pn is a vector space over R.
5. The set C of complex numbers is a vector space over R with the usual
addition of complex numbers and multiplication by real numbers.
6. An unusual example: Let U be a set. Consider the power set
P(U ) = {A| A ⊆ U }. For A, B ⊆ U define
A + B = (A ∪ B)\(A ∩ B).
This definition satisfies conditions 1(i)-(iv) of Definition 8.1.
The zero in P(U ) is ∅ and −A = A.
Consider the field Z2 = {0, 1} and define
1.A = A, 0.A = ∅, ∀A ⊆ U.
We can show that 2(i)-(iv) of Definition 8.1 are satisfied.
Hence P(U ) is a vector space over Z2 .
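This example is easy to experiment with in Python, where symmetric difference of sets plays the role of the addition defined above (a quick sanity check, not a proof):

    # Vector addition in P(U) is symmetric difference; the zero vector is the
    # empty set and every A is its own additive inverse.
    U = {1, 2, 3, 4}
    A, B, C = {1, 2}, {2, 3}, {3, 4}

    def add(X, Y):
        return (X | Y) - (X & Y)      # (X ∪ Y) \ (X ∩ Y)

    print(add(A, B) == add(B, A))                    # commutativity
    print(add(add(A, B), C) == add(A, add(B, C)))    # associativity
    print(add(A, set()) == A)                        # the empty set is the zero
    print(add(A, A) == set())                        # -A = A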
Theorem 8.3. Let V be a vector space over K. Then, for all u ∈ V and all
a ∈ K we have:
(i) 0u = 0;
(ii) a0 = 0;
(iii) (−1)u = −u; and
(iv) if au = 0, then a = 0 or u = 0.
Subspaces
Definition 8.4. A non-empty subset W of a K−space V is a subspace if
(i) u + v ∈ W, ∀u, v ∈ W ; and
(ii) au ∈ W, ∀u ∈ W, ∀a ∈ K.
Theorem 8.5. A subspace W of a K−space V is itself a vector space over
K with the same addition and scalar multiplication as in V .
W = {f ∈ F| f (1) = 0}.
(v) In any vector space V , the subset {0} is a subspace, called the zero
subspace.
Proof. We have 0 = 0 + 0 ∈ W1 + W2 .
Let u + v, u0 + v 0 ∈ W1 + W2 , where u, u0 ∈ W1 and v, v 0 ∈ W2 . Then
(u + v) + (u0 + v 0 ) = (u + u0 ) + (v + v 0 ) ∈ W1 + W2
c(u + v) = cu + cv ∈ W1 + W2
W1 = {A ∈ Mn (R)| A = AT }, W2 = {B ∈ Mn (R)| B = −B T }.
We have already seen that W1 is a subspace and it’s not hard to show that
W2 is a subspace. We have
Let A ∈ Mn(R). Then we can write

    A = (1/2)(A + A^T) + (1/2)(A − A^T)

and by properties of the transpose we have (1/2)(A + A^T) ∈ W1 and
(1/2)(A − A^T) ∈ W2. Therefore A ∈ W1 + W2 and W1 + W2 = Mn(R).
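The decomposition is easy to verify numerically; a minimal numpy sketch with an arbitrary matrix:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])     # arbitrary illustrative matrix

    S = (A + A.T) / 2                   # symmetric part, lies in W1
    K = (A - A.T) / 2                   # antisymmetric part, lies in W2

    print(np.allclose(S, S.T))          # True
    print(np.allclose(K, -K.T))         # True
    print(np.allclose(S + K, A))        # True: A ∈ W1 + W2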
We can extend Definition 8.8 to the sum of more than two subspaces:
An easy induction on t and Theorem 8.9 show that this sum is a subspace
of V .
Examples 9.2. 1. The set

    S = { [ 1 0 ; 0 0 ], [ 0 1 ; 0 0 ], [ 0 0 ; 1 0 ], [ 0 0 ; 0 1 ] }

is a spanning set for the vector space M2(K) over any field K because
we can write any 2 × 2 matrix as

    [ a b ; c d ] = a[ 1 0 ; 0 0 ] + b[ 0 1 ; 0 0 ] + c[ 0 0 ; 1 0 ] + d[ 0 0 ; 0 1 ],
where a, b, c, d ∈ K.
2. The set S = {1, x, x^2, x^3, ...} is a spanning set for the vector space P
of polynomials with coefficients in R. By definition, any polynomial
f = Σ_{i=0}^{n} ai x^i is a linear combination of elements of S.
in M2 (K).
Since the three matrices in S are symmetric, any linear combination of
them is symmetric. Indeed,
    [ a b ; b d ] = a[ 1 0 ; 0 0 ] + b[ 0 1 ; 1 0 ] + d[ 0 0 ; 0 1 ].
3b + c = −1, 2a = 4, −a + c = 0, −b = 1.
(i) span(v1 , ..., vk ) is a subspace of V ;
Proof. (i) Consider the span of a single vector vi, span(vi) = {avi | a ∈ K}.
This is a subspace because avi + bvi = (a + b)vi ∈ span(vi) and c(avi) = (ca)vi ∈ span(vi)
for all a, b, c ∈ K.
Linear Independence
Definition 9.4. Let V be a vector space. A finite set S = {v1 , ..., vk } of
vectors in V is called linearly independent if the only solution to
a1 v1 + ... + ak vk = 0
is a1 = ... = ak = 0.
If S is not linearly independent we say it is linearly dependent.
Remark:
If we can find a1 , ..., ak ∈ K, not all zero, such that a1 v1 + ... + ak vk = 0,
we say this is a non-trivial combination of the vectors.
Proof. Let S = {v1 , ..., vk } and assume, without loss of generality, that
v1 = a2 v2 + ... + ak vk .
Then
1v1 − a2 v2 − ... − ak vk = 0
and therefore S is linearly dependent.
Conversely assume there is a non-trivial combination
b1 v1 + ... + bk vk = 0.
Then at least one of the coefficients bi ≠ 0 and we can write

    vi = (−bi^{-1}b1)v1 + ... + (−bi^{-1}b_{i−1})v_{i−1} + (−bi^{-1}b_{i+1})v_{i+1} + ... + (−bi^{-1}bk)vk

as required.
Examples 9.6. 1. In M2(R), let A = [ 1 0 ; 0 0 ], B = [ 0 1 ; −2 0 ],
C = [ −1 0 ; 0 1 ]. If

    aA + bB + cC = 0

for some a, b, c ∈ R we have

    [ a − c   b ; −2b   c ] = [ 0 0 ; 0 0 ]

and so a − c = 0, b = −2b = 0, c = 0, and the only solution is a = b =
c = 0. Therefore {A, B, C} is linearly independent in M2(R).
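The same conclusion can be reached by computer: identify each matrix with its co-ordinate vector in R^4 (anticipating Theorem 11.2) and check that the three vectors have rank 3. A small numpy sketch:

    import numpy as np

    A = np.array([[1.0, 0.0], [ 0.0, 0.0]])
    B = np.array([[0.0, 1.0], [-2.0, 0.0]])
    C = np.array([[-1.0, 0.0], [0.0, 1.0]])

    # Columns are the flattened matrices; full column rank <=> linear independence.
    M = np.column_stack([A.flatten(), B.flatten(), C.flatten()])
    print(np.linalg.matrix_rank(M) == 3)    # True: {A, B, C} is linearly independent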
2. Consider the vector space P and let S = {1, x, x^2, ..., x^n}. Then

    a0 + a1x + ... + anx^n = 0
if and only if a0 = a1 = ... = an = 0. Hence S is linearly independent
in P. (Here we use the fact that two polynomials are equal if and only
if their coefficients are equal.)
3. In the vector space F of all functions f : R → R, the set {sin^2(x), cos^2(x), cos(2x)}
is linearly dependent because

    cos(2x) = cos^2(x) − sin^2(x).
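A quick numerical spot-check of this dependence relation (not a proof):

    import numpy as np

    x = np.linspace(-5.0, 5.0, 1001)
    # The non-trivial combination 1*sin^2(x) - 1*cos^2(x) + 1*cos(2x) vanishes identically.
    print(np.allclose(np.sin(x)**2 - np.cos(x)**2 + np.cos(2*x), 0.0))   # True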
The concept of linear independence can be extended to infinite sets: an
arbitrary non-empty set S in a vector space V is linearly independent if all
its finite subsets are linearly independent.
    fα(x) = 1 if x = α,   and   fα(x) = 0 if x ≠ α.
If a1 fα1 + ... + ak fαk = 0 for some scalars a1, ..., ak, then evaluating at x = αi gives ai · 1 = 0, so ai = 0 for all i = 1, ..., k.
Hence {fα1 , ..., fαk } is linearly independent and so by definition S is
linearly independent.
Notice this is an example of an uncountably infinite linearly inde-
pendent set.
10 Bases
Definition 10.1. A subset B of a vector space V is a basis for V if B spans
V and B is linearly independent.
Examples 10.2. 1. The standard basis for K^n is the set {e1, ..., en},
where ei = (0, ..., 0, 1, 0, ..., 0)^T with the 1 appearing in the i-th row. There are
many more bases for K^n.
3. In Mmn (K), the set of matrices Eij with a 1 in the (i, j)−th position
and 0 everywhere else, for 1 ≤ i ≤ m, 1 ≤ j ≤ n is the standard basis
for Mmn (K).
4. The set of polynomials {1, x, x^2, ..., x^n} forms a basis for Pn and the
infinite set {1, x, x^2, ...} is a basis for P.
Theorem 10.3. Let V be a vector space and let B be a finite basis for V .
Then for every vector v ∈ V there is a unique expression for v as a linear
combination of the vectors in B.
Proof. Suppose

    v = a1v1 + ... + anvn = b1v1 + ... + bnvn

are two such expressions, where B = {v1, ..., vn}. Then
(a1 − b1)v1 + ... + (an − bn)vn = 0, and since B is linearly independent,

    a1 − b1 = ... = an − bn = 0  ⇒  ai = bi, ∀i = 1, ..., n.
(This proves the theorem for finite B. The proof for an infinite B is
similar, but requires some additional argument.)
for all vj ∈ S\B, B ∪ {vj } is linearly dependent. So there is a non-trivial
combination
a1 v1 + ... + an vn + aj vj = 0
where aj 6= 0 because B is linearly independent. Therefore
    vj = −aj^{-1}(a1v1 + ... + anvn)
      = Σ_{i=1}^{n} bi vi + Σ_{j=n+1}^{m} bj vj
      = Σ_{i=1}^{n} bi vi + Σ_{j=n+1}^{m} bj ( Σ_{i=1}^{n} (−aj^{-1} ai) vi )
      = Σ_{i=1}^{n} bi vi + Σ_{i=1}^{n} ( Σ_{j=n+1}^{m} −bj aj^{-1} ai ) vi
      = Σ_{i=1}^{n} ( bi − Σ_{j=n+1}^{m} bj aj^{-1} ai ) vi.
Proof. Let S0 be a finite linearly independent set in V and let T be any finite
spanning set for V . Then S = S0 ∪ T is a finite spanning set for V containing
S0 . By Lemma 10.5, S contains a basis B of V with S0 ⊆ B.
The next result is required to prove the main theorem of this section:
Lemma 10.7. Let V be a finite dimensional vector space. Let R = {u1 , ..., un }
be a linearly independent set in V and S = {v1 , ..., vm } be a spanning set for
V . Then n ≤ m.
Proof. Consider the set T1 = {u1 } ∪ S. Then T1 is a spanning set because
it contains S and it is linearly dependent because u1 is a linear combination
of elements of S. Therefore T1 contains a basis B1 containing the linearly
independent set {u1 } by Lemma 10.5. B1 is a proper subset of T1 because T1
is linearly dependent and so B1 = {u1 } ∪ S1 where S1 is a proper subset of
S.
Now consider T2 = {u2 } ∪ B1 = {u1 , u2 } ∪ S1 . Then T2 is a spanning set
and it is linearly dependent because u2 ∈ span(B1 ). Therefore T2 contains a
basis B2 containing the linearly independent set {u1 , u2 } and B2 is a proper
subset of T2 . We can write B2 = {u1 , u2 } ∪ S2 where S2 is a proper subset of
S1 .
Continuing in this way we find that V has a basis Bn = {u1 , ..., un } ∪ Sn
where
    Sn ⊊ Sn−1 ⊊ ... ⊊ S1 ⊊ S.
Hence
0 ≤ |Sn | ≤ |S| − n
⇒ n ≤ |S| = m.
Theorem 10.8. (The Basis Theorem) Any two bases of a finite dimen-
sional vector space have the same number of elements.
Proof. Let V be a finite dimensional vector space . Then V has a finite basis
B by definition. Suppose B has n elements. Let C be any other finite basis
and suppose that C has m elements. Since B is linearly independent and C
is a spanning set, then Lemma 10.7 gives n ≤ m. Similarly m ≤ n since B is
a spanning set and C is linearly independent. Therefore n = m.
Finally, any basis of V is finite, otherwise V would contain an infinite
linearly independent set, and hence a finite linearly independent set with
more than n elements. However this contradicts Lemma 10.7 because B is a
spanning set with n elements.
Definition 10.9. Let V be a finite dimensional vector space over K, with
V 6= {0}. The number of vectors in any basis for V is called the dimension
of V , denoted by dimK (V ) or simply dim(V ). For the zero space we set
dim{0} = 0.
Examples 10.10. 1. For any n ≥ 1 and any field K,
dimK (K n ) = n.
3. We have seen that the set of matrices [ 1 0 ; 0 0 ], [ 0 0 ; 0 1 ], [ 0 1 ; 1 0 ]
forms a basis for the space of symmetric matrices in M2(K). Hence this
space has dimension 3. It is a subspace of M2(K).
4. The standard basis of Mmn(K) is

    {Eij | 1 ≤ i ≤ m, 1 ≤ j ≤ n},

where Eij is the matrix with (i, j)-th entry 1 and all other entries 0.
Therefore
dimK (Mmn (K)) = mn.
5. Indicating the field K can be important because some sets can be re-
garded as vector spaces over different fields. For example if V = C,
then
dimC (C) = 1, with basis {1},
dimR (C) = 2, with basis {1, i}.
1. V has a finite basis and all bases of V have the same (finite) number of ele-
ments. This number is called the dimension of V .
Proof. (i) If S is a linearly independent subset of V , then S is contained in
a basis B and so |S| ≤ |B| = n.
(ii) If S is a spanning set for V , then S contains a basis B and n = |B| ≤
|S|.
(iii) If S is a linearly independent set with |S| = n, then (i) implies that
S = B for some basis B.
(iv) If S is a spanning set for V with |S| = n, then (ii) implies that S = B
for some basis B.
The next result shows that any subspace of a finite dimensional vector
space is also finite dimensional.
dim(W ) = m ≤ n = dim(V ).
W = V ⇒ dimW = dimV.
Proof. Since W1 ∩ W2 is a subspace of W1 and of W2 it is finite dimensional
by Theorem 10.12(i). Let B = {u1 , ..., un } be a basis for W1 ∩ W2 . Then
this can be extended to a basis B1 = {u1 , ..., un , v1 , ..., vk } for W1 and a basis
B2 = {u1 , ..., un , w1 , ..., wm } for W2 . We will show that
B3 = B1 ∪ B2 = {u1 , ..., un , v1 , ..., vk , w1 , ..., wm }
is a basis for W1 + W2 . It is clear that B3 spans W1 + W2 so it remains to
show that it is linearly independent. Assume that
a1 u1 + ... + an un + b1 v1 + ... + bk vk + c1 w1 + ... + cm wm = 0 (∗)
for some scalars a1 , ..., an , b1 , ..., bk , c1 , ..., cm ∈ K. Then u + v + w = 0, where
u = a1 u1 + ... + an un , v = b1 v1 + ... + bk vk , w = c1 w1 + ... + cm wm . So
w = −u − v ∈ W1 ∩ W2 and we can write w = d1 u1 + ... + dn un for some
d1 , ..., dn ∈ K. Substituting into (∗) we get
(a1 + d1 )u1 + ... + (an + dn )un + b1 v1 + ... + bk vk = 0. (∗∗)
Since B1 is linearly independent, all coefficients in (∗∗) are zero and so (∗)
becomes
a1 u1 + ... + an un + c1 w1 + ... + cm wm = 0. (∗ ∗ ∗)
Since B2 is linearly independent, all coefficients in (∗ ∗ ∗) are zero and this
means that all coefficients in (∗) are zero. Therefore B3 is linearly indepen-
dent and is a basis for W1 + W2 .
Finally we have
dim(W1 + W2 ) = n + k + m = (n + k) + (n + m) − n
= dim(W1 ) + dim(W2 ) − dim(W1 ∩ W2 ).
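As an illustration (with subspaces chosen only for this sketch), the formula can be checked numerically for concrete subspaces of R^3 given as column spans; here W1 is the xy-plane and W2 the xz-plane, so W1 ∩ W2 is the x-axis and the formula gives 2 + 2 − 3 = 1 for its dimension.

    import numpy as np

    A1 = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0]])     # columns span W1 = xy-plane
    A2 = np.array([[1.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 1.0]])     # columns span W2 = xz-plane

    dim_W1 = np.linalg.matrix_rank(A1)                     # 2
    dim_W2 = np.linalg.matrix_rank(A2)                     # 2
    dim_sum = np.linalg.matrix_rank(np.hstack([A1, A2]))   # dim(W1 + W2) = 3

    print(dim_W1 + dim_W2 - dim_sum)    # 1 = dim(W1 ∩ W2), the x-axis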
11 Co-ordinates and change of bases
Let V be a vector space with dim(V ) = n and let B = {v1 , ..., vn } be a
basis for V . Then every vector v ∈ V has a unique expression as a linear
combination
v = a1 v1 + ... + an vn
for some a1, ..., an ∈ K. The scalars a1, ..., an are called the co-ordinates of
v with respect to B, and the vector (a1, ..., an)^T ∈ K^n is called the co-ordinate
vector of v with respect to B, denoted by [v]B.
Therefore the co-ordinates of A are a, b, c, d and the co-ordinate vector is
[A]B = (a, b, c, d)^T.
3. Consider C as a vector space over R. A standard basis is B = {1, i}.
The co-ordinate vector of v = a + bi with respect to B is
    [v]B = (a, b)^T.
u = v ⇔ [u]B = [v]B .
v ↔ [v]B .
The next result shows that this correspondence agrees with the vector space
operations:
Theorem 11.2. Let B = {v1 , ..., vn } be a basis for V . Then for all u, v ∈ V
and all a ∈ K
(i) [u + v]B = [u]B + [v]B
u = a1 v1 + ... + an vn , v = b1 v1 + ... + bn vn .
Then
Hence

    [u + v]B = (a1 + b1, ..., an + bn)^T = (a1, ..., an)^T + (b1, ..., bn)^T = [u]B + [v]B, and

    [au]B = (aa1, ..., aan)^T = a(a1, ..., an)^T = a[u]B.
[a1 u1 + ... + ak uk ]B = 0 ∈ K n .
Then
a1 [u1 ]B + ... + ak [uk ]B = 0 ∈ K n
and so a1 = ... = ak = 0 since C′ is linearly independent. Therefore C is
linearly independent, as required.
Change of Bases
Let B = {u1 , ..., un } and C = {v1 , ..., vn } be bases for a vector space V . Then
each ui has a unique expression as a linear combination of the vectors in C:
Definition 11.5. The n×n matrix whose column vectors are the co-ordinate
vectors [u1 ]C , ..., [un ]C of the vectors of B with respect to C is denoted by PB→C
and is called the change of basis matrix from B to C:
Example 11.6. In R^2, let B = { (1, 1)^T, (−1, 1)^T }, C = { (1, 2)^T, (2, 1)^T }.
Then

    u1 = (1, 1)^T = (1/3)(1, 2)^T + (1/3)(2, 1)^T = (1/3)v1 + (1/3)v2,

    u2 = (−1, 1)^T = (1, 2)^T − (2, 1)^T = 1.v1 − 1.v2.

Therefore

    PB→C = [ 1/3   1 ; 1/3  −1 ].
If we know the change of basis matrix PB→C and the co-ordinates of a
vector w ∈ V with respect to B we can easily get the co-ordinates of w with
respect to C.
Theorem 11.7. Let B = {u1 , ..., un } and C = {v1 , ..., vn } be bases for a
vector space V and let PB→C be the change of basis matrix from B to C. Then
(iii) PB→C is invertible and (PB→C)^{-1} = PC→B.
Proof. (i) Let v ∈ V with [v]B = (a1, ..., an)^T, ie. v = a1u1 + ... + anun. Then

    [v]C = [a1u1 + ... + anun]C
         = a1[u1]C + ... + an[un]C
         = ([u1]C ... [un]C) (a1, ..., an)^T
         = PB→C [v]B.
(ii) Suppose P is an n × n matrix with P [v]B = [v]C for all v ∈ V . Then
for v = ui we get
[ui ]B = ei ,
where ei is the vector with a 1 in the i-th position and 0 everywhere else. So
the i-th column of P is
Pi = P ei = P [ui ]B = [ui ]C ,
the i-th column of P = PB→C . Therefore P = PB→C .
(iii) By part (i) we have, for all v ∈ V ,
[v]B = PC→B [v]C = PC→B PB→C [v]B .
Hence P = PC→B PB→C has the property that
[v]B = P [v]B
for all v ∈ V. Therefore P = PC→B PB→C = In. Hence PB→C is invertible and
(PB→C)^{-1} = PC→B.
Corollary 11.8. If B, C and D are bases of a vector space V , then
PB→C = PD→C PB→D .
Proof. By part (i) of Theorem 11.7, we have for all v ∈ V ,
[v]C = PD→C [v]D = PD→C PB→D [v]B
and then part (ii) of Theorem 11.7 gives that
PB→C = PD→C PB→D .
This Corollary gives an easy method for computing a change of basis
matrix. Suppose we are given B, C and D. Then we have
PB→C = PD→C PB→D
= (PC→D )−1 PB→D .
We can take advantage of this if PB→D is easy to compute, for example if D
is a standard basis.
Example 11.9. As in our previous example, let B = { (1, 1)^T, (−1, 1)^T },
C = { (1, 2)^T, (2, 1)^T }, and set D = { (1, 0)^T, (0, 1)^T }, the standard basis for
R^2. Then

    PB→D = [ 1  −1 ; 1  1 ],   PC→D = [ 1  2 ; 2  1 ].

Then

    PB→C = (PC→D)^{-1} PB→D = [ −1/3  2/3 ; 2/3  −1/3 ] [ 1  −1 ; 1  1 ] = [ 1/3  1 ; 1/3  −1 ],

which is what we calculated directly before.
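A one-line numpy check of this computation (the columns of PB→D and PC→D are simply the basis vectors, since D is the standard basis):

    import numpy as np

    P_B_to_D = np.array([[1.0, -1.0],
                         [1.0,  1.0]])
    P_C_to_D = np.array([[1.0, 2.0],
                         [2.0, 1.0]])

    print(np.linalg.inv(P_C_to_D) @ P_B_to_D)   # [[1/3, 1], [1/3, -1]] = P_{B->C}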
Example 11.10. Consider C as a vector space over R. Let B = {1+i, 1−i},
C = {2 + 3i, 1 + 2i}, bases for C. Then
    PB→C = [ 1  3 ; −1  −5 ].
12 Linear Transformations
Definition 12.1. A linear transformation from a vector space V (over
K) to a vector space W (over K) is a function T : V → W such that for all
u, v ∈ V and all a ∈ K,
1. T (u + v) = T (u) + T (v);
2. T (au) = aT (u).
Note: It follows from the definition that a function T : V → W is a linear
transformation if and only if for all u1 , ..., uk ∈ V and for all a1 , ..., ak ∈ K,
T (a1 u1 + ... + ak uk ) = a1 T (u1 ) + ... + ak T (uk ).
We say that T commutes with linear combinations.
Examples 12.2. 1. Matrix transformations: For any matrix A ∈ Mmn (K),
define TA : K n → K m by setting
TA (u) = Au
T (A) = AT .
4. For any two vector spaces V and W over K, the zero transformation
T0 : V → W defined by
T0 (v) = 0
for all v ∈ V and the identity map I : V → V defined by
I(v) = v
(i) T (0) = 0;
for all x ∈ A.
Proof. Exercise.
For any three linear transformations T : U → V, S : V → W, R : W → Y
we have
R ◦ (S ◦ T ) = (R ◦ S) ◦ T,
the associativity law for composition of functions. Also,
T ◦ IU = T, IV ◦ T = T.
Recall that a function f : A → B is invertible if there exists a function
g : B → A with g ◦ f = IA and f ◦ g = IB , and in this case g is unique and
is called the inverse of f , denoted by f −1 . Also f is invertible if and only if
f is injective and surjective.
Theorem 12.5. If a linear transformation T : V → W is invertible, then
the inverse T −1 : W → V is also a linear transformation.
Proof. Let x, y ∈ W and a ∈ K. Then

    T ◦ T^{-1}(x + y) = IW(x + y) = x + y = T ◦ T^{-1}(x) + T ◦ T^{-1}(y) = T(T^{-1}(x) + T^{-1}(y)),

using the linearity of T in the last step, and since T is injective this gives

    T^{-1}(x + y) = T^{-1}(x) + T^{-1}(y).
Also

    T ◦ T^{-1}(ax) = ax = a T ◦ T^{-1}(x) = T(aT^{-1}(x))

and again, since T is injective, this gives T^{-1}(ax) = aT^{-1}(x). Hence T^{-1} is a
linear transformation.
2. Consider T : Mmn (K) → Mnm (K) given by T (A) = AT . Then
A ∈ ker(T ) ⇔ T (A) = 0nm ⇔ AT = 0nm ⇔ A = 0mn .
Therefore ker(T ) = {0mn }. Here we have
range(T ) = Mnm (K)
because every matrix in Mnm (K) is the image of its transpose, ie.
A = T (AT ) = (AT )T
for all A ∈ Mnm (K).
3. Consider T : M2(R) → F given by

    T([ a b ; c d ]) = a sin(x) − 2d cos(x).

Here [ a b ; c d ] ∈ ker(T) ⇔ a sin(x) − 2d cos(x) = 0 for all x ∈ R.
Since sin(x) and cos(x) are linearly independent in F, this implies that
a = d = 0. Hence

    ker(T) = { [ 0 b ; c 0 ] | b, c ∈ R }.
The range of T is the subspace of F spanned by {sin(x), cos(x)}.
Theorem 12.8. The kernel of a linear transformation T : V → W is a
subspace of V and the range is a subspace of W .
Proof. Since T (0) = 0, we have 0 ∈ ker(T ) and so ker(T ) 6= ∅. Let u, v ∈
ker(T ) and a ∈ K. Then
T (u + v) = T (u) + T (v) = 0 + 0 = 0
and so u + v ∈ ker(T ), and
T (au) = aT (u) = a0 = 0
and so au ∈ ker(T ). By the subspace criteria, ker(T ) is a subspace of V .
Let x, y ∈ range(T ) and suppose x = T (u), y = T (v) for some u, v ∈ V .
Then
x + y = T (u) + T (v) = T (u + v)
and so x + y ∈ range(T ) and if a ∈ K, then
ax = aT (u) = T (au)
and so ax ∈ range(T ). Therefore range(T ) is a subspace of W .
Definition 12.9. Let T : V → W be a linear transformation. The rank of
T is defined by
rank(T ) = dim(range(T ))
and the nullity of T is defined as nullity(T) = dim(ker(T)).
But then
w = T (a1 v1 + ... + ak vk + ak+1 vk+1 + ... + an vn )
= a1 T (v1 ) + ... + ak T (vk ) + ak+1 T (vk+1 ) + ... + an T (vn )
= ak+1 T (vk+1 ) + ... + an T (vn )
because v1 , .., vk ∈ ker(T ). Therefore w is a linear combination of vectors in
B2 and so range(T) is spanned by B2. To show that B2 is linearly independent,
assume that

    ck+1 T(vk+1) + ... + cn T(vn) = 0.

Then T(ck+1vk+1 + ... + cnvn) = 0 and so ck+1vk+1 + ... + cnvn ∈ ker(T). Since B1 is a basis for ker(T) we have
Theorem 12.13. Let V and W be vector spaces over a field K with dim(V ) =
dim(W ). Then a linear transformation T : V → W is injective if and only
if it is surjective.
Proof. Exercise
in W . Then
T (a1 v1 + ... + an vn ) = 0
and so a1 v1 + ... + an vn ∈ ker(T ) = {0}. Then
a1 v1 + ... + an vn = 0
Proof. Let B = {v1, ..., vn} be a basis for V. Then T(B) is a linearly independent
set of n vectors in W and, since dim(W) = n, T(B) must form a basis
for W.
Proof. If T : V → W is an isomorphism, then range(T ) = W and ker(T ) =
{0} and then the Rank Theorem gives
Corollary 12.18. Any finite dimensional vector space V over a field K with
dim(V ) = n is isomorphic to K n .
T (v) = [v]B
is an isomorphism.
and C = {w1, w2}, where

    w1 = (1, 1)^T,   w2 = (1, 0)^T.
Proof. Let B = {v1, ..., vn} and let {e1, ..., en} be the standard basis for K^n.
Then [vi]B = ei for each i = 1, ..., n. Let A = B[T]C. Then
A[vi ]B = Aei = [T (vi )]C (∗)
since Aei is the i-th column of A and this is [T (vi )]C by the definition of A.
Now any v ∈ V can be written as a linear combination v = a1 v1 + ... + an vn
and then
[v]B = a1 e1 + ... + an en .
So
A[v]B = A(a1 e1 + ... + an en )
= a1 Ae1 + ... + an Aen
= a1 [T (v1 )]C + ... + an [T (vn )]C
= [a1 T (v1 ) + ... + an T (vn )]C
= [T (a1 v1 + ... + an vn )]C = [T (v)]C .
In our previous example, let v = (1, 3, 4)^T = 1v1 + 2v2 + 2v3. Then

    T(v) = (4, −1)^T = −1w1 + 5w2,

so

    [T(v)]C = (−1, 5)^T.

Also

    B[T]C [v]B = [ 1 0 −1 ; 1 1 1 ] (1, 2, 2)^T = (−1, 5)^T,

as predicted by the Theorem.
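The same check in numpy, using the data of this example:

    import numpy as np

    T_BC = np.array([[1.0, 0.0, -1.0],
                     [1.0, 1.0,  1.0]])    # B[T]C
    v_B = np.array([1.0, 2.0, 2.0])        # [v]_B

    print(T_BC @ v_B)                      # [-1.  5.] = [T(v)]_C

    # Recover T(v) from its C-co-ordinates, with w1 = (1,1)^T and w2 = (1,0)^T.
    w1, w2 = np.array([1.0, 1.0]), np.array([1.0, 0.0])
    print(-1*w1 + 5*w2)                    # [ 4. -1.] = T(v)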
Example 13.4. Let IV : V → V be the identity map on a vector space V
with dim(V ) = n and let B = {v1 , ..., vn } and C be bases of V . What is
B [IV ]C ?
By definition, the i-th column of this matrix is [IV (vi )]C = [vi ]C . Therefore
B [IV ]C = PB→C ,
the change of basis matrix. In particular
B [IV ]B = In .
Theorem 13.6. Let V and W be finite dimensional vector spaces over a
field K with bases B and C respectively, and let T : V → W be a linear
transformation. Then T is invertible if and only if dim(V ) = dim(W ) and
B [T ]C is an invertible matrix. In this case
    C[T^{-1}]B = (B[T]C)^{-1}.
    T(v) = 0 ⇒ [T(v)]C = 0 ⇒ B[T]C [v]B = 0 ⇒ [v]B = 0 ⇒ v = 0.
The Rank Theorem now gives range(T ) = W and so T is a bijection and
hence T is invertible.
[T]B = B[T]B.
[T]C = P^{-1} [T]B P
Proof. Let B = {u1 , ..., un } and C = {v1 , ..., vn }. Then the i-th column of
[T ]C is
[T (vi )]C = PB→C .[T (vi )]B
= PB→C .[T ]B [vi ]B
= PB→C .[T ]B .PC→B [vi ]C
= PB→C .[T ]B .PC→B ei .
Therefore
[T ]C = PB→C .[T ]B .PC→B
= (PC→B )−1 .[T ]B .PC→B
as required.
Note: If matrices A and B can be written as B = P −1 AP for some invertible
matrix P , then we say that A and B are similar matrices. Theorem 13.7
says that the matrices of T with respect to different bases are similar.
Conversely if A, B ∈ Mn (K) are similar matrices, then they represent the
same linear operator T : K n → K n with respect to some bases B, C of K n .
Suppose B = P −1 AP for some invertible P ∈ Mn (K). We have A = [TA ]B
where TA (v) = Av for all v ∈ K n and B = {e1 , ..., en } is the standard basis
for K n . Therefore
B = P −1 AP = P −1 [TA ]B P = [TA ]C
where C is the basis {P e1 , ..., P en }, ie. the basis of K n with change of basis
matrix PC→B = P .
Example 13.8. Let A = [ 1 1 ; 1 1 ] ∈ M2(R). Then A = [TA]B where
B = {e1, e2} is the standard basis of R^2 and

    TA((x, y)^T) = [ 1 1 ; 1 1 ] (x, y)^T = (x + y, x + y)^T.

Let C = {u1, u2} with u1 = (1, 1)^T = e1 + e2 and u2 = (1, −1)^T = e1 − e2.
Then PC→B = [ 1 1 ; 1 −1 ] and (PC→B)^{-1} = [ 1/2 1/2 ; 1/2 −1/2 ]. Therefore

    [TA]C = [ 1/2 1/2 ; 1/2 −1/2 ] [ 1 1 ; 1 1 ] [ 1 1 ; 1 −1 ]
          = [ 2 0 ; 0 0 ].
We can check this directly from the definition:
    TA(u1) = (2, 2)^T = 2u1 + 0u2,   TA(u2) = (0, 0)^T = 0u1 + 0u2.
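Numerically, the conjugation in Example 13.8 can be checked in one line (a quick sketch):

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0]])
    P = np.array([[1.0,  1.0],
                  [1.0, -1.0]])          # P_{C->B}, columns u1 and u2

    print(np.linalg.inv(P) @ A @ P)      # [[2, 0], [0, 0]] = [T_A]_C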
If we wish to discover whether a vector v is in row(A), then we can
consider AT , using the method above and the fact that row(A) = col(AT ).
Theorem 14.4. For any matrix A ∈ Mmn (K), the dimension of row(A) is
the number of non-zero rows in the reduced row echelon form of A and the
non-zero rows of the RREF of A form a basis for row(A).
A(u + v) = Au + Av = 0 + 0 = 0
The dimension of null(A) is called the nullity of A, denoted nullity(A).
Example 14.7. Find a basis for null(A) where

    A = [ 1 −1  2 ]
        [ 0  1 −1 ]
        [ 3 −3  6 ].

We must find a basis for the set of vectors v = (a, b, c)^T such that Av = 0.
The augmented matrix is

    [ 1 −1  2 | 0 ]
    [ 0  1 −1 | 0 ]
    [ 3 −3  6 | 0 ]

which reduces to

    [ 1 −1  2 | 0 ]
    [ 0  1 −1 | 0 ]
    [ 0  0  0 | 0 ].

We get a − b + 2c = 0 and b − c = 0, and so v = (−t, t, t)^T = t(−1, 1, 1)^T for t ∈ R.
Therefore null(A) is a 1-dimensional subspace of R^3 spanned by (−1, 1, 1)^T, and
nullity(A) = 1.
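The same computation can be done numerically; one standard route (a sketch, not the method used above) is to read null(A) off the singular value decomposition, where right singular vectors belonging to zero singular values span the null space:

    import numpy as np

    A = np.array([[1.0, -1.0,  2.0],
                  [0.0,  1.0, -1.0],
                  [3.0, -3.0,  6.0]])

    _, s, Vt = np.linalg.svd(A)
    null_basis = Vt[s < 1e-10]     # rows of V^T with (numerically) zero singular value

    print(null_basis.shape[0])     # 1, so nullity(A) = 1
    v = null_basis[0]
    print(v / v[1])                # [-1.  1.  1.], matching the basis vector found above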
Proof. Write vi for the ith column of A. Then vi = Aei ∈ range(TA ) (where
{e1 , . . . , en } is the standard basis of K n ). So
Definition 14.9. For A ∈ Mmn (K), define rank(A) to be the dimension of
the column space of A. By the above lemma we have rank(A) = rank(TA ).
Note that it is not true in general that col(A) = row(A). They just have
the same dimension.
Proof. Let R be the reduced row echelon form (RREF) of A. By Theorem
14.4 we have dim(row(A)) = r, the number of non-zero rows in R. Since
solutions to Av = 0 are precisely the solutions of Rv = 0, we have null(A) =
null(R).
By Theorem 14.10, nullity(A) + rank(A) = n = nullity(R) + rank(R). So
Example 14.12. Consider the matrix

    A = [ 3  −1  5 ]
        [ 2   1  3 ]
        [ 0  −5  1 ]  ∈ M3(R).

The reduced row echelon form of A is

    R = [ 1  0   8/5 ]
        [ 0  1  −1/5 ]
        [ 0  0    0  ].
Therefore row(A) has dimension 2, and we can take the two non-zero rows
of R as a basis for row(A), i.e.,

    B = { (1, 0, 8/5)^T, (0, 1, −1/5)^T }

is a basis for row(A). For the column space we can perform column
operations on A or row operations on A^T. If we do the latter, the reduced
row echelon form of

    A^T = [ 3   2   0 ]
          [ −1  1  −5 ]
          [ 5   3   1 ]

is

    R′ = [ 1  0   2 ]
         [ 0  1  −3 ]
         [ 0  0   0 ].

Therefore a basis for col(A) is

    B′ = { (1, 0, 2)^T, (0, 1, −3)^T }.
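A quick numpy cross-check of this example (the rank gives the common dimension of row(A) and col(A), and the basis vectors read off from R and R′ can be verified directly):

    import numpy as np

    A = np.array([[3.0, -1.0, 5.0],
                  [2.0,  1.0, 3.0],
                  [0.0, -5.0, 1.0]])

    print(np.linalg.matrix_rank(A))        # 2 = dim(row(A)) = dim(col(A))

    # First row of A as a combination of the basis of row(A) found above.
    r1, r2 = np.array([1.0, 0.0, 8/5]), np.array([0.0, 1.0, -1/5])
    print(np.allclose(A[0], 3*r1 - 1*r2))  # True

    # First column of A as a combination of the basis of col(A) found above.
    c1, c2 = np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, -3.0])
    print(np.allclose(A[:, 0], 3*c1 + 2*c2))   # True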