
MAT212

Linear Algebra
Department of Mathematics and Natural Sciences (MNS)

Lecture Notes
Contents
0 Textbook References
0.1 Book 1
0.2 Book 2

1 Lecture 1
1.1 The Basic Operations
1.2 Addition of Matrices
1.3 Scalar Multiplication of a matrix by a number
1.4 Matrix Multiplication
1.5 Various identities satisfied by matrix operations
1.6 Block Multiplication

2 Lecture 2
2.1 Fields
2.2 Systems of Linear Equations
2.3 Elementary Row Operations (ERO)

3 Lecture 3
3.1 Row-reduced Echelon matrix

4 Lecture 4
4.1 Interpreting matrix multiplication AB as linear combination of the rows of B
4.2 Invertible matrices
4.3 Algorithm for computation of A−1

5 Lecture 5
5.1 Vector Spaces
5.2 Solution to Problem 3
5.3 Subspaces
5.4 Bases and Dimension

6 Lecture 6
6.1 Bases and Dimension (Continued)
6.2 Coordinates
6.3 Revisiting row-equivalence
6.3.i To summarize
6.3.ii Computations concerning subspaces
6.4 Why does the above method work?
6.4.i Solution (a)
6.4.ii Solution (b)
6.4.iii Solution (c)
6.4.iv Solution (d)

7 Lecture 7
7.1 Linear Transformation
7.2 The Algebra of Linear Transformations
7.2.i Let us address the question

8 Lecture 8
8.0.i Question
8.1 Representation of transformations by Matrices
8.2 Linear Algebra 8th lecture continued
0 Textbook References
§0.1 Book 1
Algebra (2nd Edition) by Michael Artin.

§0.2 Book 2
Linear Algebra (2nd Edition) by Kenneth M Hoffman, Ray Kunze.
1 Lecture 1

§1.1 The Basic Operations


An (m × n) matrix:
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$

The numbers in an $(m \times n)$ matrix are called matrix entries. They are denoted by $a_{ij}$ with
$1 \le i \le m$ and $1 \le j \le n$.

$i$ → row index, $j$ → column index,

so that $a_{ij}$ is the entry which appears in the $i$th row and the $j$th column of the matrix.

Such a matrix is denoted by $A$ or by $(a_{ij})$.

A $(1 \times n)$ matrix is called an $n$-dimensional row vector:
$$A = \begin{pmatrix} a_1 & a_2 & \dots & a_n \end{pmatrix}.$$

An $(m \times 1)$ matrix is called an $m$-dimensional column vector:
$$B = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}.$$

§1.2 Addition of Matrices


$$(a_{ij}) + (b_{ij}) = (s_{ij}), \quad \text{where } s_{ij} = a_{ij} + b_{ij} \ \ \forall\, i, j.$$
Thus,
$$\begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 5 \end{pmatrix} + \begin{pmatrix} 1 & 0 & 3 \\ 4 & -3 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 1 & 3 \\ 5 & 0 & 6 \end{pmatrix}.$$

§1.3 Scalar Multiplication of a matrix by a number


$$c\,(a_{ij}) = (b_{ij}), \quad \text{where } b_{ij} = c\, a_{ij} \ \ \forall\, i, j.$$
Thus,
$$2\begin{pmatrix} 0 & 1 \\ 2 & 3 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 4 & 6 \\ 4 & 2 \end{pmatrix}.$$
Numbers are referred to as scalars.

§1.4 Matrix Multiplication


Two matrices can be multiplied only when their sizes are compatible: the number of columns of
the first matrix must equal the number of rows of the second, i.e., a matrix of
size (m × n) can only be multiplied with a matrix of size (n × l). The resulting matrix will be
of size (m × l).

Here, A = (aij ) ; B = (bij ),

A is of size (m × n) → 1 ≤ i ≤ m ; 1 ≤ j ≤ n,

B is of size (n × l) → 1 ≤ i ≤ n ; 1 ≤ j ≤ l.
$$AB = (c_{ij}), \quad \text{where } c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}, \quad 1 \le i \le m, \ 1 \le j \le l,$$

so that $AB$ is of size $(m \times l)$.

Example 1.1.
$$A = (a_{ij}) = \begin{pmatrix} 0 & -1 & 2 \\ 3 & 4 & -6 \end{pmatrix}, \quad \text{size } (2 \times 3),$$
$$B = (b_{ij}) = \begin{pmatrix} 1 & 2 \\ -1 & 3 \\ 4 & 2 \end{pmatrix}, \quad \text{size } (3 \times 2).$$
$$AB = \begin{pmatrix} 0 & -1 & 2 \\ 3 & 4 & -6 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -1 & 3 \\ 4 & 2 \end{pmatrix}.$$
$AB = (c_{ij})$ will be a $(2 \times 2)$ matrix:
$$AB = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}.$$
C11 = a11 b11 + a12 b21 + a13 b31 = 0 + 1 + 8 = 9,
C12 = a11 b12 + a12 b22 + a13 b32 = 0 − 3 + 4 = 1,
C21 = a21 b11 + a22 b21 + a23 b31 = 3 − 4 − 24 = −25,
C22 = a21 b12 + a22 b22 + a23 b32 = 6 + 12 − 12 = 6.
 
Therefore, $AB = \begin{pmatrix} 9 & 1 \\ -25 & 6 \end{pmatrix}$.
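For readers who want to check such products numerically, the entry-by-entry rule above translates directly into code. The following is a minimal sketch (Python with NumPy, which is not part of these notes) that recomputes Example 1.1; the variable names are chosen here purely for illustration.

```python
import numpy as np

A = np.array([[0, -1, 2],
              [3, 4, -6]])      # size (2 x 3)
B = np.array([[1, 2],
              [-1, 3],
              [4, 2]])          # size (3 x 2)

# Entry-by-entry rule: c_ij = sum over k of a_ik * b_kj
C = np.zeros((A.shape[0], B.shape[1]), dtype=int)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(A.shape[1]))

print(C)                          # [[  9   1]
                                  #  [-25   6]]
print(np.array_equal(C, A @ B))   # True: agrees with NumPy's built-in product
```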
The system of linear equations
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2,\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m,
\end{aligned}$$
can be expressed in matrix notation as $AX = B$, where $A = (a_{ij})$, $1 \le i \le m$, $1 \le j \le n$, is the
$(m \times n)$ coefficient matrix, and
$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad B = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}.$$
$X$ is an $(n \times 1)$ column vector and $B$ is an $(m \times 1)$ column vector.
Example 1.2. The following system of 2 linear equations in 3 unknowns:
$$-x_2 + 2x_3 = 2, \qquad 3x_1 + 4x_2 - 6x_3 = 1,$$
can be represented by the matrix equation:
$$\begin{pmatrix} 0 & -1 & 2 \\ 3 & 4 & -6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$

§1.5 Various identities satisfied by matrix operations


Distributive Law:
$$A\,(B + B') = AB + AB'; \quad \text{and} \quad (A + A')\,B = AB + A'B. \tag{1.1}$$

Associative Law:
$$(AB)\,C = A\,(BC), \tag{1.2}$$

as long as the matrices involved are of suitable sizes, so that the products are defined. For
example, in Equation 1.2, one requires $A$ to be of size $(m \times n)$, $B$ to be of size $(n \times l)$ and $C$ to
be of size $(l \times p)$, so that the two matrices on both sides of Equation 1.2 are of size $(m \times p)$.
Example 1.3. With
$$A = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 1 \end{pmatrix}, \quad C = \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 0 & 1 \end{pmatrix},$$
$$(AB)\,C = \begin{pmatrix} 1 & 0 & 1 \\ 2 & 0 & 2 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix},$$
while
$$A\,(BC) = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\begin{pmatrix} 2 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}.$$
Scalar multiplication is compatible with matrix multiplication in the obvious way:
$$c\,(AB) = (cA)\,B = A\,(cB). \tag{1.3}$$

The commutativity law does not hold, in general, for matrix multiplication:
$$AB \ne BA,$$
whether $A$ is $(m \times n)$ and $B$ is $(n \times m)$ (so that both products are defined but have different sizes), or even when both are square matrices of the same size.

• 0m×n denotes a zero matrix (all the entries are 0) of size (m × n).
• The square (n × n) matrix whose only non zero entries are 1 in each diagonal position is
called the (n × n) identity matrix and is denoted by In .
• If A is an (m × n) matrix, then $I_m A = A$ and $A I_n = A$, with
$$I_m = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix} \quad (m \text{ rows and } m \text{ columns}).$$

Let A be a square (n × n) matrix. If there is a matrix B such that
$$AB = I_n, \qquad BA = I_n, \tag{1.4}$$
then B is called the inverse of A and is denoted by $A^{-1}$:
$$A^{-1}A = I_n = AA^{-1}. \tag{1.5}$$
When A has an inverse, it is said to be an invertible matrix.

Example 1.4. The matrix $A = \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}$ is invertible with inverse $A^{-1} = \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}$. Indeed,
$$\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}\begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}.$$
Let us quickly check the fact that an inverse is unique if it exists at all.

Let $B$ and $B'$ be two matrices satisfying Equation 1.4 for the same matrix $A$. One has
$$AB = I_n \quad \text{and} \quad B'A = I_n. \tag{1.6}$$
By the associativity law in Equation 1.2, one obtains:
$$B' = B'I_n = B'(AB) = (B'A)\,B = I_nB = B$$
[using Equation 1.6]. Uniqueness proved.

Proposition 1.1
Let $A, B$ be $(n \times n)$ matrices. If both are invertible, so is their product $AB$, and
$$(AB)^{-1} = B^{-1}A^{-1}.$$
More generally, if $A_1, \dots, A_m$ are invertible, then so is the product $A_1 \cdots A_m$, and its inverse
is $A_m^{-1} \cdots A_1^{-1}$.

Proof. Assume that $A$ and $B$ are invertible. We check that the inverse of $AB$ is $B^{-1}A^{-1}$:
$$(AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AIA^{-1} = AA^{-1} = I,$$
and similarly,
$$\left(B^{-1}A^{-1}\right)(AB) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I.$$

For the general statement we use induction on $m$. When $m = 1$, it asserts that the inverse of $A_1$ is $A_1^{-1}$; this holds trivially.

Next, we assume that the assertion holds for $m = k$; we have to show that it holds for $m = k + 1$.

Suppose that $A_1, A_2, \dots, A_{k+1}$ are invertible $(n \times n)$ matrices, and denote by $P$ the product of the first $k$ matrices, i.e.,
$$P = A_1 A_2 \cdots A_k. \tag{1.7}$$
Its inverse is
$$P^{-1} = A_k^{-1} \cdots A_1^{-1} \quad \text{[from the induction hypothesis]}. \tag{1.8}$$
Also $A_{k+1}$ is invertible. Therefore,
$$(P A_{k+1})^{-1} = A_{k+1}^{-1} P^{-1},$$
that is,
$$(A_1 \cdots A_k A_{k+1})^{-1} = A_{k+1}^{-1} A_k^{-1} \cdots A_1^{-1} \quad \text{[using (1.7) and (1.8)]}.$$

This completes the proof. ■
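The identity $(AB)^{-1} = B^{-1}A^{-1}$ is easy to test numerically. Below is a small sketch, assuming Python with NumPy is available (not part of these notes); it draws two random $4 \times 4$ matrices, which are invertible with probability 1, and compares both sides.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 4))
B = rng.random((4, 4))
# A singular draw (essentially impossible here) would raise LinAlgError below.

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))      # True

# The "wrong" order A^{-1} B^{-1} generally differs:
print(np.allclose(lhs, np.linalg.inv(A) @ np.linalg.inv(B)))   # False in general
```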

§1.6 Block Multiplication



We may decompose matrices into blocks to facilitate matrix multiplication. Let $M$, $M'$ be
$(m \times n)$ and $(n \times p)$ matrices, and let $r$ be an integer less than $n$.

We decompose $M$ and $M'$ into blocks as follows:
$$M = \begin{pmatrix} A & B \end{pmatrix} \quad \text{and} \quad M' = \begin{pmatrix} A' \\ B' \end{pmatrix},$$
where $A$ has $r$ columns and $A'$ has $r$ rows. Then the matrix product can be computed
as follows:
$$MM' = AA' + BB'. \tag{1.9}$$
Example 1.5.
$$\begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & 7 \end{pmatrix}\begin{pmatrix} 2 & 3 \\ 4 & 8 \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 3 \\ 4 & 8 \end{pmatrix}
+ \begin{pmatrix} 5 \\ 7 \end{pmatrix}\begin{pmatrix} 0 & 0 \end{pmatrix}
= \begin{pmatrix} 2 & 3 \\ 4 & 8 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} 2 & 3 \\ 4 & 8 \end{pmatrix}.$$

We may also multiply matrices divided into more blocks. For our purpose, a decomposition
into four blocks will be the most useful.

In this case, the rule for block multiplication is the same as for multiplication of 2 × 2
matrices.

Let $r + s = n$ and $k + l = m$. Suppose we decompose an $(m \times n)$ matrix $M$ and an $(n \times p)$
matrix $M'$ into submatrices:
$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$
where $A$ and $B$ have $k$ rows, $C$ and $D$ have $l$ rows, while $A$ and $C$ have $r$ columns and $B$ and $D$ have $s$ columns;
$$M' = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix},$$
where $A'$ and $B'$ have $r$ rows and $C'$ and $D'$ have $s$ rows. Then
$$MM' = \begin{pmatrix} AA' + BC' & AB' + BD' \\ CA' + DC' & CB' + DD' \end{pmatrix}.$$
Example 1.6. Take
$$M = \begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & 7 \end{pmatrix} \quad (\text{columns split } 2 + 1), \qquad
M' = \begin{pmatrix} 2 & 3 & 1 & 1 \\ 4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{pmatrix} \quad (\text{rows split } 2 + 1).$$
Then
$$MM' = \begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & 7 \end{pmatrix}\begin{pmatrix} 2 & 3 & 1 & 1 \\ 4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{pmatrix}
= \begin{pmatrix} 2 & 8 & 6 & 1 \\ 4 & 8 & 7 & 0 \end{pmatrix}.$$

This can also be performed with a four-block decomposition, splitting the columns of $M$ as $2 + 1$, the rows of $M'$ as $2 + 1$, and the columns of $M'$ as $3 + 1$:
$$\left(\begin{array}{cc|c} 1 & 0 & 5 \\ 0 & 1 & 7 \end{array}\right)
\left(\begin{array}{ccc|c} 2 & 3 & 1 & 1 \\ 4 & 1 & 0 & 0 \\ \hline 0 & 1 & 1 & 0 \end{array}\right)
= \left(\begin{array}{ccc|c} 2 & 8 & 6 & 1 \\ 4 & 8 & 7 & 0 \end{array}\right).$$
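A quick numerical sanity check of equation 1.9 can be done by slicing the matrices of Example 1.6 into the stated blocks. The sketch below assumes Python with NumPy (not part of the notes); the names `M`, `Mp`, `A`, `B`, `Ap`, `Bp` are ad hoc.

```python
import numpy as np

M = np.array([[1, 0, 5],
              [0, 1, 7]])                    # (2 x 3), columns split as 2 + 1
Mp = np.array([[2, 3, 1, 1],
               [4, 1, 0, 0],
               [0, 1, 1, 0]])                # (3 x 4), rows split as 2 + 1

A, B = M[:, :2], M[:, 2:]                    # A is (2 x 2), B is (2 x 1)
Ap, Bp = Mp[:2, :], Mp[2:, :]                # A' is (2 x 4), B' is (1 x 4)

print(np.array_equal(M @ Mp, A @ Ap + B @ Bp))   # True: M M' = A A' + B B'
print(M @ Mp)                                    # [[2 8 6 1]
                                                 #  [4 8 7 0]]
```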

Exercise: Basic Operations (Page 31 - From the book by Artin).

10 PROBLEMS: 2 (a) , 3, 4, 5, 7, 8, 11, 12, 16, 19.

For 5 :    
1 a 1 a
= .
1 0 1

We will discuss the notion of a field and study examples in the next lecture following the text
book by Hoffman and Kunze.
2 Lecture 2
Last lecture was about definition and basic operations involved in matrices. But we didn’t
talk about the entries of the matrices. The entries of the matrices belong to a certain structured
set called field.

§2.1 Fields
1. Addition is commutative: x + y = y + x, ∀ x, y ∈ F.
2. Addition is associative: x + (y + z) = (x + y) + z, ∀ x, y, z ∈ F.
3. There is a unique element 0 (zero) in F s.t. x + 0 = x, ∀ x ∈ F.
4. To each x in F , there corresponds a unique element (−x) ∈ F s.t. x + (−x) = 0.
5. Multiplication is commutative: x.y = y.x, ∀ x, y ∈ F.
6. Multiplication is associative: x. (y.z) = (x.y) .z, ∀ x, y, z ∈ F.
7. There is a unique non-zero element 1 (one) in F such that x.1 = x, ∀ x ∈ F.
8. To each non-zero $x \in F$, there corresponds a unique element $x^{-1}$ (i.e., $\tfrac{1}{x}$) in $F$ such that $x\,x^{-1} = 1$.


9. Multiplication distributes over addition. i.e; x. (y + z) = x.y + x.z, ∀ x, y, z ∈ F.

Suppose one has a set F and there are 2 defined operations on the elements of F , namely + and .

The first operation +, called addition associates with each pair of elements x, y ∈ F an
element (x + y) ∈ F .

The second operation . called multiplication associates each pair x, y an element x.y ∈ F ;
and these two operations satisfy conditions (1) − (9) above. For convenience, we will drop
the multiplication notation . between elements of F , i.e, by simply xy we will mean x.y, given
x, y ∈ F.

The set F together with these two operations (+, .) is called a field.

Example 2.1. The set of complex numbers denoted by C is a field with respect to the standard
addition and multiplication of complex numbers:

$$(a + ib) + (c + id) = (a + c) + i\,(b + d),$$
$$(a + ib)\,(c + id) = (ac - bd) + i\,(bc + ad).$$

(Check!)
• A subfield of C is a set F of complex numbers which is itself a field under the usual
operations of addition and multiplication of complex numbers. This means that 0 and 1
are in the set F , and that if x, y ∈ F , then (x + y) , −x, xy, and x−1 (x ̸= 0) are also in F .

An example of such a subfield of C is the field of real numbers denoted by R.



• The set of positive integers 1, 2, 3, . . . is not a subfield of C, for a variety of reasons. For
example, the additive identity 0 is not a positive integer. Also, given a positive integer
n, its additive inverse −n is not a positive integer, and if n is any positive integer other
than 1, then its multiplicative inverse 1/n is not a positive integer.
• The set of integers Z is not a subfield of C, because for n ∈ Z \ {−1, 0, 1}, its multiplicative
inverse 1/n is not in Z.
• The set of rational numbers Q is a subfield of C. (Check!) .

• The set of all complex numbers (real numbers, in fact) of the form $x + y\sqrt{2}$, where $x, y \in \mathbb{Q}$,
is a subfield of $\mathbb{C}$. (Exercise.) This field is denoted by $\mathbb{Q}\left(\sqrt{2}\right)$.
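To see concretely why $\mathbb{Q}(\sqrt{2})$ is closed under the field operations, one can represent $x + y\sqrt{2}$ by the pair of rationals $(x, y)$. The sketch below (plain Python, not part of the notes) implements the product and the multiplicative inverse and checks that both again yield pairs of rationals.

```python
from fractions import Fraction

# Represent a + b*sqrt(2), with a and b rational, as the pair (a, b).
def mul(p, q):
    """(a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2."""
    a, b = p
    c, d = q
    return (a * c + 2 * b * d, a * d + b * c)

def inv(p):
    """1/(a + b√2) = (a - b√2)/(a^2 - 2b^2); the denominator is a rational
    number and is non-zero whenever (a, b) != (0, 0), since √2 is irrational."""
    a, b = p
    n = a * a - 2 * b * b
    return (a / n, -b / n)

x = (Fraction(1, 2), Fraction(3))      # 1/2 + 3√2
y = (Fraction(-2), Fraction(1, 5))     # -2 + (1/5)√2

print(mul(x, y))        # (Fraction(1, 5), Fraction(-59, 10)) -- still rational
print(mul(x, inv(x)))   # (Fraction(1, 1), Fraction(0, 1)),   i.e. the element 1
```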

§2.2 Systems of Linear Equations


Suppose F is a field. The problem that concerns us now is to find an n − tuple of scalars
(elements of F).
(x1 , x2 , . . . , xn ) ∈ Fn which satisfies the conditions:

A11 x1 + A12 x2 + · · · + A1n xn = y1 ,


A21 x1 + A22 x2 + · · · + A2n xn = y2 ,
.. .. .. (2.1)
. . .
Am1 x1 + Am2 x2 + · · · + Amn xn = ym .

where y1 , . . . , ym and Aij , 1 ≤ i ≤ m, 1 ≤ j ≤ n, are given elements of F .

We call equation 2.1 a system of m linear equations in n unknowns.

Any n tuple (x1 , x2 , . . . , xn ) of elements of F that satisfies each of the m equations in equation
2.1 is called a solution of the system. If y1 = y2 = · · · = ym = 0 in equation 2.1, we say that
the system is homogeneous.

Consider multiplying the m linear equations of equation 2.1 by the constants C1 , C2 , . . . , Cm


respectively and adding them to form a new linear equation:

C1 (A11 x1 + · · · + A1n xn ) + C2 (A21 x1 + · · · + A2n xn ) + . . . . . . . . . + Cm (Am1 x1 + · · · + Amn xn ) .


= C1 y1 + C2 y2 + · · · + Cm ym . (2.2)

=⇒ (C1 A11 + C2 A21 + · · · + Cm Am1 ) x1 + · · · + (C1 A1n + C2 A2n + · · · + Cm Amn ) xn .

= C1 y1 + C2 y2 + · · · + Cm ym . (2.3)

Equation 2.3 represents the new linear equation in n unknowns with coefficients:

(C1 A11 + C2 A21 + · · · + Cm Am1 ) , . . . , (C1 A1n + C2 A2n + · · · + Cm Amn ) ,


and the inhomogeneity term:

C1 y1 + C2 y2 + · · · + Cm ym .

Equation 2.3 is called a linear combination of the linear equations in equation 2.1. If the n
tuple (x1 , x2 , . . . , xn ) solves the system 2.1, then it solves the linear equation 2.3 as well which
is evident from equation 2.2. Now, if we have the following system of linear equations:

B11 x1 + · · · + B1n xn = z1 ,
.. .. .. (2.4)
. . .
Bk1 x1 + · · · + Bkn xn = zk .

where each linear equation in 2.4 is a linear combination of the linear equations in 2.1. One
then immediately finds that a solution of 2.1 is going to solve the system 2.4 as well. But the
system 2.4 may have solutions that don’t solve 2.1.

Now if every equation in the system 2.1 can be expressed as a linear combination of the linear
equations in 2.4, then any solution of 2.4 will also solve 2.1. In such a situation, the system 2.1
is said to be equivalent to system 2.4.

Two systems of linear equations are said to be equivalent if each equation in either system can
be expressed as a linear combination of the equations in the other system. This fact is expressed
formally using the following theorem:

Theorem 2.1
Equivalent systems of linear equations have exactly the same solutions.

Exercises (Hoffman and Kunze) - (Page: 5) - Problem: 2, 3, 4, 5 and 6.

§2.3 Elementary Row Operations (ERO)


Expressing an equation of a system as a linear combination of equations from another system
is cumbersome as one sees while working with the simple systems introduced in the exercises
above. To this end, one therefore defines elementary row operations.

Get back to the system 2.1 and write it as the following matrix equation:

AX = Y, (2.5)

where
$$A = \begin{pmatrix} A_{11} & \dots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \dots & A_{mn} \end{pmatrix}; \qquad
X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \quad \text{and} \quad Y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}.$$

We call A the matrix of coefficients.


Here, the entries of A and those of Y take values in F . We say that A and Y are (m × n) and
(m × 1) matrices over the field F , respectively.

Definition 2.1 (Elementary Row Operation). An elementary row operation is a special type
of function e that associates with each (m × n) matrix A an (m × n) matrix e(A), in one of the following
3 possible ways:
1. e (A)ij = Aij if i ̸= r, e (A)rj = cArj . [Multiplication of a row of A by a non-zero
scalar (element from F ) c].
2. e (A)ij = Aij if i ̸= r, e (A)rj = Arj + cAsj . [Replacement of the rth row of A by row
r plus c times row s, c ∈ F and r ̸= s].
3. e (A)ij = Aij if i ̸= r and i ̸= s, e (A)rj = Asj , e (A)sj = Arj . [Interchange of 2 rows
of A].

So there are 3 basic types of elementary row operations.
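As an illustration (not part of the original notes), the three types of EROs can be written as small functions acting on NumPy arrays; the inverse operation promised by the next theorem (Theorem 2.2) can then be checked by composing each operation with its inverse.

```python
import numpy as np

# The three elementary row operations, each returning a new matrix e(A).
def scale_row(A, r, c):
    """Type 1: multiply row r by a non-zero scalar c."""
    B = A.astype(float)
    B[r] = c * B[r]
    return B

def add_multiple(A, r, s, c):
    """Type 2: replace row r by row r plus c times row s (r != s)."""
    B = A.astype(float)
    B[r] = B[r] + c * B[s]
    return B

def swap_rows(A, r, s):
    """Type 3: interchange rows r and s."""
    B = A.astype(float)
    B[[r, s]] = B[[s, r]]
    return B

A = np.array([[2.0, -1.0, 3.0],
              [1.0, 4.0, 0.0]])
# Each operation is undone by an operation of the same type (Theorem 2.2):
print(np.allclose(scale_row(scale_row(A, 0, 5.0), 0, 1 / 5.0), A))            # True
print(np.allclose(add_multiple(add_multiple(A, 1, 0, 2.0), 1, 0, -2.0), A))   # True
print(np.allclose(swap_rows(swap_rows(A, 0, 1), 0, 1), A))                    # True
```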

Theorem 2.2
To each elementary row operation e, there corresponds an elementary row operation e1 , of
the same type as e, such that e1 (e (A)) = e (e1 (A)) = A for each A. In other words, the
inverse operation(function) of an elementary row operation exists and is an elementary row
operation of the same type.

Proof. We prove the result for the 3 basic types of elementary row operations separately:
1. Define e1 by e1 (A)ij = Aij if i ̸= r; e1 (A)rj = c−1 Arj ; c ∈ F \ {0},
 
Then e1 (e (A))ij = e (A)ij = Aij if i ̸= r; e1 (e (A))rj = c−1 e (A)rj = c−1 .c Arj = Arj .

∴ e1 (e (A)) = A.

Similarly, one can show that e (e1 (A)) = A for type 1 of elementary row operations
e.
2. Define e1 by e1 (A)ij = Aij , i ̸= r; e1 (A)rj = Arj − cAsj where r ̸= s and c ∈ F.

Then e1 (e (A))ij = e (A)ij = Aij if i ̸= r and e1 (e (A))rj = e (A)rj − ce (A)sj ,


$= (A_{rj} + cA_{sj}) - cA_{sj} = A_{rj}$, since $e(A)_{sj} = A_{sj}$ (as $s \ne r$) and $e(A)_{rj} = A_{rj} + cA_{sj}$.

∴ e1 (e (A)) = A.

Similarly, for the type 2 of EROs, one can show that e (e1 (A)) = A.
3. Define e1 by e1 (A)ij = Aij , for i ̸= r and i ̸= s; e1 (A)rj = Asj and e1 (A)sj = Arj .

In other words, e1 = e,

∴ e1 (e (A))ij = e (A)ij = Aij for i ̸= r and i ̸= s;

e1 (e (A))rj = e (A)sj = Arj and e1 (e (A))sj = e (A)rj = Asj .

∴ e1 (e (A)) = A.

It’s obvious that e (e1 (A)) = A as e1 = e in this case of type 3.




Definition 2.2. If A and B are m×n matrices over the field F, B is said to be row-equivalent
to A if B can be obtained from A by a finite sequence of elementary row operations.

Theorem 2.3
If A and B are row-equivalent m × n matrices over F , the homogeneous systems of linear
equations AX = 0 and BX = 0 have exactly the same solutions.

Proof. Suppose we pass from A to B by a finite sequence of EROs:

A = A0 → A1 → · · · → Ak = B.

It is enough to prove that Aj X = 0 and Aj+1 X = 0 have the same solutions. i.e; one ele-
mentary row operation doesn’t perturb the set of solutions.

So, suppose the B is obtained from A by a single ERO. Irrespective of the type of EROs
performed, each equation in the system BX = 0 will be a linear combination of the equations
in the system AX = 0.

For instance, consider type 2 ERO:

B = e (A) =⇒ Bij = e (A)ij = Aij if i ̸= r, i.e; for i ̸= r, one has:


[Bi1 . . . Bin ] = 0 [A11 . . . A1n ] + 0 [A21 . . . A2n ] + · · · + 1 [Ai1 · · · + Ain ] + · · · + 0 [Am1 . . . Amn ] .

Also, Brj = e (A)rj = Arj + cAsj where s ̸= r and c ∈ F.

[Br1 . . . Brn ] = 0 [A11 . . . A1n ] + · · · + 1 [Ar1 . . . Arn ] + · · · + c [As1 . . . Asn ] + · · · + 0 [Am1 . . . Amn ] .

Therefore, each equation of the system BX = 0 can be expressed as a linear combination


of the equations in the system AX = 0 if B = e (A) if e is of type 2. One can prove the result
for the other 2 types of EROs as well.

Now, if B = e(A) with e being one of the 3 basic types of EROs, then A = e1(B), where e1
orem 2.2. Therefore, A can be obtained from B by an elementary row operation. Therefore,
using the arguments presented earlier in the proof, each equation of the system AX = 0 can
be expressed as a linear combination of the equations in the system BX = 0.

Therefore, each equation of the system BX = 0 can be expressed as a linear combination


of the equations in the system AX = 0 and vice-versa. And hence by Theorem 2.1 the systems
AX = 0 and BX = 0 have the same solutions. ■
Example 2.2. Suppose F is the field of rational numbers and
 
$$A = \begin{pmatrix} 2 & -1 & 3 & 2 \\ 1 & 4 & 0 & -1 \\ 2 & 6 & -1 & 5 \end{pmatrix}.$$

We shall apply a finite sequence of EROs on $A$. Applying $-2R_2 + R_1 = R_1'$ and $-2R_2 + R_3 = R_3'$ (type 2):
$$\begin{pmatrix} 0 & -9 & 3 & 4 \\ 1 & 4 & 0 & -1 \\ 0 & -2 & -1 & 7 \end{pmatrix}.$$
Applying $-\tfrac{1}{9}R_1 = R_1'$ (type 1):
$$\begin{pmatrix} 0 & 1 & -\tfrac{1}{3} & -\tfrac{4}{9} \\ 1 & 4 & 0 & -1 \\ 0 & -2 & -1 & 7 \end{pmatrix}.$$
Applying $-4R_1 + R_2 = R_2'$ and $2R_1 + R_3 = R_3'$ (type 2):
$$\begin{pmatrix} 0 & 1 & -\tfrac{1}{3} & -\tfrac{4}{9} \\ 1 & 0 & \tfrac{4}{3} & \tfrac{7}{9} \\ 0 & 0 & -\tfrac{5}{3} & \tfrac{55}{9} \end{pmatrix}.$$
Applying $-\tfrac{3}{5}R_3 = R_3'$ (type 1):
$$\begin{pmatrix} 0 & 1 & -\tfrac{1}{3} & -\tfrac{4}{9} \\ 1 & 0 & \tfrac{4}{3} & \tfrac{7}{9} \\ 0 & 0 & 1 & -\tfrac{11}{3} \end{pmatrix}.$$
Applying $-\tfrac{4}{3}R_3 + R_2 = R_2'$ and $\tfrac{1}{3}R_3 + R_1 = R_1'$ (type 2):
$$\begin{pmatrix} 0 & 1 & 0 & -\tfrac{5}{3} \\ 1 & 0 & 0 & \tfrac{51}{9} \\ 0 & 0 & 1 & -\tfrac{11}{3} \end{pmatrix}.$$

Therefore, by means of Theorem 2.3, one finds that the homogeneous system of linear equations:

2x1 − x2 + 3x3 + 2x4 = 0,

x1 + 4x2 − x4 = 0,
2x1 + 6x2 − x3 + 5x4 = 0,

and the one given by:
$$x_2 - \tfrac{5}{3}x_4 = 0, \qquad x_1 + \tfrac{51}{9}x_4 = 0, \qquad x_3 - \tfrac{11}{3}x_4 = 0,$$

have the same solutions.
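If SymPy is available, the whole reduction of Example 2.2 can be checked in one call; `Matrix.rref` works exactly over the rationals, so no rounding occurs. This is only a verification aid, not part of the notes; note that 17/3 = 51/9.

```python
from sympy import Matrix

A = Matrix([[2, -1, 3, 2],
            [1, 4, 0, -1],
            [2, 6, -1, 5]])

R, pivots = A.rref()     # row-reduced echelon form over the rationals
print(R)
# Matrix([[1, 0, 0,  17/3],
#         [0, 1, 0,  -5/3],
#         [0, 0, 1, -11/3]])
print(pivots)            # (0, 1, 2): the leading 1s sit in the first three columns
```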


Example 2.3. Suppose $F$ is the field of complex numbers $\mathbb{C}$ and
$$A = \begin{pmatrix} -1 & i \\ -i & 3 \\ 1 & 2 \end{pmatrix}.$$
Applying $R_3 + R_1 = R_1'$ and $iR_3 + R_2 = R_2'$ (type 2):
$$\begin{pmatrix} 0 & 2+i \\ 0 & 3+2i \\ 1 & 2 \end{pmatrix}.$$
Applying $\tfrac{1}{2+i}R_1 = R_1'$ (type 1), followed by $-(3+2i)R_1 + R_2 = R_2'$ and $-2R_1 + R_3 = R_3'$ (type 2):
$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 1 & 0 \end{pmatrix}.$$
Hence by Theorem 2.3, the homogeneous system
$$-x_1 + ix_2 = 0, \qquad -ix_1 + 3x_2 = 0, \qquad x_1 + 2x_2 = 0,$$
and the one given by
$$x_2 = 0, \qquad x_1 = 0,$$
have the same solutions, i.e., the former system admits the trivial solution only.
Remark. In Example 2.2 and Example 2.3, the EROs we performed aren't random. The aim
was to put the coefficient matrix in a "desired form". This "desired form" is defined as follows:

Definition 2.3. An (m × n) matrix R is called row-reduced if:


(a) The first non-zero entry in each non-zero row of R is equal to 1.
(b) Each column of R which contains the leading non-zero entry (1) of some row has all
its other entries 0.

Example 2.4. The (n × n) identity matrix I is row-reduced.


$$I_{ij} = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$

The following two matrices are not row-reduced:
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & 2 & 1 \\ 1 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix}.$$

In the first matrix, the leading 1 of the third row lies in the third column, which also contains
another non-zero entry (the −1 in row 2); hence part (b) of the definition fails.

In the second matrix, the first row is a non-zero row whose first non-zero entry is 2, not 1. Hence part (a) of the
definition fails and the matrix in question is not row-reduced.

Theorem 2.4
Every (m × n) matrix over the field F is row-equivalent to a row-reduced matrix.

Proof. Let $A$ be an $(m \times n)$ matrix over $F$. If every entry of the first row is zero, then condition
(a) for a row-reduced matrix is satisfied as far as row 1 of $A$ is concerned. If row 1 has a non-zero
entry, let $k$ ($1 \le k \le n$) be the least positive integer $j$ for which $A_{1j} \ne 0$. Multiply row 1 by $A_{1k}^{-1}$
so that the leading non-zero entry of the first row of $A$ becomes 1. Now, for $i \ge 2$, perform the
elementary row operation $-A_{ik}R_1 + R_i = R_i'$ so that column $k$, which contains the leading 1 of the
first row, has all its other entries zero, as required by condition (b) for a row-reduced matrix.
The resulting matrix is of the following form, with the 1 of row 1 sitting in column $k$:
$$\begin{pmatrix}
0 & \cdots & 0 & 1 & * & \cdots & * \\
* & \cdots & * & 0 & * & \cdots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
* & \cdots & * & 0 & * & \cdots & *
\end{pmatrix}.$$

Here $*$ denotes unknown entries. If every entry in row 2 of the matrix above is zero, i.e., all the
$*$'s in row 2 are zero, we do nothing to it.

If some of the $*$'s in row 2 are non-zero, we find the least positive integer value $k'$ of $j$
for which $A_{2j}$ is non-zero. We multiply the second row by $A_{2k'}^{-1}$ to obtain the leading 1 in the
second row. We also observe that $k' \ne k$, since there can be no non-zero entry in the $k$-th column
of the second row. At this stage, the matrix reduces to one of the following two forms, depending
on whether $k' < k$ or $k' > k$:


Case $k' < k$ (the leading 1 of row 2 lies to the left of column $k$):
$$\begin{pmatrix}
0 & \cdots & 0 & \cdots & \cdots & 0 & 1 & * & \cdots & * \\
0 & \cdots & 1 & * & \cdots & * & 0 & * & \cdots & * \\
* & \cdots & * & * & \cdots & * & 0 & * & \cdots & * \\
\vdots & & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
* & \cdots & * & * & \cdots & * & 0 & * & \cdots & *
\end{pmatrix},$$
where the 1 of row 1 sits in column $k$ and the 1 of row 2 sits in column $k'$; the remaining $*$'s of
column $k'$ need to be made zero by the EROs $-A_{ik'}R_2 + R_i = R_i'$ for $i = 3, \dots, m$.

Case $k' > k$ (the leading 1 of row 2 lies to the right of column $k$):
$$\begin{pmatrix}
0 & \cdots & 0 & 1 & * & \cdots & * & * & \cdots & * \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1 & * & \cdots & * \\
* & \cdots & * & 0 & * & \cdots & * & * & \cdots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
* & \cdots & * & 0 & * & \cdots & * & * & \cdots & *
\end{pmatrix},$$
where again the 1 of row 1 sits in column $k$ and the 1 of row 2 sits in column $k'$; here the $*$'s of
column $k'$ need to be made zero by the EROs $-A_{ik'}R_2 + R_i = R_i'$ for $i = 1, 3, 4, \dots, m$.

In both cases, all the non-zero entries in column $k'$ except for the leading 1 of row 2 can be
made zero by performing the following ERO:
$$-A_{ik'}R_2 + R_i = R_i'. \tag{2.6}$$

This last operation, given by 2.6, does not alter the zero entries appearing before the leading 1s
in rows 1 and 2. Also, the $k$-th column remains unaltered by the ERO 2.6, as can easily be
verified using the figures above.

Working with one row at a time (as done above for row 1 and 2) in the above manner, it
is clear that we will arrive at a row-reduced matrix after a finite number of steps. ■
Exercises (Hoffman and Kunze) (Page: 10) Problem: 1, 2, 4, 5.

Next Lecture → Row-reduced echelon matrices.


3 Lecture 3

§3.1 Row-reduced Echelon matrix


An (m × n) matrix R is called a row-reduced echelon matrix if:
(a) R is row-reduced.
(b) All rows of R having all its entries 0 occur at the bottom of the matrix.
(c) If rows 1, . . . , r are the non-zero rows of R, and if the leading 1 of row i occurs in column
ki , i = 1, 2, . . . , r, then k1 < k2 < · · · < kr .
Example 3.1. Two examples of row-reduced echelon matrices are the (n × n) identity matrix
and the (m × n) zero matrix $0_{m,n}$, in which all the entries are 0. The following matrix is also
a row-reduced echelon matrix:
$$\begin{pmatrix} 0 & 1 & -3 & 0 & \tfrac{1}{2} \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

In this example, k1 = 2 and k2 = 4, where 2 < 4.

Theorem 3.1
Every (m × n) matrix A is row equivalent to a row-reduced echelon matrix.

Proof. We know that A is row equivalent to a row-reduced matrix from Theorem 2.4 of
Lecture 2.

By a finite number of type 3 EROs (interchanging rows of a row-reduced matrix), one can
bring all the zero-rows (if any) at the bottom of the matrix. Also, by applying finite number of
interchange of rows on a row-reduced matrix, one can achieve the 3rd property of a row-reduced
echelon matrix in a row-reduced matrix. ■
In lecture 2, we’ve seen importance of row-reduced matrices in solving homogeneous system of
linear equations.

Let us now focus on the system RX = 0, where R is a row-reduced echelon matrix. Let
rows, 1, 2, . . . , r be the non-zero rows of R, and suppose that the leading 1 in the ith row occurs
at the ki th column, where 1 ≤ i ≤ r. The system RX = 0 then consists of r non-trivial equa-
tions. Note that the unknown xki will appear only in the ith equation of the system. We call
xki ’s the leading variables. The (n − r) unknowns which are different from xk1 , xk2 , . . . , xkr , are
denoted by u1 , u2 , . . . , un−r . We call these variables free variables.

Using the leading and free variables, one can write the r non-trivial equations in RX = 0
in the following way:
$$x_{k_1} + \sum_{j=1}^{n-r} c_{1j}\,u_j = 0, \qquad
x_{k_2} + \sum_{j=1}^{n-r} c_{2j}\,u_j = 0, \qquad \dots, \qquad
x_{k_r} + \sum_{j=1}^{n-r} c_{rj}\,u_j = 0. \tag{3.1}$$

Here cij ’s aren’t to be confused with the entries of R. But the cij ’s for i = 1, 2, . . . , r and
j = 1, 2, . . . , n − r can be read off from the matrix R as we will see using examples.

All the solutions of the system RX = 0 are obtained by assigning arbitrary values to $u_1, u_2, \dots, u_{n-r}$,
and hence these are called the free variables. The leading variables $x_{k_1}, \dots, x_{k_r}$ are computed in
terms of the free variables $u_1, u_2, \dots, u_{n-r}$ using equation 3.1.

Let us revisit Example 3.1. Here the row-reduced echelon matrix R is
$$R = \begin{pmatrix} 0 & 1 & -3 & 0 & \tfrac{1}{2} \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
The system RX = 0 reads in this case as:
$$x_2 - 3x_3 + \tfrac{1}{2}x_5 = 0 \quad \text{or} \quad x_2 = 3x_3 - \tfrac{1}{2}x_5, \qquad
x_4 + 2x_5 = 0 \quad \text{or} \quad x_4 = -2x_5.$$
Let us assign arbitrary values to the free variables, $x_1 = a$, $x_3 = b$ and $x_5 = c$, and obtain the
solution
$$\left(a,\; 3b - \tfrac{c}{2},\; b,\; -2c,\; c\right).$$
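A quick numerical check (Python with NumPy, not part of the notes) confirms that every choice of the free variables $x_1 = a$, $x_3 = b$, $x_5 = c$ produces a solution of $RX = 0$:

```python
import numpy as np

R = np.array([[0, 1, -3, 0, 0.5],
              [0, 0, 0, 1, 2.0],
              [0, 0, 0, 0, 0.0]])

def solution(a, b, c):
    """General solution of RX = 0 with free variables x1 = a, x3 = b, x5 = c."""
    return np.array([a, 3 * b - c / 2, b, -2 * c, c])

# Any choice of the free variables gives a solution:
for a, b, c in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (2.0, -3.0, 4.0)]:
    print(np.allclose(R @ solution(a, b, c), 0))    # True each time
```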
Remark. Let us focus back on the system RX = 0. If the number r of non-zero rows in R is less
than n, then the system RX = 0 has a non-trivial solution, that is, a solution $(x_1, x_2, \dots, x_n)$
in which not every $x_j$ is 0. This is because, since $r < n$, one can choose the $(n - r)$ variables
$u_1, u_2, \dots, u_{n-r}$ arbitrarily and write $x_{k_1}, x_{k_2}, \dots, x_{k_r}$ in terms of $u_1, u_2, \dots, u_{n-r}$; the $u_j$'s are
precisely those $x_j$'s in $(x_1, x_2, \dots, x_n)$ which are different from $x_{k_1}, x_{k_2}, \dots, x_{k_r}$.
We now have the following theorem based on the remark above:

Theorem 3.2
If A is an (m × n) matrix and m < n, then the homogeneous system of linear equations
AX = 0 has a non-trivial solution.

Proof. Let R be a row-reduced echelon matrix which is row equivalent to A. Then the systems
AX = 0 and RX = 0 have the same solutions by Theorem 2.3 of Lecture 2. If r is the number
of non-zero rows in R, then certainly r ≤ m. Since m < n by hypothesis, r < n. Then by
the remark above RX = 0 has a non-trivial solution. Therefore, AX = 0 has a non-trivial
solution. ■

Theorem 3.3
If A is an (n × n) matrix, then A is row equivalent to the (n × n) identity matrix if and
only if the system of equations AX = 0 has only the trivial solution.

Proof. (⇒) Let A be row equivalent to the (n × n) identity matrix I, then by Theorem 2.3 of
lecture 2, AX = 0 and IX = 0 have the same solutions. Since, IX = 0 has the trivial solution
only, so does the system AX = 0.

(⇐) Suppose AX = 0 has only the trivial solution X = 0. Let R be an (n × n) row-reduced


echelon matrix row equivalent to A. Let r be the number of non-zero rows in R. Since AX = 0
has only the trivial solution, RX = 0 also does not admit any non-trivial solution. Hence, by
the remark appearing before Theorem 3.2, r ≥ n. But since R has n rows, certainly r ≤ n.
Therefore, r = n. In other words, R is an (n × n) row-reduced echelon matrix with exactly n
non-zero rows. Since this means that R has a leading 1 in each of its n rows, and since these
1's occur in n different columns, R must be the (n × n) identity
matrix. ■
Let us discuss now an inhomogeneous system of linear equations AX = Y . Before discussing
this, first note a basic difference between a homogeneous and inhomogeneous system of linear
equations.

While a homogeneous system of linear equations always admits the trivial solution
$x_1 = \dots = x_n = 0$, an inhomogeneous system may not admit any solution at all.

We form the augmented matrix $A'$ of the system $AX = Y$. This is the $m \times (n + 1)$ matrix
whose first $n$ columns are the columns of $A$ and whose last column is $Y$. More precisely,
$$A'_{ij} = A_{ij} \ \text{ if } j \le n, \qquad A'_{i(n+1)} = Y_i.$$
Suppose we perform a sequence of EROs on $A$ to arrive at a row-reduced echelon matrix
$R$. If we perform this same sequence of EROs on the augmented matrix $A'$, we will arrive at a
matrix $R'$ whose first $n$ columns are the columns of $R$ and whose last column contains certain
scalars $z_1, \dots, z_m$. The scalars $z_i$ are the entries of the $(m \times 1)$ matrix
$$Z = \begin{pmatrix} z_1 \\ \vdots \\ z_m \end{pmatrix},$$
which results from applying the stated sequence of EROs to the $(m \times 1)$ matrix $Y$.

Using the proof techniques used in Theorem 2.3 of Lecture 2, one can show that the linear
equations in the system AX = Y can be expressed as linear combinations of the equations in
the system RX = Z and vice versa. And hence, the 2 systems are equivalent and they have the
same solutions. Let us see how to determine whether the system RX = Z has any solutions
and to determine all the solutions if any exist.If R has r non-zero rows, with leading 1 of row
i occurring in the column ki , i = 1, 2, . . . , r, then the first r equations of RX = Z effectively
express xk1 , . . . , xkr in terms of the (n − r) remaining xj and the scalars z1 , . . . , zr . The last
(m − r) equations are:
$$0 = z_{r+1}, \quad \dots, \quad 0 = z_m, \tag{3.2}$$
and accordingly the condition for the system to have a solution is $z_i = 0$ for $i > r$. If this
condition given by Equation 3.2 is met, all solutions of the system AX = Y can be found by
assigning arbitrary values to the (n − r) of the xj , i.e., the free variables and then computing
xki from the ith equation.

Example 3.2. Let $F$ be the field of rational numbers and
$$A = \begin{pmatrix} 1 & -2 & 1 \\ 2 & 1 & 1 \\ 0 & 5 & -1 \end{pmatrix},$$
and suppose we wish to solve the system $AX = Y$ for some $y_1, y_2$ and $y_3$. The augmented
matrix $A'$ reads:
$$A' = \begin{pmatrix} 1 & -2 & 1 & y_1 \\ 2 & 1 & 1 & y_2 \\ 0 & 5 & -1 & y_3 \end{pmatrix}.$$
Let us row-reduce $A'$. Applying $-2R_1 + R_2 = R_2'$:
$$\begin{pmatrix} 1 & -2 & 1 & y_1 \\ 0 & 5 & -1 & -2y_1 + y_2 \\ 0 & 5 & -1 & y_3 \end{pmatrix}.$$
Applying $-R_2 + R_3 = R_3'$:
$$\begin{pmatrix} 1 & -2 & 1 & y_1 \\ 0 & 5 & -1 & -2y_1 + y_2 \\ 0 & 0 & 0 & 2y_1 - y_2 + y_3 \end{pmatrix}.$$
Applying $\tfrac{1}{5}R_2 = R_2'$:
$$\begin{pmatrix} 1 & -2 & 1 & y_1 \\ 0 & 1 & -\tfrac{1}{5} & \tfrac{-2y_1 + y_2}{5} \\ 0 & 0 & 0 & 2y_1 - y_2 + y_3 \end{pmatrix}.$$
Applying $2R_2 + R_1 = R_1'$:
$$\begin{pmatrix} 1 & 0 & \tfrac{3}{5} & \tfrac{y_1 + 2y_2}{5} \\ 0 & 1 & -\tfrac{1}{5} & \tfrac{-2y_1 + y_2}{5} \\ 0 & 0 & 0 & 2y_1 - y_2 + y_3 \end{pmatrix}.$$

The condition for the system $AX = Y$ to admit a solution is:
$$2y_1 - y_2 + y_3 = 0.$$
If the condition above is met, then we can express the leading variables in terms of the
free variable $x_3$ in the following way:
$$x_1 = -\tfrac{3}{5}x_3 + \tfrac{y_1 + 2y_2}{5}, \qquad x_2 = \tfrac{1}{5}x_3 + \tfrac{y_2 - 2y_1}{5}.$$
Assigning the free variable an arbitrary scalar $c$, one obtains:
$$x_1 = -\tfrac{3}{5}c + \tfrac{y_1 + 2y_2}{5}, \qquad x_2 = \tfrac{1}{5}c + \tfrac{y_2 - 2y_1}{5},$$
so that the solution set reads:
$$\left(-\tfrac{3}{5}c + \tfrac{y_1 + 2y_2}{5},\; \tfrac{1}{5}c + \tfrac{y_2 - 2y_1}{5},\; c\right).$$
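The same computation can be reproduced symbolically. The sketch below (Python with SymPy, an assumption on my part rather than part of the notes) applies the four EROs above to the augmented matrix and recovers both the consistency condition $2y_1 - y_2 + y_3 = 0$ and the constant terms of the leading variables.

```python
from sympy import Matrix, symbols, simplify

y1, y2, y3 = symbols('y1 y2 y3')
M = Matrix([[1, -2, 1, y1],
            [2, 1, 1, y2],
            [0, 5, -1, y3]])

M[1, :] = M[1, :] - 2 * M[0, :]     # -2 R1 + R2 -> R2
M[2, :] = M[2, :] - M[1, :]         #   -R2 + R3 -> R3
M[1, :] = M[1, :] / 5               # (1/5) R2   -> R2
M[0, :] = M[0, :] + 2 * M[1, :]     #  2 R2 + R1 -> R1

print(simplify(M[2, 3]))            # 2*y1 - y2 + y3 : must vanish for a solution
print(M[0, 3], M[1, 3])             # y1/5 + 2*y2/5  and  -2*y1/5 + y2/5
```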

Suggested Exercises (Hoffman and Kunze) (Page: 15) Problem: 1, 2, 3, 4, 5, 6, 7, 9.


4 Lecture 4

§4.1 Interpreting matrix multiplication AB as linear combination of the


rows of B
$$AB = \begin{pmatrix} A_{11} & \dots & A_{1n} \\ A_{21} & \dots & A_{2n} \\ \vdots & & \vdots \\ A_{m1} & \dots & A_{mn} \end{pmatrix}
\begin{pmatrix} B_{11} & B_{12} & \dots & B_{1p} \\ B_{21} & B_{22} & \dots & B_{2p} \\ \vdots & \vdots & & \vdots \\ B_{n1} & B_{n2} & \dots & B_{np} \end{pmatrix}
= \begin{pmatrix} A_{11} & \dots & A_{1n} \\ A_{21} & \dots & A_{2n} \\ \vdots & & \vdots \\ A_{m1} & \dots & A_{mn} \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix}
= \begin{pmatrix} A_{11}\beta_1 + \dots + A_{1n}\beta_n \\ A_{21}\beta_1 + \dots + A_{2n}\beta_n \\ \vdots \\ A_{m1}\beta_1 + \dots + A_{mn}\beta_n \end{pmatrix}.$$

Here $\beta_1, \dots, \beta_n$ are the $n$ rows of the $(n \times p)$ matrix $B$. Hence, the matrix $AB$ has size $(m \times p)$,
each of the $m$ rows of which is given by
$$\gamma_i = \sum_{j=1}^{n} A_{ij}\,\beta_j, \qquad 1 \le i \le m.$$

Therefore, each row of the matrix $AB$, denoted by $\gamma_i$, $1 \le i \le m$, is a linear combination of
the rows $\beta_j$, $1 \le j \le n$, of the matrix $B$.
Example 4.1.
$$\begin{pmatrix} 1 & 3 & 2 \\ 2 & 2 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 2 \\ 2 & 1 & 2 & 0 \end{pmatrix}
= \begin{pmatrix} 1\,[2\;3\;1\;1] + 3\,[1\;2\;3\;2] + 2\,[2\;1\;2\;0] \\ 2\,[2\;3\;1\;1] + 2\,[1\;2\;3\;2] + 1\,[2\;1\;2\;0] \end{pmatrix}
= \begin{pmatrix} 9 & 11 & 14 & 7 \\ 8 & 11 & 10 & 6 \end{pmatrix}.$$
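The row-combination interpretation is easy to verify numerically; the sketch below (Python with NumPy, not part of the notes) rebuilds each row of $AB$ from the rows of $B$ for Example 4.1.

```python
import numpy as np

A = np.array([[1, 3, 2],
              [2, 2, 1]])
B = np.array([[2, 3, 1, 1],
              [1, 2, 3, 2],
              [2, 1, 2, 0]])

# Row i of AB is the linear combination sum_j A[i, j] * (row j of B).
rows = np.array([sum(A[i, j] * B[j] for j in range(B.shape[0]))
                 for i in range(A.shape[0])])
print(rows)                              # [[ 9 11 14  7]
                                         #  [ 8 11 10  6]]
print(np.array_equal(rows, A @ B))       # True
```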

Theorem 4.1
If A, B, C are matrices over the field F such that the products BC and A (BC) are defined,
then so are the products AB and (AB) C and,

A (BC) = (AB) C.

Proof. Suppose B is an (n × p) matrix. Since, BC is defined, C is a matrix with p rows, and


BC has n rows. Since A (BC) is defined, A has n columns. So, without loss of generality, we
may assume that the size of A is (m × n). Thus, AB exists and is an (m × p) matrix. Hence,
the matrix (AB) C is defined.

To show that A (BC) = (AB) C, one must show that [A (BC)]ij = [(AB) C]ij for each i, j.

Now,
$$[A(BC)]_{ij} = \sum_{r=1}^{n} A_{ir}(BC)_{rj}
= \sum_{r=1}^{n} A_{ir}\sum_{k=1}^{p} B_{rk}C_{kj}
= \sum_{r=1}^{n}\sum_{k=1}^{p} A_{ir}B_{rk}C_{kj}
= \sum_{k=1}^{p}\left(\sum_{r=1}^{n} A_{ir}B_{rk}\right)C_{kj}
= \sum_{k=1}^{p} (AB)_{ik}C_{kj}
= [(AB)C]_{ij}.$$

Remark. The relation A (BC) = (AB) C implies that linear combinations of linear combina-
tions of the rows of C are again linear combinations of the rows of C.

If B is a given matrix and C is obtained from B by means of an elementary row operation,


then every row of C can be expressed as a linear combination of the rows of B, hence there
is a matrix A such that AB = C. In general, there are many such matrices. Among all such
matrices it is convenient to choose one having a number of special properties:

Definition 4.1. An (m × m) matrix is said to be an elementary matrix if it can be obtained


from the (m × m) identity matrix by means of a single elementary row operation.

Example 4.2. A (2 × 2) elementary matrix is necessarily one of the following:
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}, \quad
\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix} \ (c \ne 0), \quad
\begin{pmatrix} 1 & 0 \\ 0 & c \end{pmatrix} \ (c \ne 0).$$

Theorem 4.2
Let e be an ERO and E be the (m × m) elementary matrix E = e (I) . Then, for every
(m × n) matrix A,
e (A) = EA
(This is how elementary row operation on a matrix is written in terms of matrix multipli-
cation).

Proof.
$$(EA)_{ij} = \sum_{k=1}^{m} E_{ik} A_{kj}.$$

The entry in the $i$th row and the $j$th column of the product matrix $EA$ is obtained from the $i$th
row of $E$ and the $j$th column of $A$.

Let us give a detailed proof for the 2nd type of ERO. Proof for the other 2 types are left
as exercises.

Recall the 2nd type of ERO from lecture 2:

e (A)ij = Aij if i ̸= r;

e (A)rj = Arj + cAsj ; (4.1)


[Replacement of the rth row of A by row r plus c times row s, c ∈ F , and r ̸= s].

Now, apply this 2nd type of ERO to the $(m \times m)$ identity matrix to obtain the $(m \times m)$
elementary matrix $E = e(I)$, so that
$$E_{ik} = e(I)_{ik} = \delta_{ik} \ \text{ if } i \ne r, \qquad
E_{rk} = e(I)_{rk} = \delta_{rk} + c\,\delta_{sk}.$$
Hence,
$$(EA)_{ij} = \sum_{k=1}^{m} E_{ik}A_{kj} = \begin{cases} A_{ij}, & i \ne r; \\ A_{rj} + c\,A_{sj}, & i = r. \end{cases} \tag{4.2}$$
Comparing equation 4.1 with 4.2, one immediately finds that
$$e(A) = EA. \qquad \blacksquare$$
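As a concrete illustration of Theorem 4.2 (not part of the notes), one can build the elementary matrix $E = e(I)$ for a type-2 operation and check that $EA$ equals $e(A)$; the sketch below assumes Python with NumPy and uses arbitrarily chosen values $r = 1$, $s = 2$, $c = 3$ (indices 0 and 1 in the code).

```python
import numpy as np

def e(A, r=0, s=1, c=3.0):
    """Type-2 ERO: replace row r of A by row r plus c times row s."""
    B = A.astype(float)
    B[r] = B[r] + c * B[s]
    return B

A = np.array([[2.0, -1.0, 3.0],
              [1.0, 4.0, 0.0],
              [2.0, 6.0, -1.0]])

E = e(np.eye(3))                   # the elementary matrix E = e(I)
print(np.allclose(e(A), E @ A))    # True: e(A) = E A
```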

Corollary 4.3 (to Theorem 4.2)


Let A and B be (m × n) matrices over the field F , then B is row-equivalent to A if and
only if B = P A, where P is a product of (m × m) elementary matrices.

Proof. Suppose B = PA, where $P = E_s \cdots E_2 E_1$ with each $E_i$ an (m × m) elementary


matrix. We have to prove that B is row equivalent to A. One can write B as:

B = (Es . . . E2 E1 ) A.

Since, E1 A is obtained by applying an ERO to A by Theorem 4.2 then by using the definition
of row equivalence of 2 matrices (2 matrices are said to be row equivalent to each other if one
can be obtained from the other by applying a finite sequence of EROs) one observes that E1 A
is row-equivalent to A. In the same way, E2 (E1 A) is row-equivalent to E1 A and hence is row-
equivalent to A. Continuing this way, one concludes that (Es . . . E2 E1 ) A is row-equivalent to A.

Now, suppose that B is row-equivalent to A. Then by definition of row-equivalence, B is


obtained from A by application of a finite sequence of EROs, namely, e1 , e2 , . . . , es , i.e.,

B = es (. . . e2 (e1 (A)) . . . ) (4.3)


Now, by Theorem 4.2, there exists elementary matrices Es , . . . , E2 , E1 such that,
e1 (A) = E1 A, e2 (e1 (A)) = E2 E1 A. . . . , and es (. . . e2 (e1 (A)) . . . ) = (Es . . . E2 E1 ) A.
Therefore, by equation 4.3, B = (Es . . . E2 E1 ) A. ■
Suggested Exercises (Hoffman and Kunze) Section 1.6, (Page: 21), Problem: 2, 3, 4, 5, 7, 8.

§4.2 Invertible matrices

Definition 4.2. Let A be an (n × n) matrix over the field F . An (n × n) matrix B such


that BA = I is called a left inverse of A; an (n × n) matrix B such that AB = I is called
a right inverse of A. If AB = BA = I, then B is called a two-sided inverse of A.

Lemma 4.4
If A has a left inverse B and a right inverse C, then B = C.

Proof. Suppose BA = I and AC = I. Then, one has:

B = BI = B (AC) = (BA) C = IC = C.


Thus, if A (a square matrix) has both a left and a right inverse, then A has a unique two-sided
inverse (which is equal to any of the left/right inverses) and A is called invertible.The unique
two-sided inverse or simply the inverse of A is denoted by A−1 .

Theorem 4.5
Let A and B be (n × n) matrices over F.
(i) If A is invertible, so is $A^{-1}$, and $\left(A^{-1}\right)^{-1} = A$.
(ii) If both A and B are invertible, so is AB, and
$$(AB)^{-1} = B^{-1}A^{-1}.$$

Proof. (i) Since A is invertible (n × n) matrix, one must have an (n × n) matrix A−1 , called
the inverse of A so that the following equality holds,

A−1 A = AA−1 = I. (4.4)

From equation 4.4, one finds that the (n × n) matrix $A^{-1}$ is invertible, with two-sided
inverse the (n × n) matrix A. In other words,
$$\left(A^{-1}\right)^{-1} = A.$$

(ii) Given (n × n) invertible matrices A and B, one has,

AA−1 = A−1 A = I and BB −1 = B −1 B = I. (4.5)



$$\therefore \ (AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AIA^{-1} = I$$
[upon successive use of equation 4.5], i.e., $B^{-1}A^{-1}$ is a right inverse of AB. Also,
$$\left(B^{-1}A^{-1}\right)(AB) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I$$
[again upon successive use of equation 4.5], i.e., $B^{-1}A^{-1}$ is also a left inverse of AB.
Hence, $B^{-1}A^{-1}$ is the unique two-sided inverse of AB, i.e.,
$$(AB)^{-1} = B^{-1}A^{-1}.$$

Corollary 4.6 (to Theorem 4.5)

A product of invertible matrices is invertible, with the inverse given by:
$$(A_1 A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1} A_1^{-1}.$$

Theorem 4.7
An elementary matrix is invertible.

Proof. Let E be an elementary matrix corresponding to the ERO e (Theorem 4.2).

E = e (I)

If e1 is the inverse operation of e (Theorem 2.2 of Lecture 2) and E1 = e1 (I), then,

EE1 = e (E1 ) , [By Theorem 2.2],

= e (e1 (I)) ,
= I; [Since,e1 is the inverse operation of e].
and,
E1 E = e1 (E) , [again by Theorem 2.2]
= e1 (e (I)) ,
= I; [Since,e1 is the inverse operation of e].
Hence, E is invertible and E1 = E −1 . ■
Example 4.3. (a)
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix};$$
(b)
$$\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -c \\ 0 & 1 \end{pmatrix};$$
(c)
$$\begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 \\ -c & 1 \end{pmatrix};$$
(d) when $c \ne 0$,
$$\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 1 & 0 \\ 0 & c \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & c^{-1} \end{pmatrix}.$$

Theorem 4.8
If A is a (n × n) matrix, TFAE:
(i) A is invertible.
(ii) A is row-equivalent to the (n × n) identity matrix,
(iii) A is a product of elementary matrices.

Proof. Let R be a row-reduced echelon matrix which is row-equivalent to A (by Theorem 3.1
of Lecture 3). Now, by Theorem 4.2 (or its corollary) of the current lecture,

R = (Ek . . . E2 E1 ) A (4.6)

where, E1 , E2 , . . . , Ek are some elementary matrices. Since each Ej , 1 ≤ j ≤ k, is invertible(by


Theorem 4.7), one may write:
A = E1−1 E2−1 . . . Ek−1 R. (4.7)
Now, since a product of invertible matrices is also invertible (following from the Corollary to
Theorem 4.5), by equation 4.7, A is invertible if and only if R is invertible. Since, R is a
row-reduced echelon matrix, R is invertible if and only if each row of R contains a leading 1,
i.e., if and only if R = I and if R = I, then by equation 4.7:

A = E1−1 E2−1 . . . Ek−1 . (4.8)

From equations 4.6, 4.7, and 4.8, one concludes that (i), (ii) and (iii) are equivalent statements.

Corollary 4.9 (Corollary 1 to Theorem 4.8)


If A is an invertible (n × n) matrix and if a sequence of EROs reduces A to the identity,
then that same sequence of operations when applied to I yields A−1 .

Proof. Using equation 4.8 and the Corollary to Theorem 4.5, one obtains:
$$A^{-1} = \left(E_1^{-1}E_2^{-1}\cdots E_k^{-1}\right)^{-1} = E_k \cdots E_2 E_1 = (E_k \cdots E_2 E_1)\,I. \tag{4.9}$$

Write equation 4.8 again, $A = E_1^{-1}E_2^{-1}\cdots E_k^{-1}$, and multiply both sides on the left successively by $E_1$, then $E_2$, ..., then $E_k$ to obtain
$$(E_k \cdots E_2 E_1)\,A = I. \tag{4.10}$$

The sequence of EROs corresponding to $E_1, E_2, \dots, E_k$ thus reduces A to the identity matrix. This same sequence of
EROs, when applied to I, yields $A^{-1}$, as is evident from equation 4.9. ■

Corollary 4.10 (Corollary 2 to Theorem 4.8)


Let A and B be (m × n) matrices. Then B is row-equivalent to A if and only if B = P A
where P is an invertible (m × m) matrix.

Proof. From Corollary to Theorem 4.2 of this lecture, we know that B is row-equivalent to A
if and only if B = P A where P is a product of (m × m) elementary matrices. Now by corollary
to Theorem 4.5 and 4.7, P is invertible. ■

Theorem 4.11
For an (n × n) matrix, TFAE:
(i) A is invertible.
(ii) The homogeneous system AX = 0 only has the trivial solution X = 0.
(iii) The system of equations AX = Y has a solution X for each (n × 1) matrix Y .

Proof. According to Theorem 3.3, condition (ii) is equivalent to the fact that A is row-equivalent
to the identity matrix. Now, by Theorem 4.8, (i) and (ii) are equivalent.

If A is invertible, the solution of AX = Y is given by X = A−1 Y for a given (n × 1) ma-


trix Y . Therefore, condition (i) implies condition (iii). Conversely, suppose AX = Y has a
solution for each given (n × 1) matrix Y . Let R be a row-reduced echelon matrix which is
row-equivalent to A. We wish to show that R = I. Choose Y in such a way that the corresponding
solution X of AX = Y satisfies RX = E, where
$$E = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.$$
Indeed, since R is row-equivalent to A, R = PA with P invertible by Corollary 4.10 to
Theorem 4.8. Therefore, RX = E iff PAX = E iff $AX = P^{-1}E$; so choose $Y = P^{-1}E$, and then the
solution X of AX = Y satisfies RX = E. Once one has RX = E with E as above, the last row of R cannot
be a zero row, since otherwise the last equation of RX = E would read 0 = 1 and the system would be
inconsistent. Since R is an (n × n) row-reduced echelon matrix with a non-zero last row (hence
with no zero row at all), R = I. In other words, A is row-equivalent to the (n × n) identity
matrix. So, by Theorem 4.8, A is invertible. Hence, we proved that (i) ⟺ (iii). Since
(i) ⟺ (ii) and (i) ⟺ (iii), one has (i) ⟺ (ii) ⟺ (iii). ■

Corollary 4.12 (Corollary 1 to Theorem 4.11)


A square matrix with either a left or right inverse is invertible.

Proof. Let A be an (n × n) matrix. Suppose A has a left inverse, i.e., a matrix B such that
BA = I. Then since, X = IX = (BA) X = B (AX) , AX = 0 has only the trivial solution
X = 0. Therefore, A is invertible by Theorem 4.11.

Now, suppose that A has a right inverse, i.e., an (n × n) matrix C such that AC = I. Then, by
previous argument C has a left inverse, namely A and hence C is invertible. One then obtains
C −1 = A. Therefore, by Theorem 4.5, C −1 = A is invertible with:

A−1 = C



§4.3 Algorithm for computation of A−1


One should first record the (n × n) matrix A and the (n × n) identity matrix side by side, and then
apply a sequence of EROs that reduces the given square matrix A to the (n × n) identity
matrix I. Provided A can be row-reduced to the identity matrix in this way, the same sequence
of EROs, applied to the identity matrix on the other side, reduces it to $A^{-1}$, as stated by
Corollary 4.9 to Theorem 4.8.
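The algorithm just described is, in effect, Gauss–Jordan elimination on the block matrix $[A \mid I]$. A minimal sketch follows (Python with NumPy, not part of the notes); it adds partial pivoting, which the hand computation below does not need, and it reproduces the inverse found in Example 4.4.

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Row-reduce [A | I] to [I | A^{-1}] using the three types of EROs.
    Raises ValueError if a zero pivot column is found (A not invertible)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])          # the two matrices recorded side by side
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # partial pivoting
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        M[[col, pivot]] = M[[pivot, col]]  # type 3: row interchange
        M[col] /= M[col, col]              # type 1: scale pivot row, leading 1
        for r in range(n):
            if r != col:                   # type 2: clear the rest of the column
                M[r] -= M[r, col] * M[col]
    return M[:, n:]

A = np.array([[1, 1/2, 1/3],
              [1/2, 1/3, 1/4],
              [1/3, 1/4, 1/5]])
print(np.round(gauss_jordan_inverse(A)))
# [[   9.  -36.   30.]
#  [ -36.  192. -180.]
#  [  30. -180.  180.]]
```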
Example 4.4 (Computing an inverse).
$$A = \begin{pmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \\ \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \end{pmatrix}.$$
Record $A$ and the identity matrix side by side:
$$\begin{pmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \\ \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Applying $-\tfrac{1}{2}R_1 + R_2 = R_2'$ and $-\tfrac{1}{3}R_1 + R_3 = R_3'$:
$$\begin{pmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ 0 & \tfrac{1}{12} & \tfrac{1}{12} \\ 0 & \tfrac{1}{12} & \tfrac{4}{45} \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ -\tfrac{1}{3} & 0 & 1 \end{pmatrix}.$$
Applying $12R_2 = R_2'$:
$$\begin{pmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} \\ 0 & 1 & 1 \\ 0 & \tfrac{1}{12} & \tfrac{4}{45} \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 & 0 \\ -6 & 12 & 0 \\ -\tfrac{1}{3} & 0 & 1 \end{pmatrix}.$$
Applying $-\tfrac{1}{12}R_2 + R_3 = R_3'$ and $-\tfrac{1}{2}R_2 + R_1 = R_1'$:
$$\begin{pmatrix} 1 & 0 & -\tfrac{1}{6} \\ 0 & 1 & 1 \\ 0 & 0 & \tfrac{1}{180} \end{pmatrix}, \qquad
\begin{pmatrix} 4 & -6 & 0 \\ -6 & 12 & 0 \\ \tfrac{1}{6} & -1 & 1 \end{pmatrix}.$$
Applying $180R_3 = R_3'$:
$$\begin{pmatrix} 1 & 0 & -\tfrac{1}{6} \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} 4 & -6 & 0 \\ -6 & 12 & 0 \\ 30 & -180 & 180 \end{pmatrix}.$$
Applying $\tfrac{1}{6}R_3 + R_1 = R_1'$ and $-R_3 + R_2 = R_2'$:
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} 9 & -36 & 30 \\ -36 & 192 & -180 \\ 30 & -180 & 180 \end{pmatrix}.$$

Hence, by Corollary 4.9 to Theorem 4.8,
$$A^{-1} = \begin{pmatrix} 9 & -36 & 30 \\ -36 & 192 & -180 \\ 30 & -180 & 180 \end{pmatrix}.$$
Suggested Exercises (Hoffman and Kunze) (Page: 26), Problem: 1, 2, 3, 4, 5, 8, 12.

Next Lecture: Vector Spaces.


5 Lecture 5

§5.1 Vector Spaces

Definition 5.1. A vector space consists of the following:


1. a field F of scalars;
2. a set V of objects to be called vectors;
3. a rule (or operation), called vector addition, which associates with each pair of vectors
⃗u, ⃗v ∈ V , a vector ⃗u + ⃗v ∈ V , called the sum of ⃗u and ⃗v , in such a way that for
⃗ ∈ V , the following are fulfilled:
⃗u, ⃗v , w
a) Addition is commutative, ⃗u + ⃗v = ⃗v + ⃗u;
b) Addition is associative: $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w}$;
c) There is a unique vector $\vec{0} \in V$, called the zero vector, such that $\vec{u} + \vec{0} = \vec{u}$,
∀ $\vec{u} \in V$;
d) For each $\vec{u} \in V$, there is a unique vector $-\vec{u} \in V$ such that $\vec{u} + (-\vec{u}) = \vec{0}$;
4. a rule (or operation), called scalar multiplication, which associates with each pair $(c, \vec{u})$,
with $c \in \mathbb{F}$ and $\vec{u} \in V$, a vector $c\,\vec{u} \in V$, called the product of $c$ and $\vec{u}$, in such a
way that the following hold for any $c_1, c_2 \in \mathbb{F}$ and $\vec{u}, \vec{v} \in V$:
a) $1\,\vec{u} = \vec{u}$, ∀ $\vec{u} \in V$;
b) $(c_1 c_2)\,\vec{u} = c_1 (c_2\,\vec{u})$;
c) $c\,(\vec{u} + \vec{v}) = c\,\vec{u} + c\,\vec{v}$;
d) $(c_1 + c_2)\,\vec{u} = c_1\vec{u} + c_2\vec{u}$.
If all of the above are fulfilled by the set V and the field F (often referred to as the ground
field), we say that V is a vector space over the field F.

Example 5.1 (The n-tuple space, $\mathbb{F}^n$). Let $\mathbb{F}$ be any field, and let $V$ be the set of all n-tuples
$\vec{v} = (v_1, v_2, \dots, v_n)$ of scalars $v_i \in \mathbb{F}$.

If $\vec{w} = (w_1, w_2, \dots, w_n)$ with $w_i \in \mathbb{F}$, the sum $\vec{w} + \vec{v}$ is defined by:

w
⃗ + ⃗v = (w1 + v1 , w2 + v2 , . . . , wn + vn ) . (5.1)

The product of a scalar c ∈ F and a vector ⃗v = (v1 , v2 , . . . , vn ) is defined by:

c ⃗v = (c v1 , c v2 , . . . , c vn ) (5.2)

Verify (Exercise) that Fn equipped with the vector addition and scalar multiplication defined
above indeed forms a vector space over F.
Example 5.2. The space of all m × n matrices over F, denoted by Fm×n . Let F be any field
and m and n are positive integers. Let Fm×n be the set of all (m × n) matrices over the field
F. The sum of 2 vectors (matrices of size m × n over F) A and B is defined as:

(A + B)ij = Aij + Bij , where Aij + Bij is addition in F. (5.3)



Scalar here comes from the field F and scalar multiplication is defined as:

(cA)ij = c.Aij , where c.Aij is multiplication in F. (5.4)

Now, it is straightforward to check that the set Fm×n satisfies all the properties of a vector
space when the scalars take their values from F. (Exercise-Verify it!)
Hence, Fm×n is a vector space over the field F.
Example 5.3. The space of functions from a non-empty set to a field; F(S, F): Let F(S, F)
be the set of all functions from the set S to F. The sum of 2 elements in F(S, F) is defined as:

(f + g) (s) = f (s) + g (s) (5.5)

where this addition f (s) + g (s) is in F.

The product of a scalar c ∈ F and an element f ∈ F(S, F) is defined by:

(cf ) (s) = c.f (s) (5.6)

where this multiplication c.f (s) is in F.

Now, verify (exercise) that all the properties of a vector space are satisfied by F(S, F) when
the scalars take their value from F. Thus F(S, F) is a vector space over the field F.
Example 5.4. The field C of complex numbers can be regarded as a vector space over the field
R of real numbers. More generally, one can consider the vector space V of complex n-tuples
$(x_1, x_2, \dots, x_n)$, $x_i \in \mathbb{C}$, over the field $\mathbb{R}$ of real numbers.

In this case, the vector addition is defined as in equation 5.1:

(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ) (5.7)

where (x1 + y1 , . . . , xn + yn ) is addition in the field of complexes.

and the scalar multiplication is defined as:

c (x1 , x2 , . . . , xn ) = (c.x1 , c.x2 , . . . , c.xn ) (5.8)

where (c.x1 , c.x2 , . . . , c.xn ) is multiplication in the field of complexes since c ∈ R can be re-
garded as a complex number.

This vector space V is different from Cn (which is a special case of Fn in example 5.1) in
which case the ground field is C whereas the vector space V of complex n-tuples is defined over
the ground field of reals.

Definition 5.2. (Linear Combination) A vector ⃗v ∈ V is said to be a linear combination


of the vectors ⃗u1 , ⃗u2 , . . . , ⃗un ∈ V provided there exist scalars c1 , c2 , . . . , cn ∈ F such that:
n
X
⃗v = c1⃗u1 + c2⃗u2 + · · · + cn⃗un = ci⃗ui . (5.9)
i=1

There are a few simple facts that follow from the definition of a vector space.
(i) c ⃗0 = ⃗0 for c ∈ F.

first write:  
c ⃗0 = c ⃗0 + ⃗0 = c ⃗0 + c ⃗0;
    
=⇒ −c ⃗0 + c ⃗0 = −c ⃗0 + c ⃗0 + c ⃗0; [adding −c ⃗0 on both sides].

=⇒ ⃗0 = ⃗0 + c ⃗0 = c ⃗0;

∴ c ⃗0 = ⃗0.

(ii) 0 ⃗u = ⃗0 [ 0 is a Scalar].

0 ⃗u = (0 + 0) ⃗u = 0 ⃗u + 0 ⃗u;
=⇒ 0 ⃗u + (−0 ⃗u) = 0 ⃗u + (0 ⃗u + (−0 ⃗u)) ;
=⇒ ⃗0 = 0 ⃗u + ⃗0 = 0 ⃗u;

∴ 0 ⃗u = ⃗0.

(iii) (−1) ⃗u = −⃗u

We’ve seen in (ii), ⃗0 = 0 ⃗u.

=⇒ ⃗0 = (1 − 1) ⃗u = 1 ⃗u + (−1) ⃗u = ⃗u + (−1) ⃗u;

=⇒ −⃗u + ⃗0 = (−⃗u + ⃗u) + (−1) ⃗u;


=⇒ −⃗u = ⃗0 + (−1) ⃗u = (−1) ⃗u;
∴ −⃗u = (−1) ⃗u.

Suggested Exercises (Hoffman and Kunze) (Page: 33), Problem: 1, 2, 3, 4, 5, 7.

§5.2 Solution to Problem 3


$\mathbb{C}^3$ is the underlying vector space; the ground field is $\mathbb{C}$. We ask for which $(z_1, z_2, z_3)$ there exist scalars $c_1, c_2, c_3$ with
$$c_1\,(1, 0, -1) + c_2\,(0, 1, 1) + c_3\,(1, 1, 1) = (z_1, z_2, z_3),$$
i.e.,
$$c_1 + c_3 = z_1, \qquad c_2 + c_3 = z_2, \qquad -c_1 + c_2 + c_3 = z_3. \tag{5.10}$$
For which values of $z_1, z_2, z_3$ does the system have a solution?

Augmented matrix:
$$\begin{pmatrix} 1 & 0 & 1 & z_1 \\ 0 & 1 & 1 & z_2 \\ -1 & 1 & 1 & z_3 \end{pmatrix}
\;\xrightarrow{R_1 + R_3 = R_3'}\;
\begin{pmatrix} 1 & 0 & 1 & z_1 \\ 0 & 1 & 1 & z_2 \\ 0 & 1 & 2 & z_1 + z_3 \end{pmatrix}
\;\xrightarrow{-R_2 + R_3 = R_3'}\;
\begin{pmatrix} 1 & 0 & 1 & z_1 \\ 0 & 1 & 1 & z_2 \\ 0 & 0 & 1 & z_1 - z_2 + z_3 \end{pmatrix},$$
which tells us that the system 5.10 always admits a solution, no matter what values are taken for
$z_1, z_2$ and $z_3$.
Hence, any vector in $\mathbb{C}^3$ is a linear combination of the vectors $(1, 0, -1)$, $(0, 1, 1)$, and $(1, 1, 1)$.
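Numerically, the conclusion amounts to the matrix with the three vectors as columns being invertible; the sketch below (Python with NumPy, not from the notes) checks this and solves for the coefficients of one arbitrarily chosen vector.

```python
import numpy as np

# Columns are the three given vectors; the system c1*v1 + c2*v2 + c3*v3 = z
# has a solution for every z in C^3 exactly when this matrix is invertible.
V = np.array([[1, 0, 1],
              [0, 1, 1],
              [-1, 1, 1]], dtype=complex)

print(np.linalg.det(V))          # (1+0j): non-zero, so every z is a combination
z = np.array([2 + 1j, -3.0, 0.5j])
c = np.linalg.solve(V, z)        # the coefficients c1, c2, c3
print(np.allclose(V @ c, z))     # True
```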

§5.3 Subspaces

Definition 5.3. Let V be a vector space over the field F. A subspace of V is a subset W
of V which is itself a vector space over F with the operations of vector addition and scalar
multiplication on V .

A direct check of the axioms of the vector space reveals the fact that the subset W of
V is a subspace if for each ⃗u and ⃗v in W , the vector ⃗u + ⃗v is also in W ; the zero vector ⃗0 is
in W ; for each ⃗u ∈ W, (−⃗u) ∈ W ; for each ⃗u ∈ W , and c ∈ F, the vector c ⃗u ∈ W . The
commutativity and associativity of vector addition and the properties listed in definition
5.1’s 4(a)-(d) related to scalar multiplication are automatically fulfilled as these properties
concern the operations on V . One actually needs to check even less to see if a subset of a
vector space is indeed a subspace.

Theorem 5.1
A non-empty subset W of V is a subspace of V if and only if for each pair of vectors
⃗u, ⃗v ∈ W and each scalar c ∈ F, the vector c ⃗u + ⃗v is again in W .

Proof. Suppose W is a non-empty subset of V such that c ⃗u + ⃗v ∈ W, ∀ ⃗u, ⃗v ∈ W and


∀ c ∈ F. Since, W is non-empty, there is a vector w ⃗ ∈ W , and hence (−1) w ⃗ +w⃗ = ⃗0 ∈ W .
Given any vector ⃗u ∈ W , and any scalar c ∈ F, c ⃗u + ⃗0 = c ⃗u belongs to W . In particular,
(−1) ⃗u = −⃗u ∈ W . Finally, if ⃗u, ⃗v ∈ W , then 1. (⃗u) + ⃗v = ⃗u + ⃗v ∈ W . Thus, W is a subspace
of V .

Conversely, suppose W is a subspace of V , i.e., W is a subset of V that is itself a vector space over F. If ⃗u, ⃗v ∈ W and c ∈ F, then by property (4) of definition 5.1, c ⃗u ∈ W , and by property (3) of definition 5.1, c ⃗u + ⃗v ∈ W . ■
Example 5.5. (a) If V is a vector space, V is a subspace of V . The subset of V consisting
of the zero vector alone is a subspace of V , called the zero subspace of V .
(b) In Fn , the set of n-tuples (x1 , x2 , . . . , xn ) with x1 = 0 is a subspace. However, the set of
n-tuples (x1 , x2 , . . . , xn ) with x1 = 1 + x2 is not a subspace (n ≥ 2).
(c) An n × n matrix over the field F is symmetric if Aij = Aji , ∀ i, j ∈ {1, 2, . . . , n}. The
symmetric matrices form a subspace of all n × n matrices Fn×n over F.
(d) An n × n matrix A over the field C of complex numbers is Hermitian if:

A_{jk} = \overline{A_{kj}} , ∀ j, k ∈ {1, 2, . . . , n} , (5.11)

where the overline denotes complex conjugation.

A 2 × 2 matrix is Hermitian if and only if it has the form:

[ z        x + iy ]
[ x − iy   w      ]

where x, y, z, w are reals.

The set of all Hermitian matrices of size n × n is not a subspace of the vector space Cn×n
of (n × n) matrices over C.

The diagonal entries of an n × n Hermitian matrix are all real. Just take j = k in equation 5.11:

A_{jj} = \overline{A_{jj}} ,

i.e., the Ajj 's are all real for j ∈ {1, 2, . . . , n}. But the diagonal entries iA11 , iA22 , . . . , iAnn of the (n × n) matrix iA are, in general, not real, i.e., if A is a non-zero Hermitian matrix, iA is not Hermitian. Therefore, the set of all (n × n) Hermitian matrices doesn't form a vector subspace of the vector space Cn×n .

I leave it as an exercise for you to check that the set of n × n complex Hermitian matrices
is indeed a vector space over the field of real numbers (with the usual operations).
Example 5.6. The solution space of a system of homogeneous linear equations. Let A ∈ Fm×n ,
an m × n matrix over the field F. Then the set of all n × 1 matrices X over F satisfying AX = 0
is a subspace of Fn×1 , the vector space of all n × 1 matrices over F. To prove this, according to
theorem 5.1, one must show that if AX = 0 and AY = 0, and C ∈ F, then A (CX + Y ) = 0.
This is true because of the following more general facts:

Lemma 5.2
If A is an (m × n) matrix over F and B, C are n × p matrices over F, then:

A (α B + C) = α (AB) + AC, ∀ α ∈ F.

Proof.
[A (α B + C)]ij = Σ_{k=1}^{n} Aik (αB + C)kj ,
               = α Σ_{k=1}^{n} Aik Bkj + Σ_{k=1}^{n} Aik Ckj ,
               = α (AB)ij + (AC)ij ,
               = [α (AB) + AC]ij .
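The identity proved in Lemma 5.2 is easy to check numerically on random matrices. The following sketch (assuming NumPy is available; the sizes and the scalar α are arbitrary choices) is only an illustration, not a proof.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))     # an m x n matrix
B = rng.standard_normal((3, 4))     # an n x p matrix
C = rng.standard_normal((3, 4))     # another n x p matrix
alpha = 2.5

lhs = A @ (alpha * B + C)
rhs = alpha * (A @ B) + A @ C
print(np.allclose(lhs, rhs))        # True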

Theorem 5.3
Let V be a vector space over the field F. The intersection of any collection of subspaces of
V is a subspace of V .

Proof. Let {Wa }a be a collection of subspaces of V , and W = ∩a Wa be their intersection. Then


we know from elementary set theory that W is the collection of the elements belonging to every
Wa . Let, ⃗u, ⃗v ∈ W and c ∈ F. By definition of W , ⃗u and ⃗v belong to each Wa . And since each
Wa is a subspace of V , c ⃗u + ⃗v belongs to each Wa . Thus, c ⃗u + ⃗v is again in W . By theorem
5.1, W is a subspace of V . ■
Remark. From theorem 5.3, it follows that if S is any collection of vectors in V , then there
is a smallest subspace of V which contains S, i.e., a subspace which contains S and which is
contained in every other subspace containing S.

Definition 5.4. Let S be a set of vectors in a vector space V over F. The subspace spanned by
S is defined to be the intersection W of all subspaces of V which contains S. When S is a
finite set of vectors,
S = {⃗u1 , ⃗u2 , . . . , ⃗un } .
we shall simply call W the subspace spanned by the vectors ⃗u1 , ⃗u2 , . . . , ⃗un .

Theorem 5.4
The subspace spanned by a non-empty subset S of a vector space V is the set of all linear
combinations of vectors in S.

Proof. Let W be the subspace spanned by S. Then each linear combination ⃗u = c1⃗u1 + c2⃗u2 +
· · · + cn⃗un of vectors ⃗u1 , ⃗u2 , . . . , ⃗un in S is clearly in W . It is because each element of S belongs
to W and since W is a subspace of V , any F-linear combination of the elements of S will also
be in W according to Theorem 5.1.

Thus, W contains the set L of all linear combinations of vectors in S. On the other hand,
S ⊂ L and hence the set L is non-empty.

Let ⃗v , w⃗ ∈ L. Then ∃ ⃗u1 , ⃗u2 , . . . , ⃗un ∈ S such that,

⃗v = α1⃗u1 + α2⃗u2 + · · · + αn⃗un , (5.12)

for αi ∈ F, and ∃ ⃗s1 , ⃗s2 , . . . , ⃗sm ∈ S such that,

w⃗ = β1⃗s1 + β2⃗s2 + · · · + βm⃗sm , (5.13)

for βj ∈ F. Now, for each c ∈ F,

c⃗v + w⃗ = Σ_{i=1}^{n} (cαi ) ⃗ui + Σ_{j=1}^{m} βj ⃗sj . (5.14)

From equation 5.14, one sees that c⃗v + w ⃗ is an F -linear combination of the (n + m) vectors
⃗u1 , ⃗u2 , . . . , ⃗un , ⃗s1 , ⃗s2 , . . . , ⃗sm ∈ S.

Hence, L is a subspace of the vector space V .

Thus, we have shown that L is a subspace of V which contains S. We also have shown that the subspace W spanned by S, i.e., the intersection of all subspaces of V that contain S, contains L:

W ⊇ L ⊃ S.

On the other hand, since L itself is a subspace of V containing S and W is the intersection of all such subspaces, W is contained in L, i.e., W ⊆ L.

Hence, W = L. In other words, L is the subspace spanned by S. ■

Definition 5.5. If S1 , S2 , . . . , Sk are subsets of a vector space V over some field F, the set
of all sums,
⃗u1 + ⃗u2 + · · · + ⃗uk
of vectors ⃗ui ∈ Si is called the sum of the subsets S1 , S2 , . . . , Sk and is denoted by,
S1 + S2 + · · · + Sk = Σ_{i=1}^{k} Si .

If W1 , W2 , . . . , Wk are subspaces of the vector space V , then the sum,

W = W1 + W2 + · · · + Wk = Σ_{i=1}^{k} Wi ,

is seen to be a subspace of V which contains each of the component subspaces Wi , i =


1, 2, . . . , k. Indeed, choose ⃗u, ⃗v ∈ W so that one has the decompositions,
⃗u = ⃗u1 + ⃗u2 + · · · + ⃗uk = Σ_{i=1}^{k} ⃗ui , with ⃗ui ∈ Wi for i ∈ {1, 2, . . . , k} ,

and,

⃗v = ⃗v1 + ⃗v2 + · · · + ⃗vk = Σ_{i=1}^{k} ⃗vi , with ⃗vi ∈ Wi , ∀ i ∈ {1, 2, . . . , k} ,

so that for any scalar c ∈ F, one obtains,

c ⃗u + ⃗v = c (⃗u1 + ⃗u2 + · · · + ⃗uk ) + (⃗v1 + ⃗v2 + · · · + ⃗vk ) = Σ_{i=1}^{k} (c ⃗ui + ⃗vi ) . (5.15)

Now, since ⃗ui , ⃗vi ∈ Wi for a given i ∈ {1, 2, . . . , k} , for any c ∈ F one must have c ⃗ui + ⃗vi ∈
Wi , as Wi is a subspace of V . Therefore, from equation 5.15, one immediately sees that
c ⃗u + ⃗v ∈ W or in other words, W is a subspace of V that contains each of the subspaces
Wi , ∀ i ∈ {1, 2, . . . , k} . Now, using the arguments used in theorem 5.4, one should be able
to see that W is the subspace spanned by the union of W1 , W2 , . . . , Wk .

Example 5.7. Let F be a given sub-field of the field C of complex numbers. Suppose,

⃗u1 = (1, 2, 0, 3, 0) ;

⃗u2 = (0, 0, 1, 4, 0) ;
⃗u3 = (0, 0, 0, 0, 1) ;
By theorem 5.4, the span of the above 3 vectors is a subspace of the vector space F5 , which
we denote by W . An element of this subspace W can be written as an F-linear combination of
⃗u1 , ⃗u2 and ⃗u3 , i.e., given ⃗u ∈ W , one can find c1 , c2 , c3 ∈ F such that,

⃗u = c1⃗u1 + c2⃗u2 + c3⃗u3 ,

= c1 (1, 2, 0, 3, 0) + c2 (0, 0, 1, 4, 0) + c3 (0, 0, 0, 0, 1) ,


= (c1 , 2c1 , c2 , 3c1 + 4c2 , c3 )
Alternatively, the subspace W consists of 5-tuples (x1 , x2 , x3 , x4 , x5 ) ∈ F5 such that,

x2 = 2x1 ,

x4 = 3x1 + 4x3 .
One can for example, check that (−3, −6, 1, −5, 2) is in W . On the other hand, (−3, 1, 1, −5, 2)
doesn’t belong to W .
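The two conditions x2 = 2x1 and x4 = 3x1 + 4x3 describing W can be tested mechanically. The short Python sketch below (the helper in_W is a hypothetical name introduced only for this illustration) reproduces the two checks just made.

def in_W(x):
    # W consists of 5-tuples with x2 = 2*x1 and x4 = 3*x1 + 4*x3 (here over the reals).
    x1, x2, x3, x4, x5 = x
    return x2 == 2 * x1 and x4 == 3 * x1 + 4 * x3

print(in_W((-3, -6, 1, -5, 2)))     # True
print(in_W((-3, 1, 1, -5, 2)))      # False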

Example 5.8. Let F be a sub-field of the field C of complex numbers. Let W1 be the subset of F2×2 , the vector space of all 2 × 2 matrices over F, consisting of 2 × 2 matrices of the form,

[ x  y ]
[ z  0 ]        with x, y, z ∈ F.

Also, let W2 be the subset of F2×2 consisting of 2 × 2 matrices of the form,

[ x  0 ]
[ 0  y ]        where x, y ∈ F.

Then both W1 and W2 are subspaces of F2×2 and,

F2×2 = W1 + W2 .

Indeed, any matrix in F2×2 admits the decomposition,

[ a  b ]     [ a  b ]     [ 0  0 ]
[ c  d ]  =  [ c  0 ]  +  [ 0  d ] ,

where the first summand belongs to W1 and the second to W2 . The subspace W1 ∩ W2 , on the other hand, consists of matrices of the form

[ a  0 ]
[ 0  0 ]        with a ∈ F.
Example 5.9. (Row space of a matrix) Let A be an m × n matrix over F. The row vectors of
A are the vectors in Fn given by,

⃗ui = (Ai1 , Ai2 , . . . , Ain ) , i = 1, 2, . . . , m.

The subspace of Fn spanned by the row-vectors of A is called the row space of A. Refer back to
example 5.7. The subspace spanned by 3 vectors there is actually the row space of the following
matrix,

     [ 1  2  0  3  0 ]
A =  [ 0  0  1  4  0 ] .
     [ 0  0  0  0  1 ]

It is also the row space of the matrix given by,

     [  1   2  0   3  0 ]
B =  [  0   0  1   4  0 ] .
     [  0   0  0   0  1 ]
     [ −4  −8  1  −8  0 ]

In other words, the row space of A and the row space of B are the same. This is so because the last row of B can be written as −4 times the first row plus the second row.
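One can also confirm that A and B have the same row space by comparing ranks: if A, B, and the matrix obtained by stacking the rows of A and B all have the same rank, the two row spaces coincide. A minimal sketch (assuming NumPy is available):

import numpy as np

A = np.array([[1, 2, 0, 3, 0],
              [0, 0, 1, 4, 0],
              [0, 0, 0, 0, 1]])
B = np.array([[1, 2, 0, 3, 0],
              [0, 0, 1, 4, 0],
              [0, 0, 0, 0, 1],
              [-4, -8, 1, -8, 0]])

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
r_stacked = np.linalg.matrix_rank(np.vstack([A, B]))
print(rA, rB, r_stacked)            # 3 3 3, so the row spaces are equal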
Suggested Exercises (Hoffman and Kunze) (Page: 39), Problem: 2, 3, 4, 5, 8, 9.

§5.4 Bases and Dimension

Definition 5.6. Let V be a vector space over F. A subset S of V is said to be linearly


dependent if there exist distinct vectors ⃗u1 , ⃗u2 , . . . , ⃗un in S and scalars c1 , c2 , . . . , cn (not
all of which are 0) such that,

c1⃗u1 + c2⃗u2 + · · · + cn⃗un = ⃗0 (5.16)

A set which is not linearly dependent is called linearly independent.

In the case where S contains only finitely many vectors, namely ⃗u1 , ⃗u2 , . . . , ⃗un , one just
says that ⃗u1 , ⃗u2 , . . . , ⃗un are dependent/independent instead of saying that S is
dependent/independent.

Some consequences of the definition of linear dependence/independence:


1. Any set which contains a linearly dependent set is linearly dependent.
2. Any subset of a linearly independent set is linearly independent.
3. Any set which contains the vector ⃗0 is linearly dependent.
4. A set S of vectors is linearly independent if and only if each finite subset of S is linearly independent, i.e., if and only if for any distinct vectors ⃗u1 , ⃗u2 , . . . , ⃗un of S,

c1⃗u1 + c2⃗u2 + · · · + cn⃗un = ⃗0  implies  c1 = c2 = · · · = cn = 0.

Definition 5.7. (Basis of a vector space) Let V be a vector space over the field F. A basis
for V is a linearly independent set of vectors in V which spans the space V . The space V
is finite dimensional if it has a finite basis.

Example 5.10. Let F be a field and in Fn let S be the subset consisting of the vectors
⃗ε1 = (1, 0, . . . , 0) ; ⃗ε2 = (0, 1, 0, . . . , 0) ; . . . . . . . . . ⃗εn = (0, 0, . . . , 1) .

Let x1 , x2 , . . . , xn ∈ F and put,

⃗ε = x1 ⃗ε1 + x2 ⃗ε2 + · · · + xn ⃗εn = (x1 , x2 , . . . , xn ) . (5.17)
Hence, an element (x1 , x2 , . . . , xn ) ∈ Fn can be expressed as an F-linear combination of the
vectors ⃗ε1 , ⃗ε2 , . . . , ⃗εn , meaning that the set S = {⃗ε1 , ⃗ε2 , . . . , ⃗εn } spans the vector space Fn .

Set ⃗ε = ⃗0 in equation 5.17, i.e.,


⃗0 = x1 ⃗ε1 + x2 ⃗ε2 + · · · + xn ⃗εn = (x1 , x2 , . . . , xn ) .

=⇒ x1 = 0, x2 = 0, . . . , xn = 0
In other words, S is a linearly independent set. Hence, S = {⃗ε1 , ⃗ε2 , . . . , ⃗εn } is a basis for Fn . We
shall call this particular basis the standard basis of Fn .
Example 5.11. At this point, we’ll give an example of an infinite basis. Let F be a sub-field
of the field C of complex numbers and let P be the set of polynomial functions over F. These
functions are functions from F to F which have the form:

f (x) = c0 + c1 x + · · · + cn xn . (5.19)

Let fk (x) = xk , k = 0, 1, 2, . . . . We claim that the infinite set S = {f0 , f1 , . . . } is a basis for
P. Clearly, the set S spans P since a given polynomial function f ∈ P as given by equation
5.19 can be written as:
f = c0 f0 + c1 f1 + c2 f2 + · · · + cn fn , (5.20)

for c0 , c1 , . . . , cn ∈ F. Why is the infinite set S = {f0 , f1 , . . . } linearly independent? To show that the set {f0 , f1 , . . . } is linearly independent is equivalent to showing that each finite subset of it is linearly independent. It will actually suffice to show that the set {f0 , f1 , . . . , fn } is independent for each non-negative integer n.

Suppose that c0 f0 + c1 f1 + · · · + cn fn = 0 where the zero function is defined as 0 (x) = 0


[0 on the Right Hand Side is the scalar 0].

This says that c0 f0 (x) + c1 f1 (x) + · · · + cn fn (x) = 0 (x) = 0

c0 + c1 x + · · · + cn xn = 0, ∀ x ∈ F (5.21)

i.e., every x ∈ F is a root of the polynomial equation 5.21. But we know that a non-zero polynomial of degree at most n with complex coefficients can have at most n distinct roots, whereas the sub-field F of C has infinitely many elements. Therefore, it follows that c0 = c1 = c2 = · · · = cn = 0.

Hence, the set S = {f0 , f1 , . . . } is linearly independent. Therefore, the set {f0 , f1 , . . . } is a
basis for P.

Theorem 5.5
Let V be a vector space over F which is spanned by a finite set of vectors ⃗u1 , ⃗u2 , . . . , ⃗um .
Then any independent set of vectors in V is finite and contains no more than m elements.

Proof. It suffices to show that any set S containing more than m elements is linearly depen-
dent. Suppose, in S there are distinct vectors ⃗v1 , ⃗v2 , . . . , ⃗vn with n > m. Since the vectors
⃗u1 , ⃗u2 , . . . , ⃗um span V , there are scalars Aij such that,
m
X
⃗vj = Aij ⃗ui .
i=1

For given n scalars x1 , x2 , . . . , xn we have,


x1⃗v1 + x2⃗v2 + · · · + xn⃗vn = Σ_{j=1}^{n} xj ⃗vj ,
                          = Σ_{j=1}^{n} xj Σ_{i=1}^{m} Aij ⃗ui ,
                          = Σ_{j=1}^{n} Σ_{i=1}^{m} (Aij xj ) ⃗ui ,
                          = Σ_{i=1}^{m} ( Σ_{j=1}^{n} Aij xj ) ⃗ui . (5.22)

Given the m × n matrix A (with entries Aij ) over the field F and the n × 1 matrix X (with entries xj , j = 1, 2, . . . , n), the ith entry (AX)i of the m × 1 matrix AX is given by,

(AX)i = Σ_{j=1}^{n} Aij xj . (5.23)
AX = 0m×1 in component form reads,

Σ_{j=1}^{n} Aij xj = 0, 1 ≤ i ≤ m. (5.24)

From equations 5.22 and 5.23, it follows that x1⃗v1 + x2⃗v2 + · · · + xn⃗vn = ⃗0 if and only if equation 5.24 holds for every i ∈ {1, 2, . . . , m}. Now, by theorem 3.2 of lecture 3, since m < n for the m × n matrix A, the homogeneous system of linear equations AX = 0m×1 has a non-trivial solution.

This means, ∃ scalars x1 , x2 , . . . , xn , not all of which are identically zero, such that,

x1⃗v1 + x2⃗v2 + · · · + xn⃗vn = ⃗0.

Therefore, the set S = {⃗v1 , ⃗v2 , . . . , ⃗vn } is a linearly dependent set. ■

Corollary 5.6 (Corollary 1 to Theorem 5.5)


If V is a finite dimensional vector space, then any 2 bases of V have the same (finite)
number of elements.

Proof. Since V is a finite dimensional vector space, it has a finite basis,

{⃗u1 , ⃗u2 , . . . , ⃗um } .

Since, the above set is a basis, it spans V . In other words, V is spanned by the finite set of
vectors {⃗u1 , ⃗u2 , . . . , ⃗um } . Therefore, by theorem 5.5, every basis of V is finite and contains no
more than m elements. Thus if, {⃗v1 , ⃗v2 , . . . , ⃗vn } is a basis of V , then n ≤ m.

Now, if {⃗v1 , ⃗v2 , . . . , ⃗vn } is a basis of V , then V is spanned by the finite set of vectors {⃗v1 , ⃗v2 , . . . , ⃗vn } . Then by theorem 5.5, the finite basis {⃗u1 , ⃗u2 , . . . , ⃗um } can't have more than n elements in it, i.e., m ≤ n.

Now, n ≤ m and m ≤ n imply m = n. This corollary allows us to define the dimension


of a finite dimensional vector space as the number of elements in a basis of V . ■
We now reformulate theorem 5.5 as follows:

Corollary 5.7 (Corollary 2 to theorem 5.5)


Let V be a finite dimensional vector space and let n = dimV. Then,
(a) Any subset of V which contains more than n vectors is linearly dependent;
(b) No subset of V which contains fewer than n vectors can span V .

Proof. (a) Let {⃗u1 , ⃗u2 , . . . , ⃗um } be a set of m vectors with m > n, i.e., the set above con-
tains more elements than the number of elements in a basis, say, {⃗v1 , ⃗v2 , . . . , ⃗vn } of an
n-dimensional vector space V . Since {⃗v1 , ⃗v2 , . . . , ⃗vn } clearly spans V and m > n, the set
{⃗u1 , ⃗u2 , . . . , ⃗um } can’t be linearly independent by theorem 5.5.

(b) Suppose as in part (a) that {⃗v1 , ⃗v2 , . . . , ⃗vn } is a basis of the vector space V . Now let {w⃗1 , w⃗2 , . . . , w⃗m } be a set of vectors in V with m < n. I want to show that {w⃗1 , w⃗2 , . . . , w⃗m } does not span V . Let us proceed by contradiction and suppose that {w⃗1 , w⃗2 , . . . , w⃗m } spans V . Then each of the vectors in {⃗v1 , ⃗v2 , . . . , ⃗vn } can be expressed as a linear combination of the vectors in {w⃗1 , w⃗2 , . . . , w⃗m };

⃗v1 = a11 w⃗1 + a12 w⃗2 + · · · + a1m w⃗m ,
⃗v2 = a21 w⃗1 + a22 w⃗2 + · · · + a2m w⃗m ,
  ⋮
⃗vn = an1 w⃗1 + an2 w⃗2 + · · · + anm w⃗m .
Write it in matrix form:

                                        [ a11  a21  · · ·  an1 ]
[⃗v1 ⃗v2 . . . ⃗vn ] = [w⃗1 w⃗2 . . . w⃗m ]  [ a12  a22  · · ·  an2 ]   (5.25)
                                        [  ⋮     ⋮           ⋮  ]
                                        [ a1m  a2m  · · ·  anm ]

Since m < n, the following homogeneous system has a non-trivial solution, say x1 = b1 , x2 = b2 , . . . , xn = bn , by theorem 3.2 of lecture 3:

[ a11  a21  · · ·  an1 ] [ x1 ]   [ 0 ]
[ a12  a22  · · ·  an2 ] [ x2 ] = [ 0 ]   (5.26)
[  ⋮     ⋮           ⋮  ] [  ⋮  ]   [ ⋮ ]
[ a1m  a2m  · · ·  anm ] [ xn ]   [ 0 ]
So that,

[ a11  a21  · · ·  an1 ] [ b1 ]   [ 0 ]
[ a12  a22  · · ·  an2 ] [ b2 ] = [ 0 ]   (5.27)
[  ⋮     ⋮           ⋮  ] [  ⋮  ]   [ ⋮ ]
[ a1m  a2m  · · ·  anm ] [ bn ]   [ 0 ]

Multiplying both sides of equation 5.25 on the right by the column matrix [b1 b2 . . . bn ]T and using equation 5.27, one gets,

[⃗v1 ⃗v2 . . . ⃗vn ] [b1 b2 . . . bn ]T = [w⃗1 w⃗2 . . . w⃗m ] 0m×1 ,

which boils down to, b1⃗v1 + b2⃗v2 + · · · + bn⃗vn = ⃗0, where not all the bi ’s are zero.

Hence, the set {⃗v1 , ⃗v2 , . . . , ⃗vn } is linearly dependent contradicting the fact that {⃗v1 , ⃗v2 , . . . , ⃗vn }
is a basis of the vector space V .

6 Lecture 6

§6.1 Bases and Dimension (Continued)

Lemma 6.1
Let S be a linearly independent subset of a vector space V . Suppose ⃗v ∈ V which is
not in the subspace spanned by S. Then the set obtained by adjoining ⃗v to S is linearly
independent.

Proof. Suppose ⃗u1 , ⃗u2 , . . . , ⃗um are distinct vectors in S. Also, suppose:

α1⃗u1 + α2⃗u2 + · · · + αm⃗um + β ⃗v = ⃗0. (6.1)

with α1 , α2 , . . . , αm ∈ F and β ∈ F where V is defined over F. Equation 6.1 implies that


β = 0, for otherwise, one would have,

β⃗v = −α1⃗u1 − α2⃗u2 · · · − αm⃗um ,


     
=⇒ ⃗v = − (α1 /β) ⃗u1 − (α2 /β) ⃗u2 − · · · − (αm /β) ⃗um ,
resulting in the fact that ⃗v belongs to the subspace spanned by S contradicting the hypothesis.
Plugging in β = 0 in equation 6.1, one obtains,

α1⃗u1 + α2⃗u2 + · · · + αm⃗um = ⃗0. (6.2)

Since, ⃗u1 , ⃗u2 , . . . , ⃗um are distinct vectors of the linearly independent set S, one must have,

α1 = α2 = · · · = αm = 0 in equation 6.2.

Therefore, for the vectors ⃗u1 , ⃗u2 , . . . , ⃗um and ⃗v ,

α1⃗u1 + α2⃗u2 + · · · + αm⃗um + β ⃗v = ⃗0.

implies α1 = α2 = · · · = αm = β = 0. Since, the distinct vectors ⃗u1 , ⃗u2 , . . . , ⃗um in S were


chosen arbitrarily, the set obtained by adjoining ⃗v to S is linearly independent. ■

Theorem 6.2
If W is a subspace of a finite-dimensional vector space V , every linearly independent subset
of W is finite and is part of a (finite) basis of W .

Proof. Suppose S0 is a linearly independent subset of W . Then S0 is also a linearly independent


subset of V . Since V is finite dimensional, S0 contains no more than dimV elements.

We extend S0 to a basis for W , as follows. If S0 spans W , then S0 is a basis for W and


we’re done. If S0 does not span W , then there is a vector w ⃗ 1 in W that is not contained in
the subspace of W that is spanned by S0 . By Lemma 6.1, then, the set S1 = S0 ∪ {w ⃗ 1 } is
linearly independent. If S1 spans W , we’re done. If not, then, again, one can find a vector
w⃗2 ∈ W that doesn't belong to the subspace of W spanned by S1 . Using Lemma 6.1, then, one concludes that the set S2 = S1 ∪ {w⃗2 } is a linearly independent set in W . If one continues this way, then in not more than dim V steps, one reaches a set,

Sm = S0 ∪ {w⃗1 , w⃗2 , . . . , w⃗m } ,
which is a basis for W . In other words, S0 is a part of a finite basis of W . ■

Corollary 6.3
(Corollary 1 to Theorem 6.2) If W is a proper subspace of a finite-dimensional vector space
V , then W is finite dimensional and,

dimW < dimV.

Proof. We may suppose without loss of generality that W contains a non-zero vector ⃗u. Then
by theorem 6.2 and its proof, one can construct a basis of W containing ⃗u such that the basis
contains no more than dimV elements. It means that W is finite dimensional and,

dimW ≤ dimV. (6.3)

Since W is a proper subspace of V , there is a vector ⃗v ∈ V such that ⃗v ∉ W , i.e., ⃗v doesn't belong to the subspace spanned by the given basis of W . Hence, by lemma 6.1, the set one obtains by adjoining ⃗v to the given basis of W is a linearly independent set in V , the cardinality of which is dim W + 1. Since no linearly independent subset of V can contain more than dim V elements (corollary 5.7), dim W + 1 ≤ dim V , i.e., dim W < dim V. ■
Corollary 6.4
(Corollary 2 to Theorem 6.2) In a finite dimensional vector space V , every non-empty
linearly independent set of vectors is part of a basis.

Corollary 6.5
(Corollary 3 to Theorem 6.2) Let A be an n × n matrix over F, suppose the row vectors of
A form a linearly independent set of vectors in Fn . Then A is invertible.

Proof (Exercise).

Theorem 6.6
If W1 and W2 are finite dimensional subspaces of a vector space V , then the sum set,

W1 + W2 = {w⃗1 + w⃗2 | w⃗1 ∈ W1 and w⃗2 ∈ W2 } ,

is finite-dimensional and,

dimW1 + dimW2 = dim (W1 ∩ W2 ) + dim (W1 + W2 )

Proof. By theorem 6.2 and its corollaries, the subspace W1 ∩ W2 of both the finite dimensional vector spaces W1 and W2 is finite-dimensional. Suppose it has a finite basis {⃗u1 , ⃗u2 , . . . , ⃗ul } which is part of a basis {⃗u1 , ⃗u2 , . . . , ⃗ul , ⃗v1 , ⃗v2 , . . . , ⃗vm } for W1 and part of a basis {⃗u1 , ⃗u2 , . . . , ⃗ul , w⃗1 , w⃗2 , . . . , w⃗n } for W2 .

The subspace W1 + W2 (it is easy to check that it is a subspace!) is spanned by the vectors belonging to the following set,

{⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm , w⃗1 , . . . , w⃗n } .

Now, the claim is that the set above is a linearly independent set. Suppose,
Σ_{i=1}^{l} αi ⃗ui + Σ_{j=1}^{m} βj ⃗vj + Σ_{k=1}^{n} γk w⃗k = ⃗0 , (6.4)

for αi , βj , γk ∈ F. Then,

− Σ_{k=1}^{n} γk w⃗k = Σ_{i=1}^{l} αi ⃗ui + Σ_{j=1}^{m} βj ⃗vj , (6.5)

which shows that Σ_{k=1}^{n} γk w⃗k belongs to W1 . But Σ_{k=1}^{n} γk w⃗k also belongs to W2 . Hence, Σ_{k=1}^{n} γk w⃗k belongs to W1 ∩ W2 . So, there are scalars ρi ∈ F such that,

Σ_{k=1}^{n} γk w⃗k = Σ_{i=1}^{l} ρi ⃗ui . (6.6)

Now, equation 6.6 can be rewritten as Σ_{i=1}^{l} ρi ⃗ui − Σ_{k=1}^{n} γk w⃗k = ⃗0, and since {⃗u1 , ⃗u2 , . . . , ⃗ul , w⃗1 , . . . , w⃗n } is linearly independent, one concludes that,

γ1 = γ2 = · · · = γn = ρ1 = ρ2 = · · · = ρl = 0. (6.7)

Using equation 6.7 in equation 6.5, one obtains,

Σ_{i=1}^{l} αi ⃗ui + Σ_{j=1}^{m} βj ⃗vj = ⃗0 , (6.8)

and since {⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm } is also a linearly independent set,

α1 = α2 = · · · = αl = β1 = · · · = βm = 0 (6.9)

Equations 6.4, 6.7, and 6.9 together imply that whenever,

Σ_{i=1}^{l} αi ⃗ui + Σ_{j=1}^{m} βj ⃗vj + Σ_{k=1}^{n} γk w⃗k = ⃗0 ,

one has,

α1 = · · · = αl = β1 = · · · = βm = γ1 = · · · = γn = 0.

Thus, {⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm , w⃗1 , . . . , w⃗n } is a linearly independent set. Also, the set above spans W1 + W2 . Hence, {⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm , w⃗1 , . . . , w⃗n } is a basis for W1 + W2 . Finally,

dimW1 + dimW2 = (l + m) + (l + n)

= l + (m + l + n)
= dim (W1 ∩ W2 ) + dim (W1 + W2 ) .
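The dimension identity just proved can be illustrated numerically. In the sketch below (assuming NumPy; the two subspaces of R4 are arbitrary illustrative choices whose intersection is found by inspection), the dimensions of W1 , W2 and W1 + W2 are computed as ranks of the matrices of spanning rows.

import numpy as np

# W1 = span{e1, e2, e3},  W2 = span{e3, e4} in R^4;  W1 ∩ W2 = span{e3}.
W1 = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0]], dtype=float)
W2 = np.array([[0, 0, 1, 0],
               [0, 0, 0, 1]], dtype=float)

dim_W1 = np.linalg.matrix_rank(W1)                    # 3
dim_W2 = np.linalg.matrix_rank(W2)                    # 2
dim_sum = np.linalg.matrix_rank(np.vstack([W1, W2]))  # dim(W1 + W2) = 4
dim_int = 1                                           # dim(W1 ∩ W2), by inspection
print(dim_W1 + dim_W2 == dim_int + dim_sum)           # True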

Suggested Exercises (Hoffman and Kunze) (Page: 48), Problem: 1, 2, 3, 7, 11.

§6.2 Coordinates

Definition 6.1. If V is a finite dimensional vector space, an ordered basis for V is a finite
sequence of vectors which is linearly independent and spans V .

Remark. If a sequence ⃗u1 , ⃗u2 , . . . , ⃗un is an ordered basis, then the set {⃗u1 , ⃗u2 , . . . , ⃗un } is a basis
for V . The ordered basis is the set, together with the specific ordering. In what follows, we
will be engaged in a slight abuse of notation and describe an ordered basis for V as below:

B = {⃗u1 , ⃗u2 , . . . , ⃗un } .

Given ⃗u ∈ V , there is a unique n-tuple (x1 , x2 , . . . , xn ) of scalars, i.e., xi ∈ F, ∀ i ∈ {1, 2, . . . , n} , such that,

⃗u = Σ_{i=1}^{n} xi ⃗ui . (6.10)

The n-tuple is unique, because if we also had

⃗u = Σ_{i=1}^{n} yi ⃗ui ,

then we would have Σ_{i=1}^{n} (xi − yi ) ⃗ui = ⃗0, whence linear independence of the set {⃗u1 , ⃗u2 , . . . , ⃗un } would yield xi = yi , ∀ i ∈ {1, 2, . . . , n} .

We shall call xi the ith coordinate of ⃗u relative to the ordered basis B = {⃗u1 , ⃗u2 , . . . , ⃗un }. If another vector ⃗v ∈ V can be written as,

⃗v = Σ_{i=1}^{n} zi ⃗ui ,

then ⃗u + ⃗v = Σ_{i=1}^{n} (xi + zi ) ⃗ui , (6.11)

so that the ith coordinate of the vector ⃗u + ⃗v in this ordered basis B reads xi + zi . Similarly,
the ith coordinate of (c ⃗u) is c xi .

This is how one sees that each ordered basis for the vector space V over F determines a one-one correspondence,

⃗u → (x1 , x2 , . . . , xn ) ,

between the set of all vectors of V and the set of all n-tuples in Fn .

Most of the time, it is more convenient to use the coordinate matrix of ⃗u relative to the ordered basis B,

X = [x1 x2 . . . xn ]T .

To indicate the dependence of this coordinate matrix on the basis, we'll use the symbol [⃗u]B for the coordinate matrix of the vector ⃗u ∈ V relative to the ordered basis B, i.e.,

[⃗u]B = [x1 x2 . . . xn ]T . (6.12)
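In concrete terms, finding [⃗u]B amounts to solving one linear system: if the basis vectors are placed as the columns of a matrix M, the coordinate matrix is the solution of M x = ⃗u. A minimal Python sketch (assuming NumPy; the basis and the vector are arbitrary illustrations):

import numpy as np

M = np.array([[1.0, 1.0, 0.0],      # columns are the ordered basis u1, u2, u3 of R^3
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
u = np.array([2.0, 3.0, 1.0])

x = np.linalg.solve(M, u)           # the coordinate matrix [u]_B
print(x)                            # [0. 2. 1.]
print(np.allclose(M @ x, u))        # True: u = x1*u1 + x2*u2 + x3*u3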
Question: What happens to the coordinates of ⃗u ∈ V as we change from one ordered basis of V to another? Suppose that V is n-dimensional and that,

B = {⃗u1 , ⃗u2 , . . . , ⃗un } and B′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } ,

are 2 ordered bases for V . There are unique scalars Pij such that,
⃗u ′j = Σ_{i=1}^{n} Pij ⃗ui , 1 ≤ j ≤ n. (6.13)

The Pij 's are unique for a given j ∈ {1, 2, . . . , n} , since we are writing each ⃗u ′j ∈ V as a unique linear combination of the vectors ⃗ui in the ordered basis B = {⃗u1 , ⃗u2 , . . . , ⃗un } .

[⃗u ′j ]B = [P1j P2j . . . Pnj ]T (6.14)

[⃗u ′j ]B is the coordinate matrix of ⃗u ′j relative to the ordered basis B. Let x′1 , x′2 , . . . , x′n be the coordinates of a given vector ⃗u ∈ V in the ordered basis B′ . Then,

⃗u = x′1⃗u ′1 + x′2⃗u ′2 + · · · + x′n⃗u ′n
  = Σ_{j=1}^{n} x′j ⃗u ′j
  = Σ_{j=1}^{n} x′j Σ_{i=1}^{n} Pij ⃗ui ; [using 6.13]
  = Σ_{j=1}^{n} Σ_{i=1}^{n} Pij x′j ⃗ui .

Thus we obtain,

⃗u = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij x′j ) ⃗ui . (6.15)
On the other hand, ⃗u = Σ_{i=1}^{n} xi ⃗ui . Since the coordinates x1 , x2 , . . . , xn of ⃗u in the ordered basis B = {⃗u1 , ⃗u2 , . . . , ⃗un } are uniquely determined, it follows from equation 6.15 that,

xi = Σ_{j=1}^{n} Pij x′j . (6.16)
Recall from equation 6.14 that [⃗u ′j ]B = [P1j P2j . . . Pnj ]T .

Let P be the n × n matrix whose i, j entry is the scalar Pij given in equation 6.16 or in 6.13. Equation 6.16 can be written using the matrix P as follows:

[⃗u]B = P [⃗u]B′ (6.17)



Since both B and B′ are linearly independent sets, [⃗u]B = ⃗0 =⇒ x1 = x2 = · · · = xn = 0, i.e., ⃗u = Σ_{i=1}^{n} xi ⃗ui = ⃗0. But ⃗u = Σ_{i=1}^{n} xi ⃗ui = Σ_{i=1}^{n} x′i ⃗u ′i , so Σ_{i=1}^{n} x′i ⃗u ′i = ⃗0, and therefore x′1 = x′2 = · · · = x′n = 0 [as B′ is linearly independent]. In other words, [⃗u]B′ = ⃗0, i.e., [⃗u]B = 0 ⇐⇒ [⃗u]B′ = 0. From equation 6.17, one obtains that P [⃗u]B′ = ⃗0 has only the trivial solution. Theorem 4.11 of lecture 4 then implies that P is invertible.
Hence, one can write,
[⃗u]B′ = P −1 [⃗u]B . (6.18)
We formalize the discussion above by means of a theorem.

Theorem 6.7
Let V be an n-dimensional vector space over the field F, and let B and B′ be 2 ordered bases of V . Then there is a unique, necessarily invertible, n × n matrix P with entries in F such that,
(i) [⃗u]B = P [⃗u]B′ ,
(ii) [⃗u]B′ = P −1 [⃗u]B ,
for every vector ⃗u ∈ V . The columns of P are given by Pj = [⃗u ′j ]B , j = 1, 2, . . . , n.
 

Example 6.1. Given the field R of real numbers and θ ∈ R, the matrix,

P = [ cos θ   − sin θ ]
    [ sin θ     cos θ ]

is invertible, and its inverse is equal to its transpose:

P −1 = [  cos θ   sin θ ]
       [ − sin θ  cos θ ]

We discussed earlier (observe equation 6.13) that the vectors ⃗u ′j belonging to the ordered basis B′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } can be identified with the n-tuples in Fn whose components (coordinates) in the basis B = {⃗u1 , ⃗u2 , . . . , ⃗un } are given by Pij , 1 ≤ i ≤ n, i.e., [⃗u ′j ]B = [P1j P2j . . . Pnj ]T .

Hence, [⃗u ′1 ]B = [P11 P21 ]T = [cos θ  sin θ]T and [⃗u ′2 ]B = [P12 P22 ]T = [− sin θ  cos θ]T .
In other words, one can identify the ordered basis B′ = {⃗u ′1 , ⃗u ′2 } with the pair of vectors [cos θ  sin θ]T , [− sin θ  cos θ]T in R2 . If ⃗u ∈ V with [⃗u]B = [x1 x2 ]T and one writes [⃗u]B′ = [x′1 x′2 ]T , then,

[⃗u]B′ = P −1 [⃗u]B ,

=⇒  [ x′1 ]   [  cos θ   sin θ ] [ x1 ]
    [ x′2 ] = [ − sin θ  cos θ ] [ x2 ] .
Or,
x1′ = x1 cos θ + x2 sin θ;
x2′ = −x1 sin θ + x2 cos θ.
Geometrically, one obtains the coordinate matrix of ⃗u relative to the basis B′ = {⃗u ′1 , ⃗u ′2 } by rotating (through the angle −θ) the coordinate matrix of ⃗u relative to the standard ordered basis B = {⃗u1 , ⃗u2 } ≡ { [1 0]T , [0 1]T } . The coordinate matrix of ⃗u = x1⃗u1 + x2⃗u2 relative to the ordered basis B is [⃗u]B = [x1 x2 ]T . Indeed, verify that ⃗u ′j = Σ_{i=1}^{2} Pij ⃗ui holds, i.e.,

⃗u ′1 = P11⃗u1 + P21⃗u2 = cos θ [1 0]T + sin θ [0 1]T = [cos θ  sin θ]T ,
⃗u ′2 = P12⃗u1 + P22⃗u2 = − sin θ [1 0]T + cos θ [0 1]T = [− sin θ  cos θ]T .
On the other hand, in matrix notation, ⃗u ′j = Σ_{i=1}^{n} Pij ⃗ui translates to,

[⃗u ′1 ⃗u ′2 . . . ⃗u ′n ]T = P T [⃗u1 ⃗u2 . . . ⃗un ]T ;

in particular, in the present context, one has,

[ ⃗u ′1 ]   [ P11  P21 ] [ ⃗u1 ]
[ ⃗u ′2 ] = [ P12  P22 ] [ ⃗u2 ]

=⇒  [ ⃗u1 ]   [ P11  P12 ] [ ⃗u ′1 ]
    [ ⃗u2 ] = [ P21  P22 ] [ ⃗u ′2 ]     (since P −1 = P T here)

=⇒  [ ⃗u1 ]   [ cos θ  − sin θ ] [ ⃗u ′1 ]
    [ ⃗u2 ] = [ sin θ    cos θ ] [ ⃗u ′2 ]
Or,
⃗u1 = (cos θ) ⃗u ′1 − (sin θ) ⃗u ′2 , and,
⃗u2 = (sin θ) ⃗u ′1 + (cos θ) ⃗u ′2 .
     
Therefore, one verifies that

⃗u1 = cos θ [cos θ  sin θ]T − sin θ [− sin θ  cos θ]T = [1 0]T ,
⃗u2 = sin θ [cos θ  sin θ]T + cos θ [− sin θ  cos θ]T = [0 1]T .
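The change of coordinates in this example is easy to reproduce numerically; the sketch below (assuming NumPy; the angle and the vector are arbitrary illustrations) checks both that P −1 = P T and that [⃗u]B = P [⃗u]B′ .

import numpy as np

theta = np.pi / 6
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # columns are [u'_1]_B and [u'_2]_B

u_B = np.array([1.0, 2.0])                 # coordinates of u in the standard basis B
u_Bprime = np.linalg.inv(P) @ u_B          # coordinates of the same u in B'

print(np.allclose(np.linalg.inv(P), P.T))  # True: the inverse is the transpose
print(np.allclose(P @ u_Bprime, u_B))      # True: [u]_B = P [u]_B'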
Example 6.2. Let F be a sub-field of the field C of complex numbers. It's an easy exercise for you to find that,

P = [ −1  4   5 ]
    [  0  2  −3 ]
    [  0  0   8 ]

is invertible with inverse,

P −1 = [ −1   2   11/8 ]
       [  0  1/2  3/16 ] .
       [  0   0   1/8  ]
Now using equation 6.14, [⃗u ′j ]B = [P1j P2j . . . Pnj ]T , one obtains,

[⃗u ′1 ]B = [−1  0  0]T ;  [⃗u ′2 ]B = [4  2  0]T ;  [⃗u ′3 ]B = [5  −3  8]T ;

the 3 vectors above form a basis B′ of F3 . Here B is the standard basis of F3 .
We also know that,

[⃗u ′1 ⃗u ′2 ⃗u ′3 ]T = P T [⃗u1 ⃗u2 ⃗u3 ]T ,

or, [⃗u1 ⃗u2 ⃗u3 ]T = (P T )−1 [⃗u ′1 ⃗u ′2 ⃗u ′3 ]T , i.e.,

[ ⃗u1 ]   [ −1     0     0  ] [ ⃗u ′1 ]
[ ⃗u2 ] = [  2    1/2    0  ] [ ⃗u ′2 ] ,
[ ⃗u3 ]   [ 11/8  3/16  1/8 ] [ ⃗u ′3 ]

so that,

⃗u1 = −⃗u ′1 ,
⃗u2 = 2⃗u ′1 + (1/2)⃗u ′2 ,
⃗u3 = (11/8)⃗u ′1 + (3/16)⃗u ′2 + (1/8)⃗u ′3 .

Therefore,

⃗u1 = (1, 0, 0) ;
⃗u2 = (−2, 0, 0) + (2, 1, 0) = (0, 1, 0) ;
⃗u3 = (−11/8, 0, 0) + (3/4, 3/8, 0) + (5/8, −3/8, 1) = (0, 0, 1) .

Therefore, indeed B = {(1, 0, 0) , (0, 1, 0) , (0, 0, 1)} is the standard basis of F3 . The coordinates x′1 , x′2 , x′3 of the vector ⃗u relative to the basis B′ are given by,

[⃗u]B′ = P −1 [⃗u]B .
In components, by theorem 6.7 (ii),

[ x′1 ]   [ −1   2   11/8 ] [ x1 ]
[ x′2 ] = [  0  1/2  3/16 ] [ x2 ] .
[ x′3 ]   [  0   0   1/8  ] [ x3 ]

Therefore,

[ x′1 ]   [ −x1 + 2x2 + (11/8) x3 ]
[ x′2 ] = [  (1/2) x2 + (3/16) x3 ]   (6.19)
[ x′3 ]   [       (1/8) x3        ]
We know that,
x1⃗u1 + x2⃗u2 + x3⃗u3 = ⃗u = x′1⃗u ′1 + x′2⃗u ′2 + x′3⃗u ′3 . (6.20)
Choose for example,
x1 = 3, x2 = 3 and x3 = −8. (6.21)
From equation 6.19 then it follows that,
1
x′1 = −10, x′2 = − , x′3 = −1. (6.22)
2
Since {⃗u1 , ⃗u2 , ⃗u3 } = B is the standard basis for F3 , by plugging in the value for x1 , x2 , x3 and
x1′ , x2′ , x3′ from equations 6.21 and 6.22 into equation 6.20, one then obtains,
 
3
 2  = −10⃗u ′1 − 1 ⃗u ′2 − ⃗u ′3 .
−8 2

Suggested Exercises (Hoffman and Kunze) (Page: 54), Problem: 1, 2, 4, 5, 6.

§6.3 Revisiting row-equivalence


Recall that if A is an m × n matrix over the field F, the row vectors of A are the vectors α⃗1 , α⃗2 , . . . , α⃗m in Fn , i.e., each of the m vectors α⃗i is an n-tuple,

α⃗i = (Ai1 , Ai2 , . . . , Ain ) ,

and the row space of A is defined to be the subspace of Fn spanned by these m vectors. The
row rank of A is the dimension of the row space of A. If P is a k × m matrix, then B = P A is
a k × n matrix whose row vectors β⃗1 , . . . , β⃗k are linear combinations of the rows of A:
β⃗i = Σ_{j=1}^{m} Pij α⃗j , 1 ≤ i ≤ k. (6.23)

Thus, one immediately sees that the row space of B is a subspace of the row space of A. If k = m and P is an invertible m × m matrix, then the analogue of equation 6.23 exists expressing each row α⃗i of the m × n matrix A as a linear combination of the rows β⃗j of the m × n matrix B = P A, establishing the row-equivalence of the 2 matrices A and B. In this case, one finds that the row space of A is a subspace of the row space of B. One, therefore, has the following theorem.

Theorem 6.8
Row-equivalent matrices have the same row space. Therefore, we see that studying the row
space of a given matrix A is equivalent to studying the row space of a row-reduced echelon
matrix which is row-equivalent to A.

We now proceed to study the row space of a row-reduced echelon matrix by means of
the following theorem whose proof is omitted.

Theorem 6.9
Let R be a non-zero row-reduced echelon matrix. Then, the non-zero row vectors of R
form a basis for the row space of R.

Corollary 6.10
Each m × n matrix A is row-equivalent to one and only one row-reduced echelon matrix.

Corollary 6.11
Let A and B be m × n matrices over the field F. Then A and B are row-equivalent
if and only if they have the same row space.

§6.3.i To summarize
If A and B are m × n matrices over F, then the following statements are equivalent:
1. A and B are row-equivalent.
2. A and B have the same row space.
3. B = P A, where P is an invertible m × m matrix.

§6.3.ii Computations concerning subspaces


Suppose we are given m vectors α⃗1 , α⃗2 , . . . , α⃗m (n-tuples) in Fn . We consider the following questions:

1. How does one determine if the vectors α⃗1 , α⃗2 , . . . , α⃗m are linearly independent? More generally, how does one find the dimension of the subspace W spanned by these vectors?

2. Given β⃗ ∈ Fn , how does one determine if β⃗ is a linear combination of α⃗1 , . . . , α⃗m , i.e., if β⃗ is in the subspace W spanned by α⃗1 , . . . , α⃗m ?

3. How can one give an "explicit description" of the subspace W ? It will be made clear in a while.

Let A be the m × n matrix with row vectors α⃗1 , α⃗2 , . . . , α⃗m :

α⃗i = (Ai1 , Ai2 , . . . , Ain ) .

Perform a sequence of ERO’s on A to obtain a row-reduced echelon matrix R. At this point,


the dimension of W (the row space of A) is simply the number of non-zero rows of R.
If ρ⃗1 , . . . , ρ⃗r are the non-zero row vectors of R, then B = {ρ⃗1 , . . . , ρ⃗r } is a basis for W .

The subspace W consists of all vectors of the following form:

β⃗ = Σ_{i=1}^{r} Ci ρ⃗i = Σ_{i=1}^{r} Ci (Ri1 , . . . , Rin ) = ( Σ_{i=1}^{r} Ci Ri1 , Σ_{i=1}^{r} Ci Ri2 , . . . , Σ_{i=1}^{r} Ci Rin ) ,

where (Ri1 , . . . , Rin ) is the ith row of R. Writing β⃗ ≡ (b1 , b2 , . . . , bn ) ∈ Fn , the coordinates b1 , b2 , . . . , bn of such a vector β⃗ are then,

bj = Σ_{i=1}^{r} Ci Rij , β⃗ = (b1 , b2 , . . . , bn ) . (6.24)

The row-reduced echelon matrix R looks as follows: its first r rows are non-zero, each beginning with a leading 1, the remaining rows are zero, the leading 1 of row i sits in column ki , every other entry of that column is 0, and the leading 1's move to the right as one goes down the rows. If the first non-zero coordinate (the leading 1) of the row ρ⃗i occurs in the ki -th column of R, then we have, for i ≤ r (the first r rows), the defining properties of a row-reduced echelon matrix R:
(a) Rij = 0, if j < ki ;
(b) Rikj = δij ;
(c) k1 < · · · < kr .
Now, get back to equation 6.24:

bj = Σ_{i=1}^{r} Ci Rij ,

take j = ks (for some s ≤ r) in the above equation and use property (b) of the row-reduced echelon matrix R:

bks = Σ_{i=1}^{r} Ci Riks = Cs . (6.25)

Therefore, equation 6.24 takes the following form:

bj = Σ_{i=1}^{r} bki Rij , j = 1, 2, . . . , n. (6.26)
Equation 6.26 is the "explicit description" of the subspace W spanned by α⃗1 , . . . , α⃗m , i.e., the subspace consists of all vectors β⃗ ∈ Fn whose coordinates satisfy equation 6.26. Hence,

β⃗ = Σ_{i=1}^{r} Ci ρ⃗i = Σ_{i=1}^{r} bki ρ⃗i . (6.27)

It’s better to consider an example at this stage.


Example 6.3. Let W be the subspace of R4 spanned by the vectors,

α⃗1 = (1, 2, 2, 1) ;  α⃗2 = (0, 2, 0, 1) ;  α⃗3 = (−2, 0, −4, 3) .

(a) Prove that α⃗1 , α⃗2 , α⃗3 form a basis for W , i.e., that these vectors are linearly independent.

(b) Let β⃗ = (b1 , b2 , b3 , b4 ) be a vector in W . What are the coordinates of β⃗ relative to the ordered basis {α⃗1 , α⃗2 , α⃗3 }?
(c) Let,

α⃗ ′1 = (1, 0, 2, 0) ,
α⃗ ′2 = (0, 2, 0, 1) ,
α⃗ ′3 = (0, 0, 0, 3) .

Show that α⃗ ′1 , α⃗ ′2 , α⃗ ′3 form a basis for W .

(d) If β⃗ ∈ W , let X denote the coordinate matrix of β⃗ relative to the α-basis and X′ the coordinate matrix of β⃗ relative to the α′-basis. Find the 3 × 3 matrix P such that X = P X′ for every such β⃗ ∈ W .

To answer these questions we form the matrix A with row vectors α⃗1 , α⃗2 , α⃗3 , find the row-reduced echelon matrix R which is row-equivalent to A, and simultaneously perform the same operations on the 3 × 3 identity matrix to obtain the invertible matrix Q such that R = QA.

     
Row reduction of A:

[  1  2  2  1 ]                       [ 1  2  2  1 ]
[  0  2  0  1 ]    2R1 + R3 = R3′     [ 0  2  0  1 ]
[ −2  0 −4  3 ]                       [ 0  4  0  5 ]

                   −R2 + R1 = R1′     [ 1  0  2  0 ]
                   −2R2 + R3 = R3′    [ 0  2  0  1 ]
                                      [ 0  0  0  3 ]

                   (1/3)R3 = R3′      [ 1  0  2  0 ]
                   −R3 + R2 = R2′     [ 0  2  0  0 ]
                                      [ 0  0  0  1 ]

                   (1/2)R2 = R2′      R = [ 1  0  2  0 ]
                                          [ 0  1  0  0 ]
                                          [ 0  0  0  1 ]

Applying the same ERO's to the 3 × 3 identity matrix:

[ 1  0  0 ]                       [ 1  0  0 ]
[ 0  1  0 ]    2R1 + R3 = R3′     [ 0  1  0 ]
[ 0  0  1 ]                       [ 2  0  1 ]

               −R2 + R1 = R1′     [ 1  −1  0 ]
               −2R2 + R3 = R3′    [ 0   1  0 ]
                                  [ 2  −2  1 ]

               (1/3)R3 = R3′      [   1    −1     0  ]
               −R3 + R2 = R2′     [ −2/3   5/3  −1/3 ]
                                  [  2/3  −2/3   1/3 ]

               (1/2)R2 = R2′      Q = [   1     −1     0   ]          [  6  −6   0 ]
                                      [ −1/3   5/6   −1/6  ] = (1/6)  [ −2   5  −1 ]
                                      [  2/3  −2/3    1/3  ]          [  4  −4   2 ]

§6.4 Why does the above method work?


Here, Q is the matrix obtained by applying the ERO's (equivalently, a product of elementary matrices) to I. In words, after performing a set of ERO's
on I, one reaches the matrix Q, i.e.,

Q = (Es . . . E2 E1 ) I (Product of elementary matrices).

or in terms of elementary row operations:

Q = es (. . . e2 (e1 (I)) . . . ) .

Now, given the matrix A, by theorem 4.2 of lecture 4:

e1 (A) = E1 A,

e2 (e1 (A)) = E2 E1 A,
∴ es (. . . e2 (e1 (A)) . . . ) = (Es . . . E2 E1 ) A,
But,
es (. . . e2 (e1 (A)) . . . ) = R,
where R is the row-reduced echelon matrix.

∴ R = (Es . . . E2 E1 ) A,

= QA
∴ we obtain Q by applying the same set of ERO’s on I as the ones applied on A to obtain the
row-reduced echelon matrix R.

§6.4.i Solution (a)


Since there are 3 non-zero rows in R, A has row rank 3. Hence, the 3 vectors α⃗1 , α⃗2 and α⃗3 , the rows of the matrix A, are linearly independent and therefore form a basis for W .

§6.4.ii Solution (b)


A basis for W is given by ρ⃗1 , ρ⃗2 , ρ⃗3 , the non-zero row vectors of R:

R = [ 1  0  2  0 ]   ← ρ⃗1
    [ 0  1  0  0 ]   ← ρ⃗2
    [ 0  0  0  1 ]   ← ρ⃗3

where k1 = 1 for the leading 1 in row ρ⃗1 , k2 = 2 for the leading 1 in row ρ⃗2 , and k3 = 4 for the leading 1 in row ρ⃗3 . Using equation 6.27,

β⃗ = Σ_{i=1}^{3} bki ρ⃗i = bk1 ρ⃗1 + bk2 ρ⃗2 + bk3 ρ⃗3 = b1 ρ⃗1 + b2 ρ⃗2 + b4 ρ⃗3 .
One, therefore, writes for β⃗ ∈ W :

β⃗ = b1 ρ⃗1 + b2 ρ⃗2 + b4 ρ⃗3 = [b1 b2 b4 ] R = (b1 , b2 , 2b1 , b4 ) ,

i.e., the span of ρ⃗1 , ρ⃗2 , ρ⃗3 consists of the vectors β⃗ = (b1 , b2 , b3 , b4 ) for which b3 = 2b1 .

Now, β⃗ = [b1 b2 b4 ] R; where R is the row-reduced echelon matrix whose non-zero rows are
ρ⃗1 , ρ⃗2 , ρ⃗3 .

Therefore, since R = QA,

β⃗ = [b1 b2 b4 ] Q A = [x1 x2 x3 ] A ,

where [x1 x2 x3 ] = [b1 b2 b4 ] Q is the row of coordinates of β⃗ relative to the ordered basis {α⃗1 , α⃗2 , α⃗3 } (the rows of A). Writing Q = [Q1 Q2 Q3 ], where Q1 , Q2 , Q3 are the columns of Q, this gives [b1 b2 b4 ] Qi = xi , i = 1, 2, 3. One, therefore, obtains,


 
[b1 b2 b4 ] [ 1   −1/3   2/3 ]T = x1 =⇒ x1 = b1 − (1/3) b2 + (2/3) b4 ;
[b1 b2 b4 ] [ −1   5/6  −2/3 ]T = x2 =⇒ x2 = −b1 + (5/6) b2 − (2/3) b4 ;
[b1 b2 b4 ] [ 0   −1/6   1/3 ]T = x3 =⇒ x3 = −(1/6) b2 + (1/3) b4 .

One gets:

x1 = b1 − (1/3) b2 + (2/3) b4 ;
x2 = −b1 + (5/6) b2 − (2/3) b4 ;
x3 = −(1/6) b2 + (1/3) b4 . (6.28)

§6.4.iii Solution (c)



One can form the matrix A′ by taking its rows to be the vectors α⃗ ′1 , α⃗ ′2 and α⃗ ′3 :

A′ = [ 1  0  2  0 ]
     [ 0  2  0  1 ]
     [ 0  0  0  3 ]

and notice that A′ is row-equivalent to A (it reduces to the same row-reduced echelon matrix R). Since A is of rank 3, A′ is also of rank 3. Hence, the vectors α⃗ ′1 , α⃗ ′2 and α⃗ ′3 span W and are linearly independent. In other words, α⃗ ′1 , α⃗ ′2 and α⃗ ′3 form a basis for W .

§6.4.iv Solution (d)


We know from theorem 6.7 that,

Pj = [α⃗ ′j ]B ,

where Pj = [P1j P2j P3j ]T is the jth column of the 3 × 3 invertible matrix P (have a look at equation 6.14) and B = {α⃗1 , α⃗2 , α⃗3 } . We know, β⃗ = b1 ρ⃗1 + b2 ρ⃗2 + b4 ρ⃗3 .

Take β⃗ = α⃗ ′1 = [1 0 2 0] . Now, we want [1 0 2 0] = b1 [1 0 2 0] + b2 [0 1 0 0] + b4 [0 0 0 1] , which means b1 = 1, b2 = 0, b4 = 0. Plugging these values into equation 6.28, one gets:

x1 = 1;
x2 = −1;
x3 = 0. (6.29)
From equation 6.29, it follows that,

α⃗ ′1 = x1 α⃗1 + x2 α⃗2 + x3 α⃗3 = α⃗1 − α⃗2 .

Now, take β⃗ = α⃗ ′2 = [0 2 0 1].

We want [0 2 0 1] = b1 [1 0 2 0] + b2 [0 1 0 0] + b4 [0 0 0 1] , which implies b1 = 0, b2 = 2, and b4 = 1. Plugging these values again into equation 6.28 yields:

x1 = −2/3 + 2/3 = 0,
x2 = 5/3 − 2/3 = 1,
x3 = −1/3 + 1/3 = 0. (6.30)
From equation 6.30, it follows that,

α⃗ ′2 = [0 2 0 1] = x1 α⃗1 + x2 α⃗2 + x3 α⃗3 = α⃗2 .

Therefore, α⃗ ′2 = α⃗2 . Now, take β⃗ = α⃗ ′3 = [0 0 0 3].

We want α⃗ ′3 = [0 0 0 3] = b1 [1 0 2 0] + b2 [0 1 0 0] + b4 [0 0 0 1] , which implies b1 = 0, b2 = 0, and b4 = 3. Plugging these values again into equation 6.28 yields:

x1 = 2; x2 = −2; x3 = 1. (6.31)

From equation 6.31, it follows that,

α⃗ ′3 = [0 0 0 3] = x1 α⃗1 + x2 α⃗2 + x3 α⃗3 = 2α⃗1 − 2α⃗2 + α⃗3 .

One, therefore, has the columns of P :

[α⃗ ′1 ]B = [1  −1  0]T ;  [α⃗ ′2 ]B = [0  1  0]T ;  [α⃗ ′3 ]B = [2  −2  1]T .
Hence,

P = [  1  0   2 ]
    [ −1  1  −2 ] .
    [  0  0   1 ]
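All the matrices appearing in this example can be verified at once with a short Python sketch (assuming NumPy is available). The last line checks that the columns of P really are the coordinates of α⃗ ′1 , α⃗ ′2 , α⃗ ′3 relative to {α⃗1 , α⃗2 , α⃗3 }.

import numpy as np

A = np.array([[ 1, 2,  2, 1],
              [ 0, 2,  0, 1],
              [-2, 0, -4, 3]], dtype=float)
Q = np.array([[ 6, -6,  0],
              [-2,  5, -1],
              [ 4, -4,  2]], dtype=float) / 6
R = np.array([[1, 0, 2, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]], dtype=float)
print(np.allclose(Q @ A, R))             # True: R = QA

P = np.array([[ 1, 0,  2],
              [-1, 1, -2],
              [ 0, 0,  1]], dtype=float)
A_prime = np.array([[1, 0, 2, 0],
                    [0, 2, 0, 1],
                    [0, 0, 0, 3]], dtype=float)
# alpha'_j = sum_i P_ij alpha_i, i.e. the columns of A'^T equal A^T P
print(np.allclose(A.T @ P, A_prime.T))   # True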
Suggested Exercises (Hoffman and Kunze) (Page: 66), Problem: 2, 3, 4, 5, 6.
7 Lecture 7

§7.1 Linear Transformation

Definition 7.1. Let V and W be vector spaces over the field F. A linear transformation
from V to W is a function T : V → W satisfying,

T (c ⃗u + ⃗v ) = cT (⃗u) + T (⃗v ) , ∀ ⃗u, ⃗v ∈ V and all scalars c ∈ F.

Example 7.1. If V is any vector space, the identity transformation I, defined by I ⃗u = ⃗u, ∀ ⃗u ∈
V , is a linear transformation from V to itself. The zero transformation 0̂ : V → V given by
0̂⃗u = ⃗0 ∈ V, ∀ ⃗u ∈ V is a linear transformation from V to itself.
Example 7.2. Let F be a field and let P be the vector space of polynomial functions over F.
Such a polynomial function f is given by,

f (x) = C0 + C1 x + · · · + Ck xk , Ci ∈ F, ∀ i ∈ {0, 1, 2, . . . , k} .

Let D : P → P be defined by,

(Df ) (x) = C1 + 2C2 x + · · · + kCk xk−1 .

Verify that D is a linear transformation from P to itself. The linear transformation D is called the differentiation transformation.
Example 7.3. Let A be a fixed (m × n) matrix with entries from the field F. The function
T : Fn×1 → Fm×1 defined by T (⃗u) = A ⃗u ∈ Fm×1 , ∀ ⃗u ∈ Fn×1 , is a linear transformation from
Fn×1 to Fm×1 . Also, the function U : Fm → Fn defined by U (⃗v ) = ⃗v A ∈ Fn , ∀ ⃗v ∈ Fm , is
a linear transformation from Fm to Fn . Here, Fm is shorthand notation for F1×m . This is the
vector space of row-vectors with m entries each from the field F.
 
Remark. 1. If T : V → W is a linear transformation, then T ⃗0V = ⃗0W . Here, ⃗0v is the
zero vector of the vector space V and ⃗0W is the zero vector for the vector space W .
2. If T : V → W is a linear transformation, then T ’preserves’ linear combinations; that is if
⃗u1 , ⃗u2 , . . . , ⃗un are vectors in V and c1 , c2 , . . . , cn are scalars, then

T (C1⃗u1 + C2⃗u2 + · · · + Cn⃗un ) = C1 T (⃗u1 ) + C2 T (⃗u2 ) + · · · + Cn T (⃗un ) . (7.1)

Theorem 7.1
Let V be a finite-dimensional vector space over F and let {⃗u1 , ⃗u2 , . . . , ⃗un } be an ordered
basis for V . Let W be a vector space over the same field F and w ⃗ 1, w ⃗ 2, . . . , w
⃗ n be any
vectors in W . Then there is a unique linear transformation T : V → W such that,

T ⃗uj = w
⃗ j , j = 1, 2, . . . , n. (7.2)

Proof. Existence: Let ⃗u ∈ V . Denote by B = {⃗u1 , ⃗u2 , . . . , ⃗un } the given ordered basis of V .
Let the coordinate matrix of the vector ⃗u ∈ V relative to the basis B be given by,

[⃗u]B = [x1 x2 . . . xn ]T ,

so that in the ordered basis B,

⃗u = x1⃗u1 + x2⃗u2 + · · · + xn⃗un . (7.3)

Now, define T : V → W by its action on ⃗u ∈ V as follows,

T ⃗u = x1 w ⃗ 2 + · · · + xn w
⃗ 1 + x2 w ⃗ n. (7.4)

The definition of the transformation T : V → W is provided using the unique (n × 1) matrix [x1 x2 . . . xn ]T obtained from the vector ⃗u ∈ V and the given vectors w⃗1 , w⃗2 , . . . , w⃗n .
Equation 7.4 is well-defined for any ⃗u ∈ V since for a given ⃗u, its coordinate matrix [⃗u]B
relative to the ordered basis B is unique, i.e., for a given ⃗u ∈ V , there is a unique vector given
by the right hand side of equation 7.4 satisfying equation 7.4.

In addition, if one chooses ⃗u = ⃗uj in equation 7.3 so that,


 
[⃗uj ]B = [0 . . . 0 1 0 . . . 0]T , where the 1 is the j th component,

then from equation 7.4, one has T ⃗uj = w⃗j as required in the statement of theorem 7.1.

It remains to show that T is linear. Let ⃗v ∈ V have the following basis expansion:

⃗v = y1⃗u1 + y2⃗u2 + · · · + yn⃗un . (7.5)

Now, for c ∈ F,

c⃗u + ⃗v = (cx1 + y1 ) ⃗u1 + (cx2 + y2 ) ⃗u2 + · · · + (cxn + yn ) ⃗un ;

[Using equations 7.3 and 7.5]. By the definition of T provided in equation 7.4,

T (c⃗u + ⃗v ) = (cx1 + y1 ) w ⃗ 2 + · · · + (cxn + yn ) w


⃗ 1 + (cx2 + y2 ) w ⃗ n. (7.6)

On the other hand,


c (T ⃗u) + T⃗v ,
= c (x1 w ⃗ 2 + · · · + xn w
⃗ 1 + x2 w ⃗ n ) + (y1 w ⃗ 2 + · · · + yn w
⃗ 1 + y2 w ⃗ n) ,
(cx1 + y1 ) w ⃗ 2 + · · · + (cxn + yn ) w
⃗ 1 + (cx2 + y2 ) w ⃗ n. (7.7)
Combining equation 7.6 and 7.7, one obtains,

T (c⃗u + ⃗v ) = c (T ⃗u) + T⃗v . (7.8)

Hence, T : V → W given by equation 7.4 is linear.



Uniqueness: Suppose K : V → W is another linear transformation satisfying,

K⃗uj = w
⃗ j , j = 1, 2, . . . , n. (7.9)

For the vector ⃗u ∈ V given by equation 7.3,

K⃗u = K (x1⃗u1 + · · · + xn⃗un )
    = x1 K (⃗u1 ) + · · · + xn K (⃗un ) ; [∵ K is linear].
    = x1 w⃗1 + · · · + xn w⃗n ; [∵ K (⃗uj ) = w⃗j , j = 1, 2, . . . , n].
    = T ⃗u; [By equation 7.4]. (7.10)
Since, equation 7.10 holds for any ⃗u ∈ V, K = T. In other words, the linear operator defined
by equation 7.4 is unique. ■
Example 7.4. (Elucidating Theorem 7.1) The vectors ⃗u1 = (1, 2) , ⃗u2 = (3, 4) are linearly
independent and hence form a basis for the vector space R2 . Given 2 vectors (3, 2, 1) and
(6, 5, 4) of R3 , by means of theorem 7.1, there is a unique linear transformation T : R2 → R3
such that,
T ⃗u1 = (3, 2, 1) , and T ⃗u2 = (6, 5, 4) . (7.11)
Where does ϵ1 = (1, 0) go under the action of this unique linear transformation T , i.e., what is T ϵ1 ?
First write ϵ1 = (1, 0) as a linear combination of the 2 basis elements ⃗u1 = (1, 2) and ⃗u2 = (3, 4)
as follows:
ϵ1 = c1⃗u1 + c2⃗u2
=⇒ (1, 0) = c1 (1, 2) + c2 (3, 4) =⇒ (1, 0) = (c1 + 3c2 , 2c1 + 4c2 ) (7.12)
From equation 7.12, one obtains,

c1 + 3c2 = 1, 2c1 + 4c2 = 0,

=⇒ 2c2 = 2,
∴ c2 = 1 and c1 = −2.
∴ (1, 0) = −2⃗u1 + ⃗u2 , so that T (1, 0) = −2T ⃗u1 + T ⃗u2 ,
= −2 (3, 2, 1) + (6, 5, 4) ; [ Using equation 7.11 ].
= (0, 1, 2) .
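Numerically, the unique T of theorem 7.1 can be built from its values on a basis: its matrix in the standard bases is W U −1 , where the columns of U are the basis vectors and the columns of W are their prescribed images. A minimal sketch for this example (assuming NumPy is available):

import numpy as np

U = np.array([[1.0, 3.0],           # columns are u1 = (1, 2) and u2 = (3, 4)
              [2.0, 4.0]])
W = np.array([[3.0, 6.0],           # columns are T(u1) = (3, 2, 1) and T(u2) = (6, 5, 4)
              [2.0, 5.0],
              [1.0, 4.0]])

T = W @ np.linalg.inv(U)            # matrix of T in the standard bases
print(T @ np.array([1.0, 0.0]))     # [0. 1. 2.]  =  T(eps_1)
print(np.allclose(T @ U, W))        # True: T(u1) = w1 and T(u2) = w2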
2 interesting subspaces that arise in the context of a linear transformation T : V → W are the Range of T , abbreviated as RT ⊂ W , and the Kernel of T , denoted by Ker T ⊂ V . Let us quickly verify that RT = {T⃗v | ⃗v ∈ V } is indeed a subspace of W and that Ker T = {⃗v ∈ V | T⃗v = ⃗0W } is a vector subspace of V .

We will use theorem 5.1 of lecture 5 for this purpose. Let ⃗u1 , ⃗u2 ∈ RT , then from the defi-
nition of RT , there are vectors ⃗v1 , ⃗v2 ∈ V s.t. T⃗v1 = ⃗u1 and T⃗v2 = ⃗u2 . Now, since T : V → W
is a linear transformation, for a given c ∈ F,

T (c⃗v1 + ⃗v2 ) = cT⃗v1 + T⃗v2 = c⃗u1 + ⃗u2 .

Thus, we see that ∃ c⃗v1 + ⃗v2 ∈ V, s.t. T (c⃗v1 + ⃗v2 ) = c⃗u1 + ⃗u2 , proving that c⃗u1 + ⃗u2 ∈ RT .
Hence, by theorem 5.1 of lecture 5, RT or Range of the linear transformation T is a vector
subspace of W .
Now, we verify that Ker T is a subspace of V . Indeed, we've seen in a remark preceding theorem 7.1 that T (⃗0V ) = ⃗0W , so that ⃗0V ∈ Ker T and hence Ker T is always non-empty.
Now, let ⃗u1 , ⃗u2 ∈ KerT . Hence, T ⃗u1 = ⃗0W , T ⃗u2 = ⃗0W .

Therefore, for any c ∈ F,

T (c⃗u1 + ⃗u2 ) = cT ⃗u1 + T ⃗u2 ; [Using linearity of T ].

= c.⃗0W + ⃗0W = ⃗0W


Hence, c⃗u1 + ⃗u2 ∈ KerT. Now again application of theorem 5.1 of lecture 5 reveals that Ker T
is a subspace of V .

Definition 7.2. Let V and W be vector spaces over F, and let T be a linear transformation
from V to W . The null space of T is the set of all vectors ⃗u ∈ V such that T ⃗u = ⃗0W . If V
is finite dimensional, the rank of T is the dimension of the range of T and the nullity of T
is the dimension of the null-space of T .

Theorem 7.2
Let V and W be vector spaces over the field F and T : V → W be a linear transformation.
Suppose V is finite dimensional. Then,

rank (T ) + nullity (T ) = dimV. (7.13)

Proof. Let {⃗u1 , ⃗u2 , . . . , ⃗uk } be a basis for Ker T , the null space of T . We know, Ker T is a sub-
space of V . If dimV = n, then there are vectors {⃗uk+1 , ⃗uk+2 , . . . , ⃗un } such that {⃗u1 , ⃗u2 , . . . , ⃗un }
is a basis for V . We will now prove that {T ⃗uk+1 , . . . , T ⃗un } is a basis for RT , the range of T .

First of all, RT = Span {T ⃗u1 , T ⃗u2 , . . . , T ⃗un }. This can easily be seen: given w⃗ ∈ RT , ∃ ⃗u ∈ V s.t. T ⃗u = w⃗ . But since {⃗u1 , . . . , ⃗un } is a basis for V , ⃗u = Σ_{i=1}^{n} ci ⃗ui for some scalars ci ∈ F, leading to T ⃗u = Σ_{i=1}^{n} ci T ⃗ui , using linearity of T . Therefore, given w⃗ ∈ RT , there exist scalars ci such that,

w⃗ = Σ_{i=1}^{n} ci T ⃗ui , i.e., RT = Span {T ⃗u1 , . . . , T ⃗un } .

Now, since {⃗u1 , ⃗u2 , . . . , ⃗uk } is a basis for Ker T , T ⃗uj = ⃗0W for 1 ≤ j ≤ k and hence we can further strengthen our result by stating that RT = Span {T ⃗uk+1 , T ⃗uk+2 , . . . , T ⃗un }. Now suppose,

Σ_{i=k+1}^{n} ci T ⃗ui = ⃗0W , for scalars ci ∈ F, k + 1 ≤ i ≤ n. (7.14)

Now, equation 7.14 reduces to,

T ( Σ_{i=k+1}^{n} ci ⃗ui ) = ⃗0W [Using linearity of T ].

Therefore,

⃗v = Σ_{i=k+1}^{n} ci ⃗ui ∈ Ker T . (7.15)

But, according to the hypothesis, {⃗u1 , ⃗u2 , . . . , ⃗uk } forms a basis for Ker T . Hence, there are scalars b1 , b2 , . . . , bk such that,

⃗v = Σ_{i=1}^{k} bi ⃗ui . (7.16)

Gathering equations 7.15 and 7.16 together, one sees Σ_{i=1}^{k} bi ⃗ui = Σ_{j=k+1}^{n} cj ⃗uj ,

=⇒ Σ_{i=1}^{k} bi ⃗ui − Σ_{j=k+1}^{n} cj ⃗uj = ⃗0V . (7.17)

Since {⃗u1 , ⃗u2 , . . . , ⃗un } is a basis of V and hence a linearly independent set, one must have,

b1 = b2 = · · · = bk = ck+1 = · · · = cn = 0.

Now, equation 7.14 together with ck+1 = ck+2 = · · · = cn = 0 imply that the set {T ⃗uk+1 , T ⃗uk+2 , . . . , T ⃗un }
is a linearly independent set. Hence, {T ⃗uk+1 , . . . , T ⃗un } forms a basis of RT , the range of the
linear transformation T . One immediately finds that the rank r of T is given by r = n − k.
Remember: We chose {⃗u1 , ⃗u2 , . . . , ⃗uk } as a basis for Ker T which means that the nullity of T
is k. Additionally, since n is the dimension of V, r = n − k translates to,

rank (T ) + nullity (T ) = dimV.
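For a linear map given by a matrix, the two dimensions in this identity can be computed independently and compared. The sketch below (assuming NumPy and SciPy are available; the matrix is an arbitrary illustration with one redundant row) does exactly that.

import numpy as np
from scipy.linalg import null_space

A = np.array([[1, 2, 0, 3],
              [0, 0, 1, 4],
              [1, 2, 1, 7]], dtype=float)   # T(x) = A x maps R^4 into R^3

rank = np.linalg.matrix_rank(A)             # dimension of the range of T
nullity = null_space(A).shape[1]            # dimension of the null space of T
print(rank, nullity, rank + nullity)        # 2 2 4, and 4 = dim of the domain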


Suggested Exercises (Hoffman and Kunze) (Page: 73), Problem: 2, 3, 7, 8, 13.

§7.2 The Algebra of Linear Transformations


Given 2 vector spaces V and W over a given field F, the set of linear transformations is
endowed with a natural vector space structure. When one restricts attention to the set of
linear transformations from a vector space to itself, one finds that the underlying set has more
structure than just a vector space. Such transformations in the set can be composed and one
can define ’multiplication’ of such transformations that satisfy certain nice properties. This
structure is called the structure of a linear algebra.

Theorem 7.3
Let V and W be vector spaces over the field F. Let T and U be linear transformations
from V to W . The function (T + U) defined by,

(T + U) (⃗v ) = T⃗v + U⃗v , ∀ ⃗v ∈ V. (7.18)

is a linear transformation from V to W . If c ∈ F, the function cT defined by,

(cT ) (⃗v ) = c (T⃗v ) ; ∀ ⃗v ∈ V. (7.19)

is a linear transformation from V to W . The set of all linear transformations from V to


W , equipped with the addition given by equation 7.18 and scalar multiplication given by
equation 7.19, is a vector space over the field F.

Proof. Given linear transformations T and U from V to W , one defines T + U according to


equation 7.18. Using this definition, one writes,
(T + U) (c⃗v + w⃗ ) = T (c⃗v + w⃗ ) + U (c⃗v + w⃗ ) , for c ∈ F, ⃗v , w⃗ ∈ V ;
= c (T⃗v ) + T w⃗ + c (U⃗v ) + U w⃗ ; [Using linearity of T and U].
= c (T⃗v + U⃗v ) + T w⃗ + U w⃗ ; [∵ T⃗v , U⃗v ∈ W and c ∈ F].
= c (T + U) (⃗v ) + (T + U) (w⃗ ) ; [Using equation 7.18],
proving that T + U is indeed a linear transformation from V to W . Now, for ⃗v , w⃗ ∈ V and c, d ∈ F,

(cT ) (d⃗v + w⃗ ) = c [T (d⃗v + w⃗ )] ; [Using definition 7.19].
= c [d T⃗v + T w⃗ ] ,
= (cd) (T⃗v ) + c (T w⃗ ) ; [∵ T⃗v , T w⃗ ∈ W , c, d ∈ F, and W is a vector space over F].
= (dc) (T⃗v ) + c (T w⃗ ) ; [c, d ∈ F and multiplication in a field is commutative].
= d [c (T⃗v )] + c (T w⃗ ) ,
= d [(cT ) (⃗v )] + (cT ) (w⃗ ) ; [Using definition 7.19],
proving that cT is also a linear transformation from V to W . I leave it to the reader to verify
that the set of linear transformations with respect to addition and scalar multiplication defined
by equations 7.18 and 7.19, respectively, indeed satisfies all vector space axioms. ■
We shall denote the vector space of linear transformations from V to W by L (V, W ). We
remark here that L (V, W ) is defined only when the vector spaces V and W are defined over
the same field F.
Theorem 7.4
Let V be an n-dimensional vector space over the field F, and let W be an m-dimensional
vector space over the same field F. The vector space L (V, W ) is finite dimensional and has
dimension mn.

Proof. is omitted. ■

Definition 7.3. If V is a vector space over the field F, a linear operator on V is a linear
transformation from V to itself.

Theorem 7.5
Let V, W and Z be vector spaces over the field F. Let T be a linear transformation from
V to W and U be a linear transformation from W to Z. Then the composed function UT
defined by (UT ) (⃗v ) = U (T (⃗v )) , ∀ ⃗v ∈ V , is a linear transformation from V to Z.

Proof. Given c ∈ F and ⃗v , w⃗ ∈ V ,

UT (c⃗v + w⃗ ) = U (T (c⃗v + w⃗ )) ; [Using the definition of the composed function].
= U (c T⃗v + T w⃗ ) ; [∵ T is linear].
= c U (T (⃗v )) + U (T (w⃗ )) ; [∵ U is linear].
= c UT (⃗v ) + UT (w⃗ ) .

Hence, the composed function UT : V → Z defined by UT (⃗v ) = U (T (⃗v )) ; ∀ ⃗v ∈ V , is


indeed a linear transformation from V to Z. ■
Remark. If one chooses V = W = Z in theorem 7.5, so that both U and T are linear operators on V , one immediately finds that the composition UT is again a linear operator on V . Thus, the space L (V, V ) can be endowed with a 'multiplication' defined on it by composition. In this case, one also has the linear operator T U defined and, in general, UT ̸= T U, i.e., UT − T U ̸= 0. Also, note that if T ∈ L (V, V ), multiplication of T with itself is defined, i.e., T can be composed with itself to yield T 2 = T T and, in general, T n = T · · · T (n times) for n = 1, 2, 3, . . . . We define T 0 = I, if T ̸= 0.

Lemma 7.6
Let V be a vector space over the field F. Let U, T1 , and T2 be linear operators on V ; let
c ∈ F. Then,
(a) I U = UI = U.
(b) U (T1 + T2 ) = UT1 + UT2 , or, (T1 + T2 ) U = T1 U + T2 U.
(c) c (UT1 ) = (c U) T1 = U (cT1 ). Here I is the identity transformation on V , i.e.,
I⃗v = ⃗v , ∀ ⃗v ∈ V.

Proof. (a) IU (⃗v ) = U (⃗v ) ; [from the definition of I].


UI (⃗v ) = U (⃗v ) , ∀ ⃗v ∈ V ;
∴ IU = UI = U.
(b) [U (T1 + T2 )] (⃗v ) = U [(T1 + T2 ) (⃗v )] ; [Using the definition of composition given in
theorem 7.5].
= U (T1⃗v + T2⃗v ) ; [Using equation 7.18, vector space structure of L (V, V )].
= U (T1⃗v ) + U (T2⃗v ) ; [Using linearity of U].
= (UT1 ) (⃗v ) + (UT2 ) (⃗v ) ; [Using definition of composition provided in theorem 7.5].
= (UT1 + UT2 ) (⃗v ) ; [Using equation 7.18 back since UT1 , UT2 ∈ L (V, V )].
∴ U (T1 + T2 ) = UT1 + UT2 .
Now, [(T1 + T2 ) U] (⃗v ) = (T1 + T2 ) (U⃗v ) ;
[Using the definition of composition given in theorem 7.5].
= T1 (U⃗v ) + T2 (U ⃗v ) ; [Using equation 7.18, T1 , T2 ∈ L (V, V ) and U⃗v ∈ V ].
= (T1 U) (⃗v ) + (T2 U) (⃗v ) ; [Using definition of composition provided in theorem 7.5].
= (T1 U + T2 U) (⃗v ) ; [Using equation 7.18 back since, T1 U, T2 U ∈ L (V, V )].
∴ (T1 + T2 ) U = T1 U + T2 U
(c) U, T1 , T2 ∈ L (V, V ) , c ∈ F, where L (V, V ) is an F-vector space. Given ⃗v ∈ V
[c (UT1 )] (⃗v ) = c (UT1 ) (⃗v ) ; [UT1 ∈ L (V, V ) and L (V, V ) is an F-vector space, hence
using equation 7.19].
= c U (T1 ⃗v ) ; [Using the definition of composition in theorem 7.5].
= (c U) (T1 ⃗v ) ; [Using equation 7.19].
= [(c U) T1 ] (⃗v ) ; [Using back the definition of composition].
∴ c (UT1 ) = (c U) T1 .
Again [c (UT1 )] (⃗v ) = c U (T1 ⃗v ) ,
= U (c T1 ⃗v ) ; [Using linearity of U].
= U [(c T1 ) ⃗v ] ; [Using equation 7.19 since c T1 ∈ L (V, V ) given c ∈ F and T1 ∈ L (V, V )].
= [U (c T1 )] (⃗v ) ; [Using back the definition of composition from theorem 7.5].
∴ c (UT1 ) = U (cT1 ) ,
Hence, c (UT1 ) = (c U) T1 = U (cT1 ) .


Theorem 7.4 and Lemma 7.6 together tell us that the vector space L (V, V ), together with
the composition operation, is a linear algebra with identity.
Example 7.5. Let F be a field and V be the vector space of all polynomial functions from F
to F. A general element of V is given by,

v (x) = c0 + c1 x + c2 x2 + · · · + cn xn with ci ∈ F for 0 ≤ i ≤ n.

Let D be the differential operator on V given by,

Dv (x) = c1 + 2c2 x + 3c3 x2 + · · · + ncn xn−1 . (7.20)

And T be the multiplication by x operator, i.e.,

T v (x) = xv (x) = c0 x + c1 x2 + c2 x3 + · · · + cn xn+1 . (7.21)

Verify that both these operators are linear, i.e., D, T ∈ L (V, V ). For example, given α ∈ F, verify,

T (αv (x) + w (x)) = α T v (x) + T w (x) ,
for another polynomial function from F to F:

w (x) = d0 + d1 x + d2 x2 + · · · + dm xm .

It is easy to see:

T (αv (x) + w (x)) = x [αv (x) + w (x)] = αxv (x) + xw (x) ,

= αT v (x) + T w (x) .
Also, DT − T D = I.

In fact,

DT v (x) = D (c0 x + c1 x2 + c2 x3 + · · · + cn xn+1) ; [Using equation 7.21],

= c0 + 2c1 x + 3c2 x2 + · · · + (n + 1) cn xn . (7.22)

And,

T Dv (x) = T (c1 + 2c2 x + 3c3 x2 + · · · + ncn xn−1) ; [Using equation 7.20],

= x (c1 + 2c2 x + 3c3 x2 + · · · + ncn xn−1) ,

= c1 x + 2c2 x2 + 3c3 x3 + · · · + ncn xn . (7.23)

Subtracting equation 7.23 from equation 7.22 yields,

DT v (x) − T Dv (x) = c0 + c1 x + c2 x2 + · · · + cn xn = v (x) .

∴ (DT − T D) v (x) = Iv (x), leading to DT − T D = I.
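The identity DT − T D = I can also be checked numerically. The following sketch (an illustration, not from the notes) represents a polynomial by its coefficient list [c0 , c1 , . . . , cn ] and implements D and T as in equations 7.20 and 7.21.

```python
# Represent v(x) = c0 + c1*x + ... + cn*x^n by its coefficient list [c0, c1, ..., cn].
def D(c):
    """Differentiation operator (equation 7.20): coefficients of v'(x)."""
    return [k * c[k] for k in range(1, len(c))]

def T(c):
    """Multiplication-by-x operator (equation 7.21): coefficients of x*v(x)."""
    return [0] + list(c)

c = [5, -1, 3, 2]                 # v(x) = 5 - x + 3x^2 + 2x^3
dt = D(T(c))                      # coefficients of (DT)v
td = T(D(c))                      # coefficients of (TD)v
diff = [a - b for a, b in zip(dt, td)]
print(diff)                       # [5, -1, 3, 2] = c, i.e. (DT - TD)v = v
```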

§7.2.i Let us address the question


For which linear operators T on the vector space V , does there exist a linear operator T −1 such
that T T −1 = T −1 T = I?

The function T from V to W is called invertible if there exists a function U from W to V


such that UT : V → V is the identity function on V , i.e., UT = IV and T U : W → W is the
identity function on W , i.e., T U = IW .

You should be able to verify that if T is invertible, the function U is unique and is denoted by T −1 . Furthermore, T is invertible if and only if,
1. T is 1 − 1, that is, T⃗v = T ⃗w =⇒ ⃗v = ⃗w, ∀ ⃗v , ⃗w ∈ V.
2. T is onto, that is, the range of T is all of W .

Theorem 7.7
Let V and W be F-vector spaces. And let T be a linear transformation from V to W . If
T is invertible, then the inverse function T −1 is a linear transformation from W to V .

Proof. Let ⃗v1 , ⃗v2 ∈ W and c ∈ F be a scalar. We want to show that the following holds:

T −1 (c⃗v1 + ⃗v2 ) = cT −1⃗v1 + T −1⃗v2 .

Let,
⃗u1 = T −1⃗v1 and ⃗u2 = T −1⃗v2 .
⃗u1 = T −1⃗v1 implies that ⃗u1 is the unique vector in V such that T ⃗u1 = ⃗v1 . Similarly, ⃗u2 is the
unique vector in V such that T ⃗u2 = ⃗v2 . (Because T is 1 − 1 and onto).

From the linearity of T , it follows that,

T (c⃗u1 + ⃗u2 ) = cT ⃗u1 + T ⃗u2 ,

= c⃗v1 + ⃗v2 .
Now, since T is 1 − 1, c⃗u1 + ⃗u2 is the unique vector in V which is sent to c⃗v1 + ⃗v2 ∈ W by the
linear transformation T from V to W and hence,

T −1 (c⃗v1 + ⃗v2 ) = c⃗u1 + ⃗u2 = cT −1⃗v1 + T −1⃗v2 .

Hence, T −1 is linear. ■

Definition 7.4. A linear transformation T from V to W is called non-singular if T⃗v = ⃗0W =⇒ ⃗v = ⃗0V , i.e., if the nullspace of T is trivial: Ker T = {⃗0V }.

Fact: Let T : V → W be linear. Then T (⃗u − ⃗v ) = T ⃗u − T⃗v . Therefore,

T ⃗u = T⃗v ⇐⇒ T (⃗u − ⃗v ) = ⃗0W .

Now, a linear map T is 1 − 1 ⇐⇒ T ⃗u = T⃗v implies ⃗u = ⃗v ⇐⇒ T (⃗u − ⃗v ) = ⃗0W implies ⃗u − ⃗v = ⃗0V , i.e., T is non-singular. Hence, a linear transformation T : V → W is 1 − 1 if and only if it is non-singular.

Theorem 7.8
Let T be a linear transformation from V to W . Then T is non-singular if and only if T
carries each linearly independent subset of V to a linearly independent subset of W .

Proof. First, suppose that T is non-singular. And let, S = {⃗u1 , ⃗u2 , . . . , ⃗uk } be a linearly indepen-
dent subset of V . One needs to prove that the set {T ⃗u1 , T ⃗u2 , . . . , T ⃗uk } is linearly independent
in W . Let,

c1 (T ⃗u1 ) + c2 (T ⃗u2 ) + · · · + ck (T ⃗uk ) = ⃗0W ,


for c1 , c2 , . . . , ck ∈ F. We need to prove that, c1 = c2 = · · · = ck = 0.

Indeed, linearity of T yields,

T (c1⃗u1 + c2⃗u2 + · · · + ck ⃗uk ) = ⃗0W . (7.24)

Since, T is non-singular, one must have,

c1⃗u1 + c2⃗u2 + · · · + ck ⃗uk = ⃗0V .

But the linear independence of S = {⃗u1 , ⃗u2 , . . . , ⃗uk } then implies c1 = c2 = · · · = ck = 0. This
argument shows that the image of S = {⃗u1 , ⃗u2 , . . . , ⃗uk } under T is linearly independent.

Suppose now that T carries linearly independent subsets of V to linearly independent sub-
sets of W . Let, ⃗u ∈ V be a non-zero vector so that {⃗u} is a linearly independent subset of
V . Then, according to the hypothesis, {T ⃗u} is linearly independent in W . One then must have T ⃗u ≠ ⃗0W ; otherwise, the set {T ⃗u}, consisting of the zero vector ⃗0W only, would be linearly dependent. In other words, T ⃗u ≠ ⃗0W whenever ⃗u ≠ ⃗0V . Hence, the only vector ⃗u ∈ V satisfying T ⃗u = ⃗0W is ⃗u = ⃗0V , i.e., Ker T = {⃗0V }, i.e., T is non-singular. ■
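As a quick numerical illustration of theorem 7.8 (my own sketch, with arbitrarily chosen matrices), one can check that a non-singular operator on R3 maps an independent pair of vectors to an independent pair, using the rank as a test of linear independence.

```python
import numpy as np

# Matrix of a non-singular operator on R^3 (it has rank 3, i.e. trivial kernel).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
print(np.linalg.matrix_rank(A))        # 3

# A linearly independent subset {u1, u2} of R^3, stored as columns of S.
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
print(np.linalg.matrix_rank(S))        # 2: u1, u2 are independent
print(np.linalg.matrix_rank(A @ S))    # 2: T(u1), T(u2) are still independent
```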

Example 7.6. Let F be a field and T be the linear operator on F2 defined by,

T (x1 , x2 ) = (x1 + x2 , x1 ) ,

where, on F2 , vector addition and scalar multiplication are given by (x1 , x2 ) + (y1 , y2 ) = (x1 + y1 , x2 + y2 ) and c (x1 , x2 ) = (cx1 , cx2 ). Then,

T [c (x1 , x2 ) + (y1 , y2 )] = T (cx1 + y1 , cx2 + y2 ) = (cx1 + y1 + cx2 + y2 , cx1 + y1 ) .

While,
c T (x1 , x2 ) + T (y1 , y2 ) = c (x1 + x2 , x1 ) + (y1 + y2 , y1 ) ,
= (cx1 + cx2 , cx1 ) + (y1 + y2 , y1 ) ,
= (cx1 + cx2 + y1 + y2 , cx1 + y1 ) .
∴ T [c (x1 , x2 ) + (y1 , y2 )] = c T (x1 , x2 ) + T (y1 , y2 ) .
Hence, T is a linear operator on F2 . Now, we verify that T is non-singular. Solve the system
given by T (x1 , x2 ) = (0, 0) i.e.,
x1 + x2 = 0,
x1 = 0.
=⇒ x1 = 0 and x2 = 0
so that Ker T = {(0, 0)}. Hence, T is non-singular. One also finds that T is onto. Let (z1 , z2 )
be any vector in F2 . To show that (z1 , z2 ) is in the range of T one must find x1 , x2 ∈ F s.t.

T (x1 , x2 ) = (z1 , z2 )

i.e., (x1 + x2 , x1 ) = (z1 , z2 ) .


And, hence, one must satisfy
x1 + x2 = z1 ,
x1 = z2 .

=⇒ x1 = z2 , x2 = z1 − z2 .
∴ T is onto. Hence, T −1 exists. Explicitly,

T −1 (z1 , z2 ) = (z2 , z1 − z2 ) .

In this example, the linear operator T is non-singular and is also onto. In general, a non-singular linear operator defined on a vector space need not be onto (the finite dimensional case is special; see theorem 7.9 below).
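A small sketch (illustration only, taking F = R) confirming that the map T −1 (z1 , z2 ) = (z2 , z1 − z2 ) found above is a two-sided inverse of T (x1 , x2 ) = (x1 + x2 , x1 ):

```python
def T(x1, x2):
    # T(x1, x2) = (x1 + x2, x1)
    return (x1 + x2, x1)

def T_inv(z1, z2):
    # T^{-1}(z1, z2) = (z2, z1 - z2)
    return (z2, z1 - z2)

# Check T_inv(T(v)) = v and T(T_inv(z)) = z on a few sample vectors.
for v in [(1, 0), (2, -3), (0.5, 4.0)]:
    assert T_inv(*T(*v)) == v
for z in [(1, 1), (-2, 5), (3.5, 0.0)]:
    assert T(*T_inv(*z)) == z
print("T_inv inverts T on all samples tested")
```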

Theorem 7.9
Let V and W be finite dimensional vector spaces over the field F such that dimV = dimW .
If T : V → W is a linear transformation, the following are equivalent (TFAE):
(i) T is invertible.
(ii) T is non-singular.
(iii) T is onto, that is the range of T is all of W .
Proof. Let n = dimV = dimW . From theorem 7.2, we know that,

rank (T ) + nullity (T ) = n. (7.25)

Now, T is non-singular
=⇒ Ker T = {⃗0V } =⇒ nullity (T ) = 0 =⇒ rank (T ) = n = dim W
=⇒ range of T = W =⇒ T is onto.
It also holds in the other direction: when T is onto, one has,

range of T = W =⇒ rank (T ) = n = dim W.

Equation 7.25 then gives nullity (T ) = 0, i.e., Ker T = {⃗0V }, i.e., T is non-singular. One therefore finds that (ii) ⇐⇒ (iii), i.e., (ii) and (iii) are always simultaneously satisfied, and (ii) and (iii) together imply that T is invertible (since T is then 1 − 1 and onto). Conversely, if T is invertible, it is 1 − 1 and hence non-singular, so (i) =⇒ (ii). ■
There are 2 more conditions that are equivalent to the 3 conditions stated in Theorem 7.9. These 2 conditions are as follows:
(iv) If {⃗u1 , ⃗u2 , . . . , ⃗un } is a basis for V , then, {T ⃗u1 , T ⃗u2 , . . . , T ⃗un } is a basis for W .
(v) There is some basis {⃗u1 , ⃗u2 , . . . , ⃗un } for V such that {T ⃗u1 , T ⃗u2 , . . . , T ⃗un } is a basis
for W .

(iii) =⇒ (iv): We assume that T is onto. Let {⃗u1 , ⃗u2 , . . . , ⃗un } be a basis for V . Now, given ⃗w ∈ W , ∃ ⃗v ∈ V s.t. T⃗v = ⃗w, since T is onto.

Therefore, ⃗w = T (c1⃗u1 + c2⃗u2 + · · · + cn⃗un ) ; [∵ {⃗u1 , ⃗u2 , . . . , ⃗un } is a basis for V , so there exist scalars ci ∈ F s.t. ⃗v = c1⃗u1 + · · · + cn⃗un for the given ⃗v ∈ V ],
= c1 T ⃗u1 + c2 T ⃗u2 + · · · + cn T ⃗un ; [∵ T is linear].

Any vector ⃗w ∈ W can therefore be written as a linear combination of the vectors in {T ⃗u1 , T ⃗u2 , . . . , T ⃗un } .
Hence, W = Span {T ⃗u1 , T ⃗u2 , . . . , T ⃗un }. Since dimW = n and there are n vectors in the set
{T ⃗u1 , . . . , T ⃗un }, these vectors must be linearly independent. Therefore, {T ⃗u1 , . . . , T ⃗un } is a
basis for W .
(iv) =⇒ (v): Since V is a finite dimensional vector space, it has a basis consisting of dim V = n vectors; denote such a basis (which always exists) by {⃗u1 , ⃗u2 , . . . , ⃗un } . Now, by (iv), {T ⃗u1 , . . . , T ⃗un } is a basis for W , which is exactly what (v) asserts.

(v) =⇒ (i): Suppose there is some basis {⃗u1 , ⃗u2 , . . . , ⃗un } of V such that {T ⃗u1 , T ⃗u2 , . . . , T ⃗un } is a basis for W . Hence, the set {T ⃗u1 , . . . , T ⃗un } spans W , i.e., given ⃗w ∈ W, ∃ scalars ci ∈ F, 1 ≤ i ≤ n, such that,

⃗w = c1 T ⃗u1 + c2 T ⃗u2 + · · · + cn T ⃗un ,
= T (c1⃗u1 + · · · + cn⃗un ) ; [linearity of T ],
= T⃗v ,

for some vector ⃗v = c1⃗u1 + c2⃗u2 + · · · + cn⃗un ∈ V . Hence, T is onto. If ⃗v = c1⃗u1 + c2⃗u2 + · · · + cn⃗un is in the null-space of T , then,

T (c1⃗u1 + c2⃗u2 + · · · + cn⃗un ) = ⃗0W ,

i.e., c1 T ⃗u1 + c2 T ⃗u2 + · · · + cn T ⃗un = ⃗0W . (7.26)

Since {T ⃗u1 , T ⃗u2 , . . . , T ⃗un } is a linearly independent set in W , from equation 7.26 it follows that c1 = c2 = · · · = cn = 0. Hence, ⃗v = c1⃗u1 + · · · + cn⃗un = ⃗0V . Therefore, Ker T = {⃗0V } and T is non-singular. Hence, T is invertible.

The set of all invertible linear operators on a vector space V , with the operation of composition
is an example of the algebraic structure called a group.

Definition 7.5. A group consists of the following data:


1. A set G;
2. An operation, which associates with each pair of elements x, y ∈ G, an element xy ∈ G in such a way that
a) x (yz) = (xy) z, ∀ x, y, z ∈ G;
b) There is a distinguished element e ∈ G such that e.x = x.e = x, ∀ x ∈ G;
c) To each element x ∈ G, there corresponds an element x−1 ∈ G such that
xx−1 = x−1 x = e.
Check that the set of all invertible linear operators on a vector space V , with the operation
of composition is indeed a group.

Suggested Exercises (Hoffman and Kunze) (Page: 83), Problem: 1, 2, 3, 5, 10, 11, 12.
8 Lecture 8
If V and W are vector spaces over the field F, any 1 − 1 and onto linear transformation from
V to W is called an isomorphism of V to W . If there exists an isomorphism between V and
W , we say that V is isomorphic to W .

Note that V is trivially isomorphic to itself, the identity operator I being an isomorphism of V onto itself.

§8.0.i Question
Verify that isomorphism is an equivalence relation on the class of vector spaces, i.e., the relation
obeys reflexivity, symmetry and transitivity.

Theorem 8.1
Every n-dimensional vector space over the field F is isomorphic to the space Fn .

Proof. Let V be an n-dimensional vector space over the field F and let B = {⃗u1 , ⃗u2 , . . . , ⃗un } be
an ordered basis for V . We define a function T : V → Fn as follows:

If ⃗u ∈ V , let T ⃗u be the n-tuple (x1 , x2 , . . . , xn ) of coordinates of ⃗u relative to the ordered


basis B, i.e., the n-tuple such that,

⃗u = x1⃗u1 + x2⃗u2 + · · · + xn⃗un . (8.1)

Verify that T : V → Fn defined by T (⃗u) = (x1 , x2 , . . . , xn ) is linear, 1 − 1 and onto. ■
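The coordinate map of theorem 8.1 is easy to realize numerically. In the sketch below (an illustration with an arbitrarily chosen basis of R3 , not part of the notes), the basis vectors are the columns of M, and the coordinates of ⃗u are obtained by solving M x = ⃗u.

```python
import numpy as np

# Ordered basis B of V = R^3, stored as the columns of M (chosen arbitrarily).
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

def coords(u):
    """The isomorphism T: V -> F^n sending u to its coordinates in the basis B."""
    return np.linalg.solve(M, u)

u = np.array([2.0, 3.0, 1.0])
x = coords(u)
print(x)                       # [0. 2. 1.]
print(np.allclose(M @ x, u))   # True: u = x1*u1 + x2*u2 + x3*u3
```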


Suggested Exercises (Hoffman and Kunze) (Page: 85-86), Problem: 1, 2 (a) , (b) , (c) , 3, 4, 6.

§8.1 Representation of transformations by Matrices


Let V be an n-dimensional vector space over the field F, and let W be an m-dimensional
vector space over the same field F. Let B = {⃗u1 , ⃗u2 , . . . , ⃗un } be an ordered basis for V and

B ′ = {⃗v1 , ⃗v2 , . . . , ⃗vm } be an ordered basis for W . If T : V → W is a linear transformation, then
T is determined by its action on ⃗uj ’s, 1 ≤ j ≤ n.

Each of the n vectors T ⃗uj is uniquely expressible as a linear combination of the basis vectors in B ′ = {⃗v1 , ⃗v2 , . . . , ⃗vm } of W :

T ⃗uj = Σ_{i=1}^{m} Aij ⃗vi . (8.2)

The scalars A1j , A2j , . . . , Amj being the unique coordinates of T ⃗uj ∈ W relative to the basis B ′ = {⃗v1 , ⃗v2 , . . . , ⃗vm }. Accordingly, the linear transformation T is fully determined by the mn scalars Aij , 1 ≤ i ≤ m, 1 ≤ j ≤ n, via formula 8.2. The m × n matrix A defined by A (i, j) = Aij (where i represents rows and j represents columns) is called the matrix of T relative to the pair of ordered bases B and B ′ . Let us now understand explicitly how the matrix A determines the linear transformation T :

If ⃗u = x1⃗u1 + x2⃗u2 + · · · + xn⃗un is a vector in V , then,

T ⃗u = T (Σ_{j=1}^{n} xj ⃗uj) = Σ_{j=1}^{n} xj (T ⃗uj ) ; [by linearity of T ],

= Σ_{j=1}^{n} xj Σ_{i=1}^{m} Aij ⃗vi ,

= Σ_{i=1}^{m} (Σ_{j=1}^{n} Aij xj) ⃗vi . (8.3)

Let X be the coordinate matrix of ⃗u in the ordered basis B: since ⃗u = x1⃗u1 + x2⃗u2 + · · · + xn⃗un , one obtains the (n × 1) column matrix,

X = (x1 , x2 , . . . , xn )^T , and X = [⃗u]B .

From equation 8.3, it then follows that,

T ⃗u = Σ_{i=1}^{m} (AX)i ⃗vi , (8.4)

where (AX)i is the ith row of the (m × 1) matrix AX and AX = [T ⃗u]B′ . Equation 8.4 tells us that AX is the coordinate matrix of T ⃗u in the ordered basis B ′ . In the other direction, if A is any m × n matrix over the field F, then T defined by,

T (Σ_{j=1}^{n} xj ⃗uj) = Σ_{i=1}^{m} (Σ_{j=1}^{n} Aij xj) ⃗vi , (8.5)

is a linear transformation from V to W , the matrix representation of which is A, relative to B, B ′ . We now summarize these results formally as follows:

Theorem 8.2
Let V be an n-dimensional vector space over the field F, and W an m-dimensional vector space over F. Let B and B ′ be ordered bases of V and W , respectively. For each linear transformation T from V to W , there is an m × n matrix A with entries in F such that,

[T ⃗u]B′ = A [⃗u]B , (8.6)

for every ⃗u ∈ V . Furthermore, T ↔ A is a 1 − 1 correspondence between the set of all linear transformations from V to W and the set of all m × n matrices over the field F. Let us refer back to equation 8.2:

T ⃗uj = A1j ⃗v1 + A2j ⃗v2 + · · · + Amj ⃗vm ,

=⇒ [T ⃗uj ]B′ = (A1j , A2j , . . . , Amj )^T = Aj , the jth column of the (m × n) matrix A.

In what follows, we will be particularly interested in the representation by matrices of linear transformations of a vector space to itself, i.e., linear operators on the vector space V . The most convenient option, in this case, is to use the same ordered basis, that is, to take B = B ′ when V = W . In this setting, since V is n-dimensional and B = B ′ = {⃗u1 , ⃗u2 , . . . , ⃗un }, the matrix corresponding to the linear operator T : V → V is the n × n matrix A whose entries Aij are determined by the equations:

T ⃗uj = Σ_{i=1}^{n} Aij ⃗ui , j = 1, 2, . . . , n. (8.7)

Compare equation 8.7 with equation 8.2. That A, the representing matrix of the linear operator T on the finite dimensional vector space V , depends on the ordered basis B is captured by the notation [T ]B = A, so that equation 8.6 now becomes:

[T ⃗u]B = [T ]B [⃗u]B . (8.8)

Example 8.1. Let F be a field and T be the operator on F2 defined by T (x1 , x2 ) = (x1 , 0) .

On F2 , vector addition and scalar multiplication are given by,
(x1 , x2 ) + (y1 , y2 ) = (x1 + y1 , x2 + y2 ) ,
c (x1 , x2 ) = (cx1 , cx2 ) ,
for (x1 , x2 ) , (y1 , y2 ) ∈ F2 and c ∈ F.
Check that,
T [c (x1 , x2 ) + (y1 , y2 )] = T (cx1 + y1 , cx2 + y2 ) ,
= (cx1 + y1 , 0) ,
= (cx1 , 0) + (y1 , 0) ,
= cT (x1 , x2 ) + T (y1 , y2 ) .
Hence, T : F2 → F2 is linear. Let B = {⃗ϵ1 ,⃗ϵ2 } be the standard ordered basis for F2 , with
⃗ϵ1 = (1, 0) and ⃗ϵ2 = (0, 1). Now,

T ⃗ϵ1 = T (1, 0) = (1, 0) = 1.⃗ϵ1 + 0.⃗ϵ2 ,

T ⃗ϵ2 = T (0, 1) = (0, 0) = 0.⃗ϵ1 + 0.⃗ϵ2 ,


where 1.⃗ϵ1 and 0.⃗ϵ2 denote scalar multiplication in F2 (by the scalars 1, 0 ∈ F). So, the matrix representing T in the ordered basis B is,

[T ]B = [ 1 0 ]
        [ 0 0 ] .
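A short numerical sketch of this example (illustration only, with F = R): the columns of [T ]B are the coordinate vectors of T⃗ϵ1 and T⃗ϵ2 , which, in the standard basis, are just the image vectors themselves.

```python
import numpy as np

def T(x):
    x1, x2 = x
    return np.array([x1, 0.0])       # T(x1, x2) = (x1, 0)

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# Column j of [T]_B is the coordinate vector of T(e_j) in the standard basis B.
T_B = np.column_stack([T(e1), T(e2)])
print(T_B)                            # [[1. 0.] [0. 0.]]

u = np.array([3.0, -2.0])
print(np.allclose(T_B @ u, T(u)))     # True: [T u]_B = [T]_B [u]_B
```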
Example 8.2. Let P3 be the vector space of all polynomial functions from R to R of degree 3 or less, i.e., of the form,

f (x) = c0 + c1 x + c2 x2 + c3 x3 , ci ∈ R, 0 ≤ i ≤ 3.

Consider the differentiation operator D : P3 → P3 dealt with in example 7.2 of lecture 7, in terms of its action on the elements of the ordered basis B = {f1 , f2 , f3 , f4 } with fj (x) = xj−1 , 1 ≤ j ≤ 4 :

(Df1 ) (x) = 0 =⇒ Df1 = 0.f1 + 0.f2 + 0.f3 + 0.f4 ,


(Df2 ) (x) = 1 =⇒ Df2 = 1.f1 + 0.f2 + 0.f3 + 0.f4 ,
where 1.f1 came from (Df2 ) (x) = (1.f1 ) (x) = 1.f1 (x) = 1.
1.f1 is scalar multiplication in P3 .

(Df3 ) (x) = 2x =⇒ Df3 = 0.f1 + 2.f2 + 0.f3 + 0.f4 ,


where 2.f2 came from (Df3 ) (x) = (2.f2 ) (x) = 2f2 (x) = 2x.
2.f2 is scalar multiplication in P3 . Remember that the field is R.
And,
(Df4 ) (x) = 3x2 =⇒ Df4 = 0.f1 + 0.f2 + 3.f3 + 0.f4 ,
where 3.f3 came from (Df4 ) (x) = (3.f3 ) (x) = 3.f3 (x) = 3x2 .
So, the matrix representation of D in the ordered basis B is given by,

[D]B = [ 0 1 0 0 ]
       [ 0 0 2 0 ]
       [ 0 0 0 3 ]
       [ 0 0 0 0 ] .
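The same matrix can be generated programmatically. In the sketch below (an illustration, not part of the notes), a polynomial in P3 is stored as its coefficient vector in the basis B = {1, x, x2 , x3 }, and column j of [D]B is the coordinate vector of Dfj .

```python
import numpy as np

n = 4   # dim P3; ordered basis f_j(x) = x^(j-1), j = 1, ..., 4

def D_coords(c):
    """Coefficient vector of the derivative of the polynomial with coefficients c."""
    out = np.zeros(n)
    for k in range(1, n):
        out[k - 1] = k * c[k]
    return out

# Column j of [D]_B is the coordinate vector of D f_j in the basis B.
D_B = np.column_stack([D_coords(np.eye(n)[:, j]) for j in range(n)])
print(D_B)
# [[0. 1. 0. 0.]
#  [0. 0. 2. 0.]
#  [0. 0. 0. 3.]
#  [0. 0. 0. 0.]]
```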

Question: What happens to the representing matrices if 2 composable linear transformations are composed to yield another linear transformation?

Suppose T : V → W is a linear transformation from an n-dimensional vector space V to an m-dimensional vector space W , and U : W → Z is a linear transformation from the m-dimensional vector space W to a p-dimensional vector space Z. All 3 vector spaces V, W, and Z are considered to be defined over the ground field F. Suppose we have ordered bases,

B = {⃗v1 , ⃗v2 , . . . , ⃗vn } , B ′ = {⃗w1 , ⃗w2 , . . . , ⃗wm } and B ′′ = {⃗z1 , ⃗z2 , . . . , ⃗zp } ,

for the vector spaces V, W, and Z, respectively.

Let A be the representing matrix of the linear transformation T : V → W relative to the pair of bases B and B ′ .

Therefore, equation 8.6, in this setting, yields, for any vector ⃗v ∈ V ,

[T ⃗v ]B′ = A [⃗v ]B , (8.9)

Since T⃗v ∈ W and B is the matrix that represents the linear transformation U : W → Z,
application of equation 8.6 yields,

[U (T⃗v )]B′′ = B [T⃗v ]B′ (8.10)

Equation 8.9, together with equation 8.10, implies,

[U (T⃗v )]B′′ = BA [⃗v ]B ,

i.e., [(UT ) (⃗v )]B′′ = BA [⃗v ]B . (8.11)


Now, if C is the matrix representing the composed linear transformation UT : V → Z, then for ⃗v ∈ V ,

[(UT ) (⃗v )]B′′ = C [⃗v ]B . (8.12)


Then by the uniqueness of the representing matrix, it follows that,

C = BA.

We summarize this result by means of the following theorem:

Theorem 8.3
Let V, W, and Z be finite dimensional vector spaces over the field F; let T : V → W be a linear transformation and U : W → Z be another linear transformation. If B, B ′ , and B ′′ are ordered bases for the vector spaces V, W, and Z, respectively, if A is the matrix representing T relative to the pair B, B ′ , and B is the matrix of U relative to the pair B ′ , B ′′ , then the matrix representing the composed linear transformation UT relative to the pair B, B ′′ is the product matrix C = BA.
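Numerically, theorem 8.3 says that composing the transformations corresponds to multiplying their matrices in the order BA. The following sketch (arbitrary dimensions and random entries, purely illustrative) checks this at the level of coordinate vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# A represents T: V -> W and B represents U: W -> Z in some fixed ordered bases,
# with dim V = 3, dim W = 2, dim Z = 4 (entries chosen at random for illustration).
A = rng.integers(-3, 4, size=(2, 3)).astype(float)
B = rng.integers(-3, 4, size=(4, 2)).astype(float)

v = np.array([1.0, -2.0, 3.0])     # [v]_B for some v in V

lhs = B @ (A @ v)                  # apply T, then U, at the coordinate level
rhs = (B @ A) @ v                  # apply the single matrix C = BA
print(np.allclose(lhs, rhs))       # True: the matrix of UT is BA
```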

Now we should inquire what happens to representing matrices when the ordered basis is changed. In particular, we will consider this question for linear operators on a vector space V over the field F. The specific question is as follows: let T be a linear operator on the finite dimensional vector space V over a field F, and also let,

B = {⃗u1 , ⃗u2 , . . . , ⃗un } , B ′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } ,

be 2 ordered bases of V . How is the matrix representing T in the ordered basis B related to the matrix representing T in the ordered basis B ′ ?

Before answering this question, it is important to note that if T and U are linear operators on a
vector space V and if [T ]B and [U]B denote the matrix representations of T and U, respectively,
relative to the single ordered basis B, then by theorem 8.3, one obtains,

[UT ]B = [U]B [T ]B ,

where [UT ]B is the matrix representation of the linear operator UT on V .

An easy consequence of this is that the linear operator T is invertible if and only if its matrix
representation [T ]B is an invertible matrix. Note that the identity operator I : V → V is
represented by the identity matrix in any basis. We denote the identity operator and its matrix
representation by the same symbol I. And thus,

UT = T U = I ⇐⇒ [U]B [T ]B = [T ]B [U]B = I.

Since, in this case, one has U = T −1 ,

[T −1 ]B = ([T ]B )−1 . (8.13)

Now, let us get back to the question we asked. We’ve seen in theorem 6.7 of lecture 6 that
there is a unique (n × n) invertible matrix P s.t.

[⃗u]B = P [⃗u]B′ , for every ⃗u ∈ V. (8.14)

Write P = [P1 , P2 , . . . , Pn ], so that Pj = [⃗u ′j ]B are the columns of the invertible matrix P .

By definition (equation 8.8),

[T ⃗u]B = [T ]B [⃗u]B .

Now applying equation 8.14 to the vector T ⃗u, one obtains,

[T ⃗u]B = P [T ⃗u]B′ (8.15)

Combining equations 8.14, 8.8 and 8.15, one has,

[T ⃗u]B = P [T ⃗u]B′ ,

=⇒ [T ]B [⃗u]B = P [T ⃗u]B′ ,
=⇒ [T ]B P [⃗u]B′ = P [T ⃗u]B′ ,
=⇒ P −1 [T ]B P [⃗u]B′ = [T ⃗u]B′ = [T ]B′ [⃗u]B′ .
Therefore, it must be the case that,

[T ]B′ = P −1 [T ]B P. (8.16)

This is how the matrix representations of the same linear operator T on V , relative to the 2 ordered bases B and B ′ , are related.

Now, let V be an n-dimensional vector space and let B = {⃗u1 , ⃗u2 , . . . , ⃗un } , B ′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } , be 2 ordered bases of V . Using theorem 7.1 of lecture 7 (choose the 2 vector spaces to be the same and the latter set of vectors to be the basis B ′ ), there is a unique linear operator U on V s.t. U⃗uj = ⃗u ′j , j = 1, 2, . . . , n. U is invertible by theorem 7.9 of lecture 7. Refer to equation 6.13 of lecture 6:
⃗u ′j = Σ_{i=1}^{n} Pij ⃗ui , 1 ≤ j ≤ n. (8.17)

Therefore, we see that the invertible (n × n) matrix P is precisely the matrix representation of the invertible linear operator U. Equation 8.17 can, therefore, be written as:

U⃗uj = Σ_{i=1}^{n} Pij ⃗ui , 1 ≤ j ≤ n. (8.18)

Then using equation 8.7, one immediately finds that [U]B = P . Let us summarize these results
formally as follows:

Theorem 8.4

Let V be a finite dimensional vector space over the field F, and let B = {⃗u1 , ⃗u2 , . . . , ⃗un } , B ′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } , be 2 ordered bases for V . Suppose T is a linear operator on V . If P = [P1 , P2 , . . . , Pn ] is the (n × n) matrix with columns Pj = [⃗u ′j ]B , then [T ]B′ = P −1 [T ]B P .

Alternatively, if U is the invertible linear operator on V defined by U⃗uj = ⃗u ′j , 1 ≤ j ≤ n, then [T ]B′ = ([U]B )−1 [T ]B [U]B .

§8.2 Linear Algebra 8th lecture continued


Example 8.3. Let T be the linear operator defined by T (x1 , x2 ) = (x1 , 0), as introduced in example 8.1, now regarded as an operator on R2 .

We’ve seen in that example that the matrix representation of T in the ordered basis B = {⃗ϵ1 , ⃗ϵ2 }, with ⃗ϵ1 = (1, 0) and ⃗ϵ2 = (0, 1), is

[T ]B = [ 1 0 ]
        [ 0 0 ] .

Suppose B ′ is the ordered basis for R2 consisting of the vectors ⃗ϵ ′1 = (1, 1) , ⃗ϵ ′2 = (2, 1) . Then it is immediate that,

⃗ϵ ′1 = ⃗ϵ1 + ⃗ϵ2 ,
⃗ϵ ′2 = 2⃗ϵ1 + ⃗ϵ2 . (8.19)

Therefore, the first column P1 of the invertible matrix P = [P1 , P2 ] is P1 = [⃗ϵ ′1 ]B = (1, 1)^T , while the second column is P2 = [⃗ϵ ′2 ]B = (2, 1)^T , so that the invertible matrix P = [P1 , P2 ] now reads,

P = [ 1 2 ]
    [ 1 1 ] .

Let us now compute P −1 using the standard algorithm presented in lecture 4. Start with the augmented matrix [P | I] and row-reduce:

[ 1 2 | 1 0 ]
[ 1 1 | 0 1 ]

−R1 + R2 = R2 ′ :
[ 1 2 | 1 0 ]
[ 0 −1 | −1 1 ]

−R2 = R2 ′ :
[ 1 2 | 1 0 ]
[ 0 1 | 1 −1 ]

−2R2 + R1 = R1 ′ :
[ 1 0 | −1 2 ]
[ 0 1 | 1 −1 ]

∴ P −1 = [ −1 2 ]
         [ 1 −1 ] .
Then, according to theorem 8.4,

[T ]B′ = P −1 [T ]B P

= [ −1 2 ] [ 1 0 ] [ 1 2 ]
  [ 1 −1 ] [ 0 0 ] [ 1 1 ]

= [ −1 2 ] [ 1 2 ]
  [ 1 −1 ] [ 0 0 ]

= [ −1 −2 ]
  [ 1 2 ] .
We immediately verify (using definition T (x1 , x2 ) = (x1 , 0))

T ⃗ϵ ′1 = T (1, 1) = (1, 0) = a⃗ϵ ′1 + b⃗ϵ ′2 = a (1, 1) + b (2, 1)

and,
T ⃗ϵ ′2 = T (2, 1) = (2, 0) = c⃗ϵ ′1 + d⃗ϵ ′2 = c (1, 1) + d (2, 1)
Solve for a, b, c, and d:
a + 2b = 1
a+b=0
c + 2d = 2
c+d=0
=⇒ b = 1, a = −1, d = 2, c = −2.

Hence, T⃗ϵ ′1 = −⃗ϵ ′1 + ⃗ϵ ′2 and T⃗ϵ ′2 = −2⃗ϵ ′1 + 2⃗ϵ ′2 .

=⇒ [T ]B′ = [ −1 −2 ]
            [ 1 2 ] ,

which matches with what we found earlier.
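The whole computation of this example can be reproduced in a few lines (a numerical sketch, not part of the original notes):

```python
import numpy as np

T_B = np.array([[1.0, 0.0],
                [0.0, 0.0]])            # [T]_B in the standard basis
P = np.array([[1.0, 2.0],
              [1.0, 1.0]])              # columns are [e1']_B and [e2']_B

T_Bp = np.linalg.inv(P) @ T_B @ P       # [T]_{B'} = P^{-1} [T]_B P
print(T_Bp)
# [[-1. -2.]
#  [ 1.  2.]]
```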

Suggested Exercises (Hoffman and Kunze) (Page: 95), Problem: 1, 2, 5, 6, 7.
