MAT 212 Lecture Notes 2
Linear Algebra
Department of Mathematics and Natural Sciences (MNS)
Lecture Notes
Contents
0 Textbook References
  0.1 Book 1
  0.2 Book 2
1 Lecture 1
  1.1 The Basic Operations
  1.2 Addition of Matrices
  1.3 Scalar Multiplication of a matrix by a number
  1.4 Matrix Multiplication
  1.5 Various identities satisfied by matrix operations
  1.6 Block Multiplication
2 Lecture 2
  2.1 Fields
  2.2 Systems of Linear Equations
  2.3 Elementary Row Operations (ERO)
3 Lecture 3
  3.1 Row-reduced Echelon matrix
4 Lecture 4
  4.1 Interpreting matrix multiplication AB as linear combination of the rows of B
  4.2 Invertible matrices
  4.3 Algorithm for computation of A^{-1}
5 Lecture 5
  5.1 Vector Spaces
  5.2 Solution to Problem 3
  5.3 Subspaces
  5.4 Bases and Dimension
6 Lecture 6
  6.1 Bases and Dimension (Continued)
  6.2 Coordinates
  6.3 Revisiting row-equivalence
    6.3.i To summarize
    6.3.ii Computations concerning subspaces
  6.4 Why does the above method work?
    6.4.i Solution (a)
    6.4.ii Solution (b)
    6.4.iii Solution (c)
    6.4.iv Solution (d)
7 Lecture 7
  7.1 Linear Transformation
  7.2 The Algebra of Linear Transformations
    7.2.i Let us address the question
8 Lecture 8
  8.0.i Question
  8.1 Representation of transformations by Matrices
  8.2 Linear Algebra 8th lecture continued
0 Textbook References
§0.1 Book 1
Algebra (2nd Edition) by Michael Artin.
§0.2 Book 2
Linear Algebra (2nd Edition) by Kenneth M. Hoffman and Ray Kunze.
1 Lecture 1
The numbers in an (m × n) matrix are called matrix entries. They are denoted by aij with
1 ≤ i ≤ m and 1 ≤ j ≤ n:
i → row index,
j → column index,
so that aij is the entry which appears at the crossing of the ith row and the jth column of the matrix.
A is of size (m × n) → 1 ≤ i ≤ m ; 1 ≤ j ≤ n,
B is of size (n × l) → 1 ≤ i ≤ n ; 1 ≤ j ≤ l.
Then AB = (Cij), where
$$C_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}, \qquad 1 \le i \le m,\; 1 \le j \le l,$$
so that AB is of size (m × l).
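The product formula above is easy to mechanize. The following is a minimal Python sketch (mine, not part of the notes) that implements Cij = Σk aik bkj directly with plain lists of lists; the matrices of Example 1.1 below serve as a check.

def mat_mul(A, B):
    # A is m x n, B is n x l; the result C is m x l with C[i][j] = sum_k A[i][k] * B[k][j]
    m, n, l = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "A must be m x n and B must be n x l"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(l)]
            for i in range(m)]

A = [[0, -1, 2], [3, 4, -6]]          # the (2 x 3) matrix of Example 1.1
B = [[1, 2], [-1, 3], [4, 2]]         # the (3 x 2) matrix of Example 1.1
print(mat_mul(A, B))                  # [[9, 1], [-25, 6]]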
Example 1.1.
$$A = (a_{ij}) = \begin{pmatrix} 0 & -1 & 2 \\ 3 & 4 & -6 \end{pmatrix}, \quad \text{Size} = (2 \times 3), \qquad B = (b_{ij}) = \begin{pmatrix} 1 & 2 \\ -1 & 3 \\ 4 & 2 \end{pmatrix}, \quad \text{Size} = (3 \times 2).$$
$$AB = \begin{pmatrix} 0 & -1 & 2 \\ 3 & 4 & -6 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -1 & 3 \\ 4 & 2 \end{pmatrix}.$$
AB = (Cij) will be a (2 × 2) matrix,
$$AB = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$$
with
C11 = a11 b11 + a12 b21 + a13 b31 = 0 + 1 + 8 = 9,
C12 = a11 b12 + a12 b22 + a13 b32 = 0 − 3 + 4 = 1,
C21 = a21 b11 + a22 b21 + a23 b31 = 3 − 4 − 24 = −25,
C22 = a21 b12 + a22 b22 + a23 b32 = 6 + 12 − 12 = 6.
$$\text{Therefore, } AB = \begin{pmatrix} 9 & 1 \\ -25 & 6 \end{pmatrix}.$$
The system of linear equations:
−x2 + 2x3 = 2,
Distributive Laws:
$$A\,(B + B') = AB + AB'; \quad \text{and} \quad (A + A')\,B = AB + A'B. \tag{1.1}$$
Associative Law:
$$A\,(BC) = (AB)\,C, \tag{1.2}$$
as long as the matrices involved are of suitable sizes, so that the products are defined. For example, in Equation 1.2, one requires A to be of size (m × n), B to be of size (n × l) and C to be of size (l × p), so that the two matrices on both sides of Equation 1.2 are of size (m × p).
Example 1.3.
$$ABC = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 0 & 1 \end{pmatrix},$$
$$(AB)\,C = \begin{pmatrix} 1 & 0 & 1 \\ 2 & 0 & 2 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix},$$
$$\text{while, } A\,(BC) = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}.$$
Scalar multiplication is compatible with matrix multiplication in the obvious way: c (AB) = (cA) B = A (cB) for any scalar c. Matrix multiplication, however, is in general not commutative:
AB ≠ BA,
whether size A = m × n and size B = n × m, or even when they are of the same size.
• 0m×n denotes a zero matrix (all the entries are 0) of size (m × n).
• The square (n × n) matrix whose only non zero entries are 1 in each diagonal position is
called the (n × n) identity matrix and is denoted by In .
• If A is an (m × n) matrix, then Im A = A and AIn = A with:
$$I_m = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \quad (m \text{ rows and } m \text{ columns}).$$
A square (n × n) matrix A is called invertible if there is an (n × n) matrix B such that
AB = In , BA = In . (1.4)
Such a matrix B, if it exists, is called an inverse of A.
Example 1.4. The matrix $A = \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}$ is invertible with its inverse $A^{-1} = \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}$.
Indeed,
$$\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix} \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}.$$
Let us quickly check the fact that an inverse is unique if it exists at all.
Let B and B′ be two matrices satisfying Equation 1.4 for the same matrix A.
One has, AB = In and B′A = In . (1.6)
Then,
B′ = B′ In = B′ (AB) = (B′ A) B = In B = B.
"Uniqueness proved."
Proposition 1.1
Let A, B be (n × n) matrices. If both are invertible, so is their product AB and,
(AB)−1 = B −1 A−1 .
More generally, if A1 , . . . , Am are invertible, then so is the product A1 · · · Am and its inverse is $A_m^{-1} \cdots A_1^{-1}$.
Proof. Assume that A and B are invertible. We check that the inverse of AB is B −1 A−1 .
Next, we assume that the assertion holds for m = k. Then we have to show that it holds
for m = k + 1.
P = A1 A2 . . . Ak . (1.7)
Let $M = \begin{pmatrix} A & B \end{pmatrix}$ and $M' = \begin{pmatrix} A' \\ B' \end{pmatrix}$, where A has r columns and A′ has r rows. Then the matrix product can be computed as follows:
$$M M' = A A' + B B'. \tag{1.9}$$
Example 1.5.
$$\left(\begin{array}{cc|c} 1 & 0 & 5 \\ 0 & 1 & 7 \end{array}\right) \left(\begin{array}{cc} 2 & 3 \\ 4 & 8 \\ \hline 0 & 0 \end{array}\right) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 3 \\ 4 & 8 \end{pmatrix} + \begin{pmatrix} 5 \\ 7 \end{pmatrix}\begin{pmatrix} 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 4 & 8 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 4 & 8 \end{pmatrix}.$$
We may also multiply matrices divided into more blocks. For our purpose, a decomposition
into four blocks will be the most useful.
In this case, the rule for block multiplication is the same as for multiplication of 2 × 2
matrices.
$$M = \left(\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right), \qquad M' = \left(\begin{array}{c|c} A' & B' \\ \hline C' & D' \end{array}\right),$$
where A has k rows and r columns, B has k rows and s columns, C has l rows and r columns, D has l rows and s columns, while A′ and B′ have r rows and C′ and D′ have s rows. Then,
$$M M' = \left(\begin{array}{c|c} AA' + BC' & AB' + BD' \\ \hline CA' + DC' & CB' + DD' \end{array}\right).$$
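As a quick numerical illustration (mine, not part of the notes), the following Python sketch checks the four-block rule above with NumPy on some small matrices chosen only for the sake of the example.

import numpy as np

A, B = np.array([[1., 0.], [0., 1.]]), np.array([[5.], [7.]])
C, D = np.array([[2., 3.]]), np.array([[4.]])
Ap, Bp = np.array([[2., 3.], [4., 8.]]), np.array([[1.], [0.]])
Cp, Dp = np.array([[0., 0.]]), np.array([[6.]])

M  = np.block([[A, B], [C, D]])        # (k + l) x (r + s)
Mp = np.block([[Ap, Bp], [Cp, Dp]])    # (r + s) x (r' + s')
blockwise = np.block([[A @ Ap + B @ Cp, A @ Bp + B @ Dp],
                      [C @ Ap + D @ Cp, C @ Bp + D @ Dp]])
print(np.allclose(M @ Mp, blockwise))  # True: the blockwise product equals the full product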
Example 1.6. Take
$$M = \left(\begin{array}{cc|c} 1 & 0 & 5 \\ 0 & 1 & 7 \end{array}\right), \qquad M' = \left(\begin{array}{cc|cc} 2 & 3 & 1 & 1 \\ 4 & 1 & 0 & 0 \\ \hline 0 & 1 & 1 & 0 \end{array}\right),$$
with M partitioned into (2 + 1) columns and M′ into (2 + 1) rows. Block multiplication gives
$$M M' = \left(\begin{array}{cc|cc} 2 & 8 & 6 & 1 \\ 4 & 8 & 7 & 0 \end{array}\right).$$
The same product may be computed with a different decomposition of M′, e.g. into (2 + 1) rows and (3 + 1) columns,
$$M = \left(\begin{array}{cc|c} 1 & 0 & 5 \\ 0 & 1 & 7 \end{array}\right), \qquad M' = \left(\begin{array}{ccc|c} 2 & 3 & 1 & 1 \\ 4 & 1 & 0 & 0 \\ \hline 0 & 1 & 1 & 0 \end{array}\right),$$
and the answer is the same.
For 5 :
1 a 1 a
= .
1 0 1
We will discuss the notion of a field and study examples in the next lecture, following the textbook by Hoffman and Kunze.
2 Lecture 2
The last lecture was about the definitions and basic operations of matrices, but we didn't talk about the entries of the matrices. The entries of the matrices belong to a certain structured set called a field.
§2.1 Fields
1. Addition is commutative: x + y = y + x, ∀ x, y ∈ F.
2. Addition is associative: x + (y + z) = (x + y) + z, ∀ x, y, z ∈ F.
3. There is a unique element 0 (zero) in F s.t. x + 0 = x, ∀ x ∈ F.
4. To each x in F , there corresponds a unique element (−x) ∈ F s.t. x + (−x) = 0.
5. Multiplication is commutative: x.y = y.x, ∀ x, y ∈ F.
6. Multiplication is associative: x. (y.z) = (x.y) .z, ∀ x, y, z ∈ F.
7. There is a unique non-zero element 1 (one) in F such that x.1 = x, ∀ x ∈ F.
8. To each non-zero x ∈ F, there corresponds a unique element x−1 (or 1/x) in F such that x x−1 = 1.
9. Multiplication distributes over addition: x.(y + z) = x.y + x.z, ∀ x, y, z ∈ F.
Suppose one has a set F and there are two operations defined on the elements of F, namely + and . (dot).
The first operation +, called addition, associates with each pair of elements x, y ∈ F an element (x + y) ∈ F.
The second operation ., called multiplication, associates with each pair x, y an element x.y ∈ F; and these two operations satisfy conditions (1)−(9) above. For convenience, we will drop the multiplication notation . between elements of F, i.e., by simply xy we will mean x.y, given x, y ∈ F.
The set F together with these two operations (+, .) is called a field.
Example 2.1. The set of complex numbers denoted by C is a field with respect to the standard
addition and multiplication of complex numbers:
(a + ib) + (c + id) = (a + c) + i (b + d), (a + ib).(c + id) = (ac − bd) + i (ad + bc).
(Check!)
• A subfield of C is a set F of complex numbers which is itself a field under the usual
operations of addition and multiplication of complex numbers. This means that 0 and 1
are in the set F , and that if x, y ∈ F , then (x + y) , −x, xy, and x−1 (x ̸= 0) are also in F .
• The set of positive integers 1, 2, 3, . . . is not a subfield of C for a variety of reasons. For example, the additive identity 0 is not a positive integer. Also, given a positive integer n, its additive inverse −n is not a positive integer. Also, if n is any positive integer other than 1, then its multiplicative inverse 1/n is not a positive integer.
• The set of integers Z is not a subfield of C, because for n ∈ Z \ {0, ±1}, its multiplicative inverse 1/n ∉ Z.
• The set of rational numbers Q is a subfield of C. (Check!) .
• The set of all complex numbers (real numbers, in fact) of the form x + y√2, where x, y ∈ Q, is a subfield of C. (Exercise.) This field is denoted by Q(√2).
Any n tuple (x1 , x2 , . . . , xn ) of elements of F that satisfies each of the m equations in equation
2.1 is called a solution of the system. If y1 = y2 = · · · = ym = 0 in equation 2.1, we say that
the system is homogeneous.
Multiplying the ith equation in 2.1 by a scalar Ci and adding the resulting equations, one obtains the linear equation
$$\left(\sum_{i=1}^{m} C_i A_{i1}\right)x_1 + \cdots + \left(\sum_{i=1}^{m} C_i A_{in}\right)x_n = C_1 y_1 + C_2 y_2 + \cdots + C_m y_m. \tag{2.3}$$
Equation 2.3 is a new linear equation in the n unknowns, with coefficients $\sum_{i=1}^{m} C_i A_{ij}$ and constant term C1 y1 + C2 y2 + · · · + Cm ym.
Equation 2.3 is called a linear combination of the linear equations in equation 2.1. If the n
tuple (x1 , x2 , . . . , xn ) solves the system 2.1, then it solves the linear equation 2.3 as well which
is evident from equation 2.2. Now, if we have the following system of linear equations:
$$\begin{aligned} B_{11} x_1 + \cdots + B_{1n} x_n &= z_1, \\ &\;\;\vdots \\ B_{k1} x_1 + \cdots + B_{kn} x_n &= z_k, \end{aligned} \tag{2.4}$$
where each linear equation in 2.4 is a linear combination of the linear equations in 2.1. One
then immediately finds that a solution of 2.1 is going to solve the system 2.4 as well. But the
system 2.4 may have solutions that don’t solve 2.1.
Now if every equation in the system 2.1 can be expressed as a linear combination of the linear
equations in 2.4, then any solution of 2.4 will also solve 2.1. In such a situation, the system 2.1
is said to be equivalent to system 2.4.
Two systems of linear equations are said to be equivalent if each equation in each system can be expressed as a linear combination of the equations in the other system. The consequence of this definition is expressed formally in the following theorem:
Theorem 2.1
Equivalent systems of linear equations have exactly the same solutions.
Let us get back to the system 2.1 and write it as the following matrix equation:
AX = Y, (2.5)
where
$$A = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{pmatrix}; \qquad X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \quad \text{and} \quad Y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}.$$
Definition 2.1 (Elementary Row Operation). An elementary row operation is a special type of function e that associates with each (m × n) matrix A an (m × n) matrix e (A) in one of the following 3 possible ways:
1. e (A)ij = Aij if i ̸= r, e (A)rj = cArj . [Multiplication of a row of A by a non-zero
scalar (element from F ) c].
2. e (A)ij = Aij if i ̸= r, e (A)rj = Arj + cAsj . [Replacement of the rth row of A by row
r plus c times row s, c ∈ F and r ̸= s].
3. e (A)ij = Aij if i ̸= r and i ̸= s, e (A)rj = Asj , e (A)sj = Arj . [Interchange of 2 rows
of A].
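The three operations are easy to realize on a machine. Below is a minimal Python sketch (mine, not part of the notes) acting on a matrix stored as a list of lists, with Fractions so the arithmetic stays exact over Q; the checks at the end illustrate that each operation is undone by an operation of the same type, as in the theorem that follows.

from fractions import Fraction

def scale_row(A, r, c):              # type 1: multiply row r by a non-zero scalar c
    assert c != 0
    B = [row[:] for row in A]
    B[r] = [c * x for x in B[r]]
    return B

def add_multiple(A, r, s, c):        # type 2: replace row r by row r + c * (row s), r != s
    assert r != s
    B = [row[:] for row in A]
    B[r] = [x + c * y for x, y in zip(B[r], B[s])]
    return B

def swap_rows(A, r, s):              # type 3: interchange rows r and s
    B = [row[:] for row in A]
    B[r], B[s] = B[s], B[r]
    return B

A = [[Fraction(2), Fraction(-1)], [Fraction(1), Fraction(4)]]
assert scale_row(scale_row(A, 0, Fraction(3)), 0, Fraction(1, 3)) == A
assert add_multiple(add_multiple(A, 0, 1, 5), 0, 1, -5) == A
assert swap_rows(swap_rows(A, 0, 1), 0, 1) == A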
Theorem 2.2
To each elementary row operation e, there corresponds an elementary row operation e1 , of
the same type as e, such that e1 (e (A)) = e (e1 (A)) = A for each A. In other words, the
inverse operation(function) of an elementary row operation exists and is an elementary row
operation of the same type.
Proof. We prove the result for the 3 basic types of elementary row operations separately:
1. Define e1 by e1 (A)ij = Aij if i ̸= r; e1 (A)rj = c−1 Arj ; c ∈ F \ {0},
Then e1 (e (A))ij = e (A)ij = Aij if i ̸= r; e1 (e (A))rj = c−1 e (A)rj = c−1 .c Arj = Arj .
∴ e1 (e (A)) = A.
Similarly, one can show that e (e1 (A)) = A for type 1 of elementary row operations
e.
2. Define e1 by e1 (A)ij = Aij , i ̸= r; e1 (A)rj = Arj − cAsj where r ̸= s and c ∈ F.
∴ e1 (e (A)) = A.
Similarly, for the type 2 of EROs, one can show that e (e1 (A)) = A.
3. Define e1 by e1 (A)ij = Aij , for i ̸= r and i ̸= s; e1 (A)rj = Asj and e1 (A)sj = Arj .
In other words, e1 = e,
∴ e1 (e (A)) = A.
Definition 2.2. If A and B are m×n matrices over the field F, B is said to be row-equivalent
to A if B can be obtained from A by a finite sequence of elementary row operations.
Theorem 2.3
If A and B are row-equivalent m × n matrices over F , the homogeneous systems of linear
equations AX = 0 and BX = 0 have exactly the same solutions.
A = A0 → A1 → · · · → Ak = B.
It is enough to prove that Aj X = 0 and Aj+1 X = 0 have the same solutions, i.e., that one elementary row operation doesn't perturb the set of solutions.
So, suppose that B is obtained from A by a single ERO. Irrespective of the type of ERO performed, each equation in the system BX = 0 will be a linear combination of the equations in the system AX = 0. For instance, for a type-2 ERO (row r replaced by row r plus c times row s), the rth row of B is
[Br1 . . . Brn ] = 0 [A11 . . . A1n ] + · · · + 1 [Ar1 . . . Arn ] + · · · + c [As1 . . . Asn ] + · · · + 0 [Am1 . . . Amn ] .
Now, if B = e (A) with e being one of the 3 basic types of EROs, then A = e1 (B) where e1
is the inverse ERO described in Theorem 2.2. This is because e (e1 (B)) = B as proved in The-
orem 2.2. Therefore, A can be obtained from B by an elementary row operation. Therefore,
using the arguments presented earlier in the proof, each equation of the system AX = 0 can
be expressed as a linear combination of the equations in the system BX = 0.
As an illustration, let us row-reduce the following matrix (the type of each ERO is indicated over the arrow):
$$\begin{pmatrix} 2 & -1 & 3 & 2 \\ 1 & 4 & 0 & -1 \\ 2 & 6 & -1 & 5 \end{pmatrix} \xrightarrow[\;-2R_2 + R_3 = R_3'\;]{(2)\;\; -2R_2 + R_1 = R_1'} \begin{pmatrix} 0 & -9 & 3 & 4 \\ 1 & 4 & 0 & -1 \\ 0 & -2 & -1 & 7 \end{pmatrix} \xrightarrow{(1)\;\; -\frac19 R_1 = R_1'} \begin{pmatrix} 0 & 1 & -\frac13 & -\frac49 \\ 1 & 4 & 0 & -1 \\ 0 & -2 & -1 & 7 \end{pmatrix}$$
$$\xrightarrow[\;2R_1 + R_3 = R_3'\;]{(2)\;\; -4R_1 + R_2 = R_2'} \begin{pmatrix} 0 & 1 & -\frac13 & -\frac49 \\ 1 & 0 & \frac43 & \frac79 \\ 0 & 0 & -\frac53 & \frac{55}{9} \end{pmatrix} \xrightarrow{(1)\;\; -\frac35 R_3 = R_3'} \begin{pmatrix} 0 & 1 & -\frac13 & -\frac49 \\ 1 & 0 & \frac43 & \frac79 \\ 0 & 0 & 1 & -\frac{11}{3} \end{pmatrix} \xrightarrow[\;\frac13 R_3 + R_1 = R_1'\;]{(2)\;\; -\frac43 R_3 + R_2 = R_2'} \begin{pmatrix} 0 & 1 & 0 & -\frac53 \\ 1 & 0 & 0 & \frac{51}{9} \\ 0 & 0 & 1 & -\frac{11}{3} \end{pmatrix}.$$
Therefore, by means of Theorem 2.3, one finds that the homogeneous system of linear equations
2x1 − x2 + 3x3 + 2x4 = 0,
x1 + 4x2 − x4 = 0,
2x1 + 6x2 − x3 + 5x4 = 0,
and the system
x1 + (51/9) x4 = 0,
x2 − (5/3) x4 = 0,
x3 − (11/3) x4 = 0,
have the same solutions.
Now consider the matrix
$$\begin{pmatrix} -1 & i \\ -i & 3 \\ 1 & 2 \end{pmatrix} \xrightarrow[\;iR_3 + R_2 = R_2'\;]{(2)\;\; R_3 + R_1 = R_1'} \begin{pmatrix} 0 & 2+i \\ 0 & 3+2i \\ 1 & 2 \end{pmatrix} \xrightarrow{(1)\;\; \frac{1}{2+i} R_1 = R_1'} \begin{pmatrix} 0 & 1 \\ 0 & 3+2i \\ 1 & 2 \end{pmatrix} \xrightarrow[\;-2R_1 + R_3 = R_3'\;]{(2)\;\; -(3+2i)R_1 + R_2 = R_2'} \begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 1 & 0 \end{pmatrix}.$$
Hence the homogeneous system
−x1 + ix2 = 0,
−ix1 + 3x2 = 0,
x1 + 2x2 = 0,
and the one given by:
x2 = 0,
x1 = 0,
have the same solutions. i.e., the former system admits trivial solution only.
Remark. In the two examples above, the EROs we performed aren't random. The aim was to put the coefficient matrix in a "desired form". This "desired form" is defined as follows:
The leading 1’s are circled. The leading 1 in the third row belongs to the third column where
there are other non-zero entries, e.g; −1. And hence part (b) of the definition fails.
$$\begin{pmatrix} 0 & 2 & 1 \\ 1 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix}.$$
The first row is a non-zero row with 2 being the first non-zero entry. Hence, part (a) of the
definition fails and the matrix in question is not row-reduced.
Theorem 2.4
Every (m × n) matrix over the field F is row-equivalent to a row-reduced matrix.
Proof. Let A be an (m × n) matrix over F. If every entry of the first row is zero, then condition (a) for a row-reduced matrix is satisfied as far as row 1 of A is concerned. If row 1 has a non-zero entry, let k (1 ≤ k ≤ n) be the least positive integer j for which A1j ≠ 0. Multiply row 1 by A1k−1 to obtain the leading non-zero entry of the first row of A to be 1. Now, for i ≥ 2, perform the elementary row operation −Aik R1 + Ri = Ri′ so that column k, containing the leading 1 of the first row, has all its other entries equal to zero, as required by condition (b) of a row-reduced matrix.
The resulting matrix is of the following form (the displayed 1 sits in the kth column of the first row):
$$\begin{pmatrix} 0 & \cdots & 0 & 1 & * & \cdots & * \\ * & \cdots & * & 0 & * & \cdots & * \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ * & \cdots & * & 0 & * & \cdots & * \end{pmatrix}.$$
Here ∗ denotes unknown entries. If every entry in row 2 of the matrix above is zero, i.e., all the ∗'s in row 2 are zero, we do nothing to it.
If some of the ∗'s in row 2 are non-zero, we find the lowest positive integer value k′ of j for which A2j is non-zero. We multiply the second row by A2k′−1 to obtain the leading 1 in the second row. We also observe that k′ ≠ k, since there can't be a non-zero entry in the kth column of the second row. At this stage, the matrix reduces to one of two forms, depending on whether k′ < k or k′ > k. We then perform, for i ≠ 2, the elementary row operations
−Aik′ R2 + Ri = Ri′ (2.6)
to make every other entry of column k′ equal to zero.
This last operation given by 2.6 doesn’t alter the zero entries appearing before the leading 1s
in row 1 and row 2. Also, the k-th column remains unaltered by the ERO 2.6 as can easily be
verified using the figure above.
Working with one row at a time (as done above for row 1 and 2) in the above manner, it
is clear that we will arrive at a row-reduced matrix after a finite number of steps. ■
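The procedure in the proof translates almost line by line into a program. Here is a rough Python sketch of it (mine, not the notes' algorithm verbatim): for each row in turn, find its first non-zero entry, scale it to 1, and clear the rest of that column with type-2 operations; Fractions keep the arithmetic exact.

from fractions import Fraction

def row_reduce(A):
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    for i in range(m):
        k = next((j for j in range(n) if R[i][j] != 0), None)
        if k is None:                      # zero row: nothing to do
            continue
        pivot = R[i][k]
        R[i] = [x / pivot for x in R[i]]   # make the leading entry 1
        for s in range(m):                 # clear column k in every other row
            if s != i and R[s][k] != 0:
                c = R[s][k]
                R[s] = [x - c * y for x, y in zip(R[s], R[i])]
    return R

# The matrix row-reduced by hand earlier in this lecture; the same reduced matrix
# is obtained (possibly with its rows listed in a different order).
print(row_reduce([[2, -1, 3, 2], [1, 4, 0, -1], [2, 6, -1, 5]]))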
Exercises (Hoffman and Kunze) (Page: 10) Problem: 1, 2, 4, 5.
3 Lecture 3
Theorem 3.1
Every (m × n) matrix A is row equivalent to a row-reduced echelon matrix.
Proof. We know that A is row equivalent to a row-reduced matrix from Theorem 2.4 of
Lecture 2.
By a finite number of type-3 EROs (interchanging rows of a row-reduced matrix), one can bring all the zero rows (if any) to the bottom of the matrix. Also, by applying a finite number of row interchanges to a row-reduced matrix, one can achieve the third property of a row-reduced echelon matrix. ■
In lecture 2, we’ve seen importance of row-reduced matrices in solving homogeneous system of
linear equations.
Let us now focus on the system RX = 0, where R is a row-reduced echelon matrix. Let
rows, 1, 2, . . . , r be the non-zero rows of R, and suppose that the leading 1 in the ith row occurs
at the ki th column, where 1 ≤ i ≤ r. The system RX = 0 then consists of r non-trivial equa-
tions. Note that the unknown xki will appear only in the ith equation of the system. We call
xki ’s the leading variables. The (n − r) unknowns which are different from xk1 , xk2 , . . . , xkr , are
denoted by u1 , u2 , . . . , un−r . We call these variables free variables.
Using the leading and free variables, one can write the r non-trivial equations in RX = 0
in the following way:
$$\begin{aligned} x_{k_1} + \sum_{j=1}^{n-r} c_{1j}\, u_j &= 0, \\ x_{k_2} + \sum_{j=1}^{n-r} c_{2j}\, u_j &= 0, \\ &\;\;\vdots \\ x_{k_r} + \sum_{j=1}^{n-r} c_{rj}\, u_j &= 0. \end{aligned} \tag{3.1}$$
Here cij ’s aren’t to be confused with the entries of R. But the cij ’s for i = 1, 2, . . . , r and
j = 1, 2, . . . , n − r can be read off from the matrix R as we will see using examples.
All the solutions of the system RX = 0 are obtained by assigning arbitrary values to u1 , u2 , . . . , un−r
and hence they are called free variables. The leading variables xk1 , . . . , xkr are computed in
terms of the free variables u1 , u2 , . . . , un−r using equation 3.1.
Let us assign arbitrary values to the free variables, x1 = a, x3 = b and x5 = c, and obtain the solution
$$\left(a,\; 3b - \tfrac{c}{2},\; b,\; -2c,\; c\right).$$
Remark. Let us focus back on the system RX = 0. If the number r of non-zero rows in R is less than n, then the system RX = 0 has a non-trivial solution, that is, a solution (x1 , x2 , . . . , xn ) in which not every xj is 0. This is because, since r < n, one can choose the (n − r) variables u1 , u2 , . . . , un−r arbitrarily and write xk1 , xk2 , . . . , xkr in terms of u1 , u2 , . . . , un−r ; the uj 's are the xj 's in (x1 , x2 , . . . , xn ) which are different from xk1 , xk2 , . . . , xkr .
We now have the following theorem based on the remark above:
Theorem 3.2
If A is an (m × n) matrix and m < n, then the homogeneous system of linear equations
AX = 0 has a non-trivial solution.
Proof. Let R be a row-reduced echelon matrix which is row equivalent to A. Then the systems
AX = 0 and RX = 0 have the same solutions by Theorem 2.3 of Lecture 2. If r is the number
of non-zero rows in R, then certainly r ≤ m. Since m < n by hypothesis, r < n. Then by
the remark above RX = 0 has a non-trivial solution. Therefore, AX = 0 has a non-trivial
solution. ■
Theorem 3.3
If A is an (n × n) matrix, then A is row equivalent to the (n × n) identity matrix if and
only if the system of equations AX = 0 has only the trivial solution.
Proof. (⇒) Let A be row equivalent to the (n × n) identity matrix I, then by Theorem 2.3 of
lecture 2, AX = 0 and IX = 0 have the same solutions. Since, IX = 0 has the trivial solution
only, so does the system AX = 0.
While a homogeneous system of linear equations always admits the trivial solution x1 = · · · = xn = 0, an inhomogeneous system AX = Y may not admit any solution at all.
We form the augmented matrix A′ of the system AX = Y . This is the m × (n + 1) matrix whose first n columns are the columns of A and whose last column is Y . More precisely,
A′ij = Aij if j ≤ n,   A′i(n+1) = Yi .
Suppose we perform a sequence of EROs on A to arrive at a row-reduced echelon matrix R. If we perform this same sequence of EROs on the augmented matrix A′, we'll arrive at a matrix R′ whose first n columns are the columns of R and whose last column contains some scalars z1 , . . . , zm . The scalars zi are the entries of the (m × 1) matrix
$$Z = \begin{pmatrix} z_1 \\ \vdots \\ z_m \end{pmatrix},$$
Using the proof techniques used in Theorem 2.3 of Lecture 2, one can show that the linear
equations in the system AX = Y can be expressed as linear combinations of the equations in
the system RX = Z and vice versa. And hence, the 2 systems are equivalent and they have the
same solutions. Let us see how to determine whether the system RX = Z has any solutions
and to determine all the solutions if any exist. If R has r non-zero rows, with leading 1 of row
i occurring in the column ki , i = 1, 2, . . . , r, then the first r equations of RX = Z effectively
express xk1 , . . . , xkr in terms of the (n − r) remaining xj and the scalars z1 , . . . , zr . The last
(m − r) equations are:
$$0 = z_{r+1}, \quad \ldots, \quad 0 = z_m, \tag{3.2}$$
and accordingly the condition for the system to have a solution is Zi = 0 for i > r. If this
condition given by Equation 3.2 is met, all solutions of the system AX = Y can be found by
assigning arbitrary values to the (n − r) of the xj , i.e., the free variables and then computing
xki from the ith equation.
Suppose now that
$$A = \begin{pmatrix} 1 & -2 & 1 \\ 2 & 1 & 1 \\ 0 & 5 & -1 \end{pmatrix},$$
and suppose we wish to solve the system AX = Y for some y1 , y2 and y3 . The augmented matrix A′ reads:
$$A' = \left(\begin{array}{ccc|c} 1 & -2 & 1 & y_1 \\ 2 & 1 & 1 & y_2 \\ 0 & 5 & -1 & y_3 \end{array}\right).$$
Let us row-reduce A′:
$$\left(\begin{array}{ccc|c} 1 & -2 & 1 & y_1 \\ 2 & 1 & 1 & y_2 \\ 0 & 5 & -1 & y_3 \end{array}\right) \xrightarrow{-2R_1 + R_2 = R_2'} \left(\begin{array}{ccc|c} 1 & -2 & 1 & y_1 \\ 0 & 5 & -1 & -2y_1 + y_2 \\ 0 & 5 & -1 & y_3 \end{array}\right) \xrightarrow{-R_2 + R_3 = R_3'} \left(\begin{array}{ccc|c} 1 & -2 & 1 & y_1 \\ 0 & 5 & -1 & -2y_1 + y_2 \\ 0 & 0 & 0 & 2y_1 - y_2 + y_3 \end{array}\right)$$
$$\xrightarrow{\frac15 R_2 = R_2'} \left(\begin{array}{ccc|c} 1 & -2 & 1 & y_1 \\ 0 & 1 & -\frac15 & \frac{-2y_1 + y_2}{5} \\ 0 & 0 & 0 & 2y_1 - y_2 + y_3 \end{array}\right) \xrightarrow{2R_2 + R_1 = R_1'} \left(\begin{array}{ccc|c} 1 & 0 & \frac35 & \frac{y_1 + 2y_2}{5} \\ 0 & 1 & -\frac15 & \frac{-2y_1 + y_2}{5} \\ 0 & 0 & 0 & 2y_1 - y_2 + y_3 \end{array}\right).$$
The condition for the system AX = Y to have a solution is therefore
2y1 − y2 + y3 = 0.
And if the condition above is met, then we can express the leading variables in terms of the free variable x3 in the following way:
$$x_1 = -\frac35 x_3 + \frac{y_1 + 2y_2}{5}, \qquad x_2 = \frac15 x_3 + \frac{y_2 - 2y_1}{5}.$$
Assigning the free variable an arbitrary scalar c, one obtains
$$x_1 = -\frac35 c + \frac{y_1 + 2y_2}{5}, \qquad x_2 = \frac15 c + \frac{y_2 - 2y_1}{5},$$
so that the solution set reads
$$\left( -\frac35 c + \frac{y_1 + 2y_2}{5},\; \frac15 c + \frac{y_2 - 2y_1}{5},\; c \right).$$
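A small numerical check of this worked example (mine, not from the notes): pick y1, y2, choose y3 so that the consistency condition 2y1 − y2 + y3 = 0 holds, pick any value of the free variable, and verify that the parametric formula really solves AX = Y.

import numpy as np

A = np.array([[1., -2., 1.], [2., 1., 1.], [0., 5., -1.]])
y1, y2 = 1.0, 3.0
y3 = y2 - 2 * y1                      # chosen so that 2*y1 - y2 + y3 = 0
Y = np.array([y1, y2, y3])

c = 4.2                               # arbitrary value for the free variable x3
x = np.array([-(3/5) * c + (y1 + 2*y2) / 5,
              (1/5) * c + (y2 - 2*y1) / 5,
              c])
print(np.allclose(A @ x, Y))          # True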
4 Lecture 4
Theorem 4.1
If A, B, C are matrices over the field F such that the products BC and A (BC) are defined,
then so are the products AB and (AB) C and,
A (BC) = (AB) C.
Proof. To show that A (BC) = (AB) C, one must show that [A (BC)]ij = [(AB) C]ij for each i, j. Now,
$$[A(BC)]_{ij} = \sum_{r=1}^{n} A_{ir} (BC)_{rj} = \sum_{r=1}^{n} A_{ir} \sum_{k=1}^{p} B_{rk} C_{kj} = \sum_{r=1}^{n} \sum_{k=1}^{p} A_{ir} B_{rk} C_{kj} = \sum_{k=1}^{p} \left( \sum_{r=1}^{n} A_{ir} B_{rk} \right) C_{kj} = \sum_{k=1}^{p} (AB)_{ik}\, C_{kj} = [(AB)C]_{ij}.$$
■
Remark. The relation A (BC) = (AB) C implies that linear combinations of linear combina-
tions of the rows of C are again linear combinations of the rows of C.
Theorem 4.2
Let e be an ERO and E be the (m × m) elementary matrix E = e (I) . Then, for every
(m × n) matrix A,
e (A) = EA
(This is how elementary row operation on a matrix is written in terms of matrix multipli-
cation).
Proof.
$$(EA)_{ij} = \sum_{k=1}^{m} E_{ik} A_{kj}.$$
The entry in the ith row and the j th column of the product matrix EA is obtained from the ith row of E and the j th column of A.
Let us give a detailed proof for the 2nd type of ERO. Proof for the other 2 types are left
as exercises.
e (A)ij = Aij if i ≠ r;   e (A)rj = Arj + cAsj (r ≠ s, c ∈ F).
Now, apply this 2nd type of ERO to the (m × m) identity matrix to obtain the (m × m)
elementary matrix E = e (I), so that,
e (A) = EA.
B = (Es . . . E2 E1 ) A.
Since, E1 A is obtained by applying an ERO to A by Theorem 4.2 then by using the definition
of row equivalence of 2 matrices (2 matrices are said to be row equivalent to each other if one
can be obtained from the other by applying a finite sequence of EROs) one observes that E1 A
is row-equivalent to A. In the same way, E2 (E1 A) is row-equivalent to E1 A and hence is row-
equivalent to A. Continuing this way, one concludes that (Es . . . E2 E1 ) A is row-equivalent to A.
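A brief numerical illustration of Theorem 4.2 (a sketch of mine, not from the notes): applying an ERO e to A gives the same result as left-multiplying A by the elementary matrix E = e(I). The particular ERO below is only an example.

import numpy as np

def e(M):                        # a type-2 ERO: add 3 times the second row to the first row
    M = M.copy()
    M[0] = M[0] + 3 * M[1]
    return M

A = np.array([[2., -1., 3.], [1., 4., 0.]])
E = e(np.eye(2))                 # E = e(I), a 2 x 2 elementary matrix
print(np.allclose(e(A), E @ A))  # True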
Lemma 4.4
If A has a left inverse B and a right inverse C, then B = C.
B = BI = B (AC) = (BA) C = IC = C.
■
Thus, if A (a square matrix) has both a left and a right inverse, then A has a unique two-sided
inverse (which is equal to any of the left/right inverses) and A is called invertible. The unique
two-sided inverse or simply the inverse of A is denoted by A−1 .
Theorem 4.5
Let A and B be (n × n) matrices over F .
(i) If A is invertible, so is A−1 and (A−1)−1 = A.
(ii) If both A and B are invertible, so is AB and,
(AB)−1 = B −1 A−1 .
Proof. (i) Since A is invertible (n × n) matrix, one must have an (n × n) matrix A−1 , called
the inverse of A so that the following equality holds,
From equation 4.4, one finds that the (n × n) matrix A−1 is invertible and its two-sided inverse is the (n × n) matrix A. In other words,
(A−1)−1 = A.
(AB) (B−1 A−1) = A (B B−1) A−1 = A I A−1 = A A−1 = I [upon successive use of equation 4.5], i.e., B−1 A−1 is a right inverse of AB.
Also,
B −1 A−1 (AB) = B −1 A−1 A B = B −1 IB = B −1 B = I
[again upon successive use of equation 4.5], i.e., B −1 A−1 is also a left inverse of AB.
Hence, B −1 A−1 is the unique 2-sided inverse of AB. i.e.,
(AB)−1 = B −1 A−1 .
Theorem 4.7
An elementary matrix is invertible.
Proof. Let e be the ERO such that E = e (I), let e1 be its inverse operation (Theorem 2.2), and set E1 = e1 (I). Then,
E E1 = e (E1) , [by Theorem 4.2]
= e (e1 (I)) ,
= I; [since e1 is the inverse operation of e],
and,
E1 E = e1 (E) , [again by Theorem 2.2]
= e1 (e (I)) ,
= I; [Since,e1 is the inverse operation of e].
Hence, E is invertible and E1 = E −1 . ■
Example 4.3. (a)
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix};$$
(b)
$$\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -c \\ 0 & 1 \end{pmatrix};$$
(c)
$$\begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 \\ -c & 1 \end{pmatrix};$$
(d) when c ≠ 0,
$$\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix} \quad \& \quad \begin{pmatrix} 1 & 0 \\ 0 & c \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & c^{-1} \end{pmatrix}.$$
Theorem 4.8
If A is a (n × n) matrix, TFAE:
(i) A is invertible.
(ii) A is row-equivalent to the (n × n) identity matrix,
(iii) A is a product of elementary matrices.
Proof. Let R be a row-reduced echelon matrix which is row-equivalent to A (by Theorem 3.1 of Lecture 3). Now by Theorem 4.2 (or its corollary) of the current lecture,
R = (Ek . . . E2 E1 ) A (4.6)
From equations 4.6, 4.7, and 4.8, one concludes that (i), (ii) and (iii) are equivalent statements.
■
Proof. Using equation 4.8 and the Corollary to Theorem 4.5, one obtains:
$$A^{-1} = E_1^{-1} E_2^{-1} \cdots E_k^{-1} = (E_k \cdots E_2 E_1)^{-1}, \tag{4.9}$$
$$(E_k \cdots E_2 E_1)\, A = I. \tag{4.10}$$
This sequence (Ek . . . E2 E1 ) of ERO reduces A to the identity matrix. This same sequence of
EROs when applied to I yields A−1 as is evident from equation 4.9. ■
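The algorithm just described is easy to program. The following Python sketch (mine, not the notes' pseudocode) reduces the augmented matrix [A | I] to [I | A^{-1}] with elementary row operations, using Fractions for exact arithmetic; it raises an error when no pivot can be found, i.e. when A is not invertible.

from fractions import Fraction

def inverse(A):
    n = len(A)
    # the augmented matrix [A | I]
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        p = next((r for r in range(i, n) if M[r][i] != 0), None)
        if p is None:
            raise ValueError("A is not invertible")
        M[i], M[p] = M[p], M[i]                 # type-3 ERO: bring a pivot into place
        M[i] = [x / M[i][i] for x in M[i]]      # type-1 ERO: make the leading entry 1
        for r in range(n):
            if r != i and M[r][i] != 0:         # type-2 EROs: clear column i elsewhere
                c = M[r][i]
                M[r] = [x - c * y for x, y in zip(M[r], M[i])]
    return [row[n:] for row in M]               # the right half is A^{-1}

# The matrix of Example 1.4; the entries print as Fractions: [[3, -1], [-5, 2]].
print(inverse([[2, 1], [5, 3]]))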
Proof. From Corollary to Theorem 4.2 of this lecture, we know that B is row-equivalent to A
if and only if B = P A where P is a product of (m × m) elementary matrices. Now by corollary
to Theorem 4.5 and 4.7, P is invertible. ■
Theorem 4.11
For an (n × n) matrix, TFAE:
(i) A is invertible.
(ii) The homogeneous system AX = 0 only has the trivial solution X = 0.
(iii) The system of equations AX = Y has a solution X for each (n × 1) matrix Y .
Proof. According to Theorem 3.3, condition (ii) is equivalent to the fact that A is row-equivalent
to the identity matrix. Now, by Theorem 4.8, (i) and (ii) are equivalent.
Now suppose that (iii) holds. Let R be the row-reduced echelon matrix which is row-equivalent to A. Since AX = Y has a solution for every (n × 1) matrix Y, and the systems AX = Y and RX = Z (with Z obtained from Y by the same EROs) are equivalent, the system RX = E has a solution for every (n × 1) matrix E. Take, in particular,
$$E = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.$$
For RX = E to be a consistent system of linear equations, the last row of R can't be a zero row. Since R is an (n × n) row-reduced echelon matrix with a non-zero last row (hence
with no zero row at all), R = I. In other words, A is row-equivalent to the (n × n) identity
matrix. So, by Theorem 4.8, A is invertible. Hence, we proved that (i) ⇐⇒ (iii). Since,
(i) ⇐⇒ (ii) and (i) ⇐⇒ (iii), one has (i) ⇐⇒ (ii) ⇐⇒ (iii). ■
Proof. Let A be an (n × n) matrix. Suppose A has a left inverse, i.e., a matrix B such that
BA = I. Then since, X = IX = (BA) X = B (AX) , AX = 0 has only the trivial solution
X = 0. Therefore, A is invertible by Theorem 4.11.
Now, suppose that A has a right inverse, i.e., an (n × n) matrix C such that AC = I. Then, by
previous argument C has a left inverse, namely A and hence C is invertible. One then obtains
C −1 = A. Therefore, by Theorem 4.5, C −1 = A is invertible with:
A−1 = C
■
5 Lecture 5
Example 5.1. The n-tuple space, Fn. Let F be any field, and V be the set of all n-tuples,
⃗v = (v1 , v2 , . . . , vn ) of scalars vi ∈ F.
⃗w + ⃗v = (w1 + v1 , w2 + v2 , . . . , wn + vn ) , (5.1)
c ⃗v = (c v1 , c v2 , . . . , c vn ) . (5.2)
Verify (Exercise) that Fn equipped with the vector addition and scalar multiplication defined
above indeed forms a vector space over F.
Example 5.2. The space of all m × n matrices over F, denoted by Fm×n . Let F be any field
and m and n are positive integers. Let Fm×n be the set of all (m × n) matrices over the field
F. The sum of two vectors (matrices of size m × n over F) A and B is defined entry-wise: (A + B)ij = Aij + Bij .
The scalar here comes from the field F and scalar multiplication is likewise defined entry-wise: (cA)ij = c Aij .
Now, it is straightforward to check that the set Fm×n satisfies all the properties of a vector
space when the scalars take their values from F. (Exercise-Verify it!)
Hence, Fm×n is a vector space over the field F.
Example 5.3. The space of functions from a non-empty set to a field; F(S, F): Let F(S, F)
be the set of all functions from the set S to F. The sum of two elements f, g ∈ F(S, F) is the function f + g defined by (f + g)(s) = f (s) + g(s) for s ∈ S, and the scalar multiple cf (c ∈ F) is defined by (cf )(s) = c f (s).
Now, verify (exercise) that all the properties of a vector space are satisfied by F(S, F) when
the scalars take their value from F. Thus F(S, F) is a vector space over the field F.
Example 5.4. The field C of complex numbers can be regarded as a vector space over the field
R of real numbers. More generally, one can consider the vector space V of complex n-tuples
(x1 , x2 , . . . , xn ) , ∀ xi ∈ C, over the field R of real numbers. Vector addition is coordinate-wise and scalar multiplication is given by c (x1 , x2 , . . . , xn ) = (c.x1 , c.x2 , . . . , c.xn ) ,
where (c.x1 , c.x2 , . . . , c.xn ) is multiplication in the field of complexes since c ∈ R can be re-
garded as a complex number.
This vector space V is different from Cn (which is a special case of Fn in example 5.1) in
which case the ground field is C whereas the vector space V of complex n-tuples is defined over
the ground field of reals.
There are a few simple facts that follow from the definition of a vector space.
(i) c ⃗0 = ⃗0 for c ∈ F.
To prove (i), first write:
c ⃗0 = c ⃗0 + ⃗0 = c ⃗0 + c ⃗0;
=⇒ −c ⃗0 + c ⃗0 = −c ⃗0 + c ⃗0 + c ⃗0; [adding −c ⃗0 on both sides].
=⇒ ⃗0 = ⃗0 + c ⃗0 = c ⃗0;
∴ c ⃗0 = ⃗0.
(ii) 0 ⃗u = ⃗0 [ 0 is a Scalar].
0 ⃗u = (0 + 0) ⃗u = 0 ⃗u + 0 ⃗u;
=⇒ 0 ⃗u + (−0 ⃗u) = 0 ⃗u + (0 ⃗u + (−0 ⃗u)) ;
=⇒ ⃗0 = 0 ⃗u + ⃗0 = 0 ⃗u;
∴ 0 ⃗u = ⃗0.
C1 − C3 = z1 ,
C2 + C3 = z2 ,
C1 + C2 + C3 = z3 . (5.10)
For which values of z1 , z2 , z3 does the system have a solution?
Augmented matrix:
$$\left(\begin{array}{ccc|c} 1 & 0 & -1 & z_1 \\ 0 & 1 & 1 & z_2 \\ 1 & 1 & 1 & z_3 \end{array}\right) \xrightarrow{-R_1 + R_3 = R_3'} \left(\begin{array}{ccc|c} 1 & 0 & -1 & z_1 \\ 0 & 1 & 1 & z_2 \\ 0 & 1 & 2 & -z_1 + z_3 \end{array}\right) \xrightarrow{-R_2 + R_3 = R_3'} \left(\begin{array}{ccc|c} 1 & 0 & -1 & z_1 \\ 0 & 1 & 1 & z_2 \\ 0 & 0 & 1 & -z_1 - z_2 + z_3 \end{array}\right),$$
which tells us that the system 5.10 always admits a solution no matter what you take for
z1 , z2 and z3 .
Hence, any vector in C3 is a linear combination of the vectors (1, 0, −1) , (0, 1, 1) , and (1, 1, 1) .
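As a quick numerical cross-check (mine, not part of the notes), one can place the three vectors as the columns of a matrix and let NumPy solve for the coefficients of an arbitrary right-hand side; since the matrix is invertible, a solution always exists.

import numpy as np

# columns are the vectors (1, 0, -1), (0, 1, 1), (1, 1, 1)
M = np.array([[1, 0, 1], [0, 1, 1], [-1, 1, 1]], dtype=complex)
z = np.array([2 - 1j, 0.5, 3j])          # an arbitrary vector in C^3
c = np.linalg.solve(M, z)                # the coefficients c1, c2, c3
print(np.allclose(M @ c, z))             # True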
§5.3 Subspaces
Definition 5.3. Let V be a vector space over the field F. A subspace of V is a subset W
of V which is itself a vector space over F with the operations of vector addition and scalar
multiplication on V .
A direct check of the axioms of the vector space reveals the fact that the subset W of
V is a subspace if for each ⃗u and ⃗v in W , the vector ⃗u + ⃗v is also in W ; the zero vector ⃗0 is
in W ; for each ⃗u ∈ W, (−⃗u) ∈ W ; for each ⃗u ∈ W , and c ∈ F, the vector c ⃗u ∈ W . The
commutativity and associativity of vector addition and the properties listed in definition
5.1’s 4(a)-(d) related to scalar multiplication are automatically fulfilled as these properties
concern the operations on V . One actually needs to check even less to see if a subset of a
vector space is indeed a subspace.
Theorem 5.1
A non-empty subset W of V is a subspace of V if and only if for each pair of vectors
⃗u, ⃗v ∈ W and each scalar c ∈ F, the vector c ⃗u + ⃗v is again in W .
The set of all Hermitian matrices of size n × n is not a subspace of the vector space Cn×n of (n × n) matrices over C.
The diagonal entries of an n × n Hermitian matrix are all real: just take j = k in equation 5.11,
$$A_{jj} = \overline{A_{jj}},$$
i.e., the Ajj 's are all real for j ∈ {1, 2, . . . , n}. But the diagonal entries iA11 , iA22 , . . . , iAnn of the (n × n) matrix iA are, in general, not real, i.e., if A is a non-zero Hermitian matrix, iA is not Hermitian. Therefore, the set of all (n × n) Hermitian matrices doesn't form a vector subspace of the vector space Cn×n .
I leave it as an exercise for you to check that the set of n × n complex Hermitian matrices
is indeed a vector space over the field of real numbers (with the usual operations).
Example 5.6. The solution space of a system of homogeneous linear equations. Let A ∈ Fm×n ,
an m × n matrix over the field F. Then the set of all n × 1 matrices X over F satisfying AX = 0
is a subspace of Fn×1 , the vector space of all n × 1 matrices over F. To prove this, according to
theorem 5.1, one must show that if AX = 0 and AY = 0, and C ∈ F, then A (CX + Y ) = 0.
This is true because of the following more general facts:
Lemma 5.2
If A is an (m × n) matrix over F and B, C are n × p matrices over F, then:
A (α B + C) = α (AB) + AC, ∀ α ∈ F.
Proof.
$$[A(\alpha B + C)]_{ij} = \sum_{k=1}^{n} A_{ik} (\alpha B + C)_{kj} = \alpha \sum_{k=1}^{n} A_{ik} B_{kj} + \sum_{k=1}^{n} A_{ik} C_{kj} = \alpha (AB)_{ij} + (AC)_{ij} = [\alpha (AB) + AC]_{ij}.$$
■
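A quick numerical sanity check of Lemma 5.2 (a sketch of mine with random matrices, not part of the notes):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))                                   # an m x n matrix
B, C = rng.standard_normal((4, 2)), rng.standard_normal((4, 2))   # two n x p matrices
a = 2.5                                                           # a scalar
print(np.allclose(A @ (a * B + C), a * (A @ B) + A @ C))          # True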
Theorem 5.3
Let V be a vector space over the field F. The intersection of any collection of subspaces of
V is a subspace of V .
Definition 5.4. Let S be a set of vectors in a vector space V over F. The subspace spanned by
S is defined to be the intersection W of all subspaces of V which contains S. When S is a
finite set of vectors,
S = {⃗u1 , ⃗u2 , . . . , ⃗un } .
we shall simply call W the subspace spanned by the vectors ⃗u1 , ⃗u2 , . . . , ⃗un .
Theorem 5.4
The subspace spanned by a non-empty subset S of a vector space V is the set of all linear
combinations of vectors in S.
Proof. Let W be the subspace spanned by S. Then each linear combination ⃗u = c1⃗u1 + c2⃗u2 +
· · · + cn⃗un of vectors ⃗u1 , ⃗u2 , . . . , ⃗un in S is clearly in W . It is because each elements of S belongs
to W and since W is a subspace of V , any F-linear combination of the elements of S will also
be in W according to Theorem 5.1.
Thus, W contains the set L of all linear combinations of vectors in S. On the other hand,
S ⊂ L and hence the set L is non-empty.
From equation 5.14, one sees that c⃗v + w ⃗ is an F -linear combination of the (n + m) vectors
⃗u1 , ⃗u2 , . . . , ⃗un , ⃗s1 , ⃗s2 , . . . , ⃗sm ∈ S.
Thus, we have shown that L is a subspace of V which contains S, and that the subspace W spanned by S, i.e., the intersection of all subspaces of V that contain S, contains L. Hence
W ⊇ L ⊃ S.
On the other hand, since L is itself a subspace of V containing S, the intersection W of all such subspaces satisfies W ⊆ L. Therefore W = L, i.e., W is the set of all linear combinations of vectors in S. ■
Definition 5.5. If S1 , S2 , . . . , Sk are subsets of a vector space V over some field F, the set
of all sums,
⃗u1 + ⃗u2 + · · · + ⃗uk
of vectors ⃗ui ∈ Si is called the sum of the subsets S1 , S2 , . . . , Sk and is denoted by,
$$S_1 + S_2 + \cdots + S_k = \sum_{i=1}^{k} S_i.$$
If W1 , W2 , . . . , Wk are subspaces of V , their sum W = W1 + W2 + · · · + Wk consists of all vectors
$$\vec{v} = \vec{v}_1 + \vec{v}_2 + \cdots + \vec{v}_k = \sum_{i=1}^{k} \vec{v}_i \quad \text{with } \vec{v}_i \in W_i\ \ \forall\, i \in \{1, 2, \ldots, k\}.$$
Let ⃗u = Σi ⃗ui and ⃗v = Σi ⃗vi be two elements of W and let c ∈ F; then
$$c\,\vec{u} + \vec{v} = \sum_{i=1}^{k} (c\,\vec{u}_i + \vec{v}_i). \tag{5.15}$$
Now, since ⃗ui , ⃗vi ∈ Wi for a given i ∈ {1, 2, . . . , k} , for ∀ c ∈ F, one must have c ⃗ui +⃗vi ∈
Wi , as Wi is a subspace of V . Therefore, from equation 5.15, one immediately sees that
c ⃗u + ⃗v ∈ W or in other words, W is a subspace of V that contains each of the subspaces
Wi , ∀ i ∈ {1, 2, . . . , k} . Now, using the arguments used in theorem 5.4, one should be able
to see that W is the subspace spanned by the union of W1 , W2 , . . . , Wk .
Example 5.7. Let F be a given sub-field of the field C of complex numbers. Suppose,
⃗u1 = (1, 2, 0, 3, 0) ;
⃗u2 = (0, 0, 1, 4, 0) ;
⃗u3 = (0, 0, 0, 0, 1) ;
By theorem 5.4, the span of the above 3 vectors is a subspace of the vector space F5 , which
we denote by W . An element of this subspace W can be written as an F-linear combination of
⃗u1 , ⃗u2 and ⃗u3 , i.e., given ⃗u ∈ W , one can find c1 , c2 , c3 ∈ F such that
⃗u = c1 ⃗u1 + c2 ⃗u2 + c3 ⃗u3 = (c1 , 2c1 , c2 , 3c1 + 4c2 , c3 ).
In other words, a vector (x1 , x2 , x3 , x4 , x5 ) belongs to W precisely when
x2 = 2x1 ,
x4 = 3x1 + 4x3 .
One can for example, check that (−3, −6, 1, −5, 2) is in W . On the other hand, (−3, 1, 1, −5, 2)
doesn’t belong to W .
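A tiny membership test for Example 5.7 (my own sketch, not from the notes), using the two conditions just derived:

def in_W(x):
    # (x1, ..., x5) lies in W = span{u1, u2, u3} exactly when x2 = 2*x1 and x4 = 3*x1 + 4*x3
    x1, x2, x3, x4, x5 = x
    return x2 == 2 * x1 and x4 == 3 * x1 + 4 * x3

print(in_W((-3, -6, 1, -5, 2)))   # True:  -3*u1 + 1*u2 + 2*u3
print(in_W((-3, 1, 1, -5, 2)))    # False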
Example 5.8. Let F be a sub-field of the field C of complex numbers. Let W1 be the subset of
F2×2 , the vector space of all 2 × 2 matrices over F, consisting of 2 × 2 matrices of the following
form,
$$\begin{pmatrix} x & y \\ z & 0 \end{pmatrix} \quad \text{with } x, y, z \in F.$$
Also, let W2 be the subset of F2×2 consisting of 2 × 2 matrices of the form
$$\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} \quad \text{where } x, y \in F.$$
Then F2×2 = W1 + W2 : any $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in F^{2\times 2}$ admits a decomposition
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & d \end{pmatrix}, \quad \text{where } \begin{pmatrix} a & b \\ c & 0 \end{pmatrix} \in W_1 \text{ and } \begin{pmatrix} 0 & 0 \\ 0 & d \end{pmatrix} \in W_2.$$
The subspace W1 ∩ W2 , on the other hand, consists of matrices of the form
$$\begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} \quad \text{with } a \in F.$$
Example 5.9. (Row space of a matrix) Let A be an m × n matrix over F. The row vectors of
A are the vectors in Fn given by
⃗αi = (Ai1 , Ai2 , . . . , Ain ), 1 ≤ i ≤ m.
The subspace of Fn spanned by the row vectors of A is called the row space of A. Refer back to Example 5.7. The subspace spanned by the 3 vectors there is actually the row space of the following matrix,
$$A = \begin{pmatrix} 1 & 2 & 0 & 3 & 0 \\ 0 & 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
It is also the row space of the matrix given by
$$B = \begin{pmatrix} 1 & 2 & 0 & 3 & 0 \\ 0 & 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ -4 & -8 & 1 & -8 & 0 \end{pmatrix}.$$
In other words, the row space of A and the row space of B are the same. It is so because, the
last row of B can be written as - 4 times the first row plus the second row.
Suggested Exercises (Hoffman and Kunze) (Page: 39), Problem: 2, 3, 4, 5, 8, 9.
In the case where S contains only finitely many vectors, namely ⃗u1 , ⃗u2 , . . . , ⃗un , one just
says that ⃗u1 , ⃗u2 , . . . , ⃗un are dependent/independent instead of saying that S is
dependent/independent.
Definition 5.7. (Basis of a vector space) Let V be a vector space over the field F. A basis
for V is a linearly independent set of vectors in V which spans the space V . The space V
is finite dimensional if it has a finite basis.
Example 5.10. Let F be a field and in Fn let S be the subset consisting of the vectors
⃗ε1 = (1, 0, . . . , 0) ; ⃗ε2 = (0, 1, 0, . . . , 0) ; . . . . . . . . . ⃗εn = (0, 0, . . . , 1) .
x1 ⃗ε1 + x2 ⃗ε2 + · · · + xn ⃗εn = (x1 , x2 , . . . , xn ) . (5.18)
Hence, an element (x1 , x2 , . . . , xn ) ∈ Fn can be expressed as an F-linear combination of the vectors ⃗ε1 , ⃗ε2 , . . . , ⃗εn , meaning that the set S = {⃗ε1 , ⃗ε2 , . . . , ⃗εn } spans the vector space Fn . Moreover,
x1 ⃗ε1 + x2 ⃗ε2 + · · · + xn ⃗εn = ⃗0 =⇒ (x1 , x2 , . . . , xn ) = (0, 0, . . . , 0) =⇒ x1 = 0, x2 = 0, . . . , xn = 0.
In other words, S is a linearly independent set. Hence, S = {⃗ε1 , ⃗ε2 , . . . , ⃗εn } is a basis for Fn . We
shall call this particular basis the standard basis of Fn .
Example 5.11. At this point, we’ll give an example of an infinite basis. Let F be a sub-field
of the field C of complex numbers and let P be the set of polynomial functions over F. These
functions are functions from F to F which have the form:
f (x) = c0 + c1 x + · · · + cn xn . (5.19)
Let fk (x) = xk , k = 0, 1, 2, . . . . We claim that the infinite set S = {f0 , f1 , . . . } is a basis for
P. Clearly, the set S spans P since a given polynomial function f ∈ P as given by equation
5.19 can be written as:
f = c0 f 0 + c1 f 1 + c2 f 2 · · · + cn f n , (5.20)
for some c0 , c1 , . . . , cn ∈ F. Why is the infinite set S = {f0 , f1 , . . . } linearly independent? To show that the set {f0 , f1 , . . . } is linearly independent is equivalent to showing that each finite subset of it is linearly independent. It will actually suffice to show that the set {f0 , f1 , . . . , fn } is independent for each n. So suppose that c0 f0 + c1 f1 + · · · + cn fn = 0 (the zero function), i.e.,
c0 + c1 x + · · · + cn xn = 0, ∀ x ∈ F, (5.21)
i.e., every x ∈ F is a root of the polynomial in equation 5.21. But we know that a non-zero polynomial of degree at most n with complex coefficients can have at most n distinct roots, while F (being a sub-field of C) has infinitely many elements. Therefore, it follows that c0 = c1 = c2 = · · · = cn = 0.
Hence, the set S = {f0 , f1 , . . . } is linearly independent. Therefore, the set {f0 , f1 , . . . } is a
basis for P.
Theorem 5.5
Let V be a vector space over F which is spanned by a finite set of vectors ⃗u1 , ⃗u2 , . . . , ⃗um .
Then any independent set of vectors in V is finite and contains no more than m elements.
Proof. It suffices to show that any set S containing more than m elements is linearly depen-
dent. Suppose, in S there are distinct vectors ⃗v1 , ⃗v2 , . . . , ⃗vn with n > m. Since the vectors
⃗u1 , ⃗u2 , . . . , ⃗um span V , there are scalars Aij such that,
$$\vec{v}_j = \sum_{i=1}^{m} A_{ij}\, \vec{u}_i, \qquad 1 \le j \le n.$$
For any scalars x1 , x2 , . . . , xn ,
$$x_1 \vec{v}_1 + x_2 \vec{v}_2 + \cdots + x_n \vec{v}_n = \sum_{j=1}^{n} x_j \sum_{i=1}^{m} A_{ij}\, \vec{u}_i = \sum_{j=1}^{n} \sum_{i=1}^{m} (A_{ij} x_j)\, \vec{u}_i = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} A_{ij} x_j \right) \vec{u}_i. \tag{5.22}$$
Given the m × n matrix A (with entries Aij ) over the field F and the n × 1 matrix X (with entries xj , j = 1, 2, . . . , n), the ith row (AX)i of the m × 1 matrix AX is given by
$$(AX)_i = \sum_{j=1}^{n} A_{ij}\, x_j. \tag{5.23}$$
Now, by theorem 3.2 of lecture 3, since m < n for the m × n matrix A, the homogeneous
system of linear equations AX = 0m×1 has a non-trivial solution.
AX = 0m×1 in component form reads off as
$$\sum_{j=1}^{n} A_{ij}\, x_j = 0, \qquad 1 \le i \le m, \tag{5.24}$$
when not all of the scalars x1 , x2 , . . . , xn are 0. Therefore, from equation 5.22 to 5.24, it follows
that, x1⃗v1 + x2⃗v2 + · · · + xn⃗vn = ⃗0 if and only if,
$$\sum_{j=1}^{n} A_{ij}\, x_j = 0, \qquad \forall\, i \in \{1, 2, \ldots, m\}.$$
This means ∃ scalars x1 , x2 , . . . , xn , not all of which are zero, such that x1⃗v1 + x2⃗v2 + · · · + xn⃗vn = ⃗0, i.e., the set S is linearly dependent. ■
As a consequence, any two bases {⃗u1 , ⃗u2 , . . . , ⃗um } and {⃗v1 , ⃗v2 , . . . , ⃗vn } of a finite-dimensional vector space V have the same number of elements. Since the set {⃗u1 , . . . , ⃗um } is a basis, it spans V . In other words, V is spanned by the finite set of
vectors {⃗u1 , ⃗u2 , . . . , ⃗um } . Therefore, by theorem 5.5, every basis of V is finite and contains no
more than m elements. Thus if, {⃗v1 , ⃗v2 , . . . , ⃗vn } is a basis of V , then n ≤ m.
Now, if {⃗v1 , ⃗v2 , . . . , ⃗vn } is a basis of V , then V is spanned by the finite set of vectors {⃗v1 , ⃗v2 , . . . , ⃗vn } .
Then by theorem 5.5, the finite basis {⃗u1 , ⃗u2 , . . . , ⃗um } can’t have more than n elements in it,
i,e., m ≤ n.
Proof. (a) Let {⃗u1 , ⃗u2 , . . . , ⃗um } be a set of m vectors with m > n, i.e., the set above con-
tains more elements than the number of elements in a basis, say, {⃗v1 , ⃗v2 , . . . , ⃗vn } of an
n-dimensional vector space V . Since {⃗v1 , ⃗v2 , . . . , ⃗vn } clearly spans V and m > n, the set
{⃗u1 , ⃗u2 , . . . , ⃗um } can’t be linearly independent by theorem 5.5.
(b) Suppose as in part (a) that {⃗v1 , ⃗v2 , . . . , ⃗vn } is a basis of the vector space V . Now let {⃗w1 , ⃗w2 , . . . , ⃗wm } be a set of vectors in V with m < n. I want to show that {⃗w1 , ⃗w2 , . . . , ⃗wm } does not span V . Let us proceed by contradiction and suppose that {⃗w1 , ⃗w2 , . . . , ⃗wm } spans V . Then each of the vectors in {⃗v1 , ⃗v2 , . . . , ⃗vn } can be expressed as a linear combination of the vectors in {⃗w1 , ⃗w2 , . . . , ⃗wm };
Since m < n, the resulting homogeneous system of m equations in the n unknowns x1 , . . . , xn has a non-trivial solution x1 = b1 , x2 = b2 , . . . , xn = bn by theorem 3.2 of lecture 3, which boils down to b1⃗v1 + b2⃗v2 + · · · + bn⃗vn = ⃗0, where not all the bi 's are zero.
Hence, the set {⃗v1 , ⃗v2 , . . . , ⃗vn } is linearly dependent contradicting the fact that {⃗v1 , ⃗v2 , . . . , ⃗vn }
is a basis of the vector space V .
■
6 Lecture 6
Lemma 6.1
Let S be a linearly independent subset of a vector space V . Suppose ⃗v ∈ V which is
not in the subspace spanned by S. Then the set obtained by adjoining ⃗v to S is linearly
independent.
Proof. Suppose ⃗u1 , ⃗u2 , . . . , ⃗um are distinct vectors in S. Also, suppose that
α1 ⃗u1 + α2 ⃗u2 + · · · + αm ⃗um + β ⃗v = ⃗0.
Then β = 0; otherwise ⃗v = −β−1 (α1 ⃗u1 + · · · + αm ⃗um ) would belong to the subspace spanned by S, contrary to the hypothesis. Hence
α1 ⃗u1 + α2 ⃗u2 + · · · + αm ⃗um = ⃗0. (6.2)
Since ⃗u1 , ⃗u2 , . . . , ⃗um are distinct vectors of the linearly independent set S, one must have α1 = α2 = · · · = αm = 0 in equation 6.2. Hence every vanishing linear combination of ⃗v with distinct vectors of S has all its coefficients zero, i.e., the set obtained by adjoining ⃗v to S is linearly independent. ■
Theorem 6.2
If W is a subspace of a finite-dimensional vector space V , every linearly independent subset
of W is finite and is part of a (finite) basis of W .
Sm = S0 ∪ {⃗w1 , ⃗w2 , . . . , ⃗wm } ,
which is a basis for W . In other words, S0 is a part of a finite basis of W . ■
Corollary 6.3
(Corollary 1 to Theorem 6.2) If W is a proper subspace of a finite-dimensional vector space V , then W is finite dimensional and dim W < dim V .
Proof. We may suppose without loss of generality that W contains a non-zero vector ⃗u. Then by theorem 6.2 and its proof, one can construct a basis of W containing ⃗u such that the basis contains no more than dim V elements. It means that W is finite dimensional and dim W ≤ dim V . Moreover, since W is a proper subspace, there is a vector ⃗v ∈ V which is not in W ; adjoining ⃗v to the basis of W gives a linearly independent subset of V (Lemma 6.1), so dim W + 1 ≤ dim V , i.e., dim W < dim V . ■
Corollary 6.4
(Corollary 2 to Theorem 6.2) In a finite dimensional vector space V , every non-empty
linearly independent set of vectors is part of a basis.
Corollary 6.5
(Corollary 3 to Theorem 6.2) Let A be an n × n matrix over F, suppose the row vectors of
A form a linearly independent set of vectors in Fn . Then A is invertible.
Proof (Exercise).
Theorem 6.6
If W1 and W2 are finite dimensional subspaces of a vector space V , then the sum set,
W1 + W2 = {w ⃗2 | w
⃗1 + w ⃗ 1 ∈ W1 and w
⃗ 2 ∈ W2 } ,
is finite-dimensional and,
Proof. By theorem 6.2 and it’s corollaries, the subspace W1 ∩ W2 of both the finite dimensional
vector spaces W1 and W2 , is finite-dimensional. Suppose it has a finite basis, {⃗u1 , ⃗u2 , . . . , ⃗ul }
which is part of a basis {⃗u1 , ⃗u2 , . . . , ⃗ul , ⃗v1 , ⃗v2 , . . . , ⃗vm } for W1 and part of a basis
{⃗u1 , ⃗u2 , . . . , ⃗ul , ⃗w1 , ⃗w2 , . . . , ⃗wn } for W2 .
The subspace W1 + W2 (it is easy to check that it is a subspace!) is spanned by the vec-
tors belonging to the following set,
{⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm , ⃗w1 , . . . , ⃗wn } .
If
$$\sum_{i=1}^{l} \alpha_i \vec{u}_i + \sum_{j=1}^{m} \beta_j \vec{v}_j + \sum_{k=1}^{n} \gamma_k \vec{w}_k = \vec{0}$$
for αi , βj , γk ∈ F, then
$$-\sum_{k=1}^{n} \gamma_k \vec{w}_k = \sum_{i=1}^{l} \alpha_i \vec{u}_i + \sum_{j=1}^{m} \beta_j \vec{v}_j, \tag{6.5}$$
which shows that $\sum_{k=1}^{n} \gamma_k \vec{w}_k$ belongs to W1 . But $\sum_{k=1}^{n} \gamma_k \vec{w}_k$ also belongs to W2 . Hence $\sum_{k=1}^{n} \gamma_k \vec{w}_k$ belongs to W1 ∩ W2 . So, there are scalars ρi ∈ F such that
$$\sum_{k=1}^{n} \gamma_k \vec{w}_k = \sum_{i=1}^{l} \rho_i \vec{u}_i. \tag{6.6}$$
Now, since {⃗u1 , ⃗u2 , . . . , ⃗ul , ⃗w1 , . . . , ⃗wn } is linearly independent, from equation 6.6 one concludes that
γ1 = γ2 = · · · = γn = ρ1 = ρ2 = · · · = ρl = 0. (6.7)
Using equation 6.7 in equation 6.5, one obtains,
$$\sum_{i=1}^{l} \alpha_i \vec{u}_i + \sum_{j=1}^{m} \beta_j \vec{v}_j = \vec{0}, \tag{6.8}$$
and since {⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm } is also a linearly independent set,
α1 = α2 = · · · = αl = β1 = · · · = βm = 0 (6.9)
one has,
α1 = · · · = αl = β1 = · · · = βm = γ1 = · · · = γn = 0.
Thus, {⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm , ⃗w1 , . . . , ⃗wn } is a linearly independent set. Also, the set above spans W1 + W2 . Hence, {⃗u1 , . . . , ⃗ul , ⃗v1 , . . . , ⃗vm , ⃗w1 , . . . , ⃗wn } is a basis for W1 + W2 . Finally,
dimW1 + dimW2 = (l + m) + (l + n)
= l + (m + l + n)
= dim (W1 ∩ W2 ) + dim (W1 + W2 ) .
■
Suggested Exercises (Hoffman and Kunze) (Page: 48), Problem: 1, 2, 3, 7, 11.
§6.2 Coordinates
Definition 6.1. If V is a finite dimensional vector space, an ordered basis for V is a finite
sequence of vectors which is linearly independent and spans V .
Remark. If a sequence ⃗u1 , ⃗u2 , . . . , ⃗un is an ordered basis, then the set {⃗u1 , ⃗u2 , . . . , ⃗un } is a basis
for V . The ordered basis is the set, together with the specific ordering. In what follows, we
will be engaged in a slight abuse of notation and describe an ordered basis for V as below:
B = {⃗u1 , ⃗u2 , . . . , ⃗un } .
Given ⃗u ∈ V , there is a unique n-tuple (x1 , x2 , . . . , xn ) of scalars such that $\vec{u} = \sum_{i=1}^{n} x_i\, \vec{u}_i$: if also $\vec{u} = \sum_{i=1}^{n} y_i\, \vec{u}_i$, then $\sum_{i=1}^{n} (x_i - y_i)\, \vec{u}_i = \vec{0}$, whence linear independence of the set {⃗u1 , ⃗u2 , . . . , ⃗un } would yield xi = yi , ∀ i ∈ {1, 2, . . . , n} .
We shall call xi the ith coordinate of ⃗u relative to the ordered basis B = {⃗u1 , ⃗u2 , . . . , ⃗un }. If
another vector ⃗v ∈ V can be written as,
$$\vec{v} = \sum_{i=1}^{n} z_i\, \vec{u}_i, \quad \text{then} \quad \vec{u} + \vec{v} = \sum_{i=1}^{n} (x_i + z_i)\, \vec{u}_i, \tag{6.11}$$
so that the ith coordinate of the vector ⃗u + ⃗v in this ordered basis B reads xi + zi . Similarly,
the ith coordinate of (c ⃗u) is c xi .
In this way, each ordered basis for the vector space V over F determines a one-one correspondence
⃗u ↔ (x1 , x2 , . . . , xn )
between the set of all vectors of V and the set of all n-tuples in Fn .
Most of the time, it is more convenient to use the coordinate matrix of ⃗u relative to the ordered basis B,
$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$
To indicate the dependence of this coordinate matrix on the basis, we'll use the symbol [⃗u]B for the coordinate matrix of the vector ⃗u ∈ V relative to the ordered basis B, i.e.,
$$[\vec{u}]_B = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \tag{6.12}$$
Question: What happens to the coordinates of ⃗u ∈ V as we change from one ordered basis of V to another? Suppose that V is n-dimensional and that
B = {⃗u1 , ⃗u2 , . . . , ⃗un } and B′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n }
are two ordered bases for V . There are unique scalars Pij such that
$$\vec{u}\,'_j = \sum_{i=1}^{n} P_{ij}\, \vec{u}_i, \qquad 1 \le j \le n. \tag{6.13}$$
The Pij 's are unique for a given j ∈ {1, 2, . . . , n}, being obtained by writing each ⃗u ′j ∈ V as a unique linear combination of the vectors ⃗ui in the ordered basis B = {⃗u1 , ⃗u2 , . . . , ⃗un }. In particular,
$$[\vec{u}\,'_j]_B = \begin{pmatrix} P_{1j} \\ P_{2j} \\ \vdots \\ P_{nj} \end{pmatrix} \tag{6.14}$$
is the coordinate matrix of ⃗u ′j relative to the ordered basis B. Let x′1 , x′2 , . . . , x′n be the coordinates of a given vector ⃗u ∈ V in the ordered basis B′. Then,
$$\vec{u} = \sum_{j=1}^{n} x'_j\, \vec{u}\,'_j = \sum_{j=1}^{n} x'_j \sum_{i=1}^{n} P_{ij}\, \vec{u}_i = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} P_{ij}\, x'_j \right) \vec{u}_i. \tag{6.15}$$
On the other hand, $\vec{u} = \sum_{i=1}^{n} x_i\, \vec{u}_i$. Since the coordinates x1 , x2 , . . . , xn of ⃗u in the ordered basis B = {⃗u1 , ⃗u2 , . . . , ⃗un } are uniquely determined, it follows from equation 6.15 that
$$x_i = \sum_{j=1}^{n} P_{ij}\, x'_j. \tag{6.16}$$
Recall from equation 6.13 that $[\vec{u}\,'_j]_B = (P_{1j}, P_{2j}, \ldots, P_{nj})^T$. Let P be the n × n matrix whose i, j entry is the scalar Pij given in equation 6.16 or in 6.13. Equations 6.15–6.16 can be written using the matrix P as follows:
$$[\vec{u}]_B = P\, [\vec{u}]_{B'}.$$
Theorem 6.7
Let V be an n-dimensional vector space over the field F, and let B and B′ be two ordered bases of V . Then there is a unique, necessarily invertible, n × n matrix P with entries in F such that
(i) [⃗u]B = P [⃗u]B′ ,
(ii) [⃗u]B′ = P −1 [⃗u]B ,
for every vector ⃗u ∈ V . The columns of P are given by Pj = [⃗u ′j ]B , j = 1, 2, . . . , n.
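A short numerical sketch of Theorem 6.7 (mine, not from the notes), using the rotation matrix of the example that follows: the columns of P are the B-coordinates of the new basis vectors, and the two coordinate vectors of the same ⃗u are related by [⃗u]B = P [⃗u]B′ and [⃗u]B′ = P⁻¹ [⃗u]B.

import numpy as np

theta = 0.3
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # columns: [u'_1]_B and [u'_2]_B

u_B  = np.array([1.0, 2.0])          # coordinates of some u in the standard basis B
u_Bp = np.linalg.solve(P, u_B)       # coordinates of the same u in the basis B'
print(np.allclose(P @ u_Bp, u_B))    # True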
Example 6.1. Given the field R of real numbers and θ ∈ R, the matrix
$$P = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{is invertible with inverse} \quad P^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$
We discussed earlier (observe equation 6.13) that the vectors ⃗u ′j belonging to the ordered basis B′ = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } can be identified with the n-tuples in Fn whose components (coordinates) in the basis B = {⃗u1 , ⃗u2 , . . . , ⃗un } are given by Pij , i.e., $[\vec{u}\,'_j]_B = (P_{1j}, P_{2j}, \ldots, P_{nj})^T$. Hence,
$$[\vec{u}\,'_1]_B = \begin{pmatrix} P_{11} \\ P_{21} \end{pmatrix} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}; \qquad [\vec{u}\,'_2]_B = \begin{pmatrix} P_{12} \\ P_{22} \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}.$$
In other words, one can identify the ordered basis B′ = {⃗u ′1 , ⃗u ′2 } with the pair of columns
$$\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}, \quad \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}$$
of vectors in R2. If ⃗u ∈ V with $[\vec{u}]_B = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and one writes $[\vec{u}]_{B'} = \begin{pmatrix} x'_1 \\ x'_2 \end{pmatrix}$, then
$$[\vec{u}]_{B'} = P^{-1}\, [\vec{u}]_B \;\Longrightarrow\; \begin{pmatrix} x'_1 \\ x'_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$$
or,
x′1 = x1 cos θ + x2 sin θ;
x′2 = −x1 sin θ + x2 cos θ.
Geometrically, one obtains the coordinate matrix of ⃗u relative to the basis B′ = {⃗u ′1 , ⃗u ′2 } by rotating the coordinate matrix of ⃗u relative to the standard basis B = {⃗u1 , ⃗u2 } ≡ $\left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}$.
The coordinate matrix of ⃗u = x1⃗u1 + x2⃗u2 relative to the ordered basis B = {⃗u1 , ⃗u2 } is $[\vec{u}]_B = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. Indeed, verify that $\vec{u}\,'_j = \sum_{i=1}^{n} P_{ij}\, \vec{u}_i$ holds, i.e.,
$$\vec{u}\,'_1 = P_{11}\vec{u}_1 + P_{21}\vec{u}_2 = \cos\theta \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \sin\theta \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix},$$
$$\vec{u}\,'_2 = P_{12}\vec{u}_1 + P_{22}\vec{u}_2 = -\sin\theta \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \cos\theta \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}.$$
On the other hand, in matrix notation $\vec{u}\,'_j = \sum_{i=1}^{n} P_{ij}\, \vec{u}_i$ translates to
$$\begin{pmatrix} \vec{u}\,'_1 \\ \vdots \\ \vec{u}\,'_n \end{pmatrix} = P^T \begin{pmatrix} \vec{u}_1 \\ \vdots \\ \vec{u}_n \end{pmatrix},$$
so that here
$$\begin{pmatrix} \vec{u}_1 \\ \vec{u}_2 \end{pmatrix} = (P^T)^{-1} \begin{pmatrix} \vec{u}\,'_1 \\ \vec{u}\,'_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \vec{u}\,'_1 \\ \vec{u}\,'_2 \end{pmatrix},$$
or,
⃗u1 = (cos θ) ⃗u ′1 − (sin θ) ⃗u ′2 , and
⃗u2 = (sin θ) ⃗u ′1 + (cos θ) ⃗u ′2 .
Therefore, one verifies that
$$\vec{u}_1 = \cos\theta \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} - \sin\theta \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad \vec{u}_2 = \sin\theta \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} + \cos\theta \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
Example 6.2. Let F be a sub-field of the field C of complex numbers. It's an easy exercise for you to find that
$$P = \begin{pmatrix} -1 & 4 & 5 \\ 0 & 2 & -3 \\ 0 & 0 & 8 \end{pmatrix}$$
is invertible with inverse
$$P^{-1} = \begin{pmatrix} -1 & 2 & \frac{11}{8} \\ 0 & \frac12 & \frac{3}{16} \\ 0 & 0 & \frac18 \end{pmatrix}.$$
Now using equation 6.14, $[\vec{u}\,'_j]_B = (P_{1j}, P_{2j}, \ldots, P_{nj})^T$, one obtains
$$[\vec{u}\,'_1]_B = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix}; \quad [\vec{u}\,'_2]_B = \begin{pmatrix} 4 \\ 2 \\ 0 \end{pmatrix}; \quad [\vec{u}\,'_3]_B = \begin{pmatrix} 5 \\ -3 \\ 8 \end{pmatrix};$$
the 3 vectors above form a basis B′ of F3. Here B is the standard basis of F3.
We also know that
$$\begin{pmatrix} \vec{u}\,'_1 \\ \vec{u}\,'_2 \\ \vec{u}\,'_3 \end{pmatrix} = P^T \begin{pmatrix} \vec{u}_1 \\ \vec{u}_2 \\ \vec{u}_3 \end{pmatrix}, \quad \text{or} \quad \begin{pmatrix} \vec{u}_1 \\ \vec{u}_2 \\ \vec{u}_3 \end{pmatrix} = (P^T)^{-1} \begin{pmatrix} \vec{u}\,'_1 \\ \vec{u}\,'_2 \\ \vec{u}\,'_3 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 \\ 2 & \frac12 & 0 \\ \frac{11}{8} & \frac{3}{16} & \frac18 \end{pmatrix} \begin{pmatrix} \vec{u}\,'_1 \\ \vec{u}\,'_2 \\ \vec{u}\,'_3 \end{pmatrix},$$
or,
$$\vec{u}_1 = -\vec{u}\,'_1, \qquad \vec{u}_2 = 2\vec{u}\,'_1 + \tfrac12 \vec{u}\,'_2, \qquad \vec{u}_3 = \tfrac{11}{8}\vec{u}\,'_1 + \tfrac{3}{16}\vec{u}\,'_2 + \tfrac18 \vec{u}\,'_3.$$
Therefore,
$$\vec{u}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}; \quad \vec{u}_2 = \begin{pmatrix} -2 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}; \quad \vec{u}_3 = \tfrac{11}{8}\begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} + \tfrac{3}{16}\begin{pmatrix} 4 \\ 2 \\ 0 \end{pmatrix} + \tfrac18\begin{pmatrix} 5 \\ -3 \\ 8 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
Therefore, indeed B is the standard basis $\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}$ of F3. The coordinates x′1 , x′2 , x′3 of the vector ⃗u relative to the basis B′ are given by
[⃗u]B′ = P −1 [⃗u]B :
$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} -1 & 2 & \frac{11}{8} \\ 0 & \frac12 & \frac{3}{16} \\ 0 & 0 & \frac18 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \quad \text{[By theorem 6.7 (ii)]}.$$
Therefore,
$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} -x_1 + 2x_2 + \frac{11}{8} x_3 \\ \frac12 x_2 + \frac{3}{16} x_3 \\ \frac18 x_3 \end{pmatrix}. \tag{6.19}$$
We know that
x1⃗u1 + x2⃗u2 + x3⃗u3 = ⃗u = x′1⃗u ′1 + x′2⃗u ′2 + x′3⃗u ′3 . (6.20)
Choose, for example,
x1 = 3, x2 = 2 and x3 = −8. (6.21)
From equation 6.19 it then follows that
x′1 = −10, x′2 = −1/2, x′3 = −1. (6.22)
Since {⃗u1 , ⃗u2 , ⃗u3 } = B is the standard basis for F3, by plugging in the values for x1 , x2 , x3 and x′1 , x′2 , x′3 from equations 6.21 and 6.22 into equation 6.20, one then obtains
$$\begin{pmatrix} 3 \\ 2 \\ -8 \end{pmatrix} = -10\,\vec{u}\,'_1 - \tfrac12\,\vec{u}\,'_2 - \vec{u}\,'_3.$$
Let A be an m × n matrix over the field F, with row vectors
⃗αi = (Ai1 , Ai2 , . . . , Ain ), 1 ≤ i ≤ m,
and the row space of A is defined to be the subspace of Fn spanned by these m vectors. The
row rank of A is the dimension of the row space of A. If P is a k × m matrix, then B = P A is
a k × n matrix whose row vectors β⃗1 , . . . , β⃗k are linear combinations of the rows of A:
$$\vec{\beta}_i = \sum_{j=1}^{m} P_{ij}\, \vec{\alpha}_j, \qquad 1 \le i \le k. \tag{6.23}$$
Thus, one immediately sees that the row space of B is a subspace of the row space of A. If k = m and P is an invertible m × m matrix, then the analog of equation 6.23 exists expressing each row ⃗αi of the m × n matrix A as a linear combination of the rows ⃗βj of the m × n matrix B = P A, establishing the row-equivalence of the two matrices A and B. In this case, one finds that the row space of A is a subspace of the row space of B. One, therefore, has the following
theorem.
Theorem 6.8
Row-equivalent matrices have the same row space. Therefore, we see that studying the row
space of a given matrix A is equivalent to studying the row space of a row-reduced echelon
matrix which is row-equivalent to A.
We now proceed to study the row space of a row-reduced echelon matrix by means of
the following theorem whose proof is omitted.
Theorem 6.9
Let R be a non-zero row-reduced echelon matrix. Then, the non-zero row vectors of R
form a basis for the row space of R.
Corollary 6.10
Each m × n matrix A is row-equivalent to one and only one row-reduced echelon matrix.
Corollary 6.11
Let A and B be m × n matrices over the field F. Then A and B are row-equivalent
if and only if they have the same row space.
§6.3.i To summarize
If A and B are m × n matrices over F, then the following statements are equivalent:
1. A and B are row-equivalent.
2. A and B have the same row space.
3. B = P A, where P is an invertible m × m matrix.
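The following small numpy sketch (ours, not from the notes) illustrates the equivalence of statements 2 and 3: if B = PA with P invertible, then A and B have the same row space. We test "same row space" by comparing the ranks of A, B and the stacked matrix [A; B].

    import numpy as np

    A = np.array([[ 1.0, 2.0,  2.0, 1.0],
                  [ 0.0, 2.0,  0.0, 1.0],
                  [-2.0, 0.0, -4.0, 3.0]])
    P = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 2.0],
                  [0.0, 0.0, 1.0]])   # invertible (upper triangular, nonzero diagonal)
    B = P @ A

    r = np.linalg.matrix_rank
    print(r(A), r(B), r(np.vstack([A, B])))   # 3 3 3: equal ranks, so equal row spaces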
§6.3.ii Computations concerning subspaces
Suppose W is the subspace spanned by given vectors $\vec{\alpha}_1, \ldots, \vec{\alpha}_m$, where $\vec{\alpha}_i = (A_{i1}, A_{i2}, \ldots, A_{in})$. Form the m × n matrix A with rows $\vec{\alpha}_i$ and let R be the row-reduced echelon matrix which is row-equivalent to A. If $\vec{\rho}_1, \ldots, \vec{\rho}_r$ are the non-zero row vectors of R, then $B = \{\vec{\rho}_1, \ldots, \vec{\rho}_r\}$ is a basis for W.
Schematically, R has the form
$$R = \begin{pmatrix}
0 & \cdots & 0 & 1 & * & \cdots & 0 & \cdots & 0 & \cdots & * \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1 & \cdots & 0 & \cdots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & \cdots & 1 & \cdots & * \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0
\end{pmatrix},$$
where the leading 1 of row $\vec{\rho}_i$ sits in column $k_i$, the entries marked $*$ are arbitrary scalars, and the last m − r rows are zero. There are r such non-zero rows, each starting with a leading 1. If the first non-zero coordinate (the leading 1) of the row $\vec{\rho}_i$ occurs in the $k_i$-th column of R, then for $i \le r$ (the first r rows) we have the defining properties of a row-reduced echelon matrix R:
(a) $R_{ij} = 0$, if $j < k_i$;
(b) $R_{i k_j} = \delta_{ij}$;
(c) $k_1 < \cdots < k_r$.
Now, get back to equation 6.24,
$$b_j = \sum_{i=1}^{r} C_i R_{ij},$$
take $j = k_s$ (for some $s \le r$) in the above equation and use property (b) of the row-reduced echelon matrix R:
$$b_{k_s} = \sum_{i=1}^{r} C_i R_{i k_s} = C_s. \tag{6.25}$$
Consider the subspace W spanned by the vectors
$$\vec{\alpha}_1 = (1, 2, 2, 1); \qquad \vec{\alpha}_2 = (0, 2, 0, 1); \qquad \vec{\alpha}_3 = (-2, 0, -4, 3).$$
(b) Let $\vec{\beta} = (b_1, b_2, b_3, b_4)$ be a vector in W. What are the coordinates of $\vec{\beta}$ relative to the ordered basis $\{\vec{\alpha}_1, \vec{\alpha}_2, \vec{\alpha}_3\}$?
(c) Let
$$\vec{\alpha}'_1 = (1, 0, 2, 0), \qquad \vec{\alpha}'_2 = (0, 2, 0, 1), \qquad \vec{\alpha}'_3 = (0, 0, 0, 3).$$
Show that $\vec{\alpha}'_1, \vec{\alpha}'_2, \vec{\alpha}'_3$ form a basis for W.
(d) If $\vec{\beta} \in W$, let X denote the coordinate matrix of $\vec{\beta}$ relative to the $\alpha$-basis and $X'$ the coordinate matrix of $\vec{\beta}$ relative to the $\alpha'$-basis. Find the 3 × 3 matrix P such that $X = P X'$ for every such $\vec{\beta} \in W$.
To answer these questions we form the matrix A with row vectors $\vec{\alpha}_1, \vec{\alpha}_2, \vec{\alpha}_3$, find the row-reduced echelon matrix R which is row-equivalent to A, and simultaneously perform the same operations on the 3 × 3 identity matrix to obtain the invertible matrix Q such that R = QA.
$$\begin{pmatrix} 1 & 2 & 2 & 1 \\ 0 & 2 & 0 & 1 \\ -2 & 0 & -4 & 3 \end{pmatrix}
\xrightarrow{2R_1 + R_3 = R_3'}
\begin{pmatrix} 1 & 2 & 2 & 1 \\ 0 & 2 & 0 & 1 \\ 0 & 4 & 0 & 5 \end{pmatrix}
\xrightarrow[\,-2R_2 + R_3 = R_3'\,]{-R_2 + R_1 = R_1'}
\begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 2 & 0 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix}$$
$$\xrightarrow{\frac{1}{3}R_3 = R_3'}
\begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 2 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\xrightarrow{-R_3 + R_2 = R_2'}
\begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\xrightarrow{\frac{1}{2}R_2 = R_2'}
\begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = R.$$
Applying the same elementary row operations to the 3 × 3 identity matrix:
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\xrightarrow{2R_1 + R_3 = R_3'}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}
\xrightarrow[\,-2R_2 + R_3 = R_3'\,]{-R_2 + R_1 = R_1'}
\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 2 & -2 & 1 \end{pmatrix}$$
$$\xrightarrow{\frac{1}{3}R_3 = R_3'}
\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ \tfrac{2}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}
\xrightarrow{-R_3 + R_2 = R_2'}
\begin{pmatrix} 1 & -1 & 0 \\ -\tfrac{2}{3} & \tfrac{5}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}
\xrightarrow{\frac{1}{2}R_2 = R_2'}
\begin{pmatrix} 1 & -1 & 0 \\ -\tfrac{1}{3} & \tfrac{5}{6} & -\tfrac{1}{6} \\ \tfrac{2}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}
= \frac{1}{6}\begin{pmatrix} 6 & -6 & 0 \\ -2 & 5 & -1 \\ 4 & -4 & 2 \end{pmatrix} = Q.$$
$$Q = e_s\left(\ldots e_2\left(e_1\left(I\right)\right)\ldots\right).$$
If $E_i$ denotes the elementary matrix corresponding to the elementary row operation $e_i$, then
$$e_1(A) = E_1 A, \qquad e_2(e_1(A)) = E_2 E_1 A, \qquad \ldots, \qquad
e_s(\ldots e_2(e_1(A))\ldots) = E_s \cdots E_2 E_1\, A.$$
But,
$$e_s(\ldots e_2(e_1(A))\ldots) = R,$$
where R is the row-reduced echelon matrix.
$$\therefore\ R = (E_s \cdots E_2 E_1)\, A = QA,$$
so we obtain Q by applying the same set of ERO's to I as the ones applied to A to obtain the row-reduced echelon matrix R.
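A short sympy sketch of this computation (ours, not part of the notes; it assumes sympy is available). Row-reducing the augmented matrix [A | I] performs the same ERO's on A and on I simultaneously, so the result is [R | Q]; this works here because A has full row rank, so all pivots fall inside the A-block.

    import sympy as sp

    A = sp.Matrix([[ 1, 2,  2, 1],
                   [ 0, 2,  0, 1],
                   [-2, 0, -4, 3]])
    aug, _ = sp.Matrix.hstack(A, sp.eye(3)).rref()
    R, Q = aug[:, :4], aug[:, 4:]
    print(R)           # Matrix([[1, 0, 2, 0], [0, 1, 0, 0], [0, 0, 0, 1]])
    print(Q)           # Matrix([[1, -1, 0], [-1/3, 5/6, -1/6], [2/3, -2/3, 1/3]])
    print(Q * A == R)  # True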
where $k_1 = 1$ for the leading 1 in row $\vec{\rho}_1$, $k_2 = 2$ for the leading 1 in row $\vec{\rho}_2$, and $k_3 = 4$ for the leading 1 in row $\vec{\rho}_3$. Using equation 6.26,
$$\vec{\beta} = \sum_{i=1}^{3} b_{k_i}\, \vec{\rho}_i = b_{k_1}\vec{\rho}_1 + b_{k_2}\vec{\rho}_2 + b_{k_3}\vec{\rho}_3 = b_1\vec{\rho}_1 + b_2\vec{\rho}_2 + b_4\vec{\rho}_3,$$
so that $\vec{\beta} = (b_1, b_2, 2b_1, b_4)$; that is, the span of $\vec{\rho}_1, \vec{\rho}_2, \vec{\rho}_3$ consists of the vectors $\vec{\beta} = (b_1, b_2, b_3, b_4)$ for which $b_3 = 2b_1$.
Now, $\vec{\beta} = [b_1\ b_2\ b_4]\, R$, where R is the row-reduced echelon matrix whose non-zero rows are $\vec{\rho}_1, \vec{\rho}_2, \vec{\rho}_3$.
Therefore,
$$\vec{\beta} = [b_1\ b_2\ b_4]\, R = [b_1\ b_2\ b_4]\, QA = [b_1\ b_2\ b_4]\, Q \begin{pmatrix} \vec{\alpha}_1 \\ \vec{\alpha}_2 \\ \vec{\alpha}_3 \end{pmatrix};
\qquad Q = [Q_1\ Q_2\ Q_3],$$
where $Q_1, Q_2, Q_3$ are the columns of Q. Hence the row vector $[x_1\ x_2\ x_3] = [b_1\ b_2\ b_4]\, Q$, i.e. $x_i = [b_1\ b_2\ b_4]\, Q_i$, gives the coordinates of $\vec{\beta}$ relative to the ordered basis $\{\vec{\alpha}_1, \vec{\alpha}_2, \vec{\alpha}_3\}$. One gets:
$$x_1 = b_1 - \tfrac{1}{3} b_2 + \tfrac{2}{3} b_4; \qquad
x_2 = -b_1 + \tfrac{5}{6} b_2 - \tfrac{2}{3} b_4; \qquad
x_3 = -\tfrac{1}{6} b_2 + \tfrac{1}{3} b_4. \tag{6.28}$$
Take $\vec{\beta} = \vec{\alpha}'_1 = [1\ 0\ 2\ 0]$. We want $[1\ 0\ 2\ 0] = b_1 [1\ 0\ 2\ 0] + b_2 [0\ 1\ 0\ 0] + b_4 [0\ 0\ 0\ 1]$, which means $b_1 = 1$, $b_2 = 0$, $b_4 = 0$. Plugging these values into equation 6.28, one gets
$$x_1 = 1; \qquad x_2 = -1; \qquad x_3 = 0. \tag{6.29}$$
From equation 6.29, it follows that
$$\vec{\alpha}'_1 = x_1 \vec{\alpha}_1 + x_2 \vec{\alpha}_2 + x_3 \vec{\alpha}_3 = \vec{\alpha}_1 - \vec{\alpha}_2.$$
Now, take $\vec{\beta} = \vec{\alpha}'_2 = [0\ 2\ 0\ 1]$, so that $b_1 = 0$, $b_2 = 2$, $b_4 = 1$. Equation 6.28 gives $x_1 = 0$, $x_2 = 1$, $x_3 = 0$, i.e.,
$$\vec{\alpha}'_2 = x_1 \vec{\alpha}_1 + x_2 \vec{\alpha}_2 + x_3 \vec{\alpha}_3 = \vec{\alpha}_2.$$
Finally, take $\vec{\beta} = \vec{\alpha}'_3 = [0\ 0\ 0\ 3]$, so that $b_1 = 0$, $b_2 = 0$, $b_4 = 3$. Equation 6.28 now gives
$$x_1 = 2; \qquad x_2 = -2; \qquad x_3 = 1, \tag{6.31}$$
so that
$$\vec{\alpha}'_3 = x_1 \vec{\alpha}_1 + x_2 \vec{\alpha}_2 + x_3 \vec{\alpha}_3 = 2\vec{\alpha}_1 - 2\vec{\alpha}_2 + \vec{\alpha}_3.$$
Since $\vec{\alpha}'_1, \vec{\alpha}'_2, \vec{\alpha}'_3$ are 3 vectors of W expressed in terms of the basis $\{\vec{\alpha}_1, \vec{\alpha}_2, \vec{\alpha}_3\}$ with an invertible coefficient matrix (written below as P), they are linearly independent and therefore form a basis for W; this answers (c).
For part (d), the coordinate matrices are therefore related by $X = P X'$ with
$$P = \begin{pmatrix} 1 & 0 & 2 \\ -1 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix},$$
the columns of P being the coordinates of $\vec{\alpha}'_1, \vec{\alpha}'_2, \vec{\alpha}'_3$ relative to the $\alpha$-basis.
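A numpy sketch (ours, not from the notes) recovering the matrix P numerically. For each primed vector we solve for its coordinates relative to {α₁, α₂, α₃}; least squares gives the exact answer here because each α'ⱼ lies in W.

    import numpy as np

    alpha  = np.array([[ 1.0, 2.0,  2.0, 1.0],
                       [ 0.0, 2.0,  0.0, 1.0],
                       [-2.0, 0.0, -4.0, 3.0]])
    alphap = np.array([[1.0, 0.0, 2.0, 0.0],
                       [0.0, 2.0, 0.0, 1.0],
                       [0.0, 0.0, 0.0, 3.0]])

    # Column j of P: coordinates x with alpha.T @ x = alpha'_j.
    P = np.column_stack([np.linalg.lstsq(alpha.T, alphap[j], rcond=None)[0]
                         for j in range(3)])
    print(np.round(P, 6))    # [[ 1.  0.  2.]
                             #  [-1.  1. -2.]
                             #  [ 0.  0.  1.]]

For any β in W with primed coordinate matrix X', one then checks that P @ X' reproduces the unprimed coordinate matrix X, as required in (d).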
Suggested Exercises (Hoffman and Kunze) (Page: 66), Problem: 2, 3, 4, 5, 6.
7 Lecture 7
Definition 7.1. Let V and W be vector spaces over the field F. A linear transformation from V to W is a function T : V → W satisfying
$$T\left(c\,\vec{u} + \vec{v}\right) = c\,(T\vec{u}) + T\vec{v}, \qquad \text{for all } \vec{u}, \vec{v} \in V \text{ and all scalars } c \in F.$$
Example 7.1. If V is any vector space, the identity transformation I, defined by $I\vec{u} = \vec{u},\ \forall\, \vec{u} \in V$, is a linear transformation from V to itself. The zero transformation $\hat{0} : V \to V$, given by $\hat{0}\vec{u} = \vec{0} \in V,\ \forall\, \vec{u} \in V$, is a linear transformation from V to itself.
Example 7.2. Let F be a field and let P be the vector space of polynomial functions over F. Such a polynomial function f is given by
$$f(x) = C_0 + C_1 x + \cdots + C_k x^k, \qquad C_i \in F,\ \forall\, i \in \{0, 1, \ldots, k\}.$$
Define D by
$$(Df)(x) = C_1 + 2C_2 x + \cdots + k\,C_k x^{k-1}.$$
Verify that D is a linear transformation from P to itself. The linear transformation D is called the differentiation transformation.
Example 7.3. Let A be a fixed (m × n) matrix with entries from the field F. The function $T : F^{n \times 1} \to F^{m \times 1}$ defined by $T(\vec{u}) = A\,\vec{u} \in F^{m \times 1},\ \forall\, \vec{u} \in F^{n \times 1}$, is a linear transformation from $F^{n \times 1}$ to $F^{m \times 1}$. Also, the function $U : F^{m} \to F^{n}$ defined by $U(\vec{v}) = \vec{v}\,A \in F^{n},\ \forall\, \vec{v} \in F^{m}$, is a linear transformation from $F^{m}$ to $F^{n}$. Here, $F^{m}$ is shorthand notation for $F^{1 \times m}$, the vector space of row vectors with m entries each from the field F.
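A quick numpy sketch (ours; the matrix and vectors below are arbitrary choices, not from the notes) of the first map in Example 7.3: left multiplication by a fixed matrix A is linear.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 4))        # a fixed 3x4 matrix: T : F^{4x1} -> F^{3x1}
    u, v = rng.standard_normal(4), rng.standard_normal(4)
    c = 2.5
    # Linearity: A(c u + v) = c (A u) + A v
    print(np.allclose(A @ (c*u + v), c*(A @ u) + A @ v))   # True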
Remark. 1. If T : V → W is a linear transformation, then $T\vec{0}_V = \vec{0}_W$. Here, $\vec{0}_V$ is the zero vector of the vector space V and $\vec{0}_W$ is the zero vector of the vector space W.
2. If T : V → W is a linear transformation, then T 'preserves' linear combinations; that is, if $\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n$ are vectors in V and $c_1, c_2, \ldots, c_n$ are scalars, then
$$T\left(c_1 \vec{u}_1 + c_2 \vec{u}_2 + \cdots + c_n \vec{u}_n\right) = c_1 (T\vec{u}_1) + c_2 (T\vec{u}_2) + \cdots + c_n (T\vec{u}_n).$$
Theorem 7.1
Let V be a finite-dimensional vector space over F and let $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ be an ordered basis for V. Let W be a vector space over the same field F and let $\vec{w}_1, \vec{w}_2, \ldots, \vec{w}_n$ be any vectors in W. Then there is a unique linear transformation T : V → W such that
$$T\vec{u}_j = \vec{w}_j, \qquad j = 1, 2, \ldots, n. \tag{7.2}$$
Proof. Existence: Let $\vec{u} \in V$. Denote by $B = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ the given ordered basis of V. Let the coordinate matrix of the vector $\vec{u} \in V$ relative to the basis B be given by
$$[\vec{u}]_B = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},$$
that is, $\vec{u} = x_1 \vec{u}_1 + x_2 \vec{u}_2 + \cdots + x_n \vec{u}_n$ (7.3). Define T by
$$T\vec{u} = x_1 \vec{w}_1 + x_2 \vec{w}_2 + \cdots + x_n \vec{w}_n. \tag{7.4}$$
In particular $T\vec{u}_j = \vec{w}_j$ for each j. It remains to show that T is linear. Let $\vec{v} \in V$ have the basis expansion $\vec{v} = y_1 \vec{u}_1 + y_2 \vec{u}_2 + \cdots + y_n \vec{u}_n$ (7.5). Now, for $c \in F$,
$$c\,\vec{u} + \vec{v} = (c x_1 + y_1)\vec{u}_1 + \cdots + (c x_n + y_n)\vec{u}_n$$
[Using equations 7.3 and 7.5]. By the definition of T provided in equation 7.4,
$$T(c\,\vec{u} + \vec{v}) = \sum_{j=1}^{n} (c x_j + y_j)\vec{w}_j = c \sum_{j=1}^{n} x_j \vec{w}_j + \sum_{j=1}^{n} y_j \vec{w}_j = c\,(T\vec{u}) + T\vec{v},$$
so T is linear. Uniqueness: suppose K : V → W is a linear transformation with
$$K\vec{u}_j = \vec{w}_j, \qquad j = 1, 2, \ldots, n. \tag{7.9}$$
Then for $\vec{u} = \sum_j x_j \vec{u}_j$, linearity gives $K\vec{u} = \sum_j x_j (K\vec{u}_j) = \sum_j x_j \vec{w}_j = T\vec{u}$, so K = T. ■
$$\implies 2c_2 = 2, \qquad \therefore\ c_2 = 1 \text{ and } c_1 = -2.$$
$$\therefore\ (1, 0) = -2\,\vec{u}_1 + \vec{u}_2, \text{ so that } T(1, 0) = -2\,T\vec{u}_1 + T\vec{u}_2 = -2\,(3, 2, 1) + (6, 5, 4) = (0, 1, 2)\quad \text{[using equation 7.11]}.$$
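The recipe of theorem 7.1 is easy to automate. The numpy sketch below is ours, and the particular basis vectors u1, u2 in it are our own illustrative choices (not the ones used in the example above): a linear T is fully determined by prescribing its values on a basis.

    import numpy as np

    U = np.column_stack([(1.0, 1.0), (1.0, -1.0)])             # columns: a basis u1, u2 of R^2 (our choice)
    W = np.column_stack([(3.0, 2.0, 1.0), (6.0, 5.0, 4.0)])    # prescribed values w1 = T u1, w2 = T u2

    def T(v):
        c = np.linalg.solve(U, v)   # coordinates of v in the basis {u1, u2}
        return W @ c                # T v = c1 w1 + c2 w2

    print(T(np.array([1.0, 1.0])))  # [3. 2. 1.] = w1, as required
    print(T(np.array([1.0, 0.0])))  # [4.5 3.5 2.5]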
2 interesting subspaces that arise in the context of a linear transformation T : V → W are the range of T, denoted $R_T \subset W$, and the kernel of T, denoted $\operatorname{Ker} T \subset V$. Let us quickly verify that $R_T = \{T\vec{v} \mid \vec{v} \in V\}$ is indeed a subspace of W and that $\operatorname{Ker} T = \{\vec{v} \in V \mid T\vec{v} = \vec{0}_W\}$ is a vector subspace of V.
We will use theorem 5.1 of lecture 5 for this purpose. Let $\vec{u}_1, \vec{u}_2 \in R_T$; then from the definition of $R_T$, there are vectors $\vec{v}_1, \vec{v}_2 \in V$ s.t. $T\vec{v}_1 = \vec{u}_1$ and $T\vec{v}_2 = \vec{u}_2$. Now, since T : V → W is a linear transformation, for a given $c \in F$,
$$T(c\,\vec{v}_1 + \vec{v}_2) = c\,(T\vec{v}_1) + T\vec{v}_2 = c\,\vec{u}_1 + \vec{u}_2.$$
Thus, we see that there exists $c\,\vec{v}_1 + \vec{v}_2 \in V$ s.t. $T(c\,\vec{v}_1 + \vec{v}_2) = c\,\vec{u}_1 + \vec{u}_2$, proving that $c\,\vec{u}_1 + \vec{u}_2 \in R_T$. Hence, by theorem 5.1 of lecture 5, $R_T$, the range of the linear transformation T, is a vector subspace of W. The verification for $\operatorname{Ker} T$ is entirely similar.
Definition 7.2. Let V and W be vector spaces over F, and let T be a linear transformation
from V to W . The null space of T is the set of all vectors ⃗u ∈ V such that T ⃗u = ⃗0W . If V
is finite dimensional, the rank of T is the dimension of the range of T and the nullity of T
is the dimension of the null-space of T .
Theorem 7.2
Let V and W be vector spaces over the field F and T : V → W be a linear transformation. Suppose V is finite dimensional. Then,
$$\operatorname{rank}(T) + \operatorname{nullity}(T) = \dim V.$$
Proof. Let $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ be a basis for $\operatorname{Ker} T$, the null space of T. We know $\operatorname{Ker} T$ is a subspace of V. If $\dim V = n$, then there are vectors $\{\vec{u}_{k+1}, \vec{u}_{k+2}, \ldots, \vec{u}_n\}$ such that $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is a basis for V. We will now prove that $\{T\vec{u}_{k+1}, \ldots, T\vec{u}_n\}$ is a basis for $R_T$, the range of T.
First of all, $R_T = \operatorname{Span}\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_n\}$. This can easily be seen: given $\vec{w} \in R_T$, $\exists\, \vec{u} \in V$ s.t. $T\vec{u} = \vec{w}$. But since $\{\vec{u}_1, \ldots, \vec{u}_n\}$ is a basis for V, $\vec{u} = \sum_{i=1}^{n} c_i \vec{u}_i$ for some scalars $c_i \in F$, leading to $T\vec{u} = \sum_{i=1}^{n} c_i\, T\vec{u}_i$, using linearity of T. Therefore, given $\vec{w} \in R_T$, there exist scalars $c_i$ such that
$$\vec{w} = \sum_{i=1}^{n} c_i\, T\vec{u}_i, \quad \text{i.e.,} \quad R_T = \operatorname{Span}\{T\vec{u}_1, \ldots, T\vec{u}_n\}.$$
Now, since $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ is a basis for $\operatorname{Ker} T$, $T\vec{u}_j = \vec{0}_W$ for $1 \le j \le k$, and hence we can further strengthen our result by stating that $R_T = \operatorname{Span}\{T\vec{u}_{k+1}, T\vec{u}_{k+2}, \ldots, T\vec{u}_n\}$. Now suppose,
$$\sum_{i=k+1}^{n} c_i\, T\vec{u}_i = \vec{0}_W, \qquad \text{for scalars } c_i \in F,\ k+1 \le i \le n. \tag{7.14}$$
By linearity, $T\left(\sum_{i=k+1}^{n} c_i \vec{u}_i\right) = \vec{0}_W$. Therefore,
$$\vec{v} = \sum_{i=k+1}^{n} c_i\, \vec{u}_i \in \operatorname{Ker} T. \tag{7.15}$$
But, according to the hypothesis, $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ forms a basis for $\operatorname{Ker} T$. Hence, there are scalars $b_1, b_2, \ldots, b_k$ such that
$$\vec{v} = \sum_{i=1}^{k} b_i\, \vec{u}_i. \tag{7.16}$$
Gathering equations 7.15 and 7.16 together, one sees $\sum_{i=1}^{k} b_i \vec{u}_i = \sum_{j=k+1}^{n} c_j \vec{u}_j$,
$$\implies \sum_{i=1}^{k} b_i\, \vec{u}_i - \sum_{j=k+1}^{n} c_j\, \vec{u}_j = \vec{0}_V. \tag{7.17}$$
Since $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is a linearly independent set, one must have
$$b_1 = b_2 = \cdots = b_k = c_{k+1} = \cdots = c_n = 0.$$
Thus the only solution of equation 7.14 is $c_{k+1} = c_{k+2} = \cdots = c_n = 0$, which means the set $\{T\vec{u}_{k+1}, T\vec{u}_{k+2}, \ldots, T\vec{u}_n\}$ is linearly independent. Hence, $\{T\vec{u}_{k+1}, \ldots, T\vec{u}_n\}$ forms a basis of $R_T$, the range of the linear transformation T. One immediately finds that the rank r of T is given by r = n − k.
Remember: we chose $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ as a basis for $\operatorname{Ker} T$, which means that the nullity of T is k. Additionally, since n is the dimension of V, r = n − k translates to
$$\operatorname{rank}(T) + \operatorname{nullity}(T) = \dim V.$$
■
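A small sympy sketch (ours, with an arbitrary matrix of our own choosing) illustrating the rank-nullity theorem for the map T(u) = Au on column vectors:

    import sympy as sp

    A = sp.Matrix([[1, 2, 0, 1, -1],
                   [0, 0, 1, 2,  3],
                   [1, 2, 1, 3,  2]])      # third row = first + second, so rank 2
    rank    = A.rank()
    nullity = len(A.nullspace())           # dimension of the kernel of u -> A u
    print(rank, nullity, rank + nullity)   # 2 3 5  (= number of columns, i.e. dim of the domain)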
Suggested Exercises (Hoffman and Kunze) (Page: 73), Problem: 2, 3, 7, 8, 13.
Theorem 7.3
Let V and W be vector spaces over the field F. Let T and U be linear transformations from V to W. The function (T + U) defined by
$$(T + U)(\vec{v}) = T\vec{v} + U\vec{v}, \qquad \vec{v} \in V, \tag{7.18}$$
is a linear transformation from V to W. If $c \in F$, the function (cT) defined by
$$(cT)(\vec{v}) = c\,(T\vec{v}), \qquad \vec{v} \in V, \tag{7.19}$$
is a linear transformation from V to W. The set of all linear transformations from V to W, together with the addition and scalar multiplication defined above, is a vector space over F.
Proof. For $c \in F$ and $\vec{v}, \vec{w} \in V$,
$$(T + U)(c\,\vec{v} + \vec{w}) = T(c\,\vec{v} + \vec{w}) + U(c\,\vec{v} + \vec{w})$$
$$= c\,(T\vec{v}) + T\vec{w} + c\,(U\vec{v}) + U\vec{w}; \quad \text{[using linearity of T and U]}$$
$$= c\,(T\vec{v} + U\vec{v}) + T\vec{w} + U\vec{w}; \quad [\because T\vec{v}, U\vec{v} \in W \text{ and } c \in F]$$
$$= c\,(T + U)(\vec{v}) + (T + U)(\vec{w}); \quad \text{[using equation 7.18]}$$
proving that T + U is indeed a linear transformation from V to W. Now, for $\vec{v}, \vec{w} \in V$ and $c, d \in F$,
$$(cT)(d\,\vec{v} + \vec{w}) = c\,[T(d\,\vec{v} + \vec{w})]; \quad \text{[using definition 7.19]}$$
$$= c\,[d\,(T\vec{v}) + T\vec{w}] = (cd)\,(T\vec{v}) + c\,(T\vec{w}); \quad [\because T\vec{v}, T\vec{w} \in W,\ c, d \in F \text{ and } W \text{ is a vector space over } F]$$
$$= (dc)\,(T\vec{v}) + c\,(T\vec{w}); \quad [c, d \in F \text{ and multiplication in a field is commutative}]$$
$$= d\,[c\,(T\vec{v})] + c\,(T\vec{w}) = d\,[(cT)(\vec{v})] + (cT)(\vec{w}); \quad \text{[using definition 7.19]}$$
proving that cT is also a linear transformation from V to W . I leave it to the reader to verify
that the set of linear transformations with respect to addition and scalar multiplication defined
by equations 7.18 and 7.19, respectively, indeed satisfies all vector space axioms. ■
We shall denote the vector space of linear transformations from V to W by L (V, W ). We
remark here that L (V, W ) is defined only when the vector spaces V and W are defined over
the same field F.
Theorem 7.4
Let V be an n-dimensional vector space over the field F, and let W be an m-dimensional
vector space over the same field F. The vector space L (V, W ) is finite dimensional and has
dimension mn.
Proof is omitted. ■
Definition 7.3. If V is a vector space over the field F, a linear operator on V is a linear
transformation from V to itself.
Theorem 7.5
Let V, W and Z be vector spaces over the field F. Let T be a linear transformation from V to W and U be a linear transformation from W to Z. Then the composed function UT defined by $(UT)(\vec{v}) = U(T(\vec{v})),\ \forall\, \vec{v} \in V$, is a linear transformation from V to Z.
Proof. For $c \in F$ and $\vec{v}, \vec{w} \in V$,
$$UT(c\,\vec{v} + \vec{w}) = U(T(c\,\vec{v} + \vec{w})); \quad \text{[using the definition of the composed function]}$$
$$= U(c\,T\vec{v} + T\vec{w}); \quad [\because T \text{ is linear}]$$
$$= c\,U(T(\vec{v})) + U(T(\vec{w})); \quad [\because U \text{ is linear}]$$
$$= c\,UT(\vec{v}) + UT(\vec{w}). \qquad \blacksquare$$
Lemma 7.6
Let V be a vector space over the field F. Let U, $T_1$, and $T_2$ be linear operators on V; let $c \in F$. Then,
(a) $IU = UI = U$;
(b) $U(T_1 + T_2) = UT_1 + UT_2$, and $(T_1 + T_2)U = T_1U + T_2U$;
(c) $c\,(UT_1) = (cU)T_1 = U(cT_1)$.
Here I is the identity transformation on V, i.e., $I\vec{v} = \vec{v},\ \forall\, \vec{v} \in V$.
Theorem 7.4 and Lemma 7.6 together tell us that the vector space L (V, V ), together with
the composition operation, is a linear algebra with identity.
Example 7.5. Let F be a field and V be the vector space of all polynomial functions from F to F. A general element of V is given by
$$v(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n, \qquad c_i \in F.$$
Let D be the differentiation transformation of example 7.2, and let T be the operator 'multiplication by x', defined by $(Tv)(x) = x\, v(x)$. Verify that both these operators are linear, i.e., $D, T \in \mathcal{L}(V, V)$. For example, given $\alpha \in F$, verify
$$T(\alpha\, v(x) + w(x)) = \alpha\, T v(x) + T w(x),$$
for another polynomial function from F to F:
$$w(x) = d_0 + d_1 x + d_2 x^2 + \cdots + d_m x^m.$$
It is easy to see:
$$T(\alpha\, v(x) + w(x)) = x\,(\alpha\, v(x) + w(x)) = \alpha\, x\, v(x) + x\, w(x) = \alpha\, T v(x) + T w(x).$$
Also, DT − TD = I. In fact, for any $v \in V$,
$$(DT - TD)\, v(x) = \frac{d}{dx}\left[x\, v(x)\right] - x\, \frac{d}{dx} v(x) = v(x) + x\, v'(x) - x\, v'(x) = v(x).$$
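A short sympy sketch (ours; the particular polynomial is an arbitrary choice) checking the identity DT − TD = I symbolically:

    import sympy as sp

    x = sp.symbols('x')
    f = 2 - 3*x + 5*x**3              # any polynomial will do
    DT = sp.diff(x*f, x)              # (DT f)(x) = d/dx [x f(x)]
    TD = x*sp.diff(f, x)              # (TD f)(x) = x f'(x)
    print(sp.simplify(DT - TD - f))   # 0, i.e. (DT - TD) f = f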
A linear transformation T from V to W is said to be invertible if there exists a function U from W to V such that UT is the identity function on V and TU is the identity function on W. You should be able to verify that if T is invertible, the function U is unique; it is denoted by $T^{-1}$. Furthermore, T is invertible if and only if
1. T is 1−1, that is, $T\vec{v} = T\vec{w} \implies \vec{v} = \vec{w}$, for all $\vec{v}, \vec{w} \in V$;
2. T is onto, that is, the range of T is all of W.
Theorem 7.7
Let V and W be F-vector spaces, and let T be a linear transformation from V to W. If T is invertible, then the inverse function $T^{-1}$ is a linear transformation from W to V.
Proof. Let $\vec{v}_1, \vec{v}_2 \in W$ and $c \in F$ be a scalar. We want to show that the following holds:
$$T^{-1}(c\,\vec{v}_1 + \vec{v}_2) = c\,T^{-1}\vec{v}_1 + T^{-1}\vec{v}_2.$$
Let
$$\vec{u}_1 = T^{-1}\vec{v}_1 \quad \text{and} \quad \vec{u}_2 = T^{-1}\vec{v}_2.$$
$\vec{u}_1 = T^{-1}\vec{v}_1$ implies that $\vec{u}_1$ is the unique vector in V such that $T\vec{u}_1 = \vec{v}_1$. Similarly, $\vec{u}_2$ is the unique vector in V such that $T\vec{u}_2 = \vec{v}_2$ (because T is 1−1). Now, by linearity of T,
$$T(c\,\vec{u}_1 + \vec{u}_2) = c\,T\vec{u}_1 + T\vec{u}_2 = c\,\vec{v}_1 + \vec{v}_2.$$
Now, since T is 1−1, $c\,\vec{u}_1 + \vec{u}_2$ is the unique vector in V which is sent to $c\,\vec{v}_1 + \vec{v}_2 \in W$ by the linear transformation T from V to W, and hence
$$T^{-1}(c\,\vec{v}_1 + \vec{v}_2) = c\,\vec{u}_1 + \vec{u}_2 = c\,T^{-1}\vec{v}_1 + T^{-1}\vec{v}_2,$$
and $T^{-1}$ is linear. ■
We call a linear transformation T non-singular if $T\vec{u} = \vec{0}_W$ implies $\vec{u} = \vec{0}_V$, i.e., if $\operatorname{Ker} T = \{\vec{0}_V\}$.
Theorem 7.8
Let T be a linear transformation from V to W. Then T is non-singular if and only if T carries each linearly independent subset of V to a linearly independent subset of W.
Proof. First, suppose that T is non-singular, and let $S = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ be a linearly independent subset of V. One needs to prove that the set $\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_k\}$ is linearly independent in W. Let
$$c_1\, T\vec{u}_1 + c_2\, T\vec{u}_2 + \cdots + c_k\, T\vec{u}_k = \vec{0}_W$$
for scalars $c_i \in F$. By linearity, $T(c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k) = \vec{0}_W$, and since T is non-singular, $c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k = \vec{0}_V$. But the linear independence of $S = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ then implies $c_1 = c_2 = \cdots = c_k = 0$. This argument shows that the image of $S = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$ under T is linearly independent.
Suppose now that T carries linearly independent subsets of V to linearly independent subsets of W. Let $\vec{u} \in V$ be a non-zero vector, so that $\{\vec{u}\}$ is a linearly independent subset of V. Then, according to the hypothesis, $\{T\vec{u}\}$ is linearly independent in W. One then must have $T\vec{u} \ne \vec{0}_W$; otherwise, the set $\{T\vec{u}\}$, consisting of the zero vector $\vec{0}_W$ only, would be linearly dependent. In other words, $T\vec{u} \ne \vec{0}_W$ whenever $\vec{u} \ne \vec{0}_V$. Hence, the only vector $\vec{u} \in V$ satisfying $T\vec{u} = \vec{0}_W$ is $\vec{u} = \vec{0}_V$, i.e., $\operatorname{Ker} T = \{\vec{0}_V\}$, i.e., T is non-singular. ■
Example 7.6. Let F be a field and T be the linear operator on $F^2$ defined by
$$T(x_1, x_2) = (x_1 + x_2, x_1).$$
If $T(x_1, x_2) = (z_1, z_2)$, then $x_1 + x_2 = z_1$ and $x_1 = z_2$, so
$$x_1 = z_2, \qquad x_2 = z_1 - z_2.$$
Thus every $(z_1, z_2) \in F^2$ has a unique preimage: T is onto (and 1−1). Hence, $T^{-1}$ exists. Explicitly,
$$T^{-1}(z_1, z_2) = (z_2, z_1 - z_2).$$
In this example, the linear operator T is non-singular and is also onto. In general (when the underlying vector space is infinite dimensional), a non-singular linear operator defined on a vector space need not be onto.
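A quick numpy sketch (ours) of Example 7.6: relative to the standard basis of F², the operator T(x₁, x₂) = (x₁ + x₂, x₁) has matrix [[1, 1], [1, 0]], and inverting that matrix reproduces T⁻¹(z₁, z₂) = (z₂, z₁ − z₂).

    import numpy as np

    T = np.array([[1.0, 1.0],
                  [1.0, 0.0]])
    T_inv = np.linalg.inv(T)
    print(T_inv)                 # [[ 0.  1.]
                                 #  [ 1. -1.]]
    z = np.array([4.0, 7.0])
    print(T_inv @ z)             # [ 7. -3.] = (z2, z1 - z2)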
Theorem 7.9
Let V and W be finite dimensional vector spaces over the field F such that dim V = dim W. If T : V → W is a linear transformation, the following are equivalent (TFAE):
(i) T is invertible.
(ii) T is non-singular.
(iii) T is onto, that is, the range of T is all of W.
(iv) If $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is a basis for V, then $\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_n\}$ is a basis for W.
(v) There is some basis $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ for V such that $\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_n\}$ is a basis for W.
Proof. Let $n = \dim V = \dim W$. From theorem 7.2, we know that
$$\operatorname{rank}(T) + \operatorname{nullity}(T) = n.$$
Now, T is non-singular
$$\implies \operatorname{Ker} T = \{\vec{0}_V\} \implies \operatorname{nullity}(T) = 0 \implies \operatorname{rank}(T) = n = \dim W$$
$$\implies \text{range of } T = W \implies T \text{ is onto.}$$
It also holds in the other direction: when T is onto, one has $\operatorname{rank}(T) = n$, hence $\operatorname{nullity}(T) = 0$, i.e., $\operatorname{Ker} T = \{\vec{0}_V\}$ and T is non-singular. Thus (ii) $\iff$ (iii).
(iii) $\implies$ (iv). We assume that T is onto. Let $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ be a basis for V. Given $\vec{w} \in W$, $\exists\, \vec{v} \in V$ s.t. $T\vec{v} = \vec{w}$, since T is onto. Writing $\vec{v} = \sum_{i=1}^{n} c_i \vec{u}_i$ and using linearity, $\vec{w} = T\vec{v} = \sum_{i=1}^{n} c_i\, T\vec{u}_i$. Thus any vector $\vec{w} \in W$ can be written as a linear combination of the vectors in $\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_n\}$. Hence, $W = \operatorname{Span}\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_n\}$. Since $\dim W = n$ and there are n vectors in the set $\{T\vec{u}_1, \ldots, T\vec{u}_n\}$, these vectors must be linearly independent. Therefore, $\{T\vec{u}_1, \ldots, T\vec{u}_n\}$ is a basis for W.
(iv) $\implies$ (v). Since V is a finite dimensional vector space, it has a basis comprising dim V = n vectors. Denote a basis (which always exists) of V by $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$. Now, by (iv), $\{T\vec{u}_1, \ldots, T\vec{u}_n\}$ is a basis for W.
(v) $\implies$ (i). Suppose there is some basis $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ of V such that $\{T\vec{u}_1, T\vec{u}_2, \ldots, T\vec{u}_n\}$ is a basis for W. Hence, the set $\{T\vec{u}_1, \ldots, T\vec{u}_n\}$ spans W, i.e., given $\vec{w} \in W$, $\exists$ scalars $c_i \in F$, $1 \le i \le n$, such that $\vec{w} = \sum_{i=1}^{n} c_i\, T\vec{u}_i = T\left(\sum_{i=1}^{n} c_i \vec{u}_i\right)$; thus T is onto, and since (iii) implies (ii), T is also non-singular, i.e., 1−1. Being 1−1 and onto, T is invertible. Finally, (i) $\implies$ (ii) is immediate: an invertible T is 1−1, so $\operatorname{Ker} T = \{\vec{0}_V\}$. ■
The set of all invertible linear operators on a vector space V , with the operation of composition
is an example of the algebraic structure called a group.
Suggested Exercises (Hoffman and Kunze) (Page: 83), Problem: 1, 2, 3, 5, 10, 11, 12.
8 Lecture 8
If V and W are vector spaces over the field F, any 1 − 1 and onto linear transformation from
V to W is called an isomorphism of V to W . If there exists an isomorphism between V and
W , we say that V is isomorphic to W .
Note that V is trivially isomorphic to itself, the identity operator I being an isomorphism of V to itself.
§8.0.i Question
Verify that isomorphism is an equivalence relation on the class of vector spaces, i.e., the relation
obeys reflexivity, symmetry and transitivity.
Theorem 8.1
Every n-dimensional vector space over the field F is isomorphic to the space $F^{n}$.
Proof. Let V be an n-dimensional vector space over the field F and let $B = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ be an ordered basis for V. We define a function T : V → $F^{n}$ as follows: if $\vec{u} = x_1 \vec{u}_1 + \cdots + x_n \vec{u}_n$, set $T\vec{u} = (x_1, x_2, \ldots, x_n)$, the n-tuple of coordinates of $\vec{u}$ relative to B. One checks that T is linear, 1−1 and onto, hence an isomorphism. ■
Now let W be an m-dimensional vector space over F with ordered basis $B' = \{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_m\}$, and let T : V → W be a linear transformation.
Each of the n vectors $T\vec{u}_j \in W$ is uniquely expressible as a linear combination of the basis vectors in $B' = \{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_m\}$ of W:
$$T\vec{u}_j = \sum_{i=1}^{m} A_{ij}\, \vec{v}_i, \tag{8.2}$$
the scalars $A_{1j}, A_{2j}, \ldots, A_{mj}$ being the unique coordinates of $T\vec{u}_j \in W$ relative to the basis $B' = \{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_m\}$. Accordingly, the linear transformation T is fully determined by the mn scalars $A_{ij}$, $1 \le i \le m$, $1 \le j \le n$, via formula 8.2. The m × n matrix A defined by $A(i, j) = A_{ij}$ (where i indexes rows and j indexes columns) is called the matrix of T relative to the pair of ordered bases B and $B'$. Let us now understand explicitly how the matrix A determines the linear transformation T:
$$T\vec{u} = T\left(\sum_{j=1}^{n} x_j \vec{u}_j\right) = \sum_{j=1}^{n} x_j\,(T\vec{u}_j); \quad \text{[by linearity of T]}$$
$$= \sum_{j=1}^{n} x_j \sum_{i=1}^{m} A_{ij}\, \vec{v}_i = \sum_{i=1}^{m} \left(\sum_{j=1}^{n} A_{ij}\, x_j\right) \vec{v}_i. \tag{8.3}$$
Let X be the coordinate matrix of $\vec{u}$ in the ordered basis B: since $\vec{u} = x_1 \vec{u}_1 + x_2 \vec{u}_2 + \cdots + x_n \vec{u}_n$, one obtains
$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad X = [\vec{u}]_B.$$
From equation 8.3, it then follows that
$$T\vec{u} = \sum_{i=1}^{m} (AX)_i\, \vec{v}_i, \tag{8.4}$$
where $(AX)_i$ is the i-th entry of the (m × 1) matrix AX, and $AX = [T\vec{u}]_{B'}$. Equation 8.4 tells us that AX is the coordinate matrix of $T\vec{u}$ in the ordered basis $B'$. In the other direction, if A is any m × n matrix over the field F, then T defined by
$$T\left(\sum_{j=1}^{n} x_j \vec{u}_j\right) = \sum_{i=1}^{m} \left(\sum_{j=1}^{n} A_{ij}\, x_j\right) \vec{v}_i \tag{8.5}$$
is a linear transformation from V to W whose matrix relative to the pair of ordered bases B, $B'$ is A.
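The construction in equation 8.2 is mechanical, and the following numpy sketch (ours; the function matrix_of and the example map in it are our own illustrations, not from the notes) carries it out when the basis vectors of W are given as coordinate columns in some F^m: the j-th column of A is the B'-coordinate vector of T(u_j).

    import numpy as np

    def matrix_of(T, basis_V, basis_W):
        """A[i, j] = i-th B'-coordinate of T(u_j)."""
        Bp = np.column_stack(basis_W)                        # columns: the basis vectors of W
        cols = [np.linalg.solve(Bp, T(u)) for u in basis_V]  # coordinates of T u_j relative to B'
        return np.column_stack(cols)

    # Example: T(x1, x2) = (x1 + 2 x2, x2, x1), with the standard bases of R^2 and R^3.
    T = lambda u: np.array([u[0] + 2*u[1], u[1], u[0]])
    A = matrix_of(T, [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
                     [np.eye(3)[:, i] for i in range(3)])
    print(A)    # [[1. 2.]
                #  [0. 1.]
                #  [1. 0.]]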
Theorem 8.2
Let V be an n-dimensional vector space over the field F, and W an m-dimensional vector space over F. Let B and $B'$ be ordered bases of V and W, respectively. For each linear transformation T from V to W, there is an m × n matrix A with entries in F such that
$$[T\vec{u}]_{B'} = A\,[\vec{u}]_B \qquad \text{for every } \vec{u} \in V. \tag{8.6}$$
In particular, taking $\vec{u} = \vec{u}_j$,
$$[T\vec{u}_j]_{B'} = \begin{pmatrix} A_{1j} \\ A_{2j} \\ \vdots \\ A_{mj} \end{pmatrix} = A_j \leftarrow \text{the } j\text{-th column of the } (m \times n) \text{ matrix } A. \tag{8.7}$$
Compare equation 8.7 with equation 8.2. When T is a linear operator on the finite dimensional vector space V (that is, W = V and we take $B' = B$), the dependence of A, the matrix representing T, on the ordered basis B is captured by the notation $[T]_B = A$, so that equation 8.6 now becomes
$$[T\vec{u}]_B = [T]_B\, [\vec{u}]_B.$$
Example 8.1. Let F be a field and T be the operator on $F^2$ defined by $T(x_1, x_2) = (x_1, 0)$. On $F^2$, vector addition and scalar multiplication are
$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2), \qquad c\,(x_1, x_2) = (c x_1, c x_2),$$
for $(x_1, x_2), (y_1, y_2) \in F^2$ and $c \in F$. Check that
$$T[c\,(x_1, x_2) + (y_1, y_2)] = T(c x_1 + y_1, c x_2 + y_2) = (c x_1 + y_1, 0) = (c x_1, 0) + (y_1, 0) = c\,T(x_1, x_2) + T(y_1, y_2).$$
Hence, T : $F^2$ → $F^2$ is linear. Let $B = \{\vec{\epsilon}_1, \vec{\epsilon}_2\}$ be the standard ordered basis for $F^2$, with $\vec{\epsilon}_1 = (1, 0)$ and $\vec{\epsilon}_2 = (0, 1)$. Now,
$$T\vec{\epsilon}_1 = T(1, 0) = (1, 0) = 1\cdot\vec{\epsilon}_1 + 0\cdot\vec{\epsilon}_2, \qquad T\vec{\epsilon}_2 = T(0, 1) = (0, 0) = 0\cdot\vec{\epsilon}_1 + 0\cdot\vec{\epsilon}_2,$$
so that
$$[T]_B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
Consider next the vector space $P_3$ of polynomial functions f from R to R of the form
$$f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3, \qquad c_i \in \mathbb{R},\ 0 \le i \le 3,$$
that is, of degree 3 or less. The differentiation operator D : $P_3$ → $P_3$, dealt with in example 7.2 of lecture 7, acts on the elements of the ordered basis $B = \{f_1, f_2, f_3, f_4\}$, with $f_j(x) = x^{j-1}$, $1 \le j \le 4$, as
$$D f_1 = 0, \quad D f_2 = f_1, \quad D f_3 = 2 f_2, \quad D f_4 = 3 f_3,$$
so that
$$[D]_B = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
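A numpy sketch (ours; it assumes the matrix of D relative to B = {1, x, x², x³} written above) showing that this matrix really differentiates coefficient vectors:

    import numpy as np

    D = np.array([[0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0, 3.0],
                  [0.0, 0.0, 0.0, 0.0]])   # [D]_B

    c = np.array([5.0, -1.0, 4.0, 2.0])    # f(x) = 5 - x + 4x^2 + 2x^3
    print(D @ c)                           # [-1.  8.  6.  0.], i.e. f'(x) = -1 + 8x + 6x^2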
Next, consider composition. If A is the matrix representing T : V → W relative to the pair $B, B'$ then, since $T\vec{v} \in W$ and B is the matrix that represents the linear transformation U : W → Z relative to the pair $B', B''$, applying equation 8.6 twice yields
$$[UT\vec{v}]_{B''} = \mathrm{B}\,[T\vec{v}]_{B'} = \mathrm{B}A\,[\vec{v}]_B,$$
so that the matrix C representing the composition UT is C = BA.
Theorem 8.3
Let V, W, and Z be finite dimensional vector spaces over the field F; let T : V → W be a linear transformation and U : W → Z be another linear transformation. If $B$, $B'$, and $B''$ are ordered bases for the vector spaces V, W, and Z, respectively, if A is the matrix representing T relative to the pair $B, B'$, and B is the matrix of U relative to the pair $B', B''$, then the matrix representing the composed linear transformation UT relative to the pair $B, B''$ is the product matrix C = BA.
Now we should inquire what happens to representing matrices when the ordered basis is changed. In particular, we will consider this question for linear operators on a vector space V over the field F. The specific question is as follows: let T be a linear operator on the finite dimensional vector space V over a field F, and let
$$B = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}, \qquad B' = \{\vec{u}'_1, \vec{u}'_2, \ldots, \vec{u}'_n\},$$
be 2 ordered bases of V. How is the matrix representing T in the ordered basis B related to the matrix representing T in the ordered basis $B'$?
Before answering this question, it is important to note that if T and U are linear operators on a vector space V and if $[T]_B$ and $[U]_B$ denote the matrix representations of T and U, respectively, relative to the single ordered basis B, then by theorem 8.3 one obtains
$$[UT]_B = [U]_B\, [T]_B.$$
An easy consequence of this is that the linear operator T is invertible if and only if its matrix representation $[T]_B$ is an invertible matrix. Note that the identity operator I : V → V is represented by the identity matrix in any basis; we denote the identity operator and its matrix representation by the same symbol I. And thus,
$$UT = TU = I \iff [U]_B\, [T]_B = [T]_B\, [U]_B = I,$$
$$\left[T^{-1}\right]_B = [T]_B^{-1}. \tag{8.13}$$
Now, let us get back to the question we asked. We have seen in theorem 6.7 of lecture 6 that there is a unique (n × n) invertible matrix P s.t. $[\vec{u}]_B = P\,[\vec{u}]_{B'}$ for every $\vec{u} \in V$. Applying this to the vector $T\vec{u}$,
$$[T\vec{u}]_B = P\,[T\vec{u}]_{B'},$$
$$\implies [T]_B\, [\vec{u}]_B = P\,[T\vec{u}]_{B'},$$
$$\implies [T]_B\, P\, [\vec{u}]_{B'} = P\,[T\vec{u}]_{B'},$$
$$\implies P^{-1}\, [T]_B\, P\, [\vec{u}]_{B'} = [T\vec{u}]_{B'} = [T]_{B'}\, [\vec{u}]_{B'}.$$
Since this holds for every $\vec{u} \in V$, it must be the case that
$$[T]_{B'} = P^{-1}\, [T]_B\, P. \tag{8.16}$$
This is how the matrix representations of the same linear operator T on V, relative to the 2 ordered bases B and $B'$, are related.
′
Now, let V be an n-dimensional vector space. B = {⃗u1 , ⃗u2 , . . . , ⃗un } , B = {⃗u ′1 , ⃗u ′2 , . . . , ⃗u ′n } ,
are 2 ordered bases of V . Using theorem 7.1 of lecture 7 (Choose the 2 vector spaces to be the
′
same and the latter set of vectors to be the basis B ), then there is a unique linear operator
U on V , s.t. U⃗uj = ⃗u ′j , j = 1, 2, . . . , n.U is invertible by theorem 7.9 of lecture 7. Refer to
equation 6.13 of lecture 6:
Xn
′
⃗u j = Pij ⃗ui , 1 ≤ j ≤ n. (8.17)
i=1
Therefore, we see that the invertible (n × n) matrix P is precisely the matrix representation of
the invertible linear operator U. The equation 8.17, can, therefore, be written as:
n
X
U⃗uj = Pij ⃗ui , 1 ≤ j ≤ n. (8.18)
i=1
Then using equation 8.7, one immediately finds that [U]B = P . Let us summarize these results
formally as follows:
Theorem 8.4
Let V be a finite dimensional vector space over the field F, and let $B = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ and $B' = \{\vec{u}'_1, \vec{u}'_2, \ldots, \vec{u}'_n\}$ be 2 ordered bases for V. Suppose T is a linear operator on V. If $P = [P_1, P_2, \ldots, P_n]$ is the (n × n) matrix with columns $P_j = [\vec{u}'_j]_B$, then $[T]_{B'} = P^{-1}\, [T]_B\, P$. Alternatively, if U is the invertible linear operator on V defined by $U\vec{u}_j = \vec{u}'_j$, $1 \le j \le n$, then $[T]_{B'} = [U]_B^{-1}\, [T]_B\, [U]_B$.
Consider again the linear operator $T(x_1, x_2) = (x_1, 0)$ of example 8.1, now on $\mathbb{R}^2$. We have seen in that example that the matrix representation of T in the ordered basis $B = \{\vec{\epsilon}_1, \vec{\epsilon}_2\}$, with $\vec{\epsilon}_1 = (1, 0)$ and $\vec{\epsilon}_2 = (0, 1)$, is
$$[T]_B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
Suppose $B'$ is the ordered basis for $\mathbb{R}^2$ consisting of the vectors $\vec{\epsilon}\,'_1 = (1, 1)$, $\vec{\epsilon}\,'_2 = (2, 1)$. Then it is immediate that
$$\vec{\epsilon}\,'_1 = \vec{\epsilon}_1 + \vec{\epsilon}_2, \qquad \vec{\epsilon}\,'_2 = 2\vec{\epsilon}_1 + \vec{\epsilon}_2. \tag{8.19}$$
Therefore, the first column $P_1$ of the invertible matrix $P = [P_1, P_2]$ is $P_1 = [\vec{\epsilon}\,'_1]_B = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, while the second column is $P_2 = [\vec{\epsilon}\,'_2]_B = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$, so that the invertible matrix $P = [P_1, P_2]$ now reads
$$P = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}.$$
Now,
$$T\vec{\epsilon}\,'_1 = T(1, 1) = (1, 0) = a\,\vec{\epsilon}\,'_1 + b\,\vec{\epsilon}\,'_2 = a\,(1, 1) + b\,(2, 1),$$
and
$$T\vec{\epsilon}\,'_2 = T(2, 1) = (2, 0) = c\,\vec{\epsilon}\,'_1 + d\,\vec{\epsilon}\,'_2 = c\,(1, 1) + d\,(2, 1).$$
Solve for a, b, c, and d:
$$a + 2b = 1, \quad a + b = 0, \quad c + 2d = 2, \quad c + d = 0
\implies b = 1,\ a = -1,\ d = 2,\ c = -2.$$
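A numpy sketch (ours) checking theorem 8.4 on this example: P⁻¹[T]_B P should have columns (a, b) = (−1, 1) and (c, d) = (−2, 2), the coordinates just computed.

    import numpy as np

    T_B = np.array([[1.0, 0.0],
                    [0.0, 0.0]])
    P   = np.array([[1.0, 2.0],
                    [1.0, 1.0]])
    T_Bp = np.linalg.inv(P) @ T_B @ P
    print(T_Bp)      # [[-1. -2.]
                     #  [ 1.  2.]]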