Lecture Notes in Linear Algebra 1 (80134)
Raz Kupferman
Institute of Mathematics
The Hebrew University
Contents
1 Fields
1.1 Motivation
1.2 Definition of a field
1.3 Examples
1.4 Solvability of linear equations
1.5 Equality as an equivalence relation
1.6 Extended associativity and commutativity
Chapter 1: Fields
1.1 Motivation
Recall the times when you learned to solve a linear equation in one unknown,
say,
3X + 6 = 18. (1.1)
An equation “asks a question”: which number x yields an equality between
both sides of the equation when substituted for the unknown X. What did
you do? As a first step, you determined that if “something” plus 6 equals
18, then that “something” had to equal 12, namely, that every solution
of (1.1) is also a solution of the equation
3X = 12.
Stated differently, you used the fact that since the two sides of an equation are
by definition equal, then the equation will remain true if you subtract 6 from
both sides. As a second step, you determined that if 3 times “something”
equals 12, then by solving an unknown-factor problem, that “something” has
to equal 4, finally leading to the solution
x = 4.
Let us examine what we took for granted in this procedure. First, we needed a notion of equality, whereby the two sides of an equation are
in some sense “the same”. Second, this notion of equality justifies the fact
that if the same operation is applied on both sides, then the results of this
operation preserve the sameness of the two sides. Third, we assume the
existence of the operations of addition and multiplication, and their inverses,
subtraction and division. We used the fact that if “unknown + number = number”
then “unknown” can be determined uniquely, and similarly for “unknown ×
number = number” (unless the first number is zero).
But even before that, what are numbers? In the above example, we make do
with the natural numbers,
N = {1, 2, 3, . . . }.
But already the equation
X + 8 = 6
has no solution within N; solving it requires extending the natural numbers into the integers,
Z = {. . . , −2, −1, 0, 1, 2, . . . }.
(The letter “Z” stands for the German word Zahl, which means number.)
Does every equation with coefficients in Z have a solution in Z? No. The
equations
4X = 3 and 4X = (−3)
do not have solutions in Z. Requiring these equations to be solvable requires
the introduction of the rational numbers (המספרים הרציונליים), which are
denoted by Q (for quotients).
The set of rational numbers already gives us the ability to solve any equation
of the form
aX + c = b,
where a, b, c ∈ Q as long as a ≠ 0 (we will discuss the case of a = 0 later). Thus,
the rational numbers are “complete” in the sense that any linear equation
with coefficients in that set has a solution within that set.
The rational numbers are however not “complete” in other respects. More
than two millennia ago, it was discovered that the quadratic equation X² = 2
does not have a solution within the set of rational numbers, leading eventually
to the definition of the set of real numbers (המספרים הממשיים), which we
denote by R. The set of real numbers extends the set of rational numbers in
a sense described in your Calculus class. And yet, even with this extension,
there still exist “simple” equations that are not solvable, such as
X² = (−1).
This observation has eventually led to the further extension of the set of
real numbers into the set of complex numbers (המספרים המרוכבים), which
we denote by C. The complex numbers are defined by introducing a new
“number” ı, satisfying ı² = (−1), and then considering all combinations a + b ı,
with a, b ∈ R.
It should be noted that in the context of linear equations, denoting either Q,
R or C by the generic notation F, every equation of the form
aX + c = b,
with a, b, c ∈ F and a ≠ 0, has a unique solution in F, as we prove below.
1.2 Definition of a field
With that, we spell out the first four axioms, which are pertinent to addition: for every a, b, c ∈ F,
¹ Throughout this text we will use the standard notation of set theory: if A is a set, then a ∈ A means that a is an element in A, or that a belongs to A. For two sets A and B, the relation A ⊆ B means that A is a subset of B, implying that every element in A is also an element in B; note that this relation holds also if A = B. In fact, A = B means that both A ⊆ B and B ⊆ A.
(a + b) + c = a + (b + c). (A1)
a + b = b + a. (A2)
a + 0F = a. (A3)
a + (−a) = 0F . (A4)
The next four axioms are analogous (with one big difference!) and pertinent
to multiplication: for every a, b, c ∈ F,
(a ⋅ b) ⋅ c = a ⋅ (b ⋅ c). (M1)
a ⋅ b = b ⋅ a. (M2)
a ⋅ 1F = a. (M3)
a ⋅ a−1 = 1F for every a ≠ 0F . (M4)
The last axiom, the distributive law, connects the two operations: for every a, b, c ∈ F,
a ⋅ (b + c) = a ⋅ b + a ⋅ c. (D)
Comments:
(a) Elements of a field are called scalars (סקלרים) (rather than numbers).
(b) When no ambiguity occurs, we may denote the product of two elements
by ab rather than by a ⋅ b.
(c) We denoted the elements zero and one by 0F and 1F to emphasize that
they may differ from the numbers zero and one. Nevertheless, when no
confusion arises, we may revert to the more standard notation 0 and 1.
(d) A priori, a scalar may be its own additive and/or multiplicative inverse.
In fact, 0F is always its own additive inverse and 1F is always its own
multiplicative inverse. We will shortly see an example in which 1F is
also its own additive inverse.
(e) Subtraction (חיסור) is defined as the addition of the additive inverse,
a − b = a + (−b),
and division by b ≠ 0F is defined as multiplication by the multiplicative inverse,
a ÷ b = a b−1 .
Exercises
(easy) 1.1 S is a set. S claims to be a field. List all the properties you
should check in order to verify whether S’s claim is correct.
(easy) 1.2 Draw an “addition machine”, which is a box having two input
ports (labeled Input 1 and Input 2) and one output port. Combine two such
machines to generate the output (a + b) + c. Combine two such machines to
generate the output a + (b + c).
(easy) 1.3 Show that for every a ∈ F,
0F − a = (−a),
and for every 0F ≠ a ∈ F,
1F ÷ a = a−1 .
Solution 1.3: By the definitions of subtraction and division, and by the neutrality
properties of 0F and 1F ,
0F − a = 0F + (−a) = (−a) and 1F ÷ a = 1F ⋅ a−1 = a−1 .
1.3 Examples
Example: We are already acquainted with three fields, Q, R and C. Since
Q ⊂ R ⊂ C,
this may give the impression that all the fields in the world form a hierarchy
of inclusions. This is not the case, as the next example shows. ▲▲▲
Example: Consider a set containing two elements, which we denote by 0 and 1, with addition and multiplication defined by the following tables:
+ 0 1
0 0 1
1 1 0

⋅ 0 1
0 0 0
1 0 1
It takes some explicit verification to check that this is indeed a field (do you
recognize it?). This field is commonly denoted by F2 . That addition and
multiplication are commutative is apparent by the symmetry of the tables.
The neutrality of zero and one is also apparent. For associativity and dis-
tributivity we actually have to examine all the cases. Finally, 0 is its own
additive inverse and 1 is both its own additive and multiplicative inverses.
▲▲▲
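Checking associativity and distributivity for F2 indeed amounts to examining finitely many cases, which is easy to automate. The following is a minimal sketch (not part of the original notes) that verifies all the field axioms for F2 by exhaustive enumeration:

```python
from itertools import product

add = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}   # addition table of F2
mul = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # multiplication table of F2
F2 = (0, 1)

for a, b, c in product(F2, repeat=3):
    assert add[a, b] == add[b, a]                          # (A2)
    assert add[add[a, b], c] == add[a, add[b, c]]          # (A1)
    assert mul[a, b] == mul[b, a]                          # (M2)
    assert mul[mul[a, b], c] == mul[a, mul[b, c]]          # (M1)
    assert mul[a, add[b, c]] == add[mul[a, b], mul[a, c]]  # (D)

for a in F2:
    assert add[a, 0] == a                                  # (A3)
    assert mul[a, 1] == a                                  # (M3)
    assert any(add[a, b] == 0 for b in F2)                 # (A4)
    if a != 0:
        assert any(mul[a, b] == 1 for b in F2)             # (M4)

print("all field axioms hold for F2")
```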
Exercises
(easy) 1.4 Verify that the following addition and multiplication tables define a field with three elements (commonly denoted by F3 ):
+ 0 1 2
0 0 1 2
1 1 2 0
2 2 0 1

⋅ 0 1 2
0 0 0 0
1 0 1 2
2 0 2 1
(harder) 1.5 Construct a field having four elements. Hint: construct first
the multiplication table. Then, construct addition tables and show that only
one of them is consistent with all axioms.
Solution 1.5: Denoting the four elements by 0, 1, a, b, the tables are:
+ 0 1 a b
0 0 1 a b
1 1 0 b a
a a b 0 1
b b a 1 0

⋅ 0 1 a b
0 0 0 0 0
1 0 1 a b
a 0 a b 1
b 0 b 1 a
(intermediate) 1.6 Let S be the set of pairs
S = {(1, a) ∶ a ∈ R} ,
endowed with the operations
(1, a) ⊕ (1, b) = (1, a + b) and (1, a) ⊙ (1, b) = (1, a b),
where the addition and the multiplication on the right-hand sides are the
standard addition and multiplication in R. Is S a field?
Solution 1.6: Intuitive answer: just ignore the 1. Indeed, S is closed with respect to
both operations; (1, 0) is neutral to addition, (1, 1) is neutral to multiplication, etc. So
yes, S is a field.
(intermediate) 1.7 Let T be the set of pairs
T = {(a, b) ∶ a, b ∈ R} ,
endowed with the componentwise operations
(a, b) ⊕ (c, d) = (a + c, b + d) and (a, b) ⊙ (c, d) = (a c, b d),
where the addition and the multiplication on the right-hand sides are the
standard addition and multiplication in R. Is T a field?
Solution 1.7: (0, 0) is neutral to ⊕ and (1, 1) is neutral to ⊙. However, this is not a
field. For example, the non-zero element (1, 0) doesn’t have a multiplicative inverse.
1.4 Solvability of linear equations
Theorem 1.1 Let F be a field and let a, b, c ∈ F with a ≠ 0F . Then, the linear
equation
aX + c = b
has a solution and this solution is unique.
Proof : There are two claims to be proved: first, that there exists an x ∈ F
such that
ax + c = b,
and second, that if x, y ∈ F both satisfy
ax + c = b and ay + c = b,
then x = y.
For existence, x = a−1 (b + (−c)) is a solution, as
a (a−1 (b + (−c))) + c = (aa−1 )(b + (−c)) + c     (M1)
                      = 1F ⋅ (b + (−c)) + c       (M4)
                      = (b + (−c)) + c            (M3)
                      = b + (c + (−c))            (A1)
                      = b + 0F                    (A4)
                      = b.                        (A3)
(Be sure to understand the justification of each passage.)
To prove uniqueness, suppose that
ax + c = b and ay + c = b.
Since both left-hand sides equal to b, they are equal, i.e.,
ax + c = ay + c.
We now proceed with the following deductions:
(ax + c) + (−c) = (ay + c) + (−c)
ax + (c + (−c)) = ay + (c + (−c))
ax + 0F = ay + 0F
ax = ay
a−1 (ax) = a−1 (ay)
(a−1 a)x = (a−1 a)y
x = y.
(Be sure you understand why we had to assume that a ≠ 0F both for the
existence and the uniqueness.) ∎
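As a concrete illustration (not part of the original notes), the existence formula x = a−1 (b + (−c)) can be evaluated mechanically; here is a minimal sketch over the field Q, using Python's exact rational arithmetic:

```python
from fractions import Fraction

def solve_linear(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    """Return the unique solution of a*X + c = b in Q, assuming a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero for a unique solution")
    # x = a^{-1} * (b + (-c)), spelled out with field operations only
    return (1 / a) * (b + (-c))

# The opening example 3X + 6 = 18 indeed gives x = 4:
print(solve_linear(Fraction(3), Fraction(18), Fraction(6)))  # 4
```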
The above proposition has a number of implications pertinent to any field:
(a) The zero element is unique: if x ∈ F satisfies
x + b = b,
for some b ∈ F, then x = 0F , since both x and 0F solve the equation
X + b = b,
whose solution is unique by Theorem 1.1.
(b) The additive inverse is unique: (−b) is the only solution of the equation
X + b = 0F .
Exercises
(easy) Show that the unit element is unique: namely, if x ∈ F satisfies
x ⋅ a = a,
for some a ≠ 0F , then x = 1F .
(easy) Show that if a ⋅ b = 0F , then either a = 0F or b = 0F .
Solution: Suppose that a ⋅ b = 0F and a ≠ 0F . Consider the equation
aX + 0F = 0F .
Since x = 0F and x = b are both solutions, it follows from Theorem 1.1 that b = 0F . Similarly,
if b ≠ 0F , it follows that a = 0F .
(intermediate) 1.12 Let a, b, c, d ∈ F. Prove the following identities (where inverses appear, the relevant elements are assumed to be non-zero):
(a) −(−a) = a.
(b) (a−1 )−1 = a.
(c) (−1)a = (−a).
(d) (−0) = 0.
(e) a ≠ 0 if and only if (−a) ≠ 0.
(f) a = b if and only if a − b = 0.
(g) −(a + b) = −a − b.
(h) −(a − b) = b − a.
(i) (−a)b = a(−b) = −(ab).
(j) (−a)(−b) = ab.
(k) a ⋅ a = 1 if and only if a = 1 or a = −1.
(l) a ⋅ a = b ⋅ b if and only if a = b or a = −b.
(m) If a, b ≠ 0 then (ab)−1 = a−1 b−1 .
(n) If a ≠ 0 then 0/a = 0.
(o) a/1 = a.
(p) If b, d ≠ 0 then a/b = c/d if and only if ad = bc.
(q) If b, d ≠ 0 then (b/d)−1 = d−1 /b−1 .
(r) If b, d ≠ 0 then (a/b)(c/d) = (ac)/(bd).
(s) If b, d ≠ 0 then a/b + c/d = (ad + bc)/(bd).
Solution 1.12: Most items are based on the same idea. Take for example Item (i):
consider the equation
X + ab = 0F .
On the one hand, x = −(ab) is a solution. On the other hand, by the distributive law,
(−a)b + ab = ((−a) + a)b = 0F ⋅ b = 0F ,
i.e., x = (−a)b is also a solution, and by uniqueness (−a)b = −(ab). Item (l) follows by
noting that
a ⋅ a − b ⋅ b = (a + b)(a − b),
hence a ⋅ a − b ⋅ b = 0F if and only if either a + b = 0F or a − b = 0F .
1.5 Equality as an equivalence relation
The equality relation is an example of an equivalence relation (יחס שקילות): it is reflexive, namely, for every a,
a = a;
it is symmetric, namely,
a = b implies b = a;
and it is transitive, namely,
a = b and b = c implies a = c.
You will encounter many equivalence relations throughout your studies, in-
cluding in this course.
Moreover, we assume that addition and multiplication are consistent with
this notion of equivalence, namely, for all a, b, c ∈ F,
a = b implies a + c = b + c,
and
a = b implies a ⋅ c = b ⋅ c.
This assumption is the basis for the practice of adding the same term to both
sides of an equation.
Exercises
(easy) 1.13 Show that
a = b and c = d implies a + c = b + d,
and
a = b and c = d implies a ⋅ c = b ⋅ d.
Solution 1.13: It is an immediate consequence of consistency with respect to addition
and multiplication, along with the transitivity of the equality, for example,
a + c = b + c = b + d.
1.6 Extended associativity and commutativity
Let a1 , a2 , . . . , an ∈ F and consider the sum
a1 + a2 + ⋅ ⋅ ⋅ + an .
While this notation may be self-explanatory, there may be cases where the
use of an ellipsis (three dots) is ambiguous. The more formal way of writing
this sum is
∑_{i=1}^{n} ai or ∑_{1≤i≤n} ai ,
which we read as “the sum of all ai ’s where i ranges from one to n”. Formally,
this sum is defined inductively (הגדרה אינדוקטיבית) as follows:
∑_{i=1}^{1} ai = a1 ,
and for every 1 ≤ k < n,
∑_{i=1}^{k+1} ai = (∑_{i=1}^{k} ai ) + ak+1 .
Note that such a definition is meaningful even if the operation is neither
associative nor commutative.
▲▲▲
Let a1 , a2 , . . . , an ∈ F and b1 , b2 , . . . , bn ∈ F. It can be shown inductively on n
that
∑_{i=1}^{n} ai + ∑_{i=1}^{n} bi = ∑_{i=1}^{n} (ai + bi ).
Likewise, for c ∈ F,
c (∑_{i=1}^{n} ai ) = ∑_{i=1}^{n} (c ai ).
Given a field F, we may form n-tuples of scalars,
(a1 , . . . , an ),
and we denote the set of all such n-tuples by
Fn = {(a1 , . . . , an ) ∶ ai ∈ F, i = 1, . . . , n} .
More generally, for any non-empty set S,
S n = {(s1 , . . . , sn ) ∶ si ∈ S, i = 1, . . . , n}.
For reasons that will become apparent later in this course, we will sometimes
write n-tuples of scalars as columns delimited by square brackets; we denote
this set by
Fncol = {[x1 ⋮ xn ] ∶ xi ∈ F, i = 1, . . . , n} ,
where [x1 ⋮ xn ] denotes the column whose entries are x1 , . . . , xn .
At other times, the scalars will be arranged in a row delimited by square
brackets, and we denote this set by
Fnrow = {[a1 . . . an ] ∶ ai ∈ F, i = 1, . . . , n} .
At times, when writing columns is calligraphically annoying, we will write
[a1 . . . an ]T
for the column whose entries are a1 , . . . , an .
The reasons for this apparent nonsense (who cares about the form of paren-
theses and why write scalars in columns?) will be clarified later on.
Exercises
(easy) 1.15 Prove that all five ways of adding four addends in (1.2) yield
the same sum.
Solution 1.15: The identities follow from associativity for 3 addends, for example,
((a + b) + c) + d = (a + b) + (c + d),
where we take (a + b) as one of the addends. Now, taking (c + d) as one addend,
(a + b) + (c + d) = a + (b + (c + d)),
and so on.
Solution 1.16: For n = 1 the identity is a1 + b1 = a1 + b1 , hence holds. Suppose that the
identity holds for n = k. Then, by definition,
∑_{i=1}^{k+1} ai + ∑_{i=1}^{k+1} bi = (∑_{i=1}^{k} ai + ak+1 ) + (∑_{i=1}^{k} bi + bk+1 )
= (∑_{i=1}^{k} ai + ∑_{i=1}^{k} bi ) + (ak+1 + bk+1 )
= ∑_{i=1}^{k} (ai + bi ) + (ak+1 + bk+1 )
= ∑_{i=1}^{k+1} (ai + bi ),
where in the passage to the third line we used the inductive assumption.
(easy) Compute the following sums:
(a) ∑_{k=3}^{20} (k ⋅ k − (k − 1) ⋅ (k − 1)).
(b) ∑_{n=1}^{99} 1/(n(n + 1)).
Solution: In the first case, consecutive terms cancel each other, leaving 20 ⋅ 20 − 2 ⋅ 2 = 396. Such a sum is called a “telescopic sum”. In the second case, we note that
1/(n(n + 1)) = 1/n − 1/(n + 1),
so this sum is telescopic as well, and equals 1 − 1/100 = 99/100.
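As a quick numerical check (a sketch, not part of the original notes), both sums can be evaluated directly and compared against the telescoped values:

```python
from fractions import Fraction

# (a) sum_{k=3}^{20} (k*k - (k-1)*(k-1)) telescopes to 20*20 - 2*2
s_a = sum(k * k - (k - 1) * (k - 1) for k in range(3, 21))
assert s_a == 20 * 20 - 2 * 2 == 396

# (b) sum_{n=1}^{99} 1/(n(n+1)) telescopes to 1 - 1/100
s_b = sum(Fraction(1, n * (n + 1)) for n in range(1, 100))
assert s_b == 1 - Fraction(1, 100) == Fraction(99, 100)

print(s_a, s_b)  # 396 99/100
```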
Solution 1.19:
S = ∑_{j=1}^{1} (1 + 2j) + ∑_{j=1}^{2} (2 + 2j) + ∑_{j=1}^{3} (3 + 2j)
= (1 + 2) + (2 + 2) + (2 + 4) + (3 + 2) + (3 + 4) + (3 + 6),
where in the passage to the second line we used the inductive definition of the sum.
(intermediate) 1.21 Let
{aij ∈ F ∶ 1 ≤ i ≤ n, 1 ≤ j ≤ n}
be a square array of scalars. Show that
∑_{i=1}^{n} ∑_{j=1}^{i} aij = ∑_{j=1}^{n} ∑_{i=j}^{n} aij .
Solution 1.21: Since the summation sign is defined inductively, we have to use induction. For n = 1, the identity is a11 = a11 , which holds trivially. Suppose this holds for n; then
∑_{i=1}^{n+1} (∑_{j=1}^{i} aij ) = ∑_{i=1}^{n} (∑_{j=1}^{i} aij ) + ∑_{j=1}^{n+1} an+1,j
= ∑_{j=1}^{n} (∑_{i=j}^{n} aij ) + ∑_{j=1}^{n+1} an+1,j
= ∑_{j=1}^{n} (∑_{i=j}^{n} aij ) + ∑_{j=1}^{n} an+1,j + an+1,n+1
= ∑_{j=1}^{n} (∑_{i=j}^{n} aij + an+1,j ) + an+1,n+1
= ∑_{j=1}^{n} (∑_{i=j}^{n+1} aij ) + an+1,n+1 ,
where in the passage to the second line we used the inductive assumption. Since an+1,n+1 = ∑_{i=n+1}^{n+1} ai,n+1 , we finally obtain
∑_{i=1}^{n+1} ∑_{j=1}^{i} aij = ∑_{j=1}^{n+1} (∑_{i=j}^{n+1} aij ) .
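A small numerical sanity check of this exchange of summation order (a sketch, not part of the original notes):

```python
import random

n = 6
a = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]  # a[i][j], 0-based

# Row-by-row over the lower triangle: sum_{i=1}^{n} sum_{j=1}^{i} a_ij
row_first = sum(a[i][j] for i in range(n) for j in range(i + 1))
# Column-by-column over the same triangle: sum_{j=1}^{n} sum_{i=j}^{n} a_ij
col_first = sum(a[i][j] for j in range(n) for i in range(j, n))

assert row_first == col_first
print(row_first, col_first)
```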
whereas
∑_{i=1}^{2} ai bi = 1 ⋅ 1 + 1 ⋅ 1 = 2.
Chapter 2: Linear Systems of Equations
Let F be a field. A linear equation in the n unknowns X 1 , . . . , X n over F is an equation of the form
a1 X 1 + a2 X 2 + ⋅ ⋅ ⋅ + an X n = b, (2.1)
where a1 , . . . , an , b ∈ F.
The scalar ai is called the coefficient (מקדם) of the i-th unknown. We write
the coefficients of the X i ’s in the form
[a1 , a2 , . . . , an ] ∈ Fnrow .
We also refer to the extended list of coefficients, which includes the right-
hand side
[a1 , a2 , . . . , an , b] ∈ Fn+1row .
2(X 1 + X 2 − 6) = 3X 2 + 4(8 − X 1 ).
This is a linear equation in two unknowns, albeit not of the form (2.1). By
algebraic manipulations (based on the field axioms) we can rewrite it as
6X 1 − X 2 = 44,
X 1 + X 2 = 1. (2.3)
We are looking for pairs of scalars [x1 , x2 ]T ∈ F2col satisfying this equation.
We may see right away that
[1, 0]T and [0, 1]T
are both solutions to (2.3), but do there exist more solutions? Take any t ∈ F
and substitute it for X 2 . Then, we are left with the equation
X 1 + t = 1,
a1 x1 + ⋅ ⋅ ⋅ + an xn = b.
Multiplying both sides by c, using the distributive law and the associativity
of products,
cb = c (a1 x1 + ⋅ ⋅ ⋅ + an xn )
= c(a1 x1 ) + ⋅ ⋅ ⋅ + c(an xn )
= (ca1 )x1 + ⋅ ⋅ ⋅ + (can )xn ,
ak X k + ak+1 X k+1 + ⋅ ⋅ ⋅ + an X n = b.
(We call X k the leading variable (המשתנה המוביל) of the equation.) Multiplying this equation by a−1k we obtain an equation having the same set of solutions, whose first non-zero coefficient is one,
X k + (ak+1 /ak ) X k+1 + ⋅ ⋅ ⋅ + (an /ak ) X n = b/ak .
For every assignment of values to X k+1 , . . . , X n , there exists a unique value of X k for which this equation holds. That is, the
set of solutions can be written as
S[a1 ,...,an ∣b] = {[t1 , . . . , tk−1 , b/ak − ∑_{i=k+1}^{n} (ai /ak ) ti , tk+1 , . . . , tn ]T ∶ t1 , . . . , tk−1 , tk+1 , . . . , tn ∈ F} .
This is what we mean by a solution which is constructive, or explicit (מפורש).
The full set of solutions can be generated by selecting all possible values of
(t1 , . . . , tk−1 , tk+1 , . . . , tn ). In this representation we say that the variables X i
for i ≠ k are free variables (משתנים חופשיים) (because we can generate
all solutions by selecting their values “freely”), whereas X k is a dependent
variable (משתנה קשור) (because once the free variables have been assigned,
the value of X k depends on those assigned values).
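To make this constructive description concrete, here is a minimal sketch (not part of the original notes; the function name is ours) that, given the coefficients of a single linear equation over Q, produces the solution corresponding to any assignment of the free variables:

```python
from fractions import Fraction

def solve_single_equation(a, b, free_values):
    """Solve a[0]*X^1 + ... + a[n-1]*X^n = b over Q.

    a: list of coefficients; free_values: values for the free variables
    (all X^i except the leading one), in order. Returns the full solution.
    """
    # locate the leading variable: the first non-zero coefficient
    k = next(i for i, ai in enumerate(a) if ai != 0)
    t = list(free_values)
    x = t[:k] + [None] + t[k:]          # leave a slot for the dependent variable
    # X^k = b/a_k - sum_{i>k} (a_i/a_k) * t^i
    x[k] = Fraction(b, a[k]) - sum(Fraction(ai, a[k]) * xi
                                   for ai, xi in zip(a[k + 1:], x[k + 1:]))
    return x

# Example: X^1 + X^2 = 1; choosing t = 5 for X^2 yields X^1 = -4
print(solve_single_equation([1, 1], 1, [5]))  # [Fraction(-4, 1), 5]
```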
We may formulate the following corollary:
Exercises
(easy) 2.1 Write the set of solutions to the linear equation in two unknowns
over R,
3X 1 − 4X 2 = 7.
(easy) 2.2 Write the set of solutions of the linear equation in five unknowns over R,
0 X 1 + 0 X 2 − 4X 3 + 0 X 4 + 7X 5 = 3.
(easy) 2.3 Write the set of solutions of the linear equation
X1 + X2 + X3 = 1
over the field F2 . Solve the same equation over the field F3 .
(intermediate) 2.4 Write the set of solutions of the general linear equation in three unknowns,
a1 X 1 + a2 X 2 + a3 X 3 = b.
(intermediate) 2.5 Let α ∈ F, and consider the pair of linear equations
a1 X 1 + ⋅ ⋅ ⋅ + an X n = b and c1 X 1 + ⋅ ⋅ ⋅ + cn X n = d.
Show that every common solution of the two equations is also a solution of
(α a1 + c1 )X 1 + ⋅ ⋅ ⋅ + (α an + cn )X n = α b + d.
Solution 2.5: This follows from the field axioms and the summation rules. We need to
prove that if
∑_{i=1}^{n} ai xi = b and ∑_{i=1}^{n} ci xi = d,
then
∑_{i=1}^{n} (α ai + ci ) xi = α b + d.
Indeed,
∑_{i=1}^{n} (α ai + ci ) xi = α ∑_{i=1}^{n} ai xi + ∑_{i=1}^{n} ci xi = α b + d.
where
aij ∈ F, i = 1, . . . , m, j = 1, . . . , n,
is the coefficient (מקדם) of the j-th variable in the i-th equation, and
bi ∈ F, i = 1, . . . , m,
is the right-hand side of the i-th equation. Please note: the upper index in
aij and in bi enumerates the equation, whereas the lower index in aij enumerates
the variable.
A solution (פתרון) to the system is any n-tuple [x1 , x2 , . . . , xn ]T ∈ Fncol , such
that
a11 x1 + a12 x2 + ⋅ ⋅ ⋅ + a1n xn = b1
a21 x1 + a22 x2 + ⋅ ⋅ ⋅ + a2n xn = b2
⋮ (2.6)
am1 x1 + am2 x2 + ⋅ ⋅ ⋅ + amn xn = bm .
The set of solutions is
S = {[x1 , . . . , xn ]T ∈ Fncol ∶ ∑_{j=1}^{n} aij xj = bi for all i = 1, . . . , m} .
As in the case of a single equation, solving the system of equations means
obtaining a constructive way of generating all of its solutions.
Like for a single equation:
If, for example, F is a finite field, then we can enumerate the set of solutions,
which is a finite set. ▲▲▲
Not every system of equations is as “transparent” as in the above example.
What do we do when the system is more complicated? We transform it into
a “transparent” one having the same set of solutions, and we then solve the
easier one.
X 1 +2X 2 +X 3 +3X 4 = 4
3X 1 +6X 2 +2X 3 +5X 4 = 9.
Multiply the first equation by 3, yielding
3X 1 + 6X 2 + 3X 3 + 9X 4 = 12.
We proved (Proposition 2.4) that this does not alter its set of solutions. Take
now this equation and subtract it from the second equation in the original
system, yielding
−X 3 − 4X 4 = −3.
Then, add this equation to the first equation in the original system, yielding
X 1 + 2X 2 − X 4 = 1.
Multiplying the equation −X 3 − 4X 4 = −3 by (−1), we also obtain
X 3 + 4X 4 = 3.
Look at the last two equations. This is the system of the previous example—
the “transparent” system, whose solution we’ve already found. As we will
prove in the next section, the solutions of both sets of equations are the same.
▲▲▲
ai1 X 1 + ⋅ ⋅ ⋅ + ain X n = bi , i = 1, . . . , m,
and let c1 , . . . , cm ∈ F. Multiplying the i-th equation by ci and summing over i yields the equation
(c1 a11 + ⋅ ⋅ ⋅ + cm am1 )X 1 + ⋅ ⋅ ⋅ + (c1 a1n + ⋅ ⋅ ⋅ + cm amn )X n = c1 b1 + ⋅ ⋅ ⋅ + cm bm . (2.7)
Note that we applied here both the extended associativity and commutativity
of addition and the distributive law. We conclude that a linear combination
of linear equations is again a linear equation.
Multiplying the i-th equation by ci , summing over i and applying the dis-
tributive law, we recover the desired result after exchanging the order of
summation. Note that we used here the consistency of equality and addi-
tion: if s1 = t1 , s2 = t2 , up to sm = tm , then
s1 + s2 + ⋅ ⋅ ⋅ + sm = t1 + t2 + ⋅ ⋅ ⋅ + tm .
∎
Note, however, that the reverse is not necessarily true. Not every solution to
(2.7) is necessarily a solution of (2.5) (“information may have been lost”).
More generally, consider a linear system of k equations in n unknowns,
g11 X 1 + g12 X 2 + ⋅ ⋅ ⋅ + g1n X n = z 1
g21 X 1 + g22 X 2 + ⋅ ⋅ ⋅ + g2n X n = z 2
⋮ (2.8)
gk1 X 1 + gk2 X 2 + ⋅ ⋅ ⋅ + gkn X n = z k
Definition 2.8 Two linear systems of equations are called equivalent (שקולות)
if every equation in one system is a linear combination of the equations in
the other system.
the k equations in System B, with coefficients di1 , . . . , dik , namely, the i-th
equation of System C is of the form
∑_{ℓ=1}^{k} diℓ ∑_{s=1}^{m} cℓs ∑_{j=1}^{n} asj X j = ∑_{ℓ=1}^{k} diℓ ∑_{s=1}^{m} cℓs bs .
Exchanging the order of summation, this reads
∑_{s=1}^{m} (∑_{ℓ=1}^{k} diℓ cℓs ) ∑_{j=1}^{n} asj X j = ∑_{s=1}^{m} (∑_{ℓ=1}^{k} diℓ cℓs ) bs ,
and setting eis = ∑_{ℓ=1}^{k} diℓ cℓs , the i-th equation of System C is a linear combination, with coefficients ei1 , . . . , eim , of the equations of System A.
Comment: Note that in all summations, the index we sum upon always
appears once as an upper index and once as a lower index. If you come
across a summation in which this is not the case, look for an error.
We end this section by observing that while we have a well-defined notion
of equivalence between systems of equations, we don’t yet have a means for
verifying whether two systems of equations are equivalent, nor a systematic
way of generating equivalent systems to a given system.
Exercises
(easy) 2.6 Consider the following linear system of two equations in three
unknowns over R,
2X 1 +X 2 +X 3 = 2
X 1 +2X 2 −X 3 = −1.
Solution 2.6:
(a) No.
(b) No. It is not a solution of the second equation.
(c) X 1 − 4X 2 + 5X 3 = 7.
(d) Yes. The coefficients are [1/3, 1/3].
(e) No. You can’t find a, b such that 2a + b = 2, a + 2b = 1, a − b = 1 and 2a − b = 1.
(easy) 2.7 Write the set of solutions of the linear system over R in the
unknowns (X, Y ):
X +Y = 5
2X −Y = 3.
(easy) 2.8 Write the set of solutions of the linear system over R in the
unknowns (X, Y, Z):
X +Y −Z = −1
X −Y −Z = −1.
Solution 2.9: The second system is obtained from the first by taking the coefficients
to be [1/3, 4/3] and [−1/3, 2/3], respectively. The first system is obtained from the second
by taking the coefficients to be [1, −2] and [1/2, 1/2], respectively.
Solution 2.10: We have to show that every equation in one system is a linear combina-
tion of the equations in the other system. Let’s show for example that the first equation
on the left is a linear combination of the two equations on the right. We need to show
that there exist a, b ∈ R such that
a ⋅ 1 + b ⋅ 0 = −1, a ⋅ 0 + b ⋅ 1 = 1, and a ⋅ (−1) + b ⋅ 3 = 4.
Indeed, a = (−1) and b = 1 is a solution. I.e., the first equation on the left is obtained by
subtracting the first equation on the right from the second equation.
Are they equivalent? If they are, write each system as a linear combination
of the other.
Solution 2.11: These systems are not equivalent. To prove it, it suffices to show that
one equation in one of the systems is not a linear combination of the equations in the
other system. We will show that the first equation on the left is not a linear combination
of the two equations on the right. If it were, there would exist a, b ∈ R such that
(intermediate) 2.13 Does there exist a linear system of m equations in n unknowns having a unique solution, where
(a) m = 4 and n = 3?
(b) m = 3 and n = 4?
Solution 2.13: The answer to the first item is positive. Take for example the system
of equations
X1 = 1 X2 = 1 X3 = 1 and X 1 + X 2 = 2.
The answer to the second item is negative. Note that a system of three equations in
four unknowns may be inconsistent, in which case there are no solutions at all. If, however,
it is consistent, the analysis in the subsequent sections will show that the solution is not
unique.
We organize the coefficients as a rectangular array of m rows and n columns,
A = [aij ] ,
which we call the m × n matrix of coefficients (מטריצת המקדמים). The entry
at the i-th row and the j-th column is the coefficient of the j-th unknown in
the i-th equation. Likewise, we organize the bi ’s as an m × 1 matrix
b = [b1 , b2 , . . . , bm ]T ,
which is an element of Fmcol . If we further organize the unknowns as an n × 1
matrix,
X = [X 1 , X 2 , . . . , X n ]T ,
then we may symbolically represent the system of equations as AX = b. At
this stage this is just a symbolic notation, but it will acquire a meaning
shortly.
Comments:
(a) Note that in aij , the upper index i designates the row and the lower
index j designates the column. It will sometimes be convenient to write
the (i, j)-th element of a matrix A also by (A)ij .
The set of all m × n matrices with entries in F is denoted
Mm×n (F).
(d) M1×n (F) coincides with Fnrow , whereas Mm×1 (F) coincides with Fmcol .
We denote the j-th column of A by
Colj (A) = [a1j , a2j , . . . , amj ]T .
We may also write the coefficients and the right-hand side of the equations
as a unified m × (n + 1) matrix,
[A∣b] = [ a11 a12 ⋯ a1n b1
         a21 a22 ⋯ a2n b2
         ⋮   ⋮      ⋮   ⋮
         am1 am2 ⋯ amn bm ] .
Exercises
(easy) 2.15 Write the system of equations represented by the extended ma-
trix
[ 0 0 1 4 3
  2 4 2 6 7
  3 6 2 5 8 ] .
Solution 2.15:
X 3 + 4X 4 = 3
2X 1 + 4X 2 + 2X 3 + 6X 4 = 7
3X 1 + 6X 2 + 2X 3 + 5X 4 = 8.
These operations are in fact functions taking an element in Mm×n (F) and
returning an element in Mm×n (F).
Formally, if A is a matrix, and e is the operation (the function) taking a
matrix and returning a matrix having all rows the same, except that the r-th
row has been multiplied by F ∋ c ≠ 0, then for every pair of indexes i, j,
(e(A))ij = c aij if i = r, and (e(A))ij = aij if i ≠ r,
i.e., e maps a matrix to the matrix with the same rows, except that every entry of the r-th row is multiplied by c:
e ∶ [ a11 ⋯ a1n ; ⋮ ; ar1 ⋯ arn ; ⋮ ; am1 ⋯ amn ] ↦ [ a11 ⋯ a1n ; ⋮ ; c ar1 ⋯ c arn ; ⋮ ; am1 ⋯ amn ] .
If e is the operation taking a matrix and returning a matrix having all rows
the same, except for the r-th row being the sum of the r-th row and c times
the s-th row of A, then for every pair of indexes i, j,
(e(A))ij = aij + c asj if i = r, and (e(A))ij = aij if i ≠ r,
i.e., e maps a matrix to the matrix with the same rows, except that the r-th row becomes [ar1 + c as1 ⋯ arn + c asn ].
A matrix A is said to be row-equivalent to a matrix B if A is obtained from B by a finite sequence of elementary row operations e1 , . . . , es , namely,
A = es (es−1 (. . . e1 (B))).
Example: Since
A = e(A).
e−1k (ek (ek−1 (⋯ e2 (e1 (B))))) = ek−1 (⋯ e2 (e1 (B))),
S[A∣0] = S[B∣0] .
S[A∣c] = S[B∣d] .
Exercises
Solution 2.16: This is so by the very definition of elementary row operations; when an
elementary row operation e acts on the matrix A, all its rows but one remain unchanged,
whereas one row of e(A) is a linear combination of either one or two rows of A.
(easy) 2.17 Explain explicitly why in the proof of Proposition 2.15 it suffices
to show that the solutions of a homogeneous linear system do not change
under a single elementary row-operation.
Solution 2.17: Just apply an inductive argument on the number of elementary row-
operations.
Solution 2.18: No. Row-equivalent matrices are of the same size, since elementary
row-operations do not change the dimensions of a matrix.
If, for example, e1 is the operation multiplying the first row by (−1), and e2 is the operation adding the second row to the first, then
e2 (e1 ([1, 3]T )) = e2 ([−1, 3]T ) = [2, 3]T ,
whereas
e1 (e2 ([1, 3]T )) = e1 ([4, 3]T ) = [−4, 3]T .
(intermediate) 2.21 Let
A = [a b ; c d] ∈ M2×2 (F).
Show that if ad − bc = 0, then A is row-equivalent to a matrix having a zero row, whereas if ad − bc ≠ 0, then A is row-equivalent to the identity matrix
[1 0 ; 0 1] .
Solution 2.21: (a) Suppose that c = 0. Then ad = 0, i.e., either a = 0 or d = 0. If
d = 0, the second row is zero and we are done. Otherwise, if a = 0 and d ≠ 0, then
[0 b ; 0 d] is row-equivalent to [0 0 ; 0 d] .
If c ≠ 0, then b = ad/c, and
[a ad/c ; c d] is row-equivalent to [0 0 ; c d] .
(intermediate) 2.22 Show that two matrices in which two rows have been
interchanged are row-equivalent.
Solution 2.22: The interchange of rows r and s can be decomposed as follows: (i) add
row r to row s, (ii) subtract row s from row r, (iii) add row r to row s, and (iv) multiply
row r by (−1).
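A quick check of this decomposition (a sketch, not from the notes), tracking the pair of rows as vectors:

```python
# Track two rows r and s through the four elementary operations.
r, s = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]

s = [si + ri for ri, si in zip(r, s)]   # (i)   row s += row r
r = [ri - si for ri, si in zip(r, s)]   # (ii)  row r -= row s
s = [si + ri for ri, si in zip(r, s)]   # (iii) row s += row r
r = [-ri for ri in r]                   # (iv)  row r *= -1

print(r, s)  # [4.0, 5.0, 6.0] [1.0, 2.0, 3.0] -- the rows are interchanged
```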
X1 +17/3 X 4 = 0
X2 −5/3 X 4 = 0
X3 −11/3 X 4 = 0.
S[A∣0] = S[B∣0] = {[−17/3 s, 5/3 s, 11/3 s, s]T ∈ F4col ∶ s ∈ F} .
▲▲▲
(a) There exists a number r ≤ m, such that the rows r + 1, . . . , m are iden-
tically zero (if r = m then there are no rows that are identically zero).
(b) For each i = 1, . . . , r (i.e., for each non-zero row), let aiki be the first
non-zero entry; then aiki = 1 and k1 < k2 < ⋅ ⋅ ⋅ < kr (the ki ’s are the
columns of the leading coefficients in the non-zero rows).
(c) For each i, aiki is the only nonzero element in the ki -th column.
Linear Systems of Equations 51
Example: A zero matrix (מטריצת אפסים) is a matrix having all entries
zero; if A ∈ Mm×n is a zero matrix, we write A = 0, or A = 0m×n . A zero
matrix is an example of a row-reduced echelon matrix (with r = 0). ▲ ▲ ▲
Note that the summation is only on the indexes of the free variables.
▲▲▲
When is a non-homogeneous system consistent? Let AX = b with A being
a row-reduced echelon matrix. There are two possibilities: if A has a row
with all its entries zero, and the corresponding row of b is non-zero, then the
system does not have any solution. Otherwise, the zero rows can be ignored,
and the system is consistent.
Exercises
(easy) 2.23 Construct three matrices, each of which fails to satisfy exactly
one condition in the definition of a row-reduced echelon matrix.
Solution: That r ≤ m (the number of non-zero rows is not larger than the total number of rows) is
obvious.
(easy) 2.25 Characterize all 1×n and all m×1 row-reduced echelon matrices.
Solution 2.25: Apart from the zero matrix, there is only one m × 1 row-reduced echelon matrix, namely
[1, 0, . . . , 0]T .
On the other hand, there are more types of 1 × n row-reduced echelon matrices: the zero row, and every row of the form
[0 ⋯ 0 1 ∗ ⋯ ∗] ,
whose first non-zero entry is 1 (∗ denotes an arbitrary scalar).
(easy) 2.26 Explain why the n × n identity matrix is the unique n × n row-
reduced echelon matrix having no zero row.
Solution 2.26: If an n × n row-reduced echelon matrix has no zero rows, then r = n, and k1 < k2 < ⋅ ⋅ ⋅ < kn implies that ki = i for all i = 1, . . . , n. The only such matrix is the unit matrix.
Next, ignore the first row and bring to the second row the row whose first
nonzero entry is the least. Denote by k2 the column of the first non-zero
entry of the second row; by construction, k2 > k1 . Divide the second row by
a2k2 such that after this change a2k2 = 1. Then, subtract from the i-th row,
i ≠ 2, aik2 times the second row. These are elementary row-operations which
eliminate all entries in the k2 -nd column except in the second row. Note
also that this did not destroy the fact that up to the k2 -th column, the only
nonzero entries are a1k1 = 1 and possibly a1j for k1 < j < k2 .
We proceed this way, until reaching the m-th row, or until the remaining
rows are identically zero. ∎
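The procedure just described is the Gauss-Jordan algorithm. The following is a compact sketch (not part of the original notes) implementing it over Q with exact arithmetic:

```python
from fractions import Fraction

def rref(A):
    """Return the row-reduced echelon form of A (list of lists of Fractions)."""
    A = [[Fraction(x) for x in row] for row in A]
    m, n = len(A), len(A[0])
    r = 0                                    # index of the next pivot row
    for k in range(n):                       # scan columns left to right
        # find a row at or below r with a non-zero entry in column k
        pivot = next((i for i in range(r, m) if A[i][k] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]      # bring the pivot row up
        A[r] = [x / A[r][k] for x in A[r]]   # normalize so the pivot is 1
        for i in range(m):                   # eliminate column k elsewhere
            if i != r and A[i][k] != 0:
                A[i] = [x - A[i][k] * y for x, y in zip(A[i], A[r])]
        r += 1
    return A

# The example system from the text, as the augmented matrix [A|b]:
for row in rref([[1, 2, 1, 3, 4], [3, 6, 2, 5, 9]]):
    print(row)
# expected rows: 1 2 0 -1 | 1 and 0 0 1 4 | 3
```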
Denote the free variables, namely the unknowns X j with j ∉ {k1 , . . . , kr }, by
X ℓ1 , X ℓ2 , . . . , X ℓn−r .
The row-reduced system then reads
X k1 + ∑_{j=1}^{n−r} r1ℓj X ℓj = d1
⋮
X kr + ∑_{j=1}^{n−r} rrℓj X ℓj = dr ,
0 = dr+1
⋮
0 = dm .
Evidently, if dr+1 , . . . , dm are not all zero, then the system is not consistent.
If however,
dr+1 = dr+2 = ⋅ ⋅ ⋅ = dm = 0,
then we can replace the free variables X ℓ1 , . . . , X ℓn−r by any sequence of
scalars t1 , . . . , tn−r , obtaining a solvable equation for each of the dependent
variables X k1 , . . . , X kr .
We should have perhaps noted long ago that any homogeneous linear system
has at least one solution, [0, 0, . . . , 0]T , which we simply denote by 0 ∈ Fncol .
This solution is called the trivial solution (הפתרון הטריוויאלי). For any
matrix that has free variables, there also exist non-trivial solutions to the
homogeneous problem (as the free variables may assume any value). In particular,
Proof : There are two directions to prove. Assume first that A is row-
equivalent to the identity matrix. Since row-equivalent matrices have the
same associated solutions, the solutions to AX = 0 coincide with the solu-
tions of IX = 0, i.e.,
X 1 = 0, X 2 = 0, . . . , X n = 0,
Exercises
−3/7 X 1 + 2X 2 − 8/3 X 3 = 0
by first writing it in matrix form, and then transforming the matrix of coefficients into a row-reduced echelon matrix.
Solution 2.29: Writing the matrix of coefficients and reducing it, we obtain a row-reduced echelon matrix of the form
[ 1 0 19/6
  0 1 −67/24
  0 0 0
  0 0 0 ] .
Thus, X 3 is a free variable, and the set of solutions is
{[−19/6 a, 67/24 a, a]T ∶ a ∈ Q}.
(intermediate) 2.30 What are all the solutions (if any) of the system
X 1 − X 2 + 2X 3 = 1
2X 1 + 2X 3 = 1
X 1 − 3X 2 + 4X 3 = 2?
(intermediate) 2.31 Show using the Gauss-Jordan algorithm that the non-
homogeneous system
X 1 −2X 2 +X 3 +2X 4 = 1
X 1 +X 2 −X 3 +X 4 = 2
X 1 +7X 2 −5X 3 −X 4 = 3.
has no solutions.
Solution 2.31: We start with the augmented matrix
[ 1 −2 1 2 1
  1 1 −1 1 2
  1 7 −5 −1 3 ] .
Applying the Gauss-Jordan elimination algorithm we obtain
[ 1 0 −1/3 4/3 5/3
  0 1 −2/3 −1/3 1/3
  0 0 0 0 −1 ] .
The last equation, 0 = −1, does not have a solution.
Solution 2.33: Performing the Gauss-Jordan algorithm, the reduced extended matrix
is
[ 1 −2 2/3 −1/3 b1 /3
  0 0 1 1 3b2 /7 + 2b1 /7
  0 0 0 0 b3 − 3b2 /7 − 2b1 /7
  0 0 0 0 3b4 − 3b2 /7 − 9b1 /7 ] .
For the system to be solvable, the last two entries of the right-hand column must vanish.
S =A+B
if and only if sij = aij + bij for every i = 1, . . . , m and j = 1, . . . , n.
Note that the “+” sign in both relations has a totally different meaning: the
first is addition in Mm×n (F), whereas the second is addition in F. Another
way to write the definition of the addition of matrices (of the same size!) is
(A + B)ij = (A)ij + (B)ij .
Example:
[1 2 3 ; 4 5 6] + [7 8 9 ; 10 11 12] = [8 10 12 ; 14 16 18] .
▲▲▲
If we denote by 0m×n (or just 0 in short) the m × n-matrix whose entries are
all zero, then
A+0=A
for every A ∈ Mm×n (F). Likewise, given A ∈ Mm×n (F), we denote by (−A)
the m × n matrix given by
(−A)ij = −aij .
For every A ∈ Mm×n (F),
A + (−A) = 0.
It is easy to see that matrix addition is associative, namely, for every A, B, C ∈
Mm×n (F) we have
(A + B) + C = A + (B + C),
and commutative, namely,
A + B = B + A.
Note that the addition of matrices satisfies the four axioms of addition in a
field. This doesn’t make Mm×n (F) into a field!
We could define in a similar way products of matrices of the same size. We
could. But we won’t do so. We will rather have a different definition for
products of matrices, not necessarily of the same size, which will relate to
linear combinations of systems of equations.
You may ask yourself what is the purpose of adding up matrices, and whether
it relates to the solution of linear systems of equations. The meaning of
matrix addition will be clarified later in this course, in the context of linear
transformations.
Exercises
(easy) 2.35 Show that matrix addition is both associative and commuta-
tive.
Solution 2.35: For every pair of indexes i, j,
(A + B)ij = aij + bij = bij + aij = (B + A)ij ,
and
((A + B) + C)ij = (A + B)ij + cij = (aij + bij ) + cij = aij + (bij + cij ) = aij + (B + C)ij = (A + (B + C))ij .
The product of a scalar c ∈ F and a matrix A ∈ Mm×n (F) is the m × n matrix c A defined by
(c A)ij = c aij .
That is, the scalar c multiplies every entry of A to yield the matrix c A.
We could think of the elements of F as “acting” on elements in Mm×n (F)
resulting in an element in Mm×n (F).
Example:
4 ⋅ [1 2 3 ; 4 5 6] = [4 8 12 ; 16 20 24] .
▲▲▲
It is easy to see that multiplication by a scalar satisfies
1F ⋅ A = A
c(dA) = (cd)A
0F ⋅ A = 0m×n
(−1F )A = (−A).
c (A + B) = c A + c B. (2.10)
(c + d) A = c A + d A. (2.11)
Exercises
(easy) Prove that
(−1F )A = (−A),
namely, multiplying a matrix by the scalar (−1F ) yields its additive inverse.
(easy) 2.38 Prove the two distributive properties (2.10), (2.11) of the prod-
uct of a scalar and a matrix.
Note that the index i remains fixed—it represents the index of the equation
in the new system—and so does the index 1—which represents the variable
whose coefficient we calculate.
Likewise, the coefficient of X 2 in the new i-th equation is
bi1 a12 + bi2 a22 + ⋅ ⋅ ⋅ + bim am2 = ∑_{k=1}^{m} bik ak2 ,
and so on for the remaining variables.
Definition 2.21 Let B ∈ Mp×m (F) and let A ∈ Mm×n (F). Their product
(מכפלה של מטריצות) BA is a p × n matrix whose (i, j)-th entry is given by
(BA)ij = ∑_{k=1}^{m} bik akj = bi1 a1j + ⋅ ⋅ ⋅ + bim amj .
Example:
[1 0 ; −3 1] ⋅ [5 −1 2 ; 15 4 8] = [5 −1 2 ; 0 7 2] .
▲▲▲
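A direct transcription of this definition into code (a sketch, not part of the original notes; the function name matmul is ours):

```python
def matmul(B, A):
    """Product BA of B (p x m) and A (m x n), per (BA)^i_j = sum_k b^i_k a^k_j."""
    p, m, n = len(B), len(A), len(A[0])
    assert all(len(row) == m for row in B), "inner dimensions must agree"
    return [[sum(B[i][k] * A[k][j] for k in range(m)) for j in range(n)]
            for i in range(p)]

# The example above:
print(matmul([[1, 0], [-3, 1]], [[5, -1, 2], [15, 4, 8]]))
# [[5, -1, 2], [0, 7, 2]]
```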
Note that when we write the unknowns as an n × 1 matrix and the right-hand
side of the equation as an m × 1 matrix,
X = [X 1 , X 2 , . . . , X n ]T and b = [b1 , b2 , . . . , bm ]T ,
the equation AX = b can be interpreted in terms of matrix multiplication:
the product of an m × n matrix and an n × 1 matrix is an m × 1 matrix.
Then,
(Im A)ij = ∑_{k=1}^{m} δik akj = aij ,
namely Im A = A for every A. Likewise,
(AIn )ij = ∑_{k=1}^{n} aik δkj = aij ,
namely, AIn = A. ▲▲▲
Exercises
is a solution.
Solution 2.43: Computing the first few powers, you may convince yourself that
An = [1 n ; 0 1] .
This is proved inductively: assuming the formula for n − 1,
An = A An−1 = [1 1 ; 0 1] [1 n−1 ; 0 1] = [1 n ; 0 1] .
(intermediate) 2.44 Let A ∈ Mm×n (F) and B ∈ Mk×m (F). Which of the
following statements is true? If it is, prove it; otherwise provide a counter-example.
(a) If the first row of A is zero, then the first row of BA is zero.
(b) If the first column of A is zero, then the first column of BA is zero.
(c) If the first two rows of B are zero, then the first two rows of BA are
zero.
(d) If the first two columns of B are zero, then the first two columns of BA
are zero.
(e) If the i-th and the j-th rows of A are equal then the i-th and the j-th
rows of BA are equal.
(f) If the i-th and the j-th columns of A are equal then the i-th and the
j-th columns of BA are equal.
(g) If the i-th and the j-th rows of B are equal then the i-th and the j-th
rows of BA are equal.
(h) If the i-th and the j-th columns of B are equal then the i-th and the
j-th columns of BA are equal.
(AB)C = A(BC).
Proof : Just follow the definition, using the associative properties of both
addition and multiplication in F.
((AB)C)ij = ∑_{k=1}^{p} (AB)ik ckj = ∑_{k=1}^{p} (∑_{s=1}^{n} ais bsk ) ckj = ∑_{k=1}^{p} ∑_{s=1}^{n} ais bsk ckj ,
and
(A(BC))ij = ∑_{s=1}^{n} ais (BC)sj = ∑_{s=1}^{n} ais (∑_{k=1}^{p} bsk ckj ) = ∑_{s=1}^{n} ∑_{k=1}^{p} ais bsk ckj .
The two expressions coincide upon exchanging the order of summation. ∎
Proposition 2.24 Let A ∈ Mm×n (F) and B ∈ Mn×p (F). Let λ ∈ F. Then,
λ(AB) = (λA)B.
Exercises
The following scheme summarizes how the blocks of A and B multiply:
      E      F
C   C ⋅ E  C ⋅ F
D   D ⋅ E  D ⋅ F
Here, we partition the rows of A into two groups so that we represent the
matrix A ∈ M3×2 (F) as
A = [ C
      D ] ,
where C ∈ M2×2 (F) and D ∈ M1×2 (F). Likewise, we partition the columns of
B into two groups so that we represent the matrix B ∈ M2×5 (F) as
B = [E F ] ,
where E ∈ M2×3 (F) and F ∈ M2×2 (F). Then, the product AB ∈ M3×5 (F) can
be represented as a block matrix
AB = [ CE CF
       DE DF ] ,
with CE ∈ M2×3 (F), CF ∈ M2×2 (F), DE ∈ M1×3 (F) and DF ∈ M1×2 (F).
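A numerical check of this block identity (a sketch, not from the notes), reusing the matmul function sketched above:

```python
# Small example: A is 3x2 partitioned into C (2x2) over D (1x2),
# B is 2x5 partitioned into E (2x3) beside F (2x2).
C = [[1, 2], [3, 4]]
D = [[5, 6]]
E = [[1, 0, 1], [2, 1, 0]]
F = [[3, 1], [0, 2]]

A = C + D                                         # stack rows: [C; D]
B = [be + bf for be, bf in zip(E, F)]             # concatenate columns: [E F]

top = [ce + cf for ce, cf in zip(matmul(C, E), matmul(C, F))]     # [CE CF]
bottom = [de + df for de, df in zip(matmul(D, E), matmul(D, F))]  # [DE DF]

assert matmul(A, B) == top + bottom
print("block multiplication agrees with ordinary multiplication")
```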
Comments:
Comment: If a matrix A ∈ Mn (F) has a row whose entries are all zero or a
column whose entries are all zero, then it is not invertible. Why? Suppose
that the i-th row of A is zero. Then, for every matrix B ∈ Mn (F), the i-th row of AB is zero as well, hence AB ≠ In . A similar argument, using BA, applies when A has a zero column.
Example: Let
A = [a b ; c d] ≠ 02×2 .
A direct calculation shows that
[a b ; c d] [d −b ; −c a] = [d −b ; −c a] [a b ; c d] = [ad − bc 0 ; 0 ad − bc] = (ad − bc)I.
Thus, if ad − bc ≠ 0, then A is invertible, and
A−1 = (ad − bc)−1 [d −b ; −c a] .
If, on the other hand, ad − bc = 0, then
[a b ; c d] [d −b ; −c a] = 02×2 ,
with [d −b ; −c a] ≠ 0 (since A ≠ 0), and A is not invertible (see Exercise 2.47 below).
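This formula is easy to turn into code; a minimal sketch (not part of the notes; the function name is ours) over Q:

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the adjugate formula, if ad - bc != 0."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    f = Fraction(1, det)
    return [[f * d, -f * b], [-f * c, f * a]]

print(inverse_2x2(1, 2, 3, 4))  # [[-2, 1], [3/2, -1/2]]
```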
LA = In and AR = In .
(The matrix L is called a left-inverse (הפכית שמאלית) of A and the matrix
R is called a right-inverse (הפכית ימנית) of A.) Then
L = R,
and A is invertible.
Proof : Suppose that LA = In and AR = In . Then,
L = L In = L(AR) = (LA)R = In R = R,
hence L = R, and this common matrix is an inverse of A, i.e., A is invertible. ∎
(AB)−1 = B −1 A−1 .
Proof : For the first statement,
AA−1 = A−1 A = In ,
which proves, by definition, that A−1 is invertible and A is its inverse. For
the second statement, using the associativity of matrix multiplication,
(AB)(B −1 A−1 ) = A(BB −1 )A−1 = A In A−1 = In ,
and similarly (B −1 A−1 )(AB) = In , proving that B −1 A−1 is the inverse of AB. ∎
Comment: We have just seen that the set GLn (F) satisfies the following
properties:
Such a structure is called a group (חבורה); it is the main subject of a second-year
course in algebra. Note that a group, unlike a field, is endowed with
only one algebraic operation, which does not need to be commutative. The
notation GLn stands for the general linear group.
Exercises
(easy) 2.47 Let A ∈ Mn (F). Show that if there exists a non-zero matrix
C ∈ Mn (F) such that CA = 0, then A is not invertible.
Solution 2.47: Suppose that A is invertible, then
C = C(AA−1 ) = (CA)A−1 = 0 ⋅ A−1 = 0,
which is a contradiction.
Solution 2.50: We have seen that the product of two invertible matrices is invertible.
To show that a product of k invertible matrices is invertible we proceed inductively.
Solution 2.51: The elementary matrices corresponding to multiplying the first row by c ≠ 0, and to multiplying the second row by c ≠ 0, are
[c 0 ; 0 1] and [1 0 ; 0 c] .
The elementary matrices corresponding to adding s times the first row to the
second row, and s times the second row to the first row, are
[1 0 ; s 1] and [1 s ; 0 1] .
▲▲▲
More generally, we denote by Dk (a) the elementary matrix corresponding to the operation e of multiplying the k-th row by a. It is easy to verify that since
(e(A))ij = a aij if i = k, and (e(A))ij = aij if i ≠ k,
it follows that
(Dk (a))ij = 1 if i = j ≠ k, (Dk (a))ij = a if i = j = k, and (Dk (a))ij = 0 otherwise.
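These matrices are straightforward to construct programmatically; a sketch (not from the notes) building Dk (a) and the matrix T (r, s, c) that adds c times row s to row r (the name T follows the notation T12 (c) of Exercise 2.52 below):

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def D(n, k, a):
    """Elementary matrix multiplying the k-th row by a (1-based k, a != 0)."""
    E = identity(n)
    E[k - 1][k - 1] = a
    return E

def T(n, r, s, c):
    """Elementary matrix adding c times row s to row r (1-based, r != s)."""
    E = identity(n)
    E[r - 1][s - 1] = c
    return E

# Left-multiplying by an elementary matrix performs the row operation:
A = [[1, 2], [3, 4]]
print(matmul(D(2, 1, 5), A))      # [[5, 10], [3, 4]] -- first row times 5
print(matmul(T(2, 2, 1, -3), A))  # [[1, 2], [0, -2]] -- row2 - 3*row1
```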
It follows that if B is row-equivalent to A, then there exists a matrix P , which is a product of elementary matrices,
P = Ek Ek−1 . . . E1 ,
such that
B = P A.
In particular, for every matrix A there exists such a P for which
R = PA
is a row-reduced echelon matrix.
Exercises
(easy) 2.52 Show by a direct calculation that the elementary 2 × 2 matrix
T12 (c) = [1 c ; 0 1]
is invertible.
Solution 2.52: By a direct calculation,
[1 c ; 0 1] [1 −c ; 0 1] = [1 0 ; 0 1] .
Theorem 2.35 Let A ∈ Mm (F). Then the following statements are equiva-
lent:
(a) A is invertible.
(b) A is row-equivalent to Im .
(c) A is a product of elementary matrices.
Comment: When we say that three (or more) statements are equivalent, it
means that if one of them is true, then all of them are true, and equivalently,
if one of them is false, then all are false. To prove it, it suffices to prove that
the first statement implies the second, that the second implies the third, and
so on, and finally that the last implies the first.
A = Es Es−1 . . . E1 Im = Es Es−1 . . . E1 .
Statement (c) implies statement (a) by Proposition 2.32. Thus, it only re-
mains to prove that an invertible matrix is row-equivalent to Im .
Let R be a row-reduced echelon matrix which is row-equivalent to A. Since
R and A are row-equivalent, there exists a matrix P , which is a product of
elementary matrices, such that R = P A, hence R is invertible. It follows that
R does not have a row that is zero, but a square row-reduced echelon matrix
which has no zero rows can only be the identity matrix. ∎
In fact, the inverse of an invertible matrix can be calculated as follows:
Let A be invertible, and let P be a product of elementary matrices such that P A = Im (such a P exists since A is row-equivalent to Im ). Then,
P Im = A−1 .
Proof : We have
(P Im )A = P (Im A) = P A = Im ,
which, by the uniqueness of the matrix inverse, proves that P Im = A−1 . ∎
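In practice, this proposition is applied by row-reducing the augmented matrix [A ∣ Im ]: the same operations that turn A into Im turn Im into A−1 . A sketch (not from the notes), reusing the rref function sketched above:

```python
from fractions import Fraction

def inverse(A):
    """Invert A by row-reducing [A | I]; raises if A is not invertible."""
    n = len(A)
    augmented = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
                 for i, row in enumerate(A)]
    R = rref(augmented)
    if any(R[i][i] != 1 for i in range(n)):   # left block must reduce to I
        raise ValueError("matrix is not invertible")
    return [row[n:] for row in R]

print(inverse([[1, 2], [3, 4]]))
# [[Fraction(-2, 1), Fraction(1, 1)], [Fraction(3, 2), Fraction(-1, 2)]]
```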
We finally relate the property of being invertible to the existence of solution
to linear systems of equations:
(a) A is invertible.
(b) The homogeneous system AX = 0 only has the trivial solution.
(c) For every m × 1 matrix b, the system AX = b is consistent and its
solution is unique.
Proof : Suppose that Statement (a) holds, i.e., A is invertible. On the one
hand, x = A−1 b is a solution; on the other hand, if Ax = b, then
x = Ix = A−1 Ax = A−1 b,
hence the solution is unique. This proves Statement (c), which trivially implies Statement (b) (take b = 0). Finally, suppose that Statement (b) holds, and let R be the row-reduced echelon matrix which is row-equivalent to A. If R is not the identity matrix, then it has at
least one row identically zero. It follows that it has at least one free variable,
contradicting the fact that AX = 0 has a unique solution (since its solutions
are the same as the solutions of RX = 0). Hence R = Im , and A is invertible by Theorem 2.35. ∎
With that we finally have:
Proof : We will show it for a product of two matrices; the general case can
be shown inductively. We already know that a product of invertible matrices
is invertible—we will now show that if AB is invertible then both A and B
are invertible. Let C be the inverse of AB; then
(AB)C = A(BC) = I,
so that BC is a right-inverse of A, and
C(AB) = (CA)B = I,
so that CA is a left-inverse of B.
Exercises
(intermediate) Is the matrix
[ 1 2 3 4
  0 2 3 4
  0 0 3 4
  0 0 0 4 ]
invertible? If it is, what is its inverse?
X = [b, −a]T ,
(intermediate) 2.61 Let A ∈ Mn (F) satisfy
A2 − A + I = 0.
Show that A is invertible.
Solution 2.61: Since A(I − A) = I, it follows that A (and I − A) are invertible and
A−1 = I − A.
(intermediate) 2.62 Let A ∈ Mn (F) satisfy
A3 − 2A + I = 0.
Show that A is invertible.
Solution 2.62: Since A(2I − A2 ) = I, it follows that A (and 2I − A2 ) are invertible, and
A−1 = 2I − A2 .
(harder) 2.64 Consider the block matrix
C = [ A D
      0 B ] ,
where A ∈ Mn (F), B ∈ Mk (F) and D ∈ Mn×k (F).
Show that if A and B are invertible, then so is C. What about the converse?
Solution 2.64: Suppose that A and B are invertible. Consider the equation CX = 0.
If we partition the n + k rows of a solution into n rows (which we denote x) and k rows
(which we denote y), we obtain two equations,
Ax + Dy = 0 and By = 0.
Since B is invertible, y = 0, and then since A is invertible, x = 0. Thus CX = 0 only has
the trivial solution, and C is invertible.
Solution 2.66: If A is not invertible, then there exists a non-zero m × 1 matrix X such
that AX = 0. Take the matrix B whose first column is X and the other columns are zero.
Then, AB = 0.
Solution 2.67: The equation BX = 0 has non-trivial solutions, since the row-reduced echelon matrix which is row-equivalent to B has at least m − n free variables. It follows that
(AB)X = 0 has non-trivial solutions, proving that AB is not invertible.
(In other words, the set of solutions of a homogeneous system is closed under
addition and under scalar multiplication.)
Au = 0 and Av = 0.
Then,
A(u + v) = Au + Av = 0,
namely u + v ∈ S[A∣0] , and for every λ ∈ F,
A(λu) = λ(Au) = 0,
namely λu ∈ S[A∣0] . Note that we used here the fact that for every i = 1, . . . , m,
(A(λu))i = ∑_{k=1}^{n} aik (λu)k = ∑_{k=1}^{n} λ aik uk = λ ∑_{k=1}^{n} aik uk = (λ(Au))i . ∎
λ [t, −t]T = [λt, −λt]T
is an element of S[1,1∣0] . ▲▲▲
Au = b and Av = 0.
Then,
A(u + v) = Au + Av = b + 0 = b,
which means that u + v ∈ S[A∣b] . ∎
In fact, we can prove something even stronger.
Theorem 2.42 Let A ∈ Mm×n (F) and b ∈ Fmcol . Suppose that the inhomogeneous
system is consistent, namely, that there exists x ∈ Fncol satisfying
Ax = b.
Then, every solution u ∈ S[A∣b] can be written as
u = x + v,
where v ∈ S[A∣0] .
Proof : Write
u = x + (u − x),
and denote v = u − x. Now,
Av = A(u − x) = Au − Ax = b − b = 0,
i.e., v ∈ S[A∣0] . ∎
In other words, if an inhomogeneous system is consistent, every solution can
be represented as the sum of one particular solution and a solution of the
corresponding homogeneous system.
Example: Consider the linear equation
X 1 + X 2 = 5.
Its set of solutions is
S[1,1∣5] = {[t, 5 − t]T ∶ t ∈ F} .
Note that
x = [0, 5]T
is a particular solution of this system, and that the set of solutions can be
written as
S[1,1∣5] = {[0, 5]T + [t, −t]T ∶ t ∈ F} .
That is, every solution can be represented as the sum of one particular solu-
tion and a solution of the homogeneous system. ▲▲▲
translating them into other points. We denote this action using the addition
sign: a translation v acting on a point P yields a point which we denote by
P + v.
(Figure: a translation v maps the point P to the point P + v.)
The rule is that for every two points P, Q there exists a unique translation
v such that Q = P + v; we denote this unique translation by Q − P , which is
sometimes also denoted by P⃗Q.
Comments:
(a) There is no meaning to adding two points! Thus far, the addition
operation represents only the action of a translation on a point.
(b) An affine space does not come equipped with a special point, such as
an origin.
(Figure: the composition of translations v and w acting on a point P .)
P + (v + w) = (P + v) + w.
Note the difference between the types of addition on both sides of the equa-
tion. The addition on the left-hand side is a function
This last point merits some elaboration. By assumption, there exists for a
point P a unique translation v satisfying
P + v = P,
Q + w = Q,
The claim is that v = w, so that there exists a single translation which leaves
all points unaffected. Why this? Because if Q − P = u, i.e., Q = P + u, then
Q + v = (P + u) + v = P + (u + v) = P + (v + u) = (P + v) + u = P + u = Q,
α(βv) = (αβ)v,
(α + β)v = αv + βv
α(u + v) = αu + αv.
And now we connect this geometric construct to the set of solutions to linear
system. Let A ∈ Mm×n (F). We interpret the solutions of the system AX = b
(which are n-tuples of field elements) as points in the affine space An (F). In
contrast, we interpret the set of solutions of the homogeneous system AX = 0
(which are also n-tuples of field elements) as the space of translations V n (F).
Theorem 2.42 can then be interpreted as follows. Suppose that the system
AX = b is consistent, i.e., it has at least one solution P (a point in the affine
space). Then, its set of solutions is all the points Q obtained by translating
P by a solution of the homogeneous equation (which is indeed a translation).
L = {P + tv ∶ t ∈ F}.
(Figure: the line through the point P in the direction v.)
Proposition 2.43 Let a, b ∈ F, which are not both zero and let c ∈ F. Then,
the set of solutions to the equation
aX + bY = c
is a line in A2 (F).
Proof : We already have a technique for finding the space of solutions S[a,b∣c] .
Suppose first that a ≠ 0F . Then, the extended matrix [a, b∣c] is row-equivalent
to a matrix of the form [1, d∣e]; the corresponding linear systems have the
same solutions. The set of solutions is the set of points
S[1,d∣e] = {(e − dt, t) ∶ t ∈ F} ,
which is the line passing through the point (e, 0) with direction vector (−d, 1).
If a = 0 and b ≠ 0, then
S[0,b∣c] = {(t, c/b) ∶ t ∈ F} ,
which is the line passing through the point (0, c/b) with direction vector (1, 0). ∎
Proof : Let
L = {(p1 , p2 ) + t (v 1 , v 2 ) ∶ t ∈ F}
be a line in A2 (F). Let [x1 , x2 ]T ∈ L. Then, there exists a t ∈ F, such that
x1 = p1 + t v 1 and x2 = p2 + t v 2 .
Suppose that v 1 ≠ 0 (v 1 and v 2 cannot both be zero). Then,
t = (x1 − p1 )/v 1 ,
so that
x2 = p2 + v 2 (x1 − p1 )/v 1 ,
which we may rewrite as
v 2 x1 − v 1 x2 = v 2 p 1 − v 1 p 2 .
That is, all points in L are solutions of the equation
v 2 X 1 − v 1 X 2 = v 2 p 1 − v 1 p2 .
Proposition 2.45 Let a, b, c ∈ F, which are not all zero and let d ∈ F. Then,
the set of solutions to the equation
aX + bY + cZ = d
is a plane in A3 (F).
Chapter 3: Vector Spaces
The subject of this course is a theory of sets for which there is a notion of
linear combinations of elements. We have already encountered linear combi-
nations of equations and linear combinations of matrices; we are now going
to formalize axiomatically such sets, which we call vector spaces. Vector
spaces are abundant in mathematics (and its applications in all branches
of science), and their theory is foundational to that branch of mathematics
called algebra.
A vector space over a field F is a non-empty set V endowed with two operations: vector addition,
+ ∶ V × V → V,
and multiplication by a scalar,
⋅ ∶ F × V → V,
satisfying a list of axioms analogous to the field axioms for addition, together with compatibility rules for scalar multiplication.
Comments:
(a) A vector space hinges on two structures, a set of vectors and a field.
Formally, a vector space is a four-tuple, (V, +, F, ⋅).
(b) Vector spaces are also called linear spaces (מרחבים לינאריים).
(c) Be careful not to confuse 0F ∈ F and 0V ∈ V , although we often denote
them by the same symbol, 0.
(d) There is no meaning to a product ua, with u ∈ V and a ∈ F (even
though we could have defined it by commutativity).
(e) Vector spaces don’t have a canonical notion of products of vectors.
For those who are acquainted with scalar and vector products, these
products assume additional structure.
(f) Inductively, a vector space is closed under any finite linear combination
of vectors. That is, for every v1 , . . . , vn ∈ V and a1 , . . . , an ∈ F,
a1 v1 + ⋅ ⋅ ⋅ + an vn ∈ V.
We will often write such sums using our notation for matrix multiplication,
a1 v1 + ⋅ ⋅ ⋅ + an vn = (v1 . . . vn ) [a1 , . . . , an ]T .
The interpretation is that the column of scalars “acts” on the row
of vectors to produce a linear combination. At this stage, the role
of matrices enclosed by square bracket becomes “operators” forming
linear combinations. Note that we obtain products such as v1 a1 , which
we interpret as a1 v1 .
(g) Physicists often describe vectors as entities having a “magnitude” and
a “direction”; at this stage (and throughout this course) vectors have
neither magnitudes nor directions.
Example: Let F be any field. A set comprising just one element, V = {0V },
is a vector space with vector addition and scalar multiplication defined the
only possible way, namely
0V + 0V = 0V and a 0V = 0V .
Such a vector space is called the zero space (מרחב האפס), even though strictly
speaking, the vector space ({0V }, +, F, ⋅) is a different space for each field F.
▲▲▲
Example: For any field F and every n ∈ N, the set V = Fn is a vector space
over F with respect to componentwise vector addition and scalar multiplication,
(a1 , . . . , an ) + (b1 , . . . , bn ) = (a1 + b1 , . . . , an + bn ) and c (a1 , . . . , an ) = (c a1 , . . . , c an ),
the zero vector being
0Fn = (0F , . . . , 0F ).
All the vector space axioms follow from the properties of the field F (which
you should verify). Thus, (Fn , +, F, ⋅) is a vector space. The same applies if
we rather consider Fnrow or Fncol . ▲▲▲
Example: Consider the vector space (F2 , +, F, ⋅) and let v1 = (2, 3) and
v2 = (4, 5). The linear combination 8v1 + 9v2 is written using the action of a
matrix,
((2, 3) (4, 5)) [8, 9]T = 8 (2, 3) + 9 (4, 5) = (52, 69).
▲▲▲
Example: Let S be any non-empty set and let V = Func(S, F ) be the space
of functions f ∶ S → F (you will learn about functions in depth in the calculus
course, but let’s just think of a function as a “machine” which when fed with
an element in S, returns an element in F). Then, V is a vector space over F
with respect to vector addition,
(f + g)(s) = f (s) + g(s),
and scalar multiplication,
(a f )(s) = a f (s).
The zero element of this space is the function returning 0 ∈ F for all s ∈ S.
The additive inverse (−f ) of a function f is the function
(−f )(s) = −f (s).
Thus, (Func(S, F), +, F, ⋅) is a vector space. Once again don’t be confused:
in this vector space, the vectors are functions. ▲▲▲
It is readily checked that F[X] forms a vector space over F with respect to
these operations. ▲▲▲
Example: The complex numbers C are a field, hence C is a vector space over
C under the natural operations of addition and multiplication by scalars. On
the other hand, C is also a vector space over R, which is a totally different
vector space, despite the fact that the elements of the space (i.e., the vectors)
are the same. More generally, C is a vector space over any subfield of C (e.g.,
the complex rationals). ▲▲▲
Proof :
(a) Suppose that v + u = 0V and w + v = 0V , i.e., both u and w are additive inverses of v. Then,
u = 0V + u = (w + v) + u = w + (v + u) = w + (u + v) = w + 0V = w.
(b) For every a ∈ F,
a 0V = a(0V + 0V ) = a 0V + a 0V .
Adding −(a 0V ) to both sides and using the properties of vector addi-
tion,
0V = a 0V + (−(a 0V ))
= (a 0V + a 0V ) + (−(a 0V ))
= a 0V + (a 0V + (−(a 0V )))
= a 0V + 0V
= a 0V ,
proving that a 0V = 0V .
(c) Similarly,
0F u = (0F + 0F )u = 0F u + 0F u.
Adding −(0F u) to both sides,
0V = 0F u + (−(0F u))
= (0F u + 0F u) + (−(0F u))
= 0F u + (0F u + (−(0F u)))
= 0F u + 0V
= 0F u,
proving that 0F u = 0V .
(d) Suppose that au = 0V . If a ≠ 0F , then using the fact that a has a
multiplicative inverse,
u = 1F u = (a−1 a) u = a−1 (a u) = a−1 0V = 0V ,
i.e., either a = 0F or u = 0V .
(e) We have
0V = 0F u = (1F + (−1F ))u = u + (−1F )u,
and it follows from the uniqueness of the inverse that (−1F )u = −u. ∎
Definition 3.3 Let V be a vector space over a field F and let (u1 , . . . , un ) ⊂
V be a sequence of n vectors. A vector v ∈ V is said to be a linear combi-
nation of (u1 , . . . , un ), if there exists a sequence of scalars (a1 , . . . , an ) ∈ Fn ,
such that
v = a1 u1 + ⋅ ⋅ ⋅ + an un ,
or in matrix form, if there exists an a ∈ Fncol , such that
v = (u1 . . . un ) a.
For subsets S, T ⊆ V we denote
S + T = {u + v ∶ u ∈ S, v ∈ T } .
Exercises
(easy) 3.1 What is a vector? Let S be any non-empty set and let x ∈ S.
How can we tell whether x is a vector?
Solution 3.1: This is an almost senseless question. An element of a set is a vector if the
set is endowed with an algebraic structure (relying in particular on another structure—a
field), with two binary operations satisfying all axioms.
(easy) 3.2 In what sense is every field a vector space? Is it true that every
vector space is a field?
Solution 3.2: Every field is a vector space over itself in the following sense: every
two field elements can be added yielding another field element (here we think of field
elements as vectors). Every field element can be multiplied by a field element, yielding a
field element (here we think of it as a scalar times a vector resulting in a vector). Every
field element (viewed as a vector) has an additive inverse. Under this perspective, one
can verify that all the axioms of vector spaces are satisfied. On the other hand, not every
vector space is a field. For example, in a general vector space there is no product taking
two vectors and returning a vector.
(easy) 3.3 Let S be any non-empty set and let V = Func(S, F). Prove that
it is indeed a vector space with respect to the vector addition and scalar
multiplication defined above.
Solution 3.3: The zero element of this vector space is the function ζ ∈ Func(S, F) defined by ζ(s) = 0F
for all s ∈ S. You just have to take each of the axioms one-by-one and check that they are
satisfied. For example, for every s ∈ S.
(f + ζ)(s) = f (s) + ζ(s) = f (s) + 0F = f (s),
hence f + ζ = f .
(easy) 3.4 Let V = R2 be the set of pairs of real number and let F = R.
Define
(x, y) + (w, z) = (x + w, 0)
a(x, y) = (ax, 0).
Is V a vector space over R under these operations?
Solution 3.4: No. The unit property of 1F is not satisfied, for example,
1F (2, 3) = (2, 0) ≠ (2, 3).
(easy) 3.5 What is the smallest vector space containing more than one vec-
tor?
Solution 3.5: The field F2 , viewed as a vector space over itself, contains exactly two
elements.
(easy) 3.6 Show that any vector space over R is either the zero space, or
contains infinitely-many vectors.
Solution 3.6: Let V be a vector space over R. If V is not the zero space, it contains at
least one non-zero element v. The set of vectors
Rv = {av ∶ a ∈ R}
is infinite, because a ≠ b implies that av ≠ bv (see comment above).
(intermediate) 3.7 Let V be a vector space over F. Prove that for every
v, w ∈ V and 0 ≠ a ∈ F there exists a unique u ∈ V satisfying
au + v = w.
Hint: you’ve done something very similar in the context of fields.
(intermediate) 3.8 Use the result of Exercise 3.7 to deduce the uniqueness
of the additive inverse.
Solution 3.8: Consider the equation
1F u + v = 0V .
By the previous exercise, there exists a unique u ∈ V satisfying u + v = 0V , which is by
definition (−v).
(intermediate) 3.11 Consider the vector space (C3 , +, C, ⋅). Which vectors
are linear combinations of the vectors (1, 0, −1), (0, 1, 1) and (1, 1, 1)?
Solution 3.11: By definition, all the vectors of the form
a(1, 0, −1) + b(0, 1, 1) + c(1, 1, 1) = (a + c, b + c, b + c − a),
with a, b, c ∈ C. But this is not an explicit solution. Are there vectors in C3 which are not
linear combinations of these three vectors? In other words, can every (x, y, z) be expressed
as such a linear combination? This amounts to asking whether the linear system
[  1 0 1
   0 1 1
  −1 1 1 ] [a, b, c]T = [x, y, z]T
is always consistent. It is readily verified that the matrix of coefficients is row-equivalent
to the unit matrix, hence the answer is positive: every vector in C3 is a linear combination
of these three vectors.
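A sketch of that verification (the reduction steps, spelled out for convenience):
⎡ 1 0 1⎤   ⎡1 0 1⎤   ⎡1 0 1⎤   ⎡1 0 0⎤
⎢ 0 1 1⎥ → ⎢0 1 1⎥ → ⎢0 1 1⎥ → ⎢0 1 0⎥ ,
⎣−1 1 1⎦   ⎣0 1 2⎦   ⎣0 0 1⎦   ⎣0 0 1⎦
adding the first row to the third, subtracting the second row from the third, and finally
subtracting the third row from the first two.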
3.3 Subspaces
Let (V, +, F, ⋅) be a vector space and let W ⊆ V be non-empty. Restricting the operations
to W yields functions
+∣W ×W ∶ W × W → V and ⋅∣F×W ∶ F × W → V.
We say that W is a linear subspace of V , denoted W ≤ V , if both restrictions take values
in W , i.e., if W is closed under vector addition and scalar multiplication. Note that a
linear subspace automatically contains the zero vector: for any w ∈ W ,
(−1)w ∈ W, hence 0V = w + (−1)w ∈ W.
Example: Consider the vector space (Mn (F), +, F, ⋅). A matrix A ∈ Mn (F)
is called symmetric if aij = aji for all i, j ∈ {1, . . . , n}. It is easy to see that the
subset of symmetric matrices is a linear subspace of Mn (F). ▲▲▲
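Indeed, here is the quick check, spelled out for convenience: if A and B are symmetric and c ∈ F, then
(A + B)ij = aij + bij = aji + bji = (A + B)ji and (cA)ij = c aij = c aji = (cA)ji ,
and the zero matrix is clearly symmetric, so the subset is non-empty and closed under both operations.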
Example: Consider the vector space V = (Fncol , +, F, ⋅). Let A ∈ Mm×n (F)
and let
W = {x ∈ Fncol ∶ Ax = 0}
be the set of solutions of the corresponding homogeneous system of equations.
By Theorem 2.40, W ≤ V . ▲▲▲
Example: Let w ∈ V and consider the set W = Fw = {a w ∶ a ∈ F} of all scalar multiples
of w. Let u, v ∈ W , say
u = a w and v = b w.
Then,
u + v = a w + b w = (a + b) w ∈ W.
Let c ∈ F, then
c u = c (a w) = (ca) w ∈ W,
proving W is a linear subspace of V . ▲▲▲
Exercises
Solution 3.12: By definition, since U ≤ W , then U is not empty and for every u, v ∈ U
and a ∈ F,
u+v ∈U and a v ∈ U.
By definition U ≤ V . (Yes, there is almost nothing to prove...)
Solution 3.13: No, because the operations in V are not restrictions of the operations
in R. It is not sufficient that V be a subset of R to qualify as a linear subspace.
(a) Find a subset W ⊂ R2 including the zero vector, which is closed under
scalar multiplication but not closed under vector addition.
(b) Find a subset U ⊂ R2 including the zero vector, which is closed under
vector addition but not closed under scalar multiplication.
(c) Does there exist a non-empty subset V ⊂ R2 which does not include
the zero vector, which is closed under scalar multiplication?
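Solution (sketch): For (a), one may take the union of the two axes,
W = {(x, 0) ∶ x ∈ R} ∪ {(0, y) ∶ y ∈ R};
it contains the zero vector and is closed under scalar multiplication, yet (1, 0) + (0, 1) = (1, 1) ∉ W .
For (b), one may take U = Z2 , the set of points with integer coordinates; it is closed under
vector addition, yet (1/2)(1, 0) ∉ U .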
For (c), no such subset exists: if a non-empty set S ⊂ R2 is closed under scalar multiplication,
pick any v ∈ S; then
0V = 0F v ∈ S,
i.e., 0V ∈ S, contrary to the requirement.
W = {(z, w) ∶ 2z = 3w} .
W = {[a b; c d] ∶ ad = 0} .
W = {(x, y, z) ∶ 2x − y + z = 0, y − 2z = 0} .
W = {(x, y, z) ∶ xy = z} .
W = {f ∶ R → R ∶ f (2) = f (3)}.
W = {f ∶ R → R ∶ f (0) = f (1)²}.
Solution 3.15: (a) yes, (b) no, (c) yes, (d) yes, (e) no, (f) yes, (g) no.
U = {(z̄^1 , . . . , z̄^n ) ∶ (z^1 , . . . , z^n ) ∈ W } .
Then,
(z̄^1 , . . . , z̄^n ) + (w̄^1 , . . . , w̄^n ) = ( \overline{(z + w)^1} , . . . , \overline{(z + w)^n} ) ∈ U
since conjugation satisfies z̄ + w̄ = \overline{z + w} and
((z + w)^1 , . . . , (z + w)^n ) ∈ W.
It follows that U is closed under vector addition. We proceed similarly to show the closure
of U under scalar multiplication.
Solution 3.18: Only (b). (a), (c) and (e) are not closed under scalar multiplication;
(d) is not closed under vector addition.
Theorem: Let C be a non-empty collection of linear subspaces of V . Then,
W = ⋂C
is a linear subspace of V .
Proof : Since
0V ∈ U for every U ∈ C ,
the intersection W is not empty. Let u, v ∈ W and a ∈ F. Then,
u, v ∈ U for every U ∈ C ,
and since every U ∈ C is a linear subspace,
u + v ∈ U and au ∈ U for every U ∈ C ,
i.e., u + v ∈ W and a u ∈ W , proving that W ≤ V . n
Let S ⊆ V be a set of vectors, and consider the collection of all linear subspaces of V
which contain all those vectors, namely,
C = {W ≤ V ∶ S ⊆ W }.
This collection is not empty, because V itself is a linear subspace of V con-
taining all vectors in S, i.e.,
V ∈ C.
Whatever this collection of linear subspaces is, its intersection is a linear
subspace of V . We call it the linear subspace generated (תת מרחב נוצר) by
the vectors in S, and denote it by
⟨S⟩ = ⋂ C .
By construction, S ⊆ ⟨S⟩. Two immediate consequences of the definition will be used
repeatedly: (Lemma 3.7) if W ≤ V and S ⊆ W , then ⟨S⟩ ⊆ W ; and (Lemma 3.8) if
T ⊆ W̃ for every linear subspace W̃ containing S, then T ⊆ ⟨S⟩.
Proof : Once again, this is a direct consequence of the definition of the gen-
erated subspace: if T ⊆ W̃ for every linear subspace W̃ containing S, then
T ⊆ ⋂{W̃ ≤ V ∶ S ⊆ W̃ } = ⟨S⟩ . n
Proposition: Let w ∈ V . Then,
⟨w⟩ = Fw,
that is, the linear subspace generated by a single vector is the subspace
obtained by all multiples of that vector by scalars. If w = 0V , then this
subspace is the zero subspace. Otherwise, it is a line.
We have already seen that
Fw ≤ V.
Since {w} ⊂ Fw ≤ V , it follows from Lemma 3.7 that
⟨w⟩ ⊆ Fw.
Conversely, w ∈ ⟨w⟩ and ⟨w⟩ is a linear subspace, hence Fw ⊆ ⟨w⟩, proving the equality.
Proposition: ⟨∅⟩ = {0V }.
Proof : Since {0V } ≤ V contains the empty set, it follows from Lemma 3.7
that
⟨∅⟩ ⊆ {0V }.
Conversely, since {0V } is contained in every linear subspace of V , it follows
that
{0V } ⊆ W for all W ∈ {W̃ ≤ V ∶ ∅ ⊂ W̃ },
and by Lemma 3.8,
{0V } ⊆ ⟨∅⟩ .
n
Proposition 3.10: Let S ⊆ T ⊆ V . Then,
⟨S⟩ ≤ ⟨T ⟩ .
(Note that we write ⟨S⟩ ≤ ⟨T ⟩ rather than ⟨S⟩ ⊆ ⟨T ⟩ because these are linear
subspaces.)
Proof : Every linear subspace containing T contains, in particular, S, i.e.,
{W ≤ V ∶ T ⊆ W } ⊆ {W ≤ V ∶ S ⊆ W }.
Intersecting over more sets can only reduce the intersection, hence ⟨S⟩ ⊆ ⟨T ⟩. n
More properties of generated subspaces are derived in the exercise section.
Exercises
Solution 3.20: Since W1 and W2 are both linear subspaces, 0V ∈ W1 and 0V ∈ W2 , from
which follows that 0V ∈ W1 ∩ W2 . Let u, v ∈ W1 ∩ W2 and let a ∈ F. Since in particular
u, v ∈ W1 which is a subspace, it follows that u + v ∈ W1 and av ∈ W1 . Since in particular
u, v ∈ W2 which is a subspace, it follows that u + v ∈ W2 and av ∈ W2 . Thus,
u + v ∈ W1 ∩ W2 and a v ∈ W1 ∩ W2 ,
proving that W1 ∩ W2 is a linear subspace of V .
Let S1 , S2 ⊆ V satisfy S1 ⊆ ⟨S2 ⟩ and S2 ⊆ ⟨S1 ⟩. Show that
⟨S1 ⟩ = ⟨S2 ⟩ .
Solution: S1 ⊆ ⟨S2 ⟩ implies ⟨S1 ⟩ ⊆ ⟨⟨S2 ⟩⟩ = ⟨S2 ⟩ ,
and
S2 ⊆ ⟨S1 ⟩ implies ⟨S2 ⟩ ⊆ ⟨⟨S1 ⟩⟩ = ⟨S1 ⟩ .
Solution 3.24:
(a) False. Take the vector space R with S1 = {1, 2} and S2 = {1}. Then, ⟨S1 ⟩ = ⟨S2 ⟩
(i.e., ⟨S1 ⟩ ⊆ ⟨S2 ⟩) but S1 ⊆/ S2 .
(b) True. by Proposition 3.10, S2 ⊆ S1 implies that ⟨S2 ⟩ ⊆ ⟨S1 ⟩, which together with
⟨S1 ⟩ ⊆ ⟨S2 ⟩ yields an equality.
(c) False. Let V = R2 , S1 = {(1, 0), (0, 1), (2, 0)} and S2 = {(1, 0)}. Then ⟨S1 ⟩ = R2 and
⟨S2 ⟩ = R(1, 0) i.e., ⟨S2 ⟩ ≠ ⟨S1 ⟩. Then, (2, 0) ∈ S1 ∖ S2 and yet (2, 0) ∈ ⟨S2 ⟩.
(d) True. Suppose by contradiction that every v ∈ S1 ∖ S2 satisfies v ∈ ⟨S2 ⟩. Then,
S1 = S2 ∪ (S1 ∖ S2 ) ⊆ ⟨S2 ⟩ ,
hence
⟨S1 ⟩ ⊆ ⟨S2 ⟩ .
On the other hand, since S2 ⊆ S1 it follows that ⟨S2 ⟩ ⊆ ⟨S1 ⟩, which contradicts
⟨S2 ⟩ ≠ ⟨S1 ⟩.
(e) False. Let V = R, S1 = {1} and S2 = {2}. Then, S1 ∩ S2 = ∅ however ⟨S1 ⟩ ∩ ⟨S2 ⟩ = R.
Example: Let w ∈ V . Then, the only linear combinations of {w} are scalar
multiples of w,
Span{w} = Fw.
Note that Span{w} = ⟨w⟩. We will shortly see that this is a general identity
(note also that we defined the span such that Span ∅ = {0V } = ⟨∅⟩). ▲ ▲ ▲
Example: Let V = (R2 , +, R, ⋅) and S = {(1, 1), (−1, 1)}. For a, b ∈ R,
a(1, 1) + b(−1, 1) = (a − b, a + b),
so any given (x, y) satisfies (x, y) = (a − b, a + b) for
a = (1/2)(y + x) and b = (1/2)(y − x),
proving that Span S = R2 . ▲▲▲
Proposition: For any non-empty S ⊆ V , Span S is a linear subspace of V . Proof : Let
u, v ∈ Span S, say
u = a1 u1 + ⋅ ⋅ ⋅ + an un and v = b1 v1 + ⋅ ⋅ ⋅ + bm vm ,
with ui , vi ∈ S.
Then
u + v = a1 u1 + ⋅ ⋅ ⋅ + an un + b1 v1 + ⋅ ⋅ ⋅ + bm vm ∈ Span S.
Likewise,
c u = c a1 u1 + ⋅ ⋅ ⋅ + c an un ∈ Span S,
proving that Span S ≤ V . n
Theorem 3.13 Let V be a vector space over a field F and let S ⊂ V . Then,
Span S = ⟨S⟩ .
Proof : Since Span S is a linear subspace containing S,
Span S ∈ {W ≤ V ∶ S ⊆ W },
hence, intersecting,
⟨S⟩ ⊆ Span S.
Conversely, every linear subspace containing S contains all linear combinations of
elements of S, hence
Span S ⊆ ⟨S⟩ ,
which completes the proof. n
Corollary: For every linear subspace W ≤ V ,
Span W = W.
Proof : This corollary asserts that linear subspaces are closed under linear
combinations. We can prove it directly, but we can get this as a consequence
of the last theorem, recalling that ⟨W ⟩ = W (see Exercise 3.21). n
Exercises
(easy) 3.25 Let V be a vector space over the field F2 and let v ∈ V be a
non-zero vector. Write explicitly all the vectors in Span{v}.
Solution 3.25:
Span{v} = {a v ∶ a ∈ F2 } = {0F v, 1F v} = {0V , v}.
(easy) 3.26 Consider the vector space (F3 , +, F, ⋅). Find two vectors u, v ∈
F3 , such that
Span{u, v} = {(0F , a, b) ∶ a, b ∈ F} .
Solution 3.26: For example, u = (0F , 1F , 0F ) and v = (0F , 0F , 1F ). Then for every a, b ∈ F,
a u + b v = (0F , a, b),
i.e.,
Span{u, v} = {a u + b v ∶ a, b ∈ F} = {(0F , a, b) ∶ a, b ∈ F} .
(easy) 3.27 Consider the vector space (R4 , +, R, ⋅). Find two different sets
S, T ⊂ R4 , such that Span S = Span T .
Solution 3.27: For example,
S = {(1, 1, 0, 1), (0, −1, 1, 1)}
and
T = {(π, π, 0, π), (0, −eπ , eπ , eπ )}.
Each vector in T is a scalar multiple of the corresponding vector in S, hence the two spans
coincide. (Yes, I could have worked harder to make the second example more ”interesting”, but why
work harder?)
(intermediate) 3.28 Consider the vector space (R4 , +, R, ⋅), and let
v1 = (2, −1, 3, 2)
v2 = (−1, 1, 1, −3)
v3 = (1, 1, 9, −5).
Is
(3, −1, 0, −1) ∈ Span{v1 , v2 , v3 }?
Solution 3.28: The question is whether there exist real numbers x1 , x2 , x3 , such that
x1 (2, −1, 3, 2) + x2 (−1, 1, 1, −3) + x3 (1, 1, 9, −5) = (3, −1, 0, −1).
We can turn this into the question whether the non-homogeneous system
2x1 − x2 + x3 = 3
−x1 + x2 + x3 = −1
3x1 + x2 + 9x3 = 0
2x1 − 3x2 − 5x3 = −1
is consistent. In matrix form,
⎡ 2 −1  1⎤ ⎡x1⎤   ⎡ 3⎤
⎢−1  1  1⎥ ⎢x2⎥ = ⎢−1⎥
⎢ 3  1  9⎥ ⎣x3⎦   ⎢ 0⎥
⎣ 2 −3 −5⎦        ⎣−1⎦
At this stage we know how to proceed by reducing the system. This yields
⎡ 2 −1  1   3⎤   ⎡1 0 2  2⎤
⎢−1  1  1  −1⎥ → ⎢0 1 3  1⎥
⎢ 3  1  9   0⎥   ⎢0 0 0 −2⎥
⎣ 2 −3 −5  −1⎦   ⎣0 0 0 −2⎦
This system is not consistent, hence the answer is negative.
Solution 3.29:
(a) Yes. Set S1 = {u−v, v−w, w} and S2 = {u, v, w}. Clearly, S1 ⊂ Span S2 . Conversely,
u = (u − v) + (v − w) + w and v = (v − w) + w,
hence S2 ⊂ Span S1 , from which we conclude that Span S1 = Span S2 .
(b) No, and we show it by finding a counterexample. Let V = R3 with u = (1, 0, 0),
v = (0, 1, 0) and w = (0, 0, 1). On the one hand, Span{u, v, w} = R3 . On the other
hand
{u − v, v − w, w − u} = {(1, −1, 0), (0, 1, −1), (−1, 0, 1)},
whose span is
Span{u − v, v − w, w − u} = {(a − c, −a + b, c − b) ∶ a, b, c ∈ R}
This span contains only vectors x ∈ R3 for which x1 + x2 + x3 = 0, hence this span is
a strict subset of R3 .
(The claimed identity Span{u − v, v − w, w − u} = Span{u, v, w} therefore fails in general.)
(harder) 3.30 Let W ⊂ R5 be the set of all solutions to the linear system
2X¹ − X² + (4/3)X³ − X⁴ = 0
X¹ + (2/3)X³ − X⁵ = 0
9X¹ − 3X² + 6X³ − 3X⁴ − 3X⁵ = 0.
Find a set of three vectors spanning W .
Solution 3.30: We first reduce the system to find the subspace of solutions,
⎡2 −1 4/3 −1  0⎤   ⎡1 0 2/3 0 −1⎤
⎢1  0 2/3  0 −1⎥ → ⎢0 1  0  1 −2⎥
⎣9 −3  6  −3 −3⎦   ⎣0 0  0  0  0⎦
Thus, there are three free variables, and the space of solutions is
{(u − (2/3)s, 2u − t, s, t, u) ∶ s, t, u ∈ R}.
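Taking (s, t, u) to be in turn the standard unit vectors yields, for example, the spanning set
{(−2/3, 0, 1, 0, 0), (0, −1, 0, 1, 0), (1, 2, 0, 0, 1)}.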
(intermediate) 3.31 Prove that the only linear subspaces of R (the field R
as a vector space over itself) are R and {0}.
Solution 3.31: Let W ≤ R. If W is not the zero subspace, then there exists a non-zero
element a ∈ W . Since {a} ⊂ W , it follows that
Span{a} ≤ Span W = W.
But Span{a} = Ra = R, since every x ∈ R equals (x/a) a. Hence W = R.
Solution 3.33: It suffices to show one direction, since the relation is symmetric with
respect to u and v. So suppose that
u ∈ Span(W ∪ {v}),
i.e., there exist w1 , . . . , wn ∈ W and scalars a1 , . . . , an , c, such that
u = a1 w1 + ⋅ ⋅ ⋅ + an wn + c v.
If c ≠ 0F , then v = c−1 (u − a1 w1 − ⋅ ⋅ ⋅ − an wn ) ∈ Span(W ∪ {u}), as required.
(harder) 3.34 Prove that the only linear subspaces of R2 are R2 , {0} or
sets of the form
Rv
for some v ∈ R2 .
Solution 3.34: Clearly, R2 , {0} and sets of the form Rv are linear subspaces of R2 .
We need to show that there are no others. If all the vectors in W ≤ V are multiples of one
vector v, then W = Span{v} = Rv. Otherwise, W contains a non-zero vector v and
a vector not in its span, say w. We need to show that Span{v, w} = R2 .
Write v = (a, b) and w = (c, d). We are asking whether the system
⎡a c⎤ ⎡x1⎤   ⎡e⎤
⎣b d⎦ ⎣x2⎦ = ⎣f⎦
is consistent for all e, f ∈ R. We know that the answer is positive if and only if ad − bc ≠ 0,
i.e., if and only if (a, b) and (c, d) are not proportional to each other. Thus, Span{v, w} =
R2 .
(harder) 3.35 What are all the linear subspaces of (C, +, R, ⋅)?
Solution 3.35: The vector space (C, +, R, ⋅) can be viewed as (R2 , +, R, ⋅) with the
identification a + ıb → (a, b). Thus, the answer is the zero subspace, C and all subspaces
of the form Rz, with z ∈ C.
Solution (sketch): Suppose that W1 ∪ W2 ≤ V but neither subspace contains the other;
pick u ∈ W1 ∖ W2 and v ∈ W2 ∖ W1 . Then
u + v ∈ W1 ∪ W2 .
If u + v ∈ W1 , then
v = (u + v) − u ∈ W1 ,
a contradiction; if u + v ∈ W2 , then
u = (u + v) − v ∈ W2 ,
again a contradiction. Hence W1 ∪ W2 is a linear subspace only if W1 ⊆ W2 or W2 ⊆ W1 .
{Rowi (A) ∶ i = 1, . . . , m}
form a subset of Fnrow , which is a vector space over F. Their linear span is
called the row space (מרחב השורות) of A, denoted by R(A).
Proposition: For matrices A, B of compatible sizes,
R(AB) ≤ R(B).
Proof : Recall that A and B are row-equivalent if and only if there exist
matrices P, Q ∈ Mm (F), such that
B = PA and A = QB.
Exercises
(intermediate) 3.37 Consider the vector space (R3 , +, R, ⋅) and the sets
S = {(1, 2, 3), (2, 2, 1)} and T = {(2, 3, −1), (3, 0, −2)}.
Is Span S = Span T ?
Hint: find matrices A, B such that Span S = R(A) and Span T = R(B). Re-
duce these matrices and base your answer on those reduced representations.
Solution 3.37: Consider the matrices
A = ⎡1 2 3⎤   and   B = ⎡2 3 −1⎤
    ⎣2 2 1⎦             ⎣3 0 −2⎦
Clearly, Span S = R(A) and Span T = R(B). Two matrices of the same size have the
same row space if and only if they are row-equivalent. Denoting by RA and RB the
row-reduced forms of A and B, a direct calculation gives
RA = ⎡1 0  −2 ⎤   and   RB = ⎡1 0 −2/3⎤
     ⎣0 1 5/2⎦               ⎣0 1  1/9⎦
Since RA ≠ RB , the matrices A and B are not row-equivalent, hence Span S ≠ Span T .
Example: Let
A = ⎡1 1⎤
    ⎣0 0⎦
Then,
R(A) = Span{[1 1]} = {[c c] ∶ c ∈ R} ,
whereas
C (A) = Span{[1, 0]T } = {[c, 0]T ∶ c ∈ R} .
▲▲▲
Proposition: For matrices A, B of compatible sizes,
C (AB) ≤ C (A).
Proposition: The linear system AX = b is consistent if and only if
b ∈ C (A).
Proof : If you think of it, there is nothing to prove. b ∈ C (A) if and only if
there exists an x ∈ Fncol , such that Ax = b, which by definition amounts to
the system AX = b being consistent. n
Example: Let
S1 = {[2, 3]T , [4, 5]T } ⊂ V,
and
S2 = {[2, 3]T , [7, 8]T } ⊂ V.
Then,
S1 ∪ S2 = {[2, 3]T , [4, 5]T , [7, 8]T } ,
whereas
S1 + S2 = {[4, 6]T , [9, 11]T , [6, 8]T , [11, 13]T } .
▲▲▲
Proposition: Let W1 , . . . , Wn ≤ V and let
W = W1 + W2 + ⋅ ⋅ ⋅ + Wn .
Then W ≤ V . Indeed, let u, v ∈ W , say
u = a1 u1 + ⋅ ⋅ ⋅ + an un and v = b1 v1 + ⋅ ⋅ ⋅ + bn vn ,
with ui , vi ∈ Wi . Then,
u + v = (a1 u1 + b1 v1 ) + ⋅ ⋅ ⋅ + (an un + bn vn ) ∈ W,
since ai ui + bi vi ∈ Wi , and for c ∈ F,
c u = c a1 u1 + ⋅ ⋅ ⋅ + c an un ∈ W,
proving W ≤ V .
On the other hand, since W ≤ V contains the union of the Wi ’s, it follows by
Lemma 3.7 that
n
⟨⋃ Wi ⟩ ≤ W,
i=1
and by Theorem 3.13,
n
Span (⋃ Wi ) ≤ W,
i=1
which completes the proof. n
Exercises
(harder) 3.38 Let W1 , W2 be linear subspaces of a vector space V , such
that
W1 + W2 = V and W1 ∩ W2 = {0V }.
Prove that for every vector v ∈ V there exist unique vectors w1 ∈ W1 and
w2 ∈ W2 , such that
v = w1 + w2 .
Solution 3.38: Since V = W1 + W2 , then by definition every v ∈ V can be represented
as
v = w1 + w2 ,
where w1 ∈ W1 and w2 ∈ W2 . Suppose that
v = u1 + u2 ,
where u1 ∈ W1 and u2 ∈ W2 . Then,
w1 − u1 = u2 − w2 .
The left-hand side is an element of W1 and the right-hand side is an element of W2 . Since
the only element belonging to both subspaces is 0V , it follows that w1 − u1 = u2 − w2 = 0V ,
i.e., u1 = w1 and u2 = w2 , thus proving the uniqueness of the representation.
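For instance, in V = R2 with W1 = R(1, 0) and W2 = R(0, 1), both conditions hold, and the
unique decomposition of, say, v = (3, 5) is
(3, 5) = (3, 0) + (0, 5), with (3, 0) ∈ W1 and (0, 5) ∈ W2 .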
A set S is linearly-dependent if there exist distinct vectors v1 , . . . , vn ∈ S and scalars
a1 , . . . , an , not all zero, such that
a1 v1 + ⋅ ⋅ ⋅ + an vn = 0V .
On the other hand, the set {(1, 0), (0, 1)} is linearly-independent, because a(1, 0) + b(0, 1) = (a, b)
vanishes only for a = b = 0. In matrix notation, a linear combination can be written as
c1 v1 + ⋅ ⋅ ⋅ + cn vn = (v1 . . . vn ) c,
so that S is linearly-dependent if and only if
(v1 . . . vn ) c = 0
for some c ≠ 0.
Proposition: Let S ⊆ V . The following are equivalent:
(a) S is linearly-dependent.
(b) There exists a vector v ∈ S which is dependent on S ∖ {v}.
Proof : Suppose first that S is linearly-dependent: there exist distinct v1 , . . . , vn ∈ S and
scalars a1 , . . . , an , not all zero, such that
a1 v1 + ⋅ ⋅ ⋅ + an vn = 0.
Let j ∈ {1, . . . , n} be such that aj ≠ 0 (at least one such j exists). Then,
vj = −(1/aj )(a1 v1 + ⋅ ⋅ ⋅ + aj−1 vj−1 + aj+1 vj+1 + ⋅ ⋅ ⋅ + an vn ),
i.e., vj is dependent on S ∖ {vj }. Conversely, suppose that v ∈ S is dependent on S ∖ {v}:
there exist distinct v1 , . . . , vn ∈ S ∖ {v}, such that
v = a1 v1 + ⋅ ⋅ ⋅ + an vn .
Setting vn+1 = v and an+1 = (−1), we obtain that (v1 , . . . , vn , vn+1 ) are distinct
vectors in S satisfying
a1 v1 + ⋅ ⋅ ⋅ + an vn + an+1 vn+1 = 0,
i.e., S is linearly-dependent. n
What makes a set S ⊆ V linearly-independent? The requirement that for every sequence
(v1 , . . . , vn ) of distinct vectors in S and every sequence (a1 , . . . , an ) of scalars,
a1 v1 + ⋅ ⋅ ⋅ + an vn = 0
if and only if a1 = ⋅ ⋅ ⋅ = an = 0.
(a) S is linearly-independent.
(b) Every v ∈ S is linearly-independent of S ∖ {v}.
Proof : This is precisely the contrapositive of the previous proposition. n
Exercises
(easy) 3.39 Why did we insist in Definition 3.22 that the vectors vi be
distinct?
Solution 3.39: If we allow vectors to repeat, e.g., S = {v}, and v1 = v2 = v, then
1F ⋅ v1 + (−1F ) v2 = 0.
Without this restriction, every non-empty set of non-zero vectors would be linearly-
dependent.
(easy) 3.40 Let v ∈ V . Show that the set {v} is linearly-dependent if and
only if v = 0V .
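Solution 3.40 (sketch): If v = 0V , then 1F v = 0V is a non-trivial vanishing combination, so
{v} is linearly-dependent. Conversely, if a v = 0V for some a ≠ 0F , then multiplying by a−1
gives v = 0V .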
(easy) 3.41 Show that if two vectors are linearly-dependent, then one is a
(scalar) multiple of the other.
Solution 3.41: Suppose that the set {u, v} is linearly-dependent. This means that
there exists a non-trivial linear combination that vanishes,
a u + b v = 0V .
If a ≠ 0F , then u = −(b/a) v; otherwise b ≠ 0F and v = −(a/b) u. In either case one vector
is a scalar multiple of the other.
(intermediate) 3.44 Let u = (1 − ı, 3 + ı) and v = (1, 1 + 2ı). Are u and v
linearly-independent (a) in the vector space (C2 , +, C, ⋅); (b) in the vector space
(C2 , +, R, ⋅)?
Solution 3.44:
(a) Suppose that a, b ∈ C satisfy
a(1 − ı, 3 + ı) + b(1, 1 + 2ı) = (0, 0).
That is,
(1 − ı)a + b = 0 and (3 + ı)a + (1 + 2ı)b = 0.
The homogeneous linear system
⎡1 − ı    1   ⎤ ⎡a⎤   ⎡0⎤
⎣3 + ı  1 + 2ı⎦ ⎣b⎦ = ⎣0⎦
has non-trivial solutions (check that the determinant of the matrix vanishes), hence
the vectors u and v are linearly-dependent.
(b) For a, b ∈ R, we get separate equations for the real and imaginary parts,
⎡ 1 1⎤       ⎡0⎤
⎢−1 0⎥ ⎡a⎤   ⎢0⎥
⎢ 3 1⎥ ⎣b⎦ = ⎢0⎥
⎣ 1 2⎦       ⎣0⎦
One can check that this system only has trivial solutions, hence the vectors u
and v are linearly-independent.
(intermediate) 3.47 Consider the vector space (F2col , +, F, ⋅). Show that
the set
S = {[a, b]T , [c, d]T }
is linearly-independent if and only if ad − bc ≠ 0.
Solution 3.47: The set S is linearly-independent if and only if the homogeneous system
⎡a c⎤ ⎡x⎤   ⎡0⎤
⎣b d⎦ ⎣y⎦ = ⎣0⎦
only has trivial solutions. We know that this is the case if and only if the determinant
ad − bc does not vanish.
3.4.2 Bases
Definition 3.26 Let V be a vector space over F. A subset S ⊆ V is called
a generating set (קבוצה יוצרת) if
Span S = V.
Example: In V = (R2 , +, R, ⋅), the set
T = {(1, 0)} ⊂ V
is not a generating set, since
(1, 1) ∈/ Span T.
▲▲▲
Example: In (Fn , +, F, ⋅), the vectors
e1 = (1, 0, 0, . . . , 0, 0)
e2 = (0, 1, 0, . . . , 0, 0)
⋮
en = (0, 0, 0, . . . , 0, 1)
form a basis, called the standard basis. ▲▲▲
Example: In the vector space (C, +, C, ⋅), the set
S = {1}
is a basis. In the vector space (C, +, R, ⋅), the same set
S = {1}
is not a basis (its span is only R), whereas
T = {1, ı}
is a basis. I recommend thinking again about the difference between the last
two examples. ▲▲▲
Example: Let A ∈ Mn (F) be an invertible matrix and let S be the set of its columns.
We claim that S is a basis for V = Fncol . There are two things to show: that
S is a linearly-independent set and that S spans Fncol (i.e., that the column
space of A is Fncol ).
Example: Thus far, all the vector spaces in this section were finitely-
generated. Consider now the vector space of polynomials F[X]. This space
is infinite-dimensional. Why? Let P1 , . . . , Pn ∈ F[X] be a finite set of poly-
nomials; we will show that it cannot span F[X]. Let
N = max{deg Pi ∶ i = 1, . . . , n}.
Every element of Span{P1 , . . . , Pn } is a polynomial of degree at most N , hence
X^{N+1} ∉ Span{P1 , . . . , Pn }. ▲▲▲
v = a1 v 1 + ⋅ ⋅ ⋅ + an v n
u = b1 u1 + ⋅ ⋅ ⋅ + bm um ,
u ∈ Span(G ∖ {v}),
or, if one of the ui equals v, we can substitute for v its expression as a linear
combination of elements in G ∖ {v}, so in either case
u ∈ Span(G ∖ {v}),
Proof : Start with L, and add to it vectors in G, as long as the set remains
linearly-independent. This process must terminate, as G is a finite set. Con-
sider the resulting set L ⊂ B ⊂ G. By construction, B is linearly-independent,
and for every v ∈ G ∖ B we obtain that B ∪ {v} is linearly-dependent. It
follows that every such v is in the span of B, i.e.,
G ∖ B ⊂ Span B,
G ⊂ Span B,
hence
V = Span G ⊂ Span B ⊂ V,
i.e., B is a generating set, hence a basis. n
Exercises
(easy) 3.48 Prove that for every field F, {1F } is a basis for (F, +, F, ⋅).
Solution 3.48: Every non-zero singleton (a set containing only one vector) is linearly-
independent. It is generating because to every a ∈ F (viewed as a vector) corresponds a ∈ F
(viewed as a scalar), such that a = a ⋅ 1F .
Solution 3.49: The standard basis {(1, 0), (0, 1)} is a basis for C2 over C.
Solution 3.50: The set {(1, 0), (ı, 0), (0, 1), (0, ı)} is a basis for C2 over R.
The vectors v1 and v2 are linearly-independent (one is not a scalar multiple of the other).
To show that they form a basis, it suffices to show that v3 and v4 are linear combinations
of v1 and v2 (for then any vector in the span of all four is in the span of the first two).
Indeed,
v3 = −(1/3) v1 + (2/3) v2 and v4 = (3/4) v1 + (1/3) v2 .
form a basis for (R3 , +, R, ⋅). Write each of the standard basis vectors as a
linear combination of v1 , v2 , v3 .
W1 = {[a −a; b c] ∶ a, b, c ∈ F} ,
and
W2 = {[a b; −a c] ∶ a, b, c ∈ F} .
(a) Prove that W1 , W2 ≤ V .
(b) Prove that W1 + W2 ≤ V (repeat the proof which was given for the
general case).
(c) Find bases for W1 , W2 , W1 + W2 and W1 ∩ W2 .
Solution 3.53:
(a) Just check that W1 , W2 are not empty and closed under vector addition and scalar
multiplication.
(b) Let u, v ∈ W1 + W2 . By definition, there exist u1 , v1 ∈ W1 and u2 , v2 ∈ W2 , such
that
u = u1 + u2 and v = v1 + v2 .
Thus,
u + v = (u1 + v1 ) + (u2 + v2 ) ∈ W1 + W2 .
Likewise, for a ∈ F,
a v = a (v1 + v2 ) = a v1 + a v2 ∈ W1 + W2 ,
proving that W1 + W2 ≤ V .
(c) The sets
S1 = {[1 −1; 0 0], [0 0; 1 0], [0 0; 0 1]} and S2 = {[1 0; −1 0], [0 1; 0 0], [0 0; 0 1]}
are bases for W1 and W2 . Since any 2 × 2 matrix can be written as a sum of elements
in W1 and W2 , a basis for W1 + W2 = M2 (F) is
{[1 0; 0 0], [0 0; 1 0], [0 1; 0 0], [0 0; 0 1]} .
Finally,
{[1 −1; −1 0], [0 0; 0 1]}
is a basis for W1 ∩ W2 .
Solution 3.54: Since the first and third columns are identical, it is easy to see that the
first two columns are a basis for the column space of A. On the other hand, the space
of solutions of AX = 0 is obtained by reducing the matrix,
⎡1 0 1⎤   ⎡1 0 1⎤
⎢0 1 0⎥ → ⎢0 1 0⎥
⎣1 1 1⎦   ⎣0 0 0⎦
The space of solutions is all vectors of the form [−a, 0, a]T , for which a basis is the singleton
{[1, 0, −1]T }.
Show that
{ft ∶ t ∈ S}
is a basis for V .
Solution: Suppose that
a1 fs1 + ⋅ ⋅ ⋅ + an fsn = 0,
where the right-hand side is the function returning zero for every input. Substituting sj
on both sides, we obtain aj = 0, proving that only the trivial combination of the ft ’s
yields the zero function.
(G ∖ {v}) ∪ {u}
is generating, and
(L ∖ {u}) ∪ {v}
is linearly-independent. This fact is known as the exchange lemma.
(L ∖ {u}) ∪ {vn }
is linearly-independent. On the other hand, the set (G ∖ {vn }) ∪ {u} is generating because
any vector can be written as a linear combination of vectors in G. If this combination
includes vn , then we can replace this term by a linear combination of u and elements in
G, showing that every vector can be written as a linear combination of u and vectors in
G, excluding vn .
V = Fv + W.
u = a1 v1 + ⋅ ⋅ ⋅ + an vn .
{v1 , . . . , vn } ⊂ U,
hence
V = Span{v1 , . . . , vn } ≤ U,
but this necessarily implies that U = V , proving that W is a hyperplane. Finally, every
v = b1 v1 + ⋅ ⋅ ⋅ + bn vn
can be written as
v = b1 v1 + ⋅ ⋅ ⋅ + bn−1 vn−1 + bn ((an )−1 u − (a1 /an )v1 − ⋅ ⋅ ⋅ − (an−1 /an )vn−1 )
= (b1 − bn (a1 /an ))v1 + ⋅ ⋅ ⋅ + (bn−1 − bn (an−1 /an ))vn−1 + (bn /an ) u,
Proof : Let S ⊂ V have more than n elements, and let {u1 , . . . , un+1 } ⊂ S be
distinct vectors. By the definition of a generating set, each of the vectors ui
is in the span of G, hence there exist (n + 1) × n scalars aij such that
ui = ai1 v1 + ⋅ ⋅ ⋅ + ain vn , i = 1, . . . , n + 1.
The homogeneous system ∑ᵢ ci aij = 0, j = 1, . . . , n, of n equations in n + 1 unknowns
has a non-trivial solution c = (c1 , . . . , cn+1 ). For that c,
c1 u1 + ⋅ ⋅ ⋅ + cn+1 un+1 = 0V ,
proving that the set {u1 , . . . , un+1 } ⊂ S is linearly-dependent, hence so is
S (which contains a linearly-dependent set). It follows that any linearly-
independent set of vectors contains at most n vectors.
We may rewrite this proof in matrix form. Let the sequence (v1 , . . . , vn ) be
an ordered n-tuple containing the vectors in G, and let (u1 , . . . , um ), m > n,
be any sequence of m vectors. Since the {vi } are a generating set, there
exists an n × m matrix A, such that
(u1 . . . um ) = (v1 . . . vn ) A.
Since m > n, the homogeneous system Ac = 0 has a non-trivial solution c, and for that c,
(u1 . . . um ) c = (v1 . . . vn ) Ac = 0V ,
proving that (u1 , . . . , um ) are linearly-dependent.
Proof : Let {v1 , . . . , vn } and {u1 , . . . , um } be two bases for V . By the pre-
vious theorem, since bases are by definition generating sets and linearly-
independent, m ≤ n and n ≤ m, which completes the proof. n
This last corollary implies that we can associate with every non-zero finitely-
generated vector space a natural number which is the cardinality of any of
its bases. This number is called the dimension (מימד) of V , and is denoted
dimF V.
Note the explicit mention of the field F, as the same set of vectors may con-
stitute a vector space of different dimension depending on the field. The zero
space (which contains no independent sets of vectors) is assigned dimension
zero,
dimF {0V } = 0.
It follows from the last theorem that if dim V = n, then every set of vectors
containing more than n elements is linearly-dependent, and that no set of
vectors containing fewer than n elements can span V (see exercises).
dimR C = 2.
Generally, for any field F, the vector space (Fn , +, F, ⋅) has dimension n,
dimF Fn = n.
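Note, for instance, that dimC Cn = n, whereas dimR Cn = 2n: a basis of Cn over R is,
e.g., {e1 , ıe1 , . . . , en , ıen } (compare with Solution 3.50 for n = 2).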
Proof : This was essentially proved in Proposition 3.28, but for the sake of
completeness, we repeat the proof. Suppose, by contradiction that S ∪ {v}
is linearly-dependent. Then, there exist distinct vectors v1 , . . . , vn ∈ S and
scalars c1 , . . . , cn , c, not all zero, such that
c1 v1 + ⋅ ⋅ ⋅ + cn vn + cv = 0V .
If c = 0F , this contradicts the linear-independence of S; hence c ≠ 0F , and
v = −c−1 (c1 v1 + ⋅ ⋅ ⋅ + cn vn ) ∈ Span S, a contradiction. n
Proof : One direction is immediate. For the other direction, suppose that
dimF W = dimF V . If W < V , then there exists a v ∈ V ∖ W . Let L be a
maximally-independent set for W ; then L ∪ {v} is independent, proving that
L is not maximally-independent for V , hence dimF W < dimF V , which is a
contradiction. n
Finally, a statement reminiscent of the inclusion-exclusion principle:
Thus,
c1 w1 + ⋅ ⋅ ⋅ + cm wm = −(a1 u1 + ⋅ ⋅ ⋅ + ak uk ) − (b1 v1 + ⋅ ⋅ ⋅ + bn vn ).
The left-hand side then belongs to both subspaces, hence there exist scalars di , such that
c1 w1 + ⋅ ⋅ ⋅ + cm wm = d1 u1 + ⋅ ⋅ ⋅ + dk uk ,
or
c1 w1 + ⋅ ⋅ ⋅ + cm wm − d1 u1 − ⋅ ⋅ ⋅ − dk uk = 0,
contradicting the fact that the vectors {u1 , . . . , uk }∪{w1 , . . . , wm } are linearly-
independent. Hence, S is linearly-independent, and therefore a basis. n
We end this section with a very important theorem:
Theorem 3.40 Let A ∈ Mn (F). Then, A is invertible if and only if its rows
form a linearly-independent set in Fnrow .
Proof : Suppose first that the rows of A are linearly-independent. Since dimF Fnrow = n,
they form a basis for Fnrow , hence for every i = 1, . . . , n there exists a row vector
[xi1 ⋯ xin ], such that
             ⎡ Row1 (A) ⎤
[xi1 ⋯ xin ] ⎢    ⋮     ⎥ = [0 ⋯ 1 ⋯ 0] ,
             ⎣ Rown (A) ⎦
where the right-hand side is a row of zeros except for a 1 in the i-th entry. Stacking
these rows into a matrix X, we obtain
⎡x11 ⋯ x1n⎤ ⎡ Row1 (A) ⎤   ⎡1      ⎤
⎢ ⋮   ⋮  ⋮⎥ ⎢    ⋮     ⎥ = ⎢  ⋱    ⎥
⎣xn1 ⋯ xnn⎦ ⎣ Rown (A) ⎦   ⎣      1⎦
i.e., XA = I, proving that A is invertible. Conversely, suppose that A is invertible, and
let x ∈ Fnrow satisfy xA = 0, i.e., a vanishing linear combination of the rows of A with
coefficients x. Multiplying by A−1 on the right, x = (xA)A−1 = 0;
i.e., the only vanishing linear combination of the rows of A is the trivial one,
proving that the rows of A are linearly-independent. n
Exercises
(easy) 3.58 Show that the vector space (M2×2 (F), +, F, ⋅) has dimension
four. More generally, show that the vector space (Mm×n (F), +, F, ⋅) has di-
mension mn.
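Solution 3.58 (sketch): For i = 1, . . . , m and j = 1, . . . , n, let Eij ∈ Mm×n (F) be the matrix
whose (i, j)-th entry is 1F and all other entries are 0F . Every A = (aij ) satisfies
A = ∑_{i,j} aij Eij ,
so the Eij generate Mm×n (F); and a combination ∑_{i,j} cij Eij has (i, j)-th entry cij , so it
vanishes only if all cij = 0F . Hence the mn matrices Eij form a basis, and dimF Mm×n (F) = mn.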
Solution 3.60: We know that A is invertible if and only if the homogeneous system
AX = 0 has only trivial solutions, and if and only if the non-homogeneous system AX = b
is consistent for every b ∈ Fncol . Suppose that the columns of A are linearly-independent;
then they form a basis, so that there exists an x ∈ Fncol for which
                      ⎡x1⎤   ⎡1⎤
[Col1 (A) ⋯ Coln (A)] ⎢x2⎥ = ⎢0⎥
                      ⎢ ⋮⎥   ⎢⋮⎥
                      ⎣xn⎦   ⎣0⎦
Repeating this n times, there exist n such columns, which we assemble into a matrix:
                      ⎡x11 ⋯ x1n⎤   ⎡1      ⎤
[Col1 (A) ⋯ Coln (A)] ⎢ ⋮   ⋮  ⋮⎥ = ⎢  ⋱    ⎥ ,
                      ⎣xn1 ⋯ xnn⎦   ⎣      1⎦
from which we deduce that A is invertible. Conversely, suppose that A is invertible. Then,
to every b ∈ Fncol corresponds an x ∈ Fncol such that Ax = b, proving that the columns of A
generate Fncol . Since dimF Fncol = n, any generating set of size n is linearly-independent.
Solution 3.62:
(a) False. Let V = R4 and
U = W = Span{e1 , e2 , e3 }.
Then, dimR V = 4 and dimR U = dimR W = 3, so that 2 + dimR V ≤ dimR U + dimR W .
On the other hand, U + W = U = W ≠ V .
(b) True. In fact, this implies that dimF U = dimF W = dimF V , i.e., U = W = V , and in
particular U + W = V .
(c) True, for if V = U + W , then
and
W = {x ∈ R4 ∶ x1 + 3x2 + x3 − x4 = 0, x2 − 3x3 + 2x4 = 0} .
What is dimR (U + W )?
Span{(1, 0, −1, −2), (−1, −1, 0, 2), (1, 2, 1, −1), (−10, 3, 1, 0), (7, −2, 0, 1)} = U + W,
so let’s look at it. If those five vectors are a generating set for R4 , then U + W = R4 . So
the question is whether the system of equations
                      ⎡x1⎤
⎡ 1 −1  1 −10  7⎤ ⎢x2⎥   ⎡a⎤
⎢ 0 −2  2   3 −2⎥ ⎢x3⎥ = ⎢b⎥
⎢−1  0  1   1  0⎥ ⎢x4⎥   ⎢c⎥
⎣−2  2 −1   0  1⎦ ⎣x5⎦   ⎣d⎦
is always solvable. You know how to proceed from here.
Let V be a finitely-generated vector space with
dimF V = n.
Show that any set of vectors containing fewer than n vectors does not span V .
(harder) 3.65 Consider the vector space (R, +, Q, ⋅) (i.e. the vectors are
real numbers, the scalars are rational numbers, with the operations of vector
addition and scalar multiplication defined as usual in R). Prove that this
vector space is not finitely-generated. (Hint: start by convincing yourself
that {1} is not a basis for this space; the argument is based on the fact that
Q is countable, whereas R is not.)
Solution 3.65: Suppose that (R, +, Q, ⋅) was finitely-generated. It would imply that
there exists a finite number of real numbers a1 , . . . , an generating R. Consider Span(a1 , . . . , an );
it is a countable set, whereas R is uncountable—contradiction.
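Indeed, Span(a1 , . . . , an ) = {q1 a1 + ⋅ ⋅ ⋅ + qn an ∶ qi ∈ Q} is the image of the countable set
Qn , hence countable.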
Solution 3.66: The answer is n, since W0 < W1 < ⋅ ⋅ ⋅ < Wm implies that
0 ≤ dimF W0 < dimF W1 < ⋅ ⋅ ⋅ < dimF Wm ≤ n.
To see that n is attained, take a basis
B = (v1 . . . vn ) ,
and set
W0 = ⟨∅⟩ , W1 = ⟨v1 ⟩ , W2 = ⟨v1 , v2 ⟩ , and so on.
The row-rank (דרגה לפי שורות) of a matrix is the dimension of its row space,
whereas its column-rank (דרגה לפי עמודות) is the dimension of its column
space. Even though these two spaces are seemingly unrelated, it turns out
that
dimF R(A) = dimF C (A).
This joint dimension is called the rank (דרגה) of the matrix A.
We start with an example, examining how row reduction affects
dimF R(A) and dimF C (A).
Example: Note that
⎡0 1 0⎤ ⎡1 2 0 −1⎤   ⎡0 0 1 4⎤
⎢2 2 1⎥ ⎢0 0 1  4⎥ = ⎢2 4 2 6⎥ ,
⎣3 2 1⎦ ⎣0 0 0  0⎦   ⎣3 6 2 5⎦
where we denote the three matrices by Q = P −1 , R and A, respectively; R is a
row-reduced matrix which is row-equivalent to A.
Consider first the row space of R. It is spanned by two non-zero rows, hence
its dimension is at most 2; it is in fact equal to 2, because
a [1 2 0 −1] + b [0 0 1 4] = [0 0 0 0]
implies a = b = 0. Consider next the column space of R. Since the third row of R
vanishes, every column of R has a zero third entry, hence the dimension of C (R) is
at most 2. Its dimension is 2 because the first and third columns are linearly-
independent. Thus,
dimR R(R) = dimR C (R) = 2.
The question is why these are also the dimensions of the row and column
spaces of A. The easier part to see is the row space. The rows of A are
linear combinations of the rows of R and vice-versa, hence,
{Rowi (A) ∶ i = 1, 2, 3} ⊂ R(R) and {Rowi (R) ∶ i = 1, 2, 3} ⊂ R(A),
from which we deduce that R(A) = R(R), hence
dimR R(A) = 2.
The more surprising fact is that the column space of A has the same dimen-
sion as the column space of R, even though the two spaces are not identical.
The second column of R equals twice its first column,
Col2 (R) = 2 Col1 (R),
and the same holds for the columns of A,
Col2 (A) = 2 Col1 (A).
Likewise,
Col4 (R) = 4 Col3 (R) − Col1 (R),
but also,
Col4 (A) = 4 Col3 (A) − Col1 (A).
In other words, the relations between the columns of A are the same as the
relations between the columns of R.
Look again at the identity
⎡0 1 0⎤ ⎡1 2 0 −1⎤   ⎡0 0 1 4⎤
⎢2 2 1⎥ ⎢0 0 1  4⎥ = ⎢2 4 2 6⎥ .
⎣3 2 1⎦ ⎣0 0 0  0⎦   ⎣3 6 2 5⎦
   Q         R            A
It states that every column of A is obtained by applying Q to the corresponding column
of R, Colj (A) = Q Colj (R). Since Q is invertible, a set of columns of A is linearly-
independent if and only if the corresponding columns of R are, and the columns of A
satisfy exactly the same linear relations as the columns of R. Hence dimR C (A) = 2. ▲▲▲
3.5 Coordinates
3.5.1 Motivation
Consider the vector space (R2 , +, R, ⋅). It is not hard to verify that the set
S = {(1, 2), (2, 1)}
is a basis for R2 . The fact that S spans R2 implies that every vector (x, y) ∈ R2
can be written as a linear combination
(x, y) = a(1, 2) + b(2, 1)
for some a, b ∈ R. The fact that the vectors in S are independent, implies
that a and b are determined uniquely, as if the pairs of scalars a, b and c, d
satisfy
a(1, 2) + b(2, 1) = c(1, 2) + d(2, 1),
then
(a − c)(1, 2) + (b − d)(2, 1) = (0, 0),
which implies that a = c and b = d. This means that given the basis S, every
element in R2 can be identified with a pair of scalars, which are coefficients
of the basis vectors. For example,
(8, 7) = 2(1, 2) + 3(2, 1).
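(Indeed, 2(1, 2) + 3(2, 1) = (2 + 6, 4 + 3) = (8, 7).)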
This is shown in the following plot, where the vector (8, 7) is shown to be
twice the vector (1, 2) plus three times the vector (2, 1). Note also how the
two basis vectors define a grid.
[Figure: the vector (8, 7) drawn as 2 ⋅ (1, 2) + 3 ⋅ (2, 1) on the grid defined by the basis
vectors (1, 2) and (2, 1).]
Note that the only difference between an ordered basis and any old basis
is that its elements are ordered... also, a priori, not all the elements of a
sequence have to be distinct, but linear-independence implies at once that
all the elements in the sequence are distinct.
[Figure: the vector (3, 5) expressed in the ordered basis ((1, 1), (1, −1)).]
Proof : By definition,
[u]B = [a1 , . . . , an ]T and [v]B = [b1 , . . . , bn ]T
are the unique coordinate columns satisfying
u = B [u]B and v = B [v]B .
That is,
u = a1 v1 + ⋅ ⋅ ⋅ + an vn = (v1 . . . vn ) [a1 , . . . , an ]T ,
and
v = b1 v1 + ⋅ ⋅ ⋅ + bn vn = (v1 . . . vn ) [b1 , . . . , bn ]T .
Hence,
u + v = (a1 + b1 )v1 + ⋅ ⋅ ⋅ + (an + bn )vn = (v1 . . . vn ) [a1 + b1 , . . . , an + bn ]T ,
i.e., [u + v]B = [u]B + [v]B .
Example: Consider once again the vector space V = (R2 , +, R, ⋅) with the
ordered basis
B = ((1, 1), (1, −1)).
Let
u = (−1, 0) and v = (3, 5) hence u + v = (2, 5).
We proceed to calculate the coordinates,
u = ((1, 1) (1, −1)) [−1/2, −1/2]T
v = ((1, 1) (1, −1)) [4, −1]T
u + v = ((1, 1) (1, −1)) [7/2, −3/2]T ,
so that indeed
[u + v]B = [u]B + [v]B .
▲▲▲
Before we end this section, we emphasize its main result. The choice of an
ordered basis allows us to view vectors in V as matrices of coordinates. Both
V and Fncol are vector spaces over the same field F, but they are different
spaces. What we have is an identification (which really is a one-to-one and
onto function) of elements in V with elements in Fncol . What we proved is
that vector addition and scalar multiplication “respect” this identification:
for example, the column vector representing the sum of two vectors is the
sum of the column vectors representing each vector.
Exercises
(easy) 3.67 Consider the vector space (R2 , +, R, ⋅), let v = (3, 3), and let
B = ((1, 0), (1, 1)) and C = ((1, 2), (2, 1))
be ordered bases.
Solution 3.67:
(a) Since
(3, 3) = 0(1, 0) + 3(1, 1) and (3, 3) = 1(1, 2) + 1(2, 1),
it follows that
[v]B = [0, 3]T and [v]C = [1, 1]T .
Moreover, clearly,
[(1, 2)]C = [1, 0]T and [(2, 1)]C = [0, 1]T .
B = (1, X, X 2 ) C = (X 2 , X, 1) and D = (X + 1, X 2 , X − 1)
be ordered bases.
(intermediate) 3.70 Let V = (C3 , +, C, ⋅). What are the coordinates of the
vector (1, 0, 1) in the ordered basis
B = ((2ı, 1, 0) (2, −1, 1) (0, 1 + ı, 1 − ı)) ?
[(a, b, c)]B
for arbitrary a, b, c ∈ R.
Solution 3.72:
(a) By definition, every set of vectors generates its own span, so we only need to show that
v1 and v2 are linearly-independent, which is the case if they are not proportional
to each other, and they are not (just look at the second entry).
(b) To show that u1 and u2 form a basis for the two-dimensional space W it suffices
to show that they are an independent set in W . That they are independent is
immediate (look at the third entry). To show that they are in W , we note (after a
direct calculation) that
u1 = ı v1 + v2 and u2 = (2 − ı) v1 + ı v2 .
(c) For this, we invert the relations we’ve just found. Let’s be clever, and write this in
matrix form,
(u1 u2 ) = (v1 v2 ) [ı 2 − ı; 1 ı] .
Since we know how to invert a 2 × 2 matrix, we find that
(u1 u2 ) ⋅ (1/(ı − 3)) [ı −2 + ı; −1 ı] = (v1 v2 ) .
Thus,
[v1 ](u1 ,u2 ) = (1/(ı − 3)) [ı, −1]T and [v2 ](u1 ,u2 ) = (1/(ı − 3)) [−2 + ı, ı]T .
Let B = (u1 . . . un ) and C = (v1 . . . vn )
be two ordered bases. What can be said about the relation between coordinates
relative to the two bases? In other words, for v ∈ V , what is the relation
between [v]B and [v]C ?
Since B is a basis, each of the vectors vi in the basis C has a unique repre-
sentation as a linear combination of the basis vectors ui . In other words, for
every i = 1, . . . , n, there exist n scalars p1i , . . . , pni , such that
vi = p1i u1 + ⋅ ⋅ ⋅ + pni un ,
i.e.,
vi = (u1 . . . un ) [p1i , . . . , pni ]T .
In fact, that column vector is nothing but the coordinate matrix of vi relative
to the basis B,
Coli (P ) = [vi ]B ,
where P is the n × n matrix whose entries are pij . Since this holds for every
i = 1, . . . , n,
(v1 . . . vn ) = (u1 . . . un ) ⎡p11 . . . p1n⎤
                             ⎢ ⋮    ⋮   ⋮ ⎥ ,
                             ⎣pn1 . . . pnn⎦
namely
C = B P.
Similarly, since C is a basis, each ui has a unique representation
ui = qi1 v1 + ⋅ ⋅ ⋅ + qin vn ,
namely,
Coli (Q) = [ui ]C ,
and we obtain that
(u1 . . . un ) = (v1 . . . vn ) Q,
or
B = C Q.
Combining the two, for every i = 1, . . . , n,
vi = ∑_{j} pji ( ∑_{k} qjk vk ) = ∑_{k} ( ∑_{j} pji qjk ) vk ,
namely,
C = C ⋅ QP,
or
C ⋅ (QP − I) = 0.
Since the basis vectors in C are all independent, and since multiplication by
(QP − I) yields n linear combinations of the basis vectors C, these combina-
tions vanish only if each column of QP − I is identically zero, from which we
deduce that
QP = I,
i.e., P ∈ GLn (F) and Q = P −1 . That is, the transition between bases is
through a right-multiplication by an invertible n × n matrix. The matrices P
and Q are called transition matrices (מטריצות מעבר).
Let now v ∈ V . By definition,
v = B [v]B and v = C [v]C
Since C = B P , it follows that
v = (B P ) [v]C = B (P [v]C ),
which implies that
[v]B = P [v]C .
Likewise, since B = C Q,
v = (C Q) [v]B = C (Q [v]B ),
from which we deduce that
[v]C = Q [v]B .
To summarize: Coli (P ) = [vi ]B .
Furthermore,
B P = C and C Q = B,
and for every v ∈ V ,
[v]B = P [v]C and [v]C = Q [v]B .
Example: Consider again the ordered bases B = ((1, 2) (2, 1)) and C = ((1, 1) (1, −1))
of (R2 , +, R, ⋅). [Figure: the four basis vectors drawn in the plane.]
We verify that
[(1, 1)]B = [1/3, 1/3]T and [(1, −1)]B = [−1, 1]T ,
so that
((1, 1) (1, −1)) = ((1, 2) (2, 1)) ⎡1/3 −1⎤
                                   ⎣1/3  1⎦
i.e., C = B P . Likewise,
[(1, 2)]C = [3/2, −1/2]T and [(2, 1)]C = [3/2, 1/2]T ,
so that
((1, 2) (2, 1)) = ((1, 1) (1, −1)) ⎡ 3/2  3/2⎤
                                   ⎣−1/2  1/2⎦
i.e., B = C Q. Indeed,
⎡1/3 −1⎤−1   ⎡ 3/2  3/2⎤
⎣1/3  1⎦   = ⎣−1/2  1/2⎦
Let v = (3, 4). A direct calculation shows that
[v]B = [5/3, 2/3]T and [v]C = [7/2, −1/2]T .
▲▲▲
Example: Consider the vector space (R2 , +, R, ⋅) and the ordered bases
B = (u1 u2 ) = ((cos α, sin α) (− sin α, cos α)) ,
and
C = (v1 v2 ) = ((cos β, sin β) (− sin β, cos β)) ,
for some α, β ∈ R (convince yourself geometrically that these are ordered
bases). You may verify that for every (x, y) ∈ R2 ,
[(x, y)]B = [x cos α + y sin α, −x sin α + y cos α]T .
In particular,
[v1 ]B = [cos β cos α + sin β sin α, − cos β sin α + sin β cos α]T = [cos(β − α), sin(β − α)]T ,
and
[v2 ]B = [− sin β cos α + cos β sin α, sin β sin α + cos β cos α]T = [− sin(β − α), cos(β − α)]T .
That is, C = B ⋅ P , where
P = ⎡cos(β − α) − sin(β − α)⎤
    ⎣sin(β − α)   cos(β − α)⎦
and
P −1 = ⎡ cos(β − α)  sin(β − α)⎤
       ⎣− sin(β − α) cos(β − α)⎦
so that
[(x, y)]C = P −1 [(x, y)]B .
▲▲▲
Exercises
(a) Find the matrix P whose columns are the coordinates of the vectors in
C relative to the basis B.
(b) Show directly that P is invertible and find its inverse.
(c) Find the matrix whose columns are the coordinates of the vectors in B
relative to the basis C.
Solution 3.75:
(a) Every set of vectors generated, by definition, its own span. If they are linearly-
independent in V they are in particular linearly-independent within their own span,
hence form a basis for their span.
(b) Since the dimension of W is three, it suffices to show that those three vectors are
linearly-independent. Suppose that
a(v1 + v2 ) + b(v2 − v3 ) + c(v1 + v2 + v3 ) = 0,
i.e.,
(a + c)v1 + (a + b + c)v2 + (c − b)v3 = 0.
Since v1 , v2 , v3 are linearly-independent, it follows that
a + c = 0, a + b + c = 0, and c − b = 0,
linearly-independent.
(c) The solution is
(v1 v2 v3 ) = (v1 + v2 v2 − v3 v1 + v2 + v3 ) ⎡ 2 −1 −1⎤
                                             ⎢−1  1  0⎥
                                             ⎣−1  1  1⎦
Chapter 4
Linear Forms
A function ` ∶ V → F is called a linear form if
`(u + v) = `(u) + `(v)
and
`(a v) = a `(v)
for all u, v ∈ V and a ∈ F.
Example: The zero function is a linear form, since
`(u + v) = 0F = `(u) + `(v)
and
`(a v) = 0F = a `(v).
This linear form is called the zero form (תבנית האפס). ▲▲▲
Example: Let
B = (v1 , . . . , vn )
be an ordered basis for V . For i = 1, . . . , n, define `i ∶ V → F by `i (v) = ([v]B )i ,
the i-th coordinate of v relative to B. Then,
`i (u + v) = ([u + v]B )i = ([u]B + [v]B )i = `i (u) + `i (v),
where we used here Proposition 3.46. Note the different types of addition:
in the first two terms it is addition in V , in the third term it is addition in
Fncol , and in the last two terms it is addition in F.
Likewise, using once again Proposition 3.46, for u ∈ V and c ∈ F,
`i (c u) = ([c u]B )i = (c [u]B )i = c `i (u),
so each `i is a linear form. Finally, [vj ]B is the j-th standard column vector,
i.e., `i (vj ) = δji . This particular set of linear forms will have an important
role shortly. ▲▲▲
Example: Let V = (Mn (F), +, F, ⋅) and define the function known as the
trace (עקבה) of a matrix,
tr(A) = a11 + a22 + ⋅ ⋅ ⋅ + ann = ∑ᵢ aii .
The trace is a linear form on V . ▲▲▲
Example: Let S be a non-empty set (it doesn’t need to have any other
structure than being a set) and consider the set V = FS of all functions
f ∶ S → F. We have seen that V is a vector space over F with respect to
the natural operations of addition and scalar multiplication of field-valued
functions (make sure you remember the vectorial structure of FS ). Let s ∈ S,
and define the function Evals ∶ V → F,
Evals (f ) = f (s).
Then Evals is a linear form: indeed,
Evals (f + g) = (f + g)(s) = f (s) + g(s) = Evals (f ) + Evals (g),
and
Evals (c f ) = (c f )(s) = c f (s) = c Evals (f ).
▲▲▲
Proposition 4.2 Let ` be a linear form on a vector space (V, +, F, ⋅). Then
for every v1 , . . . , vn ∈ V and a1 , . . . , an ∈ F,
`(a1 v1 + ⋅ ⋅ ⋅ + an vn ) = a1 `(v1 ) + ⋅ ⋅ ⋅ + an `(vn ).
Proposition 4.3 Let ` be a linear form on a vector space (V, +, F, ⋅). Then
`(0V ) = 0F .
Proof : Let v ∈ V be arbitrary. Then, using the fact that 0F v = 0V and the
properties of `,
`(0V ) = `(0F v) = 0F `(v) = 0F .
n
An important fact about linear forms (in finitely-generated vector spaces) is
that they are completely determined by their action on basis vectors. We
establish this in two separate propositions:
B = (v1 . . . vn )
be an ordered basis for V . Then, for every set c1 , . . . , cn of scalars there exists
a linear form `, such that
`(vi ) = ci for every i = 1, . . . , n.
Proof : There really is only one way to define such a functional. Since every
v ∈ V has a unique representation as
v = a1 v1 + ⋅ ⋅ ⋅ + an vn ,
we must set `(v) = a1 c1 + ⋅ ⋅ ⋅ + an cn , and one verifies directly that this defines
a linear form. n
Proposition 4.5 Let
B = (v1 . . . vn )
be an ordered basis for V . If `, `′ ∈ V ∨ satisfy `(vi ) = `′ (vi ) for every i = 1, . . . , n,
then ` = `′ .
Exercises
(easy) 4.1 Prove using induction that for a linear form ` on a vector space
V,
`(a1 v1 + ⋅ ⋅ ⋅ + an vn ) = a1 `(v1 ) + ⋅ ⋅ ⋅ + an `(vn )
for every a1 , . . . , an ∈ F and v1 , . . . , vn ∈ V .
Solution 4.1: For n = 1 the assertion is that `(a v) = a `(v), which holds by definition.
Assume that the assertion holds for n = k; then
`(a1 v1 + ⋅ ⋅ ⋅ + ak+1 vk+1 ) = `(a1 v1 + ⋅ ⋅ ⋅ + ak vk ) + ak+1 `(vk+1 ) = a1 `(v1 ) + ⋅ ⋅ ⋅ + ak+1 `(vk+1 ).
Solution 4.2:
(a) The vectors (v1 , v2 , v3 ) form a basis. Every (x, y, z) ∈ R3 can be written as
hence
f (x, y, z) = (2x − 2y − z)f (v1 ) + (x − y − z)f (v2 ) + (x − 2y − z)f (v3 )
= (2x − 2y − z) − 2(x − y − z) + 3(x − 2y − z)
= 3x − 6y − 2z.
and define ` ∈ V ∨ by
Solution 4.4: By the previous exercise (with v replaced by v − u), there exists a linear
functional ` ∈ V ∨ satisfying `(u − v) ≠ 0, and since ` is linear `(u) ≠ `(v).
Solution 4.5: If ` = 0V ∨ then m = 0V ∨ (and vice-versa) and the claim holds. Otherwise,
there exists a vector u ∈ V such that `(u) ≠ 0F . Let v ∈ V be arbitrary. Then,
` (v − (`(v)/`(u)) u) = 0F .
It follows that
m (v − (`(v)/`(u)) u) = 0F ,
which precisely means that
m(v) = (`(v)/`(u)) m(u),
namely,
m = (m(u)/`(u)) ` .
we define, for P = p0 + p1 X + ⋅ ⋅ ⋅ + pn X n ,
∫_a^b P (x) dx = ∑_{i=0}^{n} (pi /(i + 1)) (b^{i+1} − a^{i+1} ).
Show that P ↦ ∫_a^b P (x) dx
is a linear form. Note: you are not expected to know anything about
integrals; just follow the definitions.
it is a subset of Func(V, F), which comprises all (i.e., not necessarily
linear) functions f ∶ V → F. Recall that Func(V, F) is itself a vector space
over F with respect to the function addition
(f + g)(v) = f (v) + g(v)
and the scalar multiplication
(c f )(v) = c f (v).
and
(a `)(c u) = a `(c u)
= a (c `(u))
= c (a `(u))
= c (a `)(u),
⟨⋅, ⋅⟩ ∶ V ∨ × V → F,
where
⟨`, v⟩ = `(v).
Example: For V = Fncol we have seen that V ∨ can be identified with Fnrow :
every a ∈ Fnrow defines a unique `a ∈ V ∨ by
`a (v) = a ⋅ v.
It is customary to write
(Fncol )∨ ≃ Fnrow ,
where the ≃ sign means that the two spaces can be identified (more on that
later). ▲▲▲
Theorem: Let
B = (v1 . . . vn )
be an ordered basis for V . Then, the n-tuple of coordinate forms
B∨ = (`1 , . . . , `n )
is an ordered basis for V ∨ , called the dual basis (בסיס דואלי) of B, where
`i is the unique linear form satisfying
`i (vj ) = δji ,
or equivalently
`i (v) = ([v]B )i .
As a result,
dimF V ∨ = dimF V.
Proof : Suppose that
a1 `1 + ⋅ ⋅ ⋅ + an `n = 0V ∨ .
Applying both sides to vj yields aj = 0F . Since this holds for every j = 1, . . . , n, it
follows that the linear combination of the `i ’s is trivial, namely, the linear forms `i are
linearly-independent.
It remains to show that B∨ is spanning. We will show that any ` ∈ V ∨ can
be represented as
` = `(v1 ) `1 + ⋅ ⋅ ⋅ + `(vn ) `n ,
i.e., it is a linear combination of the linear forms `i (note that `(vi ) are
scalars). By Proposition 4.5 it suffices to verify that both sides yield the
same scalar when acting on basis vectors vj . Indeed,
(`(v1 ) `1 + ⋅ ⋅ ⋅ + `(vn ) `n )(vj ) = `(vj ) `j (vj ) = `(vj ),
since `i (vj ) = δji . n
Example: Let E = (e1 . . . en ) be the standard basis of Fncol . Its dual basis
E∨ = (e1 , . . . , en ) consists of the forms
ei (v) = ([v]E )i = xi ,
that is, the i-th linear form in the dual standard basis extracts the i-th coor-
dinate of a vector. ▲▲▲
Since V ∨ is a vector space and since B∨ is a basis for V ∨ , every linear form
in V ∨ can be represented using coordinates. Every ` ∈ V ∨ has a unique
representation
` = c1 `1 + ⋅ ⋅ ⋅ + cn `n = [c1 . . . cn ] B∨ ,
where [`]B∨ = [c1 . . . cn ] ∈ Fnrow is the coordinate matrix; here B∨ is viewed as a
column of linear forms. We have just proved that dimF V ∨ = dimF V .
Can we express the scalar `(v) obtained by the action of the linear form on
the vector using their respective coordinates?
Let us denote the coordinates of v and ` by
v = a1 v1 + ⋅ ⋅ ⋅ + an vn and ` = b1 `1 + ⋅ ⋅ ⋅ + bn `n ,
namely,
[v]B = [a1 , . . . , an ]T and [`]B∨ = [b1 . . . bn ] .
Then,
`(v) = ∑_{i} bi `i ( ∑_{j} aj vj ) = ∑_{i} ∑_{j} bi aj `i (vj ) = ∑_{i} ∑_{j} bi aj δji = ∑_{i} bi ai .
Consider the right-hand side; it is the product of the row vector [`]B∨ and
the column vector [v]B .
We have just proved the following: let
B = (v1 . . . vn )
be an ordered basis for V with dual basis B∨ = (`1 , . . . , `n ). Then, for every ` ∈ V ∨
and v ∈ V ,
`(v) = [`]B∨ [v]B .
We have seen that given an ordered basis B = (v1 , . . . , vn ) and its dual
B∨ = (`1 , . . . , `n ) in a finitely-generated vector space, every linear form ` ∈ V ∨
can be represented as
n
` = ∑ `(vi ) `i .
i=1
This representation has an analog for vectors: every vector v ∈ V is given by
n
v = ∑ `i (v) vi ,
i=1
Proposition: Let B and C = B P be two ordered bases of V , with dual bases
B∨ = (`1 , . . . , `n ) and C∨ = (m1 , . . . , mn ), viewed as columns of forms. Then, the
transition matrix from B∨ to C∨ is Q = P −1 ,
C∨ = Q B∨ .
Example: Consider once again the vector space (R2 , +, R, ⋅) endowed with
the two bases
B = ((1, 2) (2, 1)) and C = ((1, 1) (1, −1)) .
[Figure: the four basis vectors drawn in the plane.]
We have seen that
((1, 2) (2, 1)) = ((1, 1) (1, −1)) ⎡ 3/2  3/2⎤
                                   ⎣−1/2  1/2⎦
i.e., B = C Q.
Exercises
(easy) 4.7 Consider the vector space (R2 , +, R, ⋅). Find the ordered basis
dual to the ordered basis
B = ((3, 4) (5, 7)) .
Solution 4.7: Denote v1 = (3, 4) and v2 = (5, 7). Every (x, y) ∈ R2 can be written as
(x, y) = (7x − 5y)v1 + (3y − 4x)v2 .
The linear forms `1 and `2 are determined by
`1 (x, y) = (7x − 5y) `1 (v1 ) + (3y − 4x)`1 (v2 ) = 7x − 5y
`2 (x, y) = (7x − 5y) `2 (v1 ) + (3y − 4x)`2 (v2 ) = 3y − 4x.
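As a check, `1 (v1 ) = 7 ⋅ 3 − 5 ⋅ 4 = 1 and `1 (v2 ) = 7 ⋅ 5 − 5 ⋅ 7 = 0, while
`2 (v1 ) = 3 ⋅ 4 − 4 ⋅ 3 = 0 and `2 (v2 ) = 3 ⋅ 7 − 4 ⋅ 5 = 1, i.e., `i (vj ) = δji , as required of
a dual basis.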
Solution 4.8:
(a) Let (v1 , . . . , vn ) be a basis for V and let (`1 , . . . , `n ) be the dual basis. One direction
is immediate: if v = 0V then `(0V ) = 0F for every ` ∈ V ∨ . Conversely, suppose that
`(v) = 0F for all ` ∈ V ∨ . Write v in the basis,
v = a1 v1 + ⋅ ⋅ ⋅ + an vn .
For every j = 1, . . . , n,
0F = `j (v) = a1 `j (v1 ) + ⋅ ⋅ ⋅ + an `j (vn ) = aj ,
i.e., aj = 0F for every j = 1, . . . , n, which proves that v = 0V .
(b) Once again, one direction is trivial (in fact, by definition of the zero form). Suppose
that `(v) = 0 for all v ∈ V . Expanding ` in the dual basis,
` = b1 `1 + ⋅ ⋅ ⋅ + bn `n ,
and substituting vj ,
0F = `(vj ) = b1 `1 (vj ) + . . . , +bn `n (vj ) = bj ,
i.e., bj = 0F for every j = 1, . . . , n, which proves that ` = 0V ∨ .
(intermediate) 4.9 Consider the vector space (C3 , +, C, ⋅). Find the basis
dual to the ordered basis
Solution 4.10:
(a) We really solved it in the previous exercise (and the fact that the field was different
doesn’t matter for the sake of the calculation):
The solution is
P = ⎡ 1 1 2⎤
    ⎢ 0 1 2⎥
    ⎣−1 1 0⎦
(d) Now we need to write
⎛e1⎞     ⎛`1⎞
⎜e2⎟ = Q ⎜`2⎟ .
⎝e3⎠     ⎝`3⎠
The solution is
Q = ⎡ 1 1 2⎤
    ⎢ 0 1 2⎥
    ⎣−1 1 0⎦
(e) You may verify that
((1, 0, 0) (0, 1, 0) (0, 0, 1)) = ((1, 0, −1) (1, 1, 1) (2, 2, 0)) ⎡  1   −1   0  ⎤
                                                                 ⎢  1   −1   1  ⎥
                                                                 ⎣−1/2   1  −1/2⎦
(intermediate) Let A = (aij ) ∈ Mn (F) be an invertible matrix, let
(v1 . . . vn )
be an ordered basis for V , and define linear forms ϕ1 , . . . , ϕn via
ϕi (vj ) = aij for all i, j = 1, . . . , n.
(Recall that this defines the linear forms uniquely.) Show that the linear
forms ϕ1 , . . . , ϕn are linearly-independent. Try to relate this question to the
last three.
Solution: Suppose that c1 ϕ1 + ⋅ ⋅ ⋅ + cn ϕn = 0V ∨ . Applying both sides to vj yields
∑_{i} ci aij = 0F for every j = 1, . . . , n. We have a homogeneous linear system with
coefficient matrix aij ; since this matrix is invertible, the only solution is the trivial one,
proving that the linear forms ϕi are linearly-independent.
(a) Show that the functions `i are indeed well-defined for all v ∈ V , and
are linear forms.
(b) Show that the sequence B∨ is linearly-independent.
(c) Show that B∨ is not a basis for V ∨ . I.e., there exists an ` ∈ V ∨ which
is not in the span of B∨ . Hint: set `(vi ) = 1 for all i ∈ N.
Solution 4.14:
(a) One way to think of a countable basis is that every v ∈ V has a representation
v = ∑_{i=1}^{∞} ai vi ,
where this sum is not really infinite, as the coefficients ai vanish starting from some
i > n. Since the basis vectors are linearly-independent, this representation is unique.
The `i are therefore well-defined by `i (v) = ai . For such a v and w = ∑_{i=1}^{∞} bi vi ,
we have
v + w = ∑_{i=1}^{∞} (ai + bi ) vi ,
so that
`i (v + w) = ai + bi = `i (v) + `i (w),
and likewise for c ∈ F,
c v = ∑_{i=1}^{∞} (c ai ) vi ,
hence
`i (c v) = c ai = c `i (v).
(b) Let
a1 `1 + ⋅ ⋅ ⋅ + an `n = 0V ∨ .
Then for every j = 1, . . . , n,
a1 `1 (vj ) + ⋅ ⋅ ⋅ + an `n (vj ) = aj = 0F ,
proving that the linear forms `i are linearly-independent. Note that we used here
the fact that any linear combination of the `i ’s can be written as a linear combination of
all of them up to some i = n.
(c) To show that the `i ’s do not span V ∨ we consider the linear form defined by its
action on all basis vectors, `(vi ) = 1. Then, for
v = ∑_{i=1}^{∞} ai vi ,
`(v) is the sum of all coefficients. It is easy to see that this is a linear form. It is
however not spanned by the `i ’s, as any (finite!) linear combination of `i ’s fails to
“see” the coefficients beyond a certain value of i.
Example: Let S = {0V }, then the set of linear forms ` ∈ V ∨ satisfying that
`(v) = 0F for all v ∈ S, i.e., `(0V ) = 0F is the entirety of V ∨ , i.e.,
{0V }0 = V ∨ .
▲▲▲
Example: Let V = (R2 , +, R, ⋅) and let S = {(1, 0)}. Then,
S 0 = {` ∈ V ∨ ∶ `(1, 0) = 0F }.
Writing ` = a `1 + b `2 in the basis dual to the standard basis, the condition
`(1, 0) = 0F amounts to a = 0, i.e., S 0 = Span{`2 }. ▲▲▲
Example: Let V = (R2 , +, R, ⋅) and let S = {(1, 0), (0, 1)}. Then,
S 0 = {` ∈ V ∨ ∶ `(1, 0) = 0F and `(0, 1) = 0F }.
Using the same basis for V ∨ , we obtain that both a and b vanish, i.e.,
S 0 = {0V ∨ }.
▲▲▲
Look at the above three examples: first notice that the larger S is, the smaller
S 0 is. Second, in all instances S 0 turned out to be a linear subspace of V ∨ .
The next two propositions show that this is always the case:
Proposition: Let S, T ⊆ V . Then,
(a) If S ⊆ T then T 0 ≤ S 0 .
(b) S 0 = (Span S)0 .
Proof : Part (a) is immediate: a form annihilating every vector of T annihilates, in
particular, every vector of S. For (b), let ` ∈ S 0 and let v ∈ Span S, say
v = a1 v1 + ⋅ ⋅ ⋅ + an vn
with vi ∈ S. Then `(v) = a1 `(v1 ) + ⋅ ⋅ ⋅ + an `(vn ) = 0F , hence ` ∈ (Span S)0 , i.e.,
S 0 ⊆ (Span S)0 .
Conversely, since S ⊆ Span S, it follows from the first item that (Span S)0 ⊆
S 0 , proving that (Span S)0 = S 0 . n
Thus far, S was just any old set; consider now the case where S = W is a
subspace of V , in which case we have two subspaces, W ≤ V and W 0 ≤ V ∨ , of
spaces having the same dimension. As we now show, the dimensions of W
and W 0 are inter-related:
Theorem: Let W ≤ V . Then, dimF W + dimF W 0 = dimF V .
Proof : Suppose dimF V = n + k and dimF W = n. Take a basis
B = (w1 , . . . , wn , v1 , . . . , vk )
of V whose first n elements form a basis of W , and let
B∨ = (`1 , . . . , `n , m1 , . . . , mk )
be the dual basis. Every ` ∈ W 0 can be expanded as
` = (a1 `1 + ⋅ ⋅ ⋅ + an `n ) + (b1 m1 + ⋅ ⋅ ⋅ + bk mk ).
For every j = 1, . . . , n,
0F = `(wj ) = aj ,
proving that
` = b1 m1 + ⋅ ⋅ ⋅ + bk mk ,
i.e., (m1 , . . . , mk ) is a generating set for W 0 ; since it is also independent, it
is a basis for W 0 . n
Definition 4.14 Let V be a vector space and let L ⊆ V ∨ . The null space
(קבוצת האפסים) of L is the set of vectors
L0 = {v ∈ V ∶ `(v) = 0F for all ` ∈ L}.
Example: For L = {0V ∨ },
L0 = {v ∈ V ∶ 0V ∨ (v) = 0F } = V.
▲▲▲
`([x, y, z]T ) = x + y + z.
Then,
L0 = {([x, y, z]T ) ∈ F3col ∶ x + y + z = 0},
which we know how to express explicitly. In fact, we know that
L0 = {[−s − t, s, t]T ∶ s, t ∈ F} = Span {[−1, 1, 0]T , [−1, 0, 1]T } .
This example shows that the left-hand side of a linear equation of the type
we started this course with is really a linear form, and the solution of a
homogeneous equation is nothing but its null space. ▲▲▲
Example: Let V = (M2 (F), +, F, ⋅) and let
` ([a b; c d]) = a + d.
Then,
{`}0 = {[a b; c −a] ∶ a, b, c ∈ F} ,
or
{`}0 = Span {[1 0; 0 −1], [0 1; 0 0], [0 0; 1 0]} .
▲▲▲
The following three propositions are the analogs of Propositions 4.11–4.13:
Proposition 4.15 The null space of a set of linear forms is a vector sub-
space: let V be a vector space and let L ⊆ V ∨ , then
L0 ≤ V.
Proof : First, 0V ∈ L0 , since every linear form maps 0V to 0F . Let u, v ∈ L0 . Then,
`(u + v) = `(u) + `(v) = 0F for all ` ∈ L,
which implies that u + v ∈ L0 . For u ∈ L0 and a ∈ F,
`(a u) = a `(u) = 0F for all ` ∈ L,
i.e., a u ∈ L0 . n
(a) If L ⊆ M then M0 ≤ L0 .
(b) L0 = (Span L)0
Proof : Before we prove it formally, two observations: (i) the larger a set
of linear forms is, the more constraints are imposed on its null space, hence
its null space should be smaller. (ii) Think of L as a set of homogeneous
linear equations (on Fncol , just as an example; we haven’t even required V to
be finitely-generated); adding equations that are linear combinations of existing
equations does not alter the set of solutions. Formally, for (b): let v ∈ L0 and
let ` ∈ Span L, say
` = a1 `1 + ⋅ ⋅ ⋅ + an `n
with `i ∈ L. Then `(v) = a1 `1 (v) + ⋅ ⋅ ⋅ + an `n (v) = 0F , hence v ∈ (Span L)0 , i.e.,
L0 ⊆ (Span L)0 .
Conversely, since L ⊆ Span L, it follows from the first item that (Span L)0 ⊆
L0 , proving that (Span L)0 = L0 . n
Theorem: Let W ≤ V . Then, (W 0 )0 = W . Proof : We know that
dimF W 0 + dimF (W 0 )0 = dimF V,
and
dimF W + dimF W 0 = dimF V,
from which we conclude that W and (W 0 )0 have the same dimension.
It suffices then to show every vector in W is also in (W 0 )0 (actually, justify
this assertion formally).
By definition,
whereas
W 0 = {` ∈ V ∨ ∶ `(w) = 0F for all w ∈ W }.
So let w ∈ W . For every ` ∈ W 0 ,
`(w) = 0F ,
which means precisely that w ∈ (W 0 )0 , completing the proof. n
Proof : We prove the first item. One direction is obvious, U = W implies that
U 0 = W 0 . The other direction follows from the fact that U 0 = W 0 implies
that (U 0 )0 = (W 0 )0 , along with (4.1). The second item is left as an exercise.
n
Exercises
Solution 4.15: The set U is not empty because {0V ∨ }0 = V , hence 0V ∨ ∈ U . Let
`, m ∈ U . By definition, W ≤ {v ∈ V ∶ `(v) = 0F } and W ≤ {v ∈ V ∶ m(v) = 0F },
hence W ≤ {v ∈ V ∶ (` + m)(v) = 0F }, i.e., ` + m ∈ U . Now, for a ∈ F,
{v ∈ V ∶ `(v) = 0F } ⊆ {v ∈ V ∶ a `(v) = 0F },
hence
W ≤ {v ∈ V ∶ a `(v) = 0F },
i.e., a ` ∈ U .
{w}0 = {a `1 − a `2 ∶ a ∈ R}.
One direction is easy: if ` ∈ (W1 )0 + (W2 )0 , then there exist m1 ∈ (W1 )0 and m2 ∈ (W2 )0 ,
such that ` = m1 + m2 . Then for all w ∈ W1 ∩ W2 ,
(w1 , . . . , wk , u1 , . . . , up , v1 , . . . , vq , x1 , . . . , xr )
(`1 , . . . , `k , m1 , . . . , mp , s1 , . . . , sq , t1 , . . . , tr ).
(s1 , . . . , sq , t1 , . . . , tr )
(m1 , . . . , mp , s1 , . . . , sq , t1 , . . . , tr )
Solution 4.18: The annihilator of W is the set of all linear forms ` ∈ V ∨ satisfying
`(w) = 0 for all w ∈ W . It suffices to require that `(w) = 0 for all w in a generating set,
i.e.,
`(1, 2, −3, 4) = `(0, 1, 4, −1) = 0.
Let’s use the basis dual to the standard basis,
The conditions read
a + 2b − 3c + 4d = 0 and b + 4c − d = 0,
whose solution space is spanned by
{−6 `1 + `2 + `4 , −13 `1 + `3 + 4 `4 }.
W 0 = Span({`1 , `2 , `3 }).
i.e.,
⎡ 1 2  2 1⎤ ⎡w1⎤   ⎡0⎤
⎢ 2 0  0 1⎥ ⎢w2⎥ = ⎢0⎥
⎣−2 0 −3 3⎦ ⎢w3⎥   ⎣0⎦
            ⎣w4⎦
Solution: We know that
dimF L0 + dimF (L0 )0 = dimF V,
and
dimF L + dimF L0 = dimF V,
from which we conclude that dimF L = dimF (L0 )0 . We will be done if we prove that
L ≤ (L0 )0 , as a proper subspace has lower dimension than the space of which it is a
subspace.
Let ` ∈ L; by definition,
`(v) = 0 for all v ∈ L0 .
This means, by definition, that ` ∈ (L0 )0 , which completes the first part.
The second part now follows. Clearly, L = M implies that L0 = M0 . Conversely, if L0 = M0
then (L0 )0 = (M0 )0 , i.e., L = M .
SA = {v ∈ Fncol ∶ Av = 0Fmcol }
By Proposition 4.16,
SA = (R(A))0 ,
i.e., the set of solutions is the null space of the row space of A. Proposi-
tion 4.17 asserts that
dimF R(A) + dimF SA = n.
Recall that the dimension of the row space equals the dimension of the column
space, and that this dimension is called the rank of the matrix. Thus,
dimF SA = n − rank A.
Example: Let’s have a different look at the relation between equations and
solutions. Let V = F3col ; then V ∨ = F3row under the action through row-column
multiplication. We use the standard bases for V and V ∨ . Consider the linear
form
`(x) = [1 1 1] [x1 , x2 , x3 ]T = x1 + x2 + x3 .
The space of solutions, which is the null space of {`}, is
{`}0 = {[−s − t, s, t]T ∶ s, t ∈ F} = Span {[−1, 1, 0]T , [−1, 0, 1]T } ≤ F3col .
The equation represented by the linear form whose coordinates (relative to
the standard dual basis) are [1, 1, 1], induces a space of solutions, which is
the two-dimensional subspace computed above. Conversely, every linear form on F3col
can be expanded as
` = a1 e1 + a2 e2 + a3 e3 ,
and induces in this way a homogeneous linear equation.
Linear Transformations
Let V and W be vector spaces over a field F. The set Func(V, W )
is the space of functions with domain (תחום) V and codomain (טווח) W . But
just as with the linear forms on V , which are functions V → F, we delineate a
subset of all functions that “respect” the vector space structure:
Definition: A function f ∶ V → W is called a linear transformation if
f (u + v) = f (u) + f (v)
and
f (a v) = a f (v)
for all u, v ∈ V and a ∈ F. The set of all linear transformations from V to
W is denoted by HomF (V, W ).
Comments:
(a) Note once again how addition and scalar multiplication on the two sides
of an equation are operations on different spaces.
(b) Setting W = F, HomF (V, F) = V ∨ .
The following properties of linear transformation are easy to prove (cf. with
their analogs for linear forms):
Proposition 5.2 Let (V, +, F, ⋅) and (W, +, F, ⋅) be vector spaces and let f ∈
HomF (V, W ). Then,
(a) f (0V ) = 0W .
(b) For every v ∈ V , f (−v) = −f (v).
(c) For every v1 , . . . , vn ∈ V and a1 , . . . , an ∈ F,
f (a1 v1 + ⋅ ⋅ ⋅ + an vn ) = a1 f (v1 ) + ⋅ ⋅ ⋅ + an f (vn ).
Example: Linear forms are linear transformations HomF (V, F). ▲▲▲
Example: Let A ∈ Mm×n (F), and consider the functions f ∶ Fncol → Fmcol and
g ∶ Fmrow → Fnrow defined by
f (v) = Av and g(w) = wA.
Both are linear transformations. ▲▲▲
Proposition: Let
B = (v1 . . . vn )
be an ordered basis for V , and let w1 , . . . , wn ∈ W . Then, there exists a unique
f ∈ HomF (V, W ) satisfying f (vi ) = wi for every i = 1, . . . , n.
Proof : There really is only one way to define such a transformation. Since
every v ∈ V has a unique representation as
v = a1 v1 + ⋅ ⋅ ⋅ + an vn ,
we must define
f (v) = a1 w1 + ⋅ ⋅ ⋅ + an wn .
It remains to verify that f thus defined is linear. Let
u = a1 v1 + ⋅ ⋅ ⋅ + an vn and v = b1 v1 + ⋅ ⋅ ⋅ + bn vn .
Then,
u + v = (a1 + b1 ) v1 + ⋅ ⋅ ⋅ + (an + bn ) vn .
By the way we defined f ,
f (u) = a1 w1 + ⋅ ⋅ ⋅ + an wn
f (v) = b1 w1 + ⋅ ⋅ ⋅ + bn wn ,
and
f (u + v) = (a1 + b1 ) w1 + ⋅ ⋅ ⋅ + (an + bn ) wn ,
so that indeed f (u + v) = f (u) + f (v). We proceed similarly to show that
f (k v) = k f (v) for k ∈ F. n
The following complementing proposition asserts that there really was no
other way to define f : if
B = (v1 . . . vn )
is an ordered basis for V and f, g ∈ HomF (V, W ) agree on B, then for every
v = a1 v1 + ⋅ ⋅ ⋅ + an vn ,
linearity forces f (v) = a1 f (v1 ) + ⋅ ⋅ ⋅ + an f (vn ) = g(v), i.e., f = g.
Example: Let A ∈ Mm×n (F). The function f ∶ Fncol → Fmcol defined by
f (v) = Av
is a linear transformation.
▲▲▲
Hence,

\[
f(x, y) = \tfrac{1}{2}(3y - 4x)\, f(v_1) + \tfrac{1}{2}(2x - y)\, f(v_2)
= \tfrac{1}{2}(3y - 4x)(3, 2, 1) + \tfrac{1}{2}(2x - y)(6, 5, 4).
\]

For example,

f(1, 0) = −2(3, 2, 1) + (6, 5, 4) = (0, 1, 2).
▲▲▲
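As a sanity check, here is a small sketch (sympy assumed; the basis vectors v1 = (1, 2) and v2 = (3, 4) are inferred from the solution below) that recovers the formula for f from its values on a basis:

\begin{verbatim}
import sympy as sp

v1, v2 = sp.Matrix([1, 2]), sp.Matrix([3, 4])
fv1, fv2 = sp.Matrix([3, 2, 1]), sp.Matrix([6, 5, 4])

x, y = sp.symbols('x y')
# Coordinates (a, b) of (x, y) relative to (v1, v2):
a, b = sp.Matrix.hstack(v1, v2).solve(sp.Matrix([x, y]))
f_xy = (a * fv1 + b * fv2).applyfunc(sp.expand)

print(f_xy.T)                     # the general formula for f(x, y)
print(f_xy.subs({x: 1, y: 0}).T)  # [0, 1, 2], as computed above
\end{verbatim}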
Exercises
it follows that
f(x, y) = (−2x + 3/2 y) f(1, 2) + (x − 1/2 y) f(3, 4) = (−2x + 3/2 y)(3, 2, 1) + (x − 1/2 y)(6, 5, 4).
Solution 5.3: The answer to (b) is negative, because f only yields diagonal matrices in which the middle term is the sum of the two other terms. The answer to (c) is also negative, because f(a + bX) = 0W if and only if a = b = 0.
f (z, w) = z + w̄,
Solution 5.4: The answer to (a) is negative, because for (w, z), (s, t) ∈ C² and a ∈ C ∖ R (viewed as a scalar),

The answer to (b) is positive, because for (w, z), (s, t) ∈ C² and a ∈ R (viewed as a scalar),
where
v1 = (1, 0, 1), v2 = (1, 2, 1), v3 = (0, 1, 1), v4 = (2, 3, 3)?
If it does, write it explicitly; otherwise explain why not.
Solution 5.5: Let
u1 = v2 − v1 = (0, 2, 0), u2 = v3 − v1 = (−1, 1, 0) and u3 = v4 − v1 = (1, 3, 2).
By linearity
f (e1 ) = f (e2 ) = f (e3 ) = 0,
hence for every (x, y, z) ∈ R³,
so that
f (0, 2, 3) = 2 f (0, 1, 2) − f (0, 0, 1) = 2(1, 0) − (1, 1) = (1, −1).
The answer to the second item is negative, since f (1, 0, 0) cannot be recovered from the
data.
Solution 5.7: We need to show that the set S of vectors v ∈ V satisfying f(v) ∈ U is not just a subset of V, but a vector subspace. First, we claim that S is not empty, as
0V ∈ S; indeed f (0V ) = 0W ∈ U as U is a vector subspace of W . Second, if u, v ∈ S, then
f (u), f (v) ∈ U , hence
f (u + v) = f (u) + f (v) ∈ U,
proving that u + v ∈ S. Finally, if u ∈ S and a ∈ F, then
f (a u) = a f (u) ∈ U,
i.e., a u ∈ S.
(f + g)(v) = f(v) + g(v) and (a f)(v) = a f(v).
Proposition 5.5 Let V and W be vector spaces over a field F. The set
HomF (V, W ) is a linear subspace of Func(V, W ).
Proof : The set HomF (V, W ) is non-empty because it contains the zero map.
Let f, g ∈ HomF (V, W ) and b ∈ F; we need to show that f + g ∈ HomF (V, W )
and that b f ∈ HomF (V, W ). For all u, v ∈ V ,
(a) U + W = V .
(b) To every v ∈ V correspond unique u ∈ U and w ∈ W , such that
v = u + w.
We have already seen earlier in this course that these two conditions are
equivalent to the conditions that U + W = V and U ∩ W = {0V }.
That is, the polynomials with only odd powers and those with only even powers are complementary in the space of all polynomials. ▲▲▲
p1 ∶ V → V and p2 ∶ V → V,
by
p1 (v) = u1 and p2 (v) = u2 ,
where v = u1 + u2 is the unique decomposition of v as a sum of elements in
U1 , U2 . The operator p1 is called the projection on U1 parallel to U2 ; the
operator p2 is called the projection on U2 parallel to U1 .
Comments:
Example: Let V = R3 ,
so that
[Figure: a vector v, its projections p1(v) on U1 and p2(v) on U2, and the reflections S1(v), S2(v).]
▲▲▲
S1 ∶ V → V and S2 ∶ V → V,
by
S1 (v) = u1 − u2 and S2 (v) = u2 − u1 ,
where v = u1 + u2 is the unique decomposition of v as a sum of elements in
U1 , U2 .
u + v = (u1 + v1) + (u2 + v2) and a v = a v1 + a v2,

where u1 + v1, a v1 ∈ U1 and u2 + v2, a v2 ∈ U2,
hence by definition
Exercises
(easy) 5.10 Prove (possibly for the second time) that V = U ⊕ W if and
only if V = U + W and U ∩ W = {0V }.
Solution 5.10: Suppose first that V = U + W and U ∩ W = {0V }. We need to show that
every v ∈ V has a unique representation as u + w with u ∈ U and w ∈ W . Suppose that
v = u1 + w1 = u2 + w2 ,
u1 − u2 = w2 − w1 .
Since the left-hand side is in U and the right-hand side is in W , both vanish, i.e., u1 = u2
and w1 = w2 . Conversely, suppose that V = U ⊕ W . It follows at once that in particular
V = W + U . It remains to show that U ∩ W = {0V }. Suppose that there exists a v ∈ U ∩ W ,
v ≠ 0. Then,
v = v + 0V = 0V + v,
i.e., there are two different ways to write v as the sum of an element in U and an element
in W .
Solution 5.11: For the first part, R² = U + W, as every (x, y) ∈ R² can be written as

(x, y) = x(1, 0) + y(0, 1),

where x(1, 0) ∈ U and y(0, 1) ∈ W. Also, since all the elements in U are of the form (x, 0), x ∈ R, and all the elements in W are of the form (0, y), y ∈ R, U ∩ W = {0}. For the second part,

p1(x, y) = (x, 0), p2(x, y) = (0, y), S1(x, y) = (x, −y) and S2(x, y) = (−x, y).
Hence,

p1(x, y) = (y − x)(1, 2), p2(x, y) = (2x − y)(1, 1),

S1(x, y) = (y − x)(1, 2) − (2x − y)(1, 1) and S2(x, y) = (2x − y)(1, 1) − (y − x)(1, 2).
Hence,
Solution 5.14: (a) We have f (x, y, z) = (x, y, z) if and only if z = 0, i.e., for all vectors
of the form
(x, y, 0) = x e1 + y e2 .
(b) We have f (x, y, z) = −(x, y, z) if and only if x = y = 0, i.e., for all vectors of the form
(0, 0, z) = z e3 .
(c) Since p1 (x, y, z) = (x, y, 0) and p2 (x, y, z) = (0, 0, z), it follows that
\[
U = \left\{\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} : a, b \in \mathbb{R}\right\}
\quad\text{and}\quad
W = \left\{\begin{bmatrix} -c & 0 \\ c & d \end{bmatrix} : c, d \in \mathbb{R}\right\}.
\]
Solution 5.15: It is easy to see that U and W are not empty (both contain the zero
matrix) and are closed under vector addition and scalar multiplication, from which we
deduce that U, W ≤ V . Moreover, every element in V has a unique representation as
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a+c & b \\ 0 & 0 \end{bmatrix}
+ \begin{bmatrix} -c & 0 \\ c & d \end{bmatrix}.
\]
Hence,

\[
p_1\!\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} a+c & b \\ 0 & 0 \end{bmatrix}
\quad\text{and}\quad
p_2\!\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} -c & 0 \\ c & d \end{bmatrix}.
\]
The reflections S1 , S2 are then readily obtained.
and
W = {f ∈ Func(R, R) ∶ f (x) = −f (−x) for all x ∈ R}.
ker f = {v ∈ V ∶ f (v) = 0W }.
The image (תמונה) of f is the set of vectors w ∈ W for which there exists a vector v ∈ V such that w = f(v),
Proof : The set ker f is not empty, because it contains 0V . Let u, v ∈ ker f
and a ∈ F, i.e.,
f (u) = f (v) = 0W .
Then,
By the linearity of f ,
f (v1 + v2 ) = f (v1 ) + f (v2 ) = w1 + w2 ,
i.e., w1 + w2 ∈ Image f , and
f (a v1 ) = a f (v1 ) = a w1 ,
i.e., a w1 ∈ Image f, thus proving that Image f is a linear subspace of W. ∎
Likewise,
ker p1 = U2 ,
because if u2 ∈ U2 , then p1 (u2 ) = 0V , proving that
U2 ≤ ker p1 .
Conversely, if u ∈ ker p1, then u1 = p1(u) = 0V, hence u = u2 ∈ U2, i.e.,
ker p1 ≤ U2 .
▲▲▲
Recall that a function f ∶ V → W is called one-to-one (חד-חד-ערכית) (or injective) if f(u) = f(v) implies u = v. The following proposition relates the kernel of a linear transformation to its injectivity.
f (u − v) = f (u) − f (v) = 0W ,
Exercises
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix},
\]

f(v) = Av.

The answer is

\[
\ker f = \left\{\begin{bmatrix} -2t \\ t \end{bmatrix} : t \in \mathbb{R}\right\}.
\]

For the second part,

\[
\mathrm{Image}\, f = \left\{\begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} : x, y \in \mathbb{R}\right\}
= \left\{\begin{bmatrix} x + 2y \\ 3(x + 2y) \end{bmatrix} : x, y \in \mathbb{R}\right\}
= \left\{\begin{bmatrix} t \\ 3t \end{bmatrix} : t \in \mathbb{R}\right\}.
\]
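For reference, the same kernel and image can be obtained mechanically (a minimal sketch, sympy assumed):

\begin{verbatim}
import sympy as sp

A = sp.Matrix([[1, 2], [3, 6]])
print(A.nullspace())    # [Matrix([[-2], [1]])] -> ker f = Span{(-2, 1)}
print(A.columnspace())  # [Matrix([[1], [3]])]  -> Image f = Span{(1, 3)}
\end{verbatim}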
and let W = M2×2 (R). Consider the linear transformation f ∈ HomR (V, W ),
\[
f(a + bX + cX^2) = \begin{bmatrix} a+b & 0 \\ b+c & c-a \end{bmatrix}.
\]
and

\[
\mathrm{Image}\, f = \left\{\begin{bmatrix} a+b & 0 \\ b+c & c-a \end{bmatrix} : a, b, c \in \mathbb{R}\right\}
= \left\{\begin{bmatrix} x & 0 \\ y & y-x \end{bmatrix} : x, y \in \mathbb{R}\right\}.
\]
f (D) = Image f.
f⁻¹(T) = {x ∈ D ∶ f(x) ∈ T}.

f⁻¹(C) = {x ∈ D ∶ f(x) ∈ C} = D.
f (U ) ≤ W,
Image f = f (V )
ker f = f⁻¹({0W})
is a linear subspace of V .
Proof : The proof follows the same lines as the proof of Proposition 5.11. The set f(U) is not empty because 0V ∈ U, hence 0W = f(0V) ∈ f(U). Let w1, w2 ∈ f(U) and let a ∈ F. By definition, there exist u1, u2 ∈ U such that f(u1) = w1 and f(u2) = w2.
By the linearity of f ,
f (a u1 ) = a f (u1 ) = a w1 ,
i.e.,

u + v ∈ f⁻¹(Z) and a v ∈ f⁻¹(Z),

proving that f⁻¹(Z) is a linear subspace of V. ∎
Exercises
Let
Solution 5.20:
(a) ker f = Span{(−2, 1, 1)}.
(b) Image f = {(t + 2s, s, t) ∶ s, t ∈ R}.
(c) f (U ) = Span{(1, 0, 1)}.
(d) f (U ) = Span{(1, −1, 3), (2, 1, 0)}.
(e) We need to find the solutions to f (x, y, z) = (c, c, c), i.e.,
x + 2y = c, y − z = c and x + 2z = c.
This system is only solvable for c = 0, i.e., f⁻¹(U) = ker f (note that it is always the case that ker f ≤ f⁻¹(U)).
(f) We need to find the solutions to f (x, y, z) = (a, b, a), i.e.,
x + 2y = a, y − z = b and x + 2z = a.
f⁻¹(W) = {(x, y, y) ∶ x, y ∈ R}.
Lemma 5.16 Let V and W be vector spaces over a field F, with V finitely-
generated. Let f ∈ HomF (V, W ). Then, both ker f and Image f are finitely-
generated.
B = (v1 . . . vn )
C = (f (v1 ) . . . f (vn ))
Image f ≤ Span C.
Intuitively, the larger the nullity of a linear transformation, the more vectors in V are mapped to the zero vector in W. The larger the rank of f, the more vectors in W are obtained by applying f to vectors in V.
▲▲▲
Proof : The idea of the proof is quite similar in essence to the proof of the
theorems relating the annihilators of subspaces for linear forms. Denote by
n the dimension of V . Let
(u1 . . . uk )
be a basis for ker f (which is of dimension at most n) and let
B = (u1 , . . . , uk , v1 , . . . , vn−k )
be its completion to a basis for V (recall that a basis for any subspace can be
completed into a basis for the entire space). Since, by definition, ν(f ) = k,
it remains to prove that ρ(f) = n − k. Consider the set
C = (f (v1 ) . . . f (vn−k )) .
v = a1 u1 + ⋅ ⋅ ⋅ + ak uk + b1 v1 + ⋅ ⋅ ⋅ + bn−k vn−k
Example: Let A ∈ Mm×n(F) and consider the linear map f ∈ HomF(F^n_col, F^m_col)
given by
f (v) = Av.
In this case, the image of f is the column space of A,
Image A = C (A),
whereas the kernel of f is the set of zeros of the rows of A, viewed as linear
forms, i.e.,
ker A = (R(A))0 .
Exercises
B = {v1 , v2 , v3 , v4 }
be a basis for V . Let f ∈ HomF (V, V ) such that {v1 , v2 } is a basis for ker f .
Show that the set {f (v3 ), f (v4 )} is linearly-independent.
(harder) 5.22 Find a linear transformation f ∶ R<4 [X] → M2×3 (R), such
that
ker f = Span{X³ − 2X + 1, X³ + X² − X + 3}
and

\[
\mathrm{Span}\left\{\begin{bmatrix} -1 & 2 & 1 \\ 3 & -1 & 0 \end{bmatrix}\right\} \subseteq \mathrm{Image}\, f.
\]
such that each entry of the matrix is a linear combination of x1 , x2 , x3 , x4 . Each entry of
the image of f is a linear form in x1, . . . , x4. All vanish for (x1, x2, x3, x4) = (1, −2, 0, 1) and
(x1 , x2 , x3 , x4 ) = (3, −1, 1, 1), i.e.,
Any such transformation satisfies the kernel condition. All that remains is to decide, for
example, that
\[
f(1) = \begin{bmatrix} -1 & 2 & 1 \\ 3 & -1 & 0 \end{bmatrix},
\]
and set (for example) t = 0 above to yield,
This implies that (0, 5, −4, −1) ∈ ker g. (b) Yes. Since dimR ker g = 1 it follows that
dimR Image g = 3.
Solution 5.24: Since v1, v2 ∈ ker f are independent, it follows that dimF ker f ≥ 2, hence dimF Image f ≤ 4 − 2 < 3, from which it follows that f is not onto.
Then,
ker f = Image f = Span{(1, 0)}.
(b) No, because dimF ker f + dimF Image f = dimF V is an odd number, whereas ker f = Image f would force the two summands to be equal.
a1 f (v1 ) + ⋯ + an f (vn ) = 0W ,
hence all the ai are zero, proving that (v1, . . . , vn) is linearly-independent.
(b) Let f be onto and let (v1, . . . , vn) be a generating set. Let w ∈ W. Since f is onto,
there exists a v such that w = f (v). Since we can write v as
v = a1 v1 + ⋅ ⋅ ⋅ + an vn ,
it follows that
w = a1 f (v1 ) + ⋯ + an f (vn ),
proving that (f (v1 ), . . . , f (vn )) is a generating set.
(c) Let V = R2 , W = R and f (x, y) = x. Then, {f (1, 0)} = {1} is a generating set for W ,
but {(1, 0)} is not a generating set for V .
(d) If f is one-to-one and onto, then by the rank theorem dimF V = dimF W . If (v1 , . . . , vn )
is a basis for V then it is in particular linearly-independent, hence (f (v1 ), . . . , f (vn )) is
linearly-independent, hence a basis for W . By the same argument, if (f (v1 ), . . . , f (vn ))
is a basis for W, then (f⁻¹(f(v1)), . . . , f⁻¹(f(vn))) is linearly-independent, hence a basis
for V .
Proposition 5.21 Let U, V, W be vector spaces over the same field F. Let
f ∈ HomF (U, V ) and g ∈ HomF (V, W ). Then,
ker f ≤ ker(g ○ f ),
and
Image(g ○ f ) ≤ Image g.
w = (g ○ f )(u),
Exercises
2f − g, f + 2h, f ○ g, g ○ f, h ○ f + 2g.
ker f = ker(f ○ f )
Solution 5.30: In the previous exercise we saw that it is always the case that ker f ≤
ker(f ○f ). Thus, we need to show that ker(f ○f ) ≤ ker f if and only if ker f ∩Image f = {0V }.
Suppose that ker(f ○ f ) = ker f . Let v ∈ ker f ∩ Image f . Then, there exists a w such that
v = f (w) and
f (v) = f (f (w)) = 0V .
This implies that w ∈ ker(f ○ f) = ker f, hence v = f(w) = 0V, i.e., ker f ∩ Image f = {0V}.
Conversely, suppose that ker f ∩ Image f = {0V} and let f(f(v)) = 0V. Since f(v) ∈ ker f ∩ Image f, it follows that f(v) = 0V, i.e., ker(f ○ f) ≤ ker f.
Solution 5.31: Let w ∈ ker f ∩ Image f; then w = f(v) for some v ∈ V and

w = f(v) = ½ f(f(v)) = ½ f(w) = 0V,

proving that ker f ∩ Image f = {0V}. It remains to prove that V = Image f ⊕ ker f, but

dimF(ker f + Image f) = dimF ker f + dimF Image f − dimF(ker f ∩ Image f) = dimF V − 0,

where the sum of the first two terms on the right-hand side equals dimF V by the rank theorem, and the intersection term vanishes by the first part; hence ker f + Image f = V.
Every vector in R² can be written in polar form as (r cos α, r sin α). Consider the map

Rotθ ∶ R² → R²,

where θ ∈ R, defined by

Rotθ(r cos α, r sin α) = (r cos(α + θ), r sin(α + θ)).

That is, this transformation rotates vectors about the origin by an angle θ.
[Figure: the vector (r cos α, r sin α) at angle α, and its rotation by the angle θ.]
On the face of it, this transformation doesn’t seem linear; the trigonometric
functions are nonlinear. However, using the trigonometric identities,
cos(α + θ) = cos α cos θ − sin α sin θ
sin(α + θ) = sin α cos θ + cos α sin θ,
setting x = r cos α and y = r sin α, we find that
Rotθ (x, y) = (cos θ x − sin θ y, sin θ x + cos θ y),
that is, Rotθ ∈ HomR(R², R²).
If we rather write the components of vectors relative to the standard basis,

\[
\mathrm{Rot}_\theta\!\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)
= \begin{bmatrix} \cos\theta\, x - \sin\theta\, y \\ \sin\theta\, x + \cos\theta\, y \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix},
\]

i.e., Rotθ is represented relative to the standard basis by the matrix

\[
R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.
\]
i.e.,
Rϕ Rθ = Rϕ+θ .
Note that
R2π = R0 = I,
and
Rθ R−θ = I.
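These identities are easy to check numerically; a minimal sketch (numpy assumed):

\begin{verbatim}
import numpy as np

def R(theta):
    # The matrix representing Rot_theta relative to the standard basis.
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

phi, theta = 0.7, 1.1
assert np.allclose(R(phi) @ R(theta), R(phi + theta))  # composition adds angles
assert np.allclose(R(2 * np.pi), np.eye(2))            # a full turn is the identity
assert np.allclose(R(theta) @ R(-theta), np.eye(2))    # inverse rotation
\end{verbatim}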
be ordered bases for V and W . Then, there exists for every i = 1, . . . , n and
j = 1, . . . , m a unique linear transformation f_j^i ∈ HomF(V, W), such that

\[
f_j^i(v_k) = \begin{cases} w_j & k = i, \\ 0_W & k \neq i. \end{cases} \tag{5.1}
\]
Let f ∈ HomF (V, W ) and consider the vector f (v1 ): it has a unique represen-
tation as a linear combination of the basis vectors wi , which we may write
as

\[
f(v_1) = (w_1\ \dots\ w_m)\begin{bmatrix} a^1_1 \\ \vdots \\ a^m_1 \end{bmatrix}.
\]
Repeating this for each of the n vectors f (vj ), we obtain that f is uniquely
determined by an m × n matrix
\[
(f(v_1)\ \dots\ f(v_n)) = (w_1\ \dots\ w_m)\begin{bmatrix} a^1_1 & \cdots & a^1_n \\ \vdots & & \vdots \\ a^m_1 & \cdots & a^m_n \end{bmatrix}.
\]
The function f_j^i corresponds to the matrix A which is zero everywhere, except for the element in the i-th column and j-th row, which is equal to one. ∎
namely,
\[
(f_4^2(v_1)\ f_4^2(v_2)\ f_4^2(v_3)) = (w_1\ w_2\ w_3\ w_4\ w_5)
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
▲▲▲
be ordered bases for V and W. Then, the linear transformations f_j^i defined by (5.1) are a basis for HomF(V, W). In particular, the set

{f_j^i ∶ i = 1, . . . , n, j = 1, . . . , m}

has mn elements, hence dimF HomF(V, W) = mn.
For every fixed k = 1, . . . , n, the coefficients a^j_k are the coordinates of f(v_k) ∈ W relative to the basis C,

a^j_k = ([f(v_k)]_C)_j.
In other words, every f ∈ HomF (V, W ) can be represented as
\[
f = \sum_{i=1}^{n}\sum_{j=1}^{m} ([f(v_i)]_C)_j\, f_j^i,
\]
thus proving that the linear transformations f_j^i form a generating set for HomF(V, W). It remains to show that they are also independent. Let a_i^j be scalars and suppose that

\[
\sum_{i=1}^{n}\sum_{j=1}^{m} a_i^j\, f_j^i = 0_{\mathrm{Hom}_F(V,W)}.
\]
Since C is a basis for W, it follows that a^j_k = 0F for all j = 1, . . . , m (and for all k = 1, . . . , n), which completes the proof. ∎
5.11 Isomorphisms
The notion of isomorphism (איזומורפיזם) is fundamental in mathematics: loosely speaking, two sets are said to be isomorphic if they are “the same” up to a renaming of their elements. The most basic notion of isomorphism is between plain sets: two sets S and T are isomorphic if there exists a function f ∶ S → T that is one-to-one (חד-חד-ערכית) and onto (על); then, the function f induces a relation where every element in S can be identified with an element in T and vice-versa, so that we could say that f(x) ∈ T is a “renaming” of x ∈ S. We then say that f is an isomorphism and that S and T are isomorphic (איזומורפיים). An alternative way of stating that two sets are isomorphic is that there exist two functions f ∶ S → T and g ∶ T → S,
sets are isomorphic is that there exist two functions f ∶ S → T and g ∶ T → S,
such that
g(f (x)) = x for every x ∈ S,
and
f (g(y)) = y for every y ∈ T .
In other words, g ○ f = IdS and f ○ g = IdT .
Vector spaces are not just plain sets; they are endowed with a linear structure,
so for two vector spaces to be considered isomorphic, we require more than
being equivalent as sets. The function identifying an element in one space
with an element in the other space has to “respect” linear operations. This
leads us to the following definition:
Definition 5.25 Let V and W be vector spaces over a field F. The spaces
are called isomorphic if there exists a linear transformation f ∈ HomF (V, W )
and a linear transformation g ∈ HomF (W, V ) such that g ○ f = IdV and f ○ g =
IdW . The function f is called an isomorphism from V to W and g is called
an isomorphism from W to V .
Note that f and g are both necessarily one-to-one and onto. Take f for example:
for every w ∈ W ,
w = f (g(w)),
showing that f is onto. Likewise, if
f (v1 ) = f (v2 ),
then
v1 = g(f (v1 )) = g(f (v2 )) = v2 ,
showing that f is one-to-one.
The next proposition provides a sufficient condition for two vector spaces to
be isomorphic.
Reciprocally,
v1 = g(w1 ) and v2 = g(w2 ).
By the linearity of f ,
and reciprocally,
By the linearity of f ,
a w = a f (v) = f (a v),
and reciprocally,
g(a w) = a v = a g(w),
thus proving the second condition of linearity for g. ∎
It is invertible (its inverse being also the identity) and linear, proving that V is isomorphic to itself. Next, if V is isomorphic to W then W is isomorphic to V, because the definition of an isomorphism is symmetric in the two spaces. It remains to establish transitivity: suppose that U and V are isomorphic and that V and W are isomorphic. By definition, there exist
f ∈ HomF (U, V ) g ∈ HomF (V, U )
h ∈ HomF (V, W ) k ∈ HomF (W, V ),
such that g ○ f = IdU, f ○ g = IdV, k ○ h = IdV and h ○ k = IdW. Consider the compositions

h ○ f ∶ U → W and g ○ k ∶ W → U.
Since they are compositions of linear transformations, they are linear trans-
formations, i.e.,
B = (v1 . . . vn )
f ∶ v ↦ [v]B ,
Proposition 5.29 Every two finitely-generated vector spaces over the same
field having the same dimension are isomorphic.
be ordered bases for V and W. Define f ∈ HomF(V, W) as the unique linear transformation satisfying

f(vi) = wi, i = 1, . . . , n.
If we show that f is one-to-one and onto, then we are done, but from
Lemma 5.28 it suffices to show just one of them. Let w ∈ W ; by the definition
of a basis, there exist scalars a1 , . . . , an ∈ F, such that
w = a1 w1 + ⋅ ⋅ ⋅ + an wn .
Then,
w = a1 f(v1) + ⋯ + an f(vn) = f(a1 v1 + ⋯ + an vn),

i.e., w ∈ Image f, which proves that f is onto. ∎
f ∶ V → (V ∨ )∨
f (u + v) = f (u) + f (v).
f (a v) = a f (v).
for all ` ∈ V ∨ . We have seen that if v ≠ 0 then there exists a linear form
` ∈ V ∨ such that `(v) ≠ 0. We conclude that v = 0V , i.e.,
ker f = {0V },
completing the proof that f is an isomorphism. This isomorphism is consid-
ered natural because it does not hinge on any arbitrary construct. ▲▲▲
We end this section with one more manifestation of isomorphisms respecting
the linear structure of vector spaces:
B = (v1 . . . vn )
Exercises
             f
     V ----------> W
     |             |
   B |             | C
     v             v
  F^n_col ----> F^m_col
          A_f
coordinate matrices F^n_col, and one from W to the space of its coordinate matrices F^m_col. In this section, we show that to every f ∈ HomF(V, W) corresponds a unique matrix Af ∈ Mm×n(F), which we view as a linear transformation in HomF(F^n_col, F^m_col), such that this diagram “commutes”. To explain what this means, consider the same diagram through its action on a vector v ∈ V:
             f
     v |--------> f(v)
     |             |
   B |             | C
     v             v
  [v]_B |-----> [f(v)]_C
           A_f
[f(vi)]_C ∈ F^m_col.
Here we used several facts: first, by Proposition 5.3 and Proposition 5.4,
a linear transformation is uniquely determined by its action on basis vec-
tors. But second, we used the fact that ([v1]B, . . . , [vn]B) is a basis for F^n_col; this follows
from the fact that the mapping from a vector to its coordinate matrix is an
isomorphism (Proposition 5.31).
We claim that A has the desired property: for every v ∈ V , which we write
as
v = a1 v 1 + ⋅ ⋅ ⋅ + an v n ,
we have
A[v]B = A[a1 v1 + ⋅ ⋅ ⋅ + an vn ]B
= A (a1 [v1 ]B + ⋅ ⋅ ⋅ + an [vn ]B )
= a1 A[v1 ]B + ⋯ + an A[vn ]B
= a1 [f (v1 )]C + ⋅ ⋅ ⋅ + an [f (vn )]C
= [a1 f (v1 ) + ⋅ ⋅ ⋅ + an f (vn )]C
= [f (a1 v1 + ⋅ ⋅ ⋅ + an vn )]C
= [f(v)]C, as claimed.
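To make the commuting diagram concrete, here is a sketch (sympy assumed; the spaces, bases and map are hypothetical choices of ours, not taken from the notes) that builds A_f column by column from the C-coordinates of f(v_i) and checks A_f [v]_B = [f(v)]_C:

\begin{verbatim}
import sympy as sp

# Hypothetical example: V = W = R^2, f(x, y) = (2x, x + y),
# B = ((1, 0), (1, 1)), C = ((1, 2), (0, -1)).
f = lambda v: sp.Matrix([2 * v[0], v[0] + v[1]])
B = sp.Matrix([[1, 1], [0, 1]])   # basis vectors of B as columns
C = sp.Matrix([[1, 0], [2, -1]])  # basis vectors of C as columns

coords = lambda M, v: M.solve(v)  # coordinates relative to a basis
# The i-th column of A_f is [f(v_i)]_C.
A_f = sp.Matrix.hstack(*[coords(C, f(B.col(i))) for i in range(B.cols)])

v = sp.Matrix([3, 5])             # an arbitrary vector of V
assert A_f * coords(B, v) == coords(C, f(v))
\end{verbatim}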
Example: Let dimF V = n and consider the identity function Id ∈ HomF (V, V ),
i.e.,
Id(v) = v for all v ∈ V .
Let B = (v1 . . . vn ) be a basis for V . Then,
\[
(\mathrm{Id}(v_1)\ \dots\ \mathrm{Id}(v_n)) = (v_1\ \dots\ v_n)\begin{bmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{bmatrix},
\]

i.e., the matrix representing the identity map of a vector space is the identity matrix,

[Id]^B_B = In.
In this very special case it does not depend on the choice of basis, as long as
we use the same basis both for the domain and the codomain. ▲▲▲
Then,

\[
[\mathrm{Id}]^C_B = \begin{bmatrix} 1/3 & -1 \\ 1/3 & 1 \end{bmatrix},
\]

implying that for all v ∈ R²,

\[
[v]_B = \begin{bmatrix} 1/3 & -1 \\ 1/3 & 1 \end{bmatrix}[v]_C.
\]
▲▲▲
Exercises
Now,

\[
[(x, y)]_E = \begin{bmatrix} x \\ y \end{bmatrix}
\quad\text{and}\quad
[(x, y)]_B = \begin{bmatrix} x \\ y - x \end{bmatrix},
\]

hence

\[
[f(x, y)]_E = \begin{bmatrix} 2x \\ x + y \end{bmatrix}
\quad\text{and}\quad
[f(x, y)]_B = \begin{bmatrix} 2x \\ y - x \end{bmatrix}.
\]
We deduce that

\[
[f(x, y)]_E = \begin{bmatrix} 2x \\ x+y \end{bmatrix}
= \underbrace{\begin{bmatrix} 2 & 0 \\ 2 & 1 \end{bmatrix}}_{[f]^B_E}\begin{bmatrix} x \\ y - x \end{bmatrix}
\quad\text{and}\quad
[f(x, y)]_B = \begin{bmatrix} 2x \\ y-x \end{bmatrix}
= \underbrace{\begin{bmatrix} 2 & 0 \\ -1 & 1 \end{bmatrix}}_{[f]^E_B}\begin{bmatrix} x \\ y \end{bmatrix}.
\]
\[
\underbrace{(f(1,0),\ f(0,1)) = ((2,1),\ (0,1))}_{f(E)}
= \underbrace{((1,1),\ (0,1))}_{B}\ \underbrace{\begin{bmatrix} 2 & 0 \\ -1 & 1 \end{bmatrix}}_{[f]^E_B}.
\]
(easy) 5.37 Let A ∈ M2 (F) and let f ∶ M2 (F) → M2 (F) be given by f (B) =
AB.
The answer to the second question is negative because M2 (F) is a 4-dimensional space,
hence representing matrices are elements of M4 (F).
(easy) 5.38 Let En denote the standard ordered basis for Rn . Let f ∈
HomR (R2 , R3 ) be given by
(easy) 5.39 Repeat the previous exercise, this time using the ordered bases
B = ((1, 1) (1, −1)) and C = ((1, 0, 0) (1, 1, 0) (1, 1, 1)) .
It follows that

g(1, 0) = ½ (g(1, 1) + g(1, −1)) = (2, 3/2, 1/2)
g(0, 1) = ½ (g(1, 1) − g(1, −1)) = (−2, −5/2, −3/2),

i.e.,

g(x, y) = (2x, 3/2 x, 1/2 x) + (−2y, −5/2 y, −3/2 y).
It follows that

f(1, 0, 0) = ½ (f(1, 0, −1) + f(1, −1, 0) + f(0, 1, 1)) = ½ (3, 2, 1)
f(0, 1, 0) = ½ (f(1, 0, −1) − f(1, −1, 0) + f(0, 1, 1)) = ½ (1, 0, 1)
f(0, 0, 1) = ½ (−f(1, 0, −1) + f(1, −1, 0) + f(0, 1, 1)) = ½ (1, 2, 1),

hence

f(x, y, z) = ½ (3x + y + z, 2x + 2z, x + y + z).
\[
U = \left\{\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} : a, b \in \mathbb{R}\right\}
\quad\text{and}\quad
W = \left\{\begin{bmatrix} -c & 0 \\ c & d \end{bmatrix} : c, d \in \mathbb{R}\right\}.
\]
In Exercise 5.15 you showed that V = U ⊕ W and wrote explicitly the pro-
jections pi and reflections Si .
Solution 5.42:

\[
\underbrace{(f(1),\ f(-\imath)) = (1,\ \imath)}_{f(B)}
= \underbrace{(1,\ -\imath)}_{B}\ \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}}_{[f]^B_B}.
\]
Let f ∶ V → V be defined by
(f (p))(X) = X p′ (X),
\[
f\!\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = (a + 2b + c) + (3a - d)X + (a - 4b - 2c - d)X^2.
\]
(a) Find [f]^B_C for C = (1, X, X²) and

\[
B = \left(\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\right).
\]
Then,
\[
\ker f = \mathrm{Span}\left\{\begin{bmatrix} 1 & -1 \\ 1 & 3 \end{bmatrix}, \begin{bmatrix} 0 & -1 \\ 2 & 0 \end{bmatrix}\right\},
\]

and

Image f = Span{1 − 2X², 1 + 2X}.
For the image, once we know that the kernel is two-dimensional, it suffices to find two
independent vectors in the image of f .
(intermediate) 5.45 Let f ∶ R<3 [X] → R<3 [X] be the linear transformation
represented by the matrix

\[
[f]^B_B = \begin{bmatrix} 1 & 2 & 5 \\ -1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}
\]

relative to the ordered basis B = (1, 1 + X, 1 − X + X²). Find Image f.
f(1) = −X, f(1 + X) = 3 − X + X² and f(1 − X + X²) = 6 − 3X + 2X².
We note that

f(2 + 3X − X²) = 0,

hence

ker f = Span{2 + 3X − X²}.
By the rank theorem, the image of f is the span of any two linearly-independent vectors
in the image of f , e.g.,
Image f = Span{−X, 3 − X + X²}.
\[
[f]^E_B = \begin{bmatrix} -2 & 5 & 0 \\ 1 & 0 & -1 \end{bmatrix},
\]
where E is the standard basis. Find a matrix A ∈ M2×3 (R) such that
Hence,

\[
f(x, y, z) = (x + 5y - 3z,\ -6x + 15y) = (x, y, z)\begin{bmatrix} 1 & -6 \\ 5 & 15 \\ -3 & 0 \end{bmatrix}.
\]
where
C = ((1, 2) (0, −1)) .
Let v ∈ V satisfy

\[
[v]_B = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}.
\]

Find f(v).
        f           g
    U ------> V ------> W
    |         |         |
  B |       C |       D |
    v         v         v
 F^n_col --> F^m_col --> F^k_col
     [f]^B_C     [g]^C_D
[g ○ f]^B_D = [g]^C_D [f]^B_C.
[f(u)]_C = [f]^B_C [u]_B

and by definition,

[g]^C_D [f]^B_C = [g ○ f]^B_D. ∎
Linear transformations from V to W can also be added. The addition of linear transformations is represented by the addition of the corresponding representing matrices:

[f + g]^B_C = [f]^B_C + [g]^B_C.
[f(v)]_C = [f]^B_C [v]_B and [g(v)]_C = [g]^B_C [v]_B.
and by definition,

[f]^B_C + [g]^B_C = [f + g]^B_C. ∎
Similarly, we can prove:
[a f]^B_C = a [f]^B_C.
\[
[u]_B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\quad\text{and}\quad
[w]_B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\]
\[
[S_1]^B_B = [p_1]^B_B - [p_2]^B_B = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix},
\]

and

\[
[S_2]^B_B = [p_2]^B_B - [p_1]^B_B = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix},
\]

so that [S1]^B_B + [S2]^B_B = 0_{M2(F)}, as expected.
Consider now compositions of these operators, for example,

p1 ○ p1 = p1 and p1 ○ p2 = 0_{HomF(V,V)}.
Indeed,

\[
[p_1]^B_B\,[p_1]^B_B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = [p_1 \circ p_1]^B_B,
\]

and

\[
[p_1]^B_B\,[p_2]^B_B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = [p_1 \circ p_2]^B_B.
\]
▲▲▲
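The same identities, checked on the representing matrices (a minimal sketch, sympy assumed):

\begin{verbatim}
import sympy as sp

p1 = sp.Matrix([[1, 0], [0, 0]])  # [p1]^B_B
p2 = sp.Matrix([[0, 0], [0, 1]])  # [p2]^B_B
S1, S2 = p1 - p2, p2 - p1

assert p1 * p1 == p1              # projections are idempotent
assert p1 * p2 == sp.zeros(2, 2)  # complementary projections annihilate each other
assert S1 * S1 == sp.eye(2)       # reflections are involutions
\end{verbatim}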
Exercises
(easy) 5.48 Show explicitly in the last example that p1 ○ S2 = −p1 and

[p1]^B_B [S1]^B_B = [p1 ○ S1]^B_B.
f ○ f = 0_{HomF(V,V)}.
Solution 5.49: It is given that Image f ≤ ker f and dimF V = 3. By the rank theorem,
dimF ker f + dimF Image f = 3,
but since
dimF Image f ≤ dimF ker f,
we conclude that dimF ker f = 2 and dimF Image f = 1. Let {u1 } be a basis for Image f .
Let u2 complete u1 into a basis for ker f and let u3 ∈/ ker f complete u1 , u2 into a basis for
V . We choose u3 such that
f (u3 ) = u1 ,
which we can always do by an appropriate scalar multiplication. Hence,
\[
(f(u_1), f(u_2), f(u_3)) = (0_V, 0_V, u_1) = (u_1, u_2, u_3)\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Solution 5.50: It is given that Image(f ○ f) ≤ ker f ≤ ker(f ○ f) and dimF V = 3. By the rank theorem,
dimF ker(f ○ f ) + dimF Image(f ○ f ) = 3,
from which we conclude that dimF ker(f ○ f ) = 2 and dimF Image(f ○ f ) = 1. It is also given
that Image f ≤ ker(f ○ f). A priori, there are two possibilities,

dimF ker f = 1 and dimF Image f = 2,

or

dimF ker f = 2 and dimF Image f = 1.
Suppose that dimF ker f = 2 and dimF Image f = 1. Since f ○ f ≠ 0, it follows that Image f ∩ ker f = {0V} (figure out why!). It follows that there exists a non-zero u ∈ Image f such that f(u) = λ u with λ ≠ 0F, hence f(f(f(u))) = λ³ u ≠ 0V, which is a contradiction. We thus conclude that dimF ker f = 1 and dimF Image f = 2.
Let {u1 } be a basis for ker f . Let u2 complete u1 into a basis for ker(f ○ f ), and let u3
complete the two other into a basis for V . We can choose u2 such that f (u2 ) = u1 and we
can choose u3 such that f (u3 ) = u2 (figure out why!). Hence,
\[
(f(u_1), f(u_2), f(u_3)) = (0_V, u_1, u_2) = (u_1, u_2, u_3)\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
C = BP,
[f]^C_C = P⁻¹ [f]^B_B P.

[f(v)]_B = [f]^B_B [v]_B and [f(v)]_C = [f]^C_C [v]_C.

Moreover,

[f]^C_C = P⁻¹ [f]^B_B P.
\[
[(x, y)]_E = \begin{bmatrix} x \\ y \end{bmatrix},
\]

and

\[
[f(x, y)]_E = \begin{bmatrix} 3x + 7y \\ 2x - 5y \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 2 & -5 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix},
\]
namely

\[
[f]^E_E = \begin{bmatrix} 3 & 7 \\ 2 & -5 \end{bmatrix}.
\]
Let now

B = ((1, 2) (2, 1))

be another ordered basis for R². Then,

\[
((1, 2)\ (2, 1)) = ((1, 0)\ (0, 1))\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix},
\]

and for (x, y) ∈ R²,

\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}[(x, y)]_B
\quad\text{and}\quad
[(x, y)]_B = \begin{bmatrix} -1/3 & 2/3 \\ 2/3 & -1/3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}.
\]
Now,

\[
\begin{aligned}
[f(x, y)]_B &= [(3x + 7y,\ 2x - 5y)]_B \\
&= \begin{bmatrix} -1/3 & 2/3 \\ 2/3 & -1/3 \end{bmatrix}\begin{bmatrix} 3x + 7y \\ 2x - 5y \end{bmatrix} \\
&= \begin{bmatrix} -1/3 & 2/3 \\ 2/3 & -1/3 \end{bmatrix}\begin{bmatrix} 3 & 7 \\ 2 & -5 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \\
&= \begin{bmatrix} -1/3 & 2/3 \\ 2/3 & -1/3 \end{bmatrix}\begin{bmatrix} 3 & 7 \\ 2 & -5 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}[(x, y)]_B.
\end{aligned}
\]
We conclude that

\[
[f]^B_B = \begin{bmatrix} -1/3 & 2/3 \\ 2/3 & -1/3 \end{bmatrix}\begin{bmatrix} 3 & 7 \\ 2 & -5 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}.
\]
▲▲▲
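The same change-of-basis computation, done by machine (a minimal sketch, sympy assumed):

\begin{verbatim}
import sympy as sp

F = sp.Matrix([[3, 7], [2, -5]])  # [f]^E_E
P = sp.Matrix([[1, 2], [2, 1]])   # the columns of P are the B-vectors
F_B = P.inv() * F * P             # [f]^B_B = P^{-1} [f]^E_E P

print(F_B)
assert F_B.det() == F.det()       # similar matrices share determinant...
assert F_B.trace() == F.trace()   # ...and trace
\end{verbatim}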
Thus, we have proved that matrices representing the same linear transformation f ∈ HomF(V, V) relative to different bases are similar. The converse is also true: two matrices that are similar represent the same linear transformation relative to different bases.
Exercises
(easy) 5.53 Prove that similar matrices represent the same linear transfor-
mation relative to different bases.
(easy) 5.54 Show that for any scalar a, the matrix a In is similar only to
itself. Interpret this result in terms of the matrix representation of linear
transformations.
\[
D = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}
\]
Prove that
(A − 3I2 )(A − 2I2 ) = 0.
Chapter 6

Volume Forms and Determinants

6.1 Motivation
[Figure: parallelograms spanned by (u, v), by (u, 2v), and by (u + 0.5v, v), illustrating how the area of a parallelogram behaves under scaling one edge and under shearing.]
As we will see, these two properties determine the area function almost uniquely; there always remains a choice of “units”, which assigns an area to the unit square.
We will not prove this theorem right away; for the time being, we will assume
that such volume forms exist and examine their properties.
ω(v1 , . . . , vi + a vj , . . . , vn ) = ω(v1 , . . . , vn ).
a ω(v1 , . . . , vn ) = ω(v1 , . . . , vi , . . . , a vj , . . . , vn )
= ω(v1 , . . . , vi + a vj , . . . , a vj , . . . , vn )
= a ω(v1 , . . . , vi + a vj , . . . , vj , . . . , vn ),
where the first and third equalities follow from (6.2) and the second equality
follows from (6.1). Dividing both sides by a, we obtain the required result (for a = 0F the claim is trivial). ∎
\[
\omega\Big(v_1, \dots, v_i + \sum_{j \ne i} a_j v_j, \dots, v_n\Big) = \omega(v_1, \dots, v_n).
\]
ω(v1 , . . . , vn ) = 0F .
Proof : If the vectors are linearly-dependent, then one of the vectors, say vi ,
can be written as a linear combination of all the others,
\[
v_i = \sum_{j \ne i} a_j v_j.
\]
Then,

\[
\omega(v_1, \dots, v_n)
= \omega\Big(v_1, \dots, \underbrace{0_V + \sum_{j \ne i} a_j v_j}_{i\text{-th term}}, \dots, v_n\Big)
= \omega(v_1, \dots, 0_V, \dots, v_n) = 0_F. \qquad\blacksquare
\]
and
f (v1 , . . . , a vi , . . . , vn ) = a f (v1 , . . . , vn ).
is multilinear. ▲▲▲
u = a1 v 1 + ⋅ ⋅ ⋅ + an v n and v = b 1 v 1 + ⋅ ⋅ ⋅ + bn v n .
ω(v1 , . . . , vi , . . . , vj , . . . , vn ) = ω(v1 , . . . , vi − vj , . . . , vj , . . . , vn )
= ω(v1 , . . . , 0V , . . . , vj , . . . , vn ) = 0F ,
ω(v1 , . . . , vi , . . . , vj , . . . , vn ) = ω(v1 , . . . , vi + vj , . . . , vj , . . . , vn ),
ω(v1 , . . . , vi + vj , . . . , vj , . . . , vn ) = ω(v1 , . . . , vi , . . . , vj , . . . , vn )
+ ω(v1 , . . . , vj , . . . , vj , . . . , vn )
= ω(v1 , . . . , vi , . . . , vj , . . . , vn ) + 0F .
∎
In practice, volume forms are more natural to think of geometrically, and al-
ternating multilinear functions are more convenient to think of algebraically.
We have just shown that they are the same.
ω(v1 , . . . , vi , . . . , vj , . . . , vn ) = −ω(v1 , . . . , vj , . . . , vi , . . . , vn ).
Proof : Consider
ω(v1 , . . . , vi + vj , . . . , vi + vj , . . . , vn ) = 0F ,
which vanishes by the alternating property of the volume form. Using the
multilinearity, two of the terms vanish by the alternating property, remaining
with
ω(v1 , . . . , vi , . . . , vj , . . . , vn ) + ω(v1 , . . . , vj , . . . , vi , . . . , vn ) = 0F .
Note that we proved in fact Theorem 6.2 (the existence part) for n = 2.
Suppose now that
\[
(u, v) = (v_1, v_2)\begin{bmatrix} a & b \\ c & d \end{bmatrix},
\]
namely,
u = a v1 + c v 2 and v = b v1 + d v2 .
Then,
ω(u, v) = ad − bc,
which should ring a bell: this is what we called the determinant of the matrix. ▲▲▲
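One can verify symbolically that ad − bc indeed has the defining properties of a volume form (a minimal sketch, sympy assumed):

\begin{verbatim}
import sympy as sp

a, b, c, d, t = sp.symbols('a b c d t')
w = lambda u, v: u[0] * v[1] - u[1] * v[0]   # omega(u, v) = ad - bc
u, v = (a, c), (b, d)

assert sp.expand(w(u, u)) == 0                             # alternating
assert sp.expand(w(u, v) + w(v, u)) == 0                   # antisymmetry
assert sp.expand(w((a + t*b, c + t*d), v) - w(u, v)) == 0  # shear invariance
\end{verbatim}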
In view of Theorem 6.9, we can replace Theorem 6.2 by the equivalent:
ω(v1 , . . . , vn ) = 1F .
pL (u) = λ(u) v1 .
where the “hat” over the j-th term means that this term has been omitted.
We now show that ω is a normalized, alternating multilinear function. Let’s
write it more explicitly.
ω(u1 , . . . , un ) = λ(u1 ) ωH (pH (u2 ), . . . , pH (un ))
− λ(u2 ) ωH (pH (u1 ), pH (u3 ), . . . , pH (un ))
+ λ(u3 ) ωH (pH (u1 ), pH (u2 ), pH (u4 ), . . . , pH (un ))
∓ ...
+ (−1)n+1 λ(un ) ωH (pH (u1 ), pH (u2 ), . . . , pH (un−1 )).
The function ω is multilinear: each of the terms in the sum is linear in each
of the uj ’s, either because λ is linear, or because pH is linear and ωH is
multilinear. The function ω is also alternating. Suppose, for example, that
u1 = u2 . In all of the summands but two, u1 and u2 are arguments of ωH ,
which is alternating, hence these terms vanish. Remain two terms, which in
this case are
λ(u1 ) ωH (pH (u2 ), pH (u3 ), . . . , pH (un ))
− λ(u2 ) ωH (pH (u1 ), pH (u3 ), . . . , pH (un )) = 0F .
You may convince yourself that this would happen whenever ui = uj for i ≠ j.
As for the normalization, since λ(vi ) = 0F and pH (vi ) = vi for all i ≥ 2,
Exercises
(easy) 6.1 Let V be a vector space over F and let k ∈ N (not necessarily
the dimension of V). We denote by Mult(k, V, F) the set of functions f ∶
V k → F that are multilinear (it is a subspace of Func(V k , F)). Show that
Mult(k, V, F) is a vector space over F.
Solution 6.1: Let f, g ∈ Mult(k, V, F), and let u1 , . . . , uk ∈ V , v ∈ V and a ∈ F. Then,
(f + g)(u1 + av, u2 , . . . , uk ) = f (u1 + av, u2 , . . . , uk ) + g(u1 + av, u2 , . . . , uk )
= f (u1 , u2 , . . . , uk ) + g(u1 , u2 , . . . , uk )
+ a f (v, u2 , . . . , uk ) + a g(v, u2 , . . . , uk )
= (f + g)(u1 , u2 , . . . , uk ) + a(f + g)(v, u2 , . . . , uk ),
thus showing that f + g is linear in its first argument. We proceed similarly to show that it is linear in each of the other arguments, and similarly for a f.
Solution 6.3: Just follow the idea of the previous example: start with
f (u, v, w, x) = `1 (u)`2 (v)`3 (w)`4 (x),
and then “anti-symmetrize” it, by adding or subtracting similar terms with the `i ’s inter-
changed. This yields,
6.5 Determinants
Let V be an n-dimensional vector space over a field F. Let B = (v1 , . . . , vn )
be an ordered basis and let ω be a volume form on V . By the definition of a
basis, every (u1 , . . . , un ) ∈ V n has a unique representation as
(u1 , . . . , un ) = (v1 , . . . , vn )A
The right-hand side only depends on the matrix A. Consider then the func-
tion
\[
f(A) = \frac{\omega((v_1, \dots, v_n)A)}{\omega(v_1, \dots, v_n)}.
\]
We note that it satisfies the following properties:
for the i-th entry, where the i-th entry on left-hand side is the sum of
the i-th entries on the right-hand side. By the multilinearity of volume
forms,
\[
f\!\left(\begin{bmatrix} a^1_1 & \cdots & b^1 + c^1 & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & b^n + c^n & \cdots & a^n_n \end{bmatrix}\right)
= f\!\left(\begin{bmatrix} a^1_1 & \cdots & b^1 & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & b^n & \cdots & a^n_n \end{bmatrix}\right)
+ f\!\left(\begin{bmatrix} a^1_1 & \cdots & c^1 & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & c^n & \cdots & a^n_n \end{bmatrix}\right).
\]
Similarly, by the homogeneity of the volume form,

\[
\omega\!\left((v_1, \dots, v_n)\begin{bmatrix} a^1_1 & \cdots & c\,a^1_i & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & c\,a^n_i & \cdots & a^n_n \end{bmatrix}\right)
= c\,\omega\!\left((v_1, \dots, v_n)\begin{bmatrix} a^1_1 & \cdots & a^1_i & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & a^n_i & \cdots & a^n_n \end{bmatrix}\right).
\]

Dividing both sides by ω(v1, . . . , vn), we obtain

\[
f\!\left(\begin{bmatrix} a^1_1 & \cdots & c\,a^1_i & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & c\,a^n_i & \cdots & a^n_n \end{bmatrix}\right)
= c\, f\!\left(\begin{bmatrix} a^1_1 & \cdots & a^1_i & \cdots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^n_1 & \cdots & a^n_i & \cdots & a^n_n \end{bmatrix}\right).
\]
Proposition 6.13 For every field F and n ∈ N there exists a unique deter-
minant function f ∶ Mn (F) → F. We denote this function either by A ↦ det A
or by A ↦ ∣A∣.
We have thus obtained a means for calculating the volume of every n-tuple
of vectors given its value for a basis, assuming we know how to calculate the
determinant of a matrix.
Exercises
det(λA) = λⁿ det(A).
(easy) 6.5 Let A ∈ Mn (F) be such that its n-th column is a linear combi-
nation of the other columns. Show that
det A = 0F .
By multilinearity, det A is a sum of determinants in which the n-th column has been replaced by c_j times the j-th column. Once again by multilinearity, each c_j factors out and we remain with the determinant of a matrix having two equal columns, hence zero.
(easy) 6.6 Let A ∈ Mn (F) be such that aji = 0F for all i < j. Show that
det A = a11 a22 . . . ann .
Solution 6.6: Since the determinant is invariant under adding a multiple of one column to another, one can inductively eliminate all the off-diagonal entries of the matrix without modifying the determinant. One remains then with a diagonal matrix having the same diagonal entries as the original matrix.
(a) f (A) = 1F .
(b) f (A) = 0F .
(c) f (A) = a11 + a22 + a33 .
(d) f (A) = a11 a11 + 2a11 a22 .
(e) f (A) = −a11 a12 a33 .
(f) f (A) = a12 a23 a31 + a13 a21 a32 .
Solution 6.7: (a) Neither (b) Both linear and multilinear (c) Linear (d) Neither (e)
Multilinear (f) Multilinear.
for α, β, γ ∈ F satisfying
Proposition 6.15 Let D_kk(a) and T_kℓ(a) be the elementary matrices defined
by (6.3) and (6.4). Then, for all A ∈ Mn (F),
In particular,
and
ω((v1, . . . , vn)A T_kℓ(a)) = ω((v1, . . . , vn)A).

Dividing by ω(v1, . . . , vn), we obtain the desired result. ∎
Proposition 6.17 Let A ∈ Mn (F). Then, A ∈ GLn (F) if and only if det A ≠
0.
A = E1 ⋯En ,
and since det Ei ≠ 0 for all i, it follows from the previous corollary that
Proof : If either A or B are not invertible, then AB is not invertible and both
sides of the equation vanish. Otherwise, both A and B can be written as
products of elementary matrices,
Then,

\[
\det(AB) = \underbrace{\det(E_1)\cdots\det(E_n)}_{\det A}\,\underbrace{\det(F_1)\cdots\det(F_k)}_{\det B}.
\]
\[
\begin{bmatrix} 3 & 1 \\ 2 & -1 \end{bmatrix}
= \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -5 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]

Hence

\[
\det\begin{bmatrix} 3 & 1 \\ 2 & -1 \end{bmatrix} = -5.
\]
▲▲▲
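A quick check of this factorization and the resulting determinant (a minimal sketch, sympy assumed):

\begin{verbatim}
import sympy as sp

E1 = sp.Matrix([[1, 1], [0, 1]])
E2 = sp.Matrix([[1, 0], [2, 1]])
E3 = sp.Matrix([[1, 0], [0, -5]])
E4 = sp.Matrix([[1, 2], [0, 1]])

A = E1 * E2 * E3 * E4
print(A)        # Matrix([[3, 1], [2, -1]])
print(A.det())  # -5 = 1 * 1 * (-5) * 1
\end{verbatim}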
Such a means for calculating determinants is not very convenient. A more systematic way hinges on the proof of Theorem 6.11, which, we remind, was constructive: it yields an expansion of the determinant along the first row.
Example: For n = 2,

\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = a\,|d| - b\,|c| = ad - bc.
\]

For n = 3,

\[
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}
= a\begin{vmatrix} e & f \\ h & i \end{vmatrix}
- b\begin{vmatrix} d & f \\ g & i \end{vmatrix}
+ c\begin{vmatrix} d & e \\ g & h \end{vmatrix}.
\]

▲▲▲
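The recursive first-row expansion translates directly into code; a minimal sketch in plain Python (exponential in n, so for illustration only):

\begin{verbatim}
def det(A):
    # Laplace expansion along the first row.
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k]
               * det([row[:k] + row[k+1:] for row in A[1:]])
               for k in range(n))

# The matrix of the next example:
assert det([[1, 3, 4], [7, 2, 1], [9, 3, 2]]) == -2
\end{verbatim}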
Exercises
det(λI2 − A) = λ² − λ tr A + det A,

In the next semester you will see why such an operation makes sense.
Example: If

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix},
\]

then

\[
A^t = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}.
\]
▲▲▲
Lemma 6.20 Let A ∈ Mm×n (F) and let B ∈ Mn×k (F). Then
(AB)^t = B^t A^t.

det E^t = det E.

det A^t = det A.
Exercises
(intermediate) 6.15 Let A ∈ F^n_col and let B ∈ F^n_row for n > 1. What can be said about

det(AB)?
Solution 6.16: It is zero because the right-most three columns are not linearly-
independent.
(intermediate) 6.22 For each of the following matrices, calculate the de-
terminants and determine for what values of the parameters those matrices
are invertible:
\[
\begin{bmatrix} 1 & a-2 & -a+1 \\ 0 & 2 & a-1 \\ a & a^2 & a^2-1 \end{bmatrix}
\qquad
\begin{bmatrix} b-3 & -2 \\ 1 & b-6 \end{bmatrix}
\qquad
\begin{bmatrix} c-1 & 4 \\ 2 & c-3 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & d & d^2 \\ d & d^2 & 1 \\ d^2 & 1 & d \end{bmatrix}.
\]
hence

\[
\det A = (-1)^{k+1} a^1_k \det A^{1k} + (-1)^{k+2} a^1_k \det A^{1\ell} = 0_F.
\]

If, for example, ℓ = k + 2, then A^{1k} and A^{1ℓ} differ by the interchange of two columns, which implies that their determinants differ by a sign.
Example: Let us expand

\[
A = \begin{bmatrix} 1 & 3 & 4 \\ 7 & 2 & 1 \\ 9 & 3 & 2 \end{bmatrix}
\]

along the second row. The minors A^{21}, A^{22}, A^{23} are obtained by deleting the second row and, respectively, the first, second and third columns:

\[
A^{21} = \begin{bmatrix} 3 & 4 \\ 3 & 2 \end{bmatrix}, \qquad
A^{22} = \begin{bmatrix} 1 & 4 \\ 9 & 2 \end{bmatrix}, \qquad
A^{23} = \begin{bmatrix} 1 & 3 \\ 9 & 3 \end{bmatrix}.
\]
So that

\[
\det A = -7\begin{vmatrix} 3 & 4 \\ 3 & 2 \end{vmatrix}
+ 2\begin{vmatrix} 1 & 4 \\ 9 & 2 \end{vmatrix}
- 1\begin{vmatrix} 1 & 3 \\ 9 & 3 \end{vmatrix}
= (-7)(-6) + 2(-34) + (-1)(-24) = -2.
\]
▲▲▲
Since the determinant is invariant under transposition, we could have as well chosen a distinguished column, say the j-th column, and then sum up over all rows:

\[
\det A = \sum_{i=1}^{n} (-1)^{j+i}\, a^i_j \det A^{ij}. \tag{6.11}
\]
Example: Take the same matrix as in the previous example and take, say, the third column, j = 3. The minors A^{13}, A^{23}, A^{33} are obtained by deleting the third column and, respectively, the first, second and third rows:

\[
A^{13} = \begin{bmatrix} 7 & 2 \\ 9 & 3 \end{bmatrix}, \qquad
A^{23} = \begin{bmatrix} 1 & 3 \\ 9 & 3 \end{bmatrix}, \qquad
A^{33} = \begin{bmatrix} 1 & 3 \\ 7 & 2 \end{bmatrix}.
\]
So that

\[
\det A = 4\begin{vmatrix} 7 & 2 \\ 9 & 3 \end{vmatrix}
- 1\begin{vmatrix} 1 & 3 \\ 9 & 3 \end{vmatrix}
+ 2\begin{vmatrix} 1 & 3 \\ 7 & 2 \end{vmatrix}
= 4 \cdot 3 - (-24) + 2(-19) = -2.
\]
▲▲▲
Next, we relate determinants to the first subject of this course, the solution
of linear systems; we focus on the case where the number of equations equals
the number of unknowns. Let A ∈ Mn (F) and b ∈ Fncol , and denote by
A^{j→b} the matrix in which the j-th column has been replaced by the column matrix b.
\[
\begin{bmatrix} 5 & 3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} x^1 \\ x^2 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \end{bmatrix}.
\]

Then,

\[
x^1 = \frac{\begin{vmatrix} 4 & 3 \\ 0 & 1 \end{vmatrix}}{\begin{vmatrix} 5 & 3 \\ 2 & 1 \end{vmatrix}} = \frac{4}{-1}
\qquad\text{and}\qquad
x^2 = \frac{\begin{vmatrix} 5 & 4 \\ 2 & 0 \end{vmatrix}}{\begin{vmatrix} 5 & 3 \\ 2 & 1 \end{vmatrix}} = \frac{-8}{-1},
\]

and indeed,

\[
\begin{bmatrix} 5 & 3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} -4 \\ 8 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \end{bmatrix}.
\]
But let’s actually go through the steps of the proof below. We have

\[
\begin{bmatrix} 4 \\ 0 \end{bmatrix} = \begin{bmatrix} 5x^1 + 3x^2 \\ 2x^1 + x^2 \end{bmatrix}.
\]

Hence,

\[
\begin{vmatrix} 4 & 3 \\ 0 & 1 \end{vmatrix}
= \begin{vmatrix} 5x^1 + 3x^2 & 3 \\ 2x^1 + x^2 & 1 \end{vmatrix}
= \begin{vmatrix} 5x^1 & 3 \\ 2x^1 & 1 \end{vmatrix}
= x^1 \begin{vmatrix} 5 & 3 \\ 2 & 1 \end{vmatrix},
\]

and

\[
\begin{vmatrix} 5 & 4 \\ 2 & 0 \end{vmatrix}
= \begin{vmatrix} 5 & 5x^1 + 3x^2 \\ 2 & 2x^1 + x^2 \end{vmatrix}
= \begin{vmatrix} 5 & 3x^2 \\ 2 & x^2 \end{vmatrix}
= x^2 \begin{vmatrix} 5 & 3 \\ 2 & 1 \end{vmatrix}.
\]
▲▲▲
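Cramer's formula x^j = det A^{j→b} / det A is a one-liner to implement; a minimal sketch (sympy assumed) for the system above:

\begin{verbatim}
import sympy as sp

A = sp.Matrix([[5, 3], [2, 1]])
b = sp.Matrix([4, 0])

def cramer(A, b):
    d = A.det()
    xs = []
    for j in range(A.cols):
        Aj = A.copy()
        Aj[:, j] = b           # A^{j -> b}: replace the j-th column by b
        xs.append(Aj.det() / d)
    return sp.Matrix(xs)

x = cramer(A, b)
print(x.T)                     # [-4, 8]
assert A * x == b
\end{verbatim}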
\[
A^{j\to b} = \begin{bmatrix} \mathrm{Col}_1(A) & \dots & \sum_{i=1}^{n} x^i\,\mathrm{Col}_i(A) & \dots & \mathrm{Col}_n(A) \end{bmatrix},
\]

where the sum is at the j-th column. By the multilinearity of the determinant,

\[
\det A^{j\to b} = \sum_{i=1}^{n} x^i \left| \mathrm{Col}_1(A)\ \dots\ \mathrm{Col}_i(A)\ \dots\ \mathrm{Col}_n(A) \right|,
\]

where in the i-th summand the column Col_i(A) occupies the j-th position.
By the alternation of the determinant, all the summands vanish, except for
the j-th, i.e.,
\[
\det A^{j\to b} = x^j \det A. \qquad\blacksquare
\]
And with this, we obtain Cramer’s formula for the inverse matrix:
Proof : Let’s verify that this coincides with the known formula for 2 × 2 matrices. We have

\[
(A^{-1})^1_1 = (-1)^2\,\frac{\det A^{11}}{\det A} = \frac{d}{ad - bc}
\qquad
(A^{-1})^1_2 = (-1)^3\,\frac{\det A^{21}}{\det A} = -\frac{b}{ad - bc}.
\]
\[
\det A^{i \to e_j} = (-1)^{i+j} \det A^{ji}. \qquad\blacksquare
\]
Exercises
(intermediate) 6.26 Solve the following linear systems over R using Cramer’s
formula.
(a)
X +2Y +3Z =6
4X +5Y +6Z = 15
7X +8Y +10Z = 25.
(b)
X +Y +Z = 11
2X −6Y −Z = 0
3X +4Y +2Z = 0.
(c)
3X −2Y =7
3Y −2Z = 6
−2X +3Z = −1.
(u1 . . . un) = BA,

we have

ω(u1 . . . un) = ω(B) det A,

that is, det A = ω(BA)/ω(B). And further, if

ω = c η,

then

ω(u1 . . . un) = c η(u1 . . . un).
Proof : Let B be any ordered basis on V and let A ∈ Mn (F) be the unique
matrix satisfying
(u1 . . . un ) = BA.
Then,

\[
\omega(u_1 \dots u_n) = \omega(BA) = \omega(B)\,\frac{\omega(BA)}{\omega(B)} = \omega(B)\,\frac{\eta(BA)}{\eta(B)} = c\,\eta(u_1 \dots u_n),
\]

where

\[
c = \frac{\omega(B)}{\eta(B)}. \qquad\blacksquare
\]
Since all the volume forms are multiples of each other, they are essentially
the same; they only differ by a choice of units. This observation yields that
an operator on a vector space can be characterized by how much it magnifies
volumes:
Proof : Let A = [f]^B_B, i.e.,

f(B) = BA.

Then,

\[
\frac{\omega(f(B))}{\omega(B)} = \frac{\omega(BA)}{\omega(B)} = \det A,
\]

and the right-hand side depends neither on ω nor on B. ∎
Thus, the determinant of f coincides with the determinant of its representing matrix, and this identity does not depend on the basis relative to which we represent it: for C = BP,

[f]^C_C = P⁻¹ [f]^B_B P,

and similar matrices have equal determinants.
Exercises
det[T ○ S]^B_B = det[T]^B_B det[S]^B_B.

(a) det[IdV]^B_C ≠ 0.

(b) (det[IdV]^B_C)⁻¹ = det[IdV]^C_B.

(c) det[IdV]^B_D = det[IdV]^C_D det[IdV]^B_C.
Solution 6.30: If A is the zero matrix then g is the zero function and its determinant
is clearly zero. Otherwise, since g(A) = 0, it follows that g is not invertible, and neither is
any of its representing matrices.
Show that

det L = det R = (det A)ⁿ.
Hint: Separate the cases A ∈ GLn(F) and A ∉ GLn(F).