Elementary Matrix Algebra
About this ebook
This complete and coherent exposition, complemented by numerous illustrative examples, offers readers a text that can teach by itself. Fully rigorous in its treatment, it offers a mathematically sound sequencing of topics. The work starts with the most basic laws of matrix algebra and progresses to the sweep-out process for obtaining the complete solution of any given system of linear equations — homogeneous or nonhomogeneous — and the role of matrix algebra in the presentation of useful geometric ideas, techniques, and terminology.
Other subjects include the complete treatment of the structure of the solution space of a system of linear equations, the most commonly used properties of determinants, and linear operators and linear transformations of coordinates. Considerably more material than can be offered in a one-semester course appears here; this comprehensive volume by Franz E. Hohn, Professor of Mathematics at the University of Illinois for many years, provides instructors with a wide range of choices in order to meet differing interests and to accommodate students with varying backgrounds.
Book preview
Elementary Matrix Algebra - Franz E. Hohn
ELEMENTARY MATRIX ALGEBRA
CHAPTER 1
Introduction to Matrix Algebra
1.1 Matrices
There are many situations in both pure and applied mathematics in which we have to deal with rectangular arrays of numbers or functions. An array of this kind may be represented by the symbol
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \tag{1.1.1}$$
The numbers or functions aij of this array are called its elements or entries and in this book are assumed to have real or complex values. Such an array, subject to rules of operation to be defined below, is called a matrix. We shall denote matrices with pairs of square brackets, but pairs of double bars, || ||, and pairs of parentheses, ( ), are also used for this purpose. The subscripts i and j of the element aij of a matrix A identify respectively the row and the column of A in which aij is located. When there is no need to distinguish between rows and columns, we call them simply lines of the matrix.
A matrix A with m rows and n columns is called a matrix of order (m, n) or an m × n ("m by n") matrix. When m = n so that the matrix is square, it is called a matrix of order n or an n-square matrix. When A is of order n, the elements a11, a22, ..., ann are said to constitute the main or principal diagonal of A and the elements an1, an–1,2, ..., a1n constitute its secondary diagonal. When A is m × 1, it is called a column matrix or vector.
It is often convenient to abbreviate the symbol (1.1.1) to the form [aij](m,n), which means " the matrix of order (m, n) whose elements are the aij’s." When the order of the matrix need not be specified or is clear from the context, this is abbreviated further to the form [aij]. Another convenient procedure which we shall follow is to denote matrices by capital letters such as A, B, X, Y, etc., whenever it is not necessary to indicate explicitly the elements or the orders of the matrices in question.
A simple illustration of the matrix concept is the following: The coefficients of x and y in the system of linear equations
$$\begin{aligned} a_1x + b_1y &= c_1,\\ a_2x + b_2y &= c_2 \end{aligned}$$
provide the matrix of order 2:
$$\begin{bmatrix} a_1 & b_1\\ a_2 & b_2 \end{bmatrix},$$
which is called the coefficient matrix of the system. The 2 × 3 matrix
$$\begin{bmatrix} a_1 & b_1 & c_1\\ a_2 & b_2 & c_2 \end{bmatrix},$$
containing the coefficients of x and y and the constant terms as well, is called the augmented matrix of the system. The coefficient and augmented matrices of systems of equations are useful in investigating their solutions, as we shall see later.
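As a computational aside (added here, not part of the original text), the coefficient and augmented matrices of a concrete system can be assembled with Python's numpy; the system and all numbers below are invented for illustration.

```python
import numpy as np

# Invented system:  2x + 3y = 8,  x - 4y = -5
coefficients = np.array([[2, 3],
                         [1, -4]])     # 2 x 2 coefficient matrix
constants = np.array([[8],
                      [-5]])           # column of constant terms

# The augmented matrix adjoins the constants as an extra column (2 x 3).
augmented = np.hstack([coefficients, constants])
print(augmented)
```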
1.2 Equality of Matrices
Two matrices [aij](m,n) and [bij](m,n) are defined to be equal if and only if aij = bij for each pair of subscripts i and j. In words, two matrices are equal if and only if they have the same order and have equal corresponding elements throughout.
From this definition and from the properties of equality in ordinary algebra, there follow four properties of the equality of matrices:
(a) If A and B are any two matrices, either A = B or A ≠ B (the determinative property).
(b) If A is any matrix, A = A (the reflexive property).
(c) If A = B, then B = A (the symmetric property).
(d) If A = B and B = C, then A = C (the transitive property).
Many mathematical relationships other than equality of matrices possess these same four properties. (The similarity of triangles is a simple example. Can you think of others?) Any relation between pairs of mathematical objects which possesses these properties is called an equivalence relation. Several types of equivalence relations will be defined and used in this book.
Because equality means that matrices are in fact identical, a matrix may be substituted for any equal matrix in the following operations.
1.3 Addition of Matrices
If A = [aij](m,n) and B = [bij](m,n), we define the sum A + B to be the matrix [(aij + bij)](m,n). That is, the sum of two matrices of the same order is found by adding corresponding elements throughout. For example,
.
Two matrices of the same order are said to be conformable for addition.
Since the sum of any two m × n matrices is again an m × n matrix, we say that the set of all m × n matrices is closed with respect to addition.
1.4 Commutative and Associative Laws of Addition
Throughout this book, the real and the complex numbers and functions thereof will be called scalars to distinguish them from the arrays which are called matrices. In scalar algebra, the fact that a + b = b + a for any two scalars a and b is known as the commutative law of addition. The fact that a + (b + c) = (a + b) + c for any three scalars a, b, and c is known as the associative law of addition. It is not hard to see that these laws extend to matrix addition also.
Let A, B, C be arbitrary matrices of the same order. Then, using the definition of the sum of two matrices and the commutative law of addition of scalars, we have in the abbreviated notation
$$A + B = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = B + A.$$
Similarly, applying the associative law for the addition of scalars, we have
$$A + (B + C) = [a_{ij} + (b_{ij} + c_{ij})] = [(a_{ij} + b_{ij}) + c_{ij}] = (A + B) + C.$$
We have thus proved
Theorem 1.4.1: The addition of matrices is both commutative and associative; that is, if A, B, and C are conformable for addition,
$$A + B = B + A, \tag{1.4.1}$$
$$A + (B + C) = (A + B) + C. \tag{1.4.2}$$
The reader who finds the above notation a little too condensed should write out the details in full for matrices of order, say, (2, 3).
These two laws, applied repeatedly if necessary, enable us to arrange the terms of a sum in any order we wish, and to group them in any fashion we wish. In particular, they justify the absence of parentheses in an expression like A + B + C, which (for given A, B, C of the same order) has a uniquely defined meaning.
Another important property of matrix addition is given in
Theorem 1.4.2: A + C = B + C if and only if A = B.
The fact that A + C = B + C implies A = B is called the cancellation law for addition.
Indeed, A + C = B + C if and only if aij + cij = bij + cij in every case. But aij + cij = bij + cij if and only if aij = bij, by the cancellation law of addition in the complex domain. This implies A + C = B + C if and only if A = B.
1.5 Subtraction of Matrices
A matrix all of whose elements are zero is called a zero matrix and is denoted by 0, or by 0n, or by 0m × n when the order needs emphasis. The basic property of the matrix 0m × n is that, for all m × n matrices A,
$$A + 0 = 0 + A = A; \tag{1.5.1}$$
that is, the zero matrix is an identity element for addition.
The negative of an m × n matrix A = [aij] is defined to be –A = [–aij]. That is, the negative of A is formed by changing the sign of every element of A. The reason for this definition is, of course, to guarantee that
$$A + (-A) = 0. \tag{1.5.2}$$
Thus – A is an inverse of A with respect to addition.
Instead of A + (– A) = 0, we agree to write A – A = 0. In general, we define
$$A - B = A + (-B). \tag{1.5.3}$$
This implies that the difference A – B may be found by subtracting corresponding elements.
For example,
.
An important consequence of the preceding definitions is that, if X, A, and B are all of the same order, then a solution of the equation
$$X + A = B \tag{1.5.4}$$
is
$$X = B - A.$$
In fact, replacement of X by B – A in X + A yields
$$(B - A) + A = B + (-A + A) = B + 0 = B,$$
which shows that B – A is indeed a solution of (1.5.4). [Note that (1.5.2), (1.4.2), (1.4.1), and (1.5.1) were all employed in the proof.] Moreover, it is the only solution, for, if Y is any solution, then
$$Y + A = B,$$
so that, by substitution,
$$Y = Y + 0 = Y + (A + (-A)) = (Y + A) + (-A) = B + (-A) = B - A.$$
(Each use of a basic law should be identified here by the reader.) In summary, B – A is the unique solution of the equation X + A = B.
In particular, this proves that the equations
$$X + A = A \qquad\text{and}\qquad X + A = 0$$
have respectively the unique solutions 0 and –A; that is, for a given order (m, n), the additive identity element is unique, and each matrix A has a unique inverse with respect to addition, namely the matrix –A defined above.
In summary, with respect to addition, the set of all m × n matrices has the same properties as the set of complex numbers. This is because matrix addition is effected by mn independent scalar additions—one in each of the mn positions.
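A short numpy sketch (an added illustration with invented entries, not from the text) confirms this additive structure: the zero matrix acts as the identity, −A is the additive inverse, and X = B − A solves X + A = B.

```python
import numpy as np

A = np.array([[1, -2], [3, 0]])
B = np.array([[4, 4], [-1, 2]])
Z = np.zeros_like(A)                  # the 2 x 2 zero matrix

assert np.array_equal(A + Z, A)       # A + 0 = A
assert np.array_equal(A + (-A), Z)    # A + (-A) = 0

X = B - A                             # the unique solution of X + A = B
assert np.array_equal(X + A, B)
```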
1.6 Scalar Multiples of Matrices
If A = [aij] and if α is a scalar, we define αA = Aα = [αaij]. In words, to multiply a matrix A by a scalar α, multiply every element of A by α. This definition is, of course, suggested by the fact that, if we add n A’s, we obtain a matrix whose elements are those of A each multiplied by n. For example,
.
The operation of multiplying a matrix by a scalar has these basic properties:
(a) (αβ)A = α(βA),
(b) (α + β)A = αA + βA,
(c) α(A + B) = αA + αB,
(d) 1 · A = A.
All four are readily proved by appealing to the definition. In the case of the second, we have (α + β)A = [(α + β)aij] = [αaij + βaij] = [αaij] + [βaij] = αA + βA. We leave it as an exercise to the reader to prove the other laws in a similar fashion.
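Laws of this kind are easy to spot-check numerically; the sketch below is an added illustration with invented matrices and scalars.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, -1], [5, 2]])
alpha, beta = 3, -2

assert np.array_equal((alpha * beta) * A, alpha * (beta * A))    # (αβ)A = α(βA)
assert np.array_equal((alpha + beta) * A, alpha * A + beta * A)  # (α+β)A = αA + βA
assert np.array_equal(alpha * (A + B), alpha * A + alpha * B)    # α(A+B) = αA + αB
assert np.array_equal(1 * A, A)                                  # 1·A = A
```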
1.7 The Multiplication of Matrices
Frequently in the mathematical treatment of a problem, the work can be simplified by the introduction of new variables. Translations of axes, effected by equations of the form
$$x = x' + h, \qquad y = y' + k,$$
and rotations of axes, effected by
$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta,$$
are the most familiar examples. A rotation of axes is a special case of a change of variables of the type
$$x = a_{11}x' + a_{12}y', \qquad y = a_{21}x' + a_{22}y', \tag{1.7.1}$$
in which the a's are constants. Substitutions of the latter kind are called linear homogeneous transformations of the variables and are of great usefulness. The properties of these transformations suggest the law that should be adopted for the multiplication of matrices, which we now proceed to illustrate.
Let us consider, for example, the effect on the system of linear functions
resulting from an application of the linear transformation (1.7.1). Substitution from (1.7.1) into (1.7.2) yields the new system of linear functions
From the three systems of linear expressions, (1.7.1), (1.7.2), and (1.7.3), we obtain three coefficient matrices. Since the third matrix is in a sense the product
of the first two, we shall relate them by the following matrix equation:
$$\begin{bmatrix} 2 & 3\\ 3 & -4 \end{bmatrix}\begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} 2a_{11} + 3a_{21} & 2a_{12} + 3a_{22}\\ 3a_{11} - 4a_{21} & 3a_{12} - 4a_{22} \end{bmatrix}.$$
The question now is, What rule for ‘multiplying’ matrices does this equation imply?
The element (2a11 + 3a21) in the first row and first column of the matrix on the right may be obtained by multiplying the elements of the first row of the extreme left matrix respectively by the corresponding elements of the first column of the second matrix on the left and then adding the results: (first × first) + (second × second). If we multiply the elements of the second row of the extreme left matrix respectively by the corresponding elements of the first column of the second matrix and add, we obtain the entry (3a11 – 4a21) in the second row and first column on the right. A similar procedure is followed for every other entry on the right. (The reader should check them all.)
This example suggests the following general definition. Let A be an m × p matrix and let B be a p × n matrix. The product AB is then defined to be the m × n matrix whose element in the ith row and jth column is found by multiplying corresponding elements of the ith row of A and of the jth column of B, and then adding the results. Symbolically, we may write
$$AB = [a_{ik}]_{(m,p)}\,[b_{kj}]_{(p,n)} = [c_{ij}]_{(m,n)},$$
where
$$c_{ij} = \sum_{k=1}^{p} a_{ik}b_{kj}.$$
Two things should be noted particularly. First, the product AB has the same number of rows as the matrix A and the same number of columns as the matrix B. Second, the number of columns in A and the number of rows in B must be the same since otherwise there will not always be corresponding elements to multiply together. When the number of columns of a matrix A is the same as the number of rows of a matrix B, A is said to be conformable to B for multiplication.
These matters are illustrated further in the examples which follow:
(b)
.
(c)
.
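As an added illustration of the row-by-column rule and of conformability (with invented matrices), note that a 2 × 3 matrix times a 3 × 2 matrix yields a 2 × 2 product, while the factors taken in the other order yield a 3 × 3 product:

```python
import numpy as np

A = np.array([[1, 0, 2],
              [-1, 3, 1]])      # 2 x 3
B = np.array([[3, 1],
              [2, 1],
              [1, 0]])          # 3 x 2

AB = A @ B                      # 2 x 2: entry (i, j) = sum_k A[i, k] * B[k, j]
BA = B @ A                      # 3 x 3: also defined, but of a different order

# Entry (0, 0) of AB computed directly from the definition:
assert AB[0, 0] == sum(A[0, k] * B[k, 0] for k in range(3))
print(AB.shape, BA.shape)       # (2, 2) (3, 3)
```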
1.8 The Properties of Matrix Multiplication
In the product AB we say that B is premultiplied by A and that A is postmultiplied by B. This terminology is essential since ordinarily AB ≠ BA. In fact, if A has order (m, p) and B has order (p, n) with m ≠ n, the product AB is defined but the product BA is not. Thus the fact that A is conformable to B for multiplication does not imply that B is conformable to A for multiplication. Even if m = n, we need not have AB = BA. That is, matrix multiplication is not in general commutative. We give some numerical examples in which the reader should verify every detail:
(a)
,
but
.
(b)
,
but
.
This last example shows that multiplication is not commutative even in the case of square matrices.
The fact that multiplication is not, in general, commutative does not mean that we never have AB = BA. There are, in fact, important special cases when this equality holds. Examples will appear later in this chapter.
The familiar rule of scalar algebra that if a product is zero, then one of the factors must be zero, also fails to hold for matrix multiplication. An example is the product
.
Here neither factor is a zero matrix, although the product is.
When a product AB = 0 but neither A nor B is 0, then the factors A and B are called divisors of zero. Thus, in the algebra of matrices, there exist divisors of zero, whereas in the algebra of complex numbers there do not.
We illustrate a final contrast with the laws of scalar algebra by means of the following example. Let
.
Then
.
Thus we can have AB = AC without having B = C. In other words, we cannot ordinarily cancel A from AB = AC even if A ≠ 0. However, there is an important special case when the cancellation is possible, as we shall see later.
In summary, then, three fundamental properties of multiplication in scalar algebra do not carry over to matrix algebra:
(a) The commutative law AB = BA does not hold true generally.
(b) From AB = 0, we cannot conclude that at least one of A and B must be zero; that is, there exist divisors of zero.
(c) From AB = AC or BA = CA we cannot in general conclude that B = C, even if A ≠ 0; that is, the cancellation law does not hold in general in multiplication.
These rather staggering losses might make one wonder whether matrix multiplication is not a nearly useless operation. This is, of course, not the case, for, as we shall prove, the most vital properties—the associative and the distributive laws—still remain. However, it should be clear at this point why we have been, and must continue to be, so careful to prove the validity of the matrix operations which we employ.
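A single added numpy example (invented matrices, not from the text) exhibits all three failures at once: AB ≠ BA, a zero product with nonzero factors, and AB = AC with B ≠ C.

```python
import numpy as np

A = np.array([[1, 1],
              [1, 1]])
B = np.array([[1, 0],
              [0, 0]])
C = np.array([[0, 0],
              [1, 0]])

print(np.array_equal(A @ B, B @ A))   # False: multiplication is not commutative

D = np.array([[ 1, -1],
              [-1,  1]])
print(A @ D)                          # the zero matrix, though neither A nor D is zero

print(np.array_equal(A @ B, A @ C), np.array_equal(B, C))  # True, False: no cancellation
```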
Theorem 1.8.1: The multiplication of matrices is associative.
Let
$$A = [a_{ij}]_{(m,n)}, \qquad B = [b_{jk}]_{(n,p)}, \qquad C = [c_{kr}]_{(p,q)}.$$
Then the theorem says that
$$(AB)C = A(BC).$$
Applying the definition of multiplication, we see first that
$$AB = \left[\left(\sum_{j=1}^{n} a_{ij}b_{jk}\right)\right].$$
Here i ranges from 1 to m and denotes the row of the element in parentheses, whereas k ranges from 1 to p and denotes its column.
We apply the definition now to AB and C. The new summation will be on the column subscript k of AB, which is the row subscript of C, so
$$(AB)C = \left[\sum_{k=1}^{p}\left(\sum_{j=1}^{n} a_{ij}b_{jk}\right)c_{kr}\right].$$
Multiplying the factor ckr into each sum in parentheses, we obtain
$$(AB)C = \left[\sum_{k=1}^{p}\sum_{j=1}^{n} a_{ij}b_{jk}c_{kr}\right],$$
in which the row subscript i ranges from 1 to m while the column subscript r ranges from 1 to q.
In the same way we find
$$A(BC) = \left[\sum_{j=1}^{n} a_{ij}\left(\sum_{k=1}^{p} b_{jk}c_{kr}\right)\right] = \left[\sum_{j=1}^{n}\sum_{k=1}^{p} a_{ij}b_{jk}c_{kr}\right].$$
Since the order of summation is arbitrary in a finite sum, we have
$$\sum_{k=1}^{p}\sum_{j=1}^{n} a_{ij}b_{jk}c_{kr} = \sum_{j=1}^{n}\sum_{k=1}^{p} a_{ij}b_{jk}c_{kr}$$
for each pair of values of i and r, so that (AB)C = A(BC).
If the uses made of the Σ sign in this proof are unfamiliar to the reader, he may refer to an explanation of these matters in Appendix I. It would also help to write out the proof in full for 2 × 2 matrices.
Theorem 1.8.2: Matrix multiplication is distributive with respect to addition.
To make this explicit, let
$$A = [a_{ij}]_{(m,n)}, \qquad B = [b_{jk}]_{(n,p)}, \qquad C = [c_{jk}]_{(n,p)}.$$
Here A is conformable to B and also to C for multiplication, and B is conformable to C for addition. Then the theorem says that
$$A(B + C) = AB + AC.$$
Indeed,
$$A(B + C) = \left[\sum_{j=1}^{n} a_{ij}(b_{jk} + c_{jk})\right] = \left[\sum_{j=1}^{n} a_{ij}b_{jk} + \sum_{j=1}^{n} a_{ij}c_{jk}\right] = AB + AC.$$
The theorem also says that, assuming conformability,
$$(A + B)C = AC + BC.$$
This second distributive law is distinct from the first, since matrix multiplication is not in general commutative. It is proved in the same manner as the first, however, and details are left to the reader.
The proofs of the last two theorems involve a detailed examination of the elements of the matrices involved. They are thus essentially scalar in nature. As the theory develops, we shall increasingly employ proofs involving only manipulations with matrices. Such proofs are typically more compact than are scalar-type proofs of the same results and, hence, are to be preferred. The reader’s progress in learning matrix algebra will be accelerated if in the exercises he avoids the use of scalar-type proof whenever this is possible. For example, to prove that, for conformable matrices,
$$(A + B)(C + D) = AC + BC + AD + BD,$$
we do not again resort to a scalar type of proof. We simply note that, by the first distributive law above, (A + B)(C + D) = (A + B)C + (A + B)D, so that, by the second distributive law,
$$(A + B)(C + D) = AC + BC + AD + BD.$$
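The expansion just obtained, together with the associative law, can be spot-checked numerically; the following sketch is an added illustration using randomly chosen 2 × 2 integer matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C, D = (rng.integers(-5, 5, size=(2, 2)) for _ in range(4))

left = (A + B) @ (C + D)
right = A @ C + B @ C + A @ D + B @ D          # (A+B)(C+D) = AC + BC + AD + BD
assert np.array_equal(left, right)

# Associativity (Theorem 1.8.1) checked on the same matrices:
assert np.array_equal((A @ B) @ C, A @ (B @ C))
```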
1.9 Exercises
Throughout this book, exercises marked with an asterisk (*) develop an important part of the theory and should not be overlooked.
In many problems, conditions of conformability for addition or multiplication must be satisfied for the problem to have meaning. These conditions are usually rather obvious, so that we shall frequently omit statement of them. The reader is then expected to make the necessary assumptions in working the exercises.
In these exercises, the words "prove" and "show" are to be taken as synonymous.
1. In each case, find all solutions (x, y) of the given equation:
2. Given that
,
solve for x1, x2, x3, and x4:
.
3. Prove that (A + B) – C = A + (B – C) and name each property used in the proof.
Why is (A – B) + C ≠ A – (B + C) in general?
*4. Prove in detail the second distributive law:
$$(A + B)C = AC + BC.$$
*5. Prove that if αA = 0 where α is a scalar, then either α = 0 or A = 0. Prove also that A = B if and only if αA = αB for all scalars α.
6. Given that αA = βA and A ≠ 0, prove that α = β (α and β scalars).
*7. Prove:
(a) αA · βB = α β · AB (α, β scalars).
(b) (–1)A = –A.
(c) (–A)(–B) = AB.
(d) A(αB) = (αA)B = α (AB).
8. Perform the matrix multiplications:
(a)
,
,
,
,
,
,
(g)
,
(h)
.
9. (a) Using
,
test the rule (AB)C = A(BC).
(b) Under what conditions is a matrix product ABCD defined? According to the associative law, what are the various ways of computing it?
(c) If you have to compute a product AB ··· MX where X is a vector, how will you apply the associative law so as to reduce the computation to a minimum? Illustrate by computing the product
.
10. (a) Explain why in matrix algebra
$$(A + B)^2 \ne A^2 + 2AB + B^2$$
and
$$(A + B)(A - B) \ne A^2 - B^2,$$
except in special cases. Under what circumstances would equality hold?
(b) Expand (A + B)³.
11. Prove that, if A has identical rows and AB is defined, AB has identical rows also.
*12. Given that A is a square matrix, define Aᵖ⁺¹ = Aᵖ · A for p ≥ 1. By induction on q, prove that, for each positive integer p,
$$A^{p}A^{q} = A^{p+q}$$
for all positive integers q. (Note that neither the definition nor the proof is of the scalar type.)
13. , compute A², A³, and A⁴. (Here i² = – 1). Give a general rule for An. (You may treat the cases n even and n odd separately.)
14. Compute A², B², and B⁴, where
.
Define what is meant by an nth root of a square matrix and comment on what these examples imply about the number of nth roots of a matrix.
15. Evaluate:
.
Then state and prove a general rule illustrated by these three examples.
16. Let AB = C where A and B are of order n. If, in the ith row of A, aik = 1, where i and k are fixed, but all other elements of the ith row are zero, what can be said about the ith row of C? What is the analogous fact for columns?
17. Let A and B be of order n and let
C1 = α1A + β1B and C2 = α2A + β2B, where α1, α2, β1, and β2 are scalars such that α1β2 ≠ α2β1. Show that C1C2 = C2C1 if and only if AB = BA.
18. Let
.
Compare the products AB and BA.
19. For what real values of x is
For what value of x is this product a minimum and what is the minimum value?
20. If AB = BA, the matrices A and B are said to be commutative or to commute. Show that for all values of a, b, c, d, the matrices
commute.
21. What must be true about a, b, c, and d if the matrices
are to commute?
22. If AB = – BA, the matrices A and B are said to be anticommutative or to anticommute. Show that each of the matrices
$$\sigma_x = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}, \qquad \sigma_y = \begin{bmatrix} 0 & -i\\ i & 0 \end{bmatrix}, \qquad \sigma_z = \begin{bmatrix} 1 & 0\\ 0 & -1 \end{bmatrix}$$
anticommutes with the others. These are the Pauli spin matrices, which are used in the study of electron spin in quantum mechanics.
23. The matrix AB – BA (A and B of order n) is called the commutator of A and B. Using Exercise 22, show that the commutators of σx and σy, σy and σz, and σz and σx are respectively 2iσz, 2iσx, and 2iσy.
24. If A ∘ B = AB – BA, prove that
(a) A ∘ (B ∘ C) = (A ∘ B) ∘ C if and only if B ∘ (A ∘ C) = 0,
(b) A ∘ (B ∘ C) + B ∘ (C ∘ A) + C ∘ (A ∘ B) = 0.
25. A matrix A such that Ap = 0 for some positive integer p is called nilpotent. Show that every 2 × 2 nilpotent matrix A such that A² = 0 may be written in the form
,
where λ and μ are scalars, and that every such matrix is nilpotent. If A is real, must λ and μ also be real?
26. If
,
show that A1 and A2 commute. What is the connection with transformations used in plane analytic geometry?
27. Given that
,
find the matrix A.
*28. The sum of the main diagonal elements aii, i = 1, 2, ..., n, of a square matrix A of order n is called the trace of A:
$$\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}.$$
(a) If A and B are of order n, show that
$$\operatorname{tr}(A + B) = \operatorname{tr} A + \operatorname{tr} B.$$
(b) If C is of order (m, n) and G is of order (n, m), show that
$$\operatorname{tr}(CG) = \operatorname{tr}(GC).$$
29. Given that A, B, C all have order n, use Exercise 28(b) to show that
$$\operatorname{tr} ABC = \operatorname{tr} BCA = \operatorname{tr} CAB.$$
What is the generalization of this observation? Show that ordinarily tr ABC ≠ tr ACB.
*30. A square matrix of the form
$$\begin{bmatrix} d_{11} & 0 & \cdots & 0\\ 0 & d_{22} & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & d_{nn} \end{bmatrix},$$
that is, one in which dij = 0 if i ≠ j, is called a diagonal matrix of order n. (Note that this does not say dii ≠ 0.) Let A be any matrix of order (p, q) and evaluate the products DpA and ADq. Describe the results in words. What happens in the special cases d11 = d22 = ... = α and d11 = d22 = ... = 1?
*31. (a) Show that any two diagonal matrices of the same order commute.
(b) Give a formula for Dp where D is diagonal and p is a positive integer.
(c) Show that if D is diagonal with non-negative elements, if p is a fixed positive integer, and if A is a fixed matrix, then ADᵖ = DᵖA if and only if AD = DA.
32. A matrix A such that A² = A is called idempotent. Determine all diagonal matrices of order n which are idempotent. How many are there?
*33. Show by induction that, if A is square and AB = λB, where λ is a scalar, then ApB = λpB for every positive integer p.
*34. Given that AB = BA, show that, for all positive integers r and s, ArBs = BsAr.
35. Prove by induction that if B, C are of order n and if A = B + C, C² = 0, and BC = CB, then for every positive integer k, Aᵏ⁺¹ = Bᵏ(B + (k + 1)C).
1.10 Linear Equations in Matrix Notation
In a great variety of applications of mathematics, there appear systems of linear equations of the general form
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\ &\;\;\vdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m, \end{aligned} \tag{1.10.1}$$
where the number m of equations is not necessarily equal to the number n of unknowns. In view of the definition of matrix multiplication, such a system of equations may be written as the single matrix equation
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix}. \tag{1.10.2}$$
In fact, if we compute the matrix product on the left, this equation becomes
$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n\\ \vdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix}.$$
Since these two matrices are equal if and only if all their corresponding elements are equal, this single equation is equivalent to the system of (1.10.1). If we now put
$$A = [a_{ij}]_{(m,n)}, \qquad X = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}, \qquad B = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix},$$
the bulky equation (1.10.2) may be written in the highly compact form
$$AX = B. \tag{1.10.4}$$
When a system of equations is written in the form (1.10.2) or (1.10.4), in which one matrix equation replaces the entire system of scalar equations, it is said to be represented in matrix notation.
Single column matrices, such as X and B in the above discussion, are called vectors. The relation between this concept of a vector and the usual one will be developed later. For the present, a vector is simply a column matrix. The elements of such a matrix are commonly called its components. The vector X above is called an n-vector since it has n components. By the same token, B is an m-vector. Frequently, to save space, an n-vector is written in the form {a1,a2,...,an}, the curly braces being used to identify it as a column matrix. In some books row matrices, that is, matrices consisting of a single row, are also called vectors.
Using this terminology, we see that, whenever we know a set of values of x1, x2,..., xn which simultaneously satisfy the scalar equations (1.10.1), we also know the components of a vector which satisfies the matrix equation (1.10.4), and conversely. Such a vector is called a solution of (1.10.4). The problems of solving the system of scalar equations (1.10.1) and of solving the matrix equation (1.10.4) are thus seen to be equivalent. In Chapter 2, we discuss in detail the computation of the solutions of a system (1.10.1).
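As an added illustration (the system and numbers below are invented), numpy can be used to exhibit a vector X satisfying the matrix equation AX = B:

```python
import numpy as np

# Invented system:  x1 + 2*x2 = 5,  3*x1 - x2 = 1
A = np.array([[1, 2],
              [3, -1]])
B = np.array([[5],
              [1]])

X = np.linalg.solve(A, B)       # solves AX = B when A is square and invertible
assert np.allclose(A @ X, B)    # the vector X satisfies the matrix equation
print(X)                        # components x1 = 1, x2 = 2
```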
At times it is convenient to represent the system (1.10.1) in the vector form
$$x_1A_1 + x_2A_2 + \cdots + x_nA_n = B,$$
where Aj denotes the jth column of the coefficient matrix A. For example, we can write the system
in matrix notation as
or as
.
Substitution reveals that every vector of the form
is a solution of this system. In the next chapter, we see how such solutions may be computed.
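The vector form above says that AX is the combination of the columns of A weighted by the components of X; a brief added numpy check with invented entries:

```python
import numpy as np

A = np.array([[1, 2],
              [3, -1]])
X = np.array([1, 2])                      # components x1, x2

combo = X[0] * A[:, 0] + X[1] * A[:, 1]   # x1*A1 + x2*A2, the columns of A
assert np.array_equal(combo, A @ X)
```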
1.11 The Transpose of a Matrix
The matrix Aᵀ of order (n, m) obtained by interchanging rows and columns in a matrix A of order (m, n) is called the transpose of A. For example, the transpose of
.
Theorem 1.11.1: If Aᵀ and Bᵀ are the transposes of A and B, and if α is a scalar, then:
(a) (Aᵀ)ᵀ = A,
(b) (A + B)ᵀ = Aᵀ + Bᵀ,
(c) (αA)ᵀ = αAᵀ,
(d) (AB)ᵀ = BᵀAᵀ.
The first three of these rules are easy to think through. Detailed proofs are left to the reader. Only (d) will be proved here. Let A = [aik](m, n), B = [bkj](n, p). Then AB = [cij](m, p). Here i, ranging from 1 to m, identifies the row, and j, ranging from 1 to p, identifies the column of the element cij.
Now Bᵀ is of order (p, n) and Aᵀ is of order (n, m), so that Bᵀ is conformable to Aᵀ for multiplication. To compute the element γji in the jth row and the ith column of BᵀAᵀ, we must multiply the jth row of Bᵀ into the ith column of Aᵀ. Observing that the second subscript of an element in B or A identifies the row and the first subscript identifies the column in which it appears in Bᵀ or Aᵀ, we see that
$$\gamma_{ji} = \sum_{k=1}^{n} b_{kj}a_{ik} = \sum_{k=1}^{n} a_{ik}b_{kj} = c_{ij}.$$
Here j ranges from 1 to p and identifies the row of γji, whereas i ranges from 1 to m and identifies the column. Thus γji = cij, but with the meanings of i and j for rows and columns just opposite in the two cases, so that BᵀAᵀ = (AB)ᵀ. The reader would do well to construct a numerical example.
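The reversal rule (AB)ᵀ = BᵀAᵀ is easy to confirm numerically; the matrices below are an added, invented illustration.

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, -1, 4]])      # 2 x 3
B = np.array([[2, 1],
              [0, 5],
              [-1, 3]])         # 3 x 2

assert np.array_equal((A @ B).T, B.T @ A.T)   # (AB)^T = B^T A^T
# A.T @ B.T, by contrast, is 3 x 3 and cannot equal (A @ B).T, which is 2 x 2.
```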
The preceding proof illustrates the fact that the expression for the typical element of a matrix may at times assume a complicated form, but that an essential aspect of the expression is a pair of subscripts which may be used to identify the row and the column of the element. Such subscripts are called free subscripts. Thus, in the expression
$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj},$$
the free subscripts are i and j. Moreover, i is used to designate the row and j the column of the element. The index k is an index of summation. It has nothing to do with the row-column position of the element cij. On the other hand, in the expression
$$\gamma_{ji} = \sum_{k=1}^{n} b_{kj}a_{ik},$$
used in the foregoing, the free subscripts j and i designate respectively the row and the column of the element γji. This illustrates the fact that, whereas the index i is frequently used as a row index and j as a column index, this is not necessary nor is it always convenient. Any convenient index symbol may be used for either purpose. Indeed, in the sum represented by cij in the foregoing, the index k is used both ways—as a column index in aik and as a row index in bkj. It is this freedom to use indices in both ways that enabled us to prove the foregoing theorem in a simple fashion.
As another example illustrating these matters, let
$$a_{ij} = \sum_{k=1}^{n} r_{ik}s_{jk},$$
where i denotes the row and j the column of the element aij. If i ranges from 1 to m and j from 1 to p, then we can interpret rik as the entry in the ith row and kth column of a matrix Rm×n. In sjk, k must play the role of a row index if we want to represent the matrix A = [aij]m×p as a product. Assuming that normally the first subscript is the row subscript, it follows that the element sjk is the element in the jth column and kth row of a matrix (Sp×n)ᵀ. That is,
$$A = R_{m\times n}\,(S_{p\times n})^{T}.$$
Consider finally the sum
$$\sum_{k_2=1}^{n}\sum_{k_1=1}^{n} a_{ik_1}a_{k_1k_2}a_{k_2j},$$
which we can rewrite as
$$\sum_{k_2=1}^{n}\Bigl(\sum_{k_1=1}^{n} a_{ik_1}a_{k_1k_2}\Bigr)a_{k_2j}.$$
The sum in parentheses is the element in row i and column k2 of A². Hence the second sum represents the element in row i and column j of A² · A. That is, we have
,
1.12 Symmetric, Skew-Symmetric, and Hermitian Matrices
A symmetric matrix is a square matrix A such that A = Aᵀ. A skew-symmetric matrix is a square matrix A such that A = –Aᵀ. These definitions may also be stated in terms of the individual elements: A is symmetric if and only if aij = aji for all pairs of subscripts; it is skew-symmetric if and only if aij = –aji for all pairs of subscripts. The reader should demonstrate the equivalence of the alternative definitions in each case.
The following are examples of symmetric and skew-symmetric matrices respectively:
.
Example (b) illustrates the fact that the main diagonal elements of a skew-symmetric matrix must all be zero. Why is this true?
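These defining conditions translate directly into simple numerical tests; the matrices below are invented examples added here.

```python
import numpy as np

S = np.array([[1, 2, 3],
              [2, 0, -1],
              [3, -1, 5]])          # symmetric: S equals its transpose
K = np.array([[0, 2, -4],
              [-2, 0, 1],
              [4, -1, 0]])          # skew-symmetric: K equals minus its transpose

assert np.array_equal(S, S.T)
assert np.array_equal(K, -K.T)      # forces every diagonal entry of K to be 0
```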
A matrix is called a real matrix if and only if all its elements are real. In the applications, real symmetric matrices occur most frequently. However, matrices of complex elements are also of great importance. When the elements of such a matrix A are replaced by their complex conjugates, the resulting matrix is called the conjugate of A and is denoted by Ā. Evidently a matrix A is real if and only if A = Ā. If we transpose Ā, we obtain the transposed conjugate or tranjugate (Ā)ᵀ of A. This will be denoted by the symbol A*. (A* is sometimes called the adjoint of A, but not in this book.) For example, if
,
then
,
and
.
When A = A*, that is, when aij is the complex conjugate of aji for all pairs of subscripts, A is called a Hermitian matrix (after the French mathematician Charles Hermite, 1822–1901). When matrices of complex elements appear in the applications, for example, in the theory of atomic physics, they are often Hermitian. The matrices
.
are simple examples of Hermitian matrices, as is readily verified. Why must the diagonal elements of a Hermitian matrix all be real numbers?
If the elements of A are real, Aᵀ = A*, so that the property of being real and symmetric is a special case of the property of being Hermitian. The following pages will contain a great many results about Hermitian matrices. The reader interested only in the real case may interpret the word "Hermitian" as "real symmetric," and all will be well.
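In numerical work the tranjugate A* corresponds to the conjugate transpose; the following added sketch, with an invented matrix, checks the Hermitian condition and the reality of the diagonal.

```python
import numpy as np

A = np.array([[2, 1 + 1j, 3j],
              [1 - 1j, 0, 5],
              [-3j, 5, -1]])          # invented Hermitian example

A_star = A.conj().T                   # the tranjugate (conjugate transpose) A*
assert np.array_equal(A, A_star)      # A is Hermitian
assert np.all(np.isreal(np.diag(A)))  # its diagonal entries are necessarily real
```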
1.13 Scalar Matrices
A square matrix of the form
$$\begin{bmatrix} \alpha & 0 & \cdots & 0\\ 0 & \alpha & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & \alpha \end{bmatrix},$$
in which each element of the main diagonal equals the scalar α and all other elements are zero, is called a scalar matrix of order n. In matrix multiplication, a scalar matrix behaves like a scalar, as the following equations show:
.
The scalar matrices are even more fundamentally like scalars, however, for if α and β are any two scalars, and if
then
.
These equations show that corresponding to the arithmetic of the real numbers, for example, there is a strictly analogous arithmetic of scalar matrices of a fixed order n in which the scalar matrix with main diagonal elements α corresponds to the real number α. In the same way, corresponding to the arithmetic of complex numbers, there is an analogous arithmetic of scalar matrices the main diagonal elements of which are complex numbers. Many other examples of such a correspondence between a set of scalars and a corresponding set of scalar matrices could be constructed.
More generally, whenever we have a set of scalars such that, if α and β belong to the set and if α + β = γ and αβ = δ, then γ and δ belong to it also, the corresponding set of scalar matrices of a fixed order n is in one-to-one correspondence with the set of scalars, and sums and products of scalars correspond to sums and products of the corresponding matrices. Two systems so related are said to be isomorphic, or we say that there is an isomorphism between them. [Isomorphic means of the same (iso-) form (morphos).]
The notion of isomorphism developed here is a particular application of a general concept of isomorphism which is one of the powerful tools of modern abstract algebra.
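The correspondence can be seen concretely: for scalar matrices of a fixed order, sums map to sums and products to products. A short added check in numpy:

```python
import numpy as np

n = 3
alpha, beta = 2.5, -4.0
S_alpha = alpha * np.eye(n)         # scalar matrix corresponding to alpha
S_beta = beta * np.eye(n)           # scalar matrix corresponding to beta

assert np.array_equal(S_alpha + S_beta, (alpha + beta) * np.eye(n))  # sums correspond
assert np.array_equal(S_alpha @ S_beta, (alpha * beta) * np.eye(n))  # products correspond
```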
1.14 The Identity Matrix
The scalar matrix of order n corresponding to the scalar 1 will be denoted by the symbol In, or simply by I if the order need not be emphasized:
$$I_n = \begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$
It is called the identity matrix or the unit matrix of order n because it plays in matrix algebra the role corresponding to that played by the integer 1 in scalar algebra, as the isomorphism just explained would lead us to expect. However, the role extends beyond the domain of scalar matrices, for