Math 31AH: Lecture 13

Last time, we began our discussion of linear transformations, and in particular observed that the set \mathrm{Hom}(\mathbf{V},\mathbf{W}) of linear transformations from a vector space \mathbf{V} to a vector space \mathbf{W} is itself a vector space in a natural way, because there are natural ways to add and scale linear transformations which are compliant with the vector space axioms. In the case that \mathbf{V}=\mathbf{W}, linear transformations are usually referred to as “linear operators,” and the vector space \mathrm{Hom}(\mathbf{V},\mathbf{V}) of linear operators on \mathbf{V} is typically denoted \mathrm{End}\mathbf{V}. This notation stems from the fact that a fancy name for a linear operator is endomorphism. This term derives from the Greek endon, meaning within, and is broadly used in mathematics to emphasize that one is considering a function whose range is contained within its domain. Linear operators are special in that, in addition to being able to scale and add them to one another, we can also multiply them in a natural way. Indeed, given two linear operators A,B \in \mathrm{End}\mathbf{V}, we may define their product to be their composition, i.e. AB:=A \circ B. Spelled out, this means that AB \in \mathrm{End}\mathbf{V} is the linear operator defined by

AB(\mathbf{v}):=A(B(\mathbf{v})) \quad \forall \mathbf{v} \in \mathbf{V}.

So \mathrm{End}\mathbf{V} is a special type of vector space whose vectors (which are operators) can be scaled, added, and multiplied. Such vector spaces warrant their own name.

Definition 1: An algebra is a vector space \mathbf{V} together with a multiplication rule

\mathbf{V} \times \mathbf{V} \to \mathbf{V}

which is bilinear and associative.

Previously, the only algebra we had encountered was \mathbb{R}, and now we find that there are in fact many more algebras, namely all vector spaces \mathrm{End}\mathbf{V} for \mathbf{V} an arbitrary vector space. So, linear operators are in some sense a generalization of numbers.
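
To make the algebra structure on \mathrm{End}\mathbf{V} concrete, here is a minimal Python sketch, assuming \mathbf{V}=\mathbb{R}^2 with vectors stored as tuples and operators modeled as plain functions. The helper names (add, scale, mul) and the sample operators A, B, C are illustrative assumptions, not notation from the lecture. The script spot-checks linearity in the right factor and associativity on a single vector, which is a sanity check rather than a proof.

```python
# Operators on R^2 modeled as Python functions from tuples to tuples.
# All names here (add, scale, mul, A, B, C) are hypothetical illustrations.

def add(A, B):
    """Pointwise sum of operators: (A + B)v = Av + Bv."""
    return lambda v: tuple(a + b for a, b in zip(A(v), B(v)))

def scale(t, A):
    """Scalar multiple of an operator: (tA)v = t(Av)."""
    return lambda v: tuple(t * a for a in A(v))

def mul(A, B):
    """Product of operators, defined as composition: (AB)v = A(Bv)."""
    return lambda v: A(B(v))

# Three sample linear operators on R^2.
A = lambda v: (v[0] + 2 * v[1], 3 * v[1])
B = lambda v: (v[1], v[0])                  # swaps the two coordinates
C = lambda v: (2 * v[0], v[0] - v[1])

v = (1, 4)

# Additivity in the right factor: A(B + C) = AB + AC.
left_distributive = mul(A, add(B, C))(v) == add(mul(A, B), mul(A, C))(v)

# Compatibility with scaling: A(5B) = 5(AB).
scalar_compatible = mul(A, scale(5, B))(v) == scale(5, mul(A, B))(v)

# Associativity: (AB)C = A(BC).
associative = mul(mul(A, B), C)(v) == mul(A, mul(B, C))(v)

print(left_distributive, scalar_compatible, associative)
```

Integer entries are used so that the equality tests are exact. The same checks with the factors in the other slot, such as (A+B)C = AC + BC, can be written in the same way.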

However, there are some notable differences between numerical multiplication and the multiplication of operators. One of the main differences is that multiplication of linear operators is noncommutative: it is not necessarily the case that AB=BA.

Exercise 1: Find an example of linear operators A,B such that AB\neq BA.

Another key difference between the arithmetic of numbers and the arithmetic of operators is that division is only sometimes possible: it is not the case that all non-zero operators have a multiplicative inverse, which is defined as follows.

Definition 2: An operator A \in \mathrm{End}\mathbf{V} is said to be invertible if there exists an operator B \in \mathrm{End}\mathbf{V} such that AB=BA=I, where I \in \mathrm{End}\mathbf{V} is the identity operator defined by I\mathbf{v}=\mathbf{v} for all \mathbf{v} \in \mathbf{V}.

You should take a moment to compare this definition of an invertible linear operator with the definition of a vector space isomorphism from Lecture 2. You will then see that A being invertible is equivalent to A \colon \mathbf{V} \to \mathbf{V} being an isomorphism of \mathbf{V} with itself. An isomorphism from a vector space to itself is called an automorphism, where the prefix “auto” is from the Greek word for “self.” The set of all invertible linear operators in \mathrm{End}\mathbf{V} is therefore often denoted \mathrm{Aut}\mathbf{V}.

Proposition 1: If A \in \mathrm{Aut}\mathbf{V}, then there is precisely one operator B \in \mathrm{End}\mathbf{V} such that AB=BA=I.

Proof: Suppose that B,C \in \mathrm{End}\mathbf{V} are such that

AB=BA=I \quad \text{and} \quad AC=CA=I.

Then we have

AB = AC \implies BAB = BAC \implies IB = IC \implies B=C.

— Q.E.D.

Thus, if A is an invertible operator, then it has a unique inverse, so it is reasonable to call this “the inverse” of A, and denote it A^{-1}. You should check for yourself that A^{-1} is invertible, and that its inverse is A, i.e. that (A^{-1})^{-1}=A.

Exercise 2: Find an example of a nonzero linear operator which is not invertible.

Proposition 2: The set \mathrm{Aut}\mathbf{V} of invertible operators is closed under multiplication: if A,B \in \mathrm{Aut}\mathbf{V}, then AB \in \mathrm{Aut}\mathbf{V}.

Proof: We have

(AB)(B^{-1}A^{-1})=A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1}=I,

and similarly (B^{-1}A^{-1})(AB)=B^{-1}(A^{-1}A)B=B^{-1}IB=B^{-1}B=I. Together these show that AB is invertible, and that (AB)^{-1}=B^{-1}A^{-1}.

— Q.E.D.
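
The order of the factors in the proof matters: the operator that undoes AB is B^{-1}A^{-1}, with the inverses composed in reversed order. The following Python sketch illustrates this under the assumption that \mathbf{V}=\mathbb{R}^2, with two shear operators chosen only for illustration. The names mul, agree_on_basis, A_inv and B_inv are hypothetical, and the inverses are written down by hand rather than computed.

```python
# Operators on R^2 as functions; all names are hypothetical illustrations.

def mul(A, B):
    """Product of operators, defined as composition: (AB)v = A(Bv)."""
    return lambda v: A(B(v))

def agree_on_basis(S, T):
    """Linear operators on R^2 are equal iff they agree on e_1 and e_2."""
    basis = [(1, 0), (0, 1)]
    return all(S(e) == T(e) for e in basis)

I = lambda v: v                          # identity operator

A = lambda v: (v[0] + v[1], v[1])        # shear along the first axis
A_inv = lambda v: (v[0] - v[1], v[1])    # undoes A
B = lambda v: (v[0], v[0] + v[1])        # shear along the second axis
B_inv = lambda v: (v[0], v[1] - v[0])    # undoes B

AB = mul(A, B)
candidate = mul(B_inv, A_inv)            # inverses in reversed order
wrong_order = mul(A_inv, B_inv)          # inverses in the original order

# B^{-1}A^{-1} is a two-sided inverse of AB.
two_sided = (agree_on_basis(mul(AB, candidate), I)
             and agree_on_basis(mul(candidate, AB), I))

# A^{-1}B^{-1} is not an inverse of AB for this particular pair.
wrong_fails = not agree_on_basis(mul(AB, wrong_order), I)

print(two_sided, wrong_fails)
```

For operators that happen to commute, the two orders give the same result; the pair above was picked because it does not commute.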

Proposition 2 shows that the set \mathrm{Aut}\mathbf{V} is an example of a type of algebraic structure called a group, which roughly means a set together with a notion of multiplication in which every element has an inverse. We won’t give the precise definition of a group, since the above is the only example of a group we will see in this course. The subject of group theory is its own branch of algebra, and it has many connections to linear algebra.

All of the above may seem quite abstract, and perhaps it is. However, in the case of finite-dimensional vector spaces, linear transformations can be described very concretely as tables of numbers, i.e. as matrices. Consider the vector space \mathrm{Hom}(\mathbf{V},\mathbf{W}) of linear transformations from an n-dimensional vector space \mathbf{V} to an m-dimensional vector space \mathbf{W}. Let A \in \mathrm{Hom}(\mathbf{V},\mathbf{W}) be a linear transformation, let E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\} be a basis of \mathbf{V}, and let F=\{\mathbf{f}_1,\dots,\mathbf{f}_m\} be a basis of \mathbf{W}. The transformation A is then uniquely determined by the finitely many vectors

A\mathbf{e}_1,\dots,A\mathbf{e}_n.

Indeed, any vector \mathbf{v} \in \mathbf{V} may be uniquely represented as a linear combination of vectors in E,

\mathbf{v}=x_1\mathbf{e}_1 + \dots + x_n \mathbf{e}_n,

and we then have

A\mathbf{v}= A(x_1\mathbf{e}_1 + \dots + x_n \mathbf{e}_n)= x_1A\mathbf{e}_1 + \dots + x_n A\mathbf{e}_n.

Now, we may represent each of the vectors A\mathbf{e}_j as a linear combination of the vectors in F,

A\mathbf{e}_j = \sum_{i=1}^m a_{ij} \mathbf{f}_i, \quad 1 \leq j \leq n,

and we then have

A\mathbf{v} = \sum_{j=1}^n x_jA\mathbf{e}_j = \sum_{j=1}^n x_j\sum_{i=1}^m a_{ij}\mathbf{f}_i=\sum_{i=1}^m \left( \sum_{j=1}^n a_{ij}x_j \right)\mathbf{f}_i.

Thus, if

A\mathbf{v}= \sum_{i=1}^m y_i \mathbf{f}_i

is the unique representation of the vector A\mathbf{v} \in \mathbf{W} relative to the basis F of \mathbf{W}, then

y_i = \sum_{j=1}^n a_{ij}x_j, \quad 1 \leq i \leq m.

In other words, our computation shows that we have the matrix equation

\begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} {} & \vdots & {} \\ \dots & a_{ij} & \dots \\ {} & \vdots & {} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.

Schematically, this matrix equation can be expressed as follows: for any \mathbf{v} \in \mathbf{V}, we have that

[A\mathbf{v}]_F = [A]_{E,F} [\mathbf{v}]_E,

where [\mathbf{v}]_E denotes the n \times 1 matrix whose entries are the coordinates of the vector \mathbf{v} \in \mathbf{V} relative to the basis E of \mathbf{V}, [\mathbf{w}]_F denotes the m \times 1 matrix whose entries are the coordinates of the vector \mathbf{w} \in \mathbf{W} relative to the basis F of \mathbf{W}, and [A]_{E,F} is the m \times n matrix

[A]_{E,F} = \begin{bmatrix} [A\mathbf{e}_1]_F & \dots & [A\mathbf{e}_n]_F \end{bmatrix}

whose jth column is the m \times 1 matrix [A\mathbf{e}_j]_F. What this means at the conceptual level is the following: choosing a basis E in \mathbf{V} results in a vector space isomorphism \mathbf{V} \to \mathbb{R}^n defined by

\mathbf{v} \mapsto [\mathbf{v}]_E,

choosing a basis F in \mathbf{W} results in a vector space isomorphism \mathbf{W} \to \mathbb{R}^m defined by

\mathbf{w} \mapsto [\mathbf{w}]_F,

and these two choices together result in a vector space isomorphism \mathrm{Hom}(\mathbf{V},\mathbf{W}) \to \mathbb{R}^{m \times n} defined by

A \mapsto [A]_{E,F}.
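
As a worked instance of the recipe above, here is a small Python sketch. It assumes, purely for illustration, that \mathbf{V} is the space of polynomials of degree at most 2 with basis E=\{1,x,x^2\}, that \mathbf{W} is the space of polynomials of degree at most 1 with basis F=\{1,x\}, and that the transformation is differentiation. None of these choices come from the lecture, and the names D, matrix_of and mat_vec are hypothetical. Each polynomial is identified with its coordinate list, so D acts directly on coordinates.

```python
# Polynomials are stored by coordinates: a + b*x + c*x^2 is [a, b, c] in the
# basis E = {1, x, x^2}; p + q*x is [p, q] in the basis F = {1, x}.
# The function and variable names below are hypothetical illustrations.

def D(coords_E):
    """Differentiation: a + b*x + c*x^2 goes to b + 2c*x (coordinates in F)."""
    a, b, c = coords_E
    return [b, 2 * c]

def matrix_of(T, n):
    """Build the matrix whose jth column is T applied to the jth basis vector."""
    columns = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(M, x):
    """Matrix-vector product: y_i = sum_j a_ij * x_j."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in M]

D_EF = matrix_of(D, 3)          # the 2 x 3 matrix [D]_{E,F}
v = [5, -1, 4]                  # coordinates of 5 - x + 4x^2 relative to E

print(D_EF)                     # columns are [D(1)]_F, [D(x)]_F, [D(x^2)]_F
print(mat_vec(D_EF, v), D(v))   # both sides of [Dv]_F = [D]_{E,F} [v]_E
```

The matrix has m=2 rows and n=3 columns, matching the dimensions of the target and source spaces respectively.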

Let us consider how the above works in the special case that \mathbf{V}=\mathbf{W} and E=F. We are then dealing with linear operators A \in \mathrm{End}\mathbf{V}, and the matrix representing such an operator is the square n \times n matrix

[A]_E = \begin{bmatrix} [A\mathbf{e}_1]_E & \dots & [A\mathbf{e}_n]_E \end{bmatrix}.

For every \mathbf{v} \in \mathbf{V}, we have the matrix equation

[A\mathbf{v}]_E = [A]_E [\mathbf{v}]_E.

In this case, there is an extra consideration. Suppose we have two linear operators A,B \in \mathrm{End}\mathbf{V}. Then, we also have their product AB \in \mathrm{End}\mathbf{V}, and a natural issue is to determine the relationship between the matrices [A]_E,[B]_E, and [AB]_E. Let us now work this out.

Start with a vector \mathbf{v} \in \mathbf{V}, and let

\mathbf{v}=x_1\mathbf{e}_1+ \dots + x_n\mathbf{e}_n

be its representation relative to the basis E of \mathbf{V}. Let

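A\mathbf{e}_j = \sum_{i=1}^n a_{ij}\mathbf{e}_i \quad \text{and} \quad B\mathbf{e}_j = \sum_{i=1}^n b_{ij}\mathbf{e}_i, \quad 1 \leq j \leq n

be the representations of the vectors A\mathbf{e}_1,\dots,A\mathbf{e}_n and B\mathbf{e}_1,\dots,B\mathbf{e}_n relative to the basis E, so that the numbers a_{ij} and b_{ij} are the entries of the matrices [A]_E and [B]_E.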

We then have

\begin{aligned}
AB(\mathbf{v}) &= \sum_{j=1}^n x_j AB\mathbf{e}_j \\
&= \sum_{j=1}^n x_j A\sum_{k=1}^n b_{kj}\mathbf{e}_k \\
&= \sum_{j=1}^n x_j \sum_{k=1}^n b_{kj}A\mathbf{e}_k \\
&= \sum_{j=1}^n x_j \sum_{k=1}^n b_{kj}\sum_{i=1}^n a_{ik} \mathbf{e}_i \\
&= \sum_{i=1}^n \left( \sum_{j=1}^n \left(\sum_{k=1}^n a_{ik}b_{kj} \right)x_j\right)\mathbf{e}_i.
\end{aligned}

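The inner sum \sum_{k=1}^n a_{ik}b_{kj} is precisely the (i,j) entry of the matrix product [A]_E[B]_E. This shows that the matrix of the product transformation AB relative to the basis E is given by the product of the matrices representing A and B in this basis, i.e.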

[AB]_E = [A]_E [B]_E.

So, in the case of linear operators, the isomorphism \mathrm{End}\mathbf{V} \to \mathbb{R}^{n \times n} given by

A \mapsto [A]_E

is not just a vector space isomorphism, but a vector space isomorphism compatible with multiplication — an algebra isomorphism.
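
The compatibility of A \mapsto [A]_E with multiplication can be checked numerically. The sketch below assumes \mathbf{V}=\mathbb{R}^2 with E the standard basis, so that a vector and its coordinate list coincide. The operators A and B and the helper names matrix_of and mat_mul are hypothetical choices made for illustration.

```python
# Operators on R^2 as functions on coordinate lists relative to the standard
# basis E. All names below are hypothetical illustrations.

def matrix_of(T, n):
    """n x n matrix whose jth column is T applied to the jth basis vector."""
    columns = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def mat_mul(M, N):
    """Matrix product: entry (i, j) is sum_k M[i][k] * N[k][j]."""
    inner = len(N)
    return [[sum(M[i][k] * N[k][j] for k in range(inner))
             for j in range(len(N[0]))]
            for i in range(len(M))]

A = lambda v: [v[0] + 2 * v[1], 3 * v[1]]
B = lambda v: [v[1], v[0] - v[1]]
AB = lambda v: A(B(v))           # product of operators = composition
BA = lambda v: B(A(v))

A_E, B_E = matrix_of(A, 2), matrix_of(B, 2)

# Matrix of the composition versus product of the matrices, in both orders.
print(matrix_of(AB, 2) == mat_mul(A_E, B_E))
print(matrix_of(BA, 2) == mat_mul(B_E, A_E))
```

The order of the factors is preserved: the matrix of AB is [A]_E[B]_E, and the matrix of BA is [B]_E[A]_E.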

