Math 31AH: Lecture 14

In Lecture 13, we discussed matrix representations of linear transformations between finite-dimensional vector spaces. In this lecture, we consider linear transformations between finite-dimensional Euclidean spaces, and discuss the relationship between the scalar product and the matrix representation of linear transformations. Note that any vector space \mathbf{V} can be promoted to a Euclidean space (\mathbf{V},\langle \cdot,\cdot \rangle) by choosing a basis E in \mathbf{V} and defining \langle \cdot,\cdot \rangle to be the unique scalar product on \mathbf{V} such that E is orthonormal.

Let \mathbf{V} and \mathbf{W} be Euclidean spaces; by abuse of notation, we will denote the scalar product in each of these spaces by the same symbol \langle \cdot,\cdot \rangle. Let E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\} be an orthonormal basis in \mathbf{V}, and let F=\{\mathbf{f}_1,\dots,\mathbf{f}_m\} be an orthonormal basis in \mathbf{W}. Let A \in \mathrm{Hom}(\mathbf{V},\mathbf{W}) be a linear transformation.

Definition 1: The matrix elements of A relative to the bases E and F are the scalar products

\langle \mathbf{f}_i, A\mathbf{e}_j \rangle, \quad 1 \leq i \leq m,\ 1 \leq j \leq n.

The reason the number \langle \mathbf{f}_i,A\mathbf{e}_j \rangle is called a “matrix element” of A is that this number is exactly the (i,j)-element of the matrix [A]_{E,F} defined in Lecture 13. Indeed, if

A\mathbf{e}_j = \sum_{k=1}^m a_{kj} \mathbf{f}_k,

then

\langle \mathbf{f}_i,A\mathbf{e}_j \rangle = \left\langle \mathbf{f}_i,\sum_{k=1}^m a_{kj} \mathbf{f}_k\right\rangle = \sum_{k=1}^m a_{kj} \langle \mathbf{f}_i,\mathbf{f}_k \rangle = a_{ij},

where the last equality follows from the orthonormality of F. Note that it is not actually necessary to assume that \mathbf{V} and \mathbf{W} are finite-dimensional in order for the matrix elements of A to be well-defined. However, we will always make this assumption, so that, in more visual form, we have

[A]_{E,F} = \begin{bmatrix} {} & \vdots & {} \\ \dots & \langle \mathbf{f}_i,A\mathbf{e}_j \rangle & \dots \\ {} & \vdots & {} \end{bmatrix}_{1 \leq i \leq m, 1 \leq j \leq n}.
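As a small numerical illustration of Definition 1, the following sketch computes the matrix elements \langle \mathbf{f}_i,A\mathbf{e}_j \rangle directly from scalar products. The specific operator `A_std`, the bases `E` and `F`, and the helper names `dot` and `apply` are assumptions chosen for the example; they do not come from the lecture. Exact fractions are used so that the checks are not affected by rounding.

```python
from fractions import Fraction as Fr

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def apply(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [dot(row, v) for row in M]

# Assumed example data: E is the standard basis of V = R^2,
# F is a rotated orthonormal basis of W = R^2.
E = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
F = [[Fr(3, 5), Fr(4, 5)], [Fr(-4, 5), Fr(3, 5)]]

# A hypothetical operator A, described by its action in standard coordinates.
A_std = [[Fr(1), Fr(2)], [Fr(3), Fr(4)]]

# Matrix elements <f_i, A e_j>: the row index i runs over F, the column index j over E.
A_EF = [[dot(f, apply(A_std, e)) for e in E] for f in F]

# Check against the defining expansion A e_j = sum_k a_{kj} f_k.
for j, e in enumerate(E):
    expansion = [sum(A_EF[k][j] * F[k][c] for k in range(2)) for c in range(2)]
    assert expansion == apply(A_std, e)
```

The final loop confirms that the numbers obtained from scalar products really are the coefficients of A\mathbf{e}_j in the basis F, which is the content of the computation above.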

The connection between matrices and scalar products is often very useful for performing computations which would be much more annoying without the use of scalar products. A good example is change of basis for linear operators. The setup here is that \mathbf{V}=\mathbf{W}, so that m=n and E,F are two (possibly) different orthonormal bases of the same Euclidean space. Given an operator A \in \mathrm{End}\mathbf{V}, we would like to understand the relationship between the two n \times n matrices

[A]_E \quad\text{ and }\quad [A]_F

which represent the operator A relative to the bases E and F, respectively. In order to do this, let us consider the linear operator U \in \mathrm{End}\mathbf{V} uniquely defined by the n equations

U\mathbf{e}_i = \mathbf{f}_i, \quad 1 \leq i \leq n.

Why do these n equations uniquely determine U? Because, for any \mathbf{v} \in \mathbf{V}, we have

U\mathbf{v} = U\sum_{i=1}^n \langle \mathbf{e}_i,\mathbf{v}\rangle \mathbf{e}_i = \sum_{i=1}^n \langle \mathbf{e}_i,\mathbf{v}\rangle U\mathbf{e}_i = \sum_{i=1}^n \langle \mathbf{e}_i,\mathbf{v}\rangle \mathbf{f}_i.

Let us observe that the operator U we have defined is an automorphism of \mathbf{V}, i.e. it has an inverse. Indeed, it is clear that the linear operator U^{-1} uniquely determined by the n equations

U^{-1}\mathbf{f}_i=\mathbf{e}_i, \quad 1 \leq i \leq n

is the inverse of U. Operators which transform orthonormal bases into orthonormal bases have a special name.

Definition 2: An operator U \in \mathrm{End}\mathbf{V} is said to be an orthogonal operator if it preserves orthonormal bases: for any orthonormal basis \{\mathbf{e}_1,\dots,\mathbf{e}_n\} in \mathbf{V}, the set \{U\mathbf{e}_1,\dots,U\mathbf{e}_n\} is again an orthonormal basis in \mathbf{V}.

Note that every orthogonal operator is invertible, since we can always define U^{-1} just as we did above. In particular, the operators U,U^{-1} we defined above by U\mathbf{e}_i=\mathbf{f}_i, U^{-1}\mathbf{f}_i=\mathbf{e}_i are orthogonal operators.
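The construction of U and U^{-1} from the two bases can be sketched in a few lines, using the formula U\mathbf{v} = \sum_i \langle \mathbf{e}_i,\mathbf{v}\rangle \mathbf{f}_i derived above. The helper `basis_map` and the particular bases are assumptions made for illustration only.

```python
from fractions import Fraction as Fr

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def basis_map(src, dst):
    """Operator sending src[i] to dst[i], assuming src is orthonormal:
    v -> sum_i <src[i], v> dst[i]."""
    def op(v):
        return [sum(dot(s, v) * d[c] for s, d in zip(src, dst))
                for c in range(len(v))]
    return op

# Assumed example: two orthonormal bases of R^2.
E = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
F = [[Fr(3, 5), Fr(4, 5)], [Fr(-4, 5), Fr(3, 5)]]

U = basis_map(E, F)      # U e_i = f_i
U_inv = basis_map(F, E)  # U^{-1} f_i = e_i

# U carries E onto F, and the two operators undo each other.
assert [U(e) for e in E] == F
v = [Fr(2), Fr(-7)]
assert U_inv(U(v)) == v and U(U_inv(v)) == v
```

Note that `basis_map` only gives the intended operator when its first argument is orthonormal, since it relies on the expansion \mathbf{v} = \sum_i \langle \mathbf{e}_i,\mathbf{v}\rangle \mathbf{e}_i.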

Proposition 1: An operator U \in \mathrm{End}\mathbf{V} is orthogonal if and only if

\langle U\mathbf{v},U\mathbf{w} \rangle = \langle \mathbf{v},\mathbf{w} \rangle, \quad \forall \mathbf{v},\mathbf{w} \in \mathbf{V}.

Proof: Observe that, by linearity of U and bilinearity of \langle \cdot,\cdot \rangle, it is sufficient to prove the claim in the case that \mathbf{v}=\mathbf{e}_i and \mathbf{w}=\mathbf{e}_j for some 1 \leq i,j \leq n, where \{\mathbf{e}_1,\dots,\mathbf{e}_n\} is an orthonormal basis of \mathbf{V}.

Suppose that U is an orthogonal operator. Let \mathbf{f}_i=U\mathbf{e}_i, 1 \leq i \leq n. Then \{\mathbf{f}_1,\dots,\mathbf{f}_n\} is an orthonormal basis of \mathbf{V}, and consequently we have

\langle U\mathbf{e}_i,U\mathbf{e}_j \rangle = \langle \mathbf{f}_i,\mathbf{f}_j \rangle = \delta_{ij} = \langle \mathbf{e}_i,\mathbf{e}_j \rangle.

Conversely, suppose that

\langle U\mathbf{e}_i,U\mathbf{e}_j \rangle = \langle \mathbf{e}_i,\mathbf{e}_j \rangle.

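Setting \mathbf{f}_i=U\mathbf{e}_i as before, we then have that \langle \mathbf{f}_i,\mathbf{f}_j \rangle = \delta_{ij}, so that \{\mathbf{f}_1,\dots,\mathbf{f}_n\} is an orthonormal set of n vectors, hence an orthonormal basis of the n-dimensional space \mathbf{V}, and thus U is an orthogonal operator.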

— Q.E.D.
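Proposition 1 is easy to test numerically. The sketch below compares an orthogonal operator with a shear, which is invertible but not orthogonal; both matrices and the test vectors are assumptions picked for the example.

```python
from fractions import Fraction as Fr

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def apply(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [dot(row, v) for row in M]

# Assumed example: the columns of U form an orthonormal basis of R^2,
# so U maps the standard basis to an orthonormal basis.
U = [[Fr(3, 5), Fr(-4, 5)],
     [Fr(4, 5), Fr(3, 5)]]

# A shear: invertible, but its columns are not orthonormal.
S = [[Fr(1), Fr(1)],
     [Fr(0), Fr(1)]]

v, w = [Fr(2), Fr(-7)], [Fr(5), Fr(1)]

# U preserves the scalar product; S does not.
assert dot(apply(U, v), apply(U, w)) == dot(v, w)
assert dot(apply(S, v), apply(S, w)) != dot(v, w)
```

A single pair of vectors cannot prove that U is orthogonal, of course; by the reduction in the proof, checking all pairs of basis vectors would suffice.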

Proposition 2: An operator U \in \mathrm{End}\mathbf{V} is orthogonal if and only if it is invertible and

\langle \mathbf{v},U\mathbf{w} \rangle = \langle U^{-1}\mathbf{v},\mathbf{w} \rangle, \quad \forall \mathbf{v},\mathbf{w} \in \mathbf{V}.

Proof: Suppose first that U is orthogonal. Then U is invertible and U^{-1} is also orthogonal, and hence, applying Proposition 1 to U^{-1}, for any \mathbf{v},\mathbf{w} \in \mathbf{V} we have

\langle \mathbf{v},U\mathbf{w} \rangle = \langle U^{-1}\mathbf{v},U^{-1}U\mathbf{w} \rangle = \langle U^{-1}\mathbf{v},\mathbf{w} \rangle.

Conversely, suppose that U is invertible and

\langle \mathbf{v},U\mathbf{w} \rangle = \langle U^{-1}\mathbf{v},\mathbf{w} \rangle, \quad \forall \mathbf{v},\mathbf{w} \in \mathbf{V}.

Then, for any \mathbf{v},\mathbf{w} \in \mathbf{V}, we have

\langle U\mathbf{v},U\mathbf{w} \rangle = \langle U^{-1}U\mathbf{v},\mathbf{w} \rangle = \langle \mathbf{v},\mathbf{w} \rangle,

whence U is orthogonal by Proposition 1.

— Q.E.D.
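The identity in Proposition 2 can also be checked on a concrete example. In the sketch below, the matrices `U` and `U_inv` and the test vectors are assumptions; the code verifies that `U_inv` really is the inverse of `U` before using it. In this example the inverse happens to be the transpose of `U`.

```python
from fractions import Fraction as Fr

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def apply(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [dot(row, v) for row in M]

# Assumed example: an orthogonal operator on R^2 and its inverse.
U = [[Fr(3, 5), Fr(-4, 5)],
     [Fr(4, 5), Fr(3, 5)]]
U_inv = [[Fr(3, 5), Fr(4, 5)],
         [Fr(-4, 5), Fr(3, 5)]]

v, w = [Fr(2), Fr(-7)], [Fr(5), Fr(1)]

# Sanity check: U_inv undoes U on the test vectors.
assert apply(U_inv, apply(U, v)) == v
assert apply(U_inv, apply(U, w)) == w

# Proposition 2: <v, U w> = <U^{-1} v, w>.
assert dot(v, apply(U, w)) == dot(apply(U_inv, v), w)
```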

Now let us return to the problem that we were working on prior to our digression into the generalities of orthogonal operators, namely that of computing the relationship between the matrices [A]_E,[A]_F. We have

\langle \mathbf{f}_i,A\mathbf{f}_j \rangle = \langle U\mathbf{e}_i,AU\mathbf{e}_j \rangle = \langle \mathbf{e}_i,U^{-1}AU\mathbf{e}_j \rangle, \quad \forall 1 \leq i, j \leq n,

where we used Proposition 2, together with the symmetry of the scalar product, to obtain the second equality. Thus, we have the matrix equation

[A]_F = [U^{-1}AU]_E = [U^{-1}]_E [A]_E [U]_E = [U]_E^{-1} [A]_E [U]_E

where on the right hand side we are using the fact that

[\cdot]_E \colon \mathrm{End}\mathbf{V} \to \mathbb{R}^{n\times n}

is an algebra isomorphism, as in Lecture 13, which means that the matrix representing a product of operators is the product of the matrices representing each operator individually. This relationship is usually phrased as the statement that the matrix [A]_F representing the operator A in the “new” basis F is obtained from the matrix [A]_E representing A in the “old” basis E by “conjugating” [A]_E by the matrix [U]_E, where U is the orthogonal operator that transforms the old basis into the new basis.
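The change of basis computation can be verified on a small example. The sketch below computes [A]_F entry by entry from the scalar products \langle \mathbf{f}_i,A\mathbf{f}_j \rangle and compares it with the matrix of U^{-1}AU relative to E, following the identity \langle \mathbf{f}_i,A\mathbf{f}_j \rangle = \langle \mathbf{e}_i,U^{-1}AU\mathbf{e}_j \rangle obtained above. The operator, the bases, and the helper names are assumptions chosen for illustration; E is taken to be the standard basis of R^2, so that [A]_E is just the matrix of A in standard coordinates.

```python
from fractions import Fraction as Fr

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def apply(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [dot(row, v) for row in M]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Assumed example data: new orthonormal basis F and a hypothetical operator A.
F = [[Fr(3, 5), Fr(4, 5)], [Fr(-4, 5), Fr(3, 5)]]
A_E = [[Fr(1), Fr(2)], [Fr(3), Fr(4)]]

# [U]_E has the vectors f_j as its columns, since U e_j = f_j.
U = [[F[j][i] for j in range(2)] for i in range(2)]
# In this example the inverse of [U]_E is its transpose; verified just below.
U_inv = [[U[j][i] for j in range(2)] for i in range(2)]
assert matmul(U_inv, U) == [[1, 0], [0, 1]]

# [A]_F from the matrix elements <f_i, A f_j>.
A_F = [[dot(fi, apply(A_E, fj)) for fj in F] for fi in F]

# Change of basis: [A]_F equals [U]_E^{-1} [A]_E [U]_E.
assert A_F == matmul(matmul(U_inv, A_E), U)
# The order of the factors matters: the reversed product gives a different matrix here.
assert A_F != matmul(matmul(U, A_E), U_inv)
```

As a consistency check, the trace and determinant of `A_F` agree with those of `A_E` (5 and -2 respectively), as expected for conjugate matrices.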
