Math 31AH: Lecture 27

Let \mathbf{V} be a two-dimensional Euclidean space with orthonormal basis E=\{\mathbf{e}_1,\mathbf{e}_2\}. Let A \in \mathrm{End}\mathbf{V} be the operator defined by

A\mathbf{e}_1 = \mathbf{e}_1+\mathbf{e}_2,\ A\mathbf{e}_2 = -\mathbf{e}_1+\mathbf{e}_2,

so that

[A]_E = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}.

Geometrically, the operator A acts on vectors in \mathbf{V} by rotating them counterclockwise through an angle of 45^\circ and then scaling them by \sqrt{2}. It is geometrically clear that A has no eigenvalues: rotating any nonzero vector by 45^\circ results in a new vector which is not a scalar multiple of the original vector. Right? Maybe not, now that we know about complex numbers.
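As a quick sanity check on this geometric description (a small numerical sketch in NumPy, not part of the lecture proper), we can verify that [A]_E is exactly \sqrt{2} times the matrix of counterclockwise rotation through 45^\circ:

import numpy as np

# Matrix of A in the orthonormal basis E.
A = np.array([[1.0, -1.0],
              [1.0,  1.0]])

# Counterclockwise rotation through 45 degrees, scaled by sqrt(2).
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(A, np.sqrt(2) * R))  # True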

By definition, a nonzero vector \mathbf{v} \in \mathbf{V} is an eigenvector of A if and only if we have A\mathbf{v}= \lambda \mathbf{v} for some scalar \lambda. This is the same thing as saying that the vector \mathbf{v} belongs to the kernel of the operator A-\lambda I, where I is the identity operator. This is in turn equivalent to saying that the kernel of A-\lambda I contains a nonzero vector, which means that A-\lambda I is not invertible, which in turn means that \det(A-\lambda I)=0. This chain of reasoning is valid in general, i.e. we have the following general proposition.

Proposition 1: The eigenvalues of an operator A \in \mathrm{End}\mathbf{V} are exactly the scalars \lambda such that \det(A-\lambda I)=0.

So, according to Proposition 1, to find the eigenvalues of an operator A, we can try to solve the characteristic equation

\det(A-\lambda I) = 0.

The “unknown” in this equation is \lambda.
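To make the recipe concrete before we carry it out by hand with wedge products below, here is a small symbolic sketch (using SymPy; purely illustrative) of forming \det(A-\lambda I) for the rotation operator above and solving the resulting equation for \lambda:

import sympy as sp

lam = sp.symbols('lambda')  # the unknown in the characteristic equation

A = sp.Matrix([[1, -1],
               [1,  1]])

# The characteristic polynomial det(A - lambda*I).
char_poly = sp.expand((A - lam * sp.eye(2)).det())
print(char_poly)  # lambda**2 - 2*lambda + 2

# Its roots are the eigenvalues of A: the two complex numbers 1 - I and 1 + I.
print(sp.solve(sp.Eq(char_poly, 0), lam))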

Let us write down the characteristic equation of the rotation operator A defined above. We have

(A-\lambda I)^{\wedge 2} \mathbf{e}_1 \wedge \mathbf{e}_2 \\ = (A-\lambda I) \mathbf{e}_1 \wedge (A-\lambda I)\mathbf{e}_2 \\ = (A\mathbf{e}_1-\lambda\mathbf{e}_1) \wedge (A\mathbf{e}_2 - \lambda \mathbf{e}_2) \\  =(\mathbf{e}_1+\mathbf{e}_2-\lambda\mathbf{e}_1)\wedge (-\mathbf{e_1}+\mathbf{e}_2-\lambda\mathbf{e}_2) \\ = ((1-\lambda)\mathbf{e}_1 + \mathbf{e}_2) \wedge (-\mathbf{e}_1+(1-\lambda)\mathbf{e}_2) \\ = (1-\lambda)^2\mathbf{e}_1 \wedge \mathbf{e}_2 -\mathbf{e}_2 \wedge \mathbf{e}_1 \\ = \left( (\lambda-1)^2+1 \right)\mathbf{e}_1 \wedge \mathbf{e}_2,

so that \det(A-\lambda I) = (\lambda-1)^2+1 and the characteristic equation of A is

(\lambda-1)^2+1=0.

There is no number \lambda \in \mathbb{R} which solves the characteristic equation, since for any such number the LHS is the sum of a nonnegative number and a positive number, and hence is positive. However, if we widen our conception of number, this equation does have solutions, corresponding to the fact that

i^2+1=0 \text{ and } (-i)^2+1=0,

where i \in \mathbb{C} is the imaginary unit, as introduced in Lecture 26. That is, while the characteristic equation has no real solutions, it has the two distinct complex solutions

\lambda_1 = 1+i \text{ and } \lambda_2 = 1-i.
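We can corroborate this numerically (an illustrative NumPy check; NumPy computes eigenvalues over \mathbb{C} by default):

import numpy as np

A = np.array([[1, -1],
              [1,  1]], dtype=complex)

# The two complex eigenvalues 1+i and 1-i, in some order.
print(np.linalg.eigvals(A))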

In fact, this is always the case: the main advantage of complex vector spaces is that operators on such spaces always have eigenvalues.

Theorem 1: If A \in \mathrm{End}\mathbf{V} is a linear operator on an n-dimensional complex vector space, then A has n (not necessarily distinct) complex eigenvalues \lambda_1,\dots,\lambda_n.

Proof: The eigenvalues of A are the solutions of the characteristic equation, \det(A-\lambda I)=0. Now, since

(A-\lambda I)\mathbf{e}_1 \wedge \dots \wedge (A-\lambda I)\mathbf{e}_n = \det(A-\lambda I)\, \mathbf{e}_1 \wedge \dots \wedge \mathbf{e}_n,

where \{\mathbf{e}_1,\dots,\mathbf{e}_n\} is any basis of \mathbf{V}, the determinant \det(A-\lambda I) is a polynomial function of \lambda of degree n, whose highest-degree term is (-1)^n \lambda^n. But the fundamental theorem of algebra says that every polynomial of degree n has n (not necessarily distinct) roots in \mathbb{C}.

— Q.E.D.
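We can illustrate Theorem 1 computationally (a NumPy sketch; the random matrix and seed are arbitrary choices made for the example): the characteristic polynomial of an n \times n complex matrix has degree n, and its n complex roots are precisely the eigenvalues.

import numpy as np

rng = np.random.default_rng(0)
n = 4

# A random complex n x n matrix.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Coefficients of the characteristic polynomial of M, a monic polynomial of degree n.
coeffs = np.poly(M)

# By the fundamental theorem of algebra it has n complex roots (with multiplicity),
# and these are exactly the eigenvalues of M.
print(np.sort_complex(np.roots(coeffs)))
print(np.sort_complex(np.linalg.eigvals(M)))  # the same n numbers, up to rounding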

The above is saying that if we consider the rotation operator A of our example as an operator on a complex vector space, then it does have eigenvalues, even though it did not when considered as an operator on a real vector space. Now comes the question of what the eigenvectors corresponding to these eigenvalues are. In order for the solutions of the characteristic equation to actually correspond to eigenvalues of the operator A, there must be nonzero vectors \mathbf{f}_1,\mathbf{f}_2 \in \mathbf{V} such that

A\mathbf{f}_1 = (1+i)\mathbf{f}_1 \text{ and } A\mathbf{f}_2=(1-i)\mathbf{f}_2.

Let us see if we can actually calculate \mathbf{f}_1 and \mathbf{f}_2. We have that

[A-\lambda_1I]_E = \begin{bmatrix} 1-(1+i) & -1 \\ 1 & 1-(1+i) \end{bmatrix} = \begin{bmatrix} -i & -1 \\ 1 & -i \end{bmatrix}.

Thus, \mathbf{f}_1=x\mathbf{e}_1+y\mathbf{e}_2 satisfies A\mathbf{f}_1=\lambda_1\mathbf{f}_1 if and only if x,y \in \mathbb{C} are complex numbers, not both zero, such that

\begin{bmatrix} -i & -1 \\ 1 & -i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},

or equivalently

ix+y = 0\\ x-iy=0.

By inspection, x=i and y=1 are solutions to the above equations, whence

\mathbf{f}_1 = i\mathbf{e}_1 + \mathbf{e}_2

is an eigenvector of A. Similarly, \mathbf{f}_2=x\mathbf{e}_1+y\mathbf{e}_2 satisfies A\mathbf{f}_2=\lambda_2\mathbf{f}_2 if and only if x,y \in \mathbb{C} are complex numbers, not both zero, such that

\begin{bmatrix} i & -1 \\ 1 & i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},

or equivalently

ix-y=0 \\ x+iy=0.

By inspection, x=i and y=-1 are solutions of these equations, whence

\mathbf{f}_2=i\mathbf{e}_1-\mathbf{e}_2

is an eigenvector of A. Now, what is the whole point of calculating eigenvalues and eigenvectors? Well, if F=\{\mathbf{f}_1,\mathbf{f}_2\} is a basis of \mathbf{V}, then we will have found that the matrix of the operator A in this basis is diagonal,

[A]_F = \begin{bmatrix} 1+i &0 \\ 0 & 1-i \end{bmatrix},

which is a more convenient matrix representation of A than that given by [A]_E, since we can use it to easily do computations with A. So, we wish to show that F = \{\mathbf{f}_1,\mathbf{f}_2\} is a linearly independent set in \mathbf{V}. This follows immediately if \mathbf{f}_1,\mathbf{f}_2 are orthogonal, so let’s see if we get lucky:

\langle \mathbf{f}_1,\mathbf{f}_2 \rangle = \langle i\mathbf{e}_1+\mathbf{e}_2,i\mathbf{e}_1 - \mathbf{e}_2 \rangle = i^2\langle \mathbf{e}_1,\mathbf{e}_1 \rangle - \langle \mathbf{e}_2,\mathbf{e}_2\rangle = -1 -1 = -2 \neq 0.

We didn’t get lucky, and worse than that, this scalar product calculation suggests that something unpleasant happens when we start computing scalar products with complex numbers. Indeed, if we modify the above calculation by computing the scalar product of \mathbf{f}_1 with itself, we find that

\langle \mathbf{f}_1,\mathbf{f}_1 \rangle = \langle i\mathbf{e}_1+\mathbf{e}_2,i\mathbf{e}_1 + \mathbf{e}_2 \rangle = i^2\langle \mathbf{e}_1,\mathbf{e}_1 \rangle + \langle \mathbf{e}_2,\mathbf{e}_2\rangle = -1 +1 = 0.

This is disturbing, since it says that the nonzero vector \mathbf{f}_1 is orthogonal to itself, or equivalently that it has zero length, \|\mathbf{f}_1 \|=0. The origin of this problem is that, unlike squares of real numbers, squares of complex numbers can be negative. To accommodate this, we modify the scalar product for complex vector spaces: we insist that a complex scalar product \langle \cdot,\cdot \rangle is an antilinear function of its first argument:

\langle z_1 \mathbf{v}_1 + z_2 \mathbf{v}_2, \mathbf{w} \rangle = \overline{z}_1\langle \mathbf{v}_1, \mathbf{w} \rangle + \overline{z}_2\langle \mathbf{v}_2, \mathbf{w} \rangle,

where if z=x+yi then \bar{z} = x-yi is the complex conjugate of z. Complex vector spaces which come with a complex scalar product are the complex version of Euclidean spaces, and they have a special name.

Definition 1: A Hilbert space is a pair (\mathbf{V},\langle \cdot,\cdot \rangle) consisting of a complex vector space \mathbf{V} together with a complex scalar product.
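To see concretely what the conjugation buys us, here is a short numerical aside (a NumPy sketch) contrasting the naive bilinear formula with the complex scalar product on the vector \mathbf{f}_1 = i\mathbf{e}_1+\mathbf{e}_2 from our example:

import numpy as np

f1 = np.array([1j, 1])  # coordinates of f_1 = i e_1 + e_2 in the basis E

# Naive bilinear "scalar product": no conjugation, so f_1 looks orthogonal to itself.
print(np.dot(f1, f1))   # 0j

# Complex scalar product: antilinear in the first argument (np.vdot conjugates it).
print(np.vdot(f1, f1))  # (2+0j), so f_1 has positive length sqrt(2)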

Continuing with our running example, let us re-compute the inner product of the eigenvectors \mathbf{f}_1,\mathbf{f}_2 of the rotation operator that we found above. Interpreting \langle \cdot,\cdot \rangle as a complex inner product, we now find that

\langle \mathbf{f}_1,\mathbf{f}_2 \rangle = \langle i\mathbf{e}_1+\mathbf{e}_2, i\mathbf{e}_1-\mathbf{e}_2 \rangle = \langle i\mathbf{e}_1,i\mathbf{e}_1 \rangle + \langle \mathbf{e}_2,-\mathbf{e}_2 \rangle = \bar{i}i \langle \mathbf{e}_1,\mathbf{e}_1 \rangle - \langle \mathbf{e}_2,\mathbf{e}_2 \rangle = (-i)i - 1 = 1-1 = 0,

so that \mathbf{f}_1,\mathbf{f}_2 actually are orthogonal with respect to the complex scalar product on \mathbf{V} in which the basis E=\{\mathbf{e}_1,\mathbf{e}_2\} is orthonormal. Thus F=\{\mathbf{f}_1,\mathbf{f}_2\} is a basis of the complex vector space \mathbf{V} consisting of eigenvectors of the operator A, meaning that while A has no eigenvalues or eigenvectors when considered as an operator on a Euclidean space, it is in fact semisimple when considered as an operator on a Hilbert space.
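Putting the example back together, here is a compact numerical check (a NumPy sketch using the vectors computed above) that \mathbf{f}_1,\mathbf{f}_2 really are eigenvectors, that they are orthogonal with respect to the complex scalar product, and that the matrix of A in the basis F is the diagonal matrix displayed above:

import numpy as np

A = np.array([[1, -1],
              [1,  1]], dtype=complex)

f1 = np.array([1j, 1])   # f_1 = i e_1 + e_2
f2 = np.array([1j, -1])  # f_2 = i e_1 - e_2

# Eigenvector equations A f_1 = (1+i) f_1 and A f_2 = (1-i) f_2.
print(np.allclose(A @ f1, (1 + 1j) * f1))  # True
print(np.allclose(A @ f2, (1 - 1j) * f2))  # True

# Orthogonality with respect to the complex scalar product.
print(np.vdot(f1, f2))  # 0j

# Change of basis: the matrix of A in the basis F is diagonal.
P = np.column_stack([f1, f2])    # columns are the coordinates of f_1, f_2 in E
print(np.linalg.inv(P) @ A @ P)  # diag(1+i, 1-i), up to rounding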

Even though they may initially seem more complicated, Hilbert spaces are actually easier to work with than Euclidean spaces — linear algebra runs more smoothly over the complex numbers than over the real numbers. For example, it is possible to give a succinct necessary and sufficient criterion for an operator A on a (finite-dimensional) Hilbert space \mathbf{V} to be semisimple. As in the case of a Euclidean space, define the adjoint of A to be the unique operator A^* \in \mathrm{End}\mathbf{V} such that

\langle A^*\mathbf{v},\mathbf{w} \rangle = \langle \mathbf{v},A\mathbf{w}\rangle \quad \forall\ \mathbf{v},\mathbf{w} \in \mathbf{V}.
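Just as the adjoint corresponds to the transpose in the Euclidean setting, in coordinates with respect to an orthonormal basis of a Hilbert space the adjoint corresponds to the conjugate transpose of the matrix. The following NumPy sketch (with a random matrix and random vectors, chosen only for illustration) checks the defining identity:

import numpy as np

rng = np.random.default_rng(1)
n = 3

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A_star = A.conj().T  # matrix of the adjoint A* in an orthonormal basis

v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# <x, y> = sum_k conj(x_k) y_k is antilinear in its first argument (np.vdot).
print(np.allclose(np.vdot(A_star @ v, w), np.vdot(v, A @ w)))  # True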

Theorem 2: A is a semisimple operator if and only if it commutes with its adjoint, meaning that A^*A=AA^*.

I regret that we will not have time to prove this Theorem.
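We can at least confirm that Theorem 2 is consistent with our running example: the rotation operator A commutes with its adjoint, matching the fact that we found an orthogonal basis of eigenvectors for it. A quick NumPy check:

import numpy as np

A = np.array([[1, -1],
              [1,  1]], dtype=complex)
A_star = A.conj().T

# A commutes with its adjoint (in fact A* A = A A* = 2I here),
# consistent with A being semisimple as an operator on a Hilbert space.
print(np.allclose(A_star @ A, A @ A_star))  # True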
