Math 31AH: Lecture 18

Let $\mathbf{V}$ be a vector space, and let us consider the algebra $\mathrm{End}\mathbf{V}$ as a kind of ecosystem consisting of various life forms of varying complexity. We now move on to the portion of the course which is concerned with the taxonomy of linear operators — their classification and division into various particular classes.

The simplest organisms in the ecosystem $\mathrm{End}\mathbf{V}$ are operators which act by scaling every vector $\mathbf{v} \in \mathbf{V}$ by a fixed number $\lambda \in \mathbb{R}$; these are the single-celled organisms of the operator ecosystem.

Definition 1: An operator $A \in \mathrm{End}\mathbf{V}$ is said to be simple if there exists a scalar $\lambda \in \mathbb{R}$ such that

$A\mathbf{v}=\lambda \mathbf{v} \quad \forall\ \mathbf{v} \in \mathbf{V}.$

Simple operators really are very simple, in the sense that they are no more complicated than numbers. Indeed, Definition 1 is equivalent to saying that $A=\lambda I,$ where $I \in \mathrm{End}\mathbf{V}$ is the identity operator, which plays the role of the number $1$ in the algebra $\mathrm{End}\mathbf{V},$ meaning that it is the multiplicative identity in this algebra. Simple operators are extremely easy to manipulate algebraically: if $A=\lambda I,$ then we have

$A^k = \underbrace{(\lambda I)(\lambda I) \dots (\lambda I)}_{k \text{ factors }} =\lambda^kI,$

for any nonnegative integer $k,$ and more generally if $p(x)$ is any polynomial in a single variable then we have

$p(A) = p(\lambda)I.$

Exercise 1: Prove the above formula.
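
As a sanity check (not a proof), the formula $p(A)=p(\lambda)I$ can be tested numerically. The sketch below is only an illustration under assumed choices: the value $\lambda = 3/2,$ the dimension $3,$ the polynomial $p(x)=1-2x+5x^3,$ and all helper names are ours and not part of the lecture. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def scale(c, M):
    """Multiply every entry of the matrix M by the scalar c."""
    return [[c * x for x in row] for row in M]

def add(M, N):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(M, N)]

def matmul(M, N):
    """Matrix product M N."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def poly_of_matrix(coeffs, M):
    """Evaluate c0*I + c1*M + c2*M^2 + ... by Horner's rule."""
    n = len(M)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = add(matmul(result, M), scale(c, identity(n)))
    return result

def poly_of_number(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

lam = Fraction(3, 2)                 # assumed sample scalar
A = scale(lam, identity(3))          # the simple operator A = lam * I
coeffs = [Fraction(1), Fraction(-2), Fraction(0), Fraction(5)]  # p(x) = 1 - 2x + 5x^3

lhs = poly_of_matrix(coeffs, A)                          # p(A)
rhs = scale(poly_of_number(coeffs, lam), identity(3))    # p(lam) * I
print(lhs == rhs)
```

Agreement for one polynomial and one scalar is of course only evidence; the exercise asks for an argument valid for every $p$ and every $\lambda.$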

The formula $A^k=\lambda^kI$ even works in the case that $k$ is a negative integer, provided that $\lambda \neq 0;$ equivalently, the simple operator $A=\lambda I$ is invertible if and only if $\lambda \neq 0,$ its inverse being $A^{-1} = \lambda^{-1}I.$ If $A =\lambda I$ and $B = \mu I$ are simple operators, then they commute,

$AB = (\lambda I)(\mu I)=(\lambda\mu)I = (\mu I)(\lambda I) = BA,$

just like ordinary numbers, and more generally

$p(A,B) = p(\lambda,\mu)I$

for any polynomial $p(x,y)$ in two variables.

Exercise 2: Prove the above formula.

Another way to appreciate how truly simple simple operators are is to look at their matrices. In order to do this, we have to restrict to the case that the vector space $\mathbf{V}$ is finite-dimensional. If $\mathbf{V}$ is $n$-dimensional, and $E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ is any basis of $\mathbf{V},$ then the matrix of $A=\lambda I$ relative to $E$ is simply

$[A]_E = \begin{bmatrix} \lambda & {} & {} \\ {} & \ddots & {} \\ {} & {} & \lambda \end{bmatrix},$

where the off-diagonal matrix elements are all equal to zero. For this reason, simple operators are often called diagonal operators.

Most operators in $\mathrm{End}\mathbf{V}$ are not simple operators; they are complicated multicellular organisms. So, to understand them we have to dissect them and look at their organs one at a time. Mathematically, this means that, given an operator $A \in \mathrm{End}\mathbf{V},$ we look for special vectors in $\mathbf{V}$ on which $A$ acts as if it were simple.

Definition 2: A nonzero vector $\mathbf{e} \in \mathbf{V}$ is said to be an eigenvector of an operator $A \in \mathrm{End} \mathbf{V}$ if

$A\mathbf{e} = \lambda \mathbf{e}$

for some $\lambda \in \mathbb{R}.$ The scalar $\lambda$ is said to be an eigenvalue of $A.$

The best case scenario is that we can find a basis of $\mathbf{V}$ entirely made up of eigenvectors of $A.$

Definition 3: An operator $A \in \mathrm{End} \mathbf{V}$ is said to be semisimple if there exists a basis $E$ of $\mathbf{V}$ consisting of eigenvectors of $A.$ Such a basis is called an eigenbasis for $A.$

As the name suggests, semisimple operators are pretty simple, but not quite as simple as simple operators. In particular, every simple operator is semisimple, because if $A$ is simple then every nonzero vector in $\mathbf{V}$ is an eigenvector of $A,$ and hence any basis in $\mathbf{V}$ is an eigenbasis for $A.$ The converse, however, is not true.

Let $\mathbf{V}$ be an $n$-dimensional vector space, and let $A \in \mathrm{End} \mathbf{V}$ be a semisimple operator. By definition, this means that there exists a basis $E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ in $\mathbf{V}$ consisting of eigenvectors of $A.$ This in turn means that there exist numbers $\lambda_1,\dots,\lambda_n \in \mathbb{R}$ such that

$A\mathbf{e}_i = \lambda_i \mathbf{e}_i \quad \forall\ 1 \leq i \leq n.$

If $\lambda_1=\dots=\lambda_n,$ then $A$ is simple, but if these numbers are not all the same then it is not. However, even if all these numbers are different, the matrix of $A$ relative to $E$ will still be a diagonal matrix, i.e. it will have the form

$[A]_E = \begin{bmatrix} \lambda_1 & {} & {} \\ {} & \ddots & {} \\ {} & {} & \lambda_n \end{bmatrix}.$

For this reason, semisimple operators are often called diagonalizable operators. Note the shift in terminology from “diagonal,” for simple, to “diagonalizable,” for semisimple. The former term suggests an immutable characteristic, independent of basis, whereas the latter indicates that some action must be taken, in that a special basis must be found to reveal diagonal form. More precisely, the matrix of a semisimple operator $A$ is not diagonal with respect to an arbitrary basis; the definition only says that the matrix of $A$ is diagonal relative to some basis.
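
To see the difference between "diagonal" and "diagonalizable" concretely, here is a small sketch. The operator on $\mathbb{R}^2$ with standard matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ and the candidate eigenbasis $(1,1),\ (1,-1)$ are examples chosen for illustration, not taken from the lecture, and the helper names are hypothetical. The code checks that both vectors are eigenvectors and then computes the matrix of the operator relative to this basis, column by column.

```python
from fractions import Fraction

def matvec(M, v):
    """Apply the 2 x 2 matrix M to the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def coordinates(w, e1, e2):
    """Coordinates (a, b) of w in the basis {e1, e2}, by Cramer's rule."""
    det = e1[0] * e2[1] - e2[0] * e1[1]
    a = Fraction(w[0] * e2[1] - e2[0] * w[1], det)
    b = Fraction(e1[0] * w[1] - w[0] * e1[1], det)
    return [a, b]

A = [[2, 1], [1, 2]]          # not diagonal in the standard basis
e1, e2 = [1, 1], [1, -1]      # assumed eigenbasis E

print(matvec(A, e1))          # should be a multiple of e1
print(matvec(A, e2))          # should be a multiple of e2

# Column j of [A]_E holds the E-coordinates of A applied to the j-th basis vector.
columns = [coordinates(matvec(A, e), e1, e2) for e in (e1, e2)]
matrix_in_E = [[columns[j][i] for j in range(2)] for i in range(2)]
print(matrix_in_E)
```

The off-diagonal entries of `matrix_in_E` vanish even though those of `A` do not, and the two diagonal entries differ, so this operator is semisimple without being simple.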

Most linear operators are not semisimple — indeed, there are plenty of operators that have no eigenvectors at all. Consider the operator

$R_\theta \colon \mathbb{R}^2 \to \mathbb{R}^2$

which rotates a vector $\mathbf{v} \in \mathbb{R}^2$ counterclockwise through the angle $\theta \in [0,2\pi).$ The matrix of this operator relative to the standard basis

$\mathbf{e}_1 = (1,0),\ \mathbf{e}_2 = (0,1)$

of $\mathbb{R}^2$ is

$\begin{bmatrix} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}.$

If $\theta = 0,$ then $R_\theta = I$, so that $R_\theta$ is a simple operator: $\mathbf{e}_1,\mathbf{e}_2$ are eigenvectors, with eigenvalues $\lambda_1=\lambda_2=1.$ If $\theta = \pi,$ then $R_\theta=-I$ and again $R_\theta$ is simple, with the same eigenvectors and eigenvalues $\lambda_1=\lambda_2=-1.$ However, taking any other value of $\theta,$ for example $\theta = \frac{\pi}{2},$ rotation through a right angle, it is geometrically clear that $R_\theta \mathbf{v}$ is never a scalar multiple of $\mathbf{v},$ so that $R_\theta$ has no eigenvectors at all. In particular, it is not semisimple.
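
The geometric claim for $\theta=\frac{\pi}{2}$ can also be checked by direct computation. For $\mathbf{v}=(x,y)$ we have $R_{\pi/2}\mathbf{v}=(-y,x),$ and the $2 \times 2$ determinant with columns $\mathbf{v}$ and $R_{\pi/2}\mathbf{v}$ equals $x^2+y^2,$ which is positive for $\mathbf{v} \neq \mathbf{0}.$ Assuming the standard fact that two vectors in $\mathbb{R}^2$ are linearly dependent exactly when this determinant vanishes, $R_{\pi/2}\mathbf{v}$ is never a scalar multiple of a nonzero $\mathbf{v}.$ The sketch below tests this on a small grid of integer vectors; the grid and the function names are arbitrary choices made for illustration.

```python
def rotate_quarter_turn(v):
    """Apply R_{pi/2}, whose standard matrix is [[0, -1], [1, 0]]."""
    x, y = v
    return (-y, x)

def det2(v, w):
    """Determinant of the 2 x 2 matrix with columns v and w."""
    return v[0] * w[1] - w[0] * v[1]

# Nonzero integer vectors from a small grid (an arbitrary sample).
samples = [(x, y) for x in range(-3, 4) for y in range(-3, 4) if (x, y) != (0, 0)]

# det2(v, Rv) = x^2 + y^2 > 0, so v and Rv are linearly independent.
no_eigenvectors = all(det2(v, rotate_quarter_turn(v)) != 0 for v in samples)
print(no_eigenvectors)
```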

Let us now formulate necessary and sufficient conditions for an operator to be semisimple. In this endeavor it is psychologically helpful to reorganize the eigenvector/eigenvalue definition by thinking of eigenvalues as the primary objects, and eigenvectors as secondary objects associated to them.

Definition 4: The spectrum of an operator $A \in \mathrm{End}\mathbf{V}$ is the set $\sigma(A) \subseteq \mathbb{R}$ defined by

$\sigma(A) = \{ \lambda \in \mathbb{R} \colon \lambda \text{ is an eigenvalue of } A\}.$

For each $\lambda \in \sigma(A),$ the set $\mathbf{V}_\lambda \subseteq \mathbf{V}$ defined by

$\mathbf{V}_\lambda = \{\mathbf{v} \in \mathbf{V} \colon A\mathbf{v} = \lambda \mathbf{v}\}$

is called the $\lambda$-eigenspace of $A.$ The dimension of $\mathbf{V}_\lambda$ is called the geometric multiplicity of $\lambda.$

In these terms, saying that $A \in \mathrm{End}\mathbf{V}$ is a simple operator means that the spectrum of $A$ consists of a single number,

$\sigma(A) = \{\lambda\},$

and that the corresponding eigenspace exhausts $\mathbf{V},$

$\mathbf{V}_\lambda = \mathbf{V}.$

At the other extreme, the rotation operator $R_{\pi/2}$ considered above has empty spectrum,

$\sigma(R_{\pi/2}) = \{\},$

and thus does not have any eigenspaces.

Proposition 1: For any $A \in \mathrm{End}\mathbf{V}$, for each $\lambda \in \sigma(A)$ the eigenspace $\mathbf{V}_\lambda$ is a subspace of $\mathbf{V}.$

Proof: First, observe that $\mathbf{0} \in \mathbf{V}_\lambda,$ because

$A\mathbf{0} = \mathbf{0} = \lambda \mathbf{0}.$

Second, $\mathbf{V}_\lambda$ is closed under scalar multiplication: if $\mathbf{v} \in \mathbf{V}_\lambda$ and $t \in \mathbb{R},$ then

$A(t\mathbf{v}) = tA\mathbf{v} = t\lambda\mathbf{v}=\lambda(t\mathbf{v}).$

Third, $\mathbf{V}_\lambda$ is closed under vector addition: if $\mathbf{v},\mathbf{w} \in \mathbf{V}_\lambda,$ then

$A(\mathbf{v}+\mathbf{w}) = A\mathbf{v}+A\mathbf{w}=\lambda\mathbf{v}+\lambda\mathbf{w}=\lambda(\mathbf{v}+\mathbf{w}).$

— Q.E.D.

So, the eigenspaces of an operator $A \in \mathrm{End}\mathbf{V}$ constitute a collection of subspaces $\mathbf{V}_\lambda$ of $\mathbf{V}$ indexed by the numbers $\lambda \in \sigma(A).$ A key feature of these subspaces is that they are independent of one another.

Theorem 1: Suppose that $\lambda_1,\dots,\lambda_k$ are distinct eigenvalues of an operator $A \in \mathrm{End}\mathbf{V}.$ Let $\mathbf{e}_1,\dots,\mathbf{e}_k$ be nonzero vectors such that $\mathbf{e}_i \in \mathbf{V}_{\lambda_i}$ for each $1 \leq i\leq k.$ Then $\{\mathbf{e}_1,\dots,\mathbf{e}_k\}$ is a linearly independent set.

Proof: We prove this by induction on $k.$ The base case is $k=1,$ and in this case the assertion is simply that the set $\{\mathbf{e}_1\}$ consisting of a single eigenvector of $A$ is linearly independent. This is true, since eigenvectors are nonzero by definition.

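For the induction step, let $k \geq 2,$ assume that the assertion holds for any $k-1$ distinct eigenvalues, and suppose for the sake of contradiction that $\{\mathbf{e}_1,\dots,\mathbf{e}_k\}$ is a linearly dependent set. Then, there exist numbers $t_1,\dots,t_k \in \mathbb{R},$ not all equal to zero, such that

$\sum_{i=1}^k t_i\mathbf{e}_i = \mathbf{0}.$

At least one of $t_1,\dots,t_{k-1}$ is nonzero, since otherwise the equation would reduce to $t_k\mathbf{e}_k=\mathbf{0}$ with $t_k \neq 0,$ contradicting $\mathbf{e}_k \neq \mathbf{0}.$ Relabeling the first $k-1$ vectors if necessary, let us suppose that $t_1 \neq 0.$ Applying the operator $A$ to both sides of the above vector equation, we get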

$\sum_{i=1}^k t_i\lambda_i\mathbf{e}_i = \mathbf{0}.$

On the other hand, we can multiply the original vector equation by any scalar and it remains true; in particular, we have

$\sum_{i=1}^k t_i\lambda_k\mathbf{e}_i = \mathbf{0}.$

Now, subtracting this third equation from the second equation, we obtain

$\sum_{i=1}^{k-1} t_i(\lambda_i-\lambda_k)\mathbf{e}_i = \mathbf{0}.$

By the induction hypothesis, $\{\mathbf{e}_1,\dots,\mathbf{e}_{k-1}\}$ is a linearly independent set, and hence all the coefficients in this vector equation are zero. In particular, we have

$t_1(\lambda_1-\lambda_k) = 0.$

But this is impossible, since $t_1 \neq 0$ and $\lambda_1 \neq \lambda_k.$ Hence, the set $\{\mathbf{e}_1,\dots,\mathbf{e}_k\}$ cannot be linearly dependent — it must be linearly independent.

— Q.E.D.
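
Theorem 1 can be illustrated on a concrete operator. In the sketch below, the upper triangular matrix, its eigenvectors, and the helper names are assumptions chosen for illustration; the eigenvectors were found by hand by solving $(A-\lambda I)\mathbf{v}=\mathbf{0}$ for $\lambda=1,2,3.$ Linear independence is tested with the determinant criterion (three vectors in $\mathbb{R}^3$ are linearly independent exactly when the matrix having them as columns has nonzero determinant), which is a standard fact assumed here rather than proved in this lecture.

```python
def matvec(M, v):
    """Apply the square matrix M to the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def det3(c1, c2, c3):
    """Determinant of the 3 x 3 matrix whose columns are c1, c2, c3."""
    return (c1[0] * (c2[1] * c3[2] - c3[1] * c2[2])
            - c2[0] * (c1[1] * c3[2] - c3[1] * c1[2])
            + c3[0] * (c1[1] * c2[2] - c2[1] * c1[2]))

A = [[1, 1, 0],
     [0, 2, 1],
     [0, 0, 3]]

# Candidate eigenpairs (eigenvalue, eigenvector), computed by hand.
pairs = [(1, [1, 0, 0]), (2, [1, 1, 0]), (3, [1, 2, 2])]

# Check A e = lambda e for each pair.
all_eigen = all(matvec(A, e) == [lam * x for x in e] for lam, e in pairs)

# Distinct eigenvalues should force a nonzero determinant.
independent = det3(*(e for _, e in pairs)) != 0
print(all_eigen, independent)
```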

Restricting to the case that $\mathbf{V}$ is finite-dimensional, $\dim \mathbf{V}=n,$ Theorem 1 has the following crucial consequences.

Corollary 1: $A \in \mathrm{End}\mathbf{V}$ is semisimple if and only if

$\sum\limits_{\lambda \in \sigma(A)} \dim \mathbf{V}_\lambda = \dim \mathbf{V}.$

Proof: Suppose first that $A$ is semisimple. By definition, this means that the span of the eigenspaces of $A$ is all of $\mathbf{V},$

$\mathrm{Span} \bigcup\limits_{\lambda \in \sigma(A)} \mathbf{V}_\lambda = \mathbf{V}.$

Thus

$\dim \mathrm{Span} \bigcup\limits_{\lambda \in \sigma(A)} \mathbf{V}_\lambda = \dim \mathbf{V}.$

By Theorem 1, we have

$\dim \mathrm{Span} \bigcup\limits_{\lambda \in \sigma(A)} \mathbf{V}_\lambda =\sum\limits_{\lambda \in \sigma(A)} \dim \mathbf{V}_\lambda,$

and hence

$\sum\limits_{\lambda \in \sigma(A)} \dim \mathbf{V}_\lambda = \dim \mathbf{V}.$

Conversely, suppose that the sum of the dimensions of the eigenspaces of $A$ is equal to the dimension of $\mathbf{V}.$ For each $\lambda \in \sigma(A),$ let $E_\lambda$ be a basis of the eigenspace $\mathbf{V}_\lambda.$ Then, by Theorem 1, the set

$E = \bigcup\limits_{\lambda \in \sigma(A)} E_\lambda$

is a linearly independent set, and hence a basis of the subspace $\mathrm{Span}(E)$ of $\mathbf{V}.$ Thus

$\dim \mathrm{Span}(E) = \sum\limits_{\lambda \in \sigma(A)} \dim \mathbf{V}_\lambda.$

Since by hypothesis we have

$\sum\limits_{\lambda \in \sigma(A)} \dim \mathbf{V}_\lambda = \dim \mathbf{V},$

this implies that

$\dim \mathrm{Span}(E) =\dim \mathbf{V},$

which in turn implies that

$\mathrm{Span}(E) =\mathbf{V}.$

Thus $E$ is a basis of $\mathbf{V}$ consisting of eigenvectors of $A,$ whence $A$ is semisimple.

— Q.E.D.
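
Corollary 1 gives a practical test, which we can sketch for operators on $\mathbb{R}^2.$ The two matrices below are examples of our own choosing. For the shear $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},$ the equation $(x+y,y)=\lambda(x,y)$ with $(x,y) \neq (0,0)$ forces $y=0$ and $\lambda=1,$ so its spectrum is $\{1\}.$ For $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},$ the vectors $(1,1)$ and $(1,-1)$ are eigenvectors with eigenvalues $3$ and $1,$ and by Theorem 1 there can be no third eigenvalue. The spectra are therefore supplied by hand, and the code only computes eigenspace dimensions, using the rank-nullity theorem, which is assumed here as known.

```python
def rank2(M):
    """Rank of a 2 x 2 matrix: 2 if invertible, 0 if zero, otherwise 1."""
    if M[0][0] * M[1][1] - M[0][1] * M[1][0] != 0:
        return 2
    if all(x == 0 for row in M for x in row):
        return 0
    return 1

def eigenspace_dim(A, lam):
    """dim V_lam = 2 - rank(A - lam*I), by the rank-nullity theorem."""
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(2)]
               for i in range(2)]
    return 2 - rank2(shifted)

def is_semisimple(A, spectrum):
    """Corollary 1 in dimension 2: the eigenspace dimensions must sum to 2."""
    return sum(eigenspace_dim(A, lam) for lam in spectrum) == 2

shear = [[1, 1], [0, 1]]      # spectrum {1}, found by hand
sym = [[2, 1], [1, 2]]        # spectrum {1, 3}, found by hand

print(is_semisimple(shear, [1]))
print(is_semisimple(sym, [1, 3]))
```

The shear has a single eigenvalue whose eigenspace is a line, so the dimensions sum to $1 < 2$ and the operator is not semisimple, even though, unlike $R_{\pi/2},$ it does have eigenvectors.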

Corollary 2: If $|\sigma(A)| = \dim \mathbf{V},$ then $A$ is semisimple.

Proof: To say that $|\sigma(A)|=\dim \mathbf{V}$ is equivalent to saying that the spectrum of $A$ consists of $n=\dim \mathbf{V}$ distinct numbers,

$\sigma(A) = \{\lambda_1,\dots,\lambda_n\}.$

Choosing one nonzero vector from each of the corresponding eigenspaces,

$\mathbf{e}_i \in \mathbf{V}_{\lambda_i}, \quad 1 \leq i \leq n,$

we get a set $E= \{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ of eigenvectors of $A.$ By Theorem 1, $E$ is a linearly independent set, and since it consists of $n=\dim \mathbf{V}$ vectors it is a basis of $\mathbf{V}.$ Thus $E$ is an eigenbasis for $A,$ and $A$ is semisimple.

— Q.E.D.