Math 202C: Lecture 2

Before leaving the world of linear algebra for multilinear algebra (i.e. tensors), let us reformulate the spectral theorem entirely “upstairs” in \mathrm{End} \mathbf{V} (as in Lecture 1, \mathbf{V} is complex vector space of dimension N < \infty equipped with a scalar product).

Theorem 1: If A \in \mathrm{End}\mathbf{V} is selfadjoint, then there is a positive integer n \leq N, real numbers \lambda_1>\dots>\lambda_n, and selfadjoint operators P_1,\dots,P_n in \mathrm{End}\mathbf{V} satisfying

P_i^2 = P_i, \quad 1 \leq i \leq n


P_iP_j = 0 \quad 1 \leq i \neq j \leq n

such that

A = \lambda_1 P_1 + \dots + \lambda_n P_n.

The operators P_1,\dots,P_n above are called orthogonal idempotents. The “idempotent” part is P_i^2=P_i, and this term is applied more generally in ring theory to refer to anything equal to its own square. The “orthogonal” part is P_iP_j = 0, the zero operator in \mathrm{End}\mathbf{V}, when i \neq j. This is a useful representation of A, because orthogonal idempotence immediately implies that

A^k = \lambda_1^k P_1 + \dots + \lambda_n^k P_n.

Given your Math 202A background, you have realized immediately that Theorem 1 is equivalent to the eigenvalue/eigenvector form of the Spectral Theorem discussed in Lecture 1, because the set \mathcal{P}(\mathbf{V}) \subset \mathrm{End}\mathbf{V} of selfadjoint idempotents acting in \mathbf{V} is in canonical bijection with the lattice \mathcal{L}(\mathbf{V}) of subspaces of \mathbf{V}, since these are exactly the operators which act by orthogonal projection onto a subspace: for any W \in \mathcal{L}(\mathbf{V}), the corresponding P \in \mathcal{P}(\mathbf{V}) maps \mathbf{v} \in \mathbf{V} to the nearest point of the subspace \mathbf{W}. The correspondence between selfadjoint idempotents — which are often simply called orthogonal projections — and subspaces even holds in the infinite-dimensional setting provided \mathbf{V} is complete with respect to the norm \|\mathbf{v}\|=\sqrt{\langle \mathbf{v},\mathbf{v}\rangle}, i.e. \mathbf{V} is a Hilbert space, provided we restrict to closed subspaces. Thus the Grassmannian \mathrm{Gr}_k(\mathbf{V}) is for all practical purposes equivalent to the set \mathrm{Pr}_k(\mathbf{V}) of selfadjoint idempotents/orthogonal projections of rank k. Theorem 1 above is obtained from the eigenvalue/eigenvector form of the Spectral Theorem by replacing the eigenspaces of A with the orthogonal projections onto those spaces, and conversely Theorem 1 gives the eigenvalue/eigenvector form of the Spectral Theorem with \lambda_1>\dots>\lambda_n being the distinct eigenvalues of A whose corresponding eigenspaces are the images of the orthogonal idempotents P_1,\dots,P_n.

Let us close out this brief recap of eigenvectors and eigenvalues (i.e. Math 202A material) by recalling the algebraic criterion for diagonalizability. An operator A \in \mathrm{End}\mathbf{V} is said to be normal if it commutes with its adjoint, AA^*=A^*A. Obviously, selfadjoint operators X are normal, XX^*=X^*X=X^2, and so are unitary operators U, since UU^*=U^*U=I.

Theorem 2: Given A \in \mathrm{End} \mathbf{V}, there exists an orthonormal basis \mathbf{e}_1,\dots,\mathbf{e}_N of \mathbf{V} such that A\mathbf{e}_i=\lambda_i\mathbf{e}_i for each 1 \leq i \leq N, where \lambda_1,\dots,\lambda_N \in \mathbb{C} are scalars, if and only if A is normal. Equivalently, A is a \mathbb{C}-linear combination of pairwise orthogonal selfadjoint idempotents if and only if AA^*=A^*A.

After revisiting this essential piece of 202A technology, my plan was to jump into multilinear algebra — tensor products, exterior products, symmetric products, and their applications to eigenvalue inequalities — and work up to the construction of Schur functors, which interpolate between these using the representation theory of the symmetric groups. This would be an application of the material you learned in 202B wherein the goal is to enhance the material you learned in Math 202A. However, I have since changed my mind. After reviewing Professor Sam’s 202B lecture notes (which I am posting to Piazza for your reference), and speaking briefly with Steven, I have realized that you already have some exposure to this material. Even though there is much left to be done in this direction, namely starting from Section 6 in Steven’s notes and working up to a proof of Theorem 6.4.1 via Schur-Weyl duality, I have ultimately decided to leave this for the second half of the course, and begin instead with some very natural and appealing applications of symmetric group representations which we can sink our teeth into immediately. Also, this will give you exposure to material which cannot be looked up in textbooks, or at least cannot be found presented in the way we’ll do it.

Definition 1: For each d \in \mathbb{N}, the symmetric group \mathrm{S}(d) of degree d consists of the set of bijective functions

\sigma \colon [d] \to [d],

i.e. automorphisms of [d]:=\{1,\dots,d\}, with the group law being composition of functions. Elements of \mathrm{S}(d) are called permutations.

As you know from Math 202B (and likely much before), any permutation \sigma can be factored into a product of disjoint cyclic permutations, and this factorization is unique up to the order of the factors. Our starting point will be a rather different unique factorization theorem for permutations. To explain, let us consider \mathrm{S}(d) from the perspective of geometric group theory. Let T \subset \mathrm{S}(d) denote the set of transpositions, i.e T is the conjugacy class of 2-cycles,

(i\ j), \quad 1 \leq i<j \leq d.

As you are also no doubt aware, T generates \mathrm{S}(d), which means that any permutation \sigma \in \mathrm{S}(d) can be written as a product of transpositions,

\sigma = \tau_1 \dots \tau_r, \quad r \in \mathbb{N}_0,\ \tau_i \in T.

This representation is very far from being unique, and indeed the question of counting the number of factorizations of a given permutation into a given number of transpositions is a classical problem in algebraic combinatorics known as the Hurwitz enumeration problem; despite its elementary formulation, this question has deep ties to enumerative algebraic geometry and mathematical physics, some of which we will hopefully be able to discuss. Nevertheless, for each \sigma there is a minimal value of r for which such a factorization exists, and this is called the word norm of \sigma and denoted |\sigma|_T, where the subscript reminds that this quantity depends on the choice of generating set T. We will omit this subscript in order to lighten the notation, and if we switch to a different generating set T' inducing a different word norm on \mathrm{S}(d), we will say so loud and clear. The following is elementary combinatorics, but you should take the time to prove it.

Proposition 1: The length r of any factorization \sigma=\tau_1\dots\tau_r of a permutation \sigma \in \mathrm{S}(d) into transpositions satisfies r=|\sigma|+2k for some k \in \mathbb{N}_0. That is, all factorizations of \sigma into transpositions have the same parity.

Somewhat less elementary but no less important is the fact that there is a tight connection between the word norm of a permutation \sigma and the number of factors c(\sigma) in its unique decomposition into disjoint cycles.

Proposition 2 (join-cut relation): We have |\sigma|=d-c(\sigma).

The key idea in the proof of Proposition 2 is expressed in its name: multiplying a permutation by a transposition will either join two of its cycles into one, or cut one of its cycles into two. Thus,

The word norm makes the symmetric group \mathrm{S}(d) into a metric space space.

Proposition 3: The function

\mathrm{d} \colon \mathrm{S}(d) \times \mathrm{S}(d) \longrightarrow \mathbb{N}_0

defined by \mathrm{d}(\rho,\sigma) = |\rho^{-1}\sigma| is a metric.

We may thus for example speak of the open ball of radius r centered at a given permutation \rho,

B_r(\rho) = \{\sigma \in \mathrm{S}(d) \colon \mathrm{d}(\rho,\sigma) < r\},

and its boundary, the sphere

\partial B_r(\rho) = \{\sigma \in \mathrm{S}(d) \colon \mathrm{d}(\rho,\sigma)= r\}.

The metric \mathrm{d} is nothing but the graph theory distance (or geodesic distance) on \Gamma(d) = \mathrm{Cay}(\mathrm{S}(d),T), the Cayley graph of the symmetric group as generated by the conjugacy class T \subset \mathrm{S}(d) of transpositions. The vertex set of this graph is \mathrm{S}(d) itself, and two permutations \rho,\sigma \in \mathrm{S}(d) are joined by an edge if and only if there is \tau \in T such that \sigma = \rho\tau. Note that this is an undirected graph because \tau^2=\iota is the identity permutation for every \tau \in T, so that \sigma=\rho\tau is equivalent to \rho=\sigma\tau. Note also that \Gamma(d) is a regular graph of degree |T|={d \choose 2}. Another feature of \Gamma(d) is that it is a graded graph: it decomposes into a disjoint union of independent sets or “levels,”

\mathrm{S}(d) = \bigsqcup\limits_{r=0}^{d-1} L_r,

with L_r being the sphere of radius r centered at the identity permutation \iota. By the join-cut relation, L_r may also be described as the set of permutations whose disjoint cycle decomposition contains d-r factors.

The Hurwitz enumeration problem has an equivalent formulation in terms of the Cayley graph: it asks for the number W^r(\pi) of r-step walks from \iota to \pi on \Gamma(d). This description also points to a refinement of the Hurwitz problem. Let us introduce the edge-labeling on \Gamma(d) in which each edge corresponding to the transposition (i\ j) is marked by j, the larger of the two numbers it interchanges. We will call this the Biane-Stanley edge labeling of the symmetric group, after Philippe Biane and Richard Stanley, in whose work it appears implicitly.

\mathrm{S}(4) with Biane-Stanley labeling.

Given an r-step walks \rho \to \sigma on \mathrm{S}(d), we define its signature to be the corresponding sequence j_1,\dots,j_r of edge labels. We say that a walk is strictly monotone if its signature is a strictly increasing sequence. Thus, an r-step strictly monotone walk from \iota to \sigma is the same thing as a strictly monotone factorization of \sigma into r transpositions, i.e. a factorization

\sigma = (i_1\ j_1) \dots (i_r\ j_r)

such that

j_1 < \dots < j_r.

Theorem (unique monotone factorization): Every permutation \pi \in \mathrm{S}(d) admits a unique strictly monotone factorization, and the length of this factorization is equal to |\pi|. That is,

\vec{\vec{W}}^r(\pi) = \begin{cases} 1, \text{ if }r=|\pi| \\ 0, \text{ otherwise}\end{cases}.

Proof: The proof is by induction on d.

In the base case we have \mathrm{S}(2) = \{\iota,\tau\} with \iota = (1)(2) and \tau=(1\ 2). The complete answer to the Hurwitz enumeration problem for d=2 is thus given by

W^r(\iota)=\begin{cases} 1,\text{ if } r \text{ even} \\ 0, \text{ if }r \text{ odd } \end{cases}


W^r(\tau)=\begin{cases} 1, \text{ if } r \text{ odd } \\0, \text{ if }r\text{ even}\end{cases}.

This corresponds to the fact that if \pi \in \mathrm{S}(2) and r \in \mathbb{N}_0 have the same parity, then we have the unique length r factorization

\pi=\underbrace{(1\ 2) \dots (1\ 2)}_{r \text{ factors}},

but no factorization exists if \pi and r have opposite parity. Imposing the strictly monotone constraint whittles this down to

\vec{\vec{W}}^r(\iota)=\begin{cases} 1, \text{ if }r=0 \\ 0, \text{ otherwise} \end{cases}


\vec{\vec{W}}^r(\tau) = \begin{cases} 1, \text{ if } r=1 \\ 0, \text{ otherwise} \end{cases}.

For the inductive step, there are two cases to consider. Let d>2 and let \pi \in \mathrm{S}(d) be any permutation. If d is a fixed point of \pi, i.e. \pi(d)=d, then \pi \in \mathrm{S}(d-1) has a unique strictly monotone factorization by the induction hypothesis. If on the other hand \pi \in \mathrm{S}(d) is one of the

d! - (d-1)! = (d-1)(d-1)!

permutations which contains d in its support (meaning \pi(d)\neq d), then \pi is of the form

\pi = \pi'(i\ d)

for a unique pair \pi' \in \mathrm{S}(d-1) and i \in \{1,\dots,d-1\}. By the induction hypothesis, \pi' admits a unique strictly monotone factorization of length equal to |\pi'|_{d-1}, and the result follows.

—- Q.E.D.

Graphically, the unique monotone factorization theorem gives a presentation of the symmetric group \mathrm{S}(d) as a starlike tree, which we call the Jucys tree after the Lithuanian physicist Algimantas Adolfas Jucys (not to be confuse with his father, the Lithuanian physicist Adolfas Jucys), in whose work it appears implicitly. We will embed this construction in the group algebra \mathbb{C}\mathrm{S}(d) in Lecture 3, and start using it in conjunction with linear representations of the group algebra.

Jucys tree for \mathrm{S}(4).

Leave a Reply