Math 202C: Lecture 11

*** Problems in this lecture due 05/03/2026 at 23:59 ***

In Lecture 10, we proved that the Jucys-Murphy specialization \Xi_n \colon \mathcal{S}_n \to \mathcal{Z}_n is a surjection of the algebra \mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n} of symmetric polynomials onto the center \mathcal{Z}_n=\mathcal{C}(S_n) of the convolution algebra of the symmetric group S_n. That is, for every central element |A\rangle \in \mathcal{C}(S_n) there exists a symmetric polynomial f \in \mathcal{S}_n such that |A\rangle = f(\Xi_n), where

f(\Xi_n) = f(|J_1\rangle,\dots,|J_n\rangle)

is the evaluation of f on the Jucys-Murphy elements

|J_t\rangle = \sum\limits_{1 \leq s < t} |st\rangle, \quad 1 \leq t \leq n.

Equivalently, combining the fundamental theorem of symmetric polynomials with the fact that

e_r(\Xi_n) = |L_r\rangle, \quad 0 \leq r \leq n-1,

where

|L_r\rangle = \sum\limits_{\substack{\pi \in S_n \colon |\pi|=r}} |\pi\rangle

is the sum of all elements on level r of the Cayley graph G_n=(S_n,T_n) of S_n as generated by the class T_n = C_1(n) of transpositions, we can say that there exists a (not necessarily symmetric) polynomial g \in \mathcal{P}_n=\mathbb{C}[x_1,\dots,x_n] such that

|A\rangle = g(|L_0\rangle,|L_1\rangle,\dots,|L_{n-1}\rangle).

So the two takeaways are:

  • Every central element is a symmetric polynomial in the Jucys-Murphy elements |J_t\rangle;
  • Every central element is a polynomial in the level elements |L_r\rangle.

Symbolically, we have

\mathcal{Z}_n=\mathbb{C}[|J_1\rangle,\dots,|J_n\rangle]^{S_n}=\mathbb{C}[|L_0\rangle,\dots,|L_{n-1}\rangle],

which means that we have two surjective specializations

\mathcal{S}_n \to \mathcal{Z}_n \quad\text{and}\quad \mathcal{P}_n \to \mathcal{Z}_n,

the first obtained by substituting |J_1\rangle,\dots,|J_n\rangle for the variables of symmetric polynomials in n variables, and the second obtained by substituting |L_0\rangle,\dots,|L_{n-1}\rangle for the variables of arbitrary polynomials in n variables.

This begs the question: what happens when we substitute |J_1\rangle,\dots,|J_n\rangle for the variables of arbitrary polynomials in n variables? More precisely, recall that \mathcal{J}_n is the commutative subalgebra of \mathcal{C}(S_n) generated by

\mathcal{Z}_1 \cup \dots \cup \mathcal{Z}_n,

where \mathcal{Z}_k is the center of the subalgebra \mathcal{C}(S_k) of \mathcal{C}(S_n) spanned by permutations which fix the points k+1,\dots,n. Our original proof of the fact that the JM-elements commute was simply to point out that they lie in the commutative algebra \mathcal{J}_n. We may thus regard the JM-specialization as an algebra homomorphism

\Xi_n \colon \mathcal{P}_n \longrightarrow \mathcal{J}_n

defined by

f(x_1,\dots,x_n) \mapsto f(\Xi_n)=f(|J_1\rangle,\dots,|J_n\rangle).

Since we know that \Xi_n surjects \mathcal{S}_n onto \mathcal{Z}_n, the natural guess is that it surjects \mathcal{P}_n onto \mathcal{J}_n.

Theorem 11.1. The JM-specialization is a surjection of \mathcal{P}_\mathbb{C}[x_1,\dots,x_n] onto \mathcal{J}_n = \mathrm{Alg}(\mathcal{Z}_1 \cup \dots \cup \mathcal{Z}_n).

In fact, this is a direct consequence of what we have already established, which gives that

\Xi_k \colon \mathcal{S}_k \to \mathcal{Z}_k

is surjective for each k.

Problem 11.1. Write out a careful proof of Theorem 11.1.

Note to those who are perusing one of the recommended texts, namely this one: the material we have established so far takes us up to and through the proof of Theorem 4.4.5, as well as the result there called Olshanskis’s theorem, although our arguments have been quite different.

Math 202C: Lecture 10

Let G=(S,T) be a finite connected simple graph with vertex set S=\{\pi,\rho,\sigma,\dots\} and edge set T. We pass to the Hilbert space \mathcal{F}(S) with orthonormal basis |\pi\rangle,|\rho\rangle,|\sigma\rangle,\dots and consider the selfadjoint operators

L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle, \quad r \geq 0,

which encode the entire metric structure of G. We have that L_r is the zero operator for all r which exceed the diameter of G, while the nonzero operators \{L_0,L_1,\dots,L_{\mathrm{diam}(G)}\} form an orthogonal set in \mathrm{End}\mathcal{C}(S) with respect to the Hilbert-Schmidt scalar product. However, not much is known about the subalgebra \mathcal{L}(G) of \mathrm{End}\mathcal{F}(S) generated by the selfadjoint operators L_r.

If G=(S,T) is the Cayley graph of a finite group S corresponding to a symmetric set of generators T \subseteq S, we are in a more favorable situation because the operator L_r is the image of an element of the convolution algebra \mathcal{C}(S), namely

|L_r\rangle= \sum\limits_{|\pi|=r} |\pi\rangle,

in the regular representation. Because the regular representation is faithful, we can view \mathcal{L}(G) as the subalgebra of \mathcal{C}(S) generated by the selfadjoint elements |L_r\rangle, and leverage the group structure of S to help us understand the situation. For example, if S is an abelian group we automatically get that \mathcal{L}(G) is a commutative algebra.

Problem 10.1. Let G_n=(S_n,T_n) be the Cayley graph corresponding to the cyclic group S_n=\{g^0,g^1,\dots,g^{n-1}\} of order n with generating set T_n=\{g,g^{-1}\}. This graph is easy to picture: it is a cycle on n vertices. Show that

\mathcal{L}(G_n) = \left\{\sum\limits_{k=0}^{n-1} a_k|g^k\rangle \colon a_k=a_{-k}\right\},

and compute the dimension of this commutative algebra.

Now let us consider the Cayley graph G_n=(S_n,T_n) of the symmetric group S_n=\mathrm{Aut}\{1,\dots,n\} as generated by the conjugacy class of transpositions, T_n=\{(st) \colon 1 \leq s<t \leq n\}. This is a nonabelian group, and yet the operators L_0,L_1,L_2,\dots commute. This is a consequence of the fact that the elements

|L_r\rangle= \sum\limits_{\substack{\mu \vdash n \\ \ell(\mu)=n-r}}|C_\mu\rangle,

where C_\mu \subseteq S_n denotes the conjugacy class of permutations of cycle type \mu, belong to the center \mathcal{Z}_n of \mathcal{C}(S_n). Thus, the sphere sums |L_r\rangle generate a subalgebra \mathcal{L}_n of \mathcal{Z}_n. In fact, we have the following classical result of Farahat-Higman.

Theorem 10.1. \mathcal{L}_n=\mathcal{Z}_n.

In this lecture we will prove Theorem 10.1, or at least explain why it is true. Our starting point is the first main result of Math 202C, namely that

|L_r\rangle=e_r(|J_1\rangle,\dots,|J_n\rangle)

where |J_1\rangle,\dots,|J_n\rangle \in \mathcal{C}(S_n) are the Jucys-Murphy elements and e_r(x_1,\dots,x_n) is the elementary symmetric polynomial of degree r. This together with the second main result in Math 202C so far, the fundamental theorem of symmetric polynomials, gives us a specialization

\Xi_n \colon \mathcal{S}_n \longrightarrow \mathcal{Z}_n

of the algebra \mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n} of symmetric polynomials defined by

\Xi_n(f(x_1,\dots,x_n))=f(|J_1\rangle,\dots,|J_n\rangle).

The image of \mathcal{S}_n under \Xi_n is exactly \mathcal{L}_n, so that Theorem 10.1 is equivalent to the following.

Theorem 10.2. \Xi_n is surjective.

Problem 10.1. Carefully explain the equivalence between Theorems 10.1 and 10.2.

In order to prove Theorem 10.2, we would ideally like to explicitly construct a family f_\mu(x_1,\dots,x_n) of symmetric polynomials indexed by partitions \mu \vdash n such that

f_\mu(|J_1\rangle,\dots,|J_n\rangle)=C_\mu.

In fact, we know how to do this in three cases: the polynomials

f_{(1,1,\dots, 1)}:=e_0,\ f_{(2,1,\dots,1)}:=e_1, \quad f_{(n)}:=e_{n-1}

map to the conjugacy classes

|C_{(1,1,\dots,1)}\rangle,\ |C_{(2,1,\dots,1)}\rangle,\ |C_{(n)}\rangle

under \Xi_n.

Next, we would like to construct symmetric polynomials f_{(3,1,\dots,1)} and f_{(2,2,1,\dots,1)} such that

f_{(3,1,\dots,1)}(|J_1\rangle,\dots,|J_n\rangle)=|C_{(3,1,\dots,1)}\rangle

and

f_{(2,2,1,\dots,1)}(|J_1\rangle,\dots,|J_n\rangle) = |C_{(2,2,1,\dots,1)}\rangle.

However, at present we only know that

e_2(|J_1\rangle,\dots,|J_n\rangle)=|L_2\rangle=|C_{(3,1,\dots,1)}\rangle+|C_{(2,2,1,\dots,1)}\rangle.

Before going any further, let us note that the standard labeling C_\mu of conjugacy classes in S_n is not ideal for our purposes. Instead, it is much better to define the reduced cycle-type of a permutation \pi \in C_\mu to be the partition \nu obtained by subtracting 1 from each part of \mu. Then, the decomposition of the levels L_r of the Cayley graph becomes

|L_r\rangle = \sum\limits_{\nu \vdash r} C_\nu(n),

where C_\nu(n) is the conjugacy class of permutations in S_n of reduced cycle type \nu. In this language, the data we have so far is

e_0(\Xi_n) = C_0(n),\ e_1(\Xi_n) = C_1(\Xi_n),\ e_{n-1}=C_{n-1}(n).

It is definitely not the case that e_r(\Xi_n)=C_r(n) — what is actually true is that

e_r(\Xi_n) = |L_r(n)\rangle, \quad 0 \leq r \leq n,

and the three data points above are precisely those in which a single level actually is a conjugacy class.

Now let us come back to the first unknown case of our problem: we know that

e_2(\Xi_n)=|C_2(n)\rangle + |C_{11}(n)\rangle,

but we do not yet know that each of the summands in this expression can be expressed as a symmetric polynomial in the JM-elements. The space \mathcal{S}_n^2 of homogeneous degree 2 symmetric polynomial in n variables is two-dimensional, with monomial basis

m_2(x_1,\dots,x_n) = p_2(x_1,\dots,x_n) = \sum\limits_{1 \leq I \leq n} x_i^2,\\ m_{11}(x_1,\dots,x_n)=e_2(x_1,\dots,x_n)=\sum\limits_{1 \leq i <j \leq n}x_ix_j.

We already know what m_{11}(\Xi_n)=e_2(\Xi_n) is, so it remains to calculate

m_2(\Xi_n)=p_2(\Xi_n)=\sum\limits_{1 \leq t \leq n} |J_t\rangle^2.

Now,

|J_t\rangle^2 = \sum\limits_{s_1,s_2<t} |s_1t\rangle |s_2t\rangle=\sum\limits_{s<t} |st\rangle|st\rangle+\sum\limits_{s_1\neq s_2<t}|s_1t\rangle|s_2t\rangle.

The first sum is just t-1 copies of |st\rangle|st\rangle=|\iota\rangle, giving a net contribution of

\sum\limits_{1 \leq t \leq n}(t-1)|\iota\rangle={n \choose 2}C_0(n).

The terms of the second sum are |s_1t\rangle|s_2t\rangle with s_1\neq s_2, so the product gives one of the two distinct three-cycles |s_1s_2t\rangle or |s_2s_1t\rangle and the opposite product |s_2t\rangle|s_1t\rangle gives the other one, so that

\sum\limits_{1 \leq t \leq n}\sum\limits_{s_1\neq s_2<t}|s_1t\rangle|s_2t\rangle=|C_2(n)\rangle.

We thus have a system of two linear equations,

m_{11}(\Xi_n) = |C_{11}(n)\rangle+|C_2(n)\rangle \\ m_2(\Xi_n) = |C_2(n)\rangle+{n \choose 2}|C_0(n)\rangle.

The second equation gives us

|C_2(n)\rangle=m_2(\Xi_n)-{n \choose 2}m_0(\Xi_n),

which is a symmetric polynomial in the JM-elements, and substituting this into the first equation and solving we get

|C_{11}(n)\rangle=m_{11}(\Xi_n)-m_2(\Xi_n)+{n \choose 2}m_0(\Xi_n),

so that we have successfully expressed each of the conjugacy classes C_\nu(n) of S_n of reduced cycle type \nu\vdash 2 as a symmetric polynomial in the JM-elements.

For level L_3 in S_n, the class expansion of the monomial symmetric polynomials m_\lambda(\Xi_n) of degree three is

m_3(\Xi_n) = C_3(n) + (2n-3)C_1(n) \\ m_{21}(\Xi_n) = C_{21}(n)+3C_3(n)+\left({n \choose 2}-1\right)C_1(n) \\ m_{111}(\Xi_n) = C_{111}(n)+C_{21}(n)+C_3(n),

which modulo levels L_0,L_1,L_2 becomes

\tilde{m}_3(\Xi_n) =C_3(n) \\ \tilde{m}_{21}(\Xi_n) =C_{21}(n)+3C_3(n) \\ \tilde{m}_{111}(\Xi_n) =C_{111}(n)+C_{21}(n)+C_3(n).

If we order the partitions of 3 as 3 < 21 < 111, this is the triangular system

\begin{bmatrix} \tilde{m}_3(\Xi_n) \\ \tilde{m}_{21}(\Xi_n) \\ \tilde{m}_{111}(\Xi_n) \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} C_3(n) \\ C_{21}(n) \\ C_{111}(n) \end{bmatrix}.

These computations can, to some extent, be performed in full generality. The result which comes out of the attempt to do so is the following fairly explicit description of the images of all monomial symmetric polynomials under the JM-specialization \Xi_n.

Theorem 10.2. For each level 0 \leq r \leq n-1 of the Cayley graph G_n=(S_n,T_n), not only do we have e_r(\Xi_n) = |L_r\rangle, but in fact for every partition \lambda \vdash r there holds a class expansion

m_\lambda(\Xi_n) = \sum\limits_{|nu|\leq |\lambda|}L^\lambda_\nu(n)C_\nu(n),

in which the coefficients L^\lambda_\nu(n) are polynomials in n. In fact, L^\lambda_\nu(n) is independent of n for |\lambda|=|\nu|=r, and modulo lower levels L_0,L_1,\dots,L_{r-1}, we have

\tilde{m}_\lambda(\Xi_n) = C_\lambda(n) + \sum\limits_{\substack{\nu \vdash r \\ \nu < \lambda}}L^\lambda_\nu(n),

where \nu < \lambda means that \nu,\lambda are partitions of r with \nu coarser than \lambda.

Math 202C: Lecture 9

*** Problems due April 26 at 23:59 ***

Week Four review and preview.

We began with the algebraic approach to graphs, which you can think of as a natural road to go down after Math 202A. Given a finite, connected, simple graph G=(S,T) with vertex set S=\{\pi,\rho,\sigma,\dots\}, we pass to the Hilbert space \mathcal{C}(S) with orthonormal basis |\pi\rangle,|\rho\rangle,|\sigma\rangle,\dots. We encode the edge set T as a linear transformation of \mathcal{C}(S), which by abuse of notation we also denote T \in \mathrm{End}\mathcal{C}(S) because it contains exactly the same information. That is, we define the adjacency operator by

T|\rho\rangle = \sum\limits_{\sigma} |\sigma\rangle,

where the sum is over all \sigma \in S such that \{\rho,\sigma\} \in T. The adjacency operator is self-adjoint, and we could equally well define it by

\langle \sigma |T = \sum\limits_{\rho} \langle \rho|,

where the sum is over all \rho \in S such that \{\rho,\sigma\} \in T. Either definition of T is consistent with

\langle \sigma | T |\rho\rangle = \begin{cases} 1, \text{ if } \rho,\sigma \text{ adjacent }\\ 0, \text{ otherwise}\end{cases}.

More generally, for any nonnegative integer r the matrix element \langle \sigma |T^r |\rho\rangle equals the number of r-step walks \rho \to \sigma in the graph G=(S,T). The linear algebraic approach is to find an orthonormal basis B \subset \mathcal{C}(S) such that T belongs to the maximal commutative subalgebra \mathcal{D}(B) \leq \mathrm{End}\mathcal{C}(S) consisting of operators which act diagonally on B.

We also observed that the adjacency operator T is the first member of a sequence T=L_1,L_2,L_3,\dots of selfjadoint linear operators on \mathcal{C}(S) which we called the level operators of the graph G=(S,T). These operators encode not just adjacency in G=(S,T), but its complete structure as a finite connected metric space. By definition,

L_r|\rho\rangle = \sum\limits_{\mathrm{d}(\rho,\sigma)=r} |\sigma\rangle,

where \mathrm{d} is geodesic distance in G=(S,T). That is, L_r acts on an input vertex |\rho\rangle by summing all vertices |\sigma\rangle in the sphere of radius r centered at \rho. Equivalently,

\langle \sigma | L_r |\rho\rangle = \begin{cases} 1, \text{ if } \mathrm{d}(\rho,\sigma)=r\\ 0, \text{ otherwise}\end{cases}.

We noted that although T=L_1,L_2,L_3,\dots is a finite family of selfadjoint operators in \mathrm{End}\mathcal{C}(S), it may not be a commutative family. Indeed, we have

\langle \sigma | L_sL_r |\rho\rangle = |S_r(\rho) \cap S_s(\sigma)|,

where S_r(\rho) and S_s(\sigma) are the spheres of radii r and s centered at the vertices \rho and \sigma, respectively. There is no general reason why this would be equal to

\langle \sigma | L_rL_s |\rho\rangle = |S_r(\sigma) \cap S_s(\rho)|.

Indeed, the subalgebra \mathcal{L}(G) of \mathcal{C}(G) generated by the selfadjoint operators L_r, r \geq 0 is in general not well-understood.

Problem 9.1. If G=(S,T) is a finite connected simple graph of diameter n, show that \{L_0,L_1,\dots,L_n\} is an orthogonal set in \mathrm{End}\mathcal{C}(G) with respect to the Hilbert-Schmidt (aka Frobenius) scalar product. In particular, conclude that \mathcal{L}(G) is a subalgebra of dimension at least n+1.

Although the algebra \mathcal{L}(G) generated by the selfadjoint operators L_0,L_1,L_2,\dots is not well-understood, the complete sequence of level operators can be conveniently wrapped into a single operator

\Omega_q = \sum\limits_{r=0}^\infty q^r L_r

called the exponential distance operator of G=(S,T). The sum is finite because L_r is the zero operator for all r which exceed the diameter of G. Note that \Omega_q is really a one-parameter family of operators in \mathrm{End}\mathcal{C}(G) depending on q \in \mathbb{C}; note also that \Omega_q is selfadjoint for q \in \mathbb{R}. The reason for the name “exponential distance operator” is that

\Omega_q|\rho\rangle = \sum\limits_{\sigma \in S} q^{\mathrm{d}(\rho,\sigma)}|\sigma\rangle,

which says that the matrix of \Omega_q is the entrywise exponential of the distance matrix of G=(S,T),

\langle \sigma | \Omega_q | \rho \rangle = q^{\mathrm{d}(\rho,\sigma)}.

In recent years, graph theorists have grappled with the spectral theory of \Omega_q without recognizing its true significance: the exponential distance operator encodes the same information as the distance operator in a way which interfaces more meaningfully with statistical mechanics on graphs. We spent some time exploring this in Week One.

Now we move on to a concrete case of the above in which G=(S,T) is a specific family of graphs. Namely, we consider the graphs G_n=(S_n,T_n) in which S_n=\mathrm{Aut}\{1,\dots,n\} is the symmetric group and T_n \subseteq S_n is the conjugacy class of transpositions. This looks strange because the edge set of a graph is not a subset of its vertex set; however for Cayley graphs of groups it can be regarded as such. More precisely, two vertices \rho,\sigma \in S_n are adjacent in G_n if and only if there exists \tau \in T_n such that \sigma = \rho\tau. The fact that we have group law means that we can consider adjacency as induced by multiplication. This same feature propagates up to the level of linear operators, where we can also identify T_n \subseteq S_n with the adjacency operator T_n \in \mathrm{End}\mathcal{C}(S_n) using convolution in \mathcal{C}(S_n), which is itself induced by multiplication in S_n. Namely, for any arbitrary subset A \subseteq S_n we have a corresponding vector in |A\rangle \in \mathcal{C}(S_n),

|A\rangle = \sum\limits_{\pi \in A} |\pi\rangle,

which in turn gives becomes an operator A \in \mathrm{End}\mathcal{C}(S_n),

A|\rho\rangle = |\rho\rangle|A\rangle = \sum\limits_{\pi \in A} |\rho\pi\rangle.

Applying this recipe to the conjugacy class T_n \subseteq S_n implements the adjacency operator on G_n=(S_n,T_n) as right multiplication by |T_n\rangle. More generally, T_n=L_1 is the sphere of radius one centered at the identity permutation \iota \in S_n. If we apply this recipe to the sphere L_r \subseteq S_n of radius r centered at \iota, we first get the vector

|L_r\rangle = \sum\limits_{|\pi|=r} |\pi\rangle,

which in turn becomes the operator

L_r|\rho\rangle = \sum\limits_{|\pi|=r} |\rho\pi\rangle.

These facts hold for all Cayley graphs. What is special about the case G_n=(S_n,T_n) is that all of the spheres L_0,L_1,L_2,\dots \subseteq S_n are invariant under conjugation, and in fact are disjoint unions of conjugacy classes in an especially transparent way,

L_r = \bigsqcup\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}}C_\alpha,

where C_\alpha \subseteq S_n is the conjugacy class of permutations of cycle type \alpha. This immediately implies that

|L_r\rangle = \sum\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}}|C_\alpha\rangle

is a central element in \mathcal{C}(S_n), and hence that L_1,L_2,L_3,\dots \in \mathrm{End}\mathcal{C}(S_n) are commuting selfadjoint operators. In particular, there exists an orthonormal basis B \subset \mathcal{C}(S_n) such that L_1,L_2,L_3,\dots belong to the corresponding diagonal subalgebra \mathcal{D}(B) \leq \mathrm{End}\mathcal{C}(S_n). Actually, we can say something stronger. Starting with a subset A \subseteq S_n, when we make the chain of identifications

A \longrightarrow |A\rangle \longrightarrow A,

the first arrow is quantization and the second is the regular representation. Saying that A \subseteq S_n is conjugation invariant is exactly the same thing as saying that |A\rangle \in \mathcal{C}(S_n), which in turn means (by Schur’s Lemma) that A \in \mathrm{End}\mathcal{C}(S_n) acts as multiplication by a scalar operator with eigenvalue \widehat{A}(\lambda) in every irreducible subrepresentation V^\lambda of \mathcal{C}(S_n). So, not only are the operators L_0,L_1,L_2,\dots,L_{n-1} simultaneously diagonalizable, they are actually simultaneously diagonal in any orthonormal basis B^\lambda of any irreducible representation V^\lambda. But this is all abstractly true — what we do not have is a concrete formula for the Fourier transform \widehat{L}^r(\lambda) of L_r acting in a given irreducible representation V^\lambda. In particular, we do not have any explicit numerical understanding of the spectrum of the adjacency operator L_1=T_n of the graph G_n=(S_n,T_n), and this is the information that we wish to obtain.

The first step down the path to explicitly diagonalizing the level operators of G_n = (S_n,T_n) is recognize path is to recognize that the central elements |L_0\rangle,|L_1\rangle,\dots,|L_{n-1}\rangle\rangle are nonlinear functions of a fundamental family of non central elements which are even more granular than conjugacy classes. Namely, we decompose

T_n = J_1 \sqcup J_2 \sqcup \dots \sqcup J_n,

where J_t is the set of transpositions of the form (st) with s<t. These are the Jucys-Murphy subsets of S_n, and the corresponding algebra elements

|J_t\rangle = \sum\limits_{1 \leq s<t} |st\rangle, \quad 1 \leq t \leq n

give a decomposition

|T_n\rangle = |J_1\rangle + |J_2\rangle + \dots + |J_n\rangle

into pairwise orthogonal elements whose norms are

\|J_1\|=0, \|J_2\|=1, \|J_3\|=\sqrt{2},\dots,\|J_n\|=\sqrt{n-1}.

Even though the Jucys-Murphy elements do not lie in the center \mathcal{Z}_n=\mathcal{C}(S_n) of the convolution algebra they commute with one another. Hence, the Jucys-Murphy operators J_1,J_2,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n) are commuting selfadjoint operators, and hence are simultaneously diagonalizable. The simple but important way to see this commutativity is to recognize that we have a chain of groups

S_0 \subset S_1 \subset S_2 \subset \dots \subset S_n

in which S_k=\mathrm{Aut}\{1,\dots,k\} is identified with the subgroup of S_n consisting of permutations which fix k+1,\dots,n. This gives a chain of subalgebras

\mathcal{C}(S_0) \subset \mathcal{C}(S_1) \subset \mathcal{C}(S_2) \subset \dots \subset\mathcal{C}(S_n),

in which \mathcal{C}(S_k) is the span of all permutations which fix k+1,\dots,n. Each of these subalgebras comes with its own center \mathcal{Z}_k=Z\mathcal{C}(S_k), and while the commutative subalgebras

\mathcal{Z}_1,\dots,\mathcal{Z}_n

are not nested (in fact \mathcal{Z}_k \cap \mathcal{Z}_l =\mathbb{C}|\iota\rangle for k \neq l) they commute with one another, and the the JM-elements |J_1\rangle,\dots,|J_n\rangle belong to the commutative subalgebra \mathcal{J}_n \subseteq \mathcal{C}(S_n) generated by their union,

\mathcal{Z}_1 \cup \dots \cup \mathcal{Z}_n.

We proved that |J_1\rangle,|J_2\rangle,\dots,|J_n\rangle pairwise commute by showing that they are contained in the algebra \mathcal{J}_n. Indeed, we have that

|J_k \rangle = |T_k\rangle - |T_{k-1}\rangle

is contained in the commutative algebra generated by \mathcal{Z}_{k-1} \cup \mathcal{Z}_k in \mathcal{C}(S_n). The key result which marks the transition from linear algebra to polynomial algebra in Math 202C is the following.

Theorem 9.1. We have

|L_r\rangle = e_r(|J_1\rangle,\dots,|J_n\rangle),

where e_r(x_1,\dots,x_n) is the elementary symmetric polynomial of degree r.

A good way to test your understanding of this key result is to solve the following problem.

Problem 9.2. Prove that the exponential distance operator \Omega_q of the Cayley graph G_n=(S_n,T_n) factors as

\Omega_q = (I+qJ_1)(I+qJ_2) \dots (I+qJ_n).

Deduce from this that \Omega_q is invertible for all complex q satisfying |q| < \frac{1}{n-1}.

Theorem 9.1 is what led us into a discussion of polynomials in general, and symmetric polynomials in particular. The algebra \mathcal{P}_n=\mathbb{C}[x_1,\dots,x_n] of polynomials in n commuting selfadjoint variables has basis

x^\alpha = x_1^{\alpha_1} \dots x_n^{\alpha_n}, \quad \alpha \in \mathrm{Com}_n

indexed by the set \mathrm{Com}_n of nonnegative integer vectors \alpha=(\alpha_1,\dots,\alpha_n). We equip \mathcal{P}_n with the scalar product in which \{x^\alpha \colon \alpha \in \mathrm{Com}_n\} is orthonormal. We then have the orthogonal decomposition

\mathcal{P}_n =\bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d,

where

\mathcal{P}_n^d = \mathrm{Span}\{x^\alpha \colon \alpha \in \mathrm{Com}_n^d\}

with \mathrm{Com}_n^d = \{\alpha \in \mathrm{Com}_n \colon \alpha_1+\dots+\alpha_n=d\}.

We consider the (infinite-dimensional) unitary representation (\mathcal{P}_n,\varphi_n) of the symmetric group S_n=\mathrm{Aut}\{1,\dots,n\} in which \mathcal{P}_n is the polynomial algebra as above and

\varphi_n(\pi) f(x_1,\dots,x_n)=f(x_{\pi^{-1}(1)},\dots,x_{\pi^{-1}(n)}).

The space of invariants \mathcal{S}_n=\mathcal{P}_n^{S_n}=\mathbb{C}[x_1,\dots,x_n]^{S_n} is (by definition) the algebra of symmetric polynomials in n commuting selfadjoint variables. It has the orthogonal decomposition

\mathcal{S}_n = \bigoplus\limits_{d=0}^\infty \mathcal{S}_n^d

with \mathcal{S}_n^d = \mathcal{S}_n \cap \mathcal{P}_n^d. An orthogonal basis for \mathcal{S}_n is given by the monomial symmetric polynomials

m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha

where \lambda ranges over the set \mathrm{Par}_n \subset \mathrm{Com}_n of nonnegative integer vectors with weakly decreasing coordinates, and \mathcal{O}_\lambda \subset \mathrm{Com}_n is the set of all \alpha’s whose weakly decreasing rearrangement is \lambda.

Problem 9.3. Give a bijection \Phi between \mathrm{Par}_n and the set of all n-element subsets of \mathbb{N}_0=\{0,1,2,3,\dots\}. Express the sum of the elements of \Phi(\lambda) in terms of the sum |\lambda| of the coordinates of \lambda.

The elementary symmetric polynomials e_0,e_1,\dots,e_n \in \mathcal{S}_n are defined by

e_r(x_1,\dots,x_n)=m_{(1^r,0^{n-r})}(x_1,\dots,x_n), \quad 0 \leq r \leq n,

where (1^r,0^{n-r})=(1,\dots,1,0,\dots,0) is the vector whose first r coordinates equal 1 and remaining coordinates equal 0. For r>n, we declare e_r to be the zero polynomial. Let \mathrm{Par} be the set of nonnegative integer vectors of infinite length whose coordinates are weakly decreasing and eventually equal zero, and define

e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^\infty e_{\lambda_i}(x_1,\dots,x_n).

The condition \lambda^\prime \in \mathrm{Par}_n is necessary and sufficient in order that e_\lambda is not the zero polynomial, and we proved that

\{e_\lambda(x_1,\dots,x_n) \colon \lambda^\prime \in \mathrm{Par}_n\}

is a basis of \mathcal{S}_n. This result is called the fundamental theorem of symmetric polynomials.

We are now going to consider a further algebraic consequence of this result. Let \mathcal{C}(S_n) be the convolution algebra of the symmetric group S_n=\mathrm{Aut}\{1,\dots,n\}. For each 1 \leq k \leq n, the symmetric group S_k=\mathrm{Aut}\{1,\dots,k\} is isomorphic to the subgroup of S_n consisting of permutations which fix the numbers k+1,\dots,n. Thus we can view \mathcal{C}(S_k) as the subalgebra of \mathcal{C}(S_n) with basis |\pi\rangle, \pi \in S_k. Let \mathcal{Z}_k denote the center of \mathcal{C}(S_k) sitting inside \mathcal{C}(S_n). The Jucys-Murphy algebra \mathcal{J}_n is by definition the commutative subalgebra of \mathcal{C}(S_n) generated by the set

\mathcal{Z}_1 \cup \mathcal{Z}_2 \cup \dots \cup \mathcal{Z}_n.

Within this subalgebra we have the Jucys-Murphy elements

|J_k\rangle = |T_k\rangle - |T_{k-1}\rangle,\quad 1 \leq k \leq n,

where T_k is the conjugacy class of transpositions in S_k. One of our first results is that level sets

|L_r\rangle = \sum\limits_{\substack{\pi \in S_n \\ |\pi|=r}}|\pi\rangle

are elementary symmetric polynomials evaluated on JM-elements,

|L_r\rangle = e_r(|J_1\rangle,\dots,|J_n\rangle), \quad 0 \leq r \leq n-1.

Since |L_r\rangle \in \mathcal{Z}_n, combining this with the fundamental theorem of symmetric polynomials gives the following.

Theorem 9.2. For any f \in \mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n}, we have f(|J_1\rangle,\dots,|J_n\rangle) \in \mathcal{Z}_n.

In words, this says that when you evaluate any arbitrary symmetric polynomial on the Jucys-Murphy elements and expand this out as a linear combination of permutations, every pair of conjugate permutations appears with the same coefficient.

As an example, let us consider the complete homogeneous symmetric polynomials, which are defined as

h_r(x_1,\dots,x_n) = \sum\limits_{\lambda \vdash r} m_\lambda(x_1,\dots,_n),

where the sum is over all partitions of r.

Problem 9.4 Prove that

h_r(x_1,\dots,x_n)=\sum\limits_{1 \leq t_1 \leq \dots \leq t_r \leq n} x_{t_1}\dots x_{t_r},

and deduce from this that

h_r(|J_1\rangle,\dots,|J_n\rangle)=\sum\limits_{\pi \in S_n} \vec{W}^r(\pi)|\pi\rangle,

where \vec{W}^r(\pi) is the number of r-step monotone walks \iota \to \pi on the Cayley graph of S_n.

Thus, the function \vec{W}^r(\pi) counts factorizations

\pi=(s_1\ t_1) \dots (s_r\ t_r), \quad s_i<t_i,

of a given permutation into transpositions subject to the condition

t_1 \leq \dots \leq t_r.

Problem 9.2 shows that \vec{W}^r(\pi) depends only on the cycle type of \pi. Although this fact has been known for over fifteen years, no combinatorial explanation has been found.

I will say a bit more about this enumerative problem since I know that some of you are Stirling number aficionados. For any permutation \pi, the minimum length of a factorization of \pi into transpositions is |\pi|, and this upper bound of course applies to weakly monotone factorizations as well. Moreover, since we know that there exists a unique strictly monotone geodesic \iota \to \pi, there is at least one weakly monotone geodesic.

Theorem 9.2. The number of weakly monotone factorizations of a full cycle \gamma \in S_n into r=n-1+2g transpositions is

\vec{W}^{n-1+2g}(\pi) = \mathrm{Cat}_{n-1} \times T(n-1,n-1+2g),

where T(m,n) are the central factorial numbers of the second kind.

The central factorial numbers are certain generalizations of Stirling numbers which appear in quite a variety of situations.

The upshot of the above is that we actually know a few things about the algebra \mathcal{L}_n=\mathcal{L}(G_n) for the all-transpositions Cayley graph G_n=(S_n,T_n) of the symmetric group. Not only do we know that the distance operators L_1,L_1,\dots,L_n, we have a polynomial representation of them in terms of the JM-operators J_1,\dots,J_n. In particular, we know that the commutative algebra \mathcal{L}_n is a subalgebra of \mathcal{J}_n. Over the course of the next two lectures, we will prove that in fact \mathcal{L}_n=\mathcal{J}_n. We will then pivot and give a completely different spectral description of this algebra in terms of a distinguished orthonormal basis of \mathcal{C}(S_n), known as its Gelfand-Tsetlin basis.

Math 202C: Lecture 8

As in Lecture 7, let \mathrm{Par}_n denote the set of nonnegative integer vectors \mu=(\mu,\dots,\mu_n) with weakly decreasing coordinates. Then, each \mu \in \mathrm{Par}_n can be viewed as a partition of the number

|\mu|=\mu_1+\dots+\mu_n

having at most n parts. Set

\mathrm{Par}_n^d = \{\mu \in \mathrm{Par}_n^d \colon |\mu|=d\}.

We know that the monomial symmetric polynomials

\{m_\mu(x_1,\dots,x_n) \colon \mu \in \mathrm{Par}_n^d\}

form a basis of the degree d component \mathcal{S}_n^d of the algebra \mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n} of symmetric polynomial in n commuting selfadjoint variables x_1,\dots,x_n.

The fundamental theorem of symmetric polynomials gives a second basis of \mathcal{S}_n^d constructed from the elementary symmetric polynomials

e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subseteq [n] \\ |I|=r}} \prod\limits_{i \in I} x_i.

Let \mathrm{Par}^d denote the set of all partitions of d, and observe that we may think of this as the set of all nonnegative integer vectors \lambda=(\lambda_1,\dots,\lambda_d) with weakly decreasing coordinates whose total sum |\lambda| equals d. We extend the definition of the elementary symmetric polynomial multiplicatively by defining

e_\lambda(x_1,\dots,x_n)=\prod\limits_{i=1}^d e_{\lambda_i}(x_1,\dots,x_n),

where by convention e_0(x_1,\dots,x_n)=1. Then, in order for e_\lambda(x_1,\dots,x_n) not to be the zero polynomial it is necessary and sufficient that \lambda_1 \leq n, which is equivalent to the condition \lambda^\prime \in \mathrm{Par}_n. The fundamental theorem of symmetric polynomials says that

\{e_\lambda(x_1,\dots,x_n) \colon \lambda^\prime \in \mathrm{Par}_n^d\},

is a basis of \mathcal{S}_n^d. Equivalently, the claim is that

\{e_{\lambda^\prime}(x_1,\dots,x_n) \colon \lambda\in \mathrm{Par}_n^d\}

is a basis of \mathcal{S}_n^d.

In Lecture 7, we assembled the main ingredients needed to establish this. First, we showed that for any \lambda \in \mathrm{Par}^d we have

e_\lambda(x_1,\dots,x_n) = \sum\limits_{\mu \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n),

where M_{\lambda\mu} is the number of d \times n matrices with entries in \{0,1\} whose row sums are encoded by \lambda=(\lambda_1,\dots,\lambda_d) and whose column sums are encoded by \mu=(\lambda_1,\dots,\lambda_n). Note that this is consistent with what we have seen: if \lambda_1>n, then there cannot be a \{0,1\}-matrix with n columns whose first row sums to \lambda_1, and all the coefficients M_{\mu\lambda} are zero. Second (this is a homework problem), we have M_{\lambda\mu}=0 unless \mu \leq \lambda^\prime in dominance order on \mathrm{Par}_n^d, and moreover M_{\lambda\lambda'}=1.

Now let us complete the proof of the fundamental theorem. From the above, for any \lambda such that \lambda^\prime \in \mathrm{Par}_n^d we have

e_\lambda = \sum\limits_{\mu \leq \lambda'}M_{\lambda\mu}m_\mu,

where we are suppressing the variables x_1,\dots,x_n to lighten the notation. Now let \nu \in \mathrm{Par}_n^d be such that \lambda^\prime = \nu. Since transpose of Young diagrams is an involution, this is equivalent to \lambda=\nu', and the above becomes

e_{\nu^\prime}=\sum\limits_{\mu \leq \nu}M_{\nu^\prime \mu}m_\mu.

Now in the above I am simply going to replace the letter \nu with \lambda because this helps me keep things straight in my mind (this has no mathematical content, it’s purely psychological). Making this letter replacement the above is

e_{\lambda^\prime}=\sum\limits_{\mu \leq \lambda}M_{\lambda^\prime \mu}m_\mu.

Now, the splitting of the term of the sum corresponding to \mu=\lambda we have

e_{\lambda^\prime}=m_\lambda+\sum\limits_{\mu < \lambda}M_{\lambda^\prime \mu}m_\mu

where the sum on the right hand side is now over partitions \mu which are strictly less than \lambda in dominance order on \mathrm{Par}_n^d.

First we show that \{e_{\lambda^\prime} \colon \lambda \in \mathrm{Par}_n^d\} spans \mathcal{S}_n^d. Suppose otherwise. Then, there must be some monomial symmetric polynomial m_\mu which cannot be represented as a linear combination of the elementary polynomials e_{\lambda^\prime}. Let us choose such an m_\mu with \mu minimal in the dominance order on \mathrm{Par}_n^d. Then, by our formula above,

m_\mu = e_{\mu^\prime} - \sum\limits_{\nu < \mu} M_{\mu^\prime \nu}m_\nu,

and the summation on the right does lie in the span of \{e_{\lambda^\prime} \colon \lambda \in \mathrm{Par}_n^d\}, which gives a contradiction.

Now we show that \{e_{\lambda^\prime} \colon \lambda \in \mathrm{Par}_n^d\} is linearly independent. Indeed, suppose

\sum\limits_{\lambda \in \mathrm{Par}_n^d} c_\lambda e_{\lambda^\prime}=0

where not all coefficients c_\lambda equal zero. Choose a nonzero coefficient c_\lambda with \lambda maximal in dominance order. Then, in the monomial expansion of the left hand side, the coefficient of m_\lambda is exactly c_\lambda, which forces c_\lambda=0, contradiction.

This completes the proof that \{e_\lambda^\prime \colon \lambda \in \mathrm{Par}_n^d\} is a basis of the finite-dimensional Hilbert space \mathcal{S}_n^d. Since the argument was formulated for arbitrary degree d, we conclude that

\{e_{\lambda'} \colon \lambda \in \mathrm{Par}_n\}

is a basis of the infinite-dimensional algebra \mathcal{S}_n. Finally, because e_\lambda is defined multiplicatively, this is equivalent to

\mathcal{S}_n = \mathbb{C}[e_1,\dots,e_n],

showing that the algebra \mathcal{S}_n of symmetric polynomials in variables x_1,\dots,x_n is isomorphic to the algebra \mathbb{C}[e_1,\dots,e_n] of all polynomials in variables

e_1=x_1+x_2+\dots+x_n,\ \dots,\ e_n=x_1x_2 \dots x_n.

Math 202C: Lecture 7

*** Problems in this lecture due April 19 at 23:59 ***

Recall our objective: we know that

\{m_\lambda(x_1,\dots,x_n) \colon \lambda \in \mathrm{Par}_n^d\}

is a basis for \mathcal{S}_n^d = \mathbb{C}[x_1,\dots,x_n]^{S_n}, and we are trying to prove that

\{e_\lambda(x_1,\dots,x_n) \colon \lambda^\prime \in \mathrm{Par}_n^d\}

is a second basis, where \lambda^\prime is the partition obtained by transposing the Young diagram of \lambda. Thus, the condition \lambda^\prime \in \mathrm{Par}_n^d means that the largest part of \lambda satisfies \lambda_1 \leq n, hence all parts of \lambda are bounded by n. Since

e_{\lambda}(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)} e_{\lambda_i}(x_1,\dots,x_n)

is defined multiplicatively, this is equivalent to the statement that \mathbb{C}[x_1,\dots,x_n]^{S_n}=\mathbb{C}[e_1,\dots,e_n]. Note that since \lambda is a partition of d, we can write it as \lambda=(\lambda_1,\dots,\lambda_d) where we allow trailing zero coordinates, and declare e_0(x_1,\dots,x_n)=1.

Theorem 7.1. For every \lambda \vdash d with \lambda_1 \leq n, we have

e_{\lambda^\prime}(x_1,\dots,x_n) = \sum\limits_{\mu \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n)

where M_{\lambda\mu} is the number of d \times n matrices with entries in \{0,1\} with row sum vector \lambda=(\lambda_1,\dots,\lambda_d) and column sum vector \mu=(\mu_1,\dots,\mu_n).

Proof: Our objective is to expand the product

e_{\lambda}(x_1,\dots,x_n)=\prod\limits_{i=1}^d e_{\lambda_i}(x_1,\dots,x_n)

as a sum of monomials, where each factor in the product has the form

e_{\lambda_i}(x_1,\dots,x_n)=\sum\limits_{\substack{S \subseteq [n] \\ |S|=\lambda_i}} \prod\limits_{i \in S} x_i

and by convention e_0(x_1,\dots,x_n)=1. Also, [n]=\{1,\dots,n\}. To visualize the monomials which occur in this expansion, write down the d \times n symbolic matrix

\begin{bmatrix} x_1 & x_2 & \dots & x_n \\ x_1 & x_2 & \dots & x_n \\ \vdots & \vdots & \dots & \vdots \\ x_1 & x_2 & \dots & x_n \end{bmatrix}

obtained by listing the variable sequence x_1,x_2,\dots,x_n a total of d times. Each monomial in the expansion of e_\lambda(x_1,\dots,x_n) is obtained exactly once by choosing \lambda_i distinct entries from the first row of the above matrix, \lambda_2 distinct entries from the second row, etc. This selection process can be recorded by setting all chosen elements to 1 and all unchosen elements to 0, leaving a d \times n matrix with entries in \{0,1\} whose row sums form the list \lambda=(\lambda_1,\dots,\lambda_n). The meaning of the list \alpha=(\alpha_1,\dots,\alpha_n) of column sums of this \{0,1\}-matrix is that \alpha_1 is the number of times the variable x_1 was chosen, \alpha_2 is the number of times the variable x_2 was chosen, etc., so that \alpha_1+\dots+\alpha_n=d. Thus the number of times the monomial

x^\alpha=x_1^{\alpha_1} \dots x_n^{\alpha_n}

appears in the expansion is M_{\lambda\alpha}, the number of d \times n matrices over \{0,1\} with row sums given by the partition \lambda of d and column sums given the by the composition \alpha of d. That is, we have

e_\lambda(x_1,\dots,x_n)=\sum\limits_{\alpha \in \mathrm{Com}_n^d} x^\lambda.

Finally, if \mu \in \mathrm{Par}_n^d is the weakly decreasing rearrangement of \alpha, we have M_{\lambda\alpha}=M_{\lambda\mu} since permuting columns has no effect on row sums. \square

In order to parlay the above expansion into a proof that \{e_\lambda \colon \lambda^\prime \in \mathrm{Par}_n^d\} is a basis of \mathcal{S}_n^d, we want to exploit properties of the “matrix” of coefficients M_{\lambda\mu}. The quotes indicate the fact that, since we have not said how the elements of \mathrm{Par}_n^d should be ordered, we have not said how the numbers M_{\lambda\mu} should be tabulated.

In order to rectify this, we first introduce a partial order on \mathrm{Par}_n^d called dominance order. This is not as kinky as it sounds, and is in fact given by a very straightforward recipe: given two nonnegative integer vectors \mu=(\mu_1,\dots,\mu_n) and \nu=(\nu_1,\dots,\nu_n) with weakly decreasing coordinates and total sum equal to d, we declare \mu \leq \nu if and only if

\mu_1 + \dots + \mu_k \leq \nu_1 + \dots + \nu_k

holds for all 1 \leq k \leq n.

Problem 7.1. Prove that dominance order really is a partial order on \mathrm{Par}_n^d\}.

The key property of dominance order in relation to Theorem 7.1 is the following.

Problem 7.2. Prove that the numbers M_{\lambda\mu} defined by Theorem 7.1 satisfy M_{\lambda\mu}=0 unless \mu \leq \lambda^\prime, and that moreover M_{\lambda\lambda^\prime}=1.

Math 202C: Lecture 6

*** Problem in this lecture due April 19 at 23:59 ***

In Lecture 5, we constructed the polynomial algebra \mathcal{P}_n and the symmetric polynomial algebra \mathcal{S}_n. The basic combinatorial objects underlying these constructions are integer compositions and integer partitions. Let

\mathrm{Com}_n=\{\alpha=(\alpha_1,\dots,\alpha_n) \colon \alpha_i \in \mathbb{N}_0\}.

Then,

\mathrm{Com}_n = \bigsqcup\limits_{d=0}^\infty \mathrm{Com}_n^d

with

\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon |\alpha|=d\},

where |\alpha|=\alpha_1+\dots+\alpha_n. Now we quantize, meaning that we replace \mathrm{Com}_n with the Hilbert space \mathcal{P}_n having orthonormal basis

|\alpha\rangle, \quad \alpha \in \mathrm{Com}_n.

The disjoint union decomposition of the infinite set \mathrm{Com}_n then becomes an orthogonal direct sum decomposition

\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d

of the infinite-dimensional space \mathcal{P}_n into finite-dimensional subspaces

\mathcal{P}_n^d = \bigoplus\limits_{\alpha \in \mathrm{Com}_n^d} \mathbb{C}|\alpha\rangle.

Moreover, we have a natural notion of multiplication in \mathcal{P}_n defined by

|\alpha\rangle|\beta\rangle=|\alpha+\beta\rangle,

which has the feature that

|\alpha\rangle \in \mathcal{P}_n^c,\ |\beta\rangle \in \mathcal{P}_n^d \implies |\alpha+\beta\rangle \in \mathcal{P}_n^{c+d}.

On one hand, the infinite-dimensional algebra is similar to the convolution algebra of an abelian group: if

|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle \quad\text{and}\quad |w\rangle = \sum\limits_\beta d_\beta|\beta\rangle,

then

|v\rangle |w\rangle = \sum\limits_\gamma \left( \sum\limits_{\alpha+\beta=\gamma} c_\alpha d_\beta\right)|\gamma\rangle.

Indeed, \mathcal{P}_n is precisely the convolution algebra of finitely-supported functions on the infinite abelian monoid \mathrm{Com}_n. On the other hand, \mathcal{P}_n is something quite familiar and intuitive: the algebra of polynomials in n commuting variables. Indeed, under the identification

|\alpha\rangle \mapsto x^\alpha = x_1^{\alpha_1} \dots x_n^{\alpha_n},

we have

|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle = \sum\limits_\alpha c_\alpha x^\alpha = f(x_1,\dots,x_n).

In this sense, what the familiar algebra of polynomials in n commuting variables “really” is is the convolution algebra of the commutative monoid \mathrm{Com}_n.

The symmetric group S_n = \mathrm{Aut}\{1,\dots,n\} acts on \mathrm{Com}_n according to

\pi \cdot \alpha = (\alpha_{\pi(1)} \dots \alpha_{\pi(n)}),

and this permutation action respects the decomposition of \mathrm{Com}_n into finite-dimensional components \mathrm{Com}_n^d. This permutation action gives rise to a unitary representation (\mathcal{P}_n,\varphi_n) of S_n, where

\varphi_n \colon S_n \longrightarrow U(\mathcal{P}_n)

is the group homomorphism defined by

\varphi_n(\pi)|\alpha\rangle = |\alpha_{\pi(1)} \dots \alpha_{\pi(n)}\rangle.

Equivalently, thinking of \mathcal{P}_n as the algebra of polynomials in n variables this is

\varphi_n(\pi)f(x_1,\dots,x_n) = f(x_{\pi^{-1}(1)},\dots,x_{\pi^{-1}(n)}).

The algebra of symmetric polynomials in n variables is the space \mathcal{S}_n=\mathcal{P}_n^{S_n} of latex S_n$-invariant vectors in this unitary representation, i.e. polynomials with the property that

f(x_{\pi(1)},\dots,x_{\pi(n)}) = f(x_1,\dots,x_n), \quad \forall \pi \in S_n.

As explained in Lecture 5, it is not too difficult to construct a basis of \mathcal{S}_n. Indeed, one need only observe that orbits of the the action of S_n on \mathrm{Com}_n are naturally indexed by the set

\mathrm{Par}_n = \{\lambda \in \mathrm{Com}_n \colon \lambda_1 \geq \dots \geq \lambda_n\}.

For a fixed positive integer d, the set \mathrm{Par}_n^d=\mathrm{Par}_n \cap \mathrm{Com}_n^d may be identified with the set of partitions of d having at most n parts. Then, associated to each \lambda \in \mathrm{Par}_n^d is the corresponding orbit sum

m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha,

and \{m_\lambda \colon \lambda \in \mathrm{Par}_n^d\} is an orthogonal basis of \mathcal{S}_n^d=\mathcal{S}_n \cap \mathcal{P}_n^d. This is the basis of monomial symmetric polynomials, which have the form

m_\lambda(x_1,\dots,x_n) = x_1^{\lambda_1} \dots x_n^{\lambda_n} + \text{symmetries}.

So while m_\lambda is not in general a monomial, it is a polynomial having the feature that any of its constituent monomials determines all the others.

Problem 6.1. Give necessary and sufficient conditions under which m_\lambda(x_1,\dots,x_n) actually is a monomial.

Because m_\lambda is not in general a monomial, it is not in general the case that

m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n)=m_{\lambda+\mu}(x_1,\dots,x_n).

This is somewhat analogous to the fact that convolution of conjugacy classes in the center of the convolution algebra of a noncommutative group is determined by, but not the same as, multiplication of group elements. As in that context, the natural question is to compute the connection coefficients of the monomial basis of \mathcal{S}_n,

m_\lambda m_\mu = \sum\limits_{\nu \in \mathrm{Par}_n} c_{\lambda\mu\nu} m_\nu.

In support of this analogy, consider the following.

Problem 6.2. Show that for any \gamma \in \mathcal{O}_\nu, the coefficient of the monomial x^\gamma in the product m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n) is

c_{\lambda\mu\nu} =|\{(\alpha,\beta) \in \mathcal{O}_\lambda \times \mathcal{O}_\nu \colon \alpha+\beta=\gamma\}|.

Taking the analogy to this natural conclusion, remember the a priori surprising fact the center of the convolution algebra, whose multiplication appears quite complicated in the class basis, is actually isomorphic to a pointwise function algebra. In particular, in the abelian case we have \mathcal{C}(G) \simeq \mathcal{F}(G).$

Theorem 6.1 (Newton). The algebra \mathcal{S}_n of symmetric polynomials in n variables is isomorphic to the algebra \mathcal{P}_n of polynomials in n variables.

Note that this is an infinite-dimensional phenomenon: the proper subalgebra \mathcal{S}_n of \mathcal{P}_n is isomorphic to the full algebra in which it resides. In particular, the isomorphism is definitely not the tautological inclusion.

Theorem 6.1. is called the fundamental theorem of symmetric polynomials, and as indicated above it is generally attributed to Newton. Among many other achievements, Newton laid the foundations for classical mechanics based on his theory of gravity, and developed a theory of optics which involved sticking a knitting needle into his own eye to demonstrate that color is an intrinsic property of light as opposed to an artifact of perception. Now that’s applied mathematics. Given that symmetric polynomials were good enough for Newton, it’s a bit hard to take complaints about our subject matter being “too pure” seriously. But ok, nowadays we’re doing data science, not physical science — why would you care about reconciling gravitation with quantum mechanics when you can make a living applying principal components analysis to an endless supply of large rectangular matrices?

From the data science perspective, Newton’s theorem says that the elementary symmetric polynomials form a universal coordinate system for polynomial features of unordered data. More precisely, his result is that \mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n} is isomorphic to \mathcal{P}_n=\mathbb{C}[e_1,\dots,e_n], where

e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subset [n] \\ |I|=r}} \prod\limits_{i \in I} x_i.

Here [n]=\{1,\dots,n\}, so e_r(x_1,\dots,x_n) is the zero polynomial for r>n. It is notationally convenient to define e_0(x_1,\dots,x_n) to be the constant polynomial 1. Then, for any partition \lambda, i.e. for any vector \lambda=(\lambda_1,\lambda_2,\lambda_3,\dots) from the set \mathrm{Par} of infinite integer vectors with weakly decreasing coordinate eventually equal to zero, we can extend the definition of the elementary symmetric polynomials by declaring

e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^\infty e_{\lambda_i}(x_1,\dots,x_n),

or equivalently

e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)}e_{\lambda_i}(x_1,\dots,x_n)

where \ell(\lambda) is the number of nonzero coordinates of \lambda. The condition required for this product to be nonzero is that \lambda_1 \leq n. Now, the set of partitions with largest part at most n is in bijection with the set of partitions with at most n parts via transposing Young diagrams, \lambda \mapsto \lambda^T. Thus, Newton’s theorem is equivalent to the following.

Theorem 6.2. For any d \in \mathbb{N}, the set \{e_\lambda(x_1,\dots,x_n) \colon \lambda^T \in \mathrm{Par}_n^d\} is a vector space basis of \mathcal{S}_n^d.

In Lecture 7, we will prove Theorem 6.2 by giving a combinatorial interpretation of the coefficients in the expansion

e_\lambda(x_1,\dots,x_n) = \sum\limits_{\lambda \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n).

of the (extended) elementary symmetric polynomials e_\lambda in terms of the monomial symmetric polynomials m_\mu. In fact, the value of this computation extends beyond its immediate application to Newton’s theorem. Indeed, the expansion above is universal in the sense that it remains valid under any specialization, i.e. under any algebra homomorphism \Xi \colon \mathcal{S}_n \to \mathcal{A}. In particular, we have

e_\lambda(|J_1\rangle, \dots, |J_n\rangle) = \sum\limits_{\lambda \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(|J_1\rangle, \dots, |J_n\rangle),

where |J_1\rangle,\dots,|J_n\rangle are the Jucys-Murphy elements in the convolution algebra \mathcal{C}(S_n) of the symmetric group. Recall that we established the commutativity of the JM-elements by proving that they lie in the commutative subalgebra \mathcal{J}_n of \mathcal{C}(S_n) generated by

Z\mathcal{C}(S_1) \cup \dots \cup Z\mathcal{C}(S_n),

which we named the JM-subalgebra of \mathcal{C}(S_n). We then showed that

|L_r\rangle=e_r(|J_1\rangle,\dots,|J_n\rangle),

i.e. the levels of the symmetric group are precisely the elementary symmetric polynomials specialized on the JM-elements.

Next week, we shall see that essentially the same argument used to prove Newton’s theorem can be retooled to prove the apparently much more sophisticated result that |J_1\rangle,\dots,|J_n\rangle not only lie in the algebra \mathcal{J}_n, but in fact generate it. This will essentially complete Phase I of our development of the representation theory of the symmetric group. Phase II will be to show that \mathcal{J}_n has a completely different spectral interpretation as the so-called Gelfand-Tsetlin subalgebra of \mathcal{C}(S_n). As discovered by Okounkov and Vershik, the fact that the Jucys-Murphy and Gelfand-Tsetlin subalgebras of \mathcal{C}(S_n) are one and the same completely unlocks the representation theory of the symmetric groups.

For now, here is something concrete to think about.

Problem 6.3. Show that e_r(1,2,\dots,n-1) is equal to the number of permutations in S_n consisting of n-r disjoint cycles.

For context, recall that the evaluation of p_r(1,2,\dots,n)=1^r+2^r+\dots+n^r is the classical problem of summing powers of consecutive numbers. At some point in your life you undoubtedly learned that

p_1(1,2,\dots,n) = \frac{n(n-1)}{2},

and this may have led you to discover that

p_2(1,2,\dots,n) = \frac{n(n+1)(2n+1)}{6},

which probably means that you tried to understand what was going on with p_r(1,2,\dots,n) in general. The problem is “solved” by Faulhaber’s formula, which is not very satisfying. The situation with e_r(1,2,\dots,n) is much cleaner, and may help to mend this childhood trauma.

Math 202C: Lecture 5

*** Problems in this Lecture due March 12 at 23:59 ***

Let

\mathrm{Com}_n = \{(\alpha_1,\dots,\alpha_n) \in \mathbb{Z}^n \colon \alpha_i \geq 0\}

be the set of nonnegative integer vectors with n components. Elements \alpha of \mathrm{Com}_n are referred to as weak n-part compositions, where “weak” means that some coordinates may be zero. Let

\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon \alpha_1+\dots +\alpha_n=d\}.

Elements of \mathrm{Com}_n^d are called weak n-part compositions of d. Let \mathcal{P}_n be the Hilbert space with orthonormal basis

|\alpha\rangle = |\alpha_1\dots\alpha_n\rangle,\quad \alpha \in \mathrm{Com}_n.

That is, \mathcal{P}_n is the space of complex-valued functions on weak n-part compositions which vanish at all but finitely many points. A general vector in \mathcal{P}_n thus has the form

|v\rangle = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha |\alpha\rangle,

where all but finitely many of the coefficients c_\alpha \in \mathbb{C} are zero. Note that \mathcal{P}_n is an infinite-dimensional Hilbert space, and recall that our definition of Hilbert space does not require completeness. We have an orthogonal direct sum decomposition

\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d,

where

\mathcal{P}_n^d = \mathrm{span}\{|\alpha\rangle \colon \alpha \in \mathrm{Com}_n^d\}.

Problem 5.1. Prove that \mathcal{P}_n^d is finite-dimensional, and determine its dimension.

Now define a multiplication in \mathcal{P}_n by bilinearly extending the rule

|\alpha\rangle|\beta\rangle = |\alpha+\beta\rangle.

Furthermore, define a conjugation in \mathcal{P}_n by antilinearly extending the rule

|\alpha\rangle^*=|\alpha\rangle.

Then, \mathcal{P}_n is an infinite-dimensional commutative algebra which you likely know by another name: the algebra of polynomials in n (commuting, selfadjoint) variables. Indeed, if we write

x_1=|100\dots 0\rangle\ x_2=|010\dots 0\rangle, \dots x_n=|000\dots 1\rangle,

then a basis vector in \mathcal{P}_n becomes

|\alpha\rangle = x_1^{\alpha_1} \dots x_n^{\alpha_n},

and a general vector in \mathcal{P}_n becomes

|v\rangle = \sum\limits_{\alpha} c_\alpha|\alpha\rangle = \sum\limits_{\alpha} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}.

So, if you have ever wondered what exactly a polynomial is, the above constitutes one possible answer: a polynomial is a finitely-supported complex-valued function on the set of compositions. This explains the sense in which there are “no polynomial relations” among the “variables” x_1,\dots,x_n. Indeed, writing

\sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n} = \sum\limits_\alpha c_\alpha |\alpha\rangle =0

shows that this means nothing more than linear independence of the computational basis in \mathcal{P}_n. Also note that \mathcal{P}_n^d is the span of all monomials whose exponents add to d,

x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}, \quad \alpha_1+\dots+\alpha_n=d.

Polynomials f(x_1,\dots,x_n) \in \mathcal{P}_n^d are said to be homogeneous of degree d, and the space of these is finite-dimensional.

Since you come with a lot of software already installed, we will continue to use the familiar notation

f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}

for vectors |v\rangle in \mathcal{P}_n = \mathbb{C}[x_1,\dots,x_n]. This will allow us to more readily access your existing libraries, fragmented as they may be. The monomial basis of \mathcal{P}_n is

|\alpha\rangle= x_1^{\alpha_1} \dots x_n^{\alpha_n}.

If we want to write things in a compressed way, we may use

f(x)=\sum\limits_\alpha c_\alpha x^\alpha.

One advantage of the classical polynomial notation is that it allows you to visualize homomorphisms from \mathcal{P}_n into other algebras as “substitutions.”

Definition 5.1. An algebra homomomorphism \Xi \colon \mathcal{P}_n \to \mathcal{A} is called a specialization when \mathcal{A} is commutative.

Any specialization \Xi is uniquely determine by the images A_1=\Xi(x_1),\dots,A_n=\Xi(x_n), since with the image of a general polynomial

f(x_1,\dots,x_n) = \sum\limits_\alpha c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}

is

\Xi(f(x_1,\dots,x_n)=\sum\limits_\alpha A_1^{\alpha_1} \dots A_n^{\alpha_n} =:f(A_1,\dots,A_n).

Conversely, given any elements A_1,\dots,A_n in a commutative algebra \mathcal{A} there is a unique specialization \Xi \colon \mathbb{C}[x_1,\dots,x_n] \to \mathcal{A} such that \Xi(x_1)=A_1,\dots,\Xi(x_n)=A_n.

For example, consider the specialization \Xi \colon \mathcal{P}_n \to \mathbb{C} determined by

\Xi(x_1)=1,\ \Xi(x_2)=2,\ \dots,\ \Xi(x_n)=n.

Then, for a general polynomial

f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}

we have

\Xi(f) = f(1,2,\dots,n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha 1^{\alpha_1}2^{\alpha_2} \dots n^{\alpha_n}.

In simpler times, people were interested in finding the image of the power sum polynomial

p_r(x_1,\dots,x_n)=x_1^r + x_2^r + \dots + x_n^r

under the specialization \Xi, which is the number

\Xi(p_r)=p_r(1,2,\dots,n) = 1^r+2^r+ \dots+ n^r.

A formula for this number was eventually found in terms of another family of numbers called Bernoulli numbers, which are themselves not so easy to describe. The main conceptual insight this formula provides is that \Xi(p_r) admits a second description as the the image of a univariate polynomial q_r(x) under the specialization x \mapsto n.

In probability theory, one is concerned with specializations of \mathcal{P}_n whose target \mathcal{A} is the algebra of complex-valued random variables on a given probability space. In fact, the problem we want to solve in this context is the stochastic version of the above numerical problem: given X_1,\dots,X_n \in \mathcal{A}, determine the distribution of

p_r(X_1,\dots,X_n)=X_1^r+ \dots + X_n^r

in terms of the joint distribution of X_1,\dots,X_n. Probability theory provides efficient tools for doing this when the random variables X_1,\dots,X_n are independent. If they are highly correlated, however, we know much less.

Problem 5.2. Determine the distribution of p_r(X_1,\dots,X_n) when X_1,\dots,X_n are iid Bernoulli random variables.

We can recast the content of Lecture 4 in terms of a specialization of \mathcal{P}_n whose target is the Jucys-Murphy algebra, \mathcal{J}_n. Recall that \mathcal{J}_n is the commutative subalgebra of the convolution algebra \mathcal{C}(S_n) of the symmetric group S_n=\mathrm{Aut}\{1,\dots,n\} generated by

\mathcal{Z}_1 \cup \mathcal{Z}_2 \cup \dots \cup \mathcal{Z}_n,

where \mathcal{Z}_k=Z\mathcal{C}(S_k) is the center of the subalgebra \mathcal{C}(S_k) of \mathcal{C}(S_n) consisting of functions supported on permutations which fix the points $k+1,k+2,\dots,n.$ The Jucys-Murphy specialization of \mathcal{P}_n is the algebra homomorphism

\Xi \colon \mathcal{P}_n \longrightarrow \mathcal{J}_n

defined by

\Xi(x_1)=|J_1\rangle,\ \Xi(x_2)=|J_2\rangle,\ \dots,\ \Xi(x_n)=|J_n\rangle,

where

|J_t\rangle = \sum\limits_{1 \leq s<t} |st\rangle, \quad 1 \leq t \leq n,

are the Jucys-Murphy elements in \mathcal{C}(S_n). Our key result is that the metric level sets

|L_r\rangle = \sum\limits_{\substack{\pi \in S_n \\ |\pi|=r}} |\pi\rangle, \quad 1 \leq r \leq n,

in the symmetric group (identity-centered spheres in the all-transpositions metric) are the image of a particular family of the polynomials e_1,\dots,e_n \in \mathcal{P}_n under the JM-specialization. The polynomials in question are the elementary symmetric polynomials

e_r(x_1,\dots,x_n) = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n} x_{t_1}x_{t_2} \dots x_{t_r}, \quad 1 \leq r \leq n,

which may also be written

e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subseteq \{1,\dots,n\} \\ |I|=r}} \prod\limits_{i \in I}x_i.

Our theorem from Lecture 4 (which actually goes back to Lecture 1, and seems to have been completely uninteresting to many people) is that

|L_r \rangle = e_r(|J_1\rangle,|J_2\rangle,\dots,|J_n\rangle).

We can also look at the image of this identity in the regular representation of \mathcal{C}(S_n), where it becomes

L_r= e_r(J_1,J_2,\dots,J_n),

giving us a polynomial decomposition of the metric level-set operators L_r \in \mathrm{End}\mathcal{C}(S_n) on the all-transpositions Cayley graph of S_n in terms of a simpler family of operators J_t \in \mathrm{End}\mathcal{C}(S_n). As described in Lecture 1, the metric level set operators on any graph interpolate between the adjacency operator and the distance operator. However, the polynomial decomposition of these operators given above is a special feature of the all-transpositions Cayley graph of the symmetric group, and we will soon diagonalize the Jucys-Murphy operators J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n).

Before doing this, let us come to an understanding of the elementary symmetric polynomials, and symmetric polynomials more generally. The basic fact is that we have a natural unitary representation of S_n on \mathcal{P}_n, namely the group homomorphism

\varphi \colon S_n \longrightarrow U(\mathcal{P}_n)

from the symmetric group of \{1,\dots,n\} to the unitary group of \mathcal{P}_n defined by

\varphi(\pi)|\alpha\rangle = |\alpha_{\pi(1)}\alpha_{\pi(2)} \dots \alpha_{\pi(n)}\rangle, \quad \pi \in S_n,\ \alpha \in \mathrm{Com}_n.

Note that this is an infinite-dimensional unitary representation of the symmetric group, but that each of the finite-dimensional subspaces \mathcal{P}_n^d is a finite-dimensional unitary representation of S_n. In polynomial notation, we have

\varphi(\pi)x^\alpha = x_1^{\alpha_{\pi(1)}}x_2^{\alpha_{\pi(2)}} \dots x_n^{\alpha_{\pi(n)}} = x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n},

and thus for a general polynomial

f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}

we have

\varphi(\pi) f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n}=f(x_{\pi^{-1}(1)}, \dots x_{\pi^{-1}(n)}).

Like every unitary representation, (\mathcal{P}_n,\varphi) contains a space of invariants,

\mathcal{S}_n = \mathcal{P}_n^{S_n} = \mathbb{C}[x_1,\dots,x_n]^{S_n},

and since \mathcal{P}_n is a commutative algebra so is \mathcal{S}_n. The commutative algebra \mathcal{S}_n is called the algebra of symmetric polynomials in n variables, and it consists precisely of those polynomials such that

f(x_1,\dots,x_n)=f(x_{\pi(1)},\dots,x_{\pi(n)}), \quad \text{for all } \pi \in S_n.

Let us find a basis for \mathcal{S}_n. First, S_n acts on \mathrm{Com}_n, which is just a set and not a Hilbert space, according to

\pi \cdot \alpha = (\alpha_{\pi(1)},\dots,\alpha_{\pi(n)}).

This action preserves the coordinate sum of \alpha, so in fact S_n is acting on each of the finite sets \mathrm{Com}_n^d, whose cardinality is easy to compute (Problem 4.1). However, the number of orbits into which \mathrm{Com}_n^d decomposes under S_n is not at all easy to compute. Let \mathrm{Par}_n^d \subset \mathrm{Com}_n^d be the set of weak n-part compositions of d, and observe that each of these may be identified with a partition of d having at most n parts. The coordinates of a given composition \alpha \in \mathrm{Com}_n^d can be permuted so that they become a weakly decreasing list of numbers, hence the orbits of \mathrm{Com}_n^d may be indexed by the elements of \mathrm{Par}_n^d and we obtain the decomposition

\mathrm{Com}_n^d = \bigsqcup\limits_{\lambda \in \mathrm{Par}_n^d} \mathcal{O}_\lambda,

where \mathcal{O}_\lambda is the set of compositions which sort to the partition \lambda. The monomial symmetric polynomials are defined by

m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha.

Problem 5.3. Prove that the monomial symmetric polynomials form an orthogonal basis of \mathcal{S}_n. Decompose the power sum symmetric polynomials p_r(x_1,\dots,x_n) and elementary symmetric polynomials e_r(x_1,\dots,x_n) in this basis.

The above was a first-principles construction of a basis for the space of S_n-invariants in \mathcal{P}_n. On the other hand, we know from Math 202B that for any unitary representation (V,\varphi) of a finite group G, averaging the operators \varphi(g) gives the orthogonal projection V \to V^G.

Problem 5.4. Show that the operator P = \frac{1}{n!}\sum_{\pi \in S_n} \varphi(\pi) projects each monomial in \mathcal{P}_n onto an element of \mathcal{S}_n that is in fact a monomial symmetric polynomial, up to an explicitly describable scalar factor.

Math 202C: Lecture 4

*** Problems in this lecture due April 12 at 23:59 ***

Let S_n=\mathrm{Aut}\{1,\dots,n\} be the symmetric group on \{1,\dots,n\}. Our convention is that permutations are multiplied left-to-right, \rho\sigma := \sigma \circ \rho. We identify $lates S_n$ with its Cayley graph \mathrm{Cay}(S_n,T_n) as generated by the conjugacy class of transpositions,

T_n=\{\tau=(s t) \colon 1 \leq s <t \leq n\}.

Thus, two permutations \rho,\sigma \in S_n are adjacent if and only if \sigma=\rho\tau for \tau \in T_n. Under this identification, S_n decomposes into a disjoint union of n independent sets, the levels

L_r=\{\pi \in S_n \colon |\pi|=r\} = \{\pi \in S_n \colon n-\mathsf{cyc}(\pi)=r\}, \quad 0 \leq r \leq n-1.

The cardinalities of these levels make up the nth row in the number triangle whose entries are the Stirling numbers of the first kind. Each level further decomposes as a disjoint union of conjugacy classes: we have

L_r = \bigsqcup\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,

where \Lambda is the set of partitions of n, and \ell(\alpha) is the number of terms in the partition \alpha. We can encode this decomposition algebraically in \mathcal{C}(S_n) as

|L_r\rangle = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} |C_\alpha\rangle.

Note that, because S_n is an ambivalent group, each class indicator |C_\alpha\rangle is a selfadjoint element in the convolution algebra \mathcal{C}(S_n). In the regular representation, the above becomes an identity in \mathrm{End}\mathcal{C}(S_n), namely

L_r = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,

and we have used this already to give a simple proof of the fact that the metric level set operators L_1,\dots,L_{n-1} on S_n=\mathrm{Cay}(S_n,T_n) commute. In particular, these selfadjoint operators are simultaneously diagonalizable. In this Lecture, we will obtain a different combinatorial decomposition of L_r which opens the door to actually diagonalizing L_1,\dots,L_{n-1}.

In Lecture 1, we defined an edge-labeling of S_n by marking each edge corresponding to the transposition (s\ t) with t, the larger of the two elements it transposes. We defined a walk on S_n to be strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. It is clear that the length of any such walk is at most n-1, which is the diameter of S_n. The following remarkable result holds.

Theorem 4.1. There exists a unique strictly monotone walk between every pair of permutations, and this walk is a geodesic.

Theorem 4.1 gives a second description of L_r as the set of all strictly monotone r-step walks on S_n emanating from the identity permutation, \iota. The figure below illustrates this for n=4, with \text{blue}=2, \text{purple}=3, \text{black}=4.

The Jucys-Murphy spanning tree of S_4.

Definition 4.2. The Jucys-Murphy subsets of S_n are defined by

J_1=\emptyset

J_2 = \{(1 2)\}

J_3=\{(1 3),(2 3)\}

\vdots

J_n=\{(1 n),(2 n),\dots,(n-1 n).\}

Thus,

J_t = \{(s t) \colon 1 \leq s <t\}

is the set of all transpositions which swap a given number t with smaller numbers s. The set J_1 = \emptyset is included only for notational convenience, as a sequence with initial index 2 may be unsettling. It is clear that the JM-subsets of S_n are pairwise disjoint, and that their union is T_n=L_1. More generally, the set of all r-step strictly monotone walks on S_n emanating from the identity permutation \iota is in bijection with

\bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},

and hence we have a bijection.

L_r \simeq \bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},

We will now encode this correspondence algebraically in \mathcal{C}(S_n).

Definition 4.3. The Jucys-Murphy elements in \mathcal{C}(S_n) are

|J_t\rangle = \sum\limits_{1 \leq s<t} |s t\rangle, \quad 1 \leq t \leq n.

The JM-elements are selfadjoint elements satisfying \langle J_t | J_t \rangle = t-1 and

|T_n\rangle = \sum\limits_{t=1}^n |J_t\rangle.

Thus, J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n) are selfadjoint operators satisfying

T_n = \sum\limits_{t=1}^n J_t,

where T_n is the adjacency operator on \mathrm{Cay}(S_n,T_n).

Problem 4.2. Prove that the operator norm of J_t is \|J_t\|=t-1.

At this point, we can write down the algebraic identity

|L_r\rangle = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n}|J_{t_1}\rangle \dots |J_{t_n}\rangle.

This is equivalent to the combinatorial decomposition of L_r in terms of JM-subsets, but at the same time it has many advantages. To see these we first the need the following fact.

Theorem 4.3. The JM-elements |J_1\rangle,\dots,|J_n\rangle pairwise commute in \mathcal{C}(S_n).

One way to verify this would be to check directly that the products

|J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|ik\rangle|jl\rangle \quad\text{and}\quad |J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|jl\rangle|ik\rangle

are indeed the same. We will instead construct a manifestly commutative subalgebra \mathcal{J}_n of \mathcal{C}(S_n) which contains \{|J_1\rangle,\dots,|J_n\rangle\}.

The key to this construction is the fact that the symmetric groups S_1,S_2,\dots,S_n form an inductive chain

S_1 \subset S_2 \subset \dots \subset S_n,

where we identity S_k=\mathrm{Aut}\{1,\dots,k\} with the subgroup of S_n consisting of permutations which fix the numbers k+1,k+2,\dots,n. This gives an inductive chain of algebras,

\mathcal{C}(S_1) \subset \mathcal{C}(S_2) \subset \dots \subset \mathcal{C}(S_n),

each of which has its own center, giving us a collection

Z\mathcal{C}(S_1),\ Z\mathcal{C}(S_2),\ \dots,\ Z\mathcal{C}(S_n)

of commutative subalgebras of \mathcal{C}(S_n). However, it is not true that these commutative subalgebras form an inductive chain. In fact, they are as disjoint as subalgebras can be.

Problem 4.3. Prove that Z\mathcal{C}(S_k) \cap Z\mathcal{C}(S_l)=\mathbb{C}|\iota\rangle for k<l.

What is true is that each element of Z\mathcal{C}(S_k) commutes with every element of Z\mathcal{C}(S_l). Indeed, assuming k<l, the center Z\mathcal{C}(S_l) consists of all elements commuting with every element of \mathcal{C}(S_l), and

Z\mathcal{C}(S_k) \subset \mathcal{C}(S_k) \subset \mathcal{C}(S_l).

Definition 4.4. The Jucys-Murphy subalgebra \mathcal{J}_n of \mathcal{C}(S_n) is the commutative subalgebra generated by Z\mathcal{C}(S_1) \cup Z\mathcal{C}(S_2) \cup \dots \cup Z\mathcal{C}(S_n).

Problem 4.4. Prove that \dim \mathcal{J}_n \geq \sum_{k=1}^n p(k)-(n-1), where p(k) is the number of partitions of k.

The commutativity of the JM-elements follows from the fact that they belong to the JM-subalgebra.

Theorem 4.4. We have |J_1\rangle,|J_2\rangle,\dots,|J_n\rangle \in \mathcal{J}_n.

Proof: For each 1 \leq k \leq n, let

|T_k\rangle = \sum\limits_{1 \leq s < t \leq k} |st\rangle

be the sum of all transpositions in S_k. This is an element of \mathcal{C}(S_k), and indeed an element of Z\mathcal{C}(S_k). Now observe that

|J_k\rangle = |T_k\rangle - |T_{k-1}\rangle,

which expresses |J_k\rangle as a difference of elements in \mathcal{J}_n. \square

Math 202C: Model Homework Solution

Lectures are informal by design: we meet to discuss algebra and chart our course through the subject. I talk to you, you talk to me. We generate ideas to craft a narrative. We then retire to our respective domiciles to polish that narrative. My job is to identify the key large scale ideas, treating these as girders which I assemble into an overarching structure presented in the lecture notes. I also identify important small-scale ideas which play the role of joists supporting the overarching structure. Your job is to install these joists, which you do in the homework problems. In order for this collaborative endeavor to succeed, there must be committed and earnest effort from both parties.

If I was a telepath, I would read your mind to determine if you are supplying this effort and base your grade on the results of this brain scan. Although I am not a telepath, I can gauge your earnestness to a certain extent based on verbal interactions. However, this measurement is imperfect and moreover I do not interact verbally with every student, though I am willing to do so. Therefore, your written solutions must communicate your commitment to the cause.

Below is an exemplar on which you can model your work in terms of level of detail and tone, should you wish to do so. My process was the following: I discussed the problem with two people for about fifteen minutes after lecture; I thought about it more on my fifteen minute drive home; I thought about it again while falling asleep, probably another fifteen minutes; I woke up and worked out the solution carefully, which took about an hour.

Problem 1.1. Find a formula for the number of walks of given length between two given vertices of the complete graph.

Solution: Let K_n be the complete graph on \{1,\dots,n\}. Given i,j \in S and r \in \mathbb{N}, let W^r(i,j) denote the number of r-step walks i \to j in G. Observe that

W^r(1,1) + W^r(1,2) + \dots + W^r(1,n)=(n-1)^r.

Indeed, at the each step of the walk we have n-1 choices for the next vertex to visit. We claim that

W^r(1,2)= \dots = W^r(1,n).

Intuitively, this is because the complete graph is completely homogeneous. Let us translate this intuition into a precise argument.

Let S_n=\mathrm{Aut}\{1,\dots,n\} be the symmetric group on \{1,\dots,n\}, and let \tau = (2\ 3) be the transposition which transposes 2 and 3. Let \tau act on the set of walks 1 \to v_1 \to \dots \to v_{r-1} \to 2 by

\tau(1 \to v_1 \to \dots \to v_{r-1} \to 2) = \tau(1) \to \tau(v_1) \to \dots \tau(v_{r-1}) \to \tau(2).

In words, \tau acts on an input walk by replacing all occurrences of 2 with 3, and vice versa. This is a bijective mapping from the set of r-step walks 1 \to 2 to the set of r-step walks 1 \to 3, and in fact this bijection is an involution because \tau is. Consequently, W^r(1,2)=W^r(1,3). The same argument shows that W^r(1,2)=W^r(1,4), just use \tau=(2\ 4) instead.

We now know that

W^r(1,1) + (n-1)W^r(1,2) = (n-1)^r.

If we can find a second linear equation satisfied by these numbers, we can solve the resulting system. We will find a second relation by haruspication, which in the present situation means looking at small values of r.

For r=1, we have W^1(1,1)=0 and W^1(1,2)=1. For r=2, we have W^2(1,1) = n-1 and W^2(1,2)=n-2. We thus conjecture that

W^r(1,2) = W^r(1,1) + (-1)^{r-1},

and we can prove this by induction on r. The base step of the induction has been completed; the inductive step is to show that

W^{r+1}(1,2) = W^{r+1}(1,1) + (-1)^r.

We have

W^{r+1}(1,2) = W^r(1,1) + \overline{W^r(1,2)}+ \dots + W^r(1,n)

and

W^{r+1}(1,1)= \overline{W^r(1,1)} + W^r(1,2)+\dots+W^r(1,n),

where in both equations the overline denotes an omitted term. Subtracting the second equation from the first, we get

W^{r+1}(1,2)-W^{r+1}(1,1)=W^r(1,1)-W^r(1,2)=-(-1)^{r-1}=(-1)^r,

where the second equality is the induction hypothesis.

We have arrived at a system of two linear equations in two unknowns,

W^r(1,1) + (n-1)W^r(1,2)=(n-1)^r \quad\text{ and }\quad W^r(1,1)-W^r(1,2)=(-1)^r.

Subtracting the second equation from the first, we arrive at

W^r(1,2) = \frac{(n-1)^r -(-1)^r}{n}.

Now substitute this value into the second equation to arrive at

W^r(1,1) = (-1)^r+ \frac{(n-1)^r -(-1)^r}{n}=\frac{(n-1)^r+ (-1)^r(n-1)}{n}.

Math 202C: Lecture 0

This post serves as the syllabus for Math 202C.

DATETOPICModality
March 30, Lecture 1Graphs and operatorsIn-person lecture
April 1, Lecture 2Cayley graphsIn-person lecture
April 3, Lecture 3Level operators on Cayley graphsAsynchronous lecture post
April 6, Lecture 4Jucys-Murphy elementsIn-person lecture
April 8, Lecture 5Polynomials and symmetric polynomialsIn-person lecture
April 10, Lecture 6Elementary symmetric polynomialsIn-person lecture
April 13, Lecture 7Elementary-to-monomial transition Asynchronous lecture post
April 15, Lecture 8Fundamental theorem of symmetric polysAsynchronous lecture post
April 17, Lecture 9NoneCancelled
April 20Review and preview, content and contextIn-person lecture
April 22Farahat-Higman theoremIn-person lecture
April 24Simple branchingIn-person lecture
April 27Centralizers and multiplicity-free branchingIn-person lecture
April 29Multiplicity-free branching for symmetric groupsIn-person lecture
May 1GT-basis and GT-algebraIn-person lecture
May 4
May 6
May 8
May 11
May 13
May 15
May 18
May 20
May 22
May 27
May 29
June 1
June 3
June 5
Schedule subject to change