Jonathan Novak

Math 202C: Lecture 7

*** Problems in this lecture due April 19 at 23:59 ***

Recall our objective: we know that

$\{m_\lambda(x_1,\dots,x_n) \colon \lambda \in \mathrm{Par}_n^d\}$

is a basis for $\mathcal{S}_n^d = \mathbb{C}[x_1,\dots,x_n]^{S_n}$ , and we are trying to prove that

$\{e_\lambda(x_1,\dots,x_n) \colon \lambda^\prime \in \mathrm{Par}_n^d\}$

is a second basis, where $\lambda^\prime$ is the partition obtained by transposing the Young diagram of $\lambda$ . Thus, the condition $\lambda^\prime \in \mathrm{Par}_n^d$ means that the largest part of $\lambda$ satisfies $\lambda_1 \leq n,$ hence all parts of $\lambda$ are bounded by $n$ . Since

$e_{\lambda}(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)} e_{\lambda_i}(x_1,\dots,x_n)$

is defined multiplicatively, this is equivalent to the statement that $\mathbb{C}[x_1,\dots,x_n]^{S_n}=\mathbb{C}[e_1,\dots,e_n]$ . Note that since $\lambda$ is a partition of $d$ , we can write it as $\lambda=(\lambda_1,\dots,\lambda_d)$ where we allow trailing zero coordinates, and declare $e_0(x_1,\dots,x_n)=1.$

Theorem 7.1. For every $\lambda \vdash d$ with $\lambda_1 \leq n,$ we have

$e_{\lambda^\prime}(x_1,\dots,x_n) = \sum\limits_{\mu \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n)$

where $M_{\lambda\mu}$ is the number of $d \times n$ matrices with entries in $\{0,1\}$ with row sum vector $\lambda=(\lambda_1,\dots,\lambda_d)$ and column sum vector $\mu=(\mu_1,\dots,\mu_n).$

Proof: Our objective is to expand the product

$e_{\lambda}(x_1,\dots,x_n)=\prod\limits_{i=1}^d e_{\lambda_i}(x_1,\dots,x_n)$

as a sum of monomials, where each factor in the product has the form

$e_{\lambda_i}(x_1,\dots,x_n)=\sum\limits_{\substack{S \subseteq [n] \\ |S|=\lambda_i}} \prod\limits_{i \in S} x_i$

and by convention $e_0(x_1,\dots,x_n)=1.$ Also, $[n]=\{1,\dots,n\}.$ To visualize the monomials which occur in this expansion, write down the $d \times n$ symbolic matrix

$\begin{bmatrix} x_1 & x_2 & \dots & x_n \\ x_1 & x_2 & \dots & x_n \\ \vdots & \vdots & \dots & \vdots \\ x_1 & x_2 & \dots & x_n \end{bmatrix}$

obtained by listing the variable sequence $x_1,x_2,\dots,x_n$ a total of $d$ times. Each monomial in the expansion of $e_\lambda(x_1,\dots,x_n)$ is obtained exactly once by choosing $\lambda_i$ distinct entries from the first row of the above matrix, $\lambda_2$ distinct entries from the second row, etc. This selection process can be recorded by setting all chosen elements to $1$ and all unchosen elements to $0$ , leaving a $d \times n$ matrix with entries in $\{0,1\}$ whose row sums form the list $\lambda=(\lambda_1,\dots,\lambda_n).$ The meaning of the list $\alpha=(\alpha_1,\dots,\alpha_n)$ of column sums of this $\{0,1\}$ -matrix is that $\alpha_1$ is the number of times the variable $x_1$ was chosen, $\alpha_2$ is the number of times the variable $x_2$ was chosen, etc., so that $\alpha_1+\dots+\alpha_n=d.$ Thus the number of times the monomial

$x^\alpha=x_1^{\alpha_1} \dots x_n^{\alpha_n}$

appears in the expansion is $M_{\lambda\alpha},$ the number of $d \times n$ matrices over $\{0,1\}$ with row sums given by the partition $\lambda$ of $d$ and column sums given the by the composition $\alpha$ of $d$ . That is, we have

$e_\lambda(x_1,\dots,x_n)=\sum\limits_{\alpha \in \mathrm{Com}_n^d} x^\lambda.$

Finally, if $\mu \in \mathrm{Par}_n^d$ is the weakly decreasing rearrangement of $\alpha,$ we have $M_{\lambda\alpha}=M_{\lambda\mu}$ since permuting columns has no effect on row sums. $\square$

In order to parlay the above expansion into a proof that $\{e_\lambda \colon \lambda^\prime \in \mathrm{Par}_n^d\}$ is a basis of $\mathcal{S}_n^d,$ we want to exploit properties of the “matrix” of coefficients $M_{\lambda\mu}.$ The quotes indicate the fact that, since we have not said how the elements of $\mathrm{Par}_n^d$ should be ordered, we have not said how the numbers $M_{\lambda\mu}$ should be tabulated.

In order to rectify this, we first introduce a partial order on $\mathrm{Par}_n^d$ called dominance order. This is not as kinky as it sounds, and is in fact given by a very straightforward recipe: given two nonnegative integer vectors $\mu=(\mu_1,\dots,\mu_n)$ and $\nu=(\nu_1,\dots,\nu_n)$ with weakly decreasing coordinates and total sum equal to $d,$ we declare $\mu \leq \nu$ if and only if

$\mu_1 + \dots + \mu_k \leq \nu_1 + \dots + \nu_k$

holds for all $1 \leq k \leq n.$

Problem 7.1. Prove that dominance order really is a partial order on $\mathrm{Par}_n^d\}.$

The key property of dominance order in relation to Theorem 7.1 is the following.

Problem 7.2. Prove that the numbers $M_{\lambda\mu}$ defined by Theorem 7.1 satisfy $M_{\lambda\mu}=0$ unless $\mu \leq \lambda^\prime,$ and that moreover $M_{\lambda\lambda^\prime}=1.$

Math 202C: Lecture 6

*** Problem in this lecture due April 19 at 23:59 ***

In Lecture 5, we constructed the polynomial algebra $\mathcal{P}_n$ and the symmetric polynomial algebra $\mathcal{S}_n$ . The basic combinatorial objects underlying these constructions are integer compositions and integer partitions. Let

$\mathrm{Com}_n=\{\alpha=(\alpha_1,\dots,\alpha_n) \colon \alpha_i \in \mathbb{N}_0\}.$

Then,

$\mathrm{Com}_n = \bigsqcup\limits_{d=0}^\infty \mathrm{Com}_n^d$

with

$\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon |\alpha|=d\},$

where $|\alpha|=\alpha_1+\dots+\alpha_n.$ Now we quantize, meaning that we replace $\mathrm{Com}_n$ with the Hilbert space $\mathcal{P}_n$ having orthonormal basis

$|\alpha\rangle, \quad \alpha \in \mathrm{Com}_n.$

The disjoint union decomposition of the infinite set $\mathrm{Com}_n$ then becomes an orthogonal direct sum decomposition

$\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d$

of the infinite-dimensional space $\mathcal{P}_n$ into finite-dimensional subspaces

$\mathcal{P}_n^d = \bigoplus\limits_{\alpha \in \mathrm{Com}_n^d} \mathbb{C}|\alpha\rangle.$

Moreover, we have a natural notion of multiplication in $\mathcal{P}_n$ defined by

$|\alpha\rangle|\beta\rangle=|\alpha+\beta\rangle,$

which has the feature that

$|\alpha\rangle \in \mathcal{P}_n^c,\ |\beta\rangle \in \mathcal{P}_n^d \implies |\alpha+\beta\rangle \in \mathcal{P}_n^{c+d}.$

On one hand, the infinite-dimensional algebra is similar to the convolution algebra of an abelian group: if

$|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle \quad\text{and}\quad |w\rangle = \sum\limits_\beta d_\beta|\beta\rangle,$

then

$|v\rangle |w\rangle = \sum\limits_\gamma \left( \sum\limits_{\alpha+\beta=\gamma} c_\alpha d_\beta\right)|\gamma\rangle.$

Indeed, $\mathcal{P}_n$ is precisely the convolution algebra of finitely-supported functions on the infinite abelian monoid $\mathrm{Com}_n.$ On the other hand, $\mathcal{P}_n$ is something quite familiar and intuitive: the algebra of polynomials in $n$ commuting variables. Indeed, under the identification

$|\alpha\rangle \mapsto x^\alpha = x_1^{\alpha_1} \dots x_n^{\alpha_n},$

we have

$|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle = \sum\limits_\alpha c_\alpha x^\alpha = f(x_1,\dots,x_n).$

In this sense, what the familiar algebra of polynomials in $n$ commuting variables “really” is is the convolution algebra of the commutative monoid $\mathrm{Com}_n.$

The symmetric group $S_n = \mathrm{Aut}\{1,\dots,n\}$ acts on $\mathrm{Com}_n$ according to

$\pi \cdot \alpha = (\alpha_{\pi(1)} \dots \alpha_{\pi(n)}),$

and this permutation action respects the decomposition of $\mathrm{Com}_n$ into finite-dimensional components $\mathrm{Com}_n^d.$ This permutation action gives rise to a unitary representation $(\mathcal{P}_n,\varphi_n)$ of $S_n$ , where

$\varphi_n \colon S_n \longrightarrow U(\mathcal{P}_n)$

is the group homomorphism defined by

$\varphi_n(\pi)|\alpha\rangle = |\alpha_{\pi(1)} \dots \alpha_{\pi(n)}\rangle.$

Equivalently, thinking of $\mathcal{P}_n$ as the algebra of polynomials in $n$ variables this is

$\varphi_n(\pi)f(x_1,\dots,x_n) = f(x_{\pi^{-1}(1)},\dots,x_{\pi^{-1}(n)}).$

The algebra of symmetric polynomials in $n$ variables is the space $\mathcal{S}_n=\mathcal{P}_n^{S_n} of$ latex S_n$-invariant vectors in this unitary representation, i.e. polynomials with the property that

$f(x_{\pi(1)},\dots,x_{\pi(n)}) = f(x_1,\dots,x_n), \quad \forall \pi \in S_n.$

As explained in Lecture 5, it is not too difficult to construct a basis of $\mathcal{S}_n$ . Indeed, one need only observe that orbits of the the action of $S_n$ on $\mathrm{Com}_n$ are naturally indexed by the set

$\mathrm{Par}_n = \{\lambda \in \mathrm{Com}_n \colon \lambda_1 \geq \dots \geq \lambda_n\}.$

For a fixed positive integer $d,$ the set $\mathrm{Par}_n^d=\mathrm{Par}_n \cap \mathrm{Com}_n^d$ may be identified with the set of partitions of $d$ having at most $n$ parts. Then, associated to each $\lambda \in \mathrm{Par}_n^d$ is the corresponding orbit sum

$m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha,$

and $\{m_\lambda \colon \lambda \in \mathrm{Par}_n^d\}$ is an orthogonal basis of $\mathcal{S}_n^d=\mathcal{S}_n \cap \mathcal{P}_n^d.$ This is the basis of monomial symmetric polynomials, which have the form

$m_\lambda(x_1,\dots,x_n) = x_1^{\lambda_1} \dots x_n^{\lambda_n} + \text{symmetries}.$

So while $m_\lambda$ is not in general a monomial, it is a polynomial having the feature that any of its constituent monomials determines all the others.

Problem 6.1. Give necessary and sufficient conditions under which $m_\lambda(x_1,\dots,x_n)$ actually is a monomial.

Because $m_\lambda$ is not in general a monomial, it is not in general the case that

$m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n)=m_{\lambda+\mu}(x_1,\dots,x_n).$

This is somewhat analogous to the fact that convolution of conjugacy classes in the center of the convolution algebra of a noncommutative group is determined by, but not the same as, multiplication of group elements. As in that context, the natural question is to compute the connection coefficients of the monomial basis of $\mathcal{S}_n$ ,

$m_\lambda m_\mu = \sum\limits_{\nu \in \mathrm{Par}_n} c_{\lambda\mu\nu} m_\nu.$

In support of this analogy, consider the following.

Problem 6.2. Show that for any $\gamma \in \mathcal{O}_\nu$ , the coefficient of the monomial $x^\gamma$ in the product $m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n)$ is

$c_{\lambda\mu\nu} =|\{(\alpha,\beta) \in \mathcal{O}_\lambda \times \mathcal{O}_\nu \colon \alpha+\beta=\gamma\}|.$

Taking the analogy to this natural conclusion, remember the a priori surprising fact the center of the convolution algebra, whose multiplication appears quite complicated in the class basis, is actually isomorphic to a pointwise function algebra. In particular, in the abelian case we have $\mathcal{C}(G) \simeq \mathcal{F}(G)$ .$

Theorem 6.1 (Newton). The algebra $\mathcal{S}_n$ of symmetric polynomials in $n$ variables is isomorphic to the algebra $\mathcal{P}_n$ of polynomials in $n$ variables.

Note that this is an infinite-dimensional phenomenon: the proper subalgebra $\mathcal{S}_n$ of $\mathcal{P}_n$ is isomorphic to the full algebra in which it resides. In particular, the isomorphism is definitely not the tautological inclusion.

Theorem 6.1. is called the fundamental theorem of symmetric polynomials, and as indicated above it is generally attributed to Newton. Among many other achievements, Newton laid the foundations for classical mechanics based on his theory of gravity, and developed a theory of optics which involved sticking a knitting needle into his own eye to demonstrate that color is an intrinsic property of light as opposed to an artifact of perception. Now that’s applied mathematics. Given that symmetric polynomials were good enough for Newton, it’s a bit hard to take complaints about our subject matter being “too pure” seriously. But ok, nowadays we’re doing data science, not physical science — why would you care about reconciling gravitation with quantum mechanics when you can make a living applying principal components analysis to an endless supply of large rectangular matrices?

From the data science perspective, Newton’s theorem says that the elementary symmetric polynomials form a universal coordinate system for polynomial features of unordered data. More precisely, his result is that $\mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n}$ is isomorphic to $\mathcal{P}_n=\mathbb{C}[e_1,\dots,e_n],$ where

$e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subset [n] \\ |I|=r}} \prod\limits_{i \in I} x_i.$

Here $[n]=\{1,\dots,n\},$ so $e_r(x_1,\dots,x_n)$ is the zero polynomial for $r>n.$ It is notationally convenient to define $e_0(x_1,\dots,x_n)$ to be the constant polynomial $1.$ Then, for any partition $\lambda,$ i.e. for any vector $\lambda=(\lambda_1,\lambda_2,\lambda_3,\dots)$ from the set $\mathrm{Par}$ of infinite integer vectors with weakly decreasing coordinate eventually equal to zero, we can extend the definition of the elementary symmetric polynomials by declaring

$e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^\infty e_{\lambda_i}(x_1,\dots,x_n),$

or equivalently

$e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)}e_{\lambda_i}(x_1,\dots,x_n)$

where $\ell(\lambda)$ is the number of nonzero coordinates of $\lambda.$ The condition required for this product to be nonzero is that $\lambda_1 \leq n$ . Now, the set of partitions with largest part at most $n$ is in bijection with the set of partitions with at most $n$ parts via transposing Young diagrams, $\lambda \mapsto \lambda^T.$ Thus, Newton’s theorem is equivalent to the following.

Theorem 6.2. For any $d \in \mathbb{N},$ the set $\{e_\lambda(x_1,\dots,x_n) \colon \lambda^T \in \mathrm{Par}_n^d\}$ is a vector space basis of $\mathcal{S}_n^d.$

In Lecture 7, we will prove Theorem 6.2 by giving a combinatorial interpretation of the coefficients in the expansion

$e_\lambda(x_1,\dots,x_n) = \sum\limits_{\lambda \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n).$

of the (extended) elementary symmetric polynomials $e_\lambda$ in terms of the monomial symmetric polynomials $m_\mu$ . In fact, the value of this computation extends beyond its immediate application to Newton’s theorem. Indeed, the expansion above is universal in the sense that it remains valid under any specialization, i.e. under any algebra homomorphism $\Xi \colon \mathcal{S}_n \to \mathcal{A}$ . In particular, we have

$e_\lambda(|J_1\rangle, \dots, |J_n\rangle) = \sum\limits_{\lambda \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(|J_1\rangle, \dots, |J_n\rangle),$

where $|J_1\rangle,\dots,|J_n\rangle$ are the Jucys-Murphy elements in the convolution algebra $\mathcal{C}(S_n)$ of the symmetric group. Recall that we established the commutativity of the JM-elements by proving that they lie in the commutative subalgebra $\mathcal{J}_n$ of $\mathcal{C}(S_n)$ generated by

$Z\mathcal{C}(S_1) \cup \dots \cup Z\mathcal{C}(S_n),$

which we named the JM-subalgebra of $\mathcal{C}(S_n)$ . We then showed that

$|L_r\rangle=e_r(|J_1\rangle,\dots,|J_n\rangle),$

i.e. the levels of the symmetric group are precisely the elementary symmetric polynomials specialized on the JM-elements.

Next week, we shall see that essentially the same argument used to prove Newton’s theorem can be retooled to prove the apparently much more sophisticated result that $|J_1\rangle,\dots,|J_n\rangle$ not only lie in the algebra $\mathcal{J}_n$ , but in fact generate it. This will essentially complete Phase I of our development of the representation theory of the symmetric group. Phase II will be to show that $\mathcal{J}_n$ has a completely different spectral interpretation as the so-called Gelfand-Tsetlin subalgebra of $\mathcal{C}(S_n)$ . As discovered by Okounkov and Vershik, the fact that the Jucys-Murphy and Gelfand-Tsetlin subalgebras of $\mathcal{C}(S_n)$ are one and the same completely unlocks the representation theory of the symmetric groups.

For now, here is something concrete to think about.

Problem 6.3. Show that $e_r(1,2,\dots,n-1)$ is equal to the number of permutations in $S_n$ consisting of $n-r$ disjoint cycles.

For context, recall that the evaluation of $p_r(1,2,\dots,n)=1^r+2^r+\dots+n^r$ is the classical problem of summing powers of consecutive numbers. At some point in your life you undoubtedly learned that

$p_1(1,2,\dots,n) = \frac{n(n-1)}{2},$

and this may have led you to discover that

$p_2(1,2,\dots,n) = \frac{n(n+1)(2n+1)}{6},$

which probably means that you tried to understand what was going on with $p_r(1,2,\dots,n)$ in general. The problem is “solved” by Faulhaber’s formula, which is not very satisfying. The situation with $e_r(1,2,\dots,n)$ is much cleaner, and may help to mend this childhood trauma.

Math 202C: Lecture 5

*** Problems in this Lecture due March 12 at 23:59 ***

Let

$\mathrm{Com}_n = \{(\alpha_1,\dots,\alpha_n) \in \mathbb{Z}^n \colon \alpha_i \geq 0\}$

be the set of nonnegative integer vectors with $n$ components. Elements $\alpha$ of $\mathrm{Com}_n$ are referred to as weak $n$ -part compositions, where “weak” means that some coordinates may be zero. Let

$\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon \alpha_1+\dots +\alpha_n=d\}.$

Elements of $\mathrm{Com}_n^d$ are called weak $n$ -part compositions of $d.$ Let $\mathcal{P}_n$ be the Hilbert space with orthonormal basis

$|\alpha\rangle = |\alpha_1\dots\alpha_n\rangle,\quad \alpha \in \mathrm{Com}_n.$

That is, $\mathcal{P}_n$ is the space of complex-valued functions on weak $n$ -part compositions which vanish at all but finitely many points. A general vector in $\mathcal{P}_n$ thus has the form

$|v\rangle = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha |\alpha\rangle,$

where all but finitely many of the coefficients $c_\alpha \in \mathbb{C}$ are zero. Note that $\mathcal{P}_n$ is an infinite-dimensional Hilbert space, and recall that our definition of Hilbert space does not require completeness. We have an orthogonal direct sum decomposition

$\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d,$

where

$\mathcal{P}_n^d = \mathrm{span}\{|\alpha\rangle \colon \alpha \in \mathrm{Com}_n^d\}.$

Problem 5.1. Prove that $\mathcal{P}_n^d$ is finite-dimensional, and determine its dimension.

Now define a multiplication in $\mathcal{P}_n$ by bilinearly extending the rule

$|\alpha\rangle|\beta\rangle = |\alpha+\beta\rangle.$

Furthermore, define a conjugation in $\mathcal{P}_n$ by antilinearly extending the rule

$|\alpha\rangle^*=|\alpha\rangle.$

Then, $\mathcal{P}_n$ is an infinite-dimensional commutative algebra which you likely know by another name: the algebra of polynomials in $n$ (commuting, selfadjoint) variables. Indeed, if we write

$x_1=|100\dots 0\rangle\ x_2=|010\dots 0\rangle, \dots x_n=|000\dots 1\rangle,$

then a basis vector in $\mathcal{P}_n$ becomes

$|\alpha\rangle = x_1^{\alpha_1} \dots x_n^{\alpha_n},$

and a general vector in $\mathcal{P}_n$ becomes

$|v\rangle = \sum\limits_{\alpha} c_\alpha|\alpha\rangle = \sum\limits_{\alpha} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}.$

So, if you have ever wondered what exactly a polynomial is, the above constitutes one possible answer: a polynomial is a finitely-supported complex-valued function on the set of compositions. This explains the sense in which there are “no polynomial relations” among the “variables” $x_1,\dots,x_n$ . Indeed, writing

$\sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n} = \sum\limits_\alpha c_\alpha |\alpha\rangle =0$

shows that this means nothing more than linear independence of the computational basis in $\mathcal{P}_n$ . Also note that $\mathcal{P}_n^d$ is the span of all monomials whose exponents add to $d$ ,

$x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}, \quad \alpha_1+\dots+\alpha_n=d.$

Polynomials $f(x_1,\dots,x_n) \in \mathcal{P}_n^d$ are said to be homogeneous of degree $d$ , and the space of these is finite-dimensional.

Since you come with a lot of software already installed, we will continue to use the familiar notation

$f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}$

for vectors $|v\rangle$ in $\mathcal{P}_n = \mathbb{C}[x_1,\dots,x_n].$ This will allow us to more readily access your existing libraries, fragmented as they may be. The monomial basis of $\mathcal{P}_n$ is

$|\alpha\rangle= x_1^{\alpha_1} \dots x_n^{\alpha_n}.$

If we want to write things in a compressed way, we may use

$f(x)=\sum\limits_\alpha c_\alpha x^\alpha.$

One advantage of the classical polynomial notation is that it allows you to visualize homomorphisms from $\mathcal{P}_n$ into other algebras as “substitutions.”

Definition 5.1. An algebra homomomorphism $\Xi \colon \mathcal{P}_n \to \mathcal{A}$ is called a specialization when $\mathcal{A}$ is commutative.

Any specialization $\Xi$ is uniquely determine by the images $A_1=\Xi(x_1),\dots,A_n=\Xi(x_n)$ , since with the image of a general polynomial

$f(x_1,\dots,x_n) = \sum\limits_\alpha c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}$

$\Xi(f(x_1,\dots,x_n)=\sum\limits_\alpha A_1^{\alpha_1} \dots A_n^{\alpha_n} =:f(A_1,\dots,A_n).$

Conversely, given any elements $A_1,\dots,A_n$ in a commutative algebra $\mathcal{A}$ there is a unique specialization $\Xi \colon \mathbb{C}[x_1,\dots,x_n] \to \mathcal{A}$ such that $\Xi(x_1)=A_1,\dots,\Xi(x_n)=A_n.$

For example, consider the specialization $\Xi \colon \mathcal{P}_n \to \mathbb{C}$ determined by

$\Xi(x_1)=1,\ \Xi(x_2)=2,\ \dots,\ \Xi(x_n)=n.$

Then, for a general polynomial

$f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}$

we have

$\Xi(f) = f(1,2,\dots,n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha 1^{\alpha_1}2^{\alpha_2} \dots n^{\alpha_n}.$

In simpler times, people were interested in finding the image of the power sum polynomial

$p_r(x_1,\dots,x_n)=x_1^r + x_2^r + \dots + x_n^r$

under the specialization $\Xi,$ which is the number

$\Xi(p_r)=p_r(1,2,\dots,n) = 1^r+2^r+ \dots+ n^r.$

A formula for this number was eventually found in terms of another family of numbers called Bernoulli numbers, which are themselves not so easy to describe. The main conceptual insight this formula provides is that $\Xi(p_r)$ admits a second description as the the image of a univariate polynomial $q_r(x)$ under the specialization $x \mapsto n$ .

In probability theory, one is concerned with specializations of $\mathcal{P}_n$ whose target $\mathcal{A}$ is the algebra of complex-valued random variables on a given probability space. In fact, the problem we want to solve in this context is the stochastic version of the above numerical problem: given $X_1,\dots,X_n \in \mathcal{A}$ , determine the distribution of

$p_r(X_1,\dots,X_n)=X_1^r+ \dots + X_n^r$

in terms of the joint distribution of $X_1,\dots,X_n$ . Probability theory provides efficient tools for doing this when the random variables $X_1,\dots,X_n$ are independent. If they are highly correlated, however, we know much less.

Problem 5.2. Determine the distribution of $p_r(X_1,\dots,X_n)$ when $X_1,\dots,X_n$ are iid Bernoulli random variables.

We can recast the content of Lecture 4 in terms of a specialization of $\mathcal{P}_n$ whose target is the Jucys-Murphy algebra, $\mathcal{J}_n$ . Recall that $\mathcal{J}_n$ is the commutative subalgebra of the convolution algebra $\mathcal{C}(S_n)$ of the symmetric group $S_n=\mathrm{Aut}\{1,\dots,n\}$ generated by

$\mathcal{Z}_1 \cup \mathcal{Z}_2 \cup \dots \cup \mathcal{Z}_n$ ,

where $\mathcal{Z}_k=Z\mathcal{C}(S_k)$ is the center of the subalgebra $\mathcal{C}(S_k)$ of $\mathcal{C}(S_n)$ consisting of functions supported on permutations which fix the points $k+1,k+2,\dots,n.$ The Jucys-Murphy specialization of $\mathcal{P}_n$ is the algebra homomorphism

$\Xi \colon \mathcal{P}_n \longrightarrow \mathcal{J}_n$

defined by

$\Xi(x_1)=|J_1\rangle,\ \Xi(x_2)=|J_2\rangle,\ \dots,\ \Xi(x_n)=|J_n\rangle,$

where

$|J_t\rangle = \sum\limits_{1 \leq s<t} |st\rangle, \quad 1 \leq t \leq n,$

are the Jucys-Murphy elements in $\mathcal{C}(S_n).$ Our key result is that the metric level sets

$|L_r\rangle = \sum\limits_{\substack{\pi \in S_n \\ |\pi|=r}} |\pi\rangle, \quad 1 \leq r \leq n,$

in the symmetric group (identity-centered spheres in the all-transpositions metric) are the image of a particular family of the polynomials $e_1,\dots,e_n \in \mathcal{P}_n$ under the JM-specialization. The polynomials in question are the elementary symmetric polynomials

$e_r(x_1,\dots,x_n) = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n} x_{t_1}x_{t_2} \dots x_{t_r}, \quad 1 \leq r \leq n,$

which may also be written

$e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subseteq \{1,\dots,n\} \\ |I|=r}} \prod\limits_{i \in I}x_i.$

Our theorem from Lecture 4 (which actually goes back to Lecture 1, and seems to have been completely uninteresting to many people) is that

$|L_r \rangle = e_r(|J_1\rangle,|J_2\rangle,\dots,|J_n\rangle).$

We can also look at the image of this identity in the regular representation of $\mathcal{C}(S_n)$ , where it becomes

$L_r= e_r(J_1,J_2,\dots,J_n),$

giving us a polynomial decomposition of the metric level-set operators $L_r \in \mathrm{End}\mathcal{C}(S_n)$ on the all-transpositions Cayley graph of $S_n$ in terms of a simpler family of operators $J_t \in \mathrm{End}\mathcal{C}(S_n).$ As described in Lecture 1, the metric level set operators on any graph interpolate between the adjacency operator and the distance operator. However, the polynomial decomposition of these operators given above is a special feature of the all-transpositions Cayley graph of the symmetric group, and we will soon diagonalize the Jucys-Murphy operators $J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n).$

Before doing this, let us come to an understanding of the elementary symmetric polynomials, and symmetric polynomials more generally. The basic fact is that we have a natural unitary representation of $S_n$ on $\mathcal{P}_n$ , namely the group homomorphism

$\varphi \colon S_n \longrightarrow U(\mathcal{P}_n)$

from the symmetric group of $\{1,\dots,n\}$ to the unitary group of $\mathcal{P}_n$ defined by

$\varphi(\pi)|\alpha\rangle = |\alpha_{\pi(1)}\alpha_{\pi(2)} \dots \alpha_{\pi(n)}\rangle, \quad \pi \in S_n,\ \alpha \in \mathrm{Com}_n.$

Note that this is an infinite-dimensional unitary representation of the symmetric group, but that each of the finite-dimensional subspaces $\mathcal{P}_n^d$ is a finite-dimensional unitary representation of $S_n.$ In polynomial notation, we have

$\varphi(\pi)x^\alpha = x_1^{\alpha_{\pi(1)}}x_2^{\alpha_{\pi(2)}} \dots x_n^{\alpha_{\pi(n)}} = x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n},$

and thus for a general polynomial

$f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}$

we have

$\varphi(\pi) f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n}=f(x_{\pi^{-1}(1)}, \dots x_{\pi^{-1}(n)}).$

Like every unitary representation, $(\mathcal{P}_n,\varphi)$ contains a space of invariants,

$\mathcal{S}_n = \mathcal{P}_n^{S_n} = \mathbb{C}[x_1,\dots,x_n]^{S_n},$

and since $\mathcal{P}_n$ is a commutative algebra so is $\mathcal{S}_n$ . The commutative algebra $\mathcal{S}_n$ is called the algebra of symmetric polynomials in $n$ variables, and it consists precisely of those polynomials such that

$f(x_1,\dots,x_n)=f(x_{\pi(1)},\dots,x_{\pi(n)}), \quad \text{for all } \pi \in S_n.$

Let us find a basis for $\mathcal{S}_n$ . First, $S_n$ acts on $\mathrm{Com}_n$ , which is just a set and not a Hilbert space, according to

$\pi \cdot \alpha = (\alpha_{\pi(1)},\dots,\alpha_{\pi(n)}).$

This action preserves the coordinate sum of $\alpha,$ so in fact $S_n$ is acting on each of the finite sets $\mathrm{Com}_n^d,$ whose cardinality is easy to compute (Problem 4.1). However, the number of orbits into which $\mathrm{Com}_n^d$ decomposes under $S_n$ is not at all easy to compute. Let $\mathrm{Par}_n^d \subset \mathrm{Com}_n^d$ be the set of weak $n$ -part compositions of $d$ , and observe that each of these may be identified with a partition of $d$ having at most $n$ parts. The coordinates of a given composition $\alpha \in \mathrm{Com}_n^d$ can be permuted so that they become a weakly decreasing list of numbers, hence the orbits of $\mathrm{Com}_n^d$ may be indexed by the elements of $\mathrm{Par}_n^d$ and we obtain the decomposition

$\mathrm{Com}_n^d = \bigsqcup\limits_{\lambda \in \mathrm{Par}_n^d} \mathcal{O}_\lambda,$

where $\mathcal{O}_\lambda$ is the set of compositions which sort to the partition $\lambda.$ The monomial symmetric polynomials are defined by

$m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha.$

Problem 5.3. Prove that the monomial symmetric polynomials form an orthogonal basis of $\mathcal{S}_n$ . Decompose the power sum symmetric polynomials $p_r(x_1,\dots,x_n)$ and elementary symmetric polynomials $e_r(x_1,\dots,x_n)$ in this basis.

The above was a first-principles construction of a basis for the space of $S_n$ -invariants in $\mathcal{P}_n$ . On the other hand, we know from Math 202B that for any unitary representation $(V,\varphi)$ of a finite group $G$ , averaging the operators $\varphi(g)$ gives the orthogonal projection $V \to V^G.$

Problem 5.4. Show that the operator $P = \frac{1}{n!}\sum_{\pi \in S_n} \varphi(\pi)$ projects each monomial in $\mathcal{P}_n$ onto an element of $\mathcal{S}_n$ that is in fact a monomial symmetric polynomial, up to an explicitly describable scalar factor.

Math 202C: Lecture 4

*** Problems in this lecture due April 12 at 23:59 ***

Let $S_n=\mathrm{Aut}\{1,\dots,n\}$ be the symmetric group on $\{1,\dots,n\}.$ Our convention is that permutations are multiplied left-to-right, $\rho\sigma := \sigma \circ \rho.$ We identify $lates S_n$ with its Cayley graph $\mathrm{Cay}(S_n,T_n)$ as generated by the conjugacy class of transpositions,

$T_n=\{\tau=(s t) \colon 1 \leq s <t \leq n\}.$

Thus, two permutations $\rho,\sigma \in S_n$ are adjacent if and only if $\sigma=\rho\tau$ for $\tau \in T_n.$ Under this identification, $S_n$ decomposes into a disjoint union of $n$ independent sets, the levels

$L_r=\{\pi \in S_n \colon |\pi|=r\} = \{\pi \in S_n \colon n-\mathsf{cyc}(\pi)=r\}, \quad 0 \leq r \leq n-1.$

The cardinalities of these levels make up the $n$ th row in the number triangle whose entries are the Stirling numbers of the first kind. Each level further decomposes as a disjoint union of conjugacy classes: we have

$L_r = \bigsqcup\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,$

where $\Lambda$ is the set of partitions of $n$ , and $\ell(\alpha)$ is the number of terms in the partition $\alpha$ . We can encode this decomposition algebraically in $\mathcal{C}(S_n)$ as

$|L_r\rangle = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} |C_\alpha\rangle.$

Note that, because $S_n$ is an ambivalent group, each class indicator $|C_\alpha\rangle$ is a selfadjoint element in the convolution algebra $\mathcal{C}(S_n)$ . In the regular representation, the above becomes an identity in $\mathrm{End}\mathcal{C}(S_n)$ , namely

$L_r = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,$

and we have used this already to give a simple proof of the fact that the metric level set operators $L_1,\dots,L_{n-1}$ on $S_n=\mathrm{Cay}(S_n,T_n)$ commute. In particular, these selfadjoint operators are simultaneously diagonalizable. In this Lecture, we will obtain a different combinatorial decomposition of $L_r$ which opens the door to actually diagonalizing $L_1,\dots,L_{n-1}.$

In Lecture 1, we defined an edge-labeling of $S_n$ by marking each edge corresponding to the transposition $(s\ t)$ with $t$ , the larger of the two elements it transposes. We defined a walk on $S_n$ to be strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. It is clear that the length of any such walk is at most $n-1,$ which is the diameter of $S_n$ . The following remarkable result holds.

Theorem 4.1. There exists a unique strictly monotone walk between every pair of permutations, and this walk is a geodesic.

Theorem 4.1 gives a second description of $L_r$ as the set of all strictly monotone $r$ -step walks on $S_n$ emanating from the identity permutation, $\iota.$ The figure below illustrates this for $n=4,$ with $\text{blue}=2,$ $\text{purple}=3,$ $\text{black}=4.$

The Jucys-Murphy spanning tree of $S_4$ .

The Jucys-Murphy spanning tree of $S_4$ .

Definition 4.2. The Jucys-Murphy subsets of $S_n$ are defined by

$J_1=\emptyset$

$J_2 = \{(1 2)\}$

$J_3=\{(1 3),(2 3)\}$

$\vdots$

$J_n=\{(1 n),(2 n),\dots,(n-1 n).\}$

Thus,

$J_t = \{(s t) \colon 1 \leq s <t\}$

is the set of all transpositions which swap a given number $t$ with smaller numbers $s$ . The set $J_1 = \emptyset$ is included only for notational convenience, as a sequence with initial index $2$ may be unsettling. It is clear that the JM-subsets of $S_n$ are pairwise disjoint, and that their union is $T_n=L_1$ . More generally, the set of all $r$ -step strictly monotone walks on $S_n$ emanating from the identity permutation $\iota$ is in bijection with

$\bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},$

and hence we have a bijection.

$L_r \simeq \bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},$

We will now encode this correspondence algebraically in $\mathcal{C}(S_n).$

Definition 4.3. The Jucys-Murphy elements in $\mathcal{C}(S_n)$ are

$|J_t\rangle = \sum\limits_{1 \leq s<t} |s t\rangle, \quad 1 \leq t \leq n.$

The JM-elements are selfadjoint elements satisfying $\langle J_t | J_t \rangle = t-1$ and

$|T_n\rangle = \sum\limits_{t=1}^n |J_t\rangle.$

Thus, $J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n)$ are selfadjoint operators satisfying

$T_n = \sum\limits_{t=1}^n J_t,$

where $T_n$ is the adjacency operator on $\mathrm{Cay}(S_n,T_n).$

Problem 4.2. Prove that the operator norm of $J_t$ is $\|J_t\|=t-1.$

At this point, we can write down the algebraic identity

$|L_r\rangle = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n}|J_{t_1}\rangle \dots |J_{t_n}\rangle.$

This is equivalent to the combinatorial decomposition of $L_r$ in terms of $JM$ -subsets, but at the same time it has many advantages. To see these we first the need the following fact.

Theorem 4.3. The JM-elements $|J_1\rangle,\dots,|J_n\rangle$ pairwise commute in $\mathcal{C}(S_n).$

One way to verify this would be to check directly that the products

$|J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|ik\rangle|jl\rangle \quad\text{and}\quad |J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|jl\rangle|ik\rangle$

are indeed the same. We will instead construct a manifestly commutative subalgebra $\mathcal{J}_n$ of $\mathcal{C}(S_n)$ which contains $\{|J_1\rangle,\dots,|J_n\rangle\}.$

The key to this construction is the fact that the symmetric groups $S_1,S_2,\dots,S_n$ form an inductive chain

$S_1 \subset S_2 \subset \dots \subset S_n,$

where we identity $S_k=\mathrm{Aut}\{1,\dots,k\}$ with the subgroup of $S_n$ consisting of permutations which fix the numbers $k+1,k+2,\dots,n.$ This gives an inductive chain of algebras,

$\mathcal{C}(S_1) \subset \mathcal{C}(S_2) \subset \dots \subset \mathcal{C}(S_n),$

each of which has its own center, giving us a collection

$Z\mathcal{C}(S_1),\ Z\mathcal{C}(S_2),\ \dots,\ Z\mathcal{C}(S_n)$

of commutative subalgebras of $\mathcal{C}(S_n).$ However, it is not true that these commutative subalgebras form an inductive chain. In fact, they are as disjoint as subalgebras can be.

Problem 4.3. Prove that $Z\mathcal{C}(S_k) \cap Z\mathcal{C}(S_l)=\mathbb{C}|\iota\rangle$ for $k<l.$

What is true is that each element of $Z\mathcal{C}(S_k)$ commutes with every element of $Z\mathcal{C}(S_l).$ Indeed, assuming $k<l$ , the center $Z\mathcal{C}(S_l)$ consists of all elements commuting with every element of $\mathcal{C}(S_l),$ and

$Z\mathcal{C}(S_k) \subset \mathcal{C}(S_k) \subset \mathcal{C}(S_l).$

Definition 4.4. The Jucys-Murphy subalgebra $\mathcal{J}_n$ of $\mathcal{C}(S_n)$ is the commutative subalgebra generated by $Z\mathcal{C}(S_1) \cup Z\mathcal{C}(S_2) \cup \dots \cup Z\mathcal{C}(S_n).$

Problem 4.4. Prove that $\dim \mathcal{J}_n \geq \sum_{k=1}^n p(k)-(n-1),$ where $p(k)$ is the number of partitions of $k.$

The commutativity of the JM-elements follows from the fact that they belong to the JM-subalgebra.

Theorem 4.4. We have $|J_1\rangle,|J_2\rangle,\dots,|J_n\rangle \in \mathcal{J}_n.$

Proof: For each $1 \leq k \leq n$ , let

$|T_k\rangle = \sum\limits_{1 \leq s < t \leq k} |st\rangle$

be the sum of all transpositions in $S_k$ . This is an element of $\mathcal{C}(S_k),$ and indeed an element of $Z\mathcal{C}(S_k).$ Now observe that

$|J_k\rangle = |T_k\rangle - |T_{k-1}\rangle,$

which expresses $|J_k\rangle$ as a difference of elements in $\mathcal{J}_n$ . $\square$

Math 202C: Model Homework Solution

Lectures are informal by design: we meet to discuss algebra and chart our course through the subject. I talk to you, you talk to me. We generate ideas to craft a narrative. We then retire to our respective domiciles to polish that narrative. My job is to identify the key large scale ideas, treating these as girders which I assemble into an overarching structure presented in the lecture notes. I also identify important small-scale ideas which play the role of joists supporting the overarching structure. Your job is to install these joists, which you do in the homework problems. In order for this collaborative endeavor to succeed, there must be committed and earnest effort from both parties.

If I was a telepath, I would read your mind to determine if you are supplying this effort and base your grade on the results of this brain scan. Although I am not a telepath, I can gauge your earnestness to a certain extent based on verbal interactions. However, this measurement is imperfect and moreover I do not interact verbally with every student, though I am willing to do so. Therefore, your written solutions must communicate your commitment to the cause.

Below is an exemplar on which you can model your work in terms of level of detail and tone, should you wish to do so. My process was the following: I discussed the problem with two people for about fifteen minutes after lecture; I thought about it more on my fifteen minute drive home; I thought about it again while falling asleep, probably another fifteen minutes; I woke up and worked out the solution carefully, which took about an hour.

Problem 1.1. Find a formula for the number of walks of given length between two given vertices of the complete graph.

Solution: Let $K_n$ be the complete graph on $\{1,\dots,n\}$ . Given $i,j \in S$ and $r \in \mathbb{N},$ let $W^r(i,j)$ denote the number of $r$ -step walks $i \to j$ in $G.$ Observe that

$W^r(1,1) + W^r(1,2) + \dots + W^r(1,n)=(n-1)^r.$

Indeed, at the each step of the walk we have $n-1$ choices for the next vertex to visit. We claim that

$W^r(1,2)= \dots = W^r(1,n).$

Intuitively, this is because the complete graph is completely homogeneous. Let us translate this intuition into a precise argument.

Let $S_n=\mathrm{Aut}\{1,\dots,n\}$ be the symmetric group on $\{1,\dots,n\},$ and let $\tau = (2\ 3)$ be the transposition which transposes $2$ and $3.$ Let $\tau$ act on the set of walks $1 \to v_1 \to \dots \to v_{r-1} \to 2$ by

$\tau(1 \to v_1 \to \dots \to v_{r-1} \to 2) = \tau(1) \to \tau(v_1) \to \dots \tau(v_{r-1}) \to \tau(2).$

In words, $\tau$ acts on an input walk by replacing all occurrences of $2$ with $3$ , and vice versa. This is a bijective mapping from the set of $r$ -step walks $1 \to 2$ to the set of $r$ -step walks $1 \to 3$ , and in fact this bijection is an involution because $\tau$ is. Consequently, $W^r(1,2)=W^r(1,3).$ The same argument shows that $W^r(1,2)=W^r(1,4)$ , just use $\tau=(2\ 4)$ instead.

We now know that

$W^r(1,1) + (n-1)W^r(1,2) = (n-1)^r.$

If we can find a second linear equation satisfied by these numbers, we can solve the resulting system. We will find a second relation by haruspication, which in the present situation means looking at small values of $r.$

For $r=1,$ we have $W^1(1,1)=0$ and $W^1(1,2)=1.$ For $r=2,$ we have $W^2(1,1) = n-1$ and $W^2(1,2)=n-2.$ We thus conjecture that

$W^r(1,2) = W^r(1,1) + (-1)^{r-1},$

and we can prove this by induction on $r.$ The base step of the induction has been completed; the inductive step is to show that

$W^{r+1}(1,2) = W^{r+1}(1,1) + (-1)^r.$

We have

$W^{r+1}(1,2) = W^r(1,1) + \overline{W^r(1,2)}+ \dots + W^r(1,n)$

and

$W^{r+1}(1,1)= \overline{W^r(1,1)} + W^r(1,2)+\dots+W^r(1,n),$

where in both equations the overline denotes an omitted term. Subtracting the second equation from the first, we get

$W^{r+1}(1,2)-W^{r+1}(1,1)=W^r(1,1)-W^r(1,2)=-(-1)^{r-1}=(-1)^r,$

where the second equality is the induction hypothesis.

We have arrived at a system of two linear equations in two unknowns,

$W^r(1,1) + (n-1)W^r(1,2)=(n-1)^r \quad\text{ and }\quad W^r(1,1)-W^r(1,2)=(-1)^r.$

Subtracting the second equation from the first, we arrive at

$W^r(1,2) = \frac{(n-1)^r -(-1)^r}{n}.$

Now substitute this value into the second equation to arrive at

$W^r(1,1) = (-1)^r+ \frac{(n-1)^r -(-1)^r}{n}=\frac{(n-1)^r+ (-1)^r(n-1)}{n}.$

Math 202C: Lecture 0

This post serves as the syllabus for Math 202C.

Lecturer: Jonathan Novak, office hours immediately after lectures.
Grader: Johann Birnick, office hours TBA.
Topics: Symmetric groups, multilinear algebra, algebraic varieties.
Textbook: Of course not. Useful references are this, this, and this.
Evaluation: Grade assigned based on problem sets, no exam.
Problem sets: GradeScope submission, typeset your work unless calligraphy is a hobby. If you’re using AI to solve the problems, please use it to LaTeX the solutions.

DATE	TOPIC	Modality
March 30, Lecture 1	Graphs and operators	In-person lecture
April 1, Lecture 2	Cayley graphs	In-person lecture
April 3, Lecture 3	Level operators on Cayley graphs	Asynchronous lecture post
April 6, Lecture 4	Jucys-Murphy elements	In-person lecture
April 8, Lecture 5	Polynomials and symmetric polynomials	In-person lecture
April 10, Lecture 6	Elementary symmetric polynomials	In-person lecture
April 13, Lecture 7	Elementary-to-monomial transition	Asynchronous lecture post
April 15, Lecture 8	Fundamental theorem of symmetric polys	Asynchronous lecture post
April 17, Lecture 9	None	Cancelled
April 20	Review and preview, content and context	In-person lecture
April 22	Farahat-Higman theorem	In-person lecture
April 24	Simple branching	In-person lecture
April 27	Centralizers and multiplicity-free branching	In-person lecture
April 29	Multiplicity-free branching for symmetric groups	In-person lecture
May 1	GT-basis and GT-algebra	In-person lecture
May 4
May 6
May 8
May 11
May 13
May 15
May 18
May 20
May 22
May 27
May 29
June 1
June 3
June 5

Schedule subject to change

Math 202C: Lecture 3

*** All problems in this lecture due April 12 at 23:59 ***

We return to the general setting of a finite, connected, simple graph $G=\mathrm{Cay}(S,T)$ , and as before $\mathcal{C}(S)$ the Hilbert space with orthonormal basis $|\pi\rangle, |\rho\rangle,|\sigma\rangle,\dots$ . In Lecture 1 we introduced the metrics level set linear operators $L_0,L_1,L_2,\dots$ in $\mathrm{End}\mathcal{C}(S)$ which act on vertices according to

$L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle.$

We argued the case that these operators are quite fundamental: $L_0=I$ is the identity operator, $L_1=K$ is the adjacency operator,

$D = \sum\limits_{r=0}^\infty rL_r$

is the distance operator,

$\Omega_q=\sum\limits_{r=0}^\infty q^r L_r$

is the exponential distance operator. However, the selfadjoint operators $L_0,L_1,L_2,\dots$ and the subalgebra of $\mathrm{End}\mathcal{C}(S)$ they generate are not easy to understand.

Problem 3.1. Show that $\langle \sigma|L_sL_r|\rho\rangle$ is the cardinality of the intersection of the sphere of radius $r$ centered at $\rho$ with the sphere of radius $s$ centered at $\sigma$ .

So, the matrix elements of the commutator

$[L_s,L_r]=L_sL_r - L_rL_s$

are differences of intersection numbers,

and if it seems difficult to characterize graphs for which all such differences are zero, that’s because it is. There is one much-studied class of graphs which does have the property that metric level set operators $L_0,L_1,L_2,\dots$ commute.

Problem 3.2. Show that if $G$ is a distance regular graph, then $[L_r,L_s]=0.$

The following is the only restricted situation in which I personally know a simple combinatorial characterization of commutativity of the metric level set operators (it is likely that an actual graph theorist would know more).

Problem 3.3. Suppose $G$ has diameter $2.$ Prove that the metric level set operators on $G$ commute if and only if $G$ is a regular graph. (Hint: the basic reason behind this is that in a graph of diameter $2,$ the metric does not contain any more information than the adjacency relation).

I am torturing you (us, really) with these combinatorial considerations because I want to further hammer home the point that Cayley graphs are much easier to understand than general graphs due to the underlying group structure, which is reflected in the behavior of linear operators acting on the vertices of the graph.

Suppose that $S=\{\pi,\rho,\sigma,\dots\}$ is a finite group, so that $\mathcal{C}(S)$ is the convolution algebra of this group. For any arbitrary subset $A \subseteq S,$ we write

$|A\rangle=\sum\limits_{\pi \in A}|\pi\rangle,$

which is simply the indicator function of the set $A$ . It is a very convenient abuse of notation to also simply write $A$ for the image of $|A\rangle$ in the right regular representation, i.e. we identity the set $A \subseteq S$ with the operator $A \in \mathrm{End}\mathcal{C}(S)$ defined by

$A|\rho\rangle = |\rho\rangle|A\rangle=\sum\limits_{\pi \in A} |\rho\pi\rangle.$

Recall from Math 202B that we have a tracial state

$\langle \cdot \rangle \colon \mathcal{C}(S) \longrightarrow \mathbb{C}$

on the convolution algebra of $S$ defined by

$\langle A \rangle := \langle \iota |A\rangle,$

where $\iota \in S$ is the group identity. In other words, $\langle A \rangle$ is the coefficient of the vector $|A\rangle$ in the group basis $|\pi\rangle,$ which is the same thing as the value of the function $A$ at the point $\pi=\iota.$ Here is a second under-appreciated characterization of this important functional, which is often referred to as the canonical group trace.

Problem 3.4. Prove that all diagonal matrix elements $\langle \pi | A |\pi\rangle$ of the image $A \in \mathrm{End}\mathcal{C}(S)$ of $|A\rangle\in \mathcal{C}(S)$ in the regular representation are equal to the same number, and that that number is $\langle A \rangle.$ Hence conclude (as was already shown in Math 202B) that the $\langle A \rangle =\frac{1}{|S|}\mathrm{Tr} A.$

Now to Cayley graphs. We continue with a finite group $S$ and select inside it a symmetric set of generators $T$ , giving us the Cayley graph $G = \mathrm{Cay}(S,T).$ As discussed above, we may also view $T \in \mathrm{End}\mathcal{C}(S)$ as the adjacency operator on $G.$

Problem 3.5. Show that $\langle T^r\rangle$ is the number of length $r$ loops based at the identity element (or any other specified point) in $S.$

Now specialize to the case where $S=\mathrm{Aut}\{1,\dots,n\}$ is the symmetric group and $T$ is the full conjugacy class of transpositions. The metric level set operators $L_0,L_1,L_2,\dots \in \mathrm{End}\mathcal{C}(S)$ commute for a very simple reason: they are the images of the vectors

$|L_r \rangle = \sum\limits_{\substack{\pi \in S \\ |\pi|=r}} |\pi\rangle, \quad r=0,1,2,\dots,$

and these vectors lie in the center of the convolution algebra $\mathcal{C}(S)$ . Indeed, the center of $\mathcal{C}(S)$ is spanned by the conjugacy classes

$|C_\alpha\rangle = \sum\limits_{\pi \in C_\alpha} |\pi\rangle, \quad \alpha \in \Lambda,$

and

$|L_r\rangle = \sum\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}} |C_\alpha\rangle$

is simply the sum of all conjugacy classes corresponding to partitions of $n$ with $n-r$ parts.

The fact that the level set operators $L_0,L_1,L_2,\dots$ on a graph $G$ commute is equivalent to saying that they are simultaneously diagonalizable. However, in the group-theoretic setting this statement can be refined. Let us stick with the case where $S=\mathrm{Aut}\{1,\dots,n\}$ is the symmetric group, noting that what follows is true for the Cayley graph $G=\mathrm{Cay}(S,T)$ of any finite group $S$ with generating set $T \subseteq S$ such that $|T\rangle$ is a central element in the convolution algebra $\mathcal{C}(S)$ . Let $\Lambda$ be a set indexing the conjugacy classes in $S,$ so for the symmetric group $\Lambda$ is the set of partitions of $n$ . Then $\Lambda$ also indexes isomorphism classes of irreducible unitary representations of $S,$ or equivalently irreducible linear representations of $\mathcal{C}(S).$ For each $\lambda \in \Lambda,$ choose a representative $(V^\lambda,\Phi^\lambda)$ of the corresponding isomorphism class. By Schur’s Lemma, any central element $|A\rangle \in \mathcal{C}(S)$ acts in each irrep $(V^\lambda,\Phi^\lambda)$ as a scalar operator: we have

$\Phi^\lambda|A\rangle = \widehat{A}(\lambda) I_{V^\lambda}, \quad \widehat{A}(\lambda) \in \mathbb{C}.$

Let us apply this to the adjacency operator on the all-transpositions Cayley graph $G=\mathrm{Cay}(S,T)$ of the symmetric group, which is the image of the central element

$|T\rangle = \sum\limits_{\tau \in T} |\tau\rangle= \sum\limits_{1 \leq s<t \leq n} |s\ t\rangle.$

The eigenvalues of the adjacency operator $T \in \mathrm{End}\mathcal{C}(S)$ are the numbers

$\widehat{T}(\lambda) = {n \choose 2} \frac{\chi^\lambda(\tau)}{\dim V^\lambda}, \quad \lambda \in \Lambda,$

where $\tau \in T$ is any transposition. Moreover, where the multiplicity of $\widehat{T}(\lambda)$ is $(\dim V^\lambda)^2$ because the isotypic decomposition of the regular representation is

$\mathcal{C}(S) = \bigoplus\limits_{\lambda \in \Lambda} (\dim V^\lambda)V^\lambda.$

Thus, in a sense, we completely know the spectrum of the adjacency operator on the all-transpositions Cayley graph of the symmetric group. However, to actually compute these numbers explicitly, we need to know both the dimension

$\dim V^\lambda = \chi^\lambda(\iota) = \mathrm{Tr}\varphi^\lambda(\iota)$

of every irreducible unitary representation $(V^\lambda,\varphi^\lambda)$ of $S$ , as well as the character

$\chi^\lambda(\tau) = \mathrm{Tr} \varphi^\lambda(\tau)$

of a transposition in each representation.

This highlights the fact that we really only know the irreducible unitary representations $(V^\lambda,\varphi^\lambda)$ of $S=\mathrm{Aut}\{1,\dots,n\}$ abstractly. In particular, while we know that the irreducible representations of $S$ are (up to isomorphism) in bijection with the conjugacy classes in $S$ , we have not explicitly selected such a bijection and it is not at all clear how best to do so.

Nevertheless, there are a few things we can say concretely at this stage. First, as with every group, $S$ has the trivial representation $(V^\mathsf{triv},\varphi^\mathsf{triv})$ in which $V^\mathsf{triv}$ is a one-dimensional Hilbert space and $\varphi^\mathsf{triv}$ sends every $\sigma \in S$ to the identity operator on this space. For this representation, we have

$\widehat{T}^\mathsf{triv}={n \choose 2} \frac{\chi^\mathsf{triv}(\tau)}{\dim V^\mathsf{triv}}= {n \choose 2},$

and we have reproduced the fact that the degree of the regular graph $G=\mathrm{Cay}(S,T)$ is an eigenvalue of its adjacency operator.

Second, the symmetric group $S$ has a second one-dimensional representation $(V^\mathsf{alt},\varphi^\mathsf{alt})$ in which

$\varphi^\mathsf{alt} = (-1)^{|\sigma|}I.$

This gives the eigenvalue

$\widehat{T}^\mathsf{alt}={n \choose 2} \frac{\chi^\mathsf{alt}(\tau)}{\dim V^\mathsf{alt}}=- {n \choose 2},$

which could also be deduced from the fact that the spectrum of the adjacency operator on a bipartite graph is symmetric about zero.

We can compute a third eigenvalue of the adjacency operator on the all-transpositions Cayley graph which is not directly accessible without representation-theoretic methods. To do so, we start with the permutation representation $(V^\mathsf{perm},\varphi^\mathsf{perm})$ , in which $V$ is the Hilbert space with orthonormal basis $|1\rangle,\dots,|n\rangle$ and $\varphi^\mathrm{perm}(\sigma)|i\rangle=|\sigma(i)\rangle.$ The character of this representation is

$\chi^\mathsf{perm}(\sigma) = \sum\limits_{i=1}^n \langle i | \varphi^\mathsf{perm}(\sigma) | i\rangle = \sum\limits_{i=1}^n \langle i|\sigma(i)\rangle=\mathsf{fix}(\sigma),$

the number of fixed points of the permutation $\sigma$ . However, the permutation representation is not irreducible, as it has a one-dimensional invariant subspace spanned by the vector

$|w\rangle = |1\rangle + \dots + |n\rangle.$

The orthogonal complement of this vector gives an $(n-1)$ -dimensional representation $(V^\mathsf{std},\varphi^\mathsf{std})$ of $S$ called the standard representation, whose character satisfies

$\chi^\mathsf{std}+\chi^\mathsf{triv}=\chi^\mathsf{perm}$

and is therefore given by

$\chi^\mathsf{std}(\sigma)=\mathsf{fix}(\sigma)-1.$

Problem 3.6. Prove that the standard representation of $S=\mathrm{Aut}\{1,\dots,n\}$ is irreducible, and give a third eigenvalue of the adjacency operator on the all-transpositions Cayley graph $G=\mathrm{Cay}(S,T).$

Math 202C: Lecture 2

*** Problems in this lecture due April 10 at 23:59 ***

In this lecture, we remain in the setting of the symmetric group $S=\mathrm{Aut}\{1,\dots,n\}$ but we will change the geometry on this group. More precisely, instead of identifying $S$ with its Cayley graph as generated by the set of adjacent transpositions, we will identify the symmetric group with its Cayley graph as generated by the full conjugacy class of transpositions,

$T = \{(s\ t) \in S \colon 1 \leq s < t \leq n\}.$

Let $\mathrm{d}$ denote the corresponding metric on $S,$ so that $\mathrm{d}(\rho,\sigma)$ is the minimal number $r$ such that there exist transpositions $\tau_1,\dots,\tau_r \in T$ with

$\sigma = \rho\tau_1 \dots \tau_r.$

As before, write $|\pi|=\mathrm{d}(\iota,\pi)$ for the distance from the identity permutation $\iota$ to a given permutation $\pi.$

Problem 2.1. Prove that $|\pi|=n-\mathrm{cyc}(\pi),$ where $\mathrm{cyc}(\pi)$ is the number of factors in the disjoint cycle decomposition of $\pi.$

Identifying $S$ with $\mathrm{Cay}(S,T),$ the symmetric group becomes a graded graph: it decomposes into a disjoint union

$S = L_0 \sqcup L_1 \sqcup \dots \sqcup L_{n-1}$

in which

$L_r = \{\pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathsf{cyc}(\pi)=n-r\}$

is the sphere of radius $r$ centered at $\iota,$ and $L_r$ is an independent set, meaning that no two of its points are adjacent vertices. Moreover, if $\rho,\sigma \in S$ are adjacent vertices, then we have $\rho \in L_r$ and $\sigma \in L_s$ with $|r-s|=1.$ The decomposition of a group into concentric spheres is a general thing that happens whenever one chooses a generating set; a special feature of this particular situation is that each sphere further decomposes into a disjoint union of conjugacy classes,

$L_r = \bigsqcup\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}}C_\alpha.$

Here, we are indexing the conjugacy classes of $S$ by the set $\Lambda$ of partitions of $n$ . More precisely, the notation $\alpha \vdash n$ means that $\alpha$ is a decomposition of $n$ into a sum of $\ell(\alpha)$ positive integers, without regard to order of the terms. Every permutation $\pi \in S$ gives a partition $\mathsf{type}(\pi)\vdash n$ obtained from the lengths of the cycles in its disjoint cycle decomposition; conversely, for any partition $\alpha \vdash n$ the corresponding fiber $C_\alpha =\mathsf{type}^{-1}(\alpha)$ is a conjugacy class in $S.$

Let us now construct the analogue of Mallows measure corresponding to the all-transpositions geometry. This is called the Ewens measure on permutations: by definition, it is the Gibbs measure

$\mathbb{P}(\sigma) = \frac{q^|\sigma|}{Z} = \frac{q^{n-\mathsf{cyc}(\sigma)}}{Z}$

corresponding to a statistical mechanical system whose states $\sigma$ are permutations of $1,\dots,n$ with the potential energy of $\sigma$ being inversely proportional to its number of cycles. The partition function is

$Z= \sum\limits_{\sigma \in S} q^{|\sigma|} = q^n\sum\limits_{\sigma \in S} q^{-\mathsf{cyc}(\sigma)},$

or equivalently

$Z=\sum\limits_{k=0}^n \begin{bmatrix} n \\ k \end{bmatrix} q^{-k},$

where $\begin{bmatrix} n \\ k \end{bmatrix}$ is the Stirling number of the first kind, which counts permutations in $S=\mathrm{Aut}\{1,\dots,n\}$ whose disjoint cycle factorization has $k$ components. This in turn gives

$Z= (1+q)(1+2q) \dots (1+(n-1)q),$

corresponding to the fact that Stirling Numbers of the first kind are the coefficients of the falling factorial polynomial.

Problem 2.2. Show that the expected number of cycles in a random permutation under Ewens measure is

$1 + \sum\limits_{j=1}^{n-1} \frac{1}{1+jq}.$

Deduce from this that the expected number of cycles in a uniformly random permutation of $1,\dots,n$ is the harmonic number

$H_n=1+\frac{1}{2} + \dots + \frac{1}{n}.$

The all-transpositions geometry on permutations has many features which make it easier to understand than the adjacent-transpositions geometry. Perhaps the most significant of these is that it is “Euclideanizable.” Right, that’s not a real word – here is what is meant more precisely.

Consider the full forward cycle $\gamma=(1\ 2\ \dots\ n)$ in the symmetric group $S$ . In the all-transpositions geometry, we have $|\gamma|=n-1$ , and the number of geodesics $\iota \to \gamma$ is $\mathrm{Cay}_{n-1},$ where

$\mathrm{Cay}_n = (n+1)^{n-1}$

is the $n$ th Cayley number. Taking this formula for granted, we can calculate the number of geodesics from the identity to any permutation (hence the number of geodesics between any two permutations) as follows.

Problem 2.3. Let $\pi$ be any permutation in the conjugacy class $C_\alpha$ of $S$ . Show that the number of geodesics $\iota \to \pi$ is

${n \choose \alpha_1-1,\dots,\alpha_\ell-1} \prod\limits_{k=1}^\ell \mathrm{Cay}_{\alpha_k-1},$

where $\ell=\ell(\alpha).$

To Euclideanize the all-transpositions geometry on the symmetric group, we give a rule which allows one to systematically select a distinguished geodesic between every pair of points. To do so, let us label the edges of $\mathrm{Cay}(S,T)$ as follows: we mark each edge corresponding to the transposition $(s\ t)$ with $t$ , the larger of the two numbers swapped. Thus, emanating from every vertex of the graph we have one $2$ -edge, two $3$ -edges, three $4$ -edges, etc. Call a walk on the Cayley graph strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. Thus, an $r$ -step strictly monotone walk $\rho \to \sigma$ corresponds to an equation of the form

$\sigma=\rho(s_1\ t_1) \dots (s_r\ t_r), \quad 2\leq t_1 < \dots < t_r \leq n$

in $S.$

Problem 2.4. Prove that for any $\pi \in S$ , there exists a unique strictly monotone walk $\iota \to \pi$ , and that the length of this walk is $|\pi|.$ Conclude that there exists a unique strictly monotone walk between every pair of points in $S$ , and that this walk is a geodesic.

This more or less marks the end of my efforts to get you thinking about the combinatorics of permutations; after this we will move on to the algebraic aspects of the symmetric group and its representations. The campaign for the elementary beauty and unexpected subtlety of permutations met with mixed success: there were interesting follow-up discussions with some students, whereas other pairs of eyes glazed over as the minds behind them fell into the “I already know this stuff” trap, which is one the of the more pernicious lies you can tell yourself.

Here is one final push. Our ever-curious Lani was recently asking about posets, and whether there is any active research being conducted on them these days. The answer is definitely “yes,” and though I don’t really work in the area I do have one burning question about posets which keeps me up at night. The question is about finding a partial order on permutations of a fixed genus.

From our Euclideanized perspective, genus is a metric invariant which measures “flatness” of triangles $S$ as generated by the full conjugacy class of transpositions. More precisely, let $\pi,\rho,\sigma$ be three distinct permutations in $S=\mathrm{Aut}\{1,\dots,n\}$ . There are two oriented strictly monotone triangles formed by these three points, namely $\rho \to \pi \to \sigma \to \rho$ and $\sigma \to \pi \to \rho \to \sigma.$ It may happen $\pi$ lies on a geodesic $\rho \to \sigma,$ in which case both of these triangles are flat, and have perimeter

$|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma|.$

In general, the oriented triangle $\rho \to \pi \to \sigma \to \rho$ is a cycle in a bipartite graph, so necessarily has even perimeter: we have

$|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma| + 2g(\rho,\pi,\sigma)$

for some nonnegative integer $g(\rho,\pi,\sigma)$ which we call the genus of the triple $(\rho,\pi,\sigma)$ . At the other extreme from flatness, it may be that $\pi,\rho,\sigma$ are pairwise antipodal points, i.e. the distance between each pair of them is the diameter of $S,$ in which case the above equation becomes

$3(n-1)=2(n-1)+2g(\rho,\pi,\sigma),$

giving a maximal genus of

$g(\pi,\rho,\sigma) = \lfloor \frac{n-1}{2} \rfloor.$

Problem 2.5. Show that a triple of pairwise antipodal points in $S$ exists if and only if $n$ is odd.

Since

$g(\rho,\pi,\sigma)=g(\iota,\rho^{-1}\pi,\rho^{-1}\sigma),$

it is natural to view genus as a function $g(\pi,\sigma):=g(\iota,\pi,\sigma)$ on pairs of permutations. Going even further, let us fix the second argument to be $\sigma=\gamma$ , where $\gamma=(1\ 2\ \dots\ n)$ is the full forward cycle, and define the genus of a permutation by $g(\pi):=g(\pi,\gamma).$ Then, we obtain a corresponding decomposition of the symmetric group

$S = \bigsqcup\limits_{k=0}^{\lfloor \frac{n-1}{2} \rfloor} S_g$

in which $S_g$ is the set of permutations of genus $g$ . The set $S_0$ of “planar” permutations consists precisely of those $\pi$ which lie on a geodesic $\iota \to \gamma,$ and these permutations have a special feature: the disjoint cycle decomposition of any $\pi \in S_0$ is a noncrossing partition of $\{1,\dots,n\},$ and in fact this map from genus zero permutations into noncrossing permutations is bijective. There is also a very natural partial order on $S_0$ which can be described as follows: we have $\pi \leq \sigma$ if and only if $\pi$ and $\sigma$ lie on a common geodesic $\iota \to \gamma$ , and a particle emanating from $\iota$ and traveling along this geodesic passes through $\pi$ before passing through $\sigma.$ This is a special case of the geodesic partial order which can be defined relative to any two vertices of a connected graph, and in this case it coincides exactly with the reverse refinement order on noncrossing partitions.

Now the question is: what is the higher genus analogue of this partial order? How can we define a poset structure by some uniform recipe which feels good for all values of $g$ , subject to the condition that our recipe reverts to the above at $g=0$ ? This is an open question. The most natural thing that I can think of is the following: given two permutations $\pi,\sigma \in S_g$ , we declare $\pi \leq \sigma$ if and only if they lie on a common genus $g$ triangle two of whose vertices are $\iota$ and $\gamma,$ and a particle emanating from $\iota$ and traveling along the perimeter of this triangle encounters $\pi$ before encountering $\sigma$ . Whether or not this actually defines a partial order on $S_g$ is not clear to me, but my fluid intelligence is declining from its peak while yours is cresting. So, if you want to discuss further please feel free to get in touch.

Math 202B: Revenge of the Centralizer

Yesterday during office hours I had the unsettling experience of being shown that something I assumed to be true is, in fact, subtly incorrect. That’s what happens when you take things for granite. My sloppy thinking was pointed out to me by Amelia and Eliza in the context of the following problem which appeared on the Math 202B Exam: prove that if $\mathcal{B}$ is a maximal abelian subalgebra of an algebra $\mathcal{A}$ , then the centralizer $Z(\mathcal{B},\mathcal{A})$ of $\mathcal{B}$ in $\mathcal{A}$ equals $\mathcal{B}.$

This was supposed to be easy. Since $\mathcal{B}$ is abelian, it is certainly contained in its centralizer $Z(\mathcal{B},\mathcal{A}).$ Now, if the containment is proper, then there exists an element $C \in Z(\mathcal{B},\mathcal{A})$ which does not lie in $\mathcal{B}.$ Then,

$\mathcal{C}=\mathrm{CAlg}(\mathcal{B} \sqcup \{C\}),$

the commutative algebra generated by $\mathcal{B}$ with $C$ adjoined, is a commutative algebra properly containing $\mathcal{B}$ , contradicting the fact that $\mathcal{B}$ is a MASA.

The subtle issue here is that $\mathcal{C}$ is by definition the intersection of all commutative subalgebras of $\mathcal{A}$ containing the set $\mathcal{B} \sqcup \{C\},$ and this could conceivably be empty. To be concrete, imagine a single non-normal matrix $C$ . There is no commutative subalgebra of the full matrix algebra containing $C$ , since our definition of “algebra” requires $*$ -closure, and by hypothesis $C$ does not commute with its adjoint.

Luckily, our attempted solution can be salvaged: as Evan immediately recognized, if we can show that $Z(\mathcal{B},\mathcal{A})$ contains a *selfadjoint* element $D$ not contained in $\mathcal{B}$ this issue goes away. This is the case, since we can take $D$ to be either the real or imaginary part of $C$ — at least one of these does not lie in $\mathcal{B}$ , since if they both did so would $C.$

Hopefully I’m now getting this right, and my apologies for being such a bad professor. Try reading some of my papers, I swear they’re really good.

In any case, we will have to incorporate this correction into our census of reality. In particular, if this problem were to appear on the qual, a proposed solution which did not address this issue would no longer pass muster. Live and learn.

Math 202C: Lecture 1

***Problems in this lecture due 23:59 on April 8***

Let $G=(S,T)$ be a finite, connected, simple graph with vertex set $S=\{\pi,\rho,\sigma,\dots\}$ and edge set $T \subseteq {S \choose 2}.$ A basic (meaning fundamental, not necessarily easy) problem is to compute the number $W^r(\rho,\sigma)$ of $r$ -step walks from $\rho$ to $\sigma$ in $G.$

Problem 1.1. Let $G$ be the complete graph on $S=\{1,\dots,N\},$ whose vertex set is $T={S \choose 2}.$ Using nothing but your brain, find a formula for $W^r(\rho,\sigma).$ If you find this problem intractable or, worse, uninteresting, consider a degree in law or medicine.

Math 202A provides an algebraic encoding of graphs. Let $\mathcal{C}(S)$ be the Hilbert space with orthonormal basis $|\rho\rangle,|\pi\rangle,|\sigma\rangle,\dots$ . Let $\langle \rho|,\langle \pi|,\langle \sigma|,\dots$ be the dual basis of $\mathrm{Hom}(\mathcal{C}(S),\mathbb{C}),$ so that

$\langle \rho | \sigma \rangle = \langle \rho,\sigma \rangle = \delta_{\rho\sigma}.$

Does using Dirac notation make this an applied math course? The question is vacuous since there is no such thing as non-applied math, but there are cultural differences between fields which are reflected in their iconography, and hopefully you can at least read a paper in quantum computing at the end of this sequence.

The adjacency operator $K\in \mathrm{End}\mathcal{C}(S)$ acts on the vertex basis of $\mathcal{C}(S)$ by creating a flat superposition of all vertices adjacent to a given input,

$K|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \{\rho,\sigma\} \in T}} |\sigma\rangle.$

Equivalently, the matrix elements of $K$ in the vertex basis are

$\langle \sigma | K |\rho\rangle=\begin{cases} 1, \text{ if } \{\rho,\sigma\} \in T\\ 0, \text{ if } \{\rho,\sigma\} \not\in T\end{cases}.$

Thus, the selfadjoint operator $K \in \mathrm{End}\mathcal{C}(S)$ encodes the edge set $T \subseteq {S\choose 2}.$

For example, consider the complete graph $G=(S,T)$ on the vertex set $S=\{1,\dots,N\}.$ The computational basis in $\mathcal{C}(S)$ consists of the vectors

$|1\rangle, |2\rangle,\dots,|N\rangle,$

upon which $A$ acts according to

$K| i\rangle = \sum\limits_{j \neq i} |j\rangle.$

Problem 1.2. Show that $W^r(\rho,\sigma) = \langle \rho| K^r|\sigma\rangle.$ In particular, show that the trace of $K^r$ equals the total number of loops (i.e. closed walks) of length $r$ in $G.$

The linear algebra approach to counting walks in graphs is to find a basis of $\mathcal{C}(G)$ on which $K$ acts diagonally.

Problem 1.3. Implement this approach for the complete graph and recover your barehands solution to Problem 1.1.

The mathematical subject which uses eigenvalues of linear operators to analyze graphs is called spectral graph theory. In particular, the spectrum of the adjacency operator $K$ on a given graph $G=(S,T)$ is called the adjacency spectrum of $G$ . A couple of facts worth mentioning: if $G$ is bipartite then the spectrum of $K$ is symmetric about $0$ ; if $G$ is degree-regular then the largest eigenvalue of $K$ is the degree of any vertex.

Instead of asking how many walks there are between two given vertices, we may ask how far apart they are. This information is encoded in the distance operator $D \in \mathrm{End}\mathcal{C}(S)$ , whose matrix elements in the vertex basis tabulate geodesic distance in $G$ ,

$\langle \sigma | D | \rho\rangle = \mathrm{d}(\rho,\sigma).$

Ron Graham made the striking discovery that the determinant of the distance operator on a tree depends only on the cardinality of its vertex set. Distance eigenvalues of graphs have been much studied ever since. I found some recent interesting work on distance operators via Johann Birnick’s website; Johann is the TA for this course.

One can in a sense interpolate between $K$ and $D$ . To do so, we consider operators $L_0,L_1,L_2,\dots \in \mathcal{C}(S)$ corresponding to metric level sets in $G=(S,T)$ , meaning that

$\langle \sigma | L_r |\rho \rangle = \begin{cases} 1, \text{ if }\mathrm{d}(\rho,\sigma)=r \\ 0, \text{ otherwise}\end{cases}.$

Equivalently,

$L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle$

is the sum of all points on the sphere of radius $r$ centered at $\rho$ . We have that $L_0=I$ is the identity operator, $L_1 = K$ is the adjacency operator, and

$D=\sum\limits_{r=0}^\infty rL_r.$

This is in fact a finite sum, since $L_r$ is the zero operator for all $r$ exceeding the diameter of $G$ .

One can also consider the exponential adjacency operator

$\Psi_t= \sum\limits_{r=0}^\infty \frac{t^r}{r!} K^r =: e^{tK}$

on $G$ , whose matrix elements are exponential generating functions enumerating walks between vertices by length,

$\langle \sigma | \Psi_t |\rho \rangle = \sum\limits_{r=0}^\infty \frac{t^r}{r!} W^r(\rho,\sigma).$

For regular graphs this is essentially the heat kernel, which has been much studied. The exponential distance operator,

$\Omega_q = \sum\limits_{r=0}^\infty q^r L_r,$

is the element-wise exponential of the distance operator,

$\langle \sigma | \Omega_q |\rho\rangle= q^{\mathrm{d}(\rho,\sigma)},$

and unlike $\Psi_t$ it is a relative newcomer in spectral graph theory.

Looking at the power series which define $\Psi_t$ and $\Omega_q,$ one sees that the level set operators $L_0,L_1,L_2,\dots$ are to $\Omega_q$ as powers $K^0,K^1,K^2,\dots$ of the adjacency operator are to $\Psi_t.$ While the first two elements of these sequences are the same, a big flaw in this symbolic analogy is that the subalgebra of $\mathrm{End}\mathcal{C}(S)$ generated by the (finite) set of selfadjoint operators $\{L_0,L_1,L_2,\dots\}$ is highly non-trivial, whereas the subalgebra of $\mathrm{End}\mathcal{C}(S)$ generated by the single selfadjoint operator $K$ simply consists of polynomials in this operator, which is but a small sector of $\mathrm{Alg}(L_0,L_1,L_2,L_3,\dots).$ In particular, it is not at all clear what relations exist between $L_r$ and $L_s$ , or even whether they commute. We will have more to say about this later.

The exponential distance operator $\Omega_q$ encodes the same information as the distance operator $D$ in a manner which interfaces more readily with the formalism of statistical mechanics. More precisely, let us write $q=e^{-\beta}$ so that

$\langle \sigma | \Omega_q | \rho \rangle = e^{-\beta \mathrm{d}(\rho,\sigma)}.$

Then, fixing a root vertex $\rho$ , the $\rho$ th row of $\Omega_q$ defines Gibbs measure (after Josiah, not Amelia) on the vertices of $G$ according to

$\mathbb{P}(\sigma) = \frac{1}{Z}e^{-\beta |\sigma|}.$

Physically, this distribution corresponds to a thermodynamic ensemble whose states are the vertices of $G$ , with $\beta>0$ the inverse temperature and the potential energy $|\sigma|$ of a state $\sigma$ being equal to its distance $|\sigma|:=\mathrm{d}(\rho,\sigma)$ to the root vertex $\rho$ in $G$ . The partition function

$Z= \sum\limits_\sigma q^{|\sigma|} = \sum\limits_{\sigma} e^{-\beta\mathrm{d}(\rho,\sigma)}$

sums the energy over all states, ensuring that $\mathbb{P}$ is a probability measure on $S$ . Its logarithm $\log Z$ is the free energy of our ensemble, and

$q \frac{\partial}{\partial q} \log Z = \frac{1}{Z} \sum\limits_{\sigma} |\sigma|q^{|\sigma|}=\sum\limits_\sigma |\sigma| \mathbb{P}(\sigma)$

is precisely the average distance to the root vertex $\rho$ under the non-uniform distribution $\mathbb{P},$ which becomes the uniform distribution on states in the high-temperature limit $\beta \to 0$ limit, where $q=e^{-\beta} \to 1$ .

Let us try this out on a toy example, the complete graph $G=(S,T)$ on $S=\{1,\dots,N\}$ . Take the root vertex to be $1$ , so that the potential energy of any other vertex $\sigma \in \{1,\dots,N\}$ is

$|\sigma| = \mathrm{d}(1,\sigma)=1.$

Thus, under the corresponding Gibbs measure on states $S=\{1,\dots,N\}$ we have

$\mathbb{P}(1) = \frac{1}{Z}$

and

$\mathbb{P}(\sigma) = \frac{q}{Z}, \quad \sigma \neq 1,$

where the partition function is

$Z= q^0+q^1+ \dots + q^1 = 1+(N-1)q.$

We thus have

$q\frac{d}{dq} \log Z = \frac{(N-1)q}{1+(N-1)q},$

and setting $q=1$ we recover the obvious fact that the average distance to $1$ under the uniform measure on the complete graph is $\frac{N-1}{N}.$ Indeed, for the uniform measure on the complete graph the random variable $\sigma \mapsto |\sigma|$ has a Bernoulli distribution on $\{0,1\}$ which assigns probability $\frac{1}{N}$ to $0$ and probability $\frac{N-1}{N}$ to $1$ .

Linear operators on graphs fall under the broad umbrella of Math 202A, which treated linear maps between finite-dimensional Hilbert spaces. Math 202B gives us additional algebraic tools which apply to graphs whose vertices form a group; these are officially known as Cayley graphs.

Let $S$ be a finite group and $T \subset S$ a symmetric set of generators in $G$ . Symmetric means that $\tau \in T$ implies $\tau^{-1} \in T$ , and saying that $T$ generates $S$ means that for any $\sigma \in S$ there exists $r \in \mathbb{N}_0$ and $\tau_1,\dots, \tau_r \in T$ such that $\sigma=\tau_1\tau_2 \dots \tau_r.$ The Cayley graph $\mathrm{Cay}(S,T)$ has vertex set $S$ , with $\{\rho,\sigma\}$ an edge if and only if $\sigma = \rho\tau$ for some $\tau \in T,$ which is then necessarily unique. The fact that $T$ is symmetric means that this is a well-defined as an undirected graph without multiple edges, and the fact that $T$ generates $S$ means that the graph is connected. If $T$ does not contain the identity element $\iota$ then no vertex is adjacent to itself, and we henceforth assume this to be the case. We see that for Cayley graphs $G=\mathrm{Cay}(S,T),$ the edge set can effectively be viewed as a subset of the vertex set, since the generating $T \subseteq S$ determines the adjacency relation on vertices.

For a Cayley graph $G=\mathrm{Cay}(S,T),$ the space $\mathcal{C}(S)$ carries not just a scalar product but also a vector product, namely convolution: we have

$|\pi\rangle|\sigma\rangle = |\pi \sigma\rangle, \quad \pi,\sigma \in S,$

where $\pi\sigma$ is the group product of $\pi$ and $\sigma.$ This gives us a new point of view on the adjacency operator $K$ on $G=\mathrm{Cay}(S,T).$ For any subset $P \subseteq S,$ let

$|P\rangle = \sum\limits_{\pi \in P} |\pi\rangle$

be the corresponding element of $\mathcal{C}(S)$ . Thus in particular

$\langle P|Q\rangle = |P \cap Q|.$

Let us allow the following extremely useful abuse of notation: we will also identify $P \subset S$ with its image in the (right) regular representation of $\mathcal{C}(S)$ . Thus,

$P|\rho \rangle = |\rho\rangle |P\rangle = \sum\limits_{\pi \in P} |\rho\pi\rangle.$

In particular, $T$ is both our chosen generating set of $S$ , and also its image in the regular representation is the adjacency operator on $\mathrm{Cay}(S,T),$

$T|\rho\rangle = |\rho\rangle |T\rangle = \sum\limits_{\tau \in T} |\rho\tau\rangle.$

More generally, let

$L_r = \{ \pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathrm{d}(\iota,\pi)=r\}$

be the sphere of radius $r$ centered at the identity element. This set becomes a vector in $\mathcal{C}(S)$ via quantization,

$|L_r\rangle = \sum\limits_{\pi \in L_r} |\pi\rangle,$

and this vector becomes an operator in $\mathrm{End}\mathcal{S}$ via the regular representation,

$L_r|\rho\rangle = |\rho\rangle |L_r\rangle = \sum\limits_{|\pi|=r} |\rho\pi\rangle.$

The matrix elements of the operator $L_r$ are

$\langle \sigma | L_r |\rho \rangle = \sum\limits_{|\pi|=r} \langle \sigma |\rho\pi\rangle,$

and

$\langle \sigma|\rho\pi\rangle\neq 0 \iff \sigma=\rho\pi \iff \rho^{-1}\sigma=\pi,$

in which case

$\mathrm{d}(\rho,\sigma)=|\rho^{-1}\sigma|=|\pi|=r.$

Thus, the image of $|L_r\rangle$ in the (right) regular representation is indeed the metric level set operator $L_r.$

To sum up, the adjacency eigenvalues of the Cayley graph $\mathrm{Cay}(S,T)$ are the eigenvalues of $|T\rangle \in \mathcal{C}(S)$ acting in the regular representation of $\mathcal{C}(S)$ . This means that the spectral graph theory of Cayley graphs is part of the representation theory of finite groups. If we are in the incredibly fortuitous situation where $G$ is an abelian group, this comes down to using the discrete Fourier transform.

Problem 1.4. Let $C$ be the cyclic group of order $n$ with generator $\gamma$ , and take the symmetric generating set $D=\{\gamma,\gamma^{-1}\}.$ Then $\mathrm{Cay}(C,D)$ is a cycle graph with $n$ vertices. Find a formula for $W^r(\rho,\sigma),$ the number of $r$ -step walks between two given vertices of a cycle graph.

Concerning distance in Cayley graphs $\mathrm{Cay}(S,T)$ , we by definition have that $\mathrm{d}(\rho,\sigma)$ is equal to the minimal value of $r$ such that there exist $\tau_1,\dots,\tau_r$ such that

$\sigma = \rho\tau_1 \dots \tau_r.$

Problem 1.5. Show that for any $\pi,\rho,\sigma \in S$ we have $\mathrm{d}(\pi\rho,\pi\sigma)=\mathrm{d}(\rho,\sigma),$ and explain why the same cannot in general be said for $\mathrm{d}(\rho\pi,\sigma\pi).$ Show that $\mathrm{d}(\rho,\sigma) = \mathrm{d}(\iota,\rho^{-1}\sigma)$ with $\iota$ the group identity.

Now let us consider a nonabelian example: our group is $S=\mathrm{Aut}\{1,\dots,n\},$ the symmetric group consisting of all bijections $\pi \colon \{1,\dots,n\} \to \{1,\dots,n\}$ , which are generally called permutations. We will take the group law to be $\pi\sigma := \sigma \circ \pi,$ which is the convention used in most computer algebra systems, or so I am told. Take the generating set

$T = \{(1\ 2),(2\ 3),\dots,(n-1\ n)\}$

of adjacent transpositions.

Problem 1.6. Draw $\mathrm{Cay}(S,T)$ for $n=3.$

Set $|\pi|=\mathrm{d}(\iota,\pi)$ with $\iota$ the identity permutation.

Now consider the metric Gibbs measure on the Cayley graph $\mathrm{Cay}(S,T)$ of the symmetric group $S=\mathrm{Aut}\{1,\dots,n\}$ corresponding to the generating set $T=\{(1\ 2),(2\ 3),\dots,(n-1\ n)\}$ of adjacent transpositions. We take the root vertex to be $\iota,$ the identity permutation in $S.$ Thus,

$\mathbb{P}(\sigma) = \frac{q^{|\sigma|}}{Z},$

where $|\sigma|=\mathrm{d}(\iota,\sigma)$ is the minimal number of adjacent transpositions whose produce is $\sigma$ .

Problem 1.7. Show that $|\sigma|=\mathsf{inv}(\sigma)$ , where $\mathrm{inv}(\sigma)$ is the cardinality of the set $\{1 \leq i<j \leq n\ \colon\ \sigma(i)>\sigma(j)\}$ of inversions in the permutation $\sigma.$ Hence compute the diameter of $G=\mathrm{Cay}(S,T).$

We conclude that the metric Gibbs measure on $G=\mathrm{Cay}(S,T)$ is

$\mathbb{P}(\sigma)=\frac{1}{Z}q^{\mathsf{inv}(\sigma)}.$

This is a famous probability measure on permutations called the Mallows measure; it arises in theoretical computer science in the analysis of sorting algorithms. It is a remarkable fact that the partition function

$Z=\sum\limits_{\sigma \in S} q^{|\mathsf{inv}(\sigma)|}.$

It is a remarkable fact that this partition function can be evaluated exactly, i.e. this stat mech model is exactly solvable. For any positive integer $k$ , the corresponding $q$ -number is

$[k]_q = 1+q+ \dots + q^{k-1},$

so that in particular $[k]_1=k$ reverts to an ordinary number at $q=1.$ The $q$ -factorial is accordingly defined as

$[k]_q! = [1]_q[2]_q \dots [k]_q.$

Here’s a pretty thing.

Theorem 1.1. The partition function for Mallows measure is $Z=[n]_q!.$

Now you can derive many things, including the following.

Problem 1.8. Find a formula for the expected number of inversions in a random permutation whose distribution is Mallows measure. Using your result, find a formula for the expected number of inversions in a uniformly random permutation.