Math 202C: Lecture 7

*** Problems in this lecture due April 19 at 23:59 ***

Recall our objective: we know that

\{m_\lambda(x_1,\dots,x_n) \colon \lambda \in \mathrm{Par}_n^d\}

is a basis for \mathcal{S}_n^d = \mathbb{C}[x_1,\dots,x_n]^{S_n}, and we are trying to prove that

\{e_\lambda(x_1,\dots,x_n) \colon \lambda^\prime \in \mathrm{Par}_n^d\}

is a second basis, where \lambda^\prime is the partition obtained by transposing the Young diagram of \lambda. Thus, the condition \lambda^\prime \in \mathrm{Par}_n^d means that the largest part of \lambda satisfies \lambda_1 \leq n, hence all parts of \lambda are bounded by n. Since

e_{\lambda}(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)} e_{\lambda_i}(x_1,\dots,x_n)

is defined multiplicatively, this is equivalent to the statement that \mathbb{C}[x_1,\dots,x_n]^{S_n}=\mathbb{C}[e_1,\dots,e_n]. Note that since \lambda is a partition of d, we can write it as \lambda=(\lambda_1,\dots,\lambda_d) where we allow trailing zero coordinates, and declare e_0(x_1,\dots,x_n)=1.

Theorem 7.1. For every \lambda \vdash d with \lambda_1 \leq n, we have

e_{\lambda}(x_1,\dots,x_n) = \sum\limits_{\mu \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n)

where M_{\lambda\mu} is the number of d \times n matrices with entries in \{0,1\} with row sum vector \lambda=(\lambda_1,\dots,\lambda_d) and column sum vector \mu=(\mu_1,\dots,\mu_n).

Proof: Our objective is to expand the product

e_{\lambda}(x_1,\dots,x_n)=\prod\limits_{i=1}^d e_{\lambda_i}(x_1,\dots,x_n)

as a sum of monomials, where each factor in the product has the form

e_{\lambda_i}(x_1,\dots,x_n)=\sum\limits_{\substack{S \subseteq [n] \\ |S|=\lambda_i}} \prod\limits_{j \in S} x_j

and by convention e_0(x_1,\dots,x_n)=1. Also, [n]=\{1,\dots,n\}. To visualize the monomials which occur in this expansion, write down the d \times n symbolic matrix

\begin{bmatrix} x_1 & x_2 & \dots & x_n \\ x_1 & x_2 & \dots & x_n \\ \vdots & \vdots & \dots & \vdots \\ x_1 & x_2 & \dots & x_n \end{bmatrix}

obtained by listing the variable sequence x_1,x_2,\dots,x_n a total of d times. Each monomial in the expansion of e_\lambda(x_1,\dots,x_n) is obtained exactly once by choosing \lambda_1 distinct entries from the first row of the above matrix, \lambda_2 distinct entries from the second row, etc. This selection process can be recorded by setting all chosen elements to 1 and all unchosen elements to 0, leaving a d \times n matrix with entries in \{0,1\} whose row sums form the list \lambda=(\lambda_1,\dots,\lambda_d). The meaning of the list \alpha=(\alpha_1,\dots,\alpha_n) of column sums of this \{0,1\}-matrix is that \alpha_1 is the number of times the variable x_1 was chosen, \alpha_2 is the number of times the variable x_2 was chosen, etc., so that \alpha_1+\dots+\alpha_n=d. Thus the number of times the monomial

x^\alpha=x_1^{\alpha_1} \dots x_n^{\alpha_n}

appears in the expansion is M_{\lambda\alpha}, the number of d \times n matrices over \{0,1\} with row sums given by the partition \lambda of d and column sums given by the composition \alpha of d. That is, we have

e_\lambda(x_1,\dots,x_n)=\sum\limits_{\alpha \in \mathrm{Com}_n^d} M_{\lambda\alpha}x^\alpha.

Finally, if \mu \in \mathrm{Par}_n^d is the weakly decreasing rearrangement of \alpha, we have M_{\lambda\alpha}=M_{\lambda\mu} since permuting columns has no effect on row sums. \square
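The proof can be machine-checked in small cases. Below is a Python sketch (all helper names are ad hoc) that expands e_{(2,1)} in three variables by brute-force polynomial multiplication and compares the coefficient of each monomial x^\alpha against a direct count of \{0,1\}-matrices:

```python
from collections import Counter
from functools import reduce
from itertools import combinations, product

n, d = 3, 3
lam = (2, 1, 0)  # lambda = (2,1), padded with a zero part to length d

# e_r(x_1,...,x_n) as a dict {exponent tuple: coefficient}
def e_poly(r):
    return Counter(tuple(1 if i in S else 0 for i in range(n))
                   for S in combinations(range(n), r))

def poly_mul(f, g):
    h = Counter()
    for a, ca in f.items():
        for b, cb in g.items():
            h[tuple(x + y for x, y in zip(a, b))] += ca * cb
    return h

# e_lambda = e_2 * e_1 * e_0, expanded into monomials
e_lam = reduce(poly_mul, (e_poly(r) for r in lam))

# M_{lambda alpha}: brute-force count of d x n {0,1}-matrices with
# row sum vector lam and column sum vector alpha
def M(alpha):
    return sum(1 for mat in product(product((0, 1), repeat=n), repeat=d)
               if tuple(sum(r) for r in mat) == lam
               and tuple(sum(c) for c in zip(*mat)) == alpha)

for alpha in product(range(d + 1), repeat=n):
    if sum(alpha) == d:
        assert e_lam.get(alpha, 0) == M(alpha)
```

For instance, the coefficient of x_1x_2x_3 in e_2e_1 comes out to 3, matching the three \{0,1\}-matrices with row sums (2,1,0) and column sums (1,1,1).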

In order to parlay the above expansion into a proof that \{e_\lambda \colon \lambda^\prime \in \mathrm{Par}_n^d\} is a basis of \mathcal{S}_n^d, we want to exploit properties of the “matrix” of coefficients M_{\lambda\mu}. The quotes indicate the fact that, since we have not said how the elements of \mathrm{Par}_n^d should be ordered, we have not said how the numbers M_{\lambda\mu} should be tabulated.

In order to rectify this, we first introduce a partial order on \mathrm{Par}_n^d called dominance order. This is not as kinky as it sounds, and is in fact given by a very straightforward recipe: given two nonnegative integer vectors \mu=(\mu_1,\dots,\mu_n) and \nu=(\nu_1,\dots,\nu_n) with weakly decreasing coordinates and total sum equal to d, we declare \mu \leq \nu if and only if

\mu_1 + \dots + \mu_k \leq \nu_1 + \dots + \nu_k

holds for all 1 \leq k \leq n.
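In code, dominance is just a comparison of partial sums. Here is a minimal Python sketch (the helper name `dominated` is mine), including the classical incomparable pair showing that the order is only partial:

```python
from itertools import accumulate

def dominated(mu, nu):
    """True if mu <= nu in dominance order.

    Both arguments should be weakly decreasing nonnegative integer
    vectors of the same length and the same total sum.
    """
    return all(pm <= pn for pm, pn in zip(accumulate(mu), accumulate(nu)))

# (2,2,0,0) <= (3,1,0,0) in Par_4^4:
assert dominated((2, 2, 0, 0), (3, 1, 0, 0))

# (3,1,1,1) and (2,2,2,0) in Par_4^6 are incomparable:
assert not dominated((3, 1, 1, 1), (2, 2, 2, 0))
assert not dominated((2, 2, 2, 0), (3, 1, 1, 1))
```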

Problem 7.1. Prove that dominance order really is a partial order on \mathrm{Par}_n^d.

The key property of dominance order in relation to Theorem 7.1 is the following.

Problem 7.2. Prove that the numbers M_{\lambda\mu} defined by Theorem 7.1 satisfy M_{\lambda\mu}=0 unless \mu \leq \lambda^\prime, and that moreover M_{\lambda\lambda^\prime}=1.
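Problem 7.2 can be sanity-checked numerically (not proved, of course) by tabulating the matrix counts in a small case, say d = n = 3. A Python sketch:

```python
from collections import Counter
from itertools import accumulate, product

n = d = 3

def pad(p, k):
    return tuple(p) + (0,) * (k - len(p))

def transpose(lam):
    """Conjugate partition: column lengths of the Young diagram."""
    return tuple(sum(1 for part in lam if part > i) for i in range(lam[0]))

def dominated(mu, nu):   # mu <= nu in dominance order
    return all(a <= b for a, b in zip(accumulate(mu), accumulate(nu)))

# Bucket all {0,1} d x n matrices by their (row sums, column sums).
M = Counter()
for mat in product(product((0, 1), repeat=n), repeat=d):
    rows = tuple(sum(r) for r in mat)
    cols = tuple(sum(c) for c in zip(*mat))
    M[(rows, cols)] += 1

partitions = [(3,), (2, 1), (1, 1, 1)]   # all lambda of 3 with largest part <= n
for lam in partitions:
    lamT = transpose(lam)
    assert M[(pad(lam, d), pad(lamT, n))] == 1          # M_{lambda, lambda'} = 1
    for mu in partitions:
        if M[(pad(lam, d), pad(mu, n))] != 0:
            assert dominated(pad(mu, n), pad(lamT, n))  # mu <= lambda'
```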

Math 202C: Lecture 6

*** Problem in this lecture due April 19 at 23:59 ***

In Lecture 5, we constructed the polynomial algebra \mathcal{P}_n and the symmetric polynomial algebra \mathcal{S}_n. The basic combinatorial objects underlying these constructions are integer compositions and integer partitions. Let

\mathrm{Com}_n=\{\alpha=(\alpha_1,\dots,\alpha_n) \colon \alpha_i \in \mathbb{N}_0\}.

Then,

\mathrm{Com}_n = \bigsqcup\limits_{d=0}^\infty \mathrm{Com}_n^d

with

\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon |\alpha|=d\},

where |\alpha|=\alpha_1+\dots+\alpha_n. Now we quantize, meaning that we replace \mathrm{Com}_n with the Hilbert space \mathcal{P}_n having orthonormal basis

|\alpha\rangle, \quad \alpha \in \mathrm{Com}_n.

The disjoint union decomposition of the infinite set \mathrm{Com}_n then becomes an orthogonal direct sum decomposition

\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d

of the infinite-dimensional space \mathcal{P}_n into finite-dimensional subspaces

\mathcal{P}_n^d = \bigoplus\limits_{\alpha \in \mathrm{Com}_n^d} \mathbb{C}|\alpha\rangle.

Moreover, we have a natural notion of multiplication in \mathcal{P}_n defined by

|\alpha\rangle|\beta\rangle=|\alpha+\beta\rangle,

which has the feature that

|\alpha\rangle \in \mathcal{P}_n^c,\ |\beta\rangle \in \mathcal{P}_n^d \implies |\alpha+\beta\rangle \in \mathcal{P}_n^{c+d}.

On one hand, the infinite-dimensional algebra \mathcal{P}_n is similar to the convolution algebra of an abelian group: if

|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle \quad\text{and}\quad |w\rangle = \sum\limits_\beta d_\beta|\beta\rangle,

then

|v\rangle |w\rangle = \sum\limits_\gamma \left( \sum\limits_{\alpha+\beta=\gamma} c_\alpha d_\beta\right)|\gamma\rangle.

Indeed, \mathcal{P}_n is precisely the convolution algebra of finitely-supported functions on the infinite abelian monoid \mathrm{Com}_n. On the other hand, \mathcal{P}_n is something quite familiar and intuitive: the algebra of polynomials in n commuting variables. Indeed, under the identification

|\alpha\rangle \mapsto x^\alpha = x_1^{\alpha_1} \dots x_n^{\alpha_n},

we have

|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle = \sum\limits_\alpha c_\alpha x^\alpha = f(x_1,\dots,x_n).

In this sense, what the familiar algebra of polynomials in n commuting variables “really” is is the convolution algebra of the commutative monoid \mathrm{Com}_n.
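Concretely, the convolution product can be implemented verbatim, with exponent tuples as dictionary keys playing the role of the computational basis. A minimal Python sketch (the helper name `convolve` is mine):

```python
from collections import Counter

def convolve(v, w):
    """Product in P_n: (vw)(gamma) = sum over alpha + beta = gamma
    of c_alpha d_beta, exactly as in the display above."""
    out = Counter()
    for a, ca in v.items():
        for b, db in w.items():
            out[tuple(x + y for x, y in zip(a, b))] += ca * db
    return out

# |10> + |01>, i.e. the polynomial x1 + x2 in P_2
v = Counter({(1, 0): 1, (0, 1): 1})

# Its square is x1^2 + 2 x1 x2 + x2^2:
assert convolve(v, v) == Counter({(2, 0): 1, (1, 1): 2, (0, 2): 1})
```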

The symmetric group S_n = \mathrm{Aut}\{1,\dots,n\} acts on \mathrm{Com}_n according to

\pi \cdot \alpha = (\alpha_{\pi(1)},\dots,\alpha_{\pi(n)}),

and this permutation action respects the decomposition of \mathrm{Com}_n into the finite subsets \mathrm{Com}_n^d. This permutation action gives rise to a unitary representation (\mathcal{P}_n,\varphi_n) of S_n, where

\varphi_n \colon S_n \longrightarrow U(\mathcal{P}_n)

is the group homomorphism defined by

\varphi_n(\pi)|\alpha\rangle = |\alpha_{\pi(1)} \dots \alpha_{\pi(n)}\rangle.

Equivalently, thinking of \mathcal{P}_n as the algebra of polynomials in n variables this is

\varphi_n(\pi)f(x_1,\dots,x_n) = f(x_{\pi^{-1}(1)},\dots,x_{\pi^{-1}(n)}).

The algebra of symmetric polynomials in n variables is the space \mathcal{S}_n=\mathcal{P}_n^{S_n} of S_n-invariant vectors in this unitary representation, i.e. polynomials with the property that

f(x_{\pi(1)},\dots,x_{\pi(n)}) = f(x_1,\dots,x_n), \quad \forall \pi \in S_n.

As explained in Lecture 5, it is not too difficult to construct a basis of \mathcal{S}_n. Indeed, one need only observe that orbits of the action of S_n on \mathrm{Com}_n are naturally indexed by the set

\mathrm{Par}_n = \{\lambda \in \mathrm{Com}_n \colon \lambda_1 \geq \dots \geq \lambda_n\}.

For a fixed positive integer d, the set \mathrm{Par}_n^d=\mathrm{Par}_n \cap \mathrm{Com}_n^d may be identified with the set of partitions of d having at most n parts. Then, associated to each \lambda \in \mathrm{Par}_n^d is the corresponding orbit sum

m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha,

and \{m_\lambda \colon \lambda \in \mathrm{Par}_n^d\} is an orthogonal basis of \mathcal{S}_n^d=\mathcal{S}_n \cap \mathcal{P}_n^d. This is the basis of monomial symmetric polynomials, which have the form

m_\lambda(x_1,\dots,x_n) = x_1^{\lambda_1} \dots x_n^{\lambda_n} + \text{symmetries}.

So while m_\lambda is not in general a monomial, it is a polynomial having the feature that any of its constituent monomials determines all the others.
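The orbit sums are easy to generate mechanically. A short Python sketch (the helper name `m_poly` is ad hoc), representing m_\lambda as a dict of exponent tuples with one term per orbit element:

```python
from itertools import permutations

def m_poly(lam, n):
    """m_lambda(x_1,...,x_n) as {exponent tuple: 1}: one monomial
    per distinct rearrangement of lambda padded with zeros."""
    padded = tuple(lam) + (0,) * (n - len(lam))
    return {alpha: 1 for alpha in set(permutations(padded))}

# m_(2,1) in 3 variables has 6 monomials, one per rearrangement of (2,1,0):
assert len(m_poly((2, 1), 3)) == 6

# m_(1,1) in 3 variables has only 3: repeated parts shrink the orbit
assert len(m_poly((1, 1), 3)) == 3
```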

Problem 6.1. Give necessary and sufficient conditions under which m_\lambda(x_1,\dots,x_n) actually is a monomial.

Because m_\lambda is not in general a monomial, it is not in general the case that

m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n)=m_{\lambda+\mu}(x_1,\dots,x_n).

This is somewhat analogous to the fact that convolution of conjugacy classes in the center of the convolution algebra of a noncommutative group is determined by, but not the same as, multiplication of group elements. As in that context, the natural question is to compute the connection coefficients of the monomial basis of \mathcal{S}_n,

m_\lambda m_\mu = \sum\limits_{\nu \in \mathrm{Par}_n} c_{\lambda\mu\nu} m_\nu.

In support of this analogy, consider the following.

Problem 6.2. Show that for any \gamma \in \mathcal{O}_\nu, the coefficient of the monomial x^\gamma in the product m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n) is

c_{\lambda\mu\nu} =|\{(\alpha,\beta) \in \mathcal{O}_\lambda \times \mathcal{O}_\mu \colon \alpha+\beta=\gamma\}|.
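The smallest interesting instance of this count can be checked mechanically (a check, not a solution). A Python sketch for m_{(1)}m_{(1)} in two variables, with ad hoc helper names:

```python
from collections import Counter
from itertools import permutations

n = 2

def orbit(lam):
    """O_lambda: all rearrangements of lambda padded with zeros."""
    padded = tuple(lam) + (0,) * (n - len(lam))
    return set(permutations(padded))

def mul(f, g):
    h = Counter()
    for a in f:
        for b in g:
            h[tuple(x + y for x, y in zip(a, b))] += f[a] * g[b]
    return h

m1 = Counter({a: 1 for a in orbit((1,))})
prod = mul(m1, m1)

# m_(1) m_(1) = m_(2) + 2 m_(1,1), which is not m_(1)+(1) = m_(2):
assert prod[(2, 0)] == 1 and prod[(0, 2)] == 1 and prod[(1, 1)] == 2

# The count above for gamma = (1,1): the pairs (10)+(01) and (01)+(10)
gamma = (1, 1)
c = sum(1 for a in orbit((1,)) for b in orbit((1,))
        if tuple(x + y for x, y in zip(a, b)) == gamma)
assert c == prod[gamma] == 2
```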

Taking the analogy to its natural conclusion, recall the a priori surprising fact that the center of the convolution algebra, whose multiplication appears quite complicated in the class basis, is actually isomorphic to a pointwise function algebra. In particular, in the abelian case we have \mathcal{C}(G) \simeq \mathcal{F}(G).

Theorem 6.1 (Newton). The algebra \mathcal{S}_n of symmetric polynomials in n variables is isomorphic to the algebra \mathcal{P}_n of polynomials in n variables.

Note that this is an infinite-dimensional phenomenon: the proper subalgebra \mathcal{S}_n of \mathcal{P}_n is isomorphic to the full algebra in which it resides. In particular, the isomorphism is definitely not the tautological inclusion.

Theorem 6.1 is called the fundamental theorem of symmetric polynomials, and as indicated above it is generally attributed to Newton. Among many other achievements, Newton laid the foundations for classical mechanics based on his theory of gravity, and developed a theory of optics which involved sticking a knitting needle into his own eye to demonstrate that color is an intrinsic property of light as opposed to an artifact of perception. Now that’s applied mathematics. Given that symmetric polynomials were good enough for Newton, it’s a bit hard to take complaints about our subject matter being “too pure” seriously. But ok, nowadays we’re doing data science, not physical science — why would you care about reconciling gravitation with quantum mechanics when you can make a living applying principal components analysis to an endless supply of large rectangular matrices?

From the data science perspective, Newton’s theorem says that the elementary symmetric polynomials form a universal coordinate system for polynomial features of unordered data. More precisely, his result is that \mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n} is isomorphic to \mathcal{P}_n=\mathbb{C}[e_1,\dots,e_n], where

e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subset [n] \\ |I|=r}} \prod\limits_{i \in I} x_i.

Here [n]=\{1,\dots,n\}, so e_r(x_1,\dots,x_n) is the zero polynomial for r>n. It is notationally convenient to define e_0(x_1,\dots,x_n) to be the constant polynomial 1. Then, for any partition \lambda, i.e. for any vector \lambda=(\lambda_1,\lambda_2,\lambda_3,\dots) from the set \mathrm{Par} of infinite integer vectors with weakly decreasing coordinates eventually equal to zero, we can extend the definition of the elementary symmetric polynomials by declaring

e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^\infty e_{\lambda_i}(x_1,\dots,x_n),

or equivalently

e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)}e_{\lambda_i}(x_1,\dots,x_n)

where \ell(\lambda) is the number of nonzero coordinates of \lambda. The condition required for this product to be nonzero is that \lambda_1 \leq n. Now, the set of partitions with largest part at most n is in bijection with the set of partitions with at most n parts via transposing Young diagrams, \lambda \mapsto \lambda^T. Thus, Newton’s theorem is equivalent to the following.

Theorem 6.2. For any d \in \mathbb{N}, the set \{e_\lambda(x_1,\dots,x_n) \colon \lambda^T \in \mathrm{Par}_n^d\} is a vector space basis of \mathcal{S}_n^d.

In Lecture 7, we will prove Theorem 6.2 by giving a combinatorial interpretation of the coefficients in the expansion

e_\lambda(x_1,\dots,x_n) = \sum\limits_{\mu \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n)

of the (extended) elementary symmetric polynomials e_\lambda in terms of the monomial symmetric polynomials m_\mu. In fact, the value of this computation extends beyond its immediate application to Newton’s theorem. Indeed, the expansion above is universal in the sense that it remains valid under any specialization, i.e. under any algebra homomorphism \Xi \colon \mathcal{S}_n \to \mathcal{A}. In particular, we have

e_\lambda(|J_1\rangle, \dots, |J_n\rangle) = \sum\limits_{\mu \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(|J_1\rangle, \dots, |J_n\rangle),

where |J_1\rangle,\dots,|J_n\rangle are the Jucys-Murphy elements in the convolution algebra \mathcal{C}(S_n) of the symmetric group. Recall that we established the commutativity of the JM-elements by proving that they lie in the commutative subalgebra \mathcal{J}_n of \mathcal{C}(S_n) generated by

Z\mathcal{C}(S_1) \cup \dots \cup Z\mathcal{C}(S_n),

which we named the JM-subalgebra of \mathcal{C}(S_n). We then showed that

|L_r\rangle=e_r(|J_1\rangle,\dots,|J_n\rangle),

i.e. the levels of the symmetric group are precisely the elementary symmetric polynomials specialized on the JM-elements.

Next week, we shall see that essentially the same argument used to prove Newton’s theorem can be retooled to prove the apparently much more sophisticated result that |J_1\rangle,\dots,|J_n\rangle not only lie in the algebra \mathcal{J}_n, but in fact generate it. This will essentially complete Phase I of our development of the representation theory of the symmetric group. Phase II will be to show that \mathcal{J}_n has a completely different spectral interpretation as the so-called Gelfand-Tsetlin subalgebra of \mathcal{C}(S_n). As discovered by Okounkov and Vershik, the fact that the Jucys-Murphy and Gelfand-Tsetlin subalgebras of \mathcal{C}(S_n) are one and the same completely unlocks the representation theory of the symmetric groups.

For now, here is something concrete to think about.

Problem 6.3. Show that e_r(1,2,\dots,n-1) is equal to the number of permutations in S_n consisting of n-r disjoint cycles.
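The identity in Problem 6.3 can be confirmed by brute force for small n (a numerical check, not a solution). A Python sketch for n = 5:

```python
from itertools import combinations, permutations
from math import prod

def e(r, vals):
    """Elementary symmetric polynomial e_r evaluated at vals."""
    return sum(prod(c) for c in combinations(vals, r))

def num_cycles(p):
    """Number of cycles of a permutation in one-line notation."""
    seen, cyc = set(), 0
    for i in range(len(p)):
        if i not in seen:
            cyc += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cyc

n = 5
for r in range(n):
    lhs = e(r, range(1, n))                 # e_r(1, 2, ..., n-1)
    rhs = sum(1 for p in permutations(range(n)) if num_cycles(p) == n - r)
    assert lhs == rhs                       # permutations with n - r cycles
```

The numbers appearing on both sides are the unsigned Stirling numbers of the first kind mentioned in Lecture 4.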

For context, recall that the evaluation of p_r(1,2,\dots,n)=1^r+2^r+\dots+n^r is the classical problem of summing powers of consecutive numbers. At some point in your life you undoubtedly learned that

p_1(1,2,\dots,n) = \frac{n(n+1)}{2},

and this may have led you to discover that

p_2(1,2,\dots,n) = \frac{n(n+1)(2n+1)}{6},

which probably means that you tried to understand what was going on with p_r(1,2,\dots,n) in general. The problem is “solved” by Faulhaber’s formula, which is not very satisfying. The situation with e_r(1,2,\dots,n) is much cleaner, and may help to mend this childhood trauma.

Math 202C: Lecture 5

*** Problems in this Lecture due March 12 at 23:59 ***

Let

\mathrm{Com}_n = \{(\alpha_1,\dots,\alpha_n) \in \mathbb{Z}^n \colon \alpha_i \geq 0\}

be the set of nonnegative integer vectors with n components. Elements \alpha of \mathrm{Com}_n are referred to as weak n-part compositions, where “weak” means that some coordinates may be zero. Let

\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon \alpha_1+\dots +\alpha_n=d\}.

Elements of \mathrm{Com}_n^d are called weak n-part compositions of d. Let \mathcal{P}_n be the Hilbert space with orthonormal basis

|\alpha\rangle = |\alpha_1\dots\alpha_n\rangle,\quad \alpha \in \mathrm{Com}_n.

That is, \mathcal{P}_n is the space of complex-valued functions on weak n-part compositions which vanish at all but finitely many points. A general vector in \mathcal{P}_n thus has the form

|v\rangle = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha |\alpha\rangle,

where all but finitely many of the coefficients c_\alpha \in \mathbb{C} are zero. Note that \mathcal{P}_n is an infinite-dimensional Hilbert space, and recall that our definition of Hilbert space does not require completeness. We have an orthogonal direct sum decomposition

\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d,

where

\mathcal{P}_n^d = \mathrm{span}\{|\alpha\rangle \colon \alpha \in \mathrm{Com}_n^d\}.

Problem 5.1. Prove that \mathcal{P}_n^d is finite-dimensional, and determine its dimension.

Now define a multiplication in \mathcal{P}_n by bilinearly extending the rule

|\alpha\rangle|\beta\rangle = |\alpha+\beta\rangle.

Furthermore, define a conjugation in \mathcal{P}_n by antilinearly extending the rule

|\alpha\rangle^*=|\alpha\rangle.

Then, \mathcal{P}_n is an infinite-dimensional commutative algebra which you likely know by another name: the algebra of polynomials in n (commuting, selfadjoint) variables. Indeed, if we write

x_1=|100\dots 0\rangle,\ x_2=|010\dots 0\rangle,\ \dots,\ x_n=|000\dots 1\rangle,

then a basis vector in \mathcal{P}_n becomes

|\alpha\rangle = x_1^{\alpha_1} \dots x_n^{\alpha_n},

and a general vector in \mathcal{P}_n becomes

|v\rangle = \sum\limits_{\alpha} c_\alpha|\alpha\rangle = \sum\limits_{\alpha} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}.

So, if you have ever wondered what exactly a polynomial is, the above constitutes one possible answer: a polynomial is a finitely-supported complex-valued function on the set of compositions. This explains the sense in which there are “no polynomial relations” among the “variables” x_1,\dots,x_n. Indeed, writing

\sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n} = \sum\limits_\alpha c_\alpha |\alpha\rangle =0

shows that this means nothing more than linear independence of the computational basis in \mathcal{P}_n. Also note that \mathcal{P}_n^d is the span of all monomials whose exponents add to d,

x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}, \quad \alpha_1+\dots+\alpha_n=d.

Polynomials f(x_1,\dots,x_n) \in \mathcal{P}_n^d are said to be homogeneous of degree d, and the space of these is finite-dimensional.

Since you come with a lot of software already installed, we will continue to use the familiar notation

f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}

for vectors |v\rangle in \mathcal{P}_n = \mathbb{C}[x_1,\dots,x_n]. This will allow us to more readily access your existing libraries, fragmented as they may be. The monomial basis of \mathcal{P}_n is

|\alpha\rangle= x_1^{\alpha_1} \dots x_n^{\alpha_n}.

If we want to write things in a compressed way, we may use

f(x)=\sum\limits_\alpha c_\alpha x^\alpha.

One advantage of the classical polynomial notation is that it allows you to visualize homomorphisms from \mathcal{P}_n into other algebras as “substitutions.”

Definition 5.1. An algebra homomorphism \Xi \colon \mathcal{P}_n \to \mathcal{A} is called a specialization when \mathcal{A} is commutative.

Any specialization \Xi is uniquely determined by the images A_1=\Xi(x_1),\dots,A_n=\Xi(x_n), since the image of a general polynomial

f(x_1,\dots,x_n) = \sum\limits_\alpha c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}

is

\Xi(f(x_1,\dots,x_n))=\sum\limits_\alpha c_\alpha A_1^{\alpha_1} \dots A_n^{\alpha_n} =:f(A_1,\dots,A_n).

Conversely, given any elements A_1,\dots,A_n in a commutative algebra \mathcal{A} there is a unique specialization \Xi \colon \mathbb{C}[x_1,\dots,x_n] \to \mathcal{A} such that \Xi(x_1)=A_1,\dots,\Xi(x_n)=A_n.

For example, consider the specialization \Xi \colon \mathcal{P}_n \to \mathbb{C} determined by

\Xi(x_1)=1,\ \Xi(x_2)=2,\ \dots,\ \Xi(x_n)=n.

Then, for a general polynomial

f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}

we have

\Xi(f) = f(1,2,\dots,n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha 1^{\alpha_1}2^{\alpha_2} \dots n^{\alpha_n}.
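A specialization of a dict-encoded polynomial is a few lines of Python. The sketch below (helper names are mine) applies \Xi(x_i)=i to the power sum p_2 and recovers the classical evaluation p_2(1,\dots,n)=n(n+1)(2n+1)/6:

```python
from math import prod

def specialize(f, values):
    """Apply the specialization x_i -> values[i] to a polynomial
    given as a dict {exponent tuple: coefficient}."""
    return sum(c * prod(v ** a for v, a in zip(values, alpha))
               for alpha, c in f.items())

n = 4
# p_2 = x_1^2 + x_2^2 + ... + x_n^2
p2 = {tuple(2 if i == j else 0 for i in range(n)): 1 for j in range(n)}

assert specialize(p2, range(1, n + 1)) == 1 + 4 + 9 + 16 == n * (n + 1) * (2 * n + 1) // 6
```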

In simpler times, people were interested in finding the image of the power sum polynomial

p_r(x_1,\dots,x_n)=x_1^r + x_2^r + \dots + x_n^r

under the specialization \Xi, which is the number

\Xi(p_r)=p_r(1,2,\dots,n) = 1^r+2^r+ \dots+ n^r.

A formula for this number was eventually found in terms of another family of numbers called Bernoulli numbers, which are themselves not so easy to describe. The main conceptual insight this formula provides is that \Xi(p_r) admits a second description as the image of a univariate polynomial q_r(x) under the specialization x \mapsto n.

In probability theory, one is concerned with specializations of \mathcal{P}_n whose target \mathcal{A} is the algebra of complex-valued random variables on a given probability space. In fact, the problem we want to solve in this context is the stochastic version of the above numerical problem: given X_1,\dots,X_n \in \mathcal{A}, determine the distribution of

p_r(X_1,\dots,X_n)=X_1^r+ \dots + X_n^r

in terms of the joint distribution of X_1,\dots,X_n. Probability theory provides efficient tools for doing this when the random variables X_1,\dots,X_n are independent. If they are highly correlated, however, we know much less.

Problem 5.2. Determine the distribution of p_r(X_1,\dots,X_n) when X_1,\dots,X_n are iid Bernoulli random variables.

We can recast the content of Lecture 4 in terms of a specialization of \mathcal{P}_n whose target is the Jucys-Murphy algebra, \mathcal{J}_n. Recall that \mathcal{J}_n is the commutative subalgebra of the convolution algebra \mathcal{C}(S_n) of the symmetric group S_n=\mathrm{Aut}\{1,\dots,n\} generated by

\mathcal{Z}_1 \cup \mathcal{Z}_2 \cup \dots \cup \mathcal{Z}_n,

where \mathcal{Z}_k=Z\mathcal{C}(S_k) is the center of the subalgebra \mathcal{C}(S_k) of \mathcal{C}(S_n) consisting of functions supported on permutations which fix the points k+1,k+2,\dots,n. The Jucys-Murphy specialization of \mathcal{P}_n is the algebra homomorphism

\Xi \colon \mathcal{P}_n \longrightarrow \mathcal{J}_n

defined by

\Xi(x_1)=|J_1\rangle,\ \Xi(x_2)=|J_2\rangle,\ \dots,\ \Xi(x_n)=|J_n\rangle,

where

|J_t\rangle = \sum\limits_{1 \leq s<t} |st\rangle, \quad 1 \leq t \leq n,

are the Jucys-Murphy elements in \mathcal{C}(S_n). Our key result is that the metric level sets

|L_r\rangle = \sum\limits_{\substack{\pi \in S_n \\ |\pi|=r}} |\pi\rangle, \quad 1 \leq r \leq n,

in the symmetric group (identity-centered spheres in the all-transpositions metric) are the images of a particular family of polynomials e_1,\dots,e_n \in \mathcal{P}_n under the JM-specialization. The polynomials in question are the elementary symmetric polynomials

e_r(x_1,\dots,x_n) = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n} x_{t_1}x_{t_2} \dots x_{t_r}, \quad 1 \leq r \leq n,

which may also be written

e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subseteq \{1,\dots,n\} \\ |I|=r}} \prod\limits_{i \in I}x_i.

Our theorem from Lecture 4 (which actually goes back to Lecture 1, and seems to have been completely uninteresting to many people) is that

|L_r \rangle = e_r(|J_1\rangle,|J_2\rangle,\dots,|J_n\rangle).
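This identity can be confirmed by direct computation in \mathcal{C}(S_4). The Python sketch below (a check with ad hoc names, not part of the lecture) models group algebra elements as dictionaries keyed by one-line permutations, using the lecture's left-to-right product \rho\sigma := \sigma \circ \rho:

```python
from collections import Counter
from functools import reduce
from itertools import combinations, permutations

n = 4
identity = tuple(range(n))

def compose(rho, sigma):
    """rho.sigma := sigma o rho, matching the left-to-right convention."""
    return tuple(sigma[i] for i in rho)

def convolve(f, g):
    h = Counter()
    for p, cp in f.items():
        for q, cq in g.items():
            h[compose(p, q)] += cp * cq
    return h

def transposition(s, t):
    q = list(identity)
    q[s], q[t] = q[t], q[s]
    return tuple(q)

# J[t-1] encodes |J_t> = sum_{s<t} |(s t)|; the t = 1 entry is the empty sum
J = [Counter({transposition(s, t): 1 for s in range(t)}) for t in range(n)]

def num_cycles(p):
    seen, cyc = set(), 0
    for i in range(n):
        if i not in seen:
            cyc += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cyc

one = Counter({identity: 1})
for r in range(1, n):
    # e_r evaluated on the JM elements, as a sum of convolution products
    e_r = sum((reduce(convolve, (J[t] for t in S), one)
               for S in combinations(range(n), r)), Counter())
    L_r = Counter({p: 1 for p in permutations(range(n))
                   if n - num_cycles(p) == r})
    assert e_r == L_r
```

Every coefficient in e_r(|J_1\rangle,\dots,|J_n\rangle) comes out equal to 1, exactly as the strictly-monotone-walk bijection from Lecture 4 predicts.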

We can also look at the image of this identity in the regular representation of \mathcal{C}(S_n), where it becomes

L_r= e_r(J_1,J_2,\dots,J_n),

giving us a polynomial decomposition of the metric level-set operators L_r \in \mathrm{End}\mathcal{C}(S_n) on the all-transpositions Cayley graph of S_n in terms of a simpler family of operators J_t \in \mathrm{End}\mathcal{C}(S_n). As described in Lecture 1, the metric level set operators on any graph interpolate between the adjacency operator and the distance operator. However, the polynomial decomposition of these operators given above is a special feature of the all-transpositions Cayley graph of the symmetric group, and we will soon diagonalize the Jucys-Murphy operators J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n).

Before doing this, let us come to an understanding of the elementary symmetric polynomials, and symmetric polynomials more generally. The basic fact is that we have a natural unitary representation of S_n on \mathcal{P}_n, namely the group homomorphism

\varphi \colon S_n \longrightarrow U(\mathcal{P}_n)

from the symmetric group of \{1,\dots,n\} to the unitary group of \mathcal{P}_n defined by

\varphi(\pi)|\alpha\rangle = |\alpha_{\pi(1)}\alpha_{\pi(2)} \dots \alpha_{\pi(n)}\rangle, \quad \pi \in S_n,\ \alpha \in \mathrm{Com}_n.

Note that this is an infinite-dimensional unitary representation of the symmetric group, but that each of the finite-dimensional subspaces \mathcal{P}_n^d is a finite-dimensional unitary representation of S_n. In polynomial notation, we have

\varphi(\pi)x^\alpha = x_1^{\alpha_{\pi(1)}}x_2^{\alpha_{\pi(2)}} \dots x_n^{\alpha_{\pi(n)}} = x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n},

and thus for a general polynomial

f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}

we have

\varphi(\pi) f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n}=f(x_{\pi^{-1}(1)}, \dots x_{\pi^{-1}(n)}).

Like every unitary representation, (\mathcal{P}_n,\varphi) contains a space of invariants,

\mathcal{S}_n = \mathcal{P}_n^{S_n} = \mathbb{C}[x_1,\dots,x_n]^{S_n},

and since \mathcal{P}_n is a commutative algebra so is \mathcal{S}_n. The commutative algebra \mathcal{S}_n is called the algebra of symmetric polynomials in n variables, and it consists precisely of those polynomials such that

f(x_1,\dots,x_n)=f(x_{\pi(1)},\dots,x_{\pi(n)}), \quad \text{for all } \pi \in S_n.

Let us find a basis for \mathcal{S}_n. First, S_n acts on \mathrm{Com}_n, which is just a set and not a Hilbert space, according to

\pi \cdot \alpha = (\alpha_{\pi(1)},\dots,\alpha_{\pi(n)}).

This action preserves the coordinate sum of \alpha, so in fact S_n is acting on each of the finite sets \mathrm{Com}_n^d, whose cardinality is easy to compute (Problem 4.1). However, the number of orbits into which \mathrm{Com}_n^d decomposes under S_n is not at all easy to compute. Let \mathrm{Par}_n^d \subset \mathrm{Com}_n^d be the subset consisting of weakly decreasing compositions, and observe that each of these may be identified with a partition of d having at most n parts. The coordinates of a given composition \alpha \in \mathrm{Com}_n^d can be permuted so that they become a weakly decreasing list of numbers, hence the orbits of \mathrm{Com}_n^d may be indexed by the elements of \mathrm{Par}_n^d and we obtain the decomposition

\mathrm{Com}_n^d = \bigsqcup\limits_{\lambda \in \mathrm{Par}_n^d} \mathcal{O}_\lambda,

where \mathcal{O}_\lambda is the set of compositions which sort to the partition \lambda. The monomial symmetric polynomials are defined by

m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha.

Problem 5.3. Prove that the monomial symmetric polynomials form an orthogonal basis of \mathcal{S}_n. Decompose the power sum symmetric polynomials p_r(x_1,\dots,x_n) and elementary symmetric polynomials e_r(x_1,\dots,x_n) in this basis.

The above was a first-principles construction of a basis for the space of S_n-invariants in \mathcal{P}_n. On the other hand, we know from Math 202B that for any unitary representation (V,\varphi) of a finite group G, averaging the operators \varphi(g) gives the orthogonal projection V \to V^G.

Problem 5.4. Show that the operator P = \frac{1}{n!}\sum_{\pi \in S_n} \varphi(\pi) projects each monomial in \mathcal{P}_n onto an element of \mathcal{S}_n that is in fact a monomial symmetric polynomial, up to an explicitly describable scalar factor.

Math 202C: Lecture 4

*** Problems in this lecture due April 12 at 23:59 ***

Let S_n=\mathrm{Aut}\{1,\dots,n\} be the symmetric group on \{1,\dots,n\}. Our convention is that permutations are multiplied left-to-right, \rho\sigma := \sigma \circ \rho. We identify S_n with its Cayley graph \mathrm{Cay}(S_n,T_n) as generated by the conjugacy class of transpositions,

T_n=\{\tau=(s t) \colon 1 \leq s <t \leq n\}.

Thus, two permutations \rho,\sigma \in S_n are adjacent if and only if \sigma=\rho\tau for \tau \in T_n. Under this identification, S_n decomposes into a disjoint union of n independent sets, the levels

L_r=\{\pi \in S_n \colon |\pi|=r\} = \{\pi \in S_n \colon n-\mathsf{cyc}(\pi)=r\}, \quad 0 \leq r \leq n-1.

The cardinalities of these levels make up the nth row in the number triangle whose entries are the Stirling numbers of the first kind. Each level further decomposes as a disjoint union of conjugacy classes: we have

L_r = \bigsqcup\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,

where \Lambda is the set of partitions of n, and \ell(\alpha) is the number of terms in the partition \alpha. We can encode this decomposition algebraically in \mathcal{C}(S_n) as

|L_r\rangle = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} |C_\alpha\rangle.

Note that, because S_n is an ambivalent group, each class indicator |C_\alpha\rangle is a selfadjoint element in the convolution algebra \mathcal{C}(S_n). In the regular representation, the above becomes an identity in \mathrm{End}\mathcal{C}(S_n), namely

L_r = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,

and we have used this already to give a simple proof of the fact that the metric level set operators L_1,\dots,L_{n-1} on S_n=\mathrm{Cay}(S_n,T_n) commute. In particular, these selfadjoint operators are simultaneously diagonalizable. In this Lecture, we will obtain a different combinatorial decomposition of L_r which opens the door to actually diagonalizing L_1,\dots,L_{n-1}.

In Lecture 1, we defined an edge-labeling of S_n by marking each edge corresponding to the transposition (s\ t) with t, the larger of the two elements it transposes. We defined a walk on S_n to be strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. It is clear that the length of any such walk is at most n-1, which is the diameter of S_n. The following remarkable result holds.

Theorem 4.1. There exists a unique strictly monotone walk between every pair of permutations, and this walk is a geodesic.

Theorem 4.1 gives a second description of L_r as the set of endpoints of strictly monotone r-step walks on S_n emanating from the identity permutation, \iota. The figure below illustrates this for n=4, with \text{blue}=2, \text{purple}=3, \text{black}=4.

The Jucys-Murphy spanning tree of S_4.
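Theorem 4.1 can also be confirmed exhaustively for n=4 in a few lines of Python. The sketch below (helper names are mine) enumerates every strictly monotone walk from \iota, in the lecture's convention \rho\tau := \tau \circ \rho, and checks that each permutation is reached exactly once, at geodesic distance:

```python
n = 4

def step(p, s, t):
    """Right-multiply p by the transposition (s t): since rho.tau = tau o rho,
    this swaps the values s and t in one-line notation."""
    return tuple(t if x == s else s if x == t else x for x in p)

def num_cycles(p):
    seen, cyc = set(), 0
    for i in range(n):
        if i not in seen:
            cyc += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cyc

reached = {}

def explore(p, next_label, length):
    reached[p] = reached.get(p, 0) + 1
    assert length == n - num_cycles(p)   # every strictly monotone walk is a geodesic
    for t in range(next_label, n):       # edge labels must strictly increase
        for s in range(t):
            explore(step(p, s, t), t + 1, length + 1)

explore(tuple(range(n)), 1, 0)
assert len(reached) == 24                # every element of S_4 is reached ...
assert set(reached.values()) == {1}      # ... by exactly one strictly monotone walk
```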

Definition 4.2. The Jucys-Murphy subsets of S_n are defined by

J_1=\emptyset

J_2 = \{(1 2)\}

J_3=\{(1 3),(2 3)\}

\vdots

J_n=\{(1 n),(2 n),\dots,(n-1 n)\}.

Thus,

J_t = \{(s t) \colon 1 \leq s <t\}

is the set of all transpositions which swap a given number t with smaller numbers s. The set J_1 = \emptyset is included only for notational convenience, as a sequence with initial index 2 may be unsettling. It is clear that the JM-subsets of S_n are pairwise disjoint, and that their union is T_n=L_1. More generally, the set of all r-step strictly monotone walks on S_n emanating from the identity permutation \iota is in bijection with

\bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},

and hence we have a bijection

L_r \simeq \bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r}.

We will now encode this correspondence algebraically in \mathcal{C}(S_n).

Definition 4.3. The Jucys-Murphy elements in \mathcal{C}(S_n) are

|J_t\rangle = \sum\limits_{1 \leq s<t} |s t\rangle, \quad 1 \leq t \leq n.

The JM-elements are selfadjoint elements satisfying \langle J_t | J_t \rangle = t-1 and

|T_n\rangle = \sum\limits_{t=1}^n |J_t\rangle.

Thus, J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n) are selfadjoint operators satisfying

T_n = \sum\limits_{t=1}^n J_t,

where T_n is the adjacency operator on \mathrm{Cay}(S_n,T_n).

Problem 4.2. Prove that the operator norm of J_t is \|J_t\|=t-1.

At this point, we can write down the algebraic identity

|L_r\rangle = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n}|J_{t_1}\rangle \dots |J_{t_r}\rangle.

This is equivalent to the combinatorial decomposition of L_r in terms of JM-subsets, but at the same time it has many advantages. To see these, we first need the following fact.

Theorem 4.3. The JM-elements |J_1\rangle,\dots,|J_n\rangle pairwise commute in \mathcal{C}(S_n).

One way to verify this would be to check directly that the products

|J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|ik\rangle|jl\rangle \quad\text{and}\quad |J_l\rangle|J_k\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|jl\rangle|ik\rangle

are indeed the same. We will instead construct a manifestly commutative subalgebra \mathcal{J}_n of \mathcal{C}(S_n) which contains \{|J_1\rangle,\dots,|J_n\rangle\}.
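For small n, the direct check is in any case easy to delegate to a machine. A throwaway sketch in Python (not part of the lecture; permutations are 0-based tuples of images, and the helper names are my own):

```python
# Direct check for n = 4: convolve the JM elements in both orders.
n = 4

def compose(p, q):
    # (pq)(i) = p(q(i)); permutations stored as tuples of images of 0..n-1
    return tuple(p[q[i]] for i in range(n))

def transposition(s, t):
    # the transposition (s t) as a permutation of 0..n-1
    img = list(range(n))
    img[s], img[t] = img[t], img[s]
    return tuple(img)

def convolve(a, b):
    # convolution in C(S_n); vectors stored as dicts perm -> coefficient
    c = {}
    for p, x in a.items():
        for q, y in b.items():
            pq = compose(p, q)
            c[pq] = c.get(pq, 0) + x * y
    return c

def JM(t):
    # |J_t> = sum over s < t of |(s t)>; J_1 is the empty sum
    return {transposition(s, t - 1): 1 for s in range(t - 1)}

for k in range(1, n + 1):
    for l in range(1, n + 1):
        assert convolve(JM(k), JM(l)) == convolve(JM(l), JM(k))
```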

The key to this construction is the fact that the symmetric groups S_1,S_2,\dots,S_n form an inductive chain

S_1 \subset S_2 \subset \dots \subset S_n,

where we identify S_k=\mathrm{Aut}\{1,\dots,k\} with the subgroup of S_n consisting of permutations which fix the numbers k+1,k+2,\dots,n. This gives an inductive chain of algebras,

\mathcal{C}(S_1) \subset \mathcal{C}(S_2) \subset \dots \subset \mathcal{C}(S_n),

each of which has its own center, giving us a collection

Z\mathcal{C}(S_1),\ Z\mathcal{C}(S_2),\ \dots,\ Z\mathcal{C}(S_n)

of commutative subalgebras of \mathcal{C}(S_n). However, it is not true that these commutative subalgebras form an inductive chain. In fact, they are as disjoint as subalgebras can be.

Problem 4.3. Prove that Z\mathcal{C}(S_k) \cap Z\mathcal{C}(S_l)=\mathbb{C}|\iota\rangle for k<l.

What is true is that each element of Z\mathcal{C}(S_k) commutes with every element of Z\mathcal{C}(S_l). Indeed, assuming k<l, the center Z\mathcal{C}(S_l) consists of all elements commuting with every element of \mathcal{C}(S_l), and

Z\mathcal{C}(S_k) \subset \mathcal{C}(S_k) \subset \mathcal{C}(S_l).

Definition 4.4. The Jucys-Murphy subalgebra \mathcal{J}_n of \mathcal{C}(S_n) is the commutative subalgebra generated by Z\mathcal{C}(S_1) \cup Z\mathcal{C}(S_2) \cup \dots \cup Z\mathcal{C}(S_n).

Problem 4.4. Prove that \dim \mathcal{J}_n \geq \sum_{k=1}^n p(k)-(n-1), where p(k) is the number of partitions of k.

The commutativity of the JM-elements follows from the fact that they belong to the JM-subalgebra.

Theorem 4.4. We have |J_1\rangle,|J_2\rangle,\dots,|J_n\rangle \in \mathcal{J}_n.

Proof: For each 1 \leq k \leq n, let

|T_k\rangle = \sum\limits_{1 \leq s < t \leq k} |st\rangle

be the sum of all transpositions in S_k. This is an element of \mathcal{C}(S_k), and indeed an element of Z\mathcal{C}(S_k). Now observe that

|J_k\rangle = |T_k\rangle - |T_{k-1}\rangle,

which expresses |J_k\rangle as a difference of elements in \mathcal{J}_n. \square
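With commutativity in hand, the identity expressing |L_r\rangle as a sum of products of JM-elements can also be confirmed by machine for small n. A brute-force Python sketch for n=4 (a numerical confirmation under my own conventions — 0-based permutation tuples, convolution as dicts — not a proof):

```python
from itertools import permutations, combinations

# Check |L_r> = sum over t_1 < ... < t_r of |J_{t_1}> ... |J_{t_r}> for n = 4.
n = 4

def compose(p, q):
    return tuple(p[q[i]] for i in range(n))

def transposition(s, t):
    img = list(range(n))
    img[s], img[t] = img[t], img[s]
    return tuple(img)

def convolve(a, b):
    c = {}
    for p, x in a.items():
        for q, y in b.items():
            pq = compose(p, q)
            c[pq] = c.get(pq, 0) + x * y
    return c

def JM(t):
    return {transposition(s, t - 1): 1 for s in range(t - 1)}

def cyc(p):
    seen, count = set(), 0
    for i in range(n):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return count

for r in range(1, n):
    total = {}
    # J_1 is the empty sum, so tuples involving t = 1 contribute nothing
    for ts in combinations(range(2, n + 1), r):
        prod = {tuple(range(n)): 1}  # start from the identity
        for t in ts:
            prod = convolve(prod, JM(t))
        for p, x in prod.items():
            total[p] = total.get(p, 0) + x
    # |L_r> is the indicator of permutations at distance r from the identity
    level = {p: 1 for p in permutations(range(n)) if n - cyc(p) == r}
    assert total == level
```

Note that every permutation with |\pi|=r appears with coefficient exactly 1, which is precisely the uniqueness statement of Theorem 4.1.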

Math 202C: Model Homework Solution

Lectures are informal by design: we meet to discuss algebra and chart our course through the subject. I talk to you, you talk to me. We generate ideas to craft a narrative. We then retire to our respective domiciles to polish that narrative. My job is to identify the key large scale ideas, treating these as girders which I assemble into an overarching structure presented in the lecture notes. I also identify important small-scale ideas which play the role of joists supporting the overarching structure. Your job is to install these joists, which you do in the homework problems. In order for this collaborative endeavor to succeed, there must be committed and earnest effort from both parties.

If I were a telepath, I would read your mind to determine whether you are supplying this effort, and base your grade on the results of this brain scan. Although I am not a telepath, I can gauge your earnestness to a certain extent based on verbal interactions. However, this measurement is imperfect, and moreover I do not interact verbally with every student, though I am willing to do so. Therefore, your written solutions must communicate your commitment to the cause.

Below is an exemplar on which you can model your work in terms of level of detail and tone, should you wish to do so. My process was the following: I discussed the problem with two people for about fifteen minutes after lecture; I thought about it more on my fifteen minute drive home; I thought about it again while falling asleep, probably another fifteen minutes; I woke up and worked out the solution carefully, which took about an hour.

Problem 1.1. Find a formula for the number of walks of given length between two given vertices of the complete graph.

Solution: Let K_n be the complete graph on \{1,\dots,n\}. Given i,j \in \{1,\dots,n\} and r \in \mathbb{N}, let W^r(i,j) denote the number of r-step walks i \to j in K_n. Observe that

W^r(1,1) + W^r(1,2) + \dots + W^r(1,n)=(n-1)^r.

Indeed, at each step of the walk we have n-1 choices for the next vertex to visit. We claim that

W^r(1,2)= \dots = W^r(1,n).

Intuitively, this is because the complete graph is completely homogeneous. Let us translate this intuition into a precise argument.

Let S_n=\mathrm{Aut}\{1,\dots,n\} be the symmetric group on \{1,\dots,n\}, and let \tau = (2\ 3) be the transposition which transposes 2 and 3. Let \tau act on the set of walks 1 \to v_1 \to \dots \to v_{r-1} \to 2 by

\tau(1 \to v_1 \to \dots \to v_{r-1} \to 2) = \tau(1) \to \tau(v_1) \to \dots \to \tau(v_{r-1}) \to \tau(2).

In words, \tau acts on an input walk by replacing all occurrences of 2 with 3, and vice versa. This is a bijective mapping from the set of r-step walks 1 \to 2 to the set of r-step walks 1 \to 3, and in fact this bijection is an involution because \tau is. Consequently, W^r(1,2)=W^r(1,3). The same argument shows that W^r(1,2)=W^r(1,4), just use \tau=(2\ 4) instead.

We now know that

W^r(1,1) + (n-1)W^r(1,2) = (n-1)^r.

If we can find a second linear equation satisfied by these numbers, we can solve the resulting system. We will find a second relation by haruspication, which in the present situation means looking at small values of r.

For r=1, we have W^1(1,1)=0 and W^1(1,2)=1. For r=2, we have W^2(1,1) = n-1 and W^2(1,2)=n-2. We thus conjecture that

W^r(1,2) = W^r(1,1) + (-1)^{r-1},

and we can prove this by induction on r. The base step of the induction has been completed; the inductive step is to show that

W^{r+1}(1,2) = W^{r+1}(1,1) + (-1)^r.

We have

W^{r+1}(1,2) = W^r(1,1) + \overline{W^r(1,2)}+ \dots + W^r(1,n)

and

W^{r+1}(1,1)= \overline{W^r(1,1)} + W^r(1,2)+\dots+W^r(1,n),

where in both equations the overline denotes an omitted term. Subtracting the second equation from the first, we get

W^{r+1}(1,2)-W^{r+1}(1,1)=W^r(1,1)-W^r(1,2)=-(-1)^{r-1}=(-1)^r,

where the second equality is the induction hypothesis.

We have arrived at a system of two linear equations in two unknowns,

W^r(1,1) + (n-1)W^r(1,2)=(n-1)^r \quad\text{ and }\quad W^r(1,1)-W^r(1,2)=(-1)^r.

Subtracting the second equation from the first, we arrive at

W^r(1,2) = \frac{(n-1)^r -(-1)^r}{n}.

Now substitute this value into the second equation to arrive at

W^r(1,1) = (-1)^r+ \frac{(n-1)^r -(-1)^r}{n}=\frac{(n-1)^r+ (-1)^r(n-1)}{n}.
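The two closed forms can be sanity-checked against powers of the adjacency matrix of K_n, since the (i,j) entry of A^r counts r-step walks i \to j. A quick Python sketch using only the standard library (the choices n=5, r \leq 8 are arbitrary):

```python
# Sanity check of the closed forms for W^r(1,1) and W^r(1,2) on K_n.
n, rmax = 5, 8

# adjacency matrix of the complete graph K_n
A = [[0 if i == j else 1 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
for r in range(1, rmax + 1):
    P = matmul(P, A)  # now P = A^r, whose (i, j) entry is W^r(i+1, j+1)
    assert P[0][1] == ((n - 1)**r - (-1)**r) // n
    assert P[0][0] == ((n - 1)**r + (-1)**r * (n - 1)) // n
```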

Math 202C: Lecture 0

This post serves as the syllabus for Math 202C.

DATE | TOPIC | Modality
March 30, Lecture 1 | Graphs and operators | In-person lecture
April 1, Lecture 2 | Cayley graphs | In-person lecture
April 3, Lecture 3 | Level operators on Cayley graphs | Asynchronous lecture post
April 6, Lecture 4 | Jucys-Murphy elements | In-person lecture
April 8, Lecture 5 | Polynomials and symmetric polynomials | In-person lecture
April 10, Lecture 6 | Elementary symmetric polynomials | In-person lecture
April 13, Lecture 7 | Elementary-to-monomial transition | Asynchronous lecture post
April 15, Lecture 8 | Fundamental theorem of symmetric polys | Asynchronous lecture post
April 17, Lecture 9 | None | Cancelled
April 20 | Review and preview, content and context | In-person lecture
April 22 | Farahat-Higman theorem | In-person lecture
April 24 | Simple branching | In-person lecture
April 27 | Centralizers and multiplicity-free branching | In-person lecture
April 29 | Multiplicity-free branching for symmetric groups | In-person lecture
May 1 | GT-basis and GT-algebra | In-person lecture
May 4 |
May 6 |
May 8 |
May 11 |
May 13 |
May 15 |
May 18 |
May 20 |
May 22 |
May 27 |
May 29 |
June 1 |
June 3 |
June 5 |
Schedule subject to change

Math 202C: Lecture 3

*** All problems in this lecture due April 12 at 23:59 ***

We return to the general setting of a finite, connected, simple graph G=(S,T), and as before \mathcal{C}(S) is the Hilbert space with orthonormal basis |\pi\rangle, |\rho\rangle,|\sigma\rangle,\dots. In Lecture 1 we introduced the metric level set operators L_0,L_1,L_2,\dots in \mathrm{End}\mathcal{C}(S) which act on vertices according to

L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle.

We argued that these operators are quite fundamental: L_0=I is the identity operator, L_1=K is the adjacency operator,

D = \sum\limits_{r=0}^\infty rL_r

is the distance operator,

\Omega_q=\sum\limits_{r=0}^\infty q^r L_r

is the exponential distance operator. However, the selfadjoint operators L_0,L_1,L_2,\dots and the subalgebra of \mathrm{End}\mathcal{C}(S) they generate are not easy to understand.

Problem 3.1. Show that \langle \sigma|L_sL_r|\rho\rangle is the cardinality of the intersection of the sphere of radius r centered at \rho with the sphere of radius s centered at \sigma.

So, the matrix elements of the commutator

[L_s,L_r]=L_sL_r - L_rL_s

are differences of intersection numbers,

\langle \sigma | [L_s,L_r] |\rho\rangle = |S_\rho(r) \cap S_\sigma(s)|-|S_\rho(s) \cap S_\sigma(r)|,

and if it seems difficult to characterize graphs for which all such differences are zero, that’s because it is. There is one much-studied class of graphs which does have the property that metric level set operators L_0,L_1,L_2,\dots commute.

Problem 3.2. Show that if G is a distance regular graph, then [L_r,L_s]=0.

The following is the only restricted situation in which I personally know a simple combinatorial characterization of commutativity of the metric level set operators (it is likely that an actual graph theorist would know more).

Problem 3.3. Suppose G has diameter 2. Prove that the metric level set operators on G commute if and only if G is a regular graph. (Hint: the basic reason behind this is that in a graph of diameter 2, the metric does not contain any more information than the adjacency relation).

I am torturing you (us, really) with these combinatorial considerations because I want to further hammer home the point that Cayley graphs are much easier to understand than general graphs due to the underlying group structure, which is reflected in the behavior of linear operators acting on the vertices of the graph.

Suppose that S=\{\pi,\rho,\sigma,\dots\} is a finite group, so that \mathcal{C}(S) is the convolution algebra of this group. For an arbitrary subset A \subseteq S, we write

|A\rangle=\sum\limits_{\pi \in A}|\pi\rangle,

which is simply the indicator function of the set A. It is a very convenient abuse of notation to also simply write A for the image of |A\rangle in the right regular representation, i.e. we identify the set A \subseteq S with the operator A \in \mathrm{End}\mathcal{C}(S) defined by

A|\rho\rangle = |\rho\rangle|A\rangle=\sum\limits_{\pi \in A} |\rho\pi\rangle.

Recall from Math 202B that we have a tracial state

\langle \cdot \rangle \colon \mathcal{C}(S) \longrightarrow \mathbb{C}

on the convolution algebra of S defined by

\langle A \rangle := \langle \iota |A\rangle,

where \iota \in S is the group identity. In other words, \langle A \rangle is the coefficient of the vector |A\rangle in the group basis |\pi\rangle, which is the same thing as the value of the function A at the point \pi=\iota. Here is a second under-appreciated characterization of this important functional, which is often referred to as the canonical group trace.

Problem 3.4. Prove that all diagonal matrix elements \langle \pi | A |\pi\rangle of the image A \in \mathrm{End}\mathcal{C}(S) of |A\rangle\in \mathcal{C}(S) in the regular representation are equal to the same number, and that that number is \langle A \rangle. Hence conclude (as was already shown in Math 202B) that \langle A \rangle =\frac{1}{|S|}\mathrm{Tr}\, A.

Now to Cayley graphs. We continue with a finite group S and select inside it a symmetric set of generators T, giving us the Cayley graph G = \mathrm{Cay}(S,T). As discussed above, we may also view T \in \mathrm{End}\mathcal{C}(S) as the adjacency operator on G.

Problem 3.5. Show that \langle T^r\rangle is the number of length r loops based at the identity element (or any other specified point) in S.

Now specialize to the case where S=\mathrm{Aut}\{1,\dots,n\} is the symmetric group and T is the full conjugacy class of transpositions. The metric level set operators L_0,L_1,L_2,\dots \in \mathrm{End}\mathcal{C}(S) commute for a very simple reason: they are the images of the vectors

|L_r \rangle = \sum\limits_{\substack{\pi \in S \\ |\pi|=r}} |\pi\rangle, \quad r=0,1,2,\dots,

and these vectors lie in the center of the convolution algebra \mathcal{C}(S). Indeed, the center of \mathcal{C}(S) is spanned by the conjugacy classes

|C_\alpha\rangle = \sum\limits_{\pi \in C_\alpha} |\pi\rangle, \quad \alpha \in \Lambda,

and

|L_r\rangle = \sum\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}} |C_\alpha\rangle

is simply the sum of all conjugacy classes corresponding to partitions of n with n-r parts.

The fact that the level set operators L_0,L_1,L_2,\dots on a graph G commute is equivalent to saying that they are simultaneously diagonalizable. However, in the group-theoretic setting this statement can be refined. Let us stick with the case where S=\mathrm{Aut}\{1,\dots,n\} is the symmetric group, noting that what follows is true for the Cayley graph G=\mathrm{Cay}(S,T) of any finite group S with generating set T \subseteq S such that |T\rangle is a central element in the convolution algebra \mathcal{C}(S). Let \Lambda be a set indexing the conjugacy classes in S, so for the symmetric group \Lambda is the set of partitions of n. Then \Lambda also indexes isomorphism classes of irreducible unitary representations of S, or equivalently irreducible linear representations of \mathcal{C}(S). For each \lambda \in \Lambda, choose a representative (V^\lambda,\Phi^\lambda) of the corresponding isomorphism class. By Schur’s Lemma, any central element |A\rangle \in \mathcal{C}(S) acts in each irrep (V^\lambda,\Phi^\lambda) as a scalar operator: we have

\Phi^\lambda|A\rangle = \widehat{A}(\lambda) I_{V^\lambda}, \quad \widehat{A}(\lambda) \in \mathbb{C}.

Let us apply this to the adjacency operator on the all-transpositions Cayley graph G=\mathrm{Cay}(S,T) of the symmetric group, which is the image of the central element

|T\rangle = \sum\limits_{\tau \in T} |\tau\rangle= \sum\limits_{1 \leq s<t \leq n} |s\ t\rangle.

The eigenvalues of the adjacency operator T \in \mathrm{End}\mathcal{C}(S) are the numbers

\widehat{T}(\lambda) = {n \choose 2} \frac{\chi^\lambda(\tau)}{\dim V^\lambda}, \quad \lambda \in \Lambda,

where \tau \in T is any transposition. Moreover, the multiplicity of \widehat{T}(\lambda) is (\dim V^\lambda)^2, because the isotypic decomposition of the regular representation is

\mathcal{C}(S) = \bigoplus\limits_{\lambda \in \Lambda} (\dim V^\lambda)V^\lambda.

Thus, in a sense, we completely know the spectrum of the adjacency operator on the all-transpositions Cayley graph of the symmetric group. However, to actually compute these numbers explicitly, we need to know both the dimension

\dim V^\lambda = \chi^\lambda(\iota) = \mathrm{Tr}\varphi^\lambda(\iota)

of every irreducible unitary representation (V^\lambda,\varphi^\lambda) of S, as well as the character

\chi^\lambda(\tau) = \mathrm{Tr} \varphi^\lambda(\tau)

of a transposition in each representation.

This highlights the fact that we really only know the irreducible unitary representations (V^\lambda,\varphi^\lambda) of S=\mathrm{Aut}\{1,\dots,n\} abstractly. In particular, while we know that the irreducible representations of S are (up to isomorphism) in bijection with the conjugacy classes in S, we have not explicitly selected such a bijection and it is not at all clear how best to do so.

Nevertheless, there are a few things we can say concretely at this stage. First, as with every group, S has the trivial representation (V^\mathsf{triv},\varphi^\mathsf{triv}) in which V^\mathsf{triv} is a one-dimensional Hilbert space and \varphi^\mathsf{triv} sends every \sigma \in S to the identity operator on this space. For this representation, we have

\widehat{T}^\mathsf{triv}={n \choose 2} \frac{\chi^\mathsf{triv}(\tau)}{\dim V^\mathsf{triv}}= {n \choose 2},

and we have reproduced the fact that the degree of the regular graph G=\mathrm{Cay}(S,T) is an eigenvalue of its adjacency operator.

Second, the symmetric group S has a second one-dimensional representation (V^\mathsf{alt},\varphi^\mathsf{alt}) in which

\varphi^\mathsf{alt}(\sigma) = (-1)^{|\sigma|}I.

This gives the eigenvalue

\widehat{T}^\mathsf{alt}={n \choose 2} \frac{\chi^\mathsf{alt}(\tau)}{\dim V^\mathsf{alt}}=- {n \choose 2},

which could also be deduced from the fact that the spectrum of the adjacency operator on a bipartite graph is symmetric about zero.
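Both eigenvalues can be confirmed directly for small n by applying the adjacency operator to the vectors spanning the trivial and alternating representations. A Python sketch for n=4 (permutations as 0-based tuples; a sanity check, not a derivation):

```python
from itertools import permutations
from math import comb

# Verify the eigenvalues +C(n,2) and -C(n,2) of the adjacency operator of
# Cay(S_n, T) on the trivial and alternating vectors, for n = 4.
n = 4
perms = list(permutations(range(n)))

def compose(p, q):
    # (pq)(i) = p(q(i)); permutations stored as tuples of images of 0..n-1
    return tuple(p[q[i]] for i in range(n))

def sign(p):
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

# all transpositions: the permutations moving exactly two points
T = [t for t in perms if sum(1 for i in range(n) if t[i] != i) == 2]

def apply_T(v):
    # image of the vector v under the adjacency operator, via T|rho> = sum_t |rho t>
    w = {p: 0 for p in perms}
    for p in perms:
        for t in T:
            w[compose(p, t)] += v[p]
    return w

ones = {p: 1 for p in perms}        # spans the trivial representation
alt = {p: sign(p) for p in perms}   # spans the alternating representation

assert apply_T(ones) == {p: comb(n, 2) for p in perms}
assert apply_T(alt) == {p: -comb(n, 2) * sign(p) for p in perms}
```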

We can compute a third eigenvalue of the adjacency operator on the all-transpositions Cayley graph which is not directly accessible without representation-theoretic methods. To do so, we start with the permutation representation (V^\mathsf{perm},\varphi^\mathsf{perm}), in which V^\mathsf{perm} is the Hilbert space with orthonormal basis |1\rangle,\dots,|n\rangle and \varphi^\mathsf{perm}(\sigma)|i\rangle=|\sigma(i)\rangle. The character of this representation is

\chi^\mathsf{perm}(\sigma) = \sum\limits_{i=1}^n \langle i | \varphi^\mathsf{perm}(\sigma) | i\rangle = \sum\limits_{i=1}^n \langle i|\sigma(i)\rangle=\mathsf{fix}(\sigma),

the number of fixed points of the permutation \sigma. However, the permutation representation is not irreducible, as it has a one-dimensional invariant subspace spanned by the vector

|w\rangle = |1\rangle + \dots + |n\rangle.

The orthogonal complement of this vector gives an (n-1)-dimensional representation (V^\mathsf{std},\varphi^\mathsf{std}) of S called the standard representation, whose character satisfies

\chi^\mathsf{std}+\chi^\mathsf{triv}=\chi^\mathsf{perm}

and is therefore given by

\chi^\mathsf{std}(\sigma)=\mathsf{fix}(\sigma)-1.

Problem 3.6. Prove that the standard representation of S=\mathrm{Aut}\{1,\dots,n\} is irreducible, and give a third eigenvalue of the adjacency operator on the all-transpositions Cayley graph G=\mathrm{Cay}(S,T).

Math 202C: Lecture 2

*** Problems in this lecture due April 10 at 23:59 ***

In this lecture, we remain in the setting of the symmetric group S=\mathrm{Aut}\{1,\dots,n\} but we will change the geometry on this group. More precisely, instead of identifying S with its Cayley graph as generated by the set of adjacent transpositions, we will identify the symmetric group with its Cayley graph as generated by the full conjugacy class of transpositions,

T = \{(s\ t) \in S \colon 1 \leq s < t \leq n\}.

Let \mathrm{d} denote the corresponding metric on S, so that \mathrm{d}(\rho,\sigma) is the minimal number r such that there exist transpositions \tau_1,\dots,\tau_r \in T with

\sigma = \rho\tau_1 \dots \tau_r.

As before, write |\pi|=\mathrm{d}(\iota,\pi) for the distance from the identity permutation \iota to a given permutation \pi.

Problem 2.1. Prove that |\pi|=n-\mathrm{cyc}(\pi), where \mathrm{cyc}(\pi) is the number of factors in the disjoint cycle decomposition of \pi.

Identifying S with \mathrm{Cay}(S,T), the symmetric group becomes a graded graph: it decomposes into a disjoint union

S = L_0 \sqcup L_1 \sqcup \dots \sqcup L_{n-1}

in which

L_r = \{\pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathsf{cyc}(\pi)=n-r\}

is the sphere of radius r centered at \iota, and L_r is an independent set, meaning that no two of its points are adjacent vertices. Moreover, if \rho,\sigma \in S are adjacent vertices, then we have \rho \in L_r and \sigma \in L_s with |r-s|=1. The decomposition of a group into concentric spheres is a general thing that happens whenever one chooses a generating set; a special feature of this particular situation is that each sphere further decomposes into a disjoint union of conjugacy classes,

L_r = \bigsqcup\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}}C_\alpha.

Here, we are indexing the conjugacy classes of S by the set \Lambda of partitions of n. More precisely, the notation \alpha \vdash n means that \alpha is a decomposition of n into a sum of \ell(\alpha) positive integers, without regard to order of the terms. Every permutation \pi \in S gives a partition \mathsf{type}(\pi)\vdash n obtained from the lengths of the cycles in its disjoint cycle decomposition; conversely, for any partition \alpha \vdash n the corresponding fiber C_\alpha =\mathsf{type}^{-1}(\alpha) is a conjugacy class in S.

Let us now construct the analogue of Mallows measure corresponding to the all-transpositions geometry. This is called the Ewens measure on permutations: by definition, it is the Gibbs measure

\mathbb{P}(\sigma) = \frac{q^{|\sigma|}}{Z} = \frac{q^{n-\mathsf{cyc}(\sigma)}}{Z}

corresponding to a statistical mechanical system whose states \sigma are permutations of 1,\dots,n with the potential energy of \sigma being inversely proportional to its number of cycles. The partition function is

Z= \sum\limits_{\sigma \in S} q^{|\sigma|} = q^n\sum\limits_{\sigma \in S} q^{-\mathsf{cyc}(\sigma)},

or equivalently

Z=q^n\sum\limits_{k=0}^n \begin{bmatrix} n \\ k \end{bmatrix} q^{-k},

where \begin{bmatrix} n \\ k \end{bmatrix} is the Stirling number of the first kind, which counts permutations in S=\mathrm{Aut}\{1,\dots,n\} whose disjoint cycle factorization has k components. This in turn gives

Z= (1+q)(1+2q) \dots (1+(n-1)q),

corresponding to the fact that the Stirling numbers of the first kind are the coefficients of the rising factorial polynomial x(x+1)\dots(x+n-1).

Problem 2.2. Show that the expected number of cycles in a random permutation under Ewens measure is

1 + \sum\limits_{j=1}^{n-1} \frac{1}{1+jq}.

Deduce from this that the expected number of cycles in a uniformly random permutation of 1,\dots,n is the harmonic number

H_n=1+\frac{1}{2} + \dots + \frac{1}{n}.
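Both the product formula for Z and the expectation formula above can be checked by brute force for small n. A Python sketch using exact rational arithmetic (a numerical confirmation for n=5 and a few values of q, not a substitute for solving the problem):

```python
from itertools import permutations
from fractions import Fraction
from math import prod

n = 5

def cyc(p):
    # number of cycles in the disjoint cycle decomposition of p
    seen, count = set(), 0
    for i in range(n):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return count

for q in (Fraction(1), Fraction(1, 2), Fraction(2)):
    # partition function: brute force vs the product formula
    Z = sum(q ** (n - cyc(p)) for p in permutations(range(n)))
    assert Z == prod(1 + j * q for j in range(1, n))
    # expected number of cycles vs the stated formula
    mean = sum(cyc(p) * q ** (n - cyc(p)) for p in permutations(range(n))) / Z
    assert mean == 1 + sum(Fraction(1) / (1 + j * q) for j in range(1, n))
```

At q=1 the measure is uniform and the last assertion reduces to the harmonic number H_5 = 137/60.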

The all-transpositions geometry on permutations has many features which make it easier to understand than the adjacent-transpositions geometry. Perhaps the most significant of these is that it is “Euclideanizable.” Right, that’s not a real word – here is what is meant more precisely.

Consider the full forward cycle \gamma=(1\ 2\ \dots\ n) in the symmetric group S. In the all-transpositions geometry, we have |\gamma|=n-1, and the number of geodesics \iota \to \gamma is \mathrm{Cay}_{n-1}, where

\mathrm{Cay}_n = (n+1)^{n-1}

is the nth Cayley number. Taking this formula for granted, we can calculate the number of geodesics from the identity to any permutation (hence the number of geodesics between any two permutations) as follows.

Problem 2.3. Let \pi be any permutation in the conjugacy class C_\alpha of S. Show that the number of geodesics \iota \to \pi is

{n-\ell \choose \alpha_1-1,\dots,\alpha_\ell-1} \prod\limits_{k=1}^\ell \mathrm{Cay}_{\alpha_k-1},

where \ell=\ell(\alpha).

To Euclideanize the all-transpositions geometry on the symmetric group, we give a rule which allows one to systematically select a distinguished geodesic between every pair of points. To do so, let us label the edges of \mathrm{Cay}(S,T) as follows: we mark each edge corresponding to the transposition (s\ t) with t, the larger of the two numbers swapped. Thus, emanating from every vertex of the graph we have one 2-edge, two 3-edges, three 4-edges, etc. Call a walk on the Cayley graph strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. Thus, an r-step strictly monotone walk \rho \to \sigma corresponds to an equation of the form

\sigma=\rho(s_1\ t_1) \dots (s_r\ t_r), \quad 2\leq t_1 < \dots < t_r \leq n

in S.

Problem 2.4. Prove that for any \pi \in S, there exists a unique strictly monotone walk \iota \to \pi, and that the length of this walk is |\pi|. Conclude that there exists a unique strictly monotone walk between every pair of points in S, and that this walk is a geodesic.

This more or less marks the end of my efforts to get you thinking about the combinatorics of permutations; after this we will move on to the algebraic aspects of the symmetric group and its representations. The campaign for the elementary beauty and unexpected subtlety of permutations met with mixed success: there were interesting follow-up discussions with some students, whereas other pairs of eyes glazed over as the minds behind them fell into the "I already know this stuff" trap, which is one of the more pernicious lies you can tell yourself.

Here is one final push. Our ever-curious Lani was recently asking about posets, and whether there is any active research being conducted on them these days. The answer is definitely “yes,” and though I don’t really work in the area I do have one burning question about posets which keeps me up at night. The question is about finding a partial order on permutations of a fixed genus.

From our Euclideanized perspective, genus is a metric invariant which measures the "flatness" of triangles in S, viewed as generated by the full conjugacy class of transpositions. More precisely, let \pi,\rho,\sigma be three distinct permutations in S=\mathrm{Aut}\{1,\dots,n\}. There are two oriented strictly monotone triangles formed by these three points, namely \rho \to \pi \to \sigma \to \rho and \sigma \to \pi \to \rho \to \sigma. It may happen that \pi lies on a geodesic \rho \to \sigma, in which case both of these triangles are flat, and have perimeter

|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma|.

In general, the oriented triangle \rho \to \pi \to \sigma \to \rho is a cycle in a bipartite graph, so necessarily has even perimeter: we have

|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma| + 2g(\rho,\pi,\sigma)

for some nonnegative integer g(\rho,\pi,\sigma) which we call the genus of the triple (\rho,\pi,\sigma). At the other extreme from flatness, it may be that \pi,\rho,\sigma are pairwise antipodal points, i.e. the distance between each pair of them is the diameter of S, in which case the above equation becomes

3(n-1)=2(n-1)+2g(\rho,\pi,\sigma),

giving a maximal genus of

g(\pi,\rho,\sigma) = \lfloor \frac{n-1}{2} \rfloor.

Problem 2.5. Show that a triple of pairwise antipodal points in S exists if and only if n is odd.

Since

g(\rho,\pi,\sigma)=g(\iota,\rho^{-1}\pi,\rho^{-1}\sigma),

it is natural to view genus as a function g(\pi,\sigma):=g(\iota,\pi,\sigma) on pairs of permutations. Going even further, let us fix the second argument to be \sigma=\gamma, where \gamma=(1\ 2\ \dots\ n) is the full forward cycle, and define the genus of a permutation by g(\pi):=g(\pi,\gamma). Then, we obtain a corresponding decomposition of the symmetric group

S = \bigsqcup\limits_{g=0}^{\lfloor \frac{n-1}{2} \rfloor} S_g

in which S_g is the set of permutations of genus g. The set S_0 of "planar" permutations consists precisely of those \pi which lie on a geodesic \iota \to \gamma, and these permutations have a special feature: the disjoint cycle decomposition of any \pi \in S_0 is a noncrossing partition of \{1,\dots,n\}, and in fact this map from genus zero permutations to noncrossing partitions is bijective. There is also a very natural partial order on S_0 which can be described as follows: we have \pi \leq \sigma if and only if \pi and \sigma lie on a common geodesic \iota \to \gamma, and a particle emanating from \iota and traveling along this geodesic passes through \pi before passing through \sigma. This is a special case of the geodesic partial order which can be defined relative to any two vertices of a connected graph, and in this case it coincides exactly with the reverse refinement order on noncrossing partitions.
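The genus decomposition is easy to tabulate for small n: since genus-zero permutations correspond to noncrossing partitions, there should be exactly the Catalan number of them. A Python sketch computing the genus of every permutation in S_4 directly from the defining perimeter equation (0-based tuples; the helper names are mine):

```python
from itertools import permutations
from math import comb

n = 4
gamma = tuple(list(range(1, n)) + [0])  # full forward cycle (1 2 ... n), 0-based

def compose(p, q):
    return tuple(p[q[i]] for i in range(n))

def inverse(p):
    inv = [0] * n
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def cyc(p):
    seen, count = set(), 0
    for i in range(n):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return count

def norm(p):
    return n - cyc(p)

def genus(p):
    # from |pi| + |pi^{-1} gamma| = (n - 1) + 2 g(pi)
    return (norm(p) + norm(compose(inverse(p), gamma)) - (n - 1)) // 2

counts = {}
for p in permutations(range(n)):
    g = genus(p)
    counts[g] = counts.get(g, 0) + 1

catalan = comb(2 * n, n) // (n + 1)
assert counts[0] == catalan          # genus-zero permutations <-> noncrossing partitions
assert max(counts) == (n - 1) // 2   # the maximal genus is attained
```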

Now the question is: what is the higher genus analogue of this partial order? How can we define a poset structure by some uniform recipe which feels good for all values of g, subject to the condition that our recipe reverts to the above at g=0? This is an open question. The most natural thing that I can think of is the following: given two permutations \pi,\sigma \in S_g, we declare \pi \leq \sigma if and only if they lie on a common genus g triangle two of whose vertices are \iota and \gamma, and a particle emanating from \iota and traveling along the perimeter of this triangle encounters \pi before encountering \sigma. Whether or not this actually defines a partial order on S_g is not clear to me, but my fluid intelligence is declining from its peak while yours is cresting. So, if you want to discuss further please feel free to get in touch.

Math 202B: Revenge of the Centralizer

Yesterday during office hours I had the unsettling experience of being shown that something I assumed to be true is, in fact, subtly incorrect. That’s what happens when you take things for granite. My sloppy thinking was pointed out to me by Amelia and Eliza in the context of the following problem which appeared on the Math 202B Exam: prove that if \mathcal{B} is a maximal abelian subalgebra of an algebra \mathcal{A}, then the centralizer Z(\mathcal{B},\mathcal{A}) of \mathcal{B} in \mathcal{A} equals \mathcal{B}.

This was supposed to be easy. Since \mathcal{B} is abelian, it is certainly contained in its centralizer Z(\mathcal{B},\mathcal{A}). Now, if the containment is proper, then there exists an element C \in Z(\mathcal{B},\mathcal{A}) which does not lie in \mathcal{B}. Then,

\mathcal{C}=\mathrm{CAlg}(\mathcal{B} \sqcup \{C\}),

the commutative algebra generated by \mathcal{B} with C adjoined, is a commutative algebra properly containing \mathcal{B}, contradicting the fact that \mathcal{B} is a MASA.

The subtle issue here is that \mathcal{C} is by definition the intersection of all commutative subalgebras of \mathcal{A} containing the set \mathcal{B} \sqcup \{C\}, and this could conceivably be empty. To be concrete, imagine a single non-normal matrix C. There is no commutative subalgebra of the full matrix algebra containing C, since our definition of “algebra” requires *-closure, and by hypothesis C does not commute with its adjoint.

Luckily, our attempted solution can be salvaged: as Evan immediately recognized, if we can show that Z(\mathcal{B},\mathcal{A}) contains a *selfadjoint* element D not contained in \mathcal{B} this issue goes away. This is the case, since we can take D to be either the real or imaginary part of C — at least one of these does not lie in \mathcal{B}, since if they both did so would C.

Hopefully I’m now getting this right, and my apologies for being such a bad professor. Try reading some of my papers, I swear they’re really good.

In any case, we will have to incorporate this correction into our census of reality. In particular, if this problem were to appear on the qual, a proposed solution which did not address this issue would no longer pass muster. Live and learn.

Math 202C: Lecture 1

***Problems in this lecture due 23:59 on April 8***

Let G=(S,T) be a finite, connected, simple graph with vertex set S=\{\pi,\rho,\sigma,\dots\} and edge set T \subseteq {S \choose 2}. A basic (meaning fundamental, not necessarily easy) problem is to compute the number W^r(\rho,\sigma) of r-step walks from \rho to \sigma in G.

Problem 1.1. Let G be the complete graph on S=\{1,\dots,N\}, whose edge set is T={S \choose 2}. Using nothing but your brain, find a formula for W^r(\rho,\sigma). If you find this problem intractable or, worse, uninteresting, consider a degree in law or medicine.

Math 202A provides an algebraic encoding of graphs. Let \mathcal{C}(S) be the Hilbert space with orthonormal basis |\rho\rangle,|\pi\rangle,|\sigma\rangle,\dots. Let \langle \rho|,\langle \pi|,\langle \sigma|,\dots be the dual basis of \mathrm{Hom}(\mathcal{C}(S),\mathbb{C}), so that

\langle \rho | \sigma \rangle = \langle \rho,\sigma \rangle = \delta_{\rho\sigma}.

Does using Dirac notation make this an applied math course? The question is vacuous since there is no such thing as non-applied math, but there are cultural differences between fields which are reflected in their iconography, and hopefully you can at least read a paper in quantum computing at the end of this sequence.

The adjacency operator K\in \mathrm{End}\mathcal{C}(S) acts on the vertex basis of \mathcal{C}(S) by creating a flat superposition of all vertices adjacent to a given input,

K|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \{\rho,\sigma\} \in T}} |\sigma\rangle.

Equivalently, the matrix elements of K in the vertex basis are

\langle \sigma | K |\rho\rangle=\begin{cases} 1, \text{ if } \{\rho,\sigma\} \in T\\ 0, \text{ if } \{\rho,\sigma\} \not\in T\end{cases}.

Thus, the selfadjoint operator K \in \mathrm{End}\mathcal{C}(S) encodes the edge set T \subseteq {S\choose 2}.

For example, consider the complete graph G=(S,T) on the vertex set S=\{1,\dots,N\}. The computational basis in \mathcal{C}(S) consists of the vectors

|1\rangle, |2\rangle,\dots,|N\rangle,

upon which K acts according to

K| i\rangle = \sum\limits_{j \neq i} |j\rangle.

Problem 1.2. Show that W^r(\rho,\sigma) = \langle \rho| K^r|\sigma\rangle. In particular, show that the trace of K^r equals the total number of loops (i.e. closed walks) of length r in G.
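
The identity in Problem 1.2 is easy to sanity-check numerically (my own illustration, with K_4 and r=3 as arbitrary choices): count r-step walks by brute-force enumeration, and compare with the corresponding entry of K^r.

```python
import numpy as np
from itertools import product

N, r = 4, 3
K = np.ones((N, N), dtype=int) - np.eye(N, dtype=int)  # adjacency matrix of K_N

def count_walks(K, start, end, r):
    # Brute force: enumerate all candidate r-step walks from start to end
    # and keep those whose consecutive vertices are adjacent.
    n = K.shape[0]
    total = 0
    for path in product(range(n), repeat=r - 1):
        verts = (start, *path, end)
        if all(K[verts[i], verts[i + 1]] for i in range(r)):
            total += 1
    return total

brute = count_walks(K, 0, 1, r)
matrix_power = np.linalg.matrix_power(K, r)[0, 1]
print(brute == matrix_power)  # True: the two counts agree
```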

The linear algebra approach to counting walks in graphs is to find a basis of \mathcal{C}(S) on which K acts diagonally.

Problem 1.3. Implement this approach for the complete graph and recover your barehands solution to Problem 1.1.

The mathematical subject which uses eigenvalues of linear operators to analyze graphs is called spectral graph theory. In particular, the spectrum of the adjacency operator K on a given graph G=(S,T) is called the adjacency spectrum of G. A couple of facts worth mentioning: if G is bipartite then the spectrum of K is symmetric about 0; if G is degree-regular then the largest eigenvalue of K is the degree of any vertex.
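
Both stated facts can be checked on a small example (my own, not from the lecture): the 6-cycle is simultaneously bipartite and 2-regular, so its adjacency spectrum should be symmetric about 0 with largest eigenvalue 2.

```python
import numpy as np

# Adjacency matrix of the 6-cycle, which is bipartite and 2-regular.
n = 6
K = np.zeros((n, n), dtype=int)
for i in range(n):
    K[i, (i + 1) % n] = K[(i + 1) % n, i] = 1

eigs = np.sort(np.linalg.eigvalsh(K))
print(np.allclose(eigs, -eigs[::-1]))  # spectrum symmetric about 0 (bipartite)
print(np.isclose(eigs[-1], 2))         # largest eigenvalue equals the degree
```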

Instead of asking how many walks there are between two given vertices, we may ask how far apart they are. This information is encoded in the distance operator D \in \mathrm{End}\mathcal{C}(S), whose matrix elements in the vertex basis tabulate geodesic distance in G,

\langle \sigma | D | \rho\rangle = \mathrm{d}(\rho,\sigma).

Ron Graham made the striking discovery that the determinant of the distance operator on a tree depends only on the cardinality of its vertex set. Distance eigenvalues of graphs have been much studied ever since. I found some recent interesting work on distance operators via Johann Birnick’s website; Johann is the TA for this course.

One can in a sense interpolate between K and D. To do so, we consider operators L_0,L_1,L_2,\dots \in \mathrm{End}\mathcal{C}(S) corresponding to metric level sets in G=(S,T), meaning that

\langle \sigma | L_r |\rho \rangle = \begin{cases} 1, \text{ if }\mathrm{d}(\rho,\sigma)=r \\ 0, \text{ otherwise}\end{cases}.

Equivalently,

L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle

is the sum of all points on the sphere of radius r centered at \rho. We have that L_0=I is the identity operator, L_1 = K is the adjacency operator, and

D=\sum\limits_{r=0}^\infty rL_r.

This is in fact a finite sum, since L_r is the zero operator for all r exceeding the diameter of G.
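
The decomposition D=\sum_r rL_r is easy to verify computationally (my own sketch, using the path graph on 4 vertices as an arbitrary test case): compute geodesic distances by breadth-first search, slice them into level set matrices, and reassemble.

```python
import numpy as np
from collections import deque

# Path graph on 4 vertices; its diameter is 3.
n = 4
edges = [(0, 1), (1, 2), (2, 3)]
adj = {v: [] for v in range(n)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

def bfs_dist(src):
    # Geodesic distances from src via breadth-first search.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

d = np.array([[bfs_dist(i)[j] for j in range(n)] for i in range(n)])

# Level set operators: L_r has a 1 exactly where the distance equals r.
L = [(d == r).astype(int) for r in range(d.max() + 1)]
D = sum(r * L[r] for r in range(len(L)))
print(np.array_equal(D, d))  # True: D = sum_r r * L_r recovers the distances
```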

One can also consider the exponential adjacency operator

\Psi_t= \sum\limits_{r=0}^\infty \frac{t^r}{r!} K^r =: e^{tK}

on G, whose matrix elements are exponential generating functions enumerating walks between vertices by length,

\langle \sigma | \Psi_t |\rho \rangle = \sum\limits_{r=0}^\infty \frac{t^r}{r!} W^r(\rho,\sigma).

For regular graphs this is essentially the heat kernel, which has been much studied. The exponential distance operator,

\Omega_q = \sum\limits_{r=0}^\infty q^r L_r,

is the element-wise exponential of the distance operator,

\langle \sigma | \Omega_q |\rho\rangle= q^{\mathrm{d}(\rho,\sigma)},

and unlike \Psi_t it is a relative newcomer in spectral graph theory.

Looking at the power series which define \Psi_t and \Omega_q, one sees that the level set operators L_0,L_1,L_2,\dots are to \Omega_q as the powers K^0,K^1,K^2,\dots of the adjacency operator are to \Psi_t. While the first two elements of these sequences are the same, a big flaw in this symbolic analogy is that the subalgebra of \mathrm{End}\mathcal{C}(S) generated by the (finite) set of selfadjoint operators \{L_0,L_1,L_2,\dots\} is highly non-trivial, whereas the subalgebra of \mathrm{End}\mathcal{C}(S) generated by the single selfadjoint operator K consists simply of polynomials in K, which is but a small sector of \mathrm{Alg}(L_0,L_1,L_2,L_3,\dots). In particular, it is not at all clear what relations exist between L_r and L_s, or even whether they commute. We will have more to say about this later.

The exponential distance operator \Omega_q encodes the same information as the distance operator D in a manner which interfaces more readily with the formalism of statistical mechanics. More precisely, let us write q=e^{-\beta} so that

\langle \sigma | \Omega_q | \rho \rangle = e^{-\beta \mathrm{d}(\rho,\sigma)}.

Then, fixing a root vertex \rho, the \rho-th row of \Omega_q defines a Gibbs measure (after Josiah, not Amelia) on the vertices of G according to

\mathbb{P}(\sigma) = \frac{1}{Z}e^{-\beta |\sigma|}.

Physically, this distribution corresponds to a thermodynamic ensemble whose states are the vertices of G, with \beta>0 the inverse temperature and the potential energy |\sigma| of a state \sigma being equal to its distance |\sigma|:=\mathrm{d}(\rho,\sigma) to the root vertex \rho in G. The partition function

Z= \sum\limits_\sigma q^{|\sigma|} = \sum\limits_{\sigma} e^{-\beta\mathrm{d}(\rho,\sigma)}

sums the Boltzmann weights over all states, ensuring that \mathbb{P} is a probability measure on S. Its logarithm \log Z is the free energy of our ensemble, and

q \frac{\partial}{\partial q} \log Z = \frac{1}{Z} \sum\limits_{\sigma} |\sigma|q^{|\sigma|}=\sum\limits_\sigma |\sigma| \mathbb{P}(\sigma)

is precisely the average distance to the root vertex \rho under the non-uniform distribution \mathbb{P}, which becomes the uniform distribution on states in the high-temperature limit \beta \to 0, where q=e^{-\beta} \to 1.

Let us try this out on a toy example, the complete graph G=(S,T) on S=\{1,\dots,N\}. Take the root vertex to be 1, so that the potential energy of any other vertex \sigma \neq 1 is

|\sigma| = \mathrm{d}(1,\sigma)=1.

Thus, under the corresponding Gibbs measure on states S=\{1,\dots,N\} we have

\mathbb{P}(1) = \frac{1}{Z}

and

\mathbb{P}(\sigma) = \frac{q}{Z}, \quad \sigma \neq 1,

where the partition function is

Z= q^0+q^1+ \dots + q^1 = 1+(N-1)q.

We thus have

q\frac{d}{dq} \log Z = \frac{(N-1)q}{1+(N-1)q},

and setting q=1 we recover the obvious fact that the average distance to 1 under the uniform measure on the complete graph is \frac{N-1}{N}. Indeed, for the uniform measure on the complete graph the random variable \sigma \mapsto |\sigma| has a Bernoulli distribution on \{0,1\} which assigns probability \frac{1}{N} to 0 and probability \frac{N-1}{N} to 1.
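
This toy computation is also a nice check on the general formalism; the following sketch (with N=5 and q=0.3 as my own arbitrary choices) compares the average distance computed directly from the Gibbs weights with the logarithmic derivative formula.

```python
# Gibbs weights on the complete graph: 1 for the root, q for each other vertex.
N, q = 5, 0.3
Z = 1 + (N - 1) * q
probs = [1 / Z] + [q / Z] * (N - 1)

# Average distance to the root, computed directly from the distribution...
avg_direct = sum(p * d for p, d in zip(probs, [0] + [1] * (N - 1)))
# ...and via the logarithmic derivative q (d/dq) log Z = (N-1)q / (1+(N-1)q).
avg_formula = (N - 1) * q / (1 + (N - 1) * q)
print(abs(avg_direct - avg_formula) < 1e-12)  # True
```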

Linear operators on graphs fall under the broad umbrella of Math 202A, which treated linear maps between finite-dimensional Hilbert spaces. Math 202B gives us additional algebraic tools which apply to graphs whose vertices form a group; these are officially known as Cayley graphs.

Let S be a finite group and T \subset S a symmetric set of generators of S. Symmetric means that \tau \in T implies \tau^{-1} \in T, and saying that T generates S means that for any \sigma \in S there exist r \in \mathbb{N}_0 and \tau_1,\dots, \tau_r \in T such that \sigma=\tau_1\tau_2 \dots \tau_r. The Cayley graph \mathrm{Cay}(S,T) has vertex set S, with \{\rho,\sigma\} an edge if and only if \sigma = \rho\tau for some \tau \in T, which is then necessarily unique. The fact that T is symmetric means that this is well-defined as an undirected graph without multiple edges, and the fact that T generates S means that the graph is connected. If T does not contain the identity element \iota then no vertex is adjacent to itself, and we henceforth assume this to be the case. We see that for Cayley graphs G=\mathrm{Cay}(S,T), the edge set can effectively be viewed as a subset of the vertex set, since the generating set T \subseteq S determines the adjacency relation on vertices.

For a Cayley graph G=\mathrm{Cay}(S,T), the space \mathcal{C}(S) carries not just a scalar product but also a vector product, namely convolution: we have

|\pi\rangle|\sigma\rangle = |\pi \sigma\rangle, \quad \pi,\sigma \in S,

where \pi\sigma is the group product of \pi and \sigma. This gives us a new point of view on the adjacency operator K on G=\mathrm{Cay}(S,T). For any subset P \subseteq S, let

|P\rangle = \sum\limits_{\pi \in P} |\pi\rangle

be the corresponding element of \mathcal{C}(S). Thus in particular

\langle P|Q\rangle = |P \cap Q|.

Let us allow the following extremely useful abuse of notation: we will also identify P \subset S with its image in the (right) regular representation of \mathcal{C}(S). Thus,

P|\rho \rangle = |\rho\rangle |P\rangle = \sum\limits_{\pi \in P} |\rho\pi\rangle.

In particular, T denotes both our chosen generating set of S and its image in the regular representation, which is precisely the adjacency operator on \mathrm{Cay}(S,T),

T|\rho\rangle = |\rho\rangle |T\rangle = \sum\limits_{\tau \in T} |\rho\tau\rangle.

More generally, let

L_r = \{ \pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathrm{d}(\iota,\pi)=r\}

be the sphere of radius r centered at the identity element. This set becomes a vector in \mathcal{C}(S) via quantization,

|L_r\rangle = \sum\limits_{\pi \in L_r} |\pi\rangle,

and this vector becomes an operator in \mathrm{End}\mathcal{C}(S) via the regular representation,

L_r|\rho\rangle = |\rho\rangle |L_r\rangle = \sum\limits_{|\pi|=r} |\rho\pi\rangle.

The matrix elements of the operator L_r are

\langle \sigma | L_r |\rho \rangle = \sum\limits_{|\pi|=r} \langle \sigma |\rho\pi\rangle,

and

\langle \sigma|\rho\pi\rangle\neq 0 \iff \sigma=\rho\pi \iff \rho^{-1}\sigma=\pi,

in which case

\mathrm{d}(\rho,\sigma)=|\rho^{-1}\sigma|=|\pi|=r.

Thus, the image of |L_r\rangle in the (right) regular representation is indeed the metric level set operator L_r.

To sum up, the adjacency eigenvalues of the Cayley graph \mathrm{Cay}(S,T) are the eigenvalues of |T\rangle \in \mathcal{C}(S) acting in the regular representation of \mathcal{C}(S). This means that the spectral graph theory of Cayley graphs is part of the representation theory of finite groups. If we are in the incredibly fortuitous situation where S is an abelian group, this comes down to using the discrete Fourier transform.

Problem 1.4. Let C be the cyclic group of order n with generator \gamma, and take the symmetric generating set D=\{\gamma,\gamma^{-1}\}. Then \mathrm{Cay}(C,D) is a cycle graph with n vertices. Find a formula for W^r(\rho,\sigma), the number of r-step walks between two given vertices of a cycle graph.
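
As a numerical companion to Problem 1.4 (my own sketch; the problem itself asks for a closed formula), the adjacency operator of the n-cycle is a circulant matrix, so it is diagonalized by the discrete Fourier transform with eigenvalues 2\cos(2\pi j/n):

```python
import numpy as np

n, r = 7, 5
K = np.zeros((n, n), dtype=int)
for i in range(n):
    K[i, (i + 1) % n] = K[(i + 1) % n, i] = 1  # adjacency matrix of the n-cycle

rho, sigma = 0, 2
js = np.arange(n)
# Walk count from the spectral (DFT) decomposition of the circulant matrix K.
walks_dft = np.sum((2 * np.cos(2 * np.pi * js / n)) ** r
                   * np.exp(2j * np.pi * js * (sigma - rho) / n)).real / n
walks_power = np.linalg.matrix_power(K, r)[rho, sigma]
print(np.isclose(walks_dft, walks_power))  # True
```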

Concerning distance in Cayley graphs \mathrm{Cay}(S,T), we by definition have that \mathrm{d}(\rho,\sigma) is equal to the minimal value of r for which there exist \tau_1,\dots,\tau_r \in T such that

\sigma = \rho\tau_1 \dots \tau_r.

Problem 1.5. Show that for any \pi,\rho,\sigma \in S we have \mathrm{d}(\pi\rho,\pi\sigma)=\mathrm{d}(\rho,\sigma), and explain why the same cannot in general be said for \mathrm{d}(\rho\pi,\sigma\pi). Show that \mathrm{d}(\rho,\sigma) = \mathrm{d}(\iota,\rho^{-1}\sigma) with \iota the group identity.

Now let us consider a nonabelian example: our group is S=\mathrm{Aut}\{1,\dots,n\}, the symmetric group consisting of all bijections \pi \colon \{1,\dots,n\} \to \{1,\dots,n\}, which are generally called permutations. We will take the group law to be \pi\sigma := \sigma \circ \pi, which is the convention used in most computer algebra systems, or so I am told. Take the generating set

T = \{(1\ 2),(2\ 3),\dots,(n-1\ n)\}

of adjacent transpositions.

Problem 1.6. Draw \mathrm{Cay}(S,T) for n=3.

Set |\pi|=\mathrm{d}(\iota,\pi) with \iota the identity permutation.

Now consider the metric Gibbs measure on the Cayley graph \mathrm{Cay}(S,T) of the symmetric group S=\mathrm{Aut}\{1,\dots,n\} corresponding to the generating set T=\{(1\ 2),(2\ 3),\dots,(n-1\ n)\} of adjacent transpositions. We take the root vertex to be \iota, the identity permutation in S. Thus,

\mathbb{P}(\sigma) = \frac{q^{|\sigma|}}{Z},

where |\sigma|=\mathrm{d}(\iota,\sigma) is the minimal number of adjacent transpositions whose product is \sigma.

Problem 1.7. Show that |\sigma|=\mathsf{inv}(\sigma), where \mathsf{inv}(\sigma) is the cardinality of the set \{1 \leq i<j \leq n\ \colon\ \sigma(i)>\sigma(j)\} of inversions in the permutation \sigma. Hence compute the diameter of G=\mathrm{Cay}(S,T).
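
Problem 1.7 can be verified by brute force on a small symmetric group (my own check on S_4, which is of course not a proof): breadth-first search from the identity in the Cayley graph should reproduce the inversion number, and the maximum distance should be {n \choose 2}.

```python
from itertools import permutations
from collections import deque

n = 4
identity = tuple(range(n))

def neighbors(perm):
    # Swapping two adjacent entries of the one-line notation is multiplication
    # by an adjacent transposition (on which side depends on convention; either
    # way the distance from the identity is the same).
    for i in range(n - 1):
        p = list(perm)
        p[i], p[i + 1] = p[i + 1], p[i]
        yield tuple(p)

# BFS from the identity computes Cayley graph distances |sigma|.
dist = {identity: 0}
queue = deque([identity])
while queue:
    u = queue.popleft()
    for v in neighbors(u):
        if v not in dist:
            dist[v] = dist[u] + 1
            queue.append(v)

def inv(perm):
    return sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))

print(all(dist[p] == inv(p) for p in permutations(range(n))))  # True
print(max(dist.values()) == n * (n - 1) // 2)  # diameter = C(n,2)
```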

We conclude that the metric Gibbs measure on G=\mathrm{Cay}(S,T) is

\mathbb{P}(\sigma)=\frac{1}{Z}q^{\mathsf{inv}(\sigma)}.

This is a famous probability measure on permutations called the Mallows measure; it arises in theoretical computer science in the analysis of sorting algorithms. The partition function is

Z=\sum\limits_{\sigma \in S} q^{\mathsf{inv}(\sigma)}.

It is a remarkable fact that this partition function can be evaluated exactly, i.e. this stat mech model is exactly solvable. For any positive integer k, the corresponding q-number is

[k]_q = 1+q+ \dots + q^{k-1},

so that in particular [k]_1=k reverts to an ordinary number at q=1. The q-factorial is accordingly defined as

[k]_q! = [1]_q[2]_q \dots [k]_q.

Here’s a pretty thing.

Theorem 1.1. The partition function for Mallows measure is Z=[n]_q!.
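
Theorem 1.1 is a polynomial identity, so it can be checked coefficient by coefficient for small n (my own finite check for n=5, not a proof): the inversion generating polynomial of S_n should coincide with the q-factorial [n]_q!.

```python
from itertools import permutations
from collections import Counter

n = 5

def inv(p):
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

# Coefficients of Z = sum over S_n of q^{inv(sigma)}, indexed by degree.
Z = Counter(inv(p) for p in permutations(range(n)))

# Coefficients of [n]_q! = prod_{k=1}^{n} (1 + q + ... + q^{k-1}),
# built up by naive polynomial multiplication.
fact = [1]
for k in range(1, n + 1):
    new = [0] * (len(fact) + k - 1)
    for i, c in enumerate(fact):
        for j in range(k):
            new[i + j] += c
    fact = new

print(all(Z[i] == fact[i] for i in range(len(fact))))  # True: Z = [n]_q!
```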

Now you can derive many things, including the following.

Problem 1.8. Find a formula for the expected number of inversions in a random permutation whose distribution is Mallows measure. Using your result, find a formula for the expected number of inversions in a uniformly random permutation.