Math 202C – Page 2 – Jonathan Novak

Math 202C: Lecture 6

*** Problem in this lecture due April 19 at 23:59 ***

In Lecture 5, we constructed the polynomial algebra $\mathcal{P}_n$ and the symmetric polynomial algebra $\mathcal{S}_n$ . The basic combinatorial objects underlying these constructions are integer compositions and integer partitions. Let

$\mathrm{Com}_n=\{\alpha=(\alpha_1,\dots,\alpha_n) \colon \alpha_i \in \mathbb{N}_0\}.$

Then,

$\mathrm{Com}_n = \bigsqcup\limits_{d=0}^\infty \mathrm{Com}_n^d$

with

$\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon |\alpha|=d\},$

where $|\alpha|=\alpha_1+\dots+\alpha_n.$ Now we quantize, meaning that we replace $\mathrm{Com}_n$ with the Hilbert space $\mathcal{P}_n$ having orthonormal basis

$|\alpha\rangle, \quad \alpha \in \mathrm{Com}_n.$

The disjoint union decomposition of the infinite set $\mathrm{Com}_n$ then becomes an orthogonal direct sum decomposition

$\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d$

of the infinite-dimensional space $\mathcal{P}_n$ into finite-dimensional subspaces

$\mathcal{P}_n^d = \bigoplus\limits_{\alpha \in \mathrm{Com}_n^d} \mathbb{C}|\alpha\rangle.$

Moreover, we have a natural notion of multiplication in $\mathcal{P}_n$ defined by

$|\alpha\rangle|\beta\rangle=|\alpha+\beta\rangle,$

which has the feature that

$|\alpha\rangle \in \mathcal{P}_n^c,\ |\beta\rangle \in \mathcal{P}_n^d \implies |\alpha+\beta\rangle \in \mathcal{P}_n^{c+d}.$

On one hand, the infinite-dimensional algebra is similar to the convolution algebra of an abelian group: if

$|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle \quad\text{and}\quad |w\rangle = \sum\limits_\beta d_\beta|\beta\rangle,$

then

$|v\rangle |w\rangle = \sum\limits_\gamma \left( \sum\limits_{\alpha+\beta=\gamma} c_\alpha d_\beta\right)|\gamma\rangle.$

Indeed, $\mathcal{P}_n$ is precisely the convolution algebra of finitely-supported functions on the infinite abelian monoid $\mathrm{Com}_n.$ On the other hand, $\mathcal{P}_n$ is something quite familiar and intuitive: the algebra of polynomials in $n$ commuting variables. Indeed, under the identification

$|\alpha\rangle \mapsto x^\alpha = x_1^{\alpha_1} \dots x_n^{\alpha_n},$

we have

$|v\rangle = \sum\limits_\alpha c_\alpha|\alpha\rangle = \sum\limits_\alpha c_\alpha x^\alpha = f(x_1,\dots,x_n).$

In this sense, what the familiar algebra of polynomials in $n$ commuting variables “really” is is the convolution algebra of the commutative monoid $\mathrm{Com}_n.$

The symmetric group $S_n = \mathrm{Aut}\{1,\dots,n\}$ acts on $\mathrm{Com}_n$ according to

$\pi \cdot \alpha = (\alpha_{\pi(1)} \dots \alpha_{\pi(n)}),$

and this permutation action respects the decomposition of $\mathrm{Com}_n$ into finite-dimensional components $\mathrm{Com}_n^d.$ This permutation action gives rise to a unitary representation $(\mathcal{P}_n,\varphi_n)$ of $S_n$ , where

$\varphi_n \colon S_n \longrightarrow U(\mathcal{P}_n)$

is the group homomorphism defined by

$\varphi_n(\pi)|\alpha\rangle = |\alpha_{\pi(1)} \dots \alpha_{\pi(n)}\rangle.$

Equivalently, thinking of $\mathcal{P}_n$ as the algebra of polynomials in $n$ variables this is

$\varphi_n(\pi)f(x_1,\dots,x_n) = f(x_{\pi^{-1}(1)},\dots,x_{\pi^{-1}(n)}).$

The algebra of symmetric polynomials in $n$ variables is the space $\mathcal{S}_n=\mathcal{P}_n^{S_n} of$ latex S_n$-invariant vectors in this unitary representation, i.e. polynomials with the property that

$f(x_{\pi(1)},\dots,x_{\pi(n)}) = f(x_1,\dots,x_n), \quad \forall \pi \in S_n.$

As explained in Lecture 5, it is not too difficult to construct a basis of $\mathcal{S}_n$ . Indeed, one need only observe that orbits of the the action of $S_n$ on $\mathrm{Com}_n$ are naturally indexed by the set

$\mathrm{Par}_n = \{\lambda \in \mathrm{Com}_n \colon \lambda_1 \geq \dots \geq \lambda_n\}.$

For a fixed positive integer $d,$ the set $\mathrm{Par}_n^d=\mathrm{Par}_n \cap \mathrm{Com}_n^d$ may be identified with the set of partitions of $d$ having at most $n$ parts. Then, associated to each $\lambda \in \mathrm{Par}_n^d$ is the corresponding orbit sum

$m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha,$

and $\{m_\lambda \colon \lambda \in \mathrm{Par}_n^d\}$ is an orthogonal basis of $\mathcal{S}_n^d=\mathcal{S}_n \cap \mathcal{P}_n^d.$ This is the basis of monomial symmetric polynomials, which have the form

$m_\lambda(x_1,\dots,x_n) = x_1^{\lambda_1} \dots x_n^{\lambda_n} + \text{symmetries}.$

So while $m_\lambda$ is not in general a monomial, it is a polynomial having the feature that any of its constituent monomials determines all the others.

Problem 6.1. Give necessary and sufficient conditions under which $m_\lambda(x_1,\dots,x_n)$ actually is a monomial.

Because $m_\lambda$ is not in general a monomial, it is not in general the case that

$m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n)=m_{\lambda+\mu}(x_1,\dots,x_n).$

This is somewhat analogous to the fact that convolution of conjugacy classes in the center of the convolution algebra of a noncommutative group is determined by, but not the same as, multiplication of group elements. As in that context, the natural question is to compute the connection coefficients of the monomial basis of $\mathcal{S}_n$ ,

$m_\lambda m_\mu = \sum\limits_{\nu \in \mathrm{Par}_n} c_{\lambda\mu\nu} m_\nu.$

In support of this analogy, consider the following.

Problem 6.2. Show that for any $\gamma \in \mathcal{O}_\nu$ , the coefficient of the monomial $x^\gamma$ in the product $m_\lambda(x_1,\dots,x_n)m_\mu(x_1,\dots,x_n)$ is

$c_{\lambda\mu\nu} =|\{(\alpha,\beta) \in \mathcal{O}_\lambda \times \mathcal{O}_\nu \colon \alpha+\beta=\gamma\}|.$

Taking the analogy to this natural conclusion, remember the a priori surprising fact the center of the convolution algebra, whose multiplication appears quite complicated in the class basis, is actually isomorphic to a pointwise function algebra. In particular, in the abelian case we have $\mathcal{C}(G) \simeq \mathcal{F}(G)$ .$

Theorem 6.1 (Newton). The algebra $\mathcal{S}_n$ of symmetric polynomials in $n$ variables is isomorphic to the algebra $\mathcal{P}_n$ of polynomials in $n$ variables.

Note that this is an infinite-dimensional phenomenon: the proper subalgebra $\mathcal{S}_n$ of $\mathcal{P}_n$ is isomorphic to the full algebra in which it resides. In particular, the isomorphism is definitely not the tautological inclusion.

Theorem 6.1. is called the fundamental theorem of symmetric polynomials, and as indicated above it is generally attributed to Newton. Among many other achievements, Newton laid the foundations for classical mechanics based on his theory of gravity, and developed a theory of optics which involved sticking a knitting needle into his own eye to demonstrate that color is an intrinsic property of light as opposed to an artifact of perception. Now that’s applied mathematics. Given that symmetric polynomials were good enough for Newton, it’s a bit hard to take complaints about our subject matter being “too pure” seriously. But ok, nowadays we’re doing data science, not physical science — why would you care about reconciling gravitation with quantum mechanics when you can make a living applying principal components analysis to an endless supply of large rectangular matrices?

From the data science perspective, Newton’s theorem says that the elementary symmetric polynomials form a universal coordinate system for polynomial features of unordered data. More precisely, his result is that $\mathcal{S}_n=\mathbb{C}[x_1,\dots,x_n]^{S_n}$ is isomorphic to $\mathcal{P}_n=\mathbb{C}[e_1,\dots,e_n],$ where

$e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subset [n] \\ |I|=r}} \prod\limits_{i \in I} x_i.$

Here $[n]=\{1,\dots,n\},$ so $e_r(x_1,\dots,x_n)$ is the zero polynomial for $r>n.$ It is notationally convenient to define $e_0(x_1,\dots,x_n)$ to be the constant polynomial $1.$ Then, for any partition $\lambda,$ i.e. for any vector $\lambda=(\lambda_1,\lambda_2,\lambda_3,\dots)$ from the set $\mathrm{Par}$ of infinite integer vectors with weakly decreasing coordinate eventually equal to zero, we can extend the definition of the elementary symmetric polynomials by declaring

$e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^\infty e_{\lambda_i}(x_1,\dots,x_n),$

or equivalently

$e_\lambda(x_1,\dots,x_n) = \prod\limits_{i=1}^{\ell(\lambda)}e_{\lambda_i}(x_1,\dots,x_n)$

where $\ell(\lambda)$ is the number of nonzero coordinates of $\lambda.$ The condition required for this product to be nonzero is that $\lambda_1 \leq n$ . Now, the set of partitions with largest part at most $n$ is in bijection with the set of partitions with at most $n$ parts via transposing Young diagrams, $\lambda \mapsto \lambda^T.$ Thus, Newton’s theorem is equivalent to the following.

Theorem 6.2. For any $d \in \mathbb{N},$ the set $\{e_\lambda(x_1,\dots,x_n) \colon \lambda^T \in \mathrm{Par}_n^d\}$ is a vector space basis of $\mathcal{S}_n^d.$

In Lecture 7, we will prove Theorem 6.2 by giving a combinatorial interpretation of the coefficients in the expansion

$e_\lambda(x_1,\dots,x_n) = \sum\limits_{\lambda \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(x_1,\dots,x_n).$

of the (extended) elementary symmetric polynomials $e_\lambda$ in terms of the monomial symmetric polynomials $m_\mu$ . In fact, the value of this computation extends beyond its immediate application to Newton’s theorem. Indeed, the expansion above is universal in the sense that it remains valid under any specialization, i.e. under any algebra homomorphism $\Xi \colon \mathcal{S}_n \to \mathcal{A}$ . In particular, we have

$e_\lambda(|J_1\rangle, \dots, |J_n\rangle) = \sum\limits_{\lambda \in \mathrm{Par}_n^d} M_{\lambda\mu}m_\mu(|J_1\rangle, \dots, |J_n\rangle),$

where $|J_1\rangle,\dots,|J_n\rangle$ are the Jucys-Murphy elements in the convolution algebra $\mathcal{C}(S_n)$ of the symmetric group. Recall that we established the commutativity of the JM-elements by proving that they lie in the commutative subalgebra $\mathcal{J}_n$ of $\mathcal{C}(S_n)$ generated by

$Z\mathcal{C}(S_1) \cup \dots \cup Z\mathcal{C}(S_n),$

which we named the JM-subalgebra of $\mathcal{C}(S_n)$ . We then showed that

$|L_r\rangle=e_r(|J_1\rangle,\dots,|J_n\rangle),$

i.e. the levels of the symmetric group are precisely the elementary symmetric polynomials specialized on the JM-elements.

Next week, we shall see that essentially the same argument used to prove Newton’s theorem can be retooled to prove the apparently much more sophisticated result that $|J_1\rangle,\dots,|J_n\rangle$ not only lie in the algebra $\mathcal{J}_n$ , but in fact generate it. This will essentially complete Phase I of our development of the representation theory of the symmetric group. Phase II will be to show that $\mathcal{J}_n$ has a completely different spectral interpretation as the so-called Gelfand-Tsetlin subalgebra of $\mathcal{C}(S_n)$ . As discovered by Okounkov and Vershik, the fact that the Jucys-Murphy and Gelfand-Tsetlin subalgebras of $\mathcal{C}(S_n)$ are one and the same completely unlocks the representation theory of the symmetric groups.

For now, here is something concrete to think about.

Problem 6.3. Show that $e_r(1,2,\dots,n-1)$ is equal to the number of permutations in $S_n$ consisting of $n-r$ disjoint cycles.

For context, recall that the evaluation of $p_r(1,2,\dots,n)=1^r+2^r+\dots+n^r$ is the classical problem of summing powers of consecutive numbers. At some point in your life you undoubtedly learned that

$p_1(1,2,\dots,n) = \frac{n(n-1)}{2},$

and this may have led you to discover that

$p_2(1,2,\dots,n) = \frac{n(n+1)(2n+1)}{6},$

which probably means that you tried to understand what was going on with $p_r(1,2,\dots,n)$ in general. The problem is “solved” by Faulhaber’s formula, which is not very satisfying. The situation with $e_r(1,2,\dots,n)$ is much cleaner, and may help to mend this childhood trauma.

Math 202C: Lecture 5

*** Problems in this Lecture due March 12 at 23:59 ***

Let

$\mathrm{Com}_n = \{(\alpha_1,\dots,\alpha_n) \in \mathbb{Z}^n \colon \alpha_i \geq 0\}$

be the set of nonnegative integer vectors with $n$ components. Elements $\alpha$ of $\mathrm{Com}_n$ are referred to as weak $n$ -part compositions, where “weak” means that some coordinates may be zero. Let

$\mathrm{Com}_n^d=\{\alpha \in \mathrm{Com}_n \colon \alpha_1+\dots +\alpha_n=d\}.$

Elements of $\mathrm{Com}_n^d$ are called weak $n$ -part compositions of $d.$ Let $\mathcal{P}_n$ be the Hilbert space with orthonormal basis

$|\alpha\rangle = |\alpha_1\dots\alpha_n\rangle,\quad \alpha \in \mathrm{Com}_n.$

That is, $\mathcal{P}_n$ is the space of complex-valued functions on weak $n$ -part compositions which vanish at all but finitely many points. A general vector in $\mathcal{P}_n$ thus has the form

$|v\rangle = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha |\alpha\rangle,$

where all but finitely many of the coefficients $c_\alpha \in \mathbb{C}$ are zero. Note that $\mathcal{P}_n$ is an infinite-dimensional Hilbert space, and recall that our definition of Hilbert space does not require completeness. We have an orthogonal direct sum decomposition

$\mathcal{P}_n = \bigoplus\limits_{d=0}^\infty \mathcal{P}_n^d,$

where

$\mathcal{P}_n^d = \mathrm{span}\{|\alpha\rangle \colon \alpha \in \mathrm{Com}_n^d\}.$

Problem 5.1. Prove that $\mathcal{P}_n^d$ is finite-dimensional, and determine its dimension.

Now define a multiplication in $\mathcal{P}_n$ by bilinearly extending the rule

$|\alpha\rangle|\beta\rangle = |\alpha+\beta\rangle.$

Furthermore, define a conjugation in $\mathcal{P}_n$ by antilinearly extending the rule

$|\alpha\rangle^*=|\alpha\rangle.$

Then, $\mathcal{P}_n$ is an infinite-dimensional commutative algebra which you likely know by another name: the algebra of polynomials in $n$ (commuting, selfadjoint) variables. Indeed, if we write

$x_1=|100\dots 0\rangle\ x_2=|010\dots 0\rangle, \dots x_n=|000\dots 1\rangle,$

then a basis vector in $\mathcal{P}_n$ becomes

$|\alpha\rangle = x_1^{\alpha_1} \dots x_n^{\alpha_n},$

and a general vector in $\mathcal{P}_n$ becomes

$|v\rangle = \sum\limits_{\alpha} c_\alpha|\alpha\rangle = \sum\limits_{\alpha} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}.$

So, if you have ever wondered what exactly a polynomial is, the above constitutes one possible answer: a polynomial is a finitely-supported complex-valued function on the set of compositions. This explains the sense in which there are “no polynomial relations” among the “variables” $x_1,\dots,x_n$ . Indeed, writing

$\sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n} = \sum\limits_\alpha c_\alpha |\alpha\rangle =0$

shows that this means nothing more than linear independence of the computational basis in $\mathcal{P}_n$ . Also note that $\mathcal{P}_n^d$ is the span of all monomials whose exponents add to $d$ ,

$x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}, \quad \alpha_1+\dots+\alpha_n=d.$

Polynomials $f(x_1,\dots,x_n) \in \mathcal{P}_n^d$ are said to be homogeneous of degree $d$ , and the space of these is finite-dimensional.

Since you come with a lot of software already installed, we will continue to use the familiar notation

$f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}$

for vectors $|v\rangle$ in $\mathcal{P}_n = \mathbb{C}[x_1,\dots,x_n].$ This will allow us to more readily access your existing libraries, fragmented as they may be. The monomial basis of $\mathcal{P}_n$ is

$|\alpha\rangle= x_1^{\alpha_1} \dots x_n^{\alpha_n}.$

If we want to write things in a compressed way, we may use

$f(x)=\sum\limits_\alpha c_\alpha x^\alpha.$

One advantage of the classical polynomial notation is that it allows you to visualize homomorphisms from $\mathcal{P}_n$ into other algebras as “substitutions.”

Definition 5.1. An algebra homomomorphism $\Xi \colon \mathcal{P}_n \to \mathcal{A}$ is called a specialization when $\mathcal{A}$ is commutative.

Any specialization $\Xi$ is uniquely determine by the images $A_1=\Xi(x_1),\dots,A_n=\Xi(x_n)$ , since with the image of a general polynomial

$f(x_1,\dots,x_n) = \sum\limits_\alpha c_\alpha x_1^{\alpha_1} \dots x_n^{\alpha_n}$

$\Xi(f(x_1,\dots,x_n)=\sum\limits_\alpha A_1^{\alpha_1} \dots A_n^{\alpha_n} =:f(A_1,\dots,A_n).$

Conversely, given any elements $A_1,\dots,A_n$ in a commutative algebra $\mathcal{A}$ there is a unique specialization $\Xi \colon \mathbb{C}[x_1,\dots,x_n] \to \mathcal{A}$ such that $\Xi(x_1)=A_1,\dots,\Xi(x_n)=A_n.$

For example, consider the specialization $\Xi \colon \mathcal{P}_n \to \mathbb{C}$ determined by

$\Xi(x_1)=1,\ \Xi(x_2)=2,\ \dots,\ \Xi(x_n)=n.$

Then, for a general polynomial

$f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}$

we have

$\Xi(f) = f(1,2,\dots,n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha 1^{\alpha_1}2^{\alpha_2} \dots n^{\alpha_n}.$

In simpler times, people were interested in finding the image of the power sum polynomial

$p_r(x_1,\dots,x_n)=x_1^r + x_2^r + \dots + x_n^r$

under the specialization $\Xi,$ which is the number

$\Xi(p_r)=p_r(1,2,\dots,n) = 1^r+2^r+ \dots+ n^r.$

A formula for this number was eventually found in terms of another family of numbers called Bernoulli numbers, which are themselves not so easy to describe. The main conceptual insight this formula provides is that $\Xi(p_r)$ admits a second description as the the image of a univariate polynomial $q_r(x)$ under the specialization $x \mapsto n$ .

In probability theory, one is concerned with specializations of $\mathcal{P}_n$ whose target $\mathcal{A}$ is the algebra of complex-valued random variables on a given probability space. In fact, the problem we want to solve in this context is the stochastic version of the above numerical problem: given $X_1,\dots,X_n \in \mathcal{A}$ , determine the distribution of

$p_r(X_1,\dots,X_n)=X_1^r+ \dots + X_n^r$

in terms of the joint distribution of $X_1,\dots,X_n$ . Probability theory provides efficient tools for doing this when the random variables $X_1,\dots,X_n$ are independent. If they are highly correlated, however, we know much less.

Problem 5.2. Determine the distribution of $p_r(X_1,\dots,X_n)$ when $X_1,\dots,X_n$ are iid Bernoulli random variables.

We can recast the content of Lecture 4 in terms of a specialization of $\mathcal{P}_n$ whose target is the Jucys-Murphy algebra, $\mathcal{J}_n$ . Recall that $\mathcal{J}_n$ is the commutative subalgebra of the convolution algebra $\mathcal{C}(S_n)$ of the symmetric group $S_n=\mathrm{Aut}\{1,\dots,n\}$ generated by

$\mathcal{Z}_1 \cup \mathcal{Z}_2 \cup \dots \cup \mathcal{Z}_n$ ,

where $\mathcal{Z}_k=Z\mathcal{C}(S_k)$ is the center of the subalgebra $\mathcal{C}(S_k)$ of $\mathcal{C}(S_n)$ consisting of functions supported on permutations which fix the points $k+1,k+2,\dots,n.$ The Jucys-Murphy specialization of $\mathcal{P}_n$ is the algebra homomorphism

$\Xi \colon \mathcal{P}_n \longrightarrow \mathcal{J}_n$

defined by

$\Xi(x_1)=|J_1\rangle,\ \Xi(x_2)=|J_2\rangle,\ \dots,\ \Xi(x_n)=|J_n\rangle,$

where

$|J_t\rangle = \sum\limits_{1 \leq s<t} |st\rangle, \quad 1 \leq t \leq n,$

are the Jucys-Murphy elements in $\mathcal{C}(S_n).$ Our key result is that the metric level sets

$|L_r\rangle = \sum\limits_{\substack{\pi \in S_n \\ |\pi|=r}} |\pi\rangle, \quad 1 \leq r \leq n,$

in the symmetric group (identity-centered spheres in the all-transpositions metric) are the image of a particular family of the polynomials $e_1,\dots,e_n \in \mathcal{P}_n$ under the JM-specialization. The polynomials in question are the elementary symmetric polynomials

$e_r(x_1,\dots,x_n) = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n} x_{t_1}x_{t_2} \dots x_{t_r}, \quad 1 \leq r \leq n,$

which may also be written

$e_r(x_1,\dots,x_n) = \sum\limits_{\substack{I \subseteq \{1,\dots,n\} \\ |I|=r}} \prod\limits_{i \in I}x_i.$

Our theorem from Lecture 4 (which actually goes back to Lecture 1, and seems to have been completely uninteresting to many people) is that

$|L_r \rangle = e_r(|J_1\rangle,|J_2\rangle,\dots,|J_n\rangle).$

We can also look at the image of this identity in the regular representation of $\mathcal{C}(S_n)$ , where it becomes

$L_r= e_r(J_1,J_2,\dots,J_n),$

giving us a polynomial decomposition of the metric level-set operators $L_r \in \mathrm{End}\mathcal{C}(S_n)$ on the all-transpositions Cayley graph of $S_n$ in terms of a simpler family of operators $J_t \in \mathrm{End}\mathcal{C}(S_n).$ As described in Lecture 1, the metric level set operators on any graph interpolate between the adjacency operator and the distance operator. However, the polynomial decomposition of these operators given above is a special feature of the all-transpositions Cayley graph of the symmetric group, and we will soon diagonalize the Jucys-Murphy operators $J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n).$

Before doing this, let us come to an understanding of the elementary symmetric polynomials, and symmetric polynomials more generally. The basic fact is that we have a natural unitary representation of $S_n$ on $\mathcal{P}_n$ , namely the group homomorphism

$\varphi \colon S_n \longrightarrow U(\mathcal{P}_n)$

from the symmetric group of $\{1,\dots,n\}$ to the unitary group of $\mathcal{P}_n$ defined by

$\varphi(\pi)|\alpha\rangle = |\alpha_{\pi(1)}\alpha_{\pi(2)} \dots \alpha_{\pi(n)}\rangle, \quad \pi \in S_n,\ \alpha \in \mathrm{Com}_n.$

Note that this is an infinite-dimensional unitary representation of the symmetric group, but that each of the finite-dimensional subspaces $\mathcal{P}_n^d$ is a finite-dimensional unitary representation of $S_n.$ In polynomial notation, we have

$\varphi(\pi)x^\alpha = x_1^{\alpha_{\pi(1)}}x_2^{\alpha_{\pi(2)}} \dots x_n^{\alpha_{\pi(n)}} = x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n},$

and thus for a general polynomial

$f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_1^{\alpha_1}x_2^{\alpha_2} \dots x_n^{\alpha_n}$

we have

$\varphi(\pi) f(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathrm{Com}_n} c_\alpha x_{\pi^{-1}(1)}^{\alpha_1}x_{\pi^{-1}(2)}^{\alpha_2} \dots x_{\pi^{-1}(n)}^{\alpha_n}=f(x_{\pi^{-1}(1)}, \dots x_{\pi^{-1}(n)}).$

Like every unitary representation, $(\mathcal{P}_n,\varphi)$ contains a space of invariants,

$\mathcal{S}_n = \mathcal{P}_n^{S_n} = \mathbb{C}[x_1,\dots,x_n]^{S_n},$

and since $\mathcal{P}_n$ is a commutative algebra so is $\mathcal{S}_n$ . The commutative algebra $\mathcal{S}_n$ is called the algebra of symmetric polynomials in $n$ variables, and it consists precisely of those polynomials such that

$f(x_1,\dots,x_n)=f(x_{\pi(1)},\dots,x_{\pi(n)}), \quad \text{for all } \pi \in S_n.$

Let us find a basis for $\mathcal{S}_n$ . First, $S_n$ acts on $\mathrm{Com}_n$ , which is just a set and not a Hilbert space, according to

$\pi \cdot \alpha = (\alpha_{\pi(1)},\dots,\alpha_{\pi(n)}).$

This action preserves the coordinate sum of $\alpha,$ so in fact $S_n$ is acting on each of the finite sets $\mathrm{Com}_n^d,$ whose cardinality is easy to compute (Problem 4.1). However, the number of orbits into which $\mathrm{Com}_n^d$ decomposes under $S_n$ is not at all easy to compute. Let $\mathrm{Par}_n^d \subset \mathrm{Com}_n^d$ be the set of weak $n$ -part compositions of $d$ , and observe that each of these may be identified with a partition of $d$ having at most $n$ parts. The coordinates of a given composition $\alpha \in \mathrm{Com}_n^d$ can be permuted so that they become a weakly decreasing list of numbers, hence the orbits of $\mathrm{Com}_n^d$ may be indexed by the elements of $\mathrm{Par}_n^d$ and we obtain the decomposition

$\mathrm{Com}_n^d = \bigsqcup\limits_{\lambda \in \mathrm{Par}_n^d} \mathcal{O}_\lambda,$

where $\mathcal{O}_\lambda$ is the set of compositions which sort to the partition $\lambda.$ The monomial symmetric polynomials are defined by

$m_\lambda(x_1,\dots,x_n) = \sum\limits_{\alpha \in \mathcal{O}_\lambda} x^\alpha.$

Problem 5.3. Prove that the monomial symmetric polynomials form an orthogonal basis of $\mathcal{S}_n$ . Decompose the power sum symmetric polynomials $p_r(x_1,\dots,x_n)$ and elementary symmetric polynomials $e_r(x_1,\dots,x_n)$ in this basis.

The above was a first-principles construction of a basis for the space of $S_n$ -invariants in $\mathcal{P}_n$ . On the other hand, we know from Math 202B that for any unitary representation $(V,\varphi)$ of a finite group $G$ , averaging the operators $\varphi(g)$ gives the orthogonal projection $V \to V^G.$

Problem 5.4. Show that the operator $P = \frac{1}{n!}\sum_{\pi \in S_n} \varphi(\pi)$ projects each monomial in $\mathcal{P}_n$ onto an element of $\mathcal{S}_n$ that is in fact a monomial symmetric polynomial, up to an explicitly describable scalar factor.

Math 202C: Lecture 4

*** Problems in this lecture due April 12 at 23:59 ***

Let $S_n=\mathrm{Aut}\{1,\dots,n\}$ be the symmetric group on $\{1,\dots,n\}.$ Our convention is that permutations are multiplied left-to-right, $\rho\sigma := \sigma \circ \rho.$ We identify $lates S_n$ with its Cayley graph $\mathrm{Cay}(S_n,T_n)$ as generated by the conjugacy class of transpositions,

$T_n=\{\tau=(s t) \colon 1 \leq s <t \leq n\}.$

Thus, two permutations $\rho,\sigma \in S_n$ are adjacent if and only if $\sigma=\rho\tau$ for $\tau \in T_n.$ Under this identification, $S_n$ decomposes into a disjoint union of $n$ independent sets, the levels

$L_r=\{\pi \in S_n \colon |\pi|=r\} = \{\pi \in S_n \colon n-\mathsf{cyc}(\pi)=r\}, \quad 0 \leq r \leq n-1.$

The cardinalities of these levels make up the $n$ th row in the number triangle whose entries are the Stirling numbers of the first kind. Each level further decomposes as a disjoint union of conjugacy classes: we have

$L_r = \bigsqcup\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,$

where $\Lambda$ is the set of partitions of $n$ , and $\ell(\alpha)$ is the number of terms in the partition $\alpha$ . We can encode this decomposition algebraically in $\mathcal{C}(S_n)$ as

$|L_r\rangle = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} |C_\alpha\rangle.$

Note that, because $S_n$ is an ambivalent group, each class indicator $|C_\alpha\rangle$ is a selfadjoint element in the convolution algebra $\mathcal{C}(S_n)$ . In the regular representation, the above becomes an identity in $\mathrm{End}\mathcal{C}(S_n)$ , namely

$L_r = \sum\limits_{\substack{\alpha \in \Lambda \\ \ell(\alpha)=n-r}} C_\alpha,$

and we have used this already to give a simple proof of the fact that the metric level set operators $L_1,\dots,L_{n-1}$ on $S_n=\mathrm{Cay}(S_n,T_n)$ commute. In particular, these selfadjoint operators are simultaneously diagonalizable. In this Lecture, we will obtain a different combinatorial decomposition of $L_r$ which opens the door to actually diagonalizing $L_1,\dots,L_{n-1}.$

In Lecture 1, we defined an edge-labeling of $S_n$ by marking each edge corresponding to the transposition $(s\ t)$ with $t$ , the larger of the two elements it transposes. We defined a walk on $S_n$ to be strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. It is clear that the length of any such walk is at most $n-1,$ which is the diameter of $S_n$ . The following remarkable result holds.

Theorem 4.1. There exists a unique strictly monotone walk between every pair of permutations, and this walk is a geodesic.

Theorem 4.1 gives a second description of $L_r$ as the set of all strictly monotone $r$ -step walks on $S_n$ emanating from the identity permutation, $\iota.$ The figure below illustrates this for $n=4,$ with $\text{blue}=2,$ $\text{purple}=3,$ $\text{black}=4.$

The Jucys-Murphy spanning tree of $S_4$ .

The Jucys-Murphy spanning tree of $S_4$ .

Definition 4.2. The Jucys-Murphy subsets of $S_n$ are defined by

$J_1=\emptyset$

$J_2 = \{(1 2)\}$

$J_3=\{(1 3),(2 3)\}$

$\vdots$

$J_n=\{(1 n),(2 n),\dots,(n-1 n).\}$

Thus,

$J_t = \{(s t) \colon 1 \leq s <t\}$

is the set of all transpositions which swap a given number $t$ with smaller numbers $s$ . The set $J_1 = \emptyset$ is included only for notational convenience, as a sequence with initial index $2$ may be unsettling. It is clear that the JM-subsets of $S_n$ are pairwise disjoint, and that their union is $T_n=L_1$ . More generally, the set of all $r$ -step strictly monotone walks on $S_n$ emanating from the identity permutation $\iota$ is in bijection with

$\bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},$

and hence we have a bijection.

$L_r \simeq \bigsqcup\limits_{1 \leq t_1 < \dots < t_r \leq n} J_{t_1} \times J_{t_2} \times \dots \times J_{t_r},$

We will now encode this correspondence algebraically in $\mathcal{C}(S_n).$

Definition 4.3. The Jucys-Murphy elements in $\mathcal{C}(S_n)$ are

$|J_t\rangle = \sum\limits_{1 \leq s<t} |s t\rangle, \quad 1 \leq t \leq n.$

The JM-elements are selfadjoint elements satisfying $\langle J_t | J_t \rangle = t-1$ and

$|T_n\rangle = \sum\limits_{t=1}^n |J_t\rangle.$

Thus, $J_1,\dots,J_n \in \mathrm{End}\mathcal{C}(S_n)$ are selfadjoint operators satisfying

$T_n = \sum\limits_{t=1}^n J_t,$

where $T_n$ is the adjacency operator on $\mathrm{Cay}(S_n,T_n).$

Problem 4.2. Prove that the operator norm of $J_t$ is $\|J_t\|=t-1.$

At this point, we can write down the algebraic identity

$|L_r\rangle = \sum\limits_{1 \leq t_1 < \dots < t_r \leq n}|J_{t_1}\rangle \dots |J_{t_n}\rangle.$

This is equivalent to the combinatorial decomposition of $L_r$ in terms of $JM$ -subsets, but at the same time it has many advantages. To see these we first the need the following fact.

Theorem 4.3. The JM-elements $|J_1\rangle,\dots,|J_n\rangle$ pairwise commute in $\mathcal{C}(S_n).$

One way to verify this would be to check directly that the products

$|J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|ik\rangle|jl\rangle \quad\text{and}\quad |J_k\rangle|J_l\rangle = \sum\limits_{i<k}\sum\limits_{j<l}|jl\rangle|ik\rangle$

are indeed the same. We will instead construct a manifestly commutative subalgebra $\mathcal{J}_n$ of $\mathcal{C}(S_n)$ which contains $\{|J_1\rangle,\dots,|J_n\rangle\}.$

The key to this construction is the fact that the symmetric groups $S_1,S_2,\dots,S_n$ form an inductive chain

$S_1 \subset S_2 \subset \dots \subset S_n,$

where we identity $S_k=\mathrm{Aut}\{1,\dots,k\}$ with the subgroup of $S_n$ consisting of permutations which fix the numbers $k+1,k+2,\dots,n.$ This gives an inductive chain of algebras,

$\mathcal{C}(S_1) \subset \mathcal{C}(S_2) \subset \dots \subset \mathcal{C}(S_n),$

each of which has its own center, giving us a collection

$Z\mathcal{C}(S_1),\ Z\mathcal{C}(S_2),\ \dots,\ Z\mathcal{C}(S_n)$

of commutative subalgebras of $\mathcal{C}(S_n).$ However, it is not true that these commutative subalgebras form an inductive chain. In fact, they are as disjoint as subalgebras can be.

Problem 4.3. Prove that $Z\mathcal{C}(S_k) \cap Z\mathcal{C}(S_l)=\mathbb{C}|\iota\rangle$ for $k<l.$

What is true is that each element of $Z\mathcal{C}(S_k)$ commutes with every element of $Z\mathcal{C}(S_l).$ Indeed, assuming $k<l$ , the center $Z\mathcal{C}(S_l)$ consists of all elements commuting with every element of $\mathcal{C}(S_l),$ and

$Z\mathcal{C}(S_k) \subset \mathcal{C}(S_k) \subset \mathcal{C}(S_l).$

Definition 4.4. The Jucys-Murphy subalgebra $\mathcal{J}_n$ of $\mathcal{C}(S_n)$ is the commutative subalgebra generated by $Z\mathcal{C}(S_1) \cup Z\mathcal{C}(S_2) \cup \dots \cup Z\mathcal{C}(S_n).$

Problem 4.4. Prove that $\dim \mathcal{J}_n \geq \sum_{k=1}^n p(k)-(n-1),$ where $p(k)$ is the number of partitions of $k.$

The commutativity of the JM-elements follows from the fact that they belong to the JM-subalgebra.

Theorem 4.4. We have $|J_1\rangle,|J_2\rangle,\dots,|J_n\rangle \in \mathcal{J}_n.$

Proof: For each $1 \leq k \leq n$ , let

$|T_k\rangle = \sum\limits_{1 \leq s < t \leq k} |st\rangle$

be the sum of all transpositions in $S_k$ . This is an element of $\mathcal{C}(S_k),$ and indeed an element of $Z\mathcal{C}(S_k).$ Now observe that

$|J_k\rangle = |T_k\rangle - |T_{k-1}\rangle,$

which expresses $|J_k\rangle$ as a difference of elements in $\mathcal{J}_n$ . $\square$

Math 202C: Model Homework Solution

Lectures are informal by design: we meet to discuss algebra and chart our course through the subject. I talk to you, you talk to me. We generate ideas to craft a narrative. We then retire to our respective domiciles to polish that narrative. My job is to identify the key large scale ideas, treating these as girders which I assemble into an overarching structure presented in the lecture notes. I also identify important small-scale ideas which play the role of joists supporting the overarching structure. Your job is to install these joists, which you do in the homework problems. In order for this collaborative endeavor to succeed, there must be committed and earnest effort from both parties.

If I was a telepath, I would read your mind to determine if you are supplying this effort and base your grade on the results of this brain scan. Although I am not a telepath, I can gauge your earnestness to a certain extent based on verbal interactions. However, this measurement is imperfect and moreover I do not interact verbally with every student, though I am willing to do so. Therefore, your written solutions must communicate your commitment to the cause.

Below is an exemplar on which you can model your work in terms of level of detail and tone, should you wish to do so. My process was the following: I discussed the problem with two people for about fifteen minutes after lecture; I thought about it more on my fifteen minute drive home; I thought about it again while falling asleep, probably another fifteen minutes; I woke up and worked out the solution carefully, which took about an hour.

Problem 1.1. Find a formula for the number of walks of given length between two given vertices of the complete graph.

Solution: Let $K_n$ be the complete graph on $\{1,\dots,n\}$ . Given $i,j \in S$ and $r \in \mathbb{N},$ let $W^r(i,j)$ denote the number of $r$ -step walks $i \to j$ in $G.$ Observe that

$W^r(1,1) + W^r(1,2) + \dots + W^r(1,n)=(n-1)^r.$

Indeed, at the each step of the walk we have $n-1$ choices for the next vertex to visit. We claim that

$W^r(1,2)= \dots = W^r(1,n).$

Intuitively, this is because the complete graph is completely homogeneous. Let us translate this intuition into a precise argument.

Let $S_n=\mathrm{Aut}\{1,\dots,n\}$ be the symmetric group on $\{1,\dots,n\},$ and let $\tau = (2\ 3)$ be the transposition which transposes $2$ and $3.$ Let $\tau$ act on the set of walks $1 \to v_1 \to \dots \to v_{r-1} \to 2$ by

$\tau(1 \to v_1 \to \dots \to v_{r-1} \to 2) = \tau(1) \to \tau(v_1) \to \dots \tau(v_{r-1}) \to \tau(2).$

In words, $\tau$ acts on an input walk by replacing all occurrences of $2$ with $3$ , and vice versa. This is a bijective mapping from the set of $r$ -step walks $1 \to 2$ to the set of $r$ -step walks $1 \to 3$ , and in fact this bijection is an involution because $\tau$ is. Consequently, $W^r(1,2)=W^r(1,3).$ The same argument shows that $W^r(1,2)=W^r(1,4)$ , just use $\tau=(2\ 4)$ instead.

We now know that

$W^r(1,1) + (n-1)W^r(1,2) = (n-1)^r.$

If we can find a second linear equation satisfied by these numbers, we can solve the resulting system. We will find a second relation by haruspication, which in the present situation means looking at small values of $r.$

For $r=1,$ we have $W^1(1,1)=0$ and $W^1(1,2)=1.$ For $r=2,$ we have $W^2(1,1) = n-1$ and $W^2(1,2)=n-2.$ We thus conjecture that

$W^r(1,2) = W^r(1,1) + (-1)^{r-1},$

and we can prove this by induction on $r.$ The base step of the induction has been completed; the inductive step is to show that

$W^{r+1}(1,2) = W^{r+1}(1,1) + (-1)^r.$

We have

$W^{r+1}(1,2) = W^r(1,1) + \overline{W^r(1,2)}+ \dots + W^r(1,n)$

and

$W^{r+1}(1,1)= \overline{W^r(1,1)} + W^r(1,2)+\dots+W^r(1,n),$

where in both equations the overline denotes an omitted term. Subtracting the second equation from the first, we get

$W^{r+1}(1,2)-W^{r+1}(1,1)=W^r(1,1)-W^r(1,2)=-(-1)^{r-1}=(-1)^r,$

where the second equality is the induction hypothesis.

We have arrived at a system of two linear equations in two unknowns,

$W^r(1,1) + (n-1)W^r(1,2)=(n-1)^r \quad\text{ and }\quad W^r(1,1)-W^r(1,2)=(-1)^r.$

Subtracting the second equation from the first, we arrive at

$W^r(1,2) = \frac{(n-1)^r -(-1)^r}{n}.$

Now substitute this value into the second equation to arrive at

$W^r(1,1) = (-1)^r+ \frac{(n-1)^r -(-1)^r}{n}=\frac{(n-1)^r+ (-1)^r(n-1)}{n}.$

Math 202C: Lecture 0

This post serves as the syllabus for Math 202C.

Lecturer: Jonathan Novak, office hours immediately after lectures.
Grader: Johann Birnick, office hours TBA.
Topics: Symmetric groups, multilinear algebra, algebraic varieties.
Textbook: Of course not. Useful references are this, this, and this.
Evaluation: Grade assigned based on problem sets, no exam.
Problem sets: GradeScope submission, typeset your work unless calligraphy is a hobby. If you’re using AI to solve the problems, please use it to LaTeX the solutions.

DATE	TOPIC	Modality
March 30, Lecture 1	Graphs and operators	In-person lecture
April 1, Lecture 2	Cayley graphs	In-person lecture
April 3, Lecture 3	Level operators on Cayley graphs	Asynchronous lecture post
April 6, Lecture 4	Jucys-Murphy elements	In-person lecture
April 8, Lecture 5	Polynomials and symmetric polynomials	In-person lecture
April 10, Lecture 6	Elementary symmetric polynomials	In-person lecture
April 13, Lecture 7	Elementary-to-monomial transition	Asynchronous lecture post
April 15, Lecture 8	Fundamental theorem of symmetric polys	Asynchronous lecture post
April 17, Lecture 9	None	Cancelled
April 20	Review and preview, content and context	In-person lecture
April 22	Farahat-Higman theorem	In-person lecture
April 24	Simple branching	In-person lecture
April 27	Centralizers and multiplicity-free branching	In-person lecture
April 29	Multiplicity-free branching for symmetric groups	In-person lecture
May 1	GT-basis and GT-algebra	In-person lecture
May 4
May 6
May 8
May 11
May 13
May 15
May 18
May 20
May 22
May 27
May 29
June 1
June 3
June 5

Schedule subject to change

Math 202C: Lecture 3

*** All problems in this lecture due April 12 at 23:59 ***

We return to the general setting of a finite, connected, simple graph $G=\mathrm{Cay}(S,T)$ , and as before $\mathcal{C}(S)$ the Hilbert space with orthonormal basis $|\pi\rangle, |\rho\rangle,|\sigma\rangle,\dots$ . In Lecture 1 we introduced the metrics level set linear operators $L_0,L_1,L_2,\dots$ in $\mathrm{End}\mathcal{C}(S)$ which act on vertices according to

$L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle.$

We argued the case that these operators are quite fundamental: $L_0=I$ is the identity operator, $L_1=K$ is the adjacency operator,

$D = \sum\limits_{r=0}^\infty rL_r$

is the distance operator,

$\Omega_q=\sum\limits_{r=0}^\infty q^r L_r$

is the exponential distance operator. However, the selfadjoint operators $L_0,L_1,L_2,\dots$ and the subalgebra of $\mathrm{End}\mathcal{C}(S)$ they generate are not easy to understand.

Problem 3.1. Show that $\langle \sigma|L_sL_r|\rho\rangle$ is the cardinality of the intersection of the sphere of radius $r$ centered at $\rho$ with the sphere of radius $s$ centered at $\sigma$ .

So, the matrix elements of the commutator

$[L_s,L_r]=L_sL_r - L_rL_s$

are differences of intersection numbers,

and if it seems difficult to characterize graphs for which all such differences are zero, that’s because it is. There is one much-studied class of graphs which does have the property that metric level set operators $L_0,L_1,L_2,\dots$ commute.

Problem 3.2. Show that if $G$ is a distance regular graph, then $[L_r,L_s]=0.$

The following is the only restricted situation in which I personally know a simple combinatorial characterization of commutativity of the metric level set operators (it is likely that an actual graph theorist would know more).

Problem 3.3. Suppose $G$ has diameter $2.$ Prove that the metric level set operators on $G$ commute if and only if $G$ is a regular graph. (Hint: the basic reason behind this is that in a graph of diameter $2,$ the metric does not contain any more information than the adjacency relation).

I am torturing you (us, really) with these combinatorial considerations because I want to further hammer home the point that Cayley graphs are much easier to understand than general graphs due to the underlying group structure, which is reflected in the behavior of linear operators acting on the vertices of the graph.

Suppose that $S=\{\pi,\rho,\sigma,\dots\}$ is a finite group, so that $\mathcal{C}(S)$ is the convolution algebra of this group. For any arbitrary subset $A \subseteq S,$ we write

$|A\rangle=\sum\limits_{\pi \in A}|\pi\rangle,$

which is simply the indicator function of the set $A$ . It is a very convenient abuse of notation to also simply write $A$ for the image of $|A\rangle$ in the right regular representation, i.e. we identity the set $A \subseteq S$ with the operator $A \in \mathrm{End}\mathcal{C}(S)$ defined by

$A|\rho\rangle = |\rho\rangle|A\rangle=\sum\limits_{\pi \in A} |\rho\pi\rangle.$

Recall from Math 202B that we have a tracial state

$\langle \cdot \rangle \colon \mathcal{C}(S) \longrightarrow \mathbb{C}$

on the convolution algebra of $S$ defined by

$\langle A \rangle := \langle \iota |A\rangle,$

where $\iota \in S$ is the group identity. In other words, $\langle A \rangle$ is the coefficient of the vector $|A\rangle$ in the group basis $|\pi\rangle,$ which is the same thing as the value of the function $A$ at the point $\pi=\iota.$ Here is a second under-appreciated characterization of this important functional, which is often referred to as the canonical group trace.

Problem 3.4. Prove that all diagonal matrix elements $\langle \pi | A |\pi\rangle$ of the image $A \in \mathrm{End}\mathcal{C}(S)$ of $|A\rangle\in \mathcal{C}(S)$ in the regular representation are equal to the same number, and that that number is $\langle A \rangle.$ Hence conclude (as was already shown in Math 202B) that the $\langle A \rangle =\frac{1}{|S|}\mathrm{Tr} A.$

Now to Cayley graphs. We continue with a finite group $S$ and select inside it a symmetric set of generators $T$ , giving us the Cayley graph $G = \mathrm{Cay}(S,T).$ As discussed above, we may also view $T \in \mathrm{End}\mathcal{C}(S)$ as the adjacency operator on $G.$

Problem 3.5. Show that $\langle T^r\rangle$ is the number of length $r$ loops based at the identity element (or any other specified point) in $S.$

Now specialize to the case where $S=\mathrm{Aut}\{1,\dots,n\}$ is the symmetric group and $T$ is the full conjugacy class of transpositions. The metric level set operators $L_0,L_1,L_2,\dots \in \mathrm{End}\mathcal{C}(S)$ commute for a very simple reason: they are the images of the vectors

$|L_r \rangle = \sum\limits_{\substack{\pi \in S \\ |\pi|=r}} |\pi\rangle, \quad r=0,1,2,\dots,$

and these vectors lie in the center of the convolution algebra $\mathcal{C}(S)$ . Indeed, the center of $\mathcal{C}(S)$ is spanned by the conjugacy classes

$|C_\alpha\rangle = \sum\limits_{\pi \in C_\alpha} |\pi\rangle, \quad \alpha \in \Lambda,$

and

$|L_r\rangle = \sum\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}} |C_\alpha\rangle$

is simply the sum of all conjugacy classes corresponding to partitions of $n$ with $n-r$ parts.

The fact that the level set operators $L_0,L_1,L_2,\dots$ on a graph $G$ commute is equivalent to saying that they are simultaneously diagonalizable. However, in the group-theoretic setting this statement can be refined. Let us stick with the case where $S=\mathrm{Aut}\{1,\dots,n\}$ is the symmetric group, noting that what follows is true for the Cayley graph $G=\mathrm{Cay}(S,T)$ of any finite group $S$ with generating set $T \subseteq S$ such that $|T\rangle$ is a central element in the convolution algebra $\mathcal{C}(S)$ . Let $\Lambda$ be a set indexing the conjugacy classes in $S,$ so for the symmetric group $\Lambda$ is the set of partitions of $n$ . Then $\Lambda$ also indexes isomorphism classes of irreducible unitary representations of $S,$ or equivalently irreducible linear representations of $\mathcal{C}(S).$ For each $\lambda \in \Lambda,$ choose a representative $(V^\lambda,\Phi^\lambda)$ of the corresponding isomorphism class. By Schur’s Lemma, any central element $|A\rangle \in \mathcal{C}(S)$ acts in each irrep $(V^\lambda,\Phi^\lambda)$ as a scalar operator: we have

$\Phi^\lambda|A\rangle = \widehat{A}(\lambda) I_{V^\lambda}, \quad \widehat{A}(\lambda) \in \mathbb{C}.$

Let us apply this to the adjacency operator on the all-transpositions Cayley graph $G=\mathrm{Cay}(S,T)$ of the symmetric group, which is the image of the central element

$|T\rangle = \sum\limits_{\tau \in T} |\tau\rangle= \sum\limits_{1 \leq s<t \leq n} |s\ t\rangle.$

The eigenvalues of the adjacency operator $T \in \mathrm{End}\mathcal{C}(S)$ are the numbers

$\widehat{T}(\lambda) = {n \choose 2} \frac{\chi^\lambda(\tau)}{\dim V^\lambda}, \quad \lambda \in \Lambda,$

where $\tau \in T$ is any transposition. Moreover, where the multiplicity of $\widehat{T}(\lambda)$ is $(\dim V^\lambda)^2$ because the isotypic decomposition of the regular representation is

$\mathcal{C}(S) = \bigoplus\limits_{\lambda \in \Lambda} (\dim V^\lambda)V^\lambda.$

Thus, in a sense, we completely know the spectrum of the adjacency operator on the all-transpositions Cayley graph of the symmetric group. However, to actually compute these numbers explicitly, we need to know both the dimension

$\dim V^\lambda = \chi^\lambda(\iota) = \mathrm{Tr}\varphi^\lambda(\iota)$

of every irreducible unitary representation $(V^\lambda,\varphi^\lambda)$ of $S$ , as well as the character

$\chi^\lambda(\tau) = \mathrm{Tr} \varphi^\lambda(\tau)$

of a transposition in each representation.

This highlights the fact that we really only know the irreducible unitary representations $(V^\lambda,\varphi^\lambda)$ of $S=\mathrm{Aut}\{1,\dots,n\}$ abstractly. In particular, while we know that the irreducible representations of $S$ are (up to isomorphism) in bijection with the conjugacy classes in $S$ , we have not explicitly selected such a bijection and it is not at all clear how best to do so.

Nevertheless, there are a few things we can say concretely at this stage. First, as with every group, $S$ has the trivial representation $(V^\mathsf{triv},\varphi^\mathsf{triv})$ in which $V^\mathsf{triv}$ is a one-dimensional Hilbert space and $\varphi^\mathsf{triv}$ sends every $\sigma \in S$ to the identity operator on this space. For this representation, we have

$\widehat{T}^\mathsf{triv}={n \choose 2} \frac{\chi^\mathsf{triv}(\tau)}{\dim V^\mathsf{triv}}= {n \choose 2},$

and we have reproduced the fact that the degree of the regular graph $G=\mathrm{Cay}(S,T)$ is an eigenvalue of its adjacency operator.

Second, the symmetric group $S$ has a second one-dimensional representation $(V^\mathsf{alt},\varphi^\mathsf{alt})$ in which

$\varphi^\mathsf{alt} = (-1)^{|\sigma|}I.$

This gives the eigenvalue

$\widehat{T}^\mathsf{alt}={n \choose 2} \frac{\chi^\mathsf{alt}(\tau)}{\dim V^\mathsf{alt}}=- {n \choose 2},$

which could also be deduced from the fact that the spectrum of the adjacency operator on a bipartite graph is symmetric about zero.

We can compute a third eigenvalue of the adjacency operator on the all-transpositions Cayley graph which is not directly accessible without representation-theoretic methods. To do so, we start with the permutation representation $(V^\mathsf{perm},\varphi^\mathsf{perm})$ , in which $V$ is the Hilbert space with orthonormal basis $|1\rangle,\dots,|n\rangle$ and $\varphi^\mathrm{perm}(\sigma)|i\rangle=|\sigma(i)\rangle.$ The character of this representation is

$\chi^\mathsf{perm}(\sigma) = \sum\limits_{i=1}^n \langle i | \varphi^\mathsf{perm}(\sigma) | i\rangle = \sum\limits_{i=1}^n \langle i|\sigma(i)\rangle=\mathsf{fix}(\sigma),$

the number of fixed points of the permutation $\sigma$ . However, the permutation representation is not irreducible, as it has a one-dimensional invariant subspace spanned by the vector

$|w\rangle = |1\rangle + \dots + |n\rangle.$

The orthogonal complement of this vector gives an $(n-1)$ -dimensional representation $(V^\mathsf{std},\varphi^\mathsf{std})$ of $S$ called the standard representation, whose character satisfies

$\chi^\mathsf{std}+\chi^\mathsf{triv}=\chi^\mathsf{perm}$

and is therefore given by

$\chi^\mathsf{std}(\sigma)=\mathsf{fix}(\sigma)-1.$

Problem 3.6. Prove that the standard representation of $S=\mathrm{Aut}\{1,\dots,n\}$ is irreducible, and give a third eigenvalue of the adjacency operator on the all-transpositions Cayley graph $G=\mathrm{Cay}(S,T).$

Math 202C: Lecture 2

*** Problems in this lecture due April 10 at 23:59 ***

In this lecture, we remain in the setting of the symmetric group $S=\mathrm{Aut}\{1,\dots,n\}$ but we will change the geometry on this group. More precisely, instead of identifying $S$ with its Cayley graph as generated by the set of adjacent transpositions, we will identify the symmetric group with its Cayley graph as generated by the full conjugacy class of transpositions,

$T = \{(s\ t) \in S \colon 1 \leq s < t \leq n\}.$

Let $\mathrm{d}$ denote the corresponding metric on $S,$ so that $\mathrm{d}(\rho,\sigma)$ is the minimal number $r$ such that there exist transpositions $\tau_1,\dots,\tau_r \in T$ with

$\sigma = \rho\tau_1 \dots \tau_r.$

As before, write $|\pi|=\mathrm{d}(\iota,\pi)$ for the distance from the identity permutation $\iota$ to a given permutation $\pi.$

Problem 2.1. Prove that $|\pi|=n-\mathrm{cyc}(\pi),$ where $\mathrm{cyc}(\pi)$ is the number of factors in the disjoint cycle decomposition of $\pi.$

Identifying $S$ with $\mathrm{Cay}(S,T),$ the symmetric group becomes a graded graph: it decomposes into a disjoint union

$S = L_0 \sqcup L_1 \sqcup \dots \sqcup L_{n-1}$

in which

$L_r = \{\pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathsf{cyc}(\pi)=n-r\}$

is the sphere of radius $r$ centered at $\iota,$ and $L_r$ is an independent set, meaning that no two of its points are adjacent vertices. Moreover, if $\rho,\sigma \in S$ are adjacent vertices, then we have $\rho \in L_r$ and $\sigma \in L_s$ with $|r-s|=1.$ The decomposition of a group into concentric spheres is a general thing that happens whenever one chooses a generating set; a special feature of this particular situation is that each sphere further decomposes into a disjoint union of conjugacy classes,

$L_r = \bigsqcup\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}}C_\alpha.$

Here, we are indexing the conjugacy classes of $S$ by the set $\Lambda$ of partitions of $n$ . More precisely, the notation $\alpha \vdash n$ means that $\alpha$ is a decomposition of $n$ into a sum of $\ell(\alpha)$ positive integers, without regard to order of the terms. Every permutation $\pi \in S$ gives a partition $\mathsf{type}(\pi)\vdash n$ obtained from the lengths of the cycles in its disjoint cycle decomposition; conversely, for any partition $\alpha \vdash n$ the corresponding fiber $C_\alpha =\mathsf{type}^{-1}(\alpha)$ is a conjugacy class in $S.$

Let us now construct the analogue of Mallows measure corresponding to the all-transpositions geometry. This is called the Ewens measure on permutations: by definition, it is the Gibbs measure

$\mathbb{P}(\sigma) = \frac{q^|\sigma|}{Z} = \frac{q^{n-\mathsf{cyc}(\sigma)}}{Z}$

corresponding to a statistical mechanical system whose states $\sigma$ are permutations of $1,\dots,n$ with the potential energy of $\sigma$ being inversely proportional to its number of cycles. The partition function is

$Z= \sum\limits_{\sigma \in S} q^{|\sigma|} = q^n\sum\limits_{\sigma \in S} q^{-\mathsf{cyc}(\sigma)},$

or equivalently

$Z=\sum\limits_{k=0}^n \begin{bmatrix} n \\ k \end{bmatrix} q^{-k},$

where $\begin{bmatrix} n \\ k \end{bmatrix}$ is the Stirling number of the first kind, which counts permutations in $S=\mathrm{Aut}\{1,\dots,n\}$ whose disjoint cycle factorization has $k$ components. This in turn gives

$Z= (1+q)(1+2q) \dots (1+(n-1)q),$

corresponding to the fact that Stirling Numbers of the first kind are the coefficients of the falling factorial polynomial.

Problem 2.2. Show that the expected number of cycles in a random permutation under Ewens measure is

$1 + \sum\limits_{j=1}^{n-1} \frac{1}{1+jq}.$

Deduce from this that the expected number of cycles in a uniformly random permutation of $1,\dots,n$ is the harmonic number

$H_n=1+\frac{1}{2} + \dots + \frac{1}{n}.$

The all-transpositions geometry on permutations has many features which make it easier to understand than the adjacent-transpositions geometry. Perhaps the most significant of these is that it is “Euclideanizable.” Right, that’s not a real word – here is what is meant more precisely.

Consider the full forward cycle $\gamma=(1\ 2\ \dots\ n)$ in the symmetric group $S$ . In the all-transpositions geometry, we have $|\gamma|=n-1$ , and the number of geodesics $\iota \to \gamma$ is $\mathrm{Cay}_{n-1},$ where

$\mathrm{Cay}_n = (n+1)^{n-1}$

is the $n$ th Cayley number. Taking this formula for granted, we can calculate the number of geodesics from the identity to any permutation (hence the number of geodesics between any two permutations) as follows.

Problem 2.3. Let $\pi$ be any permutation in the conjugacy class $C_\alpha$ of $S$ . Show that the number of geodesics $\iota \to \pi$ is

${n \choose \alpha_1-1,\dots,\alpha_\ell-1} \prod\limits_{k=1}^\ell \mathrm{Cay}_{\alpha_k-1},$

where $\ell=\ell(\alpha).$

To Euclideanize the all-transpositions geometry on the symmetric group, we give a rule which allows one to systematically select a distinguished geodesic between every pair of points. To do so, let us label the edges of $\mathrm{Cay}(S,T)$ as follows: we mark each edge corresponding to the transposition $(s\ t)$ with $t$ , the larger of the two numbers swapped. Thus, emanating from every vertex of the graph we have one $2$ -edge, two $3$ -edges, three $4$ -edges, etc. Call a walk on the Cayley graph strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. Thus, an $r$ -step strictly monotone walk $\rho \to \sigma$ corresponds to an equation of the form

$\sigma=\rho(s_1\ t_1) \dots (s_r\ t_r), \quad 2\leq t_1 < \dots < t_r \leq n$

in $S.$

Problem 2.4. Prove that for any $\pi \in S$ , there exists a unique strictly monotone walk $\iota \to \pi$ , and that the length of this walk is $|\pi|.$ Conclude that there exists a unique strictly monotone walk between every pair of points in $S$ , and that this walk is a geodesic.

This more or less marks the end of my efforts to get you thinking about the combinatorics of permutations; after this we will move on to the algebraic aspects of the symmetric group and its representations. The campaign for the elementary beauty and unexpected subtlety of permutations met with mixed success: there were interesting follow-up discussions with some students, whereas other pairs of eyes glazed over as the minds behind them fell into the “I already know this stuff” trap, which is one the of the more pernicious lies you can tell yourself.

Here is one final push. Our ever-curious Lani was recently asking about posets, and whether there is any active research being conducted on them these days. The answer is definitely “yes,” and though I don’t really work in the area I do have one burning question about posets which keeps me up at night. The question is about finding a partial order on permutations of a fixed genus.

From our Euclideanized perspective, genus is a metric invariant which measures “flatness” of triangles $S$ as generated by the full conjugacy class of transpositions. More precisely, let $\pi,\rho,\sigma$ be three distinct permutations in $S=\mathrm{Aut}\{1,\dots,n\}$ . There are two oriented strictly monotone triangles formed by these three points, namely $\rho \to \pi \to \sigma \to \rho$ and $\sigma \to \pi \to \rho \to \sigma.$ It may happen $\pi$ lies on a geodesic $\rho \to \sigma,$ in which case both of these triangles are flat, and have perimeter

$|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma|.$

In general, the oriented triangle $\rho \to \pi \to \sigma \to \rho$ is a cycle in a bipartite graph, so necessarily has even perimeter: we have

$|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma| + 2g(\rho,\pi,\sigma)$

for some nonnegative integer $g(\rho,\pi,\sigma)$ which we call the genus of the triple $(\rho,\pi,\sigma)$ . At the other extreme from flatness, it may be that $\pi,\rho,\sigma$ are pairwise antipodal points, i.e. the distance between each pair of them is the diameter of $S,$ in which case the above equation becomes

$3(n-1)=2(n-1)+2g(\rho,\pi,\sigma),$

giving a maximal genus of

$g(\pi,\rho,\sigma) = \lfloor \frac{n-1}{2} \rfloor.$

Problem 2.5. Show that a triple of pairwise antipodal points in $S$ exists if and only if $n$ is odd.

Since

$g(\rho,\pi,\sigma)=g(\iota,\rho^{-1}\pi,\rho^{-1}\sigma),$

it is natural to view genus as a function $g(\pi,\sigma):=g(\iota,\pi,\sigma)$ on pairs of permutations. Going even further, let us fix the second argument to be $\sigma=\gamma$ , where $\gamma=(1\ 2\ \dots\ n)$ is the full forward cycle, and define the genus of a permutation by $g(\pi):=g(\pi,\gamma).$ Then, we obtain a corresponding decomposition of the symmetric group

$S = \bigsqcup\limits_{k=0}^{\lfloor \frac{n-1}{2} \rfloor} S_g$

in which $S_g$ is the set of permutations of genus $g$ . The set $S_0$ of “planar” permutations consists precisely of those $\pi$ which lie on a geodesic $\iota \to \gamma,$ and these permutations have a special feature: the disjoint cycle decomposition of any $\pi \in S_0$ is a noncrossing partition of $\{1,\dots,n\},$ and in fact this map from genus zero permutations into noncrossing permutations is bijective. There is also a very natural partial order on $S_0$ which can be described as follows: we have $\pi \leq \sigma$ if and only if $\pi$ and $\sigma$ lie on a common geodesic $\iota \to \gamma$ , and a particle emanating from $\iota$ and traveling along this geodesic passes through $\pi$ before passing through $\sigma.$ This is a special case of the geodesic partial order which can be defined relative to any two vertices of a connected graph, and in this case it coincides exactly with the reverse refinement order on noncrossing partitions.

Now the question is: what is the higher genus analogue of this partial order? How can we define a poset structure by some uniform recipe which feels good for all values of $g$ , subject to the condition that our recipe reverts to the above at $g=0$ ? This is an open question. The most natural thing that I can think of is the following: given two permutations $\pi,\sigma \in S_g$ , we declare $\pi \leq \sigma$ if and only if they lie on a common genus $g$ triangle two of whose vertices are $\iota$ and $\gamma,$ and a particle emanating from $\iota$ and traveling along the perimeter of this triangle encounters $\pi$ before encountering $\sigma$ . Whether or not this actually defines a partial order on $S_g$ is not clear to me, but my fluid intelligence is declining from its peak while yours is cresting. So, if you want to discuss further please feel free to get in touch.

Math202C: Lecture 17 – Expander Graphs

Author: Haixiao Wang

As demonstrated in [1], expander graphs are a sparse graph that mimic the structure properties of complete graphs, quantified by vertex, edge or spectral expansion. At the same time, expander constructions have spawned research in pure and applied mathematics, with several applications to complexity theory, design of robust computer networks, and the theory of error-correcting codes.

This lecture is organized as follows. We introduce two different definitions of expanders in the first section, and then explore the relationship between them in second part.

17.1. Two Definitions

We start our discussion in expansion with two definitions of expansion: combinatorial and spectral. In the rest of this post, we consider an undirected graph $G = (V, E)$ with $|V| = n$ , where self loops and multiple edges are allowed. Given two subsets $S,T \subset V$ , the set of edges between $S$ and $T$ is denoted by $E(S, T) = \{(u,v ): (u, v) \in E, u\in S, v\in T\}$ .

17.1.1. Combinatorial expansion

Intuitively, a graph $G=(V,E)$ is an expander if subsets $S \subseteq V$ expand outwards into $V \setminus S$ . In other words, all small subsets will have large boundary, respectively. Follow this intuition, we introduce the definition of boundary, based on edges and vertices respectively, to quantify the corresponding expansion.

Definition 1(Boundary):

The Edge Boundary of a set $S$ , denoted by $\partial S$ , is the set of edges $\partial S = E(S, V \setminus S)$ , which is the set of edges between disjoint set $S$ and $V \setminus S$ .
The Vertex Boundary of a set $S$ is a subset of vertices in $V \setminus S$ adjacent to vertices in $S$ , denoted as $N(S)$ .

Obviously, vertex boundary $|N(S)|$ is the same as the number of neighboring vertices of vertex sets $S$ . With those notations, we then define the expansion over the entire graph $G$ via maximizing over subsets $S$ .

Definition 2(Expander): For some $\varepsilon \in [0, 1]$ , a graph $G = (V, E)$ with $|V| = n$ is an

$\varepsilon$ -edge-expander if $\min\limits_{ S\subset V, |S| \leq \frac{n}{2} } \frac{|\partial S|}{|E(S, V)|} \geq \varepsilon$ .
$\varepsilon$ -vertex-expander if $\min\limits_{ S\subset V, |S| \leq \frac{n}{2} } \frac{|N(S)|}{|S|} \geq \varepsilon$ .

Remark 3: The reason for the restriction $|S| \leq n/2$ is that we are examining the relative size of every cut in the graph, i.e., the number of edges crossing between $S$ and $V \setminus S$ , and we examine each case only once.

Problem 1: Show that the complete graph $K_{n}$ is

at least a $1/2$ -edge-expander.
a $1$ -vertex-expander. (We only consider $\varepsilon \in [0, 1]$ here.)

From the definition above, the maximum $\varepsilon$ , which makes $G = (V, E)$ an $\varepsilon$ -edge-expander, is an important criteria, leading to the definition of Cheeger constant.

Definition 4(Cheeger Constant): The Edge Expansion Ratio, or Cheeger constant, of graph $G = (V, E)$ with $|V| = n$ , denoted by $h(G)$ , is defined as

$h(G) = \min\limits_{S\subset V, |S| \leq \frac{n}{2}}\frac{|\partial S|}{|E(S, V)|}$

Remark 5: Actually, Cheeger constant can be defined alternatively as $\min\limits_{S\subset V, |S| \leq n/2}\frac{|\partial S|}{|S|} = O(n)$ . We didn’t follow this definition since we want to keep $\varepsilon\in [0, 1]$ .

17.1.2. Spectral Expansion

To simplify our discussion, we focus on regular graphs in the rest of this lecture. Let $A = A(G)\in \mathbb{R}^{n \times n}$ denote the adjacency Matrix of a $d$ -regular graph $G = (V, E)$ with $|V| = n$ and $A$ being real symmetric, then $A$ has $n$ real eigenvalues, denoted by $\lambda_{1} \geq \lambda_{2} \geq \cdots \geq \lambda_n$ , and corresponding orthonormal eigenvectors $v_1, \dots, v_n$ with $A v_i = \lambda_i v_i$ . The spectrum of $A$ encodes a lot of information about the graph structure. Recall the following properties, which we have seen in previous lectures.

The largest eigenvalue is $\lambda_1 = d$ and the corresponding eigenvector is $v_1 = (1/\sqrt{n}, \dots, 1/\sqrt{n})$ .
The graph $G$ is connected if and only if $\lambda_1 > \lambda_2$ .
The graph is bipartite if and only if $\lambda_1 = -\lambda_n$ .

The definition of spectral expansion and spectral gap then follows.

Definition 6(Spectral Expansion): A graph $G = (V, E)$ with $|V| = n$ is a $\lambda$ -spectral-expander if $\lambda(G):= \max \{ |\lambda_2|, |\lambda_n| \} \leq \lambda$ .

In other words, the second largest eigenvalue of $A(G)$ , in absolute value, is at most $\lambda$ . This also gives rise to the definition of Ramanujan graph.

Definition 7(Ramanujan graph): A connected $d$ -regular graph $G$ is a Ramanujan graph if $\lambda (G)\leq 2\sqrt{d-1}$ .

We now refer to the spectral gap.

Definition 8(Spectral Gap): The spectral gap of $d$ -regular graph $G$ is $\lambda_{1} - \lambda(G)$ .

Remark 9: Sometimes the spectral gap is defined as $\lambda_1 - \lambda_2$ , which is often referred as one-sided.

Problem 2:

Show that the complete graph $K_{n}$ has spectral gap $n - 2$ . Actually, we can also show that the one-sided spectral gap is $n$ .
Show that the complete bipartite graph $K_{m, n}$ has spectral gap $0$ . (Hint: use the spectral property above.)

17.2. Relating Two Definitions

It is not so obvious that the combinatorial and spectral versions of expansion are related. In this section, we will introduce two relations between edge and spectral expansion: the Expander-Mixing Lemma and Cheeger’s inequality.

17.2.1. The Expander-Mixing Lemma

Lemma 10(Expander-Mixing Lemma[2]): Let $G = (V, E)$ be a $d$ -regular graph with $n$ vertices and eigenvalues $\lambda_1 \geq \dots \geq \lambda_n$ . Then for all $S, T \subset V$ :

$\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \lambda(G) \sqrt{|S|\cdot |T|}.$

As we can see, the left-hand side measures the deviation between two quantities:

$|E(S, T)|$ : the number of edges between the two subsets $S$ and $T$ .
$|S|\cdot |T|\cdot d/n$ : the expected number of edges between two subsets $S$ and $T$ , drawn from an Erdos–Renyi random graph $G(n, d/n)$ .

The Expander-Mixing lemma tell us a small $\lambda(G)$ (a.k.a. large spectral gap) implies that this discrepancy is small, leading to the fact that the graph is nearly random in this sense, as discussed in [1]. We should mention that the converse of expander-mixing is true as well, stated as follows.

Lemma 11(Expander-Mixing Lemma Converse [4]): Let $G = (V, E)$ be a $d$ -regular graph with $n$ vertices and eigenvalues $\lambda_1 \geq \dots \geq \lambda_n$ satisfying:

$\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \rho \sqrt{|S|\cdot |T|}.$

for any two disjoint subsets $S,T\subset V$ and some positive $\rho$ . Then $\lambda(G) \leq O(\rho \cdot(1 + \log(d/\rho)))$ .

Proof of Expander-Mixing Lemma: Recall that the adjacency matrix $A$ has eigenvalues $\lambda_1 \geq \dots \geq \lambda_n$ with corresponding orthonormal eigenvectors $v_1, \dots, v_n$ . Let $1_S$ and $1_T$ be the characteristic vectors of subsets $S$ and $T$ (a.k.a., the $v$ -th coordinate of $1_S$ is $1$ if $v\in S$ and $0$ otherwise). We then decompose $1_S$ and $1_T$ in the orthonormal basis of eigenvectors $v_1, \dots, v_n$ as

$1_S = \sum\limits_{i=1}^n \alpha_i v_i, \quad 1_S = \sum\limits_{j=1}^n\beta_j v_j$ .

Observing that $|E(S, T)| = (1_S)^T A 1_T$ , we expand out the right hand side and apply orthogonality, which yields

$(1_S)^T A 1_T = \left (\sum\limits_{i=1}^n \alpha_i v_i \right)^T A \left (\sum\limits_{j=1}^n\beta_j v_j \right) = \sum\limits_{i=1}^n \lambda_i\alpha_i\beta_i = d\frac{|S|\cdot|T|}{n} + \sum\limits_{i=2}^n \lambda_i\alpha_i\beta_i$ .

where the last step follows from the facts $\lambda_1 = d$ with corresponding $v_1 = (1/\sqrt{n},\dots, 1/\sqrt{n})$ , $\alpha_1 =<1_S, v_1> = |S|/\sqrt{n}$ and $\beta_1 =<1_T, v_1> = |T|/\sqrt{n}$ . Shifting the first term on right hand side to the left, the desired result follows from Cauchy-Schwarz inequality:

$\left| |E(S, T)| - d\frac{|S||T|}{n} \right | = \left|\sum\limits_{i=2}^n\lambda_i\alpha_i\beta_i\right| \leq \lambda(G) \sum\limits_{i=2}^n |\alpha_i \beta_i| \leq \lambda(G) ||\alpha||_2 ||\beta ||_2 = \lambda(G) ||1_S||_2 ||1_T||_2 = \lambda(G)\sqrt{|S||T|}.$

Remark 12: If we normalize the adjacency matrix $A = A(G)$ and obtain $\bar{A} = (1/d) A$ , where each entry of $A$ is divided by $d$ , then the corresponding eigenvalues of $\bar{A}$ , namely $\bar{\lambda}_{1}\geq \dots \geq \bar{\lambda}_n$ , lie in the interval $[-1, 1]$ . The Expander-Mixing Lemma is in the form of

$\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \bar{\lambda}(G)d \sqrt{|S|\cdot |T|}.$

It is sometimes convenient to consider the normalized spectrum. For example in random graph theory, it would be convenient to look at the normalized spectrum when the expected degree $d = O(\log n)$ . However, we can focus on the spectrum of $\latex A = A(G)$ without normalization when the expected degree $\latex d = O(1)$.

Regular $\lambda$ -spectral expanders with small $\lambda(G)$ (a.k.a. large spectral gap) have some of significant properties, listed as follows.

Corollary 13: An independent set , in a graph $G = (V, E)$ with $|V| = n$ , is a set of vertices $S$ , where no two vertices in $S$ are adjacent, i.e. with $|E(S, S)| = 0$ . An independent set $S$ , in a $d$ -regular $\alpha$ -spectral expander $G$ , has cardinality at most $\alpha n$ , i.e., $|S|\leq \alpha n$ , which follows immediately from Lemma 10(Expander-Mixing Lemma[2]).

Corollary 14: Given a graph $G = (V, E)$ with $|V| = n$ , the diameter of $G$ is defined as $\mathrm{diam}(G) = \max\limits_{u, v}d_{G}(u, v)$ , where $d_{G}(u, v)$ is the length of shortest path between $u$ and $v$ . If $G$ is a $d$ -regular $\alpha$ -spectral expander $G$ with $\alpha < d/2$ , then we have

$\Omega \Big( \frac{\log n}{\log d} \Big) \leq \mathrm{diam}(G) \leq O(\log n).$

17.2.2. Cheeger’s Inequality

As we have seen from previous discussion, spectral expansion implies strong structural properties on the edge-structure of graphs. In this section, we introduce another famous result, Cheeger’s inequality, indicating that the Cheeger constant(Edge Expansion Ratio) of a graph $G$ can be approximated by the its spectral gap $\lambda_1 - \lambda(G)$ .[6]

Theorem 15(Cheeger’s inequality, [3]): Let $G = (V, E)$ be a $d$ -regular graph with eigenvalues $\lambda_1\geq \cdots \geq \lambda_n$ , then

$\frac{d - \lambda_2}{2} \leq \min\limits_{S\subset V, |S|\leq n/2} \frac{|\partial S|}{|S|} \leq \sqrt{2d(d - \lambda_2)}$ .

In the normalized scenario $\bar{A} = (1/d)A(G)$ with eigenvalues $\bar{\lambda}_{1}, \cdots, \bar{\lambda}_n \in [-1, 1]$ , we have

$\frac{1 - \bar{\lambda}_2}{2} \leq h(G):=\min\limits_{S\subset V, |S|\leq n/2} \frac{|\partial S|}{|E(S, V)|} \leq \sqrt{2(1 - \bar{\lambda}_2)}$ .

Remark 16: The proof for the normalized scenario is different from the regular scenario. We should start from the desired quantity and finish the calculation step by step. However, the strategy are the same. We use the Rayleigh Quotient to obtain upper bound and lower bound.

In the context of $n\gg d$ , the estimate of spectral gap was given by Nilli Alon.

Theorem 17([5]): For every $d$ -regular graph $G$ , we have

$\lambda(G) \geq 2\sqrt{d - 1} - o_{n}(1)$

The $o_{n}(1)$ term is a quantity that tends to zero for every fixed $d$ as $n \to \infty$ .

[1]. Shlomo Horry, N. L. and Wigderson, A. (2006). Expander graphs andtheir applications.BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY, 43(4).
[2]. Alon, N. and Chung, F. R. K. (1988). Explicit construction of linear sized tolerantnetworks.Discrete Mathematics, 42
[3]. Cheeger, J. (1970). A lower bound for the smallest eigenvalue of the laplacian.Problems inanalysis, 26.[Nilli, 1991] Nilli, A. (1991). On the secon
[4]. Yonatan Bilu, N. L. (2006). Lifts, discrepancy and nearly optimal spectral gap.Com-binatorica, 26
[5]. Nilli, A. (1991). On the second eigenvalue of a graph.Discrete Mathematics, 91(2):207–210.
[6]. Lovett, S. (2021). Lecture notes for expander graphs and high-dimensional expanders.