Math 202C: Lecture 17 – Expander Graphs

Author: Haixiao Wang

As demonstrated in [1], expander graphs are sparse graphs that mimic the structural properties of complete graphs, as quantified by vertex, edge, or spectral expansion. At the same time, expander constructions have spawned research in pure and applied mathematics, with several applications to complexity theory, the design of robust computer networks, and the theory of error-correcting codes.

This lecture is organized as follows. We introduce two different definitions of expanders in the first section, and then explore the relationship between them in the second.

17.1. Two Definitions


We start our discussion of expansion with two definitions: combinatorial and spectral. In the rest of this post, we consider an undirected graph G = (V, E) with |V| = n, where self loops and multiple edges are allowed. Given two subsets S, T \subset V, the set of edges between S and T is denoted by E(S, T) = \{(u, v): (u, v) \in E, u \in S, v \in T\}.

17.1.1. Combinatorial expansion


Intuitively, a graph G=(V,E) is an expander if subsets S \subseteq V expand outwards into V \setminus S; in other words, all small subsets have large boundary. Following this intuition, we introduce the definition of boundary, based on edges and vertices respectively, to quantify the corresponding expansion.

Definition 1 (Boundary):

  • The Edge Boundary of a set S, denoted by \partial S, is the set of edges \partial S = E(S, V \setminus S), which is the set of edges between disjoint set S and V \setminus S.
  • The Vertex Boundary of a set S is a subset of vertices in V \setminus S adjacent to vertices in S, denoted as N(S).

Clearly, the size |N(S)| of the vertex boundary is just the number of neighbors of the set S outside S. With these notations, we then define expansion over the entire graph G by minimizing over subsets S.

Definition 2 (Expander): For some \varepsilon \in [0, 1], a graph G = (V, E) with |V| = n is an

  • \varepsilon-edge-expander if \min\limits_{ S\subset V, |S| \leq \frac{n}{2} } \frac{|\partial S|}{|E(S, V)|} \geq \varepsilon.
  • \varepsilon-vertex-expander if \min\limits_{ S\subset V, |S| \leq \frac{n}{2} } \frac{|N(S)|}{|S|} \geq \varepsilon.

Remark 3: The reason for the restriction |S| \leq n/2 is that we are examining the relative size of every cut in the graph, i.e., the number of edges crossing between S and V \setminus S, and we examine each case only once.

Problem 1: Show that the complete graph K_{n} is

  • at least a 1/2-edge-expander.
  • a 1-vertex-expander. (We only consider \varepsilon \in [0, 1] here.)

From the definition above, the maximum \varepsilon which makes G = (V, E) an \varepsilon-edge-expander is an important criterion, leading to the definition of the Cheeger constant.

Definition 4 (Cheeger Constant): The Edge Expansion Ratio, or Cheeger constant, of a graph G = (V, E) with |V| = n, denoted by h(G), is defined as

h(G) = \min\limits_{S\subset V, |S| \leq \frac{n}{2}}\frac{|\partial S|}{|E(S, V)|}

Remark 5: The Cheeger constant can alternatively be defined as \min\limits_{S\subset V, |S| \leq n/2}\frac{|\partial S|}{|S|}, which can be as large as order n (consider the complete graph K_n). We do not follow this definition since we want to keep \varepsilon\in [0, 1].
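
For small graphs, these quantities can be computed directly by enumerating subsets. Below is a minimal brute-force sketch in Python (the encoding of the graph as a dictionary of adjacency sets, and the helper names, are ours); it computes h(G) exactly per Definition 4, so it is only feasible for small n.

from itertools import combinations

def edge_boundary(adj, S):
    # |dS|: number of edges from S to its complement
    return sum(1 for u in S for v in adj[u] if v not in S)

def cheeger_constant(adj):
    # h(G) = min over |S| <= n/2 of |dS| / |E(S, V)|, per Definition 4
    V = list(adj)
    n = len(V)
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in combinations(V, k):
            S = set(S)
            vol = sum(len(adj[u]) for u in S)   # |E(S, V)| = sum of degrees over S
            best = min(best, edge_boundary(adj, S) / vol)
    return best

# Example: the 4-cycle; cutting it into two opposite edges gives h = 2/4 = 1/2.
C4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(cheeger_constant(C4))   # 0.5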

17.1.2. Spectral Expansion

To simplify our discussion, we focus on regular graphs in the rest of this lecture. Let A = A(G)\in \mathbb{R}^{n \times n} denote the adjacency matrix of a d-regular graph G = (V, E) with |V| = n. Since A is real and symmetric, it has n real eigenvalues, denoted by \lambda_{1} \geq \lambda_{2} \geq \cdots \geq \lambda_n, and corresponding orthonormal eigenvectors v_1, \dots, v_n with A v_i = \lambda_i v_i. The spectrum of A encodes a lot of information about the graph structure. Recall the following properties, which we have seen in previous lectures.

  • The largest eigenvalue is \lambda_1 = d and the corresponding eigenvector is v_1 = (1/\sqrt{n}, \dots, 1/\sqrt{n}).
  • The graph G is connected if and only if \lambda_1 > \lambda_2.
  • The graph is bipartite if and only if \lambda_1 = -\lambda_n.
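
These properties are easy to check numerically. Here is a quick sketch using numpy for the cycles C_5 (connected, not bipartite) and C_4 (bipartite); the constructions are ours.

import numpy as np

def cycle_adjacency(n):
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

for n in (5, 4):
    lam = np.sort(np.linalg.eigvalsh(cycle_adjacency(n)))[::-1]
    print(n, np.round(lam, 3))
# C_5: lambda_1 = 2 = d, lambda_1 > lambda_2 (connected), lambda_n > -2 (not bipartite)
# C_4: lambda_1 = 2 and lambda_n = -2 = -lambda_1 (bipartite)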

The definition of spectral expansion and spectral gap then follows.

Definition 6 (Spectral Expansion): A graph G = (V, E) with |V| = n is a \lambda-spectral-expander if \lambda(G):= \max \{ |\lambda_2|, |\lambda_n| \} \leq \lambda.

In other words, every eigenvalue of A(G) other than \lambda_1 is at most \lambda in absolute value. This also gives rise to the definition of a Ramanujan graph.

Definition 7 (Ramanujan graph): A connected d-regular graph G is a Ramanujan graph if \lambda(G)\leq 2\sqrt{d-1}.

We now define the spectral gap.

Definition 8 (Spectral Gap): The spectral gap of a d-regular graph G is \lambda_{1} - \lambda(G).

Remark 9: Sometimes the spectral gap is defined as \lambda_1 - \lambda_2, which is often referred to as the one-sided spectral gap.
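
As a concrete example, the Petersen graph is 3-regular with spectrum \{3, 1^{(5)}, (-2)^{(4)}\}, so \lambda(G) = 2 \leq 2\sqrt{2} and the graph is Ramanujan. A numpy sketch (the vertex labeling below is one standard choice):

import numpy as np

A = np.zeros((10, 10))
edges = [(i, (i + 1) % 5) for i in range(5)]            # outer 5-cycle
edges += [(5 + i, 5 + (i + 2) % 5) for i in range(5)]   # inner pentagram
edges += [(i, i + 5) for i in range(5)]                 # spokes
for u, v in edges:
    A[u, v] = A[v, u] = 1

lam = np.sort(np.linalg.eigvalsh(A))[::-1]
d = round(float(lam[0]))
lam_G = max(abs(lam[1]), abs(lam[-1]))
print(lam_G, 2 * np.sqrt(d - 1))   # 2.0 <= 2.828..., so the Petersen graph is Ramanujan
print(d - lam_G)                   # spectral gap d - lambda(G) = 1.0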

Problem 2:

  • Show that the complete graph K_{n} has spectral gap n - 2. Actually, we can also show that the one-sided spectral gap is n.
  • Show that the complete bipartite graph K_{m, n} has spectral gap 0. (Hint: use the spectral property above.)

17.2. Relating Two Definitions

It is not so obvious that the combinatorial and spectral versions of expansion are related. In this section, we will introduce two relations between edge and spectral expansion: the Expander-Mixing Lemma and Cheeger’s inequality.

17.2.1. The Expander-Mixing Lemma

Lemma 10 (Expander-Mixing Lemma [2]): Let G = (V, E) be a d-regular graph with n vertices and eigenvalues \lambda_1 \geq \dots \geq \lambda_n. Then for all S, T \subset V:

\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \lambda(G) \sqrt{|S|\cdot |T|}.

As we can see, the left-hand side measures the deviation between two quantities:

  • |E(S, T)|: the number of edges between the two subsets S and T.
  • |S|\cdot |T|\cdot d/n: the expected number of edges between the two subsets S and T in an Erdős–Rényi random graph G(n, d/n).

The Expander-Mixing Lemma tells us that a small \lambda(G) (i.e., a large spectral gap) implies that this discrepancy is small, so that the graph is nearly random in this sense, as discussed in [1]. We should mention that a converse of the Expander-Mixing Lemma is true as well, stated as follows.

Lemma 11 (Expander-Mixing Lemma Converse [4]): Let G = (V, E) be a d-regular graph with n vertices and eigenvalues \lambda_1 \geq \dots \geq \lambda_n satisfying

\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \rho \sqrt{|S|\cdot |T|}

for any two disjoint subsets S,T\subset V and some positive \rho. Then \lambda(G) \leq O(\rho \cdot(1 + \log(d/\rho))).

Proof of Expander-Mixing Lemma: Recall that the adjacency matrix A has eigenvalues \lambda_1 \geq \dots \geq \lambda_n with corresponding orthonormal eigenvectors v_1, \dots, v_n. Let 1_S and 1_T be the characteristic vectors of subsets S and T (i.e., the v-th coordinate of 1_S is 1 if v\in S and 0 otherwise). We then decompose 1_S and 1_T in the orthonormal basis of eigenvectors v_1, \dots, v_n as

1_S = \sum\limits_{i=1}^n \alpha_i v_i, \quad 1_T = \sum\limits_{j=1}^n\beta_j v_j.

Observing that |E(S, T)| = (1_S)^T A 1_T, we expand out the right hand side and apply orthogonality, which yields

(1_S)^T A 1_T = \left (\sum\limits_{i=1}^n \alpha_i v_i \right)^T A \left (\sum\limits_{j=1}^n\beta_j v_j \right) = \sum\limits_{i=1}^n \lambda_i\alpha_i\beta_i = d\frac{|S|\cdot|T|}{n} + \sum\limits_{i=2}^n \lambda_i\alpha_i\beta_i,

where the last step follows from the facts \lambda_1 = d with corresponding v_1 = (1/\sqrt{n},\dots, 1/\sqrt{n}), \alpha_1 = \langle 1_S, v_1\rangle = |S|/\sqrt{n} and \beta_1 = \langle 1_T, v_1\rangle = |T|/\sqrt{n}. Moving the first term on the right-hand side to the left, the desired result follows from the Cauchy–Schwarz inequality:

\left| |E(S, T)| - d\frac{|S||T|}{n} \right | = \left|\sum\limits_{i=2}^n\lambda_i\alpha_i\beta_i\right| \leq \lambda(G) \sum\limits_{i=2}^n |\alpha_i \beta_i| \leq \lambda(G) ||\alpha||_2 ||\beta ||_2 = \lambda(G) ||1_S||_2  ||1_T||_2 = \lambda(G)\sqrt{|S||T|}.
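
One can sanity-check the lemma numerically by sampling random subset pairs; here is a sketch on the Petersen graph (3-regular, with \lambda(G) = 2; the construction repeats the one given above, and the sampling choices are ours).

import numpy as np

n = 10
A = np.zeros((n, n))
edges = [(i, (i + 1) % 5) for i in range(5)]
edges += [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
edges += [(i, i + 5) for i in range(5)]
for u, v in edges:
    A[u, v] = A[v, u] = 1

lam = np.sort(np.linalg.eigvalsh(A))[::-1]
d, lam_G = lam[0], max(abs(lam[1]), abs(lam[-1]))

rng = np.random.default_rng(0)
for _ in range(5):
    S = rng.choice(n, size=rng.integers(1, n), replace=False)
    T = rng.choice(n, size=rng.integers(1, n), replace=False)
    e_ST = A[np.ix_(S, T)].sum()   # |E(S, T)|, counting ordered pairs as in the definition
    lhs = abs(e_ST - d * len(S) * len(T) / n)
    rhs = lam_G * np.sqrt(len(S) * len(T))
    print(round(lhs, 2), "<=", round(rhs, 2))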

Remark 12: If we normalize the adjacency matrix A = A(G) and obtain \bar{A} = (1/d) A, where each entry of A is divided by d, then the corresponding eigenvalues of \bar{A}, namely \bar{\lambda}_{1}\geq \dots \geq \bar{\lambda}_n, lie in the interval [-1, 1]. The Expander-Mixing Lemma then takes the form

\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \bar{\lambda}(G)d \sqrt{|S|\cdot |T|}.

It is sometimes convenient to consider the normalized spectrum. For example, in random graph theory it is convenient to look at the normalized spectrum when the expected degree d = O(\log n), while we can focus on the spectrum of A = A(G) without normalization when the expected degree d = O(1).

Regular \lambda-spectral expanders with small \lambda(G) (i.e., a large spectral gap) have several significant properties, listed as follows.

Corollary 13: An independent set in a graph G = (V, E) with |V| = n is a set of vertices S no two of which are adjacent, i.e., with |E(S, S)| = 0. An independent set S in a d-regular \alpha-spectral expander G has cardinality at most \alpha n/d: applying Lemma 10 (Expander-Mixing Lemma [2]) with T = S gives 0 = |E(S, S)| \geq d|S|^2/n - \alpha|S|.

Corollary 14: Given a graph G = (V, E) with |V| = n, the diameter of G is defined as \mathrm{diam}(G) = \max\limits_{u, v}d_{G}(u, v), where d_{G}(u, v) is the length of the shortest path between u and v. If G is a d-regular \alpha-spectral expander with \alpha < d/2, then we have

\Omega \Big( \frac{\log n}{\log d} \Big) \leq \mathrm{diam}(G) \leq O(\log n).

17.2.2. Cheeger’s Inequality

As we have seen from the previous discussion, spectral expansion implies strong structural properties on the edge structure of graphs. In this section, we introduce another famous result, Cheeger's inequality, indicating that the Cheeger constant (Edge Expansion Ratio) of a graph G can be approximated by its one-sided spectral gap \lambda_1 - \lambda_2 [6].

Theorem 15 (Cheeger's inequality, [3]): Let G = (V, E) be a d-regular graph with eigenvalues \lambda_1\geq \cdots \geq \lambda_n. Then

\frac{d - \lambda_2}{2} \leq \min\limits_{S\subset V, |S|\leq n/2} \frac{|\partial S|}{|S|} \leq \sqrt{2d(d - \lambda_2)}.

In the normalized scenario \bar{A} = (1/d)A(G) with eigenvalues \bar{\lambda}_{1}, \cdots, \bar{\lambda}_n \in [-1, 1], we have

\frac{1 - \bar{\lambda}_2}{2} \leq h(G):=\min\limits_{S\subset V, |S|\leq n/2} \frac{|\partial S|}{|E(S, V)|} \leq \sqrt{2(1 - \bar{\lambda}_2)}.

Remark 16: The proof in the normalized scenario differs from the regular one: there we start from the desired quantity and carry out the calculation step by step. The strategy, however, is the same: in both cases the Rayleigh quotient is used to obtain the upper and lower bounds.
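
Both sides of the inequality are easy to verify numerically for small graphs by combining a brute-force computation of h(G) with the normalized spectrum. A sketch for the 6-cycle, where h(G) = 1/3 (the helper name is ours):

import numpy as np
from itertools import combinations

def h(A):
    # brute-force h(G) = min |dS| / |E(S, V)| over subsets with |S| <= n/2
    n = len(A)
    deg = A.sum(axis=1)
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            S = list(S)
            comp = [v for v in range(n) if v not in S]
            best = min(best, A[np.ix_(S, comp)].sum() / deg[S].sum())
    return best

n = 6                                    # the 6-cycle, d = 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
lam_bar = np.sort(np.linalg.eigvalsh(A / 2))[::-1]   # normalized spectrum
lo, hi = (1 - lam_bar[1]) / 2, np.sqrt(2 * (1 - lam_bar[1]))
print(lo, "<=", h(A), "<=", hi)          # 0.25 <= 0.3333... <= 1.0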

In the regime n \gg d, the following estimate of the spectral gap was given by A. Nilli [5].

Theorem 17 ([5]): For every d-regular graph G, we have

\lambda(G) \geq 2\sqrt{d - 1} - o_{n}(1).

The o_{n}(1) term is a quantity that tends to zero for every fixed d as n \to \infty.

[1]. Hoory, S., Linial, N., and Wigderson, A. (2006). Expander graphs and their applications. Bulletin (New Series) of the American Mathematical Society, 43(4).

[2]. Alon, N. and Chung, F. R. K. (1988). Explicit construction of linear sized tolerant networks. Discrete Mathematics, 72.

[3]. Cheeger, J. (1970). A lower bound for the smallest eigenvalue of the Laplacian. Problems in Analysis, 195–199.

[4]. Bilu, Y. and Linial, N. (2006). Lifts, discrepancy and nearly optimal spectral gap. Combinatorica, 26.

[5]. Nilli, A. (1991). On the second eigenvalue of a graph.Discrete Mathematics, 91(2):207–210.

[6]. Lovett, S. (2021). Lecture notes for expander graphs and high-dimensional expanders.

Math 202C: Lecture 12

Since the representation theory of the symmetric group is likely fresh in your mind thanks to yesterday’s exam, this is a good time to show you a very nice probabilistic application of this theory to card shuffling. This post will discuss the “top-to-random” shuffle, where the cards 1,\dots,n are arranged in a single row (or a single stack), and a randomly chosen card is repeatedly swapped with the rightmost (or top) card. The algebraic model for this procedure is the Cayley graph \Gamma(n) of the symmetric group as generated by the so-called “star transpositions”

\tau_1=(1\ n), \tau_2=(2\ n),\dots, \tau_{n-1} = (n-1\ n).

Problem 12.1: Prove that the above transpositions do in fact generate \mathrm{S}(n).

These are called star transpositions because, if one forms the graph on \{1,\dots,n\} such that \{i,j\} is an edge if and only if one of the above transpositions swaps i and j, then this graph is a star graph. The top-to-random shuffle is then the same thing as the lazy simple random walk on the Cayley graph \Gamma(n).

The paper attached below gives a detailed analysis of the top-to-random shuffle using the representation theory of \mathrm{S}(n), and gives a precise meaning to the statement that n \log n such shuffles are required in order to randomize the deck. The argument is pretty much all representation theory — only some very basic probabilistic reasoning is used towards the end.
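
Before reading the paper, one can get a feel for the n \log n timescale by computing the walk's distribution exactly for a small deck and tracking its total variation distance to uniform. A rough Python sketch (all modeling choices here are ours; n \log n \approx 8 for n = 5):

import itertools

n = 5
perms = list(itertools.permutations(range(n)))
index = {p: i for i, p in enumerate(perms)}
N = len(perms)

def step_targets(p):
    # swap each position j with the top position n-1 (j = n-1 is the lazy no-op)
    out = []
    for j in range(n):
        q = list(p)
        q[j], q[n - 1] = q[n - 1], q[j]
        out.append(index[tuple(q)])
    return out

targets = [step_targets(p) for p in perms]
dist = [0.0] * N
dist[index[tuple(range(n))]] = 1.0       # start at the identity
for r in range(1, 26):
    new = [0.0] * N
    for i, mass in enumerate(dist):
        for t in targets[i]:
            new[t] += mass / n
    dist = new
    if r % 5 == 0:
        tv = 0.5 * sum(abs(x - 1 / N) for x in dist)
        print(r, round(tv, 4))           # total variation distance to uniform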

Math 202C: Lecture 9

We begin with a brief recap of the main result from Lecture 8. Let

\chi_\sigma = \sum\limits_{\pi \in \mathrm{B}(k)} \chi_\sigma(\pi) \pi, \quad \sigma \in \mathrm{B}(k)

be the characters of the rank k-hypercube group

\mathrm{B}(k) = \langle (1\ 2), (3\ 4), \dots, (2k-1\ 2k) \rangle.

Definition 1: The orthogonal idempotents in the group algebra \mathbb{C}\mathrm{B}(k) are defined as the characters of \mathrm{B}(k) normalized by the order of \mathrm{B}(k):

\tilde{\chi}_\sigma = \frac{1}{2^k} \chi_\sigma, \quad \sigma \in \mathrm{B}(k).

The reason for the name is the following key property of these elements:

\tilde{\chi}_\rho \tilde{\chi}_\sigma = \tilde{\chi}_\rho [\rho = \sigma], \quad \rho,\sigma \in \mathrm{B}(k).

This is quite valuable, since expanding elements a \in \mathbb{C}\mathrm{B}(k) in the group algebra relative to the basis of orthogonal idempotents essentially makes multiplication in the group algebra trivial, which translates into the “Fourier coefficients of a product are products of Fourier coefficients” fact that we saw last Lecture. Note also that the behavior of orthogonal idempotents in the group algebra is formally identical to the behavior of orthogonal projections in Hilbert space — you perhaps know from Math 202B that this is not a coincidence, and if you don’t know this don’t worry, because we will discuss it.

The main result for today is the following.

Theorem 1: The class of multiplication operators in \mathrm{End}\mathbb{C}\mathrm{B}(k) coincides with the class of operators that are diagonal in the character basis of \mathbb{C}\mathrm{B}(k).

Proof: Let b \in \mathbb{C}\mathrm{B}(k) be an element of the group algebra, and let R_b \in \mathrm{End} \mathbb{C}\mathrm{B}(k) be the corresponding multiplication operator. Consider the character expansion of b,

b = 2^{-k} \sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,b\rangle \chi_\pi .

Now let \chi_\sigma \in \mathbb{C}\mathrm{B}(k) be another character. We have

R_b\chi_\sigma = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,b\rangle \chi_\sigma\chi_\pi = 2^{-k}\langle \chi_\sigma,b\rangle 2^k \chi_\sigma = \langle \chi_\sigma,b\rangle\chi_\sigma.

So, \chi_\sigma is an eigenvector of R_b, with eigenvalue the Fourier coefficient \langle \chi_\sigma,b\rangle.

Conversely, suppose that R \in  \mathrm{End}\mathbb{C}\mathrm{B}(k) acts diagonally in the character basis:

R \chi_\pi = \lambda_\pi \chi_\pi, \quad \pi \in \mathrm{B}(k).

Then the action of R on an arbitrary a \in \mathbb{C}\mathrm{B}(k) is

Ra = R 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a \rangle \chi_\pi = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a \rangle \lambda_\pi\chi_\pi = \left(  2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a \rangle \chi_\pi \right) \left(2^{-k}\sum\limits_{\sigma \in \mathrm{B}(k)} \lambda_\sigma \chi_\sigma \right),

so that R=R_b with

b= 2^{-k}\sum\limits_{\sigma \in \mathrm{B}(k)} \lambda_\sigma \chi_\sigma.

— Q.E.D.

With Theorem 1 in hand, we can immediately compute the eigenvalues and eigenvectors of the Cayley graph of \mathrm{B}(k). Indeed, the adjacency operator of the Cayley graph of \mathrm{B}(k) is just the multiplication operator

R_{\sum_{i=1}^k \tau_i},

so by Theorem 1 we immediately have that

R_{\sum_{i=1}^k \tau_i} \chi_\sigma = \langle \chi_\sigma,\sum\limits_{i=1}^k \tau_i \rangle \chi_\sigma, \quad \sigma \in \mathrm{B}(k).

Now, we have that the resulting eigenvalue is

\langle \chi_\sigma,\sum\limits_{i=1}^k \tau_i \rangle = \sum\limits_{i=1}^k \langle \chi_\sigma,\tau_i \rangle,

and this is a sum we can explicitly evaluate, because we explicitly computed the characters of \mathrm{B}(k) back in Lecture 7. In particular, what we know is that

\langle \chi_\sigma,\tau_i \rangle =\chi_\sigma(\tau_i)

is equal to -1 if \tau_i is a factor of \sigma, and is equal to +1 if \tau_i is not a factor of \sigma. Since we compute this scalar product for each and every generator \tau_i of \mathrm{B}(k), the sum has |\sigma| terms equal to -1, and k - |\sigma| terms equal to +1, where |\sigma| is the word norm of \sigma, or equivalently its distance to the identity permutation in the Cayley graph. So we get

\sum\limits_{i=1}^k \langle \chi_\sigma,\tau_i \rangle = k-2|\sigma|

as our explicit eigenvalue of the character \chi_\sigma under the action of the adjacency operator. Moreover, the number of elements \sigma \in \mathrm{B}(k) such that |\sigma|=j for a given 0 \leq j \leq k is simply {k \choose j}, since this just amounts to choosing j of the generators \tau_1,\dots,\tau_k of \mathrm{B}(k). So we can summarize our results as follows.

Theorem 2: The eigenvalues of Cayley graph of \mathrm{B}(k) are the numbers

k-2j, \quad j=0,\dots,k,

and the dimension of the eigenspace corresponding to k-2j is {k \choose j}.

Since all eigenvalues of the Cayley graph of \mathrm{B}(k) are given explicitly by Theorem 2, we are in a position to explicitly enumerate walks on this graph, using the spectral approach to counting walks discussed previously.
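
Theorem 2 is easy to confirm numerically: identifying \sigma \in \mathrm{B}(k) with the bitmask of generators appearing in it, the Cayley graph is the k-dimensional hypercube graph, with adjacency given by flipping a single bit. A numpy sketch:

import numpy as np
from collections import Counter
from math import comb

k = 4
n = 2 ** k
A = np.zeros((n, n))
for u in range(n):
    for b in range(k):
        A[u, u ^ (1 << b)] = 1           # flipping bit b = multiplying by tau_{b+1}

lam = np.rint(np.linalg.eigvalsh(A)).astype(int).tolist()
print(sorted(Counter(lam).items()))      # observed eigenvalues with multiplicities
print(sorted((k - 2 * j, comb(k, j)) for j in range(k + 1)))   # Theorem 2: they agree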

Problem 9.1: Show that the number W^r(\rho,\sigma) of r-step walks on \mathrm{B}(k) from \rho to \sigma is

W^r(\rho,\sigma) = \frac{1}{2^k} \sum\limits_{i=0}^k \sum\limits_{j=0}^{|\rho\sigma|} (-1)^j {|\rho\sigma| \choose j} {k-|\rho\sigma| \choose i-j} (k-2i)^r.

This material, and problem 9.1, have applications in coding theory which you might find interesting to look into.
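
The formula of Problem 9.1 can also be checked against direct powers of the adjacency matrix (this is a numerical confirmation, not a solution; here |\rho\sigma| is the Hamming distance between the bitmasks of \rho and \sigma, and the walk starts at the identity \rho = 0):

import numpy as np
from math import comb

k, r = 4, 6
n = 2 ** k
A = np.zeros((n, n), dtype=np.int64)
for u in range(n):
    for b in range(k):
        A[u, u ^ (1 << b)] = 1

def W_formula(m, r):
    # m = |rho sigma|; the inner sum is a Krawtchouk polynomial evaluated at m
    total = 0
    for i in range(k + 1):
        for j in range(min(m, i) + 1):
            total += (-1) ** j * comb(m, j) * comb(k - m, i - j) * (k - 2 * i) ** r
    return total // 2 ** k

Ar = np.linalg.matrix_power(A, r)
assert all(Ar[0, s] == W_formula(bin(s).count("1"), r) for s in range(n))
print("formula agrees with matrix powers")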

Math 202C: Lecture 8

We are at the beginning of the theory of harmonic analysis on finite abelian groups, starting with the concrete example of the rank k hypercube group, which we have chosen to define as the permutation group \mathrm{B}(k) generated by the disjoint transpositions

\tau_1=(1\ 2), \tau_2=(3\ 4), \dots, \tau_k = (2k-1\ 2k).

Typically, analysts think of harmonic analysis on \mathrm{B}(k) (or any other group) as the study of the Hilbert space \mathbb{C}^{\mathrm{B}(k)} of functions

a \colon \mathrm{B}(k) \longrightarrow \mathbb{C}

equipped with the scalar product

\langle a,b \rangle = \sum\limits_{\pi \in \mathrm{B}(k)} \overline{a(\pi)}b(\pi),

so that one has a canonical orthonormal basis \{\delta_\pi\} of \mathbb{C}^{\mathrm{B}(k)} consisting of indicator functions. This space has the structure of an algebra when equipped with the convolution product of functions,

[a*b](\pi) = \sum\limits_{\sigma \in \mathrm{B}(k)} a(\pi\sigma^{-1}) b(\sigma),

which is exactly what one gets from bilinear extension of group law in \mathrm{B}(k) lifted to the group element basis:

\delta_\rho * \delta_\sigma = \delta_{\rho\sigma}.

In particular, the convolution product on \mathbb{C}^{\mathrm{B}(k)} is quite different from the pointwise product. This is the right point of view to take when one is dealing with non-discrete groups (e.g. Lie groups), but for finite groups there is another way of writing things out which is often simpler and cleaner, and favored by algebraists.

The algebraists' preferred Hilbert space is \mathbb{C}\mathrm{B}(k), the space of formal \mathbb{C}-linear combinations of permutations \pi \in \mathrm{B}(k),

a = \sum\limits_{\pi \in \mathrm{B}(k)} a(\pi)\pi.

Such linear combinations are of course equivalent to functions on the group: you can think of the linear combination a as the generating function of the function \pi \mapsto a(\pi). From this perspective the canonical orthonormal basis of \mathbb{C}\mathrm{B}(k) consists of the permutations \pi \in \mathrm{B}(k) themselves (rather than their indicator functions), which induces the equivalent scalar product,

\langle a,b \rangle = \sum\limits_{\pi \in \mathrm{B}(k)} \overline{\langle \pi, a\rangle} \langle \pi,b \rangle.

The convolution product on \mathbb{C}\mathrm{B}(k) is likewise replaced with the equivalent multiplication (written simply as juxtaposition)

ab = \sum\limits_{(\rho,\sigma) \in \mathrm{B}(k)^2} \langle \rho,a\rangle \langle \sigma,b\rangle \rho\sigma =\sum\limits_{\pi \in \mathrm{B}(k)} \left(\sum\limits_{\sigma \in \mathrm{B}(k)} \langle \pi\sigma^{-1},a\rangle \langle \sigma,b \rangle \right)\pi.

In short, \mathbb{C}^{\mathrm{B}(k)} and \mathbb{C}\mathrm{B}(k) are isomorphic both as Hilbert spaces and as algebras, so the choice to use one rather than the other is a matter only of preference. In practice, it is quite useful to be comfortable with both the analytic and the algebraic way of doing things.
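
Both pictures are easy to realize concretely: under the isomorphism \mathrm{B}(k) \cong (\mathbb{Z}/2)^k, a permutation is encoded by the bitmask of generators it uses and the group law is XOR. A small sketch of the algebraic multiplication (the encoding is ours):

import numpy as np

k = 3
n = 2 ** k

def multiply(a, b):
    # product in C B(k): coefficient vectors indexed by bitmasks, group law = XOR
    c = np.zeros(n)
    for rho in range(n):
        for sigma in range(n):
            c[rho ^ sigma] += a[rho] * b[sigma]
    return c

# delta_rho delta_sigma = delta_{rho sigma}:
rho, sigma = 0b011, 0b110
print(np.argmax(multiply(np.eye(n)[rho], np.eye(n)[sigma])), rho ^ sigma)   # 5 5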

Yet another way of repackaging everything is the following probably too-general perspective. Let us zoom back out to the very beginning, where we started with an arbitrary finite graph \Gamma. The algebraic way to understand this combinatorial object is to replace the vertex set V of \Gamma with the algebra \mathbb{C}\langle x_v \colon v \in V \rangle of polynomials in noncommuting variables x_v indexed by the vertices of \Gamma. In other words, this is the free algebra of rank |V|. This algebra is graded by degree, and we have a decomposition

\mathbb{C}\langle x_v \colon v \in V \rangle = \bigoplus_{d=0}^\infty \mathbb{C}^{(d)}\langle x_v \colon v \in V \rangle,

with the dth summand being the space of homogeneous polynomials of degree d. Observe that this vector space is isomorphic to the space of complex-valued functions on V^d, i.e. the space of functions on d-tuples of vertices, or equivalently formal linear combinations of d-tuples of vertices. In particular, the linear component \mathbb{C}^{(1)}\langle x_v \colon v \in V\rangle is isomorphic to either/both \mathbb{C}^V and \mathbb{C}V, and the adjacency operator acts on this linear component to capture the edge structure of \Gamma. Indeed, the free algebra is isomorphic to the tensor algebra of the vector space of functions on the vertex set V. In the case where the vertex set V of \Gamma is actually a group \mathrm{G}, we can mod out by the group law and consider the quotient

\mathbb{C}\langle x_g \colon g \in \mathrm{G} \rangle/\langle x_gx_h=x_{gh}\rangle

of the free algebra by the ideal corresponding to the group multiplication. This quotient consists solely of polynomials of degree 1, so functions on the group, or equivalently linear combinations of group elements, but now with the structure of an algebra.

Coming back to reality, the fact that \mathbb{C}\mathrm{B}(k) is an algebra and not just a vector space is reflected in the presence of a certain class of operators on the group algebra which are not present in the endomorphism algebra of a vector space devoid of multiplication.

Definition 1: Given b \in \mathbb{C}\mathrm{B}(k), the corresponding multiplication operator R_b\in \mathrm{End} \mathbb{C}\mathrm{B}(k) is defined by

R_ba = ab, \quad a \in \mathbb{C}\mathrm{B}(k).

We are using the letter “R” for multiplication operators because later on we will consider the group algebras of noncommutative groups, in which left and right multiplication do not necessarily coincide, and we typically have different left and right multiplication operators L_b and R_b. To put this the Math 202B way, the operator R_b is the image of b in the right regular representation of \mathbb{C}\mathrm{B}(k).

Note that multiplication operators corresponding to pure group elements (as opposed to linear combinations of group elements) are unitary operators on \mathbb{C}\mathrm{B}(k),

\langle \rho,R_\pi\sigma \rangle = \langle \rho,\sigma\pi \rangle = \langle \rho\pi^{-1},\sigma \rangle = \langle R_{\pi}^{-1} \rho,\sigma \rangle.

However, since we are working with the permutation group \mathrm{B}(k), in which all permutations are actually involutions, R_\pi is both unitary and selfadjoint, hence has eigenvalues \pm 1.

The importance of multiplication operators in our quest to understand the eigenvalues and eigenvectors of the Cayley graph of \mathrm{B}(k) as generated by the transpositions \tau_1,\dots,\tau_k is the following.

Proposition 1: The adjacency operator of the Cayley graph of \mathrm{B}(k) is the multiplication operator R_t, where

t = \sum\limits_{i=1}^k \tau_i.

Proof: For any \rho \in \mathrm{B}(k), we have

R_t\rho = \sum\limits_{i=1}^k \rho\tau_i.

— Q.E.D.

In view of Proposition 1, the problem of understanding the eigenvalues and eigenvectors of the Cayley graph is that of understanding the eigenvalues and eigenvectors of

R_t = \sum\limits_{i=1}^k R_{\tau_i}.

To solve it, we will work out the spectral analysis of arbitrary multiplication operators on \mathbb{C}\mathrm{B}(k). The main tool in this endeavor is a second basis of the group algebra, which consists of special linear combinations \chi of group elements which have the property that

\langle \rho\sigma,\chi \rangle = \langle \rho,\chi \rangle \langle \sigma,\chi \rangle.

That is, \chi is the generating function of a group homomorphism

\mathrm{B}(k) \to \mathbb{C}^\times,

the multiplicative group of nonzero complex numbers. I like this way of stating things; in the analytic formalism, the words “character” and “homomorphism” mean exactly the same thing, a redundancy, whereas in the algebraic formalism “character” really means “generating function of a homomorphism.”

In Lecture 7, we explicitly worked out all homomorphisms from \mathrm{B}(k) to \mathbb{C}^\times, showing that they take values in the subgroup generated by -1, and are moreover parameterized by the points of \mathrm{B}(k). We also showed that the characters are orthogonal,

\langle \chi_\pi,\chi_\sigma \rangle = 2^k[\pi=\sigma], \quad \pi,\sigma \in \mathrm{B}(k).

In particular, for any a \in \mathbb{C}\mathrm{B}(k), we have the character expansion

a = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a\rangle \chi_\pi.

The character expansion of a is also known as its Fourier expansion, with \langle \chi_\pi,a\rangle being the Fourier coefficients of a. The main reason that character expansions are “better” than expansions in the group basis is that the Fourier coefficients of a product ab are just the products of the corresponding Fourier coefficients of a and b.

The main reason that characters — which are generating functions of homomorphisms \mathrm{B}(k) \to \{-1,+1\} — are better than the original group elements is that their multiplication is extremely simple.

Theorem 1: For any \rho,\sigma \in \mathrm{B}(k), we have

\chi_\rho\chi_\sigma =0

if \rho \neq \sigma, and

\chi_\pi \chi_\pi = 2^k\chi_\pi.

Proof: Expanding in the permutation basis, we have

\chi_\rho\chi_\sigma = \sum\limits_{\pi_1 \in \mathrm{B}(k)} \left( \sum\limits_{\pi_2 \in \mathrm{B}(k)} \chi_\rho(\pi_1\pi_2^{-1}) \chi_\sigma(\pi_2) \right) \pi_1 = \sum\limits_{\pi_1 \in \mathrm{B}(k)} \left( \sum\limits_{\pi_2 \in \mathrm{B}(k)} \chi_\rho(\pi_1) \chi_\rho(\pi_2) \chi_\sigma(\pi_2) \right) \pi_1,

where we used that \chi_\rho is a homomorphism and \pi_2^{-1}=\pi_2. The result now follows from character orthogonality, which is Theorem 2 in Lecture 7.

— Q.E.D.

An immediate consequence is that the Fourier coefficients of a product are the products of the Fourier coefficients of the factors.

Corollary 1: For any a,b \in \mathbb{C}\mathrm{B}(k), we have

\langle \chi_\pi,ab\rangle = \langle \chi_\pi,a \rangle \langle \chi_\pi,b\rangle.

Proof: On one hand, the character expansion of the product ab is

ab = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,ab\rangle \chi_\pi.

On the other hand, we compute the product of a and b via their character expansions:

ab = \left( 2^{-k}\sum\limits_{\rho \in \mathrm{B}(k)} \langle \chi_\rho,a\rangle \chi_\rho\right)\left( 2^{-k}\sum\limits_{\sigma \in \mathrm{B}(k)} \langle \chi_\sigma,b\rangle \chi_\sigma\right) \\= 4^{-k} \sum\limits_{(\rho,\sigma) \in \mathrm{B}(k)^2}\langle \chi_\rho,a\rangle\langle \chi_\sigma,b\rangle \chi_\rho\chi_\sigma \\ = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle\chi_\pi,a\rangle \langle \chi_\pi,b\rangle \chi_\pi,

where we applied Theorem 1 to get the final equality. Since the characters are linearly independent, the result follows.

— Q.E.D.
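
Concretely, in the bitmask picture of \mathrm{B}(k) \cong (\mathbb{Z}/2)^k the characters are \chi_\pi(\sigma) = (-1)^{|\pi \wedge \sigma|}, and Corollary 1 can be verified numerically (a self-contained sketch; the encoding is ours):

import numpy as np

k = 3
n = 2 ** k
chi = np.array([[(-1) ** bin(p & s).count("1") for s in range(n)]
                for p in range(n)])      # chi[p, s] = chi_p(s)

def multiply(a, b):
    # group algebra product, group law = XOR
    c = np.zeros(n)
    for rho in range(n):
        for sigma in range(n):
            c[rho ^ sigma] += a[rho] * b[sigma]
    return c

rng = np.random.default_rng(1)
a, b = rng.standard_normal(n), rng.standard_normal(n)
fa, fb, fab = chi @ a, chi @ b, chi @ multiply(a, b)
print(np.allclose(fab, fa * fb))         # True: <chi_pi, ab> = <chi_pi, a><chi_pi, b>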

In Lecture 9, we will apply the above to prove that M \in \mathrm{End} \mathbb{C}\mathrm{B}(k) is a multiplication operator if and only if it acts diagonally in the character basis. We will also find the corresponding eigenvalues, and hence count walks on \mathrm{B}(k). In Lecture 10 we will then analyze convergence of random walk on \mathrm{B}(k) to equilibrium.

Math 202C: Lecture 5

We continue with our study of random walks on finite graphs via linear algebra. Let \mathrm{G} be a graph on \{1,\dots,n\}, possibly with loops and/or multiple edges. As in Lecture 4, let P^r(i,j) denote the probability that simple random walk on \mathrm{G} starting from vertex i is located at vertex j at time r. In this Lecture, we will use the linear algebra setup from Lecture 4 to analyze the r \to \infty asymptotic behavior of the probability P^r(i,j), which corresponds to the “long time behavior” of simple random walk on \mathrm{G}.

The linear algebraic setup is as follows. Let \mathcal{H} be a Hilbert space with orthonormal basis \{\mathbf{e}_1,\dots,\mathbf{e}_n\} indexed by the vertices of \mathrm{G}. In the language of quantum mechanics, by replacing the vertex set of \mathrm{G} with the Hilbert space \mathcal{H} we allow ourselves to consider not just vertices, but “superpositions” of vertices. Recall from Lecture 4 that the transition operator T \in \mathrm{End}\mathcal{H} for simple random walk on \mathrm{G} is defined by its action on the vertex basis as

T\mathbf{e}_i = \sum\limits_{j=1}^n \frac{m_{ij}}{d_i} \mathbf{e}_j,

where m_{ij} is the number of edges of \mathrm{G} incident to both i and j, and d_i=\sum_{j=1}^n m_{ij} is the total number of edges incident to i. In Lecture 4, we showed that

P^r(i,j) = \langle T^r\mathbf{e}_i,\mathbf{e}_j \rangle,

so that understanding the r \to \infty asymptotics of the probability P^r(i,j) is equivalent to understanding the r \to \infty asymptotics of the scalar product \langle T^r\mathbf{e}_i,\mathbf{e}_j \rangle. We would like to accomplish this via spectral analysis of T.

We adopt the same assumptions as in Lecture 4: \mathrm{G} is a connected graph on \{1,\dots,n\} and n \geq 2, so that in particular every vertex has strictly positive degree. Under this assumption, we showed previously that, although T need not be selfadjoint, or even normal, the “symmetrized” transition operator D^{-1}TD is selfadjoint, with D \in \mathrm{GL}(\mathcal{H}) being the operator that acts diagonally in the vertex basis according to

D\mathbf{e}_j = \sqrt{d_j}\mathbf{e}_j.

We may thus invoke the spectral theorem to assert that

D^{-1}TD = \lambda_1P_1 + \dots + \lambda_mP_m,

where

\lambda_1 >\dots > \lambda_m,

are the distinct eigenvalues of the selfadjoint operator D^{-1}TD, with P_i \in \mathrm{End}\mathcal{H} the orthogonal projection of \mathcal{H} onto the \lambda_i-eigenspace. Because the P_i‘s are orthogonal projections, we have

D^{-1}T^rD = \lambda_1^rP_1 + \dots + \lambda_m^rP_m,

so that for the rth power of the transition operator we have the decomposition

T^r = \lambda_1^r DP_1D^{-1} + \dots + \lambda_m^r DP_mD^{-1}.
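
Numerically, this decomposition can be verified by symmetrizing T, eigendecomposing, and comparing the reconstruction against direct matrix powers. A sketch (the small test graph below, a triangle with a pendant vertex, is ours; columns of the matrix T encode T\mathbf{e}_i as in the definition above):

import numpy as np

M = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = M.sum(axis=1)
T = (M / deg[:, None]).T                 # column i encodes T e_i = sum_j (m_ij / d_i) e_j

D = np.diag(np.sqrt(deg))
Dinv = np.diag(1 / np.sqrt(deg))
S = Dinv @ T @ D                         # the symmetrized operator D^{-1} T D
w, U = np.linalg.eigh(S)                 # spectral theorem: S = U diag(w) U^T

r = 7
Tr = D @ U @ np.diag(w ** r) @ U.T @ Dinv
print(np.allclose(Tr, np.linalg.matrix_power(T, r)))   # True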

So far everything we have said is valid for an arbitrary operator in \mathrm{End}\mathcal{H} whose orbit under the conjugation action of \mathrm{GL}(\mathcal{H}) contains a selfadjoint operator. To go further we naturally have to reference some of the particular features of the transition operator T.

Theorem 1: The largest eigenvalue of T is \lambda_1=1.

Proof: Let \|\cdot\|_1 be the 1-norm on \mathcal{H} corresponding to the basis \mathbf{e}_1,\dots,\mathbf{e}_n, i.e.

\|\mathbf{v}\|_1 = \sum\limits_{j=1}^n |\langle \mathbf{v},\mathbf{e}_j \rangle|.

We claim that T has norm at most 1 in the corresponding operator norm on \mathrm{End}\mathcal{H}, i.e.

\|T\mathbf{v}\|_1 \leq \|\mathbf{v}\|_1 \quad \forall \mathbf{v} \in \mathcal{H}.

Indeed, we have

\|T\mathbf{v}\|_1 = \sum\limits_{j=1}^n |\langle T\mathbf{v},\mathbf{e}_j\rangle| = \sum\limits_{j=1}^n \left| \left\langle \sum\limits_{i=1}^n \langle \mathbf{e}_i,\mathbf{v}\rangle T\mathbf{e}_i,\mathbf{e}_j \right\rangle \right| \leq \sum\limits_{i=1}^n|\langle \mathbf{v},\mathbf{e}_i\rangle|\sum\limits_{j=1}^n |\langle T\mathbf{e}_i,\mathbf{e}_j\rangle|,

and the terminal double sum is exactly \|\mathbf{v}\|_1, since

\sum\limits_{j=1}^n \langle T\mathbf{e}_i,\mathbf{e}_j \rangle = \sum\limits_{j=1}^n \frac{m_{ij}}{d_i} = \frac{d_i}{d_i}=1.

It follows that any eigenvalue of T has modulus at most 1.

Now we exhibit an eigenvector of T with eigenvalue 1. Consider the vector

\mathbf{s} = \sum\limits_{i=1}^n \frac{d_i}{D}\mathbf{e}_i,

where

D=\sum\limits_{i=1}^n d_i

is the sum of the degrees of all vertices of \mathrm{G}, which is equal to twice the number of edges in \mathrm{G} less the number of loops. We then have

T\mathbf{s} = \sum\limits_{i=1}^n \frac{d_i}{D}T\mathbf{e}_i = \sum\limits_{i=1}^n \frac{d_i}{D}\sum\limits_{j=1}^n \frac{m_{ij}}{d_i} \mathbf{e}_j = \sum\limits_{j=1}^n \frac{d_j}{D}\mathbf{e}_j = \mathbf{s}.

— Q.E.D.

The eigenvector \mathbf{s} above is called the stationary vector for simple random walk on \mathrm{G}, and it encodes the probability measure on the vertices of \mathrm{G} which assigns mass \frac{d_i}{D} to vertex i. Note that, if \mathrm{G} is a simple graph with constant vertex degree equal to d, which is always the case for Cayley graphs corresponding to a symmetric generating set, then the stationary vector

\mathbf{s} = \sum\limits_{i=1}^n \frac{d}{dn}\mathbf{e}_i = \sum\limits_{i=1}^n \frac{1}{n}\mathbf{e}_i

corresponds to uniform probability measure on \{1,\dots,n\}.

It is important to note that Theorem 1 does not preclude the possibility that the smallest eigenvalue of the transition operator T could also have maximum modulus — it could well be that \lambda_m=-1. However, this can only be the case if \mathrm{G} is bipartite (perhaps I should fill in a proof of this later, but if I don’t you can find one in any text on spectral graph theory). Therefore, we have the following.

Corollary 1: If \mathrm{G} is not bipartite, then

\lim\limits_{r \to \infty} \|T^r-DP_1D^{-1}\| = 0,

where \|\cdot\| is any norm on \mathrm{End}\mathcal{H}.

Proof: Let \|\cdot\| be an arbitrary norm on \mathrm{End}\mathcal{H} (indeed, since \mathrm{End}\mathcal{H} is finite-dimensional, all norms are equivalent). Then for any r we have that

\|T^r-\lambda_1DP_1D^{-1}\| \leq \sum\limits_{i=2}^m |\lambda_i|^r \|DP_iD^{-1}\|,

by the triangle inequality. But since \lambda_1=1 and |\lambda_i|<1 for i > 1, the statement follows from the squeeze theorem.

— Q.E.D.

At this point we can finally say something about the probability P^r(i,j) = \langle T^r\mathbf{e}_i,\mathbf{e}_j \rangle that simple random walk on \mathrm{G} starting at i arrives at j at time r: for \mathrm{G} a connected non-bipartite graph on n \geq 2 vertices, the limit

t(i,j) = \lim\limits_{r \to \infty} P^r(i,j)

exists. Note that the role of bipartiteness here is quite clear from a combinatorial perspective: if \mathrm{G} is bipartite, then with probability one simple random walk started at i is located in the color class of i at even times, and in the opposite color class at odd times, so that the above limit obviously does not exist.

Continuing on under the assumption that \mathrm{G} is not bipartite, let us define a vector \mathbf{t}_i \in \mathcal{H} by

\mathbf{t}_i = \sum\limits_{j=1}^n t(i,j) \mathbf{e}_j,

which encodes the limit distribution of simple random walk on \mathrm{G} started at i.

Theorem 2: For any 1 \leq i \leq n, the limit vector \mathbf{t}_i is an eigenvector of T with eigenvalue 1.

Proof: We have

T\mathbf{t}_i = T\lim\limits_{r \to \infty} T^r \mathbf{e}_i = \lim\limits_{r \to \infty} T^{r+1} \mathbf{e}_i = \mathbf{t}_i.

— Q.E.D.

Theorem (Markov limit theorem): For \mathrm{G} a connected non-bipartite graph on \{1,\dots,n\}, n \geq 2, we have

\mathbf{t}_1=\mathbf{t}_2=\dots = \mathbf{t}_n = \mathbf{s}.

Proof: The statement follows if we can show that \lambda_1=1 is a simple eigenvalue of T, i.e. that the \lambda_1-eigenspace is a line. Indeed, all the vectors \mathbf{t}_i as well as \mathbf{s} have nonnegative coordinates summing to 1 relative to the vertex basis, so if any of these is a scalar multiple of another then the scalar in question must be 1.

The proof that the top eigenspace is indeed one-dimensional uses a bit of a deus ex machina, namely the Perron-Frobenius theorem. Look up this theorem, and complete the proof!

— Q.E.D.

Math 202C: Lecture 1

Since this is a course in applied algebra, let us begin with a real-world problem. Consider a deck of playing cards numbered 1,2,\dots,n arranged in a single row on a table. At each instant of discrete time, choose two cards independently at random and swap them. How long until you can be sure that the order of the cards is uniformly random?

As with every applied problem, in order to get started we have to think about how best to formulate the above as a precise mathematical question. This is what it means to “model” a real world problem.

First, each of the possible configurations of cards can be uniquely encoded as a permutation of the numbers 1,\dots,n — just write down the labels of the cards one at a time, from left to right, and you obtain a permutation of 1,\dots,n written in one-line notation. This is a good first step: we have identified the state space of the system we wish to understand, and it is the set \mathrm{S}(n) of permutations of 1,\dots,n.

Next, we need to make precise the way in which our system evolves in time, i.e. transitions from one state to another at each time instant. If the configuration of cards at time t is encoded by the permutation \pi \in \mathrm{S}(n), what are the possible states that our system can be in at time t+1? An obvious but in fact crucially important fact is that the answer to this question is not just combinatorial, but also algebraic: our configuration space \mathrm{S}(n) is not just a set, it is a group, the symmetric group of rank n. This means that the evolution equation we seek can be represented as multiplication in this group: given that the configuration of cards at time t is the permutation \pi the possible states at time t+1 are precisely the permutations

\sigma = \pi(x\ y), \quad 1 \leq x < y \leq n,

where (x\ y) denotes the transposition in \mathrm{S}(n) which interchanges the numbers x,y \in \{1,\dots,n\}. This is a good time to fix our notational conventions. Permutations in \mathrm{S}(n) are bijective functions \{1,\dots,n\} \to \{1,\dots,n\}, and multiplication of permutations means composition of functions. The concatenation of two permutations \pi_1,\pi_2 means the following:

\pi_1\pi_2 := \pi_2 \circ \pi_1.

Although this looks messy, the reason for taking this convention rather than the seemingly more natural one \pi_2\pi_1=\pi_2 \circ \pi_1 is that we are more interested in doing computations “upstairs” where permutations are elements in a group, as opposed to “downstairs” where they are functions acting on points, and it is nice to multiply strings of elements from left to right.

It is useful to take the above one step further, and think of the symmetric group \mathrm{S}(n) as not just an algebraic object, but a geometric one, in the manner of geometric group theory. To do this, first solve the following problem.

Problem 1: The set \mathrm{T} \subset \mathrm{S}(n) of all transpositions \tau=(x\ y),\ 1 \leq x < y \leq n generates \mathrm{S}(n). That is, for any permutation \pi \in \mathrm{S}(n), there exists a nonnegative integer k and transpositions \tau_1,\dots,\tau_k \in \mathrm{T} such that

\pi = \tau_1 \tau_2 \dots \tau_k.

With the solution to Problem 1 under our belts, we can define the word norm on \mathrm{S}(n) corresponding to the generating set \mathrm{T} by

\|\pi\|_T = \min \{k \in \mathbb{Z}_{\geq 0} \colon \pi=\tau_1\dots\tau_k,\ \tau_i\in \mathrm{T}\}.

Problem 2: Prove that \|\pi\sigma\|_{\mathrm{T}} \leq \|\pi\|_{\mathrm{T}} + \|\sigma\|_{\mathrm{T}}.

Once we have a norm on \mathrm{S}(n), meaning a way to measure the “size” of a permutation, it is very natural to define a corresponding notion of distance between two permutations by

\mathrm{d}_\mathrm{T}(\pi_1,\pi_2) = \|\pi_1^{-1}\pi_2\|_\mathrm{T}.

Problem 3: Check that the pair (\mathrm{S}(n),\mathrm{d}_\mathrm{T}) satisfies the metric space axioms.

To summarize, the state space \mathrm{S}(n) of the system we wish to understand is not just a set, but a group, and further not just a group, but a metric space. It is nice to have a pictorial representation of the metric space (\mathrm{S}(n),\mathrm{d}_T), and this is afforded by the corresponding Cayley graph \Gamma(\mathrm{S}(n),\mathrm{T}). The vertex set of \Gamma(\mathrm{S}(n),\mathrm{T}) is \mathrm{S}(n), and two vertices \pi,\sigma are adjacent if and only if there exists \tau \in \mathrm{T} such that \sigma = \pi\tau. Below is a picture of the Cayley graph of \mathrm{S}(4) as generated by \mathrm{T}=\{(1\ 2), (1\ 3), (2\ 3), (1\ 4),(2\ 4),(3\ 4)\}.

Cayley graph of S(4) generated by transpositions.

Let us note a few basic features of this graph. First, it is {n \choose 2}-regular, because |\mathrm{T}|={n \choose 2}. Second, it is a graded graph, meaning that its vertex set decomposes as a disjoint union of independent sets,

\mathrm{S}(n) = \bigsqcup_{k=0}^{n-1} L_k,

where L_k is the set of permutations whose disjoint cycle decomposition consists of n-k cycles. Third, from the geometric perspective, the kth level L_k of the Cayley graph is the sphere of radius k in \mathrm{S}(n) centered at the identity permutation,

L_k=\{ \pi \in \mathrm{S}(n) \colon \|\pi\|_T=k\}.
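
All three features can be confirmed by brute force for n = 4: build the Cayley graph, run breadth-first search from the identity, and check that the distance of \pi from the identity equals n minus the number of cycles of \pi. A Python sketch (helper names are ours):

from itertools import combinations, permutations
from collections import Counter, deque

n = 4

def cycles(p):
    # number of cycles in the disjoint cycle decomposition of p
    seen, c = set(), 0
    for i in range(n):
        if i not in seen:
            c += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return c

ident = tuple(range(n))
dist = {ident: 0}
queue = deque([ident])
while queue:                             # breadth-first search from the identity
    p = queue.popleft()
    for x, y in combinations(range(n), 2):
        q = list(p)
        q[x], q[y] = q[y], q[x]
        q = tuple(q)
        if q not in dist:
            dist[q] = dist[p] + 1
            queue.append(q)

assert all(dist[p] == n - cycles(p) for p in permutations(range(n)))
print(sorted(Counter(dist.values()).items()))   # level sizes [(0, 1), (1, 6), (2, 11), (3, 6)]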

Problem 4: Prove that \max \{\mathrm{d}_\mathrm{T}(\pi,\sigma) \colon \pi,\sigma \in \mathrm{S}(n)\}=n-1.

At this point we have quite a clear and detailed understanding of the system we want to analyze: the state space of the system is the Cayley graph of \mathrm{S}(n) as generated by the set \mathrm{T} of transpositions, and the evolution occurring is discrete time random walk on this graph. Now we have to make precise what exactly we mean by “random walk” in this sentence.

The most natural thing to consider is simple random walk: at time zero, we have a particle situated at the identity permutation on \Gamma(\mathrm{S}(n),\mathrm{T}), and at each instant of discrete time the particle jumps to a neighboring vertex, with equal probability of selecting any neighbor. We would like to understand how many jumps must be made before the probability to observe the particle at any vertex of the graph is the same as at any other vertex. This is what it means for a deck of cards to be shuffled: every configuration is equally likely, so that a person choosing a card from the deck cannot do anything more than guess what this card is going to be. Now, there is an immediate problem: this will never happen. The reason is annoying but unavoidable: every permutation is either even or odd, meaning that for any \pi \in \mathrm{S}(n), all walks from the identity to \pi have either an even or an odd number of steps. So, at even times, the particle must be situated in the alternating group \mathrm{A}(n), and at odd times in its complement. But actually, this foible reminds us that we are trying to solve a real world problem in which two cards are selected independently and then swapped — it is possible that the two cards selected are actually the same, and no swap occurs.

Problem 5: Prove that the same card, no-swap event occurs with probability \frac{2}{n+1}.

Thus, what we really want to understand is the lazy random walk on \mathrm{S}(n), where at each instant the particle either stays put with the above probability, or jumps to an adjacent vertex, with equal probability of jumping in any direction. Analyzing this model, Persi Diaconis and Mehrdad Shahshahani were able to prove the following landmark result in applied algebra.

Theorem (Diaconis-Shahshahani): A deck of cards is fully shuffled after \frac{1}{2}n \log n transpositions.

Although the Diaconis-Shahshahani result is ostensibly a theorem in probability theory, one may legitimately refer to it as a theorem of applied algebra: its proof uses everything about the representation theory of the symmetric groups that you learned in Math 202B. Moreover, as far as the author knows, there is no way to obtain this result without using representation theory, i.e. by purely probabilistic reasoning.

In this course, we will use simple stochastic processes like card shuffling as motivating questions whose solution leads to the development of harmonic analysis on finite groups. This is a fascinating application of algebra, and taking this route leads not just through the character theory of finite groups, but all the way to Gelfand pairs, spherical functions, and other related algebraic constructions. Even though you already know the representation theory of \mathrm{S}(n), so that in principle we could start with the Diaconis-Shahshahani theorem, we will instead begin with the analysis of random walks on much simpler groups, such as the discrete circle and hypercube. This motivates the development of character theory for finite abelian groups, where representations aren’t particularly meaningful, and sets the stage nicely for the harmonic analysis interpretation of the representation theory of general finite groups.

Math 202C: Lecture 0

As a student enrolled in Math 202C, you have necessarily completed Math 202B under the tutelage of Professor Brendon Rhoades, and as such are an expert on the representation theory of the symmetric group \mathrm{S}(d). You may however not realize that this theory is directly applicable to natural and compelling real world problems, some of which are sufficiently rich and interesting that they can be adopted as a posteriori motivations for the development of the subject.

In this course, we are going to take another look at the representation theory of the symmetric group through the lens of the following real world question: how many times do you have to shuffle a deck of cards in order to completely randomize it? We will try to answer this question completely and thoroughly, down to the last detail, and in doing so expand on your existing knowledge of the representation theory of the symmetric group, as well as place this theory into the context of the more general subject known as harmonic analysis on finite groups. This is a beautiful and very classical branch of algebra which enriches the representation theory of finite groups you learned in Math 202B with new perspectives gleaned from the theory of the Fourier transform in classical analysis. Representation theory and harmonic analysis can be completely united, and the resulting subject is the representation theory of topological groups, which we may briefly discuss towards the end of the course. However, we will primarily remain in the finite setting where everything is totally algebraic and easy to understand without any sophisticated machinery.

The mechanics of the course are as follows. I will provide you with a complete text on harmonic analysis on finite groups, in the form of blog posts which will appear here every Monday and Wednesday, in the approximate neighborhood of our regularly scheduled lecture time. Thus there will be two posts per week; reading through and understanding each of these will likely require somewhat more effort than absorbing a standard in-person fifty minute lecture. These two lecture posts form the asynchronous component of the course. That leaves our Friday lecture slot, which will be a two hour live stream session: the first hour will be a flipped classroom style experience, rather than a planned live lecture, in which we can collaboratively review, discuss, and expand upon the material from the lecture posts. The second hour will be my office hours, so open to any and all questions. In practice, it is likely that there won’t be much of a distinction between the first and second hour, but batching the lecture slot and office hours gives us a good chunk of time to work through things. This all happens on the Math 202C Discord server, which will be our main means of communication throughout the quarter: please sign up immediately using this link, https://discord.gg/SRKNEvNP. When signing up, please take your real name as your username.

The grading scheme for the course is as follows. The lecture posts will contain embedded problems for solution. You will be expected to solve these problems, typeset your solutions in LaTeX, and submit them for grading each week, prior to Sunday at midnight. The exact method for submission and return of your graded solutions will be decided this week, and announced on the Discord server. Furthermore, you will write a “guest lecture post” to be published on this blog on a topic related to the course material; I will prepare a list of possible topics, or you may choose your own topic and clear it with me. Your grade for the course will be an average of your graded problem solutions and your guest lecture post, and you may choose the weighting of this average however you wish, anywhere from 100% problem solutions to 100% guest lecture post. Thus, at the extremes, if you prefer to solve problems and have your solutions graded, you can skip the lecture post, and conversely if you are getting tired of solving problems and would rather throw yourself into a more long term project, you can put all your effort into writing a lecture post and forego submission of typed solutions to the problems.

At this point, I expect you have questions. Please sign up for the Discord server, and we can start talking and working out various logistics. The mathematical content of the course begins on Wednesday, with Lecture 1.

Math 262A: Lecture 18

The main point of this topics course (if there was one) is that there is a surprising and useful relationship between the asymptotics of integrals (a la Laplace) and the combinatorics of topological maps (Feynman diagrams). There are a few levels of sophistication to this relationship, which may be thought of as corresponding to zero-dimensional scalar, vector, and matrix-valued quantum field theories. We will try to summarize this relationship now from a high-level perspective, meaning that the details might be not quite right, but the basic idea is more or less sound.

Formally, for the scalar theory the connection is something like

\int\limits_\mathbb{R} e^{-N(\frac{x^2}{2} + \sum_{d=3}^\infty p_d\frac{x^d}{d})}\mathrm{d}x \sim \sum\limits_\Gamma \frac{1}{|\mathrm{Aut} \Gamma|} N^{v(\Gamma)-e(\Gamma)} p_\Gamma, \quad N \to \infty

where the sum is over maps in which every vertex has degree at least 3, the numbers v(\Gamma) and e(\Gamma) are the number of vertices and edges of \Gamma, respectively, and the “amplitude” p_\Gamma is the product of the coefficients p_d from the perturbation of the Gaussian density on the integral side according to the vertex degree sequence of \Gamma,

p_\Gamma = \prod\limits_{V \in \Gamma} p_{\mathrm{deg}V}.

Now, we have that

e(\Gamma)-v(\Gamma)=r(\Gamma)-c(\Gamma),

where r(\Gamma) is the circuit rank of \Gamma and c(\Gamma) is the number of its connected components, so that e(\Gamma)-v(\Gamma)=r(\Gamma)-1 for connected \Gamma. This means that we can organize the asymptotic expansion of the integral as a generating function for maps sorted by their circuit rank:

\int\limits_\mathbb{R} e^{-N(\frac{x^2}{2} + \sum\limits_{d=3}^\infty p_d\frac{x^d}{d})} \mathrm{d}x \sim \sum\limits_{k=0}^\infty \frac{1}{N^k}\sum\limits_{\substack{\Gamma \\ r(\Gamma)=k+1}} \frac{1}{|\mathrm{Aut}\Gamma|} p_\Gamma.

This is nice, and it allows us to compute graphically the coefficients in the asymptotic expansions of various integrals, for example the integral which figures in Stirling’s formula. Physicists call it the “loop expansion.” Importantly, it is actually true in the analytic (as opposed to just formal) sense, meaning that if you truncate the (divergent) expansion at order N^{-k} then the error is o(N^{-k}), because of the Laplace method, which shows that the integral in question really is concentrated in a neighborhood of the minimizer of the action in the exponent. If you take logs, meaning that you are only interested in the order of magnitude of the integral you started with, then the loop expansion only involves connected graphs, which reduces the complexity quite a bit. An unfortunate aspect of the loop expansion is that it misses out on the fact that the relevant diagrams are not just combinatorial objects — graphs — but actually graphs embedded in surfaces. This expansion does not see the surface, meaning that it does not encode its topology.

We saw in Lecture 17 that it is indeed possible to get topological information by changing the space over which we integrate from \mathbb{R} to its noncommutative counterpart \mathrm{H}(N), the Euclidean space of N \times N Hermitian matrices. Formally, the expansion from the scalar case is modified to

\int\limits_{\mathrm{H}(N)} e^{-\mathrm{Tr} N(\frac{x^2}{2} + \sum_{d=3}^\infty p_d\frac{x^d}{d})}\mathrm{d}x \sim \sum\limits_\Gamma \frac{1}{|\mathrm{Aut} \Gamma|} N^{v(\Gamma)-e(\Gamma)+f(\Gamma)} p_\Gamma, \quad N \to \infty,

where f(\Gamma) is the number of faces of the topological map \Gamma. Using the Euler formula,

v(\Gamma)-e(\Gamma)+f(\Gamma)=2-2g(\Gamma),

we can reorganize the formal N \to \infty expansion of the above matrix integral into not a loop expansion, but a genus expansion,

\int\limits_{\mathrm{H}(N)} e^{-\mathrm{Tr} N(\frac{x^2}{2} + \sum_{d=3}^\infty p_d\frac{x^d}{d})}\mathrm{d}x \sim \sum\limits_{k=0}^\infty N^{2-2k}\sum\limits_{\substack{\Gamma \\ g(\Gamma)=k}} \frac{1}{|\mathrm{Aut} \Gamma|}  p_\Gamma, \quad N \to \infty,

where now the inner sum is over maps with vertices of degree at least three of a fixed topological genus. This is even nicer than the loop expansion, because it sees the topology of maps (genus) as opposed to just their combinatorics (circuit rank), but it is much harder to prove that it is correct analytically, meaning that stopping the (divergent) expansion after k terms leaves an o(N^{-2k}) error. The basic problem is that the Laplace principle doesn’t work: the principle relies on the fact that the contribution to an integral over \mathbb{R} with integrand of the the form e^{-NS(x)} is dominated by a small neighborhood of the maximum of S(x), but if we swap out \mathbb{R} for \mathrm{H}(N), then a box in this N^2-dimensional Euclidean space has volume O(e^{N^2}), so that contributions to the integral from Lebesgue measure are on the same scale as contributions from the integrand itself.

This is not at all an easy problem to deal with, but I can at least give you an idea of how analysts have gotten past this obstruction. The starting point is a classical result due to Hermann Weyl: the subset of \mathrm{H}(N) consisting of Hermitian matrices with a given list of eigenvalues,

\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_N,

has Euclidean volume

C_N \prod\limits_{1 \leq i < j \leq N} (\lambda_i-\lambda_j)^2,

where C_N is a constant depending only on N. Now, each such isospectral set is, by the Spectral Theorem, an orbit of the unitary group \mathrm{U}(N) acting on \mathrm{H}(N) by conjugation, i.e. a set of the form

\mathcal{O}_\lambda = \{U\mathrm{diag}(\lambda_1,\dots,\lambda_N)U^{-1} \colon U \in \mathrm{U}(N)\}.

Incidentally, the orbits \mathcal{O}_\lambda are symplectic manifolds, but we don’t need to pay attention to this here. The point that is relevant for us is the Weyl integration formula, which is just the change of variables formula resulting from Weyl’s volume computation: if f(X) is a function on \mathrm{H}(N) which is invariant under the conjugation action of \mathrm{U}(N), then

\int\limits_{\mathrm{H}(N)} f(X) \mathrm{d}X = c_N\int\limits_{\mathbb{W}^N} f(\lambda_1,\dots,\lambda_N) \prod_{i<j} (\lambda_i-\lambda_j)^2 \mathrm{d}\lambda,

where the integration is with respect to Lebesgue measure on the Weyl chamber \mathbb{W}^N \subset \mathbb{R}^N, i.e. the convex set

\mathbb{W}^N = \{(\lambda_1,\dots,\lambda_N) \in \mathbb{R}^N \colon \lambda_1 > \dots > \lambda_N\},

and f(\lambda_1,\dots,\lambda_N) is, by abuse of notation, the value of the original function f on any Hermitian matrix with eigenvalues \lambda_1,\dots,\lambda_N. In particular, a matrix integral of the Laplace form

\int\limits_{\mathrm{H}(N)} e^{-N\mathrm{Tr}S(X)} \mathrm{d}X,

can be rewritten via the Weyl integration formula as

c_N\int\limits_{\mathbb{W}^N} e^{-N\sum_{i=1}^N S(\lambda_i)} \prod_{i<j} (\lambda_i-\lambda_j)^2\mathrm{d}\lambda,

and the remaining integral over eigenvalues can yet again be rewritten in the more compelling form

c_N\int\limits_{\mathbb{W}^N} e^{-N^2(\frac{1}{N} \sum_{i=1}^N S(\lambda_i)-\frac{2}{N^2} \sum_{i<j} \log(\lambda_i-\lambda_j))} \mathrm{d}\lambda.
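Before unpacking this formula, here is a quick Monte Carlo illustration of the Vandermonde-squared repulsion (my own sketch, assuming Python with numpy; not part of the original lecture): for 2 \times 2 Hermitian matrices with Gaussian entries, the factor (\lambda_1-\lambda_2)^2 forces the density of the eigenvalue gap to vanish quadratically at zero, so the probability of a gap smaller than \varepsilon scales like \varepsilon^3.

```python
# Eigenvalue repulsion for 2x2 Gaussian Hermitian matrices: the joint
# eigenvalue density carries the factor (lambda_1 - lambda_2)^2, so
# P(gap < eps) ~ C * eps^3 as eps -> 0.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
a = rng.normal(size=n)                         # top-left entry
d = rng.normal(size=n)                         # bottom-right entry
b = rng.normal(scale=np.sqrt(0.5), size=n) \
    + 1j * rng.normal(scale=np.sqrt(0.5), size=n)   # off-diagonal entry
# For X = [[a, b], [conj(b), d]] the eigenvalue gap is sqrt((a-d)^2 + 4|b|^2).
gap = np.sqrt((a - d) ** 2 + 4 * np.abs(b) ** 2)

for eps in [0.8, 0.4, 0.2]:
    print(eps, np.mean(gap < eps) / eps**3)    # roughly constant for small eps
```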

There are a few things to say here. First and foremost, the point of all of this is that we have reduced from integrating over the huge space \mathrm{H}(N) of dimension N^2 to integrating over the N-dimensional set \mathbb{W}^N, which is a massive reduction in degrees of freedom, and in particular contributions from the Lebesgue measure are now of order e^N. Second, the integrand is of order e^{-N^2}, so it seems like the Laplace principle may yet be correct in this context: the main contribution to the integral comes from a small neighborhood of the minimizer of the function

\mathcal{S}(\lambda_1,\dots,\lambda_N) = \frac{1}{N} \sum_{i=1}^N S(\lambda_i)-\frac{2}{N^2} \sum_{i<j} \log(\lambda_i-\lambda_j).

Third, we have actually met this action before, back when we looked at the asymptotic enumeration of permutations with restricted decreasing subsequence length: it represents the potential energy of a system of N identical point charges on a wire with logarithmic repulsion (which is indeed the electrostatic repulsion in two dimensions), but confined by the potential well S(\lambda), which was S(\lambda)=\frac{1}{2}\lambda^2 last time we met it, but now can be anything. As N\to \infty, the distribution of these charges will crystallize around the configuration which minimizes the action \mathcal{S}, which in particular means that the empirical distribution \mu_N of these N point charges — i.e. the probability measure on \mathbb{R} which places mass 1/N at each particle — will converge as N \to \infty to a continuous probability measure \mu_\mathcal{S} on \mathbb{R} called the “equilibrium measure” corresponding to S. More precisely, in the language of calculus of variations, we have

\lim_{N\to \infty} \frac{1}{N^2} \log \int\limits_{\mathrm{H}(N)} e^{-N\mathrm{Tr}S(X)} \mathrm{d}X = - \inf\limits_{\mu \in \mathcal{P}(\mathbb{R})} \mathcal{S}(\mu),

where

\mathcal{S}(\mu) = \int\limits_\mathbb{R} S(x)\mu(\mathrm{d}x) - \int\limits_\mathbb{R}\int\limits_\mathbb{R} \log |x-y|\mu(\mathrm{d}x)\mu(\mathrm{d}y).

It is a theorem that for nice enough S, the equilibrium measure always exists and is unique. This is the Laplace principle for matrix integrals. Although it is technically very involved, a fine analysis of the convergence to equilibrium measure shows that under reasonable hypotheses the formal genus expansion above is indeed correct, analytically, as an asymptotic expansion. This means that the Feynman diagrams for matrix integrals really are maps on surfaces. It is interesting that this connection is in fact more typically used to say something about maps, not about integrals: there are independent methods for computing the coefficients in the asymptotic expansion of Hermitian matrix integrals based on orthogonal polynomials, and once these coefficients are found in some independent fashion one has also determined the generating function for maps of a given genus. For example, taking the potential S(x) = \frac{x^2}{2} + p_4\frac{x^4}{4} and calculating the corresponding equilibrium measure, one can obtain an explicit formula for the number of ways to glue a sphere from a given number of squares. A good entry point into all of this is here.
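For the Gaussian potential S(\lambda) = \lambda^2/2 the equilibrium measure is the Wigner semicircle on [-2,2], whose even moments are the Catalan numbers 1, 1, 2, 5, \dots. Here is a minimal simulation sketch (my own, assuming Python with numpy, not from the lecture) checking the first two even moments against a sampled matrix with density proportional to e^{-N\mathrm{Tr}X^2/2}:

```python
# For S(x) = x^2/2 the equilibrium measure is the semicircle on [-2, 2],
# with even moments m_2 = 1 and m_4 = 2 (Catalan numbers). Sample one large
# matrix with density proportional to e^{-N Tr X^2 / 2} and compare.
import numpy as np

rng = np.random.default_rng(1)
N = 1000
A = rng.normal(scale=np.sqrt(0.5 / N), size=(N, N)) \
    + 1j * rng.normal(scale=np.sqrt(0.5 / N), size=(N, N))
X = (A + A.conj().T) / np.sqrt(2)   # Hermitian, E|X_ij|^2 = 1/N
lam = np.linalg.eigvalsh(X)

print(np.mean(lam**2))   # close to 1
print(np.mean(lam**4))   # close to 2
```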

Math 262A: Lecture 15

The N\to \infty Feynman diagram expansion

\int\limits_\mathbb{R} e^{-N S(x)}\frac{\mathrm{d}x}{\sqrt{2\pi N^{-1}}}\sim \sum\limits_\Gamma \frac{\mathrm{Amp}_S\ \Gamma}{|\mathrm{Aut}\ \Gamma|} N^{V(\Gamma)-E(\Gamma)}

is both fascinating and frustrating: while it is fascinating that there exists a close connection between the asymptotics of integrals and the combinatorics of maps, it is frustrating that this connection is purely combinatorial, and does not “see” the topology of the surface on which the diagram \Gamma is drawn. Wouldn’t it be nice if the exponent of N in the expansion had an extra term F(\Gamma), so that it became the left-hand side of Euler’s formula

V(\Gamma)-E(\Gamma)+F(\Gamma) = 2-2g(\Gamma),

relating the alternating sum of vertices, edges, and faces of a graph to the genus of the surface on which it is drawn? Could there be some way to tinker with the integral \int e^{-NS} to bring this about?

Note first that it is impossible to get the missing F(\Gamma) term by somehow choosing the right action S in the above integral, because the diagrammatic expansion is essentially universal: the only part of it which depends on S is the “amplitude” \mathrm{Amp}_S\ \Gamma, which is a function of \Gamma determined by the Taylor expansion of S. In order to make the jump from the combinatorics of graphs to the topology of maps, we have to alter not the integrand, but the domain of integration itself.

Perhaps the first such alteration one would try is to introduce extra dimensions: instead of integrating over \mathbb{R}, we would integrate over \mathbb{R}^d for d > 1. It turns out that there is a Laplace method and a corresponding Feynman diagram expansion for the multidimensional integral

\int\limits_{\mathbb{R}^d} e^{-N S(x)}\mathrm{d}x,

which you can read about for example in these notes of Pavel Etingof. I was hoping to cover the vector Laplace principle in this course, but unfortunately was not able to do so. The upshot, however, is that both the principle and the corresponding diagrammatic expansion are essentially the same as in the scalar case, and in particular moving from scalars to vectors does not allow one to make contact with the topology of maps by counting faces.

It turns out that what must be done is not to replace numbers with vectors, but rather with their quantum qounterparts (sorry, couldn’t resist): operators. Consider a Hilbert space \mathrm{V}, i.e. a complex vector space equipped with a scalar product \langle \cdot,\cdot \rangle, which is required to be complete with respect to the norm induced by the scalar product. We can consider the operator algebra \mathrm{End} \mathrm{V} as a noncommutative generalization of \mathbb{C}: it actually is \mathbb{C} if \mathrm{dim} \mathrm{V}=1, and in any dimension we can add and multiply operators in a manner analogous to the addition and multiplication of numbers. What subsets of \mathrm{End} \mathrm{V} would be the counterparts of the real line \mathbb{R} \subset \mathbb{C} and the unit circle \mathbb{T} \subset \mathbb{C}? Answer: the Hermitian operators

\mathrm{H}(\mathrm{V}) = \{ X \in \mathrm{End}\mathrm{V} \colon X^*=X\}

are like real numbers, in that they are self-conjugate and consequently have real spectrum, while the unitary operators

\mathrm{U}(\mathrm{V}) = \{U \in \mathrm{End}\mathrm{V} \colon U^*=U^{-1}\}

are like complex numbers of modulus one, in that conjugation is the same as inversion and consequently the spectrum lies in \mathbb{T}. Thus, a noncommutative analogue of the Laplace method would mean a statement on the N \to \infty asymptotics of the integral

\int\limits_{\mathrm{H}(\mathrm{V})} e^{-NS(X)} \mathrm{d}X,

where S is a scalar-valued function on \mathrm{H}(\mathrm{V}). Similarly, a noncommutative analogue of the imaginary version of the Laplace method, which is called the method of stationary phase and deals with integrals over the circle \mathbb{T} rather than the line \mathbb{R}, would be a statement about the N \to \infty asymptotics of

\int\limits_{\mathrm{U}(\mathrm{V})} e^{-NS(U)} \mathrm{d}U,

where S is a scalar-valued function on unitary operators. Here there is an immediate issue: these are ill-defined functional integrals as soon as \mathrm{V} is infinite-dimensional, because of the lack of Lebesgue (or Haar) measure in infinite dimensions. So in order to rigorously study the above integrals, we have to restrict to the case that \mathrm{V} is of finite dimension M<\infty, in which case we can identify \mathrm{H}(\mathrm{V}) with the set \mathrm{H}(M) of M \times M Hermitian matrices, and \mathrm{U}(\mathrm{V}) with the set of M \times M unitary matrices. The former is a real vector space of dimension M^2, which is isomorphic to the Euclidean space \mathbb{R}^{M^2} when given the scalar product \langle X,Y \rangle = \mathrm{Tr} XY, where \mathrm{Tr} is the usual matrix trace, and the latter is a compact real Lie group of dimension M^2. So, we are led to study the N\to \infty asymptotics of the matrix integrals

\int\limits_{\mathrm{H}(M)} e^{-NS(X)} \mathrm{d}X

and

\int\limits_{\mathrm{U}(M)} e^{-NS(U)} \mathrm{d}U,

which are perfectly well-defined objects. Our goal in the remainder of this course is to develop versions of the Laplace principle and the method of stationary phase for these objects, and see that their diagrammatic expansions actually do connect up with the topology of maps.
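To make the identification of \mathrm{H}(M) with \mathbb{R}^{M^2} concrete, here is a small sketch (my own illustration, assuming Python with numpy) which builds the standard orthonormal basis of \mathrm{H}(2) with respect to \langle X,Y \rangle = \mathrm{Tr} XY and checks its Gram matrix:

```python
# H(M) as a real Euclidean space of dimension M^2: for M = 2, build the
# standard orthonormal basis w.r.t. <X, Y> = Tr(XY) and check its Gram matrix.
import numpy as np

M = 2
basis = []
for i in range(M):
    E = np.zeros((M, M), dtype=complex)
    E[i, i] = 1.0
    basis.append(E)                      # diagonal matrix units
for i in range(M):
    for j in range(i + 1, M):
        S = np.zeros((M, M), dtype=complex)
        S[i, j] = S[j, i] = 1 / np.sqrt(2)
        basis.append(S)                  # real symmetric off-diagonal part
        T = np.zeros((M, M), dtype=complex)
        T[i, j] = -1j / np.sqrt(2)
        T[j, i] = 1j / np.sqrt(2)
        basis.append(T)                  # imaginary antisymmetric part

print(len(basis))                        # M^2 = 4 real dimensions
gram = np.array([[np.trace(X @ Y).real for Y in basis] for X in basis])
print(np.allclose(gram, np.eye(M**2)))   # True: orthonormal under Tr(XY)
```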

The first step in this direction is to identify what the counterpart of the scalar integral

\int\limits_\mathbb{R} x^d e^{-N\frac{x^2}{2}} \mathrm{d}x = \sqrt{\frac{2\pi}{N}} |\mathrm{Inv}(d)| N^{-\frac{d}{2}},

which computes the moments of the Gaussian measure on \mathbb{R}, becomes when we replace the real line with Hermitian matrices. The answer to this was found by physicists long ago, and was rediscovered by mathematicians somewhat later in this fundamental paper of Harer and Zagier.
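The scalar moment formula above is easy to check symbolically; the following sketch (mine, assuming Python with sympy, not from the lecture) compares the integral against the double factorial count |\mathrm{Inv}(d)| = (d-1)!! of fixed point free involutions, which vanishes for odd d:

```python
# Check: \int x^d e^{-N x^2/2} dx = sqrt(2*pi/N) * |Inv(d)| * N^{-d/2},
# where |Inv(d)| = (d-1)!! counts fixed point free involutions of
# {1, ..., d} (perfect matchings), and is 0 when d is odd.
import sympy as sp

x = sp.symbols('x', real=True)
N = sp.symbols('N', positive=True)

def inv_count(d):
    return 0 if d % 2 else sp.factorial2(d - 1)

for d in [2, 3, 4, 6]:
    lhs = sp.integrate(x**d * sp.exp(-N * x**2 / 2), (x, -sp.oo, sp.oo))
    rhs = sp.sqrt(2 * sp.pi / N) * inv_count(d) * N**sp.Rational(-d, 2)
    print(d, sp.simplify(lhs - rhs) == 0)   # True for each d
```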

Theorem 1: For any d \in \mathbb{N}, we have

\int\limits_{\mathrm{H}(N)} \mathrm{Tr} X^d e^{-\frac{N}{2}\mathrm{Tr}X^2}\mathrm{d}X = \mathcal{Z}_N \sum\limits_{g=0}^\infty \frac{\varepsilon_g(d)}{N^{2g}},

where \mathcal{Z}_N is an explicit number depending only on N, and \varepsilon_g(d) is the number of ways in which the edges of a d-gon can be identified in pairs so as to produce a compact orientable surface of genus g. On the right hand side, if we set N=1, we obtain

\sum\limits_{g=0}^\infty \varepsilon_g(d) = |\mathrm{Inv}(d)|,

since if we forget the stratification by genus encoded by N^{-2g}, the (finite) sum is simply counting ways to glue the edges of a d-gon in pairs. Note however that setting N=1 on the left hand side of Theorem 1 does *not* produce

\int\limits_\mathbb{R} e^{-N\frac{x^2}{2}} \mathrm{d}x,

but rather

\int\limits_\mathbb{R} e^{-\frac{x^2}{2}} \mathrm{d}x.

In particular, the fact that we are integrating over \mathrm{H}(N) rather than \mathrm{H}(M) in Theorem 1 is not a typo: to get the connection with surface topology we need M=N. This simple fact causes massive problems on the analytic side, effectively invalidating the fundamental notion underlying Laplace’s method, which is that the main contribution to an integral containing a large parameter comes from a small neighborhood of a single point in the domain of integration. We will discuss both the combinatorics and the analysis of this “quantum Laplace method” next week.
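Theorem 1 can be verified for small d by brute force (a sketch of mine, in Python): enumerate all gluings of the edges of a d-gon, read off the number of vertices of the glued surface as the number of cycles of \gamma\tau, with \gamma the boundary cycle of the polygon and \tau the gluing involution, and compute the genus from Euler's formula with d/2 edges and one face.

```python
# Brute-force computation of eps_g(d): glue the edges of a d-gon in pairs
# and compute the genus of the resulting surface from Euler's formula.
from collections import Counter

def pairings(points):
    # all ways to partition a tuple of points into unordered pairs
    if not points:
        yield []
        return
    a, rest = points[0], points[1:]
    for i, b in enumerate(rest):
        for rem in pairings(rest[:i] + rest[i + 1:]):
            yield [(a, b)] + rem

def eps(d):
    gamma = {i: (i + 1) % d for i in range(d)}   # boundary cycle of the d-gon
    counts = Counter()
    for pairing in pairings(tuple(range(d))):
        tau = {}
        for a, b in pairing:
            tau[a], tau[b] = b, a
        prod = {i: gamma[tau[i]] for i in range(d)}   # gamma o tau
        seen, v = set(), 0
        for i in range(d):                 # vertices of the glued surface
            if i not in seen:              # = cycles of gamma o tau
                v += 1
                while i not in seen:
                    seen.add(i)
                    i = prod[i]
        chi = v - d // 2 + 1               # vertices - edges + one face
        counts[(2 - chi) // 2] += 1        # chi = 2 - 2g
    return dict(counts)

print(eps(2))   # {0: 1}
print(eps(4))   # {0: 2, 1: 1}, summing to 3!! = 3
print(eps(6))   # {0: 5, 1: 10}, summing to 5!! = 15
```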

Math 262A: Lecture 13

Continuing with Lecture 12, we are analyzing the Taylor series

T_I(\hbar) = \sum\limits_{k=0}^\infty a_k\hbar^k

of the smooth function

I(\hbar) = \frac{1}{\sqrt{2\pi\hbar}}\int\limits_{\mathbb{R}} e^{-\frac{1}{\hbar}S(x)} \mathrm{d}x,

where S is a smooth function on the line such that the above integral converges, and whose Taylor series is of the form

T_S(x) = \frac{x^2}{2} + \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}.

I should say here that I’m trying out a new notation: if F is a function of the real variable y which is smooth in a neighborhood of y=0, then I write

T_F(y) = \sum\limits_{k=0}^\infty F^{(k)}(0)\frac{y^k}{k!},

this being a formal power series in a single indeterminate which by abuse of notation I also call y. So it does not make sense to write

F(y) = T_F(y),

since the LHS is a number and the RHS is a formal power series. However, Taylor’s theorem with remainder tells you that if you take a finite number of terms of the formal power series T_F and trade the formal variable y for a real number y, you get a good approximation to the real number F(y), provided y is sufficiently close to zero.
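As a tiny worked instance of this distinction (my own, assuming Python with sympy): the truncated Taylor polynomial of F(y) = \log(1+y), evaluated at a small real number, approximates the function value, even though the full series is a purely formal object.

```python
# T_F as a formal series vs. F as a function: truncating T_F and substituting
# a small real number gives a good approximation (Taylor's theorem).
import sympy as sp

y = sp.symbols('y')
F = sp.log(1 + y)
T_F_truncated = sp.series(F, y, 0, 6).removeO()   # y - y**2/2 + ... + y**5/5

val = sp.Rational(1, 10)
print(sp.N(F.subs(y, val)))              # 0.0953101...
print(sp.N(T_F_truncated.subs(y, val)))  # agrees to about 1e-7
```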

Back to the problem at hand, what we are trying to do is to express the formal power series T_I in terms of the formal power series T_S. What we currently know is that

T_I(\hbar) = 1 + \sum\limits_{d=1}^\infty \frac{1}{d!}\sum\limits_{\alpha \vdash d} t_d|C_\alpha|p_\alpha\hbar^{\frac{d}{2}-\ell(\alpha)},

where the internal (finite) sum is over Young diagrams \alpha with d cells, with t_d the number of fixed point free involutions in \mathrm{S}(d) and C_\alpha the conjugacy class in \mathrm{S}(d) of permutations of cycle type \alpha, and

p_\alpha=\prod\limits_{i=1}^{\ell(\alpha)} p_{\alpha_i}

subject to p_1=p_2=0. So, in completely algebraic language, one could say that we are studying the “Feynman transform”

\mathcal{F} \colon x^3\mathbb{R}[p_1,p_2,p_3,\dots][[x]] \to \mathbb{R}[p_1,p_2,p_3,\dots][[\hbar]]

defined by

\mathcal{F}\left( \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}\right):=\sum\limits_{d=1}^\infty \frac{1}{d!} \sum\limits_{\alpha \vdash d} t_d|C_\alpha|p_\alpha \hbar^{\frac{d}{2}-\ell(\alpha)}.

The domain of this map is the principal ideal generated by x^3 in the algebra of formal power series in x with coefficients being polynomials in the p_1,p_2,p_3,\dots, and the codomain is formal power series in \hbar with coefficients in the same polynomial ring. The problem then is to simplify the output of the Feynman map as a power series in \hbar, i.e. to determine the coefficient of \hbar^k for each k \in \mathbb{N}.

What we want to do in this lecture is give a diagrammatic solution to the above algebraic problem. Note that the solution to this problem has analytic value via Laplace’s method, but the solution itself does not involve any analysis and is purely combinatorial.
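Before doing so, it may help to see the algebraic problem solved by brute force at low order (my own sketch, assuming Python with sympy): evaluating the defining sum, only d=4 with \alpha=(4) and d=6 with \alpha=(3,3) contribute at order \hbar^1, giving the coefficient \frac{3}{4}p_4 + \frac{5}{6}p_3^2, which the diagrammatic solution below must, and does, reproduce.

```python
# Evaluate the Feynman transform directly from its definition, using
# t_d = (d-1)!! (zero for odd d) and |C_alpha| = d!/z_alpha with
# z_alpha = prod_i i^{m_i} * m_i!, m_i the multiplicity of the part i.
import sympy as sp
from sympy.utilities.iterables import partitions

hbar = sp.symbols('hbar')
p = sp.IndexedBase('p')

total = sp.Integer(1)
for d in range(4, 13, 2):              # even d up to 12: enough for hbar^2,
    t_d = sp.factorial2(d - 1)         # since the exponent d/2 - l is >= d/6
    for mult in partitions(d):         # partition alpha as {part: multiplicity}
        if min(mult) < 3:
            continue                   # p_1 = p_2 = 0 kills these terms
        z = sp.prod([sp.Integer(i)**m * sp.factorial(m) for i, m in mult.items()])
        ell = sum(mult.values())       # number of parts of alpha
        p_alpha = sp.prod([p[i]**m for i, m in mult.items()])
        # (1/d!) * t_d * |C_alpha| = t_d / z_alpha
        total += (t_d / z) * p_alpha * hbar**(sp.Rational(d, 2) - ell)

print(sp.expand(total).coeff(hbar, 1))   # 3*p[4]/4 + 5*p[3]**2/6
```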

The first step is to think of the number |C_\alpha|t_d as the cardinality of C_\alpha \times \mathrm{Inv}(d), where again C_\alpha is the conjugacy class of permutations in the symmetric group \mathrm{S}(d), and \mathrm{Inv}(d) is the set of fixed point free involutions in \mathrm{S}(d). Note that \mathrm{Inv}(d) is nonempty if and only if d is even, in which case

\mathrm{Inv}(d) = C_{(2^{\frac{d}{2}})},

where (2^{\frac{d}{2}}) is the Young diagram with \frac{d}{2} rows of length 2. Every pair (\sigma,\tau) \in C_\alpha \times \mathrm{Inv}(d) is a combinatorial map. Associated to this map are two topological maps, which are dual to one another. To construct these topological maps, we proceed as follows. First, for each cycle of \sigma we draw a polygon whose number of sides is equal to the length of that cycle, and whose edges are labeled counterclockwise in the cyclic order prescribed by \sigma. At the same time, we place into this polygon the corresponding dual object, namely a vertex in the interior of the polygon together with line segments, called “half edges” or “darts,” which emanate from the vertex, one for each edge of the polygon and labeled in the same way. Note that we need only consider the case where d is even, since t_d=0 otherwise, and moreover where all rows of \alpha have length at least 3, since p_1=p_2=0, meaning that all polygons in this construction have at least three sides, or dually that all stars have degree at least 3. Now we glue the sides of the polygons together according to the pairing permutation \tau, and at the same time pair half edges of stars in the same manner, i.e. using \tau. This produces a pair of dual topological maps \Gamma,\Gamma' on a compact oriented surface S. Our convention is that we view \Gamma as the topological map obtained by glueing stars, and \Gamma' as the topological map obtained by glueing polygons. The figure below illustrates this construction for the combinatorial map

\left( (1\ 2\ 3)(4\ 5\ 6), (1\ 2)(3\ 4)(5\ 6) \right).

Glueing a pair of triangles/3-stars to get a pair of maps.
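For this example the computation can be done mechanically (a sketch of mine, in Python; taking faces to correspond to cycles of \sigma\tau is one standard orientation convention): the map has 2 vertices, 3 edges and 3 faces, hence genus zero. Its underlying graph is the dumbbell, a loop at each of the two vertices joined by an edge, drawn on the sphere.

```python
# Euler characteristic of the map glued from the combinatorial map
# (sigma, tau) = ((1 2 3)(4 5 6), (1 2)(3 4)(5 6)), written 0-based below.
# Vertices of Gamma = cycles of sigma, edges = d/2, and faces = cycles of
# sigma o tau (one standard orientation convention).

def cycles(perm):
    seen, count = set(), 0
    for i in perm:
        if i not in seen:
            count += 1
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return count

sigma = {0: 1, 1: 2, 2: 0, 3: 4, 4: 5, 5: 3}   # (0 1 2)(3 4 5)
tau = {0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4}     # (0 1)(2 3)(4 5)
sigma_tau = {i: sigma[tau[i]] for i in range(6)}

v, e, f = cycles(sigma), 6 // 2, cycles(sigma_tau)
g = (2 - (v - e + f)) // 2
print(v, e, f, g)   # 2 3 3 0: a genus zero map (the dumbbell on the sphere)
```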

Now let us consider the Feynman transform

\mathcal{F}\left( \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}\right):=\sum\limits_{d=1}^\infty \frac{1}{d!} \sum\limits_{\alpha \vdash d} t_d|C_\alpha|p_\alpha \hbar^{\frac{d}{2}-\ell(\alpha)}

from the point of view of topological rather than combinatorial maps. First, the powers of \hbar occurring in the transformed power series have a clear interpretation: we have

\hbar^{\frac{d}{2}-\ell(\alpha)} = \hbar^{e(\Gamma)-v(\Gamma)},

where v(\Gamma),e(\Gamma) are the number of vertices and edges in \Gamma. Observe that the difference e(\Gamma)-v(\Gamma) depends only on the underlying graph, not on its embedding, so that this is a purely combinatorial parameter. Indeed, it is closely related to the so-called circuit rank of a graph, which is the number of independent cycles in the graph, or equivalently the minimal number of edges which must be deleted in order to obtain an acyclic graph, i.e. a forest. The precise relationship between these parameters is

e(\Gamma)-v(\Gamma)=r(\Gamma)-c(\Gamma)

with r(\Gamma) the circuit rank and c(\Gamma) the number of connected components. This is pretty obvious if you think about it: consider the extremal case of a single connected component which is a tree. In this case e(\Gamma)-v(\Gamma)=-1 and r(\Gamma)-c(\Gamma)=0-1=-1, so the two sides agree, and if we start adding edges then both e(\Gamma)-v(\Gamma) and the circuit rank go up by one every time we add an edge; moreover, both sides are additive in the number of components (graph theory people – did I get that right?).
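Here is a tiny sanity check of this identity (my own sketch, in plain Python), with self loops and multiple edges allowed, since those occur in our gluing construction:

```python
# Check r(G) = e(G) - v(G) + c(G) on multigraphs given as a vertex set and
# an edge list; loops and parallel edges are allowed.

def circuit_rank(vertices, edges):
    parent = {v: v for v in vertices}
    def find(x):                       # union-find to count components
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in edges:
        parent[find(a)] = find(b)
    c = len({find(v) for v in vertices})
    return len(edges) - len(vertices) + c

print(circuit_rank({1, 2, 3}, [(1, 2), (2, 3)]))        # path (a tree): 0
print(circuit_rank({1}, [(1, 1), (1, 1)]))              # figure-eight: 2
print(circuit_rank({1, 2}, [(1, 2)] * 3))               # theta graph: 2
print(circuit_rank({1, 2, 3, 4},
                   [(1, 2)] * 3 + [(3, 4)] * 3))        # two thetas: 4
```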

So, at the coarsest level, the output of the Feynman transform is apparently a generating function for (possibly disconnected) graphs sorted by circuit rank, where we are restricting to graphs in which every vertex has degree at least 3. Now this is not quite the case, since many combinatorial maps will give rise to the same topological map. Indeed, if we take the combinatorial map (\sigma,\tau) and conjugate it by an arbitrary permutation \pi \in \mathrm{S}(d), then we get another combinatorial map (\pi\sigma\pi^{-1},\pi\tau\pi^{-1}) which obviously yields the same topological map when the construction described above is performed, since we have simply relabeled everything by conjugating. Thus we expect that the correspondence between combinatorial and topological maps should be d!-to-one, which would be excellent news, since then the factor

\frac{1}{d!} |C_\alpha|t_d

would be equal to one. However, this is not precisely true, since there may be permutations \pi whose conjugation action on the combinatorial map (\sigma,\tau) has no effect, leaving this combinatorial map unaltered; this happens precisely when \pi commutes with both \sigma and \tau, or in other words belongs to the centralizer of the subgroup \langle \sigma,\tau \rangle of the symmetric group \mathrm{S}(d) which they generate. It is thus reasonable to define the automorphism group of a combinatorial map to be this centralizer. We then see that the Feynman transform is indeed a generating function for graphs arranged by circuit rank, but in which each graph is counted with a weight equal to the reciprocal of the order of its automorphism group:

\mathcal{F}\left( \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}\right):=\sum\limits_\Gamma \frac{\hbar^{r(\Gamma)-c(\Gamma)}}{|\mathrm{Aut} \Gamma|}p_\Gamma,

the sum being over all topological maps of minimum vertex degree 3, with

p_\Gamma = \prod_{v \in \Gamma} p_{\deg(v)},

where the product is over all vertices of \Gamma. This product is called the “Feynman amplitude” of \Gamma, and it is the only part of the above construction which actually depends on the generating function S we started with; everything else is universal, being exactly the same irrespective of what S is. And that’s the general idea of the Feynman diagram expansion in zero-dimensional scalar-valued quantum field theory. It’s pretty useful, because for example we can say that the leading correction in the asymptotic expansion of the integral we’re trying to approximate in the \hbar \to 0 limit comes from connected graphs which have minimum vertex degree 3 and two independent cycles, and there are only three of those:

Leading Feynman diagrams.
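The total weight of these three diagrams can be confirmed by brute force (my own sketch, in Python): grouping combinatorial maps into conjugation orbits and summing 1/|\mathrm{Aut}\ \Gamma|, the one-vertex 4-valent maps contribute t_4|C_{(4)}|/4! = 3/4 and the two-vertex trivalent maps contribute t_6|C_{(3,3)}|/6! = 5/6, matching the coefficient \frac{3}{4}p_4+\frac{5}{6}p_3^2 of \hbar computed directly from the definition above.

```python
# Sum 1/|Aut| over the leading topological maps by brute force: a
# combinatorial map is a pair (sigma, tau), topological maps are
# conjugation orbits, and |Aut| = d!/|orbit| by orbit-stabilizer.
from itertools import permutations
from fractions import Fraction
from math import factorial

def conj(pi, p):
    # pi o p o pi^{-1}: sends pi(i) to pi(p(i))
    out = [0] * len(p)
    for i in range(len(p)):
        out[pi[i]] = pi[p[i]]
    return tuple(out)

def cycle_type(p):
    seen, t = set(), []
    for i in range(len(p)):
        if i not in seen:
            n, j = 0, i
            while j not in seen:
                seen.add(j)
                j = p[j]
                n += 1
            t.append(n)
    return tuple(sorted(t, reverse=True))

def total_weight(d, alpha):
    group = list(permutations(range(d)))
    sigmas = [s for s in group if cycle_type(s) == alpha]
    taus = [t for t in group if cycle_type(t) == (2,) * (d // 2)]
    pairs = {(s, t) for s in sigmas for t in taus}
    weight = Fraction(0)
    while pairs:
        s, t = next(iter(pairs))
        orbit = {(conj(pi, s), conj(pi, t)) for pi in group}
        pairs -= orbit
        weight += Fraction(len(orbit), factorial(d))   # = 1/|Aut|
    return weight

print(total_weight(4, (4,)))     # 3/4: one 4-valent vertex (figure-eight)
print(total_weight(6, (3, 3)))   # 5/6: two 3-valent vertices (theta, dumbbell)
```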

We’ll clean this up a bit next week, and then go on to the matrix-valued case, where the expansion depends not just on the underlying graph of a topological map, but also on the topology of the surface into which it is embedded.