Math202C: Lecture 17 – Expander Graphs

Author: Haixiao Wang

As demonstrated in [1], expander graphs are a sparse graph that mimic the structure properties of complete graphs, quantified by vertex, edge or spectral expansion. At the same time, expander constructions have spawned research in pure and applied mathematics, with several applications to complexity theory, design of robust computer networks, and the theory of error-correcting codes.

This lecture is organized as follows. We introduce two different definitions of expanders in the first section, and then explore the relationship between them in second part.

17.1. Two Definitions

We start our discussion in expansion with two definitions of expansion: combinatorial and spectral. In the rest of this post, we consider an undirected graph G = (V, E) with |V| = n, where self loops and multiple edges are allowed. Given two subsets S,T \subset V, the set of edges between S and T is denoted by E(S, T) = \{(u,v ): (u, v) \in E, u\in S, v\in T\}.

17.1.1. Combinatorial expansion

Intuitively, a graph G=(V,E) is an expander if subsets S \subseteq V expand outwards into V \setminus S. In other words, all small subsets will have large boundary, respectively. Follow this intuition, we introduce the definition of boundary, based on edges and vertices respectively, to quantify the corresponding expansion.

Definition 1(Boundary):

  • The Edge Boundary of a set S, denoted by \partial S, is the set of edges \partial S = E(S, V \setminus S), which is the set of edges between disjoint set S and V \setminus S.
  • The Vertex Boundary of a set S is a subset of vertices in V \setminus S adjacent to vertices in S, denoted as N(S).

Obviously, vertex boundary |N(S)| is the same as the number of neighboring vertices of vertex sets S. With those notations, we then define the expansion over the entire graph G via maximizing over subsets S.

Definition 2(Expander): For some \varepsilon \in [0, 1], a graph G = (V, E) with |V| = n is an

  • \varepsilon-edge-expander if \min\limits_{ S\subset V, |S| \leq \frac{n}{2} } \frac{|\partial S|}{|E(S, V)|} \geq \varepsilon.
  • \varepsilon-vertex-expander if \min\limits_{ S\subset V, |S| \leq \frac{n}{2} } \frac{|N(S)|}{|S|} \geq \varepsilon.

Remark 3: The reason for the restriction |S| \leq n/2 is that we are examining the relative size of every cut in the graph, i.e., the number of edges crossing between S and V \setminus S, and we examine each case only once.

Problem 1: Show that the complete graph K_{n} is

  • at least a 1/2-edge-expander.
  • a 1-vertex-expander. (We only consider \varepsilon \in [0, 1] here.)

From the definition above, the maximum \varepsilon, which makes G = (V, E) an \varepsilon-edge-expander, is an important criteria, leading to the definition of Cheeger constant.

Definition 4(Cheeger Constant): The Edge Expansion Ratio, or Cheeger constant, of graph G = (V, E) with |V| = n, denoted by h(G), is defined as

h(G) = \min\limits_{S\subset V, |S| \leq \frac{n}{2}}\frac{|\partial S|}{|E(S, V)|}

Remark 5: Actually, Cheeger constant can be defined alternatively as \min\limits_{S\subset V, |S| \leq n/2}\frac{|\partial S|}{|S|} = O(n). We didn’t follow this definition since we want to keep \varepsilon\in [0, 1].

17.1.2. Spectral Expansion

To simplify our discussion, we focus on regular graphs in the rest of this lecture. Let A = A(G)\in \mathbb{R}^{n \times n} denote the adjacency Matrix of a d-regular graph G = (V, E) with |V| = n and A being real symmetric, then A has n real eigenvalues, denoted by \lambda_{1} \geq \lambda_{2} \geq \cdots \geq \lambda_n, and corresponding orthonormal eigenvectors v_1, \dots, v_n with A v_i = \lambda_i v_i. The spectrum of A encodes a lot of information about the graph structure. Recall the following properties, which we have seen in previous lectures.

  • The largest eigenvalue is \lambda_1 = d and the corresponding eigenvector is v_1 = (1/\sqrt{n}, \dots, 1/\sqrt{n}).
  • The graph G is connected if and only if \lambda_1 > \lambda_2.
  • The graph is bipartite if and only if \lambda_1 = -\lambda_n.

The definition of spectral expansion and spectral gap then follows.

Definition 6(Spectral Expansion): A graph G = (V, E) with |V| = n is a \lambda-spectral-expander if \lambda(G):= \max \{ |\lambda_2|, |\lambda_n| \} \leq \lambda.

In other words, the second largest eigenvalue of A(G), in absolute value, is at most \lambda. This also gives rise to the definition of Ramanujan graph.

Definition 7(Ramanujan graph): A connected d-regular graph G is a Ramanujan graph if \lambda (G)\leq 2\sqrt{d-1}.

We now refer to the spectral gap.

Definition 8(Spectral Gap): The spectral gap of d-regular graph G is \lambda_{1} - \lambda(G).

Remark 9: Sometimes the spectral gap is defined as \lambda_1 - \lambda_2, which is often referred as one-sided.

Problem 2:

  • Show that the complete graph K_{n} has spectral gap n - 2. Actually, we can also show that the one-sided spectral gap is n.
  • Show that the complete bipartite graph K_{m, n} has spectral gap 0. (Hint: use the spectral property above.)

17.2. Relating Two Definitions

It is not so obvious that the combinatorial and spectral versions of expansion are related. In this section, we will introduce two relations between edge and spectral expansion: the Expander-Mixing Lemma and Cheeger’s inequality.

17.2.1. The Expander-Mixing Lemma

Lemma 10(Expander-Mixing Lemma[2]): Let G = (V, E) be a d-regular graph with n vertices and eigenvalues \lambda_1 \geq \dots \geq \lambda_n. Then for all S, T \subset V:

\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \lambda(G) \sqrt{|S|\cdot |T|}.

As we can see, the left-hand side measures the deviation between two quantities:

  • |E(S, T)|: the number of edges between the two subsets S and T.
  • |S|\cdot |T|\cdot d/n: the expected number of edges between two subsets S and T, drawn from an Erdos–Renyi random graph G(n, d/n).

The Expander-Mixing lemma tell us a small \lambda(G) (a.k.a. large spectral gap) implies that this discrepancy is small, leading to the fact that the graph is nearly random in this sense, as discussed in [1]. We should mention that the converse of expander-mixing is true as well, stated as follows.

Lemma 11(Expander-Mixing Lemma Converse [4]): Let G = (V, E) be a d-regular graph with n vertices and eigenvalues \lambda_1 \geq \dots \geq \lambda_n satisfying:

\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \rho \sqrt{|S|\cdot |T|}.

for any two disjoint subsets S,T\subset V and some positive \rho. Then \lambda(G) \leq O(\rho \cdot(1 + \log(d/\rho))).

Proof of Expander-Mixing Lemma: Recall that the adjacency matrix A has eigenvalues \lambda_1 \geq \dots \geq \lambda_n with corresponding orthonormal eigenvectors v_1, \dots, v_n. Let 1_S and 1_T be the characteristic vectors of subsets S and T(a.k.a., the v-th coordinate of 1_S is 1 if v\in S and 0 otherwise). We then decompose 1_S and 1_T in the orthonormal basis of eigenvectors v_1, \dots, v_n as

1_S = \sum\limits_{i=1}^n \alpha_i v_i, \quad 1_S = \sum\limits_{j=1}^n\beta_j v_j.

Observing that |E(S, T)| = (1_S)^T A 1_T, we expand out the right hand side and apply orthogonality, which yields

(1_S)^T A 1_T = \left (\sum\limits_{i=1}^n \alpha_i v_i \right)^T A \left (\sum\limits_{j=1}^n\beta_j v_j \right) = \sum\limits_{i=1}^n \lambda_i\alpha_i\beta_i = d\frac{|S|\cdot|T|}{n} + \sum\limits_{i=2}^n \lambda_i\alpha_i\beta_i .

where the last step follows from the facts \lambda_1 = d with corresponding v_1 = (1/\sqrt{n},\dots, 1/\sqrt{n}), \alpha_1 =<1_S, v_1> = |S|/\sqrt{n} and \beta_1 =<1_T, v_1> = |T|/\sqrt{n}. Shifting the first term on right hand side to the left, the desired result follows from Cauchy-Schwarz inequality:

\left| |E(S, T)| - d\frac{|S||T|}{n} \right | = \left|\sum\limits_{i=2}^n\lambda_i\alpha_i\beta_i\right| \leq \lambda(G) \sum\limits_{i=2}^n |\alpha_i \beta_i| \leq \lambda(G) ||\alpha||_2 ||\beta ||_2 = \lambda(G) ||1_S||_2  ||1_T||_2 = \lambda(G)\sqrt{|S||T|}.

Remark 12: If we normalize the adjacency matrix A = A(G) and obtain \bar{A} = (1/d) A, where each entry of A is divided by d, then the corresponding eigenvalues of \bar{A}, namely \bar{\lambda}_{1}\geq \dots \geq \bar{\lambda}_n, lie in the interval [-1, 1]. The Expander-Mixing Lemma is in the form of

\Bigg| |E(S, T)| - \frac{d|S|\cdot |T|}{n}\Bigg| \leq \bar{\lambda}(G)d \sqrt{|S|\cdot |T|}.

It is sometimes convenient to consider the normalized spectrum. For example in random graph theory, it would be convenient to look at the normalized spectrum when the expected degree d = O(\log n). However, we can focus on the spectrum of $\latex A = A(G)$ without normalization when the expected degree $\latex d = O(1)$.

Regular \lambda-spectral expanders with small \lambda(G)(a.k.a. large spectral gap) have some of significant properties, listed as follows.

Corollary 13: An independent set , in a graph G = (V, E) with |V| = n, is a set of vertices S, where no two vertices in S are adjacent, i.e. with |E(S, S)| = 0. An independent set S, in a d-regular \alpha-spectral expander G, has cardinality at most \alpha n, i.e., |S|\leq \alpha n, which follows immediately from Lemma 10(Expander-Mixing Lemma[2]).

Corollary 14: Given a graph G = (V, E) with |V| = n, the diameter of G is defined as \mathrm{diam}(G) = \max\limits_{u, v}d_{G}(u, v), where d_{G}(u, v) is the length of shortest path between u and v. If G is a d-regular \alpha-spectral expander G with \alpha < d/2, then we have

\Omega \Big( \frac{\log n}{\log d} \Big) \leq \mathrm{diam}(G) \leq O(\log n).

17.2.2. Cheeger’s Inequality

As we have seen from previous discussion, spectral expansion implies strong structural properties on the edge-structure of graphs. In this section, we introduce another famous result, Cheeger’s inequality, indicating that the Cheeger constant(Edge Expansion Ratio) of a graph G can be approximated by the its spectral gap \lambda_1 - \lambda(G).[6]

Theorem 15(Cheeger’s inequality, [3]): Let G = (V, E) be a d-regular graph with eigenvalues \lambda_1\geq \cdots \geq \lambda_n, then

\frac{d - \lambda_2}{2} \leq \min\limits_{S\subset V, |S|\leq n/2} \frac{|\partial S|}{|S|} \leq \sqrt{2d(d - \lambda_2)}.

In the normalized scenario \bar{A} = (1/d)A(G) with eigenvalues \bar{\lambda}_{1}, \cdots, \bar{\lambda}_n \in [-1, 1], we have

\frac{1 - \bar{\lambda}_2}{2} \leq h(G):=\min\limits_{S\subset V, |S|\leq n/2} \frac{|\partial S|}{|E(S, V)|} \leq \sqrt{2(1 - \bar{\lambda}_2)}.

Remark 16: The proof for the normalized scenario is different from the regular scenario. We should start from the desired quantity and finish the calculation step by step. However, the strategy are the same. We use the Rayleigh Quotient to obtain upper bound and lower bound.

In the context of n\gg d, the estimate of spectral gap was given by Nilli Alon.

Theorem 17([5]): For every d-regular graph G, we have

\lambda(G) \geq 2\sqrt{d - 1} - o_{n}(1)

The o_{n}(1) term is a quantity that tends to zero for every fixed d as n \to \infty.

[1]. Shlomo Horry, N. L. and Wigderson, A. (2006). Expander graphs andtheir applications.BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY, 43(4).

[2]. Alon, N. and Chung, F. R. K. (1988). Explicit construction of linear sized tolerantnetworks.Discrete Mathematics, 42

[3]. Cheeger, J. (1970). A lower bound for the smallest eigenvalue of the laplacian.Problems inanalysis, 26.[Nilli, 1991] Nilli, A. (1991). On the secon

[4]. Yonatan Bilu, N. L. (2006). Lifts, discrepancy and nearly optimal spectral gap.Com-binatorica, 26

[5]. Nilli, A. (1991). On the second eigenvalue of a graph.Discrete Mathematics, 91(2):207–210.

[6]. Lovett, S. (2021). Lecture notes for expander graphs and high-dimensional expanders.

Math 202C: Lecture 12

Since the representation theory of the symmetric group is likely fresh in your mind thanks to yesterday’s exam, this is a good time to show you a very nice probabilistic application of this theory to card shuffling. This post will discuss the “top-to-random” shuffle, where the cards 1,\dots,n are arranged in a single row (or a single stack), and a randomly chosen card is repeatedly swapped with the rightmost (or top) card. The algebraic model for this procedure is the Cayley graph \Gamma(n) of the symmetric group as generated by the so-called “star transpositions”

\tau_1=(1\ n), \tau_2=(2\ n),\dots, \tau_{n-1} = (n-1\ n).

Problem 12.1: Prove that the above transpositions do in fact generate \mathrm{S}(n).

These are called star transpositions because, if one forms the graph on \{1,\dots,n\} such that \{i,j\} is an edge if and only if one of the above transpositions swaps i and j, then this graph is a star graph. The top-to-random shuffle is then the same thing as the lazy simple random walk on the Cayley graph \Gamma(n).

The paper attached below gives a detailed analysis of the top-to-random shuffle using the representation theory of \mathrm{S}(n), and gives a precise meaning to the statement that n \log n such shuffles are required in order to randomize the deck. The argument is pretty much all representation theory — only some very basic probabilistic reasoning is used towards the end.

Math 202C: Lecture 9

We begin with a brief recap of the main result from Lecture 8. Let

\chi_\sigma = \sum\limits_{\pi \in \mathrm{B}(k)} \chi_\sigma(\pi) \pi, \quad \sigma \in \mathrm{B}(k)

be the characters of the rank k-hypercube group

\mathrm{B}(k) = \langle (1\ 2), (3\ 4), \dots, (2k-1\ k) \rangle.

Definition 1: The orthogonal idempotents in the group algebra \mathbb{C}\mathrm{B}(k) are defined as the characters of \mathrm{B}(k) normalized by the order of \mathrm{B}(k):

\tilde{\chi}_\sigma = \frac{1}{2^k} \chi_\sigma, \quad \sigma \in \mathrm{B}(k).

The reason for the name is the following key property of these elements:

\tilde{\chi}_\rho \tilde{\chi}_\sigma = \tilde{\chi}_\rho [\rho = \sigma], \quad \rho,\sigma \in \mathrm{B}(k).

This is quite valuable, since expanding elements a \in \mathbb{C}\mathrm{B}(k) in the group algebra relative to the basis of orthogonal idempotents essentially makes multiplication in the group algebra trivial, which translates into the “Fourier coefficients of a product are products of Fourier coefficients” fact that we saw last Lecture. Note also that the behavior of orthogonal idempotents in the group algebra is formally identical to the behavior of orthogonal projections in Hilbert space — you perhaps know from Math 202B that this is not a coincidence, and if you don’t know this don’t worry, because we will discuss it.

The main result for today is the following.

Theorem 1: The class of multiplication operators in \mathrm{End}\mathbb{C}\mathrm{B}(k) coincides with the class of operators that are diagonal in the character basis of \mathbb{C}\mathrm{B}(k).

Proof: Let b \in \mathbb{C}\mathrm{B}(k) be an element of the group algebra, and let R_b \in \mathrm{End} \mathbb{C}\mathrm{B}(k) be the corresponding multiplication operator. Consider the character expansion of b,

b = 2^{-k} \sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,b\rangle \chi_\pi .

Now let \chi_\sigma \in \mathbb{C}\mathrm{B}(k) be another character. We have

R_b\chi_\sigma = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,b\rangle \chi_\sigma\chi_\pi = 2^{-k}\langle \chi_\sigma,b\rangle 2^k \chi_\sigma = \langle \chi_\sigma,b\rangle\chi_\sigma.

So, \chi_\sigma is an eigenvector of R_b, with eigenvalue the Fourier coefficient \langle \chi_\sigma,b\rangle.

Conversely, suppose that R \in  \mathrm{End}\mathbb{C}\mathrm{B}(k) acts diagonally in the character basis:

R \chi_\pi = \lambda_\pi \chi_\pi, \quad \pi \in \mathrm{B}(k).

Then the action of R on an arbitrary a \in \mathbb{C}\mathrm{B}(k) is

Ma = M 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a \rangle \chi_\pi = 2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a \rangle \lambda_\pi\chi_\pi = \left(  2^{-k}\sum\limits_{\pi \in \mathrm{B}(k)} \langle \chi_\pi,a \rangle \chi_\pi \right) \left(2^{-k}\sum\limits_{\sigma \in \mathrm{B}(k)} \lambda_\sigma \chi_\sigma \right),

so that M=R_b with

b= 2^{-k}\sum\limits_{\sigma \in \mathrm{B}(k)} \lambda_\sigma \chi_\sigma.

— Q.E.D.

With Theorem 1 in hand, we can immediately compute the the eigenvalues and eigenvectors of the Cayley graph of \mathrm{B}(k). Indeed, the adjacency operator of the Cayley graph of \mathrm{B}(k) is just the multiplication operators

R_{\sum_{i=1}^k \tau_i},

so by Theorem 1 we immediately have that

R_{\sum_{i=1}^k \tau_i} \chi_\sigma = \langle \chi_\sigma,\sum\limits_{i=1}^k \tau_i \rangle \chi_\sigma, \quad \sigma \in \mathrm{B}(k).

Now, we have that the resulting eigenvalue is

\langle \chi_\sigma,\sum\limits_{i=1}^k \tau_i \rangle = \sum\limits_{i=1}^k \langle \chi_\sigma,\tau_i \rangle,

and this is a sum we can explicitly evaluate, because we explicitly computed the characters of \mathrm{B}(k) back in Lecture 7. In particular, what we know is that

\langle \chi_\sigma,\tau_i \rangle =\chi_\sigma(\tau_i)

is equal to -1 if \tau_i is a factor of \sigma, and is equal to +1 if \tau_i is not a factor of \sigma. Since we compute this scalar product for each and every generator \tau_i of \mathrm{B}(k), the sum has |\sigma| terms equal to -1, and k - |\sigma| terms equal to +1, where |\sigma| is the word norm of \sigma, or equivalently its distance to the identity permutation in the Cayley graph. So we get

\sum\limits_{i=1}^k \langle \chi_\sigma,\tau_i \rangle = k-2|\sigma|

as our explicit eigenvalue of the character \chi_\sigma under the action of the adjacency operator. Moreover, the number of elements \sigma \in \mathrm{B}(k) such that |\sigma|=j for a given 0 \leq j \leq k is simply {k \choose j}, since this just amounts to choosing j of the generators \tau_1,\dots,\tau_k of \mathrm{B}(k). So we can summarize our results as follows.

Theorem 2: The eigenvalues of Cayley graph of \mathrm{B}(k) are the numbers

k-2j, \quad j=0,\dots,k,

and the dimension of the eigenspace corresponding to k-2j is {k \choose j}.

Since all eigenvalues of the Cayley graph of \mathrm{B}(k) are given explicitly by Theorem 2, we are in a position to explicitly enumerate walks on this graph, using the spectral approach to counting walks discussed previously.

Problem 9.1: Show that the number W^r(\rho,\sigma) of r-step walks on \mathrm{B}(k) from \rho to \sigma is

W^r(\rho,\sigma) = \frac{1}{2^k} \sum\limits_{i=0}^k \sum\limits_{j=0}^{|\rho\sigma|} (-1)^j {|\rho\sigma| \choose j} {k-|\rho\sigma| \choose i-j} (k-2i)^r.

This material, and problem 9.1, have applications in coding theory which you might find interesting to look into.

Math 202C: Lecture 5

We continue with our study of random walks on finite graphs via linear algebra. Let \mathrm{G} be a graph on \{1,\dots,n\}, possibly with loops and/or multiple edges. As in Lecture 4, let P^r(i,j) denote the probability that simple random walk on \mathrm{G} starting from vertex i is located at vertex j at time r. In this Lecture, we will use the linear algebra setup from Lecture 4 to analyze the r \to \infty asymptotic behavior of the probability P^r(i,j), which corresponds to the “long time behavior” of simple random walk on \mathrm{G}.

The linear algebraic setup is as follows. Let \mathcal{H} be a Hilbert space with orthonormal basis \{\mathbf{e}_1,\dots,\mathbf{e}_n\} indexed by the vertices of \mathrm{G}. In the language of quantum mechanics, by replacing the vertex set of \mathrm{G} with the Hilbert space \mathcal{H} we allow ourselves to consider not just vertices, but “superpositions” of vertices. Recall from Lecture 4 that the transition operator T \in \mathrm{End}\mathcal{H} for simple random walk on \mathrm{G} is defined by its action on the vertex basis as

T\mathbf{e}_i = \sum\limits_{j=1}^n \frac{m_{ij}}{d_i} \mathbf{e}_j,

where m_{ij} is the number of edges of \mathrm{G} incident to both i and j, and d_i=\sum_{j=1}^n m_{ij} is the total number of edges incident to i. In Lecture 4, we showed that

P^r(i,j) = \langle T^r\mathbf{e}_i,\mathbf{e}_j \rangle,

so that understanding the r \to \infty asymptotics of the probability P^r(i,j) is equivalent to understanding the r \to \infty asymptotics of the scalar product \langle T^r\mathbf{e}_i,\mathbf{e}_j \rangle. We would like to accomplish this via spectral analysis of T.

We adopt the same assumptions as in Lecture 4: \mathrm{G} is a connected graph on \{1,\dots,n\} and n \geq 2, so that in particular every vertex has strictly positive degree. Under this assumption, we showed previously that, although T need not be selfadjoint, or even normal, the “symmetrize” transition operator D^{-1}TD is selfadjoint, with D \in \mathrm{GL}(\mathcal{H}) being the operator that acts diagonally in the vertex basis according to

D\mathbf{e}_j = \sqrt{d_j}\mathbf{e}_j.

We may thus invoke the spectral theorem to assert that

D^{-1}TD = \lambda_1P_1 + \dots + \lambda_mP_m,


\lambda_1 >\dots > \lambda_m,

are the distinct eigenvalues of the selfadjoint operator D^{-1}TD, with P_i \in \mathrm{End}\mathcal{H} the orthogonal projection of \mathcal{H} onto the \lambda_i-eigenspace. Because the P_i‘s are orthogonal projections, we have

D^{-1}T^rD = \lambda_1^rP_1 + \dots + \lambda_m^rP_m,

so that for the rth power of the transition operator we have the decomposition

T^r = \lambda_1^r DP_1D^{-1} + \dots + \lambda_m^r DP_mD^{-1}.

So far everything we have said is valid for an arbitrary operator in \mathrm{End}\mathcal{H} whose orbit under the conjugation action of \mathrm{GL}(\mathcal{H}) contains a selfadjoint operator. To go further we naturally have to reference some of the particular features of the transition operator T.

Theorem 1: The largest eigenvalue of T is \lambda_1=1.

Proof: Let \|\cdot\|_1 be the 1-norm on \mathcal{H} corresponding to the basis \mathbf{e}_1,\dots,\mathbf{e}_n, i.e.

\|\mathbf{v}\|_1 = \sum\limits_{j=1}^n |\langle \mathbf{v},\mathbf{e}_j \rangle|.

We claim that T has norm at most 1 in the corresponding operator norm on \mathrm{End}\mathcal{H}, i.e.

\|T\mathbf{v}\|_1 \leq \|\mathbf{v}\|_1 \quad \forall \mathbf{v} \in \mathcal{H}.

Indeed, we have

\|T\mathbf{v}\|_1 = \sum\limits_{j=1}^n |\langle S\mathbf{v},\mathbf{e}_j\rangle| = \sum\limits_{j=1}^n \left| \left\langle \sum\limits_{i=1}^n \langle \mathbf{e}_i,\mathbf{v}\rangle T\mathbf{e}_i,\mathbf{e}_j \right\rangle \right| \leq \sum\limits_{i=1}^n|\langle \mathbf{v},\mathbf{e}_i\rangle|\sum\limits_{j=1}^n |\langle T\mathbf{e}_i,\mathbf{e}_j\rangle|,

and the terminal double sum is exactly \|\mathbf{v}\|_1, since

\sum\limits_{j=1}^n \langle T\mathbf{e}_i,\mathbf{e}_j \rangle = \sum\limits_{j=1}^n \frac{m_{ij}}{d_i} = \frac{d_i}{d_i}=1.

It follows that any eigenvalue of T has modulus at most 1.

Now we exhibit an eigenvector of T with eigenvalue 1. Consider the vector

\mathbf{s} = \sum\limits_{i=1}^n \frac{d_i}{D}\mathbf{e}_i,


D=\sum\limits_{i=1}^n d_i

is the sum of the degrees of all vertices of \mathrm{G}, which is equal to twice the number of edges in \mathrm{G} less the number of loops. We then have

T\mathbf{s} = \sum\limits_{i=1}^n \frac{d_i}{D}T\mathbf{e}_i = \sum\limits_{i=1}^n \frac{d_i}{D}\sum\limits_{j=1}^n \frac{m_{ij}}{d_i} \mathbf{e}_j = \sum\limits_{j=1}^n \frac{d_j}{D}\mathbf{e}_j = \mathbf{s}.

— Q.E.D.

The eigenvector \mathbf{s} above is called the stationary vector for simple random walk on \mathrm{G}, and it encodes the probability measure on the vertices of \mathrm{G} which assigns mass \frac{d_i}{D} to vertex i. Note that, if \mathrm{G} is a simple graph with constant vertex degree equal to d, which is always the case for Cayley graphs corresponding to a symmetric generating set, then the stationary vector

\mathbf{s} = \sum\limits_{i=1}^n \frac{d}{dn}\mathbf{e}_i = \sum\limits_{i=1}^n \frac{1}{n}\mathbf{e}_i

corresponds to uniform probability measure on \{1,\dots,n\}.

It is important to note that Theorem 1 does not preclude the possibility that the smallest eigenvalue of the transition operator T could also have maximum modulus — it could well be that \lambda_m=-1. However, this can only be the case if \mathrm{G} is bipartite (perhaps I should fill in a proof of this later, but if I don’t you can find one in any text on spectral graph theory). Therefore, we have the following.

Corollary 1: If \mathrm{G} is not bipartite, then

\lim\limits_{r \to \infty} \|T^r-DP_1D^{-1}\| = 0,

where \|\cdot\| is any norm on \mathrm{End}\mathcal{H}.

Proof: Let \|\cdot\| be an arbitrary norm on \mathrm{End}\mathcal{H} (indeed, since \mathrm{End}\mathcal{H}\} is finite-dimensional, all norms are equivalent). Then for any r we have that

\|T^r-\lambda_1DP_1D^{-1}\| \leq \sum\limits_{i=2}^m |\lambda_i|^r \|DP_iD^{-1}\|,

by the triangle inequality. But since \lambda_1=1 and |\lambda_i|<1 for i > 1, the statement follows from the squeeze theorem.

— Q.E.D.

At this point we can finally say something about the probability P^r(i,j) = \langle T^r\mathbf{e}_i,\mathbf{e}_j \rangle that simple random walk on \mathrm{G} starting at i arrives at j at time r: for \mathrm{G} a connected non-bipartite graph on n \geq 2 vertices, the limit

t(i,j) = \lim\limits_{r \to \infty} P^r(i,j)

exists. Note that the role of bipartiteness here is quite clear from a combinatorial perspective: if \mathrm{G} is bipartite, then with probability one simple random walk started at i is located in the color class of i at even times, and in the opposite color class at odd times, so that the above limit obviously does not exist.

Continuing on under the assumption that \mathrm{G} is not-bipartite, let us define a vector \mathbf{t}_i \in \mathcal{H} by

\mathbf{t}_i = \sum\limits_{j=1}^n t(i,j) \mathbf{e}_j,

which encodes the limit distribution of simple random walk on \mathrm{G} started at i.

Theorem 2: For any 1 \leq i \leq n, the limit vector \mathbf{t}_i is an eigenvector of T with eigenvalue 1.

Proof: We have

T\mathbf{t}_i = T\lim\limits_{r \to \infty} T^r \mathbf{e}_i = \lim\limits_{r \to \infty} T^{r+1} \mathbf{e}_i = \mathbf{t}_i.

— Q.E.D.

Theorem (Markov limit theorem): For \mathrm{G} a connected non-bipartite graph on \{1,\dots,n\}, n \geq 2, we have

\mathbf{t}_1=\mathbf{t}_2=\dots = \mathbf{t}_n = \mathbf{s}.

Proof: The statement follows if we can show that \lambda_1=1 is a simple eigenvalue of T, i.e. that the \lambda_1-eigenspace is a line. Indeed, all the vectors \mathbf{t}_i as well as \mathbf{s} have nonnegative coordinates summing to 1 relative to the vertex basis, so if any of these is a scalar multiple of another then the scalar in question must be 1.

The proof that the top eigenspace is indeed one-dimensional uses a bit of a deus ex machina, namely the Perron-Frobenius theorem. Look up this theorem, and complete the proof!

— Q.E.D.

Math 202C: Lecture 0

As a student enrolled in Math 202C, you have necessarily completed Math 202B under the tutelage of Professor Brendon Rhoades, and as such are an expert on the representation theory of the symmetric group \mathrm{S}(d). You may however not realize that this theory is directly applicable to natural and compelling real world problems, some of which are sufficiently rich and interesting that they can be adopted as a posteriori motivations for the development of the subject.

In this course, we are going to take another look at the representation theory of the symmetric group through the lens of the following real world question: how many times do you have to shuffle a deck of cards in order to completely randomize it? We will try to answer this question completely and thoroughly, down to the last detail, and in doing so expand on your existing knowledge of the representation theory of the symmetric group, as well as place this theory into the context of the more general subject known as harmonic analysis on finite groups. This is a beautiful and very classical branch of algebra which enriches the representation theory of finite groups you learned in Math 202B with new perspectives gleaned from the theory of the Fourier transform in classical analysis. Representation theory and harmonic analysis can be completely united, and the resulting subject is the representation theory of topological groups, which we may briefly discuss towards the end of the course. However, we will primarily remain in the finite setting where everything is totally algebraic and easy to understand without any sophisticated machinery.

The mechanics of the course are as follows. I will provide you with a complete text on harmonic analysis on finite groups, in the form of blog posts which will appear here every Monday and Wednesday, in the approximate neighborhood of our regularly scheduled lecture time. Thus there will be two posts per week; reading through and understanding each of these will likely require somewhat more effort than absorbing a standard in-person fifty minute lecture. These two lecture posts form the asynchronous component of the course. That leaves our Friday lecture slot, which will be a two hour live stream session: the first hour will be a flipped classroom style experience, rather than a planned live lecture, in which we can collaboratively review, discuss, and expand upon the material from the lecture posts. The second hour will be my office hours, so open to any and all questions. In practice, it is likely that there won’t be much of a distinction between the first and second hour, but batching the lecture slot and office hours gives us a good chunk of time to work through things. This all happens on the Math 262C Discord server, which will be our main means of communication throughout the quarter: please sign up immediately using this link, When signing up, please take your real name as your username.

The grading scheme for the course is as follows. The lecture posts will contain embedded problems for solution. You will be expected to solve these problems, typeset your solutions in LaTeX, and submit them for grading each week, prior to Sunday at midnight. The exact method for submission and return of your graded solutions will be decided this week, and announced on the Discord server. Furthermore, you will write a “guest lecture post” to be published on this blog on a topic related to the course material; I will prepare a list of possible topics, or you may choose your own topic and clear it with me. Your grade for the course will be an average of your graded problem solutions and your guest lecture post, and you may choose the weighting of this average however you wish, anywhere from 100% problem solutions to 100% guest lecture post. Thus, at the extremes, if you prefer to solve problems and have your solutions graded, you can skip the lecture post, and conversely if you are getting tired of solving problems and would rather throw yourself into a more long term project, you can put all your effort into writing a lecture post and forego submission of typed solutions to the problems.

At this point, I expect you have questions. Please sign up for the Discord server, and we can start talking and working out various logistics. The mathematical content of the course begins on Wednesday, with Lecture 1.