# Math 202C: Lecture 10

In this lecture, we analyze simple random walk on the group $\mathrm{B}(k)$ of permutations of $\{1,\dots,2k\}$ generated by the disjoint transpositions

$\tau_1=(1\ 2), \dots, \tau_k=(2k-1\ 2k).$

The corresponding Cayley graph $\Gamma=\mathrm{Cay}(\mathrm{B}(k),\{\tau_1,\dots,\tau_k\})$ has vertex set $\mathrm{B}(k),$ with $\{\rho,\sigma\}$ an edge if and only if $\sigma=\rho\tau_i$ for some $1 \leq i \leq k.$

Simple random walk on $\Gamma$ has an interesting equivalent formulation known as the Ehrenfest diffusion model, which arose in thermodynamics. The Ehrenfest model consists of two urns, $U_1$ and $U_2,$ and $k$ balls $b_1,\dots,b_k.$ At time zero, all balls are in urn $U_2.$ At each instant of discrete time thereafter, a ball is chosen uniformly at random and moved to the other urn.
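As a quick illustration, here is a minimal simulation sketch of the Ehrenfest model. The function name `ehrenfest` and the bit-list encoding are our own conventions, not part of the lecture: bit $i$ records which urn ball $b_{i+1}$ occupies, so moving a ball is flipping a bit, i.e. multiplying by one of the transpositions $\tau_i.$

```python
import random

def ehrenfest(k, steps, seed=0):
    """Simulate the Ehrenfest model: state[i] is 1 exactly when ball
    b_{i+1} sits in urn U_1; at time zero every ball is in urn U_2."""
    rng = random.Random(seed)
    state = [0] * k
    for _ in range(steps):
        i = rng.randrange(k)   # choose a ball uniformly at random...
        state[i] ^= 1          # ...and move it to the other urn
    return state
```

Note that each step flips exactly one bit, so the parity of the number of balls in $U_1$ alternates deterministically; this is the bipartite structure addressed in Problem 10.2.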

Problem 10.1: Explain the equivalence between Ehrenfest diffusion and simple random walk on $\mathrm{B}(k).$

Simple random walk on $\Gamma$ begins with a single particle positioned at the identity permutation $\iota.$ At each tick of the clock, the particle makes a random jump to an adjacent permutation, with equal probability of jumping to any neighboring vertex on the $k$-regular graph $\Gamma.$ However, because $\Gamma$ is bipartite, the distribution of this random walk does not converge.

Problem 10.2: Prove that $\Gamma$ is a bipartite graph.

To fix the bipartiteness issue, we replace $\Gamma$ with a new graph $\overline{\Gamma},$ which differs from $\Gamma$ only by the addition of exactly one loop at each vertex, resulting in a $(k+1)$-regular graph with the same vertex set. This resolves the issue: by the Markov limit theorem, simple random walk on $\overline{\Gamma}$ converges to uniform measure on $\mathrm{B}(k).$ Note that one can equivalently think of simple random walk on $\overline{\Gamma}$ as a “lazy” random walk on $\Gamma,$ where at each time instant the particle either stays put or jumps to a neighboring vertex, each of these $k+1$ possibilities being equally likely.

We proceed to the analysis of simple random walk on $\overline{\Gamma}.$ Let $\mathbb{C}\mathrm{B}(k)$ be the group algebra of $\mathrm{B}(k),$ i.e. the Hilbert space with orthonormal basis

$\{\pi \colon \pi \in \mathrm{B}(k)\}$

and multiplication given by bilinear extension of the group law

$\rho \sigma = \rho\sigma.$

The probability operator $P \in \mathrm{End} \mathbb{C}\mathrm{B}(k)$ for lazy simple random walk on $\overline{\Gamma}$ is then implemented as multiplication by

$\frac{1}{k+1}\iota + \frac{1}{k+1}\tau_1 + \dots + \frac{1}{k+1}\tau_k,$

where $\iota \in \mathrm{B}(k)$ is the identity permutation. In other words, the distribution of the particle at time $r$ is given by the vector $P^r\iota.$ We would like to compare this to the vector

$u=\sum\limits_{\pi \in \mathrm{B}(k)} \frac{1}{2^k} \pi,$

which corresponds to uniform measure on $\mathrm{B}(k)$. More precisely, we would like to estimate the quantity

$\|P^r\iota - u\|_1 = \sum\limits_{\pi \in \mathrm{B}(k)} \left| \langle P^r \iota,\pi\rangle -\frac{1}{2^k}\right|,$

which, up to a factor of $1/2,$ coincides with the total variation distance between the distribution of the particle at time $r$ and the uniform distribution.
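Since every element of $\mathrm{B}(k)$ is a product of a subset of the commuting transpositions $\tau_1,\dots,\tau_k,$ we may encode permutations as $k$-bit integers and compute the vector $P^r\iota$ exactly for small $k.$ A minimal sketch (all function names are ours):

```python
def walk_distribution(k, r):
    """Exact distribution P^r iota of lazy simple random walk on B(k),
    with permutations encoded as k-bit integers (bit i set exactly
    when the transposition tau_{i+1} occurs in the product)."""
    n = 1 << k
    dist = [0.0] * n
    dist[0] = 1.0                       # the walk starts at the identity
    for _ in range(r):
        new = [0.0] * n
        for x in range(n):
            p = dist[x] / (k + 1)
            new[x] += p                 # lazy step: stay put
            for i in range(k):
                new[x ^ (1 << i)] += p  # jump: multiply by tau_{i+1}
        dist = new
    return dist

def one_norm_to_uniform(dist):
    """The quantity ||P^r iota - u||_1, i.e. twice the total
    variation distance to the uniform distribution."""
    n = len(dist)
    return sum(abs(p - 1.0 / n) for p in dist)
```

At $r=0$ this returns the point mass at $\iota,$ whose $1$-norm distance to uniform is $2(1 - 2^{-k}).$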

We will get an upper bound on the above total variation distance using the eigenvalues and eigenvectors of $P.$ We have ample information concerning the eigenvalues and eigenvectors of the operator $P$ from Lecture 9, where we analyzed the operator $A$ of multiplication by

$\tau_1 + \dots + \tau_k,$

which is the adjacency operator of $\Gamma.$ We know that the eigenvalues of $A$ are the numbers

$k-2j, \quad 0 \leq j \leq k,$

and that $E_j,$ the eigenspace of $k-2j,$ has dimension ${k \choose j}$ (an orthogonal basis of $E_j$ is given by the characters $\chi_\pi$ of $\mathrm{B}(k)$ indexed by permutations $\pi$ with $|\pi|=j$). Now, it is immediate that

$P = \frac{1}{k+1}(I + A),$

where $I \in \mathrm{End}\mathbb{C}\mathrm{B}(k)$ is the identity operator, so that $P$ has the same eigenspaces as $A,$ and more precisely $P$ acts on $E_j$ as multiplication by

$\lambda_j=1-\frac{2j}{k+1}.$
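For small $k$ this spectrum is easy to confirm numerically. A sketch using NumPy; the encoding of $\mathrm{B}(k)$ by $k$-bit integers (bit $i$ recording whether $\tau_{i+1}$ occurs) is our convention:

```python
import numpy as np
from math import comb

def transition_matrix(k):
    """The (symmetric) transition matrix of P on B(k), with
    permutations encoded as k-bit integers."""
    n = 1 << k
    P = np.zeros((n, n))
    for x in range(n):
        P[x, x] = 1.0 / (k + 1)                 # the lazy loop
        for i in range(k):
            P[x ^ (1 << i), x] = 1.0 / (k + 1)  # the generator tau_{i+1}
    return P

k = 4
computed = np.sort(np.linalg.eigvalsh(transition_matrix(k)))
# the eigenvalue 1 - 2j/(k+1) should occur with multiplicity C(k, j)
expected = np.sort([1 - 2 * j / (k + 1)
                    for j in range(k + 1) for _ in range(comb(k, j))])
assert np.allclose(computed, expected)
```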

In particular, the largest eigenvalue $\lambda_0$ of $P$ is $1$ and it is a simple eigenvalue, which is consistent with Perron-Frobenius, and moreover $\lambda_0$ is the eigenvalue of largest absolute value, since the smallest eigenvalue is

$\lambda_k= 1-\frac{2k}{k+1} > 1- \frac{2k}{k} = -1,$

which is also consistent with Perron-Frobenius. We will now plug this information into the famous Upper Bound Lemma of Diaconis-Shahshahani to get an estimate on the total variation distance between the distribution of lazy simple random walk on $\Gamma$ at time $r,$ and the uniform distribution on the vertices of $\Gamma.$ We skip the proof, and comment only that it follows from combining character expansion and the Cauchy-Schwarz inequality. A proof can be found in the “Harmonic Analysis on Finite Groups” reference text.

Theorem (Upper Bound Lemma): We have

$\|P^r\iota - u \|_1^2 \leq \sum\limits_{j=1}^k \lambda_j^{2r} \dim E_j.$

Here $\lambda_j$ is the eigenvalue with which the transition operator $P$ acts on the eigenspace $E_j,$ i.e. on the characters $\chi_\pi$ with $|\pi|=j.$ The only eigenvalue equal to $1$ is $\lambda_0,$ corresponding to the trivial character $\chi_\iota,$ which is why the $j=0$ term is excluded from the sum.

Let us apply the Upper Bound Lemma. Right off the bat, it tells us that

$\|P^r\iota - u\|_1^2 \leq \sum\limits_{j=1}^k {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r},$

so we just have to estimate this sum of explicit quantities. Observing that

$\left(1-\frac{2j}{k+1}\right)^{2r} = \left(1-\frac{2(k+1-j)}{k+1}\right)^{2r},$

we have the estimate

$\sum\limits_{j=1}^k {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r} \leq 2\sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r},$

and the factor of two which appears cancels against the factor of $1/2$ which distinguishes the $1$-norm from the total variation distance. Thus we get that $\mathrm{d}_r,$ the total variation distance between the distribution of lazy simple random walk on $\Gamma$ at time $r$ and the uniform distribution, satisfies

$\mathrm{d}_r \leq \sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r}.$

To estimate the right hand side, first observe that

${k \choose j} = \frac{k(k-1) \dots (k-j+1)}{j!} \leq \frac{k^j}{j!},$

so that

$\mathrm{d}_r \leq \sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} \frac{k^j}{j!}\left(1-\frac{2j}{k+1}\right)^{2r}.$

Using that

$e^{-x} = 1-x+\frac{x^2}{2} - \dots \geq 1-x$

for $x \geq 0,$ we have

$\left(1-\frac{2j}{k+1}\right)^{2r} \leq e^{-\frac{4jr}{k+1}},$

so that we have arrived at the upper bound

$\mathrm{d}_r \leq \sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} \frac{k^j}{j!}e^{-\frac{4jr}{k+1}}.$
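The right-hand side is now explicit enough to evaluate directly. A small sketch (the function name `upper_bound` and the sample values of $k$ and $r$ are ours), which makes the collapse of the bound near $r \approx \frac{1}{4}(k+1)\log k$ visible:

```python
from math import exp, factorial, log

def upper_bound(k, r):
    """The explicit bound derived above:
    sum_{j=1}^{floor((k+1)/2)} k^j / j! * exp(-4*j*r/(k+1))."""
    return sum(k ** j / factorial(j) * exp(-4 * j * r / (k + 1))
               for j in range(1, (k + 1) // 2 + 1))

# around the threshold (1/4)(k+1)log k the bound collapses rapidly
k = 100
threshold = (k + 1) * log(k) / 4
before = upper_bound(k, int(threshold) // 2)  # far above 1: vacuous
after = upper_bound(k, 2 * int(threshold))    # very small: walk has mixed
```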

Now observe that if $r$ is large enough so that the exponentially small factor $e^{-\frac{4jr}{k+1}}$ defeats the polynomially large factor $k^j$ even for small values of $j,$ then the distance $\mathrm{d}_r$ will be very small. More precisely, if

$r \geq \frac{1}{4}(k+1)(\log k + c)$

for some constant $c > 0,$ then

$e^{-\frac{4jr}{k+1}} \leq k^{-j}e^{-cj},$

and we have

$\mathrm{d}_r \leq \sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} \frac{1}{j!} e^{-cj} \leq \sum\limits_{j=1}^{\infty} \frac{1}{j!} e^{-cj} = e^{e^{-c}}-1,$

a number exponentially close to $0.$ Thus as soon as the number of steps $r$ in lazy simple random walk on $\Gamma$ exceeds the threshold

$\frac{1}{4}(k+1)\log k,$

the distribution of the walk becomes exponentially close to its limiting distribution. On the other hand, if instead

$r \leq \frac{1}{4}(k+1)(\log k - c)$

is below this threshold, then the total variation distance between the distribution of the walk and its limiting distribution is still exponentially close to its initial value of $1.$ Thus the transition from not very random to exponentially close to total randomness happens suddenly, in a neighborhood of the “cutoff” $\frac{1}{4}(k+1)\log k.$
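The closed form $\sum_{j \geq 1} e^{-cj}/j! = e^{e^{-c}}-1$ used above is just the exponential series evaluated at $e^{-c}.$ A quick numerical sanity check (the function name `tail` is ours):

```python
from math import exp, factorial

def tail(c, terms=60):
    """Partial sum of sum_{j>=1} e^{-c*j} / j!, which converges
    (very fast) to exp(exp(-c)) - 1."""
    return sum(exp(-c * j) / factorial(j) for j in range(1, terms + 1))

for c in (0.5, 1.0, 3.0):
    assert abs(tail(c) - (exp(exp(-c)) - 1)) < 1e-12
```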

This so-called “cutoff phenomenon” was discovered by Diaconis and Shahshahani in the context of card shuffles, i.e. random walks on the symmetric group, which we will discuss next. There is no known characterization of the class of Markov chains for which this phenomenon occurs; observe that to establish it for $\mathrm{B}(k)$ we needed complete knowledge of the eigenvalues and eigenvectors of the associated probability operator $P.$ For random walks on groups this information can often be obtained thanks to representation theory, which is a testament to the power of symmetry. For random walks on graphs which are not Cayley graphs of groups, complete knowledge of eigenvalues and eigenvectors is typically unavailable, and effective bounds on the speed of convergence to the stationary distribution are correspondingly much harder to obtain.