Math 202C: Lecture 10

In this Lecture, we analyze simple random walk on the group \mathrm{B}(k) of permutations generated by the disjoint transpositions

\tau_1=(1\ 2), \dots, \tau_k=(2k-1\ 2k).

The corresponding Cayley graph \Gamma=\mathrm{Cay}(\mathrm{B}(k),\{\tau_1,\dots,\tau_k\}) has vertex set \mathrm{B}(k), with \{\rho,\sigma\} an edge if and only if \sigma=\rho\tau_i for some 1 \leq i \leq k.

Simple random walk on \Gamma has an interesting equivalent formulation, known as the Ehrenfest diffusion model, which is used in thermodynamics. The Ehrenfest model consists of two urns, U_1 and U_2, and k balls b_1,\dots,b_k. At time zero, all balls are in urn U_2. At each instant of discrete time thereafter, a ball is chosen uniformly at random and moved to the other urn.

Problem 10.1: Explain the equivalence between Ehrenfest diffusion and simple random walk on \mathrm{B}(k).
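Without spoiling the problem, a quick simulation is consistent with the claimed equivalence. The sketch below encodes an element of \mathrm{B}(k) by recording which of the commuting generators \tau_i appear in it (this identification is exactly what Problem 10.1 asks the reader to justify), runs both processes with the same random choices, and checks that they stay in bijection.

```python
import random

def simulate(k, steps, seed=0):
    """Run Ehrenfest diffusion and random walk on B(k) with the same
    random choices, checking at each step that they stay in bijection."""
    rng = random.Random(seed)
    perm = [0] * k   # perm[i] = 1 iff tau_{i+1} occurs in the current permutation
    urn1 = set()     # indices of balls currently in urn U_1 (all start in U_2)
    for _ in range(steps):
        i = rng.randrange(k)                   # choose ball b_{i+1} / generator tau_{i+1}
        perm[i] ^= 1                           # multiply by tau_{i+1}: toggle that factor
        urn1.symmetric_difference_update({i})  # move ball b_{i+1} to the other urn
        # the walk's position determines the urn configuration, and vice versa
        assert urn1 == {j for j in range(k) if perm[j] == 1}
    return sum(perm)  # number of balls in U_1 = number of generators present

print(simulate(k=10, steps=1000))
```

Note that the number of balls in U_1 after an even number of steps is always even, since each step changes it by exactly one.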

Simple random walk on \Gamma begins with a single particle positioned at the identity permutation \iota. At each tick of the clock, the particle makes a random jump to an adjacent permutation, with equal probability of jumping to any neighboring vertex on the k-regular graph \Gamma. However, because \Gamma is bipartite, the distribution of this random walk does not converge.

Problem 10.2: Prove that \Gamma is a bipartite graph.
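A finite check for small k (not a proof) can be done by attempting a 2-coloring of \Gamma via breadth-first search, again encoding each element of \mathrm{B}(k) as a k-bit mask, with bit i set iff \tau_{i+1} occurs in the product:

```python
from collections import deque

def is_bipartite_cayley(k):
    """Attempt to 2-color Cay(B(k), {tau_1,...,tau_k}); elements of B(k)
    are encoded as k-bit masks, and multiplying by tau_{i+1} flips bit i."""
    color = {0: 0}            # the identity permutation gets color 0
    queue = deque([0])
    while queue:
        v = queue.popleft()
        for i in range(k):    # the k neighbors of v
            w = v ^ (1 << i)
            if w not in color:
                color[w] = 1 - color[v]
                queue.append(w)
            elif color[w] == color[v]:
                return False  # an odd cycle exists: not bipartite
    return True

print(all(is_bipartite_cayley(k) for k in range(1, 8)))  # True
```

The check succeeds because each edge flips exactly one bit, so bit-count parity gives a proper 2-coloring — which is also the idea behind a proof.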

To fix the bipartiteness issue, we replace \Gamma with a new graph \overline{\Gamma}, which differs from \Gamma only by the addition of exactly one loop at each vertex, resulting in a (k+1)-regular graph with the same vertex set. This resolves the bipartiteness issue: by the Markov limit theorem, simple random walk on \overline{\Gamma} converges to uniform measure on \mathrm{B}(k). Note that one can equivalently think of simple random walk on \overline{\Gamma} as a “lazy” random walk on \Gamma, where at each time instant the particle either stays put or jumps to a neighboring vertex, with all k+1 possibilities equally likely.

We proceed to the analysis of simple random walk on \overline{\Gamma}. Let \mathbb{C}\mathrm{B}(k) be the group algebra of \mathrm{B}(k), i.e. the Hilbert space with orthonormal basis

\{\pi \colon \pi \in \mathrm{B}(k)\}

and multiplication given by bilinear extension of the group law

\rho \sigma = \rho\sigma.

The probability operator P \in \mathrm{End} \mathbb{C}\mathrm{B}(k) for lazy simple random walk on \overline{\Gamma} is then implemented as multiplication by

\frac{1}{k+1}\iota + \frac{1}{k+1}\tau_1 + \dots + \frac{1}{k+1}\tau_k,

where \iota \in \mathrm{B}(k) is the identity permutation. In other words, the distribution of the particle at time r is given by the vector P^r\iota. We would like to compare this to the vector

u=\sum\limits_{\pi \in \mathrm{B}(k)} \frac{1}{2^k} \pi,

which corresponds to uniform measure on \mathrm{B}(k). More precisely, we would like to estimate the quantity

\|P^r\iota - u\|_1 = \sum\limits_{\pi \in \mathrm{B}(k)} \left| \langle P^r \iota,\pi\rangle -\frac{1}{2^k}\right|,

which, up to a factor of 1/2, coincides with the total variation distance between the distribution of the particle at time r and the uniform distribution.
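For small k, the vector P^r\iota and the 1-norm above can be computed exactly by brute force. The sketch below again identifies \mathrm{B}(k) with k-bit masks (bit i set iff \tau_{i+1} occurs in the product), so that multiplication by \tau_{i+1} flips bit i:

```python
import numpy as np

def walk_distribution(k, r):
    """Distribution vector P^r iota for lazy simple random walk on B(k),
    with elements of B(k) encoded as k-bit masks."""
    n = 1 << k
    p = np.zeros(n)
    p[0] = 1.0                  # start at the identity permutation iota (mask 0)
    masks = np.arange(n)
    for _ in range(r):
        q = p / (k + 1)         # the lazy "stay put" term
        for i in range(k):
            q += p[masks ^ (1 << i)] / (k + 1)  # jump along tau_{i+1}: flip bit i
        p = q
    return p

k, r = 6, 20
p = walk_distribution(k, r)
l1 = np.abs(p - 1.0 / 2 ** k).sum()  # the 1-norm distance to uniform
print(l1, l1 / 2)                    # the second number is the TV distance
```

Already for k=6 and r=20 the distance is quite small, in line with the estimates derived below.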

We will get an upper bound on the above total variation distance using the eigenvalues and eigenvectors of P. We have ample information concerning the eigenvalues and eigenvectors of the operator P from Lecture 9, where we analyzed the operator A of multiplication by

\tau_1 + \dots + \tau_k,

which is the adjacency operator of \Gamma. We know that the eigenvalues of A are the numbers

k-2j, \quad 0 \leq j \leq k,

and that E_j, the eigenspace of k-2j, has dimension {k \choose j} (an orthogonal basis of E_j is given by the characters \chi_\pi of \mathrm{B}(k) indexed by the permutations \pi with |\pi|=j). Now, it is immediate to see that

P = \frac{1}{k+1}(I + A),

where I \in \mathrm{End}\mathbb{C}\mathrm{B}(k) is the identity operator, so that P has the same eigenspaces as A, and more precisely that P acts in E_j as multiplication by

\lambda_j=1-\frac{2j}{k+1}.
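This spectral description is easy to verify numerically for small k. A sketch, building the matrix of P on the bitmask encoding of \mathrm{B}(k) and checking that its eigenvalues are 1-2j/(k+1) with multiplicity {k \choose j}:

```python
import numpy as np
from math import comb

def transition_matrix(k):
    """Matrix of P acting on the group algebra of B(k), with elements
    of B(k) encoded as k-bit masks (tau_{i+1} flips bit i)."""
    n = 1 << k
    P = np.eye(n) / (k + 1)   # the (1/(k+1)) * iota term
    for i in range(k):
        for v in range(n):
            P[v ^ (1 << i), v] += 1.0 / (k + 1)  # the (1/(k+1)) * tau_{i+1} term
    return P

k = 5
spec = sorted(np.linalg.eigvalsh(transition_matrix(k)))  # P is symmetric
expected = sorted(1 - 2 * j / (k + 1) for j in range(k + 1)
                  for _ in range(comb(k, j)))
print(np.allclose(spec, expected))  # True
```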

In particular, the largest eigenvalue \lambda_0 of P is 1 and it is a simple eigenvalue, which is consistent with Perron-Frobenius, and moreover \lambda_0 is the eigenvalue of largest absolute value, since the smallest eigenvalue is

\lambda_k= 1-\frac{2k}{k+1} > 1- \frac{2k}{k} = -1,

which is also consistent with Perron-Frobenius. We will now plug this information into the famous Upper Bound Lemma of Diaconis and Shahshahani to get an estimate on the total variation distance between the distribution of lazy simple random walk on \overline{\Gamma} at time r, and the uniform distribution on the vertices of \overline{\Gamma}. We skip the proof, and comment only that it follows from combining character expansion with the Cauchy-Schwarz inequality. A proof can be found in the “Harmonic Analysis on Finite Groups” reference text.

Theorem (Upper Bound Lemma): We have

\|P^r\iota - u \|_1^2 \leq \sum\limits_{j=1}^k \lambda_j^{2r} \dim E_j.

where \lambda_j = 1-\frac{2j}{k+1} is the eigenvalue of P acting on the eigenspace E_j. The eigenvalue \lambda_0=1, which corresponds to the trivial character \chi_\iota, is excluded from the sum: its contribution to P^r\iota is exactly the uniform vector u being subtracted on the left hand side.

Let us apply the Upper Bound Lemma. Right off the bat, it tells us that

\|P^r\iota - u\|_1^2 \leq \sum\limits_{j=1}^k {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r},

so we just have to estimate this sum of explicit quantities. Observing that

\left(1-\frac{2j}{k+1}\right)^{2r} = \left(1-\frac{2(k+1-j)}{k+1}\right)^{2r},

we have the estimate

\sum\limits_{j=1}^k {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r} \leq  2\sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r},

and the factor of two which appears is easily handled. Writing \mathrm{d}_r for the total variation distance between the distribution of lazy simple random walk on \overline{\Gamma} at time r and the uniform distribution, we have \|P^r\iota - u\|_1 = 2\mathrm{d}_r, so that the Upper Bound Lemma together with the last estimate gives 4\mathrm{d}_r^2 \leq 2\sum_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r}, i.e.

\mathrm{d}_r^2 \leq \frac{1}{2}\sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} {k \choose j} \left(1-\frac{2j}{k+1}\right)^{2r}.

To estimate the right hand side, first observe that

{k \choose j} = \frac{k(k-1) \dots (k-j+1)}{j!} \leq \frac{k^j}{j!},

so that

\mathrm{d}_r^2 \leq \frac{1}{2}\sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} \frac{k^j}{j!}\left(1-\frac{2j}{k+1}\right)^{2r}.

Using that

e^{-x} = 1-x+\frac{x^2}{2} - \dots \geq 1-x

for x \geq 0, we have

\left(1-\frac{2j}{k+1}\right)^{2r} \leq e^{-\frac{4jr}{k+1}},

so that we have arrived at the upper bound

\mathrm{d}_r^2 \leq \frac{1}{2}\sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} \frac{k^j}{j!}e^{-\frac{4jr}{k+1}}.

Now observe that if r is large enough that the exponentially small factor e^{-\frac{4jr}{k+1}} defeats the polynomially large factor k^j even for small values of j, then the distance \mathrm{d}_r will be very small. More precisely, if

r \geq \frac{1}{4}(k+1)(\log k + c)

for some constant c > 0, then

e^{-\frac{4jr}{k+1}} \leq k^{-j}e^{-cj},

and we have

\mathrm{d}_r^2 \leq \frac{1}{2}\sum\limits_{j=1}^{\lfloor \frac{k+1}{2} \rfloor} \frac{1}{j!} e^{-cj} \leq \frac{1}{2}\sum\limits_{j=1}^{\infty} \frac{1}{j!} e^{-cj} = \frac{1}{2}\left(e^{e^{-c}}-1\right),

a number exponentially small in c; for large c the right hand side is approximately \frac{1}{2}e^{-c}, so \mathrm{d}_r is of order e^{-c/2}. Thus as soon as the number of steps r in lazy simple random walk on \overline{\Gamma} exceeds the threshold

r=\frac{1}{4}(k+1)\log k,

the distribution of the walk becomes exponentially close to its limiting distribution. On the other hand, if instead
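To get a feel for how abruptly this happens, one can evaluate the right hand side of the Upper Bound Lemma directly, without any of the intermediate estimates. A sketch, sampling the bound at multiples of the threshold:

```python
from math import comb, log

def ubl_rhs(k, r):
    """Right hand side of the Upper Bound Lemma: the sum over j >= 1 of
    C(k, j) * (1 - 2j/(k+1))^(2r), which dominates ||P^r iota - u||_1^2."""
    return sum(comb(k, j) * (1 - 2 * j / (k + 1)) ** (2 * r)
               for j in range(1, k + 1))

k = 200
threshold = (k + 1) * log(k) / 4  # the cutoff time, roughly 266 for k = 200
for t in (0.8, 1.0, 1.2, 1.5):
    r = int(t * threshold)
    print(r, ubl_rhs(k, r))
```

Below the threshold the bound is uninformative (larger than 1), and just above it the bound collapses toward zero.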

r \leq \frac{1}{4}(k+1)(\log k - c)

is below this threshold, then the total variation distance between the distribution of the walk and its limit distribution is still exponentially close to its initial value of 1. Thus the transition from not very random to exponentially close to total randomness happens suddenly, in a neighborhood of the “cutoff” \frac{1}{4}(k+1)\log k.
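The cutoff is visible in an exact computation. Since the walk treats the k generators symmetrically, the distribution P^r\iota is constant on permutations of equal length, so it suffices to track the number of generators present — which is precisely the Ehrenfest urn count from the beginning of the lecture. A sketch of the resulting O(kr) computation of \mathrm{d}_r:

```python
from math import comb, log

def tv_distance(k, r):
    """Exact TV distance to uniform for lazy simple random walk on B(k),
    computed via the projected chain on w = number of generators present
    (the Ehrenfest urn count)."""
    p = [0.0] * (k + 1)
    p[0] = 1.0                            # start at the identity: w = 0
    for _ in range(r):
        q = [pw / (k + 1) for pw in p]    # lazy step: stay put
        for w in range(k + 1):
            if w > 0:
                q[w - 1] += p[w] * w / (k + 1)        # undo one of the w factors
            if w < k:
                q[w + 1] += p[w] * (k - w) / (k + 1)  # switch on a new factor
        p = q
    # states of equal weight are equiprobable, so TV reduces to the weights
    return 0.5 * sum(abs(p[w] - comb(k, w) / 2 ** k) for w in range(k + 1))

k = 100
cutoff = (k + 1) * log(k) / 4             # roughly 116 for k = 100
for r in (int(0.5 * cutoff), int(cutoff), int(2.0 * cutoff)):
    print(r, round(tv_distance(k, r), 4))
```

Well below the cutoff the distance is still essentially 1, and well above it the distance is essentially 0, with the drop concentrated near r = \frac{1}{4}(k+1)\log k.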

This so-called “cutoff phenomenon” was discovered by Diaconis and Shahshahani for card shuffles, i.e. random walks on the symmetric group, which we will discuss next. There is no known characterization of the class of Markov chains for which this phenomenon occurs; observe that to establish it for \mathrm{B}(k) we needed total knowledge of the eigenvalues and eigenvectors of the associated probability operator P. For random walks on groups this information can often be obtained thanks to representation theory, which is a testament to the power of symmetry. For random walks on graphs which are not Cayley graphs of groups, complete knowledge of eigenvalues and eigenvectors is typically unavailable, and effective bounds on the speed of convergence to the stationary distribution are correspondingly much harder to obtain.
