Math 202C: Lecture 1

Since this is a course in applied algebra, let us begin with a real-world problem. Consider a deck of playing cards numbered 1,2,\dots,n arranged in a single row on a table. At each instant of discrete time, choose two cards independently at random and swap them. How long until you can be sure that the order of the cards is uniformly random?

As with every applied problem, in order to get started we have to think about how best to formulate the above as a precise mathematical question. This is what it means to “model” a real world problem.

First, each of the possible configurations of cards can be uniquely encoded as a permutation of the numbers 1,\dots,n — just write down the labels of the cards one at a time, from left to right, and you obtain a permutation written of 1,\dots,n written in one-line notation. This is a good first step: we have identified the state space of the system we wish to understand, and it is the set \mathrm{S}(n) of permutations of 1,\dots,n.

Next, we need to make precise the way in which our system evolves in time, i.e. transitions from one state to another at each time instant. If the configuration of cards at time t is encoded by the permutation \pi \in \mathrm{S}(n), what are the possible states that our system can be in at time t+1? An obvious but in fact crucially important fact is that the answer to this question is not just combinatorial, but also algebraic: our configuration space \mathrm{S}(n) is not just a set, it is a group, the symmetric group of rank n. This means that the evolution equation we seek can be represented as multiplication in this group: given that the configuration of cards at time t is the permutation \pi the possible states at time t+1 are precisely the permutations

\sigma = \pi(x\ y), \quad 1 \leq x < y \leq n,

where (x\ y) denotes the transposition in \mathrm{S}(n) which interchanges the numbers x,y \in \{1,\dots,n\}. This is a good time to fix our notational conventions. Permutations in \mathrm{S}(n) are bijective functions \{1,\dots,n\} \to \{1,\dots,n\}, and multiplication of permutations means composition of functions. The concatenation of two permutations \pi_1,\pi_2 means the following:

\pi_1\pi_2 := \pi_2 \circ \pi_1.

Although this looks messy, the reason for taking this convention rather than the seemingly more natural one \pi_2\pi_1=\pi_2 \circ \pi_1 is that we are more interested in doing computations “upstairs” where permutations are elements in a group, as opposed to “downstairs” where they are functions acting on points, and it is nice to multiply strings of elements from left to right.

It is useful to take the above one step further, and think of the symmetric group \mathrm{S}(n) as not just an algebraic object, but a geometric one, in the manner of geometric group theory. To do this, first solve the following problem.

Problem 1: The set \mathrm{T} \subset \mathrm{S}(n) of all transpositions \tau=(x\ y),\ 1 \leq x < y \leq n generates S(n). That is, for any permutation \pi \in \mathrm{S}(n), there exists a nonnegative integer k and transpositions \tau_1,\dots,\tau_k \in \mathrm{T} such that

\pi = \tau_1 \tau_2 \dots \tau_k.

With the solution to Problem 1 under our belts, we can define the word norm on \mathrm{S}(n) corresponding to the generating set \mathrm{T} by

\|\pi\|_T = \min \{k \in \mathbb{Z}_{\geq 0} \colon \pi=\tau_1\dots\tau_k,\ \tau_i\in \mathrm{T}\}.

Problem 2: Prove that \|\pi\sigma\|_{\mathrm{T}} \leq \|\pi\|_{\mathrm{T}} + \|\sigma\|_{\mathrm{T}}.

Once we have a norm on \mathrm{S}(n), meaning a way to measure the “size” of a permutation, it is very natural to define a corresponding notion of distance between two permutations by

\mathrm{d}_\mathrm{T}(\pi_1,\pi_2) = \|\pi_1^{-1}\pi_2\|_\mathrm{T}.

Problem 3: Check that the pair (\mathrm{S}(n),\mathrm{d}_\mathrm{T}) satisfies the metric space axioms.

To summarize, the state space \mathrm{S}(n) of the system we wish to understand is not just a set, but a group, and further not just a group, but a metric space. It is nice to have a pictorial representation of the metric space (\mathrm{S}(n),\mathrm{d}_T), and this is afforded by the corresponding Cayley graph \Gamma(\mathrm{S}(n),\mathrm{T}). The vertex set of \Gamma(\mathrm{S}(n),\mathrm{T}) is \mathrm{S}(n), and two vertices \pi,\sigma are adjacent if and only if there exists \tau \in \mathrm{T} such that \sigma = \pi\tau. Below is a picture of the Cayley graph of \mathrm{S}(4) as generated by \mathrm{T}=\{(1\ 2), (1\ 3), (2\ 3), (1\ 4),(2\ 4),(3\ 4)\}.

Cayley graph of S(4) generated by transpositions.

Let us note a few basic features of this graph. First, it is {n \choose 2}-regular, because |\mathrm{T}|={n \choose 2}. Second, it is a graded graph, meaning that its vertex set decomposes as a disjoint union of independent sets,

\mathrm{S}(n) = \bigsqcup_{k=0}^{n-1} L_k,

where L_k is the set of permutations whose disjoint cycle decomposition consists of n-k cycles. Third, from the geometric perspective, the kth level L_k of the Cayley graph is the sphere of radius k in \mathrm{S}(n) centered at the identity permutation,

L_k=\{ \pi \in \mathrm{s}(n) \colon \|\pi\|_T=k\}.

Problem 4: Prove that \max \{\mathrm{d}_\mathrm{T}(\pi,\sigma) \colon \pi,\sigma \in \mathrm{S}(n)\}=n-1.

At this point we have quite a clear and detailed understanding of the system we want to analyze: the state space of the system is the Cayley graph of \mathrm{S}(n) as generated by the set \mathrm{T} of transpositions, and the evolution occurring is discrete time random walk on this graph. Now we have to make precise what exactly we mean by “random walk” in this sentence.

The most natural thing to consider is simple random walk: at time zero, we have a particle situated at the identity permutation on \Gamma(\mathrm{S}(n),\mathrm{T}), and at each instant of discrete time the particle jumps to a neighboring vertex, with equal probability of selecting any neighbor. We would like to understand how many jumps must be made before the probability to observe the particle at any vertex of the graph is the same as at any other vertex. This is what it means for a deck of cards to be shuffled: every configuration is equally likely, so that a person choosing a card from the deck cannot do anything more than guess what this card is going to be. Now, there is an immediate problem: this will never happen. The reason is annoying but unavoidable: every permutation is either even or odd, meaning that for any \pi \in \mathrm{S}(n), all walks from the identity to \pi have either an even or an odd number of steps. So, at even times, the particle must be situated in the alternating group \mathrm{A}(n), and at odd times in its complement. But actually, this foible reminds us that we are trying to solve a real world problem in which two cards are selected independently and then swapped — it is possible that the two cards selected are actually the same, and no swap occurs.

Problem 5: Prove that the same card, no-swap event occurs with probability \frac{2}{n+1}.

Thus, what we really want to understand is the lazy random walk on \mathrm{S}(n), where at each instant the particle either stays put with the above probability, or jumps to an adjacent vertex, with equal probability of jumping in any direction. Analyzing this model, Persi Diaconis and Mehrdad Shahshahani were able to prove the following landmark result in applied algebra.

Theorem (Diaconis-Shahshahani): A deck of cards is fully shuffled after \frac{1}{2}n \log n transpositions.

Although the Diaconis-Shahshahani result is ostensibly a theorem in probability theory, one may legitimately refer to it as a theorem of applied algebra: its proof uses everything you learned about the representation theory of the symmetric groups you learned in Math 202B. Moreover, as far as the author knows, there is no way to obtain this result without using representation theory, i.e. by purely probabilistic reasoning.

In this course, we will use simple stochastic processes like card shuffling as motivating questions whose solution leads to the development of harmonic analysis on finite groups. This is a fascinating application of algebra, and taking this route leads not just through the character theory of finite groups, but all the way to Gelfand pairs, spherical functions, and other related algebraic constructions. Even though you already know the representation theory of \mathrm{S}(n), so that in principle we could start with the Diaconis-Shahshahani theorem, we will instead begin with the analysis of random walks on much simpler groups, such as the discrete circle and hypercube. This motivates the development of character theory for finite abelian groups, where representations aren’t particularly meaningful, and sets the stage nicely for the harmonic analysis interpretation of the representation theory of general finite groups.

1 Comment

Leave a Reply