Math 202C: Lecture 2

*** Problems in this lecture due April 10 at 23:59 ***

In this lecture, we remain in the setting of the symmetric group S=\mathrm{Aut}\{1,\dots,n\} but we will change the geometry on this group. More precisely, instead of identifying S with its Cayley graph as generated by the set of adjacent transpositions, we will identify the symmetric group with its Cayley graph as generated by the full conjugacy class of transpositions,

T = \{(s\ t) \in S \colon 1 \leq s < t \leq n\}.

Let \mathrm{d} denote the corresponding metric on S, so that \mathrm{d}(\rho,\sigma) is the minimal number r such that there exist transpositions \tau_1,\dots,\tau_r \in T with

\sigma = \rho\tau_1 \dots \tau_r.

As before, write |\pi|=\mathrm{d}(\iota,\pi) for the distance from the identity permutation \iota to a given permutation \pi.

Problem 2.1. Prove that |\pi|=n-\mathrm{cyc}(\pi), where \mathrm{cyc}(\pi) is the number of factors in the disjoint cycle decomposition of \pi.

Identifying S with \mathrm{Cay}(S,T), the symmetric group becomes a graded graph: it decomposes into a disjoint union

S = L_0 \sqcup L_1 \sqcup \dots \sqcup L_{n-1}

in which

L_r = \{\pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathsf{cyc}(\pi)=n-r\}

is the sphere of radius r centered at \iota, and L_r is an independent set, meaning that no two of its points are adjacent vertices. Moreover, if \rho,\sigma \in S are adjacent vertices, then we have \rho \in L_r and \sigma \in L_s with |r-s|=1. The decomposition of a group into concentric spheres is a general thing that happens whenever one chooses a generating set; a special feature of this particular situation is that each sphere further decomposes into a disjoint union of conjugacy classes,

L_r = \bigsqcup\limits_{\substack{\alpha \vdash n \\ \ell(\alpha)=n-r}}C_\alpha.

Here, we are indexing the conjugacy classes of S by the set \Lambda of partitions of n. More precisely, the notation \alpha \vdash n means that \alpha is a decomposition of n into a sum of \ell(\alpha) positive integers, without regard to order of the terms. Every permutation \pi \in S gives a partition \mathsf{type}(\pi)\vdash n obtained from the lengths of the cycles in its disjoint cycle decomposition; conversely, for any partition \alpha \vdash n the corresponding fiber C_\alpha =\mathsf{type}^{-1}(\alpha) is a conjugacy class in S.

Let us now construct the analogue of Mallows measure corresponding to the all-transpositions geometry. This is called the Ewens measure on permutations: by definition, it is the Gibbs measure

\mathbb{P}(\sigma) = \frac{q^|\sigma|}{Z} = \frac{q^{n-\mathsf{cyc}(\sigma)}}{Z}

corresponding to a statistical mechanical system whose states \sigma are permutations of 1,\dots,n with the potential energy of \sigma being inversely proportional to its number of cycles. The partition function is

Z= \sum\limits_{\sigma \in S} q^{|\sigma|} = q^n\sum\limits_{\sigma \in S} q^{-\mathsf{cyc}(\sigma)},

or equivalently

Z=\sum\limits_{k=0}^n \begin{bmatrix} n \\ k \end{bmatrix} q^{-k},

where \begin{bmatrix} n \\ k \end{bmatrix} is the Stirling number of the first kind, which counts permutations in S=\mathrm{Aut}\{1,\dots,n\} whose disjoint cycle factorization has k components. This in turn gives

Z= (1+q)(1+2q) \dots (1+(n-1)q),

corresponding to the fact that Stirling Numbers of the first kind are the coefficients of the falling factorial polynomial.

Problem 2.2. Show that the expected number of cycles in a random permutation under Ewens measure is

1 + \sum\limits_{j=1}^{n-1} \frac{1}{1+jq}.

Deduce from this that the expected number of cycles in a uniformly random permutation of 1,\dots,n is the harmonic number

H_n=1+\frac{1}{2} + \dots + \frac{1}{n}.

The all-transpositions geometry on permutations has many features which make it easier to understand than the adjacent-transpositions geometry. Perhaps the most significant of these is that it is “Euclideanizable.” Right, that’s not a real word – here is what is meant more precisely.

Consider the full forward cycle \gamma=(1\ 2\ \dots\ n) in the symmetric group S. In the all-transpositions geometry, we have |\gamma|=n-1, and the number of geodesics \iota \to \gamma is \mathrm{Cay}_{n-1}, where

\mathrm{Cay}_n = (n+1)^{n-1}

is the nth Cayley number. Taking this formula for granted, we can calculate the number of geodesics from the identity to any permutation (hence the number of geodesics between any two permutations) as follows.

Problem 2.3. Let \pi be any permutation in the conjugacy class C_\alpha of S. Show that the number of geodesics \iota \to \pi is

{n \choose \alpha_1-1,\dots,\alpha_\ell-1} \prod\limits_{k=1}^\ell \mathrm{Cay}_{\alpha_k-1},

where \ell=\ell(\alpha).

To Euclideanize the all-transpositions geometry on the symmetric group, we give a rule which allows one to systematically select a distinguished geodesic between every pair of points. To do so, let us label the edges of \mathrm{Cay}(S,T) as follows: we mark each edge corresponding to the transposition (s\ t) with t, the larger of the two numbers swapped. Thus, emanating from every vertex of the graph we have one 2-edge, two 3-edges, three 4-edges, etc. Call a walk on the Cayley graph strictly monotone if the labels of the edges it traverses form a strictly increasing sequence. Thus, an r-step strictly monotone walk \rho \to \sigma corresponds to an equation of the form

\sigma=\rho(s_1\ t_1) \dots (s_r\ t_r), \quad 2\leq t_1 < \dots < t_r \leq n

in S.

Problem 2.4. Prove that for any \pi \in S, there exists a unique strictly monotone walk \iota \to \pi, and that the length of this walk is |\pi|. Conclude that there exists a unique strictly monotone walk between every pair of points in S, and that this walk is a geodesic.

This more or less marks the end of my efforts to get you thinking about the combinatorics of permutations; after this we will move on to the algebraic aspects of the symmetric group and its representations. The campaign for the elementary beauty and unexpected subtlety of permutations met with mixed success: there were interesting follow-up discussions with some students, whereas other pairs of eyes glazed over as the minds behind them fell into the “I already know this stuff” trap, which is one the of the more pernicious lies you can tell yourself.

Here is one final push. Our ever-curious Lani was recently asking about posets, and whether there is any active research being conducted on them these days. The answer is definitely “yes,” and though I don’t really work in the area I do have one burning question about posets which keeps me up at night. The question is about finding a partial order on permutations of a fixed genus.

From our Euclideanized perspective, genus is a metric invariant which measures “flatness” of triangles S as generated by the full conjugacy class of transpositions. More precisely, let \pi,\rho,\sigma be three distinct permutations in S=\mathrm{Aut}\{1,\dots,n\}. There are two oriented strictly monotone triangles formed by these three points, namely \rho \to \pi \to \sigma \to \rho and \sigma \to \pi \to \rho \to \sigma. It may happen \pi lies on a geodesic \rho \to \sigma, in which case both of these triangles are flat, and have perimeter

|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma|.

In general, the oriented triangle \rho \to \pi \to \sigma \to \rho is a cycle in a bipartite graph, so necessarily has even perimeter: we have

|\pi^{-1}\rho| + |\rho^{-1}\sigma| + |\sigma^{-1}\pi| = 2|\rho^{-1}\sigma| + 2g(\rho,\pi,\sigma)

for some nonnegative integer g(\rho,\pi,\sigma) which we call the genus of the triple (\rho,\pi,\sigma). At the other extreme from flatness, it may be that \pi,\rho,\sigma are pairwise antipodal points, i.e. the distance between each pair of them is the diameter of S, in which case the above equation becomes

3(n-1)=2(n-1)+2g(\rho,\pi,\sigma),

giving a maximal genus of

g(\pi,\rho,\sigma) = \lfloor \frac{n-1}{2} \rfloor.

Problem 2.5. Show that a triple of pairwise antipodal points in S exists if and only if n is odd.

Since

g(\rho,\pi,\sigma)=g(\iota,\rho^{-1}\pi,\rho^{-1}\sigma),

it is natural to view genus as a function g(\pi,\sigma):=g(\iota,\pi,\sigma) on pairs of permutations. Going even further, let us fix the second argument to be \sigma=\gamma, where \gamma=(1\ 2\ \dots\ n) is the full forward cycle, and define the genus of a permutation by g(\pi):=g(\pi,\gamma). Then, we obtain a corresponding decomposition of the symmetric group

S = \bigsqcup\limits_{k=0}^{\lfloor \frac{n-1}{2} \rfloor} S_g

in which S_g is the set of permutations of genus g. The set S_0 of “planar” permutations consists precisely of those \pi which lie on a geodesic \iota \to \gamma, and these permutations have a special feature: the disjoint cycle decomposition of any \pi \in S_0 is a noncrossing partition of \{1,\dots,n\}, and in fact this map from genus zero permutations into noncrossing permutations is bijective. There is also a very natural partial order on S_0 which can be described as follows: we have \pi \leq \sigma if and only if \pi and \sigma lie on a common geodesic \iota \to \gamma, and a particle emanating from \iota and traveling along this geodesic passes through \pi before passing through \sigma. This is a special case of the geodesic partial order which can be defined relative to any two vertices of a connected graph, and in this case it coincides exactly with the reverse refinement order on noncrossing partitions.

Now the question is: what is the higher genus analogue of this partial order? How can we define a poset structure by some uniform recipe which feels good for all values of g, subject to the condition that our recipe reverts to the above at g=0? This is an open question. The most natural thing that I can think of is the following: given two permutations \pi,\sigma \in S_g, we declare \pi \leq \sigma if and only if they lie on a common genus g triangle two of whose vertices are \iota and \gamma, and a particle emanating from \iota and traveling along the perimeter of this triangle encounters \pi before encountering \sigma. Whether or not this actually defines a partial order on S_g is not clear to me, but my fluid intelligence is declining from its peak while yours is cresting. So, if you want to discuss further please feel free to get in touch.

Math 202C: Lecture 1

***Problems in this lecture due 23:59 on April 8***

Let G=(S,T) be a finite, connected, simple graph with vertex set S=\{\pi,\rho,\sigma,\dots\} and edge set T \subseteq {S \choose 2}. A basic (meaning fundamental, not necessarily easy) problem is to compute the number W^r(\rho,\sigma) of r-step walks from \rho to \sigma in G.

Problem 1.1. Let G be the complete graph on S=\{1,\dots,N\}, whose vertex set is T={S \choose 2}. Using nothing but your brain, find a formula for W^r(\rho,\sigma). If you find this problem intractable or, worse, uninteresting, consider a degree in law or medicine.

Math 202A provides an algebraic encoding of graphs. Let \mathcal{C}(S) be the Hilbert space with orthonormal basis |\rho\rangle,|\pi\rangle,|\sigma\rangle,\dots. Let \langle \rho|,\langle \pi|,\langle \sigma|,\dots be the dual basis of \mathrm{Hom}(\mathcal{C}(S),\mathbb{C}), so that

\langle \rho | \sigma \rangle = \langle \rho,\sigma \rangle = \delta_{\rho\sigma}.

Does using Dirac notation make this an applied math course? The question is vacuous since there is no such thing as non-applied math, but there are cultural differences between fields which are reflected in their iconography, and hopefully you can at least read a paper in quantum computing at the end of this sequence.

The adjacency operator K\in \mathrm{End}\mathcal{C}(S) acts on the vertex basis of \mathcal{C}(S) by creating a flat superposition of all vertices adjacent to a given input,

K|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \{\rho,\sigma\} \in T}} |\sigma\rangle.

Equivalently, the matrix elements of K in the vertex basis are

\langle \sigma | K |\rho\rangle=\begin{cases} 1, \text{ if } \{\rho,\sigma\} \in T\\ 0, \text{ if } \{\rho,\sigma\} \not\in T\end{cases}.

Thus, the selfadjoint operator K \in \mathrm{End}\mathcal{C}(S) encodes the edge set T \subseteq {S\choose 2}.

For example, consider the complete graph G=(S,T) on the vertex set S=\{1,\dots,N\}. The computational basis in \mathcal{C}(S) consists of the vectors

|1\rangle, |2\rangle,\dots,|N\rangle,

upon which A acts according to

K| i\rangle = \sum\limits_{j \neq i} |j\rangle.

Problem 1.2. Show that W^r(\rho,\sigma) = \langle \rho| K^r|\sigma\rangle. In particular, show that the trace of K^r equals the total number of loops (i.e. closed walks) of length r in G.

The linear algebra approach to counting walks in graphs is to find a basis of \mathcal{C}(G) on which K acts diagonally.

Problem 1.3. Implement this approach for the complete graph and recover your barehands solution to Problem 1.1.

The mathematical subject which uses eigenvalues of linear operators to analyze graphs is called spectral graph theory. In particular, the spectrum of the adjacency operator K on a given graph G=(S,T) is called the adjacency spectrum of G. A couple of facts worth mentioning: if G is bipartite then the spectrum of K is symmetric about 0; if G is degree-regular then the largest eigenvalue of K is the degree of any vertex.

Instead of asking how many walks there are between two given vertices, we may ask how far apart they are. This information is encoded in the distance operator D \in \mathrm{End}\mathcal{C}(S), whose matrix elements in the vertex basis tabulate geodesic distance in G,

\langle \sigma | D | \rho\rangle = \mathrm{d}(\rho,\sigma).

Ron Graham made the striking discovery that the determinant of the distance operator on a tree depends only on the cardinality of its vertex set. Distance eigenvalues of graphs have been much studied ever since. I found some recent interesting work on distance operators via Johann Birnick’s website; Johann is the TA for this course.

One can in a sense interpolate between K and D. To do so, we consider operators L_0,L_1,L_2,\dots \in \mathcal{C}(S) corresponding to metric level sets in G=(S,T), meaning that

\langle \sigma | L_r |\rho \rangle = \begin{cases} 1, \text{ if }\mathrm{d}(\rho,\sigma)=r \\ 0, \text{ otherwise}\end{cases}.

Equivalently,

L_r|\rho\rangle = \sum\limits_{\substack{\sigma \in S \\ \mathrm{d}(\rho,\sigma)=r}} |\sigma\rangle

is the sum of all points on the sphere of radius r centered at \rho. We have that L_0=I is the identity operator, L_1 = K is the adjacency operator, and

D=\sum\limits_{r=0}^\infty rL_r.

This is in fact a finite sum, since L_r is the zero operator for all r exceeding the diameter of G.

One can also consider the exponential adjacency operator

\Psi_t= \sum\limits_{r=0}^\infty \frac{t^r}{r!} K^r =: e^{tK}

on G, whose matrix elements are exponential generating functions enumerating walks between vertices by length,

\langle \sigma | \Psi_t |\rho \rangle = \sum\limits_{r=0}^\infty \frac{t^r}{r!} W^r(\rho,\sigma).

For regular graphs this is essentially the heat kernel, which has been much studied. The exponential distance operator,

\Omega_q = \sum\limits_{r=0}^\infty q^r L_r,

is the element-wise exponential of the distance operator,

\langle \sigma | \Omega_q |\rho\rangle= q^{\mathrm{d}(\rho,\sigma)},

and unlike \Psi_t it is a relative newcomer in spectral graph theory.

Looking at the power series which define \Psi_t and \Omega_q, one sees that the level set operators L_0,L_1,L_2,\dots are to \Omega_q as powers K^0,K^1,K^2,\dots of the adjacency operator are to \Psi_t. While the first two elements of these sequences are the same, a big flaw in this symbolic analogy is that the subalgebra of \mathrm{End}\mathcal{C}(S) generated by the (finite) set of selfadjoint operators \{L_0,L_1,L_2,\dots\} is highly non-trivial, whereas the subalgebra of \mathrm{End}\mathcal{C}(S) generated by the single selfadjoint operator K simply consists of polynomials in this operator, which is but a small sector of \mathrm{Alg}(L_0,L_1,L_2,L_3,\dots). In particular, it is not at all clear what relations exist between L_r and L_s, or even whether they commute. We will have more to say about this later.

The exponential distance operator \Omega_q encodes the same information as the distance operator D in a manner which interfaces more readily with the formalism of statistical mechanics. More precisely, let us write q=e^{-\beta} so that

\langle \sigma | \Omega_q | \rho \rangle = e^{-\beta \mathrm{d}(\rho,\sigma)}.

Then, fixing a root vertex \rho, the \rhoth row of \Omega_q defines Gibbs measure (after Josiah, not Amelia) on the vertices of G according to

\mathbb{P}(\sigma) = \frac{1}{Z}e^{-\beta |\sigma|}.

Physically, this distribution corresponds to a thermodynamic ensemble whose states are the vertices of G, with \beta>0 the inverse temperature and the potential energy |\sigma| of a state \sigma being equal to its distance |\sigma|:=\mathrm{d}(\rho,\sigma) to the root vertex \rho in G. The partition function

Z= \sum\limits_\sigma q^{|\sigma|} = \sum\limits_{\sigma} e^{-\beta\mathrm{d}(\rho,\sigma)}

sums the energy over all states, ensuring that \mathbb{P} is a probability measure on S. Its logarithm \log Z is the free energy of our ensemble, and

q \frac{\partial}{\partial q} \log Z = \frac{1}{Z} \sum\limits_{\sigma} |\sigma|q^{|\sigma|}=\sum\limits_\sigma |\sigma| \mathbb{P}(\sigma)

is precisely the average distance to the root vertex \rho under the non-uniform distribution \mathbb{P}, which becomes the uniform distribution on states in the high-temperature limit \beta \to 0 limit, where q=e^{-\beta} \to 1.

Let us try this out on a toy example, the complete graph G=(S,T) on S=\{1,\dots,N\}. Take the root vertex to be 1, so that the potential energy of any other vertex \sigma \in \{1,\dots,N\} is

|\sigma| = \mathrm{d}(1,\sigma)=1.

Thus, under the corresponding Gibbs measure on states S=\{1,\dots,N\} we have

\mathbb{P}(1) = \frac{1}{Z}

and

\mathbb{P}(\sigma) = \frac{q}{Z}, \quad \sigma \neq 1,

where the partition function is

Z= q^0+q^1+ \dots + q^1 = 1+(N-1)q.

We thus have

q\frac{d}{dq} \log Z = \frac{(N-1)q}{1+(N-1)q},

and setting q=1 we recover the obvious fact that the average distance to 1 under the uniform measure on the complete graph is \frac{N-1}{N}. Indeed, for the uniform measure on the complete graph the random variable \sigma \mapsto |\sigma| has a Bernoulli distribution on \{0,1\} which assigns probability \frac{1}{N} to 0 and probability \frac{N-1}{N} to 1.

Linear operators on graphs fall under the broad umbrella of Math 202A, which treated linear maps between finite-dimensional Hilbert spaces. Math 202B gives us additional algebraic tools which apply to graphs whose vertices form a group; these are officially known as Cayley graphs.

Let S be a finite group and T \subset S a symmetric set of generators in G. Symmetric means that \tau \in T implies \tau^{-1} \in T, and saying that T generates S means that for any \sigma \in S there exists r \in \mathbb{N}_0 and \tau_1,\dots, \tau_r \in T such that \sigma=\tau_1\tau_2 \dots \tau_r. The Cayley graph \mathrm{Cay}(S,T) has vertex set S, with \{\rho,\sigma\} an edge if and only if \sigma = \rho\tau for some \tau \in T, which is then necessarily unique. The fact that T is symmetric means that this is a well-defined as an undirected graph without multiple edges, and the fact that T generates S means that the graph is connected. If T does not contain the identity element \iota then no vertex is adjacent to itself, and we henceforth assume this to be the case. We see that for Cayley graphs G=\mathrm{Cay}(S,T), the edge set can effectively be viewed as a subset of the vertex set, since the generating T \subseteq S determines the adjacency relation on vertices.

For a Cayley graph G=\mathrm{Cay}(S,T), the space \mathcal{C}(S) carries not just a scalar product but also a vector product, namely convolution: we have

|\pi\rangle|\sigma\rangle = |\pi \sigma\rangle, \quad \pi,\sigma \in S,

where \pi\sigma is the group product of \pi and \sigma. This gives us a new point of view on the adjacency operator K on G=\mathrm{Cay}(S,T). For any subset P \subseteq S, let

|P\rangle = \sum\limits_{\pi \in P} |\pi\rangle

be the corresponding element of \mathcal{C}(S). Thus in particular

\langle P|Q\rangle = |P \cap Q|.

Let us allow the following extremely useful abuse of notation: we will also identify P \subset S with its image in the (right) regular representation of \mathcal{C}(S). Thus,

P|\rho \rangle = |\rho\rangle |P\rangle = \sum\limits_{\pi \in P} |\rho\pi\rangle.

In particular, T is both our chosen generating set of S, and also its image in the regular representation is the adjacency operator on \mathrm{Cay}(S,T),

T|\rho\rangle = |\rho\rangle |T\rangle = \sum\limits_{\tau \in T} |\rho\tau\rangle.

More generally, let

L_r = \{ \pi \in S \colon |\pi|=r\} = \{\pi \in S \colon \mathrm{d}(\iota,\pi)=r\}

be the sphere of radius r centered at the identity element. This set becomes a vector in \mathcal{C}(S) via quantization,

|L_r\rangle = \sum\limits_{\pi \in L_r} |\pi\rangle,

and this vector becomes an operator in \mathrm{End}\mathcal{S} via the regular representation,

L_r|\rho\rangle = |\rho\rangle |L_r\rangle = \sum\limits_{|\pi|=r} |\rho\pi\rangle.

The matrix elements of the operator L_r are

\langle \sigma | L_r |\rho \rangle = \sum\limits_{|\pi|=r} \langle \sigma |\rho\pi\rangle,

and

\langle \sigma|\rho\pi\rangle\neq 0 \iff \sigma=\rho\pi \iff \rho^{-1}\sigma=\pi,

in which case

\mathrm{d}(\rho,\sigma)=|\rho^{-1}\sigma|=|\pi|=r.

Thus, the image of |L_r\rangle in the (right) regular representation is indeed the metric level set operator L_r.

To sum up, the adjacency eigenvalues of the Cayley graph \mathrm{Cay}(S,T) are the eigenvalues of |T\rangle \in \mathcal{C}(S) acting in the regular representation of \mathcal{C}(S). This means that the spectral graph theory of Cayley graphs is part of the representation theory of finite groups. If we are in the incredibly fortuitous situation where G is an abelian group, this comes down to using the discrete Fourier transform.

Problem 1.4. Let C be the cyclic group of order n with generator \gamma, and take the symmetric generating set D=\{\gamma,\gamma^{-1}\}. Then \mathrm{Cay}(C,D) is a cycle graph with n vertices. Find a formula for W^r(\rho,\sigma), the number of r-step walks between two given vertices of a cycle graph.

Concerning distance in Cayley graphs \mathrm{Cay}(S,T), we by definition have that \mathrm{d}(\rho,\sigma) is equal to the minimal value of r such that there exist \tau_1,\dots,\tau_r such that

\sigma = \rho\tau_1 \dots \tau_r.

Problem 1.5. Show that for any \pi,\rho,\sigma \in S we have \mathrm{d}(\pi\rho,\pi\sigma)=\mathrm{d}(\rho,\sigma), and explain why the same cannot in general be said for \mathrm{d}(\rho\pi,\sigma\pi). Show that \mathrm{d}(\rho,\sigma) = \mathrm{d}(\iota,\rho^{-1}\sigma) with \iota the group identity.

Now let us consider a nonabelian example: our group is S=\mathrm{Aut}\{1,\dots,n\}, the symmetric group consisting of all bijections \pi \colon \{1,\dots,n\} \to \{1,\dots,n\}, which are generally called permutations. We will take the group law to be \pi\sigma := \sigma \circ \pi, which is the convention used in most computer algebra systems, or so I am told. Take the generating set

T = \{(1\ 2),(2\ 3),\dots,(n-1\ n)\}

of adjacent transpositions.

Problem 1.6. Draw \mathrm{Cay}(S,T) for n=3.

Set |\pi|=\mathrm{d}(\iota,\pi) with \iota the identity permutation.

Now consider the metric Gibbs measure on the Cayley graph \mathrm{Cay}(S,T) of the symmetric group S=\mathrm{Aut}\{1,\dots,n\} corresponding to the generating set T=\{(1\ 2),(2\ 3),\dots,(n-1\ n)\} of adjacent transpositions. We take the root vertex to be \iota, the identity permutation in S. Thus,

\mathbb{P}(\sigma) = \frac{q^{|\sigma|}}{Z},

where |\sigma|=\mathrm{d}(\iota,\sigma) is the minimal number of adjacent transpositions whose produce is \sigma.

Problem 1.7. Show that |\sigma|=\mathsf{inv}(\sigma), where \mathrm{inv}(\sigma) is the cardinality of the set \{1 \leq i<j \leq n\ \colon\ \sigma(i)>\sigma(j)\} of inversions in the permutation \sigma. Hence compute the diameter of G=\mathrm{Cay}(S,T).

We conclude that the metric Gibbs measure on G=\mathrm{Cay}(S,T) is

\mathbb{P}(\sigma)=\frac{1}{Z}q^{\mathsf{inv}(\sigma)}.

This is a famous probability measure on permutations called the Mallows measure; it arises in theoretical computer science in the analysis of sorting algorithms. It is a remarkable fact that the partition function

Z=\sum\limits_{\sigma \in S} q^{|\mathsf{inv}(\sigma)|}.

It is a remarkable fact that this partition function can be evaluated exactly, i.e. this stat mech model is exactly solvable. For any positive integer k, the corresponding q-number is

[k]_q = 1+q+ \dots + q^{k-1},

so that in particular [k]_1=k reverts to an ordinary number at q=1. The q-factorial is accordingly defined as

[k]_q! = [1]_q[2]_q \dots [k]_q.

Here’s a pretty thing.

Theorem 1.1. The partition function for Mallows measure is Z=[n]_q!.

Now you can derive many things, including the following.

Problem 1.8. Find a formula for the expected number of inversions in a random permutation whose distribution is Mallows measure. Using your result, find a formula for the expected number of inversions in a uniformly random permutation.