Math 262A: Lecture 13

Continuing with Lecture 12, we are analyzing the Taylor series

T_I(\hbar) = \sum\limits_{k=0}^\infty a_k\hbar^k

of the smooth function

I(\hbar) = \frac{1}{\sqrt{2\pi\hbar}}\int\limits_{\mathbb{R}} e^{-\frac{1}{\hbar}S(x)} \mathrm{d}x,

where S is a smooth function on the line such that the above integral converges, and whose Taylor series is of the form

T_S(x) = \frac{x^2}{2} + \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}.

I should say here that I’m trying out a new notation: if F is a function of the real variable y which is smooth in a neighborhood of y=0, then I write

T_F(y) = \sum\limits_{k=0}^\infty F^{(k)}(0)\frac{y^k}{k!},

this being a formal power series in a single indeterminate which by abuse of notation I also call y. So it does not make sense to write

F(y) = T_F(y),

since the LHS is a number and the RHS is a formal power series. However, Taylor’s theorem with remainder tells you that if you take a finite number of terms of the formal power series T_F and trade the formal variable y for a real number y, you get a good approximation to the real number F(y).

Back to the problem at hand, what we are trying to do is to express the formal power series T_I in terms of the formal power series T_S. What we currently know is that

T_I(\hbar) = 1 + \sum\limits_{d=1}^\infty \frac{1}{d!}\sum\limits_{\alpha \vdash d} t_d|C_\alpha|p_\alpha\hbar^{\frac{d}{2}-\ell(\alpha)},

where the internal (finite) sum is over Young diagrams \alpha with d cells, with t_d the number of fixed point free involutions in \mathrm{S}(d) and C_\alpha the conjugacy class in \mathrm{S}(d) of permutations of cycle type \alpha (relative to Lecture 12 we have suppressed the sign (-1)^{\ell(\alpha)}, which can be restored by replacing each p_d with -p_d), and

p_\alpha=\prod\limits_{i=1}^{\ell(\alpha)} p_{\alpha_i}

subject to p_1=p_2=0. So, in completely algebraic language, one could say that we are studying the “Feynman transform”

\mathcal{F} \colon x^3\mathbb{R}[p_1,p_2,p_3,\dots][[x]] \to \mathbb{R}[p_1,p_2,p_3,\dots][[\hbar]]

defined by

\mathcal{F}\left( \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}\right):=\sum\limits_{d=1}^\infty \frac{1}{d!} \sum\limits_{\alpha \vdash d} t_d|C_\alpha|p_\alpha \hbar^{\frac{d}{2}-\ell(\alpha)}.

The domain of this map is the principal ideal generated by x^3 in the algebra of formal power series in x with coefficients being polynomials in the p_1,p_2,p_3,\dots, and the codomain is formal power series in \hbar with coefficients in the same polynomial ring. The problem then is to simplify the output of the Feynman map as a power series in \hbar, i.e. to determine the coefficient of \hbar^k for each k \in \mathbb{N}.
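Since everything on the right hand side is explicit, the coefficient of \hbar^k can be tabulated mechanically. Here is a minimal Python sketch (my own illustration, not part of the lecture) which does so using t_d = (d-1)!! for even d and |C_\alpha| = d!/z_\alpha, where z_\alpha = \prod_j m_j!\,j^{m_j} and m_j is the number of rows of \alpha of length j; if you prefer to keep the sign (-1)^{\ell(\alpha)} from Lecture 12, weight each \alpha accordingly.

```python
from math import factorial
from fractions import Fraction
from collections import Counter

def partitions_min3(n, min_part=3):
    """Partitions of n into parts >= min_part, as weakly increasing tuples."""
    if n == 0:
        yield ()
        return
    for first in range(min_part, n + 1):
        for rest in partitions_min3(n - first, first):
            yield (first,) + rest

def t(d):
    """Number of fixed point free involutions in S(d): (d-1)!! for even d."""
    result = 1
    for k in range(d - 1, 0, -2):
        result *= k
    return result

def hbar_coefficient(k):
    """Weights t_d |C_alpha| / d! of the monomials p_alpha in [hbar^k]."""
    coeff = {}
    # d/2 - l(alpha) = k with 1 <= l(alpha) <= d/3 forces 2k+2 <= d <= 6k
    for d in range(2 * k + 2, 6 * k + 1, 2):
        for alpha in partitions_min3(d):
            if d // 2 - len(alpha) == k:
                z = 1  # z_alpha = prod_j m_j! j^m_j, so |C_alpha| = d!/z_alpha
                for part, mult in Counter(alpha).items():
                    z *= factorial(mult) * part ** mult
                coeff[alpha] = Fraction(t(d), z)
    return coeff

print(hbar_coefficient(1))  # {(4,): Fraction(3, 4), (3, 3): Fraction(5, 6)}
```

For k=1 this prints weight 3/4 on \alpha=(4) and 5/6 on \alpha=(3,3), matching the hand computation from Lecture 12.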

What we want to do in this lecture is give a diagrammatic solution to the above algebraic problem. Note that the solution to this problem has analytic value via Laplace’s method, but the solution itself does not involve any analysis and is purely combinatorial.

The first step is to think of the number |C_\alpha|t_d as the cardinality of C_\alpha \times \mathrm{Inv}(d), where again C_\alpha is the conjugacy class of permutations in the symmetric group \mathrm{S}(d), and \mathrm{Inv}(d) is the set of fixed point free involutions in \mathrm{S}(d). Note that \mathrm{Inv}(d) is nonempty if and only if d is even, in which case

\mathrm{Inv}(d) = C_{(2^{\frac{d}{2}})},

where (2^{\frac{d}{2}}) is the Young diagram with \frac{d}{2} rows of length 2. Every pair (\sigma,\tau) \in C_\alpha \times \mathrm{Inv}(d) is a combinatorial map. Associated to this map are two topological maps, which are dual to one another. To construct these topological maps, we use the following construction. First, for each cycle of \sigma we draw a polygon whose number of sides is equal to the length of the cycle, and whose edges are labeled counterclockwise in the cyclic order prescribed by \sigma. At the same time, we place into this polygon the corresponding dual object, namely a vertex in the interior of the polygon together with line segments, called “half edges” or “darts,” which emanate from the vertex, one for each edge of the polygon and labeled in the same way. Note that we need only consider the case where d is even, since t_d=0 otherwise, and moreover where all rows of \alpha have length at least 3, since p_1=p_2=0, meaning that all polygons in this construction have at least three sides, or dually that all stars have degree at least 3. Now we glue the sides of the polygons together according to the pairing permutation \tau, and at the same time pair half edges of stars in the same manner, i.e. using \tau. This produces a pair of dual topological maps \Gamma,\Gamma' on a compact oriented surface S. Our convention is that we view \Gamma as the topological map obtained by glueing stars, and \Gamma' as the topological map obtained by glueing polygons. The figure below illustrates this construction for the combinatorial map

\left( (1\ 2\ 3)(4\ 5\ 6), (1\ 2)(3\ 4)(5\ 6) \right).

Glueing a pair of triangles/3-stars to get a pair of maps.
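If you want to experiment with this construction, the gluing can be carried out entirely combinatorially: vertices of \Gamma correspond to cycles of \sigma, edges to cycles of \tau, and faces to cycles of the product \sigma\tau (under one common convention; others use \sigma^{-1}\tau, which gives the same counts for the example above). The following Python sketch is my own illustration, not part of the lecture; the Euler characteristic v - e + f then reveals the genus of the glued surface when the map is connected.

```python
def cycles(perm):
    """Cycle decomposition of a permutation given as a dict i -> perm(i)."""
    seen, result = set(), []
    for start in perm:
        if start not in seen:
            cycle, i = [], start
            while i not in seen:
                seen.add(i)
                cycle.append(i)
                i = perm[i]
            result.append(tuple(cycle))
    return result

def compose(p, q):
    """(p o q)(i) = p(q(i))."""
    return {i: p[q[i]] for i in q}

# the combinatorial map from the figure: sigma = (1 2 3)(4 5 6), tau = (1 2)(3 4)(5 6)
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 6, 6: 4}
tau   = {1: 2, 2: 1, 3: 4, 4: 3, 5: 6, 6: 5}

v = len(cycles(sigma))                 # vertices of Gamma = cycles of sigma
e = len(cycles(tau))                   # edges = 2-cycles of the pairing tau
f = len(cycles(compose(sigma, tau)))   # faces = cycles of sigma o tau
print(v, e, f, "chi =", v - e + f)     # 2 3 3 chi = 2: the glued surface is a sphere
```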

Now let us consider the Feynman transform

\mathcal{F}\left( \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}\right):=\sum\limits_{d=1}^\infty \frac{1}{d!} \sum\limits_{\alpha \vdash d} t_d|C_\alpha|p_\alpha \hbar^{\frac{d}{2}-\ell(\alpha)}

from the point of view of topological rather than combinatorial maps. First, the powers of \hbar occurring in the transformed power series have a clear interpretation: we have

\hbar^{\frac{d}{2}-\ell(\alpha)} = \hbar^{e(\Gamma)-v(\Gamma)},

where v(\Gamma),e(\Gamma) are the number of vertices and edges in \Gamma. Observe that the difference e(\Gamma)-v(\Gamma) depends only on the underlying graph, not on its embedding, so that this is a purely combinatorial parameter. Indeed, it is closely related to the so-called circuit rank of a graph, which is the number of independent cycles in the graph, or equivalently the minimal number of edges which must be deleted in order to obtain an acyclic graph, i.e. a forest. The precise relationship between these parameters is

e(\Gamma)-v(\Gamma)=r(\Gamma)-c(\Gamma)

with r(\Gamma) the circuit rank and c(\Gamma) the number of connected components. This is pretty obvious if you think about it: consider the extremal case where we have one connected component consisting of a tree. In this case both sides are equal to -1, since a tree has circuit rank zero and one more vertex than it has edges; every time we add an edge, both e(\Gamma)-v(\Gamma) and the circuit rank go up by one; and both sides are additive in the number of components (graph theory people – did I get that right?).

So, at the coarsest level, the output of the Feynman transform is apparently a generating function for (possibly disconnected) graphs sorted by circuit rank, where we are restricting to graphs in which every vertex has degree at least 3. Now this is not quite the case, since many combinatorial maps will give rise to the same topological map. Indeed, if we take the combinatorial map (\sigma,\tau) and conjugate it by an arbitrary permutation \pi \in \mathrm{S}(d), then we get another combinatorial map (\pi\sigma\pi^{-1},\pi\tau\pi^{-1}) which obviously yields the same topological map when the construction described above is performed, since we have simply relabeled everything by conjugating. Thus we expect that the correspondence between combinatorial and topological maps should be d!-to-one, which would be excellent news, since then the factor

\frac{1}{d!} |C_\alpha|t_d

would be equal to one. However, this is not precisely true, since there may be permutations \pi whose conjugation action on the combinatorial map (\sigma,\tau) has no effect, leaving this combinatorial map unaltered; this happens precisely when \pi commutes with both \sigma and \tau, or in other words belongs to the centralizer of the subgroup \langle \sigma,\tau \rangle of the symmetric group \mathrm{S}(d) which they generate. It is thus reasonable to define the automorphism group of a combinatorial map to be this centralizer. We then see that the Feynman transform is indeed a generating function for graphs arranged by circuit rank, but in which each graph is counted with a weight equal to the reciprocal of the order of its automorphism group:

\mathcal{F}\left( \sum\limits_{d=3}^\infty p_d \frac{x^d}{d}\right)=\sum\limits_\Gamma \frac{\hbar^{r(\Gamma)-c(\Gamma)}}{|\mathrm{Aut} \Gamma|}p_\Gamma,

the sum being over all topological maps of minimum vertex degree 3, with

p_\Gamma = \prod_{v \in \Gamma} p_{\deg(v)},

where the product is over all vertices of \Gamma. This product is called the “Feynman amplitude” of \Gamma, and it is the only part of the above construction which actually depends on the generating function S we started with — everything else is universal, being exactly the same irrespective of what S is. And that’s the general idea of the Feynman diagram expansion in zero-dimensional scalar-valued quantum field theory. It’s pretty useful, because for example we can say that the leading contribution in the asymptotic expansion of the integral we’re trying to approximate in the \hbar \to 0 limit comes from connected graphs which have minimum vertex degree 3 and two independent cycles, and there are only three of those:

Leading Feynman diagrams.

We’ll clean this up a bit next week, and then go on to the matrix-valued case, where the expansion depends not just on the underlying graph of a topological map, but also on the topology of the surface into which it is embedded.

Math 262A: Lecture 12

In Lecture 11, we were getting to the exciting part of this meandering topics course, where two seemingly disparate pieces of mathematics (Laplace’s Principle, the Exponential Formula) collide and produce something new (Feynman diagrams). Let’s review the two ingredients.

Take an interval [a,b] \subseteq \mathbb{R} which may be finite or infinite, and let S \colon [a,b] \to \mathbb{R} be a smooth function which attains a global minimum at a unique interior point c \in (a,b), and which is such that the integral

\int_a^b e^{-S(x)} \mathrm{d}x

is finite.

Theorem (Laplace Principle): The quantity

I(\hbar) = \sqrt{\frac{S''(c)}{2\pi\hbar}} e^{\frac{1}{\hbar}S(c)} \int_a^b e^{-\frac{1}{\hbar}S(x)} \mathrm{d}x

extends to a smooth function of \hbar \in [0,\infty), with I(0)=1.

The utility of this result is the following. Let

\sum\limits_{j=0}^\infty a_j\hbar^j

be the Maclaurin series of the smooth function I(\hbar). Taylor’s theorem with remainder tells us that for any nonnegative integer k, we have

\left|  I(\hbar) - \sum\limits_{j=0}^ka_j\hbar^j \right| \leq M_k \hbar^{k+1},

where M_k \geq 0 depends on k but not on \hbar. In other words, we have the \hbar \to 0 asymptotic expansion

I(\hbar) = \sum\limits_{j=0}^k a_j \hbar^j + o(\hbar^k),

where the error term is uniform in \hbar. Oftentimes this is used to get the N \to \infty asymptotic expansion of an integral of the form

\int_a^b e^{-NS(x)} \mathrm{d}x,

in which case setting \hbar = 1/N the Laplace principle gives us the N \to \infty asymptotics

\sqrt{\frac{NS''(c)}{2\pi}} e^{NS(c)} \int_a^b e^{-NS(x)} \mathrm{d}x = 1+  \frac{a_1}{N} + \dots + \frac{a_k}{N^k} + o\left(\frac{1}{N^k} \right).

The prototypical example is Stirling’s approximation to N! via the Euler integral.
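As a quick numerical illustration (my own sketch, not part of the lecture), here is the expansion at work on the Euler integral:

```python
import math

# Stirling via the Euler integral N! = int_0^inf t^N e^{-t} dt:
# the Laplace expansion predicts N! ~ sqrt(2 pi N)(N/e)^N (1 + 1/(12N) + ...)
for N in (5, 10, 50):
    exact = math.factorial(N)
    leading = math.sqrt(2 * math.pi * N) * (N / math.e) ** N
    print(N, exact / leading, exact / (leading * (1 + 1 / (12 * N))))
# the first ratio approaches 1 like 1 + 1/(12N); the second much faster
```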

Now we recall the Exponential Formula, which tells us how to compute the exponential of a formal power series

S(x) = \sum\limits_{d=1}^\infty s_d \frac{x^d}{d!},

or equivalently how to compute the Maclaurin series of e^S when S is a smooth function with derivatives s_d=S^{(d)}(0) such that S(0)=0. Writing

e^{S(x)} = 1 + \sum\limits_{d=1}^\infty h_d \frac{x^d}{d!},

the Exponential Formula says that

h_d = \sum\limits_{\pi \in \mathrm{Par}(d)} \prod\limits_{B \in \pi} s_{|B|},

the sum being over all partitions \pi of \{1,\dots,d\}, and the product being over the blocks B of each partition \pi. An alternative form of the Exponential Formula, which is more useful for our current purpose, arises when we instead write

S(x) = \sum\limits_{d=1}^\infty p_d \frac{x^d}{d},

so s_d=(d-1)!p_d. In terms of the scaled derivatives p_d, the formula for h_d becomes

h_d = \sum\limits_{\pi \in \mathrm{Par}(d)} \prod\limits_{B \in \pi} (|B|-1)!\,p_{|B|}.

Since a finite set of cardinality c can be cyclically ordered in (c-1)! ways, we can rewrite the above as

h_d = \sum\limits_{\sigma \in \mathrm{S}(d)} \prod\limits_{C \in \sigma} p_{|C|},

where the sum is over all permutations \sigma of \{1,\dots,d\} and the product is over the cycles C of each permutation \sigma. Observe that this product does not depend on the internal structure of the cycles C of \sigma, but only on their sizes, so that any two permutations of the same cycle type (i.e. any two permutations which are conjugates of one another) give the same contribution to the sum. That is to say, we may rewrite the above as

h_d = \sum\limits_{\alpha \vdash d} |C_\alpha| p_\alpha,

where the sum is over all Young diagrams \alpha with d cells, and for each such diagram C_\alpha \subseteq \mathrm{S}(d) is the conjugacy class of permutations of cycle type \alpha and

p_\alpha = \prod\limits_{i=1}^{\ell(\alpha)} p_{\alpha_i}.
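Here is a small sympy sketch (an illustration of mine, not from the lecture) verifying the permutation form of the Exponential Formula, i.e. that \sum_{\sigma \in \mathrm{S}(d)} \prod_C p_{|C|} agrees with d! times the coefficient of x^d in \exp\left(\sum_j p_j \frac{x^j}{j}\right):

```python
import itertools
import sympy as sp

x = sp.symbols('x')
p = sp.symbols('p1:6')   # p1, ..., p5

def cycle_lengths(perm):
    """Cycle lengths of a permutation of {0, ..., n-1} given as a tuple."""
    seen, lengths = set(), []
    for s in range(len(perm)):
        if s not in seen:
            n, i = 0, s
            while i not in seen:
                seen.add(i)
                i = perm[i]
                n += 1
            lengths.append(n)
    return lengths

for d in range(1, 6):
    # h_d as a sum over permutations of their cycle-type monomials
    h_d = sum(sp.prod([p[c - 1] for c in cycle_lengths(perm)])
              for perm in itertools.permutations(range(d)))
    # compare with d! [x^d] exp(sum_j p_j x^j / j), truncated at order d
    series = sp.exp(sum(p[j - 1] * x**j / sp.Integer(j) for j in range(1, d + 1)))
    coeff = series.series(x, 0, d + 1).removeO().coeff(x, d)
    print(d, sp.expand(h_d - sp.factorial(d) * coeff) == 0)
```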

We now combine the Laplace Principle with the (permutation form of) the Exponential Formula to calculate the asymptotic expansion coefficients a_j. To simplify the computation, we will make several assumptions on the Laplace side, none of which are essential — these can all be eliminated without too much difficulty and the end result is more or less the same. First, we assume the integration is over the whole real line. Second, we assume that the critical point c where the unique minimum of S occurs is c=0, and that S(c)=0. Finally, let us assume that the positive number S''(0) is equal to one.

With these assumptions, we have

I(\hbar)=\frac{1}{\sqrt{2\pi\hbar}}\int_\mathbb{R} e^{-\frac{1}{\hbar}S(x)}\mathrm{d}x,

and the Maclaurin series of S has the form

\frac{x^2}{2} + \sum\limits_{d=3}^\infty p_d\frac{x^d}{d},

where p_d:=\frac{S^{(d)}(0)}{(d-1)!}. We factor the integrand as

e^{-\frac{1}{\hbar}S(x)} = e^{-\frac{1}{\hbar}\frac{x^2}{2}} e^{-\frac{1}{\hbar}(S(x)-\frac{x^2}{2})},

and by the Exponential Formula the Maclaurin series of the exponential of the non-quadratic part of the action S is

1+\sum\limits_{d=1}^\infty \left(\sum\limits_{\alpha \vdash d} \frac{|C_\alpha|}{d!} (-1)^{\ell(\alpha)}p_\alpha \hbar^{-\ell(\alpha)} \right)x^d

with p_1=p_2=0. We get the Maclaurin series of I by integrating this term-by-term against the Gaussian density:

1+\sum\limits_{d=1}^\infty \left(\sum\limits_{\alpha \vdash d} \frac{|C_\alpha|}{d!} (-1)^{\ell(\alpha)}p_\alpha \hbar^{-\ell(\alpha)} \right)\frac{1}{\sqrt{2\pi\hbar}}\int\limits_{\mathbb{R}}x^d e^{-\frac{x^2}{2\hbar}}\mathrm{d}x = 1+\sum\limits_{d=1}^\infty \sum\limits_{\alpha \vdash d} \frac{t_d|C_\alpha|}{d!} (-1)^{\ell(\alpha)}p_\alpha \hbar^{\frac{d}{2}-\ell(\alpha)},

where we used the exact evaluation

\frac{1}{\sqrt{2\pi\hbar}}\int\limits_{\mathbb{R}} x^d e^{-\frac{x^2}{2\hbar}} \mathrm{d}x = t_d\hbar^{\frac{d}{2}},

with t_d the number of fixed point free involutions in the symmetric group \mathrm{S}(d).
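This evaluation is the statement that the d-th moment of a centered Gaussian of variance \hbar is (d-1)!!\,\hbar^{d/2} for even d and zero for odd d, with (d-1)!! counting the perfect matchings of d points, i.e. the fixed point free involutions in \mathrm{S}(d). A quick numerical check (my own sketch, using scipy):

```python
import math
from scipy.integrate import quad

hbar = 0.3
for d in range(0, 9):
    f = lambda x, d=d: x**d * math.exp(-x * x / (2 * hbar)) / math.sqrt(2 * math.pi * hbar)
    moment, _ = quad(f, -30, 30)
    t_d = 0 if d % 2 else math.prod(range(d - 1, 0, -2))   # (d-1)!! for even d, else 0
    print(d, round(moment, 6), t_d * hbar ** (d / 2))      # the two columns agree
```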

In order to determine the expansion coefficients a_j, we now want to arrange the sum

1+\sum\limits_{d=1}^\infty \sum\limits_{\alpha \vdash d} \frac{t_d|C_\alpha|}{d!} (-1)^{\ell(\alpha)}p_\alpha \hbar^{\frac{d}{2}-\ell(\alpha)}

as a power series in \hbar. At this point, it is not even clear that the series involves only nonnegative integer powers of \hbar. First of all, the presence of \frac{d}{2} in the exponent of \hbar makes it look like we might have half-integer powers, but this is not the case since t_d=0 whenever d is an odd number. Next, the actual exponent of \hbar is \frac{d}{2}-\ell(\alpha), which seems like it could be negative, but in fact since p_1=p_2=0 the quantity p_\alpha vanishes if \alpha \vdash d has more than \frac{d}{3} rows, so that every exponent which actually occurs is at least \frac{d}{6}>0. Combining these conditions, we see that the exponent \hbar^1 arises in exactly two ways: at d=4 from the one-row diagram \alpha=(4), and at d=6 from the diagram \alpha=(3,3) with two rows each of length 3, since in these cases we have

\hbar^{\frac{4}{2}-1}=\hbar^{\frac{6}{2}-2}=\hbar^1.

Another thing which is not quite obvious is that these are the only situations which produce a power of 1 in the exponent of \hbar, or even that there are finitely many such situations. Let us see why this is the case. In order for this exponent to arise, we must have that \alpha \vdash d is a Young diagram with an even number of cells such that \ell(\alpha)=\frac{d}{2}-1. Since all rows of \alpha have length at least three, we get the inequality

\frac{d}{2}-1 \leq \frac{d}{3} \implies d \leq 6.

For d=4 the condition \ell(\alpha)=1 forces \alpha=(4), and for d=6 the condition \ell(\alpha)=2 together with p_1=p_2=0 forces \alpha=(3,3). So, remembering the sign (-1)^{\ell(\alpha)}, the only terms in the sum which are linear in \hbar are

-\frac{1}{4!}t_4|C_{(4)}|p_4 \quad\text{and}\quad \frac{1}{6!}t_6|C_{(3,3)}|p_3^2,

and we conclude that

a_1=\frac{1}{6!}t_6|C_{(3,3)}|p_3^2-\frac{1}{4!}t_4|C_{(4)}|p_4.

We can explicitly evaluate the “universal” part of this formula, i.e. the part that has no dependence on S. As previously discussed, the number of fixed point free involutions in \mathrm{S}(6) is

t_6=(6-1)!! = 5 \cdot 3 \cdot 1 = 15.

Next, the number of permutations in \mathrm{S}(6) whose disjoint cycle decomposition is of the form (*\ *\ *)(*\ *\ *) is

\frac{{6 \choose 3} \cdot 2 \cdot 2}{2} = 20 \cdot 2=40,

because there are {6 \choose 3} ways to choose the elements of the two cycles and 2 ways to cyclically arrange each, and this overcounts by a factor of 2 since the cycles are unordered. Similarly, t_4=3!!=3 and |C_{(4)}|=3!=6. We thus conclude that

a_1 = \frac{40 \cdot 15}{720}p_3^2 - \frac{3 \cdot 6}{24}p_4 = \frac{5}{6}p_3^2 - \frac{3}{4}p_4.

As a sanity check, Stirling’s approximation via the Euler integral corresponds to the action S(x)=x-\log(1+x), whose Maclaurin series is \frac{x^2}{2}-\frac{x^3}{3}+\frac{x^4}{4}-\dots, so that p_3=-1 and p_4=1; this gives a_1=\frac{5}{6}-\frac{3}{4}=\frac{1}{12}, which is exactly the subleading correction in Stirling’s formula.
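The same conclusion can be reached numerically. The sketch below (mine, for illustration) computes I(\hbar) for the Stirling action by quadrature and watches (I(\hbar)-1)/\hbar approach \frac{1}{12}:

```python
import math
from scipy.integrate import quad

# Stirling action S(x) = x - log(1+x): here p_3 = -1, p_4 = +1, so the
# prediction is a_1 = 5/6 - 3/4 = 1/12
def I(hbar):
    f = lambda x: math.exp(-(x - math.log1p(x)) / hbar)
    val, _ = quad(f, -1 + 1e-9, 10)
    return val / math.sqrt(2 * math.pi * hbar)

for hbar in (0.2, 0.1, 0.05, 0.01):
    print(hbar, (I(hbar) - 1) / hbar)   # approaches 1/12 = 0.08333...
```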

Clearly, to continue this calculation to find all the coefficients a_j, we need a better way to organize this information. This can be done pictorially, and is the simplest example of the use of Feynman diagrams in perturbation theory. The number t_d|C_\alpha| is equal to the number of pairs (\gamma,\varphi) \in \mathrm{S}(d) \times \mathrm{S}(d) such that \gamma is a fixed point free involution and \varphi \in C_\alpha is a permutation of cycle type \alpha. As we discussed in Lecture 11, such pairs of permutations are referred to as combinatorial maps, or rotation systems, because they can be converted into topological maps, which are graphs drawn on surfaces. So, graphs on surfaces are the Feynman diagrams of zero dimensional quantum field theory. We will pick this up in Lecture 13, where we will give the full diagrammatic interpretation of the expansion coefficients a_j. After that, the plan is to tackle the same problem when we replace integration over \mathbb{R} with integration over the real vector space \mathrm{H}(N) of N \times N Hermitian matrices; in this situation we can actually distinguish the genus of the maps which arise. After that we will generalize further to integrating over the real vector space of selfadjoint elements in a finite-dimensional von Neumann algebra.

Math 262A: Lecture 9

This week, we will discuss a remarkable law of nature discovered by George Polya. Consider a particle situated at a given site of the lattice \mathbb{Z}^d. At each tick of the clock, the particle makes a random jump to a neighboring lattice site, with equal probability of jumping in any direction. In other words, the particle is executing simple random walk on \mathbb{Z}^d. A random walk is said to be recurrent if it returns to its initial position with probability one; otherwise, it is said to be transient.

Theorem (Polya): Simple random walk on \mathbb{Z}^d is recurrent for d=1,2 and transient for d \geq 3.

Many proofs of this theorem are known, and the extent to which any two such proofs are different is a perennial subject of debate. In this lecture and the next, we will present the proof given here, which combines a number of the techniques we have so far encountered in this course (exponential generating functions, Laplace method), and introduces new ones that we will need going forward (ordinary generating functions, Borel transform). We will use all of these new methods when we begin our discussion of Feynman diagrams next week.

In case you are frustrated with the admittedly slow pace of the course so far and want to know what is planned for its remainder, here is a rough outline. First I want to discuss Feynman diagrams for zero-dimensional scalar-valued QFTs, which are combinatorial maps. Then I would like to discuss Feynman diagrams for zero-dimensional matrix-valued QFTs, which are again maps on surfaces but with more of their topological structure visible. After this I want to look at a framework which subsumes both of these, namely Feynman diagrams for zero-dimensional QFTs with values in a finite-dimensional von Neumann algebra. Finally, I would like to discuss the combinatorics of zero-dimensional QFTs with values in a matrix group, a situation which generalizes the method of stationary phase and for which the appropriate Feynman diagrams seem to be branched covers of compact Riemann surfaces.

Back to Polya’s theorem. For n \in \mathbb{N}, let E_n denote the event that simple random walk on \mathbb{Z}^d returns to its initial position for the first time after n steps. It is convenient to set E_0=\emptyset, corresponding to the fact that the particle is tautologically situated at its initial position at time zero and this should not count as a return. Write p_n for the probability of E_n. The events E_n are mutually exclusive for different values of n, and

E=\bigsqcup\limits_{n=0}^\infty E_n

is the event that the particle returns to initial position after finitely many steps. We would like to calculate the probability

p= \sum\limits_{n=0}^\infty p_n

that E occurs.

Let us clarify that we are viewing \mathbb{Z}^d as a graph, whose vertices are the points of \mathbb{Z}^d with two vertices being adjacent if and only if they are unit Euclidean distance apart. A loop on \mathbb{Z}^d is then just a closed walk in the usual sense of graph theory. We consider the set of loops based at an arbitrary but fixed vertex of \mathbb{Z}^d, and denote by \ell_n the number of length n loops based at this point. It is convenient to consider the base point itself as a loop of length zero, the trivial loop, so that \ell_0=1.

A loop is said to be indecomposable if it is not the concatenation of two loops. Let r_n denote the number of length n indecomposable loops based at the given point, with the convention that the trivial loop is not indecomposable so that r_0=0 (this is like excluding 1 from the primes).

Proposition 1: For any n \geq 1, we have

\ell_n = \sum\limits_{k=0}^n r_k \ell_{n-k}.

Proof: Every loop of length n \geq 1 consists of an indecomposable loop of some length k \geq 1, followed by a (possibly trivial) loop of length n-k.

—Q.E.D.

If we divide the formula in Proposition 1 by (2d)^n, which is the total number of n-step walks on \mathbb{Z}^d beginning at the given base point, we get the recursion

q_n = \sum\limits_{k=0}^n p_k q_{n-k},

where p_n is the probability of a first return at time n as above, and q_n is the probability that the particle is situated at its initial position at time n.

Exercise 1: Which is bigger, p_n or q_n? (Hint: this is a trick question whose purpose is to make sure you are clear on the notation).

We now consider the ordinary generating functions of the sequences (p_n)_{n=0}^\infty and (q_n)_{n=0}^\infty, which by definition are the formal power series

P(z) = \sum\limits_{n=0}^\infty p_nz^n \quad\text{ and }\quad Q(z) = \sum\limits_{n=0}^\infty q_nz^n.

The product of these generating functions is

P(z)Q(z) =  \sum\limits_{n=0}^\infty \left(\sum\limits_{k=0}^n p_kq_{n-k}\right)z^n = \sum\limits_{n=1}^\infty \left(\sum\limits_{k=0}^n p_kq_{n-k}\right)z^n = Q(z)-1,

where we used the fact that p_0=0 together with Proposition 1. Now, each of the series P(z) and Q(z) has radius of convergence at least 1, because each has coefficients of absolute value at most 1, and therefore we may consider the equality

P(z)Q(z)=Q(z)-1

as an identity in the algebra of analytic functions on the open unit disc in \mathbb{C}. Since the coefficients of Q(z) are nonnegative numbers, the function Q(z) is non-vanishing on the interval [0,1), and we thus have

P(z) = 1- \frac{1}{Q(z)}, \quad z \in [0,1).

Now, since

P(1) = \sum_{n=0}^\infty p_n = p,

we can apply Abel’s power series theorem to obtain the following formula for the probability p that we are trying to calculate:

p = \lim\limits_{\substack{z \to 1 \\ z \in [0,1)}} P(z) = 1 - \frac{1}{\lim\limits_{\substack{z \to 1 \\ z \in [0,1)}}Q(z)}.

So, everything hinges on the limiting behavior of Q(z) as z approaches 1 through positive reals: we have either

\lim\limits_{\substack{z \to 1 \\ z \in [0,1)}}Q(z) = \infty,

in which case p=1 (recurrence), or

\lim\limits_{\substack{z \to 1 \\ z \in [0,1)}}Q(z) < \infty,

in which case p<1 (transience).

In order to determine which of the two limiting behaviors above Q(z) actually exhibits, we need some kind of tractable expression for this function. Combinatorially, this amounts to asking what can be said about the loop generating function

L(z) = \sum\limits_{n=0}^\infty \ell_n z^n

since we have Q(z) = L(\frac{z}{2d}) (Gian-Carlo Rota is said to have proclaimed that “probability is just combinatorics divided by n,” and that is exactly what is going on here). While there is not much that we can say about the ordinary generating function L(z) for lattice loops, it turns out that the corresponding exponential generating function,

E(z) = \sum\limits_{n=0}^\infty \ell_n \frac{z^n}{n!},

is analyzable. This is because of the basic features of exponential generating functions, as discussed back in Lecture 3.

Let’s recall how this works. In this paragraph it is important to keep the dimension d of \mathbb{Z}^d in mind, so we write E_d(z) for the exponential generating function of the sequence (\ell_n^{(d)})_{n=0}^\infty counting loops on \mathbb{Z}^d. To be concrete, let us consider what is going on for d=2. A walk on \mathbb{Z}^2 is a sequence of steps, each of which is either horizontal or vertical, and so it is in effect a shuffle of two walks on \mathbb{Z}^1, one corresponding to horizontal steps and the other to vertical steps. Thus, a loop on \mathbb{Z}^2 is a shuffle of two loops on \mathbb{Z}^1. More quantitatively, we have

\ell_n^{(2)} = \sum\limits_{k=0}^n {n \choose k}\ell_k^{(1)}\ell_{n-k}^{(1)},

since to build a length n loop on \mathbb{Z}^2 we must choose a number k corresponding to the number of horizontal steps in the loop, then choose a cardinality k subset of \{1,\dots,n\} corresponding to the times at which horizontal steps occur, and finally choose one of the \ell_k^{(1)} horizontal k-step loops together with one of the \ell_{n-k}^{(1)} vertical (n-k)-step loops to build on the chosen subset. The algebraic translation of this combinatorial reasoning is the generating function identity

E_2(z) = E_1(z)E_1(z).
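This identity is easy to test by brute force. The Python sketch below (my own illustration, not from the lecture) counts loops on \mathbb{Z}^2 directly by dynamic programming and compares with the binomial convolution, using the one-dimensional count \ell_{2k}^{(1)} = {2k \choose k} derived just below:

```python
from math import comb

def loops_Z2(n):
    """Count length-n loops at the origin of Z^2 by dynamic programming."""
    counts = {(0, 0): 1}
    for _ in range(n):
        new = {}
        for (x, y), c in counts.items():
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                key = (x + dx, y + dy)
                new[key] = new.get(key, 0) + c
        counts = new
    return counts.get((0, 0), 0)

def ell_1(n):
    """Loops on Z^1: zero for odd n, binomial(n, n/2) for even n."""
    return comb(n, n // 2) if n % 2 == 0 else 0

for n in range(0, 9):
    shuffle = sum(comb(n, k) * ell_1(k) * ell_1(n - k) for k in range(n + 1))
    print(n, loops_Z2(n), shuffle)  # the two counts agree
```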

Exactly the same reasoning applies in any dimension d: we have

E_d(z) = E_1(z)^d.

The above makes it seem at least plausible that we will be able to achieve some fairly explicit understanding of the exponential loop generating function E_d(z), since we have reduced to the case d=1 which ought to be straightforward. Indeed, we clearly have

E_1(z) = \sum\limits_{n=0}^\infty \ell_n^{(1)} \frac{z^n}{n!} = \sum\limits_{k=0}^\infty \ell_{2k}^{(1)} \frac{z^{2k}}{(2k)!},

since any loop on \mathbb{Z}^1 must consist of an even number of steps, half of which are positive and half of which are negative. Moreover, the number of loops is just the number of ways to choose the times at which the positive steps occur, so that

E_1(z) = \sum\limits_{k=0}^\infty {2k \choose k} \frac{z^{2k}}{(2k)!} = \sum\limits_{k=0}^\infty \frac{z^{2k}}{k!k!},

which is a nice, explicit power series. The question now is: can we sum this series? What this really means is: can we find the function whose Maclaurin series is E_1(z), and if so do we know anything useful about this function? Luckily, the answer turns out to be an emphatic yes: E_1(z) is a Bessel function.

The Bessel functions form a large class of special functions which crop up everywhere in mathematics, physics, and engineering, usually as solutions to physically meaningful ordinary differential equations. The particular class of Bessel functions relevant for us are called the modified Bessel functions of the first kind, and typically denoted I_\alpha(z), with \alpha being a parameter. The function I_\alpha(z) is one of two linearly independent solutions to the second order ODE

z^2 F''(z) + zF"(z) - (z^2+\alpha^2)F(z) = 0,

which is known as the modified Bessel equation (the other solution is of course the modified Bessel function of the second kind). Nineteenth century mathematicians found two ways to describe I_\alpha(z): as a power series,

I_\alpha(2z) = \sum\limits_{k=0}^\infty \frac{z^{2k+\alpha}}{k!(k+\alpha)!},

and as an integral,

I_\alpha(2z) = \frac{z^\alpha}{\sqrt{\pi}(\alpha-\frac{1}{2})!}\int_0^\pi e^{2z\cos\theta} (\sin \theta)^{2\alpha}\mathrm{d}\theta.

From the power series representation, we see that we have

E_1(z) = I_0(2z),

so that we now know

E_d(z) = I_0(2z)^d.

The above connection with Bessel functions leaves us in good shape to analyze the exponential loop generating function. However, this isn’t what we actually want — remember that we want to analyze the ordinary loop generating function in order to solve our problem. So we need a way to convert exponential generating functions into ordinary generating functions. This can be done using an integral transform called the Borel transform, which is in fact just an application of Euler’s integral representation of the factorial,

n! = \int_0^\infty t^n e^{-t} \mathrm{d}t,

which we discussed way back in Lecture 1. Indeed, if

E(z) = \sum\limits_{n=0}^\infty a_n \frac{z^n}{n!}

is the exponential generating function of the sequence (a_n)_{n=0}^\infty, then its Borel transform

\mathcal{B}E(z) = \int_0^\infty E(tz) e^{-t} \mathrm{d}t = \sum\limits_{n=0}^\infty a_n \frac{z^n}{n!} \int_0^\infty t^ne^{-t} \mathrm{d}t = \sum_{n=0}^\infty a_nz^n

is the ordinary generating function of this same sequence.

Now it is clear what must be done: we have to Borel transform the exponential loop generating function, which is a Bessel function, in order to get to the ordinary loop generating function. That is, we have

\mathcal{B}E_d(z) = \int\limits_0^\infty I_0(2tz)^d e^{-t} \mathrm{d}t,

so that, recalling Q(z) = L\left(\frac{z}{2d}\right) = \mathcal{B}E_d\left(\frac{z}{2d}\right), we finally have the representation

Q(z) = \int\limits_0^\infty I_0\left(\frac{tz}{d}\right)^d e^{-t} \mathrm{d}t

for the function whose limit behavior we want to analyze.

Recall that what we want to do is analyze the z\to 1 behavior of

Q(z) = \int\limits_0^\infty I_0\left(\frac{tz}{d}\right)^d e^{-t} \mathrm{d}t,

and determine whether the limit is finite or infinite, that is, whether the above integral is convergent or divergent at z=1. Since all we care about is convergence or divergence, it suffices to consider the tail of the integral,

\int\limits_N^\infty I_0\left(\frac{tz}{d}\right)^d e^{-t} \mathrm{d}t,

with N large. Now because the integrand here is a power of a Bessel function, it is itself built from an integral:

I_0\left(\frac{tz}{d}\right) = \frac{1}{\pi}\int_0^\pi e^{tf(\theta)} \mathrm{d}\theta, where f(\theta) = \frac{z}{d}\cos \theta and t \geq N is large.

where f(\theta) = \frac{z}{d}\cos \theta and t \geq N is large. We are thus perfectly positioned for an application of the Laplace approximation (end point case),

\int_0^\pi e^{tf(\theta)} \mathrm{d}\theta \sim e^{tf(0)} \sqrt{\frac{\pi}{2t|f''(0)|}},\ t \to \infty.

Wading back through the formulas, what we have found is that

I_0\left(\frac{tz}{d}\right)^d e^{-t} \sim \text{const.}\ e^{t(z-1)}(tz)^{-\frac{d}{2}},\ t \to \infty.

Now, by the monotone convergence theorem we have

\lim\limits_{\substack{z \to 1 \\ z \in [0,1)}} \int_N^\infty e^{t(z-1)}(tz)^{-\frac{d}{2}} \mathrm{d}t = \int_N^\infty \lim\limits_{\substack{z \to 1 \\ z \in [0,1)}} e^{t(z-1)}(tz)^{-\frac{d}{2}} \mathrm{d}t = \int_N^\infty t^{-\frac{d}{2}}\mathrm{d}t,

an integral which diverges for d=1,2 and converges for d \geq 3. Polya’s theorem is proved, and if you’re willing to retrace our steps a bit you will find that you have a reasonable idea of what the mechanism is that is responsible for recurrence of simple random walk in dimensions d=1,2, and transience in dimensions d\geq 3. Indeed, if you want to get your hands dirty you can even try to estimate p by really staring at the above argument in order to get a feeling for how likely it is that simple random walk will escape to infinity.
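If you do want to get your hands dirty numerically, note that at z=1 the integrand I_0(t/d)^d e^{-t} equals \mathrm{ive}(0,t/d)^d, where \mathrm{ive}(0,x) = e^{-x}I_0(x) is the exponentially scaled Bessel function, so it can be evaluated without overflow. A short scipy sketch (my own illustration):

```python
from scipy.integrate import quad
from scipy.special import ive  # ive(0, x) = e^{-x} I_0(x)

def Q1(d, cutoff=1e7):
    """Q(1) = int_0^inf I_0(t/d)^d e^{-t} dt. Since I_0(t/d)^d e^{-t}
    = ive(0, t/d)^d exactly, the integrand never overflows."""
    val, _ = quad(lambda t: ive(0, t / d) ** d, 0, cutoff, limit=1000)
    return val

for d in (3, 4, 5):
    q = Q1(d)
    print(d, round(q, 3), "return probability ~", round(1 - 1 / q, 3))
# d = 3: Q(1) ~ 1.516, so p = 1 - 1/Q(1) ~ 0.340, Polya's constant. For
# d = 1, 2 the integrand decays like t^{-d/2}: Q1 grows with the cutoff, p = 1.
```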

I cannot resist mentioning that Bessel functions are intimately related to the enumeration of permutations with bounded decreasing subsequence length, which was our topic last week. The relation is via the following beautiful identity due to Ira Gessel.

Theorem (Gessel): For any d \in \mathbb{N}, we have

\sum\limits_{n=0}^\infty (n!)_d \frac{z^{2n}}{n!n!} = \det [I_{i-j}(2z)]_{i,j=1}^d,

where (n!)_d is the number of permutations in \mathrm{S}(n) with no decreasing subsequence of length d+1.

If you would like to see a proof of this theorem together with various pointers to the literature, I can refer you to this paper. In fact, since there will be no lecture post on Thursday, you can substitute this reading assignment for the missing lecture.

Math 262A: Lecture 8

Recall from Lecture 7 that the sequence of functions (f_N)_{N=1}^\infty defined on the region

\Omega_{d-1} = \{(y_1,\dots,y_d) \in \mathbb{R}^{d} \colon y_1 > \dots > y_d,\ y_1+\dots+y_d=0\}

by

f_N(y_1,\dots,y_d) = C_N(d) \dim (N+y_1\sqrt{N},\dots,N+y_d\sqrt{N}),

where

C_N(d) = (2\pi)^{\frac{d}{2}} \frac{N^{dN+\frac{d(d+1)}{4}}}{(dN)!e^{dN}},

converges pointwise on \Omega_{d-1}: we have

\lim\limits_{N \to \infty} f_N(y_1,\dots,y_d) = e^{-S(y_1,\dots,y_d)},

where

S(y_1,\dots,y_d) = \frac{1}{2}\sum\limits_{i=1}^d y_i^2-\sum\limits_{1 \leq i< j \leq d} \log(y_i-y_j)

is the potential energy of a system of identical two-dimensional point charges on \mathbb{R} at locations y_1> \dots > y_d.

For arbitrary \beta > 0, define

N!_d^{(\beta)} = \sum\limits_{\lambda \in \mathbf{Y}_d(dN)} (\dim \lambda)^\beta,

the sum being over the set \mathbf{Y}_d(dN) of Young diagrams with dN boxes and at most d rows. According to RSK, at \beta=2 this is the number of permutations in \mathrm{S}(dN) with no decreasing subsequence of length d+1, and at \beta=1 it is the number of involutions in \mathrm{S}(dN) with the same property.

We now use our limit theorem for f_N to calculate the asymptotics of N!_d^{(\beta)}. The first step is to rewrite the sum defining N!_d^{(\beta)} so that it is centered on the rectangular diagram R(d,N) \in \mathbf{Y}_d(dN), which as we have discussed in the previous lectures is the canonical self-complementary diagram relative to the twice-longer rectangle R(d,2N) which appears in Conjecture 1 of Lecture 6, the statement we are trying to prove. We have

C_N(d)^\beta N!_d^{(\beta)} = \sum\limits_{\substack{\frac{(d-1)N}{\sqrt{N}} \geq y_1 \geq \dots \geq y_{d-1} \geq \frac{-N}{\sqrt{N}} \\ y_i \in \frac{1}{\sqrt{N}}\mathbb{Z}}} (f_N(y_1,\dots,y_d))^\beta,

where y_d:=-(y_1+\dots+y_{d-1}). Now, the lattice \left(\frac{1}{\sqrt{N}}\mathbb{Z} \right)^{d-1} partitions \mathbb{R}^{d-1} into cells

\left[\frac{k_1}{\sqrt{N}},\frac{k_1+1}{\sqrt{N}} \right) \times \dots \times \left[\frac{k_{d-1}}{\sqrt{N}},\frac{k_{d-1}+1}{\sqrt{N}} \right), \quad k_i \in \mathbb{Z},

of volume \left(\frac{1}{\sqrt{N}} \right)^{d-1}, so that

\left(\frac{1}{\sqrt{N}} \right)^{d-1}C_N(d)^\beta N!_d^{(\beta)} = \left(\frac{1}{\sqrt{N}} \right)^{d-1}\sum\limits_{\substack{\frac{(d-1)N}{\sqrt{N}} \geq y_1 \geq \dots \geq y_{d-1} \geq \frac{-N}{\sqrt{N}} \\ y_i \in \frac{1}{\sqrt{N}}\mathbb{Z}}} (f_N(y_1,\dots,y_d))^\beta

is a Riemann sum for the integral

\int\limits_{\Omega_{d-1}} f_N(y_1,\dots,y_d)^\beta \mathrm{d}y_1 \dots \mathrm{d}y_{d-1}.

We thus have that

\lim\limits_{N \to \infty} \left(\frac{1}{\sqrt{N}} \right)^{d-1}C_N(d)^\beta N!_d^{(\beta)} = \int\limits_{\Omega_{d-1}} \lim\limits_{N \to \infty} f_N(y_1,\dots,y_d)^\beta \mathrm{d}y_1 \dots \mathrm{d}y_{d-1} = \int\limits_{\Omega_{d-1}} e^{-\beta S(y_1,\dots,y_d)}\mathrm{d}y_1 \dots \mathrm{d}y_{d-1},

by the dominated convergence theorem. Thus, we have found the N \to \infty asymptotics of N!_d^{(\beta)}, provided we can evaluate the integral on the right, which up to the linear constraint y_1+\dots+y_d=0 is the partition function of the two-dimensional Coulomb gas we have been discussing. This integral is not particularly easy to evaluate, and to compute it one needs to know some version of the Selberg integral formula.
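In the case d=2, \beta=2 everything is explicit and the convergence can be watched numerically: by Knuth’s theorem N!_2^{(2)} = \mathrm{Cat}_{2N}, and parametrizing \Omega_1 by y_1 = y > 0 the limit integral is \int_0^\infty 4y^2e^{-2y^2}\mathrm{d}y = \sqrt{\pi/8}. Here is a sketch of mine that checks this, with all large factorials handled through lgamma to avoid overflow:

```python
import math

def log_C(N):
    """log C_N(2), with C_N(2) = 2*pi N^(2N + 3/2) / ((2N)! e^(2N))."""
    return (math.log(2 * math.pi) + (2 * N + 1.5) * math.log(N)
            - math.lgamma(2 * N + 1) - 2 * N)

def log_catalan(M):
    """log Cat_M = log((2M)!) - log(M!) - log((M+1)!)."""
    return math.lgamma(2 * M + 1) - math.lgamma(M + 1) - math.lgamma(M + 2)

target = math.sqrt(math.pi / 8)
for N in (10**2, 10**3, 10**4, 10**5):
    logL = -0.5 * math.log(N) + 2 * log_C(N) + log_catalan(2 * N)
    print(N, math.exp(logL), "->", target)   # approaches sqrt(pi/8) = 0.6266...
```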

In fact, we don’t have to evaluate the integral to prove Conjecture 1 if we take advantage of symmetry. Define the sum

S_N(d,\alpha,\beta) = \sum\limits_{\substack{\mu \in \mathbf{Y}_d(dN) \\ \mu \subset R(d,2N)}} (\dim \mu)^\alpha (\dim \mu^*)^{\beta-\alpha},

where 0 \leq \alpha \leq \beta and \mu^* is the complement of \mu \subset R(d,2N). Repeating the argument above essentially verbatim (change variables from \lambda_i \in \mathbb{Z}_{\geq 0} to y_i \in N^{-1/2}\mathbb{Z} in the sum and scale by the mesh volume), we find that

\lim\limits_{N \to \infty} \left(\frac{1}{\sqrt{N}}\right)^{d-1} C_N(d)^\beta S_N(d,\alpha,\beta) = \int\limits_{\Omega_{d-1}} e^{-\alpha S(y_1,\dots,y_d)} e^{-(\beta-\alpha)S(-y_d,\dots,-y_1)} \mathrm{d}y_1 \dots \mathrm{d}y_{d-1}.

This integral is even worse than the last one, but we claim that in fact the two are equal:

\int\limits_{\Omega_{d-1}}  e^{-\beta S(y_1,\dots,y_d)}\mathrm{d}y_1 \dots \mathrm{d}y_{d-1} = \int\limits_{\Omega_{d-1}} e^{-\alpha S(y_1,\dots,y_d)} e^{-(\beta-\alpha)S(-y_d,\dots,-y_1)} \mathrm{d}y_1 \dots \mathrm{d}y_{d-1}.

This follows immediately if we can show that the action S has the symmetry

S(y_1,\dots,y_d) = S(-y_d,\dots,-y_1).

This in turn is physically obvious: reflecting the configuration of point charges through the center of the attractive harmonic potential preserves the distance of each particle to the center of the potential, as well as the distance between every pair of particles, and hence does not change the potential energy of the system (Exercise: prove this algebraically). We have thus proved the following generalization of Conjecture 1 from Lecture 6.

Theorem 1: For any d \in \mathbb{N} and 0 \leq \alpha \leq \beta, we have that

N!_d^{(\beta)} \sim S_N(d,\alpha,\beta)

as N \to \infty. In particular, taking \alpha=1 and \beta =2 we get

(dN)!_d \sim \dim R(d,2N)

as N \to \infty.

The moral of this story is that using physical intuition to guide mathematical reasoning can be quite useful. This is just a small example. For something more challenging, you might try the following exercise suggested by Polya and Szego. Let me know if you solve it.

Exercise 1: Prove the Riemann hypothesis by showing that the imaginary coordinates of the non-trivial zeros of the zeta function are the energy levels of a quantum mechanical system, i.e. the eigenvalues of the selfadjoint operator which is the Hamiltonian of the system.

The fact that we can obtain the N \to \infty asymptotics of N!_d is really a consequence of the fact that we can describe this combinatorial function as a sum over Young diagrams. One could try various other versions of this. For example, let \sigma \in \mathrm{S}(d+1) be any fixed permutation, and define N!_\sigma to be the number of permutations \pi in the symmetric group \mathrm{S}(N) which avoid \sigma, meaning that no subsequence of \pi, when written in one-line notation, is order isomorphic to \sigma. Our exactly solvable function N!_d corresponds to choosing \sigma to be the reverse identity permutation, i.e. the longest element of the Weyl group \mathrm{S}(d+1)=A_d. Not much is known about the N \to \infty asymptotics of the generalized factorial N!_\sigma for other choices of \sigma. The growth of this function is the subject of the Stanley-Wilf ex-conjecture, which asserts that N!_\sigma \leq C_\sigma^N where C_\sigma > 0 does not depend on N. Stirling’s formula tells us that this is false for N!, which exhibits super-exponential growth, and the theorem of Regev we have proved over the course of the last few lectures tells us that this is true for N!_d.

The Stanley-Wilf conjecture was proved by Adam Marcus and Gabor Tardos in 2004. Their proof gives a value for C_\sigma which is exponential in d. On the other hand, we know from the asymptotics of N!_d, which we can explicitly compute, that in this case C_\sigma is polynomial in d (Exercise: check this by computing the asymptotics of \dim R(d,2N) using the usual Stirling formula (Hint: this was already suggested as an Exercise in Lecture 5)). For a while, it was believed that C_\sigma should be of polynomial growth for all choices of \sigma. This was disproved by Jacob Fox, who showed that in a certain precise sense C_\sigma is almost always of exponential growth. So N!_d really is special among pattern avoiding generalizations of the factorial function.

Math 262A: Lecture 7

We continue with the N \to \infty asymptotic analysis of N!_d, where d \in \mathbb{N} is arbitrary but fixed and N!_d is the number of permutations in the symmetric group \mathrm{S}(N) with no decreasing subsequence of length d+1.

At the end of Lecture 6, we computed the N \to \infty asymptotics of the dimension of a diagram in \mathbf{Y}_d(dN) fluctuating around the canonical self-complementary diagram R(d,N) (relative to the twice-larger rectangle R(d,2N), whose dimension we expect to asymptotically coincide with (dN)!_d) on a square root scale. Recall that

\Omega_{d-1} = \{(y_1,\dots,y_d) \in \mathbb{R}^d \colon y_1> \dots>y_d,\ y_1+\dots+y_d=0\}.

Theorem 1: For any (y_1,\dots,y_d) \in \Omega_{d-1}, we have

\lim\limits_{N \to \infty} C_d(N) \dim(N+y_1\sqrt{N},\dots,N+y_d\sqrt{N}) = e^{-S(y_1,\dots,y_d)},

where C_d(N) is

C_d(N) = (2\pi)^{\frac{d}{2}} \frac{N^{dN+\frac{d(d+1)}{4}}}{(dN)!e^{dN}},

and S is

S(y_1,\dots,y_d) = \frac{1}{2} \sum\limits_{i=1}^d y_i^2 - \sum\limits_{1 \leq i < j \leq d} \log(y_i-y_j).

At the end of Lecture 5, we indicated that the action S is an important special function which merits further discussion; we will have that discussion now.

What Theorem 1 reveals is the remarkable fact that, as N \to \infty, the dimension of a Young diagram \lambda \in \mathbf{Y}_d(dN) fluctuates on a square root scale, and the fluctuations themselves behave like a 2D Coulomb gas of d identical point charges on a wire at inverse temperature \beta=1, confined by a harmonic potential. Let us unpack this statement.

First, the Weyl chamber

\mathbb{W}^d = \{(y_1,\dots,y_d) \in \mathbb{R}^d \colon y_1> \dots>y_d\}

is the configuration space of d hard particles on a wire, labeled from right to left. If S(y_1,\dots,y_d) is the potential energy of this system of particles when it is in a given configuration, then the formalism of statistical mechanics says that the density which determines the probability to observe the system in a given configuration is

\frac{1}{Z_d(\beta)} e^{-\beta S(y_1,\dots,y_d)},

where \beta is the inverse temperature and

Z_d(\beta) = \int_{\mathbb{W}^d} e^{-\beta S(y_1,\dots,y_d)} \mathrm{d}y_1 \dots \mathrm{d}y_d

is the partition function which ensures that this is the density of a probability measure. The partition function corresponding to this particular S is a famous integral called Mehta’s integral, and it can be computed exactly using the Selberg integral formula.

What kind of a system has energy functional S as in Theorem 1? According to Coulomb’s law, the repulsive force F between two identical point charges in \mathbb{R}^n is inversely proportional to the distance between them raised to the power (n-1). So in our usual n=3 world, F has an inverse square law, but in an n=2 world the repulsion is proportional to the reciprocal of the distance. Since energy is the integral of force, this accounts for the logarithmic terms in S — they represent the electrostatic repulsion between d identical two dimensional point charges confined to a wire. Left to their own devices, these charges would drift infinitely far away from each other on the unbounded wire \mathbb{R}. The quadratic part of S has the meaning of a harmonic potential which attracts each point charge even as they repel one another, leading to the existence of a non-trivial ground state, the minimizer of S.

What configuration minimizes S? This classical problem in electrostatics was solved by Stieltjes, and the answer is: the roots of the dth Hermite polynomial, aka the matching polynomial of the complete graph on d vertices. Thus fluctuations of Young diagrams have something to do with the roots of Hermite polynomials. On the other hand, the Hermite polynomials themselves are orthogonal polynomials with respect to the Gaussian density, so fluctuations of Young diagrams should somehow be linked to Gaussian objects of some kind. In fact, the right objects are Gaussian random matrices.
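In this normalization of S the minimizer is precisely the root set of the physicists’ Hermite polynomial H_d: the stationarity condition y_k = \sum_{j \neq k} \frac{1}{y_k-y_j} is the classical identity satisfied by Hermite roots. A quick numerical check (my own sketch, using scipy):

```python
import numpy as np
from scipy.optimize import minimize

d = 6
pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]

def S(y):
    """Coulomb gas energy: harmonic confinement plus 2D log-repulsion."""
    diffs = np.array([y[i] - y[j] for i, j in pairs])
    if np.any(diffs <= 0):          # leaving the Weyl chamber: infinite energy
        return np.inf
    return 0.5 * np.dot(y, y) - np.sum(np.log(diffs))

# minimize over ordered configurations, starting from a spread-out guess
y0 = np.linspace(2.5, -2.5, d)
res = minimize(S, y0, method='Nelder-Mead',
               options={'maxiter': 50000, 'xatol': 1e-10, 'fatol': 1e-12})

# Stieltjes: the minimizer agrees with the roots of the d-th Hermite polynomial
print(np.sort(res.x))
print(np.sort(np.polynomial.hermite.hermroots([0] * d + [1])))
```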

Consider the real vector space of d \times d symmetric matrices. This is a Euclidean space when equipped with the standard scalar product \langle X,Y \rangle = \mathrm{Tr} XY, i.e. the scalar product with respect to which the matrices E_{ii} together with \frac{1}{\sqrt{2}}(E_{ij}+E_{ji}), i<j, form an orthonormal basis. In particular, one can define the standard Gaussian measure on this Euclidean space by stipulating its density, e^{-\frac{1}{2} \mathrm{Tr} X^2}. A random matrix with this distribution will have Gaussian random variables as its entries, and these variables are independent up to the symmetry constraint, as you can see by canonically identifying such a random matrix with a \frac{d(d+1)}{2} dimensional random vector and taking its Laplace transform. However, the eigenvalues x_1 > \dots > x_d of a d \times d Gaussian random symmetric matrix are neither Gaussian nor independent: the distribution of the random vector (x_1,\dots,x_d) in \mathbb{W}^d has density proportional to

e^{-S(x_1,\dots,x_d)},

where S is just as in Theorem 1. This is essentially due to a classical result of Hermann Weyl which says that the Euclidean volume of the set of d \times d symmetric matrices with given eigenvalues x_1 \geq \dots \geq x_d is proportional to the Vandermonde determinant in these eigenvalues,

\det [x_j^{d-i}]_{1 \leq i,j \leq d} = \prod\limits_{1 \leq i<j \leq d} (x_i-x_j).

In particular, the volume of the set of symmetric matrices with two or more equal eigenvalues is zero, as it should be, since one can see from the spectral theorem that this is not a full-dimensional subset of the space of symmetric matrices.

Theorem 1 is thus saying that Young diagram fluctuations behave like the eigenvalues y_1>\dots > y_d of the traceless random matrix Y_d = X_d - \frac{1}{d}(\mathrm{Tr}X_d)I. This surprising coincidence of Young diagram formulas and random matrix formulas is just the tip of a huge iceberg linking the two subjects, the culmination of which is the Baik-Deift-Johansson Theorem linking the fluctuations of the length of the longest increasing subsequence in a large random permutation with the eigenvalues of random matrices.

Concerning random matrices, the Coulomb gas with Boltzmann weight e^{-\beta S(x_1,\dots,x_d)} can be interpreted as the spectrum of a d \times d random matrix for all values of the inverse temperature \beta. For \beta=2,4, the relationship is as above: one takes Gaussian complex selfadjoint matrices to get the \beta=2 case, and Gaussian quaternion selfadjoint matrices to get the \beta =4 case. For the other values of \beta, the random matrix realization of the 2D Coulomb gas confined to a wire is more subtle, and was found by our colleague Ioana Dumitriu together with Alan Edelman.

The above was pretty sketchy but also hopefully somewhat interesting. This being a topics course, you may want to take the time to run down some of the details (you can always ask me for further explanation, or pointers to the literature).

In our next lecture, we will complete the derivation of the first-order asymptotics of N!_d. For now, you might want to ask yourself: how is this related to the symmetry

S(y_1,\dots,y_d) = S(-y_d,\dots,-y_1)

of the energy of the Coulomb gas, and why is this symmetry physically obvious? In the next lecture, I will also tell you a bit about the Stanley-Wilf Conjecture (now a theorem of Marcus and Tardos), which is related to the growth of an even more general factorial function N!_\sigma which counts the number of permutations in \mathrm{S}(N) which avoid a fixed permutation \sigma \in \mathrm{S}(d).

Math 262A: Lecture 6

In this Lecture we continue the N \to \infty asymptotic analysis of N!_d with d \in \mathbb{N} arbitrary but fixed. Recall that N!_d was defined in Lecture 5 as the number of permutations in \mathrm{S}(N) which have no decreasing subsequence of length d+1.

Our starting point is the “gluing” trick we used to recover Knuth’s formula

N!_2 = \frac{1}{N+1}{2N \choose N}.

Based on this, we can hope that the asymptotics of N!_d might coincide with the asymptotics of

\dim R(d, \frac{2N}{d}),

where R(a,b) is the a \times b rectangular Young diagram. This would be excellent, since the asymptotics of the dimension of a very long rectangle of fixed height are easy to deduce from Frobenius’s formula for \dim \lambda together with Stirling’s approximation of N!.

Conjecture 1: For any fixed d \in \mathbb{N}, we have

(dN)!_d =\dim R(d,2N)+o\left(\dim R(d,2N)\right)

as N \to \infty.

To begin, we write

(dN)!_d = \sum\limits_{\lambda \in \mathrm{Y}_d(dN)} (\dim \lambda)^2,

which is just RSK. We want to compare this sum to \dim R(d,2N) using our “gluing” intuition. To do this, let us define the notion of the complement \mu^* of a diagram \mu \subseteq R(d,2N). This is exactly what you think it is: if \mu=(\mu_1,\dots,\mu_d), then \mu^*=(2N-\mu_d,\dots,2N-\mu_1) — draw yourself a picture. The basis of our gluing trick was that in the case d=2, every diagram is self complementary: for \mu \in \mathrm{Y}_2(2N), we have

\mu=(\mu_1,\mu_2) = (\mu_1, 2N-\mu_1)

and

\mu^*=(2N-(2N-\mu_1),2N-\mu_1) = (\mu_1,2N-\mu_1).

This is not necessarily true for d > 2, but for any d we certainly have that the set of standard Young tableaux (SYT) of shape R(d,2N) is in bijection with the set of pairs (P,Q) of SYT in which the shape of P is a diagram \mu \in \mathrm{Y}_d(dN) and the shape of Q is the complementary diagram \mu^*. The bijection is obtained simply by splitting any SYT of shape R(d,2N) in half according to the decomposition

\{1,\dots,2dN\} = \{1,\dots,dN\} \sqcup \{dN+1,\dots,2dN\}

of its entries, rotating the half carrying the entries \{dN+1,\dots,2dN\}, and subtracting dN from each entry. In formulas, this is

\dim R(d,2N) = \sum\limits_{\mu \in \mathrm{Y}_d(dN)} (\dim \mu)(\dim \mu^*).

We can now compare (dN)!_d and \dim R(d,2N) by comparing the above sums. The RSK sum formula for (dN)!_d can be decomposed according to whether \lambda \in \mathrm{Y}_d(dN) is contained in R(d,2N) or not,

(dN)!_d = \sum\limits_{\substack{\lambda \in \mathrm{Y}_d(dN) \\ \lambda \subseteq R(d,2N)}} (\dim \lambda)^2 + \sum\limits_{\substack{\nu \in \mathrm{Y}_d(dN) \\ \nu_1 > 2N}} (\dim \nu)^2,

and the first group of terms can be further decomposed into two sums according to whether \lambda \subseteq R(d,2N) is self-complementary or not,

(dN)!_d = \sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu = \mu^*}} (\dim \mu)^2 +\sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu)^2 + \sum\limits_{\substack{\nu \in \mathrm{Y}_d(dN) \\ \nu_1 > 2N}} (\dim \nu)^2.

Similarly, we have

\dim R(d,2N) = \sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu = \mu^*}} (\dim \mu)^2 +\sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu)(\dim \mu^*).

Now use the second formula to substitute for the first group of terms in the first formula, obtaining

(dN)!_d = \dim R(d,2N)+\sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu)^2 - \sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu)(\dim \mu^*)+ \sum\limits_{\substack{\nu \in \mathrm{Y}_d(dN) \\ \nu_1 > 2N}} (\dim \nu)^2.

Observing that

\sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu)^2 = \sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu^*)^2,

we complete the square to obtain the following.

Proposition 1: We have (dN)!_d = \dim R(d,2N) + E_d(N), where the error term is

E_d(N) = \frac{1}{2} \sum\limits_{\substack{\mu \in \mathrm{Y}_d(dN) \\ \mu \subseteq R(d,2N) \\ \mu \neq \mu^*}} (\dim \mu-\dim \mu^*)^2 + \sum\limits_{\substack{\nu \in \mathrm{Y}_d(dN) \\ \nu_1 > 2N}} (\dim \nu)^2.

The error term E_d(N) reflects the two sources of error in the approximation

(dN)!_d \approx \dim R(d,2N),

namely that the right hand side fails to account for diagrams which aren’t self-complementary relative to R(d,2N), or which don’t fit inside R(d,2N). In particular, we have that E_2(N)=0, corresponding to Knuth’s theorem, but E_d(N)>0 for d>2. Conjecture 1 says that E_d(N) is negligible for large N.
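Everything in Proposition 1 is exactly computable for small parameters, so the conjecture can be probed directly. The Python sketch below (my own illustration, not part of the lecture) computes dimensions via the hook length formula, verifies the gluing identity for \dim R(d,2N), and prints the relative size of the error term:

```python
from math import factorial

def dim_shape(shape):
    """Number of standard Young tableaux of the given shape (hook lengths)."""
    shape = [r for r in shape if r > 0]
    if not shape:
        return 1
    conj = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            hooks *= (row - j) + (conj[j] - i) - 1
    return factorial(sum(shape)) // hooks

def partitions(n, rows, bound):
    """Partitions of n with at most `rows` parts, each part <= bound."""
    if n == 0:
        yield ()
        return
    if rows == 0:
        return
    for first in range(min(n, bound), 0, -1):
        for rest in partitions(n - first, rows - 1, first):
            yield (first,) + rest

d = 3
for N in (2, 4, 6):
    shapes = list(partitions(d * N, d, d * N))
    dNfact_d = sum(dim_shape(lam)**2 for lam in shapes)   # (dN)!_d via RSK
    rect = dim_shape([2 * N] * d)                         # dim R(d, 2N)
    # the gluing identity dim R(d,2N) = sum over mu of dim(mu) dim(mu*)
    glued = sum(dim_shape(lam)
                * dim_shape([2 * N - x
                             for x in reversed(list(lam) + [0] * (d - len(lam)))])
                for lam in shapes if lam[0] <= 2 * N)
    assert glued == rect
    print(N, "E_d(N)/dim R(d,2N) =", (dNfact_d - rect) / rect)
```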

If E_d(N) is in fact negligible, it must be the case that the main contributions to the sum expressing (dN)!_d come from diagrams \lambda \in \mathrm{Y}_d(dN) which are contained in R(d,2N), and are moreover self-complementary relative to this rectangle. The simplest diagram with these features is of course half of R(d,2N), i.e. the rectangle R(d,N). So we are hopeful that diagrams in a neighborhood of this rectangle supply the dominant contributions, just like the neighborhood of the maximum does in Laplace’s method.

To investigate the above, we want to consider the large N behavior of the functional

f_N(y_1,\dots,y_d;\varepsilon) =\dim (N+y_1N^\varepsilon, \dots, N+y_dN^\varepsilon),

where \varepsilon \in (0,1) and (y_1,\dots,y_d) \in \mathbb{R}^d is a vector such that

y_1 > \dots > y_d \quad\text{ and }\quad y_1+\dots+y_d=0.

The set \Omega_{d-1} of such vectors (y_1,\dots,y_d) is a (d-1)-dimensional convex subset of \mathbb{R}^d, which is in fact a hyperplane slice of the type A Weyl chamber in \mathbb{R}^d (you can safely ignore these words if you don’t know what they mean).

Note that we are evaluating \dim at possibly non-integral numbers, so we are in effect taking Frobenius’s formula as a way to extend the domain of \dim, just like Euler’s integral extends the domain of N! to continuous arguments (and actually we’re using Euler’s integral inside the Frobenius formula because it contains factorials). The functional f_N is capturing the behavior of the dimension function in a neighborhood of the suspected maximizer, which to compare with Laplace’s method is like looking at the Taylor expansion of the e^{-NS(x)} in a neighborhood of its maximum.

Let’s see how f_N(\cdot;\varepsilon) behaves as N \to \infty. We have

f_N(y_1,\dots,y_d;\varepsilon) = \frac{(dN)!}{\prod_{i=1}^d(N+y_iN^\varepsilon + d-i)!}\prod\limits_{1 \leq i<j \leq d} \left((y_i-y_j)N^{\varepsilon} + j-i \right),

where x! = \int_0^\infty t^x e^{-t} \mathrm{d}t as we have discussed. We now have to think about the N \to \infty behavior of the various pieces of this formula. The easiest piece is (dN)!, for which we have

(dN)! \sim \sqrt{2\pi} \frac{(dN)^{dN+\frac{1}{2}}}{e^{dN}}

by Stirling’s approximation. Now consider the product in the denominator,

\prod\limits_{i=1}^d(N+y_iN^\varepsilon + d-i)!.

For each factor in this product, we have that

(N+y_iN^\varepsilon + d-i)! \sim N^{d-i+1} (N+y_iN^\varepsilon-1)!

by the usual recurrence x!=x(x-1)! for factorials, which follows from Euler’s integral. Thus

\prod\limits_{i=1}^d(N+y_iN^\varepsilon + d-i)! \sim N^{1+2+\dots+d}\prod\limits_{i=1}^d(N+y_iN^\varepsilon-1)! = N^{\frac{d(d+1)}{2}}\prod\limits_{i=1}^d(N+y_iN^\varepsilon-1)!.

To estimate the remaining simplified product, let’s take a logarithm; this will convert a product of big numbers into a sum of smaller numbers, which ought to be easier to understand. We obtain

\log \prod\limits_{i=1}^d(N+y_iN^\varepsilon-1)! = \sum\limits_{i=1}^d \log (N+y_iN^\varepsilon-1)!

Now using Stirling’s approximation in the form

\log (N-1)! \sim \frac{1}{2}\log 2\pi + (N-\frac{1}{2})\log N-N,

the asymptotic form of the above sum is

\sum\limits_{i=1}^d \log (N+y_iN^\varepsilon-1)!\sim \frac{d}{2} \log 2\pi -dN + \sum\limits_{i=1}^d \left(N+y_iN^\varepsilon - \frac{1}{2} \right)\log \left( N+y_iN^\varepsilon\right).

Nothing for it but to keep hammering away. We have

\sum\limits_{i=1}^d \left(N+y_iN^\varepsilon - \frac{1}{2} \right)\log \left( N+y_iN^\varepsilon\right) = dN\log N - \frac{d}{2} \log N + \sum\limits_{i=1}^d \left(N+y_iN^\varepsilon - \frac{1}{2} \right)\log \left( 1+y_iN^{\varepsilon-1}\right),

and the purpose of this step was to allow us to use the fact that \varepsilon <1 together with the Maclaurin series of \log(1+x) to write

\log \left( 1+y_iN^{\varepsilon-1}\right) = y_iN^{\varepsilon-1} - \frac{1}{2} y_i^2N^{2\varepsilon-2} + O(N^{3\varepsilon-3}),

which finally gets us to

\sum\limits_{i=1}^d \left(N+y_iN^\varepsilon - \frac{1}{2} \right)\log \left( 1+y_iN^{\varepsilon-1}\right) \sim \frac{N^{2\varepsilon-1}}{2} \sum\limits_{i=1}^d y_i^2,

where, crucially, we used the fact that y_1+\dots+y_d=0. Notice that \varepsilon = \frac{1}{2} is emerging as the right scaling here — do you see why? Assembling all these calculations into one master estimate, we obtain

\frac{(dN)!}{\prod_{i=1}^d(N+y_iN^\varepsilon + d-i)!} \sim \frac{(dN)!\, e^{dN}}{(2\pi)^{\frac{d}{2}}N^{dN+\frac{d^2}{2}}} e^{-\frac{N^{2\varepsilon-1}}{2} \sum_{i=1}^d y_i^2}.

There is one more piece of Frobenius’s formula which we still have to estimate, but this one is easy:

\prod\limits_{1 \leq i<j \leq d} \left((y_i-y_j)N^{\varepsilon} + j-i \right) \sim N^{\varepsilon \frac{d(d-1)}{2}} \prod\limits_{1 \leq i<j \leq d} (y_i-y_j).

At the end of the day (which it is), take \varepsilon = \frac{1}{2}: then N^{2\varepsilon-1}=1, the Vandermonde estimate contributes N^{\frac{d(d-1)}{4}}\prod_{i<j}(y_i-y_j), and writing this product as e^{\sum_{i<j} \log(y_i-y_j)} and multiplying the last two estimates together, we have the following.

Theorem 1: For any fixed d \in \mathbb{N} and any (y_1,\dots,y_d) \in \Omega_{d-1}, we have

\dim\left(N+y_1\sqrt{N},\dots,N+y_d\sqrt{N} \right) \sim \frac{(dN)!\,e^{dN}}{(2\pi)^{\frac{d}{2}} N^{dN+\frac{d(d+1)}{4}}}e^{-S(y_1,\dots,y_d)},

where

S(y_1,\dots,y_d) = \frac{1}{2} \sum\limits_{i=1}^d y_i^2 - \sum\limits_{1 \leq i<j \leq d} \log(y_i-y_j).

The action S which has emerged from this herculean computation is an extremely interesting and important special function. We will discuss this further next lecture, and complete our analogue of Stirling’s approximation for N!_d.
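If you are worried that a constant was dropped somewhere in the hammering, here is a numerical sanity check of Theorem 1 (a sketch assuming SciPy; log_dim is the log-Gamma extension of \dim from above, and the test values of d and y are arbitrary choices of mine). The printed differences of logarithms should drift to zero as N grows.

import math
from scipy.special import gammaln

def log_dim(lam):
    # Frobenius' formula with x! = Gamma(x+1), as above
    m, n = len(lam), sum(lam)
    val = gammaln(n + 1) - sum(gammaln(p + m - i + 1) for i, p in enumerate(lam, 1))
    val += sum(math.log(lam[i] - lam[j] + (j - i))
               for i in range(m) for j in range(i + 1, m))
    return val

d, y = 3, (2.0, 0.0, -2.0)  # y_1 > y_2 > y_3 and y_1 + y_2 + y_3 = 0
S = (0.5 * sum(t * t for t in y)
     - sum(math.log(y[i] - y[j]) for i in range(d) for j in range(i + 1, d)))
for N in (10**2, 10**4, 10**6):
    lam = tuple(N + t * math.sqrt(N) for t in y)
    log_rhs = (gammaln(d * N + 1) + d * N - (d / 2) * math.log(2 * math.pi)
               - (d * N + d * (d + 1) / 4) * math.log(N) - S)
    print(N, log_dim(lam) - log_rhs)  # should tend to 0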

Math 262A: Lecture 5

In this lecture and the next, we will explore the analogue of Stirling's formula for a certain generalization of N!. This generalization is based on the combinatorial interpretation of N! as the number of permutations of 1,\dots,N, i.e. as the cardinality of the symmetric group \mathrm{S}(N). A permutation \pi \in \mathrm{S}(N) is said to have an increasing subsequence of length k at positions 1 \leq i_1 < \dots < i_k \leq N if \pi(i_1) < \dots < \pi(i_k). If instead \pi(i_1) > \dots > \pi(i_k), then we say that \pi has a decreasing subsequence of length k at these positions.

Definition 1: Given positive integers d,N \in \mathbb{N}, we define N!_d to be the number of permutations in \mathrm{S}(N) that have no decreasing subsequence of length d+1.

Exercise 1: Show that if one changes the word “decreasing” to “increasing” in Definition 1, the number N!_d stays the same.

For example, in the case d=1, we have N!_1=1 for all N\in \mathbb{N}, corresponding to the fact that the identity permutation is the unique permutation with no decreasing subsequence of length 2. The case d=2 is non-trivial: computing N!_2 means counting permutations in \mathrm{S}(N) with no decreasing subsequence of length 3, and the number of these is not immediately obvious. This enumeration problem seems to have been first addressed by Donald Knuth.

Theorem (Knuth): For all N \in \mathbb{N}, we have N!_2 = \frac{1}{N+1}{2N \choose N}.

The sequence

\mathrm{Cat}_N = \frac{1}{N+1}{2N \choose N}, \quad N \in \mathbb{N}

is the sequence of Catalan numbers, and it is ubiquitous in combinatorics.

Exercise 2: Prove Knuth’s theorem.
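Before attempting a proof, it is reassuring to watch the Catalan numbers fall out of Definition 1 by brute force. A quick sketch (the function names are mine; feasible for N up to 7 or so):

from itertools import combinations, permutations

def has_decreasing(pi, k):
    # True if pi contains a decreasing subsequence of length k
    return any(all(pi[pos[t]] > pi[pos[t + 1]] for t in range(k - 1))
               for pos in combinations(range(len(pi)), k))

def restricted_factorial(N, d):
    # N!_d: permutations of 1,...,N with no decreasing subsequence of length d+1
    return sum(1 for pi in permutations(range(N)) if not has_decreasing(pi, d + 1))

print([restricted_factorial(N, 2) for N in range(1, 7)])  # [1, 2, 5, 14, 42, 132]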

Apart from the cases d=1,2, there is no simple explicit formula which gives N!_d for all values of N (of course, when N \leq d, we simply have N!_d=N!). Asymptotic algebraic combinatorics is supposed to deal with cases such as this by supplying useful asymptotic formulas, even when no exact formula is available. The problem we set ourselves is thus to find the analogue of Stirling’s formula for this combinatorial generalization of the factorial:

Problem 1: Given an arbitrary but fixed d \in \mathbb{N}, give an N \to \infty asymptotic approximation to N!_d.

Exercise 3: Combine Knuth's theorem and Stirling's formula to solve Problem 1 in the case d=2.

Problem 1 was first solved by Amitai Regev in this paper, and it uses some surprisingly sophisticated mathematics, namely something called Mehta’s integral. A simpler proof was given thirty years later by yours truly in this paper, and this is the argument we will present. The starting point of both arguments is the same, namely a certain bijection called the Robinson-Schensted-Knuth correspondence.

The RSK correspondence is a bijection between \mathrm{S}(N) and pairs of standard Young tableaux of the same shape. Young diagrams and Young tableaux are objects of fundamental importance in algebraic combinatorics. Given a natural number N, a partition of N is a decomposition of N into a sum of natural numbers (the “parts” of the partition) which are less than or equal to N, without regard for the order of the summands. For example, the partitions of N=3 are

3,\ 2+1,\ 1+1+1.

Notice that ignoring the order of the summands is the same thing as picking a particular order. There are two canonical choices: write the parts in weakly decreasing order as above (English convention), or in weakly increasing order (French convention). We are using the English convention.

Let \mathbf{Y}(N) denote the set of partitions of N, and set

\mathbf{Y} = \bigsqcup_{N =1}^\infty \mathbf{Y}(N).

The set \mathbf{Y} is called Young's lattice, because it carries a partial order introduced by Alfred Young which makes it into a lattice. This partial order emerges when we graphically encode partitions as Young diagrams: this means that we write the first part \lambda_1 of a given partition \lambda as a row of \lambda_1 cells, then the second part \lambda_2 as a second row of cells under the first row, etc. (it's probably best if you just look at a picture). The partial order on \mathbf{Y} is then just containment of diagrams: we define \mu \leq \lambda if and only if \mu fits inside \lambda (again, it's probably best if you just look at a picture). From now on, we identify partitions with their Young diagrams.

Let \lambda \in \mathbf{Y}(N) be a Young diagram with N cells. A standard Young tableau of shape \lambda is a walk in Young's lattice which starts at the diagram \Box, ends at \lambda, and in which each step consists of adding a single cell. One can encode such a walk by writing the numbers 1,\dots,N in the cells of \lambda corresponding to the order in which they were added; by construction, in any such filling the numbers will be increasing along rows and columns (again, best to just look at a picture). The number of standard Young tableaux of shape \lambda is called the dimension of \lambda and denoted \dim \lambda.

Theorem 2 (RSK correspondence): There is a bijective correspondence between permutations \pi \in \mathrm{S}(N) and pairs (P,Q) of standard Young tableaux of the same shape \lambda \in \mathbf{Y}(N). This correspondence has the following properties: if \pi corresponds to (P,Q), then \pi^{-1} corresponds to (Q,P); if \pi corresponds to (P,Q), then the maximal length of a decreasing subsequence in \pi is the number of rows in the common shape \lambda of P and Q, and the maximal length of an increasing subsequence in \pi is the number of columns in \lambda.

For a Young diagram \lambda, let \ell(\lambda) denote the number of rows in \lambda. The RSK correspondence gives us the following formula.

Corollary 1: We have

N!_d = \sum\limits_{\substack{ \lambda \in \mathbf{Y}(N) \\ \ell(\lambda) \leq d}} (\dim \lambda)^2.

Corollary 1 will play the same role in our asymptotic analysis of N!_d that Euler’s integral

N! = \int_0^\infty x^N e^{-x} \mathrm{d}x

played in the asymptotic analysis of N!. Although Corollary 1 gives us a sum representation of N!_d rather than an integral representation, we shall see that this sum may naturally be viewed as a Riemann sum for a certain (multidimensional) integral. This is enabled by the following explicit formula for the dimension of a given Young diagram.

Theorem 3 (Frobenius’ formula): For any \lambda \in \mathbf{Y}(N) and any m \geq \ell(\lambda), we have

\dim \lambda = \frac{N!}{\prod_{i=1}^m (m-i + \lambda_i)!} \prod\limits_{1 \leq i < j \leq m}(\lambda_i - \lambda_j + j-i).

Combining Corollary 1 and Theorem 3 gives us a complicated but explicit formula for N!_d. It is not clear that this is helpful, since even in the case d=2 it seems non-trivial to evaluate this sum to get the Catalan number \mathrm{Cat}_N (try it, or at least try writing down this sum). However, there is another way to look at things: before we start plugging in complicated products and trying to figure out what to do with them, we can step back and consider the symmetries in the problem.
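Complicated or not, the formula is at least machine-checkable: summing (\dim \lambda)^2 over diagrams with at most d rows, with \dim computed from Frobenius' formula in exact integer arithmetic, reproduces the brute-force counts above. A sketch (helper names mine):

import math

def partitions(n, max_parts, max_part=None):
    # Young diagrams of n with at most max_parts rows, parts weakly decreasing
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield (first,) + rest

def dim_frobenius(lam):
    # Frobenius' formula; exact, since dim is an integer
    m, n = len(lam), sum(lam)
    den = math.prod(math.factorial(p + m - i) for i, p in enumerate(lam, 1))
    vand = math.prod(lam[i] - lam[j] + (j - i)
                     for i in range(m) for j in range(i + 1, m))
    return math.factorial(n) * vand // den

def restricted_factorial_rsk(N, d):
    return sum(dim_frobenius(lam) ** 2 for lam in partitions(N, d))

print([restricted_factorial_rsk(N, 2) for N in range(1, 7)])  # Catalan numbers again
print([restricted_factorial_rsk(N, 3) for N in range(1, 7)])  # [1, 2, 6, 23, 103, 513]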

Here is a “symmetry first” approach to obtaining Knuth’s formula for N!_2 using the RSK correspondence in conjunction with Frobenius’ formula. Let \pi \in \mathrm{S}(N) be a permutation with no decreasing subsequence of length 3, and let (P,Q) be the pair of standard Young tableaux associated to \pi by RSK. The diagram \lambda \in \mathbf{Y}(N) on which both P,Q are built has N cells and at most 2 rows. Take the tableau Q, and replace each of its entries i with N+i so that the entries of this augmented Q are the numbers N+1,\dots,2N. Now take this augmented Q, rotate it clockwise through 180^\circ, and glue it to P (you should probably draw yourself a picture). You are now looking at a standard Young tableau of shape R(2,N), where R(a,b) denotes the rectangular Young diagram with a rows, each of length b. This process is clearly reversible (make sure you understand both why and how), and so we have a bijection between permutations in \mathrm{S}(N) with no decreasing subsequence of length 3 and standard Young tableaux of shape R(2,N). It seems like it shouldn’t be too hard to do the single computation of evaluating \dim R(2,N) using Frobenius’ formula, and presumably the result of this computation will be \mathrm{Cat}_N. Before we check this, observe that even before applying the Frobenius formula we have discovered an interesting peculiarity of the universe.

Theorem 4: For any N \in \mathbb{N}, the number of permutations in \mathrm{S}(N) with no decreasing subsequence of length 3 is equal to the number of involutions in \mathrm{S}(2N) whose longest decreasing subsequence has length exactly 2 and whose longest increasing subsequence has length exactly N.

Proof: This follows from the first property of RSK, i.e. the one about switching P and Q (think about it). — Q.E.D.
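Theorem 4 is a fun one to verify by brute force while you think about it; here is a quick sketch (the longest decreasing/increasing subsequence lengths are computed by the textbook quadratic dynamic program; function names are mine):

from itertools import permutations

def longest_monotone(pi, decreasing):
    # O(n^2) dynamic program for the longest monotone subsequence of pi
    best = [1] * len(pi)
    for j in range(len(pi)):
        for i in range(j):
            if (pi[i] > pi[j]) == decreasing:
                best[j] = max(best[j], best[i] + 1)
    return max(best)

def involution_count(N):
    # involutions in S(2N) with longest decreasing subsequence exactly 2
    # and longest increasing subsequence exactly N
    return sum(1 for p in permutations(range(2 * N))
               if all(p[p[i]] == i for i in range(2 * N))
               and longest_monotone(p, True) == 2
               and longest_monotone(p, False) == N)

print([involution_count(N) for N in (2, 3)])  # [2, 5] = Cat_2, Cat_3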

Now, combining the above trick with the Frobenius formula, we get

N!_2 = \dim R(2,N) = \frac{(2N)!}{(N-1+2)!(N-2+2)!} (N-N+2-1) = \frac{(2N)!}{(N+1)!N!} = \mathrm{Cat}_N,

so we’ve recovered Knuth’s theorem in quite an appealing way.

Let’s not get too excited — the above trick is not going to work for evaluating N!_d when d > 2. Indeed, a Young diagram with 3 rows need not be self-complementary, in that one can’t necessarily glue it to a rotated copy of the same diagram to get a rectangle (draw yourself a counterexample). But perhaps this is true most of the time — that is, maybe for large N most diagrams with N cells and at most a fixed number d of rows are indeed self-complementary. This sort of thing is very much the spirit of asymptotic algebraic combinatorics, and in fact something along these lines turns out to be true in this case. We’ll see why next lecture.

Exercise 4: Using the Frobenius and Stirling formulas, show that for d fixed and q \to \infty we have

\dim R(d,q) \sim (2\pi)^{\frac{1-d}{2}}\left( \prod_{k=1}^{d-1} k! \right) d^{dq+\frac{1}{2}} q^{\frac{1-d^2}{2}}.

Use this computation to conjecture an N \to \infty asymptotic formula for N!_d.
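If you would like to check your asymptotic formula numerically before conjecturing, the rectangle dimensions are available exactly, and Python's integers (and math.log, which happily eats huge integers) have no trouble with the sizes involved. A sketch, with the Frobenius evaluation restated so the snippet stands alone:

import math

def dim_frobenius(lam):
    m, n = len(lam), sum(lam)
    den = math.prod(math.factorial(p + m - i) for i, p in enumerate(lam, 1))
    vand = math.prod(lam[i] - lam[j] + (j - i)
                     for i in range(m) for j in range(i + 1, m))
    return math.factorial(n) * vand // den

d = 3
for q in (50, 200, 800):
    log_exact = math.log(dim_frobenius((q,) * d))
    log_guess = (((1 - d) / 2) * math.log(2 * math.pi)
                 + sum(math.log(math.factorial(k)) for k in range(1, d))
                 + (d * q + 0.5) * math.log(d) + ((1 - d * d) / 2) * math.log(q))
    print(q, log_exact - log_guess)  # should tend to 0 as q grows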

Math 262A: Lecture 4

Let’s begin by going over the computation from the end of Lecture 3, where we were trying to compute the coefficients m_d(N) in the expansion

e^{\sum_{d=1}^\infty c_d(N) \frac{z^d}{d!}} = 1 + \sum\limits_{d=1}^\infty m_d(N) \frac{z^d}{d!}

with

c_d(N) = \delta_{d \geq 3} N(-1)^{d-1} (d-1)!.

From the Exponential Formula, we get

m_d(N) = \sum\limits_{\pi \in \mathrm{P}(d)} \prod\limits_{B \in \pi} c_{|B|}(N),

so for each partition \pi of \{1,\dots,d\} the corresponding term in the sum m_d(N) is

\prod\limits_{B \in \pi} \delta_{|B| \geq 3} N(-1)^{|B|-1} (|B|-1)! = N^{b(\pi)} (-1)^{d-b(\pi)} \prod\limits_{B \in \pi} \delta_{|B| \geq 3} (|B|-1)!,

where b(\pi) is the number of blocks in \pi. The product

\prod\limits_{B \in \pi} \delta_{|B| \geq 3} (|B|-1)!

is counting the number of ways to build a cyclic permutation of length at least 3 on each block of \pi, so we conclude that

m_d(N) = \sum\limits_{k=1}^d (-1)^{d-k} N^k t_k(d)

with t_k(d) the number of permutations in the symmetric group \mathrm{S}(d) made up of exactly k cycles, each of length at least 3. We can write this a bit more precisely as

m_d(N) = \sum\limits_{k=1}^{\lfloor \frac{d}{3} \rfloor} (-1)^{d-k} N^k t_k(d).
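These coefficients are easy to generate by machine, which is a good way to check the bookkeeping: expand the exponential as a series in z and read off the coefficients of z^d/d!. A sketch assuming SymPy:

import sympy as sp

z, n = sp.symbols('z N')
order = 9
tail = sum(n * (-1) ** (d - 1) * sp.factorial(d - 1) * z ** d / sp.factorial(d)
           for d in range(3, order + 1))
expansion = sp.series(sp.exp(tail), z, 0, order + 1).removeO()
for d in range(3, order + 1):
    print(d, sp.expand(expansion.coeff(z, d) * sp.factorial(d)))
# e.g. m_3(N) = 2N and m_6(N) = 40N^2 - 120N, matching the cycle counts t_k(d)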

It remains to solve Problem 2 from Lecture 3, which was to evaluate the integral

\int\limits_{-\varepsilon}^{+\varepsilon} x^d e^{-\frac{N}{2}x^2} \mathrm{d}x.

Unlike Problem 1, Problem 2 cannot be solved exactly. In the case d=0, this was discussed in Lecture 2: we can approximate

\int\limits_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}x^2} \mathrm{d}x = \int\limits_{-\infty}^{+\infty} e^{-\frac{N}{2}x^2} \mathrm{d}x + E_0(\varepsilon,N)

with E_0(\varepsilon,N) an exponentially small error, and then evaluate the Gaussian integral

\int\limits_{-\infty}^{+\infty} e^{-\frac{N}{2}x^2} \mathrm{d}x = \sqrt{\frac{2\pi}{N}}.

For any fixed d \in \mathbb{N} we have an analogous approximation

\int\limits_{-\varepsilon}^{+\varepsilon} x^d e^{-\frac{N}{2}x^2} \mathrm{d}x = \int\limits_{-\infty}^{+\infty}x^d e^{-\frac{N}{2}x^2} \mathrm{d}x +E_d(\varepsilon,N),

with E_d(\varepsilon,N) an error term that remains to be quantified precisely but is clearly rapidly decaying given the exponentially small tails of the Gaussian density. We now evaluate the total integral

\int\limits_{-\infty}^{+\infty}x^d e^{-\frac{N}{2}x^2} \mathrm{d}x.

In fact, we will evaluate all such integrals simultaneously by looking instead at the integral

L(z) = \int\limits_{-\infty}^{+\infty} e^{-\frac{N}{2}x^2+xz} \mathrm{d}x,

which is the Laplace transform of the Gaussian density. On one hand, observe that the integral converges to define an entire function of z \in \mathbb{C} whose derivatives at z=0 can be computed by differentiating under the integral sign, and are exactly the integrals we’re trying to evaluate:

\bigg[\frac{\mathrm{d}}{\mathrm{d}z}\bigg]^d L(z) \bigg|_{z=0}=\int\limits_{-\infty}^{+\infty}x^d e^{-\frac{N}{2}x^2} \mathrm{d}x.

(Differentiation under the integral sign is our first contact with the ghost of Richard Feynman, a congenial specter that will continue to haunt us). On the other hand, we can compute L(z) exactly by completing the square: since

-\frac{N}{2}x^2+xz = -\frac{N}{2}\left( x^2 -\frac{2}{N}xz + \frac{1}{N^2}z^2\right) + \frac{1}{2N}z^2 = -\frac{N}{2}\left( x-\frac{1}{N}z\right)^2 + \frac{1}{2N}z^2,

we obtain

L(z) = e^{\frac{1}{2N}z^2} \int\limits_{-\infty}^{+\infty} e^{-\frac{N}{2}\left(x-\frac{1}{N}z\right)^2} \mathrm{d}x = e^{\frac{1}{2N}z^2} \int\limits_{-\infty}^{+\infty} e^{-\frac{N}{2}x^2} \mathrm{d}x,

by translation invariance of Lebesgue measure, and hence

L(z) = \sqrt{\frac{2\pi}{N}} e^{\frac{1}{2N}z^2}.

This is an important feature of the Gaussian density: up to minor details, it is a fixed point of the Laplace transform. In particular, this makes the Maclaurin series of L(z) easy to compute directly:

L(z) = \sqrt{\frac{2\pi}{N}}\sum\limits_{k=0}^\infty \frac{(2k)!}{N^k 2^k k!} \frac{z^{2k}}{(2k)!}.

Taking a closer look at the nonzero coefficients of this series, we observe that

\frac{(2k)!}{2^k k!} = \frac{(2k)(2k-1)(2(k-1))(2k-3) \dots 1}{2^k k!} = (2k-1)(2k-3) \dots 1,

the product of all odd numbers below 2k. Thus in terms of the double factorial n!!, which is the product of all positive integers up to n having the same parity as n (and should not be confused with the much larger number (n!)!), we have

L(z) = \sqrt{\frac{2\pi}{N}}\sum\limits_{k=0}^\infty N^{-k} (2k-1)!!\frac{z^{2k}}{(2k)!}.

We thus obtain the integral formula

\int\limits_{-\infty}^{+\infty}x^d e^{-\frac{N}{2}x^2} \mathrm{d}x = \sqrt{\frac{2\pi}{N}}\begin{cases} 0, \text{ if } d \text{ odd } \\ N^{-\frac{d}{2}} (d-1)!!, \text{ if } d \text{ even.} \end{cases}

Exercise 1: Show that for d even, (d-1)!! is the number of permutations in \mathrm{S}(d) all of whose cycles have length exactly 2, i.e. the number of fixed point free involutions.
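Both the moment formula and the exercise are easy to test numerically; here is a sketch (assuming SciPy for the quadrature, and taking N=1 in the moment formula; function names are mine):

import math
from itertools import permutations
from scipy.integrate import quad

def double_factorial(n):
    # n!! = n(n-2)(n-4)...; empty product (= 1) for n <= 0
    return math.prod(range(n, 0, -2))

for d in range(8):
    numeric, _ = quad(lambda x, d=d: x ** d * math.exp(-x ** 2 / 2),
                      -math.inf, math.inf)
    closed = math.sqrt(2 * math.pi) * (double_factorial(d - 1) if d % 2 == 0 else 0)
    print(d, round(numeric, 8), round(closed, 8))

def fixed_point_free_involutions(d):
    # count permutations in S(d) all of whose cycles have length exactly 2
    return sum(1 for p in permutations(range(d))
               if all(p[p[i]] == i and p[i] != i for i in range(d)))

print([fixed_point_free_involutions(d) for d in (2, 4, 6)])  # [1, 3, 15] = 1!!, 3!!, 5!!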

We have finally reached the point where we can legitimately write down Stirling’s formula. Here’s a step-by-step reconstruction of the argument we’ve been building.

Step 1: Obtain the integral representation

N! = \frac{N^{N+1}}{e^N}Z_N(S),

where S(w) = w- \log(1+w) and

Z_N(S) =\int\limits_{-1}^{+\infty} e^{-N(w-\log(1+w))} \mathrm{d}w.

Step 2: Demonstrate that the integral Z_N(S) localizes in an \varepsilon-neighborhood of the minimizer of S:

Z_N(S) = Z_{N,\varepsilon}(S),

where

Z_{N,\varepsilon}(S) = \int_{-\varepsilon}^{+\varepsilon}e^{-NS(w)}\mathrm{d}w + O(e^{-c(\varepsilon)N}).

Step 3: Write the integrand of Z_{N,\varepsilon}(S) in the form

e^{-NS(w)} = e^{-\frac{N}{2}w^2} e^{\sum_{d=3}^\infty c_d(N) \frac{w^d}{d!}}.

Step 4: Expand the non-Gaussian part of the integrand,

e^{\sum_{d=3}^\infty c_d(N) \frac{w^d}{d!}}  = 1 + \sum\limits_{d=1}^\infty m_d(N) \frac{w^d}{d!},

with the coefficients m_d(N) being the “disconnected” version of the coefficients c_d(N).

Step 5: Integrate term-by-term:

Z_{N,\varepsilon}(S) = \int_{-\varepsilon}^{+\varepsilon}e^{-\frac{N}{2}w^2}\mathrm{d}w + \sum\limits_{d=1}^\infty m_d(N) \int_{-\varepsilon}^{+\varepsilon}w^de^{-\frac{N}{2}w^2}\mathrm{d}w.

Step 6: Replace each truncated Gaussian integral with a total Gaussian integral at the cost of an exponentially small error E_{d,\varepsilon}(N), and evaluate:

m_d(N) \int_{-\varepsilon}^{+\varepsilon}w^de^{-\frac{N}{2}w^2}\mathrm{d}w = [d \text{ even}]\, m_d(N) \sqrt{\frac{2\pi}{N}}\, N^{-\frac{d}{2}} (d-1)!! + E_{d,\varepsilon}(N).

The bracket in Step 6 is an Iverson bracket.

Problem 1: Carefully executing the above steps, show that we have obtained Stirling’s approximation: as N \to \infty, we have

N! = \sqrt{2\pi}\frac{N^{N+\frac{1}{2}}}{e^N}\left(1 + O(\frac{1}{N}) \right).

Problem 2: Develop a good bookkeeping system allowing the computation of subleading terms in this approximation.

Problem 3: Go through Steps 1 to 6, consider what features of S were used, and examine to what extent these features were needed.

These are the problems which we will focus on in the next lectures. In particular, Problem 2 leads to the concept of Feynman diagrams. You can start thinking about these problems now, and see what you can see.
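As a numerical teaser for Problem 2: the first subleading term is already visible in floating point, using nothing beyond the standard library. The quantity printed below settles down to a constant, which a good bookkeeping system had better predict (it is the classical 1/12 of the Stirling series).

import math

# N times the relative error of the leading Stirling approximation
for N in (10, 100, 1000, 10000):
    excess = math.lgamma(N + 1) - (0.5 * math.log(2 * math.pi)
                                   + (N + 0.5) * math.log(N) - N)
    print(N, N * math.expm1(excess))  # approaches 1/12 = 0.08333...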

Math 262A: Lecture 2

We continue with the estimation of N! for large N via Euler’s integral,

N! = \int_0^\infty t^N e^{-t} \mathrm{d}t.

As at the end of Lecture 1, we make the substitution t=Nx, thereby obtaining

N! = \int_0^\infty (Nx)^N e^{-Nx} N\mathrm{d}x = N^{N+1} \int_0^\infty e^{-N(x-\log x)} \mathrm{d}x.

Now, one way to characterize an algebraic combinatorialist is to say that such a person loathes \log x, this being some horrible transcendental thing, but loves \log(1+x), this being an exponential generating function for cyclic permutations:

\log(1+x) = \sum\limits_{d=1}^\infty (-1)^{d-1} (d-1)! \frac{x^d}{d!} = \sum\limits_{d=1}^\infty (-1)^{d-1} \frac{x^d}{d}.

Accordingly, we make the further change of variables x=1+w, which gets us to the formula

N!= \frac{N^{N+1}}{e^N} \int_{-1}^\infty e^{-N(w-\log (1+w))} \mathrm{d}w.

We now focus on the problem of determining the N \to \infty asymptotics of

I_N = \int_{-1}^\infty e^{-NS(w)} \mathrm{d}w,

where the “action” S(w) = w-\log(1+w) is a smooth function on (-1,\infty), the Maclaurin series of which is

S(w) = \frac{w^2}{2} - \frac{w^3}{3} + \frac{w^4}{4} - \dots.

Exercise 1: What is the radius of convergence of this series?

Having solved Exercise 1 (which is singularly easy), you know that the series expansion of S(w) is only valid in a neighborhood of w=0, so ostensibly is not of any help in the estimation of I_N, since the integrand involves S(w) with arbitrarily large inputs w. On the other hand, we’re trying to work out an approximation, not perform an exact evaluation, so perhaps we can get away with something local. Indeed, the shape of S indicates that this might well be the case. Since

S'(w) = 1- \frac{1}{1+w},

our action S has a unique stationary point at w=0, and the value S(0)=0 is the global minimum of S on its domain of definition, (-1,\infty).

Plot of S(w) near w=0.

This means that e^{-NS(0)}=1 is the global maximum of e^{-NS(w)}, the integrand of I_N: we have

0 < e^{-NS(w)} < 1 \quad \forall w \in (-1,\infty)-\{0\}.

In particular, for each fixed w \neq 0 the integrand of I_N is exponentially small in N. This means that the integrand of I_N is sharply peaked at the unique minimizer w=0 of the action S(w), and the larger N is, the more exaggerated this effect becomes, as in the plots below.

Plot of e^{-10S(w)} near w=0.
Plot of e^{-1000S(w)} near w=0.

Accordingly, we expect that as N \to \infty the integral I_N “localizes” at the stationary point of the action, meaning that

I_N \approx e^{-NS(0)}

might not be such a bad approximation. Let us investigate this more closely: we fix \varepsilon >0, and seek to estimate the integral

\int_{-\varepsilon}^{+\varepsilon} e^{-NS(w)} \mathrm{d}w.

Being combinatorialists, we will count the complement, meaning that we will try to estimate the above integral by instead estimating

\int_{-1}^{-\varepsilon} e^{-NS(w)} \mathrm{d}w \quad\text{ and }\quad \int_{+\varepsilon}^{+\infty} e^{-NS(w)} \mathrm{d}w.

To estimate the left integral, observe that left of zero we have

S(w) > \frac{w^2}{2},

and hence

\int_{-1}^{-\varepsilon} e^{-NS(w)} \mathrm{d}w < \int_{-1}^{-\varepsilon} e^{-\frac{N}{2}w^2} \mathrm{d}w < \int_{-\infty}^{-\varepsilon} e^{-\frac{N}{2}w^2} \mathrm{d}w,

where the rightmost integral is convergent (say, by comparison with \int_{-\infty}^{-1} e^{\frac{N}{2}w} \mathrm{d}w, using w^2 \geq -w for w \leq -1). To control the rightmost integral, we can observe that

e^{-\frac{N}{2}w^2} = e^{-\frac{N}{4}w^2}e^{-\frac{N}{4}w^2} \leq e^{-\frac{N}{4}\varepsilon^2}e^{-\frac{N}{4}w^2}

for w \in (-\infty,-\varepsilon), so that we have the bound

\int_{-1}^{-\varepsilon} e^{-NS(w)} \mathrm{d}w < C_N e^{-\frac{N}{4}\varepsilon^2},

where

C_N = \int_{-\infty}^{-\varepsilon} e^{-\frac{N}{4}w^2} \mathrm{d}w < \infty

satisfies

\lim\limits_{N \to \infty} C_N=0

by the Dominated Convergence Theorem. In particular, we can conclude that

\int_{-1}^{-\varepsilon} e^{-NS(w)} \mathrm{d}w = O(e^{-\frac{\varepsilon^2}{4} N}).

Now we estimate the \int_{+\varepsilon}^{+\infty}-integral using a similar strategy. Since S(w) is strictly increasing for w>0, we have

e^{-NS(w)} = e^{-\frac{N}{2}S(w)}e^{-\frac{N}{2}S(w)} \leq e^{-\frac{N}{2}S(\varepsilon)}e^{-\frac{N}{2}S(w)}

for w \in (\varepsilon,\infty). Consequently, we have the bound

\int_{+\varepsilon}^{+\infty} e^{-NS(w)} \mathrm{d}w < C_N(\varepsilon)e^{-\frac{N}{2}S(\varepsilon)},

where

C_N(\varepsilon) = \int_{+\varepsilon}^{+\infty} e^{-\frac{N}{2}S(w)} \mathrm{d}w <\infty

is a convergent integral, and moreover

\lim\limits_{N \to \infty} C_N(\varepsilon) = 0

by the Dominated Convergence Theorem. So we similarly have that

\int_{+\varepsilon}^\infty e^{-NS(w)} \mathrm{d}w = O(e^{-\frac{S(\varepsilon)}{2} N}).

We thus conclude that

I_N = \int_{-\varepsilon}^{+\varepsilon} e^{-NS(w)} \mathrm{d}w + \text{exponentially decaying error in }N.

Just to remind ourselves where we are vis-a-vis our principal goal of approximating N!, we have arrived at the estimate

N! = \frac{N^{N+1}}{e^N}\left( \int_{-\varepsilon}^{+\varepsilon} e^{-NS(w)} \mathrm{d}w + \text{exponentially decaying error in }N\right).
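If you want to see this localization with your own eyes, it is easy to test numerically (a sketch assuming SciPy; the ε = 1/2 and the test values of N are arbitrary choices of mine, and the printed gaps shrink exponentially fast in N):

import math
from scipy.integrate import quad

def S(w):
    return w - math.log1p(w)

eps = 0.5
for N in (20, 60, 180):
    # stay just off the endpoint w = -1, where S blows up
    full, _ = quad(lambda w: math.exp(-N * S(w)), -1 + 1e-12, math.inf, limit=200)
    local, _ = quad(lambda w: math.exp(-N * S(w)), -eps, eps)
    print(N, full - local)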

It remains to estimate the \int_{-\varepsilon}^{+\varepsilon} integral. This is ostensibly more difficult than what we did above, since it apparently requires some actual insight into the behavior of S(w). On the other hand, we are now reduced to considering the behavior of the action on a small interval where we can represent it as a series: we have that

\int_{-\varepsilon}^{+\varepsilon} e^{-NS(w)} \mathrm{d}w = \int_{-\varepsilon}^{+\varepsilon} e^{-N\sum_{d=2}^\infty (-1)^d \frac{w^d}{d}} \mathrm{d}w

for all \varepsilon sufficiently small. Now, since

S(w) = \frac{w^2}{2} + O(w^3) \quad\text{ as }w \rightarrow 0,

we expect a further approximation of the form

\int_{-\varepsilon}^{+\varepsilon} e^{-NS(w)} \mathrm{d}w \approx \int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2} \mathrm{d}w.

Of course, such an approximation is not exactly true, and we have to understand how accurate it actually is in order to proceed rigorously. For now, however, let us see what would happen if we had the exact identity

\int_{-\varepsilon}^{+\varepsilon} e^{-NS(w)} \mathrm{d}w ``="\int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2} \mathrm{d}w,

where the quotes are meant to indicate that we are considering a hypothetical truth which we know to be false, but might still be conceptually useful.

Consider the integral

\int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2} \mathrm{d}w,

which we would like to approximate. An ideal outcome would be that we could exactly evaluate it. Unfortunately, even though the integrand here doesn’t look so bad, trying to compute the indefinite integral

\int e^{-\frac{w^2}{2}} \mathrm{d}w

is a fool’s errand, being in some sense analogous to trying to solve the quintic. On the other hand, by the same sort of rough estimates we have been using so far in this lecture, we have

\int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2} \mathrm{d}w =\int_{-\infty}^{+\infty} e^{-\frac{N}{2}w^2} \mathrm{d}w + \text{exponentially decaying error in }N.

The integral

\int_\mathbb{R} e^{-x^2} \mathrm{d}x,

which is known as the Gaussian integral, can in fact be exactly evaluated — not using the Fundamental Theorem of Calculus, but by exploiting symmetry. The answer is

\int_\mathbb{R} e^{-x^2} \mathrm{d}x = \sqrt{\pi},

a remarkable formula which is the basis of various amusing proclamations, including Lord Kelvin’s assertion that “a mathematician is one to whom this is as obvious as that twice two makes four is to you.” I leave you to either try the derivation of this formula on your own, or look it up somewhere.

Changing variables in the Gaussian integral, we get the evaluation we actually need:

\int_{-\infty}^{+\infty} e^{-\frac{N}{2}w^2} \mathrm{d}w = \sqrt{\frac{2\pi}{N}}.

Following through on our thought experiment, this would imply the approximation

N! = \sqrt{2\pi}\frac{N^{N+\frac{1}{2}}}{e^N}\left(1 + \text{exponentially decaying error in }N\right).

The meat of this formula is correct, but the error term is the result of our wishful thinking, and it is false — we will correct it in Lecture 3.

Math 262A: Lecture 1

The goal of this topics course is to introduce you to asymptotic algebraic combinatorics. Each of these words is familiar individually, but when combined in this order the resulting phrase is ambiguous; indeed, looking at the list of lectures from a recent conference dedicated to asymptotic algebraic combinatorics, it is clear that it means different things to different people.

In this course, we will try to get a feeling for what asymptotic algebraic combinatorics is about by going on a quest: we are going to start at Stirling’s approximation, which might be considered the historically first result in asymptotic algebraic combinatorics, and from there forge a path to what physicists simply call “Large N,” which has something to do with quantum field theory. There will be various (hopefully edifying) digressions along the way, sometimes leading to unsolved problems — for example, I hope to discuss the pattern-restricted factorial, for which an analogue of Stirling’s formula is only sometimes known, and the Bhargava factorial, for which no analogue of Stirling’s formula is known at all. Let’s start the quest.

We want to estimate N!. Let’s start by writing down the stupidest thing we can think of — that way, things can only get better.

Proposition 1: We have

1 \leq N! \leq N^N.

Proof: By definition,

N! = \prod\limits_{x=1}^N x.

On the interval [1,N], the linear function l(x) = x has a minimum value of 1 at x=1, and a maximum value of N at x=N, so each of the N factors in the product lies between 1 and N.

— Q.E.D.

To improve on the shameful bounds in Proposition 1, we can use a multiplicative version of Gauss’s famous “replica trick” for summing consecutive numbers.

Proposition 2: We have

N^{N/2} \leq N! \leq \left(\frac{N+1}{2} \right)^N.

Proof: Take two copies of N!, multiply them together

N! \cdot N! = \prod\limits_{x=1}^Nx \cdot \prod\limits_{x=1}^N x,

write the second product backwards,

N! \cdot N! = \prod\limits_{x=1}^Nx \cdot \prod\limits_{x=1}^N (N-x+1),

and interlace the factors of the two products to get

(N!)^2 = \prod\limits_{x=1}^N x(N-x+1).

The minimum of the quadratic function

q(x) = x(N-x+1) = -(x-\frac{1}{2}(N+1))^2 +\frac{1}{4}(N+1)^2

on the interval [1,N] occurs at x=1, with

q(1) = N,

and the maximum occurs at x=\frac{1}{2}(N+1), with

q(\frac{1}{2}(N+1))=\frac{1}{4}(N+1)^2.

— Q.E.D.

Problem 1: Suppose we use k replicas of N!, meaning that we write (N!)^k = \prod\limits_{x=1}^N \prod\limits_{j=1}^k\pi_j(x), where \pi_1,\dots,\pi_k \in \mathrm{S}(N) may be any permutations. Anything to be gained from this?

The fact that we now have a non-trivial lower bound on N! has some nice consequences. In particular, let us consider the prime factorization of N!,

N! = \prod\limits_p p^{e_p(N!)},

where the product is over distinct primes p and e_p(N!) is the multiplicity of p in N!, which is given by Legendre’s formula:

e_p(N!) = \sum\limits_{k=1}^\infty \lfloor \frac{N}{p^k} \rfloor.

Since floors are lower than chairs and tables, we get the estimate

e_p(N!) < \sum\limits_{k=1}^\infty \frac{N}{p^k}=\frac{N}{p-1},

so that the “isotypic component” of N! corresponding to the prime p is

p^{e_p(N!)} < p^{\frac{N}{p-1}}.

Now, since p \leq 2^{p-1}, we have

p^{e_p(N!)} < 2^N,

which gives us the estimate

N! < 2^{N\pi(N)}

with \pi(N) the number of primes less than or equal to N. Comparing this with the lower bound N^{N/2} \leq N! from Proposition 2, we get

\pi(N) > \frac{1}{2} \log_2 N,

which isn’t great from the perspective of actually counting primes, but it does tell you that there are infinitely many.
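Legendre's formula itself is a pleasant thing to check by machine against the honest factorization of N! (a quick sketch; the function name is mine):

import math

def legendre_multiplicity(N, p):
    # e_p(N!) = sum over k >= 1 of floor(N / p^k); only finitely many nonzero terms
    e, q = 0, p
    while q <= N:
        e += N // q
        q *= p
    return e

for N in (10, 30):
    f = math.factorial(N)
    for p in (2, 3, 5, 7):
        e = legendre_multiplicity(N, p)
        assert f % p ** e == 0 and f % p ** (e + 1) != 0
print("Legendre's formula checks out")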

Another approach to estimating factorials is to use calculus: instead of squaring N!, we take its logarithm and interpret the resulting sum as a Riemann sum.

Proposition 3: We have

e\left(\frac{N}{e}\right)^N \leq N! \leq eN\left(\frac{N}{e}\right)^N.

Proof: Write

\sum\limits_{x=1}^{N-1} \log(x) \leq \int\limits_1^N \log(x) \mathrm{d}x \leq \sum\limits_{x=1}^{N} \log(x),

where the lower and upper bounds are, respectively, lower and upper Riemann sums for the integral between them. This integral can be evaluated:

\int\limits_1^N \log(x) \mathrm{d}x = N \log N - N +1.

The upper bound on the integral thus yields

e^{N \log N - N +1} \leq N!,

which is the claimed lower bound on N!. The lower bound on the integral yields

(N-1)! \leq e^{N \log N - N +1},

and multiplying through by N we get the claimed upper bound on N!.

— Q.E.D.

Problem 2: Which is better — Proposition 2 or Proposition 3?
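A numerical peek at Problem 2, in logarithms to keep the numbers on screen (a sketch; this does not settle the problem, but it suggests where to look, and recall that lgamma(N+1) = log N!):

import math

for N in (10, 100, 1000):
    log_fact = math.lgamma(N + 1)
    prop2 = ((N / 2) * math.log(N), N * math.log((N + 1) / 2))
    prop3 = (1 + N * math.log(N) - N, 1 + math.log(N) + N * math.log(N) - N)
    print(N,
          [round(b - log_fact, 1) for b in prop2],   # Proposition 2 bounds
          [round(b - log_fact, 1) for b in prop3])   # Proposition 3 bounds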

Instead of comparing N! to an integral, we can represent it as an integral.

Proposition 4: We have

N! = \int_0^\infty t^N e^{-t} \mathrm{d}t.

Proof: This can be proved by induction on N. For N=0, we have

\int_0^\infty e^{-t} \mathrm{d}t = -e^{-t}|_{t=0}^{t=\infty}=1 = 0!.

Now, for any N \in \mathbb{N}, integrating by parts we have

\int t^N e^{-t} \mathrm{d}t = -t^Ne^{-t} + N\int t^{N-1} e^{-t}\mathrm{d}t.

Since f(t)=t^Ne^{-t} satisfies f(0)=0 and \lim_{t \to \infty} f(t) =0, this gives us

\int_0^\infty t^N e^{-t} \mathrm{d}t = N\int_0^\infty t^{N-1} e^{-t} \mathrm{d}t = N(N-1)! = N!.

— Q.E.D.

The integral representation of N! given by Proposition 4 was found by Euler, who wanted to define z! for z a complex variable. Indeed, the integral

z! := \int_0^\infty t^z e^{-t} \mathrm{d}t

converges to define a holomorphic function on the half plane H=\{ z\in \mathbb{C} \colon \Re z >0\}, so that we have an analytic factorial function on this domain. In some sense, this is the only way to define z! (precise statements are the Bohr-Mollerup Theorem and Wielandt’s Theorem).

This is not the direction we’re headed. For us, the main significance of Euler’s integral representation is that it can be re-written as follows. Making the change of variables x=\frac{1}{N}t, we get

N! = N^{N+1} Z_N(S),

where

Z_N(S) = \int_\mathbb{R} e^{-NS(x)} \mathrm{d}x,

where S(x) = x - \log x for x>0 and S(x)=\infty for x \leq 0. What we want to do is to consider the generalization of the factorial function obtained by considering all integrals of the form Z_N(S) with

S \colon \mathbb{R} \to \mathbb{R} \cup \{+\infty\}

a given function called the “action.” This term comes from physics, and is related to the fact that Z_N(S) is the partition function of a zero-dimensional quantum field theory with action S. We will see that, for S drawn from a broad class of functions, the N \to \infty behavior of Z_N(S) can be understood in a uniform way using a robust technique known as the Laplace method. Eventually, we will see that all such integrals admit asymptotic expansions whose coefficients can be understood in terms of certain graphs known as Feynman diagrams. Next time, we’ll work out Stirling’s formula from the integral representation of N!, and start developing Laplace’s method in general.