Math 262A: Lecture 3

In this lecture we continue our long march towards Stirling’s formula. Let me begin by saying that, while we seem to be spending an absurd amount of time on this, Stirling’s approximation is being used as a case study in something much more general, Laplace’s method, and we will understand the general thing almost automatically once we understand the specific thing.

We return to the integral representation

N! = \frac{N^{N+1}}{e^N}\int_{-1}^\infty e^{-N(w-\log(1+w))}\mathrm{d}w.

In Lecture 2, we performed some elementary estimates which showed that, for all sufficiently small \varepsilon >0, we have

\int_{-1}^\infty e^{-N(w-\log(1+w))}\mathrm{d}w = \int_{-\varepsilon}^{+\varepsilon} e^{-N(w-\log(1+w))}\mathrm{d}w + O(e^{-c(\varepsilon)N}),

where c(\varepsilon) is a positive number depending on \varepsilon but not on N. This quantifies the intuition that the bulk contribution to the integral comes from a small neighborhood of the minimum of the action.

We now come to the problem of estimating this bulk contribution

\int_{-\varepsilon}^{+\varepsilon} e^{-N(w-\log(1+w))}\mathrm{d}w.

The action

S(w) = w-\log(1+w)

is holomorphic on the open unit disc \mathbb{D}_1 = \{w \in \mathbb{C} \colon |w|<1\}, and its Maclaurin series is

S(w) = \sum\limits_{d=2}^{\infty} (-1)^d\frac{w^d}{d},

this series converging absolutely and uniformly on compact subsets of \mathbb{D}_1 to the correct value S(w). In Lecture 2, we observed that if we throw away all terms in the series expansion of S(w) beyond the quadratic term, our \int_{-\varepsilon}^{+\varepsilon}-integral becomes the truncated Gaussian integral

\int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2}\mathrm{d}w,

which can then be replaced by a total Gaussian integral at the cost of introducing an exponentially small error,

\int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2}\mathrm{d}w =\int_{-\infty}^{+\infty} e^{-\frac{N}{2}w^2}\mathrm{d}w + O(e^{-c(\varepsilon)N}),

this being beneficial because the total Gaussian integral can be exactly evaluated,

\int_{-\infty}^{+\infty} e^{-\frac{N}{2}w^2}\mathrm{d}w = \sqrt{\frac{2\pi}{N}}.

Thus, the remaining question is to quantify the error incurred by replacing

\int_{-\varepsilon}^{+\varepsilon} e^{-N(w-\log(1+w))}\mathrm{d}w \quad\text{with}\quad \int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2}\mathrm{d}w.
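As a sanity check on this chain of approximations, one can compare N! against the leading-order prediction N! \approx \frac{N^{N+1}}{e^N}\sqrt{\frac{2\pi}{N}} numerically. A quick sketch in Python, working with logarithms to avoid overflow:

```python
import math

# Leading-order prediction from the Gaussian approximation:
#   N! ~ (N^(N+1) / e^N) * sqrt(2*pi / N).
# Compare logarithms, since these factorials overflow floating point.
ratios = []
for N in (10, 100, 1000):
    log_exact = math.lgamma(N + 1)  # log(N!)
    log_approx = (N + 1) * math.log(N) - N + 0.5 * math.log(2 * math.pi / N)
    ratios.append(math.exp(log_approx - log_exact))  # approx / exact
print(ratios)
```

The ratios creep toward 1 polynomially in N rather than exponentially fast, which is consistent with the main source of error identified next.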

We now want to quantify the error incurred by throwing away all terms of the action S beyond its dominant (quadratic) term. For small w, we naturally expect this error to be a small number which decays as N increases, but this decay turns out to be much slower than exponential, and it represents the main source of error in the approximation scheme we’re building.

To get started, we write

\int_{-\varepsilon}^{+\varepsilon} e^{-N(w-\log(1+w))}\mathrm{d}w = \int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2}e^{-N\sum_{d=3}^\infty (-1)^d \frac{w^d}{d}}\mathrm{d}w.

Next, we Maclaurin-expand the non-Gaussian part of the integrand, writing

e^{-N\sum_{k=3}^\infty (-1)^k \frac{w^k}{k}} = \sum\limits_{d=0}^\infty m_d(N) \frac{w^d}{d!},

where m_d(N) is the dth derivative of e^{-N\sum_{k=3}^\infty (-1)^k \frac{w^k}{k}} at w=0. Assuming \varepsilon < 1, this series converges uniformly on \overline{\mathbb{D}}_\varepsilon = \{w \in \mathbb{C} \colon |w| \leq \varepsilon\}; indeed, it is the Maclaurin series of the function e^{-N(S(w)-\frac{w^2}{2})}, which is holomorphic on \mathbb{D}_1 because S(w) is: subtracting the entire function \frac{w^2}{2} and exponentiating does not change the location of the singularities. We can thus interchange the order of summation and integration, thereby obtaining

\int_{-\varepsilon}^{+\varepsilon} e^{-N(w-\log(1+w))}\mathrm{d}w = \sum\limits_{d=0}^\infty \frac{m_d(N)}{d!} \int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2}w^d \mathrm{d}w.

We have thus reduced the initial problem to two problems.

Problem 1: Compute the coefficients m_d(N).

Problem 2: Compute the integrals \int_{-\varepsilon}^{+\varepsilon} e^{-\frac{N}{2}w^2}w^d \mathrm{d}w.

We will solve Problem 1 today, and Problem 2 on Thursday (in case you’d like to think ahead).

Problem 1 is pure algebraic combinatorics, being a question about composition in the algebra of formal power series \mathbb{C}[[z]]. This question can be phrased in greater generality, as follows. Let

E(z) = \sum\limits_{d=0}^\infty e_d \frac{z^d}{d!} \quad\text{ and }\quad C(z) = \sum\limits_{d=1}^\infty c_d \frac{z^d}{d!}

be elements of \mathbb{C}[[z]] such that the constant term of C(z) is zero. Notice that we are writing these series as if their coefficients were the derivatives of analytic functions E(z) and C(z) at z=0, but we are not actually imposing any convergence conditions on them. Series of this form are often referred to as exponential generating functions. The composition

E(C(z)) = \sum\limits_{d=0}^\infty e_d \frac{C(z)^d}{d!} = e_0 + e_1\frac{1}{1!}\left(c_1\frac{z^1}{1!} + c_2\frac{z^2}{2!} + \dots \right)^1 + e_2\frac{1}{2!}\left( c_1\frac{z^1}{1!} + c_2\frac{z^2}{2!} + \dots\right)^2 + \dots

is well-defined. Let M(z) := E(C(z)), and expand

M(z) = \sum\limits_{d=0}^\infty m_d \frac{z^d}{d!}.

The problem is to understand the coefficients of M(z) in terms of the coefficients of E(z) and C(z). This problem was posed and solved by Hurwitz in the nineteenth century in connection with what is now a classical problem in enumerative geometry, namely the question of counting branched covers of the Riemann sphere \mathbf{P}^1(\mathbb{C}) with specified ramification data (I hope to touch on this problem more extensively later in the course).

Exercise 1: Compute the first few coefficients of M(z) by hand.

Exercise 2: Does it make sense to do this computation using the chain rule? Why or why not? Do you get correct formulas by this method?
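If you want to check a by-hand computation, truncated series can be composed mechanically. The sketch below uses arbitrary (hypothetical) values for the coefficients e_k and c_k with exact rational arithmetic; the resulting m_d agree with the partition expansions m_1 = e_1c_1, m_2 = e_1c_2 + e_2c_1^2, m_3 = e_1c_3 + 3e_2c_1c_2 + e_3c_1^3.

```python
from fractions import Fraction
from math import factorial

# Hypothetical spot-check coefficients (any values would do).
e = [1, 2, 3, 5]        # e_0, e_1, e_2, e_3
c = [0, 7, 11, 13]      # c_0 = 0 (required), then c_1, c_2, c_3

TRUNC = 4  # work modulo z^4

def mul(p, q):
    """Product of two ordinary-coefficient lists, truncated at z^TRUNC."""
    r = [Fraction(0)] * TRUNC
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < TRUNC:
                r[i + j] += pi * qj
    return r

# Ordinary power-series coefficients of C(z), truncated.
C = [Fraction(c[d], factorial(d)) for d in range(TRUNC)]

# M(z) = sum_k e_k C(z)^k / k!; since c_0 = 0, only k < TRUNC matters.
M = [Fraction(0)] * TRUNC
Cpow = [Fraction(1)] + [Fraction(0)] * (TRUNC - 1)  # C^0
for k in range(TRUNC):
    for d in range(TRUNC):
        M[d] += Fraction(e[k], factorial(k)) * Cpow[d]
    Cpow = mul(Cpow, C)

# Convert back to EGF coefficients m_d = d! [z^d] M(z).
m = [factorial(d) * M[d] for d in range(TRUNC)]
print(m)  # [1, 14, 169, 2434] for these e, c
```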

Let’s consider first an easier problem, which will serve as a stepping stone. Consider two series

A(z) = \sum\limits_{d=0}^\infty a_d \frac{z^d}{d!} \quad\text{ and }\quad B(z) = \sum\limits_{d=0}^\infty b_d \frac{z^d}{d!},

and let

P(z) = A(z)B(z) = \sum\limits_{d=0}^\infty p_d \frac{z^d}{d!}

be their product. It is then immediate that

p_d = \sum\limits_{k=0}^d {d \choose k} a_k b_{d-k}.

This algebraic expression also has combinatorial meaning, which becomes clear if we sacrifice its elegance and rewrite it in the garish form

p_d = \sum\limits_{\substack{ (S_A,S_B) \\ S_A,S_B \subseteq [d] \\ S_A \cap S_B = \{\} \\ S_A \cup S_B = [d]}} a_{|S_A|}b_{|S_B|}.

The enumerative meaning of this expression is the following. Suppose that a_d is the number of combinatorial structures of Type A which can be built on [d]= \{1,\dots,d\}, and similarly b_d is the number of combinatorial structures of Type B which can be built on [d]. Then, p_d is the number of ways to build a hybrid combinatorial structure of Type AB on [d] by choosing disjoint subsets S_A,S_B \subseteq [d] with S_A \cup S_B = [d], and building a Type A structure on S_A and a Type B structure on S_B. This can be iterated: the product of k exponential generating functions A_1(z),A_2(z),\dots,A_k(z) is an exponential generating function P(z) whose coefficients p_d count the number of ways to select a list (S_{1},\dots,S_{k}) of disjoint subsets of [d] whose union is [d], and then build a Type A_j structure on S_j for each 1 \leq j \leq k.
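The equivalence of the binomial convolution and the sum over pairs (S_A, S_B) can be checked by brute force. A small sketch, with hypothetical coefficient sequences:

```python
from itertools import combinations
from math import comb

# Hypothetical coefficient sequences a_d, b_d for a spot check.
a = [1, 1, 2, 6, 24]      # say, a_d = d!
b = [1, 3, 9, 27, 81]     # say, b_d = 3^d

def p_binomial(d):
    """p_d via the binomial convolution."""
    return sum(comb(d, k) * a[k] * b[d - k] for k in range(d + 1))

def p_pairs(d):
    """p_d via ordered pairs (S_A, S_B) of disjoint subsets with union [d]:
    choosing S_A forces S_B to be its complement."""
    total = 0
    for k in range(d + 1):
        for S_A in combinations(range(1, d + 1), k):
            total += a[len(S_A)] * b[d - len(S_A)]
    return total

for d in range(5):
    assert p_binomial(d) == p_pairs(d)
```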

Exercise 3: Read this (easy), this (harder), or this (hardest).

We can now state the main theorem on composition of exponential generating functions, M(z)=E(C(z)). A partition of [d] is a set \pi of disjoint nonempty subsets of [d] whose union is [d]. The elements B of \pi are called its blocks, and we denote by b(\pi) the number of blocks in \pi. Let \mathrm{P}(d) denote the set of all partitions of [d].

Theorem (The Compositional Formula): With notation as above, we have

m_d = \sum\limits_{\pi \in \mathrm{P}(d)} e_{b(\pi)} \prod\limits_{B \in \pi} c_{|B|}.

Proof: Write

M(z) = \sum\limits_{k=0}^\infty e_k \frac{1}{k!} C(z)^k,

use the combinatorial expansion of the k-fold product C(z)^k described above, and reorganize the sum: the lists (S_1,\dots,S_k) of disjoint nonempty subsets of [d] with union [d] correspond to partitions \pi with b(\pi)=k, each counted k! times, and this multiplicity cancels the factor \frac{1}{k!}.

— Q.E.D.
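The Compositional Formula is easy to implement and test. The sketch below generates set partitions recursively and evaluates m_d for hypothetical coefficients e_k, c_k; for d \leq 3 the results match the by-hand expansions m_1 = e_1c_1, m_2 = e_1c_2 + e_2c_1^2, m_3 = e_1c_3 + 3e_2c_1c_2 + e_3c_1^3.

```python
from math import prod

def partitions(s):
    """Generate all partitions of the list s into nonempty blocks."""
    if not s:
        yield []
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        yield [[first]] + p                        # first in its own block
        for i in range(len(p)):                    # or join an existing block
            yield p[:i] + [[first] + p[i]] + p[i + 1:]

def m_coeff(d, e, c):
    """Compositional Formula: m_d = sum over partitions pi of [d] of
    e_{b(pi)} * prod over blocks B of c_{|B|}."""
    return sum(
        e[len(pi)] * prod(c[len(B)] for B in pi)
        for pi in partitions(list(range(1, d + 1)))
    )

# Hypothetical coefficients; e is indexed by block count, c by block size.
e = [1, 2, 3, 5]
c = [0, 7, 11, 13]
print([m_coeff(d, e, c) for d in range(4)])  # [1, 14, 169, 2434]
```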

Corollary (The Exponential Formula): If E(z) = e^z, then

m_d = \sum\limits_{\pi \in \mathrm{P}(d)} \prod\limits_{B \in \pi} c_{|B|}.

The combinatorial meaning of the Exponential Formula is the following: if C(z) is the exponential generating function for a class of “connected” combinatorial structures, then M(z) = e^{C(z)} is the exponential generating function for the structures obtained by partitioning [d] into blocks and building a connected structure on each block. The Exponential Formula gets used all over the place — probabilists call it the moment-cumulant formula, while physicists often call it the linked cluster expansion and invoke it using colorful phrases like “connected vacuum bubbles exponentiate.” A basic example is the following: let m_d be the number of simple graphs with vertex set [d], let c_d be the number of connected simple graphs with vertex set [d], and form the generating functions

M(z) = 1 + \sum\limits_{d=1}^\infty m_d\frac{z^d}{d!} \quad\text{ and }\quad C(z) = \sum\limits_{d=1}^\infty c_d \frac{z^d}{d!}.

Then M(z) = e^{C(z)}, and the sequences (m_d)_{d=1}^\infty and (c_d)_{d=1}^\infty are related as in the Exponential Formula.
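One can check this relation numerically on small cases. Differentiating M(z) = e^{C(z)} gives M'(z) = C'(z)M(z), and comparing coefficients yields the recursion m_n = \sum_{k=1}^{n} \binom{n-1}{k-1} c_k m_{n-k}, which lets us extract the connected counts c_d from m_d = 2^{\binom{d}{2}}. A sketch:

```python
from math import comb

# m_d = number of simple graphs on vertex set [d].
m = [1] + [2 ** comb(d, 2) for d in range(1, 6)]  # m_0 = 1 by convention

# Invert M(z) = e^{C(z)} via m_n = sum_{k=1}^n C(n-1, k-1) c_k m_{n-k},
# obtained by comparing coefficients in M'(z) = C'(z) M(z).
c = [0] * 6
for n in range(1, 6):
    c[n] = m[n] - sum(comb(n - 1, k - 1) * c[k] * m[n - k]
                      for k in range(1, n))

print(c[1:])  # connected labeled graph counts: [1, 1, 4, 38, 728]
```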

Now we return to Problem 1 above, which is to compute the coefficients m_d(N) in the expansion

e^{-N\sum_{k=3}^\infty (-1)^k \frac{w^k}{k}} = \sum\limits_{d=0}^\infty m_d(N) \frac{w^d}{d!}.

We do this using the Exponential Formula: we have

e^{-N\sum_{k=3}^\infty (-1)^k \frac{w^k}{k}} = e^{C(w)},

where, rewriting \frac{w^d}{d} = (d-1)!\frac{w^d}{d!} to put the exponent in exponential generating function form,

C(w) = \sum\limits_{d=3}^\infty (-N)(-1)^d(d-1)! \frac{w^d}{d!}.

The Exponential Formula now tells us that the coefficients m_d(N) we seek are given by

m_d(N) = \sum\limits_{\pi \in \mathrm{P}(d)} \prod\limits_{B \in \pi} (-N)(-1)^{|B|} \delta_{|B| \geq 3}(|B|-1)!.

Thus, we have solved Problem 1: we see that

m_d(N) = (-1)^d \sum\limits_{k=1}^{d} (-N)^k t_k(d),

with t_k(d) the number of permutations of 1,\dots,d consisting of k cycles, each of length at least 3. Note that t_k(d) = 0 unless 3k \leq d, so the sum is really over 1 \leq k \leq \lfloor d/3 \rfloor; in particular m_1(N) = m_2(N) = 0, while m_0(N) = 1.
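As a check on this solution, the following sketch compares the closed form against a brute-force truncated expansion of e^{-N\sum_{j \geq 3} (-1)^j w^j / j}, counting t_k(d) directly by iterating over permutations:

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

D = 7  # expand through w^7

def cycle_lengths(p):
    """Cycle lengths of the permutation p of {1,...,d}, given as a tuple."""
    seen, out = set(), []
    for s in range(1, len(p) + 1):
        if s not in seen:
            length, x = 0, s
            while x not in seen:
                seen.add(x)
                x = p[x - 1]
                length += 1
            out.append(length)
    return out

def t(k, d):
    """Number of permutations of [d] with k cycles, all of length >= 3."""
    count = 0
    for p in permutations(range(1, d + 1)):
        cl = cycle_lengths(p)
        if len(cl) == k and min(cl) >= 3:
            count += 1
    return count

def m_formula(d, N):
    """Closed form: m_d(N) = (-1)^d sum_k (-N)^k t_k(d)."""
    return (-1) ** d * sum((-N) ** k * t(k, d) for k in range(1, d // 3 + 1))

def m_direct(d, N):
    """d! times [w^d] exp(-N sum_{j=3}^D (-1)^j w^j / j), by truncated
    series arithmetic with exact rationals."""
    a = [Fraction(0)] * (D + 1)
    for j in range(3, D + 1):
        a[j] = Fraction(-N * (-1) ** j, j)
    exp_a = [Fraction(0)] * (D + 1)
    exp_a[0] = Fraction(1)
    power = exp_a[:]  # a^0
    for n in range(1, D // 3 + 1):  # a^n has lowest degree 3n
        power = [sum(power[i] * a[j - i] for i in range(j + 1))
                 for j in range(D + 1)]
        for j in range(D + 1):
            exp_a[j] += power[j] / factorial(n)
    return exp_a[d] * factorial(d)

for N in (1, 2):
    for d in range(1, D + 1):
        assert m_formula(d, N) == m_direct(d, N)
```

For instance, m_3(N) = 2N, since t_1(3) = 2 counts the two 3-cycles on three letters.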

Exercise 4: Do the same computation instead using the Compositional Formula with E(z) = e^{-Nz}.
