Math 262A: Lecture 17

As we have discussed previously, the moments of the Gaussian distribution are given by

\int\limits_\mathbb{R} x^d e^{-\frac{x^2}{2}} \frac{\mathrm{d}x}{\sqrt{2\pi}} = [d \text{ even}] (d-1)!!,

where we are using the Iverson bracket for indicator functions, and the double factorial is like the usual factorial, but going down in steps of two (e.g. 5!! = 5 \cdot 3 \cdot 1). Thus one can evaluate the integral of any polynomial against the standard Gaussian measure using linearity of the integral and this simple formula.

In n dimensions, if one wants to be able to compute the integral

\int\limits_{\mathbb{R}^n} P(x_1,\dots,x_n) e^{-\frac{1}{2}(x_1^2+\dots+x_n^2)} \frac{\mathrm{d}\mathbf{x}}{(2\pi)^{\frac{n}{2}}}

of a polynomial function P on \mathbb{R}^n, then it is necessary and sufficient to compute all correlation functions of the Gaussian measure, which simply means integrals of monomials of a given degree. Physicists have a nice notation for such integrals, and write them as

\langle x_{i(1)} \dots x_{i(d)} \rangle = \int\limits_{\mathbb{R}^n} x_{i(1)} \dots x_{i(d)} e^{-\frac{1}{2}(x_1^2+\dots+x_n^2)} \frac{\mathrm{d}\mathbf{x}}{(2\pi)^{\frac{n}{2}}},

where i \in \mathrm{Fun}(d,n). Observe that in the case $latex $ is a constant function, our old formula still applies, since this reduces to the one-dimensional case. Another easy case is the integration of monomials of degree two, where we have

\langle x_{i(1)}x_{i(2)} \rangle = \int\limits_{\mathbb{R}^n} x_{i(1)} x_{i(2)} e^{-\frac{1}{2}(x_1^2+\dots+x_n^2)} \frac{\mathrm{d}\mathbf{x}}{(2\pi)^{\frac{n}{2}}}=\int\limits_{\mathbb{R}^2} x_{i(1)} x_{i(2)} e^{-\frac{1}{2}(x_{i(1)}^2+x_{i(2)}^2)} \frac{\mathrm{d}\mathbf{x}}{2\pi}.

If i(1)=i(2), this is

\int\limits_{\mathbb{R}} x_{i(1)}^2 e^{-\frac{1}{2}x_{i(1)}^2}\frac{\mathrm{d}x_{i(1)}}{\sqrt{2\pi}} \int\limits_{\mathbb{R}} e^{-\frac{1}{2}x_{i(1)}^2}\frac{\mathrm{d}x_{i(1)}}{\sqrt{2\pi}} = 1,

while if i(1) \neq i(2) we instead get

\int\limits_{\mathbb{R}} x_{i(1)} e^{-\frac{1}{2}x_{i(1)}^2}\frac{\mathrm{d}x_{i(1)}}{\sqrt{2\pi}} \int\limits_{\mathbb{R}} x_{i(2)}e^{-\frac{1}{2}x_{i(2)}^2}\frac{\mathrm{d}x_{i(2)}}{\sqrt{2\pi}} = 0.

The case of general d-point correlation functions reduces to that of 2-point functions via the following combinatorial formula, which statisticians call Isserlis’ formula and physicists call Wick’s lemma. Wick’s lemma says that, for any d \in \mathbb{N} and any index function i \in \mathrm{Fun}(d,n), the d-point correlation function

\langle x_{i(1)} \dots x_{i(d)} \rangle = \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{C \in \tau} \langle x_{i(c)}x_{i\tau(c)} \rangle,

is equal to the sum of 2-point correlators obtained by pairing the factors in the product in all possible ways. For example we have

\langle x_{i(1)}x_{i(2)}x_{i(3)} \rangle = 0

because there aren’t any pairings of three objects, while

\langle x_{i(1)}x_{i(2)}x_{i(3)}x_{i(4)} \rangle = \langle x_{i(1)}x_{i(2)}\rangle \langle x_{i(3)i(4)} \rangle + \langle x_{i(1)}x_{i(3)}\rangle \langle x_{i(2)i(4)} \rangle + \langle x_{i(1)}x_{i(4)}\rangle \langle x_{i(2)i(3)} \rangle

because \mathrm{S}_2(4) = \{(1\ 2)(3\ 4),(1\ 3)(2\ 4),(1\ 4)(2\ 3)\} is the set of fixed point free involutions in \mathrm{S}(4). Hence \langle x_{i(1)}x_{i(2)}x_{i(3)}x_{i(4)} \rangle equals a sum of delta functions,

\delta_{i(1)i(2)}\delta_{i(3)i(4)} + \delta_{i(1)i(3)}\delta_{i(2)i(4)} + \delta_{i(1)i(4)}\delta_{i(2)i(3)}.

Equivalently, one may say that \langle x_{i(1)}x_{i(2)}x_{i(3)}x_{i(4)} \rangle is equal to the number of fixed point free involutions \tau \in \mathrm{S}_2(d) such that the given index function i \in \mathrm{Fun}(d,n) is constant on the cycles of \tau.

At the end of Lecture 15, we were considering the Gaussian probability measure on the real vector space \mathrm{H}(N) of N \times N Hermitian matrices (i.e. complex matrices such that X^*=X). This real vector space is isomorphic to \mathbb{R}^{N^2}: an isomorphism is gotten by reading the upper triangular part of the Hermitian matrix X=[X_{i,j}] across rows, so that each matrix element X_{ij} with i<j contributes two real parameters x_{ij} =\mathrm{Re}X_{ij} and y_{ij} = \mathrm{Im} X_{ij}, for a total of N(N-1) real parameters, and each of the N diagonal matrix X_{ii} contributes itself, X_{ii} = x_{ii} = \mathrm{Re} X_{ii}, giving N more real parameters. A natural scalar product on \mathrm{H}(N) is given by the trace form, \langle X,Y \rangle = \mathrm{Tr}XY, and one can define the corresponding Gaussian measure on \mathrm{H}(N) to be the absolutely continuous (w.r.t to Lebesgue) measure with density

\frac{1}{Z_N} e^{-\frac{1}{2} \mathrm{Tr} X^2},


Z_N = \int\limits_{\mathrm{H}(N)} e^{-\frac{1}{2} \mathrm{Tr} X^2} \mathrm{d}X

is a normalization constant which can be explicitly computed, but is not simply equal to


as you might expect. The reason for this is the identity

\mathrm{Tr} X^2 = \sum\limits_{i,j=1}^N X_{ij}X_{ji} = \sum\limits_{i,j=1}^N X_{ij}\overline{X_{ij}} = \sum\limits_{i=1}^N x_{ii}^2 + 2\sum\limits_{1 \leq i<j \leq N}(x_{ij}^2+y_{ij}^2),

which corresponds to a scalar product on \mathbb{R}^{N^2}, but not the standard scalar product (i.e. dot product) because of the factor of 2 on terms coming from off-diagonal matrix elements. This is a minor difference, and the upshot is that the Wick formula still works, and reduces the computation of correlation functions

\langle X_{i(1)j(1)} \dots X_{i(d)j(d)} \rangle= \int\limits_{\mathrm{H}(N)} X_{i(1)j(1)} \dots X_{i(d)j(d)} e^{-\frac{1}{2}\mathrm{Tr} X^2}\frac{\mathrm{d}X}{Z_N}

to two-point correlators, which are given by

\langle X_{ij}X_{kl} \rangle = \delta_{il}\delta_{jk}.

For example, we have

\langle X_{i(1)j(1)}X_{i(2)j(2)}X_{i(3)j(3)}X_{i(4)j(4)}\rangle = \langle X_{i(1)j(1)}X_{i(2)j(2)}\rangle\langle X_{i(3)j(3)}X_{i(4)j(4)} \rangle \\ + \langle X_{i(1)j(1)}X_{i(3)j(3)}\rangle\langle X_{i(2)j(2)}X_{i(4)j(4)} \rangle + \langle X_{i(1)j(1)}X_{i(4)j(4)}\rangle\langle X_{i(2)j(2)}X_{i(3)j(3)} \rangle,

the terms of which correspond, just as in the vector case, to the fixed point free involutions in \mathrm{S}(4), and this reduces to the corresponding sum of delta functions of indices,

\delta_{i(1)j(2)}\delta_{i(2)j(1)} \delta_{i(3)j(4)}\delta_{i(4)j(3)} + \delta_{i(1)j(3)}\delta_{i(3)j(1)} \delta_{i(2)j(4)}\delta_{i(4)j(2)} + \delta_{i(1)j(4)}\delta_{i(4)j(1)} \delta_{i(2)j(3)}\delta_{i(3)j(2)}.

Let us use the matrix Wick lemma to compute a particularly interesting correlation function: \langle \mathrm{Tr} X^d \rangle. On one hand, this has a spectral interpretation: we have

\langle \mathrm{Tr} X^d \rangle = \int\limits_{\mathrm{H}(N)} \left(\lambda_1(X)^d + \dots +\lambda(X)^d \right) e^{-\frac{1}{2}\mathrm{Tr} X^2}\frac{\mathrm{d}X}{Z_N},

where \lambda_i(X) are the eigenvalues of X. On the other hand, this is a sum of d-point correlation functions, namely

\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \langle X_{i(1)i(2)}X_{i(2)i(3)} \dots X_{i(d)i(1)} \rangle,

which may more compactly be written as

\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \left\langle \prod_{k=1}^d X_{i(k)i\gamma(k)}\right\rangle,

where \gamma=(1\ 2\ \dots\ d) is the full forward cycle in the symmetric group \mathrm{S}(d). In fact, it is convenient here to change the variance in the Gaussian measure, altering it to match the the N in \mathrm{H}(N) (you’ll see why in a moment). This means that we are instead computing

\langle \mathrm{Tr} X^d \rangle = \int\limits_{\mathrm{H}(N)} \mathrm{Tr} X^d e^{-N\frac{1}{2}\mathrm{Tr} X^2}\frac{\mathrm{d}X}{Z_N},

which has the effect of dividing the 2-point functions by N, i.e.

\langle X_{ij}X_{kl} \rangle = \frac{\delta_{il}\delta_{jk}}{N}.

Now if we apply the Wick formula, the computation evolves as follows:

\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \left\langle \prod_{k=1}^d X_{i(k)i\gamma(k)}\right\rangle \\ = \sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(rs) \in \tau} \langle X_{i(r)i\gamma(r)} X_{i(s)i\gamma(s)} \rangle \\ = \sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(rs) \in \tau} \frac{\delta_{i(r)i\gamma(s)}\delta_{i(s)i\gamma(r)}}{N} \\ = N^{-\frac{d}{2}}\sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(rs) \in \tau} \delta_{i(r)i\gamma(s)}\delta_{i(s)i\gamma(r)}.

Now, to say that (r\ s) \in \tau simply means that (r\ s) is a cycle of the involution \tau, which in turn just means that

\tau(r)=s \quad\text{ and }\tau(s)=r,

so that

\delta_{i(r)i\gamma(s)} = \delta_{i(r)\gamma\tau(r)} \quad\text{ and }quad \delta_{i(s)i\gamma(r)}=\delta_{i(s)i\gamma\tau(s)},

and we see that the product

\prod\limits_{(rs) \in \tau} \delta_{i(r)i\gamma(s)}\delta_{i(s)I\gamma(r)}

is really the single condition [i=i\gamma\tau] at the level of equality of functions (a concatenation of functions means composition). We thus have

\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} [i=i\gamma\tau]  = \sum\limits_{\tau \in \mathrm{S}_2(d)} \sum\limits_{i \in \mathrm{Fun}(d,N)} [i=i\gamma\tau].

Now, the condition [i=i\gamma\tau] is not difficult to interpret: in order to return true, it must be the case that i is constant on the cycles of the permutation i. To build such a function, we are free to assign any of the numbers 1,\dots,N for the value of i on each cycle of \gamma\tau, and we thus obtain

\langle \mathrm{Tr} X^d \rangle = \sum\limits_{\tau \in \mathrm{S}_2(d)} N^{c(\gamma\tau)-\frac{d}{2}}.

And now, we know how to make topology out of this: the combinatorial map (\gamma,\tau) yields a surface by taking the single cycle of \gamma=(1\ 2\ \dots d) and writing it out as a d-gon, and then glueing its sides in pairs using the involution \gamma. When we do this, the former vertices and edges of the polygon become the vertices and edges of a graph drawn on the resulting surface: the number of vertices of this embedded graph is the number of cycles in \gamma\tau, the number of edges is \frac{d}{2}, and of course the number of faces is 1, since we started with a single polygon. We thus have that

\frac{1}{N} \langle \mathrm{Tr} X^d \rangle = \sum\limits_{\tau \in \mathrm{S}_2(d)} N^{c(\gamma\tau)-\frac{d}{2}-1} = \sum\limits_{\tau \in \mathrm{S}_2(d)} N^{(c(\gamma\tau)-\frac{d}{2}+1)-2},

so that by the Euler formula v-e+f=2-2g the exponent of N is exactly -2g, where g is the genus of the surface obtained from (\gamma,\tau). We thus find that the matrix Wick formula has provided us for a generating function for the number of one-face maps obtained from gluing a d-gon, sorted by genus:

\frac{1}{N} \langle \mathrm{Tr} X^d \rangle = \sum\limits_{g=0}^\infty \frac{\varepsilon_g(d)}{N^{2g}},

where \varepsilon_g(d) is the number of gluings of a d-gon that produce a surface of genus g. And that’s the Harer-Zagier theorem. Next lecture, we’ll talk about what it’s good for.

1 Comment

Leave a Reply