# Math 262A: Lecture 17

As we have discussed previously, the moments of the Gaussian distribution are given by

$\int\limits_\mathbb{R} x^d e^{-\frac{x^2}{2}} \frac{\mathrm{d}x}{\sqrt{2\pi}} = [d \text{ even}] (d-1)!!,$

where we are using the Iverson bracket for indicator functions, and the double factorial is like the usual factorial, but going down in steps of two (e.g. $5!! = 5 \cdot 3 \cdot 1$). Thus one can evaluate the integral of any polynomial against the standard Gaussian measure using linearity of the integral and this simple formula.
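The moment formula is easy to sanity-check numerically. The following sketch (the function names and the midpoint-rule quadrature are my own choices, purely for illustration) compares a brute-force approximation of the Gaussian integral against $[d \text{ even}](d-1)!!:$

```python
import math

def double_factorial(n):
    # steps of two, with the convention (-1)!! = 1
    return 1 if n <= 0 else n * double_factorial(n - 2)

def gaussian_moment(d, half_width=12.0, steps=100_000):
    # midpoint-rule approximation of \int x^d e^{-x^2/2} dx / sqrt(2 pi);
    # the tail beyond |x| = 12 is negligible for small d
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += x ** d * math.exp(-x * x / 2)
    return total * h / math.sqrt(2 * math.pi)

# odd moments vanish; even moments equal (d-1)!!
for d in range(8):
    expected = double_factorial(d - 1) if d % 2 == 0 else 0
    assert abs(gaussian_moment(d) - expected) < 1e-4
```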

In $n$ dimensions, if one wants to be able to compute the integral

$\int\limits_{\mathbb{R}^n} P(x_1,\dots,x_n) e^{-\frac{1}{2}(x_1^2+\dots+x_n^2)} \frac{\mathrm{d}\mathbf{x}}{(2\pi)^{\frac{n}{2}}}$

of a polynomial function $P$ on $\mathbb{R}^n,$ then it is necessary and sufficient to compute all correlation functions of the Gaussian measure, which simply means integrals of monomials of a given degree. Physicists have a nice notation for such integrals, and write them as

$\langle x_{i(1)} \dots x_{i(d)} \rangle = \int\limits_{\mathbb{R}^n} x_{i(1)} \dots x_{i(d)} e^{-\frac{1}{2}(x_1^2+\dots+x_n^2)} \frac{\mathrm{d}\mathbf{x}}{(2\pi)^{\frac{n}{2}}},$

where $i \in \mathrm{Fun}(d,n)$ is a function from $\{1,\dots,d\}$ to $\{1,\dots,n\}$ selecting the coordinates. Observe that in the case where $i$ is a constant function, our old formula still applies, since the integral reduces to the one-dimensional case. Another easy case is the integration of monomials of degree two, where we have

$\langle x_{i(1)}x_{i(2)} \rangle = \int\limits_{\mathbb{R}^n} x_{i(1)} x_{i(2)} e^{-\frac{1}{2}(x_1^2+\dots+x_n^2)} \frac{\mathrm{d}\mathbf{x}}{(2\pi)^{\frac{n}{2}}}=\int\limits_{\mathbb{R}^2} x_{i(1)} x_{i(2)} e^{-\frac{1}{2}(x_{i(1)}^2+x_{i(2)}^2)} \frac{\mathrm{d}\mathbf{x}}{2\pi}.$

If $i(1)=i(2),$ this is

$\int\limits_{\mathbb{R}} x_{i(1)}^2 e^{-\frac{1}{2}x_{i(1)}^2}\frac{\mathrm{d}x_{i(1)}}{\sqrt{2\pi}} \int\limits_{\mathbb{R}} e^{-\frac{1}{2}x_{i(1)}^2}\frac{\mathrm{d}x_{i(1)}}{\sqrt{2\pi}} = 1,$

while if $i(1) \neq i(2)$ we instead get

$\int\limits_{\mathbb{R}} x_{i(1)} e^{-\frac{1}{2}x_{i(1)}^2}\frac{\mathrm{d}x_{i(1)}}{\sqrt{2\pi}} \int\limits_{\mathbb{R}} x_{i(2)}e^{-\frac{1}{2}x_{i(2)}^2}\frac{\mathrm{d}x_{i(2)}}{\sqrt{2\pi}} = 0.$

The case of general $d$-point correlation functions reduces to that of $2$-point functions via the following combinatorial formula, which statisticians call Isserlis’ formula and physicists call Wick’s lemma. Wick’s lemma says that, for any $d \in \mathbb{N}$ and any index function $i \in \mathrm{Fun}(d,n),$ the $d$-point correlation function

$\langle x_{i(1)} \dots x_{i(d)} \rangle = \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(r\ s) \in \tau} \langle x_{i(r)}x_{i(s)} \rangle,$

is equal to the sum over the set $\mathrm{S}_2(d)$ of fixed point free involutions in $\mathrm{S}(d)$ of products of $2$-point correlators, obtained by pairing the factors in the product in all possible ways. For example, we have

$\langle x_{i(1)}x_{i(2)}x_{i(3)} \rangle = 0$

because there aren’t any pairings of three objects, while

$\langle x_{i(1)}x_{i(2)}x_{i(3)}x_{i(4)} \rangle = \langle x_{i(1)}x_{i(2)}\rangle \langle x_{i(3)}x_{i(4)} \rangle + \langle x_{i(1)}x_{i(3)}\rangle \langle x_{i(2)}x_{i(4)} \rangle + \langle x_{i(1)}x_{i(4)}\rangle \langle x_{i(2)}x_{i(3)} \rangle$

because $\mathrm{S}_2(4) = \{(1\ 2)(3\ 4),(1\ 3)(2\ 4),(1\ 4)(2\ 3)\}$ is the set of fixed point free involutions in $\mathrm{S}(4).$ Hence $\langle x_{i(1)}x_{i(2)}x_{i(3)}x_{i(4)} \rangle$ equals a sum of delta functions,

$\delta_{i(1)i(2)}\delta_{i(3)i(4)} + \delta_{i(1)i(3)}\delta_{i(2)i(4)} + \delta_{i(1)i(4)}\delta_{i(2)i(3)}.$

Equivalently, one may say that $\langle x_{i(1)}x_{i(2)}x_{i(3)}x_{i(4)} \rangle$ is equal to the number of fixed point free involutions $\tau \in \mathrm{S}_2(4)$ such that the given index function $i \in \mathrm{Fun}(4,n)$ is constant on the cycles of $\tau.$
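This pairing count is small enough to verify in code. The sketch below (function names are my own) enumerates all pairings of $\{1,\dots,d\}$ and counts those on which the index function is constant, then compares against a direct evaluation of the correlator as a product of one-dimensional moments of the independent coordinates:

```python
from collections import Counter

def pairings(elems):
    # generate all perfect matchings of a list (none if the length is odd)
    if not elems:
        yield []
        return
    a = elems[0]
    for k in range(1, len(elems)):
        rest = elems[1:k] + elems[k + 1:]
        for p in pairings(rest):
            yield [(a, elems[k])] + p

def wick_correlator(i):
    # Wick's lemma: count the pairings on which i is constant
    return sum(all(i[r] == i[s] for (r, s) in p)
               for p in pairings(list(range(len(i)))))

def direct_correlator(i):
    # the coordinates are independent, so the correlator factors into
    # one-dimensional moments: (m-1)!! for multiplicity m, zero if m is odd
    def dfact(n):
        return 1 if n <= 0 else n * dfact(n - 2)
    mults = Counter(i).values()
    if any(m % 2 for m in mults):
        return 0
    prod = 1
    for m in mults:
        prod *= dfact(m - 1)
    return prod

assert wick_correlator((1, 1, 1, 1)) == direct_correlator((1, 1, 1, 1)) == 3
assert wick_correlator((1, 1, 2, 2)) == direct_correlator((1, 1, 2, 2)) == 1
assert wick_correlator((1, 2, 3)) == 0
```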

At the end of Lecture 15, we were considering the Gaussian probability measure on the real vector space $\mathrm{H}(N)$ of $N \times N$ Hermitian matrices (i.e. complex matrices such that $X^*=X$). This real vector space is isomorphic to $\mathbb{R}^{N^2}:$ an isomorphism is obtained by reading the upper triangular part of the Hermitian matrix $X=[X_{ij}]$ across rows, so that each matrix element $X_{ij}$ with $i<j$ contributes two real parameters, $x_{ij} =\mathrm{Re}X_{ij}$ and $y_{ij} = \mathrm{Im} X_{ij},$ for a total of $N(N-1)$ real parameters, and each of the $N$ diagonal matrix elements $X_{ii}$ contributes the single real parameter $x_{ii} = \mathrm{Re} X_{ii},$ giving $N$ more. A natural scalar product on $\mathrm{H}(N)$ is given by the trace form, $\langle X,Y \rangle = \mathrm{Tr}XY,$ and one can define the corresponding Gaussian measure on $\mathrm{H}(N)$ to be the absolutely continuous measure (w.r.t. Lebesgue measure) with density

$\frac{1}{Z_N} e^{-\frac{1}{2} \mathrm{Tr} X^2}$,

where

$Z_N = \int\limits_{\mathrm{H}(N)} e^{-\frac{1}{2} \mathrm{Tr} X^2} \mathrm{d}X$

is a normalization constant which can be explicitly computed, but is not simply equal to

$(2\pi)^{\frac{N^2}{2}},$

as you might expect. The reason for this is the identity

$\mathrm{Tr} X^2 = \sum\limits_{i,j=1}^N X_{ij}X_{ji} = \sum\limits_{i,j=1}^N X_{ij}\overline{X_{ij}} = \sum\limits_{i=1}^N x_{ii}^2 + 2\sum\limits_{1 \leq i < j \leq N} \left( x_{ij}^2 + y_{ij}^2 \right),$

which corresponds to a scalar product on $\mathbb{R}^{N^2},$ but not the standard scalar product (i.e. dot product) because of the factor of $2$ on terms coming from off-diagonal matrix elements. This is a minor difference, and the upshot is that the Wick formula still works, and reduces the computation of correlation functions

$\langle X_{i(1)j(1)} \dots X_{i(d)j(d)} \rangle= \int\limits_{\mathrm{H}(N)} X_{i(1)j(1)} \dots X_{i(d)j(d)} e^{-\frac{1}{2}\mathrm{Tr} X^2}\frac{\mathrm{d}X}{Z_N}$

to two-point correlators, which are given by

$\langle X_{ij}X_{kl} \rangle = \delta_{il}\delta_{jk}.$

For example, we have

$\langle X_{i(1)j(1)}X_{i(2)j(2)}X_{i(3)j(3)}X_{i(4)j(4)}\rangle = \langle X_{i(1)j(1)}X_{i(2)j(2)}\rangle\langle X_{i(3)j(3)}X_{i(4)j(4)} \rangle \\ + \langle X_{i(1)j(1)}X_{i(3)j(3)}\rangle\langle X_{i(2)j(2)}X_{i(4)j(4)} \rangle + \langle X_{i(1)j(1)}X_{i(4)j(4)}\rangle\langle X_{i(2)j(2)}X_{i(3)j(3)} \rangle,$

the terms of which correspond, just as in the vector case, to the fixed point free involutions in $\mathrm{S}(4),$ and this reduces to the corresponding sum of delta functions of indices,

$\delta_{i(1)j(2)}\delta_{i(2)j(1)} \delta_{i(3)j(4)}\delta_{i(4)j(3)} + \delta_{i(1)j(3)}\delta_{i(3)j(1)} \delta_{i(2)j(4)}\delta_{i(4)j(2)} + \delta_{i(1)j(4)}\delta_{i(4)j(1)} \delta_{i(2)j(3)}\delta_{i(3)j(2)}.$
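The two-point formula $\langle X_{ij}X_{kl} \rangle = \delta_{il}\delta_{jk}$ can also be checked by Monte Carlo. In the sketch below (the sampler and its names are my own), the variances of the matrix entries are read off from the decomposition of $\mathrm{Tr} X^2$ above: diagonal entries have variance $1,$ while the real and imaginary parts of off-diagonal entries each have variance $\frac{1}{2}:$

```python
import random

random.seed(0)

def sample_gue(n):
    # Hermitian sample from the density proportional to exp(-Tr X^2 / 2):
    # diagonal entries N(0,1); off-diagonal real/imag parts N(0, 1/2)
    X = [[0j] * n for _ in range(n)]
    for i in range(n):
        X[i][i] = complex(random.gauss(0.0, 1.0))
        for j in range(i + 1, n):
            z = complex(random.gauss(0.0, 0.5 ** 0.5),
                        random.gauss(0.0, 0.5 ** 0.5))
            X[i][j], X[j][i] = z, z.conjugate()
    return X

# Monte Carlo estimate of two 2-point correlators in H(4)
trials, n = 50_000, 4
acc_same, acc_diff = 0.0, 0.0
for _ in range(trials):
    X = sample_gue(n)
    acc_same += (X[0][1] * X[1][0]).real   # expect delta_il delta_jk = 1
    acc_diff += (X[0][1] * X[2][3]).real   # expect 0

assert abs(acc_same / trials - 1.0) < 0.05
assert abs(acc_diff / trials) < 0.05
```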

Let us use the matrix Wick lemma to compute a particularly interesting correlation function: $\langle \mathrm{Tr} X^d \rangle.$ On one hand, this has a spectral interpretation: we have

$\langle \mathrm{Tr} X^d \rangle = \int\limits_{\mathrm{H}(N)} \left(\lambda_1(X)^d + \dots +\lambda_N(X)^d \right) e^{-\frac{1}{2}\mathrm{Tr} X^2}\frac{\mathrm{d}X}{Z_N},$

where $\lambda_i(X)$ are the eigenvalues of $X.$ On the other hand, this is a sum of $d$-point correlation functions, namely

$\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \langle X_{i(1)i(2)}X_{i(2)i(3)} \dots X_{i(d)i(1)} \rangle,$

which may more compactly be written as

$\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \left\langle \prod_{k=1}^d X_{i(k)i\gamma(k)}\right\rangle,$

where $\gamma=(1\ 2\ \dots\ d)$ is the full forward cycle in the symmetric group $\mathrm{S}(d).$ In fact, it is convenient here to change the variance in the Gaussian measure, scaling it to match the $N$ in $\mathrm{H}(N)$ (you’ll see why in a moment). This means that we are instead computing

$\langle \mathrm{Tr} X^d \rangle = \int\limits_{\mathrm{H}(N)} \mathrm{Tr} X^d e^{-\frac{N}{2}\mathrm{Tr} X^2}\frac{\mathrm{d}X}{Z_N},$

which has the effect of dividing the $2$-point functions by $N,$ i.e.

$\langle X_{ij}X_{kl} \rangle = \frac{\delta_{il}\delta_{jk}}{N}.$

Now if we apply the Wick formula, the computation evolves as follows:

$\langle \mathrm{Tr} X^d \rangle = \sum\limits_{i \in \mathrm{Fun}(d,N)} \left\langle \prod_{k=1}^d X_{i(k)i\gamma(k)}\right\rangle \\ = \sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(rs) \in \tau} \langle X_{i(r)i\gamma(r)} X_{i(s)i\gamma(s)} \rangle \\ = \sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(rs) \in \tau} \frac{\delta_{i(r)i\gamma(s)}\delta_{i(s)i\gamma(r)}}{N} \\ = N^{-\frac{d}{2}}\sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} \prod\limits_{(rs) \in \tau} \delta_{i(r)i\gamma(s)}\delta_{i(s)i\gamma(r)}.$

Now, to say that $(r\ s) \in \tau$ simply means that $(r\ s)$ is a cycle of the involution $\tau,$ which in turn just means that

$\tau(r)=s \quad\text{ and }\tau(s)=r,$

so that

$\delta_{i(r)i\gamma(s)} = \delta_{i(r)i\gamma\tau(r)} \quad\text{ and }\quad \delta_{i(s)i\gamma(r)}=\delta_{i(s)i\gamma\tau(s)},$

and we see that the product

$\prod\limits_{(rs) \in \tau} \delta_{i(r)i\gamma(s)}\delta_{i(s)i\gamma(r)}$

is really the single condition $[i=i\gamma\tau]$ at the level of equality of functions (juxtaposition of functions means composition). We thus have

$\langle \mathrm{Tr} X^d \rangle = N^{-\frac{d}{2}}\sum\limits_{i \in \mathrm{Fun}(d,N)} \sum\limits_{\tau \in \mathrm{S}_2(d)} [i=i\gamma\tau] = N^{-\frac{d}{2}}\sum\limits_{\tau \in \mathrm{S}_2(d)} \sum\limits_{i \in \mathrm{Fun}(d,N)} [i=i\gamma\tau].$

Now, the condition $[i=i\gamma\tau]$ is not difficult to interpret: in order for it to return true, it must be the case that $i$ is constant on the cycles of the permutation $\gamma\tau.$ To build such a function, we are free to assign any of the numbers $1,\dots,N$ as the value of $i$ on each cycle of $\gamma\tau,$ so there are $N^{c(\gamma\tau)}$ such functions, and we thus obtain

$\langle \mathrm{Tr} X^d \rangle = \sum\limits_{\tau \in \mathrm{S}_2(d)} N^{c(\gamma\tau)-\frac{d}{2}}.$

And now, we know how to make topology out of this: the combinatorial map $(\gamma,\tau)$ yields a surface by taking the single cycle of $\gamma=(1\ 2\ \dots\ d)$ and writing it out as a $d$-gon, and then gluing its sides in pairs using the involution $\tau.$ When we do this, the former vertices and edges of the polygon become the vertices and edges of a graph drawn on the resulting surface: the number of vertices of this embedded graph is the number of cycles in $\gamma\tau,$ the number of edges is $\frac{d}{2},$ and of course the number of faces is $1,$ since we started with a single polygon. We thus have that

$\frac{1}{N} \langle \mathrm{Tr} X^d \rangle = \sum\limits_{\tau \in \mathrm{S}_2(d)} N^{c(\gamma\tau)-\frac{d}{2}-1} = \sum\limits_{\tau \in \mathrm{S}_2(d)} N^{(c(\gamma\tau)-\frac{d}{2}+1)-2},$

so that by the Euler formula $v-e+f=2-2g$ the exponent of $N$ is exactly $-2g,$ where $g$ is the genus of the surface obtained from $(\gamma,\tau).$ We thus find that the matrix Wick formula has provided us with a generating function for the number of one-face maps obtained by gluing a $d$-gon, sorted by genus:

$\frac{1}{N} \langle \mathrm{Tr} X^d \rangle = \sum\limits_{g=0}^\infty \frac{\varepsilon_g(d)}{N^{2g}},$

where $\varepsilon_g(d)$ is the number of gluings of a $d$-gon that produce a surface of genus $g.$ And that’s the Harer-Zagier theorem. Next lecture, we’ll talk about what it’s good for.
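Everything in this derivation is finite, so the genus counts can be verified by brute force. The sketch below (function names are my own, and the labels are $0$-indexed) enumerates the fixed point free involutions $\tau,$ counts the cycles of $\gamma\tau,$ and bins each gluing by genus via $c(\gamma\tau) = \frac{d}{2}+1-2g;$ it recovers, for example, $\varepsilon_0(4)=2$ and $\varepsilon_1(4)=1:$

```python
def involutions(d):
    # all fixed-point-free involutions of {0,...,d-1}, as dicts
    def rec(elems):
        if not elems:
            yield {}
            return
        a = elems[0]
        for k in range(1, len(elems)):
            b = elems[k]
            rest = elems[1:k] + elems[k + 1:]
            for m in rec(rest):
                m2 = dict(m)
                m2[a], m2[b] = b, a
                yield m2
    yield from rec(list(range(d)))

def cycle_count(perm, d):
    # number of cycles of a permutation given as a dict on {0,...,d-1}
    seen, c = set(), 0
    for start in range(d):
        if start not in seen:
            c += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return c

def genus_counts(d):
    # epsilon_g(d): gluings of a d-gon of genus g, via c = d/2 + 1 - 2g
    gamma = lambda k: (k + 1) % d   # full forward cycle, 0-indexed
    counts = {}
    for tau in involutions(d):
        c = cycle_count({k: gamma(tau[k]) for k in range(d)}, d)
        g = (d // 2 + 1 - c) // 2
        counts[g] = counts.get(g, 0) + 1
    return counts

assert genus_counts(2) == {0: 1}
assert genus_counts(4) == {0: 2, 1: 1}
```

Note that the planar counts $\varepsilon_0(2)=1,$ $\varepsilon_0(4)=2,$ $\varepsilon_0(6)=5$ are the Catalan numbers, as they should be.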