Math 289A: Lecture 15

Today we want to prove the “inverse Wigner theorem” we started discussing last time. Let (B_N)_{N=1}^\infty be a sequence of traceless deterministic Hermitian matrices, with B_N of size N \times N, and for each N \in \mathbb{N} let X_N=U_NB_NU_N^* be a uniformly random Hermitian matrix isospectral with B_N. Suppose that the empirical eigenvalue distribution of \frac{1}{\sqrt{N}}B_N converges in moments as N \to \infty, meaning that the limit

\gamma_d=\lim\limits_{N \to \infty} \frac{1}{N} \mathrm{Tr} \left(\frac{1}{\sqrt{N}}B_N\right)^d

exists for each d \in \mathbb{N}. Equivalently, we have

\gamma_d=\lim\limits_{N \to \infty} \frac{p_d\left( \frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}}\right)}{N}

with B_{N1},\dots,B_{NN} any enumeration of the eigenvalues of B_N and p_d(x_1,\dots,x_N)=x_1^d + \dots + x_N^d the power sum polynomial of degree d. Let X_{N1},\dots,X_{NN} denote the diagonal matrix elements of the random matrix X_N, so

\begin{bmatrix} X_{N1} \\ \vdots \\ X_{NN} \end{bmatrix} = \begin{bmatrix} {} & \vdots & {} \\ \dots & |U_N(i,j)|^2 & \dots \\ {} & \vdots & {}\end{bmatrix}\begin{bmatrix} B_{N1} \\ \vdots \\ B_{NN} \end{bmatrix}

with U_N a uniformly random unitary matrix.
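For a concrete running example (an illustration, not from the lecture), take N even and let B_N have N/2 eigenvalues equal to \sqrt{N} and N/2 eigenvalues equal to -\sqrt{N}, so that \mathrm{Tr}\,B_N=0. Then

\frac{p_d\left(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}}\right)}{N}=\frac{\frac{N}{2}(1)^d+\frac{N}{2}(-1)^d}{N}=\frac{1+(-1)^d}{2},

so \gamma_d=1 for even d and \gamma_d=0 for odd d; in particular, \gamma_2=1. If you want to experiment with these objects numerically, here is a minimal Python sketch (again mine, not part of the lecture; it assumes numpy and scipy are available) which samples U_N from Haar measure, forms X_N, and checks the displayed identity for the diagonal entries:

```python
import numpy as np
from scipy.stats import unitary_group

# Minimal sketch: build X_N = U_N B_N U_N^* for a Haar-random unitary U_N and
# verify that the diagonal of X_N equals the doubly stochastic matrix
# (|U_N(i,j)|^2) applied to the eigenvalues of B_N.
N = 200
b = np.sqrt(N) * np.concatenate([np.ones(N // 2), -np.ones(N // 2)])  # traceless spectrum

U = unitary_group.rvs(N)         # Haar-distributed N x N unitary matrix
X = U @ np.diag(b) @ U.conj().T  # random Hermitian matrix isospectral with B_N

assert np.allclose(np.real(np.diag(X)), (np.abs(U) ** 2) @ b)
```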

Theorem 15.1. For any fixed r \in \mathbb{N}, the random vector (X_{N1},\dots,X_{Nr}) converges in moments to a vector (Y_1,\dots,Y_r) of iid centered Gaussians of variance \gamma_2. More precisely, we have

\lim\limits_{N \to \infty} \left\langle X_{N1}^{\alpha_1} \dots X_{Nr}^{\alpha_r}\right\rangle = \left\langle Y_1^{\alpha_1} \dots Y_r^{\alpha_r}\right\rangle

for any vector (\alpha_1,\dots,\alpha_r) of nonnegative integers.
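Before turning to the proof, here is a hedged Monte Carlo illustration (mine, not from the lecture) for the two-point spectrum example above, where \gamma_2=1: the low joint moments of (X_{N1},X_{N2}) should be close to those of a pair of iid standard Gaussians.

```python
import numpy as np
from scipy.stats import unitary_group

# Monte Carlo sketch of Theorem 15.1 for B_N with eigenvalues +-sqrt(N),
# so gamma_2 = 1 and the limit is a pair of iid standard Gaussians.
trials, N = 2000, 100
b = np.sqrt(N) * np.concatenate([np.ones(N // 2), -np.ones(N // 2)])  # eigenvalues of B_N

samples = np.empty((trials, 2))
for t in range(trials):
    U = unitary_group.rvs(N)
    samples[t] = (np.abs(U[:2, :]) ** 2) @ b  # (X_{N1}, X_{N2})

x1, x2 = samples[:, 0], samples[:, 1]
print(np.mean(x1), np.mean(x1 ** 2), np.mean(x1 ** 4))  # expect roughly 0, 1, 3
print(np.mean(x1 * x2), np.mean(x1 ** 2 * x2 ** 2))     # expect roughly 0, 1
```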

Here is a corollary which further explains why we are calling Theorem 15.1 an inverse Wigner theorem.

Corollary 15.1. For any r \in \mathbb{N}, the r \times r principal submatrix of X_N converges in moments to \mathrm{GUE}_r(\gamma_2), i.e. to an r \times r random Hermitian matrix following a centered Gaussian distribution of variance \gamma_2.

Problem 15.1. Deduce Corollary 15.1 from Theorem 15.1.

The rest of the lecture consists of the proof of Theorem 15.1. First, let us explain the assumption that our deterministic data B_N has trace zero. The random variables X_{N1},\dots,X_{NN} are identically distributed (but not independent), with first moment

\langle X_{N1} \rangle = \frac{B_{N1}+\dots+B_{NN}}{N}= \frac{1}{N}\mathrm{Tr}\, B_N.

So, assuming \mathrm{Tr}\, B_N=0 is equivalent to centering X_{N1},\dots,X_{NN}.

Now we start the proof. The joint distribution of X_{N1},\dots,X_{NN} is a compactly supported probability measure on \mathbb{R}^N — indeed, it is supported on the compact convex set whose extreme points are the distinct permutations of the deterministic vector (B_{N1},\dots,B_{NN}). Therefore, the Laplace transform

K_N(z,a_1,\dots,a_N) = \left\langle e^{z(a_1X_{N1} + \dots + a_NX_{NN})}\right\rangle

is an entire function of the complex variables z,a_1,\dots,a_N, i.e. K_N is an analytic function on \mathbb{C}^{N+1}. Note that if we set z=i and restrict the remaining variables (a_1,\dots,a_N) to real values, the Laplace transform specializes to the characteristic function of (X_{N1},\dots,X_{NN}), so it completely determines the distribution of this random vector. The variable z is redundant, but it is a convenient marker of homogeneity: we write the Maclaurin expansion of K_N as

K_N(z,a_1,\dots,a_N) = 1 + \sum\limits_{d=1}^\infty \frac{z^d}{d!} K_N^d(a_1,\dots,a_N),

where K_N^d \colon \mathbb{C}^N \to \mathbb{C} is a homogeneous polynomial of degree d. Indeed, this polynomial is

K_N^d(a_1,\dots,a_N) = \left\langle (a_1X_{N1} + \dots + a_NX_{NN})^d \right\rangle,

which in the standard monomial basis of homogeneous polynomials is

K_N^d(a_1,\dots,a_N) = \sum\limits_{\alpha} a_1^{\alpha_1} \dots a_N^{\alpha_N} \langle X_{N1}^{\alpha_1} \dots X_{NN}^{\alpha_N}\rangle {d \choose \alpha_1,\dots,\alpha_N},

the sum being over all vectors \alpha=(\alpha_1,\dots,\alpha_N) of nonnegative integers whose coordinates sum to d, aka weak N-part compositions of d. We call K_N^d the degree d moment polynomial of X_{N1},\dots,X_{NN}, because its coefficients in the monomial basis are the degree d joint moments of these random variables (up to a multinomial coefficient). So our objective is to understand the N \to \infty behavior of this polynomial restricted to the r-dimensional subspace of \mathbb{C}^N spanned by the first r coordinate vectors, where r \in \mathbb{N} is arbitrary but fixed. To be completely explicit, this is the polynomial

K_N^d(a_1,\dots,a_r) := K_N^d(a_1,\dots,a_r,0,\dots,0)=\sum\limits_{\alpha} a_1^{\alpha_1} \dots a_r^{\alpha_r} \langle X_{N1}^{\alpha_1} \dots X_{Nr}^{\alpha_r}\rangle {d \choose \alpha_1,\dots,\alpha_r},

the sum being over weak r-part compositions of d. This is an important simplification, since the number of variables of K_N^d(a_1,\dots,a_r) is fixed, i.e. its domain does not vary with N.
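For example (a worked special case, not in the original), in degree d=2 with r=2 retained variables the multinomial coefficients are {2 \choose 2}=1 and {2 \choose 1,1}=2, so this polynomial is

K_N^2(a_1,a_2) = a_1^2\langle X_{N1}^2\rangle + 2a_1a_2\langle X_{N1}X_{N2}\rangle + a_2^2\langle X_{N2}^2\rangle.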

In fact, it is more convenient to work with the log-Laplace transform of X_{N1},\dots,X_{NN}. More precisely, since K_N(0,\dots,0)=1, there exists a simply connected open neighborhood \Omega_N of the origin in \mathbb{C}^{N+1} on which K_N is nonvanishing, and there is a unique analytic function L_N \colon \Omega_N \to \mathbb{C} vanishing at the origin which satisfies

K_N(z,a_1,\dots,a_N) = e^{L_N(z,a_1,\dots,a_N)}, \quad (z,a_1,\dots,a_N) \in \Omega_N.

Let

L_N(z,a_1,\dots,a_N) = \sum\limits_{d=1}^\infty \frac{z^d}{d!} L_N^d(a_1,\dots,a_N)

be the Maclaurin expansion of this analytic function, so L_N^d(a_1,\dots,a_N) is a homogeneous degree d polynomial. This is called the degree d cumulant polynomial of the random variables X_{N1},\dots,X_{NN}. According to the Exponential Formula (aka moment-cumulant formula), these cumulants are recursively determined by the moment polynomials by a triangular system of equations, the first three of which are (suppressing the variables a_1,\dots,a_N)

K_N^1=L_N^1,\quad K_N^2=L_N^2+(L_N^1)^2,\quad K_N^3 =L_N^3+3L_N^2L_N^1+(L_N^1)^3.
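If you like, these identities (and as many more of them as you have patience for) can be checked with a computer algebra system. Here is a small sympy sketch, an illustration of the Exponential Formula rather than anything from the lecture:

```python
import sympy as sp

# Expand exp(L_N(z)) = K_N(z) as a formal power series in the marker variable z
# and read off the first three moment-cumulant relations.
z = sp.symbols('z')
L1, L2, L3 = sp.symbols('L1 L2 L3')  # stand-ins for L_N^1, L_N^2, L_N^3

L = L1 * z + L2 * z**2 / 2 + L3 * z**3 / 6
K = sp.series(sp.exp(L), z, 0, 4).removeO()

print(sp.expand(K.coeff(z, 1) * sp.factorial(1)))  # L1
print(sp.expand(K.coeff(z, 2) * sp.factorial(2)))  # L1**2 + L2
print(sp.expand(K.coeff(z, 3) * sp.factorial(3)))  # L1**3 + 3*L1*L2 + L3
```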

We can write the polynomial L_N^d in exactly the same form as the moment polynomial K_N^d,

L_N^d(a_1,\dots,a_N) = \sum\limits_{\alpha} a_1^{\alpha_1} \dots a_N^{\alpha_N} \langle X_{N1}^{\alpha_1} \dots X_{NN}^{\alpha_N}\rangle_c {d \choose \alpha_1,\dots,\alpha_N},

the only difference being the subscript on the expectation-like quantity \langle X_{N1}^{\alpha_1} \dots X_{NN}^{\alpha_N}\rangle_c, which is known as a degree d joint cumulant of X_{N1},\dots,X_{NN}. Note that this is a definition, not a proposition – the relationship between the Maclaurin expansions of the Laplace transform and the log-Laplace transform implicitly defines these statistics.

Problem 15.2. Show that \langle X_{N1}X_{N2}\rangle_c=\langle X_{N1}X_{N2}\rangle-\langle X_{N1}\rangle \langle X_{N2}\rangle is the covariance of X_{N1} and X_{N2}.

Our discussion so far pertains to any finite family of random variables with compactly supported joint distribution. Now we will start invoking special features of the family X_{N1},\dots,X_{NN} we are actually dealing with. The first of these is that these random variables are exchangeable, which is equivalent to saying that the cumulant polynomials L_N^d(a_1,\dots,a_N) are symmetric polynomials. This means that we can also expand them as linear combinations of Newton polynomials, and as previously discussed we have a powerful understanding of this expansion.
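For instance (a small example, not in the original), in degree 2 the polynomials p_{(2)}=p_2 and p_{(1,1)}=p_1^2 form a basis of the space of symmetric homogeneous polynomials of degree 2 in a_1,\dots,a_N, so L_N^2 = c_{(2)}\,p_{(2)} + c_{(1,1)}\,p_{(1,1)} for some scalars c_{(2)},c_{(1,1)} depending on N. The next theorem identifies these coefficients, in every degree.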

Theorem 15.2 (Leading Cumulants Formula). For any 1 \leq d \leq N, we have

L_N^d(a_1,\dots,a_N) = N^{2-d} \sum\limits_{\alpha \vdash d} \frac{p_\alpha(a_1,\dots,a_N)}{N^{\ell(\alpha)}}\sum\limits_{\beta \vdash d} \frac{p_\beta(B_{N1},\dots,B_{NN})}{N^{\ell(\beta)}} L_N(\alpha,\beta),

where each sum is over the set of integer partitions of d and

L_N(\alpha,\beta)=(-1)^{\ell(\alpha)+\ell(\beta)} \sum\limits_{g=0}^\infty N^{-2g} \vec{H}_g(\alpha,\beta)

is a convergent power series in N^{-2} whose coefficients \vec{H}_g(\alpha,\beta) are positive integers.
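As a quick sanity check (a worked special case, not spelled out in the lecture), take d=1. The only partition of 1 is (1), so the formula reads

L_N^1(a_1,\dots,a_N) = N\cdot\frac{p_1(a_1,\dots,a_N)}{N}\cdot\frac{p_1(B_{N1},\dots,B_{NN})}{N}\,L_N((1),(1)) = \frac{p_1(a_1,\dots,a_N)\,\mathrm{Tr}\,B_N}{N}\,L_N((1),(1)),

which vanishes identically since \mathrm{Tr}\,B_N=0; we will use this at the end of the proof.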

The combinatorial coefficients \vec{H}_g(\alpha,\beta) are quite intricate and there is no simple closed formula for them, but luckily we do not need to know very much about them in order to prove our inverse Wigner theorem. All that matters for us is that this magic formula expresses the cumulant polynomials of X_{N1},\dots,X_{NN} in terms of the moments of the empirical eigenvalue distribution of our deterministic data matrices B_N in a quite transparent way.

To leverage this, let us restrict to the cumulant polynomials of X_{N1},\dots,X_{Nr}, which just means setting the last N-r variables of L_N^d(a_1,\dots,a_N) equal to zero (this makes sense because r is fixed while N increases without bound, so eventually N \geq r). We write the resulting polynomial in the form

L_N^d(a_1,\dots,a_r) = N^{1-\frac{d}{2}}\sum\limits_{\alpha \vdash d} \frac{p_\alpha(a_1,\dots,a_r)}{N^{\ell(\alpha)-1}}\sum\limits_{\beta \vdash d} \frac{p_\beta(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}})}{N^{\ell(\beta)}} L_N(\alpha,\beta).

This is just Theorem 15.2 rewritten: since p_\beta is homogeneous of degree d for \beta \vdash d, we have p_\beta(B_{N1},\dots,B_{NN})=N^{\frac{d}{2}}p_\beta(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}}), and absorbing one power of N into the \alpha-factor (replacing N^{\ell(\alpha)} by N^{\ell(\alpha)-1}) converts the prefactor N^{2-d} into N^{1-\frac{d}{2}}. Now ignore the factor of N^{1-\frac{d}{2}} for a moment and just think about the behavior of the double sum as N \to \infty. The number of terms in this sum depends only on d, not on N, which means that we only need to understand the asymptotics of each individual term

\frac{p_\alpha(a_1,\dots,a_r)}{N^{\ell(\alpha)-1}}\frac{p_\beta(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}})}{N^{\ell(\beta)}} L_N(\alpha,\beta).

Reading from right to left, we have that L_N(\alpha,\beta) \to (-1)^{\ell(\alpha)+\ell(\beta)}\vec{H}_0(\alpha,\beta) as N \to \infty; in particular, this factor is O(1). By hypothesis, since p_\beta factors as a product of power sums, we have

\frac{p_\beta(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}})}{N^{\ell(\beta)}} = \prod_{i=1}^{\ell(\beta)}\frac{p_{\beta_i}(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}})}{N} \longrightarrow \prod_{i=1}^{\ell(\beta)}\gamma_{\beta_i}=:\gamma_\beta.

Finally, since

|p_\alpha(a_1,\dots,a_r)| \leq r^{\ell(\alpha)} \|(a_1,\dots,a_r)\|_\infty^d,

we have

\frac{p_\alpha(a_1,\dots,a_r)}{N^{\ell(\alpha)-1}} = O\left(\frac{1}{N^{\ell(\alpha)-1}}\right)

as N \to \infty. Thus, the whole double sum is O(1) as N \to \infty, so now is a good time to remember the prefactor N^{1-\frac{d}{2}} and immediately conclude that

L_N^d(a_1,\dots,a_r) \longrightarrow 0

as N \to \infty, uniformly on compact subsets of \mathbb{C}^r, for all d \geq 3. For d=2, the prefactor is N^{1-\frac{2}{2}}=1 and only the \alpha=(2) term survives in the limit, so

L_N^2(a_1,\dots,a_r) \to p_2(a_1,\dots,a_r)\left(\gamma_2\vec{H}_0(2,2)-\gamma_1^2\vec{H}_0(2,11)\right),

uniformly on compact subsets of \mathbb{C}^r, and since \gamma_1=0 and \vec{H}_0(d,d)=(d-1)! this simplifies to

L_N^2(a_1,\dots,a_r) \to \gamma_2\,p_2(a_1,\dots,a_r).

Finally, the polynomial L_N^1(a_1,\dots,a_N) is identically zero by our trace zero hypothesis. What we have shown is that all joint cumulants of X_{N1},\dots,X_{Nr} converge to zero as N \to \infty except for pure cumulants of degree two,

\langle X_{N1}^2\rangle_c,\dots,\langle X_{Nr}^2\rangle_c,

each of which converges to \gamma_2.

Problem 15.3. Compute the joint cumulants of a finite family of centered iid Gaussians and complete the proof.
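For reference, here is a sketch of the computation the problem is pointing at. If Y_1,\dots,Y_r are iid centered Gaussians of variance \gamma_2, then

\left\langle e^{z(a_1Y_1+\dots+a_rY_r)}\right\rangle = e^{\frac{\gamma_2z^2}{2}(a_1^2+\dots+a_r^2)},

so the log-Laplace transform is the single term \frac{z^2}{2!}\,\gamma_2\,p_2(a_1,\dots,a_r): every cumulant polynomial of Y_1,\dots,Y_r vanishes except the one in degree two, which equals \gamma_2\,p_2. These are exactly the limits computed above, and since the Exponential Formula determines moments from cumulants through a triangular system, convergence of all cumulants implies convergence of all joint moments.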
