Math 289A: Lecture 12

This week we will finish working through the Olshanski-Vershik limit theorem, and explain how the tools we have developed along the way have set us up to prove other kinds of limit theorems about random matrices.

Recall the setting of the OV paper: we have an infinite triangular array of real numbers,

\begin{matrix} b_{11} & {} & {} \\ b_{21} & b_{22} & {} \\ b_{31} & b_{32} & b_{33} \\ \vdots & \vdots & \vdots \end{matrix}.

Assume the entries are sorted so that each row is non-increasing from left to right and let

B_N=(b_{N1},\dots,b_{NN})

be the Nth row. Thinking of B_N \in \mathbb{W}_N as a diagonal matrix, define a corresponding random Hermitian matrix by

X_N=U_NB_NU_N^*,

where U_N \in \mathbb{U}_N is Haar-distributed.

We know that in the characteristic function of X_N,

\varphi_{X_N}(A) = \int\limits_{\mathbb{U}_N} e^{i \mathrm{Tr}\, AUB_NU^*}\mathrm{d}U,\quad A \in \mathbb{H}_N,

we can assume without loss of generality that A is diagonal, so what we are really looking at is the sequence of characteristic functions

K_N \colon \mathbb{R}^N \longrightarrow \mathbb{C}

corresponding to our data set b_{ij} given as

K_N(a_1,\dots,a_N) = \int\limits_{\mathbb{U}_N} e^{i \sum_{x,y=1}^N a_xb_{Ny}|U_{xy}|^2}\mathrm{d}U,

or equivalently

K_N(a_1,\dots,a_N) = \mathbb{E}\left[e^{i(a_1X_N(1,1)+\dots+a_NX_N(N,N))}\right], \quad (a_1,\dots,a_N) \in \mathbb{R}^N,

where the diagonal matrix elements X_N(1,1),\dots,X_N(N,N) are exchangeable (but not independent) real random variables. Part I of the Olshanski-Vershik theorem is concerned with finding necessary and sufficient conditions on the triangular array b_{ij} such that the characteristic function

K_N(a)=K_N(a,0,\dots,0), \quad a \in \mathbb{R},

of a single diagonal entry X_N(1,1) converges as N\to \infty. Note that extracting the (1,1)-entry on either side of X_N=U_NB_NU_N^*, we have

X_N(1,1) = \sum\limits_{j=1}^N |U_{1j}|^2 b_{Nj},

showing explicitly how X_N(1,1) is built from the uniformly random unit vector (U_{11},\dots,U_{1N}) in \mathbb{C}^N and the data B_N=(b_{N1},\dots,b_{NN}).
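
To make the construction concrete, here is a minimal numerical sketch in Python (assuming NumPy; the data vector b below is an arbitrary placeholder, not anything from the OV setting). It samples a Haar-distributed U_N via the standard QR-of-Ginibre recipe, extracts one sample of X_N(1,1), and then estimates the first few moments \mathbb{E}[X_N(1,1)^d] by Monte Carlo, using the fact that the first row of a Haar unitary is a uniformly random unit vector in \mathbb{C}^N.

```python
import numpy as np

def haar_unitary(N, rng):
    """Sample an N x N Haar-distributed unitary: QR of a complex Ginibre matrix,
    with the standard phase correction on the diagonal of R."""
    Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    diag = np.diagonal(R)
    return Q * (diag / np.abs(diag))

rng = np.random.default_rng(0)
N = 100
b = np.sort(rng.standard_normal(N))[::-1]  # placeholder row B_N, sorted non-increasing

# One sample of X_N(1,1) = sum_j |U_{1j}|^2 b_{Nj} from a full Haar unitary.
U = haar_unitary(N, rng)
print(np.sum(np.abs(U[0, :]) ** 2 * b))

# Monte Carlo moments: only the first row of U_N enters X_N(1,1), and the first
# row of a Haar unitary is a uniform unit vector in C^N, i.e. a normalized
# complex Gaussian vector.
samples = 50_000
z = rng.standard_normal((samples, N)) + 1j * rng.standard_normal((samples, N))
w = np.abs(z) ** 2
X = (w / w.sum(axis=1, keepdims=True)) @ b
for d in range(1, 5):
    print(d, np.mean(X ** d))  # empirical K_N^d
```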

The characteristic function K_N(a) of X_N(1,1) has the power series expansion

K_N(a) = 1 + \sum\limits_{d=1}^\infty \frac{(ia)^d}{d!}K_N^d,

where K_N^d=\mathbb{E}[X_N(1,1)^d] are the moments of this real random variable. Olshanski-Vershik begin by finding sufficient conditions on the data b_{ij} such that, for each fixed d \in \mathbb{N}, the sequence of numbers (K_N^d)_{N=1}^\infty converges to a limit K^d as N \to \infty. In other words, they begin by finding conditions on the data b_{ij} such that the random variable X_N(1,1) converges in moments. We will do the same thing, but in a different way, using the following consequence of Theorem 10.1 in Lecture 10.

Theorem 12.1 (Leading Moments Formula). For each 1 \leq d \leq N, we have

K_N^d = N^{-d}\sum\limits_{\alpha,\beta \vdash d} p_\beta(b_{N1},\dots,b_{NN})\sum\limits_{r=0}^\infty \frac{(-1)^r}{N^r}\vec{W}^r(\alpha,\beta),

where the inner sum is absolutely convergent and \vec{W}^r(\alpha,\beta) is the number of r-step monotone walks on the symmetric group of degree d which begin at a permutation of cycle type \alpha and end at a permutation of cycle type \beta.
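
For small d, the counts \vec{W}^r(\alpha,\beta) can be computed by brute force. The sketch below assumes the conventions from the earlier lectures: a step is multiplication by a transposition (a\,b) with a<b, and a walk is monotone when the larger elements of its successive steps are non-decreasing. The S(2) sanity check at the end reflects the fact that the only possible step on S(2) is the transposition (1\,2).

```python
from itertools import permutations

def cycle_type(p):
    """Cycle type of a permutation p (one-line notation, 0-indexed), sorted decreasingly."""
    seen, lengths = set(), []
    for i in range(len(p)):
        if i not in seen:
            j, c = i, 0
            while j not in seen:
                seen.add(j)
                j, c = p[j], c + 1
            lengths.append(c)
    return tuple(sorted(lengths, reverse=True))

def swap(p, a, b):
    """Multiply p by the transposition (a b)."""
    q = list(p)
    q[a], q[b] = q[b], q[a]
    return tuple(q)

def W(r, alpha, beta, d):
    """Number of r-step monotone walks on S(d) from cycle type alpha to cycle type
    beta, by dynamic programming over (current permutation, last maximum used)."""
    states = {(p, 0): 1 for p in permutations(range(d)) if cycle_type(p) == alpha}
    for _ in range(r):
        new = {}
        for (p, last), cnt in states.items():
            for b in range(max(last, 1), d):  # monotonicity: maxima non-decreasing
                for a in range(b):
                    key = (swap(p, a, b), b)
                    new[key] = new.get(key, 0) + cnt
        states = new
    return sum(cnt for (p, _), cnt in states.items() if cycle_type(p) == beta)

# Sanity check on S(2): for each r, exactly one of the two starting types reaches
# type (2), since every step is the unique transposition (1 2).
for r in range(8):
    assert W(r, (2,), (2,), 2) + W(r, (1, 1), (2,), 2) == 1

print(W(2, (1, 1, 1), (1, 1, 1), 3))  # 2-step monotone walks on S(3) from id to id
```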

The most efficient way to use this result is to take a logarithm, i.e. to work with the cumulants of X_N(1,1). This is also the path followed by Olshanski-Vershik, but the method they use to do the calculation is different.

Let

L_N(a) = \log K_N(a) = \sum\limits_{d=1}^\infty \frac{(ia)^d}{d!}L_N^d

be the log-characteristic function of X_N(1,1), i.e. the cumulant generating function. The cumulant sequence L_N^1,L_N^2,L_N^3,\dots contains the same information as the moment sequence K_N^1,K_N^2,K_N^3,\dots, via the moment-cumulant relations

K_N^1 =L_N^1

K_N^2 = L_N^2 + L_N^1L_N^1

K_N^3 =L_N^3 + 3L_N^2L_N^1+L_N^1L_N^1L_N^1

\vdots

The cumulants L_N^d encode the “connected” version of the combinatorial problem encoded by the moments K_N^d.
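
The displayed relations are the first instances of the classical moment-cumulant recursion K_N^n = \sum_{k=1}^{n}\binom{n-1}{k-1}L_N^kK_N^{n-k}, with K_N^0=1, which is easy to verify symbolically (a quick check, assuming SymPy is available):

```python
from math import comb
import sympy as sp

# Treat the cumulants L^1, L^2, L^3 as symbols and recover the moments via
# K^n = sum_{k=1}^{n} C(n-1, k-1) * L^k * K^{n-k}, with K^0 = 1.
L1, L2, L3 = sp.symbols('L1 L2 L3')
L = [L1, L2, L3]
K = [sp.Integer(1)]
for n in range(1, 4):
    K.append(sp.expand(sum(comb(n - 1, k - 1) * L[k - 1] * K[n - k]
                           for k in range(1, n + 1))))

print(K[1])  # L1
print(K[2])  # L1**2 + L2
print(K[3])  # L1**3 + 3*L1*L2 + L3
```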

Theorem 12.2 (Leading Cumulants Formula). For each 1 \leq d \leq N, we have

L_N^d =N^{-d} \sum\limits_{\alpha,\beta \vdash d} p_\beta(b_{N1},\dots,b_{NN})\sum\limits_{r=0}^\infty \frac{(-1)^r}{N^r}\vec{H}^r(\alpha,\beta),

where \vec{H}^r(\alpha,\beta) is the number of walks counted by \vec{W}^r(\alpha,\beta) which have the additional feature that their steps and endpoints generate a transitive subgroup of the symmetric group.

The Leading Cumulants Formula is identical to the Leading Moments Formula except that the walk count \vec{W}^r(\alpha,\beta) has been replaced with the connected walk count \vec{H}^r(\alpha,\beta). This seemingly minor change is in fact very advantageous.

Theorem 12.3 (Riemann-Hurwitz Formula). The number \vec{H}^r(\alpha,\beta) is nonzero if and only if there is g \in \mathbb{N}_0 such that

r=2g-2+\ell(\alpha)+\ell(\beta).
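
For small d, Theorem 12.3 can be checked by brute force, under the same assumed walk conventions as in the sketch above: enumerate monotone walks, keep only those whose steps and endpoints generate a transitive subgroup, and verify that \vec{H}^r(\alpha,\beta)\neq 0 exactly for r of the Riemann-Hurwitz form.

```python
from itertools import permutations

def cycle_type(p):
    seen, lengths = set(), []
    for i in range(len(p)):
        if i not in seen:
            j, c = i, 0
            while j not in seen:
                seen.add(j)
                j, c = p[j], c + 1
            lengths.append(c)
    return tuple(sorted(lengths, reverse=True))

def swap(p, a, b):
    q = list(p)
    q[a], q[b] = q[b], q[a]
    return tuple(q)

def transitive(gens, d):
    """Whether the group generated by the permutations gens acts transitively on
    {0, ..., d-1}, tested via the orbit of the point 0."""
    orbit, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for g in gens:
            if g[i] not in orbit:
                orbit.add(g[i])
                frontier.append(g[i])
    return len(orbit) == d

def H(r, alpha, beta, d):
    """Number of r-step monotone walks from type alpha to type beta whose steps
    and endpoints generate a transitive subgroup of S(d). Brute force, small d."""
    def steps_as_perms(steps):
        return [swap(tuple(range(d)), a, b) for a, b in steps]
    total = 0
    def extend(p, steps, last):
        nonlocal total
        if len(steps) == r:
            if cycle_type(p) == beta and transitive([start, p] + steps_as_perms(steps), d):
                total += 1
            return
        for b in range(max(last, 1), d):
            for a in range(b):
                extend(swap(p, a, b), steps + [(a, b)], b)
    for start in permutations(range(d)):
        if cycle_type(start) == alpha:
            extend(start, [], 0)
    return total

# Riemann-Hurwitz check on S(3): H^r(alpha, beta) is nonzero exactly when
# r = 2g - 2 + l(alpha) + l(beta) for some g in {0, 1, 2, ...}.
parts = [(3,), (2, 1), (1, 1, 1)]
for alpha in parts:
    for beta in parts:
        for r in range(6):
            admissible = (r - len(alpha) - len(beta)) % 2 == 0 \
                and r >= len(alpha) + len(beta) - 2
            assert (H(r, alpha, beta, 3) != 0) == admissible
print("Riemann-Hurwitz check passed on S(3) for r <= 5")
```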

Due to the Riemann-Hurwitz formula, it is more convenient to parameterize the numbers \vec{H}^r(\alpha,\beta) by the genus parameter g, i.e. to set

\vec{H}_g(\alpha,\beta):= \vec{H}^{2g-2+\ell(\alpha)+\ell(\beta)}(\alpha,\beta).

Substituting r=2g-2+\ell(\alpha)+\ell(\beta) into the Leading Cumulants Formula and using (-1)^{2g-2}=1, we obtain the following.

Theorem 12.4 (Optimized Leading Cumulants Formula). For each 1 \leq d \leq N, we have

L_N^d =N^{2-d} \sum\limits_{\alpha,\beta \vdash d} \frac{p_\beta(b_{N1},\dots,b_{NN})}{N^{\ell(\alpha)+\ell(\beta)}}(-1)^{\ell(\alpha)+\ell(\beta)}\sum\limits_{g=0}^\infty N^{-2g}\vec{H}_g(\alpha,\beta).
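
As a consistency check, the d=2 case of this formula can be summed in closed form: on S(2) the only transposition is (1\,2), so each admissible triple (g,\alpha,\beta) contributes exactly one monotone walk, which is automatically transitive, giving \vec{H}_g(\alpha,\beta)=1 throughout. Summing the resulting geometric series in N^{-2} yields

L_N^2 = \frac{Np_2(B_N)-p_1(B_N)^2}{N^2(N+1)},

which can be tested against a Monte Carlo estimate of the variance (i.e. the second cumulant) of X_N(1,1):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 40
b = np.sort(rng.standard_normal(N))[::-1]  # arbitrary data B_N for the test
p1, p2 = b.sum(), (b ** 2).sum()
predicted = (N * p2 - p1 ** 2) / (N ** 2 * (N + 1))  # L_N^2 from the d = 2 computation

# Monte Carlo: X_N(1,1) = <|u|^2, b> with u a uniform unit vector in C^N.
samples = 100_000
z = rng.standard_normal((samples, N)) + 1j * rng.standard_normal((samples, N))
w = np.abs(z) ** 2
X = (w / w.sum(axis=1, keepdims=True)) @ b

print(predicted, X.var())  # the two numbers should agree up to sampling error
```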

Now we are in a position to prove some interesting limit theorems. We begin with one such theorem, corresponding to an explicit choice of the dataset b_{ij} (Remark 4.5 in Olshanski-Vershik).

Let \gamma >0 be a positive constant, and suppose our dataset b_{ij} has Nth row given by

B_N=(\sqrt{\gamma N},\dots,\sqrt{\gamma N},-\sqrt{\gamma N},\dots,-\sqrt{\gamma N}),

where the numbers of positive and negative entries are as near to equal as possible: equal if N is even, and differing by 1 if N is odd, in which case we give the positive value the majority. In this situation we can establish the following N \to \infty limit theorem for a single diagonal matrix element of X_N=U_NB_NU_N^*.

Theorem 12.5 (Gaussian Limit Theorem). Any diagonal matrix element of X_N=U_NB_NU_N^* converges in distribution to a centered Gaussian random variable Y of variance \gamma.

Proof: Let \beta \vdash d be a partition of d \in \mathbb{N}. Then, we have

p_\beta(B_N) = (\gamma N)^{\frac{d}{2}}p_\beta(1,\dots,1,-1,\dots,-1),

and since each power sum p_k(1,\dots,1,-1,\dots,-1) is at most N in absolute value,

p_\beta(B_N) = O(N^{\frac{d}{2}+\ell(\beta)}).

Thus, from the Optimized Leading Cumulants Formula, in which p_\beta(B_N) is divided by N^{\ell(\alpha)+\ell(\beta)} with \ell(\alpha) \geq 1, we have

L_N^d=O(N^{2-d+\frac{d}{2}-1})=O(N^{1-\frac{d}{2}}),

showing that L_N^d converges to zero for all d>2. This means that X_N(1,1) converges in cumulants (equivalently, in moments) to a Gaussian random variable Y, whose mean and variance remain to be determined.

The first cumulant of X_N(1,1) is

L_N^1 =N^{2-1} \frac{p_1(B_N)}{N^2}=\frac{p_1(B_N)}{N},

which equals \sqrt{\gamma}N^{-\frac{1}{2}} when N is odd and 0 when N is even; in either case it converges to 0 as N \to \infty. For the second cumulant,

L_N^2 =\frac{p_2(B_N)}{N^2} + O(\frac{1}{N}),

and

\frac{p_2(B_N)}{N^2} = \frac{N(\gamma N)}{N^2}=\gamma.

Thus, we have shown that X_N(1,1) converges in moments/cumulants to Y \sim \mathcal{N}(0,\gamma). Since the Gaussian distribution is determined by its moments, the Method of Moments upgrades this to convergence in distribution, X_N(1,1) \to Y as N \to \infty, and by exchangeability the same holds for every diagonal matrix element.

Q.E.D.
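
Theorem 12.5 is easy to test numerically. A minimal Monte Carlo sketch (again sampling the first row of U_N as a normalized complex Gaussian vector, and taking \gamma = 2 as an arbitrary illustrative value): the sample mean and variance of X_N(1,1) should approach 0 and \gamma.

```python
import numpy as np

rng = np.random.default_rng(3)
gamma = 2.0  # illustrative choice of the constant gamma > 0

def B(N, gamma):
    """The row B_N: entries +/- sqrt(gamma * N), split as evenly as possible,
    with the extra entry positive when N is odd."""
    n_plus = (N + 1) // 2
    return np.sqrt(gamma * N) * np.concatenate([np.ones(n_plus), -np.ones(N - n_plus)])

samples = 10_000
for N in [10, 100, 1000]:
    z = rng.standard_normal((samples, N)) + 1j * rng.standard_normal((samples, N))
    w = np.abs(z) ** 2
    X = (w / w.sum(axis=1, keepdims=True)) @ B(N, gamma)  # samples of X_N(1,1)
    print(N, X.mean(), X.var())  # should approach 0 and gamma
```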
