Math 289A: Formalism of Schur-Horn Arrays

The right way to set up the classical limit theorems of probability theory is to start with a triangular array of real random variables

\begin{matrix} X_{11} & {} & {} \\ X_{21} & X_{22} & {} \\ X_{31} & X_{32} & X_{33} \\ \vdots & \vdots & \vdots\end{matrix}.

Your first exposure to the law of large numbers and central limit theorem likely involved a different setup, namely an infinite sequence

X_1,X_2,X_3,\dots

of iid random variables, whose partial sums

S_N=X_1 + \dots + X_N

are the main object of interest. There are some non-negligible advantages to the triangular array setup, for example the fact that the rows do not have to be defined on a common underlying probability space. The N \to \infty behavior of the row sums

S_N = X_{N1}+ \dots + X_{NN}

is the topic of slightly more advanced and flexible versions of the central limit theorem. This result and other classical limit theorems apply to the situation when the random variables in each row of the triangular array are independent or nearly independent.

We are interested in triangular arrays of random variables as above which arise from an underlying deterministic data set

\begin{matrix} B_{11} & {} & {} \\ B_{21} & B_{22} & {} \\ B_{31} & B_{32} & B_{33} \\ \vdots & \vdots & \vdots\end{matrix}.

The triangular array of random variables associated to this data has rows defined by

\begin{bmatrix} X_{N1} \\ \vdots \\ X_{NN} \end{bmatrix} = \begin{bmatrix} {} & \vdots & {}\\ \dots & |U_{ij}|^2 & \dots \\ {} & \vdots & {}\end{bmatrix}\begin{bmatrix} B_{N1} \\ \vdots \\ B_{NN} \end{bmatrix}

where U_N=[U_{ij}] is an N \times N uniformly random (Haar-distributed) unitary matrix. Thus, the random vector (X_{N1},\dots,X_{NN}) is a random linear transformation of the deterministic data (B_{N1},\dots,B_{NN}). Because the entries of U_N are highly correlated (its rows are orthonormal, as are its columns), the random variables X_{N1},\dots,X_{NN} are far from independent. They are, however, exchangeable.

There are two ways to think about a Schur-Horn array. Geometrically, (X_{N1},\dots,X_{NN}) is a random point inside the permutahedron in \mathbb{R}^N generated by (B_{N1},\dots,B_{NN}). Algebraically, (X_{N1},\dots,X_{NN}) is the diagonal of the random matrix U_NB_NU_N^*, where B_N=\mathrm{diag}(B_{N1},\dots,B_{NN}).
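The two descriptions are easy to check numerically. The following sketch (with a hypothetical choice of data B_{Nj}=j and a small N) samples a Haar-distributed unitary via the QR factorization of a complex Ginibre matrix, then confirms that the diagonal of U_NB_NU_N^* agrees with the doubly stochastic mix [|U_{ij}|^2]B and that the total sum of the data is preserved.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
B = np.arange(1.0, N + 1)  # hypothetical data B_{N1},...,B_{NN}

# Sample a Haar-distributed unitary: QR of a complex Ginibre matrix,
# with the phases of R's diagonal absorbed into Q to make U exactly Haar.
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, R = np.linalg.qr(Z)
d = np.diag(R)
U = Q * (d / np.abs(d))

# Algebraic description: X is the diagonal of U B U*.
X_alg = np.real(np.diag(U @ np.diag(B) @ U.conj().T))

# Equivalent description: X = [|U_ij|^2] B, a doubly stochastic mix of the data.
X_mix = (np.abs(U) ** 2) @ B

assert np.allclose(X_alg, X_mix)
assert np.isclose(X_alg.sum(), B.sum())  # the trace of U B U* equals the trace of B
```

The second assertion reflects the geometric picture: a doubly stochastic matrix preserves the coordinate sum, so the random point stays inside the permutahedron generated by the data.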

We are interested in proving limit theorems for Schur-Horn arrays. Such theorems will describe N \to \infty behavior of the rows of the X-array in terms of prescribed N \to \infty behavior of the rows of the B-array. In particular, we are interested in the situation where the empirical distribution of the vector

\left(\frac{B_{N1}}{N^t}, \dots, \frac{B_{NN}}{N^t}\right)

converges in moments as N \to \infty, where t \geq 0 is a scaling exponent. The empirical distribution is the probability measure on \mathbb{R} which places mass 1/N at each coordinate of the vector (B_{N1},\dots,B_{NN}). If we think of these data points as beads on a wire, then the empirical measure of S \subseteq \mathbb{R} is the fraction of beads which lie in S. Convergence in moments of the empirical measure means that the limit

\gamma_d = \lim\limits_{N \to \infty} \frac{p_d\left(\frac{B_{N1}}{N^t}, \dots, \frac{B_{NN}}{N^t}\right)}{N}

exists for each fixed d \in \mathbb{N}, where p_d(z_1,\dots,z_N)=z_1^d + \dots + z_N^d is the power sum polynomial of degree d in N variables.
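For a concrete instance of convergence in moments, take the hypothetical data B_{Nj}=j with scaling exponent t=1: the scaled data points fill out [0,1], so the empirical distribution converges to the uniform measure on [0,1], whose d-th moment is 1/(d+1). A short numerical check:

```python
import numpy as np

def gamma_d(N, d, t=1.0):
    """Empirical d-th moment p_d(B_{N1}/N^t, ..., B_{NN}/N^t) / N
    for the hypothetical data B_{Nj} = j."""
    B = np.arange(1.0, N + 1)
    scaled = B / N ** t
    return np.sum(scaled ** d) / N

# For B_{Nj} = j and t = 1 the limit gamma_d should be 1/(d+1).
for d in range(1, 5):
    print(d, gamma_d(10_000, d), 1 / (d + 1))
```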

Let \langle \cdot \rangle denote expected value, and let

K_N(a_1,\dots,a_N) = \left\langle e^{i(a_1X_{N1}+\dots+a_NX_{NN})}\right\rangle

be the characteristic function of (X_{N1},\dots,X_{NN}). Because the distribution of (X_{N1},\dots,X_{NN}) in \mathbb{R}^N is compactly supported, K_N is an analytic function on \mathbb{R}^N. The Maclaurin expansion of the characteristic function is

K_N(a_1,\dots,a_N) = 1+\sum\limits_{d=1}^\infty \frac{i^d}{d!} K_N^d(a_1,\dots,a_N),

where

K_N^d(a_1,\dots,a_N) = \sum\limits_\alpha a_1^{\alpha_1} \dots a_N^{\alpha_N} \langle X_{N1}^{\alpha_1}\dots X_{NN}^{\alpha_N}\rangle {d \choose \alpha_1,\dots,\alpha_N}

and the sum is over all vectors \alpha=(\alpha_1,\dots,\alpha_N) of nonnegative integers whose coordinates sum to d. Thus, K_N^d(a_1,\dots,a_N) is a homogeneous degree d polynomial in N variables whose coefficients are, up to a transparent factor, the degree d joint moments of the random variables X_{N1},\dots,X_{NN}. Since K_N(0,\dots,0)=1, the logarithm L_N(a_1,\dots,a_N)=\log K_N(a_1,\dots,a_N) of the characteristic function is defined and analytic on an open neighborhood of the origin in \mathbb{R}^N. We write the Maclaurin expansion of L_N as

L_N(a_1,\dots,a_N) = \sum\limits_{d=1}^\infty \frac{i^d}{d!} L_N^d(a_1,\dots,a_N),

where L_N^d(a_1,\dots,a_N) is a homogeneous degree d polynomial in N variables. The polynomials L_N^1,L_N^2,L_N^3,\dots are called the cumulant polynomials of the random variables X_{N1},\dots,X_{NN}. They are determined by the moment polynomials K_N^1,K_N^2,K_N^3,\dots according to a certain recursion explained on pages 2 and 3 of these notes. We write the cumulant polynomials in the form

L_N^d(a_1,\dots,a_N) = \sum\limits_\alpha a_1^{\alpha_1} \dots a_N^{\alpha_N} \langle X_{N1}^{\alpha_1}\dots X_{NN}^{\alpha_N}\rangle_c {d \choose \alpha_1,\dots,\alpha_N},

where the quantities \langle X_{N1}^{\alpha_1}\dots X_{NN}^{\alpha_N}\rangle_c are the degree d joint cumulants of the random variables X_{N1},\dots,X_{NN} as \alpha ranges over nonnegative integer vectors of sum d. Note that this is a definition, not a proposition. In statistical mechanics and quantum field theory, joint cumulants are referred to as connected correlation functions.
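For a single random variable, the moment-cumulant relationship takes the following standard univariate form (a sketch; the multivariate recursion mentioned above specializes to this when N=1). Denoting the moments m_n = \langle X^n \rangle, the recursion is \kappa_n = m_n - \sum_{k=1}^{n-1} \binom{n-1}{k-1} \kappa_k m_{n-k}.

```python
from math import comb

def cumulants_from_moments(m):
    """Univariate moment-to-cumulant recursion:
    kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) * kappa_k * m_{n-k}.
    Here m[0] holds the first moment m_1, m[1] holds m_2, and so on."""
    kappa = []
    for n in range(1, len(m) + 1):
        k_n = m[n - 1] - sum(
            comb(n - 1, k - 1) * kappa[k - 1] * m[n - k - 1]
            for k in range(1, n)
        )
        kappa.append(k_n)
    return kappa

# Standard Gaussian moments 0, 1, 0, 3 give cumulants 0, 1, 0, 0:
# only the variance survives, the hallmark of Gaussianity.
print(cumulants_from_moments([0, 1, 0, 3]))  # → [0, 1, 0, 0]
```

This vanishing of all cumulants beyond degree 2 is exactly the phenomenon that will reappear in the Schur-Horn limit theorems below.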

The main fact about Schur-Horn arrays that allows us to understand them is a second formula for the cumulant polynomials L_N^d(a_1,\dots,a_N) of the random variables X_{N1},\dots,X_{NN}, this one in terms of the moments of the empirical distribution of the deterministic data (B_{N1},\dots,B_{NN}).

Theorem (Cumulant Formula for Schur-Horn Arrays). For each 1 \leq d \leq N, we have

L_N^d(a_1,\dots,a_N) = N^{2-d}\sum\limits_\alpha \frac{p_\alpha(a_1,\dots,a_N)}{N^{\ell(\alpha)}}\sum\limits_\beta \frac{p_\beta(B_{N1},\dots,B_{NN})}{N^{\ell(\beta)}}L_N(\alpha,\beta),

where the sums are over nonnegative integer vectors \alpha=(\alpha_1,\dots,\alpha_N) and \beta=(\beta_1,\dots,\beta_N) with non-increasing coordinates which sum to d, and \ell(\alpha) and \ell(\beta) are the number of nonzero coordinates of \alpha and \beta respectively. For each such pair \alpha,\beta the polynomials p_\alpha(z_1,\dots,z_N) and p_\beta(b_1,\dots,b_N) are the corresponding products of power sum symmetric polynomials, i.e.

p_\alpha = \prod\limits_{i=1}^{\ell(\alpha)} p_{\alpha_i}\qquad\text{and}\qquad p_\beta = \prod\limits_{i=1}^{\ell(\beta)} p_{\beta_i}.

Finally,

L_N(\alpha,\beta) = (-1)^{\ell(\alpha)+\ell(\beta)} \sum\limits_{g=0}^\infty N^{-2g} \vec{H}_g(\alpha,\beta)

is a convergent series in which the coefficients \vec{H}_g(\alpha,\beta) are positive integers called monotone Hurwitz numbers which admit an explicit combinatorial description.

The cumulant formula for Schur-Horn arrays looks complicated, but it is fairly easy to use in practice. For example, let us set our scaling exponent to t=1/2. Then, for any fixed r \in \mathbb{N}, we have

L_N^d(a_1,\dots,a_r,0,\dots,0) = N^{1-\frac{d}{2}}\sum\limits_\alpha \frac{p_\alpha(a_1,\dots,a_r,0,\dots,0)}{N^{\ell(\alpha)-1}}\sum\limits_\beta \frac{p_\beta(\frac{B_{N1}}{\sqrt{N}},\dots,\frac{B_{NN}}{\sqrt{N}})}{N^{\ell(\beta)}}L_N(\alpha,\beta).

As we saw (twice) in lecture, you can calculate the N \to \infty limit of this polynomial. Assuming \gamma_1=0 and \gamma_2=1, the only nonzero limit occurs for d=2, namely

L_N^2(a_1,\dots,a_r,0,\dots,0) \longrightarrow a_1^2 + \dots + a_r^2.

This means that

\lim\limits_{N \to \infty} \langle X_{N1}^{\alpha_1}\dots X_{Nr}^{\alpha_r}\rangle_c = 0

unless \alpha=(\alpha_1,\dots,\alpha_r) is a nonnegative integer vector with total sum 2 and only one nonzero coordinate. From this we conclude that for any fixed r \in \mathbb{N}, the random variables X_{N1},\dots,X_{Nr} converge in joint distribution to r iid standard Gaussians.
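A quick Monte Carlo sanity check of this Gaussian limit, using a hypothetical data set with \gamma_1=0 and \gamma_2=1 at t=1/2: half the B_{Nj} equal +\sqrt{N} and half equal -\sqrt{N}, so that B_{Nj}/\sqrt{N} = \pm 1. Since X_{N1} depends only on the first row of U_N, and that row is a uniform point on the unit sphere in \mathbb{C}^N, we can sample it directly.

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 150, 3000

# Hypothetical data: half +sqrt(N), half -sqrt(N), so gamma_1 = 0, gamma_2 = 1.
B = np.sqrt(N) * np.concatenate([np.ones(N // 2), -np.ones(N // 2)])

def haar_row(rng, N):
    # The first row of a Haar unitary is uniform on the unit sphere in C^N;
    # normalizing a standard complex Gaussian vector produces exactly this.
    z = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    return z / np.linalg.norm(z)

# X_{N1} = sum_j |U_{1j}|^2 B_{Nj}, sampled repeatedly.
samples = np.array([np.sum(np.abs(haar_row(rng, N)) ** 2 * B)
                    for _ in range(trials)])

print(samples.mean(), samples.var())  # should be near 0 and 1
```

The sample mean and variance come out close to 0 and 1, consistent with convergence of X_{N1} to a standard Gaussian.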

When our scaling exponent is t=1, we do not get such a simple Gaussian limit. Rather, we find that

\lim\limits_{N \rightarrow \infty} \frac{1}{N}L_N^d(a_1,\dots,a_r,0,\dots,0) = (a_1^d + \dots + a_r^d) \kappa_d

for each fixed r \in \mathbb{N}, where

\kappa_d = \sum\limits_{\beta \vdash d} \gamma_\beta (-1)^{d+\ell(\beta)} \vec{H}_0(d,\beta),

with \gamma_\beta = \gamma_{\beta_1} \gamma_{\beta_2} \gamma_{\beta_3} \cdots. Thus we still have a sum of pure powers, indicating asymptotic independence of the random variables X_{N1},\dots,X_{Nr}, but now the degree d cumulants of these random variables are asymptotically N \kappa_d. We are going to see that

\kappa_1,\kappa_2,\kappa_3,\dots

is the free cumulant sequence corresponding to the moment sequence

\gamma_1,\gamma_2,\gamma_3,\dots.
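To make "free cumulant sequence" concrete, here is a sketch of the standard free moment-cumulant recursion (summing over non-crossing partitions via first blocks): m_n = \sum_{s=1}^{n} \kappa_s \sum_{i_1+\dots+i_s=n-s} m_{i_1}\cdots m_{i_s} with m_0=1, solved for \kappa_n. The test case is the semicircle distribution, whose even moments are the Catalan numbers and whose only nonzero free cumulant is \kappa_2=1, the free analogue of the Gaussian computation above.

```python
from itertools import product

def free_cumulants(m):
    """Free cumulants from moments via the recursion
    m_n = sum_{s=1}^n kappa_s * sum_{i_1+...+i_s = n-s} m_{i_1}...m_{i_s},
    with i_j >= 0 and m[0] = 1. Returns [kappa_1, ..., kappa_n]."""
    n_max = len(m) - 1
    kappa = [0.0]  # index 0 is an unused placeholder
    for n in range(1, n_max + 1):
        total = 0.0
        for s in range(1, n):
            # Enumerate compositions (i_1, ..., i_s) of n - s into s parts >= 0.
            for comp in product(range(n - s + 1), repeat=s):
                if sum(comp) == n - s:
                    term = kappa[s]
                    for i in comp:
                        term *= m[i]
                    total += term
        kappa.append(m[n] - total)
    return kappa[1:]

# Semicircle moments 1, 0, 1, 0, 2, 0, 5 (Catalan numbers interlaced with 0)
# yield free cumulants 0, 1, 0, 0, 0, 0.
print(free_cumulants([1, 0, 1, 0, 2, 0, 5]))
```

The brute-force enumeration of compositions is exponential in the degree, which is fine for checking a few cumulants but not for serious computation.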
