Math 289A: Lecture 9

This week we are looking at a landmark paper by Grigori Olshanski and Anatoly Vershik, “Ergodic Unitarily Invariant Measures on the Space of Infinite Hermitian Matrices.” This paper is among the first to consider characteristic functions in the context of random matrix theory. Namely, it considers the problem of approximating the function

K_N \colon \mathbb{R}^N \times \mathbb{R}^N \longrightarrow \mathbb{C}

defined by

K_N(A,B) = \int\limits_{\mathbb{U}_N} e^{i\mathrm{Tr}AUBU^*}\mathrm{d}U,

where the integration is over the unitary group of rank N against Haar measure. We have referred to this function as the “quantum Fourier kernel,” but as we have stressed it is also the classical characteristic function of the diagonal (X_N(1,1),\dots,X_N(N,N)) of X_N=UBU^*, a uniformly random Hermitian matrix with deterministic eigenvalues B \in \mathbb{R}^N. The paper of Olshanski-Vershik is about approximating K_N(A,B) as N \to \infty, which probabilistically means that we are trying to understand how the joint distribution of the diagonal matrix elements of a very large uniformly random Hermitian matrix with prescribed eigenvalues B is determined by the vector B. This is an example of a high-dimensional problem: our goal is to approximate a function

K_N(A,B) = K_N(a_1,\dots,a_N;b_1,\dots,b_N)=\int\limits_{\mathbb{U}_N} e^{i\sum_{x,y=1}^N a_xb_y|U_{xy}|^2}\mathrm{d}U

of 2N real variables as the number of variables goes to infinity. Typically such problems are much harder than traditional problems of asymptotic analysis, which consider the approximation of some parameter-dependent family of functions defined on a fixed domain as the parameter approaches a given value.
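To make the object concrete, here is a minimal Monte Carlo sketch of K_N(A,B): we average e^{i Tr(AUBU^*)} over Haar-random unitaries U, sampled via the standard QR trick (Gaussian matrix, QR factorization, phase correction). The helper names are mine, and this is an illustration of the definition, not part of the lecture's method.

```python
# Monte Carlo sketch of the quantum Fourier kernel K_N(A, B).
# Assumption: the QR-with-phase-correction recipe below samples Haar measure
# on U(N); all function names here are my own.
import numpy as np

def haar_unitary(n, rng):
    """Sample U from Haar measure on U(n): QR of a complex Gaussian matrix,
    with each column of Q rescaled by the phase of the diagonal of R."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))

def kernel_estimate(a, b, samples=2000, seed=0):
    """Estimate K_N(A, B) = E[exp(i Tr diag(a) U diag(b) U*)]."""
    rng = np.random.default_rng(seed)
    A, B = np.diag(a).astype(complex), np.diag(b).astype(complex)
    total = 0.0 + 0.0j
    for _ in range(samples):
        U = haar_unitary(len(a), rng)
        total += np.exp(1j * np.trace(A @ U @ B @ U.conj().T))
    return total / samples

a = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, -1.0, 2.0])
est = kernel_estimate(a, b)
```

Since the integrand has modulus one, any such average has modulus at most 1, and A = 0 gives exactly 1; these are the sanity checks worth running on the sampler.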

One way to simplify the question is to consider the restriction of K_N to the subspace \mathbb{R}^r \times \mathbb{R}^N with r \in \mathbb{N} fixed. That is, we consider the characteristic function

K_N(a_1,\dots,a_r,0,\dots,0;b_1,\dots,b_N) = \mathbb{E}\left[e^{i(a_1X_N(1,1) + \dots +a_r X_N(r,r))}\right]

of only the first r diagonal matrix elements of the random Hermitian matrix X_N=U_NBU_N^*. Recall that we have previously shown that all diagonal matrix elements of X_N are identically distributed, but not necessarily independent. The paper of Olshanski and Vershik shows that if the eigenvalues B=(b_1,\dots,b_N), scaled by a factor of N^{-1}, converge to a point configuration on \mathbb{R} as N \to \infty, then the random variables X_N(1,1),\dots,X_N(r,r) converge to independent random variables whose characteristic function can be calculated explicitly in terms of this point configuration. We will try to work through this result, which I believe to be the first high-dimensional limit theorem in RMT obtained via characteristic functions.

The Olshanski-Vershik approach of computing the asymptotics of the characteristic function

K_N(A,B) = \int\limits_{\mathbb{U}_N}e^{i\mathrm{Tr}AUBU^*}\mathrm{d}U

is based on series expansion of this matrix integral. I am a big proponent of this approach, but it does require some quite extensive background from representation theory and algebraic combinatorics which is not traditionally found in the realm of probability. We will try not to get too bogged down in the algebra, and in order to do this it will be necessary to accept some results from representation theory as black boxes.

First, let us analytically continue the characteristic function K_N \colon \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{C} to a function

K_N \colon \mathbb{C} \times \mathbb{C}^N \times \mathbb{C}^N \longrightarrow \mathbb{C}

by declaring

K_N(z,A,B) = \int\limits_{\mathbb{U}_N} e^{z \mathrm{Tr} AUBU^*} \mathrm{d}U,

where z \in \mathbb{C} is a complex number and A,B \in \mathbb{C}^N are complex vectors. Thus, by compactness of \mathbb{U}_N, we have that K_N is an entire function of 1+2N complex variables z,a_1,\dots,a_N,b_1,\dots,b_N, and the Maclaurin series of this function, which is a multivariate Taylor series centered at the origin, converges absolutely and uniformly on compact subsets of \mathbb{C}^{1+2N}. We recover our original characteristic function K_N by setting z=i and restricting A,B to \mathbb{R}^N \subset \mathbb{C}^N.

The analytic function K_N has various symmetries that allow us to present its Maclaurin series in a more advantageous form. More precisely, K_N(z,A,B)=K_N(z,a_1,\dots,a_N,b_1,\dots,b_N) is invariant under independent permutations of a_1,\dots,a_N and b_1,\dots,b_N, and also invariant under swapping these two sets of variables. This means that the Maclaurin series of K_N can be presented in the form

K_N(z,A,B) = 1 + \sum_{d=1}^\infty \frac{z^d}{d!} K_N^d(A,B),

where K_N^d(A,B) = K_N^d(a_1,\dots,a_N,b_1,\dots,b_N) is a homogeneous degree d symmetric polynomial in a_1,\dots,a_N and also a homogeneous degree d symmetric polynomial in b_1,\dots,b_N. Therefore, by Newton’s theorem on symmetric polynomials, we can write this polynomial as

K_N^d(A,B) = \sum\limits_{\alpha,\beta \in \mathsf{Y}^d} p_\alpha(a_1,\dots,a_N)p_\beta(b_1,\dots,b_N)K_N(\alpha,\beta),

where \mathsf{Y}^d is the set of integer partitions of the degree d, which we identify with the set of Young diagrams with d cells, and p_\alpha(a_1,\dots,a_N) and p_\beta(b_1,\dots,b_N) are the Newton power-sum symmetric polynomials corresponding to \alpha,\beta \in \mathsf{Y}^d. For example, if d=3 and \alpha=(1,1,1), \beta=(2,1), then

p_\alpha(a_1,\dots,a_N)p_\beta(b_1,\dots,b_N)= (a_1+\dots+a_N)^3(b_1^2+\dots+b_N^2)(b_1+\dots+b_N).
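The power-sum products are easy to compute directly; the following short sketch (helper names mine, not from the text) checks the worked example \alpha=(1,1,1), \beta=(2,1) above at a sample point.

```python
# Newton power sums p_k and their products p_alpha over the rows of a
# partition, checked against the example alpha = (1,1,1), beta = (2,1).
def p(k, t):
    """Power sum p_k(t_1, ..., t_N) = t_1^k + ... + t_N^k."""
    return sum(x ** k for x in t)

def p_partition(alpha, t):
    """p_alpha = product of p_{alpha_i} over the rows of the partition alpha."""
    out = 1
    for k in alpha:
        out *= p(k, t)
    return out

a = [1, 2, 3]
b = [2, 1, 1]
lhs = p_partition((1, 1, 1), a) * p_partition((2, 1), b)
rhs = (sum(a) ** 3) * sum(x ** 2 for x in b) * sum(b)
```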

Our objective then is to calculate the numerical coefficients K_N(\alpha,\beta) of the polynomial K_N^d(A,B) which we will refer to as its Newton coefficients.

Problem 9.1. Show that K_N^d(A,B)=K_N^d(B,A), and use this to show that K_N(\alpha,\beta)=K_N(\beta,\alpha).

What is good about the Newton polynomials is that they connect us to point configurations, which is what we want. More precisely, if B=(b_1,\dots,b_N) \in \mathbb{R}^N is a real vector, then we can view it not as a point in N-dimensional space but as a configuration of particles b_1,\dots,b_N on the real line. The empirical distribution \eta_B of B is the discrete measure on \mathbb{R} which places unit mass at each particle, and the degree d moment of this measure is

\int\limits_{\mathbb{R}} x^d \eta_B(\mathrm{d}x) = b_1^d + \dots +b_N^d=p_d(b_1,\dots,b_N).

This means that writing the Maclaurin expansion of K_N(z,A,B) in terms of Newton symmetric polynomials is exactly what we want to do, because it is expressing this function in terms of the empirical distributions of its arguments A,B, and the empirical distribution of a point configuration is a natural way to describe the “shape” of the configuration in the infinite particle limit where N \to \infty.

Unfortunately, there is no obvious way to calculate the Newton coefficients K_N(\alpha,\beta). What we can do is compute the Maclaurin series of K_N(z,A,B) = K_N(z,a_1,\dots,a_N,b_1,\dots,b_N) by repeatedly differentiating under the integral sign with respect to z. Equivalently, we just expand the exponential function of z in the integrand and then integrate term-by-term, which gives

K_N(z,A,B) = 1 + \sum\limits_{d=1}^\infty \frac{z^d}{d!} \int\limits_{\mathbb{U}_N} \left(\mathrm{Tr} AUBU^* \right)^d \mathrm{d}U.

This gives us the formula

K_N^d(A,B) = \int\limits_{\mathbb{U}_N} \left(\mathrm{Tr} AUBU^* \right)^d \mathrm{d}U,

which at least reduces our initial problem of computing an exponential integral over \mathbb{U}_N to computing a polynomial integral over \mathbb{U}_N,

K_N^d(A,B) = \int\limits_{\mathbb{U}_N} \left(\sum\limits_{x,y=1}^N a_xb_y U_{xy} \bar{U}_{xy} \right)^d \mathrm{d}U,

but still there is no obvious path forward.

At this point we should ask ourselves what integrals over \mathbb{U}_N we actually can compute. To make the discussion as simple as possible, take N=1, so \mathbb{U}_1 is just the unit circle in \mathbb{C}. One thing we can confidently state is the orthogonality relation

\int\limits_{\mathbb{U}_1} U^k \bar{U}^l \mathrm{d}U = \int\limits_0^{2\pi} e^{i(k-l)\theta} \frac{\mathrm{d}\theta}{2\pi} = \delta_{kl}.
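This orthogonality relation can be checked numerically with a Riemann sum over a uniform grid on [0, 2\pi): for integers with |k-l| < n the grid sum is a geometric series of roots of unity and vanishes exactly (up to float error), while k = l gives 1. A minimal sketch, with names of my own choosing:

```python
# Riemann-sum check of the U(1) orthogonality relation
#   (1/2pi) * integral of U^k conj(U)^l over the circle = delta_{kl}.
import cmath, math

def circle_moment(k, l, n=512):
    """Approximate the circle integral of e^{i(k-l)theta} d(theta)/2pi
    by an n-point uniform Riemann sum."""
    return sum(cmath.exp(1j * (k - l) * 2 * math.pi * j / n) for j in range(n)) / n
```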

Representation theory gives us analogous formulas for all N, namely a system of multivariate polynomials which are orthogonal polynomials for integration with respect to Haar measure on \mathbb{U}_N. Let \mathsf{Y}_N be the set of all Young diagrams with at most N rows.

Definition 9.1. For each \lambda \in \mathsf{Y}_N, the corresponding Schur symmetric polynomial is defined by

s_\lambda(t_1,\dots,t_N) = \frac{\det [t_j^{N-i+\lambda_i}]}{\det[t_j^{N-i}]},

where \lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_N \geq 0 are the row lengths of \lambda.

Out of the many equivalent definitions of Schur polynomials, we are taking this one because it is a fairly explicit and elementary formula. Indeed, the determinant in the denominator is the Vandermonde determinant, which has the property that it divides any other antisymmetric polynomial in t_1,\dots,t_N to yield a symmetric polynomial.

Problem 9.2. Let \lambda=\Box be the single-cell Young diagram. Compute the Schur polynomial s_\Box(t_1,\dots,t_N) directly from Definition 9.1. Do the same for a few more small Young diagrams.
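Definition 9.1 can be evaluated directly at integer points using exact rational arithmetic, which is a useful way to experiment with Problem 9.2. The following sketch (helper names mine) computes the two determinants and their ratio; note it requires the t_j to be distinct so the Vandermonde denominator is nonzero.

```python
# Evaluate s_lambda(t_1,...,t_N) by the bialternant formula of
# Definition 9.1, using Fractions for exact arithmetic.
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def schur(lam, t):
    """s_lambda(t) = det[t_j^{N-i+lambda_i}] / det[t_j^{N-i}], 1 <= i <= N."""
    n = len(t)
    lam = list(lam) + [0] * (n - len(lam))  # pad row lengths with zeros
    num = [[Fraction(t[j]) ** (n - 1 - i + lam[i]) for j in range(n)]
           for i in range(n)]
    den = [[Fraction(t[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)
```

For the single-cell diagram this recovers s_\Box(t_1,\dots,t_N) = t_1 + \dots + t_N, e.g. s_\Box(2,3,5) = 10, and for two cells s_{(2)}(t_1,t_2) = t_1^2 + t_1t_2 + t_2^2 and s_{(1,1)}(t_1,t_2) = t_1t_2.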

The disadvantage of Definition 9.1 is that it makes the orthogonality relations for Schur polynomials, which are really consequences of the fact that these polynomials are characters of irreducible representations of \mathbb{U}_N, into a theorem whose proof would take us too far afield. So we will have to treat this property as a black box. Given a matrix A \in \mathbb{C}^{N \times N}, let us write s_\lambda(A) for the evaluation s_\lambda(a_1,\dots,a_N) of the Schur polynomial s_\lambda(t_1,\dots,t_N) on the eigenvalues of A.

Theorem 9.1 (Schur orthogonality). For any two Young diagrams \lambda,\mu \in \mathsf{Y}_N and any two matrices A,B \in \mathbb{C}^{N \times N}, we have

\int\limits_{\mathbb{U}_N} s_\lambda(AUBU^*) \mathrm{d}U = \frac{s_\lambda(A)s_\lambda(B)}{\dim \mathbf{W}_N^\lambda}

and

\int\limits_{\mathbb{U}_N} s_\lambda(AU)s_\mu(BU^*) \mathrm{d}U =\delta_{\lambda\mu} \frac{s_\lambda(AB)}{\dim \mathbf{W}_N^\lambda},

where \dim \mathbf{W}_N^\lambda is the number of semistandard Young tableaux of shape \lambda with entries from \{1,\dots,N\}.
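The count \dim \mathbf{W}_N^\lambda is small enough to verify by brute force for small shapes. The sketch below (my own check, not from the lecture) enumerates fillings that weakly increase along rows and strictly increase down columns, and compares with the standard content/hook product formula \dim \mathbf{W}_N^\lambda = \prod_{(i,j)} (N+j-i)/h(i,j), which I am assuming here as a known fact.

```python
# Count semistandard Young tableaux of shape lam with entries in {1,...,N}
# two ways: brute-force enumeration vs. the content/hook product formula.
from itertools import product as iproduct
from math import prod

def hook(lam, i, j):
    """Hook length of cell (i, j) (0-indexed) in the diagram lam."""
    arm = lam[i] - j - 1
    leg = sum(1 for r in range(i + 1, len(lam)) if lam[r] > j)
    return arm + leg + 1

def dim_W_formula(lam, N):
    cells = [(i, j) for i in range(len(lam)) for j in range(lam[i])]
    num = prod(N + j - i for i, j in cells)
    den = prod(hook(lam, i, j) for i, j in cells)
    return num // den

def dim_W_bruteforce(lam, N):
    """Fillings weakly increasing along rows, strictly increasing down columns."""
    cells = [(i, j) for i in range(len(lam)) for j in range(lam[i])]
    count = 0
    for vals in iproduct(range(1, N + 1), repeat=len(cells)):
        T = dict(zip(cells, vals))
        rows_ok = all(T[(i, j)] <= T[(i, j + 1)]
                      for i, j in cells if (i, j + 1) in T)
        cols_ok = all(T[(i, j)] < T[(i + 1, j)]
                      for i, j in cells if (i + 1, j) in T)
        count += rows_ok and cols_ok
    return count
```

For example, \lambda = (2,1) with N = 3 gives 8, the dimension of a familiar irreducible representation of \mathbb{U}_3.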

We also need a remarkable “multinomial formula” due to Frobenius. Let \mathsf{Y}_N^d= \mathsf{Y}^d \cap \mathsf{Y}_N be the set of Young diagrams with exactly d cells and at most N rows.

Theorem 9.2 (Frobenius formula). We have

(t_1 + \dots + t_N)^d = \sum\limits_{\lambda \in \mathsf{Y}_N^d} (\dim \mathbf{V}^\lambda) s_\lambda(t_1,\dots,t_N),

where \dim \mathbf{V}^\lambda is the number of standard Young tableaux of shape \lambda.
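The Frobenius formula can be verified numerically for small d. In the sketch below (all helper names mine), \dim \mathbf{V}^\lambda is computed by the hook length formula d!/\prod h(i,j), which I am assuming as the standard count of standard Young tableaux, and s_\lambda is evaluated by the bialternant formula of Definition 9.1 at a point with distinct coordinates.

```python
# Check (t_1 + ... + t_N)^d = sum over lam in Y_N^d of dim(V^lam) * s_lam(t)
# at a sample point, for small d.
from fractions import Fraction
from math import factorial, prod

def det(m):
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def schur(lam, t):
    n = len(t)
    lam = list(lam) + [0] * (n - len(lam))
    num = [[Fraction(t[j]) ** (n - 1 - i + lam[i]) for j in range(n)]
           for i in range(n)]
    den = [[Fraction(t[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)

def hook(lam, i, j):
    return (lam[i] - j - 1) + sum(1 for r in range(i + 1, len(lam)) if lam[r] > j) + 1

def dim_V(lam):
    """Number of standard Young tableaux of shape lam (hook length formula)."""
    return factorial(sum(lam)) // prod(hook(lam, i, j)
                                       for i in range(len(lam)) for j in range(lam[i]))

def partitions(d, max_rows, largest=None):
    """Partitions of d with at most max_rows rows, i.e. diagrams in Y_N^d."""
    largest = d if largest is None else largest
    if d == 0:
        yield ()
        return
    if max_rows == 0:
        return
    for first in range(min(d, largest), 0, -1):
        for rest in partitions(d - first, max_rows - 1, first):
            yield (first,) + rest

t = (1, 2, 3)
d = 3
lhs = sum(t) ** d
rhs = sum(dim_V(lam) * schur(lam, t) for lam in partitions(d, len(t)))
```

For d = 3, N = 3 this amounts to 216 = 1·s_{(3)} + 2·s_{(2,1)} + 1·s_{(1,1,1)} evaluated at (1,2,3).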

The reason the tableaux counts \dim \mathbf{V}^\lambda and \dim \mathbf{W}_N^\lambda are denoted this way is that the former is the dimension of an irreducible representation of the symmetric group \mathbb{S}^d and the latter is the dimension of an irreducible representation of the unitary group \mathbb{U}_N. At this point we have enough material to explicitly calculate a series expansion of K_N(z,A,B). This expansion is the starting point of Olshanski and Vershik’s asymptotic analysis, and is stated as Theorem 5.1 in their paper.

Theorem 9.3 (Schur expansion of K_N). We have

K_N(z,A,B) = \int\limits_{\mathbb{U}_N} e^{z \mathrm{Tr} AUBU^*} \mathrm{d}U=1+ \sum\limits_{d=1}^\infty \frac{z^d}{d!} \sum\limits_{\lambda \in \mathsf{Y}_N^d} s_\lambda(A)s_\lambda(B) \frac{\dim \mathbf{V}^\lambda}{\dim \mathbf{W}_N^\lambda}.

Proof: From Theorem 9.2, we have

K_N^d(A,B) = \int\limits_{\mathbb{U}_N} (\mathrm{Tr} AUBU^*)^d \mathrm{d}U=\sum\limits_{\lambda \in \mathsf{Y}_N^d} (\dim \mathbf{V}^\lambda) \int\limits_{\mathbb{U}_N} s_\lambda(AUBU^*) \mathrm{d}U,

and applying the first integration formula in Theorem 9.1 this becomes

K_N^d(A,B) =  \sum\limits_{\lambda \in \mathsf{Y}_N^d} s_\lambda(A)s_\lambda(B) \frac{\dim \mathbf{V}^\lambda}{\dim \mathbf{W}_N^\lambda}.

– QED

Problem 9.3. Compute the Schur expansion of K_N up to O(z^4). This is straightforward but will give you a feel for the structure of the formula.

Let us include the analogue of Theorem 9.3 for the quantum Bessel kernel

K_{MN}(A,B) = \int\limits_{\mathbb{U}_{MN}} e^{i \mathrm{Tr}(A^*UBV^* + VB^*U^*A)} \mathrm{d}(U,V),

where M \geq N and the integration is against Haar measure on the product group \mathbb{U}_{MN}= \mathbb{U}_M \times \mathbb{U}_N. Here A,B \in \mathbb{R}^N are real vectors viewed as diagonal matrices in \mathbb{C}^{M \times N}. Since K_{MN}(A,B) is invariant under signed permutations of the coordinates of A=(a_1,\dots,a_N) and B=(b_1,\dots,b_N), we can assume without loss of generality that these are nonincreasing lists of nonnegative numbers. The quantum Bessel kernel is the characteristic function of a uniformly random M \times N rectangular matrix with singular values B=(b_1,\dots,b_N). Once again we can analytically continue this to

K_{MN}(z,A,B) = \int\limits_{\mathbb{U}_{MN}} e^{z \mathrm{Tr}(A^*UBV^* + VB^*U^*A)} \mathrm{d}(U,V),

where z \in \mathbb{C} and A,B \in \mathbb{C}^N are viewed as complex diagonal M \times N matrices in the integral. The following expansion of K_{MN} is not in the Olshanski-Vershik paper, but it can be obtained in essentially the same way; a more general result is proved here.

Theorem 9.4 (Schur expansion of K_{MN}). We have

K_{MN}(z,A,B) = \int\limits_{\mathbb{U}_{MN}} e^{z \mathrm{Tr}(A^*UBV^* + VB^*U^*A)} \mathrm{d}(U,V)=1+ \sum\limits_{d=1}^\infty \frac{z^{2d}}{d!d!} \sum\limits_{\lambda \in \mathsf{Y}_N^d} s_\lambda(A^*A)s_\lambda(B^*B) \frac{\dim \mathbf{V}^\lambda}{\dim \mathbf{W}_M^\lambda}\frac{\dim \mathbf{V}^\lambda}{\dim \mathbf{W}_N^\lambda}.
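As with the quantum Fourier kernel, the quantum Bessel kernel can be sampled numerically; the sketch below (my own illustration, with the same hedges as before about the Haar sampler) averages the integrand over independent Haar unitaries U \in \mathbb{U}_M and V \in \mathbb{U}_N. Since the trace argument is real, the integrand has modulus one, so the estimate has modulus at most 1, and A = 0 gives exactly 1.

```python
# Monte Carlo sketch of the quantum Bessel kernel K_{MN}(A, B) at z = i.
# Assumption: the QR-with-phase-correction recipe samples Haar measure.
import numpy as np

def haar_unitary(n, rng):
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))

def rect_diag(v, m, n):
    """Embed the vector v as a diagonal rectangular m x n matrix."""
    D = np.zeros((m, n), dtype=complex)
    np.fill_diagonal(D, v)
    return D

def bessel_estimate(a, b, m, samples=1000, seed=0):
    """Estimate K_{MN}(A, B) = E[exp(i Tr(A* U B V* + V B* U* A))]."""
    rng = np.random.default_rng(seed)
    n = len(a)
    A, B = rect_diag(a, m, n), rect_diag(b, m, n)
    total = 0.0 + 0.0j
    for _ in range(samples):
        U, V = haar_unitary(m, rng), haar_unitary(n, rng)
        M_ = A.conj().T @ U @ B @ V.conj().T   # N x N matrix
        total += np.exp(1j * np.trace(M_ + M_.conj().T))
    return total / samples

a = np.array([0.7, 1.3])
b = np.array([1.0, 2.0])
est = bessel_estimate(a, b, m=3)
```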

We close this lecture with a pointer to interesting work in the engineering literature which considers exactly these integrals in an applied context.
