Math 289A: Lecture 10

We start with a brief recap of what we are trying to achieve – since things are getting complicated it might be helpful to step back for a moment before stepping further in.

Let \mathbb{H}_N be the Euclidean space of N \times N Hermitian matrices, and let \mathbb{U}_N be the compact group of N \times N unitary matrices. We have a group action

\Phi \colon \mathbb{U}_N \longrightarrow \mathrm{Aut}(\mathbb{H}_N),

where \mathrm{Aut}(\mathbb{H}_N) is the orthogonal group of the Euclidean space \mathbb{H}_N, defined by

\Phi(U)X = UXU^*, \quad X \in \mathbb{H}_N,\ U \in \mathbb{U}_N.

This group action is called unitary conjugation. Let

\mathcal{O}_X = \{UXU^* \colon U \in \mathbb{U}_N\}

denote the orbit of a given Hermitian matrix X \in \mathbb{H}_N under unitary conjugation.

Theorem 10.1 (Spectral Theorem). For every X \in \mathbb{H}_N, the orbit \mathcal{O}_X contains a diagonal matrix B which is unique up to permutations of its diagonal elements.
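For readers who like to see things in code, here is a quick numerical illustration of the Spectral Theorem (a sketch assuming NumPy is available; not part of the lecture proper):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random 4x4 Hermitian matrix X = (Z + Z*)/2
Z = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
X = (Z + Z.conj().T) / 2

# eigh returns real eigenvalues (in ascending order) and a unitary matrix
# of eigenvectors, so B = diag(eigvals) is a diagonal point of the orbit O_X
eigvals, U = np.linalg.eigh(X)
B = np.diag(eigvals)

assert np.allclose(U @ B @ U.conj().T, X)      # B lies in the orbit O_X
assert np.allclose(U.conj().T @ U, np.eye(4))  # U is unitary
```

Reversing `eigvals` into descending order gives the unique representative of the orbit in the Weyl chamber defined next.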

Let

\mathbb{W}_N = \{(b_1,\dots,b_N) \in \mathbb{R}^N \colon b_1 \geq \dots \geq b_N\}.

We will refer to this closed convex cone in \mathbb{R}^N as the Weyl chamber. This term comes from Lie theory – we are not doing Lie theory, we just need a name for \mathbb{W}_N. By the Spectral Theorem, the orbits of \mathbb{H}_N under unitary conjugation are indexed by points of the Weyl chamber: we set

\mathcal{O}(B) = \mathcal{O}_B, \quad B \in \mathbb{W}_N,

where on the right hand side we are viewing B as a diagonal matrix.

A standard problem in linear algebra is the following: we are given a matrix X \in \mathbb{H}_N, meaning that we are explicitly given all entries of this matrix, and from this information we are asked to find B \in \mathbb{W}_N such that X \in \mathcal{O}(B). The Spectral Theorem guarantees that this problem has a unique solution. The coordinates of the solution B \in \mathbb{R}^N are the eigenvalues of the given matrix X.

We are interested in solving a generic version of the inverse problem: given B \in \mathbb{W}_N, determine the matrix elements of a uniformly random point X \in \mathcal{O}(B). Since B is given, we are being asked about the random Hermitian matrix X_N=U_NBU_N^*, where U_N is a uniformly random unitary matrix from \mathbb{U}_N. This is something we know how to define precisely (Haar measure).
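Here is a minimal sketch of how one samples such an X_N in practice, assuming NumPy. The QR-based sampler (a complex Ginibre matrix followed by a phase correction) is a standard recipe for Haar measure on \mathbb{U}_N, quoted here without proof:

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_unitary(n, rng):
    """Sample from Haar measure on U(n): QR-factor a complex Ginibre
    matrix and fold the phases of R's diagonal into Q."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))

N = 4
B = np.diag([3.0, 1.0, 0.0, -2.0])  # a point of the Weyl chamber, as a diagonal matrix
U = haar_unitary(N, rng)
X = U @ B @ U.conj().T              # uniformly random point of the orbit O(B)

# X is Hermitian with the prescribed eigenvalues
assert np.allclose(X, X.conj().T)
assert np.allclose(np.sort(np.linalg.eigvalsh(X)), [-2.0, 0.0, 1.0, 3.0])
```

The eigenvalue vector above is an arbitrary choice; any point of \mathbb{W}_N works the same way.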

We want to solve this inverse problem by computing the characteristic function of X_N, which is

\varphi_{X_N}(T) = \mathbf{E}\left[e^{i\langle T,U_NBU_N^*\rangle}\right]=\int\limits_{\mathbb{U}_N}e^{i \mathrm{Tr}TUBU^*} \mathrm{d}U.

We know that it is in fact sufficient to do this for T=A being a diagonal Hermitian matrix, and we have introduced the notation

K_N(A,B) = \int\limits_{\mathbb{U}_N} e^{i \mathrm{Tr} AUBU^*} \mathrm{d}U

in order to emphasize the fact that we can view this characteristic function as defining a bounded symmetric kernel

K_N \colon \mathbb{R}^N \times \mathbb{R}^N \longrightarrow \mathbb{C}.

Furthermore, K_N is actually the characteristic function of the diagonal D_N=(X_N(1,1),\dots,X_N(N,N)) of X_N, so a solution to our problem is the same thing as determining the distribution of the diagonal of a uniformly random Hermitian matrix with prescribed eigenvalues.
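As a sanity check on these definitions, one can estimate K_N(A,B) by Monte Carlo and compare it with a known closed form. The N=2 formula used below is the Harish-Chandra/Itzykson-Zuber determinant formula, imported here as an outside fact (we have not derived it in these notes); the vectors a and b are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_unitary(n, rng):
    # Ginibre + QR with phase correction samples Haar measure on U(n)
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))

a = np.array([1.3, -0.4])
b = np.array([0.7, 2.1])
A, B = np.diag(a), np.diag(b)

# Monte Carlo estimate of K_2(A, B) = E[exp(i Tr A U B U*)]
vals = []
for _ in range(50000):
    U = haar_unitary(2, rng)
    vals.append(np.exp(1j * np.trace(A @ U @ B @ U.conj().T).real))
K_mc = np.mean(vals)

# N = 2 closed form (Harish-Chandra / Itzykson-Zuber; quoted, not derived)
num = np.exp(1j * (a[0]*b[0] + a[1]*b[1])) - np.exp(1j * (a[0]*b[1] + a[1]*b[0]))
K_exact = num / (1j * (a[0] - a[1]) * (b[0] - b[1]))

assert abs(K_mc - K_exact) < 0.05
```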

We can view K_N as a generating function for the joint moments of the diagonal matrix elements X_N(1,1),\dots,X_N(N,N), meaning that

K_N(A,B) = 1 + \sum\limits_{d=1}^\infty \frac{i^d}{d!} \sum\limits_{\gamma \in \mathsf{C}_N^d} a_1^{\gamma_1} \dots a_N^{\gamma_N} \mathbf{E}\left[ X_N^{\gamma_1}(1,1) \dots X_N^{\gamma_N}(N,N)\right] K_N(\gamma),

where the internal sum runs over the set \mathsf{C}_N^d of weak N-part compositions of the number d, which are simply vectors \gamma=(\gamma_1,\dots,\gamma_N) of nonnegative integers whose coordinates sum to d. The coefficient K_N(\gamma) corresponding to a given \gamma \in \mathsf{C}_N^d is a very simple quantity, namely the multinomial coefficient

K_N(\gamma) = \frac{d!}{\gamma_1! \dots \gamma_N!}.
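Since K_N(\gamma) is just a multinomial coefficient, it can be checked against the direct count it encodes: the number of words of length d in which the letter k occurs exactly \gamma_k times. A small sketch (the particular composition is an arbitrary example):

```python
from itertools import product
from math import factorial

def multinomial(gamma):
    """K_N(gamma) = d! / (gamma_1! ... gamma_N!)."""
    out = factorial(sum(gamma))
    for g in gamma:
        out //= factorial(g)
    return out

gamma = (2, 1, 1)          # a weak 3-part composition of d = 4
d, N = sum(gamma), len(gamma)

# Direct count: words of length d over {0,...,N-1} with letter k
# occurring exactly gamma[k] times
count = sum(
    1 for w in product(range(N), repeat=d)
    if all(w.count(k) == gamma[k] for k in range(N))
)
assert multinomial(gamma) == count == 12
```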

But it is Haard (sorry, couldn’t resist) to see how to directly compute all mixed moments of the random variables X_N(1,1),\dots,X_N(N,N). Indeed, we actually have a formula for each of these random variables, namely

X_N(k,k) = \sum\limits_{j=1}^N |U_{jk}|^2 b_j,

which is obtained from X_N=U_NBU_N^* simply by matrix multiplication. This formula more or less shows that a direct computation of, say, the moment sequence of just X_N(1,1) involves computing a family of Haar integrals over \mathbb{U}_N which we don’t know how to contend with.
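The first moment is an exception: since each column of a Haar-distributed unitary matrix is uniform on the unit sphere, we have \mathbf{E}|U_{j1}|^2 = 1/N, so the formula above gives \mathbf{E}[X_N(1,1)] = (b_1+\dots+b_N)/N. This much is easy to confirm numerically (a sketch assuming NumPy; the eigenvalue vector is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_unitary(n, rng):
    # Ginibre + QR with phase correction samples Haar measure on U(n)
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))

N = 4
b = np.array([3.0, 1.0, 0.0, -2.0])

# X_N(1,1) = sum_j |U_{j1}|^2 b_j ; its mean should be the average of b
samples = np.array([
    np.abs(haar_unitary(N, rng)[:, 0])**2 @ b for _ in range(20000)
])
assert abs(samples.mean() - b.mean()) < 0.05
```

Already the second moment requires the joint distribution of (|U_{11}|^2,\dots,|U_{N1}|^2), and higher mixed moments require Haar integrals we have no direct handle on; this is the difficulty described above.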

But K_N is a moment generating function in another sense: we have

K_N(A,B) = 1 + \sum\limits_{d=1}^\infty \frac{i^d}{d!} \sum\limits_{\alpha,\beta \in \mathsf{Y}_N^d} p_\alpha(a_1,\dots,a_N)p_\beta(b_1,\dots,b_N) K_N(\alpha,\beta),

where the internal (double) sum is over the set \mathsf{Y}_N^d of Young diagrams with first row of length at most N; such diagrams are pictorial representations of partitions of d with largest part at most N. We know that such an expansion exists by a “soft” symmetry argument combined with a classical theorem of Newton: we use that K_N(A,B) is invariant under independent permutations of the coordinates of the vectors A=(a_1,\dots,a_N) and B=(b_1,\dots,b_N). The moments involved in this expansion are the moments of the (non-normalized) empirical eigenvalue distributions of these vectors, i.e. the discrete measures \eta_A and \eta_B on \mathbb{R} which place unit mass at each coordinate. The moments of these measures are exactly sums of powers of the coordinates of A and B.
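As a tiny illustration of the Newton ingredient, here is one instance of the fact that every symmetric polynomial is a polynomial in the power sums p_1,\dots,p_N: the elementary symmetric polynomial e_2 satisfies e_2 = (p_1^2 - p_2)/2 (a standard identity, checked numerically; the test vector is arbitrary):

```python
import random

random.seed(1)
a = [random.uniform(-2.0, 2.0) for _ in range(5)]

def p(k, xs):
    """Power sum p_k, i.e. the k-th moment of the (non-normalized)
    empirical measure eta placing unit mass at each coordinate."""
    return sum(x**k for x in xs)

# e_2 computed directly, then via Newton's identity e_2 = (p_1^2 - p_2)/2
e2 = sum(a[i] * a[j] for i in range(5) for j in range(i + 1, 5))
assert abs(e2 - (p(1, a)**2 - p(2, a)) / 2) < 1e-9
```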

This leaves the problem of computing the coefficients K_N(\alpha,\beta) in this nonstandard moment generating function. This can be done using representation theory – in fact one needs to know quite a bit about representations of the unitary and symmetric groups, as well as the relationship between them. We will mostly black box this to avoid going too deeply into algebra (though an indication of what needs to be done was given in the previous lecture, where we discussed Schur polynomials). Instead, we will focus on two facts: first, the coefficients K_N(\alpha,\beta) have an intrinsically interesting combinatorial interpretation, which is something that often happens with moment generating functions of interesting random variables (we discussed the example of Gaussians in lecture); second, this combinatorial interpretation will allow us to prove probabilistically meaningful limit theorems. The simplest such theorem is the result of Olshanski and Vershik on the limiting joint distribution of X_N(1,1),\dots,X_N(k,k) as N \to \infty, where k \in \mathbb{N} is arbitrary but fixed (you should be actively reading the Olshanski-Vershik paper while we are doing this).

Theorem 10.2. For every integer 1 \leq d \leq N, and every pair of Young diagrams \alpha,\beta \in \mathsf{Y}_N^d, we have

K_N(\alpha,\beta) = \sum\limits_{r=0}^\infty \frac{(-1)^r}{N^r}\vec{W}^r(\alpha,\beta),

where the series is absolutely convergent and \vec{W}^r(\alpha,\beta) is the number of monotone r-step walks on the Cayley graph of the symmetric group \mathfrak{S}^d which begin at a permutation of cycle type \alpha and end at a permutation of cycle type \beta.

Clearly there is much to be unpacked here, but ultimately it is not so complicated, and moreover only a small amount of the information contained in this statement is needed to prove limit theorems about random matrices. A very basic observation is that in the range 1 \leq d \leq N, the set \mathsf{Y}_N^d is simply the set \mathbb{Y}^d of all Young diagrams with exactly d cells.

Now we have to explain what we mean by the Cayley graph of the symmetric group, and what we mean by a monotone walk on this graph. For this I am linking to a paper which hopefully provides a reasonably clear account of this combinatorial construction (all you need to read is Sections 2.1 and 2.2, just a couple of pages). After you have read this you will know what the Cayley graph of the symmetric group is, and what is meant by a strictly monotone walk on this graph. Furthermore, you will know the remarkable fact that there is a unique strictly monotone walk between any two permutations, and this walk is a geodesic.
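Once you have those definitions in hand, the uniqueness claim can be brute-forced for small d. The sketch below uses the convention that the edge corresponding to a transposition (a\,b) with a < b carries the label b, and that a strictly monotone walk is one whose edge labels strictly increase; walks from a fixed permutation reduce to factorizations of a single permutation, so it suffices to check walks starting at the identity. This is only an illustration of the stated fact, not a proof:

```python
from itertools import combinations, permutations, product

n = 4  # brute-force check in the symmetric group on 4 letters

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(n))

def transposition(a, b):
    t = list(range(n))
    t[a], t[b] = t[b], t[a]
    return tuple(t)

def num_cycles(p):
    seen, c = set(), 0
    for i in range(n):
        if i not in seen:
            c += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return c

identity = tuple(range(n))
# A strictly monotone walk from the identity is a product of transpositions
# (a_1 b_1)...(a_r b_r) with a_i < b_i and b_1 < b_2 < ... < b_r.
walks = {p: [] for p in permutations(range(n))}
for labels in (s for r in range(n) for s in combinations(range(1, n), r)):
    for smalls in product(*[range(b) for b in labels]):
        perm = identity
        for a, b in zip(smalls, labels):
            perm = compose(perm, transposition(a, b))
        walks[perm].append(len(labels))

for p, lengths in walks.items():
    assert len(lengths) == 1                 # exactly one strictly monotone walk
    assert lengths[0] == n - num_cycles(p)   # and it is a geodesic (Cayley distance)
```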

As explained in lecture, it is often the case that the logarithm of the characteristic function is easier to work with than the characteristic function itself – it is a generating function for cumulants rather than moments. Let us consider the logarithm of the characteristic function of X_N = U_NBU_N^*,

L_N(A,B) = \log K_N(A,B) = \log\int\limits_{\mathbb{U}_N} e^{i\mathrm{Tr}\, AUBU^*} \mathrm{d}U.
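As a reminder of why the logarithm helps, here is the Gaussian example from lecture checked numerically: the log-characteristic function of a Gaussian is a polynomial of degree two in t, i.e. all cumulants beyond the second vanish (a sketch assuming NumPy; the parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# For X ~ N(mu, sigma^2): log E[e^{itX}] = i*mu*t - sigma^2 t^2 / 2
mu, sigma, t = 0.5, 1.5, 0.3
x = rng.normal(mu, sigma, 200000)

# Monte Carlo estimate of the log-characteristic function at t
log_phi = np.log(np.mean(np.exp(1j * t * x)))
assert abs(log_phi - (1j * mu * t - sigma**2 * t**2 / 2)) < 0.02
```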
