We start with a brief recap of what we are trying to achieve – since things are getting complicated it might be helpful to step back for a moment before stepping further in.
Let $\mathrm{H}(N)$ be the Euclidean space of $N \times N$ Hermitian matrices, and let $\mathrm{U}(N)$ be the compact group of $N \times N$ unitary matrices. We have a group action $\mathrm{U}(N) \to \mathrm{O}(\mathrm{H}(N))$, where $\mathrm{O}(\mathrm{H}(N))$ is the orthogonal group of the Euclidean space $\mathrm{H}(N)$, defined by

$$U \cdot X = UXU^{-1}.$$

This group action is called unitary conjugation. Let

$$\mathcal{O}_X = \{UXU^{-1} \colon U \in \mathrm{U}(N)\}$$

denote the orbit of a given Hermitian matrix $X$ under unitary conjugation.
Theorem 10.1 (Spectral Theorem). For every $X \in \mathrm{H}(N)$, the orbit $\mathcal{O}_X$ contains a diagonal matrix, which is unique up to permutations of its diagonal elements.
Let

$$\mathrm{W}(N) = \{\lambda = (\lambda_1, \dots, \lambda_N) \in \mathbb{R}^N \colon \lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_N\}.$$

We will refer to this cone in $\mathbb{R}^N$ as the Weyl chamber. This term comes from Lie theory – we are not doing Lie theory, we just need a name for $\mathrm{W}(N)$. By the Spectral Theorem, the orbits of $\mathrm{H}(N)$ under unitary conjugation are indexed by points of the Weyl chamber: we set

$$\mathcal{O}_\lambda = \{U \lambda U^{-1} \colon U \in \mathrm{U}(N)\}, \qquad \lambda \in \mathrm{W}(N),$$

where on the right hand side we are viewing $\lambda$ as a diagonal matrix.
A standard problem in linear algebra is the following: we are given a matrix $X \in \mathrm{H}(N)$, meaning that we are explicitly given all entries of this matrix, and from this information we are asked to find $\lambda \in \mathrm{W}(N)$ such that $X \in \mathcal{O}_\lambda$. The Spectral Theorem guarantees that this problem has a unique solution. The coordinates $\lambda_1 \geq \dots \geq \lambda_N$ of the solution are the eigenvalues of the given matrix $X$.
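In code, this forward problem is just numerical diagonalization. A minimal NumPy sketch (the variable names are mine, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a Hermitian matrix X from a complex Ginibre matrix G.
N = 5
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (G + G.conj().T) / 2

# eigh returns the spectrum of a Hermitian matrix in increasing order;
# reversing it gives the point lam of the Weyl chamber whose orbit contains X.
evals, U = np.linalg.eigh(X)
lam = evals[::-1]
assert np.all(lam[:-1] >= lam[1:])  # lam is weakly decreasing

# The columns of U diagonalize X: U* X U is a diagonal matrix in the orbit of X.
D = U.conj().T @ X @ U
assert np.allclose(D, np.diag(evals), atol=1e-8)
```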
We are interested in solving a generic version of the inverse problem: given $\lambda \in \mathrm{W}(N)$, determine the matrix elements of a uniformly random point $X \in \mathcal{O}_\lambda$. Since $\lambda$ is given, we are being asked about the random Hermitian matrix

$$X = U \lambda U^{-1},$$

where $U$ is a uniformly random unitary matrix from $\mathrm{U}(N)$. This is something we know how to define precisely (Haar measure).
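Here is one standard way to realize this construction numerically (a sketch, assuming NumPy; the phase correction after the QR factorization is what makes the distribution exactly Haar):

```python
import numpy as np

def haar_unitary(N, rng):
    """Sample a Haar-distributed N x N unitary matrix by QR-factorizing a
    complex Ginibre matrix and normalizing the phases of diag(R)."""
    G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
    Q, R = np.linalg.qr(G)
    d = np.diag(R)
    return Q * (d / np.abs(d))

rng = np.random.default_rng(1)
lam = np.array([3.0, 1.0, 0.0, -2.0])  # a point of the Weyl chamber

# A uniformly random point of the orbit O_lam:
U = haar_unitary(4, rng)
X = U @ np.diag(lam) @ U.conj().T

assert np.allclose(X, X.conj().T)                     # X is Hermitian
assert np.allclose(np.linalg.eigvalsh(X), lam[::-1])  # with spectrum lam
```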
We want to solve this inverse problem by computing the characteristic function of $X$, which is

$$\Phi_X(Y) = \mathbb{E}\, e^{i \langle X, Y \rangle}, \qquad Y \in \mathrm{H}(N),$$

where $\langle X, Y \rangle = \operatorname{Tr}(XY)$ is the scalar product. We know that it is in fact sufficient to do this for $Y = \mu$ a diagonal Hermitian matrix, and we have introduced the notation

$$\Phi(\lambda, \mu) = \mathbb{E}\, e^{i \operatorname{Tr}(U \lambda U^{-1} \mu)}$$

in order to emphasize the fact that we can view this characteristic function as defining a bounded symmetric kernel on $\mathrm{W}(N) \times \mathrm{W}(N)$. Furthermore, $\Phi(\lambda, \mu)$ is actually the characteristic function of the diagonal $(X_{11}, \dots, X_{NN})$ of $X = U \lambda U^{-1}$, because $\operatorname{Tr}(X \mu) = \mu_1 X_{11} + \dots + \mu_N X_{NN}$ when $\mu$ is diagonal, so a solution to our problem is the same thing as determining the distribution of the diagonal of a uniformly random Hermitian matrix with prescribed eigenvalues.
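The reduction to the diagonal rests on nothing more than the identity $\operatorname{Tr}(X\mu) = \sum_j \mu_j X_{jj}$ for diagonal $\mu$, which is easy to confirm numerically (a NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 3

# A Hermitian X and a diagonal matrix diag(mu): the pairing Tr(X diag(mu))
# only sees the diagonal entries of X.
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (G + G.conj().T) / 2
mu = rng.standard_normal(N)

lhs = np.trace(X @ np.diag(mu))
rhs = np.sum(mu * np.diag(X))
assert abs(lhs - rhs) < 1e-10
```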
We can view $\Phi(\lambda, \mu)$ as a generating function for the joint moments of the diagonal matrix elements $X_{11}, \dots, X_{NN}$, meaning that

$$\Phi(\lambda, \mu) = \sum_{d=0}^{\infty} \frac{i^d}{d!} \sum_{\alpha} \binom{d}{\alpha_1, \dots, \alpha_N} \mathbb{E}\left[X_{11}^{\alpha_1} \cdots X_{NN}^{\alpha_N}\right] \mu_1^{\alpha_1} \cdots \mu_N^{\alpha_N},$$

where the internal sum runs over the set of weak $N$-part compositions of the number $d$, which are simply vectors $\alpha = (\alpha_1, \dots, \alpha_N)$ of nonnegative integers whose coordinates sum to $d$. The coefficient corresponding to a given composition $\alpha$ is a very simple quantity, namely the multinomial coefficient

$$\binom{d}{\alpha_1, \dots, \alpha_N} = \frac{d!}{\alpha_1! \cdots \alpha_N!}.$$
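These coefficients come straight from the multinomial theorem, which we can check directly (a sketch; `weak_compositions` is a hypothetical helper implementing stars and bars, not something from the text):

```python
import itertools
import math

def weak_compositions(d, parts):
    """All vectors of `parts` nonnegative integers summing to d (stars and bars)."""
    for cuts in itertools.combinations(range(d + parts - 1), parts - 1):
        yield tuple(b - a - 1 for a, b in zip((-1,) + cuts, cuts + (d + parts - 1,)))

def multinomial(d, alpha):
    out = math.factorial(d)
    for a in alpha:
        out //= math.factorial(a)
    return out

# Multinomial theorem: (x_1 + ... + x_N)^d equals the sum over weak N-part
# compositions alpha of d of multinomial(d, alpha) * x^alpha.
x = [2.0, -1.0, 0.5]
d = 4
lhs = sum(x) ** d
rhs = sum(multinomial(d, a) * math.prod(xi ** ai for xi, ai in zip(x, a))
          for a in weak_compositions(d, len(x)))
assert abs(lhs - rhs) < 1e-9
```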
But it is Haard (sorry, couldn’t resist) to see how to directly compute all mixed moments of the random variables $X_{11}, \dots, X_{NN}$. Indeed, we actually have a formula for each of these random variables, namely

$$X_{jj} = \sum_{k=1}^{N} \lambda_k |U_{jk}|^2,$$

which is obtained from $X = U \lambda U^{-1}$ simply by matrix multiplication. This formula more or less shows that a direct computation of, say, the moment sequence of just $X_{11}$ involves computing a family of Haar integrals over $\mathrm{U}(N)$ which we don’t know how to contend with.
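The formula for the diagonal entries is a one-line consequence of $X = U\lambda U^{-1}$, and it can be checked numerically (a NumPy sketch; any unitary $U$ works for the algebraic identity, Haar-randomness is not needed here):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4
lam = np.array([2.0, 1.0, -0.5, -3.0])

# Any unitary U suffices to check X_jj = sum_k lam_k |U_jk|^2
# for X = U diag(lam) U*.
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
U, _ = np.linalg.qr(G)

X = U @ np.diag(lam) @ U.conj().T
diag_formula = np.abs(U) ** 2 @ lam   # entry j: sum_k |U_jk|^2 lam_k
assert np.allclose(np.diag(X).real, diag_formula)
```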
But $\Phi(\lambda, \mu)$ is a moment generating function in another sense: we have

$$\Phi(\lambda, \mu) = \sum_{d=0}^{\infty} \frac{i^d}{d!} \sum_{\alpha} \sum_{\beta} c_{\alpha\beta}(N)\, p_\alpha(\lambda)\, p_\beta(\mu),$$

where the internal (double) sum is over the set of Young diagrams $\alpha, \beta$ with $d$ cells and first row of length at most $N$; such diagrams are pictorial representations of partitions of $d$ with largest part at most $N$. We know that such an expansion exists by a “soft” symmetry argument combined with a classical theorem of Newton: we use that $\Phi(\lambda, \mu)$ is invariant under independent permutations of the coordinates of the vectors $\lambda$ and $\mu$. The moments involved in this expansion are the moments of the (non-normalized) empirical eigenvalue distributions of these vectors, i.e. the discrete measures

$$\sum_{j=1}^{N} \delta_{\lambda_j} \quad \text{and} \quad \sum_{j=1}^{N} \delta_{\mu_j}$$

on $\mathbb{R}$ which place unit mass at each coordinate. The moments of these measures are exactly sums of powers of the coordinates of $\lambda$ and $\mu$:

$$p_k(\lambda) = \sum_{j=1}^{N} \lambda_j^k, \qquad p_\alpha(\lambda) = p_{\alpha_1}(\lambda)\, p_{\alpha_2}(\lambda) \cdots.$$
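In code, these moments are just power sums, and for a Young diagram $\alpha$ the corresponding moment is the product over its rows (a small sketch; the helper names are mine):

```python
import numpy as np

lam = np.array([3.0, 1.0, 1.0, -2.0])

def p(k, lam):
    """k-th moment of the empirical measure sum_j delta_{lam_j}: the power sum."""
    return np.sum(lam ** k)

def p_alpha(alpha, lam):
    """Product of power sums over the rows of the Young diagram alpha."""
    return np.prod([p(k, lam) for k in alpha])

assert p(2, lam) == 9.0 + 1.0 + 1.0 + 4.0
# Invariance under permuting the coordinates, as used in the symmetry argument:
assert np.isclose(p_alpha((2, 1), lam), p_alpha((2, 1), lam[::-1]))
```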
This leaves the problem of computing the coefficients $c_{\alpha\beta}(N)$ in this nonstandard moment generating function. This can be done using representation theory – in fact one needs to know quite a bit about representations of the unitary and symmetric groups, as well as about the relationship between them. We will mostly black box this to avoid going too deeply into algebra (though an indication of what needs to be done was given in the previous lecture, where we discussed Schur polynomials). Instead, we will focus on two facts: first, the coefficients $c_{\alpha\beta}(N)$ have an intrinsically interesting combinatorial interpretation, which is something that often happens with moment generating functions of interesting random variables (we discussed the example of Gaussians in lecture); second, this combinatorial interpretation will allow us to prove probabilistically meaningful limit theorems. The simplest such theorem is the result of Olshanski and Vershik on the $N \to \infty$ limit joint distribution of the diagonal entries $X_{11}, \dots, X_{kk}$, where $k$ is arbitrary but fixed (you should be actively reading the Olshanski-Vershik paper while we are doing this).
Theorem 10.2. For every integer $d \geq 0$ and every pair of Young diagrams $\alpha, \beta$ with $d$ cells, we have

$$c_{\alpha\beta}(N) = \sum_{r=0}^{\infty} \left(-\frac{1}{N}\right)^{r} \vec{H}_r(\alpha, \beta),$$

where the series is absolutely convergent and $\vec{H}_r(\alpha, \beta)$ is the number of monotone $r$-step walks on the Cayley graph of the symmetric group $\mathrm{S}(d)$ which begin at a permutation of cycle type $\alpha$ and end at a permutation of cycle type $\beta$.
Clearly there is much to be unpacked here, but ultimately it is not so complicated, and moreover only a small amount of the information contained in this statement is needed to prove limit theorems about random matrices. A very basic observation is that in the range $d \leq N$, the set of Young diagrams with $d$ cells and first row of length at most $N$ is simply the set of all Young diagrams with exactly $d$ cells.
Now we have to explain what we mean by the Cayley graph of the symmetric group, and what we mean by a monotone walk on this graph. For this I am linking to a paper which hopefully provides a reasonably clear account of this combinatorial construction (all you need to read is Sections 2.1 and 2.2, just a couple of pages). After you have read this you will know what the Cayley graph of the symmetric group is, and what is meant by a strictly monotone walk on this graph. Furthermore, you will know the remarkable fact that there is a unique strictly monotone walk between any two permutations, and that this walk is a geodesic.
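The uniqueness of strictly monotone walks can be verified by brute force for small symmetric groups. In the sketch below (my own conventions, not from the linked paper: permutations are tuples of 0-based images, and a walk multiplies transpositions $(a_i\ b_i)$ with $a_i < b_i$ and larger elements strictly increasing, $b_1 < b_2 < \dots$), every one of the $24$ elements of $\mathrm{S}(4)$ is the endpoint of exactly one strictly monotone walk starting at the identity:

```python
import itertools

def compose(s, t):
    """(s * t)(i) = s(t(i)); permutations as tuples of 0-based images."""
    return tuple(s[t[i]] for i in range(len(s)))

def transposition(n, a, b):
    img = list(range(n))
    img[a], img[b] = img[b], img[a]
    return tuple(img)

n = 4
identity = tuple(range(n))

# Enumerate all strictly monotone walks starting at the identity: choose a
# strictly increasing sequence of larger elements b_1 < ... < b_r, then for
# each b_i any smaller element a_i < b_i, and record the walk's endpoint.
endpoints = {}
for bs in itertools.chain.from_iterable(
        itertools.combinations(range(1, n), r) for r in range(n)):
    for a_choices in itertools.product(*[range(b) for b in bs]):
        walk_end = identity
        for a, b in zip(a_choices, bs):
            walk_end = compose(walk_end, transposition(n, a, b))
        endpoints[walk_end] = endpoints.get(walk_end, 0) + 1

# Every permutation in S(4) is reached by exactly one strictly monotone walk.
assert len(endpoints) == 24
assert all(count == 1 for count in endpoints.values())
```

Note that the total number of such walks is $2 \cdot 3 \cdot 4 = 24 = |\mathrm{S}(4)|$ by construction, so uniqueness and existence are equivalent here.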
As explained in lecture, it is often the case that the logarithm of the characteristic function is easier to work with than the characteristic function itself – it is a generating function for cumulants rather than moments. Let us consider the logarithm of the characteristic function of $X$.
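As a preview of why the logarithm is convenient, here is the classical moment–cumulant recursion in code (a sketch, not from the text): for the standard Gaussian, whose moments are $1, 0, 1, 0, 3, 0, 15, \dots$, every cumulant beyond the second vanishes.

```python
from math import comb

def cumulants_from_moments(m):
    """Given moments m[0..n] with m[0] = 1, return cumulants c[1..n] via the
    classical recursion m_j = sum_{k=1}^j binom(j-1, k-1) c_k m_{j-k}."""
    n = len(m) - 1
    c = [0.0] * (n + 1)
    for j in range(1, n + 1):
        c[j] = m[j] - sum(comb(j - 1, k - 1) * c[k] * m[j - k]
                          for k in range(1, j))
    return c

# Moments of the standard Gaussian (the odd moments vanish, the even ones
# are the double factorials 1, 3, 15, ...):
m = [1, 0, 1, 0, 3, 0, 15]
c = cumulants_from_moments(m)
assert c[2] == 1 and all(abs(c[j]) < 1e-9 for j in (1, 3, 4, 5, 6))
```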