Math 289A: Lecture 5

Let \mathbb{H}_N be the real vector space of Hermitian matrices with the Hilbert-Schmidt scalar product

\langle X,Y \rangle =\mathrm{Tr}\, XY.

Let

f \colon \mathbb{H}_N \longrightarrow \mathbb{R}^N

be the function from Hermitian matrices to standard Euclidean space defined by

f(X) = (\text{eigenvalues of }X\text{ listed largest to smallest}).

Theorem 5.1 (Hoffman-Wielandt Inequality). For any X,Y \in \mathbb{H}_N we have

\|f(X)-f(Y)\| \leq \|X-Y\|.

Proof: By the spectral theorem, there are unitary matrices V,W \in \mathbb{U}_N and nonincreasing vectors A,B \in \mathbb{R}^N (viewed as diagonal matrices) such that

X=VAV^* \quad\text{and}\quad Y=WBW^*.

The square of the Hilbert-Schmidt norm of the difference X-Y is

\|X-Y\|^2 = \langle VAV^*-WBW^*,VAV^*-WBW^*\rangle = \mathrm{Tr}\, (VAV^*-WBW^*)^2,

and

\mathrm{Tr}\, (VAV^*-WBW^*)^2=\mathrm{Tr}\, A^2 -2\mathrm{Tr}\, VAV^*WBW^*+\mathrm{Tr}\, B^2.

Therefore, if we can show

\mathrm{Tr}\, AUBU^* \leq \mathrm{Tr}\, AB

holds for all U \in \mathbb{U}_N, we will have proved the claimed inequality (make sure you understand why).

Now,

\mathrm{Tr}\, AUBU^* = \sum_{i,j=1}^N a_ib_j |U_{ij}|^2,

and [|U_{ij}|^2] is a doubly stochastic matrix. Therefore, the maximum value of \mathrm{Tr}\, AUBU^* over all U \in \mathbb{U}_N is bounded by the maximum value of the function

g(M) = \sum_{i,j=1}^N a_ib_j M_{ij},

as M ranges over all doubly stochastic matrices. Since g is a linear function on the Birkhoff polytope, it achieves its maximum value at an extreme point of this convex set, so at a permutation matrix. The vertex corresponding to the identity matrix gives

g(I) = \sum\limits_{i=1}^N a_i b_i = \mathrm{Tr}\, AB,

and since the coordinates of A and B are nonincreasing, the rearrangement inequality guarantees that the identity permutation is the optimal matching. Thus \mathrm{Tr}\, AUBU^* \leq \mathrm{Tr}\, AB for all U \in \mathbb{U}_N, as required.

– QED
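For the computationally inclined, here is a small NumPy sketch (an illustration only, not part of the argument; the helper names are ad hoc) which samples pairs of Hermitian matrices and checks that the Euclidean distance between their sorted spectra never exceeds the Hilbert-Schmidt distance.

import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    # Symmetrized complex Gaussian matrix.
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (Z + Z.conj().T) / 2

def sorted_spectrum(X):
    # f(X): eigenvalues listed largest to smallest.
    return np.sort(np.linalg.eigvalsh(X))[::-1]

N = 6
for _ in range(1000):
    X, Y = random_hermitian(N), random_hermitian(N)
    lhs = np.linalg.norm(sorted_spectrum(X) - sorted_spectrum(Y))
    rhs = np.linalg.norm(X - Y)  # Frobenius norm = Hilbert-Schmidt norm for Hermitian matrices
    assert lhs <= rhs + 1e-10
print("Hoffman-Wielandt inequality holds on 1000 random pairs")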

Now let X_N be a random Hermitian matrix.

Theorem 5.2. The following are equivalent:

  1. The characteristic function of X_N is invariant under unitary conjugation;
  2. The distribution of X_N is invariant under unitary conjugation;
  3. X_N is equal in distribution to a random Hermitian matrix of the form U_NB_NU_N^*, where U_N is a uniformly random unitary matrix and B_N is a random real diagonal matrix independent of U_N.

We have already proved the equivalence of (1) and (2).

Problem 5.2. Complete the proof of Theorem 5.2.

At this point we have enough information to explain how the characteristic function of a unitarily invariant random Hermitian matrix determines the joint distribution of its eigenvalues.

Theorem 5.3. The characteristic function of a unitarily invariant random Hermitian matrix is equal to the quantum characteristic function of its eigenvalues.

Obviously, this statement needs to be explained.

Let B_N be a real random vector whose distribution \sigma_N in \mathbb{R}^N is arbitrary, and let U_N be an independent random unitary matrix whose distribution \eta_N in \mathbb{U}_N is the uniform (Haar) measure. Then X_N=U_NB_NU_N^* is a unitarily invariant random Hermitian matrix, and by Theorem 5.2 every unitarily invariant random Hermitian matrix is equal in law to a matrix constructed in this way.

Because \mathbb{H}_N is non-compact, there is no such thing as a uniformly random Hermitian matrix. The matrix X_N = U_NB_NU_N^* is as close as we can get: conditionally on its eigenvalue vector B_N, it is uniformly distributed on the corresponding \mathbb{U}_N-orbit. If \mu_N is the distribution of X_N in \mathbb{H}_N, a random sample from \mu_N is obtained by sampling a vector from the distribution \sigma_N on \mathbb{R}^N, and then choosing a uniformly random point on the \mathbb{U}_N-orbit in \mathbb{H}_N which contains this vector (viewed as a diagonal matrix).
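To make this sampling recipe concrete, here is a short NumPy sketch (an illustration only, with \sigma_N taken to be a standard Gaussian on \mathbb{R}^N purely as an example); the Haar-distributed unitary is produced by QR-factoring a complex Ginibre matrix and correcting the phases of the diagonal of R.

import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(n):
    # Haar-distributed unitary: QR of a complex Ginibre matrix with the
    # standard phase correction taken from the diagonal of R.
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

def sample_invariant_matrix(b):
    # A uniformly random point on the U_N-orbit of diag(b): X = U diag(b) U^*.
    U = haar_unitary(len(b))
    return U @ np.diag(b) @ U.conj().T

N = 5
B = rng.standard_normal(N)        # one sample from sigma_N (Gaussian in this example)
X = sample_invariant_matrix(B)    # one sample from mu_N
print(np.allclose(np.sort(np.linalg.eigvalsh(X)), np.sort(B)))  # True: X has eigenvalues B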

In terms of characteristic functions, this says that

\varphi_{X_N}(A) = \mathbf{E}_{\mu_N}\left[e^{i \mathrm{Tr} AU_NB_NU_N^*}\right] = \mathbf{E}_{\sigma_N} \left[K_N(A,B_N)\right],

where

K_N(A,B) = \mathbf{E}_{\eta_N}\left[e^{i\mathrm{Tr}\, AUBU^*}\right] = \int_{\mathbb{U}_N} e^{i \mathrm{Tr}\, AUBU^*} \eta_N(\mathrm{d}U)

is the characteristic function of a uniformly random Hermitian matrix from the \mathbb{U}_N-orbit containing B \in \mathbb{R}^N (viewed as a diagonal matrix). Thus, K_N(A,B) is the characteristic function of a uniformly random Hermitian matrix with prescribed eigenvalues. This defines a bounded symmetric kernel

K_N \colon \mathbb{R}^N \times \mathbb{R}^N \longrightarrow \mathbb{C}

which we call the quantum Fourier kernel, because it is a superposition of standard N-dimensional Fourier kernels

K_N(A,B) = \int_{\mathbb{U}_N}e^{i \sum_{x,y=1}^N a_xb_y |U_{xy}|^2} \eta_N(\mathrm{d}U).
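Pending a closed formula (one is coming below), K_N(A,B) can be estimated by Monte Carlo directly from this Haar integral. The following sketch (an illustration only; the estimator name K_mc is ad hoc, and the Haar sampler from the previous snippet is repeated so the code runs on its own) averages e^{i \sum a_xb_y|U_{xy}|^2} over Haar samples, and up to Monte Carlo error it also exhibits the symmetry K_N(A,B)=K_N(B,A) of Problem 5.3 below.

import numpy as np

rng = np.random.default_rng(2)

def haar_unitary(n):
    # Haar unitary via QR with phase correction, as in the earlier snippet.
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

def K_mc(a, b, samples=50_000):
    # Monte Carlo estimate of K_N(A,B) = E[ exp(i Tr A U B U^*) ],
    # using Tr A U B U^* = sum_{x,y} a_x b_y |U_{xy}|^2.
    total = 0j
    for _ in range(samples):
        U = haar_unitary(len(a))
        total += np.exp(1j * np.sum(np.outer(a, b) * np.abs(U) ** 2))
    return total / samples

a = np.array([1.0, 0.5, -0.3])
b = np.array([2.0, 0.0, -1.0])
print(K_mc(a, b), K_mc(b, a))  # approximately equal, and both of modulus <= 1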

Problem 5.3. Prove that indeed K_N(A,B)=K_N(B,A) and |K_N(A,B)|\leq 1. Moreover, show that K_N(A,B) is invariant under independent permutations of the coordinates of A,B \in \mathbb{R}^N.

If K_N(A,B) is the quantum Fourier kernel, then

\psi_{B_N}(A) = \mathbf{E}[K_N(A,B_N)] = \int_{\mathbb{R}^N} K_N(A,B) \sigma_N(\mathrm{d}B)

deserves to be called the quantum characteristic function of the random vector B_N, or the quantum Fourier transform of the probability measure \sigma_N. What we have shown in this lecture is that for any unitarily invariant random Hermitian matrix X_N, we have

\varphi_{X_N}(A) = \psi_{B_N}(A), \quad A \in \mathbb{R}^N,

where B_N is any N-dimensional random vector whose components are the eigenvalues of X_N (this is what Theorem 5.3 means). We can also state the above as

\varphi_{D_N}(A) = \psi_{B_N}(A), \quad A \in \mathbb{R}^N,

where D_N is the diagonal of the random matrix X_N.
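As an end-to-end numerical illustration of the identity \varphi_{X_N}=\psi_{B_N} (again a sketch, not part of the lecture), one can take X_N to be a GUE matrix, i.e. a symmetrized complex Gaussian matrix, which is unitarily invariant: the characteristic function estimated directly from matrix samples agrees, up to Monte Carlo error, with the quantum characteristic function estimated from the eigenvalues of independent samples conjugated by fresh Haar unitaries.

import numpy as np

rng = np.random.default_rng(3)

def haar_unitary(n):
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

def gue(n):
    # A GUE matrix: a unitarily invariant random Hermitian matrix.
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (Z + Z.conj().T) / 2

N = 4
a = np.array([1.0, 0.5, -0.2, -1.3])   # the argument A, as a vector / diagonal matrix
samples = 20_000

phi, psi = 0j, 0j
for _ in range(samples):
    X = gue(N)
    phi += np.exp(1j * np.trace(np.diag(a) @ X))     # e^{i Tr A X_N}, from the matrix itself
    B = np.linalg.eigvalsh(gue(N))                   # eigenvalues of an independent copy
    U = haar_unitary(N)                              # independent Haar unitary
    psi += np.exp(1j * np.sum(np.outer(a, B) * np.abs(U) ** 2))  # quantum kernel integrand
print(phi / samples, psi / samples)  # estimates of phi_{X_N}(A) and psi_{B_N}(A); they agree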

The conclusion to be drawn from all of this is that figuring out how to analyze the eigenvalues of a unitarily invariant random Hermitian matrix X_N given its characteristic function \varphi_{X_N} is exactly the same thing as figuring out how to analyze an arbitrary random real vector B_N given its quantum characteristic function \psi_{B_N}.

Before telling you how we will do this, let me tell you how we won’t. The main reason the Hoffman-Wielandt theorem was presented in this lecture is that the proof makes you think about maximizing the function

g(U) = \sum_{x,y=1}^N a_xb_y|U_{xy}|^2

over unitary matrices U. The action g(U) is exactly what appears in the exponent of the quantum Fourier kernel:

K_N(A,B) = \int_{\mathbb{U}_N} e^{i g(U)} \mathrm{d}U.

(I am going to stop writing the Haar measure \eta_N explicitly in Haar integrals, so just \mathrm{d}U rather than \eta_N(\mathrm{d}U).) Now, the method of stationary phase predicts that K_N(A,B) should be approximated by contributions from the stationary points of the action, which means that this integral should localize to a sum over permutations. This turns out to be exactly correct, not just a good approximation, and the reasons for this are rooted in symplectic geometry. The end result is the following determinantal formula for the quantum Fourier kernel.

Theorem 5.4 (HCIZ Formula). For any A,B \in \mathbb{R}^N with no repeated coordinates we have

K_N(A,B) = \frac{1!2! \dots (N-1)!}{i^{N \choose 2}}\frac{\det [ e^{ia_xb_y}]}{\det[a_x^{N-y}]\det[b_x^{N-y}]}.
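The formula is easy to test numerically against the Haar-integral definition of K_N. Here is a sketch (an illustration only) of the determinantal side; its output should match the Monte Carlo estimate K_mc(a, b) from the earlier snippet up to sampling error.

import numpy as np
from math import factorial

def K_hciz(a, b):
    # Determinantal evaluation of the quantum Fourier kernel, assuming the
    # coordinates of a and of b are distinct.
    n = len(a)
    c = np.prod([factorial(p) for p in range(1, n)])      # 1! 2! ... (N-1)!
    num = np.linalg.det(np.exp(1j * np.outer(a, b)))      # det[ e^{i a_x b_y} ]
    def vandermonde(v):
        # det[v_x^{N-y}]: rows indexed by x, columns by decreasing powers.
        return np.linalg.det(np.vander(v))
    return c * num / (1j ** (n * (n - 1) // 2) * vandermonde(a) * vandermonde(b))

a = np.array([1.0, 0.5, -0.3])
b = np.array([2.0, 0.0, -1.0])
print(K_hciz(a, b))   # compare with the Monte Carlo estimate K_mc(a, b) above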

This is a very interesting formula which makes manifest all the symmetries of the quantum Fourier kernel claimed in Problem 5.3 (but, you should solve that problem without using the determinantal formula). It is also psychologically satisfying in that it shows exactly how the quantum Fourier kernel is built out of classical one-dimensional Fourier kernels. However, it is not really good for much else. We will explain this further in the coming lectures. But first we will adapt all of the above to random rectangular matrices.
