Math 289A: Lecture 7

In this lecture we will review what we have done so far from a slightly different perspective. Starting with the Fourier kernel

F \colon \mathbb{R} \times \mathbb{R} \longrightarrow \mathbb{C},

which is defined by F(a,b) = e^{iab}, we can define the characteristic function of a random variable X with distribution \mu by

\varphi_X(a) = \mathbf{E}F(a,X) = \int\limits_{\mathbb{R}} F(a,x)\mu(\mathrm{d}x).

This is a uniformly continuous function from \mathbb{R} into the closed unit disc in \mathbb{C} which uniquely determines the measure \mu. To get some feel for why this might be, expand the Fourier kernel in a power series,

F(a,b) = 1 + \sum_{d=1}^\infty \frac{i^d}{d!} (ab)^d.

Then, assuming we can interchange expectation with the summation we get

\mathbf{E}F(a,X) = 1 + \sum_{d=1}^\infty \frac{i^d}{d!} a^d\mathbf{E}[X^d],

where

\mathbf{E}[X^d] = \int\limits_{\mathbb{R}} x^d \mu(\mathrm{d}x),

is the degree d moment of X, or equivalently of \mu. So, at least heuristically, the characteristic function of X is a generating function for the moments of its distribution. One situation in which this formal computation is quantitatively correct is when \mu has compact support, so if the characteristic function does indeed characterize the distribution of X then it must be the case that a compactly supported \mu is characterized by its moments. It is not difficult to see why the moments of a probability measure with compact support determine it completely: if we know how to compute the expectation of polynomial functions of X, then thanks to the Weierstrass approximation theorem we also know how to compute the expectation of continuous functions of X, and this in turn tells us how \mu measures intervals, by taking expectations of smooth bump functions. Of course, to obtain the fully general result that the characteristic function of X completely determines its distribution a different argument is needed. For example, the Cauchy distribution has no finite moments of any positive order, but its characteristic function still exists and determines it uniquely.
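As a quick numerical sanity check on this heuristic (not part of the lecture), consider the uniform distribution on [-1,1], which has compact support: its characteristic function is \sin(a)/a, and its moments are \mathbf{E}[X^d]=1/(d+1) for even d and 0 for odd d. A short Python sketch confirms that the truncated moment series recovers the characteristic function:

```python
import math

# Uniform distribution on [-1, 1]: characteristic function sin(a)/a,
# moments E[X^d] = 1/(d+1) for even d, 0 for odd d.

def phi(a):
    # exact characteristic function
    return math.sin(a) / a if a != 0 else 1.0

def moment(d):
    # E[X^d] for the uniform distribution on [-1, 1]
    return 1.0 / (d + 1) if d % 2 == 0 else 0.0

def truncated_series(a, terms=40):
    # partial sum of 1 + sum_d (i^d / d!) a^d E[X^d]; odd terms vanish,
    # so the sum is real
    total = 1.0
    for d in range(1, terms + 1):
        total += ((1j ** d) * a ** d / math.factorial(d) * moment(d)).real
    return total

for a in (0.5, 1.0, 3.0):
    assert abs(phi(a) - truncated_series(a)) < 1e-9
```

The agreement reflects exactly the interchange of expectation and summation above, which is justified here because the support is compact.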

The N-dimensional Fourier kernel

F_N \colon \mathbb{R}^N \times \mathbb{R}^N \longrightarrow \mathbb{C}

is defined by

F_N(A,B) = e^{i \langle A,B\rangle} = e^{i(a_1b_1+\dots+a_Nb_N)}, \quad A,B \in \mathbb{R}^N

and the characteristic function of an N-dimensional random vector X_N is the continuous function \mathbb{R}^N \to \mathbb{C} defined by

\varphi_{X_N}(A) = \mathbf{E}F_N(A,X_N) = \int\limits_{\mathbb{R}^N} F_N(A,X) \mu_N(\mathrm{d}X),

where \mu_N is the distribution of X_N in \mathbb{R}^N. It is still true that \varphi_{X_N} exists and uniquely determines \mu_N, and we can tell a similar story about moments which supports this statement. To do so we first have to expand the N-dimensional Fourier kernel in a power series. This is easy but let us first make some general observations about what this expansion must look like before doing the computation. Since F_N(A,B) = F_N(B,A) and

F_N(PA,PB) = e^{i\langle PA,PB\rangle} = e^{i \langle A,B \rangle} = F_N(A,B),

for any N \times N permutation matrix P, we have

F_N(A,B) =1 + \sum_{d=1}^\infty \frac{i^d}{d!} F_N^d(A,B)

where F_N^d(A,B) is a homogeneous polynomial of degree d in the products a_1b_1,\dots,a_Nb_N which is invariant under permutations of these variables. Therefore, we have

F_N^d(A,B) = \sum\limits_{\alpha \in \mathsf{C}_N^d} (a_1b_1)^{\alpha_1}\dots (a_Nb_N)^{\alpha_N}F_N(\alpha)

where \mathsf{C}_N^d is the set of weak N-part compositions \alpha of d \in \mathbb{N} and F_N(\alpha) is a numerical coefficient. It is easy to determine this coefficient explicitly: since

F_N(A,B) = e^{ia_1b_1} \dots e^{ia_Nb_N} = \left(\sum\limits_{d=0}^\infty \frac{i^d}{d!}(a_1b_1)^d \right)\dots \left(\sum\limits_{d=0}^\infty \frac{i^d}{d!}(a_Nb_N)^d \right),

we have

F_N(\alpha) = \frac{d!}{\alpha_1! \dots \alpha_N!}={d \choose \alpha_1,\dots,\alpha_N},

the multinomial coefficient. Thus, the expected value of F_N^d(A,X_N) is a generating polynomial for degree d mixed moments of the entries of the random vector X_N=(X_N(1),\dots,X_N(N)),

\mathbf{E}F_N^d(A,X_N) = \sum\limits_{\alpha \in \mathsf{C}_N^d} a_1^{\alpha_1} \dots a_N^{\alpha_N} \mathbf{E}[X^{\alpha_1}_N(1)\dots X^{\alpha_N}_N(N)]F_N(\alpha).
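The multinomial expansion of F_N^d(A,B)=\langle A,B\rangle^d is easy to verify numerically. Here is a short Python sketch (illustrative only, with arbitrary test vectors) that enumerates the weak N-part compositions of d and checks the identity:

```python
import math

# Check: <A,B>^d = sum over weak N-part compositions alpha of d of
# the multinomial coefficient d!/(alpha_1! ... alpha_N!) times
# (a_1 b_1)^{alpha_1} ... (a_N b_N)^{alpha_N}.

def compositions(d, n):
    # generate all weak n-part compositions of d
    if n == 1:
        yield (d,)
        return
    for first in range(d + 1):
        for rest in compositions(d - first, n - 1):
            yield (first,) + rest

def multinomial(d, alpha):
    out = math.factorial(d)
    for a in alpha:
        out //= math.factorial(a)
    return out

A = [0.3, -1.2, 2.0]   # arbitrary test vectors
B = [1.5, 0.4, -0.7]
N, d = 3, 4

lhs = sum(a * b for a, b in zip(A, B)) ** d
rhs = sum(
    multinomial(d, alpha)
    * math.prod((A[i] * B[i]) ** alpha[i] for i in range(N))
    for alpha in compositions(d, N)
)
assert abs(lhs - rhs) < 1e-9
```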

In the previous lectures we defined a strange new kernel

K_N \colon \mathbb{R}^N \times \mathbb{R}^N \longrightarrow \mathbb{C}

by

K_N(A,B) = \int\limits_{\mathbb{U}_N} F_N(A,|U|^2B)\mathrm{d}U,

where the integration is against Haar probability measure on the unitary group \mathbb{U}_N, and for any U \in \mathbb{U}_N the matrix

|U|^2=\begin{bmatrix} {} & \vdots & {} \\ \dots & |U_{xy}|^2 & \dots \\ {} & \vdots & {} \end{bmatrix}

is obtained from U=[U_{xy}] by taking the squared modulus of each entry. Thus for any B \in \mathbb{R}^N the vector

|U|^2B=\begin{bmatrix} |U_{11}|^2b_1 + \dots + |U_{1N}|^2b_N \\ |U_{21}|^2b_1 + \dots + |U_{2N}|^2b_N \\ \vdots \\ |U_{N1}|^2b_1 + \dots + |U_{NN}|^2b_N\end{bmatrix} \in \mathbb{R}^N

has coordinates which are superpositions of the coordinates of B, and because of this we named K_N the N-dimensional “quantum Fourier kernel.”
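The matrix |U|^2 is doubly stochastic, since the rows and columns of a unitary matrix are unit vectors, so the coordinates of |U|^2B are convex combinations of the coordinates of B. A short Python sketch (not from the lecture; it samples a Haar unitary by the standard QR-of-Ginibre method with phase correction) illustrates this:

```python
import numpy as np

# Sample an (approximately) Haar-distributed unitary U via QR factorization
# of a complex Ginibre matrix, with the standard diagonal phase correction.
rng = np.random.default_rng(0)
N = 5
G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, R = np.linalg.qr(G)
d = np.diagonal(R)
U = Q * (d / np.abs(d))  # fix phases so the distribution is Haar

# |U|^2: entrywise squared modulus, a doubly stochastic matrix
P = np.abs(U) ** 2
assert np.allclose(P.sum(axis=0), 1.0)  # columns of U are unit vectors
assert np.allclose(P.sum(axis=1), 1.0)  # rows of U are unit vectors

# Each coordinate of |U|^2 B is a convex combination of the b_i
B = rng.standard_normal(N)
v = P @ B
assert v.min() >= B.min() - 1e-12 and v.max() <= B.max() + 1e-12
```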

Problem 7.1. Prove that K_N is a bounded symmetric kernel on \mathbb{R}^N, i.e. |K_N(A,B)|\leq 1 and K_N(A,B)=K_N(B,A) for all A,B \in \mathbb{R}^N.

Then, given a random vector B_N with distribution \sigma_N in \mathbb{R}^N, we defined its quantum characteristic function by

\psi_{B_N}(A)=\mathbf{E}K_N(A,B_N) = \int\limits_{\mathbb{R}^N} K_N(A,B)\sigma_N(\mathrm{d}B).

This is a continuous function from \mathbb{R}^N into the closed unit disc in \mathbb{C}, and we claimed that \psi_{B_N} uniquely determines the distribution \sigma_N of B_N. Since the quantum characteristic function of B_N is clearly distinct from its classical characteristic function

\varphi_{B_N}(A) = \mathbf{E}F_N(A,B_N)=\int\limits_{\mathbb{R}^N} F_N(A,B)\sigma_N(\mathrm{d}B),

this must be true for some different reason. To understand what this reason might be, at least heuristically, we could proceed along the same lines as we did with F_N and try to determine whether \psi_{B_N}(A) is some sort of generating function encoding natural statistics of B_N.

Let’s give this a try. The first step is to expand the quantum Fourier kernel K_N as a power series. Before doing the computation, we can look for some easy symmetries of K_N which will tell us in advance what a series expansion of K_N ought to look like. In fact, the quantum Fourier kernel is much more symmetric than the classical Fourier kernel – it is invariant under independent (as opposed to simultaneous) permutations of the coordinates of its arguments.

Problem 7.2. Prove that K_N(PA,QB)=K_N(A,B) for any A,B \in \mathbb{R}^N and any N\times N permutation matrices P,Q.

From Problems 7.1 and 7.2, we can indeed infer the general form of a series expansion of the kernel K_N. Namely, we have

K_N(A,B) = 1 + \sum\limits_{d=1}^\infty \frac{i^d}{d!} K_N^d(A,B),

where K_N^d(A,B) is a polynomial function of the coordinates a_1,\dots,a_N,b_1,\dots,b_N, homogeneous of degree d in each group, which is invariant under independent permutations of a_1,\dots,a_N and b_1,\dots,b_N and stable under swapping these two sets of variables. What does such a polynomial look like?

A classical theorem of Newton asserts that a symmetric homogeneous degree d polynomial in N variables is a polynomial function of the power-sum polynomials

p_1(x_1,\dots,x_N) = x_1+\dots+x_N \\ p_2(x_1,\dots,x_N)=x_1^2+\dots+x_N^2\\ \vdots \\ p_N(x_1,\dots,x_N)=x_1^N+\dots+x_N^N.

Equivalently, the vector space \Lambda_N^d of symmetric homogeneous degree d polynomials in N variables has as a linear basis the polynomials

p_\alpha(x_1,\dots,x_N)=\prod\limits_{k=1}^{\ell(\alpha)} p_{\alpha_k}(x_1,\dots,x_N),

where \alpha=(\alpha_1,\dots,\alpha_{\ell(\alpha)}) ranges over the set \mathsf{P}_N^d of partitions of d with largest part at most N, and \ell(\alpha) denotes the number of parts of \alpha.
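A tiny instance of Newton's theorem can be checked directly. The elementary symmetric polynomial e_2 = \sum_{i<j} x_ix_j satisfies e_2 = (p_1^2 - p_2)/2, and the following Python sketch (illustrative, with an arbitrary test vector) verifies this numerically:

```python
import numpy as np

# Newton's identity e_2 = (p_1^2 - p_2)/2: a symmetric homogeneous
# degree 2 polynomial written as a polynomial in power sums.
x = np.array([1.3, -0.4, 2.1])  # arbitrary test point

p1 = x.sum()          # p_1 = x_1 + x_2 + x_3
p2 = (x ** 2).sum()   # p_2 = x_1^2 + x_2^2 + x_3^2
e2 = sum(x[i] * x[j] for i in range(3) for j in range(i + 1, 3))

assert abs(e2 - (p1 ** 2 - p2) / 2) < 1e-12
```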

Problem 7.3. Prove that for any \alpha \in \mathsf{P}_N^d and any A \in \mathbb{R}^N, the evaluation p_\alpha(A)=p_\alpha(a_1,\dots,a_N) satisfies |p_\alpha(A)| \leq \|A\|_\infty^d N^{\ell(\alpha)}, and that this bound is sharp.

Thus, a power series expansion of the quantum Fourier kernel can be written in the form

K_N(A,B) = 1 + \sum\limits_{d=1}^\infty \frac{i^d}{d!}\sum\limits_{\alpha,\beta \in \mathsf{P}_N^d} \frac{p_\alpha(a_1,\dots,a_N)}{N^{\ell(\alpha)}}\frac{p_\beta(b_1,\dots,b_N)}{N^{\ell(\beta)}}K_N(\alpha,\beta),

where the coefficients satisfy K_N(\alpha,\beta)=K_N(\beta,\alpha). This complicated-looking expression is actually quite meaningful. For any vector A \in \mathbb{R}^N, the normalized power sum polynomial of degree k in the coordinates of A is

\frac{p_k(a_1,\dots,a_N)}{N} = \int\limits_{\mathbb{R}} x^k \eta_A(\mathrm{d}x),

where

\eta_A = \frac{1}{N} \sum\limits_{i=1}^N \delta_{a_i}

is the discrete probability measure on \mathbb{R} which places equal mass at each coordinate of A. This is called the empirical distribution of A \in \mathbb{R}^N. Thus, the expansion of the quantum Fourier kernel K_N(A,B) is a generating function for the moments of the empirical distributions of its arguments. Consequently, for B_N a random vector, the quantum characteristic function \psi_{B_N}(A) is a generating function for expectations of polynomials in the random variables

p_1(B_N),p_2(B_N),p_3(B_N),\dots,

which are the moments of the (random) empirical distribution of the random vector B_N.
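The identification of normalized power sums with empirical moments is immediate to check in code. This Python sketch (illustrative, with an arbitrary vector A) computes both sides:

```python
import numpy as np

# p_k(A)/N equals the k-th moment of the empirical distribution eta_A,
# which places mass 1/N at each coordinate of A.
A = np.array([0.5, -1.0, 2.0, 0.5])  # arbitrary test vector
N = len(A)

for k in range(1, 6):
    power_sum = (A ** k).sum() / N      # p_k(a_1,...,a_N) / N
    empirical_moment = np.mean(A ** k)  # integral of x^k against eta_A
    assert abs(power_sum - empirical_moment) < 1e-12
```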
