Math 289A: Lecture 1

Let \mathbb{L} be a Euclidean space and \mathfrak{B} its Borel sigma algebra.

Definition 1.1. A random variable taking values in \mathbb{L} is a measurable function X from a probability space (\Omega,\mathfrak{F},\mathbf{P}) into \mathbb{L}.

Every random variable leads a double life: it is both a function into \mathbb{L} and a measure on \mathbb{L}.

Definition 1.2. The distribution of X is the probability measure on Borel sets \mathsf{S} \subseteq \mathbb{L} defined by

\mu_X(\mathsf{S}) = \mathbf{P}\left(\{\omega \in \Omega \colon X(\omega) \in \mathsf{S}\}\right).

Strictly speaking, it is not correct to conflate X with its distribution \mu_X, since one could have a second random variable Y which is distinct from X but has the same distribution – when this happens we say that X and Y are “equal in distribution”, or “equal in law.” I heard the “double life” descriptor in a probability course taught by David Steinsaltz, and I think it is helpful provided one keeps the above caveat in mind. In fact I will go one step further and say that X leads a triple life – it is also a complex-valued function on \mathbb{L}.
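The distinction between a random variable and its distribution can be seen in a quick numerical sketch (the setup here is my own illustration, not from the text): if X is a standard Gaussian, then Y = -X is a different function on the sample space, yet X and Y are equal in law.

```python
import numpy as np

rng = np.random.default_rng(0)

# X a standard Gaussian sample; Y = -X is a *different* function on the
# sample space Omega, but has the same distribution as X.
X = rng.standard_normal(100_000)
Y = -X

# Pointwise, the two random variables disagree almost everywhere...
assert not np.allclose(X, Y)

# ...but their empirical distributions agree: compare histogram densities
# of both samples over a common set of bins.
bins = np.linspace(-4.0, 4.0, 41)
hist_X, _ = np.histogram(X, bins=bins, density=True)
hist_Y, _ = np.histogram(Y, bins=bins, density=True)
print(np.max(np.abs(hist_X - hist_Y)))  # small (sampling noise only)
```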

Definition 1.3. The characteristic function of X is defined by the expectation

\varphi_{X}(T) = \mathbf{E}\left[ e^{i \langle T,X\rangle} \right] = \int\limits_{\mathbb{L}} e^{i \langle T,x \rangle}\, \mu_{X}(\mathrm{d}x), \quad T \in \mathbb{L}.

That is, the characteristic function of a random variable is the Fourier transform of its distribution.
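As a sanity check on the definition, one can approximate the expectation \mathbf{E}[e^{i\langle T,X\rangle}] by a Monte Carlo average. A minimal sketch in the case \mathbb{L} = \mathbb{R} with X standard Gaussian, where the exact answer is the well-known formula \varphi_X(t) = e^{-t^2/2} (variable names below are my own):

```python
import numpy as np

rng = np.random.default_rng(1)

# Empirical characteristic function of a standard Gaussian on the line:
# phi_X(t) = E[exp(i t X)], approximated by averaging over samples.
X = rng.standard_normal(200_000)
t = np.linspace(-3.0, 3.0, 13)
phi_emp = np.exp(1j * np.outer(t, X)).mean(axis=1)

# Exact characteristic function of the standard Gaussian: exp(-t^2/2).
phi_exact = np.exp(-t**2 / 2)
print(np.max(np.abs(phi_emp - phi_exact)))  # small Monte Carlo error
```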

Problem 1.1. Prove that the characteristic function of a random variable is a uniformly continuous function from \mathbb{L} into the closed unit disc in \mathbb{C}.

The characteristic function of X faithfully encodes its distribution: if Y is another random variable, then X and Y are equal in law if and only if they have the same characteristic function. This is a famous result in probability theory – a proof can be found in virtually any textbook on the subject. What makes characteristic functions so useful is that they transform probability into analysis, specifically harmonic analysis, which allows us to apply analytic methods to probabilistic questions. In particular, the famous Lévy Continuity Theorem characterizes convergence in distribution for sequences of random variables in terms of the corresponding sequence of characteristic functions. This provides an efficient approach to the classical limit theorems of probability theory, the Law of Large Numbers and the Central Limit Theorem. Again, you will find an exposition of this approach in any standard probability text.
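The Central Limit Theorem phenomenon behind the Lévy Continuity Theorem can be observed numerically: the empirical characteristic function of a normalized sum of i.i.d. variables approaches the Gaussian characteristic function e^{-t^2/2}. A rough sketch (my own choice of uniform summands, scaled to mean 0 and variance 1):

```python
import numpy as np

rng = np.random.default_rng(4)

# Normalized sum of n i.i.d. uniform variables on [-sqrt(3), sqrt(3)],
# which have mean 0 and variance 1.
n, samples = 50, 100_000
U = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(samples, n))
S = U.sum(axis=1) / np.sqrt(n)

# The empirical characteristic function of S is close to exp(-t^2/2),
# the characteristic function of the standard Gaussian.
t = np.linspace(-3.0, 3.0, 13)
phi_S = np.exp(1j * np.outer(t, S)).mean(axis=1)
print(np.max(np.abs(phi_S - np.exp(-t**2 / 2))))  # small
```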

Random Matrix Theory is one of the most active areas of research in contemporary probability theory. It is a beautiful blend of algebra and probability which enjoys striking connections with many other fields, including combinatorics, number theory, physics, statistics, and something called “data science.” Strangely, RMT makes no use of characteristic functions – I am not aware of a single textbook on random matrices which even defines the characteristic function of a random matrix, despite the fact that this is simply a special case of the above construction.

To be concrete, let us take \mathbb{H} to be the real vector space of N \times N Hermitian matrices equipped with the scalar product

\langle X,Y \rangle = \mathrm{Tr}\, XY,

where \mathrm{Tr} denotes the standard matrix trace.

Problem 1.2. Prove that the above really does define a scalar product on the space of Hermitian matrices, and write down an orthonormal basis.
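While Problem 1.2 asks for a proof, a numerical check of the claimed properties is easy to set up. The sketch below (helper names are mine) verifies that \langle X,Y\rangle = \mathrm{Tr}\, XY is real, symmetric, and positive on nonzero Hermitian matrices, and counts the real dimension N^2 of \mathbb{H} suggested by the diagonal and off-diagonal degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4

def random_hermitian(n):
    """Draw a random n x n Hermitian matrix with Gaussian entries."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

def inner(X, Y):
    """The form <X, Y> = Tr(XY)."""
    return np.trace(X @ Y)

X, Y = random_hermitian(N), random_hermitian(N)

# On Hermitian matrices the form is real-valued and symmetric.
assert np.isclose(inner(X, Y).imag, 0.0)
assert np.isclose(inner(X, Y), inner(Y, X))

# Positive-definiteness: <X, X> = sum of |entries|^2 > 0 for X != 0.
assert inner(X, X).real > 0

# Real dimension of N x N Hermitian matrices: N real diagonal entries
# plus 2 real parameters for each of the N(N-1)/2 off-diagonal entries.
print(N + 2 * (N * (N - 1) // 2))  # N^2 = 16
```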

Everything we have said so far about random variables taking values in a general Euclidean space \mathbb{L} is perfectly valid in the special case where \mathbb{L}=\mathbb{H}. Thus, there is no definitional obstruction to analyzing a random variable X taking values in \mathbb{H} via its characteristic function \varphi_X. But there is a conceptual issue: random matrices are random variables which lead not just a triple life, but a quadruple life. More precisely, a random Hermitian matrix X has associated to it not just a measure \mu_X and a function \varphi_X, but also an operator L_X defined by matrix multiplication

L_X(v) = Xv, \quad v \in \mathbb{C}^N.

The random operator L_X is a geometric object, a random linear transformation of \mathbb{C}^N, and as such has geometric invariants such as eigenvalues and eigenvectors.
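A short sketch of this "quadruple life" in code (again my own illustration): a sampled Hermitian matrix X, viewed as the operator L_X, has real eigenvalues and an orthonormal basis of eigenvectors, and its action on a vector agrees with the spectral decomposition.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5

# A random Hermitian matrix X, viewed as the operator L_X(v) = X v.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (A + A.conj().T) / 2

# Geometric invariants of L_X: real eigenvalues, orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(X)
assert np.all(np.isreal(eigvals))
assert np.allclose(eigvecs.conj().T @ eigvecs, np.eye(N))

# The action of L_X on a vector matches the spectral decomposition
# X v = U diag(lambda) U* v.
v = rng.standard_normal(N) + 1j * rng.standard_normal(N)
assert np.allclose(X @ v, eigvecs @ (eigvals * (eigvecs.conj().T @ v)))
```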

The statistical behavior of the geometric invariants of L_X is encoded in the characteristic function \varphi_X of the matrix X. However, the characteristic function “sees” X as a random vector in the N^2-dimensional Euclidean space \mathbb{H}, not as a random operator on the N-dimensional Hilbert space \mathbb{C}^N, and the question of how to extract geometric information from \varphi_X does not have an obvious or easy answer. Because of this, researchers in random matrix theory long ago abandoned characteristic functions. The subject has developed its own specific tools, which have been very successful but have also distanced it from other parts of probability theory.

In recent years the question of whether a Fourier-analytic approach to random matrices might be possible after all has begun to be reconsidered. I do not claim to have a satisfactory answer to this question, but we will explore it in this topics course and perhaps see the beginnings of an answer. To get the most out of this endeavor, it would be best to absorb the standard approach to random matrix theory at the same time. The course notes by Todd Kemp and Terence Tao are both excellent resources which are freely available online.
