## Math 202C: Lecture 21 - The sensitivity conjecture

##### Preliminary definitions and Notations

First up, a list of preliminary notations and definitions of terms that recur throughout this blog post.

• $Q^n$ denotes the $n$-dimensional hypercube graph whose vertex set consists of all strings in $\{-1,1\}^n$, and two vertices are adjacent iff the corresponding strings differ in exactly one coordinate.
• For an undirected graph $G$, the maximum degree of a vertex in $G$ is denoted by $\Delta(G)$ and $\lambda_1(G)$ denotes the largest eigenvalue of the adjacency matrix of $G$.
• For $x \in \{-1,1\}^n$ and a subset $S$ of indices from $[n] = \{1, 2, \cdots, n\}$, the string obtained from $x$ by flipping all entries of $x$ corresponding to the indices in $S$ is denoted by $x^S$.
• Any function $f : \{-1,1\}^n \rightarrow \{-1,1\}$ is called a boolean function.

Definition (Local sensitivity and sensitivity). Given a boolean function $f : \{-1,1\}^n \rightarrow \{-1,1\}$, the local sensitivity on the input $x$, denoted by $s(f,x)$, is defined as the number of indices $i$, such that $f(x) \neq f(x^{\{i\}})$, and the sensitivity $s(f)$ of $f$ is $\max_x s(f,x)$.

Definition (Local block sensitivity and block sensitivity). Given a boolean function $f : \{-1,1\}^n \rightarrow \{-1,1\}$, the local block sensitivity on the input $x$, denoted by $bs(f,x)$, is defined as the maximum number of disjoint blocks $B_1, \cdots, B_k$ of $[n]$ such that for each $B_i, f(x) \neq f(x^{B_i})$, and the block sensitivity $bs(f)$ of $f$ is $\max_x bs(f,x)$.
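Both definitions are easy to check by brute force on small examples. Below is a quick sketch (the helper names are mine, not standard) that computes $s(f)$ and $bs(f)$ for a boolean function given as a Python callable on $\{-1,1\}^n$; the block-sensitivity search is exhaustive and only feasible for small $n$:

```python
from itertools import product, combinations

def flip(x, S):
    # x^S: negate the entries of x indexed by S
    return tuple(-v if i in S else v for i, v in enumerate(x))

def sensitivity(f, n):
    # s(f) = max over x of #{i : f(x) != f(x^{i})}
    return max(sum(f(flip(x, (i,))) != f(x) for i in range(n))
               for x in product((-1, 1), repeat=n))

def block_sensitivity(f, n):
    # bs(f) = max over x of the largest number of pairwise disjoint
    # blocks B with f(x) != f(x^B); brute-force search over subsets
    def most_disjoint(blocks):
        best = 0
        for i, B in enumerate(blocks):
            rest = [C for C in blocks[i + 1:] if not set(B) & set(C)]
            best = max(best, 1 + most_disjoint(rest))
        return best
    result = 0
    for x in product((-1, 1), repeat=n):
        sensitive = [B for k in range(1, n + 1)
                     for B in combinations(range(n), k)
                     if f(flip(x, B)) != f(x)]
        result = max(result, most_disjoint(sensitive))
    return result

maj3 = lambda x: 1 if sum(x) > 0 else -1  # majority on 3 bits
```

For the 3-bit majority function both measures equal 2, and in general $bs(f) \geq s(f)$ since singleton blocks are blocks.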

#### The sensitivity conjecture and the tale of three theorems

(Sensitivity conjecture) There exists an absolute constant $C >0$, such that for every boolean function $f$, $bs(f) \leq s(f)^C$.

Sensitivity and block sensitivity are examples of complexity measures for boolean functions. Two complexity measures $A$ and $B$ are said to be polynomially related to each other if there exist polynomials $p(t), q(t)$ such that for any Boolean function $f$, we have $A(f) \leq p(B(f))$ and $B(f) \leq q(A(f))$. The block sensitivity is known to be polynomially related to other complexity measures of boolean functions such as the degree (to be defined later). However, the question (a.k.a. the sensitivity conjecture) of whether the block sensitivity is polynomially related to the sensitivity was posed as an open problem by Nisan and Szegedy in 1989 (note that $bs(f) \geq s(f)$ is a direct consequence of their definitions; the open problem was showing whether $bs(f) \leq s(f)^C$ for some $C >0$). A few years later in 1992, Gotsman and Linial [3] showed the equivalence of the sensitivity conjecture to another open problem on the hypercube. Finally in 2019, Hao Huang resolved the sensitivity conjecture by leveraging the equivalence shown by Gotsman and Linial.

#### Cue harmonic analysis on finite abelian groups

In lecture post 7 by Prof. Novak, we saw that $\mathrm{B}(n)$ is isomorphic to the group $\mathbb{Z}_2^n$, i.e. bitstrings of length $n$ (recall that $\mathrm{B}(n)$ denotes the subgroup of $\mathrm{S}(2n)$ generated by the $n$ disjoint transpositions $\tau_1=(1\ 2),\ \tau_2=(3\ 4),\ \dots, \tau_n = (2n-1\ 2n)$). Thus, a boolean function $f: \{-1,1\}^n \rightarrow \{-1,1\}$ can be identified with a member of the Hilbert space $\mathbb{C}^{ \mathrm{B}(n) }$, and the rich tools of harmonic analysis on $\mathrm{B}(n)$ discussed in lecture posts 7-9 are applicable to boolean functions. Note that the hypercube $Q^n$ is the subgraph of the Cayley graph of $\mathrm{S}(2n)$ induced by the subgroup $\mathrm{B}(n)$. Thus, the sensitivity conjecture can be rephrased in the language of induced subgraphs of the hypercube $Q^n$, and this is exactly the essence of the work done by Gotsman and Linial in 1992.

Before we dive into the details, a few (post)-preliminary notations and remarks.

• Given an induced subgraph $H$ of $Q^n$, let $Q^n- H$ denote the subgraph of $Q^n$ induced on the vertex set $V(Q^n) \backslash V(H)$.
• For $S \subseteq [n] := \{1,2, \cdots, n\}$, the function $\chi_S : \{-1,1\}^n \rightarrow \{+1,-1\}$ is defined as $\chi_S(x) = \Pi_{i \in S}x_i$, with the convention that for all $x, \: \: \chi_{\phi}(x) = 1$ where $\phi$ denotes the empty set.
• In lecture post 7 (and also in the qual exam) we saw that the functions $\{\chi_S\}_{S \subseteq [n]}$ (which exactly correspond to the characters of $B(n)$ since $\mathbb{Z}_2^n$ is isomorphic to $\mathrm{B}(n)$) form an orthogonal basis for $\mathbb{C}^{ \mathbb{Z}_2^n }$ under the inner product $\langle f,g \rangle = \sum_{x \in \{-1,1\}^n} f(x)g(x)$.
• Thus, given a boolean function $f$, by Fourier expansion, $f$ can be written as $f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\chi_S(x)$, where $\hat{f}$ denotes the Fourier coefficients.
• The average of a boolean function $f$ on $Q^n$ is denoted as $\mathbf{E}(f) : = \frac{1}{2^n}\sum_{x \in Q^n}f(x)$. Note that since $\chi_{\phi}(x) = 1$ for all $x$, we have the identity $\hat{f}(\phi) = \mathbf{E}(f)$ which is quite useful when proving theorem 1 (see below).

Definition. Given a boolean function $f$, the degree of $f$ is defined as $deg(f) := \max _{S \subseteq [n]}\{|S| \: | \: \hat{f}(S) \neq 0\}$.
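The degree can be computed directly from the Fourier expansion above. Here is a small illustration (the helper names are mine); the majority function on 3 bits has expansion $\frac{1}{2}(x_1+x_2+x_3) - \frac{1}{2}x_1x_2x_3$, hence degree 3:

```python
from itertools import product, combinations, chain

def chi(S, x):
    # character chi_S(x) = prod_{i in S} x_i, with chi_{empty set} = 1
    out = 1
    for i in S:
        out *= x[i]
    return out

def fourier_coefficients(f, n):
    # hat f(S) = 2^{-n} * sum_x f(x) chi_S(x), for every S subset of [n]
    cube = list(product((-1, 1), repeat=n))
    subsets = chain.from_iterable(combinations(range(n), k) for k in range(n + 1))
    return {S: sum(f(x) * chi(S, x) for x in cube) / 2 ** n for S in subsets}

def degree(f, n):
    # deg(f) = largest |S| with hat f(S) != 0
    return max(len(S) for S, c in fourier_coefficients(f, n).items()
               if abs(c) > 1e-9)

maj3 = lambda x: 1 if sum(x) > 0 else -1
```

Note also that $\hat{f}(\phi)$ computed this way is exactly the average $\mathbf{E}(f)$ from the bullet above.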

Theorem 1 [Gotsman and Linial]. The following two statements are equivalent for any monotone strictly increasing function $h: \mathbb{N} \rightarrow \mathbb{R}$.
a. For any induced subgraph $H$ of $Q^n$ with $|V(H)| \neq 2^{n-1}$, we have $\max \{ \Delta(H), \Delta(Q^n - H)\} \geq h(n)$.
b. For any boolean function $f$, we have that $s(f) \geq h(deg(f))$.

Gotsman and Linial show that proving the equivalence of statements a and b above is equivalent to proving the equivalence of yet another two statements and then proceed to prove this second equivalence.

Proof of Theorem 1. First, we show the equivalence of statement a to the following statement a’:

a’. For any boolean function $g$, $\mathbf{E}(g) \neq 0$ implies there exists an input $x$ such that $n - s(g,x) \geq h(n)$.

a’ implies a : Given an induced subgraph $H$ of $Q^n$, associate to it the boolean function $g$ defined by $g(x) = 1$ iff $x \in V(H)$. By a direct consequence of the definitions of adjacency in $Q^n$ and the local sensitivity of $g$, $\text{deg}_H(x) = n - s(g,x)$ for $x \in V(H)$ and $\text{deg}_{Q^n -H}(x) = n - s(g,x)$ for $x \in V(Q^n) \backslash V(H)$ (note that here $\text{deg}_H(x)$ denotes the degree of the vertex $x$ in the subgraph $H$ and is not to be confused with the degree of a boolean function that was defined earlier). Thus, $\max \{ \Delta(H), \Delta(Q^n - H)\} = \max_x \{n - s(g,x)\}$. Let $\mathbf{E}(g)$ denote the average value of $g$ on $Q^n$, i.e. $\mathbf{E}(g) = \frac{1}{2^n}\left[ \sum_{x \in V(H)}g(x) + \sum_{x \in V(Q^n)\backslash V(H)}g(x) \right]$. Note that $|V(H)| \neq 2^{n-1}$ implies $\mathbf{E}(g) \neq 0$. Thus, statement a’ implies $\max \{ \Delta(H), \Delta(Q^n - H)\} = \max_x \{n - s(g,x)\} \geq h(n)$.

a implies a’ : Given a boolean function $g$ with $\mathbf{E}(g) \neq 0$, define $H$ to be the subgraph induced by the set of vertices $\{x \in Q^n \: | \: g(x) = 1\}$. By an identical argument as in the previous paragraph, $\max \{ \Delta(H), \Delta(Q^n - H)\} = \max_x \{n - s(g,x)\}$. Also, $\mathbf{E}(g) \neq 0$ implies $|V(H)| \neq 2^{n-1}$. Thus, statement a implies $\max \{ \Delta(H), \Delta(Q^n - H)\} = \max_x \{n - s(g,x)\} \geq h(n)$, thereby showing the equivalence of statements a and a’.

Now, we show the equivalence of statement b to statement b’:

b’. For any boolean function $f$, $s(f) < h(n)$ implies $deg(f) < n$.

Assuming b and that the premise of b’ holds, we have $h(deg(f)) \leq s(f) < h(n)$, and since $h$ is assumed to be strictly increasing this shows that $deg(f) < n$; thus b implies b’. Now, towards showing the other implication, let $f$ denote a boolean function with degree $d \leq n$. Assume without loss of generality that for $S =\{1, \cdots, d\}, \: \: \hat{f}(S) \neq 0$. Now, consider the function $g : \{-1,1\}^d \rightarrow \{-1,1\}$ defined as $g(x_1, x_2, \cdots x_d) := f(x_1, x_2, \cdots, x_d, -1, -1, \cdots -1)$. By construction $s(f) \geq s(g)$ and $deg(g) = d$, i.e. $g$ has full degree as a boolean function on $Q^d$. Now, the contrapositive of b’ says that $deg(g) = d$ implies $s(g) \geq h(d)$. Thus, $s(f) \geq s(g) \geq h(d) = h(deg(f))$ and this finishes showing the equivalence of b and b’.

Thus, proving theorem 1 has been reduced to the task of proving the equivalence of statements a’ and b’. Now, given a boolean function $f$, define another boolean function $g(x) = f(x)\chi_{[n]}(x)$ where $\chi_{[n]}(x) = \Pi_{i=1}^nx_i$ is the parity function of $x$. Note that $\hat{g}(S) = \frac{1}{2^n}\sum_{x \in Q^n}f(x) \chi_{[n]}(x) \chi_S(x)$. Thus,

$\hat{g}(S) = \frac{1}{2^n}\sum_{x \in Q^n}f(x) (\Pi_{i=1}^nx_i) (\Pi_{i \in S}x_i) = \frac{1}{2^n}\sum_{x \in Q^n}f(x) \Pi_{i \notin S}x_i = \hat{f}([n] \backslash S)$

Thus, for all $S \subseteq [n]$, $\hat{g}(S) = \hat{f}([n]\backslash S)$. Also note that for all $x \in Q^n$, $s(g,x) = n- s(f,x)$ because every bit flip which causes a change in value of $f(x)$ does not cause a change in function value of $g(x)$ and vice-versa. Thus, $\mathbf{E}(g) = \hat{g}(\phi) = \hat{f}([n])$ where $\phi$ denotes the empty set.
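Both identities in this paragraph are easy to confirm numerically. The following sketch (helper names mine) checks them for a random boolean function on $Q^3$, with $g = f\cdot\chi_{[n]}$ as above:

```python
import random
from itertools import product, combinations

n = 3
random.seed(0)
cube = list(product((-1, 1), repeat=n))
table = {x: random.choice((-1, 1)) for x in cube}  # a random boolean function

def f(x):
    return table[x]

def g(x):
    # g = f * chi_[n], where chi_[n](x) = x_1 x_2 ... x_n is the parity function
    p = 1
    for v in x:
        p *= v
    return f(x) * p

def coeff(h, S):
    # hat h(S) = 2^{-n} * sum_x h(x) prod_{i in S} x_i
    total = 0
    for x in cube:
        t = h(x)
        for i in S:
            t *= x[i]
        total += t
    return total / 2 ** n

def local_sens(h, x):
    return sum(h(tuple(-v if j == i else v for j, v in enumerate(x))) != h(x)
               for i in range(n))

for k in range(n + 1):
    for S in combinations(range(n), k):
        comp = tuple(i for i in range(n) if i not in S)
        assert abs(coeff(g, S) - coeff(f, comp)) < 1e-9  # hat g(S) = hat f([n]\S)
for x in cube:
    assert local_sens(g, x) == n - local_sens(f, x)      # s(g,x) = n - s(f,x)
```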

a’ implies b’: Towards a contradiction, assume that $s(f) < h(n)$ and $deg(f) = n$. This by definition implies $\hat{f}([n]) \neq 0$ and thus $\mathbf{E}(g) \neq 0$. By a’, there exists an $x$ such that $s(g,x) \leq n - h(n)$, and since $s(g,x) = n- s(f,x)$ this implies there exists an $x$ such that $s(f,x) \geq h(n)$, which contradicts the premise that $s(f) < h(n)$.

b’ implies a’: Again towards a contradiction, assume that there is a $g$ with $\mathbf{E}(g) \neq 0$ such that for all $x, \: s(g,x) > n- h(n)$. Since $s(g,x) = n- s(f,x)$ this implies that $s(f) < h(n)$, which by b’ implies $deg(f) < n$. Note that $deg(f) < n$ is equivalent to $\hat{f}([n]) = 0$, and thus $\hat{f}([n]) = \hat{g}(\phi) = \mathbf{E}(g) = 0$, which contradicts the premise that $\mathbf{E}(g) \neq 0$. This completes the proof of theorem 1.

Theorem 2 [Huang]. For every integer $n \geq 1$, if $H$ is an arbitrary $(2^{n-1}+1)$-vertex induced subgraph of $Q^n$, then

$\Delta(H) \geq \sqrt{n}$

The proof of theorem 2 will be the main focus for the remainder of this blog post, but before that we discuss two corollaries which show why theorem 2 resolves the sensitivity conjecture.

Corollary 1. For every boolean function $f$, $s(f) \geq \sqrt{deg(f)}$.

Proof of Corollary 1. For any induced subgraph $H$ of $Q^n$ with $|V(H)| \neq 2^{n-1}$, since the total number of vertices in $Q^n$ is $2^n$, either $H$ contains at least $2^{n-1}+1$ vertices or $Q^n - H$ does. So without loss of generality, assume $H$ contains at least $2^{n-1}+1$ vertices (if not, just replace $H$ with $Q^n -H$ in the remainder of the proof). Now, theorem 2 (applied to a $(2^{n-1}+1)$-vertex induced subgraph of $H$, whose maximum degree is at most $\Delta(H)$) implies $\max \{ \Delta(H), \Delta(Q^n - H)\} \geq \sqrt{n}$. Let $h(n) := \sqrt{n}$ and note that it is a monotone strictly increasing function on $\mathbb{N}$. Now, theorem 1 implies $s(f) \geq \sqrt{deg(f)}$ and this completes the proof.

Theorem 3 [Tal]. For every boolean function $f$, $bs(f) \leq deg(f)^2$.

It should be noted that theorem 3 is an improvement on the work by Nisan and Szegedy in 1992, wherein they showed that $bs(f) \leq 2\,deg(f)^2$.

Corollary 2. For every boolean function $f$, $bs(f) \leq s(f)^4$.

Note that corollary 2 follows immediately from theorem 3 and corollary 1. Thus, corollary 2 confirms the sensitivity conjecture by showing that the block sensitivity and the sensitivity of a boolean function are indeed polynomially related.

#### Cue matrix analysis – “A proof that came straight from THE Book”

In this section, we will focus on Hao Huang’s proof of theorem 2. The core of the proof involves a creative construction of an iterative sequence of matrices whose $n$-th instance is closely related to the adjacency matrix of $Q^n$, and an application of Cauchy’s interlace theorem, which is stated as follows.

(Cauchy’s interlace theorem) Let $A$ be a symmetric $n \times n$ matrix and $B$ be an $m \times m$ principal submatrix of $A$ for some $m < n$. If the eigenvalues of $A$ are $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$, and the eigenvalues of $B$ are $\mu_1 \geq \mu_2 \geq \cdots \geq \mu_m$, then for all $1 \leq i \leq m$,

$\lambda_i \geq \mu_i \geq \lambda_{i+n-m}$

Recall that we saw in Math 202A that Cauchy’s interlace theorem can be proven via a direct application of the Courant-Fischer theorem.

Now, we move on to the iterative matrix sequence that turns out to be a bite-sized kryptonite for the sensitivity conjecture. Consider the sequence of symmetric square matrices defined iteratively as follows,

$B_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad B_n :=\begin{pmatrix} B_{n-1} & I \\ I & -B_{n-1} \end{pmatrix}$

The first exercise is to study the spectrum of $B_n$ and the second exercise is to compute an upper bound for the largest eigenvalue $\lambda_1$ of a specific type of matrix. These two exercises then set us up nicely for a clever application of Cauchy’s interlace theorem to prove theorem 2.

Exercise 1.
a) Show that $B_n^2 = nI$ where $I$ denotes the $2^n \times 2^n$ identity matrix.
b) Show that the trace of $B_n$ is $0$.
c) Conclude that the eigenvalues of $B_n$ are $\sqrt{n}$ with multiplicity $2^{n-1}$ and $-\sqrt{n}$ with multiplicity $2^{n-1}$.
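A quick numerical sketch of Exercise 1 (helper names are mine): build $B_n$ from the recursion and check parts a) and b) directly for a small $n$. Part c) then follows, since $B_n^2 = nI$ forces every eigenvalue to be $\pm\sqrt{n}$, and the zero trace forces the two multiplicities to be equal.

```python
def B(n):
    # B_1 = [[0,1],[1,0]];  B_n = [[B_{n-1}, I], [I, -B_{n-1}]]
    if n == 1:
        return [[0, 1], [1, 0]]
    P = B(n - 1)
    m = len(P)
    top = [row + [1 if i == j else 0 for j in range(m)] for i, row in enumerate(P)]
    bot = [[1 if i == j else 0 for j in range(m)] + [-v for v in row]
           for i, row in enumerate(P)]
    return top + bot

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

n = 4
Bn = B(n)
sq = matmul(Bn, Bn)
assert all(sq[i][j] == (n if i == j else 0)
           for i in range(2 ** n) for j in range(2 ** n))   # B_n^2 = nI
assert sum(Bn[i][i] for i in range(2 ** n)) == 0            # tr(B_n) = 0
```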

Exercise 2. Suppose $H$ is an $m$-vertex undirected graph and $B$ is a symmetric matrix whose entries are in $\{-1,0,1\}$, whose rows and columns are indexed by the vertex set of $H$, and such that $B_{u,v} = 0$ whenever $u$ and $v$ are non-adjacent vertices in $H$. Then show that

$\Delta(H) \geq \lambda_1 := \lambda_1(B)$

The definition of this iterative sequence of matrices $\{B_j\}_1^n$ is not without precedent. Note that the adjacency matrix of $Q^n$ can also be defined in a similarly iterative manner. Recall that a vertex of $Q^n$ is a string in $\{-1,1\}^n$. Given an $(n-1)$-length string of $-1$s and $1$s, it naturally extends to exactly two strings in $\{-1,1\}^n$, namely the only two vertices of $Q^n$ whose first $n-1$ entries agree with the original string (note that these two vertices of $Q^n$ are adjacent). Expanding on this observation, the hypercube $Q^n$ can be thought of as two $Q^{n-1}$ sub-hypercubes with a perfect matching connecting them. In other words, consider the following sequence of iteratively defined matrices,

$A_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad A_n :=\begin{pmatrix} A_{n-1} & I \\ I & A_{n-1} \end{pmatrix}$

Then $A_n$ represents the adjacency matrix of $Q^n$ and in particular, note that $A_n$ is equal to the matrix obtained by changing every $(-1)$ entry of $B_n$ to $1$.
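In other words, $B_n$ is a "signing" of the adjacency matrix of $Q^n$. This can be checked entrywise: under the natural ordering of the vertices (first coordinate splitting the two subcubes), the recursion for $A_n$ matches Hamming-distance adjacency. A sketch, using $\{0,1\}$-strings, which give the same graph:

```python
from itertools import product

def B(n):
    # B_1 = [[0,1],[1,0]];  B_n = [[B_{n-1}, I], [I, -B_{n-1}]]
    if n == 1:
        return [[0, 1], [1, 0]]
    P = B(n - 1)
    m = len(P)
    top = [row + [1 if i == j else 0 for j in range(m)] for i, row in enumerate(P)]
    bot = [[1 if i == j else 0 for j in range(m)] + [-v for v in row]
           for i, row in enumerate(P)]
    return top + bot

def hypercube_adjacency(n):
    # vertices ordered so the first coordinate splits Q^n into its two Q^{n-1} subcubes
    verts = list(product((0, 1), repeat=n))
    return [[1 if sum(a != b for a, b in zip(u, v)) == 1 else 0 for v in verts]
            for u in verts]

n = 3
Bn, An = B(n), hypercube_adjacency(n)
# |B_n| entrywise equals the adjacency matrix A_n of Q^n
assert all(abs(Bn[i][j]) == An[i][j] for i in range(2 ** n) for j in range(2 ** n))
```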

Now that we have all the necessary ingredients, let’s discuss the proof of Theorem 2.

(Restated) Theorem 2 [Huang] . For every integer $n \geq 1$, let $H$ be an arbitrary $(2^{n-1}+1)$-vertex induced subgraph of $Q^n$, then

$\Delta(H) \geq \sqrt{n}$

Proof of Theorem 2 [Huang].

Let $B_n$ be the sequence of matrices that we defined above and let $H$ be a $(2^{n-1} + 1)$-vertex induced subgraph of $Q^n$. Note that $H$ naturally induces a principal submatrix $B_H$ of $B_n$ (written with respect to an appropriate permutation of the standard basis). As discussed in the preceding paragraph, by changing every $(-1)$ entry of $B_n$ to $1$, we get the adjacency matrix of $Q^n$. So, in particular, $B_H$ and $H$ satisfy the conditions of the statement in Exercise 2 and thus,

$\Delta(H) \geq \lambda_1(B_H)$

Also, since $B_H$ is a $(2^{n-1} + 1) \times (2^{n-1} + 1)$ principal submatrix of the $2^n \times 2^n$ matrix $B_n$, by Cauchy’s interlace theorem,

$\lambda_1(B_H) \geq \lambda_{2^{n-1}}(B_n)$

From Exercise 1, we know that the eigenvalues of $B_n$ are $\sqrt{n}$ with multiplicity $2^{n-1}$ and $-\sqrt{n}$ with multiplicity $2^{n-1}$. Therefore $\lambda_{2^{n-1}}(B_n) = \sqrt{n}$, and hence

$\Delta(H) \geq \lambda_1(B_H) \geq \lambda_{2^{n-1}}(B_n) = \sqrt{n}$

and this completes the proof.
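The bound is also easy to probe empirically. The sketch below (helper names mine) samples random $(2^{n-1}+1)$-vertex induced subgraphs of $Q^4$ and confirms $\Delta(H) \geq \sqrt{n}$ in every sample; a sanity check, of course, not a proof:

```python
import math
import random
from itertools import product

def hypercube_adjacency(n):
    verts = list(product((0, 1), repeat=n))
    return [[1 if sum(a != b for a, b in zip(u, v)) == 1 else 0 for v in verts]
            for u in verts]

n = 4
A = hypercube_adjacency(n)
random.seed(1)
for _ in range(500):
    # a random induced subgraph on 2^{n-1}+1 vertices
    V = random.sample(range(2 ** n), 2 ** (n - 1) + 1)
    max_deg = max(sum(A[u][v] for v in V) for u in V)   # Delta(H)
    assert max_deg >= math.sqrt(n)
```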

#### Closing Remarks

I will end this blog post for our Math 202C class with a quote from Scott Aaronson’s blog post about Hao Huang’s proof of the sensitivity conjecture.

Paul Erdös famously spoke of a book, maintained by God, in which was written the simplest, most beautiful proof of each theorem. The highest compliment Erdös could give a proof was that it “came straight from the book.” In this case, I find it hard to imagine that even God knows how to prove the Sensitivity Conjecture in any simpler way than this.

-Scott Aaronson

The funny thing is, a few days after Prof. Aaronson’s blog post, his former student Shalev Ben-David posted in the comments section a simplification of Huang’s proof which does not use Cauchy’s interlace theorem.

Anyway, talking about the sensitivity conjecture is indeed quite a fitting end to the Math 202 series, since both Gotsman & Linial’s work and Hao Huang’s work on this 30-year-old ex-open problem utilize almost exclusively concepts which we have studied over the past three quarters.

Thanks for reading my post and have a great summer everyone!

##### References
• [1] H. Huang, Induced subgraphs of hypercubes and a proof of the sensitivity conjecture, Annals of Mathematics, 190 (3) (2019), 949–955.
• [2] N. Nisan, M. Szegedy, On the degree of Boolean functions as real polynomials, Comput. Complexity, 4 (1992), 462–467.
• [3] C. Gotsman, N. Linial, The equivalence of two problems on the cube, J. Combin. Theory Ser. A, 61 (1) (1992), 142–146.
• [4] F. Chung, Z. Füredi, R. Graham, P. Seymour, On induced subgraphs of the cube, J. Combin. Theory Ser. A, 49 (1) (1988), 180–187.
• [5] A. Tal, Properties and applications of boolean function composition, Proceedings of the 4th conference on Innovations in Theoretical Computer Science, ITCS ’13, 441–454.
• [6] R. Karthikeyan, S. Sinha, V. Patil, On the resolution of the sensitivity conjecture, Bulletin of the American Mathematical Society, 57 (2020), 10.1090/bull/1697.
• [7] https://www.scottaaronson.com/blog/?p=4229
• [8] Lecture notes for Math 202A by Prof. Philip Gill.
• [9] D. Kroes, J. Naranjo, J. Nie, J. O’Neill, N. Seiger, S. Spiro, E. Zhu, Linear Algebra methods in Combinatorics, ABACUS notes Fall 2019 UCSD.

## Math 202C: Lecture 20 – Uncertainty Principle

Author: Zhaolong Han

In this lecture, we will introduce the Fourier transform on finite Abelian groups and a discrete version of the “uncertainty principle”, based on Chapter 2 of “Discrete Harmonic Analysis” by Ceccherini-Silberstein et al. [1]. In quantum mechanics, there is a well-known uncertainty principle saying that the more precisely the position of some particle is determined, the less precisely its momentum can be predicted from initial conditions, and vice versa.
For finite Abelian groups, there is a theorem that parallels this uncertainty principle, so people refer to this group-theoretic version as an “uncertainty principle” too. However, our focus today will be Tao’s uncertainty principle for cyclic groups of prime order [2], which is a refinement of the discrete uncertainty principle exploiting the special structure of the group.

First, let us introduce some notations and redefine the Discrete Fourier transform as Andrew did in Lecture 18. In fact, these materials already appeared in Lecture 8, where we studied the concrete example of the rank $k$ hypercube group $B(k)$.

Let $X$ be a finite set and denote by $L(X)=\{f:X\to \mathbb{C}\}$ the vector space of all complex-valued functions defined on $X$. Then $\mathrm{dim}\ L(X)=|X|$, where $|\cdot|$ denotes cardinality.

For $x\in X$ we denote by $\delta_x$ the Dirac function centered at $x$, that is, the element $\delta_x\in L(X)$ defined by $\delta_x(y)=1$ if $y=x$ and $\delta_x(y)=0$ if $y\neq x$ for all $y\in X$.

The set $\{\delta_x:x\in X\}$ is a natural basis for $L(X)$ and if $f\in L(X)$ then $f=\sum_{x\in X}f(x)\delta_x$.

The space $L(X)$ is endowed with the scalar product defined by setting

$\langle f_1,f_2\rangle=\sum_{x\in X}f_1(x)\overline{f_2(x)}$

for $f_1,f_2\in L(X)$, and we denote by $||f||=\sqrt{\langle f,f\rangle}$ the norm of $f\in L(X)$. Note that $\{\delta_x:x\in X\}$ is an orthonormal basis of $L(X)$ with respect to $\langle\cdot,\cdot\rangle$.

Let $A$ be a finite Abelian group. A character of $A$ is a map $\chi:A\to\mathbb{T}$ such that

$\chi(x+y)=\chi(x)\chi(y)$

for all $x,y\in A$. The set $\widehat{A}$ of all characters of $A$ is an Abelian group with respect to the product $(\chi,\psi)\mapsto \chi\cdot\psi$ defined by $(\chi\cdot\psi)(x)=\chi(x)\psi(x)$ for all $x\in A$. $\widehat{A}$ is called the dual of $A$.

Let $A=\mathbb{Z}_n$, the cyclic group of order $n$, where $n\ge 2$, $n\in \mathbb{N}$, and let $\omega:=e^{\frac{2\pi i}{n}}$. For $x\in \mathbb{Z}_n$, define $\chi_x\in L(\mathbb{Z}_n)$ to be the function $z\mapsto \omega^{zx}$.

Exercise 1: Check that for $x\in \mathbb{Z}_n$, $\chi_x$ is indeed a character of $\mathbb{Z}_n$, and $\widehat{\mathbb{Z}_n}=\{\chi_x:x\in \mathbb{Z}_n\}$ is isomorphic to $\mathbb{Z}_n$.

Now we define the Discrete Fourier transform. Let $A$ be a finite Abelian group. The Fourier transform of a function $f\in L(A)$ is the function $\hat{f}\in L(\widehat{A})$ defined by

$\widehat{f}(\chi)=\sum_{y\in A}f(y)\overline{\chi(y)}$

for all $\chi\in \widehat{A}$. Then $\widehat{f}(\chi)$ is called the Fourier coefficient of $f$ with respect to $\chi$. When $A=\mathbb{Z}_n$ and $f\in L(\mathbb{Z}_n)$, we call $\frac{1}{n}\hat{f}$ the Discrete Fourier transform (DFT) of $f$. Specifically, for $x\in\mathbb{Z}_n$ and $\omega=e^{\frac{2\pi i}{n}}$, the DFT of $f$ is

$\hat{f}(x)=\frac{1}{n}\sum_{y\in\mathbb{Z}_n}f(y)\omega^{-xy}.$
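As a quick sanity check of this definition, here is a direct implementation of the DFT on $\mathbb{Z}_n$ with the $\frac{1}{n}$ normalization used above, together with the standard inversion formula $f(y)=\sum_x \hat{f}(x)\omega^{xy}$ (the function names are mine):

```python
import cmath

def dft(f):
    # DFT on Z_n, normalized as above: hat f(x) = (1/n) * sum_y f(y) w^{-xy}
    n = len(f)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(f[y] * w ** (-x * y) for y in range(n)) / n for x in range(n)]

def idft(fh):
    # Fourier inversion: f(y) = sum_x hat f(x) w^{xy}
    n = len(fh)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(fh[x] * w ** (x * y) for x in range(n)) for y in range(n)]

f = [1, 2, 3, 4, 5]
assert max(abs(a - b) for a, b in zip(f, idft(dft(f)))) < 1e-9
```

For example, the Dirac function $\delta_0$ transforms to the constant function $\frac{1}{n}$, which is the extreme case of the uncertainty principle below.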

The uncertainty principle is the following (see Theorem 2.5.4 of [1]):
Theorem (Uncertainty principle): Let $f\in L(A)$ and suppose that $f\neq 0$. Then

$|\mathrm{supp}(f)|\cdot|\mathrm{supp}(\hat{f})|\ge |A|.$

Now we come to the main part of this lecture: Tao’s uncertainty principle for cyclic group of prime order. This improves the inequality in the above discrete uncertainty principle when the finite Abelian group $A$ is cyclic with prime order $p$.

The main theorem in [2] is the following:
Theorem 2 (Tao’s uncertainty principle): Let $p$ be a prime number and $f\in L(\mathbb{Z}_p)$ non-zero. Then

$|\mathrm{supp}(f)|+|\mathrm{supp}(\hat{f})|\ge p+1.$

Conversely, if $\emptyset\neq A,A'\subset \mathbb{Z}_p$ are two subsets such that $|A|+|A'|=p+1$, then there exists $f\in L(\mathbb{Z}_p)$ such that $\mathrm{supp}(f)=A$ and $\mathrm{supp}(\hat{f})=A'$.
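Before turning to the proof, here is a quick numerical spot check of the inequality for random integer-valued functions on $\mathbb{Z}_7$ (the DFT and support-counting helpers below are mine):

```python
import cmath
import random

p = 7

def dft(f):
    # hat f(x) = (1/p) * sum_y f(y) w^{-xy} on Z_p
    w = cmath.exp(2j * cmath.pi / p)
    return [sum(f[y] * w ** (-x * y) for y in range(p)) / p for x in range(p)]

def supp_size(v, tol=1e-8):
    return sum(1 for c in v if abs(c) > tol)

random.seed(2)
for _ in range(200):
    f = [random.choice((0, 0, 1, -1, 2)) for _ in range(p)]
    if any(f):  # the theorem requires f non-zero
        assert supp_size(f) + supp_size(dft(f)) >= p + 1   # Tao's bound
```

The Dirac function $\delta_0$ attains the bound with equality: its support has size $1$ and its DFT is supported everywhere, giving $1 + p = p+1$.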

To prove the theorem, we present some specific tools developed in Tao’s paper [2].

Recall that $\mathbb{Z}[x]$ denotes the ring of polynomials with integer coefficients.

Proposition 1:
Let $P(x_1,x_2,\cdots,x_n)$ be a polynomial in the variables $x_1,x_2,\cdots,x_n$ with integer coefficients. Suppose that for some $i\neq j$,

$P(x_1,x_2,\cdots,x_n)|_{x_i=x_j}\equiv 0.$

Then there exists a polynomial $Q(x_1,x_2,\cdots,x_n)$ with integer coefficients such that

$P(x_1,x_2,\cdots,x_n)=(x_i-x_j)Q(x_1,x_2,\cdots,x_n)$.

Remark: The proof of this proposition can be omitted for the first reading. Instead, considering $P(x_1,x_2)=x_1^2+x_1x_2-2x_2^2$ and trying to find corresponding $Q(x_1,x_2)$ by using this proposition would be helpful for understanding.
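For the example in the remark, one finds $Q(x_1,x_2) = x_1 + 2x_2$, since $x_1^2 + x_1x_2 - 2x_2^2 = (x_1 - x_2)(x_1 + 2x_2)$. A quick numerical check of this factorization:

```python
import random

def P(x1, x2):
    return x1 ** 2 + x1 * x2 - 2 * x2 ** 2

def Q(x1, x2):
    return x1 + 2 * x2   # the quotient promised by Proposition 1

assert P(1, 1) == 0      # P vanishes identically on x_1 = x_2, as required
random.seed(3)
for _ in range(100):
    a, b = random.randint(-50, 50), random.randint(-50, 50)
    assert P(a, b) == (a - b) * Q(a, b)
```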

Proof:
For the sake of simplicity, suppose that $i = 1$ and $j = 2$ so that $P(x_1,x_1,x_3,\cdots,x_n) \equiv 0$. Let us denote by $P_1(x_1,x_2,\cdots,x_n)$ (respectively $P_2(x_1,x_2,\cdots,x_n)$) the sum of the monomials of $P(x_1,x_2,\cdots,x_n)$ with positive (respectively negative) coefficients so that

$P(x_1,x_2,\cdots,x_n) = P_1(x_1,x_2,\cdots,x_n) + P_2(x_1,x_2,\cdots,x_n).$

Note that

$P_1(x_1,x_1,x_3,\cdots,x_n) =- P_2(x_1,x_1,x_3,\cdots,x_n),$

since $P(x_1 , x_1 ,x_3, \cdots , x_n ) \equiv 0$. This implies that there exists a bijection between the monomials in $P_1(x_1,x_1,x_3,\cdots,x_n)$ and those in $P_2(x_1,x_1,x_3,\cdots,x_n)$. More precisely, let us fix $m > 0$ and $k, k_3 , \cdots, k_n \ge 0$; then the monomial $mx_1^k x_3^{k_3}\cdots x_n^{k_n}$ appears in $P_1(x_1,x_1,x_3,\cdots,x_n)$ if and only if $-mx_1^k x_3^{k_3}\cdots x_n^{k_n}$ appears in $P_2(x_1,x_1,x_3,\cdots,x_n)$. Suppose this is the case. Then we can find non-negative integers $m_0 , m_1 , \cdots , m_k$ and $n_0 , n_1 , \cdots, n_k$ such that the sum of the monomials of $P(x_1,x_2,x_3,\cdots,x_n)$ whose variables $x_i$ have degree $k_i$ for $i = 1, 2,\cdots , n$ with $k_1 + k_2 = k$ is

$\sum_{l=0}^k m_lx_1^{k-l}x_2^lx_3^{k_3}\cdots x_n^{k_n}-\sum_{l=0}^k n_lx_1^{k-l}x_2^lx_3^{k_3}\cdots x_n^{k_n}$

and

$m_0+m_1+\cdots+m_k=n_0+n_1+\cdots+n_k=m$

but also such that

$m_l\neq 0\implies n_l=0 \qquad \text{and} \qquad n_l\neq 0\implies m_l=0$

(because otherwise there would be a cancellation). Thus, with every monomial $x_1^{k-l}x_2^lx_3^{k_3}\cdots x_n^{k_n}$ such that $m_l\neq 0$ we can (arbitrarily but bijectively) associate a monomial $x_1^{k-h}x_2^hx_3^{k_3}\cdots x_n^{k_n}$ with $n_h\neq 0$ and $h\neq l$. Now for $h>l$ we have the identity

$x_1^{k-l}x_2^l-x_1^{k-h}x_2^h=x_2^lx_1^{k-h}(x_1^{h-l}-x_2^{h-l})$

$=x_2^lx_1^{k-h}(x_1-x_2)(x_1^{h-l-1}+x_1^{h-l-2}x_2+\cdots+x_2^{h-l-1}).$

Exchanging $h$ with $l$ we get the analogous identity for $h<l$. This shows that

$\sum_{l=0}^k m_lx_1^{k-l}x_2^lx_3^{k_3}\cdots x_n^{k_n}-\sum_{l=0}^k n_lx_1^{k-l}x_2^lx_3^{k_3}\cdots x_n^{k_n}$

is divisible by $x_1-x_2$.

Repeating the argument for each monomial $mx_1^kx_3^{k_3}\cdots x_n^{k_n}$ with $m>0$ and $k,k_3,\cdots,k_n\ge 0$ appearing in $P_1(x_1,\cdots,x_n)$, we deduce that $P(x_1,\cdots,x_n)$ is divisible by $x_1-x_2$.

Lemma 1: Let $p$ be a prime, $n$ a positive integer, and $P(x_1,x_2,\cdots,x_n)$ a polynomial with integer coefficients. Suppose that $\omega_1,\cdots,\omega_n$ are (not necessarily distinct) $p$th roots of unity such that $P(\omega_1,\cdots,\omega_n)=0$. Then $P(1,\cdots,1)$ is divisible by $p$.
Proof:
Letting $\omega=e^{\frac{2\pi i}{p}}$ we can find integers $0\le k_j\le p-1$ such that $\omega_j=\omega^{k_j}$ for $j=1,2,\cdots,n$.

Define the polynomials $q(x),r(x)\in\mathbb{Z}[x]$ by setting

$P(x^{k_1},\cdots,x^{k_n})=(x^p-1)q(x)+r(x)$

where $\mathrm{deg}\, r < p$. Then $r(\omega)=0$, and since $\mathrm{deg}\, r < p$ we deduce that $r(x)$ is a multiple of the minimal polynomial of $\omega$, that is, $r(x)=m(1+x+\cdots+x^{p-1})$ for some $m\in\mathbb{Z}$, where we used the fact from undergraduate algebra that the minimal polynomial of $\omega$ is $1+x+\cdots+x^{p-1}$. It follows that $P(1,\cdots,1)=r(1)=mp$.

Theorem 3 (Chebotarev):
Let $p$ be a prime and $1\le n\le p$. Let $\eta_1,\cdots,\eta_n$ (respectively $\xi_1,\cdots,\xi_n$) be distinct elements of $\{0,1,\cdots,p-1\}$. Then the matrix

$A=\left(e^{\frac{2\pi i\eta_h\xi_k}{p}}\right)_{h,k=1}^n$

is nonsingular.
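Chebotarev's theorem is concrete enough to verify exhaustively for a small prime. The sketch below checks, for $p=5$, that every square submatrix of $\left(\omega^{\eta\xi}\right)$ with distinct row indices and distinct column indices is nonsingular (the `det` helper is a naive Leibniz expansion, mine, fine for these tiny matrices):

```python
import cmath
from itertools import combinations, permutations

def det(M):
    # Leibniz determinant: sum over permutations of sign * product of entries
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = sign
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total

p = 5
w = cmath.exp(2j * cmath.pi / p)
for n in range(1, p + 1):
    for etas in combinations(range(p), n):
        for xis in combinations(range(p), n):
            M = [[w ** (e * x) for x in xis] for e in etas]
            assert abs(det(M)) > 1e-6   # nonsingular, as the theorem predicts
```

Note that this fails for non-prime moduli: for instance with $p=4$, rows $\{0,2\}$ and columns $\{0,2\}$ give a singular $2\times 2$ submatrix, so the primality hypothesis is essential.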

Proof: Set $\omega_h=e^{\frac{2\pi i\eta_h}{p}}$ for $h=1,2,\cdots,n$. Note that the $\omega_h$s are distinct $p$th roots of unity and $A=\left(\omega_h^{\xi_k}\right)_{h,k=1}^n$. Define the polynomial $D(x_1,\cdots,x_n)$ with integer coefficients by setting

$D(x_1,\cdots,x_n)=\mathrm{det}\left(x_h^{\xi_k}\right)_{h,k=1}^n.$

As the determinant is an alternating form, we have $D(x_1,\cdots,x_n)|_{x_h=x_k}\equiv 0$ whenever $1\le h\neq k\le n$, so that, by recursively applying Proposition 1, we can find a polynomial $Q(x_1,\cdots,x_n)$ with integer coefficients such that

$D(x_1,\cdots,x_n)=Q(x_1,\cdots,x_n)\prod_{1\le h<k\le n}(x_k-x_h).$

To prove the theorem, it is equivalent to show that $Q(\omega_1,\cdots,\omega_n)\neq 0$ (because $\omega_h$s are all distinct) so that, by Lemma 1 it suffices to show that $p$ does not divide $Q(1,\cdots,1)$. For this we need the next three lemmas to complete this proof. Let us first introduce some notations.

Given an $n$-tuple $\mathbf{k}=(k_1,\cdots,k_n)$ of non-negative integers, we say that the (monomial) differential operator

$L=\mathbf{L_k}=\left(x_1\frac{\partial}{\partial x_1}\right)^{k_1}\cdots\left(x_n\frac{\partial}{\partial x_n}\right)^{k_n}$

is of type $\mathbf{k}$ and order $o(\mathbf{k})=k_1+\cdots+k_n$.

Lemma 2:
Let $L$ be a differential operator of type $\mathbf{k}$ and $F(x_1,\cdots,x_n)$ and $G(x_1,\cdots,x_n)$ two polynomials. Then

$L(FG)=\sum_{(\mathbf{i},\mathbf{j})}L_{\mathbf{i}}(F)\cdot L_{\mathbf{j}}(G)$

where the sum runs over all pairs $(\mathbf{i},\mathbf{j})$ such that (componentwise) $\mathbf{i}+\mathbf{j}=\mathbf{k}$ (and therefore $o(\mathbf{i})+o(\mathbf{j})=o(\mathbf{k})$).
Proof:
Left as as exercise.

Exercise 2: Prove Lemma 2. (Hint: use induction on the order of $L$.)

Lemma 3:
For $1\le j\le n$ and $1\le h\le j-1$ we have

$\left(x_j\frac{\partial}{\partial x_j}\right)^{h}(x_j-x_1)(x_j-x_2)\cdots (x_j-x_{j-1})=\sum_{t=1}^ha_{h,t}x_j^t\sum_{\mathbf{i}_t}\prod_{1\le i\le j-1,i\neq i_1,\cdots,i_t}(x_j-x_i)$

where $\sum_{\mathbf{i}_t}$ runs over all $\mathbf{i}_t=(i_1,\cdots,i_t)$ with $1\le i_1<\cdots<i_t\le j-1$, and the $a_{h,t}=a_{h,t}(j)$s are non-negative integers such that $a_{h,h}=h!$. In particular,

$\left(x_j\frac{\partial}{\partial x_j}\right)^{j-1}(x_j-x_1)(x_j-x_2)\cdots (x_j-x_{j-1})=(j-1)!x_j^{j-1}+\text{terms containing at least one factor } (x_j-x_i)$

with $1\le i\le j-1$.
Proof:
Proof by induction on $h=1,\cdots,j-1$. For the details see Lemma 2.6.15 of section 2.6 of [1].

Lemma 4:
Let $L=L_{(0,1,\cdots,n-1)}$, that is,

$L=\left(x_1\frac{\partial}{\partial x_1}\right)^{0}\cdots\left(x_n\frac{\partial}{\partial x_n}\right)^{n-1}.$

Then if $D(x_1,\cdots,x_n)$ and $Q(x_1,\cdots,x_n)$ are given by

$D(x_1,\cdots,x_n)=Q(x_1,\cdots,x_n)\prod_{1\le h<k\le n}(x_k-x_h),$

we have

$[LD](1,\cdots,1)=\left(\prod_{j=1}^n (j-1)!\right)Q(1,\cdots,1).$

Proof:
By Lemmas 2 and 3, we have

$[LD](x_1,\cdots,x_n)=\left(\prod_{j=1}^n(j-1)!\,x_j^{j-1}\right)Q(x_1,\cdots,x_n)+\text{terms containing at least one factor }(x_j-x_i)$

with $1\le i<j\le n$. Here the first term is obtained by letting all of $L$ act on $\prod_{1\le h<k\le n}(x_k-x_h)$ (the second factor of $D$). In particular, taking $x_i=1$ for $i=1,\cdots,n$ we complete the proof.

Now we can finish the proof of Theorem 3 with these lemmas.

Proof of Theorem 3 (continued):
For $L$ as in Lemma 4, we have (where $S_n$ denotes the symmetric group of degree $n$)

$[LD](x_1,\cdots,x_n)=L\sum_{\sigma\in S_n}\epsilon(\sigma)x_1^{\xi_{\sigma(1)}}\cdots x_n^{\xi_{\sigma(n)}}$

$=\sum_{\sigma\in S_n}\epsilon(\sigma)\xi^0_{\sigma(1)}x_1^{\xi_{\sigma(1)}}\cdots \xi_{\sigma(n)}^{n-1}x_n^{\xi_{\sigma(n)}}$

since

$\left(x_j\frac{\partial}{\partial x_j}\right)^{j-1}x_j^{\xi_{\sigma(j)}}=\xi_{\sigma(j)}^{j-1}x_j^{\xi_{\sigma(j)}}$

for all $j=1,2,\cdots,n$. Thus,

$[LD](1,\cdots,1)=\sum_{\sigma\in S_n}\epsilon(\sigma)\xi_{\sigma(1)}^0\cdots\xi_{\sigma(n)}^{n-1}=\prod_{1\le i<j\le n}(\xi_j-\xi_i)$

is the Vandermonde determinant. Since $\xi_i\neq \xi_j$ for $i\neq j$, we deduce that $[LD](1,\cdots,1)$ is not divisible by $p$ because $n\le p$. Thus, by Lemma 4, $Q(1,\cdots,1)$ is not divisible by $p$ either. By Lemma 1, this completes the proof of Theorem 3.

Given a non-empty subset $A\subset \mathbb{Z}_p$ and a function $f\in L(A)$, we denote by $\overline{f}$ its extension $\overline{f}:\mathbb{Z}_p\to \mathbb{C}$ defined by setting $\overline{f}(z)=f(z)$ for $z\in A$ and $\overline{f}(z)=0$ for all $z\in \mathbb{Z}_p\backslash A$. For simplicity, we regard the DFT as a map $L(\mathbb{Z}_p)\to L(\mathbb{Z}_p)$. In other words, for $f\in L(\mathbb{Z}_p)$ and $x\in \mathbb{Z}_p$,

$\hat{f}(x)=\frac{1}{p}\sum_{y\in\mathbb{Z}_p}f(y)\omega^{-xy}.$

Proposition 2:
Let $p$ be a prime. Let $A,B\subset\mathbb{Z}_p$ such that $|A|=|B|$. Then the linear map $T=T_{A,B}:L(A)\to L(B)$ defined by $Tf=\widehat{\overline{f}}\,|_B$ is invertible.

Proof:

Set $A=\{\xi_1,\cdots,\xi_n\}$ and $B=\{\eta_1,\cdots,\eta_n\}$ and consider the basis of $L(A)$ (respectively, of $L(B)$) consisting of the Dirac functions $\delta_{\xi_j}$ with $j=1,\cdots,n$ (respectively, $\delta_{\eta_k}$ with $k=1,\cdots,n$) and let $\omega=e^{\frac{2\pi i}{p}}$. Then we have

$[T\delta_{\xi_k}](\eta_h)=\widehat{\delta_{\xi_k}}(\eta_h)=\frac{1}{p}\sum_{x\in\mathbb{Z}_p}\delta_{\xi_k}(x)\omega^{-x\eta_h}=\frac{1}{p}\omega^{-\eta_h\xi_k}.$

By virtue of Theorem 3, we have

$\mathrm{det}\left([T\delta_{\xi_k}](\eta_h)\right)_{h,k=1}^n\neq 0$,

showing that $T$ is indeed invertible.

Finally, we can prove the main result, Theorem 2.

Proof of Theorem 2:
Suppose by contradiction that there exists a non-zero $f\in L(\mathbb{Z}_p)$ such that $\mathrm{supp}(f)=A$ and $\mathrm{supp}(\hat{f})=C$ with $|A|+|C|\le p$. Then we can find a subset $B\subset \mathbb{Z}_p$ such that $|B|=|A|$ and $C\cap B=\emptyset$. We deduce that $Tf=\widehat{\overline{f}}|_B$ is identically zero. Since $f\not\equiv 0$, this contradicts the injectivity of $T$ (cf. Proposition 2).

Conversely, let $\emptyset\neq A,A'\subset\mathbb{Z}_p$ be two subsets such that $|A|+|A'|=p+1$. Let $B\subset\mathbb{Z}_p$ be such that $|B|=|A|$ and $B\cap A'=\{\xi\}$. Note that $(\mathbb{Z}_p\backslash B)\cup\{\xi\}\supset A'$. Taking cardinalities, $|A'|=p+1-|A|=p+1-|B|=|(\mathbb{Z}_p\backslash B)\cup\{\xi\}|\ge |A'|$, which yields

$(\mathbb{Z}_p\backslash B)\cup\{\xi\}=A'.$

Consider the map $T=T_{A,B}:L(A)\to L(B)$. By Proposition 2, we can find $g\in L(A)$ such that $Tg=\delta_\xi|_B$, so that $\widehat{\overline{g}}$ vanishes on $B\backslash\{\xi\}$ but $\widehat{\overline{g}}(\xi)\neq 0$. Setting $f=\overline{g}\in L(\mathbb{Z}_p)$ we clearly have $\mathrm{supp}(f)\subset A$ and $\mathrm{supp}(\hat{f})\subset (\mathbb{Z}_p\backslash B)\cup\{\xi\}$. By the first part of the theorem,

$p+1\le |\mathrm{supp}(f)|+|\mathrm{supp}(\hat{f})|\le |A|+|\mathbb{Z}_p\backslash B|+1 =|A|+p-|B|+1=p+1$

so that all the inequalities are in fact equalities, i.e., $\mathrm{supp}(f)=A$ and $\mathrm{supp}(\hat{f})= (\mathbb{Z}_p\backslash B)\cup\{\xi\}=A'$. This completes the second part of the theorem.

An immediate consequence of Theorem 2 is that any sparse polynomial $\sum_{j=0}^k c_jz^{n_j}$ with $k+1$ non-zero coefficients and exponents $0\le n_0<n_1<\cdots<n_k<p$, when restricted to the $p$th roots of unity $\{z:z^p=1\}$, can have at most $k$ zeroes. Indeed, such a polynomial is essentially the Fourier transform in $\mathbb{Z}_p$ of a function whose support has cardinality $k+1$, and so the polynomial must be non-zero on at least $p-k$ of the $p$th roots of unity by Theorem 2, and the claim follows.
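
This consequence is easy to test numerically (a sketch; the prime $p = 7$ and the sparse polynomial $z^4 + z + 1$ are arbitrary choices):

```python
import cmath

p = 7

def poly(z):
    # a sparse polynomial with k + 1 = 3 non-zero terms
    return z ** 4 + z + 1

# the p-th roots of unity
roots = [cmath.exp(2j * cmath.pi * j / p) for j in range(p)]
zero_count = sum(1 for z in roots if abs(poly(z)) < 1e-9)
# Theorem 2 guarantees at most k = 2 zeroes among the 7th roots of unity
assert zero_count <= 2
```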


## Math 202C: Lecture 19

Author: Qihao Ye

Andrew Dennehy has already introduced the concept of the discrete Fourier transform (DFT) in Lecture 18, but I would like to retrace the path with more details, because there are some other concepts (which help with full understanding) I would like to talk about.

We mainly talk about how the fast Fourier transform (FFT) and the inverse fast Fourier transform (IFFT) work, and we briefly talk about the calculation error.

We will see how FFT and IFFT work from both the mathematical and the computational perspective, along with their applications and some specific problems. Some problems may involve the number theoretic transform (NTT), which can be viewed as the integer DFT over a finite field.

We use Python 3 for code examples; since we do not use version-specific features, the exact version does not matter (for instance, Python 3.5.2 and Python 3.8.1 are both fine). We will say Python instead of Python 3 in the following content. (If you do not have Python installed, there are many online interpreters that can run Python code, for instance, Try It Online.) It is recommended to view this page on a computer for the best experience, and to try the code and solve some problems yourself, which should be challenging and interesting.

There are only 2 exercises in this lecture; the other problems are interest-based algorithm problems (not required as homework), which might be difficult to solve if you are not very familiar with the FFT and NTT in this form, but I will give hints to guide you.

Example 1: We consider two polynomials

$P_{1}(x) = 1 + 2 x + 3 x^{2} + 4 x^{3}, \qquad P_{2}(x) = 2 + 3 x + 4 x^{2} + 5 x^{3}.$

Multiplying them together (using the distributive property), we get

$Q(x) = P_{1}(x) \cdot P_{2}(x) = 2 + 7 x + 16 x^{2} + 30 x^{3} + 34 x^{4} + 31 x^{5} + 20 x^{6}.$

Definition 1 (coefficient representation): For a polynomial $P(x) = a_{0} + a_{1} x + \cdots + a_{n} x^{n}$, the list $[a_{0}, a_{1}, \ldots, a_{n}]$ is its coefficient representation. We denote $P[k]$ as the coefficient of $x^{k}$ in $P(x)$.

Using Definition 1, we can write $P_{1}(x)$ as $[1, 2, 3, 4]$, $P_{2}(x)$ as $[2, 3, 4, 5]$ and $P_{1}(x) \cdot P_{2}(x) = Q(x)$ as $[2, 7, 16, 30, 34, 31, 20]$.

The naïve polynomial multiplication function is not hard to get:

def naive_poly_mul(P1, P2):
    Q = [0] * (len(P1) + len(P2) - 1)
    for i in range(len(P1)):
        for j in range(len(P2)):
            Q[i + j] += P1[i] * P2[j]
    return Q
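
A quick check on Example 1 (the function is repeated here so the snippet runs on its own):

```python
def naive_poly_mul(P1, P2):
    # same function as above, repeated so this snippet is self-contained
    Q = [0] * (len(P1) + len(P2) - 1)
    for i in range(len(P1)):
        for j in range(len(P2)):
            Q[i + j] += P1[i] * P2[j]
    return Q

# Example 1: (1 + 2x + 3x^2 + 4x^3)(2 + 3x + 4x^2 + 5x^3)
print(naive_poly_mul([1, 2, 3, 4], [2, 3, 4, 5]))
# -> [2, 7, 16, 30, 34, 31, 20]
```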


In the general case, i.e.,

$P_{1}(x) = a_{0} + a_{1} x + \cdots + a_{n} x^{n}, \qquad P_{2}(x) = b_{0} + b_{1} x + \cdots + b_{m} x^{m},$

it is easy to see that the complexity of the naïve polynomial multiplication function is $\mathcal{O}(m n)$. Note that $\text{len}(P_{1}) = n + 1$ (counting from $0$ to $n$). If we consider the specific condition $m = n$, then the complexity becomes $\mathcal{O}(n^{2})$.

Definition 2 (degree of polynomial): The degree of a polynomial is the highest of the degrees of the polynomial’s monomials (individual terms) with non-zero coefficients.

In Example 1, $P_{1}, P_{2}$ are both polynomials of degree $3$.

Definition 3 (value representation): Instead of representing a degree-$n$ polynomial with its $n + 1$ coefficients, we can also represent it with $n + 1$ proper (see Exercise 1 below) points on the polynomial, which is called the value representation.

Exercise 1: Prove that $n + 1$ points $\{ (x_{i}, y_{i}) \}_{i = 1}^{n + 1}$ with distinct $\{ x_{i} \}$ determine a unique polynomial of degree at most $n$. Hint: use the fact that a Vandermonde matrix is nonsingular without proof.

Back to Example 1: to get $Q$, we can first compute $\{ (x_{i}, P_{1}(x_{i}), P_{2}(x_{i})) \}_{i = 1}^{7}$ from $P_{1}, P_{2}$ with distinct $\{ x_{i} \}$; then the $7$ points $\{ (x_{i}, P_{1}(x_{i}) P_{2}(x_{i})) \}_{i = 1}^{7}$ represent $Q$.
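
The value-representation multiplication on Example 1 can be checked directly (a sketch; the sample points are arbitrary distinct values):

```python
# a sketch of multiplication in value representation, using Example 1
P1 = [1, 2, 3, 4]              # 1 + 2x + 3x^2 + 4x^3
P2 = [2, 3, 4, 5]              # 2 + 3x + 4x^2 + 5x^3
Q  = [2, 7, 16, 30, 34, 31, 20]

def evaluate(P, x):
    # Horner's rule
    result = 0
    for a in reversed(P):
        result = result * x + a
    return result

xs = [-3, -2, -1, 0, 1, 2, 3]  # 7 distinct points determine Q (degree 6)
for x in xs:
    assert evaluate(P1, x) * evaluate(P2, x) == evaluate(Q, x)
```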

When the degrees of $P_{1}$ and $P_{2}$ are both $n$, we can see that the multiplication step here needs only $\mathcal{O}(n)$ operations.

However, most of the time we ultimately want the coefficient representation, because it is more suitable for evaluating the polynomial anywhere in its domain. It is not hard to see that we can do the multiplication this way: convert both polynomials to the value representation, multiply pointwise, then convert back.

If we only consider the situation where we aim to multiply two degree-$n$ polynomials, the multiplication part only costs $\mathcal{O}(n)$, so the bottleneck of the complexity lies in the algorithm changing the coefficient representation into the value representation and the algorithm changing it back.

The naïve way to get the value representation of a coefficient-represented polynomial costs at least $\mathcal{O}(n^{2})$. (To get each value we need $\mathcal{O}(n)$ operations, and we need $\mathcal{O}(n)$ values in total.)

The essential idea is selecting specific points to reduce the calculation cost.

A natural first thought is to look at the parity of a function, because for any odd function $P_{\text{odd}}$ we have $P_{\text{odd}}(-x) = - P_{\text{odd}}(x)$, while for any even function $P_{\text{even}}$ we have $P_{\text{even}}(-x) = P_{\text{even}}(x)$.

Actually, we can divide any polynomial into one odd function plus one even function. Taking a look back at $Q(x)$ in Example 1, we can write

$Q(x) = \left(2 + 16 x^{2} + 34 x^{4} + 20 x^{6}\right) + x \left(7 + 30 x^{2} + 31 x^{4}\right) = Q_{\text{even}}(x^{2}) + x\, Q_{\text{odd}}(x^{2}),$

where $Q_{\text{even}}(t) = 2 + 16 t + 34 t^{2} + 20 t^{3}$ and $Q_{\text{odd}}(t) = 7 + 30 t + 31 t^{2}$.

Notice that $Q_{\text{odd}}(x^{2})$ is actually an even function, but writing it in this form allows us to take $x^{2}$ as the variable. It follows that

$Q(x) = Q_{\text{even}}(x^{2}) + x\, Q_{\text{odd}}(x^{2}), \qquad Q(-x) = Q_{\text{even}}(x^{2}) - x\, Q_{\text{odd}}(x^{2}).$

Note that $Q_{\text{even}}$ has degree $3$ and $Q_{\text{odd}}$ has degree $2$ while $Q$ has degree $6$. Once we get $Q_{\text{even}}(x^{2})$ and $Q_{\text{odd}}(x^{2})$, we immediately get both $Q(x)$ and $Q(-x)$.

All that remains is to process $Q_{\text{even}}$ and $Q_{\text{odd}}$ recursively, which would lead to an $\mathcal{O}(n \log n)$ recursive algorithm.

However, we encounter the problem that the symmetry cannot be maintained further (we can pair $x$ with $-x$, but how do we pair the squared values $x^{2}$?).

As we already saw in the previous lecture, we can use the roots of unity to solve this problem. Denote $\omega_{n} = \exp(2 \pi i / n)$. Note that $\omega_{n}^{n} = \omega_{n}^{0} = 1$.

To make it easier to understand, we now only consider $n = 2^{k}$ for some positive integer $k$. We would evaluate the polynomial at $[\omega_{n}^{0}, \omega_{n}^{1}, \ldots, \omega_{n}^{n - 1}]$, then $[\omega_{n}^{0}, \omega_{n}^{2}, \ldots, \omega_{n}^{n - 2}]$, next $[\omega_{n}^{0}, \omega_{n}^{4}, \ldots, \omega_{n}^{n - 4}]$, etc., just as in the figure below.

We pair each $\omega_{n}^{j}$ with $-\omega_{n}^{j} = \omega_{n}^{j + n / 2}$. For any polynomial $P$, we can get $[P(\omega_{n}^{0}), P(\omega_{n}^{1}), \ldots, P(\omega_{n}^{n - 1})]$ fast by evaluating $[P_{\text{even}}(\omega_{n}^{0}), P_{\text{even}}(\omega_{n}^{2}), \ldots, P_{\text{even}}(\omega_{n}^{n - 2})]$ and $[P_{\text{odd}}(\omega_{n}^{0}), P_{\text{odd}}(\omega_{n}^{2}), \ldots, P_{\text{odd}}(\omega_{n}^{n - 2})]$ recursively. Recall that we divide $P(x)$ as $P_{\text{even}}(x^{2}) + x P_{\text{odd}}(x^{2})$. The corresponding formula is

$P(\omega_{n}^{j}) = P_{\text{even}}(\omega_{n}^{2j}) + \omega_{n}^{j}\, P_{\text{odd}}(\omega_{n}^{2j}), \qquad P(\omega_{n}^{j + n/2}) = P_{\text{even}}(\omega_{n}^{2j}) - \omega_{n}^{j}\, P_{\text{odd}}(\omega_{n}^{2j}), \qquad j = 0, 1, \ldots, n/2 - 1.$

This leads to the naive FFT code:

from math import pi
from math import sin
from math import cos

def get_omega(n):
    theta = 2 * pi / n
    omega = cos(theta) + sin(theta) * 1j
    return omega

def naive_FFT(P):
    # P is the coefficient representation
    # P = [a_{0}, a_{1}, ..., a_{n-1}]
    # currently we assume n = 2^{k}
    n = len(P)
    half_n = n // 2
    if n == 1:
        return P  # constant function
    omega = get_omega(n)
    P_even, P_odd = P[::2], P[1::2]
    V_even, V_odd = naive_FFT(P_even), naive_FFT(P_odd)
    V = [0] * n  # the value representation
    for j in range(half_n):
        V[j] = V_even[j] + omega ** j * V_odd[j]
        V[j + half_n] = V_even[j] - omega ** j * V_odd[j]
    return V


Feeding $P_{1}$ from Example 1 as input, we would get (up to floating-point error) $[10,\ -2-2i,\ -2,\ -2+2i]$, i.e., $[P_{1}(\omega_{4}^{0}), P_{1}(\omega_{4}^{1}), P_{1}(\omega_{4}^{2}), P_{1}(\omega_{4}^{3})]$.
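
For cross-checking, we can evaluate $P_{1}$ directly at the powers of $\omega_{4} = i$ (a self-contained sketch using cmath instead of the get_omega helper above):

```python
import cmath

P1 = [1, 2, 3, 4]
n = 4
omega = cmath.exp(2j * cmath.pi / n)  # i, a primitive 4th root of unity

# direct evaluation of P1 at omega^0, ..., omega^3
values = [sum(a * omega ** (j * k) for k, a in enumerate(P1)) for j in range(n)]
# values is, up to floating-point error, [10, -2-2j, -2, -2+2j]
```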

Now we can apply the Fourier transform to a coefficient representation to get the corresponding value representation, and the multiplication in value-representation form is easy to implement; what remains is the inverse Fourier transform.

In the matrix-vector form for $P(x) = a_{0} + a_{1} x + \cdots + a_{n - 1} x^{n - 1}$, we have

$\begin{bmatrix} P(\omega_{n}^{0}) \\ P(\omega_{n}^{1}) \\ \vdots \\ P(\omega_{n}^{n - 1}) \end{bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \omega_{n} & \cdots & \omega_{n}^{n - 1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_{n}^{n - 1} & \cdots & \omega_{n}^{(n - 1)^{2}} \end{bmatrix} \begin{bmatrix} a_{0} \\ a_{1} \\ \vdots \\ a_{n - 1} \end{bmatrix}.$

Note that the character table of $C_{n}$ has exactly the form of the DFT matrix.

To get the inverse Fourier transform, we can just use the inverse of the matrix:

$\begin{bmatrix} a_{0} \\ a_{1} \\ \vdots \\ a_{n - 1} \end{bmatrix} = \frac{1}{n} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \omega_{n}^{-1} & \cdots & \omega_{n}^{-(n - 1)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_{n}^{-(n - 1)} & \cdots & \omega_{n}^{-(n - 1)^{2}} \end{bmatrix} \begin{bmatrix} P(\omega_{n}^{0}) \\ P(\omega_{n}^{1}) \\ \vdots \\ P(\omega_{n}^{n - 1}) \end{bmatrix}.$

Exercise 2: Check that the inverse DFT matrix is the inverse of the DFT matrix.

The difference between DFT and FFT is just the way these matrix-vector products are evaluated: DFT uses direct matrix-vector multiplication in $\mathcal{O}(n^{2})$, while FFT uses the tricky recursive way to achieve $\mathcal{O}(n \log n)$.

The inverse DFT matrix leads to the naive IFFT code:

def naive_IFFT(V, is_outest_layer=False):
    # V is the value representation
    # w means omega_{n}
    # V = [P(w^{0}), P(w^{1}), ..., P(w^{n-1})]
    # currently we assume n = 2^{k}
    n = len(V)
    half_n = n // 2
    if n == 1:
        return V  # constant function
    omega = 1.0 / get_omega(n)  # omega_{n}^{-1}
    V_even, V_odd = V[::2], V[1::2]
    P_even, P_odd = naive_IFFT(V_even), naive_IFFT(V_odd)
    P = [0] * n  # the coefficient representation
    for j in range(half_n):
        P[j] = P_even[j] + omega ** j * P_odd[j]
        P[j + half_n] = P_even[j] - omega ** j * P_odd[j]
    if is_outest_layer:
        for j in range(n):
            P[j] /= n
    return P


Applying naive_IFFT (with is_outest_layer=True) to the output of naive_FFT on $P_{1}$ from Example 1, we would get a list of complex numbers very close to $[1, 2, 3, 4]$.

If we ignore the little error, this is just the coefficient representation of $P_{1}$.

The following material might not be that clear or easy to understand, but I will try my best. (Some material cannot be expanded too much; otherwise it would cost too much space and might obscure the main part.)

The lowest layer of Diagram 1 is just the bit-reversal permutation of the indices, and there is a neat piece of code to generate it:

def get_BRI(length):
    # Bit-Reversal Index
    n = 1
    k = -1
    while n < length:
        n <<= 1
        k += 1
    BRI = [0] * n
    for i in range(n):
        BRI[i] = (BRI[i >> 1] >> 1) | ((i & 1) << k)
    return BRI


It is easier to see in a table (an example of an $8$-length BRI):

index    : 0   1   2   3   4   5   6   7
binary   : 000 001 010 011 100 101 110 111
reversed : 000 100 010 110 001 101 011 111
BRI      : 0   4   2   6   1   5   3   7

Using this, we can implement the Cooley–Tukey FFT algorithm, which is the most common FFT algorithm. Further, with a proper manner of coding, we can devise an in-place algorithm that overwrites its input with its output using only $\mathcal{O}(1)$ auxiliary storage, which is called the iterative radix-2 FFT algorithm. Moreover, since the forms of FFT and IFFT are actually very similar, we can integrate them together.

def FFT(X, length, is_inverse=False):
    # X : input, either coefficient representation
    #            or value representation
    # length : how many values need to be evaluated
    # is_inverse : indicates whether this is FFT or IFFT
    inverse_mul = [1, -1][is_inverse]
    BRI = get_BRI(length)
    n = len(BRI)
    X += [0] * (n - len(X))
    for index in range(n):
        if index < BRI[index]:  # only swap once
            X[index], X[BRI[index]] = X[BRI[index]], X[index]
    bits = 1
    while bits < n:
        omega_base = cos(pi / bits) + inverse_mul * sin(pi / bits) * 1j
        j = 0
        while j < n:
            omega = 1
            for k in range(bits):
                even_part = X[j + k]
                odd_part = X[j + k + bits] * omega
                X[j + k] = even_part + odd_part
                X[j + k + bits] = even_part - odd_part
                omega *= omega_base
            j += bits << 1
        bits <<= 1
    if is_inverse:
        for index in range(length):
            X[index] = X[index].real / n
            # only the real part is needed
    return X[:length]


Note that we could ignore the return part, since $X$ is already changed in place. This algorithm extends the input length to the closest power of two $2^{k}$ from above, but in most cases we would pad the length to $2^{k}$ (by appending $0$'s) before using this algorithm.
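
As a usage sketch, polynomial multiplication via this routine looks as follows (get_BRI and FFT are repeated from above so the snippet is self-contained; the helper poly_mul_fft and the final rounding to integers are my own additions):

```python
from math import pi, sin, cos

def get_BRI(length):
    # Bit-Reversal Index, as above
    n = 1
    k = -1
    while n < length:
        n <<= 1
        k += 1
    BRI = [0] * n
    for i in range(n):
        BRI[i] = (BRI[i >> 1] >> 1) | ((i & 1) << k)
    return BRI

def FFT(X, length, is_inverse=False):
    # iterative radix-2 FFT/IFFT, as above
    inverse_mul = [1, -1][is_inverse]
    BRI = get_BRI(length)
    n = len(BRI)
    X += [0] * (n - len(X))
    for index in range(n):
        if index < BRI[index]:
            X[index], X[BRI[index]] = X[BRI[index]], X[index]
    bits = 1
    while bits < n:
        omega_base = cos(pi / bits) + inverse_mul * sin(pi / bits) * 1j
        j = 0
        while j < n:
            omega = 1
            for k in range(bits):
                even_part = X[j + k]
                odd_part = X[j + k + bits] * omega
                X[j + k] = even_part + odd_part
                X[j + k + bits] = even_part - odd_part
                omega *= omega_base
            j += bits << 1
        bits <<= 1
    if is_inverse:
        for index in range(length):
            X[index] = X[index].real / n
    return X[:length]

def poly_mul_fft(P1, P2):
    # multiply two integer polynomials: FFT -> pointwise product -> IFFT
    res_len = len(P1) + len(P2) - 1
    n = 1
    while n < res_len:
        n <<= 1
    V1 = FFT(list(P1), n)
    V2 = FFT(list(P2), n)
    V = [a * b for a, b in zip(V1, V2)]
    C = FFT(V, n, is_inverse=True)
    return [round(c) for c in C[:res_len]]  # round away the float error

print(poly_mul_fft([1, 2, 3, 4], [2, 3, 4, 5]))
# -> [2, 7, 16, 30, 34, 31, 20]
```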

Because we use complex numbers to implement the FFT algorithm, the error is hard to eliminate. Even if the initial polynomial has integer coefficients, applying FFT and then IFFT gives back a list of decimals with some calculation error.

If we do the FFT in the field $\mathbb{Z} / P \mathbb{Z}$, where $P = Q \cdot 2^{k} + 1$ is prime, and denote by $g$ a primitive root modulo $P$, then $\{ 1, g^{Q}, g^{2 Q}, \ldots \}$ is a cyclic group of order $2^{k}$, and we can replace $\omega_{n}$ with the element $g^{(P - 1)/n}$ of order $n$ to do the FFT. This method is called the NTT; since only integers are involved, no errors can appear.
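
A minimal sketch of the NTT (assuming the common NTT-friendly prime $P = 998244353 = 119 \cdot 2^{23} + 1$ with primitive root $g = 3$; the function names are ad hoc):

```python
MOD = 998244353  # = 119 * 2^23 + 1, an NTT-friendly prime
G = 3            # a primitive root modulo MOD

def ntt(P, invert=False):
    # recursive radix-2 NTT; len(P) must be a power of two dividing 2^23
    n = len(P)
    if n == 1:
        return P
    root = pow(G, (MOD - 1) // n, MOD)  # an n-th root of unity mod MOD
    if invert:
        root = pow(root, MOD - 2, MOD)  # its inverse, via Fermat
    even = ntt(P[::2], invert)
    odd = ntt(P[1::2], invert)
    V = [0] * n
    w = 1
    for j in range(n // 2):
        t = w * odd[j] % MOD
        V[j] = (even[j] + t) % MOD
        V[j + n // 2] = (even[j] - t) % MOD
        w = w * root % MOD
    return V

def ntt_convolve(a, b):
    # exact integer convolution (results taken modulo MOD)
    n = 1
    while n < len(a) + len(b) - 1:
        n <<= 1
    fa = ntt(a + [0] * (n - len(a)))
    fb = ntt(b + [0] * (n - len(b)))
    fc = ntt([x * y % MOD for x, y in zip(fa, fb)], invert=True)
    n_inv = pow(n, MOD - 2, MOD)
    return [x * n_inv % MOD for x in fc[:len(a) + len(b) - 1]]
```

On Example 1, ntt_convolve([1, 2, 3, 4], [2, 3, 4, 5]) recovers [2, 7, 16, 30, 34, 31, 20] exactly, with no floating-point error.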

For an arbitrary modulus $m$ and a target NTT length $n$, we can take a set of distinct NTT moduli $\{ p_{i} \}_{i = 1}^{r}$ satisfying

$p_{1} p_{2} \cdots p_{r} > n (m - 1)^{2},$

do the NTT with respect to each $p_{i}$, and then use the Chinese remainder theorem to combine the results into the final result modulo $m$.

Note that during the NTT algorithm, the maximum intermediate value would not exceed $n (m - 1)^{2}$.

We may say the FFT algorithm computes the convolution

$c_{k} = \sum_{i + j = k} a_{i} b_{j}$

in time $\mathcal{O}(n \log n)$.

Back in Example 1, we have $Q[k] = \sum_{i + j = k} P_{1}[i] P_{2}[j]$.

(Not mandatory) Problem 1: Given the formula for Stirling numbers of the second kind:

$S(n, k) = \frac{1}{k!} \sum_{j = 0}^{k} (-1)^{j} \binom{k}{j} (k - j)^{n} = \sum_{j = 0}^{k} \frac{(-1)^{j}}{j!} \cdot \frac{(k - j)^{n}}{(k - j)!},$

use the NTT with some modulus of the form $P = Q \cdot 2^{k} + 1$ to calculate all $S(n, k)$ for $0 \leq k \leq n$ in time complexity $\mathcal{O}(n \log n)$.

(Not mandatory) Problem 2: PE 537. Hint: If we denote by $A$ the list of values of $\pi(x)$, i.e., $A[n] = \sum_{x} [\pi(x) = n]$, then the convolution of $A$ with itself, call it $B$, is the list of values of $\pi(x) + \pi(y)$, i.e., $B[n] = \sum_{x, y} [\pi(x) + \pi(y) = n]$; the convolution of $B$ with itself, call it $C$, is the list of values of $\pi(x) + \pi(y) + \pi(z) + \pi(w)$, i.e., $C[n] = \sum_{x, y, z, w} [\pi(x) + \pi(y) + \pi(z) + \pi(w) = n]$; etc. You can consider $A, B, C$ as generating functions. You might need to learn how to sieve prime numbers and use fast exponentiation.

There are a bunch of similar algorithms, for example, the fast Walsh–Hadamard transform (FWHT) and the fast wavelet transform (FWT). FWHT can solve the general convolution

$c_{k} = \sum_{i \star j = k} a_{i} b_{j}$

in time complexity $\mathcal{O}(n \log n)$, where $\star$ is some binary operation, usually bitwise OR, bitwise AND, or bitwise XOR.

Using FFT, we can do a lot of things with polynomials fast. For instance, for a polynomial $P(x)$, we can find a polynomial $Q(x)$ such that $P(x) Q(x) \equiv 1 \mod{x^{n}}$; this $Q(x)$ is called the inverse of $P(x)$ modulo $x^{n}$. The basic idea is to find the inverse polynomial modulo $x^{n / 2}$, then $x^{n / 4}$, etc. Because the inverse polynomial modulo $x^{1}$ is trivial, we can solve this recursively. A similar idea may be applied to Newton's method modulo $x^{n}$; specifically, we can find the square root of a polynomial modulo $x^{n}$.
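
A sketch of this recursive/Newton idea for the polynomial inverse (using naive multiplication for clarity and assuming the constant term is $1$; a real implementation would do the products with FFT/NTT):

```python
def poly_mul(a, b, keep):
    # naive multiplication, truncated modulo x^keep
    c = [0] * min(keep, len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < keep:
                c[i + j] += ai * bj
    return c

def poly_inverse(P, n):
    # find Q with P * Q = 1 (mod x^n); assumes P[0] == 1
    Q = [1]  # inverse modulo x^1 is trivial
    k = 1
    while k < n:
        k *= 2
        PQ = poly_mul(P[:k], Q, k)
        # Newton step: Q <- Q * (2 - P * Q) (mod x^k)
        two_minus = [2 - PQ[0]] + [-c for c in PQ[1:]]
        Q = poly_mul(Q, two_minus, k)
    return Q[:n] + [0] * (n - len(Q))

# example: inverse of 1 + 2x + 3x^2 + 4x^3 modulo x^4
print(poly_inverse([1, 2, 3, 4], 4))
# -> [1, -2, 1, 0]
```

One can check directly that $(1 + 2x + 3x^{2} + 4x^{3})(1 - 2x + x^{2}) \equiv 1 \mod{x^{4}}$.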

(Not mandatory) Problem 3: PE 258. Hint: Consider the Cayley–Hamilton theorem ($x^{2000} - x - 1 = 0$) and use the polynomial inverse to compute the polynomial quotient and remainder for $x^{10^{18}}$. Or consider solving the homogeneous linear recurrence with constant coefficients via the Berlekamp–Massey algorithm, which involves polynomial multiplication. Note that $20092010 = 8590 \times 2339$.

FFT is also used to transform the time domain into the frequency domain in signal processing, while IFFT is used for the reverse transform.

In the article [Machine Learning from a Continuous Viewpoint I] by Weinan E, the Fourier representation

$f(\boldsymbol{x}; a) = \int_{\mathbb{R}^{d}} a(\boldsymbol{\omega}) \sigma(\boldsymbol{\omega}, \boldsymbol{x}) d \boldsymbol{\omega}$

can be considered as a two-layer neural network model with activation function $\sigma$.

In the above figure, $\sigma(\boldsymbol{\omega}, \boldsymbol{x})$ calculates each hidden layer value using the input layer $\boldsymbol{x}$ with weight $\boldsymbol{\omega}$, and $\int_{\mathbb{R}^{d}} a(\boldsymbol{\omega}) \sigma(\boldsymbol{\omega}, \boldsymbol{x}) d \boldsymbol{\omega}$ sums all the hidden layer values with weight $a(\boldsymbol{\omega})$. Note this is an integral formula: we work in a continuous setting, which means that the hidden layer has infinite width (the hidden layer is considered to have infinitely many nodes).


## Math 202C: Lecture 18

Written by a countably infinite number of monkeys with an uncountably infinite number of typewriters, edited by Andrew Dennehy

In a bold twist on the standard formula, in this lecture post we will be discussing math. Specifically, we will be discussing the Cohn-Uman algorithm and how representation theory can provide us an algorithm to multiply two matrices together. We will then see how this perspective could potentially help us prove an important theorem in complexity theory, namely that there exists an algorithm which computes the product of two $n\times n$ matrices in $O(n^{2+\epsilon})$ time for any $\epsilon>0$. Before we get to that, though, we will first cover an algorithm with many similarities to the one we will be discussing.

Consider the problem of multiplying two polynomials

$p(x)=p_0+p_1x+\cdots +p_{n-1}x^{n-1}$

$q(x)=q_0+q_1x+\cdots +q_{n-1}x^{n-1}$

The naïve approach would be the one that's taught in high school, namely finding all the $p_iq_j$ terms and using them to calculate the coefficients of $p(x)q(x)$. However, we're significantly cooler than high school students, so we're going to use some significantly cooler math. First, we introduce the concept of the discrete Fourier transform. Let $\omega_n=e^{i \frac{2\pi}{n}}$ be a primitive $n$-th root of unity. The DFT of a vector $v$ is given by

$\hat{v} := \begin{bmatrix}1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_n & \omega_n^2 & \cdots & \omega_n^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_n^{n-1} & \omega_n^{2(n-1)} & \cdots & \omega_n^{(n-1)^2} \end{bmatrix} v$

Equivalently, if $v_i$ is the $i$-th component of $v$,

$\hat{v}_k := v_0+v_1 \omega_n^{k} + \cdots + v_{n-1}\omega_n^{(n-1)k}$

So, if we apply this transformation to the vector of coefficients of $p$, we get that

$\hat{p}_k= p_0+p_1 \omega_n^{k} + \cdots + p_{n-1}\omega_n^{(n-1)k} =p(\omega_n^k)$

So, the problem of finding the Fourier coefficients of $p$ is equivalent to the problem of evaluating $p$ at $1,\omega_n,\omega_n^2,...,\omega_n^{n-1}$. From now on, assume $n$ is a power of 2, since if it isn’t we can just add a bunch of 0 coefficients until we have a power of 2. Define the following two polynomials:

$p_0(x)=p_0+p_2x + \cdots + p_{n-2} x^{n/2-1}$

$p_1(x)=p_1+p_3x + \cdots + p_{n-1} x^{n/2-1}$

This lets us decompose $p$ as

$p(x)=p_0(x^2)+xp_1(x^2)$

So, instead of evaluating $p$ at all those powers of $\omega_n$, we can instead evaluate $p_0$ and $p_1$ at $1,\omega_n^2, ..., \omega_n^{2(n-1)}$ and then combine the results. So, we have reduced the problem of plugging $n$ numbers into a degree $n-1$ polynomial to plugging $n/2$ numbers (specifically the $n/2$-th roots of unity) into two degree $n/2-1$ polynomials. Because we've assumed that $n$ is a power of 2, so is $n/2$, meaning we can repeat the same algorithm for $p_0$ and $p_1$. We can apply this process repeatedly, which gives us a recursive algorithm with complexity $O(n \log n)$ for calculating the Fourier coefficients. Similarly, one can show that we can invert the DFT by applying the same process to the polynomial

$\hat p (x)=\hat{p}_0+\hat{p}_1 x + \cdots + \hat{p}_{n-1} x^{n-1}$

with $\omega_n$ replaced with $\omega_n^{-1}=\omega_n^{n-1}$ and then dividing the result by $n$. The upshot of all of this work is that

$\widehat{p q}_k = [pq](\omega_n^k) = p(\omega_n^k) q(\omega_n^k)=\hat p_k \hat q_k$

So, we can calculate the Fourier coefficients of $pq$ by calculating the coefficients of each polynomial independently and multiplying them together. Since this has linear complexity and both the Fourier and Inverse Fourier transforms have $n\log n$ complexity, this method gives us an $O(n\log n)$ complexity algorithm to multiply two degree $n$ polynomials.

Notice that instead of using the roots of unity, we could have instead used a generator for $C_n$ and worked over $\mathbb C [C_n]$ with identical results. So, we can frame multiplication of polynomials as a special case of multiplying elements in $\mathbb C [C_n]$.

Next, we take a slight detour to discuss some group theoretic concepts. For any subset $S$ of a group $G$ (not required to be a subgroup), let the right quotient set $Q(S)$ of $S$ be defined to be

$Q(S)=\left\{s_1s_2^{-1}:s_1,s_2\in S\right\}$

Exercise 1: Assume that $S\neq \emptyset$ and $Q(S)=S$. What does this imply about $S$? Show that the condition that this implies is actually equivalent to $Q(S)=S$, so the implication goes both ways.

Definition 1: A group $G$ realizes $\langle n_1,n_2,n_3 \rangle$ if there are subsets $S_1, S_2, S_3\subseteq G$ such that $|S_i|=n_i$ and for $q_i\in Q(S_i)$, if

$q_1q_2q_3=1$

then $q_1=q_2=q_3=1$. This condition is called the triple product property. Furthermore, this is equivalent to the condition that if $q_1q_2=q_3$, then $q_1=q_2=q_3=1$ .

Although it isn’t entirely trivial, one can show that if $G$ realizes $\langle n_1,n_2,n_3 \rangle$, then it realizes any permutation of those numbers.

Exercise 2: Prove that $C_n\times C_m\times C_p$ realizes $\langle n,m,p \rangle$.

This is the “trivial realization” of $\langle n,m,p \rangle$. We now get our first taste of why these ideas could be useful for the problem of matrix multiplication:

Theorem 1: Let $F$ be any field. If $G$ realizes $\langle n,m,p \rangle$, then the number of field operations required to multiply an $n\times m$ matrix by an $m\times p$ matrix over $F$ is at most the number of operations required to multiply two elements of $F[G]$.

Proof: Let $G$ realize $\langle n,m,p \rangle$ through the subsets $S,T,U$. Then, for matrices $A$ and $B$ of sizes $n\times m$ and $m\times p$ respectively, we can index the rows of $A$ by $S$, the columns of $A$ and the rows of $B$ by $T$, and the columns of $B$ by $U$. Then, consider the product

$\left(\sum\limits_{s\in S,t\in T} A_{st}s ^{-1} t\right) \left(\sum \limits _{t'\in T, u\in U} B_{t'u} t'^{-1}u\right)$

By the triple product property, we have that

$(s^{-1} t)(t'^{-1} u)=s'^{-1} u'\iff (s' s^{-1}) (tt'^{-1} )(uu'^{-1})=1$

which is true iff $s=s'$, $t=t'$, and $u=u'$. So, the coefficient of $s^{-1} u$ in the product is

$\sum\limits_{t\in T}A_{st}B_{tu}=(AB)_{su}$

So, you can just read off the elements of $AB$ from these coefficients. $\square$

The process we just described, of converting matrix multiplication to multiplication in $F[G]$, is called the Cohn Uman algorithm. Also, we can now see the connection to the polynomial algorithm we described earlier; we’ve converted a problem of multiplying matrices to a problem of multiplying elements of $\mathbb C[G]$, which is what we did with polynomials before using $\mathbb C[C_n]$.
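
As an illustration, here is a toy sketch of this reduction in code (all names are ad hoc; the group is the trivial realization $C_2\times C_3\times C_2$ of $\langle 2,3,2 \rangle$ from Exercise 2, written additively):

```python
from collections import defaultdict
from itertools import product

# G = Z_2 x Z_3 x Z_2, written additively; realizes <2, 3, 2>
# via S = {(s,0,0)}, T = {(0,t,0)}, U = {(0,0,u)}
mods = (2, 3, 2)

def g_add(g, h):
    return tuple((a + b) % m for a, b, m in zip(g, h, mods))

def g_neg(g):
    return tuple((-a) % m for a, m in zip(g, mods))

def convolve(x, y):
    # multiplication in the group algebra C[G]
    z = defaultdict(int)
    for g, cg in x.items():
        for h, ch in y.items():
            z[g_add(g, h)] += cg * ch
    return z

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3 matrix
B = [[7, 8], [9, 10], [11, 12]]   # 3 x 2 matrix

# a = sum A[s][t] s^{-1}t,  b = sum B[t'][u] t'^{-1}u
a = defaultdict(int)
for s, t in product(range(2), range(3)):
    a[g_add(g_neg((s, 0, 0)), (0, t, 0))] += A[s][t]
b = defaultdict(int)
for t, u in product(range(3), range(2)):
    b[g_add(g_neg((0, t, 0)), (0, 0, u))] += B[t][u]

c = convolve(a, b)
# read off (AB)[s][u] as the coefficient of s^{-1}u
AB = [[c[g_add(g_neg((s, 0, 0)), (0, 0, u))] for u in range(2)]
      for s in range(2)]
```

By the triple product property, the coefficient of $s^{-1}u$ collects exactly the terms $\sum_t A_{st}B_{tu}$, so AB equals the usual matrix product of A and B.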

We can now define a measurement of the “quality” of embedding that a given group $G$ can provide.

Definition 2: The pseudo-exponent $\alpha(G)$ of a non-trivial finite group $G$ is the minimum of

$\frac{3\log |G|}{\log nmp}$

over all $n,m,p$ (not all 1) such that $G$ realizes $\langle n,m,p \rangle$.

Lemma 1: For any group $G$, $2< \alpha(G)\leq 3$. Further, if $G$ is abelian, then $\alpha(G)=3$.

Proof: $\alpha(G)$ is clearly at most 3, since we can just take $S_1=S_2=\{1\}$ and $S_3=G$, meaning $G$ realizes $\langle 1,1,|G|\rangle$.

To show our lower bound, suppose $G$ realizes $\langle n_1,n_2,n_3\rangle$ (not all equal to 1) through the subsets $S_1,S_2,S_3$. Then, the triple product tells us that the map $(x,y)\mapsto x^{-1}y$ is injective on $S_1\times S_2$ and the image only intersects $Q(S_3)$ in the identity. So, $|G|\geq n_1n_2$ with equality only occurring if $n_3=1$. The same argument shows this is true for $n_2n_3$ and $n_3n_1$, so $|G|^3>(n_1n_2n_3)^2$. Therefore,

$\frac{3 \log |G|}{\log nmp}> \frac{3 \log (nmp)^{2/3}}{\log nmp}=\frac{3 \cdot 2}{3}=2$

So, $\alpha(G)>2$. Finally, if $G$ is abelian, then the product map from $S_1\times S_2 \times S_3$ to $G$ must be injective (because the triple product property implies the kernel of the map is trivial), so $|G|\geq n_1n_2n_3$, meaning $\alpha(G)\geq 3$. $\square$

Next, we want to relate $\alpha(G)$ to $\omega$, the exponent of complexity for matrix multiplication, i.e. the smallest number such that we can perform matrix multiplication in $O(n^{\omega +\epsilon})$ time for any $\epsilon>0$. To do this, we first recall from 202B that

$\mathbb{C}[G]\cong \mathcal{M}_{d_1\times d_1} \oplus \cdots \oplus \mathcal{M}_{d_k\times d_k}$

where the $d_i$‘s are the dimensions of the irreducible representations of $G$. Then the most important theorem we will discuss is as follows:

Theorem 2: Suppose that $G$ is a group with character degrees $d_1,...,d_k$. Then,

$|G|^{\omega/\alpha(G)}\leq \sum\limits_{i=1}^k d_i^{\omega}$

Intuitively, the idea of this theorem is that the problem of multiplying matrices of size $|G|^{1/\alpha(G)}$ reduces to multiplication of two elements in $\mathbb{C}[G]$ together, which itself reduces to a problem of multiplying a collection of matrices of size $d_i\times d_i$, each of which should take approximately $d_i^\omega$ operations. So, multiplying two matrices of size $|G|^{1/\alpha(G)}$ has complexity of order at most $\sum_{i=1}^k d_i^{\omega}$, which means that this sum is an upper bound for $|G|^{\omega/\alpha(G)}$. Actually proving the theorem is more difficult, so we won’t show it here, but this intuition is the most important part.

Importantly, notice that if $\alpha(G)$ were equal to 2, then this would imply that $\omega =2$ (because $\sum_i d_i^2 =|G|$). However, $\alpha(G)>2$, so we need to control the character degrees of $G$ to get non-trivial bounds on $\omega$. Define $\gamma(G)$ to be the number such that $|G|^{1/\gamma}$ is the maximum character degree of $G$. An important corollary of Theorem 2 is:

Corollary 1: Let $G$ be a finite group. If $\alpha(G)<\gamma(G)$, then

$\omega\leq \alpha(G)\left(\frac{\gamma(G)-2}{\gamma(G)-\alpha(G)} \right)$

Proof: Let $\{d_i\}$ be the character degrees of $G$. Then, by Theorem 2,

$|G|^{\omega/\alpha(G)}\leq \sum_i d_i^{\omega-2}d_i^2\leq |G|^{(\omega-2)/\gamma(G)}\sum_i d_i^2\leq |G|^{1+(\omega-2)/\gamma(G)}$

This implies $\omega\left(\frac{1}{\alpha(G)}-\frac{1}{\gamma(G)}\right)\leq 1-\frac{2}{\gamma(G)}$. Dividing by $\frac{1}{\alpha(G)}-\frac{1}{\gamma(G)}$ (which is positive by assumption) gives us our result. $\square$

A special case of this corollary is given by

Corollary 2: If there exists a sequence $G_1,G_2,...$ of finite groups such that $\alpha(G_i)=2+o(1)$ as $i\rightarrow \infty$ and $\alpha(G_i)-2=o(\gamma(G_i)-2)$, then the exponent of matrix multiplication is 2.

While these are weaker versions of our original theorem, they only require knowledge of $\gamma(G)$, rather than all of the character degrees. Finally, we end with an open problem in representation theory whose positive answer could lead to a solution to our original problem:

Open Problem: For integers $n,m,p$ not all 1, does there exist a finite group that realizes $\langle n,m,p\rangle$ and has character degrees $d_1,...,d_k$ such that

$nmp>\sum_{i}d_i^3$

While this is not a necessary condition, a positive answer to this could lead to a proof that $\omega=2$ when combined with our theorem. This is because one limitation of our theorem is that unless we can rule out the case that $|G|^{3/\alpha}\leq \sum_{i}d_i^3$ in Theorem 2, that theorem is satisfied trivially. So, for us to be able to show that a group gives us a non-trivial improvement, we have to show that for any $n,m,p\in \mathbb N$, there exists a group $G$ realizing $\langle n,m,p \rangle$ such that $|G|^{3/\alpha}> \sum\limits_{i}d_i^3$, which the open problem implies.

While it is an open problem whether these groups exist in general, there are a few concrete examples where such a group exists, such as the one that follows:

Let $H=C_n\times C_n \times C_n$ and let $G=H^2 \rtimes C_2$, where the group action of $C_2$ on $H^2$ is given by switching the two factors. If we let $z$ be the generator of $C_2$, we can write the elements of $G$ as $(a,b)z^i$ with $a,b\in H$ and $i\in \{0,1\}$. Let $H_1,H_2,H_3$ be the three copies of $C_n$ in $H$, viewed as subgroups, and let $H_4:=H_1$ for notation. Then, we can define $S_1,S_2,S_3$ as

$S_i=\{(a,b)z^j:a\in H_i\setminus \{1\},b\in H_{i+1},j\in \{0,1\}\}$

One can show, although the proof is long, that these sets satisfy the triple product property. Furthermore, the character degrees of $G$ are all at most 2, since $H^2$ is an abelian subgroup of index 2 in $G$ and the maximum character degree of a group is at most the index of any abelian subgroup in the group. Since the sum of the squares of the character degrees is $|G|$, the sum of the cubes of the degrees is at most $2|G|=4n^6$. However, $|S_i|=2n(n-1)$, meaning that for $n\geq 5$,

$|S_1| |S_2| |S_3| =8n^3(n-1)^3 > 4n^6= 2|G| \geq \sum\limits_{i}d_i^3$

## Lecture 16 – PageRank

Author: Jiyoung Choi

PageRank is a search algorithm based on the linkage structure of the web, introduced by Larry Page and Sergey Brin. They built the search engine Google to test the utility of PageRank for search. (Origin of the name "Google")

The goal of this lecture is to construct the Google matrix and explore how to find the PageRank vector. Let's start by making an adjacency matrix based on web linkage.

Let $u$ be a webpage, and let $F_u$ be the set of pages $u$ points to and $B_u$ the set of pages that point to $u$. Consider each webpage as a vertex and each link as a directed edge. We construct an adjacency matrix by setting $A_{uv} = 1$ if there exists a link from $v$ to $u$, and $A_{uv}=0$ otherwise, where $u, v$ are webpages. For example, if the webpages are linked as in Figure 1, $A$ is defined as follows.

$A= \begin{bmatrix} 0 & 1 & 1 & 1 & 0\\ 1 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 & 0\\ 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\end{bmatrix}$

We can observe that the last column of A is a zero vector because $v_5$ does not have forward links. We call such a node a dangling node.

Let $N_u = |F_u|$ be the number of links from $u$. Then the column-normalized adjacency matrix $R$ of $A$ is defined by $R(u) = {A(u) \over N_u}$, where $R(u)$ and $A(u)$ denote the columns corresponding to page $u$. In our example,

$R= \begin{bmatrix} 0 & 1 & \frac{1}{2} & \frac{1}{3} & 0\\ \frac{1}{2} & 0 & 0 & \frac{1}{3} & 0\\ 0 & 0 & 0 & \frac{1}{3} & 0\\ \frac{1}{2} & 0 & 0 & 0 & 0\\ 0 & 0 & \frac{1}{2} & 0 & 0\end{bmatrix}.$

This matrix $R$ is not directly used in PageRank. To compute the PageRank vector, we will use the power method, an iterative procedure which has convergence problems: the procedure can cycle, or the limit may depend on the starting vector. To avoid these problems, we need to have a left stochastic matrix (every column sums to $1$).

As in our example, $R$ may fail to be stochastic because of dangling nodes. Therefore, define the stochastic matrix $R' = R + {\mathbf{e}\mathbf{a}^T \over n}$, where $\mathbf{e}$ is the vector of all ones, $n$ is the number of web pages, and $\mathbf{a}(u)=1$ if $R(u) =\mathbf{0}$ and $\mathbf{a}(u)=0$ otherwise.

When surfing the web, even if the current page has no forward links (a dangling node), we may still jump to a new page. The matrix ${\mathbf{e}\mathbf{a}^T \over n}$ models this situation by assigning uniform probability over all web pages to each dangling node.
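The construction of $R$ and $R'$ can be sketched in a few lines of NumPy (a minimal sketch of the construction above; the variable names are mine):

```python
import numpy as np

# A[u, v] = 1 if page v links to page u (the example matrix above).
A = np.array([
    [0, 1, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
], dtype=float)

n = A.shape[0]
N = A.sum(axis=0)                    # N_u: number of forward links of page u

# Column-normalize; dangling columns (N_u = 0) stay zero for now.
R = A / np.where(N > 0, N, 1.0)

# R' = R + e a^T / n: replace each dangling column by the uniform distribution.
a = (N == 0).astype(float)
R_prime = R + np.outer(np.ones(n), a) / n

assert np.allclose(R_prime.sum(axis=0), 1.0)   # R' is column-stochastic
```

Note that only the dangling columns are modified: every column of $R'$ now sums to 1, so $R'$ is left stochastic.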

Problem 1: Find $R'$ for our example.

Then by the Gershgorin circle theorem, every eigenvalue of $R'$ has modulus at most 1. Since the columns of $R'$ sum to 1, we have $\mathbf{e}^TR' = \mathbf{e}^T$, so $\det(R'-I)=0$ and the largest eigenvalue is $\lambda=1$. We want to find the dominant eigenvector of $R'$, which is the PageRank vector. To find this eigenvector, PageRank uses the power method.

As some of you may already have realized, finding the PageRank vector amounts to finding a stationary distribution (see Lecture 5, Lecture 6) of the random walk on the web graph. However, if the stationary distribution is not unique, we cannot get a well-defined PageRank vector. This problem is solved by the Perron–Frobenius theorem: if $R'$ is irreducible, the eigenvector corresponding to the eigenvalue 1 is unique.

Definition 1: A matrix $A$ is primitive if it is non-negative and its $k$th power is positive for some natural number $k$.

Definition 2: An $n \times n$ matrix $A$ is a reducible matrix if there exists a permutation matrix $P$ such that $P^TAP$ is block upper triangular. If a square matrix is not reducible, it is said to be an irreducible matrix.

For more background on primitive and irreducible matrices, see [6].

Problem 2: Show that every primitive matrix is irreducible.

We can construct a primitive matrix $G = mR' + (1-m)E$, where $0 \le m < 1$ and $E = {\mathbf{e}\mathbf{e}^T \over n}$. In practice, Google uses $m=0.85$ ([1], [7]). Since $G$ is primitive, hence irreducible, we can obtain the unique PageRank vector by the power method. We call this matrix $G$ the Google matrix. The constant $m$ is called the damping constant; it reflects the fact that a person who is randomly clicking on links will eventually stop clicking. In our example, $G$ is

$G= \begin{bmatrix}\frac{3}{100} & \frac{22}{25} & \frac{91}{200} & \frac{47}{150} & \frac{1}{5}\\ \frac{91}{200} & \frac{3}{100} & \frac{3}{100} & \frac{47}{150} & \frac{1}{5}\\ \frac{3}{100} & \frac{3}{100} & \frac{3}{100} & \frac{47}{150} & \frac{1}{5}\\ \frac{91}{200} & \frac{3}{100} & \frac{3}{100} & \frac{3}{100} & \frac{1}{5}\\ \frac{3}{100} & \frac{3}{100} & \frac{91}{200} & \frac{3}{100} & \frac{1}{5}\end{bmatrix}.$

when $m=0.85$.

Here, the unique largest eigenvalue $\lambda$ of $G$ is called the Perron root (for the Google matrix, $\lambda = 1$), and $\mathbf{x}$ is the Perron vector if $G\mathbf{x}=\lambda \mathbf{x}$ and $||\mathbf{x}||_1 = 1$. The fact that $G$ has a unique dominant eigenvalue implies that the PageRank vector is unique. (If you want to check convergence, see [4], p. 179.)

Finally, the power method on $G$ ([2]) recursively computes $\mathbf{x}_k = G\mathbf{x}_{k-1}$ and finds the limit of $\mathbf{x}_k$, say $\mathbf{x}$. This $\mathbf{x}$ is the PageRank vector. That is, the PageRank of page $u$ is $\mathbf{x}(u)$, where $\mathbf{x}$ is the Perron vector with $||\mathbf{x}||_1=1$. Note that $\mathbf{x}$ is a probability vector reflecting the probability of visiting each webpage.
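The whole pipeline above can be sketched as follows (a minimal implementation of the iteration $\mathbf{x}_k = G\mathbf{x}_{k-1}$ on the example; the function name and tolerance are my choices):

```python
import numpy as np

def pagerank(R_prime, m=0.85, tol=1e-12, max_iter=1000):
    """Power method on the Google matrix G = m R' + (1-m) ee^T / n."""
    n = R_prime.shape[0]
    G = m * R_prime + (1 - m) * np.ones((n, n)) / n
    x = np.full(n, 1.0 / n)          # start from the uniform probability vector
    for _ in range(max_iter):
        x_new = G @ x                # G is column-stochastic, so ||G x||_1 = 1
        if np.linalg.norm(x_new - x, 1) < tol:
            return x_new
        x = x_new
    return x

# R' for the 5-page example (dangling column already patched to uniform).
R_prime = np.array([
    [0,   1, 1/2, 1/3, 1/5],
    [1/2, 0, 0,   1/3, 1/5],
    [0,   0, 0,   1/3, 1/5],
    [1/2, 0, 0,   0,   1/5],
    [0,   0, 1/2, 0,   1/5],
])
x = pagerank(R_prime)
assert np.allclose(x.sum(), 1.0)     # a probability vector, fixed by G
```

Since the second-largest eigenvalue of $G$ has modulus at most $m$ ([7]), the iterates converge at a geometric rate regardless of the starting probability vector.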

References

[1] Amy Langville and Carl Meyer (2003). Fiddling with PageRank.

[2] Amy Langville, Carl Meyer (2004). The Use of the Linear Algebra by Web Search Engines. IMAGE Newsletter. 33.

[3] Amy Langville and Carl Meyer (2004). Deeper Inside PageRank. Internet Mathematics. 1. 10.1080/15427951.2004.10129091.

[4] Amy Langville and Carl Meyer (2006). Google’s PageRank and Beyond: The Science of Search Engine Rankings

[5] Larry Page, Sergey Brin, R. Motwani, and T. Winograd (1998). The PageRank Citation Ranking: Bringing Order to the Web.

[6] Roger Horn and Charles Johnson (1985). Matrix Analysis.

[7] Taher Haveliwala and Sepandar Kamvar (2003). The Second Eigenvalue of the Google Matrix.

## Lecture 15 – Spectral Theory of Hypergraphs

Definition 1. A hypergraph $\Gamma$ is a pair $\Gamma = (V,E)$ where $V$ is a finite set and $E\subset 2^V$ is a nonempty collection of subsets of $V$. $\Gamma$ is called $k$-uniform if
$E\subset{V\choose k}:=\{X\subset V:|X| = k\}$. $\Gamma$ is called a graph if it is 2-uniform.

Our goal for this lecture is to explore the rudiments of the spectral theory of $k$-uniform hypergraphs. The natural starting point, if only for inspiration, is the case of graphs. Letting $\Gamma = (V,E)$ be a graph, and identifying $V = \{1,2,\dotsc,n\}$, we can begin by writing down the adjacency matrix $A = (A_{ij})$ of $\Gamma$ defined by

$A_{ij}=\begin{cases} 1&\text{ if }i\sim j\\ 0&\text{ otherwise} \end{cases}, \hspace{1cm}1\leq i,j\leq n.$

where $i\sim j\iff \{i,j\}\in E$. In this case $i,j$ are said to be adjacent.

The spectral theory of $A$ is rich. For example, it made an appearance in our 202C class for the special case where $\Gamma$ is the Cayley graph of a group. Our main focus however will be the matrix $L = (L_{ij})$, defined by

$L_{ij}=\begin{cases} d_{i}&\text{ if }i=j\\ -1&\text{ if }i\sim j\\ 0&\text{ otherwise} \end{cases}.$

where the degree $d_i$ is the number of edges containing $i\in V$. If $D$ is the diagonal matrix of degrees, then we may also write $L = D-A$. This matrix $L$ is called the Laplacian or Kirchhoff matrix of $\Gamma$. The terminology arises from the fact that as an operator on $\mathbb{C}^V$, $L$ displays various properties canonically associated with the Laplacian on $\mathbb{R}^n$, e.g. the mean value property, the maximum principle, etc. A very interesting paper of Wardetzky, Mathur, Kälberer, and Grinspun [6] explores these connections in detail, but they are outside the scope of this lecture.

A few facts are worth collecting here. First, $L$ is a symmetric, positive semidefinite matrix in $M_n$ and hence admits nonnegative eigenvalues $0=\lambda_0\leq\lambda_1\leq\cdots\leq\lambda_{n-1}$. $L$ always has zero as an eigenvalue, witnessed for example by the vector $\mathbf{1} = (1,\dotsc,1)^T$, but the multiplicity of this eigenvalue can be higher, depending on the number of connected components (Exercise 1). When $\Gamma$ is connected (i.e. contains a single connected component), it follows that $\lambda_1>0$. Much of the literature is concerned with the interplay between geometric and combinatorial properties of $\Gamma$ (things like its size, connectedness, density, or sparseness) and the first eigenvalue $\lambda_1$ (either of $L$ or of one of its many variants). For example…

Exercise 1.
(a) Show that the geometric multiplicity of the null eigenvalue of $L$ is the number of connected components of $\Gamma$.
(b) Show $\lambda_1\leq n,$ with equality if and only if $\Gamma$ is the complete graph. Hint: if $\Gamma$ is not complete, relate its Laplacian and that of its “complement” (suitably defined) to the Laplacian of the complete graph, and use eigenvalue interlacing.

For a more detailed introduction to spectral graph theory, see Fan Chung’s book[1, Ch. 1].
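The statement in Exercise 1(a) is easy to probe numerically: build the Laplacian of a small disconnected graph and count the zero eigenvalues. A minimal sketch (the helper `laplacian` and the five-vertex example are mine):

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A from an edge list on vertices 0..n-1."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

# Two components: a triangle {0,1,2} and a single edge {3,4}.
L = laplacian(5, [(0, 1), (1, 2), (0, 2), (3, 4)])
eigs = np.linalg.eigvalsh(L)           # L is symmetric PSD, so eigs >= 0
num_zero = int(np.sum(np.abs(eigs) < 1e-9))
assert num_zero == 2   # multiplicity of 0 = number of connected components
```

Of course this only illustrates the claim on one example; the exercise asks for a proof.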

Our next goal is to try and take these notions and structures and construct extensions, or analogues, for the case where $\Gamma$ is a $k$-uniform hypergraph. As we shall see, this is not so easy. Consider, if you desire, the following perspective from simplicial homology (those unfamiliar or otherwise uninterested may skip to Definition 2).

One approach would be to try and construct $L$ not as a specific matrix, but as a descendant of an invariant of the topology of the underlying graph; which we could then hope admits a generalization to the case where the graph is instead a hypergraph. Specifically, the idea is to look at the homology of $\Gamma$ and build a Laplacian from its boundary operator- as follows.

Let $\Gamma$ be a graph and enumerate $E = \{e_1,\dotsc,e_m\}$. To each unordered pair $e_j=\{v,w\}\in E$ pick an orientation $(v,w)$ or $(w,v)$ and let $\widetilde{E}$ be the set of all such oriented edges. Let $S = (S_{v,e}) \in M_{n\times m}$ be the matrix whose rows are indexed by $v\in V$ and whose columns are indexed by the ordered edges $e\in\widetilde{E}$, defined by

$S_{v,e} = \begin{cases} 1&\text{if }e = (v,\cdot)\\ -1&\text{ if }e=(\cdot,v)\\ 0&\text{ otherwise} \end{cases}.$

It is left as an (unassigned) exercise to verify that $L=SS^T$; and in fact that the choice of orientations/signs in the preceding setup is completely arbitrary. Thinking of $S$ as a homomorphism mapping elements of the module $\mathbb{Z} E$ to the module $\mathbb{Z} V$ by matrix multiplication, we recognize $S$ as the boundary operator that is related to the homology groups of $\Gamma$ as a simplicial complex (see [3, Section 2.1, page 105]) and thus we have constructed $L$ as an invariant of the topology of $\Gamma$ (up to choices in the orientations of the edges).

But as it turns out, the capricious choice of orientation we made earlier to define $S$ is what makes graphs special. As soon as we allow 3-edges, or 1-edges for that matter, it is not at all obvious how to “orient” them, or how to distinguish between orientations (compare to the natural symbolic relationship $(v,w)=-(w,v)$). This, in fact, is a critically important distinction between graphs and hypergraphs. The answer is that we need more than linear operators, and so we look to tensors. The exposition below roughly follows two papers of Qi [4, 5].

Definition 2. A tensor $\mathcal{T}$ of order $m$ and dimension $n$ over a field $\mathbb{F}$ is a multidimensional array

$\mathcal{T} = (t_{i_1i_2\cdots i_m}),\hspace{1cm}1\leq i_1,\dotsc,i_m\leq n.$

where each entry $t_{i_1i_2\cdots i_m}\in\mathbb{F}$. We call $\mathcal{T}$ an $(m,n)$-tensor over $\mathbb{F}$.

We can think of an $(m,n)$-tensor as a hypercubic array of size $n\times n\times\cdots\times n$ ($m$ times). All of the tensors presented here are understood to be over $\mathbb{C}$. The case $(2,n)$ recovers the usual $n\times n$ square matrices. If $x\in\mathbb{C}^n$ then we define the product $\mathcal{T}x^{m-1}\in\mathbb{C}^n$ by

$(\mathcal{T}x^{m-1})_i = \sum_{i_2,\dotsc,i_m=1}^n t_{ii_2\cdots i_m}x_{i_2}\cdots x_{i_m},\hspace{1cm}1\leq i\leq n.$

(The expression $\mathcal{T}x^{m-1}\in\mathbb{C}^n$ is conventional notation.) Also as a matter of notation, for $x\in \mathbb{C}^n$ and $k\geq 0$, let $x^{[k]}\in\mathbb{C}^n$ be given by $(x^{[k]})_i = x_i^k$. There are lots of analogous notions and properties for tensors as compared to usual matrices; for example, we have the following formulation of the spectrum of a tensor.
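In code, the product $\mathcal{T}x^{m-1}$ is just a repeated contraction of the last index with $x$; a minimal NumPy sketch (the function name is mine):

```python
import numpy as np

def tensor_apply(T, x):
    """Compute T x^{m-1} for an (m, n)-tensor T: contract i_2, ..., i_m with x."""
    out = T
    for _ in range(T.ndim - 1):
        out = out @ x               # np.matmul contracts the last axis with x
    return out

# Sanity checks: for m = 2 this is the usual matrix-vector product,
# and for m = 3 it matches the defining double sum.
A = np.arange(4.0).reshape(2, 2)
x = np.array([1.0, 2.0])
assert np.allclose(tensor_apply(A, x), A @ x)

T = np.arange(8.0).reshape(2, 2, 2)
direct = [sum(T[i, j, k] * x[j] * x[k] for j in range(2) for k in range(2))
          for i in range(2)]
assert np.allclose(tensor_apply(T, x), direct)
```

Each contraction removes one axis, so after $m-1$ of them an $(m,n)$-tensor yields a vector in $\mathbb{C}^n$, as in the displayed formula.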

Definition 3. Let $\cal{T}$ be an $(m,n)$-tensor. Then we say $\lambda\in\mathbb{C}$ is an eigenvalue of $\cal{T}$ if there exists a nonzero vector $x\in\mathbb{C}^n$, called an eigenvector of $\mathcal{T}$, which satisfies $\mathcal{T} x^{m-1} = \lambda x^{[m-1]}.$

Notice that for $k\geq 3$, no $(k,n)$-tensor $\mathcal{T}$ defines a true linear operator on $\mathbb{C}^n$, and hence we may not a priori speak of eigenspaces, kernels, etc. and their linear-algebraic structure. (These notions do have meaning, but they require the tools of algebraic geometry to properly formulate, and are outside of our scope.) Nevertheless, we can construct tensors associated to $k$-uniform hypergraphs that give rise to various eigenvalues, and ask: can we find relationships between the geometry of a hypergraph and its eigenvalues?

Definition 4. Let $\Gamma=(V,E)$ be a $k$-uniform hypergraph on $n$ vertices. The adjacency tensor of $\Gamma$ is the $(k,n)$-tensor $\mathcal{A} = (a_{i_1\cdots i_k})$ defined as follows. For each edge $e = \{i_1,\dotsc,i_k\}\in E$,

$a_{i_1i_2\cdots i_k} = \frac{1}{(k-1)!},$

and likewise for any rearrangement $(i_1',\dotsc,i_k')$ of the vertices in $e$. The values of $\mathcal{A}$ are defined to be zero otherwise. The degree tensor of $\Gamma$ is the $(k,n)$-tensor $\mathcal{D}=(d_{i_1\cdots i_k})$ defined by

$d_{i\cdots i} = d_i,$

where $d_i$ is the number of edges containing $i$, and the values of $\mathcal{D}$ are defined to be zero otherwise. The Laplacian tensor is the $(k,n)$ tensor $\mathcal{L}= (l_{i_1\cdots i_k})$ defined by $\mathcal{L} = \mathcal{D}-\mathcal{A}$.

A brief remark: our choice to focus on $k$-uniform hypergraphs is merely to simplify the exposition. There are well-developed versions of Definition 4 for general hypergraphs [2], but they are harder to write down and explore in this setting. With these fundamental notions established, the goal for the remainder will be to write down and explore a few basic propositions regarding the eigenvalues and eigenvectors of the tensors of a hypergraph $\Gamma$.

Proposition 1. Let $\Gamma$ be a $k$-uniform hypergraph, with $k\geq 3$, on $n$ vertices and $\mathcal{L}$ its Laplacian tensor.

• Let $\mathbf{e}_j\in\mathbb{C}^n$ denote the $j$-th standard basis vector. Then $\mathbf{e}_j$ is an eigenvector of $\mathcal{L}$ with eigenvalue $d_j$.
• The vector $\mathbf{1}$ occurs as an eigenvector of $\mathcal{L}$ with eigenvalue 0.

Proof. This is an exercise in calculations with tensors. For the first claim let $1\leq j\leq n$ be fixed, and notice that for any $m\geq 1$, $\mathbf{e}_j^{[m]} = \mathbf{e}_j$ and hence for each $1\leq i\leq n$, it holds

$(\mathcal{L}\mathbf{e}_j^{k-1})_i = \sum_{i_2,\dotsc,i_k=1}^n l_{ii_2\cdots i_k}(\mathbf{e}_j)_{i_2}\cdots(\mathbf{e}_j)_{i_k} = l_{ij\cdots j} = d_{ij\cdots j} - a_{ij\cdots j}.$

Notice that $a_{i_1\cdots i_k} = 0$ if any of the $i_k$‘s are repeated (so $a_{ij\cdots j}=0$, since $k\geq 3$), and that $d_{ij\cdots j} = d_j$ if $i=j$ and 0 otherwise. It follows that $\mathcal{L}\mathbf{e}_j^{k-1} = d_j\mathbf{e}_j^{[k-1]}$. For the second claim, again we have $\mathbf{1}^{[m]} = \mathbf{1}$ for any $m$, and hence

$(\mathcal{L}\mathbf{1}^{k-1})_i = \sum_{i_2,\dotsc,i_k=1}^n l_{ii_2\cdots i_k} = d_i - \sum_{i_2,\dotsc,i_k=1}^n a_{ii_2\cdots i_k} = d_i - \sum_{\substack{e\in E \\ i\in e}}\ \sum_{\substack{(i_2,\dotsc,i_k)\\ e = \{i,i_2,\dotsc,i_k\}}} \frac{1}{(k-1)!} = d_i - d_i = 0,$

where, with some abuse of notation in the second line, we emphasize that the sum contains all $(k-1)!$ permutations of the $k-1$ terminal points in each edge $e=\{i,i_2,\dotsc,i_k\}\in E$, each contributing $\frac{1}{(k-1)!}$. -QED.

Exercise 2. Let $\Gamma$ be a $k$-uniform hypergraph and $\mathcal{A}$ its adjacency tensor. Show that $\mathbf{e}_j$ occurs as an eigenvector of $\mathcal{A}$ with eigenvalue 0 for each $j$. Sanity check: why does this not imply that $\mathcal{A}=0$?
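The claims of Proposition 1 are also easy to check numerically. A sketch, assuming a small 3-uniform example (the helper names are mine):

```python
import numpy as np
from itertools import permutations
from math import factorial

def laplacian_tensor(n, edges):
    """Laplacian tensor L = D - A of a k-uniform hypergraph (Definition 4)."""
    k = len(edges[0])
    L = np.zeros((n,) * k)
    deg = np.zeros(n)
    for e in edges:
        for idx in permutations(e):          # all rearrangements of the edge
            L[idx] -= 1.0 / factorial(k - 1)
        for v in e:
            deg[v] += 1
    for v in range(n):
        L[(v,) * k] += deg[v]                # diagonal entries d_{i...i} = d_i
    return L, deg

def apply_tensor(L, x):
    """Compute L x^{k-1} by contracting the last k-1 indices with x."""
    out = L
    for _ in range(L.ndim - 1):
        out = out @ x
    return out

# A 3-uniform hypergraph on 4 vertices with edges {0,1,2} and {1,2,3}.
L, deg = laplacian_tensor(4, [(0, 1, 2), (1, 2, 3)])
e1 = np.eye(4)[1]
assert np.allclose(apply_tensor(L, e1), deg[1] * e1)   # eigenvalue d_1
assert np.allclose(apply_tensor(L, np.ones(4)), 0.0)   # eigenvalue 0
```

Here $\mathbf{e}_1^{[k-1]} = \mathbf{e}_1$ and $\mathbf{1}^{[k-1]} = \mathbf{1}$, so the two assertions are exactly the eigenvector equations of Proposition 1.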

Our next proposition is a statement about the distribution of eigenvalues.

Proposition 2. Let $\Gamma$ be a $k$-uniform hypergraph on $n$ vertices, and $\mathcal{L}$ its Laplacian tensor. Let $\Delta$ denote the maximum degree of $\Gamma$. Then all of the eigenvalues of $\mathcal{L}$ lie in the disk $\{\lambda\in\mathbb{C}:|\lambda-\Delta|\leq\Delta\}$.

Proof. ([4, Thm. 6]) Let $\lambda\in\mathbb{C}$ be an eigenvalue of $\mathcal{L}$, and let $x\in\mathbb{C}^n$ be a corresponding eigenvector. Pick an index $i$ at which $|x|$ is maximized, so that

$|x_i| = \max_{j=1,\dotsc,n}|x_j| \neq 0.$

Then by definition, it holds $\mathcal{L}x^{k-1} = \lambda x^{[k-1]}$, which in the $i$-th component reads

$\lambda x_i^{k-1} = \sum_{i_2,\dotsc,i_k=1}^n l_{i,i_2,\dotsc, i_k} x_{i_2}\cdots x_{i_k}.$

Removing the single term $i_2=\dotsb=i_k=i$ from the right-hand side, we get

$(\lambda - l_{i\cdots i})x_i^{k-1} = \sum_{(i_2,\dotsc,i_k)\neq(i,\dotsc,i)} l_{ii_2\cdots i_k}x_{i_2}\cdots x_{i_k}.$

But from the definition of $\mathcal{L}$ we recognize $l_{i\cdots i}$ as $d_i$, and hence by the choice of $i$, we can estimate

$|\lambda - d_i|\,|x_i|^{k-1} \leq \sum_{(i_2,\dotsc,i_k)\neq(i,\dotsc,i)} |l_{ii_2\cdots i_k}|\,|x_{i_2}|\cdots|x_{i_k}| \leq |x_i|^{k-1}\sum_{(i_2,\dotsc,i_k)\neq(i,\dotsc,i)} a_{ii_2\cdots i_k} = |x_i|^{k-1}\,d_i,$

where in the final equality we are using the definition of $\mathcal{L}$ and recalling the calculation in the proof of Proposition 1. We have hence proved that all eigenvalues $\lambda$ lie in a closed disk centered at $d_i\in\mathbb{C}$ of radius $d_i$; but this $i\in V$ depends on the unknown eigenvector $x\in\mathbb{C}^n$, so by observing that for any $j\in V$,

$\{\lambda\in\mathbb{C}:|\lambda-d_j|\leq d_j\}\subset \{\lambda\in\mathbb{C}:|\lambda-\Delta|\leq\Delta\}$

the claim follows. -QED

Exercise 3. Let $\Gamma$ be a $k$-uniform hypergraph on $n$ vertices and $\mathcal{A}$ its adjacency tensor. Show that all of the eigenvalues of $\mathcal{A}$ lie in the disk $\{\lambda\in\mathbb{C}:|\lambda|\leq\Delta\}$.

For our final proposition, we strive for an analogue of Exercise 1(a): that is, a relationship between the “size of the kernel” and the connected components of $\Gamma$. However, for $k$-uniform hypergraphs with $k\geq 3$, $\mathcal{L}$ is not a linear operator, and hence a kernel is not well-defined a priori. But as we saw in Proposition 1, the vector $\mathbf{1}$ still shows up as an eigenvector for the null eigenvalue, and hence we suspect that more can be said.

A few small preliminaries are in order. A vector $x\in\mathbb{C}^n$ is called a binary vector if each of its components $x_i$ is either 0 or 1 for $1\leq i\leq n$. The support of a vector $x\in\mathbb{C}^n$ is the set $\text{supp}(x) = \{1\leq i\leq n:x_i\neq 0\}$. If $\mathcal{T}$ is a $(k,n)$-tensor over $\mathbb{C}$, then an eigenvector $x\in\mathbb{C}^n$ associated to eigenvalue $\lambda\in\mathbb{C}$ is called a minimal binary eigenvector associated to $\lambda$ if $x$ is a binary vector, and if there does not exist another binary eigenvector $y\in\mathbb{C}^n$ with eigenvalue $\lambda$ for which $\text{supp}(y)$ is a proper subset of $\text{supp}(x)$.

Proposition 3. Let $\Gamma$ be a $k$-uniform hypergraph on $n$ vertices with Laplacian tensor $\mathcal{L}$. A binary vector $x\in\mathbb{C}^n$ is a minimal binary eigenvector associated to $\lambda = 0$ if and only if $\text{supp}(x)$ is the vertex set of a connected component of $\Gamma$.

Proof. We first observe, using the proof of Proposition 1, that a nonzero vector $x\in\mathbb{C}^n$ is an eigenvector associated to $\lambda=0$ if and only if, for each $1\leq i\leq n$, it holds

$d_i x_i^{k-1} = \sum_{\substack{e\in E \\ i\in e}} \prod_{j\in e\setminus\{i\}} x_j. \hspace{1cm} (1)$

($\implies$). Let $x\in\mathbb{C}^n$ be any minimal binary eigenvector associated to $\lambda=0$ and let $i\in\text{supp}(x)$. It follows from equation (1) that for any edge $\{i,i_2,\dotsc,i_k\}\in E$ it holds $i_2,\dotsc,i_k\in \text{supp}(x)$ (otherwise, there will not be enough terms in the sum to achieve $d_i$). Applying this observation inductively to adjacent vertices in $\text{supp}(x)$, we see that the vertices $i\in\text{supp}(x)$ form either the vertex set of one component of $\Gamma$, or the disjoint union of several. But the latter case is incompatible with minimality, since any binary vector supported on a single component will give rise to an eigenvector with eigenvalue $\lambda=0$ (check!).

($\impliedby$). Now assume simply that $x\in\mathbb{C}^n$ is a binary vector with support equal to a single connected component of $\Gamma$. A short calculation verifies that the components $x_i$ for $i\in\text{supp}(x)$ satisfy equation (1), and hence that $x$ is a binary eigenvector for $\mathcal{L}$ associated to eigenvalue $\lambda=0$. All that remains to be checked is that $x$ is minimal; to this end, suppose $y\in\mathbb{C}^n$ is another binary eigenvector, and that $\text{supp}(y)\subsetneq \text{supp}(x)$. Then there must exist some vertex $i\in\text{supp}(y)$ for which there exists some edge $e= \{i,i_2,\dots,i_k\}\in E$ and an index $j\in e$ which does not belong to $\text{supp}(y)$. But then,

$\sum_{\substack{e'\in E \\ i\in e'}} \prod_{j'\in e'\setminus\{i\}} y_{j'} \leq d_i - 1 < d_i = d_i y_i^{k-1},$

and equation (1) cannot hold for $y$, a contradiction. -QED

Corollary 1. A $k$-uniform hypergraph $\Gamma$ is connected if and only if the vector $\mathbf{1}$ is the unique minimal binary eigenvector with eigenvalue $\lambda=0$.

1. Fan Chung. Spectral Graph Theory. 1997.
2. Cunxiang Duan, Ligong Wang, Xihe Li, et al. Some properties of the signless Laplacian and normalized Laplacian tensors of general hypergraphs. Taiwanese Journal of Mathematics, 24(2). 2020.
3. Allen Hatcher. Algebraic topology. 2005.
4. Liqun Qi. Eigenvalues of a real supersymmetric tensor. Journal of Symbolic Computation, 40(6). 2005.
5. Liqun Qi. H+-eigenvalues of Laplacian and signless Laplacian tensors. Communications in Mathematical Sciences, 12(6). 2013.
6. Max Wardetzky, Saurabh Mathur, Felix Kälberer, and Eitan Grinspun. Discrete Laplace operators: no free lunch. In Symposium on Geometry Processing. 2007.

## Math 202C: Lecture 14

Let us begin with a very brief overview of symmetric function theory. Let $X=\{x_1,x_2,x_3,\dots\}$ be a countable set of commuting indeterminates, referred to as an alphabet. Given $r \in \mathbb{N},$ let $\mathrm{Fun}(r,\mathbb{N})$ denote the set of functions

$i \colon \{1,\dots,r\} \longrightarrow \mathbb{N}.$

Consider the generating functions

$e_r = \sum\limits_{\substack{i \in \mathrm{Fun}(r,\mathbb{N}) \\ i \text{ strictly increasing}}} \prod\limits_{q=1}^r x_{i(q)},$

$h_r = \sum\limits_{\substack{i \in \mathrm{Fun}(r,\mathbb{N}) \\ i \text{ weakly increasing}}} \prod\limits_{q=1}^r x_{i(q)},$

$p_r = \sum\limits_{\substack{i \in \mathrm{Fun}(r,\mathbb{N}) \\ i \text{ constant}}} \prod\limits_{q=1}^r x_{i(q)}.$

These are respectively known as the degree $r$ elementary, complete, and power sum symmetric functions over the alphabet $X$. Note that they are not actually functions, but rather elements of the algebra $\mathbb{C}[[x_1,x_2,x_3,\dots]]$ of formal power series over $X.$ The “fundamental theorem of symmetric function theory” asserts that these series all generate the same subalgebra of $\mathbb{C}[[x_1,x_2,x_3,\dots]]$.

Theorem (Newton): We have

$\mathbb{C}[e_1,e_2,e_3,\dots]=\mathbb{C}[h_1,h_2,h_3,\dots]=\mathbb{C}[p_1,p_2,p_3,\dots].$

The subalgebra of $\mathbb{C}[[x_1,x_2,x_3,\dots]]$ generated by any of the above three fundamental families is called the algebra of symmetric functions, and typically denoted $\Lambda$ or $\Lambda[X]$ if the alphabet needs to be specified. This algebra has a lot of internal structure which expresses itself in combinatorially interesting ways. To see some of this, you might try expressing the elementary symmetric functions in terms of the power sums, as Newton did.

Definition 1: A specialization of $\Lambda$ is an algebra homomorphism from $\Lambda$ to a commutative $\mathbb{C}$-algebra $\mathbf{A}.$

In practice, specializations typically occur as families of substitutions. Here is a very simple example. For any $n \in \mathbb{N},$ we have a specialization

$\Xi_n \colon \Lambda \to \mathbb{C}$

defined by

$\Xi_n(f) = f(\underbrace{1,\dots,1}_n,0,0,0,\dots),$

i.e. by substituting the numerical value $1$ for each of the variables $x_1,\dots,x_n$ (or any other specified set of $n$ variables), and substituting the numerical value $0$ for the rest. It is natural to identify the homomorphism $\Xi_n$ with the numerical multiset

$\{\{\underbrace{1,\dots,1}_n,0,0,0,\dots\}\}$

and write $f(\Xi_n)$ rather than $\Xi_n(f)$ to reflect the substitution.

Problem 1: Explicitly compute $e_r(\Xi_n),h_r(\Xi_n),p_r(\Xi_n).$

Another interesting numerical specialization arises from taking

$\Xi_n = \{\{1,\dots,n,0,0,0,\dots\}\}.$

For this numerical substitution we have

$p_r(\Xi_n) = 1^r + 2^r + \dots +n^r,$

which is a sum people have been thinking about since antiquity. In particular, there are various formulas for this sum, many of which involve the Bernoulli numbers. In order to evaluate $e_r(\Xi_n)$ and $h_r(\Xi_n),$ one may proceed rather differently as follows. Let $t$ be a new variable, and consider the generating functions

$E(t) = 1 + \sum\limits_{r=1}^\infty e_r t^r \quad\text{ and }\quad H(t) = 1 + \sum\limits_{r=1}^\infty h_rt^r,$

which are elements of $\Lambda[[t]],$ the algebra of formal power series in the single variable $t$, with coefficients in $\Lambda$. These generating functions may be explicitly evaluated without too much effort.

Problem 2: Show that

$E(t) = \prod\limits_{i=1}^\infty (1+tx_i) \quad\text{ and }\quad H(t) = \prod\limits_{i=1}^\infty \frac{1}{1-tx_i}.$

From Problem 2, we immediately get the identities

$1+\sum\limits_{r=1}^\infty e_r(\Xi_n)t^r = (1+t)(1+2t) \dots(1+nt)$

and

$1+\sum\limits_{r=1}^\infty h_r(\Xi_n)t^r = \frac{1}{(1-t)(1-2t) \dots(1-nt)},$

from which one can conclude that $e_r(\Xi_n)$ and $h_r(\Xi_n)$ are in fact Stirling numbers.
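This conclusion can be checked computationally. A sketch, assuming the standard identifications $e_r(\Xi_n)=c(n+1,\,n+1-r)$ (unsigned Stirling numbers of the first kind) and $h_r(\Xi_n)=S(n+r,\,n)$ (Stirling numbers of the second kind); the helper functions are mine:

```python
def elementary(vals, r):
    """e_r(vals): process one value at a time, updating e_j <- e_j + v*e_{j-1}."""
    e = [1] + [0] * r
    for v in vals:
        for j in range(r, 0, -1):      # backward: each value used at most once
            e[j] += v * e[j - 1]
    return e[r]

def complete(vals, r):
    """h_r(vals): same recursion, but forward, so each value may repeat."""
    h = [1] + [0] * r
    for v in vals:
        for j in range(1, r + 1):
            h[j] += v * h[j - 1]
    return h[r]

def stirling1(n, k):
    """Unsigned Stirling numbers of the first kind, c(n, k)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return (n - 1) * stirling1(n - 1, k) + stirling1(n - 1, k - 1)

def stirling2(n, k):
    """Stirling numbers of the second kind, S(n, k)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

n = 6
for r in range(1, n + 1):
    assert elementary(range(1, n + 1), r) == stirling1(n + 1, n + 1 - r)
    assert complete(range(1, n + 1), r) == stirling2(n + r, n)
```

The first identification is just the coefficient extraction $\prod_{i=1}^n(1+it) = \sum_r c(n+1,\,n+1-r)\,t^r$ applied to the displayed generating function.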

We can repackage the content of Lecture 13 as the study of a certain family of specializations

$\Lambda \longrightarrow \mathcal{Z}(n),$

where $\mathcal{Z}(n)$ is the center of the group algebra $\mathbb{C}\mathrm{S}(n).$ These specializations are defined using the substitution multiset

$\Xi_n =\{\{J_1,\dots,J_n,0,0,0,\dots\}\},$

where

$J_t = \sum\limits_{1 \leq s < t} (s\ t)$

are the Jucys-Murphy elements in $\mathbb{C}\mathrm{S}(n).$ The natural basis of $\mathcal{Z}(n)$ is the conjugacy class basis,

$\{C_\mu \colon \mu \vdash n\},$

so understanding the JM specialization of $\Lambda$ means being able to compute

$[C_\mu]f(\Xi_n),$

the coe