Lectures 5 and 6 of Math 202C give an application of the representation theory of the symmetric group as discussed in Math 202B together with the additional material on the YJM elements in the corresponding group algebra . This application is probabilistic in nature, and concerns the following question: how many times do you have to shuffle a deck of cards in order to randomize it?
In order to analyze this question rigorously, we need to model it mathematically. The first step is to say exactly what we mean by “shuffle.” We consider a deck of cards labeled laid face up on a table, arranged in a single row. Initially, this arrangement is from left to right. At each tick of the clock, we choose a random card, and swap its position with the rightmost card. This gives us a new arrangement of cards which, reading the labels from left to right, corresponds to a permutation in the symmetric group . This is called the “top-to-random” shuffle. Note that it is possible the randomly chosen card is in fact the rightmost (top) card, and in this case no shuffling occurs.
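To make the process concrete, here is a minimal Python sketch (my own illustration, not part of the notes) simulating the shuffle just described; the deck size, number of ticks, and function name are arbitrary choices.

```python
import random

def star_shuffle(n, steps, seed=None):
    """Simulate the shuffle described above: at each tick, choose a card
    uniformly at random and swap it with the rightmost card (choosing the
    rightmost card itself leaves the arrangement unchanged)."""
    rng = random.Random(seed)
    deck = list(range(1, n + 1))          # initial arrangement 1, 2, ..., n
    for _ in range(steps):
        i = rng.randrange(n)              # uniformly chosen position
        deck[i], deck[-1] = deck[-1], deck[i]
    return deck

if __name__ == "__main__":
    print(star_shuffle(8, 20, seed=0))    # one random arrangement of 8 cards
```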
The top-to-random shuffle is a stochastic process evolving in discrete time. It may equivalently be formulated as a random walk on the Cayley graph of the symmetric group generated by the set
The elements of are transpositions; they are called “star transpositions” because, if we picture as the edge set of a graph on , we get the star graph with hub vertex The simple random walk on is the following stochastic process: a particle is initially located at the identity permutation , and at each tick of the clock it jumps to a neighboring vertex, with equal probability of jumping in any direction. This can be encoded algebraically as follows. Let be the group algebra of , equipped with the scalar product in which the permutation basis is orthonormal. Identify the generating set with the sum of its elements,
which you will recognize from our previous lectures as the top YJM element in The probability that a particle on evolving by simple random walk is located at a given permutation at a given time is then equal to the normalized scalar product
This random walk is not equivalent to the top-to-random shuffle, but rather to a shuffling process in which one of the cards below the top card is exchanged with the top card. This is not a good way to shuffle cards, because randomness is never achieved; equivalently, the corresponding random walk is not ergodic. A random walk on a graph is said to be ergodic if, for all sufficiently large times, the particle has positive probability of being located at each vertex. This is not the case for simple random walk on — at even times, the probability the particle is located at an odd permutation is zero, and vice versa.
A shuffling process should have the property that as we get closer and closer to a uniform distribution, which algebraically is represented by the vector
More formally, let us introduce the -norm on corresponding to the permutation basis, which assigns to each the number
In probability theory, the corresponding distance is called total variation distance. The lazy simple random walk on is the modification of simple random walk in which, at each tick of the clock, the particle either jumps to an adjacent vertex or stays put, with each such event occurring with equal probability. This random walk is generated by the group algebra element , which means that the probability the particle is located at is
This random walk is ergodic, and it follows from the Perron-Frobenius theorem in linear algebra that
for each , so that perfect randomness is achieved in the limit of infinitely many shuffles.
Shuffling infinitely many times is not something that can be done in the real world. Therefore what we want is an effective version of the above, which tells us how close our deck is to randomized after shuffles have been performed. Algebraically, we want bounds on the -norm
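For a small deck one can compute the distribution after each shuffle exactly and watch the distance to uniform decay. The sketch below (my own illustration, with n = 5 cards, the lazy convention that each of the n outcomes, including "do nothing", has probability 1/n, and the common convention that total variation distance is half the ℓ¹ distance) does this by brute force over all n! permutations.

```python
import itertools
from fractions import Fraction

def tv_to_uniform(n, max_steps):
    """Exact total variation distance to the uniform distribution after
    t = 0, 1, ..., max_steps lazy star-transposition shuffles of n cards."""
    perms = list(itertools.permutations(range(n)))
    uniform = Fraction(1, len(perms))
    # generators: swap position i with the last position; i = n-1 is the identity
    gens = []
    for i in range(n):
        g = list(range(n))
        g[i], g[n - 1] = g[n - 1], g[i]
        gens.append(tuple(g))
    p = {s: Fraction(0) for s in perms}
    p[tuple(range(n))] = Fraction(1)      # start from the identity arrangement
    history = []
    for t in range(max_steps + 1):
        tv = sum(abs(v - uniform) for v in p.values()) / 2
        history.append((t, float(tv)))
        q = {s: Fraction(0) for s in perms}
        for s, v in p.items():
            for g in gens:
                # arrangement obtained by swapping the positions indicated by g
                q[tuple(s[g[j]] for j in range(n))] += v / n
        p = q
    return history

if __name__ == "__main__":
    for t, d in tv_to_uniform(5, 25):
        print(t, round(d, 4))
```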
This week's lectures, which take the form of a typeset document posted to Piazza in the resources section, explain how to do this using the representation theory of the symmetric group, in particular the spectral theory of the YJM elements which we have been discussing. A remarkable insight coming out of the analysis is that the top-to-random shuffle exhibits what is known as the cutoff phenomenon: for small values of , the total variation distance to uniform remains close to its initial value, and then suddenly crosses a -dependent cutoff time and becomes exponentially small, indicating that near-perfect randomness has been achieved in finite time.
In Lecture 3, we introduced the Young-Jucys-Murphy elements in the group algebra of the symmetric group . The YJM elements form a commuting family of elements in the group algebra of the symmetric group , and have the amazing feature that symmetric polynomials in always belong to the class algebra despite the non-centrality of the ‘s themselves. This is quite remarkable, as there is no obvious reason why an arbitrary symmetric polynomial should evaluate to a linear combination of conjugacy classes.
Since is central, Schur’s Lemma implies that it acts in any given irreducible representation of as a scalar operator. More precisely, if is the irreducible representation of corresponding to the partition (more precisely, a representative of the isomorphism class of all such representations), then the operator is of the form
where is the identity operator and is a scalar. Note that this scalar must in fact be real, because each , being a sum of transpositions, is a selfadjoint element of Indeed,
are commuting selfadjoint operators, and hence are simultaneously diagonalizable, i.e. admit a common eigenbasis. A natural question is then how to explicitly describe the eigenvalue This question has an amazing answer, which we now describe.
According to Schur’s Lemma, the class algebra may be described as the subalgebra of consisting of elements which act as scalar operators in all irreducible representations. Note that we are including here a converse to Schur’s Lemma, namely the statement that if is a scalar operator in for all , then ; this follows immediately from the fact that the map
defined by
is an algebra isomorphism (often referred to as the noncommutative Fourier transform). We can generalize this as follows.
As you know from Math 202B, the branching of irreducible -modules is multiplicity free. More precisely, we may consider the group algebra of the symmetric group as the subalgebra of spanned by permutations in satisfying . The isotypic decomposition of an irreducible -module as an -module is
where the direct sum is over all irreducible -modules such that the Young diagram can be obtained from the Young diagram by adding a single cell.
Example 1: For the “staircase” diagram the isotypic decomposition of the irreducible -module as an -module is
Iterating this construction, we obtain a decomposition of into a sum of dimensional spaces indexed by “growth sequences”
where is a diagram with -cells, which encode the possible life histories of the multicellular organism starting from the single-celled organism Obviously, each such growth history corresponds uniquely to a standard Young tableau of shape — the entry of a given cell encodes the time instant at which that cell was added. Thus, multiplicity-free branching gives us a basis of each irreducible representation canonically determined up to scaling; this basis is known as the Young basis (or sometimes the Gelfand-Tsetlin basis). By Schur’s lemma, the subalgebra
coincides with the maximal commutative subalgebra of of elements which act diagonally on the Young basis of every irreducible representation ; this algebra is known as the Young subalgebra (or Gelfand-Tsetlin subalgebra) of the group algebra , and its construction depends crucially on the fact that we have a natural inclusion . Observe that the YJM elements belong to the algebra (explaining why is a conceptually optimal way to answer one of the homework problems). The following Theorem is an enormously useful tool, both theoretically and computationally.
Theorem 1 (Diaconis-Greene, Murphy, Okounkov-Vershik): The commutative algebra is exactly the algebra of polynomials in the YJM elements . Moreover, for each irreducible representation of , the action of on the corresponding Young basis is given by
where is the content (i.e. column index minus row index) of the cell in containing the number .
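As a concrete illustration of Theorem 1 (a sketch of my own, not from the notes): given a standard Young tableau, the eigenvalue of the k-th YJM element on the associated Young basis vector is simply the content of the cell containing k, which the helper below reads off. The staircase-shape tableau in the example is an arbitrary choice.

```python
def yjm_eigenvalues(tableau):
    """Given a standard Young tableau (a list of rows), return the list of
    contents c_1, ..., c_n, where c_k = (column index) - (row index) of the
    cell containing k.  By Theorem 1, c_k is the eigenvalue of the k-th YJM
    element acting on the corresponding Young basis vector."""
    content = {}
    for r, row in enumerate(tableau):
        for c, entry in enumerate(row):
            content[entry] = c - r
    return [content[k] for k in sorted(content)]

if __name__ == "__main__":
    # a standard tableau of staircase shape (3, 2, 1), filled row by row
    T = [[1, 2, 3],
         [4, 5],
         [6]]
    print(yjm_eigenvalues(T))   # [0, 1, 2, -1, 0, -2]
```

By Corollary 1 below, evaluating any symmetric polynomial on this list of contents gives the scalar by which that polynomial in the YJM elements acts in the representation.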
Example 2: Consider the group algebra of the symmetric group of degree acting in the irreducible representation labeled by the staircase partition and let be the Young basis vector corresponding to the standard tableau
Theorem 1 says that is an eigenvector of each of the operators and that the corresponding eigenvalues are, respectively,
Corollary 1: For any and any , we have
where the right hand side denotes the evaluation of the symmetric polynomial on the numbers , with being an arbitrary enumeration of the cells of
Example 3: Corollary 1 tells us that the eigenvalue of the symmetric polynomial
acting in is
We will black-box Theorem 1, and (possibly) return to its proof later, though I will post a resource where you can look it up if you wish. Our goal will be to mine some rather remarkable information from Theorem 1. There are two natural directions in which one might want to go.
First, there is the question of whether evaluation on the YJM elements in fact defines a surjective morphism
If this were so, then for each there would exist a symmetric polynomial such that
i.e. we could express each conjugacy class as a symmetric polynomial function of the YJM elements. This would in turn have the following character-theoretic consequence. On one hand, the eigenvalue of the conjugacy class acting in the irreducible representation is given by
where as always the character is by definition the trace of the operator , with an arbitrary permutation in the conjugacy class Indeed, on one hand we have
since the trace of a scalar operator is simply its unique eigenvalue summed dimension-many times, and on the other hand
Corollary 1 would then give us the formula
which in turn gives us a formula for the irreducible character as a function of the content alphabet of the Young diagram . In fact, this is all true, and is a consequence of combining Corollary 1 with the following classical theorem on the class algebra.
Theorem 2 (Farahat-Higman): Every conjugacy class can be expressed as a polynomial in the levels of the Cayley graph of
A good treatment of this approach to the irreducible characters of the symmetric groups has been given by our colleague Adriano Garsia, and I will post his lecture notes to our Piazza page. There is an inescapable downside to this seemingly miraculous result: the symmetric polynomials are not given by any simple formula. This is in a sense inevitable, as the computation of the irreducible characters of the symmetric groups is a -complete problem, and consequently the quest for an explicit closed form is quixotic.
Instead, one can take the opposite approach, and start from the fact that there are many interesting (symmetric) polynomials which we can write down explicitly, and Theorem 1 gives us a way to explicitly calculate the action of in irreducible representations. It is this direction that we shall follow in the coming lectures.
In Lecture 2, we established the following unique factorization result for the symmetric group
Theorem 1 (Jucys): Every permutation admits a unique factorization
into transpositions , such that
Moreover, the number of factors in this decomposition is
We now consider some algebraic ramifications of Jucys’ result. Let denote the group algebra of the symmetric group. As you know from Math 202B, this is the -dimensional Hilbert space in which sits as an orthonormal basis, and which supports the bilinear product inherited from the group law on . In fact, is a -algebra, with antilinear involution given by All of this is true for arbitrary groups, and indeed you can view as a functor from the category of groups to the category of unital associative -algebras which transports each group to its corresponding group algebra
Any subset of may be identified with a vector in simply by identifying it with the sum of its elements, and any family of disjoint subsets of is linearly independent, since any two of its members are orthogonal in In particular, for each we may view the corresponding conjugacy class as an element of , the sum of all permutations of cycle type and consider the linearly independent set in the group algebra,
Let denote the subspace of spanned by the conjugacy classes, the dimension of which is the number of partitions of . Given , we have
where
The number in fact depends only on the conjugacy class of , for if then for any . Thus is a subalgebra of called the class algebra, which is a particularly appropriate name for something encountered in an algebra class. We thus have
where
The numbers are called the connection coefficients (aka structure constants) of the class algebra ; they are the entries of the multiplication table for conjugacy classes. More generally, we can take arbitrary polynomials in conjugacy classes, and write them as linear combinations of conjugacy classes, and this often has enumerative meaning. For example, consider the class of transpositions. For any , we have
and each term in the sum corresponds uniquely to an -step walk on the Cayley graph of beginning at Thus,
which says that the combinatorial problem of counting -step walks on the graph from the identity permutation to a permutation of cycle type is the same thing as the algebraic problem of resolving into a linear combination of conjugacy classes.
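This equivalence between walk counting and resolving powers of the class of transpositions into conjugacy classes is easy to check by machine for a small symmetric group. The following sketch (an illustration of my own, with n = 4 and r = 4 chosen arbitrarily) expands the r-th power of the sum of all transpositions in the group algebra and confirms that the coefficient of a permutation, i.e. the number of r-step walks from the identity to it, depends only on its cycle type.

```python
import itertools
from collections import defaultdict

n, r = 4, 4                                  # illustrative choices

def compose(s, t):                           # (s t)(j) = s[t[j]]
    return tuple(s[t[j]] for j in range(n))

def cycle_type(s):
    seen, parts = set(), []
    for i in range(n):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = s[j]
            length += 1
        parts.append(length)
    return tuple(sorted(parts, reverse=True))

perms = list(itertools.permutations(range(n)))
transpositions = [s for s in perms if cycle_type(s) == (2,) + (1,) * (n - 2)]

# expand (sum of all transpositions)^r in the permutation basis
coeff = defaultdict(int)
coeff[tuple(range(n))] = 1
for _ in range(r):
    new = defaultdict(int)
    for s, c in coeff.items():
        for t in transpositions:
            new[compose(s, t)] += c
    coeff = new

# the coefficient of sigma = number of r-step walks from the identity to sigma;
# it should be constant on each conjugacy class (odd classes simply don't appear)
walks_by_type = defaultdict(set)
for s in perms:
    walks_by_type[cycle_type(s)].add(coeff.get(s, 0))
for ct, values in sorted(walks_by_type.items()):
    print(ct, values)
```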
For any , we have
where both sums are over , and the two sums are in fact the same since Thus, the class algebra is a subalgebra of the center of the group algebra. Conversely, if is a central element, then whence is a linear combination of conjugacy classes, and we conclude that the class algebra coincides with the center of the group algebra Again, all of the above is true for any group, and you can view as a functor from groups to commutative, unital -algebras which transports each group to its class algebra
In addition to the conjugacy classes themselves, the levels of the Cayley graph of belong to the class algebra, and are -linear combinations of conjugacy classes,
Lifted to the group algebra, Theorem 1 says that these central elements are given by the sums
where is the set of functions from to Equivalently, we have
where are the sums
Note that is an empty sum, hence equal to the zero element in A nice way to picture these sums is to take the conjugacy class of transpositions and view it as the strictly upper triangular matrix over whose entry in row and column is the transposition whenever and when not. Then is simply the th column sum of the matrix . For example, in the case we have
and
The transposition sums are known as the Young-Jucys-Murphy elements of the group algebra . Note that the YJM elements are non-central, since they correspond to proper subsets of a conjugacy class (namely, the class ), and therefore cannot be expressed as linear combinations of conjugacy classes. Nevertheless, we have the following.
Proposition 2: The YJM elements commute with one another.
Proof: Problem set 2.
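Although the proof is left to the problem set, the commutativity is easy to verify by direct computation for a small symmetric group. The sketch below (my own illustration, with n = 4 an arbitrary choice) builds the YJM elements as formal linear combinations of permutations and checks that all pairs commute; it is a numerical check, not a proof.

```python
import itertools
from collections import defaultdict

n = 4

def compose(s, t):
    return tuple(s[t[j]] for j in range(n))

def transposition(a, b):
    g = list(range(n))
    g[a], g[b] = g[b], g[a]
    return tuple(g)

def yjm(k):
    """The k-th YJM element (1 k) + (2 k) + ... + (k-1 k), k = 1, ..., n,
    stored as a dict {permutation: coefficient}; the first one is the empty sum."""
    x = defaultdict(int)
    for a in range(k - 1):
        x[transposition(a, k - 1)] += 1
    return x

def multiply(x, y):
    z = defaultdict(int)
    for s, c in x.items():
        for t, d in y.items():
            z[compose(s, t)] += c * d
    return z

X = [yjm(k) for k in range(1, n + 1)]
print(all(multiply(X[i], X[j]) == multiply(X[j], X[i])
          for i in range(n) for j in range(n)))   # True
```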
In view of the commutativity of the elements , we may equivalently write Theorem 1 as
which says that is the sum of all squarefree monomials of degree in the commuting elements . But this is a polynomial you know from Math 202B: it is the elementary symmetric polynomial of degree in variables. In particular, Theorem 1 is equivalent to the following polynomial identity in the class algebra .
Theorem 2:
As you know from Math 202B, the elementary symmetric polynomials are important because they express the coefficients of a polynomial as functions of its roots (the inverse problem, expressing the roots as functions of the coefficients, is harder). Thus Theorem 2 is equivalent to the following factorization in the class algebra.
Theorem 3: For any , we have
In addition to being a direct incarnation of the relationship between the roots of a polynomial and its coefficients, there is another reason why the elementary symmetric polynomials are deserving of their name. Let be commuting indeterminates, and consider the corresponding polynomial algebra As you know from Math 202B, polynomials satisfying
are called symmetric polynomials, and their totality forms a subalgebra of denoted This subalgebra is itself a polynomial algebra.
Theorem 3 (Newton): We have
Theorem 3 is often referred to as the fundamental theorem on symmetric polynomials. A proof is given in Professor Sam’s 202B notes (Theorem 3.3.4). Combining Theorem 3 with Theorem 2, we learn the following: even though the JM elements do not themselves belong to the class algebra , any symmetric function of them does.
Theorem 4: For any the group algebra element is central.
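As a quick sanity check of Theorem 4 (again an illustration of my own for n = 4, not a proof), one can expand the second elementary symmetric polynomial in the YJM elements inside the group algebra and verify that its coefficients are constant on conjugacy classes, whereas an individual YJM element is visibly non-central.

```python
import itertools
from collections import defaultdict

n = 4
perms = list(itertools.permutations(range(n)))

def compose(s, t):
    return tuple(s[t[j]] for j in range(n))

def transposition(a, b):
    g = list(range(n))
    g[a], g[b] = g[b], g[a]
    return tuple(g)

def cycle_type(s):
    seen, parts = set(), []
    for i in range(n):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = s[j]
            length += 1
        parts.append(length)
    return tuple(sorted(parts, reverse=True))

def multiply(x, y):
    z = defaultdict(int)
    for s, c in x.items():
        for t, d in y.items():
            z[compose(s, t)] += c * d
    return z

def is_central(x):
    """An element is central iff its coefficients are constant on conjugacy classes."""
    values = defaultdict(set)
    for s in perms:
        values[cycle_type(s)].add(x.get(s, 0))
    return all(len(v) == 1 for v in values.values())

# the YJM elements
X = []
for k in range(1, n + 1):
    x = defaultdict(int)
    for a in range(k - 1):
        x[transposition(a, k - 1)] += 1
    X.append(x)

# the second elementary symmetric polynomial evaluated on the YJM elements
e2 = defaultdict(int)
for i in range(n):
    for j in range(i + 1, n):
        for s, c in multiply(X[i], X[j]).items():
            e2[s] += c

print(is_central(X[2]))   # False: a single YJM element is not central
print(is_central(e2))     # True: a symmetric polynomial in them is
```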
Theorem 4 is quite remarkable, and has many interesting consequences. Here is one. Suppose we weaken our definition of strictly monotone walks on the edge-labeled Cayley graph of the symmetric group to just plain monotone walks, meaning walks which have the feature that the labels of the edges they traverse form a weakly increasing sequence.
Theorem 5: The number depends only on the conjugacy class of .
Proof: The complete homogeneous symmetric polynomial of degree in commuting variables is the sum of all monomials of degree in these variables (hence the name “complete”). Equivalently,
Evaluating on the YJM elements, we obtain
which is precisely the sum of all weakly monotone walks on beginning at Thus,
But since is a symmetric polynomial, Theorem 4 guarantees that , so that we must have for all as claimed.
— Q.E.D.
No direct combinatorial proof of Theorem 5 is known.
Exercise: In the symmetric group , write down all factorizations of the full cycle into three transpositions; identify the factorizations which are weakly monotone, and those which are strictly monotone. Repeat this procedure with the full cycle
Before leaving the world of linear algebra for multilinear algebra (i.e. tensors), let us reformulate the spectral theorem entirely “upstairs” in (as in Lecture 1, is complex vector space of dimension equipped with a scalar product).
Theorem 1: If is selfadjoint, then there is a positive integer real numbers and selfadjoint operators in satisfying
and
such that
The operators above are called orthogonal idempotents. The “idempotent” part is , and this term is applied more generally in ring theory to refer to anything equal to its own square. The “orthogonal” part is , the zero operator in , when This is a useful representation of , because orthogonal idempotence immediately implies that
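Numerically, the orthogonal-idempotent form of the Spectral Theorem is easy to see with numpy. The sketch below (my own illustration on a random selfadjoint matrix) builds the spectral projections and checks idempotence, mutual orthogonality, the resolution of the identity, and the spectral representation.

```python
import numpy as np

def spectral_decomposition(A, tol=1e-9):
    """Return the distinct eigenvalues of a Hermitian matrix A together with
    the orthogonal projections onto the corresponding eigenspaces."""
    eigvals, eigvecs = np.linalg.eigh(A)
    projections = []
    i = 0
    while i < len(eigvals):
        j = i
        while j < len(eigvals) and abs(eigvals[j] - eigvals[i]) < tol:
            j += 1
        V = eigvecs[:, i:j]                     # orthonormal basis of one eigenspace
        projections.append((eigvals[i], V @ V.conj().T))
        i = j
    return projections

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    A = B + B.conj().T                          # a random selfadjoint operator
    decomp = spectral_decomposition(A)
    # resolution of the identity and the spectral representation of A
    print(np.allclose(sum(P for _, P in decomp), np.eye(4)))
    print(np.allclose(sum(lam * P for lam, P in decomp), A))
    # idempotence and mutual orthogonality
    print(all(np.allclose(P @ P, P) for _, P in decomp))
    print(all(np.allclose(decomp[i][1] @ decomp[j][1], 0)
              for i in range(len(decomp)) for j in range(len(decomp)) if i != j))
```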
Given your Math 202A background, you have realized immediately that Theorem 1 is equivalent to the eigenvalue/eigenvector form of the Spectral Theorem discussed in Lecture 1, because the set of selfadjoint idempotents acting in is in canonical bijection with the lattice of subspaces of , since these are exactly the operators which act by orthogonal projection onto a subspace: for any , the corresponding maps to the nearest point of the subspace . The correspondence between selfadjoint idempotents — which are often simply called orthogonal projections — and subspaces even holds in the infinite-dimensional setting provided is complete with respect to the norm , i.e. is a Hilbert space, provided we restrict to closed subspaces. Thus the Grassmannian is for all practical purposes equivalent to the set of selfadjoint idempotents/orthogonal projections of rank . Theorem 1 above is obtained from the eigenvalue/eigenvector form of the Spectral Theorem by replacing the eigenspaces of with the orthogonal projections onto those spaces, and conversely Theorem 1 gives the eigenvalue/eigenvector form of the Spectral Theorem with being the distinct eigenvalues of whose corresponding eigenspaces are the images of the orthogonal idempotents .
Let us close out this brief recap of eigenvectors and eigenvalues (i.e. Math 202A material) by recalling the algebraic criterion for diagonalizability. An operator is said to be normal if it commutes with its adjoint, . Obviously, selfadjoint operators are normal, , and so are unitary operators since .
Theorem 2: Given , there exists an orthonormal basis of such that for each , where are scalars, if and only if is normal. Equivalently, is a -linear combination of pairwise orthogonal selfadjoint idempotents if and only if .
After revisiting this essential piece of 202A technology, my plan was to jump into multilinear algebra — tensor products, exterior products, symmetric products, and their applications to eigenvalue inequalities — and work up to the construction of Schur functors, which interpolate between these using the representation theory of the symmetric groups. This would be an application of the material you learned in 202B wherein the goal is to enhance the material you learned in Math 202A. However, I have since changed my mind. After reviewing Professor Sam’s 202B lecture notes (which I am posting to Piazza for your reference), and speaking briefly with Steven, I have realized that you already have some exposure to this material. Even though there is much left to be done in this direction, namely starting from Section 6 in Steven’s notes and working up to a proof of Theorem 6.4.1 via Schur-Weyl duality, I have ultimately decided to leave this for the second half of the course, and begin instead with some very natural and appealing applications of symmetric group representations which we can sink our teeth into immediately. Also, this will give you exposure to material which cannot be looked up in textbooks, or at least cannot be found presented in the way we’ll do it.
Definition 1: For each , the symmetric group of degree consists of the set of bijective functions
i.e. automorphisms of , with the group law being composition of functions. Elements of are called permutations.
As you know from Math 202B (and likely much before), any permutation can be factored into a product of disjoint cyclic permutations, and this factorization is unique up to the order of the factors. Our starting point will be a rather different unique factorization theorem for permutations. To explain, let us consider from the perspective of geometric group theory. Let denote the set of transpositions, i.e. is the conjugacy class of -cycles,
As you are also no doubt aware, generates , which means that any permutation can be written as a product of transpositions,
This representation is very far from being unique, and indeed the question of counting the number of factorizations of a given permutation into a given number of transpositions is a classical problem in algebraic combinatorics known as the Hurwitz enumeration problem; despite its elementary formulation, this question has deep ties to enumerative algebraic geometry and mathematical physics, some of which we will hopefully be able to discuss. Nevertheless, for each there is a minimal value of for which such a factorization exists, and this is called the word norm of and denoted , where the subscript reminds us that this quantity depends on the choice of generating set . We will omit this subscript in order to lighten the notation, and if we switch to a different generating set inducing a different word norm on , we will say so loud and clear. The following is elementary combinatorics, but you should take the time to prove it.
Proposition 1: The length of any factorization of a permutation into transpositions satisfies for some . That is, all factorizations of into transpositions have the same parity.
Somewhat less elementary but no less important is the fact that there is a tight connection between the word norm of a permutation and the number of factors in its unique decomposition into disjoint cycles.
Proposition 2 (join-cut relation): We have
The key idea in the proof of Proposition 2 is expressed in its name: multiplying a permutation by a transposition will either join two of its cycles into one, or cut one of its cycles into two. Thus,
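For reference, the join-cut relation gives an immediate way to compute the word norm in code. A small illustration of my own (permutations encoded as tuples, with sigma[i] the image of i):

```python
def cycle_count(sigma):
    """Number of cycles (including fixed points) of a permutation given as a
    tuple with sigma[i] = image of i."""
    seen, cycles = set(), 0
    for i in range(len(sigma)):
        if i in seen:
            continue
        cycles += 1
        j = i
        while j not in seen:
            seen.add(j)
            j = sigma[j]
    return cycles

def word_norm(sigma):
    """Minimal number of transpositions needed to write sigma, via the
    join-cut relation: word norm = n - (number of cycles)."""
    return len(sigma) - cycle_count(sigma)

if __name__ == "__main__":
    print(word_norm((1, 2, 3, 0)))   # a 4-cycle: norm 3
    print(word_norm((1, 0, 3, 2)))   # product of two disjoint transpositions: norm 2
    print(word_norm((0, 1, 2, 3)))   # the identity: norm 0
```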
The word norm makes the symmetric group into a metric space.
Proposition 3: The function
defined by is a metric.
We may thus for example speak of the open ball of radius centered at a given permutation ,
and its boundary, the sphere
The metric is nothing but the graph theory distance (or geodesic distance) on the Cayley graph of the symmetric group as generated by the conjugacy class of transpositions. The vertex set of this graph is itself, and two permutations are joined by an edge if and only if there is such that Note that this is an undirected graph because is the identity permutation for every so that is equivalent to Note also that is a regular graph of degree Another feature of is that it is a graded graph: it decomposes into a disjoint union of independent sets or “levels,”
with being the sphere of radius centered at the identity permutation . By the join-cut relation, may also be described as the set of permutations whose disjoint cycle decomposition contains factors.
The Hurwitz enumeration problem has an equivalent formulation in terms of the Cayley graph: it asks for the number of -step walks from to on This description also points to a refinement of the Hurwitz problem. Let us introduce the edge-labeling on in which each edge corresponding to the transposition is marked by , the larger of the two numbers it interchanges. We will call this the Biane-Stanley edge labeling of the symmetric group, after Philippe Biane and Richard Stanley, in whose work it appears implicitly.
(Figure: the Cayley graph of the symmetric group with the Biane-Stanley edge labeling.)
Given an -step walk on , we define its signature to be the corresponding sequence of edge labels. We say that a walk is strictly monotone if its signature is a strictly increasing sequence. Thus, an -step strictly monotone walk from to is the same thing as a strictly monotone factorization of into transpositions, i.e. a factorization
such that
Theorem (unique monotone factorization): Every permutation admits a unique strictly monotone factorization, and the length of this factorization is equal to That is,
Proof: The proof is by induction on .
In the base case we have with and The complete answer to the Hurwitz enumeration problem for is thus given by
and
This corresponds to the fact that if and have the same parity, then we have the unique length factorization
but no factorization exists if and have opposite parity. Imposing the strictly monotone constraint whittles this down to
and
For the inductive step, there are two cases to consider. Let and let be any permutation. If is a fixed point of , i.e. , then has a unique strictly monotone factorization by the induction hypothesis. If on the other hand is one of the
permutations which contains in its support (meaning ), then is of the form
for a unique pair and By the induction hypothesis, admits a unique strictly monotone factorization of length equal to , and the result follows.
—- Q.E.D.
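For a small symmetric group the theorem can be checked exhaustively: multiplying out every product of transpositions whose larger elements strictly increase should hit each permutation exactly once. The sketch below (my own illustration, with n = 4 an arbitrary choice) does this.

```python
import itertools
from collections import Counter

n = 4
identity = tuple(range(n))

def compose(s, t):
    return tuple(s[t[j]] for j in range(n))

def transposition(a, b):
    g = list(range(n))
    g[a], g[b] = g[b], g[a]
    return tuple(g)

# every transposition (a b) with a < b, tagged by its larger element b
labeled = [(b, transposition(a, b)) for b in range(n) for a in range(b)]

counts = Counter()
for r in range(n):
    for factors in itertools.combinations(labeled, r):
        labels = [b for b, _ in factors]
        # combinations respect list order, so the labels are non-decreasing;
        # requiring them to be distinct makes the factorization strictly monotone
        if len(set(labels)) != len(labels):
            continue
        prod = identity
        for _, t in factors:
            prod = compose(prod, t)
        counts[prod] += 1

total = len(list(itertools.permutations(range(n))))
print(len(counts) == total)          # every permutation is reached
print(set(counts.values()) == {1})   # ... exactly once
```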
Graphically, the unique monotone factorization theorem gives a presentation of the symmetric group as a starlike tree, which we call the Jucys tree after the Lithuanian physicist Algimantas Adolfas Jucys (not to be confused with his father, the Lithuanian physicist Adolfas Jucys), in whose work it appears implicitly. We will embed this construction in the group algebra in Lecture 3, and start using it in conjunction with linear representations of the group algebra.
Throughout this lecture, is a finite-dimensional complex vector space equipped with a scalar product , which is by convention a linear function of its second argument. The scalar product on determines various symmetry classes in the algebra of linear operators and the study of these symmetry classes is one of the main components of Math 202A. In particular, recall that an operator is said to be selfadjoint (or Hermitian) if
Hermitian operators are the subject of the Spectral Theorem, which may be formulated as follows.
Theorem 1: If is selfadjoint, then there exists an orthonormal basis of such that , where are real numbers.
Theorem 1 is the eigenvalue/eigenvector form of the Spectral Theorem. We will give a proof of this result which iteratively constructs the eigenvalues/eigenvectors of as the maxima/maximizers of the (necessarily real-valued) quadratic form on a nested sequence of compact sets in . Instead of invoking the Fundamental Theorem of Algebra to claim the existence of a root of the characteristic polynomial, this argument calls the Extreme Value Theorem as a subroutine.
Lemma 1: If is a selfadjoint operator such that the affiliated quadratic form is nonnegative, then any zero of is in the kernel of .
Proof: In fact, the argument does not use the finite-dimensionality of . Let be such that Consider a real perturbation of this zero in an arbitrary direction, i.e. consider , where is arbitrary and is a fixed unit vector. We then have
The first term is zero by hypothesis. The middle terms are the real number times
Thus by hypothesis the function
is a nonnegative function of the real variable . This function has the asymptotics
as , so it must be the case that , otherwise would change sign at , which it does not, being a nonnegative function. Repeating the above with a pure imaginary perturbation , we get that as well. Thus is orthogonal to every unit vector , whence it must be the zero vector,
— Q.E.D.
Proof of Theorem 1: Since is finite-dimensional, its unit sphere
is compact, and the function is continuous. Thus, by the Extreme Value Theorem, there exists a unit vector at which attains its maximum value on , i.e. we have
for all unit vectors . But this is equivalent to
for all unit vectors , and in this form it is clear that the inequality actually holds for arbitrary vectors , i.e. we have
for all where
Note that is selfadjoint (indeed, selfadjoint operators in form a real vector space), and the above says that the associated quadratic form is nonnegative. Since by construction, we may invoke Lemma 1 to conclude that is in the kernel of , i.e. is an eigenvector of with eigenvalue .
Consider now the orthogonal complement
i.e. the -dimensional hyperplane in perpendicular to the line through . Let be the unit sphere in , let be the maximum of the quadratic form on , and let be a point at which the maximum is achieved. By exactly the same argument as above, we get that is an eigenvector of with eigenvalue . Indeed, we can iterate the argument times to obtain an orthonormal set of eigenvectors with eigenvalues .
— Q.E.D.
2. The subspace lattice
Let denote the set of all subspaces of , including the trivial subspace and itself. This set carries a natural partial order: given we declare if and only if is a subspace of . In fact, is a graded poset in which the rank of a subspace is its dimension . The graded poset is moreover a lattice: the largest subspace contained in each of two given subspaces and is their intersection, and the smallest subspace containing both of them is their span. Finally, the lattice is self-dual: the map sending a subspace to its orthogonal complement is an order-reversing involution.
In formulating the above, we have just encountered our first example of a functor: we can view as a covariant functor from the category of vector spaces to the category of posets. Indeed, if are vector spaces and is a linear transformation, we get an order-preserving function defined by
In this course we will be considering quite a few more functors, most of which will be endofunctors from the category of vector spaces to itself. The most basic of these are tensor powers, symmetric powers, and exterior powers, which are the basic operations of multilinear algebra and are closely related to the physical concept of Fock space, which is intended to capture the notion of symmetry classes of elementary particles in quantum mechanics — bosons and fermions and such.
3. Flags and Grassmannians
Very loosely speaking, vector spaces are continuous versions of sets. A finite set might represent the possible states of some physical system, while the free vector space over this set allows for superpositions of states, i.e. linear combinations of them. Associated to is its lattice of subsets , where meet and join are intersection and union. This is called a Boolean lattice, and it is graded by cardinality rather than dimension. Sometimes is referred to as a hypercube, because its elements can be thought of as length bitstrings, and these may in turn be thought of as the vertices of a polytope in which is a cube when .
In a general poset, a chain is a totally ordered subset, while an antichain is a subset which does not contain any comparable pairs of elements. A chain or antichain is said to be saturated if adding any new element will cause it to no longer be a chain or antichain. It is easy to spot saturated chains in the Boolean lattice — they are just sequences of subsets of such that , and there are factorially many of these. It is equally easy to spot saturated antichains, namely the collections consisting of all subsets of of cardinality . The cardinality of itself is then , and it is a famous result in extremal combinatorics that the size of the largest antichain in the Boolean lattice is the largest number in the th row of Pascal’s triangle.
The lattice of subspaces of a finite-dimensional vector space is in many ways analogous to the lattice of subsets of a finite set. Indeed, if one could define a field with one element in a satisfactory way, then the two notions would coincide for vector spaces over . This is problematic, but one can analyze the subspace lattice of a vector space defined over without difficulty, and count chains, antichains, etc. For example, the cardinality of the antichain consisting of -dimensional subspaces is the Gaussian binomial coefficient, which degenerates to the ordinary binomial coefficient in the limit, almost as if really did exist…
Anyway, we are working over , so everything is infinite. Chains in are typically referred to as flags, though this term is more often than not reserved for chains that contain the trivial subspace and the space itself. Saturated chains are called complete flags: they are sequences
such that The analogue of the antichain in the Boolean lattice consisting of subsets of cardinality is the collection of -dimensional subspaces of ; these antichains are called Grassmannians and denoted . We are going to discuss Grassmannians of finite-dimensional complex vector spaces in some detail, and I hope that we will have time to discuss a certain special infinite-dimensional Grassmannian known as the Sato Grassmannian. For now, we concentrate on the following fundamental relationship between eigenvalues of selfadjoint operators and Grassmannians.
4. Eigenvalues as extremizers
Let be a selfadjoint operator, and define a function
by
Thus, assigns to each subspace the maximum of the quadratic form associated to on the unit sphere in . In particular, we know from our proof of the Spectral Theorem above that the largest eigenvalue of is given by
The following result, known as the Courant-Fischer minimax theorem, generalizes this by giving a variational characterization of each individual eigenvalue of .
Theorem 2: If is a selfadjoint operator with eigenvalues then for each we have
That is, the th largest eigenvalue of is the minimum value of over the Grassmannian of -dimensional subspaces of .
Proof: Let be an orthonormal basis of such that , For each , consider the space
spanned by the eigenvectors corresponding to the largest eigenvalues of and also set . In the proof of Theorem 1, we showed that
so that indeed the th largest eigenvalue is given by evaluating the function at a point of the Grassmannian , namely It remains to show that the evaluation of on every other subspace of dimension is at least .
To achieve this end, we establish the following squeeze inequality for the values of the quadratic form on the subspace spanned by the first eigenvectors: for all we have
Indeed, let be any vector in . Using orthonormality of the eigenvectors we calculate
and the lower bound is established in exactly the same way.
Now let be arbitrary. Since
,
the intersection of the subspaces and is at least one-dimensional (see Problem set 1). Let be a unit vector in . Since , we have by the above inequality. But also , which shows that , as required.
— Q.E.D.
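Here is a numerical spot check of the minimax characterization (an illustration of my own, not exhaustive, since one cannot enumerate all subspaces): the particular subspace spanned by the eigenvectors of the n − k + 1 smallest eigenvalues attains the value of the k-th largest eigenvalue, while randomly sampled subspaces of the same dimension never do better. The matrix, dimension, and number of samples are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3                                  # check the 3rd largest eigenvalue

B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B + B.conj().T                           # a random selfadjoint operator
eigvals, eigvecs = np.linalg.eigh(A)         # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # largest first

def mu(A, V):
    """Max of the quadratic form <x, Ax> over unit vectors in the column span of V."""
    Q, _ = np.linalg.qr(V)                   # orthonormal basis of the subspace
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q).max()

lam_k = eigvals[k - 1]
# the minimizing subspace: span of the eigenvectors for the n-k+1 smallest eigenvalues
W = eigvecs[:, k - 1:]
print(np.isclose(mu(A, W), lam_k))           # attains the k-th largest eigenvalue

# every other (n-k+1)-dimensional subspace should give a value >= lam_k
ok = True
for _ in range(200):
    V = rng.standard_normal((n, n - k + 1)) + 1j * rng.standard_normal((n, n - k + 1))
    ok = ok and mu(A, V) >= lam_k - 1e-9
print(ok)
```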
5. Adjoint orbits
Any operator which preserves the scalar product,
is called a unitary operator. It is clear that the set of unitary operators on is a group, indeed a subgroup of the general linear group . Note that the former depends on the scalar product but the latter does not; one may equivalently say that invertible operators map bases to bases, while unitary operators map orthonormal bases to orthonormal bases.
Let be given real numbers, and let be the set of selfadjoint operators on with this given spectrum.
Theorem 3: Choose any orthonormal basis , and let be the selfadjoint operator defined by
Then the isospectral set is given by
Proof: First we check that consists entirely of selfadjoint operators having spectrum . The selfadjointness of follows immediately from the selfadjointness of together with the fact that . To see that has spectrum , let be the orthonormal basis of defined by
Then is an eigenvector of with eigenvalue .
Now let be a given selfadjoint operator with eigenvalues . By the Spectral Theorem, there exists an orthonormal basis such that If is the unitary operator sending to , then
Welcome to Math 202C in the Spring quarter of 2022. This course concludes the applied algebra graduate course sequence, the previous installments of which were taught by Professor Freddie Manners and Professor Steven Sam. The goal of the course will be to unite and extend the material you learned in 202A and 202B in a natural, pleasing, and useful way. We will focus on applications of the representation theory of the symmetric groups, which was the subject of 202B. The Jucys-Murphy elements will be constructed, and their action in irreducible representations will be applied to analyze random walks (card shuffling) and permutation factorizations (Hurwitz theory). We then review some 202A material, namely the basic constructions of multilinear algebra (tensor, exterior, and symmetric algebras), and their physical interpretation as state spaces for bosons and fermions (Fock spaces). We apply the representation theory of the symmetric groups to prove the existence of a -Fock space interpolating between the symmetric and exterior algebras. The two parts of the course are then unified using the infinite wedge space formalism, whose connection with the theory of integrable systems via the Sato Grassmannian will be discussed if time permits.
Now to the mechanics of Math 202C. This post, Lecture 0, serves as the course syllabus. The course is being taught remotely, and the plan is that the two weekly lectures will be delivered in the form of blog posts like the one you are reading right now. Each of these posts will likely cover more than I would be able to cover in an in-person ninety minute lecture. These posts will appear every Tuesday and Thursday in a temporal neighborhood of our scheduled lecture slots, and the course will feel much like a reading course, albeit a fairly intense one leading up to a qualifying exam. I will also hold live office hours over Zoom, so that we can discuss the course content in real time, and so that I can embellish and add to it verbally. This will happen on Fridays, and we will work out a time over Piazza, which we will use as a question and answer forum (signup link here). We’ll be conducting all class-related discussion on Piazza. The quicker you begin asking questions on Piazza (rather than via emails), the quicker you’ll benefit from the collective knowledge of your classmates and instructors. I encourage you to ask questions when you’re struggling to understand a concept.
As for evaluation, there are no exams (neither midterm nor final), though of course you’ll be taking the Applied Algebra qualifying exam mid-quarter. We will however have weekly problem sets which will be due on Sundays and submitted via Gradescope. More information about this will be posted on Piazza. These problem sets will help to cement the concepts you’ll be learning in the course, and will also give you an idea of the sorts of problems that are likely to appear on the qualifying exam. I believe this is everything you need to know to get started — see you on Piazza starting today, and in real time on Friday.
1. Spectral theorem, flags and Grassmannians, minimax
2. Jucys’s theorem: unique factorization
3. Symmetric polynomials and JM elements
4. Murphy’s theorem: content eigenvalues
5. Top-to-random shuffle I
6. Top-to-random shuffle II
7. Basic category theory
8. Tensor product, tensor powers, tensor algebra
9. Symmetric and antisymmetric tensors, bosons and fermions
As demonstrated in [1], expander graphs are sparse graphs that mimic the structural properties of complete graphs, as quantified by vertex, edge, or spectral expansion. At the same time, expander constructions have spawned research in pure and applied mathematics, with several applications to complexity theory, the design of robust computer networks, and the theory of error-correcting codes.
This lecture is organized as follows. We introduce two different definitions of expanders in the first section, and then explore the relationship between them in the second.
17.1. Two Definitions
We start our discussion with two definitions of expansion: combinatorial and spectral. In the rest of this post, we consider an undirected graph with , where self loops and multiple edges are allowed. Given two subsets , the set of edges between and is denoted by .
17.1.1. Combinatorial expansion
Intuitively, a graph is an expander if subsets expand outwards into ; in other words, every small subset has a large boundary. Following this intuition, we introduce two notions of boundary, based on edges and on vertices respectively, to quantify the corresponding kinds of expansion.
Definition 1(Boundary):
The Edge Boundary of a set , denoted by , is the set of edges , i.e. the set of edges between the disjoint sets and .
The Vertex Boundary of a set is a subset of vertices in adjacent to vertices in , denoted as .
Obviously, the size of the vertex boundary is the number of neighboring vertices of the set . With these notations, we then define expansion over the entire graph by ranging over subsets .
Definition 2(Expander): For some , a graph with is an
-edge-expander if .
-vertex-expander if .
Remark 3: The reason for the restriction is that we are examining the relative size of every cut in the graph, i.e., the number of edges crossing between and , and we examine each case only once.
Problem 1: Show that the complete graph is
at least a -edge-expander.
a -vertex-expander. (We only consider here.)
From the definition above, the maximum , for which is an -edge-expander, is an important criterion, and it leads to the definition of the Cheeger constant.
Definition 4(Cheeger Constant): The Edge Expansion Ratio, or Cheeger constant, of graph with , denoted by , is defined as
Remark 5: Actually, the Cheeger constant can alternatively be defined as . We do not follow this definition here since we want to keep .
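For very small graphs the Cheeger constant can simply be computed by brute force over all subsets of at most half the vertices. The sketch below (my own illustration, using the normalization |∂S|/|S|, which may differ from the notes' convention by a factor of the degree) does this for the 3-dimensional hypercube graph.

```python
import itertools

def edge_boundary(adj, S):
    """Number of edges leaving the vertex set S (adj = adjacency dict)."""
    S = set(S)
    return sum(1 for u in S for v in adj[u] if v not in S)

def cheeger_constant(adj):
    """Brute-force edge expansion ratio: min over |S| <= |V|/2 of |dS| / |S|."""
    V = list(adj)
    best = float("inf")
    for size in range(1, len(V) // 2 + 1):
        for S in itertools.combinations(V, size):
            best = min(best, edge_boundary(adj, S) / size)
    return best

if __name__ == "__main__":
    # the 3-dimensional hypercube graph Q_3 (3-regular on 8 vertices)
    vertices = list(itertools.product((0, 1), repeat=3))
    adj = {v: [w for w in vertices if sum(a != b for a, b in zip(v, w)) == 1]
           for v in vertices}
    print(cheeger_constant(adj))    # 1.0, attained by splitting the cube in half
```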
17.1.2. Spectral Expansion
To simplify our discussion, we focus on regular graphs in the rest of this lecture. Let denote the adjacency matrix of a -regular graph with . Since is real symmetric, it has real eigenvalues, denoted by , and corresponding orthonormal eigenvectors with . The spectrum of encodes a lot of information about the graph structure. Recall the following properties, which we have seen in previous lectures.
The largest eigenvalue is and the corresponding eigenvector is .
The graph is connected if and only if .
The graph is bipartite if and only if .
The definition of spectral expansion and spectral gap then follows.
Definition 6(Spectral Expansion): A graph with is a -spectral-expander if .
In other words, the second largest eigenvalue of , in absolute value, is at most . This also gives rise to the definition of Ramanujan graph.
Definition 7(Ramanujan graph): A connected -regular graph is a Ramanujan graph if .
We now turn to the spectral gap.
Definition 8(Spectral Gap): The spectral gap of -regular graph is .
Remark 9: Sometimes the spectral gap is defined as , which is often referred to as one-sided.
Problem 2:
Show that the complete graph has spectral gap . Actually, we can also show that the one-sided spectral gap is .
Show that the complete bipartite graph has spectral gap . (Hint: use the spectral property above.)
17.2. Relating Two Definitions
It is not so obvious that the combinatorial and spectral versions of expansion are related. In this section, we will introduce two relations between edge and spectral expansion: the Expander-Mixing Lemma and Cheeger’s inequality.
17.2.1. The Expander-Mixing Lemma
Lemma 10(Expander-Mixing Lemma[2]): Let be a -regular graph with vertices and eigenvalues . Then for all :
As we can see, the left-hand side measures the deviation between two quantities:
: the number of edges between the two subsets and .
: the expected number of edges between two subsets and , drawn from an Erdos–Renyi random graph .
The Expander-Mixing Lemma tells us that a small (a.k.a. a large spectral gap) implies that this discrepancy is small, so that the graph is nearly random in this sense, as discussed in [1]. We should mention that a converse of the Expander-Mixing Lemma is true as well, stated as follows.
Lemma 11(Expander-Mixing Lemma Converse [4]): Let be a -regular graph with vertices and eigenvalues satisfying:
for any two disjoint subsets and some positive . Then .
Proof of Expander-Mixing Lemma: Recall that the adjacency matrix has eigenvalues with corresponding orthonormal eigenvectors . Let and be the characteristic vectors of subsets and (i.e., the -th coordinate of is if and otherwise). We then decompose and in the orthonormal basis of eigenvectors as
.
Observing that , we expand out the right hand side and apply orthogonality, which yields
.
where the last step follows from the facts with corresponding , and . Moving the first term on the right-hand side to the left, the desired result follows from the Cauchy-Schwarz inequality:
Remark 12: If we normalize the adjacency matrix and obtain , where each entry of is divided by , then the corresponding eigenvalues of , namely , lie in the interval . The Expander-Mixing Lemma is in the form of
It is sometimes convenient to consider the normalized spectrum. For example, in random graph theory it is convenient to look at the normalized spectrum when the expected degree . However, we can focus on the spectrum of A = A(G) without normalization when the expected degree d = O(1).
Regular -spectral expanders with small (a.k.a. a large spectral gap) have several significant properties, listed as follows.
Corollary 13: An independent set , in a graph with , is a set of vertices , where no two vertices in are adjacent, i.e. with . An independent set , in a -regular -spectral expander , has cardinality at most , i.e., , which follows immediately from Lemma 10(Expander-Mixing Lemma[2]).
Corollary 14: Given a graph with , the diameter of is defined as , where is the length of shortest path between and . If is a -regular -spectral expander with , then we have
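Here is a brute-force numerical check of the Expander-Mixing Lemma (my own illustration) in the standard form |e(S,T) − d|S||T|/n| ≤ λ√(|S||T|), on a small example: the Petersen graph, which is 3-regular with λ = 2. The choice of graph and the restriction to disjoint S, T are illustrative.

```python
import itertools
from math import sqrt

import numpy as np

# the Petersen graph as the Kneser graph K(5,2): vertices are 2-subsets of
# {0,...,4}, with an edge between disjoint pairs; it is 3-regular on 10 vertices
vertices = list(itertools.combinations(range(5), 2))
n, d = len(vertices), 3
A = np.array([[1 if not set(u) & set(v) else 0 for v in vertices] for u in vertices])

eigs = sorted(np.linalg.eigvalsh(A), reverse=True)
lam = max(abs(eigs[1]), abs(eigs[-1]))          # second largest in absolute value
print("spectrum:", np.round(eigs, 6), "lambda =", lam)   # 3, 1 (x5), -2 (x4)

def edges_between(S, T):
    return sum(A[i, j] for i in S for j in T)

# check |e(S,T) - d|S||T|/n| <= lambda * sqrt(|S||T|) over all disjoint S, T
worst = 0.0
for assignment in itertools.product((0, 1, 2), repeat=n):   # 0: neither, 1: S, 2: T
    S = [i for i, a in enumerate(assignment) if a == 1]
    T = [i for i, a in enumerate(assignment) if a == 2]
    if not S or not T:
        continue
    lhs = abs(edges_between(S, T) - d * len(S) * len(T) / n)
    rhs = lam * sqrt(len(S) * len(T))
    worst = max(worst, lhs - rhs)
print("max violation of the mixing bound:", worst)   # <= 0 up to rounding
```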
17.2.2. Cheeger’s Inequality
As we have seen from the previous discussion, spectral expansion implies strong structural properties on the edge structure of graphs. In this section, we introduce another famous result, Cheeger’s inequality, indicating that the Cheeger constant (Edge Expansion Ratio) of a graph can be approximated by its spectral gap [6].
Theorem 15(Cheeger’s inequality, [3]): Let be a -regular graph with eigenvalues , then
.
In the normalized scenario with eigenvalues , we have
.
Remark 16: The proof in the normalized scenario is different from the regular scenario: one starts from the desired quantity and carries out the calculation step by step. However, the strategy is the same: we use the Rayleigh quotient to obtain the upper and lower bounds.
In the context of , the following estimate of the spectral gap was given by A. Nilli.
Theorem 17([5]): For every -regular graph , we have
The term is a quantity that tends to zero for every fixed as .
[1] Hoory, S., Linial, N., and Wigderson, A. (2006). Expander graphs and their applications. Bulletin (New Series) of the American Mathematical Society, 43(4).
[2] Alon, N. and Chung, F. R. K. (1988). Explicit construction of linear sized tolerant networks. Discrete Mathematics, 42.
[3] Cheeger, J. (1970). A lower bound for the smallest eigenvalue of the Laplacian. Problems in Analysis, 26.
[4] Bilu, Y. and Linial, N. (2006). Lifts, discrepancy and nearly optimal spectral gap. Combinatorica, 26.
[5] Nilli, A. (1991). On the second eigenvalue of a graph. Discrete Mathematics, 91(2):207–210.
[6] Lovett, S. (2021). Lecture notes for expander graphs and high-dimensional expanders.
Since the representation theory of the symmetric group is likely fresh in your mind thanks to yesterday’s exam, this is a good time to show you a very nice probabilistic application of this theory to card shuffling. This post will discuss the “top-to-random” shuffle, where the cards are arranged in a single row (or a single stack), and a randomly chosen card is repeatedly swapped with the rightmost (or top) card. The algebraic model for this procedure is the Cayley graph of the symmetric group as generated by the so-called “star transpositions”
Problem 12.1: Prove that the above transpositions do in fact generate
These are called star transpositions because, if one forms the graph on such that is an edge if and only if one of the above transpositions swaps and , then this graph is a star graph. The top-to-random shuffle is then the same thing as the lazy simple random walk on the Cayley graph
The paper attached below gives a detailed analysis of the top-to-random shuffle using the representation theory of and gives a precise meaning to the statement that such shuffles are required in order to randomize the deck. The argument is pretty much all representation theory — only some very basic probabilistic reasoning is used towards the end.
We begin with a brief recap of the main result from Lecture 8. Let
be the characters of the rank -hypercube group
Definition 1: The orthogonal idempotents in the group algebra are defined as the characters of normalized by the order of :
The reason for the name is the following key property of these elements:
This is quite valuable, since expanding elements in the group algebra relative to the basis of orthogonal idempotents essentially makes multiplication in the group algebra trivial, which translates into the “Fourier coefficients of a product are products of Fourier coefficients” fact that we saw last Lecture. Note also that the behavior of orthogonal idempotents in the group algebra is formally identical to the behavior of orthogonal projections in Hilbert space — you perhaps know from Math 202B that this is not a coincidence, and if you don’t know this don’t worry, because we will discuss it.
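The key property is easy to verify directly for a small hypercube group. In the sketch below (an illustration of my own) the group is modeled as bit strings of length n = 3 under coordinatewise XOR, which is isomorphic to the permutation group generated by n disjoint transpositions; the characters are the familiar ±1-valued functions, and we check that the normalized characters multiply as orthogonal idempotents.

```python
import itertools
from fractions import Fraction

n = 3
group = list(itertools.product((0, 1), repeat=n))   # (Z/2)^n with XOR as the law

def xor(x, y):
    return tuple(a ^ b for a, b in zip(x, y))

def chi(y, x):                                      # character values +/- 1
    return (-1) ** sum(a * b for a, b in zip(x, y))

def idempotent(y):
    """e_y = 2^{-n} * sum_x chi_y(x) x, as a dict {group element: coefficient}."""
    return {x: Fraction(chi(y, x), 2 ** n) for x in group}

def multiply(u, v):
    """Group algebra product (i.e. convolution of the coefficient functions)."""
    w = {x: Fraction(0) for x in group}
    for x, a in u.items():
        for z, b in v.items():
            w[xor(x, z)] += a * b
    return w

# e_y e_z equals e_y if y == z, and 0 otherwise
ok = True
for y in group:
    for z in group:
        prod = multiply(idempotent(y), idempotent(z))
        expected = idempotent(y) if y == z else {x: Fraction(0) for x in group}
        ok = ok and prod == expected
print(ok)
```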
The main result for today is the following.
Theorem 1: The class of multiplication operators in coincides with the class of operators that are diagonal in the character basis of
Proof: Let be an element of the group algebra, and let be the corresponding multiplication operator. Consider the character expansion of
Now let be another character. We have
So, is an eigenvector of with eigenvalue the Fourier coefficient
Conversely, suppose that acts diagonally in the character basis:
Then the action of on an arbitrary is
so that with
— Q.E.D.
With Theorem 1 in hand, we can immediately compute the eigenvalues and eigenvectors of the Cayley graph of . Indeed, the adjacency operator of the Cayley graph of is just the multiplication operator
so by Theorem 1 we immediately have that
Now, we have that the resulting eigenvalue is
and this is a sum we can explicitly evaluate, because we explicitly computed the characters of back in Lecture 7. In particular, what we know is that
is equal to if is a factor of and is equal to if is not a factor of Since we compute this scalar product for each and every generator of the sum has terms equal to and terms equal to where is the word norm of or equivalently its distance to the identity permutation in the Cayley graph. So we get
as our explicit eigenvalue of the character under the action of the adjacency operator. Moreover, the number of elements such that for a given is simply since this just amounts to choosing of the generators of So we can summarize our results as follows.
Theorem 2: The eigenvalues of Cayley graph of are the numbers
and the dimension of the eigenspace corresponding to is
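Theorem 2 can also be checked numerically by diagonalizing the adjacency matrix of the Cayley graph directly. The sketch below (my own illustration, using the bit-string model of the hypercube group and n = 4, both arbitrary choices) compares the computed spectrum with the predicted eigenvalues n − 2k and multiplicities binomial(n, k).

```python
import itertools
from collections import Counter
from math import comb

import numpy as np

n = 4
group = list(itertools.product((0, 1), repeat=n))
index = {x: i for i, x in enumerate(group)}
gens = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]

# adjacency matrix of the Cayley graph of (Z/2)^n with the standard generators
A = np.zeros((2 ** n, 2 ** n))
for x in group:
    for g in gens:
        A[index[x], index[tuple(a ^ b for a, b in zip(x, g))]] = 1

spectrum = Counter(int(round(v)) for v in np.linalg.eigvalsh(A))
predicted = Counter({n - 2 * k: comb(n, k) for k in range(n + 1)})
print(spectrum == predicted)                  # eigenvalue n-2k, multiplicity C(n,k)
print(sorted(spectrum.items(), reverse=True))
```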
Since all eigenvalues of the Cayley graph of are given explicitly by Theorem 2, we are in a position to explicitly enumerate walks on this graph, using the spectral approach to counting walks discussed previously.
Problem 9.1: Show that the number of -step walks on from to is
This material, and problem 9.1, have applications in coding theory which you might find interesting to look into.
We are at the beginning of the theory of harmonic analysis on finite abelian groups, starting with the concrete example of the rank hypercube group, which we have chosen to define as the permutation group generated by the disjoint transpositions
Typically, analysts think of harmonic analysis on (or any other group) as the study of the Hilbert space of functions
equipped with the scalar product
so that one has a canonical orthonormal basis of consisting of indicator functions. This space has the structure of an algebra when equipped with the convolution product of functions,
which is exactly what one gets from bilinear extension of group law in lifted to the group element basis:
In particular, the convolution product on is quite different from the pointwise product. This is the right point of view to take when one is dealing with non-discrete groups (e.g. Lie groups), but for finite groups there is another way of writing things out which is often simpler and cleaner, and favored by algebraists.
The algebraists’ preferred Hilbert space is the space of formal -linear combinations of permutations
Such linear combinations are of course equivalent to functions on the group: you can think of a linear combination as a generating function, which is equivalent to a complex-valued function on . From this perspective the canonical orthonormal basis of consists of the permutations themselves (rather than their indicator functions), which induces the equivalent scalar product,
The convolution product on is likewise replaced with the equivalent multiplication (written simply as juxtaposition)
In short, and are isomorphic both as Hilbert spaces and as algebras, so the choice to use one rather than the other is a matter only of preference. In practice, it is quite useful to be comfortable with both the analytic and the algebraic way of doing things.
Yet another way of repackaging everything is the following probably too-general perspective. Let us zoom back out to the very beginning, where we started with an arbitrary finite graph . The algebraic way to understand this combinatorial object is to replace the vertex set of with the algebra of polynomials in noncommuting variables indexed by the vertices of In other words, this is the free algebra of rank This algebra is graded by degree, and we have a decomposition
with the th summand being the space of homogeneous polynomials of degree . Observe that this vector space is isomorphic to the space of complex-valued functions on i.e. the space of functions on -tuples of vertices, or equivalently formal linear combinations of -tuples of vertices. In particular, the linear component is isomorphic to either/both and and the adjacency operator acts on this linear component to capture the edge structure of Indeed, the free algebra is isomorphic to the tensor algebra of the vector space of functions on the vertex set In the case where the vertex set of is actually a group we can mod out by the group law and consider the quotient
of the free algebra by the ideal corresponding to the group multiplication. This quotient consists solely of polynomials of degree so functions on the group, or equivalently linear combinations of group elements, but now with the structure of an algebra.
Coming back to reality, the fact that is an algebra and not just a vector space is reflected in the presence of a certain class of operators on the group algebra which are not present in the endomorphism algebra of a vector space devoid of multiplication.
Definition 1: Given , the corresponding multiplication operator is defined by
We are using the letter “R” for multiplication operators because later on we will consider the group algebras of noncommutative groups, in which left and right multiplication do not necessarily coincide, and we typically have different left and right multiplication operators and . To put this the Math 202B way, the operator is the image of in the right regular representation of
Note that multiplication operators corresponding to pure group elements (as opposed to linear combinations of group elements) are unitary operators on
However, since we are working with the permutation group in which all permutations are actually involutions, is both unitary and selfadjoint, hence has eigenvalues
The importance of multiplication operators in our quest to understand the eigenvalues and eigenvectors of the Cayley graph of as generated by the transpositions is the following.
Proposition 1: The adjacency operator of the Cayley graph of is the multiplication operator where
Proof: For any we have
— Q.E.D.
In view of Proposition 1, the problem of understanding the eigenvalues and eigenvectors of the Cayley graph is that of understanding the eigenvalues and eigenvectors of
To solve it, we will work out the spectral analysis of arbitrary multiplication operators on . The main tool in this endeavor is a second basis of the group algebra, consisting of special linear combinations of group elements which have the property that
That is, is the generating function of a group homomorphism
the multiplicative group of nonzero complex numbers. I like this way of stating things; in the analytic formalism, the words “character” and “homomorphism” mean exactly the same thing, a redundancy, whereas in the algebraic formalism “character” really means “generating function of a homomorphism.”
In Lecture 7, we explicitly worked out all homomorphisms from to , showing that they have values in the subgroup generated by and are moreover parameterized by the points of . We also showed that the characters are orthogonal,
In particular, for any we have the character expansion
The character expansion of is also known as its Fourier expansion, with being the Fourier coefficients of The main reason that character expansions are “better” than expansions in the group basis is that the Fourier coefficients of a product are just the products of the corresponding Fourier coefficients of and .
The main reason that characters — which are generating functions of homomorphisms — are better than the original group elements is that their multiplication is extremely simple.
Theorem 1: For any we have
if and
Proof: Expanding in the permutation basis, we have
where we used that is a homomorphism and The result now follows from character orthogonality, which is Theorem 2 in Lecture 7.
— Q.E.D.
An immediate consequence is that the Fourier coefficients of a product are the products of the Fourier coefficients of the factors.
Corollary 1: For any we have
Proof: On one hand, the character expansion of the product is
On the other hand, we compute the product and via their character expansions:
where we applied Theorem 1 to get the final equality. Since the characters are linearly independent, the result follows.
—- Q.E.D.
In Lecture 9, we will apply the above to prove that is a multiplication operator if and only if it acts diagonally in the character basis. We will also find the corresponding eigenvalues, and hence count walks on In Lecture 10 we will then analyze convergence of random walk on to equilibrium.