Lectures 5 and 6 of Math 202C give an application of the representation theory of the symmetric group as discussed in Math 202B together with the additional material on the YJM elements in the corresponding group algebra . This application is probabilistic in nature, and concerns the following question: how many times do you have to shuffle a deck of cards in order to randomize it?
In order to analyze this question rigorously, we need to model it mathematically. The first step is to say exactly what we mean by “shuffle.” We consider a deck of cards labeled laid face up on a table, arranged in a single row. Initially, this arrangement is from left to right. At each tick of the clock, we choose a random card, and swap its position with the rightmost card. This gives us a new arrangement of cards which, reading the labels from left to right, corresponds to a permutation in the symmetric group . This is called the “top-to-random” shuffle. Note that it is possible the randomly chosen card is in fact the rightmost (top) card, and in this case no shuffling occurs.
The top-to-random shuffle is a stochastic process evolving in discrete time. It may equivalently be formulated as a random walk on the the Cayley graph of the symmetric group generated by the set
The elements of are transpositions; they are called “star transpositions” because, if we picture as the edge set of a graph on , we get the star graph with hub vertex The simple random walk on is the following stochastic process: a particle is initially located at the identity permutation , and at each tick of the clock it jumps to a neighboring vertex, with equal probability of jumping in any direction. This can be encoded algebraically as follows. Let be the group algebra of , equipped with the scalar product in which the permutation basis is orthonormal. Identify the generating set with the sum of its elements,
which you will recognize from our previous lectures as the top YJM element in The probability that a particle on evolving by simple random walk is located at a given permutation at a given time is then equal to the normalized scalar product
This random walk is not equivalent to the top-to-random shuffle, but rather to a shuffling process in which one of the cards below the top card is exchanged with the top card. This is not a good way to shuffles cards, because randomness is never achieved; equivalently, the corresponding random walk is not ergodic. A random walk on a graph is said to be ergodic if, for all sufficiently large times, the particle has positive probability of being located at each vertex. This is not the case for simple random walk on — at even times, the probability the particle is located at an odd permutation is zero, and vice versa.
A shuffling process should have the property that as we get closer and closer to a uniform distribution, which algebraically is represented by the vector
More formally, let us introduce the -norm on corresponding to the permutation basis, which assigns to each the number
In probability theory, the corresponding distance is called total variation distance. The lazy simple random walk on is the modification of simple random walk in which, at each tick of the clock, the particle either jumps to an adjacent vertex or stays put, with each such event occurring with equal probability. This random walk is generated by the group algebra element , which means that the probability the particle is located at is
for each , so that perfect randomness is achieved in the limit of infinitely many shuffles.
Shuffling infinitely many times is not something that can be done in the real world. Therefore what we want is an effective version of the above, which tells us how close our deck is to randomized after shuffles have been performed. Algebraically, we want bounds on the -norm
This weeks lectures, which take the form of a typeset document posted to Piazza in the resources section, explain how to do this using the representation theory of the symmetric group, in particular the spectral theory of the YJM elements which we have been discussing. A remarkable insight coming out of the analysis is that the top-to-random shuffle exhibits what is known as the cutoff phenomenon: for small values of , the total variation distance to uniform remains close to its initial value, and then suddenly crosses a -dependent cutoff time and becomes exponentially small, indicating that near-perfect randomness has been achieved in finite time.
In Lecture 3, we introduced the Young-Jucys-Murphy elements in the group algebra of the symmetric group . The YJM elements form a commuting family of elements in the group algebra of the symmetric group , and have the amazing feature that symmetric polynomials in always belong to the class algebra despite the non-centrality of the ‘s themselves. This is quite remarkable, as there is no obvious reason why an arbitrary symmetric polynomial should evaluate to a linear combination of conjugacy classes.
Since is central, Schur’s Lemma implies that it acts in any given irreducible representation of as a scalar operator. More precisely, if is the irreducible representation of corresponding to the partition (more precisely, a representative of the isomorphism class of all such representations), then the operator is of the form
where is the identity operator and is a scalar. Note that this scalar must in fact be real, because each , being a sum of transpositions, is a selfadjoint element of Indeed,
are commuting selfadjoint operators, and hence are simultaneously diagonalizable, i.e. admit a common eigenbasis. A natural question is then how to explicitly describe the eigenvalue This question has an amazing answer, which we now describe.
According to Schur’s Lemma, the class algebra may be described as the subalgebra of consisting of elements which act as scalar operators in all irreducible representations. Note that we are including here a converse to Schur’s Lemma, namely the statement that if is a scalar operator in for all , then ; this follows immediately from the fact that the map
is an algebra isomorphism (often referred to as the noncommutative Fourier transform). We can generalize this as follows.
As you know from Math 202B, the branching of irreducible -modules is multiplicity free. More precisely, we may consider the group algebra of the symmetric group as the subalgebra of spanned by permutations in satisfying The isotopic decomposition of an irreducible -module as an -module is
where the direct sum is over all irreducible -modules such that the Young diagram can be obtained from the Young diagram by adding a single cell.
Example 1: For the “staircase” diagram the isotypic decomposition of the irreducible -module as an -module is
Iterating this construction, we obtain a decomposition of into a sum of dimensional spaces indexed by “growth sequences”
where is a diagram with -cells, which encode the possible life histories of the multicellular organism starting from the single-celled organism Obviously, each such growth history corresponds uniquely to a standard Young tableau of shape — the entry of a given cell encodes the time instant at which that cell was added. Thus, multiplicity-free branching gives us a basis of each irreducible representation canonically determined up to scaling; this basis is known as the Young basis (or sometimes the Gelfand-Tsetlin basis). By Schur’s lemma, the subalgebra
coincides with the maximal commutative subalgebra of of elements which act diagonally on the Young basis of every irreducible representation ; this algebra is known as the Young subalgebra (or Gelfand-Tsetlin subalgebra) of the group algebra , and its construction depends crucially on the fact that we have a natural inclusion . Observe that the YJM elements belong to the algebra (explaining why is a conceptually optimal way to answer one of the homework problems). The following Theorem is an enormously useful tool, both theoretically and computationally.
Theorem 1 (Diaconis-Greene,Murphy,Okounkov-Vershik): The commutative algebra is exactly the algebra of polynomials in the YJM elements Moreover, for each irreducible representation of , the action of on the corresponding Young basis is given by
where is the content (i.e. column index minus row index) of the cell in containing the number .
Example 2: Consider the group algebra of the symmetric group of degree acting in the irreducible representation labeled by the staircase partition and let be the Young basis vector corresponding to the standard tableau
Theorem 1 says that is an eigenvector of each of the operators and that the corresponding eigenvalues are, respectively,
Corollary 1: For any and any , we have
where the right hand side denotes the evaluation of the symmetric polynomial on the numbers , with being an arbitrary enumeration of the cells of
Example 2: Corollary 1 tells us that the eigenvalue of the symmetric polynomial
acting in is
We will black-box Theorem 1, and (possibly) return to its proof later, though I will post a resource where you can look it up if you wish. Our goal will be to mine some rather remarkable information from Theorem 1. There are two natural directions in which one might want to go.
First, there is the question of whether evaluation on the YJM elements in fact defines a surjective morphism
If this were so, then for each there would exist a symmetric polynomial such that
i.e. we could express each conjugacy class as a symmetric polynomial function of the YJM elements. This would in turn have the following character-theoretic consequence. One one hand, the eigenvalue of the conjugacy class acting in the irreducible representation is given by
where as always the character is by definition the trace of the operator , with an arbitrary permutation in the conjugacy class Indeed, on one hand we have
since the trace of a scalar operator is simply its unique eigenvalue summed dimension-many times, and on the other hand
Corollary 1 would then give us the formula
which in turn gives us a formula for the irreducible character as a function of the content alphabet of the Young diagram . In fact, this is all true, and is a consequence of combining Corollary 1 with the following classical theorem on the class algebra.
Theorem 2 (Farahat-Higman): Every conjugacy class can be expressed as a polynomial in the levels of the Cayley graph of
A good treatment of this approach to the irreducible characters of the symmetric groups has been given by our colleague Adriano Garsia, and I will post his lecture notes to our Piazza page. There is an inescapable downside to this seemingly miraculous result: the symmetric polynomials are not given by any simple formula. This is in a sense inevitable, as the computation of the irreducible characters of the symmetric groups is a -complete problem, and consequently the quest for an explicit closed form is quixotic.
Instead, one can take the opposite approach, and start from the fact that there are many interesting (symmetric) polynomials which we can write down explicitly, and Theorem 1 gives us a way to explicitly calculate the action of in irreducible representations. It is this direction that we shall follow in the coming lectures.
In Lecture 2, we established the following unique factorization result for the symmetric group
Theorem 1 (Jucys): Every permutation admits a unique factorization
into transpositions , such that
Moreover, the number of factors in this decomposition is
We now consider some algebraic ramifications of Jucys’ result. Let denote the group algebra of the symmetric group. As you know from Math 202B, this is the -dimensional Hilbert space in which sits as an orthonormal basis, and which supports the bilinear product inherited from the group law on . In fact, is a -algebra, with antilinear involution given by All of this is true for arbitrary groups, and indeed you can view as a functor from the category of groups to the category of unital associative -algebras which transports each group to its corresponding group algebra
Any subset of may be identified with a vector in simply by identifying it with the sum of its elements, and any family of disjoint subsets of is linearly independent, since any two of its members are orthogonal in In particular, for each we may view the corresponding conjugacy class as an element of , the sum of all permutations of cycle type and consider the linearly independent set in the group algebra,
Let denote the subspace of spanned by the conjugacy classes, the dimension of which is the number of partitions of . Given , we have
Th number in fact depends only on the conjugacy class of , for if then for any Thus is a subalgebra of called the class algebra, which is a particularly appropriate name for something encountered in an algebra class. We thus have
The numbers are called the connection coefficients (aka structure constants) of the class algebra ; they are the entries of the multiplication table for conjugacy classes. More generally, we can take arbitrary polynomials in conjugacy classes, and write them as linear combinations of conjugacy classes, and this often has enumerative meaning. For example, consider the class of transpositions. For any , we have
and each term in the sum corresponds uniquely to an -step walk on the Cayley graph of beginning at Thus,
which says that the combinatorial problem of counting -step walks on the graph from the identity permutation to a permutation of cycle type is the same thing as the algebraic problem of resolving into a linear combination of conjugacy classes.
For any , we have
where both sums are over , and the two sums are in fact the same since Thus, the class algebra is a subalgebra of the center of the group algebra. Conversely, if is a central element, then whence is a linear combination of conjugacy classes, and we conclude that the class algebra coincides with the center of the group algebra Again, all of the above is true for any group, and you can view as a functor from groups to commutative, unital -algebras which transports each group to its class algebra
In addition to the conjugacy class themselves, the levels of the Cayley graph of belong to the class algebra, and are -linear combinations of conjugacy classes,
Lifted to the group algebra, Theorem 1 says that these central elements are given by the sums
where is the set of functions from to Equivalently, we have
where are the sums
Note that is an empty sum, hence equal to the zero element in A nice way to picture these sums is to take the conjugacy class of transpositions and view it as the strictly upper triangular matrix over whose entry in row and column is the transposition whenever and when not. Then is simply the th column sum of the matrix . For example, in the case we have
The transpositions sums are known as the Young-Jucys-Murphy elements of the group algebra Note that the YJM elements are non-central, since they correspond to proper subsets of a conjugacy class (namely, the class ), and therefore cannot be expressed as linear combinations of conjugacy classes. Nevertheless, we have the following.
Proposition 2: The YJM elements commute with one another.
Proof: Problem set 2.
In view of the commutativity of the elements , we may equivalently write Theorem 1 as
which says that is the sum of all squarefree monomials of degree in the commuting elements . But this is a polynomial you know from Math 202B: it is the elementary symmetric polynomial of degree in variables. In particular, Theorem 1 is equivalent to the following polynomial identity in the class algebra .
As you know from Math 202B, the elementary symmetric polynomials are important because they express the coefficients of a polynomial as functions of its roots (the inverse problem, expressing the roots as functions of the coefficients, is harder). Thus Theorem 2 is equivalent to the following factorization in the class algebra.
Theorem 3: For any , we have
In addition to being a direct incarnation of the relationship between the roots of a polynomial and its coefficients, there is another reason why the elementary symmetric polynomials are deserving of their name. Let be commuting indeterminates, and consider the corresponding polynomial algebra As you know from Math 202B, polynomials satisfying
are called symmetric polynomials, and their totality forms a subalgebra of denoted This subalgebra is itself a polynomial algebra.
Theorem 3 (Newton): We have
Theorem 3 is often referred to as the fundamental theorem on symmetric polynomials. A proof is given in Professor Sam’s 202B notes (Theorem 3.3.4). Combining Theorem 3 with Theorem 2, we learn the following: even though the JM elements do not themselves belong to the class algebra , any symmetric function of them does.
Theorem 4: For any the group algebra element is central.
Theorem 4 is quite remarkable, and has many interesting consequences. Here is one. Suppose we weaken our definition of strictly monotone walks on the edge-labeled Cayley graph of the symmetric group to just plain monotone walks, meaning walks which have the feature that the labels of the edges they traverse form a weakly increasing sequence.
Theorem 5: The number depends only on the conjugacy class of .
which is precisely the sum of all weakly monotone walks on beginning at Thus,
But since is a symmetric polynomial, Theorem 4 guarantees that , so that we must have for all as claimed.
No direct combinatorial proof of Theorem 5 is known.
Exercise: In the symmetric group , write down all factorizations of the full cycle into three transpositions; identify the factorizations which are weakly monotone, and those which are strictly monotone. Repeat this procedure with the full cycle
Before leaving the world of linear algebra for multilinear algebra (i.e. tensors), let us reformulate the spectral theorem entirely “upstairs” in (as in Lecture 1, is complex vector space of dimension equipped with a scalar product).
Theorem 1: If is selfadjoint, then there is a positive integer real numbers and selfadjoint operators in satisfying
The operators above are called orthogonal idempotents. The “idempotent” part is , and this term is applied more generally in ring theory to refer to anything equal to its own square. The “orthogonal” part is , the zero operator in , when This is a useful representation of , because orthogonal idempotence immediately implies that
Given your Math 202A background, you have realized immediately that Theorem 1 is equivalent to the eigenvalue/eigenvector form of the Spectral Theorem discussed in Lecture 1, because the set of selfadjoint idempotents acting in is in canonical bijection with the lattice of subspaces of , since these are exactly the operators which act by orthogonal projection onto a subspace: for any , the corresponding maps to the nearest point of the subspace . The correspondence between selfadjoint idempotents — which are often simply called orthogonal projections — and subspaces even holds in the infinite-dimensional setting provided is complete with respect to the norm , i.e. is a Hilbert space, provided we restrict to closed subspaces. Thus the Grassmannian is for all practical purposes equivalent to the set of selfadjoint idempotents/orthogonal projections of rank . Theorem 1 above is obtained from the eigenvalue/eigenvector form of the Spectral Theorem by replacing the eigenspaces of with the orthogonal projections onto those spaces, and conversely Theorem 1 gives the eigenvalue/eigenvector form of the Spectral Theorem with being the distinct eigenvalues of whose corresponding eigenspaces are the images of the orthogonal idempotents .
Let us close out this brief recap of eigenvectors and eigenvalues (i.e. Math 202A material) by recalling the algebraic criterion for diagonalizability. An operator is said to be normal if it commutes with its adjoint, . Obviously, selfadjoint operators are normal, , and so are unitary operators since .
Theorem 2: Given , there exists an orthonormal basis of such that for each , where are scalars, if and only if is normal. Equivalently, is a -linear combination of pairwise orthogonal selfadjoint idempotents if and only if .
After revisiting this essential piece of 202A technology, my plan was to jump into multilinear algebra — tensor products, exterior products, symmetric products, and their applications to eigenvalue inequalities — and work up to the construction of Schur functors, which interpolate between these using the representation theory of the symmetric groups. This would be an application of the material you learned in 202B wherein the goal is to enhance the material you learned in Math 202A. However, I have since changed my mind. After reviewing Professor Sam’s 202B lecture notes (which I am posting to Piazza for your reference), and speaking briefly with Steven, I have realized that you already have some exposure to this material. Even though there is much left to be done in this direction, namely starting from Section 6 in Steven’s notes and working up to a proof of Theorem 6.4.1 via Schur-Weyl duality, I have ultimately decided to leave this for the second half of the course, and begin instead with some very natural and appealing applications of symmetric group representations which we can sink our teeth into immediately. Also, this will give you exposure to material which cannot be looked up in textbooks, or at least cannot be found presented in the way we’ll do it.
Definition 1: For each , the symmetric group of degree consists of the set of bijective functions
i.e. automorphisms of , with the group law being composition of functions. Elements of are called permutations.
As you know from Math 202B (and likely much before), any permutation can be factored into a product of disjoint cyclic permutations, and this factorization is unique up to the order of the factors. Our starting point will be a rather different unique factorization theorem for permutations. To explain, let us consider from the perspective of geometric group theory. Let denote the set of transpositions, i.e is the conjugacy class of -cycles,
As you are also no doubt aware, generates , which means that any permutation can be written as a product of transpositions,
This representation is very far from being unique, and indeed the question of counting the number of factorizations of a given permutation into a given number of transpositions is a classical problem in algebraic combinatorics known as the Hurwitz enumeration problem; despite its elementary formulation, this question has deep ties to enumerative algebraic geometry and mathematical physics, some of which we will hopefully be able to discuss. Nevertheless, for each there is a minimal value of for which such a factorization exists, and this is called the word norm of and denoted , where the subscript reminds that this quantity depends on the choice of generating set We will omit this subscript in order to lighten the notation, and if we switch to a different generating set inducing a different word norm on , we will say so loud and clear. The following is elementary combinatorics, but you should take the time to prove it.
Proposition 1: The length of any factorization of a permutation into transpositions satisfies for some . That is, all factorizations of into transpositions have the same parity.
Somewhat less elementary but no less important is the fact that there is a tight connection between the word norm of a permutation and the number of factors in its unique decomposition into disjoint cycles.
Proposition 2 (join-cut relation): We have
The key idea in the proof of Proposition 2 is expressed in its name: multiplying a permutation by a transposition will either join two of its cycles into one, or cut one of its cycles into two. Thus,
The word norm makes the symmetric group into a metric space space.
Proposition 3: The function
defined by | is a metric.
We may thus for example speak of the open ball of radius centered at a given permutation ,
and its boundary, the sphere
The metric is nothing but the graph theory distance (or geodesic distance) on the Cayley graph of the symmetric group as generated by the conjugacy class of transpositions. The vertex set of this graph is itself, and two permutations are joined by an edge if and only if there is such that Note that this is an undirected graph because is the identity permutation for every so that is equivalent to Note also that is a regular graph of degree Another feature of is that it is a graded graph: it decomposes into a disjoint union of independent sets or “levels,”
with being the sphere of radius centered at the identity permutation . By the join-cut relation, may also be described as the set of permutations whose disjoint cycle decomposition contains factors.
The Hurwitz enumeration problem has an equivalent formulation in terms of the Cayley graph: it asks for the number of -step walks from to on This description also points to a refinement of the Hurwitz problem. Let us introduce the edge-labeling on in which each edge corresponding to the transposition is marked by , the larger of the two numbers it interchanges. We will call this the Biane-Stanley edge labeling of the symmetric group, after Philippe Biane and Richard Stanley, in whose work it appears implicitly.
Given an -step walks on , we define its signature to be the corresponding sequence of edge labels. We say that a walk is strictly monotone if its signature is a strictly increasing sequence. Thus, an -step strictly monotone walk from to is the same thing as a strictly monotone factorization of into transpositions, i.e. a factorization
Theorem (unique monotone factorization): Every permutation admits a unique strictly monotone factorization, and the length of this factorization is equal to That is,
Proof: The proof is by induction on .
In the base case we have with and The complete answer to the Hurwitz enumeration problem for is thus given by
This corresponds to the fact that if and have the same parity, then we have the unique length factorization
but no factorization exists if and have opposite parity. Imposing the strictly monotone constraint whittles this down to
For the inductive step, there are two cases to consider. Let and let be any permutation. If is a fixed point of , i.e. , then has a unique strictly monotone factorization by the induction hypothesis. If on the other hand is one of the
permutations which contains in its support (meaning ), then is of the form
for a unique pair and By the induction hypothesis, admits a unique strictly monotone factorization of length equal to , and the result follows.
Graphically, the unique monotone factorization theorem gives a presentation of the symmetric group as a starlike tree, which we call the Jucys tree after the Lithuanian physicist Algimantas Adolfas Jucys (not to be confuse with his father, the Lithuanian physicist Adolfas Jucys), in whose work it appears implicitly. We will embed this construction in the group algebra in Lecture 3, and start using it in conjunction with linear representations of the group algebra.
Throughout this lecture, is a finite-dimensional complex vector space equipped with a scalar product , which is by convention a linear function of its second argument. The scalar product on determines various symmetry classes in the algebra of linear operators and the study of these symmetry classes is one of the main components of Math 202A. In particular, recall that an operator is said to be selfajdoint (or Hermitian) if
Hermitian operators are the subject of the Spectral Theorem, which may be formulated as follows.
Theorem 1: If is selfadjoint, then there exists an orthonormal basis of such that , where are real numbers.
Theorem 1 is the eigenvalue/eigenvector form of the Spectral Theorem. We will give a proof of this result which iteratively constructs the eigenvalues/eigenvectors of as the maxima/maximizers of the (necessarily real-valued) quadratic form on a nested sequence of compact sets in . Instead of invoking the Fundamental Theorem of Algebra to claim the existence of a root of the characteristic polynomial, this argument calls the Extreme Value Theorem as a subroutine.
Lemma 1: If is a selfadjoint operator such that the affiliated quadratic form is nonnegative, then any zero of is in the kernel of .
Proof: In fact, the argument does not use the finite-dimensionality of . Let be such that Consider a real perturbation of this zero in an arbitrary direction, i.e. consider , where is arbitrary and is a fixed unit vector. We then have
The first term is zero by hypothesis. The middle terms are the real number times
Thus by hypothesis the function
is a nonnegative function of the real variable . This function has the asymptotics
as , so it must be the case that , otherwise would change sign at , which it does not, being a nonnegative function. Repeating the above with a pure imaginary perturbation , we get that as well. Thus is orthogonal to every unit vector , whence it must be the zero vector,
Proof of Theorem 1: Since is finite-dimensional, its unit sphere
is compact, and the function is continuous. Thus, by the Extreme Value Theorem, there exists a unit vector at which attains its maximum value on , i.e. we have
for all unit vectors . But this is equivalent to
for all unit vectors , and in this form it is clear that the inequality actually holds for arbitrary vectors , i.e. we have
for all where
Note that is selfadjoint (indeed, selfadjoint operators in form a real vector space), and the above says that the associated quadratic form is nonnegative. Since by construction, we may invoke Lemma 1 to conclude that is in the kernel of , i.e. is an eigenvector of with eigenvalue .
Consider now the orthogonal complement
i.e. the -dimensional hyperplane in perpendicular to the line through . Let be the unit sphere in , let be the maximum of the quadratic form on , and let be a point at which the maximum is achieved. By exactly the same argument as above, we get that is an eigenvector of with eigenvalue . Indeed, we can iterate the argument times to obtain an orthonormal set of eigenvectors with eigenvalues .
2. The subspace lattice
Let denote the set of all subspaces of , including the trivial subspace and itself. This set carries a natural partial order: given we declare if and only if is a subspace of . In fact, is a graded poset in which the rank of a subspace is its dimension . The graded poset is moreover a lattice: the largest subspace contained in each of two given subspaces and is their intersection, and the smallest subspaces containing both of them is their span. Finally, the lattice is self-dual: the map sending a subspace to its orthogonal complement is an order-reversing involution.
In formulating the above, we have just encountered our first example of a functor: we can view as a covariant functor from the category of vector spaces to the category of posets. Indeed, if are vector spaces and is a linear transformation, we get an order-preserving function defined by
In this course we will be considering quite a few more functors, most of which will be endofunctors from the category of vector spaces to itself. The most basic of these are tensor powers, symmetric powers, and exterior powers, which are the basic operations of multilinear algebra and are closely related to the physical concept of Fock space, which is intended to capture the notion of symmetry classes of elementary particles in quantum mechanics — bosons and fermions and such.
3. Flags and Grassmannians
Very loosely speaking, vector spaces are continuous versions of sets. A finite set might represent the possible states of some physical system, while the free vector space over this set allows for superpositions of states, i.e. linear combinations of them. Associated to is its lattice of subsets , where meet and join are intersection and union. This is called a Boolean lattice, and it is graded by cardinality rather than dimension. Sometimes is referred to as a hypercube, because its elements can be thought of as length bitstrings, and these may in turn be thought of as the vertices of a polytope in which is a cube when .
In a general poset, a chain is a totally ordered subset, while an antichain is a subset which does not contain any comparable pairs of elements. A chain or antichain is said to be saturated if adding any new element will cause it to no longer be a chain or antichain. It is easy to spot saturated chains in the Boolean lattice — they are just sequences of subsets of such that , and there are factorially many of these. It is equally easy to spot saturated antichains, namely the collections consisting of all subsets of of cardinality . The cardinality of itself is then , and it is a famous result in extremal combinatorics that the size of the largest antichain in the Boolean lattice is the largest number in the th row of Pascal’s triangle.
The lattice of subspaces of a finite-dimensional vector space is in many ways analogous to the lattice of subsets of a finite set. Indeed, if one could define a field with one element in a satisfactory way, then the two notions would coincide for vector spaces over . This is problematic, but one can analyze the subspace lattice of a vector space defined over without difficulty, and count chains, antichains, etc. For example, the antichain consisting of -dimensional subspaces is the Gaussian binomial coefficient , which degenerates to the ordinary binomial coefficient in the limit, almost as if really did exist…
Anyway, we are working over , so everything is infinite. Chains in are typically referred to as flags, though this term is more often than not reserved for chains that contain the trivial subspace and the itself. Saturated chains are called complete flags: they are sequences
such that The analogue of the antichain in the Boolean lattice consisting of subsets of cardinality is the collection of -dimensional subspaces of ; these antichains are called Grassmannians and denoted . We are going to discuss Grassmannians of finite-dimensional complex vector spaces in some detail, and I hope that we will have time to discuss a certain special infinite-dimensional Grassmannian known as the Sato Grassmannian. For now, we concentrate on the following fundamental relationship between eigenvalues of selfadjoint operators and Grassmannians.
3. Eigenvalues as extremizers
Let be a selfadjoint operator, and define a function
Thus, assigns to each subspace the maximum of the quadratic form associated to on the unit sphere in . In particular, we know from our proof of the Spectral Theorem above that the largest eigenvalue of is given by
The following result, known as the Courant-Fisher minimax theorem, generalizes this by giving a variational characterization of each individual eigenvalue of .
Theorem 2: If is a selfadjoint operator with eigenvalues then for each we have
That is, the th largest eigenvalue of is the minimum value of over the Grassmannian of -dimensional subspaces of .
Proof: Let be an orthonormal basis of such that , For each , consider the space
spanned by the eigenvectors corresponding to the largest eigenvalues of and also set . In the proof of Theorem 1, we showed that
so that indeed the th largest eigenvalue is given by evaluating the function at a point of the Grassmannian , namely It remains to show that the evaluation of on every other subspace of dimension is at least .
To achieve this end, we establish the following squeeze inequality for the values of the quadratic form on the subspace spanned by the first eigenvectors: for all we have
Indeed, let be any vector in . Using orthonormality of the eigenvectors we calculate
and the lower bound is established in exactly the same way.
Now let be arbitrary. Since
the intersection of the subspaces and is at least one-dimensional (see Problem set 1). Let be a unit vector in . Since , we have by the above inequality. But also , which shows that , as required.
4. Adjoint orbits
Any operator which preserves the scalar product,
is called a unitary operator. It is clear that the set of unitary operators on is a group, indeed a subgroup of the general linear group . Note that the former depends on the scalar product but the latter does not; one may equivalently say that invertible operators map bases to bases, while unitary operators map orthonormal bases to orthonormal bases.
Let be given real numbers, and let be the set of selfadjoint operators on with this given spectrum.
Theorem 3: Choose any orthonormal basis , and let be the selfadjoint operator defined by
Then the isospectral set is given by
Proof: First we check that consists entirely of selfadjoint operators having spectrum . The selfadjointness of follows immediately from the selfadjointness of together with the fact that . To see that has spectrum , let be the orthonormal basis of defined by
Thent is an eigenvector of with eigenvalue .
Now let be a given selfadjoint operator with eigenvalues . By the Spectral Theorem, there exists an orthonormal basis such that If is the unitary operator sending to , then
Welcome to Math 202C in the Spring quarter of 2022. This course concludes the applied algebra graduate course sequence, the previous installments of which were taught by Professor Freddie Manners and Professor Steven Sam. The goal of the course will be to unite and extend the material you learned in 202A and 202B in a natural, pleasing, and useful way. We will focus on applications of the representation theory of the symmetric groups, which was the subject of 202B. The Jucys-Murphy elements will be constructed, and their action in irreducible representations will be applied to analyze random walks (card shuffling) and permutation factorizations (Hurwitz theory). We then review some 202A material, namely the basic constructions of multilinear algebra (tensor, exterior, and symmetric algebras), and their physical interpretation as state spaces for bosons and fermions (Fock spaces). We apply the representation theory of the symmetric groups to prove the existence of a -Fock space interpolating between the symmetric and exterior algebras. The two parts of the course are then unified using the infinite wedge space formalism, whose connection with the theory of integrable systems via the Sato Grassmannian will be discussed if time permits.
Now to the mechanics of Math 202C. This post, Lecture 0, serves as the course syllabus. The course is being taught remotely, and the plan is that the two weekly lectures will be delivered in the form of blog posts like the one you are reading right now. Each of these posts will likely cover more than I would be able to cover in an in-person ninety minute lecture. These posts will appear every Tuesday and Thursday in a temporal neighborhood of our scheduled lecture slots, and the course will feel much like a reading course, albeit a fairly intense one leading up to a qualifying exam. I will also hold live office hours over Zoom, so that we can discuss the course content in real time, and so that I can embellish and add to it verbally. This will happen on Fridays, and we will work out a time over Piazza, which we will use as a question and answer forum (signup link here). We’ll be conducting all class-related discussion on Piazza. The quicker you begin asking questions on Piazza (rather than via emails), the quicker you’ll benefit from the collective knowledge of your classmates and instructors. I encourage you to ask questions when you’re struggling to understand a concept.
As for evaluation, there are no exams (neither midterm nor final), though of course you’ll be taking the Applied Algebra qualifying exam mid-quarter. We will however have weekly problem sets which will be due on Sundays and submitted via Gradescope. More information about this will be posted on Piazza. These problem sets will help to cement the concepts you’ll be learning in the course, and will also give you an idea of the sorts of problems that are likely to appear on the qualifying exam. I believe this is everything you need to know to get started — see you on Piazza starting today, and in real time on Friday.
Spectral theorem, flags and Grassmannians, minimax
Jucys’s theorem: unique factorization
Symmetric polynomials and JM elements
Murphy’s theorem: content eigenvalues
Top-to-random shuffle I
Top-to-random shuffle II
Basic category theory
Tensor product, tensor powers, tensor algebra
Symmetric and antisymmetric tensors, bosons and fermions
These are respectively known as the degree elementary, complete, and power sum symmetric functions over the alphabet . Note that they are not actually functions, but rather elements of the algebra of formal power series over The “fundamental theorem of symmetric function theory” asserts that these series all generate the same subalgebra of .
Theorem (Newton): We have
The subalgebra of generated by any of the above three fundamental families is called the algebra of symmetric functions, and typically denoted or if the alphabet needs to be specified. This algebra has a lot of internal structure which expresses itself in combinatorially interesting ways. To see some of this, you might try expressing the elementary symmetric functions in terms of the power sums, as Newton did.
Definition 1: A specialization of is an algebra homomorphism from to a commutative -algebra
In practice, specializations typically occur as families of substitutions. Here is a very simple example. For any we have a specialization
i.e. by substituting the numerical value for each of the variables (or any other specified set of variables), and substituting the numerical value for the rest. It is natural to identify the homomorphism with the numerical multiset
and write rather than to reflect the substitution.
Problem 1: Explicitly compute
Another interesting numerical specialization arises from taking
For this numerical substitution we have
which is a sum people have been thinking about since antiquity. In particular, there are various formulas for this sum, many of which involve the Bernoulli numbers. In order to evaluate and one may proceed rather differently as follows. Let be a new variable, and consider the generating functions
which are elements of the algebra of formal power series in the single variable , with coefficients in . These generating functions may be explicitly evaluated without too much effort.
We can repackage the the content of Lecture 13 as the study of a certain family of specializations
where is the center of the group algebra These specializations are defined using the substitution multiset
are the Jucys-Murphy elements in The natural basis of is the conjugacy class basis,
so understanding the JM specialization of means being able to compute
the coefficient of a given conjugacy class in the specialization of a given symmetric function What we saw in Lecture 13 is that for certain this problem has a combinatorial interpretation as counting certain walks in the Cayley graph of as generated by the conjugacy class of transpositions.
Consider first the elementary symmetric functions What we saw in Lecture 12 is that
is the number of -step strictly monotone walks from the identity permutation to any permutation of cycle type Moreover, we were able to solve the class expansion problem for the elementary symmetric polynomials explicitly:
with the sum on the right being exactly the identity-centered sphere of radius in or equivalently the th level of the Cayley graph. Equivalently, for the generating function
which is the generating function for permutations by norm.
For the complete symmetric functions we found that
is the number of -step weakly monotone walks from the identity permutation to any permutation of cycle type . We did not give an exact formula for this number, and indeed no such formula is known. However, we can make some progress using another remarkable aspect of the JM elements, namely their interaction with the representation theory of
We know that, for any the specialization is a central element in the group algebra Thus, by Schur’s lemma, acts in any irreducible representation of as a scalar operator, i.e. we have
with a scalar and the identity operator. It turns out that the eigenvalue can be expressed in terms of a remarkable numerical specialization of
Definition 2: Given a Young diagram and a cell the content of is its column index minus its row index. The content alphabet of is the multiset of the contents of its cells,
Theorem (Jucys): The eigenvalue of acting in is the specialization of on the content alphabet of
In fact, Jucys proved a more general result which implies the above: he showed that for any of the elements the Young basis of is an eigenbasis of the operator and
where is the content of the cell containing in the standard Young tableau
In particular, the Jucys’ result enables us to obtain a formula for the generating function enumerating monotone walks from the identity permutation to any permutation of cycle type in terms of the characters of
For certain choices of , the sum on the right simplifies considerably and explicit formulas for follow. For example, in the case where , we may use the fact that the character of a full cycle vanishes in any representation for which is not a hook diagram, and is either when is a hook. This leads to an explicit formula for the number of monotone walks of arbitrary length from the identity to a full cycle in Understanding all of this requires only Math 202B knowledge, and the details are in the paper referenced above.
In Lecture 12, we analyzed the top-to-random shuffle of a deck of cards, which is the same thing as lazy simple random walk on the Cayley graph of the symmetric group generated by the star transpositions
An important role in this analysis was played the sum of these transpositions,
which is an element of the group algebra In fact, is one of a special family of elements in called the Jucys-Murphy elements of These element have many remarkable properties, a few of which we will discuss in this lecture — in fact, these simple transposition sums are so rich in structure that they encapsulate the entire representation theory of the symmetric group, and can be taken as the starting point for a very intuitive and streamlined re-imagining of this subject.
The JM elements can be understood in terms of the Cayley as arising from a special edge-labeling of called the Biane-Stanley edge labeling. To construct this labeling, mark each edge of the group corresponding to the transposition with the larger of the two numbers interchanged. Thus, each edge of is assigned a label from the set Emanating from each vertex we see one -edge, two -edges, three -edges, etc., as in the picture below.
For a full example, the figure below shows the Cayley graph of the symmetric group with the Biane-Stanley edge labeling, with -edges drawn in cyan, -edges in magenta, and -edges in black.
This edge labeling of the Cayley graph of the symmetric group was introduced by Richard Stanley in the context of noncrossing partitions, and you can read about this here. We will use the introduction of more structure on to break away from the Markovian paradigm of the evolution of a memoryless particle on
Definition 1: A walk on is said to be monotone if the labels of the edges it traverses form a non-decreasing sequence, and strictly monotone if the labels of the edges it traverses form an increasing sequence.
You can think of monotone walks as virtual histories of a particle moving around on which learns from experience: once the particle traverses a road of value , it refuses to traverse roads of lesser value. Strictly monotone walks are histories of an even more fastidious particle that seeks out roads of strictly increasing value as it evolves. If the motion of the particle is random, then monotone walks on are histories of a self-interacting random walk, meaning a random walk whose next step at each time instant depends on its history up to that time.
Given permutations and a nonnegative integer let denote the number of -step walks on from to and let denote the number of strictly monotone -step walks from to Obviously, we always have
where we recall that is the number of unrestricted -step walks from to
Theorem 1: For any permutations we have
That is, there exists a unique strictly monotone walk between any two permutations, and this walk is a geodesic.
Proof: First, observe that the problem is vertex transitive: we have for any Indeed, if
is a strictly monotone walk from to then
is a strictly monotone walk from to and vice versa. It is thus sufficient to prove that
where is the identity permutation. Here is a proof by picture: draw all strictly monotone walks on departing from . What results is a graphical presentation of as a starlike tree:
The basic idea of an actual proof is as follows. First, verify that every cyclic permutation admits a unique strictly monotone factorization, the number of factors of which is one less than the length of the cycle, which is the smallest number of factors possible. Now for an arbitrary permutation, factor each of its cycles into a minimal strictly monotone product of transpositions, and observe that there is a unique way to rearrange the transposition factors of all these factorizations so that they are strictly monotone.
A group-theoretic way to conceptualize Theorem 1 is as a unique factorization theorem for the symmetric group: every permutation has a unique factorization of the form
such that and moreover In terms of the group algebra we may rephrase Theorem 1 as follows: we have
is the th level of the Cayley graph, or in other words the identity-centered sphere of radius in the symmetric group, viewed as an element of the group algebra Equivalently, this can be written
Defintion 1: The Jucys-Murphy elements in are the transposition sums
For notational convenience, we set the zero element of the group algebra
Problem 1: Prove that the JM elements commute with one another.
Let the symmetric group act on the algebra of of polynomials in variables by permuting variables:
The algebra of symmetric polynomials is the corresponding space of invariants,
A non-obvious fact is that this algebra of invariants is in fact a polynomial algebra.
Theorem (Newton): We have
Newton’s theorem, which is often called the fundamental theorem of symmetric function theory, says that every is a polynomial in the elementary symmetric polynomials
Theorem 2: The substitution rule
defines a homomorphism
from the algebra of symmetric polynomials in variables to the center of
Proof: Theorem 1 says that
Thus, by Newton’s theorem, any symmetric polynomial evaluated on the JM elements will be a polynomial in It thus suffices to show that each level of the Cayley graph (identified with the formal sum of its constituents) is an element of But this is clear: the center is spanned by the conjugacy classes of (each of which is identified with the formal sum of its constituents), and
Theorem 2 has rather interesting consequences for monotone walks on the Cayley graph. Note that, if we consider all walks on the Cayley graph, then the number of -step walks from the identity to a given permutation depends only on the conjugacy class of Indeed, if is a conjugate of then every factorization
of corresponds to the factorization
and this correspondence is clearly bijective. This simple argument does not show that the number of monotone factorizations
is equal to the number of monotone factorizations of because conjugating each of the factors will not necessarily preserve the monotonicity condition. However, the following more sophisticated argument shows that indeed
which is the sum of all monotone products of transpositions. Evaluating each term in the sum and collecting like coefficients, the coefficient of is the number of monotone factorizations of and the coefficient of is the number of monotone factorizations of But, since this sum is a linear combination of conjugacy classes by Theorem 2, and since and belong to the same conjugacy class, these coefficients are equal.
In the next lecture, we will combine the spectral theory of the JM elements with their combinatorial properties to give eigenvalue formulas for the enumeration of monotone walks on the Cayley graph.
Since the representation theory of the symmetric group is likely fresh in your mind thanks to yesterday’s exam, this is a good time to show you a very nice probabilistic application of this theory to card shuffling. This post will discuss the “top-to-random” shuffle, where the cards are arranged in a single row (or a single stack), and a randomly chosen card is repeatedly swapped with the rightmost (or top) card. The algebraic model for this procedure is the Cayley graph of the symmetric group as generated by the so-called “star transpositions”
Problem 12.1: Prove that the above transpositions do in fact generate
These are called star transpositions because, if one forms the graph on such that is an edge if and only if one of the above transpositions swaps and , then this graph is a star graph. The top-to-random shuffle is then the same thing as the lazy simple random walk on the Cayley graph
The paper attached below gives a detailed analysis of the top-to-random shuffle using the representation theory of and gives a precise meaning to the statement that such shuffles are required in order to randomize the deck. The argument is pretty much all representation theory — only some very basic probabilistic reasoning is used towards the end.
If I was to start this course over again, I would not have limited the opening act to a discussion of the relationship between linear algebra and graph theory, but with a more general discussion of the relations between multilinear algebra and graph theory. Since I sketched this briefly in our last in-person lecture, I will record what was said here in more precise detail.
Let be a Hilbert space with finite or countably infinite orthonormal basis To make things simpler, let us consider the case where but everything we say below works (or can be made to work) for arbitrary separable Hilbert spaces. Let be a the one-dimensional Hilbert space spanned by a unit vector which does not belong to
Let and recall that , the th tensor power of is the vector space of degree tensors
and linear combinations thereof. This is again a Hilbert space, with the scalar product defined by
In particular, an orthonormal basis of is given by the tensors
where is the set of all functions
Thus every tensor is a linear combination
The classical way to think about this is that is the space of complex-valued functions on -tuples of vertices of a graph with vertex set Alternatively, you can think of as the -particle space corresponding to the one-particle space From this point of view, each function
corresponds to an assignment of distinguishable particles labeled to the vertices of If instead we are dealing with quantum particles, meaning particles whose location cannot be definitely specified, then we instead must deal with a unit tensor
in , the idea being that
represents the probability to find the distinguishable particles at vertices respectively. It is convenient to define where is a unit vector not in called the “vacuum vector,” which corresponds to no particles, i.e. a vacuum.
Now suppose that one would like to deal with a variable (but finite) number of particles: the tensor algebra over is the algebraic structure that does this. It is defined as the direct sum of all tensor powers of
This is a graded algebra, i.e. we have a natural multiplication of tensors: if and then Note that is a noncommutative algebra, and it is also infinite-dimensional, even though is finite-dimensional.
Problem 11.1: Show that is (non-canonically) isomorphic to the free algebra of rank which consists of polynomials in noncommutative indeterminates.
Returning to the viewpoint that is the one-particle space corresponding to a graph with vertex set we encode the edge structure of using the adjacency operator defined by
where is the multiplicity of the edge in As we have discussed, the matrix elements of then count -step walks in i.e.
is the number of -step walks from vertex to vertex in Let us now observe that this naturally extends to any finite number of particles. Namely, for any we have the operator with action defined by
Proposition 1: For any any and any the number of ways in which distinguishable particles on can move from positions to positions in instants of time, where at each instant each particle jumps to an adjacent vertex, is given by the scalar product
Proof: We have that
The th factor in the product counts the number of ways in which particle can travel from vertex to vertex in jumps along the edges of Since the particles are non-interacting, the product of these numbers is the number of possible histories corresponding to the evolution of the particle system from initial state to state in time
Now suppose that is an orthonormal basis of consisting of eigenvectors of
Then, the tensors
are an orthonormal basis of consisting of eigenvectors of
Corollary 1: For any any and any the number of ways in which distinguishable particles on can move from positions to positions in instants of time, where at each instant each particle jumps to an adjacent vertex, is given by
The point of the above discussion is that, associated to every graph is not just a Hilbert space encoding the vertices and an operator encoding the edges, but an infinite dimensional noncommutative algebra encoding -valued functions on ordered tuples of vertices of arbitrary finite length, and an operator acting on these functions in a manner determined by the adjacency structure of Moreover, the variable multi-particle construction is not really any different from the one-particle construction: all the formulas are the same up to replacing indices with multi-indices.
The reason that we are discussing the above is that it gives a nice conceptual understanding of why the spectral analysis of Cayley graphs is so much more accessible than the spectral analysis of arbitrary graphs. In the group case the vertex set of is not an inert finite set, but rather a finite set which comes to us with a group law. Since this is a special case of the above, we can of course still form the corresponding one-particle space with orthonormal basis indexed by the group elements (which may or may not have a natural linear order), but it is more appropriate to denote this space since it is already an a unital, associative, and finite-dimensional algebra via bilinear extension of
We can still of course form the tensor algebra but there is a natural map from to its degree one component given by evaluation:
Moreover, it is not only which is naturally a finite-dimensional algebra, but all of its tensor powers via
which in multi-index notation we can succinctly write as Moreover, if is an abelian group, things get even simpler and all these finite-dimensional algebras are commutative, and character theory gives a universal method to construct the eigenvalues and eigenvectors of the adjacency operator and all of its tensor powers.
One demerit of the above is that the tensor algebra is “unphysical,” in the sense that if we were dealing with natural particles they would not be distinguishable. Moreover, we would want to classify them as either interacting or not, so decide between bosons and fermions populating the vertices of our graph This can be accomplished by letting the adjacency operator act not on the tensor powers of the one-particle space but rather on its symmetric and antisymmetric tensors powers. We will discuss this next time.