Written by a countably infinite number of monkeys with an uncountably infinite number of typewriters, edited by Andrew Dennehy
In a bold twist on the standard formula, in this lecture post we will be discussing math. Specifically, we will be discussing the Cohn-Umans algorithm and how representation theory can provide us an algorithm to multiply two matrices together. We will then see how this perspective could potentially help us prove an important conjecture in complexity theory, namely that there exists an algorithm which computes the product of two $n \times n$ matrices in $O(n^{2+\epsilon})$ time for any $\epsilon > 0$. Before we get to that, though, we will first cover an algorithm with many similarities to the one we will be discussing.
Consider the problem of multiplying two polynomials
$$p(x) = \sum_{i=0}^{n-1} a_i x^i, \qquad q(x) = \sum_{i=0}^{n-1} b_i x^i.$$
The naïve approach would be the one that’s taught in high school, namely finding all $n^2$ products of terms and using them to calculate the coefficients of $pq$. However, we’re significantly cooler than high school students, so we’re going to use some significantly cooler math. First, we introduce the concept of the discrete Fourier transform. Let $\omega = e^{2\pi i/n}$ be the primitive $n$-th root of unity. The DFT of a vector $v \in \mathbb{C}^n$ is the vector $\hat{v}$ given by
$$\hat{v}_j = \sum_{k=0}^{n-1} v_k \,\omega^{jk}.$$
Equivalently, if $v_k$ is the $k$-th component of $v$, then $\hat{v}_j$ is the value at $\omega^j$ of the polynomial with coefficients $v_k$. So, if we apply this transformation to the vector $a$ of coefficients of $p$, we get that
$$\hat{a}_j = p(\omega^j).$$
So, the problem of finding the Fourier coefficients of $a$ is equivalent to the problem of evaluating $p$ at $1, \omega, \omega^2, \ldots, \omega^{n-1}$. From now on, assume $n$ is a power of 2, since if it isn’t we can just add a bunch of 0 coefficients until we have a power of 2. Define the following two polynomials:
$$p_{\mathrm{even}}(x) = \sum_{i=0}^{n/2-1} a_{2i}\, x^i, \qquad p_{\mathrm{odd}}(x) = \sum_{i=0}^{n/2-1} a_{2i+1}\, x^i.$$
This lets us decompose $p$ as
$$p(x) = p_{\mathrm{even}}(x^2) + x\, p_{\mathrm{odd}}(x^2).$$
So, instead of evaluating $p$ at all those powers of $\omega$, we can instead evaluate $p_{\mathrm{even}}$ and $p_{\mathrm{odd}}$ at $\omega^2, \omega^4, \ldots, \omega^{2(n-1)}$ and then combine the results. Since $\omega^2$ is a primitive $(n/2)$-th root of unity, these are only $n/2$ distinct values. So, we have reduced the problem of plugging $n$ numbers into a degree $n-1$ polynomial into plugging $n/2$ numbers (specifically the $(n/2)$-th roots of unity) into two degree $n/2 - 1$ polynomials. Because we’ve assumed that $n$ is a power of 2, so is $n/2$, meaning we can repeat the same algorithm for $p_{\mathrm{even}}$ and $p_{\mathrm{odd}}$. We can apply this process repeatedly, which gives us a recursive algorithm with complexity $O(n \log n)$ for calculating the Fourier coefficients.
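The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than an optimized FFT; the function name `fft` and the list-based representation are our own choices:

```python
import cmath

def fft(coeffs):
    """Evaluate the polynomial with coefficient list `coeffs` at all n-th
    roots of unity, where n = len(coeffs) is assumed to be a power of 2."""
    n = len(coeffs)
    if n == 1:
        return coeffs[:]
    # Recurse on p_even and p_odd, evaluated at the (n/2)-th roots of unity.
    even = fft(coeffs[0::2])
    odd = fft(coeffs[1::2])
    out = [0] * n
    for j in range(n // 2):
        w = cmath.exp(2j * cmath.pi * j / n)  # omega^j
        # p(omega^j) = p_even(omega^{2j}) + omega^j * p_odd(omega^{2j}),
        # and omega^{j + n/2} = -omega^j gives the second half for free.
        out[j] = even[j] + w * odd[j]
        out[j + n // 2] = even[j] - w * odd[j]
    return out
```

Each level of the recursion does $O(n)$ work and halves the problem size, which is where the $O(n \log n)$ bound comes from.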
Similarly, one can show that we can invert the DFT by applying the same process to the polynomial with coefficients $\hat{a}_j$, with $\omega$ replaced with $\omega^{-1}$, and then dividing the result by $n$. The upshot of all of this work is that
$$\widehat{(pq)}_j = p(\omega^j)\, q(\omega^j) = \hat{a}_j\, \hat{b}_j$$
(after padding both coefficient vectors to length at least $2n-1$ so the product fits). So, we can calculate the Fourier coefficients of $pq$ by calculating the Fourier coefficients of each polynomial independently and multiplying them together pointwise. Since the pointwise multiplication has linear complexity and both the Fourier and inverse Fourier transforms have $O(n \log n)$ complexity, this method gives us an $O(n \log n)$ complexity algorithm to multiply two degree $n$ polynomials.
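Putting the pieces together, the whole evaluate–multiply–interpolate scheme might look as follows. This is a sketch in our own notation: the helper `_fft` takes an `inverse` flag implementing the $\omega \mapsto \omega^{-1}$ substitution, and `poly_multiply` does the padding and the final division by $n$:

```python
import cmath

def _fft(v, inverse=False):
    """DFT (or inverse DFT, up to the final division by n) of v;
    len(v) must be a power of 2."""
    n = len(v)
    if n == 1:
        return v[:]
    sign = -1 if inverse else 1          # the inverse transform uses omega^{-1}
    even = _fft(v[0::2], inverse)
    odd = _fft(v[1::2], inverse)
    out = [0] * n
    for j in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * j / n)
        out[j] = even[j] + w * odd[j]
        out[j + n // 2] = even[j] - w * odd[j]
    return out

def poly_multiply(p, q):
    """Coefficients of the product of two polynomials, given as
    coefficient lists (lowest degree first)."""
    m = len(p) + len(q) - 1              # number of product coefficients
    n = 1
    while n < m:                         # pad with zeros to a power of 2
        n *= 2
    pv = _fft(p + [0] * (n - len(p)))
    qv = _fft(q + [0] * (n - len(q)))
    prod = [a * b for a, b in zip(pv, qv)]   # pointwise products of values
    coeffs = _fft(prod, inverse=True)        # interpolate...
    return [c.real / n for c in coeffs[:m]]  # ...and divide by n
```

For example, `poly_multiply([1, 2], [3, 4])` recovers the coefficients of $(1+2x)(3+4x) = 3 + 10x + 8x^2$.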
Notice that instead of using the complex roots of unity, we could have instead used a generator for the multiplicative group $\mathbb{F}_p^\times$ and worked over the finite field $\mathbb{F}_p$ with identical results. So, we can frame multiplication of polynomials (taken modulo $x^n - 1$) as a special case of multiplying elements in the group algebra $\mathbb{C}[\mathbb{Z}_n]$ of the cyclic group $\mathbb{Z}_n$.
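To make the group-algebra framing concrete, an element of $\mathbb{C}[\mathbb{Z}_n]$ can be stored as a list of $n$ coefficients, and the product is then the cyclic convolution. A naive $O(n^2)$ sketch (the name `convolve_zn` is our own):

```python
def convolve_zn(a, b):
    """Multiply two elements of C[Z_n], each given as a list of n
    coefficients: a[k] is the coefficient of the group element k.
    Group elements add mod n, so exponents wrap around."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] += a[i] * b[j]
    return out
```

If both inputs represent polynomials of degree less than $n/2$, the wrap-around never triggers, which is exactly why ordinary polynomial multiplication embeds as a special case.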
Next, we take a slight detour to discuss some group theoretic concepts. For any subset $S$ of a group $G$ (not required to be a subgroup), let the right quotient set $Q(S)$ of $S$ be defined to be
$$Q(S) = \{\, s_1 s_2^{-1} : s_1, s_2 \in S \,\}.$$
Exercise 1: Assume that $s_1, s_2 \in S$ and $s_1 s_2^{-1} = 1$. What does this imply about $s_1$ and $s_2$? Show that the condition that this implies is actually equivalent to $s_1 s_2^{-1} = 1$, so the implication goes both ways.
Definition 1: A group $G$ realizes $\langle n_1, n_2, n_3 \rangle$ if there are subsets $S_1, S_2, S_3 \subseteq G$ such that $|S_i| = n_i$ and for $q_i \in Q(S_i)$, if
$$q_1 q_2 q_3 = 1$$
then $q_1 = q_2 = q_3 = 1$. This condition is called the triple product property. Furthermore, this is equivalent to the condition that if
$$s_1 s_1'^{-1}\, s_2 s_2'^{-1}\, s_3 s_3'^{-1} = 1 \quad (s_i, s_i' \in S_i),$$
then $s_i = s_i'$ for each $i$.
Although it isn’t entirely trivial, one can show that if $G$ realizes $\langle n_1, n_2, n_3 \rangle$, then it realizes any permutation of those numbers.

Exercise 2: Prove that $\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \mathbb{Z}_{n_3}$ realizes $\langle n_1, n_2, n_3 \rangle$.

This is the “trivial realization” of $\langle n_1, n_2, n_3 \rangle$. We now get our first taste of why these ideas could be useful for the problem of matrix multiplication:
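Since the triple product property is a finite condition, Exercise 2 can be sanity-checked by brute force. Below is a small sketch of our own (with the group $\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \mathbb{Z}_{n_3}$ hard-coded as tuples added componentwise) that checks the quotient-set form of the condition directly:

```python
from itertools import product

def quotient_set(S, op, inv):
    """Right quotient set Q(S) = { s t^{-1} : s, t in S }."""
    return {op(s, inv(t)) for s in S for t in S}

def triple_product_property(S1, S2, S3, op, inv, identity):
    """Brute force: q1 q2 q3 = 1 with qi in Q(Si) must force qi = 1."""
    Q1, Q2, Q3 = (quotient_set(S, op, inv) for S in (S1, S2, S3))
    return all((q1, q2, q3) == (identity,) * 3
               for q1, q2, q3 in product(Q1, Q2, Q3)
               if op(op(q1, q2), q3) == identity)

# The trivial realization of <2, 3, 4> inside Z_2 x Z_3 x Z_4.
n1, n2, n3 = 2, 3, 4
op = lambda x, y: ((x[0] + y[0]) % n1, (x[1] + y[1]) % n2, (x[2] + y[2]) % n3)
inv = lambda x: ((-x[0]) % n1, (-x[1]) % n2, (-x[2]) % n3)
e = (0, 0, 0)
S1 = [(i, 0, 0) for i in range(n1)]  # each subset lives in its own factor
S2 = [(0, j, 0) for j in range(n2)]
S3 = [(0, 0, k) for k in range(n3)]
```

Note that putting all three subsets inside the *same* cyclic factor breaks the property, so the check is not vacuous.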
Theorem 1: Let $F$ be any field. If $G$ realizes $\langle n_1, n_2, n_3 \rangle$, then the number of field operations required to multiply an $n_1 \times n_2$ matrix by an $n_2 \times n_3$ matrix over $F$ is at most the number of operations required to multiply two elements of the group algebra $F[G]$.
Proof: Let $G$ realize $\langle n_1, n_2, n_3 \rangle$ through the subsets $S_1, S_2, S_3$. Then, for matrices $A$ and $B$ of sizes $n_1 \times n_2$ and $n_2 \times n_3$ respectively, we can index the rows of $A$ by $S_1$, the columns of $A$ and the rows of $B$ by $S_2$, and the columns of $B$ by $S_3$. Then, consider the product
$$\left( \sum_{s_1 \in S_1,\, s_2 \in S_2} A_{s_1, s_2}\; s_1^{-1} s_2 \right) \left( \sum_{s_2' \in S_2,\, s_3 \in S_3} B_{s_2', s_3}\; s_2'^{-1} s_3 \right).$$
By the triple product property, we have that
$$s_1^{-1} s_2\, s_2'^{-1} s_3 = t_1^{-1} t_3 \quad (t_1 \in S_1,\ t_3 \in S_3),$$
which is true iff $s_1 = t_1$, $s_2 = s_2'$, and $s_3 = t_3$. So, the coefficient of $s_1^{-1} s_3$ in the product is
$$\sum_{s_2 \in S_2} A_{s_1, s_2} B_{s_2, s_3} = (AB)_{s_1, s_3}.$$
So, you can just read off the elements of $AB$ from these coefficients.
The process we just described, of converting matrix multiplication to multiplication in $F[G]$, is called the Cohn-Umans algorithm. Also, we can now see the connection to the polynomial algorithm we described earlier; we’ve converted a problem of multiplying matrices to a problem of multiplying elements of $F[G]$, which is what we did with polynomials before using $\mathbb{C}[\mathbb{Z}_n]$.
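As a sanity check of Theorem 1, here is a small Python sketch of the reduction using the trivial realization (our own code, with a naive group-algebra product rather than a fast one): it encodes the two matrices as elements of $F[G]$ for $G = \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \mathbb{Z}_{n_3}$, performs one multiplication in $F[G]$, and reads the answer off the coefficients:

```python
from itertools import product

def cohn_umans_multiply(A, B):
    """Multiply an n1 x n2 matrix A by an n2 x n3 matrix B via one
    multiplication in F[G], G = Z_n1 x Z_n2 x Z_n3 (trivial realization)."""
    n1, n2, n3 = len(A), len(A[0]), len(B[0])
    mods = (n1, n2, n3)
    mul = lambda x, y: tuple((a + b) % m for a, b, m in zip(x, y, mods))
    inv = lambda x: tuple((-a) % m for a, m in zip(x, mods))
    s1 = lambda i: (i, 0, 0)  # S_1 indexes rows of A
    s2 = lambda j: (0, j, 0)  # S_2 indexes columns of A / rows of B
    s3 = lambda k: (0, 0, k)  # S_3 indexes columns of B

    # Encode A as sum_{i,j} A[i][j] * s1(i)^{-1} s2(j), and B similarly,
    # as dicts mapping group elements to coefficients.
    a = {mul(inv(s1(i)), s2(j)): A[i][j]
         for i, j in product(range(n1), range(n2))}
    b = {mul(inv(s2(j)), s3(k)): B[j][k]
         for j, k in product(range(n2), range(n3))}

    # One (naive) multiplication in the group algebra.
    c = {}
    for g, x in a.items():
        for h, y in b.items():
            c[mul(g, h)] = c.get(mul(g, h), 0) + x * y

    # The (i, k) entry of AB is the coefficient of s1(i)^{-1} s3(k).
    return [[c.get(mul(inv(s1(i)), s3(k)), 0) for k in range(n3)]
            for i in range(n1)]
```

The point of the whole approach is that the middle step is a single multiplication in $F[G]$, which (as with polynomials) representation theory can potentially make fast.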
We can now define a measurement of the “quality” of embedding that a given group can provide.

Definition 2: The pseudo-exponent $\alpha(G)$ of a non-trivial finite group $G$ is the minimum of
$$\frac{3 \log |G|}{\log(n_1 n_2 n_3)}$$
over all $n_1, n_2, n_3$ (not all 1) such that $G$ realizes $\langle n_1, n_2, n_3 \rangle$.
Lemma 1: For any non-trivial finite group $G$, $2 < \alpha(G) \le 3$. Further, if $G$ is abelian, then $\alpha(G) = 3$.

Proof: $\alpha(G)$ is clearly at most 3, since we can just take $S_1 = G$ and $S_2 = S_3 = \{1\}$, meaning $G$ realizes $\langle |G|, 1, 1 \rangle$.

To show our lower bound, suppose $G$ realizes $\langle n_1, n_2, n_3 \rangle$ (not all equal to 1) through the subsets $S_1, S_2, S_3$. Then, the triple product property tells us that the map $(s_1, s_2) \mapsto s_1^{-1} s_2$ is injective on $S_1 \times S_2$ and the image only intersects $Q(S_3)$ in the identity. So, $|G| \ge n_1 n_2$, with equality only occurring if $n_3 = 1$. The same argument shows this is true for $n_2 n_3$ and $n_1 n_3$, so at least one of the three inequalities is strict (otherwise all the $n_i$ would be 1). Therefore,
$$|G|^3 > (n_1 n_2)(n_2 n_3)(n_1 n_3) = (n_1 n_2 n_3)^2.$$
So, $\frac{3 \log |G|}{\log(n_1 n_2 n_3)} > 2$. Finally, if $G$ is abelian, then the product map from $S_1 \times S_2 \times S_3$ to $G$ must be injective (because the triple product property implies the kernel of the map is trivial), so $|G| \ge n_1 n_2 n_3$, meaning $\alpha(G) \ge 3$.
Next, we want to relate $\alpha(G)$ to $\omega$, the exponent of complexity for matrix multiplication, i.e. the smallest number such that we can perform $n \times n$ matrix multiplication in $O(n^{\omega + \epsilon})$ time for any $\epsilon > 0$. To do this, we first recall from 202B that
$$|G| = \sum_i d_i^2,$$
where the $d_i$’s are the dimensions of the irreducible representations of $G$. Then the most important theorem we will discuss is as follows:
Theorem 2: Suppose that $G$ is a group with character degrees $\{d_i\}$. Then,
$$|G|^{\omega/\alpha(G)} \le \sum_i d_i^{\omega}.$$
Intuitively, the idea of this theorem is that the problem of multiplying matrices of size $n_1 \times n_2$ and $n_2 \times n_3$, where $(n_1 n_2 n_3)^{1/3} = |G|^{1/\alpha(G)}$ for an optimal realization, reduces to multiplication of two elements in $\mathbb{C}[G]$ together, which itself (via the decomposition $\mathbb{C}[G] \cong \bigoplus_i M_{d_i}(\mathbb{C})$) reduces to a problem of multiplying a collection of matrices of size $d_i \times d_i$, each of which should take approximately $d_i^{\omega}$ operations. So, multiplying two matrices of size $(n_1 n_2 n_3)^{1/3}$ has complexity of order at most $\sum_i d_i^{\omega}$, which means that this sum is an upper bound for $(n_1 n_2 n_3)^{\omega/3} = |G|^{\omega/\alpha(G)}$. Actually proving the theorem is more difficult, so we won’t show it here, but this intuition is the most important part.
Importantly, notice that if $\alpha(G)$ were equal to 2, then this would imply that $\omega = 2$ (because for $\omega > 2$ we would have $\sum_i d_i^{\omega} < \left(\sum_i d_i^2\right)^{\omega/2} = |G|^{\omega/2}$, contradicting the theorem). However, $\alpha(G) > 2$ for every group, so we need to control the character degrees of $G$ to get non-trivial bounds on $\omega$. Define $\beta(G)$ to be the number such that $|G|^{\beta(G)}$ is the maximum character degree of $G$. An important corollary of Theorem 2 is:
Corollary 1: Let $G$ be a finite group. If $\frac{1}{\alpha(G)} > \beta(G)$, then
$$\omega \le \frac{\alpha(G)\,(1 - 2\beta(G))}{1 - \alpha(G)\beta(G)}.$$

Proof: Let $\{d_i\}$ be the character degrees of $G$. Then, by Theorem 2,
$$|G|^{\omega/\alpha(G)} \le \sum_i d_i^{\omega} \le d_{\max}^{\,\omega - 2} \sum_i d_i^2 = |G|^{\beta(G)(\omega - 2) + 1}.$$
This implies $\frac{\omega}{\alpha(G)} \le \beta(G)(\omega - 2) + 1$, i.e. $\omega\left(\frac{1}{\alpha(G)} - \beta(G)\right) \le 1 - 2\beta(G)$. Dividing by $\frac{1}{\alpha(G)} - \beta(G)$ (which is positive by assumption) gives us our result.
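To get a feel for the bound, the right-hand side of Corollary 1 is easy to tabulate; here is a trivial helper of our own:

```python
def omega_bound(alpha, beta):
    """Upper bound on omega from Corollary 1; only valid when 1/alpha > beta."""
    assert 1 / alpha > beta, "Corollary 1 requires 1/alpha - beta > 0"
    return alpha * (1 - 2 * beta) / (1 - alpha * beta)
```

Note that `omega_bound(2, beta)` returns exactly 2 for any `beta < 1/2`, while `omega_bound(3, 0)` gives only the trivial bound 3: a group helps precisely when $\alpha(G)$ is close to 2 and $\beta(G)$ is small, which is the shape of the next corollary.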
A special case of this corollary is given by

Corollary 2: If there exists a sequence of finite groups $G_k$ such that $\alpha(G_k) \to 2$ as $k \to \infty$ and $\beta(G_k) \to 0$, then the exponent of matrix multiplication is 2.
While these are weaker versions of our original theorem, they only require knowledge of $\alpha(G)$ and $\beta(G)$, rather than all of the character degrees. Finally, we end with an open problem in representation theory whose positive answer could lead to a solution to our original problem:

Open Problem: For integers $n_1, n_2, n_3$ not all 1, does there exist a finite group that realizes $\langle n_1, n_2, n_3 \rangle$ and has character degrees $\{d_i\}$ such that
$$\sum_i d_i^3 < n_1 n_2 n_3\,?$$

While this is not a necessary condition, a positive answer to this could lead to a proof that $\omega = 2$ when combined with our theorem. This is because one limitation of our theorem is that unless we can rule out the case that $\omega = 3$ satisfies the inequality in Theorem 2, that theorem is satisfied trivially and gives no new information. So, for us to be able to show that a group gives us a non-trivial improvement, we have to show that there exists a group $G$ realizing some $\langle n_1, n_2, n_3 \rangle$ such that $\sum_i d_i^3 < n_1 n_2 n_3$, which a positive answer to the open problem implies.
While it is an open problem whether these groups exist in general, there are a few concrete examples where such a group exists, such as the one that follows:

Let $H = \mathbb{Z}_n \times \mathbb{Z}_n \times \mathbb{Z}_n$ and let $G = (H \times H) \rtimes \mathbb{Z}_2$, where the group action of $\mathbb{Z}_2$ on $H \times H$ is given by switching the two factors. If we let $z$ be the generator of $\mathbb{Z}_2$, we can write the elements of $G$ as $(a, b)z^{\epsilon}$ with $(a, b) \in H \times H$ and $\epsilon \in \{0, 1\}$. Let $H_1, H_2, H_3$ be the three copies of $\mathbb{Z}_n$ in $H$, viewed as subgroups, and let $H_4 = H_1$ for notation. Then, we can define $S_1, S_2, S_3$ as
$$S_i = \{\, (a, b)z^{\epsilon} : a \in H_i \setminus \{1\},\ b \in H_{i+1},\ \epsilon \in \{0, 1\} \,\}.$$
One can show, although the proof is long, that these sets satisfy the triple product property, so $G$ realizes $\langle N, N, N \rangle$ with $N = |S_i| = 2n(n-1)$. Furthermore, the character degrees of $G$ are all at most 2, since $H \times H$ is an abelian subgroup of index 2 in $G$ and the maximum character degree of a group is at most the index of any abelian subgroup in the group. Since the sum of the squares of the character degrees is $|G| = 2n^6$, the sum of the cubes of the degrees is at most $2|G| = 4n^6$. However, $N^3 = 8n^3(n-1)^3$, meaning that for $n \ge 5$,
$$\sum_i d_i^3 \le 4n^6 < 8n^3(n-1)^3 = N^3.$$
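Even though the general proof is long, the triple product property for these $S_i$ can be verified by brute force for small $n$. Here is a sketch using our own encoding of $G$: an element is a triple $(u, v, \epsilon)$ with $u, v \in \mathbb{Z}_n^3$, and the $\mathbb{Z}_2$ part swaps the two $H$-components:

```python
from itertools import product

n = 2  # small enough to check exhaustively

def g_mul(x, y):
    """Multiplication in G = (H x H) semidirect Z_2 with H = (Z_n)^3:
    the Z_2 part of x swaps the two H-components of y before adding."""
    (u, v, e), (p, q, f) = x, y
    if e == 1:
        p, q = q, p
    return (tuple((i + j) % n for i, j in zip(u, p)),
            tuple((i + j) % n for i, j in zip(v, q)),
            (e + f) % 2)

def g_inv(x):
    u, v, e = x
    nu = tuple(-i % n for i in u)
    nv = tuple(-i % n for i in v)
    return (nu, nv, 0) if e == 0 else (nv, nu, 1)

identity = ((0, 0, 0), (0, 0, 0), 0)

def in_factor(i, value):
    """The element of H = (Z_n)^3 supported on coordinate i (0-indexed)."""
    w = [0, 0, 0]
    w[i] = value
    return tuple(w)

def S(i):
    """S_{i+1}: a in H_{i+1} minus the identity, b in H_{i+2} (mod 3)."""
    return [(in_factor(i, a), in_factor((i + 1) % 3, b), eps)
            for a in range(1, n) for b in range(n) for eps in (0, 1)]

def quotient(T):
    return {g_mul(s, g_inv(t)) for s in T for t in T}

def satisfies_tpp(S1, S2, S3):
    Q1, Q2, Q3 = quotient(S1), quotient(S2), quotient(S3)
    return all((q1, q2, q3) == (identity,) * 3
               for q1, q2, q3 in product(Q1, Q2, Q3)
               if g_mul(g_mul(q1, q2), q3) == identity)
```

Using three copies of the *same* set fails the property, so the check distinguishes good triples from bad ones; the counting claim $4n^6 < 8n^3(n-1)^3$ for $n \ge 5$ is likewise easy to confirm numerically.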