# Math 202C: Lecture 18

Written by a countably infinite number of monkeys with an uncountably infinite number of typewriters, edited by Andrew Dennehy

In a bold twist on the standard formula, in this lecture post we will be discussing math. Specifically, we will be discussing the Cohn–Umans algorithm and how representation theory can provide us an algorithm to multiply two matrices together. We will then see how this perspective could potentially help us prove an important conjecture in complexity theory, namely that there exists an algorithm which computes the product of two $n\times n$ matrices in $O(n^{2+\epsilon})$ time for any $\epsilon>0$. Before we get to that, though, we will first cover an algorithm with many similarities to the one we will be discussing.

Consider the problem of multiplying two polynomials

$p(x)=p_0+p_1x+\cdots +p_{n-1}x^{n-1}$

$q(x)=q_0+q_1x+\cdots +q_{n-1}x^{n-1}$

The naïve approach would be the one that’s taught in high school, namely finding all the $p_iq_j$ terms and using them to calculate the coefficients of $p(x)q(x)$. However, we’re significantly cooler than high school students, so we’re going to use some significantly cooler math. First, we introduce the concept of the discrete Fourier transform. Let $\omega_n=e^{i \frac{2\pi}{n}}$ be a primitive $n$-th root of unity. The DFT of a vector $v$ is given by

$\hat{v} := \begin{bmatrix}1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_n & \omega_n^2 & \cdots & \omega_n^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_n^{n-1} & \omega_n^{2(n-1)} & \cdots & \omega_n^{(n-1)^2} \end{bmatrix} v$

Equivalently, if $v_j$ is the $j$-th component of $v$ (indexed from 0),

$\hat{v}_k := v_0+v_1 \omega_n^{k} + \cdots + v_{n-1}\omega_n^{(n-1)k}$

So, if we apply this transformation to the vector of coefficients of $p$, we get that

$\hat{p}_k= p_0+p_1 \omega_n^{k} + \cdots + p_{n-1}\omega_n^{(n-1)k} =p(\omega_n^k)$
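As a quick sanity check (a small sketch, not part of the lecture), one can verify this identity numerically: the naive $O(n^2)$ DFT of a coefficient vector agrees with evaluating the polynomial at the powers of $\omega_n$. The coefficient list below is arbitrary.

```python
import cmath

def dft(v):
    """Naive O(n^2) DFT: hat_v[k] = sum_j v[j] * omega_n^(j*k)."""
    n = len(v)
    w = cmath.exp(2j * cmath.pi / n)  # omega_n
    return [sum(v[j] * w**(j * k) for j in range(n)) for k in range(n)]

def poly_eval(coeffs, x):
    """Horner evaluation of p(x) = coeffs[0] + coeffs[1] x + ..."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

n = 8
p = [3, 1, 4, 1, 5, 9, 2, 6]          # arbitrary coefficients of p
w = cmath.exp(2j * cmath.pi / n)
hat_p = dft(p)
# hat_p[k] should equal p(omega_n^k) for every k
assert all(abs(hat_p[k] - poly_eval(p, w**k)) < 1e-9 for k in range(n))
```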

So, the problem of finding the Fourier coefficients of $p$ is equivalent to the problem of evaluating $p$ at $1,\omega_n,\omega_n^2,...,\omega_n^{n-1}$. From now on, assume $n$ is a power of 2, since if it isn’t we can just add a bunch of 0 coefficients until we have a power of 2. Define the following two polynomials:

$p_0(x)=p_0+p_2x + \cdots + p_{n-2} x^{n/2-1}$

$p_1(x)=p_1+p_3x + \cdots + p_{n-1} x^{n/2-1}$

This lets us decompose $p$ as

$p(x)=p_0(x^2)+xp_1(x^2)$

So, instead of evaluating $p$ at all those powers of $\omega_n$, we can instead evaluate $p_0$ and $p_1$ at $1,\omega_n^2, \ldots, \omega_n^{2(n-1)}$, which are just the $(n/2)$-th roots of unity, each appearing twice. So, we have reduced the problem of plugging $n$ numbers into a polynomial of degree $n-1$ to plugging $n/2$ numbers (specifically the $(n/2)$-th roots of unity) into two polynomials of degree $n/2-1$. Because we’ve assumed that $n$ is a power of 2, so is $n/2$, meaning we can repeat the same algorithm for $p_0$ and $p_1$. Applying this process recursively gives an algorithm with complexity $O(n \log n)$ for computing the DFT. Similarly, one can show that we can invert the DFT by applying the same process to the polynomial

$\hat p (x)=\hat{p}_0+\hat{p}_1 x + \cdots + \hat{p}_{n-1} x^{n-1}$

with $\omega_n$ replaced with $\omega_n^{-1}=\omega_n^{n-1}$ and then dividing the result by $n$. The upshot of all of this work is that

$\widehat{p q}_k = [pq](\omega_n^k) = p(\omega_n^k) q(\omega_n^k)=\hat p_k \hat q_k$

So, we can calculate the Fourier coefficients of $pq$ by calculating the coefficients of each polynomial independently and multiplying them together pointwise. (One subtlety: $pq$ has degree $2n-2$, so we first pad both coefficient vectors with zeros to length at least $2n-1$; otherwise the DFT computes the coefficients of $pq$ modulo $x^n-1$, a cyclic convolution.) Since pointwise multiplication has linear complexity and both the Fourier and inverse Fourier transforms have $O(n\log n)$ complexity, this method gives us an $O(n\log n)$ complexity algorithm to multiply two polynomials of degree less than $n$.
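The whole pipeline can be sketched in a few lines of Python (a sketch, not part of the lecture): the recursive even/odd split computes the DFT, the inverse uses $\omega_n^{-1}$ with a final division by $n$, and `poly_mul` pads, multiplies pointwise, and inverts.

```python
import cmath

def fft(v, invert=False):
    """Evaluate the polynomial with coefficient list v at all n-th roots of
    unity via the even/odd split p(x) = p0(x^2) + x*p1(x^2).
    Assumes len(v) is a power of 2. With invert=True it uses omega_n^{-1};
    the caller divides by n to complete the inverse DFT."""
    n = len(v)
    if n == 1:
        return list(v)
    even = fft(v[0::2], invert)   # p0 at the (n/2)-th roots of unity
    odd = fft(v[1::2], invert)    # p1 at the (n/2)-th roots of unity
    sign = -1 if invert else 1
    w = cmath.exp(sign * 2j * cmath.pi / n)
    out = [0] * n
    for k in range(n // 2):
        t = w**k * odd[k]
        out[k] = even[k] + t              # p(w^k)
        out[k + n // 2] = even[k] - t     # since w^(k + n/2) = -w^k
    return out

def poly_mul(p, q):
    """Multiply two polynomials by the DFT method, padding to a power of 2
    of length at least deg(pq) + 1 so no wrap-around terms appear."""
    n = 1
    while n < len(p) + len(q) - 1:
        n *= 2
    fp = fft(p + [0] * (n - len(p)))
    fq = fft(q + [0] * (n - len(q)))
    prod = [x * y for x, y in zip(fp, fq)]
    coeffs = [c / n for c in fft(prod, invert=True)]
    # For integer inputs the coefficients are integers up to float error.
    return [round(c.real) for c in coeffs[: len(p) + len(q) - 1]]
```

For example, `poly_mul([1, 1], [1, 1])` returns `[1, 2, 1]`, the coefficients of $(1+x)^2$.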

Notice that instead of using the roots of unity, we could have instead used a generator of $C_n$ and worked over $\mathbb C [C_n]$ with identical results. So, we can frame multiplication of polynomials (modulo $x^n-1$) as a special case of multiplying elements in $\mathbb C [C_n]$.
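Concretely (a small sketch, not from the lecture), multiplication in $\mathbb C[C_n]$ is multiplication of coefficient vectors with indices added modulo $n$, i.e. exactly polynomial multiplication modulo $x^n-1$:

```python
def group_algebra_mul(f, g):
    """Multiply two elements of C[C_n], stored as length-n coefficient lists
    indexed by powers of a fixed generator x, so f represents sum_i f[i] x^i."""
    n = len(f)
    h = [0] * n
    for i in range(n):
        for j in range(n):
            h[(i + j) % n] += f[i] * g[j]   # x^i * x^j = x^((i+j) mod n)
    return h
```

In $\mathbb C[C_4]$, for instance, $x^3 \cdot x^3 = x^2$: the product wraps around, which is precisely reduction modulo $x^4-1$.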

Next, we take a slight detour to discuss some group theoretic concepts. For any subset $S$ of a group $G$ (not required to be a subgroup), let the right quotient set $Q(S)$ of $S$ be defined to be

$Q(S)=\left\{s_1s_2^{-1}:s_1,s_2\in S\right\}$

Exercise 1: Assume that $S\neq \emptyset$ and $Q(S)=S$. What does this imply about $S$? Show that the condition that this implies is actually equivalent to $Q(S)=S$, so the implication goes both ways.

Definition 1: A group $G$ realizes $\langle n_1,n_2,n_3 \rangle$ if there are subsets $S_1, S_2, S_3\subseteq G$ such that $|S_i|=n_i$ and for $q_i\in Q(S_i)$, if

$q_1q_2q_3=1$

then $q_1=q_2=q_3=1$. This condition is called the triple product property. Furthermore, since each $Q(S_i)$ is closed under inverses, this is equivalent to the condition that if $q_1q_2=q_3$, then $q_1=q_2=q_3=1$.

Although it isn’t entirely trivial, one can show that if $G$ realizes $\langle n_1,n_2,n_3 \rangle$, then it realizes any permutation of those numbers.

Exercise 2: Prove that $C_n\times C_m\times C_p$ realizes $\langle n,m,p \rangle$.
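This is not a proof, but one can at least sanity-check Exercise 2 by brute force for one choice of $(n,m,p)$. The sketch below (representing elements of $C_n\times C_m\times C_p$ as integer tuples) computes each right quotient set $Q(S_i)$ and checks the triple product property directly:

```python
# Elements of C_n x C_m x C_p as tuples; the three factors give S_1, S_2, S_3.
n, m, p = 3, 4, 5
e = (0, 0, 0)
mul = lambda x, y: ((x[0] + y[0]) % n, (x[1] + y[1]) % m, (x[2] + y[2]) % p)
inv = lambda x: ((-x[0]) % n, (-x[1]) % m, (-x[2]) % p)
S1 = [(a, 0, 0) for a in range(n)]
S2 = [(0, b, 0) for b in range(m)]
S3 = [(0, 0, c) for c in range(p)]

def Q(S):
    """The right quotient set Q(S) = {s1 * s2^{-1} : s1, s2 in S}."""
    return {mul(s1, inv(s2)) for s1 in S for s2 in S}

Q1, Q2, Q3 = Q(S1), Q(S2), Q(S3)
# Triple product property: q1 q2 q3 = 1 should force q1 = q2 = q3 = 1.
tpp = all((q1, q2, q3) == (e, e, e)
          for q1 in Q1 for q2 in Q2 for q3 in Q3
          if mul(mul(q1, q2), q3) == e)
assert tpp
```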

This is the “trivial realization” of $\langle n,m,p \rangle$. We now get our first taste of why these ideas could be useful for the problem of matrix multiplication:

Theorem 1: Let $F$ be any field. If $G$ realizes $\langle n,m,p \rangle$, then the number of field operations required to multiply an $n\times m$ matrix by an $m\times p$ matrix over $F$ is at most the number of operations required to multiply two elements of $F[G]$.

Proof: Let $G$ realize $\langle n,m,p \rangle$ through the subsets $S,T,U$. Then, for matrices $A$ and $B$ of sizes $n\times m$ and $m\times p$ respectively, we can index the rows of $A$ by $S$, the columns of $A$ and the rows of $B$ by $T$, and the columns of $B$ by $U$. Then, consider the product

$\left(\sum\limits_{s\in S,t\in T} A_{st}s ^{-1} t\right) \left(\sum \limits _{t'\in T, u\in U} B_{t'u} t'^{-1}u\right)$

By the triple product property, we have that

$(s^{-1} t)(t'^{-1} u)=s'^{-1} u'\iff (s' s^{-1}) (tt'^{-1} )(uu'^{-1})=1$

which is true iff $s=s'$, $t=t'$, and $u=u'$. So, the coefficient of $s^{-1} u$ in the product is

$\sum\limits_{t\in T}A_{st}B_{tu}=(AB)_{su}$

So, you can just read off the elements of $AB$ from these coefficients. $\square$

The process we just described, of converting matrix multiplication to multiplication in $F[G]$, is called the Cohn–Umans algorithm. Also, we can now see the connection to the polynomial algorithm we described earlier; we’ve converted a problem of multiplying matrices to a problem of multiplying elements of $\mathbb C[G]$, which is what we did with polynomials before using $\mathbb C[C_n]$.
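To make the proof of Theorem 1 concrete, here is a small sketch (the sizes and matrix entries are made up for illustration) that carries out the reduction for the trivial realization of $\langle n,m,p\rangle$ in $C_n\times C_m\times C_p$: embed $A$ and $B$ as the two sums from the proof, multiply naively in the group algebra, and read the product off the coefficients of $s^{-1}u$.

```python
# Trivial realization of <n, m, p> inside C_n x C_m x C_p (elements as tuples).
n, m, p = 2, 3, 2
mul = lambda x, y: ((x[0] + y[0]) % n, (x[1] + y[1]) % m, (x[2] + y[2]) % p)
inv = lambda x: ((-x[0]) % n, (-x[1]) % m, (-x[2]) % p)
S = [(i, 0, 0) for i in range(n)]   # indexes the rows of A
T = [(0, j, 0) for j in range(m)]   # indexes the columns of A / rows of B
U = [(0, 0, k) for k in range(p)]   # indexes the columns of B

A = [[1, 2, 3], [4, 5, 6]]          # n x m
B = [[7, 8], [9, 10], [11, 12]]     # m x p

# Embed A and B into F[G] as in the proof of Theorem 1.
a, b = {}, {}
for i, s in enumerate(S):
    for j, t in enumerate(T):
        g = mul(inv(s), t)           # the group element s^{-1} t
        a[g] = a.get(g, 0) + A[i][j]
for j, t in enumerate(T):
    for k, u in enumerate(U):
        g = mul(inv(t), u)           # the group element t^{-1} u
        b[g] = b.get(g, 0) + B[j][k]

# Multiply the two elements of F[G] (naively here; the point is that any
# fast algorithm for multiplication in F[G] would do).
c = {}
for g1, x in a.items():
    for g2, y in b.items():
        g = mul(g1, g2)
        c[g] = c.get(g, 0) + x * y

# Read off (AB)[i][k] as the coefficient of s^{-1} u.
AB = [[c.get(mul(inv(s), u), 0) for u in U] for s in S]
```

Here `AB` equals the usual matrix product `[[58, 64], [139, 154]]`.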

We can now define a measurement of the “quality” of embedding that a given group $G$ can provide.

Definition 2: The pseudo-exponent $\alpha(G)$ of a non-trivial finite group $G$ is the minimum of

$\frac{3\log |G|}{\log nmp}$

over all $n,m,p$ (not all 1) such that $G$ realizes $\langle n,m,p \rangle$.

Lemma 1: For any non-trivial finite group $G$, $2< \alpha(G)\leq 3$. Further, if $G$ is abelian, then $\alpha(G)=3$.

Proof: $\alpha(G)$ is clearly at most 3, since we can just take $S_1=S_2=\{1\}$ and $S_3=G$, meaning $G$ realizes $\langle 1,1,|G|\rangle$.

To show our lower bound, suppose $G$ realizes $\langle n_1,n_2,n_3\rangle$ (not all equal to 1) through the subsets $S_1,S_2,S_3$. Then, the triple product property tells us that the map $(x,y)\mapsto x^{-1}y$ is injective on $S_1\times S_2$ and that its image intersects $Q(S_3)$ only in the identity. So, $|G|\geq n_1n_2$, with equality only possible if $n_3=1$. The same argument works for $n_2n_3$ and $n_3n_1$; multiplying the three inequalities gives $|G|^3\geq(n_1n_2n_3)^2$, and the inequality is strict because the $n_i$ are not all 1. Therefore,

$\frac{3 \log |G|}{\log (n_1n_2n_3)}> \frac{3 \log (n_1n_2n_3)^{2/3}}{\log (n_1n_2n_3)}=\frac{3 \cdot 2}{3}=2$

So, $\alpha(G)>2$. Finally, if $G$ is abelian, then the product map from $S_1\times S_2 \times S_3$ to $G$ must be injective: if $s_1s_2s_3=s_1's_2's_3'$, then commutativity lets us rearrange this as $(s_1s_1'^{-1})(s_2s_2'^{-1})(s_3s_3'^{-1})=1$, and each factor lies in the corresponding $Q(S_i)$, so the triple product property forces all three factors to be 1. Hence $|G|\geq n_1n_2n_3$, meaning $\alpha(G)\geq 3$. $\square$

Next, we want to relate $\alpha(G)$ to $\omega$, the exponent of complexity for matrix multiplication, i.e. the smallest number such that we can perform matrix multiplication in $O(n^{\omega +\epsilon})$ time for any $\epsilon>0$. To do this, we first recall from 202B that

$\mathbb{C}[G]\cong \mathrm{End}(V_1)\oplus \cdots \oplus \mathrm{End}(V_k)\cong \mathcal{M}_{d_1\times d_1} \oplus \cdots \oplus \mathcal{M}_{d_k\times d_k}$

where $V_1,\ldots,V_k$ are the irreducible representations of $G$ and the $d_i$'s are their dimensions. Then the most important theorem we will discuss is as follows:

Theorem 2: Suppose that $G$ is a group with character degrees $d_1,...,d_k$. Then,

$|G|^{\omega/\alpha(G)}\leq \sum\limits_{i=1}^k d_i^{\omega}$

Intuitively, the idea of this theorem is that the problem of multiplying matrices of size $|G|^{1/\alpha(G)}$ reduces to multiplying two elements of $\mathbb{C}[G]$, which itself reduces to multiplying a collection of matrices of size $d_i\times d_i$, each of which should take approximately $d_i^\omega$ operations. So, multiplying two matrices of size $|G|^{1/\alpha(G)}$ has complexity of order at most $\sum_{i=1}^k d_i^{\omega}$, which means that this sum is an upper bound for $|G|^{\omega/\alpha(G)}$. Actually proving the theorem is more difficult, so we won’t show it here, but this intuition is the most important part.

Importantly, notice that if $\alpha(G)$ were equal to 2, then this would imply that $\omega =2$ (because $\sum_i d_i^2 =|G|$). However, $\alpha(G)>2$, so we need to control the character degrees of $G$ to get non-trivial bounds on $\omega$. Define $\gamma(G)$ to be the number such that $|G|^{1/\gamma(G)}$ is the maximum character degree of $G$. An important corollary of Theorem 2 is:

Corollary 1: Let $G$ be a finite group. If $\alpha(G)<\gamma(G)$, then

$\omega\leq \alpha(G)\left(\frac{\gamma(G)-2}{\gamma(G)-\alpha(G)} \right)$

Proof: Let $\{d_i\}$ be the character degrees of $G$. Then, by Theorem 2,

$|G|^{\omega/\alpha(G)}\leq \sum_i d_i^{\omega-2}d_i^2\leq |G|^{(\omega-2)/\gamma(G)}\sum_i d_i^2\leq |G|^{1+(\omega-2)/\gamma(G)}$

This implies $\omega\left(\frac{1}{\alpha(G)}-\frac{1}{\gamma(G)}\right)\leq 1-\frac{2}{\gamma(G)}$. Dividing by $\frac{1}{\alpha(G)}-\frac{1}{\gamma(G)}$ (which is positive by assumption) gives us our result. $\square$

A special case of this corollary is given by

Corollary 2: If there exists a sequence $G_1,G_2,...$ of finite groups such that $\alpha(G_i)=2+o(1)$ as $i\rightarrow \infty$ and $\alpha(G_i)-2=o(\gamma(G_i)-2)$, then the exponent of matrix multiplication is 2.

While these are weaker versions of our original theorem, they only require knowledge of $\gamma(G)$, rather than all of the character degrees. Finally, we end with an open problem in representation theory whose positive answer could lead to a solution to our original problem:

Open Problem: For integers $n,m,p$ not all 1, does there exist a finite group that realizes $\langle n,m,p\rangle$ and has character degrees $d_1,...,d_k$ such that

$nmp>\sum_{i}d_i^3$?

While this is not a necessary condition, a positive answer to this could lead to a proof that $\omega=2$ when combined with our theorem. This is because one limitation of our theorem is that unless we can rule out the case that $|G|^{3/\alpha(G)}\leq \sum_{i}d_i^3$, Theorem 2 is satisfied trivially (since $\omega\leq 3$). So, for us to be able to show that a group gives us a non-trivial improvement, we have to show that for any $n,m,p\in \mathbb N$, there exists a group $G$ realizing $\langle n,m,p \rangle$ such that $|G|^{3/\alpha(G)}> \sum\limits_{i}d_i^3$, which the open problem implies.

While it is an open problem whether these groups exist in general, there are a few concrete examples where such a group exists, such as the one that follows:

Let $H=C_n\times C_n \times C_n$ and let $G=H^2 \rtimes C_2$, where the group action of $C_2$ on $H^2$ is given by switching the two factors. If we let $z$ be the generator of $C_2$, we can write the elements of $G$ as $(a,b)z^i$ with $a,b\in H$ and $i\in \{0,1\}$. Let $H_1,H_2,H_3$ be the three copies of $C_n$ in $H$, viewed as subgroups, and let $H_4:=H_1$ for notational convenience. Then, we can define $S_1,S_2,S_3$ as

$S_i=\{(a,b)z^j:a\in H_i\setminus \{1\},b\in H_{i+1},j\in \{0,1\}\}$

One can show, although the proof is long, that these sets satisfy the triple product property. Furthermore, the character degrees of $G$ are all at most 2, since $H^2$ is an abelian subgroup of index 2 in $G$ and the maximum character degree of a group is at most the index of any abelian subgroup in the group. Since the sum of the squares of the character degrees is $|G|=2n^6$, the sum of the cubes of the degrees is at most $2|G|=4n^6$. However, $|S_i|=2n(n-1)$, meaning that for $n\geq 5$,

$|S_1| |S_2| |S_3| =8n^3(n-1)^3\geq 4n^6= 2|G| \geq \sum\limits_{i}d_i^3$
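To make this example concrete, here is a brute-force sanity check of the triple product property for a small $n$ (not a substitute for the long proof), together with the $n=5$ counting. The representation of elements of $G$ as triples $(a,b,i)$ with $a,b\in\mathbb{Z}_n^3$ is an implementation choice; since each $Q(S_i)$ is closed under inverses, it suffices to check whether $(q_1q_2)^{-1}$ lands in $Q(S_3)$.

```python
# Elements of G = H^2 ⋊ C_2 with H = C_n^3, written as (a, b, i) for
# the element (a, b) z^i; z acts by swapping the two copies of H.
n = 3  # small case for the brute-force check; the inequality needs n >= 5

def mul(g1, g2):
    (a1, b1, i1), (a2, b2, i2) = g1, g2
    if i1 == 1:                      # z (a, b) z^{-1} = (b, a)
        a2, b2 = b2, a2
    add = lambda u, v: tuple((x + y) % n for x, y in zip(u, v))
    return (add(a1, a2), add(b1, b2), (i1 + i2) % 2)

def inv(g):
    a, b, i = g
    if i == 1:
        a, b = b, a
    neg = lambda u: tuple((-x) % n for x in u)
    return (neg(a), neg(b), i)

def basis(i, x):
    """The element x of H_i (index taken mod 3), as a vector in Z_n^3."""
    v = [0, 0, 0]
    v[i % 3] = x
    return tuple(v)

# S_i = {(a, b) z^j : a in H_i \ {1}, b in H_{i+1}, j in {0, 1}}
S = [[(basis(i, a), basis(i + 1, b), j)
      for a in range(1, n) for b in range(n) for j in range(2)]
     for i in range(3)]
Q = [{mul(s1, inv(s2)) for s1 in Si for s2 in Si} for Si in S]

# Triple product property: q1 q2 q3 = 1 should force q1 = q2 = q3 = 1.
e = ((0, 0, 0), (0, 0, 0), 0)
ok = True
for q1 in Q[0]:
    for q2 in Q[1]:
        q3 = inv(mul(q1, q2))
        if q3 in Q[2] and (q1, q2, q3) != (e, e, e):
            ok = False
assert ok
assert len(S[0]) == 2 * n * (n - 1)

# The counting at n = 5: |S_1||S_2||S_3| = 8 n^3 (n-1)^3 vs 2|G| = 4 n^6
assert (2 * 5 * (5 - 1))**3 > 4 * 5**6
```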