Math 202A: Lecture 12

In this lecture we begin our treatment of the Singular Value Decomposition, which is arguably the most applicable and applied result in linear algebra. Those of you familiar with this result likely think of it in the following form: for an arbitrary m \times n complex matrix A there exists a factorization of the form A=UBV^*, where U is an m \times m unitary matrix, B is an m \times n matrix whose diagonal elements are nonnegative numbers and whose off diagonal elements are zero, and V is an n \times n unitary matrix. Let’s check we got the dimensions right here: we write down the word (mm)(mn)(nn) and observe that matrix multiplication simplifies it to (mn)(nn) and then to mn, which is indeed the shape of A.

Of course, Math 202A is Linear Algebra Done Weird and our treatment of the SVD will look nothing like the above, though in the end we will reproduce this matrix factorization. The “Weird” in Linear Algebra Done Weird refers in part to the fact that quantum mechanics is often viewed as weird by physicists, but mathematically the transition from classical to quantum is no weirder than the transition from combinatorics to linear algebra. We will use this perspective to shape our understanding of the SVD. This development will be divided over three lectures, each of which will (hopefully) be quite easy to understand. I would like you to come away from this discussion with the sense that the Singular Value Decomposition is not just some arbitrary (albeit useful) matrix factorization which can be done, but rather a geometrically natural result which can yet again be understood as a major benefit of quantization.

In this lecture we remain in the classical computational category \mathbf{FSet}, whose objects are finite sets and whose morphisms are functions. Let X be a set of cardinality n, let Y be a set of cardinality m, and let f \in \mathrm{Hom}(X,Y) be an arbitrary function. The binary encoding of this function consists of the mn numbers

f_{yx} = [f(x)=y], \quad x \in X,\ y \in Y,

where

[P]=\begin{cases} 1, \text{ if }P\text{ true}\\ 0, \text{ if }P\text{ false} \end{cases}

is the Iverson bracket beloved by computer scientists. If we choose an ordering x_1,\dots,x_n of X and y_1,\dots,y_m of the domain X and codomain Y, we get a corresponding arrangement of these bits as an m \times n matrix with entries

[f]_{ij} = f_{y_ix_j}, \quad 1 \leq i \leq m,\ 1 \leq j \leq n.

If m=n the matrix [f] is square, if m>n it is tall and skinny (aka flaco), if m<n it is short and fat (aka gordo).

Now let us consider the question of how to choose orderings of X and Y so that the corresponding matrix [f] assumes a canonical form, a vaguely defined term (sorry Lani) which roughly means “reasonably simple.”

Consider first the case where m=n, so that possibly X and Y are in one-to-one correspondence.

Problem 12.1. Show that one can choose orderings of X and Y such that [f] is the n \times n identity matrix I_n if and only if f \in \mathrm{Hom}(X,Y) is a bijection. Now letting [f] be the matrix of f relative to an arbitrary pair of orderings, show that there exists n \times n permutation matrices P,Q such that I_n = P[f]Q if and only if f is a bijection.

Now consider the case where m>n, so that [f] is tall and skinny and possibly X embeds into Y.

Problem 12.2. Show that one can choose orderings of X and Y such that [f] is the m \times n matrix made by appending an (m-n) \times n matrix of zeros to the bottom of I_n if and only if f is injective. Now letting [f] be the matrix of f with respect to an arbitrary pair of orderings, show that there exists an m \times m permutation matrix P and an n \times n permutation matrix Q such that P[f]Q is in this canonical form if and only if f is injective.

Finally, consider the case where m<n, so that [f] is short and fat and possibly X covers Y.

Problem 12.3. Show that f is surjective if and only if there exist orderings of X and Y such that the m \times m principal submatrix of [f] is I_m. Now letting [f] be the matrix of f with respect to an arbitrary pair of orderings, show that there exists an m \times m permutation matrix P and an n \times n permutation matrix Q such that P[f]Q is in this canonical form if and only if f is surjective.

The matrix form in Problem 12.3 should perhaps not be called canonical because the “leftover” part of [f] outside its principal m \times m submatrix has no particular structure. Here is another option.

Visualize the surjective function f as a bipartite graph \Gamma(X,Y) with edges going between two disjoint sets of vertices X and Y, such that \{x,y\} is an edge if and only if f(x)=y. The connected components of \Gamma are star graphs \Gamma_y, y \in Y, in which the hub vertex y has degree equal to the cardinality |f^{-1}(y)| of the fiber of f over y. The surjectivity condition means that each of these star graphs has at least one edge or, equivalently, two vertices. The type of f is the vector \lambda(F) whose components are the positive numbers

|f^{-1}(y)|, \quad y \in Y,

listed in weakly decreasing order from left to right. Thus, \lambda(f)=(\lambda_1(f),\dots,\lambda_m(f)) is a partition of n with m parts, or equivalently a Young diagram with exactly n cells and m rows.

As an example of the above, take a set X of cardinality 5 and let x_1,x_2,x_3,x_4,x_5 be an ordering of its elements. Let Y be a set of cardinality 2 and let y_1,y_2 be an ordering of its elements. Let f \in \mathrm{Hom}(X,Y) be the surjective function

f^{-1}(y_1)=\{x_1,x_2,x_3\}, \quad f^{-1}(y_2) = \{x_4,x_5\}.

The type of this function is \lambda(f)=(3,2). The matrix of this function with respect to these ordering is

[f]=\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}.

As a second example, take a set X of cardinality 6 and let x_1,x_2,x_3,x_4,x_5,x_6 be an ordering of its elements. Let Y be a set of cardinality 2 and let y_1,y_2,y_3 be an ordering of its elements. Let f \in \mathrm{Hom}(X,Y) be the surjective function

f^{-1}(y_1)=\{x_1,x_2,x_3\}, \quad f^{-1}(y_2) = \{x_4,x_5\}, f^{-1}(y_3)=\{x_6\}.

The type of this function is \lambda(f)=(3,2,1). The matrix of this function with respect to these ordering is

[f]=\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1\end{bmatrix}.

Problem 12.4. Prove that f \in \mathrm{Hom}(X,Y) is surjective if and only if there exist orderings of X and Y such that the corresponding matrix [f] is in reduced row-echelon form and has no zero rows. Now letting [f] be the matrix of f with respect to arbitrary orderings of X and Y, show that f is surjective if and only if there is an m \times m permutation matrix P and an n \times n permutation matrix Q such that P[f]Q assumes the above canonical form.

Problem 12.5. State and prove a generalization of Problem 12.4 for arbitrary f \in \mathrm{Hom}(X,Y), dropping all assumptions on the relative sizes of X and Y and the injectivity/surjectivity of f.

Leave a Reply