In this lecture we begin our treatment of the Singular Value Decomposition, which is arguably the most applicable and applied result in linear algebra. Those of you familiar with this result likely think of it in the following form: for an arbitrary complex matrix
there exists a factorization of the form
where
is an
unitary matrix,
is an
matrix whose diagonal elements are nonnegative numbers and whose off diagonal elements are zero, and
is an
unitary matrix. Let’s check we got the dimensions right here: we write down the word
and observe that matrix multiplication simplifies it to
and then to
which is indeed the shape of
Of course, Math 202A is Linear Algebra Done Weird and our treatment of the SVD will look nothing like the above, though in the end we will reproduce this matrix factorization. The “Weird” in Linear Algebra Done Weird refers in part to the fact that quantum mechanics is often viewed as weird by physicists, but mathematically the transition from classical to quantum is no weirder than the transition from combinatorics to linear algebra. We will use this perspective to shape our understanding of the SVD. This development will be divided over three lectures, each of which will (hopefully) be quite easy to understand. I would like you to come away from this discussion with the sense that the Singular Value Decomposition is not just some arbitrary (albeit useful) matrix factorization which can be done, but rather a geometrically natural result which can yet again be understood as a major benefit of quantization.
In this lecture we remain in the classical computational category whose objects are finite sets and whose morphisms are functions. Let
be a set of cardinality
, let
be a set of cardinality
and let
be an arbitrary function. The binary encoding of this function consists of the
numbers
where
is the Iverson bracket beloved by computer scientists. If we choose an ordering of
and
of the domain
and codomain
, we get a corresponding arrangement of these bits as an
matrix with entries
If the matrix
is square, if
it is tall and skinny (aka flaco), if
it is short and fat (aka gordo).
Now let us consider the question of how to choose orderings of and
so that the corresponding matrix
assumes a canonical form, a vaguely defined term (sorry Lani) which roughly means “reasonably simple.”
Consider first the case where so that possibly
and
are in one-to-one correspondence.
Problem 12.1. Show that one can choose orderings of and
such that
is the
identity matrix
if and only if
is a bijection. Now letting
be the matrix of
relative to an arbitrary pair of orderings, show that there exists
permutation matrices
such that
if and only if
is a bijection.
Now consider the case where , so that
is tall and skinny and possibly
embeds into
Problem 12.2. Show that one can choose orderings of and
such that
is the
matrix made by appending an
matrix of zeros to the bottom of
if and only if
is injective. Now letting
be the matrix of
with respect to an arbitrary pair of orderings, show that there exists an
permutation matrix
and an
permutation matrix
such that
is in this canonical form if and only if
is injective.
Finally, consider the case where , so that
is short and fat and possibly
covers
Problem 12.3. Show that is surjective if and only if there exist orderings of
and
such that the
principal submatrix of
is
Now letting
be the matrix of
with respect to an arbitrary pair of orderings, show that there exists an
permutation matrix
and an
permutation matrix
such that
is in this canonical form if and only if
is surjective.
The matrix form in Problem 12.3 should perhaps not be called canonical because the “leftover” part of outside its principal
submatrix has no particular structure. Here is another option.
Visualize the surjective function as a bipartite graph
with edges going between two disjoint sets of vertices
and
, such that
is an edge if and only if
The connected components of
are star graphs
in which the hub vertex
has degree equal to the cardinality
of the fiber of
over
The surjectivity condition means that each of these star graphs has at least one edge or, equivalently, two vertices. The type of
is the vector
whose components are the positive numbers
listed in weakly decreasing order from left to right. Thus, is a partition of
with
parts, or equivalently a Young diagram with exactly
cells and
rows.
As an example of the above, take a set of cardinality
and let
be an ordering of its elements. Let
be a set of cardinality
and let
be an ordering of its elements. Let
be the surjective function
The type of this function is The matrix of this function with respect to these ordering is
As a second example, take a set of cardinality
and let
be an ordering of its elements. Let
be a set of cardinality
and let
be an ordering of its elements. Let
be the surjective function
The type of this function is The matrix of this function with respect to these ordering is
Problem 12.4. Prove that is surjective if and only if there exist orderings of
and
such that the corresponding matrix
is in reduced row-echelon form and has no zero rows. Now letting
be the matrix of
with respect to arbitrary orderings of
and
show that
is surjective if and only if there is an
permutation matrix
and an
permutation matrix
such that
assumes the above canonical form.
Problem 12.5. State and prove a generalization of Problem 12.4 for arbitrary dropping all assumptions on the relative sizes of
and
and the injectivity/surjectivity of