Math 202A: Lectures 10 and 11

In this pair of lectures we consider the familiar topic of subspaces and projections, continuing our program of studying a subject you already know from a (possibly) new point of view. Probably you have heard of the textbook Linear Algebra Done Right, which prompted a reactionary response in the form an antithetical textbook called Linear Algebra Done Wrong. As to which is better, the debate rages on. Math 202A is Linear Algebra Done Weird. I believe that title is unclaimed; let me know if you have a better one to suggest.

As in previous lectures, we will use category-theoretic thinking to guide us towards analogies between the classical computational category \mathbf{FSet} of finite sets and the quantum computational category \mathbf{FHil} of finite-dimensional Hilbert spaces.

In this pair of lectures, we look at subobjects in these categories and compare and contrast the classical (set-theoretic) and quantum (linear-algebraic) aspects of subobjects. In both cases (and indeed much more generally), we can describe subobjects algebraically as equivalence classes of idempotent endomorphisms. In the classical computational category \mathbf{FSet}, there is no canonical way to choose a representative of the equivalence class of idempotents which correspond to a given subobject. However, in the quantum computational category \mathbf{FHil} the geometry of Hilbert space singles out a natural representative in every equivalence class of idempotents characterized by an orthogonality feature.

We start on the classical side, \mathbf{FSet}, where subobjects are just subsets. We have a functor \mathcal{P} \colon \mathbf{FSet} \rightarrow \mathbf{FLat} from finite sets to finite lattices which sends X to its power set,

\mathcal{P}(X) = \{Y \subseteq X\}.

The partial order in \mathcal{P}(X) is containment, meaning that for Y_1,Y_2 \in \mathcal{P}(X) we define

Y_1 \leq Y_2 \iff Y_1 \subseteq Y_2.

What makes \mathcal{P}(X) a lattice (as opposed to just a poset) is the fact that we have also the meet and join operations,

Y_1 \vee Y_2 = Y_1 \cup Y_2 \quad\text{ and }\quad Y_1 \wedge Y_2 = Y_1 \cap Y_2,

which give the smallest set containing both Y_1,Y_2 and the largest set contained in both Y_1,Y_2, respectively.

The subset lattice \mathcal{P}(X) has a couple of additional properties. First, it is graded, with rank function given by subset cardinality. This means that \mathcal{P}(X) decomposes into a disjoint union of antichains

\mathcal{P}_0(X),\mathcal{P}_1(X),\dots,\mathcal{P}_n(X)

in which n=|X| and

\mathcal{P}_k(X) = \{Y \subseteq X \colon |Y|=k\}

has cardinality

|\mathcal{P}_k(X)|= {n \choose k}.

Second, the subset lattice \mathcal{P}(X) has a natural anti-isomorphism (i.e. order-reversing bijection) given by

Y \mapsto Y^c = X \backslash Y.

Taking complements maps \mathcal{P}_k(X) bijectively onto \mathcal{P}_{n-k}(X), which is why

{n \choose k} = {n \choose n-k}.

Furthermore, for any Y \in \mathcal{P}(X) we have

|X|=|Y|+|Y^c|.

Our definition of the power set functor \mathcal{P} is not complete until we say where it sends a given set function f \in \mathrm{Hom}(X_1,X_2). The corresponding \mathcal{P}(f) \colon \mathcal{P}(X_1) \to \mathcal{P}(X_2) is defined by

[\mathcal{P}(f)](Y) = f(Y) = \{f(x) \colon x \in y\}.

As an exercise, you should check that \mathcal{P}(f) really is a morphism in \mathbf{FLat}.

Now let us consider how we can detect subobjects of X algebraically, using functions on X.

Definition 10.1. A function f \in \mathrm{End}(X) is said to be a retraction onto Y \in \mathcal{P}(X) if f(x) \in Y for all x \in X and f(y)=y for all y \in Y.

Notice that Definition 10.1 only applies to nonempty subsets of X. Another remark is that retractions are usually considered as morphisms in the category \mathbf{Top} whose objects are topological spaces and whose morphisms are continuous functions. We can include \mathbf{Set} in \mathbf{Top} by giving every set the discrete topology.

Now let us discuss a question raised by Lani Southern: what does it mean to say that there is no “canonical” retraction of X onto Y. The precise meaning of this is that there is no way to assign to each pair (X,Y) consisting of a finite set X and a subset Y \subseteq X a specific retraction f_{XY} of X onto Y which is equivariant with respect to all automorphisms of X that preserve Y. In more detail, let \mathrm{Aut}(X,Y) denote the subgroup of \mathrm{Aut}(X) consisting of all automorphisms (permutations) \sigma of X satisfying \sigma(Y)=Y. Call a rule (X,Y) \mapsto f_{XY} which assigns to set/subset pairs a corresponding retraction canonical if

\sigma \circ f_{XY} = f_{XY} \circ \sigma \quad\text{for all } \sigma \in \mathrm{Aut}(X,Y).

Proposition 10.2. If |Y|\geq 2 and X \backslash Y is nonempty, no canonical retraction exists.

Proof: Suppose otherwise, and let f_{XY} be a canonical retraction of X onto Y. Pick a point x \in X \backslash Y. Then, for every \sigma \in \mathrm{Aut}(X,Y) we have

f_{XY}(x)=f_{XY}(\sigma x)= \sigma f_{XY}(x).

In particular, this holds for every permutation \sigma which has our chosen point x as a fixed point and permutes the points of Y arbitrarily, so the above forces f_{XY}(x) \in Y to be fixed by every permutation of Y, which in turn forces |Y|=1, a contradiction. \square

In order to get past the above, we would need to break the symmetry of X. For example, this could mean promoting the structureless finite set X to a metric space (X,\mathrm{d}). Then, for any nonempty Y \in \mathcal{P}(X), we can define a retraction of X onto Y by declaring f_{XY}(x) to be the point of Y closest to x. However, even here we need to impose a rule for breaking ties (two distinct minimizers). Once we do this, the retraction f_{XY} is canonical in the sense that it is equivariant with respect to all isometries of (X,\mathrm{d}) which preserve tie breakers.

Definition 10.2. A function f \in \mathrm{End}(X) is called an idempotent if f \circ f = f.

Idempotents allow us to discuss retractions without specifying what they retract onto.

Problem 10.1. Prove that an idempotent f \in \mathrm{End}(X) is a retraction onto its image \mathrm{Im}(f)=\{f(x) \colon x \in X\}. Conversely, show that if f \in \mathrm{End}(X) is a retraction of X onto a subset Y \in \mathcal{P}(X), then f^2=f.

Let \mathrm{Ide}(X) denote the set of idempotent functions in \mathrm{End}(X). Define an equivalence relation on \mathrm{Ide}(X) by

f \sim g \iff \mathrm{Im}(f) = \mathrm{Im}(g).

Then, associated to each Y \in \mathcal{P}(X) is the equivalence class of idempotent functions with image Y,

[f]_Y = \{f \in \mathrm{End}(X) \colon \mathrm{Im}(f)=Y,\ f^2=f\}.

Problem 10.2. Prove that Y \mapsto [f]_Y is a bijection \mathcal{P}(X)\backslash\{\emptyset\} \to \mathrm{Ide}(X)/\sim.

There is another way to encode subsets if we do not insists on self-maps.

Defintion 10.3. The indicator function of Y \in \mathcal{P}(X) is the function \mathbf{1}_Y \in \mathrm{Hom}(X,\{0,1\}) defined by

\mathbf{1}_Y(x) = \begin{cases} 1, \text{ if } x \in Y \\ 0, \text{ if } x \not\in Y \end{cases}.

The function

\mathcal{P}(X) \longrightarrow \mathrm{Hom}(X,\{0,1\})

which sends each subset Y to its indicator function \mathbf{1}_Y is a bijection (binary encoding of subsets).

Now let us transport the above to the quantum computational category by means of our favorite functor

\mathcal{F} \colon \mathbf{FSet} \longrightarrow \mathbf{FHil}.

Recall that the quantization functor \mathcal{F} sends a finite set X to the free Hilbert space \mathcal{F}(X), inside of which X lives on as an orthonormal basis. Conversely, if V is an arbitrary Hilbert space then we can choose an orthonormal basis of V and identity it with \mathcal{F}(X).

On the Hilbert space side, the subobject functor

\mathcal{L} \colon \mathbf{FHil} \longrightarrow \mathbf{Lat}

sends a Hilbert space V to

\mathcal{L}(V) =\{W \subseteq V \colon W \text{ a subspace}\},

with partial order again given by set containment. Thus, \mathcal{L}(V) only includes subsets of V which are themselves objects of \mathbf{FHil}. The meet and join operations in \mathcal{L}(V) are

W_1 \wedge W_2 = W_1 \cap W_2 \quad\text{and}\quad W_1 \vee W_2=\mathrm{Span}(W_1 \cup W_2).

Moreover, if A \in \mathrm{Hom}(V_1,V_2) is a linear transformation the corresponding order homomorphism \mathcal{L}(A) \in \mathrm{Hom}(\mathcal{L}(V_1),\mathcal{L}(V_2)) is defined by

[\mathcal{L}(A)](W) = A(W) = \{Aw \colon w \in W\} = \mathrm{Im}(A).

The subspace lattice \mathcal{L}(V) of the Hilbert space V is graded by dimension: it decomposes into a disjoint union of \dim V antichains

\mathcal{L}(V)_k = \{W \leq V \colon \dim W=k\}, \quad 0 \leq k \leq n,

where n=\dim V. Although each \mathcal{L}(V)_k is uncountably infinite set, it has much more structure than just being a very big set. In fact, these sets are important enough that they have a special name.

Definition 10.4. The set \mathcal{L}(V)_k is called the Grassmannian of k-planes in V.

Note to self: if we’re going to be using ostentatious names, perhaps \mathcal{P}_k(X) should be called the Pascalian of k-sets in X.

Now let V = \mathcal{F}(X) be the free Hilbert space over the finite set X. Then, every subset X \in \mathcal{P}(X) quantizes to \mathcal{F}(X) \in \mathcal{L}(V), giving 2^{|X|} special subspaces of V called coordinate subspaces relative to X. The coordinate subspaces of V = \mathcal{F}(X) form an induced sublattice of \mathcal{L}(V) isomorphic to \mathcal{P}(X). Complementary subsets of X quantize to complementary coordinate subspaces of V.

Problem 10.3. Prove that for any Y \in \mathcal{P}(X) we have \mathcal{F}(Y^c)=\mathcal{F}(Y)^\perp. Moreover, prove that we have the internal direct sum decomposition \mathcal{F}(X) = \mathcal{F}(Y) \oplus \mathcal{F}(Y^c).

Of course, not every subspace W \in \mathcal{L}(V) is a coordinate subspace of V=\mathcal{F}(X). For example, for two distinct points x_1,x_2 \in X the line in V spanned by the vector x_1+x_2 does not contain any point of X.

Problem 10.4. Prove that for any W \in \mathcal{L}(V) we have the internal direct sum decomposition V = W \oplus W^\perp. (Hint: you can deduce this directly from Problem 10.3 by constructing an appropriate unitary operator on V).

Now let us again consider the question of encoding subobjects of V algebraically, this time as linear operators on V rather than arbitrary functions.

Definition 10.5. Given W \in \mathcal{L}(V), a linear retraction of V onto W is an operator P_W \in \mathrm{End}(V) such that \mathrm{Im}(P_W)=W and the restriction of P_W to W is the identity operator I_W.

Problem 10.5. Prove that linear operator P \in \mathrm{End}(V) is a retraction onto its image if and only if it is idempotent, P^2=P.

As in the set-theoretic setting, let \mathrm{Ide}(V) denote the set of idempotent operator on V, and define an equivalence relation on \mathrm{Ide}(V) by declaring two idempotents equivalent if and only if they have the same image.

Problem 10.6. The map W \mapsto [P]_W which sends a subspace W \leq V to the equivalence class [P]_W of idempotent operators with image W is a bijection \mathcal{L}(V) \to \mathrm{Ide}/\sim.

Now we come to a key difference between the classical computational category \mathbf{FSet} and the quantum computational category \mathbf{FHil}. In the classical setting, given a point x \in X and a subset Y \in \mathcal{P}(X), either x \in Y or x \not\in Y. In the quantum setting, given a vector v \in V and a subspace W \in \mathcal{L}(V), we can quantify how much v is or is not in W.

Definition 10.6. For v \in V and W \in \mathcal{L}(V), define

\mathrm{d}(v,W) = \inf\{\|v-w\| \colon w \in W\},

and call this quantity the distance from the vector v to the subspace W.

Note that the distance from v \in V to W \in \mathcal{L}(V) is indeed well-defined, since the set of distances \|v-w\| with w \in W is bounded below by zero. The question of whether or not the minimum is attained is answered in the following.

Theorem 10.7. For any v \in V and any W \in \mathcal{L}(V), there exists a unique w_0 \in W such that \|v-w_0\|=\mathrm{d}(v,W).

Proof: Suppose first that W \in \mathcal{L}(V) is a coordinate subspace of V = \mathcal{F}(X). That is, we have W=\mathcal{F}(Y) for a nonempty Y \in \mathcal{P}(X), and hence W^\perp = \mathcal{F}(Z) where Z=Y^c. For any vector v \in V, consider its expansion in the orthonormal basis X=Y \sqcup Z,

v = \sum\limits_{x \in X} \langle x,v \rangle x = \sum\limits_{y \in Y} \langle y,v \rangle y + \sum\limits_{z \in Z} \langle z,v \rangle z,

and define

w_0 = \sum\limits_{y \in Y} \langle y,v \rangle y.

Then, w_0 \in W. We claim that \|v-w_0\| = \mathrm{d}(v,W) and that w_0 is the unique point of W which achieves this minimum. To see this, we first compute

\|v-w_0\|^2 = \sum\limits_{z \in Z} |\langle z,v\rangle|^2.

On the other hand, for any w \in W we have

\|v-w\|^2 = \sum\limits_{y \in Y} |\langle y,v\rangle-\langle y,w\rangle|^2 + \sum\limits_{z \in Z} |\langle z,v \rangle|^2,

which is strictly larger than \|v-w_0\|^2 unless we have

\langle \langle y,v \rangle = \langle y,w\rangle, \quad\text{for all }y \in Y,

in which case w=w_0.

To finish, observe that the argument above holds even when W is not a coordinate subspace of V, thanks to Problem 10.4. \square

In view of Theorem 10.7, the above definition is legitimate.

Definition 10.8. For any subspace W \in \mathcal{L}(V), we define the orthogonal projection P_W \in \mathrm{End}(V) to be the linear operator characterized by

\|v -P_Wv\|= \mathrm{d}(v,W), \quad v \in V.

It is clear from the definition that P_W is an idempotent with image W, hence it is a retraction of V onto W (Problem 10.5). As for the name “orthogonal projection,” this is justified by the following.

Problem 10.7. Prove that v-P_Wv \in W^\perp, and that \mathrm{Ker}(P_W)=W^\perp.

Hence, we see an important difference between the classical computational category \mathbf{FSet} and the quantum computational category \mathbf{FHil}: for a finite set X and a subset Y \in \mathcal{P}(X) there is no canonical representative of the class [f]_Y of idempotents with image Y, whereas in the quantized setting V = \mathcal{F}(X) we do have a canonical linear retraction onto any subspace W \in \mathcal{L}(V) characterized by orthogonality.

To finish, let us consider the case of a coordinate subspace W=\mathcal{F}(Y) of V=\mathcal{F}(X), i.e. Y \in \mathcal{P}(X). Let P_W \in \mathrm{End}(V) be the corresponding orthogonal projection operator. The matrix elements of P_W in the orthonormal basis X are described as follows: for two distinct points x_1,x_2 \in X we have \langle x_1,P_Wx_2 \rangle = 0, and for any x \in X we have \langle x,P_Wx \rangle = \mathbf{1}_Y(x). That is, the orthogonal projection P_W incorporates the binary encoding of the subset Y \subset X as its diagonal.

Let us note that the following question remains open: if we are given an idempotent operator P \in \mathrm{End}(V) we currently have no algebraic criterion for determining whether or not P is an orthogonal projection. To determine this, we would have to examine the geometry of P, meaning that we would have to compute its image W = \mathrm{Im}(P) and check that \mathrm{Ker}(P) = W^\perp. In fact, there is an algebraic characterization of which idempotents actually are orthogonal projections: it is precisely the selfadjoint ones. I mention this because you probably already know it. If you don’t that’s even better, because we will prove this characterization as a theorem after we establish the singular value decomposition, which we shall use to define the adjoint of a general linear transformation.

Leave a Reply