Math 202A: Lectures 10 and 11

In this pair of lectures we consider the familiar topic of subspaces and projections, continuing our program of studying a subject you already know from a (possibly) new point of view. Probably you have heard of the textbook Linear Algebra Done Right, which prompted a reactionary response in the form an antithetical textbook called Linear Algebra Done Wrong. As to which is better, the debate rages on. Math 202A is Linear Algebra Done Weird. I believe that title is unclaimed; let me know if you have a better one to suggest.

As in previous lectures, we will use category-theoretic thinking to guide us towards analogies between the classical computational category $\mathbf{FSet}$ of finite sets and the quantum computational category $\mathbf{FHil}$ of finite-dimensional Hilbert spaces.

In this pair of lectures, we look at subobjects in these categories and compare and contrast the classical (set-theoretic) and quantum (linear-algebraic) aspects of subobjects. In both cases (and indeed much more generally), we can describe subobjects algebraically as equivalence classes of idempotent endomorphisms. In the classical computational category $\mathbf{FSet},$ there is no canonical way to choose a representative of the equivalence class of idempotents which correspond to a given subobject. However, in the quantum computational category $\mathbf{FHil}$ the geometry of Hilbert space singles out a natural representative in every equivalence class of idempotents characterized by an orthogonality feature.

We start on the classical side, $\mathbf{FSet},$ where subobjects are just subsets. We have a functor $\mathcal{P} \colon \mathbf{FSet} \rightarrow \mathbf{FLat}$ from finite sets to finite lattices which sends $X$ to its power set,

$\mathcal{P}(X) = \{Y \subseteq X\}.$

The partial order in $\mathcal{P}(X)$ is containment, meaning that for $Y_1,Y_2 \in \mathcal{P}(X)$ we define

$Y_1 \leq Y_2 \iff Y_1 \subseteq Y_2.$

What makes $\mathcal{P}(X)$ a lattice (as opposed to just a poset) is the fact that we have also the meet and join operations,

$Y_1 \vee Y_2 = Y_1 \cup Y_2 \quad\text{ and }\quad Y_1 \wedge Y_2 = Y_1 \cap Y_2,$

which give the smallest set containing both $Y_1,Y_2$ and the largest set contained in both $Y_1,Y_2,$ respectively.

The subset lattice $\mathcal{P}(X)$ has a couple of additional properties. First, it is graded, with rank function given by subset cardinality. This means that $\mathcal{P}(X)$ decomposes into a disjoint union of antichains

$\mathcal{P}_0(X),\mathcal{P}_1(X),\dots,\mathcal{P}_n(X)$

in which $n=|X|$ and

$\mathcal{P}_k(X) = \{Y \subseteq X \colon |Y|=k\}$

has cardinality

$|\mathcal{P}_k(X)|= {n \choose k}.$

Second, the subset lattice $\mathcal{P}(X)$ has a natural anti-isomorphism (i.e. order-reversing bijection) given by

$Y \mapsto Y^c = X \backslash Y.$

Taking complements maps $\mathcal{P}_k(X)$ bijectively onto $\mathcal{P}_{n-k}(X),$ which is why

${n \choose k} = {n \choose n-k}.$

Furthermore, for any $Y \in \mathcal{P}(X)$ we have

$|X|=|Y|+|Y^c|.$

Our definition of the power set functor $\mathcal{P}$ is not complete until we say where it sends a given set function $f \in \mathrm{Hom}(X_1,X_2)$ . The corresponding $\mathcal{P}(f) \colon \mathcal{P}(X_1) \to \mathcal{P}(X_2)$ is defined by

$[\mathcal{P}(f)](Y) = f(Y) = \{f(x) \colon x \in y\}.$

As an exercise, you should check that $\mathcal{P}(f)$ really is a morphism in $\mathbf{FLat}.$

Now let us consider how we can detect subobjects of $X$ algebraically, using functions on $X.$

Definition 10.1. A function $f \in \mathrm{End}(X)$ is said to be a retraction onto $Y \in \mathcal{P}(X)$ if $f(x) \in Y$ for all $x \in X$ and $f(y)=y$ for all $y \in Y.$

Notice that Definition 10.1 only applies to nonempty subsets of $X$ . Another remark is that retractions are usually considered as morphisms in the category $\mathbf{Top}$ whose objects are topological spaces and whose morphisms are continuous functions. We can include $\mathbf{Set}$ in $\mathbf{Top}$ by giving every set the discrete topology.

Now let us discuss a question raised by Lani Southern: what does it mean to say that there is no “canonical” retraction of $X$ onto $Y.$ The precise meaning of this is that there is no way to assign to each pair $(X,Y)$ consisting of a finite set $X$ and a subset $Y \subseteq X$ a specific retraction $f_{XY}$ of $X$ onto $Y$ which is equivariant with respect to all automorphisms of $X$ that preserve $Y.$ In more detail, let $\mathrm{Aut}(X,Y)$ denote the subgroup of $\mathrm{Aut}(X)$ consisting of all automorphisms (permutations) $\sigma$ of $X$ satisfying $\sigma(Y)=Y.$ Call a rule $(X,Y) \mapsto f_{XY}$ which assigns to set/subset pairs a corresponding retraction canonical if

$\sigma \circ f_{XY} = f_{XY} \circ \sigma \quad\text{for all } \sigma \in \mathrm{Aut}(X,Y).$

Proposition 10.2. If $|Y|\geq 2$ and $X \backslash Y$ is nonempty, no canonical retraction exists.

Proof: Suppose otherwise, and let $f_{XY}$ be a canonical retraction of $X$ onto $Y$ . Pick a point $x \in X \backslash Y.$ Then, for every $\sigma \in \mathrm{Aut}(X,Y)$ we have

$f_{XY}(x)=f_{XY}(\sigma x)= \sigma f_{XY}(x).$

In particular, this holds for every permutation $\sigma$ which has our chosen point $x$ as a fixed point and permutes the points of $Y$ arbitrarily, so the above forces $f_{XY}(x) \in Y$ to be fixed by every permutation of $Y,$ which in turn forces $|Y|=1,$ a contradiction. $\square$

In order to get past the above, we would need to break the symmetry of $X.$ For example, this could mean promoting the structureless finite set $X$ to a metric space $(X,\mathrm{d}).$ Then, for any nonempty $Y \in \mathcal{P}(X),$ we can define a retraction of $X$ onto $Y$ by declaring $f_{XY}(x)$ to be the point of $Y$ closest to $x.$ However, even here we need to impose a rule for breaking ties (two distinct minimizers). Once we do this, the retraction $f_{XY}$ is canonical in the sense that it is equivariant with respect to all isometries of $(X,\mathrm{d})$ which preserve tie breakers.

Definition 10.2. A function $f \in \mathrm{End}(X)$ is called an idempotent if $f \circ f = f.$

Idempotents allow us to discuss retractions without specifying what they retract onto.

Problem 10.1. Prove that an idempotent $f \in \mathrm{End}(X)$ is a retraction onto its image $\mathrm{Im}(f)=\{f(x) \colon x \in X\}.$ Conversely, show that if $f \in \mathrm{End}(X)$ is a retraction of $X$ onto a subset $Y \in \mathcal{P}(X),$ then $f^2=f.$

Let $\mathrm{Ide}(X)$ denote the set of idempotent functions in $\mathrm{End}(X)$ . Define an equivalence relation on $\mathrm{Ide}(X)$ by

$f \sim g \iff \mathrm{Im}(f) = \mathrm{Im}(g).$

Then, associated to each $Y \in \mathcal{P}(X)$ is the equivalence class of idempotent functions with image $Y,$

$[f]_Y = \{f \in \mathrm{End}(X) \colon \mathrm{Im}(f)=Y,\ f^2=f\}.$

Problem 10.2. Prove that $Y \mapsto [f]_Y$ is a bijection $\mathcal{P}(X)\backslash\{\emptyset\} \to \mathrm{Ide}(X)/\sim.$

There is another way to encode subsets if we do not insists on self-maps.

Defintion 10.3. The indicator function of $Y \in \mathcal{P}(X)$ is the function $\mathbf{1}_Y \in \mathrm{Hom}(X,\{0,1\})$ defined by

$\mathbf{1}_Y(x) = \begin{cases} 1, \text{ if } x \in Y \\ 0, \text{ if } x \not\in Y \end{cases}.$

The function

$\mathcal{P}(X) \longrightarrow \mathrm{Hom}(X,\{0,1\})$

which sends each subset $Y$ to its indicator function $\mathbf{1}_Y$ is a bijection (binary encoding of subsets).

Now let us transport the above to the quantum computational category by means of our favorite functor

$\mathcal{F} \colon \mathbf{FSet} \longrightarrow \mathbf{FHil}.$

Recall that the quantization functor $\mathcal{F}$ sends a finite set $X$ to the free Hilbert space $\mathcal{F}(X),$ inside of which $X$ lives on as an orthonormal basis. Conversely, if $V$ is an arbitrary Hilbert space then we can choose an orthonormal basis of $V$ and identity it with $\mathcal{F}(X).$

On the Hilbert space side, the subobject functor

$\mathcal{L} \colon \mathbf{FHil} \longrightarrow \mathbf{Lat}$

sends a Hilbert space $V$ to

$\mathcal{L}(V) =\{W \subseteq V \colon W \text{ a subspace}\},$

with partial order again given by set containment. Thus, $\mathcal{L}(V)$ only includes subsets of $V$ which are themselves objects of $\mathbf{FHil}.$ The meet and join operations in $\mathcal{L}(V)$ are

$W_1 \wedge W_2 = W_1 \cap W_2 \quad\text{and}\quad W_1 \vee W_2=\mathrm{Span}(W_1 \cup W_2).$

Moreover, if $A \in \mathrm{Hom}(V_1,V_2)$ is a linear transformation the corresponding order homomorphism $\mathcal{L}(A) \in \mathrm{Hom}(\mathcal{L}(V_1),\mathcal{L}(V_2))$ is defined by

$[\mathcal{L}(A)](W) = A(W) = \{Aw \colon w \in W\} = \mathrm{Im}(A).$

The subspace lattice $\mathcal{L}(V)$ of the Hilbert space $V$ is graded by dimension: it decomposes into a disjoint union of $\dim V$ antichains

$\mathcal{L}(V)_k = \{W \leq V \colon \dim W=k\}, \quad 0 \leq k \leq n,$

where $n=\dim V.$ Although each $\mathcal{L}(V)_k$ is uncountably infinite set, it has much more structure than just being a very big set. In fact, these sets are important enough that they have a special name.

Definition 10.4. The set $\mathcal{L}(V)_k$ is called the Grassmannian of $k$ -planes in $V.$

Note to self: if we’re going to be using ostentatious names, perhaps $\mathcal{P}_k(X)$ should be called the Pascalian of $k$ -sets in $X.$

Now let $V = \mathcal{F}(X)$ be the free Hilbert space over the finite set $X.$ Then, every subset $X \in \mathcal{P}(X)$ quantizes to $\mathcal{F}(X) \in \mathcal{L}(V),$ giving $2^{|X|}$ special subspaces of $V$ called coordinate subspaces relative to $X.$ The coordinate subspaces of $V = \mathcal{F}(X)$ form an induced sublattice of $\mathcal{L}(V)$ isomorphic to $\mathcal{P}(X).$ Complementary subsets of $X$ quantize to complementary coordinate subspaces of $V.$

Problem 10.3. Prove that for any $Y \in \mathcal{P}(X)$ we have $\mathcal{F}(Y^c)=\mathcal{F}(Y)^\perp.$ Moreover, prove that we have the internal direct sum decomposition $\mathcal{F}(X) = \mathcal{F}(Y) \oplus \mathcal{F}(Y^c).$

Of course, not every subspace $W \in \mathcal{L}(V)$ is a coordinate subspace of $V=\mathcal{F}(X)$ . For example, for two distinct points $x_1,x_2 \in X$ the line in $V$ spanned by the vector $x_1+x_2$ does not contain any point of $X.$

Problem 10.4. Prove that for any $W \in \mathcal{L}(V)$ we have the internal direct sum decomposition $V = W \oplus W^\perp.$ (Hint: you can deduce this directly from Problem 10.3 by constructing an appropriate unitary operator on $V$ ).

Now let us again consider the question of encoding subobjects of $V$ algebraically, this time as linear operators on $V$ rather than arbitrary functions.

Definition 10.5. Given $W \in \mathcal{L}(V),$ a linear retraction of $V$ onto $W$ is an operator $P_W \in \mathrm{End}(V)$ such that $\mathrm{Im}(P_W)=W$ and the restriction of $P_W$ to $W$ is the identity operator $I_W.$

Problem 10.5. Prove that linear operator $P \in \mathrm{End}(V)$ is a retraction onto its image if and only if it is idempotent, $P^2=P.$

As in the set-theoretic setting, let $\mathrm{Ide}(V)$ denote the set of idempotent operator on $V,$ and define an equivalence relation on $\mathrm{Ide}(V)$ by declaring two idempotents equivalent if and only if they have the same image.

Problem 10.6. The map $W \mapsto [P]_W$ which sends a subspace $W \leq V$ to the equivalence class $[P]_W$ of idempotent operators with image $W$ is a bijection $\mathcal{L}(V) \to \mathrm{Ide}/\sim.$

Now we come to a key difference between the classical computational category $\mathbf{FSet}$ and the quantum computational category $\mathbf{FHil}.$ In the classical setting, given a point $x \in X$ and a subset $Y \in \mathcal{P}(X)$ , either $x \in Y$ or $x \not\in Y.$ In the quantum setting, given a vector $v \in V$ and a subspace $W \in \mathcal{L}(V),$ we can quantify how much $v$ is or is not $in W.$

Definition 10.6. For $v \in V$ and $W \in \mathcal{L}(V),$ define

$\mathrm{d}(v,W) = \inf\{\|v-w\| \colon w \in W\},$

and call this quantity the distance from the vector $v$ to the subspace $W.$

Note that the distance from $v \in V$ to $W \in \mathcal{L}(V)$ is indeed well-defined, since the set of distances $\|v-w\|$ with $w \in W$ is bounded below by zero. The question of whether or not the minimum is attained is answered in the following.

Theorem 10.7. For any $v \in V$ and any $W \in \mathcal{L}(V),$ there exists a unique $w_0 \in W$ such that $\|v-w_0\|=\mathrm{d}(v,W).$

Proof: Suppose first that $W \in \mathcal{L}(V)$ is a coordinate subspace of $V = \mathcal{F}(X).$ That is, we have $W=\mathcal{F}(Y)$ for a nonempty $Y \in \mathcal{P}(X),$ and hence $W^\perp = \mathcal{F}(Z)$ where $Z=Y^c.$ For any vector $v \in V,$ consider its expansion in the orthonormal basis $X=Y \sqcup Z,$

$v = \sum\limits_{x \in X} \langle x,v \rangle x = \sum\limits_{y \in Y} \langle y,v \rangle y + \sum\limits_{z \in Z} \langle z,v \rangle z,$

and define

$w_0 = \sum\limits_{y \in Y} \langle y,v \rangle y.$

Then, $w_0 \in W.$ We claim that $\|v-w_0\| = \mathrm{d}(v,W)$ and that $w_0$ is the unique point of $W$ which achieves this minimum. To see this, we first compute

$\|v-w_0\|^2 = \sum\limits_{z \in Z} |\langle z,v\rangle|^2.$

On the other hand, for any $w \in W$ we have

$\|v-w\|^2 = \sum\limits_{y \in Y} |\langle y,v\rangle-\langle y,w\rangle|^2 + \sum\limits_{z \in Z} |\langle z,v \rangle|^2,$

which is strictly larger than $\|v-w_0\|^2$ unless we have

$\langle \langle y,v \rangle = \langle y,w\rangle, \quad\text{for all }y \in Y,$

in which case $w=w_0.$

To finish, observe that the argument above holds even when $W$ is not a coordinate subspace of $V$ , thanks to Problem 10.4. $\square$

In view of Theorem 10.7, the above definition is legitimate.

Definition 10.8. For any subspace $W \in \mathcal{L}(V),$ we define the orthogonal projection $P_W \in \mathrm{End}(V)$ to be the linear operator characterized by

$\|v -P_Wv\|= \mathrm{d}(v,W), \quad v \in V.$

It is clear from the definition that $P_W$ is an idempotent with image $W,$ hence it is a retraction of $V$ onto $W$ (Problem 10.5). As for the name “orthogonal projection,” this is justified by the following.

Problem 10.7. Prove that $v-P_Wv \in W^\perp,$ and that $\mathrm{Ker}(P_W)=W^\perp.$

Hence, we see an important difference between the classical computational category $\mathbf{FSet}$ and the quantum computational category $\mathbf{FHil}$ : for a finite set $X$ and a subset $Y \in \mathcal{P}(X)$ there is no canonical representative of the class $[f]_Y$ of idempotents with image $Y,$ whereas in the quantized setting $V = \mathcal{F}(X)$ we do have a canonical linear retraction onto any subspace $W \in \mathcal{L}(V)$ characterized by orthogonality.

To finish, let us consider the case of a coordinate subspace $W=\mathcal{F}(Y)$ of $V=\mathcal{F}(X),$ i.e. $Y \in \mathcal{P}(X).$ Let $P_W \in \mathrm{End}(V)$ be the corresponding orthogonal projection operator. The matrix elements of $P_W$ in the orthonormal basis $X$ are described as follows: for two distinct points $x_1,x_2 \in X$ we have $\langle x_1,P_Wx_2 \rangle = 0,$ and for any $x \in X$ we have $\langle x,P_Wx \rangle = \mathbf{1}_Y(x).$ That is, the orthogonal projection $P_W$ incorporates the binary encoding of the subset $Y \subset X$ as its diagonal.

Let us note that the following question remains open: if we are given an idempotent operator $P \in \mathrm{End}(V)$ we currently have no algebraic criterion for determining whether or not $P$ is an orthogonal projection. To determine this, we would have to examine the geometry of $P,$ meaning that we would have to compute its image $W = \mathrm{Im}(P)$ and check that $\mathrm{Ker}(P) = W^\perp.$ In fact, there is an algebraic characterization of which idempotents actually are orthogonal projections: it is precisely the selfadjoint ones. I mention this because you probably already know it. If you don’t that’s even better, because we will prove this characterization as a theorem after we establish the singular value decomposition, which we shall use to define the adjoint of a general linear transformation.

Math 202A: Lectures 10 and 11

Like this:

Published by Jonathan Novak

Leave a ReplyCancel reply

Share this:

Like this:

Published by Jonathan Novak

Leave a ReplyCancel reply

Discover more from Jonathan Novak