Math 202A: Lecture 17

In this lecture we will discuss the adjoint and transpose. You probably know all about these matrix operations, and may feel that you are too enlightened to participate in this discussion. That feeling is called “ignorance”. Before enlightenment: take adjoints, transpose matrices. After enlightenment: take adjoints, transpose matrices.

In the beginning (of Math 202A), there was quantization,

\mathcal{F} \colon \mathbf{FSet} \longrightarrow \mathbf{FHil}.

\mathcal{F} sends a finite set X to the free Hilbert space \mathcal{F}(X) containing X as an orthonormal basis. It also sends a set function f \in \mathrm{Hom}(X,Y) to the linear transformation \mathcal{F}(f) \in \mathrm{Hom}(\mathcal{F}(X),\mathcal{F}(Y)) whose action on this basis is

\mathcal{F}(f)x = f(x), \quad x \in X.

\mathcal{F} is not the only functor from sets to spaces. As the Good Book says, the adjoint

\mathcal{F}^* \colon \mathbf{FSet} \longrightarrow \mathbf{FHil}

was with quantization, and the adjoint was quantization. Indeed, \mathcal{F}^* matches \mathcal{F} on sets,

\mathcal{F}^*(X)=\mathcal{F}(X),

but acts differently on functions: instead of pushing a point of X forward onto a point of Y via f, \mathcal{F}^* pulls a point of Y back onto its preimage under f,

\mathcal{F}^*(f)y = \sum\limits_{x \in f^{-1}(y)} x.

Thus the adjoint \mathcal{F}^* of \mathcal{F} is a contravariant functor,

\mathcal{F}^*(f) \in \mathrm{Hom}(\mathcal{F}^*(Y),\mathcal{F}^*(X))=\mathrm{Hom}(\mathcal{F}(Y),\mathcal{F}(X)).
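As a concrete sketch (numpy, with a small instance and function names of my own invention): in the orthonormal bases X and Y, the matrix of \mathcal{F}(f) is a 0/1 matrix with one 1 per column, and the matrix of \mathcal{F}^*(f) is its transpose, summing each y over its fiber.

```python
import numpy as np

# Quantize a set function f: {0,...,n-1} -> {0,...,m-1} as the matrix of
# F(f) in the orthonormal bases X and Y. (Hypothetical helper, not from
# the lecture.)
def quantize(f, n, m):
    """Matrix of F(f): column x has a single 1, in row f(x)."""
    A = np.zeros((m, n))
    for x in range(n):
        A[f(x), x] = 1.0
    return A

# Example: f: {0,1,2} -> {0,1} with f(0) = f(1) = 0, f(2) = 1.
f = lambda x: 0 if x < 2 else 1
A = quantize(f, 3, 2)

# F*(f) sends y to the sum over its fiber f^{-1}(y); its matrix is A.T.
Astar = A.T
print(Astar @ np.array([1.0, 0.0]))  # fiber of y=0 is {x_1, x_2}: [1. 1. 0.]
```

The design point: pushing forward is column-by-column bookkeeping, while pulling back is row-by-row, which is exactly why the two matrices are transposes of each other.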

It is worth noting that while \mathcal{F}(f) \in \mathrm{Hom}(\mathcal{F}(X),\mathcal{F}(Y)) is a linear transformation which contains the same information as the original set function f \in \mathrm{Hom}(X,Y), presented in the same way, \mathcal{F}^*(f) \in \mathrm{Hom}(\mathcal{F}(Y),\mathcal{F}(X)) has no direct counterpart in \mathbf{FSet}. In the classical computational category, you can identify a function f with the ordered list of its fibers, but the fibers of f are not points of its domain; they are subobjects of it. In the quantum computational category, you can consider \mathcal{F}^*(f)y, which is literally a point of the domain of \mathcal{F}(f). It is precisely this fact which allowed us to establish the Singular Value Decomposition for quantized set functions with no additional technology.

One more remark about the relationship between \mathcal{F}(f) and \mathcal{F}^*(f). A function f \in \mathrm{Hom}(X,Y) can be viewed as a bipartite graph. The transformation \mathcal{F}(f) encodes the edges of the bipartite graph f, but not quite in the usual way graph theorists do. A graph theorist would encode adjacency in f as an endomorphism of

\mathcal{F}(X \sqcup Y) \simeq \mathcal{F}(X) \oplus \mathcal{F}(Y),

namely as the linear operator A whose matrix in the ordered orthonormal basis X \sqcup Y = \{x_1,\dots,x_n,y_1,\dots,y_m\} is

[A] = \begin{bmatrix} 0_{n \times n} & \mathcal{F}^*(f) \\ \mathcal{F}(f) & 0_{m \times m} \end{bmatrix}.
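Here is a quick numerical sketch of this adjacency operator (a small instance of my own: f(0)=f(1)=0, f(2)=1, so n=3, m=2), assembled from the matrices of \mathcal{F}(f) and \mathcal{F}^*(f):

```python
import numpy as np

# Matrix of F(f) for f(0) = f(1) = 0, f(2) = 1 (size m x n = 2 x 3).
F = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
Fstar = F.T  # matrix of F*(f), size n x m = 3 x 2

# The graph theorist's adjacency operator on F(X) (+) F(Y): zero square
# blocks on the diagonal, F*(f) and F(f) off the diagonal.
Adj = np.block([[np.zeros((3, 3)), Fstar],
                [F, np.zeros((2, 2))]])
print(np.allclose(Adj, Adj.T))  # True: adjacency is symmetric
```

Symmetry of the adjacency matrix is the graph theorist's way of saying that the pullback block is the transpose of the pushforward block.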

There is a very concise way to capture the relationship between the pushforward \mathcal{F}(f) and the pullback \mathcal{F}^*(f) using scalar products.

Problem 17.1. Let V=\mathcal{F}(X) and W=\mathcal{F}(Y); write A=\mathcal{F}(f) and A^*=\mathcal{F}^*(f). Show that

\langle w,Av \rangle =\langle A^*w,v\rangle=\overline{\langle v,A^*w\rangle}, \quad\text{for all }v \in V,\ w \in W.
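A numerical sanity check of the identity in Problem 17.1, for a small quantized set function (the matrix below is \mathcal{F}(f) for f(0)=f(1)=0, f(2)=1; all entries are real, so conjugation is invisible and the standard dot product suffices):

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # matrix of F(f)

rng = np.random.default_rng(0)
v = rng.standard_normal(3)         # random vector in V = F(X)
w = rng.standard_normal(2)         # random vector in W = F(Y)

# <w, Av> should equal <A*w, v>, with A* realized as the transpose.
print(np.isclose(np.dot(w, A @ v), np.dot(A.T @ w, v)))  # True
```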

Now we look at Problem 17.1 and ask ourselves the same question we asked for the Singular Value Decomposition: can we untether this from combinatorics? If so, what does the untethered relation mean geometrically? More precisely, given any two finite-dimensional Hilbert spaces V,W and any linear transformation A \in \mathrm{Hom}(V,W), we seek a linear transformation A^* \in \mathrm{Hom}(W,V) which satisfies

\langle w,Av \rangle =\langle A^*w,v\rangle=\overline{\langle v,A^*w\rangle}, \quad\text{for all }v \in V,\ w \in W.

Assuming such an A^* exists, we want a geometric interpretation of it.

Problem 17.2. Show that if A^* exists, it is unique.

The proof of the existence of A^* is very straightforward. Let X \subset V and Y \subset W be any two orthonormal bases; these are guaranteed to exist by Gram-Schmidt and we have V=\mathcal{F}(X) and W = \mathcal{F}(Y). However, it is not necessarily true that we can select X,Y such that our pre-selected A \in \mathrm{Hom}(V,W) maps every point of X to a point of Y; indeed, a necessary condition for this to be possible is that A be an isometry. Nevertheless, we can simply define A^* \in \mathrm{Hom}(W,V) by declaring its matrix elements to be what we want them to be. More precisely, set

\langle x,A^*y\rangle := \langle Ax,y \rangle = \overline{\langle y,Ax \rangle}, \quad x \in X,\ y \in Y.

Then, for any v \in V and w \in W we have

\langle v,A^*w\rangle = \sum\limits_{x \in X} \sum\limits_{y \in Y} \overline{\langle x,v \rangle}\langle y,w\rangle \langle x,A^*y\rangle

and this coincides with

\langle Av,w\rangle = \sum\limits_{x \in X} \sum\limits_{y \in Y} \overline{\langle x,v \rangle}\langle y,w\rangle \langle Ax,y\rangle

thanks to our definition of A^*. This is completely true, and defines the adjoint A^* \in \mathrm{Hom}(W,V) of a linear transformation A \in \mathrm{Hom}(V,W) in a manner which is logically unassailable and yet totally unsatisfying for all but the most soulless of algebraists.
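In matrix terms, the soulless definition says: relative to orthonormal bases, the matrix of A^* is the conjugate transpose of the matrix of A. A quick complex-valued check (random data; variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
Astar = A.conj().T  # matrix of the adjoint in orthonormal bases

v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# Scalar product conjugate-linear in the first slot, as in the lecture;
# np.vdot conjugates its first argument.
sp = lambda a, b: np.vdot(a, b)
print(np.isclose(sp(w, A @ v), sp(Astar @ w, v)))  # True
```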

A natural geometric approach to reversing an arbitrary linear transformation A \in \mathrm{Hom}(V,W) can be developed using the Singular Value Decomposition, which gives us orthogonal decompositions

V = V_1 \oplus \dots \oplus V_r \oplus \mathrm{Ker}(A) \quad\text{and}\quad W=W_1 \oplus \dots \oplus W_r \oplus \mathrm{Im}(A)^\perp,

positive numbers

\sigma_1 \geq \dots \geq \sigma_r >0,

and isometric isomorphisms

U_1 \in \mathrm{Hom}(V_1,W_1),\dots,U_r \in \mathrm{Hom}(V_r,W_r)

such that

A|_{V_i} = \sigma_i U_i, \quad 1 \leq i \leq r.

By abuse of notation, we write

A= \sigma_1U_1 \oplus \dots \oplus \sigma_rU_r \oplus 0_{q\times p},

where p= \dim \mathrm{Ker}(A) is the nullity of A and q=\dim \mathrm{Im}(A)^\perp is the corank of A, so that 0_{q\times p} \in \mathrm{Hom}(\mathrm{Ker}(A),\mathrm{Im}(A)^\perp) is the zero transformation. Now let us define a corresponding transformation A^\# \in \mathrm{Hom}(W,V) by

A^\# = \sigma_1U_1^{-1} \oplus \dots \oplus \sigma_rU_r^{-1} \oplus 0_{p \times q},

which is shorthand for declaring that the restriction of A^\# to W_i is \sigma_i U_i^{-1} for 1 \leq i \leq r and the restriction of A^\# to \mathrm{Im}(A)^\perp is the zero transformation into \mathrm{Ker}(A).
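A numerical sketch of this construction (and of what Problem 17.3 asserts): build A^\# by reversing each rank-one piece of the SVD while keeping the singular values, and compare with the transpose (the matrix here is real, so the conjugate transpose is just the transpose; variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

# SVD: A = sum_i s_i * outer(u_i, v_i) over the nonzero singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-12))  # rank

# Reverse each isometric piece, keeping sigma_i: A# = sum_i s_i * outer(v_i, u_i).
Asharp = sum(s[i] * np.outer(Vt[i], U[:, i]) for i in range(r))

print(np.allclose(Asharp, A.T))  # True: A# agrees with A* (real case)
```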

Problem 17.3. Prove that A^\#=A^*. Deduce \mathrm{Ker}(A^*)=\mathrm{Im}(A)^\perp and \mathrm{Im}(A^*)=\mathrm{Ker}(A)^\perp.

One advantage to the geometric interpretation of A^* as “reversing” the singular value decomposition of A is that it immediately leads you to wonder about “inverting” the singular value decomposition of A. More precisely, the linear transformation in \mathrm{Hom}(W,V) defined by

A^+ = \sigma_1^{-1}U_1^{-1} \oplus \dots \oplus \sigma_r^{-1}U_r^{-1} \oplus 0_{q \times p}

is known as the Moore-Penrose pseudoinverse of A. Indeed, A^+ is defined by inverting the invertible part of A, in the following sense.

Problem 17.4. Show that the restriction of A to V_1 \oplus \dots \oplus V_r is an isomorphism onto W_1 \oplus \dots \oplus W_r with inverse A^+. Prove that AA^+A=A holds in \mathrm{Hom}(V,W) and that A^+AA^+=A^+ holds in \mathrm{Hom}(W,V).
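Here is a sketch of the pseudoinverse built exactly as above, inverting the nonzero singular values, together with a check of the identities in Problem 17.4 and a comparison against numpy's built-in `np.linalg.pinv` (random rank-deficient matrix; names are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))  # rank <= 2

U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-10 * s[0]))  # numerical rank

# A+ = sum_i (1/s_i) * outer(v_i, u_i) over nonzero singular values.
Aplus = sum((1.0 / s[i]) * np.outer(Vt[i], U[:, i]) for i in range(r))

print(np.allclose(A @ Aplus @ A, A))          # True
print(np.allclose(Aplus @ A @ Aplus, Aplus))  # True
print(np.allclose(Aplus, np.linalg.pinv(A)))  # True
```

The familiar formula (A^*A)^{-1}A^* only makes sense when A has full column rank; the SVD construction works for any A, which is why `pinv` computes it this way.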

Now let us discuss the transpose of A \in \mathrm{Hom}(V,W). No doubt you are familiar with the transpose of a matrix, so I will make the following remark: the transpose of the matrix of A is the matrix of A^* with respect to orthonormal bases adapted to the singular value decomposition, i.e. with respect to orthonormal bases of V and W consisting of left and right singular vectors, respectively. This is a singular consequence of the fact that the singular values of A are real. For general orthonormal bases of V and W, the corresponding matrices of A and A^* are related by conjugate transpose, not transpose.

The intrinsic meaning of the matrix transpose is not to be found in the adjoint operation, but in a different operation coming from Riesz duality in \mathbf{FHil}. Recall that for every finite dimensional Hilbert space V we have an antilinear isomorphism with V^*=\mathrm{Hom}(V,\mathbb{C}), the space of linear functionals on V, given by the Riesz mapping

v \mapsto \varphi_v(\cdot) = \langle v,\cdot\rangle, \quad v \in V.

In particular, this allows us to promote V^* from a vector space to a Hilbert space by defining the scalar product

\langle \varphi_v,\varphi_{v^\prime}\rangle = \langle v^\prime,v\rangle, \quad v,v^\prime \in V,

the reversal of slots being forced by the antilinearity of the Riesz mapping. This scalar product on V^*=\mathrm{Hom}(V,\mathbb{C}) is precisely what we used to prove that the Frobenius scalar product on \mathrm{Hom}(V,W) is basis-independent via an argument inductive in the dimension of W. If X \subset V is an orthonormal basis, we get a corresponding orthonormal basis X^* \subset V^* consisting of the linear functionals \varphi_x(\cdot)=\langle x,\cdot\rangle.

We now declare the transpose of A \in \mathrm{Hom}(V,W) to be the morphism A^t \in \mathrm{Hom}(W^*,V^*) defined by

A^t \psi := \psi \circ A, \quad \psi \in W^*.

Problem 17.5. Verify that A^t really is a linear transformation W^* \to V^*. Taking orthonormal bases X \subset V and Y \subset W, show that the matrix of A^t \in \mathrm{Hom}(W^*,V^*) relative to the dual bases Y^* \subset W^* and X^* \subset V^* is the transpose of the matrix of A \in \mathrm{Hom}(V,W) relative to X \subset V and Y \subset W.
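A numerical illustration of why the plain transpose (no conjugate!) appears here: representing a functional \psi by its coefficient vector in the dual basis, precomposition with A pairs bilinearly with vectors, so the matrix acting on coefficient vectors is [A]^t. (Random complex data; names are mine.)

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))

psi = rng.standard_normal(2) + 1j * rng.standard_normal(2)  # coords of psi in Y*
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)    # coords of v in X

# psi(Av) two ways: directly, and via A^t acting on psi. The pairing is
# bilinear (plain @, no conjugation), unlike a scalar product.
print(np.isclose(psi @ (A @ v), (A.T @ psi) @ v))  # True
```

Had we paired with `np.vdot` instead of `@`, conjugates would reappear and we would be back to the adjoint; the transpose lives on the dual space precisely because evaluation of functionals is bilinear.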

2 Comments

  1. some anonymous student using a burner email says:

    oh man. the definition of the Moore-Penrose pseudoinverse was satisfying. Each time I encountered it, it was just $A^+ = (A^T A)^{-1} A^T$, whatever that means! It makes much more sense as the inverse of the invertible part as per the SVD. Thanks!

    1. I’m happy this was helpful. Now get out there and pseudo-invert some matrices.
