Math 202A: Lecture 4

Definition 4.1. A Hilbert space is a complex vector space $V$ equipped with a scalar product: a function

$\langle \cdot,\cdot \rangle \colon V \times V \longrightarrow \mathbb{C}$

satisfying:

$\langle v,v \rangle \geq 0$ with equality if and only if $v=0_V$ ;
$\langle v,w \rangle=\overline{\langle w,v\rangle}$ ;
$\langle v, \beta_1w_1+\beta_2w_2\rangle=\beta_1\langle v,w_1\rangle + \beta_2 \langle v,w_2\rangle.$

The term “scalar product” indicates that $\langle \cdot,\cdot \rangle$ is a rule for multiplying two vectors $v,w$ in $V$ such that the product $\langle v,w\rangle$ is a complex number rather than a vector in $V.$ Using these axioms you can check that the scalar product satisfies the following augmented FOIL identity from high school algebra, which is known as sesquilinearity:

$\langle \alpha_1v_1+\alpha_2v_2,\beta_1w_1+\beta_2w_2\rangle=\overline{\alpha_1}\beta_1\langle v_1,w_1\rangle + \overline{\alpha_1}\beta_2\langle v_1,w_2\rangle + \overline{\alpha_2}\beta_1\langle v_2,w_1\rangle + \overline{\alpha_2}\beta_2\langle v_2,w_2\rangle.$
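Here is a sketch of one way to carry out this check, using only the axioms of Definition 4.1. The third axiom gives linearity in the second slot, and combining it with the second axiom gives conjugate-linearity in the first slot:

$\langle \alpha_1v_1+\alpha_2v_2,w\rangle = \overline{\langle w,\alpha_1v_1+\alpha_2v_2\rangle} = \overline{\alpha_1\langle w,v_1\rangle+\alpha_2\langle w,v_2\rangle} = \overline{\alpha_1}\langle v_1,w\rangle+\overline{\alpha_2}\langle v_2,w\rangle.$

Applying this with $w=\beta_1w_1+\beta_2w_2$ and then expanding each of the two resulting terms in the second slot yields the four-term identity.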

Definition 4.1 is nonstandard: the usual definition includes an extra analytic condition which we have omitted (this will be discussed further below). One may also see the pair $(V,\langle \cdot,\cdot\rangle)$ referred to as a Hermitian space.

A vector space $V$ without a scalar product abstracts the familiar operations of addition and scaling of spatial vectors $\vec{v}$ , which are standard in science and engineering. In particular you have already in your mind the image of vectors as directed line segments representing things like velocity or acceleration; these are added tail-to-tip and stretched or squished by scalar multiplication. The reason to generalize from spatial vectors $\vec{v}$ to abstract vectors $v$ is that many other types of objects can be manipulated in the same way as spatial vectors, and we can study all such systems simultaneously through an axiomatic development of general vector spaces. The motivation behind allowing for scaling by arbitrary complex numbers instead of just real ones is a bit more involved, but one may point to the use of complex vector spaces in quantum mechanics as one reason. The role of the scalar product is to provide an axiomatic foundation for abstracting familiar geometric aspects of spatial vectors, such as the length of a vector and the angle between two vectors, to the setting of an arbitrary vector space.

Definition 4.2. A normed vector space is a complex vector space $V$ equipped with a function

$\|\cdot\| \colon V \longrightarrow [0,\infty)$

such that

$\|v\|=0$ if and only if $v=0_V$ ;
$\|\alpha v\| = |\alpha|\|v\|$ ;
$\|v+w\| \leq \|v\| + \|w\|.$

The norm $\|v\|$ abstracts the notion of vector length in a way which is compatible with our geometric intuition. The first axiom says that only the zero vector has zero length. The second says that scaling a vector and then measuring its length is the same thing as multiplying the original length measurement by the magnitude of the scaling. The third is the triangle inequality: it abstracts the fact that if we add two spatial vectors $\vec{v},\vec{w}$ tail-to-tip we get a triangle with three directed sides $\vec{v},\vec{w}$ and $\vec{v}+\vec{w}$ . Any normed vector space can be promoted to a metric space with distance defined by $\mathrm{d}(v,w) = \|v-w\|.$
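To see why $\mathrm{d}$ is a metric, here is a short sketch using only the three norm axioms (the auxiliary vector $u$ is introduced just for this check). Positivity is the first axiom applied to $v-w.$ Symmetry follows from the second axiom with $\alpha=-1$:

$\mathrm{d}(w,v)=\|w-v\|=\|(-1)(v-w)\|=|-1|\|v-w\|=\mathrm{d}(v,w).$

The triangle inequality for $\mathrm{d}$ follows from the third axiom:

$\mathrm{d}(u,w)=\|(u-v)+(v-w)\| \leq \|u-v\|+\|v-w\| = \mathrm{d}(u,v)+\mathrm{d}(v,w).$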

We now claim that in a Hilbert space $V$ the scalar product gives us a norm defined by

$\|v\| = \sqrt{\langle v,v \rangle}.$

To verify this claim, we have to check that the three conditions stipulated by Definition 4.2 do in fact hold. The first two are easy to check. The third is a bit more problematic: we compute

$\|v+w\| = \sqrt{\langle v+w,v+w\rangle} = \sqrt{\|v\|^2+2\Re \langle v,w\rangle + \|w\|^2}$

and we have to control the quantity $\Re \langle v,w\rangle.$ Since the real part of any complex number is bounded by its modulus, where equality holds precisely for nonnegative real numbers, we have

$\|v+w\| \leq \sqrt{\|v\|^2+2|\langle v,w\rangle| + \|w\|^2}$

with equality if and only if $\langle v,w \rangle \geq 0.$

Theorem 4.3. (Cauchy-Schwarz inequality) For any vectors $v,w \in V$ we have

$|\langle v,w\rangle| \leq \|v\| \|w\|,$

with equality if and only if $v=0_V$ or $w = \alpha v$ for some $\alpha \in \mathbb{C}.$

Problem 4.1. Prove the Cauchy-Schwarz inequality.
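Granting Theorem 4.3, the verification of the norm axioms can be finished as follows. Inserting the Cauchy-Schwarz inequality into the bound obtained above gives the triangle inequality:

$\|v+w\| \leq \sqrt{\|v\|^2+2|\langle v,w\rangle| + \|w\|^2} \leq \sqrt{\|v\|^2+2\|v\|\|w\| + \|w\|^2} = \|v\|+\|w\|.$

For completeness, the second norm axiom follows from sesquilinearity:

$\|\alpha v\|^2 = \langle \alpha v,\alpha v\rangle = \overline{\alpha}\alpha\langle v,v\rangle = |\alpha|^2\|v\|^2.$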

At this point we have shown that the scalar product on a Hilbert space $V$ gives rise to a genuine norm defined by $\|v\|=\sqrt{\langle v,v \rangle}$ and hence to a genuine metric defined by $\mathrm{d}(v,w) = \|v-w\|$ . The reason Definition 4.1 is nonstandard is that at this stage one typically includes an extra clause: the metric space $(V,\mathrm{d})$ must be complete. We are not going to make metric completeness part of our definition of Hilbert space because we will for the most part not need to take limits of sequences of vectors in Hilbert space — that would be analysis, and our focus is algebra. Furthermore, we will soon define a notion of dimension for Hilbert spaces and then restrict our study to the finite-dimensional ones, where metric completeness is automatic.

Although we are working over the complex numbers, one may just as well consider real vector spaces equipped with a scalar product: these are called Euclidean spaces because they are the most elementary and natural setting in which to carry out an axiomatic abstraction of Euclidean geometry. So far, everything we have said about Hilbert spaces holds verbatim for Euclidean spaces. However, differences between the two cases will now start to emerge. Let us begin with the fact that in both Euclidean space and Hilbert space the scalar product can be recovered from the norm, but the recipe differs depending on whether one is working over the real or complex numbers.

Theorem 4.4. (Euclidean Polarization) For any vectors $v,w$ in a Euclidean space $V,$ we have

$\langle v,w \rangle = \frac{1}{4}\left( \|v+w\|^2 - \|v-w\|^2\right).$

Proof: We have

$\|v+w\|^2 = \langle v+w,v+w\rangle = \|v\|^2 +2\langle v,w \rangle + \|w\|^2$

and

$\|v-w\|^2 = \langle v-w,v-w\rangle = \|v\|^2 -2\langle v,w \rangle + \|w\|^2.$

Subtracting the second expression from the first, we obtain

$\|v+w\|^2 - \|v-w\|^2 = 4\langle v,w \rangle.$

-QED

Theorem 4.5. (Hermitian Polarization) For any vectors $v,w$ in a Hilbert space $V,$ we have

$\langle v,w \rangle = \frac{1}{4}\left( \|v+w\|^2 - \|v-w\|^2\right) + \frac{i}{4}\left( \|v-iw\|^2 - \|v+iw\|^2\right).$

Proof: Same as above: expand the right hand side and simplify.

-QED
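For readers who want the expansion spelled out, here is a sketch. It relies on the convention of Definition 4.1 that the scalar product is linear in the second slot, and hence conjugate-linear in the first. By sesquilinearity,

$\|v\pm w\|^2 = \|v\|^2 \pm 2\Re\langle v,w\rangle + \|w\|^2, \qquad \|v\pm iw\|^2 = \|v\|^2 \mp 2\Im\langle v,w\rangle + \|w\|^2,$

where the second identity uses $\langle v,iw\rangle + \langle iw,v\rangle = i\langle v,w\rangle - i\overline{\langle v,w\rangle} = -2\Im\langle v,w\rangle.$ Consequently

$\Re\langle v,w\rangle = \frac{1}{4}\left(\|v+w\|^2-\|v-w\|^2\right), \qquad \Im\langle v,w\rangle = \frac{1}{4}\left(\|v-iw\|^2-\|v+iw\|^2\right),$

and the scalar product is recovered as $\langle v,w\rangle = \Re\langle v,w\rangle + i\,\Im\langle v,w\rangle.$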

In both Euclidean space and Hilbert space, two vectors $v,w$ are said to be orthogonal if $\langle v,w \rangle = 0,$ and in this case we have the following.

Theorem 4.6. (Pythagorean Theorem) For any orthogonal vectors $v,w$ we have $\|v+w\|^2 = \|v\|^2 + \|w\|^2.$

Proof: Expand $\|v+w\|^2 = \langle v+w,v+w\rangle$ and simplify using orthogonality.

-QED

Observe that the above argument also shows that we have $\|v-w\|^2=\|v\|^2+\|w\|^2$ for orthogonal vectors. If we drop orthogonality, the correct statement is the following, which is again the same in Euclidean and Hilbert space.

Theorem 4.7. (Parallelogram Law) For any two vectors $v,w$ in Euclidean space or Hilbert space, we have

$\|v+w\|^2 + \|v-w\|^2 = 2\|v\|^2 + 2\|w\|^2.$

Proof: We have

$\|v+w\|^2 = \langle v,v \rangle + \langle v,w\rangle + \langle w,v \rangle + \langle w,w\rangle$

and

$\|v-w\|^2 = \langle v,v \rangle - \langle v,w\rangle - \langle w,v \rangle + \langle w,w\rangle.$

Adding these two expressions gives the stated identity.

-QED
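As a sanity check, here is a small worked example. It assumes the standard scalar product on $\mathbb{C}^2,$ namely $\langle v,w\rangle = \overline{v_1}w_1+\overline{v_2}w_2,$ which is not introduced in the text above and is used here only for illustration. Take $v=(1,i)$ and $w=(1,1).$ Then

$\langle v,w\rangle = \overline{1}\cdot 1+\overline{i}\cdot 1 = 1-i, \qquad \|v\|^2=\|w\|^2=2,$

$\|v+w\|^2 = |2|^2+|1+i|^2 = 6, \qquad \|v-w\|^2 = |0|^2+|i-1|^2 = 2.$

The Parallelogram Law reads $6+2 = 2\cdot 2+2\cdot 2,$ the Cauchy-Schwarz inequality reads $|1-i| = \sqrt{2} \leq 2,$ and the real part of the scalar product is recovered from the norms as $\frac{1}{4}(6-2)=1=\Re(1-i).$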

Orthogonality is an abstraction of the notion of perpendicularity for spatial vectors. Now let us consider the notion of angles in general Euclidean spaces and Hilbert spaces. In both settings the definition is based on the Cauchy-Schwarz inequality, which implies

$-1 \leq \frac{\Re \langle v,w \rangle}{\|v\|\|w\|} \leq 1.$

Definition 4.8. The Euclidean angle between nonzero vectors $v,w$ is the unique $\theta \in [0,\pi]$ such that

$\cos \theta = \frac{\Re \langle v,w \rangle}{\|v\|\|w\|}.$

The definition of the Euclidean angle between two vectors is valid in both Euclidean space and Hilbert space. In Euclidean space, the scalar product is real so we just write

$\cos \theta = \frac{\langle v,w \rangle}{\|v\|\|w\|}, \quad \theta \in [0,\pi].$

This indeed corresponds to the intuitive notion of angle: if $v,w$ are orthogonal then $\theta=\frac{\pi}{2},$ and if $w=\alpha v$ then $\theta=0$ when $\alpha >0$ and $\theta = \pi$ when $\alpha <0.$ However, while Definition 4.8 is logically valid for vectors in Hilbert space, it produces counterintuitive results: for example, if $w=iv$ then $\|v\|=\|w\|$ and $\Re \langle v,w \rangle =0,$ so $\theta=\frac{\pi}{2}$ even though $w$ is a scalar multiple of $v.$ We therefore introduce a different notion of angle measure as follows.

Definition 4.9. The Hermitian angle between nonzero vectors $v,w$ is the unique $\theta \in [0,\frac{\pi}{2}]$ such that

$\cos \theta = \frac{|\langle v,w \rangle|}{\|v\|\|w\|}.$

The Hermitian angle concept is much better adapted to complex scalars than the Euclidean angle: in particular the Hermitian angle between $v$ and $iv$ is zero. On the other hand, we now see a different kind of counterintuitive behavior in that the Hermitian angle between $v$ and $-v$ is also zero. The explanation for this phenomenon is that $-v = e^{i\pi}v,$ i.e. in complex geometry a ray is a real plane and $v$ and $-v$ point in the same “direction.” In other words, there are no obtuse angles in complex geometry because the real concept of rotating a vector is a special case of complex scaling. This is a feature, not a bug.
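The two claims above can be checked directly from Definition 4.9: for any nonzero $v$ we have

$\langle v,iv\rangle = i\|v\|^2, \qquad \langle v,-v\rangle = -\|v\|^2,$

and both have modulus $\|v\|^2=\|v\|\|iv\|=\|v\|\|-v\|,$ so $\cos\theta=1$ and $\theta=0$ in each case. For a less degenerate illustration, assume the standard scalar product $\langle v,w\rangle = \overline{v_1}w_1+\overline{v_2}w_2$ on $\mathbb{C}^2$ (an example chosen here, not taken from the text above) and let $v=(1,i),$ $w=(1,1).$ Then $\langle v,w\rangle = 1-i$ and $\|v\|\|w\|=2,$ so

$\cos\theta_{\mathrm{Euclidean}} = \frac{\Re(1-i)}{2}=\frac{1}{2}, \qquad \cos\theta_{\mathrm{Hermitian}} = \frac{|1-i|}{2}=\frac{\sqrt{2}}{2},$

giving a Euclidean angle of $\frac{\pi}{3}$ and a Hermitian angle of $\frac{\pi}{4}.$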

Theorem 4.10. (Euclidean Law of Cosines) For any nonzero vectors $v,w$ we have

$\|v-w\|^2 = \|v\|^2 +\|w\|^2-2\|v\|\|w\|\cos \theta,$

where $\theta$ is the Euclidean angle between $v$ and $w.$

Proof: We have

$\langle v-w,v-w\rangle = \|v\|^2+\|w\|^2 - 2\Re \langle v,w\rangle = \|v\|^2 + \|w\|^2 -2\|v\|\|w\|\frac{\Re \langle v,w \rangle}{\|v\|\|w\|},$

and the result now follows from the definition of the Euclidean angle between $v$ and $w.$

-QED

Theorem 4.11. (Hermitian Law of Cosines) For any nonzero vectors $v,w$ in a Hilbert space $V,$ we have

$\min\limits_{\phi \in \mathbb{R}} \|v-e^{i\phi}w\|^2 = \|v\|^2+\|w\|^2-2\|v\|\|w\|\cos \theta,$

where $\theta$ is the Hermitian angle between $v$ and $w.$

The following corollary of Theorem 4.11 is useful in the applied context of phase retrieval problems.

Corollary 4.12. (Best Phase Law) For any unit vectors $v,w$ in a Hilbert space $V,$ we have

$\min\limits_{\phi \in \mathbb{R}} \|v-e^{i\phi}w\|=2\sin\left( \frac{\theta}{2}\right),$

where $\theta$ is the Hermitian angle between $v$ and $w.$
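A sketch of how the corollary follows, taking the Hermitian Law of Cosines as given: for unit vectors $\|v\|=\|w\|=1,$ so

$\min\limits_{\phi \in \mathbb{R}} \|v-e^{i\phi}w\|^2 = 2-2\cos\theta = 4\sin^2\left(\frac{\theta}{2}\right),$

by the half-angle identity $1-\cos\theta = 2\sin^2\left(\frac{\theta}{2}\right).$ Since $\theta \in [0,\frac{\pi}{2}]$ we have $\sin\left(\frac{\theta}{2}\right) \geq 0,$ and taking square roots gives the stated formula.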

Problem 4.2. Prove the Hermitian Law of Cosines for vectors in Hilbert space. Hint: modify the proof of the Euclidean case.

A lingering question is whether we could use some other class of normed vector spaces apart from Euclidean and Hilbert spaces to axiomatically develop real and complex geometry. The following nice result says that the answer is no.

Theorem 4.13. Let $V$ be either a real or complex normed vector space in which the parallelogram law holds. Then, there exists a scalar product on $V$ such that $\|v\|=\sqrt{\langle v,v\rangle}.$
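To see that the parallelogram law is a genuine restriction on a norm, consider the following example, which is not from the text above and is included only as an illustration: the norm $\|(x_1,x_2)\|_1 = |x_1|+|x_2|$ on $\mathbb{R}^2.$ For $v=(1,0)$ and $w=(0,1)$ we get

$\|v+w\|_1^2 + \|v-w\|_1^2 = 2^2+2^2 = 8, \qquad 2\|v\|_1^2+2\|w\|_1^2 = 2+2 = 4.$

The parallelogram law fails, so by Theorem 4.7 this norm cannot be induced by any scalar product.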

Let us consider how one would prove Theorem 4.13, say in the real case. The basic idea is to define a polarization-inspired function

$\langle \cdot,\cdot \rangle \colon V \times V \longrightarrow \mathbb{R}$

$\langle v,w \rangle = \frac{1}{4}(\|v+w\|^2 - \|v-w\|^2)$

and show that this is a scalar product which induces the given norm. Indeed, we have

$\langle v,v \rangle = \frac{1}{4}(\| v+v\|^2 - \|v-v\|^2) = \|v\|^2,$

and since $\|\cdot\|$ is a norm we immediately get that the first scalar product axiom holds. Symmetry is also straightforward,

$4\langle v,w \rangle = \|v+w\|^2-\|v-w\|^2=\|w+v\|^2-\|w-v\|^2=4\langle w,v \rangle.$

Problem 4.3. Complete the proof of Theorem 4.13 in the real case. Hint: one strategy is to first prove additivity,

$\langle v,w_1+w_2\rangle=\langle v,w_1\rangle+\langle v,w_2\rangle,$

using the Parallelogram Law as your main tool. From here you can show that $\langle v,nw\rangle = n \langle v,w\rangle$ for $n \in \mathbb{Z}.$ Then think about how to bootstrap this to scalars in $\mathbb{Q}$ and finally in $\mathbb{R}.$ If you want to be a hero, adapt your argument to the complex setting as an optional add-on.

Published by Jonathan Novak
