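Definition 4.1. A Hilbert space is a complex vector space $V$ equipped with a scalar product: a function $\langle \cdot, \cdot \rangle \colon V \times V \to \mathbb{C}$ satisfying:
$\langle v, v \rangle \geq 0$ with equality if and only if $v = 0$;
$\langle v, w \rangle = \overline{\langle w, v \rangle}$;
$\langle v, \alpha_1 w_1 + \alpha_2 w_2 \rangle = \alpha_1 \langle v, w_1 \rangle + \alpha_2 \langle v, w_2 \rangle$.
The term “scalar product” indicates that $\langle \cdot, \cdot \rangle$ is a rule for multiplying two vectors $v, w$ in $V$ such that the product $\langle v, w \rangle$ is a complex number rather than a vector in $V$.
Using these axioms you can check that the scalar product satisfies the following augmented FOIL identity from high school algebra, which is known as sesquilinearity:
$$\langle \alpha_1 v_1 + \alpha_2 v_2, \beta_1 w_1 + \beta_2 w_2 \rangle = \overline{\alpha_1} \beta_1 \langle v_1, w_1 \rangle + \overline{\alpha_1} \beta_2 \langle v_1, w_2 \rangle + \overline{\alpha_2} \beta_1 \langle v_2, w_1 \rangle + \overline{\alpha_2} \beta_2 \langle v_2, w_2 \rangle.$$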
Definition 4.1 is nonstandard in that the usual definition includes an extra analytic condition which we have omitted (this will be discussed further below). One may also see the pair $(V, \langle \cdot, \cdot \rangle)$ referred to as a Hermitian space.
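As a concrete illustration, the following Python sketch checks the scalar product axioms and the sesquilinearity identity numerically on $\mathbb{C}^2$. It assumes the standard scalar product $\langle v, w \rangle = \sum_k \overline{v_k} w_k$, taken here to be linear in the second argument. The helper names `inner`, `add`, and `scale` and the sample vectors are illustrative choices, not part of the text.

```python
# Sketch only: V = C^2 with the standard scalar product, assumed here to be
# conjugate-linear in the first argument and linear in the second.

def inner(v, w):
    """Scalar product <v, w> = sum of conj(v_k) * w_k."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def add(v, w):
    """Componentwise vector addition."""
    return [a + b for a, b in zip(v, w)]

def scale(alpha, v):
    """Multiply every component of v by the scalar alpha."""
    return [alpha * a for a in v]

# Hypothetical sample data.
v1, v2 = [1 + 2j, 3j], [2 - 1j, 1 + 1j]
w1, w2 = [1j, 4 + 0j], [3 + 1j, -2j]
a1, a2, b1, b2 = 2 + 1j, -1j, 1 - 3j, 0.5 + 0j

# Positivity: <v1, v1> = |1 + 2i|^2 + |3i|^2 = 5 + 9 = 14.
positivity_value = inner(v1, v1)

# Conjugate symmetry: <v1, w1> equals the conjugate of <w1, v1>.
symmetry_gap = abs(inner(v1, w1) - inner(w1, v1).conjugate())

# Sesquilinearity: expand the product of two linear combinations (FOIL).
lhs = inner(add(scale(a1, v1), scale(a2, v2)),
            add(scale(b1, w1), scale(b2, w2)))
rhs = (a1.conjugate() * b1 * inner(v1, w1)
       + a1.conjugate() * b2 * inner(v1, w2)
       + a2.conjugate() * b1 * inner(v2, w1)
       + a2.conjugate() * b2 * inner(v2, w2))
sesquilinearity_gap = abs(lhs - rhs)

print(positivity_value, symmetry_gap, sesquilinearity_gap)
```

Both gaps should vanish up to floating point rounding. A numerical check on a few vectors is of course not a proof of the identity.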
A vector space without a scalar product abstracts the familiar operations of addition and scaling of spatial vectors $\vec{v}$, which are standard in science and engineering. In particular you have already in your mind the image of vectors as directed line segments representing things like velocity or acceleration; these are added tail-to-tip and stretched or squished by scalar multiplication. The reason to generalize from spatial vectors $\vec{v}$ to abstract vectors $v$
is that many other types of objects can be manipulated in the same way as spatial vectors, and we can study all such systems simultaneously through an axiomatic development of general vector spaces. The motivation behind allowing for scaling by arbitrary complex numbers instead of just real ones is a bit more involved, but one may point to the use of complex vector spaces in quantum mechanics as one reason. The role of the scalar product is to provide an axiomatic foundation for abstracting familiar geometric aspects of spatial vectors, such as the length of a vector and the angle between two vectors, to the setting of an arbitrary vector space.
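Definition 4.2. A normed vector space is a complex vector space $V$ equipped with a function $\| \cdot \| \colon V \to \mathbb{R}_{\geq 0}$ such that
$\|v\| = 0$ if and only if $v = 0$;
$\|\alpha v\| = |\alpha| \, \|v\|$;
$\|v + w\| \leq \|v\| + \|w\|$.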
The norm abstracts the notion of vector length in a way which is compatible with our geometric intuition. The first axiom says that only the zero vector has zero length. The second says that scaling a vector and then measuring its length is the same thing as multiplying the original length measurement by the magnitude of the scaling. The third is the triangle inequality: it abstracts the fact that if we add two spatial vectors
$\vec{v}, \vec{w}$ tail-to-tip we get a triangle with three directed sides $\vec{v}$, $\vec{w}$, and $\vec{v} + \vec{w}$. Any normed vector space can be promoted to a metric space with distance defined by $d(v, w) = \|v - w\|$.
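We now claim that in a Hilbert space the scalar product gives us a norm defined by
$$\|v\| = \sqrt{\langle v, v \rangle}.$$
To verify this claim, we have to check that the three conditions stipulated by Definition 4.2 do in fact hold. The first two are easy to check. The third is a bit more problematic: we compute
$$\|v + w\|^2 = \langle v + w, v + w \rangle = \|v\|^2 + 2 \operatorname{Re} \langle v, w \rangle + \|w\|^2,$$
and we have to control the quantity $\operatorname{Re} \langle v, w \rangle$. Since the real part of any complex number is bounded by its modulus, where equality holds precisely for nonnegative real numbers, we have
$$\|v + w\|^2 \leq \|v\|^2 + 2 |\langle v, w \rangle| + \|w\|^2$$
with equality if and only if $\langle v, w \rangle$ is a nonnegative real number. The right hand side is at most $(\|v\| + \|w\|)^2$ provided that $|\langle v, w \rangle| \leq \|v\| \, \|w\|$, so the triangle inequality follows from the next theorem.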
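Theorem 4.3. (Cauchy-Schwarz inequality) For any vectors $v, w \in V$ we have
$$|\langle v, w \rangle| \leq \|v\| \, \|w\|$$
with equality if and only if $v = 0$ or $w = \alpha v$ for some $\alpha \in \mathbb{C}$.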
Problem 4.1. Prove the Cauchy-Schwarz inequality.
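The inequality and its equality case can be sanity-checked numerically before attempting the proof. The sketch below again assumes $V = \mathbb{C}^3$ with the standard scalar product (linear in the second argument); the vectors and the scalar `alpha` are arbitrary illustrative choices.

```python
import math

def inner(v, w):
    """Standard scalar product on C^n (assumed convention): sum conj(v_k) * w_k."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def norm(v):
    """Norm induced by the scalar product: sqrt(<v, v>)."""
    return math.sqrt(inner(v, v).real)

# Hypothetical sample vectors that are not scalar multiples of each other.
v = [1 + 2j, 3j, -1 + 0j]
w = [2 - 1j, 1 + 1j, 4j]

# Cauchy-Schwarz: ||v|| ||w|| - |<v, w>| should be nonnegative.
cs_gap = norm(v) * norm(w) - abs(inner(v, w))

# Triangle inequality: ||v|| + ||w|| - ||v + w|| should be nonnegative.
v_plus_w = [a + b for a, b in zip(v, w)]
triangle_gap = norm(v) + norm(w) - norm(v_plus_w)

# Equality case: u = alpha * v is a scalar multiple of v.
alpha = 2 - 3j
u = [alpha * a for a in v]
cs_gap_parallel = norm(v) * norm(u) - abs(inner(v, u))

print(cs_gap, triangle_gap, cs_gap_parallel)
```

The first two gaps are strictly positive for these vectors, while the third vanishes up to rounding, in line with the equality condition.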
At this point we have shown that the scalar product on a Hilbert space gives rise to a genuine norm defined by $\|v\| = \sqrt{\langle v, v \rangle}$ and hence to a genuine metric defined by $d(v, w) = \|v - w\|$. The reason Definition 4.1 is nonstandard is that at this stage one typically includes an extra clause: the metric space $(V, d)$
must be complete. We are not going to make metric completeness part of our definition of Hilbert space because we will for the most part not need to take limits of sequences of vectors in Hilbert space — that would be analysis, and our focus is algebra. Furthermore, we will soon define a notion of dimension for Hilbert spaces and then restrict our study to the finite-dimensional ones, where metric completeness is automatic.
Although we are working over the complex numbers, one may just as well consider real vector spaces equipped with a scalar product: these are called Euclidean spaces because these are indeed the most elementary and natural setting in which to carry out an axiomatic abstraction of Euclidean geometry. So far, everything we have said about Hilbert spaces holds verbatim for Euclidean spaces. However, differences between the two cases will now start to emerge. Let us begin with the fact that in both Euclidean space and Hilbert space the scalar product can be recovered from the norm, but the recipe is a bit different depending on whether one is working over the real or complex numbers.
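Theorem 4.4. (Euclidean Polarization) For any vectors $v, w$ in a Euclidean space $V$ we have
$$\langle v, w \rangle = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 \right).$$
Proof: We have
$$\|v + w\|^2 = \|v\|^2 + 2 \langle v, w \rangle + \|w\|^2$$
and
$$\|v - w\|^2 = \|v\|^2 - 2 \langle v, w \rangle + \|w\|^2.$$
Subtracting the second expression from the first, we obtain
$$\|v + w\|^2 - \|v - w\|^2 = 4 \langle v, w \rangle.$$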
-QED
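Theorem 4.5. (Hermitian Polarization) For any vectors $v, w$ in a Hilbert space $V$ we have
$$\langle v, w \rangle = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 - i \|v + i w\|^2 + i \|v - i w\|^2 \right).$$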
Proof: Same as above: expand the right hand side and simplify.
-QED
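Both polarization recipes can be tried out numerically. The sketch below assumes the standard scalar products on $\mathbb{R}^2$ and $\mathbb{C}^3$, with the complex one taken to be linear in the second argument. Under the opposite convention the signs of the two imaginary terms in the Hermitian formula swap. The function names are hypothetical.

```python
def inner(v, w):
    """Standard scalar product (assumed convention): sum conj(v_k) * w_k."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))

def norm_sq(v):
    """Squared norm ||v||^2 = <v, v>, which is always real."""
    return inner(v, v).real

def combine(v, c, w):
    """Return the vector v + c * w."""
    return [a + c * b for a, b in zip(v, w)]

def euclidean_polarization(v, w):
    """(||v + w||^2 - ||v - w||^2) / 4, valid for real vectors."""
    return (norm_sq(combine(v, 1, w)) - norm_sq(combine(v, -1, w))) / 4

def hermitian_polarization(v, w):
    """Four-term recipe for a product that is linear in its second argument."""
    return (norm_sq(combine(v, 1, w)) - norm_sq(combine(v, -1, w))
            - 1j * norm_sq(combine(v, 1j, w))
            + 1j * norm_sq(combine(v, -1j, w))) / 4

# Real example: <x, y> = 1*3 + 2*4 = 11.
x, y = [1.0, 2.0], [3.0, 4.0]
real_gap = abs(euclidean_polarization(x, y) - inner(x, y))

# Complex example (hypothetical sample vectors).
v = [1 + 2j, 3j, -1 + 0j]
w = [2 - 1j, 1 + 1j, 4j]
complex_gap = abs(hermitian_polarization(v, w) - inner(v, w))

print(real_gap, complex_gap)
```

In the complex example the two-term Euclidean recipe would only recover the real part of $\langle v, w \rangle$, which is why the extra terms involving $v \pm iw$ are needed.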
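In both Euclidean space and Hilbert space, two vectors $v, w$ are said to be orthogonal if $\langle v, w \rangle = 0$, and in this case we have the following.
Theorem 4.6. (Pythagorean Theorem) For any orthogonal vectors $v, w$ we have
$$\|v + w\|^2 = \|v\|^2 + \|w\|^2.$$
Proof: Expand $\|v + w\|^2 = \langle v + w, v + w \rangle$ and simplify using orthogonality.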
-QED
Observe that the above argument also shows that we have $\|v - w\|^2 = \|v\|^2 + \|w\|^2$ for orthogonal vectors. If we drop orthogonality, the correct statement is the following, which is again the same in Euclidean and Hilbert space.
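Theorem 4.7 (Parallelogram Law) For any two vectors $v, w$ in Euclidean space or Hilbert space, we have
$$\|v + w\|^2 + \|v - w\|^2 = 2 \|v\|^2 + 2 \|w\|^2.$$
Proof: We have
$$\|v + w\|^2 = \|v\|^2 + 2 \operatorname{Re} \langle v, w \rangle + \|w\|^2$$
and
$$\|v - w\|^2 = \|v\|^2 - 2 \operatorname{Re} \langle v, w \rangle + \|w\|^2.$$
Adding these two expressions gives the stated identity.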
-QED
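A quick numerical check of the Pythagorean Theorem and the Parallelogram Law, sketched in Python under the assumption that $V = \mathbb{C}^n$ carries the standard scalar product. The vectors are illustrative choices, with `p` and `q` picked to be orthogonal.

```python
def inner(v, w):
    """Standard scalar product on C^n (assumed convention): sum conj(v_k) * w_k."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def norm_sq(v):
    """Squared norm ||v||^2 = <v, v>."""
    return inner(v, v).real

def plus(v, w):
    return [a + b for a, b in zip(v, w)]

def minus(v, w):
    return [a - b for a, b in zip(v, w)]

# Orthogonal pair: <p, q> = 1*1 + conj(i)*(-i) = 1 + i^2 = 0.
p, q = [1 + 0j, 1j], [1 + 0j, -1j]
orthogonality = abs(inner(p, q))
pythagoras_gap = abs(norm_sq(plus(p, q)) - norm_sq(p) - norm_sq(q))

# General pair: the Pythagorean identity fails, the Parallelogram Law holds.
v = [1 + 2j, 3j, -1 + 0j]
w = [2 - 1j, 1 + 1j, 4j]
pythagoras_defect = norm_sq(plus(v, w)) - norm_sq(v) - norm_sq(w)
parallelogram_gap = abs(norm_sq(plus(v, w)) + norm_sq(minus(v, w))
                        - 2 * norm_sq(v) - 2 * norm_sq(w))

print(orthogonality, pythagoras_gap, pythagoras_defect, parallelogram_gap)
```

For the non-orthogonal pair the Pythagorean defect equals $2 \operatorname{Re} \langle v, w \rangle$, which is exactly the cross term that cancels when the two expansions in the proof are added.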
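Orthogonality is an abstraction of the notion of perpendicularity for spatial vectors. Now let us consider the notion of angles in general Euclidean spaces and Hilbert spaces. In both settings the definition is based on the Cauchy-Schwarz inequality, which implies
$$-1 \leq \frac{\operatorname{Re} \langle v, w \rangle}{\|v\| \, \|w\|} \leq 1$$
for any nonzero vectors $v, w$.
Definition 4.8. The Euclidean angle between nonzero vectors $v, w$ is the unique $\theta \in [0, \pi]$ such that
$$\cos \theta = \frac{\operatorname{Re} \langle v, w \rangle}{\|v\| \, \|w\|}.$$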
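The definition of the Euclidean angle between two vectors is valid in both Euclidean space and Hilbert space. In Euclidean space, the scalar product is real so we just write
$$\cos \theta = \frac{\langle v, w \rangle}{\|v\| \, \|w\|}.$$
This indeed corresponds to the intuitive notion of angle: if $v, w$ are orthogonal then $\theta = \pi/2$, and if $w = \alpha v$ then $\theta = 0$ when $\alpha > 0$ and $\theta = \pi$ when $\alpha < 0$.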
However, while Definition 4.8 is logically valid for vectors in Hilbert space, it produces counterintuitive results: for example, if $w = iv$ then $\langle v, w \rangle$ is purely imaginary and $\operatorname{Re} \langle v, w \rangle = 0$, so $\theta = \pi/2$ even though $w$ is a scalar multiple of $v$. We therefore introduce a different notion of angle measure as follows.
Definition 4.9. The Hermitian angle between nonzero vectors $v, w$ is the unique $\varphi \in [0, \pi/2]$ such that
$$\cos \varphi = \frac{|\langle v, w \rangle|}{\|v\| \, \|w\|}.$$
The Hermitian angle concept is much better adapted to complex scalars than the Euclidean angle: in particular the Hermitian angle between $v$ and $iv$ is zero. On the other hand, we now see a different kind of counterintuitive behavior in that the Hermitian angle between $v$
and $-v$ is also zero. The explanation for this phenomenon is that the set $\mathbb{C} v = \{ \alpha v \colon \alpha \in \mathbb{C} \}$ of complex multiples of $v$ contains both $v$ and $-v$, i.e. in complex geometry a ray is a real plane and $v$ and $-v$ point in the same “direction.” In other words, there are no obtuse angles in complex geometry because the real concept of rotating a vector is a special case of complex scaling. This is a feature, not a bug.
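The contrast between the two angle notions is easy to see numerically. The following Python sketch assumes $V = \mathbb{C}^2$ with the standard scalar product and computes both angles for a vector $v$ paired with $iv$ and with $-v$. The function names and the clamping helper are illustrative additions.

```python
import math

def inner(v, w):
    """Standard scalar product on C^n (assumed convention): sum conj(v_k) * w_k."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def norm_product(v, w):
    """||v|| * ||w||, computed as sqrt(<v, v> <w, w>)."""
    return math.sqrt(inner(v, v).real * inner(w, w).real)

def clamp(x):
    """Guard acos against rounding slightly outside [-1, 1]."""
    return max(-1.0, min(1.0, x))

def euclidean_angle(v, w):
    """Angle in [0, pi] with cosine Re<v, w> / (||v|| ||w||)."""
    return math.acos(clamp(inner(v, w).real / norm_product(v, w)))

def hermitian_angle(v, w):
    """Angle in [0, pi/2] with cosine |<v, w>| / (||v|| ||w||)."""
    return math.acos(clamp(abs(inner(v, w)) / norm_product(v, w)))

v = [1 + 2j, 3j]               # hypothetical sample vector
iv = [1j * a for a in v]       # v rotated by the complex scalar i
neg_v = [-a for a in v]        # v scaled by -1

print("Euclidean angle(v, iv):", euclidean_angle(v, iv))     # about pi/2
print("Hermitian angle(v, iv):", hermitian_angle(v, iv))     # about 0
print("Euclidean angle(v, -v):", euclidean_angle(v, neg_v))  # about pi
print("Hermitian angle(v, -v):", hermitian_angle(v, neg_v))  # about 0
```

The Euclidean angle separates $v$ from $iv$ and from $-v$, while the Hermitian angle treats every nonzero complex multiple of $v$ as pointing in the same direction.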
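Theorem 4.10. (Euclidean Law of Cosines) For any nonzero vectors $v, w$ we have
$$\|v - w\|^2 = \|v\|^2 + \|w\|^2 - 2 \|v\| \, \|w\| \cos \theta,$$
where $\theta$ is the Euclidean angle between $v$ and $w$.
Proof: We have
$$\|v - w\|^2 = \langle v - w, v - w \rangle = \|v\|^2 - 2 \operatorname{Re} \langle v, w \rangle + \|w\|^2,$$
and the result now follows from the definition of the Euclidean angle between $v$ and $w$.
-QED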
Theorem 4.11 (Hermitian Law of Cosines) For any vectors $v, w$ in a Hilbert space $V$
we have
where $\varphi$ is the Hermitian angle between $v$ and $w$.
The following corollary of Theorem 4.11 is useful in the applied context of phase retrieval problems.
Corollary 4.12 (Best Phase Law) For any unit vectors $v, w$ in a Hilbert space $V$
we have
where $\varphi$ is the Hermitian angle between $v$ and $w$.
Problem 4.2. Prove the Hermitian Law of Cosines for vectors in Hilbert space. Hint: modify the proof of the Euclidean case.
A lingering question is whether we could use some other class of normed vector spaces apart from Euclidean and Hilbert spaces to axiomatically develop real and complex geometry. The following nice result says that the answer is no.
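Theorem 4.13. Let $(V, \| \cdot \|)$ be either a real or complex normed vector space in which the parallelogram law holds. Then, there exists a scalar product $\langle \cdot, \cdot \rangle$ on $V$ such that $\|v\| = \sqrt{\langle v, v \rangle}$ for all $v \in V$.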
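Let us consider how one would prove Theorem 4.13, say in the real case. The basic idea is to define a polarization-inspired function $\langle \cdot, \cdot \rangle \colon V \times V \to \mathbb{R}$ by
$$\langle v, w \rangle = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 \right)$$
and show that this is a scalar product which induces the given norm. Indeed, we have
$$\langle v, v \rangle = \frac{1}{4} \|2v\|^2 = \|v\|^2,$$
and since $\| \cdot \|$ is a norm we immediately get that the first scalar product axiom holds. Symmetry is also straightforward: since $\|w - v\| = \|v - w\|$, we have $\langle w, v \rangle = \langle v, w \rangle$.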
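Problem 4.3. Complete the proof of Theorem 4.13 in the real case. Hint: one strategy is to first prove additivity,
$$\langle v, w_1 + w_2 \rangle = \langle v, w_1 \rangle + \langle v, w_2 \rangle,$$
using the Parallelogram Law as your main tool. From here you can show that $\langle v, n w \rangle = n \langle v, w \rangle$ for $n \in \mathbb{N}$. Then think about how to bootstrap this to scalars in $\mathbb{Q}$ and finally in $\mathbb{R}$.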
If you want to be a hero, adapt your argument to the complex setting as an optional add-on.
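To see why the parallelogram law hypothesis in Theorem 4.13 cannot be dropped, it can help to experiment numerically. The sketch below is an illustration that is not taken from the text: it compares the usual Euclidean norm on $\mathbb{R}^2$ with the sum-of-absolute-values norm, which is a norm but fails the parallelogram law. The function names are hypothetical.

```python
import math

def norm_l2(v):
    """Euclidean norm: sqrt(sum of squares)."""
    return math.sqrt(sum(x * x for x in v))

def norm_l1(v):
    """Sum-of-absolute-values norm (illustrative non-Euclidean example)."""
    return sum(abs(x) for x in v)

def parallelogram_defect(norm, v, w):
    """||v + w||^2 + ||v - w||^2 - 2||v||^2 - 2||w||^2; zero iff the law holds."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return (norm(plus) ** 2 + norm(minus) ** 2
            - 2 * norm(v) ** 2 - 2 * norm(w) ** 2)

def polarized(norm, v, w):
    """Candidate scalar product (||v + w||^2 - ||v - w||^2) / 4."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return (norm(plus) ** 2 - norm(minus) ** 2) / 4

e1, e2 = [1.0, 0.0], [0.0, 1.0]
defect_l2 = parallelogram_defect(norm_l2, e1, e2)  # vanishes up to rounding
defect_l1 = parallelogram_defect(norm_l1, e1, e2)  # 4 + 4 - 2 - 2 = 4

# For the Euclidean norm, polarization recovers the dot product 1*3 + 2*4 = 11.
dot_recovered = polarized(norm_l2, [1.0, 2.0], [3.0, 4.0])

print(defect_l2, defect_l1, dot_recovered)
```

Since the second norm violates the parallelogram law, Theorem 4.7 shows it cannot be induced by any scalar product, so the polarization-inspired function built from it cannot satisfy all the scalar product axioms.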