In Lecture 5, we considered the question of how a vector $\mathbf{v}$ in a Euclidean space $\mathbf{V}$ can be represented as a linear combination of the vectors in an orthonormal basis $\mathbf{e}_1,\dots,\mathbf{e}_n$ of $\mathbf{V}$. We worked out the answer to this question: the coordinates of $\mathbf{v}$ are given by taking the scalar product with each vector in the orthonormal basis:

$$\mathbf{v}=\langle\mathbf{v},\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{v},\mathbf{e}_n\rangle\mathbf{e}_n.$$
Equivalently, using our algebraic definition of the angle between two vectors in a Euclidean space, this can be written as

$$\mathbf{v}=\|\mathbf{v}\|\cos\theta_1\,\mathbf{e}_1+\dots+\|\mathbf{v}\|\cos\theta_n\,\mathbf{e}_n,$$

where $\theta_i$ is the angle between $\mathbf{v}$ and $\mathbf{e}_i$. This led us to think of the vector $\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ as the “projection” of $\mathbf{v}$ onto the one-dimensional subspace $\operatorname{span}(\mathbf{e}_i)$ of $\mathbf{V}$. In what sense is the vector $\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ the “projection” of the vector $\mathbf{v}$ onto the “line” $\operatorname{span}(\mathbf{e}_i)$? Our geometric intuition concerning projections suggests that this construction should have two properties: first, the vector $\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ should be the element of $\operatorname{span}(\mathbf{e}_i)$ which is closest to $\mathbf{v}$, and second, the vector $\mathbf{v}-\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ should be orthogonal to $\operatorname{span}(\mathbf{e}_i)$. (This would be a good time to draw yourself a diagram, or to consult the diagram in Lecture 5.) We want to prove that these two features, which characterize the geometric notion of projection, actually hold in the setting of an arbitrary Euclidean space. Let us consider this in the following slightly more general setup, where the line $\operatorname{span}(\mathbf{e}_i)$ is replaced by an arbitrary finite-dimensional subspace. Here is a motivating and suggestive picture.
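Before developing the general theory, the expansion formula above can be sanity-checked numerically. The following Python/NumPy sketch (not part of the lecture; the particular orthonormal basis of $\mathbb{R}^3$ and the vector $\mathbf{v}$ are illustrative choices) verifies that the scalar products $\langle\mathbf{v},\mathbf{e}_i\rangle$ really do recover $\mathbf{v}$:

```python
import numpy as np

# An assumed orthonormal basis of R^3: the standard basis rotated 45 degrees
# about the z-axis. Any orthonormal basis would do.
s = 1.0 / np.sqrt(2.0)
e1 = np.array([s, s, 0.0])
e2 = np.array([-s, s, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
basis = [e1, e2, e3]

v = np.array([2.0, -1.0, 3.0])

# The coordinates of v are the scalar products <v, e_i> ...
coords = [np.dot(v, e) for e in basis]

# ... and v is recovered as the linear combination sum_i <v, e_i> e_i.
reconstruction = sum(c * e for c, e in zip(coords, basis))

print(np.allclose(reconstruction, v))  # True
```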

We first develop some general features of subspaces of Euclidean spaces, which amount to the statement that they always come in complementary pairs. More precisely, let $\mathbf{W}$ be a subspace of a Euclidean space $\mathbf{V}$, and let us consider the subset $\mathbf{W}^\perp$ of $\mathbf{V}$ consisting of all those vectors in $\mathbf{V}$ which are perpendicular to every vector in the subspace $\mathbf{W}$:

$$\mathbf{W}^\perp=\{\mathbf{v}\in\mathbf{V}:\langle\mathbf{v},\mathbf{w}\rangle=0\ \text{for all}\ \mathbf{w}\in\mathbf{W}\}.$$
**Proposition 1:** $\mathbf{W}^\perp$ is a subspace of $\mathbf{V}$.

*Proof:* Since the zero vector is orthogonal to everything, we have $\mathbf{0}\in\mathbf{W}^\perp$. It remains to demonstrate that $\mathbf{W}^\perp$ is closed under taking linear combinations. For any $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{W}^\perp$, any scalars $a_1,a_2$, and any $\mathbf{w}\in\mathbf{W}$, we have

$$\langle a_1\mathbf{v}_1+a_2\mathbf{v}_2,\mathbf{w}\rangle=a_1\langle\mathbf{v}_1,\mathbf{w}\rangle+a_2\langle\mathbf{v}_2,\mathbf{w}\rangle=a_1\cdot 0+a_2\cdot 0=0,$$

so that $a_1\mathbf{v}_1+a_2\mathbf{v}_2\in\mathbf{W}^\perp$.
— Q.E.D.

**Proposition 2:** We have $\mathbf{W}\cap\mathbf{W}^\perp=\{\mathbf{0}\}$.

*Proof:* Since both $\mathbf{W}$ and $\mathbf{W}^\perp$ contain the zero vector (because they are subspaces), their intersection also contains the zero vector. Now let $\mathbf{v}\in\mathbf{W}\cap\mathbf{W}^\perp$. Then $\mathbf{v}$ is orthogonal to itself, i.e. $\langle\mathbf{v},\mathbf{v}\rangle=0$. By the scalar product axioms, the only vector with this property is $\mathbf{v}=\mathbf{0}$.

— Q.E.D.
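Numerically, $\mathbf{W}^\perp$ can be computed as the null space of a matrix whose rows span $\mathbf{W}$. The following sketch (an aside, not from the lecture; the subspace of $\mathbb{R}^4$ chosen here is purely illustrative) uses NumPy's singular value decomposition for this:

```python
import numpy as np

# W = span of the two rows of A, a 2-dimensional subspace of R^4.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])

# Right-singular vectors corresponding to zero singular values span the
# null space of A, i.e. the set of vectors orthogonal to every row of A.
_, svals, Vt = np.linalg.svd(A)
rank = int(np.sum(svals > 1e-12))
complement_basis = Vt[rank:]          # orthonormal basis of W-perp

# Every vector in the computed W-perp is orthogonal to every spanning
# vector of W, as in the definition above.
print(np.allclose(A @ complement_basis.T, 0.0))  # True
```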

Propositions 1 and 2 make no assumption on the dimension of the Euclidean space $\mathbf{V}$: it could be finite-dimensional, or it could be infinite-dimensional. The same is true of the subspace $\mathbf{W}$. At this point, we restrict to the case that $\mathbf{V}$ is an $n$-dimensional vector space, and keep this restriction in place for the rest of the lecture.

Let $\mathbf{W}$ be an $m$-dimensional subspace of the $n$-dimensional Euclidean space $\mathbf{V}$. If $m=n$, then $\mathbf{W}=\mathbf{V}$, as proved on Assignment 1. Suppose $m<n$, and let $\mathbf{f}_1,\dots,\mathbf{f}_m$ be an orthonormal basis of $\mathbf{W}$. Since $\mathbf{W}\neq\mathbf{V}$, there is a vector $\mathbf{v}_1\in\mathbf{V}$ which is not in $\mathbf{W}$. In particular, the vector

$$\mathbf{f}_{m+1}=\mathbf{v}_1-\langle\mathbf{v}_1,\mathbf{f}_1\rangle\mathbf{f}_1-\dots-\langle\mathbf{v}_1,\mathbf{f}_m\rangle\mathbf{f}_m$$
is not the zero vector. This vector is orthogonal to each of the vectors $\mathbf{f}_1,\dots,\mathbf{f}_m$, and hence two things are true: first, $\mathbf{f}_{m+1}\in\mathbf{W}^\perp$, and second, $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$ is an orthogonal set of nonzero vectors. Thus, if $m+1=n$, the set $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$ is an orthogonal basis of $\mathbf{V}$. If $m+1<n$, then there is a vector $\mathbf{v}_2\in\mathbf{V}$ which is not in the span of $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$. We set

$$\mathbf{f}_{m+2}=\mathbf{v}_2-\langle\mathbf{v}_2,\mathbf{f}_1\rangle\mathbf{f}_1-\dots-\langle\mathbf{v}_2,\mathbf{f}_m\rangle\mathbf{f}_m-\frac{\langle\mathbf{v}_2,\mathbf{f}_{m+1}\rangle}{\langle\mathbf{f}_{m+1},\mathbf{f}_{m+1}\rangle}\mathbf{f}_{m+1}$$
to obtain a nonzero vector orthogonal to all vectors in the set $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$. In particular, $\mathbf{f}_{m+2}\in\mathbf{W}^\perp$. If $m+2=n$, then $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+2}\}$ is an orthogonal basis of $\mathbf{V}$. If $m+2<n$, we repeat the same process. After $n-m$ iterations of this process, we have generated an orthogonal basis

$$\{\mathbf{f}_1,\dots,\mathbf{f}_m,\mathbf{f}_{m+1},\dots,\mathbf{f}_n\}$$

of $\mathbf{V}$ such that $\{\mathbf{f}_1,\dots,\mathbf{f}_m\}$ is an orthonormal basis of $\mathbf{W}$, and $\{\mathbf{f}_{m+1},\dots,\mathbf{f}_n\}$ is an orthogonal basis of $\mathbf{W}^\perp$, which can be normalized to get an orthonormal basis of $\mathbf{W}^\perp$.
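The iterative construction above is the Gram–Schmidt process applied to vectors outside $\mathbf{W}$. A sketch of it in Python/NumPy (not from the lecture; drawing candidate vectors from the standard basis is an illustrative choice that always works, since the standard basis spans the space):

```python
import numpy as np

def extend_to_orthogonal_basis(W_basis, dim, tol=1e-10):
    """Extend an orthonormal basis of W to an orthogonal basis of R^dim,
    following the construction in the text: subtract from a candidate
    vector its projections onto the vectors found so far, and keep the
    residual whenever it is nonzero."""
    basis = [np.asarray(f, dtype=float) for f in W_basis]
    for k in range(dim):                 # candidates: standard basis of R^dim
        v = np.zeros(dim)
        v[k] = 1.0
        residual = v.copy()
        for f in basis:
            # divide by <f, f> since later vectors are orthogonal, not unit
            residual -= (np.dot(v, f) / np.dot(f, f)) * f
        if np.linalg.norm(residual) > tol:   # v was not in the current span
            basis.append(residual)
        if len(basis) == dim:
            break
    return basis

# W = span{(1,0,0), (0,1,0)} in R^3, with orthonormal basis given.
basis = extend_to_orthogonal_basis([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], 3)
print(len(basis))  # 3
```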

We now come to orthogonal projections in general. Let $\mathbf{W}$ be a subspace of $\mathbf{V}$, and let $\mathbf{W}^\perp$ be its orthogonal complement. Invoking the above construction, let $\{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ be an orthonormal basis of $\mathbf{V}$ such that $\{\mathbf{e}_1,\dots,\mathbf{e}_m\}$ is an orthonormal basis of $\mathbf{W}$ and $\{\mathbf{e}_{m+1},\dots,\mathbf{e}_n\}$ is an orthonormal basis of $\mathbf{W}^\perp$. The function $P_\mathbf{W}\colon\mathbf{V}\to\mathbf{V}$ defined by

$$P_\mathbf{W}\mathbf{v}=\langle\mathbf{v},\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{v},\mathbf{e}_m\rangle\mathbf{e}_m$$

is called the **orthogonal projector** of $\mathbf{V}$ on $\mathbf{W}$. For any vector $\mathbf{v}\in\mathbf{V}$, the vector $P_\mathbf{W}\mathbf{v}$ is called the **orthogonal projection** of $\mathbf{v}$ onto $\mathbf{W}$. Observe that $P_\mathbf{W}\mathbf{v}=\mathbf{0}$ if $\mathbf{v}\in\mathbf{W}^\perp$.
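The defining formula can be implemented directly. A sketch (not part of the lecture; the subspace and vector chosen below are illustrative):

```python
import numpy as np

def project(v, W_basis):
    """Orthogonal projection of v onto W = span(W_basis), assuming
    W_basis is orthonormal: P_W v = sum_i <v, e_i> e_i."""
    v = np.asarray(v, dtype=float)
    return sum(np.dot(v, e) * np.asarray(e) for e in W_basis)

# W = the xy-plane in R^3, with its standard orthonormal basis.
W_basis = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
v = np.array([3.0, 4.0, 5.0])

p = project(v, W_basis)
print(p)  # [3. 4. 0.]
```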

**Proposition 1:** The function $P_\mathbf{W}\colon\mathbf{V}\to\mathbf{V}$ is a linear transformation.

*Proof:* First, let us check that $P_\mathbf{W}$ sends the zero vector of $\mathbf{V}$ to the zero vector of $\mathbf{W}$. Note that, since $\mathbf{W}$ is a subspace of $\mathbf{V}$, they have the same zero vector; we denote it simply $\mathbf{0}$, instead of using two different symbols $\mathbf{0}_\mathbf{V}$ and $\mathbf{0}_\mathbf{W}$ for this same vector. We have

$$P_\mathbf{W}\mathbf{0}=\langle\mathbf{0},\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{0},\mathbf{e}_m\rangle\mathbf{e}_m=\mathbf{0}.$$
Now we check that $P_\mathbf{W}$ respects linear combinations. Let $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$ be two vectors, and let $a_1,a_2$ be two scalars. We then have

$$P_\mathbf{W}(a_1\mathbf{v}_1+a_2\mathbf{v}_2)=\sum_{i=1}^m\langle a_1\mathbf{v}_1+a_2\mathbf{v}_2,\mathbf{e}_i\rangle\mathbf{e}_i=a_1\sum_{i=1}^m\langle\mathbf{v}_1,\mathbf{e}_i\rangle\mathbf{e}_i+a_2\sum_{i=1}^m\langle\mathbf{v}_2,\mathbf{e}_i\rangle\mathbf{e}_i=a_1P_\mathbf{W}\mathbf{v}_1+a_2P_\mathbf{W}\mathbf{v}_2.$$
— Q.E.D.

**Proposition 2:** The linear transformation $P_\mathbf{W}$ satisfies $P_\mathbf{W}\circ P_\mathbf{W}=P_\mathbf{W}$.

*Proof:* The claim is that $P_\mathbf{W}(P_\mathbf{W}\mathbf{v})=P_\mathbf{W}\mathbf{v}$ for all $\mathbf{v}\in\mathbf{V}$. Let us check this. First, observe that for any vector $\mathbf{e}_j$ in the orthonormal basis $\{\mathbf{e}_1,\dots,\mathbf{e}_m\}$ of $\mathbf{W}$, we have

$$P_\mathbf{W}\mathbf{e}_j=\langle\mathbf{e}_j,\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{e}_j,\mathbf{e}_m\rangle\mathbf{e}_m=\mathbf{e}_j,$$

since $\langle\mathbf{e}_j,\mathbf{e}_i\rangle=0$ for $i\neq j$ and $\langle\mathbf{e}_j,\mathbf{e}_j\rangle=1$. Note also that since $\{\mathbf{e}_1,\dots,\mathbf{e}_m\}$ is a basis of $\mathbf{W}$, the above calculation together with Proposition 1 tells us that $P_\mathbf{W}\mathbf{w}=\mathbf{w}$ for all $\mathbf{w}\in\mathbf{W}$, which is to be expected: the projection of a vector already in $\mathbf{W}$ onto $\mathbf{W}$ should just be $\mathbf{w}$. Now to finish the proof, we apply this calculation:

$$P_\mathbf{W}(P_\mathbf{W}\mathbf{v})=P_\mathbf{W}\mathbf{v},$$

since $P_\mathbf{W}\mathbf{v}$ is itself a vector in $\mathbf{W}$.
— Q.E.D.
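Proposition 2 can be observed numerically. The sketch below (an aside, not from the lecture) uses the observation that, when the rows of a matrix $E$ form an orthonormal basis of $\mathbf{W}$, the matrix $E^{\mathsf T}E$ implements $P_\mathbf{W}$:

```python
import numpy as np

# Orthonormal basis of a 2-dimensional subspace W of R^3 (illustrative
# choice), stored as the rows of E.
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# As a matrix, P_W = E^T E sends v to sum_i <v, e_i> e_i.
P = E.T @ E

v = np.array([3.0, -2.0, 7.0])

# Applying the projector twice gives the same result as applying it once.
print(np.allclose(P @ (P @ v), P @ v))  # True
```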

**Proposition 3:** The linear transformation $P_\mathbf{W}$ has the property that $\langle P_\mathbf{W}\mathbf{v}_1,\mathbf{v}_2\rangle=\langle\mathbf{v}_1,P_\mathbf{W}\mathbf{v}_2\rangle$ for any $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$.

*Proof:* For any two vectors $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$, we have

$$\langle P_\mathbf{W}\mathbf{v}_1,\mathbf{v}_2\rangle=\Big\langle\sum_{i=1}^m\langle\mathbf{v}_1,\mathbf{e}_i\rangle\mathbf{e}_i,\mathbf{v}_2\Big\rangle=\sum_{i=1}^m\langle\mathbf{v}_1,\mathbf{e}_i\rangle\langle\mathbf{e}_i,\mathbf{v}_2\rangle=\Big\langle\mathbf{v}_1,\sum_{i=1}^m\langle\mathbf{v}_2,\mathbf{e}_i\rangle\mathbf{e}_i\Big\rangle=\langle\mathbf{v}_1,P_\mathbf{W}\mathbf{v}_2\rangle.$$
— Q.E.D.
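Self-adjointness can also be checked on concrete vectors; a small sketch (illustrative, not from the lecture):

```python
import numpy as np

# Projector onto W = span{e1, e2} in R^3 (illustrative choice), built
# from the matrix E whose rows are an orthonormal basis of W.
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = E.T @ E

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([-4.0, 0.0, 5.0])

# <P v1, v2> = <v1, P v2>: the projector is self-adjoint.
lhs = np.dot(P @ v1, v2)
rhs = np.dot(v1, P @ v2)
print(np.isclose(lhs, rhs))  # True
```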

**Proposition 4:** For any $\mathbf{w}\in\mathbf{W}$ and $\mathbf{v}\in\mathbf{V}$, we have $\langle\mathbf{v}-P_\mathbf{W}\mathbf{v},\mathbf{w}\rangle=0$.

*Proof:* Before reading the proof, draw yourself a diagram to make sure you can visualize what this proposition is saying. The proof itself follows easily from Proposition 3: we have

$$\langle\mathbf{v}-P_\mathbf{W}\mathbf{v},\mathbf{w}\rangle=\langle\mathbf{v},\mathbf{w}\rangle-\langle P_\mathbf{W}\mathbf{v},\mathbf{w}\rangle=\langle\mathbf{v},\mathbf{w}\rangle-\langle\mathbf{v},P_\mathbf{W}\mathbf{w}\rangle=\langle\mathbf{v},\mathbf{w}\rangle-\langle\mathbf{v},\mathbf{w}\rangle=0,$$

where we used $P_\mathbf{W}\mathbf{w}=\mathbf{w}$, which holds because $\mathbf{w}\in\mathbf{W}$ (see the proof of Proposition 2).
— Q.E.D.
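Proposition 4 says the residual $\mathbf{v}-P_\mathbf{W}\mathbf{v}$ is orthogonal to all of $\mathbf{W}$; the following sketch checks this against the basis vectors of an illustrative subspace (again an aside, not from the lecture):

```python
import numpy as np

# Projector onto the xy-plane W in R^3 (illustrative choice).
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = E.T @ E

v = np.array([3.0, 4.0, 5.0])
residual = v - P @ v

# v - P_W v is orthogonal to every basis vector of W, hence to all of W.
print(np.allclose(E @ residual, 0.0))  # True
```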

**Proposition 5:** For any $\mathbf{v}\in\mathbf{V}$ and any $\mathbf{w}\in\mathbf{W}$ with $\mathbf{w}\neq P_\mathbf{W}\mathbf{v}$, we have $\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|<\|\mathbf{v}-\mathbf{w}\|$.

*Proof:* Let us write

$$\mathbf{v}-\mathbf{w}=(\mathbf{v}-P_\mathbf{W}\mathbf{v})+(P_\mathbf{W}\mathbf{v}-\mathbf{w}).$$

Now observe that the vector $P_\mathbf{W}\mathbf{v}-\mathbf{w}$ lies in $\mathbf{W}$, since it is the difference of two vectors in this subspace. Consequently, $\mathbf{v}-P_\mathbf{W}\mathbf{v}$ and $P_\mathbf{W}\mathbf{v}-\mathbf{w}$ are orthogonal vectors, by Proposition 4. We may thus apply the Pythagorean theorem (Assignment 2) to obtain

$$\|\mathbf{v}-\mathbf{w}\|^2=\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|^2+\|P_\mathbf{W}\mathbf{v}-\mathbf{w}\|^2>\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|^2,$$

where

$$\|P_\mathbf{W}\mathbf{v}-\mathbf{w}\|^2>0\ \text{because}\ \mathbf{w}\neq P_\mathbf{W}\mathbf{v}.$$
— Q.E.D.
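The best-approximation property can be illustrated numerically by comparing $\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|$ against the distance from $\mathbf{v}$ to a random sample of other vectors in $\mathbf{W}$ (a sketch, not from the lecture; the subspace and sample size are illustrative):

```python
import numpy as np

# Projector onto the xy-plane W in R^3 (illustrative choice).
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = E.T @ E

v = np.array([3.0, 4.0, 5.0])
p = P @ v

# Compare |v - P_W v| with |v - w| for a sample of other vectors w in W.
rng = np.random.default_rng(0)
best = np.linalg.norm(v - p)
others = [np.linalg.norm(v - E.T @ rng.normal(size=2)) for _ in range(100)]

print(best <= min(others))  # True
```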

Proposition 5 says that $P_\mathbf{W}\mathbf{v}$ is the vector in $\mathbf{W}$ which is closest to $\mathbf{v}$, which matches our geometric intuition concerning projections. Equivalently, we can say that $P_\mathbf{W}\mathbf{v}$ is the vector in $\mathbf{W}$ which best approximates $\mathbf{v}$, and this perspective makes orthogonal projections very important in applications of linear algebra to statistics, data science, physics, engineering, and more. However, Proposition 5 also has purely mathematical importance. Namely, we have constructed the linear transformation $P_\mathbf{W}$ using an arbitrarily chosen orthonormal basis $\mathbf{e}_1,\dots,\mathbf{e}_m$ in $\mathbf{W}$. If we had used a different orthonormal basis $\mathbf{e}_1',\dots,\mathbf{e}_m'$, the same formula gives us a possibly different linear transformation

$$P'_\mathbf{W}\colon\mathbf{V}\to\mathbf{V}$$

defined by

$$P'_\mathbf{W}\mathbf{v}=\langle\mathbf{v},\mathbf{e}_1'\rangle\mathbf{e}_1'+\dots+\langle\mathbf{v},\mathbf{e}_m'\rangle\mathbf{e}_m'.$$
Propositions 1-5 above all apply to $P'_\mathbf{W}$ as well, and in fact this forces $P'_\mathbf{W}=P_\mathbf{W}$, so that it really is correct to speak of *the* orthogonal projection of $\mathbf{v}$ onto $\mathbf{W}$. To see why these two transformations must be the same, let us suppose they are not. This means that there is a vector $\mathbf{v}\in\mathbf{V}$ such that $P_\mathbf{W}\mathbf{v}\neq P'_\mathbf{W}\mathbf{v}$. Thus, since $P'_\mathbf{W}\mathbf{v}\in\mathbf{W}$, by Proposition 5 we have

$$\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|<\|\mathbf{v}-P'_\mathbf{W}\mathbf{v}\|,$$

while also by Proposition 5, applied to $P'_\mathbf{W}$ with $P_\mathbf{W}\mathbf{v}\in\mathbf{W}$, we have

$$\|\mathbf{v}-P'_\mathbf{W}\mathbf{v}\|<\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|,$$

a contradiction. So, in the construction of the transformation $P_\mathbf{W}$, it does not matter which orthonormal basis of $\mathbf{W}$ we use.
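The basis-independence just proved can also be seen numerically: two different orthonormal bases of the same plane yield the same projector matrix. A final sketch (illustrative, not from the lecture):

```python
import numpy as np

# Two different orthonormal bases of the same plane W in R^3: the
# standard one, and one rotated by 30 degrees within the plane.
E1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
E2 = np.array([[c, s, 0.0],
               [-s, c, 0.0]])

# The resulting projectors coincide, as the uniqueness argument predicts.
P1 = E1.T @ E1
P2 = E2.T @ E2
print(np.allclose(P1, P2))  # True
```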