In Lecture 5, we considered the question of how a vector $\mathbf{v}$ in a Euclidean space $\mathbf{V}$ can be represented as a linear combination of the vectors in an orthonormal basis $\mathbf{e}_1,\dots,\mathbf{e}_n$ of $\mathbf{V}$. We worked out the answer to this question: the coordinates of $\mathbf{v}$ are given by taking the scalar product with each vector in the orthonormal basis:

$$\mathbf{v}=\langle\mathbf{v},\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{v},\mathbf{e}_n\rangle\mathbf{e}_n.$$
Equivalently, using our algebraic definition of the angle between two vectors in a Euclidean space, this can be written as

$$\mathbf{v}=\|\mathbf{v}\|\cos\theta_1\,\mathbf{e}_1+\dots+\|\mathbf{v}\|\cos\theta_n\,\mathbf{e}_n,$$

where $\theta_i$ is the angle between $\mathbf{v}$ and $\mathbf{e}_i$. This led us to think of the vector $\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ as the “projection” of $\mathbf{v}$ onto the one-dimensional subspace $\operatorname{span}(\mathbf{e}_i)$ of $\mathbf{V}$. In what sense is the vector $\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ the “projection” of the vector $\mathbf{v}$ onto the “line” $\operatorname{span}(\mathbf{e}_i)$? Our geometric intuition concerning projections suggests that this construction should have two properties: first, the vector $\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ should be the element of $\operatorname{span}(\mathbf{e}_i)$ which is closest to $\mathbf{v}$, and second, the vector $\mathbf{v}-\langle\mathbf{v},\mathbf{e}_i\rangle\mathbf{e}_i$ should be orthogonal to $\operatorname{span}(\mathbf{e}_i)$. (This would be a good time to draw yourself a diagram, or to consult the diagram in Lecture 5.) We want to prove that these two features, which characterize the geometric notion of projection, actually hold in the setting of an arbitrary Euclidean space. Let us consider this in the following slightly more general setup, where the line $\operatorname{span}(\mathbf{e}_i)$ is replaced by an arbitrary finite-dimensional subspace. Here is a motivating and suggestive picture.
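Before developing the general theory, the expansion formula above can be sanity-checked numerically. The following Python/NumPy sketch (not part of the lecture; the particular orthonormal basis of $\mathbb{R}^3$ and the vector $\mathbf{v}$ are illustrative choices) verifies that the scalar products $\langle\mathbf{v},\mathbf{e}_i\rangle$ really do recover $\mathbf{v}$:

```python
import numpy as np

# An assumed orthonormal basis of R^3: the standard basis rotated 45 degrees
# about the z-axis. Any orthonormal basis would do.
s = 1.0 / np.sqrt(2.0)
e1 = np.array([s, s, 0.0])
e2 = np.array([-s, s, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
basis = [e1, e2, e3]

v = np.array([2.0, -1.0, 3.0])

# The coordinates of v are the scalar products <v, e_i> ...
coords = [np.dot(v, e) for e in basis]

# ... and v is recovered as the linear combination sum_i <v, e_i> e_i.
reconstruction = sum(c * e for c, e in zip(coords, basis))

print(np.allclose(reconstruction, v))  # True
```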

We first develop some general features of subspaces of Euclidean spaces, which amount to the statement that they always come in complementary pairs. More precisely, let $\mathbf{W}$ be a subspace of a Euclidean space $\mathbf{V}$, and let us consider the subset $\mathbf{W}^\perp$ of $\mathbf{V}$ consisting of all those vectors in $\mathbf{V}$ which are perpendicular to every vector in the subspace $\mathbf{W}$:

$$\mathbf{W}^\perp=\{\mathbf{v}\in\mathbf{V}:\langle\mathbf{v},\mathbf{w}\rangle=0\ \text{for all}\ \mathbf{w}\in\mathbf{W}\}.$$
**Proposition 1:** $\mathbf{W}^\perp$ is a subspace of $\mathbf{V}$.

*Proof:* Since the zero vector is orthogonal to everything, we have $\mathbf{0}\in\mathbf{W}^\perp$. It remains to demonstrate that $\mathbf{W}^\perp$ is closed under taking linear combinations. For any $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{W}^\perp$, any scalars $a_1,a_2$, and any $\mathbf{w}\in\mathbf{W}$, we have

$$\langle a_1\mathbf{v}_1+a_2\mathbf{v}_2,\mathbf{w}\rangle=a_1\langle\mathbf{v}_1,\mathbf{w}\rangle+a_2\langle\mathbf{v}_2,\mathbf{w}\rangle=a_1\cdot 0+a_2\cdot 0=0,$$

so that $a_1\mathbf{v}_1+a_2\mathbf{v}_2\in\mathbf{W}^\perp$.
— Q.E.D.

**Proposition 2:** We have $\mathbf{W}\cap\mathbf{W}^\perp=\{\mathbf{0}\}$.

*Proof:* Since both $\mathbf{W}$ and $\mathbf{W}^\perp$ contain the zero vector (because they are subspaces), their intersection also contains the zero vector. Now let $\mathbf{v}\in\mathbf{W}\cap\mathbf{W}^\perp$. Then $\mathbf{v}$ is orthogonal to itself, i.e. $\langle\mathbf{v},\mathbf{v}\rangle=0$. By the scalar product axioms, the only vector with this property is $\mathbf{v}=\mathbf{0}$.

— Q.E.D.
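Numerically, $\mathbf{W}^\perp$ can be computed as the null space of a matrix whose rows span $\mathbf{W}$. The following sketch (an aside, not from the lecture; the subspace of $\mathbb{R}^4$ chosen here is purely illustrative) uses NumPy's singular value decomposition for this:

```python
import numpy as np

# W = span of the two rows of A, a 2-dimensional subspace of R^4.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])

# Right-singular vectors corresponding to zero singular values span the
# null space of A, i.e. the set of vectors orthogonal to every row of A.
_, svals, Vt = np.linalg.svd(A)
rank = int(np.sum(svals > 1e-12))
complement_basis = Vt[rank:]          # orthonormal basis of W-perp

# Every vector in the computed W-perp is orthogonal to every spanning
# vector of W, as in the definition above.
print(np.allclose(A @ complement_basis.T, 0.0))  # True
```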

Propositions 1 and 2 make no assumption on the dimension of the Euclidean space $\mathbf{V}$: it could be finite-dimensional, or it could be infinite-dimensional. The same is true of the subspace $\mathbf{W}$. At this point, we restrict to the case that $\mathbf{V}$ is an $n$-dimensional vector space, and keep this restriction in place for the rest of the lecture.

Let $\mathbf{W}$ be an $m$-dimensional subspace of the $n$-dimensional Euclidean space $\mathbf{V}$. If $m=n$, then $\mathbf{W}=\mathbf{V}$, as proved on Assignment 1. Suppose $m<n$, and let $\mathbf{f}_1,\dots,\mathbf{f}_m$ be an orthonormal basis of $\mathbf{W}$. Since $\mathbf{W}\neq\mathbf{V}$, there is a vector $\mathbf{v}_1\in\mathbf{V}$ which is not in $\mathbf{W}$. In particular, the vector

$$\mathbf{f}_{m+1}=\mathbf{v}_1-\langle\mathbf{v}_1,\mathbf{f}_1\rangle\mathbf{f}_1-\dots-\langle\mathbf{v}_1,\mathbf{f}_m\rangle\mathbf{f}_m$$
is not the zero vector. This vector is orthogonal to each of the vectors $\mathbf{f}_1,\dots,\mathbf{f}_m$, and hence two things are true: first, $\mathbf{f}_{m+1}\in\mathbf{W}^\perp$, and second, $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$ is an orthogonal set of nonzero vectors. Thus, if $m+1=n$, the set $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$ is an orthogonal basis of $\mathbf{V}$. If $m+1<n$, then there is a vector $\mathbf{v}_2\in\mathbf{V}$ which is not in the span of $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$. We set

$$\mathbf{f}_{m+2}=\mathbf{v}_2-\langle\mathbf{v}_2,\mathbf{f}_1\rangle\mathbf{f}_1-\dots-\langle\mathbf{v}_2,\mathbf{f}_m\rangle\mathbf{f}_m-\frac{\langle\mathbf{v}_2,\mathbf{f}_{m+1}\rangle}{\langle\mathbf{f}_{m+1},\mathbf{f}_{m+1}\rangle}\mathbf{f}_{m+1}$$
to obtain a nonzero vector orthogonal to all vectors in the set $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+1}\}$. In particular, $\mathbf{f}_{m+2}\in\mathbf{W}^\perp$. If $m+2=n$, then $\{\mathbf{f}_1,\dots,\mathbf{f}_{m+2}\}$ is an orthogonal basis of $\mathbf{V}$. If $m+2<n$, we repeat the same process. After $n-m$ iterations of this process, we have generated an orthogonal basis

$$\{\mathbf{f}_1,\dots,\mathbf{f}_m,\mathbf{f}_{m+1},\dots,\mathbf{f}_n\}$$

of $\mathbf{V}$ such that $\{\mathbf{f}_1,\dots,\mathbf{f}_m\}$ is an orthonormal basis of $\mathbf{W}$, and $\{\mathbf{f}_{m+1},\dots,\mathbf{f}_n\}$ is an orthogonal basis of $\mathbf{W}^\perp$, which can be normalized to get an orthonormal basis of $\mathbf{W}^\perp$.
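The iterative construction above is the Gram–Schmidt process applied to vectors outside $\mathbf{W}$. A sketch of it in Python/NumPy (not from the lecture; drawing candidate vectors from the standard basis is an illustrative choice that always works, since the standard basis spans the space):

```python
import numpy as np

def extend_to_orthogonal_basis(W_basis, dim, tol=1e-10):
    """Extend an orthonormal basis of W to an orthogonal basis of R^dim,
    following the construction in the text: subtract from a candidate
    vector its projections onto the vectors found so far, and keep the
    residual whenever it is nonzero."""
    basis = [np.asarray(f, dtype=float) for f in W_basis]
    for k in range(dim):                 # candidates: standard basis of R^dim
        v = np.zeros(dim)
        v[k] = 1.0
        residual = v.copy()
        for f in basis:
            # divide by <f, f> since later vectors are orthogonal, not unit
            residual -= (np.dot(v, f) / np.dot(f, f)) * f
        if np.linalg.norm(residual) > tol:   # v was not in the current span
            basis.append(residual)
        if len(basis) == dim:
            break
    return basis

# W = span{(1,0,0), (0,1,0)} in R^3, with orthonormal basis given.
basis = extend_to_orthogonal_basis([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], 3)
print(len(basis))  # 3
```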

We now come to orthogonal projections in general. Let $\mathbf{W}$ be a subspace of $\mathbf{V}$, and let $\mathbf{W}^\perp$ be its orthogonal complement. Invoking the above construction, let $\{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ be an orthonormal basis of $\mathbf{V}$ such that $\{\mathbf{e}_1,\dots,\mathbf{e}_m\}$ is an orthonormal basis of $\mathbf{W}$ and $\{\mathbf{e}_{m+1},\dots,\mathbf{e}_n\}$ is an orthonormal basis of $\mathbf{W}^\perp$. The function $P_\mathbf{W}\colon\mathbf{V}\to\mathbf{V}$ defined by

$$P_\mathbf{W}\mathbf{v}=\langle\mathbf{v},\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{v},\mathbf{e}_m\rangle\mathbf{e}_m$$

is called the **orthogonal projector** of $\mathbf{V}$ on $\mathbf{W}$. For any vector $\mathbf{v}\in\mathbf{V}$, the vector $P_\mathbf{W}\mathbf{v}$ is called the **orthogonal projection** of $\mathbf{v}$ onto $\mathbf{W}$. Observe that $P_\mathbf{W}\mathbf{v}=\mathbf{0}$ if $\mathbf{v}\in\mathbf{W}^\perp$.
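The defining formula can be implemented directly. A sketch (not part of the lecture; the subspace and vector chosen below are illustrative):

```python
import numpy as np

def project(v, W_basis):
    """Orthogonal projection of v onto W = span(W_basis), assuming
    W_basis is orthonormal: P_W v = sum_i <v, e_i> e_i."""
    v = np.asarray(v, dtype=float)
    return sum(np.dot(v, e) * np.asarray(e) for e in W_basis)

# W = the xy-plane in R^3, with its standard orthonormal basis.
W_basis = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
v = np.array([3.0, 4.0, 5.0])

p = project(v, W_basis)
print(p)  # [3. 4. 0.]
```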

**Proposition 1:** The function $P_\mathbf{W}\colon\mathbf{V}\to\mathbf{V}$ is a linear transformation.

*Proof:* First, let us check that $P_\mathbf{W}$ sends the zero vector of $\mathbf{V}$ to the zero vector of $\mathbf{W}$. Note that, since $\mathbf{W}$ is a subspace of $\mathbf{V}$, they have the same zero vector; we denote it simply $\mathbf{0}$, instead of using two different symbols $\mathbf{0}_\mathbf{V}$ and $\mathbf{0}_\mathbf{W}$ for this same vector. We have

$$P_\mathbf{W}\mathbf{0}=\langle\mathbf{0},\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{0},\mathbf{e}_m\rangle\mathbf{e}_m=\mathbf{0}.$$
Now we check that $P_\mathbf{W}$ respects linear combinations. Let $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$ be two vectors, and let $a_1,a_2$ be two scalars. We then have

$$P_\mathbf{W}(a_1\mathbf{v}_1+a_2\mathbf{v}_2)=\sum_{i=1}^m\langle a_1\mathbf{v}_1+a_2\mathbf{v}_2,\mathbf{e}_i\rangle\mathbf{e}_i=a_1\sum_{i=1}^m\langle\mathbf{v}_1,\mathbf{e}_i\rangle\mathbf{e}_i+a_2\sum_{i=1}^m\langle\mathbf{v}_2,\mathbf{e}_i\rangle\mathbf{e}_i=a_1P_\mathbf{W}\mathbf{v}_1+a_2P_\mathbf{W}\mathbf{v}_2.$$
— Q.E.D.

**Proposition 2:** The linear transformation $P_\mathbf{W}$ satisfies $P_\mathbf{W}\circ P_\mathbf{W}=P_\mathbf{W}$.

*Proof:* The claim is that $P_\mathbf{W}(P_\mathbf{W}\mathbf{v})=P_\mathbf{W}\mathbf{v}$ for all $\mathbf{v}\in\mathbf{V}$. Let us check this. First, observe that for any vector $\mathbf{e}_j$ in the orthonormal basis $\{\mathbf{e}_1,\dots,\mathbf{e}_m\}$ of $\mathbf{W}$, we have

$$P_\mathbf{W}\mathbf{e}_j=\langle\mathbf{e}_j,\mathbf{e}_1\rangle\mathbf{e}_1+\dots+\langle\mathbf{e}_j,\mathbf{e}_m\rangle\mathbf{e}_m=\mathbf{e}_j,$$

since $\langle\mathbf{e}_j,\mathbf{e}_i\rangle=0$ for $i\neq j$ and $\langle\mathbf{e}_j,\mathbf{e}_j\rangle=1$. Note also that since $\{\mathbf{e}_1,\dots,\mathbf{e}_m\}$ is a basis of $\mathbf{W}$, the above calculation together with Proposition 1 tells us that $P_\mathbf{W}\mathbf{w}=\mathbf{w}$ for all $\mathbf{w}\in\mathbf{W}$, which is to be expected: the projection of a vector already in $\mathbf{W}$ onto $\mathbf{W}$ should just be $\mathbf{w}$. Now to finish the proof, we apply this calculation:

$$P_\mathbf{W}(P_\mathbf{W}\mathbf{v})=P_\mathbf{W}\mathbf{v},$$

since $P_\mathbf{W}\mathbf{v}$ is itself a vector in $\mathbf{W}$.
— Q.E.D.
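Proposition 2 can be observed numerically. The sketch below (an aside, not from the lecture) uses the observation that, when the rows of a matrix $E$ form an orthonormal basis of $\mathbf{W}$, the matrix $E^{\mathsf T}E$ implements $P_\mathbf{W}$:

```python
import numpy as np

# Orthonormal basis of a 2-dimensional subspace W of R^3 (illustrative
# choice), stored as the rows of E.
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# As a matrix, P_W = E^T E sends v to sum_i <v, e_i> e_i.
P = E.T @ E

v = np.array([3.0, -2.0, 7.0])

# Applying the projector twice gives the same result as applying it once.
print(np.allclose(P @ (P @ v), P @ v))  # True
```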

**Proposition 3:** The linear transformation $P_\mathbf{W}$ has the property that $\langle P_\mathbf{W}\mathbf{v}_1,\mathbf{v}_2\rangle=\langle\mathbf{v}_1,P_\mathbf{W}\mathbf{v}_2\rangle$ for any $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$.

*Proof:* For any two vectors $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$, we have

$$\langle P_\mathbf{W}\mathbf{v}_1,\mathbf{v}_2\rangle=\Big\langle\sum_{i=1}^m\langle\mathbf{v}_1,\mathbf{e}_i\rangle\mathbf{e}_i,\mathbf{v}_2\Big\rangle=\sum_{i=1}^m\langle\mathbf{v}_1,\mathbf{e}_i\rangle\langle\mathbf{e}_i,\mathbf{v}_2\rangle=\Big\langle\mathbf{v}_1,\sum_{i=1}^m\langle\mathbf{v}_2,\mathbf{e}_i\rangle\mathbf{e}_i\Big\rangle=\langle\mathbf{v}_1,P_\mathbf{W}\mathbf{v}_2\rangle.$$
— Q.E.D.
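Self-adjointness can also be checked on concrete vectors; a small sketch (illustrative, not from the lecture):

```python
import numpy as np

# Projector onto W = span{e1, e2} in R^3 (illustrative choice), built
# from the matrix E whose rows are an orthonormal basis of W.
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = E.T @ E

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([-4.0, 0.0, 5.0])

# <P v1, v2> = <v1, P v2>: the projector is self-adjoint.
lhs = np.dot(P @ v1, v2)
rhs = np.dot(v1, P @ v2)
print(np.isclose(lhs, rhs))  # True
```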

**Proposition 4:** For any $\mathbf{w}\in\mathbf{W}$ and $\mathbf{v}\in\mathbf{V}$, we have $\langle\mathbf{v}-P_\mathbf{W}\mathbf{v},\mathbf{w}\rangle=0$.

*Proof:* Before reading the proof, draw yourself a diagram to make sure you can visualize what this proposition is saying. The proof itself follows easily from Proposition 3: we have

$$\langle\mathbf{v}-P_\mathbf{W}\mathbf{v},\mathbf{w}\rangle=\langle\mathbf{v},\mathbf{w}\rangle-\langle P_\mathbf{W}\mathbf{v},\mathbf{w}\rangle=\langle\mathbf{v},\mathbf{w}\rangle-\langle\mathbf{v},P_\mathbf{W}\mathbf{w}\rangle=\langle\mathbf{v},\mathbf{w}\rangle-\langle\mathbf{v},\mathbf{w}\rangle=0,$$

where we used $P_\mathbf{W}\mathbf{w}=\mathbf{w}$, which holds because $\mathbf{w}\in\mathbf{W}$ (see the proof of Proposition 2).
— Q.E.D.
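Proposition 4 says the residual $\mathbf{v}-P_\mathbf{W}\mathbf{v}$ is orthogonal to all of $\mathbf{W}$; the following sketch checks this against the basis vectors of an illustrative subspace (again an aside, not from the lecture):

```python
import numpy as np

# Projector onto the xy-plane W in R^3 (illustrative choice).
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = E.T @ E

v = np.array([3.0, 4.0, 5.0])
residual = v - P @ v

# v - P_W v is orthogonal to every basis vector of W, hence to all of W.
print(np.allclose(E @ residual, 0.0))  # True
```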

**Proposition 5:** For any $\mathbf{v}\in\mathbf{V}$ and any $\mathbf{w}\in\mathbf{W}$ with $\mathbf{w}\neq P_\mathbf{W}\mathbf{v}$, we have $\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|<\|\mathbf{v}-\mathbf{w}\|$.

*Proof:* Let us write

$$\mathbf{v}-\mathbf{w}=(\mathbf{v}-P_\mathbf{W}\mathbf{v})+(P_\mathbf{W}\mathbf{v}-\mathbf{w}).$$

Now observe that the vector $P_\mathbf{W}\mathbf{v}-\mathbf{w}$ lies in $\mathbf{W}$, since it is the difference of two vectors in this subspace. Consequently, $\mathbf{v}-P_\mathbf{W}\mathbf{v}$ and $P_\mathbf{W}\mathbf{v}-\mathbf{w}$ are orthogonal vectors, by Proposition 4. We may thus apply the Pythagorean theorem (Assignment 2) to obtain

$$\|\mathbf{v}-\mathbf{w}\|^2=\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|^2+\|P_\mathbf{W}\mathbf{v}-\mathbf{w}\|^2>\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|^2,$$

where

$$\|P_\mathbf{W}\mathbf{v}-\mathbf{w}\|^2>0\ \text{because}\ \mathbf{w}\neq P_\mathbf{W}\mathbf{v}.$$
— Q.E.D.
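The best-approximation property can be illustrated numerically by comparing $\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|$ against the distance from $\mathbf{v}$ to a random sample of other vectors in $\mathbf{W}$ (a sketch, not from the lecture; the subspace and sample size are illustrative):

```python
import numpy as np

# Projector onto the xy-plane W in R^3 (illustrative choice).
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = E.T @ E

v = np.array([3.0, 4.0, 5.0])
p = P @ v

# Compare |v - P_W v| with |v - w| for a sample of other vectors w in W.
rng = np.random.default_rng(0)
best = np.linalg.norm(v - p)
others = [np.linalg.norm(v - E.T @ rng.normal(size=2)) for _ in range(100)]

print(best <= min(others))  # True
```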

Proposition 5 says that $P_\mathbf{W}\mathbf{v}$ is the vector in $\mathbf{W}$ which is closest to $\mathbf{v}$, which matches our geometric intuition concerning projections. Equivalently, we can say that $P_\mathbf{W}\mathbf{v}$ is the vector in $\mathbf{W}$ which best approximates $\mathbf{v}$, and this perspective makes orthogonal projections very important in applications of linear algebra to statistics, data science, physics, engineering, and more. However, Proposition 5 also has purely mathematical importance. Namely, we have constructed the linear transformation $P_\mathbf{W}$ using an arbitrarily chosen orthonormal basis $\mathbf{e}_1,\dots,\mathbf{e}_m$ in $\mathbf{W}$. If we had used a different orthonormal basis $\mathbf{e}_1',\dots,\mathbf{e}_m'$, the same formula gives us a possibly different linear transformation

$$P'_\mathbf{W}\colon\mathbf{V}\to\mathbf{V}$$

defined by

$$P'_\mathbf{W}\mathbf{v}=\langle\mathbf{v},\mathbf{e}_1'\rangle\mathbf{e}_1'+\dots+\langle\mathbf{v},\mathbf{e}_m'\rangle\mathbf{e}_m'.$$
Propositions 1-5 above all apply to $P'_\mathbf{W}$ as well, and in fact this forces $P'_\mathbf{W}=P_\mathbf{W}$, so that it really is correct to speak of *the* orthogonal projection of $\mathbf{v}$ onto $\mathbf{W}$. To see why these two transformations must be the same, let us suppose they are not. This means that there is a vector $\mathbf{v}\in\mathbf{V}$ such that $P_\mathbf{W}\mathbf{v}\neq P'_\mathbf{W}\mathbf{v}$. Thus, since $P'_\mathbf{W}\mathbf{v}\in\mathbf{W}$, by Proposition 5 we have

$$\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|<\|\mathbf{v}-P'_\mathbf{W}\mathbf{v}\|,$$

while also by Proposition 5, applied to $P'_\mathbf{W}$ with $P_\mathbf{W}\mathbf{v}\in\mathbf{W}$, we have

$$\|\mathbf{v}-P'_\mathbf{W}\mathbf{v}\|<\|\mathbf{v}-P_\mathbf{W}\mathbf{v}\|,$$

a contradiction. So, in the construction of the transformation $P_\mathbf{W}$, it does not matter which orthonormal basis of $\mathbf{W}$ we use.
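The basis-independence just proved can also be seen numerically: two different orthonormal bases of the same plane yield the same projector matrix. A final sketch (illustrative, not from the lecture):

```python
import numpy as np

# Two different orthonormal bases of the same plane W in R^3: the
# standard one, and one rotated by 30 degrees within the plane.
E1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
E2 = np.array([[c, s, 0.0],
               [-s, c, 0.0]])

# The resulting projectors coincide, as the uniqueness argument predicts.
P1 = E1.T @ E1
P2 = E2.T @ E2
print(np.allclose(P1, P2))  # True
```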