Math 31AH: Lecture 5

In this lecture we continue the study of Euclidean spaces. Let \mathbf{V} be a vector space, and let \langle \cdot,\cdot \rangle be a scalar product on \mathbf{V}, as defined in Lecture 4. The following definition generalizes the concept of perpendicularity to the setting of an arbitrary Euclidean space.

Definition 1: Vectors \mathbf{v},\mathbf{w} are said to be orthogonal if \langle \mathbf{v},\mathbf{w} \rangle =0. More generally, we say that S \subseteq \mathbf{V} is an orthogonal set if \langle \mathbf{v},\mathbf{w} \rangle =0 for all distinct \mathbf{v},\mathbf{w} \in S.

Observe that the zero vector \mathbf{0} \in \mathbf{V} is orthogonal to every vector \mathbf{v} \in \mathbf{V}, by the third scalar product axiom. Let us check that orthogonality of nonzero abstract vectors does indeed generalize perpendicularity of geometric vectors.

Proposition 1: Two nonzero vectors \mathbf{v},\mathbf{w} are orthogonal if and only if the angle between them is \frac{\pi}{2}.

Proof: By definition, the angle between nonzero vectors \mathbf{v} and \mathbf{w} is the unique number \theta \in [0,\pi] which solves the equation

\langle \mathbf{v},\mathbf{w} \rangle = \|\mathbf{v}\| \|\mathbf{w}\| \cos \theta.

If the angle between \mathbf{v} and \mathbf{w} is \frac{\pi}{2}, then

\langle \mathbf{v},\mathbf{w} \rangle = \|\mathbf{v}\| \|\mathbf{w}\| \cos \frac{\pi}{2} = \|\mathbf{v}\| \|\mathbf{w}\| \cdot 0 = 0.

Conversely, if \langle \mathbf{v},\mathbf{w} \rangle =0, then

\|\mathbf{v}\| \|\mathbf{w}\| \cos \theta = 0.

Since \mathbf{v},\mathbf{w} are nonzero, we have \|\mathbf{v}\| > 0 and \|\mathbf{w}\| > 0, and we can divide through by \|\mathbf{v}\| \|\mathbf{w}\| > 0 to obtain

\cos \theta = 0.

The unique solution of this equation in the interval [0,\pi] is \theta = \frac{\pi}{2}. — Q.E.D.

In Lecture 4, we proved that any two nonzero vectors \mathbf{v},\mathbf{w} separated by a nonzero angle are linearly independent. This is not true for three or more vectors: for example, if \mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3 \in \mathbb{R}^2 are the vectors (1,0),(1,1),(0,1), respectively, then

\theta(\mathbf{v}_1,\mathbf{v}_2) = \theta(\mathbf{v}_2,\mathbf{v}_3) = \frac{\pi}{4},\ \theta(\mathbf{v}_1,\mathbf{v}_3) = \frac{\pi}{2},

but \mathbf{v}_2 = \mathbf{v}_1+\mathbf{v}_3. So, separation by a positive angle is generally not enough to guarantee the linear independence of a given set of vectors. However, orthogonality is.
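For concreteness, here is a quick numerical check of this example; it is only an illustrative sketch, using NumPy and the standard dot product on \mathbb{R}^2.

import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])
v3 = np.array([0.0, 1.0])

def angle(u, w):
    # Angle determined by <u,w> = ||u|| ||w|| cos(theta), with theta in [0, pi].
    return np.arccos(u @ w / (np.linalg.norm(u) * np.linalg.norm(w)))

print(angle(v1, v2), angle(v2, v3), angle(v1, v3))  # approximately pi/4, pi/4, pi/2
print(np.allclose(v2, v1 + v3))                     # True: v2 = v1 + v3, so the set is dependent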

Proposition 2: If S = \{\mathbf{v}_1,\dots,\mathbf{v}_n\} is an orthogonal set of nonzero vectors, then S is linearly independent.

Proof: Let x_1,\dots,x_n \in \mathbb{R} be scalars such that

x_1\mathbf{v}_1 + \dots + x_n \mathbf{v}_n = \mathbf{0}.

Let us take the scalar product with \mathbf{v}_1 on both sides of this equation, to get

\langle \mathbf{v}_1, x_1\mathbf{v}_1 + \dots + x_n \mathbf{v}_n \rangle = \langle \mathbf{v}_1,\mathbf{0} \rangle.

Using the scalar product axioms, we thus have

x_1 \langle \mathbf{v}_1,\mathbf{v}_1 \rangle + \dots + x_n \langle \mathbf{v}_1,\mathbf{v}_n \rangle = 0.

Now, since S = \{\mathbf{v}_1,\dots,\mathbf{v}_n\} is an orthogonal set, all terms on the left hand side are zero except for the first term, which is x_1 \langle \mathbf{v}_1,\mathbf{v}_1 \rangle = x_1 \|\mathbf{v}_1\|^2. We thus have

x_1 \|\mathbf{v}_1\|^2 = 0.

Now, since \mathbf{v}_1 \neq \mathbf{0}, we have \|\mathbf{v}_1\| > 0, and thus we can divide through by \|\mathbf{v}_1\|^2 in the above equation to get

x_1 = 0.

Repeating the above argument with \mathbf{v}_2 in place of \mathbf{v}_1 yields x_2=0. In general, using the same argument for each i=1,\dots,n, we get x_i=0 for all i=1,\dots,n. Thus S is a linearly independent set. — Q.E.D.
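As a sanity check, here is a minimal numerical illustration of Proposition 2 in \mathbb{R}^3 with the standard dot product; the particular orthogonal vectors below are chosen only for illustration.

import numpy as np

# An orthogonal set of nonzero vectors in R^3 (all pairwise dot products vanish).
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v3 = np.array([0.0, 0.0, 2.0])

print(v1 @ v2, v1 @ v3, v2 @ v3)    # 0.0 0.0 0.0: the set is orthogonal
A = np.column_stack([v1, v2, v3])
print(np.linalg.matrix_rank(A))     # 3: the vectors are linearly independent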

One consequence of Proposition 2 is that, if \mathbf{V} is an n-dimensional vector space, and E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\} is an orthogonal set of nonzero vectors in \mathbf{V}, then E is a basis of \mathbf{V}. In general, a basis of a vector space which is also an orthogonal set is called an orthogonal basis. In many ways, orthogonal bases are better than bases which are not orthogonal sets. One manifestation of this is the very useful fact that coordinates relative to an orthogonal basis are easily expressed as scalar products.

Proposition 3: Let E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\} be an orthogonal basis in \mathbf{V}. For any \mathbf{v} \in \mathbf{V}, the unique representation of \mathbf{v} as a linear combination of vectors in E is

\mathbf{v} = \frac{\langle \mathbf{e}_1,\mathbf{v} \rangle}{\|\mathbf{e}_1\|^2}\mathbf{e}_1 + \frac{\langle \mathbf{e}_2,\mathbf{v} \rangle}{\|\mathbf{e}_2\|^2}\mathbf{e}_2 + \dots + \frac{\langle \mathbf{e}_n,\mathbf{v} \rangle}{\|\mathbf{e}_n\|^2}\mathbf{e}_n.

Equivalently, we have

\mathbf{v} = \frac{\|\mathbf{v}\| \cos \theta_1}{\|\mathbf{e}_1\|} \mathbf{e}_1 + \frac{\|\mathbf{v}\|\cos \theta_2}{\|\mathbf{e}_2\|} \mathbf{e}_2 + \dots + \frac{\|\mathbf{v}\| \cos \theta_n}{\|\mathbf{e}_n\|} \mathbf{e}_n,

where, for each 1 \leq j \leq n, \theta_j is the angle between \mathbf{v} and \mathbf{e}_j.

Proof: Let \mathbf{v} \in \mathbf{V} be any vector, and let

\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n

be its unique representation as a linear combination of vectors from E. Taking the inner product with the basis vector \mathbf{e}_j on both sides of this decomposition, we get

\langle \mathbf{e}_j,\mathbf{v} \rangle = \left\langle \mathbf{e}_j, \sum_{i=1}^n x_i \mathbf{e}_i \right\rangle.

Using the scalar product axioms, we can expand the right hand side as

\left\langle \mathbf{e}_j, \sum_{i=1}^n x_i \mathbf{e}_i \right\rangle = \sum_{i=1}^n x_i \langle \mathbf{e}_j,\mathbf{e}_i \rangle = \sum_{i=1}^n x_i \delta_{ij} \|\mathbf{e}_i\|^2,

where \delta_{ij} is the Kronecker delta, which equals 1 if i=j and equals 0 if i \neq j; the second equality holds because E is an orthogonal set, so that \langle \mathbf{e}_j,\mathbf{e}_i \rangle = 0 whenever i \neq j. We thus have

\langle \mathbf{e}_j,\mathbf{v} \rangle = x_j \langle \mathbf{e}_j,\mathbf{e}_j \rangle = x_j \|\mathbf{e}_j\|^2.

Now, since \{\mathbf{e}_1,\dots,\mathbf{e}_n\} is a linearly independent set, \mathbf{e}_j \neq \mathbf{0} and hence \|\mathbf{e}_j \| >0. Solving for the coordinate x_j, we thus have

x_j = \frac{\langle \mathbf{e}_j,\mathbf{v} \rangle}{\|\mathbf{e}_j\|^2}.

Since \langle \mathbf{e}_j,\mathbf{v}\rangle = \|\mathbf{e}_j\| \|\mathbf{v}\| \cos \theta_j, where \theta_j is the angle between \mathbf{v} and the basis vector \mathbf{e}_j, this may equivalently be written

x_j = \frac{\|\mathbf{e}_j\| \|\mathbf{v}\| \cos \theta_j}{\|\mathbf{e}_j\|^2} = \frac{\|\mathbf{v}\| \cos \theta_j}{\|\mathbf{e}_j\|},

which completes the proof. — Q.E.D.
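To see the coordinate formula in action, here is a short sketch that computes coordinates relative to an orthogonal (but not orthonormal) basis of \mathbb{R}^3, with the standard dot product as the scalar product; the basis reuses the orthogonal vectors from the sketch above, and the vector \mathbf{v} is chosen only for illustration.

import numpy as np

# An orthogonal (but not orthonormal) basis of R^3.
e1 = np.array([1.0, 1.0, 0.0])
e2 = np.array([1.0, -1.0, 0.0])
e3 = np.array([0.0, 0.0, 2.0])

v = np.array([3.0, -1.0, 4.0])

# Coordinates via x_j = <e_j, v> / ||e_j||^2.
x = [float((e @ v) / (e @ e)) for e in (e1, e2, e3)]
print(x)                                                # the coordinates are 1.0, 2.0, 2.0
print(np.allclose(x[0]*e1 + x[1]*e2 + x[2]*e3, v))      # True: the expansion recovers v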

The formulas in Proposition 3 become even simpler if E = \{\mathbf{e}_1,\dots,\mathbf{e}_n\} is an orthogonal basis in which every vector has length 1, i.e.

\|\mathbf{e}_j\| = 1, \quad j=1,\dots,n.

Such a basis is called an orthonormal basis. According to Proposition 3, if E is an orthonormal basis in \mathbf{V}, then for any \mathbf{v} \in \mathbf{V} we have

\mathbf{v} = \langle \mathbf{e}_1,\mathbf{v}\rangle\mathbf{e}_1 + \langle \mathbf{e}_2,\mathbf{v} \rangle\mathbf{e}_2 + \dots + \langle \mathbf{e}_n,\mathbf{v} \rangle\mathbf{e}_n,

or equivalently

\mathbf{v} = \|\mathbf{v}\| \cos \theta_1\mathbf{e}_1 + \|\mathbf{v}\|\cos \theta_2\mathbf{e}_2 + \dots + \|\mathbf{v}\| \cos \theta_n\mathbf{e}_n.

The first of these formulas is important in that it gives an algebraically efficient way to calculate coordinates relative to an orthonormal basis: to calculate the coordinates of a vector \mathbf{v}, just compute its scalar product with each of the basis vectors. The second formula is important because it provides geometric intuition: it says that the coordinates of \mathbf{v} relative to an orthonormal basis are the signed lengths of the orthogonal projections of \mathbf{v} onto the lines (i.e. one-dimensional subspaces) spanned by each of the basis vectors. Indeed, thinking of the case where \mathbf{v} and \mathbf{e}_j are geometric vectors, the quantity \|\mathbf{v}\| \cos \theta_j is the signed length of the orthogonal projection P_{\mathbf{e}_j}\mathbf{v} of the vector \mathbf{v} onto the line spanned by \mathbf{e}_j, as in the figure below.

[Figure: Orthogonal projection of a vector onto a line.]
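In coordinates, if \mathbf{e}_j is a unit vector, this projection is P_{\mathbf{e}_j}\mathbf{v} = \langle \mathbf{e}_j,\mathbf{v} \rangle \mathbf{e}_j. The following minimal sketch illustrates this for an orthonormal basis of \mathbb{R}^2 obtained by rotating the standard basis; the rotation angle and the vector \mathbf{v} are chosen only for illustration.

import numpy as np

# An orthonormal basis of R^2: the standard basis rotated by 30 degrees.
t = np.pi / 6
e1 = np.array([np.cos(t), np.sin(t)])
e2 = np.array([-np.sin(t), np.cos(t)])

v = np.array([2.0, 1.0])

# Coordinates of v are its scalar products with the basis vectors.
x1, x2 = e1 @ v, e2 @ v

# Each coordinate equals ||v|| cos(theta_j), the signed length of the projection onto e_j.
theta1 = np.arccos(e1 @ v / np.linalg.norm(v))             # ||e1|| = 1, so no division by ||e1|| is needed
print(np.isclose(x1, np.linalg.norm(v) * np.cos(theta1)))  # True

# v is the sum of its orthogonal projections onto the lines spanned by e1 and e2.
print(np.allclose(x1 * e1 + x2 * e2, v))                   # True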

An added benefit of orthonormal bases is that they reduce abstract scalar products to the familiar dot product of geometric vectors. More precisely, suppose that E=\{\mathbf{e}_1,\dots,\mathbf{e}_n\} is an orthonormal basis of \mathbf{V}. Let \mathbf{v},\mathbf{w} be vectors in \mathbf{V}, and let

\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n, \qquad \mathbf{w} = y_1\mathbf{e}_1 + \dots + y_n\mathbf{e}_n

be their representations relative to E. Then, we may evaluate the scalar product of \mathbf{v} and \mathbf{w} as

\langle \mathbf{v},\mathbf{w} \rangle = \left\langle \sum_{i=1}^n x_i\mathbf{e}_i,\sum_{j=1}^n y_j\mathbf{e}_j \right\rangle = \sum_{i,j=1}^n x_iy_j \langle \mathbf{e}_i,\mathbf{e}_j \rangle = \sum_{i,j=1}^n x_iy_j \delta_{ij} = \sum_{i=1}^n x_iy_i = (x_1,\dots,x_n) \cdot (y_1,\dots,y_n).

In words, the scalar product \langle \mathbf{v},\mathbf{w} \rangle equals the dot product of the coordinate vectors of \mathbf{v} and \mathbf{w} relative to an orthonormal basis of \mathbf{V}.
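Note that this holds for any scalar product, not just the dot product itself. Here is a small sketch with a non-standard scalar product on \mathbb{R}^2, namely \langle \mathbf{v},\mathbf{w} \rangle = v_1w_1 + 2v_2w_2; this inner product and the orthonormal basis below are chosen only for illustration.

import numpy as np

def ip(u, w):
    # A non-standard scalar product on R^2: <u, w> = u1*w1 + 2*u2*w2.
    return u[0]*w[0] + 2.0*u[1]*w[1]

# An orthonormal basis of R^2 with respect to ip: ip(e1,e1) = ip(e2,e2) = 1 and ip(e1,e2) = 0.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0 / np.sqrt(2.0)])

v = np.array([3.0, -1.0])
w = np.array([2.0, 5.0])

# Coordinates of v and w relative to {e1, e2}, computed as scalar products.
x = np.array([ip(e1, v), ip(e2, v)])
y = np.array([ip(e1, w), ip(e2, w)])

print(np.isclose(ip(v, w), x @ y))   # True: <v,w> equals the dot product of the coordinate vectors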

This suggests the following definition.

Definition 2: Euclidean spaces (\mathbf{V}_1,\langle \cdot,\cdot \rangle_1) and (\mathbf{V}_2,\langle \cdot,\cdot \rangle_2) are said to be isomorphic if there exists an isomorphism T \colon \mathbf{V}_1 \to \mathbf{V}_2 which has the additional feature that, for all \mathbf{v},\mathbf{w} \in \mathbf{V}_1,

\langle T\mathbf{v},T\mathbf{w} \rangle_2 = \langle \mathbf{v},\mathbf{w} \rangle_1.

Our calculation above makes it seem likely that any two n-dimensional Euclidean spaces (\mathbf{V}_1,\langle \cdot,\cdot \rangle_1) and (\mathbf{V}_2,\langle \cdot,\cdot \rangle_2) are isomorphic, just as any two n-dimensional vector spaces \mathbf{V} and \mathbf{W} are. Indeed, we can prove this immediately if we can claim that both \mathbf{V}_1 and \mathbf{V}_2 contain orthonormal bases. In this case, let E_1 = \{\mathbf{e}_{11},\dots,\mathbf{e}_{1n}\} be an orthonormal basis in \mathbf{V}_1, let E_2 = \{\mathbf{e}_{21},\dots,\mathbf{e}_{2n}\} be an orthonormal basis in \mathbf{V}_2, and define T \colon \mathbf{V}_1 \to \mathbf{V}_2 to be the unique linear transformation that transforms \mathbf{e}_{1j} into \mathbf{e}_{2j} for each 1 \leq j \leq n. Then T is an isomorphism of vector spaces by the same argument as in Lecture 2, and it also satisfies \langle T\mathbf{v},T\mathbf{w}\rangle_2 = \langle \mathbf{v},\mathbf{w}\rangle_1 (make sure you understand why).

But, how can we be sure that every n-dimensional Euclidean space (\mathbf{V},\langle \cdot,\cdot \rangle) actually does contain an orthonormal basis? Certainly, we know that \mathbf{V} contains a basis B=\{\mathbf{b}_1,\dots,\mathbf{b}_n\}, but this basis might not be orthonormal. Luckily, there is a fairly simple algorithm which takes as input a finite linearly independent set of vectors, and outputs a linearly independent orthogonal set of the same size, which we can then “normalize” by dividing each vector in the output set by its norm. This algorithm is called the Gram-Schmidt algorithm, and you are encouraged to familiarize yourself with it — it’s not too complicated, and is based entirely on material covered in this lecture. In this course, we only need to know that the Gram-Schmidt algorithm exists, so that we can claim any finite-dimensional Euclidean space has an orthonormal basis. We won’t bother analyzing the internal workings of the Gram-Schmidt algorithm, and will treat it as a black box to facilitate geometric thinking in abstract Euclidean spaces. More on this in Lecture 6.
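For readers who want to peek inside the black box, here is one possible implementation of the Gram-Schmidt procedure for the standard dot product on \mathbb{R}^n, written in NumPy as a minimal sketch; nothing later in the course depends on these details, and this is not necessarily the formulation used in other treatments.

import numpy as np

def gram_schmidt(vectors):
    # Given a linearly independent list of vectors in R^n (standard dot product),
    # return an orthonormal list spanning the same subspace.
    basis = []
    for v in vectors:
        # Subtract from v its components along the vectors produced so far...
        for e in basis:
            v = v - (e @ v) * e
        # ...and normalize what remains.
        basis.append(v / np.linalg.norm(v))
    return basis

# Example: orthonormalize a basis of R^3.
B = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])]
E = gram_schmidt(B)
print(np.round([[u @ w for w in E] for u in E], 10))   # (approximately) the 3x3 identity matrix: E is orthonormal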

Lecture 5 video
