Let $f\colon\mathcal{V}\to\mathcal{W}$ be a function from one Euclidean space to another. In making this declaration, we did not make the scalar products on the source and target spaces explicit. This omission is commonly made for the sake of convenience, and by abuse of notation we just denote all scalar products $\langle\cdot,\cdot\rangle$, since the vector space on which a given scalar product is defined can always be deduced from context. The resulting notational ambiguity is ultimately less confusing than requiring distinct symbols for all scalar products in play at any given time.

As discussed in Lecture 2, our goal is to do calculus with functions $f\colon\mathcal{V}\to\mathcal{W}$ which take vector inputs and produce vector outputs. Since $f$ is a function of only one variable, the vector $v$, let us explain why this subject is also known as multivariable calculus. Let $e_1,\dots,e_n$ be an orthonormal basis of $\mathcal{V}$, so that every $v\in\mathcal{V}$ is given by

$$v=\langle v,e_1\rangle e_1+\dots+\langle v,e_n\rangle e_n.$$

If $v$ is viewed as a variable (rather than a particular vector), then its coordinates

$$x_1=\langle v,e_1\rangle,\ \dots,\ x_n=\langle v,e_n\rangle$$

relative to the orthonormal basis $e_1,\dots,e_n$ also become variables. In other words, associated to the function $f$ is another function $F\colon\mathbb{R}^n\to\mathcal{W}$ defined by

$$F(x_1,\dots,x_n)=f(x_1e_1+\dots+x_ne_n),$$

which is a function of the scalar variables $x_1,\dots,x_n$, where $n=\dim\mathcal{V}$. The objects $f$ and $F$ contain exactly the same information, even though the former is a function of a single vector variable $v$ and the latter is a function of $n$ scalar variables $x_1,\dots,x_n$. Notice that the construction of $F$ from $f$ makes no mention of the scalar product on $\mathcal{W}$, nor does it reference a basis in $\mathcal{W}$.
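
To make the passage from a function of a vector variable to a function of scalar coordinates concrete, here is a small Python sketch. It is a hypothetical illustration, with $\mathbb{R}^2$ playing the role of the source space, the standard basis as the orthonormal basis, and a made-up function $f$:

```python
# Vectors in R^2 represented as tuples. A made-up function f of a single
# vector variable v, mapping R^2 to R^2.
def f(v):
    return (v[0] ** 2 + v[1] ** 2, v[0])

# Standard orthonormal basis e_1, e_2 of R^2.
e1, e2 = (1.0, 0.0), (0.0, 1.0)

# The associated function F of scalar variables: F(x1, x2) = f(x1*e1 + x2*e2).
def F(x1, x2):
    v = (x1 * e1[0] + x2 * e2[0], x1 * e1[1] + x2 * e2[1])
    return f(v)

print(F(3.0, 4.0))  # -> (25.0, 3.0), the same information as f((3.0, 4.0))
```

The point of the sketch is that `F` carries no new information: it simply feeds the coordinates back through the basis before applying `f`.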

Now suppose we choose an orthonormal basis $w_1,\dots,w_m$ of $\mathcal{W}$. Then, for any $v\in\mathcal{V}$, we have

$$f(v)=\langle f(v),w_1\rangle w_1+\dots+\langle f(v),w_m\rangle w_m,$$

and since we are thinking of $v$ as a variable, it is natural to think of the coordinates of $f(v)$ in the basis $w_1,\dots,w_m$ as functions of $v$,

$$f_1(v)=\langle f(v),w_1\rangle,\ \dots,\ f_m(v)=\langle f(v),w_m\rangle.$$

The functions $f_1,\dots,f_m$ so defined are called the **component functions** of $f$ relative to the basis $w_1,\dots,w_m$, and they contain exactly the same information as the function $f$ itself. In particular, we have the following.

**Proposition 1:** Given a function $f\colon\mathcal{V}\to\mathcal{W}$ and vectors $v_0\in\mathcal{V}$ and $w_0\in\mathcal{W}$, we have

$$\lim_{v\to v_0}f(v)=w_0$$

if and only if

$$\lim_{v\to v_0}f_i(v)=c_i\quad\text{for each } i=1,\dots,m,$$

where $c_1,\dots,c_m$ are the components of the vector $w_0$ relative to the orthonormal basis $w_1,\dots,w_m$ of $\mathcal{W}$. In particular, the vector-valued function $f$ is continuous at $v_0$ if and only if the scalar-valued functions $f_1,\dots,f_m$ are continuous at $v_0$.

*Proof:* Since the basis $w_1,\dots,w_m$ is fixed, let us write $f_i(v)$ as an abbreviation for $\langle f(v),w_i\rangle$. We have

$$f(v)-w_0=(f_1(v)-c_1)w_1+\dots+(f_m(v)-c_m)w_m,$$

hence

$$\|f(v)-w_0\|^2=(f_1(v)-c_1)^2+\dots+(f_m(v)-c_m)^2.$$

Suppose first that $f(v)\to w_0$ as $v\to v_0$. Then, for any $\varepsilon>0$, there exists $\delta>0$ such that $0<\|v-v_0\|<\delta$ implies

$$\|f(v)-w_0\|^2<\varepsilon^2\leq\varepsilon,$$

where we are assuming $\varepsilon\leq 1$ so that $\varepsilon^2\leq\varepsilon$. Since each term in the sum of squares is bounded by the total sum, this gives

$$(f_i(v)-c_i)^2<\varepsilon$$

for each $i=1,\dots,m$, which shows that $f_i(v)\to c_i$ as $v\to v_0$.

Conversely, suppose that for each $i=1,\dots,m$ we have $f_i(v)\to c_i$ as $v\to v_0$, and let $\varepsilon>0$ be given. Then, for each $i$ there is a $\delta_i>0$ such that $0<\|v-v_0\|<\delta_i$ implies

$$(f_i(v)-c_i)^2<\frac{\varepsilon^2}{m}.$$

Setting $\delta=\min(\delta_1,\dots,\delta_m)$, we thus have that $0<\|v-v_0\|<\delta$ implies

$$\|f(v)-w_0\|^2=(f_1(v)-c_1)^2+\dots+(f_m(v)-c_m)^2<\varepsilon^2,$$

which shows that $f(v)\to w_0$ as $v\to v_0$.

Q.E.D.

Technicalities aside, the structure of the above proof is very simple: the argument is that if a sum of nonnegative numbers is small, then each term in the sum must be small, and conversely the sum of a finite number of small numbers is still small.

Now let us consider the above paragraphs simultaneously, meaning that we have chosen orthonormal bases $e_1,\dots,e_n$ of $\mathcal{V}$ and $w_1,\dots,w_m$ of $\mathcal{W}$. Then each component function $f_i\colon\mathcal{V}\to\mathbb{R}$ gives rise to an associated function $F_i\colon\mathbb{R}^n\to\mathbb{R}$ defined by

$$F_i(x_1,\dots,x_n)=f_i(x_1e_1+\dots+x_ne_n).$$

In particular, upon choosing orthonormal bases $e_1,\dots,e_n$ of $\mathcal{V}$ and $w_1,\dots,w_m$ of $\mathcal{W}$, every function $f\colon\mathcal{V}\to\mathcal{W}$ gives rise to an associated function $F\colon\mathbb{R}^n\to\mathbb{R}^m$ defined by

$$F(x_1,\dots,x_n)=\big(F_1(x_1,\dots,x_n),\dots,F_m(x_1,\dots,x_n)\big).$$

**Example 1:** Let $f\colon\mathcal{V}\to\mathcal{V}$ be the function defined by $f(v)=v+t$, where $t\in\mathcal{V}$ is a specified vector. This function is called "translation by $t$." Choose an orthonormal basis $e_1,\dots,e_n$ of $\mathcal{V}$, and suppose that the coordinate vector of $t$ relative to this basis is $(t_1,\dots,t_n)$. Then, the function $F\colon\mathbb{R}^n\to\mathbb{R}^n$ is given by

$$F(x_1,\dots,x_n)=(x_1+t_1,\dots,x_n+t_n),$$

and

$$F_i(x_1,\dots,x_n)=x_i+t_i,\quad i=1,\dots,n.$$

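In code, the coordinate form of translation is just component-wise addition. A minimal sketch, assuming a made-up translation vector $t$ in $\mathbb{R}^3$:

```python
# Translation by a fixed vector t, in coordinates relative to an orthonormal
# basis: F(x_1, ..., x_n) = (x_1 + t_1, ..., x_n + t_n).
def translation(t):
    def F(x):
        return tuple(xi + ti for xi, ti in zip(x, t))
    return F

F = translation((1.0, -2.0, 0.5))
print(F((0.0, 0.0, 0.0)))  # -> (1.0, -2.0, 0.5)
print(F((2.0, 2.0, 2.0)))  # -> (3.0, 0.0, 2.5)
```
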
The above discussion shows that the perspective of vector calculus, in which we consider functions $f(v)$ of a single vector variable $v$, is equivalent to the perspective of multivariable calculus, in which we consider functions $F(x_1,\dots,x_n)$ of multiple scalar variables $x_1,\dots,x_n$. From this perspective, one might wonder about the prospect of a "multivector" calculus in which we consider functions of multiple vector variables $v_1,\dots,v_k$, where it may even be that each vector variable $v_i$ ranges over its own Euclidean space $\mathcal{V}_i$. In fact, this is already included in vector calculus, because such $k$-tuples of vectors are themselves single vectors in an enlarged Euclidean space.

**Definition 1:** Given Euclidean spaces $\mathcal{V}_1,\dots,\mathcal{V}_k$, their direct product is the Euclidean space consisting of the Cartesian product

$$\mathcal{V}_1\times\dots\times\mathcal{V}_k=\{(v_1,\dots,v_k)\colon v_1\in\mathcal{V}_1,\dots,v_k\in\mathcal{V}_k\}$$

with vector addition and scalar multiplication defined component-wise, i.e.

$$(u_1,\dots,u_k)+(v_1,\dots,v_k)=(u_1+v_1,\dots,u_k+v_k)$$

and

$$a(v_1,\dots,v_k)=(av_1,\dots,av_k),$$

and scalar product defined by

$$\langle(u_1,\dots,u_k),(v_1,\dots,v_k)\rangle=\langle u_1,v_1\rangle+\dots+\langle u_k,v_k\rangle.$$

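The direct-product scalar product can be checked numerically: it agrees with the standard scalar product on the "flattened" vectors obtained by concatenation. A small sketch, assuming the hypothetical example $\mathbb{R}^2\times\mathbb{R}^3$:

```python
# Scalar product on a direct product V1 x V2, defined as the sum of the
# scalar products of the components (vectors as tuples of floats).
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def product_dot(uv, xy):
    (u, v), (x, y) = uv, xy
    return dot(u, x) + dot(v, y)

# A pair (u, v) in R^2 x R^3 is the same data as a single vector in R^5:
u, v = (1.0, 2.0), (0.0, 1.0, -1.0)
x, y = (3.0, 0.0), (2.0, 2.0, 1.0)
flat_uv = u + v   # concatenation: the "enlarged" vector in R^5
flat_xy = x + y
print(product_dot((u, v), (x, y)))  # -> 4.0
print(dot(flat_uv, flat_xy))        # -> 4.0 (the same scalar product)
```
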
It is good to be comfortable with both perspectives; the former is better for conceptual understanding, while the latter is useful for visualization and calculation.

Thus the calculus of functions of two vector variables, $f(v_1,v_2)$, is just the calculus of functions on the direct product Euclidean space $\mathcal{V}_1\times\mathcal{V}_2$. Equivalently, the calculus of functions $f\colon\mathcal{V}_1\times\mathcal{V}_2\to\mathcal{W}$ is the same thing as the calculus of functions $f\colon\mathcal{V}\to\mathcal{W}$ with $\mathcal{V}=\mathcal{V}_1\times\mathcal{V}_2$. There is in fact a further generalization of vector calculus called tensor calculus, which is very useful in physics and engineering (particularly in the theory of relativity and in materials science), but that is beyond the scope of this course.

**Example 2:** It may be tempting to throw away the more abstract perspective entirely, and in the previous lectures I have been arguing against doing this by holding up the example of the function

$$\lambda\colon\operatorname{Sym}\mathcal{V}\to\mathbb{R}^n,$$

which sends each symmetric operator on $\mathcal{V}$ to the list of its eigenvalues arranged in weakly decreasing order,

$$\lambda(S)=(\lambda_1(S),\dots,\lambda_n(S)),\quad \lambda_1(S)\geq\dots\geq\lambda_n(S).$$

Conceptually, the function which sends a symmetric operator to its eigenvalues is very natural, and something you can hold in your mind quite easily. However, it is not easy to work concretely with this function by choosing coordinates. On Problem Set 1, we showed how the choice of a basis $e_1,\dots,e_n$ of $\mathcal{V}$ leads to a corresponding basis of $\operatorname{Sym}\mathcal{V}$, and in particular that if $\dim\mathcal{V}=n$ then $\dim\operatorname{Sym}\mathcal{V}=\binom{n+1}{2}$. So, according to our discussion above we have an associated function

$$\Lambda\colon\mathbb{R}^{\binom{n+1}{2}}\to\mathbb{R}^n.$$

Moreover, if we choose the standard basis of $\mathbb{R}^n$, then we have component functions

$$\lambda_1,\dots,\lambda_n\colon\operatorname{Sym}\mathcal{V}\to\mathbb{R},$$

which send a symmetric operator $S$ to its $i$th largest eigenvalue $\lambda_i(S)$, and writing down a formula for these functions in terms of the coordinates of $S$ relative to the basis of $\operatorname{Sym}\mathcal{V}$ amounts to writing down a formula for the eigenvalues of a symmetric matrix in terms of its entries; doing this for large $n$ is in a sense impossible. You'll work out the case on PSet 2. Again, this all points to the need to be able to do approximations a la calculus, a question which we return to now.
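
For the smallest nontrivial case $n=2$, such a formula does exist: the eigenvalues of a $2\times 2$ symmetric matrix are the roots of its characteristic polynomial, given by the quadratic formula. A sketch in Python (this is the standard computation, not the problem-set solution):

```python
import math

# Eigenvalues of the 2x2 symmetric matrix [[a, b], [b, c]] in terms of its
# entries, listed in weakly decreasing order. They are the roots of the
# characteristic polynomial x^2 - (a + c)x + (ac - b^2).
def eigenvalues_2x2(a, b, c):
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return (mean + radius, mean - radius)

print(eigenvalues_2x2(2.0, 1.0, 2.0))  # -> (3.0, 1.0)
```

Already for $3\times 3$ matrices the analogous closed form (Cardano's formula for the cubic) is far messier, and for degree five and beyond no formula in radicals exists.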

We now come back to our general discussion of functions $f\colon\mathcal{V}\to\mathcal{W}$ from one Euclidean space to another. In linear algebra, we consider only the case where $f$ is linear, and in that context what matters are associated vector spaces like the kernel and image of $f$. However, to study more general (i.e. non-linear) functions between Euclidean spaces we need a more general vocabulary that includes a larger variety of special subsets of Euclidean space.

**Definition 1:** Given a vector $v_0\in\mathcal{V}$ and a number $r\in\mathbb{R}$, the **open ball** of radius $r$ centered at $v_0$ is the subset of $\mathcal{V}$ defined by

$$B_r(v_0)=\{v\in\mathcal{V}\colon\|v-v_0\|<r\}.$$

This is the set of vectors whose distance to $v_0$ is strictly less than $r$. Observe that $B_r(v_0)$ is the empty set unless $r>0$.

In terms of open balls, the continuity of a function $f\colon\mathcal{V}\to\mathcal{W}$ at a point $v_0\in\mathcal{V}$ may be formulated as follows: $f$ is continuous at $v_0$ if and only if it has the property that, for any given $\varepsilon>0$, there is a corresponding $\delta>0$ such that the image of $B_\delta(v_0)$ under $f$ is contained in $B_\varepsilon(f(v_0))$.

**Definition 2:** A subset $O$ of a Euclidean space $\mathcal{V}$ is said to be **open** if for any $v\in O$ there exists $r>0$ such that $B_r(v)\subseteq O$. A subset $C$ of $\mathcal{V}$ is said to be **closed** if its complement $\mathcal{V}\setminus C$ is open.

There is a characterization of continuous functions in terms of open and closed sets.

**Theorem 1:** A function $f\colon\mathcal{V}\to\mathcal{W}$ is continuous if and only if the preimage $f^{-1}(O)$ of any open set $O\subseteq\mathcal{W}$ is open. Equivalently, $f$ is continuous if and only if the preimage $f^{-1}(C)$ of any closed set $C\subseteq\mathcal{W}$ is closed.

We won’t use Theorem 1 much, so we shall skip the proof – you will see this result again in a real analysis course.

We can also characterize continuity of a function as continuity of its components.

**Theorem 2:** A function $f\colon\mathcal{V}\to\mathcal{W}$ is continuous at $v_0\in\mathcal{V}$ if and only if its component functions $f_1,\dots,f_m$ relative to an arbitrary orthonormal basis of $\mathcal{W}$ are continuous at $v_0$.

**Definition 3:** A set $S\subseteq\mathcal{V}$ is said to be **bounded** if there exists $r>0$ such that $S\subseteq B_r(0)$, i.e. if $\|v\|<r$ for all $v\in S$.

**Definition 4:** A set $K\subseteq\mathcal{V}$ is said to be **compact** if it is closed and bounded.

**Theorem (Extreme Value Theorem):** If $K\subseteq\mathcal{V}$ is compact, every continuous function $f\colon K\to\mathbb{R}$ attains a maximum and a minimum: there are points $v_{\max},v_{\min}\in K$ such that

$$f(v_{\min})\leq f(v)\leq f(v_{\max})\quad\text{for all } v\in K.$$

The point $v_{\max}$ is said to be a maximizer of $f$, and $v_{\min}$ is said to be a minimizer of $f$.

There is a particular situation in which we can say more about maximizers and minimizers. Recall that a linear combination of vectors $v_1,\dots,v_k\in\mathcal{V}$ is an expression of the form

$$a_1v_1+\dots+a_kv_k,$$

where $a_1,\dots,a_k$ are arbitrary scalars, and that the linear span of $v_1,\dots,v_k$ is the subset of $\mathcal{V}$ consisting of all linear combinations of these vectors,

$$\operatorname{span}(v_1,\dots,v_k)=\{a_1v_1+\dots+a_kv_k\colon a_1,\dots,a_k\in\mathbb{R}\}.$$

There is a constrained version of this in which we consider only linear combinations whose scalar coefficients are nonnegative and sum to $1$. These special linear combinations of $v_1,\dots,v_k$ are called **convex combinations**, and the set

$$\operatorname{conv}(v_1,\dots,v_k)=\{a_1v_1+\dots+a_kv_k\colon a_1,\dots,a_k\geq 0,\ a_1+\dots+a_k=1\}$$

of all convex combinations of $v_1,\dots,v_k$ is called their **convex hull**. For example, the convex hull of two vectors $v_1,v_2$ may be visualized as the line segment whose endpoints are $v_1$ and $v_2$, while the convex hull of three vectors $v_1,v_2,v_3$ may be visualized as the triangular region whose vertices are $v_1,v_2,v_3$.

**Theorem (Convex Optimization Theorem):** For any finite set of vectors $v_1,\dots,v_k\in\mathcal{V}$, every linear function $f\colon\mathcal{V}\to\mathbb{R}$, viewed as a function on $\operatorname{conv}(v_1,\dots,v_k)$, has a maximizer and a minimizer in $\{v_1,\dots,v_k\}$.
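
The content of the theorem can be seen numerically: a linear function evaluated at any convex combination never exceeds its maximum over the generating points. A hypothetical sketch, with a made-up linear function and generators:

```python
import random

# A linear function on R^2 and a finite set of generating points.
def f(v):
    return 2.0 * v[0] - v[1]

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vertex_max = max(f(p) for p in points)  # maximum over the generators

# Sample random convex combinations (nonnegative coefficients summing to 1):
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in points]
    s = sum(w)
    a = [wi / s for wi in w]
    v = (sum(ai * p[0] for ai, p in zip(a, points)),
         sum(ai * p[1] for ai, p in zip(a, points)))
    assert f(v) <= vertex_max + 1e-12  # never beats a vertex

print(vertex_max)  # -> 2.0, attained at the vertex (1, 0)
```

The reason is linearity: $f(a_1v_1+\dots+a_kv_k)=a_1f(v_1)+\dots+a_kf(v_k)$ is a weighted average of the values at the generators, hence at most the largest of them.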

Here is a very interesting example of a convex hull. Let $\mathcal{V}$ be a Euclidean space, and let $e_1,\dots,e_n$ be an orthonormal basis in $\mathcal{V}$. Recall that a permutation is a bijective function

$$\pi\colon\{1,\dots,n\}\to\{1,\dots,n\}.$$

If we write the table of values of such a function as a matrix,

$$\begin{pmatrix}1&2&\dots&n\\ \pi(1)&\pi(2)&\dots&\pi(n)\end{pmatrix},$$

then the bottom row of the matrix consists of the numbers $1,\dots,n$ arranged in some order. For example, in the case $n=3$, the permutations consist of the following matrices:

$$\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix},\ \begin{pmatrix}1&2&3\\1&3&2\end{pmatrix},\ \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix},\ \begin{pmatrix}1&2&3\\2&3&1\end{pmatrix},\ \begin{pmatrix}1&2&3\\3&1&2\end{pmatrix},\ \begin{pmatrix}1&2&3\\3&2&1\end{pmatrix}.$$

For each permutation $\pi$, the associated **permutation operator** $P_\pi$ on $\mathcal{V}$ is defined by its action on the basis $e_1,\dots,e_n$, which is given by

$$P_\pi e_j=e_{\pi(j)},\quad j=1,\dots,n.$$

For example, in the case $n=3$, the matrices of these operators relative to the basis $e_1,e_2,e_3$ are, in the same order, as follows:

$$\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix},\ \begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix},\ \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix},\ \begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix},\ \begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix},\ \begin{pmatrix}0&0&1\\0&1&0\\1&0&0\end{pmatrix}.$$

Since these matrices have exactly one $1$ in every row and column, with all other entries $0$, they obviously have the property that each row and column sums to $1$. There are many more such matrices, however, and the following theorem characterizes them as the convex hull of the permutation matrices.

**Theorem (Birkhoff-von Neumann theorem):** The convex hull of the permutation operators $P_\pi$ consists of all operators on $\mathcal{V}$ whose matrices relative to the basis $e_1,\dots,e_n$ have nonnegative entries, and whose rows and columns each sum to $1$.
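
A quick numerical illustration of the easy direction of the theorem: any convex combination of permutation matrices has nonnegative entries and rows and columns summing to $1$. This is a hypothetical sketch; permutations are written as functions of $\{0,1,2\}$, following Python's zero-based indexing.

```python
# Permutation matrices for two permutations of {0, 1, 2} and a convex
# combination of them; the result has nonnegative entries and each row
# and column sums to 1 (a "doubly stochastic" matrix).
def perm_matrix(pi):
    n = len(pi)
    return [[1.0 if i == pi[j] else 0.0 for j in range(n)] for i in range(n)]

P = perm_matrix([0, 1, 2])   # identity permutation
Q = perm_matrix([1, 2, 0])   # cyclic shift
A = [[0.25 * P[i][j] + 0.75 * Q[i][j] for j in range(3)] for i in range(3)]

row_sums = [sum(row) for row in A]
col_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)  # -> [1.0, 1.0, 1.0] [1.0, 1.0, 1.0]
```

The hard direction, that every such matrix is a convex combination of permutation matrices, is the substance of the theorem.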

We now have everything we need to prove that the eigenvalue map $\lambda\colon\operatorname{Sym}\mathcal{V}\to\mathbb{R}^n$ is continuous. In fact, we prove the following stronger result.

**Theorem (Hoffman-Wielandt inequality):** Let $\lambda\colon\operatorname{Sym}\mathcal{V}\to\mathbb{R}^n$ be the function which sends each symmetric operator on $\mathcal{V}$ to its eigenvalues listed in weakly decreasing order. Then, for all symmetric operators $S,T$ on $\mathcal{V}$, we have

$$\|\lambda(S)-\lambda(T)\|\leq\|S-T\|.$$

*Proof:* Let us make sure we understand which norms we are using. On the LHS of the inequality, we have the norm on $\mathbb{R}^n$ corresponding to the standard inner product, so that if

$$\lambda(S)=(\lambda_1(S),\dots,\lambda_n(S))\quad\text{and}\quad\lambda(T)=(\lambda_1(T),\dots,\lambda_n(T)),$$

then

$$\|\lambda(S)-\lambda(T)\|=\sqrt{\sum_{i=1}^n(\lambda_i(S)-\lambda_i(T))^2},$$

or equivalently

$$\|\lambda(S)-\lambda(T)\|^2=\sum_{i=1}^n\lambda_i(S)^2-2\sum_{i=1}^n\lambda_i(S)\lambda_i(T)+\sum_{i=1}^n\lambda_i(T)^2.$$

On the right hand side, we have the Frobenius norm for operators,

$$\|S-T\|=\sqrt{\langle S-T,S-T\rangle}=\sqrt{\mathrm{Tr}\,(S-T)^2}.$$

Observing that

$$\|S-T\|^2=\mathrm{Tr}\,S^2-2\,\mathrm{Tr}\,ST+\mathrm{Tr}\,T^2,\qquad \mathrm{Tr}\,S^2=\sum_{i=1}^n\lambda_i(S)^2,\qquad \mathrm{Tr}\,T^2=\sum_{i=1}^n\lambda_i(T)^2,$$

we thus have that proving $\|\lambda(S)-\lambda(T)\|\leq\|S-T\|$, which is our objective, is equivalent to proving

$$\mathrm{Tr}\,ST\leq\sum_{i=1}^n\lambda_i(S)\lambda_i(T),$$

which we do now.

Since $S$ is a symmetric operator on $\mathcal{V}$, by the Spectral Theorem there exists an orthonormal basis $s_1,\dots,s_n$ of $\mathcal{V}$ such that

$$Ss_i=\lambda_i(S)s_i,\quad i=1,\dots,n.$$

From Lecture 1, we have the formula

$$\mathrm{Tr}\,A=\sum_{i=1}^n\langle s_i,As_i\rangle,$$

so that

$$\mathrm{Tr}\,ST=\sum_{i=1}^n\langle Ss_i,Ts_i\rangle=\sum_{i=1}^n\lambda_i(S)\langle s_i,Ts_i\rangle.$$

Invoking the Spectral Theorem again, there is an orthonormal basis $t_1,\dots,t_n$ of $\mathcal{V}$ such that

$$Tt_j=\lambda_j(T)t_j,\quad j=1,\dots,n.$$

We then have that

$$\langle s_i,Ts_i\rangle=\sum_{j=1}^n\lambda_j(T)\langle s_i,t_j\rangle^2,$$

and plugging this into the above we finally have

$$\mathrm{Tr}\,ST=\sum_{i=1}^n\sum_{j=1}^n\lambda_i(S)\lambda_j(T)\langle s_i,t_j\rangle^2.$$

Now observe that the matrix

$$M=[m_{ij}]_{i,j=1}^n,\quad m_{ij}=\langle s_i,t_j\rangle^2,$$

has nonnegative entries, and also each row and column of $M$ sums to $1$ (why?). Thus, if we define a function on the convex hull of the permutation matrices by

$$\ell(X)=\sum_{i=1}^n\sum_{j=1}^n\lambda_i(S)\lambda_j(T)x_{ij},\quad X=[x_{ij}]_{i,j=1}^n,$$

then this function is linear and we have $\mathrm{Tr}\,ST=\ell(M)$. It now follows from the convex optimization theorem together with the Birkhoff-von Neumann theorem that

$$\mathrm{Tr}\,ST=\ell(M)\leq\max_{\pi}\ell(P_\pi),$$

where the maximum is over all permutations $\pi$ of $\{1,\dots,n\}$. Evaluating the right hand side, we get that

$$\max_{\pi}\ell(P_\pi)=\max_{\pi}\sum_{j=1}^n\lambda_{\pi(j)}(S)\lambda_j(T)=\sum_{j=1}^n\lambda_j(S)\lambda_j(T),$$

where the final equality follows from the fact that $\lambda_1(S)\geq\dots\geq\lambda_n(S)$ and $\lambda_1(T)\geq\dots\geq\lambda_n(T)$, so that the maximum is achieved by pairing the eigenvalues of $S$ and $T$ in matching order.

Q.E.D.
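
The doubly stochastic matrix $M$ appearing in the proof can be illustrated numerically: taking for $s_1,s_2$ the standard basis of $\mathbb{R}^2$ and for $t_1,t_2$ a rotated copy of it, the matrix of squared scalar products has rows and columns summing to $1$. This is a hypothetical sketch; the rotation angle is arbitrary.

```python
import math

# Two orthonormal bases of R^2: the standard basis s and a rotated basis t.
theta = 0.7
s = [(1.0, 0.0), (0.0, 1.0)]
t = [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# M[i][j] = <s_i, t_j>^2: nonnegative, rows and columns sum to 1 (Parseval).
M = [[dot(si, tj) ** 2 for tj in t] for si in s]
print([abs(sum(row) - 1.0) < 1e-12 for row in M])  # -> [True, True]
print([abs(M[0][j] + M[1][j] - 1.0) < 1e-12 for j in range(2)])  # -> [True, True]
```

The row sums are $\sum_j\langle s_i,t_j\rangle^2=\|s_i\|^2=1$, which also answers the "(why?)" in the proof.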

**Corollary 1:** The eigenvalue function $\lambda\colon\operatorname{Sym}\mathcal{V}\to\mathbb{R}^n$ is continuous.

*Proof:* A function $f\colon X\to Y$ from one metric space to another is said to be a "contraction" if

$$d_Y(f(x_1),f(x_2))\leq d_X(x_1,x_2)\quad\text{for all } x_1,x_2\in X.$$

Thus, a contraction is a function which brings points closer together, or more precisely doesn't spread them farther apart. It is immediate to check that the definition of continuity holds for any contraction (in fact, we can choose $\delta=\varepsilon$ when checking continuity). The Hoffman-Wielandt inequality says that the eigenvalue mapping $\lambda$ is a contraction: the distance between the eigenvalue vectors of symmetric operators $S$ and $T$ is at most the Frobenius distance between $S$ and $T$.

Q.E.D.
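
As a sanity check, the Hoffman-Wielandt inequality can be verified numerically in the $2\times 2$ case, where the eigenvalue map has an explicit formula. This is a hypothetical sketch; symmetric matrices $\begin{pmatrix}a&b\\b&c\end{pmatrix}$ are encoded as triples $(a,b,c)$.

```python
import math

# Eigenvalues of [[a, b], [b, c]] in weakly decreasing order.
def eig(a, b, c):
    m = (a + c) / 2.0
    r = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return (m + r, m - r)

# Check ||lambda(S) - lambda(T)|| <= ||S - T||_F for symmetric S, T.
def check(S, T):
    lS, lT = eig(*S), eig(*T)
    lhs = math.sqrt((lS[0] - lT[0]) ** 2 + (lS[1] - lT[1]) ** 2)
    a, b, c = (S[0] - T[0], S[1] - T[1], S[2] - T[2])
    rhs = math.sqrt(a * a + 2 * b * b + c * c)  # Frobenius norm of S - T
    return lhs <= rhs + 1e-12

print(check((2.0, 1.0, 2.0), (0.0, 0.0, 1.0)))   # -> True
print(check((5.0, -3.0, 1.0), (1.0, 2.0, 4.0)))  # -> True
```
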
