Math 31BH: Lecture 4

This post is a little shorter than the previous ones because I went back and re-worked parts of Lecture 3 substantially to add more detail and clarity, and I also added some additional material which upon reflection is best housed in Lecture 3. So, as part of reading this post, you should go back and make a second pass through Lecture 3.

What we have succeeded in doing so far is defining limits and continuity for functions f \colon \mathbf{V} \to \mathbf{W} which map between Euclidean spaces, so we have the first two core notions of calculus squared away (we saw a fairly involved example, the eigenvalue mapping on symmetric operators, which showed that understanding continuity for functions which are hard to describe explicitly can be hard).

The remaining core notions of calculus are the big ones: differentiation and integration. In Math 31BH we develop differential calculus for vector-valued functions of vectors, and in Math 31CH you will concentrate on the development of integral calculus for such functions.

Let us begin with the familiar setting of functions f \colon \mathbb{R} \to \mathbb{R}. First of all, we want to consider functions which may not be defined on all of \mathbb{R}, but only on some subset D \subseteq \mathbb{R}, the domain of f. For example, the square root function f(x)=\sqrt{x} has domain D being the set of nonnegative numbers, while the logarithm function f(x)=\log x has domain D being the set of positive numbers. So in our general setup we are considering functions f \colon D \to \mathbf{W} where D is a subset of a Euclidean space \mathbf{V} which does not necessarily exhaust that space. This extension is completely non-problematic, as is the extended notion of image of a function.

Definition 1: The image of a function f \colon D \to \mathbf{W} with domain D \subseteq \mathbf{V} is the set of outputs of f in \mathbf{W},

\mathrm{Im}(f) = \{\mathbf{w} \in \mathbf{W} \colon f(\mathbf{v})=\mathbf{w} \text{ for some } \mathbf{v} \in D\}.

Now let us talk about graphs of functions, the precise definition of which involves the direct product of Euclidean spaces, a concept introduced in Lecture 3.

Proposition 1: If \dim \mathbf{V} = n and \dim \mathbf{W}=m, then \dim(\mathbf{V} \times \mathbf{W})=n+m.

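Proof: If \mathcal{B} =\{\mathbf{b}_1,\dots,\mathbf{b}_n\} is an orthonormal basis of \mathbf{V} and \mathcal{C}= \{\mathbf{c}_1,\dots,\mathbf{c}_m\} is an orthonormal basis of \mathbf{W}, then the set

\{(\mathbf{b}_i,\mathbf{0}_\mathbf{W}) \colon i \in \{1,\dots,n\}\} \cup \{(\mathbf{0}_\mathbf{V},\mathbf{c}_j) \colon j \in \{1,\dots,m\}\}

spans \mathbf{V} \times \mathbf{W}. This follows readily from the fact that \mathcal{B} spans \mathbf{V} and \mathcal{C} spans \mathbf{W}, together with the decomposition (\mathbf{v},\mathbf{w}) = (\mathbf{v},\mathbf{0}_\mathbf{W}) + (\mathbf{0}_\mathbf{V},\mathbf{w}); make sure you understand why. Moreover, the scalar product on \mathbf{V} \times \mathbf{W} is given by

\langle (\mathbf{v}_1,\mathbf{w}_1),(\mathbf{v}_2,\mathbf{w}_2)\rangle = \langle \mathbf{v}_1,\mathbf{v}_2\rangle + \langle \mathbf{w}_1,\mathbf{w}_2\rangle,

so that \langle (\mathbf{b}_i,\mathbf{0}_\mathbf{W}),(\mathbf{b}_k,\mathbf{0}_\mathbf{W})\rangle = \langle \mathbf{b}_i,\mathbf{b}_k\rangle vanishes unless i=k, \langle (\mathbf{0}_\mathbf{V},\mathbf{c}_j),(\mathbf{0}_\mathbf{V},\mathbf{c}_l)\rangle = \langle \mathbf{c}_j,\mathbf{c}_l\rangle vanishes unless j=l, and \langle (\mathbf{b}_i,\mathbf{0}_\mathbf{W}),(\mathbf{0}_\mathbf{V},\mathbf{c}_j)\rangle = 0 for all i and j. Thus the above set is an orthogonal set of n+m nonzero vectors, and in particular it is a linearly independent set in \mathbf{V} \times \mathbf{W}. Being a linearly independent spanning set, it is a basis of \mathbf{V} \times \mathbf{W}, which therefore has dimension n+m.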
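As a numerical sanity check of Proposition 1, here is a minimal sketch in the case n=m=2. It assumes we identify \mathbf{V} \times \mathbf{W} with \mathbb{R}^2 \times \mathbb{R}^2 by concatenating coordinates; the rotation angles and the helper names are arbitrary choices made for illustration. It builds the pairs (\mathbf{b}_i,\mathbf{0}) and (\mathbf{0},\mathbf{c}_j) and checks that their matrix of pairwise scalar products is the identity, so that they form an orthonormal set of n+m=4 vectors.

```python
import math

def inner(u, v):
    """Standard scalar product on R^k."""
    return sum(a * b for a, b in zip(u, v))

def rotated_basis(theta):
    """An orthonormal basis of R^2 obtained by rotating the standard basis."""
    return [(math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta))]

B = rotated_basis(0.3)  # orthonormal basis of V = R^2 (angle chosen arbitrarily)
C = rotated_basis(1.1)  # orthonormal basis of W = R^2 (angle chosen arbitrarily)

# A pair (v, w) in V x W is stored as the concatenated tuple v + w, so the
# scalar product of two pairs is <v1, v2> + <w1, w2>.
zero = (0.0, 0.0)
candidates = [b + zero for b in B] + [zero + c for c in C]

# Matrix of all pairwise scalar products; it should be the 4 x 4 identity.
gram = [[inner(x, y) for y in candidates] for x in candidates]
is_orthonormal = all(
    abs(gram[i][j] - (1.0 if i == j else 0.0)) < 1e-12
    for i in range(len(candidates))
    for j in range(len(candidates))
)
print(len(candidates), is_orthonormal)
```

Of course this checks only one instance and proves nothing, but it makes the count n+m concrete.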


Definition 2: The graph of a function f \colon D \to \mathbf{W} with domain D \subseteq \mathbf{V} is the set of all input-output pairs for f, i.e. the subset of \mathbf{V} \times \mathbf{W} defined by

\Gamma(f) = \{(\mathbf{v},f(\mathbf{v})) \colon \mathbf{v} \in D\}.

This agrees with the informal definition of a graph you have known for a long time as a drawing of f on a piece of paper: for functions f \colon \mathbb{R} \to \mathbb{R}, we have \Gamma(f) \subset \mathbb{R} \times \mathbb{R} = \mathbb{R}^2. In the general case, the graph of f \colon \mathbf{V} \to \mathbf{W} is a harder object to understand, and this is not just because \mathbf{V} and \mathbf{W} are abstract Euclidean spaces. Indeed, even if we work in coordinates as described in Lecture 3, the associated function

f_{\mathcal{BC}} \colon \mathbb{R}^n \to \mathbb{R}^m

has graph

\Gamma(f_{\mathcal{BC}}) \subset \mathbb{R}^n \times \mathbb{R}^m = \mathbb{R}^{n+m},

which may be difficult to visualize if \max(n,m)>1.

Now we come to the real sticking point, the Newton quotient. If D \subseteq \mathbb{R} is an open set and f \colon D \to \mathbb{R} is a function, then for any x \in D the ratio

\Delta_hf(x) = \frac{f(x+h)-f(x)}{x+h-x} =  \frac{f(x+h)-f(x)}{h}

is well-defined for any sufficiently small nonzero number h. Moreover, this number has an immediate intuitive meaning in terms of the graph \Gamma(f) \subset \mathbb{R}^2: it is the slope of the secant line in \mathbb{R}^2 passing through the points (x,f(x)),(x+h,f(x+h)) \in \Gamma(f). We then say that f is differentiable at the point x \in D if the limit

f'(x) = \lim\limits_{h \to 0} \Delta_h f(x)

exists, in which case the number f'(x) is called the derivative of f at x; it is the slope of the tangent line to \Gamma(f) at the point (x,f(x)) \in \Gamma(f).
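To see the limit numerically, here is a minimal sketch that evaluates the Newton quotient of the square root function at x=1 for shrinking values of h. The function name newton_quotient and the choice of sample point are ours, made only for illustration. Since the derivative of \sqrt{x} at x=1 is 1/2, the printed slopes of the secant lines should approach 0.5.

```python
import math

def newton_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x+h, f(x+h))."""
    if h == 0:
        raise ValueError("the Newton quotient is undefined for h = 0")
    return (f(x + h) - f(x)) / h

# Newton quotients of the square root at x = 1 for h = 10^-1, ..., 10^-6.
quotients = [newton_quotient(math.sqrt, 1.0, 10.0 ** (-k)) for k in range(1, 7)]

for k, q in enumerate(quotients, start=1):
    print(f"h = 1e-{k}: {q:.8f}")
```

Note that the quotient is never evaluated at h=0 itself; only its behaviour as h shrinks matters.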

Generalizing the definition of the derivative to functions which map vectors to vectors is problematic from the outset. Let D be an open set in a Euclidean space \mathbf{V}, and let f \colon D \to \mathbf{V} be a function defined on D. For any \mathbf{v} \in D, we have B_\delta(\mathbf{v}) \subseteq D for sufficiently small \delta>0, so that the difference

f(\mathbf{v}+\mathbf{h}) - f(\mathbf{v})

makes sense for any \mathbf{h} \in \mathbf{V} with \|\mathbf{h}\|< \delta. However, when we attempt to form the corresponding difference quotient, we get the fraction

\Delta_\mathbf{h}f(\mathbf{v})= \frac{f(\mathbf{v}+\mathbf{h})-f(\mathbf{v})}{\mathbf{h}},

which is problematic since at no time in Math 31AH up until now have we defined what it means to divide two vectors in a vector space \mathbf{V}. As we discussed in Lecture 2, a notion of vector division in some sense only exists for \mathbf{V}=\mathbb{R}, in which case vectors are real numbers, and \mathbf{V} = \mathbb{R}^2, in which case \mathbb{R}^2 can be identified with the complex numbers \mathbb{C}, for which division is meaningful. The former case gives us back the usual calculus derivative, and the latter gives us a notion of derivative for functions f \colon \mathbb{C} \to \mathbb{C}, which is the starting point of the subject known as complex analysis. Complex analysis is a beautiful and useful subject, but our world is not two-dimensional, and we would like to have access to calculus in dimensions higher than two. Moreover, we want to consider functions f \colon D \to \mathbf{W}, where \mathbf{W} is distinct from the Euclidean space \mathbf{V} containing the domain D of f. In such a setting the Newton quotient becomes even more heretical, since it involves division of a vector in \mathbf{W} by a vector in \mathbf{V}.

We will have to work hard to resolve the philosophical impediments to differentiation of vector-valued functions of vectors. However, there is a natural starting point for this quest, namely the differentiation of vector-valued functions of scalars. Indeed, if D \subseteq \mathbb{R} is an open set of real numbers and f \colon D \to \mathbf{W} is a function from D into a Euclidean space \mathbf{W}, then for t \in D and h \in \mathbb{R} sufficiently small the Newton quotient

\Delta_h f(t) = \frac{f(t+h)-f(t)}{h}

makes perfectly good sense: it is the vector f(t+h)-f(t) \in \mathbf{W} scaled by the number \frac{1}{h}. So vector-valued functions of scalars are a good place to start.

We will work in the setting where f \colon [a,b] \to \mathbf{W} is a function whose domain is a closed interval in \mathbb{R}. In this case, the image of f is said to be a curve in \mathbf{W}, and by abuse of language we may refer to f itself as a curve in \mathbf{W}; it may be thought of as the path described by a particle located at the point f(a) \in \mathbf{W} at time t=a, and located at the point f(b) \in \mathbf{W} at time t=b.

Definition 3: A function f \colon [a,b] \to \mathbf{W} is said to be differentiable at a point t \in (a,b) if the limit

f'(t) = \lim\limits_{h \to 0} \frac{f(t+h)-f(t)}{h}

exists. In this case, the vector f'(t) \in \mathbf{W} is said to be the derivative of f at t. In full detail, this means that f'(t) \in \mathbf{W} is a vector with the following property: for any \varepsilon > 0, there exists a corresponding \delta > 0 such that

0 < |h| < \delta \implies \left\|f'(t)-\frac{1}{h}(f(t+h)-f(t))\right\|<\varepsilon,

where \|\mathbf{w}\|=\sqrt{\langle \mathbf{w},\mathbf{w}\rangle} is the Euclidean norm in \mathbf{W}.

Note that the component functions f_{\mathcal{C}1},\dots,f_{\mathcal{C}m} of a curve f \colon [a,b] \to \mathbf{W} relative to an orthonormal basis \mathcal{C}=\{\mathbf{c}_1,\dots,\mathbf{c}_m\} of \mathbf{W} are scalar-valued functions of the scalar “time variable” t, i.e.

f_{\mathcal{C}i} \colon [a,b] \to \mathbb{R}, \quad i=1,\dots,m.

This is part of what makes curves easier to study than general vector-valued functions: they are just m-tuples of functions \mathbb{R} \to \mathbb{R}, for which we already have a well-developed calculus at our disposal.

Theorem 1: Let f \colon [a,b] \to \mathbf{W} be a curve, and let \mathcal{C}=\{\mathbf{c}_1,\dots,\mathbf{c}_m\} be an orthonormal basis of \mathbf{W}. Then f is differentiable at time t \in (a,b) if and only if its component functions f_1,\dots,f_m relative to \mathcal{C} are differentiable at time t, and in this case we have

f'(t) = f_1'(t)\mathbf{c}_1 +\dots+ f_m'(t)\mathbf{c}_m.

Proof: The components of the vector-valued Newton quotient

\Delta_hf(t)=\frac{1}{h}(f(t+h)-f(t))

relative to the basis \mathcal{C} are the scalar-valued Newton quotients

\Delta_hf_i(t)=\frac{1}{h}(f_i(t+h)-f_i(t)), \quad i=1,\dots,m.

The statement now follows from Proposition 1 in Lecture 3.
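The key step of the proof can be checked numerically. The following is a minimal sketch, assuming \mathbf{W}=\mathbb{R}^3 with its standard basis and a curve of our own choosing (the components t^2, e^t, \sin t are hypothetical, picked only because their derivatives are familiar). It compares the coordinates of the vector-valued Newton quotient with the scalar Newton quotients of the component functions.

```python
import math

def vector_newton_quotient(curve, t, h):
    """Coordinates of (1/h) * (f(t+h) - f(t)) for a curve returning a tuple."""
    return tuple((a - b) / h for a, b in zip(curve(t + h), curve(t)))

def scalar_newton_quotient(g, t, h):
    """Newton quotient of a scalar-valued function g at time t."""
    return (g(t + h) - g(t)) / h

# Hypothetical component functions f_1, f_2, f_3 of a curve in R^3.
components = [lambda t: t * t, math.exp, math.sin]

def curve(t):
    """The curve f(t) = (f_1(t), f_2(t), f_3(t))."""
    return tuple(g(t) for g in components)

t, h = 0.5, 1e-3
vector_side = vector_newton_quotient(curve, t, h)
scalar_side = tuple(scalar_newton_quotient(g, t, h) for g in components)

# Derivatives of the components at t, known from elementary calculus.
expected = (2 * t, math.exp(t), math.cos(t))

print(vector_side)
print(scalar_side)
print(expected)
```

The first two printed tuples agree coordinate by coordinate, and both are close to the third for small h, which is what Theorem 1 predicts.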


Example 1: Let \mathbf{W} be a 2-dimensional Euclidean space with orthonormal basis \mathcal{C}=\{\mathbf{c}_1,\mathbf{c}_2\}. Consider the function f \colon [0,2\pi] \to \mathbf{W} defined by

f(t) = \cos t\mathbf{c}_1+ \sin t\mathbf{c}_2.

It is hopefully immediately apparent to you that the image of f in \mathbf{W} is the unit circle in this Euclidean space,

\mathrm{Im}(f) =\{\mathbf{w} \in \mathbf{W} \colon \|\mathbf{w}\|=1\}.

The graph

\Gamma(f) = \{(t,f(t)) \colon t \in [0,2\pi]\} \subset \mathbb{R} \times \mathbf{W}

is a helix. The component functions of f in the basis \mathcal{C}=\{\mathbf{c}_1,\mathbf{c}_2\} of \mathbf{W} are

f_1(t) = \cos t \quad\text{ and }\quad f_2(t) = \sin t,

and as you know from elementary calculus these are differentiable functions with derivatives

f_1'(t) = -\sin t \quad\text{ and }\quad f_2'(t)= \cos t.

Thus, the curve f(t)=f_1(t)\mathbf{c}_1+f_2(t)\mathbf{c}_2 is differentiable, and its derivative is

f'(t) = -\sin t\mathbf{c}_1 + \cos t \mathbf{c}_2.

Equivalently, the coordinate vector of the vector f'(t)\in \mathbf{W} in the basis \mathcal{C} is

[f'(t)]_{\mathcal{C}} = \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix}.
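Example 1 can be tested against Definition 3 directly. Here is a minimal sketch that works with coordinate vectors relative to \mathcal{C} and measures the Euclidean norm of the difference between the claimed derivative and the Newton quotient as h shrinks. The sample time t=1 and the helper names are our own choices.

```python
import math

def f(t):
    """Coordinates of f(t) = cos(t) c_1 + sin(t) c_2 relative to the basis C."""
    return (math.cos(t), math.sin(t))

def f_prime(t):
    """Coordinates of the claimed derivative -sin(t) c_1 + cos(t) c_2."""
    return (-math.sin(t), math.cos(t))

def norm(w):
    """Euclidean norm of a coordinate vector."""
    return math.sqrt(sum(x * x for x in w))

def newton_quotient(curve, t, h):
    """Coordinates of (1/h) * (curve(t+h) - curve(t))."""
    return tuple((a - b) / h for a, b in zip(curve(t + h), curve(t)))

t = 1.0
errors = []
for k in range(1, 6):
    h = 10.0 ** (-k)
    q = newton_quotient(f, t, h)
    # Norm of f'(t) - (1/h)(f(t+h) - f(t)), the quantity bounded by epsilon.
    errors.append(norm(tuple(a - b for a, b in zip(f_prime(t), q))))

for k, e in enumerate(errors, start=1):
    print(f"h = 1e-{k}: error = {e:.2e}")
```

The errors shrink along with h, consistent with the limit in Definition 3 existing and being equal to the vector computed above.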
