Math 31AH: Lecture 11

We met linear transformations for the first time back in Lecture 2, and have encountered them a handful of times in subsequent lectures. In this lecture, we begin a systematic study of linear transformations that will take up the rest of the course. As with the study of any new species, there will be a lot of classifying: we will divide up linear transformations into various classes defined by certain common features which to a large extent determine their behavior. Before classifying linear transformations into various special species and families, let us consider the ecosystem of all linear transformations as a collective whole.

Let \mathbf{V} and \mathbf{W} be vector spaces, and let \mathrm{Hom}(\mathbf{V},\mathbf{W}) be the set of all linear transformations T \colon \mathbf{V} \to \mathbf{W}. This seemingly strange notation stems from the fact that a fancy alternative name for linear transformations is homomorphisms, which roughly means “same shape” in Greek.

Theorem 1: \mathrm{Hom}(\mathbf{V},\mathbf{W}) is a vector space.

Proof: In order to promote \mathrm{Hom}(\mathbf{V},\mathbf{W}) from simply a set to a vector space, we need to give a rule for adding and scaling linear transformations. These rules are simple and natural. If T_1,T_2 \in \mathrm{Hom}(\mathbf{V},\mathbf{W}) are linear transformations and a_1,a_2 \in \mathbb{R} are scalars, we define a new linear transformation a_1T_1 + a_2T_2 by

[a_1T_1+a_2T_2]\mathbf{v} := a_1T_1\mathbf{v} + a_2T_2\mathbf{v} \quad \forall \mathbf{v} \in \mathbf{V}.

One must check first that the function from \mathbf{V} to \mathbf{W} so defined satisfies the linear transformation axioms, and then that the set \mathrm{Hom}(\mathbf{V},\mathbf{W}) equipped with these notions of addition and scalar multiplication satisfies the vector space axioms. This is left as an exercise to the reader (it is a long but easy exercise which you should do at least once).

— Q.E.D.
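Nothing in the proof depends on a particular model of \mathbf{V} and \mathbf{W}, but it can be reassuring to check the defining rule numerically. The sketch below assumes the standard identification of linear transformations from \mathbb{R}^3 to \mathbb{R}^2 with 2 \times 3 matrices; the particular matrices and scalars are arbitrary choices, not anything from the lecture.

```python
import numpy as np

# Illustrative elements of Hom(R^3, R^2), modeled as 2x3 matrices.
T1 = np.array([[1.0, 0.0, 2.0],
               [0.0, 1.0, -1.0]])
T2 = np.array([[0.0, 3.0, 1.0],
               [1.0, 1.0, 0.0]])
a1, a2 = 2.0, -5.0

# The linear combination a1*T1 + a2*T2 is again a 2x3 matrix,
# i.e. again an element of Hom(R^3, R^2).
S = a1 * T1 + a2 * T2

# The defining rule [a1 T1 + a2 T2]v = a1(T1 v) + a2(T2 v) holds for any v.
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(S @ v, a1 * (T1 @ v) + a2 * (T2 @ v))
```

Note that the same componentwise addition and scaling of matrices is exactly the vector space structure on \mathrm{Hom}(\mathbf{V},\mathbf{W}) in the finite-dimensional case, a point we will return to in Lecture 12.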

Now let us start asking questions about the vectors in \mathrm{Hom}(\mathbf{V},\mathbf{W}), i.e. considering what properties a linear transformation T from \mathbf{V} to \mathbf{W} may or may not have. Given T \in \mathrm{Hom}(\mathbf{V},\mathbf{W}), one of the first things we would like to know is whether it is injective and/or surjective. These questions reduce to questions about certain subspaces of \mathbf{V} and \mathbf{W} associated to T.

Definition 1: The kernel of T is the subset of \mathbf{V} defined by

\mathrm{Ker}T = \{\mathbf{v} \in \mathbf{V} \colon T(\mathbf{v}) = \mathbf{0}_\mathbf{W}\},

and the image of T is the subset of \mathbf{W} defined by

\mathrm{Im}T = \{ \mathbf{w} \in \mathbf{W} \colon \exists\ \mathbf{v} \in \mathbf{V} \text{ such that } T\mathbf{v} = \mathbf{w}\}.

You can think of the kernel and image of T as the solution sets of two vector equations associated to T. First, the kernel of T is the set of solutions to the equation

T\mathbf{v} = \mathbf{0}_\mathbf{W},

where \mathbf{v} is an unknown vector in \mathbf{V}. Second, the image of T is the set of all \mathbf{w} \in \mathbf{W} such that the equation

T\mathbf{v} = \mathbf{w}

has a solution.

Proposition 1: The kernel of T is a subspace of \mathbf{V}, and the image of T is a subspace of \mathbf{W}.

Proof: Since T is a linear transformation, it is by definition the case that T\mathbf{0}_\mathbf{V} = \mathbf{0}_\mathbf{W}, so that \mathbf{0}_\mathbf{V} \in \mathrm{Ker}T. Now let \mathbf{v}_1,\mathbf{v}_2 \in \mathrm{Ker}T be any two vectors in the kernel of T, and let a_1,a_2 \in \mathbb{R} be any two scalars. We then have

T(a_1\mathbf{v}_1 + a_2\mathbf{v}_2) = a_1T\mathbf{v}_1+a_2T\mathbf{v}_2=a_1\mathbf{0}_\mathbf{W} + a_2\mathbf{0}_\mathbf{W} = \mathbf{0}_\mathbf{W},

which shows that \mathrm{Ker}T is closed under taking linear combinations. Hence, \mathrm{Ker}T is a subspace of \mathbf{V}.

Now consider the image of T. Since T is a linear transformation, it is by definition the case that T\mathbf{0}_\mathbf{V} = \mathbf{0}_\mathbf{W}, so that \mathbf{0}_\mathbf{W} \in \mathrm{Im}T. Now let \mathbf{w}_1,\mathbf{w}_2 \in \mathrm{Im}T be any two vectors in the image of T. Then, by definition of \mathrm{Im}T, there exist vectors \mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V} such that T\mathbf{v}_1=\mathbf{w}_1,\ T\mathbf{v}_2=\mathbf{w}_2. So, for any scalars a_1,a_2 \in \mathbb{R}, we have that

a_1\mathbf{w}_1+a_2\mathbf{w}_2 = a_1T\mathbf{v}_1+a_2T\mathbf{v}_2 = T(a_1\mathbf{v}_1+a_2\mathbf{v}_2),

which shows that \mathrm{Im}T is closed under taking linear combinations. Hence, \mathrm{Im}T is a subspace of \mathbf{W}.

— Q.E.D.
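For matrix transformations, these two subspaces can be computed explicitly. The sketch below is an illustrative example, not part of the lecture: it uses the singular value decomposition purely as a convenient numerical tool for extracting bases of the kernel and image (row reduction would work just as well), applied to a 3 \times 3 matrix chosen to have linearly dependent rows so that its kernel is nonzero.

```python
import numpy as np

# An illustrative T : R^3 -> R^3 with linearly dependent rows, so Ker T != {0}.
T = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

U, s, Vt = np.linalg.svd(T)
tol = 1e-10
rank = int(np.sum(s > tol))  # number of (numerically) nonzero singular values

# Right singular vectors for zero singular values span Ker T;
# left singular vectors for nonzero singular values span Im T.
kernel_basis = Vt[rank:].T   # columns form a basis of Ker T
image_basis = U[:, :rank]    # columns form a basis of Im T

# Every basis vector of Ker T is mapped (numerically) to the zero vector.
assert np.allclose(T @ kernel_basis, 0.0, atol=1e-9)
assert rank == 2 and kernel_basis.shape[1] == 1  # dim Im T = 2, dim Ker T = 1
```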

Proposition 2: The linear transformation T is injective if and only if \mathrm{Ker}T = \{\mathbf{0}_\mathbf{V}\}, and surjective if and only if \mathrm{Im}T = \mathbf{W}.

Proof: First we consider the relationship between the injectivity of T and its kernel. Suppose that \mathrm{Ker}T = \{\mathbf{0}_\mathbf{V}\}. Let \mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V} be vectors such that T\mathbf{v}_1=T\mathbf{v}_2. This is equivalent to

T(\mathbf{v}_1 - \mathbf{v}_2) = \mathbf{0}_\mathbf{W},

which says that \mathbf{v}_1-\mathbf{v}_2 \in \mathrm{Ker}T. Since the only vector in \mathrm{Ker}T is \mathbf{0}_\mathbf{V}, this forces \mathbf{v}_1-\mathbf{v}_2=\mathbf{0}_\mathbf{V}, which means that \mathbf{v}_1=\mathbf{v}_2, and we conclude that T is injective. Conversely, suppose we know that T is injective. Let \mathbf{v}_1,\mathbf{v}_2 \in \mathrm{Ker}T be any two vectors in the kernel of T. Then T\mathbf{v}_1=T\mathbf{v}_2=\mathbf{0}_\mathbf{W}. Since T is injective, this forces \mathbf{v}_1=\mathbf{v}_2, which says that any two vectors in \mathrm{Ker}T are equal to one another. But this means that any vector in \mathrm{Ker}T is equal to \mathbf{0}_\mathbf{V}, and we conclude that \mathrm{Ker}T=\{\mathbf{0}_\mathbf{V}\}.

Now let us consider the relationship between the surjectivity of T and its image. Suppose that \mathrm{Im}T=\mathbf{W}. This is exactly what it means for the function T to be surjective: every element in the codomain of T is in fact in the range of T. Conversely, suppose we know that T is surjective. By definition of surjectivity, this means that \mathrm{Im}T=\mathbf{W}.

— Q.E.D.
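For matrix transformations, both criteria become rank conditions: identifying T with an m \times n matrix, \mathrm{Ker}T = \{\mathbf{0}\} is equivalent to the rank being n (the columns are linearly independent), and \mathrm{Im}T = \mathbf{W} is equivalent to the rank being m (the columns span \mathbb{R}^m). The following sketch, with arbitrarily chosen example matrices, illustrates this; these rank facts are standard but are not proved in this lecture.

```python
import numpy as np

# Illustrative examples: A maps R^2 -> R^3, B maps R^3 -> R^2.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # independent columns: injective, not surjective
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])  # columns span R^2: surjective, not injective

def is_injective(T):
    # Ker T = {0} iff the rank equals the dimension of the domain.
    return np.linalg.matrix_rank(T) == T.shape[1]

def is_surjective(T):
    # Im T = W iff the rank equals the dimension of the codomain.
    return np.linalg.matrix_rank(T) == T.shape[0]

assert is_injective(A) and not is_surjective(A)
assert is_surjective(B) and not is_injective(B)
```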

We can use the above propositions to gain some insight into linear equations, meaning equations of the form

T \mathbf{v}=\mathbf{w},

where T is a given linear transformation from \mathbf{V} to \mathbf{W}, \mathbf{w} is a given vector in \mathbf{W}, and \mathbf{v} is an unknown vector in \mathbf{V}.

Theorem 2: The solution set of the above linear equation is \mathbf{v}_0+\mathrm{Ker}T, i.e. the set of all vectors in \mathbf{V} of the form \mathbf{v}_0 + \mathbf{k}, where \mathbf{v}_0 is any particular vector such that T\mathbf{v}_0=\mathbf{w}, and \mathbf{k} \in \mathrm{Ker}T.

Proof: We begin by considering the homogeneous equation associated to the linear equation we want to solve, which is

T\mathbf{v} = \mathbf{0}_\mathbf{W}.

By definition, the solution set of this homogeneous equation is \mathrm{Ker}T. Now, if \mathbf{v}_0 \in \mathbf{V} is any vector such that T\mathbf{v}_0 = \mathbf{w}, then we also have

T(\mathbf{v}_0+\mathbf{k})= T\mathbf{v}_0+T\mathbf{k}=\mathbf{w}+\mathbf{0}_\mathbf{W} = \mathbf{w},

which shows that all vectors in \mathbf{v}_0+\mathrm{Ker}T are solutions of the equation T\mathbf{v}=\mathbf{w}.

Conversely, suppose that \mathbf{v}_1 is a vector in \mathbf{V} such that T\mathbf{v}_1=\mathbf{w}. Then, we have

T(\mathbf{v}_1-\mathbf{v}_0) = T\mathbf{v}_1-T\mathbf{v}_0=\mathbf{w}-\mathbf{w}=\mathbf{0}_\mathbf{W},

which shows that \mathbf{v}_1-\mathbf{v}_0 is a vector in \mathrm{Ker}T. That is, \mathbf{v}_1-\mathbf{v}_0=\mathbf{k} for some \mathbf{k} \in \mathrm{Ker}T, or equivalently \mathbf{v}_1=\mathbf{v}_0+\mathbf{k}. This shows that all solutions of T\mathbf{v}=\mathbf{w} belong to the set \mathbf{v}_0 + \mathrm{Ker}T.

— Q.E.D.
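The theorem above can be checked numerically for matrix transformations. In the sketch below, the matrix, the kernel vector, and the choice of \mathbf{w} are all illustrative assumptions: \mathbf{w} is manufactured to lie in \mathrm{Im}T so that a particular solution exists, and the least-squares routine is used only as a convenient way to produce one such \mathbf{v}_0.

```python
import numpy as np

# Illustrative T with a nontrivial kernel, so the solution set of T v = w
# is an affine set v0 + Ker T rather than a single point.
T = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
w = T @ np.array([1.0, 1.0, 1.0])  # choose w in Im T so a solution exists

# A particular solution v0 (least squares returns one since w is in Im T).
v0, *_ = np.linalg.lstsq(T, w, rcond=None)

# For this matrix, Ker T is spanned by k = (1, -2, 1).
k = np.array([1.0, -2.0, 1.0])
assert np.allclose(T @ k, 0.0)

# Every vector of the form v0 + c*k solves the equation, for any scalar c.
for c in [-3.0, 0.0, 2.5]:
    assert np.allclose(T @ (v0 + c * k), w)
```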

We can now make the following statements about solutions of the linear equation T\mathbf{v}=\mathbf{w}:

  • The equation has a solution if and only if \mathbf{w} \in \mathrm{Im}T.
  • If the equation has a solution, this solution is unique if and only if \mathrm{Ker}T = \{\mathbf{0}_\mathbf{V}\}.
  • If the equation has a solution but the solution is not unique, then the equation has infinitely many solutions: if \mathbf{k} \in \mathrm{Ker}T is nonzero, then \mathbf{v}_0 + c\mathbf{k} is a solution for every scalar c \in \mathbb{R}, and these solutions are pairwise distinct.

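These case distinctions can also be tested numerically for matrix transformations: \mathbf{w} \in \mathrm{Im}T exactly when appending \mathbf{w} as an extra column does not increase the rank (a standard fact about column spaces, not proved in this lecture). The matrix and vectors below are illustrative choices.

```python
import numpy as np

# An illustrative rank-1 matrix: both Ker T and Im T are lines in R^2.
T = np.array([[1.0, 2.0],
              [2.0, 4.0]])

w_good = np.array([1.0, 2.0])  # lies in Im T: solvable, infinitely many solutions
w_bad = np.array([1.0, 0.0])   # not in Im T: no solution at all

def has_solution(T, w):
    # w is in the column space of T iff appending w as a column
    # leaves the rank unchanged.
    return np.linalg.matrix_rank(np.column_stack([T, w])) == np.linalg.matrix_rank(T)

assert has_solution(T, w_good)
assert not has_solution(T, w_bad)
```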
Exercise 1: Arrange the above information into a tree diagram which shows the relationship between the various cases.

In Lecture 12, we will think in more detail about the structure of the vector space \mathrm{Hom}(\mathbf{V},\mathbf{W}) in the case that \mathbf{V} and \mathbf{W} are finite-dimensional. In the finite-dimensional case, we shall see that everything we might want to know can be expressed in the language of matrices.

Lecture 11 video

