Assuming familiarity with the geometric concept of a vector, we introduce the notion of a vector space: a set containing objects which can be added to one another and scaled by numbers in a manner which is formally consistent with the addition and scaling of geometric vectors. The vector space concept is the bedrock of linear algebra.

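**Definition 1:** A vector space is a triple $(V, a, s)$ consisting of a set $V$ together with two functions

$$a \colon V \times V \to V$$

and

$$s \colon \mathbb{R} \times V \to V,$$

where $\times$ denotes the Cartesian product of sets and $\mathbb{R}$ denotes the set of real numbers. The functions $a$ and $s$ are required to have the following properties:

1. For all $\mathbf{v}_1, \mathbf{v}_2 \in V$ we have $a(\mathbf{v}_1, \mathbf{v}_2) = a(\mathbf{v}_2, \mathbf{v}_1)$.
2. For all $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in V$ we have $a(\mathbf{v}_1, a(\mathbf{v}_2, \mathbf{v}_3)) = a(a(\mathbf{v}_1, \mathbf{v}_2), \mathbf{v}_3)$.
3. There exists $\mathbf{0} \in V$ such that $a(\mathbf{v}, \mathbf{0}) = \mathbf{v}$ for all $\mathbf{v} \in V$.
4. For every $\mathbf{v} \in V$ there exists $\mathbf{w} \in V$ such that $a(\mathbf{v}, \mathbf{w}) = \mathbf{0}$.
5. For every $\mathbf{v} \in V$ we have $s(1, \mathbf{v}) = \mathbf{v}$.
6. For every $t_1, t_2 \in \mathbb{R}$ and every $\mathbf{v} \in V$ we have $s(t_1, s(t_2, \mathbf{v})) = s(t_1 t_2, \mathbf{v})$.
7. For every $t_1, t_2 \in \mathbb{R}$ and every $\mathbf{v} \in V$ we have $s(t_1 + t_2, \mathbf{v}) = a(s(t_1, \mathbf{v}), s(t_2, \mathbf{v}))$.
8. For every $t \in \mathbb{R}$ and every $\mathbf{v}_1, \mathbf{v}_2 \in V$ we have $s(t, a(\mathbf{v}_1, \mathbf{v}_2)) = a(s(t, \mathbf{v}_1), s(t, \mathbf{v}_2))$.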

This is the definition of a vector space. It is logically precise and totally unambiguous, or in other words mathematically rigorous. However, it is difficult to have any intuitive feeling for this construction. In order to make the definition of a vector space more relatable, we will rewrite it using words and symbols that are more familiar and evocative.

First, let us call the elements of the set $V$ “vectors.” When we use this word, we are reminded of geometric vectors, which are familiar objects. However, this is only an analogy: we make no assumption on the nature of the elements of $V$. Indeed, we shall soon see that many rather different mathematical objects (including numbers, polynomials, functions, and of course geometric vectors) can all be viewed as vectors.

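Second, let us call the function $a$ in Definition 1 “vector addition,” and write $\mathbf{v}_1 + \mathbf{v}_2$ in place of $a(\mathbf{v}_1, \mathbf{v}_2)$. If we use this alternative notation, then the axioms governing the function $a$ in Definition 1 become

1. For all $\mathbf{v}_1, \mathbf{v}_2 \in V$ we have $\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{v}_2 + \mathbf{v}_1$.
2. For all $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in V$ we have $\mathbf{v}_1 + (\mathbf{v}_2 + \mathbf{v}_3) = (\mathbf{v}_1 + \mathbf{v}_2) + \mathbf{v}_3$.
3. There exists $\mathbf{0} \in V$ such that $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for all $\mathbf{v} \in V$.
4. For every $\mathbf{v} \in V$ there exists $\mathbf{w} \in V$ such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$.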

This makes things intuitively clear: the axioms above say that the operation of vector addition behaves the way addition does in other contexts we are familiar with, such as the addition of numbers or the addition of geometric vectors. Although conceptually helpful, this comes at a cost, and that cost is ambiguity: we are now using the symbol $+$ in two different ways, since it can mean either addition of numbers in $\mathbb{R}$ or addition of vectors in $V$, and these operations are not the same. However, writing $\mathbf{v}_1 + \mathbf{v}_2$ is so much more natural than writing $a(\mathbf{v}_1, \mathbf{v}_2)$ that we decide to do this going forward despite the ambiguity it introduces. This is called abuse of notation.
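Programmers meet the same trade-off under the name operator overloading. The following is a minimal sketch, assuming the vector space is the plane of pairs of numbers; the function name `a` follows Definition 1, while the class `Vec2` is a hypothetical name introduced only for this illustration.

```python
from dataclasses import dataclass


def a(v1, v2):
    """Vector addition on pairs, written as an explicit function as in Definition 1."""
    return (v1[0] + v2[0], v1[1] + v2[1])


@dataclass(frozen=True)
class Vec2:
    """A pair of numbers for which the symbol + is overloaded (hypothetical class)."""
    x: float
    y: float

    def __add__(self, other):
        # Same operation as a(...), but now hidden behind the symbol +.
        return Vec2(self.x + other.x, self.y + other.y)


# Explicit notation: unambiguous, but clumsy to read.
explicit = a((1, 2), (3, 4))

# Overloaded notation: here + means "add vectors", not "add numbers".
overloaded = Vec2(1, 2) + Vec2(3, 4)
```

Both calls compute the same sum; only the notation differs, and the reader of the second form must infer from context which addition is meant.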

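Third and last, we are going to do the same thing with the function $s$: give it a name, and write it in a more intuitive but more ambiguous way. The usual name given to $s$ is “scalar multiplication,” which indicates that $s$ is an abstraction of the familiar operation of scaling a geometric vector by a number. The usual notation for scalar multiplication is simply juxtaposition: we write $t\mathbf{v}$ in place of $s(t, \mathbf{v})$. Adding on the axioms prescribing the behavior of the function $s$, we now have

1. For all $\mathbf{v}_1, \mathbf{v}_2 \in V$ we have $\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{v}_2 + \mathbf{v}_1$.
2. For all $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in V$ we have $\mathbf{v}_1 + (\mathbf{v}_2 + \mathbf{v}_3) = (\mathbf{v}_1 + \mathbf{v}_2) + \mathbf{v}_3$.
3. There exists $\mathbf{0} \in V$ such that $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for all $\mathbf{v} \in V$.
4. For every $\mathbf{v} \in V$ there exists $\mathbf{w} \in V$ such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$.
5. For every $\mathbf{v} \in V$, we have $1\mathbf{v} = \mathbf{v}$.
6. For every $t_1, t_2 \in \mathbb{R}$ and every $\mathbf{v} \in V$ we have $t_1(t_2\mathbf{v}) = (t_1 t_2)\mathbf{v}$.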

Again, this makes things much more intuitive: for example, Axiom 5 now says that scaling any vector by the number $1$ does nothing, and Axiom 6 says that scaling a vector by a number $t_2$ and then scaling by $t_1$ produces the same result as scaling by the number $t_1 t_2$. Written in this way, the axioms governing scalar multiplication become clear and natural, and are compatible with our experience of scaling geometric vectors. However, we are again abusing notation, since two different operations, multiplication of real numbers and scaling of vectors, are being denoted in exactly the same way. Finally, if we incorporate the axioms dictating the way in which vector addition and scalar multiplication are required to interact with one another, we arrive at the following reformulation of Definition 1.

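**Definition 2:** A **vector space** is a set $V$ whose elements can be added together and scaled by real numbers in such a way that the following axioms hold:

1. For all $\mathbf{v}_1, \mathbf{v}_2 \in V$ we have $\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{v}_2 + \mathbf{v}_1$.
2. For all $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in V$ we have $\mathbf{v}_1 + (\mathbf{v}_2 + \mathbf{v}_3) = (\mathbf{v}_1 + \mathbf{v}_2) + \mathbf{v}_3$.
3. There exists $\mathbf{0} \in V$ such that $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for all $\mathbf{v} \in V$.
4. For every $\mathbf{v} \in V$ there exists $\mathbf{w} \in V$ such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$.
5. For every $\mathbf{v} \in V$, we have $1\mathbf{v} = \mathbf{v}$.
6. For every $t_1, t_2 \in \mathbb{R}$ and every $\mathbf{v} \in V$ we have $t_1(t_2\mathbf{v}) = (t_1 t_2)\mathbf{v}$.
7. For every $t_1, t_2 \in \mathbb{R}$ and every $\mathbf{v} \in V$ we have $(t_1 + t_2)\mathbf{v} = t_1\mathbf{v} + t_2\mathbf{v}$.
8. For every $t \in \mathbb{R}$ and every $\mathbf{v}_1, \mathbf{v}_2 \in V$ we have $t(\mathbf{v}_1 + \mathbf{v}_2) = t\mathbf{v}_1 + t\mathbf{v}_2$.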
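To get a feeling for what the eight axioms demand, one can spot-check them on a familiar example. The sketch below assumes the vector space is the plane of pairs of numbers with componentwise operations, uses exact rational numbers as a stand-in for real numbers, and numbers the axioms in the order listed above. The names `add`, `scale`, `neg`, `ZERO`, and `failed_axioms` are hypothetical helpers. A finite check of this kind is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product


def add(v, w):
    """Componentwise vector addition on pairs."""
    return (v[0] + w[0], v[1] + w[1])


def scale(t, v):
    """Scalar multiplication of the pair v by the number t."""
    return (t * v[0], t * v[1])


ZERO = (Fraction(0), Fraction(0))  # candidate zero vector for Axiom 3


def neg(v):
    """Candidate additive inverse for Axiom 4."""
    return (-v[0], -v[1])


def failed_axioms(vectors, scalars):
    """Return the sorted list of axiom numbers violated on the given samples."""
    failed = set()
    for v1, v2, v3 in product(vectors, repeat=3):
        if add(v1, v2) != add(v2, v1):
            failed.add(1)  # commutativity of addition
        if add(v1, add(v2, v3)) != add(add(v1, v2), v3):
            failed.add(2)  # associativity of addition
    for v in vectors:
        if add(v, ZERO) != v:
            failed.add(3)  # zero vector
        if add(v, neg(v)) != ZERO:
            failed.add(4)  # additive inverse
        if scale(Fraction(1), v) != v:
            failed.add(5)  # scaling by 1 does nothing
    for t1, t2 in product(scalars, repeat=2):
        for v in vectors:
            if scale(t1, scale(t2, v)) != scale(t1 * t2, v):
                failed.add(6)  # compatibility of the two multiplications
            if scale(t1 + t2, v) != add(scale(t1, v), scale(t2, v)):
                failed.add(7)  # distributivity over addition of numbers
    for t in scalars:
        for v1, v2 in product(vectors, repeat=2):
            if scale(t, add(v1, v2)) != add(scale(t, v1), scale(t, v2)):
                failed.add(8)  # distributivity over addition of vectors
    return sorted(failed)


SAMPLE_SCALARS = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(3)]
SAMPLE_VECTORS = list(product(SAMPLE_SCALARS, repeat=2))
violations = failed_axioms(SAMPLE_VECTORS, SAMPLE_SCALARS)
```

On these samples no axiom fails, so `violations` is the empty list. Swapping in a deliberately wrong `scale` is a quick way to see which axioms detect the error.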

We also make the following definition, which formalizes the notion that a vector space may contain a smaller vector space.

**Definition 3:** A subset $W$ of a vector space $V$ is said to be a subspace of $V$ if it is itself a vector space when equipped with the operations of vector addition and scalar multiplication inherited from $V$.

From now on, we are going to use Definition 2 as our definition of a vector space, since it is much more convenient and understandable to write things in this way. However, it is important to comprehend that Definition 2 is not completely precise, and to be aware that the pristine and unassailable definition of a vector space given by Definition 1 is what is actually happening under the hood.

**Exercise 1:** Write down as many concrete examples of vector spaces as you can. You should be able to exhibit quite a few specific vector spaces which are significantly different from one another.

This brings us to an important question: why are we doing this? We understand familiar vector spaces like $\mathbb{R}^2$ and $\mathbb{R}^3$ pretty well, so why not just analyze these as standalone objects instead of viewing them as particular instances of the general notion of a vector space? There are many answers to this question, some quite philosophical. Here is a practical answer: if we are able to prove theorems about an abstract vector space, then these theorems will be universal: they will apply to all specific instances of vector spaces which we encounter in the wild.

We now begin to develop this program: we seek to identify properties that every object which satisfies the axioms laid out in Definition 2 must have. What should these properties be? In addressing this question, it is helpful to rely on the intuition gained from experience working with geometric vectors. For example, vectors in $\mathbb{R}^2$ are just pairs of real numbers, and we have concrete and specific formulas for vector addition and scalar multiplication in $\mathbb{R}^2$: if $\mathbf{v}_1 = (x_1, y_1)$ and $\mathbf{v}_2 = (x_2, y_2)$, then

$$\mathbf{v}_1 + \mathbf{v}_2 = (x_1 + x_2, y_1 + y_2)$$

and

$$t\mathbf{v}_1 = (t x_1, t y_1).$$

Thus for example we can see that, in $\mathbb{R}^2$, Axiom 3 in Definition 2 is fulfilled by the vector

$$\mathbf{0} = (0, 0),$$

since

$$(x, y) + (0, 0) = (x + 0, y + 0) = (x, y).$$

In fact, we can say more: $\mathbf{0} = (0, 0)$ is the *only* vector in $\mathbb{R}^2$ which has the property required by Axiom 3, because $0$ is the only number such that $x + 0 = x$ for every number $x$. We can similarly argue that, in $\mathbb{R}^3$, the only vector which fulfills Axiom 3 is $(0, 0, 0)$. So, we might suspect that in *any* vector space $V$, the vector $\mathbf{0}$ whose existence is required by Axiom 3 is actually unique. This claim is formulated as follows.

**Proposition 1:** There is a unique vector $\mathbf{0} \in V$ such that $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for all $\mathbf{v} \in V$.

In order to prove that the claim made by Proposition 1 is true, we must deduce it using nothing more than the vector space axioms given in Definition 2. This is Problem 1 on the first homework assignment. The propositions below give more properties which hold true for every vector space $V$. In every case, proving such a proposition means deducing its truth using no information apart from the axioms in Definition 2, and propositions which have already been proved using these axioms.

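**Proposition 2:** For every $\mathbf{v} \in V$ there exists a unique $\mathbf{w} \in V$ such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$.

*Proof:* Let $\mathbf{v} \in V$ be any vector. By Axiom 4 in Definition 2, we know that there exists a vector $\mathbf{w} \in V$ such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$. It remains to prove that this vector is unique. Suppose that $\mathbf{w}'$ is another vector such that $\mathbf{v} + \mathbf{w}' = \mathbf{0}$. We then have

$$\mathbf{v} + \mathbf{w} = \mathbf{v} + \mathbf{w}'.$$

Adding the vector $\mathbf{w}$ to both sides of this equation, and regrouping using Axiom 2, we get

$$(\mathbf{w} + \mathbf{v}) + \mathbf{w} = (\mathbf{w} + \mathbf{v}) + \mathbf{w}'.$$

Since $\mathbf{w} + \mathbf{v} = \mathbf{v} + \mathbf{w}$ by Axiom 1, and since $\mathbf{v} + \mathbf{w} = \mathbf{0}$ by hypothesis, the above equation implies

$$\mathbf{0} + \mathbf{w} = \mathbf{0} + \mathbf{w}'.$$

By Axiom 1 this is equivalent to

$$\mathbf{w} + \mathbf{0} = \mathbf{w}' + \mathbf{0},$$

and by Axiom 3 this implies

$$\mathbf{w} = \mathbf{w}',$$

as required. Q.E.D.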

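Now that we have proved that the vector $\mathbf{w}$ which cancels out $\mathbf{v}$ is unique, it is appropriate to denote it by $-\mathbf{v}$. Thus Axiom 4 becomes

$$\mathbf{v} + (-\mathbf{v}) = \mathbf{0},$$

which we agree to write more simply as

$$\mathbf{v} - \mathbf{v} = \mathbf{0}.$$

More generally, for any two vectors $\mathbf{v}, \mathbf{w} \in V$ we write $\mathbf{v} - \mathbf{w}$ as shorthand for $\mathbf{v} + (-\mathbf{w})$.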

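**Proposition 2:** For any $t \in \mathbb{R}$ we have $t\mathbf{0} = \mathbf{0}$. That is, scaling the zero vector by any number produces the zero vector.

*Proof:* By Axiom 3, we have

$$\mathbf{0} + \mathbf{0} = \mathbf{0}.$$

Let $t \in \mathbb{R}$ be arbitrary. Multiplying both sides of the above equation by $t$, we get

$$t(\mathbf{0} + \mathbf{0}) = t\mathbf{0}.$$

Using Axiom 8 on the left hand side of this equation, we get

$$t\mathbf{0} + t\mathbf{0} = t\mathbf{0}.$$

Now, subtracting $t\mathbf{0}$ from both sides of the above equation, we get

$$t\mathbf{0} + t\mathbf{0} - t\mathbf{0} = t\mathbf{0} - t\mathbf{0},$$

which simplifies to

$$t\mathbf{0} = \mathbf{0},$$

as required. Q.E.D.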

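**Proposition 3:** For any $\mathbf{v} \in V$ we have $0\mathbf{v} = \mathbf{0}$. That is, scaling any vector by the number zero produces the zero vector.

*Proof:* Let $\mathbf{v} \in V$ be any vector. We have

$$(0 + 1)\mathbf{v} = 0\mathbf{v} + 1\mathbf{v} = 0\mathbf{v} + \mathbf{v},$$

where we used Axiom 7 to obtain the first equality and Axiom 5 to obtain the second equality. On the other hand, the left hand side of the above equation is

$$(0 + 1)\mathbf{v} = 1\mathbf{v} = \mathbf{v},$$

where the first equality is the fact that adding the number $0$ and the number $1$ produces the number $1$, and the second equality is Axiom 5 again. So, we have that

$$0\mathbf{v} + \mathbf{v} = \mathbf{v}.$$

Adding $-\mathbf{v}$ to both sides of this equation, and regrouping using Axiom 2, we get

$$0\mathbf{v} + (\mathbf{v} - \mathbf{v}) = \mathbf{v} - \mathbf{v},$$

which reads $0\mathbf{v} + \mathbf{0} = \mathbf{0}$. We thus have

$$0\mathbf{v} = \mathbf{0}$$

by Axiom 3. Q.E.D.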

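**Proposition 4:** If $t \neq 0$ and $\mathbf{v} \neq \mathbf{0}$, then $t\mathbf{v} \neq \mathbf{0}$. That is, scaling a nonzero vector by a nonzero number produces a nonzero vector.

*Proof:* Suppose there exists a nonzero number $t$ and a nonzero vector $\mathbf{v}$ such that

$$t\mathbf{v} = \mathbf{0}.$$

Since $t \neq 0$, the real number $t^{-1}$ is well-defined. Multiplying both sides of the above equation by $t^{-1}$, we obtain

$$t^{-1}(t\mathbf{v}) = t^{-1}\mathbf{0}.$$

By Axiom 6, and since $t^{-1}t = 1$, this gives

$$1\mathbf{v} = t^{-1}\mathbf{0}.$$

Using Axiom 5 on the left hand side and Proposition 2 (scaling the zero vector) on the right hand side, this becomes

$$\mathbf{v} = \mathbf{0}.$$

However, this is false, since $\mathbf{v} \neq \mathbf{0}$. Since the statement that $t\mathbf{v} = \mathbf{0}$ leads to the false statement $\mathbf{v} = \mathbf{0}$, it must itself be false, and we conclude that $t\mathbf{v} \neq \mathbf{0}$. Q.E.D.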

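**Proposition 5:** If $\mathbf{v} \neq \mathbf{0}$ and $t_1\mathbf{v} = t_2\mathbf{v}$, then $t_1 = t_2$. That is, if two scalar multiples of the same nonzero vector are the same, then the scaling factors are the same.

*Proof:* Subtracting $t_2\mathbf{v}$ from both sides of the equation $t_1\mathbf{v} = t_2\mathbf{v}$ yields

$$(t_1 - t_2)\mathbf{v} = \mathbf{0},$$

where we used Axiom 7 on the left hand side. If $t_1 \neq t_2$ this contradicts Proposition 4, so it must be the case that $t_1 = t_2$. Q.E.D.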

**Proposition 6:** Every vector space contains either one vector, or infinitely many vectors.

*Proof:* Let $V$ be a vector space. Then, by Axiom 3, $V$ contains at least one vector, namely $\mathbf{0}$. It is possible that this is the only vector in $V$, i.e. we have $V = \{\mathbf{0}\}$. However, if $V$ contains another vector $\mathbf{v} \neq \mathbf{0}$, then it also contains the vector $t\mathbf{v}$ for all nonzero $t \in \mathbb{R}$. By Proposition 4, each of these vectors is different from $\mathbf{0}$, and by Proposition 5 they are all different from one another. Q.E.D.

**Exercise 2:** Try to prove more propositions about vector spaces suggested by your familiarity with $\mathbb{R}^2$ and $\mathbb{R}^3$. If you discover something interesting, consider posting about your findings on Piazza.

We now embark on an ambitious project: using nothing more than Definition 2 and the Propositions we have already deduced from it, we want to define a meaningful notion of dimension for vector spaces. The first step on this road is the following definition.

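**Definition 4:** Let $V$ be a vector space, and let $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a finite subset of $V$. We say that $S$ is **linearly dependent** if there exist numbers $t_1, \dots, t_n \in \mathbb{R}$, not all equal to zero, such that

$$t_1\mathbf{v}_1 + \dots + t_n\mathbf{v}_n = \mathbf{0}.$$

If no such numbers exist, then $S$ is said to be **linearly independent**.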

It will be convenient to extend Definition 4 to the case where $S$ is a set of size zero. There is only one such set, namely the empty set $S = \emptyset$. By fiat, we declare the empty set to be a linearly independent set.

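The fundamental feature of a linearly dependent set $S$ in a vector space $V$ is that at least one vector in $S$ is a linear combination of the other vectors in $S$, meaning that it can be represented as a sum of scalar multiples of these other vectors. For example, suppose that $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly dependent set. Then, by Definition 4, there exist numbers $t_1, t_2, t_3 \in \mathbb{R}$, not all equal to zero, such that

$$t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + t_3\mathbf{v}_3 = \mathbf{0}.$$

We thus have

$$t_1\mathbf{v}_1 = -t_2\mathbf{v}_2 - t_3\mathbf{v}_3.$$

If $t_1 \neq 0$, we can divide both sides of this equation by $t_1$, obtaining

$$\mathbf{v}_1 = -\frac{t_2}{t_1}\mathbf{v}_2 - \frac{t_3}{t_1}\mathbf{v}_3.$$

This expresses the vector $\mathbf{v}_1$ as a linear combination of $\mathbf{v}_2$ and $\mathbf{v}_3$. However, if $t_1 = 0$ we cannot divide through by $t_1$ as we did above. Instead, we use the fact that one of $t_2, t_3$ is nonzero. If $t_2 \neq 0$, then we have

$$\mathbf{v}_2 = 0\mathbf{v}_1 - \frac{t_3}{t_2}\mathbf{v}_3,$$

while if $t_3 \neq 0$ we have

$$\mathbf{v}_3 = 0\mathbf{v}_1 - \frac{t_2}{t_3}\mathbf{v}_2.$$

So, no matter what, the fact that $S$ is a linearly dependent set implies that at least one vector in this set is a linear combination of the other two. Conversely, if it were the case that $S$ was a linearly *independent* set, then *no* vector in $S$ would be a linear combination of the other two.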
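The case analysis above is mechanical enough to sketch in code. The following is an illustration only, assuming vectors are tuples of rational numbers and that the coefficients of a dependence relation are already known; `solve_for_one` and `combine` are hypothetical helper names.

```python
from fractions import Fraction


def add(v, w):
    """Componentwise vector addition on tuples."""
    return tuple(x + y for x, y in zip(v, w))


def scale(t, v):
    """Scalar multiplication of the tuple v by the number t."""
    return tuple(t * x for x in v)


def solve_for_one(coeffs):
    """Given t_1, ..., t_n not all zero with t_1 v_1 + ... + t_n v_n = 0,
    pick the first index i with t_i != 0 and return (i, weights), where
    v_i equals the sum over j != i of weights[j] * v_j."""
    for i, t in enumerate(coeffs):
        if t != 0:
            # Divide the relation through by t_i and move the other terms over.
            weights = {
                j: -Fraction(s) / Fraction(t)
                for j, s in enumerate(coeffs)
                if j != i
            }
            return i, weights
    raise ValueError("all coefficients are zero: not a dependence relation")


def combine(weights, vectors):
    """Evaluate the linear combination with the given weights."""
    total = tuple(Fraction(0) for _ in vectors[0])
    for j, w in weights.items():
        total = add(total, scale(w, vectors[j]))
    return total


# Example: 2*(1,0) + 3*(0,1) - 1*(2,3) = (0,0) is a dependence relation.
vectors = [(1, 0), (0, 1), (2, 3)]
coeffs = [2, 3, -1]
i, weights = solve_for_one(coeffs)
recovered = combine(weights, vectors)
```

Here the first coefficient is nonzero, so the first vector is expressed through the other two, and `recovered` agrees with `vectors[i]`. Passing coefficients whose first entry is zero exercises the other branch of the argument.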

**Definition 5:** Let $V$ be a vector space and let $n$ be a nonnegative integer. We say that $V$ is **$n$-dimensional** if it contains a linearly independent set of size $k$ for each integer $0 \leq k \leq n$, and does not contain a linearly independent set of size $k$ for any integer $k > n$. If no such number $n$ exists, then $V$ is said to be **infinite-dimensional.**

**Proposition 7:** Suppose that $V$ is a vector space which is both $m$-dimensional and $n$-dimensional. Then $m = n$.

*Proof:* Let us suppose without loss of generality that $m \leq n$. Then either $m < n$ or $m = n$. Suppose it were the case that $m < n$. Then, since $V$ is $m$-dimensional, any subset of $V$ of size $n$ is linearly dependent. But this is false, since the fact that $V$ is $n$-dimensional means that $V$ contains a linearly independent set of size $n$. Consequently, it must be the case that $m = n$. Q.E.D.

In view of Proposition 7, the concept of vector space dimension introduced by Definition 5 is well-defined; if $V$ is $n$-dimensional for some nonnegative integer $n$, then $n$ is the unique number with this property. We may therefore refer to $n$ as the dimension of $V$, and write $\dim V = n$. If $V$ is a vector space which is not $n$-dimensional for any nonnegative integer $n$, then it is infinite-dimensional, as per Definition 5. In this case it is customary to write

$$\dim V = \infty.$$

One way to make the concept of vector space dimension more relatable is to think of it as the critical value at which a phase transition between possible linear independence and certain linear dependence occurs. That is, if $\dim V = n$ and one samples a set $S$ of vectors of size less than or equal to $n$ from $V$, it is possible that $S$ is linearly independent; however, if one samples a set $S$ of more than $n$ vectors from $V$, then $S$ is necessarily linearly dependent. To say that $\dim V = \infty$ is to say that this transition never occurs.
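The transition can be observed experimentally in the plane. The sketch below relies on two standard facts which have not been established in this lecture and are therefore assumptions here: that the plane of pairs of numbers is $2$-dimensional, and that a list of vectors is linearly independent exactly when row reduction produces one pivot per vector. The function name `is_linearly_independent` is hypothetical, and rational numbers stand in for real numbers.

```python
from fractions import Fraction


def is_linearly_independent(vectors):
    """Test linear independence of a list of equal-length tuples by row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # Look for a row at or below position `rank` with a nonzero entry here.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    # Independent exactly when every vector contributed a pivot.
    return rank == len(rows)


size_zero = is_linearly_independent([])
size_two = is_linearly_independent([(1, 0), (0, 1)])
size_two_bad = is_linearly_independent([(1, 2), (2, 4)])
size_three = is_linearly_independent([(1, 0), (0, 1), (1, 1)])
```

Sets of size at most two may or may not be independent (`size_two` is true while `size_two_bad` is false), whereas a set of three vectors in the plane, such as the one in `size_three`, is dependent.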

Let us use Definition 5 to calculate the dimension of $V = \{\mathbf{0}\}$, a vector space containing only one vector. Observe that the only two subsets of $V$ are $\emptyset$ and $\{\mathbf{0}\}$, and these sets have sizes $0$ and $1$, respectively. Now, $\emptyset$ is linearly independent by definition (see the paragraph immediately following Definition 4), so $V$ contains a linearly independent set of size zero. Moreover, $\{\mathbf{0}\}$ is linearly dependent since $t\mathbf{0} = \mathbf{0}$ for any choice of $t \in \mathbb{R}$ by Proposition 2, in particular for nonzero $t$, and thus any subset of $V$ of size bigger than zero is linearly dependent. We have thus shown that

$$\dim \{\mathbf{0}\} = 0.$$

As another example, let us calculate the dimension of the number line $\mathbb{R}$. A linearly independent subset of $\mathbb{R}$ of size zero is given by the empty set $\emptyset$. A linearly independent set of size one is given by $\{1\}$, or any other set containing a single nonzero real number. Consider now an arbitrary subset $\{x, y\} \subset \mathbb{R}$ of size two. Since $x \neq y$, at least one of $x, y$ is not equal to zero. Suppose without loss of generality that $x \neq 0$. If $y = 0$, then we have $t_1 x + t_2 y = 0$ with $t_1 = 0$ and $t_2 = 1$, so that $\{x, y\}$ is linearly dependent in this case. If $y \neq 0$, then we have $t_1 x + t_2 y = 0$ with $t_1 = 1/x$ and $t_2 = -1/y$. So, we have shown that any set of two real numbers is linearly dependent, and by one of the problems on Assignment 1 this implies that any set of more than two real numbers is linearly dependent. We thus conclude that

$$\dim \mathbb{R} = 1.$$
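For the number line, a single formula covers both cases: for distinct numbers $x$ and $y$, the coefficients $t_1 = y$ and $t_2 = -x$ are not both zero and satisfy $t_1 x + t_2 y = 0$. This is one possible choice of coefficients, not necessarily the one used in the argument above. A minimal sketch, with the hypothetical function name `dependence_relation`:

```python
from fractions import Fraction


def dependence_relation(x, y):
    """Return (t1, t2), not both zero, with t1*x + t2*y == 0, for distinct x and y."""
    if x == y:
        raise ValueError("a set of size two needs two distinct numbers")
    # x != y forces at least one of x, y to be nonzero, so (y, -x) != (0, 0).
    return (y, -x)


t1, t2 = dependence_relation(Fraction(2), Fraction(5))
check = t1 * Fraction(2) + t2 * Fraction(5)
```

The value `check` is zero, and the same holds when one of the two numbers is zero, for instance for the pair $3$ and $0$.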

At this point, vector spaces may seem like hopelessly complicated objects which are impossible to analyze in general. However, it turns out that for many purposes understanding a given vector space $V$ can be reduced to understanding a well-chosen finite subset of $V$. The first step in this direction is the following theorem.

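**Theorem 1:** Let $V$ be an $n$-dimensional vector space, and let $B = \{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ be a linearly independent set of $n$ vectors in $V$. Then, every vector in $V$ can be uniquely represented as a linear combination of the vectors in $B$.

*Proof:* Let $\mathbf{v} \in V$ be any vector. If $\mathbf{v}$ is itself one of the vectors in $B$, then it is trivially a linear combination of the vectors in $B$, so suppose it is not. Consider the set $\{\mathbf{v}, \mathbf{b}_1, \dots, \mathbf{b}_n\}$. Since the dimension of $V$ is $n$, this set of size $n + 1$ must be linearly dependent. Thus there exist numbers $t_0, t_1, \dots, t_n \in \mathbb{R}$, not all of which are equal to zero, such that

$$t_0\mathbf{v} + t_1\mathbf{b}_1 + \dots + t_n\mathbf{b}_n = \mathbf{0}.$$

We claim that $t_0 \neq 0$. Indeed, if it were the case that $t_0 = 0$, then the above would read

$$t_1\mathbf{b}_1 + \dots + t_n\mathbf{b}_n = \mathbf{0},$$

where $t_1, \dots, t_n$ are not all equal to zero. But this is impossible, since $B$ is a linearly independent set, and thus it cannot be the case that a linear combination of its vectors with coefficients not all zero equals $\mathbf{0}$. Now, since $t_0 \neq 0$, we can write

$$\mathbf{v} = -\frac{t_1}{t_0}\mathbf{b}_1 - \dots - \frac{t_n}{t_0}\mathbf{b}_n,$$

which shows that $\mathbf{v}$ is a linear combination of the vectors $\mathbf{b}_1, \dots, \mathbf{b}_n$. Since $\mathbf{v}$ was arbitrary, we have shown that every vector in $V$ can be represented as a linear combination of vectors from $B$.

Now let us prove uniqueness. Let $\mathbf{v} \in V$ be a vector, and suppose that

$$\mathbf{v} = p_1\mathbf{b}_1 + \dots + p_n\mathbf{b}_n \quad \text{and} \quad \mathbf{v} = q_1\mathbf{b}_1 + \dots + q_n\mathbf{b}_n$$

are two representations of $\mathbf{v}$ as a linear combination of the vectors in $B$. Subtracting the second of these equations from the first, we obtain the equation

$$\mathbf{0} = (p_1 - q_1)\mathbf{b}_1 + \dots + (p_n - q_n)\mathbf{b}_n.$$

Since $B$ is a linearly independent set, we have that $p_i - q_i = 0$ for all $1 \leq i \leq n$, which means that $p_i = q_i$ for all $1 \leq i \leq n$. We thus conclude that any two representations of any vector $\mathbf{v} \in V$ as a linear combination of the vectors $\mathbf{b}_1, \dots, \mathbf{b}_n$ in fact coincide. Q.E.D.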
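In the plane, the unique coefficients promised by the theorem can be computed explicitly. The sketch below assumes two standard facts not proved in this lecture: that the plane of pairs of numbers is $2$-dimensional, and that two vectors in it are linearly independent exactly when the determinant they form is nonzero. It uses Cramer's rule with exact rational arithmetic; the function name `coordinates` is hypothetical.

```python
from fractions import Fraction


def coordinates(b1, b2, v):
    """Return (t1, t2) with t1*b1 + t2*b2 == v, for linearly independent b1, b2."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent")
    # Cramer's rule: replace one column of the determinant by v at a time.
    t1 = Fraction(v[0] * b2[1] - v[1] * b2[0], det)
    t2 = Fraction(b1[0] * v[1] - b1[1] * v[0], det)
    return t1, t2


b1, b2 = (1, 1), (1, -1)
t1, t2 = coordinates(b1, b2, (3, 1))
rebuilt = (t1 * b1[0] + t2 * b2[0], t1 * b1[1] + t2 * b2[1])
```

Here the coefficients come out as $t_1 = 2$ and $t_2 = 1$, and `rebuilt` recovers the vector $(3, 1)$. Because the formula leaves no freedom in choosing $t_1$ and $t_2$, it also reflects the uniqueness half of the theorem in this special case.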

A subset $S$ of a vector space $V$ which has the property that every $\mathbf{v} \in V$ can be written as a linear combination of vectors in $S$ is said to **span** the vector space $V$. If moreover $S$ is a linearly independent set, then $S$ is called a **basis** of $V$, and in this case the above argument shows that every vector in $V$ can be written as a unique linear combination of the vectors in $S$. In Theorem 1, we have proven that, in an $n$-dimensional vector space $V$, any linearly independent set of size $n$ is a basis. We will continue to study the relationship between the dimension of a vector space and its bases in Lecture 2.
