Math 31AH: Lecture 1

Assuming familiarity with the geometric concept of a vector, we introduce the notion of a vector space: a set containing objects which can be added to one another and scaled by numbers in a manner which is formally consistent with the addition and scaling of geometric vectors. The vector space concept is the bedrock of linear algebra.

Definition 1: A vector space is a triple $(\mathbf{V},a,s)$ consisting of a set $\mathbf{V}$ together with two functions

$a \colon \mathbf{V} \times \mathbf{V} \to \mathbf{V}$

and

$s \colon \mathbb{R} \times \mathbf{V} \to \mathbf{V},$

where $\times$ denotes the Cartesian product of sets and $\mathbb{R}$ denotes the set of real numbers. The functions $a$ and $s$ are required to have the following properties:

1. For all $\mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V},$ we have $a(\mathbf{v}_1,\mathbf{v}_2) = a(\mathbf{v}_2,\mathbf{v}_1).$
2. For all $\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3 \in \mathbf{V},$ we have $a(\mathbf{v}_1,a(\mathbf{v}_2,\mathbf{v}_3)) = a(a(\mathbf{v}_1,\mathbf{v}_2),\mathbf{v}_3).$
3. There exists $\mathbf{0} \in \mathbf{V}$ such that $a(\mathbf{0},\mathbf{v}) = \mathbf{v}$ for all $\mathbf{v} \in \mathbf{V}.$
4. For every $\mathbf{v} \in \mathbf{V}$ there exists $\mathbf{w} \in \mathbf{V}$ such that $a(\mathbf{v},\mathbf{w})=\mathbf{0}.$
5. For every $\mathbf{v} \in \mathbf{V},$ we have $s(1,\mathbf{v})=\mathbf{v}.$
6. For every $t_1,t_2 \in \mathbb{R}$ and every $\mathbf{v} \in \mathbf{V},$ we have $s(t_1,s(t_2,\mathbf{v})) = s(t_1t_2,\mathbf{v}).$
7. For every $t_1,t_2 \in \mathbb{R}$ and every $\mathbf{v} \in \mathbf{V},$ we have $s(t_1+t_2,\mathbf{v}) = a(s(t_1,\mathbf{v}),s(t_2,\mathbf{v})).$
8. For every $t \in \mathbb{R}$ and every $\mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V},$ we have $s(t,a(\mathbf{v}_1,\mathbf{v}_2))=a(s(t,\mathbf{v}_1),s(t,\mathbf{v}_2)).$

This is the definition of a vector space. It is logically precise and totally unambiguous, or in other words mathematically rigorous. However, it is difficult to have any intuitive feeling for this construction. In order to make the definition of a vector space more relatable, we will rewrite it using words and symbols that are more familiar and evocative.

First, let us call the elements of the set $\mathbf{V}$ “vectors.” When we use this word, we are reminded of geometric vectors, which are familiar objects. However, this is only an analogy: we make no assumption on the nature of the elements of $\mathbf{V}.$ Indeed, we shall soon see that many rather different mathematical objects (including numbers, polynomials, functions, and of course geometric vectors) can all be viewed as vectors.

Second, let us call the function $a$ in Definition 1 “vector addition,” and write $\mathbf{v}_1+\mathbf{v}_2$ in place of $a(\mathbf{v}_1,\mathbf{v}_2).$ If we use this alternative notation, then the axioms governing the function $a$ in Definition 1 become

1. For all $\mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V},$ we have $\mathbf{v}_1+\mathbf{v}_2 = \mathbf{v}_2+\mathbf{v}_1.$
2. For all $\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\in\mathbf{V},$ we have $\mathbf{v}_1+(\mathbf{v}_2+\mathbf{v}_3) = (\mathbf{v}_1+\mathbf{v}_2)+\mathbf{v}_3.$
3. There exists $\mathbf{0} \in \mathbf{V}$ such that $\mathbf{0}+\mathbf{v} = \mathbf{v}$ for all $\mathbf{v} \in \mathbf{V}.$
4. For every $\mathbf{v} \in \mathbf{V}$ there exists $\mathbf{w} \in \mathbf{V}$ such that $\mathbf{v}+\mathbf{w}=\mathbf{0}.$

This makes things intuitively clear: the axioms above say that the operation of vector addition behaves the way addition does in other contexts we are familiar with, such as the addition of numbers or the addition of geometric vectors. Although conceptually helpful, this comes at a cost, and that cost is ambiguity: we are now using the symbol $+$ in two different ways, since it can mean either addition of numbers in $\mathbb{R}$ or vectors in $\mathbf{V},$ and these operations are not the same. However, writing $\mathbf{v}_1+\mathbf{v}_2$ is so much more natural than writing $a(\mathbf{v}_1,\mathbf{v}_2)$ that we decide to do this going forward despite the ambiguity it introduces. This is called abuse of notation.

Third and last, we are going to do the same thing with the function $s$ — give it a name, and write it in a more intuitive but more ambiguous way. The usual name given to $s$ is “scalar multiplication,” which indicates that $s$ is an abstraction of the familiar operation of scaling a geometric vector by a number. The usual notation for scalar multiplication is simply juxtaposition: we write $t\mathbf{v}$ in place of $s(t,\mathbf{v}).$ Adding on the axioms prescribing the behavior of the function $s$, we now have

1. For all $\mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V},$ we have $\mathbf{v}_1+\mathbf{v}_2 = \mathbf{v}_2+\mathbf{v}_1.$
2. For all $\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\in\mathbf{V},$ we have $\mathbf{v}_1+(\mathbf{v}_2+\mathbf{v}_3) =(\mathbf{v}_1+\mathbf{v}_2)+\mathbf{v}_3.$
3. There exists $\mathbf{0} \in \mathbf{V}$ such that $\mathbf{0}+\mathbf{v} = \mathbf{v}$ for all $\mathbf{v} \in \mathbf{V}.$
4. For every $\mathbf{v} \in \mathbf{V}$ there exists $\mathbf{w} \in \mathbf{V}$ such that $\mathbf{v}+\mathbf{w}=\mathbf{0}.$
5. For every $\mathbf{v} \in \mathbf{V}$, we have $1\mathbf{v}=\mathbf{v}.$
6. For every $t_1,t_2 \in \mathbb{R}$ and every $\mathbf{v} \in \mathbf{V},$ we have $t_1(t_2\mathbf{v})=(t_1t_2)\mathbf{v}.$

Again, this makes things much more intuitive: for example, Axiom 5 now says that scaling any vector by the number $1$ does nothing, and Axiom 6 says that scaling a vector $\mathbf{v}$ by a number $t_2$ and then scaling $t_2\mathbf{v}$ by $t_1$ produces the same result as scaling $\mathbf{v}$ by the number $t_1t_2.$ Written in this way, the axioms governing scalar multiplication become clear and natural, and are compatible with our experience of scaling geometric vectors. However, we are again abusing notation, since two different operations, multiplication of real numbers and scaling of vectors, are being denoted in exactly the same way. Finally, if we incorporate the axioms dictating the way in which vector addition and scalar multiplication are required to interact with one another, we arrive at the following reformulation of Definition 1.

Definition 2: A vector space is a set $\mathbf{V}$ whose elements can be added together and scaled by real numbers in such a way that the following axioms hold:

1. For all $\mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V},$ we have $\mathbf{v}_1+\mathbf{v}_2 = \mathbf{v}_2+\mathbf{v}_1.$
2. For all $\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\in\mathbf{V},$ we have $\mathbf{v}_1+(\mathbf{v}_2+\mathbf{v}_3) = (\mathbf{v}_1+\mathbf{v}_2)+\mathbf{v}_3.$
3. There exists $\mathbf{0} \in \mathbf{V}$ such that $\mathbf{0}+\mathbf{v} = \mathbf{v}$ for all $\mathbf{v} \in \mathbf{V}.$
4. For every $\mathbf{v} \in \mathbf{V}$ there exists $\mathbf{w} \in \mathbf{V}$ such that $\mathbf{v}+\mathbf{w}=\mathbf{0}.$
5. For every $\mathbf{v} \in \mathbf{V}$, we have $1\mathbf{v}=\mathbf{v}.$
6. For every $t_1,t_2 \in \mathbb{R}$ and every $\mathbf{v} \in \mathbf{V},$ we have $t_1(t_2\mathbf{v})=(t_1t_2)\mathbf{v}.$
7. For every $t_1,t_2 \in \mathbb{R}$ and every $\mathbf{v} \in \mathbf{V},$ we have $(t_1+t_2)\mathbf{v} = t_1\mathbf{v}+t_2\mathbf{v}.$
8. For every $t \in \mathbb{R}$ and every $\mathbf{v}_1,\mathbf{v}_2 \in \mathbf{V},$ we have $t(\mathbf{v}_1+\mathbf{v}_2)=t\mathbf{v}_1+t\mathbf{v}_2.$

We also make the following definition, which formalizes the notion that a vector space $\mathbf{V}$ may contain a smaller vector space $\mathbf{W}.$

Definition 3: A subset $\mathbf{W}$ of a vector space $\mathbf{V}$ is said to be a subspace of $\mathbf{V}$ if it is itself a vector space when equipped with the operations of vector addition and scalar multiplication inherited from $\mathbf{V}$.

From now on, we are going to use Definition 2 as our definition of a vector space, since it is much more convenient and understandable to write things in this way. However, it is important to comprehend that Definition 2 is not completely precise, and to be aware that the pristine and unassailable definition of a vector space given by Definition 1 is what is actually happening under the hood.
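To make the relationship between the two definitions concrete, here is a minimal Python sketch that spells out the triple $(\mathbf{V},a,s)$ of Definition 1 for one very simple example: the real numbers themselves, with $a$ ordinary addition and $s$ ordinary multiplication. The names `a`, `s`, `ZERO` and `check_axioms`, and the choice to sample $\mathbb{R}$ by a few exact rational numbers, are illustrative assumptions and not part of the lecture. A finite spot-check like this is of course not a proof.

```python
from fractions import Fraction
from itertools import product

# The triple (V, a, s) for V = R, sampled by a few exact rationals.
def a(v1, v2):
    """Vector addition: here, ordinary addition of numbers."""
    return v1 + v2

def s(t, v):
    """Scalar multiplication: here, ordinary multiplication of numbers."""
    return t * v

ZERO = Fraction(0)

def check_axioms(vectors, scalars):
    """Spot-check the eight axioms of Definition 1 on finite samples."""
    for v1, v2, v3 in product(vectors, repeat=3):
        assert a(v1, v2) == a(v2, v1)                        # Axiom 1
        assert a(v1, a(v2, v3)) == a(a(v1, v2), v3)          # Axiom 2
    for v in vectors:
        assert a(ZERO, v) == v                               # Axiom 3
        assert a(v, s(Fraction(-1), v)) == ZERO              # Axiom 4, taking w = s(-1, v)
        assert s(Fraction(1), v) == v                        # Axiom 5
    for t1, t2 in product(scalars, repeat=2):
        for v in vectors:
            assert s(t1, s(t2, v)) == s(t1 * t2, v)          # Axiom 6
            assert s(t1 + t2, v) == a(s(t1, v), s(t2, v))    # Axiom 7
    for t in scalars:
        for v1, v2 in product(vectors, repeat=2):
            assert s(t, a(v1, v2)) == a(s(t, v1), s(t, v2))  # Axiom 8
    return True

samples = [Fraction(n, d) for n in (-3, 0, 1, 5) for d in (1, 2, 7)]
print(check_axioms(samples, samples))
```

Writing `a(v1, v2)` and `s(t, v)` everywhere is exactly the unambiguous but clumsy notation of Definition 1; Definition 2 is what one gets by writing these as `v1 + v2` and `t * v` instead.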

Exercise 1: Write down as many concrete examples of vector spaces as you can. You should be able to exhibit quite a few specific vector spaces which are significantly different from one another.

This brings us to an important question: why are we doing this? We understand familiar vector spaces like $\mathbb{R}^2$ and $\mathbb{R}^3$ pretty well, so why not just analyze these as standalone objects instead of viewing them as particular instances of the general notion of a vector space? There are many answers to this question, some quite philosophical. Here is a practical answer: if we are able to prove theorems about an abstract vector space, then these theorems will be universal: they will apply to all specific instances of vector spaces which we encounter in the wild.

We now begin to develop this program: we seek to identify properties that every object which satisfies the axioms laid out in Definition 2 must have. What should these properties be? In addressing this question, it is helpful to rely on the intuition gained from experience working with geometric vectors. For example, vectors in $\mathbb{R}^2$ are just pairs of real numbers, and we have concrete and specific formulas for vector addition and scalar multiplication in $\mathbb{R}^2:$ if $\mathbf{x}=(x_1,x_2)$ and $\mathbf{y}=(y_1,y_2),$ then

$\mathbf{x}+\mathbf{y}=(x_1,x_2)+(y_1,y_2) = (x_1+y_1,x_2+y_2),$

and

$t\mathbf{x} = t(x_1,x_2) = (tx_1,tx_2).$

Thus for example we can see that, in $\mathbb{R}^2,$ Axiom 3 in Definition 2 is fulfilled by the vector

$\mathbf{0}=(0,0),$

since

$\mathbf{x}+\mathbf{0}=(x_1,x_2) + (0,0) = (x_1+0,x_2+0)=(x_1,x_2) = \mathbf{x}.$
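The componentwise rules for $\mathbb{R}^2$ (add first coordinates together and second coordinates together, and scale each coordinate by $t$) can be tried out in a short Python sketch. The function names `add` and `scale`, the small grid of sample vectors, and the brute-force search for vectors that behave like $\mathbf{0}$ are illustrative assumptions. The search only covers the sampled grid, so it suggests rather than proves anything.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    """Componentwise addition in R^2."""
    return (x[0] + y[0], x[1] + y[1])

def scale(t, x):
    """Scaling a vector in R^2 by the number t."""
    return (t * x[0], t * x[1])

ZERO = (Fraction(0), Fraction(0))

# A small grid of sample vectors with exact rational coordinates.
grid = [(Fraction(p), Fraction(q)) for p, q in product(range(-2, 3), repeat=2)]

# Axiom 3: adding (0, 0) leaves every sample vector unchanged.
assert all(add(x, ZERO) == x for x in grid)

# Scaling acts on each coordinate separately.
assert scale(2, (1, 3)) == (2, 6)

# Candidates z in the grid with x + z == x for every sample x.
identities = [z for z in grid if all(add(x, z) == x for x in grid)]
print(identities == [ZERO])
```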

In fact, we can say more: $\mathbf{0}=(0,0)$ is the only vector in $\mathbb{R}^2$ which has the property required by Axiom 3, because $0$ is the only number such that $x+0=x$ for every number $x.$ We can similarly argue that, in $\mathbb{R}^3,$ the only vector which fulfills Axiom 3 is $\mathbf{0}=(0,0,0).$ So, we might suspect that in any vector space $\mathbf{V},$ the vector $\mathbf{0}$ whose existence is required by Axiom 3 is actually unique. This claim is formulated as follows.

Proposition 1: There is a unique vector $\mathbf{0} \in \mathbf{V}$ such that $\mathbf{v}+\mathbf{0}=\mathbf{v}$ for all $\mathbf{v} \in \mathbf{V}.$

In order to prove that the claim made by Proposition 1 is true, we must deduce it using nothing more than the vector space axioms given in Definition 2. This is Problem 1 on the first homework assignment. The propositions below give more properties which hold true for every vector space $\mathbf{V}.$ In every case, proving such a proposition means deducing its truth using no information apart from the axioms in Definition 2, and propositions which have already been proved using these axioms.

Proposition 2: For every $\mathbf{v} \in \mathbf{V},$ there exists a unique $\mathbf{w} \in \mathbf{V}$ such that $\mathbf{v}+\mathbf{w}=\mathbf{0}.$

Proof: Let $\mathbf{v} \in \mathbf{V}$ be any vector. By Axiom 4 in Definition 2, we know that there exists a vector $\mathbf{w} \in \mathbf{V}$ such that $\mathbf{v}+\mathbf{w}=\mathbf{0}.$ It remains to prove that this vector is unique. Suppose that $\mathbf{w}'$ is another vector such that $\mathbf{v}+\mathbf{w}'=\mathbf{0}.$ We then have

$\mathbf{v}+\mathbf{w} = \mathbf{v}+\mathbf{w}'.$

Adding the vector $\mathbf{w}$ to both sides of this equation, we get

$\mathbf{w}+\mathbf{v}+\mathbf{w} =\mathbf{w}+ \mathbf{v}+\mathbf{w}'.$

Grouping the first two terms on each side by Axiom 2, and using that $\mathbf{w}+\mathbf{v} = \mathbf{v}+\mathbf{w}$ by Axiom 1 and $\mathbf{v}+\mathbf{w}=\mathbf{0}$ by hypothesis, the above equation becomes

$\mathbf{0}+\mathbf{w} =\mathbf{0}+\mathbf{w}',$

and by Axiom 3 this implies

$\mathbf{w} =\mathbf{w}',$

as required. — Q.E.D.

Now that we have proved that the vector $\mathbf{w}$ which cancels out $\mathbf{v}$ is unique, it is appropriate to denote it by $-\mathbf{v}.$ Thus Axiom 4 becomes

$\mathbf{v} + (-\mathbf{v}) = \mathbf{0},$

which we agree to write more simply as

$\mathbf{v} - \mathbf{v}=\mathbf{0}.$

More generally, for any two vectors $\mathbf{v},\mathbf{w} \in \mathbf{V},$ we write $\mathbf{v}-\mathbf{w}$ as shorthand for $\mathbf{v}+(-\mathbf{w}).$

Proposition 2: For any $t \in \mathbb{R},$ we have $t\mathbf{0}=\mathbf{0}.$ That is, scaling the zero vector by any number produces the zero vector.

Proof: By Axiom 3, we have

$\mathbf{0} = \mathbf{0}+\mathbf{0}.$

Let $t \in \mathbb{R}$ be arbitrary. Multiplying both sides of the above equation by $t,$ we get

$t\mathbf{0} = t(\mathbf{0}+\mathbf{0}).$

Using Axiom 8 on the right hand side of this equation, we get

$t\mathbf{0} = t\mathbf{0} +t\mathbf{0}.$

Now, subtracting $t\mathbf{0}$ from both sides of the above equation, we get

$t\mathbf{0}-t\mathbf{0} = t\mathbf{0} +t\mathbf{0} - t\mathbf{0},$

which simplifies to

$\mathbf{0} = t\mathbf{0},$

as required.

Proposition 3: For any $\mathbf{v} \in \mathbf{V},$ we have $0\mathbf{v} = \mathbf{0}.$ That is, scaling any vector by the number zero produces the zero vector.

Proof: Let $\mathbf{v} \in \mathbf{V}$ be any vector. We have

$(1+0)\mathbf{v}=1\mathbf{v} + 0\mathbf{v} = \mathbf{v}+0\mathbf{v},$

where we used Axiom 7 to obtain the first equality and Axiom 5 to obtain the second equality. On the other hand, the left hand side of the above equation is

$(1+0)\mathbf{v} = 1\mathbf{v} = \mathbf{v},$

where the first equality is the fact that adding the number $1$ and the number $0$ produces the number $1,$ and the second equality is Axiom 5 again. So, we have that

$\mathbf{v} = \mathbf{v}+0\mathbf{v}.$

Adding $-\mathbf{v}$ to both sides of this equation, and using Axioms 1 and 2 to regroup the right hand side, we get

$\mathbf{0} = \mathbf{0} + 0\mathbf{v},$

so that

$0\mathbf{v} = \mathbf{0},$

by Axiom 3. Q.E.D.

Proposition 4: If $t \neq 0$ and $\mathbf{v} \neq \mathbf{0},$ then $t\mathbf{v} \neq \mathbf{0}.$ That is, scaling a nonzero vector by a nonzero number produces a nonzero vector.

Proof: Suppose there exists a nonzero number $t$ and a nonzero vector $\mathbf{v}$ such that

$t\mathbf{v}=\mathbf{0}.$

Since $t \neq 0,$ the real number $t^{-1}$ is well-defined. Multiplying both sides of the above equation by $t^{-1},$ we obtain

$(t^{-1}t)\mathbf{v} = t^{-1}\mathbf{0}.$

This gives

$1\mathbf{v}=t^{-1}\mathbf{0}.$

Using Axiom 5 on the left hand side and Proposition 2 on the right hand side, this becomes

$\mathbf{v}=\mathbf{0}.$

However, this is false, since $\mathbf{v} \neq \mathbf{0}.$ Since the statement that $t\mathbf{v}=\mathbf{0}$ leads to the false statement $\mathbf{v}=\mathbf{0},$ it must itself be false, and we conclude that $t\mathbf{v} \neq \mathbf{0}.$ — Q.E.D.

Proposition 5: If $\mathbf{v} \neq \mathbf{0}$ and $t_1\mathbf{v}=t_2\mathbf{v},$ then $t_1=t_2.$ That is, if two scalar multiples of the same nonzero vector are the same, then the scaling factors are the same.

Proof: Subtracting $t_2\mathbf{v}$ from both sides of the equation $t_1\mathbf{v}=t_2\mathbf{v}$ yields

$(t_1-t_2)\mathbf{v}=\mathbf{0},$

where we used Axiom 7 on the left hand side. If $t_1 \neq t_2,$ this contradicts Proposition 4, so it must be the case that $t_1=t_2.$ —Q.E.D.

Proposition 6: Every vector space contains either one vector, or infinitely many vectors.

Proof: Let $\mathbf{V}$ be a vector space. Then, by Axiom 3, $\mathbf{V}$ contains at least one vector, namely $\mathbf{0}.$ It is possible that this is the only vector in $\mathbf{V},$ i.e. we have $\mathbf{V}=\{\mathbf{0}\}.$ However, if $\mathbf{V}$ contains another vector $\mathbf{v} \neq \mathbf{0},$ then it also contains the vector $t\mathbf{v}$ for all $t \in \mathbb{R}\backslash \{0\}.$ By Proposition 4, each of these vectors is different from $\mathbf{0},$ and by Proposition 5 they are all different from one another. Since there are infinitely many nonzero real numbers, $\mathbf{V}$ then contains infinitely many vectors. Q.E.D.

Exercise 2: Try to prove more propositions about vector spaces suggested by your familiarity with $\mathbb{R}^2$ and $\mathbb{R}^3.$ If you discover something interesting, consider posting about your findings on Piazza.

We now embark on an ambitious project: using nothing more than Definition 2 and the Propositions we have already deduced from it, we want to define a meaningful notion of dimension for vector spaces. The first step on this road is the following definition.

Definition 4: Let $\mathbf{V}$ be a vector space, and let $S =\{\mathbf{v}_1,\dots,\mathbf{v}_k\}$ be a finite subset of $\mathbf{V}.$ We say that $S$ is linearly dependent if there exist numbers $t_1,\dots,t_k \in \mathbb{R},$ not all equal to zero, such that

$\sum_{i=1}^k t_i\mathbf{v}_i = \mathbf{0}.$

If no such numbers exist, then $S$ is said to be linearly independent.

It will be convenient to extend Definition 4 to the case where $S \subseteq \mathbf{V}$ is a set of size zero. There is only one such set, namely the empty set $S=\{\}.$ By fiat, we declare the empty set to be a linearly independent set.

The fundamental feature of a linearly dependent set $S$ in a vector space $\mathbf{V}$ is that at least one vector in $S$ is a linear combination of the other vectors in $S,$ meaning that it can be represented as a sum of scalar multiples of these other vectors. For example, suppose that $S=\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\}$ is a linearly dependent set. Then, by Definition 4, there exist numbers $t_1,t_2,t_3 \in \mathbb{R},$ not all equal to zero, such that

$t_1\mathbf{v}_1 + t_2 \mathbf{v}_2 + t_3 \mathbf{v}_3 = \mathbf{0}.$

We thus have

$t_1\mathbf{v}_1 = -t_2 \mathbf{v}_2 - t_3 \mathbf{v}_3.$

If $t_1 \neq 0,$ we can divide both sides of this equation by $t_1,$ obtaining

$\mathbf{v}_1 = -\frac{t_2}{t_1} \mathbf{v}_2 - \frac{t_3}{t_1} \mathbf{v}_3.$

This expresses the vector $\mathbf{v}_1$ as a linear combination of $\mathbf{v}_2$ and $\mathbf{v}_3.$ However, if $t_1 = 0,$ we cannot divide through by $t_1$ as we did above. Instead, we use the fact that one of $t_2,t_3$ is nonzero. If $t_2 \neq 0,$ then we have

$\mathbf{v}_2 = -\frac{t_1}{t_2} \mathbf{v}_1 - \frac{t_3}{t_2} \mathbf{v}_3,$

while if $t_3 \neq 0,$ we have

$\mathbf{v}_3 = -\frac{t_1}{t_3} \mathbf{v}_1 - \frac{t_2}{t_3} \mathbf{v}_2.$

So, no matter what, the fact that $\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\}$ is a linearly dependent set implies that at least one vector in this set is a linear combination of the other two. Conversely, if it were the case that $\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\}$ was a linearly independent set, then no vector in $\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\}$ would be a linear combination of the other two.
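The case analysis above amounts to a small procedure: find any index $i$ with $t_i \neq 0$ and divide through by $t_i.$ Here is a minimal Python sketch of that procedure for vectors in $\mathbb{R}^2.$ The helper names `solve_for_one` and `combine`, and the particular vectors and coefficients, are hypothetical choices made for illustration. The example deliberately has $t_1=0,$ so the first vector cannot be the one that is solved for.

```python
from fractions import Fraction

def solve_for_one(vectors, coeffs):
    """Given t_1 v_1 + ... + t_k v_k = 0 with some t_i != 0, return (i, weights)
    such that v_i equals the sum of weights[j] * v_j over j != i."""
    i = next(idx for idx, t in enumerate(coeffs) if t != 0)
    weights = {j: -Fraction(t) / coeffs[i] for j, t in enumerate(coeffs) if j != i}
    return i, weights

def combine(vectors, weights):
    """Form the linear combination sum of weights[j] * vectors[j] in R^2."""
    x = sum(w * vectors[j][0] for j, w in weights.items())
    y = sum(w * vectors[j][1] for j, w in weights.items())
    return (x, y)

# Hypothetical example in R^2: 0*v1 + 2*v2 - 1*v3 = 0, so t_1 = 0.
vs = [(1, 0), (1, 1), (2, 2)]
ts = [0, 2, -1]
i, weights = solve_for_one(vs, ts)  # indices are 0-based, so i == 1 means v2
print(i, combine(vs, weights) == vs[i])
```

Here the sketch solves for $\mathbf{v}_2$ and finds $\mathbf{v}_2 = 0\mathbf{v}_1 + \frac{1}{2}\mathbf{v}_3.$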

Definition 5: Let $\mathbf{V}$ be a vector space and let $n$ be a nonnegative integer. We say that $\mathbf{V}$ is $n$-dimensional if it contains a linearly independent set of size $k$ for each integer $0 \leq k \leq n,$ and does not contain a linearly independent set of size $k$ for any integer $k>n.$ If no such number $n$ exists, then $\mathbf{V}$ is said to be infinite-dimensional.

Proposition 7: Suppose that $\mathbf{V}$ is a vector space which is both $m$-dimensional and $n$-dimensional. Then $m=n.$

Proof: Let us suppose without loss of generality that $m \leq n.$ Then either $m<n$ or $m=n.$ Suppose it were the case that $m<n.$ Then, since $\mathbf{V}$ is $m$-dimensional, any subset $S$ of $\mathbf{V}$ of size $n$ is linearly dependent. But this is false, since the fact that $\mathbf{V}$ is $n$-dimensional means that $\mathbf{V}$ contains a linearly independent set of size $n.$ Consequently, it must be the case that $m=n.$ Q.E.D.

In view of Proposition 7, the concept of vector space dimension introduced by Definition 5 is well-defined: if $\mathbf{V}$ is $n$-dimensional for some nonnegative integer $n$, then $n$ is the unique number with this property. We may therefore refer to $n$ as the dimension of $\mathbf{V},$ and write $\dim \mathbf{V}=n.$ If $\mathbf{V}$ is a vector space which is not $n$-dimensional for any nonnegative integer $n,$ then it is infinite-dimensional, as per Definition 5. In this case it is customary to write $\dim \mathbf{V} =\infty.$

One way to make the concept of vector space dimension more relatable is to think of it as the critical value at which a phase transition between possible linear independence and certain linear dependence occurs. That is, if one samples a set $S$ of at most $\dim \mathbf{V}$ vectors from $\mathbf{V},$ it is possible that $S$ is linearly independent; however, if one samples a set $S$ of more than $\dim \mathbf{V}$ vectors from $\mathbf{V},$ then $S$ is necessarily linearly dependent. To say that $\dim \mathbf{V} = \infty$ is to say that this transition never occurs.

Let us use Definition 5 to calculate the dimension of $\mathbf{V}=\{\mathbf{0}\}$, a vector space containing only one vector. Observe that the only two subsets of $\mathbf{V}$ are $S_0=\{\}$ and $S_1=\{\mathbf{0}\},$ and these sets have sizes $0$ and $1,$ respectively. Now, $S_0$ is linearly independent by definition (see the paragraph immediately following Definition 4), so $\mathbf{V}$ contains a linearly independent set of size zero. Moreover, $S_1$ is linearly dependent since $t\mathbf{0}=\mathbf{0}$ for any choice of $t \in \mathbb{R}$ by Proposition 2, and thus any subset of $\mathbf{V}$ of size bigger than zero is linearly dependent. We have thus shown that $\dim \{\mathbf{0}\}=0.$

As another example, let us calculate the dimension of $\mathbb{R}^1=\mathbb{R},$ the number line. A linearly independent subset of $\mathbb{R}$ of size zero is given by the empty set $S_0=\{\}.$ A linearly independent set of size one is given by $S_1=\{1\},$ or any other set containing a single non-zero real number. Consider now an arbitrary subset $S_2=\{x,y\}$ of size two. Since $x \neq y,$ at least one of $x,y$ is not equal to zero. Suppose without loss of generality that $y \neq 0.$ If $x=0,$ then we have $t_1x+t_2y=0$ with $t_1=1,t_2=0,$ so that $S_2$ is linearly dependent in this case. If $x \neq 0,$ then we have $t_1x+t_2y = 0$ with $t_1=1, t_2=-\frac{x}{y},$ so that $S_2$ is again linearly dependent. So, we have shown that any set of two real numbers is linearly dependent, and by one of the problems on Assignment 1 this implies that any set of more than two real numbers is linearly dependent. We thus conclude that $\dim \mathbb{R}=1.$
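The argument that any two distinct real numbers are linearly dependent is constructive, and can be sketched in Python. The function name `dependence` and the sample pairs are illustrative assumptions. The sketch treats the case $y=0$ explicitly rather than by symmetry, and takes $t_1=1,$ $t_2=-x/y$ when both numbers are nonzero, so that $t_1x+t_2y = x - x = 0.$

```python
from fractions import Fraction

def dependence(x, y):
    """Return (t1, t2), not both zero, with t1*x + t2*y == 0, for distinct x and y."""
    if x == y:
        raise ValueError("a set {x, y} of size two needs x != y")
    if x == 0:
        return (Fraction(1), Fraction(0))      # 1*0 + 0*y = 0
    if y == 0:
        return (Fraction(0), Fraction(1))      # 0*x + 1*0 = 0
    return (Fraction(1), -Fraction(x) / y)     # x - (x/y)*y = 0

# Check the defining relation on a few sample pairs of distinct numbers.
for x, y in [(0, 5), (3, 0), (2, 7), (Fraction(-1, 3), 4)]:
    t1, t2 = dependence(x, y)
    assert (t1, t2) != (0, 0) and t1 * x + t2 * y == 0
print(dependence(2, 7))
```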

At this point, vector spaces may seem like hopelessly complicated objects which are impossible to analyze in general. However, it turns out that for many purposes understanding a given vector space $\mathbf{V}$ can be reduced to understanding a well-chosen finite subset of $\mathbf{V}.$ The first step in this direction is the following theorem.

Theorem 1: Let $\mathbf{V}$ be an $n$-dimensional vector space, and let $B = \{\mathbf{b}_1,\dots,\mathbf{b}_n\}$ be a linearly independent set of $n$ vectors in $\mathbf{V}.$ Then, every vector $\mathbf{v}$ in $\mathbf{V}$ can be uniquely represented as a linear combination of the vectors in $B.$

Proof: Let $\mathbf{v} \in \mathbf{V}$ be any vector. Consider the set $C=\{\mathbf{v},\mathbf{b}_1,\dots,\mathbf{b}_n\}.$ Since the dimension of $\mathbf{V}$ is $n,$ the set $C$ must be linearly dependent. Thus there exist numbers $t_0,t_1,\dots,t_n \in \mathbb{R},$ not all of which are equal to zero, such that

$t_0\mathbf{v} + t_1\mathbf{b}_1 + \dots + t_n\mathbf{b}_n=\mathbf{0}.$

We claim that $t_0\neq 0.$ Indeed, if it were the case that $t_0=0,$ then the above would read

$t_1\mathbf{b}_1 + \dots + t_n\mathbf{b}_n=\mathbf{0},$

where $t_1,\dots,t_n$ are not all equal to zero. But this is impossible, since $B=\{\mathbf{b}_1,\dots,\mathbf{b}_n\}$ is a linearly independent set, and thus it cannot be the case that $t_0=0.$ Now, since $t_0 \neq 0,$ we can write

$\mathbf{v} = -\frac{t_1}{t_0}\mathbf{b}_1 - \dots - \frac{t_n}{t_0} \mathbf{b}_n,$

which shows that $\mathbf{v}$ is a linear combination of the vectors $\mathbf{b}_1,\dots,\mathbf{b}_n.$ Since $\mathbf{v}$ was arbitrary, we have shown that every vector in $\mathbf{V}$ can be represented as a linear combination of vectors from $B.$

Now let us prove uniqueness. Let $\mathbf{v} \in \mathbf{V}$ be a vector, and suppose that

$\mathbf{v}=p_1\mathbf{b}_1 + \dots +p_n\mathbf{b}_n \\ \mathbf{v}=q_1\mathbf{b}_1 + \dots +q_n\mathbf{b}_n$

are two representations of $\mathbf{v}$ as a linear combination of the vectors in $B.$ Subtracting the second of these equations from the first, we obtain the equation

$\mathbf{0}=(p_1-q_1)\mathbf{b}_1 + \dots + (p_n-q_n)\mathbf{b}_n.$

Since $B$ is a linearly independent set, we have that $p_i-q_i=0$ for all $1 \leq i \leq n,$ which means that $p_i=q_i$ for all $1 \leq i \leq n.$ We thus conclude that any two representations of any vector $\mathbf{v} \in \mathbf{V}$ as a linear combination of the vectors $\mathbf{b}_1,\dots,\mathbf{b}_n$ in fact coincide. —Q.E.D.
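As a concrete instance of Theorem 1, the following Python sketch computes the unique coefficients of a vector of $\mathbb{R}^2$ with respect to two linearly independent vectors. It rests on assumptions that go beyond this lecture: that $\mathbb{R}^2$ is $2$-dimensional (not proved here), and the use of Cramer's rule to solve the resulting $2 \times 2$ linear system, which is a standard technique but not the method of the proof above. The function name `coordinates` and the sample vectors are hypothetical.

```python
from fractions import Fraction

def coordinates(v, b1, b2):
    """Return (p1, p2) with v == p1*b1 + p2*b2, via Cramer's rule in R^2."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent")
    p1 = Fraction(v[0] * b2[1] - v[1] * b2[0], det)
    p2 = Fraction(b1[0] * v[1] - b1[1] * v[0], det)
    return (p1, p2)

b1, b2 = (1, 1), (1, -1)
v = (3, 1)
p1, p2 = coordinates(v, b1, b2)
# Reassemble v from its coordinates to confirm the representation.
rebuilt = (p1 * b1[0] + p2 * b2[0], p1 * b1[1] + p2 * b2[1])
print((p1, p2), rebuilt == v)
```

In this example $\mathbf{v}=(3,1)$ is written as $2\mathbf{b}_1+1\mathbf{b}_2,$ and because the formulas leave no freedom in choosing $p_1$ and $p_2,$ no other pair of coefficients works.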

A subset $B$ of a vector space $\mathbf{V}$ which has the property that every $\mathbf{v} \in \mathbf{V}$ can be written as a linear combination of vectors in $B$ is said to span the vector space $\mathbf{V}.$ If moreover $B$ is a linearly independent set, then $B$ is called a basis of $\mathbf{V},$ and in this case the above argument shows that every vector in $\mathbf{V}$ can be written as a unique linear combination of the vectors in $B.$ In Theorem 1, we have proven that, in an $n$-dimensional vector space $\mathbf{V},$ any linearly independent set of size $n$ is a basis. We will continue to study the relationship between the dimension of a vector space $\mathbf{V}$ and its bases in Lecture 2.