Two basic issues in linear algebra which we have not yet resolved are:
- Can we multiply vectors?
- Can we certify linear independence?
The answers to these questions turn out to be closely related to one another. In this lecture, we discuss the first item.
Let $\mathbf{V}$ be a vector space. We have seen one sort of multiplication of vectors, namely the scalar product $\langle \mathbf{v},\mathbf{w} \rangle$. On one hand, the scalar product is a proper multiplication rule in the sense that it satisfies the FOIL identity, which is referred to as bilinearity in polite company. On the other hand, the scalar product does not correspond to our usual notion of multiplication in the sense that the product of two vectors is a number, not a vector. This is strange in that one instinctively feels that the “product” of two objects should be another object of the same type. It is natural to ask whether we can define a bilinear “vector product” which has the feature that the product of two vectors in $\mathbf{V}$ is a vector in $\mathbf{V}$. In other words, we are asking whether it is possible to give some universal recipe for multiplication of vectors which would turn every vector space into an algebra.
So far, we have only seen certain specific vector spaces where a bilinear multiplication of vectors naturally presents itself. Here is a list of these spaces.
Example 1: $\mathbf{V} = \mathbb{R}$. In this case, vectors $x, y \in \mathbf{V}$ are real numbers, and the vector product $xy$ is the product of real numbers.
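Example 2: $\mathbf{V} = \mathbb{R}^2$. Technically, we have not seen this example yet, but here it is. Let $\mathbf{x} = (x_1,x_2)$ and $\mathbf{y} = (y_1,y_2)$ be vectors in $\mathbf{V}$. We then define their product to be

$$\mathbf{x}\mathbf{y} = (x_1y_1 - x_2y_2,\ x_1y_2 + x_2y_1).$$

Next week, we will see that this example of vector multiplication gives the complex number system.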
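As a quick sanity check on this example, here is a minimal Python sketch. It assumes the product on $\mathbb{R}^2$ is the usual complex multiplication rule $(x_1,x_2)(y_1,y_2) = (x_1y_1 - x_2y_2,\ x_1y_2 + x_2y_1)$; the function name `multiply` is ours, chosen only for illustration.

```python
def multiply(x, y):
    """Product on R^2, assumed to be (x1*y1 - x2*y2, x1*y2 + x2*y1)."""
    x1, x2 = x
    y1, y2 = y
    return (x1 * y1 - x2 * y2, x1 * y2 + x2 * y1)


# Compare against Python's built-in complex numbers.
x, y = (1.0, 2.0), (3.0, -1.0)
z = complex(*x) * complex(*y)
print(multiply(x, y) == (z.real, z.imag))  # True
```

With this rule the vector $(0,1)$ squares to $(-1,0)$, which is the behaviour expected of the imaginary unit.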
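Example 3: finitely supported sequences. In this example, the vector space $\mathbf{V}$ consists of infinite sequences $\mathbf{x} = (x_0,x_1,x_2,\dots)$ which are identically zero after finitely many terms. This means that $\mathbf{V}$ is isomorphic to the vector space of polynomials in a single variable. Let $\mathbf{x} = (x_0,x_1,x_2,\dots)$ and $\mathbf{y} = (y_0,y_1,y_2,\dots)$ be vectors in $\mathbf{V}$. We define their product to be

$$\mathbf{x}\mathbf{y} = (x_0y_0,\ x_0y_1 + x_1y_0,\ x_0y_2 + x_1y_1 + x_2y_0,\ \dots),$$

which is just the recipe for multiplying polynomials and collecting together terms of the same degree.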
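The recipe for multiplying polynomials and collecting terms of the same degree can be sketched in a few lines of Python. This is only an illustration: a finitely supported sequence is stored as a finite list of coefficients, and the name `poly_multiply` is hypothetical.

```python
def poly_multiply(x, y):
    """Multiply two coefficient lists; entry k collects all x[i] * y[j] with i + j = k."""
    if not x or not y:
        return []
    product = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            product[i + j] += xi * yj
    return product


# (1 + t)(1 - t) = 1 - t^2
print(poly_multiply([1, 1], [1, -1]))  # [1, 0, -1]
```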
Example 4: $\mathbf{V} = \mathbb{R}^{n \times n}$. In this example, the vector space $\mathbf{V}$ consists of matrices with $n$ rows and $n$ columns. This means that $\mathbf{V}$ is isomorphic to the vector space of linear operators on an $n$-dimensional vector space. A vector product in $\mathbf{V}$ is then defined by matrix multiplication.
The above examples are quite different from one another, and they do not appear to be given by any universal recipe for defining a product of vectors. It turns out that in order to answer the question of how to define a universal vector product, it is better not to answer it at all. This is the idea behind the tensor product, which we now introduce.
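To every pair of vectors $\mathbf{v},\mathbf{w} \in \mathbf{V}$ we associate a new vector denoted $\mathbf{v} \otimes \mathbf{w}$, which is called the tensor product of $\mathbf{v}$ and $\mathbf{w}$. However, the vector $\mathbf{v} \otimes \mathbf{w}$ does not reside in $\mathbf{V}$; rather, it is a vector in a new vector space called the tensor square of $\mathbf{V}$ and denoted $\mathbf{V} \otimes \mathbf{V}$. What is happening here is that we view the symbol $\otimes$ as a rule for multiplying two vectors, but we do not specify what this rule is; instead, we view $\mathbf{v} \otimes \mathbf{w}$ as an “unevaluated” product of two vectors. We then store this unevaluated product in a new vector space $\mathbf{V} \otimes \mathbf{V}$, which contains all unevaluated products of vectors from $\mathbf{V}$.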
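More precisely, the vectors in $\mathbf{V} \otimes \mathbf{V}$ are all unevaluated expressions of the form

$$\tau = \mathbf{v}_1 \otimes \mathbf{w}_1 + \dots + \mathbf{v}_k \otimes \mathbf{w}_k,$$

where $k$ is a natural number and $\mathbf{v}_1,\mathbf{w}_1,\dots,\mathbf{v}_k,\mathbf{w}_k \in \mathbf{V}$ are vectors. These unevaluated expressions are called tensors, and often denoted by Greek letters. So tensor products are ambiguous, in the sense that we do not specify what the result of the multiplication $\mathbf{v} \otimes \mathbf{w}$ actually is. The only thing we specify about this rule is that it is bilinear:

$$(a_1\mathbf{v}_1 + a_2\mathbf{v}_2) \otimes (b_1\mathbf{w}_1 + b_2\mathbf{w}_2) = a_1b_1\, \mathbf{v}_1 \otimes \mathbf{w}_1 + a_1b_2\, \mathbf{v}_1 \otimes \mathbf{w}_2 + a_2b_1\, \mathbf{v}_2 \otimes \mathbf{w}_1 + a_2b_2\, \mathbf{v}_2 \otimes \mathbf{w}_2,$$

where the equality means that the LHS and the RHS are different expressions for the same vector in the vector space $\mathbf{V} \otimes \mathbf{V}$.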
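A tensor in $\mathbf{V} \otimes \mathbf{V}$ which can be represented as the product of two vectors from $\mathbf{V}$ is called a simple tensor. Note that a tensor may be simple without obviously being so, in the event that it can be “factored” as in high school algebra. For example, we have

$$\mathbf{v}_1 \otimes \mathbf{w}_1 + \mathbf{v}_2 \otimes \mathbf{w}_1 = (\mathbf{v}_1 + \mathbf{v}_2) \otimes \mathbf{w}_1.$$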
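We haven’t yet said how to scale tensors by numbers. The rule for scalar multiplication of tensors is determined by bilinearity: it is defined by

$$a\, \mathbf{v} \otimes \mathbf{w} = (a\mathbf{v}) \otimes \mathbf{w} = \mathbf{v} \otimes (a\mathbf{w})$$

and

$$a \sum_{i=1}^k \mathbf{v}_i \otimes \mathbf{w}_i = \sum_{i=1}^k a\, \mathbf{v}_i \otimes \mathbf{w}_i.$$

We can summarize all of the above by saying that two tensors $\tau,\sigma \in \mathbf{V} \otimes \mathbf{V}$ are equal if and only if it is possible to rewrite $\tau$ as $\sigma$ using bilinearity.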
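The lecture deliberately leaves the tensor product unevaluated, but it can help to test the rules in one concrete model. The sketch below assumes $\mathbf{V} = \mathbb{R}^n$ and represents $\mathbf{v} \otimes \mathbf{w}$ by the table of all coordinate products $v_iw_j$; this coordinate model and the helper names `tensor`, `add`, and `scale` are our assumptions, not part of the lecture.

```python
def tensor(v, w):
    """Coordinate model of the tensor product: the table of all products v[i] * w[j]."""
    return [[vi * wj for wj in w] for vi in v]


def add(s, t):
    """Entrywise sum of two tables."""
    return [[a + b for a, b in zip(rs, rt)] for rs, rt in zip(s, t)]


def scale(a, t):
    """Multiply every entry of a table by the number a."""
    return [[a * x for x in row] for row in t]


v1, v2, w = [1, 2], [3, -1], [5, 7]

# Factoring a sum of simple tensors, as in high school algebra.
lhs = add(tensor(v1, w), tensor(v2, w))
rhs = tensor([a + b for a, b in zip(v1, v2)], w)
print(lhs == rhs)  # True

# A scalar can be attached to either factor.
print(scale(3, tensor(v1, w)) == tensor([3 * a for a in v1], w) == tensor(v1, [3 * b for b in w]))  # True
```

In this model two expressions are equal exactly when their tables agree, which gives a mechanical way to check whether one expression can be rewritten as another.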
Tensor products take a while to get used to. It’s important to remember that the only specified property of the tensor product is bilinearity; apart from this, it’s entirely ambiguous. So, anything we can say about tensor products must ultimately be a consequence of bilinearity. Here is an example.
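Proposition 1: For any $\mathbf{v} \in \mathbf{V}$ we have

$$\mathbf{v} \otimes \mathbf{0}_\mathbf{V} = \mathbf{0}_\mathbf{V} \otimes \mathbf{v} = \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V}.$$

Proof: We are going to use the fact that scaling any vector by the number $0 \in \mathbb{R}$ produces the zero vector $\mathbf{0}_\mathbf{V}$. This was proved in Lecture 1, when we discussed the definition of a vector space. We have

$$\mathbf{v} \otimes \mathbf{0}_\mathbf{V} = \mathbf{v} \otimes (0\,\mathbf{0}_\mathbf{V}) = (0\mathbf{v}) \otimes \mathbf{0}_\mathbf{V} = \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V}.$$

Notice that bilinearity was used here to move the scalar zero from the second factor in the tensor product to the first factor in the tensor product. The proof that $\mathbf{0}_\mathbf{V} \otimes \mathbf{v} = \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V}$ is essentially the same (try it!).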
— Q.E.D.
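Using Proposition 1, we can explicitly identify the “zero tensor,” i.e. the zero vector in the vector space $\mathbf{V} \otimes \mathbf{V}$.

Proposition 2: We have

$$\mathbf{0}_{\mathbf{V} \otimes \mathbf{V}} = \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V}.$$

Proof: Let

$$\tau = \sum_{i=1}^k \mathbf{v}_i \otimes \mathbf{w}_i$$

be any tensor. We want to prove that

$$\tau + \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V} = \tau.$$

In the case $k=1$ we have $\tau = \mathbf{v}_1 \otimes \mathbf{w}_1$. Using bilinearity, we have

$$\mathbf{v}_1 \otimes \mathbf{w}_1 + \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V} = \mathbf{v}_1 \otimes \mathbf{w}_1 + \mathbf{0}_\mathbf{V} \otimes \mathbf{w}_1 = (\mathbf{v}_1 + \mathbf{0}_\mathbf{V}) \otimes \mathbf{w}_1 = \mathbf{v}_1 \otimes \mathbf{w}_1,$$

where we used Proposition 1 and bilinearity.

The case $k>1$ now follows from the case $k=1$, applied to the last summand:

$$\tau + \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V} = \sum_{i=1}^{k-1} \mathbf{v}_i \otimes \mathbf{w}_i + \left(\mathbf{v}_k \otimes \mathbf{w}_k + \mathbf{0}_\mathbf{V} \otimes \mathbf{0}_\mathbf{V}\right) = \tau.$$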
— Q.E.D.
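Suppose now that $\mathbf{V}$ is a Euclidean space, i.e. it comes with a scalar product $\langle \cdot,\cdot \rangle$. Then, there is an associated scalar product on the vector space $\mathbf{V} \otimes \mathbf{V}$, which by abuse of notation we also write as $\langle \cdot,\cdot \rangle$. This natural scalar product on $\mathbf{V} \otimes \mathbf{V}$ is uniquely determined by the requirement that

$$\langle \mathbf{v}_1 \otimes \mathbf{w}_1, \mathbf{v}_2 \otimes \mathbf{w}_2 \rangle = \langle \mathbf{v}_1,\mathbf{v}_2 \rangle \langle \mathbf{w}_1,\mathbf{w}_2 \rangle.$$

Exercise 1: Verify that the scalar product on $\mathbf{V} \otimes \mathbf{V}$ just defined really does satisfy the scalar product axioms.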
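For a numerical feel for this requirement, here is a small sketch in a coordinate model. It assumes $\mathbf{V} = \mathbb{R}^n$ with the standard dot product, represents $\mathbf{v} \otimes \mathbf{w}$ by the table of products $v_iw_j$, and takes the scalar product of two tables to be the sum of entrywise products; these modelling choices and the names `dot`, `tensor`, and `tensor_dot` are assumptions made for illustration only.

```python
def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))


def tensor(v, w):
    """Coordinate model of the tensor product: the table of all products v[i] * w[j]."""
    return [[vi * wj for wj in w] for vi in v]


def tensor_dot(s, t):
    """Scalar product of two tables: the sum of entrywise products."""
    return sum(a * b for rs, rt in zip(s, t) for a, b in zip(rs, rt))


v1, w1, v2, w2 = [1, 2], [3, 4], [5, 6], [7, 8]
lhs = tensor_dot(tensor(v1, w1), tensor(v2, w2))
rhs = dot(v1, v2) * dot(w1, w2)
print(lhs == rhs)  # True
```

This is a spot check on one choice of vectors, not a substitute for Exercise 1.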
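Proposition 3: If $S = \{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ is an orthogonal set of vectors in $\mathbf{V}$, then

$$S \otimes S = \{\mathbf{v}_i \otimes \mathbf{v}_j \colon 1 \leq i,j \leq n\}$$

is an orthogonal set of tensors in $\mathbf{V} \otimes \mathbf{V}$.

Proof: We must show that if $\mathbf{v}_i \otimes \mathbf{v}_j, \mathbf{v}_k \otimes \mathbf{v}_l \in S \otimes S$ are different tensors, then their scalar product is zero. We have

$$\langle \mathbf{v}_i \otimes \mathbf{v}_j, \mathbf{v}_k \otimes \mathbf{v}_l \rangle = \langle \mathbf{v}_i,\mathbf{v}_k \rangle \langle \mathbf{v}_j,\mathbf{v}_l \rangle.$$

The assumption that these tensors are different is equivalent to saying that one of the following conditions holds:

$$i \neq k \quad \text{or} \quad j \neq l.$$

Since $S$ is an orthogonal set, the first possibility implies $\langle \mathbf{v}_i,\mathbf{v}_k \rangle = 0$, and the second implies $\langle \mathbf{v}_j,\mathbf{v}_l \rangle = 0$. In either case, the product $\langle \mathbf{v}_i,\mathbf{v}_k \rangle \langle \mathbf{v}_j,\mathbf{v}_l \rangle$ is equal to zero.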
— Q.E.D.
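Theorem 1: If $E = \{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ is an orthonormal basis in $\mathbf{V}$, then

$$E \otimes E = \{\mathbf{e}_i \otimes \mathbf{e}_j \colon 1 \leq i,j \leq n\}$$

is an orthonormal basis in $\mathbf{V} \otimes \mathbf{V}$.

Proof: Let us first show that $E \otimes E$ spans $\mathbf{V} \otimes \mathbf{V}$. We have

$$\tau = \sum_{k} \mathbf{v}_k \otimes \mathbf{w}_k = \sum_{k} \left( \sum_{i=1}^n \langle \mathbf{e}_i,\mathbf{v}_k \rangle \mathbf{e}_i \right) \otimes \left( \sum_{j=1}^n \langle \mathbf{e}_j,\mathbf{w}_k \rangle \mathbf{e}_j \right) = \sum_{i,j=1}^n \left( \sum_{k} \langle \mathbf{e}_i,\mathbf{v}_k \rangle \langle \mathbf{e}_j,\mathbf{w}_k \rangle \right) \mathbf{e}_i \otimes \mathbf{e}_j,$$

which shows that an arbitrary tensor is a linear combination of the tensors $\mathbf{e}_i \otimes \mathbf{e}_j$.

Since $E$ is an orthogonal set in $\mathbf{V}$, by Proposition 3 we have that $E \otimes E$ is an orthogonal set in $\mathbf{V} \otimes \mathbf{V}$, and therefore it is linearly independent.

It remains only to show that all tensors in $E \otimes E$ have unit length. This is established by direct computation:

$$\|\mathbf{e}_i \otimes \mathbf{e}_j\|^2 = \langle \mathbf{e}_i \otimes \mathbf{e}_j, \mathbf{e}_i \otimes \mathbf{e}_j \rangle = \langle \mathbf{e}_i,\mathbf{e}_i \rangle \langle \mathbf{e}_j,\mathbf{e}_j \rangle = 1.$$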
— Q.E.D.
Corollary 1: If $\dim \mathbf{V} = n$, then $\dim \mathbf{V} \otimes \mathbf{V} = n^2$.
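The count in Corollary 1 can be checked numerically in a coordinate model. The sketch below assumes $\mathbf{V} = \mathbb{R}^n$ with its standard orthonormal basis, represents a simple tensor by its table of coordinate products, and uses the sum of entrywise products as the scalar product; the helper names are hypothetical.

```python
from itertools import product


def tensor(v, w):
    """Coordinate model of the tensor product: the table of all products v[i] * w[j]."""
    return [[vi * wj for wj in w] for vi in v]


def tensor_dot(s, t):
    """Scalar product of two tables: the sum of entrywise products."""
    return sum(a * b for rs, rt in zip(s, t) for a, b in zip(rs, rt))


n = 3
basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# All n^2 tensor products of pairs of basis vectors.
basis_tensors = [tensor(e, f) for e, f in product(basis, repeat=2)]

# Gram matrix of scalar products; orthonormality means it is the identity.
gram = [[tensor_dot(s, t) for t in basis_tensors] for s in basis_tensors]
identity = [[1 if a == b else 0 for b in range(n * n)] for a in range(n * n)]

print(len(basis_tensors), gram == identity)  # 9 True
```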
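It is important to note that the tensor product is noncommutative: it is typically not the case that $\mathbf{v} \otimes \mathbf{w} = \mathbf{w} \otimes \mathbf{v}$. However, we can decompose a simple tensor into two pieces, as

$$\mathbf{v} \otimes \mathbf{w} = \frac{\mathbf{v} \otimes \mathbf{w} + \mathbf{w} \otimes \mathbf{v}}{2} + \frac{\mathbf{v} \otimes \mathbf{w} - \mathbf{w} \otimes \mathbf{v}}{2}.$$

The first of these fractions is called the “symmetric part” of $\mathbf{v} \otimes \mathbf{w}$ and is denoted

$$\mathbf{v} \vee \mathbf{w} := \frac{\mathbf{v} \otimes \mathbf{w} + \mathbf{w} \otimes \mathbf{v}}{2}.$$

The reason for this notation is that we can think of $\vee$ as a symmetric version of the tensor product: a bilinear multiplication of vectors that, by construction, is commutative:

$$\mathbf{v} \vee \mathbf{w} = \mathbf{w} \vee \mathbf{v}.$$

Note that if $\mathbf{v} = \mathbf{w}$, the symmetric tensor product produces the same tensor as the tensor product itself:

$$\mathbf{v} \vee \mathbf{v} = \mathbf{v} \otimes \mathbf{v}.$$

The second fraction above is called the “antisymmetric part” of $\mathbf{v} \otimes \mathbf{w}$ and denoted

$$\mathbf{v} \wedge \mathbf{w} := \frac{\mathbf{v} \otimes \mathbf{w} - \mathbf{w} \otimes \mathbf{v}}{2}.$$

This is an antisymmetric version of the tensor product in that, by construction, it satisfies

$$\mathbf{v} \wedge \mathbf{w} = -\mathbf{w} \wedge \mathbf{v}.$$

Note that the antisymmetric tensor product of any vector with itself produces the zero tensor:

$$\mathbf{v} \wedge \mathbf{v} = \mathbf{0}_{\mathbf{V} \otimes \mathbf{V}}.$$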
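The symmetric and antisymmetric parts can be tried out in a coordinate model. The sketch below assumes $\mathbf{V} = \mathbb{R}^n$, stores a simple tensor as its table of coordinate products, and uses exact fractions to avoid rounding; the function names `sym` and `wedge` are ours, not notation from the lecture.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def tensor(v, w):
    """Coordinate model of the tensor product: the table of all products v[i] * w[j]."""
    return [[vi * wj for wj in w] for vi in v]


def combine(a, s, b, t):
    """Linear combination a*s + b*t of two tables."""
    return [[a * x + b * y for x, y in zip(rs, rt)] for rs, rt in zip(s, t)]


def sym(v, w):
    """Symmetric part: half of (v tensor w) plus (w tensor v)."""
    return combine(HALF, tensor(v, w), HALF, tensor(w, v))


def wedge(v, w):
    """Antisymmetric part: half of (v tensor w) minus (w tensor v)."""
    return combine(HALF, tensor(v, w), -HALF, tensor(w, v))


v, w = [1, 2, 3], [4, 5, 6]
zero = [[0] * 3 for _ in range(3)]

print(combine(1, sym(v, w), 1, wedge(v, w)) == tensor(v, w))  # the two parts add up
print(sym(v, w) == sym(w, v))                                 # commutative
print(wedge(v, w) == combine(-1, wedge(w, v), 0, zero))       # anticommutative
print(wedge(v, v) == zero, sym(v, v) == tensor(v, v))         # behaviour on v with itself
```

Each line prints `True` for this choice of vectors.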
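Although it may seem like the symmetric tensor product is more natural (commutative products are nice), it turns out that the antisymmetric tensor product (or wedge product, as it’s often called) is more important. Here is a first indication of this. Suppose that $\mathbf{V}$ is a $2$-dimensional Euclidean space with orthonormal basis $E = \{\mathbf{e}_1,\mathbf{e}_2\}$. Let

$$\mathbf{v} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 \quad \text{and} \quad \mathbf{w} = y_1\mathbf{e}_1 + y_2\mathbf{e}_2$$

be two vectors in $\mathbf{V}$. Let’s compute their wedge product: using FOIL, we find

$$\begin{aligned} \mathbf{v} \wedge \mathbf{w} &= (x_1\mathbf{e}_1 + x_2\mathbf{e}_2) \wedge (y_1\mathbf{e}_1 + y_2\mathbf{e}_2) \\ &= x_1y_1\, \mathbf{e}_1 \wedge \mathbf{e}_1 + x_1y_2\, \mathbf{e}_1 \wedge \mathbf{e}_2 + x_2y_1\, \mathbf{e}_2 \wedge \mathbf{e}_1 + x_2y_2\, \mathbf{e}_2 \wedge \mathbf{e}_2 \\ &= x_1y_2\, \mathbf{e}_1 \wedge \mathbf{e}_2 + x_2y_1\, \mathbf{e}_2 \wedge \mathbf{e}_1 \\ &= (x_1y_2 - x_2y_1)\, \mathbf{e}_1 \wedge \mathbf{e}_2. \end{aligned}$$

Probably, you recognize the lone scalar remaining at the end of this computation as a determinant:

$$x_1y_2 - x_2y_1 = \det \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix}.$$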
Even if you don’t, no need to worry: you are not expected to know what a determinant is at this point. Indeed, in Lecture 22 we are going to use the wedge product to define determinants.
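To see the scalar $x_1y_2 - x_2y_1$ emerge from a computation, here is a sketch in a coordinate model. It assumes $\mathbf{V} = \mathbb{R}^2$ with the standard basis and stores the wedge product of two vectors as the table with entries $\frac{1}{2}(v_iw_j - w_iv_j)$; the function name `wedge` and the sample vectors are chosen only for illustration.

```python
from fractions import Fraction


def wedge(v, w):
    """Coordinate model of the wedge product: entries (v[i]*w[j] - w[i]*v[j]) / 2."""
    half = Fraction(1, 2)
    n = len(v)
    return [[half * (v[i] * w[j] - w[i] * v[j]) for j in range(n)] for i in range(n)]


e1, e2 = [1, 0], [0, 1]
v, w = [2, 3], [5, 7]

# The scalar x1*y2 - x2*y1 for these vectors: 2*7 - 3*5 = -1.
det = v[0] * w[1] - v[1] * w[0]

# The wedge product of v and w is that scalar times the wedge product of e1 and e2.
scaled_basis_wedge = [[det * x for x in row] for row in wedge(e1, e2)]
print(det, wedge(v, w) == scaled_basis_wedge)  # -1 True
```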