
Tuesday, 14 July 2015

Tensor product

Key ideas: Cartesian product, free vector space, quotient space, equivalence relations

Take the Cartesian product U\times V. Consider every element (u,v) to be a basis element, and form every possible formal linear combination (over the field \mathbb{F}) of these pairs. This gives us the 'free vector space' over U\times V, denoted F(U\times V).

For example, an element of F(\mathbb{R}\times \mathbb{R}) can be 2(1,0)+3(2,-2)+4(3,-2). We can't simplify this any further.
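To make the 'formal linear combination' idea concrete, here is a minimal Python sketch (not from the original post; the helper name is made up) that stores an element of F(U\times V) as a dictionary mapping basis pairs to coefficients. Note that no vector-space structure on U or V is used: (1,0) and (2,0) are unrelated basis symbols.

```python
def add_free(x, y):
    """Add two formal linear combinations, stored as {pair: coefficient}."""
    result = dict(x)
    for pair, coeff in y.items():
        result[pair] = result.get(pair, 0) + coeff
    return result

# The element 2(1,0) + 3(2,-2) + 4(3,-2) from the text:
element = {(1, 0): 2, (2, -2): 3, (3, -2): 4}

# Adding 5(2,-2) only combines coefficients of the *identical* pair (2,-2):
print(add_free(element, {(2, -2): 5}))   # {(1, 0): 2, (2, -2): 8, (3, -2): 4}
```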

But we don't actually want this vector space; it's 'too big'. We want to form a quotient space by modding out a subspace W. We specify W as the subspace spanned by all elements of the form: (u,v+v')-(u,v)-(u,v')\\(u+u',v) - (u,v) - (u',v)\\(cu,v) - c(u,v)\\(u,cv) - c(u,v).
'Modding these out' effectively sets all of these to 0 in the quotient space F(U\times V)/W.

So the vectors of U\otimes V are actually the equivalence classes of F(U\times V) under the following equivalence relations: (u,v+v')\sim (u,v)+(u,v')\\(u+u',v)\sim (u,v)+(u',v)\\(cu,v)\sim c(u,v)\sim (u,cv).
An element (u,v)+W is written u\otimes v. Then the above rules become: u\otimes (v+v')=u\otimes v+u\otimes v'\\(u+u')\otimes v=u\otimes v+u'\otimes v\\c(u\otimes v)=(cu\otimes v)=(u\otimes cv).
In other words, the map \otimes: U\times V\to U\otimes V is bilinear.

Let's look at 2(1,0)+3(2,-2)+4(3,-2) as it occurs in the tensor product \mathbb{R}\otimes \mathbb{R}. It becomes:
\begin{align}2(1\otimes 0)+3(2\otimes (-2))+4(3\otimes (-2))&=2\otimes 0+6\otimes(-2)+12\otimes(-2)\\&=2\otimes 0+(6+12)\otimes(-2)\\&=2\otimes 0+18\otimes (-2)\\&=2\otimes (9\cdot 0)+18\otimes (-2)\\&=18\otimes 0+18\otimes (-2)\\&=18\otimes (0+(-2))\\&=18\otimes (-2)\\&=-2(18\otimes 1)\\&=-36(1\otimes 1).\end{align}
In fact, for any two real numbers a,b, we have (a\otimes b)=ab(1\otimes 1). So \{1\} is a basis for \mathbb{R} and \{1\otimes 1\} is a basis for \mathbb{R}\otimes \mathbb{R}, which is thus a 1-dimensional vector space over \mathbb{R}, and hence isomorphic to \mathbb{R}, namely \mathbb{R}\otimes \mathbb{R} \cong \mathbb{R} (as vector spaces). But note that we can do something in the vector space \mathbb{R}\otimes \mathbb{R} we can't do ordinarily in a vector space: we can 'multiply vectors'.
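Spelled out, the one-line derivation behind (a\otimes b)=ab(1\otimes 1) uses only the scalar rule c(u\otimes v)=(cu)\otimes v=(u\otimes cv): \begin{align}a\otimes b=(a\cdot 1)\otimes (b\cdot 1)=a\,\big(1\otimes (b\cdot 1)\big)=ab\,(1\otimes 1).\end{align}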

If U has a basis \{e_1,\cdots,e_n\} and V has a basis \{e'_1,\cdots,e'_m\}, then U\otimes V has the basis \{e_i\otimes e'_j\}, where i=1,\cdots,n and j=1,\cdots,m. Thus U\otimes V has dimension mn. In general, the map \otimes:U\times V\to U\otimes V can be thought of as the generic bilinear map on U\times V (if U=V=\mathbb{F}, it becomes ordinary field multiplication). What bilinearity means is that u\otimes v=(\sum_{i=1}^n u_ie_i) \otimes (\sum_{j=1}^m v_je'_j)=\sum_{i=1}^n \sum_{j=1}^m u_i v_j(e_i\otimes e'_j). Namely, u\otimes v is a vector in U\otimes V, expanded in the basis vectors e_i\otimes e'_j, with coefficient u_iv_j on each basis vector e_i\otimes e'_j.

Let n=2,m=3. The tensor product space is mn=6-dimensional. The basis vectors are e_1\otimes e'_1,e_1\otimes e'_2,e_1\otimes e'_3,e_2\otimes e'_1,e_2\otimes e'_2,e_2\otimes e'_3. Written as six-component column vectors, they are e_1\otimes e'_1=(1,0,0,0,0,0)^T,e_1\otimes e'_2=(0,1,0,0,0,0)^T,e_1\otimes e'_3=(0,0,1,0,0,0)^T\\e_2\otimes e'_1=(0,0,0,1,0,0)^T,e_2\otimes e'_2=(0,0,0,0,1,0)^T,e_2\otimes e'_3=(0,0,0,0,0,1)^T. For general vectors u and v, the tensor product is u\otimes v=\begin{pmatrix}u_1v_1\\u_1v_2\\u_1v_3\\u_2v_1\\u_2v_2\\u_2v_3\end{pmatrix}.
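In coordinates this is exactly the Kronecker product of the coefficient vectors. A small NumPy check of the ordering above (a sketch with made-up sample values, not part of the original post):

```python
import numpy as np

# Coefficient vectors of u = u1*e1 + u2*e2 and v = v1*e'1 + v2*e'2 + v3*e'3.
u = np.array([2.0, 5.0])          # arbitrary sample values
v = np.array([1.0, -3.0, 4.0])

# np.kron lists the coefficients as (u1*v1, u1*v2, u1*v3, u2*v1, u2*v2, u2*v3),
# matching the basis order e1(x)e'1, e1(x)e'2, e1(x)e'3, e2(x)e'1, e2(x)e'2, e2(x)e'3.
print(np.kron(u, v))
# [  2.  -6.   8.   5. -15.  20.]
```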

Let's now consider the map A:U\to U, u\mapsto Au, or \begin{pmatrix} u_1\\u_2\\ \vdots\\ u_n\end{pmatrix}\mapsto \begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{pmatrix}\begin{pmatrix} u_1\\u_2\\ \vdots\\ u_n\end{pmatrix}. On the tensor product space U\otimes V, the same matrix can still act on the vectors, so that u\mapsto Au while v\mapsto v is left untouched. The matrix that does this (for n=2, m=3 as above) is A\otimes I: A\otimes I=\begin{pmatrix}a_{11}&0&0&a_{12}&0&0\\0&a_{11}&0&0&a_{12}&0\\0&0&a_{11}&0&0&a_{12}\\a_{21}&0&0&a_{22}&0&0\\0&a_{21}&0&0&a_{22}&0\\0&0&a_{21}&0&0&a_{22}\end{pmatrix} Applying it to u\otimes v, \begin{align}(A\otimes I)(u\otimes v)&=\begin{pmatrix}a_{11}&0&0&a_{12}&0&0\\0&a_{11}&0&0&a_{12}&0\\0&0&a_{11}&0&0&a_{12}\\a_{21}&0&0&a_{22}&0&0\\0&a_{21}&0&0&a_{22}&0\\0&0&a_{21}&0&0&a_{22}\end{pmatrix} \begin{pmatrix}u_1v_1\\u_1v_2\\u_1v_3\\u_2v_1\\u_2v_2\\u_2v_3\end{pmatrix}\\&=\begin{pmatrix}(a_{11}u_1+a_{12}u_2)v_1\\(a_{11}u_1+a_{12}u_2)v_2\\(a_{11}u_1+a_{12}u_2)v_3\\(a_{21}u_1+a_{22}u_2)v_1\\(a_{21}u_1+a_{22}u_2)v_2\\(a_{21}u_1+a_{22}u_2)v_3\end{pmatrix}\\&=(Au)\otimes v. \end{align} We can see that the matrix A indeed acts only on u\in U, and leaves v\in V untouched.
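The same computation can be sanity-checked numerically. A minimal sketch with an arbitrary 2x2 matrix A (values chosen only for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # arbitrary 2x2 matrix acting on U
u = np.array([2.0, 5.0])
v = np.array([1.0, -3.0, 4.0])

# A acts on the U factor only: (A (x) I)(u (x) v) = (A u) (x) v.
lhs = np.kron(A, np.eye(3)) @ np.kron(u, v)
rhs = np.kron(A @ u, v)
print(np.allclose(lhs, rhs))       # True
```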

Similarly, the matrix B:V\to V, w\mapsto Bw, can also act on U\otimes V, as I\otimes B: I\otimes B=\begin{pmatrix}b_{11}&b_{12}&b_{13}&0&0&0\\b_{21}&b_{22}&b_{23}&0&0&0\\b_{31}&b_{32}&b_{33}&0&0&0\\0&0&0&b_{11}&b_{12}&b_{13}\\0&0&0&b_{21}&b_{22}&b_{23}\\0&0&0&b_{31}&b_{32}&b_{33}\end{pmatrix}, which acts on u\otimes v as \begin{align}(I\otimes B)(u\otimes v)&=\begin{pmatrix}b_{11}&b_{12}&b_{13}&0&0&0\\b_{21}&b_{22}&b_{23}&0&0&0\\b_{31}&b_{32}&b_{33}&0&0&0\\0&0&0&b_{11}&b_{12}&b_{13}\\0&0&0&b_{21}&b_{22}&b_{23}\\0&0&0&b_{31}&b_{32}&b_{33}\end{pmatrix}\begin{pmatrix}u_1v_1\\u_1v_2\\u_1v_3\\u_2v_1\\u_2v_2\\u_2v_3\end{pmatrix}\\ &=\begin{pmatrix}u_1(b_{11}v_1+b_{12}v_2+b_{13}v_3)\\ u_1(b_{21}v_1+b_{22}v_2+b_{23}v_3)\\u_1(b_{31}v_1+b_{32}v_2+b_{33}v_3)\\ u_2(b_{11}v_1+b_{12}v_2+b_{13}v_3)\\u_2(b_{21}v_1+b_{22}v_2+b_{23}v_3)\\ u_2(b_{31}v_1+b_{32}v_2+b_{33}v_3)\end{pmatrix}\\ &=u\otimes (Bv).\end{align} If we compose two such matrices, the multiplications are carried out in each factor separately: (A_1\otimes I)(A_2\otimes I)=(A_1A_2)\otimes I\\ (I\otimes B_1)(I\otimes B_2)=I\otimes(B_1B_2)\\(A\otimes I)(I\otimes B)=(I\otimes B)(A\otimes I)=A\otimes B.
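A quick NumPy check of these three identities (again a sketch with randomly generated matrices, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)
A1, A2 = rng.random((2, 2)), rng.random((2, 2))   # act on U (dimension 2)
B1, B2 = rng.random((3, 3)), rng.random((3, 3))   # act on V (dimension 3)
I2, I3 = np.eye(2), np.eye(3)

print(np.allclose(np.kron(A1, I3) @ np.kron(A2, I3), np.kron(A1 @ A2, I3)))  # True
print(np.allclose(np.kron(I2, B1) @ np.kron(I2, B2), np.kron(I2, B1 @ B2)))  # True
print(np.allclose(np.kron(A1, I3) @ np.kron(I2, B1), np.kron(A1, B1)))       # True
print(np.allclose(np.kron(I2, B1) @ np.kron(A1, I3), np.kron(A1, B1)))       # True
```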
We can write out A\otimes B explicitly: A\otimes B=\begin{pmatrix} a_{11}b_{11}&a_{11}b_{12}&a_{11}b_{13}&a_{12}b_{11}&a_{12}b_{12}&a_{12}b_{13}\\ a_{11}b_{21}&a_{11}b_{22}&a_{11}b_{23}&a_{12}b_{21}&a_{12}b_{22}&a_{12}b_{23}\\ a_{11}b_{31}&a_{11}b_{32}&a_{11}b_{33}&a_{12}b_{31}&a_{12}b_{32}&a_{12}b_{33}\\ a_{21}b_{11}&a_{21}b_{12}&a_{21}b_{13}&a_{22}b_{11}&a_{22}b_{12}&a_{22}b_{13}\\ a_{21}b_{21}&a_{21}b_{22}&a_{21}b_{23}&a_{22}b_{21}&a_{22}b_{22}&a_{22}b_{23}\\ a_{21}b_{31}&a_{21}b_{32}&a_{21}b_{33}&a_{22}b_{31}&a_{22}b_{32}&a_{22}b_{33} \end{pmatrix}. One can verify that (A\otimes B)(u\otimes v)=(Au)\otimes (Bv).
Other useful formulae are \text{det}(A\otimes B)=(\text{det}A)^m(\text{det}B)^n and \text{Tr}(A\otimes B)=(\text{Tr}A)(\text{Tr}B), where A is n\times n and B is m\times m.
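These can also be checked numerically for the n=2, m=3 case; a small sketch with random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((2, 2))   # n x n with n = 2
B = rng.random((3, 3))   # m x m with m = 3

AB = np.kron(A, B)
# det(A (x) B) = det(A)^m * det(B)^n  and  Tr(A (x) B) = Tr(A) * Tr(B).
print(np.isclose(np.linalg.det(AB), np.linalg.det(A)**3 * np.linalg.det(B)**2))  # True
print(np.isclose(np.trace(AB), np.trace(A) * np.trace(B)))                       # True
```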



Now we present the formal definitions related to the tensor product.

As with most of linear algebra, one has the usual choice between a concrete, basis-oriented definition and an abstract, basis-free definition. The latter has clear advantages, since it carries over to important, more general settings. Nevertheless, to build intuition, we first give the concrete (basis-oriented) definition of the tensor product of vector spaces over a field.

Definition:
Suppose A and B are vector spaces over a field \mathbb{F}, with bases \{a_1,\cdots,a_m\} and \{b_1,\cdots,b_n\} respectively. The tensor product A\otimes B is defined as the vector space of dimension mn over \mathbb{F} with basis \{a_i \otimes b_j: 1\leq i \leq m, 1\leq j \leq n\}. For any a=\sum_i \alpha_i a_i \in A and b=\sum_j \beta_j b_j \in B with \alpha_i, \beta_j \in \mathbb{F}, we define the simple tensor a\otimes b to be \sum_{i,j} \alpha_i \beta_j\, a_i\otimes b_j. Thus \alpha(a\otimes b)=\alpha a\otimes b=a\otimes \alpha b, \forall \alpha \in \mathbb{F}.

Despite its intuitive immediacy, this definition hampers development of the theory because of its reliance on the choice of basis.

We now try an abstract approach, which provides direct, computation-free proofs of all the important properties. An extra benefit is the ability to work with modules and algebras over arbitrary commutative rings, since the existence of a basis is no longer required to define the tensor product.

Let R be a ring, M a right R-module, and N a left R-module. A balanced map is a function \psi:M\times N \to G, where G=(G,+) is an Abelian group, satisfying the following properties:
\psi(a_1+a_2,b)=\psi(a_1,b)+\psi(a_2,b)\\ \psi(a,b_1+b_2)=\psi(a,b_1)+\psi(a,b_2)\\ \psi(ar,b)=\psi(a,rb)
for all a,a_i\in M,\ b,b_i\in N,\ r\in R. The tensor product is an Abelian group M\otimes_R N, endowed with a balanced map \phi_{M,N}:M\times N\to M\otimes_R N, such that for any balanced map \psi:M\times N \to G there is a unique group homomorphism \overline{\psi}:M\otimes_R N\to G such that \overline{\psi}\phi_{M,N}=\psi,
namely, the diagram
M\times N \xrightarrow{\qquad \psi \qquad} G\\ \; \phi_{M,N} \searrow \quad \qquad \nearrow \overline{\psi}\\ \qquad\;\; M\otimes_R N
is commutative. Our universal construction is achieved by taking the free Abelian group, that is, the free \mathbb{Z}-module, generated by what are to become the simple tensors, and factoring out all the desired relations.
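As a quick illustration of the universal property, tying back to the earlier example: take R=M=N=\mathbb{F} a field. Multiplication \psi(a,b)=ab is a balanced map \mathbb{F}\times \mathbb{F}\to \mathbb{F}, so there is a unique homomorphism \overline{\psi}:\mathbb{F}\otimes_{\mathbb{F}}\mathbb{F}\to \mathbb{F} with \overline{\psi}(a\otimes b)=ab. Since a\otimes b=(ab)\otimes 1, the map a\mapsto a\otimes 1 is a two-sided inverse of \overline{\psi}, so \mathbb{F}\otimes_{\mathbb{F}}\mathbb{F}\cong \mathbb{F}, recovering \mathbb{R}\otimes \mathbb{R}\cong \mathbb{R}.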

Definition: Given a right R-module M and a left R-module N over an arbitrary ring R, M\otimes_R N is defined to be F/K, where F is the free \mathbb{Z}-module with basis M\times N, the set of ordered pairs \{(a,b):a\in M, b\in N\}, and where K is the submodule generated by the elements
(a_1+a_2,b)-(a_1,b)-(a_2,b);\\ (a,b_1+b_2)-(a,b_1)-(a,b_2);\\ (ar,b)-(a,rb)
for all a,a_1,a_2\in M,\ b,b_1,b_2\in N,\ r\in R.
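A standard worked example shows how drastically the relations in K can collapse F: let R=\mathbb{Z}, M=\mathbb{Z}/2 and N=\mathbb{Z}/3. For any simple tensor, \begin{align}a\otimes b&=a\otimes (4b)&&(4b=b\text{ in }\mathbb{Z}/3)\\&=(a\cdot 4)\otimes b&&(\text{the relation }(ar,b)\sim(a,rb))\\&=0\otimes b=0&&(4a=0\text{ in }\mathbb{Z}/2,\text{ and }0\otimes b=(0+0)\otimes b=0\otimes b+0\otimes b),\end{align} so \mathbb{Z}/2\otimes_{\mathbb{Z}}\mathbb{Z}/3=0.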

However abstract the construction of M\otimes_R N may seem, it must be "correct": by the usual "abstract nonsense" argument, any group satisfying this universal property is unique up to isomorphism.

References
What are tensor products?
Algebra by Ernest Shult
Graduate Algebra: Noncommutative View by Louis Halle Rowen [p137-140]

More to read
How to lose your fear of tensor products
