Section 7.3 Linear transformations
Recall the definition of a linear transformation (Definition 7.1.6). We want to give examples of linear transformations and also to verify that the transformations in \(\R^2\) given in Section 7.2 are linear.
Subsection 7.3.1 Examples of linear transformations
The following transformations \(L\colon \R^n\to\R^m\) are linear:
- The zero transformation: \(L(\vec x)=\vec0\) for all \(\vec x\text{.}\)\begin{equation*} L(\vec x)+L(\vec y)=\vec0+\vec0=\vec0=L(\vec x + \vec y)\\ rL(\vec x)=r\vec0=\vec0=L(r\vec x)\text{.} \end{equation*}
- The identity transformation (for \(m=n)\text{:}\) \(L(\vec x)=\vec x\) for all \(\vec x\text{.}\)\begin{equation*} L(\vec x)+L(\vec y)=\vec x+\vec y=L(\vec x + \vec y)\\ rL(\vec x)=r\vec x=L(r\vec x)\text{.} \end{equation*}
- \(L((x_1,x_2,x_3))=(x_1+x_2,x_2+x_3)\)\begin{equation*} \begin{array}{rl} L(\vec x)+L(\vec y) \amp =L((x_1,x_2,x_3))+L((y_1,y_2,y_3))\\ \amp = (x_1+x_2,x_2+x_3)+(y_1+y_2,y_2+y_3)\\ \amp =(x_1+x_2+y_1+y_2, x_2+x_3+y_2+y_3)\\ \amp =(x_1+y_1+x_2+y_2, x_2+y_2+x_3+y_3)\\ \amp =L((x_1+y_1,x_2+y_2,x_3+y_3))\\ \amp =L( (x_1,x_2,x_3)+ (y_1,y_2,y_3))\\ \amp =L(\vec x+\vec y) \text{, and} \end{array} \end{equation*}\begin{equation*} \begin{array}{rl} rL(\vec x) \amp =rL((x_1,x_2,x_3))\\ \amp =r(x_1+x_2,x_2+x_3)\\ \amp =(rx_1+rx_2,rx_2+rx_3)\\ \amp =L((rx_1,rx_2,rx_3))\\ \amp =L(r\vec x). \end{array} \end{equation*}
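These verifications can also be spot-checked numerically. Here is a minimal sketch in Python (assuming numpy is available; the sample vectors and scalar are arbitrary) that tests the third example on concrete inputs:

```python
import numpy as np

# The third example above, L(x1, x2, x3) = (x1 + x2, x2 + x3),
# written as a Python function on numpy arrays.
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 4.0])
r = 2.5

# Additivity: L(x + y) == L(x) + L(y)
assert np.allclose(L(x + y), L(x) + L(y))
# Homogeneity: L(r x) == r L(x)
assert np.allclose(L(r * x), r * L(x))
```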
Here are some interesting transformations \(L\colon \R^2\to\R^2\) from Section 7.2, each of which can be shown to be linear: rotation through an angle \(\theta\text{,}\) reflection by a line through the origin, projection onto a line through the origin, and dilation \(L(\vec x)=s\vec x\text{.}\)
Subsection 7.3.2 First properties of linear transformations
Theorem 7.3.8. First properties.
Any linear transformation \(L\) satisfies
\(\displaystyle L(\vec0)=\vec0\)
\(\displaystyle L(\vec x-\vec y)=L(\vec x)-L(\vec y)\)
\(L(\vec x-\vec y)=\vec0\) if and only if \(L(\vec x) =L(\vec y)\)
\(\displaystyle L(r_1\vec x_1+r_2\vec x_2)=r_1L(\vec x_1)+r_2L(\vec x_2)\)
For any vectors \(\vec x_1, \vec x_2,\ldots,\vec x_n\) and real numbers \(r_1, r_2,\ldots,r_n\text{,}\)
\begin{equation*} L(r_1\vec x_1+r_2 \vec x_2+\cdots+r_n\vec x_n)= r_1L(\vec x_1)+r_2 L(\vec x_2)+\cdots+r_nL(\vec x_n) \end{equation*}
Proof.
We evaluate \(L(\vec 0+\vec 0)\) in two ways:
- Since \(\vec 0+\vec 0=\vec0\text{,}\) we have \(L(\vec 0+\vec 0)=L(\vec 0)\text{.}\)
- Since \(\vec 0+\vec 0=2\vec0\text{,}\) we have \(L(\vec 0+\vec 0)= L(2\vec 0)=2L(\vec 0)\text{.}\)
Comparing the two evaluations gives
\begin{equation*} 2L(\vec 0)=L(\vec 0)\text{, and so } L(\vec 0)=\vec0\text{.} \end{equation*}
\(\displaystyle L(\vec x-\vec y) =L(\vec x+(-1)\vec y) =L(\vec x)+L((-1)\vec y) =L(\vec x)-L(\vec y)\)
\(L(\vec x-\vec y)=\vec0\) if and only if \(L(\vec x)-L(\vec y)=\vec0\text{,}\) which in turn holds if and only if \(L(\vec x) =L(\vec y)\text{.}\)
\(\displaystyle L(r_1\vec x_1+r_2\vec x_2) =L(r_1\vec x_1)+L(r_2\vec x_2) =r_1L(\vec x_1)+r_2L(\vec x_2) \)
We apply the previous addition property repeatedly:
\begin{alignat*}{1} L(r_1\vec x_1+r_2 \vec x_2\amp+\cdots+r_n\vec x_n)\\ \amp = L(r_1\vec x_1)+L(r_2 \vec x_2+\cdots+r_n\vec x_n)\\ \amp = r_1L(\vec x_1)+ L(r_2\vec x_2+\cdots+r_n\vec x_n)\\ \amp = r_1L(\vec x_1)+ L(r_2\vec x_2) +L(r_3\vec x_3+\cdots+r_n\vec x_n)\\ \amp = r_1L(\vec x_1)+ r_2L(\vec x_2) +L(r_3\vec x_3+\cdots+r_n\vec x_n)\\ \amp \,\,\,\vdots\\ \amp = r_1L(\vec x_1)+r_2 L(\vec x_2) +\cdots+r_nL(\vec x_n) \end{alignat*}
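The properties in Theorem 7.3.8 can likewise be checked on concrete data. The following sketch (the map and the sample vectors are arbitrary illustrations, not from the text) tests properties 1, 2, and 5:

```python
import numpy as np

# An arbitrary sample linear map L(x1, x2, x3) = (x1 + x2, x2 + x3).
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

x, y = np.array([1.0, -2.0, 0.5]), np.array([3.0, 1.0, -1.0])

assert np.allclose(L(np.zeros(3)), np.zeros(2))   # L(0) = 0
assert np.allclose(L(x - y), L(x) - L(y))         # L(x - y) = L(x) - L(y)

# L(r1 x1 + ... + rn xn) = r1 L(x1) + ... + rn L(xn)
rs = [2.0, -1.0, 0.5]
xs = [np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]
lhs = L(sum(r * v for r, v in zip(rs, xs)))
rhs = sum(r * L(v) for r, v in zip(rs, xs))
assert np.allclose(lhs, rhs)
```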
Checkpoint 7.3.9.
Show that the reflection by a line not passing through \(\mathbf0\) is not a linear transformation.
Since the line does not pass through \(\mathbf0\text{,}\) the reflection satisfies \(L(\mathbf0)\not=\mathbf0\text{.}\) However, Theorem 7.3.8 shows that any linear transformation satisfies \(L(\mathbf0)=\mathbf0\text{.}\)
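To see this failure concretely, here is a small sketch; the horizontal line \(y=1\) is a hypothetical choice of line not through the origin, whose reflection is \((x,y)\mapsto(x,2-y)\text{:}\)

```python
import numpy as np

# Reflection across the horizontal line y = 1 (a concrete line that
# does not pass through the origin): (x, y) -> (x, 2 - y).
def reflect_y1(v):
    return np.array([v[0], 2.0 - v[1]])

# The origin is not fixed, so by Theorem 7.3.8 the map cannot be linear.
print(reflect_y1(np.array([0.0, 0.0])))   # [0. 2.], not the zero vector
```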
Theorem 7.3.10. New linear transformations from old ones.
Suppose \(L_1\colon \R^n\to\R^m\) and \(L_2\colon \R^n\to\R^m\) are linear transformations, and \(r_1\) and \(r_2\) are scalars. Then \(T\colon \R^n\to\R^m\) defined by \(T=r_1L_1+r_2L_2\) is also a linear transformation.
Proof.
For any \(\vec x\) and \(\vec y\text{,}\)
\begin{equation*} T(\vec x+\vec y) =r_1L_1(\vec x+\vec y)+r_2L_2(\vec x+\vec y) =r_1L_1(\vec x)+r_1L_1(\vec y)+r_2L_2(\vec x)+r_2L_2(\vec y) =T(\vec x)+T(\vec y)\text{,} \end{equation*}
and
\begin{equation*} sT(\vec x) =s\bigl(r_1L_1(\vec x)+r_2L_2(\vec x)\bigr) =r_1L_1(s\vec x)+r_2L_2(s\vec x) =T(s\vec x)\text{.} \end{equation*}
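A small numerical illustration of the theorem (the maps \(L_1\text{,}\) \(L_2\) and the scalars below are arbitrary choices, not from the text):

```python
import numpy as np

# Two arbitrary linear maps R^3 -> R^2 and two arbitrary scalars.
def L1(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

def L2(x):
    return np.array([x[0], x[2]])

r1, r2 = 3.0, -2.0

# The combination T = r1*L1 + r2*L2 from Theorem 7.3.10.
def T(x):
    return r1 * L1(x) + r2 * L2(x)

# T is again linear: spot-check additivity and homogeneity.
x, y, s = np.array([1.0, 2.0, 3.0]), np.array([0.5, -1.0, 4.0]), 1.5
assert np.allclose(T(x + y), T(x) + T(y))
assert np.allclose(T(s * x), s * T(x))
```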
The standard basis of \(\R^n\) is the set of vectors \(\{\vec e_1, \vec e_2,\ldots,\vec e_n\}\) where \(\vec e_k\) has a \(1\) as its \(k\)-th entry and \(0\) as every other entry.
Theorem 7.3.11. The value of \(L\) on the standard basis determines \(L\) everywhere.
If the values \(L(\vec e_1), L(\vec e_2),\ldots, L(\vec e_n)\) are known, then the value of \(L(\vec x)\) is known for all \(\vec x\) in \(\R^n\text{.}\)
Proof.
Suppose \(L(\vec e_1)=\vec f_1, L(\vec e_2)=\vec f_2,\ldots,L(\vec e_n)=\vec f_n\text{,}\) and \(\vec x=(x_1,x_2,\ldots,x_n)\text{.}\) Then \((x_1,x_2,\ldots,x_n)=x_1\vec e_1 + x_2\vec e_2+\cdots+x_n\vec e_n\) and, by Theorem 7.3.8,
\begin{equation*} L(\vec x)=L(x_1\vec e_1 + x_2\vec e_2+\cdots+x_n\vec e_n)=x_1\vec f_1 + x_2\vec f_2+\cdots+x_n\vec f_n\text{.} \end{equation*}
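The proof translates directly into a computation: knowing only the basis values \(\vec f_k=L(\vec e_k)\text{,}\) we can recover \(L(\vec x)\) for any \(\vec x\text{.}\) A sketch, with an arbitrary sample map standing in for \(L\text{:}\)

```python
import numpy as np

# An arbitrary sample linear map R^3 -> R^2.
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

n = 3
# f_k = L(e_k): the rows of the identity matrix are e_1, ..., e_n.
basis_values = [L(e) for e in np.eye(n)]

# Reconstruct L(x) as x1*f1 + ... + xn*fn, per Theorem 7.3.11.
x = np.array([2.0, -1.0, 4.0])
reconstructed = sum(xk * fk for xk, fk in zip(x, basis_values))
assert np.allclose(reconstructed, L(x))
```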
Corollary 7.3.12. \(L(\vec e_i)=\vec 0\) for \(i=1,\ldots,n\) implies \(L=0\).
If \(L(\vec e_1)=L(\vec e_2)=\cdots=L(\vec e_n)=\vec 0\text{,}\) then \(L\) is the zero transformation.
There is an easy consequence:
Theorem 7.3.13. Two linear transformations equal on the standard basis are equal everywhere.
Suppose \(L_1\colon \R^n\to\R^m\text{,}\) \(L_2\colon \R^n\to\R^m\) and \(L_1(\vec e_i) = L_2(\vec e_i)\) for \(i=1,2,\dots n\text{.}\) Then \(L_1(\vec x)=L_2(\vec x)\) for all \(\vec x\) in \(\R^n\text{,}\) and \(L_1=L_2\text{.}\)
Proof.
\(L_1(\vec e_i)-L_2(\vec e_i)=\vec 0\) for \(i=1,2,\ldots,n\) by assumption. If \(\vec x=(x_1,\dots,x_n)\text{,}\) we may write \(\vec x = x_1\vec e_1+x_2\vec e_2+\cdots +x_n\vec e_n\text{.}\) Then
\begin{equation*} L_1(\vec x)-L_2(\vec x) = x_1(L_1(\vec e_1)-L_2(\vec e_1))+\cdots+x_n(L_1(\vec e_n)-L_2(\vec e_n)) =\vec 0\text{.} \end{equation*}
From this we see that \(L_1(\vec x)=L_2(\vec x)\) for all \(\vec x\) in \(\R^n\text{,}\) and \(L_1=L_2\text{.}\)
Subsection 7.3.3 Matrix representation of a linear transformation
Suppose we have a linear transformation \(L\colon \R^n\to\R^m\text{.}\) As usual, let \(\{\vec e_1,\vec e_2,\ldots,\vec e_n\}\) be the standard basis for \(\R^n\text{.}\) Consider the vectors \(\{L(\vec e_1) ,L(\vec e_2), \ldots, L(\vec e_n)\}\) in \(\R^m\text{.}\) We form a matrix \(A\) by using the entries of \(L(\vec e_k)\) for the \(k\)-th column. We can think of the construction in the following way:
\begin{equation*} A= \begin{bmatrix} L(\vec e_1) \amp L(\vec e_2) \amp \cdots \amp L(\vec e_n) \end{bmatrix}\text{.} \end{equation*}
Notice that \(A\) is an \(m\times n\) matrix, and hence we have \(L_A\colon\R^n\to\R^m\text{.}\) We evaluate this transformation at \(\vec e_1\text{:}\)
\begin{equation*} L_A(\vec e_1)=A\vec e_1=\text{the first column of } A=L(\vec e_1)\text{.} \end{equation*}
Similarly, \(L_A(\vec e_k)=A\vec e_k=L(\vec e_k)\) for \(k=2,\ldots,n\text{.}\)
As we have seen in Theorem 7.3.11, a linear transformation is determined by its values on the standard basis, and so Theorem 7.3.13 implies \(L=L_A\text{.}\) We call the constructed matrix \(A\) the matrix representation of \(L\text{.}\)
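This construction is easy to carry out numerically. The sketch below (reusing the arbitrary sample map from earlier) builds \(A\) column by column from the values \(L(\vec e_k)\) and checks that \(A\vec x=L(\vec x)\text{:}\)

```python
import numpy as np

# An arbitrary sample linear map R^3 -> R^2.
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

n = 3
# Use L(e_k) as the k-th column of A.
A = np.column_stack([L(e) for e in np.eye(n)])
print(A)            # [[1. 1. 0.]
                    #  [0. 1. 1.]]

# The matrix representation reproduces L on any vector.
x = np.array([2.0, -1.0, 4.0])
assert np.allclose(A @ x, L(x))
```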
Theorem 7.3.14. Matrix representation of a linear transformation.
Let \(L\colon \R^n\to\R^m\) be a linear transformation, and let \(A\) be the matrix formed by letting the \(k\)-th column be the entries of \(L(\vec e_k)\text{.}\) Then
\begin{equation*} L(\vec x)=A\vec x \text{ for all } \vec x \text{ in } \R^n\text{.} \end{equation*}
Proof.
If \(\vec x=(x_1,x_2,\ldots,x_n)\text{,}\) then Theorem 7.3.8 gives
\begin{equation*} L(\vec x)=L(x_1\vec e_1+x_2\vec e_2+\cdots+x_n\vec e_n) =x_1L(\vec e_1)+x_2L(\vec e_2)+\cdots+x_nL(\vec e_n)\text{.} \end{equation*}
We can compute this in a different way: the equation (7.3.1) says that \(A\vec x\) is the same linear combination of the columns of \(A\text{,}\)
\begin{equation*} A\vec x=x_1L(\vec e_1)+x_2L(\vec e_2)+\cdots+x_nL(\vec e_n)\text{,} \end{equation*}
and so \(L(\vec x)=A\vec x\text{.}\)
Subsection 7.3.4 Examples of matrix representations
We next look at the matrix representations for the examples given in List 7.3.1; a numerical spot-check of all four matrices follows the list.
- The matrix representation of a rotation through an angle \(\theta\) is determined by its values on the standard basis \(\{\vec e_1,\vec e_2\}\text{.}\)
Figure 7.3.15. Rotation of \(\{\vec e_1,\vec e_2\}\) through an angle \(\theta\).
Since \(L(\vec e_1)=(\cos\theta,\sin\theta)\) and \(L(\vec e_2)=(-\sin\theta,\cos\theta)\text{,}\) the matrix representation is
\begin{equation*} \begin{bmatrix} \cos\theta\amp -\sin\theta\\ \sin\theta \amp \cos\theta \end{bmatrix}\text{.} \end{equation*}
- The matrix representation of the reflection by the line \(y=x\) turns out to be particularly easy.
Figure 7.3.16. Reflection of \(\{\vec e_1,\vec e_2\}\) by the line \(y=x\).
Since \(L(\vec e_1)=\vec e_2\) and \(L(\vec e_2)=\vec e_1\text{,}\) the matrix representation is
\begin{equation*} \begin{bmatrix} 0\amp 1\\ 1\amp0 \end{bmatrix}\text{.} \end{equation*}
- The matrix representation of the projection onto the line \(y=x\) is easy to see.
Figure 7.3.17. Projection of \(\{\vec e_1,\vec e_2\}\) onto the line \(y=x\).
In this case \(L(\vec e_1)=L(\vec e_2)=\frac12(1,1)\text{,}\) and so the matrix representation is
\begin{equation*} \frac12\begin{bmatrix} 1\amp 1\\ 1\amp1 \end{bmatrix}\text{.} \end{equation*}
- Figure 7.3.18. Dilation of \(\{\vec e_1,\vec e_2\}\) for \(L(\vec x)=s\vec x\).
Since \(L(\vec e_1)=s\vec e_1\) and \(L(\vec e_2)=s\vec e_2\text{,}\) the matrix representation of \(L(\vec x)=s\vec x\) is
\begin{equation*} \begin{bmatrix} s\amp 0\\ 0\amp s \end{bmatrix}\text{.} \end{equation*}
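All four matrices above can be spot-checked numerically. The sketch below (with arbitrary sample values for \(\theta\) and \(s\)) verifies where each matrix sends \(\vec e_1\) and \(\vec e_2\text{:}\)

```python
import numpy as np

theta, s = np.pi / 5, 2.5   # arbitrary sample angle and scale factor

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

rotation   = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
projection = 0.5 * np.array([[1.0, 1.0],
                             [1.0, 1.0]])
dilation   = s * np.eye(2)

# Each matrix sends e1 and e2 where the figures say it should.
assert np.allclose(rotation @ e1, [np.cos(theta), np.sin(theta)])
assert np.allclose(reflection @ e1, e2) and np.allclose(reflection @ e2, e1)
assert np.allclose(projection @ e1, [0.5, 0.5])
assert np.allclose(dilation @ e1, s * e1)

# The projection is idempotent: projecting twice is the same as once.
assert np.allclose(projection @ projection, projection)
```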
Checkpoint 7.3.19.
Give the matrix representation of the linear transformation defined as a rotation by an angle \(\theta\) clockwise.
A rotation clockwise by \(\theta\) is the same as a rotation counterclockwise by \(-\theta\text{,}\) and so the representation is
\begin{equation*} \begin{bmatrix} \cos\theta\amp \sin\theta\\ -\sin\theta \amp \cos\theta \end{bmatrix}\text{.} \end{equation*}
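As a check (with an arbitrary sample angle), the clockwise matrix is the transpose of the counterclockwise one, and the two rotations undo each other:

```python
import numpy as np

theta = np.pi / 6   # an arbitrary sample angle

def rotation(t):
    # Counterclockwise rotation by t.
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

# Clockwise rotation by theta is counterclockwise rotation by -theta,
# which is the transpose (and inverse) of the counterclockwise matrix.
clockwise = rotation(-theta)
assert np.allclose(clockwise, rotation(theta).T)
assert np.allclose(clockwise @ rotation(theta), np.eye(2))
```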