Section 7.3 Linear transformations
Recall the definition of a linear transformation (Definition 7.1.6). We want to give examples of linear transformations and also to verify that the transformations in \(\R^2\) given in Section 7.2 are linear.
Subsection 7.3.1 Examples of linear transformations
The following transformations \(L\colon \R^n\to\R^m\) are linear:
- The zero transformation: \(L(\vec x)=\vec0\) for all \(\vec x\text{.}\)\begin{equation*} L(\vec x)+L(\vec y)=\vec0+\vec0=\vec0=L(\vec x + \vec y)\\ rL(\vec x)=r\vec0=\vec0=L(r\vec x)\text{.} \end{equation*}
- The identity transformation (for \(m=n)\text{:}\) \(L(\vec x)=\vec x\) for all \(\vec x\text{.}\)\begin{equation*} L(\vec x)+L(\vec y)=\vec x+\vec y=L(\vec x + \vec y)\\ rL(\vec x)=r\vec x=L(r\vec x)\text{.} \end{equation*}
- \(L((x_1,x_2,x_3))=(x_1+x_2,x_2+x_3)\)\begin{equation*} \begin{array}{rl} L(\vec x)+L(\vec y) \amp =L((x_1,x_2,x_3))+L((y_1,y_2,y_3))\\ \amp = (x_1+x_2,x_2+x_3)+(y_1+y_2,y_2+y_3)\\ \amp =(x_1+x_2+y_1+y_2, x_2+x_3+y_2+y_3)\\ \amp =(x_1+y_1+x_2+y_2, x_2+y_2+x_3+y_3)\\ \amp =L((x_1+y_1,x_2+y_2,x_3+y_3))\\ \amp =L( (x_1,x_2,x_3)+ (y_1,y_2,y_3))\\ \amp =L(\vec x+\vec y) \text{, and} \end{array} \end{equation*}\begin{equation*} \begin{array}{rl} rL(\vec x) \amp =rL((x_1,x_2,x_3))\\ \amp =r(x_1+x_2,x_2+x_3)\\ \amp =(rx_1+rx_2,rx_2+rx_3)\\ \amp =L((rx_1,rx_2,rx_3))\\ \amp =L(r\vec x). \end{array} \end{equation*}
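These verifications can also be spot-checked numerically. Here is a minimal sketch in Python (assuming numpy is available; the sample vectors and scalar are arbitrary) that tests the third example on concrete inputs:

```python
import numpy as np

# The third example above, L(x1, x2, x3) = (x1 + x2, x2 + x3),
# written as a Python function on numpy arrays.
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 4.0])
r = 2.5

# Additivity: L(x + y) == L(x) + L(y)
assert np.allclose(L(x + y), L(x) + L(y))
# Homogeneity: L(r x) == r L(x)
assert np.allclose(L(r * x), r * L(x))
```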
Here are some interesting transformations \(L\colon \R^2\to\R^2\) from Section 7.2, each of which can be shown to be linear: rotation through an angle \(\theta\text{,}\) reflection by a line through the origin, projection onto a line through the origin, and dilation \(L(\vec x)=s\vec x\text{.}\)
Subsection 7.3.2 First properties of linear transformations
Theorem 7.3.8. First properties.
Any linear transformation \(L\) satisfies
\(\displaystyle L(\vec0)=\vec0\)
\(\displaystyle L(\vec x-\vec y)=L(\vec x)-L(\vec y)\)
\(L(\vec x-\vec y)=\vec0\) if and only if \(L(\vec x) =L(\vec y)\)
\(\displaystyle L(r_1\vec x_1+r_2\vec x_2)=r_1L(\vec x_1)+r_2L(\vec x_2)\)
For any vectors \(\vec x_1, \vec x_2,\ldots,\vec x_n\) and real numbers \(r_1, r_2,\ldots,r_n\text{,}\)
\begin{equation*} L(r_1\vec x_1+r_2 \vec x_2+\cdots+r_n\vec x_n)= r_1L(\vec x_1)+r_2 L(\vec x_2)+\cdots+r_nL(\vec x_n) \end{equation*}
Proof.
We evaluate \(L(\vec 0+\vec 0)\) in two ways:
- Since \(\vec 0+\vec 0=\vec0\text{,}\) we have \(L(\vec 0+\vec 0)=L(\vec 0)\text{.}\)
- Since \(\vec 0+\vec 0=2\vec0\text{,}\) we have \(L(\vec 0+\vec 0)= L(2\vec 0)=2L(\vec 0)\text{.}\)
Comparing the two evaluations gives
\begin{equation*} 2L(\vec 0)=L(\vec 0)\text{, and so } L(\vec 0)=\vec0\text{.} \end{equation*}
\(\displaystyle L(\vec x-\vec y) =L(\vec x+(-1)\vec y) =L(\vec x)+L((-1)\vec y) =L(\vec x)-L(\vec y)\)
\(L(\vec x-\vec y)=\vec0\) if and only if \(L(\vec x)-L(\vec y)=\vec0\text{,}\) which in turn holds if and only if \(L(\vec x) =L(\vec y)\text{.}\)
\(\displaystyle L(r_1\vec x_1+r_2\vec x_2) =L(r_1\vec x_1)+L(r_2\vec x_2) =r_1L(\vec x_1)+r_2L(\vec x_2) \)
We apply the previous addition property repeatedly:
\begin{alignat*}{1} L(r_1\vec x_1+r_2 \vec x_2\amp+\cdots+r_n\vec x_n)\\ \amp = L(r_1\vec x_1)+L(r_2 \vec x_2+\cdots+r_n\vec x_n)\\ \amp = r_1L(\vec x_1)+ L(r_2\vec x_2+\cdots+r_n\vec x_n)\\ \amp = r_1L(\vec x_1)+ L(r_2\vec x_2) +L(r_3\vec x_3+\cdots+r_n\vec x_n)\\ \amp = r_1L(\vec x_1)+ r_2L(\vec x_2) +L(r_3\vec x_3+\cdots+r_n\vec x_n)\\ \amp \,\,\,\vdots\\ \amp = r_1L(\vec x_1)+r_2 L(\vec x_2) +\cdots+r_nL(\vec x_n) \end{alignat*}
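The properties in Theorem 7.3.8 can likewise be checked on concrete data. The following sketch (the map and the sample vectors are arbitrary illustrations, not from the text) tests properties 1, 2, and 5:

```python
import numpy as np

# An arbitrary sample linear map L(x1, x2, x3) = (x1 + x2, x2 + x3).
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

x, y = np.array([1.0, -2.0, 0.5]), np.array([3.0, 1.0, -1.0])

assert np.allclose(L(np.zeros(3)), np.zeros(2))   # L(0) = 0
assert np.allclose(L(x - y), L(x) - L(y))         # L(x - y) = L(x) - L(y)

# L(r1 x1 + ... + rn xn) = r1 L(x1) + ... + rn L(xn)
rs = [2.0, -1.0, 0.5]
xs = [np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]
lhs = L(sum(r * v for r, v in zip(rs, xs)))
rhs = sum(r * L(v) for r, v in zip(rs, xs))
assert np.allclose(lhs, rhs)
```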
Checkpoint 7.3.9.
Show that the reflection by a line not passing through \(\mathbf0\) is not a linear transformation.
Since the line does not pass through \(\mathbf0\text{,}\) the reflection satisfies \(L(\mathbf0)\not=\mathbf0\text{.}\) However, Theorem 7.3.8 shows that any linear transformation satisfies \(L(\mathbf0)=\mathbf0\text{.}\)
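To see this failure concretely, here is a small sketch; the horizontal line \(y=1\) is a hypothetical choice of line not through the origin, whose reflection is \((x,y)\mapsto(x,2-y)\text{:}\)

```python
import numpy as np

# Reflection across the horizontal line y = 1 (a concrete line that
# does not pass through the origin): (x, y) -> (x, 2 - y).
def reflect_y1(v):
    return np.array([v[0], 2.0 - v[1]])

# The origin is not fixed, so by Theorem 7.3.8 the map cannot be linear.
print(reflect_y1(np.array([0.0, 0.0])))   # [0. 2.], not the zero vector
```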
Theorem 7.3.10. New linear transformations from old ones.
Suppose \(L_1\colon \R^n\to\R^m\) and \(L_2\colon \R^n\to\R^m\) are linear transformations, and \(r_1\) and \(r_2\) are scalars. Then \(T\colon \R^n\to\R^m\) defined by \(T=r_1L_1+r_2L_2\) is also a linear transformation.
Proof.
For any \(\vec x\) and \(\vec y\text{,}\)
\begin{equation*} T(\vec x+\vec y) =r_1L_1(\vec x+\vec y)+r_2L_2(\vec x+\vec y) =r_1L_1(\vec x)+r_1L_1(\vec y)+r_2L_2(\vec x)+r_2L_2(\vec y) =T(\vec x)+T(\vec y)\text{,} \end{equation*}
and
\begin{equation*} sT(\vec x) =s\bigl(r_1L_1(\vec x)+r_2L_2(\vec x)\bigr) =r_1L_1(s\vec x)+r_2L_2(s\vec x) =T(s\vec x)\text{.} \end{equation*}
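A small numerical illustration of the theorem (the maps \(L_1\text{,}\) \(L_2\) and the scalars below are arbitrary choices, not from the text):

```python
import numpy as np

# Two arbitrary linear maps R^3 -> R^2 and two arbitrary scalars.
def L1(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

def L2(x):
    return np.array([x[0], x[2]])

r1, r2 = 3.0, -2.0

# The combination T = r1*L1 + r2*L2 from Theorem 7.3.10.
def T(x):
    return r1 * L1(x) + r2 * L2(x)

# T is again linear: spot-check additivity and homogeneity.
x, y, s = np.array([1.0, 2.0, 3.0]), np.array([0.5, -1.0, 4.0]), 1.5
assert np.allclose(T(x + y), T(x) + T(y))
assert np.allclose(T(s * x), s * T(x))
```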
The standard basis of \(\R^n\) is the set of vectors \(\{\vec e_1, \vec e_2,\ldots,\vec e_n\}\) where \(\vec e_k\) has a \(1\) as its \(k\)-th entry and \(0\) as every other entry.
Theorem 7.3.11. The value of \(L\) on the standard basis determines \(L\) everywhere.
If the values \(L(\vec e_1), L(\vec e_2),\ldots, L(\vec e_n)\) are known, then the value of \(L(\vec x)\) is known for all \(\vec x\) in \(\R^n\text{.}\)
Proof.
Suppose \(L(\vec e_1)=\vec f_1, L(\vec e_2)=\vec f_2,\ldots,L(\vec e_n)=\vec f_n\text{,}\) and \(\vec x=(x_1,x_2,\ldots,x_n)\text{.}\) Then \((x_1,x_2,\ldots,x_n)=x_1\vec e_1 + x_2\vec e_2+\cdots+x_n\vec e_n\) and, by Theorem 7.3.8,
\begin{equation*} L(\vec x)=L(x_1\vec e_1 + x_2\vec e_2+\cdots+x_n\vec e_n)=x_1\vec f_1 + x_2\vec f_2+\cdots+x_n\vec f_n\text{.} \end{equation*}
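The proof translates directly into a computation: knowing only the basis values \(\vec f_k=L(\vec e_k)\text{,}\) we can recover \(L(\vec x)\) for any \(\vec x\text{.}\) A sketch, with an arbitrary sample map standing in for \(L\text{:}\)

```python
import numpy as np

# An arbitrary sample linear map R^3 -> R^2.
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

n = 3
# f_k = L(e_k): the rows of the identity matrix are e_1, ..., e_n.
basis_values = [L(e) for e in np.eye(n)]

# Reconstruct L(x) as x1*f1 + ... + xn*fn, per Theorem 7.3.11.
x = np.array([2.0, -1.0, 4.0])
reconstructed = sum(xk * fk for xk, fk in zip(x, basis_values))
assert np.allclose(reconstructed, L(x))
```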
Corollary 7.3.12. \(L(\vec e_i)=\vec 0\) for \(i=1,\ldots,n\) implies \(L=0\).
If \(L(\vec e_1)=L(\vec e_2)=\cdots=L(\vec e_n)=\vec 0\text{,}\) then \(L\) is the zero transformation.
There is an easy consequence:
Theorem 7.3.13. Two linear transformations equal on the standard basis are equal everywhere.
Suppose \(L_1\colon \R^n\to\R^m\text{,}\) \(L_2\colon \R^n\to\R^m\) and \(L_1(\vec e_i) = L_2(\vec e_i)\) for \(i=1,2,\dots n\text{.}\) Then \(L_1(\vec x)=L_2(\vec x)\) for all \(\vec x\) in \(\R^n\text{,}\) and \(L_1=L_2\text{.}\)
Proof.
\(L_1(\vec e_i)-L_2(\vec e_i)=\vec 0\) for \(i=1,2,\ldots,n\) by assumption. If \(\vec x=(x_1,\dots,x_n)\text{,}\) we may write \(\vec x = x_1\vec e_1+x_2\vec e_2+\cdots +x_n\vec e_n\text{.}\) Then
\begin{equation*} L_1(\vec x)-L_2(\vec x) = x_1(L_1(\vec e_1)-L_2(\vec e_1))+\cdots+x_n(L_1(\vec e_n)-L_2(\vec e_n)) =\vec 0\text{.} \end{equation*}
From this we see that \(L_1(\vec x)=L_2(\vec x)\) for all \(\vec x\) in \(\R^n\text{,}\) and \(L_1=L_2\text{.}\)
Subsection 7.3.3 Matrix representation of a linear transformation
Suppose we have a linear transformation \(L\colon \R^n\to\R^m\text{.}\) As usual, let \(\{\vec e_1,\vec e_2,\ldots,\vec e_n\}\) be the standard basis for \(\R^n\text{.}\) Consider the vectors \(\{L(\vec e_1) ,L(\vec e_2), \ldots, L(\vec e_n)\}\) in \(\R^m\text{.}\) We form a matrix \(A\) by using the entries of \(L(\vec e_k)\) for the \(k\)-th column. We can think of the construction in the following way:
\begin{equation*} A= \begin{bmatrix} L(\vec e_1) \amp L(\vec e_2) \amp \cdots \amp L(\vec e_n) \end{bmatrix}\text{.} \end{equation*}
Notice that \(A\) is an \(m\times n\) matrix, and hence we have \(L_A\colon\R^n\to\R^m\text{.}\) We evaluate this transformation at \(\vec e_1\text{:}\)
\begin{equation*} L_A(\vec e_1)=A\vec e_1=\text{the first column of } A=L(\vec e_1)\text{.} \end{equation*}
Similarly, \(L_A(\vec e_k)=A\vec e_k=L(\vec e_k)\) for \(k=2,\ldots,n\text{.}\)
As we have seen in Theorem 7.3.11, a linear transformation is determined by its values on the standard basis, and so Theorem 7.3.13 implies \(L=L_A\text{.}\) We call the constructed matrix \(A\) the matrix representation of \(L\text{.}\)
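This construction is easy to carry out numerically. The sketch below (reusing the arbitrary sample map from earlier) builds \(A\) column by column from the values \(L(\vec e_k)\) and checks that \(A\vec x=L(\vec x)\text{:}\)

```python
import numpy as np

# An arbitrary sample linear map R^3 -> R^2.
def L(x):
    return np.array([x[0] + x[1], x[1] + x[2]])

n = 3
# Use L(e_k) as the k-th column of A.
A = np.column_stack([L(e) for e in np.eye(n)])
print(A)            # [[1. 1. 0.]
                    #  [0. 1. 1.]]

# The matrix representation reproduces L on any vector.
x = np.array([2.0, -1.0, 4.0])
assert np.allclose(A @ x, L(x))
```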
Theorem 7.3.14. Matrix representation of a linear transformation.
Let \(L\colon \R^n\to\R^m\) be a linear transformation, and let \(A\) be the matrix formed by letting the \(k\)-th column be the entries of \(L(\vec e_k)\text{.}\) Then
\begin{equation*} L(\vec x)=A\vec x \text{ for all } \vec x \text{ in } \R^n\text{.} \end{equation*}
Proof.
If \(\vec x=(x_1,x_2,\ldots,x_n)\text{,}\) then Theorem 7.3.8 gives
\begin{equation*} L(\vec x)=L(x_1\vec e_1+x_2\vec e_2+\cdots+x_n\vec e_n) =x_1L(\vec e_1)+x_2L(\vec e_2)+\cdots+x_nL(\vec e_n)\text{.} \end{equation*}
We can compute this in a different way: the equation (7.3.1) says that \(A\vec x\) is the same linear combination of the columns of \(A\text{,}\)
\begin{equation*} A\vec x=x_1L(\vec e_1)+x_2L(\vec e_2)+\cdots+x_nL(\vec e_n)\text{,} \end{equation*}
and so \(L(\vec x)=A\vec x\text{.}\)
Subsection 7.3.4 Examples of matrix representations
We next look at the matrix representations for the examples given in List 7.3.1; a numerical spot-check of all four matrices follows the list.
- The matrix representation of a rotation through an angle \(\theta\) is determined by its values on the standard basis \(\{\vec e_1,\vec e_2\}\text{.}\)
Figure 7.3.15. Rotation of \(\{\vec e_1,\vec e_2\}\) through an angle \(\theta\).
Since \(L(\vec e_1)=(\cos\theta,\sin\theta)\) and \(L(\vec e_2)=(-\sin\theta,\cos\theta)\text{,}\) the matrix representation is
\begin{equation*} \begin{bmatrix} \cos\theta\amp -\sin\theta\\ \sin\theta \amp \cos\theta \end{bmatrix}\text{.} \end{equation*}
- The matrix representation of the reflection by the line \(y=x\) turns out to be particularly easy.
Figure 7.3.16. Reflection of \(\{\vec e_1,\vec e_2\}\) by the line \(y=x\).
Since \(L(\vec e_1)=\vec e_2\) and \(L(\vec e_2)=\vec e_1\text{,}\) the matrix representation is
\begin{equation*} \begin{bmatrix} 0\amp 1\\ 1\amp0 \end{bmatrix}\text{.} \end{equation*}
- The matrix representation of the projection onto the line \(y=x\) is easy to see.
Figure 7.3.17. Projection of \(\{\vec e_1,\vec e_2\}\) onto the line \(y=x\).
In this case \(L(\vec e_1)=L(\vec e_2)=\frac12(1,1)\text{,}\) and so the matrix representation is
\begin{equation*} \frac12\begin{bmatrix} 1\amp 1\\ 1\amp1 \end{bmatrix}\text{.} \end{equation*}
- Figure 7.3.18. Dilation of \(\{\vec e_1,\vec e_2\}\) for \(L(\vec x)=s\vec x\).
Since \(L(\vec e_1)=s\vec e_1\) and \(L(\vec e_2)=s\vec e_2\text{,}\) the matrix representation of \(L(\vec x)=s\vec x\) is
\begin{equation*} \begin{bmatrix} s\amp 0\\ 0\amp s \end{bmatrix}\text{.} \end{equation*}
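All four matrices above can be spot-checked numerically. The sketch below (with arbitrary sample values for \(\theta\) and \(s\)) verifies where each matrix sends \(\vec e_1\) and \(\vec e_2\text{:}\)

```python
import numpy as np

theta, s = np.pi / 5, 2.5   # arbitrary sample angle and scale factor

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

rotation   = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
projection = 0.5 * np.array([[1.0, 1.0],
                             [1.0, 1.0]])
dilation   = s * np.eye(2)

# Each matrix sends e1 and e2 where the figures say it should.
assert np.allclose(rotation @ e1, [np.cos(theta), np.sin(theta)])
assert np.allclose(reflection @ e1, e2) and np.allclose(reflection @ e2, e1)
assert np.allclose(projection @ e1, [0.5, 0.5])
assert np.allclose(dilation @ e1, s * e1)

# The projection is idempotent: projecting twice is the same as once.
assert np.allclose(projection @ projection, projection)
```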
Checkpoint 7.3.19.
Give the matrix representation of the linear transformation defined as a rotation by an angle \(\theta\) clockwise.
A rotation clockwise by \(\theta\) is the same as a rotation counterclockwise by \(-\theta\text{,}\) and so the representation is
\begin{equation*} \begin{bmatrix} \cos\theta\amp \sin\theta\\ -\sin\theta \amp \cos\theta \end{bmatrix}\text{.} \end{equation*}
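As a check (with an arbitrary sample angle), the clockwise matrix is the transpose of the counterclockwise one, and the two rotations undo each other:

```python
import numpy as np

theta = np.pi / 6   # an arbitrary sample angle

def rotation(t):
    # Counterclockwise rotation by t.
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

# Clockwise rotation by theta is counterclockwise rotation by -theta,
# which is the transpose (and inverse) of the counterclockwise matrix.
clockwise = rotation(-theta)
assert np.allclose(clockwise, rotation(theta).T)
assert np.allclose(clockwise @ rotation(theta), np.eye(2))
```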