Section 5.10 Bases and coordinates

In this section we generalize our understanding of the coordinates of a vector. We start with a little observation: consider the vector \(\vec v=(1,2,3)\) in \(\R^3\text{,}\) and the standard basis \(\{\vec e_1, \vec e_2, \vec e_3\}=\{(1,0,0),(0,1,0), (0,0,1)\}\text{.}\) Then

\begin{equation*} \vec v=(1,2,3)=1(1,0,0)+2(0,1,0)+3(0,0,1)=1\vec e_1+2\vec e_2+3\vec e_3 \text{.} \end{equation*}

Now suppose we take a different basis for \(\R^3\text{:}\) \(B=\{(0,1,1),(1,0,1),(1,1,0)\}\text{.}\) Then, by Theorem 5.9.16, there is a unique choice of \(r_1, r_2, r_3\) so that \(\vec v= (1,2,3)=r_1(0,1,1)+r_2(1,0,1)+r_3(1,1,0)\text{.}\) The values are determined by the three equations in three unknowns:

\begin{gather*} 1=r_2+r_3\\ 2=r_1+r_3\\ 3=r_1+r_2 \end{gather*}

The unique solution is \((r_1,r_2,r_3)=(2,1,0)\text{.}\) We then call this triple the coordinates of \(\vec v\) with respect to the basis \(B\).
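The computation above can be checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic and Gauss-Jordan elimination; the function name `coordinates` is our own choice for illustration and does not appear elsewhere in the text.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve r_1*basis[0] + ... + r_n*basis[n-1] = v for (r_1, ..., r_n)."""
    n = len(basis)
    # Row i of the augmented matrix [A | v]; the basis vectors are the columns of A.
    rows = [[Fraction(x[i]) for x in basis] + [Fraction(v[i])] for i in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot entry is 1.
        p = rows[col][col]
        rows[col] = [entry / p for entry in rows[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # The last column now holds the unique solution.
    return [rows[i][n] for i in range(n)]

B = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
coords = coordinates(B, (1, 2, 3))
print(coords)  # the three values r_1, r_2, r_3 as Fractions
```

Using `Fraction` instead of floating point keeps every step exact, so the result can be compared directly with the hand computation.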

Definition 5.10.1. Coordinates with respect to a basis.

Let \(\vec v\) be a vector in \(\R^n\text{,}\) and let \(B=\{\vec x_1,\vec x_2,\ldots,\vec x_n\}\) be a basis. From Theorem 5.9.16, there is exactly one choice of \(r_1,r_2,\ldots,r_n\) so that \(\vec v=r_1\vec x_1+r_2\vec x_2+\cdots+r_n\vec x_n\text{.}\) The coordinates of \(\vec v\) with respect to the basis \(B\) are then \((r_1,r_2,\ldots,r_n)\text{.}\) When it is necessary to emphasize the basis, the notation \((r_1,r_2,\ldots,r_n)_B\) is used.

Observation 5.10.2. Order makes a difference.

If the vectors of a basis are reordered, then the coordinates of any vector with respect to that basis get reordered too. Sometimes the term ordered basis is used to emphasize the importance of order, but usually it is just tacitly understood and causes no problems.

Example 5.10.3. Coordinates with respect to a basis in \(\R^2\).

Consider the basis \(B=\{\vec u_1,\vec u_2\}\) in \(\R^2\) where \(\vec u_1=(2,1)\) and \(\vec u_2=(-1,1)\text{.}\) In addition, consider \(\vec w=(1,5)\text{.}\) We want the coordinates of \(\vec w\) with respect to the basis \(B\text{.}\) It follows easily from the equation \(\vec w=r_1\vec u_1+r_2\vec u_2\) that \(r_1=2\) and \(r_2=3\text{.}\) Hence \(\vec w=(2,3)_B\text{.}\) The figure below shows the geometric interpretation of this equation.

Figure 5.10.4. \(\vec w=2\vec u_1+3\vec u_2\)

The line joining \(\vec u_1\) with the origin contains all scalar multiples of \(\vec u_1\text{.}\) Similarly, the line joining \(\vec u_2\) with the origin contains all scalar multiples of \(\vec u_2\text{.}\) The parallelogram rule is used to add these two scalar multiples together.

The nice part is that this reasoning is reversible. Given any basis \(B=\{\vec u_1,\vec u_2\}\text{,}\) the lines through \(\vec u_1\) and \(\vec u_2\) are considered axes and, for any other vector \(\vec w\text{,}\) lines parallel to these axes through \(\vec w\) can be used to make a parallelogram that determines \(r_1\) and \(r_2\text{.}\)

With the standard basis \(\{\vec e_1,\vec e_2\}\text{,}\) the coordinates of a vector \(\vec w\) are determined by dropping perpendiculars to the \(x\)-axis and to the \(y\)-axis. The parallelogram is actually a rectangle since the basis vectors are orthogonal.
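The coordinates found in Example 5.10.3 can be recomputed with a short sketch. We assume Cramer's rule for the \(2\times 2\) system, and the helper name `coords_2d` is hypothetical. Swapping the two basis vectors also illustrates Observation 5.10.2.

```python
from fractions import Fraction

def coords_2d(u1, u2, w):
    """Coordinates (r1, r2) of w with respect to the basis {u1, u2} of R^2."""
    # Determinant of the matrix with u1 and u2 as columns.
    det = u1[0] * u2[1] - u2[0] * u1[1]
    if det == 0:
        raise ValueError("u1 and u2 are parallel, so they do not form a basis")
    # Cramer's rule: replace one column at a time by w.
    r1 = Fraction(w[0] * u2[1] - u2[0] * w[1], det)
    r2 = Fraction(u1[0] * w[1] - w[0] * u1[1], det)
    return r1, r2

u1, u2, w = (2, 1), (-1, 1), (1, 5)
in_order = coords_2d(u1, u2, w)   # coordinates with respect to {u1, u2}
swapped = coords_2d(u2, u1, w)    # same vectors, basis listed in the other order
print(in_order, swapped)
```

The second call returns the same two numbers in the opposite order, which is exactly the reordering described in Observation 5.10.2.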

It is reasonable to ask why the consideration of different bases is worthwhile. One answer is that it can make some computations much easier. The evaluation of projections from a point to a line in \(\R^2\) is given in Subsection 5.7.2. The following example gives a different, quicker evaluation using bases.

Example 5.10.5. The projection of a point to a line in \(\R^2\) revisited.

Consider the computation of the projection of a point \((x,y)\) onto the line \(y=mx\text{.}\) The strategy is to first use the line as one axis for a basis. To that end we take a nonzero point on the line as the first basis element: \(\vec u_1=(1,m)\text{.}\) Next, we want a second basis element orthogonal to \(\vec u_1\) so that the parallelogram rule will be applied as a rectangle. An easy choice is \(\vec u_2=(-m,1)\) (why?), so the basis is \(B=\{(1,m),(-m,1)\}\text{.}\) The coordinates \((r_1,r_2)\) of \((x,y)\) with respect to \(B\) are found by solving

\begin{equation*} r_1(1,m)+r_2(-m,1)=(x,y)\text{.} \end{equation*}

This is routine and, in particular, gives us \(r_1=\frac1{m^2+1}(x+my)\text{.}\) Note that \(r_1\vec u_1=\frac1{m^2+1}(x+my)(1,m)=\frac1{m^2+1}(x+my,mx+m^2y)\) is our desired point. Note also that we could evaluate \(r_2\text{,}\) but we don't need to because of our choice of basis (we only have to work half as hard!).

Figure 5.10.6. Projection of \((x,y)\) to the line \(y=mx\)
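The projection formula of Example 5.10.5 translates directly into a few lines of code. This is only a sketch: the function name `project_onto_line` is our own, and the inputs are assumed to be integers or `Fraction` values so that the arithmetic stays exact.

```python
from fractions import Fraction

def project_onto_line(x, y, m):
    """Project (x, y) onto the line y = m*x using r1 = (x + m*y) / (m^2 + 1)."""
    r1 = Fraction(x + m * y) / (m * m + 1)
    # The projection is r1 * u1 with u1 = (1, m).
    return (r1, r1 * m)

x, y, m = 3, 4, 2
p = project_onto_line(x, y, m)

# The leftover part (x, y) - p should be orthogonal to u1 = (1, m).
residual = (x - p[0], y - p[1])
dot_with_u1 = residual[0] * 1 + residual[1] * m
print(p, dot_with_u1)
```

The check at the end is the geometric content of the example: after removing the component along \(\vec u_1\text{,}\) what remains is a multiple of \(\vec u_2\) and so has zero dot product with \(\vec u_1\text{.}\)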

Subsection 5.10.1 Bases, coordinates and matrices

Matrix multiplication has a nice role to play for the computation of coordinates. It rests on a straightforward relationship.

Observation 5.10.7. Linear combinations and matrices.

Let \(\vec x_1, \vec x_2,\ldots,\vec x_n\) be vectors in \(\R^n\text{,}\) and let \(A= \begin{bmatrix} \vec x_1 \amp\vec x_2 \amp\cdots \amp\vec x_n \end{bmatrix}\) be the matrix with \(\vec x_1, \vec x_2,\ldots,\vec x_n\) as columns. Then

\begin{equation*} r_1\vec x_1+r_2\vec x_2+\cdots+r_n\vec x_n=\vec w \end{equation*}

if and only if

\begin{equation*} A \begin{bmatrix} r_1\\r_2\\ \vdots \\r_n \end{bmatrix} =\vec w\text{.} \end{equation*}

This is easily verified by evaluating the \(k\)-th entry of \(\vec w\) in each case, for \(k=1,2,\ldots,n\text{.}\)

Now suppose that \(B=\{\vec x_1,\ldots,\vec x_n\}\) is a basis for \(\R^n\text{,}\) and \(A\) is the matrix with \(\vec x_1,\ldots,\vec x_n\) as columns. By Theorem 5.9.12 the matrix \(A\) is nonsingular, and this implies that \(A^{-1}\) exists.

Proposition 5.10.8.

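With \(B\) and \(A\) as above, if \((r_1,r_2,\ldots,r_n)\) are the coordinates of \(\vec w\) with respect to \(B\text{,}\) then

\begin{equation*} \begin{bmatrix} r_1\\r_2\\ \vdots \\r_n \end{bmatrix} = A^{-1}\vec w\text{.} \end{equation*}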

Proof.

Multiply both sides of the matrix equation in Observation 5.10.7 on the left by \(A^{-1}\text{.}\)

Example 5.10.9. The projection of a point to a line in \(\R^2\) revisited again.

We can use this result to revisit Example 5.10.5. The matrix \(A\) of basis column vectors would then satisfy

\begin{equation*} A= \begin{bmatrix} 1\amp -m\\ m \amp 1 \end{bmatrix}\text{.} \end{equation*}

Since \(\det(A)=m^2+1\text{,}\) by Theorem 4.5.3,

\begin{equation*} A^{-1}= \frac1{m^2+1}\begin{bmatrix} 1\amp m\\ -m \amp 1 \end{bmatrix}\text{,} \end{equation*}

and

\begin{equation*} \begin{bmatrix} r_1\\r_2 \end{bmatrix} =A^{-1} \begin{bmatrix} x\\y \end{bmatrix} = \frac1{m^2+1} \begin{bmatrix} x+my\\-mx+y \end{bmatrix}\text{.} \end{equation*}
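As a sanity check on Example 5.10.9, the sketch below multiplies \(A\) by the stated inverse for a few sample slopes and then computes one pair of coordinates. The helper names (`basis_matrix`, `claimed_inverse`, `mat_mul`, `mat_vec`) and the sample values are assumptions chosen for illustration.

```python
from fractions import Fraction

def basis_matrix(m):
    """The matrix A with columns (1, m) and (-m, 1)."""
    return [[1, -m], [m, 1]]

def claimed_inverse(m):
    """The inverse stated in the example: 1/(m^2+1) * [[1, m], [-m, 1]]."""
    scale = Fraction(1, m * m + 1)
    return [[scale, scale * m], [-scale * m, scale]]

def mat_mul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(X, v):
    """Product of a 2x2 matrix with a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

identity = [[1, 0], [0, 1]]
slopes = [0, 1, -2, Fraction(3, 4)]
inverse_checks = [mat_mul(basis_matrix(m), claimed_inverse(m)) == identity
                  for m in slopes]

# Coordinates of (x, y) = (3, 4) for the slope m = 2.
r = mat_vec(claimed_inverse(2), [3, 4])
print(inverse_checks, r)
```

A check on finitely many slopes is of course not a proof, but it catches sign errors in the inverse quickly.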

Example 5.10.10. An ellipse in the plane.

The standard equation for an ellipse in the plane is

\begin{equation*} \frac{x^2}{a^2}+\frac{y^2}{b^2}=1\text{.} \end{equation*}

Clearly the points \((\pm a,0)\) and \((0,\pm b)\) satisfy the equation and are on the ellipse. A typical instance is symmetric about the \(x\)-axis and \(y\)-axis:

Figure 5.10.11. Graph of \(\frac{x^2}{a^2}+\frac{y^2}{b^2}=1\) with \(a=2\) and \(b=1\)

Next we consider the points in the plane satisfying \(x^2-xy+y^2=1\text{.}\) Here is the graph:

Figure 5.10.12. Graph of \(x^2-xy+y^2=1\)

It really looks like an ellipse, but the equation is not in our standard form, because the curve is not appropriately aligned with the \(x\)-axis and \(y\)-axis. What to do? We can change the axes by using a new basis: \(B=\{\vec u_1, \vec u_2\}\) where \(\vec u_1=(1,1)\) and \(\vec u_2=(-1,1)\text{.}\)

Figure 5.10.13. Graph of \(x^2-xy+y^2=1\) with new axes

Now suppose \(\vec w=(u,v)\) is on the curve. Let \((x,y)\) be the coordinates of \(\vec w\) with respect to the basis \(B\text{.}\) Using Observation 5.10.7,

\begin{equation*} \vec w = \begin{bmatrix} u\\v \end{bmatrix} = A \begin{bmatrix} x\\y \end{bmatrix} = \begin{bmatrix} 1\amp-1\\ 1\amp 1 \end{bmatrix} \begin{bmatrix} x\\y \end{bmatrix} = \begin{bmatrix} x-y\\x+y \end{bmatrix} \end{equation*}

and, since \((u,v)\) is on the curve,

\begin{equation*} 1=u^2-uv+v^2=(x-y)^2 -(x-y)(x+y) + (x+y)^2=x^2+3y^2\text{.} \end{equation*}

Hence \(\frac{x^2}{a^2} + \frac{y^2}{b^2}=1\) where \(a=1\) and \(b=\frac1{\sqrt3}\text{,}\) and the curve is indeed an ellipse.
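The substitution in Example 5.10.10 can be spot-checked numerically. The sketch below is an illustration only: it tests the algebraic identity on a small integer grid and confirms that a few hand-picked rational points with \(x^2+3y^2=1\) land on the original curve. The names `to_standard` and `on_curve` are our own.

```python
from fractions import Fraction

def to_standard(x, y):
    """Standard coordinates (u, v) of the vector with B-coordinates (x, y),
    where B = {(1, 1), (-1, 1)}."""
    return (x - y, x + y)

def on_curve(u, v):
    """Test whether (u, v) satisfies u^2 - u*v + v^2 = 1."""
    return u * u - u * v + v * v == 1

# Rational points satisfying x^2 + 3*y^2 = 1 in B-coordinates.
half = Fraction(1, 2)
samples = [(1, 0), (-1, 0), (half, half), (half, -half), (-half, half)]
all_on_curve = all(on_curve(*to_standard(x, y)) for x, y in samples)

# The identity used in the example, checked on an integer grid.
identity_holds = all(
    (x - y) ** 2 - (x - y) * (x + y) + (x + y) ** 2 == x * x + 3 * y * y
    for x in range(-5, 6) for y in range(-5, 6)
)
print(all_on_curve, identity_holds)
```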

Subsection 5.10.2 Change of basis

Suppose we have two bases \(B_1=\{\vec x_1,\ldots,\vec x_n\}\) and \(B_2=\{\vec y_1,\ldots,\vec y_n\}\) and also \(\vec w \text{,}\) all in \(\R^n\text{.}\) If we know the coordinates of \(\vec w\) with respect to \(B_1\text{,}\) can we find the coordinates of \(\vec w\) with respect to \(B_2\) easily? If we use the right matrices, the answer is yes.

Proposition 5.10.14. Change of basis and matrix multiplication.

Suppose that \(B_1\) and \(B_2\) are bases of \(\R^n\text{.}\) Then there is a matrix \(C\) with the following property: If \(\vec w\) is any vector in \(\R^n\text{,}\) and \((r_1,r_2,\ldots,r_n)\) and \((s_1,s_2,\ldots,s_n)\) are the coordinates of \(\vec w\) with respect to the bases \(B_1\) and \(B_2\text{,}\) then

\begin{equation*} \begin{bmatrix} s_1\\ s_2\\ \vdots \\s_n \end{bmatrix} = C \begin{bmatrix} r_1\\ r_2\\ \vdots \\r_n \end{bmatrix}\text{.} \end{equation*}

Proof.

By Observation 5.10.7, the matrices \(A_1\) and \(A_2\) whose columns are the vectors of \(B_1\) and \(B_2\text{,}\) respectively, satisfy

\begin{equation*} A_1 \begin{bmatrix} r_1\\r_2\\ \vdots \\r_n \end{bmatrix} = \vec w = A_2 \begin{bmatrix} s_1\\s_2\\ \vdots \\s_n \end{bmatrix}\text{.} \end{equation*}

Since \(B_2\) is a basis, \(A_2\) is nonsingular by Theorem 5.9.12, so \(A_2^{-1}\) exists. Setting \(C=A_2^{-1}A_1\text{,}\) we have

\begin{equation*} C \begin{bmatrix} r_1\\ r_2\\ \vdots \\r_n \end{bmatrix} =A_2^{-1}A_1 \begin{bmatrix} r_1\\ r_2\\ \vdots \\r_n \end{bmatrix} =A_2^{-1}\vec w = \begin{bmatrix} s_1\\ s_2\\ \vdots \\s_n \end{bmatrix}\text{.} \end{equation*}
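To see the proof in action, here is a small sketch for \(n=2\text{.}\) As an assumption for illustration, we take \(B_1=\{(2,1),(-1,1)\}\) from Example 5.10.3 and \(B_2=\{(1,1),(-1,1)\}\) from Example 5.10.10, and convert the coordinates \((2,3)_{B_1}\) of \(\vec w=(1,5)\text{.}\) The helper names are hypothetical.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mat_mul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(X, v):
    """Product of a 2x2 matrix with a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

A1 = [[2, -1], [1, 1]]   # columns are the vectors of B1
A2 = [[1, -1], [1, 1]]   # columns are the vectors of B2
C = mat_mul(inverse_2x2(A2), A1)

r = [2, 3]               # coordinates of w = (1, 5) with respect to B1
s = mat_vec(C, r)        # coordinates of w with respect to B2
w_from_s = mat_vec(A2, s)
print(C, s, w_from_s)
```

Rebuilding \(\vec w\) from the new coordinates, as in the last line, is a convenient way to confirm that \(C\) was formed in the right order.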

Theorem 5.10.15. Change of basis theorem.

Suppose that \(B_1=\{\vec x_1, \vec x_2, \ldots, \vec x_n\}\) and \(B_2=\{\vec y_1, \vec y_2, \ldots, \vec y_n\}\) are bases of \(\R^n\text{,}\) that \(\vec w\) is a vector in \(\R^n\text{,}\) and that \((r_1,r_2,\ldots,r_n)\) and \((s_1,s_2,\ldots,s_n)\) are the coordinates of \(\vec w\) with respect to the bases \(B_1\) and \(B_2\text{.}\) In addition, let \((c_{1,j}, c_{2,j},\ldots, c_{n,j})\) be the coordinates of \(\vec x_j\) with respect to \(B_2\text{,}\) that is,

\begin{align*} \vec x_j=c_{1,j} \vec y_1+c_{2,j}\vec y_2+ \cdots + c_{n,j}\vec y_n \amp\amp \text{ for } 1\leq j\leq n\text{.} \end{align*}

Then, for \(C= \begin{bmatrix} c_{i,j} \end{bmatrix} \text{,}\)

\begin{equation*} \begin{bmatrix} s_1\\ s_2\\ \vdots \\s_n \end{bmatrix} = C \begin{bmatrix}r_1\\r_2\\ \vdots\\r_n\end{bmatrix}\text{.} \end{equation*}

Proof.

Using

\begin{gather*} \vec x_1 = c_{1,1}\vec y_1+c_{2,1}\vec y_2 +\cdots+ c_{n,1}\vec y_n\\ \vec x_2 = c_{1,2}\vec y_1+c_{2,2}\vec y_2 +\cdots+ c_{n,2}\vec y_n\\ \vdots\\ \vec x_n = c_{1,n}\vec y_1+c_{2,n}\vec y_2 +\cdots+ c_{n,n}\vec y_n \end{gather*}

and

\begin{equation*} \vec w = r_1\vec x_1+r_2\vec x_2+\cdots+r_n\vec x_n = s_1\vec y_1+s_2\vec y_2+\cdots+s_n\vec y_n\text{,} \end{equation*}

we see that

\begin{align*} s_1\vec y_1+s_2\vec y_2+\cdots+s_n\vec y_n \amp = r_1\vec x_1+r_2\vec x_2+\cdots+r_n\vec x_n \\ \amp = r_1(c_{1,1}\vec y_1+c_{2,1}\vec y_2 +\cdots+ c_{n,1}\vec y_n) \\ \amp \phantom{===} +r_2(c_{1,2}\vec y_1+c_{2,2}\vec y_2 +\cdots+ c_{n,2}\vec y_n) \\ \amp\phantom{===|} \vdots\\ \amp \phantom{===} +r_n(c_{1,n}\vec y_1+c_{2,n}\vec y_2 +\cdots+ c_{n,n}\vec y_n) \\ \amp = (r_1 c_{1,1}+r_2 c_{1,2}+\cdots+r_n c_{1,n})\vec y_1 \\ \amp \phantom{===} + (r_1 c_{2,1}+r_2 c_{2,2}+\cdots+r_n c_{2,n})\vec y_2 \\ \amp\phantom{===|} \vdots\\ \amp \phantom{===} + (r_1 c_{n,1}+r_2 c_{n,2}+\cdots+r_n c_{n,n})\vec y_n \text{.} \end{align*}

Using Proposition 5.9.4, we have

\begin{gather*} s_1=r_1 c_{1,1}+r_2 c_{1,2}+\cdots+r_n c_{1,n}\\ s_2=r_1 c_{2,1}+r_2 c_{2,2}+\cdots+r_n c_{2,n}\\ \vdots\\ s_n=r_1 c_{n,1}+r_2 c_{n,2}+\cdots+r_n c_{n,n}\text{,} \end{gather*}

which is identical to

\begin{equation*} \begin{bmatrix} s_1\\ s_2\\ \vdots \\s_n \end{bmatrix} = C \begin{bmatrix}r_1\\r_2\\ \vdots\\r_n\end{bmatrix}\text{.} \end{equation*}

Observation 5.10.16.

Notice that Theorem 5.10.15 gives an algorithm for constructing the desired matrix \(C\text{.}\) For each \(k=1,2,\ldots,n\text{,}\) let \(\vec z_k\) be the coordinates of \(\vec x_k\) with respect to \(B_2\text{.}\) Using column vectors, let \(C= \begin{bmatrix} \vec z_1\amp\vec z_2\amp\cdots\amp\vec z_n \end{bmatrix} \text{.}\) Then

\begin{equation*} \begin{bmatrix} s_1\\ s_2\\ \vdots \\s_n \end{bmatrix} = C \begin{bmatrix}r_1\\r_2\\ \vdots\\r_n\end{bmatrix}\text{.} \end{equation*}
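The algorithm in this observation can be sketched in code. One way to find all the columns \(\vec z_k\) at once, assumed here, is to row reduce the augmented matrix \([A_2 \mid A_1]\) to \([I \mid C]\text{,}\) where \(A_1\) and \(A_2\) have the vectors of \(B_1\) and \(B_2\) as columns. The function names and the sample basis \(B_2\) are our own choices for illustration.

```python
from fractions import Fraction

def change_of_basis(B1, B2):
    """Return C (as a list of rows) whose k-th column holds the coordinates
    of B1[k] with respect to B2."""
    n = len(B2)
    # Row i of [A2 | A1]; the basis vectors are the columns of A2 and A1.
    rows = [[Fraction(y[i]) for y in B2] + [Fraction(x[i]) for x in B1]
            for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("B2 is not a basis")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [entry / p for entry in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # The left block is now the identity; the right block is C.
    return [row[n:] for row in rows]

def mat_vec(C, r):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(c * x for c, x in zip(row, r)) for row in C]

B1 = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
B2 = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]   # a second basis, chosen for illustration
C = change_of_basis(B1, B2)

r = [2, 1, 0]        # coordinates of (1, 2, 3) with respect to B1
s = mat_vec(C, r)    # coordinates of (1, 2, 3) with respect to B2
print(C, s)
```

Reducing \([A_2 \mid A_1]\) solves the \(n\) systems for \(\vec z_1,\ldots,\vec z_n\) simultaneously, which avoids repeating the same elimination for each column.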

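Also, notice that Proposition 5.10.8 follows from the special case of Theorem 5.10.15 where \(B_1=\{\vec x_1,\ldots,\vec x_n\}\) and \(B_2\) is the standard basis. In that case the coordinates of \(\vec x_j\) with respect to \(B_2\) are the entries of \(\vec x_j\) itself, so \(C=A\text{,}\) and the coordinates of \(\vec w\) with respect to \(B_2\) are the entries of \(\vec w\text{.}\) The theorem then says that \(\vec w\) equals \(A\) times the column of coordinates \(r_1,\ldots,r_n\text{,}\) and multiplying on the left by \(A^{-1}\) gives Proposition 5.10.8.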