Section 3.5 The transpose and trace of a matrix

Definition 3.5.1. The transpose of a matrix.

The transpose of \(A\) is the matrix \(A^T\) derived by making the first row of \(A\) the first column of \(A^T\text{,}\) the second row of \(A\) the second column of \(A^T\text{,}\) and so on. In other words, taking a transpose interchanges the rows and columns. Equivalently, the subscripts are interchanged: if \(A^T=B=[b_{i,j}]\text{,}\) then \(b_{i,j}=a_{j,i}\text{.}\)

Example 3.5.2. Matrix transposes.
  • Let \(A=\begin{bmatrix} 1\amp2\amp3\\4\amp5\amp6\\7\amp8\amp9 \end{bmatrix}.\) Then \(A^T=\begin{bmatrix} 1\amp4\amp7\\2\amp5\amp8\\3\amp6\amp9 \end{bmatrix}.\)

  • The identity matrix \(I_n\) of order \(n\) has all diagonal entries equal to one and all other entries equal to zero: \(I_n=\begin{bmatrix} 1 \amp 0 \amp 0 \amp \cdots \amp 0\\ 0 \amp 1 \amp 0 \amp \cdots \amp 0\\ 0 \amp 0 \amp 1 \amp \cdots \amp 0\\ \amp \amp \amp \ddots \amp \\ 0 \amp 0 \amp 0 \amp \cdots \amp 1 \end{bmatrix}_n.\) Since reflecting across the main diagonal leaves every entry in place, \(I_n^T=I_n.\)
  • The transpose is defined for nonsquare matrices, too. Let \(A=\begin{bmatrix} 1 \amp 2 \amp 3 \amp 4 \\ 5 \amp 6 \amp 7 \amp 8 \\ 9 \amp10\amp11\amp12 \end{bmatrix}.\) Then \(A^T=\begin{bmatrix} 1 \amp 5 \amp9\\ 2 \amp 6 \amp 10\\ 3 \amp 7 \amp 11\\ 4 \amp 8 \amp 12 \end{bmatrix}.\)
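The rule \(b_{i,j}=a_{j,i}\) translates directly into code. The following is a minimal sketch in plain Python, assuming a matrix is stored as a list of rows; the function name `transpose` is chosen here for illustration and is not part of the text.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    rows, cols = len(a), len(a[0])
    # Entry (i, j) of the result is entry (j, i) of the input,
    # so an m x n input produces an n x m output.
    return [[a[j][i] for j in range(rows)] for i in range(cols)]


# The nonsquare matrix from the example above (3 x 4).
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

At = transpose(A)
print(At)  # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
```

Note that the indices are 0-based in Python, whereas the subscripts in the definition start at 1.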

Here is an animation that illustrates that \(A^T\) may be derived from \(A\) by reflection across the main diagonal.

Figure 3.5.3.
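The transpose satisfies the following identities whenever the sums and products involved are defined; each is verified by comparing \(i\)-\(j\) entries.
  • The \(i\)-\(j\) entry of \(A^T\) is \(a_{j,i}\text{,}\) obtained by interchanging the subscripts of \(A\text{.}\) Taking the transpose again interchanges the subscripts again, and so the \(i\)-\(j\) entry of \((A^T)^T\) is \(a_{i,j}\text{,}\) the same as for \(A\text{.}\) Hence \((A^T)^T=A\text{.}\)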

  • The \(i\)-\(j\) entry on both sides of the equation \((A+B)^T=A^T+B^T\) is \(a_{j,i}+b_{j,i}\text{.}\)

  • The \(i\)-\(j\) entry on both sides of the equation \((rA)^T=rA^T\) is \(ra_{j,i}\text{.}\)

  • The \(i\)-\(j\) entry of \((AB)^T\) is the \(j\)-\(i\) entry of \(AB,\) which in turn is

    \begin{equation*} a_{j,1}b_{1,i} +a_{j,2}b_{2,i}+\cdots +a_{j,n}b_{n,i}. \end{equation*}

    The \(i\)-\(j\) entry of \(B^TA^T\) is \((B^T)_{i,1}(A^T)_{1,j} + (B^T)_{i,2}(A^T)_{2,j} +\cdots + (B^T)_{i,n}(A^T)_{n,j} = b_{1,i}a_{j,1} + b_{2,i}a_{j,2} +\cdots + b_{n,i}a_{j,n}\text{,}\) which is the previous sum with the factors of each term reordered. Hence both sides of the equation \((AB)^T=B^TA^T\) have the same \(i\)-\(j\) entry, and so the matrices are equal.

\begin{align*} ((AB)^T)_{i,j} \amp = (AB)_{j,i}\\ \amp = \sum_{k=1}^n a_{j,k}b_{k,i}\\ \amp = \sum_{k=1}^n b_{k,i}a_{j,k}\\ \amp = \sum_{k=1}^n (B^T)_{i,k}(A^T)_{k,j}\\ \amp = (B^T A^T)_{i,j} \end{align*}
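
The entrywise arguments above can be spot-checked numerically. The sketch below uses plain Python with matrices stored as lists of rows; the helper names `transpose`, `matmul`, `add`, and `scale` and the sample matrices are assumptions made for illustration. A check on one example is not a proof, but it does make the reversal of order in \((AB)^T=B^TA^T\) concrete.

```python
def transpose(a):
    # Rows of the input become columns of the output.
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    # (AB)[i][j] = sum over k of a[i][k] * b[k][j]
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    # Entrywise sum of two matrices of the same size.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(r, a):
    # Multiply every entry of a by the scalar r.
    return [[r * x for x in row] for row in a]


A = [[1, 2, 3], [4, 5, 6]]       # 2 x 3
C = [[0, 1, -1], [2, 2, 5]]      # 2 x 3, same size as A
B = [[1, 0], [2, -1], [0, 3]]    # 3 x 2, so AB is defined

checks = {
    "(A^T)^T = A": transpose(transpose(A)) == A,
    "(A+C)^T = A^T + C^T": transpose(add(A, C)) == add(transpose(A), transpose(C)),
    "(rA)^T = r A^T": transpose(scale(3, A)) == scale(3, transpose(A)),
    "(AB)^T = B^T A^T": transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)),
}
for name, ok in checks.items():
    print(name, ok)
```

In this example \(A^TB^T\) is a \(3\times 3\) matrix while \((AB)^T\) is \(2\times 2\text{,}\) so the order of the factors on the right-hand side cannot be left unchanged.
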
Definition 3.5.5. Trace of a square matrix.

The trace of a square matrix \(A\) of size \(n\) is the sum of the diagonal elements, that is,

\begin{equation*} \tr A=a_{1,1}+a_{2,2}+\cdots + a_{n,n}=\sum_{i=1}^n a_{i,i} \end{equation*}
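
As a small illustration of the definition, the following plain-Python sketch sums the diagonal entries of a square matrix stored as a list of rows. The function name `trace` and the choice to raise `ValueError` for a nonsquare input are assumptions made here, not part of the text.

```python
def trace(a):
    """Return the sum of the diagonal entries of a square matrix."""
    n = len(a)
    # The trace is only defined when every row has exactly n entries.
    if any(len(row) != n for row in a):
        raise ValueError("trace is only defined for square matrices")
    return sum(a[i][i] for i in range(n))


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(trace(A))  # 15, since 1 + 5 + 9 = 15
```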

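Suppose that \(A\) is \(m\times n\) and \(B\) is \(n\times m\) (so that both \(AB\) and \(BA\) are defined and square: \(AB\) is \(m\times m\) and \(BA\) is \(n\times n\)). To evaluate both \(\tr (AB)\) and \(\tr (BA)\text{,}\) we consider the following rectangular array of numbers, whose entry in row \(i\) and column \(j\) is \(a_{i,j}b_{j,i}\text{:}\)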

\begin{equation*} \begin{matrix} \amp\amp\amp\amp\amp\text{Row sums}\\ a_{1,1}b_{1,1}\amp a_{1,2}b_{2,1}\amp a_{1,3}b_{3,1}\amp\cdots\amp a_{1,n}b_{n,1} \amp\gets (AB)_{1,1}\\ a_{2,1}b_{1,2}\amp a_{2,2}b_{2,2}\amp a_{2,3}b_{3,2}\amp\cdots\amp a_{2,n}b_{n,2} \amp\gets(AB)_{2,2}\\ \vdots\amp\vdots\amp\vdots\amp\amp\vdots\amp\vdots\\ a_{m,1}b_{1,m}\amp a_{m,2}b_{2,m}\amp a_{m,3}b_{3,m}\amp\cdots\amp a_{m,n}b_{n,m} \amp\gets (AB)_{m,m}\\ \\ \uparrow\amp\uparrow\amp\uparrow\amp\amp\uparrow\\ \llap{\text{Column sums:}\quad} (BA)_{1,1}\amp(BA)_{2,2}\amp(BA)_{3,3}\amp\amp(BA)_{n,n} \end{matrix} \end{equation*}

We are going to compute the sum of all the elements of this array in two ways. Since both computations give the same total, the two answers must be equal.

First we find the sum of all the entries of the array by adding row-wise. Observe that the sum of the elements in the first row is \((AB)_{1,1}\text{,}\) the sum of those in the second row is \((AB)_{2,2}\text{,}\) and so on until the sum of the elements in the last row is \((AB)_{m,m}\text{.}\) Hence the sum of all of the elements in the array is \((AB)_{1,1}+(AB)_{2,2}+\cdots+(AB)_{m,m}=\tr (AB)\text{.}\)

Now we add columnwise. Notice that the first column sums to

\begin{align*} a_{1,1}b_{1,1}\amp +a_{2,1}b_{1,2}+\cdots+a_{m,1}b_{1,m}\\ \amp = b_{1,1}a_{1,1}+b_{1,2}a_{2,1}+\cdots+b_{1,m}a_{m,1}\\ \amp =(BA)_{1,1}, \end{align*}

the second column sums to \((BA)_{2,2}\text{,}\) and so on until the last column sums to \((BA)_{n,n}\text{.}\) Hence the sum of all of the elements in the array is \((BA)_{1,1}+(BA)_{2,2}+\cdots+(BA)_{n,n}=\tr (BA)\text{.}\) Equating the two evaluations gives \(\tr (AB)=\tr (BA)\text{.}\)
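
The two-way counting argument can be traced on a concrete pair of matrices. The sketch below, in plain Python with matrices as lists of rows, builds the array of products \(a_{i,j}b_{j,i}\) and sums it by rows and by columns; the sample matrices and the names `matmul`, `products`, `row_sums`, and `col_sums` are assumptions chosen for illustration.

```python
def matmul(a, b):
    # (AB)[i][j] = sum over k of a[i][k] * b[k][j]
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2, 3], [4, 5, 6]]         # m x n with m = 2, n = 3
B = [[7, 8], [9, 10], [11, 12]]    # n x m
m, n = len(A), len(A[0])

# Entry (i, j) of the array is a[i][j] * b[j][i].
products = [[A[i][j] * B[j][i] for j in range(n)] for i in range(m)]

row_sums = [sum(row) for row in products]
col_sums = [sum(products[i][j] for i in range(m)) for j in range(n)]

AB, BA = matmul(A, B), matmul(B, A)

print(row_sums, [AB[i][i] for i in range(m)])  # [58, 154] [58, 154]
print(col_sums, [BA[j][j] for j in range(n)])  # [39, 68, 105] [39, 68, 105]
print(sum(row_sums), sum(col_sums))            # 212 212
```

Here \(AB\) is \(2\times 2\) and \(BA\) is \(3\times 3\text{,}\) so the two products are not even the same size, yet their traces agree.
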

Suppose that \(A\) is \(m\times n\text{.}\) Since both \(AB\) and \(BA\) are defined, conformability forces \(B\) to be \(n\times m\text{.}\) Then

\begin{align*} \tr (AB) \amp = \sum_{i=1}^m(AB)_{i,i} \\ \amp = \sum_{i=1}^m \sum_{j=1}^n a_{i,j}b_{j,i}\\ \amp = \sum_{i=1}^m \sum_{j=1}^n b_{j,i}a_{i,j}\\ \amp = \sum_{j=1}^n\sum_{i=1}^m b_{j,i}a_{i,j}\\ \amp = \sum_{j=1}^n (BA)_{j,j}\\ \amp = \tr (BA) \end{align*}