The Gaussian elimination algorithm

Section 2.5 The Gaussian elimination algorithm

Subsection 2.5.1 Some matrices whose associated system of equations are easy to solve

The elementary row operations allow us to change matrices and their associated system of linear equations without changing the solutions of those equations. The goal is to end up with matrices which make these common solutions obvious. Here are some examples.

The augmented matrix

\begin{equation*} \left[\begin{array}{cccc|c} 1\amp 0\amp 0\amp 0\amp 1\\ 0\amp 1\amp 0\amp 0\amp 2\\ 0\amp 0\amp 1\amp 0\amp 5\\ 0\amp 0\amp 0\amp 1\amp -1 \end{array}\right] \end{equation*}

corresponds to the system of linear equations

\begin{equation*} \begin{array}{cccccc} x_1\amp \amp \amp \amp =\amp 1\\ \amp x_2\amp \amp \amp =\amp 2\\ \amp \amp x_3\amp \amp =\amp 5\\ \amp \amp \amp x_4\amp =\amp -1 \end{array} \end{equation*}

The equations themselves actually describe the unique solution! Notice the structure of the coefficient matrix that makes this possible. There is only one nonzero entry in each row, its value is 1, and as you proceed down through from one row to the next, the nonzero entry moves one column to the right.
The augmented matrix

\begin{equation*} \left[\begin{array}{cc|c} 1\amp 1\amp 4\\ 0\amp 2\amp 6 \end{array}\right] \end{equation*}

corresponds to the system of linear equations

\begin{equation*} \begin{array}{rcc} x_1+x_2\amp =\amp 4\\ 2x_2\amp =\amp 6 \end{array} \end{equation*}

The last row is easy to solve: we get \(x_2=3\text{.}\) Using this value, it is also easy to solve \(x_1+x_2=x_1+3=4\text{,}\) or \(x_1=1\text{.}\)
The augmented matrix

\begin{equation*} \left[\begin{array}{ccc|c} 1\amp 1\amp 1\amp 4\\ 0\amp 0\amp 2\amp 6 \end{array}\right] \end{equation*}

corresponds to the system of linear equations

\begin{equation*} \begin{array}{rcc} x_1+x_2+x_3\amp =\amp 4\\ 2x_3\amp =\amp 6 \end{array} \end{equation*}

As before, we get \(x_3=3\text{.}\) We still have two variables undefined: we assign a parameter to the second one: \(x_2=t\text{.}\) Using this value, we have \(x_1+x_2+x_3=x_1+t+3=4\text{,}\) or \(x_1=1-t\text{.}\) We now know the solutions: \(x_1=1-t\text{,}\) \(x_2=t\text{,}\) \(x_3=3\) where \(t\) can be any real number. In fact we can check this result with the first equation: \(x_1+x_2+x_3=(1-t) + t + 3=4\text{.}\) The compact way of writing this solution is \((x_1,x_2,x_3)=(1-t,t,3)\text{.}\)

It's clear from the examples given that having lots of zeros in the coefficient matrix is helpful for computing the solutions. It's also clear that if the first nonzero entry in a row is one, then the computation easier. Our plan is to use elementary row operations to change a given coefficient matrix into one with these properties, and then to describe all the solutions. Here are some observations that will help us:

If a column has some nonzero entry, we can always make the top entry nonzero by interchanging rows, if necessary using \(R_1\leftrightarrow R_i\) for some \(i>1\text{.}\)
If the first nonzero entry of a row \(R_i\) is \(\lambda\text{,}\) we can turn it into a 1 by using \(R_i\leftarrow \tfrac 1\lambda R_i\text{.}\)
If two rows \(R_i\) and \(R_j\) have nonzero entries in some column \(k\text{,}\) we can turn the \(j,k\) entry into a zero using \(R_j \leftarrow R_j - \frac{a_{j,k}}{a_{i,k}} R_i\text{.}\)

Subsection 2.5.2 Row echelon form

We now want to define a general class of matrices whose corresponding system of linear equations have solutions that are easy to find. These matrices have a special pattern of zeros and ones, and are said to be in row echelon form.

Figure 2.5.1. A matrix in row echelon form

The matrix above gives an idea of what we want. Notice the staircase line drawn through the matrix has all entries below it equal to zero. The entries marked with a \(*\) can take on any value.

The first nonzero entry in a row (if there is one) is called the leading entry. If it equals \(1\text{,}\) then it is called a leading one.

Definition 2.5.2. Row echelon form.

A matrix is in row echelon form if

Every leading entry is a leading one.
Every entry below a leading one is \(0\text{.}\)
As you go down the matrix, the leading ones move to the right.
Any all zero rows are at the bottom.

Checkpoint 2.5.3.

Which of the following matrices are in row echelon form?

\(\displaystyle \begin{bmatrix} 1\amp2\amp3\amp4\\ 0\amp5\amp6\amp7\\ 0\amp0\amp8\amp9\\ 0\amp0\amp0\amp10 \end{bmatrix}\)
\(\displaystyle \begin{bmatrix} 1\amp2\amp3\amp4\\ 0\amp1\amp2\amp3\\ 0\amp0\amp0\amp0\\ 0\amp0\amp0\amp1 \end{bmatrix}\)
\(\displaystyle \begin{bmatrix} 1\amp2\amp3\amp4\\ 0\amp1\amp2\amp3\\ 0\amp0\amp0\amp1\\ 0\amp0\amp0\amp0 \end{bmatrix}\)
\(\displaystyle \begin{bmatrix} 1\amp2\amp3\amp4\\ 0\amp1\amp6\amp7\\ 0\amp0\amp1\amp1\\ 0\amp0\amp1\amp1 \end{bmatrix}\)
\(\displaystyle \begin{bmatrix} 1\amp0\amp0\amp0\\ 0\amp0\amp0\amp0\\ 0\amp0\amp0\amp0\\ 0\amp0\amp0\amp0 \end{bmatrix}\)

Solution

Not in row echelon form because not every leading entry is a \(1\text{.}\)
Not in row echelon form because the zero row is not at the bottom.
It is row echelon form.
Not row echelon form because a leading entry has a nonzero entry below it.
It is row echelon form.

Subsection 2.5.3 The Gaussian elimination algorithm

The plan is now start with the augmented matrix and, by using a sequence of elementary row operations, change it to a new matrix where it is easy to identify the solutions of the associated system of linear equations. Since any elementary row operation leaves the solutions unchanged, the solutions to the final system of linear equations will be identical to the solutions of the original one.

We work on the columns of the matrix from left to right and change the matrix in the following way:

Start with the first column. If it has all entries equal to zero, move on to the next column to the right.
If the column has nonzero entries, interchange rows, if necessary, to get a nonzero entry on top.
Change the top entry, if necessary, to make it a \(1\text{.}\)
For any nonzero entry below the top one, use an elementary row operation to change it to zero.
Now consider the part of the matrix below the top row and to the right of the column under consideration: if there are no such rows or columns, stop since the procedure is finished.Otherwise, carry out the same procedure on the new matrix.

Here is a first example:

\begin{equation*} \begin{array}{r} 3x_1-2x_2-x_3+x_4=1\\ 6x_1-8x_2+x_3+2x_4=4 \end{array} \end{equation*}

has augmented matrix

\begin{equation*} \begin{bmatrix} 3 \amp -2 \amp -1 \amp 1 \amp 1 \\ 6 \amp -8 \amp 1 \amp 2 \amp 4 \end{bmatrix}. \end{equation*}

We don't need to exchange rows to make the top entry of the first column nonzero, so we proceed to make the top entry \(1\) using the elementary row operation \(R_1\gets \frac13R_1\text{.}\) The matrix becomes

\begin{equation*} \begin{bmatrix} 1 \amp -\frac23 \amp -\frac13 \amp \frac13 \amp \frac13 \\ 6 \amp -8 \amp 1 \amp 2 \amp 4 \end{bmatrix}. \end{equation*}

Now we must make all entries below the top one in the column equal to zero. There is, of course, only one such entry, and so, using \(R_2\leftarrow R_2-6R_1\text{,}\) we get

\begin{equation*} \begin{bmatrix} 1 \amp -\frac23 \amp -\frac13 \amp \frac13 \amp \frac13 \\ 0 \amp -4 \amp 3 \amp 0 \amp 2 \end{bmatrix}\text{.} \end{equation*}

We're now done with the first column, so we continue with the same procedure on the matrix obtained by deleting the first row and first column:

\begin{equation*} \begin{bmatrix} -4 \amp 3 \amp 0 \amp 2 \end{bmatrix}. \end{equation*}

Since there is only one row, we need only change the top entry to \(1\) using division by \(-4\text{,}\) that is, \(R_1\gets -\frac14R_1\text{.}\) We then get

\begin{equation*} \begin{bmatrix} 1 \amp -\frac34 \amp 0 \amp -\frac12 \end{bmatrix}. \end{equation*}

and, putting it back into the original matrix, we get

\begin{equation*} \begin{bmatrix} 1 \amp -\frac23 \amp -\frac13 \amp \frac13 \amp \frac13 \\ 0 \amp 1 \amp -\frac34 \amp 0 \amp -\frac12 \end{bmatrix} \end{equation*}

The matrix is in row echelon form. Now we can determine all of the solutions to the original system of linear equations. The first nonzero entry in the first row is in the first column, the column associated with \(x_1\text{.}\) The first nonzero entry in the second row, similarly, is associated with \(x_2\text{.}\) We assign parameters to the other variables: \(x_3=s\) and \(x_4=t\text{.}\) The second row then tells us that \(x_2-\frac34s =-\frac12 \text{,}\) or \(x_2=\frac34s-\frac12\text{.}\) Now that we know \(x_2\text{,}\) we can use the first row to find \(x_1\text{:}\) we get \(x_1-\frac23x_2-\frac13s+\frac13t=\frac13\text{.}\) We substitute in our known value for \(x_2\) into this equation, and after some simplification, we get \(x_1=\tfrac56s-\tfrac13t\text{.}\) In summary, we have: All solutions to

\begin{equation*} \begin{array}{r} 3x_1-2x_2-x_3+x_4=1\\ 6x_1-8x_2+x_3+2x_4=4 \end{array} \end{equation*}

are of the form

\begin{equation*} \begin{array}{rcl} x_1\amp =\amp \tfrac56s-\tfrac13t\\ x_2\amp =\amp -\tfrac12 +\tfrac34 s\\ x_3\amp =\amp s\\ x_4\amp =\amp t \end{array} \end{equation*}

where \(s\) and \(t\) are any real numbers. More compactly, we write this as \((x_1,x_2,x_3,x_4)= \tfrac56s-\tfrac13t,-\tfrac12 +\tfrac34 s,s,t)\text{.}\) In other words, for any assignment of real numbers to \(s\) and \(t\text{,}\) we get a solution to the system of linear equations.

It is easy (and worthwhile) to check that substituting \(x_1,\dots,x_4\) into the two equations does indeed give a solution.

Now consider another example with the equations

\begin{alignat*}{3} x_1 \amp {}+2x_2 \amp{}-2x_3 \amp{}= \amp{}-1\\ 3x_1 \amp {}- 2x_2 \amp{}-4x_3 \amp{}= \amp{}3\\ 4x_1 \amp \amp{}-2x_3 \amp{}= \amp{}-2\\ -x_1 \amp {}-x_2 \amp{}+2x_3 \amp{}= \amp{}0 \end{alignat*}

and its corresponding augmented matrix as it is changed by Gaussian elimination:

\begin{equation*} \begin{bmatrix} 1\amp 2\amp -2\amp -1 \\ 3\amp -2\amp -4\amp 3\\ 4\amp 0\amp -2\amp -2\\ -1\amp -1\amp 2\amp 0 \end{bmatrix} \begin{array}{c} \\R_2 \leftarrow R_2-3R_1\\ R_3\gets R_3-4R_1\\ R_4\gets R_4+R_1 \end{array} \end{equation*}

\begin{equation*} \begin{bmatrix} 1\amp 2\amp -2\amp -1 \\ 0\amp -8\amp 2\amp 6\\ 0\amp -8\amp 6\amp 2\\ 0\amp 1\amp 0\amp -1 \end{bmatrix} \begin{array}{c} R_2\gets-\frac18 R_2\\ R_3\gets R_3+8R_2\\R_4\gets R_4-R_2 \end{array} \end{equation*}

\begin{equation*} \begin{bmatrix} 1\amp 2\amp -2\amp -1 \\ 0\amp 1\amp -\frac14\amp -\frac34\\ 0\amp 0\amp 4\amp -4\\ 0\amp 0\amp \tfrac14 \amp -\tfrac14 \end{bmatrix} \begin{array}{c} R_3 \gets \frac14 R_3\\R_4\gets R_4-\frac14 R_3 \end{array} \end{equation*}

\begin{equation*} \begin{bmatrix} 1\amp 2\amp -2\amp -1 \\ 0\amp 1\amp -\frac14\amp -\frac34\\ 0\amp 0\amp 1\amp -1\\ 0\amp 0\amp 0\amp 0 \end{bmatrix} \end{equation*}

The last row, for any choice of \(x_1, x_2, x_3\text{,}\) reduces to \(0=0,\) so any solution of the associated first three equations will also be a solution to the last one. In other words, the last row of the matrix has no effect on the solution set and may be dropped from the matrix. The third row gives \(x_3=-1.\) The second row gives \(x_2=-1\) and the first row gives \(x_1=-1.\) Hence there is one solution: \((x_1,x_2,x_3)=(-1,-1,-1).\)

Now change the equations from the last example very slightly (the right-hand side of the last equation is changed from \(0\) to \(1\)):

The Gaussian elimination is almost identical as

\begin{equation*} \begin{bmatrix} 1\amp 2\amp -2\amp -1 \\ 3\amp -2\amp -4\amp 3\\ 4\amp 0\amp -2\amp -2\\ -1\amp -1\amp 2\amp 1 \end{bmatrix} \end{equation*}

is reduced to

\begin{equation*} \begin{bmatrix} 1\amp 2\amp -2\amp -1 \\ 0\amp 1\amp -\frac14\amp -\frac34\\ 0\amp 0\amp 1\amp -1\\ 0\amp 0\amp 0\amp 1 \end{bmatrix} \end{equation*}

Now the last row says \(0x_1+0x_2+0x_3=1\text{,}\) which, for any choice of \(x_1\text{,}\) \(x_2\) and \(x_3\text{,}\) reduces to \(0=1\) and is never true. This means the original system of equations have no solutions, that is, the system is inconsistent.

We can make a useful observation here: If a row of the augmented matrix is of the form

\begin{equation*} \begin{bmatrix} 0 \amp 0 \amp 0 \amp \cdots \amp 0 \amp * \end{bmatrix} \end{equation*}

where \(*\) is either zero or nonzero, then one of two things happens:

\(*=0\) in which case the row may be dropped from the matrix
\(*\not=0\) in which case there is no solution.

Example 2.5.4.

Consider the system of linear equations:

\begin{equation*} \begin{array}{rcl} x+y+z\amp =\amp 1\\ 2x+y+z \amp =\amp 2\\ 3x+ay+bz\amp =\amp c \end{array} \end{equation*}

We wish to know the values of \(a,\) \(b\) and \(c\) for which there are there no solutions, one solution or more than one solution. To solve this problem, we apply Gaussian elimination to the augmented matrix:

\begin{equation*} \begin{bmatrix} 1\amp 1\amp 1\amp 1\\ 2\amp 1\amp 1\amp 2\\ 3\amp a\amp b\amp c \end{bmatrix} \begin{matrix}R_2\gets R_2-2R_1\\ R_3\gets R_3-3 R_1\end{matrix} \end{equation*}

\begin{equation*} \begin{bmatrix} 1\amp 1\amp 1\amp 1\\0\amp -1\amp -1\amp 0\\0\amp a-3\amp b-3\amp c-3 \end{bmatrix} \begin{matrix}R_2\gets -R_2\\R_3\gets R_3-(a-3)R_2\end{matrix} \end{equation*}

\begin{equation*} \begin{bmatrix} 1\amp 1\amp 1\amp 1\\0\amp 1\amp 1\amp 0\\0\amp 0\amp b-a\amp c-3 \end{bmatrix} \end{equation*}

An analysis of the last row tells us everything: If \(b-a\not=0,\) then there is exactly one solution. If \(b-a=0,\) and \(c-3\not=0,\) then there are no solutions. Otherwise (when \(b=a\) and \(c=3\)) there are an infinite number of solutions.

Exercises Exercises

1.

Find all solutions to the system of equations

\begin{align*} x+2y-z \amp=2\\ x+y-z\amp=0\\ 2x-y+z\amp=3 \end{align*}

Solution

We put the augmented matrix into row echelon form:

\begin{equation*} \begin{bmatrix} 1 \amp 2 \amp -1 \amp 2\\ 1 \amp 1 \amp -1 \amp 0\\ 2 \amp -1 \amp 1 \amp 3\\ \end{bmatrix} \\ \begin{array}{l} R_2\gets R_2 - R_1\\ R_3\gets R_3-2R_1 \end{array} \\ \begin{bmatrix} 1 \amp 2 \amp -1 \amp 2\\ 0 \amp -1 \amp 0 \amp -2\\ 0 \amp -5 \amp 3 \amp -1\\ \end{bmatrix} \\ \begin{array}{l} R_2\gets -R_2\\ \end{array} \\ \begin{bmatrix} 1 \amp 2 \amp -1 \amp 2\\ 0 \amp 1 \amp 0 \amp 2\\ 0 \amp -5 \amp 3 \amp -1\\ \end{bmatrix} \\ R_3\gets R_3+5R_2 \\ \begin{bmatrix} 1 \amp 2 \amp -1 \amp 2\\ 0 \amp 1 \amp 0 \amp 2\\ 0 \amp 0 \amp 3 \amp 9\\ \end{bmatrix} \\ R_3\gets \frac13 R_3 \\ \begin{bmatrix} 1 \amp 2 \amp -1 \amp 2\\ 0 \amp 1 \amp 0 \amp 2\\ 0 \amp 0 \amp 1 \amp 3\\ \end{bmatrix} \end{equation*}

The last row gives \(z=3\text{.}\) The second row gives \(y=2\text{.}\) The first row gives \(x=2-2y+z=2-4+3=1\text{,}\) and so there is a unique solution \((x,y,z)=(1,2,3)\text{.}\)