
Section 7.1 Linear Systems and Gaussian Elimination

The same elementary row operations that we used for a linear system work for a matrix as well. Here they are, written in terms of rows:
  • Multiply any row by a nonzero constant: \(c R_{j} \rightarrow R_{j}\)
  • Interchange (swap) any two rows: \(R_{i} \leftrightarrow R_{j}\)
  • Multiply any row by a constant and add it to another row: \(c R_{i} + R_{j} \rightarrow R_{j}\text{.}\)
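As a concrete illustration, the three operations can be sketched in Python (the function names are illustrative, and rows are indexed from 0 rather than 1):

```python
def scale_row(A, j, c):
    """c R_j -> R_j (c nonzero)."""
    A[j] = [c * x for x in A[j]]

def swap_rows(A, i, j):
    """R_i <-> R_j."""
    A[i], A[j] = A[j], A[i]

def add_multiple(A, i, j, c):
    """c R_i + R_j -> R_j."""
    A[j] = [c * a + b for a, b in zip(A[i], A[j])]

A = [[1, 2], [3, 4]]
add_multiple(A, 0, 1, -3)   # eliminate the first entry of row 1
print(A)  # [[1, 2], [0, -2]]
```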

Definition 7.1.1.

An upper triangular matrix is a matrix \(A\) with the property that \(a_{i,j}= 0\) if \(i>j\text{.}\)
That is, the matrix \(A\) contains only zeros below the main diagonal.

Definition 7.1.2.

Gaussian Elimination is the process of using the row operations above to transform a matrix into an upper triangular matrix.

Definition 7.1.3.

An augmented coefficient matrix is a matrix of the coefficients from a linear system together with the right-hand side. In general, the linear system
\begin{equation*} \begin{aligned}a_{1,1}x_{1} + a_{1,2}x_{2} + \cdots + a_{1,n}x_{n}\amp = b_{1} \\ a_{2,1}x_{1} + a_{2,2}x_{2} + \cdots + a_{2,n}x_{n}\amp = b_{2} \\ \vdots \qquad \qquad \qquad\amp \quad \vdots \\ a_{m,1}x_{1} + a_{m,2}x_{2} + \cdots + a_{m,n}x_{n}\amp = b_{m} \\\end{aligned} \end{equation*}
can be written as the matrix:
\begin{equation*} \begin{aligned}\begin{array}{rrrr|r} a_{1,1} \amp a_{1,2} \amp \cdots \amp a_{1,n} \amp b_1 \\ a_{2,1} \amp a_{2,2} \amp \cdots \amp a_{2,n} \amp b_2 \\ \vdots \amp \amp \ddots \amp \amp \vdots \\ a_{m,1} \amp a_{m,2} \amp \cdots \amp a_{m,n} \amp b_m\end{array}\end{aligned} \end{equation*}

Example 7.1.4.

Use Gaussian elimination and Back Substitution to solve:
\begin{equation*} \begin{aligned}x - 2y + z\amp = -1 \\ 2x + y + 2z\amp = 3 \\ -x -y +3 z\amp = -6\end{aligned} \end{equation*}
Solution.
The system can be written as an augmented coefficient matrix:
\begin{equation*} \begin{bmatrix}1 \amp -2 \amp 1 \amp -1 \\ 2 \amp 1 \amp 2 \amp 3 \\ -1 \amp -1 \amp 3 \amp -6\end{bmatrix} \end{equation*}
\begin{equation*} \begin{aligned}\amp \qquad \begin{bmatrix}1 \amp -2 \amp 1 \amp -1 \\ 2 \amp 1 \amp 2 \amp 3 \\ -1 \amp -1 \amp 3 \amp -6\end{bmatrix} \\ \underrightarrow{-2 R_1 + R_2 \rightarrow R_2, \quad R_1 + R_3 \rightarrow R_3}\amp \qquad \begin{bmatrix}1 \amp -2 \amp 1 \amp -1 \\ 0 \amp 5 \amp 0 \amp 5 \\ 0 \amp -3 \amp 4 \amp -7\end{bmatrix} \\ \underrightarrow{(1/5)R_2 \rightarrow R_2 }\amp \qquad \begin{bmatrix}1 \amp -2 \amp 1 \amp -1 \\ 0 \amp 1 \amp 0 \amp 1 \\ 0 \amp -3 \amp 4 \amp -7\end{bmatrix} \\ \underrightarrow{3R_2 + R_3 \rightarrow R_3 }\amp \qquad \begin{bmatrix}1 \amp -2 \amp 1 \amp -1 \\ 0 \amp 1 \amp 0 \amp 1 \\ 0 \amp 0 \amp 4 \amp -4\end{bmatrix} \\ \underrightarrow{(1/4) R_3 \rightarrow R_3}\amp \qquad \begin{bmatrix}1 \amp -2 \amp 1 \amp -1 \\ 0 \amp 1 \amp 0 \amp 1 \\ 0 \amp 0 \amp 1 \amp -1\end{bmatrix}\end{aligned} \end{equation*}
Now we perform back substitution, which results in the three linear equations:
\begin{equation*} \begin{aligned}z\amp = -1 \\ y\amp = 0 z + 1 \\ x\amp = 2 y - z -1 = 2\end{aligned} \end{equation*}
and therefore the solution is \((2,1,-1)\text{.}\)
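The elimination and back substitution steps above can be sketched as a short program. This is a minimal sketch that assumes no zero pivots arise; exact arithmetic via `Fraction` avoids any round-off:

```python
from fractions import Fraction

def solve(aug):
    """Gaussian elimination (no pivoting) followed by back substitution
    on an n x (n+1) augmented matrix, assuming all pivots are nonzero."""
    A = [[Fraction(x) for x in row] for row in aug]
    n = len(A)
    for k in range(n):                      # eliminate below pivot A[k][k]
        for j in range(k + 1, n):
            c = -A[j][k] / A[k][k]          # c R_k + R_j -> R_j
            A[j] = [c * a + b for a, b in zip(A[k], A[j])]
    x = [Fraction(0)] * n
    for k in range(n - 1, -1, -1):          # back substitution
        s = sum(A[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (A[k][n] - s) / A[k][k]
    return x

# The system of Example 7.1.4:
aug = [[1, -2, 1, -1],
       [2, 1, 2, 3],
       [-1, -1, 3, -6]]
print(solve(aug))  # [Fraction(2, 1), Fraction(1, 1), Fraction(-1, 1)]
```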

Subsection 7.1.1 Operation Counts of Gaussian Elimination

When solving systems of equations for realistic problems, hundreds or thousands of variables are often needed, resulting in very large matrices. There are two big issues with such large matrices: the storage they require and the speed at which such systems can be solved.
First, we look at the speed of solving a linear system with \(m\) rows (equations) and \(n\) variables using Gaussian Elimination. Let’s also assume that \(n \gt m\text{.}\) In this case, we will examine solving a system in which no pivoting or row swaps are needed.
  1. First, consider the number of operations of the form:
    \begin{equation*} \begin{aligned}k R_{1} + R_{j} \to R_{j}\end{aligned} \end{equation*}
    for \(j=2, \ldots, m\) needed to eliminate the element in row \(j\text{,}\) column 1.
    In total, these operations take \(n(m-1)\) multiplications and \(n(m-1)\) additions.
  2. The result is now an \(m-1\) by \(n-1\) matrix, so repeating this process all the way down the matrix requires
    \begin{equation*} \begin{aligned}\text{nops}= n(m-1) + (n-1)(m-2) + (n-2)(m-3) + \cdots + (n-m+2)(1)\end{aligned} \end{equation*}
    operations. Let \(n = m + p\) (that is, there are \(p\) more columns than rows). Then
    \begin{equation*} \begin{aligned}\text{nops}\amp = (m+p)(m-1) + (m+p-1)(m-2) + \cdots + (p+2)(1) \\\amp = \bigl( m(m-1) + (m-1)(m-2) + \cdots + (2)(1) \bigr) \\\amp \qquad \qquad + \bigl( p(m-1) + p(m-2) + \cdots + p \bigr) \\\amp = \sum_{i=1}^{m} \bigl( i(i-1) + p(i-1) \bigr) \\\amp = \sum_{i=1}^{m} \bigl( i^{2} + (p-1)i - p \bigr) \\\amp = \frac{m(m+1)(2m+1)}{6}+ (p-1) \frac{m(m+1)}{2}- mp \\\amp = O(m^{3})\end{aligned} \end{equation*}
  3. For back substitution, recall that if the matrix is in upper triangular form, then
    \begin{equation*} \begin{aligned}x_{k}\amp = \frac{1}{\tilde{a}_{k,k}}\bigl(\tilde{b}_{k} - \tilde{a}_{k,k+1}x_{k+1}- \cdots - \tilde{a}_{k,n}x_{n}\bigr)\end{aligned} \end{equation*}
    and each step requires \(n-k\) multiplications. Since there are \(m\) steps of back substitution, there are
    \begin{equation*} \begin{aligned}\sum_{k=1}^{m} (n-k)\amp = \sum_{k=1}^{m} (m+p-k) \\\amp = m^{2} + mp - \frac{m(m+1)}{2}= O(m^{2})\end{aligned} \end{equation*}
    multiplications for back substitution.
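The operation count above can be checked empirically. The sketch below tallies one multiplication per matrix entry touched at each elimination stage, matching the \(n(m-1) + (n-1)(m-2) + \cdots\) estimate:

```python
def elimination_mults(m, n):
    """Count multiplications used by forward elimination on an m x n
    matrix: stage k updates m - 1 - k rows, each of length n - k."""
    total = 0
    for k in range(m - 1):
        total += (n - k) * (m - 1 - k)
    return total

# Doubling the number of equations multiplies the work by about
# 2^3 = 8, consistent with the O(m^3) estimate:
print(elimination_mults(200, 200) / elimination_mults(100, 100))
```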

Subsection 7.1.2 Comparison to Cramer’s Rule

Although the number of operations needed to solve linear systems seems large, comparing it to another standard method of solution, Cramer’s rule, shows that Gaussian Elimination is quite efficient.
Recall that Cramer’s Rule for solving the linear system \(\mathbf{Ax}=\mathbf{b}\) for a square matrix \(\mathbf{A}\) is that
\begin{equation*} \begin{aligned}x_{j}\amp = \frac{1}{|\mathbf{A}|}\begin{vmatrix}a_{1,1}\amp a_{1,2}\amp \cdots\amp b_{1}\amp \cdots\amp a_{1,n}\\ a_{2,1}\amp a_{2,2}\amp \cdots\amp b_{2}\amp \cdots\amp a_{2,n}\\ \vdots\amp \amp \vdots\amp \amp \vdots \\ a_{n,1}\amp a_{n,2}\amp \cdots\amp b_{n}\amp \cdots\amp a_{n,n}\\\end{vmatrix}\end{aligned} \end{equation*}
where the vector \(\mathbf{b}\) has been placed in the determinant in the \(j\)th column.
To determine the number of operations for Cramer’s Rule, first, we need to know the number of operations to find a determinant. If \(\mathbf{A}\) is a 2 by 2 matrix, then there are \(2\) multiplications and 1 addition. For other determinants, note that the cofactor expansion method is often used, which is
\begin{equation*} \begin{aligned}|A|\amp = a_{1,1}M_{1,1}- a_{1,2}M_{1,2}+ a_{1,3}M_{1,3}- \cdots + (-1)^{n+1} a_{1,n}M_{1,n}\\\amp = \sum_{k=1}^{n} (-1)^{k+1} a_{1,k}M_{1,k}\end{aligned} \end{equation*}
where \(M_{i,j}\) is the minor (that is, the determinant of the matrix with the \(i\)th row and \(j\)th column removed). If we let \(g(n)\) be the number of multiplications needed for the determinant, then since each of the \(n\) minors is a determinant of size \(n-1\text{,}\)
\begin{equation*} \begin{aligned}g(n)\amp = n \cdot g(n-1)\end{aligned} \end{equation*}
and we know that \(g(1) = 1\text{.}\) This is the recursion that defines the factorial, so \(g(n)=n!\text{.}\)
The number of operations for Cramer’s rule on an \(n\) by \(n\) matrix can be found by noting that \(n+1\) determinants are needed. Thus Cramer’s rule requires about \((n+1)\,n!=(n+1)!\) multiplications, which is much larger than \(O(n^{3})\text{.}\)
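To see how lopsided this comparison is, a quick sketch comparing \(n^{3}\) with \((n+1)!\):

```python
from math import factorial

# Multiplication counts: roughly n^3 for Gaussian elimination versus
# (n+1)! for Cramer's rule via cofactor expansion.
for n in (5, 10, 15, 20):
    print(n, n ** 3, factorial(n + 1))
# Already at n = 20: 8000 multiplications versus about 5.1 * 10^19.
```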

Subsection 7.1.3 Pivoting Strategies

First, let’s look at solving a simple 3 by 3 linear system as an example we’ll use below.

Example 7.1.5.

Solve
\begin{equation*} \begin{aligned}\frac{2}{3}x_{1} + \frac{2}{7}x_{2} + \frac{1}{5}x_{3}\amp = \frac{43}{15}\\ \frac{1}{3}x_{1} + \frac{1}{7}x_{2} - \frac{1}{2}x_{3}\amp = \frac{5}{6}\\ \frac{1}{5}x_{1} - \frac{3}{7}x_{2} + \frac{2}{5}x_{3}\amp = -\frac{12}{5}\end{aligned} \end{equation*}
using exact arithmetic.
Solution.
There are many approaches to doing this; in this case we’ll first clear the fractions so that the row operations can be done on integers. Multiplying the first row by 105, the second by 42, and the third by 35 gives an integer system, and Gaussian elimination with back substitution (in exact arithmetic) yields
\begin{equation*} \begin{aligned}x_{1}\amp = 1\amp x_{2}\amp = 7\amp x_{3}\amp = 1\end{aligned} \end{equation*}
If instead we use floating point arithmetic to 4 digits, we write the system as
\begin{equation*} \begin{aligned}\begin{bmatrix}0.6667\amp 0.2857\amp 0.2000\amp 2.867 \\ 0.3333\amp 0.1429\amp -0.5000\amp 0.8333 \\ 0.2000\amp -0.4286\amp 0.4000\amp -2.400\end{bmatrix}\end{aligned} \end{equation*}
Performing the operations \(-0.3333/0.6667 \, R_{1} + R_{2} \rightarrow R_{2}\) and \(-0.2000/0.6667 \, R_{1} + R_{3} \rightarrow R_{3}\) yields
\begin{equation*} \begin{aligned}\begin{bmatrix}0.6667\amp 0.2857\amp 0.2000\amp 2.867 \\ 0.0000\amp 0.0001\amp - 0.6000\amp -0.5997 \\ 0.0000\amp -0.5143\amp 0.3400\amp -3.260\end{bmatrix}\end{aligned} \end{equation*}
To eliminate the element in the 3rd row, 2nd column, we perform \(0.5143/0.0001 \, R_{2} + R_{3} \rightarrow R_{3}\) and get:
\begin{equation*} \begin{aligned}\begin{bmatrix}0.6667\amp 0.2857\amp 0.2000\amp 2.867 \\ 0.0000\amp 0.0001\amp - 0.6000\amp -0.5997 \\ 0.0000\amp 0.0000\amp -3086\amp -3087\end{bmatrix}\end{aligned} \end{equation*}
Next, performing back substitution, we get:
\begin{equation*} \begin{aligned}x_{3}\amp = \frac{-3087}{-3086}= 1.000 \\ x_{2}\amp = \frac{-0.5997 +0.6000(1.000)}{0.0001}= 3.000 \\ x_{1}\amp = \frac{2.867 -0.2000(1.000) -0.2857(3.000)}{0.6667}= 2.715\end{aligned} \end{equation*}
Since the actual solution is \((1,7,1)\text{,}\) the approximate values are off by large relative margins.
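Where did the damage occur? The spurious pivot can be reproduced by simulating 4-significant-digit rounding in Python; the helper `fl` below is an illustrative stand-in for the 4-digit arithmetic used above:

```python
from math import floor, log10

def fl(x, d=4):
    """Round x to d significant digits -- an illustrative stand-in
    for 4-digit floating point arithmetic."""
    if x == 0:
        return 0.0
    e = floor(log10(abs(x)))
    return round(x, d - 1 - e)

# The (2,2) entry after the first elimination stage is exactly
# 1/7 - (1/3)/(2/3) * (2/7) = 0, but in 4-digit arithmetic:
m = fl(-0.3333 / 0.6667)           # multiplier, rounds to -0.4999
a22 = fl(0.1429 + fl(m * 0.2857))  # 0.1429 - 0.1428 = 0.0001
print(a22)
# Dividing by this spurious pivot (0.5143/0.0001 = 5143) is what
# amplifies the rounding error in the elimination above.
```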

Subsection 7.1.4 Partial Pivoting

One method to eliminate this error is called partial pivoting. Much of the round-off error above was due to the operation \(0.5143/0.0001 \, R_{2} + R_{3} \rightarrow R_{3}\text{:}\) the tiny pivot \(0.0001\) produces an enormous multiplier that swamps the other entries of the row. With partial pivoting, at each pivoting step we swap rows to bring the largest value (in absolute value) in the pivot column into the pivot position.
In the example above, the first pivot was fine, as 0.6667 was the largest value in its column; however, look at the next step:
\begin{equation*} \begin{aligned}\begin{bmatrix}0.6667\amp 0.2857\amp 0.2000\amp 2.867 \\ 0.0000\amp 0.0001\amp - 0.6000\amp -0.5997 \\ 0.0000\amp -0.5143\amp 0.3400\amp -3.260\end{bmatrix}\end{aligned} \end{equation*}
At this step, let’s swap the 2nd and 3rd rows to get:
\begin{equation*} \begin{aligned}\begin{bmatrix}0.6667\amp 0.2857\amp 0.2000\amp 2.867 \\ 0.0000\amp -0.5143\amp 0.3400\amp -3.260 \\ 0.0000\amp 0.0001\amp - 0.6000\amp -0.5997 \\\end{bmatrix}\end{aligned} \end{equation*}
And then perform the operation \(0.0001/0.5143 R_{2} + R_{3} \rightarrow R_{3}\) to get:
\begin{equation*} \begin{aligned}\begin{bmatrix}0.6667\amp 0.2857\amp 0.2000\amp 2.867 \\ 0.0000\amp -0.5143\amp 0.3400\amp -3.260 \\ 0.0000\amp 2 \times 10^{-8}\amp -0.5999\amp -0.6003 \\\end{bmatrix}\end{aligned} \end{equation*}
and finally back substitution yields:
\begin{equation*} \begin{aligned}x_{3}\amp = \frac{-0.6003}{-0.5999}= 1.001 \\ x_{2}\amp = \frac{-3.260 -0.3400(1.001)}{-0.5143}= 7.000 \\ x_{1}\amp = \frac{2.867-0.2(1.001)-0.2857(7.000)}{0.6667}= 1.000\end{aligned} \end{equation*}
which is exact for \(x_{1}\) and \(x_{2}\text{,}\) and only off by 0.1% for \(x_{3}\text{.}\)
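Partial pivoting as described can be sketched as follows. In double precision the round-off is invisible, so this illustrates the algorithm rather than the 4-digit behavior:

```python
def solve_pp(aug):
    """Gaussian elimination with partial pivoting, then back
    substitution, on an n x (n+1) augmented matrix.  A sketch: at
    each stage the row with the largest entry (in absolute value)
    in the pivot column is swapped up."""
    A = [row[:] for row in aug]
    n = len(A)
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))  # pivot row
        A[k], A[p] = A[p], A[k]                           # R_k <-> R_p
        for j in range(k + 1, n):
            c = -A[j][k] / A[k][k]                        # c R_k + R_j -> R_j
            A[j] = [c * a + b for a, b in zip(A[k], A[j])]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):                        # back substitution
        s = sum(A[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (A[k][n] - s) / A[k][k]
    return x

# The system of Example 7.1.5; in double precision the computed
# solution is (1, 7, 1) essentially to machine accuracy.
aug = [[2/3, 2/7, 1/5, 43/15],
       [1/3, 1/7, -1/2, 5/6],
       [1/5, -3/7, 2/5, -12/5]]
print(solve_pp(aug))
```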
In practice it is expensive to physically swap rows in memory. Instead, we typically keep a vector of row indices, swap entries of that vector, and access the rows of the matrix through it.

Subsection 7.1.5 Scaled Partial Pivoting

Partial pivoting alone is not always enough when the rows of a system have very different scales. Consider the system
\begin{equation*} \begin{aligned}0.7 x_{1} + 1725 x_{2}\amp = 1739 \\ 0.4352 x_{1} -5.433 x_{2}\amp = 3.271\end{aligned} \end{equation*}
which has the solution \(x_{1}=20\) and \(x_{2} = 1\text{.}\)
Using partial pivoting, the largest element in the first column is in row 1 (since \(0.7 \gt 0.4352\)), so we pivot on that element; in 4-digit arithmetic this gives:
\begin{equation*} \begin{aligned}0.7x_{1} +1725 x_{2}\amp = 1739 \\ -1077x_{2}\amp = -1078\end{aligned} \end{equation*}
so \(x_{2} = 1.001\) and \(x_{1} = 17.14\text{.}\)
The reason partial pivoting didn’t work is that the absolute size of a pivot candidate is not what matters by itself. Instead, we should look at its size relative to the other entries in its row.
Since \(0.4352/5.433 \gt 0.7/1725\text{,}\) we should pivot about the entry in the 2nd row, first column, to get:
\begin{equation*} \begin{aligned}1734 x_{2}\amp = 1734 \\ 0.4352 x_{1} -5.433 x_{2}\amp = 3.271\end{aligned} \end{equation*}
which results in \(x_{2}=1.000\) and \(x_{1} = 20.00\text{,}\) the exact result to 4 significant digits.
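This strategy can be sketched in code as well. Here the scale of each row is taken to be its largest coefficient in absolute value, and pivot rows are chosen by the ratio of the candidate entry to that scale:

```python
def solve_spp(aug):
    """Gaussian elimination with scaled partial pivoting: the pivot
    row is chosen by the size of the candidate entry relative to the
    largest coefficient in its own row.  A sketch, not production code."""
    A = [row[:] for row in aug]
    n = len(A)
    s = [max(abs(a) for a in row[:n]) for row in A]    # row scale factors
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]) / s[i])
        A[k], A[p] = A[p], A[k]                        # swap rows ...
        s[k], s[p] = s[p], s[k]                        # ... and their scales
        for j in range(k + 1, n):
            c = -A[j][k] / A[k][k]
            A[j] = [c * a + b for a, b in zip(A[k], A[j])]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):                     # back substitution
        t = sum(A[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (A[k][n] - t) / A[k][k]
    return x

# The 2 x 2 system above: scaled partial pivoting selects row 2 first
# and recovers x1 = 20, x2 = 1.
print(solve_spp([[0.7, 1725, 1739],
                 [0.4352, -5.433, 3.271]]))
```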