
Section 3.5 Other Rootfinding Methods

In Section 3.4, we saw Newton’s method. Although it is not a perfect rootfinding algorithm, Newton’s method has a number of strengths, including its relative simplicity and quadratic convergence. One drawback, however, is that it requires computing the derivative. In this section we will see the secant method, which uses a secant line instead of a tangent line to estimate the root.
We will also examine hybrid methods that blend the methods we have seen, such as combining the speed of Newton’s method with the bisection method, which guarantees a root.

Subsection 3.5.1 The Secant Method

Recall that in Section 3.4, we used the tangent line to the function \(f(x)\) to estimate the root. The secant method proceeds the same way but uses a secant line instead. Figure 3.5.1 shows the secant line to a function \(y=f(x)\) through the two points \((x_1, f(x_1))\) and \((x_2, f(x_2))\text{.}\) The zero of the secant line is found and labelled \(x_3\text{,}\) and this process is applied recursively.
Figure 3.5.1.
Recall that a secant line through the points \((x_1, f(x_1))\) and \((x_2, f(x_2))\) can be written
\begin{equation*} y = f(x_2) + \frac{f(x_2)-f(x_1)}{x_2-x_1} (x - x_2) \end{equation*}
and the zero of this line is
\begin{equation*} x = x_2 - f(x_2) \frac{x_2-x_1}{f(x_2)-f(x_1)} \end{equation*}
This value is then used as the next approximation to the root, or
\begin{equation*} x_3 = x_2 - f(x_2) \frac{x_2-x_1}{f(x_2)-f(x_1)} \end{equation*}
If this same formula is applied recursively, then we have
\begin{equation} x_{n+1} = x_n - f(x_n) \frac{x_n-x_{n-1}}{f(x_n)-f(x_{n-1})}\tag{3.5.1} \end{equation}
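To make the iteration concrete, the recurrence (3.5.1) can be sketched in code as follows (the function name, tolerance, and stopping rule here are our own choices, not prescribed by the text):

```python
import math

def secant(f, x0, x1, tol=1e-14, max_iter=50):
    """Approximate a root of f using the secant recurrence (3.5.1)."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:  # guard against a zero divided difference
            return x1
        # x_{n+1} = x_n - f(x_n) (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1}))
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    return x1

# Root of f(x) = cos(x) - x with the starting values used in Example 3.5.3
print(secant(lambda x: math.cos(x) - x, 1.5, 1.0))  # ≈ 0.7390851332151607
```
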
The next example shows the detailed steps to use the secant method.

Example 3.5.3.

Use the secant method to find the intersection point of \(y=\cos x\) and \(y=x\text{,}\) that is, find the root of \(f(x) = \cos x - x\text{.}\) This is the same function as in Example 3.4.3. Use \(x_0=1.5\) and \(x_1 = 1\text{.}\)
Solution.
Table 3.5.4. Convergence of the Secant Method
\(n\) \(x_n\) \(f(x_n)\)
0 \(1.50000000000000000000\) \(-1.429262798332\)
1 \(1.00000000000000000000\) \(-4.596976941319\times 10^{-01}\)
2 \(0.76293613902753061761\) \(-4.012601889379\times 10^{-02}\)
3 \(0.74026437750068385814\) \(-1.974111129638\times 10^{-03}\)
4 \(0.73909126246184205257\) \(-1.025799485867\times 10^{-05}\)
5 \(0.73908513481012311798\) \(-2.669348387419\times 10^{-09}\)
6 \(0.73908513321516280022\) \(-3.612606381257\times 10^{-15}\)
7 \(0.73908513321516064166\) \(-1.272274423348\times 10^{-24}\)
8 \(0.73908513321516064166\) \(-6.063960095502\times 10^{-40}\)
9 \(0.73908513321516064166\) \(-8.636168555094\times 10^{-78}\)

Subsection 3.5.2 Analysis of the Secant Method

Analyzing the secant method is similar to analyzing other iterative methods; however, because three points are involved in the step (3.5.1), the analysis must be adapted a bit.
First, let \(p\) be the root of \(f\text{,}\) that is, \(f(p)=0\text{,}\) and let \((x_n)\) be the sequence of points from the secant method. Define the (signed) error \(\epsilon_n = x_n-p\text{.}\) Then expand the terms \(f(x_n)\) and \(f(x_{n-1})\) about the root \(p\text{.}\)
\begin{equation*} \begin{aligned} f(x_n) \amp = f(p) + f'(p)(x_n-p) + \frac{f''(p)}{2}(x_n-p)^2 + \cdots \\ \amp = f'(p) \epsilon_n + \frac{f''(p)}{2}\epsilon_n^2 \\ f(x_{n-1})\amp = f(p) + f'(p)(x_{n-1}-p) + \frac{f''(p)}{2}(x_{n-1}-p)^2 + \cdots \\ \amp = f'(p) \epsilon_{n-1} + \frac{f''(p)}{2}\epsilon_{n-1}^2 \end{aligned} \end{equation*}
where \(f(p)=0\) is used and higher-order terms are dropped. Subtracting these results in
\begin{equation*} f(x_n) - f(x_{n-1}) = f'(p)(\epsilon_n - \epsilon_{n-1}) + \frac{f''(p)}{2} (\epsilon_n^2 - \epsilon_{n-1}^2) \end{equation*}
Then using (3.5.1) and subtracting \(p\) from each side results in
\begin{equation*} \begin{aligned} x_{n+1} - p \amp = x_n - f(x_n) \frac{x_n-x_{n-1}}{f(x_n)-f(x_{n-1})} - p \\ \epsilon_{n+1}\amp = \epsilon_n - \left(f'(p)\epsilon_n + \frac{f''(p)}{2}\epsilon_n^2\right) \frac{\epsilon_n - \epsilon_{n-1}}{f'(p)(\epsilon_n - \epsilon_{n-1}) + \frac{f''(p)}{2} (\epsilon_n^2 - \epsilon_{n-1}^2)} \\ \amp = \epsilon_n \frac{f'(p)+ \frac{f''(p)}{2} (\epsilon_n + \epsilon_{n-1}) - f'(p) - \epsilon_n f''(p)/2}{f'(p)+ \frac{f''(p)}{2} (\epsilon_n + \epsilon_{n-1})} \\ \amp = \epsilon_n \frac{\epsilon_{n-1} f''(p)/2}{f'(p)+ \frac{f''(p)}{2} (\epsilon_n + \epsilon_{n-1})} \\ \amp \approx \epsilon_n \epsilon_{n-1} \frac{f''(p)}{2f'(p)} \end{aligned} \end{equation*}
so we have the relationship that \(\epsilon_{n+1} \approx C \epsilon_n \epsilon_{n-1}\text{.}\)
We are trying to find the order of convergence, that is, the value of \(\alpha\) such that \(\epsilon_{n+1} \approx \lambda \epsilon_n^{\alpha}\) or equivalently, \(\epsilon_{n} \approx \lambda \epsilon_{n-1}^{\alpha}\text{,}\) which results in
\begin{equation*} \epsilon_{n-1} \approx \left(\frac{1}{\lambda} \epsilon_n\right)^{1/\alpha} \end{equation*}
and then
\begin{equation*} \begin{aligned} \epsilon_{n+1} \amp \approx C \epsilon_n \epsilon_{n-1} \approx C \epsilon_n \left(\frac{1}{\lambda} \epsilon_n\right)^{1/\alpha} \\ \lambda \epsilon_n^{\alpha} \amp \approx \frac{C}{\lambda^{1/\alpha}} \epsilon_n^{1+1/\alpha} \end{aligned} \end{equation*}
To ensure that the powers of \(\epsilon_n\) are the same on both sides:
\begin{equation*} \begin{aligned} \alpha \amp = 1 + \frac{1}{\alpha} \\ \alpha^2 - \alpha -1 \amp = 0 \end{aligned} \end{equation*}
and using the quadratic formula, \(\alpha = (1 \pm \sqrt{5})/2\text{.}\) Since we need \(\alpha \gt 0\text{,}\) we take \(\alpha = (1+\sqrt{5})/2 \approx 1.618\text{,}\) the golden ratio, consistent with the convergence we saw above. The secant method is therefore superlinear but not quite quadratic.
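We can check both the error relation \(\epsilon_{n+1} \approx C \epsilon_n \epsilon_{n-1}\) and the estimated order numerically (a sketch with our own variable names; the reference value \(p\) is the double-precision root of \(\cos x - x\)):

```python
import math

f = lambda x: math.cos(x) - x
p = 0.7390851332151607  # double-precision root of cos(x) = x

# Secant iterates from x0 = 1.5, x1 = 1, as in Example 3.5.3
xs = [1.5, 1.0]
for _ in range(4):
    x0, x1 = xs[-2], xs[-1]
    xs.append(x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0)))

eps = [abs(x - p) for x in xs]

# The derivation predicts eps_{n+1} / (eps_n * eps_{n-1}) -> f''(p) / (2 f'(p))
C = -math.cos(p) / (2 * (-math.sin(p) - 1))
print(eps[5] / (eps[4] * eps[3]), C)  # the ratio approaches C

# The order estimate log(eps_{n+1}/eps_n) / log(eps_n/eps_{n-1})
# drifts toward (1 + sqrt(5))/2 ≈ 1.618 as n grows
print(math.log(eps[5] / eps[4]) / math.log(eps[4] / eps[3]))
```

Only a few iterates are usable here: in double precision the errors reach machine epsilon almost immediately, which is why the order estimate is still a little away from its limit.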

Subsection 3.5.3 The Newton-Bisection Method

One of the advantages of the bisection method from Section 3.2 is that if the original interval contains a root, then every successive interval is guaranteed to contain a root as well. However, as we saw, it is quite slow to converge. In this section, we combine it with Newton’s method to gain speed while keeping the guarantee.
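As an illustration of the idea (a sketch under our own naming and tolerances; the safeguard rule shown is the standard one and may differ in detail from the variant developed in this text): maintain a bracketing interval, attempt a Newton step, and fall back to bisection whenever the Newton iterate leaves the bracket.

```python
import math

def newton_bisection(f, df, a, b, tol=1e-12, max_iter=100):
    """Safeguarded Newton: fast when Newton behaves, never loses the bracket."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        # Shrink the bracket so that [a, b] still contains a sign change
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        # Try a Newton step; accept it only if it stays inside the bracket
        dfx = df(x)
        if dfx != 0:
            x_new = x - fx / dfx
            if a < x_new < b:
                x = x_new
                continue
        # Otherwise fall back to a guaranteed bisection step
        x = 0.5 * (a + b)
    return x

root = newton_bisection(lambda x: math.cos(x) - x,
                        lambda x: -math.sin(x) - 1, 0.0, 1.0)
print(root)  # ≈ 0.7390851332151607
```
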

Subsection 3.5.4 Muller’s Method

Muller’s method extends the idea of the secant method to three points and the quadratic that passes through them. Consider a general function as in Figure 3.5.5.
Figure 3.5.5.
We find the parabola that passes through the points \((x_0, f_0)\text{,}\) \((x_1, f_1)\) and \((x_2, f_2)\text{,}\) where \(f_j = f(x_j)\text{.}\) Although the quadratic can be written in standard form, it is easier if we write it as
\begin{equation*} q(x) = a_0 + a_1 (x-x_1) + a_2 (x-x_1)^2 \end{equation*}
We also use
\begin{equation*} \begin{aligned} h_1 \amp = x_1-x_0 \amp h_2 \amp = x_2 - x_1 \amp \delta_1 \amp = \frac{f_1-f_0}{h_1} \amp \delta_2 \amp = \frac{f_2-f_1}{h_2} \end{aligned} \end{equation*}
Substituting the three points into \(q(x)\) results in
\begin{align} q(x_0) \amp = f_0 = a_0 + a_1 (x_0-x_1) + a_2 (x_0-x_1)^2 = a_0 -a_1 h_1 + a_2 h_1^2 \tag{3.5.2}\\ q(x_1) \amp = f_1 = a_0 \tag{3.5.3}\\ q(x_2) \amp = f_2 = a_0 + a_1 (x_2-x_1) + a_2 (x_2-x_1)^2 = a_0 + a_1 h_2 + a_2 h_2^2 \tag{3.5.4} \end{align}
Substituting \(a_0=f_1\) into (3.5.2) and dividing by \(h_1\) results in
\begin{equation} \frac{f_0-f_1}{h_1} = -\delta_1 = - a_1 + a_2 h_1\tag{3.5.5} \end{equation}
Substituting (3.5.3) into (3.5.4) and dividing by \(h_2\) results in
\begin{equation} \frac{f_2-f_1}{h_2} = \delta_2 = a_1 + a_2h_2\tag{3.5.6} \end{equation}
Adding (3.5.5) and (3.5.6) results in
\begin{equation*} \delta_2 - \delta_1 = a_2 (h_1 + h_2) \end{equation*}
which results in
\begin{equation*} a_2 = \frac{\delta_2-\delta_1}{h_1+h_2} \end{equation*}
and then subtracting \(h_2\) times (3.5.5) from \(h_1\) times (3.5.6) results in
\begin{equation*} \delta_1 h_2 + \delta_2 h_1 = a_1(h_2+h_1) \end{equation*}
or
\begin{equation*} a_1 = \frac{\delta_1 h_2 + \delta_2 h_1 }{h_2+h_1} \end{equation*}
This shows that the quadratic that passes through the points is
\begin{equation*} q(x) = a_0 + a_1 (x-x_1) + a_2(x-x_1)^2 \end{equation*}
with
\begin{equation*} \begin{aligned} a_0 \amp =f_1 \amp a_1 \amp =\frac{\delta_1 h_2 + \delta_2 h_1 }{h_2+h_1} \amp a_2 \amp = \frac{\delta_2-\delta_1}{h_1+h_2} \end{aligned} \end{equation*}

Example 3.5.6.

Use the results above to find the quadratic that passes through the points \((1,4)\text{,}\) \((3,1)\) and \((6,2)\text{.}\)
Solution.
First,
\begin{equation*} \begin{aligned} h_1 \amp = x_1 -x_0 = 3-1 = 2 \\ h_2 \amp = x_2 - x_1 = 6-3 = 3 \\ \delta_1 \amp = \frac{f_1-f_0}{h_1} = \frac{1-4}{2} = -\frac{3}{2} \\ \delta_2 \amp = \frac{f_2 -f_1}{h_2} = \frac{2-1}{3} = \frac{1}{3} \end{aligned} \end{equation*}
and then the coefficients of the quadratic are
\begin{equation*} \begin{aligned} a_0 \amp = 1 \\ a_1 \amp = \frac{\delta_1 h_2 + \delta_2 h_1 }{h_2+h_1} = \frac{(-3/2)(3) + (1/3)(2)}{2+3} = -\frac{23}{30} \\ a_2 \amp = \frac{\delta_2-\delta_1}{h_1+h_2} = \frac{(1/3-(-3/2))}{5} = \frac{11}{30} \end{aligned} \end{equation*}
This results in the quadratic equation:
\begin{equation*} q(x) = 1-\frac{23}{30}(x-3) + \frac{11}{30}(x-3)^2 \end{equation*}
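We can verify these coefficients with exact rational arithmetic (a quick check using Python’s `fractions` module; the variable names mirror the formulas above):

```python
from fractions import Fraction

# Points from Example 3.5.6
(x0, f0), (x1, f1), (x2, f2) = (1, Fraction(4)), (3, Fraction(1)), (6, Fraction(2))

h1, h2 = x1 - x0, x2 - x1
d1, d2 = (f1 - f0) / h1, (f2 - f1) / h2

a0 = f1
a1 = (d1 * h2 + d2 * h1) / (h2 + h1)
a2 = (d2 - d1) / (h1 + h2)
print(a0, a1, a2)  # 1 -23/30 11/30

# The quadratic q(x) = a0 + a1 (x - x1) + a2 (x - x1)^2
# passes through all three points
q = lambda x: a0 + a1 * (x - x1) + a2 * (x - x1) ** 2
print(q(1), q(3), q(6))  # 4 1 2
```
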
Next, Muller’s method uses a root of the quadratic as the new approximation; however, only one of the two roots is needed. The roots satisfy
\begin{equation*} x-x_1 = - \frac{2a_0}{a_1 \pm \sqrt{a_1^2-4a_0a_2}} = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_0a_2}}{2 a_2} \end{equation*}
The desired root is the one closer to \(x_1\text{,}\) that is, the one with the larger magnitude in the denominator. Thus we take the sign that matches the sign of \(a_1\text{.}\)
Therefore, we get:
\begin{equation*} x_3 = x_1 - \frac{2a_0}{a_1 + \text{sgn}(a_1) \sqrt{a_1^2-4a_0a_2}} \end{equation*}
We will now solve the equation \(\cos x = x\) by finding the root of \(f(x) = \cos x - x\text{.}\)

Example 3.5.8.

Find the root of \(f(x) = \cos x - x\) using Muller’s method starting with \(x_0 = 0, x_1 = 1, x_2 = 2\text{.}\)
Solution.
We show the first step in detail and then summarize the remaining steps in a table.
\begin{equation*} \begin{aligned} h_1 \amp = x_1-x_0 = 1 \amp h_2 \amp = x_2 - x_1 = 1 \\ \delta_1 \amp = \frac{f(x_1)-f(x_0)}{h_1} = -1.4596976941 \amp \delta_2 \amp = \frac{f(x_2)-f(x_1)}{h_2} = -1.9564491424 \\ a_0 \amp = f(x_1) = -0.45969769 \amp a_1 \amp = \frac{\delta_1 h_2 + \delta_2 h_1}{h_1+h_2} = -1.708073 \\ a_2 \amp = \frac{\delta_2-\delta_1}{h_1+h_2} = -0.2483757 \end{aligned} \end{equation*}
and then \(x_3 = 0.71942008\text{.}\)
We then repeat this until the error falls below a cutoff value, resulting in the following table.
Table 3.5.9. Convergence of Muller’s Method (computed with BigFloat)
\(n\) \(x_n\) \(f(x_n)\)
0 \(0.00000000000000000000\) \(1.000000000000\)
1 \(1.00000000000000000000\) \(-4.596976941319\times 10^{-01}\)
2 \(2.00000000000000000000\) \(-2.416146836547\)
3 \(0.71942008\) \(3.276790532876\times 10^{-02}\)
4 \(0.73909126246184205257\) \(9.171630667026\times 10^{-04}\)
5 \(0.73908513481012311798\) \(-1.907353946720\times 10^{-06}\)
6 \(0.73908513321516280022\) \(-1.371358137823\times 10^{-12}\)
7 \(0.73908513321516064166\) \(5.745267438980\times 10^{-23}\)
8 \(0.73908513321516064166\) \(3.599040004838\times 10^{-42}\)
9 \(0.73908513321516064166\) \(-8.636168555094\times 10^{-78}\)
10 \(0.73908513321516064166\) \(8.636168555094\times 10^{-78}\)

Subsection 3.5.5 Brent’s Method