2.4. Error Analysis for Iterative Methods#
The following definition is Definition 2.7 from [Burden and Faires, 2005].
Definition: Order of Convergence
Assume that \(\{c_{n}\}\) is a sequence of approximate solutions that converges to a point \(c\), with \(c_{n} \neq c\) for all \(n\in \mathbb{N}\). If there exist positive constants \(\lambda\) and \(\alpha\) such that

\[
\lim_{n \to \infty} \frac{\|c_{n+1} - c\|}{\|c_{n} - c\|^{\alpha}} = \lambda,
\]

where \(\|.\|\) is a distance function (such as the absolute value for real numbers), then \(\alpha\) represents the order of convergence while \(\lambda\) is the asymptotic error constant.
Remark
In the previous theorem,
If \(\alpha = 1\) (and \(\lambda < 1\)), then \(\{c_{n}\}\) is linearly convergent.
If \(\alpha = 2\), then \(\{c_{n}\}\) is quadratically convergent.
If \(\alpha = 3\), then \(\{c_{n}\}\) is cubically convergent (cubic convergence).
And so on.
Suppose that \(\{ c_n\}_{n = 0}^{\infty}\) is linearly convergent to 0 with

\[
\lim_{n \to \infty} \frac{|c_{n+1}|}{|c_{n}|} = \lambda,
\]

and that \(\{ \hat{c}_n\}_{n = 0}^{\infty}\) is quadratically convergent to 0 with the same asymptotic error constant,

\[
\lim_{n \to \infty} \frac{|\hat{c}_{n+1}|}{|\hat{c}_{n}|^{2}} = \lambda.
\]

For simplicity, we can assume that for each \(n\), we have

\[
\frac{|c_{n+1}|}{|c_{n}|} \approx \lambda
\quad \text{and} \quad
\frac{|\hat{c}_{n+1}|}{|\hat{c}_{n}|^{2}} \approx \lambda.
\]

For the linearly and quadratically convergent schemes, this means that

\[
|c_{n}| \approx \lambda^{n}\, |c_{0}|,
\]

and

\[
|\hat{c}_{n}| \approx \lambda^{2^{n} - 1}\, |\hat{c}_{0}|^{2^{n}},
\]

respectively.
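To see how different these rates are in practice, the following minimal sketch (assuming, for illustration, \(\lambda = 0.5\) and \(|c_0| = |\hat{c}_0| = 1\)) tabulates both estimates side by side:

```python
import pandas as pd

lam = 0.5   # assumed asymptotic error constant (illustration only)
c0 = 1.0    # assumed initial error

rows = []
for n in range(8):
    linear = lam**n * c0                        # |c_n| ~ lambda^n |c_0|
    quadratic = lam**(2**n - 1) * c0**(2**n)    # |c^_n| ~ lambda^(2^n - 1) |c^_0|^(2^n)
    rows.append({'n': n, 'Linear': linear, 'Quadratic': quadratic})

print(pd.DataFrame(rows).to_string(index=False))
```

After only seven iterations the quadratic sequence is already below \(10^{-38}\), while the linear one has merely reached \(2^{-7} \approx 0.0078\).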
2.4.1. The Order of Accuracy of the Bisection method#
Observe that, in the Bisection method, the size of the interval containing a root is halved at each iteration. Therefore, the error is reduced by approximately a factor of 2 per iteration, and the sequence \(\{c_n\}\) converges **linearly**.
Example: Investigate the order of convergence of the Bisection method for

\[
f(x) = x^{2} - x - 2
\]

on \([0,~3]\) using \(c = 2\) (exact root).
Solution:
Observe that \(x = -1\) and \(x = 2\) are the roots of \(x^{2}-x-2 = 0\). \(f(x)\) has one root (\(x = 2\)) between \(a = 0\) and \(b = 3\).
# This part is used for producing tables and figures
import sys
sys.path.insert(0,'..')
import hd_tools as hd
# Inputs:
f = lambda x: x**2 - x - 2
a=0
b=3
import pandas as pd
from hd_RootFinding_Algorithms import bisection
data = bisection(f, a, b, Nmax = 20, TOL = 1e-4)
display(data.style.set_properties(subset=['cn','fcn'],
**{'background-color': 'PaleGreen', 'color': 'Black',
'border-color': 'DarkGreen'}).format({'fcn': "{:.4e}"}))
| n | an | bn | cn | fcn |
|---|---|---|---|---|
| 0 | 0.000000 | 3.000000 | 1.500000 | -1.2500e+00 |
| 1 | 1.500000 | 3.000000 | 2.250000 | 8.1250e-01 |
| 2 | 1.500000 | 2.250000 | 1.875000 | -3.5938e-01 |
| 3 | 1.875000 | 2.250000 | 2.062500 | 1.9141e-01 |
| 4 | 1.875000 | 2.062500 | 1.968750 | -9.2773e-02 |
| 5 | 1.968750 | 2.062500 | 2.015625 | 4.7119e-02 |
| 6 | 1.968750 | 2.015625 | 1.992188 | -2.3376e-02 |
| 7 | 1.992188 | 2.015625 | 2.003906 | 1.1734e-02 |
| 8 | 1.992188 | 2.003906 | 1.998047 | -5.8556e-03 |
| 9 | 1.998047 | 2.003906 | 2.000977 | 2.9306e-03 |
| 10 | 1.998047 | 2.000977 | 1.999512 | -1.4646e-03 |
| 11 | 1.999512 | 2.000977 | 2.000244 | 7.3248e-04 |
| 12 | 1.999512 | 2.000244 | 1.999878 | -3.6620e-04 |
| 13 | 1.999878 | 2.000244 | 2.000061 | 1.8311e-04 |
| 14 | 1.999878 | 2.000061 | 1.999969 | -9.1552e-05 |
The iteration was stopped when \(|f(c_n)| < 10^{-4}\).
In many of the previous examples, we compared the approximation of the root at the current iteration with those of previous iterations. In practice, the exact solution is often difficult to identify. Here, however, to comment on the rate of convergence, we chose examples whose actual roots are known. The approximation of the root at each iteration is compared to the exact root using the absolute error. Moreover, recall that the absolute error at iteration \(n\) is \(|c - c_{n}|\).
def disp_Error(Error_data):
display(Error_data.style.set_properties(subset=['Abs Error'],
**{'background-color': 'MistyRose', 'color': 'Black',
'border-color': 'DarkGreen'}).format({'Abs Error': "{:.4e}"}))
pd.options.mode.chained_assignment = None
Error_data = data[['cn']]
Error_data['Exact'] = [2]*data['cn'].shape[0]
Error_data = Error_data.rename(columns = {'cn':'Approximate'})
Error_data['Abs Error'] = abs(Error_data['Exact'] - Error_data['Approximate'])
disp_Error(Error_data)
Approximate | Exact | Abs Error | |
---|---|---|---|
0 | 1.500000 | 2 | 5.0000e-01 |
1 | 2.250000 | 2 | 2.5000e-01 |
2 | 1.875000 | 2 | 1.2500e-01 |
3 | 2.062500 | 2 | 6.2500e-02 |
4 | 1.968750 | 2 | 3.1250e-02 |
5 | 2.015625 | 2 | 1.5625e-02 |
6 | 1.992188 | 2 | 7.8125e-03 |
7 | 2.003906 | 2 | 3.9062e-03 |
8 | 1.998047 | 2 | 1.9531e-03 |
9 | 2.000977 | 2 | 9.7656e-04 |
10 | 1.999512 | 2 | 4.8828e-04 |
11 | 2.000244 | 2 | 2.4414e-04 |
12 | 1.999878 | 2 | 1.2207e-04 |
13 | 2.000061 | 2 | 6.1035e-05 |
14 | 1.999969 | 2 | 3.0518e-05 |
# Ratios of successive absolute errors
Error_data['Abs Error'].values[1:] / Error_data['Abs Error'].values[:-1]
array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5])
Now observe that,

\[
\frac{|c_{n+1} - 2|}{|c_{n} - 2|} \approx 0.5.
\]

Therefore,

\[
\lambda = 0.5 \quad \text{and} \quad \alpha = 1,
\]

and the method is **linearly convergent**.
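If the order were not known in advance, it could be estimated from three consecutive errors: taking logarithms of \(|c_{n+1} - c| \approx \lambda |c_{n} - c|^{\alpha}\) for two successive values of \(n\) and dividing eliminates \(\lambda\). A minimal sketch (reusing the `Error_data` frame built above):

```python
import numpy as np

e = Error_data['Abs Error'].values
# alpha_n = log(e_{n+1}/e_n) / log(e_n/e_{n-1}); for the Bisection method this is ~1
alpha_est = np.log(e[2:] / e[1:-1]) / np.log(e[1:-1] / e[:-2])
print(alpha_est)
```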
2.4.2. The Order of Accuracy of the Fixed-point iteration#
The following theorem is Theorem 2.9 from [Burden and Faires, 2005].
Theorem
Assume \(c\) is a solution of \(x = g(x)\), \(g'(c) = 0\), and \(g''\) is continuous with \(|g''(x)| < M\) on an open interval \(I\) containing \(c\). Then there exists a \(\delta > 0\) such that, for \(c_{0} \in [c - \delta, c +\delta]\), the sequence defined by \(c_{n} = g(c_{n-1})\), \(n \geq 1\), converges at least quadratically to \(c\). Moreover, for sufficiently large values of \(n\),

\[
|c_{n+1} - c| < \frac{M}{2} |c_{n} - c|^{2}.
\]
Proof:
Pick \(k\in (0,~1)\) and \(\delta >0\) such that \([c - \delta, c +\delta] \subset I\) and \(|g'(x)| \leq k\) on \([c - \delta, c +\delta]\). Since \(|g'(x)| \leq k < 1\), the discussion in the proof of the Fixed-Point Theorem demonstrates that the terms of the sequence \(\{c_{n} \}_{n = 0}^{\infty}\) are contained in \([c - \delta, c +\delta]\).

Now, it follows from expanding \(g(x)\) in a linear Taylor polynomial, for \(x \in [c - \delta, c + \delta]\), that

\[
g(x) = g(c) + g'(c)(x - c) + \frac{g''(\xi)}{2}(x - c)^{2},
\]

where \(\xi\) is a point between \(x\) and \(c\). It follows from \(g(c) = c\) and \(g'(c) = 0\) that

\[
g(x) = c + \frac{g''(\xi)}{2}(x - c)^{2}.
\]

In particular, letting \(x = c_{n}\), we have,

\[
c_{n+1} = g(c_{n}) = c + \frac{g''(\xi_{n})}{2}(c_{n} - c)^{2},
\]

where \(\xi_{n}\) is a point between \(c_{n}\) and \(c\). Therefore,

\[
c_{n+1} - c = \frac{g''(\xi_{n})}{2}(c_{n} - c)^{2}.
\]

Now, as \(|g'(x)| \leq k <1\) on \([c - \delta, c +\delta]\) and \(g\) maps \([c - \delta, c +\delta]\) into itself, it follows from the Fixed-Point Theorem that \(\{c_{n} \}_{n = 0}^{\infty}\) converges to \(c\). Moreover, \(\xi_{n}\) is a point between \(c\) and \(c_{n}\) for each \(n\); therefore, \(\{\xi_{n} \}_{n = 0}^{\infty}\) also converges to \(c\), and

\[
\lim_{n \to \infty} \frac{|c_{n+1} - c|}{|c_{n} - c|^{2}} = \frac{|g''(c)|}{2}.
\]

This means that the sequence \(\{ c_{n} \}_{n=0}^{\infty}\) is **quadratically convergent** if \(g''(c) \neq 0\), and of higher-order convergence if \(g''(c) = 0\).

Since \(g''\) is continuous and **strictly bounded** by \(M\) on the interval \([c - \delta, c +\delta]\), this also means that, for sufficiently large values of \(n\),

\[
|c_{n+1} - c| < \frac{M}{2} |c_{n} - c|^{2}.
\]
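To see the theorem in action, consider a small illustrative example (not part of the original example set): \(g(x) = (x + 2/x)/2\) has the fixed point \(c = \sqrt{2}\) with \(g'(\sqrt{2}) = 0\) and \(g''(x) = 2/x^{3}\), so the ratio \(|c_{n+1} - c| / |c_{n} - c|^{2}\) should approach \(|g''(c)|/2 = 1/(2\sqrt{2}) \approx 0.3536\). A minimal sketch:

```python
import numpy as np

g = lambda x: (x + 2/x) / 2    # fixed point c = sqrt(2); g'(c) = 0
c = np.sqrt(2)

x = 2.0                        # starting guess c_0
for n in range(4):
    x_new = g(x)
    # |c_{n+1} - c| / |c_n - c|^2 should approach |g''(c)|/2 ~ 0.3536
    print(n, abs(x_new - c) / abs(x - c)**2)
    x = x_new
```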
Example:

Investigate the order of convergence of the fixed-point iteration for

\[
g(x) = \sqrt{x}
\]

on \([0.5, 1.5]\) and \(c_0 = 0.9999999\). Note that here \(c = 1\) is the exact fixed point.

Solution:

Here \(c= 1\) is the exact fixed point, and \(I=[0.5, 1.5]\). We have

\[
g'(x) = \frac{1}{2\sqrt{x}} \quad \text{and} \quad g''(x) = -\frac{1}{4 x^{3/2}}.
\]

Observe that \(g'(c) = \frac{1}{2} \neq 0\) (so the hypothesis \(g'(c) = 0\) of the previous theorem does not hold), \(g''(x)\) is a continuous function on \(I\), and \(|g''(x)| \leq 2\) for all \(x \in I\).

Let \(c_0 = 0.9999999\); we have

\[
c_{1} = g(c_{0}) = \sqrt{0.9999999} \approx 0.99999995.
\]
Also, using the Python script for the fixed-point method, we have,
import numpy as np
g = lambda x : np.sqrt(x)
from hd_RootFinding_Algorithms import Fixed_Point
data = Fixed_Point(g = g, x0 = .9999999, TOL=1e-10, Nmax=30)
# En is the difference between two consecutive xn.
display(data.style.set_properties(subset=['xn'], **{'background-color': 'PaleGreen', 'color': 'Black',
'border-color': 'DarkGreen'}).format({'En': "{:.4e}"}))
| n | xn | En |
|---|---|---|
| 0 | 1.000000 | 1.0000e+00 |
| 1 | 1.000000 | 5.0000e-08 |
| 2 | 1.000000 | 2.5000e-08 |
| 3 | 1.000000 | 1.2500e-08 |
| 4 | 1.000000 | 6.2500e-09 |
| 5 | 1.000000 | 3.1250e-09 |
| 6 | 1.000000 | 1.5625e-09 |
| 7 | 1.000000 | 7.8125e-10 |
| 8 | 1.000000 | 3.9062e-10 |
| 9 | 1.000000 | 1.9531e-10 |
| 10 | 1.000000 | 9.7656e-11 |
The iteration was stopped when \(E_n < 10^{-10}\). Moreover, the exact fixed point is \(c = 1\). Using this as the exact solution, we have,
c = 1
Error_data = data[['xn']]
Error_data['Exact'] = [c]*data['xn'].shape[0]
Error_data = Error_data.rename(columns = {'xn':'Approximate'})
Error_data['Abs Error'] = abs(Error_data['Exact'] - Error_data['Approximate'])
disp_Error(Error_data)
| n | Approximate | Exact | Abs Error |
|---|---|---|---|
| 0 | 1.000000 | 1 | 1.0000e-07 |
| 1 | 1.000000 | 1 | 5.0000e-08 |
| 2 | 1.000000 | 1 | 2.5000e-08 |
| 3 | 1.000000 | 1 | 1.2500e-08 |
| 4 | 1.000000 | 1 | 6.2500e-09 |
| 5 | 1.000000 | 1 | 3.1250e-09 |
| 6 | 1.000000 | 1 | 1.5625e-09 |
| 7 | 1.000000 | 1 | 7.8125e-10 |
| 8 | 1.000000 | 1 | 3.9063e-10 |
| 9 | 1.000000 | 1 | 1.9531e-10 |
| 10 | 1.000000 | 1 | 9.7656e-11 |
Now observe that,
# Ratios of successive absolute errors
Error_data['Abs Error'].values[1:] / Error_data['Abs Error'].values[:-1]
array([0.50000001, 0.50000001, 0.5 , 0.5 , 0.5 ,
0.50000002, 0.5 , 0.50000007, 0.5 , 0.50000028])
We can see that

\[
\frac{|c_{n+1} - 1|}{|c_{n} - 1|} \approx 0.5 = |g'(1)|.
\]

So \(\lambda = 0.5\) and \(\alpha = 1\); therefore, the method is linearly convergent.
Corollary: Convergence of Fixed Point Iteration
Let \(g\) be continuous on \([a, b]\), \(c\) be its fixed point, and \(g'\) be continuous on \((a, b)\), and assume there exists \(k < 1\) so that \(|g'(x)|\leq k\) when \(x\in(a,b)\). Then,
a. If \(g'(c) \neq 0\), the sequence converges linearly to the fixed point.
b. If \(g'(c) = 0\), the sequence converges at least quadratically to the fixed point.
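As a quick illustration of the corollary (a small check, not part of the original example set): both \(g_1(x) = \sqrt{x}\) and \(g_2(x) = (x + 1/x)/2\) have the fixed point \(c = 1\), but \(g_1'(1) = 1/2 \neq 0\) while \(g_2'(1) = 0\). Iterating both from the same starting point shows linear versus (at least) quadratic error decay:

```python
import numpy as np

g1 = lambda x: np.sqrt(x)       # g1'(1) = 1/2 != 0  -> linear convergence
g2 = lambda x: (x + 1/x) / 2    # g2'(1) = 0         -> at least quadratic

x1 = x2 = 1.3
for n in range(5):
    x1, x2 = g1(x1), g2(x2)
    print(f"n={n+1}:  |x1 - 1| = {abs(x1 - 1):.3e},  |x2 - 1| = {abs(x2 - 1):.3e}")
```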
2.4.3. The Order of Accuracy of Newton’s method#
The following definition is Definition 2.10 from [Burden and Faires, 2005].

Definition: Multiplicity of a Root

A solution \(c\) of \(f(x) = 0\) is a zero of multiplicity \(m\) of \(f\) if, for \(x \neq c\), we can write

\[
f(x) = (x - c)^{m} q(x), \quad \text{where} \quad \lim_{x \to c} q(x) \neq 0.
\]

In the above definition, \(q(x)\) denotes the part of \(f(x)\) that does **not** contribute to the zero of \(f\).
The following theorem is Theorem 2.11 from [Burden and Faires, 2005].
Theorem
A function \(f \in C^{1}[a,b]\) has a root \(c\in(a,b)\) of multiplicity 1 (a simple zero) if and only if

\[
f(c) = 0 \quad \text{and} \quad f'(c) \neq 0.
\]
Proof:
If \(f\) has a simple zero at the point \(c\), then \(f(c) = 0\) and, by the definition of multiplicity,

\[
f(x) = (x - c)\, q(x), \quad \text{with} \quad \lim_{x \to c} q(x) \neq 0.
\]

Now, since \(f \in C^{1}[a,b]\), then

\[
f'(c) = \lim_{x \to c} f'(x) = \lim_{x \to c} \left[ q(x) + (x - c)\, q'(x) \right] = \lim_{x \to c} q(x) \neq 0.
\]

On the other hand, let \(f(c) =0\) and \(f'(c) \neq 0\). Then, we can expand \(f\) in a zeroth Taylor polynomial around the point \(c\) as follows,

\[
f(x) = f(c) + f'(\xi(x))\,(x - c) = (x - c)\, f'(\xi(x)),
\]

where \(\xi(x)\) is between \(x\) and \(c\). Again, since \(f \in C^{1}[a,b]\), then

\[
\lim_{x \to c} f'(\xi(x)) = f'\Big( \lim_{x \to c} \xi(x) \Big) = f'(c) \neq 0.
\]

Now, it is enough to take \(q = f'\circ\xi\). Then \(f (x) = (x - c)\, q(x)\) with \( \lim_{x \to c} q(x) \neq 0\). Therefore, \(f\) has a simple zero at the point \(c\).
Theorem
Assume that the function \(f \in C^{m}[a,b]\). Then \(f\) has a root \(c\in(a,b)\) of multiplicity \(m\) if and only if

\[
0 = f(c) = f'(c) = f''(c) = \dots = f^{(m-1)}(c), \quad \text{but} \quad f^{(m)}(c) \neq 0.
\]
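This criterion is easy to check numerically or symbolically. Below is a minimal sketch (using `sympy`, which is an assumption; the original notes may rely on a different tool) that determines the multiplicity of the root \(c = 0\) of \(f(x) = e^{x^{2}} - 1\), a function revisited in an example below:

```python
import sympy as sp

x = sp.Symbol('x')
f = sp.exp(x**2) - 1
c = 0

# Differentiate until some derivative is nonzero at c;
# the number of vanishing derivatives (including f itself) is the multiplicity.
m, d = 0, f
while d.subs(x, c) == 0:
    m += 1
    d = sp.diff(d, x)
print(m)   # 2, since f(0) = f'(0) = 0 but f''(0) = 2 != 0
```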
Recall that Newton’s method reads

\[
x_{n+1} = x_{n} - \frac{f(x_{n})}{f'(x_{n})}.
\]

On the other hand, viewed as a fixed-point iteration, we have

\[
g(x) = x - \frac{f(x)}{f'(x)}.
\]

It follows from the two previous equations that

\[
x_{n+1} = g(x_{n}).
\]

Let \(c\) be a root of \(f(x) = 0\) with a multiplicity of 1 (i.e. \(f(c) = 0\) and \(f'(c) \neq 0\)); from the fixed-point relation, we also have \(c = g(c)\). Thus, differentiating \(g\),

\[
g'(x) = \frac{f(x)\, f''(x)}{\left( f'(x) \right)^{2}},
\quad \text{so} \quad
g'(c) = \frac{f(c)\, f''(c)}{\left( f'(c) \right)^{2}} = 0.
\]

On the other hand, we can expand \(g(x_n)\) in a Taylor series around the point \(c\):

\[
g(x_{n}) = g(c) + g'(c)(x_{n} - c) + \frac{g''(\xi_{n})}{2}(x_{n} - c)^{2},
\]

where \(\xi_{n}\) is between \(x_{n}\) and \(c\). Now, let \(e_n = x_n - c\) (the error at iteration \(n\)). Using \(x_{n+1} = g(x_{n})\), \(g(c) = c\), and \(g'(c) = 0\), we have

\[
e_{n+1} = x_{n+1} - c = \frac{g''(\xi_{n})}{2}\, e_{n}^{2}.
\]
Therefore, **Newton’s method is quadratically convergent** for a simple root. Moreover, let \(x_n\) be a point near a root \(c\) with a multiplicity of \(m \geq 2\). Then, it follows from the Taylor series developed around \(c\) (all derivatives of \(f\) up to order \(m - 1\) vanish at \(c\)) that

\[
f(x_{n}) = \frac{f^{(m)}(\xi_{n})}{m!}\, e_{n}^{m}
\quad \text{and} \quad
f'(x_{n}) = \frac{f^{(m)}(\eta_{n})}{(m-1)!}\, e_{n}^{m-1},
\]

where \(\xi_{n}\) and \(\eta_{n}\) lie between \(x_{n}\) and \(c\). Thus,

\[
\frac{f(x_{n})}{f'(x_{n})} = \frac{f^{(m)}(\xi_{n})}{m\, f^{(m)}(\eta_{n})}\, e_{n} \approx \frac{e_{n}}{m},
\]

and the error term at iteration \(n+1\) would be

\[
e_{n+1} = e_{n} - \frac{f(x_{n})}{f'(x_{n})} \approx \left( 1 - \frac{1}{m} \right) e_{n}.
\]

Therefore,

\[
\frac{|e_{n+1}|}{|e_{n}|} \approx \frac{m-1}{m} < 1.
\]

This means that, for a root with a **multiplicity of \(m \geq 2\)**, Newton’s method is **not quadratic**; it converges only linearly, with asymptotic error constant \(\lambda = (m-1)/m\).
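A quick numerical check of the linear rate \((m-1)/m\) (a small illustrative example, assuming \(f(x) = x^{3}\), which has a root of multiplicity \(m = 3\) at \(c = 0\)):

```python
f = lambda x: x**3         # root c = 0 with multiplicity m = 3
Df = lambda x: 3*x**2

x = 1.0
for n in range(6):
    x_new = x - f(x) / Df(x)
    # |e_{n+1}| / |e_n| should approach (m - 1)/m = 2/3
    print(n, abs(x_new) / abs(x))
    x = x_new
```

Every printed ratio is exactly \(2/3\) here, since the Newton update for \(x^{3}\) reduces to \(x_{n+1} = \tfrac{2}{3} x_{n}\).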
2.4.3.1. Modified Newton Algorithm#
Moreover, if the root has multiplicity \(m\), and this multiplicity is known, then the algorithm can be modified as follows to recover quadratic convergence:

\[
x_{n+1} = x_{n} - m\, \frac{f(x_{n})}{f'(x_{n})}.
\]
import pandas as pd
import numpy as np

def Newton_method_mod(f, Df, x0, m = 1, TOL = 1e-4, Nmax = 100, Frame = True):
    '''
    Parameters
    ----------
    f : function
        DESCRIPTION. the function whose root is sought
    Df : function
        DESCRIPTION. the derivative of f
    x0 : float
        DESCRIPTION. Initial estimate
    m : int
        DESCRIPTION. the multiplicity of the root
    TOL : float, optional
        DESCRIPTION. Tolerance (epsilon). The default is 1e-4.
    Nmax : int, optional
        DESCRIPTION. Maximum number of iterations. The default is 100.
    Frame : bool, optional
        DESCRIPTION. If it is true, a dataframe will be returned. The default is True.

    Returns
    -------
    xn, fxn, Dfxn, n
    or
    a dataframe
    '''
    xn = np.zeros(Nmax, dtype=float)
    fxn = np.zeros(Nmax, dtype=float)
    Dfxn = np.zeros(Nmax, dtype=float)
    xn[0] = x0
    for n in range(0, Nmax-1):
        fxn[n] = f(xn[n])
        # stop when |f(xn)| falls below the tolerance
        if abs(fxn[n]) < TOL:
            Dfxn[n] = Df(xn[n])
            if Frame:
                return pd.DataFrame({'xn': xn[:n+1], 'fxn': fxn[:n+1], 'Dfxn': Dfxn[:n+1]})
            else:
                return xn, fxn, Dfxn, n
        Dfxn[n] = Df(xn[n])
        if Dfxn[n] == 0:
            print('Zero derivative. No solution found.')
            return None
        # modified Newton step: scale the update by the multiplicity m
        xn[n+1] = xn[n] - m*(fxn[n]/Dfxn[n])
    print('Exceeded maximum iterations. No solution found.')
    return None
Example: \(f(x)=e^{x^2}-1\) has a root \(c = 0\) with a multiplicity of 2 on \([-3,~3]\). Comment on the rate of convergence of Newton’s method and the modified Newton’s method using \(c_0 = -2\).

Solution: We can utilize `Newton_method_mod` to generate \(\{c_n\}_{n = 0}^{\infty}\) for Newton’s method (set \(m = 1\)) and the modified Newton’s method (set \(m = 2\)).
from hd_RootFinding_Algorithms import Newton_method_mod
import numpy as np
f = lambda x: np.exp(x**2) - 1
Df = lambda x: 2*x*np.exp(x**2)
# Newton method
data = Newton_method_mod(f, Df, x0 = -2, TOL = 1e-4, Nmax = 20)
display(data.style.format({'fxn': "{:.4e}"}))
hd.it_method_plot(data, title = 'Newton Method', ycol = 'fxn', ylabel = 'f(xn)', color = 'OrangeRed')
# Newton method modified
data = Newton_method_mod(f, Df, m = 2, x0 = -2, TOL = 1e-4, Nmax = 20)
display(data.style.format({'fxn': "{:.4e}"}))
hd.it_method_plot(data, title = 'Newton Method (Modified)', ycol = 'fxn', ylabel = 'f(xn)', color = 'Orchid')
| n | xn | fxn | Dfxn |
|---|---|---|---|
| 0 | -2.000000 | 5.3598e+01 | -218.392600 |
| 1 | -1.754579 | 2.0727e+01 | -76.242818 |
| 2 | -1.482726 | 8.0113e+00 | -26.722522 |
| 3 | -1.182931 | 3.0525e+00 | -9.587583 |
| 4 | -0.864554 | 1.1116e+00 | -3.651212 |
| 5 | -0.560103 | 3.6850e-01 | -1.533001 |
| 6 | -0.319725 | 1.0763e-01 | -0.708274 |
| 7 | -0.167762 | 2.8544e-02 | -0.345101 |
| 8 | -0.085050 | 7.2598e-03 | -0.171335 |
| 9 | -0.042679 | 1.8231e-03 | -0.085513 |
| 10 | -0.021359 | 4.5630e-04 | -0.042737 |
| 11 | -0.010682 | 1.1411e-04 | -0.021366 |
| 12 | -0.005341 | 2.8529e-05 | -0.010683 |
| n | xn | fxn | Dfxn |
|---|---|---|---|
| 0 | -2.000000 | 5.3598e+01 | -218.392600 |
| 1 | -1.509158 | 8.7528e+00 | -29.437114 |
| 2 | -0.914478 | 1.3077e+00 | -4.220761 |
| 3 | -0.294806 | 9.0799e-02 | -0.643149 |
| 4 | -0.012448 | 1.5496e-04 | -0.024899 |
| 5 | -0.000001 | 9.2992e-13 | -0.000002 |
The rate of convergence for the modified Newton method is much faster. Note also that, in the plain Newton run, \(x_n\) (and hence the error, since \(c = 0\)) is roughly halved at each of the later iterations, consistent with the predicted linear rate \(\lambda = (m-1)/m = 1/2\) for \(m = 2\).
Example: Investigate the order of convergence of Newton’s method and the modified Newton’s method for \(f(x)= (x - 3)^4\) on \([-6,~6]\) using \(c_0 = -5\) and \(c = 3\).

Solution: Observe that \(c = 3\) is a root of this polynomial with a multiplicity of \(m = 4\).
from hd_RootFinding_Algorithms import Newton_method_mod
import numpy as np
f = lambda x: (x - 3)**4
Df = lambda x: 4*(x - 3)**3
# Newton method
data = Newton_method_mod(f, Df, x0 = -5, TOL = 1e-4, Nmax = 20)
display(data.style.format({'fxn': "{:.4e}"}))
hd.it_method_plot(data, title = 'Newton Method', ycol = 'fxn', ylabel = 'f(xn)', color = 'OrangeRed')
# Newton method modified
data = Newton_method_mod(f, Df, m = 4, x0 = -5, TOL = 1e-4, Nmax = 20)
display(data.style.format({'fxn': "{:.4e}"}))
| n | xn | fxn | Dfxn |
|---|---|---|---|
| 0 | -5.000000 | 4.0960e+03 | -2048.000000 |
| 1 | -3.000000 | 1.2960e+03 | -864.000000 |
| 2 | -1.500000 | 4.1006e+02 | -364.500000 |
| 3 | -0.375000 | 1.2975e+02 | -153.773438 |
| 4 | 0.468750 | 4.1053e+01 | -64.873169 |
| 5 | 1.101562 | 1.2989e+01 | -27.368368 |
| 6 | 1.576172 | 4.1099e+00 | -11.546030 |
| 7 | 1.932129 | 1.3004e+00 | -4.870982 |
| 8 | 2.199097 | 4.1145e-01 | -2.054945 |
| 9 | 2.399323 | 1.3019e-01 | -0.866930 |
| 10 | 2.549492 | 4.1192e-02 | -0.365736 |
| 11 | 2.662119 | 1.3033e-02 | -0.154295 |
| 12 | 2.746589 | 4.1238e-03 | -0.065093 |
| 13 | 2.809942 | 1.3048e-03 | -0.027461 |
| 14 | 2.857456 | 4.1285e-04 | -0.011585 |
| 15 | 2.893092 | 1.3063e-04 | -0.004888 |
| 16 | 2.919819 | 4.1331e-05 | -0.002062 |
| n | xn | fxn | Dfxn |
|---|---|---|---|
| 0 | -5.000000 | 4.0960e+03 | -2048.000000 |
| 1 | 3.000000 | 0.0000e+00 | 0.000000 |
The iteration was stopped when \(|f(c_n)| < 10^{-4}\). In the plain Newton run, the errors \(|x_n - 3|\) decay by the predicted linear factor \(\lambda = (m-1)/m = 3/4\) at each step. The modified method, by contrast, converges in a single iteration: for \(f(x) = (x-3)^4\), the modified update gives \(x_{1} = x_{0} - 4\,f(x_{0})/f'(x_{0}) = x_{0} - (x_{0} - 3) = 3\) exactly. The resulting run is too short for any form of error estimation, but it shows that the rate of convergence for the modified Newton method is much faster.