2.1. Bracketing Methods#

In these methods, a smaller interval that includes a root is identified iteratively. As these methods do not mandate any derivatives in general, they are also known as no derivative methods.

2.1.1. Bisection method#

This method is employed to compute the solution of the equation \(f(x) = 0\) for a real variable \(x\). It requires the function \(f\) to be continuous on the interval \([a,~b]\), where \(f(a)f(b) < 0\) (indicating that \(f(a)\) and \(f(b)\) have opposite signs). In such cases, according to the Intermediate Value Theorem, \(f\) must possess at least one root within the interval \((a,~b)\) (for more details, refer to [Burden and Faires, 2005]). These methods are also referred to as the “binary-search” methods.

Observe that, this method generates a sequence of intervals such that.

\[\begin{align*} [a, b] = [a_{1}, b_{1}] \supset [a_{2}, b_{2}] \supset [a_{3}, b_{3}] \supset \ldots [a_{n}, b_{n}] \supset \ldots \end{align*}\]

The following theorem is Theorem 2.1 from [Burden and Faires, 2005]).

Theorem

Assume that function \(f\) is continuous on \([a,~b]\) and \(f(a)f(b)<0\). Then, the Bisection method induces a sequence of approximations of a root \(c\) of function \(f\) from \([a,~b]\), \(\{c_{n}\}\) such that

(2.1)#\[\begin{equation}\label{bisection_theorem_eq} \left| c - c_{n} \right| \leq \frac{b-a}{2^n} \end{equation}\]

Proof:

We have \(c\in (a_{n}, b_{n})\) with \(n\geq 1\) and

(2.2)#\[\begin{equation} b_{n} - a_{n} = \frac{1}{2^{n-1}} (b - a),\quad n \geq 1. \end{equation}\]

Since \(c_{n} = \dfrac{1}{2}(a_{n} + b_{n})\) with \(n\geq 1\), we have,

(2.3)#\[\begin{equation} |c_{n} - c| \leq \frac{1}{2}(b_{n} - a_{n}) =\frac{b-a}{2^n}. \end{equation}\]

Therefore, the sequence \(\{c_{n}\}\) converges to \(c\) as \(n\to \infty\) and

(2.4)#\[\begin{equation} c_{n} = c +\mathcal{O}\left(\frac{1}{2^n}\right). \end{equation}\]
import pandas as pd 
import numpy as np

def bisection_mod(f, a, b, TOL = 1e-4, Frame = True):
    '''
    Parameters
    ----------
    f : function 
        DESCRIPTION. A function. Here we use lambda functions
    a : float
        DESCRIPTION. a is the left side of interval [a, b]
    b : float
        DESCRIPTION. b is the right side of interval [a, b]
    TOL : float, optional
        DESCRIPTION. Tolerance (epsilon). The default is 1e-4.
    Frame : bool, optional
        DESCRIPTION. If it is true, a dataframe will be returned. The default is True.

    Returns
    -------
    an, bn, cn, fcn, n
    or
    a dataframe
    '''
    
    if f(a)*f(b) >= 0:
        print("Bisection method is inapplicable .")
        return None
    
    # let c_n be a point in (a_n, b_n)
    Nmax= int(np.ceil(np.log2(np.abs(b-a)/TOL)))+2
    an=np.zeros(Nmax, dtype=float)
    bn=np.zeros(Nmax, dtype=float)
    cn=np.zeros(Nmax, dtype=float)
    fcn=np.zeros(Nmax, dtype=float)
    # initial values
    an[0]=a
    bn[0]=b

    for n in range(0, Nmax-1):
        cn[n]=(an[n] + bn[n])/2
        fcn[n]=f(cn[n])
        if f(an[n])*fcn[n] < 0:
            an[n+1]=an[n]
            bn[n+1]=cn[n]
        elif f(bn[n])*fcn[n] < 0:
            an[n+1]=cn[n]
            bn[n+1]=bn[n]
        else:
            print("Bisection method fails.")
            return None
        if (abs(fcn[n]) < TOL):
            if Frame:
                return pd.DataFrame({'an': an[:n+1], 'bn': bn[:n+1], 'cn': cn[:n+1], 'fcn': fcn[:n+1]})
            else:
                return an, bn, cn, fcn, n
        
    if Frame:
        return pd.DataFrame({'an': an[:n+1], 'bn': bn[:n+1], 'cn': cn[:n+1], 'fcn': fcn[:n+1]})
    else:
        return an, bn, cn, fcn, n
function [an, bn, cn, fcn, n] = bisection_mod(f, a, b, TOL)
%{
Parameters
----------
f : function 
    DESCRIPTION. A function. Here we use lambda functions
a : float
    DESCRIPTION. a is the left side of interval [a, b]
b : float
    DESCRIPTION. b is the right side of interval [a, b]
TOL : float, optional
    DESCRIPTION. Tolerance (epsilon). The default is 1e-4.

Returns
-------
an, bn, cn, fcn, n
or
a dataframe

Example: f = @(x) x^3 - x - 2
%}
if nargin==3
    TOL= 1e-4;
end

if f(a)*f(b) >= 0
    disp("Bisection method is inapplicable .")
    return
end

% let c_n be a point in (a_n, b_n)
Nmax = fix(ceil(log2(abs(b-a)/TOL)));
an = zeros(Nmax, 1);
bn = zeros(Nmax, 1);
cn = zeros(Nmax, 1);
fcn = zeros(Nmax, 1);
% initial values
an(1) = a;
bn(1) = b;

for n = 1:(Nmax-1)
    cn(n)=(an(n) + bn(n))/2;
    fcn(n)=f(cn(n));
    if f(an(n))*fcn(n) < 0
        an(n+1)=an(n);
        bn(n+1)=cn(n);
    elseif f(bn(n))*fcn(n) < 0
        an(n+1)=cn(n);
        bn(n+1)=bn(n);
    else
        disp("Bisection method fails.")
    end
    if (abs(fcn(n)) < TOL)
        return
    end
end

Example: Let \(a = 1\) and \(b= 2\), \(N_{max} = 20\) and \(\varepsilon = 10^{-4}\). Find a root of the following polynomial using the Bisection method.

Solution:

By sketching function \(f(x)\), it can be seen that the root is somewhere around 1.5.

Using the Bisection method, we have \(a = 1\) and \(b = 2\). For n=0 we set \(a_0 = 1\) and \(b_0 = 2\). Now, since \(f(a_0) = -2 <0\) and \(f(b_0) = 4 >0\), then \(f(a_0)f(b_0)<0\), and we can implement the Bisection method. Thus,

\[\begin{equation*} c_{0} = \frac{1+2}{2} = 1.5. \end{equation*}\]

Because \(c_{0} = (1+2)/2 = 1.5\), then \(f(c_{0}) = -0.125\). Observe that,

\[\begin{equation*} f(a_{0})f(c_{0})>0 \quad \text{and} \quad f(b_{0})f(c_{0})<0. \end{equation*}\]

and the for the next iteration, n=1, we set \(a_{1} = 1.5\) and \(b_{1} = 2\) and repeat the above steps. Therefore, \(c_{1} = (1.5 + 2)/2 = 1.75.\) \ \noindent Now, for this step, \(f(c_{1}) = 1.609375\), and

\[\begin{equation*} f(a_{1})f(c_{1})<0 \quad \text{and} \quad f(b_{1})f(c_{1})>0. \end{equation*}\]

Thus, for the next iteration, n=2, we have, \(a_{2} = 1.5\) and \(b_{2} = 1.75\). By continuing the above procedure, we can generate the following table. Here, since we don’t know what exactly the value of \(c\) is for this example, we measure the error through \(f(c_n)\). The iterations was stopped when \(|f(c_n)|< 10^{-4}\).

Using the above Python function,

# defining function f using Lambdas
f = lambda x: x**3 - x - 2

from IPython.display import display, Markdown, Latex

a=1
display(Latex(r'(a, f(a)) = (%.2f, %.2f)' % (a,f(a))))
b=2
display(Latex(r'(b, f(b)) = (%.2f, %.2f)' % (b,f(b))))
\[(a, f(a)) = (1.00, -2.00)\]
\[(b, f(b)) = (2.00, 4.00)\]

Now since \(f(a)<0\) and \(f(b)>0\), we can implement the method. Let \(N_{max}=20\) and the tolerance be \(10^{-4}\)

# This part is used for producing tables and figures
import sys
sys.path.insert(0,'..')
import hd_tools as hd
Loading BokehJS ...
from hd_RootFinding_Algorithms import bisection
data = bisection(f, a, b, Nmax = 20, TOL = 1e-4)
display(data.style.format({'fcn': "{:.4e}"}))
hd.it_method_plot(data, title = 'Bisection method')
hd.root_plot(f, a, b, [data.an.values[:-1], data.an.values[-1]], ['an', 'root'])
  an bn cn fcn
0 1.000000 2.000000 1.500000 -1.2500e-01
1 1.500000 2.000000 1.750000 1.6094e+00
2 1.500000 1.750000 1.625000 6.6602e-01
3 1.500000 1.625000 1.562500 2.5220e-01
4 1.500000 1.562500 1.531250 5.9113e-02
5 1.500000 1.531250 1.515625 -3.4054e-02
6 1.515625 1.531250 1.523438 1.2250e-02
7 1.515625 1.523438 1.519531 -1.0971e-02
8 1.519531 1.523438 1.521484 6.2218e-04
9 1.519531 1.521484 1.520508 -5.1789e-03
10 1.520508 1.521484 1.520996 -2.2794e-03
11 1.520996 1.521484 1.521240 -8.2891e-04
12 1.521240 1.521484 1.521362 -1.0343e-04
13 1.521362 1.521484 1.521423 2.5935e-04
14 1.521362 1.521423 1.521393 7.7956e-05

Remark

Note that the difference between \(c_n\) and a solution \(c\) is bounded by

(2.5)#\[\begin{equation} |c_{n}-c|\leq {\frac {|b-a|}{2^{n}}}. \end{equation}\]

Therefore, The number of iterations needed to achieve a required tolerance \(\varepsilon\) is bounded by

(2.6)#\[\begin{equation}\label{eq_nmax} n \geq \log _{2}\left({\frac {|b-a|}{\varepsilon }}\right) \end{equation}\]

Since \(n\) is a natural number (i.e. 1, 2, 3, \(\ldots\), we usually choose the smallest natural number \(n\) that is greater than \(\log _{2}\left(|b-a|/\varepsilon)\right)\).

def bisection_nmax(a, b, TOL):
    '''
    Parameters
    ----------
    a : float
        DESCRIPTION. a is the left side of interval [a, b]
    b : float
        DESCRIPTION. b is the right side of interval [a, b]
    TOL : float, optional
        DESCRIPTION. Tolerance (epsilon). The default is 1e-4.

    Returns
    -------
    nmax
    '''
    return int(np.ceil(np.log2(np.abs(b-a)/TOL)))
function [nmax] = bisection_nmax(a, b, TOL)
%{
Parameters
----------
a : float
    DESCRIPTION. a is the left side of interval [a, b]
b : float
    DESCRIPTION. b is the right side of interval [a, b]
TOL : float, optional
    DESCRIPTION. Tolerance (epsilon). The default is 1e-4.

Returns
-------
nmax
%}
nmax = fix(ceil(log2(abs(b-a)/TOL)));

In our example, we need at least iterations.

from hd_RootFinding_Algorithms import bisection_nmax
Nmax = bisection_nmax(a, b, TOL = 1e-4)
display(Latex(r'N_{\max} = %i' % Nmax))
\[N_{\max} = 14\]

2.1.2. Bisection method (Modified)#

We can modify Algorithm Bisection_Algorithm and remove \(N_{\max}\) from it using the above remark. In order to identify the \(N_{\max}\) automatically, we can use ceil function which maps \(x\) to the least integer greater than or equal to \(x\).

def bisection_mod(f, a, b, TOL = 1e-4, Frame = True):
    '''
    Parameters
    ----------
    f : function 
        DESCRIPTION. A function. Here we use lambda functions
    a : float
        DESCRIPTION. a is the left side of interval [a, b]
    b : float
        DESCRIPTION. b is the right side of interval [a, b]
    TOL : float, optional
        DESCRIPTION. Tolerance (epsilon). The default is 1e-4.
    Frame : bool, optional
        DESCRIPTION. If it is true, a dataframe will be returned. The default is True.

    Returns
    -------
    an, bn, cn, fcn, n
    or
    a dataframe
    '''
    
    if f(a)*f(b) >= 0:
        print("Bisection method is inapplicable .")
        return None
    
    # let c_n be a point in (a_n, b_n)
    Nmax= int(np.ceil(np.log2(np.abs(b-a)/TOL)))+2
    an=np.zeros(Nmax, dtype=float)
    bn=np.zeros(Nmax, dtype=float)
    cn=np.zeros(Nmax, dtype=float)
    fcn=np.zeros(Nmax, dtype=float)
    # initial values
    an[0]=a
    bn[0]=b

    for n in range(0, Nmax-1):
        cn[n]=(an[n] + bn[n])/2
        fcn[n]=f(cn[n])
        if f(an[n])*fcn[n] < 0:
            an[n+1]=an[n]
            bn[n+1]=cn[n]
        elif f(bn[n])*fcn[n] < 0:
            an[n+1]=cn[n]
            bn[n+1]=bn[n]
        else:
            print("Bisection method fails.")
            return None
        if (abs(fcn[n]) < TOL):
            if Frame:
                return pd.DataFrame({'an': an[:n+1], 'bn': bn[:n+1], 'cn': cn[:n+1], 'fcn': fcn[:n+1]})
            else:
                return an, bn, cn, fcn, n
        
    if Frame:
        return pd.DataFrame({'an': an[:n+1], 'bn': bn[:n+1], 'cn': cn[:n+1], 'fcn': fcn[:n+1]})
    else:
        return an, bn, cn, fcn, n
function [an, bn, cn, fcn, n] = bisection_mod(f, a, b, TOL)
%{
Parameters
----------
f : function 
    DESCRIPTION. A function. Here we use lambda functions
a : float
    DESCRIPTION. a is the left side of interval [a, b]
b : float
    DESCRIPTION. b is the right side of interval [a, b]
TOL : float, optional
    DESCRIPTION. Tolerance (epsilon). The default is 1e-4.

Returns
-------
an, bn, cn, fcn, n
or
a dataframe

Example: f = @(x) x^3 - x - 2
%}
if nargin==3
    TOL= 1e-4;
end

if f(a)*f(b) >= 0
    disp("Bisection method is inapplicable .")
    return
end

% let c_n be a point in (a_n, b_n)
Nmax = fix(ceil(log2(abs(b-a)/TOL)));
an = zeros(Nmax, 1);
bn = zeros(Nmax, 1);
cn = zeros(Nmax, 1);
fcn = zeros(Nmax, 1);
% initial values
an(1) = a;
bn(1) = b;

for n = 1:(Nmax-1)
    cn(n)=(an(n) + bn(n))/2;
    fcn(n)=f(cn(n));
    if f(an(n))*fcn(n) < 0
        an(n+1)=an(n);
        bn(n+1)=cn(n);
    elseif f(bn(n))*fcn(n) < 0
        an(n+1)=cn(n);
        bn(n+1)=bn(n);
    else
        disp("Bisection method fails.")
    end
    if (abs(fcn(n)) < TOL)
        return
    end
end

Example: Use the modified Bisection method for finding the root of the function from the previous Example.

Solution:

Since \(a= 1\), \(b= 2\) and \(\varepsilon = 10^{-4}\), we have,

\[\begin{equation*} n \geq \log _{2}\left({\frac {|b-a|}{\varepsilon }}\right) = \log _{2}\left({\frac {|2-1|}{10^{-4}}}\right) = \log _{2}\left(10^{4}\right) = 13.29 \end{equation*}\]

Thus, if we pick \(N_{\max} = 14\), it will produce an error less than \(10^{-4}\). However, since we don’t know what exactly the value of \(c\) is for this example, we measure the error through \(f(c_n)\). Generally, the error is measured using the absolute error as \(|\text{Exact} ~-~\text{Approximation}|\) at each step.

Observe that the solution for this example would be almost identical to that of Example. However, the main difference would be the \(N_{max}\).

from hd_RootFinding_Algorithms import bisection_mod
data = bisection_mod(f, a, b, TOL = 1e-4)
display(data.style.format({'fcn': "{:.4e}"}))
hd.it_method_plot(data, title = 'Bisection method (Modified)', color = 'DarkGreen')
hd.root_plot(f, a, b, [data.an.values[:-1], data.an.values[-1]], ['an', 'root'])
  an bn cn fcn
0 1.000000 2.000000 1.500000 -1.2500e-01
1 1.500000 2.000000 1.750000 1.6094e+00
2 1.500000 1.750000 1.625000 6.6602e-01
3 1.500000 1.625000 1.562500 2.5220e-01
4 1.500000 1.562500 1.531250 5.9113e-02
5 1.500000 1.531250 1.515625 -3.4054e-02
6 1.515625 1.531250 1.523438 1.2250e-02
7 1.515625 1.523438 1.519531 -1.0971e-02
8 1.519531 1.523438 1.521484 6.2218e-04
9 1.519531 1.521484 1.520508 -5.1789e-03
10 1.520508 1.521484 1.520996 -2.2794e-03
11 1.520996 1.521484 1.521240 -8.2891e-04
12 1.521240 1.521484 1.521362 -1.0343e-04
13 1.521362 1.521484 1.521423 2.5935e-04
14 1.521362 1.521423 1.521393 7.7956e-05

2.1.3. ITP Method#

The Interpolate Truncate and Project (ITP) method represents an enhanced version of the bisection method, capable of attaining a superlinear convergence similar to that of the secant method. Even in the most unfavorable scenarios, the ITP method achieves a performance level equivalent to that of the bisection method.

Furthermore, for the above example, we have,

def ITP_Method(f, a, b, TOL = 1e-4, N0 = 1, K1 = 0.1, K2 = 2, Frame = True):
    
    '''
    Parameters
    ----------
    f : function 
        DESCRIPTION. A function. Here we use lambda functions
    a : float
        DESCRIPTION. a is the left side of interval [a, b]
    b : float
        DESCRIPTION. b is the right side of interval [a, b]
    N0, K1 and K2: hyper parameters
    TOL : float, optional
        DESCRIPTION. Tolerance (epsilon). The default is 1e-4.
    Frame : bool, optional
        DESCRIPTION. If it is true, a dataframe will be returned. The default is True.

    Returns
    -------
    an, bn, cn, fcn, n
    or
    a dataframe
    '''
    Nmax  = int(np.ceil(np.log2((b-a)/(2*TOL))) + N0)
    
    if f(a)*f(b) >= 0:
        print("ITP method is inapplicable .")
        return None
    
    # let c_n be a point in (a_n, b_n)
    an=np.zeros(Nmax, dtype=float)
    bn=np.zeros(Nmax, dtype=float)
    cn=np.zeros(Nmax, dtype=float)
    fcn=np.zeros(Nmax, dtype=float)

    # initial values
    an[0] = a
    bn[0] = b
    n = 0
    while (bn[n]- an[n] > (2*TOL)):
        mid = (an[n] + bn[n])/2
        r = TOL * (2 ** (Nmax - n)) - (bn[n]- an[n])/2
        xf = (an[n]*f(bn[n]) - bn[n]*f(an[n])) / (f(bn[n]) - f(an[n]))
        sigma = np.sign(mid - xf)
        delta = K1*((b-a)**K2)

        #Truncation:
        if delta <= np.abs(mid - xf):
            xt = xf + delta * sigma
        else:
            xt = mid

        # Projection
        if np.abs(xt - mid) <= r:
            cn[n] = xt
        else:
            cn[n] = mid - sigma * r

        # Updating Interval:
        fcn[n] = f(cn[n])
        if fcn[n] > 0:
            an[n+1]=an[n]
            bn[n+1] = cn[n]
        elif fcn[n] < 0:
            an[n+1] = cn[n]
            bn[n+1]=bn[n]
        else:
            an[n+1] = cn[n]
            bn[n+1] = cn[n]
        n = n + 1
        
    if Frame:
        return pd.DataFrame({'an': an[:n], 'bn': bn[:n], 'cn': cn[:n], 'fcn': fcn[:n]})
    else:
        return an, bn, cn, fcn, n
function [an, bn, cn, fcn, n] = ITP_Method(f, a, b, TOL, N0, K1, K2)
%{
Parameters
----------
f : function 
    DESCRIPTION. A function. Here we use lambda functions
a : float
    DESCRIPTION. a is the left side of interval [a, b]
b : float
    DESCRIPTION. b is the right side of interval [a, b]
N0, K1 and K2: hyper parameters
Nmax : Int, optional
    DESCRIPTION. Maximum number of iterations. The default is 1000.
TOL : float, optional
    DESCRIPTION. Tolerance (epsilon). The default is 1e-4.

Returns
-------
an, bn, cn, fcn, n
or
a dataframe

Example: f = @(x) x^3 - x - 2
%}

switch nargin
    case 3
        TOL= 1e-4; N0 = 1; K1 = 0.1; K2 = 2;
    case 4
        N0 = 1; K1 = 0.1; K2 = 2;
    case 5
        K1 = 0.1; K2 = 2;
    case 6
        K2 = 2;
end

Nmax = fix(ceil(log2(abs(b-a)/TOL))) + N0;

if f(a)*f(b) >= 0
    disp("ITP method is inapplicable .")
    return
end

% let c_n be a point in (a_n, b_n)
an = zeros(Nmax, 1);
bn = zeros(Nmax, 1);
cn = zeros(Nmax, 1);
fcn = zeros(Nmax, 1);
% initial values
an(1) = a;
bn(1) = b;

n = 1;
while (bn(n)- an(n) > (2*TOL))
    mid = (an(n) + bn(n))/2;
    r = TOL * (2^(Nmax - n)) - (bn(n)- an(n))/2;
    xf = (an(n)*f(bn(n)) - bn(n)*f(an(n))) / (f(bn(n)) - f(an(n)));
    sigma = sign(mid - xf);
    delta = K1*((b-a)^K2);
    % Truncation:
    if delta <= abs(mid - xf)
        xt = xf + delta * sigma;
    else
        xt = mid;
    end

    % Projection
    if abs(xt - mid) <= r
        cn(n) = xt;
    else
        cn(n) = mid - sigma * r;
    end
    
    % Updating Interval:
    fcn(n)=f(cn(n));
    if f(an(n))*fcn(n) < 0
        an(n+1)=an(n);
        bn(n+1)=cn(n);
    elseif f(bn(n))*fcn(n) < 0
        an(n+1)=cn(n);
        bn(n+1)=bn(n);
    else
        disp("Regula falsi Illinois alg. method fails.")
    end
    if (abs(fcn(n)) < TOL)
        return
    end
    n = n + 1;
end
from hd_RootFinding_Algorithms import ITP_Method
data = ITP_Method(f, a, b, TOL = 5e-4)
display(data.style.format({'fcn': "{:.4e}"}))
hd.it_method_plot(data, title = 'ITP Method', color = 'OrangeRed')
hd.root_plot(f, a, b, [data.an.values[:-1], data.an.values[-1]], ['an', 'root'])
  an bn cn fcn
0 1.000000 2.000000 1.433333 -4.8863e-01
1 1.433333 2.000000 1.595020 4.6285e-01
2 1.433333 1.595020 1.514177 -4.2576e-02
3 1.514177 1.595020 1.554599 2.0252e-01
4 1.514177 1.554599 1.534388 7.8091e-02
5 1.514177 1.534388 1.524282 1.7291e-02
6 1.514177 1.524282 1.519230 -1.2759e-02
7 1.519230 1.524282 1.521756 2.2367e-03
8 1.519230 1.521756 1.520493 -5.2684e-03
9 1.520493 1.521756 1.521124 -1.5176e-03