2.6. Zeros of Polynomials and Müller’s Method#

Polynomial and the Degree of the Polynomial

An expression of the form \(a_nx^n+a_{n-1}x^{n-1}+a_{n-2}x^{n-2}+\ldots+a_{1}x^{1}+a_0\) is called a polynomial. In this expression, \(a_{0},~a_{1},~\ldots,~a_{n}\) are (real or complex) numbers called coefficients and \(x\) is a variable. If the coefficient of \(x^n\) is nonzero (\(a_n \neq 0\)), the integer \(n\) is called the degree of the polynomial, and \(a_n\) is called the leading coefficient.

| Degree | Polynomial |
| --- | --- |
| \(1\) | \(ax+b\) |
| \(2\) | \(ax^2+bx+c\) |
| \(3\) | \(ax^3+bx^2+cx+d\) |
| \(\vdots\) | \(\vdots\) |
| \(n\) | \(a_nx^n+a_{n-1}x^{n-1}+a_{n-2}x^{n-2}+\ldots+a_{1}x^{1}+a_0\) |

The following theorem is Theorem 2.16 from [Burden and Faires, 2005].

The Fundamental Theorem of Algebra

Every positive degree polynomial with complex coefficients has a complex root.

Corollary

a. Every polynomial of positive degree with complex coefficients has **at least one** complex root.

b. Let \(P(x)\) be a polynomial of degree \(n \geq 1\) with real or complex coefficients. Then there exist distinct (possibly complex) numbers \(x_1,\dots,x_k\) and unique positive integers \(m_1,\dots,m_k\) such that

(2.67)#\[\begin{equation} \sum_{j = 1}^{k} m_{j} = n \quad \text{and} \quad P(x) = a_{n}\prod_{j = 1}^{k} (x - x_j)^{m_{j}}. \end{equation}\]

c. Let \(P(x)\) and \(Q(x)\) be polynomials of degree at most \(n\). If there are \(k>n\) distinct numbers \(x_j\), \(j=1,\dots,k\), with \(P(x_{j}) = Q(x_{j})\) for each \(j\), then

(2.68)#\[\begin{equation} P(x) = Q(x), \quad x \in \mathbb{R}. \end{equation}\]

Example: Find the roots of the quadratic \(x^2-(3-2i)x+(5-i)=0\).

Solution: Using the quadratic formula

\[\begin{align*} x=\frac{(3-2i) \pm\sqrt{(-(3-2i))^2-4(5-i)}}{2}. \end{align*}\]

Note that

\[\begin{align*} (-(3-2i))^2-4(5-i) = 5-12i-20+4i=-15-8i,\end{align*}\]

Hence,

\[\begin{align*} x=\frac{3-2i \pm\sqrt{-15-8i}}{2}.\end{align*}\]

To find \(\pm\sqrt{-15-8i}\), solve \(z^2=-15-8i\) for \(z\). Let \(z=a+bi\) and \(z^2=-15-8i\).

Then

\[\begin{align*} &(a^2-b^2)+2abi=-15-8i, \quad \Rightarrow \quad \begin{cases} a^2-b^2=-15, \\ 2ab=-8. \end{cases} \quad \Rightarrow \quad \begin{cases} a^2-b^2=-15, \\ a=-4/b. \end{cases} \end{align*}\]

Solving for \(a\) and \(b\)

\[\begin{align*} \left(-\frac{4}{b}\right)^2-b^2=-15 \quad &\Rightarrow \quad (b-4) (b+4) \underbrace{(b^2+1)}_{\text{No real roots!}}=0 \quad \Rightarrow \quad b=\pm 4 \\ & \Rightarrow \quad a=\mp 1. \end{align*}\]

Therefore, \(z= 1-4i, -1+4i\), i.e., \(z=\pm(1-4i)\). We have,

\[\begin{align*} x=\frac{(3-2i) \pm(1-4i)}{2}= \begin{cases} \dfrac{(3-2i) +(1-4i)}{2} =\dfrac{4-6i}{2} = 2-3i, \\ \dfrac{(3-2i) -(1-4i)}{2} =\dfrac{2+2i}{2}= 1+i. \end{cases}\end{align*}\]

Thus the roots of \(x^2-(3-2i)x+(5-i)\) are \(2-3i\) and \(1+i\).
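
As a quick numerical check, the same roots can be verified with NumPy's np.roots routine, which accepts complex coefficients; the snippet below is a minimal sketch of that check.

import numpy as np

# coefficients of x^2 - (3 - 2i)x + (5 - i), in descending order
coeffs = [1, -(3 - 2j), 5 - 1j]
print(np.roots(coeffs))  # approximately [2.-3.j, 1.+1.j] (order may vary)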

Part c of this corollary indicates that to demonstrate that two polynomials of degree at most \(n\) are the same, we only need to show that they agree at \(n + 1\) distinct values.

2.6.1. Horner’s Method#

Assume that we are interested in finding a root of a polynomial \(P(x)\) using Newton’s method. From the earlier section on Newton’s method, we know that, given an initial point \(x_{0}\), the next approximation \(x_{1}\) can be computed as follows,

(2.69)#\[\begin{equation} x_{1} = x_{0} - \frac{P(x_{0})}{P'(x_{0})}. \end{equation}\]

Thus, a sequence of approximations \(\{x_{0},x_{1},x_{2},\ldots\}\) to the exact root \(c\) can be found as follows,

(2.70)#\[\begin{equation} x_{n+1} = x_{n} - \frac{P(x_{n})}{P'(x_{n})}, \qquad n\geq 0. \end{equation}\]

However, Newton’s method requires both \(P(x_{n})\) and \(P'(x_{n})\) to approximate \(x_{n+1}\). Using Horner’s method, we can compute both \(P(x_{n})\) and \(P'(x_{n})\) directly from the coefficients \(a_{0},~a_{1},~\ldots,~a_{n}\).

For a given \(x_0\), Horner’s method re-writes the polynomial

(2.71)#\[P(x)=\sum _{i=0}^{n}a_{i}x^{i}=a_{0}+a_{1}\,x+a_{2}\,x^{2}+a_{3}\,x^{3}+\cdots +a_{n}\,x^{n},\]

as

(2.72)#\[P(x) = (x - x_{0})Q(x) + b_{0}\]

with

(2.73)#\[Q(x) = \sum_{j = 1}^{n} b_{j}\,x^{j-1} = b_{n}\,x^{n-1} + b_{n-1}\,x^{n-2} + \ldots + b_{2}\,x + b_{1}.\]

Note that using (2.72) and (2.73), we have,

(2.74)#\[\begin{split}P(x) & = (x - x_{0})Q(x) + b_{0} = (x - x_{0}) \sum_{j = 1}^{n} b_{j} x^{j-1} + b_{0} \notag \\ & = x\sum_{j = 1}^{n} b_{j} x^{j-1} - x_{0}\sum_{j = 1}^{n} b_{j} x^{j-1} + b_{0}\notag \\ & = b_{n} x^{n} + (b_{n-1} - b_{n}x_{0})x^{n-1} + \dots + (b_{1} - b_{2}x_{0})x + (b_{0} - b_{1}x_{0}).\end{split}\]

It follows from comparing (2.71) and (2.74) that,

\[\begin{equation*} \begin{cases} b_{n} = a_{n}\\ a_{n-1} = b_{n-1} - b_{n}x_{0}\\ \vdots\\ a_{1} = b_{1}- b_{2}x_{0}\\ a_{0} = b_{0} - b_{1}x_{0}. \end{cases} \end{equation*}\]

Therefore, \(b_{n} = a_{n}\) and \(a_{k} = b_{k} - b_{k+1}x_{0}\) or

(2.75)#\[\begin{equation} \begin{cases} b_{n} = a_{n},\\ b_{k} = a_{k} + b_{k+1}x_{0},\quad \text{for}\quad k = n - 1, n - 2, \ldots , 1, 0. \end{cases} \end{equation}\]
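
For instance, for \(p(x) = 2x^3 - 6x^2 + 2x - 1\) and \(x_0 = 2\) (the example revisited below), the recurrence (2.75) gives

\[\begin{align*} b_{3} &= a_{3} = 2,\\ b_{2} &= a_{2} + b_{3}x_{0} = -6 + (2)(2) = -2,\\ b_{1} &= a_{1} + b_{2}x_{0} = 2 + (-2)(2) = -2,\\ b_{0} &= a_{0} + b_{1}x_{0} = -1 + (-2)(2) = -5. \end{align*}\]

As explained next, the last value \(b_{0} = -5\) is precisely \(p(2)\).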

Using an initial approximation \(x_0\) to a zero of our polynomial as the divisor in the synthetic division, we have that

(2.76)#\[\begin{equation} P(x_0) = (x_{0} - x_{0})Q(x_{0}) + b_{0} = b_{0}. \end{equation}\]

Using the product rule, the derivative with respect to \(x\) of

(2.77)#\[\begin{equation} P(x) = (x - x_{0})Q(x) + b_{0} \end{equation}\]

is

(2.78)#\[\begin{equation} P'(x) = Q(x) + (x - x_{0})Q'(x). \end{equation}\]

Since the second term vanishes at \(x = x_{0}\), we have \(P'(x_{0})=Q(x_{0})\), which can be computed by applying synthetic division to \(Q(x)\). With that knowledge, we can apply Newton’s method to find the first approximation to a zero of \(P(x)\)

(2.79)#\[\begin{equation} x_{1} = x_{0} - \frac{P(x_{0})}{P'(x_{0})}. \end{equation}\]

If we begin the entire process again, applying synthetic division on \(P(x)\) and \(Q(x)\) using the approximation \(x_1\) as our divisor, we will find our next approximation, \(x_2\), to the zero of \(P(x)\).

import pandas as pd 
import numpy as np

def Horner_Method(a, x0):
    '''
    Parameters
    ----------
    a : list/array
        DESCRIPTION. A list or array of coefficients in ascending order,
        a = [a0, a1, a2, ..., an]
    x0 : float
        DESCRIPTION. point x0

    Returns
    -------
    y : float
        DESCRIPTION. p(x0)
    z : float
        DESCRIPTION. p'(x0)

    Example
    -------
    p(x) = 2x^3 - 6x^2 + 2x - 1
    a = [-1, 2, -6, 2]
    x0 = 2

    '''
    n = len(a)
    # b_n = a_n: start both accumulators with the leading coefficient
    y = a[-1]            # will hold p(x0)
    z = a[-1]            # will hold p'(x0)
    # recurrence (2.75): b_k = a_k + b_{k+1} * x0 for a[n-2], ..., a[1]
    for i in reversed(range(1, n - 1)):
        y = y * x0 + a[i]
        z = z * x0 + y   # second synthetic division accumulates Q(x0) = p'(x0)
    # last step updates p(x0) only; z already equals p'(x0)
    y = y * x0 + a[0]
    return y, z

function [y, z] = Horner_Method(a, x0)
%{
Parameters
----------
a : vector
    DESCRIPTION. Coefficients in descending order, a = [an, ..., a1, a0]
x0 : float
    DESCRIPTION. point x0

Returns
-------
y : float
    DESCRIPTION. p(x0)
z : float
    DESCRIPTION. p'(x0)

Example:
p(x) = 2x^3 - 6x^2 + 2x -1
a = [2, -6, 2, -1]
x0 = 2
%}

n = length(a);
% b_n = a_n: start both accumulators with the leading coefficient
y = a(1);
z = a(1);
% recurrence (2.75); the second accumulator performs synthetic
% division on Q and yields Q(x0) = p'(x0)
for i = 2:(n-1)
    y = y * x0 + a(i);
    z = z * x0 + y;
end
% last step updates p(x0) only
y = y * x0 + a(n);

Example: Try Horner’s method using \(p(x) = 2x^3 - 6x^2 + 2x -1\) and \(x_0 =2\).

Solution: We have

# This part is used for producing tables and figures
import sys
sys.path.insert(0,'..')
import hd_tools as hd

from hd_RootFinding_Algorithms import Horner_Method
a = [-1, 2, -6, 2] # coefficients in ascending order: [a0, a1, a2, a3]
x0 = 2
y, z = Horner_Method(a, x0)
print('y = %.2f' % y)
print('z = %.2f' % z)
y = -5.00
z = 2.00
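
That is, \(P(2) = -5\) and \(P'(2) = 2\). Since a single call returns both values, Newton’s iteration (2.70) can be driven entirely by the coefficient list. The following is a minimal sketch of such a Newton–Horner loop; it reuses the Horner_Method routine above, and the helper name newton_horner is introduced here only for illustration.

def newton_horner(a, x0, TOL = 1e-8, Nmax = 50):
    # Newton's method where P(x) and P'(x) come from one Horner pass
    x = x0
    for _ in range(Nmax):
        y, z = Horner_Method(a, x)   # y = P(x), z = P'(x)
        if abs(y) < TOL:
            break
        x = x - y / z                # Newton update (2.70)
    return x

# p(x) = 2x^3 - 6x^2 + 2x - 1, starting from x0 = 2;
# the iteration approaches the real root of p (near x = 2.7)
print(newton_horner([-1, 2, -6, 2], 2))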

2.6.2. Müller’s Method#

This method is an excellent choice for finding the zeros of a polynomial, even when those zeros are complex numbers. Consider a quadratic polynomial

(2.80)#\[\begin{equation} P(x) = a(x -x_2)^2 + b(x - x_2) + c \end{equation}\]

that passes through points

\[\begin{equation*} (x_0, f(x_0)),~ (x_1, f(x_1)), \text{ and } (x_2, f(x_2)). \end{equation*}\]

That is, we have the following system of equations that can be solved for \(a\), \(b\), and \(c\):

(2.81)#\[\begin{equation} \begin{cases} f(x_0) &= a(x_0 -x_2)^2 + b(x_0 - x_2) + c,\\ f(x_1) &= a(x_1 -x_2)^2 + b(x_1 - x_2) + c,\\ f(x_2) &= a(x_2 -x_2)^2 + b(x_2 - x_2) + c = c,\\ \end{cases} \end{equation}\]

Since the third equation gives \(c = f(x_{2})\) immediately, the first two equations reduce to a \(2\times 2\) linear system for \(a\) and \(b\).

Müller’s method is a generalization of the secant method: it requires three initial guesses to start. The next point can then be estimated as follows,

(2.83)#\[\begin{equation} x_{n+1} =x_n-(x_n-x_{n-1})\dfrac{2C}{\max\{B\pm \sqrt{B^2-4AC}\}} \end{equation}\]

where

(2.84)#\[\begin{equation} \begin{cases} q &= \dfrac{x_n-x_{n-1}}{x_{n-1}-x_{n-2}},\\ A &= qP(x_n)-q(1+q)P(x_{n-1})+q^2P(x_{n-2}),\\ B &= (2q+1)P(x_n)-(1+q)^2P(x_{n-1})+q^2P(x_{n-2}),\\ C &= (1+q)P(x_n) \end{cases} \end{equation}\]
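
The \(\max\) in (2.83) indicates that the sign in the denominator is chosen so that the denominator has the largest possible magnitude. The following is a minimal sketch of this \(q\)-based form of the iteration; the function name muller_q is hypothetical and used only for illustration (the implementation used later in this section solves the Vandermonde system instead).

import cmath

def muller_q(P, x0, x1, x2, TOL = 1e-8, Nmax = 50):
    # Muller iteration in the q, A, B, C form of (2.83)-(2.84)
    for _ in range(Nmax):
        q = (x2 - x1) / (x1 - x0)
        A = q*P(x2) - q*(1 + q)*P(x1) + q**2*P(x0)
        B = (2*q + 1)*P(x2) - (1 + q)**2*P(x1) + q**2*P(x0)
        C = (1 + q)*P(x2)
        disc = cmath.sqrt(B**2 - 4*A*C)
        # choose the sign that maximizes the magnitude of the denominator
        denom = B + disc if abs(B + disc) > abs(B - disc) else B - disc
        x_new = x2 - (x2 - x1)*(2*C)/denom
        if abs(P(x_new)) < TOL:
            return x_new
        x0, x1, x2 = x1, x2, x_new
    return x2

# same test problem and starting points as the example below;
# the iterates approach the real root near 1.5214
print(muller_q(lambda x: x**3 - x - 2, 0.2, 0.5, 0.7))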

This can also be done by solving a Vandermonde linear system [6]:

(2.85)#\[\begin{equation} \begin{bmatrix} (x_{n-2} - x_{n})^2 & (x_{n-2} - x_{n}) & 1\\ (x_{n-1} - x_{n})^2 & (x_{n-1} - x_{n}) & 1\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix} f(x_{n-2})\\ f(x_{n-1})\\ f(x_{n}) \end{bmatrix} \end{equation}\]

and the iterative algorithm of Müller’s method can be written as [6]:

(2.86)#\[\begin{equation} x_{n+1} = x_{n} +\dfrac{-2c}{b + \operatorname{sign}(b) \sqrt{b^2 - 4ac}}. \end{equation}\]

import pandas as pd 
import numpy as np

def Muller_method(f, x, TOL = 1e-4, Nmax = 100):
    '''
    Parameters
    ----------
    f : function
        DESCRIPTION. The function whose root is sought.
    x : list
        DESCRIPTION. A list that includes the three starting points x0, x1 and x2.
    TOL : float, optional
        DESCRIPTION. Tolerance (epsilon). The default is 1e-4.
    Nmax : int, optional
        DESCRIPTION. Maximum number of iterations. The default is 100.

    Returns
    -------
    a dataframe with the iterates xn and the values f(xn),
    or None if no root is found within Nmax iterations

    '''
    # convert x to an array
    x = np.array(x)
    
    xn = []
    xn.extend(x)
    fxn = []
    fxn.extend(f(x))
    
    n = 0
    while n <= Nmax:
        y = f(x)
        
        # Vandermonde matrix and linear system (2.85) for a, b and c
        [a, b, c] = np.linalg.solve(np.vander(x - x[-1]), y)
        # Muller update (2.86): drop the oldest point and append the new one
        x = np.append(x[1:3], x[-1] - (2*c)/(b + np.sign(b)*np.sqrt((b**2) - 4*a*c)))
        del a, b, c
        
        # Adding to the list
        xn.append(x[-1])
        fxn.append(f(x[-1]))
        
        # Break condition
        if np.abs(f(x[-1])) < TOL:
            return pd.DataFrame({'xn': xn, 'fxn': fxn})
        n += 1
        
    print('Exceeded maximum iterations. No solution found.')
    return None

function [xn, fxn] = Muller_method(f, x, TOL, Nmax)
%{
Parameters
----------
f : function handle
    DESCRIPTION. The function whose root is sought.
x : vector
    DESCRIPTION. A vector that includes the three starting points x0, x1 and x2.
TOL : float, optional
    DESCRIPTION. Tolerance (epsilon). The default is 1e-4.
Nmax : int, optional
    DESCRIPTION. Maximum number of iterations. The default is 100.

Returns
-------
xn, fxn : vectors of the iterates and the corresponding function values

Example
f = @(x) x.^3 - x - 2
x = [.2, .5, .7]
%}

% default arguments
switch nargin
    case 3
        Nmax = 100;
    case 2
        TOL = 1e-4; Nmax = 100;
end

n0 = length(x);

xn = zeros(Nmax, 1);
xn(1:n0) = x;
fxn = zeros(Nmax, 1);
fxn(1:n0) = f(x);

% new iterates are stored after the n0 starting points
for n = (n0 + 1):Nmax
    y = f(x);

    % Vandermonde matrix and linear system (2.85) for a, b and c
    Temp = linsolve(vander(x - x(n0)), transpose(y));
    a = Temp(1);
    b = Temp(2);
    c = Temp(3);
    % Muller update (2.86): drop the oldest point and append the new one
    xnew = x;
    xnew(1:(n0-1)) = x(2:n0);
    xnew(n0) = x(n0) - (2*c)/(b + sign(b)*sqrt((b^2) - 4*a*c));
    x = xnew;

    % Adding to the list
    xn(n) = x(n0);
    fxn(n) = f(x(n0));
    
    % Break condition
    if ( abs( f(x(n0)) ) < TOL )
        xn = xn(1:n);
        fxn = fxn(1:n);
        return 
    end
end
disp('Exceeded maximum iterations. No solution found.')

Example: Use Müller’s method on \(P(x)=x^{3}-x-2\) with \(x_0 = 0.2\), \(x_1 = 0.5\), and \(x_2 = 0.7\) to identify one of the roots of \(P(x)\).

Solution: We have

from hd_RootFinding_Algorithms import Muller_method
f = lambda x: x**3 - x - 2
data = Muller_method(f, x = [.2, .5, .7], Nmax = 20, TOL = 1e-4)
display(data.style.format({'fxn': "{:.4e}"}))
hd.it_method_plot(data, title = 'Müller Method', ycol = 'fxn', ylabel = 'f(xn)', color = 'Purple')
hd.root_plot(f, 0, 3, [data.xn.values[:-1], data.xn.values[-1]], ['xn', 'root'])
| n | xn | fxn |
| --- | --- | --- |
| 0 | 0.200000 | -2.1920e+00 |
| 1 | 0.500000 | -2.3750e+00 |
| 2 | 0.700000 | -2.3570e+00 |
| 3 | 1.872094 | 2.6891e+00 |
| 4 | 1.468739 | -3.0038e-01 |
| 5 | 1.518933 | -1.4517e-02 |
| 6 | 1.521372 | -4.5029e-05 |
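
As a sanity check, the converged value \(x_6 \approx 1.5214\) can be compared against the real root of \(P(x)=x^{3}-x-2\) computed directly with NumPy's np.roots:

import numpy as np

# roots of x^3 - x - 2 (coefficients in descending order);
# the only real root is approximately 1.5214
print(np.roots([1, 0, -1, -2]))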