CSC270 Root Finding




Mean Value Theorem (MVT)

Theorem

  If f() is continuous and differentiable on an interval [a,b], then
  there exists some c in (a,b) such that

                 f(b) - f(a)
         f'(c) = -----------
                    b - a

  Equivalently,

         f(b) - f(a) = f'(c) (b - a)


Above, (f(b)-f(a))/(b-a) is just the slope between point (a,f(a)) and
point (b,f(b)).  This is the average, or `mean', slope of the function
between those two points.  The theorem simply says that there is a
point in (a,b) for which the function *has* that mean slope.





Taylor series

Suppose we want to model a function f() near a point c.  Suppose
further that f() is infinitely differentiable at c and that we know
f'(c), f''(c), and so on.

Then we can use a polynomial function p() with the same value and
derivatives to *approximate* f().  For this polynomial, we have the
following constraints:

  p(c) = f(c)
  p'(c) = f'(c)
  p''(c) = f''(c)
  and so on.

A polynomial that satisfies these constraints is the ``Taylor
Polynomial of degree n'':

                                              2            (n)        n
                                  f''(c) (x-c)            f  (c) (x-c)
  p_n (x) = f(c) + f'(c) (x-c) +  ------------- + ..... + -------------
                                        2!                      n!

If n goes to infinity, we get the Taylor Series.  For many functions
(among them the polynomials), the Taylor Series converges to the
function.  That is, p_n(x) = f(x) for all x.  In this course, we won't
characterize the functions for which the Taylor Series converges.

When approximating a function f() with the Taylor polynomial p_n() of
degree n, the *residual* is

  r_n(x) = p(x) - f(x)

Taylor's Theorem says that if the n derivatives of f() exist in an
interval [a,b] that contains c, then there is *some* x_{max} in [a,b]
such that

               (n+1)            n+1
            | f   (c) (x_{max}-c)  |          n+1
  r_n(x) <=   ------------------      * | (x-c)  |
                     (n+1)!

for all x in [a,b].  This means that the residual is bounded by some
constant (a function of x_{max}) times (x-c)^{n+1}.  In other words,
the residual is ``of the order of (x-c)^{n+1}, written:

                n+1
  r_n is O( (x-c)  )

This gives us an idea of how fast the residual diminishes as x 
approaches c.



Newton's method of root finding  

[Readings section 2, pages 412-421]

Suppose an algorithm computes successive approximations, x1, x2, x3,
..., to a root of a function f().  Consider the first-order Taylor
expansion about the i^th appoximation:

  p(x) = f(x_i) + f'(x_i) (x - x_i)

Since p(x) is approximately f(x), we set p(x) to zero and solve for
the root x:

            f(x_i)
  x = x_i - -------
            f'(x_i)

Since p() is only an approximation to f(), this root x is only an
approximation to the root of f().  So we'll use it as our next
approximation, x_{n+1}.  If p() is a ``good'' approximation of f(),
x_{n+1} should be a better approximation to f()'s root than was x_n.

This is called Newton iteration.  The algorithm computes

  x_1 = x_0 - f(x_0) / f'(x_0)
  x_2 = x_1 - f(x_1) / f'(x_1)
  x_3 = x_2 - f(x_2) / f'(x_2)
  and so on.

Intuitively, x_{n+1} is chosen as the point where the tangent at
(x_n,f(x_n)) cuts the x axis.



Theorem [page 64]

  If x_n is in an interval [a,b] such that

    - f(a) f(b) < 0
    - f' is not zero in [a,b]
    - f'' doesn't change sign in [a,b]
    - f(a)/f'(a) < b-a and f(b)/f'(b) < b-a

  then the Newton iteration converges to a root of f().

Newton iteration converges quadratically, provided the initial guess
x_0 is ``good enough''.  To show this requires a bit of calculus.  In
brief, if the error in the n^{th} iteration is e_n, then

                                2
  e_{n+1} is proportional to e_n



Secant method of root finding

The Newton method requires the first derivative f'(), which is
sometimes not available to a program.  The secant method avoids this
problem.

To get x_{n+1} from x_n, the Newton method followed the tangent at x_n
until it intersected the x axis.  The secant method will approximate
the tangent by drawing a line between two close points on the curve
f(x).  Such a line is a `secant'.  Note that, as the two points get
closer, the secant approaches the tangent.



Thus f'(x_n) is approximated by


              f(x_n) - f(x_{n-1})
  f'(x_n)  ~  -------------------
                 x_n - x_{n-1}

and the Newton iteration is modified as follows:


  x_{n+1} = x_n - f(x_n) / f'(x_n)

                              x_n - x_{n-1}
          ~ x_n - f(x_n) * -------------------
                           f(x_n) - f(x_{n-1})

            x_{n-1} f(x_n) - x_n f(x_{n-1})
          = -------------------------------
                  f(x_n) - f(x_{n-1})

As with Newton iteration, the secant method uses for its next
approximation to the root the intersection of the secant with
the x axis.

The secant method converges a bit slower than the Newton method:

                                1.62
  e_{n+1} is proportional to e_n

As with the Newton method, there are situations in which the
secant method will not converge to the root.