2022-01-24 Convergence classes

  • Office hours: Monday 9-10pm, Tuesday 2-3pm, Thursday 2-3pm

  • This week will stay virtual. Plan is to start in-person the following Monday (Jan 31)

Last time

  • Forward and backward error

  • Computing volume of a polygon

  • Rootfinding examples

  • Use Roots.jl to solve

  • Introduce Bisection

Today

  • Discuss rootfinding as a modeling tool

  • Limitations of bisection

  • Convergence classes

  • Intro to Newton methods

using Plots
default(linewidth=4)

Stability demo: Volume of a polygon

X = ([1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0])
R(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]
Y = X * R(deg2rad(10))' .+ [1e4 1e4]
plot(Y[:,1], Y[:,2], seriestype=:shape, aspect_ratio=:equal)
../_images/2022-01-24-convergence-classes_3_0.svg
using LinearAlgebra
function pvolume(X)
    n = size(X, 1)
    vol = sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)
end

@show pvolume(Y)
[det(Y[i:i+1, :]) for i in 1:size(Y, 1)-1]
A = Y[2:3, :]
sum([A[1,1] * A[2,2], - A[1,2] * A[2,1]])
X
pvolume(Y) = 9.249999989226126
8×2 Matrix{Float64}:
  1.0   0.0
  2.0   1.0
  1.0   3.0
  0.0   1.0
 -1.0   1.5
 -2.0  -1.0
  0.5  -2.0
  1.0   0.0
  • Why don’t we even get the first digit correct? These numbers are only 108 and ϵmachine1016 so shouldn’t we get about 8 digits correct?

Rootfinding

Given f(x), find x such that f(x)=0.

We’ll work with scalars (f and x are just numbers) for now, and revisit later when they vector-valued.

Change inputs to outputs

  • f(x;b)=x2b

    • x(b)=b

  • f(x;b)=tanxb

    • x(b)=arctanb

  • f(x)=cosx+xb

    • x(b)=?

We aren’t given f(x), but rather an algorithm f(x) that approximates it.

  • Sometimes we get extra information, like fp(x) that approximates f(x)

  • If we have source code for f(x), maybe it can be transformed “automatically”

Nesting matryoshka dolls of rootfinding

  • I have a function (pressure, temperature) (density, energy)

    • I need (density, energy) (pressure, temperature)

  • I know what condition is satisfied across waves, but not the state on each side.

  • Given a proposed state, I can measure how much (mass, momentum, energy) is not conserved.

    • Find the solution that conserves exactly

  • I can compute lift given speed and angle of attack

    • How much can the plane lift?

    • How much can it lift on a short runway?

Discuss: rootfinding and inverse problems

  • Inverse kinematics for robotics

    • (statics) how much does each joint an robotic arm need to move to grasp an object

    • (with momentum) fastest way to get there (motors and arms have finite strength)

    • similar for virtual reality

  • Infer a light source (or lenses/mirrors) from an iluminated scene

  • Imaging

    • radiologist seeing an x-ray is mentally inferring 3D geometry from 2D image

    • computed tomography (CT) reconstructs 3D model from multiple 2D x-ray images

    • seismology infers subsurface/deep earth structure from seismograpms

Iterative bisection

f(x) = cos(x) - x
hasroot(f, a, b) = f(a) * f(b) < 0
function bisect_iter(f, a, b, tol)
    hist = Float64[]
    while abs(b - a) > tol
        mid = (a + b) / 2
        push!(hist, mid)
        if hasroot(f, a, mid)
            b = mid
        else
            a = mid
        end
    end
    hist
end
bisect_iter (generic function with 1 method)
length(bisect_iter(f, -1, 3, 1e-20))
56

Let’s plot the error

|bisectk(f,a,b)r|,k=1,2,

where r is the true root, f(r)=0.

hist = bisect_iter(f, -1, 3, 1e-10)
r = hist[end] # What are we trusting?
hist = hist[1:end-1]
scatter( abs.(hist .- r), yscale=:log10)
ks = 1:length(hist)
plot!(ks, 4 * (.5 .^ ks))
../_images/2022-01-24-convergence-classes_17_0.svg

Evidently the error ek=xkx after k bisections satisfies the bound

|ek|c2k.

Convergence classes

A convergent rootfinding algorithm produces a sequence of approximations xk such that

limkxkx
where f(x)=0. For analysis, it is convenient to define the errors ek=xkx. We say that an iterative algorithm is q-linearly convergent if
limk|ek+1|/|ek|=ρ<1.
(The q is for “quotient”.) A smaller convergence factor ρ represents faster convergence. A slightly weaker condition (r-linear convergence or just linear convergence) is that
|ek|ϵk
for all sufficiently large k when the sequence {ϵk} converges q-linearly to 0.

ρ = 0.8
errors = [1.]
for i in 1:30
    next_e = errors[end] * ρ
    push!(errors, next_e)
end
plot(errors, yscale=:log10, ylims=(1e-10, 1))
../_images/2022-01-24-convergence-classes_21_0.svg
e = hist .- r
scatter(abs.(errors[2:end] ./ errors[1:end-1]), ylims=(0,1))
../_images/2022-01-24-convergence-classes_22_0.svg

Bisection: A = q-linearly convergent, B = r-linearly convergent, C = neither

Remarks on bisection

  • Specifying an interval is often inconvenient

  • An interval in which the function changes sign guarantees convergence (robustness)

  • No derivative information is required

  • If bisection works for f(x), then it works and gives the same accuracy for f(x)σ(x) where σ(x)>0.

  • Roots of even degree are problematic

  • A bound on the solution error is directly available

  • The convergence rate is modest – one iteration per bit of accuracy

Newton-Raphson Method

Much of numerical analysis reduces to Taylor series, the approximation

f(x)=f(x0)+f(x0)(xx0)+f(x0)(xx0)2/2+O((xx0)3)
centered on some reference point x0.

In numerical computation, it is exceedingly rare to look beyond the first-order approximation

f~x0(x)=f(x0)+f(x0)(xx0).
Since f~x0(x) is a linear function, we can explicitly compute the unique solution of f~x0(x)=0 as
x=x0f(x0)/f(x0).
This is Newton’s Method (aka Newton-Raphson or Newton-Raphson-Simpson) for finding the roots of differentiable functions.

An implementation

function newton(f, fp, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = fp(x)
        if verbose
            println("[$k] x=$x  f(x)=$fx  f'(x)=$fpx")
        end
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - fx / fpx
    end  
end

f(x) = cos(x) - x
fp(x) = -sin(x) - 1
newton(f, fp, 1; tol=1e-15, verbose=true)
[1] x=1  f(x)=-0.45969769413186023  f'(x)=-1.8414709848078965
[2] x=0.7503638678402439  f(x)=-0.018923073822117442  f'(x)=-1.6819049529414878
[3] x=0.7391128909113617  f(x)=-4.6455898990771516e-5  f'(x)=-1.6736325442243012
[4] x=0.739085133385284  f(x)=-2.847205804457076e-10  f'(x)=-1.6736120293089505
[5] x=0.7390851332151607  f(x)=0.0  f'(x)=-1.6736120291832148
(0.7390851332151607, 0.0, 5)