2023-01-30 Rootfinding#
Last time#
- More on floating point
- Discuss Taylor Series activity
- Condition numbers
Today#
- Forward and backward error
- Computing volume of a polygon
- Rootfinding examples
  - Use Roots.jl to solve
- Introduce Bisection
using Plots
default(linewidth=4)
Reliable = well-conditioned and stable#
Mathematical functions \(f(x)\) can be ill-conditioned (big \(\kappa\))#
- Modeling is how we turn an abstract question into a mathematical function
- We want well-conditioned models (small \(\kappa\))
- Some systems are intrinsically sensitive: fracture, chaotic systems, combustion
Algorithms f(x) can be unstable#
- Unreliable, though sometimes practical
- Unstable algorithms are constructed from ill-conditioned parts
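A classic instance (a small sketch for illustration): evaluating \(1 - \cos x\) for small \(x\) subtracts nearly equal numbers, while the algebraically identical \(2\sin^2(x/2)\) avoids the cancellation entirely.

x = 1e-8
@show 1 - cos(x)       # catastrophic cancellation: cos(1e-8) rounds to 1.0, so this is 0.0
@show 2 * sin(x/2)^2;  # algebraically identical and accurate: ≈ 5.0e-17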
An ill-conditioned problem from Paul Olum#
From Surely You’re Joking, Mr. Feynman (page 113)
So Paul is walking past the lunch place and these guys are all excited. “Hey, Paul!” they call out. “Feynman’s terrific! We give him a problem that can be stated in ten seconds, and in a minute he gets the answer to 10 percent. Why don’t you give him one?” Without hardly stopping, he says, “The tangent of 10 to the 100th.” I was sunk: you have to divide by pi to 100 decimal places! It was hopeless.
What’s the condition number?
- \(f(x) = \tan x\)
- \(f'(x) = 1 + \tan^2 x\)
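So \(\kappa = \left| \frac{x f'(x)}{f(x)} \right| = \left| x \, \frac{1 + \tan^2 x}{\tan x} \right|\), which is enormous when \(x\) is huge (or when \(\tan x\) is small). A quick check (a sketch; kappa_tan is our name, and the Float64 value of tan here is itself untrustworthy, but the magnitude is the point):

kappa_tan(x) = abs(x * (1 + tan(x)^2) / tan(x))  # relative condition number of tan
kappa_tan(1e100)  # ~1e100: any relative error in x is amplified enormously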
tan(1e100)
-0.4116229628832498
tan(BigFloat("1e100", precision=500))
0.4012319619908143541857543436532949583238702611292440683194415381168718098221194
Forward and backward error#
plot(tan, LinRange(-10, 10, 1000), ylims=(-10, 10))
Backward error#
For ill-conditioned functions, the best we can hope for is small backward error.
Feynman could have answered with any real number \(y\) and it would have a (relative) backward error less than \(10^{-100}\). All the numbers we saw on the previous slide had tiny backward error because

\[ y = \tan\big(10^{100} (1 + \epsilon)\big) \]

for some \(\epsilon < 10^{-100}\). (All real numbers can be written this way for some small \(\epsilon\).)
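We can check this numerically (a sketch using 500-bit BigFloat, enough digits to resolve \(\epsilon \sim 10^{-100}\)): find the preimage of the Float64 answer on the branch of \(\tan\) nearest \(10^{100}\), then measure its relative distance from \(10^{100}\).

setprecision(500) do
    x = BigFloat("1e100")                     # the exact input at ~150 decimal digits
    y = tan(1e100)                            # the Float64 answer from above
    k = round((x - atan(BigFloat(y))) / pi)   # index of the branch of tan nearest x
    xt = atan(BigFloat(y)) + k * pi           # exact preimage of y closest to x
    abs(xt - x) / x                           # relative backward error: about 1e-100
end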
Go find some functions…#
- Find a function \(f(x)\) that models something you’re interested in (a worked sketch follows this list)
- Plot its condition number \(\kappa\) as a function of \(x\)
- Plot the relative error (using single or double precision; compare using Julia’s big)
- Is the relative error ever much bigger than \(\kappa \epsilon_{\text{machine}}\)?
- Can you find what caused the instability?
- Share on Zulip
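For instance, here is a minimal sketch of that workflow for \(f(x) = \log x\) (chosen for illustration: \(f'(x) = 1/x\), so \(\kappa(x) = |1/\log x|\), which blows up near \(x = 1\)):

f_demo(x) = log(x)
kappa_demo(x) = abs(1 / log(x))   # kappa = |x f'(x) / f(x)| with f'(x) = 1/x
xs = LinRange(0.5, 1.5, 100)      # grid that avoids x = 1 exactly
relerr(x) = Float64(abs((f_demo(Float32(x)) - f_demo(big(x))) / f_demo(big(x))))
plot(xs, kappa_demo.(xs) .* eps(Float32), yscale=:log10, label="kappa * eps")
scatter!(xs, relerr.(xs), label="Float32 relative error")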
Further reading: FNC Introduction#
Stability demo: Volume of a polygon#
X = [1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0]  # vertices; the first is repeated to close the polygon
R(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]               # rotation matrix
Y = X * R(deg2rad(30))' .+ [0 0]                     # rotate; the translation is zero (for now)
plot(Y[:,1], Y[:,2], seriestype=:shape, aspect_ratio=:equal, xlims=(-3, 3), ylims=(-3, 3))
using LinearAlgebra
function pvolume(X)
    n = size(X, 1)
    # Shoelace formula: signed area as a sum of 2x2 determinants of consecutive vertices
    vol = sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)
end
@show ref = pvolume(Y)
[det(Y[i:i+1, :]) for i in 1:size(Y, 1)-1]
ref = pvolume(Y) = 9.25
7-element Vector{Float64}:
1.0
5.0
0.9999999999999997
1.0
4.0
4.5
2.0
What happens if this polygon is translated, perhaps far away?
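Let's try it (a quick experiment; ref is the area computed above). Each determinant now subtracts products of magnitude \(\sim 10^{16}\), so we should expect cancellation to eat most of the 16 digits a Float64 carries.

Y_far = Y .+ [1e8 1e8]         # same polygon, translated far from the origin
@show pvolume(Y_far)           # mathematically still 9.25
@show pvolume(Y_far) - ref;    # but cancellation in the determinants destroys accuracy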
Rootfinding#
Given \(f(x)\), find \(x\) such that \(f(x) = 0\).
We’ll work with scalars (\(f\) and \(x\) are just numbers) for now, and revisit later when they are vector-valued.
Change inputs to outputs#
- \(f(x; b) = x^2 - b\)
  - \(x(b) = \sqrt{b}\)
- \(f(x; b) = \tan x - b\)
  - \(x(b) = \arctan b\)
- \(f(x; b) = \cos x + x - b\)
  - \(x(b) = ?\)
- We aren’t given \(f(x)\), but rather an algorithm f(x) that approximates it.
- Sometimes we get extra information, like fp(x) that approximates \(f'(x)\).
- If we have source code for f(x), maybe it can be transformed “automatically”.
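For the last pair above there is no closed form, which is where a rootfinder earns its keep. A sketch (using Roots.jl, which we set up below; x_of_b is our name):

using Roots
x_of_b(b) = find_zero(x -> cos(x) + x - b, 0)  # solve cos(x) + x = b for x
x_of_b(2)                                      # ≈ 2.99; check: cos(2.99) + 2.99 ≈ 2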
Example: Queueing#
In a simple queueing model, there is an arrival rate and a departure (serving) rate. While waiting in the queue, there is a probability of “dropping out”. The length of the queue in this model is
One model for the waiting time (where these rates are taken from exponential distributions) is
wait(arrival, departure) = log(arrival / departure) / departure
plot(d -> wait(1, d), xlims=(.1, 1), xlabel="departure", ylabel="wait")
Departure rate given wait#
- Easy to measure wait
- I have a limited tolerance for waiting
my_wait = 0.8
plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait")
#import Pkg; Pkg.add("Roots")
using Roots
d0 = find_zero(d -> wait(1, d) - my_wait, 1)
plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait")
scatter!([d0], [0], marker=:circle, color=:black, title="d0 = $d0")
Example: Nonlinear Elasticity#
Strain-energy formulation#
A common (compressible neo-Hookean) model uses the strain energy

\[ W(I_1, J) = C_1 \big(J^{-2/3} I_1 - 3\big) + D_1 (J - 1)^2 \]

where \(I_1 = \lambda_1^2 + \lambda_2^2 + \lambda_3^2\) and \(J = \lambda_1 \lambda_2 \lambda_3\) are invariants defined in terms of the principal stretches \(\lambda_i\).
Uniaxial extension#
In the experiment, we would like to know the stress as a function of the stretch \(\lambda_1\). We don’t know \(J\), and will have to determine it by solving an equation.
How much does the volume change?#
Using symmetries of uniaxial extension, we can write an equation \(f(\lambda, J) = 0\) that must be satisfied. We’ll need to solve a rootfinding problem to compute \(J(\lambda)\).
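Here is one route (a sketch, assuming the neo-Hookean energy above): uniaxial symmetry gives \(\lambda_2 = \lambda_3 = \sqrt{J/\lambda}\), so \(I_1 = \lambda^2 + 2J/\lambda\). Requiring the lateral faces to be stress-free is equivalent to \(\partial W/\partial J = 0\) at fixed \(\lambda\); scaling that equation by \(J^{5/3}/2\) to clear the negative powers gives

\[ f(\lambda, J) = D_1 \big(J^{8/3} - J^{5/3}\big) + \frac{C_1}{3\lambda} J - \frac{C_1 \lambda^2}{3} = 0, \]

which is exactly the residual implemented below.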
function f(lambda, J)
    C_1 = 1.5e6   # material constants: D_1 >> C_1 means nearly incompressible
    D_1 = 1e8
    D_1 * J^(8/3) - D_1 * J^(5/3) + C_1 / (3*lambda) * J - C_1 * lambda^2/3
end
plot(J -> f(4, J), xlims=(0.1, 3), xlabel="J", ylabel="f(J)")
find_J(lambda) = find_zero(J -> f(lambda, J), 1)  # initial guess J = 1: no volume change
plot(find_J, xlims=(0.1, 5), xlabel="lambda", ylabel="J")
An algorithm: Bisection#
Bisection is a rootfinding technique that starts with an interval \([a,b]\) containing a root and does not require derivatives. Suppose \(f\) is continuous. What is a sufficient condition for \(f\) to have a root on \([a,b]\)? If \(f(a)\) and \(f(b)\) have opposite signs, the intermediate value theorem guarantees a root in between; that is exactly what we check:
hasroot(f, a, b) = f(a) * f(b) < 0
f(x) = cos(x) - x
plot(f)
Bisection#
function bisect(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return mid
    elseif hasroot(f, a, mid)
        return bisect(f, a, mid, tol)
    else
        return bisect(f, mid, b, tol)
    end
end
x0 = bisect(f, -1, 3, 1e-5)
x0, f(x0)
(0.7390861511230469, -1.7035832658995886e-6)
How fast does it converge?#
function bisect_hist(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return [mid]
    elseif hasroot(f, a, mid)
        return prepend!(bisect_hist(f, a, mid, tol), [mid])
    else
        return prepend!(bisect_hist(f, mid, b, tol), [mid])
    end
end
bisect_hist (generic function with 1 method)
bisect_hist(f, -1, 3, 1e-4)
17-element Vector{Float64}:
1.0
0.0
0.5
0.75
0.625
0.6875
0.71875
0.734375
0.7421875
0.73828125
0.740234375
0.7392578125
0.73876953125
0.739013671875
0.7391357421875
0.73907470703125
0.739105224609375
Iterative bisection#
- Data structures are often optimized for appending rather than prepending.
- Iteration bounds stack space (no recursion depth to worry about).
function bisect_iter(f, a, b, tol)
    hist = Float64[]
    while abs(b - a) > tol
        mid = (a + b) / 2
        push!(hist, mid)
        if hasroot(f, a, mid)
            b = mid
        else
            a = mid
        end
    end
    hist
end
bisect_iter (generic function with 1 method)
bisect_iter(f, -1, 3, 1e-4)
16-element Vector{Float64}:
1.0
0.0
0.5
0.75
0.625
0.6875
0.71875
0.734375
0.7421875
0.73828125
0.740234375
0.7392578125
0.73876953125
0.739013671875
0.7391357421875
0.73907470703125
Let’s plot the error#
We plot \(|x_k - r|\), where \(r\) is the true root, \(f(r) = 0\).
r = bisect(f, -1, 3, 1e-14) # what are we trusting?
hist = bisect_hist(f, -1, 3, 1e-10)
scatter( abs.(hist .- r), yscale=:log10)
ks = 1:length(hist)
plot!(ks, 4 * (.5 .^ ks))
Evidently the error \(e_k = x_k - x_*\) after \(k\) bisections satisfies the bound

\[ |e_k| \le c \cdot 2^{-k}, \]

where \(c = b - a = 4\) is the length of the initial interval (the reference line in the plot).
Next time#
- Discuss rootfinding examples
- Limitations of bisection
- Convergence classes
- Newton methods
  - convergence theory
  - derivation via convergence theory
  - reliability
For you#
- Share an equation that is useful for one task, but requires rootfinding for another task.
- OR: explain a system in which one stakeholder has the natural inputs, but a different stakeholder only knows outputs.
Please collaborate.