2023-03-15 Higher dimensions

2023-03-15 Higher dimensions#

Last time#

Accuracy of piecewise constant (nearest neighbor) interpolation
Piecewise polynomial methods
Splines
Libraries

Today#

Fast Fourier Transform
Fast Chebyshev
Interpolation in higher dimensions

using LinearAlgebra
using Plots
default(linewidth=4, legendfontsize=12)

function vander(x, k=nothing)
    if isnothing(k)
        k = length(x)
    end
    m = length(x)
    V = ones(m, k)
    for j in 2:k
        V[:, j] = V[:, j-1] .* x
    end
    V
end

function vander_chebyshev(x, n=nothing)
    if isnothing(n)
        n = length(x) # Square by default
    end
    m = length(x)
    T = ones(m, n)
    if n > 1
        T[:, 2] = x
    end
    for k in 3:n
        #T[:, k] = x .* T[:, k-1]
        T[:, k] = 2 * x .* T[:,k-1] - T[:, k-2]
    end
    T
end

function interp_nearest(x, s)
    A = zeros(length(s), length(x))
    for (i, t) in enumerate(s)
        loc = nothing
        dist = Inf
        for (j, u) in enumerate(x)
            if abs(t - u) < dist
                loc = j
                dist = abs(t - u)
            end
        end
        A[i, loc] = 1
    end
    A
end

runge(x) = 1 / (1 + 10*x^2)

CosRange(a, b, n) = (a + b)/2 .+ (b - a)/2 * cos.(LinRange(-pi, 0, n))

vcond(mat, points, nmax) = [cond(mat(points(-1, 1, n))) for n in 2:nmax]

function interp_spline(x, s)
    m, n = length(s), length(x)
    A = diagm(m, n, ones(n))
    for j in 1:n
        fspline = CubicSplineInterpolation(x, A[1:n,j])
        A[:,j] = fspline.(s)
    end
    A
end

function my_spy(A)
    cmax = norm(vec(A), Inf)
    s = max(1, ceil(120 / size(A, 1)))
    spy(A, marker=(:square, s), c=:diverging_rainbow_bgymr_45_85_c67_n256, clims=(-cmax, cmax))
end

function interp_chebyshev(x, xx)
    vander_chebyshev(xx, length(x)) * inv(vander_chebyshev(x))
end

function interp_monomial(x, xx)
    vander(xx, length(x)) * inv(vander(x))
end

function interp_error(ieval, x, xx, test)
    """Compute norm of interpolation error for function test
    using method interp_and_eval from points x to points xx.
    """
    A = ieval(x, xx)
    y = test.(x)
    yy = test.(xx)
    norm(A * y - yy, Inf)
end

function plot_convergence(ievals, ptspaces; xscale=:log10, yscale=:log10, maxpts=40)
    """Plot convergence rates for an interpolation scheme applied
    to a set of tests.
    """
    xx = LinRange(-1, 1, 100)
    ns = 2:maxpts
    fig = plot(title="Convergence",
        xlabel="Number of points",
        ylabel="Interpolation error",
        xscale=xscale,
        yscale=yscale,
        legend=:bottomleft,
        size=(1200, 800))
    for ieval in ievals
        for ptspace in ptspaces
            for test in [runge]
                try
                    errors = [interp_error(ieval, ptspace(-1, 1, n), xx, test)
                             for n in ns]
                    plot!(ns, errors, marker=:circle, label="$ieval/$ptspace")
                catch
                    continue
                end
            end
        end
    end
    for k in [1, 2, 3]
        plot!(ns, ns .^ (-1.0*k), color=:black, label="\$n^{-$k}\$")
    end
    fig
end

vcond (generic function with 1 method)

Accuracy#

plot_convergence([interp_monomial, interp_chebyshev, interp_nearest, interp_spline], [LinRange, CosRange], maxpts=60)

Generalizations of interpolation#

To create a Vandermonde matrix, we choose a family of functions \(\phi_j(x)\) and a set of points \(x_i\), then create the matrix

\[ V_{ij} = \phi_j(x_i) .\]

Integrals?#

\[ B_{ij} = \int_{(x_{i-1} + x_i)/2}^{(x_i + x_{i+1})/2} \phi_j(s) ds\]

This leads to conservative reconstruction, which is an important part of finite volume methods, which are industry standard for shock dynamics.

Derivatives?#

What if we instead computed derivatives?

\[ A_{ij} = \phi_j'(x_i) \]

function diff_monomial(x)
    n = length(x)
    A = zeros(n, n)
    A[:,2] = one.(x)
    for j in 3:n
        A[:,j] = A[:,j-1] .* x * (j - 1) / (j - 2)
    end
    A
end

diff_monomial(LinRange(-1, 1, 4))

4×4 Matrix{Float64}:
0  1.0  -2.0       3.0
0  1.0  -0.666667  0.333333
0  1.0   0.666667  0.333333
0  1.0   2.0       3.0

We need boundary conditions!#

First, a stable basis!#

Derivatives of Chebyshev polynomials also satisfy a recurrence.

function chebdiff(x, n=nothing)
    T = vander_chebyshev(x, n)
    m, n = size(T)
    dT = zero(T)
    dT[:,2:3] = [one.(x) 4*x]
    for j in 3:n-1
        dT[:,j+1] = j * (2 * T[:,j] + dT[:,j-1] / (j-2))
    end
    ddT = zero(T)
    ddT[:,3] .= 4
    for j in 3:n-1
        ddT[:,j+1] = j * (2 * dT[:,j] + ddT[:,j-1] / (j-2))
    end
    T, dT, ddT
end

chebdiff (generic function with 2 methods)

x = CosRange(-1, 1, 7)
T, dT, ddT = chebdiff(x)
c = T \ cos.(3x)
scatter(x, dT * c)
plot!(s -> -3sin(3s))

Solving a BVP with Chebyshev collocation#

A boundary value problem (BVP) asks to find a function \(u(x)\) satisfying an equation like

\[ -u_{xx}(x) = f(x) \]

subject to boundary conditions \(u(-1) = a\) and \(u'(1) = b\).

We’ll use the “method of manufactured solutions”: choose \(u(x) = \tanh(2x)\) and solve with the corresponding \(f(x)\). In practice, \(f(x)\) comes from the physics and you need to solve for \(u(x)\).

function poisson_cheb(n, rhsfunc, leftbc=(0, zero), rightbc=(0, zero))
    x = CosRange(-1, 1, n)
    T, dT, ddT = chebdiff(x)
    L = -ddT
    rhs = rhsfunc.(x)
    for (index, deriv, func) in
            [(1, leftbc...), (n, rightbc...)]
        L[index,:] = (T, dT)[deriv+1][index,:]
        rhs[index] = func(x[index])
    end
    x, L / T, rhs
end

poisson_cheb (generic function with 3 methods)

manufactured(x) = tanh(2x)
d_manufactured(x) = 2*cosh(2x)^-2
mdd_manufactured(x) = 8 * tanh(2x) / cosh(2x)^2
x, A, rhs = poisson_cheb(11, mdd_manufactured,
    (0, manufactured), (1, d_manufactured))
plot(x, A \ rhs, marker=:auto)
plot!(manufactured, legend=:bottomright)

“spectral” (exponential) convergence¶#

function poisson_error(n)
    x, A, rhs = poisson_cheb(n, mdd_manufactured, (0, manufactured), (1, d_manufactured))
    u = A \ rhs
    norm(u - manufactured.(x), Inf)
end

ns = 3:20
ps = [1 2 3]
plot(ns, abs.(poisson_error.(ns)), marker=:auto, yscale=:log10, xlabel="# points", ylabel="error")
plot!([n -> n^-p for p in ps], label=map(p -> "\$n^{-$p}\$", ps), size=(1000, 600))

Approximation by Fourier basis#

Just like we can approximate functions using linear combinations of polynomials, we can approximate periodic functions using a linear combination of Fourier modes.

\[ f(x) \approx \sum_{k=1}^n \underbrace{\hat f(\theta_k)}_{\hat f_k} e^{i\theta_k x} . \]

This is reminiscent of linear algebra

(14)#\[\begin{align} \Bigg[ f(x) \Bigg] &= \Bigg[ e^{i\theta_1 x} \Bigg| e^{i\theta_2 x} \Bigg| \dotsm \Bigg] \begin{bmatrix} \hat f_1 \\ \hat f_2 \\ \vdots \end{bmatrix} \\ &= \Bigg[ e^{i\theta_1 x} \Bigg] \hat f_1 + \Bigg[ e^{i\theta_2 x} \Bigg] \hat f_2 + \dotsb . \end{align}\]

Continuous \(\theta\): infinite domain#

If we take \(\theta \in (-\pi, \pi]\) as a continuous quantity (instead of a discrete set of modes), the sum becomes and integral and we get equality (for “nice enough” \(f(x)\)),

\[ f(x) = \int_{-\pi}^\pi \hat f(\theta) e^{i\theta x} d\theta, \]

in which \(\hat f(\theta)\) is the Fourier transform (specifically, the discrete time transform) of \(f(x)\). This representation is valuable for analyzing convergence of multigrid methods, among other applications.

Computing \(\hat f(\theta)\)#

If we select a finite number of points \(x\) and compute the square Vandermonde matrix

\[ \mathcal F = \Bigg[ e^{i\theta_1 x} \Bigg| e^{i\theta_2 x} \Bigg| \dotsm \Bigg] \]

then, knowing the vector \(f\), we could solve

\[ \mathcal F \hat f = f \]

for \(\hat f\). This would require \(O(n^3)\) where \(n\) is the number of points.

function vander_fourier(x, n=nothing)
    if isnothing(n)
        n = length(x)
    end
    theta = LinRange(-pi + 2pi/n, pi, n)
    F = exp.(1im * x * theta')
end

x = LinRange(-20, 20, 40)
F = vander_fourier(x)
plot(x, imag.(F[:, 19:22]))

\(\mathcal F\) as a matrix#

x = LinRange(-2, 2, 5)
F = vander_fourier(x) / sqrt(5)
@show norm(F' * F - I)
@show norm(F * F' - I);

norm(F' * F - I) = 4.0067422874877247e-16
norm(F * F' - I) = 4.594822717058422e-16

Every \(\mathcal F\) (suitably normalized) is a unitary matrix
- a unitary matrix is the complex-valued generalization of “orthogonal matrix”
- \(\mathcal F^H \mathcal F = \mathcal F \mathcal F^H = I\)
Typical notation is \(\mathcal F^*\) or \(\mathcal F^H\) representing “Hermitian transpose” or conjugate transpose

# The ' in Julia is the Hermitian adjoint
vander_fourier([-1, 0, 1])'

3×3 adjoint(::Matrix{ComplexF64}) with eltype ComplexF64:
  0.5-0.866025im     1.0+0.0im   0.5+0.866025im
  0.5+0.866025im     1.0-0.0im   0.5-0.866025im
 -1.0+1.22465e-16im  1.0-0.0im  -1.0-1.22465e-16im

What does this mean for cost?#

Fitting a discrete signal in the Fourier basis requires solving

\[\mathcal F \hat y = y\]

Faster!#

In this discrete context, the transform we need to evaluate is

\[ \hat f_k = \sum_\ell e^{-i\theta_k x_\ell} f_\ell \]

where \(f_\ell\) are samples \(f(x_\ell)\) at integers \(x_\ell = \ell\) and \(\theta_k\) are the frequencies \(2 \pi k/n\) (because the branch \(\theta \in (-\pi, \pi]\) is arbitrary).

(15)#\[\begin{align} \hat f_k &= \sum_{\ell=0}^{n-1} e^{-2\pi i \frac{k \ell}{n}} f_\ell \\ &= \underbrace{\sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k (2\ell)}{n}} f_{2\ell}}_{\text{even}} + \underbrace{\sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k (2\ell+1)}{n}} f_{2\ell+1}}_{\text{odd}} \\ &= \underbrace{\sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k \ell}{n/2}} f_{2\ell}}_{\text{transform of even data}} + e^{-2\pi i \frac k n} \underbrace{\sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k \ell}{n/2}} f_{2\ell+1}}_{\text{transform of odd data}} \end{align}\]

Periodicity#

When the original signal \(f\) is periodic, \(f_{\ell} = f_{(\ell + n) \bmod n}\), then

(16)#\[\begin{align} \hat f_{k+n/2} &= \sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{(k+n/2) \ell}{n/2}} f_{2\ell} + e^{-2\pi i \frac{(k+n/2)}{n}} \sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{(k+n/2) \ell}{n/2}} f_{2\ell+1} \\ &= \sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k \ell}{n/2}} f_{2\ell} + e^{-2\pi i \frac{k}{n}} e^{-\pi i} \sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k\ell}{n/2}} f_{2\ell+1} \\ &= \sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k \ell}{n/2}} f_{2\ell} - e^{-2\pi i \frac{k}{n}} \sum_{\ell=0}^{n/2-1} e^{-2\pi i \frac{k \ell}{n/2}} f_{2\ell+1} \end{align}\]

where we have used that

\[ e^{-2\pi i \ell} = \Big( e^{-2\pi i} \Big)^\ell = 1^\ell = 1 .\]

We’ve reduced a Fourier transform of length \(n\) (at a cost of \(n^2\)) to two transforms of length \(n/2\) (at a cost of \(2 n^2/4 = n^2/2\)). Repeating this recursively yields a complexity of \(O(n\log n)\).

Visualize#

using FastTransforms

n = 64; m = 62
x = 0:n-1
f = exp.(2im * pi * m * x / n)
plot(x, real.(f), ylim=(-1.1, 1.1), marker=:circle)

using AbstractFFTs
fhat = fft(f)
scatter([abs.(fhat), abs.(fftshift(fhat))], marker=[:square :circle])

Transform a Gaussian bump \(e^{-(x/w)^2}\)#

n = 64; w = 2
x = 0:n-1
f = exp.(-((x .- n/2)/w) .^ 2)
scatter(x, real.(f), ylim=(-1.1, 1.1))

fhat = fft(f)
scatter([abs.(fhat), abs.(fftshift(fhat))])

Compute derivatives using the Fourier transform#

How do we differentiate this?

\[f(x) = \sum_k e^{i\theta_k x} \hat f_k\]

Evidently, we need only compute
\[f'(x) = \mathcal F^{-1} (i \theta_k \hat f_k)\]

Generalizations#

Non-power of 2
Non-uniform grids
Multiple dimensions
Butterfly algorithms for integral operators
\[ (\mathcal K f)(x) = \int_{Y} K(x,y) f(y) dy \]
See Poulson et al., A parallel butterfly algorithm

Applications#

Everywhere in signal processing
ECMWF global climate and weather model
Particle-Mesh Ewald for long-range forces in molecular dynamics
Turbulence simulation

Parallel implications: Bisection bandwidth of the network#

function vander_chebyshev(x, n=nothing)
    if isnothing(n)
        n = length(x) # Square by default
    end
    m = length(x)
    T = ones(m, n)
    if n > 1
        T[:, 2] = x
    end
    for k in 3:n
        #T[:, k] = x .* T[:, k-1]
        T[:, k] = 2 * x .* T[:,k-1] - T[:, k-2]
    end
    T
end

CosRange(a, b, n) = (a + b)/2 .+ (b - a)/2 * cos.(LinRange(0, π, n))

CosRange (generic function with 1 method)

\(O(n \log n)\) derivative of a function on the `CosRange` points#

We can map these points to the unit circle in the complex plane, and then we have equally-spaced periodic structure and can use FFT.

function chebfft(y)
    # Adapted from Trefethen's Spectral Methods in Matlab
    n = length(y) - 1
    x = CosRange(-1, 1, n+1)
    Y = [y; reverse(y[2:n])]
    ii = 0:n-1
    U = real.(fft(Y))
    W = real.(ifft(1im * [ii; 0; 1-n:-1] .* U))
    w = zeros(n + 1)
    w[2:n] = -W[2:n] ./ sqrt.(1 .- x[2:n] .^ 2)
    w[1] = sum(ii .^ 2 .* U[ii .+ 1]) / n + .5 * n * U[n+1]
    w[n+1] = sum((-1) .^ (ii .+ 1) .* ii .^ 2 .* U[ii .+ 1]) / n + .5 * (-1) ^ (n+1) * n * U[n+1]
    w
end

chebfft([1,2,0,3])

4-element Vector{Float64}:
  -6.333333333333334
   1.3333333333333333
  -1.333333333333333
 -11.666666666666666

x = CosRange(-1, 1, 40)
g(x) = cos(3x - 1)
dg(x) = -3 * sin(3x - 1) # for verification
y = g.(x)
dy = chebfft(y)
plot(x, dy, marker=:circle)
plot!(dg)

Library: FastChebInterp.jl #

using FastChebInterp

[CosRange(-1, 1, 11) chebpoints(10, -1, 1)]

11×2 Matrix{Float64}:
 -1.0           1.0
 -0.951057      0.951057
 -0.809017      0.809017
 -0.587785      0.587785
 -0.309017      0.309017
  6.12323e-17   0.0
  0.309017     -0.309017
  0.587785     -0.587785
  0.809017     -0.809017
  0.951057     -0.951057
  1.0          -1.0

c = chebinterp(g.(x), -1, 1)
cx, dcx = chebgradient(c, 0.5)

(-0.8011436155469339, -1.795416432311872)

Curse of Dimensionality#

Suppose we use a naive Vandermonde matrix to interpolate \(n\) data points in an \(n\)-dimensional space of functions, e.g., predicting \(z(x, y)\) from data \((x_i, y_i, z_i)\)

\[ \underbrace{\Bigg[ 1 \Bigg| x \Bigg| y \Bigg| xy \Bigg| x^2 \Big| y^2 \Big| x^2 y \Big| xy^2 \Big| x^2y^2 \Big| \dotsb \Bigg]}_{V} \Bigg[ \mathbf c \Bigg] = \Bigg[ \mathbf z \Bigg] \]

# A grid with 10 data points in each of d dimensions.
points(d) = 10. ^ d
flops(n) = n ^ 3
joules(flops) = flops / 20e9 # 20 GF/joule for best hardware today
scatter(1:10, d -> joules(flops(points(d))), xlims=(0, 10), yscale=:log10, legend=:none)

barrels_of_oil(flops) = joules(flops) / 6e9
scatter(1:10, d -> barrels_of_oil(flops(points(d))), xlims=(0, 10), yscale=:log10)

Fourier series and tensor product structure#

For periodic data on the interval \([-\pi, \pi)\), we can use a basis \(\{ 1, \sin x, \cos x, \sin 2x, \cos 2x, \dotsc\}\), which is equivalent to \(\{ 1, e^{ix}, e^{i2x}, \dotsc \}\) with suitable complex coefficients. If we’re given equally spaced points on the interval, the Vandermonde matrix \(V\) (with suitable scaling) is unitary (like orthogonal for complex matrices) and can be applied in \(O(n \log n)\) (with small constants) using the Fast Fourier Transform. This also works for Chebyshev polynomials sampled on CosRange points.

points(d) = 10. ^ d
flops(n) = 5n * log2(n)
joules(flops) = flops / 20e9 # 20 GF/joule for best hardware today
scatter(1:10, d -> joules(flops(points(d))), xlims=(0, 10), yscale=:log10)

Partial differential equations#

Boundary value problems in multiple dimensions.

Lower-degree polynomials to fit noise-free data#

We can fit \(m\) data points using an \(n < m\) dimensional space of functions. This involves solving a least squares problem for the coefficients \( \min_c \lVert V c - y \rVert \)

function chebyshev_regress_eval(x, xx, n)
    V = vander_chebyshev(x, n)
    @show cond(V)
    vander_chebyshev(xx, n) / V
end
ndata, nbasis = 30, 20
x = LinRange(-1, 1, ndata)
xx = LinRange(-1, 1, 500)
C = chebyshev_regress_eval(x, xx, nbasis)
plot(xx, [runge.(xx), C * runge.(x)])
scatter!(x, runge)

cond(V) = 30.083506637940346

S = svdvals(C)
scatter(S, yscale=:log10)