{
"cells": [
{
"cell_type": "markdown",
"id": "4159be0e",
"metadata": {
"hideCode": false,
"hidePrompt": false,
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# 2022-02-16 QR Stability\n",
"\n",
"* Sahitya's office hours: Friday 11-12:30\n",
"\n",
"## Last time\n",
"\n",
"* Gram-Schmidt process\n",
"* QR factorization\n",
"\n",
"## Today\n",
"* Stability and ill conditioning\n",
"* Intro to performance modeling\n",
"* Classical vs Modified Gram-Schmidt\n",
"* Right vs left-looking algorithms"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "eb781a7a",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"vander (generic function with 2 methods)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using LinearAlgebra\n",
"using Plots\n",
"using Polynomials\n",
"default(linewidth=4, legendfontsize=12)\n",
"\n",
"function vander(x, k=nothing)\n",
" if isnothing(k)\n",
" k = length(x)\n",
" end\n",
" m = length(x)\n",
" V = ones(m, k)\n",
" for j in 2:k\n",
" V[:, j] = V[:, j-1] .* x\n",
" end\n",
" V\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "08572038",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Gram-Schmidt orthogonalization\n",
"\n",
"Suppose we're given some vectors and want to find an orthogonal basis for their span.\n",
"\n",
"$$ \\Bigg[ a_1 \\Bigg| a_2 \\Bigg] = \\Bigg[ q_1 \\Bigg| q_2 \\Bigg] \\begin{bmatrix} r_{11} & r_{12} \\\\ 0 & r_{22} \\end{bmatrix} $$"
]
},
{
"cell_type": "markdown",
"id": "bfe2f87a",
"metadata": {
"cell_style": "center",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# A naive algorithm"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "fdc56e08",
"metadata": {
"cell_style": "split"
},
"outputs": [
{
"data": {
"text/plain": [
"gram_schmidt_naive (generic function with 1 method)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function gram_schmidt_naive(A)\n",
" m, n = size(A)\n",
" Q = zeros(m, n)\n",
" R = zeros(n, n)\n",
" for j in 1:n\n",
" v = A[:,j]\n",
" for k in 1:j-1\n",
" r = Q[:,k]' * v\n",
" v -= Q[:,k] * r\n",
" R[k,j] = r\n",
" end\n",
" R[j,j] = norm(v)\n",
" Q[:,j] = v / R[j,j]\n",
" end\n",
" Q, R\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "967f69c1",
"metadata": {
"cell_style": "split",
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"norm(Q' * Q - I) = 1.073721107832196e-8\n",
"norm(Q * R - A) = 8.268821431611631e-16\n"
]
}
],
"source": [
"x = LinRange(-1, 1, 20)\n",
"A = vander(x, 20)\n",
"Q, R = gram_schmidt_naive(A)\n",
"@show norm(Q' * Q - I)\n",
"@show norm(Q * R - A);"
]
},
{
"cell_type": "markdown",
"id": "913dbe1e",
"metadata": {
"cell_style": "center",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# What do orthogonal polynomials look like?"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "c5c93439",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = LinRange(-1, 1, 50)\n",
"A = vander(x, 6)\n",
"Q, R = gram_schmidt_naive(A)\n",
"plot(x, Q)"
]
},
{
"cell_type": "markdown",
"id": "e970788d",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"What happens if we use more than 50 values of $x$? Is there a continuous limit?"
]
},
{
"cell_type": "markdown",
"id": "3c5838b3",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Solving equations using $QR = A$\n",
"\n",
"If $A x = b$ then $Rx = Q^T b$. (Why is it easy to solve with $R$?)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "8ee71b61",
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x1 = [-0.9, 0.1, 0.5, 0.8] # points where we know values\n",
"y1 = [1, 2.4, -0.2, 1.3]\n",
"scatter(x1, y1)\n",
"A = vander(x1, 3)\n",
"Q, R = gram_schmidt_naive(A)\n",
"p = R \\ (Q' * y1)\n",
"p = A \\ y1\n",
"plot!(x, vander(x, 3) * p)"
]
},
{
"cell_type": "markdown",
"id": "39e38b8b",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# How accurate is it?"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "94fd9d57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"norm(Q' * Q - I) = 2.2794113434933815e-13\n",
"norm(Q * R - A) = 3.6437542961698333e-16\n"
]
},
{
"data": {
"text/plain": [
"3.6437542961698333e-16"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = 10\n",
"x = LinRange(-1, 1, m)\n",
"A = vander(x)\n",
"Q, R = gram_schmidt_naive(A)\n",
"@show norm(Q' * Q - I)\n",
"@show norm(Q * R - A) "
]
},
{
"cell_type": "markdown",
"id": "efc14559",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# A variant with more parallelism\n",
"\n",
"\\begin{align}\n",
"(I - q_2 q_2^T) (I - q_1 q_1^T) v &= (I - q_1 q_1^T - q_2 q_2^T + q_2 q_2^T q_1 q_1^T) v \\\\\n",
"&= \\Bigg( I - \\Big[ q_1 \\Big| q_2 \\Big] \\begin{bmatrix} q_1^T \\\\ q_2^T \\end{bmatrix} \\Bigg) v\n",
"\\end{align}"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "3ac43046",
"metadata": {
"cell_style": "split"
},
"outputs": [
{
"data": {
"text/plain": [
"gram_schmidt_classical (generic function with 1 method)"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function gram_schmidt_classical(A)\n",
" m, n = size(A)\n",
" Q = zeros(m, n)\n",
" R = zeros(n, n)\n",
" for j in 1:n\n",
" v = A[:,j]\n",
" R[1:j-1,j] = Q[:,1:j-1]' * v\n",
" v -= Q[:,1:j-1] * R[1:j-1,j]\n",
" R[j,j] = norm(v)\n",
" Q[:,j] = v / R[j,j]\n",
" end\n",
" Q, R\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "6b29c747",
"metadata": {
"cell_style": "split"
},
"outputs": [
{
"data": {
"text/plain": [
"1.4142135623730951"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"norm([0 1; 1 0])"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "18964396",
"metadata": {
"cell_style": "split"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"norm(Q' * Q - I) = 6.339875256299394e-11\n",
"norm(Q * R - A) = 1.217027619812654e-16\n"
]
},
{
"data": {
"text/plain": [
"1.217027619812654e-16"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = 10\n",
"x = LinRange(-1, 1, m)\n",
"A = vander(x, m)\n",
"Q, R = gram_schmidt_classical(A)\n",
"@show norm(Q' * Q - I)\n",
"@show norm(Q * R - A)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14a347ed",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b57d4db",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "d42aa396",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Why does order of operations matter?\n",
"\n",
"\\begin{align}\n",
"(I - q_2 q_2^T) (I - q_1 q_1^T) v &= (I - q_1 q_1^T - q_2 q_2^T + q_2 q_2^T q_1 q_1^T) v \\\\\n",
"&= \\Bigg( I - \\Big[ q_1 \\Big| q_2 \\Big] \\begin{bmatrix} q_1^T \\\\ q_2^T \\end{bmatrix} \\Bigg) v\n",
"\\end{align}\n",
"is not exact in finite arithmetic."
]
},
{
"cell_type": "markdown",
"id": "c81bbbb2",
"metadata": {
"cell_style": "center",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# We can look at the size of what's left over\n",
"\n",
"We project out the components of our vectors in the directions of each $q_j$."
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "f53c1ab0",
"metadata": {
"scrolled": false,
"slideshow": {
"slide_type": ""
}
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = LinRange(-1, 1, 20)\n",
"A = vander(x)\n",
"Q, R = gram_schmidt_classical(A)\n",
"scatter(diag(R), yscale=:log10)"
]
},
{
"cell_type": "markdown",
"id": "4d73937f",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# The next vector is almost linearly dependent"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "728cdb88",
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"norm(v) = 1.4245900685395503\n"
]
},
{
"data": {
"image/svg+xml": [
"\n",
"\n"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = LinRange(-1, 1, 20)\n",
"A = vander(x)\n",
"Q, _ = gram_schmidt_classical(A)\n",
"#Q, _ = qr(A)\n",
"v = A[:,end]\n",
"@show norm(v)\n",
"scatter(abs.(Q[:,1:end-1]' * v), yscale=:log10)"
]
},
{
"cell_type": "markdown",
"id": "e26e8171",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Cost of Gram-Schmidt?\n",
"\n",
"* We'll count flops (addition, multiplication, division*)\n",
"* Inner product $\\sum_{i=1}^m x_i y_i$?\n",
"* Vector \"axpy\": $y_i = a x_i + y_i$, $i \\in [1, 2, \\dotsc, m]$.\n",
"* Look at the inner loop:\n",
"```julia\n",
"for k in 1:j-1\n",
" r = Q[:,k]' * v\n",
" v -= Q[:,k] * r\n",
" R[k,j] = r\n",
"end\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "0b6b998b",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"id": "1696dc21",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Counting flops is a bad model\n"
]
},
{
"cell_type": "markdown",
"id": "398fa704",
"metadata": {
"cell_style": "split"
},
"source": [
"* We load a single entry (8 bytes) and do 2 flops (add + multiply). That's an **arithmetic intensity** of 0.25 flops/byte.\n",
"* Current hardware can do about 10 flops per byte, so our best algorithms will run at about 2% efficiency.\n",
"* Need to focus on memory bandwidth, not flops.\n",
"\n",
"## Dense matrix-matrix mulitply\n",
"\n",
"* [BLIS project](https://github.com/flame/blis/)\n",
"* [Analytic modeling](https://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf)"
]
},
{
"cell_type": "markdown",
"id": "efa58581",
"metadata": {
"cell_style": "split"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"id": "ef67b53a",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Inherent data dependencies\n",
"\n",
"$$ \\Bigg[ q_1 \\Bigg| q_2 \\Bigg| q_3 \\Bigg| q_4 \\Bigg| q_5 \\Bigg] \\begin{bmatrix} r_{11} & r_{12} & r_{13} & r_{14} & r_{15} \\\\\n",
" & r_{22} & r_{23} & r_{24} & r_{25} \\\\\n",
"& & r_{33} & r_{34} & r_{35} \\\\\n",
"&&& r_{44} & r_{45} \\\\\n",
"&&& & r_{55}\n",
"\\end{bmatrix}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "c5c2b82f",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Right-looking modified Gram-Schmidt"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "4eee54d5",
"metadata": {
"cell_style": "split"
},
"outputs": [
{
"data": {
"text/plain": [
"gram_schmidt_modified (generic function with 1 method)"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function gram_schmidt_modified(A)\n",
" m, n = size(A)\n",
" Q = copy(A)\n",
" R = zeros(n, n)\n",
" for j in 1:n\n",
" R[j,j] = norm(Q[:,j])\n",
" Q[:,j] /= R[j,j]\n",
" R[j,j+1:end] = Q[:,j]'*Q[:,j+1:end]\n",
" Q[:,j+1:end] -= Q[:,j]*R[j,j+1:end]'\n",
" end\n",
" Q, R\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "372bd8ba",
"metadata": {
"cell_style": "split",
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"norm(Q' * Q - I) = 8.486718528276085e-9\n",
"norm(Q * R - A) = 8.709998074379606e-16\n"
]
},
{
"data": {
"text/plain": [
"8.709998074379606e-16"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = 20\n",
"x = LinRange(-1, 1, m)\n",
"A = vander(x, m)\n",
"Q, R = gram_schmidt_modified(A)\n",
"@show norm(Q' * Q - I)\n",
"@show norm(Q * R - A)"
]
},
{
"cell_type": "markdown",
"id": "4e8d5238",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Classical versus modified?\n",
"\n",
"* Classical\n",
" * Really unstable, orthogonality error of size $1 \\gg \\epsilon_{\\text{machine}}$\n",
" * Don't need to know all the vectors in advance\n",
"* Modified\n",
" * Needs to be right-looking for efficiency\n",
" * Less unstable, but orthogonality error $10^{-9} \\gg \\epsilon_{\\text{machine}}$"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "c7afa881",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"norm(Q' * Q - I) = 2.1776697623015113e-15\n"
]
},
{
"data": {
"text/plain": [
"2.1776697623015113e-15"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = 10\n",
"x = LinRange(-1, 1, m)\n",
"A = vander(x, m)\n",
"Q, R = qr(A)\n",
"@show norm(Q' * Q - I)"
]
}
],
"metadata": {
"@webio": {
"lastCommId": "cd1892c1e2064686902fbdf399e6e3e3",
"lastKernelId": "6eae1736-6531-4224-a6c2-38e7ff163d19"
},
"celltoolbar": "Slideshow",
"hide_code_all_hidden": false,
"kernelspec": {
"display_name": "Julia 1.7.1",
"language": "julia",
"name": "julia-1.7"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.7.2"
},
"rise": {
"enable_chalkboard": true
}
},
"nbformat": 4,
"nbformat_minor": 5
}