Introduction to Approximation Theory

Section 6.1 Introduction to Approximation Theory

A very important subject in numerical analysis is that of approximation theory. In a nutshell, one desires to approximate one function by another function. For example, let $g(x)=e^{x}\text{.}$ Although most computers and calculators will quickly evaluate this, let’s ignore that fact and try to think of how to do it with only the standard 4 operations: addition, subtraction, multiplication and division.

🔗

In order to evaluate $e^{x}$ for arbitrary values of $x\text{,}$ one way from calculus is to use the Maclaurin series

\begin{equation*} e^{x} = 1 + x + \frac{x^{2}}{2!}+ \frac{x^{3}}{3!}+ \cdots + \frac{x^{n}}{n!}+ \cdots. \end{equation*}

🔗

We learned in calculus that this series converges for any value of $x$—the problem is that it may take a huge number of terms to find $e^{x}$ to a desired accuracy. In this section, we learn how to approximate functions like $e^{x}$ or a set of data using other functions, generally polynomials and rational functions.

🔗

Subsection 6.1.1 Norms of Functions

As discussed, we seek a function (often a polynomial) that is “close” to the target function. In this section, we define carefully what this means. In order to do this, we discuss norms.

🔗

Definition 6.1.1.

A norm is a function $|| \cdot ||: V \rightarrow \R$ over a field $F\text{,}$ where $V$ is a vector space, that satisfies the following three properties for all $\mathbf{u}, \mathbf{v}\in V$ and $a \in F\text{,}$

$||a \mathbf{u}|| = |a| \; ||\mathbf{u}||\text{,}$
🔗

🔗
$||\mathbf{u}+ \mathbf{v}|| \leq ||\mathbf{u}|| + ||\mathbf{v}||,\quad$ (This is called the triangle inequality.)
🔗

🔗
$||\mathbf{u}|| \geq 0$ and $||\mathbf{u}||=0$ if and only if $\mathbf{u}=\mathbf{0}\text{,}$ the zero vector.
🔗

🔗

🔗

Generally the field $F$ is either $\R$ or $\mathbb{C}\text{,}$ the sets of reals and complex numbers respectively.

🔗

If the vector space $V = \R^{n}\text{,}$ the set of all real vectors of length $n\text{,}$ then a often-used norm is the Euclidean norm. If $\mathbf{v}\in V$ and

\begin{equation*} \mathbf{v}= \begin{bmatrix}v_{1} \\ v_{2} \\ \vdots \\ v_{n}\end{bmatrix} \end{equation*}

🔗

then the Euclidean norm is

\begin{equation*} ||\mathbf{v}||_{2} = \sqrt{v_{1}^{2} + v_{2}^{2} + \cdots v_{n}^{2}} \end{equation*}

🔗

and this can be shown to satisfy the three properties above.

🔗

The Euclidean norm gives the length of the vector in $\R^{2}$ or $\R^{3}$ and can be used to define the length in higher orders. In addition, the norm can be used to determine the distance between two point. If $\mathbf{x}$ and $\mathbf{y}$ are vectors in $\R^{n}\text{,}$ then the distance between $\mathbf{x}$ and $\mathbf{y}$ is $||\mathbf{x}-\mathbf{y}||_{2}$ and reduce to the distance between points in $\R^{2}$ and $\R^{3}\text{.}$

🔗

Recall that the notion of vector space is broad. Although it often applies to vectors and matrices, the set of polynomials of a given degree and the set of continuous functions on $[a,b]$ are examples of vector spaces.

🔗

Three standard norms are often called the $\ell_{1}, \ell_{2}$ and $\ell_{\infty}\text{.}$ For $\mathbf{u}\in \R^{n}\text{,}$ the norms $\ell_{1}, \ell_{2}$ and $\ell_{\infty}$ are defined as

\begin{equation*} \begin{aligned} ||u||_{1} \amp = \sum_{i=1}^{n} |u_{i}|, \\ ||u||_{2} \amp = \biggl(\sum_{i=1}^{n} u_{i}^{2}\biggr)^{1/2}, \\ ||u||_{\infty} \amp= \max_{i=1,2,\ldots,n}|u_{i}|. \end{aligned} \end{equation*}

🔗

The $\ell_{1}\text{,}$ $\ell_{2}$ and $\ell_{\infty}$ norms of a function $f$ on an interval $[a,b]$ are defined

\begin{equation*} \begin{aligned} ||f||_{1} \amp = \int_{a}^{b} |f(x)| \, dx, \\ ||f||_{2}\amp= \biggl( \int_{a}^{b} f(x)^{2} \, dx \biggr)^{1/2}, \\ ||f||_{\infty} \amp= \max_{x \in[a,b]}|f(x)|. \end{aligned} \end{equation*}

🔗

We will see that these norms play a large role in approximation theory. In short, we are looking for a function to approximate another function or a set of data. We will use one of the norms to determine how far the two are away from each other.

🔗

Subsection 6.1.2 Inner Product Spaces

The vector space $V$ is called an inner product space if for every pair of vectors $\bu$ and $\bv$ in $V$ there is a unique number $\langle \bu, \bv \rangle\text{,}$ called the inner product of $\bu$ and $\bv$ such that if $\bu, \bv$ and $\bw$ are in $V$ and $\alpha \in \R$ then

🔗

$\langle \bu, \bv \rangle = \langle \bv, \bu \rangle\text{,}$
🔗

🔗
$\langle a \bu + b \bv, \bw \rangle = a \langle \bu, \bw \rangle + b \langle \bv, \bw \rangle\text{,}$
🔗

🔗
$\langle \bu, \bu \rangle \geq 0$ and equals 0 if and only if $\bu = \bzero\text{.}$
🔗

🔗

🔗

Example 6.1.2.

Show that $\R^{n}$ is an inner product space with $\bu=(u_{1}, u_{2}, \ldots, u_{n})^{\intercal}$ and $\bv=(v_{1},v_{2}, \ldots,v_{n})^{\intercal}$ and $\langle \bu, \bv \rangle = \bu^{\intercal} \bv\text{.}$

🔗

First, the set $\R^{n}$ is a vectors space. Next, we need to show that the three properties are satisfied.

🔗

\begin{equation*} \langle \bu, \bv \rangle = \bu^{\intercal} \bv = (\bu^{\intercal} \bv)^{\intercal} = \bv^{\intercal} (\bu^{\intercal})^{\intercal} = \bv^{\intercal} \bu = \langle \bv, \bu \rangle. \end{equation*}

so the first property is satisfied.

🔗

🔗
\begin{equation*} \begin{aligned} \langle a \bu + b \bv, \bw \rangle\amp= (a \bu + b\bv)^{\intercal} \bw = (a \bu^{\intercal} + b \bv^{\intercal}) \bw \\ \amp= a \bu^{\intercal} \bw + b \bv^{\intercal} \bw = a \langle \bu, \bw \rangle + b \langle \bv, \bw \rangle. \end{aligned} \end{equation*}

🔗
\begin{equation*} \langle \bu, \bu \rangle = \bu^{\intercal} \bu = u_{1}^{2} + u_{2}^{2} + u_{3}^{2} + \cdots + u_{n}^{2} \end{equation*}

which satisfies $\geq 0\text{.}$ The only time that this quantity equals 0, is when $u_{1}=u_{2} = \cdots =u_{n}=0$ or $\bu = \bzero\text{.}$

🔗

🔗

🔗

The example above shows a standard way of demonstrating that something is an inner product space. Using the definition of the inner product, each property is clearly shown.

🔗

Subsection 6.1.3 Vector Norms and distance

Definition 6.1.3.

Let $\bu$ and $\bv$ be elements of an inner product space. The norm of the vector $\bu$ is given by

\begin{equation*} || \bu ||= \sqrt{\langle \bu,\bu \rangle} \end{equation*}

🔗

Note: if $\bu \in \R^{n}$ then the vector norm is the length of the vector.

🔗

Definition 6.1.4.

The distance between vectors $\bu$ and $\bv$ denoted $d(\bu,\bv)$ and defined as $d(\bu,\bv)=||\bu-\bv||\text{.}$

🔗

Note: if $\bu$ and $\bv$ are in $\R^{n}\text{,}$ then the distance function is the standard distance function.

🔗

Lemma 6.1.5.

The distance $||\bu-\bv||=0$ if and only if $\bu=\bv\text{.}$

🔗

Proof.

Let $||\bu-\bv||=0\text{,}$ then $\langle \bu-\bv,\bu-\bv \rangle=0$ By definition, this is only zero if $\bu-\bv=\bzero\text{,}$ therefore $\bu=\bv\text{.}$

🔗

If $\bu=\bv\text{,}$ then $||\bu-\bv|| = ||\bzero||=0\text{.}$

🔗

Subsection 6.1.4 Angles between vectors

If $\bu$ and $\bv$ are vectors in $\R^{2}$ or $\R^{3}$ then

\begin{equation*} \cos \theta = \frac{\langle \bu,\bv \rangle}{||\bu||\, || \bv||} \end{equation*}

🔗

and thus the angle between the vectors can be found.

🔗

This notion generalizes to any vectors in an inner product space, $V\text{.}$ Most helpful, two vectors meet at a right angle if $\langle \bu,\bv\rangle=0\text{.}$

🔗

Subsection 6.1.5 Orthonormal sets of vectors

Definition 6.1.6.

Let $\{\bv_{1}, \bv_{2}, \ldots, \bv_{n}\}$ each be elements of an inner product space, $V\text{.}$ The set is called an orthonormal set if

\begin{equation*} \langle \bv_{i}, \bv_{j} \rangle= 0 \qquad \text{if $i \neq j$} \end{equation*}

and

\begin{equation*} ||\bv_{i}|| = \sqrt{\langle \bv_{i}, \bv_{i} \rangle}= 1 \end{equation*}

🔗

If only the first condition holds, the set is called orthogonal.

🔗

Example 6.1.7.

Show that

\begin{equation*} \begin{aligned} \bv_{1}\amp= \begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix},\amp\bv_{2}\amp= \begin{bmatrix}0 \\ 1/\sqrt{2} \\ 1/\sqrt{2}\end{bmatrix},\amp\bv_{3}\amp= \begin{bmatrix}0 \\ 1/\sqrt{2} \\ -1/\sqrt{2}\end{bmatrix} \end{aligned} \end{equation*}

is an orthonormal set where each vector is an element of $\R^{3}\text{.}$

🔗

Note: this is also a basis for $\R^{3}\text{.}$

🔗

Theorem 6.1.8. Cauchy-Swartz Inequality.

If $\bu$ and $\bv$ are elements of an inner product space then

\begin{equation*} | \langle \bu, \bv \rangle| \leq || \bu ||\, ||\bv|| \end{equation*}

🔗

Proof.

If $\bv=\bzero\text{,}$ then the inequality is satisfied. Assume $\bv \neq \bzero$ and define

\begin{equation*} \lambda = \langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle. \end{equation*}

🔗

By definition of the inner product,

\begin{equation*} \begin{aligned} 0 \leq\amp\langle \bu - \lambda \bv, \bu -\lambda \bv \rangle \\ \amp\qquad \text{Using properties of the inner product,}\\ 0 \leq\amp\langle \bu, \bu \rangle - 2 \lambda \langle \bu, \bv \rangle + \lambda^{2} \langle \bv,\bv \rangle \\ =\amp\langle \bu,\bu \rangle - 2 \langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle \langle \bu, \bv \rangle + \bigl(\langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle \bigr)^{2} \langle \bv,\bv \rangle. \\ =\amp\langle \bu, \bu \rangle - \langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle^{2} \\ \amp \qquad \text{Multiply through by $\langle \bv,\bv\rangle$}\\ =\amp\langle \bu,\bu \rangle \langle \bv,\bv \rangle - \langle \bu,\bv \rangle^{2} \end{aligned} \end{equation*}

🔗

This can be rearranged to get:

\begin{equation*} \langle \bu,\bv \rangle^{2} \leq \langle \bu,\bu \rangle \langle \bv, \bv \rangle \end{equation*}

and taking the square root, you get the desired result.

🔗

Subsection 6.1.6 Inner Products of functions

Above, we defined an inner product space and for both motivation as well as the examples shown, we used $\R^{n}\text{.}$ This is a good space to become familiar with inner product spaces, because of the familiarity of the spaces $\R^{2}$ and $\R^{3}\text{.}$ In this section we will apply an inner product to a function on an interval $[a,b]\text{.}$

🔗

Definition 6.1.9.

The inner product between functions $f$ and $g\text{,}$ denoted $\langle f,g\rangle$ on the interval $[a,b]$ is defined to be:

\begin{equation} \langle f,g \rangle = \int_{a}^{b} f(x) g(x) \, dx.\tag{6.1.1} \end{equation}

🔗

Example 6.1.10.

Show that the set of continuous function on $[a,b]$ with the inner product defined above is an inner product space.

🔗

Solution.

We need to show that the three properties that define the inner product are valid for (6.1.1).

🔗

\begin{equation*} \langle f, g \rangle = \int_{a}^{b} f(x) g(x) \, dx = \int_{a}^{b} g(x) f(x) \, dx = \langle g, f \rangle. \end{equation*}

🔗

🔗
For $c_{1}$ and $c_{2}\text{,}$ real numbers,

\begin{equation*} \langle c_{1} f + c_{2} g,h \rangle = \int_{a}^{b} (c_{1} f +c_{2} g)h \, dx = \int_{a}^{b} (c_{1} f h + c_{2} gh) \, dx \end{equation*}

due to the linearity of integrals,

\begin{equation*} \phantom{\langle c_{1} f + c_{2} g,h \rangle} = c_{1} \int_{a}^{b} f h \, dx + c_{2} \int_{a}^{b} g h\, dx = c_{1} \langle f, h \rangle + c_{2} \langle g,h \rangle. \end{equation*}

🔗

🔗
\begin{equation*} \langle f, f \rangle = \int_{a}^{b} f^{2} \, dx \end{equation*}

And since $f^{2} \geq 0\text{,}$

\begin{equation*} \langle f, f \rangle\geq 0. \end{equation*}

🔗

If $f \equiv 0\text{,}$ then $\int_{a}^{b} f^{2} \, dx =0\text{.}$ If $\int_{a}^{b} f^{2} \, dx = 0\text{,}$ then $f \equiv 0\text{.}$ The second statement can be shown by contradiction. That is, assume that $f \not \equiv 0\text{,}$ and the result is that $\int_{a}^{b} f^{2} \, dx \neq 0\text{,}$ therefore the assumption that $f \not \equiv 0$ is not correct cannot hold.
🔗

🔗

🔗

So the set of continuous functions on $[a,b]$ with the given inner product is an inner product space.

🔗

As we saw above with vectors in $\R^{n}\text{,}$ there is a relationship between norms and inner products. This is true with functions as well. The $\ell_{2}$-norm of the function $f$ is given by

\begin{equation*} ||f ||_{2} = \sqrt{\langle f,f \rangle}= \biggl(\int_{a}^{b} f^{2} \, dx\biggr)^{1/2}. \end{equation*}

🔗

Subsection 6.1.7 Weighted Inner Products

Definition 6.1.11.

The inner product between functions $f$ and $g$ on the interval $[a,b]$ with respect to weight function $w(x) \geq 0$ is defined to be:

\begin{equation*} \langle f,g \rangle_{w} = \int_{a}^{b} f(x) g(x) w(x) \, dx \end{equation*}

🔗

and in a similar vein,

\begin{equation*} ||f||_{w} = \langle f,f\rangle_{w}^{1/2} \end{equation*}

🔗

Example 6.1.12.

Find the weighted inner product between $f(x)=x$ and $f(x)=x^{2}$ with weight function $w(x)=e^{-x}$ on the interval $[0,\infty)\text{.}$

🔗

\begin{equation*} \langle x,x^{2}\rangle_{e^{-x}}= \int_{0}^{\infty}x^{3} e^{-x}\, dx \end{equation*}

using tabular integration

\begin{equation*} = -(x^{3}+3x^{2}+6x+6)e^{-x}\biggr|_{0}^{\infty}= 6 \end{equation*}

🔗

Subsection 6.1.8 Orthogonal Functions

Again, as we saw above, two vectors $\bu$ and $\bv$ in $\R^{n}$ are orthogonal if $\langle \bu, \bv \rangle=0\text{.}$ Similarly, we can define two functions $f$ and $g$ to be orthogonal on $[a,b]$ if

\begin{equation*} \langle f, g \rangle = \int_{a}^{b} f g \, dx = 0 \end{equation*}

🔗

Example 6.1.13.

Show that $f(x) = x$ and $g(x) = x^{2}$ are orthogonal on $[-1,1]\text{.}$

🔗

\begin{equation*} \langle f, g \rangle = \int_{-1}^{1} x (x^{2}) \,dx = \int_{-1}^{1} x^{3} \, dx =0, \end{equation*}

since $x^{3}$ is an odd function and the interval is symmetric with respect to 0.

🔗

Definition 6.1.14.

The set $\{ f_{i} \}$ is an orthogonal set of functions on $[a,b]$ if $\langle f_{i}, f_{j} \rangle =0$ for all $i,j$ such that $i \neq j\text{.}$ If, in addition, the functions satisfy $||f_{i}||=1\text{,}$ for all $i\text{,}$ then the set is called orthonormal.

🔗

Note: a set of functions can be orthogonal with respect to a weight function $w(x)$ if the appropriate inner product is used. The following example shows this.

🔗

Example 6.1.15.

Show that the set of functions $\{1, 1-x,x^{2}/2-2x+1 \}$ are orthogonal with respect to the weight function $w(x) = e^{-x}$ on the interval $[0,\infty)\text{.}$

🔗

We need to show that each of the following inner products (integrals) are zero.

\begin{equation*} \begin{aligned} \langle 1,1-x\rangle\amp= \int_{0}^{\infty}1 \cdot (1-x) e^{-x}\, dx = xe^{-x}\biggr\vert_{0}^{\infty}=0 \\ \langle 1,x^{2}/2-2x+1\rangle\amp=\int_{0}^{\infty}1 \cdot (x^{2}/2-2x+1) e^{-x}\, dx \\\amp= e^{-x}(x^{3}/6-x^{2}-x) \biggr\vert_{0}^{\infty}= 0\\ \langle 1-x,x^{2}/2-2x+1\rangle\amp=\int_{0}^{\infty}(1-x)(x^{2}/2-2x+1)e^{-x}\, dx \\\amp= e^{-x}\biggl(-\frac{x^{4}}{8}+\frac{5x^{3}}{6}-\frac{3x^{2}}{2}+x\biggr) \biggl\vert_{0}^{\infty}=0 \end{aligned} \end{equation*}

so these form an orthogonal set. These three polynomials are the first three Laguerre Polynomials. There is an infinite set of these.

🔗

Prev Top Next