Skip to main content

Section 6.1 Introduction to Approximation Theory

A very important subject in numerical analysis is that of approximation theory. In a nutshell, one desires to approximate one function by another function. For example, let \(g(x)=e^{x}\text{.}\) Although most computers and calculators will quickly evaluate this, let’s ignore that fact and try to think of how to do it with only the standard 4 operations: addition, subtraction, multiplication and division.
In order to evaluate \(e^{x}\) for arbitrary values of \(x\text{,}\) one way from calculus is to use the Maclaurin series
\begin{equation*} e^{x} = 1 + x + \frac{x^{2}}{2!}+ \frac{x^{3}}{3!}+ \cdots + \frac{x^{n}}{n!}+ \cdots. \end{equation*}
We learned in calculus that this series converges for any value of \(x\)—the problem is that it may take a huge number of terms to find \(e^{x}\) to a desired accuracy. In this section, we learn how to approximate functions like \(e^{x}\) or a set of data using other functions, generally polynomials and rational functions.

Subsection 6.1.1 Norms of Functions

As discussed, we seek a function (often a polynomial) that is “close” to the target function. In this section, we define carefully what this means. In order to do this, we discuss norms.

Definition 6.1.1.

A norm is a function \(|| \cdot ||: V \rightarrow \R\) over a field \(F\text{,}\) where \(V\) is a vector space, that satisfies the following three properties for all \(\mathbf{u}, \mathbf{v}\in V\) and \(a \in F\text{,}\)
  1. \(||a \mathbf{u}|| = |a| \; ||\mathbf{u}||\text{,}\)
  2. \(||\mathbf{u}+ \mathbf{v}|| \leq ||\mathbf{u}|| + ||\mathbf{v}||,\quad\) (This is called the triangle inequality.)
  3. \(||\mathbf{u}|| \geq 0\) and \(||\mathbf{u}||=0\) if and only if \(\mathbf{u}=\mathbf{0}\text{,}\) the zero vector.
Generally the field \(F\) is either \(\R\) or \(\mathbb{C}\text{,}\) the sets of reals and complex numbers respectively.
If the vector space \(V = \R^{n}\text{,}\) the set of all real vectors of length \(n\text{,}\) then a often-used norm is the Euclidean norm. If \(\mathbf{v}\in V\) and
\begin{equation*} \mathbf{v}= \begin{bmatrix}v_{1} \\ v_{2} \\ \vdots \\ v_{n}\end{bmatrix} \end{equation*}
then the Euclidean norm is
\begin{equation*} ||\mathbf{v}||_{2} = \sqrt{v_{1}^{2} + v_{2}^{2} + \cdots v_{n}^{2}} \end{equation*}
and this can be shown to satisfy the three properties above.
The Euclidean norm gives the length of the vector in \(\R^{2}\) or \(\R^{3}\) and can be used to define the length in higher orders. In addition, the norm can be used to determine the distance between two point. If \(\mathbf{x}\) and \(\mathbf{y}\) are vectors in \(\R^{n}\text{,}\) then the distance between \(\mathbf{x}\) and \(\mathbf{y}\) is \(||\mathbf{x}-\mathbf{y}||_{2}\) and reduce to the distance between points in \(\R^{2}\) and \(\R^{3}\text{.}\)
Recall that the notion of vector space is broad. Although it often applies to vectors and matrices, the set of polynomials of a given degree and the set of continuous functions on \([a,b]\) are examples of vector spaces.
Three standard norms are often called the \(\ell_{1}, \ell_{2}\) and \(\ell_{\infty}\text{.}\) For \(\mathbf{u}\in \R^{n}\text{,}\) the norms \(\ell_{1}, \ell_{2}\) and \(\ell_{\infty}\) are defined as
\begin{equation*} \begin{aligned} ||u||_{1} \amp = \sum_{i=1}^{n} |u_{i}|, \\ ||u||_{2} \amp = \biggl(\sum_{i=1}^{n} u_{i}^{2}\biggr)^{1/2}, \\ ||u||_{\infty} \amp= \max_{i=1,2,\ldots,n}|u_{i}|. \end{aligned} \end{equation*}
The \(\ell_{1}\text{,}\) \(\ell_{2}\) and \(\ell_{\infty}\) norms of a function \(f\) on an interval \([a,b]\) are defined
\begin{equation*} \begin{aligned} ||f||_{1} \amp = \int_{a}^{b} |f(x)| \, dx, \\ ||f||_{2}\amp= \biggl( \int_{a}^{b} f(x)^{2} \, dx \biggr)^{1/2}, \\ ||f||_{\infty} \amp= \max_{x \in[a,b]}|f(x)|. \end{aligned} \end{equation*}
We will see that these norms play a large role in approximation theory. In short, we are looking for a function to approximate another function or a set of data. We will use one of the norms to determine how far the two are away from each other.

Subsection 6.1.2 Inner Product Spaces

The vector space \(V\) is called an inner product space if for every pair of vectors \(\bu\) and \(\bv\) in \(V\) there is a unique number \(\langle \bu, \bv \rangle\text{,}\) called the inner product of \(\bu\) and \(\bv\) such that if \(\bu, \bv\) and \(\bw\) are in \(V\) and \(\alpha \in \R\) then
  1. \(\langle \bu, \bv \rangle = \langle \bv, \bu \rangle\text{,}\)
  2. \(\langle a \bu + b \bv, \bw \rangle = a \langle \bu, \bw \rangle + b \langle \bv, \bw \rangle\text{,}\)
  3. \(\langle \bu, \bu \rangle \geq 0\) and equals 0 if and only if \(\bu = \bzero\text{.}\)

Example 6.1.2.

Show that \(\R^{n}\) is an inner product space with \(\bu=(u_{1}, u_{2}, \ldots, u_{n})^{\intercal}\) and \(\bv=(v_{1},v_{2}, \ldots,v_{n})^{\intercal}\) and \(\langle \bu, \bv \rangle = \bu^{\intercal} \bv\text{.}\)
First, the set \(\R^{n}\) is a vectors space. Next, we need to show that the three properties are satisfied.
  1. \begin{equation*} \langle \bu, \bv \rangle = \bu^{\intercal} \bv = (\bu^{\intercal} \bv)^{\intercal} = \bv^{\intercal} (\bu^{\intercal})^{\intercal} = \bv^{\intercal} \bu = \langle \bv, \bu \rangle. \end{equation*}
    so the first property is satisfied.
  2. \begin{equation*} \begin{aligned} \langle a \bu + b \bv, \bw \rangle\amp= (a \bu + b\bv)^{\intercal} \bw = (a \bu^{\intercal} + b \bv^{\intercal}) \bw \\ \amp= a \bu^{\intercal} \bw + b \bv^{\intercal} \bw = a \langle \bu, \bw \rangle + b \langle \bv, \bw \rangle. \end{aligned} \end{equation*}
  3. \begin{equation*} \langle \bu, \bu \rangle = \bu^{\intercal} \bu = u_{1}^{2} + u_{2}^{2} + u_{3}^{2} + \cdots + u_{n}^{2} \end{equation*}
    which satisfies \(\geq 0\text{.}\) The only time that this quantity equals 0, is when \(u_{1}=u_{2} = \cdots =u_{n}=0\) or \(\bu = \bzero\text{.}\)
The example above shows a standard way of demonstrating that something is an inner product space. Using the definition of the inner product, each property is clearly shown.

Subsection 6.1.3 Vector Norms and distance

Definition 6.1.3.

Let \(\bu\) and \(\bv\) be elements of an inner product space. The norm of the vector \(\bu\) is given by
\begin{equation*} || \bu ||= \sqrt{\langle \bu,\bu \rangle} \end{equation*}
Note: if \(\bu \in \R^{n}\) then the vector norm is the length of the vector.

Definition 6.1.4.

The distance between vectors \(\bu\) and \(\bv\) denoted \(d(\bu,\bv)\) and defined as \(d(\bu,\bv)=||\bu-\bv||\text{.}\)
Note: if \(\bu\) and \(\bv\) are in \(\R^{n}\text{,}\) then the distance function is the standard distance function.

Proof.

Let \(||\bu-\bv||=0\text{,}\) then \(\langle \bu-\bv,\bu-\bv \rangle=0\) By definition, this is only zero if \(\bu-\bv=\bzero\text{,}\) therefore \(\bu=\bv\text{.}\)
If \(\bu=\bv\text{,}\) then \(||\bu-\bv|| = ||\bzero||=0\text{.}\)

Subsection 6.1.4 Angles between vectors

If \(\bu\) and \(\bv\) are vectors in \(\R^{2}\) or \(\R^{3}\) then
\begin{equation*} \cos \theta = \frac{\langle \bu,\bv \rangle}{||\bu||\, || \bv||} \end{equation*}
and thus the angle between the vectors can be found.
This notion generalizes to any vectors in an inner product space, \(V\text{.}\) Most helpful, two vectors meet at a right angle if \(\langle \bu,\bv\rangle=0\text{.}\)

Subsection 6.1.5 Orthonormal sets of vectors

Definition 6.1.6.

Let \(\{\bv_{1}, \bv_{2}, \ldots, \bv_{n}\}\) each be elements of an inner product space, \(V\text{.}\) The set is called an orthonormal set if
\begin{equation*} \langle \bv_{i}, \bv_{j} \rangle= 0 \qquad \text{if $i \neq j$} \end{equation*}
and
\begin{equation*} ||\bv_{i}|| = \sqrt{\langle \bv_{i}, \bv_{i} \rangle}= 1 \end{equation*}
If only the first condition holds, the set is called orthogonal.

Example 6.1.7.

Show that
\begin{equation*} \begin{aligned} \bv_{1}\amp= \begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix},\amp\bv_{2}\amp= \begin{bmatrix}0 \\ 1/\sqrt{2} \\ 1/\sqrt{2}\end{bmatrix},\amp\bv_{3}\amp= \begin{bmatrix}0 \\ 1/\sqrt{2} \\ -1/\sqrt{2}\end{bmatrix} \end{aligned} \end{equation*}
is an orthonormal set where each vector is an element of \(\R^{3}\text{.}\)
Note: this is also a basis for \(\R^{3}\text{.}\)

Proof.

If \(\bv=\bzero\text{,}\) then the inequality is satisfied. Assume \(\bv \neq \bzero\) and define
\begin{equation*} \lambda = \langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle. \end{equation*}
By definition of the inner product,
\begin{equation*} \begin{aligned} 0 \leq\amp\langle \bu - \lambda \bv, \bu -\lambda \bv \rangle \\ \amp\qquad \text{Using properties of the inner product,}\\ 0 \leq\amp\langle \bu, \bu \rangle - 2 \lambda \langle \bu, \bv \rangle + \lambda^{2} \langle \bv,\bv \rangle \\ =\amp\langle \bu,\bu \rangle - 2 \langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle \langle \bu, \bv \rangle + \bigl(\langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle \bigr)^{2} \langle \bv,\bv \rangle. \\ =\amp\langle \bu, \bu \rangle - \langle \bv,\bv \rangle^{-1}\langle \bu,\bv \rangle^{2} \\ \amp \qquad \text{Multiply through by $\langle \bv,\bv\rangle$}\\ =\amp\langle \bu,\bu \rangle \langle \bv,\bv \rangle - \langle \bu,\bv \rangle^{2} \end{aligned} \end{equation*}
This can be rearranged to get:
\begin{equation*} \langle \bu,\bv \rangle^{2} \leq \langle \bu,\bu \rangle \langle \bv, \bv \rangle \end{equation*}
and taking the square root, you get the desired result.

Subsection 6.1.6 Inner Products of functions

Above, we defined an inner product space and for both motivation as well as the examples shown, we used \(\R^{n}\text{.}\) This is a good space to become familiar with inner product spaces, because of the familiarity of the spaces \(\R^{2}\) and \(\R^{3}\text{.}\) In this section we will apply an inner product to a function on an interval \([a,b]\text{.}\)

Definition 6.1.9.

The inner product between functions \(f\) and \(g\text{,}\) denoted \(\langle f,g\rangle\) on the interval \([a,b]\) is defined to be:
\begin{equation} \langle f,g \rangle = \int_{a}^{b} f(x) g(x) \, dx.\tag{6.1.1} \end{equation}

Example 6.1.10.

Show that the set of continuous function on \([a,b]\) with the inner product defined above is an inner product space.
Solution.
We need to show that the three properties that define the inner product are valid for (6.1.1).
  1. \begin{equation*} \langle f, g \rangle = \int_{a}^{b} f(x) g(x) \, dx = \int_{a}^{b} g(x) f(x) \, dx = \langle g, f \rangle. \end{equation*}
  2. For \(c_{1}\) and \(c_{2}\text{,}\) real numbers,
    \begin{equation*} \langle c_{1} f + c_{2} g,h \rangle = \int_{a}^{b} (c_{1} f +c_{2} g)h \, dx = \int_{a}^{b} (c_{1} f h + c_{2} gh) \, dx \end{equation*}
    due to the linearity of integrals,
    \begin{equation*} \phantom{\langle c_{1} f + c_{2} g,h \rangle} = c_{1} \int_{a}^{b} f h \, dx + c_{2} \int_{a}^{b} g h\, dx = c_{1} \langle f, h \rangle + c_{2} \langle g,h \rangle. \end{equation*}
  3. \begin{equation*} \langle f, f \rangle = \int_{a}^{b} f^{2} \, dx \end{equation*}
    And since \(f^{2} \geq 0\text{,}\)
    \begin{equation*} \langle f, f \rangle\geq 0. \end{equation*}
    If \(f \equiv 0\text{,}\) then \(\int_{a}^{b} f^{2} \, dx =0\text{.}\) If \(\int_{a}^{b} f^{2} \, dx = 0\text{,}\) then \(f \equiv 0\text{.}\) The second statement can be shown by contradiction. That is, assume that \(f \not \equiv 0\text{,}\) and the result is that \(\int_{a}^{b} f^{2} \, dx \neq 0\text{,}\) therefore the assumption that \(f \not \equiv 0\) is not correct cannot hold.
So the set of continuous functions on \([a,b]\) with the given inner product is an inner product space.
As we saw above with vectors in \(\R^{n}\text{,}\) there is a relationship between norms and inner products. This is true with functions as well. The \(\ell_{2}\)-norm of the function \(f\) is given by
\begin{equation*} ||f ||_{2} = \sqrt{\langle f,f \rangle}= \biggl(\int_{a}^{b} f^{2} \, dx\biggr)^{1/2}. \end{equation*}

Subsection 6.1.7 Weighted Inner Products

Definition 6.1.11.

The inner product between functions \(f\) and \(g\) on the interval \([a,b]\) with respect to weight function \(w(x) \geq 0\) is defined to be:
\begin{equation*} \langle f,g \rangle_{w} = \int_{a}^{b} f(x) g(x) w(x) \, dx \end{equation*}
and in a similar vein,
\begin{equation*} ||f||_{w} = \langle f,f\rangle_{w}^{1/2} \end{equation*}

Example 6.1.12.

Find the weighted inner product between \(f(x)=x\) and \(f(x)=x^{2}\) with weight function \(w(x)=e^{-x}\) on the interval \([0,\infty)\text{.}\)
\begin{equation*} \langle x,x^{2}\rangle_{e^{-x}}= \int_{0}^{\infty}x^{3} e^{-x}\, dx \end{equation*}
using tabular integration
\begin{equation*} = -(x^{3}+3x^{2}+6x+6)e^{-x}\biggr|_{0}^{\infty}= 6 \end{equation*}

Subsection 6.1.8 Orthogonal Functions

Again, as we saw above, two vectors \(\bu\) and \(\bv\) in \(\R^{n}\) are orthogonal if \(\langle \bu, \bv \rangle=0\text{.}\) Similarly, we can define two functions \(f\) and \(g\) to be orthogonal on \([a,b]\) if
\begin{equation*} \langle f, g \rangle = \int_{a}^{b} f g \, dx = 0 \end{equation*}

Example 6.1.13.

Show that \(f(x) = x\) and \(g(x) = x^{2}\) are orthogonal on \([-1,1]\text{.}\)
\begin{equation*} \langle f, g \rangle = \int_{-1}^{1} x (x^{2}) \,dx = \int_{-1}^{1} x^{3} \, dx =0, \end{equation*}
since \(x^{3}\) is an odd function and the interval is symmetric with respect to 0.

Definition 6.1.14.

The set \(\{ f_{i} \}\) is an orthogonal set of functions on \([a,b]\) if \(\langle f_{i}, f_{j} \rangle =0\) for all \(i,j\) such that \(i \neq j\text{.}\) If, in addition, the functions satisfy \(||f_{i}||=1\text{,}\) for all \(i\text{,}\) then the set is called orthonormal.
Note: a set of functions can be orthogonal with respect to a weight function \(w(x)\) if the appropriate inner product is used. The following example shows this.

Example 6.1.15.

Show that the set of functions \(\{1, 1-x,x^{2}/2-2x+1 \}\) are orthogonal with respect to the weight function \(w(x) = e^{-x}\) on the interval \([0,\infty)\text{.}\)
We need to show that each of the following inner products (integrals) are zero.
\begin{equation*} \begin{aligned} \langle 1,1-x\rangle\amp= \int_{0}^{\infty}1 \cdot (1-x) e^{-x}\, dx = xe^{-x}\biggr\vert_{0}^{\infty}=0 \\ \langle 1,x^{2}/2-2x+1\rangle\amp=\int_{0}^{\infty}1 \cdot (x^{2}/2-2x+1) e^{-x}\, dx \\\amp= e^{-x}(x^{3}/6-x^{2}-x) \biggr\vert_{0}^{\infty}= 0\\ \langle 1-x,x^{2}/2-2x+1\rangle\amp=\int_{0}^{\infty}(1-x)(x^{2}/2-2x+1)e^{-x}\, dx \\\amp= e^{-x}\biggl(-\frac{x^{4}}{8}+\frac{5x^{3}}{6}-\frac{3x^{2}}{2}+x\biggr) \biggl\vert_{0}^{\infty}=0 \end{aligned} \end{equation*}
so these form an orthogonal set. These three polynomials are the first three Laguerre Polynomials. There is an infinite set of these.