
Section 10.4 Linear Approximation

Subsection 10.4.1 Single Variable Interpretation

Consider a single-variable function \(f: \RR \rightarrow \RR\text{.}\) In Calculus I, we defined the linear approximation of \(f\) at the point \((a,f(a))\text{.}\)

\begin{equation*} f(x) \approx f(a) + f^\prime(a) (x-a) \end{equation*}

From Calculus II or Definition 2.2.1 you might recognize this as simply the first-order Taylor approximation for \(f\text{,}\) where we truncate after the linear term. The linear approximation is the line that best approximates \(f\) at this point. Its graph is simply the tangent line to \(f\) at \((a,f(a))\text{.}\)

We can rearrange the linear approximation to help us generalize to multivariable functions.

\begin{equation} (f(x) - f(a)) \approx f^\prime(a) (x-a)\label{equation-single-linear-approx}\tag{10.4.1} \end{equation}
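As a quick numerical sketch of Equation (10.4.1), the function \(f(x) = x^2\) and the point \(a = 3\) below are chosen purely for illustration:

```python
# Single-variable linear approximation f(x) ≈ f(a) + f'(a)(x - a),
# illustrated with f(x) = x^2 at a = 3 (choices are illustrative).
def f(x):
    return x**2

def fprime(x):
    return 2 * x

a = 3.0

def linear_approx(x):
    return f(a) + fprime(a) * (x - a)

# Near a the approximation is close; farther away it degrades.
print(linear_approx(3.1), f(3.1))  # ≈ 9.6 vs 9.61
print(linear_approx(4.0), f(4.0))  # ≈ 15.0 vs 16.0
```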

In this course, we have become accustomed to thinking of points in \(\RR^n\) along with local direction vectors: each point can be thought of as the origin for a system of local directions. We can do the same here: let \((a,f(a))\) be a local origin for a system of local directions. Then switching from \(f(x)\) to \(f(x) - f(a)\) and from \(x\) to \(x-a\) is just moving from the usual origin to this new, local, origin.

In these local coordinates, Equation (10.4.1) says the function is approximated by multiplication by \(f^\prime(a)\text{.}\) This gives us a new interpretation for the single-variable derivative: the derivative is the multiplicative factor for the local linear approximation of \(f\text{.}\) Locally, (in coordinates pretending that \((a,f(a))\) is the origin), the function is approximated by multiplication by \(f^\prime(a)\text{.}\)

Now let's think of a function \(f: \RR^2 \rightarrow \RR\text{.}\) If we want to approximate it by linear functions, we need to understand the linear functions \(\RR^2 \rightarrow \RR\text{.}\) We look to matrices, since matrices completely describe linear functions. A \(1 \times 1\) matrix is just a number, and the corresponding linear function \(\RR \rightarrow \RR\) is just multiplication by that number. But for \(\RR^2 \rightarrow \RR\text{,}\) we have a \(1 \times 2\) matrix.

Subsection 10.4.2 Linear Approximation in \(\RR^2\)

So we ask: what is a linear approximation to a function \(f: \RR^2 \rightarrow \RR\text{?}\) It must be a \(1 \times 2\) matrix \(M\) that fits into an equation mirroring Equation (10.4.1).

\begin{equation*} f(x,y) - f(a,b) \approx M \begin{pmatrix}x-a\\y-b\end{pmatrix} \end{equation*}

In local coordinates at \((a,b,f(a,b))\text{,}\) this is just matrix multiplication by \(M\text{.}\) So a linear approximation is a matrix multiplication in local coordinates. What is the matrix \(M\text{?}\) Well, if we work with partial derivatives, the linear approximation should be built from the linear approximations in \(x\) and in \(y\text{.}\)

\begin{equation*} f(x,y) \approx f(a,b) + \frac{\del f}{\del x} (a,b) (x-a) + \frac{\del f}{\del y} (a,b) (y-b) \end{equation*}

We put this in matrix form.

\begin{equation*} f(x,y) - f(a,b) \approx \left( \begin{matrix} \frac{\del f}{\del x}(a,b) \amp \frac{\del f}{\del y}(a,b) \end{matrix} \right) \begin{pmatrix}x-a\\y-b\end{pmatrix} \end{equation*}
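The matrix form above can be sketched numerically: we estimate the two partial derivatives by central differences, form the row matrix \(M\text{,}\) and compare the linear approximation against the true value near \((a,b)\text{.}\) The function and points here are chosen only for illustration.

```python
# Local linear approximation in two variables: the 1 x 2 matrix M of
# partial derivatives multiplies the local displacement (x-a, y-b).
# The function f(x, y) = x^2 * y is an illustrative choice.
def f(x, y):
    return x**2 * y

def partials(f, a, b, h=1e-6):
    # Central-difference estimates of the entries of M
    fx = (f(a + h, b) - f(a - h, b)) / (2 * h)
    fy = (f(a, b + h) - f(a, b - h)) / (2 * h)
    return fx, fy

a, b = 1.0, 2.0
fx, fy = partials(f, a, b)          # exact values: fx = 4, fy = 1
x, y = 1.05, 1.9
approx = f(a, b) + fx * (x - a) + fy * (y - b)
print(approx, f(x, y))              # close near (a, b)
```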
Definition 10.4.1.

The matrix of the linear approximation to a scalar function \(f: \RR^n \rightarrow \RR\) is the \(1 \times n\) matrix of partial derivatives.

The graph of the linear approximation is the tangent plane that we've already defined at \((a,b,f(a,b))\text{.}\)

Subsection 10.4.3 Examples

Let's return to the function in Example 10.3.2, \(f(x,y) = \frac{1}{1 + x^2 + y^2}\text{.}\)

\begin{equation*} M = \left( \begin{matrix} \frac{-2x}{(1+x^2+y^2)^2} \amp \frac{-2y}{(1+x^2+y^2)^2} \end{matrix} \right) \end{equation*}

Look at the point \((0,0)\text{.}\)

\begin{equation*} f(x,y) \approx f(0,0) + \left( \begin{matrix} 0 \amp 0 \end{matrix} \right) \begin{pmatrix}x\\y\end{pmatrix} = 1 \end{equation*}

This linear approximation is the constant \(1\text{,}\) which makes sense at the top of the small hill. Momentarily, at the peak, nothing is changing: to first order, the function doesn't do anything. The linear approximation to doing nothing is appropriately a constant.

Look at the point \((1,1)\text{.}\)

\begin{equation*} f(x,y) \approx f(1,1) + \left( \begin{matrix} \frac{-2}{9} \amp \frac{-2}{9} \end{matrix} \right) \begin{pmatrix}x-1\\y-1\end{pmatrix} = \frac{1}{3} - \frac{2(x-1)}{9} - \frac{2(y-1)}{9} \end{equation*}

Look at the point \((-2,2)\text{.}\)

\begin{equation*} f(x,y) \approx f(-2,2) + \left( \begin{matrix} \frac{4}{81} \amp \frac{-4}{81} \end{matrix} \right) \begin{pmatrix}x+2\\y-2\end{pmatrix} = \frac{1}{9} + \frac{4(x+2)}{81} - \frac{4(y-2)}{81} \end{equation*}
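The three evaluations of \(M\) above are easy to check by direct computation; here is a short script doing so:

```python
# Evaluating the matrix of partials for f(x, y) = 1/(1 + x^2 + y^2):
# M = ( -2x/(1+x^2+y^2)^2   -2y/(1+x^2+y^2)^2 )
def f(x, y):
    return 1.0 / (1.0 + x**2 + y**2)

def M(x, y):
    d = (1.0 + x**2 + y**2) ** 2
    return (-2 * x / d, -2 * y / d)

print(M(0, 0))    # (0.0, 0.0)
print(M(1, 1))    # (-2/9, -2/9)
print(M(-2, 2))   # (4/81, -4/81)
```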

Here is a second example: \(f(x,y) = x^2 e^{x-y}\text{.}\)

\begin{equation*} M = \left( \begin{matrix} 2xe^{x-y} + x^2e^{x-y} \amp -x^2 e^{x-y} \end{matrix} \right) \end{equation*}

Look at the point \((2,2)\text{.}\)

\begin{equation*} f(x,y) \approx f(2,2) + \left( \begin{matrix} 8 \amp -4 \end{matrix} \right) \begin{pmatrix}x-2\\y-2\end{pmatrix} = 4 + 8(x-2) - 4(y-2) \end{equation*}

Look at the point \((-1,-1)\text{.}\)

\begin{equation*} f(x,y) \approx f(-1,-1) + \left( \begin{matrix} -1 \amp -1 \end{matrix} \right) \begin{pmatrix}x+1\\y+1\end{pmatrix} = 1 - (x+1) - (y+1) \end{equation*}
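Note that at both points \(x - y = 0\text{,}\) so \(e^{x-y} = 1\) and the entries of \(M\) are plain numbers. A quick check:

```python
import math

# Evaluating the matrix of partials for f(x, y) = x^2 * e^(x-y):
# M = ( 2x e^(x-y) + x^2 e^(x-y)   -x^2 e^(x-y) )
def M(x, y):
    e = math.exp(x - y)
    return ((2 * x + x**2) * e, -(x**2) * e)

print(M(2, 2))     # (8.0, -4.0), since e^0 = 1
print(M(-1, -1))   # (-1.0, -1.0)
```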

Subsection 10.4.4 What Is A Derivative?

We extended the idea of a derivative with partial derivatives and directional derivatives. These are both useful, but neither is a universal extension. Both capture only pieces of the derivative along certain directions. Gradients were a more universal extension, but they only identify the direction of greatest change. Tangent planes were a good geometric extension, but without an algebraic analogue.

The discussion of linear approximation leads us to a much more universal idea to generalize the derivative. The big idea is this: derivatives, in any dimension, are linear approximations to functions. This idea works in single variables, where multiplication by \(f^\prime(a)\) was the linear approximation. The derivative calculates that factor. For higher dimensions, we can use matrix multiplication instead of just multiplication by a number.

Therefore, we could realistically say that the derivative of a function \(f: \RR^n \rightarrow \RR^m\) is the matrix of partial derivatives, which serves as a linear approximation of the function at any point in its domain. It will be useful to give this matrix a name, for future reference. I'll state the definition so that it works for scalar fields (\(m=1\)) for this course, as well as for vector fields (\(m \gt 1\)), which is important in Calculus IV.

Definition 10.4.4.

Let \(f: \RR^n \rightarrow \RR^m\) be a differentiable function. Then the \(m \times n\) matrix of partial derivatives of \(f\) is called the Jacobian Matrix of the function.
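As a computational sketch of this definition, the \(m \times n\) Jacobian matrix can be estimated by central differences on each input variable. The function \(f(x,y) = (x^2 y,\ x + y^2)\) below is an illustrative choice; its exact Jacobian at \((1,2)\) is \(\begin{pmatrix} 4 \amp 1 \\ 1 \amp 4 \end{pmatrix}\text{.}\)

```python
# Finite-difference estimate of the m x n Jacobian matrix of
# f: R^n -> R^m (names and the sample function are illustrative).
def f(v):
    x, y = v
    return (x**2 * y, x + y**2)

def jacobian(f, v, h=1e-6):
    n = len(v)
    m = len(f(v))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        vp = list(v); vp[j] += h
        vm = list(v); vm[j] -= h
        fp, fm = f(vp), f(vm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J  # column j holds the partials with respect to variable j

J = jacobian(f, [1.0, 2.0])
# exact Jacobian at (1,2): [[2xy, x^2], [1, 2y]] = [[4, 1], [1, 4]]
print(J)
```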