In this note, we introduce two versions of the proof of the Chain Rule. The first one comes from [1]. Let $y=f(u)$ and $u=g(x)$ be differentiable functions. We claim that
$$\frac{dy}{dx}=f'(u)g'(x)$$
The finite difference $\frac{f(g(x+h))-f(g(x))}{h}$ can be written as $\frac{f(u+k)-f(u)}{h}$ where $k=g(x+h)-g(x)$. Define $\varphi(t)=\frac{f(u+t)-f(u)}{t}-f'(u)$ if $t\ne 0$. Multiplying by $t$ and rearranging terms, we obtain
$$
f(u+t)-f(u)=t[\varphi(t)+f'(u)] \tag{1}
$$
Since $f$ is differentiable at $u$, $\lim_{t\to 0}\varphi(t)=0$, so we may define $\varphi(0)=0$. Then (1) holds for all $t$, including $t=0$. Now replace $t$ in (1) by $k$ and divide by $h\ne 0$:
$$
\frac{f(u+k)-f(u)}{h}=\frac{k}{h}[\varphi(k)+f'(u)] \tag{2}
$$
(2) is valid even if $k=0$. When $h\to 0$, $\frac{k}{h}\to g'(x)$. Moreover, $k\to 0$ because $g$ is continuous at $x$, and hence $\varphi(k)\to 0$ because $\varphi$ is continuous at $0$. Hence the RHS of (2) approaches $f'(u)g'(x)$. This completes the proof.
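As a numerical illustration of identity (2), the following minimal sketch compares the difference quotient of the composite with the right-hand side of (2). The choices $f=\sin$, $g(x)=x^2$, and the constant function (for which $k=0$ for every $h$) are assumptions made only for this example, as are the helper names `phi`, `lhs`, and `rhs`.

```python
import math

def f(u):
    return math.sin(u)

def df(u):
    return math.cos(u)

def phi(t, u):
    """phi(t) from the text, with the extension phi(0) = 0."""
    if t == 0.0:
        return 0.0
    return (f(u + t) - f(u)) / t - df(u)

def lhs(g, x, h):
    """Difference quotient of the composite: (f(g(x+h)) - f(g(x))) / h."""
    return (f(g(x + h)) - f(g(x))) / h

def rhs(g, x, h):
    """Right-hand side of (2): (k/h) * [phi(k) + f'(u)]."""
    u = g(x)
    k = g(x + h) - g(x)
    return (k / h) * (phi(k, u) + df(u))

def square(x):
    return x * x

def constant(x):
    return 2.0  # k = 0 for every h, the case a naive division by k cannot handle

if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-6):
        print(h, lhs(square, 1.0, h), rhs(square, 1.0, h), rhs(constant, 1.0, h))
```

For `square` the two sides agree up to rounding and approach $f'(u)g'(x)=2\cos 1$ as $h$ shrinks; for `constant` both sides are exactly zero, with no division by $k$ anywhere.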
Another version of the proof of the Chain Rule appears in [2] as a guided exercise (\#99 on p. 559). Here we suppose that $y=f(u)$ is differentiable at $u_0=g(x_0)$ and $u=g(x)$ is differentiable at $x_0$. Then we claim that $y=f(g(x))$ is differentiable at $x=x_0$ and $$\left[\frac{dy}{dx}\right]_{x=x_0}=f'(u_0)g'(x_0)$$
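Write $x=x_0+\Delta x$, $u=g(x)$, $\Delta u=g(x_0+\Delta x)-g(x_0)$ and $\Delta y=f(u_0+\Delta u)-f(u_0)$. Since $g'(x_0)$ exists, $\Delta u$ can be written as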
$$\Delta u=g'(x_0)\Delta x+\rho(x)$$
where $\lim_{\Delta x\to 0}\frac{\rho(x)}{\Delta x}=0$. Similarly, since $f'(u_0)$ exists, if $\Delta u\ne 0$ (it may vanish even when $\Delta x\ne 0$), then $\Delta y$ can be written as
$$
\Delta y=f'(u_0)\Delta u+\sigma(u) \tag{3}
$$
where $\lim_{\Delta u\to 0}\frac{\sigma(u)}{\Delta u}=0$. When $\Delta u=0$ we also have $\Delta y=0$, so defining $\sigma(u)=0$ in that case keeps (3) valid even if $\Delta u=0$. Substituting the expression for $\Delta u$ into (3), with $u=g(x)$, we obtain
\begin{align*}
\Delta y&=f'(u_0)[g'(x_0)\Delta x+\rho(x)]+\sigma(g(x))\\
&=f'(u_0)g'(x_0)\Delta x+f'(u_0)\rho(x)+\sigma(g(x))
\end{align*}
Moreover,
$$\frac{\sigma(g(x))}{\Delta x}=\left\{\begin{array}{ccc}
\frac{\sigma(g(x))}{\Delta u}\cdot\frac{\Delta u}{\Delta x} & \mbox{if} & \Delta u\ne 0\\
0 & \mbox{if} & \Delta u=0\end{array}\right.\to 0$$
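as $\Delta x\to 0$, since then $\Delta u\to 0$, so that $\frac{\sigma(g(x))}{\Delta u}\to 0$ while $\frac{\Delta u}{\Delta x}\to g'(x_0)$. Therefore,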
$$\frac{\Delta y}{\Delta x}=f'(u_0)g'(x_0)+f'(u_0)\frac{\rho(x)}{\Delta x}+\frac{\sigma(g(x))}{\Delta x}$$
approaches
$$\frac{dy}{dx}=f'(u_0)g'(x_0)$$
as $\Delta x\to 0$.
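The role of the remainders $\rho$ and $\sigma$ can be checked numerically. The sketch below assumes, purely for illustration, $f=\exp$, $g(x)=x^3$ and $x_0=1$; the helper name `remainders` is hypothetical. It computes $\rho$ and $\sigma$ directly from their defining equations and shows that both ratios $\rho/\Delta x$ and $\sigma/\Delta x$ shrink with $\Delta x$.

```python
import math

def g(x):
    return x ** 3

def dg(x):
    return 3 * x ** 2

def f(u):
    return math.exp(u)

def df(u):
    return math.exp(u)

def remainders(x0, dx):
    """Return (rho, sigma, dy) for the increment dx at x0.

    rho   = du - g'(x0) * dx      (remainder in the expansion of du)
    sigma = dy - f'(u0) * du      (remainder in equation (3))
    """
    u0 = g(x0)
    du = g(x0 + dx) - u0
    dy = f(u0 + du) - f(u0)
    rho = du - dg(x0) * dx
    sigma = dy - df(u0) * du
    return rho, sigma, dy

if __name__ == "__main__":
    x0 = 1.0
    for dx in (1e-1, 1e-2, 1e-3, 1e-4):
        rho, sigma, dy = remainders(x0, dx)
        # The last column should approach f'(u0) * g'(x0) = 3e.
        print(dx, rho / dx, sigma / dx, dy / dx)
```

Under these assumptions both ratio columns decrease roughly in proportion to $\Delta x$, and the quotient $\Delta y/\Delta x$ settles near $f'(u_0)g'(x_0)=3e$.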
Update: Here is yet another version of the proof of the chain rule, under the stronger assumption that $f$ has a convergent Taylor series about $u$. Suppose that $y=f(u)$ and $u=g(x)$ are differentiable. By Taylor series expansion, we obtain
\begin{align*}
\Delta y&=f(u+\Delta u)-f(u)\\
&=f(u)+f'(u)\Delta u+\frac{f''(u)}{2!}(\Delta u)^2+\frac{f'''(u)}{3!}(\Delta u)^3+\cdots -f(u)\\
&=f'(u)\Delta u+\frac{f''(u)}{2!}(\Delta u)^2+\cdots \tag{4}
\end{align*}
Dividing (4) by $\Delta x$, we have
$$\frac{\Delta y}{\Delta x}=f'(u)\frac{\Delta u}{\Delta x}+\frac{f''(u)}{2!}\frac{\Delta u}{\Delta x}\Delta u+\frac{f'''(u)}{3!}\frac{\Delta u}{\Delta x}(\Delta u)^2+\cdots \tag{5}
$$
As $\Delta x\to 0$, (5) approaches
$$
\frac{dy}{dx}=f'(u)\frac{du}{dx}+\frac{f''(u)}{2!}\frac{du}{dx}du+\frac{f'''(u)}{3!}\frac{du}{dx}(du)^2+\cdots \tag{6}
$$
Since $g$ is continuous, $\Delta u\to 0$ as $\Delta x\to 0$, so every term after the first, each of which carries a positive power of $du$, can be neglected. Consequently, (6) becomes
$$\frac{dy}{dx}=f'(u)\frac{du}{dx}$$
This completes the proof.
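To see how the terms of (5) behave, here is a small numerical sketch. It assumes $f=\exp$ (so that every derivative $f^{(n)}(u)$ equals $e^u$) and $g=\sin$; these choices and the helper name `terms_of_5` are illustrative only. The $n$-th term is $\frac{f^{(n)}(u)}{n!}\frac{\Delta u}{\Delta x}(\Delta u)^{n-1}$.

```python
import math

def g(x):
    return math.sin(x)

def terms_of_5(x, dx, n_terms=4):
    """First n_terms terms on the right-hand side of (5) for f = exp."""
    u = g(x)
    du = g(x + dx) - u
    return [
        math.exp(u) / math.factorial(n) * (du / dx) * du ** (n - 1)
        for n in range(1, n_terms + 1)
    ]

if __name__ == "__main__":
    x = 0.5
    for dx in (1e-1, 1e-2, 1e-3, 1e-4):
        print(dx, terms_of_5(x, dx))
```

As $\Delta x$ decreases, the first term tends to $f'(u)\frac{du}{dx}=e^{\sin x}\cos x$, while each later term shrinks like a power of $\Delta u$.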
References:
- [1] Tom M. Apostol, Calculus, Volume I: One-Variable Calculus with an Introduction to Linear Algebra, 2nd Edition, John Wiley & Sons, Inc., 1967
- [2] Jerrold Marsden and Alan Weinstein, Calculus II, Springer-Verlag, 1985