Mathphys Archive: June 2025

Monday, June 30, 2025

Calculus 23: Mean Value Theorem

The following theorem is something that can be easily understood by intuition.

Theorem. [Rolle's Theorem]
Let $f$ be continuous on the closed interval $[a,b]$ and differentiable on the open interval $(a,b)$. If $f(a)=f(b)$, then there exists a number $c$ in $(a,b)$ such that $f'(c)=0$.

Example. Show that the equation $x^3+x-1=0$ has exactly one real root.

Solution. Let $f(x)=x^3+x-1$. Note that $f(0)=-1$ and $f(1)=1$. So by the Intermediate Value Theorem, there exists at least one root of the equation $x^3+x-1=0$ in the interval $(0,1)$. Now suppose that there are two different roots $a$ and $b$ of the equation $x^3+x-1=0$. Without loss of generality, we may assume that $a<b$. Then $f(x)$ is continuous on $[a,b]$, differentiable on $(a,b)$, and $f(a)=f(b)=0$. By Rolle's Theorem, there exists a number $c$ in $(a,b)$ such that $f'(c)=0$. However, $f'(x)=3x^2+1\geq 1$ for all real numbers $x$. This is a contradiction. Therefore, the equation has exactly one real root.

Figure 1. The graph of $f(x)=x^3+x-1$
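The argument above shows that the root exists and is unique, but it does not say where it is. As a rough numerical companion, here is a minimal bisection sketch. The helper name `bisect` and the tolerance are illustrative choices, not part of the original note. It relies only on the sign change $f(0)<0<f(1)$ used in the solution.

```python
def f(x):
    return x**3 + x - 1

def bisect(func, lo, hi, tol=1e-12):
    """Locate a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # the sign change is in [lo, mid]
        else:
            lo, f_lo = mid, f_mid     # the sign change is in [mid, hi]
    return (lo + hi) / 2

root = bisect(f, 0.0, 1.0)
print(root, f(root))
```

Since $f'(x)\geq 1$ everywhere, $f$ is strictly increasing, so the value found by bisection is the only real root.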

Let $f$ be continuous on $[a,b]$ and differentiable on $(a,b)$. Define $g(x)$ to be the signed vertical distance between the graph of $f$ and the secant line through $(a,f(a))$ and $(b,f(b))$, i.e.
$$g(x)=f(x)-\frac{f(b)-f(a)}{b-a}(x-a)-f(a).$$
Then $g(x)$ is continuous on $[a,b]$ and differentiable on $(a,b)$. Since $g(a)=g(b)=0$, by Rolle's theorem there exists a number $c$ in $(a,b)$ such that $g'(c)=f'(c)-\frac{f(b)-f(a)}{b-a}=0$. Therefore, we have proved the following theorem.

Figure 2. Mean Value Theorem

Theorem. [Mean Value Theorem]
Let $f$ be continuous on $[a,b]$ and differentiable on $(a,b)$. Then there exists a number $c$ in $(a,b)$ such that
$$f'(c)=\frac{f(b)-f(a)}{b-a}.$$
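To make the statement concrete, the following sketch finds the point $c$ for one sample function. The choice $f(x)=x^3$ on $[0,2]$ is only an illustration and does not come from the note; for this $f$ the equation $f'(c)=\text{slope}$ can be solved by hand.

```python
import math

def f(x):
    return x**3

def f_prime(x):
    return 3 * x**2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)   # average rate of change on [a, b]

# f'(c) = 3c^2 = slope has the positive solution c = sqrt(slope / 3)
c = math.sqrt(slope / 3)
print(c, f_prime(c), slope)
```

Here the secant slope is 4 and $c=2/\sqrt{3}$ lies strictly between 0 and 2, as the theorem promises.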

Remark. Why the name Mean Value Theorem? The average $\mathrm{av}(f)$ of a continuous function $f$ on a closed interval $[a,b]$ can be defined by $$\mathrm{av}(f)=\frac{1}{b-a}\int_a^b f(x)dx.$$ If $f'(x)$ is continuous on $[a,b]$, its average on $[a,b]$ is given by $$\frac{1}{b-a}\int_a^bf'(x)dx=\frac{f(b)-f(a)}{b-a}.$$ That is, the Mean Value Theorem states that $f'(x)$ attains its average value over $[a,b]$ at some point of $(a,b)$.

The following examples are applications of the Mean Value Theorem.

Example. Suppose that $f(0)=-3$ and $f'(x)\leq 5$ for all values of $x$. How large can $f(2)$ possibly be?

Solution. By the Mean Value Theorem, there exists a number $c$ in $(0,2)$ such that
$$f'(c)=\frac{f(2)-f(0)}{2-0}=\frac{f(2)+3}{2}.$$
Since $f'(c)\leq 5$,
\begin{align*}
f(2)&=2f'(c)-3\\
&\leq 2\cdot 5-3=7.
\end{align*}
Hence, $f(2)\leq 7$. The bound is attained by $f(x)=5x-3$, so $7$ is the largest possible value of $f(2)$.

Example. A trucker handed in a ticket at a toll booth showing in 2 hours she had covered 159 mi on a toll road with speed limit of 65 mph. The trucker was cited for speeding. Why?

Solution. The average speed was $\frac{159}{2}=79.5$ mph. By the Mean Value Theorem, the trucker was driving at exactly 79.5 mph at some point during the trip, which exceeds the 65 mph speed limit.

Using Mean Value Theorem, one can prove the following theorem.

Theorem. If $f'(x)=0$ for all $x$ in the open interval $(a,b)$, then $f$ is constant on $(a,b)$.

Proof. Let $x,y$ be any two numbers in $(a,b)$. Without loss of generality, we may assume that $x<y$. Then $f(x)$ is continuous on $[x,y]$ and is differentiable on $(x,y)$. So, by the Mean Value Theorem, there exists $x<z<y$ such that
$$f'(z)=\frac{f(y)-f(x)}{y-x}$$
Since $f'(z)=0$, $f(x)=f(y)$. This completes the proof.

Calculus 22: Maximum and Minimum

Maximum and Minimum

There are two different types of extremum (maximum or minimum) values of a function $y=f(x)$. We may consider a value of $y$ that is an extremum globally on the domain or we may also consider a value of $y$ that is an extremum locally around an $x$ value.

A function $f$ has an absolute maximum at $c$ if $f(c)\geq f(x)$ for all $x$ in the domain of $f$. Similarly, $f$ has an absolute minimum at $c$ if $f(c)\leq f(x)$ for all $x$ in the domain of $f$.

A function $f$ has a local maximum (or relative maximum) at $c$ if $f(c)\geq f(x)$ for all $x$ in some neighborhood of $c$ (i.e., an open interval that contains $c$). Similarly, $f$ has a local minimum (or relative minimum) at $c$ if $f(c)\leq f(x)$ for all $x$ in some neighborhood of $c$.

Example.


Figure 1. The graph of $f(x)=3x^4-16x^3+18x^2$ on $[-1,4]$

Figure 1 shows the graph of $f(x)=3x^4-16x^3+18x^2$, $-1\leq x\leq 4$. It has a local maximum at $x=1$ and a local minimum at $x=3$. The local minimum $f(3)=-27$ is also the absolute minimum. The absolute maximum is $f(-1)=37$. Note that $f(-1)=37$ is not a local maximum: since the domain is $[-1,4]$, no open interval containing $x=-1$ lies inside the domain.

A natural question one may ask is whether a function always has an absolute maximum and an absolute minimum. You can easily find many examples that show that a function does not necessarily have an absolute maximum or an absolute minimum value. For instance, $y=x$ on $(-\infty,\infty)$ has neither an absolute maximum nor an absolute minimum. The function $y=x^2$ on $[0,1)$ has an absolute minimum 0 at $x=0$ but has no absolute maximum.

Theorem. [Max-Min Theorem]
If $f$ is continuous on a closed interval $[a,b]$, then $f$ attains an absolute maximum and an absolute minimum on $[a,b]$.

The following theorem is due to Fermat.

Theorem. If $f$ has a local maximum or a local minimum at $c$ and if $f'(c)$ exists, then $f'(c)=0$.

The converse of this theorem is not necessarily true, i.e. $f'(c)=0$ does not necessarily mean that $f(c)$ is a local maximum or a local minimum. For example, for $f(x)=x^3$ we have $f'(0)=0$, but $f(x)$ has neither a local maximum nor a local minimum at $x=0$, as shown in Figure 2.

Figure 2. The graph of $f(x)=x^3$

The preceding theorem is important because an absolute maximum and an absolute minimum of a continuous function on $[a,b]$ are found among the local maximum values, the local minimum values, and the values of $f$ at the endpoints, $f(a)$ and $f(b)$. To find local maximum and local minimum values, we first find the points $c$ at which $f'(c)=0$ or $f'(c)$ does not exist. Such points are called critical points. They are called critical because the graph of a function can change from increasing to decreasing, or from decreasing to increasing, only at such a point (although, as $f(x)=x^3$ shows, it need not change there).

Definition. A critical point of a function $f(x)$ is a number $c$ in the domain of $f$ such that either $f'(c)=0$ or $f'(c)$ does not exist.

Recipe for Finding Absolute Maximum and Absolute Minimum

Let $f$ be a continuous function on a closed interval $[a,b]$.

Step 1. Find all critical points of $f$ in $(a,b)$.

Step 2. Evaluate $f$ at each critical point obtained in Step 1.

Step 3. Find $f(a)$ and $f(b)$.

Step 4. Compare all the values obtained in Steps 2 and 3. The largest value is the absolute maximum and the smallest value is the absolute minimum.

Example. Find the absolute maximum and the absolute minimum values of
$$f(x)=x^3-3x^2+1,\ -\frac{1}{2}\leq x\leq 4.$$

Solution.

Step 1. Find all critical points of $f$ in $\left(-\frac{1}{2},4\right)$.

We have $f'(x)=3x^2-6x=3x(x-2)$. Setting $f'(x)=0$, we find two critical points, $x=0$ and $x=2$, both of which lie in $\left(-\frac{1}{2},4\right)$.

Step 2. Evaluate $f$ at each critical point obtained in Step 1.

$f(0)=1$ and $f(2)=-3$.

Step 3. Find $f\left(-\frac{1}{2}\right)$ and $f(4)$.

$f\left(-\frac{1}{2}\right)=\frac{1}{8}$ and $f(4)=17$.

Step 4. Compare all the values obtained in Steps 2 and 3.

The largest value is $f(4)=17$ so this is the absolute maximum value of $f$ on $\left[-\frac{1}{2},4\right]$. The smallest value is $f(2)=-3$ so this is the absolute minimum of $f$ on $\left[-\frac{1}{2},4\right]$.
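The four steps of the recipe translate almost line by line into code. The sketch below hard-codes the critical points found by hand in Step 1 rather than solving $f'(x)=0$ symbolically, and the variable names are illustrative.

```python
def f(x):
    return x**3 - 3 * x**2 + 1

a, b = -0.5, 4.0
critical_points = [0.0, 2.0]   # Step 1: roots of f'(x) = 3x(x - 2)

# Steps 2 and 3: evaluate f at interior critical points and at the endpoints
candidates = [a, b] + [c for c in critical_points if a < c < b]
values = {x: f(x) for x in candidates}

# Step 4: compare the values
x_max = max(values, key=values.get)
x_min = min(values, key=values.get)
print("absolute maximum", values[x_max], "at x =", x_max)
print("absolute minimum", values[x_min], "at x =", x_min)
```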

Calculus 21: Linear Approximations and Differentials

Linear Approximation

Figure 1. Linear Approximation

Let $y=f(x)$ be a differentiable function. The function $f(x)$ can be approximated by the tangent line to $y=f(x)$ at $a$ if $x$ is near $a$. Such an approximation is called a linear approximation.

If $x\approx a$ then $\Delta x=x-a\approx 0$, so we have
\begin{align*}
\frac{\Delta y}{\Delta x}&\approx \frac{dy}{dx}\\
&=f'(a).
\end{align*}
This means that
$$\frac{f(x)-f(a)}{x-a}\approx f'(a),$$
i.e.
$$
f(x)\approx f(a)+f'(a)(x-a). \tag{1}
$$
The equation (1) is called the linear approximation or tangent line approximation of $f$ at $a$. The linear function
$$
L(x):=f(a)+f'(a)(x-a)
$$
is called the linearization of $f$ at $a$. Notice that $y=L(x)$ is the equation of the tangent line to the graph of $f$ at $(a,f(a))$.

Example. Find the linearization of $f(x)=\sqrt{x+3}$ at $a=1$ and use it to approximate $\sqrt{3.98}$ and $\sqrt{4.05}$.

Solution. $f'(x)=\frac{1}{2\sqrt{x+3}}$, so
\begin{align*}
L(x)&=f(1)+f'(1)(x-1)\\
&=2+\frac{1}{4}(x-1)\\
&=\frac{x}{4}+\frac{7}{4}.
\end{align*}
When $x\approx 1$, we have the approximation
$$\sqrt{x+3}\approx \frac{x}{4}+\frac{7}{4}.$$

Figure 2. Linear approximation of $f(x)=\sqrt{x+3}$ at $a=1$

Setting $x+3=3.98$ we find $x=0.98$. Hence,
\begin{align*}
\sqrt{3.98}&\approx \frac{0.98}{4}+\frac{7}{4}\\
&=1.995.
\end{align*}
Setting $x+3=4.05$ we find $x=1.05$. Hence,
\begin{align*}
\sqrt{4.05}&\approx \frac{1.05}{4}+\frac{7}{4}\\
&=2.0125.
\end{align*}
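With a computer at hand, one can compare these hand estimates with the values returned by a square root routine. The short check below is only a sanity test of the example; the function name `L` mirrors the linearization in the text.

```python
import math

def L(x):
    # linearization of sqrt(x + 3) at a = 1
    return x / 4 + 7 / 4

for x in (0.98, 1.05):
    exact = math.sqrt(x + 3)
    print(x, L(x), exact, abs(L(x) - exact))
```

The error grows as $x$ moves away from $a=1$, which is why $a$ should be chosen close to the point of interest.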

Example. Use linear approximation to estimate $\sqrt{99.8}$.

Solution. In order to use linear approximation we need to choose $f(x)$, $x$ and $a$. From the given quantity we take $f(x)=\sqrt{x}$ and $x=99.8$. Since $f'(x)=\frac{1}{2\sqrt{x}}$, the linear approximation of $\sqrt{99.8}$ at $a$ is
$$\sqrt{99.8}\approx \sqrt{a}+\frac{1}{2\sqrt{a}}(99.8-a).$$
How do we choose a suitable $a$? There are two criteria to keep in mind. First, $a$ has to be close to $x$ for the linear approximation to be useful. Second, $a$ needs to be chosen so that $f(a)$ and $f'(a)$ can be calculated easily (meaning by hand, without the aid of a calculator). Why is this important? Linear approximation does not assume the use of a calculator. (If you can use a calculator, what is the point of doing this approximation?) It is a method that was developed when there were no calculators available, so that people could calculate values like $\sqrt{99.8}$ by hand. Considering the two criteria, we choose $a=100$. Hence,
$$\sqrt{99.8}\approx \sqrt{100}+\frac{1}{2\sqrt{100}}(99.8-100)=10+\frac{1}{20}(-0.2)=9.99.$$

Example. Use linear approximation to estimate $\cos 29^\circ$.

Solution. $f(x)=\cos x$ and $x=29^\circ=\frac{29\pi}{180}$ ($29^\circ$ is not a number but $\frac{29\pi}{180}$ is). Since $f'(x)=-\sin x$, the linear approximation of $\cos 29^\circ$ at $a$ is
$$\cos 29^\circ\approx \cos a-\sin a \left(\frac{29\pi}{180}-a\right).$$
A suitable choice is $a=\frac{30\pi}{180}=\frac{\pi}{6}$, in the spirit of the two criteria we discussed in the example above. Therefore, we have
$$\cos 29^\circ\approx \cos\frac{\pi}{6}-\sin\frac{\pi}{6}\left(-\frac{\pi}{180}\right)=\frac{\sqrt{3}}{2}+\frac{\pi}{360}\approx 0.8748.$$
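The need to work in radians is easy to get wrong in code as well, since `math.cos` expects radians. The following check, a sketch rather than part of the original solution, evaluates the tangent line approximation at $a=\pi/6$ and compares it with the library value.

```python
import math

a = math.pi / 6          # 30 degrees, in radians
x = math.radians(29)     # 29 degrees, in radians

approx = math.cos(a) - math.sin(a) * (x - a)   # equals sqrt(3)/2 + pi/360
exact = math.cos(x)
print(approx, exact, abs(approx - exact))
```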

Differentials

Figure 3. Differentials

As seen in Figure 3 above, when $\Delta x\approx 0$, $\Delta x=dx$ and $\Delta y\approx dy$. On the other hand, $\frac{dy}{dx}=f'(x)$. Hence, we obtain
$$
\Delta y\approx dy=f'(x)dx=f'(x)\Delta x. \tag{2}
$$

Example. The radius of a sphere was measured and found to be 21 cm with a possible error in measurement of at most 0.05 cm. What is the maximum error in using this value of the radius to compute the volume of the sphere?

Solution. Let $V$ denote the volume of a sphere of radius $r$. Then $V=\frac{4}{3}\pi r^3$. What we are trying to find is the largest possible $|\Delta V|$ given $|\Delta r|\leq 0.05$ cm. As seen in (2), $\Delta V\approx dV$, so we find $dV$ instead because finding $dV$ is easier than finding the exact error $\Delta V$. Differentiating $V$ with respect to $r$, we obtain
\begin{align*}
|\Delta V|&\approx |dV|\\
&=4\pi r^2 |dr|\\
&=4\pi r^2|\Delta r|\\
&\leq 4\pi\cdot(21)^2\cdot 0.05\\
&\approx 277.
\end{align*}
So the maximum error in the calculated volume is about 277 $\mbox{cm}^3$.
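For comparison, the exact change in volume can be computed directly. The sketch below evaluates both the differential $dV$ and the exact $\Delta V$ for an overestimate of 0.05 cm, together with the relative error $dV/V$. The variable names are illustrative.

```python
import math

def volume(r):
    return 4 / 3 * math.pi * r**3

r, dr = 21.0, 0.05
dV = 4 * math.pi * r**2 * dr            # differential estimate
delta_V = volume(r + dr) - volume(r)    # exact change in volume
print(dV, delta_V, dV / volume(r))      # last value: relative error, 3 * dr / r
```

The two numbers differ by less than one cubic centimeter, while the relative error in the volume is about three times the relative error in the radius.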

Linear approximation and differentials may appear to be different entities but the two methods are indeed equivalent and they serve the same purpose. To illustrate this, let us take a look at the following example which will be answered by linear approximation and differentials.

Example. Approximate $\sqrt{81.1}$.

Solution by Linear Approximation. Let $f(x)=\sqrt{x}$ and choose $a=81$. The reason for this choice of $a$ is that $\sqrt{81}=9$ can be calculated without the aid of a calculator (which is the main point of using this method) and $a=81$ is close to 81.1. Now we find the tangent line to $f(x)$ at $a=81$, or equivalently the linearization $L(x)$ at $a=81$. It is
$$L(x)=\frac{1}{2\cdot 9}(x-81)+9.$$
Then
\begin{align*}
L(81.1)&=\frac{1}{18}(81.1-81)+9\\
&=\frac{1}{180}+9\\
&\approx 9.00556
\end{align*}
approximates $\sqrt{81.1}$.

Solution by Differentials. Recall that $\Delta y=f(x+\Delta x)-f(x)$ is approximated by the differential $dy=f'(x)dx=f'(x)\Delta x$ for very small $\Delta x$. Now with $f(x)=\sqrt{x}$, $dy=\frac{1}{2\sqrt{x}}\Delta x$. From $\Delta y\approx dy$, we have
$$f(x+\Delta x)\approx f(x)+\frac{1}{2\sqrt{x}}\Delta x$$
If we set $f(x+\Delta x)=\sqrt{81.1}$, we can choose $x=81$ and $\Delta x=0.1$. Accordingly, we find
\begin{align*}
\sqrt{81.1}&\approx\sqrt{81}+\frac{1}{2\sqrt{81}}\cdot 0.1\\
&=9+\frac{1}{180}\approx 9.00556
\end{align*}

Monday, June 23, 2025

Calculus 20: Related Rates

Related rates problems often involve (context-wise) real-life applications of the chain rule/implicit differentiation. Here are some of the examples that are commonly seen in calculus textbooks.

Example. Car A is traveling west at 50 mi/h and car B is traveling north at 60 mi/h. Both are headed for the intersection of the two roads. At what rate are the cars approaching each other when car A is 0.3 mi and car B is 0.4 mi from the intersection?

Solution.

Denote by $x$ and $y$ the distances from the intersection to car A and to car B, respectively. Since both distances are decreasing, we have $\frac{dx}{dt}=-50$ mi/h and $\frac{dy}{dt}=-60$ mi/h. Denote by $z$ the distance between A and B. Then by the Pythagorean theorem we have
$$z^2=x^2+y^2$$
Differentiating this with respect to $t$, we obtain
$$z\frac{dz}{dt}=x\frac{dx}{dt}+y\frac{dy}{dt}$$
and thus
\begin{align*}
\frac{dz}{dt}&=\frac{1}{z}\left[x\frac{dx}{dt}+y\frac{dy}{dt}\right]\\
&=\frac{1}{0.5}[0.3(-50)+0.4(-60)]=-78\ \mathrm{mi/h},
\end{align*}
where $z=\sqrt{0.3^2+0.4^2}=0.5$ mi. That is, the cars are approaching each other at 78 mi/h.
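A related rates answer can be checked numerically by writing the quantities as explicit functions of time and taking a finite difference. In the sketch below, $t$ is measured in hours from the instant in question; the step size `h` is an arbitrary small number chosen for illustration.

```python
import math

def z(t):
    """Distance between the cars t hours after the instant in question."""
    x = 0.3 - 50 * t   # car A's distance to the intersection
    y = 0.4 - 60 * t   # car B's distance to the intersection
    return math.hypot(x, y)

h = 1e-7
dz_dt = (z(h) - z(-h)) / (2 * h)   # central difference at t = 0
print(dz_dt)
```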

Example. Air is being pumped into a spherical balloon so that its volume increases at a rate of $100\mathrm{cm}^3/\mathrm{s}$. How fast is the radius of the balloon increasing when the diameter is 50 cm?

Solution. Let $V$ and $r$ denote the volume and the radius of the spherical balloon. Then $V=\frac{4}{3}\pi r^3$. Differentiating this with respect to $t$, we obtain
$$\frac{dV}{dt}=4\pi r^2\frac{dr}{dt}$$
So,
\begin{align*}
\frac{dr}{dt}&=\frac{1}{4\pi r^2}\frac{dV}{dt}\\
&=\frac{1}{4\pi(25)^2}100\\
&=\frac{1}{25\pi}\mathrm{cm/s}
\end{align*}

Example. Gravel is being dumped from a conveyor belt at a rate of $30 \mathrm{ft}^3/\mathrm{min}$ and its coarseness is such that it forms a pile in the shape of a cone whose base diameter and height are the same. How fast is the height of the pile increasing when the pile is 10 ft high?

Solution. The cross section of the gravel pile is shown in the figure below.

The amount of gravel dumped is the same as the volume of the cone. Let us denote the volume by $V$, its base radius by $r$, and its height by $h$. Then $V=\frac{1}{3}\pi r^2h$. Since $h=2r$, $V$ can be written as
$$V=\frac{1}{12}\pi h^3$$
Differentiating this with respect to $t$, we obtain
$$\frac{dV}{dt}=\frac{1}{4}\pi h^2\frac{dh}{dt}$$
So, we have
\begin{align*}
\frac{dh}{dt}&=\frac{4}{\pi h^2}\frac{dV}{dt}\\
&=\frac{4}{\pi(10)^2}(30)=\frac{1.2}{\pi}\mathrm{ft/min}\approx 0.38\mathrm{ft/min}
\end{align*}

Example. A ladder 10 ft long rests against a vertical wall. If the bottom of the ladder slides away from the wall at a rate of 1 ft/s, how fast is the top of the ladder sliding down the wall when the bottom of the ladder is 6 ft from the wall?

Solution.

Let us denote by $x$ and $y$ the distance from the wall to the bottom of the ladder and the distance from the top of the ladder to the floor, respectively. By the Pythagorean theorem, we have $x^2+y^2=100$. Differentiating this with respect to $t$, we obtain
$$x\frac{dx}{dt}+y\frac{dy}{dt}=0$$
When $x=6$, $y=\sqrt{100-6^2}=8$. Hence, we have
\begin{align*}
\frac{dy}{dt}&=-\frac{x}{y}\frac{dx}{dt}\\
&=-\frac{6}{8}(1)=-\frac{3}{4}\mathrm{ft/s}
\end{align*}

Example. A water tank has the shape of an inverted circular cone with base radius 2 m and height 4 m. If water is being pumped into the tank at a rate of $2 \mathrm{m}^3/\mathrm{min}$, find the rate at which the water level is rising when the water is 3 m deep.

Solution. The cross section of the water tank is shown in the figure below.

The amount of water $V$ when the water level is $h$ and the surface radius is $r$ is $V=\frac{1}{3}\pi r^2h$. By similar triangles in the cross section, $$\frac{2}{4}=\frac{r}{h},$$ i.e. $r=\frac{h}{2}$. So $V$ can be written as
$$V=\frac{1}{12}\pi h^3$$
Differentiating this with respect to $t$, we obtain
$$\frac{dV}{dt}=\frac{1}{4}\pi h^2\frac{dh}{dt}$$
Hence,
\begin{align*}
\frac{dh}{dt}&=\frac{4}{\pi h^2}\frac{dV}{dt}\\
&=\frac{4}{\pi(3)^2}(2)\\
&=\frac{8}{9\pi}\mathrm{m/min}\approx 0.28\mathrm{m/min}
\end{align*}
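The water tank and the gravel pile lead to the same relation $V=\frac{1}{12}\pi h^3$, so a single small function covers both examples. The function name `dh_dt` is a hypothetical helper introduced here for illustration.

```python
import math

def dh_dt(h, dV_dt):
    """Rate of change of the height for a cone with V = pi * h**3 / 12."""
    return 4 * dV_dt / (math.pi * h**2)

print(dh_dt(3.0, 2.0))     # water tank: h = 3 m, dV/dt = 2 m^3/min
print(dh_dt(10.0, 30.0))   # gravel pile: h = 10 ft, dV/dt = 30 ft^3/min
```

Note that for a fixed inflow the level rises more slowly as $h$ grows, because the cross-sectional area grows like $h^2$.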

Example. A street light is at the top of a 15 feet tall pole. A 6 feet tall woman walks away from the pole with a speed of 5 ft/sec along a straight path. How fast is the tip of her shadow moving when she is 45 feet from the base of the pole?

Solution. Let $x$ be the distance from the light pole to the woman and $y$ be the distance from the light pole to the tip of her shadow as shown in the figure below.

By similar triangles, we have $\frac{15}{y}=\frac{6}{y-x}$. Solving this equation for $y$, we obtain $y=\frac{5}{3}x$. Differentiating this with respect to $t$, we find how fast the tip of her shadow is moving:
$$\frac{dy}{dt}=\frac{5}{3}\frac{dx}{dt}=\frac{25}{3}\mathrm{ft/s}$$
As seen, the speed of the tip of the shadow is constant regardless of how far the woman is from the pole.

Example. The fish population, $N$, in a small pond depends on the amount of algae, $a$ (measured in pounds), in it. The equation modeling the fish population is given by $N=(3a^2-20a+26)^4$. If the amount of algae is increasing at a rate of 2 lb/week, at what rate is the fish population changing when the pond contains 5 lb of algae?

Solution. By the chain rule, we obtain
\begin{align*}
\frac{dN}{dt}&=4(3a^2-20a+26)^3\left(6a\frac{da}{dt}-20\frac{da}{dt}\right)\\
&=4(3a^2-20a+26)^3(6a-20)\frac{da}{dt}
\end{align*}
Since $\frac{da}{dt}=2$ lb/week, when $a=5$ lb we have $3(5)^2-20(5)+26=1$ and $6(5)-20=10$, so
$$\frac{dN}{dt}=4(1)^3(10)(2)=80\ \mathrm{fish/week}$$
What this means is that the fish population is increasing at a rate of 80 fish per week at the instant when the pond contains 5 lb of algae.
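Sign slips are easy to make when substituting into a long chain-rule expression, so it can help to evaluate the formula mechanically and cross-check it with a finite difference. The sketch below does both; a positive result means the population is growing and a negative one means it is shrinking. The step size `h` is an illustrative choice.

```python
def N(a):
    return (3 * a**2 - 20 * a + 26) ** 4

def dN_dt(a, da_dt):
    # chain rule: dN/dt = 4 (3a^2 - 20a + 26)^3 (6a - 20) da/dt
    return 4 * (3 * a**2 - 20 * a + 26) ** 3 * (6 * a - 20) * da_dt

h = 1e-6
numeric = (N(5 + h) - N(5 - h)) / (2 * h) * 2   # dN/da by central difference, times da/dt = 2
print(dN_dt(5, 2), numeric)
```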

Example. The retail price per gallon of gasoline is increasing at 0.02 dollars per week. The demand equation is given by $$10p-\sqrt{356-x^2}=0$$ where $p$ is the price per gallon (in dollars), when $x$ million gallons are demanded. At what rate is the revenue changing when 10 million gallons are demanded?

Solution. The total revenue $R$ is given by the equation
$$R=xp$$
Differentiating this equation with respect to $t$, we obtain
$$\frac{dR}{dt}=\frac{dx}{dt}p+x\frac{dp}{dt}$$
The only quantity still missing for computing $\frac{dR}{dt}$ is $\frac{dx}{dt}$. To find it, let us differentiate the demand equation with respect to $t$. By the chain rule, we obtain
$$10\frac{dp}{dt}+\frac{x}{\sqrt{356-x^2}}\frac{dx}{dt}=0$$
When $x=10$ million gallons, with $\frac{dp}{dt}=0.02$, we find from this equation that
$$\frac{dx}{dt}=-0.02\sqrt{256}=-0.02\cdot 16= -0.32\ \mbox{million gallons/week}$$
When $x=10$, from the demand function, we find $p$ as
$$p=\frac{\sqrt{256}}{10}=\frac{16}{10}=1.6$$
Therefore, the rate of change of revenue when 10 million gallons of gasoline is demanded is
$$\frac{dR}{dt}=-0.32(1.6)+10(0.02)=-0.312$$
What this means is that the revenue is decreasing by about 0.31 million dollars per week when the price increases 0.02 dollars per week (consequently the demand decreases by 0.32 million gallons per week as we saw earlier).
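The arithmetic of this example can be reproduced in a few lines. The sketch below solves the demand equation for $p$, computes $\frac{dx}{dt}$ from the differentiated demand equation, and then applies the product rule to $R=xp$. The function name `price` is a hypothetical helper.

```python
import math

def price(x):
    # demand equation 10 p - sqrt(356 - x^2) = 0 solved for p
    return math.sqrt(356 - x**2) / 10

x, dp_dt = 10.0, 0.02
p = price(x)
dx_dt = -10 * dp_dt * math.sqrt(356 - x**2) / x   # from differentiating the demand equation in t
dR_dt = dx_dt * p + x * dp_dt                     # product rule on R = x p
print(p, dx_dt, dR_dt)
```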

Example. A plane flying with a constant speed of 29 km/min passes over a ground radar station at an altitude of 13 km and climbs at an angle of 20 degrees. At what rate is the distance from the plane to the radar station increasing 3 minutes later?

Solution. First, take a look at the following picture

The Law of Cosines

The law of cosines says that the sides $a$, $b$, $c$ of a triangle and the angle $\theta$ opposite the side $a$ are related by
$$a^2=b^2+c^2-2bc\cos\theta$$

The question above can be pictorially represented as the following sketch

Using the law of cosines with $b=13$ km, we have
$$a^2=13^2+c^2-2\cdot 13\cdot c\cdot\cos\left(\frac{11}{18}\pi\right)$$
Here, the angle $\theta$ is given by $\theta=90^\circ+20^\circ=110^\circ=\frac{11}{18}\pi$. (Note: for this question, you can use degrees instead of radians, but in that case make sure that your calculator is set to degree mode.) Since we are to find $\frac{da}{dt}$, differentiate the above equation with respect to $t$:
$$2a\frac{da}{dt}=2c\frac{dc}{dt}-2\cdot 13\frac{dc}{dt}\cos\left(\frac{11}{18}\pi\right)$$
Since the airplane is flying along the side $c$ at the constant speed $\frac{dc}{dt}=29$ km/min, it has traveled $c=29\cdot 3=87$ km after three minutes. Thus,
$$a=\sqrt{13^2+87^2-2\cdot 13\cdot 87\cos\left(\frac{11}{18}\pi\right)}\approx 92.2586$$
and hence $\frac{da}{dt}$ after three minutes is
\begin{align*}
\frac{da}{dt}&=\frac{c}{a}\frac{dc}{dt}-\frac{13}{a}\frac{dc}{dt}\cos\left(\frac{11}{18}\pi\right)\\
&=\frac{1}{a}\frac{dc}{dt}\left(c-13\cos\left(\frac{11}{18}\pi\right)\right)\\
&=\frac{1}{92.2586}29\left(87-13\cos\left(\frac{11}{18}\pi\right)\right)\\
&=28.7447\ \mathrm{km/min}
\end{align*}
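Because $c=29t$ is known explicitly, the distance $a$ can be written as a function of $t$ and differentiated numerically as a cross-check. This is a sketch with an illustrative step size; `math.radians` takes care of the degree-to-radian conversion mentioned in the note.

```python
import math

THETA = math.radians(110)   # angle between the vertical side and the flight path

def a(t):
    """Distance (km) from the radar station to the plane t minutes after the flyover."""
    c = 29 * t
    return math.sqrt(13**2 + c**2 - 2 * 13 * c * math.cos(THETA))

h = 1e-6
da_dt = (a(3 + h) - a(3 - h)) / (2 * h)   # central difference at t = 3
print(a(3), da_dt)
```

The rate is slightly below the plane's speed of 29 km/min, as it must be, since the plane is not flying directly away from the station.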

Calculus 19: Derivatives of Logarithmic and Exponential Functions

In this note, we study derivatives of logarithmic and exponential functions.

Derivatives of Logarithmic Functions

First recall that
$$\lim_{t\to 0}(1+t)^{\frac{1}{t}}=e \tag{1}$$
\begin{align*}
\frac{d}{dx}\ln x&=\lim_{h\to 0}\frac{\ln(x+h)-\ln x}{h}\\
&=\lim_{h\to 0}\frac{1}{h}\ln\left(\frac{x+h}{x}\right)\\
&=\frac{1}{x}\lim_{h\to 0}\ln\left(1+\frac{h}{x}\right)^{\frac{x}{h}}\\
&=\frac{1}{x}\lim_{t\to 0}\ln(1+t)^{\frac{1}{t}}\\
&=\frac{1}{x}\end{align*}
where $t=\frac{h}{x}$ and the last step uses (1) together with the continuity of $\ln$. Hence,
$$\frac{d}{dx}\ln x=\frac{1}{x} \tag{2}$$
Using the change of base formula $\log_ax=\frac{\ln x}{\ln a}$, we obtain
$$\frac{d}{dx}\log_ax=\frac{1}{x\ln a}$$
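Both the limit (1) and the derivative formula (2) are easy to probe numerically. The sketch below prints $(1+t)^{1/t}$ for shrinking $t$ and compares a central difference quotient of $\ln x$ with $\frac{1}{x}$ at the sample point $x=2.5$, which is an arbitrary choice.

```python
import math

# (1 + t)^(1/t) approaches e as t -> 0
for t in (1e-1, 1e-3, 1e-5):
    print(t, (1 + t) ** (1 / t), math.e)

# central-difference check of d/dx ln x = 1/x at a sample point
x, h = 2.5, 1e-6
numeric = (math.log(x + h) - math.log(x - h)) / (2 * h)
print(numeric, 1 / x)
```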

Derivatives of Exponential Functions

We can find the derivative of the natural exponential function $y=e^x$ using the relationship $x=\ln y$ and implicit differentiation. Differentiating $x=\ln y$ with respect to $x$ we obtain $1=\frac{1}{y}\frac{dy}{dx}$ i.e. $\frac{dy}{dx}=y=e^x$. Hence
$$\frac{d}{dx}e^x=e^x$$
Note that $a^x=e^{x\ln a}$. So by the chain rule we find
$$\frac{d}{dx}a^x=\frac{d}{dx}e^{x\ln a}=e^{x\ln a}\ln a=a^x\ln a$$
Hence
$$\frac{d}{dx}a^x=a^x\ln a$$

The Power Rule (General Form)

Let us consider $x^n$ for any $x>0$ and any real number $n$. As we have seen above $x^n=e^{n\ln x}$ so by the chain rule
$$\frac{d}{dx}x^n=\frac{d}{dx}e^{n\ln x}=e^{n\ln x}\frac{n}{x}=nx^{n-1}$$
This completes the proof of the general power rule.

Logarithmic Differentiation

The derivatives of functions involving products, quotients, and powers may be found more easily (quickly) by taking the natural logarithm of such functions before differentiating. This allows us to break a complicated function into simpler pieces using properties of the natural logarithm. This whole process, which is called logarithmic differentiation, makes differentiation much easier and quicker.

Example. Use logarithmic differentiation to find the derivative of $y=\frac{x\sqrt{x^2+1}}{(x+1)^{\frac{2}{3}}}$.

Solution.
\begin{align*}\ln y&=\ln \frac{x\sqrt{x^2+1}}{(x+1)^{\frac{2}{3}}}\\
&=\ln x+\frac{1}{2}\ln(x^2+1)-\frac{2}{3}\ln(x+1)
\end{align*}
Differentiating with respect to $x$,
$$\frac{1}{y}\frac{dy}{dx}=\frac{1}{x}+\frac{x}{x^2+1}-\frac{2}{3(x+1)}$$
Therefore,
$$\frac{dy}{dx}=\left[\frac{1}{x}+\frac{x}{x^2+1}-\frac{2}{3(x+1)}\right]\frac{x\sqrt{x^2+1}}{(x+1)^{\frac{2}{3}}}$$
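A formula obtained by logarithmic differentiation can be checked against a difference quotient. The sketch below does this at the sample point $x=1$, an arbitrary choice at which the bracket equals $\frac{7}{6}$.

```python
import math

def y(x):
    return x * math.sqrt(x**2 + 1) / (x + 1) ** (2 / 3)

def dy_dx(x):
    # result of logarithmic differentiation
    return (1 / x + x / (x**2 + 1) - 2 / (3 * (x + 1))) * y(x)

x, h = 1.0, 1e-6
numeric = (y(x + h) - y(x - h)) / (2 * h)
print(dy_dx(x), numeric)
```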

Example. Let $y=x^x$, $x>0$. Find $\frac{dy}{dx}$.

Solution 1. $y=x^x=e^{x\ln x}$ and by the chain rule we obtain
$$\frac{dy}{dx}=x^x(1+\ln x)$$

Solution 2. Use logarithmic differentiation. $\ln y=x\ln x$ and differentiating this with respect to $x$, we have
$$\frac{1}{y}\frac{dy}{dx}=1+\ln x$$
Hence, $$\frac{dy}{dx}=x^x(1+\ln x)$$
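Since neither the power rule nor the exponential rule alone applies to $x^x$, a numerical check is a useful safeguard here. The following sketch compares the formula with a central difference at the arbitrary sample point $x=2$.

```python
import math

def dy_dx(x):
    return x**x * (1 + math.log(x))

x, h = 2.0, 1e-6
numeric = ((x + h) ** (x + h) - (x - h) ** (x - h)) / (2 * h)
print(dy_dx(x), numeric)   # both close to 4 * (1 + ln 2)
```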

Alternative Approach

In the earlier approach we started out with $e^x$ and regarded $\ln x$ as its inverse function. It can also be done the other way around, namely we first define $\ln x$ and regard $e^x$ as its inverse function. The natural logarithmic function $\ln x$ can be defined by
$$\ln x=\int_1^x\frac{1}{t}dt,\ x>0 \tag{3}$$
The number $x$ that satisfies the equation $\ln x=1$ is denoted by $e$. All properties of natural logarithm can be derived from definition (3). Also from definition (3), we obtain (2) by the Fundamental Theorem of Calculus. Using (2) one can show the limit $$\lim_{x\to 0}(1+x)^{\frac{1}{x}}=e$$

Proof. Let $f(x)=\ln x$. Then $f'(x)=\frac{1}{x}$ and so $f'(1)=1$. On the other hand,
\begin{align*}
f'(1)&=\lim_{x\to 0}\frac{f(1+x)-f(1)}{x}\\
&=\lim_{x\to 0}\frac{\ln(1+x)}{x}\\
&=\lim_{x\to 0}\ln(1+x)^{\frac{1}{x}}\\
&=\ln[\lim_{x\to 0}(1+x)^{\frac{1}{x}}]
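\end{align*}
where the last step uses the continuity of $\ln$. Since $f'(1)=1$, the limit inside the logarithm is the number whose natural logarithm is $1$. Therefore,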
$$\lim_{x\to 0}(1+x)^{\frac{1}{x}}=e$$

Remark. By substituting $y=\frac{1}{x}$,
$$e=\lim_{y\to\infty}\left(1+\frac{1}{y}\right)^y$$

Remark. An alternative definition of $e$ is as an infinite series
$$e=1+\frac{1}{1!}+\frac{1}{2!}+\frac{1}{3!}+\cdots$$
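The two descriptions of $e$ converge at very different speeds, which a short experiment makes visible. The helper names `e_series` and `e_limit` below are illustrative.

```python
import math

def e_series(n_terms):
    """Partial sum 1 + 1/1! + ... + 1/(n_terms - 1)!."""
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term /= k + 1
    return total

def e_limit(n):
    return (1 + 1 / n) ** n

print(e_series(10), e_limit(10), math.e)
```

Ten terms of the series already agree with $e$ to about six decimal places, whereas $\left(1+\frac{1}{n}\right)^n$ with $n=10$ is still off in the first decimal place.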

Thursday, June 12, 2025

SDE: Itô's Formula

Let us consider the 1-dimensional case ($n=1$) of the stochastic differential equation (4) in the note SDE: What is a Stochastic Differential Equation?, with $B=1$,
$$dX=b(X)dt+dW \tag{1}$$
with $X(0)=0$.
Let $u: \mathbb{R}\longrightarrow\mathbb{R}$ be a smooth function and $Y(t)=u(X(t))$ ($t\geq 0$). What we learned in calculus (the chain rule) would tell us that $dY$ is
$$dY=u'dX=u'bdt+u'dW,$$
where $'=\frac{d}{dx}$. It may come as a surprise, but this is not correct for stochastic processes. First, by Taylor series expansion, we obtain
\begin{align*}
\Delta Y&=u(X+\Delta X)-u(X)\\
&=u(X)+u'(X)\Delta X+\frac{u''(X)}{2!}(\Delta X)^2+\cdots -u(X)\\
&=u'(X)\Delta X+\frac{u''(X)}{2!}(\Delta X)^2+\cdots
\end{align*}
and thus we have
\begin{align*}
dY&=u'dX+\frac{1}{2}u^{\prime\prime}(dX)^2+\cdots\\
&=u'(bdt+dW)+\frac{1}{2}u^{\prime\prime}(bdt+dW)^2+\cdots
\end{align*}
Now, we introduce the following striking formula
$$(dW)^2=dt \tag{2}$$
The proof of (2) is beyond the scope of this note and so it won't be discussed here. However it can be found, for example, in [1]. Using (2) $dY$ can be written as
$$dY=\left(u'b+\frac{1}{2}u^{\prime\prime}\right)dt+u'dW+\cdots$$
The terms beyond $u'dW$ are of order $(dt)^{\frac{3}{2}}$ and higher. Neglecting these terms, we have
$$dY=\left(u'b+\frac{1}{2}u^{\prime\prime}\right)dt+u'dW \tag{3}$$
(3) is the stochastic differential equation satisfied by $Y(t)$. It is called Itô's formula, named after the Japanese mathematician Kiyosi Itô.

Example. Let us consider the stochastic differential equation
$$dY=YdW,\ Y(0)=1 \tag{4}$$
Comparing (3) and (4), we obtain
\begin{align*}
u'b+\frac{1}{2}u^{\prime\prime}&=0 \tag{5}\\u'&=u \tag{6}
\end{align*}
The equation (6) along with the initial condition $Y(0)=1$ results in $u(X(t))=e^{X(t)}$. Using this $u$ with equation (5) we get $b=-\frac{1}{2}$ and so the equation (1) becomes
$$dX=-\frac{1}{2}dt+dW$$
in which case $X(t)=-\frac{1}{2}t+W(t)$. Hence, we find $Y(t)$ as
$$Y(t)=e^{-\frac{1}{2}t+W(t)}$$
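The correction term $-\frac{1}{2}t$ can be seen in a simulation. The sketch below is an illustration under stated assumptions, not a rigorous treatment: it approximates $dY=YdW$ by the Euler-Maruyama scheme (a standard discretization that the note itself does not discuss), builds $W$ from Gaussian increments of variance $dt$, and compares the result with the closed form above and with the value $e^{W(t)}$ that the ordinary chain rule would suggest. The function name, step count and seed are arbitrary choices.

```python
import math
import random

def simulate(T=1.0, n_steps=10_000, seed=0):
    """Euler-Maruyama path of dY = Y dW, Y(0) = 1, next to the closed-form value."""
    rng = random.Random(seed)
    dt = T / n_steps
    y, w = 1.0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment with variance dt
        y += y * dw                          # Euler-Maruyama step
        w += dw
    exact = math.exp(-0.5 * T + w)           # Y(T) obtained from Ito's formula
    naive = math.exp(w)                      # what the ordinary chain rule would give
    return y, exact, naive

y_em, y_exact, y_naive = simulate()
print(y_em, y_exact, y_naive)
```

The simulated value tracks the Itô solution closely, while the naive value is too large by the factor $e^{t/2}$.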

Example. Let $P(t)$ denote the price of a stock at time $t\geq 0$. A standard model assumes that the relative change of price $\frac{dP}{P}$ evolves according to the stochastic differential equation
$$\frac{dP}{P}=\mu dt+\sigma dW \tag{7}$$
where $\mu>0$ and $\sigma$ are constants called the drift and the volatility of the stock, respectively. Again using Itô's formula similarly to what we did in the preceding example, we find the price function $P(t)$ which is the solution of
$$dP=\mu Pdt+\sigma PdW,\ P(0)=p_0$$
as
$$P(t)=p_0\exp\left[\left(\mu-\frac{1}{2}\sigma^2\right)t+\sigma W(t)\right].$$

Example. In this example, we solve the stochastic population growth model
\begin{align*}
\frac{dN}{dt}&=(r(t)+\xi(t))N(t) \tag{8}\\
&=r(t)N(t)+\xi(t)N(t)
\end{align*}
with $N(0)=N_0$. Here, $r(t)$ is a known function (this $r(t)$ may be considered as the relative growth rate at time $t$ in the deterministic population growth model) and $\xi(t)=\frac{dW}{dt}$ is a white noise. Let $N(t)=u(X(t))$. Then
$$dN=r(t)udt+udW \tag{9}$$
Comparing (3) and (9), we obtain
\begin{align*}
u'b+\frac{1}{2}u''&=r(t)u\\
u'&=u
\end{align*}
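The second equation together with $N(0)=N_0$ and $X(0)=0$ gives $u(x)=N_0e^{x}$, i.e. $N(t)=N_0e^{X(t)}$, and then the first equation gives $b=r(t)-\frac{1}{2}$. From the equation (1), we have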
$$dX=\left(r(t)-\frac{1}{2}\right)dt+dW$$
whose solution $X(t)$ is given by
$$X(t)=\int_0^t r(s)ds-\frac{t}{2}+W(t)$$
Therefore,
$$N(t)=N_0\exp\left[\int_0^t r(s)ds-\frac{t}{2}+W(t)\right]$$

References:

[1] Bernt Øksendal, Stochastic Differential Equations: An Introduction with Applications, 5th Edition, Springer, 2000

SDE: What is a Stochastic Differential Equation?

Consider the population growth model
$$\frac{dN}{dt}=a(t)N(t),\ N(0)=N_0 \tag{1}$$
where $N(t)$ is the size of a population at time $t$ and $a(t)$ is the relative growth rate at time $t$. If $a(t)$ is completely known, one can easily solve (1). In fact, the solution would be $N(t)=N_0\exp\left(\int_0^t a(s)ds\right)$. Now suppose that $a(t)$ is not completely known but can be written as $a(t)=r(t)+\mbox{noise}$. We do not know the exact behavior of the noise but only its probability distribution. In such a case, an equation like (1) is called a stochastic differential equation. More generally, a stochastic differential equation can be written as
$$\frac{dX}{dt}=b(X(t))+B(X(t))\xi(t)\ (t>0),\ X(0)=x_0,\ \tag{2}$$
where $b: \mathbb{R}^n\longrightarrow\mathbb{R}^n$ is a smooth vector field and $X: [0,\infty)\longrightarrow\mathbb{R}^n$, $B: \mathbb{R}^n\longrightarrow\mathbb{M}^{n\times m}$ and $\xi(t)$ is an $m$-dimensional white noise. If $m=n$, $x_0=0$, $b=0$ and $B=I$, then (2) turns into
$$\frac{dX}{dt}=\xi(t),\ X(0)=0 \tag{3}$$
The solution of (3) is denoted by $W(t)$ and is called the $n$-dimensional Wiener process or Brownian motion. In other words, white noise $\xi(t)$ is the time derivative of the Wiener process. Replace $\xi(t)$ in (2) by $\frac{dW(t)}{dt}$ and multiply the resulting equation by $dt$. Then we obtain
$$dX(t)=b(X(t))dt+B(X(t))dW(t),\ X(0)=x_0 \tag{4}$$
The stochastic differential equation (4) is solved symbolically as
$$X(t)=x_0+\int_0^tb(X(s))ds+\int_0^tB(X(s))dW(s) \tag{5}$$
for all $t>0$. In order to make sense of $X(t)$ in (5) we will have to know what $W(t)$ is and what the integral $\int_0^tB(X(s))dW(s)$, which is called a stochastic integral, means.
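Before any theory, one can get a feel for $W(t)$ by simulation. The sketch below assumes the standard description of the 1-dimensional Wiener process, namely $W(0)=0$ with independent Gaussian increments of mean 0 and variance equal to the elapsed time; this property is not derived in the note and is taken as given here. The helper name `wiener_endpoint`, the step count and the seed are arbitrary.

```python
import math
import random
import statistics

def wiener_endpoint(T, n_steps, rng):
    """W(T) built from n_steps independent N(0, dt) increments."""
    dt = T / n_steps
    return sum(rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps))

rng = random.Random(1)
samples = [wiener_endpoint(T=2.0, n_steps=50, rng=rng) for _ in range(10_000)]
print(statistics.mean(samples), statistics.variance(samples))   # near 0 and near T = 2
```

The sample variance grows linearly in $T$, so a typical displacement grows like $\sqrt{T}$; this is the informal content of the rule $(dW)^2=dt$ used in Itô's formula.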

Saturday, June 7, 2025

Calculus 18: Implicit Differentiation

Thus far, most of the time, we have seen functions defined explicitly as $y=f(x)$, which clearly shows that $y$ is a function of the independent variable $x$. But often functions are defined implicitly. For instance, consider the equation $x^2+y^2=25$. This is the equation of the circle centered at the origin $(0,0)$ with radius $5$. A circle is not the graph of a function. But if we require $y\geq 0$, then the equation describes the upper half-circle, which is the graph of the function $y=\sqrt{25-x^2}$. Functions defined by equations like $x^2+y^2=25$ are called implicit functions. In some cases like $x^2+y^2=25$, we can easily write an implicit function explicitly as $y=f(x)$, but in many cases we cannot; an example is $x^3+y^3=6xy$. So, we need a way to differentiate an implicit function without writing it as $y=f(x)$. This can be done with the chain rule: assume that $y$ is a function of $x$ and differentiate both sides of the equation with respect to $x$. For example,
\begin{align*}
\frac{d}{dx}y^n&=(y^n)'\frac{dy}{dx}\ (y\ \mbox{is the innermost function})\\
&=ny^{n-1}\frac{dy}{dx}.
\end{align*}
Let us take a look at another example.
\begin{align*}
\frac{d}{dx}\cos y&=(\cos y)'\frac{dy}{dx}\ (y\ \mbox{is the innermost function})\\
&=-\sin y\frac{dy}{dx}.
\end{align*}
Here come more examples.

Example. If $x^2+y^2=25$, find $\frac{dy}{dx}$.

Solution. Differentiating the equation with respect to $x$, we obtain
$$2x+2y\frac{dy}{dx}=0.$$
Solving the resulting equation for $\frac{dy}{dx}$, we obtain
$$\frac{dy}{dx}=-\frac{x}{y}.$$

Examples.

Find $y'$ if $x^3+y^3=6xy$.
Find the tangent to $x^3+y^3=6xy$ at $(3,3)$.

Solution.

Differentiate the equation with respect to $x$. Then we obtain $$3x^2+3y^2\frac{dy}{dx}=6y+6x\frac{dy}{dx}.$$Solving the resulting equation for $\frac{dy}{dx}$, we obtain$$\frac{dy}{dx}=\frac{2y-x^2}{y^2-2x}.$$
The equation of the tangent line is$$y-3=\left[\frac{dy}{dx}\right]_{(3,3)}(x-3),$$where$$\left[\frac{dy}{dx}\right]_{(3,3)}=\frac{2\cdot 3-(3)^2}{3^2-2\cdot 3}=-1.$$ Therefore, the tangent line is given by $y=-x+6$.
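
The computation above can be double-checked with a few lines of Python. This is only a sketch: the helper names `F`, `implicit_slope`, and `tangent_line` are hypothetical, and `implicit_slope` simply evaluates the formula $\frac{dy}{dx}=\frac{2y-x^2}{y^2-2x}$ obtained in the solution.

```python
def F(x, y):
    """Left side minus right side of x^3 + y^3 = 6xy; zero exactly on the curve."""
    return x**3 + y**3 - 6 * x * y


def implicit_slope(x, y):
    """dy/dx from the formula derived above: (2y - x^2) / (y^2 - 2x)."""
    return (2 * y - x**2) / (y**2 - 2 * x)


def tangent_line(x0, y0):
    """Return (slope, y-intercept) of the tangent line at the point (x0, y0)."""
    m = implicit_slope(x0, y0)
    return m, y0 - m * x0


slope, intercept = tangent_line(3, 3)
print(F(3, 3), slope, intercept)  # 0 -1.0 6.0
```

The first printed value confirms that $(3,3)$ lies on the curve, and the other two reproduce the tangent line $y=-x+6$.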

Calculus 17: The Proof of the Chain Rule

In this note, we introduce two versions of the proof of the Chain Rule. The first one comes from [1]. Let $y=f(u)$ and $u=g(x)$ be differentiable functions. We claim that
$$\frac{dy}{dx}=f'(u)g'(x)$$
The finite difference $\frac{f(g(x+h))-f(g(x))}{h}$ can be written as $\frac{f(u+k)-f(u)}{h}$ where $k=g(x+h)-g(x)$. Define $\varphi(t)=\frac{f(u+t)-f(u)}{t}-f'(u)$ if $t\ne 0$. Multiplying by $t$ and rearranging terms, we obtain
$$
f(u+t)-f(u)=t[\varphi(t)+f'(u)] \tag{1}
$$
Since $\lim_{t\to 0}\varphi(t)=0$, we may define $\varphi(0)=0$. Then (1) holds for all $t$, including $t=0$. Now replace $t$ in (1) by $k$ and divide by $h$.
$$
\frac{f(u+k)-f(u)}{h}=\frac{k}{h}[\varphi(k)+f'(u)] \tag{2}
$$
(2) is valid even if $k=0$. When $h\to 0$, $\frac{k}{h}\to g'(x)$ and $\varphi(k)\to 0$. Hence the RHS of (2) approaches $f'(u)g'(x)$. This completes the proof.

Another version of the proof of the Chain Rule is from [2], where it appears as a guided exercise (\#99 on p. 559). Here we suppose that $y=f(u)$ is differentiable at $u_0=g(x_0)$ and $u=g(x)$ is differentiable at $x_0$. Then we claim that $y=f(g(x))$ is differentiable at $x=x_0$ and $$\left[\frac{dy}{dx}\right]_{x=x_0}=f'(u_0)g'(x_0)$$
Since $g'(x_0)$ exists, $\Delta u$ can be written as
$$\Delta u=g'(x_0)\Delta x+\rho(x)$$
where $\lim_{\Delta x\to 0}\frac{\rho(x)}{\Delta x}=0$. Similarly, if $\Delta u\ne 0$ (it could be 0), then $\Delta y$ can be written as
$$
\Delta y=f'(u_0)\Delta u+\sigma(u) \tag{3}
$$
where $\lim_{\Delta u\to 0}\frac{\sigma(u)}{\Delta u}=0$.
\begin{align*}
\Delta y&=f'(u_0)[g'(x_0)\Delta x+\rho(x)]+\sigma(g(x))\\
&=f'(u_0)g'(x_0)\Delta x+f'(u_0)\rho(x)+\sigma(g(x))
\end{align*}
As $\Delta u\to 0$, $\Delta y\to 0$ and accordingly $\sigma(u)\to 0$. So one can define $\sigma(u)=0$ if $\Delta u=0$. Then (3) is still valid if $\Delta u=0$.
$$\frac{\sigma(g(x))}{\Delta x}=\left\{\begin{array}{ccc}
\frac{\sigma(g(x))}{\Delta u}\cdot\frac{\Delta u}{\Delta x} & \mbox{if} & \Delta u\ne 0\\
0 & \mbox{if} & \Delta u=0\end{array}\right.\to 0$$
as $\Delta x\to 0$. Therefore,
$$\frac{\Delta y}{\Delta x}=f'(u_0)g'(x_0)+f'(u_0)\frac{\rho(x)}{\Delta x}+\frac{\sigma(g(x))}{\Delta x}$$
approaches
$$\frac{dy}{dx}=f'(u_0)g'(x_0)$$
as $\Delta x\to 0$.

Update: Here is yet another version of the proof of the chain rule. Suppose that $y=f(u)$ and $u=g(x)$ are differentiable and, for this argument, that $f$ can be expanded in a Taylor series about $u$. By Taylor series expansion, we obtain\begin{align*}
\Delta y&=f(u+\Delta u)-f(u)\\
&=f(u)+f'(u)\Delta u+\frac{f''(u)}{2!}(\Delta u)^2+\frac{f'''(u)}{3!}(\Delta u)^3+\cdots -f(u)\\
&=f'(u)\Delta u+\frac{f''(u)}{2!}(\Delta u)^2+\cdots \tag{4}
\end{align*}Dividing (4) by $\Delta x$, we have$$\frac{\Delta y}{\Delta x}=f'(u)\frac{\Delta u}{\Delta x}+\frac{f''(u)}{2!}\frac{\Delta u}{\Delta x}\Delta u+\frac{f'''(u)}{3!}\frac{\Delta u}{\Delta x}(\Delta u)^2+\cdots \tag{5}
$$
As $\Delta x\to 0$, (5) approaches
$$
\frac{dy}{dx}=f'(u)\frac{du}{dx}+\frac{f''(u)}{2!}\frac{du}{dx}du+\frac{f'''(u)}{3!}\frac{du}{dx}(du)^2+\cdots \tag{6}
$$
Since $du$ is infinitesimal, all the terms after the first one, each of which contains a factor of $du$, can be neglected. Consequently, (6) becomes
$$\frac{dy}{dx}=f'(u)\frac{du}{dx}$$
This completes the proof.

References:

[1] Tom M. Apostol, Calculus, Volume I: One-Variable Calculus, with an Introduction to Linear Algebra, 2nd Edition, John Wiley & Sons, Inc., 1967
[2] Jerrold Marsden and Alan Weinstein, Calculus II, Springer-Verlag, 1985

Calculus 16: The Chain Rule

Let us consider the function $y=\sqrt{x^2+1}$. Notice that this is a composite function $y=\sqrt{u}$ and $u=x^2+1$. In general, a composite function can be written as $y=f(u)$ where $u$ is a function of $x$, $u=g(x)$. While we know how to differentiate $y=\sqrt{u}$ (i.e. finding $\frac{dy}{du}$) and $u=x^2+1$ (i.e. finding $\frac{du}{dx}$), we do not know how to differentiate $y=\sqrt{x^2+1}$ (i.e. finding $\frac{dy}{dx}$). In this note, we would like to devise a way to differentiate a composite function. This is actually very important because the differentiable functions we stumble onto most of the time are composite functions.

Let $y=f(u)$ and $u=g(x)$ and assume that both $\frac{dy}{du}$ and $\frac{du}{dx}$ exist. Now,
\begin{align*}
\frac{\Delta y}{\Delta x}&=\frac{\Delta y}{\Delta u}\cdot\frac{\Delta u}{\Delta x}\\
&=\frac{f(u+\Delta u)-f(u)}{\Delta u}\cdot\frac{g(\Delta x+x)-g(x)}{\Delta x}.
\end{align*}
Hence,
\begin{align*}
\frac{dy}{dx}&=\lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x}\\
&=\lim_{\Delta u\to 0}\frac{\Delta y}{\Delta u}\cdot\lim_{\Delta x\to 0}\frac{\Delta u}{\Delta x}\ (\Delta u\to 0\ \mbox{as}\ \Delta x\to 0)\\
&=\frac{dy}{du}\cdot\frac{du}{dx}
\end{align*}
or
\begin{align*}
\frac{dy}{dx}&=\lim_{\Delta u\to 0}\frac{f(u+\Delta u)-f(u)}{\Delta u}\cdot\lim_{\Delta x\to 0}\frac{g(\Delta x+x)-g(x)}{\Delta x}\\
&=f'(u)g'(x).
\end{align*}

Theorem. [The Chain Rule]
Let $y=f(u)$ and $u=g(x)$. If both $\frac{dy}{du}$ and $\frac{du}{dx}$ exist, then $\frac{dy}{dx}$ exists and
\begin{align*}
\frac{dy}{dx}&=\frac{dy}{du}\cdot\frac{du}{dx}\\
&=f'(u)g'(x).
\end{align*}

Remark. The derivation of the chain rule shown above is not rigorously correct. The reason is that $\Delta u$ may become $0$. There is a more rigorous proof of the chain rule but we will not discuss that here.

Remark. Students commonly have difficulty applying the chain rule when they learn it for the first time. The difficulty usually lies not in understanding the chain rule itself but in identifying the function $u=g(x)$. The candidate for $u$ is usually the function inside parentheses (or brackets) or the innermost function.

Example. We are now ready to find $\frac{dy}{dx}$ when $y=\sqrt{x^2+1}$. In this case, we don't see parentheses or brackets but the innermost function is $x^2+1$. Let $u=x^2+1$. Then $y=\sqrt{u}$. Now,
\begin{align*}
\frac{dy}{du}&=\frac{1}{2\sqrt{u}}\\
&=\frac{1}{2\sqrt{x^2+1}},\\
\frac{du}{dx}&=2x.
\end{align*}
So, by the chain rule, we have
$$\frac{dy}{dx}=\frac{dy}{du}\cdot\frac{du}{dx}=\frac{x}{\sqrt{x^2+1}}.$$

Example. Differentiate $y=(x^3-1)^{100}$.

Solution. The function inside parentheses is $x^3-1$. So, it is our candidate. Let $u=x^3-1$. Then $y=u^{100}.$
By the chain rule,
\begin{align*}
\frac{dy}{dx}&=\frac{dy}{du}\cdot\frac{du}{dx}\\
&=100u^{99}\cdot(3x^2)\\
&=300x^2(x^3-1)^{99}.
\end{align*}

Examples. Find the derivative of each function.

$y=\sin 4x$.
$y=\sqrt{\sin x}$.

Solution.

The innermost function is $4x$. Let $u=4x$. Then $y=\sin u$. By the chain rule,\begin{align*}\frac{dy}{dx}&=\frac{dy}{du}\cdot\frac{du}{dx}\\&=\cos u\cdot4\\&=4\cos 4x.\end{align*}
The innermost function is $\sin x$. Let $u=\sin x$. Then $y=\sqrt{u}$. By the chain rule,\begin{align*}\frac{dy}{dx}&=\frac{dy}{du}\cdot\frac{du}{dx}\\&=\frac{1}{2\sqrt{u}}\cdot\cos x\\&=\frac{\cos x}{2\sqrt{\sin x}}.\end{align*}
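
The derivatives found in the examples above can be compared against a numerical approximation. The following is a minimal sketch, assuming a symmetric difference quotient with a small step; the helper `central_difference`, the step size, and the sample point `x0` are choices made here for illustration only.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Each entry pairs a function with the derivative obtained by the chain rule above.
examples = [
    (lambda x: math.sqrt(x**2 + 1), lambda x: x / math.sqrt(x**2 + 1)),
    (lambda x: (x**3 - 1) ** 100, lambda x: 300 * x**2 * (x**3 - 1) ** 99),
    (lambda x: math.sin(4 * x), lambda x: 4 * math.cos(4 * x)),
    (lambda x: math.sqrt(math.sin(x)), lambda x: math.cos(x) / (2 * math.sqrt(math.sin(x)))),
]

x0 = 1.2  # arbitrary sample point where all four functions are defined
for f, df in examples:
    approx = central_difference(f, x0)
    exact = df(x0)
    # The two columns should agree to several significant digits.
    print(f"{approx:.6e}  {exact:.6e}")
```

Agreement at one sample point is of course not a proof, but it is a quick way to catch a misidentified $u=g(x)$.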

Update: For those who are interested, the rigorous proof of the Chain Rule can be found here.

Calculus 15: Derivatives of Trigonometric Functions

In this note, we study derivatives of trigonometric functions $y=\sin x$, $y=\cos x$, $y=\sec x$, $y=\csc x$, $y=\tan x$, and $y=\cot x$. First, we calculate the derivative of $y=\sin x$. \begin{align*}\frac{d}{dx}\sin x&=\lim_{h\to 0}\frac{\sin(x+h)-\sin x}{h}\\&=\lim_{h\to 0}\frac{\sin x\cos h+\cos x\sin h-\sin x}{h}\\&=\lim_{h\to 0}\left[\sin x\frac{\cos h-1}{h}+\cos x\frac{\sin h}{h}\right]\end{align*} Recall that $\lim_{h\to 0}\frac{\cos h -1}{h}=0$ and $\lim_{h\to 0}\frac{\sin h}{h}=1$. Hence we obtain
$$\frac{d}{dx}\sin x=\cos x$$
In a similar manner, we can also obtain
$$\frac{d}{dx}\cos x=-\sin x$$
Using the reciprocal rule (baby quotient rule) here along with the derivatives of $\sin x$ and $\cos x$, we find the derivatives of $y=\sec x$, $y=\csc x$ as
\begin{align*}
\frac{d}{dx}\sec x&=\sec x\tan x\\
\frac{d}{dx}\csc x&=-\csc x\cot x
\end{align*}
Finally, using the quotient rule here along with the derivatives of $\sin x$ and $\cos x$, we find the derivatives of $y=\tan x$, $y=\cot x$ as \begin{align*}\frac{d}{dx}\tan x&=\sec^2 x\\
\frac{d}{dx}\cot x&=-\csc^2 x\end{align*}
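
The two limits used in the derivation of $\frac{d}{dx}\sin x$ can be observed numerically. The sketch below tabulates $\frac{\cos h-1}{h}$, $\frac{\sin h}{h}$, and the difference quotient of $\sin x$ for shrinking $h$; the function name `sine_difference_quotient` and the sample point $x=0.7$ are arbitrary choices made for illustration.

```python
import math


def sine_difference_quotient(x, h):
    """(sin(x + h) - sin x) / h, the quotient whose limit defines the derivative."""
    return (math.sin(x + h) - math.sin(x)) / h


x = 0.7  # arbitrary sample point
for h in (0.1, 0.01, 0.001):
    cos_term = (math.cos(h) - 1) / h  # tends to 0 as h -> 0
    sin_term = math.sin(h) / h        # tends to 1 as h -> 0
    print(h, cos_term, sin_term, sine_difference_quotient(x, h))

print(math.cos(x))  # the value the last column approaches
```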

Calculus 14: The Product and Quotient Rules

Product Rule: Let $u=f(x)$ and $v=g(x)$ be differentiable functions. Then $$(fg)'(x)=f(x)g'(x)+f'(x)g(x)$$ or $$\frac{d(uv)}{dx}=u\frac{dv}{dx}+\frac{du}{dx}v.$$

Proof. \begin{align*}(fg)'(x)&=\lim_{\Delta x\to 0}\frac{(fg)(x+\Delta x)-(fg)(x)}{\Delta x}\\&=\lim_{\Delta x\to 0}\frac{f(x+\Delta x)g(x+\Delta x)-f(x)g(x)}{\Delta x}\\&=\lim_{\Delta x\to 0}\frac{f(x+\Delta x)g(x+\Delta x)-f(x)g(x+\Delta x)+f(x)g(x+\Delta x)-f(x)g(x)}{\Delta x}\\&=\lim_{\Delta x\to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}g(x+\Delta x)+f(x)\lim_{\Delta x\to 0}\frac{g(x+\Delta x)-g(x)}{\Delta x}\\&=f'(x)g(x)+f(x)g'(x).\end{align*} Note that $\displaystyle\lim_{\Delta x\to 0}g(x+\Delta x)=g(x)$ because $g(x)$ is differentiable and hence continuous.

Example. Using the product rule, differentiate $(x^2+2x-1)(x^3-4x^2)$.

Solution. \begin{align*}\frac{d}{dx}[(x^2+2x-1)(x^3-4x^2)]&=\frac{d(x^2+2x-1)}{dx}(x^3-4x^2)+(x^2+2x-1)\frac{d(x^3-4x^2)}{dx}\\&=(2x+2)(x^3-4x^2)+(x^2+2x-1)(3x^2-8x)\\&=(2x^4-6x^3-8x^2)+(3x^4-2x^3-19x^2+8x)\\&=5x^4-8x^3-27x^2+8x.\end{align*} Multiplying first, \begin{align*}(x^2+2x-1)(x^3-4x^2)&=x^5-4x^4+2x^4-8x^3-x^3+4x^2\\&=x^5-2x^4-9x^3+4x^2.\end{align*} The derivative of this is $5x^4-8x^3-27x^2+8x$ by the power rule and differentiation formulas we discussed here.
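
Both routes in the example above can be reproduced exactly with integer arithmetic. In this sketch a polynomial is stored as a list of coefficients `[c0, c1, c2, ...]` (constant term first); this representation and the helper names `poly_mul`, `poly_add`, and `poly_diff` are assumptions made for illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, c2, ...]."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_add(p, q):
    """Add two polynomials, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_diff(p):
    """Differentiate term by term: the derivative of c*x^k is k*c*x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:]


f = [-1, 2, 1]     # x^2 + 2x - 1
g = [0, 0, -4, 1]  # x^3 - 4x^2

# Product rule: f'g + fg'.
product_rule = poly_add(poly_mul(poly_diff(f), g), poly_mul(f, poly_diff(g)))
# Multiply first, then differentiate.
multiply_first = poly_diff(poly_mul(f, g))

print(product_rule)    # [0, 8, -27, -8, 5]
print(multiply_first)  # [0, 8, -27, -8, 5]
```

Both lists encode $5x^4-8x^3-27x^2+8x$, in agreement with the two computations above.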

Reciprocal Rule (Baby Quotient Rule): Let $v=g(x)$ be a differentiable function with $g(x)\ne 0$. Then $$\left(\frac{1}{g}\right)'(x)=\frac{-g'(x)}{[g(x)]^2}$$ or $$\frac{d}{dx}\left(\frac{1}{v}\right)=-\frac{1}{v^2}\frac{dv}{dx}.$$

Proof. \begin{align*}\left(\frac{1}{g}\right)'(x)&=\lim_{\Delta x\to 0}\frac{\frac{1}{g(x+\Delta x)}-\frac{1}{g(x)}}{\Delta x}\\&=\lim_{\Delta x\to 0}\frac{\frac{g(x)-g(x+\Delta x)}{g(x+\Delta x)g(x)}}{\Delta x}\\&=-\lim_{\Delta x\to 0}\frac{\frac{g(x+\Delta x)-g(x)}{\Delta x}}{g(x+\Delta x)g(x)}\\&=-\frac{g'(x)}{[g(x)]^2}.\end{align*}

Example. Differentiate $\frac{1}{\sqrt{x}+2}$.

Solution. \begin{align*}\frac{d}{dx}\frac{1}{\sqrt{x}+2}&=-\frac{d(\sqrt{x}+2)/dx}{(\sqrt{x}+2)^2}\\&=-\frac{1}{2\sqrt{x}(\sqrt{x}+2)^2}.\end{align*}

Using the product rule and the reciprocal rule, we can prove

Quotient Rule: Let $u=f(x)$ and $v=g(x)$ be differentiable functions and assume that $g(x)\ne 0$. Then $$\left(\frac{f}{g}\right)'(x)=\frac{f'(x)g(x)-f(x)g'(x)}{[g(x)]^2}$$ or $$\frac{d}{dx}\left(\frac{u}{v}\right)=\frac{\frac{du}{dx}v-u\frac{dv}{dx}}{v^2}.$$

Proof. \begin{align*}\left(\frac{f}{g}\right)'(x)&=\left(f\frac{1}{g}\right)'(x)\\&=f'(x)\frac{1}{g(x)}+f(x)\left(\frac{1}{g}\right)'(x)\ (\mbox{the product rule is applied})\\&=\frac{f'(x)}{g(x)}-f(x)\frac{g'(x)}{[g(x)]^2}\ (\mbox{the reciprocal rule is applied})\\&=\frac{f'(x)g(x)-f(x)g'(x)}{[g(x)]^2}.\end{align*}

Example. Find the derivative of $h(x)=\frac{2x+1}{x^2-2}$.

Solution. \begin{align*}h'(x)&=\frac{(2x+1)'(x^2-2)-(2x+1)(x^2-2)'}{(x^2-2)^2}\\&=\frac{2(x^2-2)-(2x+1)2x}{(x^2-2)^2}\\&=\frac{2x^2-4-4x^2-2x}{(x^2-2)^2}\\&=-\frac{2x^2+2x+4}{(x^2-2)^2}.\end{align*}
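
As a sanity check of the result above, the sketch below evaluates the formula for $h'(x)$ at the sample point $x=1$ and compares it with a symmetric difference quotient. Exact rational arithmetic from Python's `fractions` module is used so that no rounding is involved; the sample point and the step size are arbitrary choices made for illustration.

```python
from fractions import Fraction


def h(x):
    return (2 * x + 1) / (x**2 - 2)


def h_prime(x):
    """Derivative obtained above by the quotient rule."""
    return -(2 * x**2 + 2 * x + 4) / (x**2 - 2) ** 2


x0 = Fraction(1)
step = Fraction(1, 10**6)
difference_quotient = (h(x0 + step) - h(x0 - step)) / (2 * step)

print(h_prime(x0))                 # -8
print(float(difference_quotient))  # close to -8
```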

Example. Differentiate $\frac{x^2+2}{x^8}$.

Solution. Since the function is a rational function, you may hastily try to use the quotient rule to differentiate it. There is nothing wrong with that except there may be a simpler way to differentiate the function. In fact the function can be written as $$\frac{x^2+2}{x^8}=\frac{x^2}{x^8}+\frac{2}{x^8}=\frac{1}{x^6}+2x^{-8}=x^{-6}+2x^{-8}.$$ Thus the derivative is $$-6x^{-7}-16x^{-9}=-\frac{6}{x^7}-\frac{16}{x^9}.$$

Wednesday, June 4, 2025

Calculus 13: Continuity versus Differentiability

There is a close relationship between continuity and differentiability, namely

Theorem. If $f'(x_0)$ exists then $f(x)$ is continuous at $x_0$; i.e. $\displaystyle\lim_{x\to x_0}f(x)=f(x_0)$. However the converse need not be true.

Proof. \begin{align*}\lim_{x\to x_0}[f(x)-f(x_0)]&=\lim_{x\to x_0}\frac{f(x)-f(x_0)}{x-x_0}\cdot(x-x_0)\\&=f'(x_0)\cdot 0\\&=0.\end{align*}

Example. [A Counterexample for the Converse] The function $f(x)=|x|$ is continuous at $x=0$ but has no derivative at $x=0$.

Proof. \begin{align*}\lim_{x\to 0+}\frac{f(x)-f(0)}{x-0}&=\lim_{x\to 0+}\frac{|x|}{x}\\&=\lim_{x\to 0+}\frac{x}{x}\\&=1,\end{align*} while \begin{align*}\lim_{x\to 0-}\frac{f(x)-f(0)}{x-0}&=\lim_{x\to 0-}\frac{|x|}{x}\\&=\lim_{x\to 0-}\frac{-x}{x}\\&=-1.\end{align*} Hence, $f'(0)=\displaystyle\lim_{x\to 0}\frac{f(x)-f(0)}{x-0}$ does not exist.

The graph of $y=|x|$
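
The two one-sided limits in the proof above can be seen directly by evaluating the difference quotient of $|x|$ at $0$ with positive and negative increments. This is a small sketch; the helper name `difference_quotient` and the chosen step sizes are for illustration only.

```python
def difference_quotient(f, a, h):
    """(f(a + h) - f(a)) / h for a nonzero increment h."""
    return (f(a + h) - f(a)) / h


steps = [0.1, 0.01, 0.001]
right = [difference_quotient(abs, 0.0, h) for h in steps]   # h > 0
left = [difference_quotient(abs, 0.0, -h) for h in steps]   # h < 0

print(right)  # [1.0, 1.0, 1.0]
print(left)   # [-1.0, -1.0, -1.0]
```

The quotient stays at $1$ from the right and at $-1$ from the left no matter how small the increment is, so no single limit exists.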

Calculus 12: Basic Differentiation Formulas

Let us recall the definition of the derivative $$f'(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}.$$ Replace $h$ by $\Delta x$. Then $f'(x)$ is rewritten as $$f'(x)=\lim_{\Delta x\to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}.$$ In mathematics, $\Delta$ often means an increment. So $\Delta x$ means an increment of $x$. Note that $\Delta x$ could be positive or negative. Denote by $\Delta y$ the difference $f(x+\Delta x)-f(x)$. $\Delta y$ is called an increment of $y$. Hence, the average rate of change of $y$ with respect to $x$ in the interval $[x,x+\Delta x]$ is the difference quotient $$\frac{\Delta y}{\Delta x}=\frac{f(x+\Delta x)-f(x)}{\Delta x}.$$ In Gottfried Leibniz's viewpoint, one could think of $\Delta x$ as becoming infinitesimal. The resulting quantity is denoted by $dx$. When $\Delta x$ becomes the infinitesimal $dx$, $\Delta y$ simultaneously becomes the infinitesimal $dy$. The infinitesimals $dx$ and $dy$ are called differentials. Hence the ratio $\frac{\Delta y}{\Delta x}$ becomes $\frac{dy}{dx}$ accordingly, and it is exactly equal to $f'(x)$. The quantity $\frac{dy}{dx}$ can be viewed as the ratio of differentials or as a synonym for $f'(x)$.

Leibniz Notation: If $y=f(x)$, the derivative $f'(x)$ can be written $$\frac{dy}{dx},\frac{df(x)}{dx},\ \mbox{or}\ \frac{d}{dx}f(x).$$ This is just a notation and does not represent a division. Using Leibniz notation, the value $f'(a)$ of $f'(x)$ at a specific point $x=a$ can be written $$\left.\frac{dy}{dx}\right|_{x=a}\ \mbox{or}\ \left.\frac{df(x)}{dx}\right|_{x=a}.$$

Calculating derivatives using the definition can be really laborious. In actual practice, special rules and formulas are derived for differentiating certain types of functions. The following are such rules and they can be proved straightforwardly by the definition of the derivative.

Theorem. Let $c$ be a constant, and $f(x)$ and $g(x)$ be two differentiable functions of $x$.

$\displaystyle\frac{dc}{dx}=0$
$\displaystyle\frac{d(cf(x))}{dx}=c\frac{df(x)}{dx}$
$\displaystyle\frac{d[f(x)+g(x)]}{dx}=\frac{df(x)}{dx}+\frac{dg(x)}{dx}$

The converse of the first rule is also true, namely if $f'(x)=0$ for all $x$ in the domain then $f(x)$ is a constant function. This can be proved using the Mean Value Theorem which will be studied later.

Lemma. [Binomial Theorem] \begin{align*}(a+b)^n&=\begin{pmatrix}n\\0\end{pmatrix}a^nb^0+ \begin{pmatrix}n\\1\end{pmatrix}a^{n-1}b+\begin{pmatrix}n\\2\end{pmatrix}a^{n-2}b^2+\cdots+\begin{pmatrix}n\\k\end{pmatrix}a^{n-k}b^k+\\&\cdots+\begin{pmatrix}n\\n-1\end{pmatrix}ab^{n-1}+\begin{pmatrix}n\\n\end{pmatrix}a^0b^n\\&=a^n+\begin{pmatrix}n\\1\end{pmatrix}a^{n-1}b+\begin{pmatrix}n\\2\end{pmatrix}a^{n-2}b^2+\cdots+\begin{pmatrix}n\\k\end{pmatrix}a^{n-k}b^k+\\&\cdots+\begin{pmatrix}n\\n-1\end{pmatrix}ab^{n-1}+b^n,\end{align*} where $$\begin{pmatrix}n\\k\end{pmatrix}=\frac{n!}{k!(n-k)!}.$$ $\begin{pmatrix}n\\k\end{pmatrix}$ is also denoted by $n{\mathrm C}k$. The binomial coefficients $\begin{pmatrix}n\\k\end{pmatrix}$ can be also easily obtained by Pascal's triangle. For details see here and here.

Theorem. [Power Rule] $\displaystyle\frac{dx^n}{dx}=nx^{n-1}$

Proof. Let $y=x^n$. Then by the Binomial Theorem \begin{align*}\frac{dx^n}{dx}&=\lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x}\\&=\lim_{\Delta x\to 0}\frac{(x+\Delta x)^n-x^n}{\Delta x}\\&=\lim_{\Delta x\to 0}\frac{\left[x^n+nx^{n-1}\Delta x+\frac{n(n-1)}{2!}x^{n-2}(\Delta x)^2+\cdots+(\Delta x)^n\right]-x^n}{\Delta x}\\&=\lim_{\Delta x\to 0}\left[nx^{n-1}+\frac{n(n-1)}{2!}x^{n-2}\Delta x+\cdots+(\Delta x)^{n-1}\right]\\&=nx^{n-1}.\end{align*}
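
The key step of the proof, expanding $(x+\Delta x)^n$ by the Binomial Theorem and cancelling $x^n$, can be traced with exact arithmetic. In the sketch below, `quotient_by_binomial` is a hypothetical helper that sums the remaining terms $\begin{pmatrix}n\\k\end{pmatrix}x^{n-k}(\Delta x)^{k-1}$ for $k=1,\dots,n$; the values $x=2$ and $n=5$ are arbitrary.

```python
from fractions import Fraction
from math import comb


def quotient_by_binomial(x, n, dx):
    """((x + dx)^n - x^n) / dx, expanded term by term with the Binomial Theorem."""
    return sum(comb(n, k) * x ** (n - k) * dx ** (k - 1) for k in range(1, n + 1))


x, n = Fraction(2), 5
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    direct = ((x + dx) ** n - x**n) / dx
    expanded = quotient_by_binomial(x, n, dx)
    # The last column confirms that the expansion equals the difference quotient exactly.
    print(dx, float(expanded), direct == expanded)

print(n * x ** (n - 1))  # 80
```

As $\Delta x$ shrinks, every term except the first, $nx^{n-1}$, carries a positive power of $\Delta x$ and so vanishes in the limit.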

Example. If $y=3x^5$, then by Property 2 of the first theorem and Power Rule (the preceding theorem), we have \begin{align*}\frac{dy}{dx}&=3\frac{dx^5}{dx}\\&=3(5)x^{5-1}\\&=15x^4.\end{align*}

Remark. In the preceding theorem, the power rule is established only for the case when $n$ is a positive integer. The formula is indeed valid for all real $n$'s.

Example. If $y=8x^{-\frac{3}{4}}$, then we have $$\frac{dy}{dx}=8\left(-\frac{3}{4}\right)x^{-\frac{3}{4}-1}=-6x^{-\frac{7}{4}}.$$

Using the first theorem and the preceding theorem (power rule), we can now find the derivative of any polynomial function.

Example. If $y=2x^4-x^3-2x+7$, then \begin{align*}\frac{dy}{dx}&=\frac{d(2x^4)}{dx}-\frac{d(x^3)}{dx}-2\frac{dx}{dx}+\frac{d(7)}{dx}\\&=8x^3-3x^2-2.\end{align*}

Example. The function $f(x)=\displaystyle\frac{3x^3-4}{x^2}$ can be written in the form to which we can apply the power rule to find its derivative. $$f(x)=\frac{3x^3-4}{x^2}=\frac{3x^3}{x^2}-\frac{4}{x^2}=3x-4x^{-2}.$$ Hence we have $$f'(x)=\frac{d(3x)}{dx}-\frac{d(4x^{-2})}{dx}=3+8x^{-3}.$$

Example. The function $y=\root 3\of{x^2}-3\root 3\of{x}-5$ can be also written in the form to which we can apply the power rule to find its derivative. \begin{align*}y&=\root 3\of{x^2}-3\root 3\of{x}-5\\&=(x^2)^{\frac{1}{3}}-3x^{\frac{1}{3}}-5\\&=x^{\frac{2}{3}}-3x^{\frac{1}{3}}-5.\end{align*} Hence, $$\frac{dy}{dx}=\frac{2}{3}x^{-\frac{1}{3}}-x^{-\frac{2}{3}}.$$

Calculus 11: Velocity and Acceleration

Let us assume that a particle is moving along a straight line and that the function $s=f(t)$ describes the position of the moving particle at time $t$. In physics, such a function $s=f(t)$ is called a motion.

Suppose the particle passes the points $P$ and $Q$ at the times $t$ and $t+\Delta t$, respectively. If $s$ and $s+\Delta s$ are the respective distances from some fixed point $O$, then the average velocity of the particle during the time interval $\Delta t$ is $$\frac{\Delta s}{\Delta t}=\frac{f(t+\Delta t)-f(t)}{\Delta t}=\frac{\mbox{Distance Traveled}}{\mbox{Time Elapsed}}.$$ The instantaneous velocity $v$ of the particle at the time $t$ is then given by the derivative of motion $s=f(t)$ $$v=\frac{ds}{dt}=\lim_{\Delta t\to 0}\frac{\Delta s}{\Delta t}.$$ In physics, the instantaneous velocity is also denoted by $\dot{s}$ or $\dot{f}(t)$. This dot notation was introduced by Sir Isaac Newton.

Similarly, if $\Delta v$ is the change in the velocity of the particle as it moves from $P$ to $Q$ during the time interval $\Delta t$, then $$a=\frac{dv}{dt}=\lim_{\Delta t\to 0}\frac{\Delta v}{\Delta t}$$ is the acceleration of the particle at the time $t$. Using dot notation, the acceleration is also denoted by $\dot{v}$, $\ddot{s}$, $\ddot{f}(t)$, or $\frac{d^2 s}{dt^2}$. The last notation $\frac{d^2 s}{dt^2}$ is due to Gottfried Leibniz.

If a body is thrown vertically upward with a certain initial velocity $v_0$, its distance $s$ from the starting point is given by the formula $$s(t)=v_0t-\frac{1}{2}gt^2,$$ where $g$ is the gravitational acceleration, $g\approx 9.8\ \mbox{m}/\mbox{sec}^2\approx 32\ \mbox{ft}/\mbox{sec}^2$.

Example. From the top of a building 96 feet high, a ball is thrown directly upward with a velocity of 80 feet per second. Find

the time required to reach the highest point
the maximum height attained, and
the velocity of the ball when it reaches the ground.

Solution. $v_0=80$ ft/sec and $g=32\mbox{ft}/\mbox{sec}^2$, so $$s=80t-16t^2$$ and $$v=\frac{ds}{dt}=80-32t.$$

At the highest point, $v=0$, that is, $0=80-32t$. So, $t=\frac{5}{2}$ sec.
$s\left(\frac{5}{2}\right)=80\left(\frac{5}{2}\right)-16\left(\frac{5}{2}\right)^2=100$ft. Hence the height of the ball above the ground is 196 feet.
Since the ball will reach the ground when $s=-96$, it follows that $-96=80t-16t^2$ or $16(t-6)(t+1)=0$. Hence $t=6$ and the velocity is $v(6)=80-32\cdot 6=-112$ft/sec when the ball strikes the ground. The negative sign merely indicates that the velocity of the ball is directed downward.
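
The three answers above can be reproduced with a short script. This is a sketch under the same assumptions as the example ($v_0=80$ ft/sec, $g=32$ ft/sec$^2$, a building 96 feet high); the variable names are chosen here for illustration.

```python
import math

v0 = 80.0        # initial velocity, ft/sec
g = 32.0         # gravitational acceleration, ft/sec^2
building = 96.0  # height of the building, ft


def s(t):
    """Distance from the top of the building at time t."""
    return v0 * t - 0.5 * g * t**2


def v(t):
    """Velocity ds/dt at time t."""
    return v0 - g * t


# Highest point: v(t) = 0.
t_top = v0 / g
max_height = building + s(t_top)

# Ground: s(t) = -building, i.e. (g/2) t^2 - v0 t - building = 0; take the positive root.
t_ground = (v0 + math.sqrt(v0**2 + 2 * g * building)) / g

print(t_top, max_height, t_ground, v(t_ground))  # 2.5 196.0 6.0 -112.0
```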

Calculus 10: Derivatives

In this lecture note, we introduce a new idea, which was discovered independently by Sir Isaac Newton and Gottfried Leibniz, to find the slope of a tangent line. This is in fact a quite ingenious idea. Let a function $y=f(x)$ be given. We want to find the slope of a line tangent to the graph of $y=f(x)$ at a point $x=a$. First consider another point on the $x$-axis that is away from $x=a$. If the distance from $x=a$ to this point is $h$, then the point can be written as $x=a+h$. Let $P(a,f(a))$ and $Q(a+h,f(a+h))$. Then the slope of line segment $\overline{PQ}$ is given by $$\frac{f(a+h)-f(a)}{h}.$$

Now we continuously change $h$ so that it gets smaller and smaller, approaching $0$; consequently the point $a+h$ gets closer to $a$. We want to see how the rate $\frac{f(a+h)-f(a)}{h}$ changes as $h\to 0$. To illustrate the situation better, I will use a specific example, say $f(x)=x^2$ with $a=2$. First we take $h=1$. The following picture shows you the graph of $f(x)=x^2$ (in black), where $1.5\leq x\leq 3$ and the line through $P(2,4)$ and $Q(2+h,(2+h)^2)$ (in blue), and the line tangent to the graph $f(x)=x^2$ at $x=2$ (in red).

Next we take $h=0.5$. Then the picture becomes

For $h=0.1$, the picture becomes

As one can clearly see, the line through $P(2,4)$ and $Q(2+h,(2+h)^2)$ gets closer to the tangent line as $h$ gets smaller close to $0$. We can still do better. For $h=0.001$, the picture becomes

The line through $P(2,4)$ and $Q(2+h,(2+h)^2)$ and the tangent line now appear to be overlapping. From this observation, we can see that the rate $\frac{f(a+h)-f(a)}{h}$ gets closer and closer to the slope of tangent line as $h$ gets smaller and smaller close to $0$. In fact, the slope would be exactly the limit of $\frac{f(a+h)-f(a)}{h}$ as $h$ approaches $0$. Denote the limit by $f'(a)$. Then $$f'(a)=\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}.$$ $f'(a)$ is called the derivative of $f(x)$ at $x=a$. One may wonder why we need another name for the slope of a tangent line. The reason is that as we will see later the slope of a tangent line can mean something else in different contexts. Let $x=a+h$. Then $x\to a$ as $h\to 0$. So $f'(a)$ can be also written as $$f'(a)=\lim_{x\to a}\frac{f(x)-f(a)}{x-a}.$$ The equation of tangent line to $y=f(x)$ at $x=a$ is then given by $$y-f(a)=f'(a)(x-a).$$
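
The behavior seen in the pictures can also be tabulated. The sketch below computes the slope of the line through $P(2,4)$ and $Q(2+h,(2+h)^2)$ for the same values of $h$ used above; the helper name `secant_slope` is an assumption made for illustration.

```python
def secant_slope(f, a, h):
    """Slope of the line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def f(x):
    return x**2


for h in (1, 0.5, 0.1, 0.001):
    # For f(x) = x^2 and a = 2 the slope equals 4 + h, up to rounding.
    print(h, secant_slope(f, 2, h))
```

The printed slopes approach $4$, the slope of the tangent line at $x=2$.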

Example. Find the equation of tangent line to the graph of $f(x)=x^2$ at $x=2$.

Solution. First we need to find $f'(2)$, i.e. the slope of the tangent line. \begin{align*}f'(2)&=\lim_{h\to 0}\frac{f(2+h)-f(2)}{h}\\&=\lim_{h\to 0}\frac{(2+h)^2-4}{h}\\&=\lim_{h\to 0}\frac{4+4h+h^2-4}{h}\\&=\lim_{h\to 0}(4+h)\\&=4.\end{align*}

Of course, we can also use the alternative definition of $f'(a)$ to calculate the slope:\begin{align*}f'(2)&=\lim_{x\to 2}\frac{f(x)-f(2)}{x-2}\\&=\lim_{x\to 2}\frac{x^2-4}{x-2}\\&=\lim_{x\to 2}\frac{(x+2)(x-2)}{x-2}\\&=\lim_{x\to 2}(x+2)\\&=4.\end{align*}

The equation of tangent line is then $y-4=4(x-2)$ or $y=4x-4$.

Remark. One may wonder which definition of $f'(a)$ to use. I would say that is a matter of personal taste. For a polynomial function, one notable difference between the two definitions is that if you use the first definition, you will end up expanding a polynomial, while you will have to factorize a polynomial with the second definition. Since the expansion of a polynomial is easier than the factorization, you may want to use the first definition if you are not confident with factorizing polynomials.

Example. Find the equation of tangent line to the graph of $f(x)=x^5$ at $x=1$.

Solution. As we discussed here, this is an extremely difficult, if not impossible, problem to solve by using only algebra. But surprise! With the new method, this is more or less a piece of cake. First we calculate the slope $f'(1)$. \begin{align*}f'(1)&=\lim_{h\to 0}\frac{f(1+h)-f(1)}{h}\\&=\lim_{h\to 0}\frac{(1+h)^5-1}{h}\\&=\lim_{h\to 0}\frac{1+5h+10h^2+10h^3+5h^4+h^5-1}{h}\\&=\lim_{h\to 0}(5+10h+10h^2+5h^3+h^4)\\&=5.\end{align*} Or by the second definition, \begin{align*}f'(1)&=\lim_{x\to 1}\frac{f(x)-f(1)}{x-1}\\&=\lim_{x\to 1}\frac{x^5-1}{x-1}\\&=\lim_{x\to 1}\frac{(x-1)(x^4+x^3+x^2+x+1)}{x-1}\\&=\lim_{x\to 1}(x^4+x^3+x^2+x+1)\\&=5.\end{align*}Therefore the equation of the tangent line is given by $y-1=5(x-1)$ or $y=5x-4$. The following picture shows the graph of $y=x^5$ (in blue) and the graph of the tangent line $y=5x-4$.

Calculus 9: Finding the Equation of Tangent Line to a Curve $y=f(x)$

Let us consider a simple geometry problem. Given a curve $y=f(x)$, we want to find a line tangent to the graph of $y=f(x)$ at $x=a$, meaning the line meets the graph of $y=f(x)$ exactly at a point $(a,f(a))$ on a small interval containing $x=a$.

One may wonder at this point why finding a tangent line is a big deal. Well, it is in fact a pretty big deal beyond mathematicians' purely intellectual curiosity. There is a reason why Sir Isaac Newton had to invent calculus, whose crucial notion is the slope of a tangent line. It is still too early to talk about why it is important or useful. We will get there when we are ready.

We attempt to tackle the problem with an example first. Here is an example we want to consider.

Example. Find the equation of a line tangent to the graph of $y=x^2$ at $x=2$.

Solution. To find the equation of a line, we need two ingredients: slope and $y$-intercept or slope and a point. We already know a point. We know that the line must pass through $(2,4)$. So all we need to find is its slope $m$. From algebra, we know that the equation of a line passing through $(2,4)$ with slope $m$ is given by $y-4=m(x-2)$ or $y=mx-2m+4$. Since $y=x^2$ and $y=mx-2m+4$ meet exactly at one point, the quadratic equation $x^2=mx-2m+4$ or $x^2-mx+2m-4=0$ must have exactly one solution. We have learned from the theory of quadratic equations that in that case the discriminant $D=b^2-4ac$ must be equal to $0$. That is, in our case $$D=m^2-4(2m-4)=m^2-8m+16=(m-4)^2=0.$$ Hence we determine that $m=4$ and the equation of the tangent line is $y=4x-4$.
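
The discriminant argument above can be checked by tabulating $D$ as a function of the slope $m$. This is only a sketch; the function name `discriminant` and the range of integer slopes are chosen for illustration.

```python
def discriminant(m):
    """Discriminant b^2 - 4ac of x^2 - m*x + (2m - 4) = 0."""
    a, b, c = 1, -m, 2 * m - 4
    return b * b - 4 * a * c


for m in range(0, 9):
    # D > 0 means the line meets the parabola twice; D = 0 means exactly once.
    print(m, discriminant(m))
```

Among these slopes, only $m=4$ gives $D=0$, which matches the tangent line $y=4x-4$.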

So, we see that finding the slope of a tangent line is not that difficult and that it does not require any new mathematics, or does it? Remember that we have not yet tackled our problem in general context. Before we get more ambitious, consider another example with a more complicated function, say $y=x^5$. Let us say that we want to find the line tangent to the graph of $y=x^5$ at $x=1$. Then the equation of the tangent line would be $y=mx-m+1$. In order for $y=x^5$ and the line $y=mx-m+1$ to meet exactly at one point, the quintic equation $x^5-mx+m-1=0$ must have exactly one solution. Our problem here is that we have no algebraic means, such as quadratic formula or discriminant, to use to determine the value of $m$. We are stuck here and there is no hope of tackling our simple geometry problem using only algebra. That is the reason Sir Isaac Newton and Gottfried Leibniz had to devise a new way to tackle the problem. This is where we enter the realm of Calculus.
