4.4 Approximation, Linearization, and Local Models of Motion
A Pendulum That Defies Solution
A pendulum swings back and forth. You might expect the equation governing it to be simple --- it is, after all, one of the most basic physical systems imaginable. A mass on a string, subject to gravity. Introductory physics courses solve it in a few lines.
But here is the truth. The exact equation of motion for a pendulum of length $L$ is:
$$\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta$$
That $\sin\theta$ makes all the difference. This is a nonlinear differential equation, and it has no closed-form solution in terms of elementary functions. You cannot solve it with any technique from a standard calculus course. The exact solution involves elliptic integrals --- functions so specialized that most scientists never encounter them.
And yet, every introductory physics textbook solves the pendulum. How? By replacing $\sin\theta$ with $\theta$. For small angles, $\sin\theta \approx \theta$, and the equation becomes:
$$\frac{d^2\theta}{dt^2} \approx -\frac{g}{L}\theta$$
This is a linear differential equation. You solved equations of this form in Section 4.3: they produce simple harmonic motion. The approximate equation gives a clean, beautiful, useful answer. The exact equation gives an answer most people cannot compute.
The question is: when is the replacement justified? How small is "small enough"? And what do we lose when we make it?
Prediction
Before you read on: The approximation $\sin\theta \approx \theta$ is used for "small angles." But how small is small? At what angle does this approximation become unacceptably inaccurate?
Choose your best guess:
(a) The approximation is already bad by 5 degrees.
(b) The approximation is good up to about 15 degrees, then degrades.
(c) The approximation is surprisingly good up to about 30 degrees.
(d) The approximation is useful all the way to 60 degrees.
Commit to an answer. We will check it numerically in a moment.
Most students guess too conservatively. The approximation is better than you expect --- and understanding why it works so well is the core of this section.
Exploration: Seeing the Approximation
[Interactive: Small-Angle Explorer. Two curves are plotted on the same axes: $y = \sin\theta$ (blue) and $y = \theta$ (orange dashed), with $\theta$ on the horizontal axis in radians. A vertical slider controls a marker angle $\theta_0$. At the marker angle, a vertical line shows the gap between the two curves. A readout displays three quantities: the exact value $\sin\theta_0$, the approximate value $\theta_0$, and the percentage error $\left|\frac{\theta_0 - \sin\theta_0}{\sin\theta_0}\right| \times 100\%$. As the student drags the slider from 0 toward $\pi/2$, they watch the gap widen and the error grow.]
Challenge 1: At what angle does the percentage error first exceed 1%?
Challenge 2: At what angle does the percentage error first exceed 5%?
Challenge 3: At what angle does the percentage error first exceed 10%?
Make a note of your answers. We will compile them into a table below.
If you played with that interactive, you may have been surprised. The two curves hug each other closely for a surprisingly wide range of angles. The divergence is gradual, not sudden --- the approximation does not "break" at a sharp threshold. It degrades smoothly, and the question of when it becomes "too bad" depends on how much error you are willing to tolerate.
The Numbers
Let us lay out the comparison explicitly. Here is $\sin\theta$ versus $\theta$ at several angles, with the percentage error computed at each point. Remember that the approximation $\sin\theta \approx \theta$ requires $\theta$ to be measured in radians.
| Angle (degrees) | Angle (radians) | $\sin\theta$ (exact) | $\theta$ (approx) | Absolute error | Percentage error |
|---|---|---|---|---|---|
| 1 | 0.0175 | 0.01745 | 0.01745 | 0.0000009 | 0.005% |
| 5 | 0.0873 | 0.08716 | 0.08727 | 0.00011 | 0.13% |
| 10 | 0.1745 | 0.17365 | 0.17453 | 0.00088 | 0.51% |
| 15 | 0.2618 | 0.25882 | 0.26180 | 0.00298 | 1.15% |
| 20 | 0.3491 | 0.34202 | 0.34907 | 0.00705 | 2.06% |
| 30 | 0.5236 | 0.50000 | 0.52360 | 0.02360 | 4.72% |
| 45 | 0.7854 | 0.70711 | 0.78540 | 0.07829 | 11.07% |
| 60 | 1.0472 | 0.86603 | 1.04720 | 0.18117 | 20.92% |
| 90 | 1.5708 | 1.00000 | 1.57080 | 0.57080 | 57.08% |
Read across the table slowly. At 5 degrees, the error is about one-tenth of a percent, essentially invisible. At 10 degrees, it is half a percent. Even at 20 degrees, the error is only about 2%. You have to push past 30 degrees before the error exceeds 5%, and past about 43 degrees before it exceeds 10%.
The approximation $\sin\theta \approx \theta$ is not a fragile trick that shatters at the slightest provocation. It is robust over a wide range. For most practical pendulum experiments --- where the swing amplitude is 10 or 15 degrees --- the error introduced by the approximation is negligible compared to other experimental uncertainties like air resistance, friction at the pivot, or imprecise length measurement.
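The table and the three slider challenges can be reproduced with a few lines of Python. This is a minimal sketch, not part of the interactive: the function names `percent_error` and `threshold_angle` are made up for illustration, and the bisection search is just one convenient way to locate a threshold. It relies on the error increasing steadily between 0 and 90 degrees.

```python
import math

def percent_error(theta):
    """Percentage error of sin(theta) ~ theta, relative to the exact sine (theta in radians)."""
    return abs(theta - math.sin(theta)) / math.sin(theta) * 100.0

def threshold_angle(target_percent, lo=1e-6, hi=math.pi / 2):
    """Bisect for the angle (radians) at which the error reaches target_percent."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if percent_error(mid) < target_percent:
            lo = mid   # error still below target: threshold is at a larger angle
        else:
            hi = mid   # error at or above target: threshold is at a smaller angle
    return 0.5 * (lo + hi)

# Reproduce the rows of the comparison table.
for degrees in (1, 5, 10, 15, 20, 30, 45, 60, 90):
    theta = math.radians(degrees)
    print(f"{degrees:3d} deg  sin={math.sin(theta):.5f}  theta={theta:.5f}  "
          f"error={percent_error(theta):.3f}%")

# Locate the angles asked for in the three challenges.
for target in (1, 5, 10):
    angle = math.degrees(threshold_angle(target))
    print(f"error reaches {target}% near {angle:.1f} degrees")
```

Changing the list of targets lets you test any error tolerance you care about, which is the practical form of the question "how small is small enough?"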
Check your prediction: Look back at the prediction you made earlier. Were you too conservative? Most students are. The approximation is better than it looks, and that is not an accident. It is a consequence of a deep mathematical structure that we will now examine.
Concept Reveal: Linearization
Why does $\sin\theta \approx \theta$ work so well near $\theta = 0$? The answer lies in a technique called linearization --- replacing a complicated function with the simplest possible approximation that captures its local behavior.
The idea in one sentence
Linearization replaces a function with its tangent line at a chosen point.
The idea in detail
Recall the Taylor expansion from calculus. Any smooth function $f(x)$ can be expanded around a point $x = a$ as:
$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$$
The linear approximation keeps only the first two terms:
$$f(x) \approx f(a) + f'(a)(x - a)$$
This is the equation of the tangent line to $f$ at $x = a$. It matches the function's value at $a$ and its slope at $a$. Everything else --- the curvature, the wiggles, the long-range behavior --- is thrown away.
For $\sin\theta$ expanded around $\theta = 0$:
- $f(\theta) = \sin\theta$
- $f(0) = \sin 0 = 0$
- $f'(0) = \cos 0 = 1$
So:
$$\sin\theta \approx 0 + 1 \cdot (\theta - 0) = \theta$$
That is where $\sin\theta \approx \theta$ comes from. It is not a guess or a rule of thumb. It is the tangent-line approximation at the origin, and it is guaranteed to be accurate near $\theta = 0$ by the mathematics of Taylor expansion.
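The tangent-line recipe is short enough to write as code. The sketch below assumes you supply the derivative by hand; the helper name `tangent_line` is hypothetical and chosen only for this illustration.

```python
import math

def tangent_line(f, df, a):
    """Return the linear approximation x -> f(a) + f'(a) * (x - a)."""
    value, slope = f(a), df(a)
    return lambda x: value + slope * (x - a)

# Linearize sin about 0; its derivative, cos, is passed in explicitly.
sin_linear = tangent_line(math.sin, math.cos, 0.0)

for theta in (0.05, 0.1, 0.2, 0.5):
    print(f"theta={theta:.2f}  sin={math.sin(theta):.6f}  "
          f"tangent line={sin_linear(theta):.6f}")
```

Because $f(0) = 0$ and $f'(0) = 1$, the function returned here is simply $\theta \mapsto \theta$. Passing a different function, derivative, and point gives the linearization of anything else.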
Why the error is so small
Here is a subtlety that explains the table above. The next term in the Taylor expansion of $\sin\theta$ is:
$$\sin\theta = \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} - \cdots$$
Notice: there is no $\theta^2$ term. The first correction to the linear approximation is cubic, not quadratic. This means the absolute error grows as $\theta^3$, which is extremely small when $\theta$ is small, and the percentage error grows only as $\theta^2/6$. At $\theta = 0.1$ radians, the absolute error is roughly $\theta^3/6 \approx 0.00017$, which is less than two-tenths of a percent of $\sin\theta$. The missing $\theta^2$ term is a gift from the symmetry of the sine function (sine is odd, so its expansion contains only odd powers), and it is what makes the small-angle approximation work over such a wide range.
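One way to see the cubic behavior is to compare the actual error $\theta - \sin\theta$ with the first omitted Taylor term $\theta^3/6$. The following is a small check under that framing; `small_angle_errors` is an illustrative name, not something defined elsewhere in the text.

```python
import math

def small_angle_errors(theta):
    """Return (actual absolute error of sin(theta) ~ theta, Taylor estimate theta**3 / 6)."""
    return theta - math.sin(theta), theta**3 / 6

for theta in (0.05, 0.1, 0.2, 0.4):
    actual, estimate = small_angle_errors(theta)
    percent = actual / math.sin(theta) * 100
    print(f"theta={theta:.2f} rad  actual={actual:.3e}  "
          f"theta^3/6={estimate:.3e}  percent={percent:.3f}%")

# Doubling the angle should multiply the absolute error by roughly 2**3 = 8.
ratio = small_angle_errors(0.2)[0] / small_angle_errors(0.1)[0]
print(f"error(0.2) / error(0.1) = {ratio:.2f}")
```

If the leading correction were quadratic, doubling the angle would multiply the error by about 4 instead of about 8.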
The Pendulum, Revisited
Now we can return to the pendulum with sharper tools. The exact equation of motion is:
$$\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta$$
Applying the small-angle approximation $\sin\theta \approx \theta$:
$$\frac{d^2\theta}{dt^2} \approx -\frac{g}{L}\theta$$
This is the equation for simple harmonic motion. You encountered this structure in Section 4.3 --- an acceleration proportional to negative displacement produces oscillation. The solution is:
$$\theta(t) = \theta_0 \cos\left(\sqrt{\frac{g}{L}}\, t\right)$$
where $\theta_0$ is the initial angular displacement (released from rest). The period is:
$$T = 2\pi\sqrt{\frac{L}{g}}$$
This is the famous pendulum formula. Notice what it says and what it does not say. It says the period depends on the length $L$ and the gravitational acceleration $g$. It does not depend on the amplitude $\theta_0$.
But wait --- that independence from amplitude is a consequence of the approximation, not of the exact physics. In the exact (nonlinear) pendulum, the period does depend on the amplitude. For larger swings, the pendulum takes longer to complete each cycle. The small-angle approximation erases this dependence by throwing away the nonlinear terms.
Pause and think: The approximate model says the period is independent of amplitude. The exact model says it is not. What does this tell you about the nature of approximation --- what kind of information gets preserved, and what gets lost?
This is worth sitting with. The approximate model preserves the oscillatory character of the motion, the relationship between period and length, and the frequency of small oscillations. What it loses is the amplitude dependence of the period and the detailed shape of the waveform at large angles. For small swings, everything the approximation preserves is the part that matters. For large swings, the lost information becomes important.
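To put numbers on the lost amplitude dependence, the sketch below evaluates the exact period of the ideal pendulum. It leans on a standard result that this section does not derive: the exact period equals the small-angle period divided by the arithmetic-geometric mean (AGM) of 1 and $\cos(\theta_0/2)$. Treat that formula, and the names `agm` and `period_ratio`, as assumptions brought in from outside the text.

```python
import math

def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of two positive numbers (converges very quickly)."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def period_ratio(theta0):
    """T_exact / T_small_angle for release from rest at theta0 (radians, 0 <= theta0 < pi)."""
    return 1.0 / agm(1.0, math.cos(theta0 / 2))

for degrees in (5, 15, 30, 45, 60, 90):
    ratio = period_ratio(math.radians(degrees))
    print(f"{degrees:3d} deg  T_exact/T_small_angle = {ratio:.4f}  "
          f"({(ratio - 1) * 100:.2f}% longer)")
```

The ratio is 1 at zero amplitude and grows slowly at first, which is why the amplitude-independent formula survives so well in typical classroom experiments.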
Three Linearizations Every Physicist Knows
The small-angle approximation for the pendulum is just one instance of a general technique. Here are three linearizations that appear throughout physics and engineering. All are Taylor expansions around $x = 0$ (or $\theta = 0$), keeping only the linear term.
1. The sine function
$$\sin\theta \approx \theta \quad \text{(for small } \theta \text{ in radians)}$$
You have already seen this one in detail. The next correction term is $-\theta^3/6$.
2. The cosine function
$$\cos\theta \approx 1 - \frac{\theta^2}{2} \quad \text{(for small } \theta \text{ in radians)}$$
Wait --- this one keeps a quadratic term. Why? Because $\cos 0 = 1$ and $(\cos\theta)' = -\sin\theta$, so $f'(0) = 0$. The tangent line at $\theta = 0$ is just the horizontal line $y = 1$, which misses the curvature entirely. To get any useful information about how $\cos\theta$ departs from 1, you need the quadratic term. In some contexts, the approximation $\cos\theta \approx 1$ is sufficient (this is the truly linear version), but for problems where the departure from 1 matters, the quadratic correction $1 - \theta^2/2$ is essential.
3. The binomial approximation
$$(1 + x)^n \approx 1 + nx \quad \text{(for } |x| \ll 1\text{)}$$
This is the linearization of the function $f(x) = (1 + x)^n$ around $x = 0$. Since $f(0) = 1$ and $f'(0) = n$, the tangent line gives $1 + nx$. This approximation works for any exponent $n$ --- integer, fractional, or negative.
Examples:
- $\sqrt{1 + x} = (1 + x)^{1/2} \approx 1 + \frac{x}{2}$
- $\frac{1}{1 + x} = (1 + x)^{-1} \approx 1 - x$
- $(1 + x)^3 \approx 1 + 3x$
The binomial approximation is quietly ubiquitous. It appears in gravity corrections, relativistic approximations, optics, circuit analysis, and countless other settings where a quantity is "close to 1."
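The three examples above are easy to test numerically. This is a minimal sketch; the helper name `binomial_linear` and the sample values of $x$ are arbitrary choices made for illustration.

```python
def binomial_linear(x, n):
    """Linearization of (1 + x)**n about x = 0."""
    return 1 + n * x

cases = [("sqrt(1 + x)", 0.5), ("1 / (1 + x)", -1), ("(1 + x)^3", 3)]

for x in (0.01, 0.1):
    for label, n in cases:
        exact = (1 + x) ** n
        approx = binomial_linear(x, n)
        error = abs(approx - exact) / exact * 100
        print(f"x={x:<5} {label:<12} exact={exact:.6f}  "
              f"approx={approx:.6f}  error={error:.4f}%")
```

Comparing the two values of $x$ shows the pattern to expect from a linearization: shrinking $x$ by a factor of 10 shrinks the error by roughly a factor of 100, because the first omitted term is proportional to $x^2$.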
Before you read on: Try linearizing $\frac{1}{\sqrt{1 + x}}$ near $x = 0$ using the binomial approximation. Write down your answer, then check it below.
Check your answer
We write $\frac{1}{\sqrt{1 + x}} = (1 + x)^{-1/2}$. Using $(1 + x)^n \approx 1 + nx$ with $n = -1/2$:
$$(1 + x)^{-1/2} \approx 1 + \left(-\frac{1}{2}\right)x = 1 - \frac{x}{2}$$
This approximation is used in special relativity, where the Lorentz factor $\gamma = (1 - v^2/c^2)^{-1/2}$ can be approximated as $1 + \frac{v^2}{2c^2}$ for speeds much less than light.
Why Linearization Simplifies Differential Equations
Here is the structural reason linearization matters for this chapter. In Section 4.3, you studied three types of acceleration models:
- Constant acceleration: $\frac{dv}{dt} = C$ --- gives parabolic trajectories
- Acceleration proportional to velocity: $\frac{dv}{dt} = -bv$ --- gives exponential decay
- Acceleration proportional to displacement: $\frac{d^2x}{dt^2} = -kx$ --- gives oscillation
All three are linear differential equations. The unknown function ($v$ or $x$) appears only to the first power, not inside a sine, cosine, exponential, or any other nonlinear function. Linear DEs have two properties that make them manageable:
- Those with constant coefficients, like the three above, have closed-form solutions in terms of polynomials, exponentials, and trigonometric functions, the functions you already know.
- Their solutions can be added together (the superposition principle). If $x_1(t)$ and $x_2(t)$ are both solutions, then $x_1(t) + x_2(t)$ is also a solution.
A nonlinear DE like $\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta$ has neither property. It cannot be solved in closed form with standard functions, and superposition does not apply.
Linearization converts a nonlinear DE into a linear one. That is its power. By replacing $\sin\theta$ with $\theta$, we turn an unsolvable equation into a solvable one. The price is that the solution is only valid near the point of linearization --- but near that point, it is excellent.
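The superposition property can be checked numerically. The sketch below integrates both the linearized and the exact pendulum equation from two different starting states, then compares the sum of those two solutions with the solution started from the summed state. Everything specific here is an illustrative assumption: $g/L$ is set to 1, the starting states are arbitrary, and the fourth-order Runge-Kutta integrator is simply a convenient accurate stepper, not a method introduced in this section.

```python
import math

def simulate(restoring, theta0, omega0, t_end=10.0, dt=1e-3):
    """Integrate theta'' = -restoring(theta) with classical RK4; return the list of angles."""
    def deriv(theta, omega):
        return omega, -restoring(theta)

    theta, omega = theta0, omega0
    history = [theta]
    for _ in range(int(round(t_end / dt))):
        k1t, k1w = deriv(theta, omega)
        k2t, k2w = deriv(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t, k3w = deriv(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t, k4w = deriv(theta + dt * k3t, omega + dt * k3w)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        history.append(theta)
    return history

def superposition_gap(restoring):
    """Largest gap between (solution A + solution B) and the solution started from A + B."""
    a = simulate(restoring, 0.8, 0.0)          # released from rest at 0.8 rad
    b = simulate(restoring, 0.0, 1.0)          # kicked from the bottom at 1 rad/s
    combined = simulate(restoring, 0.8, 1.0)   # both initial conditions added together
    return max(abs(x + y - z) for x, y, z in zip(a, b, combined))

linear_gap = superposition_gap(lambda theta: theta)   # theta'' = -theta
nonlinear_gap = superposition_gap(math.sin)           # theta'' = -sin(theta)

print(f"linear equation:    largest superposition gap = {linear_gap:.2e}")
print(f"nonlinear equation: largest superposition gap = {nonlinear_gap:.2e}")
```

For the linear equation the gap is at the level of rounding error; for the nonlinear one it is a visible fraction of a radian, so adding two solutions does not give a solution.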
Pause and think: This is a pattern you will see again and again in physics. An exact equation is too hard. A linearized version is solvable. The linearized version is used --- not because physicists are lazy, but because it captures the essential behavior in the regime that matters.
Every physicist uses approximations constantly. It is not laziness --- it is strategy. The skill is knowing when the approximation is good enough, and recognizing when you have pushed past its range of validity.
Connection: Taylor Expansion as a Physicist's Tool
What you are doing here is Taylor expansion from calculus, applied to physics. The derivative gives the local linear model. The question is always: how local is local enough?
In calculus, you may have studied Taylor series as a technique for computing limits or series representations. In physics, the purpose is different. You use the Taylor expansion to replace a hard problem with an easier one that is accurate enough. The mathematical structure is identical, but the emphasis shifts from "does the series converge?" to "how much error does the truncation introduce, and can I live with that error?"
This shift in emphasis is important. A mathematician asks whether the Taylor series converges to the original function. A physicist asks whether the first term or two are close enough for the measurement at hand. Both are valid questions. In this course, you are learning to ask the physicist's question.
And the answer always has two parts:
- The approximation itself: $\sin\theta \approx \theta$, or $(1 + x)^n \approx 1 + nx$, or whatever linearization applies.
- The range of validity: the values of $\theta$ or $x$ for which the error is acceptably small. This range is not a universal constant --- it depends on how much error you can tolerate, which depends on the problem.
Linearization Around Non-Zero Points
So far, we have linearized functions around $x = 0$ (or $\theta = 0$). But the technique works at any point. If you want to approximate $f(x)$ near $x = a$, the linear approximation is:
$$f(x) \approx f(a) + f'(a)(x - a)$$
This is the tangent-line approximation at $x = a$, and it is accurate when $x$ is close to $a$.
Example: Suppose a pendulum hangs at rest at an angle $\theta_0 = \pi/6$ (30 degrees) from the vertical, due to a steady side-wind. It oscillates about this tilted equilibrium. To study small oscillations about $\theta_0$, you would linearize $\sin\theta$ around $\theta = \pi/6$, not around $\theta = 0$.
Setting $\theta = \theta_0 + \phi$ where $\phi$ is a small deviation:
$$\sin(\theta_0 + \phi) = \sin\theta_0 \cos\phi + \cos\theta_0 \sin\phi \approx \sin\theta_0 + \cos\theta_0 \cdot \phi$$
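Here we used $\cos\phi \approx 1$ and $\sin\phi \approx \phi$ for small $\phi$. The constant term $\sin\theta_0$ is the part of gravity's pull that the steady wind balances at equilibrium. The oscillation is governed by the $\phi$ term, and gravity's contribution to the local "stiffness" is proportional to $\cos\theta_0$, not to 1. (Depending on how the wind force is modeled, it can contribute a $\phi$-dependent term of its own.) The equilibrium angle therefore affects the frequency of small oscillations. Linearization at a different point gives a different, but equally valid, local model.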
The general principle: linearization is always local. It is excellent near the chosen point and increasingly unreliable far away. Knowing where it breaks is as important as using it.
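A short comparison makes the "local" part concrete. The sketch below evaluates $\sin\theta$ near 30 degrees using two tangent lines: the one built at $\theta_0 = \pi/6$ and the one built at $\theta = 0$. The function names are illustrative only.

```python
import math

theta0 = math.pi / 6  # linearization point (30 degrees)

def sin_near_theta0(phi):
    """Tangent-line model of sin(theta0 + phi), valid for small phi."""
    return math.sin(theta0) + math.cos(theta0) * phi

def sin_near_zero(theta):
    """Tangent-line model of sin(theta) built at theta = 0."""
    return theta

for phi in (0.01, 0.05, 0.1, 0.2):
    theta = theta0 + phi
    exact = math.sin(theta)
    err_local = abs(sin_near_theta0(phi) - exact)
    err_zero = abs(sin_near_zero(theta) - exact)
    print(f"phi={phi:.2f}  exact={exact:.5f}  "
          f"error(at pi/6)={err_local:.5f}  error(at 0)={err_zero:.5f}")
```

Near 30 degrees the tangent line built at $\pi/6$ is far more accurate than the one built at the origin, even though both are correct linearizations of the same function.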
Practice
Layer 1: Concrete --- Linearize and Compare
(a) Linearize $\sin\theta$ near $\theta = 0$. State the approximation and compute the percentage error at $\theta = 0.3$ radians (about 17 degrees).
(b) Linearize $\cos\theta$ near $\theta = 0$. Keep the first nontrivial correction (the $\theta^2/2$ term). Compute the percentage error at $\theta = 0.3$ radians.
(c) Use the binomial approximation to linearize $(1 + x)^4$ near $x = 0$. Compute the exact and approximate values at $x = 0.1$.
Check your answer
**(a)** The linearization is $\sin\theta \approx \theta$. At $\theta = 0.3$: $\sin(0.3) = 0.29552$. The approximation gives $0.3$. Percentage error: $\frac{|0.3 - 0.29552|}{0.29552} \times 100\% = \frac{0.00448}{0.29552} \times 100\% \approx 1.52\%$.
**(b)** The approximation is $\cos\theta \approx 1 - \theta^2/2$. At $\theta = 0.3$: $\cos(0.3) = 0.95534$. The approximation gives $1 - 0.09/2 = 1 - 0.045 = 0.955$. Percentage error: $\frac{|0.955 - 0.95534|}{0.95534} \times 100\% = \frac{0.00034}{0.95534} \times 100\% \approx 0.036\%$. The cosine approximation (with the quadratic term) is extremely accurate at 0.3 radians. This is because the next omitted term is $\theta^4/24 = 0.00034$, which is tiny.
**(c)** The binomial approximation gives $(1 + x)^4 \approx 1 + 4x$. At $x = 0.1$: exact value is $(1.1)^4 = 1.4641$. Approximate value is $1 + 4(0.1) = 1.4$. Percentage error: $\frac{|1.4 - 1.4641|}{1.4641} \times 100\% = \frac{0.0641}{1.4641} \times 100\% \approx 4.38\%$. The error is larger here because we dropped the $\binom{4}{2}x^2 = 6x^2 = 0.06$ term, which is not negligible when $x = 0.1$ and the exponent is 4.
Layer 2: Pattern --- Linearize a DE and Compare Solutions
Consider the differential equation for a pendulum: $\frac{d^2\theta}{dt^2} = -\omega^2 \sin\theta$, where $\omega^2 = g/L$.
(a) Write the linearized version of this equation for small $\theta$. What is its solution for the initial conditions $\theta(0) = \theta_0$ and $\dot{\theta}(0) = 0$?
(b) The linearized solution predicts that the period is $T = 2\pi/\omega$, independent of amplitude. The exact period for the nonlinear pendulum is:
$$T_{\text{exact}} = T_0 \left(1 + \frac{1}{16}\theta_0^2 + \frac{11}{3072}\theta_0^4 + \cdots \right)$$
where $T_0 = 2\pi/\omega$ is the linearized period. Compute $T_{\text{exact}}/T_0$ for $\theta_0 = 15°$, $30°$, and $60°$.
(c) At what initial amplitude does the linearized period differ from the exact period by more than 1%?
Check your answer
**(a)** The linearized equation is $\frac{d^2\theta}{dt^2} = -\omega^2\theta$. With $\theta(0) = \theta_0$ and $\dot{\theta}(0) = 0$, the solution is:
$$\theta(t) = \theta_0 \cos(\omega t)$$
**(b)** We need $\theta_0$ in radians for the series formula.
For $\theta_0 = 15° = 0.2618$ rad:
$$\frac{T_{\text{exact}}}{T_0} \approx 1 + \frac{1}{16}(0.2618)^2 + \frac{11}{3072}(0.2618)^4 = 1 + 0.00428 + 0.0000168 \approx 1.0043$$
The period is about 0.43% longer than the linearized prediction.
For $\theta_0 = 30° = 0.5236$ rad:
$$\frac{T_{\text{exact}}}{T_0} \approx 1 + \frac{1}{16}(0.5236)^2 + \frac{11}{3072}(0.5236)^4 = 1 + 0.01713 + 0.000269 \approx 1.0174$$
The period is about 1.74% longer.
For $\theta_0 = 60° = 1.0472$ rad:
$$\frac{T_{\text{exact}}}{T_0} \approx 1 + \frac{1}{16}(1.0472)^2 + \frac{11}{3072}(1.0472)^4 = 1 + 0.06854 + 0.00431 \approx 1.0728$$
The period is about 7.3% longer.
**(c)** We need $\frac{1}{16}\theta_0^2 \geq 0.01$ (ignoring higher-order terms, which are small).
$$\theta_0^2 \geq 0.16 \implies \theta_0 \geq 0.4 \text{ rad} \approx 23°$$
So the linearized period is within 1% of the exact period for amplitudes up to about 23 degrees. Beyond that, the amplitude dependence that the linear model erases starts to matter.
Layer 3: Structure --- Why Linear Means Solvable
(a) Why does linearization always produce a simpler differential equation? What structural property of the DE changes when you linearize?
(b) In Section 4.3, you saw that linear DEs with different structures produce different types of motion: constant acceleration ($dv/dt = C$), exponential decay ($dv/dt = -bv$), and oscillation ($d^2x/dt^2 = -kx$). Can a linear DE produce chaotic motion, that is, motion that is sensitive to initial conditions and appears random? Why or why not?
(c) If linearization always produces a linear DE, and linear DEs always produce "well-behaved" solutions (exponentials, sines, cosines, polynomials), what does this say about the range of behaviors that a linearized model can capture?
Check your answer
**(a)** Linearization replaces a nonlinear function (like $\sin\theta$, $e^x$, or $v^2$) with a polynomial of degree 1. This means the unknown function appears only to the first power in the DE --- no squares, no sines, no products of unknowns. That is the defining feature of a linear DE, and linear DEs have solution methods that nonlinear DEs generally lack. The structural change is the removal of nonlinearity: the terms that make the equation hard are precisely the ones that linearization eliminates.
**(b)** No. Linear DEs cannot produce chaotic motion. This is a theorem, not just an observation. The solutions of linear DEs are combinations of exponentials, sines, cosines, and polynomials --- functions whose long-term behavior is completely predictable (growth, decay, oscillation, or constant). Chaos requires nonlinearity. So when you linearize a system, you are guaranteed to get a predictable, well-behaved solution. This is both the power and the limitation of linearization: you gain solvability, but you lose the ability to see chaotic or other complex nonlinear behavior.
**(c)** A linearized model can only produce motions that are combinations of exponential growth/decay, sinusoidal oscillation, and polynomial drift. These are the "building blocks" of linear systems. Any behavior that does not fit these patterns --- chaos, amplitude-dependent frequency, limit cycles, solitons --- is invisible to the linearized model. Linearization captures the *local* character of the motion (is it oscillating or decaying?) but misses the *global* nonlinear features.
Layer 4: Debug --- A Bad Approximation in Action
A student is analyzing a pendulum released from rest at $\theta_0 = 45°$ ($\pi/4$ radians). They use the small-angle approximation throughout and report the following:
"The maximum restoring acceleration is $\frac{g}{L}\theta_0 = \frac{g}{L} \cdot 0.785$."
"The period is $T = 2\pi\sqrt{L/g}$, independent of the release angle."
(a) What is the exact maximum restoring acceleration? By what percentage did the student overestimate it?
(b) Using the series from the previous problem, by what percentage is the student's period estimate too short?
(c) The student defends the approximation: "The error is only about 10%, so it's fine." Under what circumstances would you agree? Under what circumstances would you not?
Check your answer
**(a)** The exact maximum restoring acceleration is $\frac{g}{L}\sin\theta_0 = \frac{g}{L}\sin(45°) = \frac{g}{L}(0.7071)$. The student computed $\frac{g}{L}(0.7854)$. Percentage overestimate: $\frac{0.7854 - 0.7071}{0.7071} \times 100\% \approx 11.1\%$. The student overestimated the restoring acceleration by about 11%. The approximate model predicts a stronger restoring force than actually exists.
**(b)** From the series:
$$\frac{T_{\text{exact}}}{T_0} \approx 1 + \frac{1}{16}\left(\frac{\pi}{4}\right)^2 + \frac{11}{3072}\left(\frac{\pi}{4}\right)^4$$
$$= 1 + \frac{1}{16}(0.6169) + \frac{11}{3072}(0.3805) = 1 + 0.0386 + 0.00136 \approx 1.040$$
The exact period is about 4.0% longer than the linearized prediction. The student's period is about 4% too short.
**(c)** Whether 10--11% error is "fine" depends entirely on the context.
**Acceptable:** If the student is doing a back-of-the-envelope estimate, designing a rough experiment, or checking whether a pendulum clock is feasible, 10% error may be perfectly adequate. Many engineering applications tolerate 10% uncertainty in preliminary design.
**Not acceptable:** If the student is calibrating a precision instrument, comparing experimental data to theory, or making a claim that the period is amplitude-independent (since it is *not* for this amplitude), then 10% error is too large. The student should use either the exact solution or include the correction terms.
The key point: it is not enough to say "the error is 10%." You must also say *whether 10% matters for the question being asked*. Approximation is always relative to purpose.
The Broader Principle: Approximation as Strategy
Throughout this section, we have used linearization to turn hard problems into easy ones. This might feel like cheating --- like sweeping the hard parts under the rug. It is not.
Here is the deeper truth: almost all of physics operates in the regime of valid approximation. The "exact" equations are often themselves approximations of something deeper. Newton's law of gravity is an approximation to general relativity. The Schrödinger equation is an approximation to quantum field theory. The pendulum equation $\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta$ is itself an approximation: it ignores air resistance, the mass of the string, the finite size of the bob, and the rotation of the Earth.
There is no "exact" model at the bottom. There are only models that are accurate enough for the question at hand. The skill is not finding the exact answer --- it is knowing which approximation to use and where it breaks down.
This is what makes linearization a principled scientific move rather than a desperate shortcut. When you write $\sin\theta \approx \theta$, you are not admitting defeat. You are making a conscious decision to trade a small, quantifiable amount of accuracy for a large gain in tractability. And you are keeping track of the trade --- you know the error, you know the range of validity, and you know what physical effects the approximation erases.
Pause and think: Every time you use $g = 9.8$ m/s$^2$ instead of computing the local gravitational acceleration from the mass and radius of the Earth, you are making an approximation. Every time you model a ball as a point particle, you are making an approximation. Linearization is just one more tool in this toolkit --- but it is a particularly powerful one because it comes with a built-in estimate of its own accuracy (the next term in the Taylor expansion).
Reflection
Think back over this section and consider:
When is an approximate answer more useful than an exact one?
An exact answer that you cannot compute is useless. An approximate answer that is 1% off and takes five minutes to derive is enormously valuable. The pendulum is the perfect example: the "exact" answer involves elliptic integrals that are opaque to most people, while the approximate answer $T = 2\pi\sqrt{L/g}$ is clean, memorable, insightful, and accurate to better than 1% for typical pendulum experiments.
But there is a flip side. An approximation used outside its range of validity is worse than useless --- it is misleading. A 45-degree pendulum does not have the same period as a 5-degree pendulum, and pretending it does can lead to real errors in real experiments.
The discipline of approximation is not just knowing how to simplify. It is knowing when to stop simplifying.
Looking Ahead
In this section, you learned to handle nonlinear equations by replacing them with linear approximations valid near a chosen point. This works beautifully when the motion stays close to that point --- small oscillations around equilibrium, small deviations from a known trajectory.
But what if the motion does not stay small? What if the pendulum swings through 90 degrees, or the drag force depends on $v^2$, or the equation simply has no closed-form solution at any approximation level?
In the next section, we take a completely different approach. Instead of finding a formula for the solution, we build the solution one time step at a time, using the differential equation as a recipe: compute the current acceleration, update the velocity, update the position, repeat. This is Euler's method, the simplest numerical technique, and it can approximate the solution of a differential equation whether it is linear or not, with almost no algebra. The cost is that the answer is a table of approximate numbers rather than a formula, and its accuracy depends on the size of the time step. As you will see, that trade-off is often worth making.
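As a preview only, the step-by-step recipe can be sketched in a few lines of Python. The function name `euler_pendulum`, the value $g/L = 9.8\ \text{s}^{-2}$ (a 1 m pendulum), and the step size are illustrative assumptions, and the next section may organize the method differently.

```python
import math

def euler_pendulum(theta0, g_over_L=9.8, dt=1e-4, t_end=2.0):
    """Step theta'' = -(g/L) sin(theta) forward from rest with explicit Euler.

    Returns (times, angles). Both updates in a step use the values from the
    start of that step.
    """
    theta, omega, t = theta0, 0.0, 0.0
    times, angles = [t], [theta]
    for _ in range(int(round(t_end / dt))):
        accel = -g_over_L * math.sin(theta)    # acceleration from the current angle
        new_omega = omega + accel * dt         # update the velocity
        new_theta = theta + omega * dt         # update the position with the old velocity
        theta, omega, t = new_theta, new_omega, t + dt
        times.append(t)
        angles.append(theta)
    return times, angles

times, angles = euler_pendulum(theta0=0.1)
print(f"after {times[-1]:.2f} s the angle is {angles[-1]:.4f} rad (started at 0.1000 rad)")
```

Note that the full $\sin\theta$ appears in the acceleration line: no linearization is needed, because the method never asks for a formula for the solution.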