

$f(x)=2x^2-x^3-2$. This is a cubic, and its graph is as shown. The real root of this function is at approximately $(-0.839,0)$.

So, the question is to use Newton's approximation method twice to approximate a solution to this $f(x)$.

I use an initial starting value of $x_0=1$. My first approximation is $x_1=2$, and my second one is $x_2=1.5$. I don't seem to move any closer to the real solution as I keep iterating through the method.

Am I misunderstanding how to use this approximation? Is the issue that my first guess was too far from the actual solution?


8 Answers


Newton's method does not always converge. Its convergence theory is for "local" convergence which means you should start close to the root, where "close" is relative to the function you're dealing with. Far away from the root you can have highly nontrivial dynamics.

One qualitative property is that, in the 1D case, you should not have an extremum between the root you want and your initial guess. If you have an odd number of extrema in the way, then you will start going away from the root you want, as you see here. If you have an even number of extrema in the way, then you will start going the right way, but you may later find yourself in a spot with an odd number of extrema in the way, leading to problems later.

Of course you may eventually find an occasion where there are an even number of extrema in the way, and then you manage to skip over all of them and get to the right side. At that point things will usually work out (not always, though). In this problem with your initial guess, that eventually happens, because the system eventually finds its way just slightly to the right of the extremum on the right, which sends it far off to the left.
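To make this concrete, here is a minimal Python sketch of the iteration for the question's function $f(x)=2x^2-x^3-2$ with the asker's start $x_0=1$ (the helper names are mine):

```python
# Newton's method on the question's cubic, started from x0 = 1.
# The first iterates (2, 1.5, 1/3, ...) wander because extrema of f
# lie between the guess and the root, but the orbit eventually
# falls into the basin of the root near -0.8393.
def f(x):
    return 2 * x**2 - x**3 - 2

def fprime(x):
    return 4 * x - 3 * x**2

def newton(x, steps):
    xs = [x]
    for _ in range(steps):
        x = x - f(x) / fprime(x)
        xs.append(x)
    return xs

iterates = newton(1.0, 20)
print(iterates[1:4])   # 2.0, 1.5, 0.333... (moving away at first)
print(iterates[-1])    # ~ -0.8392867552 (the real root)
```

Note that $f'(x)=4x-3x^2$ vanishes at $x=0$ and $x=4/3$, which are exactly the extrema the orbit has to get past.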


In fact, you gave up too early; the method eventually converges:

1 2.000000000000000000000000000
2 1.500000000000000000000000000
3 0.3333333333333333333333333333
4 2.148148148148148148148148148
5 1.637079608343976160068114091
6 0.9483928480399477528436835979
7 1.910874140183680201544963299
8 1.405089904362402921055022221
9 -1.324018083676046424512855515
10 -0.9614381794507316717924414480
11 -0.8500221808505758631523579893
12 -0.8393807176849843501240483025
13 -0.8392867625049899194321196645
14 -0.8392867552141611764525252322
15 -0.8392867552141611325518525647
16 -0.8392867552141611325518525647
17 -0.8392867552141611325518525647
18 -0.8392867552141611325518525647
19 -0.8392867552141611325518525647
20 -0.8392867552141611325518525647

The other answers are great. I'd just like to add a concrete example of the weird behavior that Ian's answer describes.

Let's consider a function $f(x) = \operatorname{sgn} x \sqrt{|x|}$. According to the algorithm, we iterate $$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$ Now for the derivative (if we're not doing the derivative at $x = 0$, the $\operatorname{sgn} x$ is just a constant, and $|x| = x \operatorname{sgn} x$): $$f' = [\operatorname{sgn} x \sqrt{x \operatorname{sgn} x}]' = \operatorname{sgn} x \frac{1}{2\sqrt{x \operatorname{sgn} x}} \operatorname{sgn} x = \frac{1}{2\sqrt{|x|}}.$$ Plugging in: $$x_{n+1} = x_n - \frac{\operatorname{sgn} x \sqrt{|x|}}{1/(2\sqrt{|x|})} = x_n - 2 \operatorname{sgn} x \left(\sqrt{|x|}\right)^2 =\\ =x_n - 2\operatorname{sgn} x |x| = x_n - 2 x_n = - x_n.$$

So if we start iterating in $x = a$ (where $a \not = 0$), we get the sequence $a, -a, a, -a, \ldots$ and the method loops forever between those two points, never getting to the root $x = 0$!
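This two-cycle is easy to check numerically; a minimal Python sketch, using the derivative formula derived above (valid for $x \neq 0$):

```python
import math

# f(x) = sgn(x) * sqrt(|x|); for x != 0 its derivative is 1 / (2 sqrt(|x|)).
def f(x):
    return math.copysign(math.sqrt(abs(x)), x)

def fprime(x):
    return 1.0 / (2.0 * math.sqrt(abs(x)))

x = 0.7  # any nonzero starting point
for _ in range(6):
    x = x - f(x) / fprime(x)   # equals -x, up to rounding
    print(x)                   # alternates -0.7, 0.7, -0.7, ...
```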

Edit: Here's a gnuplotted image: [plot of the oscillating iteration]. (In each iteration, we draw the tangent at the current point (the blue dashed line); the $x$ for which the tangent crosses $0$ is taken as the next approximation, so we move along the magenta line to get the starting point for the next iteration.)

By the way, have a look at this image from Wikipedia:

[image: Newton fractal for $z^5 = 1$]

It shows the complex plane colored with 5 colors, each color corresponding to one root of the complex equation $z^5 = 1$. Each point then has the color corresponding to the root to which Newton's method converges, if we start from that point. The "flowers" are beautiful to behold but totally abhorrent from the numerical point of view.
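The coloring behind such pictures can be computed in a few lines: iterate the Newton map $z \mapsto z - (z^5-1)/(5z^4)$ and record which root each starting point reaches. A minimal sketch without the plotting (the function name `basin` is mine):

```python
import cmath

# The five roots of z^5 = 1.
roots = [cmath.exp(2j * cmath.pi * k / 5) for k in range(5)]

def basin(z, iters=100, tol=1e-9):
    """Index of the root that Newton's method reaches from z (None if undecided)."""
    for _ in range(iters):
        z = z - (z**5 - 1) / (5 * z**4)   # Newton step for z^5 - 1
        for k, r in enumerate(roots):
            if abs(z - r) < tol:
                return k
    return None

print(basin(1.2))                                  # 0: starts near the root z = 1
print(basin(-0.3 + 0.01j), basin(-0.3 - 0.01j))    # conjugate starts, conjugate basins
```

Coloring each pixel of a grid by `basin` of its complex coordinate reproduces the "flowers".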


Newton's method has no global convergence guarantee for arbitrary functions, as you just learned.

Now, people have posted examples of where Newton's method doesn't converge, but they're all rather "unusual" functions (some being very non-smooth), so it's natural to assume they're pathological and won't happen in practice.

Your example is one where Newton just takes more iterations than expected to converge, so it's not too bad. But here is an example of a cubic polynomial for which Newton's method won't converge!

\begin{align*} f(x) &= -0.74 + 0.765 x + 1.1 x^2 - 3.55 x^3 \\ x_0 &= 5/9 \end{align*}

Not only that, but it's in fact a stable oscillation—small perturbations won't change the behavior.

And for bonus points, you can generate as many of these as you want! Just let the Newton step be $$g(x) = x - f'(x)^{-1} f(x)$$ and then you're just looking for a nontrivial solution to the equation $$x_0 = g^3(x_0) = g(g(g(x_0)))$$ where a solution $x_0$ would be trivial if $g(x_0) = 0$.
Notice the equation above is just another polynomial equation (although of a much higher order)—which means its solutions can be readily found numerically.
The only caveat is that these solutions might not necessarily be stable. I suspect you should be able to place a second-derivative condition to ensure stable solutions, but the exact equation is not obvious to me at the moment, so I'll leave it as an exercise for the reader. :-)
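Independent of Mathematica, the claimed behavior is easy to verify with a short Python sketch: iterate the Newton step $g$ from $x_0 = 5/9$ and watch it settle into a period-3 orbit rather than a root.

```python
# Newton iteration for f(x) = -0.74 + 0.765 x + 1.1 x^2 - 3.55 x^3.
# Started from x0 = 5/9, the orbit settles into a stable period-3
# cycle instead of converging to a root.
def f(x):
    return -0.74 + 0.765 * x + 1.1 * x**2 - 3.55 * x**3

def fprime(x):
    return 0.765 + 2.2 * x - 10.65 * x**2

def g(x):                        # one Newton step
    return x - f(x) / fprime(x)

x = 5 / 9
for _ in range(300):
    x = g(x)

cycle = [x, g(x), g(g(x))]       # the three points visited forever,
print(cycle)                     # roughly 0.556 -> 0.106 -> 0.847
print(abs(g(g(g(x))) - x))       # ~0: x is a fixed point of g(g(g(.)))
```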

Mathematica code for the plot:

Manipulate[
 With[{f = Evaluate[Rationalize[d + c # + b #^2 + a #^3]] &},
  Plot[f[x], {x, -0.61, 1}, PlotStyle -> {Thickness[Tiny]},
   Prolog -> {Thickness[Tiny],
     Line[Flatten[
       Map[{{#, 0}, {#, f[#]}} &,
        NestList[Compile[t, t - f[t]/f'[t]], x0, n]], 1]]}]],
 {{a, -3.55}, -4, 4}, {{b, 1.1}, -2, 2}, {{c, 0.765}, -1, 1},
 {{d, -0.74}, -1, 1}, {{x0, 5/9}, 0, 1}, {{n, 100}, 0, 1000, 1}]

Update:

I wrote some code to purposefully find both stable and unstable iterations:

Newton[f_] := t \[Function] t - f[t]/f'[t];
NewtonPlot[f_, xmin_, xmax_, x0_, n_, args___] :=
  Plot[f[x], {x, xmin, xmax}, args,
   Prolog -> {Thickness[Tiny],
     Line[Flatten[
       Map[{{#, 0}, {#, f[#]}} &,
        NestList[Compile[t, Newton[f][t]], x0, n]], 1]]}];

FindOscillatoryNewtonSolutions[f_, n_, h_] (* {Stables, Unstables} *) :=
  With[{step = Newton[f]},
   With[{nstep = t \[Function] Nest[step, t, n]},
    GroupBy[
     Map[#[[1]][[2]] &,
      Solve[{nstep[t] == t, Abs[step[t] - t] >= h}, t, Reals]],
     t \[Function]
      With[{step1hp = nstep[t + h], step1hm = nstep[t - h]},
       True \[And]
        Abs[N[step1hp - t]] >= Abs[N[nstep[step1hp] - t]] \[And]
        Abs[N[step1hm - t]] >= Abs[N[nstep[step1hm] - t]]]]]];

With[{z = 400, xmin = -1.1, xmax = +0.65, h = 10^-3},
 Manipulate[
  With[{f = t \[Function] Evaluate[Rationalize[d + c t + b t^2 + a t^3]],
    n = 3, m = 8},
   With[{solcategories = FindOscillatoryNewtonSolutions[f, n, h]},
    If[Length[solcategories] >= 2,
     Map[{Transpose@{N@SortBy[NestList[Newton[f], #1, n - 1], N]},
        NewtonPlot[f, xmin, xmax, N[#1 + dx], n*m,
         PlotStyle -> {Thickness[Tiny]}, ImageSize -> Scaled[0.2],
         PerformanceGoal -> "Quality"]} &,
      {First@Lookup[solcategories, True],
       First@Lookup[solcategories, False]}],
     Throw["Did not manage to find both a stable and an unstable solution."]]]],
  {{dx, h}, -4 h, +4 h, h/4}, {{a, -355}, -z, +z, 1}, {{b, 110}, -z, +z, 1},
  {{c, 77}, -z, +z, 1}, {{d, -74}, -z, +z, 1},
  SynchronousUpdating -> False]]

Notice, below, that the first point iterates stably, whereas the second one iterates unstably. (The iterations would diverge further, but I cut them off at some point.)

[image: stable and unstable Newton iterations]


The relationship between starting point and eventual convergence point for Newton's method is complicated. Consider this image (from Wolfram's MathWorld, Newton's Method):

[image: basins of attraction for Newton's method]

showing the basins of attraction in the complex plane for $x^3 - 1$, $x^4 - 1$, ..., $x^{11} - 1$. Each colour corresponds to a root of the polynomial, and each point is coloured to match the root that Newton's method eventually takes it to.

These basins are typically quite complicated, but looking at the whole complex plane makes the behavior a little clearer. The real line runs horizontally, halfway down each plot. If all you can see is what happens on the real axis, which is the situation in your problem, it is difficult to predict where a given starting point will end up.


Newton-Raphson can behave badly even in seemingly easy situations. Here I consider its use for minimization rather than root finding, but the same considerations apply: even in the case of convex functions, N-R may not converge.

For example: $$f(x)=\ln(e^x+e^{-x})$$ is $C^{\infty}$, strictly convex, and admits a single (global) minimum at $0$.

Yet, if we try to use N-R to find the minimum (i.e. the root of the derivative), the algorithm fails if started from a point $|x|>1.09$ (approx).

To see this, consider that $f'(x)=\tanh(x)$ and $f''(x)=\cosh^{-2}(x)$. Therefore, the N-R update rule maps the current $x$ to the new $x$ according to: $$x \leftarrow x - \frac{f'(x)}{f''(x)} = x - \frac12 \sinh(2x)$$ Whenever $|\frac12 \sinh(2x)| > |2x|$, the new iterate will be bigger in modulus than the previous one, i.e. N-R will diverge away from the solution. This happens for $|x|>1.09$ (approx).
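The threshold is easy to verify numerically; a minimal Python sketch (the cutoff near $|x| \approx 1.09$ solves $\sinh(2x) = 4x$, and the helper names are mine):

```python
import math

def step(x):                 # one N-R step for minimizing ln(e^x + e^-x):
    return x - 0.5 * math.sinh(2 * x)    # x - f'(x)/f''(x)

def iterate(x, n=10):
    for _ in range(n):
        x = step(x)
        if abs(x) > 10:      # clearly diverging; stop before sinh overflows
            break
    return x

print(abs(iterate(1.08)))    # below the threshold: converges to the minimum at 0
print(abs(iterate(1.10)))    # above it: diverges
```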


There are several articles about the convergence of Newton's method. There is something called the Newton-Kantorovich theorem, which gives rigour to the notion of convergence regions: your starting point must lie within the Fatou set that encloses the point of attraction of the dynamical system formed by the iterates of the Newton iteration.

On the Newton–Kantorovich hypothesis for solving equations

I used the method here to explore the regions of guaranteed convergence of Newton's method for the Gram points of the zeta function: Convergence of the Newton-Kantorovich Method for the Gram Points


The Newton-Raphson method basically asks you to draw the tangent to the function at the point $x_0$, and $x_1$ is the point where that tangent hits the $x$-axis. This obviously works well if the function follows reasonably close to the tangent up to the point where $y = 0$, and not so well otherwise.

You can see that between $x = 0$ and $x = 4/3 \approx 1.33$ the tangent actually points in the wrong direction, away from the root. And where the tangent is parallel or nearly parallel to the $x$-axis, close to the extrema at $x = 0$ and $x = 4/3$, following the tangent will take you far away from the zero.

For $x_0 < -0.346$, $x_1$ will be closer to the solution than $x_0$ and will again lie in the same range, so the sequence converges towards the solution. For $x_0 > 1.5$, $x_1$ will also be closer to the solution than $x_0$, but you still need to get past the interval from $-0.346$ to $1.5$.

And at $x_0 \approx 1.081792$ there is a point where $3$ iterations get you back to the point where you started :-)
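That three-cycle is easy to confirm numerically; a minimal Python sketch with the question's $f(x)=2x^2-x^3-2$:

```python
# The question's function and its Newton step.
def f(x):
    return 2 * x**2 - x**3 - 2

def g(x):
    return x - f(x) / (4 * x - 3 * x**2)

x0 = 1.081792
x3 = g(g(g(x0)))
print(x0, x3)   # three Newton steps return (almost) to the starting point
```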

