

Revisiting a Root-Finding Idea from Undergrad

I recently recovered some notes from undergrad, when I was taking numerical analysis for the first time.

Most people know of Newton's method: given a well-behaved function $f$ and some initial point $x_0$, we define the sequence

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)},$$

and under good conditions, this sequence converges quadratically (read: super fast) to a root $x^*$ of $f$, i.e., a point with $f(x^*) = 0$. We will assume at least one such root exists.

While taking a break from homework, I found myself sketching continuous functions and wondering if I could invent a sequence that would find roots by following a geometric process, rather than by writing down a formula for the sequence.

Chasing roots with circles?!

Here's what I tried: start with a function $f : \mathbb{R} \to \mathbb{R}$, well-behaved, e.g., $C^1$ or better, and assume it has a root. We take a random point $x_0$ and then draw the circle with radius $|f(x_0)|$ and center $(x_0, f(x_0))$. Next, we consider the points where this circle intersects the graph of $f$. This is equivalent to finding all $x$ such that

$$(x - x_0)^2 + (f(x) - f(x_0))^2 = f(x_0)^2.$$

So far so good, but why should this equation even have solutions? Note that the center $(x_0, f(x_0))$ lies on the graph and strictly inside the circle (assuming $f(x_0) \neq 0$), while a root $(x^*, 0)$ lies at distance $\sqrt{(x^* - x_0)^2 + f(x_0)^2} \geq |f(x_0)|$ from the center, i.e., on or outside the circle.

And so if $f$ is continuous and has a root somewhere, its graph must cross such circles on the way to the root. The idea from here was:

  1. Find the set $V_0 = \{x : (x - x_0)^2 + (f(x) - f(x_0))^2 = f(x_0)^2\}$.
  2. Select $x_1 := \operatorname{argmin}_{x \in V_0} |f(x)|$ (assuming $V_n$ is finite for all $n$; e.g., if $f$ is analytic, I believe this follows because the solutions of the circle equation have to be isolated).
  3. Repeat with $x_1$ to get to $x_2$, etc.
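To see the geometric scheme in action, here is a brute-force numerical sketch. The grid-scan-plus-bisection approach and all the helper names are my own choices, not part of the original idea:

```python
import math

def circle_intersections(f, x0, n=20001):
    """Find points where the circle centered at (x0, f(x0)) with radius
    r = |f(x0)| meets the graph of f.

    Brute force: any intersection satisfies (x - x0)^2 <= r^2, so we scan
    a grid over [x0 - r, x0 + r] for sign changes of
    g(x) = (x - x0)^2 + (f(x) - f(x0))^2 - r^2 and refine by bisection."""
    r = abs(f(x0))
    if r == 0:
        return [x0]  # x0 is already a root of f
    g = lambda x: (x - x0) ** 2 + (f(x) - f(x0)) ** 2 - r ** 2
    xs = [x0 - r + 2 * r * i / (n - 1) for i in range(n)]
    found = []
    for a, b in zip(xs, xs[1:]):
        if g(a) * g(b) < 0:
            for _ in range(60):  # bisection refinement
                m = 0.5 * (a + b)
                if g(a) * g(m) <= 0:
                    b = m
                else:
                    a = m
            found.append(0.5 * (a + b))
    return found

def circle_step(f, x0):
    """One step of the geometric scheme: among all intersections, keep
    the one with the smallest |f|."""
    candidates = circle_intersections(f, x0) or [x0]
    return min(candidates, key=lambda x: abs(f(x)))

# Example: f(x) = x^2 - 1 from x0 = 3; the step lands near the root at x = 1.
x1 = circle_step(lambda x: x * x - 1, 3.0)
```

Note that the scan costs thousands of evaluations of $f$ per step, which already hints at the practicality problem discussed next.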

This is where I stopped the analysis as an undergrad. The equation $(x - x_0)^2 + (f(x) - f(x_0))^2 = f(x_0)^2$ leads to harder problems than just solving $f(x) = 0$. For example, if $f(x) = x^2 - 1$ is your target, the circle equation is a quartic polynomial!

What appealed to me was that, when sketching the sequence on paper, you could see it taking large steps when far from the root and gradually slowing down as it got closer.

Can we salvage this idea and show it converges?

There's a simple way to make a method out of this madness: we will replace $f(x)$ with a tangent approximation! First,

$$f(x) \approx f(x_0) + f'(x_0)(x - x_0),$$

and then

$$
\begin{aligned}
(x - x_0)^2 + \left(f'(x_0)(x - x_0)\right)^2 &= f(x_0)^2 \\
(x - x_0)^2 \left(1 + f'(x_0)^2\right) &= f(x_0)^2 \\
x - x_0 &= \pm \frac{|f(x_0)|}{\sqrt{1 + f'(x_0)^2}} \\
x &= x_0 \pm \frac{|f(x_0)|}{\sqrt{1 + f'(x_0)^2}}.
\end{aligned}
$$

So at each step we compute the two values

$$x_\pm = x_n \pm \frac{|f(x_n)|}{\sqrt{1 + f'(x_n)^2}}$$

and then $x_{n+1} = \operatorname{argmin}_{x \in \{x_+,\, x_-\}} |f(x)|$.
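This update is easy to code. A minimal sketch, assuming $f$ and its derivative are supplied as callables (function and parameter names are mine):

```python
import math

def circle_method(f, fprime, x0, tol=1e-10, max_iter=200):
    """Tangent-circle iteration: step +/- |f(x_n)| / sqrt(1 + f'(x_n)^2)
    from x_n and keep whichever candidate has the smaller |f|."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        step = abs(fx) / math.sqrt(1.0 + fprime(x) ** 2)
        x = min(x + step, x - step, key=lambda c: abs(f(c)))
    return x

# Example: f(x) = x^2 - 1 from x0 = 3 converges to the root x = 1.
root = circle_method(lambda x: x * x - 1, lambda x: 2 * x, 3.0)
```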

Informal convergence proof

Let's now show (a bit informally) that this converges! Close to a root $x^*$, $f(x) \approx f'(x^*)(x - x^*)$ and $f'(x) \approx f'(x^*)$, so

$$x_{n+1} \approx x_n \pm \frac{|f'(x^*)|\,|x_n - x^*|}{\sqrt{1 + f'(x^*)^2}} = x_n \pm \frac{|f'(x^*)|}{\sqrt{1 + f'(x^*)^2}}\,|\epsilon_n|,$$

where $\epsilon_n = x_n - x^*$. So finally

$$|\epsilon_{n+1}| = |x_{n+1} - x^*| = \Big|\,|x_n - x^*| - |x_{n+1} - x_n|\,\Big| \approx |\epsilon_n| \left(1 - \frac{|f'(x^*)|}{\sqrt{1 + f'(x^*)^2}}\right) = C\,|\epsilon_n|,$$

where $0 < C < 1$ whenever $f'(x^*) \neq 0$, which means we have linear convergence when we are sufficiently close to the root! In fact, the bigger the slope of $f$ at the root, the faster this method converges (when close to the root)!
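We can check the predicted contraction factor numerically. A small sketch (example function and starting point are my choices) for $f(x) = x^2 - 1$, whose root $x^* = 1$ has $f'(x^*) = 2$ and therefore $C = 1 - 2/\sqrt{5} \approx 0.106$:

```python
import math

# f(x) = x^2 - 1 near its root x* = 1: f'(x*) = 2, so the predicted
# contraction factor is C = 1 - 2/sqrt(5) ~ 0.106.
f = lambda x: x * x - 1
fp = lambda x: 2 * x
C_pred = 1 - 2 / math.sqrt(5)

x = 1.5  # start reasonably close to the root
ratios = []
for _ in range(8):
    eps_before = abs(x - 1.0)
    step = abs(f(x)) / math.sqrt(1 + fp(x) ** 2)
    x = min(x + step, x - step, key=lambda c: abs(f(c)))
    ratios.append(abs(x - 1.0) / eps_before)

# The error ratios settle toward C_pred as the iterates approach the root.
print(round(C_pred, 4), [round(r, 4) for r in ratios])
```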

Getting stuck

Far from the root, problems arise because this algorithm targets small values of $|f|$ instead of roots. Suppose we are close to a local minimum of $|f|$, so that $f'(x_n) \approx 0$, at which point the possible iterates are approximately

$$x_\pm = x_n \pm |f(x_n)|.$$

Both candidates jump away from the minimum, and the algorithm then greedily picks whichever has the smaller value of $|f|$, which tends to throw it right back toward the minimum.
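This failure mode is easy to reproduce. A sketch using $f(x) = x^3 - 2x + 2$, my choice of example: it has a real root near $x \approx -1.77$ but a local minimum near $x \approx 0.82$ where $f \approx 0.91 > 0$, and starting near that minimum the iterates never escape:

```python
import math

f = lambda x: x ** 3 - 2 * x + 2   # real root near x ~ -1.7693
fp = lambda x: 3 * x ** 2 - 2      # local min of f at x = sqrt(2/3) ~ 0.816

x = 0.8  # start near the local minimum, where f ~ 0.91 > 0
for _ in range(100):
    step = abs(f(x)) / math.sqrt(1 + fp(x) ** 2)
    x = min(x + step, x - step, key=lambda c: abs(f(c)))

# The iterate keeps hopping around the minimum instead of escaping
# toward the root: f(x) stays well above zero.
print(x, f(x))
```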

You can see this behavior here:

[Figure: comparison of the iterates of the two methods]

Because of the slope, the two methods take similar steps in the beginning, but eventually the circle method gets stuck.

Is this a practical method?

Not really: it requires four evaluations per iteration: $f(x_n)$, $f'(x_n)$, and then $f(x_+)$ and $f(x_-)$. Compared to Newton, secant, Brent, etc., it seems like we get slower convergence for more work.
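To put rough numbers on this, here is a small head-to-head against Newton's method on $f(x) = x^2 - 1$; the starting point and tolerance are arbitrary choices of mine:

```python
import math

f = lambda x: x * x - 1
fp = lambda x: 2 * x

def newton(x, tol=1e-12, max_iter=100):
    """Newton's method: 2 evaluations (f, f') per iteration."""
    n = 0
    while abs(f(x)) > tol and n < max_iter:
        x -= f(x) / fp(x)
        n += 1
    return x, n

def circle(x, tol=1e-12, max_iter=100):
    """Tangent-circle method: 4 evaluations (f, f', and f at both
    candidates) per iteration."""
    n = 0
    while abs(f(x)) > tol and n < max_iter:
        step = abs(f(x)) / math.sqrt(1 + fp(x) ** 2)
        x = min(x + step, x - step, key=lambda c: abs(f(c)))
        n += 1
    return x, n

x_newton, n_newton = newton(3.0)
x_circle, n_circle = circle(3.0)
# Both find the root at x = 1, but Newton needs fewer (and cheaper) iterations.
print(n_newton, n_circle)
```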

Also, because of its greedy nature, it can get stuck in local minima.

However, this method converges quickly when the slope at the root is steep, and slowly when it is shallow.

Footnotes

  1. Technically, a function defined only at a single point is continuous, but you know the kind of functions we’re talking about here. Also, if the root is at x0, we’re done and don’t need to take another step.