Targeting the gradient flow

In this post, we will first summarize what we discussed in part 1 and part 2, then add a few extra thoughts.

Summary of the two posts

Part 1: we picked a function $f: \mathbb{R} \to \mathbb{R}$ and built a simple vector field such that for any trajectory $(x, y): \mathbb{R}_{>0} \to \mathbb{R}^2$, the trajectory is attracted to the graph of $f$, in the sense that if $u(t) := y(t) - f(x(t))$, then $\lim_{t \to \infty} u(t) = 0$. To do this, we defined the residual $u := y - f(x)$ and forced it to satisfy the ODE $u' = -ku$, $k > 0$ (which also gives an ODE for $y$). We also picked an arbitrary direction for $x$ to move in, $x' = 1$. We showed that the corresponding vector field $F(x, y) = (x', y')$ works for our purposes!
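
To make this concrete, here is a minimal numerical sketch of that construction; the particular $f$, the constants, and the forward-Euler integrator are my own choices for illustration, not necessarily what part 1 used:

```python
# Minimal sketch of the Part 1 field: x' = 1 and y' chosen so that the
# residual u = y - f(x) obeys u' = -k u. Example f, k, dt are assumptions.
import numpy as np

f  = lambda x: np.sin(x)          # example target graph
df = lambda x: np.cos(x)

def F(x, y, k=2.0):
    """Vector field (x', y')."""
    u = y - f(x)
    return 1.0, df(x) - k * u     # y' = f'(x) x' + u' = f'(x) - k u

x, y, dt = 0.0, 3.0, 1e-3         # start far above the graph
for _ in range(10_000):
    dx, dy = F(x, y)
    x, y = x + dt * dx, y + dt * dy

print(abs(y - f(x)))              # residual: driven to ~0, up to Euler error
```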

We discussed that we can extend the construction to vector fields on $\mathbb{R}^n$, as well as the case of $k$ being a function of $x$ instead of a constant.

Finally, given an existing vector field $F$ and the assumption $u' = -ku$, we showed that $u$ and $F$ satisfy a transport PDE:

$$F \cdot \nabla u + ku = 0.$$
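
As a quick sanity check (my addition, using sympy), the Part 1 field above does satisfy this PDE:

```python
# Symbolic check that F · ∇u + k u = 0 for the Part 1 field
# F = (1, f'(x) - k u) with u = y - f(x). The sympy check is my addition.
import sympy as sp

x, y, k = sp.symbols("x y k")
f = sp.Function("f")(x)

u = y - f
F = (1, sp.diff(f, x) - k * u)                    # (x', y')
pde = F[0] * sp.diff(u, x) + F[1] * sp.diff(u, y) + k * u

print(sp.simplify(pde))                            # prints 0
```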

Part 2: we applied some of these ideas to optimization. Starting from a smooth $g: \mathbb{R} \to \mathbb{R}$ for which we want to find a minimum, we created various vector fields $F$ such that the trajectories following them converge to the stationary points of $g$. We used a similar construction, setting $x' = -g'(x)$ and getting an ODE for $y$ from $u = y - g(x)$ and $u' = -ku$.

We also discovered that we have some freedom in picking $x'$, i.e., for any well-behaved $B: \mathbb{R}^2 \to \mathbb{R}$, setting $x' = -g'(x) + B(x, y)\,u$ will still give us convergent dynamics, as long as we are not completely cancelling $-g'(x)$ close to the stationary points. This opens up new possibilities, e.g., $B$ can also be a stochastic function, etc.
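
Here is a toy run of these dynamics with a stochastic $B$; the specific $g$, $B$, and constants are my own picks for illustration:

```python
# Toy Part 2 dynamics: x' = -g'(x) + B(x, y) u, with u = y - g(x) forced
# to decay via u' = -k u. All concrete choices below are assumptions.
import numpy as np

g  = lambda x: np.cos(3 * x) + 0.1 * x**2     # toy multi-well objective
dg = lambda x: -3 * np.sin(3 * x) + 0.2 * x

rng = np.random.default_rng(0)
B = lambda x, y: 5.0 * rng.standard_normal()  # a stochastic choice of B

k, dt = 1.0, 1e-3
x, y = 0.0, 4.0                               # large initial residual u0

for _ in range(20_000):
    u  = y - g(x)
    xp = -dg(x) + B(x, y) * u                 # extra term fades as u -> 0
    yp = dg(x) * xp - k * u                   # enforces u' = -k u
    x, y = x + dt * xp, y + dt * yp

print(x, dg(x))   # ends near a stationary point of g once u has decayed
```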

Follow-up questions and thoughts

It is fun watching the trajectories skip over some of the basins of attraction, but can we actually get more interesting behaviors out of picking the right $B$?

We assumed $u' = -ku \implies u = \exp(-kt)\,u_0$, and so no trajectory will have non-trivial periodic orbits. If a trajectory did, the residual would also have to be periodic, and that is impossible as $\exp(-kt)$ is not a periodic function. The only way for periodicity to hold is if $u(t) = 0$ for all $t > t_0$. However, on the attracting manifold we are following GD, which also doesn't have any non-trivial periodic orbits.

So, regardless of the choice of $B$ (assuming it is well-behaved), these methods cannot produce sustained exploration. There is an exponential clock, and once it rings, exploration is over: we are back to following gradient descent. We can give more time for exploration by adjusting the decay, e.g., setting $u = u_0/(1+t)$.
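
For example, $u = u_0/(1+t)$ satisfies $u' = -u/(1+t)$, i.e., it corresponds to the time-dependent rate $k(t) = 1/(1+t)$. A variant of the toy run above with this schedule (again, all concrete choices are mine) keeps the residual, and hence the exploration term, alive much longer:

```python
# Same toy dynamics, but with polynomial residual decay u = u0 / (1 + t),
# i.e. a time-dependent rate k(t) = 1 / (1 + t). Concrete choices are mine.
import numpy as np

g  = lambda x: np.cos(3 * x) + 0.1 * x**2
dg = lambda x: -3 * np.sin(3 * x) + 0.2 * x
rng = np.random.default_rng(0)
B  = lambda x, y: 5.0 * rng.standard_normal()

dt, t = 1e-3, 0.0
x, y = 0.0, 4.0

for _ in range(20_000):
    kt = 1.0 / (1.0 + t)              # the decay clock rings much later
    u  = y - g(x)
    xp = -dg(x) + B(x, y) * u
    yp = dg(x) * xp - kt * u
    x, y = x + dt * xp, y + dt * yp
    t  += dt

print(abs(y - g(x)))                  # ~ u0 / (1 + t): still sizeable at t = 20
```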

We don't have to match $g$.

In part 2, we had the residual be $u = y - g(x)$ and $u' = -ku$ (or, equivalently, had $u$ satisfy our transport PDE). However, what if instead of targeting the graph of $g$, we target the gradient flow $-\nabla g$ directly?

To do this, we will work in the $(x, v)$ space where $v = x'$. We set $u = v - (-\nabla g(x)) = v + \nabla g(x)$, i.e., we are getting trajectories to match the gradient flow $-\nabla g(x)$. Now

$$u' = v' + \nabla^2 g(x)\,x' = v' + \nabla^2 g(x)\,v.$$

If we now enforce $u' = -ku$ (and do a few calculations), we get:

$$x'' + \left(k + \nabla^2 g(x)\right)x' + k\,\nabla g(x) = 0.$$

So the same residual-decay idea produces second-order optimization dynamics if we focus on gradient flow.
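
A quick numerical check of this (with an example one-dimensional $g$, so $\nabla g = g'$, and constants of my choosing): integrate the second-order ODE as a first-order system in $(x, v)$ and watch the residual $u = v + g'(x)$ decay.

```python
# x'' + (k + g''(x)) x' + k g'(x) = 0 as a first-order system in (x, v).
# Example g, k, dt, and the Euler integrator are my assumptions.
import numpy as np

g   = lambda x: np.cos(3 * x) + 0.1 * x**2
dg  = lambda x: -3 * np.sin(3 * x) + 0.2 * x
d2g = lambda x: -9 * np.cos(3 * x) + 0.2

k, dt = 2.0, 1e-3
x, v = 0.0, 5.0                    # v = x', starting far off the flow

for _ in range(10_000):
    a = -(k + d2g(x)) * v - k * dg(x)   # v' from the second-order ODE
    x, v = x + dt * v, v + dt * a

print(abs(v + dg(x)))              # residual u = v + g'(x): ~0, up to Euler error
```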

I was wondering if this is a known second-order ODE. It looks like a special case of the form studied by Alvarez et al.¹: $x'' + \left(\alpha + \beta\,\nabla^2 g(x)\right)x' + \nabla g(x) = 0$.
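
To make the comparison concrete (this is my own back-of-the-envelope rescaling, so take it with a grain of salt): substituting $s = \sqrt{k}\,t$ into our ODE gives

$$\frac{d^2 x}{ds^2} + \left(\sqrt{k} + \frac{1}{\sqrt{k}}\,\nabla^2 g(x)\right)\frac{dx}{ds} + \nabla g(x) = 0,$$

i.e., the Alvarez et al. form with $\alpha = \sqrt{k}$ and $\beta = 1/\sqrt{k}$. So our family seems to sit on the $\alpha\beta = 1$ slice of theirs.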

There's potential to recover further ODEs from the literature (e.g., with $1/t$ damping) by allowing time-dependence in $k$ and setting $u = v + \eta(t)\,\nabla g(x)$. I leave that for future work.

TL;DR: All this came out of picking a residual form and then attaching a decay assumption.

Footnotes

  1. Alvarez, F., Attouch, H., Bolte, J. and Redont, P., 2002. A second-order gradient-like dissipative dynamical system with Hessian-driven damping: Application to optimization and mechanics. Journal de Mathématiques Pures et Appliquées, 81(8), pp. 747–779.