Subject: BFGS variable metric question
From: Jive Dadson
Date: Wed, 30 Oct 1996 17:34:15 +0000
Greetings Gurus!
I am developing software that requires non-linear multi-variate function
minimization. I've been using the Broyden-Fletcher-Goldfarb-Shanno variant
of the variable metric or "quasi-Newton" method from _Numerical Recipes In C_.
(Actually they call it "dfpmin", for Davidon-Fletcher-Powell, but it uses
the BFGS update rule.) I'm using that routine because for the problems
I'm working on, it is the fastest I've found -- when it works.
It has a couple of problems which I have been able to work around, but
I wonder if someone could tell me if I've done the best hack job on it.
I freely confess I don't know much about this subject.
One problem is in the line-search routine it uses. The search is not
intended to find the minimum along the line, but rather a point that is
a "good enough" improvement. Problem is, it very often fails even using
double precision. The documentation says that happens when the data is
"poorly scaled", but that's not it. It seems to happen most often when
the data is really noisy. The original code has a rather rude way of dealing
with the failure: It calls "exit". Obviously that is unacceptable.
I replaced the call to "exit" with a call to a more robust, exact
line-search routine.
The other problem is that the approximation to the Hessian matrix that
the BFGS routine builds very often becomes non-positive-definite.
I don't know why that happens, but when it does, the routine stops,
often at a bad local minimum. I hacked around that as follows: Where
the routine previously exited, I put in code to invert the matrix using
Cholesky decomposition. If that fails, I restart the algorithm with
the Hessian approximation set back to the identity matrix, so the
routine starts over with a gradient-descent step. That seems to have
fixed that problem. Does anyone know why the Hessian approximation
might be collapsing like that? Is there a better way to fix the algorithm?
Finally, I am taking advantage of the approximate Hessian the routine
builds, using it to estimate likelihoods for models. To ensure that I have
a decent approximation, I have hacked the BFGS routine so it always does
a multiple of N updates, where N is the number of dimensions. Is that
good enough? Would it be good enough just to require that it does
at least N updates?
Thanks much for any light you can shed on the subject.
J.