[forwarding back to r-help for archiving/further discussion]
On 10-09-05 08:48 PM, Sally Luo wrote:
> Prof. Bolker,
>
> Thanks for your reply and the helpful info.
>
> I still have a few questions.
>
> 1. I also tried methods other than "BFGS", without changing their
> default maxit values, and got very different results. Since I am not
> that experienced with the optim function, I am not sure which one is
> correct or trustworthy.
>
> *If I use method="SANN", that is:*
>
> > p <- optim(c(-0.2392925, 0.4653128, -0.8332286, 0.0657, -0.0031,
> +   -0.00245, 3.366, 0.5885, -0.00008, 0.0786, -0.00292, -0.00081,
> +   3.266, -0.3632, -0.000049, 0.1856, 0.00394, -0.00193, -0.889,
> +   0.5379, -0.000063, 0.213, 0.00338, -0.00026, -0.8912, -0.3023,
> +   -0.000056), f, method = "SANN", y = y, X = X, W = W)
>
> I get:
>
> There were 50 or more warnings (use warnings() to see the first 50)
> > p
> $par
>  [1] -0.2392925  0.4653128 -0.8332286  0.0657000 -0.0031000 -0.0024500  3.3660000  0.5885000 -0.0000800  0.0786000 -0.0029200 -0.0008100
> [13]  3.2660000 -0.3632000 -0.0000490  0.1856000  0.0039400 -0.0019300 -0.8890000  0.5379000 -0.0000630  0.2130000  0.0033800 -0.0002600
> [25] -0.8912000 -0.3023000 -0.0000560
>
> $value
> [1] -772.3262
>
> $counts
> function gradient
> 10000 NA
>
> $convergence
> [1] 0
>
> $message
> NULL
> > warnings()
> Warning messages:
> 1: In log(det(I_N - pd * wd - po * wo - pw * ww)) : NaNs produced
> 2: In log(det(I_N - pd * wd - po * wo - pw * ww)) : NaNs produced
> .
> .
>
> 49: In log(det(I_N - pd * wd - po * wo - pw * ww)) : NaNs produced
> 50: In log(det(I_N - pd * wd - po * wo - pw * ww)) : NaNs produced
>
>
for "SANN", the convergence criterion is not meaningful, because
SANN does not use a tolerance-based stopping criterion. As ?optim says,
"‘0’ indicates successful completion (which is always the case for
‘"SANN"’)."
> *If I change method to "Nelder-Mead", that is:*
>
> > p <- optim(c(-0.2392925, 0.4653128, -0.8332286, 0.0657, -0.0031,
> +   -0.00245, 3.366, 0.5885, -0.00008, 0.0786, -0.00292, -0.00081,
> +   3.266, -0.3632, -0.000049, 0.1856, 0.00394, -0.00193, -0.889,
> +   0.5379, -0.000063, 0.213, 0.00338, -0.00026, -0.8912, -0.3023,
> +   -0.000056), f, method = "Nelder-Mead", y = y, X = X, W = W)
>
> Then I get:
>
> There were 21 warnings (use warnings() to see them)
> > p
> $par
>  [1] -0.2392925  0.4653128 -0.8332286  0.0657000 -0.0031000 -0.0024500  3.3660000  0.5885000 -0.0000800  0.0786000 -0.0029200 -0.0008100
> [13]  3.5184500 -0.3632000 -0.0000490  0.1856000  0.0039400 -0.0019300 -0.8890000  0.5379000 -0.0000630  0.2130000  0.0033800 -0.0002600
> [25] -0.8912000 -0.3023000 -0.0000560
>
> $value
> [1] -772.3568
>
> $counts
> function gradient
> 192 NA
>
> $convergence
> [1] 10
>
> According to the R manual, convergence = 10 indicates degeneracy of
> the Nelder-Mead simplex. Could you explain to me what this degeneracy
> means? Does it mean the optimization got stuck at a local minimum?
This means that the Nelder-Mead simplex has shrunk in such a way that
in at least one dimension its extent has collapsed to a point. It
doesn't have anything to do with local minima (there's not really any
way that a single run of a local optimizer can detect a local minimum).
You could try re-starting the optimization from the point at which the
previous run stopped.
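For example, a minimal restart sketch (assuming p, f, y, X, and W are
still in your workspace from the run above):

  ## feed the stopping point of the degenerate run back in as the start
  p2 <- optim(p$par, f, method = "Nelder-Mead", y = y, X = X, W = W)
  p2$value        ## did the restart improve on p$value?
  p2$convergence  ## 0 here would indicate a normal stop this time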

> 2. I also tried to change the maxit value of BFGS to 10000, and got
> the following results. It seems this time the algorithm converges, but
> the estimation results are quite different from what I got by using
> the method "SANN". In this case, which method should I use?
>
> > p <- optim(c(-0.2392925, 0.4653128, -0.8332286, 0.0657, -0.0031,
> +   -0.00245, 3.366, 0.5885, -0.00008, 0.0786, -0.00292, -0.00081,
> +   3.266, -0.3632, -0.000049, 0.1856, 0.00394, -0.00193, -0.889,
> +   0.5379, -0.000063, 0.213, 0.00338, -0.00026, -0.8912, -0.3023,
> +   -0.000056), f, method = "BFGS", hessian = TRUE,
> +   control = list(maxit = 10000), y = y, X = X, W = W)
>
> There were 50 or more warnings (use warnings() to see the first 50)
> > p
> $par
>  [1]  1.113491e-01  6.347504e-02 -1.570647e-01  7.793766e-02  7.011026e-02 -3.075866e-03  3.365178e+00  8.123945e-02 -2.670111e-04
> [10]  7.941502e-02 -2.249492e-04 -1.388776e-03  3.266022e+00 -4.023881e-01 -6.195116e-03  1.829491e-01 -1.116388e-02 -3.088426e-03
> [19] -8.888543e-01  6.394912e-01  3.425666e-03  2.193541e-01  3.743851e-02  8.376799e-05 -8.915029e-01 -5.596738e-01 -1.845092e-03
>
> $value
> [1] -950.553
>
> $counts
> function gradient
> 31321 1741
>
> $convergence
> [1] 0
The BFGS result (-950) is much better than the SANN or Nelder-Mead
results (-772): optim minimizes by default, so the lower value wins.
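A quick way to put the runs side by side (assuming you saved the three
results above under the hypothetical names p_sann, p_nm, and p_bfgs):

  ## smaller $value = better fit, since optim minimizes by default
  sapply(list(SANN = p_sann, NelderMead = p_nm, BFGS = p_bfgs),
         function(p) p$value)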
>
> 3. In Peng's email, he pointed out the importance of choosing good
> initial values in order to get sensible estimates from optim. Since I
> am not confident that my initial values are that good, and I got
> different estimation results under different methods, would you
> recommend any alternative function for solving maximization problems?
>
The fact that you get different results from different optimization
techniques, and that the Nelder-Mead method gets stuck, does suggest
that you might have a problem with multiple local minima. You also have
27 free parameters, which is rather a lot for simple, general-purpose,
out-of-the-box optimizers. The two steps forward that I can suggest are
(1) make sure that the results you get from BFGS represent a sensible
answer to your problem (you haven't given any details of your problem,
so I can't tell); (2) try running BFGS with a variety of starting
conditions -- chosen systematically or randomly within some range that
won't crash -- as in the sketch below.
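A minimal multi-start sketch (assuming 'start' is a placeholder for
your current 27-element starting vector, and that the jitter sd of 0.1
is an arbitrary choice -- pick a spread that keeps f finite):

  set.seed(1)
  fits <- lapply(1:20, function(i) {
    ## jitter the starting vector, then refit with BFGS
    s <- start + rnorm(length(start), sd = 0.1)
    try(optim(s, f, method = "BFGS", control = list(maxit = 10000),
              y = y, X = X, W = W), silent = TRUE)
  })
  ok   <- fits[!sapply(fits, inherits, "try-error")]
  vals <- sapply(ok, function(p) p$value)
  best <- ok[[which.min(vals)]]  ## keep the run with the lowest value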
In the end, without analyzing your objective function carefully, you
can't have any guarantee that there aren't better, non-local minima
hiding somewhere ... if you (or someone else) haven't done that
analysis, then usually the best you can do is assert that the solution
you found is reasonable and that you have tried a range of starting
conditions.