I generally prefer "optim" to "nlminb", because it will optionally return
the hessian, and a review of the eigenvalues and eigenvectors can help
diagnose problems like this. When I tried "optim" with your "func", I got
an error message:
> est. <- optim( c(277, 100,101, 10), func,
+ hessian=TRUE)
Error in optim(c(277, 100, 101, 10), func, hessian = TRUE) :
objective function in optim evaluates to length 100 not 1
For both "optim" and "nlminb", the second argument is a "function to be
minimized", which must return a scalar. Your "func" returned a vector of
length 100 = length(x); "nlminb" tried to minimize only the first
component of that vector, without reference to y!
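As a quick check, assuming x <- 1:100 from your post and your original
"func" are in the workspace, the mismatch is easy to see:

length(func(c(277, 100, 101, 10)))   # 100, not the single value optim and nlminb expect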
I therefore modified your "func" as follows and tried it with "nlminb"
and with four methods of "optim":
func1 <- function(par, y, x) {
    a   <- par[1]
    m   <- par[2]
    n   <- par[3]
    tau <- par[4]
    y.  <- a * (1 + m*exp(-x/tau)) / (1 + n*exp(-x/tau))
    sum((y - y.)^2)   # residual sum of squares: a single number
}
est.n  <- nlminb(c(277, 100, 101, 10), objective = func1,
                 control = list(eval.max = 400, iter.max = 1000),
                 y = y, x = x)
est.o1 <- optim(c(277, 100, 101, 10), func1,
                hessian = TRUE, y = y, x = x)
est.o2 <- optim(c(277, 100, 101, 10), func1, method = "BFGS",
                hessian = TRUE, y = y, x = x)
est.o3 <- optim(c(277, 100, 101, 10), func1, method = "CG",
                hessian = TRUE, y = y, x = x)
est.o4 <- optim(c(277, 100, 101, 10), func1, method = "SANN",
                hessian = TRUE, y = y, x = x)
These 5 optimizers found the following "minima", returning "convergence"
messages as follows:

nlminb: $objective = 808.7282; $convergence = 0 ("relative convergence (4)")

optim(method="Nelder-Mead", the default): $value = 9032.814 (11 times the
nlminb minimum); $convergence = 1 (the iteration limit 'maxit' had been
reached)

optim(method="BFGS"): $value = 1189625 (1471 times nlminb);
$convergence = 0 ("successful convergence")

optim(method="CG"): $value = 1189574 (1471 times nlminb);
$convergence = 1 (the iteration limit 'maxit' had been reached)

optim(method="SANN"): $value = 42819.06 (53 times nlminb);
$convergence = 0 ("successful convergence")
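If it helps to see these side by side, here is a small sketch, assuming
the five fitted objects above are in the workspace (note that nlminb
stores the minimum in $objective, while optim uses $value):

rss <- c(nlminb = est.n$objective,
         Nelder.Mead = est.o1$value,
         BFGS = est.o2$value,
         CG = est.o3$value,
         SANN = est.o4$value)
round(rss / min(rss), 1)   # ratio of each minimum to the best (nlminb) fit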
Clearly, nlminb got the smallest residual sum of squares. To get the
hessian at that solution, I fed the nlminb output into optim as follows:
est.no1 <- optim(est.n$par, func1, hessian = TRUE, y = y, x = x)
est.no.hess.eig <- eigen(est.no1$hessian, symmetric = TRUE)
signif(est.no.hess.eig$values, 2)
[1] 6.9e+05 3.2e+01 1.6e-04 -1.0e-07
Since the smallest eigenvalue is negative, we might think that the
hessian is indefinite. However, the smallest eigenvalue is less than
1e-12 times the largest in absolute value, which really means that the
smallest eigenvalue is essentially zero. Moreover, since the third
eigenvalue is less than 1e-9 times the largest, the numerical rank of the
hessian is at most 2. If we also compare the first and second
eigenvalues, we see that the first is over 20,000 times the second. This
says that with your model and your data, you can really only estimate one
parameter; everything else looks like noise to me.
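One way to see this scaling at a glance is to look at the eigenvalues
relative to the largest, using the decomposition above:

signif(est.no.hess.eig$values / max(est.no.hess.eig$values), 2)
# the last two ratios are within numerical noise of zero,
# so the hessian has numerical rank at most 2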
The eigenvectors provide more detail:
> round(est.no.hess.eig$vectors, 2)
[,1] [,2] [,3] [,4]
[1,] -0.01 1.00 0.00 0.00
[2,] 0.00 0.00 1.00 0.01
[3,] 0.00 0.00 -0.01 1.00
[4,] 1.00 0.01 0.00 0.00
>
This says that your data can estimate the fourth parameter (tau) very
well, as the largest eigenvalue is associated almost exclusively with
that parameter. You might also be able to estimate the first parameter
(a) somewhat, as it is associated almost exclusively with the second
eigenvalue. However, you have to move the second and third parameters
(m and n) much more than the first and fourth to see much change in the
residual sum of squares.
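A small sketch that may make this easier to read: label the rows of the
eigenvector matrix with the parameter names in the order used by "func1":

ev <- est.no.hess.eig$vectors
dimnames(ev) <- list(c("a", "m", "n", "tau"),
                     paste("eig", 1:4, sep = ""))
round(ev, 2)   # each column shows which parameter dominates that eigenvector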
For more information, I suggest you consult the following:

Seber, G. A. F., and Wild, C. J. (1989) Nonlinear Regression (Wiley;
reprinted 2003), esp. ch. 7 on "Growth Models"

Bates, D. M., and Watts, D. G. (1988) Nonlinear Regression Analysis and
Its Applications (Wiley), esp. ch. 6 on "Graphical Summaries"
Hope this helps.
Spencer Graves
Rajarshi Guha wrote:
> Hi,
> I'm trying to fit some data using a logistic function defined as
>
> y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
>
> My data is below:
>
> x <- 1:100
>
> y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
> 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
> 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
> 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
> 268,270,272,272,272,273,275,275,275,276)
>
> My first attempt was to use nls as below:
>
> d <- data.frame(x=x, y=y)
> model <- nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), data=d,
> start=list(a=277,m=100,n=101,tau=10),
> algorithm='port', trace=TRUE,
> control=nls.control(maxiter=5000, minFactor=1/2048))
>
> Running the above code I get the following error message:
>
> Convergence failure: function evaluation limit reached without
> convergence (9).
>
> To investigate this further I used nlminb() to get a set of starting
> parameters. Thus I did:
>
> func <- function( par ) {
> a = par[1]
> m = par[2]
> n = par[3]
> tau = par[4]
> a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
> }
> est <- nlminb( c(277, 100,101, 10), objective=func,
> control=list(eval.max=400, iter.max=1000))
>
> I get absolute convergence and a set of parameter values. Plugging these
> into the nls call and trying again still gives me
>
> Convergence failure: function evaluation limit reached without
> convergence (9)
>
> I have tried a number of different starting values for the nls() call
> but I usually end up getting the following error:
>
> Convergence failure: false convergence (8)
>
> After reading the PORT library docs, I see that this error can mean
>
> 1) gradient is calculated incorrectly
> 2) stopping tolerances are too tight
> 3) gradient is discontinuous near some iterate
>
> However, since nls() usually reports the above error after 30 to 40
> iterations, the PORT docs suggest that it is not problem 1. I'm not sure
> about (3) - I have other data which are somewhat similar to the above
> data, but they lead to a straightforward fit.
>
>
>
> In the end I tried a different starting value and lowered the tolerances
> a little, and I got a valid fit.
>
> My questions are:
>
> 1) Why would the parameters that lead nlminb() to converge fail to work
> in nls() (since I'm using the PORT algorithm in both cases)?
>
> 2) Is there a strategy to obtain starting values? In my case I know that
> a should be around 277, but for the others I'm not sure.
>
> 3) Is there a quick way to check whether the gradient is discontinuous at
> a point, numerically (rather than calculating the analytical
> derivatives)? I did
>
> plot(diff(y))
>
> and it certainly looks messy, but I also have other y vectors which look
> equally jagged (though the jaggedness occurs at lower x)
>
> Any suggestions would be appreciated.
>
> Thanks,
>