Hello,

I am trying to do a very simple nonlinear regression. The function is

    y = (b0 + b1*x1 + b2*x2 + b3*x3) * x4^b4

I think I am doing everything correctly, but when I set the starting value of b4 to 0 (it is the theoretically sane starting value), it converges very quickly, and to the wrong solution. Wrong in the sense that 1) we do not expect this and 2) we do not get this in E-Views, Stata and SAS. I do not use any extra settings, just the plain defaults.

I ran several regressions, choosing starting values for b4 from the seq(-1,1,.01) series. It did find the correct values (with the `globally' smallest RSS), but the result depends strongly on the initial values. Moreover, the good result comes from a `bad' initial value! I have read that the nonlinear optimizer/minimizer will change in the future, but this is funny. And it happens when I use R-devel, anyway.

Has anyone had the same problem?

Thanks,

Zsombor
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Zsombor Cseres-Gergely <z.cseres-gergely at ucl.ac.uk> writes:

> I try to do a very simple nonlinear regression. The function is
>
> y = (b0 + b1*x1 + b2*x2 + b3*x3) * x4^b4

Are you taking advantage of the fact that four of your five parameters are conditionally linear? You can use algorithm = "plinear" to indicate to the nls function that your model is partially linear, as this one is. When this option is used you only need to specify a starting estimate for b4, and the optimization is reduced to a one-dimensional optimization of the profiled residual sum-of-squares. You would write the model as

    nls(y ~ x4^b4*cbind(1, x1, x2, x3), data = mydata, start = c(b4 = 0),
        algorithm = "plinear", trace = TRUE)

> I think I do everything well, but as I set the starting value of b4 to 0 (it
> is the theoretically sane starting value),

Do you really expect 0 to be a sensible value for this parameter? If so, have you already fit the linear regression model

    y ~ 1 + x1 + x2 + x3

and found it to be adequate? Why then do you think that x4 determines the response in this fashion, if your best guess at the value of b4 is the value that makes x4 of no consequence? Do you actually know so little about these data that you can't tell whether you expect b4 to be negative or positive?

One does not choose starting estimates in a nonlinear regression because they are theoretically possible values. One uses every possible trick to come up with values that are consistent with the observed data.

> it converges very quickly, and to the wrong solution.

Please explain this further. An independent evaluation of the nonlinear least squares algorithms in several major statistical and econometrics packages by Bruce McCullough found that the algorithm and convergence criterion used in S-PLUS (and in R) was one of two that did *not* declare convergence to incorrect values (in the sense that one or more of the "converged" parameter estimates had zero correct significant digits) on at least one test problem.

> Wrong in a sense, that 1) we do not expect this and 2) we
> do not get this on E-Views, Stata and SAS. I do not use any extra setting,
> just the plain default. I did several regressions choosing starting values for
> b4 on the seq(-1,1,.01) series. It did find the correct values (with
> `globally' smallest RSS), but the result is strongly dependent on the initial
> values. Moreover, the good result comes from a `bad' initial value! I have
> read that the nonlinear optimizer/minimizer will change in the future, but

Not that I am aware of. However, R is an open source system and you are welcome to contribute a superior nonlinear least squares implementation at any time.

> this is funny. And it happens when I use R-devel, anyway. Anyone
> had the same problem?

I don't think that question is answerable because you have not given us enough detail of your problem.
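The partially linear fit and the profiled residual sum-of-squares described in this reply can be tried out on simulated data. The sketch below is only an illustration under stated assumptions: the data frame mydata, the sample size, the "true" coefficients and the helper profile_rss are all invented here and have nothing to do with the poster's data. It fits the model with algorithm = "plinear" from the start b4 = 0, then traces the profiled RSS over seq(-1, 1, by = 0.01) by solving an ordinary least-squares problem at each fixed b4.

```r
# Simulated data: all names and values below are hypothetical.
set.seed(1)
n <- 200
mydata <- data.frame(
  x1 = rpois(n, 1) + 1,
  x2 = rpois(n, 1) + 1,
  x3 = rpois(n, 2),
  x4 = runif(n, 1, 5)   # strictly positive so x4^b4 is always defined
)
mydata$y <- with(mydata, (2 + 1.5 * x1 + x2 + 0.5 * x3) * x4^0.3) +
  rnorm(n, sd = 0.2)

# Partially linear fit: only b4 needs a starting value.
fit <- nls(y ~ x4^b4 * cbind(1, x1, x2, x3), data = mydata,
           start = c(b4 = 0), algorithm = "plinear")
b4_hat <- coef(fit)[["b4"]]

# Profiled RSS: for a fixed b4 the other four coefficients come from an
# ordinary least-squares fit, so the RSS is a function of b4 alone.
profile_rss <- function(b4, d) {
  X <- d$x4^b4 * cbind(1, d$x1, d$x2, d$x3)
  sum(lm.fit(X, d$y)$residuals^2)
}
grid <- seq(-1, 1, by = 0.01)
rss <- vapply(grid, profile_rss, numeric(1), d = mydata)
b4_grid <- grid[which.min(rss)]   # grid value with the smallest profiled RSS
```

Plotting rss against grid (for example with plot(grid, rss, type = "l")) shows whether the one-dimensional problem has a single minimum on the range of interest, which is one way to judge how sensitive the fit should be to the starting value of b4.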
Doug Bates has answered the substantive part of this.

On Fri, 10 Nov 2000, Zsombor Cseres-Gergely wrote:

> I have
> read that the nonlinear optimizer/minimizer will change in the future, but
> this is funny.

Where, please? nls does not use it, as far as I know, but the optimizer nlm was supplemented/replaced by optim in version 0.99.0, so the reference is outdated and we'd like to correct it if it is still current.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  stats.ox.ac.uk/~ripley
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
I have to correct my previous post (or have I already done so?): I used nlm, not nls.

On Thu, Nov 09, 2000 at 09:18:45PM -0600, Douglas Bates wrote:

> Are you taking advantage of the fact that four of your five parameters
> are conditionally linear? You can use

No. I used my fingers before my brain.

> You would write the model as
>
> nls(y ~ x4^b4*cbind(1, x1, x2, x3), data = mydata, start = c(b4 = 0),
>     algorithm = "plinear", trace = TRUE)

This works fine.

> Do you really expect 0 to be a sensible value for this parameter? If
> so, have you already fit the linear regression model
> y ~ 1 + x1 + x2 + x3
> and found it to be adequate? Why then do you think that x4 determines
> the response in this fashion is your best guess at the value of b4 is
> the value that makes x4 of no consequence.

Probably wrongly, but exactly for this reason. x1, x2 and x3 are the numbers of adult males, females and children in the household, y is energy intake, and x4 is log(income)/head. Plotting the data indicated a difference, so I chose 0 to see whether there is one.

> > it converges very quickly, and to the wrong solution.
> Please explain this further. An independent evaluation of the

> Not that I am aware of. However, R is an open source system and you
> are welcome to contribute a superior nonlinear least squares
> implementation at any time.

Well, I did not want to blame the minimizer engine (nor R!). I just observed these things and behaved like a consumer of software bloated with AI-like features, not like a craftsman with a precision tool. I hope R will teach me to be more the latter.

But one question still remains for me. OK, I should have noticed and exploited the structure of the problem. But what if I do not? Should the other way give such different results?

Thanks,

Zsombor
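One way to look at this closing question empirically is to minimize the full five-parameter residual sum-of-squares with nlm from several starting values of b4 and compare the minima and convergence codes it reports. The sketch below is an illustration under assumptions, not a reconstruction of the original analysis: the data are simulated, and the objective rss, the choice of starting the four linear coefficients at their lm estimates, and the three starting values of b4 are all invented for the example.

```r
# Simulated data: all names and values below are hypothetical.
set.seed(1)
n <- 200
d <- data.frame(
  x1 = rpois(n, 1) + 1,
  x2 = rpois(n, 1) + 1,
  x3 = rpois(n, 2),
  x4 = runif(n, 1, 5)   # strictly positive so x4^b is always defined
)
d$y <- with(d, (2 + 1.5 * x1 + x2 + 0.5 * x3) * x4^0.3) + rnorm(n, sd = 0.2)

# Full residual sum-of-squares in all five parameters b = (b0, b1, b2, b3, b4).
rss <- function(b, dat) {
  mu <- (b[1] + b[2] * dat$x1 + b[3] * dat$x2 + b[4] * dat$x3) * dat$x4^b[5]
  sum((dat$y - mu)^2)
}

# Assumption: linear coefficients start at their lm estimates; only the
# starting value of b4 is varied.
lin0 <- unname(coef(lm(y ~ x1 + x2 + x3, data = d)))
starts <- c(-1, 0, 1)
runs <- lapply(starts, function(s) nlm(rss, p = c(lin0, s), dat = d))

# Collect the estimate of b4, the minimum found and nlm's termination code.
res <- data.frame(
  start_b4 = starts,
  b4   = sapply(runs, function(r) r$estimate[5]),
  rss  = sapply(runs, function(r) r$minimum),
  code = sapply(runs, function(r) r$code)
)
print(res)
```

If the rows of res disagree, the run with the smallest rss is the best candidate, and the code column is worth reading before trusting any of them: nlm documents codes 1 and 2 as probable solutions and codes 3 to 5 as signs that the search stopped for another reason.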