Dear R users, Could somebody please help me to find a way of comparing nonlinear, non-nested models in R, where the number of parameters is not necessarily different? Here is a sample (growth rates, y, as a function of internal substrate concentration, x): x <- c(0.52, 1.21, 1.45, 1.64, 1.89, 2.14, 2.47, 3.20, 4.47, 5.31, 6.48) y <- c(0.00, 0.35, 0.41, 0.49, 0.58, 0.61, 0.71, 0.83, 0.98, 1.03, 1.06) model1 <- function(x, xo, ym) ym * (x-xo)/x model2 <- function(x, xo, ym, k) ym * (x-xo)/(k+x-xo) model3 <- function(x, xo, ym) ym * (1-exp(-log(2)*(x-xo)/xo)) model4 <- function(x, xo, ym, k) ym * (1-exp(-log(2)*(x-xo)/k)) fit1 <- nls(y~model1(x, xo, ym), start=list(xo=0.5, ym=1)) fit2 <- nls(y~model2(x, xo, ym, k), start=list(xo=0.5, ym=1, k=1)) fit3 <- nls(y~model3(x, xo, ym), start=list(xo=0.5, ym=1)) fit4 <- nls(y~model4(x, xo, ym, k), start=list(xo=0.5, ym=1, k=1)) anova(fit1, fit2) anova(fit3, fit4) Models 1 and 2 are nested, as are models 3 and 4 (set k=xo), so they can be compared using anova. I am looking for a way to compare the non-nested models (ie models 1 and 3, and models 2 and 4), or better still, I would like to compare all 4 at once. A significance test would be ideal, but I am beginning to think that this may not make statistical sense. In that case, is there an appropriate measure of goodness of fit? I?d be very grateful if someone could put me on the right track. Thanks, Tom R version 2.14.2 (2012-02-29) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C [5] LC_TIME=German_Germany.1252
Tom Shatwell <shatwell <at> igb-berlin.de> writes:> > Dear R users,> Could somebody please help me to find a way of comparing nonlinear, > non-nested models in R, where the number of parameters is not > necessarily different? Here is a sample (growth rates, y, as a > function of internal substrate concentration, x):> x <- c(0.52, 1.21, 1.45, 1.64, 1.89, 2.14, 2.47, 3.20, 4.47, 5.31, 6.48) > y <- c(0.00, 0.35, 0.41, 0.49, 0.58, 0.61, 0.71, 0.83, 0.98, 1.03, 1.06) > > model1 <- function(x, xo, ym) ym * (x-xo)/x > model2 <- function(x, xo, ym, k) ym * (x-xo)/(k+x-xo) > model3 <- function(x, xo, ym) ym * (1-exp(-log(2)*(x-xo)/xo)) > model4 <- function(x, xo, ym, k) ym * (1-exp(-log(2)*(x-xo)/k)) > > fit1 <- nls(y~model1(x, xo, ym), start=list(xo=0.5, ym=1)) > fit2 <- nls(y~model2(x, xo, ym, k), start=list(xo=0.5, ym=1, k=1)) > fit3 <- nls(y~model3(x, xo, ym), start=list(xo=0.5, ym=1)) > fit4 <- nls(y~model4(x, xo, ym, k), start=list(xo=0.5, ym=1, k=1)) > > anova(fit1, fit2) > anova(fit3, fit4)> Models 1 and 2 are nested, as are models 3 and 4 (set k=xo), so they > can be compared using anova. I am looking for a way to compare the > non-nested models (ie models 1 and 3, and models 2 and 4), or better > still, I would like to compare all 4 at once. A significance test > would be ideal, but I am beginning to think that this may not make > statistical sense. In that case, is there an appropriate measure of > goodness of fit? I?d be very grateful if someone could put me on the > right track.It is surely not a panacea, and some statisticians (primarily Brian Ripley) argue that they are inappropriate for non-nested models, but sensibly used information criteria such as AIC are good for this situation. The AIC gives an (asymptotic) approximation of the expected predictive accuracy of a given model, taking into account the goodness of fit and the number of parameters.>From base R:> AIC(fit1,fit2,fit3,fit4)df AIC fit1 3 -8.08637 fit2 4 -50.35842 fit3 3 -15.62821 fit4 4 -59.71780 I prefer this representation (by default it sorts the models and presents the delta-AIC rather than the raw value):> library(bbmle)Loading required package: stats4 AIC> AICtab(fit1,fit2,fit3,fit4) dAIC df fit4 0.0 4 fit2 9.4 4 fit3 44.1 3 fit1 51.6 3 You may want to use a finite-size correction:> AICctab(fit1,fit2,fit3,fit4,nobs=length(x))dAICc df fit4 0.0 4 fit2 9.4 4 fit3 40.9 3 fit1 48.4 3 The Vuong test is another alternative for non-nested models.
Folks: 1. This discussion belongs on a statistical list, like stats.stackexchange.com, not here. 2. But I will sin in 4) below. 3, I think the best advice I can give is to consult with a local statistical expert who can help you understand your goals and your data and ignore all advice here :-) . You are in danger of producing irreproducible scientific nonsense. 4. Nonlinear models are fundamentally different than linear models. e.g. Single parameters do **not** correspond to single "degrees of freedom." Asymptotic approximations may be wholly unreliable. Classical inference methodology should not be trusted. That is why I recommend 3) above I promise to sin no further in this thread. -- Bert On Thu, Nov 8, 2012 at 6:22 AM, Ben Bolker <bbolker at gmail.com> wrote:> Tom Shatwell <shatwell <at> igb-berlin.de> writes: > >> >> Dear R users, > >> Could somebody please help me to find a way of comparing nonlinear, >> non-nested models in R, where the number of parameters is not >> necessarily different? Here is a sample (growth rates, y, as a >> function of internal substrate concentration, x): > >> x <- c(0.52, 1.21, 1.45, 1.64, 1.89, 2.14, 2.47, 3.20, 4.47, 5.31, 6.48) >> y <- c(0.00, 0.35, 0.41, 0.49, 0.58, 0.61, 0.71, 0.83, 0.98, 1.03, 1.06) >> >> model1 <- function(x, xo, ym) ym * (x-xo)/x >> model2 <- function(x, xo, ym, k) ym * (x-xo)/(k+x-xo) >> model3 <- function(x, xo, ym) ym * (1-exp(-log(2)*(x-xo)/xo)) >> model4 <- function(x, xo, ym, k) ym * (1-exp(-log(2)*(x-xo)/k)) >> >> fit1 <- nls(y~model1(x, xo, ym), start=list(xo=0.5, ym=1)) >> fit2 <- nls(y~model2(x, xo, ym, k), start=list(xo=0.5, ym=1, k=1)) >> fit3 <- nls(y~model3(x, xo, ym), start=list(xo=0.5, ym=1)) >> fit4 <- nls(y~model4(x, xo, ym, k), start=list(xo=0.5, ym=1, k=1)) >> >> anova(fit1, fit2) >> anova(fit3, fit4) > >> Models 1 and 2 are nested, as are models 3 and 4 (set k=xo), so they >> can be compared using anova. I am looking for a way to compare the >> non-nested models (ie models 1 and 3, and models 2 and 4), or better >> still, I would like to compare all 4 at once. A significance test >> would be ideal, but I am beginning to think that this may not make >> statistical sense. In that case, is there an appropriate measure of >> goodness of fit? I?d be very grateful if someone could put me on the >> right track. > > It is surely not a panacea, and some statisticians (primarily > Brian Ripley) argue that they are inappropriate for non-nested > models, but sensibly used information criteria such as AIC are > good for this situation. The AIC gives an (asymptotic) approximation > of the expected predictive accuracy of a given model, taking into > account the goodness of fit and the number of parameters. > > >From base R: > >> AIC(fit1,fit2,fit3,fit4) > df AIC > fit1 3 -8.08637 > fit2 4 -50.35842 > fit3 3 -15.62821 > fit4 4 -59.71780 > > I prefer this representation (by default it > sorts the models and presents the delta-AIC > rather than the raw value): > >> library(bbmle) > Loading required package: stats4 > AIC> AICtab(fit1,fit2,fit3,fit4) > dAIC df > fit4 0.0 4 > fit2 9.4 4 > fit3 44.1 3 > fit1 51.6 3 > > You may want to use a finite-size correction: > >> AICctab(fit1,fit2,fit3,fit4,nobs=length(x)) > dAICc df > fit4 0.0 4 > fit2 9.4 4 > fit3 40.9 3 > fit1 48.4 3 > > The Vuong test is another alternative for non-nested > models. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm