bcooper@hsph.harvard.edu
2002-Jun-20 16:59 UTC
Possible bug with glm.nb and starting values (PR#1695)
Full_Name: Ben Cooper Version: 1.5.0 OS: linux Submission from: (NULL) (134.174.187.90) The help page for glm.nb (in MASS package) says that it takes "Any other arguments for the glm() function except family" One such argument is start "starting values for the parameters in the linear predictor." However, when called with starting values glm.nb returns: Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : variable lengths differ So it looks like this is either a bug, the documentation is inaccurate, or I'm missing something (or some combination of the above). An example: ________________________________________________________________ library(MASS) y<-c(7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3,3,4) lag1<-c(0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3,3) lag2<-c(0,0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3) lag3<-c(0,0,0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5)># first a poisson model which is OK >glm(y~lag1+lag2+lag3,family=poisson(link=identity))Error: no valid set of coefficients has been found:please supply starting values In addition: Warning message: NaNs produced in: log(x)> #therefore try: > glm(y~lag1+lag2+lag3,family=poisson(link=identity),start=c(2,0.1,0.1,0.1))# and this works. However, negative binomial model is not OK:> glm.nb(y~lag1+lag2+lag3,link=identity)Error: no valid set of coefficients has been found:please supply starting values In addition: Warning message: NaNs produced in: log(x)> #so try > glm.nb(y~lag1+lag2+lag3,link=identity,start=c(2,0.1,0.1,0.1))Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : variable lengths differ ________________________________________________________________ I can get glm.nb to work with starting values with the following hack to the glm.nb code: i) change the line m$method <- m$model <- m$x <- m$y <- m$control <- m$contrasts <- m$init.theta <- m$link <- m$... <- NULL to m$method <- m$model <- m$x <- m$y <- m$control <- m$contrasts <- m$init.theta <- m$link <-m$start <- m$... <- NULL [i.e. insert m$start] ii) remove the line: start <- model.extract(m, start) iii) change the line fit <- glm.fitter(x = X, y = Y, w = w, etastart = start, offset = offset, family = fam0, control = list(maxit = control$maxit, epsilon = control$epsilon, trace = control$trace > 1)) to fit <- glm.fitter(x = X, y = Y, w = w, start=start, offset = offset, family = fam0, control = list(maxit = control$maxit, epsilon = control$epsilon, trace = control$trace > 1)) iv) change default value of start to NULL (as in glm) rather than eta when glm.nb is called. newglm.nb with these changes then works. But no doubt there are reasons for things being the way they are and these changes screw other things up. $platform [1] "i686-pc-linux-gnu" $arch [1] "i686" $os [1] "linux-gnu" $system [1] "i686, linux-gnu" $status [1] "" $major [1] "1" $minor [1] "5.0" $year [1] "2002" $month [1] "04" $day [1] "29" $language [1] "R" I also tried this with version 1.4.1 under Windows and had the same probelm. cheers Ben -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
ripley@stats.ox.ac.uk
2002-Jun-20 17:33 UTC
Possible bug with glm.nb and starting values (PR#1695)
It's an R/S incompatibility issue. `start' should be interpreted in S's sense, which is R's etastart. Why R chose to be gratuitously incompatible has always puzzled me. Remember MASS was written for S (actually before R), and not every statement has been checked for R/S discrepancies. On Thu, 20 Jun 2002 bcooper@hsph.harvard.edu wrote:> Full_Name: Ben Cooper > Version: 1.5.0 > OS: linux > Submission from: (NULL) (134.174.187.90) > > > The help page for glm.nb (in MASS package) says that it takes "Any other > arguments for the glm() function except family" > > One such argument is start "starting values for the parameters in the linear > predictor." > > However, when called with starting values glm.nb returns: > > Error in model.frame(formula, rownames, variables, varnames, extras, > extranames, : > variable lengths differ > > So it looks like this is either a bug, the documentation is inaccurate, or I'm > missing something (or some combination of the above). > > > An example: > ________________________________________________________________ > > > library(MASS) > > y<-c(7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, > 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3,3,4) > > lag1<-c(0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, > 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3,3) > > lag2<-c(0,0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, > 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3) > > lag3<-c(0,0,0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6, > 12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5) > > > ># first a poisson model which is OK > >glm(y~lag1+lag2+lag3,family=poisson(link=identity)) > Error: no valid set of coefficients has been found:please supply starting > values > In addition: Warning message: > NaNs produced in: log(x) > > > #therefore try: > > glm(y~lag1+lag2+lag3,family=poisson(link=identity),start=c(2,0.1,0.1,0.1)) > > # and this works. However, negative binomial model is not OK: > > > glm.nb(y~lag1+lag2+lag3,link=identity) > Error: no valid set of coefficients has been found:please supply starting > values > In addition: Warning message: > NaNs produced in: log(x) > > > #so try > > glm.nb(y~lag1+lag2+lag3,link=identity,start=c(2,0.1,0.1,0.1)) > Error in model.frame(formula, rownames, variables, varnames, extras, extranames, > : > variable lengths differ > > ________________________________________________________________ > > > > I can get glm.nb to work with starting values with the following hack to the > glm.nb code: > > i) > > change the line > m$method <- m$model <- m$x <- m$y <- m$control <- m$contrasts <- m$init.theta > <- m$link <- m$... <- NULL > > to > > m$method <- m$model <- m$x <- m$y <- m$control <- m$contrasts <- m$init.theta > <- m$link <-m$start <- m$... <- NULL > > > [i.e. insert m$start] > > ii) > > remove the line: > > start <- model.extract(m, start) > > > iii) > > change the line > > fit <- glm.fitter(x = X, y = Y, w = w, etastart = start, > offset = offset, family = fam0, control = list(maxit = control$maxit, > epsilon = control$epsilon, trace = control$trace > > 1)) > > to > fit <- glm.fitter(x = X, y = Y, w = w, start=start, > offset = offset, family = fam0, control = list(maxit = control$maxit, > epsilon = control$epsilon, trace = control$trace > > 1)) > > iv) change default value of start to NULL (as in glm) rather than eta when > glm.nb is called. > > newglm.nb with these changes then works. > > But no doubt there are reasons for things being the way they are and these > changes screw other things up. > > $platform > [1] "i686-pc-linux-gnu" > $arch > [1] "i686" > $os > [1] "linux-gnu" > $system > [1] "i686, linux-gnu" > $status > [1] "" > $major > [1] "1" > $minor > [1] "5.0" > $year > [1] "2002" > $month > [1] "04" > $day > [1] "29" > $language > [1] "R" > > I also tried this with version 1.4.1 under Windows and had the same probelm. > > cheers > > Ben > > > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._