Paul Johnson
2013-May-01 02:34 UTC
[R] Trouble with methods() after loading gdata package.
Greetings to r-help land.

I've run into some program crashes and I've traced them back to methods() behavior after the package gdata is loaded. Here is a minimal reproducible example. This seems buggy to me. How about you?

    dat <- data.frame(x = rnorm(100), y = rnorm(100))
    lm1 <- lm(y ~ x, data = dat)

    methods(class = "lm")

    ## OK so far

    library(gdata)
    methods(class = "lm")
    ## epic fail

## OUTPUT.

> dat <- data.frame(x = rnorm(100), y = rnorm(100))
> lm1 <- lm(y ~ x, data = dat)
>
> methods(class = "lm")
 [1] add1.lm*           alias.lm*          anova.lm           case.names.lm*
 [5] confint.lm*        cooks.distance.lm* deviance.lm*       dfbeta.lm*
 [9] dfbetas.lm*        drop1.lm*          dummy.coef.lm*     effects.lm*
[13] extractAIC.lm*     family.lm*         formula.lm*        hatvalues.lm
[17] influence.lm*      kappa.lm           labels.lm*         logLik.lm*
[21] model.frame.lm     model.matrix.lm    nobs.lm*           plot.lm
[25] predict.lm         print.lm           proj.lm*           qr.lm*
[29] residuals.lm       rstandard.lm       rstudent.lm        simulate.lm*
[33] summary.lm         variable.names.lm* vcov.lm*

   Non-visible functions are asterisked
>
> library(gdata)
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

Attaching package: ‘gdata’

The following object is masked from ‘package:stats’:

    nobs

The following object is masked from ‘package:utils’:

    object.size

> methods(class = "lm")
Error in data.frame(visible = rep.int(FALSE, n2), from = rep.int(msg, :
  duplicate row.names: nobs.lm

> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] gdata_2.12.0.2

loaded via a namespace (and not attached):
[1] gtools_2.7.1 tcltk_3.0.0  tools_3.0.0

gdata is one of my favorite packages; it's worth the effort to get to the bottom of this.

--
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504         Center for Research Methods
University of Kansas              University of Kansas
http://pj.freefaculty.org         http://quant.ku.edu
Achim Zeileis
2013-May-01 08:20 UTC
[R] Trouble with methods() after loading gdata package.
On Tue, 30 Apr 2013, Paul Johnson wrote:

> Greetings to r-help land.
>
> I've run into some program crashes and I've traced them back to
> methods() behavior after the package gdata is loaded. I provide now a
> minimal re-producible example. This seems bugish to me. How about you?
>
> dat <- data.frame(x = rnorm(100), y = rnorm(100))
> lm1 <- lm(y ~ x, data = dat)

The two lines above are not really needed. It's just the behaviour of methods(class = "lm") before and after loading gdata:

> methods(class = "lm")
>
> ## OK so far
>
> library(gdata)
> methods(class = "lm")
> ## epic fail

The reason is that nobs.lm is found twice by methods(): first in "stats" and then in "gdata". Because methods() builds a data.frame whose row names are the method names, and row names in a data frame have to be unique, the duplicate triggers the error.

I guess it would be good to safeguard the methods() function against situations like this. If a method is found more than once, one could run unique() on the names, or alternatively keep the duplicates but add the information about which namespace each one comes from.

Additionally, it would probably be good if "gdata" changed its current behavior of copying the nobs generic and the nobs.lm method. gdata does this so that it can provide a modified nobs.default and a nobs.data.frame built on top of that nobs.default. But the price is that all the other methods registered with the stats::nobs generic are not found anymore:

R> methods(nobs)
[1] nobs.default* nobs.glm*     nobs.lm*      nobs.logLik*  nobs.nls*

... and after loading gdata ...

R> methods(nobs)
[1] nobs.data.frame* nobs.default*    nobs.lm*

Hence the glm/logLik/nls methods are only found if the user explicitly calls stats::nobs. An artificially constructed example (with gdata attached) is

R> m <- lm(dist ~ speed, data = cars)
R> nobs(logLik(m))
[1] 1
R> stats::nobs(logLik(m))
[1] 50

I haven't checked how much use gdata makes of the modified nobs.default outside nobs.data.frame.
If it isn't used in other crucial places, I would recommend omitting nobs/nobs.default/nobs.lm from the gdata namespace and just registering nobs.data.frame with stats::nobs.
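
The diagnosis and the two suggested remedies in the reply above can be tried without gdata installed. The following is only a minimal sketch under stated assumptions: the vectors `found` and `origin` are made-up stand-ins for what methods() collects internally, and `nobs_data_frame` is a hypothetical per-column count of non-missing values, not gdata's actual code.

```r
# Illustrative stand-ins (assumptions, not the internals of methods()):
# the same method name is reported by two attached packages.
found  <- c("anova.lm", "nobs.lm", "print.lm", "nobs.lm")
origin <- c("stats", "stats", "stats", "gdata")

# 1. Duplicated row names make data.frame() signal an error,
#    which is the failure seen in methods(class = "lm").
dup_result <- tryCatch(
  data.frame(visible = rep.int(FALSE, length(found)), row.names = found),
  error = function(e) e
)

# 2a. Possible safeguard: drop repeated names before building the table.
uniq <- unique(found)
safe_unique <- data.frame(visible = rep.int(FALSE, length(uniq)),
                          row.names = uniq)

# 2b. Possible safeguard: keep the duplicates, but store the originating
#     namespace in a column and leave the row names at their defaults.
safe_origin <- data.frame(method = found, from = origin,
                          stringsAsFactors = FALSE)

# 3. Register a data.frame method on the existing stats::nobs generic
#    instead of defining a second generic that masks it.
#    nobs_data_frame is a hypothetical method written for this sketch.
nobs_data_frame <- function(object, ...) {
  vapply(object, function(col) sum(!is.na(col)), integer(1))
}
registerS3method("nobs", "data.frame", nobs_data_frame,
                 envir = asNamespace("stats"))

df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
counts <- stats::nobs(df)      # dispatches to the newly registered method

# The methods that ship with stats are still reachable.
m <- lm(dist ~ speed, data = cars)
n_loglik <- stats::nobs(logLik(m))
```

Calling registerS3method() by hand only affects the current session. A package would normally get the same effect through its NAMESPACE file, with importFrom(stats, nobs) and S3method(nobs, data.frame), so that no second nobs generic is exported and nothing in stats is masked.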