Bolker,Benjamin Michael
2009-Oct-21 03:57 UTC
[Rd] odd evaluation within correlation argument of glmmPQL
[I think I've seen this reported before but can't locate it any more. I believe this oddity (glitch? feature?) is behind a query that Jean-Baptiste Ferdy asked a year ago <http://finzi.psych.upenn.edu/Rhelp08/2008-October/176449.html>] It appears that glmmPQL looks in the global workspace, not within the data frame specified by the "data" argument, for the variables specified in the "form" argument of spatial correlation structures provided to the "correlation" argument. This is potentially confusing/dangerous, because you can easily (1) get an error saying that the variable is not found (the least dangerous and confusing), (2) get an error saying the variable is of the wrong length, if there is a variable with a matching name but the wrong length in the global environment, (3) get completely misled if there is a *different* variable with a matching name in the global environment with the right length (in which case glmmPQL will ignore the element in the data frame you specified and use the one from the global environment). The following patch (to MASS 7.3-3, retrieved from CRAN src/contrib 2.10.0 directory) appears to fix the problem. A quick look suggests that the same version of MASS is in 2.11.0 as well. *** glmmPQL.R.orig 2009-10-20 23:48:38.000000000 -0400 --- glmmPQL.R 2009-10-20 23:51:33.000000000 -0400 *************** *** 39,44 **** --- 39,47 ---- off <- attr(Terms, "offset") if(length(off<- attr(Terms, "offset"))) allvars <- c(allvars, as.character(attr(Terms, "variables"))[off+1]) + ## add variables in correlation formula, if any + if (!missing(correlation) && !is.null(attr(correlation,"formula"))) + allvars <- c(allvars,all.vars(attr(correlation,"formula"))) ## substitute back actual formula (rather than a variable name) Call$fixed <- eval(fixed); Call$random <- eval(random) m$formula <- as.formula(paste("~", paste(allvars, collapse="+"))) code below shows the problem. Ben Bolker ## set up data set.seed(1001) d = data.frame(x = runif(100), y = runif(100), z = rpois(100,2), f = factor(rep(letters[1:20],each=5))) library(MASS) ## without correlation structure: fine glmmPQL(z~1,random=~1|f,data=d,family="poisson") ## with correlation structure, ## but x and y not defined outside of 'd' glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## Error in eval(expr, envir, enclos) : object 'x' not found x = y = runif(10) glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## Error in ... ## variable lengths differ (found for 'f') ## this is what we probably wanted x = d$x y = d$y glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## this is dangerous! x = y = runif(100) glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## proof that glmmPQL really is looking ## outside of the data frame for its ## correlation variables ... x = y = rep(0,100) glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## Error in getCovariate.corSpatial(object, data = data) : ## Cannot have zero distances in "corSpatial"> sessionInfo()R version 2.9.2 (2009-08-24) i486-pc-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets grid methods [8] base other attached packages: [1] nlme_3.1-95 MASS_7.2-49 reshape_0.8.3 plyr_0.1.9 proto_0.3-8 loaded via a namespace (and not attached): [1] ggplot2_0.8.4 lattice_0.17-26
Ben Bolker
2009-Oct-27 19:05 UTC
[Rd] odd evaluation within correlation argument of glmmPQL
Ben Bolker wrote:> > > [snip] > > It appears that glmmPQL looks in the global workspace, not > within the data frame specified by the "data" argument, for > the variables specified in the "form" argument of spatial > correlation structures provided to the "correlation" argument. > > [snip example and patches] > >Replying to my own e-mail to bump this (it's been a week). The basic issue is that glmmPQL throws out data that are not involved in the fixed or random model terms, or in the offset term. It doesn't save variables that are only found in the correlation formula argument. It's hard to imagine how this behavior is anything other than a bug ... and the patch fixes it in a straightforward way. cheers Ben Bolker -- View this message in context: http://www.nabble.com/odd-evaluation-within-correlation-argument-of-glmmPQL-tp25986187p26083255.html Sent from the R devel mailing list archive at Nabble.com.