Trevor John Hastie
2016-Sep-19 20:13 UTC
[Rd] Subsetting issue in model.frame with na.omit
Running R version 3.3.1 (2016-06-21) "Bug in Your Hair"

I have discovered an issue with model.frame() with regard to its implementation of the na.action argument. This impacts the gam package.

We expect the last thing model.frame() does is run na.action on the frame it has produced. In the example below, we use "na.action=na.omit", which calls for subsetting out rows of the frame. However, when it does this, it does not see that there is a [.smooth method for the two columns, which are of S3 class "smooth". So it does do the subsetting, but does not use the subset methods. In my example, this is evidenced by the attribute element $NAs of each of the components still being present.

When I instead use "na.action=na.pass" in the call to model.frame, and then filter the resulting frame through na.omit(), it does the right thing. The $NAs component has disappeared, which is what should have happened here.

set.seed(101)
n=30
x=matrix(runif(n*2),n,2)
x[sample(1:20,6,replace=FALSE)]=NA
dx=data.frame(x)
library(gam)
###Compare
m=model.frame(~s(X1,df=4)+s(X2,df=4),data=dx,na.action=na.omit)
attributes(m[[1]])
###with
m=model.frame(~s(X1,df=4)+s(X2,df=4),data=dx,na.action=na.pass)
m=na.omit(m)
attributes(m[[1]])

------------------------------------------------------------------------------
Trevor Hastie                         hastie at stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231                 Fax: (650) 725-8977
URL: http://www.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
         390 Serra Mall, Stanford University, CA 94305-4065
------------------------------------------------------------------------------
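The two-step route described in this message (build the frame with na.pass, then filter it through na.omit()) can be wrapped in a small helper. This is only a sketch of that workaround, not a fix to model.frame() itself. The function name model_frame_two_step and the toy data frame dx_demo are hypothetical, introduced here for illustration, and the demo uses plain numeric columns so that it runs in base R without the gam package.

```r
# Hypothetical helper: apply na.omit() as a separate step after model.frame(),
# instead of passing it as the na.action argument.
model_frame_two_step <- function(formula, data, ...) {
  # Step 1: build the frame but keep rows containing NA.
  mf <- stats::model.frame(formula, data = data, na.action = stats::na.pass, ...)
  # Step 2: drop incomplete rows with an ordinary na.omit() call.
  stats::na.omit(mf)
}

# Toy data (assumption: plain numeric columns, no smooth terms).
dx_demo <- data.frame(X1 = c(1, NA, 3), X2 = c(4, 5, NA))
m <- model_frame_two_step(~ X1 + X2, dx_demo)
nrow(m)  # only the first row is complete
```

With gam loaded, the formula from the message, ~s(X1,df=4)+s(X2,df=4), could be passed to the same helper in place of the toy formula.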