Hi All,

I have a formula which has a factor with NAs in it. I wish to keep
these in the model matrix, but the NA information is currently lost (the
rows are kept but the NA gets converted to 0). Any ideas as to how
I can keep NAs in?

e.g.

    junk <- factor(c("hi",NA,"low","low","hi","low","hi","hi","low",NA,
                     "hi","low","hi","hi","low","hi"))

    y <- c(1,2,1,2,2,2,1,2,1,1,2,2,1,1,1,2)

    na.keep <- function(X){X}

    myfn <- function (formula,data=sys.parent()){
        mf <- match.call()
        mf[[1]] <- as.name("model.frame")
        if(is.null(mf$na.action)) mf$na.action <- as.name("na.keep")
        mf <- eval(mf, sys.frame(sys.parent()))
        Y <- model.extract(mf,response)
        Terms <- attr(mf,"terms")
        X <- model.matrix(Terms,mf)
        X
    }

    myfn(y~junk)

Cheers,
Rachel

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian Ripley
2000-Oct-06 07:51 UTC
[R] Formulae with factors that have missing values
On Fri, 6 Oct 2000, Rachel Merriman wrote:

> Hi All,
>
> I have a formula which has a factor with NAs in it. I wish to keep
> these in the model matrix, but the NA information is currently lost (the
> rows are kept but the NA gets converted to 0). Any ideas as to how
> I can keep NAs in?
>
> e.g.
>
> junk <-
> factor(c("hi",NA,"low","low","hi","low","hi","hi","low",NA,"hi","low","hi","hi","low","hi"))
>
> y <- c(1,2,1,2,2,2,1,2,1,1,2,2,1,1,1,2)
>
> na.keep <- function(X){X}
>
> myfn <- function (formula,data=sys.parent()){
>     mf <- match.call()
>     mf[[1]] <- as.name("model.frame")
>     if(is.null(mf$na.action)) mf$na.action <- as.name("na.keep")
>     mf <- eval(mf, sys.frame(sys.parent()))
>     Y <- model.extract(mf,response)
>     Terms <- attr(mf,"terms")
>     X <- model.matrix(Terms,mf)
>     X
> }
>
> myfn(y~junk)

On my system NA gets converted to 1.200089e-306, not 0. That looks like a
bug, and S gives NA.

The question is, what do you want NA to be represented as in the model
matrix? If you want NA to be another level of the factor, try

    junk <- factor(c("hi",NA,"low","low","hi","low","hi","hi","low",NA,
                     "hi","low","hi","hi", "low","hi"), exclude="")

Alternatively, you might want junk to be coded, and all the columns
of the coding set to NA. The simplest way to get that is

    contrasts(junk)[junk,, drop=F]

until we fix the bug.

As a matter of interest, what are you going to do with a model matrix
with NAs in?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
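
For anyone reading this thread later: the two options in the reply (NA as a level of its own, or NA carried through the coded columns) can be sketched as follows. This is only a sketch under assumptions, not what the posters ran. It assumes a current version of R, where behaviour may differ from the 2000 release discussed above. addNA() and na.pass are base R functions that do not appear in the thread, and the names junk_na, X1, Z, mf and X2 are made up for illustration.

```r
# Data from the original post.
junk <- factor(c("hi", NA, "low", "low", "hi", "low", "hi", "hi",
                 "low", NA, "hi", "low", "hi", "hi", "low", "hi"))

# Option 1: make NA a level of its own, so it gets its own dummy column.
# addNA() is an assumption here; the reply uses factor(..., exclude = "").
junk_na <- addNA(junk)
X1 <- model.matrix(~ junk_na)

# Option 2: code the factor and let the coded column be NA where junk is NA.
# Same idea as contrasts(junk)[junk, , drop = F] in the reply; as.integer()
# only makes the row lookup by level code explicit.
Z <- contrasts(junk)[as.integer(junk), , drop = FALSE]

# Option 2 through the model frame: na.pass keeps the rows with NA,
# and model.matrix() is then built from that frame (assumes a current R).
mf <- model.frame(~ junk, na.action = na.pass)
X2 <- model.matrix(~ junk, mf)
```

Printing X1, Z and X2 side by side shows the difference: X1 has no missing values but an extra column for the NA level, while Z and X2 keep all 16 rows and mark the coded column as missing in rows 2 and 10.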