Thaler,Thorn,LAUSANNE,Applied Mathematics
2013-Oct-28 16:19 UTC
[R] Automatically Remove Aliased Terms from a Model
Dear all,

I am trying to implement a function which removes aliased terms from a model. The challenge I am facing is that "alias" gives me the aliased coefficients of the model, which I then have to translate into the terms of the model formula. What I have tried so far:

------------------8<------------------
d <- expand.grid(a = 0:1, b = 0:1)
d$c <- (d$a + d$b) %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)

removeAliased <- function(mod) {
  ## Retrieve all terms in the model
  X <- attr(mod$terms, "term.labels")
  ## Get the aliased coefficients
  rn <- rownames(alias(mod)$Complete)
  ## remove factor levels from coefficient names to retrieve the terms
  regex.base <- unique(unlist(lapply(
    mod$model[, sapply(mod$model, is.factor), drop = FALSE], levels)))
  aliased <- gsub(paste(regex.base, "$", sep = "", collapse = "|"), "",
                  gsub(paste(regex.base, ":", sep = "", collapse = "|"), ":", rn))
  uF <- formula(paste(". ~ .", paste(aliased, collapse = "-"), sep = "-"))
  update(mod, uF)
}

removeAliased(l)
------------------>8------------------

This function works in principle, but stripping the factor levels from the coefficient names is just a workaround, and it could cause problems in some circumstances: when the name of a level matches the end of another variable's name, when I use a different contrast and R names the coefficients differently, and so on. I am not sure which other cases I am overlooking.

So my question is whether there is a more intelligent way of doing what I want to achieve. Is there a function that translates a coefficient of a linear model back to its term, something like:

termFromCoef("a1")     ## a
termFromCoef("a1:b1")  ## a:b

With this I could simply translate the row names from alias into the terms needed for the model update.

Thanks for your help.

Kind Regards,

Thorn Thaler
NRC Lausanne
Applied Mathematics
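One way to obtain the kind of mapping asked for in the post above, without parsing coefficient names, is the "assign" attribute of the model matrix: it records, for each column, the index of the term that column was generated from. The following is only a minimal sketch under stated assumptions. The two-argument signature termFromCoef(mod, coefs) is hypothetical, aliased coefficients are detected here as NA entries of coef(mod) rather than through alias(), and set.seed is added only for reproducibility.

```r
## Data and model from the original post (seed added for reproducibility)
set.seed(1)
d <- expand.grid(a = 0:1, b = 0:1)
d$c <- (d$a + d$b) %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)

## Hypothetical helper: map coefficient names to term labels using the
## "assign" attribute of the model matrix (0 = intercept, k = k-th term)
termFromCoef <- function(mod, coefs) {
  mm   <- model.matrix(mod)
  asgn <- attr(mm, "assign")
  labs <- c("(Intercept)", attr(terms(mod), "term.labels"))
  map  <- setNames(labs[asgn + 1L], colnames(mm))
  unname(map[coefs])
}

## Sketch of the update step: lm() reports aliased coefficients as NA,
## so drop every term that owns at least one NA coefficient
removeAliased <- function(mod) {
  cf      <- coef(mod)
  aliased <- names(cf)[is.na(cf)]
  drop    <- unique(termFromCoef(mod, aliased))
  if (length(drop) == 0L) return(mod)
  update(mod, as.formula(paste(". ~ . -", paste(drop, collapse = " - "))))
}

termFromCoef(l, c("a1", "a1:b1"))  ## "a" "a:b"
l2 <- removeAliased(l)
```

Note that this removes a whole term as soon as one of its coefficients is aliased, which for factors with more than two levels may drop more than strictly necessary.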
Hi Thorn,

it is not entirely clear (at least to me) what you want to accomplish. An easy and fail-safe way of extracting the variables used in a (g)lm object is

names(model.frame(l))

If you want to extract terms in order to finally select a model, have a look at drop1 and/or MASS::dropterm.

Hth

--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
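To illustrate the suggestion in this reply: names(model.frame(l)) returns the columns of the model frame, that is, the variables including the response, whereas the term labels stored in the terms object also list interactions. A small sketch using the data from the original post (set.seed is added here only for reproducibility):

```r
## Data and model from the original post
set.seed(1)
d <- expand.grid(a = 0:1, b = 0:1)
d$c <- (d$a + d$b) %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)

## Variables in the model frame (response first)
vars <- names(model.frame(l))          ## "y" "a" "b" "c"

## Term labels of the formula, interactions included
trms <- attr(terms(l), "term.labels")  ## "a" "b" "c" "a:b"
```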