For an application, I need to get a character string representation of
the formula or model call for glm objects, but also, for labeling output
and plots, I want to be able to abbreviate the words (variables) in model
terms. This requires some formula magic that I can't quite get, in
particular extracting the terms from a formula and then the words in
each term.

Perhaps there is some code for something similar I haven't found yet, or
someone can suggest how to do this.

A runnable example to show what I mean:

Freq <- c(68,42,42,30, 37,52,24,43,
          66,50,33,23, 47,55,23,47,
          63,53,29,27, 57,49,19,29)

Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
Softness <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
M.user <- gl(2, 4, 24, labels = c("N", "Y"))
Brand <- gl(2, 1, 24, labels = c("X", "M"))

detg <- data.frame(Freq, Temperature, Softness, M.user, Brand)
detg.m0 <- glm(Freq ~ M.user*Temperature*Softness + Brand*M.user*Temperature,
               family = poisson, data = detg)

detg.m1 <- glm(Freq ~ (M.user + Temperature + Softness + Brand),
               family = poisson, data = detg)

detg.m2 <- glm(Freq ~ (M.user + Temperature + Softness + Brand)^2,
               family = poisson, data = detg)

detg.m2a <- update(detg.m1, . ~ .^2)

In plot.lm, I found the following code to extract the model call from a
glm object as a string and abbreviate it to a total length <= 75. I need
a shorter total length, obtained by abbreviating individual words in the
model call, so the approach has to at least extract the terms in the
model and then abbreviate the words in each term.

# from plot.lm: get model call as a string
# TODO: how to use abbreviate to abbreviate the words in the model terms???
mod.call <- function(x, max.len=75) {
    cal <- x$call
    if (!is.na(m.f <- match("formula", names(cal)))) {
        cal <- cal[c(1, m.f)]
        names(cal)[2L] <- ""
    }
    cc <- deparse(cal, max.len+5)
    nc <- nchar(cc[1L], "c")
    abbr <- length(cc) > 1 || nc > max.len
    cap <- if (abbr)
        paste(substr(cc[1L], 1L, min(max.len, nc)), "...")
    else cc[1L]
    cap
}

Tests, & WANTED, say with max length of each word in the string <= 6 &
maximum total length <= 40

> mod.call(detg.m0)
[1] "glm(Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature)"

WANTED, something like:
"glm(Freq ~ M.user * Temp * Softne + Brand * M.user * Temp)"

> mod.call(detg.m2a)
[1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M ..."

> mod.call(detg.m2a, max.len=200)
[1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M.user:Softness + M.user:Brand + Temperature:Softness + Temperature:Brand + Softness:Brand)"

WANTED, something closer to
"glm(Freq ~ M + Tmp + Sft + Brnd + M:Tmp + M.:Sft + M.us:Brnd + Tmp:Sft + Tmp:Brnd + Sft:Brnd)"

TIA
-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web: http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
Try using all.names() to get all the names in the formula. E.g.,

f <- function (formula, minNameLength = 2, abbreviateFunctionNames = FALSE)
{
    names <- all.names(formula, functions = abbreviateFunctionNames)
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength), as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

used as

> f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor))
[1] "MR ~ log(FP) + sqrt(SP)"
> f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), min=4)
[1] "MyRs ~ log(FrsP) + sqrt(ScnP)"
> f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), abbreviateFunctionNames=TRUE)
[1] "MR ~ lg(FP) + sq(SP)"

You could put that in a loop that stopped when nchar(f(...)) got small
enough.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Michael Friendly
> Sent: Monday, July 08, 2013 10:36 AM
> To: R-help
> Subject: [R] abbreviating words in a model formula
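Bill Dunlap's closing suggestion, a loop that stops once nchar(f(...)) is small enough, might look like the sketch below. f() is repeated from his message so the block runs on its own. The names abbrevFormula, maxTotal and startLength are hypothetical, introduced here for illustration and not part of his reply; the defaults of 6 and 40 come from the limits stated in the original question.

```r
# f() as given in Bill Dunlap's message, repeated so this sketch is self-contained.
f <- function(formula, minNameLength = 2, abbreviateFunctionNames = FALSE) {
    names <- all.names(formula, functions = abbreviateFunctionNames)
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength), as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

# Hypothetical helper (not from the thread): lower the minimum name length
# one step at a time until the deparsed formula fits in maxTotal characters.
abbrevFormula <- function(formula, maxTotal = 40, startLength = 6) {
    for (len in seq(startLength, 1)) {
        # deparse() can split long output over several strings;
        # strip the continuation indent and rejoin into one string
        s <- paste(trimws(f(formula, minNameLength = len)), collapse = " ")
        if (nchar(s) <= maxTotal) break
    }
    s  # if nothing fits, this is the attempt made with minNameLength = 1
}
```

With the models from the original post, a call such as abbrevFormula(formula(detg.m0)) would apply this to a fitted glm. Note that abbreviate() may return names longer than minlength when it needs them to stay unique, so the per-word limit is a target rather than a guarantee.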
Your cart is stuck in front of your horse. This will be WAY easier to
accomplish if you rename your columns in your input data frame before
fitting the model.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
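Jeff Newmiller's suggestion of renaming the columns before fitting could be sketched as follows, using the data from the original post. The object names short, fml, short.m1 and label are hypothetical, and minlength = 4 is an arbitrary choice made here, not something stated in the thread.

```r
# Data from the original post
Freq <- c(68,42,42,30, 37,52,24,43,
          66,50,33,23, 47,55,23,47,
          63,53,29,27, 57,49,19,29)
Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
Softness    <- gl(3, 8, 24, labels = c("Hard", "Medium", "Soft"))
M.user      <- gl(2, 4, 24, labels = c("N", "Y"))
Brand       <- gl(2, 1, 24, labels = c("X", "M"))
detg <- data.frame(Freq, Temperature, Softness, M.user, Brand)

# Copy of the data with abbreviated column names (assumed minlength = 4)
short <- detg
names(short) <- unname(abbreviate(names(detg), minlength = 4))

# Build the formula from the new names so none has to be typed by hand:
# the first column is the response, the rest are main effects
fml <- reformulate(names(short)[-1], response = names(short)[1])
short.m1 <- glm(fml, family = poisson, data = short)

# The fitted model's formula now carries the short names
label <- paste(deparse(formula(short.m1)), collapse = " ")
```

Because the model was fitted with a formula object, the stored call shows formula = fml rather than the expanded terms, so the label is taken from formula(short.m1) instead of the call. An all-two-way-interactions version could then be obtained as in the original post, with update(short.m1, . ~ .^2).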