For an application, I need to get a character string representation of
the formula or model call for glm objects, but also, for labeling output
and plots, I want to be able to abbreviate the words (variables) in model
terms. This requires some formula magic that I can't quite get, in
particular extracting the terms from a formula and then the words in
each term.

Perhaps there is some code for something similar I haven't found yet, or
someone can suggest how to do this.

A runnable example to show what I mean:

Freq <- c(68,42,42,30, 37,52,24,43,
          66,50,33,23, 47,55,23,47,
          63,53,29,27, 57,49,19,29)

Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
Softness    <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
M.user      <- gl(2, 4, 24, labels = c("N", "Y"))
Brand       <- gl(2, 1, 24, labels = c("X", "M"))

detg <- data.frame(Freq,Temperature, Softness, M.user, Brand)
detg.m0 <- glm(Freq ~ M.user*Temperature*Softness + Brand*M.user*Temperature,
               family = poisson, data = detg)

detg.m1 <- glm(Freq ~ (M.user + Temperature + Softness + Brand),
               family = poisson, data=detg)

detg.m2 <- glm(Freq ~ (M.user + Temperature + Softness + Brand)^2,
               family = poisson, data=detg)

detg.m2a <- update(detg.m1, . ~ .^2)

In plot.lm, I found the following code to extract the model call from a
glm object as a string and truncate it to a total length <= 75. I need a
shorter total length, obtained by abbreviating individual words in the
model call, so the approach has to at least extract the terms in the
model and then abbreviate the words in each term.

# from plot.lm: get model call as a string
# TODO: how to use abbreviate to abbreviate the words in the model terms???
mod.call <- function(x, max.len=75) {
    cal <- x$call
    if (!is.na(m.f <- match("formula", names(cal)))) {
        cal <- cal[c(1, m.f)]
        names(cal)[2L] <- ""
    }
    cc <- deparse(cal, max.len+5)
    nc <- nchar(cc[1L], "c")
    abbr <- length(cc) > 1 || nc > max.len
    cap <- if (abbr)
        paste(substr(cc[1L], 1L, min(max.len, nc)), "...")
    else cc[1L]
    cap
}

Tests and WANTED output, say with a maximum length of each word in the
string <= 6 and a maximum total length <= 40:

> mod.call(detg.m0)
[1] "glm(Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature)"

WANTED, something like:
"glm(Freq ~ M.user * Temp * Softne + Brand * M.user * Temp)"

> mod.call(detg.m2a)
[1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M ..."

> mod.call(detg.m2a, max.len=200)
[1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M.user:Softness + M.user:Brand + Temperature:Softness + Temperature:Brand + Softness:Brand)"

WANTED, something closer to
"glm(Freq ~ M + Tmp + Sft + Brnd + M:Tmp + M.:Sft + M.us:Brnd + Tmp:Sft + Tmp:Brnd + Sft:Brnd)"

TIA
-Michael
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
Try using all.names() to get all the names in the formula. E.g.,

f <- function (formula, minNameLength = 2, abbreviateFunctionNames = FALSE)
{
    names <- all.names(formula, functions = abbreviateFunctionNames)
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength), as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

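To see what the function is doing, its three steps can be run one at a time. This is only an illustrative sketch; the intermediate names vars, abbr and syms are made up here and are not part of f.

```r
fm <- MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor)

# Step 1: collect the variable names only; functions = FALSE leaves out
# operators and function names such as ~, +, log and sqrt
vars <- all.names(fm, functions = FALSE)

# Step 2: abbreviate() returns a character vector whose names are the
# original strings and whose values are the abbreviations
abbr <- abbreviate(vars, minlength = 2)

# Step 3: turn the abbreviations into symbols and let substitute() swap
# them into the formula, then convert the result back to a string
syms <- lapply(abbr, as.name)
out  <- deparse(do.call("substitute", list(fm, syms)))
out
```

Note that minlength is a target, not a hard cap: abbreviate() may return something longer when that is needed to keep the abbreviations unique.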
used as
> f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor))
[1] "MR ~ log(FP) + sqrt(SP)"
> f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), min=4)
[1] "MyRs ~ log(FrsP) + sqrt(ScnP)"
> f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), abbreviateFunctionNames=TRUE)
[1] "MR ~ lg(FP) + sq(SP)"
You could put that in a loop that stopped when nchar(f(...)) got small enough.
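
A minimal sketch of such a loop, under some assumptions: abbrFormula and shortFormula are hypothetical names, the defaults max.word = 6 and max.total = 40 are taken from the limits mentioned in the original question, and unique = TRUE plus a wide width.cutoff are added here so that repeated variables and long formulas still give a single string.

```r
# Variant of the helper above that always returns one string
abbrFormula <- function(formula, minlength) {
    vars <- all.names(formula, functions = FALSE, unique = TRUE)
    syms <- lapply(abbreviate(vars, minlength = minlength), as.name)
    out  <- deparse(do.call("substitute", list(formula, syms)),
                    width.cutoff = 500L)
    paste(out, collapse = " ")
}

# Start at the longest word length allowed and shorten step by step,
# stopping as soon as the whole string fits (or at minlength = 1)
shortFormula <- function(formula, max.word = 6, max.total = 40) {
    for (len in max.word:1) {
        s <- abbrFormula(formula, len)
        if (nchar(s) <= max.total) break
    }
    s
}

fm  <- Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature
res <- shortFormula(fm)
res
```

For a fitted model the formula can be pulled out first, e.g. shortFormula(formula(detg.m0)). If even minlength = 1 is too long, the loop simply returns the shortest string it found.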
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Friendly
> Sent: Monday, July 08, 2013 10:36 AM
> To: R-help
> Subject: [R] abbreviating words in a model formula
>
Your cart is stuck in front of your horse. This will be WAY easier to accomplish
if you rename your columns in your input data frame before fitting the model.
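
One way this could look, as a sketch only: the copy named short, the choice of minlength = 4 and the main-effects formula built with reformulate() are all assumptions made for illustration, not something prescribed above.

```r
# Same detergent data as in the original post
Freq <- c(68,42,42,30, 37,52,24,43,
          66,50,33,23, 47,55,23,47,
          63,53,29,27, 57,49,19,29)
Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
Softness    <- gl(3, 8, 24, labels = c("Hard", "Medium", "Soft"))
M.user      <- gl(2, 4, 24, labels = c("N", "Y"))
Brand       <- gl(2, 1, 24, labels = c("X", "M"))
detg <- data.frame(Freq, Temperature, Softness, M.user, Brand)

# Abbreviate the column names once, before any model is fitted
short <- detg
names(short) <- unname(abbreviate(names(detg), minlength = 4))

# Build the formula from the new names: first column is the response,
# the remaining columns enter as main effects
fm  <- reformulate(names(short)[-1], response = names(short)[1])
fit <- glm(fm, family = poisson, data = short)

# The label now comes out short without any post-processing
lab <- deparse(formula(fit))
lab
```

Because the formula is passed as an object, fit$call only records the name fm, so the label is taken from formula(fit) rather than from the call.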
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>      Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.