thr3ads.net - R help - [R] pulling items out of a lm() call [May 2006]

Andrew Gelman

2006-May-01 10:46 UTC

[R] pulling items out of a lm() call

I want to write a function to standardize regression predictors, which 
will require me to do some character-string manipulation to parse the 
variables in a call to lm() or glm().

For example, consider the call
lm (y ~ female + I(age^2) + female:black + (age + education)*female).

I want to be able to parse this to pick out the input variables
("female", "age", "black", "education").  Then I can transform these as
appropriate (to get "z.female", "z.age", etc), feed them back into the
lm() function, and go from there.

Does anyone know an easy way to pull out the variables?  I basically
have to parse out the symbols "+", ":", "*", and " ", but there's also
the problem of handling parentheses and the I() operator.

Thanks!
Andrew

-- 
Andrew Gelman
Professor, Department of Statistics
Professor, Department of Political Science
gelman at stat.columbia.edu
www.stat.columbia.edu/~gelman

Statistics department office:
  Social Work Bldg (Amsterdam Ave at 122 St), Room 1016
  212-851-2142
Political Science department office:
  International Affairs Bldg (Amsterdam Ave at 118 St), Room 731
  212-854-7075

Mailing address:
  1255 Amsterdam Ave, Room 1016
  Columbia University
  New York, NY 10027-5904
  212-851-2142
  (fax) 212-851-2164

Dimitris Rizopoulos

2006-May-01 11:04 UTC

probably all.vars() could be useful in this case, e.g.,

m1 <- lm(y ~ female + I(age^2) + female:black + (age + education)*female)
all.vars(formula(m1))
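
As a small sketch of the same idea: all.vars() also accepts a bare formula, so the model does not have to be fitted first. The names fo, vars.all and vars.rhs are only illustrative, and dropping the response with delete.response() is an assumption about what is wanted here.

```r
# the formula from the question, kept as an object rather than inside lm()
fo <- y ~ female + I(age^2) + female:black + (age + education)*female

# every variable name in the formula, response included
vars.all <- all.vars(fo)

# predictors only: drop the response from the terms object first
vars.rhs <- all.vars(delete.response(terms(fo)))

print(vars.all)
print(vars.rhs)
```

Function names such as I() are skipped by all.vars(), which is why "age" comes out rather than "I(age^2)".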


Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm




Gabor Grothendieck

2006-May-01 11:35 UTC

Try this:

# test data
fo <- y ~ female + I(age^2) + female:black + (age + education) * female

# create a list of form list(y = as.name("z.y"), ...) for use with substitute
L <- sapply(all.vars(fo), function(nm) as.name(paste("z", nm, sep = ".")))
do.call(substitute, list(fo, L))
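
To show how the rewritten formula might be fed back into lm(), here is a rough sketch on simulated data. The data frame d, the use of scale() for standardizing, and the choice to leave the response y untouched are all assumptions made for illustration; the thread does not say which transformation is intended.

```r
set.seed(1)
# simulated data, purely for illustration
d <- data.frame(y = rnorm(50),
                female = rbinom(50, 1, 0.5),
                age = runif(50, 20, 60),
                black = rbinom(50, 1, 0.3),
                education = sample(1:4, 50, replace = TRUE))

fo <- y ~ female + I(age^2) + female:black + (age + education) * female

# predictors only; the response is left as it is
vars <- all.vars(fo[[3]])

# standardized copies of the predictors, named z.female, z.age, ...
z <- as.data.frame(scale(d[vars]))
names(z) <- paste("z", vars, sep = ".")

# named list of symbols: list(female = z.female, age = z.age, ...)
L <- setNames(lapply(names(z), as.name), vars)

# substitute the new names into the formula; eval() turns the
# resulting call back into a formula object
fo.z <- eval(do.call(substitute, list(fo, L)))

fit <- lm(fo.z, data = cbind(d["y"], z))
print(fo.z)
```

Because substitute() works on the parsed formula, the name inside I(age^2) is replaced as well, with no string manipulation needed.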


Peter Dalgaard

2006-May-01 11:37 UTC

At which level of generality do you want this?

Consider

> attr(terms(y ~ female + I(age^2) + female:black + (age +
+               education)*female),"variables")
list(y, female, I(age^2), black, age, education)
> attr(delete.response(terms(y ~ female + I(age^2) + female:black +
+          (age + education)*female)),"variables")
list(female, I(age^2), black, age, education)

This gets you some of the way. However, there are complications: You
can't just remove composite terms like "I(age^2)" because it is not
guaranteed that "age" is in among the other terms:

> attr(terms( ~ I(speed^2)),"variables")
list(I(speed^2))

So you need some way to tease out the individual variables inside I().

Here's a first cut.

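# right-hand-side variables as an unevaluated call: list(female, I(age^2), ...)
l <- attr(delete.response(terms(y ~ female + I(age^2) + female:black +
             (age + education)*female)), "variables")

# descend into calls recursively, keeping symbols and dropping function names
getterms <- function(e) {
    if (is.name(e)) e
    else if (is.call(e)) lapply(e[-1], getterms)
}

unique(c(lapply(l[-1], getterms), recursive=TRUE))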

and possibly throw in an as.character() to get a vector of strings,
rather than a list of symbols. Notice that since anything can go
inside I(), you can get in trouble if parts of the expression are not
intended as variables (e.g., y^lambda where lambda is a scalar). The
getterms function above pragmatically assumes that at least function
names need to be discarded.
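
One possible way around the scalar problem, sketched under the assumption that the variables of interest are columns of a data frame: keep only the names that actually occur in the data. The built-in cars data set (columns speed and dist) and the scalar lambda are used here purely for illustration.

```r
# lambda is a scalar in the workspace, not a column of the data
lambda <- 0.5
fo <- ~ I(speed^lambda) + dist

# all.vars() cannot tell the scalar from the data columns
vars <- all.vars(fo)

# keep only the names that are columns of the data frame
keep <- intersect(vars, names(cars))

print(vars)
print(keep)
```

This only helps when the data are supplied as a data frame; variables picked up from the workspace would need a different check.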

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
