Kevin Wright
2011-May-31 20:35 UTC
[R] In a formula, what is the interaction of the intercept and a factor?
For a pedagogical purpose, I was trying to show how the formula for a simple regression line (~1+x) could be crossed with a factor (~1:group + x:group) to fit separate regressions by group. For example:

    set.seed(201108)
    dat <- data.frame(x=1:15, y=1:15+rnorm(15),
                      group = sample(c('A','B'), size=15,
                                     replace=TRUE))

    m1 <- lm(y ~ 1 + x, data=dat)
    m2 <- lm(y ~ group + x:group, data=dat)
    m3 <- lm(y ~ 1:group + x:group, data=dat)
    m4 <- lm(y ~ 1 + x:group, data=dat)

The simple regression is model m1.

The usual way to write the by-group regression is model m2.

In model m3 I was trying to be explicit and interact "1+x" with "group".

Looking only at the coefficients, it appears that model m3 is simplified to model m4.

    R> coef(m3)
    (Intercept)    groupA:x    groupB:x
      0.3775140   0.9213835   0.9879690

    R> coef(m4)
    (Intercept)    x:groupA    x:groupB
      0.3775140   0.9213835   0.9879690

I wonder if anyone can shed some light on what R is doing with the "1:group" term.

Kevin
Kenn Konstabel
2011-Jun-01 04:49 UTC
[R] In a formula, what is the interaction of the intercept and a factor?
With some guessing: does

    lm(formula = y ~ -1 + group + x:group, data = dat)

do what you want? I'm not sure how 1:group is treated, if at all.

Kenn
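The suggestion above can be checked with a minimal sketch. It assumes toy data shaped like Kevin's, except that group is assigned deterministically (an assumption made here so that both levels are always present, whatever the R version's sampling behaviour). The names m5, fitA and fitB are hypothetical and do not appear in the thread. The idea is that dropping the common intercept should give each group its own intercept and slope, matching separate fits on each subset.

```r
# Toy data in the spirit of the thread. Assumption: group is assigned
# deterministically here rather than with sample().
set.seed(201108)
dat <- data.frame(x = 1:15,
                  y = 1:15 + rnorm(15),
                  group = factor(rep(c("A", "B"), length.out = 15)))

# Usual by-group regression (model m2 in the original post).
m2 <- lm(y ~ group + x:group, data = dat)

# Suggested form: remove the common intercept so each group has its own.
m5 <- lm(y ~ -1 + group + x:group, data = dat)

# Separate simple regressions on each subset, for comparison.
fitA <- lm(y ~ x, data = subset(dat, group == "A"))
fitB <- lm(y ~ x, data = subset(dat, group == "B"))

# Both formulas describe the same model, so the fitted values should agree.
same_fit <- isTRUE(all.equal(fitted(m2), fitted(m5)))

# m5 lists the two intercepts first and the two slopes second; reorder them
# to (intercept A, slope A, intercept B, slope B) to line up with the
# separate fits.
per_group <- unname(c(coef(fitA), coef(fitB)))
from_m5 <- unname(coef(m5)[c(1, 3, 2, 4)])

print(same_fit)
print(cbind(separate = per_group, combined = from_m5))
```

If the two columns printed at the end agree, the -1 form is simply a reparameterisation of m2 in which the coefficients can be read directly as per-group intercepts and slopes.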
Prof Brian Ripley
2011-Jun-01 06:19 UTC
[R] In a formula, what is the interaction of the intercept and a factor?
Try it for yourself:

    > model.matrix(y ~ 1:group, data = dat)
      (Intercept)
    1           1
    ...

Or to do 'interact "1+x" with "group"':

    > model.matrix(y ~ (1+x):group, data = dat)
      (Intercept) x:groupA x:groupB
    1           1        0        1
    ...

Note that you usually want to use '*' when you say 'interact with':

    > model.matrix(y ~ (1+x)*group, data = dat)
      (Intercept) x groupB x:groupB
    1           1 1      1        1
    ...

-- 
Brian D. Ripley
Professor of Applied Statistics, University of Oxford
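The comparison above can be gathered into one short script. This is a sketch under assumptions: the data mimic Kevin's, but group is assigned deterministically rather than with sample(), and the list name forms and its element names are hypothetical labels introduced only for this illustration. It prints the design-matrix column names for each formula discussed in the thread and checks that m3 and m4 give the same coefficients.

```r
# Toy data in the spirit of the thread. Assumption: group is assigned
# deterministically here rather than with sample().
set.seed(201108)
dat <- data.frame(x = 1:15,
                  y = 1:15 + rnorm(15),
                  group = factor(rep(c("A", "B"), length.out = 15)))

# Formulas discussed in the thread (list names are illustrative only).
forms <- list(
  one_by_group = y ~ 1:group,          # the puzzling term on its own
  colon        = y ~ (1 + x):group,    # 'interact "1+x" with "group"' using ':'
  star         = y ~ (1 + x) * group,  # the same idea using '*'
  usual        = y ~ group + x:group   # the usual by-group form (m2)
)

# Column names of the design matrix that each formula produces.
cols <- lapply(forms, function(f) colnames(model.matrix(f, data = dat)))
print(cols)

# Number of columns per formula.
n_cols <- sapply(cols, length)
print(n_cols)

# m3 and m4 from the original post: compare coefficients, ignoring names.
m3 <- lm(y ~ 1:group + x:group, data = dat)
m4 <- lm(y ~ 1 + x:group, data = dat)
m3_equals_m4 <- isTRUE(all.equal(unname(coef(m3)), unname(coef(m4))))
print(m3_equals_m4)
```

Consistent with the output shown above, the 1:group term appears to contribute no columns beyond the intercept, which would explain why m3 reduces to m4, whereas the '*' form and the usual by-group form each produce four columns.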