Hi,

I would like to estimate something like y = a + b*d2*y + c*d3*y where
the dummies are created from some vector d with three (actually many
more) levels using factor(). But the fit always includes either the
variable y or d1*y. How could I get rid of these?

Example:

x = c(1,2,3,4,5,6,7,8)
y = c(3,6,2,8,7,6,2,4)
d = c(1,1,1,2,3,2,3,3)
fd = factor(d)

lm(x ~ fd*y)
gives:

Coefficients:
(Intercept)          fd2          fd3            y        fd2:y        fd3:y
     2.4231       9.5769       6.1822      -0.1154      -0.8846      -0.3320

lm(x ~ fd*y - y)
gives:

Coefficients:
(Intercept)          fd2          fd3        fd1:y        fd2:y        fd3:y
     2.4231       9.5769       6.1822      -0.1154      -1.0000      -0.4474

What I would like to get is:

Coefficients:
(Intercept)          fd2          fd3        fd2:y        fd3:y

Is there an easy way to achieve this?

Maybe it's obvious how to do it, but I'm new to R and did quite some
searching without finding the solution.

Thanks a lot in advance!

Andreas Goesele

--
Andreas Gösele
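The two outputs in the question appear to be two parametrizations of the same model: the slope reported as fd2:y by the second call equals y + fd2:y from the first (-0.1154 + -0.8846 = -1.0000), and likewise for fd3. A minimal sketch checking this on the example data follows; the object names m_full, m_nested, same_fit and the two slope vectors are made up for illustration.

```r
x  <- c(1, 2, 3, 4, 5, 6, 7, 8)
y  <- c(3, 6, 2, 8, 7, 6, 2, 4)
fd <- factor(c(1, 1, 1, 2, 3, 2, 3, 3))

# Reference slope "y" plus slope differences fd2:y and fd3:y
m_full <- lm(x ~ fd * y)

# One slope per level: fd1:y, fd2:y, fd3:y
m_nested <- lm(x ~ fd * y - y)

# Identical fitted values mean both formulas describe the same model
same_fit <- isTRUE(all.equal(fitted(m_full), fitted(m_nested)))

# Per-level slopes rebuilt from the first parametrization
b <- coef(m_full)
slopes_from_full <- c(b[["y"]],
                      b[["y"]] + b[["fd2:y"]],
                      b[["y"]] + b[["fd3:y"]])
slopes_nested <- unname(coef(m_nested)[c("fd1:y", "fd2:y", "fd3:y")])

print(same_fit)
print(rbind(slopes_from_full, slopes_nested))
```

So removing y from the formula only renames the slope of the first level; it does not remove it from the model.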
Is this what you want?

lm(y ~ fd + fd:y)

--
================================
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
I'm not sure if this will help, but it's worth a try. Do the
regression as I suggested before, extract the model matrix and remove
the "offending" column. I'm assuming you don't know in advance how
many levels there are in the factor. Then use this to perform the
regression. Something like this:

m1 <- lm(x ~ fd:y + fd)
mm <- model.matrix(m1)
nl <- length(levels(fd))
# columns: (Intercept), fd2..fd<nl>, fd1:y, fd2:y, ...
# drop the intercept (lm adds its own) and fd1:y, which is column nl + 1
newdat <- mm[, -c(1, nl + 1)]
lm(x ~ newdat)

On 30/11/06, Andreas Goesele <Goesele at hfph.mwn.de> wrote:
> "David Barron" <mothsailor at googlemail.com> writes:
>
> > I'm not sure that's possible, given that you would effectively be
> > fitting a different model to those cases with fd==1, wouldn't you?
>
> This was what I was afraid of. But I don't completely understand
> your argument.
>
> Do you mean that the *model* x = a + b1*fd2 + b2*fd3 +
> c1*fd2*y + c2*fd3*y doesn't make sense because fd2 together with fd3
> doesn't cover all cases? And that because the model doesn't make sense
> it is not easily implemented in R?
>
> My problem is that I have to estimate a model of this kind whether it
> makes sense or not. So I still would be happy to find an easy and
> elegant way...
>
> > On 29/11/06, Andreas Goesele <Goesele at hfph.mwn.de> wrote:
> >> "David Barron" <mothsailor at googlemail.com> writes:
> >>
> >> > Is this what you want?
> >> >
> >> > lm(y ~ fd + fd:y)
> >>
> >> Thanks for your fast reply. But it's not what I wanted.
> >>
> >> To make it clearer I have to correct the first part of what I wrote:
> >>
> >> > On 29/11/06, Andreas Goesele <Goesele at hfph.mwn.de> wrote:
> >>
> >> >> I would like to estimate something like y = a + b*d2*y + c*d3*y
> >> >> where the dummies are created from some vector d with three
> >> >> (actually many more) levels using factor(). But the fit always
> >> >> includes either the variable y or d1*y. How could I get rid of these?
> >>
> >> This should have been something like:
> >>
> >> x = a + b1*d2 + b2*d3 + c1*d2*y + c2*d3*y
> >>
> >> When I change your suggestion to lm(x ~ fd + fd:y) I get the same as
> >> for lm(x ~ fd*y - y), namely:
> >>
> >> Coefficients:
> >> (Intercept)          fd2          fd3        fd1:y        fd2:y        fd3:y
> >>      2.4231       9.5769       6.1822      -0.1154      -1.0000      -0.4474
> >>
> >> What I want is:
> >>
> >> Coefficients:
> >> (Intercept)          fd2          fd3        fd2:y        fd3:y
> >>
> >> Thanks a lot again!
> >>
> >> Andreas Goesele
> >>
> >> --
> >> Andreas Gösele                   Omnis enim res, quae dando non deficit,
> >> Inst. f. Gesellschaftspolitik    dum habetur et non datur,
> >> Kaulbachstr. 31a, 80539 München  nondum habetur, quomodo habenda est.
> >> E-mail: goesele at hfph.mwn.de   (Augustinus)

--
================================
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
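The model-matrix idea from this thread can be sketched as follows. This is only an illustration on the example data, not necessarily how the posters would have written it: the names ref_slope, X and fit are made up here, and the column is selected by name rather than by position so the code does not depend on the order of the columns or on the number of factor levels.

```r
x  <- c(1, 2, 3, 4, 5, 6, 7, 8)
y  <- c(3, 6, 2, 8, 7, 6, 2, 4)
fd <- factor(c(1, 1, 1, 2, 3, 2, 3, 3))

# Full design matrix: (Intercept), fd2, fd3, fd1:y, fd2:y, fd3:y
mm <- model.matrix(~ fd + fd:y)

# Name of the slope column that belongs to the reference (first) level
ref_slope <- paste0("fd", levels(fd)[1], ":y")

# Keep every column except that one
X <- mm[, colnames(mm) != ref_slope, drop = FALSE]

# X still holds the intercept column, so lm's own intercept is switched off
fit <- lm(x ~ X - 1)

# Coefficient names carry an "X" prefix, e.g. "X(Intercept)", "Xfd2", "Xfd2:y"
print(coef(fit))
```

Under this design the first level gets an intercept but no slope in y, which is the "different model for the cases with fd==1" mentioned earlier in the thread: for those observations the fit reduces to their mean of x.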