Just to provide some closure on this thread, let me add two comments:
1. Doug's version of my sweep function:
diffid1 <-
function(h, id) {
id <- as.factor(id)[ , drop = TRUE]
apply(as.matrix(h), 2, function(x) x - tapply(x, id, mean)[id])
}
is far more elegant than my original, and works perfectly, but
2. I should have mentioned that proposed strategy gets the
coefficient estimates right, however their standard errors need a
degrees of freedom correction, which in the present instance
is non-negligible -- sqrt(98/89) -- since the lm() step doesn't
know that we have already estimated the fixed effects with the
sweep operation.
url: www.econ.uiuc.edu/~roger Roger Koenker
email rkoenker at uiuc.edu Department of Economics
vox: 217-333-4558 University of Illinois
fax: 217-244-6678 Champaign, IL 61820
On May 5, 2007, at 7:16 PM, Douglas Bates wrote:
> On 5/5/07, roger koenker <roger at ysidro.econ.uiuc.edu> wrote:
>>
>> On May 5, 2007, at 3:14 PM, Douglas Bates wrote:
>> >
>> > As Roger indicated in another reply you should be able to obtain
>> the
>> > results you want by sweeping out the means of the groups from
>> both x
>> > and y. However, I tried Roger's function and a modified
version
>> that
>> > I wrote and could not show this. I'm not sure what I am doing
>> wrong.
>>
>> Doug, Isn't it just that you are generating a balanced factor and
>> Ivo is
>> generating an unbalanced one -- he wrote:
>
>> > fe = as.factor( as.integer( runif(100)*10 ) );
>
>> the coefficient on x is the same....
>
>> or, aarrgh, is it that you don't like the s.e. being wrong. I
>> didn't notice
>> this at first. But it shouldn't happen. I'll have to take
another
>> look at this.
>
> No, my mistake was much dumber than that. I was comparing the wrong
> coefficient. For some reason I was comparing the coefficient for x in
> the second fit to the Intercept from the first fit.
>
> I'm glad that it really is working and, yes, you are right, the
> degrees of freedom are wrong in the second fit because the effect of
> those 10 degrees of freedom are removed from the data before the model
> is fit.
>
>
>> > I enclose a transcript that shows that I can reproduce the
>> result from
>> > Roger's function but it doesn't do what either of us think
it
>> should.
>> > BTW, I realize that the estimate for the Intercept should be
>> zero in
>> > this case.
>> >
>> >
>> >
>> >> now, with a few IQ points more, I would have looked at the lme
>> >> function instead of the nlme function in library(nlme).
[then
>> >> again, I could understand stats a lot better with a few more
IQ
>> >> points.] I am reading the lme description now, but I still
don't
>> >> understand how to specify that I want to have dummies in my
>> >> specification, plus the x variable, and that's it. I
think I
>> am not
>> >> understanding the integration of fixed and random effects in
>> the same
>> >> R functions.
>> >>
>> >> thanks for pointing me at your lme4 library. on linux,
version
>> >> 2.5.0, I did
>> >> R CMD INSTALL matrix*.tar.gz
>> >> R CMD INSTALL lme4*.tar.gz
>> >> and it installed painlessly. (I guess R install packages
don't
>> have
>> >> knowledge of what they rely on; lme4 requires matrix, which
>> the docs
>> >> state, but having gotten this wrong, I didn't get an
error. no
>> big
>> >> deal. I guess I am too used to automatic resolution of
>> dependencies
>> >> from linux installers these days that I did not expect this.)
>> >>
>> >> I now tried your specification:
>> >>
>> >> > library(lme4)
>> >> Loading required package: Matrix
>> >> Loading required package: lattice
>> >> > lmer(y~x+(1|fe))
>> >> Linear mixed-effects model fit by REML
>> >> Formula: y ~ x + (1 | fe)
>> >> AIC BIC logLik MLdeviance REMLdeviance
>> >> 282 290 -138 270 276
>> >> Random effects:
>> >> Groups Name Variance Std.Dev.
>> >> fe (Intercept) 0.000000000445 0.0000211
>> >> Residual 0.889548532468 0.9431588
>> >> number of obs: 100, groups: fe, 10
>> >>
>> >> Fixed effects:
>> >> Estimate Std. Error t value
>> >> (Intercept) -0.0188 0.0943 -0.199
>> >> x 0.0528 0.0904 0.585
>> >>
>> >> Correlation of Fixed Effects:
>> >> (Intr)
>> >> x -0.022
>> >> Warning messages:
>> >> 1: Estimated variance for factor 'fe' is effectively
zero
>> >> in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200L,
>> >> tolerance >> >> 0.0000000149011611938477,
>> >> 2: $ operator not defined for this S4 class, returning NULL
in: x
>> >> $symbolic.cor
>> >>
>> >> Without being a statistician, I can still determine that this
>> is not
>> >> the model I would like to work with. The coefficient is
>> 0.0528, not
>> >> 0.0232. (I am also not sure why I am getting these warning
>> messages
>> >> on my system, either, but I don't think it matters.)
>> >>
>> >> is there a simple way to get the equivalent specification for
my
>> >> smple
>> >> model, using lmer or lme, which does not choke on huge data
sets?
>> >>
>> >> regards,
>> >>
>> >> /ivo
>> >>
>> >> <Ivo_Rout.txt>
>> > ______________________________________________
>> > R-help at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-
>> > guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>