thr3ads.net - R help - [R] nlme fixed effects specification [May 2007]

If this information is useful, please help other people find it:
Share via:

ivo welch

2007-May-03 18:03 UTC

[R] nlme fixed effects specification

dear R experts:

sorry, I have to ask this again.  I know that the answer is in section
7.2 of "S Programming," but I don't have the book (and I plan to
buy
the next edition---which I hope will be titled S/R programming ;-) ).

I believe the following yields a standard fixed-effects estimation:

fixed.effects = as.factor( as.integer( runif(100)*10 ) )
y=rnorm(100); x=rnorm(100);
print(summary(lm( Y ~ X + fe)))

I would like to know how to get the same coefficient on X using nlme.
(I cannot use this ordinary lm method in my real application, simply
because I have 10,000 fixed effects.)  I tried a variety of arguments
to the "fixed" nlme parameter (e.g., fixed=list(fmid)), but did not
get the syntax right.  could someone please tell me the magic spell?

may I also suggest that such an example be added to the nlme examples
documentation, too, please?

help appreciated.

regards,

/ivo

Douglas Bates

2007-May-04 15:26 UTC

head link

[R] nlme fixed effects specification

On 5/3/07, ivo welch <ivowel at gmail.com> wrote:> dear R experts:
>
> sorry, I have to ask this again.  I know that the answer is in section
> 7.2 of "S Programming," but I don't have the book (and I plan
to buy
> the next edition---which I hope will be titled S/R programming ;-) ).
> I believe the following yields a standard fixed-effects estimation:
> fixed.effects = as.factor( as.integer( runif(100)*10 ) )
> y=rnorm(100); x=rnorm(100);
> print(summary(lm( Y ~ X + fe)))
Want to try that one again, Ivo?  You have defined objects called
"fixed.effects", "y" and "x" then used the formula
Y ~ X + fe.
> I would like to know how to get the same coefficient on X using nlme.
The nlme function or the nlme package.  I'm guessing that you are
referring to the lme function in the nlme package but it would help if
you told us exactly what you mean.

Also, perhaps you could be a bit more explicit about what you mean by
"same coefficient".  If you fit a model with random effects for a
factor then you will get similar but not identical values for the
coefficients of the fixed effects term.
> (I cannot use this ordinary lm method in my real application, simply
> because I have 10,000 fixed effects.)  I tried a variety of arguments
> to the "fixed" nlme parameter (e.g., fixed=list(fmid)), but did
not
> get the syntax right.  could someone please tell me the magic spell?
If you plan on fitting a linear mixed model then I would suggest

library(lme4)
lmer(Y ~ X + (1|fact))

where fact is the factor specifying the groups in the data.  For example
> set.seed(123454321)
> fact <- sample(10, 100, repl = TRUE)
> y <- rnorm(100); x <- rnorm(100)
> lmer(y ~ x + (1|fact))Linear mixed-effects model fit by REML
Formula: y ~ x + (1 | fact)
   AIC   BIC logLik MLdeviance REMLdeviance
 294.5 302.3 -144.3      282.7        288.5
Random effects:
 Groups   Name        Variance   Std.Dev.
 fact     (Intercept) 5.0482e-10 2.2468e-05
 Residual             1.0096e+00 1.0048e+00
number of obs: 100, groups: fact, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)  0.08536    0.10062  0.8483
x            0.08725    0.08787  0.9929

Correlation of Fixed Effects:
  (Intr)
x -0.052
Warning message:
Estimated variance for factor fact is effectively zero
> may I also suggest that such an example be added to the nlme examples
> documentation, too, please?
First we need to decide what the example is. :-)

Regards,
Doug

roger koenker

2007-May-09 12:42 UTC

head link

[R] nlme fixed effects specification

Just to provide some closure on this thread, let me add two comments:

1.  Doug's version of my sweep function:

diffid1 <-
function(h, id) {
     id <- as.factor(id)[ , drop = TRUE]
     apply(as.matrix(h), 2, function(x) x - tapply(x, id, mean)[id])
}

is far more elegant than my original, and works perfectly, but

2.  I should have mentioned that proposed strategy gets the
coefficient estimates right, however their standard errors need a
degrees of freedom correction, which in the present instance
is non-negligible -- sqrt(98/89) -- since the lm() step doesn't
know that we have already estimated the fixed effects with the
sweep operation.

url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoenker at uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Champaign, IL 61820


On May 5, 2007, at 7:16 PM, Douglas Bates wrote:
> On 5/5/07, roger koenker <roger at ysidro.econ.uiuc.edu> wrote:
>>
>> On May 5, 2007, at 3:14 PM, Douglas Bates wrote:
>> >
>> > As Roger indicated in another reply you should be able to obtain  
>> the
>> > results you want by sweeping out the means of the groups from  
>> both x
>> > and y.  However, I tried Roger's function and a modified
version
>> that
>> > I wrote and could not show this.  I'm not sure what I am doing
>> wrong.
>>
>> Doug,  Isn't it just that you are generating a  balanced factor and
>> Ivo is
>> generating an unbalanced one -- he wrote:
>
>> > fe = as.factor( as.integer( runif(100)*10 ) );
>
>> the coefficient on x is the same....
>
>> or, aarrgh,  is it that you don't like the s.e. being wrong.   I
>> didn't notice
>> this at first.  But it shouldn't happen.  I'll have to take
another
>> look at  this.
>
> No, my mistake was much dumber than that.  I was comparing the wrong
> coefficient.  For some reason I was comparing the coefficient for x in
> the second fit to the Intercept from the first fit.
>
> I'm glad that it really is working and, yes, you are right, the
> degrees of freedom are wrong in the second fit because the effect of
> those 10 degrees of freedom are removed from the data before the model
> is fit.
>
>
>> > I enclose a transcript that shows that I can reproduce the  
>> result from
>> > Roger's function but it doesn't do what either of us think
it
>> should.
>> > BTW, I realize that the estimate for the Intercept should be  
>> zero in
>> > this case.
>> >
>> >
>> >
>> >> now, with a few IQ points more, I would have looked at the lme
>> >> function instead of the nlme function in library(nlme).   
[then
>> >> again, I could understand stats a lot better with a few more
IQ
>> >> points.]  I am reading the lme description now, but I still
don't
>> >> understand how to specify that I want to have dummies in my
>> >> specification, plus the x variable, and that's it.  I
think I
>> am not
>> >> understanding the integration of fixed and random effects in  
>> the same
>> >> R functions.
>> >>
>> >> thanks for pointing me at your lme4 library.  on linux,
version
>> >> 2.5.0, I did
>> >>   R CMD INSTALL matrix*.tar.gz
>> >>   R CMD INSTALL lme4*.tar.gz
>> >> and it installed painlessly.  (I guess R install packages
don't
>> have
>> >> knowledge of what they rely on;  lme4 requires matrix, which  
>> the docs
>> >> state, but having gotten this wrong, I didn't get an
error.  no
>> big
>> >> deal.  I guess I am too used to automatic resolution of  
>> dependencies
>> >> from linux installers these days that I did not expect this.)
>> >>
>> >> I now tried your specification:
>> >>
>> >> > library(lme4)
>> >> Loading required package: Matrix
>> >> Loading required package: lattice
>> >> > lmer(y~x+(1|fe))
>> >> Linear mixed-effects model fit by REML
>> >> Formula: y ~ x + (1 | fe)
>> >>  AIC BIC logLik MLdeviance REMLdeviance
>> >>  282 290   -138        270          276
>> >> Random effects:
>> >>  Groups   Name        Variance       Std.Dev.
>> >>  fe       (Intercept) 0.000000000445 0.0000211
>> >>  Residual             0.889548532468 0.9431588
>> >> number of obs: 100, groups: fe, 10
>> >>
>> >> Fixed effects:
>> >>             Estimate Std. Error t value
>> >> (Intercept)  -0.0188     0.0943  -0.199
>> >> x             0.0528     0.0904   0.585
>> >>
>> >> Correlation of Fixed Effects:
>> >>   (Intr)
>> >> x -0.022
>> >> Warning messages:
>> >> 1: Estimated variance for factor 'fe' is effectively
zero
>> >>  in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200L,
>> >> tolerance >> >> 0.0000000149011611938477,
>> >> 2: $ operator not defined for this S4 class, returning NULL
in: x
>> >> $symbolic.cor
>> >>
>> >> Without being a statistician, I can still determine that this
>> is not
>> >> the model I would like to work with.  The coefficient is  
>> 0.0528, not
>> >> 0.0232.  (I am also not sure why I am getting these warning  
>> messages
>> >> on my system, either, but I don't think it matters.)
>> >>
>> >> is there a simple way to get the equivalent specification for
my
>> >> smple
>> >> model, using lmer or lme, which does not choke on huge data
sets?
>> >>
>> >> regards,
>> >>
>> >> /ivo
>> >>
>> >> <Ivo_Rout.txt>
>> > ______________________________________________
>> > R-help at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-
>> > guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>

Reasonably Related Threads

Search for more apparently analagous threads

R help - May 2007 - nlme fixed effects specification

[R] nlme fixed effects specification

[R] nlme fixed effects specification

[R] nlme fixed effects specification

Reasonably Related Threads