zubin <binabina <at> bellsouth.net> writes:
>
> Hello, running a mixed model in the package LME4, lmer()
>
> Panel data, have about 322 time periods and 50 states, total data set is
> approx 15K records and about 20 explanatory variables. Not a very
> large data set.
>
> We run random intercepts as well as random coefficients for about 10 of
> the variables, the rest come in as fixed effects. We are running into
> a wall of time to execute these models.
>
> A sample specification of all random effects:
>
> lmer(Y ~ 1 + (x_078 + x_079 + growth_st_index +
> retail_st_index + Natl + econ_home_st_index +
> econ_bankruptcy + index2_HO + GPND_ST | state),
> data = newData, doFit = TRUE)
>
> Computation time is near 15 minutes.
> System ELAPSED User
> 21.4 888.63 701.74
>
> Does anyone have any ideas on ways to speed up lmer(), as well as any
> parallel implementations, or approaches/options to reduce computation time?
>
>
(1) these kinds of questions will probably get more informed answers
on the r-sig-mixed-models list. Please direct follow-ups there.
(2) I'm not really sure whether this counts as "large" in the
mixed/multilevel model world. It's certainly not very large for a
standard linear regression. For comparison, the 'Chem97' dataset in
the mlmRev package is 31022 observations x 8 variables x 2280 blocks and
is described as "relatively large" -- so the raw data matrix is about
the same size (twice as long, half as wide) but there are many more
blocks.
(3) Fitting 10 random effects (including the intercept) is very
ambitious: it leads to the estimation of a 10x10 correlation matrix
(10 variances plus 45 correlations, i.e. 55 parameters) ... I don't
know whether you know that's what you're doing, or whether you need
the full correlation matrix. You can split it up into independent
blocks (in the extreme, 10 uncorrelated random effects) by specifying
the REs as separate chunks,
e.g. (1|state) + (0+x_078|state) + (0+x_079|state) + ... (see some
of the examples in the lmer documentation). lme, in the nlme package,
offers more flexibility in specifying structured correlation matrices
of different types, but will in general be slower than lme4. Still,
it might be faster to fit a structured (simpler) model you're
happy with using lme than the full unstructured model using lmer.
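
A rough sketch of the "extreme" split in (3), using the variable names from
the model quoted above. The object names (re_vars, re_terms, f_diag, n_full,
n_diag) are made up for illustration, and the lmer() call is left commented
out because newData is the poster's own data frame. It only builds the
formula and counts the variance-covariance parameters each specification
would have to estimate; it makes no claim about how much time is saved.

```r
# Random-slope variables taken from the model specification quoted above
re_vars <- c("x_078", "x_079", "growth_st_index", "retail_st_index", "Natl",
             "econ_home_st_index", "econ_bankruptcy", "index2_HO", "GPND_ST")

# One independent term per effect: (1 | state) + (0 + x_078 | state) + ...
re_terms <- c("(1 | state)", paste0("(0 + ", re_vars, " | state)"))
f_diag <- as.formula(paste("Y ~ 1 +", paste(re_terms, collapse = " + ")))

# Number of variance-covariance parameters in each specification
q <- length(re_vars) + 1      # random effects per state, intercept included
n_full <- q * (q + 1) / 2     # unstructured q x q matrix
n_diag <- q                   # independent effects: variances only

print(f_diag)
cat("unstructured:", n_full, " independent:", n_diag, "\n")

# Assumed usage, with newData being the poster's data frame:
# fit_diag <- lme4::lmer(f_diag, data = newData)
```

Intermediate versions (a few correlated blocks rather than 10 separate
terms) can be written the same way by grouping several variables inside one
(... | state) term.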
(4) the development version of lme4, lme4a, *might* be faster (but
is less well tested/less stable).
(5) do you have alternatives? I haven't worked with data sets this
size myself, but anecdotes on the r-sig-mixed-models list suggest that
lmer is faster than most alternatives ... ?
Ben Bolker