Ken Kelley
2003-Oct-21 20:42 UTC
[R] Denominator Degrees of Freedom in lme() -- Adjusting and Understanding Them
Hello all. I was wondering whether there is any way to adjust the denominator degrees of freedom in lme(). It seems to me that only one method is available. As has been pointed out previously on the list, the denominator degrees of freedom given by lme() do not match those given by SAS Proc Mixed or HLM5. Proc Mixed, for example, offers five different options for computing the denominator degrees of freedom. Is there any way to make such specifications in lme(), so that the degrees of freedom correspond to the output from Proc Mixed?

I've looked at Pinheiro and Bates' Mixed-Effects Models book (especially p. 91), but I still don't quite understand the method used for determining the degrees of freedom in lme(). When analyzing longitudinal data with the straight-line growth model (intercept and slope both have fixed and random effects), the degrees of freedom seem to be N*T-N-1, where N is the total sample size and T is the number of timepoints (at least when the data are balanced). In the Pinheiro and Bates book (p. 91), the degrees of freedom are given as m_i - (m_{i-1} + p_i), where m_i is the number of groups at the ith level, m_0 = 1 if an intercept is included, and p_i is the sum of the degrees of freedom corresponding to the terms estimated at level i.

I'm not sure how N*T-N-1 matches up with the formula given on page 91. It seems to me that the number of "groups" (i.e., m_i) would be equal to N, the number of individuals (note that this is what is given as the "number of groups" in the summary of the lme() object). However, as more measurement occasions are added, the number of degrees of freedom gets larger, making it seem as though m_i represents the total number of observations, not the "number of groups." For example, if N=2 and T=3, you end up with 3 degrees of freedom using lme(). Increasing T to 10 does not change the number of groups (i.e., N still equals 2), but the degrees of freedom increase to 17.
In such a situation SAS Proc Mixed would still have 1 degree of freedom (N-1) regardless of T, as the number of "groups" has not changed (only the number of observations per group has changed). Any insight into understanding the denominator degrees of freedom for the fixed effects would be appreciated.

Since the degrees of freedom given by lme() can be made arbitrarily larger than those given by PROC MIXED (i.e., by having an arbitrarily large number of measurement occasions for each individual), and since the degrees of freedom affect the standard errors, then the hypothesis tests, then the p values, the differences between the methods are surprising. It seems one of the methods would be better than the other, since they can potentially be so different.

Thanks and have a good one,
Ken

P.S. I have posted this to both the R and Multilevel Modeling lists.
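The arithmetic in the question can be checked with a small sketch. It is written in Python rather than R and is not code from nlme; the function name `den_df` and its arguments are hypothetical. It assumes the reading of p. 91 in which the innermost level (the one after the last grouping factor) has m equal to the total number of observations, and in which the time slope is counted as one term estimated at that innermost level.

```python
def den_df(group_counts, n_obs, p, intercept=True):
    """Denominator df per level, denDF_i = m_i - (m_{i-1} + p_i).

    group_counts: [m_1, ..., m_Q], number of groups at each grouping level.
    n_obs:        total number of observations, used as m_{Q+1} (assumption).
    p:            [p_1, ..., p_{Q+1}], df of the terms estimated at each level.
    intercept:    m_0 is 1 when the model has an intercept, else 0.
    """
    m = [1 if intercept else 0] + list(group_counts) + [n_obs]
    return [m[i] - (m[i - 1] + p[i - 1]) for i in range(1, len(m))]


# Straight-line growth model, N = 2 subjects, one within-subject term (time).
N = 2
for T in (3, 10):
    subject_level, observation_level = den_df([N], N * T, p=[0, 1])
    # The observation-level value equals N*T - N - 1 (3 for T=3, 17 for T=10);
    # the subject-level value equals N - 1 regardless of T.
    print(T, subject_level, observation_level)
```

Under these assumptions the 3 and 17 reported above are the observation-level values, while the subject-level value stays at N-1 = 1, the figure quoted for Proc Mixed.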
Douglas Bates
2003-Oct-21 21:54 UTC
[R] Denominator Degrees of Freedom in lme() -- Adjusting and Understanding Them
Contributions of code to provide alternative calculations of denominator degrees of freedom are welcome :-)

I think it would be good to bear in mind that the use of the t and F distributions for models with mixed effects is already an approximation. If your design is such that you end up with very few denominator degrees of freedom, then the whole question of whether you should be using F or t distributions in the first place becomes problematic. If the number of denominator degrees of freedom is moderate, then the distinction between alternative methods becomes unimportant.

--
Douglas Bates bates at stat.wisc.edu
Statistics Department 608/262-2598
University of Wisconsin - Madison http://www.stat.wisc.edu/~bates/
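To get a rough feel for the last point, the sketch below compares two-sided p-values from the t distribution for one fixed test statistic at several degrees of freedom. It is plain Python using only the standard library, not part of nlme; the helper names `t_pdf` and `two_sided_p`, the test statistic of 2.0, and the df values are all illustrative assumptions. The tail area is obtained by Simpson's rule, so the results are numerical approximations.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Approximate P(|T| > |t|) via Simpson's rule on [0, |t|] (steps is even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 1 - 2 * total * h / 3


# Same statistic, different denominator degrees of freedom.
for df in (1, 3, 17, 50, 60):
    print(df, round(two_sided_p(2.0, df), 4))
```

With 1 versus 17 degrees of freedom the p-value for the same statistic changes substantially, whereas between 50 and 60 it barely moves, which is one way to read the remark that alternative methods matter little once the degrees of freedom are moderate.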
Spencer Graves
2003-Oct-22 00:22 UTC
[R] Denominator Degrees of Freedom in lme() -- Adjusting and Understanding Them
Prof. Bates may be able to give us more recent references on this, but the best literature I know on this is Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer, sec. 2.4). This includes a description of a "simulate.lme" function, which you can use to generate random numbers according to a given assumed model and then compare some results with a reference distribution. Something like this could be used to answer your question of what is the correct number of degrees of freedom to use for any particular model.

Hope this helps.
Spencer Graves