Dear list,

I am currently working with a rather large data set on body temperature regulation in wintering birds. My original model contains quite a few explanatory variables, but I do not (of course) wish to keep them all in my final model. I've fitted the following model to the data:

> temp.lme1<-lmer(T.B~tarsus+wing+weight+factor(age)+factor(sex)+fat+minsunset+
      day1oct+day1oct.2+minnight+ave.day+minnight.1+T.A+ave.night.1+
      (1|ID)+(1|sign),data=bodytemp.df)

where T.B is body temperature; the explanatory variables are a number of biometric measures (tarsus, wing, weight, fat, age, sex), various measures of ambient temperature (ave.day, minnight.1, minnight, ave.night.1, T.A) and time/date (minsunset, day1oct, day1oct.2). Random factors are ID (individuals were sampled 1 to 3 times) and sign (person performing measurements; 2 levels).

Model output looks like this:

> summary(temp.lme1)
Linear mixed model fit by REML
Formula: T.B ~ tarsus + wing + weight + factor(age) + factor(sex) + fat +
    minsunset + day1oct + day1oct.2 + minnight + ave.day + minnight.1 +
    T.A + ave.night.1 + (1 | ID) + (1 | sign)
   Data: bodytemp.df
   AIC BIC logLik deviance REMLdev
 557.8 614 -260.9      441   521.8
Random effects:
 Groups   Name        Variance   Std.Dev.
 ID       (Intercept) 1.0399e-01 0.32247096
 sign     (Intercept) 6.2663e-08 0.00025033
 Residual             8.0162e-01 0.89533134
Number of obs: 167, groups: ID, 124; sign, 2

Fixed effects:
                 Estimate Std. Error t value
(Intercept)      4.124e+01  4.104e+00  10.049
tarsus          -5.925e-02  5.801e-02  -1.021
wing            -6.252e-02  4.984e-02  -1.254
weight           1.499e-01  1.446e-01   1.037
factor(age)2K+   1.981e-01  1.651e-01   1.200
factor(sex)M     9.232e-02  2.146e-01   0.430
fat             -2.297e-02  8.150e-02  -0.282
minsunset       -1.104e-03  1.043e-03  -1.058
day1oct         -4.247e-03  2.879e-02  -0.148
day1oct.2        5.087e-05  1.560e-04   0.326
minnight        -5.987e-02  7.022e-02  -0.853
ave.day          1.128e-01  1.582e-01   0.713
minnight.1      -9.590e-02  1.684e-01  -0.570
T.A             -4.855e-02  5.185e-02  -0.936
ave.night.1      1.420e-01  2.477e-01   0.573

Correlation of Fixed Effects:
             (Intr) tarsus   wing weight f()2K+ fct()M    fat mnsnst day1ct dy1c.2 mnnght ave.dy mnng.1    T.A
tarsus       -0.851
wing         -0.870  0.966
weight        0.071 -0.417 -0.411
factr(g)2K+   0.211 -0.248 -0.241  0.219
factor(sx)M   0.573 -0.499 -0.526 -0.179  0.105
fat          -0.037  0.046  0.052 -0.264 -0.152  0.045
minsunset    -0.177 -0.144 -0.122  0.214 -0.101 -0.027 -0.045
day1oct      -0.261 -0.051 -0.052 -0.117 -0.145  0.140  0.131  0.515
day1oct.2     0.257  0.050  0.051  0.121  0.141 -0.149 -0.125 -0.484 -0.993
minnight     -0.074  0.249  0.216 -0.271 -0.032 -0.043  0.022  0.022 -0.168  0.231
ave.day      -0.025  0.070  0.050  0.001  0.045 -0.022  0.046 -0.363 -0.120  0.041 -0.415
minnight.1    0.304 -0.081 -0.045  0.069  0.129  0.012 -0.054 -0.349 -0.636  0.644  0.023  0.052
T.A           0.049 -0.043  0.018  0.130  0.040 -0.164 -0.065 -0.317 -0.288  0.249 -0.598  0.267  0.143
ave.night.1  -0.234  0.004 -0.015 -0.030 -0.110  0.016  0.031  0.493  0.614 -0.586  0.105 -0.524 -0.863 -0.243

At this point, I want to go on to select the variables with the most explanatory power and arrive at a final model. However, I'm not sure how to do this, because (not being a trained statistician) I'm used to having p-values to guide me. Similarly, I would like to be able to report the relative "importance" of variables in some way but, as is apparent from a number of threads, p-values seem to be the least preferred option when it comes to lmer.

I've read about the mcmcsamp() function, but I'm not entirely sure how to use it or how to interpret the output.

Any advice would be most appreciated.

Kind regards,
Andreas Nord

--
View this message in context: http://www.nabble.com/lmer4-and-variable-selection-tp19146850p19146850.html
Sent from the R help mailing list archive at Nabble.com.

Have you thought about using AIC weights? As long as you are not considering models where you drop your random effects, calculating AIC values (or AICc values) and doing multimodel inference is one way to approach your problem.

If you are fitting models with and without random effects, it gets trickier - see Vaida and Blanchard 2005 Biometrika.

-Jarrett

--
View this message in context: http://www.nabble.com/lmer4-and-variable-selection-tp19146850p19147125.html
Sent from the R help mailing list archive at Nabble.com.
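
For readers who have not met AIC weights before, here is a minimal sketch of the arithmetic behind the suggestion above. The helper names (akaike_weights, aicc) and the three AIC values are made up for illustration; they are not from this thread or from any package. It assumes every candidate model keeps the same random effects and is fitted by maximum likelihood rather than REML, since REML criteria are not comparable between models with different fixed effects.

```r
# Akaike weights from a named vector of AIC (or AICc) values.
# `akaike_weights` is an illustrative helper name, not a package function.
akaike_weights <- function(aic) {
  delta <- aic - min(aic)        # difference from the best (lowest-AIC) model
  rel_lik <- exp(-0.5 * delta)   # relative likelihood of each model
  rel_lik / sum(rel_lik)         # normalise so the weights sum to 1
}

# Small-sample correction: AICc = AIC + 2k(k + 1) / (n - k - 1),
# with k estimated parameters and n observations.
aicc <- function(aic, k, n) {
  aic + (2 * k * (k + 1)) / (n - k - 1)
}

# Hypothetical AIC values for three candidate models (NOT results from the
# bird data; chosen only to show the calculation).
aic_values <- c(full = 450.0, biometric = 446.0, ambient = 448.0)
w <- akaike_weights(aic_values)
print(round(w, 3))

# With lme4 the inputs might be obtained along these lines (the reduced
# formula is an assumption for illustration):
#   m1 <- lmer(T.B ~ weight + T.A + (1 | ID) + (1 | sign),
#              data = bodytemp.df, REML = FALSE)
#   AIC(m1)
```

Summing the weights of all candidate models that contain a given variable is one common way to report its relative "importance", though the result depends on which models are put in the candidate set. What to use for n in AICc is not clear-cut with repeated measures (observations or individuals), so that choice would need to be stated.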

You **really** should work with a local statistician. Remote statistical advice (this is not really about R) from even well-meaning helpers unfamiliar with your work is really very risky. For example, I would suggest making all sorts of plots (statistical summaries alone are wholly inadequate and potentially quite misleading), but exactly what to plot, how to interpret what the plots show, and what to do next would depend on both the subject matter background (how the study was conducted and what sorts of mechanisms are expected, for example) and what the plots revealed.

Like the gangster movies (used to) say: just a friendly warning ... :)

-- Bert Gunter
Genentech

----- Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Andreas Nord
Sent: Monday, August 25, 2008 9:22 AM
To: r-help at r-project.org
Subject: [R] lmer4 and variable selection

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

On Mon, 25 Aug 2008, jebyrnes wrote:

> Have you thought about using AIC weights? As long as you are not considering
> models where you drop your random effects, calculating AIC values (or AICc
> values) and doing multimodel inference is one way to approach your problem.
>
> If you are fitting models with and without random effects, it gets trickier
> - see Vaida and Blanchard 2005 Biometrika.

Also if you are setting variances to zero ....

> -Jarrett

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
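
One reason the zero-variance case is awkward is that a variance of zero sits on the boundary of the parameter space, so the usual chi-square reference distribution for a likelihood-ratio statistic no longer applies. The sketch below is only an illustration of that point and is not something proposed in the thread. It assumes the textbook result for testing a single variance component, where the reference distribution is a 50:50 mixture of a point mass at zero and a chi-square with 1 df. The function names and the test statistic are hypothetical, and the approximation may be poor for a factor with very few levels (such as sign, with 2).

```r
# Naive p-value: treats the LRT statistic as chi-square with 1 df.
naive_lrt_p <- function(lrt) {
  pchisq(lrt, df = 1, lower.tail = FALSE)
}

# Boundary-adjusted p-value for H0: a single variance component equals zero.
# Assumes a 50:50 mixture of chi-square(0) and chi-square(1) under H0.
boundary_lrt_p <- function(lrt) {
  stopifnot(length(lrt) == 1, lrt >= 0)
  if (lrt == 0) return(1)  # every draw from the mixture is >= 0
  0.5 * pchisq(lrt, df = 1, lower.tail = FALSE)
}

# Hypothetical likelihood-ratio statistic (not computed from the bird data).
lrt <- 2.9
cat("naive p:   ", naive_lrt_p(lrt), "\n")
cat("adjusted p:", boundary_lrt_p(lrt), "\n")
```

For any positive statistic the adjusted p-value is half the naive one, so the naive test is conservative. AIC comparisons between models with and without a variance component are affected by the same boundary problem.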