thr3ads.net - R help - [R] lmer4 and variable selection [Aug 2008]

Andreas Nord

2008-Aug-25 16:21 UTC

[R] lmer4 and variable selection

Dear list, 

I am currently working with a rather large data set on body temperature
regulation in wintering birds. My original model contains quite a few
explanatory variables, but I do not (of course) wish to keep them all in my
final model. I've fitted the following model to the data:

> temp.lme1 <- lmer(T.B ~ tarsus + wing + weight + factor(age) + factor(sex) +
                      fat + minsunset + day1oct + day1oct.2 + minnight +
                      ave.day + minnight.1 + T.A + ave.night.1 +
                      (1 | ID) + (1 | sign),
                    data = bodytemp.df)

where T.B is body temperature; the explanatory variables are a number of
biometric measures (tarsus, wing, weight, fat, age, sex), various measures of
ambient temperature (ave.day, minnight.1, minnight, ave.night.1, T.A) and
time/date (minsunset, day1oct, day1oct.2). Random factors are ID (individuals
were sampled one to three times) and sign (person performing the
measurements; 2 levels).

Model output looks like this:
> summary(temp.lme1)
Linear mixed model fit by REML 
Formula: T.B ~ tarsus + wing + weight + factor(age) + factor(sex) + fat +
    minsunset + day1oct + day1oct.2 + minnight + ave.day + minnight.1 +
    T.A + ave.night.1 + (1 | ID) + (1 | sign)
   Data: bodytemp.df 
   AIC BIC logLik deviance REMLdev
 557.8 614 -260.9      441   521.8
Random effects:
 Groups   Name        Variance   Std.Dev.  
 ID       (Intercept) 1.0399e-01 0.32247096
 sign     (Intercept) 6.2663e-08 0.00025033
 Residual             8.0162e-01 0.89533134
Number of obs: 167, groups: ID, 124; sign, 2

Fixed effects:
                 Estimate Std. Error t value
(Intercept)     4.124e+01  4.104e+00  10.049
tarsus         -5.925e-02  5.801e-02  -1.021
wing           -6.252e-02  4.984e-02  -1.254
weight          1.499e-01  1.446e-01   1.037
factor(age)2K+  1.981e-01  1.651e-01   1.200
factor(sex)M    9.232e-02  2.146e-01   0.430
fat            -2.297e-02  8.150e-02  -0.282
minsunset      -1.104e-03  1.043e-03  -1.058
day1oct        -4.247e-03  2.879e-02  -0.148
day1oct.2       5.087e-05  1.560e-04   0.326
minnight       -5.987e-02  7.022e-02  -0.853
ave.day         1.128e-01  1.582e-01   0.713
minnight.1     -9.590e-02  1.684e-01  -0.570
T.A            -4.855e-02  5.185e-02  -0.936
ave.night.1     1.420e-01  2.477e-01   0.573

Correlation of Fixed Effects:
            (Intr) tarsus wing   weight f()2K+ fct()M fat    mnsnst day1ct dy1c.2 mnnght ave.dy mnng.1 T.A
tarsus      -0.851
wing        -0.870  0.966
weight       0.071 -0.417 -0.411
factr(g)2K+  0.211 -0.248 -0.241  0.219
factor(sx)M  0.573 -0.499 -0.526 -0.179  0.105
fat         -0.037  0.046  0.052 -0.264 -0.152  0.045
minsunset   -0.177 -0.144 -0.122  0.214 -0.101 -0.027 -0.045
day1oct     -0.261 -0.051 -0.052 -0.117 -0.145  0.140  0.131  0.515
day1oct.2    0.257  0.050  0.051  0.121  0.141 -0.149 -0.125 -0.484 -0.993
minnight    -0.074  0.249  0.216 -0.271 -0.032 -0.043  0.022  0.022 -0.168  0.231
ave.day     -0.025  0.070  0.050  0.001  0.045 -0.022  0.046 -0.363 -0.120  0.041 -0.415
minnight.1   0.304 -0.081 -0.045  0.069  0.129  0.012 -0.054 -0.349 -0.636  0.644  0.023  0.052
T.A          0.049 -0.043  0.018  0.130  0.040 -0.164 -0.065 -0.317 -0.288  0.249 -0.598  0.267  0.143
ave.night.1 -0.234  0.004 -0.015 -0.030 -0.110  0.016  0.031  0.493  0.614 -0.586  0.105 -0.524 -0.863 -0.243

At this point, I want to go on to select the variables with the most
explanatory power and arrive at a final model. However, I'm not sure how to
do this, because (not being a trained statistician) I'm used to having
p-values to guide me. Similarly, I would like to be able to report the
relative "importance" of variables in some way but, as is apparent from a
number of threads, p-values seem to be the least preferred option when it
comes to lmer. I've read about the mcmcsamp() function, but I'm not entirely
sure how to use it or how to interpret the output.

Any advice would be most appreciated.


Kind regards, 
Andreas Nord                   

-- 
View this message in context:
http://www.nabble.com/lmer4-and-variable-selection-tp19146850p19146850.html
Sent from the R help mailing list archive at Nabble.com.

jebyrnes

2008-Aug-25 16:48 UTC

Have you thought about using AIC weights?  As long as you are not considering
models where you drop your random effects, calculating AIC values (or AICc
values) and doing multimodel inference is one way to approach your problem.

If you are fitting models with and without random effects, it gets trickier
- see Vaida and Blanchard 2005 Biometrika.

-Jarrett
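
The AIC-weight suggestion above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original poster's analysis: `aic_weights` is a hypothetical helper written for this example, the AIC values in the first call are arbitrary, and the fitted models use the `sleepstudy` data shipped with lme4 rather than the bird data. The candidate models share the same random-effects structure and are fitted by maximum likelihood (`REML = FALSE`), because REML criteria are not comparable between models with different fixed effects.

```r
# Akaike weights from a vector of AIC (or AICc) values.
# Hypothetical helper for illustration; not part of lme4.
aic_weights <- function(aic) {
  delta <- aic - min(aic)          # difference from the best (lowest) AIC
  rel_lik <- exp(-0.5 * delta)     # relative likelihood of each model
  rel_lik / sum(rel_lik)           # normalise so the weights sum to 1
}

# Arbitrary AIC values, only to show the calculation.
print(aic_weights(c(a = 100, b = 102, c = 110)))

# Same idea with two mixed models that differ only in their fixed effects.
# Runs only if lme4 is installed.
if (requireNamespace("lme4", quietly = TRUE)) {
  sleep <- lme4::sleepstudy
  m0 <- lme4::lmer(Reaction ~ 1 + (1 | Subject), data = sleep, REML = FALSE)
  m1 <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = sleep, REML = FALSE)
  aic <- c(m0 = AIC(m0), m1 = AIC(m1))
  print(round(aic_weights(aic), 3))
}
```

A weight near 1 for one model says it has most of the support within the candidate set; it says nothing about whether any model in the set is adequate.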

Bert Gunter

2008-Aug-25 16:55 UTC

You **really** should work with a local statistician. Remote statistical
advice (this is not really about R) from even well-meaning helpers
unfamiliar with your work is really very risky. For example, I would suggest
making all sorts of plots (statistical summaries alone are wholly inadequate
and potentially quite misleading), but exactly what to plot, how to
interpret what the plots show, and what to do next would depend on both the
subject matter background (how the study was conducted and what sorts of
mechanisms are expected, for example) and what the plots revealed.

Like the gangster movies (used to) say: just a friendly warning ...  :)

-- Bert Gunter
Genentech
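
As a purely generic starting point for the kind of plotting recommended above, a first pass might look like the sketch below. Everything in it is an assumption made for illustration: the data are simulated, the names `d`, `t_a`, `t_b` and `id` are hypothetical stand-ins for the real variables, and the three plots are a common default set, not a substitute for plots chosen with knowledge of how the study was run.

```r
# Simulated stand-in data: 40 individuals measured 3 times each.
# All names and values are hypothetical.
set.seed(1)
n_id <- 40
d <- data.frame(id  = factor(rep(seq_len(n_id), each = 3)),
                t_a = rnorm(3 * n_id, mean = 0, sd = 5))
d$t_b <- 41 - 0.05 * d$t_a +
  rep(rnorm(n_id, sd = 0.3), each = 3) +   # individual-level random intercept
  rnorm(nrow(d), sd = 0.9)                 # residual noise

# Runs only if lme4 is installed.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(t_b ~ t_a + (1 | id), data = d)

  op <- par(mfrow = c(1, 3))
  # Raw relationship between response and one predictor
  plot(d$t_a, d$t_b, xlab = "Ambient temperature", ylab = "Body temperature")
  # Residuals against fitted values: look for trends or changing spread
  plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residual")
  abline(h = 0, lty = 2)
  # Normal quantile plot of the residuals
  qqnorm(resid(fit))
  qqline(resid(fit))
  par(op)
}
```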


Prof Brian Ripley

2008-Aug-25 18:29 UTC

On Mon, 25 Aug 2008, jebyrnes wrote:

> Have you thought about using AIC weights?  As long as you are not considering
> models where you drop your random effects, calculating AIC values (or AICc
> values) and doing multimodel inference is one way to approach your problem.
>
> If you are fitting models with and without random effects, it gets trickier
> - see Vaida and Blanchard 2005 Biometrika.

Also if you are setting variances to zero ....

> -Jarrett
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
