thr3ads.net - R help - [R] lme4 glmer general help wanted

If this information is useful, please help other people find it:
Share via:

Ben Gillespie

2012-Dec-06 19:12 UTC

[R] lme4 glmer general help wanted - code included

Hi guys,
I'm very new to R and have been teaching myself over the past few months -
it's a great tool and I'm hoping to use it to analyse my PhD data.As
I'm a bit of a newb, I'd really appreciate any feedback and/or guidance
with regards to the following questions that relate to generalized linearmixed
modelling (or, at least, I think they do!)(if there is a 'better', more
appropriate way that I could attempt to answer my questions, please let me
know).
I've spent a lot of time researching this approach on the internet, but
can'tseem to find any directly applicable examples.
Thanks in advance, and, if you need any further information, please let me know.
# My experiment:# I have 1 site on 3 different rivers (independent)(sites 1,2
and 3). # I visit each site 2 times (time 1 and 2). # On each visit, I take 5x
replicate insect samples and calculate total abundance for each replicate.# Site
1 is in an area called "yellow" and sites 2 and 3 are in an area
called "blue".
# My data frame:

data=data.frame(site=c(rep(1,10),rep(2,10),rep(3,10)),replicate=c(rep(1:5,6)),time=c(rep(1,5),rep(2,5),rep(1,5),rep(2,5),rep(1,5),rep(2,5)),abundance=c(1,2,1,2,1,2,1,2,1,2,30,32,30,32,30,32,30,32,30,32,30,31,33,32,31,31,33,32,31,32),sitetype=c(rep("yellow",10),rep("blue",20)))
data$site=factor(data$site)data$replicate=factor(data$replicate)data$time=factor(data$time)
data

# Initial remarks: # As each replicate (1-5) was taken from within each site
(1-3) on both sampling times (1-2),# I figure that 'replicate' should be
treated as nested within 'site' and that both should be treated as
random factors?
# First question: Is there is difference in abundance between sites?# Second
question: Is there is difference in abundance between sitetypes (blue or
yellow)?
	#If my 'initial remarks' statement is correct (please tell me if not),
then I think a generalized linear mixed model is appropriate and would be
something along these lines:
# Fitting the model:
		require(lme4)		
glmm1=glmer(abundance~time+sitetype+(1|site/replicate),family="poisson",data=data)
#I chose to use poisson as abundance is count data... not sure if that's a
good reason...				summary(glmm1)	#Output:
################################################################Generalized
linear mixed model fit by the Laplace approximation Formula: abundance ~ time +
sitetype + (1 | site/replicate)    Data: data    AIC   BIC logLik deviance 12.31
19.31 -1.153    2.306Random effects: Groups         Name        Variance
Std.Dev. replicate:site (Intercept)  0        0       site           (Intercept)
0        0      Number of obs: 30, groups: replicate:site, 15; site, 3
Fixed effects:               Estimate Std. Error z value Pr(>|z|)   
(Intercept)     3.43579    0.05641   60.91   <2e-16 ***time2          
0.01560    0.07900    0.20    0.843    sitetypeyellow -3.03815    0.26127 
-11.63   <2e-16 ***---Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’
0.1 ‘ ’ 1
Correlation of Fixed Effects:            (Intr) time2 time2       -0.706      
sitetypyllw -0.108 
0.000################################################################
# Inferences:
		#I'm unsure how to assess the variance and std dev scores for site... some
guidance here would be appreciated....i.e. how do I answer my original question:
Is there is difference in abundance between sites?		#There is no statistically
significant difference between the two time periods (P=>0.05)		#Using the
above output, the model suggests that there is a statistically significant
difference between site types (p<0.05)
# Further questions: 
		#1 Are the above inferences correct?		#2 I have read about overdispersion....
how would I test for this in this example?		#3 How could I build an interaction
term into the model and answer the following: "Is there a statistically
significant site*time interaction?"		#4 Finally, are there any obvious
steps or things I should be doing in order to get a 'robust' or
'correct' answer from this problem? i.e. further tests... alternative
models and comparisons...
 		 	   		  
	[[alternative HTML version deleted]]

Bert Gunter

2012-Dec-06 20:26 UTC

head link

[R] lme4 glmer general help wanted - code included

You are more likely to get a helpful (or any) response on mixed models
issues by posting to the r-sig-mixed-models list, not here.

-- Bert

On Thu, Dec 6, 2012 at 11:12 AM, Ben Gillespie <nebstah@hotmail.com>
wrote:
> Hi guys,
> I'm very new to R and have been teaching myself over the past few
months -
> it's a great tool and I'm hoping to use it to analyse my PhD
data.As I'm a
> bit of a newb, I'd really appreciate any feedback and/or guidance with
> regards to the following questions that relate to generalized linearmixed
> modelling (or, at least, I think they do!)(if there is a 'better',
more
> appropriate way that I could attempt to answer my questions, please let me
> know).
> I've spent a lot of time researching this approach on the internet, but
> can'tseem to find any directly applicable examples.
> Thanks in advance, and, if you need any further information, please let me
> know.
> # My experiment:# I have 1 site on 3 different rivers (independent)(sites
> 1,2 and 3). # I visit each site 2 times (time 1 and 2). # On each visit, I
> take 5x replicate insect samples and calculate total abundance for each
> replicate.# Site 1 is in an area called "yellow" and sites 2 and
3 are in
> an area called "blue".
> # My data frame:
>
>
>
data=data.frame(site=c(rep(1,10),rep(2,10),rep(3,10)),replicate=c(rep(1:5,6)),time=c(rep(1,5),rep(2,5),rep(1,5),rep(2,5),rep(1,5),rep(2,5)),abundance=c(1,2,1,2,1,2,1,2,1,2,30,32,30,32,30,32,30,32,30,32,30,31,33,32,31,31,33,32,31,32),sitetype=c(rep("yellow",10),rep("blue",20)))
>
>
data$site=factor(data$site)data$replicate=factor(data$replicate)data$time=factor(data$time)
> data
>
> # Initial remarks: # As each replicate (1-5) was taken from within each
> site (1-3) on both sampling times (1-2),# I figure that 'replicate'
should
> be treated as nested within 'site' and that both should be treated
as
> random factors?
> # First question: Is there is difference in abundance between sites?#
> Second question: Is there is difference in abundance between sitetypes
> (blue or yellow)?
>         #If my 'initial remarks' statement is correct (please tell
me if
> not), then I think a generalized linear mixed model is appropriate and
> would be something along these lines:
> # Fitting the model:
>                 require(lme4)
>
glmm1=glmer(abundance~time+sitetype+(1|site/replicate),family="poisson",data=data)
>      #I chose to use poisson as abundance is count data... not sure if
> that's a good reason...                               summary(glmm1)
>  #Output:
> ################################################################Generalized
> linear mixed model fit by the Laplace approximation Formula: abundance ~
> time + sitetype + (1 | site/replicate)    Data: data    AIC   BIC logLik
> deviance 12.31 19.31 -1.153    2.306Random effects: Groups         Name
>    Variance Std.Dev. replicate:site (Intercept)  0        0       site
>       (Intercept)  0        0      Number of obs: 30, groups:
> replicate:site, 15; site, 3
> Fixed effects:               Estimate Std. Error z value Pr(>|z|)
>  (Intercept)     3.43579    0.05641   60.91   <2e-16 ***time2
> 0.01560    0.07900    0.20    0.843    sitetypeyellow -3.03815    0.26127
>  -11.63   <2e-16 ***---Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05
‘.’
> 0.1 ‘ ’ 1
> Correlation of Fixed Effects:            (Intr) time2 time2       -0.706
>     sitetypyllw -0.108
>  0.000################################################################
> # Inferences:
>                 #I'm unsure how to assess the variance and std dev
scores
> for site... some guidance here would be appreciated....i.e. how do I answer
> my original question: Is there is difference in abundance between sites?
>        #There is no statistically significant difference between the two
> time periods (P=>0.05)                #Using the above output, the model
> suggests that there is a statistically significant difference between site
> types (p<0.05)
> # Further questions:
>                 #1 Are the above inferences correct?            #2 I have
> read about overdispersion.... how would I test for this in this example?
>            #3 How could I build an interaction term into the model and
> answer the following: "Is there a statistically significant site*time
> interaction?"         #4 Finally, are there any obvious steps or
things I
> should be doing in order to get a 'robust' or 'correct'
answer from this
> problem? i.e. further tests... alternative models and comparisons...
>
>         [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

	[[alternative HTML version deleted]]

Ben Gillespie

2012-Dec-06 21:25 UTC

head link

[R] lme4 glmer general help wanted - code included

Thanks Bert - I was unaware of that list. I have since posted my questions there
(in plain text).


________________________________> Date: Thu, 6 Dec 2012 12:50:05 -0800 
> Subject: Re: [R] lme4 glmer general help wanted - code included 
> From: gunter.berton at gene.com 
> To: nebstah at hotmail.com 
> 
> Incidentally, you are explicitly asked NOT to post in HTML. 
> 
> -- Bert 
> 
> On Thu, Dec 6, 2012 at 12:26 PM, Bert Gunter 
> <bgunter at gene.com<mailto:bgunter at gene.com>> wrote: 
> You are more likely to get a helpful (or any) response on mixed models 
> issues by posting to the r-sig-mixed-models list, not here. 
> 
> -- Bert 
> 
> On Thu, Dec 6, 2012 at 11:12 AM, Ben Gillespie 
> <nebstah at hotmail.com<mailto:nebstah at hotmail.com>> wrote: 
> Hi guys, 
> I'm very new to R and have been teaching myself over the past few 
> months - it's a great tool and I'm hoping to use it to analyse my
PhD
> data.As I'm a bit of a newb, I'd really appreciate any feedback
and/or
> guidance with regards to the following questions that relate to 
> generalized linearmixed modelling (or, at least, I think they do!)(if 
> there is a 'better', more appropriate way that I could attempt to 
> answer my questions, please let me know). 
> I've spent a lot of time researching this approach on the internet, but
> can'tseem to find any directly applicable examples. 
> Thanks in advance, and, if you need any further information, please let 
> me know. 
> # My experiment:# I have 1 site on 3 different rivers 
> (independent)(sites 1,2 and 3). # I visit each site 2 times (time 1 and 
> 2). # On each visit, I take 5x replicate insect samples and calculate 
> total abundance for each replicate.# Site 1 is in an area called 
> "yellow" and sites 2 and 3 are in an area called
"blue".
> # My data frame: 
> 
>
data=data.frame(site=c(rep(1,10),rep(2,10),rep(3,10)),replicate=c(rep(1:5,6)),time=c(rep(1,5),rep(2,5),rep(1,5),rep(2,5),rep(1,5),rep(2,5)),abundance=c(1,2,1,2,1,2,1,2,1,2,30,32,30,32,30,32,30,32,30,32,30,31,33,32,31,31,33,32,31,32),sitetype=c(rep("yellow",10),rep("blue",20)))
>
data$site=factor(data$site)data$replicate=factor(data$replicate)data$time=factor(data$time)
> data 
> 
> # Initial remarks: # As each replicate (1-5) was taken from within each 
> site (1-3) on both sampling times (1-2),# I figure that 'replicate'
> should be treated as nested within 'site' and that both should be 
> treated as random factors? 
> # First question: Is there is difference in abundance between sites?# 
> Second question: Is there is difference in abundance between sitetypes 
> (blue or yellow)? 
> #If my 'initial remarks' statement is correct (please tell me 
> if not), then I think a generalized linear mixed model is appropriate 
> and would be something along these lines: 
> # Fitting the model: 
> require(lme4) 
>
glmm1=glmer(abundance~time+sitetype+(1|site/replicate),family="poisson",data=data)
> #I chose to use poisson as abundance is count data... not sure if 
> that's a good reason... summary(glmm1) 
> #Output: 
> ################################################################Generalized
> linear mixed model fit by the Laplace approximation Formula: abundance 
> ~ time + sitetype + (1 | site/replicate) Data: data AIC BIC 
> logLik deviance 12.31 19.31 -1.153 2.306Random effects: Groups 
> Name Variance Std.Dev. replicate:site (Intercept) 0 0 
> site (Intercept) 0 0 Number of obs: 30, 
> groups: replicate:site, 15; site, 3 
> Fixed effects: Estimate Std. Error z value Pr(>|z|) 
> (Intercept) 3.43579 0.05641 60.91 <2e-16 ***time2 
> 0.01560 0.07900 0.20 0.843 sitetypeyellow -3.03815 
> 0.26127 -11.63 <2e-16 ***---Signif. codes: 0 ?***? 0.001 ?**? 0.01 
> ?*? 0.05 ?.? 0.1 ? ? 1 
> Correlation of Fixed Effects: (Intr) time2 time2 
> -0.706 sitetypyllw -0.108 
> 0.000################################################################ 
> # Inferences: 
> #I'm unsure how to assess the variance and std dev 
> scores for site... some guidance here would be appreciated....i.e. how 
> do I answer my original question: Is there is difference in abundance 
> between sites? #There is no statistically significant 
> difference between the two time periods (P=>0.05) #Using 
> the above output, the model suggests that there is a statistically 
> significant difference between site types (p<0.05) 
> # Further questions: 
> #1 Are the above inferences correct? #2 I 
> have read about overdispersion.... how would I test for this in this 
> example? #3 How could I build an interaction term into the 
> model and answer the following: "Is there a statistically significant 
> site*time interaction?" #4 Finally, are there any obvious steps 
> or things I should be doing in order to get a 'robust' or
'correct'
> answer from this problem? i.e. further tests... alternative models and 
> comparisons... 
> 
> [[alternative HTML version deleted]] 
> 
> 
> ______________________________________________ 
> R-help at r-project.org<mailto:R-help at r-project.org> mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 
> 
> 
> 
> 
> -- 
> 
> Bert Gunter 
> Genentech Nonclinical Biostatistics 
> 
> Internal Contact Info: 
> Phone: 467-7374 
> Website: 
>
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 
> 
> 
> 
> 
> -- 
> 
> Bert Gunter 
> Genentech Nonclinical Biostatistics 
> 
> Internal Contact Info: 
> Phone: 467-7374 
> Website: 
>
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 
>

Seemingly Similar Threads

Search for more reasonably related threads

R help - Dec 2012 - lme4 glmer general help wanted - code included

[R] lme4 glmer general help wanted - code included

[R] lme4 glmer general help wanted - code included

[R] lme4 glmer general help wanted - code included

Seemingly Similar Threads