thr3ads.net - R help - [R] before-after control-impact analysis with R [Aug 2008]

If this information is useful, please help other people find it:
Share via:

Nikolaos Lampadariou

2008-Aug-17 23:11 UTC

[R] before-after control-impact analysis with R

Hello everybody,

In am trying to analyse a BACI experiment and I really want to do it 
with R (which I find really exciting).  So, before moving on I though it 
would be a good idea to repeat some known experiments which are quite 
similar to my own. I tried to reproduce 2 published examples but without 
much success. The first one in particular is a published dataset 
analysed with SAS by McDonald et al. (2000). They also provide the 
original data as well as the SAS code. I don't know much about SAS and i 
really want to stick to R. So here follow 3 questions based on these 2 
papers.


Paper 1
(McDonald et al. 2000. Analysis of count data from before-after 
control-impact studies. J. Agric. Biol. Envir Stat., 5:262-279)

Data were collected from 1984, 1989, 1990, 1991, 1993, and 1996. Samples 
from 1984 are considered as before an impact and the remaining years as 
  after the impact. Each year, 96 transects were sampled (36 in the 
oiled area and 60 in the non-oiled area; "0" is for oiled and
"1" for
non-oiled). The authors compare 3 different ways of analysing the data 
including glmm. The data can be reproduced with the following commands 
(density is fake numbers but I can provide the original data since I've 
typed them in anyway):

 >x<-c(rep("0",36),rep("1",60))
 >oiled<-rep(x,6)
 >year<-c(rep(1984,96), rep(1989,96), rep(1990,96), rep(1991,96), 
rep(1993,96), rep(1996,96))
 >density<-runif(576, min=0, max=10)
 >transect<-c(rep(1:96,6))
 >oil<-data.frame("oiled"=oiled, "transect"=transect,
"year"=year,
"density"=density)

Question 1:
I can reproduce the results of the repeated measures anova with:
 >oil.aov1<-aov(density~factor(year)*factor(oiled)+Error(factor(transect))

But why is the following command not working?
 >oil.aov2<-aov(density~oiled*year + Error(oiled/transect), data=oil)

After reading the R-help archive, as well as Chambers and Hasties 
(Statistical Models in S) and Pinheiro's and Bates (Mixed effects models 
in S and S-plus) I would expect that the correct model is the oil.aov2. 
As you might see from the data, transect is nested within oiled and I 
still get the wrong results when I try +Error(transect) or 
+Error(oiled:transect) and many other variants.

Question 2 (on the same paper):
The authors conclude that it is better to analyse the data with a 
Generalized Linear (Mixed) Model Technique. I tried lme and after 
reading Douglas Bates article (May 2005, vol. 5/1) also lmer as follows:
 >oil.lme<-lme(density~year*oiled, random=~1|oiled/transect)
or
 >harvest.lmer<-lmer(abund~H*BA+BA/year+BA/year:H+site:BA+(1|H/site))
but again no luck. I will get always some error messages or some 
interactions missing.


Paper 2
(Keough & Quinn, 2000. Legislative vs. practical protection of an 
intertidal shoreline in southeastern Australia. Ecol. Appl. 10: 871-881)

Data were collected from 1989, 1990, 1991, 1993, 1994, 1995, 1996, 
1997). the 1989-1991 are the before impact years and the other 4 years 
are the after the impact years. Eight sites were samples (2 protected 
and 6 impacted). In this dataset, site is nested within harvest (H) and 
year is nested within before-after (BA). Also, site is considered as 
random by the authors. The data (fake again) can be produced with the 
following commands:

 >site<-c(rep(c("A1","A2", "RR1",
"RR2", "WT1", "WT2", "WT3",
"WT4"),8))
 >H<-c(rep(c("exp", "exp", "prot",
"pro", "exp", "exp", "exp",
"exp"), 8))
 >year<-c(rep(1989,8), rep(1990,8), rep(1991,8), rep(1993,8), 
rep(1994,8), rep(1995,8), rep(1996,8), rep(1997,8))
 >BA<-c(rep("bef",24), rep("after",40))
 >abund<-runif(64, min=0, max=10)
 >harvest<-data.frame(abund, BA, H, site, year)

Question 3.
The authors use a complex anova design and here is part of their anova 
table which shows the design and the df.

source.of.var      df
Harvesting(H)     1, 6
before-after(BA)  1, 6
H x BA            1, 6
Site(H)           6, 31
Year(BA)          6, 31
Site x BA         6, 31
Year x H          6, 31
Resid.            31


My question again is: Why can't I reproduce the results? When I try a 
simple anova without any random factors:
 >harvest.lm<-lm(abund~H*BA+H/site+BA/year+site:BA+year:H)
I get close enought but the results are different (at least I think they 
are different because the nomin. df are different).

So obviously I need to assign sites as a random factor somehow. So when 
I try to include site (which is nested in H) as a random factor and also 
year nested in BA (as the authors did) the best I can come up with is:
 >harvest.lme<-lme(abund~H*BA+BA/year+BA/year:H, random=~1|H/site)
But I get a warning message (Warning message:In pf(q, df1, df2, 
lower.tail, log.p) : NaNs produced) and also I don't know where to put 
the site x BA term (whatever I try I get error messages).

I really apologise for the long post but after a week of studying and 
trying as many ideas and examples I could find and think of I felt that 
I really need advice from some more experienced users if I really want 
to use this magnificent tool correctly.

Thanks in advance

-- 
-------------------------------------------------------
Nikolaos Lampadariou
Hellenic Centre for Marine Research
P.O. Box 2214
710 03 Heraklion Crete
GREECE

e-mail: nlamp at her.hcmr.gr
Ph. +30 281 0337849, +30 281 0337806
FAX +30 281 0337822
Web site: http://www.hcmr.gr/

Mark Difford

2008-Aug-18 08:53 UTC

head link

[R] before-after control-impact analysis with R

Hi Nikolaos,
>> My question again is: Why can't I reproduce the results? When I try
a
>> simple anova without any random factors:
Lack of a "right" result probably has to do with the type of analysis
of
variance that is being done. The default in R is to use so-called Type I
tests, for good reason. SAS, I think, still uses a type of test that many
authorities consider to be meaningless, as a general, first-level, test.

There is a long, long thread on this, going back to about approx April/May
1999, when a suitable Ripplyism was coined, which I believe found its way
into the fortunes package. But

RSiteSearch("type III")

should do for a start. Also see

?drop1
library(car)
?Anova

HTH, Mark.


Nikolaos Lampadariou wrote:> 
> Hello everybody,
> 
> In am trying to analyse a BACI experiment and I really want to do it 
> with R (which I find really exciting).  So, before moving on I though it 
> would be a good idea to repeat some known experiments which are quite 
> similar to my own. I tried to reproduce 2 published examples but without 
> much success. The first one in particular is a published dataset 
> analysed with SAS by McDonald et al. (2000). They also provide the 
> original data as well as the SAS code. I don't know much about SAS and
i
> really want to stick to R. So here follow 3 questions based on these 2 
> papers.
> 
> 
> Paper 1
> (McDonald et al. 2000. Analysis of count data from before-after 
> control-impact studies. J. Agric. Biol. Envir Stat., 5:262-279)
> 
> Data were collected from 1984, 1989, 1990, 1991, 1993, and 1996. Samples 
> from 1984 are considered as before an impact and the remaining years as 
>   after the impact. Each year, 96 transects were sampled (36 in the 
> oiled area and 60 in the non-oiled area; "0" is for oiled and
"1" for
> non-oiled). The authors compare 3 different ways of analysing the data 
> including glmm. The data can be reproduced with the following commands 
> (density is fake numbers but I can provide the original data since I've
> typed them in anyway):
> 
>  >x<-c(rep("0",36),rep("1",60))
>  >oiled<-rep(x,6)
>  >year<-c(rep(1984,96), rep(1989,96), rep(1990,96), rep(1991,96), 
> rep(1993,96), rep(1996,96))
>  >density<-runif(576, min=0, max=10)
>  >transect<-c(rep(1:96,6))
>  >oil<-data.frame("oiled"=oiled,
"transect"=transect, "year"=year,
> "density"=density)
> 
> Question 1:
> I can reproduce the results of the repeated measures anova with:
> 
>oil.aov1<-aov(density~factor(year)*factor(oiled)+Error(factor(transect))
> 
> But why is the following command not working?
>  >oil.aov2<-aov(density~oiled*year + Error(oiled/transect), data=oil)
> 
> After reading the R-help archive, as well as Chambers and Hasties 
> (Statistical Models in S) and Pinheiro's and Bates (Mixed effects
models
> in S and S-plus) I would expect that the correct model is the oil.aov2. 
> As you might see from the data, transect is nested within oiled and I 
> still get the wrong results when I try +Error(transect) or 
> +Error(oiled:transect) and many other variants.
> 
> Question 2 (on the same paper):
> The authors conclude that it is better to analyse the data with a 
> Generalized Linear (Mixed) Model Technique. I tried lme and after 
> reading Douglas Bates article (May 2005, vol. 5/1) also lmer as follows:
>  >oil.lme<-lme(density~year*oiled, random=~1|oiled/transect)
> or
>  >harvest.lmer<-lmer(abund~H*BA+BA/year+BA/year:H+site:BA+(1|H/site))
> but again no luck. I will get always some error messages or some 
> interactions missing.
> 
> 
> Paper 2
> (Keough & Quinn, 2000. Legislative vs. practical protection of an 
> intertidal shoreline in southeastern Australia. Ecol. Appl. 10: 871-881)
> 
> Data were collected from 1989, 1990, 1991, 1993, 1994, 1995, 1996, 
> 1997). the 1989-1991 are the before impact years and the other 4 years 
> are the after the impact years. Eight sites were samples (2 protected 
> and 6 impacted). In this dataset, site is nested within harvest (H) and 
> year is nested within before-after (BA). Also, site is considered as 
> random by the authors. The data (fake again) can be produced with the 
> following commands:
> 
>  >site<-c(rep(c("A1","A2", "RR1",
"RR2", "WT1", "WT2", "WT3",
"WT4"),8))
>  >H<-c(rep(c("exp", "exp", "prot",
"pro", "exp", "exp", "exp",
"exp"), 8))
>  >year<-c(rep(1989,8), rep(1990,8), rep(1991,8), rep(1993,8), 
> rep(1994,8), rep(1995,8), rep(1996,8), rep(1997,8))
>  >BA<-c(rep("bef",24), rep("after",40))
>  >abund<-runif(64, min=0, max=10)
>  >harvest<-data.frame(abund, BA, H, site, year)
> 
> Question 3.
> The authors use a complex anova design and here is part of their anova 
> table which shows the design and the df.
> 
> source.of.var      df
> Harvesting(H)     1, 6
> before-after(BA)  1, 6
> H x BA            1, 6
> Site(H)           6, 31
> Year(BA)          6, 31
> Site x BA         6, 31
> Year x H          6, 31
> Resid.            31
> 
> 
> My question again is: Why can't I reproduce the results? When I try a 
> simple anova without any random factors:
>  >harvest.lm<-lm(abund~H*BA+H/site+BA/year+site:BA+year:H)
> I get close enought but the results are different (at least I think they 
> are different because the nomin. df are different).
> 
> So obviously I need to assign sites as a random factor somehow. So when 
> I try to include site (which is nested in H) as a random factor and also 
> year nested in BA (as the authors did) the best I can come up with is:
>  >harvest.lme<-lme(abund~H*BA+BA/year+BA/year:H, random=~1|H/site)
> But I get a warning message (Warning message:In pf(q, df1, df2, 
> lower.tail, log.p) : NaNs produced) and also I don't know where to put 
> the site x BA term (whatever I try I get error messages).
> 
> I really apologise for the long post but after a week of studying and 
> trying as many ideas and examples I could find and think of I felt that 
> I really need advice from some more experienced users if I really want 
> to use this magnificent tool correctly.
> 
> Thanks in advance
> 
> -- 
> -------------------------------------------------------
> Nikolaos Lampadariou
> Hellenic Centre for Marine Research
> P.O. Box 2214
> 710 03 Heraklion Crete
> GREECE
> 
> e-mail: nlamp at her.hcmr.gr
> Ph. +30 281 0337849, +30 281 0337806
> FAX +30 281 0337822
> Web site: http://www.hcmr.gr/
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
-- 
View this message in context:
http://www.nabble.com/before-after-control-impact-analysis-with-R-tp19024219p19027892.html
Sent from the R help mailing list archive at Nabble.com.

Possibly Parallel Threads

Search for more apparently analagous threads

R help - Aug 2008 - before-after control-impact analysis with R

[R] before-after control-impact analysis with R

[R] before-after control-impact analysis with R

Possibly Parallel Threads