Cristiano Alessandro
2015-Dec-14 19:10 UTC
[R] repeated measure with quantitative independent variable
Dear John,
thanks for your reply! The reason why I did not want to factorize the
within-subjects variable was to avoid increasing the Df of the model
from 1 (continuous variable) to k-1 (where k is the number of levels of
the factors). I am now confused, because you have factorized the
variable (indeed using "factor"), but the Df of myfactor_nc seems to
be
1. Could you explain that?
Comparing the results obtained with the two methods I seem to get
completely different results:
* aov()*
dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6);
subject <-
factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5","s5","s5"));
myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
mydata_nc <- data.frame(dv, subject, myfactor_nc)
am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc), data=mydata_nc)
summary(am1_nc)
Error: subject
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 4 12.4 3.1
Error: subject:myfactor_nc
Df Sum Sq Mean Sq F value Pr(>F)
myfactor_nc 1 14.4 14.4 16 0.0161 *
Residuals 4 3.6 0.9
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 5 1.333 0.2667
*Anova()*
dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1],dv[myfactor_nc==2],
dv[myfactor_nc==3]))
mlm1 <- lm(dvm ~ 1)
myfactor_nc <- factor(1:3)
contrasts(myfactor_nc) <- matrix(-1:1, ncol=1)
idata <- data.frame(myfactor_nc)
Anova(mlm1, idata=idata, idesign=~myfactor_nc)
Note: model has only an intercept; equivalent type-III tests substituted.
Type III Repeated Measures MANOVA Tests: Pillai test statistic
Df test stat approx F num Df den Df Pr(>F)
(Intercept) 1 0.93790 60.409 1 4 0.001477 **
myfactor_nc 1 0.83478 7.579 2 3 0.067156 .
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Why is that?
Thanks a lot
Cristiano
On 12/14/2015 05:25 PM, Fox, John wrote:> Dear Cristiano,
>
> If I understand correctly what you want to do, you should be able to use
Anova() in the car package (your second question) by treating your numeric
repeated-measures predictor as a factor and defining a single linear contrast
for it.
>
> Continuing with your toy example:
>
>> myfactor_nc <- factor(1:3)
>> contrasts(myfactor_nc) <- matrix(-1:1, ncol=1)
>> idata <- data.frame(myfactor_nc)
>> Anova(mlm1, idata=idata, idesign=~myfactor_nc)
> Note: model has only an intercept; equivalent type-III tests substituted.
>
> Type III Repeated Measures MANOVA Tests: Pillai test statistic
> Df test stat approx F num Df den Df Pr(>F)
> (Intercept) 1 0.93790 60.409 1 4 0.001477 **
> myfactor_nc 1 0.83478 7.579 2 3 0.067156 .
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05
'.' 0.1 ' ' 1
>
> With just 3 distinct levels, however, you could just make myfactor_nc an
ordered factor, not defining the contrasts explicitly, and then you'd get
both linear and quadratic contrasts.
>
> I hope this helps,
> John
>
> -----------------------------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.socsci.mcmaster.ca/jfox/
>
>
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
>> Cristiano Alessandro
>> Sent: Monday, December 14, 2015 8:43 AM
>> To: r-help at r-project.org
>> Subject: [R] repeated measure with quantitative independent variable
>>
>> Hi all,
>>
>> I am new to R, and I am trying to set up a repeated measure analysis
>> with a quantitative (as opposed to factorized/categorical)
>> within-subjects variable. For a variety of reasons I am not using
>> linear-mixed models, rather I am trying to fit a General Linear Model
(I
>> am aware of assumptions and limitations) to assess whether the value of
>> the within-subjects variable affects statistically significantly the
>> response variable. I have two questions. To make myself clear I propose
>> the following exemplary dataset (where myfactor_nc is the quantitative
>> within-subjects variable; i.e. each subject performs the experiment
>> three times -- nc_factor=1,2,3 -- and produces the response in variable
>> dv).
>>
>> dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6);
>> subject <-
>>
factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5
>> ","s5","s5"));
>> myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
>> mydata_nc <- data.frame(dv, subject, myfactor_nc)
>>
>> *Question 1 (using function aov)*
>>
>> Easily done...
>>
>> am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc),
>> data=mydata_nc)
>> summary(am1_nc)
>>
>> Unlike the case when myfactor_nc is categorical, this produces three
>> error strata: Error: subject, Error: subject:myfactor_nc, Error:
Within.
>> I cannot understand the meaning of the latter. How is that computed?
>>
>> *Question 2 (using function lm)*
>>
>> Now I would like to do the same with the functions lm() and Anova()
>> (from the car package). What I have done so far (please correct me if I
>> am mistaking) is the following:
>>
>> # Unstack the dataset
>> dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1],dv[myfactor_nc==2],
>> dv[myfactor_nc==3]))
>>
>> #Fit the linear model
>> mlm1 <- lm(dvm ~ 1)
>>
>> (is that model above correct for my design?)
>>
>> Now I should use the Anova function, but it seems that it only accepts
>> factors, and not quantitative within-subject variable.
>>
>> Any help is highly appreciated!
>>
>> Thanks
>> Cristiano
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
Fox, John
2015-Dec-14 19:37 UTC
[R] repeated measure with quantitative independent variable
Dear Cristiano,> -----Original Message----- > From: Cristiano Alessandro [mailto:cri.alessandro at gmail.com] > Sent: Monday, December 14, 2015 2:11 PM > To: Fox, John > Cc: r-help at r-project.org > Subject: Re: [R] repeated measure with quantitative independent variable > > Dear John, > > thanks for your reply! The reason why I did not want to factorize the > within-subjects variable was to avoid increasing the Df of the model > from 1 (continuous variable) to k-1 (where k is the number of levels of > the factors). I am now confused, because you have factorized the > variable (indeed using "factor"), but the Df of myfactor_nc seems to be > 1. Could you explain that?I defined only one (linear) contrast for the factor -- but I made a mistake in how I did it, see below.> > Comparing the results obtained with the two methods I seem to get > completely different results: > > * aov()* > > dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6); > subject <- > factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5 > ","s5","s5")); > myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3) > mydata_nc <- data.frame(dv, subject, myfactor_nc) > > am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc), > data=mydata_nc) > summary(am1_nc) > > Error: subject > Df Sum Sq Mean Sq F value Pr(>F) > Residuals 4 12.4 3.1 > > Error: subject:myfactor_nc > Df Sum Sq Mean Sq F value Pr(>F) > myfactor_nc 1 14.4 14.4 16 0.0161 * > Residuals 4 3.6 0.9 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > Error: Within > Df Sum Sq Mean Sq F value Pr(>F) > Residuals 5 1.333 0.2667 > > > *Anova()* > > dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1],dv[myfactor_nc==2], > dv[myfactor_nc==3])) > > mlm1 <- lm(dvm ~ 1) > myfactor_nc <- factor(1:3) > contrasts(myfactor_nc) <- matrix(-1:1, ncol=1) > idata <- data.frame(myfactor_nc) > Anova(mlm1, idata=idata, idesign=~myfactor_nc) > Note: model has only an intercept; equivalent type-III tests > substituted. > > Type III Repeated Measures MANOVA Tests: Pillai test statistic > Df test stat approx F num Df den Df Pr(>F) > (Intercept) 1 0.93790 60.409 1 4 0.001477 ** > myfactor_nc 1 0.83478 7.579 2 3 0.067156 . > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > Why is that?There are two differences: (1) The repeated-measures analysis reported by default by Anova() is a MANOVA; if you want the univariate analysis comparable to aov(), you have to ask for it. (2) As I mentioned, I made a mistake in defining the contrasts for the factor; what I did implicitly included the quadratic component, which wasn't my intention. Here's the correct version, which, as you can see, produces the same result as aov():> contrasts(myfactor_nc, 1) <- matrix(-1:1, ncol=1) > contrasts(myfactor_nc)[,1] 1 -1 2 0 3 1> idata <- data.frame(myfactor_nc)> summary(Anova(mlm1, idata=idata, idesign=~myfactor_nc), multivariate=FALSE)Note: model has only an intercept; equivalent type-III tests substituted. Univariate Type III Repeated-Measures ANOVA Assuming Sphericity SS num Df Error SS den Df F Pr(>F) (Intercept) 187.27 1 12.4 4 60.409 0.001477 ** myfactor_nc 14.40 1 3.6 4 16.000 0.016130 * --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1> summary(am1_nc) # your result using aov()Error: subject Df Sum Sq Mean Sq F value Pr(>F) Residuals 4 12.4 3.1 Error: subject:myfactor_nc Df Sum Sq Mean Sq F value Pr(>F) myfactor_nc 1 14.4 14.4 16 0.0161 * Residuals 4 3.6 0.9 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Error: Within Df Sum Sq Mean Sq F value Pr(>F) Residuals 5 1.333 0.2667 My apologies for the confusion about the contrasts. John> > Thanks a lot > Cristiano > > > On 12/14/2015 05:25 PM, Fox, John wrote: > > Dear Cristiano, > > > > If I understand correctly what you want to do, you should be able to > use Anova() in the car package (your second question) by treating your > numeric repeated-measures predictor as a factor and defining a single > linear contrast for it. > > > > Continuing with your toy example: > > > >> myfactor_nc <- factor(1:3) > >> contrasts(myfactor_nc) <- matrix(-1:1, ncol=1) > >> idata <- data.frame(myfactor_nc) > >> Anova(mlm1, idata=idata, idesign=~myfactor_nc) > > Note: model has only an intercept; equivalent type-III tests > substituted. > > > > Type III Repeated Measures MANOVA Tests: Pillai test statistic > > Df test stat approx F num Df den Df Pr(>F) > > (Intercept) 1 0.93790 60.409 1 4 0.001477 ** > > myfactor_nc 1 0.83478 7.579 2 3 0.067156 . > > --- > > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > > > With just 3 distinct levels, however, you could just make myfactor_nc > an ordered factor, not defining the contrasts explicitly, and then you'd > get both linear and quadratic contrasts. > > > > I hope this helps, > > John > > > > ----------------------------------------------- > > John Fox, Professor > > McMaster University > > Hamilton, Ontario, Canada > > http://socserv.socsci.mcmaster.ca/jfox/ > > > > > > > >> -----Original Message----- > >> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of > >> Cristiano Alessandro > >> Sent: Monday, December 14, 2015 8:43 AM > >> To: r-help at r-project.org > >> Subject: [R] repeated measure with quantitative independent variable > >> > >> Hi all, > >> > >> I am new to R, and I am trying to set up a repeated measure analysis > >> with a quantitative (as opposed to factorized/categorical) > >> within-subjects variable. For a variety of reasons I am not using > >> linear-mixed models, rather I am trying to fit a General Linear Model > (I > >> am aware of assumptions and limitations) to assess whether the value > of > >> the within-subjects variable affects statistically significantly the > >> response variable. I have two questions. To make myself clear I > propose > >> the following exemplary dataset (where myfactor_nc is the > quantitative > >> within-subjects variable; i.e. each subject performs the experiment > >> three times -- nc_factor=1,2,3 -- and produces the response in > variable > >> dv). > >> > >> dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6); > >> subject <- > >> > factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5 > >> ","s5","s5")); > >> myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3) > >> mydata_nc <- data.frame(dv, subject, myfactor_nc) > >> > >> *Question 1 (using function aov)* > >> > >> Easily done... > >> > >> am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc), > >> data=mydata_nc) > >> summary(am1_nc) > >> > >> Unlike the case when myfactor_nc is categorical, this produces three > >> error strata: Error: subject, Error: subject:myfactor_nc, Error: > Within. > >> I cannot understand the meaning of the latter. How is that computed? > >> > >> *Question 2 (using function lm)* > >> > >> Now I would like to do the same with the functions lm() and Anova() > >> (from the car package). What I have done so far (please correct me if > I > >> am mistaking) is the following: > >> > >> # Unstack the dataset > >> dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1],dv[myfactor_nc==2], > >> dv[myfactor_nc==3])) > >> > >> #Fit the linear model > >> mlm1 <- lm(dvm ~ 1) > >> > >> (is that model above correct for my design?) > >> > >> Now I should use the Anova function, but it seems that it only > accepts > >> factors, and not quantitative within-subject variable. > >> > >> Any help is highly appreciated! > >> > >> Thanks > >> Cristiano > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide http://www.R-project.org/posting- > >> guide.html > >> and provide commented, minimal, self-contained, reproducible code.