Cristiano Alessandro
2015-Dec-14 13:43 UTC
[R] repeated measure with quantitative independent variable
Hi all,

I am new to R, and I am trying to set up a repeated-measures analysis with a quantitative (as opposed to factorized/categorical) within-subjects variable. For a variety of reasons I am not using linear mixed models; rather, I am trying to fit a General Linear Model (I am aware of the assumptions and limitations) to assess whether the value of the within-subjects variable has a statistically significant effect on the response variable. I have two questions.

To make myself clear, I propose the following example dataset, where myfactor_nc is the quantitative within-subjects variable; i.e. each subject performs the experiment three times (myfactor_nc = 1, 2, 3) and produces the response in variable dv.

    dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6)
    subject <- factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5","s5","s5"))
    myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
    mydata_nc <- data.frame(dv, subject, myfactor_nc)

*Question 1 (using function aov)*

Easily done...

    am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc), data=mydata_nc)
    summary(am1_nc)

Unlike the case when myfactor_nc is categorical, this produces three error strata: Error: subject, Error: subject:myfactor_nc, Error: Within. I cannot understand the meaning of the last one. How is it computed?

*Question 2 (using function lm)*

Now I would like to do the same with the functions lm() and Anova() (from the car package). What I have done so far (please correct me if I am mistaken) is the following:

    # Unstack the dataset
    dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1], dv[myfactor_nc==2], dv[myfactor_nc==3]))

    # Fit the linear model
    mlm1 <- lm(dvm ~ 1)

(Is the model above correct for my design?)

Now I should use the Anova function, but it seems that it only accepts factors, and not a quantitative within-subjects variable.

Any help is highly appreciated!

Thanks
Cristiano
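On Question 1, one way to see where the Within stratum may come from: because myfactor_nc is numeric, subject/myfactor_nc expands to subject + subject:myfactor_nc, and the second term fits one slope per subject rather than one cell per subject-by-level combination. Whatever those per-subject straight lines leave unexplained ends up in Within. The sketch below uses base R only; the names fit_lines, df_within and rss_within are illustrative and not from the thread. It fits a separate line per subject with lm() and reports the leftover degrees of freedom and sum of squares.

```r
# Toy data from the question
dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6)
subject <- factor(rep(paste0("s", 1:5), each = 3))
myfactor_nc <- rep(1:3, times = 5)

# One intercept and one slope per subject: 10 parameters for 15 observations
fit_lines <- lm(dv ~ subject * myfactor_nc)

# What the per-subject straight lines cannot explain
df_within  <- df.residual(fit_lines)    # residual degrees of freedom
rss_within <- sum(resid(fit_lines)^2)   # residual sum of squares

print(c(df = df_within, rss = rss_within))
```

With the toy data this should give 5 residual degrees of freedom and a residual sum of squares of about 1.333, the values that summary(am1_nc) reports under Error: Within. With three measurements per subject, that residual is each subject's departure from a straight line, i.e. the quadratic component.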
Fox, John
2015-Dec-14 16:25 UTC
[R] repeated measure with quantitative independent variable
Dear Cristiano,

If I understand correctly what you want to do, you should be able to use Anova() in the car package (your second question) by treating your numeric repeated-measures predictor as a factor and defining a single linear contrast for it.

Continuing with your toy example:

    > myfactor_nc <- factor(1:3)
    > contrasts(myfactor_nc) <- matrix(-1:1, ncol=1)
    > idata <- data.frame(myfactor_nc)
    > Anova(mlm1, idata=idata, idesign=~myfactor_nc)
    Note: model has only an intercept; equivalent type-III tests substituted.

    Type III Repeated Measures MANOVA Tests: Pillai test statistic
                Df test stat approx F num Df den Df   Pr(>F)
    (Intercept)  1   0.93790   60.409      1      4 0.001477 **
    myfactor_nc  1   0.83478    7.579      2      3 0.067156 .
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

With just 3 distinct levels, however, you could just make myfactor_nc an ordered factor, not defining the contrasts explicitly, and then you'd get both linear and quadratic contrasts.

I hope this helps,
John

-----------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/
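The ordered-factor suggestion can be sketched in base R as follows. An ordered factor gets polynomial contrasts (contr.poly) by default, so a three-level one carries a linear and a quadratic column, and the linear column is proportional to the -1:1 contrast defined by hand in the reply. The name myfactor_ord is illustrative only, and the commented Anova() call is an assumption about how the factor would be passed to car, not output reproduced from the thread.

```r
# Ordered version of the within-subjects factor (hypothetical name)
myfactor_ord <- factor(1:3, ordered = TRUE)

# Default contrasts for an ordered factor: orthogonal polynomials
poly_contr <- contrasts(myfactor_ord)
print(poly_contr)

# Rescale the columns to integer contrasts for easier reading
lin_scaled  <- unname(poly_contr[, ".L"] * sqrt(2))  # linear:    -1, 0, 1
quad_scaled <- unname(poly_contr[, ".Q"] * sqrt(6))  # quadratic:  1, -2, 1
print(round(rbind(linear = lin_scaled, quadratic = quad_scaled), 10))

# Assumed usage with car (not run here):
# idata <- data.frame(myfactor_ord)
# car::Anova(mlm1, idata = idata, idesign = ~ myfactor_ord)
```

The rescaling is only for display; the normalised and the integer versions of each contrast test the same hypothesis.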
Cristiano Alessandro
2015-Dec-14 19:10 UTC
[R] repeated measure with quantitative independent variable
Dear John,

thanks for your reply! The reason why I did not want to factorize the within-subjects variable was to avoid increasing the Df of the model from 1 (continuous variable) to k-1 (where k is the number of levels of the factor). I am now confused, because you have factorized the variable (indeed using "factor"), but the Df of myfactor_nc seems to be 1. Could you explain that?

Comparing the two methods, I seem to get completely different results:

*aov()*

    dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6)
    subject <- factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5","s5","s5"))
    myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
    mydata_nc <- data.frame(dv, subject, myfactor_nc)

    am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc), data=mydata_nc)
    summary(am1_nc)

    Error: subject
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals  4   12.4     3.1

    Error: subject:myfactor_nc
                Df Sum Sq Mean Sq F value Pr(>F)
    myfactor_nc  1   14.4    14.4      16 0.0161 *
    Residuals    4    3.6     0.9
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: Within
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals  5  1.333  0.2667

*Anova()*

    dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1], dv[myfactor_nc==2], dv[myfactor_nc==3]))
    mlm1 <- lm(dvm ~ 1)

    myfactor_nc <- factor(1:3)
    contrasts(myfactor_nc) <- matrix(-1:1, ncol=1)
    idata <- data.frame(myfactor_nc)
    Anova(mlm1, idata=idata, idesign=~myfactor_nc)

    Note: model has only an intercept; equivalent type-III tests substituted.

    Type III Repeated Measures MANOVA Tests: Pillai test statistic
                Df test stat approx F num Df den Df   Pr(>F)
    (Intercept)  1   0.93790   60.409      1      4 0.001477 **
    myfactor_nc  1   0.83478    7.579      2      3 0.067156 .
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Why is that?
Thanks a lot,
Cristiano
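One way to reconcile the two outputs in this thread, offered as a sketch rather than as the list's answer: the Anova() row for myfactor_nc has num Df = 2, which suggests that the printed multivariate test covers two within-subject contrasts (linear and quadratic together) rather than the single linear contrast. The base-R check below computes both tests directly from the unstacked matrix dvm. The contrast vectors c(-1, 0, 1) and c(1, -2, 1) and the names F_lin, F_joint and pillai are choices made here for illustration.

```r
# Toy data from the thread, unstacked to one row per subject
dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6)
myfactor_nc <- rep(1:3, times = 5)
dvm <- cbind(dv[myfactor_nc == 1], dv[myfactor_nc == 2], dv[myfactor_nc == 3])

# (a) Linear trend only: one-sample t-test on per-subject linear contrast scores
lin   <- drop(dvm %*% c(-1, 0, 1))
F_lin <- unname(t.test(lin)$statistic)^2       # F on 1 and n - 1 df

# (b) Linear and quadratic jointly: one-sample Hotelling T^2 on both scores
scores <- dvm %*% cbind(linear = c(-1, 0, 1), quadratic = c(1, -2, 1))
n <- nrow(scores)
p <- ncol(scores)
m <- colMeans(scores)
T2      <- n * drop(t(m) %*% solve(cov(scores)) %*% m)
F_joint <- T2 * (n - p) / (p * (n - 1))        # F on p and n - p df
pillai  <- T2 / (T2 + n - 1)                   # Pillai trace for this test

print(c(F_lin = F_lin, F_joint = F_joint, pillai = pillai))
```

With the toy data, F_lin should come out as 16 on 1 and 4 df, matching the aov() line for myfactor_nc, while F_joint and pillai should come out near 7.579 and 0.83478, matching the Anova() line. If so, the two analyses differ because they test different hypotheses: a 1-df linear trend versus a 2-df joint test of linear and quadratic trends.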