Read the data using scan():
#
#          a1               a2               a3               a4
#     -------------    -------------    -------------    -------------
#     b1   b2   b3     b1   b2   b3     b1   b2   b3     b1   b2   b3
#     ---  ---  ---    ---  ---  ---    ---  ---  ---    ---  ---  ---
#
# c1:
#     4.1  4.6  3.7    4.9  5.2  4.7    5.0  6.1  5.5    3.9  4.4  3.7
#     4.3  4.9  3.9    4.6  5.6  4.7    5.4  6.2  5.9    3.3  4.3  3.9
#     4.5  4.2  4.1    5.3  5.8  5.0    5.7  6.5  5.6    3.4  4.7  4.0
#     3.8  4.5  4.5    5.0  5.4  4.5    5.3  5.7  5.0    3.7  4.1  4.4
#     4.3  4.8  3.9    4.6  5.5  4.7    5.4  6.1  5.9    3.3  4.2  3.9
#
# c2:
#     4.8  5.6  5.0    4.9  5.9  5.0    6.0  6.0  6.1    4.1  4.9  4.3
#     4.5  5.8  5.2    5.5  5.3  5.4    5.7  6.3  5.3    3.9  4.7  4.1
#     5.0  5.4  4.6    5.5  5.5  4.7    5.5  5.7  5.5    4.3  4.9  3.8
#     4.6  6.1  4.9    5.3  5.7  5.1    5.7  5.9  5.8    4.0  5.3  4.7
#     5.0  5.4  4.7    5.5  5.5  4.9    5.5  5.7  5.6    4.3  4.3  3.8
#
# NOTE: Cut and paste the numbers without the leading # or labels
#
> Y <- scan()
> A <- gl(4,3, 4*3*2*5, labels=c("a1","a2","a3","a4"));
> B <- gl(3,1, 4*3*2*5, labels=c("b1","b2","b3"));
> C <- gl(2,60, 4*3*2*5, labels=c("c1","c2"));
> anova(lm(Y~A*B*C))   # all effects and interactions

In the above example, why is the number of replications 3 for A, 1 for B,
and 60 for C?
And why 4*3*2*5? Is the 5 because there are 5 lines in each 4*3*2 group?
What is the logic behind this?
I understand now.

On May 30, 4:04 pm, Bogdan Lataianu <bodins... at gmail.com> wrote:
> In the above example, why is the number of replications 3 for A, 1 for B,
> and 60 for C?
> And why 4*3*2*5? Is the 5 because there are 5 lines in each 4*3*2 group?
> What is the logic behind this?
>
> ______________________________________________
> R-h... at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Bill.Venables at csiro.au
2011-May-30 23:26 UTC
[R] Basic question about three factor Anova
This is really a question about the help file for gl. The arguments are

    gl(n, k, length = n*k, labels = 1:n, ordered = FALSE)

'n' is the number of factor levels. That seems to be easy enough.

'k' is called the "number of replications". This is perhaps not the best way to express what it is. k is the number of times each of the n levels is to be repeated before moving on to the next level; once all n levels have been used, the pattern starts again. In your example the 'a' levels are repeated 3 times (to cover 'b'), the 'b' levels are repeated once since you read in the values b1 b2 b3 b1 b2 ..., and the levels of 'c' are repeated 60 times each since the top 60 values are all c1 and the bottom 60 values are all c2.

'length' is the overall length of the factor you are generating. By default it is just n*k, but in this case it has to be 4 (A levels) x 3 (B levels) x 2 (C levels) x 5 (reps in each A:B:C subgroup).

The other two arguments are clear enough.

Bill Venables.
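The layout described above can be checked without the response values. The following is only a minimal sketch, assuming the numbers are pasted row by row as in the original post; the names n_total, first_row and cell_counts are illustrative and not part of the original code.

```r
# Total length: 4 A levels x 3 B levels x 2 C levels x 5 reps per cell = 120
n_total <- 4 * 3 * 2 * 5

# Same calls as in the original post
A <- gl(4, 3,  n_total, labels = c("a1", "a2", "a3", "a4"))
B <- gl(3, 1,  n_total, labels = c("b1", "b2", "b3"))
C <- gl(2, 60, n_total, labels = c("c1", "c2"))

# Labels attached to the first 12 values read by scan(),
# i.e. the first data row of the c1 block
first_row <- data.frame(A = A[1:12], B = B[1:12], C = C[1:12])
print(first_row)

# Count the observations in every A:B:C cell; each should hold 5
cell_counts <- table(A, B, C)
print(all(cell_counts == 5))
```

Printing first_row shows each 'a' level held for three values while 'b' cycles b1 b2 b3, which is the column order of the data table. The 12-value pattern then recycles across all 120 values, and C switches from c1 to c2 after the first 60.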