thr3ads.net - R help - [R] Basic question about three factor Anova [May 2011]


Bogdan Lataianu

2011-May-30 22:04 UTC

[R] Basic question about three factor Anova

Read the data using scan():
#
#          a1               a2               a3               a4
#     -------------    -------------    -------------    -------------
#     b1   b2   b3     b1   b2   b3     b1   b2   b3     b1   b2   b3
#     ---  ---  ---    ---  ---  ---    ---  ---  ---    ---  ---  ---
#
# c1:
#     4.1  4.6  3.7    4.9  5.2  4.7    5.0  6.1  5.5    3.9  4.4  3.7
#     4.3  4.9  3.9    4.6  5.6  4.7    5.4  6.2  5.9    3.3  4.3  3.9
#     4.5  4.2  4.1    5.3  5.8  5.0    5.7  6.5  5.6    3.4  4.7  4.0
#     3.8  4.5  4.5    5.0  5.4  4.5    5.3  5.7  5.0    3.7  4.1  4.4
#     4.3  4.8  3.9    4.6  5.5  4.7    5.4  6.1  5.9    3.3  4.2  3.9
#
# c2:
#     4.8  5.6  5.0    4.9  5.9  5.0    6.0  6.0  6.1    4.1  4.9  4.3
#     4.5  5.8  5.2    5.5  5.3  5.4    5.7  6.3  5.3    3.9  4.7  4.1
#     5.0  5.4  4.6    5.5  5.5  4.7    5.5  5.7  5.5    4.3  4.9  3.8
#     4.6  6.1  4.9    5.3  5.7  5.1    5.7  5.9  5.8    4.0  5.3  4.7
#     5.0  5.4  4.7    5.5  5.5  4.9    5.5  5.7  5.6    4.3  4.3  3.8
#
# NOTE: Cut and paste the numbers without the leading # or labels
#
> Y <- scan()
> A <- gl(4, 3, 4*3*2*5, labels=c("a1","a2","a3","a4"))
> B <- gl(3, 1, 4*3*2*5, labels=c("b1","b2","b3"))
> C <- gl(2, 60, 4*3*2*5, labels=c("c1","c2"))
> anova(lm(Y~A*B*C))   # all effects and interactions
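
The steps above depend on pasting the numbers at the interactive scan()
prompt. As a non-interactive sketch of the same steps, the values from the
table can be put in a string and read with scan(text = ...). The names txt
and tab are introduced here only for illustration and are not part of the
original post; the data values are copied from the table above.

```r
# The 120 observations, row by row exactly as in the table:
# first the five c1 rows, then the five c2 rows.
txt <- "
4.1 4.6 3.7 4.9 5.2 4.7 5.0 6.1 5.5 3.9 4.4 3.7
4.3 4.9 3.9 4.6 5.6 4.7 5.4 6.2 5.9 3.3 4.3 3.9
4.5 4.2 4.1 5.3 5.8 5.0 5.7 6.5 5.6 3.4 4.7 4.0
3.8 4.5 4.5 5.0 5.4 4.5 5.3 5.7 5.0 3.7 4.1 4.4
4.3 4.8 3.9 4.6 5.5 4.7 5.4 6.1 5.9 3.3 4.2 3.9
4.8 5.6 5.0 4.9 5.9 5.0 6.0 6.0 6.1 4.1 4.9 4.3
4.5 5.8 5.2 5.5 5.3 5.4 5.7 6.3 5.3 3.9 4.7 4.1
5.0 5.4 4.6 5.5 5.5 4.7 5.5 5.7 5.5 4.3 4.9 3.8
4.6 6.1 4.9 5.3 5.7 5.1 5.7 5.9 5.8 4.0 5.3 4.7
5.0 5.4 4.7 5.5 5.5 4.9 5.5 5.7 5.6 4.3 4.3 3.8
"

# scan() reads whitespace-separated numbers left to right, top to bottom.
Y <- scan(text = txt, quiet = TRUE)
stopifnot(length(Y) == 4*3*2*5)

# Factors laid out to match the reading order of Y.
A <- gl(4, 3, 4*3*2*5, labels = c("a1", "a2", "a3", "a4"))
B <- gl(3, 1, 4*3*2*5, labels = c("b1", "b2", "b3"))
C <- gl(2, 60, 4*3*2*5, labels = c("c1", "c2"))

# Three-factor ANOVA with all main effects and interactions.
tab <- anova(lm(Y ~ A*B*C))
print(tab)
```

Because scan() reads across each row before moving down, the factor vectors
have to follow that same order, which is what the second argument of each
gl() call controls.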


In the above example, why is the number of replications 3 for A, 1 for B,
and 60 for C?
And why 4*3*2*5? Is the 5 there because there are 5 lines in each 4*3*2
group?
What is the logic behind this?

Bogdan Lataianu

2011-May-30 22:55 UTC

I understand now


Bill.Venables at csiro.au

2011-May-30 23:26 UTC

This is really a question about the help file for gl.

The arguments are

gl(n, k, length = n*k, labels = 1:n, ordered = FALSE)

'n' is the number of factor levels.  That seems easy enough.

'k' is called the "number of replications".  This is perhaps not the best
way to express what it is.  k is the number of times each of the n levels
is repeated before moving on to the next level; once all n levels have
been used, the pattern starts again.  In your example the 'a' levels are
repeated 3 times each (to cover 'b'), the 'b' levels are repeated once
each since you read in the values b1 b2 b3 b1 b2 ..., and the levels of
'c' are repeated 60 times each since the top 60 values are all c1 and the
bottom 60 values are all c2.

'length' is the overall length of the factor you are generating.  By
default it is just n*k, but in this case it has to be 4 (A levels) x 3 (B
levels) x 2 (C levels) x 5 (reps in each A:B:C subgroup) = 120.

The other two arguments are clear enough.
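
The description of 'k' and 'length' above can be checked directly at the
console. The following is only a small sketch; the names g_default, n_obs,
cells and A_rep are made up for illustration, and the rep() line is an
equivalent way to picture the pattern rather than a claim about how gl()
is implemented.

```r
# With the default length (n*k), each of the 2 levels appears 3 times in a row.
g_default <- gl(2, 3)
print(g_default)

# The design from the question: 4 x 3 x 2 cells with 5 values per cell.
n_obs <- 4 * 3 * 2 * 5
A <- gl(4, 3, n_obs, labels = c("a1", "a2", "a3", "a4"))
B <- gl(3, 1, n_obs, labels = c("b1", "b2", "b3"))
C <- gl(2, 60, n_obs, labels = c("c1", "c2"))

# The first 13 rows show B changing every value, A every 3 values,
# and the 12-value pattern starting again on row 13.
print(head(data.frame(A, B, C), 13))

# Every A:B:C cell should contain exactly 5 observations.
cells <- table(A, B, C)
stopifnot(all(cells == 5))

# The same layout for A written with rep(): repeat each label 3 times,
# then recycle that block of 12 until the vector has n_obs elements.
A_rep <- rep(rep(c("a1", "a2", "a3", "a4"), each = 3), length.out = n_obs)
stopifnot(identical(as.character(A), A_rep))
```

If one of the k values were wrong, the cell counts in the table would no
longer all be 5, which makes this a quick sanity check before fitting the
model.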

Bill Venables.

