I notice something curious about how aov() treats a numeric factor:

"score" is a dependent variable and "group" is a factor in a one-way ANOVA. But "group" contains numeric codes and is not a factor (checked with is.factor). An ANOVA done using:

> aov(score~factor(group), data=mydata)

gives the right answers. But

> aov(score~group, data=mydata)

also produces an ANOVA table, with incorrect entries. My question is: what exactly is R doing when I did not specify that "group" was a factor?

Ravi Kulkarni

--
View this message in context: http://r.789695.n4.nabble.com/Question-about-factor-that-is-numeric-in-aov-tp2164393p2164393.html
Sent from the R help mailing list archive at Nabble.com.

On May 9, 2010, at 8:36 AM, Ravi Kulkarni wrote:

> I notice something curious about how aov() treats a numeric factor:
>
> "score" is a dependent variable and "group" is a factor in a one-way
> ANOVA. But "group" contains numeric codes and is not a factor (checked
> with is.factor). An ANOVA done using:
>
>> aov(score~factor(group), data=mydata)
>
> gives the right answers. But
>
>> aov(score~group, data=mydata)
>
> also produces an ANOVA table, with incorrect entries. My question is:
> what exactly is R doing when I did not specify that "group" was a factor?

Since you have not shown us the table we can only guess. My guess: it is treating that variable as continuous and estimating a single parameter. That may or may not be interpretable. If those codes have a meaningful order and scale, you may be getting what is sometimes called a trend test. If they are arbitrary, then the result is very probably nonsense.

> Ravi Kulkarni

--
David Winsemius, MD
West Hartford, CT
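
The "single parameter" guess can be checked by comparing degrees of freedom. The sketch below uses simulated data, since the original mydata was never posted; the three group codes, the group means, and the sample sizes are all assumptions made for illustration.

```r
# Hypothetical stand-in for mydata: three groups coded 1, 2, 3.
set.seed(1)
mydata <- data.frame(
  group = rep(1:3, each = 10),
  score = rnorm(30, mean = rep(c(10, 14, 11), each = 10))
)

# group left as numeric: one slope is estimated, so 1 df for group
fit_num <- aov(score ~ group, data = mydata)

# group declared as a factor: k - 1 = 2 df for the three levels
fit_fac <- aov(score ~ factor(group), data = mydata)

df_num <- summary(fit_num)[[1]][["Df"]]
df_fac <- summary(fit_fac)[[1]][["Df"]]

df_num  # 1 28  (group, Residuals)
df_fac  # 2 27  (factor(group), Residuals)
```

With k numeric codes the first table always has 1 df for "group", whatever k is, which is a quick way to spot that a grouping variable was not treated as a factor.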

Dear Ravi,

On Sunday 09 May 2010, Ravi Kulkarni wrote:

> I notice something curious about how aov() treats a numeric factor:

In R, there is no such thing as a "numeric factor". A numeric vector is not a factor unless declared as such.

> "score" is a dependent variable and "group" is a factor in a one-way ANOVA.
> But "group" contains numeric codes and is not a factor (checked with
> is.factor). An ANOVA done using:
>
> > aov(score~factor(group), data=mydata)
>
> gives the right answers. But
>
> > aov(score~group, data=mydata)
>
> also produces an ANOVA table, with incorrect entries. My question is: what
> exactly is R doing when I did not specify that "group" was a factor?

The entries _are_ correct, because "group" is numeric. From the help of aov():

Details:
     This provides a wrapper to 'lm' for fitting linear models to
     balanced or unbalanced experimental designs.

So aov() calls lm(), where it is mighty important whether "group" is numeric or a factor. In your mind it can be both, but in R you have to declare it as a factor in order for it to be treated as such...

I hope this helps,
Adrian

--
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.: +40 21 3126618 \ +40 21 3120210 / int.101
Fax: +40 21 3158391
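
A minimal sketch of the aov()/lm() relationship described above, again on simulated data (the four group codes and the random scores are assumptions, not the poster's data). It compares the two fits on the same formula with a numeric "group".

```r
# Hypothetical data: numeric group codes 1..4, scores are pure noise.
set.seed(2)
mydata <- data.frame(group = rep(1:4, each = 8), score = rnorm(32))

fit_aov <- aov(score ~ group, data = mydata)
fit_lm  <- lm(score ~ group, data = mydata)

# An aov fit is also an "lm" object, and the coefficients agree.
is_lm     <- inherits(fit_aov, "lm")
same_coef <- isTRUE(all.equal(coef(fit_aov), coef(fit_lm)))

# With one numeric predictor, the F statistic in the ANOVA table is the
# square of the t statistic for the regression slope.
f_aov <- summary(fit_aov)[[1]][["F value"]][1]
t_lm  <- summary(fit_lm)$coefficients["group", "t value"]

is_lm                               # TRUE
same_coef                           # TRUE
isTRUE(all.equal(f_aov, t_lm^2))    # TRUE
```

So the table from aov(score~group) is a correct ANOVA table for a straight-line regression of score on the group codes; it just answers a different question than the one-way ANOVA.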

Hi:

On Sun, May 9, 2010 at 5:36 AM, Ravi Kulkarni <ravi.kulk@gmail.com> wrote:

> I notice something curious about how aov() treats a numeric factor:
>
> "score" is a dependent variable and "group" is a factor in a one-way ANOVA.
> But "group" contains numeric codes and is not a factor (checked with
> is.factor). An ANOVA done using:
>
> > aov(score~factor(group), data=mydata)
>
> gives the right answers. But
>
> > aov(score~group, data=mydata)
>
> also produces an ANOVA table, with incorrect entries. My question is: what
> exactly is R doing when I did not specify that "group" was a factor?

It's doing simple linear regression, because group is evidently a numeric variable in mydata. Type str(mydata) to see what you've got.

R cannot divine whether you want an ANOVA in this problem or simple linear regression. In a model with one variable on the RHS of the model formula, R will perform regression if it is numeric and ANOVA if it is a factor; it is up to you to know the type of variables you are inputting into a model. str() is one of the most useful and important functions in R; it would benefit you to acquaint yourself with its features.

HTH,
Dennis
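
The check-then-convert workflow suggested above might look like the following sketch. The data frame is simulated (three codes, five observations each, both assumptions), and converting the column in place is just one option; wrapping it in factor() inside the formula, as in the original post, works equally well.

```r
# Hypothetical data: group stored as numeric codes.
set.seed(3)
mydata <- data.frame(group = rep(c(1, 2, 3), each = 5), score = rnorm(15))

str(mydata)                            # shows group as 'num', not 'Factor'
was_numeric <- is.numeric(mydata$group)

# Declare the codes as categorical once, in the data frame itself.
mydata$group <- factor(mydata$group)
now_factor <- is.factor(mydata$group)
levels(mydata$group)                   # "1" "2" "3"

# The same formula now fits a one-way ANOVA: fitted values are group means.
fit <- aov(score ~ group, data = mydata)
group_means <- ave(mydata$score, mydata$group)  # per-observation group mean
means_match <- isTRUE(all.equal(as.vector(fitted(fit)), group_means))

was_numeric   # TRUE
now_factor    # TRUE
means_match   # TRUE
```

Converting the column once avoids having to remember factor() in every later model formula that uses it.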