ripley at stats.ox.ac.uk
2007-May-14 09:04 UTC
[Rd] (PR#9666) 'aggregate' should preserve level ordering of
On Tue, 8 May 2007, prechelt at inf.fu-berlin.de wrote:

> Full_Name: Lutz Prechelt
> Version: 2.4.1
> OS: Windows XP
> Submission from: (NULL) (160.45.111.67)
>
>
> aggregate (from package stats) should preserve the
> ordering of levels of factors it works on and also their
> 'ordered' attribute if present.
> But it does not.

In fact it treats all grouping variables consistently, reducing them to
their levels and then data.frame does as.factor on the resulting column.

It is not at all clear this is desirable.  Take the example on the help
page: 'Cold' is reported as a factor even though it is logical.

It seems better not to coerce any of the grouping factors when putting
into the data frame but rather to index the original variable, and I have
implemented that for R-devel: as a side effect your example works as you
would like.  This does mean that grouping variables that are not factors
and cannot be inserted into a data frame will no longer work.

> Here is an example:
>
> ff = factor(c("a","b","a","b"),levels=c("b","a"),ordered=T)
> agg = aggregate(1:4, list(groups=ff), sum)
> print(levels(agg$groups))  # should be:  "b" "a"
> [1] "a" "b"
> print(is.ordered(agg$groups))  # should be:  TRUE
> [1] FALSE
>
> -----
>
> ?aggregate ignores the issue completely:
> - the terms 'order' or 'level' do not occur in the
>   text at all
> - the term 'factor' is mentioned only once:
>   "The elements of the list will be coerced to
>   factors (if they are not already factors)."
>
> -----
>
> This issue made me write the following code used
> for preparing the data for a barchart:
>
>  df.a = aggregate(df[,value.var],
>                   list(grouping=dfgrouping, other=dfsubbar.var),
>                   FUN=FUN)
>  if (is.factor(dfsubbar.var)) { # R 2.4: this should be done by 'aggregate'
>    df.a$other = factor(df.a$other,
>                        levels=levels(dfsubbar.var),
>                        ordered=is.ordered(dfsubbar.var))
>  }
>
> Cumbersome.
>
> R is great anyway. Thanks for your service building it!
>
>   Lutz Prechelt
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
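
For readers of the archive, here is a minimal sketch of the "index the original variable" idea described in the reply. It is not the code that went into R-devel. The helper name aggregate_keep_class is hypothetical, and the sketch assumes a single grouping variable and a FUN that returns one number per group.

```r
# Hypothetical helper (not part of package stats): aggregate x by one
# grouping variable without coercing that variable to a plain factor.
aggregate_keep_class <- function(x, by, FUN) {
  # Row indices of each group; split() uses a factor only internally,
  # and drop = TRUE skips levels that do not occur in the data.
  idx <- split(seq_along(by), by, drop = TRUE)

  # One representative row per group, in the order of the groups.
  first <- vapply(idx, function(i) i[1L], integer(1))

  # Apply FUN to each group (assumed to return a single number).
  vals <- vapply(idx, function(i) FUN(x[i]), numeric(1))

  # Indexing the ORIGINAL variable keeps its class, its level order and
  # its 'ordered' attribute, instead of rebuilding it from the labels.
  data.frame(groups = by[first], x = unname(vals))
}

# The example from the report
ff  <- factor(c("a", "b", "a", "b"), levels = c("b", "a"), ordered = TRUE)
agg <- aggregate_keep_class(1:4, ff, sum)
print(levels(agg$groups))      # [1] "b" "a"
print(is.ordered(agg$groups))  # [1] TRUE

# A logical grouping variable stays logical
cold <- c(TRUE, FALSE, TRUE)
res  <- aggregate_keep_class(c(1, 2, 3), cold, sum)
print(class(res$groups))       # [1] "logical"
```

Because the grouping column is built by subsetting `by` itself, anything that cannot be subset this way or stored in a data frame would fail here, which matches the caveat in the reply.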