thr3ads.net - R devel - [Rd] Aggregate factor names [Sep 2007]

Mike Lawrence

2007-Sep-27 15:57 UTC

[Rd] Aggregate factor names

Hi all,

A suggestion derived from discussions amongst a number of R users in
my research group: set the default column names produced by
aggregate() equal to the names of the objects in the list passed to
the 'by' argument.

For example, it is annoying to type

with(
	my.data
	,aggregate(
		my.dv
		,list(
			one.iv = one.iv
			,another.iv = another.iv
			,yet.another.iv = yet.another.iv
		)
		,some.function
	)
)

to yield a data frame with
names = c('one.iv','another.iv','yet.another.iv','x')
when this seems more economical:

with(
	my.data
	,aggregate(
		my.dv
		,list(
			one.iv
			,another.iv
			,yet.another.iv
		)
		,some.function
	)
)
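
For reference, a minimal sketch of the behaviour under discussion. It uses the built-in mtcars data in place of my.data and mean in place of some.function (both are assumptions made only so the example runs). With an unnamed list, aggregate() falls back to Group.1, Group.2, ... for the grouping columns; with a named list, the names are carried through.

```r
# Unnamed 'by' list: grouping columns get default names
unnamed <- with(mtcars, aggregate(mpg, list(cyl, gear), mean))
names(unnamed)  # "Group.1" "Group.2" "x"

# Named 'by' list: the supplied names appear in the result
named <- with(mtcars, aggregate(mpg, list(cyl = cyl, gear = gear), mean))
names(named)    # "cyl" "gear" "x"
```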

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://memetic.ca

Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
	- Piet Hein

Gabor Grothendieck

2007-Sep-27 16:06 UTC

You can do this:

aggregate(iris[-5], iris[5], mean)
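
This works because single-bracket indexing of a data frame returns a data frame, which is itself a named list, so the column names travel with the 'by' argument. A sketch extending the same idea to several grouping columns, with the built-in mtcars data standing in for my.data (an assumption for illustration):

```r
# Both 'x' and 'by' are data frames, so every output column keeps its name
res <- aggregate(mtcars["mpg"], mtcars[c("cyl", "gear")], mean)
names(res)  # "cyl" "gear" "mpg"
```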



Prof Brian Ripley

2007-Sep-27 16:59 UTC

You seem to be assuming that the argument 'by' to the "data frame" method
of aggregate() is a call to list() with arguments which are names (and
evaluate to factors).

When aggregate.data.frame comes to be called, the 'by' argument is a 
promise to the actual argument.  In your example the actual argument is 
the call (in a readable layout)

list(one.iv, another.iv, yet.another.iv)

but that is one of a very large number of possibilities for 'by'.
Trying to produce reasonable names for unnamed arguments is hard enough
(see ?cbind), but trying to deduce reasonable names for the elements of a
list argument is one step further up the chain.  Further, if we did that,
people who wanted the documented behaviour would no longer be able to get
it.
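
To illustrate the point about promises, here is a small sketch. The function show_by() is hypothetical and exists only for this example; it returns the unevaluated expression that was supplied for 'by', which is all a function could inspect when trying to guess names.

```r
# Hypothetical helper: return the unevaluated expression given for 'by'
show_by <- function(x, by) substitute(by)

a <- c("u", "v")
b <- c(1, 2)

e1 <- show_by(1, list(a, b))               # a call to list() with two symbols
groups <- list(a, b)
e2 <- show_by(1, groups)                   # only the symbol 'groups'
e3 <- show_by(1, lapply(list(a, b), rev))  # an arbitrary call

# Only in the first case are the element names recoverable from the call
is.call(e1); is.name(e2); is.call(e3)
```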

I think what you want is a function that takes unnamed arguments and 
returns a named list, to replace your usage of list().  That's not so very 
hard to do, not least as in this context data.frame() will do the job.
So to extend the example on the help page:

> aggregate(x = testDF, by = data.frame(by1, by2), FUN = "mean")
   by1  by2 v1 v2
1    1   95  5 55
2    2   95  7 77
3    1   99  5 55
4    2   99 NA NA
5  big damp  3 33
6 blue  dry  3 33
7  red  red  4 44
8  red  wet  1 11

However, note that the grouping variables need NOT be factors, and this 
has made them so.  So you may want to look at data.frame() and
write list_with_names() to do just that.
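
One possible sketch of such a helper follows. The name list_with_names() comes from the message above, but the body is an assumption about what was intended, not code from the thread: it names each unnamed element after the deparsed expression passed in, and leaves the values untouched (no conversion to factors).

```r
# Sketch only: build a list whose unnamed elements are named after
# the expressions that were supplied as arguments.
list_with_names <- function(...) {
  args <- list(...)
  # Unevaluated argument expressions, deparsed to character strings
  exprs <- vapply(as.list(substitute(list(...)))[-1L],
                  function(e) paste(deparse(e), collapse = ""),
                  character(1))
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))
  unnamed <- !nzchar(nms)
  nms[unnamed] <- exprs[unnamed]   # keep any names the caller supplied
  names(args) <- nms
  args
}

# Usage with built-in data (mtcars stands in for my.data)
res <- with(mtcars, aggregate(mpg, list_with_names(cyl, gear), mean))
names(res)  # "cyl" "gear" "x"
```

Explicit names still take precedence, so list_with_names(cyl, g = gear) yields the names "cyl" and "g".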

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Martin Elff

2007-Oct-01 15:24 UTC

On Thursday 27 September 2007 (17:57:55), Mike Lawrence wrote:
> ex. it is annoying to type
>
> with(
> 	my.data
> 	,aggregate(
> 		my.dv
> 		,list(
> 			one.iv = one.iv
> 			,another.iv = another.iv
> 			,yet.another.iv = yet.another.iv
> 		)
> 		,some.function
> 	)
> )

If you use my package 'memisc' you can write

aggregate(some.function(my.dv)~one.iv+another.iv+yet.another.iv,
		data=my.data)
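
As a point of comparison, later releases of base R added a formula method for aggregate() in the stats package. Its call shape differs from the memisc form shown above: the summary function is passed through FUN rather than wrapped around the response. A sketch with the built-in mtcars data and mean (both stand-ins chosen for illustration):

```r
# Formula method from the stats package: grouping columns are named
# after the right-hand-side terms, the value column after the response
res <- aggregate(mpg ~ cyl + gear, data = mtcars, FUN = mean)
names(res)  # "cyl" "gear" "mpg"
```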

Best,
Martin

-- 
"Dealing with failure is easy: work hard to improve.  Success is also
easy to handle: you've solved the wrong problem.  Work hard to
improve."
  fortune 1.0

-------------------------------------------------
Dr. Martin Elff
Dept. of Social Sciences
University of Mannheim
Block A5, Room A 328 (NEW)
68131 Mannheim
Germany

Phone: +49-621-181-2093
Fax: +49-621-181-2099
E-Mail: elff at sowi.uni-mannheim.de
Web: http://webrum.uni-mannheim.de/sowi/elff/
     http://www.sowi.uni-mannheim.de/lspwivs/
-------------------------------------------------
