Hi,
I've been searching the internet a lot, but I can't find a solution.
You'll find a file attached.
For each combination of (Materiale, tpdv, UM), I need to find the sum, average and count.
My idea was to aggregate by the three parameters, but I don't know how to get the numeric values (SUM, COUNT, AVG) I need.
Can you help me?
Thank you

http://www.nabble.com/file/p22905322/ordini2008_ex.txt ordini2008_ex.txt
--
View this message in context: http://www.nabble.com/SUM%2CCOUNT%2CAVG-tp22905322p22905322.html
Sent from the R help mailing list archive at Nabble.com.
I gather you have an SQL background, since those are SQL functions. Check out the sqldf R package and the many examples on its home page, http://sqldf.googlecode.com, and in ?sqldf. That may ease the transition from SQL to R.

On Mon, Apr 6, 2009 at 5:37 AM, calpeda <mauro.biasolo at calpeda.it> wrote:
> I need for each (Materiale, tpdv, UM) to find sum, avg and count
> My idea was to aggregate for the 3 parameters, but I don't know how to get
> the numeric value (SUM, COUNT, AVG) I need.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
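For readers coming from SQL, here is a minimal sketch of what the sqldf route might look like. It assumes the sqldf package is installed; the data frame dd and the column aliases total, n and average are made up for illustration and are not taken from the poster's file.

```r
library(sqldf)  # assumption: sqldf (and its default SQLite backend) is installed

# Small deterministic sample data (hypothetical, not the poster's file)
dd <- data.frame(a = 1:10,
                 b = rep(1:2, times = 5),
                 c = rep(1:2, each = 5))

# One GROUP BY query returns sum, count and average per (b, c) group
res <- sqldf("SELECT b, c,
                     SUM(a)   AS total,
                     COUNT(*) AS n,
                     AVG(a)   AS average
              FROM dd
              GROUP BY b, c
              ORDER BY c, b")
print(res)
```

For the poster's data, the same query shape would presumably group by Materiale, tpdv and UM instead of b and c.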
There are various ways to do this in R.

    # sample data
    dd <- data.frame(a=1:10, b=sample(3,10,replace=TRUE), c=sample(3,10,replace=TRUE))

Using the standard built-in functions, you can use:

*** aggregate ***

    aggregate(dd, list(b=dd$b, c=dd$c), sum)
      b c  a b c
    1 1 1 10 2 2
    2 2 1  3 2 1
    ....

(Because the whole data frame is passed in, the last two columns are the grouping columns b and c summed within each group.)

*** tapply ***

    tapply(dd$a, interaction(dd$b,dd$c), sum)
          1.1       2.1       3.1       1.2       2.2       3.2       1.3       2.3
     5.000000  3.000000 10.000000  5.000000        NA        NA  5.000000       ...

But the nicest way is probably to use the plyr package:

    > library(plyr)
    > ddply(dd, ~b+c, sum)
      b c V1
    1 1 1 14
    2 2 1  6
    ....

********

Note that ddply hands the function the whole data-frame piece for each group, so sum here adds up every value in columns a, b and c (10 + 2 + 2 = 14 in the first row). To summarise column a alone, pick it out inside the function. For the same reason, length would return the number of columns rather than the number of rows, so use nrow for the count. A plain function such as sum or mean returns a single value, so the simplest route is one call per statistic:

    > ddply(dd, ~b+c, nrow)                      # count
    > ddply(dd, ~b+c, function(d) sum(d$a))      # sum
    > ddply(dd, ~b+c, function(d) mean(d$a))     # arithmetic average

There is an 'each' function in plyr, but it doesn't seem to work when handed to ddply directly.

-s
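As a possible shortcut in base R, aggregate can return several statistics in one call if the summary function returns a named vector. The sketch below uses a deterministic stand-in for dd (not the random sample above) and a helper called stats, both made up for illustration.

```r
# Deterministic sample data (hypothetical stand-in for dd)
dd <- data.frame(a = 1:10,
                 b = rep(1:2, times = 5),
                 c = rep(1:2, each = 5))

# One function that returns all three statistics as a named vector
stats <- function(x) c(sum = sum(x), count = length(x), avg = mean(x))

# aggregate() stores the vector result as a single matrix column named "a";
# do.call(data.frame, ...) flattens it into columns a.sum, a.count, a.avg
res <- do.call(data.frame, aggregate(a ~ b + c, data = dd, FUN = stats))
print(res)
```

The formula interface keeps the grouping columns out of the summary, so only a is summed, counted and averaged.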
calpeda:
> I need for each (Materiale, tpdv, UM) to find sum,avg and count
> My idea was to aggregate for the 3 parameters ..but I don t know how to
> get the numeric value (SUM,COUNT,AVG) I need.

If I have understood what you're trying to accomplish, this should work:

    library(Hmisc)
    d = read.table("http://www.nabble.com/file/p22905322/ordini2008_ex.txt")
    sumfun = function(x) c(sum=sum(x), count=length(x), avg=mean(x))
    with(d, summarize(qta, Materiale, sumfun, stat.name=NULL))

         Materiale sum count  avg
    1  14001850000  10     1 10,0
    2  16006080000   2     1  2,0
    3  30100300000   1     1  1,0
    4  41SD0800000   3     3  1,0
    5  44029740000   2     1  2,0
    6  60000321000   3     3  1,0
    7  60401721000   1     1  1,0
    8  60900761000   2     1  2,0
    9  70020030000   2     2  1,0
    10 70310010000   2     2  1,0
    11 70730040018   3     2  1,5
    12 71710040014   1     1  1,0

-- 
Karl Ove Hufthammer
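The call above groups by Materiale only. As a rough base R sketch of grouping by all three keys, the same kind of summary function can be applied after split. The column names come from the thread, but the rows below are invented for illustration and are not the contents of ordini2008_ex.txt.

```r
# Invented rows using the column names mentioned in the thread
d <- data.frame(
  Materiale = c("A", "A", "A", "B", "B"),
  tpdv      = c("X", "X", "Y", "X", "X"),
  UM        = c("PZ", "PZ", "PZ", "KG", "KG"),
  qta       = c(1, 3, 2, 5, 7)
)

# Same idea as sumfun in the message above: one named vector per group
sumfun <- function(x) c(sum = sum(x), count = length(x), avg = mean(x))

# Split qta by every (Materiale, tpdv, UM) combination; drop = TRUE
# discards combinations that never occur in the data
groups <- split(d$qta, list(d$Materiale, d$tpdv, d$UM), drop = TRUE)

# One row per group, one column per statistic
res <- t(sapply(groups, sumfun))
print(res)
```

The row names of res are the three keys joined by dots, for example A.X.PZ.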
doBy is a good package for this sort of question, too.

    library(doBy)
    # left of ~ : the variable to summarise; right of ~ : the grouping variables
    # ('data' is assumed to be the data frame read from the attached file)
    summaryBy(qta ~ Materiale + tpdv + UM, data=data, FUN=c(sum,length,mean))

regards, Christian
Nice example. Does anyone know if it is possible to use multiple aggregating functions with the melt/cast functions?

Cheers,
Dylan

On Monday 06 April 2009, Christian wrote:
> A good package for this sort of questions is doBy, too.

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341