Okay everyone, here's a likely softball for someone.
Consider the following data frame:
#Create data
x<-rep(c(1,15),10)
y<-rnorm(20)
z<-c(rep("auto",10),rep("bus",10))
a<-rep(c(1,1,2,2,3,3,4,4,5,5),2)
#Create Data frame
Df<-data.frame(Source=x,Rate=y,Bin=a,Type=z)
I want to create a new column that equals the sum of the Rates for each type
(1,15) by Bin.
A related question: I have been using R for a while now and usually
manipulate my data in data frames, but I know lists are better for R so
perhaps the above should be done using lists. Feel free to offer
suggestions coming from that angle.
Thanks guys
JR-
--
View this message in context:
http://r.789695.n4.nabble.com/Summarize-by-two-or-more-attributes-tp3529825p3529825.html
Sent from the R help mailing list archive at Nabble.com.
I will hit my own ball on this one:
tapply(Df$Rate,list(Df$Bin,Df$Type),sum)
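Note that tapply() returns a Bin-by-Type matrix of sums, not a new column. Below is a minimal sketch, assuming the goal is one group sum per row of Df, of how that matrix could be indexed back onto the rows. The set.seed() call and the column name SumRate are additions for illustration and are not part of the original post.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
x <- rep(c(1, 15), 10)
y <- rnorm(20)
z <- c(rep("auto", 10), rep("bus", 10))
a <- rep(c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5), 2)
Df <- data.frame(Source = x, Rate = y, Bin = a, Type = z)

# Bin-by-Type matrix of summed rates (rows are Bins, columns are Types)
sums <- tapply(Df$Rate, list(Df$Bin, Df$Type), sum)

# Look up each row's group sum through the matrix dimnames,
# using a two-column character index of (Bin, Type)
Df$SumRate <- sums[cbind(as.character(Df$Bin), as.character(Df$Type))]
head(Df)
```

Every row in the same Bin/Type combination ends up with the same SumRate value, which is the same thing ave() would give for those two grouping variables.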
One possibility is:
library(doBy)
summaryBy(Rate~Source+Bin, data=Df, FUN=sum)

On 5/17/2011 12:48 PM, LCOG1 wrote:
> I want to create a new column that equals the sum of the Rates for each type
> (1,15) by Bin.
Like This?
x<-rep(c(1,15),10)
y<-rnorm(20)
z<-c(rep("auto",10),rep("bus",10))
a<-rep(c(1,1,2,2,3,3,4,4,5,5),2)
#Create Data frame
Df<-data.frame(Source=x,Rate=y,Bin=a,Type=z)
Df
library(plyr)
# One row per Type/Bin combination
ddply(Df,c('Type','Bin'),summarise,Summed=sum(Rate))
# Adding a column to Df (returns a new data frame; assign it to keep it)
ddply(Df,c('Type','Bin'),mutate,Summed=sum(Rate))
# Convert the result to a list
dlply(Df,c('Type','Bin'),summarise,Summed=sum(Rate))
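If plyr is not available, base R can produce a similar result. This is only a sketch of an alternative, not what the reply above uses: aggregate() builds the per-group summary and merge() attaches it back to the rows. The set.seed() call and the object names summed and merged are assumptions made for illustration.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
x <- rep(c(1, 15), 10)
y <- rnorm(20)
z <- c(rep("auto", 10), rep("bus", 10))
a <- rep(c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5), 2)
Df <- data.frame(Source = x, Rate = y, Bin = a, Type = z)

# One row per Type/Bin combination with the summed Rate
summed <- aggregate(Rate ~ Type + Bin, data = Df, FUN = sum)
names(summed)[names(summed) == "Rate"] <- "Summed"

# Attach the group sums back to every original row
merged <- merge(Df, summed, by = c("Type", "Bin"))
head(merged)
```

Keep in mind that merge() may reorder the rows, so merged is not guaranteed to be in the same order as Df.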
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx
----- Original Message ----
> From: LCOG1 <jroll at lcog.org>
> To: r-help at r-project.org
> Sent: Tue, May 17, 2011 9:48:36 AM
> Subject: [R] Summarize by two or more attributes
> I want to create a new column that equals the sum of the Rates for each type
> (1,15) by Bin.
On May 17, 2011, at 11:48 AM, LCOG1 wrote:
> I want to create a new column that equals the sum of the Rates for each type
> (1,15) by Bin.

See ?ave and consider:
# Presuming you want 'Bin' nested within 'Source'
Df$Sum <- ave(Df$Rate, list(Df$Source, Df$Bin), FUN = sum)
# Or 'Source' nested within 'Bin'
Df$Sum <- ave(Df$Rate, list(Df$Bin, Df$Source), FUN = sum)

On your follow up, a data frame is a type of list with a 'data.frame' class attribute, a 'row.names' attribute and a 'names' attribute for the column names. Much like a matrix is a vector with a 'dim' attribute.

Try this:
unclass(Df)
and see the output. It looks just like a list, because it is...

If dealing with 'rectangular' datasets (e.g. a database table), where each column may need to be of differing data types, a data frame in R is specifically designed to handle it. It is because a data frame is a list that it can do this, since each element in a list can be a different type.

If you need to deal with a data structure that may not be entirely based upon a rectangular data set and may need to contain various numbers of items per element, then a list is the way to go. Lists are commonly used in R functions to return complex objects that may contain vectors of various types, matrices, data frames and even lists of lists.

A quick example would be objects returned by R's model functions. Run example(lm) and after the graphs finish, use str(lm.D9) to give an example of the structure of a somewhat complex list object.

HTH,
Marc Schwartz
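The point about data frames being lists can be checked directly. Here is a small sketch using a cut-down, made-up data frame (the values and the object name ragged are hypothetical, chosen only for illustration):

```r
# Small stand-in data frame; values are made up for illustration
Df <- data.frame(Source = c(1, 15), Rate = c(0.5, -0.2), Type = c("auto", "bus"))

is.list(Df)          # a data frame passes the list test
attributes(Df)       # 'names', 'class' and 'row.names'
sapply(Df, class)    # list-style iteration over the columns
Df[["Rate"]]         # list-style extraction of one column

# A plain list can hold elements of different lengths and types,
# which a rectangular data frame cannot
ragged <- list(ids = 1:3, label = "auto", nested = list(a = 1, b = letters))
str(ragged)
```

Anything that works on a list, such as lapply() or [[ ]], therefore also works column-wise on a data frame. A bare list is mainly needed when the data are not rectangular.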