jeffc
2008-Nov-17 02:12 UTC
[R] how to calculate another vector based on the data from a combination of two factors
Hi,

I have a data set similar to the following:

    State  Gender  Quantity
    TX     Male    1
    NY     Female  2
    TX     Male    3
    NY     Female  4

I need to calculate the cumulative sum of the quantity by State and Gender. The expected output is:

    State  Gender  Quantity  CumQuantity
    TX     Male    1         1
    TX     Male    3         4
    NY     Female  2         2
    NY     Female  4         6

I would highly appreciate it if someone could give me some hints on solving that in R.

Hao

--
View this message in context: http://www.nabble.com/how-to-calculate-another-vector-based-on-the-data-from-a-combination-of-two-factors-tp20532749p20532749.html
Sent from the R help mailing list archive at Nabble.com.
Marc Schwartz
2008-Nov-17 03:39 UTC
[R] how to calculate another vector based on the data from a combination of two factors
I would verify this, but something along the lines of the following:

    > DF
      State Gender Quantity
    1    TX   Male        1
    2    NY Female        2
    3    TX   Male        3
    4    NY Female        4

    do.call(rbind,
            lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE),
                   function(x) cbind(x, cumsum(x$Quantity))))

which yields:

                State Gender Quantity cumsum(x$Quantity)
    NY.Female.2    NY Female        2                  2
    NY.Female.4    NY Female        4                  6
    TX.Male.1      TX   Male        1                  1
    TX.Male.3      TX   Male        3                  4

To take this step by step:

First, split() DF by the two factors:

    > split(DF, list(DF$State, DF$Gender), drop = TRUE)
    $NY.Female
      State Gender Quantity
    2    NY Female        2
    4    NY Female        4

    $TX.Male
      State Gender Quantity
    1    TX   Male        1
    3    TX   Male        3

Pass that to lapply(), in which we do the cumsum() and cbind():

    > lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE),
             function(x) cbind(x, cumsum(x$Quantity)))
    $NY.Female
      State Gender Quantity cumsum(x$Quantity)
    2    NY Female        2                  2
    4    NY Female        4                  6

    $TX.Male
      State Gender Quantity cumsum(x$Quantity)
    1    TX   Male        1                  1
    3    TX   Male        3                  4

Pass that to do.call() to rbind() the results together:

    > do.call(rbind,
              lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE),
                     function(x) cbind(x, cumsum(x$Quantity))))
                State Gender Quantity cumsum(x$Quantity)
    NY.Female.2    NY Female        2                  2
    NY.Female.4    NY Female        4                  6
    TX.Male.1      TX   Male        1                  1
    TX.Male.3      TX   Male        3                  4

See ?split, ?do.call, ?rbind and ?cumsum.

If you want the exact row ordering as you had it in your post, you can alter the factor levels; otherwise the groups will be sorted alphabetically (e.g. NY before TX and Female before Male).

HTH,

Marc Schwartz
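The remark about altering the factor levels can be sketched as follows. This is only an illustration, assuming DF is rebuilt from the four rows in the original post; the level orders c("TX", "NY") and c("Male", "Female") and the column name CumQuantity are choices made here, not part of the reply above.

```r
# Rebuild the example data (assumed from the four rows in the original post)
DF <- data.frame(State    = c("TX", "NY", "TX", "NY"),
                 Gender   = c("Male", "Female", "Male", "Female"),
                 Quantity = c(1, 2, 3, 4))

# Set the level order explicitly so that TX comes before NY
# and Male before Female, instead of the alphabetical default
DF$State  <- factor(DF$State,  levels = c("TX", "NY"))
DF$Gender <- factor(DF$Gender, levels = c("Male", "Female"))

# Same split() / lapply() / rbind() pipeline, with a named column
res <- do.call(rbind,
               lapply(split(DF, list(DF$State, DF$Gender), drop = TRUE),
                      function(x) cbind(x, CumQuantity = cumsum(x$Quantity))))

# The TX Male rows now come first, followed by the NY Female rows
print(res)
```

Naming the argument inside cbind() also avoids the automatically generated column name cumsum(x$Quantity).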
Gabor Grothendieck
2008-Nov-17 03:58 UTC
[R] how to calculate another vector based on the data from a combination of two factors
Try this. The first line appends the cumulative sum column and the second displays it in sorted fashion:

    DF$cumQuantity <- ave(DF$Quantity, DF$State, DF$Gender, FUN = cumsum)
    DF[order(DF$State, DF$Gender), ]
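For anyone who wants to run the ave() approach end to end, here is a minimal self-contained sketch. It assumes DF is built from the four rows in the original post; the data.frame() call is an assumption added here, since the thread never shows how DF was created.

```r
# Example data (assumed from the four rows in the original post)
DF <- data.frame(State    = c("TX", "NY", "TX", "NY"),
                 Gender   = c("Male", "Female", "Male", "Female"),
                 Quantity = c(1, 2, 3, 4))

# ave() applies FUN within each State/Gender group and returns a vector
# aligned with the original rows, so it can be assigned directly
DF$cumQuantity <- ave(DF$Quantity, DF$State, DF$Gender, FUN = cumsum)

# In the original row order the new column is 1, 2, 4, 6
print(DF)

# Sorting is only needed for display
print(DF[order(DF$State, DF$Gender), ])
```

Because the result keeps the original row order, no sorting is required before the assignment.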
hadley wickham
2008-Nov-17 04:05 UTC
[R] how to calculate another vector based on the data from a combination of two factors
Here's one approach that uses the plyr package:

    library(plyr)
    ddply(df, .(State, Gender), transform, CumQuantity = cumsum(Quantity))

You can find out more about how this works at http://had.co.nz/plyr

Hadley

--
http://had.co.nz/
Jorge Ivan Velez
2008-Nov-17 05:26 UTC
[R] how to calculate another vector based on the data from a combination of two factors
Dear Jeff,

Try also

    # Sort by Gender, then State: unlist(tapply(...)) returns the groups with
    # the first factor (State) varying fastest, so the rows must be in that order
    df = df[order(df$Gender, df$State), ]
    df$cQuantity <- unlist(tapply(df[, 3], df[, -3], cumsum))
    df

      State Gender Quantity cQuantity
    2    NY Female        2         2
    4    NY Female        4         6
    1    TX   Male        1         1
    3    TX   Male        3         4

HTH,

Jorge
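The unlist(tapply(...)) pattern relies on the rows being sorted in the same order in which tapply() lays out its groups, with the first grouping factor varying fastest. The sketch below checks this on a slightly larger data set; the six rows are made up for illustration and are not from the thread, and ave() is used only as a reference result.

```r
# Hypothetical data with all four State/Gender combinations present
df <- data.frame(State    = c("NY", "NY", "TX", "TX", "NY", "TX"),
                 Gender   = c("Female", "Male", "Female", "Male", "Male", "Female"),
                 Quantity = 1:6)

# Sorted by Gender, then State: matches the group order of tapply()
by_gender <- df[order(df$Gender, df$State), ]
by_gender$cQuantity <- unlist(tapply(by_gender$Quantity,
                                     by_gender[, c("State", "Gender")],
                                     cumsum),
                              use.names = FALSE)
by_gender$check <- ave(by_gender$Quantity, by_gender$State, by_gender$Gender,
                       FUN = cumsum)

# Sorted by State, then Gender: the values land on the wrong rows
by_state <- df[order(df$State, df$Gender), ]
by_state$cQuantity <- unlist(tapply(by_state$Quantity,
                                    by_state[, c("State", "Gender")],
                                    cumsum),
                             use.names = FALSE)
by_state$check <- ave(by_state$Quantity, by_state$State, by_state$Gender,
                      FUN = cumsum)

print(by_gender)  # cQuantity agrees with check
print(by_state)   # cQuantity differs from check in some rows
```

With the four-row example from the original post both sort orders happen to agree, because only two of the four State/Gender combinations occur.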
jeffc
2008-Nov-18 03:59 UTC
[R] how to calculate another vector based on the data from a combination of two factors
Hi All,

Thank you all for the input. These different solutions rock!

Hao