Kara Przeczek
2010-Feb-10 02:06 UTC
[R] sum sections of data of different lengths from within a data frame
Dear R Help:

I am trying to sum data from one column in a dataframe based on a value in another. I do not know how to do this easily in R.
For example:

Col A  Col B
1      0
3      0
2      1
2      0
1      0
4      0
1      1
9      1
3      0
5      0
2      1

I would like to cumsum the values in Col A for all rows where Col B is 0, and a value of 1 in Col B will reset the sum and give a value of 0.001. Thus, for this table I would like an output of 1, 4, 0.001, 2, 3, 7, 0.001, 0.001, 3, 8, 0.001.

I tried using a For loop, but that summed all the Col A values together. I need something that does

For (i in 1:length(df$Col B))
{
IF{Col B == 0, cumsum(Col A) "until" Col B == 1, else 0.001}
}

I don't know how to use "until" in R.
Any help would be greatly appreciated!
Kara
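The "until" behaviour described in the question can be sketched as an ordinary for loop that carries a running total and resets it whenever Col B is 1. This is only a minimal illustration, not code from the thread: the column names ColA and ColB (without spaces) and the variables out and running are assumptions made for the example.

```r
# Example data from the post; ColA/ColB are assumed, space-free column names.
df <- data.frame(ColA = c(1, 3, 2, 2, 1, 4, 1, 9, 3, 5, 2),
                 ColB = c(0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1))

out <- numeric(nrow(df))  # result vector, one entry per row
running <- 0              # running total since the last reset

for (i in seq_len(nrow(df))) {
  if (df$ColB[i] == 0) {
    running <- running + df$ColA[i]  # ColB is 0: keep accumulating
    out[i] <- running
  } else {
    running <- 0                     # ColB is 1: reset the sum
    out[i] <- 0.001
  }
}

out
```

For the example table this yields the sequence requested in the post: 1, 4, 0.001, 2, 3, 7, 0.001, 0.001, 3, 8, 0.001.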
jim holtman
2010-Feb-10 02:30 UTC
[R] sum sections of data of different lengths from within a data frame
Will this do it for you:

> x <- read.table(textConnection("ColA ColB
+ 1 0
+ 3 0
+ 2 1
+ 2 0
+ 1 0
+ 4 0
+ 1 1
+ 9 1
+ 3 0
+ 5 0
+ 2 1"), header=TRUE)
> closeAllConnections()
> x.s <- split(x, cumsum(x$ColB))
> x.l <- do.call(rbind, lapply(x.s, function(.grp){
+     newdata <- cbind(.grp, sum=cumsum((.grp$ColB == 0) * .grp$ColA))
+     newdata$sum[newdata$ColB == 1] <- .001
+     newdata
+ }))
>
> x.l
     ColA ColB   sum
0.1     1    0 1.000
0.2     3    0 4.000
1.3     2    1 0.001
1.4     2    0 2.000
1.5     1    0 3.000
1.6     4    0 7.000
2       1    1 0.001
3.8     9    1 0.001
3.9     3    0 3.000
3.10    5    0 8.000
4       2    1 0.001
>

--
Jim Holtman
Cincinnati, OH

What is the problem that you are trying to solve?
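The same grouping idea (cumsum(x$ColB) starts a new group at every row where ColB is 1) can also be written without split() and rbind() by using ave(), which applies a function within groups and returns a vector in the original row order. This is a sketch of an alternative, not the method posted above; the variable name grp is an assumption for the example.

```r
# Same example data as in the reply above.
x <- data.frame(ColA = c(1, 3, 2, 2, 1, 4, 1, 9, 3, 5, 2),
                ColB = c(0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1))

# Group id that increases by one at every row where ColB == 1.
grp <- cumsum(x$ColB)

# Cumulative sum of ColA within each group; rows with ColB == 1 contribute 0.
x$sum <- ave(x$ColA * (x$ColB == 0), grp, FUN = cumsum)

# Rows that reset the sum get the marker value 0.001.
x$sum[x$ColB == 1] <- 0.001

x
```

One difference from the split()/rbind() version is that the row names stay 1 to 11 instead of becoming group-prefixed names such as 0.1 or 3.10.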
kMan
2010-Feb-11 05:14 UTC
[R] sum sections of data of different lengths from within a data frame
Dear Kara,

Did you bother to test your code? You say your code actually did some summing, but you didn't include any working example of that code. Did you bother to read the posting guide?

(1) TRY to reference Col A (including space, as you indicated) in df.

> names(df)<-c("Col A", "Col B") #space
> df$Col A
> df$"Col A"
> names(df)<-c("ColA", "ColB") # no space
> df$ColA

(2) READ about flow control. Do you see 'until' mentioned anywhere?

> ?Control

(3) Do YOUR OWN working example. TRY to write a for loop with a capital F, for example...

> For(i in 1:10){print(i)}
> for(i in 1:10){print(i)}

(4) OBSERVE what "actually" happens when you take the length of your data frame. Does it make any sense?

> length(df)
> ncol(df)
> dim(df)

Crawl, then walk. Don't be lazy.

KeithC.
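Points (1) and (4) can be tried on a small data frame. The sketch below is an illustration only: the three-row df is made up for the example, and it assumes the columns really are named with a space, as written in the original question.

```r
# A small, made-up data frame whose column names contain a space.
df <- data.frame(c(1, 3, 2), c(0, 0, 1))
names(df) <- c("Col A", "Col B")

# A name with a space must be quoted when used with $ or [[ ]].
df$`Col A`      # backticks quote a non-syntactic name
df[["Col A"]]   # same column, selected with a character string

# length() of a data frame counts columns, not rows.
length(df)          # number of columns
ncol(df)            # number of columns
nrow(df)            # number of rows
length(df$`Col B`)  # length of a single column, equal to nrow(df)
```

When looping over rows, seq_len(nrow(df)) is usually a safer index than 1:length(df), since the latter runs over the columns.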