When the 'by' function forms subsets, are the rows in the same order as they are in the original data frame?

For example, I want to use 'by' to calculate cumulative sums of a value 'v' by date 'd' for different levels of a factor 'f':

> df<-data.frame(f=c("A","A","B"),d=as.Date(c("2010-1-1","2010-2-1","2010-1-1")),v=c(100,200,150))
> df
  f          d   v
1 A 2010-01-01 100
2 A 2010-02-01 200
3 B 2010-01-01 150
> do.call(rbind,by(df,df$f,FUN=function(x)data.frame(x[1],x[2],cumsum(x[3]))))
    f          d   v
A.1 A 2010-01-01 100
A.2 A 2010-02-01 300
B   B 2010-01-01 150

This is exactly what I want, namely, cumulative sums by date.

Can I be sure that the rows within subset A will be arranged in date order, as they are in the original data frame? I would not want 'by' to randomly switch the order and create, for example,

    f          d   v
A.1 A 2010-02-01 200
A.2 A 2010-01-01 300
B   B 2010-01-01 150

I could force the order of each subset within the FUN of 'by', at the cost of extra execution time. Would that be advised?

Thanks,

Dan
Try this:

transform(df, v = unlist(with(df, tapply(v, f, cumsum))))

On Fri, Jan 8, 2010 at 4:10 PM, Daniel Murphy <chiefmurphy at gmail.com> wrote:
> When the 'by' function forms subsets, are the rows in the same order as they
> are in the original data frame?
> [...]

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
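A note on the tapply suggestion: unlist(tapply(...)) returns the values grouped by level of 'f', so assigning them straight back to the data frame only lines up when the rows are already grouped by 'f', as they are in Dan's example. The sketch below uses a hypothetical data frame 'df2' (not from the original post) whose rows are interleaved, to show the difference from ave, which writes each result back to its own row position.

```r
# Hypothetical data: same values as the example, but the two "A" rows
# are no longer adjacent.
df2 <- data.frame(f = c("A", "B", "A"), v = c(100, 150, 200))

# tapply + unlist: results come back grouped by level (A, A, B),
# not in the row order of df2.
by_level <- unname(unlist(with(df2, tapply(v, f, cumsum))))

# ave: each cumulative sum is placed back at its original row position.
by_row <- ave(df2$v, df2$f, FUN = cumsum)

print(by_level)
print(by_row)
```

With data already sorted by 'f' the two vectors coincide, so the one-liner is fine for the example as posted.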
Try ave:

transform(df, v = ave(v, f, FUN = cumsum))

On Fri, Jan 8, 2010 at 1:10 PM, Daniel Murphy <chiefmurphy at gmail.com> wrote:
> When the 'by' function forms subsets, are the rows in the same order as they
> are in the original data frame?
> [...]
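On the ordering question itself, here is a small sketch rather than an authoritative statement. split() keeps the elements of each group in the order they had in the input, and as far as I can tell by() on a data frame is built on the same grouping of row indices, so each subset should see its rows in original order. If the input is not guaranteed to be in date order, sorting once up front is cheaper than sorting inside FUN. The data frame 'df3' and the column 'cum_v' are made up for illustration.

```r
# Hypothetical data: group A is deliberately out of date order.
df3 <- data.frame(
  f = c("A", "B", "A"),
  d = as.Date(c("2010-02-01", "2010-01-01", "2010-01-01")),
  v = c(200, 150, 100)
)

# Row indices per group: within each group they stay in original order.
rows_by_group <- split(seq_len(nrow(df3)), df3$f)
print(rows_by_group)

# Sort once by factor and date, then take cumulative sums within each group.
sorted <- df3[order(df3$f, df3$d), ]
sorted$cum_v <- ave(sorted$v, sorted$f, FUN = cumsum)
print(sorted)
```

Sorting the whole data frame once with order() avoids repeating the sort for every level of 'f'.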