I have a question about using cumsum on subsets of a data frame.
Suppose I have a frame that looks something like this

> tmp
       f x  y
1   left 1  0
2   left 2  0
3   left 3  9
4   left 4 10
5   left 5 23
6   left 6 45
7   left 7 13
8   left 8  2
9   left 9  6
10 right 1 10
11 right 2 26
12 right 3  9
13 right 4 50
14 right 5 78
15 right 6 20
16 right 7  7
17 right 8 20
18 right 9 19

I'm plotting things like this with lattice

> library(lattice)
> xyplot(y ~ x | f, data=tmp)

If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
it is summed across the values of the factor f.  Can anyone
suggest a way to calculate the cumulative sum of y in this data
frame such that it is reset for each value of f?  The resulting
frame would look like this:

> tmp
       f x  y   s
1   left 1  0   0
2   left 2  0   0
3   left 3  9   9
4   left 4 10  19
5   left 5 23  42
6   left 6 45  87
7   left 7 13 100
8   left 8  2 102
9   left 9  6 108
10 right 1 10  10
11 right 2 26  36
12 right 3  9  45
13 right 4 50  95
14 right 5 78 173
15 right 6 20 193
16 right 7  7 200
17 right 8 20 220
18 right 9 19 239

I know how to calculate the pieces with, for example,

> cumsum(tmp$y[tmp$f=='right'])

but I don't know how to get this piecewise into the data frame or
how to automate it.  Any suggestions?

Thanks, Mike

-- 
Michael A. Miller                                mmiller3 at iupui.edu
  Imaging Sciences, Department of Radiology, IU School of Medicine
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

On 24 Jul 2002, Michael A. Miller wrote:

> I have a question about using cumsum on subsets of a data frame.
> Suppose I have a frame that looks something like this
>
> > tmp
>        f x  y
> 1   left 1  0
> 2   left 2  0
> 3   left 3  9
> 4   left 4 10
> 5   left 5 23
> 6   left 6 45
> 7   left 7 13
> 8   left 8  2
> 9   left 9  6
> 10 right 1 10
> 11 right 2 26
> 12 right 3  9
> 13 right 4 50
> 14 right 5 78
> 15 right 6 20
> 16 right 7  7
> 17 right 8 20
> 18 right 9 19
>
> I'm plotting things like this will lattice
>
> > library(lattice)
> > xyplot(y ~ x | f, data=tmp)
>
> If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
> it is summed across the values of the factor f.  Can anyone
> suggest a way to calculate the cumulative sum of y in this data
> frame such that it is reset for each value of f?  The resulting
> frame would look like this:
>

Use split and unsplit

Eg
     > df<-data.frame(y=runif(100),f=rep(1:5,rep(20,5)))
     > df$z<-unsplit(lapply(split(df$y,df$f),cumsum),df$f)

	-thomas

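For readers following along, here is a minimal sketch of the split/unsplit suggestion applied to the frame from the original question rather than to random data. The frame is retyped from the posted listing; the column names `s` and `s2` are chosen here for illustration. The sketch also shows base R's `ave()`, which was not mentioned in the thread but appears to give the same result in one call.

```r
# Rebuild the example frame from the original question.
tmp <- data.frame(
  f = factor(rep(c("left", "right"), each = 9)),
  x = rep(1:9, times = 2),
  y = c(0, 0, 9, 10, 23, 45, 13, 2, 6,
        10, 26, 9, 50, 78, 20, 7, 20, 19)
)

# split() breaks y into one vector per level of f, lapply() runs cumsum
# on each piece, and unsplit() puts the pieces back in the original row order.
tmp$s <- unsplit(lapply(split(tmp$y, tmp$f), cumsum), tmp$f)

# ave() applies a function within groups and returns a vector aligned
# with the input (an alternative not taken from the thread).
tmp$s2 <- ave(tmp$y, tmp$f, FUN = cumsum)

print(tmp)
```

Both columns restart at the first "right" row, so either can be plotted directly with xyplot(s ~ x | f, data=tmp).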
Dear Michael,
The following appears to do what you want:
     > cbind(tmp, s=c(lapply(split(tmp, tmp$f),
     +     function(x) cumsum(x$y)), recursive=TRUE))
         f x  y   s
     1   left 1  0   0
     2   left 2  0   0
     3   left 3  9   9
     4   left 4 10  19
     5   left 5 23  42
     6   left 6 45  87
     7   left 7 13 100
     8   left 8  2 102
     9   left 9  6 108
     10 right 1 10  10
     11 right 2 26  36
     12 right 3  9  45
     13 right 4 50  95
     14 right 5 78 173
     15 right 6 20 193
     16 right 7  7 200
     17 right 8 20 220
     18 right 9 19 239
I hope that this helps,
  John
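
One caveat worth sketching: concatenating the per-group results with c(..., recursive=TRUE) returns them in factor-level order, so the cbind approach lines up with the rows only when the frame is already sorted by f, as it is in the question. The toy frame `mixed` below is hypothetical and exists only to show the difference; `ave()` is used as an order-preserving comparison and is not part of the reply above.

```r
# A hypothetical frame whose rows alternate between the two groups.
mixed <- data.frame(
  f = factor(c("left", "right", "left", "right")),
  y = c(1, 10, 2, 20)
)

# Per-group cumsums concatenated in level order: all "left", then all "right".
by_concat <- unname(c(lapply(split(mixed, mixed$f),
                             function(d) cumsum(d$y)),
                      recursive = TRUE))

# ave() writes each group's cumsum back into that group's own row positions.
by_ave <- ave(mixed$y, mixed$f, FUN = cumsum)

print(cbind(mixed, by_concat, by_ave))
```

With sorted data the two columns agree; with interleaved rows only the second one matches the rows it sits beside, so sorting by f first (or using unsplit or ave) seems the safer habit.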
At 09:29 AM 7/24/2002 -0500, Michael A. Miller wrote:
>I have a question about using cumsum on subsets of a data frame.
>Suppose I have a frame that looks something like this
>
> > tmp
>        f x  y
>1   left 1  0
>2   left 2  0
>3   left 3  9
>4   left 4 10
>5   left 5 23
>6   left 6 45
>7   left 7 13
>8   left 8  2
>9   left 9  6
>10 right 1 10
>11 right 2 26
>12 right 3  9
>13 right 4 50
>14 right 5 78
>15 right 6 20
>16 right 7  7
>17 right 8 20
>18 right 9 19
>
>I'm plotting things like this with lattice
>
> > library(lattice)
> > xyplot(y ~ x | f, data=tmp)
>
>If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
>it is summed across the values of the factor f.  Can anyone
>suggest a way to calculate the cumulative sum of y in this data
>frame such that it is reset for each value of f?  The resulting
>frame would look like this:
>
> > tmp
>        f x  y   s
>1   left 1  0   0
>2   left 2  0   0
>3   left 3  9   9
>4   left 4 10  19
>5   left 5 23  42
>6   left 6 45  87
>7   left 7 13 100
>8   left 8  2 102
>9   left 9  6 108
>10 right 1 10  10
>11 right 2 26  36
>12 right 3  9  45
>13 right 4 50  95
>14 right 5 78 173
>15 right 6 20 193
>16 right 7  7 200
>17 right 8 20 220
>18 right 9 19 239
>
>I know how to calculate the pieces with, for example,
>
> > cumsum(tmp$y[tmp$f=='right'])
>
>but I don't know how to get this piecewise into the data frame or
>how to automate it.  Any suggestions?
____________________________
John Fox
Department of Sociology
McMaster University
email: jfox at mcmaster.ca
web: http://www.socsci.mcmaster.ca/jfox
____________________________
On Wed, Jul 24, 2002 at 09:29:28AM -0500, Michael A. Miller wrote:
...
> I'm plotting things like this with lattice
>
> > library(lattice)
> > xyplot(y ~ x | f, data=tmp)
>
> If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
> it is summed across the values of the factor f.

cumsum(y) is evaluated first, over the whole column, and only then
does xyplot split the result by f.

There have been replies that suggested fiddling with the data.frame
to make a new data.frame.  You could also fiddle with the panel
argument of xyplot.  Something like...

xyplot(y~x | f, data=tmp,
       panel=function(x,y,...) {panel.xyplot(x,cumsum(y),type="l",...)})

that clips vertically - apparently, the ylim is the (quite sensible)
range of the y data.  So....

xyplot(y~x | f, data=tmp,
       panel=function(x,y,...) {panel.xyplot(x,cumsum(y),type="l",...)},
       ylim=c(min(tmp$y),sum(tmp$y)))

or, if you want the points too....

xyplot(y~x | f, data=tmp,
       panel=function(x,y,...) {
           panel.xyplot(x,cumsum(y),type="l",...)
           panel.xyplot(x,y,...)
       },
       ylim=c(min(tmp$y),sum(tmp$y)))

It won't return the calculated cumsum value, but it'll look pretty,
and work for any arrangement of your "split" variable.

Cheers

Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz

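The two routes in this thread can be combined. As a rough sketch, assuming the lattice package is installed and the frame is retyped from the question: once the per-group cumulative sum is stored as a column (called `s` here, computed with `ave()`, which is a substitution and not something posted in the thread), xyplot picks sensible axis limits by itself. For the panel-function route, the upper limit sum(tmp$y) is the total over both groups; the largest single-group total is a tighter bound, provided y is never negative. The names `p`, `p2`, and `group_max` are illustrative.

```r
library(lattice)

# Example frame retyped from the original question.
tmp <- data.frame(
  f = factor(rep(c("left", "right"), each = 9)),
  x = rep(1:9, times = 2),
  y = c(0, 0, 9, 10, 23, 45, 13, 2, 6,
        10, 26, 9, 50, 78, 20, 7, 20, 19)
)

# Route 1: store the per-group cumulative sum, then plot it as a column.
tmp$s <- ave(tmp$y, tmp$f, FUN = cumsum)
p <- xyplot(s ~ x | f, data = tmp, type = "l")

# Route 2: compute cumsum inside the panel function, with a tighter ylim.
# Assumes y >= 0, so each group's total is also its largest cumulative sum.
group_max <- max(tapply(tmp$y, tmp$f, sum))
p2 <- xyplot(y ~ x | f, data = tmp,
             panel = function(x, y, ...) {
               panel.xyplot(x, cumsum(y), type = "l", ...)
               panel.xyplot(x, y, ...)
             },
             ylim = c(min(tmp$y), group_max))

# print(p); print(p2)   # printing a trellis object draws it
```

Route 1 keeps the computed values available for later use; route 2 leaves the data frame untouched.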
Thanks to all for your help with my cumsum question.  As usual, I've
not been disappointed by the generous and lightning fast support of
the R community!

Regards, Mike

-- 
Michael A. Miller                                mmiller3 at iupui.edu
  Imaging Sciences, Department of Radiology, IU School of Medicine