I have a question about using cumsum on subsets of a data frame.
Suppose I have a frame that looks something like this:

> tmp
       f x  y
1   left 1  0
2   left 2  0
3   left 3  9
4   left 4 10
5   left 5 23
6   left 6 45
7   left 7 13
8   left 8  2
9   left 9  6
10 right 1 10
11 right 2 26
12 right 3  9
13 right 4 50
14 right 5 78
15 right 6 20
16 right 7  7
17 right 8 20
18 right 9 19

I'm plotting things like this with lattice:

> library(lattice)
> xyplot(y ~ x | f, data=tmp)

If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
it is summed across the values of the factor f.  Can anyone
suggest a way to calculate the cumulative sum of y in this data
frame such that it is reset for each value of f?  The resulting
frame would look like this:

> tmp
       f x  y   s
1   left 1  0   0
2   left 2  0   0
3   left 3  9   9
4   left 4 10  19
5   left 5 23  42
6   left 6 45  87
7   left 7 13 100
8   left 8  2 102
9   left 9  6 108
10 right 1 10  10
11 right 2 26  36
12 right 3  9  45
13 right 4 50  95
14 right 5 78 173
15 right 6 20 193
16 right 7  7 200
17 right 8 20 220
18 right 9 19 239

I know how to calculate the pieces with, for example,

> cumsum(tmp$y[tmp$f=='right'])

but I don't know how to get this piecewise into the data frame or
how to automate it.  Any suggestions?

Thanks, Mike

-- 
Michael A. Miller                               mmiller3 at iupui.edu
  Imaging Sciences, Department of Radiology, IU School of Medicine
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On 24 Jul 2002, Michael A. Miller wrote:

> I have a question about using cumsum on subsets of a data frame.
> Suppose I have a frame that looks something like this
>
> > tmp
>        f x  y
> 1   left 1  0
> 2   left 2  0
> 3   left 3  9
> 4   left 4 10
> 5   left 5 23
> 6   left 6 45
> 7   left 7 13
> 8   left 8  2
> 9   left 9  6
> 10 right 1 10
> 11 right 2 26
> 12 right 3  9
> 13 right 4 50
> 14 right 5 78
> 15 right 6 20
> 16 right 7  7
> 17 right 8 20
> 18 right 9 19
>
> I'm plotting things like this with lattice
>
> > library(lattice)
> > xyplot(y ~ x | f, data=tmp)
>
> If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
> it is summed across the values of the factor f.  Can anyone
> suggest a way to calculate the cumulative sum of y in this data
> frame such that it is reset for each value of f?  The resulting
> frame would look like this:
>

Use split and unsplit

Eg
> df<-data.frame(y=runif(100),f=rep(1:5,rep(20,5)))
> df$z<-unsplit(lapply(split(df$y,df$f),cumsum),df$f)

	-thomas
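Applied to the tmp frame from the question, the split/unsplit idiom might look like the sketch below. The frame is rebuilt here from the posted values, and the column names s and s2 are chosen only for illustration. ave() is a base R shortcut for the same split, apply, unsplit pattern; it was not suggested in the thread and is shown only as an alternative.

```r
# Rebuild the example frame from the values posted in the question
tmp <- data.frame(
  f = factor(rep(c("left", "right"), each = 9)),
  x = rep(1:9, times = 2),
  y = c(0, 0, 9, 10, 23, 45, 13, 2, 6,
        10, 26, 9, 50, 78, 20, 7, 20, 19)
)

# Split y by f, take cumsum within each piece, then put the pieces
# back into the original row order
tmp$s <- unsplit(lapply(split(tmp$y, tmp$f), cumsum), tmp$f)

# ave() wraps the same pattern in a single call
tmp$s2 <- ave(tmp$y, tmp$f, FUN = cumsum)
```

Both columns restart at the first row of each level of f, so the result does not depend on the rows being sorted by f.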
Dear Michael,
The following appears to do what you want:
> cbind(tmp, s=c(lapply(split(tmp, tmp$f),
+     function(x) cumsum(x$y)), recursive=TRUE))
       f x  y   s
1   left 1  0   0
2   left 2  0   0
3   left 3  9   9
4   left 4 10  19
5   left 5 23  42
6   left 6 45  87
7   left 7 13 100
8   left 8  2 102
9   left 9  6 108
10 right 1 10  10
11 right 2 26  36
12 right 3  9  45
13 right 4 50  95
14 right 5 78 173
15 right 6 20 193
16 right 7  7 200
17 right 8 20 220
18 right 9 19 239
I hope that this helps,
John
At 09:29 AM 7/24/2002 -0500, Michael A. Miller wrote:
>I have a question about using cumsum on subsets of a data frame.
>Suppose I have a frame that looks something like this
>
> > tmp
>       f x  y
>1   left 1  0
>2   left 2  0
>3   left 3  9
>4   left 4 10
>5   left 5 23
>6   left 6 45
>7   left 7 13
>8   left 8  2
>9   left 9  6
>10 right 1 10
>11 right 2 26
>12 right 3  9
>13 right 4 50
>14 right 5 78
>15 right 6 20
>16 right 7  7
>17 right 8 20
>18 right 9 19
>
>I'm plotting things like this with lattice
>
> > library(lattice)
> > xyplot(y ~ x | f, data=tmp)
>
>If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
>it is summed across the values of the factor f. Can anyone
>suggest a way to calculate the cumulative sum of y in this data
>frame such that it is reset for each value of f? The resulting
>frame would look like this:
>
> > tmp
>       f x  y   s
>1   left 1  0   0
>2   left 2  0   0
>3   left 3  9   9
>4   left 4 10  19
>5   left 5 23  42
>6   left 6 45  87
>7   left 7 13 100
>8   left 8  2 102
>9   left 9  6 108
>10 right 1 10  10
>11 right 2 26  36
>12 right 3  9  45
>13 right 4 50  95
>14 right 5 78 173
>15 right 6 20 193
>16 right 7  7 200
>17 right 8 20 220
>18 right 9 19 239
>
>I know how to calculate the pieces with, for example,
>
> > cumsum(tmp$y[tmp$f=='right'])
>
>but I don't know how to get this piecewise into the data frame or
>how to automate it. Any suggestions?
____________________________
John Fox
Department of Sociology
McMaster University
email: jfox at mcmaster.ca
web: http://www.socsci.mcmaster.ca/jfox
____________________________
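One thing worth checking with the c(lapply(split(...)), recursive=TRUE) approach above: it concatenates the per-group results group by group, so it lines up with the rows of tmp only because tmp is already sorted by f. The sketch below illustrates this on a reordered copy of the frame; the names mixed, by_concat and by_unsplit are made up for the example.

```r
# Rebuild the example frame from the values posted in the question
tmp <- data.frame(
  f = factor(rep(c("left", "right"), each = 9)),
  x = rep(1:9, times = 2),
  y = c(0, 0, 9, 10, 23, 45, 13, 2, 6,
        10, 26, 9, 50, 78, 20, 7, 20, 19)
)

# Reorder so that "left" and "right" rows alternate
mixed <- tmp[order(tmp$x), ]

# Concatenating the per-group cumsums gives all "left" values first,
# then all "right" values, regardless of the row order in 'mixed'
by_concat <- c(lapply(split(mixed, mixed$f),
                      function(d) cumsum(d$y)), recursive = TRUE)

# unsplit() puts each value back at the position of its own row
by_unsplit <- unsplit(lapply(split(mixed$y, mixed$f), cumsum), mixed$f)

# Sorting by f first makes the concatenated version safe to cbind()
sorted <- mixed[order(mixed$f), ]
sorted$s <- c(lapply(split(sorted, sorted$f),
                     function(d) cumsum(d$y)), recursive = TRUE)
```

On the alternating frame the two vectors differ, so either sort by f before using the concatenation or use unsplit() to restore the row order.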
On Wed, Jul 24, 2002 at 09:29:28AM -0500, Michael A. Miller wrote:
...
> I'm plotting things like this with lattice
>
> > library(lattice)
> > xyplot(y ~ x | f, data=tmp)
>
> If I plot the cumsum with xyplot( cumsum(y) ~ x | f, data=tmp),
> it is summed across the values of the factor f.

cumsum(y) is evaluated first, then xyplot.

There have been replies that suggested fiddling with the data.frame
to make a new data.frame.  You could also fiddle with the panel
argument of xyplot.  Something like...

xyplot(y~x | f, data=tmp,
       panel=function(x,y,...) {panel.xyplot(x,cumsum(y),type="l",...)})

That clips vertically - apparently, the ylim is the (quite sensible)
range of the y data.  So....

xyplot(y~x | f, data=tmp,
       panel=function(x,y,...) {panel.xyplot(x,cumsum(y),type="l",...)},
       ylim=c(min(tmp$y),sum(tmp$y)))

or, if you want the points too....

xyplot(y~x | f, data=tmp,
       panel=function(x,y,...) {
           panel.xyplot(x,cumsum(y),type="l",...)
           panel.xyplot(x,y,...)
       },
       ylim=c(min(tmp$y),sum(tmp$y)))

It won't return the calculated cumsum value, but it'll look pretty,
and work for any arrangement of your "split" variable.

Cheers

Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz
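If the per-group cumulative sum is stored as a column first, no custom panel function is needed and lattice takes the y limits from the cumulative values themselves. A minimal sketch, assuming the lattice package is installed; the column name s follows the question, and ave() (not mentioned in the thread) is used here only to build it:

```r
library(lattice)

# Rebuild the example frame from the values posted in the question
tmp <- data.frame(
  f = factor(rep(c("left", "right"), each = 9)),
  x = rep(1:9, times = 2),
  y = c(0, 0, 9, 10, 23, 45, 13, 2, 6,
        10, 26, 9, 50, 78, 20, 7, 20, 19)
)

# Cumulative sum of y, restarted for each level of f
tmp$s <- ave(tmp$y, tmp$f, FUN = cumsum)

# One panel per level of f; type = "b" draws both points and lines
p <- xyplot(s ~ x | f, data = tmp, type = "b")
print(p)
```

The trade-off against the panel-function version is that this keeps the computed values in the data frame, at the cost of adding a column.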
Thanks to all for your help with my cumsum question.  As usual, I've
not been disappointed by the generous and lightning fast support of
the R community!

Regards, Mike

-- 
Michael A. Miller                               mmiller3 at iupui.edu
  Imaging Sciences, Department of Radiology, IU School of Medicine