Dear colleagues,

I have a data set that looks roughly like this:

mydat<-data.frame(state=c(rep("Alabama", 5), rep("Delaware", 5), rep("California", 5)), news=runif(15, min=0, max=8), cum.news=rep(0, 15))

For each state, I'd like to cumulatively sum the value of "news" and put that value in cum.news.

I'm trying the following, but I get really weird results. One thing is that it keeps counting 0's as 1.

for (i in levels(mydat$state)) {
mydat[mydat$state==i, ]$cum.news<-sapply(mydat[mydat$state==i, ]$news, function(x) sum(1:x))
}

I can sort of get the same sapply function to do what I want when working on a test vector:

test<-1:10
sapply(test, function(x) sum(1:x))

Any thoughts?
Simon Kiss
*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606
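A note on the symptom described in the question: sum(1:x) adds up the integers from 1 to x, not the earlier values of "news", and the colon operator counts downwards when x is below 1. The sketch below illustrates this in base R; the vector `news` and the names `wrong` and `right` are made up for illustration and are not part of the original post.

```r
# 1:x builds the sequence 1, 2, ... up to x (or down to x when x < 1),
# so it never looks at the previous rows of "news".
sum(1:0)     # 1:0 is c(1, 0), so the sum is 1: a 0 "counts as 1"
sum(1:3.7)   # 1:3.7 is c(1, 2, 3), so the sum is 6

# On test <- 1:10 the sapply() call happens to match a running total,
# but only because the i-th element of test is i itself.
test <- 1:10
triangular <- sapply(test, function(x) sum(1:x))
all(triangular == cumsum(test))   # TRUE

# Hypothetical values: the attempted approach versus a running total.
news  <- c(2.5, 0, 1.5)
wrong <- sapply(news, function(x) sum(1:x))   # 3, 1, 1
right <- cumsum(news)                         # 2.5, 2.5, 4.0
```

cumsum() is the base R function for a running total, which is why both replies in this thread build on it.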
Hi Simon,

Is this what you want?

mydat$cum.news<-unlist(tapply(mydat$news,mydat$state,FUN=cumsum))

Weidong Gu

On Sat, Jul 23, 2011 at 7:11 AM, Simon Kiss <sjkiss at gmail.com> wrote:
> For each state, I'd like to cumulatively sum the value of "news" and put that value in cum.news.
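One caveat with the tapply() one-liner above: tapply() returns its groups in factor-level order (alphabetical here), so unlist() only lines up with the rows when the data are already sorted by state. In the example data the rows run Alabama, Delaware, California, so the Delaware and California totals would be swapped. The sketch below shows this on a small deterministic data frame; `toy`, `misaligned` and `sorted` are illustrative names, not part of the thread.

```r
# Toy data in the same state order as the original post.
toy <- data.frame(state = rep(c("Alabama", "Delaware", "California"), each = 2),
                  news  = c(1, 1, 10, 10, 100, 100))

by_level <- tapply(toy$news, toy$state, FUN = cumsum)
names(by_level)   # "Alabama" "California" "Delaware"

# unlist() yields Alabama, California, Delaware ...
misaligned <- unname(unlist(by_level))   # 1 2 100 200 10 20
# ... while the rows are Alabama, Delaware, California,
# so the Delaware rows receive the California totals.
data.frame(toy, misaligned)

# Sorting by state first makes row order and level order agree.
sorted <- toy[order(toy$state), ]
sorted$cum.news <- unname(unlist(tapply(sorted$news, sorted$state, FUN = cumsum)))
```

If reordering the rows is not acceptable, a function that writes results back in the original row order is the safer choice.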
On Jul 23, 2011, at 13:11 , Simon Kiss wrote:

> Dear colleagues, I have a data set that looks roughly like this;
> mydat<-data.frame(state=c(rep("Alabama", 5), rep("Delaware", 5), rep("California", 5)), news=runif(15, min=0, max=8), cum.news=rep(0, 15))
>
> For each state, I'd like to cumulatively sum the value of "news" and make that put that value in cum.news.

Like this?

> mydat <- within(mydat, cum.news <- ave(news, state, FUN=cumsum))
> mydat
        state      news  cum.news
1     Alabama 7.9914863  7.991486
2     Alabama 7.3751514 15.366638
3     Alabama 3.4894295 18.856067
4     Alabama 3.1543811 22.010448
5     Alabama 7.9720879 29.982536
6    Delaware 2.3904745  2.390475
7    Delaware 5.5532841  7.943759
8    Delaware 5.4182249 13.361984
9    Delaware 4.6554645 18.017448
10   Delaware 3.1289714 21.146419
11 California 7.9450424  7.945042
12 California 2.0142029  9.959245
13 California 7.9735398 17.932785
14 California 1.0972878 19.030073
15 California 0.7215365 19.751609

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg
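For readers wondering why ave() needs no sorting: it applies FUN within each group and writes each result back to the positions its inputs came from. The sketch below reproduces that by hand with split(), lapply() and unsplit() on deliberately interleaved rows. It is an approximation of the idea, not a copy of ave()'s source, and `toy`, `g`, `by_hand` and `via_ave` are illustrative names.

```r
# Interleaved rows, to show that the groups need not be contiguous.
toy <- data.frame(state = c("Alabama", "Delaware", "Alabama", "Delaware"),
                  news  = c(1, 10, 2, 20))
g <- factor(toy$state)

# Split by group, take the running total per group, put values back in place.
pieces  <- split(toy$news, g)      # Alabama: 1, 2   Delaware: 10, 20
pieces  <- lapply(pieces, cumsum)  # Alabama: 1, 3   Delaware: 10, 30
by_hand <- unsplit(pieces, g)      # 1 10 3 30, in the original row order

# The one-call equivalent used in the reply above.
via_ave <- ave(toy$news, toy$state, FUN = cumsum)
all(by_hand == via_ave)            # TRUE
```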