Hello world,

I am currently transferring a course in data management for students in biology, geography and agriculture from Statistica to R - it works surprisingly well. If anyone is interested in my scratch/notepad (in German), please see

www.hydrology.uni-kiel.de/~schorsch/statistik/statistik_datenauswertung.pdf

(pages 40-52)

The dataset is:

www.hydrology.uni-kiel.de/~schorsch/statistik/erle_stat.csv

It contains a 10 year dataset. So much for the introduction, now comes the problem:

We often need cumulative *annual* sums (sunshine, precipitation), i.e. the sum must reset to 0 at the beginning of the year. I know of cumsum(), but I do not know how to split the dataset automagically into annual pieces so I can cumsum() every year separately. I have the strong hope that the solution is one of these one-liners which leave the students with eyes wide open in surprise and make them true believers in the power of the command-line 8-).

Thanks & Greetings
Georg

-- 
Georg Hoermann, Luebeck, Germany
Tel. 0451/47 70 32, 0172/431 57 15, Penguin #189476
Maybe something like this:

  dat <- data.frame(years=rep(1994:2004, each=10), x=rnorm(110), y=rnorm(110))
  lapply(split(dat[,-1], dat$years), cumsum)

is what you want. I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Georg Hoermann" <georg.hoermann at gmx.de>
To: <r-help at stat.math.ethz.ch>
Sent: Thursday, February 10, 2005 11:01 AM
Subject: [R] Annual cumulative sums from time series

> we often need cumulative *annual* sums (sunshine, precipitation),
> i.e. the sum must reset to 0 at the beginning of the year. I know
> of cumsum(), but I do not know how to split the dataset automagically
> into annual pieces so I can cumsum() every year separately.
> [...]
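The split/lapply call above returns a list with one data frame per year. If the cumulative sums are wanted as extra columns next to the original rows, one possible way is to reassemble the pieces with unsplit(). This is only a sketch on made-up toy data; the data frame and the "cum_" column prefix are assumptions for illustration, not part of the original reply.

```r
# Toy data (hypothetical): 3 years with 4 observations each
dat <- data.frame(years = rep(2001:2003, each = 4),
                  x = 1:12,
                  y = rep(c(0.5, 1.5), times = 6))

# Cumulative sums per year, as in the reply above
pieces <- lapply(split(dat[, -1], dat$years), cumsum)

# unsplit() puts the per-year pieces back into the original row order
cum <- unsplit(pieces, dat$years)
names(cum) <- paste0("cum_", names(cum))   # hypothetical column names

result <- cbind(dat, cum)
print(result)
```

Each cum_ column restarts at the first observation of a year, so the result can be plotted or exported together with the raw values.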
Hi Georg

If you know which value belongs to which year, you can use the tapply, by, aggregate family, for example:

  years <- rep(c(2000, 2001, 2002, 2003), each=10)
  values <- 1:40
  test <- data.frame(years, values)
  # sum of the values within each year
  aggregate(test$values, list(y=test$years), sum)

Cheers
Petr

On 10 Feb 2005 at 11:01, Georg Hoermann wrote:

> we often need cumulative *annual* sums (sunshine, precipitation),
> i.e. the sum must reset to 0 at the beginning of the year. I know
> of cumsum(), but I do not know how to split the dataset automagically
> into annual pieces so I can cumsum() every year separately.
> [...]

Petr Pikal
petr.pikal at precheza.cz
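Note that aggregate() with sum returns one total per year rather than a running sum. For a running sum that restarts each year, ave() from base R can apply cumsum() within each group and return a vector of the same length as the input. The sketch below reuses the toy data from the reply; the column name cumvalues is made up for illustration.

```r
years  <- rep(c(2000, 2001, 2002, 2003), each = 10)
values <- 1:40
test   <- data.frame(years, values)

# ave() applies FUN within each year and keeps the original row order
test$cumvalues <- ave(test$values, test$years, FUN = cumsum)

# For comparison: the annual totals from aggregate()
totals <- aggregate(test$values, list(y = test$years), sum)

# The last cumulative value of each year equals that year's total
test$cumvalues[c(10, 20, 30, 40)]
totals$x
```

ave() needs FUN passed by name, because all unnamed arguments after the first are treated as grouping variables.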
On Thu, 10 Feb 2005, Georg Hoermann wrote:

> we often need cumulative *annual* sums (sunshine, precipitation),
> i.e. the sum must reset to 0 at the beginning of the year. I know
> of cumsum(), but I do not know how to split the dataset automagically
> into annual pieces so I can cumsum() every year separately.
> [...]

> kiel <- read.csv(url("http://www.hydrology.uni-kiel.de/~schorsch/statistik/erle_stat.csv"))
> yrs <- factor(strftime(strptime(as.character(kiel$DATUM), "%d.%m.%Y"),
+   "%Y")) # this is clumsy, there are probably better ways
> tpks1 <- unlist(tapply(kiel$Sonnen, yrs, cumsum))
> str(tpks1)
 Named num [1:3652] 0.12 0.24 0.24 0.24 5.52 ...
 - attr(*, "names")= chr [1:3652] "19891" "19892" "19893" "19894" ...
> kiel$Sonnen[1:5]
[1] 0.12 0.12 0.00 0.00 5.28

I don't know if you want the name attribute on your annual cumulative sums, but for now they help to check what's going on.
This looks OK:

> try1 <- tapply(kiel$Sonnen, yrs, cumsum)
> cols <- rainbow(length(try1))
> plot(x=c(1,366), y=c(0,1200), type="n")
> for (i in 1:length(try1)) lines(1:length(try1[[i]]), try1[[i]],
+   col=cols[i])
> legend(c(0,70), c(700,1100), names(try1), col=cols, lwd=1)

for a quick impression.

-- 
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
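On the remark that the year extraction is clumsy: one alternative that may read a little more easily is to parse the dates with as.Date() and format just the year. The sketch below uses a few made-up dates and sunshine values in the day.month.year layout implied by the "%d.%m.%Y" format above; it does not read the real file.

```r
# Hypothetical dates and sunshine hours, in the same layout as DATUM
datum  <- c("30.12.1989", "31.12.1989", "01.01.1990", "02.01.1990")
sonnen <- c(1.0, 2.5, 0.5, 3.0)

# Parse the dates once, then keep only the four-digit year
yrs <- factor(format(as.Date(datum, format = "%d.%m.%Y"), "%Y"))

# Cumulative sum within each year, flattened back to a plain vector
cums <- unlist(tapply(sonnen, yrs, cumsum), use.names = FALSE)
cums
```

As with the original version, unlist() returns the values grouped by year level, which matches the row order only when the data are already sorted by date.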
Georg Hoermann [mailto:georg.hoermann at gmx.de] wrote:

> www.hydrology.uni-kiel.de/~schorsch/statistik/erle_stat.csv ...contains a 10 year dataset.
> We often need cumulative *annual* sums (sunshine, precipitation), i.e. the sum must reset to 0
> at the beginning of the year. I know of cumsum(), but I do not know how to split the dataset
> automagically into annual pieces so I can cumsum() every year separately.

Several replies have been given using tapply/aggregate/split; here is a more primitive approach. I do the cumsum once for the whole series, and call it "x". Then I subtract from x the value of x on 12/31 of the prior year:

R> y <- read.csv("erle_stat.csv", as.is=TRUE)
R> year <- substring(y$DATUM, 7)
R> x <- cumsum(y$Peff)
R> y$cumPeff <- x - c(0,x)[match(year, year)]

-- David Brahm (brahm at alum.mit.edu)
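To see why the last line works, it may help to step through it on a tiny made-up series. match(year, year) returns, for every row, the index of the first row of the same year, and c(0, x) shifts the running total by one position so that this index picks up the total just before the year began. The vectors below are hypothetical and assume the rows are sorted by date.

```r
# Hypothetical values for two short "years", sorted by date
year <- c("2001", "2001", "2001", "2002", "2002")
peff <- c(1, 2, 3, 10, 20)

x      <- cumsum(peff)        # running total over everything: 1 3 6 16 36
first  <- match(year, year)   # first row of each row's year:  1 1 1 4 4
offset <- c(0, x)[first]      # total before that first row:   0 0 0 6 6

cum_annual <- x - offset      # restarts each year:            1 3 6 10 30
cum_annual
```

The approach needs no splitting at all, but it does rely on the series being in chronological order.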
Georg Hoermann
2005-Feb-10 20:26 UTC
[R] Annual cumulative sums from time series - SOLVED + questions
Roger Bivand wrote:
> On Thu, 10 Feb 2005, Georg Hoermann wrote:
>

Hello world,

based on Roger's code I now have two solutions. The first one draws a single line for the whole dataset:

--------- cut here ------
erle <- read.csv(url("http://www.hydrology.uni-kiel.de/~schorsch/statistik/erle_stat.csv"))
jahre <- factor(substring(erle$DATUM, 7))
tpks1 <- unlist(tapply(erle$Sonnen, jahre, cumsum))
plot(tpks1, type="l")
--------- cut here -------

The second one plots one line for each year:

---- start - cut here ----
# read data in from Internet
erle <- read.csv(url("http://www.hydrology.uni-kiel.de/~schorsch/statistik/erle_stat.csv"))
# extract year as a factor from variable DATUM
jahre <- factor(substring(erle$DATUM, 7))
try1 <- tapply(erle$Sonnen, jahre, cumsum)
# create colors for every year from rainbow color scheme
cols <- rainbow(length(try1))
plot(x=c(1,366), y=c(0,1200), type="n", xlab="Days", ylab="Cumulative sunshine (h)")
# draw the lines, one line for each year
for (i in 1:length(try1)) lines(1:length(try1[[i]]), try1[[i]], col=cols[i])
# ...and the legend
legend(c(0,100), c(400,1100), names(try1), col=cols, lwd=1, bty="n")
---- end cut here -----

For the second example, a mean sum for all years would also be a good idea...

Thanks for all solutions...

Thanks & regards
Georg

-- 
Georg Hoermann, Dep. of Hydrology, Ecology, Kiel University, Germany
Tel. 0431-880-1207, icq - 348 340 729, 0172/4315715, Penguin #189476
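The mean curve mentioned at the end could perhaps be computed along the following lines. This is a rough sketch, not a tested solution for the real file: try1 here is a small hypothetical stand-in for the list returned by tapply(erle$Sonnen, jahre, cumsum), with one "year" a day longer than the others to mimic leap years.

```r
# Hypothetical stand-in for tapply(erle$Sonnen, jahre, cumsum)
try1 <- list("2001" = cumsum(c(1, 2, 3)),
             "2002" = cumsum(c(2, 2, 2, 2)),
             "2003" = cumsum(c(3, 0, 3)))

# Pad every year with NA up to the longest year (366 days for real data)
n   <- max(sapply(try1, length))
mat <- sapply(try1, function(v) c(v, rep(NA, n - length(v))))

# Mean cumulative sum per day of the year, ignoring the padded days
mean_cum <- rowMeans(mat, na.rm = TRUE)
mean_cum
```

With the real data, lines(seq_along(mean_cum), mean_cum, lwd=2) after the plotting loop should add the mean curve to the existing plot. The value for day 366 would be based on leap years only, so it may be better to drop it.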