Jason Gilmore
2012-Aug-27 13:53 UTC
[R] How to average time series data around regular intervals
Hi,

I'm pretty new to R and have run into a task which, although I'm certain it is within R's capabilities, falls outside of mine. :-) Consider the following data set:

2012-07-22 12:12:00, 21
2012-07-22 12:15:00, 22
2012-07-22 12:18:00, 24
2012-07-22 12:39:00, 21
2012-07-22 12:45:00, 25
2012-07-22 12:49:00, 26
2012-07-22 12:53:00, 20
2012-07-22 13:00:00, 18
2012-07-22 13:06:00, 22

My task involves creating a data set which *averages* these values at a resolution of 15 minutes, meaning that I need to average the values falling within 7.5 minutes of a 15 minute increment. Therefore, given the above data set, I need to treat it as three groups:

2012-07-22 12:12:00, 21
2012-07-22 12:15:00, 22
2012-07-22 12:18:00, 24

2012-07-22 12:39:00, 21
2012-07-22 12:45:00, 25
2012-07-22 12:49:00, 26

2012-07-22 12:53:00, 20
2012-07-22 13:00:00, 18
2012-07-22 13:06:00, 22

The end result should look like this:

2012-07-22 12:15:00, 22.33
2012-07-22 12:30:00, NA  <- Because this 15 minute slot did not previously exist
2012-07-22 12:45:00, 24
2012-07-22 13:00:00, 20

Any help much appreciated. I've been working on this for several hours with little success. I'm able to identify the missing (NA) value using zoo/xts but can't seem to sort out the averaging matter.

Thanks so much!
Jason
jim holtman
2012-Aug-27 17:40 UTC
[R] How to average time series data around regular intervals
Try this:

> x <- read.table(text = "2012-07-22 12:12:00, 21
+ 2012-07-22 12:15:00, 22
+ 2012-07-22 12:18:00, 24
+ 2012-07-22 12:39:00, 21
+ 2012-07-22 12:45:00, 25
+ 2012-07-22 12:49:00, 26
+ 2012-07-22 12:53:00, 20
+ 2012-07-22 13:00:00, 18
+ 2012-07-22 13:06:00, 22", colClasses = c("POSIXct", "integer"), sep = ',')
> # get minimum at an hour granularity
> tMin <- trunc(min(x$V1), units = 'hour')
> # back off 7.5 minutes
> tMin <- tMin - (7.5 * 60)
> # create sequence for 'cut'
> cSeq <- seq(tMin, max(x$V1) + (7.5 * 60), by = '15 min')
> # now split and average
> cCut <- cut(x$V1, cSeq)
> # compute means
> tapply(x$V2, cCut, mean)
2012-07-22 11:52:30 2012-07-22 12:07:30 2012-07-22 12:22:30 2012-07-22 12:37:30
                 NA            22.33333                  NA            24.00000
2012-07-22 12:52:30
           20.00000

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
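
A note on the output above: cut() labels each window by its start time, so the window labelled 12:07:30 is the one centred on 12:15:00, and the leading 11:52:30 window is empty only because the sequence starts at the truncated hour. If the result should be labelled by the slot centres, as in the original question, one possible variation is to round each timestamp to its slot with plain arithmetic. This is only a sketch, not the method from the reply: the column names time and value, the variable names, and the UTC time zone are assumptions made for the example.

```r
# Same nine observations as in the thread. Parsing them as UTC is an
# assumption; the original post does not state a time zone.
x <- data.frame(
  time = as.POSIXct(c("2012-07-22 12:12:00", "2012-07-22 12:15:00",
                      "2012-07-22 12:18:00", "2012-07-22 12:39:00",
                      "2012-07-22 12:45:00", "2012-07-22 12:49:00",
                      "2012-07-22 12:53:00", "2012-07-22 13:00:00",
                      "2012-07-22 13:06:00"), tz = "UTC"),
  value = c(21, 22, 24, 21, 25, 26, 20, 18, 22)
)

step <- 15 * 60  # slot width in seconds

# Shift by half a slot, then floor: each time maps to the centre of the
# half-open window [centre - 7.5 min, centre + 7.5 min).
secs   <- as.numeric(x$time)
centre <- floor((secs + step / 2) / step) * step

# Regular grid from the first to the last occupied slot. Using it as the
# factor levels makes tapply() return NA for slots with no observations.
grid  <- seq(min(centre), max(centre), by = step)
slot  <- factor(centre, levels = grid)
means <- tapply(x$value, slot, mean)

result <- data.frame(
  time = as.POSIXct(grid, origin = "1970-01-01", tz = "UTC"),
  mean = as.vector(means)
)
print(result)
```

With the data from the thread this gives four rows, for 12:15, 12:30, 12:45 and 13:00, with means 22.33, NA, 24 and 20. A timestamp lying exactly on a window edge (for example 12:22:30) goes to the later slot, which matches the left-closed intervals that cut() uses by default for date-times.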