Ted Byers
2009-Jun-26 20:57 UTC
[R] Where can I find information on how to subsample a time series?
I suspect I'm looking in the wrong places, so guidance to the relevant documentation would be as welcome as a little code snippet.

I have time series data stored in a MySQL database. There is the usual DATE field, along with a double precision number: there are daily values (including only normal working days: Monday through Friday). I actually have to do a couple of things here. Because of how the result is to be used, I need to first create two time series. The first is the delta between 22 working days, and the second is the delta between 66 working days. I have hundreds of these datasets, and some go back 30 years. I need to estimate the correlation between 22 day deltas (i.e. is the delta for one month correlated with that of the previous month) and between the 22 day delta and the 66 day delta that ends the day before the first day of the 22 day delta. However, I KNOW the statistical properties of the time series are not constant (so the usual assumptions do not apply to the entire series). Therefore, I want to subsample finely enough to get a reasonably sensible correlation and examine how that changes through time. (There are no tests of significance here: I just want to explore how much the properties of these series change through time).

I have C++ code, admittedly not written particularly efficiently, that does this. The question is, is it possible to do this reasonably efficiently using R?

Thanks

Ted
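For readers following along, the two delta series described above could be built in base R roughly as follows. This is only a sketch under stated assumptions: the daily values are assumed to be already loaded from the database into a plain numeric vector `p` (one entry per working day, oldest first), the helper names `lag_by` and `make_deltas` are hypothetical, and "ends the day before the first day of the 22 day delta" is read as a 66 day delta ending 23 working days before the end of the current 22 day delta.

```r
# Shift a vector k positions later in time, padding the start with NA,
# so that lag_by(x, k)[t] equals x[t - k].
lag_by <- function(x, k) c(rep(NA_real_, k), head(x, -k))

# Build the aligned delta series, indexed by the END day of the current
# 22 day delta. Returns a data frame with one row per day in p.
make_deltas <- function(p) {
  stopifnot(is.numeric(p), length(p) >= 90)
  d22 <- c(rep(NA_real_, 22), diff(p, lag = 22))  # p[t] - p[t - 22]
  d66 <- c(rep(NA_real_, 66), diff(p, lag = 66))  # p[t] - p[t - 66]
  data.frame(
    d22      = d22,
    d22_prev = lag_by(d22, 22),  # previous month's 22 day delta
    d66_prev = lag_by(d66, 23)   # 66 day delta ending 23 days earlier (assumed reading)
  )
}

# Simulated stand-in for one series (assumption: real data would come from MySQL).
set.seed(1)
p <- cumsum(rnorm(500))
deltas <- make_deltas(p)

# Whole-sample correlations over the rows where every delta is defined.
ok <- complete.cases(deltas)
cor(deltas$d22[ok], deltas$d22_prev[ok])
cor(deltas$d22[ok], deltas$d66_prev[ok])
```

Padding with `NA` keeps every column the same length as `p`, so the rows stay aligned with the original dates and the same columns can later be passed to a moving-window correlation.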
Whit Armstrong
2009-Jun-26 22:32 UTC
[R] Where can I find information on how to subsample a time series?
assuming you pull the data you want into x and y:

whit@ubuntu:~$ R
> library(fts)
> x <- fts()
> y <- fts()
> xy.cor.200 <- moving.cor(x,y,200)
> tail(xy.cor.200)[,1]
2012-03-12 -0.3009635
2012-03-13 -0.2923489
2012-03-14 -0.2824015
2012-03-15 -0.2662689
2012-03-16 -0.2566354
2012-03-17 -0.2537089
2012-03-18 -0.2490421
2012-03-19 -0.2391911
2012-03-20 -0.2263381
2012-03-21 -0.2113029
>

which is just using C++ to do the calculation.

here is the template function for correlation that fts uses:
http://github.com/armstrtw/tslib/blob/5b0fe2fc5ecb393d1dca097c2c19008227eb6c7e/tslib/vector.summary/cor.hpp

-Whit
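To make explicit what a moving correlation of this kind computes, here is a plain base R sketch. It is not the fts or tslib implementation, and `rolling_cor` is a hypothetical name; it simply applies `cor()` to each trailing window of `width` observations and will be much slower than compiled code on long series.

```r
# Trailing-window correlation: element t holds cor() of the `width`
# observations ending at t; the first width - 1 elements are NA.
rolling_cor <- function(x, y, width) {
  stopifnot(length(x) == length(y), width >= 2, width <= length(x))
  n <- length(x)
  out <- rep(NA_real_, n)
  for (end in width:n) {
    idx <- (end - width + 1):end
    # With cor()'s default settings, a window containing NA yields NA.
    out[end] <- cor(x[idx], y[idx])
  }
  out
}

# Simulated example data (assumption: two already-aligned numeric series).
set.seed(42)
x <- rnorm(300)
y <- 0.5 * x + rnorm(300)
xy_cor_200 <- rolling_cor(x, y, 200)
tail(xy_cor_200)
```

Plotting the result against the dates is one way to see how the correlation drifts over time; the window width trades smoothness of the estimate against how quickly it responds to changes.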