Hi,

I found that apply.monthly() in xts does not work as I expected in the case of a sparse timeseries:

my.dates <- as.Date(c("1992-06-01", "1992-06-24", "1992-06-30", "1993-06-22", "1994-06-07", "1995-06-08"))
my.xts <- xts(1:6, my.dates)
start(my.xts)  # "1992-06-01"
end(my.xts)  # "1995-06-08"
apply.monthly(my.xts, mean)
#            [,1]
# 1995-06-08  3.5

The endpoints it chooses are based on looking at the month (June) alone. I was able to get a value for each (month, year) in the timeseries with the following use of aggregate():

my.months <- months(my.dates)
my.years <- years(my.dates)
df1 <- data.frame(x = coredata(my.xts), dates = my.dates, months = my.months, years = my.years)
df2 <- aggregate(df1[-c(3,4)], df1[c("months", "years")], mean)
xts(df2$x, df2$dates)
#            [,1]
# 1992-06-18    2
# 1993-06-22    4
# 1994-06-07    5
# 1995-06-08    6

Two questions:
1) Is there a more elegant way to do this?
2) Shouldn't the xts documentation discuss the problem of sparse data?

Regards,
Scott Waichler
Pacific Northwest National Laboratory
Richland, WA USA
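For comparison, the per-(month, year) grouping used in the aggregate() workaround can also be sketched in base R alone, by building a single "YYYY-MM" key and passing it to tapply(). This is only an illustration on the example data; the names values, ym and monthly.mean are made up here, and the result is a named vector of means rather than an xts object.

```r
# Example data from the message above: six observations, all in June.
my.dates <- as.Date(c("1992-06-01", "1992-06-24", "1992-06-30",
                      "1993-06-22", "1994-06-07", "1995-06-08"))
values <- 1:6

# One grouping key per observation that encodes both year and month,
# so June 1992 and June 1993 fall into different groups.
ym <- format(my.dates, "%Y-%m")

# Mean of the values within each (year, month) group.
monthly.mean <- tapply(values, ym, mean)
print(monthly.mean)
```

Grouping on a combined year-month key is what keeps the four Junes apart; grouping on the month name alone would pool all six observations into one.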
On Thu, Mar 9, 2017 at 3:31 PM, Waichler, Scott R <Scott.Waichler at pnnl.gov> wrote:
> I found that apply.monthly() in xts does not work as I expected in the case of a sparse timeseries:
>
> apply.monthly(my.xts, mean)
> #            [,1]
> # 1995-06-08  3.5
>
> The endpoints it chooses are based on looking at the month (June) alone.

Thanks for the minimal, reproducible example! This is clearly a bug.

> Two questions:
> 1) Is there a more elegant way to do this?

Create your own endpoints until endpoints() is fixed. Here's a quick hack, off the top of my head:

endpointsMonthHack <- function(x, on = "months", k = 1) {
  # yearmon index
  ymIndex <- as.yearmon(index(x))
  # month changes
  monthDiff <- c(0, diff(ymIndex))
  # locations in index
  locations <- which(monthDiff != 0)
  ep <- c(0, locations, nrow(x))
  unique(ep)
}

> 2) Shouldn't the xts documentation discuss the problem of sparse data?

No, because it shouldn't be a problem. :)

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
R/Finance 2017 | www.rinfinance.com
On Thu, Mar 9, 2017 at 3:46 PM, Joshua Ulrich <josh.m.ulrich at gmail.com> wrote:
> Thanks for the minimal, reproducible example! This is clearly a bug.

Now formally documented as such:
https://github.com/joshuaulrich/xts/issues/169

> Create your own endpoints until endpoints() is fixed. Here's a quick
> hack, off the top of my head:
>
> endpointsMonthHack <- function(x, on = "months", k = 1) {
>   # yearmon index
>   ymIndex <- as.yearmon(index(x))
>   # month changes
>   monthDiff <- c(0, diff(ymIndex))
>   # locations in index
>   locations <- which(monthDiff != 0)
>   ep <- c(0, locations, nrow(x))
>   unique(ep)
> }

The function above is wrong. That's what I get for posting without actually running the code.
Here's a function that's actually tested (only on this example though):

endpointsMonthHack <- function(x, on = "months", k = 1) {
  # yearmon index
  ymIndex <- as.yearmon(index(x))
  # month change locations
  locations <- which(diff(ymIndex) != 0)
  # endpoints
  ep <- c(0, locations, nrow(x))
  unique(ep)
}

-- 
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
R/Finance 2017 | www.rinfinance.com
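To make the difference between the two versions of endpointsMonthHack() concrete, here is a rough base-R sketch of the same endpoint logic on the example data. It is an illustration only, not xts code: month_id() is a hypothetical helper standing in for as.yearmon(), and the vapply() loop is just an assumed stand-in for what applying a function between endpoints amounts to.

```r
# Example data from the thread: six observations, all in June.
my.dates <- as.Date(c("1992-06-01", "1992-06-24", "1992-06-30",
                      "1993-06-22", "1994-06-07", "1995-06-08"))
values <- 1:6

# Hypothetical helper: a month counter that also encodes the year,
# playing the role of as.yearmon() in the functions above.
month_id <- function(d) {
  lt <- as.POSIXlt(d)
  (lt$year + 1900) * 12 + lt$mon
}
id <- month_id(my.dates)

# Tested version: diff() has length n - 1, so which() marks the LAST
# row of each month.
ep <- unique(c(0, which(diff(id) != 0), length(id)))

# First version: padding diff() with a leading 0 shifts every location
# by one, so which() marks the FIRST row of the following month.
ep_shifted <- unique(c(0, which(c(0, diff(id)) != 0), length(id)))

# Apply mean() to rows (ep[i] + 1):ep[i + 1] for each pair of endpoints.
monthly_mean <- vapply(
  seq_along(ep[-1]),
  function(i) mean(values[(ep[i] + 1):ep[i + 1]]),
  numeric(1)
)

print(ep)           # endpoints from the tested logic
print(ep_shifted)   # endpoints from the first, shifted logic
print(monthly_mean)
```

With xts loaded, endpoints built this way would be passed on as period.apply(my.xts, endpointsMonthHack(my.xts), mean), which is the intended use of the workaround until endpoints() itself is fixed.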