Consider the following scrap of code:

> x <- ts(1:50, start=c(1,11), freq=12)
> y <- aggregate(x, nfreq=4)
> c(y)
 [1]   6  15  24  33  42  51  60  69  78  87  96 105 114 123 132 141
> y
Error in rep.int("", start.pad) : invalid number of copies in rep.int()
> tsp(y)
[1] 1.833333 5.583333 4.000000

So we can aggregate into quarters, but we cannot print the result using print.ts. Even if print.ts cannot line the series up in columns as it normally does for quarterly data, we would expect it to behave as it does when we aggregate into thirds:

> y3 <- aggregate(x, nfreq=3)
> y3
Time Series:
Start = 1.83333333333333
End = 5.5
Frequency = 3
 [1]  10  26  42  58  74  90 106 122 138 154 170 186

And don't tell me that aggregating a monthly series into quarters makes no sense!! (see response to Bug 9798).

Laimonis Kavalieris
On Wed, 25 Jul 2007, laimonis wrote:

> Consider the following scrap of code:

...slightly modified to
  x1 <- ts(1:24, start = c(2000, 10), freq = 12)
  x2 <- ts(1:24, start = c(2000, 11), freq = 12)
and then
  y1 <- aggregate(x1, nfreq = 4)
gives the desired result while
  y2 <- aggregate(x2, nfreq = 4)
probably does not do what you would like it to do. In both cases, the 24 observations are broken into 8 segments of equal length (as documented on ?aggregate.ts) and then aggregated. Therefore
  as.vector(y1)
  as.vector(y2)
are identical (and not matched by quarter...as you would probably like).

> And don't tell me that aggregating a monthly series into quarters
> makes no sense!! (see response to Bug 9798).

1. Your tone is not appropriate.
2. You're not quoting the reply correctly. It said:
   "You cannot aggregate a time series that does not run over quarters into quarters. The speculation is plain wrong."
   The reply means that aggregate.ts() does not do what you think it does, as I tried to illustrate with the example above.

One can probably argue about whether it makes sense to aggregate a monthly time series into quarters when one does not have complete observations in each quarter. But it might be worth considering a change in aggregate.ts() so that the old and new frequencies are matched even with incomplete observations.

Currently, the "zoo" implementation allows this. Coercing back and forth gives:
  library("zoo")
  z1 <- as.ts(aggregate(as.zoo(x1), as.yearqtr, sum))
  z2 <- as.ts(aggregate(as.zoo(x2), as.yearqtr, sum))
where z1 is identical to y1, and z2 is what you probably want.

hth,
Z
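The point about equal-length segments can be checked directly in base R. The following is a minimal sketch, not part of the original message: it assumes only the stats package, spells out `frequency` and `nfrequency` in full, and avoids printing y2 because the print method is what fails in the reported R version.

```r
# Two monthly series with the same values but different start months.
x1 <- ts(1:24, start = c(2000, 10), frequency = 12)  # starts on a quarter boundary
x2 <- ts(1:24, start = c(2000, 11), frequency = 12)  # starts one month later

y1 <- aggregate(x1, nfrequency = 4)
y2 <- aggregate(x2, nfrequency = 4)

# aggregate.ts() sums consecutive blocks of three observations counted from
# the first one, so both results hold the same numbers ...
all(as.vector(y1) == as.vector(y2))

# ... which are simply the column sums of the data laid out three per column.
all(as.vector(y1) == colSums(matrix(1:24, nrow = 3)))

# Only the time attributes differ: y1 starts at 2000.75 (Oct 2000), while y2
# starts at 2000.833 (Nov 2000), which is not a calendar-quarter boundary.
tsp(y1)
tsp(y2)
```

The data are never regrouped by calendar quarter; only the `tsp` attribute carries the shifted start.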
(moved from r-help)

Achim Zeileis wrote:

>On Wed, 25 Jul 2007, laimonis wrote:
>
>>Consider the following scrap of code:
>
>...slightly modified to
>  x1 <- ts(1:24, start = c(2000, 10), freq = 12)
>  x2 <- ts(1:24, start = c(2000, 11), freq = 12)
>and then
>  y1 <- aggregate(x1, nfreq = 4)
>gives the desired result while
>  y2 <- aggregate(x2, nfreq = 4)
>probably does not what you would like it to do.

I've been caught by this before, and complained before. It does not do what most people who work with economic time series would expect. (One might argue that not all time series are economic, but other time series don't usually fit with ts very well.) At the very least aggregate should issue a warning. Quarterly observations are for quarters of the year, so just arbitrarily grouping in threes beginning with the first observation is *extremely* misleading, even if it is documented.

[ BTW, there is a bug in the print method here (R-2.5.1 on Linux):

> y2 <- aggregate(x2, nfreq = 4)
> y2
Error in rep.int("", start.pad) : invalid number of copies in rep.int()
> traceback()
5: rep.int("", start.pad)
4: as.vector(data)
3: matrix(c(rep.int("", start.pad), format(x, ...), rep.int("", end.pad)),
       nc = fr.x, byrow = TRUE, dimnames = list(dn1, dn2))
2: print.ts(c(6L, 15L, 24L, 33L, 42L, 51L, 60L, 69L))
1: print(c(6L, 15L, 24L, 33L, 42L, 51L, 60L, 69L))
]

....

>Currently, the "zoo" implementation allows this: Coercing back and forth
>gives:
>  library("zoo")
>  z1 <- as.ts(aggregate(as.zoo(x1), as.yearqtr, sum))
>  z2 <- as.ts(aggregate(as.zoo(x2), as.yearqtr, sum))

This is better, but still potentially misleading. I would prefer a default NA when only some of the observations are available for a quarter (and the syntax is a bit cumbersome for something one needs to do fairly often).

Paul

>where z1 is identical to y1, and z2 is what you probably want.
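The NA-by-default behaviour Paul asks for can be sketched in base R. This is an illustration only: `quarterly_sum_strict` is a hypothetical helper name, not a function in base R, zoo, or fame, and it assumes a regular monthly `ts` with no internal gaps.

```r
# Hypothetical helper (an assumption for illustration, not an existing API):
# sum a monthly ts into calendar quarters, giving NA for any quarter that
# does not contain all three of its months.
quarterly_sum_strict <- function(x) {
  stopifnot(is.ts(x), frequency(x) == 12)
  # Calendar year of each observation; the small tolerance guards against
  # floating-point error in time() before flooring.
  yr  <- as.integer(floor(as.numeric(time(x)) + 1e-8))
  # Calendar quarter (1-4) derived from the month number given by cycle().
  qtr <- (as.integer(cycle(x)) - 1L) %/% 3L + 1L
  key <- yr * 4L + (qtr - 1L)          # one integer id per calendar quarter
  sums   <- tapply(as.numeric(x), key, sum)
  counts <- tapply(as.numeric(x), key, length)
  sums[counts < 3] <- NA               # incomplete quarters become NA
  ts(as.vector(sums), start = c(yr[1], qtr[1]), frequency = 4)
}

x2 <- ts(1:24, start = c(2000, 11), frequency = 12)
q2 <- quarterly_sum_strict(x2)
as.vector(q2)   # NA 12 21 30 39 48 57 66 NA
```

Here 2000Q4 (only Nov and Dec) and 2002Q4 (only Oct) come out as NA, while the complete quarters are summed by calendar position rather than by position in the vector.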
Your troubles with 'aggregate' for a ts are one of the reasons I created the 'tis' and 'ti' classes in the fame package. If you do this:

> x1 <- tis(1:24, start = c(2000, 10), freq = 12)
> x2 <- tis(1:24, start = c(2000, 11), freq = 12)
> y1 <- aggregate(x1, nfreq = 4)
> y2 <- aggregate(x2, nfreq = 4)
> x1
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000                                       1   2   3
2001   4   5   6   7   8   9  10  11  12  13  14  15
2002  16  17  18  19  20  21  22  23  24
class: tis
> x2
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000                                           1   2
2001   3   4   5   6   7   8   9  10  11  12  13  14
2002  15  16  17  18  19  20  21  22  23  24
class: tis
> y1
     Qtr1 Qtr2 Qtr3 Qtr4
2000                   6
2001   15   24   33   42
2002   51   60   69
class: tis
> y2
     Qtr1 Qtr2 Qtr3 Qtr4
2001   12   21   30   39
2002   48   57   66
class: tis

Everything pretty much works as you would expect. One thing to notice is that, even using a 'tis' rather than a 'ts', aggregate will only sum up the monthly observations for a quarter if all three of the months are there. That's why y2 starts with 2001Q1, rather than 2000Q4.

If you really want the 2000Q4 observation to be the sum of the first two x2 months, the convert() function in fame can handle that.

> convert(x2, tif = "quarterly", observed = "summed", ignore = T)
          Qtr1      Qtr2      Qtr3      Qtr4
2000                                4.033333
2001 12.000000 21.000000 30.000000 39.000000
2002 48.000000 57.000000 66.000000 71.225806
class: tis

Now back to ts.
If you look deeper into what's happening here:

> y3 <- aggregate(as.ts(x2), nf = 4)
> y3
Error in rep.int("", start.pad) : invalid number of copies in rep.int()

Enter a frame number, or 0 to exit

1: print(c(6, 15, 24, 33, 42, 51, 60, 69))
2: print.ts(c(6, 15, 24, 33, 42, 51, 60, 69))
3: matrix(c(rep.int("", start.pad), format(x, ...), rep.int("", end.pad)), nc
4: as.vector(data)
5: rep.int("", start.pad)

Selection: 0
> unclass(y3)
[1]  6 15 24 33 42 51 60 69
attr(,"tsp")
[1] 2000.833 2002.583    4.000

what you see is that aggregate() did indeed create a quarterly series, but the quarters cover (Nov-Jan, Feb-Apr, May-Jul, Aug-Oct), not the usual (Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec). The author of the print.ts code evidently never even thought of this possibility. Not that I blame him. I work with monthly and quarterly data all the time, and the behavior of aggregate.ts() is so counter-intuitive that I wouldn't have imagined it either.

Bottom line: use 'tis' series from the fame package, or 'zoo' stuff from Gabor's zoo package. As the author of the fame package, I hope you'll excuse me for asserting that the 'tis' class is easier to understand and use than the zoo stuff, which takes a more general approach. Some day Gabor or I or some other enterprising soul should try combining the best ideas from zoo and fame into a package that is better than either one.

Jeff

-- 
Jeff
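For anyone who has to stay with plain ts, one workaround is to trim the monthly series to whole calendar quarters before aggregating. This is a sketch under the assumption that dropping the incomplete first and last quarters is acceptable; the window bounds below are picked by hand for this particular x2.

```r
x2 <- ts(1:24, start = c(2000, 11), frequency = 12)   # Nov 2000 to Oct 2002

# Keep only Jan 2001 through Sep 2002, so the series starts on a quarter
# boundary and contains a whole number of complete quarters.
x2_trim <- window(x2, start = c(2001, 1), end = c(2002, 9))

# With an aligned start, blocks of three observations coincide with
# calendar quarters.
y2_aligned <- aggregate(x2_trim, nfrequency = 4, FUN = sum)

as.vector(y2_aligned)   # 12 21 30 39 48 57 66
tsp(y2_aligned)         # start 2001.0, end 2002.5, frequency 4
```

The values match the 'tis' result for y2 shown earlier in the thread, and because the start falls on a quarter boundary the series has the layout print.ts expects for quarterly data.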