I have a monthly price index series x, the related return series y = diff(log(x)), and a POSIXlt date-time variable dp. I would like to apply annual blocks to compute, for example, the annual block maxima and mean of y.

When studying the POSIX classes, in the first stage of the learning curve, I computed the maximum drawdown of x:

> mdd <- maxdrawdown(x)
> max.dd <- mdd$maxdrawdown
> from <- as.character(dp[mdd$from])
> to <- as.character(dp[mdd$to])
> from; to
[1] "2000-08-31"
[1] "2003-03-31"

That gives me the POSIX dates of the start and end of the period and suggests that I have done something correctly.

Two questions:
(1) How to implement annual blocks and compute e.g. annual max, min and mean of y (each year's max, min, mean)?
(2) How to apply POSIX variables with the 'block' argument in gev in the evir package?

The S+FinMetrics function aggregateSeries does the job in that module, but I do not know how to handle it in R. My guess is that (1) is done by using the function aggregate, but how to define the 'by' argument with POSIX variables?

Thanks!

Hannu Kahra
Progetti Speciali
Monte Paschi Asset Management SGR S.p.A.
Via San Vittore, 37
IT-20123 Milano, Italia
Tel.: +39 02 43828 754
Mobile: +39 333 876 1558
Fax: +39 02 43828 247
E-mail: kahra at mpsgr.it
Web: www.mpsam.it
Kahra Hannu <kahra <at> mpsgr.it> writes:

: Two questions:
: (1) how to implement annual blocks and compute e.g. annual max, min and mean of y (each year's max, min, mean)?
: (2) how to apply POSIX variables with the 'block' argument in gev in the evir package?
:
: The S+FinMetrics function aggregateSeries does the job in that module; but I do not know, how handle it in R.
: My guess is that (1) is done by using the function aggregate, but how to define the 'by' argument with POSIX variables?

1. To create a ts monthly time series you specify the first month and a frequency of 12 like this.

z.m <- ts(rep(1:6,4), start = c(2000,1), freq = 12)
z.m

# Annual aggregate is done using aggregate.ts with nfreq = 1
z.y <- aggregate(z.m, nfreq = 1, max)
z.y

# To create a POSIXct series of times using seq
# (This will use GMT. Use tz="" arg to ISOdate if you want current tz.)
z.y.times <- seq(ISOdate(2000,1,1), length = length(z.y), by = "year")
z.y.times

2. Have not used evir but looking at ?gev it seems you can use block = 12 if you have monthly data and want the blocks to be successive 12 month periods, or you can add a POSIXct times attribute to your data as below (also see comment re tz above) and then use block = "year" in your gev call.

attr(z.m, "times") <- seq(ISOdate(2000,1,1), length=length(z.m), by="month")
str(z.m) # display z.m along with attribute info
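As a rough sketch of point 1 extended to all three statistics asked about (max, min and mean), the same aggregate.ts call can be repeated once per function. The data frame name annual and its column names are made up for illustration, and the toy series z.m is the one from the reply, not real return data.

```r
# Toy monthly series from the reply above: Jan 2000 to Dec 2001.
z.m <- ts(rep(1:6, 4), start = c(2000, 1), frequency = 12)

# One aggregate.ts call per statistic; nfrequency = 1 gives one value
# per block of 12 consecutive months.
z.max  <- aggregate(z.m, nfrequency = 1, FUN = max)
z.min  <- aggregate(z.m, nfrequency = 1, FUN = min)
z.mean <- aggregate(z.m, nfrequency = 1, FUN = mean)

# Collect the results in an ordinary data frame (hypothetical layout).
annual <- data.frame(
  year = as.numeric(time(z.max)),
  max  = as.numeric(z.max),
  min  = as.numeric(z.min),
  mean = as.numeric(z.mean)
)
print(annual)
```

Because this toy series starts in January, the 12-month blocks happen to coincide with calendar years; that is not guaranteed for a series starting in another month.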
Thank you Petr and Gabor for the answers.

They did not, however, solve my original problem. When I have a monthly time series y with a POSIX date variable dp, the most obvious way to compute e.g. the annual means is to use aggregate(y, 1, mean), which works with time series. However, I got stuck on the idea of using the 'by' argument as by = dp$year, because in that case y has to be a data.frame. The easiest way must be the best way.

Regards,
Hannu

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Gabor Grothendieck
Sent: Thursday, September 23, 2004 12:56 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] block statistics with POSIX classes

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
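One possible way around the "y has to be a data.frame" obstacle, sketched under assumptions: the values of the time series are copied into a plain data frame column with as.numeric(), and the calendar year from the POSIXlt variable is used as the 'by' group. The data below are simulated stand-ins (the poster's series is not available), and the names dates, df and res are hypothetical.

```r
set.seed(42)  # simulated data, NOT the poster's return series
dates <- seq(as.Date("1996-05-01"), as.Date("2004-08-01"), by = "month")
dp <- as.POSIXlt(dates)
y  <- ts(rnorm(length(dates)), start = c(1996, 5), frequency = 12)

# as.numeric() drops the ts attributes so the column is ordinary data.
# POSIXlt stores years since 1900, hence the + 1900.
df <- data.frame(year = dp$year + 1900, y = as.numeric(y))

# Group by calendar year; FUN returns three statistics per group.
res <- aggregate(df["y"], by = list(year = df$year),
                 FUN = function(v) c(max = max(v), min = min(v), mean = mean(v)))
print(res)
```

Here res$y is a matrix with one row per calendar year and columns max, min and mean, including the partial first and last years.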
I have followed Gabor's instructions:

> aggregate(list(y=y), list(dp$year), mean)$y # returns NULL since y is a time series
NULL

> aggregate(list(y=as.vector(y)), list(dp$year), mean)$y # returns annual means
[1]  0.0077656696  0.0224050294  0.0099991898  0.0240550925 -0.0084085867
[6] -0.0170950194 -0.0355641251  0.0065873997  0.0008253111

> aggregate(list(y=y), list(dp$year), mean) # returns the same as the previous one
  Group.1      Series.1
1      96  0.0077656696
2      97  0.0224050294
3      98  0.0099991898
4      99  0.0240550925
5     100 -0.0084085867
6     101 -0.0170950194
7     102 -0.0355641251
8     103  0.0065873997
9     104  0.0008253111

Gabor's second suggestion returns different results:

> aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1, mean)
Time Series:
Start = 96.33333
End = 103.3333
Frequency = 1
         Series 1
[1,]  0.016120895
[2,]  0.024257131
[3,]  0.007526997
[4,]  0.017466118
[5,] -0.016024846
[6,] -0.017145159
[7,] -0.036047765
[8,]  0.014198501

> aggregate(y, 1, mean) # verifies the result above
Time Series:
Start = 1996.333
End = 2003.333
Frequency = 1
         Series 1
[1,]  0.016120895
[2,]  0.024257131
[3,]  0.007526997
[4,]  0.017466118
[5,] -0.016024846
[6,] -0.017145159
[7,] -0.036047765
[8,]  0.014198501

The data is from 1996:5 to 2004:8. The difference between the results must come from the fact that the data does not begin in January and does not end in December? The first two solutions give nine annual means while the last two give only eight. The block size in the last two must be 12 months, as is said in ?aggregate, instead of the calendar year that I am looking for. Gabor's first suggestion solved my problem.
Thank you,
Hannu

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Gabor Grothendieck
Sent: Thursday, September 23, 2004 3:52 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] block statistics with POSIX classes

I am not sure that I understand what you are looking for since you did not provide any sample data with corresponding output to clarify your question. Here is another guess. If y is just a numeric vector with monthly data and dp is a POSIXlt vector of the same length then:

aggregate(list(y=y), list(dp$year), mean)$y

will perform aggregation, as will

aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1, mean)

which converts y to ts and then performs the aggregation. The first one will work even if y is irregular, while the second one assumes that y is monthly. The second one returns a ts object.

By the way, I had a look at gev's source and it seems that despite the documentation it does not use POSIXct anywhere internally. If the block is "year" or another character value then it simply assumes that whatever datetime class is used has an as.POSIXlt method. If your dates were POSIXct rather than POSIXlt then it would be important to ensure that whatever timezone is assumed (which I did not check) in the conversion is the one you are using. You could use character dates or the Date class to avoid this problem. Since you appear to be using POSIXlt datetimes from the beginning I think you should be ok.
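The time zone remark in the reply above can be illustrated with a small sketch in base R (it does not call gev). It assumes the system's time zone database knows "Europe/Rome", which is chosen here only as an example zone; the variable names are hypothetical.

```r
# A single instant, half an hour before midnight UTC on New Year's Eve.
t.utc <- as.POSIXct("2000-12-31 23:30:00", tz = "UTC")

# The same instant converted to POSIXlt in two zones: the calendar year,
# and therefore the annual block it would fall into, differs.
year.utc  <- as.POSIXlt(t.utc, tz = "UTC")$year + 1900
year.rome <- as.POSIXlt(t.utc, tz = "Europe/Rome")$year + 1900
print(c(UTC = year.utc, Rome = year.rome))

# A Date has no time of day and no zone, so its year is unambiguous.
d <- as.Date("2000-12-31")
print(format(d, "%Y"))
```

For monthly data stamped at month ends, an observation close to midnight on 31 December is the case where such a shift could move a point into the neighbouring year's block.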
In a private mail Petr noted that I had mixed POSIX with a time series class in a call to aggregate, which was indeed the case. Petr's suggestion

usa <- diff(log(MXNA/XEU))
z <- aggregate(usa, list(annual=cut(dp,"year")), mean, na.rm=TRUE)

also gives the result I was looking for. In my original code usa was a time series.

Thank you Gabor and Petr!
Hannu

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Gabor Grothendieck
Sent: Thursday, September 23, 2004 7:04 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] block statistics with POSIX classes

Kahra Hannu <kahra <at> mpsgr.it> writes:

: The data is from 1996:5 to 2004:8. The difference of the results must depend on the fact that the beginning of
: the data is not January and the end is not December? The first two solutions give nine annual means while the
: last two give only eight means. The block size in the last two must be 12 months, as is said in ?aggregate,
: instead of a calendar year that I am looking for. Gabor's first suggestion solved my problem.

Yes, that seems to be the case. Using length instead of mean, we find that the aggregate.data.frame example used calendar years as the basis of aggregation, whereas the aggregate.ts example used successive 12-month periods starting from the first month, discarding the 4 points at the end which do not fill out a full year.

R> set.seed(1)
R> dp <- as.POSIXlt(seq(from=as.Date("1996-5-1"), to=as.Date("2004-8-1"),
+    by="month"))
R> y <- rnorm(length(dp$year))
R> aggregate(list(y=y), list(dp$year), length)$y
[1]  8 12 12 12 12 12 12 12  8
R> aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1, length)
Time Series:
Start = 96.33333
End = 103.3333
Frequency = 1
[1] 12 12 12 12 12 12 12 12
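If one prefers to stay with ts objects, one possible workaround (a sketch, not something proposed in the thread) is to pad the series with NA so that it starts in January and ends in December; the 12-month blocks of aggregate.ts then line up with calendar years. The data are simulated, and the names y.pad, n.obs and y.mean are hypothetical.

```r
set.seed(1)  # simulated returns covering 1996:5 to 2004:8 (100 months)
y <- ts(rnorm(100), start = c(1996, 5), frequency = 12)

# Pad with NA so the series runs from January 1996 to December 2004.
y.pad <- window(y, start = c(1996, 1), end = c(2004, 12), extend = TRUE)

# Each block of 12 is now one calendar year; NA padding is skipped.
n.obs  <- aggregate(y.pad, nfrequency = 1,
                    FUN = function(v) sum(!is.na(v)))
y.mean <- aggregate(y.pad, nfrequency = 1,
                    FUN = function(v) mean(v, na.rm = TRUE))
print(cbind(n.obs, y.mean))
```

The observation counts make the partial first and last years visible, so their means can be treated with appropriate caution.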