On Mon, Aug 8, 2011 at 9:16 AM, Johannes Egner <johannes.egner at gmail.com> wrote:
> Hi,
>
> I'm removing non-unique time indices in a zoo time series by means of
> aggregate. The time series is bivariate, and the row to be kept only depends
> on the maximum of one of the two columns. Here's an example:
>
> x <- zoo(rbind( c(1,1), c(1.1, 0.9), c(1.1, 1.1), c(1,1) ),
>          order.by=c(1,1,2,2))
>
> The eventual aggregated result should be
>
> 1   1.1   0.9
> 2   1.1   1.1
>
> that is, in each slice of the underlying data (a slice being all rows with
> the same time stamp), we take the row that has maximum value in the first
> column. (For the moment, let's not worry about several rows within the same
> slice having the same maximum value in the first column.)
>
> I have tried subsetting x by
>
> slices <- aggregate(x[,1], by=identity, FUN=which.max)
>
> but ended up with something as ugly as:
>
> T <- length( unique(time(x)) )
> result <- zoo( matrix(NA, ncol=2, nrow=T), order.by=unique(time(x)) )
>
> for(t in seq(length.out=T))
> {
>    result[t,] <- x[ time(x)==time(slices[t]) ][coredata(slices[t]),]
>
> }
>
> There must be a better way of doing this -- maybe using tapply or the plyr
> package, but possibly something much simpler. Any pointers are very welcome.

Where does the data come from in the first place? Is it being read
in, or is it in a data frame that is converted to a zoo object?
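
If the data already exists only as a zoo object like x above, the
following is one possible sketch, not necessarily the simplest route. It
assumes the zoo package is loaded and that ties within a slice can be
resolved by taking the first maximum. The names m, tt, keep and result
are illustrative only.

```r
library(zoo)

# Example data from the question. zoo() warns about the non-unique index,
# so the warning is suppressed here.
x <- suppressWarnings(
  zoo(rbind(c(1, 1), c(1.1, 0.9), c(1.1, 1.1), c(1, 1)),
      order.by = c(1, 1, 2, 2))
)

m  <- coredata(x)  # plain matrix of values
tt <- time(x)      # time stamps, with duplicates

# Group row positions by time stamp. Within each slice, keep the position
# of the row with the largest first column (which.max takes the first on ties).
keep <- vapply(split(seq_len(nrow(m)), tt),
               function(i) i[which.max(m[i, 1])],
               integer(1))

# Rebuild a zoo series from the selected rows; its index is now unique.
result <- zoo(m[keep, , drop = FALSE], order.by = tt[keep])
```

Working on the core data and the index separately avoids calling zoo
methods that do not support duplicate time stamps.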
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com