Talbot Katz
2008-Jan-07 16:18 UTC
[R] Seeking a more efficient way to find partition maxima
Hi. Suppose I have a vector that I partition into disjoint, contiguous subvectors. For example, let v = c(1,4,2,6,7,5), partition it into three subvectors, v1 = v[1:3], v2 = v[4], v3 = v[5:6]. I want to find the maximum element of each subvector. In this example, max(v1) is 4, max(v2) is 6, max(v3) is 7. If I knew that the successive subvector maxima would never decrease, as in the example, I could do the following: partiCmax <- function( values, seriesIdx ) { # assume seriesIdx is increasing integer sequence beginning with 1, ending at less than or equal to length(values) parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 ] - 1 ), length( values ) ) ) return( cummax( values )[ parti[ , 2 ] ] ) } The use of cummax makes that pretty efficient, but if the subvector maxima are not non-decreasing, it doesn't work. The following function works (at least it did on the examples I tried): partiMax <- function( values, seriesIdx ) { # assume seriesIdx is increasing integer sequence beginning with 1, ending at less than or equal to length(values) parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 ] - 1 ), length( values ) ) ) return( sapply( ( 1:length(seriesIdx) ), function ( i ) {return( max( values[ parti[ i, 1 ]:parti[ i, 2 ] ] ) ) } ) ) } but I figured someone out there could come up with something cleverer. Thanks! -- TMK --212-460-5430 home917-656-5351 cellt o p k a t z @ m s n . c o m [[alternative HTML version deleted]]
Marc Schwartz
2008-Jan-07 16:44 UTC
[R] Seeking a more efficient way to find partition maxima
Talbot Katz wrote:> Hi. > > Suppose I have a vector that I partition into disjoint, contiguous > subvectors. For example, let v = c(1,4,2,6,7,5), partition it into > three subvectors, v1 = v[1:3], v2 = v[4], v3 = v[5:6]. I want to > find the maximum element of each subvector. In this example, max(v1) > is 4, max(v2) is 6, max(v3) is 7. If I knew that the successive > subvector maxima would never decrease, as in the example, I could do > the following: > > partiCmax <- function( values, seriesIdx ) { # assume seriesIdx is > increasing integer sequence beginning with 1, ending at less than or > equal to length(values) parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 > ] - 1 ), length( values ) ) ) return( cummax( values )[ parti[ , 2 ] > ] ) } > > > The use of cummax makes that pretty efficient, but if the subvector > maxima are not non-decreasing, it doesn't work. The following > function works (at least it did on the examples I tried): > > partiMax <- function( values, seriesIdx ) { # assume seriesIdx is > increasing integer sequence beginning with 1, ending at less than or > equal to length(values) parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 > ] - 1 ), length( values ) ) ) return( sapply( ( 1:length(seriesIdx) > ), function ( i ) {return( max( values[ parti[ i, 1 ]:parti[ i, 2 ] ] > ) ) } ) ) } > > > but I figured someone out there could come up with something > cleverer. Thanks!It is not clear how you are creating the partitions, but if you are (hopefully) ending up with a list, such as: > Vec.List $v1 [1] 1 4 2 $v2 [1] 6 $v3 [1] 7 5 Then you can use: > sapply(Vec.List, max, na.rm = TRUE) v1 v2 v3 4 6 7 See ?sapply for more information. HTH, Marc Schwartz
Gabor Grothendieck
2008-Jan-07 18:49 UTC
[R] Seeking a more efficient way to find partition maxima
Try testing the performance of transforming your series to one in which the values of each partition are larger than all prior partitions and the untransforming back: # test data myseq <- c(1, 4, 2, 6, 7, 5) part <- c(1, 4, 5) M <- max(myseq) # transform myseq2 <- myseq + M * cumsum(replace(0 * myseq, part, 1)) # calcuate on transformed version tmp <- partiCmax(myseq2, part) # untransform tmp - M * seq_along(tmp) # c(4, 6, 7) Also you might check how it compares to the simpler tapply(myseq, cumsum(replace(0 * myseq, part, 1)), max) On Jan 7, 2008 11:18 AM, Talbot Katz <topkatz at msn.com> wrote:> > Hi. > > Suppose I have a vector that I partition into disjoint, contiguous subvectors. For example, let v = c(1,4,2,6,7,5), partition it into three subvectors, v1 = v[1:3], v2 = v[4], v3 = v[5:6]. I want to find the maximum element of each subvector. In this example, max(v1) is 4, max(v2) is 6, max(v3) is 7. If I knew that the successive subvector maxima would never decrease, as in the example, I could do the following: > > partiCmax <- function( values, seriesIdx ) { > # assume seriesIdx is increasing integer sequence beginning with 1, ending at less than or equal to length(values) > parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 ] - 1 ), length( values ) ) ) > return( cummax( values )[ parti[ , 2 ] ] ) > } > > > The use of cummax makes that pretty efficient, but if the subvector maxima are not non-decreasing, it doesn't work. The following function works (at least it did on the examples I tried): > > partiMax <- function( values, seriesIdx ) { > # assume seriesIdx is increasing integer sequence beginning with 1, ending at less than or equal to length(values) > parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 ] - 1 ), length( values ) ) ) > return( sapply( ( 1:length(seriesIdx) ), function ( i ) {return( max( values[ parti[ i, 1 ]:parti[ i, 2 ] ] ) ) } ) ) > } > > > but I figured someone out there could come up with something cleverer. Thanks! > > -- TMK --212-460-5430 home917-656-5351 cellt o p k a t z @ m s n . c o m > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
jim holtman
2008-Jan-08 00:02 UTC
[R] Seeking a more efficient way to find partition maxima
Does this do what you want?> x <- c(1,4,2,6,7,5) > x.i <- c(1,4,5) > partiMax <- function(vec, index){+ # create a vector to identify the subvectors + x.b <- diff(c(index, length(vec) + 1)) + # split up the vector + x.s <- split(vec, rep(seq_along(index), times=x.b)) + unlist(lapply(x.s, max)) + }> > partiMax(x, x.i)1 2 3 4 6 7> partiMax(6:1, x.i)1 2 3 6 3 2> >On Jan 7, 2008 11:18 AM, Talbot Katz <topkatz at msn.com> wrote:> > Hi. > > Suppose I have a vector that I partition into disjoint, contiguous subvectors. For example, let v = c(1,4,2,6,7,5), partition it into three subvectors, v1 = v[1:3], v2 = v[4], v3 = v[5:6]. I want to find the maximum element of each subvector. In this example, max(v1) is 4, max(v2) is 6, max(v3) is 7. If I knew that the successive subvector maxima would never decrease, as in the example, I could do the following: > > partiCmax <- function( values, seriesIdx ) { > # assume seriesIdx is increasing integer sequence beginning with 1, ending at less than or equal to length(values) > parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 ] - 1 ), length( values ) ) ) > return( cummax( values )[ parti[ , 2 ] ] ) > } > > > The use of cummax makes that pretty efficient, but if the subvector maxima are not non-decreasing, it doesn't work. The following function works (at least it did on the examples I tried): > > partiMax <- function( values, seriesIdx ) { > # assume seriesIdx is increasing integer sequence beginning with 1, ending at less than or equal to length(values) > parti <- cbind( seriesIdx, c( ( seriesIdx[ -1 ] - 1 ), length( values ) ) ) > return( sapply( ( 1:length(seriesIdx) ), function ( i ) {return( max( values[ parti[ i, 1 ]:parti[ i, 2 ] ] ) ) } ) ) > } > > > but I figured someone out there could come up with something cleverer. Thanks! > > -- TMK --212-460-5430 home917-656-5351 cellt o p k a t z @ m s n . c o m > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?