Hello everybody,

I have two series of intervals, and I'd like to output the shared regions.
For example:

series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))

> series1
     Start End
[1,]    10  20
[2,]    21  26
[3,]    40  70
[4,]   300 350
> series2
     Start  End
[1,]    25   40
[2,]    60  100
[3,]   210  400
[4,]   500 1000

I'd like to have something like this as result:

> shared
     Start End
[1,]    25  26
[2,]    60  70
[3,]   300 350

I found this post, but the solution finds the regions shared across all the intervals.
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/59594.html

Can anybody help me with this?

Thanks
Giovanni
This is not the most efficient approach, and it requires integer values (maybe less than 1M), because it marks every covered integer in a logical vector. My results show an additional overlap at 40, where start and end are the same. Does this count? If not, just delete the rows that are the same in both columns.

> series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
> series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
> x1 <- x2 <- logical(max(series1, series2))  # vector of FALSE
> x1[unlist(mapply(seq, series1[,1], series1[,2]))] <- TRUE
> x2[unlist(mapply(seq, series2[,1], series2[,2]))] <- TRUE
> r <- rle(x1 & x2)  # determine overlaps
> offset <- cumsum(r$lengths)
> (z <- cbind(offset[r$values] - r$lengths[r$values] + 1, offset[r$values]))
     [,1] [,2]
[1,]   25   26
[2,]   40   40
[3,]   60   70
[4,]  300  350
> # if you don't like dups for overlaps (@40)
> z[z[,1] != z[,2],]
     [,1] [,2]
[1,]   25   26
[2,]   60   70
[3,]  300  350

On 10/15/06, Giovanni Coppola <gcoppola at ucla.edu> wrote:
> I have two series of intervals, and I'd like to output the shared
> regions.
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
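
A note on the integer restriction in the reply above: the following is a minimal sketch of a pairwise alternative that does not depend on integer endpoints. It is not from the thread. The names `shared_intervals` and `keep.points` are hypothetical, and the sketch assumes each input is a two-column matrix of closed intervals with Start <= End. It compares every row of one series with every row of the other, so it is probably only sensible for a modest number of intervals.

```r
# Hypothetical helper (not from the thread): pairwise intersection of two
# two-column (Start, End) matrices of closed intervals.
shared_intervals <- function(a, b, keep.points = TRUE) {
  # Every pairing of a row of `a` with a row of `b`
  idx <- expand.grid(i = seq_len(nrow(a)), j = seq_len(nrow(b)))
  start <- pmax(a[idx$i, 1], b[idx$j, 1])  # later of the two starts
  end   <- pmin(a[idx$i, 2], b[idx$j, 2])  # earlier of the two ends
  # A pair overlaps when the later start does not pass the earlier end;
  # keep.points = FALSE drops zero-length overlaps such as 40-40
  keep <- if (keep.points) start <= end else start < end
  out <- cbind(Start = start[keep], End = end[keep])
  out[order(out[, "Start"], out[, "End"]), , drop = FALSE]
}

series1 <- cbind(Start = c(10, 21, 40, 300), End = c(20, 26, 70, 350))
series2 <- cbind(Start = c(25, 60, 210, 500), End = c(40, 100, 400, 1000))

with.points    <- shared_intervals(series1, series2)
without.points <- shared_intervals(series1, series2, keep.points = FALSE)
print(with.points)
print(without.points)
```

With the data from the question, `with.points` should contain the same four rows as the result above (including the 40-40 row), and `without.points` the three rows originally asked for.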
Here is a more general way that looks for the transitions:

> series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
> series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
> x <- rbind(series1, series2)
> # create +1 for start and -1 for end
> x.s <- rbind(cbind(x[,1], 1), cbind(x[,2], -1))
> # sort by position (starts and ends together)
> x.s <- x.s[order(x.s[,1]),]
> # cumsum counts how many intervals are open after each transition
> x.s <- cbind(x.s, cumsum(x.s[,2]))
> # c(1,2) is start and c(-1,1) is the end of an overlap
> cbind(x.s[x.s[,2] == 1 & x.s[,3] == 2, 1], x.s[x.s[,2] == -1 & x.s[,3] == 1, 1])
     [,1] [,2]
[1,]   25   26
[2,]   40   40
[3,]   60   70
[4,]  300  350

--
Jim Holtman
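
The single counter in the transition approach reaches 2 only when one interval from each series is open. That holds for this data because the intervals within each series do not overlap one another. If that cannot be assumed, one possible variation keeps a separate counter per series. The sketch below is an illustration under that assumption, not code from the thread, and `shared_sweep` is a hypothetical name. Ties are ordered with starts before ends, so intervals that merely touch (as at 40) are still reported.

```r
# Hypothetical helper (not from the thread): transition sweep with one
# open-interval counter per series.
shared_sweep <- function(a, b) {
  n.a <- nrow(a)
  n.b <- nrow(b)
  ev <- data.frame(
    pos    = c(a[, 1], b[, 1], a[, 2], b[, 2]),
    delta  = rep(c(1, -1), each = n.a + n.b),  # +1 for start, -1 for end
    series = rep(c("a", "b", "a", "b"), times = c(n.a, n.b, n.a, n.b))
  )
  # Sort by position; at equal positions put starts (+1) before ends (-1)
  ev <- ev[order(ev$pos, -ev$delta), ]
  # Number of open intervals in each series after every transition
  open.a <- cumsum(ifelse(ev$series == "a", ev$delta, 0))
  open.b <- cumsum(ifelse(ev$series == "b", ev$delta, 0))
  both <- open.a > 0 & open.b > 0
  prev <- c(FALSE, both[-length(both)])  # state before each transition
  # A shared region starts where `both` turns TRUE and ends where it turns FALSE
  cbind(Start = ev$pos[both & !prev], End = ev$pos[prev & !both])
}

series1 <- cbind(Start = c(10, 21, 40, 300), End = c(20, 26, 70, 350))
series2 <- cbind(Start = c(25, 60, 210, 500), End = c(40, 100, 400, 1000))

sweep.result <- shared_sweep(series1, series2)
print(sweep.result)
```

As before, rows where Start equals End can be dropped afterwards if zero-length overlaps should not count.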