Hello everybody,
I have two series of intervals, and I'd like to output the shared
regions.
For example:
series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
> series1
     Start End
[1,]    10  20
[2,]    21  26
[3,]    40  70
[4,]   300 350
> series2
     Start  End
[1,]    25   40
[2,]    60  100
[3,]   210  400
[4,]   500 1000
I'd like to have something like this as result:
> shared
     Start End
[1,]    25  26
[2,]    60  70
[3,]   300 350
I found this post, but the solution finds the regions shared across
all the intervals.
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/59594.html
Can anybody help me with this?
Thanks
Giovanni
Not the most efficient, and it requires integer values (maybe less than 1M).
My results show an additional overlap at 40, where start and end are the
same. Does this count? If not, just delete rows that are the same in both
columns.

> series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
> series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
> x1 <- x2 <- logical(max(series1, series2))  # vector FALSE
> x1[unlist(mapply(seq, series1[,1], series1[,2]))] <- TRUE
> x2[unlist(mapply(seq, series2[,1], series2[,2]))] <- TRUE
> r <- rle(x1 & x2)  # determine overlaps
> offset <- cumsum(r$lengths)
> (z <- cbind(offset[r$values] - r$lengths[r$values] + 1, offset[r$values]))
     [,1] [,2]
[1,]   25   26
[2,]   40   40
[3,]   60   70
[4,]  300  350
> # if you don't like dups for overlaps (@40)
> z[z[,1] != z[,2],]
     [,1] [,2]
[1,]   25   26
[2,]   60   70
[3,]  300  350

On 10/15/06, Giovanni Coppola <gcoppola at ucla.edu> wrote:
> I have two series of intervals, and I'd like to output the shared
> regions.
> [...]
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
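The integer restriction mentioned in the reply above comes from indexing a logical vector by position. One possible way around it, sketched here as an alternative rather than as the method from the thread, is to compare every interval of one series with every interval of the other: the overlap of [s1, e1] and [s2, e2] is [max(s1, s2), min(e1, e2)] whenever that start lies before that end. The helper name `shared_pairwise` and its `keep_points` argument are made up for this illustration, and the sketch assumes both matrices have columns named Start and End.

```r
series1 <- cbind(Start = c(10, 21, 40, 300), End = c(20, 26, 70, 350))
series2 <- cbind(Start = c(25, 60, 210, 500), End = c(40, 100, 400, 1000))

# Hypothetical helper (not from the thread): intersect every interval of
# 'a' with every interval of 'b'. Works for non-integer endpoints too.
shared_pairwise <- function(a, b, keep_points = FALSE) {
  # Candidate overlap of each pair: latest start and earliest end
  start <- outer(a[, "Start"], b[, "Start"], pmax)
  end   <- outer(a[, "End"],   b[, "End"],   pmin)
  # A pair really overlaps only if the candidate start precedes the end;
  # keep_points = TRUE also keeps single-point overlaps such as 40-40
  keep <- if (keep_points) start <= end else start < end
  out <- cbind(Start = start[keep], End = end[keep])
  out[order(out[, "Start"]), , drop = FALSE]
}

shared <- shared_pairwise(series1, series2)
print(shared)
```

On the example data this keeps the three regions 25-26, 60-70 and 300-350; with `keep_points = TRUE` the 40-40 row appears as well. The cost grows with the product of the two series lengths, so it is only a reasonable choice for short series.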
Here is a more general way that looks for the transitions:

> series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
> series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
> x <- rbind(series1, series2)
> # create +1 for start and -1 for end
> x.s <- rbind(cbind(x[,1], 1), cbind(x[,2], -1))
> # sort by start times
> x.s <- x.s[order(x.s[,1]),]
> # cumsum is a count of the transitions
> x.s <- cbind(x.s, cumsum(x.s[,2]))
> # c(1,2) is start and c(-1,1) is the end of an overlap
> cbind(x.s[x.s[,2] == 1 & x.s[,3] == 2, 1], x.s[x.s[,2] == -1 & x.s[,3] == 1, 1])
     [,1] [,2]
[1,]   25   26
[2,]   40   40
[3,]   60   70
[4,]  300  350

--
Jim Holtman
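The transition-counting idea above can be wrapped in a function, roughly as follows. This is only a sketch: the name `shared_sweep` and the `drop_points` argument are invented here, and it assumes that intervals within the same series never overlap each other, so a count of 2 always means one open interval from each series. It also makes the tie-breaking explicit by sorting starts before ends at equal positions, which is what produces the 40-40 row.

```r
series1 <- cbind(Start = c(10, 21, 40, 300), End = c(20, 26, 70, 350))
series2 <- cbind(Start = c(25, 60, 210, 500), End = c(40, 100, 400, 1000))

# Hypothetical wrapper (not from the thread) around the +1/-1 transition count.
# Assumption: intervals inside one series do not overlap each other.
shared_sweep <- function(a, b, drop_points = TRUE) {
  x <- rbind(a, b)
  pos   <- c(x[, 1], x[, 2])
  delta <- rep(c(1, -1), each = nrow(x))  # +1 at a start, -1 at an end
  # Sort by position; at equal positions, starts (+1) come before ends (-1)
  o <- order(pos, -delta)
  pos   <- pos[o]
  delta <- delta[o]
  depth <- cumsum(delta)                  # number of intervals currently open
  # An overlap opens when a start raises the count to 2
  # and closes when an end lowers it back to 1
  out <- cbind(Start = pos[delta == 1 & depth == 2],
               End   = pos[delta == -1 & depth == 1])
  # Optionally drop single-point overlaps where Start equals End
  if (drop_points) out <- out[out[, "Start"] != out[, "End"], , drop = FALSE]
  out
}

print(shared_sweep(series1, series2))
```

With `drop_points = FALSE` the result should match the four rows shown in the reply, including 40-40; the default drops that row and leaves the three regions from the original question.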