So I have a bunch of c(start,end) points and want to consolidate them into as few c(start,end) as possible.

For example:

sample  start  end
A       5      10
B       7      18
C       1      4
D       16     20

I'd want the function to return the two distinct sets (1,4) and (5,20).

Is there an R function that already does this? Or should I write my own? (How would I go about that?)

--
View this message in context: http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324684.html
Sent from the R help mailing list archive at Nabble.com.
I think it would be helpful if you could clarify your question. Do you want distinct sets? Then maybe use unique(). But why (5,20) when it's (5,10) in the row in your example? What criteria do you want the function to select the "sets" by, and what kind of output do you need?

Maybe it's just me who doesn't get the question.

sr
I too think I worded it incorrectly...

The second two columns of the matrix are the start and end of an interval. However, because some of the intervals overlap, I want to limit the number of intervals I have to deal with.

So (5 10) should merge with (7 18), making (5 18), and then (5 18) should merge with (16 20), giving (5 20), whereas (1 4) has no overlap with any other interval and is therefore left on its own.

Ideal output would just be a collapsing of the matrix:

sample  start  end
#       5      20
#       1      4

I got this to work using unique(c(5:10,7:18,16:20,1:4)), which gives me a c(1:4,5:20). However, I have to do this on a very large dataset, and the numbers are more like c(100542:100782,598322:598821,...).

Any help would be appreciated.

Thanks
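The merging described above can be sketched in base R without expanding every interval into its individual integers, which is what makes the unique() approach costly on large coordinates. This is only a minimal sketch under stated assumptions: the function name merge_intervals is hypothetical (not an existing R function), intervals are treated as closed, and two intervals are merged when the later one starts at or before the latest end seen so far. Intervals that are merely adjacent, such as (1 4) and (5 20), are left separate, as in the desired output.

```r
# Hypothetical helper (not part of base R): merge overlapping closed
# intervals given as parallel start/end vectors.
merge_intervals <- function(start, end) {
  stopifnot(length(start) == length(end), all(start <= end))
  if (length(start) == 0) {
    return(data.frame(start = numeric(0), end = numeric(0)))
  }
  # Sort the intervals by their start point.
  ord <- order(start)
  s <- start[ord]
  e <- end[ord]
  # Running maximum of the end points seen so far.
  run_end <- cummax(e)
  # A new group begins whenever a start lies beyond every earlier end.
  new_group <- c(TRUE, s[-1] > run_end[-length(run_end)])
  group <- cumsum(new_group)
  # Each group collapses to its smallest start and largest end.
  data.frame(
    start = as.vector(tapply(s, group, min)),
    end   = as.vector(tapply(e, group, max))
  )
}

# The example from this thread.
intervals <- data.frame(
  sample = c("A", "B", "C", "D"),
  start  = c(5, 7, 1, 16),
  end    = c(10, 18, 4, 20)
)
merged <- merge_intervals(intervals$start, intervals$end)
print(merged)
```

On the four sample rows this yields the two intervals (1 4) and (5 20). The work is dominated by the sort, so it should scale with the number of intervals rather than with the size of the coordinates.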
On Fri, 13 Aug 2010, fishkbob wrote:

> So I have a bunch of c(start,end) points and want to consolidate them into as
> few c(start,end) as possible.
>
> For example:
> sample start end
> A 5 10
> B 7 18
> C 1 4
> D 16 20
>
> I'd want the function to return the two distinct sets (1,4) and (5,20)
>
> Is there an R function that already does this?

Yes. See the reduce() function in the IRanges package on BioConductor.

See pages 11-12 of
http://www.bioconductor.org/packages/2.6/bioc/vignettes/IRanges/inst/doc/IRangesOverview.pdf

HTH,

Chuck

> or should I write my own? (how would I go about that?)

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
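For reference, a call to reduce() on the example data might look like the sketch below. It assumes the IRanges package from Bioconductor is installed. One point worth checking against the package documentation: as far as I know, reduce() by default also merges ranges that are directly adjacent, so (1 4) and (5 20) would be collapsed into a single range; passing min.gapwidth = 0L is intended to merge only ranges that actually overlap.

```r
# Assumes the Bioconductor package IRanges is already installed.
library(IRanges)

# The four intervals from the original question.
ir <- IRanges(start = c(5L, 7L, 1L, 16L), end = c(10L, 18L, 4L, 20L))

# min.gapwidth = 0L asks reduce() not to join ranges that only touch.
merged <- reduce(ir, min.gapwidth = 0L)

print(merged)
print(start(merged))
print(end(merged))
```

start() and end() return plain integer vectors, so the result can be put back into a matrix or data frame if needed.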
Thanks Chuck,

I was trying to implement something more complicated than I needed to, and after finding the reduce() function in Bioconductor, everything went smoothly.

Thanks again