On 03/04/2012 11:06 AM, Yadav Sapkota wrote:
> Hi,
>
> I want to merge multiple chromosomal regions based on their common
> intersecting regions. I tried a couple of things using while and if loops, but
> they did not work out.
>
> I would appreciate it if anyone could provide a small piece of code in R to
> get the intersection of the following example:
>
> chr1: 100-150
> chr1: 79-250
> chr1: 100-175
> chr1: 300-350
>
> I want the intersection of all four regions as follows:
> chr1: 100-150
> chr1: 300-350
>
> I have thousands of these regions (some overlap and some not).

I'm not exactly sure what you mean by 'intersection of all four regions', but

    source("http://bioconductor.org/biocLite.R")
    biocLite("GenomicRanges")

and then

    library(GenomicRanges)
    q = GRanges("chr1", IRanges(c(100, 79, 100, 300),
                                c(150, 250, 175, 350)))
    s = GRanges("chr1", IRanges(c(100, 300),
                                c(150, 350)))
    o = findOverlaps(query=q, subject=s)
    xtabs(~queryHits(o) + subjectHits(o))

leading to

                subjectHits(o)
    queryHits(o) 1 2
               1 1 0
               2 1 0
               3 1 0
               4 0 1

This will be efficient for (tens of) millions of overlaps. There are
extensive vignettes for help; see vignette(package="IRanges") and
vignette(package="GenomicRanges"). The Bioconductor mailing list
http://bioconductor.org/help/mailing-list/
is the appropriate place for further help in this direction.
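
If 'intersection' means the stretch shared by every member of each group of overlapping regions (one reading of the expected output, and an assumption here), then a sketch along the following lines might do it. reduce() merges overlapping ranges into clusters, and the largest start and smallest end within each cluster give the shared stretch. The names merged, hits, grp and common are only illustrative, and clusters whose members chain together without all sharing a common stretch are simply dropped in this sketch.

```r
library(GenomicRanges)

# The four example regions from the question
q <- GRanges("chr1", IRanges(start = c(100, 79, 100, 300),
                             end   = c(150, 250, 175, 350)))

# Merge overlapping regions into clusters: chr1:79-250 and chr1:300-350
merged <- reduce(q)

# Map each original region to the cluster that contains it
hits <- findOverlaps(q, merged)
qi   <- queryHits(hits)
grp  <- subjectHits(hits)

# Within each cluster, the shared stretch runs from the largest start
# to the smallest end of its members
common_start <- as.vector(tapply(start(q)[qi], grp, max))
common_end   <- as.vector(tapply(end(q)[qi], grp, min))

# Assumption: keep only clusters where all members share at least one base
ok <- common_start <= common_end
common <- GRanges(seqnames(merged)[ok],
                  IRanges(common_start[ok], common_end[ok]))
common
```

For the four example regions this gives chr1:100-150 and chr1:300-350. If plain merging is what is wanted instead, merged already holds that result.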
Martin
>
> Regards,
> --Yadav
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793