Have you tried 'foverlaps' in the data.table package?
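
A minimal sketch of how that might look, using small made-up stand-ins for your two tables (the toy values, the "cgA".."cgD" labels and the name `hits` are illustrative assumptions, not your data). It assumes you want every data_2 row whose interval lies entirely inside a data_1 range on the same chromosome, which is what type = "within" asks for:

```r
library(data.table)

# Toy stand-ins for the two tables in the question (values are made up)
data_1 <- data.table(
  Chromosome = c("chr1", "chr1", "chr2"),
  Start      = c(521369L, 750001L, 1L),
  End        = c(750000L, 800000L, 100000L),
  Feature    = c("chr1-0001", "chr1-0002", "chr2-0001")
)
data_2 <- data.table(
  Chromosome = c("chr1", "chr1", "chr1", "chr2"),
  Start      = c(15864L, 600000L, 760000L, 50000L),
  End        = c(15865L, 600001L, 760001L, 50001L),
  Feature    = c("cgA", "cgB", "cgC", "cgD")
)

# foverlaps() needs the second table (y) to be keyed; the last two
# key columns are treated as the interval start and end.
setkey(data_1, Chromosome, Start, End)

# For each data_2 row, find the data_1 range on the same chromosome
# that contains it. nomatch = NULL drops data_2 rows with no match
# (on older data.table versions, nomatch = 0L does the same).
hits <- foverlaps(data_2, data_1,
                  by.x = c("Chromosome", "Start", "End"),
                  type = "within", nomatch = NULL)

# Columns coming from data_2 that clash with data_1 names get an "i." prefix,
# so Feature is the data_1 range and i.Feature is the data_2 probe.
print(hits[, .(Chromosome, Feature, i.Feature, i.Start, i.End)])
```

If partial overlaps should count as well, type = "any" would be the setting to try instead.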
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Mon, Sep 4, 2017 at 8:31 AM, Mohammad Tanvir Ahamed via R-help <
r-help at r-project.org> wrote:
> Hi,
> I have two big data sets.
>
> data_1:
> > dim(data_1)
> [1] 15820     5
>
> > head(data_1)
>    Chromosome  Start     End   Feature GroupA_3
> 1:       chr1 521369  750000 chr1-0001    0.170
> 2:       chr1 750001  800000 chr1-0002   -0.086
> 3:       chr1 800001  850000 chr1-0003    0.006
> 4:       chr1 850001  900000 chr1-0004    0.050
> 5:       chr1 900001  950000 chr1-0005    0.062
> 6:       chr1 950001 1000000 chr1-0006   -0.016
>
> data_2:
> > dim(data_2)
> [1] 470870      5
>
> > head(data_2)
>    Chromosome Start   End    Feature GroupA_3
> 1:       chr1 15864 15865 cg13869341    0.207
> 2:       chr1 18826 18827 cg14008030   -0.288
> 3:       chr1 29406 29407 cg12045430   -0.331
> 4:       chr1 29424 29425 cg20826792   -0.074
> 5:       chr1 29434 29435 cg00381604    0.141
> 6:       chr1 68848 68849 cg20253340   -0.458
>
>
> What I want to do:
> Based on the columns "Chromosome", "Start" and "End" of the two data sets,
> I want to find which rows (precisely, which "Feature" values) of data_2
> fall within each range (between "Start" and "End") of data_1. The
> "Chromosome" values must also match between the two data sets.
>
> I have tried the "GenomicRanges" package, as described in the post
> https://stackoverflow.com/questions/11892241/merge-by-range-in-r-applying-loops
> but I was not successful. Can anyone please help me do this fast, as
> the data is very big?
> Thanks in advance.
>
>
> Regards.............
> Tanvir Ahamed Stockholm, Sweden | mashranga at yahoo.com