thr3ads.net - R help - [R] intersect two files [Aug 2004]

Christian Mora

2004-Aug-10 22:44 UTC

[R] intersect two files

Hi all,
I'm working with two datasets in R, say data1 and data2. Both are data
frames with several rows and columns, and some of the rows are identical
in both datasets. Is there any way to remove from one set, say data1,
the rows that also appear in the other set, say data2, using R?
Thanks in advance for any hints.
Christian

Liaw, Andy

2004-Aug-10 23:01 UTC

You have not given enough info.  Do the data sets have the same columns?  If
not, you need to tell us more about how you can tell whether one row of a
data frame is "identical" to some row of another.

Assuming the columns are the same between the two, the basic idea is to
paste all columns together into a single character vector of row keys for
each data frame, then check which elements of one are in the other.
Something like (code untested!):

id1 <- do.call("paste", c(data1, sep=":"))
id2 <- do.call("paste", c(data2, sep=":"))
## Rows of data1 that are in data2 (logical vector):
r1 <- id1 %in% id2

## Remove (logical indexing also works when nothing matches):
data1.reduced <- data1[!r1,]
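
As a quick check of the idea, here is a minimal sketch on toy data. The two data frames and their contents are made up for illustration, and the sketch assumes the separator ":" never occurs inside the data itself (otherwise two different rows could produce the same key).

```r
## Toy data, invented for this example; both frames share the same columns.
data1 <- data.frame(id = c(1, 2, 3, 4), grp = c("a", "b", "c", "d"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(id = c(2, 4, 9), grp = c("b", "d", "z"),
                    stringsAsFactors = FALSE)

## One key string per row, built by pasting all columns together.
id1 <- do.call("paste", c(data1, sep = ":"))
id2 <- do.call("paste", c(data2, sep = ":"))

## Keep only the rows of data1 whose key does not occur in data2.
## A logical index is used because negative indexing with an empty
## which() result would drop every row instead of none.
data1.reduced <- data1[!(id1 %in% id2), , drop = FALSE]
print(data1.reduced)
```

Here rows 2 and 4 of data1 also occur in data2, so data1.reduced keeps rows 1 and 3.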

Andy

Adaikalavan Ramasamy

2004-Aug-10 23:16 UTC

In short, merge() with all=FALSE, followed by removal of redundant columns,
might do the trick.
If the row names serve as the common key, use the argument by=0.
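
One possible reading of this suggestion, sketched on made-up toy data: merge() with all=FALSE returns the rows the two data frames have in common, and those can then be used to drop the matching rows from data1. The data frames, the row names, and the final removal step are assumptions added for illustration, not part of the original suggestion.

```r
## Toy data, invented for this example.
data1 <- data.frame(id = c(1, 2, 3, 4), grp = c("a", "b", "c", "d"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(id = c(2, 4, 9), grp = c("b", "d", "z"),
                    stringsAsFactors = FALSE)

## With no 'by' given, merge() joins on all columns the frames share,
## so all = FALSE keeps only the rows present in both.
common <- merge(data1, data2, all = FALSE)

## If the row names are the common key, merge on them with by = 0.
rownames(data1) <- c("r1", "r2", "r3", "r4")
rownames(data2) <- c("r2", "r4", "r9")
by.rows <- merge(data1, data2, by = 0, all = FALSE)

## by = 0 adds a Row.names column and duplicates the other columns
## with .x/.y suffixes; keep the key and the data1 side only.
by.rows <- by.rows[, c("Row.names", "id.x", "grp.x")]

## Drop from data1 the rows whose key was found in data2.
remaining <- data1[!(rownames(data1) %in% as.character(by.rows$Row.names)), ]
print(remaining)
```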

See http://tolstoy.newcastle.edu.au/R/help/04/07/1250.html and many
other hits on http://maths.newcastle.edu.au/~rking/R/


