I'd to match-merge 2 tables in such a manner that I keep all the rows in table 1, but not the rows that are in both table 1 and 2. Thank you for your help, Alfredo> master <- data.frame(ID=2001:2011) > train <- data.frame(ID=2004:2006) > valid <- ???in this example table valid should have the following> str(valid)Year: int 2001 2002 2003 2007 2008 2009 2010 2011 in SAS I'd do the following: data master; do id=2001 to 2011; output; end; run; data train; do id=2004 to 2006; output; end; run; data valid; merge master(in=a) train(in=b); by id; if a and not b; run; and in SQL: create table valid as select a.* from master where ID not in (select ID from train) [[alternative HTML version deleted]]
Look at the merge command
?merge
Steve Friedman Ph. D.
Ecologist  / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax     (305) 224 - 4147
                                                                           
             "Roccato Alfredo                                              
             (UniCredit)"                                                  
             <Alfredo.Roccato@                                          To 
             unicredit.eu>             "r-help at r-project.org"
             Sent by:                  <r-help at r-project.org>
             r-help-bounces at r-                                          cc 
             project.org                                                   
                                                                   Subject 
                                       [R] join tables in R                
             05/04/2011 10:32                                              
             AM                                                            
                                                                           
                                                                           
                                                                           
                                                                           
I'd to match-merge 2 tables in such a manner that I keep all the rows in
table 1, but not the rows that are in both table 1 and 2.
Thank you for your help,
Alfredo
> master <- data.frame(ID=2001:2011)
> train   <- data.frame(ID=2004:2006)
> valid <- ???
in this example table valid should have the following
> str(valid)
 Year: int  2001 2002 2003 2007 2008 2009 2010 2011
in SAS I'd do the following:
data master; do id=2001 to 2011; output; end; run;
data train; do id=2004 to 2006; output; end; run;
data valid; merge master(in=a) train(in=b); by id; if a and not b; run;
and in SQL:
create table valid as
  select a.* from master where ID not in (select ID from train)
             [[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Wed, May 04, 2011 at 04:32:14PM +0200, Roccato Alfredo (UniCredit) wrote:> I'd to match-merge 2 tables in such a manner that I keep all the rows in table 1, but not the rows that are in both table 1 and 2. > Thank you for your help, > Alfredo > > > master <- data.frame(ID=2001:2011) > > train <- data.frame(ID=2004:2006) > > valid <- ??? > > in this example table valid should have the following > > > str(valid) > Year: int 2001 2002 2003 2007 2008 2009 2010 2011Hi. Try the following, which assumes that "train" is a subset of "master". master <- data.frame(ID=2001:2011) train <- data.frame(ID=2004:2006) valid <- master[! (master[, 1] %in% train[ ,1]), , drop=FALSE] Hope this helps. Petr Savicky.
Hi, On Wed, May 4, 2011 at 10:32 AM, Roccato Alfredo (UniCredit) <Alfredo.Roccato at unicredit.eu> wrote:> I'd to match-merge 2 tables in such a manner that I keep all the rows in table 1, but not the rows that are in both table 1 and 2. > Thank you for your help, > Alfredo > >> master <- data.frame(ID=2001:2011) >> train ? <- data.frame(ID=2004:2006) >> valid <- ??? > > in this example table valid should have the following > >> str(valid) > ?Year: int ?2001 2002 2003 2007 2008 2009 2010 2011Are you working with only one column at a time? If so: R> keep <- !(master$ID %in% train$ID) R> valid <- master[keep,] If you are working with combinations of columns as the keys for each row, there are other ways ...> in SAS I'd do the following: > data master; do id=2001 to 2011; output; end; run; > data train; do id=2004 to 2006; output; end; run; > data valid; merge master(in=a) train(in=b); by id; if a and not b; run; > > and in SQL: > create table valid as > ?select a.* from master where ID not in (select ID from train)My solution does pretty much what this select statement would do. -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology ?| Memorial Sloan-Kettering Cancer Center ?| Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact