Mangalani Peter Makananisa
2010-Jul-21 09:33 UTC
[R] Obtaining the unmerged cases from one of the two data set
Dear "R Gurus",

I have two dummy CSV data sets, A and B, containing 19 and 15 cases/observations respectively. The two data sets have 13 cases in common. How do I retrieve the unmerged cases from one of the two data sets? Taking A as an example, six cases should appear in the result. See the R code below. Please assist.

I look forward to hearing from the group soon, and thank you in advance.

Kind regards,
Mangalani Peter Makananisa
Statistical Analyst
South African Revenue Service (SARS)
Segmentation and Research: Data Modelling
Tel: +27 12 422 7357, Cell: +27 82 456 4669, Fax: +27 12 422 6579
E-Mail: pmakananisa at sars.gov.za

> A = read.csv("C:/Documents and Settings/S1033067/Desktop/A.csv", header = TRUE, dec = ",", sep = ",")
> names(A)
[1] "NAME"   "SALARY"
> dim(A)[1]
[1] 19
> B = read.csv("C:/Documents and Settings/S1033067/Desktop/B.csv", header = TRUE, dec = ",", sep = ",")
> names(B)
[1] "NAME"     "B.SALARY"
> dim(B)[1]
[1] 15
> common = merge(A, B)
> names(common)
[1] "NAME"     "SALARY"   "B.SALARY"
> dim(common)[1]
[1] 13
David Winsemius
2010-Jul-21 11:38 UTC
[R] Obtaining the unmerged cases from one of the two data set
On Jul 21, 2010, at 5:33 AM, Mangalani Peter Makananisa wrote:

> Dear "R Gurus",

I saw no reason to copy Rob Hyndman. I did not see that this involves any of the packages he maintains.

> I am having two dummy csv data sets A and B containing 19 and 15
> cases/observations respectively. From the two data set 13 cases are
> intersection. From one of the two (any) data set, How do I then retrieve
> the unmerged data ? let's take A for example, six cases must appear in
> our results. See the R codes below.

?setdiff

Perhaps:

setdiff(A$NAME, B$NAME)

You can also do a merge that is an outer join, which keeps all the NAME information, and then extract the records whose NAME occurs only in A (those rows have SALARY data but a missing B.SALARY). Untested in absence of working example:

?merge

mer <- merge(A, B, all=TRUE)
mer[ mer$NAME %in% setdiff(A$NAME, B$NAME), ]

-- 
David.

David Winsemius, MD
West Hartford, CT
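
A minimal sketch of the two suggestions above, using made-up data because the original A.csv and B.csv were not posted. The names and salaries here are hypothetical stand-ins, and the variables `only_in_A`, `unmerged_A` and `unmerged_from_merge` are introduced only for illustration. It assumes NAME is the only column the two data frames share, so merge() joins on it by default.

```r
# Hypothetical stand-ins for A.csv and B.csv (the real files were not posted)
A <- data.frame(NAME = c("Ann", "Ben", "Cat", "Dan", "Eve"),
                SALARY = c(100, 200, 300, 400, 500),
                stringsAsFactors = FALSE)
B <- data.frame(NAME = c("Ben", "Dan", "Fay"),
                B.SALARY = c(210, 410, 610),
                stringsAsFactors = FALSE)

# Names that occur in A but not in B
only_in_A <- setdiff(A$NAME, B$NAME)

# Option 1: subset A directly on those names
unmerged_A <- A[A$NAME %in% only_in_A, ]

# Option 2: outer join on NAME, then keep the rows whose NAME is only in A;
# B.SALARY is NA for these rows because B has no matching record
mer <- merge(A, B, all = TRUE)
unmerged_from_merge <- mer[mer$NAME %in% only_in_A, ]

print(unmerged_A)
print(unmerged_from_merge)
```

With the poster's real data, `nrow(unmerged_A)` should come out as 19 - 13 = 6, provided NAME is unique within each file. Swapping the arguments, `setdiff(B$NAME, A$NAME)`, gives the cases found only in B.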