thr3ads.net - R help - [R] comparing 2 dataframes [Nov 2006]


Priya Kanhai

2006-Nov-06 20:25 UTC

[R] comparing 2 dataframes

An embedded text with an unspecified character set was
scrubbed from the message ...
Name: not available
URL:
https://stat.ethz.ch/pipermail/r-help/attachments/20061106/756382a8/attachment.pl

Christoph Buser

2006-Nov-07 07:46 UTC

[R] comparing 2 dataframes

Hi

Maybe this example can help you to find your solution:

dat1 <- data.frame(CUSTOMER_ID = c("1000786BR", "1002047BR", "10127BR",
                     "1004166834BR", "1004310897BR", "1006180BR",
                     "10064798BR", "1007311BR", "1007621BR",
                     "1008195BR", "10126BR", "95323994BR"),
                   CUSTOMER_RR = c("5+", "4", "5+", "2", "X", "4", "4", "5+",
                     "4", "4-", "5+", "4"))

dat2 <- data.frame(CUSTOMER_ID = c("1200786BR", "1802047BR", "1027BR",
                     "10166834BR", "107BR", "100BR", "164798BR", "1008195BR",
                     "10126BR"),
                   CUSTOMER_RR = c("6+", "4", "1+", "2", "X", "4", "4", "4",
                     "5+"))

## Merge, but only by "CUSTOMER_ID"
datM <- merge(dat1, dat2, by = "CUSTOMER_ID")
datM
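## Select only cases whose "CUSTOMER_RR" agrees in both data frames.
## Compare row by row with ==; %in% would also keep a row whose
## rating merely appears somewhere in the other column.
datM1 <- datM[as.character(datM[, "CUSTOMER_RR.x"]) ==
              as.character(datM[, "CUSTOMER_RR.y"]), ]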
datM1
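
To see which customers appear in both data frames but with different
ratings, one option is to compare the two rating columns of the merged
result row by row. A minimal sketch, assuming small hypothetical data
frames db1 and db2 (not the poster's real data) that store both columns
as character strings:

```r
# Hypothetical miniature versions of the two data frames.
db1 <- data.frame(CUSTOMER_ID = c("1008195BR", "10126BR", "1007311BR"),
                  CUSTOMER_RR = c("4-", "5+", "5+"),
                  stringsAsFactors = FALSE)
db2 <- data.frame(CUSTOMER_ID = c("1008195BR", "10126BR", "100BR"),
                  CUSTOMER_RR = c("4", "5+", "4"),
                  stringsAsFactors = FALSE)

# Inner join on the ID only; the rating columns get .x / .y suffixes.
both <- merge(db1, db2, by = "CUSTOMER_ID")

# Row-wise comparison of the two ratings.
same <- both$CUSTOMER_RR.x == both$CUSTOMER_RR.y

agree    <- both[same, ]    # same ID and same rating
disagree <- both[!same, ]   # same ID, different rating
disagree$CUSTOMER_ID
```

Here agree holds the rows where ID and rating both match, and disagree
holds the IDs whose ratings would need checking.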

Regards,

Christoph

--------------------------------------------------------------

Credit and Surety PML study: visit our web page www.cs-pml.org

--------------------------------------------------------------
Christoph Buser <buser at stat.math.ethz.ch>
Seminar fuer Statistik, LEO C13
ETH Zurich	8092 Zurich	 SWITZERLAND
phone: x-41-44-632-4673		fax: 632-1228
http://stat.ethz.ch/~buser/
--------------------------------------------------------------



Priya Kanhai writes:
 > Hi,
 > 
 > I've a question about comparing 2 dataframes: RRC_db1 and RRC_db2 of
 > different length.
 > 
 > For example:
 > 
 > RRC_db1:
 > 
 >     CUSTOMER_ID CUSTOMER_RR
 > 1     1000786BR                   5+
 > 2     1002047BR                    4
 > 3       10127BR                   5+
 > 4  1004166834BR                    2
 > 5  1004310897BR                    X
 > 6     1006180BR                    4
 > 7    10064798BR                    4
 > 8     1007311BR                   5+
 > 9     1007621BR                    4
 > 10    1008195BR                   4-
 > 11      10126BR                   5+
 > 12   95323994BR                    4
 > 
 >  RRC_db2:
 > 
 >     CUSTOMER_ID CUSTOMER_RR
 > 1     1200786BR                   6+
 > 2     1802047BR                    4
 > 3        1027BR                   1+
 > 4    10166834BR                    2
 > 5         107BR                    X
 > 6         100BR                    4
 > 7      164798BR                    4
 > 8     1008195BR                   4-
 > 9       10126BR                   5+
 > 
 > 
 > I want to pick the CUSTOMER_ID of RRC_db1 which also exist in RRC_db2:
 > third <- merge(RRC_db1, RRC_db2) or
 > third <- subset(RRC_db1, CUSTOMER_ID %in% RRC_db2$CUSTOMER_ID)
 > 
 > But I also want to check if the CUSTOMER_RR is correct. I had tried this:
 > 
 > > test <- function(RRC_db1,RRC_db2)
 > + {
 > + noteq <- c()
 > + for( i in 1:length(RRC_db1$CUSTOMER_ID)){
 > + for( j in 1:length(RRC_db2$CUSTOMER_ID)){
 > + if(RRC_db1$CUSTOMER_ID[i] == RRC_db2$CUSTOMER_ID[j]){
 > + if(RRC_db1$CUSTOMER_RR[i] != RRC_db2$CUSTOMER_RR[j]){
 > + noteq <- c(noteq,RRC_db1$CUSTOMER_ID[i]);
 > + }
 > + }
 > + }
 > + }
 > + noteq;
 > + }
 > >
 > > test(RRC_db1, RRC_db2)
 > Error in Ops.factor(RRC_db1$CUSTOMER_ID[i], RRC_db2$CUSTOMER_ID[j]) :
 >         level sets of factors are different
 > 
 > 
 > But then I got this error.
 > 
 > I don't only want the CUSTOMER_ID to be the same but also the CUSTOMER_RR.
 > 
 > Can you please help me?
 > 
 > Thanks in advance.
 > 
 > Regards,
 > 
 > Priya
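
The error quoted above arises because == on two factors requires both
to have identical level sets, and the ID columns of the two data frames
have different levels. Converting to character before comparing avoids
it. A minimal sketch, assuming hypothetical miniature data and assuming
each CUSTOMER_ID occurs at most once in RRC_db2; match() stands in for
the nested loops:

```r
# Hypothetical miniature data; factors are created explicitly to mimic
# columns that were read in as factors.
RRC_db1 <- data.frame(CUSTOMER_ID = factor(c("1008195BR", "10126BR", "1007311BR")),
                      CUSTOMER_RR = factor(c("4-", "5+", "5+")))
RRC_db2 <- data.frame(CUSTOMER_ID = factor(c("100BR", "1008195BR", "10126BR")),
                      CUSTOMER_RR = factor(c("4", "4", "5+")))

# Comparing character strings sidesteps the factor level-set check.
id1 <- as.character(RRC_db1$CUSTOMER_ID)
id2 <- as.character(RRC_db2$CUSTOMER_ID)
rr1 <- as.character(RRC_db1$CUSTOMER_RR)
rr2 <- as.character(RRC_db2$CUSTOMER_RR)

# For each ID in RRC_db1, its row in RRC_db2 (NA if absent).
pos   <- match(id1, id2)
found <- !is.na(pos)

# IDs present in both data frames whose ratings differ.
noteq <- id1[found][rr1[found] != rr2[pos[found]]]
noteq
```

If an ID could occur several times in RRC_db2, match() would only see
its first occurrence, and a merge-based comparison may be the safer
choice.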
 > 

Priya Kanhai

2006-Nov-07 11:29 UTC

[R] comparing 2 dataframes

An embedded text with an unspecified character set was
scrubbed from the message ...
Name: not available
URL:
https://stat.ethz.ch/pipermail/r-help/attachments/20061107/49d2ac8c/attachment.pl
