Arnaud Gaboury
2012-Feb-27 14:46 UTC
[R] compare two data frames with same columns names but of different dimensions
Dear List,

I want to compare two data frames and return the rows that are NOT present in both of them. The usual methods don't work because the data frames do NOT have the same dimensions.

Here is an example of my data frames:

reported <-
structure(list(Product = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
3L, 4L, 5L, 5L), .Label = c("Cocoa", "Coffee C", "GC", "Sugar No 11",
"ZS"), class = "factor"), Price = c(2331, 2356, 2440, 2450, 204.55,
205.45, 17792, 24.81, 1273.5, 1276.25), Nbr.Lots = c(-61L, -61L,
5L, 1L, 40L, 40L, -1L, -1L, -1L, 1L)), .Names = c("Product",
"Price", "Nbr.Lots"), row.names = c(1L, 2L, 3L, 4L, 6L, 7L, 5L,
10L, 8L, 9L), class = "data.frame")

exportfile <-
structure(list(Product = c("Cocoa", "Cocoa", "Cocoa", "Coffee C",
"Coffee C", "GC", "Sugar No 11", "ZS", "ZS"), Price = c(2331,
2356, 2440, 204.55, 205.45, 17792, 24.81, 1273.5, 1276.25), Nbr.Lots = c(-61,
-61, 6, 40, 40, -1, -1, -1, 1)), .Names = c("Product", "Price",
"Nbr.Lots"), row.names = c(NA, 9L), class = "data.frame")

As you can see, they have the same column names.
My idea was to merge the two data frames while passing an argument along the lines of "do not take duplicate rows into account", so that I get one data frame containing only the rows that are not in both.
Is this possible? How can I do it?

Thanks for any help.

Arnaud Gaboury
A2CT2 Ltd.
Gaurav Sood
2012-Feb-27 21:30 UTC
[R] compare two data frames with same columns names but of different dimensions
m <- rbind(reported, exportfile)
# Build a key per row from all columns; without it, m$key does not exist
# and the last line returns no rows.
m$key <- do.call(paste, c(m, sep = "|"))
m1 <- m[duplicated(m),]
# Keep the rows whose key does not occur among the duplicated rows.
m[is.na(match(m$key, m1$key)),]
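The same idea can be sketched without a key column by calling duplicated() twice, once from each end, so that both copies of a matching row are flagged. This is only a sketch under assumptions: the two data frames are rebuilt here with data.frame() using the values from the original post, and the `source` column and the names `both`, `cols`, `dup` and `unmatched` are additions for illustration, not part of the original thread.

```r
# Values copied from the original post, rebuilt with data.frame().
reported <- data.frame(
  Product  = c("Cocoa", "Cocoa", "Cocoa", "Cocoa", "Coffee C", "Coffee C",
               "GC", "Sugar No 11", "ZS", "ZS"),
  Price    = c(2331, 2356, 2440, 2450, 204.55, 205.45, 17792, 24.81,
               1273.5, 1276.25),
  Nbr.Lots = c(-61, -61, 5, 1, 40, 40, -1, -1, -1, 1),
  stringsAsFactors = FALSE
)
exportfile <- data.frame(
  Product  = c("Cocoa", "Cocoa", "Cocoa", "Coffee C", "Coffee C", "GC",
               "Sugar No 11", "ZS", "ZS"),
  Price    = c(2331, 2356, 2440, 204.55, 205.45, 17792, 24.81, 1273.5,
               1276.25),
  Nbr.Lots = c(-61, -61, 6, 40, 40, -1, -1, -1, 1),
  stringsAsFactors = FALSE
)

# Stack the two data frames and record where each row came from
# (the `source` column is hypothetical, added only for readability).
both <- rbind(reported, exportfile)
both$source <- rep(c("reported", "exportfile"),
                   c(nrow(reported), nrow(exportfile)))

# Compare on the shared columns only, so `source` does not affect matching.
cols <- c("Product", "Price", "Nbr.Lots")

# duplicated() flags only the second copy of a repeated row; scanning again
# with fromLast = TRUE flags the first copy as well.
dup <- duplicated(both[cols]) | duplicated(both[cols], fromLast = TRUE)

# Rows that occur exactly once in the stacked data frame.
unmatched <- both[!dup, ]
print(unmatched)
```

With the data from the post this leaves three rows: the Cocoa lines at 2440 (5 lots) and 2450 (1 lot) from `reported`, and the Cocoa line at 2440 (6 lots) from `exportfile`. Note that a row repeated within a single data frame would also be dropped, and that the Price comparison relies on the values being exactly equal.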