Hi,

I want to select elements which have duplicates but are not all duplicated.

Here is what I mean. Suppose I have a two-column matrix with columns "Country" and "Pet":

Country, Pet
------------------
France, Dog
France, Cat
France, Dog
Canada, Cat
Canada, Cat
Japan, Dog
Japan, Cat
Italy, Cat

I want to extract all the entries that are duplicated in column "Country" but not ALL duplicated in column "Pet".

In this case I want

Country, Pet
------------------
France, Dog
France, Cat
France, Dog
Japan, Dog
Japan, Cat

Notice that I keep France, because not all are duplicated. If there were no entry "France, Cat", then all of the entries with "France" would be eliminated.

Thanks for your help.
Try this:

x <- read.csv(text = "Country, Pet
France, Dog
France, Cat
France, Dog
Canada, Cat
Canada, Cat
Japan, Dog
Japan, Cat
Italy, Cat", as.is = TRUE)

# split by Country and then see if dups in "Pet"
xs <- split(x, x$Country)
Dups <- do.call(rbind,
    lapply(xs, function(.country) {
        # drop the group if every Pet equals the first one
        if (all(.country$Pet[1L] == .country$Pet)) return(NULL)
        .country  # return match
    })
)
row.names(Dups) <- NULL  # remove rownames before printing
Dups
#   Country  Pet
# 1  France  Dog
# 2  France  Cat
# 3  France  Dog
# 4   Japan  Dog
# 5   Japan  Cat

On Sat, Jul 13, 2013 at 4:12 PM, Vesco Miloushev wrote:
> I want to select elements which have duplicates but are not all duplicated.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
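For anyone following along, here is a minimal sketch of the same idea that filters the original rows instead of rebinding the split pieces, so the original row order and row names are kept. It assumes the example data are typed in with data.frame() rather than read with read.csv(); the names n_pets, keep, and res are made up for illustration and do not come from the thread.

```r
# Rebuild the example data (assumption: typed in directly, no stray spaces)
x <- data.frame(
  Country = c("France", "France", "France", "Canada", "Canada",
              "Japan", "Japan", "Italy"),
  Pet     = c("Dog", "Cat", "Dog", "Cat", "Cat", "Dog", "Cat", "Cat"),
  stringsAsFactors = FALSE
)

# Number of distinct pets per country
n_pets <- sapply(split(x$Pet, x$Country), function(p) length(unique(p)))

# Countries that have more than one distinct pet
keep <- names(n_pets)[n_pets > 1]

# Keep only the rows belonging to those countries
res <- x[x$Country %in% keep, ]
res
```

A country that appears only once (Italy here) has a single distinct pet, so the same test drops it without a separate check for duplicates in "Country".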
Hi,

Maybe this helps:

dat1 <- read.table(text = "
Country, Pet
France, Dog
France, Cat
France, Dog
Canada, Cat
Canada, Cat
Japan, Dog
Japan, Cat
Italy, Cat
", sep = ",", header = TRUE, stringsAsFactors = FALSE)

# keep rows whose Country has more than one distinct Pet
dat1[with(dat1, as.numeric(ave(Pet, Country, FUN = function(x) length(unique(x))))) > 1, ]
#   Country  Pet
# 1  France  Dog
# 2  France  Cat
# 3  France  Dog
# 6   Japan  Dog
# 7   Japan  Cat

A.K.

----- Original Message -----
From: Vesco Miloushev
To: r-help at r-project.org
Sent: Saturday, July 13, 2013 4:12 PM
Subject: [R] "not all duplicated" question
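One detail that may be worth spelling out: as far as I understand ave(), its result keeps the type of its first argument, so with the character column Pet the per-group counts come back as strings, which is why as.numeric() appears above. Below is a sketch of a variant that passes row indices instead, so the counts stay numeric. It assumes dat1 is rebuilt with data.frame(); the names n_distinct and res are only illustrative.

```r
# Rebuild the example data (assumption: typed in directly, no stray spaces)
dat1 <- data.frame(
  Country = c("France", "France", "France", "Canada", "Canada",
              "Japan", "Japan", "Italy"),
  Pet     = c("Dog", "Cat", "Dog", "Cat", "Cat", "Dog", "Cat", "Cat"),
  stringsAsFactors = FALSE
)

# For each row, the number of distinct pets within that row's country.
# ave() is given row indices, and FUN looks up the pets for those rows.
n_distinct <- ave(seq_len(nrow(dat1)), dat1$Country,
                  FUN = function(i) length(unique(dat1$Pet[i])))
n_distinct
# [1] 2 2 2 1 1 2 2 1

# Keep rows whose country has more than one distinct pet
res <- dat1[n_distinct > 1, ]
res
```

Because the result of ave() is aligned with the rows of dat1, it can be used directly as a logical filter, and the original row names (1, 2, 3, 6, 7) are preserved.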