arun
2013-Apr-12 20:06 UTC
[R] Removing rows that are duplicates but column values are in reversed order
Hi,

From your example data,

dat1 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
c e 12
", sep="", header=TRUE, stringsAsFactors=FALSE)

# it is easier to get the output you wanted
dat1[!duplicated(dat1$value), ]
#   id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12

But if you have cases like the one below (assuming that all rows whose ids appear in reversed order have the same value):

dat2 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
e c 12
c e 12
", sep="", header=TRUE, stringsAsFactors=FALSE)

# keep only the rows where id1 sorts before id2
dat2[apply(dat2[,-3], 1, function(x) {x1 <- order(x); x1[1] < x1[2]}), ]
#   id1 id2 value
#1   a   b    10
#2   c   d    11
#5   c   e    12

# or you have cases like these:
dat3 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
a b 10
e c 12
c e 12
c d 11
", sep="", header=TRUE, stringsAsFactors=FALSE)

dat3New <- dat3[apply(dat3[,-3], 1, function(x) {x1 <- order(x); x1[1] < x1[2]}), ]
dat3New[!duplicated(dat3New$value), ]
#   id1 id2 value
#1   a   b    10
#2   c   d    11
#6   c   e    12

A.K.

> Hi everybody,
>
> I was hoping that someone could help me with this problem. I have a table with 3 columns. Some rows contain duplicates where the identifiers in
> columns 1 and 2 are in reverse order, but the value associated with the row is the same.
>
> For example:
>
> id1 id2 value
> a   b   10
> c   d   11
> b   a   10
> c   e   12
>
> Rows 1 and 3 are duplicates (have the same value). I would like to retain only row 1 and delete row 3. Final table should look like this:
>
> id1 id2 value
> a   b   10
> c   d   11
> c   e   12
>
> Thanks in advance for any help provided.
>
> Vince
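One caveat on the order()-based filter above: it keeps a row only when id1 sorts before id2, so a pair that appears only in the "reversed" orientation (say, e c with no matching c e) would be dropped altogether. A possible alternative, sketched here under the assumption that the columns are named id1, id2 and value as in the example, is to build an order-independent key with pmin()/pmax() and keep the first occurrence of each key. The function name drop_reversed_dups and the data frame dat4 are made up for illustration and do not come from the thread.

```r
# Hypothetical helper (not from the thread): keep the first row for each
# unordered (id1, id2) pair together with its value.
drop_reversed_dups <- function(dat, id_cols = c("id1", "id2"), value_col = "value") {
  a <- as.character(dat[[id_cols[1]]])
  b <- as.character(dat[[id_cols[2]]])
  # pmin/pmax put each pair into a canonical order, so "b a" and "a b"
  # produce the same key
  key <- data.frame(lo = pmin(a, b),
                    hi = pmax(a, b),
                    value = dat[[value_col]],
                    stringsAsFactors = FALSE)
  dat[!duplicated(key), , drop = FALSE]
}

# Illustrative data: the pair e/c occurs only in "reversed" orientation
dat4 <- read.table(text="
id1 id2 value
b a 10
c d 11
a b 10
e c 12
", header=TRUE, stringsAsFactors=FALSE)

res4 <- drop_reversed_dups(dat4)
res4   # keeps rows 1, 2 and 4; row 3 is the reversed duplicate of row 1
```

Because the value column is part of the key, two rows with the same pair of ids but different values are both kept, which may or may not be what is wanted for a given data set.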
vpr3
2013-Apr-12 20:18 UTC
[R] Removing rows that are duplicates but column values are in reversed order
Thanks very much for your rapid help, Arun.

Vince