Hi Guys,

I have a list of elements in two columns of a data frame. I want first to subset on V1 and then to form and count all possible unique triplets of V1 with the corresponding elements in V2, excluding triplets for which a pair (V1 V2) does not exist:

Example input

    V1 V2
    A  B
    A  C
    D  E
    D  F
    D  G
    E  F
    E  G
    F  G

Example output

    DEF
    DEG
    DFG
    EFG

(ABC is eliminated because the pair B C does not exist in the data frame)

Total: 4 triplets

Here is what I tried, but was unsuccessful:

    uniq.V1 <- unique(df$V1)
    original.pairs <- do.call(paste, c(df[c("V1", "V2")], sep = ":"))
    nbElements <- 3

    l.res <- lapply(uniq.V1, function(x) {
      set <- c(x, unlist(subset(df, V1 == x, select = c(V2))))
      if (length(set) >= nbElements) {
        tmp.combn <- combn(set, nbElements, simplify = FALSE)
        ## I tried here to create all possible combinations of pairs to test
        ## against the original pairs and return only the successful ones,
        ## but it became a very complicated structure ....
      }
    })

Any help/suggestion is appreciated,
Best

--
View this message in context: http://r.789695.n4.nabble.com/obtain-triplets-from-Data-Frame-columns-tp4673091.html
Sent from the R help mailing list archive at Nabble.com.

Try this:

    merge(df, df, by.x="V2", by.y="V1")

Jean

On Mon, Aug 5, 2013 at 1:26 AM, PQuery <pierre.khoueiry@embl.de> wrote:
> [...]
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
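
For readers following along, here is a minimal sketch of how the self-join suggested above plays out on the example input. It assumes the columns are named V1 and V2 as in the original post. The names `left`, `right`, `chains`, `pair.keys` and `triplets` are illustrative only, and the extra filter on the closing pair is an addition that is not part of the reply; it is there because the question asks that every pair in a triplet exist.

```r
# Example input from the original post (column names V1/V2 as shown there).
df <- data.frame(
  V1 = c("A", "A", "D", "D", "D", "E", "E", "F"),
  V2 = c("B", "C", "E", "F", "G", "F", "G", "G"),
  stringsAsFactors = FALSE
)

# Rename before merging so the joined result has unambiguous column names.
left  <- setNames(df, c("a", "b"))   # pair (a, b)
right <- setNames(df, c("b", "c"))   # pair (b, c)

# Self-join: every pair (a, b) is extended by every pair (b, c).
chains <- merge(left, right, by = "b")[, c("a", "b", "c")]

# Assumption beyond the reply: also require the closing pair (a, c).
pair.keys <- paste(df$V1, df$V2, sep = ":")
triplets  <- chains[paste(chains$a, chains$c, sep = ":") %in% pair.keys, ]

triplet.labels <- sort(paste0(triplets$a, triplets$b, triplets$c))
print(triplet.labels)
length(triplet.labels)
```

On this particular input the closing-pair filter removes nothing, so the plain merge already gives the four expected triplets; on other data a chain a-b, b-c without a-c would be dropped by the filter.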

Here's one way to extend the code to groups of 4 as well ...

Jean

    # 3: extend each pair by a pair that starts where it ends
    df2 <- df
    names(df2) <- paste0("new", 1:2)
    df3 <- merge(df2, df, by.x="new2", by.y="V.1")[, c(2, 1, 3)]
    names(df3) <- paste0("new", 1:3)
    df3

    # 4: extend each triplet the same way
    df4 <- merge(df3, df, by.x="new3", by.y="V.1")[, c(2, 3, 1, 4)]
    names(df4) <- paste0("new", 1:4)
    df4

On Mon, Aug 5, 2013 at 2:07 PM, Pierre Khoueiry <pierre.khoueiry@embl.de> wrote:
> No no, I'm buying it for sure. Thank you very much. It is way better.
>
> Just one issue. I wrote my code because I wanted triplets but also
> groups of 4, 5 and so on. This is why nbElements is defined in my code.
>
> My problem, I think, is that I am using scripting-language logic in R
> !!! I should
> Thanks,
>
> On 5 Aug 2013, at 21:01, Adams, Jean wrote:
>
> > Hmmm. I'm not sure why you prefer 20-30 lines of looping code over a
> > simpler 1 line solution, but, as you wish.
> >
> > merge(df, df, by.x="V.2", by.y="V.1")[, c(2, 1, 3)]
> >
> > Jean
> >
> > On Mon, Aug 5, 2013 at 12:37 PM, PQuery <pierre.khoueiry@embl.de> wrote:
> >
> >> Hello Jean,
> >>
> >> Thanks for the reply. However, your solution doesn't reproduce the
> >> output that I desire.
> >>
> >> I updated my post with my solution full of loops.
> >>
> >> If there is a more fancy/elegant way, I'll take it.
> >>
> >> Best,
> >>
> >> --
> >> View this message in context:
> >> http://r.789695.n4.nabble.com/obtain-triplets-from-Data-Frame-columns-tp4673091p4673164.html
>
> --
> =======================
> Pierre Khoueiry
> EMBL - Heidelberg
> Furlong group, room V320
> Meyerhofstraße 1,
> 69117 Heidelberg, Germany
> Tel: +49 (0)6221-387 8682
> =======================
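
Since the thread asks for groups of 4, 5 and so on, the repeated merge can be wrapped in a loop. What follows is only a sketch under stated assumptions, not code from the thread: `find_groups` and its arguments are hypothetical names, the columns are assumed to be V1 and V2 as in the example input (the replies above use V.1 and V.2), and each pair is assumed to be stored in a consistent order, with the first element always preceding the second, as in the example. The final filter, which requires every pair inside a group to be present, is likewise an addition motivated by the original question.

```r
# Hypothetical helper (not from the thread): groups of size n in which
# every pair (earlier element, later element) is a row of `pairs`.
find_groups <- function(pairs, n) {
  stopifnot(ncol(pairs) == 2, n >= 2)
  pair.keys <- paste(pairs[[1]], pairs[[2]], sep = ":")
  step   <- setNames(pairs, c("from", "to"))
  groups <- setNames(pairs, c("e1", "e2"))

  # Grow the chains one element at a time by merging on the last column.
  k <- 2
  while (k < n) {
    ext <- merge(groups, step, by.x = paste0("e", k), by.y = "from")
    names(ext)[names(ext) == "to"] <- paste0("e", k + 1)
    groups <- ext[, paste0("e", seq_len(k + 1)), drop = FALSE]
    k <- k + 1
  }

  # Keep only rows where all choose(n, 2) pairs exist in the input.
  idx <- combn(n, 2)
  ok  <- rep(TRUE, nrow(groups))
  for (j in seq_len(ncol(idx))) {
    key <- paste(groups[[idx[1, j]]], groups[[idx[2, j]]], sep = ":")
    ok  <- ok & key %in% pair.keys
  }
  groups[ok, , drop = FALSE]
}

# Example input from the original post.
df <- data.frame(
  V1 = c("A", "A", "D", "D", "D", "E", "E", "F"),
  V2 = c("B", "C", "E", "F", "G", "F", "G", "G"),
  stringsAsFactors = FALSE
)

g3 <- find_groups(df, 3)   # triplets
g4 <- find_groups(df, 4)   # groups of four
g5 <- find_groups(df, 5)   # none expected on this input
print(g3)
print(g4)
```

Calling `find_groups(df, nbElements)` would then play the role of `nbElements` in the original attempt. Filtering once at the end keeps the loop simple; for large inputs it may be worth filtering after each merge so the intermediate tables stay small.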