Hi Guys,
I have a list of elements in two columns of a data frame. I want first to
subselect on V1 and then to form and count all possible unique triplets
of V1 with the corresponding elements in V2, but exclude triplets for which
a pair (V1 V2) does not exist:
Example input
V1 V2
A B
A C
D E
D F
D G
E F
E G
F G
Example output
DEF
DEG
DFG
EFG
(ABC is eliminated because the pair B C does not exist in the data frame)
Total: 4 triplets
Here is what I tried, but was unsuccessful:
uniq.V1 <- unique(df$V1)
original.pairs <- do.call(paste, c(df[c("V1", "V2")], sep = ":"))
nbElements <- 3
l.res <- lapply(uniq.V1, function(x) {
  ## x together with every V2 element paired with it
  set <- c(x, unlist(subset(df, V1 == x, select = c(V2))))
  if (length(set) >= nbElements) {
    tmp.combn <- combn(set, nbElements, simplify = FALSE)
    ## I tried here to create all possible combinations of pairs to test against
    ## the original pairs and return only the successful ones, but it became a
    ## very complicated structure ....
  }
})
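One way the idea in the comment above could be completed is sketched below. It is only a sketch: the names `find.groups` and `all.pairs.exist` are hypothetical, the columns are assumed to be character vectors named V1 and V2, and each pair is assumed to be listed once with its V1 element first, as in the example input.

```r
# Example data from the post (character columns, not factors).
df <- data.frame(
  V1 = c("A", "A", "D", "D", "D", "E", "E", "F"),
  V2 = c("B", "C", "E", "F", "G", "F", "G", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return every group of nbElements items in which
# all pairwise combinations occur as a row of df.
find.groups <- function(df, nbElements = 3) {
  original.pairs <- paste(df$V1, df$V2, sep = ":")

  # TRUE if every pair inside grp exists in df, in either orientation.
  all.pairs.exist <- function(grp) {
    p <- combn(grp, 2)
    fwd <- paste(p[1, ], p[2, ], sep = ":")
    bwd <- paste(p[2, ], p[1, ], sep = ":")
    all(fwd %in% original.pairs | bwd %in% original.pairs)
  }

  unlist(lapply(unique(df$V1), function(x) {
    partners <- df$V2[df$V1 == x]
    if (length(partners) < nbElements - 1) return(NULL)
    # Candidate groups: x plus any (nbElements - 1) of its partners.
    cand <- combn(partners, nbElements - 1, simplify = FALSE)
    cand <- lapply(cand, function(p) c(x, p))
    keep <- vapply(cand, all.pairs.exist, logical(1))
    vapply(cand[keep], paste, character(1), collapse = "")
  }))
}

find.groups(df, 3)  # "DEF" "DEG" "DFG" "EFG"
find.groups(df, 4)  # "DEFG"
```

Because nbElements is a parameter, the same function covers groups of 4, 5 and so on, at the cost of enumerating every candidate combination for each V1 value.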
Any help/suggestion is appreciated,
Best
--
View this message in context:
http://r.789695.n4.nabble.com/obtain-triplets-from-Data-Frame-columns-tp4673091.html
Sent from the R help mailing list archive at Nabble.com.
Try this:
merge(df, df, by.x="V2", by.y="V1")
Jean
On Mon, Aug 5, 2013 at 1:26 AM, PQuery <pierre.khoueiry@embl.de> wrote:
> [...]
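The self-join suggested in this reply can be checked against the example data as follows. This is a sketch under the assumption that both columns are character vectors named V1 and V2; the names `trip`, `triplets` and `chain` are introduced here for illustration only. Note that the join matches the V2 of one pair to the V1 of another, so it verifies the first-second and second-third pairs but not the first-third pair. On the example data that makes no difference, but the small `chain` data frame shows a case where it does.

```r
df <- data.frame(
  V1 = c("A", "A", "D", "D", "D", "E", "E", "F"),
  V2 = c("B", "C", "E", "F", "G", "F", "G", "G"),
  stringsAsFactors = FALSE
)

# Self-join: rows where the V2 of one pair equals the V1 of another pair.
m <- merge(df, df, by.x = "V2", by.y = "V1")

# merge() returns the key first, so reorder to (first, second, third).
trip <- m[, c(2, 1, 3)]
triplets <- sort(paste0(trip[[1]], trip[[2]], trip[[3]]))
triplets          # "DEF" "DEG" "DFG" "EFG"
length(triplets)  # 4

# A chain X-Y, Y-Z with no X-Z pair still yields one row (X, Y, Z),
# because the join never looks for the first-third pair.
chain <- data.frame(V1 = c("X", "Y"), V2 = c("Y", "Z"),
                    stringsAsFactors = FALSE)
nrow(merge(chain, chain, by.x = "V2", by.y = "V1"))  # 1
```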
Here's one way to extend the code to groups of 4 as well ...
Jean
# groups of 3
df2 <- df
names(df2) <- paste0("new", 1:2)
# join the second element of each pair to V1 of the original pairs
df3 <- merge(df2, df, by.x="new2", by.y="V1")[, c(2, 1, 3)]
names(df3) <- paste0("new", 1:3)
df3
# groups of 4
df4 <- merge(df3, df, by.x="new3", by.y="V1")[, c(2, 3, 1, 4)]
names(df4) <- paste0("new", 1:4)
df4
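The two steps above follow the same pattern, so they could be folded into a loop for any group size. The sketch below is one possible way to do that; the function name `chain.groups` is hypothetical, and the columns are assumed to be character vectors named V1 and V2. Like the merges above, it only joins each group's last element to V1, so it does not by itself confirm that every pair within a group exists.

```r
df <- data.frame(
  V1 = c("A", "A", "D", "D", "D", "E", "E", "F"),
  V2 = c("B", "C", "E", "F", "G", "F", "G", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: repeat the merge step until groups have nbElements columns.
chain.groups <- function(df, nbElements) {
  res <- df
  names(res) <- paste0("new", 1:2)
  while (ncol(res) < nbElements) {
    k <- ncol(res)
    # Join the last column of the current groups to V1 of the original pairs.
    res <- merge(res, df, by.x = paste0("new", k), by.y = "V1")
    # merge() puts the key column first; move it back to position k.
    res <- res[, c(2:k, 1, k + 1)]
    names(res) <- paste0("new", seq_len(k + 1))
  }
  res
}

nrow(chain.groups(df, 3))  # 4 rows: DEF, DEG, DFG, EFG
chain.groups(df, 4)        # 1 row: D E F G
```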
On Mon, Aug 5, 2013 at 2:07 PM, Pierre Khoueiry <pierre.khoueiry@embl.de> wrote:
> No, no, I'm buying it for sure. Thank you very much. It is way better.
>
> Just one issue. I wrote my code because I wanted not only triplets but also
> groups of 4, 5 and so on. This is why nbElements is defined in my code.
>
> My problem, I think, is that I am using scripting-language logic in R
> !!! I should
> Thanks,
>
> On 5 Aug 2013, at 21:01, Adams, Jean wrote:
>
> Hmmm. I'm not sure why you prefer 20-30 lines of looping code over a
> simpler 1 line solution, but, as you wish.
>
> merge(df, df, by.x="V2", by.y="V1")[, c(2, 1, 3)]
>
> Jean
>
>
> On Mon, Aug 5, 2013 at 12:37 PM, PQuery <pierre.khoueiry@embl.de> wrote:
>
>> Hello Jean,
>>
>> Thanks for the reply. However, your solution doesn't reproduce the output
>> that I desire.
>>
>> I updated my post with my solution full of loops.
>>
>> If there is a more fancy/elegant way, I'll take it.
>>
>> Best,
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> --
> =======================
> Pierre Khoueiry
> EMBL - Heidelberg
> Furlong group, room V320
> Meyerhofstraße 1,
> 69117 Heidelberg, Germany
> Tel: +49 (0)6221-387 8682
> =======================