thr3ads.net - R help - [R] Nested foreach loops in R repeating items [Feb 2014]

arun

2014-Feb-05 23:26 UTC

[R] Nested foreach loops in R repeating items

Hi,
Try ?duplicated()
apply(x, 2, function(x) {x[duplicated(x)] <- ""; x})
A.K.



Hi all,

I have a dataset of around a thousand columns and a few thousand
rows. I'm trying to get all the possible combinations (without
repetition) of the data columns and process them in parallel. Here's a
simplification of what my data and my code look like:

mydata <- structure(list(col1 = c(231L, 8946L, 534L), col2 = c(123L, 2361L, 65L),
    col3 = c(5645L, 45L, 51L), col4 = c(654L, 356L, 32L), col5 = c(21L, 1L, 51L),
    col6 = c(4L, 4515L, 15L), col7 = c(6L, 1L, 535L), col8 = c(894L, 20L, 35L),
    col9 = c(68L, 21L, 123L), col10 = c(46L, 2L, 2L)),
    .Names = c("col1", "col2", "col3", "col4", "col5",
               "col6", "col7", "col8", "col9", "col10"),
    class = "data.frame", row.names = c(NA, -3L))

require(foreach) 

x <-
  foreach(m=1:5, .combine='cbind') %:%
    foreach(j=(m+1):10, .combine='c') %do% {
      paste(colnames(mydata)[m], colnames(mydata)[j])
    }

x 



If you execute the command above in R, you will get this result:



      result.1     result.2     result.3     result.4     result.5
 [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6"
 [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7"
 [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8"
 [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9"
 [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"  "col5 col10"
 [6,] "col1 col7"  "col2 col8"  "col3 col9"  "col4 col10" "col5 col6"
 [7,] "col1 col8"  "col2 col9"  "col3 col10" "col4 col5"  "col5 col7"
 [8,] "col1 col9"  "col2 col10" "col3 col4"  "col4 col6"  "col5 col8"
 [9,] "col1 col10" "col2 col3"  "col3 col5"  "col4 col7"  "col5 col9"

Notice the first problem I face: the last row of the second column of
the "x" matrix says "col2 col3", which is a repetition of the first item
in that column (the same thing happens in all succeeding columns). I was
planning to have unique combinations of all columns, which obviously
did not work.
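
A likely explanation for the repeats, offered as a small sketch rather than a confirmed diagnosis of the loop above: the inner loops return vectors of different lengths, and cbind() recycles the shorter ones to the length of the longest. The vectors a and b below are shortened, made-up stand-ins for two inner-loop results; padding with NA is just one possible way to keep the columns ragged.

```r
# Two result vectors of unequal length, standing in for the inner-loop
# results for m = 1 and m = 2 (shortened to 3 and 2 items for illustration)
a <- c("col1 col2", "col1 col3", "col1 col4")
b <- c("col2 col3", "col2 col4")

# cbind() recycles b to length 3; it warns because 3 is not a multiple of 2
recycled <- suppressWarnings(cbind(a, b))

# Extending the shorter vector first pads it with NA instead of repeating it
n <- max(length(a), length(b))
length(b) <- n
padded <- cbind(a, b)
```

In recycled, the third entry of column b is the first item again; in padded it is NA.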

Can somebody please help me with this? My desired output would be 



      result.1     result.2     result.3     result.4     result.5
 [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6"
 [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7"
 [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8"
 [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9"
 [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"
 [6,] "col1 col7"  "col2 col8"  "col3 col9"
 [7,] "col1 col8"  "col2 col9"
 [8,] "col1 col9"  "col2 col10"
 [9,] "col1 col10"


Many thanks

Bert Gunter

2014-Feb-06 00:42 UTC


I don't think you answered the OP's query, although I confess that I
am not so sure I understand it either (see below). In any case, I
believe the R level loop (i.e. apply()) is unnecessary. There is a
unique (and a duplicated()) method for data frames, so simply

unique(x)

returns a data frame with all the unique rows of x.
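
To illustrate the row-wise behaviour described here, a minimal sketch on a made-up data frame (df and its columns are invented for the example, not taken from the thread):

```r
# Toy data frame in which row 2 repeats row 1
df <- data.frame(a = c(1, 1, 2), b = c("u", "u", "v"))

# The data frame methods compare whole rows, not individual cells
dup_rows  <- duplicated(df)   # flags the second occurrence of a row
uniq_rows <- unique(df)       # keeps the first occurrence of each row
```

Here dup_rows marks only the second row, and uniq_rows keeps rows 1 and 3.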

However, I don't think that's what the OP wanted. (S)he appeared to
want all unique combinations of 2 columns. If I got that right (??),
combn(ncol(x),2) gives that and could be used for indexing. I'm not
sure parallel processing is useful here, but then again, I may have
misunderstood the query. If so, my apologies, and feel free to ignore
all the above :-(  .
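
A rough sketch of the combn() idea, assuming the goal really is every unordered pair of columns. The data frame below is a made-up stand-in with the same column names as the poster's mydata, and the names idx, pairs, by_first and first_pair are only illustrative.

```r
# Stand-in data: 3 rows, 10 columns named col1 ... col10
mydata <- as.data.frame(matrix(seq_len(30), nrow = 3,
                               dimnames = list(NULL, paste0("col", 1:10))))

# Each column of idx holds one pair of column indices: 2 x choose(10, 2)
idx <- combn(ncol(mydata), 2)

# One label per pair, with no repeats
pairs <- paste(colnames(mydata)[idx[1, ]], colnames(mydata)[idx[2, ]])

# Ragged grouping by first column, kept as a list so nothing gets recycled
first <- factor(colnames(mydata)[idx[1, ]], levels = colnames(mydata)[1:9])
by_first <- split(pairs, first)

# The index matrix can be used directly to pull out the data for one pair
first_pair <- mydata[, idx[, 1]]
```

A list is used for by_first because the groups have different lengths (9 pairs starting with col1, down to 1 starting with col9), which a matrix cannot hold without padding.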


Cheers,
Bert




Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




