Julien Textoris
2012-Jan-03 13:19 UTC
[R] Error when using foreach package for parallelization
Hi,

I tried to find the answer but couldn't, so my apologies if the question is obvious! I'm trying to parallelize the following R code:

    pk2test = c(1:16,(12*16+1):(12*16+16),(16*16+1):(16*16+16),(20*16+1):(20*16+16))
    score.mat = matrix(nc=16*4,nr=16*4)
    for(i in 1:(16*4)) {
      for(j in i:(16*4)) {
        score.mat[i,j] = score.mat[j,i] =
          computeScore(pk[[pk2test[i]]],pk[[pk2test[j]]],10,5)$score
      }
    }

pk is a list of objects of type MassPeak from the MALDIquant library. Each object consists of a mass vector (@mass), an intensity vector (@intensity) and a metaData field (another list).

score.mat is a matrix of scores (reals).
pk2test is just a vector that says which objects in pk I want to deal with.
computeScore is the function I wrote to compute the score; it calls another function called filterSpectra.

I wrote the parallel version as follows and got the error below, and I can't figure out why:

    pk2test = c(1:16,(12*16+1):(12*16+16),(16*16+1):(16*16+16),(21*16+1):(21*16+16))
    score.mat = matrix(nc=16*4,nr=16*4)
    for(i in 1:4) {
      score.mat[i,i:4] =
        foreach(filterSpectra=filterSpectra,
                computeScore=computeScore,
                pk=pk,pk2test=pk2test,i=i,j=c(i:4),
                .combine="c", .packages="MALDIquant" ) %dopar% {
          computeScore(pk[[pk2test[i]]],pk[[pk2test[j]]],10,5)$score
        }
    }

    Error: trying to get slot "mass" from an object of a basic class
    ("integer") with no slots.

I think it is because, when parallelizing, objects are sent to child R sessions, but I don't understand how I have to write things to get it to work.

If someone has a clue, it would be really appreciated and would save me a lot of computing time!

Thanks in advance,

Julien

--
Sent from my ENIAC
Julien Textoris
CCA - Service d'anesthésie et de réanimation
Hôpital Nord, Assistance Publique - Hôpitaux de Marseille
chemin des bourrely
13915 Marseille
+33 491 965 531
Mikko Korpela
2012-Jan-10 19:06 UTC
[R] Error when using foreach package for parallelization
On 01/03/2012 03:19 PM, Julien Textoris wrote:
> I'm trying to parallelize the following R code :
>
> pk2test =
> c(1:16,(12*16+1):(12*16+16),(16*16+1):(16*16+16),(20*16+1):(20*16+16))
> score.mat = matrix(nc=16*4,nr=16*4)
> for(i in 1:(16*4)) {
>
> for(j in i:(16*4)) {
> score.mat[i,j] = score.mat[j,i] =
> computeScore(pk[[pk2test[i]]],pk[[pk2test[j]]],10,5)$score
> }
> }
>
> pk is a list of Object of type MassPeak from MALDIquant library. Each
> object is composed with a mass vector (@mass) an intensity vector
> (@intensity) and a metaData field (another list)
>
> score.mat is a matrix with scores (reals)
> pk2test is just a vector to know which objects in pk i want to deal with
> computeScore is the function i wrote to compute the score, it calls
> another function called filterSpectra
>
>
> I write the function like that, and i got the error below, and i can't
> figure out why ?
>
> pk2test =
> c(1:16,(12*16+1):(12*16+16),(16*16+1):(16*16+16),(21*16+1):(21*16+16))
> score.mat = matrix(nc=16*4,nr=16*4)
> for(i in 1:4) {
> score.mat[i,i:4] =
> foreach(filterSpectra=filterSpectra,
> computeScore=computeScore,
> pk=pk,pk2test=pk2test,i=i,j=c(i:4),
> .combine="c", .packages="MALDIquant" ) %dopar% {
> computeScore(pk[[pk2test[i]]],pk[[pk2test[j]]],10,5)$score
> }
> }
>
> Error: trying to get slot "mass" from an object of a basic class
> ("integer") with no slots.

Hi Julien!

Are these two code sequences supposed to produce the same result? The two definitions of pk2test are slightly different (the last block starts at 20*16+1 in the first version and at 21*16+1 in the second). Also, in the attempted parallelized version, you are only assigning to a small part of score.mat (rows 1 to 4, columns i to 4). Is that intentional?

The real error in this case seems to be that you mistakenly redefine some variables in the foreach() call. As far as I can tell, you should not redefine the variables 'filterSpectra', 'computeScore', 'pk', 'pk2test' or 'i'. In the foreach() call, you should only define iteration variables, i.e. variables whose value changes from one iteration to another (like 'j').

As written, you are accidentally iterating over these data structures. For example, 'pk' inside the %dopar% loop is a single element of the original 'pk' list (which may get overwritten, depending on whether the loop is actually run in parallel). This is probably not what you want.

- Mikko Korpela
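The advice above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the poster's actual code: 'pk', 'pk2test' and 'computeScore' are hypothetical toy stand-ins (plain integer vectors and a dummy scoring function instead of MALDIquant objects), and 'show.pitfall' is a helper invented for the demonstration. It uses %do% so that it runs without a parallel backend.

```r
library(foreach)

# Hypothetical stand-ins (assumptions, not from the original post):
# plain integer vectors instead of MALDIquant objects, and a toy
# computeScore() that returns a list with a $score element.
pk <- lapply(1:8, seq_len)
pk2test <- c(1, 3, 5, 7)
computeScore <- function(a, b, x, y) list(score = length(a) + length(b))

# Pitfall: a list named as a foreach() argument is iterated over, so
# inside the loop 'pk' is one element of the list, not the list itself.
# Wrapped in a function so that %do% does not overwrite the global 'pk'.
show.pitfall <- function(pk) foreach(pk = pk, j = 1:2) %do% is.list(pk)
pitfall <- unlist(show.pitfall(pk))

# Fix: only 'j' is an iteration variable. pk, pk2test, computeScore and
# i are ordinary variables that the loop body simply refers to.
n <- length(pk2test)
score.mat <- matrix(NA_real_, nrow = n, ncol = n)
for (i in seq_len(n)) {
  row.scores <- foreach(j = i:n, .combine = "c") %do% {
    computeScore(pk[[pk2test[i]]], pk[[pk2test[j]]], 10, 5)$score
  }
  score.mat[i, i:n] <- row.scores  # upper triangle, including the diagonal
  score.mat[i:n, i] <- row.scores  # mirror into the lower triangle
}
```

To run this in parallel, one would presumably register a backend (for example with the doParallel package) and replace %do% with %dopar%, keeping .packages="MALDIquant" as in the original call. Depending on the backend, helper functions that are only called indirectly (such as filterSpectra) may also need to be listed in the .export argument of foreach().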