Giacomo May
2016-Jul-05 09:47 UTC
[R] Problems with parallel processing using the foreach package
Dear R-users,
I am trying my hand at parallel processing in R using the foreach package but it
is not working as I want it to. I am using a function I created myself
(Xenopus_Walk) which returns a vector. Now I would like to run this function for
every number that is saved in a vector (newly_populated_vec) and obtain a list
that stores every vector that has been created as one element of said list.
The command I am currently using is the following (I guess you can ignore most
of it, since it mostly consists of exported packages and parameters my function
relies on):
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
registerDoParallel(cl)
Xenopus_Data <- foreach(b = 1:length(newly_populated_vec),
                        .combine = list,
                        .multicombine = TRUE,
                        .packages = c("raster", "gdistance", "rgdal", "sp")) %dopar% {
  Xenopus_Walk(altdata = altdata,
               water = water,
               habitat_suitability = habitat_suitability,
               max_range_without_water = max_range_without_water,
               max_range = max_range,
               slope = slope,
               Start_Pt = newly_populated_vec[b])
}
The problem I have now is that the length of the returned list (Xenopus_Data) is
different from the length of the vector I take the iterator values from
(newly_populated_vec):
> length(Xenopus_Data)
[1] 47
> length(newly_populated_vec)
[1] 2027
While trying to figure out what is wrong I have read that one has to split up
the workload into equal chunks and pass each of them to a core, but as you can
probably tell, my understanding of all this is limited. I have a total of 32
cores at my disposal. Does anyone know why I have this problem, and maybe also
a way to solve it? I know that reproducible examples are desired,
but the function I use is pretty long and I doubt anyone is going to work
through it. Still, if I can help make things clearer by providing additional
information I will be glad to do so! Any type of help is greatly appreciated.
Thanks in advance!
EDIT: I forgot to add that when I look at the list it is nested, so I don't
get one element for every number in the vector but I get one element with
multiple sub-elements. Just in case that helps.
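
In case a small example helps, here is a minimal sketch of the same setup with
a stand-in function, so that nobody has to read through Xenopus_Walk. The names
toy_walk and start_points are made up for illustration and are not part of my
real code; I am only assuming that the real function behaves like any function
that returns a vector. The first call uses the same .combine and .multicombine
settings as my command above. The second call leaves both out; according to the
foreach documentation, the results are then returned in a plain list, one
element per iteration, which is the structure I am actually after.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in for Xenopus_Walk: returns a short numeric vector.
toy_walk <- function(Start_Pt) c(Start_Pt, Start_Pt^2)

# Made-up stand-in for newly_populated_vec.
start_points <- seq_len(250)

# Two workers are enough for the sketch.
cl <- makeCluster(2)
registerDoParallel(cl)

# Same combine settings as in my command: list() is used as the combine
# function, so partial results can end up wrapped inside each other.
nested <- foreach(b = seq_along(start_points),
                  .combine = list, .multicombine = TRUE) %dopar% {
  toy_walk(start_points[b])
}

# No .combine at all: foreach returns a list with one element per iteration.
flat <- foreach(b = seq_along(start_points)) %dopar% {
  toy_walk(start_points[b])
}

stopCluster(cl)

length(nested)  # may differ from length(start_points)
length(flat)    # same as length(start_points)
```

Comparing str(nested, max.level = 1) with str(flat, max.level = 1) should make
the difference between the two structures visible.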