On 03/25/2015 07:48 PM, Simon Urbanek wrote:

> On Mar 25, 2015, at 3:46 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:
>
>> Hi Simon,
>>
>> I'm having trouble with nested parallel workers, specifically with
>> forking inside socket connections.
>
> You simply can't, by definition: when you fork, *all* the workers share
> the same connection inherited from the parent, so you cannot use any
> I/O operations that you didn't start in the worker, since reading in
> one worker affects all the workers.

Sorry if I'm missing the obvious here - I thought that since the fork
workers were shut down by the time the SOCK worker returned to its
master, conflicting I/O wouldn't be a problem.

There are quite a few examples floating around where SOCK workers are
spawned on a cluster and multicore workers are called within them. If I
understand correctly, this should not be done (or at least not
encouraged). Instead, nested parallelism should only be done with
distributed-memory workers: SOCK, MPI, etc.

Thanks.
Valerie

> Cheers,
> Simon
>
>> When mclapply is called inside a SOCK, PSOCK or FORK worker, I get an
>> error in unserialize():
>>
>>     cl <- makeCluster(1, "SOCK")
>>
>>     fun <- function(i) {
>>         library(parallel)
>>         mclapply(1:2, sqrt)
>>     }
>>
>> Failure occurs after multiple calls to clusterApply:
>>
>>     > clusterApply(cl, 1, fun)
>>     [[1]]
>>     [[1]][[1]]
>>     [1] 1
>>
>>     [[1]][[2]]
>>     [1] 1.414214
>>
>>     > clusterApply(cl, 1, fun)
>>     [[1]]
>>     [[1]][[1]]
>>     [1] 1
>>
>>     [[1]][[2]]
>>     [1] 1.414214
>>
>>     > clusterApply(cl, 1, fun)
>>     Error in unserialize(node$con) : error reading from connection
>>
>> This example is from Martin and may be a different problem.
>>
>>     ~/tmp > cat test1.R
>>     ## like mclapply
>>     ## should run 'forever' but terminates semi-randomly
>>     library(parallel)
>>     children <- parallel:::children
>>
>>     while (TRUE) {
>>         n <- 8  ## n == detectCores()
>>         jobs <- lapply(seq_len(n), function(i) mcparallel(Sys.sleep(20)))
>>         mccollect(children(jobs), FALSE)
>>         parallel:::mckill(children(jobs), tools::SIGTERM)
>>         leni <- length(mccollect(children(jobs)))
>>         message("leni: ", leni)
>>     }
>>
>>     ~/tmp > R-dev --vanilla --slave -f test1.R
>>     leni: 6
>>     leni: 7
>>     leni: 7
>>     leni: 7
>>     leni: 7
>>     leni: 7
>>     leni: 7
>>     leni: 7
>>     leni: 8
>>     leni: 7
>>     leni: 7
>>     leni: 7
>>     ~/tmp >
>>
>> Thanks.
>> Valerie
>>
>>     > sessionInfo()
>>     R Under development (unstable) (2015-03-18 r68009)
>>     Platform: x86_64-unknown-linux-gnu (64-bit)
>>     Running under: Fedora 21 (Twenty One)
>>
>>     locale:
>>      [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>      [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>      [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>      [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>      [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>     [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>>     attached base packages:
>>     [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>     [8] base
>>
>>     loaded via a namespace (and not attached):
>>     [1] snow_0.3-13
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, Seattle, WA 98109
>>
>> Email: vobencha at fredhutch.org
>> Phone: (206) 667-3158
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
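A minimal sketch of the distributed-memory nesting Valerie describes,
assuming each outer PSOCK worker may start and stop its own inner PSOCK
cluster (the cluster sizes here are illustrative, not a recommendation):

    library(parallel)

    outer <- makeCluster(2, type = "PSOCK")

    res <- clusterApply(outer, 1:2, function(i) {
        library(parallel)
        ## Each inner worker is a freshly spawned process with its own
        ## socket, so nothing is inherited via fork().
        inner <- makeCluster(2, type = "PSOCK")
        on.exit(stopCluster(inner))
        ## Pass 'i' explicitly rather than relying on closure capture,
        ## so the inner cluster object isn't dragged into serialization.
        clusterApply(inner, 1:2, function(j, i) sqrt(i * j), i = i)
    })

    stopCluster(outer)

Because every level spawns new processes that talk over their own
sockets, no worker shares another worker's open connection, which is
exactly the property the forked children lack.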
On Mar 30, 2015, at 4:40 PM, Valerie Obenchain <vobencha at fredhutch.org> wrote:

> On 03/25/2015 07:48 PM, Simon Urbanek wrote:
>> You simply can't, by definition: when you fork, *all* the workers share
>> the same connection inherited from the parent [...]
>
> Sorry if I'm missing the obvious here - I thought that since the fork
> workers were shut down by the time the SOCK worker returned to its
> master, conflicting I/O wouldn't be a problem.

If the workers are done and don't use I/O, then all is well. However,
it's not easy to guarantee that they don't use I/O, since they all come
with already-active sockets; on exit, for example, they may flush the
socket buffers, which would confuse the recipient. Interestingly, your
example works fine on OS X but fails on Linux. I'll try to dig deeper in
a quiet minute. In principle it should be sufficient to close all FDs
right away, which you can do when using mcparallel() but not when using
mclapply().

Cheers,
Simon

> There are quite a few examples floating around where SOCK workers are
> spawned on a cluster and multicore workers are called within them. If I
> understand correctly, this should not be done (or at least not
> encouraged). Instead, nested parallelism should only be done with
> distributed-memory workers: SOCK, MPI, etc.
>
> [...]
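One way to read Simon's suggestion, as a sketch rather than a verified
fix: with mcparallel() the child's expression is under your control, so
the child can close inherited connections before doing any work, a hook
that mclapply() does not expose. Note that closeAllConnections() closes
R-level connections rather than raw file descriptors, so it only
approximates "close all FDs"; mcparallel()'s own result channel is set
up at the C level rather than as an R connection, so it should survive.
Unix-alikes only:

    library(parallel)

    fun <- function(i) {
        job <- mcparallel({
            ## First thing in the forked child: drop R-level connections
            ## inherited from the parent (e.g. the socket to the master).
            closeAllConnections()
            sqrt(i)
        })
        mccollect(job)[[1]]
    }

Whether this is sufficient in practice is exactly what Simon proposes to
dig into; it narrows, rather than removes, the window for the
buffer-flush problem he describes.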
On 03/30/2015 02:51 PM, Simon Urbanek wrote:

> If the workers are done and don't use I/O, then all is well. However,
> it's not easy to guarantee that they don't use I/O, since they all come
> with already-active sockets; on exit, for example, they may flush the
> socket buffers, which would confuse the recipient. [...] In principle
> it should be sufficient to close all FDs right away, which you can do
> when using mcparallel() but not when using mclapply().

I see. Thanks for the explanation.

Valerie