Balamuta, James Joseph
2020-Nov-06 21:47 UTC
[Rd] Process to Incorporate Functions from {parallely} into base R's {parallel} package
Hi all,

Henrik Bengtsson has done some fantastic work with {future} and, more importantly, greatly improved constructing and deconstructing a parallelized environment within R. It was with great joy that I saw Henrik slowly split off some functionality of {future} into the {parallelly} package. Reading over the package's README, he states:

> The functions and features added to this package are written to be backward compatible with the parallel package, such that they may be incorporated there later.
> The parallelly package comes with an open invitation for the R Core Team to adopt all or parts of its code into the parallel package.

https://github.com/HenrikBengtsson/parallelly

I'm wondering what the appropriate process would be to slowly merge some functions from {parallelly} into the base R {parallel} package. Should this be done with targeted issues on Bugzilla for different fields Henrik has identified? Or would an omnibus patch bringing in all suggested modifications be preferred? Or is it best to discuss via the list-serv appropriate contributions?

Best,

JJB
Duncan Murdoch
2020-Nov-07 00:37 UTC
[Rd] Process to Incorporate Functions from {parallely} into base R's {parallel} package
On 06/11/2020 4:47 p.m., Balamuta, James Joseph wrote:
> Hi all,
>
> Henrik Bengtsson has done some fantastic work with {future} and, more importantly, greatly improved constructing and deconstructing a parallelized environment within R. It was with great joy that I saw Henrik slowly split off some functionality of {future} into {parallelly} package. Reading over the package's README, he states:
>
>> The functions and features added to this package are written to be backward compatible with the parallel package, such that they may be incorporated there later.
>> The parallelly package comes with an open invitation for the R Core Team to adopt all or parts of its code into the parallel package.
>
> https://github.com/HenrikBengtsson/parallelly
>
> I'm wondering what the appropriate process would be to slowly merge some functions from {parallelly} into the base R {parallel} package. Should this be done with targeted issues on Bugzilla for different fields Henrik has identified? Or would an omnibus patch bringing in all suggested modifications be preferred? Or is it best to discuss via the list-serv appropriate contributions?

One way is to convince R Core that incorporating this into the parallel package would

- make less work for them, or
- add a lot to R that couldn't happen if it was a contributed package.

The fact that it's good isn't a good reason to put it into a base package, which would largely mean transferring Henrik's workload to R Core. There are lots of good packages, and their maintainers should continue to maintain them.

Duncan Murdoch
Henrik Bengtsson
2020-Nov-07 18:39 UTC
[Rd] Process to Incorporate Functions from {parallely} into base R's {parallel} package
FWIW, there are indeed a few low-hanging bug fixes in 'parallelly' that should be easy to incorporate into 'parallel' without adding extra maintenance. For example, in parallel::makePSOCKcluster(), it is not possible to disable the SSH option '-l USER' so that the user name can be set in ~/.ssh/config instead. The remote user name will be the user name of your local machine, and if you try to set user=NULL, you'll end up with an invalid SSH call. The current behavior means that you are forced to specify the remote user name in your R code. All it takes to fix this is to update:

  cmd <- paste(rshcmd, "-l", user, machine, cmd)

to something like:

  cmd <- paste(rshcmd, if (length(user) == 1L) paste("-l", user), machine, cmd)

This is one example of what I've patched in parallelly::makeClusterPSOCK() over the years. Another is the use of reverse tunneling in SSH, which completely avoids the need to know and specify your public IP and to reconfigure the firewalls from the remote server back to your local machine so that the worker can connect back to your local machine. Not many users have permission to reconfigure firewalls, and it's also extremely tedious. Reverse SSH tunneling is super simple; all you need to do is something like:

  rshopts <- c(sprintf("-R %d:%s:%d", rscript_port, master, port), rshopts)

/Henrik
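Editorial illustration, not part of the original thread: the two changes Henrik describes (omitting '-l USER' when no user is given, and prepending a reverse-tunnel option) can be sketched in plain R roughly as follows. The helper names build_ssh_cmd() and add_reverse_tunnel(), the host name, the port numbers, and the placeholder worker command are all assumptions made for this example; they are not functions or values from 'parallel' or 'parallelly', and the real code in either package may be organized quite differently.

```r
# Hypothetical helper (not from 'parallel' or 'parallelly'): assemble the SSH
# command line used to launch a remote worker.
build_ssh_cmd <- function(rshcmd, machine, cmd, user = NULL,
                          rshopts = character(0L)) {
  # Emit "-l USER" only when exactly one user name is supplied; otherwise
  # leave it out so that ~/.ssh/config can decide the remote user.
  user_opt <- if (length(user) == 1L) paste("-l", user)
  # c() drops NULL, so an omitted user leaves no stray "-l" behind.
  paste(c(rshcmd, rshopts, user_opt, machine, cmd), collapse = " ")
}

# Hypothetical helper: prepend a reverse-tunnel option of the form
# "-R <rscript_port>:<master>:<port>" to the existing SSH options.
add_reverse_tunnel <- function(rshopts, rscript_port, master, port) {
  c(sprintf("-R %d:%s:%d", rscript_port, master, port), rshopts)
}

# Placeholder values chosen only for illustration.
opts <- add_reverse_tunnel(character(0L), 11000L, "localhost", 11000L)
with_user <- build_ssh_cmd("ssh", "remote.example.org", "Rscript --version",
                           user = "alice")
without_user <- build_ssh_cmd("ssh", "remote.example.org", "Rscript --version",
                              rshopts = opts)
print(with_user)
print(without_user)
```

With user = NULL the resulting command carries no '-l' option at all, which is the behavior the proposed one-line change is meant to allow.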