Hi all,

I have a few cloud instances that I want to use for parallel computing. I would like to create a socket cluster on my local machine to control the remote instances. Here is my network setup:

    local machine -- NAT -- Internet -- cloud instances

In the parallel package, the server calls `makeCluster()` and listens for connections from the workers. In my case, the server is the local machine and the workers are the cloud instances. However, since the local machine is hidden behind the NAT, it does not have a public address and the workers cannot connect to it. Therefore, `makeCluster()` never sees a connection from the workers and hangs forever.

One way to let an external machine reach a device behind a NAT is port forwarding. However, this does not work in my case because the NAT is set up by the network provider (not my home router), so I do not have access to the router. As the cloud instances have public addresses, I wonder if there is any way to build the cluster by letting the server connect to the cloud instances instead. I have checked `?parallel::makeCluster` and `?snow::makeSOCKcluster` but found nothing. The only promising solution I can see now is TCP hole punching, but it is quite complicated and may not work in every case. Since opening a connection from the local machine to a remote one is easy, I would like to know whether a simpler solution exists. I have searched on Google for a week without finding an answer. I would appreciate any suggestions!

Best,
Jiefei
Henrik Bengtsson
2021-Jan-18 17:22 UTC
[R] parallel: socket connection behind a NAT router
If you have SSH access to the workers, then

    workers <- c("machine1.example.org", "machine2.example.org")
    cl <- parallelly::makeClusterPSOCK(workers)

should do it. It does this without admin rights or port forwarding. See also the README at https://cran.r-project.org/package=parallelly.

/Henrik

On Mon, Jan 18, 2021 at 6:45 AM Jiefei Wang <szwjf08 at gmail.com> wrote:
> [...]
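
The suggestion above could be spelled out roughly as follows. This is only a sketch: the wrapper name `start_remote_cluster()`, the host names, and the user name are hypothetical placeholders, and it assumes key-based SSH login and a working `Rscript` on every cloud instance. The relevant piece is `revtunnel = TRUE`: the local machine opens the SSH connection to each worker, and the worker's socket connection back to the master travels through a reverse tunnel inside that SSH session, so nothing has to reach the local machine through the NAT.

```r
# Sketch only: the function name, host names, and user name are hypothetical.
start_remote_cluster <- function(workers, user = NULL) {
  parallelly::makeClusterPSOCK(
    workers,
    user      = user,   # SSH user name on the workers (NULL = SSH default)
    revtunnel = TRUE,   # worker connects back through a reverse SSH tunnel
    verbose   = TRUE    # print diagnostic output while launching workers
  )
}

# Usage (needs SSH access to the workers, so it is left commented out):
# cl <- start_remote_cluster(
#   c("machine1.example.org", "machine2.example.org"),
#   user = "alice"
# )
# parallel::clusterEvalQ(cl, Sys.info()[["nodename"]])  # ask each worker for its host name
# parallel::stopCluster(cl)
```

The returned object is an ordinary cluster object, so the usual functions from the parallel package (`clusterEvalQ()`, `parLapply()`, `stopCluster()`) should work on it unchanged.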