Henrik Bengtsson
2021-Jan-18 17:22 UTC
[R] parallel: socket connection behind a NAT router
If you have SSH access to the workers, then

  workers <- c("machine1.example.org", "machine2.example.org")
  cl <- parallelly::makeClusterPSOCK(workers)

should do it. It sets up the cluster without requiring admin rights or port forwarding.
See also the README in https://cran.r-project.org/package=parallelly.

/Henrik

On Mon, Jan 18, 2021 at 6:45 AM Jiefei Wang <szwjf08 at gmail.com> wrote:
>
> Hi all,
>
> I have a few cloud instances and I want to use them to do parallel
> computing. I would like to create a socket cluster on my local machine to
> control the remote instances. Here is my network setup:
>
> local machine -- NAT -- Internet -- cloud instances
>
> In the parallel package, the server needs to call `makeCluster()` and
> listen for connections from the workers. In my case, the server is the
> local machine and the workers are the cloud instances. However, since the
> local machine is hidden behind the NAT, it does not have a public address
> and the workers cannot connect to it. Therefore, `makeCluster()` will never
> see the connections from the workers and will hang forever.
>
> One solution for letting an external machine access a device inside
> the NAT is to use port forwarding. However, this would not work in my case,
> as the NAT is set up by the network provider (not my home router), so I do
> not have access to the router. As the cloud instances have public addresses,
> I wonder if there is any way to build the cluster by letting the server
> connect to the cloud? I have checked `?parallel::makeCluster` and
> `?snow::makeSOCKcluster` but found nothing. The only promising solution
> I can see now is to use TCP hole punching, but it is quite complicated and
> may not work in every case. Since building a connection from the local
> machine to the remote ones is easy, I would like to know if there exists
> any simple solution. I have searched on Google for a week but found no
> answer. I would appreciate any suggestions!
>
> Best,
> Jiefei
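
A slightly more explicit version of the call suggested above, as a sketch only. The hostnames are the placeholders from the message, the user name `alice` is hypothetical, and the arguments `user`, `revtunnel`, and `dryrun` are the ones described in `?parallelly::makeClusterPSOCK`. With `revtunnel = TRUE`, each worker is launched over SSH with a reverse tunnel, so it connects back through the SSH session rather than to a public address of the local machine. `dryrun = TRUE` is used here so that nothing is actually launched and the commands can be inspected first.

```r
workers <- c("machine1.example.org", "machine2.example.org")

# Only attempt the dry run if the parallelly package is installed.
if (requireNamespace("parallelly", quietly = TRUE)) {
  # Dry run: report how the workers would be launched, without connecting.
  cl <- parallelly::makeClusterPSOCK(
    workers,
    user = "alice",     # hypothetical SSH user name on the workers
    revtunnel = TRUE,   # workers connect back through an SSH reverse tunnel
    dryrun = TRUE       # do not launch anything
  )
}
```

Dropping `dryrun = TRUE` and adding `verbose = TRUE` would attempt the real connection and log each step, which may help to see where a setup stalls.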
Thanks for introducing this interesting package to me! It is great to learn about a powerful new tool, but this method does not seem to work in my environment: `parallelly::makeClusterPSOCK` hangs until it times out. I checked the verbose output, and it looks like the parallelly package also relies on `parallel:::.slaveRSOCK` on the remote instance to build the connection. That would explain the failure: the local machine does not have a public IP, so the remote instance does not know how to connect to it.

The README states that the package works with "remote clusters without knowing public IP". I think this might be where the confusion is: it may mean that the remote machine does not have a public IP but the server machine does. I am in the opposite situation: the server does not have a public IP, but the remote machines do. I am not sure whether this package can handle my case, but it looks very powerful and I appreciate your help!

Best,
Jiefei

On Tue, Jan 19, 2021 at 1:22 AM Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
> If you have SSH access to the workers, then
>
>   workers <- c("machine1.example.org", "machine2.example.org")
>   cl <- parallelly::makeClusterPSOCK(workers)
>
> should do it. It does this without admin rights and port forwarding.
> See also the README in https://cran.r-project.org/package=parallelly.
>
> /Henrik
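
For context on the connection direction discussed above, here is a sketch of the kind of SSH command that a reverse tunnel relies on. The helper `reverse_tunnel_cmd()` is hypothetical (it is not part of parallel or parallelly), and the port number 11234 is arbitrary; only the `-R` option itself is standard OpenSSH.

```r
# Hypothetical helper (not from any package): assemble an ssh command that
# opens a reverse tunnel from the remote host back to the local machine.
reverse_tunnel_cmd <- function(host, port, user = NULL) {
  stopifnot(is.character(host), length(host) == 1L,
            is.numeric(port), length(port) == 1L)
  target <- if (is.null(user)) host else paste0(user, "@", host)
  # -R <remote port>:localhost:<local port>: connections made on the remote
  # host to localhost:<port> are forwarded to <port> on the local machine.
  sprintf("ssh -R %d:localhost:%d %s",
          as.integer(port), as.integer(port), target)
}

cmd <- reverse_tunnel_cmd("machine1.example.org", port = 11234)
print(cmd)
```

Because the SSH session is initiated from the local machine, only the outbound connection has to pass the NAT; a worker started this way would connect to `localhost` on its own side. Whether this is what happens in the setup reported above is not confirmed in this thread, so checking the verbose output for an `-R` option may be a reasonable first step.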