Henrik Bengtsson
2021-Jan-18 17:22 UTC
[R] parallel: socket connection behind a NAT router
If you have SSH access to the workers, then
workers <- c("machine1.example.org",
             "machine2.example.org")
cl <- parallelly::makeClusterPSOCK(workers)
should do it. It does this without admin rights and port forwarding.
See also the README in https://cran.r-project.org/package=parallelly.
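
A slightly more explicit sketch of the same call, in case it helps to see the settings that matter here. It assumes key-based SSH login to each worker; the wrapper name `connect_workers` and its `user` argument are illustrative only. With `revtunnel = TRUE`, each worker connects to a port on its own localhost, which SSH forwards back over the connection opened from the local machine, so the local machine should not need a public address.

```r
# Placeholder host names; replace with the public addresses of the cloud instances.
workers <- c("machine1.example.org", "machine2.example.org")

# Hypothetical wrapper (not part of parallelly) around makeClusterPSOCK().
connect_workers <- function(workers, user = NULL) {
  parallelly::makeClusterPSOCK(
    workers,
    user      = user,  # SSH user name on the workers; NULL uses the SSH default
    revtunnel = TRUE,  # workers reach the local machine through a reverse SSH tunnel
    verbose   = TRUE   # print the launch details for troubleshooting
  )
}

# Usage (requires working SSH access to the workers):
# cl <- connect_workers(workers)
# parallel::clusterEvalQ(cl, Sys.info()[["nodename"]])
# parallel::stopCluster(cl)
```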
/Henrik
On Mon, Jan 18, 2021 at 6:45 AM Jiefei Wang <szwjf08 at gmail.com> wrote:
>
> Hi all,
>
> I have a few cloud instances and I want to use them to do parallel
> computing. I would like to create a socket cluster on my local machine to
> control the remote instances. Here is my network setup:
>
> local machine -- NAT -- Internet -- cloud instances
>
> In the parallel package, the server needs to call `makeCluster()` and
> listen for connections from the workers. In my case, the server is the
> local machine and the workers are the cloud instances. However, since the
> local machine is hidden behind the NAT, it does not have a public address
> and the workers cannot connect to it. Therefore, `makeCluster()` will never
> see the connections from the workers and will hang forever.
>
> One solution for letting an external machine access a device behind
> the NAT is port forwarding. However, this would not work in my case,
> as the NAT is set up by the network provider (not my home router), so I do
> not have access to the router. As the cloud instances have public addresses,
> I wonder if there is any way to build the cluster by letting the server
> connect to the cloud? I have checked `?parallel::makeCluster` and
> `?snow::makeSOCKcluster` but found nothing. The only promising solution
> I can see now is TCP hole punching, but it is quite complicated and
> may not work in every case. Since building a connection from the local
> machine to the remote is super easy, I would like to know if there is any
> simple solution. I have searched on Google for a week but found no answer.
> I'll appreciate it if you can provide any suggestions!
>
> Best,
> Jiefei
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Thanks for introducing this interesting package to me! It is great to learn
about a powerful new tool, but this method does not seem to work in my
environment. `parallelly::makeClusterPSOCK` hangs until it times out. I checked
the verbose output, and it looks like the parallelly package also depends on
`parallel:::.slaveRSOCK` on the remote instance to build the connection. This
would explain the failure: the local machine does not have a public IP, so the
remote does not know how to connect to it.

I see that the README states the package works with "remote clusters without
knowing public IP". I think this might be where the confusion is: it may mean
that the remote machine does not have a public IP but the server machine does.
I'm in the opposite situation: the server does not have a public IP, but the
remote does. I'm not sure whether this package can handle my case, but it looks
very powerful and I appreciate your help!

Best,
Jiefei

On Tue, Jan 19, 2021 at 1:22 AM Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
> If you have SSH access to the workers, then
>
> workers <- c("machine1.example.org", "machine2.example.org")
> cl <- parallelly::makeClusterPSOCK(workers)
>
> should do it. It does this without admin rights and port forwarding.
> See also the README in https://cran.r-project.org/package=parallelly.
>
> /Henrik
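
One way to narrow down a hang like this might be a dry run, which prints the commands that would be used to launch each worker instead of actually connecting. The sketch below assumes the `dryrun` argument of `parallelly::makeClusterPSOCK()`; the helper name `show_launch_commands` is made up for illustration. If the printed SSH command contains a reverse-tunnel option (`-R`), the worker is expected to connect to its own localhost rather than to the local machine's address.

```r
# Placeholder host names; replace with the public addresses of the cloud instances.
workers <- c("machine1.example.org", "machine2.example.org")

# Hypothetical helper (not part of parallelly): show how the workers would be
# launched, without setting up any connection.
show_launch_commands <- function(workers) {
  parallelly::makeClusterPSOCK(
    workers,
    revtunnel = TRUE,  # request the reverse SSH tunnel explicitly
    dryrun    = TRUE,  # only report what would be run; nothing is launched
    verbose   = TRUE
  )
}

# Usage:
# show_launch_commands(workers)
```

Running the printed SSH command by hand in a terminal can then show whether the tunnel itself is being set up, independently of R.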