thr3ads.net - R help - [R] Rserve and R to R communication [Apr 2007]

Ramon Diaz-Uriarte

2007-Apr-07 14:56 UTC

[R] Rserve and R to R communication

Dear All,

The "clients.txt" file of the latest Rserve package, by Simon Urbanek,
says, regarding its R client,

"(...) a simple R client, i.e. it allows you to connect to Rserve from
R itself. It is very simple and limited,  because Rserve was not
primarily meant for R-to-R communication (there are better ways to do
that), but it is useful for quick interactive connection to an Rserve
farm."

Which are those better ways to do it? I am thinking about using Rserve
to have an R process send jobs to a bunch of Rserves in different
machines. It is like what we could do with Rmpi (or pvm), but without
the MPI layer. Therefore, presumably it'd be easier to deal with
network problems, machine failures, checkpointing, etc. (i.e.,
to try to get better fault tolerance).

It seems that Rserve would provide the basic infrastructure for doing
that and would save me from reinventing the wheel of using sockets, etc.,
directly from R.

However, Simon's comment about better ways of R-to-R communication
made me wonder if this idea really makes sense. What is the catch?
Have other people tried similar approaches?

Thanks,

R.

-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

Matthew Keller

2007-Apr-09 16:08 UTC

[R] Rserve and R to R communication

Hi Ramon,

I've been interested in responses to your question. I have what I
think is a similar issue - I have a very large simulation script and
would like to be able to modularize it by having a main script that
calls lots of subscripts - but I haven't done that yet because the
only way I could think to do it was to call a subscript, have it run,
save the objects from the subscript, and then call those objects back
into the main script, which seems like a very slow and onerous way to
do it.

Would Rserve do what I'm looking for?

-- 
Matthew C Keller
Postdoctoral Fellow
Virginia Institute for Psychiatric and Behavioral Genetics

Simon Urbanek

2007-Apr-09 17:42 UTC

[R] Rserve and R to R communication

On Apr 7, 2007, at 10:56 AM, Ramon Diaz-Uriarte wrote:
> Dear All,
>
> The "clients.txt" file of the latest Rserve package, by Simon  
> Urbanek, says, regarding its R client,
>
> "(...) a simple R client, i.e. it allows you to connect to Rserve  
> from R itself. It is very simple and limited,  because Rserve was  
> not primarily meant for R-to-R communication (there are better ways  
> to do that), but it is useful for quick interactive connection to  
> an Rserve farm."
>
> Which are those better ways to do it? I am thinking about using  
> Rserve to have an R process send jobs to a bunch of Rserves in  
> different machines. It is like what we could do with Rmpi (or pvm),  
> but without the MPI layer. Therefore, presumably it'd be easier to  
> deal with network problems, machine's failures, using checkpoints,  
> etc. (i.e., to try to get better fault tolerance).
>
> It seems that Rserve would provide the basic infrastructure for  
> doing that and saves me from reinventing the wheel of using  
> sockets, etc, directly from R.
>
> However, Simon's comment about better ways of R-to-R communication  
> made me wonder if this idea really makes sense. What is the catch?  
> Have other people tried similar approaches?
>
I was commenting on direct R-to-R communication using sockets +
'serialize' in R or the 'snow' package for parallel processing. The
latter could be useful for what you have in mind, because it includes
a socket-based implementation which allows you to spawn multiple
children (across multiple machines) and collect their results. It
uses regular rsh or ssh to start the jobs, so if you can use that, it
should work for you. 'snow' also has PVM and MPI implementations; the
PVM one is really easy to set up (on unix) and that was what I was
using for parallel computing in R on a cluster.
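
For readers on a current R installation, the snow-style workflow
described above can be sketched with the 'parallel' package, which
ships with R and includes a socket cluster interface derived from
snow. This is only an illustration, not what was available in 2007:
the two local workers and the toy squaring job are assumptions. On
real hardware you would pass the remote host names instead of a
worker count.

```r
library(parallel)

# Start two socket workers on the local machine (assumed setup;
# pass a character vector of host names to use remote machines).
cl <- makePSOCKcluster(2)

# Send one task per element to the workers and collect the results.
jobs <- 1:4
results <- clusterApply(cl, jobs, function(i) i^2)

# Always shut the workers down when finished.
stopCluster(cl)

print(unlist(results))
```

clusterApply hands the jobs to the workers in order and returns the
results as a list in the same order as the input.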

Rserve is sort of comparable, but in addition it provides the
spawning infrastructure due to its client/server concept. What it
doesn't have are the convenience functions that snow provides, like
clusterApply etc. Come to think of it, it would actually be possible to
add them, although I admit that the original goal of Rserve was not
parallel computing :). The idea was to have one Rserve server and
multiple clients, whereas in 'snow' you sort of have one client and
multiple servers. You could spawn multiple Rserves on multiple
machines, but Rserve itself doesn't provide any load-balancing out of
the box, so you'd have to do that yourself.
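
As a rough idea of what "doing it yourself" might look like, here is
a minimal round-robin dispatcher. It is a sketch under assumptions,
not part of Rserve: the host names, assign_round_robin, dispatch and
fake_send are all hypothetical, and fake_send stands in for a real
remote call (for example a wrapper around an Rserve client
connection) so that the code runs without any server.

```r
# Assign each job to a server in round-robin order.
# `servers` is a hypothetical vector of host names.
assign_round_robin <- function(n_jobs, servers) {
  servers[(seq_len(n_jobs) - 1) %% length(servers) + 1]
}

# Run every job on its assigned server. `send` is a placeholder for
# whatever performs the remote evaluation; the caller supplies it.
dispatch <- function(jobs, servers, send) {
  targets <- assign_round_robin(length(jobs), servers)
  Map(function(job, host) send(host, job), jobs, targets)
}

# Stand-in for a remote evaluation, so the sketch runs locally.
fake_send <- function(host, job) paste(host, "ran job", job)

out <- dispatch(1:5, c("nodeA", "nodeB"), fake_send)
print(unlist(out))
```

Round-robin ignores how long each job takes, so a real setup would
probably want to hand the next job to whichever server finishes
first, and to retry jobs whose server has gone away.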

I don't know if that helps... :)

Cheers,
Simon
