Dear R-devel,

Inspired by Michael Li's talk at JSM, I decided to try rpvm and snow on our
two linux boxes.  It only took me a couple of hours of screwing around to
get it working (sooner if I had RTFM).

Our setup is: 2 dual PIII-866 Xeons, one with 2GB RDRAM, the other with
1.28GB RDRAM.  The first machine is acting as the NIS/NFS server.  Both /usr
and /home are exported to the second machine, so both are seeing the same
copy of R.

I managed to get the following, using the same example that Prof. Tierney
used:

> system.time(nuke.boot <-
+     boot(nuke.data, nuke.fun, R=999, m=1,
+          fit.pred=new.fit, x.pred=new.data))
[1] 29.38  0.52 30.68  0.00  0.00
> system.time(cl.nuke.boot <-
+     clusterCall(cl, boot, nuke.data, nuke.fun, R=500, m=1,
+                 fit.pred=new.fit, x.pred=new.data))
[1]  0.03  0.00 15.44  0.00  0.00

So I'm getting almost twice the performance, which is great.  Now the
questions:

1. Since each of these boxes has two CPUs, how do I spawn more than one
   slave process on them?

2. I was hoping I can see similar gain with randomForest, but that doesn't
   seem to be the case:

> system.time(iris.rf <- randomForest(iris[,1:4], iris[,5], ntree=10000))
[1] 8.52 1.00 9.61 0.00 0.00
> system.time(cl.iris.rf <- clusterCall(cl, randomForest, iris[,1:4],
+     iris[,5], ntree=5000))
[1]  1.38  0.14 15.50  0.00  0.00

What am I missing here?  Is there anything I can do to see similar gain as
the boot() example?

Thanks very much in advance for any pointers, and big thanks to the
developers of these great stuff!!

Regards,
Andy

Andy I. Liaw, PhD
Biometrics Research          Phone: (732) 594-0820
Merck & Co., Inc.              Fax: (732) 594-1565
P.O. Box 2000, RY84-16
Rahway, NJ 07065
mailto:andy_liaw@merck.com

------------------------------------------------------------------------------
Notice: This e-mail message, together with any attachments, contains
information of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA) that
may be confidential, proprietary copyrighted and/or legally privileged, and
is intended solely for the use of the individual or entity named on this
message.  If you are not the intended recipient, and have received this
message in error, please immediately return this by e-mail and then delete
it.
=============================================================================

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>>>>> "andy" == Andy Liaw <Liaw> writes:

    andy> Inspired by Michael Li's talk at JSM, I decided to try rpvm and snow on our
    andy> two linux boxes.  It only took me a couple of hours of screwing around to
    andy> get it working (sooner if I had RTFM).

Which is why we wrote TFM.... :-).  It's the annoying part of clustering
(true for nearly every clustering system currently available).

    andy> 1. Since each of these boxes has two CPUs, how do I spawn more than one
    andy> slave process on them?

Create a cluster of more virtual machines.  There isn't necessarily a 1-1
map from virtual machines to real machines (i.e. you can run 3 virtual
machines on 1 host, but don't expect ANY gain...).

    andy> 2. I was hoping I can see similar gain with randomForest, but that doesn't
    andy> seem to be the case:

    >> system.time(iris.rf <- randomForest(iris[,1:4], iris[,5], ntree=10000))
    andy> [1] 8.52 1.00 9.61 0.00 0.00
    >> system.time(cl.iris.rf <- clusterCall(cl, randomForest, iris[,1:4],
    andy> + iris[,5], ntree=5000))
    andy> [1] 1.38 0.14 15.50 0.00 0.00

    andy> What am I missing here?  Is there anything I can do to see similar gain as
    andy> the boot() example?

Monitor the processes with XPVM, as Michael showed.  If you were at the
session, you would've noted Antony's comment on visualization and how nice
it would've been if it was real time -- it was false, xpvm does real-time
monitoring.

You also might write Michael for the augmentations he's done for SNOW.
lina@u.washington.edu

best,
-tony

-- 
A.J. Rossini                          Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics        rossini@u.washington.edu
FHCRC/SCHARP/HIV Vaccine Trials Net   rossini@scharp.org
-------------- http://software.biostat.washington.edu/ ----------------
FHCRC: M: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW:   Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
(my tuesday/wednesday/friday locations are completely unpredictable.)
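Editorial sketch of the "more virtual machines than real machines" point. It is not the rpvm setup used in the thread: it assumes the `parallel` package bundled with current R (which inherited snow's `makeCluster`/`clusterCall` interface) and a socket cluster, and it uses `localhost` as a stand-in for the two dual-CPU boxes so that it runs on one machine. Listing a host name twice asks for two workers on that host.

```r
library(parallel)

# Hypothetical host list: the name appears twice, so two workers are
# started on that host. "localhost" stands in for a real dual-CPU box.
hosts <- rep("localhost", each = 2)

# Socket cluster with one worker process per entry in 'hosts'.
cl <- makeCluster(hosts)

# Ask every worker for its process id; distinct ids mean separate R processes.
pids <- unlist(clusterCall(cl, Sys.getpid))
n_workers <- length(cl)

stopCluster(cl)

print(n_workers)
print(pids)
```

With real machines, the vector would hold each box's name twice. Whether two workers per box actually help depends on the job being CPU-bound rather than dominated by communication.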
>>>>> On Mon, 19 Aug 2002 15:02:22 -0400, "Liaw, Andy" <andy_liaw@merck.com> said:

    Andy> 1. Since each of these boxes has two CPUs, how do I spawn more than one
    Andy> slave process on them?

Note that, unlike MPI, PVM has no access to hardware information like the
number of CPUs on each node.  So there is no restriction on how many tasks
one can spawn on the cluster.  More tasks may be desirable when some of
them are less CPU-intensive jobs: for instance, tasks that monitor the
activity of the network, report host or task failure, spawn new tasks or
add new hosts, etc.

    Andy> 2. I was hoping I can see similar gain with randomForest, but that
    Andy> doesn't seem to be the case:

    >> system.time(iris.rf <- randomForest(iris[,1:4], iris[,5], ntree=10000))
    Andy> [1] 8.52 1.00 9.61 0.00 0.00
    >> system.time(cl.iris.rf <- clusterCall(cl, randomForest, iris[,1:4],
    Andy> + iris[,5], ntree=5000))
    Andy> [1] 1.38 0.14 15.50 0.00 0.00

    Andy> What am I missing here?  Is there anything I can do to see similar gain
    Andy> as the boot() example?

I tried this example and found that most of the extra time is overhead:
the packing and unpacking of the messages.  When saving the object iris.rf,
its size is over 12M.  So it might be desirable to process the returned
result in each slave first and only return the information needed.

I got similar timing with our cluster.  Saving and loading the object
to/from a file requires about 1.5 seconds each, which I assume is the cost
of the serialization (plus file reading and writing).  Then it seems the
packing (as bytes), transferring, and unpacking of the object take 7-8
seconds??

I wonder how much the serialization itself hurts the performance.  Would
sending raw numbers with pvm routines improve the performance?

Michael

(BTW, is there a convenient function in R to examine the size of an object?)
-- 
---------------------------------------------------
Michael Na Li
Email: lina@u.washington.edu
Department of Biostatistics, Box 357232
University of Washington, Seattle, WA 98195
---------------------------------------------------
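Editorial sketch of the two points in the message above: reducing the result on the worker before sending it back, and inspecting an object's size. `object.size()` (package utils) gives an estimate of the memory an object occupies. To stay runnable without randomForest, the example uses a made-up function `fit_big()` that returns a large `forest` component next to a small `votes` matrix; that function, its arguments, and the use of the `parallel` package instead of snow over PVM are all assumptions for illustration.

```r
library(parallel)

# Hypothetical stand-in for a model fit: a large "forest" component plus a
# small "votes" matrix. It is not randomForest, only a similar shape.
fit_big <- function(n_obs = 150, n_class = 3, n_junk = 5e5) {
  list(forest = numeric(n_junk),
       votes  = matrix(1 / n_class, nrow = n_obs, ncol = n_class))
}

cl <- makeCluster(2)

# Variant 1: every worker ships its whole result back to the master.
full <- clusterCall(cl, fit_big)

# Variant 2: every worker keeps the large part and ships only the votes.
small <- clusterCall(cl, function(f) f()$votes, fit_big)

stopCluster(cl)

# Compare what one worker sent back in each variant (sizes in bytes).
size_full  <- as.numeric(object.size(full[[1]]))
size_small <- as.numeric(object.size(small[[1]]))
print(c(full = size_full, small = size_small))
```

The second variant moves a few kilobytes per worker instead of several megabytes, which is the kind of saving the message suggests for the 12M forest object.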
>>>>> "michael" == Michael Na Li <lina@u.washington.edu> writes:

    michael> I got similar timing with our cluster.  Saving and loading the object to/from
    michael> a file require about 1.5 seconds each, which I assume is the cost of the
    michael> serialization (plus file reading and writing).  Then it seems the packing (as
    michael> bytes), transferring, and unpacking the object take 7-8 seconds??

    michael> I wonder how much the serialization itself hurts the performance.  Would
    michael> sending raw numbers with pvm routines improve the performance?

It might.  It is "well-known" that PVM (and MPI, and other message-passing
systems) have a marshalling overhead that can easily be beaten by RPC or
direct socket programming.  This suggests that the lower/rawer the data at
the message-passing stage, the faster it might be.

best,
-tony
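Editorial sketch of one way to separate the serialization cost from the transfer cost raised above. `serialize(x, NULL)` returns the byte stream as a raw vector in memory, so timing it involves neither file I/O nor the network. The payload is an assumed stand-in of roughly 12 MB of doubles, matching only the size reported for iris.rf; a real forest object has a more complex structure and may serialize more slowly.

```r
# Stand-in payload: 1.5e6 doubles, about 12 MB, the size reported for iris.rf.
payload <- list(forest = runif(1.5e6))

# Time the packing step alone: serialize to an in-memory raw vector.
t_pack <- system.time(bytes <- serialize(payload, NULL))

# Time the unpacking step alone.
t_unpack <- system.time(copy <- unserialize(bytes))

n_bytes <- length(bytes)
print(c(pack = t_pack[["elapsed"]], unpack = t_unpack[["elapsed"]]))
print(n_bytes)
```

If both timings are small next to the 7-8 seconds observed, the remaining time is being spent in the message-passing layer rather than in R's serialization.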
> From: Michael Na Li [mailto:lina@u.washington.edu]
>
>   Andy> 2. I was hoping I can see similar gain with randomForest, but that
>   Andy> doesn't seem to be the case:
>
>   >> system.time(iris.rf <- randomForest(iris[,1:4], iris[,5], ntree=10000))
>   Andy> [1] 8.52 1.00 9.61 0.00 0.00
>   >> system.time(cl.iris.rf <- clusterCall(cl, randomForest, iris[,1:4],
>   Andy> + iris[,5], ntree=5000))
>   Andy> [1] 1.38 0.14 15.50 0.00 0.00
>
>   Andy> What am I missing here?  Is there anything I can do to see similar gain
>   Andy> as the boot() example?
>
> I tried this example and found that most of the extra time is overhead, the
> packing and unpacking of the messages.  When saving the object iris.rf, its
> size is over 12M.  So it might be desirable to process the returned result in
> each slave first and only return information needed.
>
> I got similar timing with our cluster.  Saving and loading the object to/from
> a file require about 1.5 seconds each, which I assume is the cost of the
> serialization (plus file reading and writing).  Then it seems the packing (as
> bytes), transferring, and unpacking the object take 7-8 seconds??

I tried the following, and it seems about right.  (I used a socket cluster
with 3 nodes.  PVM cluster gave similar result.)

system.time(iris.rf <- randomForest(iris[,1:4], iris[,5], ntree=30000)$votes)
[1] 24.57  3.33 27.98  0.00  0.00
system.time(clusterEvalQ(cl, {data(iris); randomForest(iris[, 1:4], iris[, 5], ntree=10000)$votes}))
[1]  0.01  0.00 12.70  0.00  0.00

I was planning to add an alternative random forest interface that does not
return the entire forest (to save memory).  Seems like that item just got
promoted on my to-do list...

Thanks very much for the help!!

Andy

> I wonder how much the serialization itself hurts the performance.  Would
> sending raw numbers with pvm routines improve the performance?
>
> Michael
>
> (BTW, is there a convenient function in R to examine the size of an object?)
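Editorial sketch of what could be done with the per-node `$votes` once they are back on the master. It assumes each worker returns a matrix of per-class vote counts with one row per observation; the numbers below are invented for illustration and are not output from randomForest. Under that assumption, the matrices can be added element-wise and the class with the most votes taken per row.

```r
# Hypothetical per-worker results: vote counts for 4 observations and 3
# classes, shaped like the list that clusterEvalQ() returns from 2 workers.
classes <- c("setosa", "versicolor", "virginica")
votes_by_worker <- list(
  matrix(c(9, 1, 0,
           2, 6, 2,
           0, 4, 6,
           1, 5, 4), ncol = 3, byrow = TRUE, dimnames = list(NULL, classes)),
  matrix(c(8, 2, 0,
           1, 7, 2,
           0, 3, 7,
           0, 4, 6), ncol = 3, byrow = TRUE, dimnames = list(NULL, classes))
)

# Add the matrices element-wise to pool the votes of all workers.
total_votes <- Reduce(`+`, votes_by_worker)

# For each observation, pick the class with the most pooled votes.
predicted <- classes[max.col(total_votes, ties.method = "first")]

print(total_votes)
print(predicted)
```

If the workers return vote fractions rather than counts, the same sum still works as long as every worker grew the same number of trees; otherwise the fractions would need weighting by tree count.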