Hey my R buddies,

I installed the "snow" and "rpvm" packages on my Lenovo Thinkpad T400 today. The experiment below gave me a surprise. The time consumed by serial processing was several times larger than that taken by parallel processing. I'm very curious how this happened. Thank you very much.

> library(snow)
> cc <- makePVMcluster(2)
> temp <- list(matrix(rnorm(1000000),1000),matrix(rnorm(1000000),1000))
> system.time(tt <- clusterApply(cc,temp,"solve"))
   user  system elapsed
  0.584   0.144   4.355
> system.time(ttt <- sapply(temp,"solve"))
   user  system elapsed
  4.777   0.100   4.901

I'm using Ubuntu 8.10. And here's my CPU info:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz
stepping : 6
cpu MHz : 800.000
cache size : 3072 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 4521.96
clflush size : 64
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz
stepping : 6
cpu MHz : 800.000
cache size : 3072 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 4521.97
clflush size : 64
power management:

--
Hesen Peng
http://hesen.peng.googlepages.com/

On Sun, 23 Nov 2008, Hesen Peng wrote:

> The experiment below gave me a surprise. The time consumed by
> serial processing was several times larger than that taken by parallel
> processing. I'm very curious how this happened. Thank you very much.

Read again! clusterApply() was not running solve() in this process so you do not have the total CPU time, and the elapsed time gain is small.

I would surmise from the times given that you are not using an optimized BLAS, which for this problem would make good use of the dual cores.

>> system.time(tt <- clusterApply(cc,temp,"solve"))
>    user  system elapsed
>   0.584   0.144   4.355
>> system.time(ttt <- sapply(temp,"solve"))
>    user  system elapsed
>   4.777   0.100   4.901
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

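The point about CPU time can be checked with a small experiment. The sketch below is not from the thread: it assumes base R's parallel package (makeCluster(), clusterApply(), stopCluster()) in place of snow with rpvm, and it uses 300 x 300 matrices rather than 1000 x 1000 so that it runs quickly. The names cl, mats, t_par and t_ser are made up for illustration.

```r
library(parallel)  # assumption: stands in for snow + rpvm used in the thread

cl <- makeCluster(2)  # two local worker processes

n <- 300  # smaller than the thread's 1000, only to keep the run short
mats <- list(matrix(rnorm(n^2), n), matrix(rnorm(n^2), n))

# Timed in the master process: solve() runs in the workers, so the
# user/system figures reported here do not include the workers' CPU time.
t_par <- system.time(res_par <- clusterApply(cl, mats, solve))

# Timed in the master process, and solve() also runs in the master,
# so the user figure includes the cost of both inversions.
t_ser <- system.time(res_ser <- lapply(mats, solve))

stopCluster(cl)

print(t_par)
print(t_ser)
```

If this behaves as in the thread, the user figure for the clusterApply() call should be small compared with the serial one, and the elapsed column is the one that is comparable between the two runs.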
---------- Forwarded message ----------
From: Hesen Peng <hesen.peng at gmail.com>
Date: Mon, Nov 24, 2008 at 9:28 AM
Subject: Re: [R] More than doubling performance with snow
To: Prof Brian Ripley <ripley at stats.ox.ac.uk>

I'm sorry but I don't quite understand what "not running solve() in this process" means. I updated the code and it does show that the result from clusterApply() is identical to the result from lapply(). Could you please explain more about this?

Following is the updated code:

library(snow)
cc <- makePVMcluster(2)

n.size <- 1000
temp <- NULL
for(i in 1:10){
  x <- list(matrix(rnorm(n.size^2),n.size))
  temp <- c(temp,x)
}

system.time(t.1 <- clusterApply(cc,temp,"solve"))
system.time(t.2 <- lapply(temp,"solve"))

On Mon, Nov 24, 2008 at 1:47 AM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

> Read again! clusterApply() was not running solve() in this process so you
> do not have the total CPU time, and the elapsed time gain is small.
>
> I would surmise from the times given that you are not using an optimized
> BLAS, which for this problem would make good use of the dual cores.

--
Hesen Peng
http://hesen.peng.googlepages.com/
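
One way to see what "not running solve() in this process" means is to ask each worker for its process ID and compare it with the ID of the R session that issues the calls. This is only a sketch under assumptions that are not part of the thread: it uses base R's parallel package instead of snow with rpvm, and the names cl, master_pid and worker_pids are hypothetical.

```r
library(parallel)  # assumption: stands in for snow + rpvm used in the thread

cl <- makeCluster(2)  # two local worker processes

# Process ID of the R session that calls system.time() and clusterApply()
master_pid <- Sys.getpid()

# Process IDs of the workers, which are where functions sent
# through the cluster (such as solve) are evaluated
worker_pids <- unlist(clusterCall(cl, Sys.getpid))

stopCluster(cl)

print(master_pid)
print(worker_pids)
```

The worker IDs differ from the master's, so the CPU time spent in solve() is charged to other processes and does not appear in the user and system figures that system.time() reports for the calling session.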