Dear All,

I have spent almost a whole day searching online for ways to make my R code run more efficiently, but have not found a solution that fits my case. If anyone could give me a clue, I would very much appreciate your help. Thanks in advance.

Here is my issue:

My desktop has an i7-950 quad-core CPU, 24 GB of memory and an NVIDIA GTX 480 graphics card, and I am using a 64-bit version of R under 64-bit Windows.

I am running a "for" loop to generate a 461*5 matrix whose entries are the coefficients of 5 models. Each iteration produces 5 values, and the loop runs 461 times in total. Running the code inside the loop once takes almost 10 seconds, so the whole loop should take about 4610 seconds, well over an hour, and that is indeed roughly how long it takes. But I have to run this kind of loop for 30 data sets!

Although I think my desktop is a fairly good one, I checked CPU and memory usage while my R code was running and found that it used only 15% of the CPU and 10% of the memory. Has anyone had the same issue? Or does anyone know how to shorten the running time and make fuller use of the CPU and memory?

Many thanks,
Xi
See the vignette for package 'parallel' to make use of your 4 cores.

On 26/06/2012 01:07, Xi wrote:
> Does anyone have the same issue
> with me? or Does anyone know some methods to shorten the running time and
> increase the usage of CPU and memory?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
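As a rough sketch of the approach the 'parallel' vignette describes, the following spreads 461 independent iterations over a socket cluster, which is the cluster type that works on Windows. The function fit_one and its simulated lm fit are hypothetical stand-ins, since the thread does not show the five models. A plain R "for" loop runs on a single thread; assuming Hyper-Threading is enabled on the i7-950, so that Windows reports 8 logical processors, one busy thread is about 12.5% of the total, which is close to the 15% reported.

```r
library(parallel)

# Hypothetical stand-in for one loop iteration: fit a small model on
# simulated data and return 5 numbers (the real models are not shown
# in the thread).
fit_one <- function(i) {
  set.seed(i)
  d <- data.frame(x1 = rnorm(100), x2 = rnorm(100),
                  x3 = rnorm(100), x4 = rnorm(100))
  d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(100)
  unname(coef(lm(y ~ x1 + x2 + x3 + x4, data = d)))  # intercept + 4 slopes
}

n_iter <- 461

# Socket cluster: works on Windows, where forking is unavailable.
cl <- makeCluster(4)  # 4 workers, one per physical core mentioned in the post
res <- parLapply(cl, seq_len(n_iter), fit_one)
stopCluster(cl)

# Bind the per-iteration results into the 461 x 5 matrix.
coef_mat <- do.call(rbind, res)
dim(coef_mat)
```

This only helps if the iterations do not depend on one another. Any data or packages the real loop body needs would also have to be made available on the workers, for example with clusterExport() and clusterEvalQ().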
Hello Xi,

If a program has input or output to disk or network, this may cause it to wait and not use the available CPU. Output is usually buffered, but may cause delay if the buffer gets full (I'm not sure though whether this is an issue with plenty of memory available).

Take care
Oliver

On Mon, Jun 25, 2012 at 8:07 PM, Xi <amzhangxi at gmail.com> wrote:
> Although I thought I am using a not-bad at all desktop, I checked the usage
> of CPU and memory during my running R code, and found out the whole code
> just used 15% of CPU and 10% of memory.

--
Oliver Ruebenacker, Bioinformatics and Network Analysis Consultant
President and Founder of Knowomics (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Consultant at Predictive Medicine (http://predmed.com/people/oliverruebenacker.html)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
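One way to check whether the time goes into computing or into waiting is to compare CPU time with elapsed time using system.time(). The sketch below is only an illustration: Sys.sleep() is used as a stand-in for waiting on disk or network, which is an assumption and not something taken from the original code.

```r
# CPU-bound work: elapsed time is close to the CPU time used.
cpu_bound <- system.time({
  x <- 0
  for (i in 1:2e6) x <- x + sqrt(i)
})

# Waiting (simulated here with Sys.sleep as a stand-in for slow I/O):
# elapsed time is much larger than the CPU time used.
waiting <- system.time(Sys.sleep(1))

print(cpu_bound[c("user.self", "sys.self", "elapsed")])
print(waiting[c("user.self", "sys.self", "elapsed")])
```

If wrapping one iteration of the real loop in system.time() shows elapsed time close to user time, the loop is probably CPU-bound on one thread and not held up by I/O.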
Hi Xi,

Maybe you should try to "parallelize" your calculations. See package "parallel".

http://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf

Arnaud

On Mon, Jun 25, 2012 at 8:07 PM, Xi <amzhangxi@gmail.com> wrote:
> Does anyone have the same issue
> with me? or Does anyone know some methods to shorten the running time and
> increase the usage of CPU and memory?
Hello Xi,

Have you tried replacing the for loop by an apply construct, e.g., lapply or sapply? In my experience these functions are more efficient than for. At any rate, if you succeed with, say, lapply, there are some R packages that support parallel processing versions. I believe the package parallel is now part of base R, and a version of multicore's mclapply (which, in multicore, does not support Windows) is available.

Regards,
Richard

Richard R. Liu
Hebelstr. 136
CH-4056 Basel
Switzerland
Tel.: +41 61 321 66 00
Mobil: +41 79 708 67 66
Email: richard.liu at pueo-owl.ch
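To illustrate the rewrite suggested above, here is a minimal sketch of the same computation written as a preallocated for loop and as a vapply call. The function one_iteration is a hypothetical placeholder for the loop body, which the thread does not show. With roughly 10 seconds of work per iteration the overhead of the loop itself is likely negligible, so the practical gain from the apply form is probably that it can be handed to a parallel counterpart with little change.

```r
# Hypothetical stand-in for the body of the loop: returns 5 values.
one_iteration <- function(i) {
  c(i, i^2, sqrt(i), log(i), 1 / i)
}

n_iter <- 461

# for-loop version with the result matrix preallocated.
out_for <- matrix(NA_real_, nrow = n_iter, ncol = 5)
for (i in seq_len(n_iter)) {
  out_for[i, ] <- one_iteration(i)
}

# Equivalent vapply version: vapply returns a 5 x 461 matrix, so transpose.
out_apply <- t(vapply(seq_len(n_iter), one_iteration, numeric(5)))

# Both versions should agree.
all.equal(out_for, out_apply)
```

On Windows, the lapply form can be passed to parLapply() from package parallel with a socket cluster. mclapply() relies on forking, so on Windows it only runs with mc.cores = 1.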