Anantha Prasad/NE/USDAFS
2000-Oct-02 18:58 UTC
[R] R vs S-PLUS with regard to memory usage
I am trying to translate code from S-PLUS to R, and R really struggles! I start R with the following:

    R --vsize 50M --nsize 6M --no-restore

on a 400 MHz Pentium with 192 MB of memory running Linux (RH 6.2). I then run a function that essentially picks up an external dataset with 2121 rows and 30 columns, builds an lm() object, and also runs step(). The step() call takes forever to run (it takes very little time in S-PLUS). I remove objects whenever I am done with them, hoping that this frees up memory, but it does not seem to help.

Then I present the user a choice (a tk radiobutton via the tcltk package) between a residual plot and an actual-vs-predicted plot, and allow the user to identify the outliers. After the first choice there is a lot of disk activity, which makes me suspect R has run out of memory and is swapping to disk. This takes an enormous amount of time, after which it seems to run OK.

The application that took seconds to run in S-PLUS takes forever in R. Why? I feel that a --vsize of 50 MB is plenty for a 2121 x 30 dataset (correct?). Does R, after having taken a 50 MB chunk of memory, handle it poorly, run out of it, and start swapping to disk? Or am I neglecting to do something? I am beginning to suspect that R is not the language to use when writing applications like mine. Please enlighten me on what is happening. Thanks much.

*****************************************************************
Mr. Anantha Prasad, Ecologist/GIS Specialist
USDA Forest Service, 359 Main Rd.
Delaware OHIO 43015 USA
Ph: 740-368-0103   Email: aprasad at fs.fed.us
Web: http://www.fs.fed.us/ne/delaware/index.html
Don't Miss Climate Change Tree Atlas at:
http://www.fs.fed.us/ne/delaware/atlas/index.html
******************************************************************
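[For reference, a minimal sketch of the kind of session described above. The file name, column name, and formula are hypothetical stand-ins, not the poster's actual code:]

    ## Launched as: R --vsize 50M --nsize 6M --no-restore
    dat  <- read.table("plots.dat", header = TRUE)  # hypothetical file: 2121 rows x 30 columns
    fit  <- lm(y ~ ., data = dat)                   # full linear model ('y' is a hypothetical response)
    fit2 <- step(fit)                               # stepwise selection -- the slow part
    rm(fit)                                         # remove objects when done, hoping to free memory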
"Anantha Prasad/NE/USDAFS" <aprasad at fs.fed.us> writes:> I am trying to translate code from S-PLUS to R and R really struggles! > After starting R with the foll. > R --vsize 50M --nsize 6M --no-restore > on a 400 MHz Pentium with 192 MB of memory running Linux (RH 6.2), > I run a function that essentially picks up an external dataset with 2121 > rows > and 30 columns and builds a lm() object and also runs step() ... the step() > takes forever to run...(takes very little time in S-PLUS).Notice that the --nsize takes the number of *nodes* as the value. Each is 20 bytes, so you're allocating a 170MB chunk there. With various other memory eaters active, that could easily push a 192MB machine into thrashing. The upcoming 1.2 version will be much better at handling memory, but for now maybe reduce the nsize a bit? The vsize looks a bit hefty as well given that the data should take up on the order of half a MB. -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Anantha Prasad/NE/USDAFS
2000-Oct-02 20:42 UTC
[R] R vs S-PLUS with regard to memory usage
Worked like magic, thanks! (I had thought the more I can allocate the better, since my application does not know beforehand how big the data will be.) However, the step() function ran out of heap memory even after I allocated 15 MB. So it looks like a no-win situation: if I increase the heap enough to satisfy step(), there is disk swapping (if not on my present machine, then on others that do not have the luxury of 192 MB); if I don't, it runs out of memory. Any suggestions other than waiting for 1.2? If there are more people like me, that's some incentive to get 1.2 out soon! Thanks much.

Prasad

Peter Dalgaard BSA <p.dalgaard at biostat.ku.dk> wrote on 10/02/00 04:20 PM:
[reply quoted above in full]
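[One way to see which of the two heaps step() is actually exhausting, using the standard base R functions gc() and gcinfo(); the fit object refers back to the hypothetical sketch earlier in the thread:]

    gcinfo(TRUE)        # print a report at every garbage collection from here on
    gc()                # force a collection; reports cons-cell (nsize) and vector-heap (vsize) usage
    fit2 <- step(fit)   # watch which heap fills up during the stepwise search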
Anantha Prasad/NE/USDAFS
2000-Oct-02 20:59 UTC
[R] R vs S-PLUS with regard to memory usage
OK, my mistake: this vsize and nsize confusion. I re-read the memory documentation and Prof. Ripley's reply, and here is the lesson I learnt: increase vsize but NOT nsize, and you will not have a problem with disk swapping. Yes, I now have renewed faith that I can indeed use R, although I do feel 1.2 would be better if this "allocation confusion" were taken care of. Thanks again.

Prasad

Peter Dalgaard BSA <p.dalgaard at biostat.ku.dk> wrote on 10/02/00 04:20 PM:
[reply quoted above in full]
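[In command-line form, the lesson of the thread amounts to something like the following; the exact numbers are illustrative, not prescribed anywhere above:]

    # Generous vector heap for the data, modest node count:
    # 1M nodes x 28 bytes is ~28 MB, instead of the ~170 MB that --nsize 6M pinned down.
    R --vsize 50M --nsize 1M --no-restore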