dear experts,

I'm very concerned about memory management and would appreciate some tips
on handling large datasets. Of special interest:

1. importing large data from a text file
2. subsequent manipulations in R

thanks very much, best regards
pan yuming

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject!)
To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Thu, 9 Nov 2000, Pan_Yuming at aam.de wrote:

> dear experts,
>
> I'm very concerned about memory management and would appreciate some
> tips on handling large datasets. Of special interest:
> 1. importing large data from a text file
> 2. subsequent manipulations in R

a. Read the R-FAQ (memory management)
b. Read "An Introduction to R" (data manipulations)
(both available at CRAN)

To your questions:
1. read.table()
2. What kind of answer do you expect?

Uwe Ligges
dear uwe ligges,

thanks for your response. I have problems when importing and handling
large data sets in R. I know read.table() is not optimal for this. What I
want to know is how to deal with large datasets efficiently in R, with
respect to:

1. importing large data from a text file
2. subsequent manipulations in R

and also some general guidelines to improve efficiency.

thanks and regards
pan yuming

Uwe Ligges <ligges at statistik.uni-dortmund.de> wrote on 09.11.2000 19:04:12
(To: Pan_Yuming at aam.de, cc: r-help at stat.math.ethz.ch, Subject: Re: [R] memory management):
> [...]
thank you for your useful discussions. I have had unpleasant experiences
when handling data in R, such as importing large datasets and doing
calculations on them. It is not a question of whether the job can be done,
but of how to do it more elegantly. read.table() is slower than scan(),
but I won't use scan() if read.table() can do the job satisfactorily.
Kjetil pointed me in the right direction: I tend to use a lot of loops in
my programs, and that is not efficient.

best wishes
pan yuming

Uwe Ligges <ligges at statistik.uni-dortmund.de> wrote on 10.11.2000 12:04:20
(To: Pan_Yuming at aam.de, cc: r-help <r-help at stat.math.ethz.ch>,
Subject: Re: [R] memory management):

> Pan_Yuming at aam.de wrote:
> > i know read.table() is not optimal for it. what i want to know is how
> > to deal with large datasets efficiently in R [...]
>
> Please give us some more information! We cannot help if you don't
> describe your problem exactly. "Importing large data" and "subsequent
> manipulations" sounds like: "Please tell me how to use R!" As mentioned
> in my last mail: first read the FAQ and "An Introduction to R".
>
> To import large data sets you might want to start R with a specified
> memory size, e.g.:
>     R --nsize=500K --vsize=20M
> or, if you are on Windows:
>     c:\.....\RGui.exe --nsize=500K --vsize=20M
> But that is also described in the FAQ.
>
> Regards,
> Uwe Ligges

I think you need to specify more precisely what you are looking for. All
the general guidelines are given in "An Introduction to R", and if you
need a long discussion, Venables and Ripley's book is the source. The
general guideline, in short, is that you should never use loops (for,
while, etc.), and rather use apply (including sapply, lapply).
You know, it is very hard to answer your question in general terms, other
than by referring you to the existing literature. If there is something in
the existing literature that is unclear, though, people would like to hear
about it.

Best,

Kjetil

--
Kjetil Kjernsmo
Graduate astronomy student            Problems worthy of attack
University of Oslo, Norway            Prove their worth by hitting back
E-mail: kjetikj at astro.uio.no                          - Piet Hein
Homepage <URL:http://www.astro.uio.no/~kjetikj/>
Webmaster at skepsis.no
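Kjetil's advice about replacing explicit loops with the apply family can be
sketched as follows. The data frame, its column names, and the values below
are invented purely for illustration:

```r
# Toy data frame standing in for a large imported dataset
# (column names and values are made up for this sketch).
set.seed(1)
d <- data.frame(a = rnorm(1000), b = rnorm(1000), c = rnorm(1000))

# Loop version: grows a result vector element by element,
# which forces repeated copying.
means.loop <- numeric(0)
for (j in 1:ncol(d)) means.loop <- c(means.loop, mean(d[[j]]))

# apply-family version: one sapply() call over the columns.
means.apply <- sapply(d, mean)

# Both give the same numbers (sapply also attaches column names).
all.equal(as.numeric(means.apply), means.loop)  # TRUE
```

The same pattern carries over to lapply() for list results and apply() for
row- or column-wise operations on matrices.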
> Date: Fri, 10 Nov 2000 14:20:26 +0100 (CET)
> From: Roger Bivand <rsb at reclus.nhh.no>
> To: Pan_Yuming at aam.de
> cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] memory management
>
> On Fri, 10 Nov 2000 Pan_Yuming at aam.de wrote:
>
> > read.table() is slower than scan(), but i won't use scan() if
> > read.table() can do it satisfactorily. Kjetil gave me the right
> > direction. i tend to use a lot of loops in the program and that's
> > not efficient.
>
> I have a feeling that there is an underlying issue concerning the
> treatment of character strings in read.table(), both for factors and as
> row names. A lot of cons cells seem to be used up - that's where I've
> typically hit memory limits on reading largish files. If you can manage
> with scan(), and convert your character vectors to numeric (for later
> conversion to factor) before you read the file, you can use more memory
> for heap and less for cons cells. If you are really stuck, then for
> special files, like images, it's best to write a small C function to
> suck in the data - this is much less challenging than it might seem,
> and gives you a chance to see how elegant R really is under the hood!

Or use the package Rstreams on CRAN, thereby using someone else's C
function.

--
Brian D. Ripley,                   ripley at stats.ox.ac.uk
Professor of Applied Statistics,   http://www.stats.ox.ac.uk/~ripley/
University of Oxford,              Tel: +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                 Fax: +44 1865 272595
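A hedged sketch of the scan()-based approach Roger describes. The file
layout, column names, and values below are invented; the point is that a
what= template tells scan() the type of every column up front, and the
conversion to factor happens only after the read:

```r
# Hypothetical whitespace-separated file with three columns:
# an id string, a numeric group code, and a measurement.
# A tiny example file is written here so the sketch is self-contained.
tmp <- tempfile()
writeLines(c("s1 1 0.5",
             "s2 2 1.5",
             "s3 1 2.5"), tmp)

# what= gives scan() a fixed type for each column,
# avoiding read.table()'s per-field type handling.
cols <- scan(tmp, what = list(id = "", group = 0, y = 0), quiet = TRUE)

# Convert the coded column to a factor only after reading.
group <- factor(cols$group)

mean(cols$y)  # 1.5
```

For very large files this keeps the read itself cheap; the factor
conversion is then a single vectorised step on data already in memory.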