I wonder if there is a way to read in CSV files so that

- certain fields within each record (line) are ignored, i.e. not converted to columns in the output data frame
- factors/levels are created automatically: often, the possible labels are known in advance and could immediately be converted. This would save a lot of memory and time for bigger files, I assume.

Johann

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Mon, 24 Sep 2001, Johann Petrak wrote:

> I wonder if there is a way to read in CSV files so that
> - certain fields within each record (line) are ignored, i.e.
>   not converted to columns in the output data frame

No. We've thought about that, but it's better to use external tools (e.g. cut in Unix).

> - factors/levels are created automatically: often, the possible
>   labels are known in advance and could immediately be
>   converted. This would save a lot of memory and time for
>   bigger files, I assume.

What are these on the file? If characters then factors/levels *are* created automatically. If codes then they are stored economically as integers (in R-devel) and just need a levels attribute and class added.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
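The last point, turning integer codes into a factor by adding a levels attribute and a class, could look roughly like the sketch below. The codes and labels are made up for illustration; nothing here comes from the poster's actual file.

```r
# Hypothetical integer codes, as they might arrive from a coded column
codes <- c(2L, 1L, 3L, 2L)

# Hypothetical labels, assumed known in advance (code 1 = "low", and so on)
labels <- c("low", "medium", "high")

# Attach the levels attribute and the class; the data themselves stay integers
f <- structure(codes, levels = labels, class = "factor")

print(f)
print(typeof(f))  # storage type of the underlying vector
```

No character vector of labels per record is ever built this way, which is where the memory saving would come from.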
Johann Petrak <johann at ai.univie.ac.at> writes:

>I wonder if there is a way to read in CSV files so that
>- certain fields within each record (line) are ignored, i.e.
>  not converted to columns in the output data frame

You could write a bespoke function to do this using scan(). An internal function or extension to read.table() might be useful but you can always do that with whatever data-management tools you already use ... I use EpiData (which allows you to drop variables when exporting to ASCII) and DBMS/COPY which allows you to drop and recode during data transfer operations.

>- factors/levels are created automatically: often, the possible
>  labels are known in advance and could immediately be
>  converted. This would save a lot of memory and time for
>  bigger files, I assume.

You can use levels() and factor() to do this.

Mark

--
Mark Myatt
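A minimal sketch of the scan() plus factor() approach, assuming a current version of R in which a NULL component in the `what` list makes scan() skip that field and in which scan() accepts a `text` argument. The column names, the sample data, and the level set are hypothetical.

```r
# Hypothetical CSV content; in practice this would be a file on disk
csv <- "id,comment,grp
1,foo,a
2,bar,b
3,baz,a"

# One component per field; NULL marks a field that should not be stored
what <- list(id = integer(), comment = NULL, grp = character())

# skip = 1 drops the header line; sep = "," splits the fields
rec <- scan(text = csv, what = what, sep = ",", skip = 1, quiet = TRUE)

# Keep only the fields that were actually read
keep <- !vapply(what, is.null, logical(1))
dat <- as.data.frame(rec[keep], stringsAsFactors = FALSE)

# Convert to a factor using labels assumed to be known in advance
dat$grp <- factor(dat$grp, levels = c("a", "b", "c"))

print(dat)
```

Note that the labels are still read as strings first and converted afterwards, so this saves the skipped column but not the intermediate character vector.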
Mark Myatt wrote:

> Johann Petrak <johann at ai.univie.ac.at> writes:
>
>>- factors/levels are created automatically: often, the possible
>>  labels are known in advance and could immediately be
>>  converted. This would save a lot of memory and time for
>>  bigger files, I assume.
>
> You can use levels() and factor() to do this.

I know ... the point I wanted to make is that for very big files, this is an awful waste of memory and time: everything first gets stored as strings and is converted to levels only after being read in.

I am currently trying to understand the R source code to do it another way: first convert a single ASCII file to several binary-coded column files, then read in only the columns needed using readBin. This should allow the greatest flexibility when files are really big.

Cheers,
Johann
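The column-file idea can be sketched with writeBin() and readBin() as follows. This is only an illustration under assumptions: the data frame, the one-file-per-column layout, the `.bin` file names, and storing factors as their integer codes are all choices made for this example, not something described in the thread.

```r
# Hypothetical data standing in for a parsed ASCII file
df <- data.frame(
  id  = 1:4,
  grp = factor(c("a", "b", "a", "c"), levels = c("a", "b", "c")),
  x   = c(0.5, 1.5, 2.5, 3.5)
)

# Write each column to its own binary file in a temporary directory
dir <- tempfile("cols")
dir.create(dir)
for (nm in names(df)) {
  col <- df[[nm]]
  if (is.factor(col)) col <- as.integer(col)  # store factor codes only
  writeBin(col, file.path(dir, paste0(nm, ".bin")))
}

# Later: read back only the columns that are needed
n <- nrow(df)
codes <- readBin(file.path(dir, "grp.bin"), what = "integer", n = n)
x     <- readBin(file.path(dir, "x.bin"),   what = "double",  n = n)

# Rebuild the factor from its codes and the labels known in advance
grp <- structure(codes, levels = c("a", "b", "c"), class = "factor")

print(grp)
print(x)
```

The number of records and the level labels have to be kept somewhere alongside the column files; here they are simply reused from the session.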