I find that read.table cannot handle large datasets. Suppose data is a
40000 x 6 dataset.

  R -v 100

  x <- read.table("data")

gives

  Error: memory exhausted

but

  x <- as.data.frame(matrix(scan("data"), byrow = TRUE, ncol = 6))

works fine.

read.table requires less typing, I can include the variable names in the
first line, and in S-PLUS it executes faster. Is there a fix for
read.table on the way?
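For what it's worth, a minimal sketch of the scan() workaround that also
picks the variable names off the first line (assuming whitespace-separated
numeric data, with the first line holding only the six column names):

  ## read the header line as character, then the numeric body below it
  nms <- scan("data", what = "", nlines = 1)   # column names
  x   <- as.data.frame(matrix(scan("data", skip = 1),
                              byrow = TRUE, ncol = length(nms)))
  names(x) <- nms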
Rick White <rick at stat.ubc.ca> writes:

> I find that read.table cannot handle large datasets. Suppose data is a
> 40000 x 6 dataset.
>
>   R -v 100
>
>   x <- read.table("data")
>
> gives
>   Error: memory exhausted
> but
>   x <- as.data.frame(matrix(scan("data"), byrow = TRUE, ncol = 6))
> works fine.
>
> read.table requires less typing, I can include the variable names in the
> first line, and in S-PLUS it executes faster. Is there a fix for
> read.table on the way?

You probably need to increase -n as well as -v to read in this table. Try
setting gcinfo(TRUE) to see what the garbage collector is doing. Most
likely it is running out of cons cells long before it runs out of heap
storage. I suspect this because I ran into exactly the same situation a
few weeks ago, and Thomas Lumley pointed it out to me.
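In concrete terms, a sketch of the sort of session meant here (the cons-cell
count is only a starting guess to be tuned for your machine):

  # start R with a larger vector heap (-v, in Mb) and more cons cells (-n)
  R -v 100 -n 1000000

  # inside R, report memory status after every garbage collection
  gcinfo(TRUE)
  x <- read.table("data")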
On Mon, 9 Mar 1998, Rick White wrote:

> I find that read.table cannot handle large datasets. Suppose data is a
> 40000 x 6 dataset.
>
>   R -v 100
>
>   x <- read.table("data")
>
> gives
>   Error: memory exhausted
> but
>   x <- as.data.frame(matrix(scan("data"), byrow = TRUE, ncol = 6))
> works fine.

You need to increase the number of cons cells as well as the vector heap
size, e.g.

  R -v 40 -n 1000000

to allocate 1000000 cons cells instead of the standard 200000. To see what
sort of memory you are running out of, use gcinfo(T), which tells R to
report the memory status after each garbage collection.

Thomas Lumley
------------------------
Biostatistics
Uni of Washington
Box 357232
Seattle WA 98195-7232
------------------------
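As a rough back-of-envelope check (my own numbers, on the assumption that
read.table needs on the order of at least one cons cell per field while it
parses), the default allocation is already too small for this file:

  nrow <- 40000; ncol <- 6
  nrow * ncol    # 240000 fields -- more than the standard 200000 cons cells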