http://cran.r-project.org/web/packages/data.table/index.html
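
A minimal sketch of what the equivalent call might look like with fread()
from that package, assuming a reasonably recent data.table release. The toy
file and its column names are invented for illustration; only the
na.strings codes and the column positions come from the question below.

```r
library(data.table)

# Toy stand-in for the real file (hypothetical data): 8 columns,
# with "C" and "B" used as NA codes in otherwise numeric columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,grp,x1,x2,v1,v2,v3,v4",
             "1,a,foo,bar,1.5,C,3,4",
             "2,d,baz,qux,B,2.5,6,7"), tmp)

# Keep columns 1, 2 and 5-8 (i.e. skip 3 and 4) and treat "C"/"B" as NA.
d <- fread(tmp, select = c(1, 2, 5, 6, 7, 8), na.strings = c("C", "B"))
str(d)
```

Note that na.strings applies to every column, so a genuine "C" or "B" in a
character column would also be read as NA.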
On Wed, May 22, 2013 at 12:31 PM, ivo welch
<ivo.welch@anderson.ucla.edu> wrote:
> I have a couple of large data sets, on the order of 4GB. They come in .csv
> files, with about 50 columns and lots of rows. A couple have weird NA
> values, such as "C" and "B", in numeric columns.
>
> I am wondering how well read.csv() deals with this stuff on the first
> pass.
>
> d <- read.csv("t.csv",
>               colClasses=c(NA, NA, "NULL", "NULL",
>                            "numeric", "numeric", "numeric", "numeric"),
>               na.strings=c("C", "B"))
>
> Does R first read the entire file and then worry about colClasses and
> na.strings, or does it handle this line by line as it goes?
>
> (If it does the former, I can write a Perl pre-filter.)
>
> /iaw
>
> ----
> Ivo Welch (ivo.welch@gmail.com)
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>