In addition to Dirk's advice about the biglm package, you may also want to
look at the RSQLite and SQLiteDF packages, which may make dealing with the
large dataset faster and easier, especially for passing chunks to bigglm().
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of André de Boer
> Sent: Wednesday, September 08, 2010 5:27 AM
> To: r-help at r-project.org
> Subject: [R] big data
>
> Hello,
>
> I searched the internet but I didn't find an answer to the following
> problem:
> I want to fit a glm on a csv file consisting of 25 columns and 4
> million rows.
> Not all the columns are relevant. My problem is how to read the data
> into R, manipulate it, and then fit a glm.
>
> I've tried with:
>
> dd <- scan("myfile.csv", colClasses = classes)
> dat <- as.data.frame(dd)
>
> My question is: what is the right way to do it?
> Can someone give me a hint?
>
> Thanks,
> Arend
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.