Hi everyone,

I am fairly new to R, and my first question is whether R is suitable for files of about 2 GB with slightly more than 2 million rows. When I try to import the data using read.table, it seems to take forever and I have to cancel the command. Are there any special techniques or tricks I should keep in mind in order to do data analysis on such large files in R?

--
Regards,
Gaurav Singh
Take a look at the ff package.

On Jan 19, 2013 7:04 AM, "gaurav singh" <gauravonline20@gmail.com> wrote:
> Hi Everyone,
>
> I am a little new to R and the first problem I am facing is the dilemma
> whether R is suitable for files of size 2 GB's and slightly more then 2
> Million rows. When I try importing the data using read.table, it seems to
> take forever and I have to cancel the command. Are there any special
> techniques or methods which i can use or some tricks of the game that I
> should keep in mind in order to be able to do data analysis on such large
> files using R?
>
> --
> Regards
> Gaurav Singh
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
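As a rough illustration of the ff suggestion above, here is a minimal sketch of a chunked import with read.csv.ffdf. It assumes the ff package is installed and that the file is comma-separated with two numeric columns. The small temporary file, the column names and the chunk sizes are made up for the example and stand in for the real 2 GB file.

```r
# Assumes the ff package is installed: install.packages("ff")
if (requireNamespace("ff", quietly = TRUE)) {
  library(ff)

  # Hypothetical stand-in for the large file.
  tmp <- tempfile(fileext = ".csv")
  write.csv(data.frame(id = 1:1000, value = rnorm(1000)),
            tmp, row.names = FALSE)

  # Read the file in chunks; the resulting ffdf keeps its data in
  # files on disk rather than holding everything in RAM.
  big <- read.csv.ffdf(file = tmp, header = TRUE,
                       first.rows = 100,   # size of the first chunk
                       next.rows = 500,    # size of each later chunk
                       colClasses = c("integer", "numeric"))

  print(dim(big))
  # [] pulls a single column into memory as an ordinary vector.
  print(mean(big$value[]))
}
```

For a real file, first.rows and next.rows would be much larger. Note that ff is mainly suited to numeric and factor columns, so free-text columns may need different handling.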
On 13-01-19 3:28 AM, gaurav singh wrote:
> Hi Everyone,
>
> I am a little new to R and the first problem I am facing is the dilemma
> whether R is suitable for files of size 2 GB's and slightly more then 2
> Million rows. When I try importing the data using read.table, it seems to
> take forever and I have to cancel the command. Are there any special
> techniques or methods which i can use or some tricks of the game that I
> should keep in mind in order to be able to do data analysis on such large
> files using R?
>

Specifying the type of each column with colClasses will speed up read.table a lot on a big file. You have a lot of data, so having plenty of memory will help. You may want to work in 64-bit R, which can address far more memory than 32-bit R.

Duncan Murdoch
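The colClasses advice above could look roughly like the following sketch. The tab-separated temporary file and its three columns are invented for illustration; only read.table and its documented arguments (colClasses, nrows, comment.char) are used. Guessing the types from the first few rows is an assumption that can fail, for example if an apparently integer column contains decimals further down, so the types are better written out by hand when they are known.

```r
# Hypothetical stand-in for the large file.
tmp <- tempfile(fileext = ".txt")
df <- data.frame(id = 1:5,
                 value = c(1.5, 2.5, 3.5, 4.5, 5.5),
                 label = c("a", "b", "c", "d", "e"),
                 stringsAsFactors = FALSE)
write.table(df, tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# Read only a few rows first to find out the column types.
head_rows <- read.table(tmp, header = TRUE, sep = "\t", nrows = 3,
                        stringsAsFactors = FALSE)
col_types <- sapply(head_rows, class)

# Pass the types to the full read so read.table does not have to
# guess them; comment.char = "" skips the scan for comment characters.
big <- read.table(tmp, header = TRUE, sep = "\t",
                  colClasses = col_types,
                  comment.char = "",
                  stringsAsFactors = FALSE)
str(big)
```

If the number of rows is known in advance, passing it (or a mild overestimate) as nrows to the full read can also reduce memory use.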