Hello,
I'm attempting to load a ~110 MB text file with ~500,000 rows and 200 columns using read.table. R hangs and seems to give up. Can anyone tell me an efficient way to load a file of this size?
Thank you!
Alex
Hi Alex,

Perhaps http://www.nabble.com/How-to-read-HUGE-data-sets--td15729830.html#a15746400 can help.

HTH,
Jorge
On 5/2/2008 2:13 PM, ajoyner wrote:
> Can anyone tell me an efficient way to load a file of this size?

It will help a lot if you specify the column types (using the colClasses argument), so that R doesn't have to determine them from the data.

It will also help if you've got lots of physical memory available for R; depending on the data, that could take several hundred MB of memory, and if the OS needs to use swap space to get it, you'll find it very slow. If you want to limit the memory footprint, don't read all of the data at once: specify some columns to be skipped (set their class to "NULL") or some rows (using skip and/or nrows).

Duncan Murdoch
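[A minimal sketch of the above advice; the file name, separator, and header setting are assumptions, and the column layout (one character column plus 199 numeric columns) follows the description later in this thread:]

  ## Hypothetical file: tab-delimited with a header row, one character ID
  ## column followed by 199 numeric columns, ~500,000 rows.
  dat <- read.table("bigfile.txt",
                    header       = TRUE,
                    sep          = "\t",
                    colClasses   = c("character", rep("numeric", 199)),
                    nrows        = 500000,   # known upper bound lets R allocate once
                    comment.char = "")       # disable comment scanning for speed

  ## To limit the memory footprint, skip columns by declaring them "NULL",
  ## e.g. keep only the ID column and the first 10 numeric columns:
  keep <- c("character", rep("numeric", 10), rep("NULL", 189))
  sub  <- read.table("bigfile.txt", header = TRUE, sep = "\t",
                     colClasses = keep, nrows = 500000, comment.char = "")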
On 5/2/2008 2:52 PM, Alex Joyner wrote:
> Duncan,
> Thank you for your response. I actually am using colClasses, but the
> first column is a character column, and the rest are numeric. Is there
> any way to specify that all columns are numeric except for the first
> one? I couldn't find this in the documentation. Also, I can't remove the
> first column until I read the file in, right?
> Thanks again!
> Alex

If you set

  colClasses = c("NULL", rep("numeric", 199))

the first column will be skipped entirely during the read, and you should get what you want.

Duncan Murdoch
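[For reference, a small sketch of both variants of that call, answering each of the two questions above; the file name and header setting are assumptions:]

  ## Read all 200 columns, the first as character, the rest as numeric:
  dat <- read.table("bigfile.txt", header = TRUE,
                    colClasses = c("character", rep("numeric", 199)))

  ## Or drop the first column at read time by declaring its class "NULL",
  ## so it never has to be removed afterwards:
  dat <- read.table("bigfile.txt", header = TRUE,
                    colClasses = c("NULL", rep("numeric", 199)))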