What tools do you like for working with tab-delimited text files up to 1.5 GB (under Windows 7 with 8 GB RAM)?

Standard tools for smaller data sometimes grab all the available RAM, after which CPU usage drops to 3% ;-)

The "bigmemory" project won the 2010 John Chambers Award but "is not available (for R version 3.1.0)".

findFn("big data", 999) downloaded 961 links in 437 packages. Those include tools for data in PostgreSQL and other formats, but I couldn't find anything for large tab-delimited text files.

Absent a better idea, I plan to write a function getField to extract a specific field from the data, then use it to split the data into 4 smaller files, which I think should be small enough that I can do what I want.

Thanks,
Spencer
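A rough, untested sketch of the getField/split idea described above, reading the file in chunks so the whole 1.5 GB never has to sit in RAM at once. The file name "big.tsv", the field number, the chunk size, and the hash-based assignment to 4 parts are placeholder assumptions, not from the post:

getField <- function(lines, field, sep = "\t") {
  ## pull one field out of each tab-delimited line
  vapply(strsplit(lines, sep, fixed = TRUE), "[", character(1), field)
}

splitByField <- function(infile, field, nOut = 4, chunk = 1e5, sep = "\t") {
  incon <- file(infile, open = "r")
  on.exit(close(incon), add = TRUE)
  outfiles <- sprintf("part%d.txt", seq_len(nOut))
  outcons  <- lapply(outfiles, file, open = "w")
  on.exit(lapply(outcons, close), add = TRUE)
  header <- readLines(incon, n = 1)
  for (oc in outcons) writeLines(header, oc)   # repeat the header in every part
  repeat {
    lines <- readLines(incon, n = chunk)
    if (length(lines) == 0L) break
    key <- getField(lines, field, sep)
    ## deterministic hash of the key, so equal keys always land in the same part
    part <- vapply(key, function(k)
      if (is.na(k)) 0 else sum(utf8ToInt(k)), numeric(1)) %% nOut + 1
    for (i in seq_len(nOut)) {
      sel <- lines[part == i]
      if (length(sel)) writeLines(sel, outcons[[i]])
    }
  }
  invisible(outfiles)
}

## e.g. splitByField("big.tsv", field = 3)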
Have you tried read.csv.sql from package sqldf?

Peter

On Tue, Aug 5, 2014 at 10:20 AM, Spencer Graves
<spencer.graves at structuremonitoring.com> wrote:
> What tools do you like for working with tab delimited text files up to
> 1.5 GB (under Windows 7 with 8 GB RAM)?
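To illustrate the read.csv.sql suggestion: the file is staged in a temporary SQLite database, the SQL runs there, and only the result comes back into R, so the full file never has to exist as an R object. A hedged sketch (the file name, column names, and filter are invented; the table is referred to as "file", as in sqldf's own examples):

library(sqldf)

## read only two columns, and only the rows matching the filter,
## from a tab-delimited file
dat <- read.csv.sql("big.tsv",
                    sql    = "select field1, field4 from file where field2 = 'A'",
                    header = TRUE,
                    sep    = "\t")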
On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote:

> What tools do you like for working with tab delimited text files up to
> 1.5 GB (under Windows 7 with 8 GB RAM)?

?data.table::fread

> Absent a better idea, I plan to write a function getField to extract a
> specific field from the data, then use that to split the data into 4
> smaller files, which I think should be small enough that I can do what
> I want.

There is the colbycol package, with which I have no experience, but I understand it is designed to partition data into column-sized objects.

#--- from its help file -----
cbc.get.col {colbycol}    R Documentation

Reads a single column from the original file into memory

Description

Function cbc.read.table reads a file, stores it column by column in a disk file, and creates a colbycol object. Function cbc.get.col queries this object and returns a single column.

David Winsemius
Alameda, CA, USA
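Hedged sketches of both suggestions (the file name "big.tsv" and column name "field3" are made up):

library(data.table)

## fread is far faster than read.table for a file this size
DT <- fread("big.tsv", sep = "\t")

## recent versions of data.table also let fread read only the columns you need
oneField <- fread("big.tsv", sep = "\t", select = "field3")

For colbycol, the call below is guessed from the help text quoted above, so the argument names may differ:

library(colbycol)

## cbc.read.table parses the file once and stores each column on disk;
## cbc.get.col then pulls a single column into memory
cc     <- cbc.read.table("big.tsv", sep = "\t", header = TRUE)
field3 <- cbc.get.col(cc, "field3")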