We are reading big tables, such as, Chemicals <- read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header = TRUE, sep = '\t', as.is =T) I was wondering if it is possible to set a filter during loading so that we just load what we want not the whole table each time. Thanks, -james
One can use colClasses to set which columns get read in. For the columns you don't want you can set those to NULL. For example, cc <- c("NULL",rep("numeric",9)) myData <- read.table("myFile.txt",header=TRUE,colClasses=cc,nrow=numRows). On Wed, May 27, 2009 at 12:27 PM, <guox at ucalgary.ca> wrote:> We are reading big tables, such as, > > Chemicals <- > read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header > = TRUE, sep = '\t', as.is =T) > > I was wondering if it is possible to set a filter during loading so that > we just load what we want not the whole table each time. Thanks, > > -james > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
The sqldf package can read a subset of rows and columns into R without reading the entire file into R. There are a few caveats: - It does not support ftp so you will need to download the file to your computer first as shown in the example below - since value is an SQL keyword it turns value into value__1 to avoid a collision. - you will have to convert the value column to numeric yourself as shown: library(sqldf) download.file("ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals", "Chemicals.txt", method = "wget") # define wp as a file with indicated format wp <- file("Chemicals.txt") attr(wp, "file.format") <- list(sep = "\t", header = TRUE) # use sqldf to read it in keeping only indicated rows wp.df <- sqldf("select * from wp where footnote_codes = 'p' and period = 'M01'") # fix up type of value__1 wp.df$value__1 <- as.numeric(as.character(wp.df$value__1)) head(wp.df) See http://sqldf.googlecode.com On Wed, May 27, 2009 at 12:27 PM, <guox at ucalgary.ca> wrote:> We are reading big tables, such as, > > Chemicals <- > read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header > = TRUE, sep = '\t', as.is =T) > > I was wondering if it is possible to set a filter during loading so that > we just load what we want not the whole table each time. Thanks, > > -james > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >