We are reading big tables, such as,
Chemicals <-
read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header
= TRUE, sep = '\t', as.is =T)
I was wondering if it is possible to set a filter during loading so that
we just load what we want not the whole table each time. Thanks,
-james
One can use colClasses to set which columns get read in. For the
columns you don't
want you can set those to NULL. For example,
cc <- c("NULL",rep("numeric",9))
myData <-
read.table("myFile.txt",header=TRUE,colClasses=cc,nrow=numRows).
On Wed, May 27, 2009 at 12:27 PM, <guox at ucalgary.ca>
wrote:> We are reading big tables, such as,
>
> Chemicals <-
>
read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header
> = TRUE, sep = '\t', as.is =T)
>
> I was wondering if it is possible to set a filter during loading so that
> we just load what we want not the whole table each time. Thanks,
>
> -james
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
The sqldf package can read a subset of rows and columns into R without
reading the entire file into R. There are a few caveats:
- It does not support ftp so you will need to download the file to your
computer first as shown in the example below
- since value is an SQL keyword it turns value into value__1 to avoid
a collision.
- you will have to convert the value column to numeric yourself as shown:
library(sqldf)
download.file("ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals",
"Chemicals.txt", method = "wget")
# define wp as a file with indicated format
wp <- file("Chemicals.txt")
attr(wp, "file.format") <- list(sep = "\t", header =
TRUE)
# use sqldf to read it in keeping only indicated rows
wp.df <- sqldf("select * from wp where footnote_codes = 'p' and
period = 'M01'")
# fix up type of value__1
wp.df$value__1 <- as.numeric(as.character(wp.df$value__1))
head(wp.df)
See http://sqldf.googlecode.com
On Wed, May 27, 2009 at 12:27 PM, <guox at ucalgary.ca>
wrote:> We are reading big tables, such as,
>
> Chemicals <-
>
read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',header
> = TRUE, sep = '\t', as.is =T)
>
> I was wondering if it is possible to set a filter during loading so that
> we just load what we want not the whole table each time. Thanks,
>
> -james
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>