thr3ads.net - R help - [R] Reading parts of data files [Nov 2010]

fbielejec

2010-Nov-23 11:05 UTC

[R] Reading parts of data files

Dear all,

I'm doing an analysis where I need to work on relatively large (50-60 MB)
text files, though I'm really only interested in the parts with binary
variables (named indicators1, indicators2, etc.).

Every text file contains other numeric columns, but not always the same
ones and not always in the same order. I would therefore need a method
that connects to the file and reads only the columns whose names match a
pattern (i.e. indicators + number). That should speed things up (right
now I have to clean the data by hand) and also leave a smaller memory
footprint. Could you point me towards something?

jim holtman

2010-Nov-23 13:15 UTC


?file        - how to use connections
?read.table  - see the 'skip' parameter, and 'colClasses' to read only
               the columns you want

That is not a large file.  Read the whole thing in and then extract
the data you need.
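
A minimal sketch of the colClasses route, assuming a comma-separated file with a header row. The helper name read_indicators, the demo file, and its column names are made up for illustration; they are not from the original post. The header line is read on its own first, and every column whose name does not match the pattern gets the class "NULL", which makes read.table skip it.

```r
# Hypothetical helper: read only the columns whose names match `pattern`.
read_indicators <- function(path, pattern = "^indicators[0-9]+$", sep = ",") {
  # Read just the header line to learn the column names and their order.
  header <- scan(path, what = character(), sep = sep, nlines = 1, quiet = TRUE)

  # "NULL" tells read.table to skip a column; NA lets it guess the type.
  col_classes <- rep("NULL", length(header))
  col_classes[grepl(pattern, header)] <- NA

  read.table(path, header = TRUE, sep = sep, colClasses = col_classes)
}

# Made-up demo file with the wanted columns mixed in among others.
demo <- data.frame(weight      = c(1.5, 2.5, 3.5),
                   indicators1 = c(0L, 1L, 1L),
                   height      = c(10, 20, 30),
                   indicators2 = c(1L, 0L, 1L))
demo_path <- tempfile(fileext = ".csv")
write.table(demo, demo_path, sep = ",", quote = FALSE, row.names = FALSE)

ind <- read_indicators(demo_path)
str(ind)
```

Skipped columns are never stored in the result, although every line of the file is still scanned, so this mainly saves memory rather than reading time.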


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Gabor Grothendieck

2010-Nov-23 14:57 UTC


This is easy using read.csv.sql:

library(sqldf)

# create test file
write.table(anscombe, "anscombe.csv", sep = ",", quote = FALSE,
            row.names = FALSE)

# read it back, but only the indicated columns
read.csv.sql("anscombe.csv", sql = "select x1, x2, y1, y2 from file")

See ?read.csv.sql and also sqldf home page at http://sqldf.googlecode.com
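
Since the column names differ from file to file, the select list can be built from the header line instead of being typed by hand. The following is only a sketch under assumptions: the helper name indicator_sql, the demo file, and its columns are made up for illustration, and the final step runs only if the sqldf package is installed.

```r
# Hypothetical helper: build a SELECT statement for the columns whose
# names match `pattern`, based on the header line of the file.
indicator_sql <- function(path, pattern = "^indicators[0-9]+$", sep = ",") {
  header <- scan(path, what = character(), sep = sep, nlines = 1, quiet = TRUE)
  wanted <- grep(pattern, header, value = TRUE)
  if (length(wanted) == 0) stop("no column names match the pattern")
  paste("select", paste(wanted, collapse = ", "), "from file")
}

# Made-up demo file.
demo_path <- tempfile(fileext = ".csv")
write.table(data.frame(weight      = c(1.5, 2.5),
                       indicators1 = c(0L, 1L),
                       indicators2 = c(1L, 0L)),
            demo_path, sep = ",", quote = FALSE, row.names = FALSE)

query <- indicator_sql(demo_path)
print(query)

# Run the query only if sqldf is available.
if (requireNamespace("sqldf", quietly = TRUE)) {
  ind <- sqldf::read.csv.sql(demo_path, sql = query)
  print(ind)
}
```

The generated statement has the same shape as the hand-written one above, so it can be passed to read.csv.sql unchanged.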

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
