Michał Bojanowski <bojanr at wp.pl> writes:
> Recently I came across a problem. I have to analyze a large survey
> data - something about 600 columns and 10000 rows (tab-delimited file
> with names in the header). I was able to import the data into an
> object, but there is no more memory left.
>
> Is there a way to import the data column by column? I have to analyze
> the whole data, but only two variables at a time.
You will probably need to do the data manipulation externally.
Two possible solutions are to use a scripting language like Python or
Perl, or to store the data in a relational database like PostgreSQL or
MySQL. For data of this size I would recommend the relational
database approach.
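As a minimal sketch of the database route (using Python's built-in
sqlite3 module for illustration rather than PostgreSQL or MySQL; the
file, table, and column names here are invented, not from the original
question):

```python
import csv
import sqlite3

def load_tsv(path, con, table="survey"):
    """Load a tab-delimited file whose first line is a header row
    into a SQLite table, one column per variable."""
    with open(path) as f:
        reader = csv.reader(f, delimiter="\t")
        header = next(reader)
        cols = ", ".join('"%s"' % h for h in header)
        marks = ", ".join("?" * len(header))
        con.execute("CREATE TABLE %s (%s)" % (table, cols))
        con.executemany("INSERT INTO %s VALUES (%s)" % (table, marks),
                        reader)
    con.commit()
    return header

def two_columns(con, a, b, table="survey"):
    """Fetch just the two named columns -- all one analysis needs."""
    query = 'SELECT "%s", "%s" FROM %s' % (a, b, table)
    return con.execute(query).fetchall()
```

Once the data are loaded, each analysis pulls only its two variables,
so the full 600-column table never has to fit in R's memory at once.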
R has packages to connect to PostgreSQL or to MySQL.
If you want to use Python instead, the code is fairly easy to write.
To extract the first two fields (for which the slice really is written
0:2, not 0:1 or 1:2 as one might expect, because Python slices are
zero-based and half-open), you could use
#!/usr/bin/env python
import fileinput

# Copy the first two tab-delimited fields of each input line to stdout.
for line in fileinput.input():
    flds = line.rstrip("\n").split("\t")
    print("\t".join(flds[0:2]))
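Since the file has variable names in the header, a variant that selects
two columns by name rather than by position may be more convenient for
working through many pairs of variables (the column names in the usage
note below are hypothetical examples, not from the original data):

```python
import csv

def extract(lines, wanted):
    """Yield rows of a tab-delimited stream (header line first),
    restricted to the columns named in `wanted`."""
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    idx = [header.index(name) for name in wanted]
    yield list(wanted)              # new header row
    for row in reader:
        yield [row[i] for i in idx]
```

For example, piping the survey file through
`for row in extract(sys.stdin, ["q17", "q42"]): print("\t".join(row))`
produces a two-column file small enough to read into R with
read.table().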
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject!) to: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._