I have a set of files that I am reading into R one at a time and passing to a function that I have written. Each file is a 'table' of n (columns) x 10000 (rows); n varies across the files, and most of the rows only have data in the first few columns.
Currently I am reading them in with the command:

dat <- read.table(file="2.75.0.997.1", header=FALSE, sep="",
                  skip=13, fill=TRUE,
                  row.names=1, nrows=10000)
***and it works fine.
However, we are now working with a huge table, and I was wondering if there is a more efficient way to read it in.
Ideally, I would like to have it as a list in which each element is a row from the input file, eliminating all of the NAs that the above approach produces, so that I would have a list with 10000 elements, each of variable length from 1 to n.
Any help greatly appreciated.
jimi adams
Department of Sociology
The Ohio State University
300 Bricker Hall
190 N. Oval Mall
Columbus, OH 43210-1353
614-688-4261
Our mind has a remarkable ability to think of contents as being independent of the act of thinking.
- Georg Simmel
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Mon, 29 Apr 2002, jimi adams wrote:

> i have a set of files that i am reading into R one at a time [...]
> IDEALLY i would like to have it as a list where each element is a row from
> the input file, eliminating all of the NA's that the above approach results
> in, such that i would have a list with 10000 elements and each of variable
> length from 1:n

You could declare a list with 10000 elements:

data <- vector("list", 10000)

and then open a connection to the file and read one line at a time:

a <- file("2.75.0.997.1")
open(a)
for (i in 1:10000) data[[i]] <- scan(a, nlines=1)
close(a)

I don't know if that would be more efficient, but it would use less memory.

-thomas

Thomas Lumley
Asst. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
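The connection-based approach above can be wrapped into a small helper; a minimal sketch (the function name `read_rows` is invented here, and the skip count of 13 and row count of 10000 are taken from the original post):

```r
# Read each line of a whitespace-separated file into its own list element,
# so rows with fewer fields simply yield shorter numeric vectors (no NA padding).
read_rows <- function(path, skip = 13, n = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))                      # ensure the connection is closed on exit
  if (skip > 0) readLines(con, n = skip)   # discard the header lines
  out <- vector("list", n)
  for (i in seq_len(n)) {
    out[[i]] <- scan(con, nlines = 1, quiet = TRUE)
  }
  out
}
```

Each element of the returned list is the numeric vector scanned from one line, so rows of different widths come back with different lengths.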
I previously sent in the message below, and I got several responses back that work; however, now I am running into a different problem.
I used the following line to read in the file:

temp.file <- readLines("2.75.0.997.1")

I was then recommended to use:

lapply(strsplit(temp.file, "*"), as.numeric)

to convert this to a list.
The only problem is that the file I am reading in has values ranging from 1:10000, and this splits them out into individual numeric characters rather than the initial values (e.g., 876 comes back as 8, 7, & 6).
I think I figured out how to do this if the values were all of the same length, but they are not, so I am wondering if there is some sort of split pattern that is equivalent to what sep="" does in read.table, splitting on whitespace rather than on a specific character.
Ultimately, what I want is for an initial file that looks like:

1 412 2000
2 4
3 8888
...

to become a list:

[[1]]
412 2000
[[2]]
4
[[3]]
8888
...
thanks in advance.
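A split pattern equivalent to read.table's sep="" is a regular expression matching runs of whitespace; a minimal sketch (the sample lines below stand in for the real readLines() output, and dropping the first field assumes it is the row label, as in the example above):

```r
temp.file <- c("1 412 2000", "2 4", "3 8888")     # stand-in for readLines("2.75.0.997.1")
split.rows <- strsplit(temp.file, "[[:space:]]+") # split each line on runs of whitespace
result <- lapply(split.rows, function(x) as.numeric(x[-1]))  # drop the leading row label
# result[[1]] is c(412, 2000); result[[2]] is 4; result[[3]] is 8888
```

Because the pattern matches one-or-more whitespace characters, multi-digit values like 876 stay intact instead of being split character by character.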
***************************
(original message of 29 Apr 2002, quoted in full above)