i have a set of files that i am reading into R one at a time and applying to a function that i have written,
where each is a 'table' n (columns) x 10000 (rows).
n varies across the files, and most of the rows only have data in the first few columns.
currently i am reading them in with the command:

    read.table(file="2.75.0.997.1", header=FALSE, sep="", skip=13, fill=, row.names=1, nrows=10000)->list

***and it works fine.
however, we are now working with a huge table.
i was wondering if there is a more efficient way to read this in.

IDEALLY i would like to have it as a list where each element is a row from the input file, eliminating all of the NA's that the above approach results in, such that i would have a list with 10000 elements, each of variable length from 1:n.

any help greatly appreciated

jimi adams
Department of Sociology
The Ohio State University
300 Bricker Hall
190 N. Oval Mall
Columbus, OH 43210-1353
614-688-4261

our mind has a remarkable ability to think of contents as being independent of the act of thinking
-georg simmel

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
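For readers following along, here is a minimal sketch of what the read.table route looks like when the NA padding is stripped afterwards. It assumes `fill=TRUE` (the posted command shows `fill=` with no value), and it uses a small temporary file as a hypothetical stand-in for "2.75.0.997.1", so `skip` and `nrows` are left out.

```r
# Hypothetical three-row sample standing in for the real data file.
path <- tempfile()
writeLines(c("1 412 2000", "2 4", "3 8888"), path)

# fill = TRUE is an assumption: it pads short rows with NA.
tab <- read.table(path, header = FALSE, sep = "", fill = TRUE, row.names = 1)

# Turn each row into a vector and drop the NA padding.
rows <- lapply(seq_len(nrow(tab)), function(i) {
  x <- unlist(tab[i, ], use.names = FALSE)
  x[!is.na(x)]
})
```

This still builds the full padded table first, so it does not address the memory concern; it only shows the shape of the list being asked for.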
On Mon, 29 Apr 2002, jimi adams wrote:

> i have a set of files that i am reading into R one at a time and applying
> to a function that i have written
> where each is a 'table' n (columns) x 10000 (rows)
> n varies across the files and most of the rows only have data in the first
> few columns
> currently i am reading them in with the command:
> read.table(file="2.75.0.997.1", header=FALSE, sep="", skip=13, fill=,
> row.names=1, nrows=10000)->list
>
> ***and it works fine
> however we are now working with a huge table.
> i was wondering if there is a more efficient way to read this in
>
> IDEALLY i would like to have it as a list where each element is a row from
> the input file, eliminating all of the NA's that the above approach results
> in , such that i would have a list with 10000 elements and each of variable
> length from 1:n
>

You could declare a list with 10000 elements as

    data<-vector("list",10000)

and then open a connection to the file and read one line at a time:

    a<-file("2.75.0.997.1")
    open(a)
    for(i in 1:10000) data[[i]]<-scan(a,nlines=1)

I don't know if that would be more efficient, but it would use less memory.

-thomas

Thomas Lumley
Asst. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
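The connection-based approach above can be tried on a small file. This is only a sketch: the temporary file and its three rows are hypothetical stand-ins for "2.75.0.997.1", `n_rows` stands in for 10000, and `quiet = TRUE` is added merely to suppress scan's per-line "Read N items" message.

```r
# Hypothetical three-row sample standing in for the real data file.
path <- tempfile()
writeLines(c("1 412 2000", "2 4", "3 8888"), path)

n_rows <- 3                         # 10000 in the original question
rows <- vector("list", n_rows)      # pre-allocate the list

con <- file(path, open = "r")       # open once so each scan() continues where the last stopped
for (i in seq_len(n_rows)) {
  # nlines = 1 reads exactly one line from the current position
  rows[[i]] <- scan(con, nlines = 1, quiet = TRUE)
}
close(con)
```

Each element still starts with the row label (the first field on the line), since scan reads every number on the line; it can be dropped afterwards with `lapply(rows, function(x) x[-1])`.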
i previously sent in the message below and got several responses back that work; however, now i am running into a different problem.

i used the following line to read in the file:

    temp.file<- readLines("2.75.0.997.1")

i was then recommended to use:

    lapply(strsplit(temp.file,"*"), as.numeric)

to convert this to a list.
the only problem is that the file that i am reading in has values ranging from 1:10000, and this splits it out into individual numeric characters, not the initial values (e.g., 876 returns as 8, 7, & 6).
i think i figured out how to do this if the values were all of the same length, but they are not, so i am wondering if there is some sort of split command that is equivalent to what sep="" does when reading a file with read.table (i.e., splitting on whitespace), rather than being defined by a specific numeric value.

ultimately what i want is for the initial file, which looks like:

    1 412 2000
    2 4
    3 8888
    ...

to become a list:

    [1] 412 2000
    [2] 4
    [3] 8888
    ...

thanks in advance.

***************************

i have a set of files that i am reading into R one at a time and applying to a function that i have written
where each is a 'table' n (columns) x 10000 (rows)
n varies across the files and most of the rows only have data in the first few columns
currently i am reading them in with the command:

    read.table(file="2.75.0.997.1", header=FALSE, sep="", skip=13, fill=, row.names=1, nrows=10000)->list

***and it works fine
however we are now working with a huge table.
i was wondering if there is a more efficient way to read this in

IDEALLY i would like to have it as a list where each element is a row from the input file, eliminating all of the NA's that the above approach results in , such that i would have a list with 10000 elements and each of variable length from 1:n

any help greatly appreciated

jimi adams
Department of Sociology
The Ohio State University
300 Bricker Hall
190 N. Oval Mall
Columbus, OH 43210-1353
614-688-4261

our mind has a remarkable ability to think of contents as being independent of the act of thinking
-georg simmel
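One way to get whitespace splitting is to hand strsplit a regular expression that matches runs of whitespace. The following is a sketch, not a reply from the thread: the three sample lines are hypothetical stand-ins for the output of readLines, and it assumes the first field on each line is the row label, as in the example above.

```r
# Stand-in for temp.file <- readLines("2.75.0.997.1")
temp.file <- c("1 412 2000", "2 4", "3 8888")

# Split each line on one or more whitespace characters;
# trimws() avoids an empty leading field if a line starts with spaces.
fields <- strsplit(trimws(temp.file), "[[:space:]]+")

# Drop the first field (assumed to be the row label) and convert the rest.
rows <- lapply(fields, function(x) as.numeric(x[-1]))

# Keep the row labels as list names.
names(rows) <- sapply(fields, `[`, 1)
```

With this sample input, `rows[["1"]]` holds 412 and 2000, `rows[["2"]]` holds 4, and `rows[["3"]]` holds 8888, with no NA padding.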