Cable, Samuel B Civ USAF AFMC AFRL/RVBXI
2010-Apr-07 19:15 UTC
[R] FW: fairly simple file I/O
OK, my apologies. I am sure this is a question that has been answered before. But I have looked all over the web and can't find an answer for it. I promise, wasting your time and bandwidth is my last resort. So here goes:

I have an ASCII file formatted like so:

Label 1.1
Time 1
Label 1.2
Array of data from time 1
Label 2.1
Time 2
Label 2.2
Array of data from time 2
Label 3.1
Etc.

I just want an efficient way of reading this data in so that

1) The "Label" values are ignored.
2) The "Time" values go into a single vector.
3) The "Array of data" values go into a single array.

The only thing I have been able to do is "scan" everything in to one honking big list and then distribute the data out of this list one index at a time. Surely there is a more elegant way? Thanks.
Hi,

On Wed, Apr 7, 2010 at 3:15 PM, Cable, Samuel B Civ USAF AFMC AFRL/RVBXI <Samuel.Cable at hanscom.af.mil> wrote:
> OK, my apologies. I am sure this is a question that has been answered
> before. But I have looked all over the web and can't find an answer for
> it. I promise, wasting your time and bandwidth is my last resort.
> So here goes:
>
> I have an ASCII file formatted like so:
>
> Label 1.1
> Time 1
> Label 1.2
> Array of data from time 1
> Label 2.1
> Time 2
> Label 2.2
> Array of data from time 2
> Label 3.1
> Etc.
>
> I just want an efficient way of reading this data in so that
>
> 1) The "Label" values are ignored.
> 2) The "Time" values go into a single vector.
> 3) The "Array of data" values go into a single array.
>
> The only thing I have been able to do is "scan" everything in to one
> honking big list and then distribute the data out of this list one index
> at a time. Surely there is a more elegant way? Thanks.

I'm not sure what combo of search terms you could have used to get this direct answer -- I think you just have to break this problem down into smaller ones, which you could then have smoked out ... for instance:

1. You can read a file into a vector of character strings with `readLines`.
2. You can use `grep` over a character vector to find the indices of the strings that match your grep/regex search.
3. `strsplit` breaks a string into pieces given a delimiter.
4. Indexing a vector with negative numbers is really helpful.

Anyway, let's start with reading your data into a vector of characters:

R> lines <- readLines('/path/to/your/file.txt')

Now `lines[1]` will be the first line of the file.

Moving on: do the "Time", "Label", etc. lines actually start with the words "Time" and "Label"? If so you can just find them with grep.
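To make the four building blocks above concrete, here is a minimal sketch on a toy input. The sample values, the comma delimiter, and the variable names (`sample_text`, `label.idx`, `kept`, `pieces`) are assumptions for illustration only; the real file may well look different.

```r
# Hypothetical stand-in for the real file: eight lines in the
# label / time / label / data pattern, with comma-separated data (assumed).
sample_text <- c("Label 1.1", "10.5", "Label 1.2", "1,2,3",
                 "Label 2.1", "11.0", "Label 2.2", "4,5,6")

# 1. readLines returns one string per line. A textConnection is used here
#    so the sketch runs without a file on disk.
con <- textConnection(sample_text)
lines <- readLines(con)
close(con)

# 2. grep returns the indices of the elements that match the pattern.
label.idx <- grep("^Label", lines)

# 4. Negative indices drop those elements and keep everything else.
kept <- lines[-label.idx]

# 3. strsplit returns a list with one element per input string;
#    [[1]] pulls out the pieces of the single string passed in.
pieces <- strsplit(kept[2], ",")[[1]]
```

With a real file, the `textConnection` lines would simply be replaced by a `readLines` call on the file path.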
You say you don't want the "Label" lines, so you can remove them:

R> label.lines <- grep("Label", lines)
R> clean.lines <- lines[-label.lines]

Now `clean.lines`, I guess, looks like:

Time 1
Array of data from time 1
Time 2
Array of data from time 2

Now you can use grep again, or just pull out every other index:

R> time.lines <- seq(1, length(clean.lines), by=2)
R> times <- clean.lines[time.lines]
R> datas <- clean.lines[-time.lines]

If "datas" is some comma-separated line of data, you can use strsplit on it to split the pieces by a delimiter (like ","). See ?strsplit for more info.

Does that help?

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
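For reference, the steps in the reply can be strung together into one end-to-end sketch that ends with a numeric vector of times and a numeric matrix of data. This is only an illustration under stated assumptions: each time is a bare number on its own line, each data array sits on a single comma-separated line, and every array has the same length. The toy input and the names `is.label` and `data.mat` are hypothetical.

```r
# Toy input standing in for readLines('/path/to/your/file.txt') (assumed layout).
lines <- c("Label 1.1", "10.5", "Label 1.2", "1,2,3",
           "Label 2.1", "11.0", "Label 2.2", "4,5,6")

# Drop the label lines. grepl gives a logical mask, so this also behaves
# when nothing matches (lines[-integer(0)] would return an empty vector).
is.label <- grepl("^Label", lines)
clean.lines <- lines[!is.label]

# Times and data now alternate: odd positions are times, even are data.
time.lines <- seq(1, length(clean.lines), by = 2)
times <- as.numeric(clean.lines[time.lines])
datas <- clean.lines[-time.lines]

# Split each data line on the (assumed) comma delimiter, convert to numeric,
# and stack the results as rows of a matrix: one row per time step.
data.mat <- do.call(rbind, lapply(strsplit(datas, ","), as.numeric))
```

If the arrays are whitespace-separated instead, the same idea should work with a different split pattern, for example `strsplit(datas, " +")`.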