Henrik Andersson
2004-Oct-25 13:56 UTC
[R] Reading sections of data files based on pattern matching
I am about to write general functions to read the output of simulation models.

These models generate output files with different sections, which I want to analyze, plot, etc.

Since this will be used by many people at the department, I want to make sure that I do this in the best way.

For instance, I want to read snippets of data from a text file that looks like this.

-------------------------------
Lots of stuff
...
@@Start Values@@
      Column1 Column2 Column3 ...
Row1  1       2       3       ...
...
@@End Values@@

More stuff
...
@@Start OtherValues@@
      Column1 Column2 Column3 ...
Row1  1       2       3       ...
...
@@End OtherValues@@

I looked in the help files and found grep, which operates on character strings. Do I have to do it like this, then?

1. Read file with readLines("foo.txt")
2. grep this object for the start and end of each section -> startline & stopline
3. Read the file again with read.table("foo.txt",skip=startline,nrows=stopline-startline)

Or is there a more beautiful way?

Cheers,

---------------------------------------------
Henrik Andersson
Netherlands Institute of Ecology - Centre for Estuarine and Marine Ecology
P.O. Box 140
4400 AC Yerseke
Phone: +31 113 577473
h.andersson at nioo.knaw.nl
http://www.nioo.knaw.nl/ppages/handersson
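The three steps in the question can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name read_section and the sample lines are assumptions, and the file is read once and the section parsed from memory with read.table(text = ...) instead of reading the file a second time.

```r
# Hypothetical helper (not from the thread): extract one section delimited by
# "@@Start <name>@@" and "@@End <name>@@" and parse it as a table.
read_section <- function(lines, name) {
  start <- which(trimws(lines) == paste0("@@Start ", name, "@@"))
  end   <- which(trimws(lines) == paste0("@@End ", name, "@@"))
  # Expect exactly one pair of markers with at least a header line between them
  stopifnot(length(start) == 1, length(end) == 1, end > start + 1)
  # The header has one field fewer than the data rows, so read.table
  # uses the first column (Row1, Row2, ...) as row names.
  read.table(text = lines[(start + 1):(end - 1)], header = TRUE)
}

# Made-up sample in the layout shown above; in practice this would be
# lines <- readLines("foo.txt")
lines <- c(
  "Lots of stuff",
  "@@Start Values@@",
  "      Column1 Column2 Column3",
  "Row1  1       2       3",
  "Row2  4       5       6",
  "@@End Values@@",
  "More stuff",
  "@@Start OtherValues@@",
  "      Column1 Column2",
  "Row1  7       8",
  "@@End OtherValues@@"
)

values <- read_section(lines, "Values")
other  <- read_section(lines, "OtherValues")
```

Matching the whole trimmed line against the marker, rather than using a loose pattern, keeps "Values" from accidentally matching "OtherValues".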
Gabor Grothendieck
2004-Oct-25 15:06 UTC
[R] Reading sections of data files based on pattern matching
Henrik Andersson <h.andersson <at> nioo.knaw.nl> writes:

:
: I am about to write general functions to read the output of simulations
: models.
:
: These model generate output files with different sections which I want
: to analyze plot etc.
:
: Since this will be used many people at the department I wanted to make
: sure that will do this in the best way.
:
: For instance I want to read a snippets of data from a text that look
: like this.
: -------------------------------
: Lots of stuff
: ...
: @@Start Values@@
: Column1 Column2 Column3 ...
: Row1 1 2 3 ...
: ...
: @@End Values@@
:
: More stuff
: ...
: @@Start OtherValues@@
: Column1 Column2 Column3 ...
: Row1 1 2 3 ...
: ...
: @@End OtherValues@@
:
:
: I looked in the help files and found grep which operates on character
: strings, do I have to like this then?
:
: 1. Read file with readLines("foo.txt")
: 2. grep this object for the start and end of each section ->startline &
: stopline
: 3. Read the file again with
: read.table("foo.txt",skip=startline,nrows=stoplin-startline)
:
: Or is there a more beautiful way?

You could adapt the following to your situation (i.e. multiple sections rather than just one):

https://www.stat.math.ethz.ch/pipermail/r-help/2003-November/040184.html

Also, regarding your example, one potential gotcha to be aware of is that skip= skips lines of the file, whereas nrows= counts rows of the data frame, so they are slightly different concepts.
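To make the skip= versus nrows= point concrete, here is a small sketch under assumed data: the temporary file and its contents are made up for illustration. With a header line inside the section, the number of data rows is the number of lines strictly between the markers minus one for the header.

```r
# Made-up sample file in the layout from the question
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Lots of stuff",
  "@@Start Values@@",
  "      Column1 Column2 Column3",
  "Row1  1       2       3",
  "Row2  4       5       6",
  "@@End Values@@",
  "More stuff"
), path)

lines     <- readLines(path)
startline <- which(trimws(lines) == "@@Start Values@@")  # line number of start marker
stopline  <- which(trimws(lines) == "@@End Values@@")    # line number of end marker

# skip counts physical lines: everything up to and including the start marker.
# nrows counts data rows only: the lines between the markers, minus the header.
n_data <- stopline - startline - 2
values <- read.table(path, header = TRUE, skip = startline, nrows = n_data)
```

Using nrows = stopline - startline, as in the question, would ask for more rows than the section contains and run past the end marker.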
Duncan Murdoch
2004-Oct-25 17:44 UTC
[R] Reading sections of data files based on pattern matching
On Mon, 25 Oct 2004 15:56:37 +0200, Henrik Andersson <h.andersson at nioo.knaw.nl> wrote:

>I am about to write general functions to read the output of simulations
>models.
>
>These model generate output files with different sections which I want
>to analyze plot etc.
>
>Since this will be used many people at the department I wanted to make
>sure that will do this in the best way.
>
>For instance I want to read a snippets of data from a text that look
>like this.
>-------------------------------
>Lots of stuff
>...
>@@Start Values@@
> Column1 Column2 Column3 ...
>Row1 1 2 3 ...
>...
>@@End Values@@
>
>More stuff
>...
>@@Start OtherValues@@
> Column1 Column2 Column3 ...
>Row1 1 2 3 ...
>...
>@@End OtherValues@@
>
>
>I looked in the help files and found grep which operates on character
>strings, do I have to like this then?
>
>1. Read file with readLines("foo.txt")
>2. grep this object for the start and end of each section ->startline &
>stopline
>3. Read the file again with
>read.table("foo.txt",skip=startline,nrows=stoplin-startline)
>
>Or is there a more beautiful way?

I would avoid mixing multiple tables in the same file. I think you'll run into fewer problems if you put each table into a separate file, and generate an index file to list all the tables. Each of the files in your scheme would then become a subdirectory in my scheme.

If the multiplicity of files is a problem, you could use zip or winzip to put them all into a zip file; R can extract a file from one of those using zip.file.extract.

Duncan Murdoch
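A rough sketch of the one-file-per-table layout with an index file might look like the following. All names here (run1, index.txt, read_run, the table contents) are assumptions for illustration, and the files are written from R only so that the example is self-contained; in practice the simulation model would produce them. The zip step is left out; in current versions of R, a connection opened with unz() can read a single file from inside a zip archive.

```r
# Hypothetical layout: one directory per simulation run, one file per table,
# plus index.txt listing the table names.
run_dir <- file.path(tempdir(), "run1")
dir.create(run_dir, showWarnings = FALSE)

# Made-up tables standing in for the simulation output
tables <- list(
  Values      = data.frame(Column1 = c(1, 4), Column2 = c(2, 5),
                           row.names = c("Row1", "Row2")),
  OtherValues = data.frame(Column1 = 7, row.names = "Row1")
)

# Writer side: each table goes to <name>.txt, and the names go to index.txt
for (name in names(tables)) {
  write.table(tables[[name]], file.path(run_dir, paste0(name, ".txt")),
              quote = FALSE)
}
writeLines(names(tables), file.path(run_dir, "index.txt"))

# Reader side: look up the table names in the index, then read each file
read_run <- function(dir) {
  index <- readLines(file.path(dir, "index.txt"))
  out <- lapply(index, function(name) {
    read.table(file.path(dir, paste0(name, ".txt")), header = TRUE)
  })
  names(out) <- index
  out
}

run <- read_run(run_dir)
```

With this layout no marker parsing is needed at all: each file is a plain table that read.table handles directly.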