[this msg was first bounced and then manually approved; don't capitalize "xxx": it looks too much like a word used in a particular kind of spam. Your list maintainer, MM]

I would like to read in a .csv file, ignoring all lines before the header line. I am counting on the header line being the first line that has xxx in it. I don't know how many lines to skip prior to that line; otherwise I would use the skip= feature of read.csv.

This is sort of what I would like to do, although I gather this is not real R since I tried it without success:

    my.table <- read.csv("gawk '/xxx/{++z};z' myfile.csv |")

The gawk program prints out only those lines starting from the first line containing xxx.

I have played around with pipe() but could not get anything to work.

I am using R 1.4.1 on Windows 2000.

Any ideas on how to accomplish this?

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

On Sun, 24 Mar 2002 ggrothendieck at yifan.net wrote:

> I have played around with pipe() but could not get anything
> to work.
>
> I am using R 1.4.1 on Windows 2000.
>
> Any ideas on how to accomplish this?

Windows has very limited support for pipes on GUI executables. A version with pipe should work under Rterm.

However, one could use R to read the file a line at a time with readLines, check each line until you get the header, then pass the connection to read.csv. If the header is to be read by read.csv, push it back.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
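
For readers following along, the readLines-then-push-back suggestion might look roughly like the sketch below. It is only an illustration under some assumptions: the function name read_csv_from_header and its arguments are made up for this example, the pattern is matched as a fixed string rather than a regular expression, and it relies on read.csv accepting an already open text connection.

```r
# Hypothetical helper (not from the thread): read a CSV whose header line is
# the first line containing `pattern`, discarding every line before it.
read_csv_from_header <- function(path, pattern) {
  con <- file(path, open = "r")   # open in text mode so reads are sequential
  on.exit(close(con))

  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) {
      stop("no line containing '", pattern, "' was found in ", path)
    }
    # fixed = TRUE is an assumption: treat the pattern as a literal string
    if (grepl(pattern, line, fixed = TRUE)) break
  }

  # Put the header line back so read.csv sees it as the first line
  pushBack(line, con)
  read.csv(con)
}
```

A call such as read_csv_from_header("myfile.csv", "xxx") would then return the data frame starting at the header. The pipe route would presumably wrap the command in pipe(), as in read.csv(pipe("gawk '/xxx/{++z};z' myfile.csv")), but the shell quoting may need adjusting on Windows.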

Thanks. I checked out readLines, which you mentioned, and using it came up with the following compact solution, which skips all lines prior to the first one containing the string xxx and reads the remaining lines as a csv file:

    read.csv( fn, skip = grep("xxx",readLines(fn))[1]-1 )

On 25 Mar 2002 at 7:38, Prof Brian D Ripley wrote:

> However, one could use R to read the file a line at a time with
> readLines, check each line until you get the header, then pass the
> connection to read.csv. If the header is to be read by read.csv, push it
> back.
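
One caveat with the one-liner: if no line contains xxx, grep returns an empty vector, so skip becomes NA, and the file is also read twice. A slightly more defensive variant is sketched below. The function name read_csv_skip_to is invented for this example, the pattern is matched as a fixed string, and the text= argument of read.csv is assumed to be available, which it is in current R but was not in R 1.4.1.

```r
# Hypothetical variant of the one-liner: read the file once, locate the
# header line, and parse from there.
read_csv_skip_to <- function(fn, pattern = "xxx") {
  lines <- readLines(fn)

  # Index of the first line containing the pattern (NA if there is none)
  hdr <- grep(pattern, lines, fixed = TRUE)[1]
  if (is.na(hdr)) {
    stop("no line containing '", pattern, "' was found in ", fn)
  }

  # Parse the header line and everything after it as CSV
  read.csv(text = lines[hdr:length(lines)])
}
```

For a file with the usual few lines of preamble, read_csv_skip_to(fn) should give the same data frame as the one-liner, while a file without the marker stops with an explicit error.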