I am trying to import a series of text files generated by stimulus presentation software. The problem that I am having is that the number of rows I need to skip is not fixed (depending on subject's pretest behavior) nor is the first row of the data I want always the same (the stimuli were presented in random order). So I need to bring in the .txt file (using readLines?), look for the row containing the text "Begin Main" (see exact row below) and start reading data to a table from that point.

 [13] "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t \tPressed\t(any response)\tC\t25860\t\t\t\t\t"

I would also like it to ignore the row:

[173] "Main Group\t1000\tBreak\tBreak\tpause3\t\t \tPressed\t(any response)\tC\t47610\t\t\t\t\t"

which will always be the same number of rows after the "Begin Main" row.

Thanks,
Kevin Burnham
On May 31, 2010, at 7:51 PM, Kevin Burnham wrote:

> [...] So I need to bring in the .txt file (using readLines?), look for
> the row containing the text "Begin Main" (see exact row below) and
> start reading data to a table from that point.
> [...]
> I would also like it to ignore the row: [...]
> which will always be the same number of rows after the "Begin Main" row.

txt <- "blah
blahe
blah
blah
Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t
\tPressed\t(any response)\tC\t25860\t\t\t\t\t

more blah after blank line
uy
ytre
jhgf
Main Group\t1000\tBreak\tBreak\
tpause3\t\t \tPressed\t(any response)\tC\t47610\t\t\t\t\t
uytr
hgfd"
# ___end setup input______________

bring.in <- readLines(textConnection(txt))
grep("\\tBegin Main", bring.in)
# [1] 5
grep("Main Group\\t1000", bring.in)
# [1]  5 12
length.vec <- grep("Main Group\\t1000", bring.in)
length.vec[2] - length.vec[1]
# [1] 7

# So a vectorized solution would be:
bring.in[grep("\\tBegin Main", bring.in):(
    grep("\\tBegin Main", bring.in) + length.vec[2] - length.vec[1] - 1)]
# [1] "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t"
# [2] "\tPressed\t(any response)\tC\t25860\t\t\t\t\t"
# [3] ""
# [4] "more blah after blank line"
# [5] "uy"
# [6] "ytre"
# [7] "jhgf"

-- 
David Winsemius, MD
West Hartford, CT
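
The question says the "Break" row always sits the same number of rows after the "Begin Main" row, so the offset can be supplied directly rather than measured from a second grep. What follows is only a minimal sketch of that idea. The helper name `main_rows`, its `break_offset` argument, and the demo rows are hypothetical and not from the thread. With the row numbers shown in the question (13 and 173) the offset would be 160 for that particular file, assuming those are readLines() indices.

```r
# Hypothetical helper (not from the thread): keep everything from the
# "Begin Main" row onward and drop the one "Break" row that sits
# `break_offset` rows after it.
main_rows <- function(lines, break_offset) {
  # fixed = TRUE: "\t" in the R string is a literal tab, matched as-is
  start <- grep("\tBegin Main\t", lines, fixed = TRUE)
  if (length(start) == 0) stop("no 'Begin Main' row found")
  kept <- lines[start[1]:length(lines)]
  # Within `kept`, the Break row is at position break_offset + 1
  if (break_offset + 1 <= length(kept)) {
    kept <- kept[-(break_offset + 1)]
  }
  kept
}

# Made-up rows, purely for illustration
demo <- c("pretest a", "pretest b",
          "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t",
          "trial 1", "trial 2",
          "Main Group\t1000\tBreak\tBreak\tpause3\t\t",
          "trial 3")
result <- main_rows(demo, break_offset = 3)
print(result)
```

Unlike the slice above, which stops just before the "Break" row, this keeps the rows that follow it as well. Whether those rows are wanted is an assumption about the data.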
try this:

input <- readLines("yourfile.txt")
# determine start
start <- grep("\tBegin Main\t", input, fixed = TRUE)  # all matching lines
if (length(start) > 0 && start[1] > 1) {
    input <- tail(input, -(start[1] - 1))  # delete heading lines (first match if many)
}
# find lines you want to delete
breaks <- grep("\tBreak\t", input, fixed = TRUE)
if (length(breaks) > 0) {
    input <- input[-breaks]
}
# now read in your data; the fields are tab-separated and contain spaces,
# and the first kept line is the "Begin Main" row, not a row of column names
con <- textConnection(input)
real_input <- read.table(con, sep = "\t", header = FALSE, quote = "", fill = TRUE)
close(con)

On Mon, May 31, 2010 at 7:51 PM, Kevin Burnham <kburnham at gmail.com> wrote:

> [...] So I need to bring in the .txt file (using readLines?), look for
> the row containing the text "Begin Main" (see exact row below) and
> start reading data to a table from that point.
> [...]
> I would also like it to ignore the row: [...]
> which will always be the same number of rows after the "Begin Main" row.

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
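
Since the question mentions a series of files, the steps above could be wrapped in a function and applied to each file in turn. This is a sketch under stated assumptions, not a tested solution for the real data. The function name `read_main`, the temporary file, and its rows are all invented for illustration, and it assumes the file has no header row and uses tabs as separators.

```r
# Hypothetical wrapper (name and demo data are illustrative only).
read_main <- function(path) {
  lines <- readLines(path, warn = FALSE)
  start <- grep("\tBegin Main\t", lines, fixed = TRUE)
  if (length(start) == 0) stop("no 'Begin Main' row in ", path)
  lines <- lines[start[1]:length(lines)]                    # drop pretest rows
  lines <- lines[!grepl("\tBreak\t", lines, fixed = TRUE)]  # drop Break rows
  con <- textConnection(lines)
  on.exit(close(con))  # close only this connection, not all open ones
  read.table(con, sep = "\t", header = FALSE, quote = "", fill = TRUE,
             comment.char = "", stringsAsFactors = FALSE)
}

# Demonstration on a throwaway file with made-up rows
tmp <- tempfile(fileext = ".txt")
writeLines(c("pretest\tignored",
             "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\tC\t25860",
             "Main Group\t1000\tword1\tword1\tword1\tC\t30000",
             "Main Group\t1000\tBreak\tBreak\tpause3\tC\t47610",
             "Main Group\t1000\tword2\tword2\tword2\tC\t50000"), tmp)
dat <- read_main(tmp)
print(dat)

# For a whole directory of files, something like:
# all_data <- lapply(list.files(pattern = "\\.txt$"), read_main)
```

Columns come back with the default names V1, V2, and so on, because no header row is assumed.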