I am trying to import a series of text files generated by stimulus presentation software. The problem is that the number of rows I need to skip is not fixed (it depends on the subject's pretest behavior), nor is the first row of the data I want always the same (the stimuli were presented in random order). So I need to bring in the .txt file (using readLines?), look for the row containing the text "Begin Main" (see exact row below), and start reading data into a table from that point.

[13] "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t \tPressed\t(any response)\tC\t25860\t\t\t\t\t"

I would also like it to ignore the row:

[173] "Main Group\t1000\tBreak\tBreak\tpause3\t\t \tPressed\t(any response)\tC\t47610\t\t\t\t\t"

which will always be the same number of rows after the "Begin Main" row.

Thanks,
Kevin Burnham
On May 31, 2010, at 7:51 PM, Kevin Burnham wrote:
> [...] So I need to bring in the .txt file (using readLines?), look for
> the row containing the text "Begin Main" (see exact row below) and
> start reading data to a table from that point.
> [...]
> I would also like it to ignore the row: [...] which will always be
> the same number of rows after the "Begin Main" row.

txt <- "blah
blahe
blah
blah
Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t 
\tPressed\t(any response)\tC\t25860\t\t\t\t\t

more blah after blank line
uy
ytre
jhgf
Main Group\t1000\tBreak\tBreak\tpause3\t\t \tPressed\t(any response)\tC\t47610\t\t\t\t\t
uytr
hgfd"
# ___end setup input______________

> bring.in <- readLines(textConnection(txt))
> grep("\\tBegin Main", bring.in)
[1] 5
> grep("Main Group\\t1000", bring.in)
[1]  5 12
> length.vec <- grep("Main Group\\t1000", bring.in)
> length.vec[2] - length.vec[1]
[1] 7
> # So a vectorized solution would be:
> bring.in[grep("\\tBegin Main", bring.in):(
+   grep("\\tBegin Main", bring.in) + length.vec[2] - length.vec[1] - 1)]
[1] "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t "
[2] "\tPressed\t(any response)\tC\t25860\t\t\t\t\t"
[3] ""
[4] "more blah after blank line"
[5] "uy"
[6] "ytre"
[7] "jhgf"

--
David Winsemius, MD
West Hartford, CT
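The index arithmetic above could be wrapped in a small helper. What follows is only a sketch and not part of the original reply: the function name `main_block` and the sample input are made up for illustration, and it assumes that "Begin Main" and "Break" each appear as a tab-delimited field and that the first "Break" row after "Begin Main" marks where to stop.

```r
# Hypothetical helper (not from the original reply): return the lines from the
# "Begin Main" row up to, but not including, the first "Break" row after it.
main_block <- function(lines) {
  begin <- grep("\tBegin Main\t", lines, fixed = TRUE)[1]
  if (is.na(begin)) stop("no 'Begin Main' row found")
  brk <- grep("\tBreak\t", lines, fixed = TRUE)
  brk <- brk[brk > begin][1]
  # If there is no "Break" row, keep everything to the end of the input.
  end <- if (is.na(brk)) length(lines) else brk - 1
  lines[begin:end]
}

# Small made-up input, for illustration only.
demo <- c("pretest junk",
          "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\tC\t25860",
          "Main Group\t1000\tstimA\tstimA\tstimA\tC\t30100",
          "Main Group\t1000\tBreak\tBreak\tpause3\tC\t47610",
          "Main Group\t1000\tstimB\tstimB\tstimB\tC\t50200")
block <- main_block(demo)
print(length(block))
```

Like the indexing above, this stops just before the "Break" row; if the rows after the break are wanted too, the "Break" row would have to be dropped by index instead of used as an end point.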
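Try this:
input <- readLines("yourfile.txt")
# determine start
start <- grep("\tBegin Main\t", input)[1] # first line if many
if (is.na(start)) stop("no 'Begin Main' line found") # grep found no match
if (start > 1){
input <- tail(input, -(start - 1)) # delete heading lines
}
# find lines you want to delete
breaks <- grep("\tBreak\t", input)
if (length(breaks) > 0){
input <- input[-breaks]
}
# now read in your data; the fields are tab-separated and contain spaces,
# so sep must be set explicitly. Note that header=TRUE uses the
# "Begin Main" row as the column names.
con <- textConnection(input)
real_input <- read.table(con, header=TRUE, sep="\t")
close(con) # close only this connection rather than all open ones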
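Since the question is about a series of files, the same steps could be put into a function and applied to each file in turn. This is a sketch under stated assumptions, not something from the original post: `read_main` and the directory name `"data"` are hypothetical, the fields are taken to be tab-separated, and the "Begin Main" row is kept as an ordinary data row (hence `header = FALSE`).

```r
# Hypothetical wrapper (not from the original post): apply the steps to one file.
read_main <- function(path) {
  input <- readLines(path)
  start <- grep("\tBegin Main\t", input, fixed = TRUE)[1]
  if (is.na(start)) stop("no 'Begin Main' row in ", path)
  # Keep the marker row and everything after it.
  input <- input[start:length(input)]
  # Drop every row that has "Break" as a tab-delimited field.
  input <- input[!grepl("\tBreak\t", input, fixed = TRUE)]
  # Assumption: no header row; the marker row is read as the first data row.
  read.table(text = input, sep = "\t", header = FALSE,
             quote = "", stringsAsFactors = FALSE)
}

# Usage sketch; "data" is a placeholder directory name.
# files  <- list.files("data", pattern = "\\.txt$", full.names = TRUE)
# tables <- lapply(files, read_main)
```

Stopping with an error when the marker is missing is a design choice here; returning `NULL` instead would let `lapply` continue past files that lack a "Begin Main" row.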
On Mon, May 31, 2010 at 7:51 PM, Kevin Burnham <kburnham at gmail.com> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?