Dear Simon,
Maybe I don't understand properly, but if you are doing this in R, can't
you just pick the line you want?
Josh
## print your data to the clipboard
cat("Document 1 of 100 \n \n \n Newspaper Name \n \n Day Date",
    file = "clipboard")
## read the data back in, and pass only the 4th line to grep()
grep("pattern", x = readLines("clipboard")[4])
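
If the file holds many documents rather than one, the same idea can be extended: find every "Document N of M" header with grep(), then index relative to those positions. The following is only a sketch under assumptions. The toy vector x stands in for readLines() on the real file, the newspaper names and dates are invented, and the offset of 3 lines matches the example above (use 4 if the real files have one more blank line). The last step is a rough stand-in for what grep -A4 prints, not an exact equivalent.

```r
## toy input standing in for readLines("yourfile.txt"); contents are made up
x <- c("Document 1 of 100", "", "", "The Daily Example", "", "",
       "Monday July 11, 2011", "Some article text",
       "Document 2 of 100", "", "", "Another Gazette", "", "",
       "Tuesday July 12, 2011")

## line numbers of the document headers
hdr <- grep("^Document [0-9]+ of [0-9]+", x)

## assumed layout: the newspaper name sits a fixed number of lines below
offset <- 3
papers <- trimws(x[hdr + offset])

## rough analogue of `grep -A4`: each header plus the 4 lines after it,
## clipped at the end of the file
ctx <- lapply(hdr, function(i) x[i:min(i + 4, length(x))])
```

Here papers holds one name per document, and ctx is a list with one character vector per header, which can be inspected or written out with writeLines(unlist(ctx)).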
On Mon, Jul 11, 2011 at 9:31 AM, Simon Kiss <sjkiss at gmail.com> wrote:
> Dear colleagues,
> I have a series of newspaper articles in a text file, downloaded from a
> text file. They look as follows:
>
> Document 1 of 100
> \n
> \n
> \n
> Newspaper Name
> \n
> \n
> Day Date
>
> I have a series of grep scripts that can extract the date and convert it to
> a date object, but I can't figure out how to grep the newspaper name. There
> is no field ID attached to those lines. The best I can come up with would be
> to have the program grep the four lines following a match for the pattern
> "Document [0-9]". There is an argument to grep in Unix that can do
> this (grep -A4 'pattern' infile > outfile), but I don't know if
> there is an equivalent argument in R.
>
> Any thoughts?
> Yours, Simon Kiss
> *********************************
> Simon J. Kiss, PhD
> Assistant Professor, Wilfrid Laurier University
> 73 George Street
> Brantford, Ontario, Canada
> N3T 2C9
> Cell: +1 905 746 7606
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/