Tom D. Harray
2011-Oct-05 02:52 UTC
[R] fgrep with caret (^) meta-character in system() call
Hi there,
I would like to use my Linux system's fgrep to search for a text pattern
in a file. Calling system() with
system("fgrep \"SearchPattern\" /path/to/the/textFile.txt")
works in general, but I need to match the pattern only at the
beginning of a line.
The corresponding shell command
fgrep "^SearchPattern" /path/to/the/textFile.txt
       |
       |___ here's my problem
does exactly what I want. I tried various combinations of ", \", and \^,
but failed to make system() work.
How can I call the working shell command including the caret
meta-character with system()?
Thanks and regards,
dirk
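
A note on the system() question above: inside a double-quoted R string the
caret needs no escaping (R rejects "\^" as an unrecognized escape), so the
quoting is probably not the real obstacle. fgrep, however, is documented as
matching fixed strings (it behaves like grep -F), so it takes the caret
literally; anchoring at the start of a line needs plain grep. A minimal
sketch, assuming a POSIX shell with grep on the PATH. grep_line_start is a
hypothetical helper name, not something from the original post:

```r
# Hypothetical helper: return the lines of `path` that begin with `prefix`.
# Assumption: a POSIX shell and grep are available (e.g. on Linux).
grep_line_start <- function(prefix, path) {
  # shQuote() single-quotes each argument, so the shell hands "^" and any
  # spaces in the path to grep unchanged. "-e" marks the next argument as
  # the pattern, even if it happens to start with a dash.
  cmd <- paste("grep -e", shQuote(paste0("^", prefix)), shQuote(path))
  # intern = TRUE captures grep's output as a character vector.
  # grep exits with status 1 when nothing matches, which R reports
  # as a warning.
  system(cmd, intern = TRUE)
}
```

Because grep reads the pattern as a regular expression, characters such as
"." or "[" in the prefix would need escaping before this is used on
arbitrary patterns.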
P.S.: Actually I have to search for about 5,000 patterns, stored in an R
list, in a text file with about 30,000,000 lines. Each pattern appears in
one or more lines of the text file. A line has to be extracted only if
a pattern occurs at the beginning of that line.
Example with matching line 1, non-matching line 2, and non-matching line 3
(line 3 contains aaa, but not at the beginning of the line):
SearchPattern = "^aaa"
Text file: aaaooooooooooo
           bbbiiiiiiiiiii
           aacttttttttaaa
Going line by line through the file in R is too slow, and I cannot
program it in C or C++. Hence I use the fgrep command. I would
appreciate it if anyone could suggest a fast alternative that works with
R on Linux and Windows systems.
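
One portable possibility for the task described in the P.S. is to stay in
R but read the file in large chunks and test each chunk with vectorized
operations instead of looping over single lines. The sketch below assumes
the patterns are fixed strings (no regular expressions) and compares the
first n characters of every line against the set of patterns of length n.
extract_prefixed_lines and chunk_size are hypothetical names, and the
speed on a 30,000,000-line file has not been measured here:

```r
# Hypothetical helper: keep the lines of `path` that start with any of the
# fixed strings in `prefixes` (a list or character vector).
extract_prefixed_lines <- function(path, prefixes, chunk_size = 100000L) {
  prefixes <- unique(unlist(prefixes))
  prefixes <- prefixes[nzchar(prefixes)]        # an empty prefix would match every line
  by_len <- split(prefixes, nchar(prefixes))    # group the prefixes by length

  con <- file(path, open = "r")
  on.exit(close(con))

  hits <- list()
  repeat {
    lines <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(lines) == 0L) break
    keep <- logical(length(lines))
    for (len in names(by_len)) {
      n <- as.integer(len)
      # One set lookup per distinct prefix length, not one per prefix.
      keep <- keep | (substr(lines, 1L, n) %in% by_len[[len]])
    }
    hits[[length(hits) + 1L]] <- lines[keep]
  }
  as.character(unlist(hits, use.names = FALSE))
}
```

With the three-line example file from this post and the single prefix
"aaa", only the first line is returned. The work per chunk grows with the
number of distinct prefix lengths rather than with the number of prefixes,
which is the reason for grouping them by length.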
man awk?
I've used awk for similar tasks (if I am reading the post correctly).
Google-Fu should turn up some useful examples.
Also, awk should be on your Linux installation in some form or another.
Regards,
Ken Hutchison
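
To make the awk suggestion concrete, here is one way it might be driven
from R. This is only a sketch under assumptions: awk is on the PATH (so it
covers Linux, not a stock Windows system), the patterns are fixed strings,
and awk_prefix_filter is a hypothetical helper name. The patterns are
written to a temporary file, awk loads them into an array while reading
that first file, and then prints every line of the data file whose leading
characters equal one of the stored patterns:

```r
# Hypothetical helper: filter `path` with awk, keeping lines that start
# with any fixed string in `prefixes`. Assumption: awk is installed.
awk_prefix_filter <- function(path, prefixes) {
  prefixes <- unique(unlist(prefixes))
  prefixes <- prefixes[nzchar(prefixes)]   # drop empty patterns
  pat_file <- tempfile(fileext = ".txt")
  on.exit(unlink(pat_file))
  writeLines(prefixes, pat_file)

  # First file (NR == FNR): remember each pattern and its length.
  # Second file: for each known length n, look up the first n characters.
  prog <- paste(
    "NR == FNR { p[$0]; len[length($0)]; next }",
    "{ for (n in len) if (substr($0, 1, n) in p) { print; break } }"
  )
  # shQuote() protects the awk program and both paths from the shell.
  system2("awk", c(shQuote(prog), shQuote(pat_file), shQuote(path)),
          stdout = TRUE)
}
```

The result is a character vector with one element per matching line, in
file order.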