On Wed, 24 Mar 2004, MMarques Power wrote:
>
> Recently working with strings and data
> I have found a small problem.
>
> Windows XP
> R 1.8.1
>
> Reading data from a "txt file" with readLine.
> finding a specific line with "grep" command, all OK.
> but here comes the problem...
> After finding the correct line(s) i need to find a substring
> inside each string.
> In this case "tabs", which I think are represented by "\t" in
> the grep command
> trying to use grep in each string it only returns 1 ...
That says it is present in character element one. Do read the help page:

Value:
     For 'grep' a vector giving either the indices of the elements of
     'x' that yielded a match or, if 'value' is 'TRUE', the matched
     elements.
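
To make the help text concrete, here is a small sketch. The vector below is hypothetical, modelled on your d5, and it assumes the fields really are tab-separated (the tabs did not survive in your posting). It shows that grep reports which elements match, not where in the string the match occurs.

```r
# Hypothetical data modelled on the poster's d5; tab separators are assumed.
d5 <- c("load0004\tnode0014\t0.05\t0.014583333",
        "load0005\tnode0017\t0.05\t0.014583333",
        "a line without any tabs")

# Indices of the elements of d5 that contain a tab.
idx <- grep("\t", d5)

# With a length-one input the only possible index is 1,
# which is why grep("\t", d5[1]) prints 1.
one <- grep("\t", d5[1])

# value = TRUE returns the matching elements instead of their indices.
hit <- grep("\t", d5, value = TRUE)

print(idx)
print(one)
print(hit)
```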
> Afterwards I tried regexpr command it returns the correct position of the
> substring that I am looking for but it only reports the first one.
> does regexpr only returns the first one ?
Yes.
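
If what you want is the position of every tab, one possible approach (a sketch, not the only way) is to split the string into single characters and ask which of them are tabs. The string x is a hypothetical stand-in for d5[1] with tab separators assumed. Later versions of R also provide gregexpr, which returns all match positions directly; the last lines assume such a version.

```r
# Hypothetical stand-in for d5[1]; tab separators are assumed.
x <- "load0004\tnode0014\t0.05\t0.014583333"

# regexpr gives the position of the first match only.
first <- regexpr("\t", x)

# Split into single characters and find which ones are tabs.
chars <- strsplit(x, "")[[1]]
tabs <- which(chars == "\t")
print(tabs)

# Assumes an R version that has gregexpr: all match positions for x.
all_pos <- gregexpr("\t", x)[[1]]
print(as.integer(all_pos))
```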
> Partial example:
>
> d5 = "load0004 node0014 0.05 0.014583333"
> "load0005 node0017 0.05 0.014583333"
> "load0006 node0019 0.05 0.014583333"
>
>
> >grep("\t",d5[1])
> [1] 1
> >regexpr("\t",d5[1])
> [1] 9
> attr(,"match.length")
> [1] 1
>
> any idea how to make regexpr return the several substrings ?
> or the grep and
> Am I missing anything obvious ?
Telling us what you actually want to do! Would
sapply(strsplit(d5, "\t"), length)
be closer to what you have in mind?
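
A sketch of that suggestion, again on hypothetical data with tab separators assumed: strsplit breaks each line at the tabs, so you get the fields themselves rather than the positions of the separators.

```r
# Hypothetical data modelled on the poster's d5; tab separators are assumed.
d5 <- c("load0004\tnode0014\t0.05\t0.014583333",
        "load0005\tnode0017\t0.05\t0.014583333",
        "load0006\tnode0019\t0.05\t0.014583333")

# One character vector of fields per input line.
fields <- strsplit(d5, "\t")

# Number of tab-separated fields on each line.
n <- sapply(fields, length)
print(n)

# Pick out the second field of every line.
second <- sapply(fields, function(f) f[2])
print(second)
```

If the pieces are what you are after, this avoids working with match positions at all.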
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595