thr3ads.net - R help - [R] Variable substitution in grep pattern [Jan 2004]

Alberto Fornasier

2004-Jan-29 19:07 UTC

[R] Variable substitution in grep pattern

Hi everybody.
I'm working with a data frame with many character vectors in which each
observation is made of one or more unique values.
Example:
> Licenza[56:58]
[1] BSD License, GNU Library or Lesser General Public License (LGPL)
[2] Qt Public License (QPL)
[3] GNU General Public License (GPL)
66 Levels:  ... Zope Public License

As you can see, each observation can have one or more licenses associated
with it.
I want to build a vector with the number of times every element (e.g.
"BSD License") occurs in the vector, by itself or in association with
others (i.e. I want to count the elements containing "BSD License" as
well as those containing "BSD License, GNU Library or Lesser General
Public License (LGPL)", and so on).

I've tried to use a "for" loop as follows:
> for(i in Licenza.elenco) {
+ Licenza.elenco.prova[Licenza.elenco==i] <-
+   length(grep(".*i.*",as.character(Licenza)))}

In which Licenza.elenco is a character vector containing all unique
values I need to match (e.g. BSD License, Qt Public License (QPL), GNU
General Public License (GPL)).
However, R handles only the first variable substitution (the index) as I
expect; grep matches all strings containing the letter "i", that is,
100% of the vector, except NAs of course.
After running the above code I get:
> Licenza.elenco.prova
[1] 2235 2235 2235

I've tried escaping the variable name, enclosing it in brackets, but
nothing works as I want.
I'm sure I'm doing something wrong, but what?

Thanks in advance

Alberto Fornasier

Thomas Lumley

2004-Jan-30 02:55 UTC

On Thu, 29 Jan 2004, Alberto Fornasier wrote:
>
> I've tried to use a "for" loop as follows:
>
> > for(i in Licenza.elenco) {
> + Licenza.elenco.prova[Licenza.elenco==i] <-
>   length(grep(".*i.*",as.character(Licenza)))}
>
> In which Licenza.elenco is a character vector containing all unique
> values I need to match (e.g. BSD License, Qt Public License (QPL), GNU
> General Public License (GPL)).
> However R handles as I expect only the first variable substitution (the
> index), but grep matches all strings containing the letter "i", that is
> 100% of the vector, except NAs of course.

You can't do that.  If you could, how would you search for all strings
containing the letter "i"?
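
To illustrate the point, here is a small sketch on made-up data (the vector
x below is hypothetical, not the poster's Licenza). R does not substitute
variables inside string literals, so the i in ".*i.*" is only the letter i:

```r
# Hypothetical sample data, not the original Licenza factor
x <- c("BSD License", "Qt Public License (QPL)", "Zope", NA)
i <- "BSD License"

# The "i" inside the quotes is the literal letter i, not the variable i,
# so every non-NA string that contains an "i" is matched
literal_hits <- grep(".*i.*", x)

# Building the pattern from the variable's value instead
pattern <- paste(".*", i, ".*", sep = "")
variable_hits <- grep(pattern, x)
```

Here literal_hits picks up both strings that contain an "i", while
variable_hits picks up only the element containing "BSD License".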

You need to use something like paste() to construct the pattern:

length(grep(paste(".*",i,".*",sep=""),as.character(Licenza)))}

	-thomas
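
One caveat worth checking with the paste() approach: several of the license
names contain parentheses, e.g. "Qt Public License (QPL)", and in a regular
expression parentheses group rather than match literally, so a pasted
pattern would probably miss those entries. The following is only a sketch
under stated assumptions: the data are made up, the object names are reused
from the question, and fixed = TRUE is swapped in so that each name is
matched as a literal substring. The ".*" padding is dropped because grep
already matches anywhere in the string.

```r
# Toy stand-ins for the poster's objects (values are made up for illustration)
Licenza <- factor(c(
  "BSD License, GNU Library or Lesser General Public License (LGPL)",
  "Qt Public License (QPL)",
  "GNU General Public License (GPL)",
  "BSD License",
  NA
))
Licenza.elenco <- c("BSD License", "Qt Public License (QPL)",
                    "GNU General Public License (GPL)")

# Count the observations that contain each license name.
# fixed = TRUE treats the name as a literal substring, so "(" and ")"
# are not interpreted as regex grouping.
Licenza.elenco.prova <- sapply(
  Licenza.elenco,
  function(nome) length(grep(nome, as.character(Licenza), fixed = TRUE))
)

# For comparison: the same names pasted into regular expressions
regex.counts <- sapply(
  Licenza.elenco,
  function(nome) length(grep(paste(".*", nome, ".*", sep = ""),
                             as.character(Licenza)))
)
```

On this toy data the fixed = TRUE counts are 2, 1, 1, while the regex counts
are 2, 0, 0, because the parenthesised names no longer match themselves.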
