Hi everybody.
I'm working with a data frame containing many character vectors in which
each observation is made up of one or more unique values.
Example:
> Licenza[56:58]
[1] BSD License, GNU Library or Lesser General Public License (LGPL)
[2] Qt Public License (QPL)
[3] GNU General Public License (GPL)
66 Levels: ... Zope Public License
As you can see, each observation can have one or more licenses associated
with it.
I want to build a vector with the number of times every element (e.g.
"BSD License") occurs in the vector, by itself or in association with
others (i.e. I want to count the elements containing "BSD License" as
well as those containing "BSD License, GNU Library or Lesser General
Public License (LGPL)", and so on).
I've tried to use a "for" loop as follows:
> for(i in Licenza.elenco) {
+ Licenza.elenco.prova[Licenza.elenco==i] <-
+ length(grep(".*i.*",as.character(Licenza)))}
Here Licenza.elenco is a character vector containing all the unique
values I need to match (e.g. BSD License, Qt Public License (QPL), GNU
General Public License (GPL)).
However, R substitutes the variable as I expect only in the first place
it appears (the index). Inside grep the "i" is not replaced by the
variable's value, so grep matches every string containing the letter "i",
that is, 100% of the vector, except NAs of course.
After running the above code I get:
> Licenza.elenco.prova
[1] 2235 2235 2235
I've tried escaping the variable name, enclosing it in brackets, but
nothing works as I want.
I'm sure I'm doing something wrong, but what?
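Here is a minimal sketch that reproduces the problem on made-up data,
together with what I suspect the fix might be. The toy values below are
hypothetical stand-ins for my real data; the idea is that "i" inside quotes
is just the literal letter, so the variable itself has to be passed to grep,
and fixed = TRUE should keep characters such as "(" and ")" from being read
as regular expression syntax.

```r
# Hypothetical toy data standing in for the real Licenza factor
Licenza <- factor(c(
  "BSD License, GNU Library or Lesser General Public License (LGPL)",
  "Qt Public License (QPL)",
  "GNU General Public License (GPL)",
  NA
))
Licenza.elenco <- c("BSD License",
                    "Qt Public License (QPL)",
                    "GNU General Public License (GPL)")

# Inside quotes, "i" is the literal letter i, not the loop variable,
# so this counts every non-NA string that contains an "i".
conteggio.letterale <- length(grep(".*i.*", as.character(Licenza)))

# Pass the variable itself as the pattern; fixed = TRUE matches it as a
# plain substring, so parentheses are not treated as regex groups.
Licenza.elenco.prova <- sapply(
  Licenza.elenco,
  function(i) length(grep(i, as.character(Licenza), fixed = TRUE))
)
Licenza.elenco.prova
```

Note that this is plain substring matching, so a short name that happens to
be contained in a longer license name would be counted for both.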
Thanks in advance
Alberto Fornasier