On 27 Jun 2008, at 14:30, francogrex wrote:
>
> Hello,
> It's just a strange coincidence that someone posted a question about
> matching very recently. I know there are several match functions in
> the base package (such as match, pmatch, charmatch, gsub, etc.), but I
> can't seem to use them wisely to get what I need.
> Suppose I have the following strings:
> "tets"
> "estt"
> "rtes7"
> "gstes"
> "tes5t"
>
> Is there an R procedure to determine how related each string is to the
> reference string "test", for example to say that "tets" is similar to
> "test" with a probability of 0.9 or something of that sort?

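Have a look at ?agrep.
One could loop over increasing values of max.distance and record the
smallest value at which each string matches the reference; that gives a
rough measure of how closely the two are related.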
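
A minimal sketch of that loop, assuming integer values of max.distance
are what is wanted. The helper name min_agrep_distance is made up for
this illustration. Note that agrep() does approximate substring
matching, so the value found is not a whole-string edit distance.

```r
ref <- "test"
candidates <- c("tets", "estt", "rtes7", "gstes", "tes5t")

# Hypothetical helper: smallest integer max.distance at which agrep()
# reports that `pattern` approximately matches `x` (NA if none is found
# up to `limit`).
min_agrep_distance <- function(pattern, x, limit = nchar(pattern)) {
  # Distance 0: the pattern occurs literally inside x.
  if (grepl(pattern, x, fixed = TRUE)) return(0L)
  for (d in seq_len(limit)) {
    if (length(agrep(pattern, x, max.distance = d)) > 0) return(d)
  }
  NA_integer_
}

# One value per candidate string; smaller means closer to "test".
agrep_dist <- sapply(candidates, min_agrep_distance, pattern = ref)
agrep_dist
```
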
Another way is to calculate the Levenshtein (or Damerau-Levenshtein)
edit distance. A starting point could be:
http://wiki.r-project.org/rwiki/doku.php?id=tips:data-strings:levenshtein
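
As a rough sketch of the edit-distance idea, recent versions of R
provide adist() in the utils package, which computes a generalized
Levenshtein distance. It does not count a transposition as a single
edit, so this is plain Levenshtein, not Damerau. The normalization to a
0-1 score below is only one possible choice, and the result is a
similarity score, not a probability.

```r
ref <- "test"
candidates <- c("tets", "estt", "rtes7", "gstes", "tes5t")

# Edit distance from the reference to each candidate.
# adist() returns a 1 x 5 matrix here; drop() turns it into a vector.
d <- drop(adist(ref, candidates))
names(d) <- candidates

# Assumed normalization: divide by the length of the longer string, so
# that 1 means identical and 0 means nothing in common.
similarity <- 1 - d / pmax(nchar(ref), nchar(candidates))

# Candidates ordered from most to least similar to "test".
round(sort(similarity, decreasing = TRUE), 2)
```
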
--Hans