Incidentally, if you're actually aiming to match different forms of a name (Peter vs Pete, Ann vs Anne vs Annette) then you might find the synonym feature a better option than wildcarding.

You'd need to give it a list of names to treat as synonyms, but it should have many fewer false positives, and it can also handle cases where one form isn't just a substring of another (e.g. Robert vs Rob vs Bob vs Bobby) or where the forms look entirely different (e.g. Terence vs Terrence vs Spike, or Margaret vs Peggy vs Daisy).

Cheers,
Olly
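To make the false-positive point concrete, here is a minimal sketch in plain Python. It does not use Xapian's API: the name list, the `SYNONYM_GROUPS` table and both helper functions are hypothetical, made up purely for illustration. It compares a trailing wildcard with a lookup in a hand-maintained synonym list.

```python
# Hypothetical data: names as they might appear as terms in an index.
INDEXED_NAMES = ["rob", "robert", "bob", "bobby", "robin", "roberta", "robyn"]

# Hypothetical hand-maintained synonym groups (illustrative only).
SYNONYM_GROUPS = [
    {"robert", "rob", "bob", "bobby"},
    {"margaret", "peggy", "daisy"},
]


def wildcard_match(prefix, names):
    """Return every name starting with the prefix, like a trailing wildcard."""
    return [name for name in names if name.startswith(prefix)]


def synonym_match(query, names, groups):
    """Return names equal to the query or listed in the same synonym group."""
    wanted = {query}
    for group in groups:
        if query in group:
            wanted |= group
    return [name for name in names if name in wanted]


# The wildcard picks up unrelated names and misses Bob and Bobby.
print(wildcard_match("rob", INDEXED_NAMES))
# The synonym lookup returns only the forms listed for Robert.
print(synonym_match("rob", INDEXED_NAMES, SYNONYM_GROUPS))
```

In this toy data the wildcard `rob*` matches Robin, Roberta and Robyn but cannot reach Bob or Bobby, while the synonym lookup returns exactly the listed forms. How good the real results are depends entirely on the quality of the synonym list.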
Olly Betts <olly at survex.com> writes:

> Incidentally, if you're actually aiming to match different forms of a
> name (Peter vs Pete, Ann vs Anne vs Annette) then you might find the
> synonym feature a better option than wildcarding.

What about a custom stemmer?

d
On Thu, 19 Sep 2019 at 21:45, Olly Betts <olly at survex.com> wrote:

> Incidentally, if you're actually aiming to match different forms of a
> name (Peter vs Pete, Ann vs Anne vs Annette) then you might find the
> synonym feature a better option than wildcarding.
>
> You'd need to give it a list of names to treat as synonyms, but it
> should have many fewer false positives, and can also handle cases
> which aren't just a substring - e.g. Robert vs Rob vs Bob vs Bobby, or
> look entirely different: e.g. Terence vs Terrence vs Spike or
> Margaret vs Peggy vs Daisy.

This is exactly what I want to do - pending finding a suitable dataset that's free to use.

Peter
On Thu, Sep 19, 2019 at 08:23:31PM -0300, David Bremner wrote:

> Olly Betts <olly at survex.com> writes:
>
> > Incidentally, if you're actually aiming to match different forms of a
> > name (Peter vs Pete, Ann vs Anne vs Annette) then you might find the
> > synonym feature a better option than wildcarding.
>
> What about a custom stemmer?

That would work too, and probably makes for faster queries.

But it achieves that by doing some of the work at index time, and a consequential drawback is that with a custom stemmer you'd need to perform a full reindex for changes to name mappings to properly take effect, whereas with synonyms you can add or remove synonyms at any time.

Cheers,
Olly
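The index-time versus query-time trade-off can be sketched in plain Python. This is a toy model under stated assumptions, not Xapian code: the documents, `name_map`, `synonyms` and every function below are hypothetical names invented for the example. The "stemmer" rewrites terms while indexing, whereas the synonym table is only consulted when a query is run.

```python
from collections import defaultdict


def build_index(docs, normalise):
    """Map each normalised term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[normalise(word)].add(doc_id)
    return index


def search(index, terms):
    """Return the sorted ids of documents containing any of the terms."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return sorted(hits)


# Hypothetical documents, for illustration only.
DOCS = {1: "Robert Smith", 2: "Bob Jones", 3: "Bobby Brown"}

# Approach 1: a "custom stemmer" whose mapping is baked into the index.
name_map = {"bob": "robert"}


def stem(word):
    return name_map.get(word, word)


stemmed_index = build_index(DOCS, stem)

# Approach 2: synonyms. The index holds raw terms; expansion is per query.
raw_index = build_index(DOCS, lambda word: word)
synonyms = {"robert": {"bob"}}


def expand(name):
    return {name} | synonyms.get(name, set())


# Later we decide that "bobby" should also count as a form of "robert".
name_map["bobby"] = "robert"
synonyms["robert"].add("bobby")

# The stemmed index was built with the old mapping, so document 3 is missed.
stale_stemmed = search(stemmed_index, [stem("robert")])
# The synonym change applies to the very next query, with no reindex.
live_synonyms = search(raw_index, expand("robert"))
# Only a full rebuild brings the stemmed index up to date.
stemmed_index = build_index(DOCS, stem)
fresh_stemmed = search(stemmed_index, [stem("robert")])

print(stale_stemmed, live_synonyms, fresh_stemmed)
```

In this sketch the stemmed search looks up a single term, whereas the synonym search looks up one term per listed form, which is a rough analogue of why the stemmer approach probably gives faster queries.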