I am currently working on a game scoring system. The system stores team names with scores attached. Scores will be entered by copying and pasting a CSV sheet into an input field, and I am working on the parser for this.

The problem I run into is finding the team name in the database when the name in the CSV sheet does not match exactly. Keep in mind that referees register teams verbally. Say the team 'Skullz n Bonez' signs up for a match. The referee might write down 'Skulls and Bones', which will not match when it is later entered in the parse box.

Obviously, for any human 'Skullz n Bonez' matches 'Skulls and Bones' better than any other entry in the team name table. How can I bring this insight to my Ruby on Rails application?

I was already thinking about replacing all non-alpha characters with % signs and then doing an SQL LIKE operation. This would match 'n', '&' and 'and'. Yet this would not solve the 'Skullz'-'Skulls' mismatch, for example.

I know that Word, for example, uses an algorithm to check which word in its dictionary is closest to a misspelled word. I once wrote such an algorithm in C++, which would work very well in this situation, though I fear that without pointers and manual memory management these recursive operations would be very performance-heavy.

--
Posted via http://www.ruby-forum.com/.

You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
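To make the "closest word" idea above concrete, here is a minimal sketch in plain Ruby. It assumes Levenshtein edit distance as the measure of closeness, which is only one common choice and not necessarily what Word or the C++ code mentioned above uses. The names `levenshtein` and `closest_team` and the sample team list are hypothetical. The distance is computed iteratively, row by row, so no recursion or manual memory management is involved.

```ruby
# Levenshtein edit distance between two strings, computed iteratively.
# Only the previous row of the distance table is kept in memory.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

# Hypothetical helper: returns the stored name with the smallest
# case-insensitive edit distance to the name read from the CSV.
def closest_team(input, team_names)
  normalized = input.downcase
  team_names.min_by { |name| levenshtein(normalized, name.downcase) }
end

# Sample data, made up for illustration.
teams = ["Skullz n Bonez", "Red Dragons", "Bone Crushers"]
puts closest_team("Skulls and Bones", teams)
```

Comparing against every row is linear in the number of teams, which is probably fine for a team table of modest size. A real parser would likely also want a maximum distance above which no match is accepted.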
Chris Dekker wrote:
> Currently I am working on a game scoring system. This system has team
> names with scores attached. These scores are going to be entered by copy
> and pasting a CSV sheet in an input field. I am currently working on the
> parser for this.
>
> The problem I run into is with finding the team name in the database
> when the CSV sheet does not entirely match. You must imagine that
> referees verbally register teams. Let's say the team 'Skullz n Bonez'
> signs up for a match. The referee might put down 'Skulls and Bones'
> which will not match when it is entered later on in the parsebox.
>
> Obviously for any human 'Skullz n Bonez' matches 'Skulls and Bones'
> better than any other entry in the team name table. How can I bring this
> insight to my Ruby on Rails application?
>
> I already was thinking about replacing all non-alpha characters with %
> signs and then doing an SQL-LIKE operation. This would match 'n', '&',
> 'and'. Yet this would not solve the 'Skullz'-'Skulls' mismatch for
> example.
>
> I know Word for example uses an algorithm to check which word in its
> dictionary is closest to the misspelled word. I have written such an
> algorithm once in C++ which would work very well in this situation,
> though I fear that without pointer and manual memory management these
> recursive operations would be very performance heavy.
>

There is an algorithm used in genealogy for encoding names so that they still match when the spelling varies a bit. It is called Soundex. It is a simple algorithm that groups many of the consonants and assigns each group a digit. You end up with a letter plus three digits. It is most commonly used for indexing census records. A definition of the algorithm is at http://www.genealogy.com/00000060.html
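The Soundex encoding described above can be sketched in a few lines of Ruby. This is a minimal illustration of the commonly documented American Soundex rules (first letter kept, consonant groups mapped to digits, adjacent duplicates collapsed, H and W ignored, result padded to a letter plus three digits). It is not taken from the linked page, and the names `SOUNDEX_CODES` and `soundex` are my own. Feeding it the whole team name rather than a single word is an assumption made for this thread's use case.

```ruby
# Consonant groups and the digit assigned to each group.
SOUNDEX_CODES = {
  "B" => "1", "F" => "1", "P" => "1", "V" => "1",
  "C" => "2", "G" => "2", "J" => "2", "K" => "2",
  "Q" => "2", "S" => "2", "X" => "2", "Z" => "2",
  "D" => "3", "T" => "3",
  "L" => "4",
  "M" => "5", "N" => "5",
  "R" => "6"
}.freeze

# Returns a letter plus three digits, or "" if the input has no letters.
def soundex(name)
  letters = name.upcase.gsub(/[^A-Z]/, "")
  return "" if letters.empty?

  code = letters[0].dup
  previous = SOUNDEX_CODES[letters[0]]  # nil when the first letter is a vowel

  letters[1..-1].each_char do |ch|
    next if ch == "H" || ch == "W"      # H and W do not break a run of equal codes
    digit = SOUNDEX_CODES[ch]           # nil for vowels and Y
    code << digit if digit && digit != previous
    previous = digit                    # a vowel resets the run
  end

  code[0, 4].ljust(4, "0")
end

puts soundex("Skullz n Bonez")
puts soundex("Skulls and Bones")
```

Because only the first four characters survive, long names that share their first few consonant sounds will collide. Encoding each word separately is a possible refinement, though 'n' and 'and' would then produce different codes.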
Rodrigo Fuentealba
2008-Sep-02 01:59 UTC
Re: Finding: Want closest match, not exact (complex LIKE)
2008/9/1 Norm <normscherer-ihVZJaRskl1bRRN4PJnoQQ@public.gmane.org>:
>
> Chris Dekker wrote:
>>
>> Obviously for any human 'Skullz n Bonez' matches 'Skulls and Bones'
>> better than any other entry in the team name table. How can I bring this
>> insight to my Ruby on Rails application?
>
> There is an algorithm used in genealogy for encoding names so as to get
> matches when the spelling varies a bit. It is called soundex.

MySQL already has this function. Not sure if PostgreSQL does, but I think it does.

Cheers!

--
Rodrigo Fuentealba
Concepción, Chile
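As a rough illustration of letting the database do the comparison: MySQL provides a SOUNDEX() string function, and PostgreSQL ships a soundex() function in its optional fuzzystrmatch module. The sketch below only builds a conditions array and does not touch a database. The helper name `soundex_conditions`, the `name` column and the `Team` model are hypothetical. Truncating to four characters is an assumption on my part, made because MySQL's SOUNDEX() can return strings longer than the standard letter plus three digits.

```ruby
# Hypothetical helper: builds a conditions array that compares the first four
# characters of the database's SOUNDEX() value for a column and for the input.
def soundex_conditions(column, value)
  # Only allow plain column names, since the name is interpolated into the SQL.
  unless column =~ /\A[a-z_]+\z/
    raise ArgumentError, "unexpected column name: #{column}"
  end

  ["SUBSTRING(SOUNDEX(#{column}), 1, 4) = SUBSTRING(SOUNDEX(?), 1, 4)", value]
end

conditions = soundex_conditions("name", "Skulls and Bones")
puts conditions.inspect

# Assumed usage with a hypothetical Team model:
#   Team.where(conditions)
```

The value is passed as a bind parameter, so the pasted CSV text is never interpolated into the SQL string itself.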