I'm just looking for some direction on the best way to implement this. I'm not sure whether there is a ton of info out there and I'm just searching under the wrong terminology.

I have a MySQL database that contains a table of products. Each product has a field for the "product name", "store", "city", "state", "price". When someone adds a new product record I'd like to validate the record against the existing database and check for duplicates, including spelling differences. If it finds a similar match in "product name" and "store" I'd like it not to duplicate, but to add the record if it's a different city or store (so you could add the same product, but show pricing in different cities or different stores).

Example:

Name: Magnavox 42 inch
Store: Wal-Mart
City: Milwaukee
State: WI

New entry:

Name: Magnavox 42 in
Store: Walmart
City: Milwauke
State: WI

I don't want a duplicate record created since these match. I'm trying to figure out what to do when product names can be abbreviated, store names spelled differently (wal-mart/walmart/wal mart or h.e.b./heb), or city names misspelled.

This sounded like a fuzzy search using something like Ferret, or a "did you mean" spelling plugin such as:
http://antoniocangiano.com/2007/02/08/acts-as-suggest-plugin/
or
http://www.ruby-forum.com/topic/104327

It also sounded like a validation problem, but I didn't find anything under
http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html
that fit. A combination of all three sounds like a bloated solution that would degrade performance.

Am I on the right track or should I be looking into something else to help me figure this out? Thanks.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
On Apr 21, 2:34 am, Jeremy <mbenz_dri...-/E1597aS9LQAvxtiuMwx3w@public.gmane.org> wrote:
> I don't want a duplicate record created since these match. Trying to
> figure out what to do when product names can be abbreviated or store
> names spelled differently (wal-mart/walmart/wal mart or h.e.b./heb) or
> city names misspelled.

I've previously used aspell (a spellchecker) (via the raspell ruby library) to handle a similar sort of problem.

Fred
The algorithm you want is called the Levenshtein distance, and it is the one used in many spell checkers. It is the minimal number of single-character changes needed to get from one string to another (so in your example it would be 2: the "c" and the "h" in "inch").

Since you want to compare two strings directly, a spell checker isn't exactly what you want. Strings with a small distance between them are likely to be variations of each other.

Good UI practice here is a "hint box" (think of what Google does for common searches): after the user has typed an entry, list the nearest existing items (up to a distance of 5 or so) under a "Did you mean ..." label. Humans are generally better than computers at judging whether two things are the same, so this is a good way to combine good computing with good human interaction. You can also use the spelling package to cull out misspellings, as Fred suggested.

This library has the raw algorithm you need to do it very quickly.

http://text.rubyforge.org/

On Apr 21, 2:36 am, Frederick Cheung <frederick.che...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I've previously used aspell (a spellchecker) (via the raspell ruby
> library) to handle a similar sort of problem.
>
> Fred
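
For illustration, here is a minimal sketch of the distance calculation and the "Did you mean" lookup described in the reply above. It is written in plain Ruby so that it runs without any gem. The helper names (`levenshtein`, `normalize`, `near_matches`) and the normalization rule are assumptions made up for this example, not the API of the library linked above, which would probably be the faster choice in a real application.

```ruby
# Levenshtein distance: the minimum number of single-character insertions,
# deletions, or substitutions needed to turn string a into string b.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  # prev[j] = distance between the prefix of a handled so far and b[0, j]
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution (or match)
    end
    prev = curr
  end
  prev.last
end

# Assumed normalization: lowercase and keep only letters and digits, so that
# "Wal-Mart", "wal mart" and "Walmart" compare as the same string.
def normalize(text)
  text.downcase.gsub(/[^a-z0-9]/, "")
end

# Existing values within max_distance of the candidate, nearest first.
# These are the entries a "Did you mean ..." box could offer.
def near_matches(candidate, existing, max_distance = 5)
  key = normalize(candidate)
  scored = existing.map { |name| [name, levenshtein(key, normalize(name))] }
  scored.select { |_, d| d <= max_distance }.sort_by { |_, d| d }.map(&:first)
end

puts levenshtein("Magnavox 42 inch", "Magnavox 42 in")
puts near_matches("Walmart", ["Wal-Mart", "Target", "H.E.B."], 2).inspect
```

Normalizing before measuring the distance means punctuation and case differences (wal-mart/walmart, h.e.b./heb) cost nothing, so the distance threshold only has to absorb real misspellings and abbreviations. The threshold itself is a judgment call and would need tuning against real data.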