Daniel Jorge
2009-May-25  15:26 UTC
Product Search Engine Design with Sphinx and Faceted Search
Hi, this is my first question on this forum! Please take some time to read this message; it is rather long (sorry).

I'm building a search engine (crawler) that indexes product data from more than 500 Brazilian online stores. That part is easy: crawling, extracting information, etc.

I'm running into an application design problem. Different types of products have different attributes. For instance, books have :Publisher, :Edition, :Author, etc.; digital cameras have :Brand, :Megapixel, etc.; and so on. I REFUSE TO CREATE ONE MODEL FOR EVERY TYPE OF PRODUCT.

The crawler automatically discovers the types of products and the attributes for each type. What I was thinking is to have only one model, Product. Please see below the design I want to have (if you have any suggestions, please tell me).

    class Category < ActiveRecord::Base
    end

    class Product < ActiveRecord::Base
      belongs_to :category
    end

    class AttributeType < ActiveRecord::Base
      belongs_to :category
    end

    class Attribute < ActiveRecord::Base
      belongs_to :product
      belongs_to :attribute_type
    end

The Category model represents the type of product (Books, Digital Camera), and each category has its own set of attribute types. My application is just like http://www.pricejunkie.com but aimed at Brazilian customers. The customer will be presented with a search field (Sphinx with Thinking Sphinx or UltraSphinx) and a list of categories.

What exactly is my problem? Thinking Sphinx FACETS (filters). In this application, each facet is an attribute type. Please see http://www.pricejunkie.com, click on the Book category, and you will see what I mean.

The problem is that facets in Thinking Sphinx are defined per model field. If I had a Book model with the field :author, I could just do this with Thinking Sphinx on the Book model:

    define_index do
      indexes author, :facet => true
    end

I need my system to have dynamic facets per product category; otherwise I will have to create more than 300 different product types...

I hope I made myself clear. PLEASE SOMEONE HELP.

Thanks
--
Posted via http://www.ruby-forum.com/.
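For readers who want to see the single-Product design in action, here is a minimal sketch in plain Ruby. It uses Structs as stand-ins for the four ActiveRecord models, so no database or Rails is needed; the helper name attributes_for and all sample data are assumptions made for illustration, not part of the original design.

```ruby
# Plain-Ruby stand-ins for the four models (no ActiveRecord involved).
Category      = Struct.new(:name)
AttributeType = Struct.new(:category, :name)
Product       = Struct.new(:category, :title)
Attribute     = Struct.new(:product, :attribute_type, :value)

# Hypothetical helper: collect one product's attributes as a
# { "attribute type name" => value } hash.
def attributes_for(product, attributes)
  attributes
    .select { |attribute| attribute.product.equal?(product) }
    .to_h { |attribute| [attribute.attribute_type.name, attribute.value] }
end

# Hypothetical sample data: two categories with different attribute types.
books   = Category.new("Books")
cameras = Category.new("Digital Cameras")

author    = AttributeType.new(books, "Author")
publisher = AttributeType.new(books, "Publisher")
megapixel = AttributeType.new(cameras, "Megapixel")

book   = Product.new(books, "Example Book")
camera = Product.new(cameras, "Example Camera")

attributes = [
  Attribute.new(book, author, "Author 1"),
  Attribute.new(book, publisher, "Publisher 1"),
  Attribute.new(camera, megapixel, "12")
]

puts attributes_for(book, attributes).inspect
puts attributes_for(camera, attributes).inspect
```

The point of the sketch is only that one Product type can carry a different attribute set per category, because the attribute names live in data (AttributeType rows) rather than in class definitions.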
Colin Law
2009-May-25  15:54 UTC
Re: Product Search Engine Design with Sphinx and Faceted Search
I think there may be something wrong with your model. Firstly, I think you need:

    Category
      has_many :products
      has_many :attribute_types

    Product
      belongs_to :category
      has_many :attributes

    AttributeType
      belongs_to :category
      has_many :attributes

    Attribute
      belongs_to :product
      belongs_to :attribute_type

Is there a problem with the above? If one has a product, then product.category.attribute_types will give a collection of the attribute types relevant to that product. One could also use product.attributes to get a collection of attributes, and since each attribute has an associated attribute type, one can also reach the attribute types for a product by this second route. I am no database expert, so I may be wrong, but I suspect that it is not a good idea to have two separate relationship routes between models like this.

My other problem is that I am not sure what you are trying to achieve. Can you explain in a couple of sentences what data you wish to extract when the user clicks on the Book category, for example?

Colin
Daniel Jorge
2009-May-25  16:32 UTC
Re: Product Search Engine Design with Sphinx and Faceted Search
Hi, thank you for your answer.

The models you wrote are OK; I just simplified mine for illustration purposes.

My problem is with search facets (filtering) using Thinking Sphinx. Suppose that I had a Book model with the fields author, publisher and year. When a customer selects the Book category, he will be presented with a list of books, and on the left side he will have the "facets", like this:

    authors
     -author 1 (203)
     -author 2 (125)
     -author 3 (99)
     -author 4 (38)
     ...

    publishers
     -publisher 1 (199)
     -publisher 2 (21)
     -publisher 3 (408)
     -publisher 4 (134)
     ...

    years
     -2009 (109)
     -2008 (33)
     -2007 (12)
     -2006 (500)
     ...

This would be very easy if I had a Book model. Thinking Sphinx requires a facet to be defined like this on the Book model:

    define_index do
      indexes author, :facet => true
      indexes publisher, :facet => true
      indexes year, :facet => true
    end

THE PROBLEM IS THAT I DO NOT HAVE THIS MODEL AND THESE ATTRIBUTES DEFINED. I don't have them defined because I would have to know all the product types and all the possible attributes for each product type in advance, and create a different model for each product type.

-------------------
WHAT I'M LOOKING FOR
-------------------
I'M LOOKING FOR SOME KIND OF METAPROGRAMMING THAT ENABLES ME TO FIND THE ATTRIBUTE TYPES FOR A CATEGORY OF PRODUCTS AND BUILD THE FACETS AT RUNTIME. SOMETHING LIKE THIS IN MY PRODUCT MODEL:

    define_index do
      # pseudocode: find whatever attribute types are relevant for
      # this category and make each of them :facet => true
    end

THANKS!!
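To make the desired output concrete, the following is a rough sketch of how per-category facet counts could be derived from attribute rows in plain Ruby. It does not use Thinking Sphinx or Sphinx at all; the row layout ([category, attribute type, value]) and the names facet_counts and print_facets are assumptions chosen for illustration. In a real search engine the counts would normally come from the search backend rather than from Ruby code.

```ruby
# Count values per attribute type for one category.
# Each row stands for one Attribute record: [category, attribute type, value].
def facet_counts(rows, category)
  counts = Hash.new { |hash, type| hash[type] = Hash.new(0) }
  rows.each do |row_category, type, value|
    next unless row_category == category
    counts[type][value] += 1
  end
  counts
end

# Print the facets in the sidebar layout described above,
# most frequent value first within each attribute type.
def print_facets(counts)
  counts.each do |type, values|
    puts type
    values.sort_by { |_value, count| -count }.each do |value, count|
      puts " -#{value} (#{count})"
    end
  end
end

# Hypothetical sample rows; a crawler would supply the real ones.
rows = [
  ["Books", "authors", "author 1"],
  ["Books", "authors", "author 1"],
  ["Books", "authors", "author 2"],
  ["Books", "years", "2009"],
  ["Digital Cameras", "brands", "brand 1"]
]

print_facets(facet_counts(rows, "Books"))
```

Nothing in this sketch needs to know the attribute types ahead of time: the facet names are whatever appears in the rows for the selected category.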
Colin Law
2009-May-25  16:42 UTC
Re: Product Search Engine Design with Sphinx and Faceted Search
I understand. Sorry, I do not know enough about the inner workings of Sphinx to be able to help. Can anyone else help?

Colin
Michael Schuerig
2009-May-25  18:09 UTC
Re: Product Search Engine Design with Sphinx and Faceted Search
On Monday 25 May 2009, Daniel Jorge wrote:
> The problem is that facets in Thinking Sphinx are defined per model
> field. If I had a Book model with the field :author, I could just do
> this with Thinking Sphinx on the Book model:
>
> define_index do
>   indexes author, :facet => true
> end
>
> I need my system to have dynamic facets per product category,
> otherwise I will have to create more than 300 different product
> types...

I have never used Thinking Sphinx or even plain Sphinx; still, here are my 2¢:

It seems to me that your problem is that you have a meta-model toolkit on the one hand (your Category, Product, AttributeType, and Attribute classes), while on the other hand you want to define concrete facets such as author, brand, etc. on top of that. That may be possible, but it probably is not.

I've had a quick look at the code for Thinking Sphinx, and you'll probably be able to find what you need if you study it closely, starting from its ActiveRecord integration. I don't think you will be able to achieve your goal without understanding the details of how Thinking Sphinx works, as you're trying to stretch it beyond its intended use.

Michael

--
Michael Schuerig
mailto:michael-q5aiKMLteq4b1SvskN2V4Q@public.gmane.org
http://www.schuerig.de/michael/
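As a small illustration of the gap between a meta-model and concrete classes, here is a sketch of how plain Ruby can generate a concrete class per category at runtime from a list of discovered attribute names. This is generic Ruby metaprogramming only: build_product_class and facet_names are hypothetical names, and nothing in the thread confirms that Thinking Sphinx's define_index can be driven this way.

```ruby
# Build a concrete class for one category from its discovered attribute
# names. Hypothetical plain Ruby; no Thinking Sphinx calls are made.
def build_product_class(attribute_names)
  Struct.new(*attribute_names, keyword_init: true) do
    # Remember which attributes would act as facets for this category.
    define_singleton_method(:facet_names) { attribute_names }
  end
end

# Hypothetical usage: the attribute list would come from the crawler.
Book = build_product_class([:author, :publisher, :year])
book = Book.new(author: "Author 1", publisher: "Publisher 1", year: 2009)

puts Book.facet_names.inspect
puts book.author
```

Whether classes generated like this could then be indexed one by one is exactly the part that would require studying the Thinking Sphinx source, as suggested above.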