Andrew Cowan
2006-Apr-24 17:25 UTC
[Rails] Confusion with expressing many to many relationship
Hi folks, I am in the process of converting an existing non-rails application to rails, and am not sure what the best approach would be for specifying the relationship between the two sets of data. I''ll describe the existing table structures first. Table 1: Urls Each url has a unique id and two lists of Phrases, these are currently setup as varchars with "," delimitation - so each url has many Phrases. Table 2: Phrases Each phrase has a unique id and two lists of urls, each url list is setup as a varchar containing a "," delimited list of urls ids - so each phrase also has many urls. In trying to convert this to rails and improve the table structure I find myself confused with the way rails business model allows table relationships to be specified. I''d be very grateful if someone more experienced could explain how they might define these tables and the relationsip they share. Thanks, Andy -- Posted via http://www.ruby-forum.com/.
Josh Susser
2006-Apr-24 19:49 UTC
[Rails] Re: Confusion with expressing many to many relationship
Andrew Cowan wrote:> Table 1: Urls > Each url has a unique id and two lists of Phrases, these are currently > setup as varchars with "," delimitation - so each url has many Phrases. > > Table 2: Phrases > Each phrase has a unique id and two lists of urls, each url list is > setup as a varchar containing a "," delimited list of urls ids - so each > phrase also has many urls. > > In trying to convert this to rails and improve the table structure I > find myself confused with the way rails business model allows table > relationships to be specified.Wow, that sounds gross. Packing foreign keys into a varchar list of ids means you lose all the power of the database to create relations between records or do queries with joins. You get more leverage from using something like a join table to let the database manage those relations for you. There are two ways to do what you want, has_and_belong_to_many (join table) or has_many :through (join model). If you go with habtm you''ll need two join tables, one for each type of relationship. If you use has_many :through, you can use one join model with an attribute that indicates the type of the relationship. You can find a lot more information about doing that on my blog. -- Josh Susser http://blog.hasmanythrough.com -- Posted via http://www.ruby-forum.com/.
Carl Fyffe
2006-Apr-24 20:07 UTC
[Rails] Re: Confusion with expressing many to many relationship
Can you explain why there are two lists of phrases and two lists of urls? Is that because they are split into types? On 4/24/06, Josh Susser <josh@hasmanythrough.com> wrote:> Andrew Cowan wrote: > > Table 1: Urls > > Each url has a unique id and two lists of Phrases, these are currently > > setup as varchars with "," delimitation - so each url has many Phrases. > > > > Table 2: Phrases > > Each phrase has a unique id and two lists of urls, each url list is > > setup as a varchar containing a "," delimited list of urls ids - so each > > phrase also has many urls. > > > > In trying to convert this to rails and improve the table structure I > > find myself confused with the way rails business model allows table > > relationships to be specified. > > Wow, that sounds gross. Packing foreign keys into a varchar list of ids > means you lose all the power of the database to create relations between > records or do queries with joins. You get more leverage from using > something like a join table to let the database manage those relations > for you. There are two ways to do what you want, has_and_belong_to_many > (join table) or has_many :through (join model). If you go with habtm > you''ll need two join tables, one for each type of relationship. If you > use has_many :through, you can use one join model with an attribute that > indicates the type of the relationship. You can find a lot more > information about doing that on my blog. > > -- > Josh Susser > http://blog.hasmanythrough.com > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails >
Andrew Cowan
2006-Apr-24 21:02 UTC
[Rails] Re: Re: Confusion with expressing many to many relationship
Carl Fyffe wrote:> Can you explain why there are two lists of phrases and two lists of > urls? Is that because they are split into types?Yes, one list is for _normal_ phrases and one is for relevant phrases. Each url has a list of normal-related phrases and a list of relevant phrases. Each phrase in turn has a list going backwards to the urls it is normally-related to and the list of urls it is relevant to. Its a very redundant schema, and years old, at the time it seemed like a good way to go since it made lookups easy in either direction. But moving to Rails it seems a horrible approach. ;) Thanks -Andy -- Posted via http://www.ruby-forum.com/.
Carl Fyffe
2006-Apr-24 21:52 UTC
[Rails] Re: Re: Confusion with expressing many to many relationship
The approach would have been very suitable for a flat file database. Flat file databases have their place and can be useful, imdb was flat file for a very long time, but you want to move it to Rails so you need to put it into a more logical (normalized) schema now... You basically need to add a third table to the schema that will link the phrases and urls. You should make it a full blown model and take advantage of the "has_many :through", which Josh was talking about. You want a full blown model because you need to apply a "type" field that describes the relationship (normal phrase vs relevant phrase). You could use HABTM but that is, in my opinion, a little more confusing to implement. Hope that helps! On 4/24/06, Andrew Cowan <icculus@gmdstudios.com> wrote:> Carl Fyffe wrote: > > Can you explain why there are two lists of phrases and two lists of > > urls? Is that because they are split into types? > > Yes, one list is for _normal_ phrases and one is for relevant phrases. > Each url has a list of normal-related phrases and a list of relevant > phrases. Each phrase in turn has a list going backwards to the urls it > is normally-related to and the list of urls it is relevant to. > > Its a very redundant schema, and years old, at the time it seemed like a > good way to go since it made lookups easy in either direction. But > moving to Rails it seems a horrible approach. ;) > > Thanks > -Andy > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails >
Thank you both for the suggestions and for the pointer to your blog! Yep, my approach was and is still gross, but I must admit I am concerned with the size that the join table will take, as currently the urls table has about 7000 records and the phrases table has about 87000 records... So, being a newbie with the has_many :through relationship I''d like to make sure I understand before commiting to its implementation. A quick rundown to make sure I am getting what I need to do, please tear this apart if it becomes apparent that I missed the boat -- used the example from hasmanythrough.com. Table 1: urls - has id and url Table 2: phrases - has id and phrase Table 3: relations - has id, url_id, phrase_id, isrelevant (boolean) Ignore isrelevant for now as I can add class methods to weed out the relationship during overloaded finds, am mostly concerned in the url <-> phrases relationship. class Relation < ActiveRecord::Base belongs_to :url, :foreign_key => "url_id", :class_name => "Url" belongs_to :phrase, :foreign_key => "phrase_id", :class_name => "Phrase" end class Url < ActiveRecord::Base has_many :related_as_phrase, :foreign_key => ''phrase_id'', :class_name => ''Relation'' has_many :phrases, :through => :related_as_phrase end class Phrase < ActiveRecord::Base has_many :related_as_url, :foreign_key => ''url_id'', :class_name => ''Relation'' has_many :urls, :through => :related_as_urls end Do the above model rules look right for this? Anything I should be aware of? Thanks again, am much appeciated that this was pointed out to me! -Andy> Wow, that sounds gross. Packing foreign keys into a varchar list of ids > means you lose all the power of the database to create relations between > records or do queries with joins. You get more leverage from using > something like a join table to let the database manage those relations > for you. There are two ways to do what you want, has_and_belong_to_many > (join table) or has_many :through (join model). If you go with habtm > you''ll need two join tables, one for each type of relationship. If you > use has_many :through, you can use one join model with an attribute that > indicates the type of the relationship. You can find a lot more > information about doing that on my blog. > > -- > Josh Susser > http://blog.hasmanythrough.com-- Posted via http://www.ruby-forum.com/.
Andrew Cowan wrote:> Thank you both for the suggestions and for the pointer to your blog!No prob. That''s what it''s there for.> Yep, my approach was and is still gross, but I must admit I am concerned > with the size that the join table will take, as currently the urls table > has about 7000 records and the phrases table has about 87000 records...Don''t worry about the size of the join table. Databases are built just for doing this sort of thing, and unless your table is many millions of rows long you don''t need to be concerned.> Table 1: urls - has id and url > Table 2: phrases - has id and phrase > Table 3: relations - has id, url_id, phrase_id, isrelevant (boolean) > > Ignore isrelevant for now as I can add class methods to weed out the > relationship during overloaded finds, am mostly concerned in the url <-> > phrases relationship.Your tables look fine, but you go a bit overboard in the model classes. You''ll have much more fun in Rails if you go with the flow, meaning just follow the conventions and let Rails use the defaults unless you actually need to specify something different. It looks like you got confused looking at the polymorphic examples, which is serious overkill for what you want. Just go with the standard has_many :through, like so: class Relation < ActiveRecord::Base belongs_to :url belongs_to :phrase end class Url < ActiveRecord::Base has_many :relations has_many :phrases, :through => :relations end class Phrase < ActiveRecord::Base has_many :relations has_many :urls, :through => :relations end One more thing - the model name "relation" is probably too generic. Having a bad name isn''t going to break anything, but having a meaningful name can help you think more clearly about your objects. "Magazine has_many :readers, :through => :subscritions" is a lot more meaningful than "Magazine has_many :people, :through => :relations". It''s *almost* possible for Ruby code to be self-documenting if you give things good names. Just something to think about. -- Josh Susser http://blog.hasmanythrough.com -- Posted via http://www.ruby-forum.com/.
Ahhhh! I like your suggested model changes much better, the node/edges example on your site put some doubt in my mind that defining the models would be as natural as the revised models end up being. Thank you again for taking the time to help clear this up, it is much appreciated! :) -Andy Josh Susser wrote:> Andrew Cowan wrote: >> Thank you both for the suggestions and for the pointer to your blog! > > No prob. That''s what it''s there for. > >> Yep, my approach was and is still gross, but I must admit I am concerned >> with the size that the join table will take, as currently the urls table >> has about 7000 records and the phrases table has about 87000 records... > > Don''t worry about the size of the join table. Databases are built just > for doing this sort of thing, and unless your table is many millions of > rows long you don''t need to be concerned. > >> Table 1: urls - has id and url >> Table 2: phrases - has id and phrase >> Table 3: relations - has id, url_id, phrase_id, isrelevant (boolean) >> >> Ignore isrelevant for now as I can add class methods to weed out the >> relationship during overloaded finds, am mostly concerned in the url <-> >> phrases relationship. > > Your tables look fine, but you go a bit overboard in the model classes. > You''ll have much more fun in Rails if you go with the flow, meaning just > follow the conventions and let Rails use the defaults unless you > actually need to specify something different. It looks like you got > confused looking at the polymorphic examples, which is serious overkill > for what you want. Just go with the standard has_many :through, like so: > > class Relation < ActiveRecord::Base > belongs_to :url > belongs_to :phrase > end > > class Url < ActiveRecord::Base > has_many :relations > has_many :phrases, :through => :relations > end > > class Phrase < ActiveRecord::Base > has_many :relations > has_many :urls, :through => :relations > end > > One more thing - the model name "relation" is probably too generic. > Having a bad name isn''t going to break anything, but having a meaningful > name can help you think more clearly about your objects. "Magazine > has_many :readers, :through => :subscritions" is a lot more meaningful > than "Magazine has_many :people, :through => :relations". It''s *almost* > possible for Ruby code to be self-documenting if you give things good > names. Just something to think about. > > -- > Josh Susser > http://blog.hasmanythrough.com-- Posted via http://www.ruby-forum.com/.
Josh, I didn''t get to play with this until this morning as my server was getting a hardware upgrade last night -- so far this is working out great as far as being able to access url.phrases and phrase.urls ... I fully understand your meaning with renaming the Relation model, but after much thought I really cannot find anything more approriate so for the time being am leaving that. One thing I am wondering is, how am I able to access the specific relation that a url and phrase share? Example: given a list of single url @url @url.phrases.each do |ph| # How can I access the relation that @url and ph share through the # Relation model here ? end I tried ph.relation on a whim and of course found it a non-existent member. I''ll keep poking at it but so far am not sure if I need to modify the model or if something else is needed... Thanks! Andy> > class Relation < ActiveRecord::Base > belongs_to :url > belongs_to :phrase > end > > class Url < ActiveRecord::Base > has_many :relations > has_many :phrases, :through => :relations > end > > class Phrase < ActiveRecord::Base > has_many :relations > has_many :urls, :through => :relations > end >-- Posted via http://www.ruby-forum.com/.
Actually, now that I have gotten to play with this some more I see that there is a very real performance hit, so much that this won''t be usable as-is. Not sure if this is the way has_many :through should work but I''ll desribe what I am seeing. Some prelim info: The box is only about a year or so old, has dual xeon 3.06 HT processors, 4G ram and dual raid-1 10kMhz scsi drives - it has handled everything else I have thrown at it without a whimper. Table 1: urls - has roughly 7,000 records Table 2: phrases - has roughly 87,000 records Table 3: relations - has 235,000 or so records, each one having 1 url and 1 phrase from their respective DBs Given a single url object, Merely doing a simple: @phrases = url.phrases I am watching 87,000 SQL queries fly by in my lighttpd output, similar to this: Relation Load (0.049132) SELECT * FROM relations WHERE (url_id=7163 AND phrase_id=77686) LIMIT 1 one for each phrase_id covering every phrase_id in the phrases table. Ok, that in itself is an application killer because the above operation would be one of many for a given page display, and just that is showing noticable lag before the page is rendered. So, my hunch tells me something is not right? Are these tables just too big to be using this technique or is something else going on? -Andy Andrew Cowan wrote:> Josh, I didn''t get to play with this until this morning as my server was > getting a hardware upgrade last night -- so far this is working out > great as far as being able to access url.phrases and phrase.urls ... > > I fully understand your meaning with renaming the Relation model, but > after much thought I really cannot find anything more approriate so for > the time being am leaving that. > > One thing I am wondering is, how am I able to access the specific > relation that a url and phrase share? > > Example: > > given a list of single url @url > @url.phrases.each do |ph| > # How can I access the relation that @url and ph share through the > # Relation model here ? > end > > I tried ph.relation on a whim and of course found it a non-existent > member. > > I''ll keep poking at it but so far am not sure if I need to modify the > model or if something else is needed... > > Thanks! > Andy >-- Posted via http://www.ruby-forum.com/.