Hi all, I am going to write a fairly long email about deletion and destruction in has_many :through associations, and then make some proposals. Sorry for the length, but I think it''s currently a mess which we need to get sorted. Please do read it :) ### Issues ### -------------- * The :dependent option is not supported, and the "default" behaviour for delete (a delete_all operation) is different to has_many (a nullify operation) * delete and destroy operate on different records (explained further below) * delete, and destroy since 3.0, raise an exception unless the source association is a belongs_to * Counter caches and :touch fields are not updated correctly on deletion ### History ### --------------- Consider the following: class Post < AR::Base has_many :taggings has_many :tags, :through => :taggings end In 2.3, the behaviour was thus: * post.tags.delete(tag) would not affect the tag at all, but it would delete the relevant Tagging record, thus detaching the tag from the post * post.tags.destroy(tag) would delete the tag record, but would not affect the tagging record at all, thus leaving invalid records lying around Ticket #2251 was created about this, which contains quite a bit of fairly confusing debate: http://bit.ly/fBpCfM The solution proposed in #2251 was to make destroy behave more like delete, and a patch was submitted which causes post.tags.destroy(tag) to delete both the tag AND the tagging. This lead to a complaint that destroy had ceased to work in some cases where it had previously work. Specifically, because destroy now involved the join record, it didn''t work unless the source association on the join record was a belongs_to. Due to the complaint, Michael Koziarski reverted the change in 2-3-stable, but left it in master (which went on to become 3-0-stable) pending further discussion. There was some further discussion, but it eventually petered out without any action, so we now have a situation where that change has crept into 3-0-stable without proper agreement. Some of that further discussion centred around the assumptions that are made by has_many :through about your join records. In the above example, you might expect post.tags.delete(tag) to delete the tagging, on the basis that you don''t want orphaned records just lying around. However, this is an assumption that Tagging exists purely to link Post to Tag, and does not have any independent purpose. Here''s another example: class Post < AR::Base has_many :comments has_many :comment_authors, :through => :comments, :source => :author end class Comment < AR::Base belongs_to :author end Suppose we can have anonymous comments, which have no associated author. Now, if we follow the same assumptions, post.comments_authors.delete(author) should also delete the associated comment. But we can clearly see that this assumption could be wrong if what we actually want to achieve is to make all of a given author''s comments on a given post anonymous. I basically think that the lack of support for the :dependent option has lead to assumptions being built into has_many :through, because people have lacked the ability to be explicit about what they want. ### Proposal ### ---------------- Here is my proposal: * If the :dependent option is implemented in an analogous way to the implementation for regular has_many, it will provide enough power to be completely explicit about what should be done in a given situation, and therefore we can stop making (potentially incorrect) assumptions * delete and destroy should only ever operate on the same records, as implied by the documentation of them in AssociationCollection [reminder: at the moment delete operates on the join record, but destroy operates on the join record and the associated record] * Therefore, has_many :through should never specifically delete join records during a delete or destroy operation (but it may do given the right configuration of :dependent options on models) To illustrate why this will solve things, I''ll return to the examples given. Assuming my proposal is implemented, the tags example could be coded like so: class Post < AR::Base has_many :taggings, :dependent => :destroy has_many :tags, :through => :taggings end class Tagging < AR::Base belongs_to :post belongs_to :tag end class Tag < AR::Base has_many :taggings, :dependent => :destroy has_many :posts, :through => :taggings end Now consider post.tags.delete(tag). In the absence, of a :dependent option, we should fall back to the default of nullification, so no records are actually deleted. The tag_id on the tagging is nullified. However, if we did post.tags.destroy(tag), the tag itself would of course be destroyed. But because :dependent is set on Tag#taggings, the associated tagging would *also* be destroyed automatically, as desired. So no orphaned records. But the option would also be open to add a :dependent option to Post#tags, in which case post.tags.delete(tag) would act in the same way as post.tags.destroy(tag), and the tagging would be destroyed too, as desired. The point is that we can now control this behaviour. On to the second example: class Post < AR::Base has_many :comments, :dependent => :destroy has_many :comment_authors, :through => :comments, :source => :author, :dependent => :destroy end class Comment < AR::Base belongs_to :post belongs_to :author end class Author < AR::Base has_many :comments end post.comment_authors.delete(author) will destroy the author, because of the :dependent option. However, as we have decided we want this to have the effect of making the comment anonymous, not destroying it, we have not added a :dependent option to Author#comments. Hence the Comment join record is not destroyed, as desired. ### Compatibility ### --------------------- The current default behaviour will change in the following way: 1. post.tags.delete(tag) will switch from deleting the join record (via delete_all), to nullifying its foreign key 2. post.tags.destroy(tag) will no longer also delete the join record (which has only been the behaviour since 3.0 anyway) I don''t think either of these changes are so awful that they cause a great problem. In the worst case scenario, we simply have a few more records lying around that we would otherwise have deleted. In any case, if we make this change I think we should advertise it widely, so users know that they ought to check their code when upgrading. Additionally we should add documentation to explain how to use the :dependent option to be explicit about what should be deleted in a has_many :through association. An alternative way to address compatibility might be to provide a new temporary :dependent option, so you could perhaps specify :dependent => :join_record, which would revert to the old delete/destroy behaviour, but emit a deprecation warning, and then this option would subsequently be removed in the next version. But I don''t think this really gains a lot, as users would still have to update their code. ### Conclusion ### ------------------ I am willing and able to implement these changes, but I could do with some feedback, particularly from core team members, about whether this is to right way to go. I''d also like to know if you can think of any situations which would *not* be adequately solved by the above proposal. I have started a branch at https://github.com/jonleighton/rails/tree/fix_has_many_through_deletion which so far just contains refactorings and a couple of bug fixes. Cheers, Jon -- http://jonathanleighton.com/