Hi all,
I am going to write a fairly long email about deletion and destruction
in has_many :through associations, and then make some proposals. Sorry
for the length, but I think it''s currently a mess which we need to get
sorted. Please do read it :)
### Issues ###
--------------
* The :dependent option is not supported, and the "default" behaviour
for delete (a delete_all operation) is different to has_many (a nullify
operation)
* delete and destroy operate on different records (explained further
below)
* delete, and destroy since 3.0, raise an exception unless the source
association is a belongs_to
* Counter caches and :touch fields are not updated correctly on deletion
### History ###
---------------
Consider the following:
class Post < AR::Base
has_many :taggings
has_many :tags, :through => :taggings
end
In 2.3, the behaviour was thus:
* post.tags.delete(tag) would not affect the tag at all, but it would
delete the relevant Tagging record, thus detaching the tag from the post
* post.tags.destroy(tag) would delete the tag record, but would not
affect the tagging record at all, thus leaving invalid records lying
around
Ticket #2251 was created about this, which contains quite a bit of
fairly confusing debate: http://bit.ly/fBpCfM
The solution proposed in #2251 was to make destroy behave more like
delete, and a patch was submitted which causes post.tags.destroy(tag) to
delete both the tag AND the tagging.
This lead to a complaint that destroy had ceased to work in some cases
where it had previously work. Specifically, because destroy now involved
the join record, it didn''t work unless the source association on the
join record was a belongs_to.
Due to the complaint, Michael Koziarski reverted the change in
2-3-stable, but left it in master (which went on to become 3-0-stable)
pending further discussion.
There was some further discussion, but it eventually petered out without
any action, so we now have a situation where that change has crept into
3-0-stable without proper agreement.
Some of that further discussion centred around the assumptions that are
made by has_many :through about your join records. In the above example,
you might expect post.tags.delete(tag) to delete the tagging, on the
basis that you don''t want orphaned records just lying around. However,
this is an assumption that Tagging exists purely to link Post to Tag,
and does not have any independent purpose.
Here''s another example:
class Post < AR::Base
has_many :comments
has_many :comment_authors, :through => :comments, :source => :author
end
class Comment < AR::Base
belongs_to :author
end
Suppose we can have anonymous comments, which have no associated author.
Now, if we follow the same assumptions,
post.comments_authors.delete(author) should also delete the associated
comment. But we can clearly see that this assumption could be wrong if
what we actually want to achieve is to make all of a given author''s
comments on a given post anonymous.
I basically think that the lack of support for the :dependent option has
lead to assumptions being built into has_many :through, because people
have lacked the ability to be explicit about what they want.
### Proposal ###
----------------
Here is my proposal:
* If the :dependent option is implemented in an analogous way to the
implementation for regular has_many, it will provide enough power to be
completely explicit about what should be done in a given situation, and
therefore we can stop making (potentially incorrect) assumptions
* delete and destroy should only ever operate on the same records, as
implied by the documentation of them in AssociationCollection [reminder:
at the moment delete operates on the join record, but destroy operates
on the join record and the associated record]
* Therefore, has_many :through should never specifically delete join
records during a delete or destroy operation (but it may do given the
right configuration of :dependent options on models)
To illustrate why this will solve things, I''ll return to the examples
given. Assuming my proposal is implemented, the tags example could be
coded like so:
class Post < AR::Base
has_many :taggings, :dependent => :destroy
has_many :tags, :through => :taggings
end
class Tagging < AR::Base
belongs_to :post
belongs_to :tag
end
class Tag < AR::Base
has_many :taggings, :dependent => :destroy
has_many :posts, :through => :taggings
end
Now consider post.tags.delete(tag). In the absence, of a :dependent
option, we should fall back to the default of nullification, so no
records are actually deleted. The tag_id on the tagging is nullified.
However, if we did post.tags.destroy(tag), the tag itself would of
course be destroyed. But because :dependent is set on Tag#taggings, the
associated tagging would *also* be destroyed automatically, as desired.
So no orphaned records.
But the option would also be open to add a :dependent option to
Post#tags, in which case post.tags.delete(tag) would act in the same way
as post.tags.destroy(tag), and the tagging would be destroyed too, as
desired.
The point is that we can now control this behaviour.
On to the second example:
class Post < AR::Base
has_many :comments, :dependent => :destroy
has_many :comment_authors, :through => :comments,
:source => :author, :dependent => :destroy
end
class Comment < AR::Base
belongs_to :post
belongs_to :author
end
class Author < AR::Base
has_many :comments
end
post.comment_authors.delete(author) will destroy the author, because of
the :dependent option. However, as we have decided we want this to have
the effect of making the comment anonymous, not destroying it, we have
not added a :dependent option to Author#comments. Hence the Comment join
record is not destroyed, as desired.
### Compatibility ###
---------------------
The current default behaviour will change in the following way:
1. post.tags.delete(tag) will switch from deleting the join record (via
delete_all), to nullifying its foreign key
2. post.tags.destroy(tag) will no longer also delete the join record
(which has only been the behaviour since 3.0 anyway)
I don''t think either of these changes are so awful that they cause a
great problem. In the worst case scenario, we simply have a few more
records lying around that we would otherwise have deleted. In any case,
if we make this change I think we should advertise it widely, so users
know that they ought to check their code when upgrading. Additionally we
should add documentation to explain how to use the :dependent option to
be explicit about what should be deleted in a has_many :through
association.
An alternative way to address compatibility might be to provide a new
temporary :dependent option, so you could perhaps specify :dependent
=> :join_record, which would revert to the old delete/destroy behaviour,
but emit a deprecation warning, and then this option would subsequently
be removed in the next version. But I don''t think this really gains a
lot, as users would still have to update their code.
### Conclusion ###
------------------
I am willing and able to implement these changes, but I could do with
some feedback, particularly from core team members, about whether this
is to right way to go.
I''d also like to know if you can think of any situations which would
*not* be adequately solved by the above proposal.
I have started a branch at
https://github.com/jonleighton/rails/tree/fix_has_many_through_deletion
which so far just contains refactorings and a couple of bug fixes.
Cheers,
Jon
--
http://jonathanleighton.com/