Toby Ovod-Everett
2013-Jun-14 01:07 UTC
Support for deferrable constraints in PostgreSQLAdapter#disable_referential_integrity
I recently submitted a pull-request (https://github.com/rails/rails/pull/10939) to address a bug in PostgreSQLAdapter#disable_referential_integrity related to how the existing code behaves in a transaction when the PostgreSQL user account lacks superuser privileges on the PostgreSQL instance. While this permits the code to function without crashing, it doesn''t actually make #disable_referential_integrity work unless the user has superuser privileges (which apply to the entire PostgreSQL instance). There may be perfectly valid reasons for a developer not to have superuser privs on the PostgreSQL instance on their development machine, and I would like to explore options for resolving this issue. The first item to discuss is the purpose of #disable_referential_integrity. Technically, the method name implies that the only thing it is supposed to do is to disable referential integrity. Referential integrity is but one specific case of a whole set of database level integrity constraints (including primary keys, uniqueness constraints, etc.). Few databases support SQL 92 assertions (see http://stackoverflow.com/questions/6368349/why-dont-dbmss-support-assertion), so most developers/DBAs that want more complicated database level integrity constraints implement them via triggers. In addition, triggers can also be used to implement automatic caching of computer or derived data within the database (for instance, updating a table that keeps track of all direct and indirect reporting pairs for employees in an organization). I would suggest that what we really mean by #disable_referential_integrity is "disable all constraints and automatic updates that affect more than one row in the database so that I can do bulk database changes without running into issues sequencing the updates because I need to make a set of changes that will (or should) leave the database in a consistent state". It just happens to be that referential integrity is the most common of these constraints, and so the method was given its current name. I see no need to change the name, but being clear about what we mean by it is important. Now on to the PostgreSQL specifics. PostgreSQL supports two approaches for disabling constraints and triggers. The first is to use ALTER TABLE table DISABLE TRIGGER ALL. This requires superuser privileges and disables both user and system created triggers. The system created triggers are used by PostgreSQL to implement foreign key contraints (as well as some other types of constraints). The user created triggers could have been created by the user for any purpose. The current implementation of #disable_referential_integrity attempts to fall back to ALTER TABLE table DISABLE TRIGGER USER for scenarios where the user lacks superuser privileges. This will disable user created triggers, but leaves alone the system created triggers used by PostgreSQL to implement foreign key constraints. This means the foreign key constraints are still being enforced, and while #disable_referential_integrity doesn''t raise any errors, referential integrity is not being disabled! There is another approach that doesn''t require superuser privs, and that is to use SET CONSTRAINTS ALL DEFERRED, which defers evaluation of constraints until the end of the current transaction. This approach has two requirements. First, it must be issued within a transaction - if there is no current transaction, the statement has no effect. Second, the constraints must have been created with the DEFERRABLE parameter. There is one major advantage to this approach, and that is that the constraints are verified when the transaction ends. Unlike the Oracle behavior documented in adapter_test.rb, PostgreSQL does not appear to verify the referential integrity triggers when they are re-enabled via ALTER TABLE table ENABLE TRIGGER ALL. Here are the various scenario options that need to be considered when deciding what behavior PostgreSQLAdapter#disable_referential_integrity should implement: * User has SUPERUSER vs. User does not have SUPERUSER * FOREIGN KEY constraints were defined without DEFERRABLE vs. defined with DEFERRABLE * User has declared additional constraints via user triggers vs. no user triggers * The call was made from within a transaction vs. outside a transaction Part of the problem in crafting the code is that we don''t have any easy way of knowing which of the 16 possible scenarios the code faces, but we can do our best. Also, while it might be nice to provide more control to the user over options such as whether or not user triggers are disabled, I''d rather not complicate the interface. Finally, while it would be nice to verify referential integrity at the end of the block, I don''t think that takes precedence over ensuring reliable code under a wide variety of scenarios. My suggestions are as follows: * If the user has SUPERUSER, disable all triggers. This ensures we can disable referential integrity even if the constraints weren''t declared with DEFERRABLE, so long as the user has SUPERUSER. * If the user doesn''t have SUPERUSER, disable all user triggers and then use SET CONSTRAINTS ALL DEFERRED. * If the call was made from within a transaction, there is no need to create a nested transaction. * If the call was not made from within a transaction, create one before calling SET CONSTRAINTS ALL DEFERRED and yield to the block from within that transaction. While this results in three possible code paths depending upon the scenario, I believe it gets us as close as possible to the intent of #disable_referential_integrity under all the scenarios. I would be happy to implement this approach and to develop a reasonably comprehensive test suite to validate it (starting with the test suite, of course). Should I proceed? Is there any feedback before I start? --Toby Ovod-Everett P.S. I don''t intend to open discussion on whether support for referential integrity belongs in ActiveRecord. I personally am happy with the current situation. There are at least two 3rd-party modules that provide this support (foreigner and schema_plus), and I think it is useful to encourage innovation in this space. I do, however, strongly believe that #disable_referential_integrity belongs in ActiveRecord because it exposes an interface to both internal code (i.e. fixtures) and 3rd-party code (database_cleaner, yaml_db, etc.) that need this support, and centralizing that improves the chances of both getting the behavior implemented correctly and simplifying life for the code that needs this support to tap into it. -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To unsubscribe from this group and stop receiving emails from it, send an email to rubyonrails-core+unsubscribe@googlegroups.com. To post to this group, send email to rubyonrails-core@googlegroups.com. Visit this group at http://groups.google.com/group/rubyonrails-core. For more options, visit https://groups.google.com/groups/opt_out.