Toby Ovod-Everett
2013-Jun-14  01:07 UTC
Support for deferrable constraints in PostgreSQLAdapter#disable_referential_integrity
I recently submitted a pull-request
(https://github.com/rails/rails/pull/10939) to address a bug in
PostgreSQLAdapter#disable_referential_integrity related to how the
existing code behaves in a transaction when the PostgreSQL user account
lacks superuser privileges on the PostgreSQL instance.
While this permits the code to function without crashing, it doesn''t
actually make #disable_referential_integrity work unless the user has
superuser privileges (which apply to the entire PostgreSQL instance).
There may be perfectly valid reasons for a developer not to have
superuser privs on the PostgreSQL instance on their development machine,
and I would like to explore options for resolving this issue.
The first item to discuss is the purpose of
#disable_referential_integrity. Technically, the method name implies
that the only thing it is supposed to do is to disable referential
integrity. Referential integrity is but one specific case of a whole set
of database level integrity constraints (including primary keys,
uniqueness constraints, etc.). Few databases support SQL 92 assertions
(see
http://stackoverflow.com/questions/6368349/why-dont-dbmss-support-assertion),
so most developers/DBAs that want more complicated database level
integrity constraints implement them via triggers. In addition, triggers
can also be used to implement automatic caching of computer or derived
data within the database (for instance, updating a table that keeps
track of all direct and indirect reporting pairs for employees in an
organization).
I would suggest that what we really mean by
#disable_referential_integrity is "disable all constraints and automatic
updates that affect more than one row in the database so that I can do
bulk database changes without running into issues sequencing the updates
because I need to make a set of changes that will (or should) leave the
database in a consistent state". It just happens to be that referential
integrity is the most common of these constraints, and so the method was
given its current name. I see no need to change the name, but being
clear about what we mean by it is important.
Now on to the PostgreSQL specifics. PostgreSQL supports two approaches
for disabling constraints and triggers. The first is to use ALTER TABLE
table DISABLE TRIGGER ALL. This requires superuser privileges and
disables both user and system created triggers. The system created
triggers are used by PostgreSQL to implement foreign key contraints (as
well as some other types of constraints). The user created triggers
could have been created by the user for any purpose.
The current implementation of #disable_referential_integrity attempts to
fall back to ALTER TABLE table DISABLE TRIGGER USER for scenarios where
the user lacks superuser privileges. This will disable user created
triggers, but leaves alone the system created triggers used by
PostgreSQL to implement foreign key constraints. This means the foreign
key constraints are still being enforced, and while
#disable_referential_integrity doesn''t raise any errors, referential
integrity is not being disabled!
There is another approach that doesn''t require superuser privs, and
that
is to use SET CONSTRAINTS ALL DEFERRED, which defers evaluation of
constraints until the end of the current transaction. This approach has
two requirements. First, it must be issued within a transaction - if
there is no current transaction, the statement has no effect. Second,
the constraints must have been created with the DEFERRABLE parameter.
There is one major advantage to this approach, and that is that the
constraints are verified when the transaction ends. Unlike the Oracle
behavior documented in adapter_test.rb, PostgreSQL does not appear to
verify the referential integrity triggers when they are re-enabled via
ALTER TABLE table ENABLE TRIGGER ALL.
Here are the various scenario options that need to be considered when
deciding what behavior PostgreSQLAdapter#disable_referential_integrity
should implement:
* User has SUPERUSER vs. User does not have SUPERUSER
* FOREIGN KEY constraints were defined without DEFERRABLE vs.
  defined with DEFERRABLE
* User has declared additional constraints via user triggers vs.
  no user triggers
* The call was made from within a transaction vs. outside a transaction
Part of the problem in crafting the code is that we don''t have any easy
way of knowing which of the 16 possible scenarios the code faces, but we
can do our best. Also, while it might be nice to provide more control to
the user over options such as whether or not user triggers are disabled,
I''d rather not complicate the interface. Finally, while it would be
nice
to verify referential integrity at the end of the block, I don''t think
that takes precedence over ensuring reliable code under a wide variety
of scenarios.
My suggestions are as follows:
* If the user has SUPERUSER, disable all triggers.  This ensures we can
  disable referential integrity even if the constraints weren''t
declared
  with DEFERRABLE, so long as the user has SUPERUSER.
* If the user doesn''t have SUPERUSER, disable all user triggers and
then
  use SET CONSTRAINTS ALL DEFERRED.
  * If the call was made from within a transaction, there is no need to
    create a nested transaction.
  * If the call was not made from within a transaction, create one
    before calling SET CONSTRAINTS ALL DEFERRED and yield to the block
    from within that transaction.
While this results in three possible code paths depending upon the
scenario, I believe it gets us as close as possible to the intent of
#disable_referential_integrity under all the scenarios.
I would be happy to implement this approach and to develop a reasonably
comprehensive test suite to validate it (starting with the test suite,
of course). Should I proceed? Is there any feedback before I start?
--Toby Ovod-Everett
P.S. I don''t intend to open discussion on whether support for
referential integrity belongs in ActiveRecord. I personally am happy
with the current situation. There are at least two 3rd-party modules
that provide this support (foreigner and schema_plus), and I think it is
useful to encourage innovation in this space. I do, however, strongly
believe that #disable_referential_integrity belongs in ActiveRecord
because it exposes an interface to both internal code (i.e. fixtures)
and 3rd-party code (database_cleaner, yaml_db, etc.) that need this
support, and centralizing that improves the chances of both getting the
behavior implemented correctly and simplifying life for the code that
needs this support to tap into it.
-- 
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to rubyonrails-core+unsubscribe@googlegroups.com.
To post to this group, send email to rubyonrails-core@googlegroups.com.
Visit this group at http://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/groups/opt_out.
