Hi,
[My apologies if this isn''t the right list for this post.]
The documentation for validates_uniqueness_of says that the validation
can''t protect against races that would result in duplicate insertions,
even if the DB is in serializable mode. I don''t think this is true:
the SQL serializable isolation mode is supposed to ensure that
transactions appear to have a well-defined execution order. If all
transactions appear to execute in a specific order, than doing a
SELECT to check for the existence of a key followed by an UPDATE
should be enough to protect against duplicate keys.
You can test the race shown at
http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#M002167
using MySQL 5 to see that serializable isolation does actually prevent
duplicate keys:
[Session A]
create table comments (title varchar(100), content varchar(100))
engine InnoDB;
set transaction isolation level serializable;
begin;
select * from comments where title = "My Post";
[Session B]
set transaction isolation level serializable;
begin;
select * from comments where title = "My Post";
[Session A]
insert into comments (title, content) values ("My Post",
"hi!");
[NOTE: We begin to wait on a lock here]
[Session B]
insert into comments (title, content) values ("My Post",
"hello!");
[NOTE: Deadlock detected, Session B''s transaction is aborted]
[Session A]
[insert has unblocked]
commit;
We can get the same protection even in the default repeatable read
isolation mode by explicitly taking out a lock during the select. So,
with this patch, validates_uniqueness_of should work correctly even if
the DB is running only in repeatable read:
diff --git a/activerecord/lib/active_record/validations/uniqueness.rb
b/activerecord/lib/active_record/validations/uniqueness.rb
index 711086d..bc0f38c 100644
--- a/activerecord/lib/active_record/validations/uniqueness.rb
+++ b/activerecord/lib/active_record/validations/uniqueness.rb
@@ -148,7 +148,7 @@ module ActiveRecord
end
finder_class.with_exclusive_scope do
- if finder_class.exists?([condition_sql,
*condition_params])
+ if finder_class.find(:first, :conditions =>
[condition_sql, *condition_params], :lock => true)
record.errors.add(attr_name, :taken, :default =>
configuration[:message], :value => value)
end
end
Actually, all we need is a shared lock, which on MySQL you could get
with :lock => "lock in share mode". Unfortunately, Postgres seems
to
have a different syntax, and there''s no abstraction for shared locks
in the DB adapters.
-- Andrew
On Sun, Aug 16, 2009 at 2:59 PM, Andrew Miklas<andrew@pagerduty.com> wrote:> > Hi, > > [My apologies if this isn''t the right list for this post.] > > The documentation for validates_uniqueness_of says that the validation > can''t protect against races that would result in duplicate insertions, > even if the DB is in serializable mode. I don''t think this is true: > the SQL serializable isolation mode is supposed to ensure that > transactions appear to have a well-defined execution order. If all > transactions appear to execute in a specific order, than doing a > SELECT to check for the existence of a key followed by an UPDATE > should be enough to protect against duplicate keys. > > You can test the race shown at > http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#M002167 > using MySQL 5 to see that serializable isolation does actually prevent > duplicate keys: > > [Session A] > create table comments (title varchar(100), content varchar(100)) > engine InnoDB; > set transaction isolation level serializable; > begin; > select * from comments where title = "My Post"; > > [Session B] > set transaction isolation level serializable; > begin; > select * from comments where title = "My Post"; > > [Session A] > insert into comments (title, content) values ("My Post", "hi!"); > [NOTE: We begin to wait on a lock here] > > [Session B] > insert into comments (title, content) values ("My Post", "hello!"); > [NOTE: Deadlock detected, Session B''s transaction is aborted] > > [Session A] > [insert has unblocked] > commit; > > > We can get the same protection even in the default repeatable read > isolation mode by explicitly taking out a lock during the select. So, > with this patch, validates_uniqueness_of should work correctly even if > the DB is running only in repeatable read: > > diff --git a/activerecord/lib/active_record/validations/uniqueness.rb > b/activerecord/lib/active_record/validations/uniqueness.rb > index 711086d..bc0f38c 100644 > --- a/activerecord/lib/active_record/validations/uniqueness.rb > +++ b/activerecord/lib/active_record/validations/uniqueness.rb > @@ -148,7 +148,7 @@ module ActiveRecord > end > > finder_class.with_exclusive_scope do > - if finder_class.exists?([condition_sql, > *condition_params]) > + if finder_class.find(:first, :conditions => > [condition_sql, *condition_params], :lock => true) > record.errors.add(attr_name, :taken, :default => > configuration[:message], :value => value) > end > end > > Actually, all we need is a shared lock, which on MySQL you could get > with :lock => "lock in share mode". Unfortunately, Postgres seems to > have a different syntax, and there''s no abstraction for shared locks > in the DB adapters.Unfortunately, this is a pointless endeavor. It''s easier to prevent duplicate entries using a unique index. In both the unique index and your method, the error is raised by the database when the insert is attempted. The only reason to use a validation is to report a nice validation error message before attempting the insert. Your method doesn''t do that, because both select statements return no rows. Jeremy
On Aug 17, 2:06 am, Jeremy Evans <jeremyeva...@gmail.com> wrote:> Unfortunately, this is a pointless endeavor. It''s easier to prevent > duplicate entries using a unique index. In both the unique index and > your method, the error is raised by the database when the insert is > attempted. > > The only reason to use a validation is to report a nice validation > error message before attempting the insert. Your method doesn''t do > that, because both select statements return no rows.Yes. The patch also keeps the row locked indefinitely, even if we later on decide not to update the row (e.g. because other validations fail). Using a unique index is better.