Hi, [My apologies if this isn''t the right list for this post.] The documentation for validates_uniqueness_of says that the validation can''t protect against races that would result in duplicate insertions, even if the DB is in serializable mode. I don''t think this is true: the SQL serializable isolation mode is supposed to ensure that transactions appear to have a well-defined execution order. If all transactions appear to execute in a specific order, than doing a SELECT to check for the existence of a key followed by an UPDATE should be enough to protect against duplicate keys. You can test the race shown at http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#M002167 using MySQL 5 to see that serializable isolation does actually prevent duplicate keys: [Session A] create table comments (title varchar(100), content varchar(100)) engine InnoDB; set transaction isolation level serializable; begin; select * from comments where title = "My Post"; [Session B] set transaction isolation level serializable; begin; select * from comments where title = "My Post"; [Session A] insert into comments (title, content) values ("My Post", "hi!"); [NOTE: We begin to wait on a lock here] [Session B] insert into comments (title, content) values ("My Post", "hello!"); [NOTE: Deadlock detected, Session B''s transaction is aborted] [Session A] [insert has unblocked] commit; We can get the same protection even in the default repeatable read isolation mode by explicitly taking out a lock during the select. So, with this patch, validates_uniqueness_of should work correctly even if the DB is running only in repeatable read: diff --git a/activerecord/lib/active_record/validations/uniqueness.rb b/activerecord/lib/active_record/validations/uniqueness.rb index 711086d..bc0f38c 100644 --- a/activerecord/lib/active_record/validations/uniqueness.rb +++ b/activerecord/lib/active_record/validations/uniqueness.rb @@ -148,7 +148,7 @@ module ActiveRecord end finder_class.with_exclusive_scope do - if finder_class.exists?([condition_sql, *condition_params]) + if finder_class.find(:first, :conditions => [condition_sql, *condition_params], :lock => true) record.errors.add(attr_name, :taken, :default => configuration[:message], :value => value) end end Actually, all we need is a shared lock, which on MySQL you could get with :lock => "lock in share mode". Unfortunately, Postgres seems to have a different syntax, and there''s no abstraction for shared locks in the DB adapters. -- Andrew
On Sun, Aug 16, 2009 at 2:59 PM, Andrew Miklas<andrew@pagerduty.com> wrote:> > Hi, > > [My apologies if this isn''t the right list for this post.] > > The documentation for validates_uniqueness_of says that the validation > can''t protect against races that would result in duplicate insertions, > even if the DB is in serializable mode. I don''t think this is true: > the SQL serializable isolation mode is supposed to ensure that > transactions appear to have a well-defined execution order. If all > transactions appear to execute in a specific order, than doing a > SELECT to check for the existence of a key followed by an UPDATE > should be enough to protect against duplicate keys. > > You can test the race shown at > http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#M002167 > using MySQL 5 to see that serializable isolation does actually prevent > duplicate keys: > > [Session A] > create table comments (title varchar(100), content varchar(100)) > engine InnoDB; > set transaction isolation level serializable; > begin; > select * from comments where title = "My Post"; > > [Session B] > set transaction isolation level serializable; > begin; > select * from comments where title = "My Post"; > > [Session A] > insert into comments (title, content) values ("My Post", "hi!"); > [NOTE: We begin to wait on a lock here] > > [Session B] > insert into comments (title, content) values ("My Post", "hello!"); > [NOTE: Deadlock detected, Session B''s transaction is aborted] > > [Session A] > [insert has unblocked] > commit; > > > We can get the same protection even in the default repeatable read > isolation mode by explicitly taking out a lock during the select. So, > with this patch, validates_uniqueness_of should work correctly even if > the DB is running only in repeatable read: > > diff --git a/activerecord/lib/active_record/validations/uniqueness.rb > b/activerecord/lib/active_record/validations/uniqueness.rb > index 711086d..bc0f38c 100644 > --- a/activerecord/lib/active_record/validations/uniqueness.rb > +++ b/activerecord/lib/active_record/validations/uniqueness.rb > @@ -148,7 +148,7 @@ module ActiveRecord > end > > finder_class.with_exclusive_scope do > - if finder_class.exists?([condition_sql, > *condition_params]) > + if finder_class.find(:first, :conditions => > [condition_sql, *condition_params], :lock => true) > record.errors.add(attr_name, :taken, :default => > configuration[:message], :value => value) > end > end > > Actually, all we need is a shared lock, which on MySQL you could get > with :lock => "lock in share mode". Unfortunately, Postgres seems to > have a different syntax, and there''s no abstraction for shared locks > in the DB adapters.Unfortunately, this is a pointless endeavor. It''s easier to prevent duplicate entries using a unique index. In both the unique index and your method, the error is raised by the database when the insert is attempted. The only reason to use a validation is to report a nice validation error message before attempting the insert. Your method doesn''t do that, because both select statements return no rows. Jeremy
On Aug 17, 2:06 am, Jeremy Evans <jeremyeva...@gmail.com> wrote:> Unfortunately, this is a pointless endeavor. It''s easier to prevent > duplicate entries using a unique index. In both the unique index and > your method, the error is raised by the database when the insert is > attempted. > > The only reason to use a validation is to report a nice validation > error message before attempting the insert. Your method doesn''t do > that, because both select statements return no rows.Yes. The patch also keeps the row locked indefinitely, even if we later on decide not to update the row (e.g. because other validations fail). Using a unique index is better.