Coly Li
2008-Dec-04 10:38 UTC
[Ocfs2-devel] [PATCH] ocfs2/dlm: fix lockres mastery race, v2
dlm_get_lock_resource() is supposed to return a lock resource with a proper master. If multiple
concurrent threads attempt to lookup the lockres for the same lockid, one or more threads are likely
to return a lockres without a proper master, if the lock mastery is underway.

This patch makes the threads wait in dlm_get_lock_resource() while the mastery is underway, ensuring
all threads return the lockres with a proper master.

This issue is known limited to users using the flock() syscall. For all other fs operations, dlmglue
should ensure that only one thread is performing dlm operations for a given lockid at any one time.

Thank Sunil for his review and comments.

Signed-off-by: Coly Li <coyli at suse.de>
Cc: Sunil Mushran <sunil.mushran at oracle.com>
Cc: Mark Fasheh <mfasheh at suse.com>
Cc: Jeff Mahoney <jeffm at suse.com>
---
 fs/ocfs2/dlm/dlmmaster.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 44f87ca..dd1e754 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -742,6 +742,17 @@ lookup:
 			goto lookup;
 		}
 
+		/* wait for lock resource is being mastered by another thread */
+		spin_lock(&tmpres->spinlock);
+		if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
+			__dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_IN_PROGRESS);
+			spin_unlock(&tmpres->spinlock);
+			dlm_lockres_put(tmpres);
+			tmpres = NULL;
+			goto lookup;
+		}
+		spin_unlock(&tmpres->spinlock);
+
 		mlog(0, "found in hash!\n");
 		if (res)
 			dlm_lockres_put(res);
-- 
Coly Li
SuSE PRC Labs
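
For readers unfamiliar with the pattern the patch description relies on ("wait while mastery is underway, then return a lockres with a proper master"), here is a rough userspace sketch using POSIX threads. It is an analogy only, not kernel code and not the dlm implementation: `struct lockres`, `OWNER_UNKNOWN`, `lockres_init()`, `lockres_set_owner()`, `lockres_get_owner()` and `run_demo()` are all hypothetical names invented for this illustration, and a mutex plus condition variable stands in for the lockres spinlock and its wait queue. Unlike the patch, which drops its reference and jumps back to `lookup:` after waiting, the sketch simply re-checks the owner in a loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for DLM_LOCK_RES_OWNER_UNKNOWN. */
#define OWNER_UNKNOWN (-1)

/* Hypothetical, much simplified lock resource. */
struct lockres {
	pthread_mutex_t lock;     /* protects owner */
	pthread_cond_t  mastered; /* signalled once owner is known */
	int owner;
};

static void lockres_init(struct lockres *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->mastered, NULL);
	r->owner = OWNER_UNKNOWN;
}

/* Mastering side: publish the owner and wake every waiting lookup. */
static void lockres_set_owner(struct lockres *r, int node)
{
	assert(node != OWNER_UNKNOWN);
	pthread_mutex_lock(&r->lock);
	r->owner = node;
	pthread_cond_broadcast(&r->mastered);
	pthread_mutex_unlock(&r->lock);
}

/* Lookup side: never return while the owner is still unknown. */
static int lockres_get_owner(struct lockres *r)
{
	int owner;

	pthread_mutex_lock(&r->lock);
	while (r->owner == OWNER_UNKNOWN)
		pthread_cond_wait(&r->mastered, &r->lock);
	owner = r->owner;
	pthread_mutex_unlock(&r->lock);
	return owner;
}

/* Thread body playing the role of the thread that performs mastery. */
static void *master_thread(void *arg)
{
	lockres_set_owner(arg, 3); /* node number 3 is arbitrary */
	return NULL;
}

/*
 * Run one lookup concurrently with mastery and return the owner the
 * lookup observed. The result is the same whichever thread runs first.
 */
static int run_demo(void)
{
	struct lockres r;
	pthread_t t;
	int owner;

	lockres_init(&r);
	if (pthread_create(&t, NULL, master_thread, &r) != 0)
		return OWNER_UNKNOWN;
	owner = lockres_get_owner(&r);
	pthread_join(t, NULL);
	pthread_cond_destroy(&r.mastered);
	pthread_mutex_destroy(&r.lock);
	return owner;
}
```

Calling `run_demo()` from a `main()` and building with `-pthread` exercises both sides. The sketch says nothing about whether waiting at this point is the right fix in the dlm itself; it only shows the shape of the wait being proposed.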
Sunil Mushran
2008-Dec-04 20:17 UTC
[Ocfs2-devel] [PATCH] ocfs2/dlm: fix lockres mastery race, v2
nak

The reason is listed in yesterday's email.

Coly Li wrote:
> dlm_get_lock_resource() is supposed to return a lock resource with a proper master. If multiple
> concurrent threads attempt to lookup the lockres for the same lockid, one or more threads are likely
> to return a lockres without a proper master, if the lock mastery is underway.
>
> This patch makes the threads wait in dlm_get_lock_resource() while the mastery is underway, ensuring
> all threads return the lockres with a proper master.
>
> This issue is known limited to users using the flock() syscall. For all other fs operations, dlmglue
> should ensure that only one thread is performing dlm operations for a given lockid at any one time.
>
> Thank Sunil for his review and comments.
>
> Signed-off-by: Coly Li <coyli at suse.de>
> Cc: Sunil Mushran <sunil.mushran at oracle.com>
> Cc: Mark Fasheh <mfasheh at suse.com>
> Cc: Jeff Mahoney <jeffm at suse.com>
> ---
>  fs/ocfs2/dlm/dlmmaster.c |   11 +++++++++++
>  1 files changed, 11 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> index 44f87ca..dd1e754 100644
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -742,6 +742,17 @@ lookup:
>  			goto lookup;
>  		}
>  
> +		/* wait for lock resource is being mastered by another thread */
> +		spin_lock(&tmpres->spinlock);
> +		if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> +			__dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_IN_PROGRESS);
> +			spin_unlock(&tmpres->spinlock);
> +			dlm_lockres_put(tmpres);
> +			tmpres = NULL;
> +			goto lookup;
> +		}
> +		spin_unlock(&tmpres->spinlock);
> +
>  		mlog(0, "found in hash!\n");
>  		if (res)
>  			dlm_lockres_put(res);
>