Wengang Wang
2010-Sep-02 13:17 UTC
[Ocfs2-devel] [PATCH] ocfs2/dlm: remove a potential deadlock
/*
 *
 * spinlock lock ordering: if multiple locks are needed, obey this ordering:
 *     dlm_domain_lock
 *     struct dlm_ctxt->spinlock
 *     struct dlm_lock_resource->spinlock
 *     struct dlm_ctxt->master_lock
 *     struct dlm_ctxt->ast_lock
 *     dlm_master_list_entry->spinlock
 *     dlm_lock->spinlock
 *
 */

There is a violation of this ordering: dlm_lock_resource->spinlock is taken
after dlm_ctxt->ast_lock. The violation goes undetected because, so far, this
is the only place where the two spinlocks are held at the same time, but it
becomes a potential deadlock as soon as another path takes the two locks
together in the documented order.

This patch fixes the problem by changing the locking order to match the
documented one.

Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
---
 fs/ocfs2/dlm/dlmthread.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
index 2211acf..676ff3e 100644
--- a/fs/ocfs2/dlm/dlmthread.c
+++ b/fs/ocfs2/dlm/dlmthread.c
@@ -671,8 +671,8 @@ static int dlm_thread(void *data)
 			/* lockres can be re-dirtied/re-added to the
 			 * dirty_list in this gap, but that is ok */
 
-			spin_lock(&dlm->ast_lock);
 			spin_lock(&res->spinlock);
+			spin_lock(&dlm->ast_lock);
 			if (res->owner != dlm->node_num) {
 				__dlm_print_one_lock_resource(res);
 				mlog(ML_ERROR, "inprog:%s, mig:%s, reco:%s, dirty:%s\n",
@@ -691,8 +691,8 @@ static int dlm_thread(void *data)
 					  DLM_LOCK_RES_RECOVERING)) {
 				/* move it to the tail and keep going */
 				res->state &= ~DLM_LOCK_RES_DIRTY;
-				spin_unlock(&res->spinlock);
 				spin_unlock(&dlm->ast_lock);
+				spin_unlock(&res->spinlock);
 				mlog(0, "delaying list shuffling for in-"
 				     "progress lockres %.*s, state=%d\n",
 				     res->lockname.len, res->lockname.name,
@@ -713,8 +713,8 @@ static int dlm_thread(void *data)
 			/* called while holding lockres lock */
 			dlm_shuffle_lists(dlm, res);
 			res->state &= ~DLM_LOCK_RES_DIRTY;
-			spin_unlock(&res->spinlock);
 			spin_unlock(&dlm->ast_lock);
+			spin_unlock(&res->spinlock);
 
 			dlm_lockres_calc_usage(dlm, res);
 
-- 
1.7.2.2
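The bug being removed is the classic ABBA lock-inversion pattern: one path takes lock A then lock B while another takes B then A, and once both run concurrently each can end up holding the lock the other is waiting for. The following is a minimal, hypothetical userspace sketch, not ocfs2/dlm code; pthread mutexes merely stand in for res->spinlock and dlm->ast_lock to show why the inconsistent ordering only becomes a deadlock once a second code path takes both locks in the documented order.

/* Illustrative userspace sketch only -- not ocfs2/dlm code.  pthread
 * mutexes stand in for res->spinlock and dlm->ast_lock to demonstrate
 * the ABBA inversion.  Build with: gcc -pthread abba.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t res_spinlock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for res->spinlock */
static pthread_mutex_t ast_lock     = PTHREAD_MUTEX_INITIALIZER; /* stand-in for dlm->ast_lock */

/* Path that follows the documented order: res->spinlock, then ast_lock. */
static void *documented_order(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&res_spinlock);
		pthread_mutex_lock(&ast_lock);
		pthread_mutex_unlock(&ast_lock);
		pthread_mutex_unlock(&res_spinlock);
	}
	return NULL;
}

/* Path that mirrors the pre-patch dlm_thread(): ast_lock first, then
 * res->spinlock -- the reversed order. */
static void *reversed_order(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&ast_lock);
		pthread_mutex_lock(&res_spinlock);
		pthread_mutex_unlock(&res_spinlock);
		pthread_mutex_unlock(&ast_lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, documented_order, NULL);
	pthread_create(&b, NULL, reversed_order, NULL);

	/* Sooner or later thread A holds res_spinlock and waits for ast_lock
	 * while thread B holds ast_lock and waits for res_spinlock; neither
	 * can make progress.  This is the deadlock the ordering rule exists
	 * to prevent. */
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("never reached once the ABBA deadlock hits\n");
	return 0;
}

With the patch applied, dlm_thread() takes res->spinlock before dlm->ast_lock, matching the documented order, so the cycle sketched above can no longer form.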
Wengang Wang
2010-Sep-02 14:43 UTC
[Ocfs2-devel] [PATCH] ocfs2/dlm: remove a potential deadlock
I am taking back this patch since where is another such violation in
dlm_flush_asts().

I will try to fix both in one combined patch.

regards,
wengang.

On 10-09-02 21:17, Wengang Wang wrote:
> /*
>  *
>  * spinlock lock ordering: if multiple locks are needed, obey this ordering:
>  *     dlm_domain_lock
>  *     struct dlm_ctxt->spinlock
>  *     struct dlm_lock_resource->spinlock
>  *     struct dlm_ctxt->master_lock
>  *     struct dlm_ctxt->ast_lock
>  *     dlm_master_list_entry->spinlock
>  *     dlm_lock->spinlock
>  *
>  */
>
> There is a violation of this ordering: dlm_lock_resource->spinlock is taken
> after dlm_ctxt->ast_lock. The violation goes undetected because, so far, this
> is the only place where the two spinlocks are held at the same time, but it
> becomes a potential deadlock as soon as another path takes the two locks
> together in the documented order.
>
> This patch fixes the problem by changing the locking order to match the
> documented one.
>
> Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
> ---
>  fs/ocfs2/dlm/dlmthread.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
> index 2211acf..676ff3e 100644
> --- a/fs/ocfs2/dlm/dlmthread.c
> +++ b/fs/ocfs2/dlm/dlmthread.c
> @@ -671,8 +671,8 @@ static int dlm_thread(void *data)
>  			/* lockres can be re-dirtied/re-added to the
>  			 * dirty_list in this gap, but that is ok */
>  
> -			spin_lock(&dlm->ast_lock);
>  			spin_lock(&res->spinlock);
> +			spin_lock(&dlm->ast_lock);
>  			if (res->owner != dlm->node_num) {
>  				__dlm_print_one_lock_resource(res);
>  				mlog(ML_ERROR, "inprog:%s, mig:%s, reco:%s, dirty:%s\n",
> @@ -691,8 +691,8 @@ static int dlm_thread(void *data)
>  					  DLM_LOCK_RES_RECOVERING)) {
>  				/* move it to the tail and keep going */
>  				res->state &= ~DLM_LOCK_RES_DIRTY;
> -				spin_unlock(&res->spinlock);
>  				spin_unlock(&dlm->ast_lock);
> +				spin_unlock(&res->spinlock);
>  				mlog(0, "delaying list shuffling for in-"
>  				     "progress lockres %.*s, state=%d\n",
>  				     res->lockname.len, res->lockname.name,
> @@ -713,8 +713,8 @@ static int dlm_thread(void *data)
>  			/* called while holding lockres lock */
>  			dlm_shuffle_lists(dlm, res);
>  			res->state &= ~DLM_LOCK_RES_DIRTY;
> -			spin_unlock(&res->spinlock);
>  			spin_unlock(&dlm->ast_lock);
> +			spin_unlock(&res->spinlock);
>  
>  			dlm_lockres_calc_usage(dlm, res);
>  
> -- 
> 1.7.2.2
Wengang Wang
2010-Sep-02 14:53 UTC
[Ocfs2-devel] [PATCH] ocfs2/dlm: remove a potential deadlock
On 10-09-02 22:43, Wengang Wang wrote:
> I am taking back this patch since where is another such violation in

Sorry for the typo. s/where/there

regards,
wengang.

> dlm_flush_asts().
>
> I will try to fix both in one combined patch.
>
> regards,
> wengang.
>
> On 10-09-02 21:17, Wengang Wang wrote:
> > /*
> >  *
> >  * spinlock lock ordering: if multiple locks are needed, obey this ordering:
> >  *     dlm_domain_lock
> >  *     struct dlm_ctxt->spinlock
> >  *     struct dlm_lock_resource->spinlock
> >  *     struct dlm_ctxt->master_lock
> >  *     struct dlm_ctxt->ast_lock
> >  *     dlm_master_list_entry->spinlock
> >  *     dlm_lock->spinlock
> >  *
> >  */
> >
> > There is a violation of this ordering: dlm_lock_resource->spinlock is taken
> > after dlm_ctxt->ast_lock. The violation goes undetected because, so far, this
> > is the only place where the two spinlocks are held at the same time, but it
> > becomes a potential deadlock as soon as another path takes the two locks
> > together in the documented order.
> >
> > This patch fixes the problem by changing the locking order to match the
> > documented one.
> >
> > Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
> > ---
> >  fs/ocfs2/dlm/dlmthread.c |    6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
> > index 2211acf..676ff3e 100644
> > --- a/fs/ocfs2/dlm/dlmthread.c
> > +++ b/fs/ocfs2/dlm/dlmthread.c
> > @@ -671,8 +671,8 @@ static int dlm_thread(void *data)
> >  			/* lockres can be re-dirtied/re-added to the
> >  			 * dirty_list in this gap, but that is ok */
> >  
> > -			spin_lock(&dlm->ast_lock);
> >  			spin_lock(&res->spinlock);
> > +			spin_lock(&dlm->ast_lock);
> >  			if (res->owner != dlm->node_num) {
> >  				__dlm_print_one_lock_resource(res);
> >  				mlog(ML_ERROR, "inprog:%s, mig:%s, reco:%s, dirty:%s\n",
> > @@ -691,8 +691,8 @@ static int dlm_thread(void *data)
> >  					  DLM_LOCK_RES_RECOVERING)) {
> >  				/* move it to the tail and keep going */
> >  				res->state &= ~DLM_LOCK_RES_DIRTY;
> > -				spin_unlock(&res->spinlock);
> >  				spin_unlock(&dlm->ast_lock);
> > +				spin_unlock(&res->spinlock);
> >  				mlog(0, "delaying list shuffling for in-"
> >  				     "progress lockres %.*s, state=%d\n",
> >  				     res->lockname.len, res->lockname.name,
> > @@ -713,8 +713,8 @@ static int dlm_thread(void *data)
> >  			/* called while holding lockres lock */
> >  			dlm_shuffle_lists(dlm, res);
> >  			res->state &= ~DLM_LOCK_RES_DIRTY;
> > -			spin_unlock(&res->spinlock);
> >  			spin_unlock(&dlm->ast_lock);
> > +			spin_unlock(&res->spinlock);
> >  
> >  			dlm_lockres_calc_usage(dlm, res);
> >  
> > -- 
> > 1.7.2.2