akpm at linux-foundation.org
2014-Mar-19 21:10 UTC
[Ocfs2-devel] [patch 5/8] ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless, loop during umount
From: jiangyiwen <jiangyiwen at huawei.com>
Subject: ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop during umount

The following case may lead to endless loop during umount.

node A              node B                    node C                              node D
umount volume,
migrate lockres1
to B
                                                                                  want to lock lockres1,
                                                                                  send
                                                                                  MASTER_REQUEST_MSG
                                                                                  to C
                                              init block mle
                    send
                    MIGRATE_REQUEST_MSG
                    to C
                                              find a block
                                              mle, and then
                                              return
                                              DLM_MIGRATE_RESPONSE_MASTERY_REF
                                              to B
                    set C in refmap
                                              umount successfully
                    try to umount, endless
                    loop occurs when migrate
                    lockres1 since C is in
                    refmap

So we can fix this endless loop case by only returning
DLM_MIGRATE_RESPONSE_MASTERY_REF if it has a mastery mle when receiving
MIGRATE_REQUEST_MSG.

[akpm at linux-foundation.org: coding-style fixes]
Signed-off-by: jiangyiwen <jiangyiwen at huawei.com>
Cc: Mark Fasheh <mfasheh at suse.com>
Cc: Joel Becker <jlbec at evilplan.org>
Cc: Xue jiufei <xuejiufei at huawei.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
---

 fs/ocfs2/dlm/dlmmaster.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff -puN fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount fs/ocfs2/dlm/dlmmaster.c
--- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount
+++ a/fs/ocfs2/dlm/dlmmaster.c
@@ -3084,11 +3084,15 @@ static int dlm_add_migration_mle(struct
 			/* remove it so that only one mle will be found */
 			__dlm_unlink_mle(dlm, tmp);
 			__dlm_mle_detach_hb_events(dlm, tmp);
-			ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
-			mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
-			    "telling master to get ref for cleared out mle "
-			    "during migration\n", dlm->name, namelen, name,
-			    master, new_master);
+			if (tmp->type == DLM_MLE_MASTER) {
+				ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
+				mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
+				    "telling master to get ref "
+				    "for cleared out mle during "
+				    "migration\n", dlm->name,
+				    namelen, name, master,
+				    new_master);
+			}
 		}
 		spin_unlock(&tmp->spinlock);
 	}
_
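
[Editorial sketch, not part of the patch.] The decision the patch changes can be modelled in a few lines of standalone C. This is only an illustration under stated assumptions: the enum names, the struct, and both functions below are hypothetical stand-ins rather than the kernel's definitions, and the refmap is reduced to a plain bitmask. It shows that a stale block mle no longer produces a mastery-ref response, so the requesting node does not record a reference for the responder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mle types named in the patch. */
enum mle_type { MLE_BLOCK, MLE_MASTER };

/* Hypothetical response codes; not the kernel's actual values. */
enum migrate_response { RESPONSE_OK, RESPONSE_MASTERY_REF };

/* Minimal stand-in for a master list entry: only the type matters here. */
struct mle {
	enum mle_type type;
};

/*
 * Responder side: a stale mle was found while handling a migrate request.
 * After the patch, only a mastery mle asks the requester to take a ref.
 */
enum migrate_response respond_to_migrate_request(const struct mle *stale)
{
	assert(stale != NULL);
	if (stale->type == MLE_MASTER)
		return RESPONSE_MASTERY_REF;
	return RESPONSE_OK;
}

/*
 * Requester side: set the responder's bit in the refmap only when asked.
 * Assumes node < 32 so the shift stays within an unsigned int.
 */
unsigned int apply_response(unsigned int refmap, unsigned int node,
			    enum migrate_response resp)
{
	if (resp == RESPONSE_MASTERY_REF)
		refmap |= 1u << node;
	return refmap;
}
```

In the scenario from the changelog, node C holds only a block mle, so under this model its bit stays clear in the refmap and a later umount on the new master is not left waiting on a reference that C never held.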
Mark Fasheh
2014-Mar-31 02:23 UTC
[Ocfs2-devel] [patch 5/8] ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless, loop during umount
On Wed, Mar 19, 2014 at 02:10:03PM -0700, Andrew Morton wrote:
> From: jiangyiwen <jiangyiwen at huawei.com>
> Subject: ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop during umount
>
> The following case may lead to endless loop during umount.
>
> node A              node B                    node C                              node D
> umount volume,
> migrate lockres1
> to B
>                                                                                   want to lock lockres1,
>                                                                                   send
>                                                                                   MASTER_REQUEST_MSG
>                                                                                   to C
>                                               init block mle
>                     send
>                     MIGRATE_REQUEST_MSG
>                     to C
>                                               find a block
>                                               mle, and then
>                                               return
>                                               DLM_MIGRATE_RESPONSE_MASTERY_REF
>                                               to B
>                     set C in refmap
>                                               umount successfully
>                     try to umount, endless
>                     loop occurs when migrate
>                     lockres1 since C is in
>                     refmap
>
> So we can fix this endless loop case by only returning
> DLM_MIGRATE_RESPONSE_MASTERY_REF if it has a mastery mle when receiving
> MIGRATE_REQUEST_MSG.
>
> [akpm at linux-foundation.org: coding-style fixes]
> Signed-off-by: jiangyiwen <jiangyiwen at huawei.com>
> Cc: Mark Fasheh <mfasheh at suse.com>
> Cc: Joel Becker <jlbec at evilplan.org>
> Cc: Xue jiufei <xuejiufei at huawei.com>
> Signed-off-by: Andrew Morton <akpm at linux-foundation.org>

Ok, I _think_ I got this race condition, and the patch itself seems sane.
How was this bug hit, and how much testing did you do with this patch?

I ask because dlm changes can sometimes have unintended effects and I really
don't understand that particular code well enough right now to tell with
100% certainty we didn't mess something else up.

Actually, I'm going to CC Sunil in the hopes he can look at this.
	--Mark

> ---
>
>  fs/ocfs2/dlm/dlmmaster.c |   14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff -puN fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount fs/ocfs2/dlm/dlmmaster.c
> --- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount
> +++ a/fs/ocfs2/dlm/dlmmaster.c
> @@ -3084,11 +3084,15 @@ static int dlm_add_migration_mle(struct
>  			/* remove it so that only one mle will be found */
>  			__dlm_unlink_mle(dlm, tmp);
>  			__dlm_mle_detach_hb_events(dlm, tmp);
> -			ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
> -			mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
> -			    "telling master to get ref for cleared out mle "
> -			    "during migration\n", dlm->name, namelen, name,
> -			    master, new_master);
> +			if (tmp->type == DLM_MLE_MASTER) {
> +				ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
> +				mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
> +				    "telling master to get ref "
> +				    "for cleared out mle during "
> +				    "migration\n", dlm->name,
> +				    namelen, name, master,
> +				    new_master);
> +			}
>  		}
>  		spin_unlock(&tmp->spinlock);
>  	}
> _

--
Mark Fasheh