akpm at linux-foundation.org
2014-Mar-19 21:10 UTC
[Ocfs2-devel] [patch 5/8] ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless loop during umount
From: jiangyiwen <jiangyiwen at huawei.com>
Subject: ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless loop during umount

The following case may lead to an endless loop during umount.
node A            node B                    node C                            node D
umount volume,
migrate lockres1
to B
                                                                              want to lock lockres1,
                                                                              send
                                                                              MASTER_REQUEST_MSG
                                                                              to C
                                            init block mle
                  send
                  MIGRATE_REQUEST_MSG
                  to C
                                            find a block
                                            mle, and then
                                            return
                                            DLM_MIGRATE_RESPONSE_MASTERY_REF
                                            to B
                  set C in refmap
                                            umount successfully
                  try to umount, endless
                  loop occurs when migrate
                  lockres1 since C is in
                  refmap
So we can fix this endless loop by returning
DLM_MIGRATE_RESPONSE_MASTERY_REF only when the node that receives
MIGRATE_REQUEST_MSG finds a mastery mle. A block mle no longer makes the
requester set the replying node in the refmap.
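
As a rough illustration of the rule described above, the following self-contained C sketch models the reply to MIGRATE_REQUEST_MSG and its effect on the requester's refmap. It is a toy model, not kernel code: the enum names, the 32-bit refmap, the node index for C, and the helper functions are all hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the o2dlm definitions; names and values are illustrative. */
enum mle_type { MLE_BLOCK, MLE_MASTER };
enum migrate_response {
	MIGRATE_RESPONSE_OK = 0,
	MIGRATE_RESPONSE_MASTERY_REF = 1
};

/*
 * Model of the reply a node gives to MIGRATE_REQUEST_MSG when it finds an
 * existing block or mastery mle for the lock resource.
 * with_fix == false: pre-patch behaviour, always ask for a mastery ref.
 * with_fix == true:  patched behaviour, only a mastery mle asks for one.
 */
static enum migrate_response reply_to_migrate_request(enum mle_type found,
							bool with_fix)
{
	if (!with_fix || found == MLE_MASTER)
		return MIGRATE_RESPONSE_MASTERY_REF;
	return MIGRATE_RESPONSE_OK;
}

/* Model of the requester: set the replying node in the refmap on MASTERY_REF. */
static uint32_t record_reply(uint32_t refmap, unsigned int node,
			     enum migrate_response r)
{
	assert(node < 32);
	if (r == MIGRATE_RESPONSE_MASTERY_REF)
		refmap |= UINT32_C(1) << node;
	return refmap;
}

/* Hypothetical helper: node C (index 2 here) holds only a block mle. */
static uint32_t refmap_after_scenario(bool with_fix)
{
	const unsigned int node_c = 2;

	return record_reply(0, node_c,
			    reply_to_migrate_request(MLE_BLOCK, with_fix));
}
```

In this model, `refmap_after_scenario(false)` leaves C's bit set even though C only had a block mle, which corresponds to the stale refmap entry in the scenario, while `refmap_after_scenario(true)` leaves the refmap empty.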
[akpm at linux-foundation.org: coding-style fixes]
Signed-off-by: jiangyiwen <jiangyiwen at huawei.com>
Cc: Mark Fasheh <mfasheh at suse.com>
Cc: Joel Becker <jlbec at evilplan.org>
Cc: Xue jiufei <xuejiufei at huawei.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
---
fs/ocfs2/dlm/dlmmaster.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff -puN fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount fs/ocfs2/dlm/dlmmaster.c
--- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount
+++ a/fs/ocfs2/dlm/dlmmaster.c
@@ -3084,11 +3084,15 @@ static int dlm_add_migration_mle(struct
 			/* remove it so that only one mle will be found */
 			__dlm_unlink_mle(dlm, tmp);
 			__dlm_mle_detach_hb_events(dlm, tmp);
-			ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
-			mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
-			     "telling master to get ref for cleared out mle "
-			     "during migration\n", dlm->name, namelen, name,
-			     master, new_master);
+			if (tmp->type == DLM_MLE_MASTER) {
+				ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
+				mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
+				     "telling master to get ref "
+				     "for cleared out mle during "
+				     "migration\n", dlm->name,
+				     namelen, name, master,
+				     new_master);
+			}
 		}
 		spin_unlock(&tmp->spinlock);
 	}
_
Mark Fasheh
2014-Mar-31 02:23 UTC
[Ocfs2-devel] [patch 5/8] ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless loop during umount
On Wed, Mar 19, 2014 at 02:10:03PM -0700, Andrew Morton wrote:
> From: jiangyiwen <jiangyiwen at huawei.com>
> Subject: ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop during umount
>
> The following case may lead to endless loop during umount.
>
> node A            node B                    node C                            node D
> umount volume,
> migrate lockres1
> to B
>                                                                               want to lock lockres1,
>                                                                               send
>                                                                               MASTER_REQUEST_MSG
>                                                                               to C
>                                             init block mle
>                   send
>                   MIGRATE_REQUEST_MSG
>                   to C
>                                             find a block
>                                             mle, and then
>                                             return
>                                             DLM_MIGRATE_RESPONSE_MASTERY_REF
>                                             to B
>                   set C in refmap
>                                             umount successfully
>                   try to umount, endless
>                   loop occurs when migrate
>                   lockres1 since C is in
>                   refmap
>
> So we can fix this endless loop case by only returning
> DLM_MIGRATE_RESPONSE_MASTERY_REF if it has a mastery mle when receiving
> MIGRATE_REQUEST_MSG.
>
> [akpm at linux-foundation.org: coding-style fixes]
> Signed-off-by: jiangyiwen <jiangyiwen at huawei.com>
> Cc: Mark Fasheh <mfasheh at suse.com>
> Cc: Joel Becker <jlbec at evilplan.org>
> Cc: Xue jiufei <xuejiufei at huawei.com>
> Signed-off-by: Andrew Morton <akpm at linux-foundation.org>

Ok, I _think_ I got this race condition, and the patch itself seems sane.
How was this bug hit, and how much testing did you do with this patch? I ask
because dlm changes can sometimes have unintended effects and I really don't
understand that particular code well enough right now to tell with 100%
certainty we didn't mess something else up. Actually, I'm going to CC Sunil
in the hopes he can look at this.
	--Mark

> ---
>
>  fs/ocfs2/dlm/dlmmaster.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff -puN fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount fs/ocfs2/dlm/dlmmaster.c
> --- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount
> +++ a/fs/ocfs2/dlm/dlmmaster.c
> @@ -3084,11 +3084,15 @@ static int dlm_add_migration_mle(struct
>  			/* remove it so that only one mle will be found */
>  			__dlm_unlink_mle(dlm, tmp);
>  			__dlm_mle_detach_hb_events(dlm, tmp);
> -			ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
> -			mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
> -			     "telling master to get ref for cleared out mle "
> -			     "during migration\n", dlm->name, namelen, name,
> -			     master, new_master);
> +			if (tmp->type == DLM_MLE_MASTER) {
> +				ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
> +				mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
> +				     "telling master to get ref "
> +				     "for cleared out mle during "
> +				     "migration\n", dlm->name,
> +				     namelen, name, master,
> +				     new_master);
> +			}
>  		}
>  		spin_unlock(&tmp->spinlock);
>  	}
> _

--
Mark Fasheh