Xue jiufei
2012-Aug-14 02:03 UTC
[Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list
A parallel umount on 4 nodes triggered a bug in dlm_process_recovery_data().
Here's the situation:

On receiving a MIG_LOCKRES message, a node processes the locks in the
migratable lockres. It copies the lvb from the migratable lockres when
processing the first valid lock. If there is a lock in the blocked list at
EX level, it triggers the BUG. Since valid lvbs are set when locks are
granted at EX or PR level, locks in the blocked list cannot have valid lvbs.
Therefore I think we should skip the locks in the blocked list.

Signed-off-by: Xuejiufei <xuejiufei at huawei.com>
---
 fs/ocfs2/dlm/dlmrecovery.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 01ebfd0..15d81ad 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1887,6 +1887,13 @@ static int dlm_process_recovery_data(struct dlm_ctxt *dlm,
 
 		if (ml->type == LKM_NLMODE)
 			goto skip_lvb;
+
+		/*
+		 * If the lock is in the blocked list it can't have a valid lvb,
+		 * so skip it
+		 */
+		if (ml->list == DLM_BLOCKED_LIST)
+			goto skip_lvb;
 
 		if (!dlm_lvb_is_empty(mres->lvb)) {
 			if (lksb->flags & DLM_LKSB_PUT_LVB) {
-- 
1.7.9.7
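The scenario described above can be modeled in user space. The following is only a minimal sketch, not the kernel code: every `sketch_`/`SKETCH_` name is hypothetical, the LVB length is arbitrary, and the consistency check that stands in for "the BUG" is an assumption (it rejects an EX lock, or a mismatching LVB, once the resource LVB has already been filled), since the thread does not show the check that actually fired. The `skip_blocked` flag plays the role of the proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's lock modes and queue ids. */
enum sketch_mode { SKETCH_NL, SKETCH_PR, SKETCH_EX };
enum sketch_list { SKETCH_GRANTED, SKETCH_CONVERTING, SKETCH_BLOCKED };

#define SKETCH_LVB_LEN 8 /* arbitrary length for the sketch */

struct sketch_lock {
	enum sketch_mode type;
	enum sketch_list list;
};

/* An LVB counts as empty when every byte is zero. */
static bool sketch_lvb_is_empty(const unsigned char *lvb)
{
	for (size_t i = 0; i < SKETCH_LVB_LEN; i++)
		if (lvb[i])
			return false;
	return true;
}

/* Mirrors the two "goto skip_lvb" tests: NL mode, and (with the patch) blocked list. */
static bool sketch_lock_uses_lvb(const struct sketch_lock *ml, bool skip_blocked)
{
	if (ml->type == SKETCH_NL)
		return false;
	if (skip_blocked && ml->list == SKETCH_BLOCKED)
		return false;
	return true;
}

/*
 * Walk the migrated locks and copy the migrated LVB into the resource LVB.
 * Returns 0 on success, or -1 at the point where this model assumes the
 * real code would hit its BUG.
 */
static int sketch_process_locks(const struct sketch_lock *locks, size_t n,
				const unsigned char *mres_lvb,
				unsigned char *res_lvb, bool skip_blocked)
{
	assert(locks && mres_lvb && res_lvb);

	for (size_t i = 0; i < n; i++) {
		const struct sketch_lock *ml = &locks[i];

		if (!sketch_lock_uses_lvb(ml, skip_blocked))
			continue;
		if (sketch_lvb_is_empty(mres_lvb))
			continue;

		/* Assumed check: resource LVB already set by an earlier lock. */
		if (!sketch_lvb_is_empty(res_lvb) &&
		    (ml->type == SKETCH_EX ||
		     memcmp(res_lvb, mres_lvb, SKETCH_LVB_LEN) != 0))
			return -1;

		memcpy(res_lvb, mres_lvb, SKETCH_LVB_LEN);
	}
	return 0;
}
```

With a granted PR lock followed by a blocked EX lock and a non-empty migrated LVB, calling `sketch_process_locks()` with `skip_blocked` set to `false` returns -1 in this model, while `true` returns 0 and leaves the resource LVB equal to the migrated one.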
Sunil Mushran
2012-Aug-14 16:03 UTC
[Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list
On Mon, Aug 13, 2012 at 7:03 PM, Xue jiufei <xuejiufei at huawei.com> wrote:

> A parallel umount on 4 nodes triggered a bug in dlm_process_recovery_data().
> Here's the situation:
>
> On receiving a MIG_LOCKRES message, a node processes the locks in the
> migratable lockres. It copies the lvb from the migratable lockres when
> processing the first valid lock. If there is a lock in the blocked list at
> EX level, it triggers the BUG. Since valid lvbs are set when locks are
> granted at EX or PR level, locks in the blocked list cannot have valid lvbs.
> Therefore I think we should skip the locks in the blocked list.
>
> Signed-off-by: Xuejiufei <xuejiufei at huawei.com>
> ---
>  fs/ocfs2/dlm/dlmrecovery.c |    7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 01ebfd0..15d81ad 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -1887,6 +1887,13 @@ static int dlm_process_recovery_data(struct dlm_ctxt *dlm,
>
> 		if (ml->type == LKM_NLMODE)
> 			goto skip_lvb;
> +
> +		/*
> +		 * If the lock is in the blocked list it can't have a valid lvb,
> +		 * so skip it
> +		 */
> +		if (ml->list == DLM_BLOCKED_LIST)
> +			goto skip_lvb;
>
> 		if (!dlm_lvb_is_empty(mres->lvb)) {
> 			if (lksb->flags & DLM_LKSB_PUT_LVB) {
> --

Looks reasonable. Just wanted to confirm. Did this BUG_ON in dlmrecovery.c
get tripped?

1903                                 /* otherwise, the node is sending its
1904                                  * most recent valid lvb info */
1905                                 BUG_ON(ml->type != LKM_EXMODE &&
1906                                        ml->type != LKM_PRMODE);
Joel Becker
2012-Aug-15 06:41 UTC
[Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list
On Tue, Aug 14, 2012 at 10:03:17AM +0800, Xue jiufei wrote:

> A parallel umount on 4 nodes triggered a bug in dlm_process_recovery_data().
> Here's the situation:
>
> On receiving a MIG_LOCKRES message, a node processes the locks in the
> migratable lockres. It copies the lvb from the migratable lockres when
> processing the first valid lock. If there is a lock in the blocked list at
> EX level, it triggers the BUG. Since valid lvbs are set when locks are
> granted at EX or PR level, locks in the blocked list cannot have valid lvbs.
> Therefore I think we should skip the locks in the blocked list.
>
> Signed-off-by: Xuejiufei <xuejiufei at huawei.com>

This patch is now part of the fixes branch of ocfs2.git.

Joel

> ---
>  fs/ocfs2/dlm/dlmrecovery.c |    7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 01ebfd0..15d81ad 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -1887,6 +1887,13 @@ static int dlm_process_recovery_data(struct dlm_ctxt *dlm,
>
> 		if (ml->type == LKM_NLMODE)
> 			goto skip_lvb;
> +
> +		/*
> +		 * If the lock is in the blocked list it can't have a valid lvb,
> +		 * so skip it
> +		 */
> +		if (ml->list == DLM_BLOCKED_LIST)
> +			goto skip_lvb;
>
> 		if (!dlm_lvb_is_empty(mres->lvb)) {
> 			if (lksb->flags & DLM_LKSB_PUT_LVB) {
> --
> 1.7.9.7

-- 

"There is shadow under this red rock.
 (Come in under the shadow of this red rock)
 And I will show you something different from either
 Your shadow at morning striding behind you
 Or your shadow at evening rising to meet you.
 I will show you fear in a handful of dust."

http://www.jlbec.org/
jlbec at evilplan.org