xiaowei.hu at oracle.com
2012-Feb-21 06:12 UTC
[Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.
I am trying to fix bug 13611997. CT's machine ran into a BUG in the ocfs2dc thread:

    BUG_ON(lockres->l_action != OCFS2_AST_CONVERT &&
           lockres->l_action != OCFS2_AST_DOWNCONVERT);

I analyzed the vmcore: lockres->l_action = OCFS2_AST_ATTACH and l_flags = 326 (which means OCFS2_LOCK_BUSY|OCFS2_LOCK_BLOCKED|OCFS2_LOCK_INITIALIZED|OCFS2_LOCK_QUEUED). Comparing this with the code, this state is only possible during ocfs2_cluster_lock(). Here is the race:

NodeA                                                NodeB
ocfs2_cluster_lock on a new lockres M
  spin_lock_irqsave(&lockres->l_lock, flags);
  gen = lockres_set_pending(lockres);
  lockres->l_action = OCFS2_AST_ATTACH;
  lockres_or_flags(lockres, OCFS2_LOCK_BUSY);
  spin_unlock_irqrestore(&lockres->l_lock, flags);

  ocfs2_dlm_lock() finished and returned.
**and lockres_clear_pending(lockres, gen, osb);
                                                     request a lock on the same lockres M
                                                     It is blocked by NodeA, and an AST proxy is sent to A

The BAST is queued and flushed before the AST is queued,
then ocfs2dc is scheduled,
so there is a chance to execute this code path:
  ocfs2_downconvert_thread()
    ocfs2_downconvert_thread_do_work()
      ocfs2_blocking_ast()
      ocfs2_process_blocked_lock()
        ocfs2_unblock_lock()
          spin_lock_irqsave(&lockres->l_lock, flags);
          if (lockres->l_flags & OCFS2_LOCK_BUSY)
            ret = ocfs2_prepare_cancel_convert(osb, lockres);
              BUG_ON(lockres->l_action != OCFS2_AST_CONVERT &&
                     lockres->l_action != OCFS2_AST_DOWNCONVERT);
              this triggers the BUG()

Solution:
One possible solution is to remove the lockres_clear_pending() call marked with two stars and leave the clearing to the AST function. This makes the BAST path wait for the AST, which clears OCFS2_LOCK_BUSY and sets OCFS2_LOCK_ATTACHED, before it enters the downconvert process.
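
For readers who want to check the l_flags decoding above, here is a minimal userspace sketch. The flag values are an assumption: they are written as recalled from fs/ocfs2/ocfs2.h in mainline and should be verified against the tree that produced the vmcore. The helper decode_lock_flags() and the table lock_flag_names are hypothetical names used only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Assumed values, as recalled from fs/ocfs2/ocfs2.h; verify against your tree. */
#define OCFS2_LOCK_ATTACHED      0x00000001UL
#define OCFS2_LOCK_BUSY          0x00000002UL
#define OCFS2_LOCK_BLOCKED       0x00000004UL
#define OCFS2_LOCK_LOCAL         0x00000008UL
#define OCFS2_LOCK_NEEDS_REFRESH 0x00000010UL
#define OCFS2_LOCK_REFRESHING    0x00000020UL
#define OCFS2_LOCK_INITIALIZED   0x00000040UL
#define OCFS2_LOCK_FREEING       0x00000080UL
#define OCFS2_LOCK_QUEUED        0x00000100UL
#define OCFS2_LOCK_NOCACHE       0x00000200UL
#define OCFS2_LOCK_PENDING       0x00000400UL

/* Pair each bit with its name; #x stringifies the macro name itself. */
#define FLAG(x) { x, #x }

struct flag_name {
	unsigned long bit;
	const char *name;
};

/* Hypothetical lookup table, ordered from the lowest bit upwards. */
static const struct flag_name lock_flag_names[] = {
	FLAG(OCFS2_LOCK_ATTACHED),   FLAG(OCFS2_LOCK_BUSY),
	FLAG(OCFS2_LOCK_BLOCKED),    FLAG(OCFS2_LOCK_LOCAL),
	FLAG(OCFS2_LOCK_NEEDS_REFRESH), FLAG(OCFS2_LOCK_REFRESHING),
	FLAG(OCFS2_LOCK_INITIALIZED), FLAG(OCFS2_LOCK_FREEING),
	FLAG(OCFS2_LOCK_QUEUED),     FLAG(OCFS2_LOCK_NOCACHE),
	FLAG(OCFS2_LOCK_PENDING),
};

/*
 * Write a '|'-separated list of the flag names set in l_flags into buf
 * and return buf. Output is silently truncated if buf is too small.
 */
static char *decode_lock_flags(unsigned long l_flags, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(lock_flag_names) / sizeof(lock_flag_names[0]); i++) {
		int n;

		if (!(l_flags & lock_flag_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", lock_flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* out of space: stop with what fits */
		used += (size_t)n;
	}
	return buf;
}
```

Under these assumed values, calling decode_lock_flags(326, buf, sizeof(buf)) with a 256-byte buffer yields the four flags listed in the message, since 326 decimal is 0x146. It also shows that neither OCFS2_LOCK_ATTACHED (0x1) nor OCFS2_LOCK_PENDING (0x400) is set in the dump, which fits a lockres whose attach request is still in flight after the pending flag was cleared.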
xiaowei.hu at oracle.com
2012-Feb-21 06:12 UTC
[Ocfs2-devel] [PATCH] fixing dlmglue race condition
From: Xiaowei.Hu <xiaowei.hu at oracle.com>

NodeA                                                NodeB
ocfs2_cluster_lock on a new lockres M
  spin_lock_irqsave(&lockres->l_lock, flags);
  gen = lockres_set_pending(lockres);
  lockres->l_action = OCFS2_AST_ATTACH;
  lockres_or_flags(lockres, OCFS2_LOCK_BUSY);
  spin_unlock_irqrestore(&lockres->l_lock, flags);

  ocfs2_dlm_lock() finished and returned.
**and lockres_clear_pending(lockres, gen, osb);
                                                     request a lock on the same lockres M
                                                     It is blocked by NodeA, and an AST proxy is sent to A

The BAST is queued and flushed before the AST is queued,
then ocfs2dc is scheduled,
so there is a chance to execute this code path, since the pending flag was cleared already:
  ocfs2_downconvert_thread()
    ocfs2_downconvert_thread_do_work()
      ocfs2_blocking_ast()
      ocfs2_process_blocked_lock()
        ocfs2_unblock_lock()
          spin_lock_irqsave(&lockres->l_lock, flags);
          if (lockres->l_flags & OCFS2_LOCK_BUSY)
            ret = ocfs2_prepare_cancel_convert(osb, lockres);
              BUG_ON(lockres->l_action != OCFS2_AST_CONVERT &&
                     lockres->l_action != OCFS2_AST_DOWNCONVERT);
              this triggers the BUG()
---
 fs/ocfs2/dlmglue.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 81a4cd2..6f5e516 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -1471,7 +1471,6 @@ again:
 			     lkm_flags,
 			     lockres->l_name,
 			     OCFS2_LOCK_ID_MAX_LEN - 1);
-	lockres_clear_pending(lockres, gen, osb);
 	if (ret) {
 		if (!(lkm_flags & DLM_LKF_NOQUEUE) ||
 		    (ret != -EAGAIN)) {
-- 
1.7.4.4
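
The state transitions described in the patch message can be sketched as a small userspace model. This is not kernel code: struct model_lockres, the model_* functions, and the enum names are hypothetical, the flag values are assumed from fs/ocfs2/ocfs2.h, and the "requeue while PENDING is set" branch follows the behaviour the message implies rather than any particular release.

```c
/* Assumed values, as recalled from fs/ocfs2/ocfs2.h; verify against your tree. */
#define OCFS2_LOCK_ATTACHED      0x00000001UL
#define OCFS2_LOCK_BUSY          0x00000002UL
#define OCFS2_LOCK_BLOCKED       0x00000004UL
#define OCFS2_LOCK_PENDING       0x00000400UL

/* Hypothetical stand-ins for the l_action values named in the thread. */
enum model_action {
	MODEL_AST_INVALID,
	MODEL_AST_ATTACH,
	MODEL_AST_CONVERT,
	MODEL_AST_DOWNCONVERT,
};

/* What the modelled downconvert path would do with the lockres. */
enum model_outcome {
	MODEL_PROCEED,        /* not busy: continue with downconvert handling */
	MODEL_REQUEUE,        /* busy and pending: back off and retry later */
	MODEL_CANCEL_CONVERT, /* busy convert in flight: cancel it */
	MODEL_BUG,            /* busy with an unexpected action: BUG_ON fires */
};

struct model_lockres {
	unsigned long l_flags;
	enum model_action l_action;
};

/* NodeA starts attaching a new lockres (before ocfs2_dlm_lock()). */
static void model_begin_attach(struct model_lockres *l)
{
	l->l_flags |= OCFS2_LOCK_PENDING;
	l->l_action = MODEL_AST_ATTACH;
	l->l_flags |= OCFS2_LOCK_BUSY;
}

/* The call the patch removes: drop PENDING right after ocfs2_dlm_lock(). */
static void model_clear_pending(struct model_lockres *l)
{
	l->l_flags &= ~OCFS2_LOCK_PENDING;
}

/* A blocking AST arrives because another node wants the lock. */
static void model_bast(struct model_lockres *l)
{
	l->l_flags |= OCFS2_LOCK_BLOCKED;
}

/* The attach AST arrives: the lock is attached and no longer busy. */
static void model_attach_ast(struct model_lockres *l)
{
	l->l_flags |= OCFS2_LOCK_ATTACHED;
	l->l_flags &= ~(OCFS2_LOCK_BUSY | OCFS2_LOCK_PENDING);
	l->l_action = MODEL_AST_INVALID;
}

/* Simplified decision made on the downconvert path for a blocked lockres. */
static enum model_outcome model_unblock(const struct model_lockres *l)
{
	if (!(l->l_flags & OCFS2_LOCK_BUSY))
		return MODEL_PROCEED;
	if (l->l_flags & OCFS2_LOCK_PENDING)
		return MODEL_REQUEUE;
	if (l->l_action == MODEL_AST_CONVERT ||
	    l->l_action == MODEL_AST_DOWNCONVERT)
		return MODEL_CANCEL_CONVERT;
	return MODEL_BUG;
}
```

In this model, a BAST that is processed while PENDING is still set only causes a requeue, whereas the same BAST processed after model_clear_pending() but before model_attach_ast() reaches MODEL_BUG with l_action still set to attach. That is the window the patch tries to close by leaving the clearing to the AST; whether that window can actually be hit with o2dlm is a separate question.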
Sunil Mushran
2012-Feb-21 17:48 UTC
[Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.
> bast queued and flushed, before the ast was queued

Unlikely with o2dlm. dlmthread always sends ASTs before BASTs.

Can you recreate the entire lockres? A full dump may yield more information.

Sunil
Sunil Mushran
2012-Feb-21 18:04 UTC
[Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.
Moreover, what is lockres_clear_pending doing in 1.4? That code is not meant for 1.4. It fixes a problem associated with fsdlm, and it was left out of 1.4 for a reason. That means this bug was introduced by the patch that brought lockres_clear_pending into 1.4.