xuejiufei
2016-Jan-18 07:07 UTC
[Ocfs2-devel] ocfs2: A race between refmap setting and clearing
On 2016/1/18 12:28, Junxiao Bi wrote:
> On 01/13/2016 04:24 PM, Joseph Qi wrote:
>> Hi Junxiao,
>>
>> On 2016/1/13 15:00, Junxiao Bi wrote:
>>> On 01/13/2016 02:21 PM, xuejiufei wrote:
>>>> Hi Junxiao,
>>>> I had not described the issue clearly.
>>>>
>>>> Node 1                                Node 2 (master)
>>>> dlmlock
>>>> dlm_do_master_request
>>>>                                       dlm_master_request_handler
>>>>                                       -> dlm_lockres_set_refmap_bit
>>>> dlmlock succeeds
>>>> dlmunlock succeeds
>>>>
>>>> dlm_purge_lockres
>>>>                                       dlm_deref_handler
>>>>                                       -> finds the lock resource in
>>>>                                          DLM_LOCK_RES_SETREF_INPROG
>>>>                                          state, so dispatches a deref
>>>>                                          work
>>>> dlm_purge_lockres succeeds
>>>>
>>>> call dlmlock again
>>>> dlm_do_master_request
>>>>                                       dlm_master_request_handler
>>>>                                       -> dlm_lockres_set_refmap_bit
>>>>
>>>>                                       deref work triggers, calls
>>>>                                       dlm_lockres_clear_refmap_bit
>>>>                                       to clear Node 1 from the refmap
>>>>
>>>>                                       dlm_purge_lockres succeeds
>>>>
>>>> dlm_send_remote_lock_request
>>>> returns DLM_IVLOCKID because
>>>> the lockres does not exist
>>> More clear now. Thank you.
>>> This is a very complicated race, and I don't have a good solution for
>>> it yet. Your fix looks workable, but I am afraid that if we keep fixing
>>> these kinds of races case by case, we will make dlm harder to
>>> understand and easier to introduce bugs into; maybe we should think
>>> about refactoring dlm.
>>>
>> Agree. IMO, the root cause is that a bit op cannot handle such a case.
>> I wonder if we have to change it to a refcount, which may require a
>> much bigger refactoring.
> one bit for each node seems reasonable, as lockres is per node. I think
> the cause is the dis-order of set/clear; I am trying to see whether they
> can be made to happen in order.
>
Agree. The solution 1) in my first mail adds a new message to keep the
order of set and clear. Other nodes can purge the lock resource only
after the refmap on the master is cleared.

Thanks,
Jiufei

>>>> BUG if the lockres is $RECOVERY
>>>>
>>>> On 2016/1/13 10:46, Junxiao Bi wrote:
>>>>> On 01/12/2016 03:16 PM, xuejiufei wrote:
>>>>>> Hi, Junxiao
>>>>>>
>>>>>> On 2016/1/12 12:03, Junxiao Bi wrote:
>>>>>>> Hi Jiufei,
>>>>>>>
>>>>>>> On 01/11/2016 10:46 AM, xuejiufei wrote:
>>>>>>>> Hi all,
>>>>>>>> We have found a race between refmap setting and clearing which
>>>>>>>> causes the lock resource on the master to be freed before other
>>>>>>>> nodes purge it.
>>>>>>>>
>>>>>>>> Node 1                               Node 2 (master)
>>>>>>>> dlm_do_master_request
>>>>>>>>                                      dlm_master_request_handler
>>>>>>>>                                      -> dlm_lockres_set_refmap_bit
>>>>>>>> call dlm_purge_lockres after unlock
>>>>>>>>                                      dlm_deref_handler
>>>>>>>>                                      -> finds the lock resource in
>>>>>>>>                                         DLM_LOCK_RES_SETREF_INPROG
>>>>>>>>                                         state, so dispatches a
>>>>>>>>                                         deref work
>>>>>>>> dlm_purge_lockres succeeds
>>>>>>>>
>>>>>>>> dlm_do_master_request
>>>>>>>>                                      dlm_master_request_handler
>>>>>>>>                                      -> dlm_lockres_set_refmap_bit
>>>>>>>>
>>>>>>>>                                      deref work triggers, calls
>>>>>>>>                                      dlm_lockres_clear_refmap_bit
>>>>>>>>                                      to clear Node 1 from the
>>>>>>>>                                      refmap
>>>>>>>>
>>>>>>>> Now Node 2 can purge the lock resource, but the owner of the lock
>>>>>>>> resource on Node 1 is still Node 2, which may trigger a BUG if the
>>>>>>>> lock resource is $RECOVERY, or other problems.
>>>>>>>>
>>>>>>>> We have discussed 2 solutions:
>>>>>>>> 1) The master returns an error to Node 1 if
>>>>>>>>    DLM_LOCK_RES_SETREF_INPROG is set. Node 1 does not retry, and
>>>>>>>>    the master sends another message to Node 1 after clearing the
>>>>>>>>    refmap. Node 1 can purge the lock resource after the refmap on
>>>>>>>>    the master is cleared.
>>>>>>>> 2) The master returns an error to Node 1 if
>>>>>>>>    DLM_LOCK_RES_SETREF_INPROG is set, and Node 1 retries the
>>>>>>>>    deref of the lockres.
>>>>>>>>
>>>>>>>> Does anybody have better ideas?
>>>>>>>>
>>>>>>> dlm_purge_lockres() will wait to drop the ref until
>>>>>>> DLM_LOCK_RES_SETREF_INPROG is cleared. So what if we set this flag
>>>>>>> when the master is found during the master request, and clear it
>>>>>>> when the assert master message is received -- can this fix the
>>>>>>> issue?
>>>>>>>
>>>>>> I don't think this can fix it. Before doing the master request, the
>>>>>> lock resource is already purged. The master should clear the refmap
>>>>>> before the client purges it.
>>>>> inflight_locks is increased in dlm_get_lock_resource(), which will
>>>>> stop the lockres being purged? Set DLM_LOCK_RES_SETREF_INPROG when
>>>>> the lockres owner is found during the master request; will this stop
>>>>> the lockres being purged after unlock?
>>>>>
>>>>> Thanks,
>>>>> Junxiao.
Junxiao Bi
2016-Jan-19 03:03 UTC
[Ocfs2-devel] ocfs2: A race between refmap setting and clearing
Hi Jiufei & Joseph,

On 01/18/2016 03:07 PM, xuejiufei wrote:
> On 2016/1/18 12:28, Junxiao Bi wrote:
>> one bit for each node seems reasonable, as lockres is per node. I
>> think the cause is the dis-order of set/clear; I am trying to see
>> whether they can be made to happen in order.
>>
> Agree. The solution 1) in my first mail adds a new message to keep the
> order of set and clear. Other nodes can purge the lock resource only
> after the refmap on the master is cleared.

I am taking another way: delete deref_lockres_worker and make the refmap
deref happen directly in the deref handler. This requires finding the
dis-ordered part and fixing it.

As far as I can see, the dis-order can only happen when do_assert_master
sets the refmap after deref_lockres_handler has cleared it. For this, we
need to make sure MASTERY_REF is not returned to the owner after purge.
SETREF_INPROG is designed to do this: it is set by the assert master
handler, and purge waits for it to clear before dereffing the lockres.
There are two cases where SETREF_INPROG is not set when purge runs:

1. The assert master message has not arrived when purge() runs.
   This happens when the lockres owner already exists when asking for a
   lockres. The owner is learned from the master request message, and
   dlm_get_lock_resource returns. Soon dlmlock/dlmunlock is done, and
   the purge completes. Then the assert master message arrives, but
   since the MLE has already been deleted in dlm_get_lock_resource, it
   will not return MASTERY_REF to the owner to reset the refmap. So
   there is no problem in this case.

2. The assert master handler sets SETREF_INPROG too late.
   SETREF_INPROG is set after the lockres owner has been updated and
   dlm->spinlock has been released. That lets dlm_get_lock_resource wake
   up too early, so a purge may happen before SETREF_INPROG is set. This
   is a bug and can be fixed by the following patch.
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 9477d6e1de37..83cd65b128d0 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -1965,6 +1965,7 @@ ok:
 	if (res) {
 		int wake = 0;
 		spin_lock(&res->spinlock);
+		res->state |= DLM_LOCK_RES_SETREF_INPROG;
 		if (mle->type == DLM_MLE_MIGRATION) {
 			mlog(0, "finishing off migration of lockres %.*s, "
 			     "from %u to %u\n",
@@ -2029,12 +2030,8 @@ ok:
 
 done:
 	ret = 0;
-	if (res) {
-		spin_lock(&res->spinlock);
-		res->state |= DLM_LOCK_RES_SETREF_INPROG;
-		spin_unlock(&res->spinlock);
+	if (res)
 		*ret_data = (void *)res;
-	}
 	dlm_put(dlm);
 	if (master_request) {
 		mlog(0, "need to tell master to reassert\n");

Besides this patch, revert commit f3f854648de64c4b6f13f6f13113bc9525c621e5
("ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockres").
Can this fix the issue?

Thanks,
Junxiao.