Changwei Ge
2019-Feb-15 08:43 UTC
[Ocfs2-devel] [PATCH] ocfs2: wait for recovering done after direct unlock request
Hi Jun,

On 2019/2/15 16:36, piaojun wrote:
> Hi Changwei,
>
> Thanks for your explanation, and I still have a question below.

Thank you for looking into this.

>
> On 2019/2/15 14:29, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2019/2/15 14:20, piaojun wrote:
>>> Hi Changwei,
>>>
>>> The DLM process is a little bit complex, so I suggest pasting the code
>>> path. And I wonder if my code is right?
>>>
>>> Thanks,
>>> Jun
>>>
>>> On 2019/2/14 14:14, Changwei Ge wrote:
>>>> There is a scenario causing ocfs2 umount to hang when multiple hosts are
>>>> rebooting at the same time.
>>>>
>>>> NODE1                                NODE2                NODE3
>>>> send unlock request to NODE2
>>>>                                      dies
>>>>                                                           become recovery master
>>>>                                                           recover NODE2
>>>> find NODE2 dead
>>>> mark resource RECOVERING
>>>> directly remove lock from grant list
>>> dlm_do_local_recovery_cleanup
>>>   dlm_move_lockres_to_recovery_list
>>>     res->state |= DLM_LOCK_RES_RECOVERING;
>>>     list_add_tail(&res->recovering, &dlm->reco.resources);
>>>
>>>> calculate usage but RECOVERING marked
>>>> **miss the window of purging
>>> dlmunlock
>>>   dlmunlock_remote
>>>     dlmunlock_common // unlock successfully directly
>>>
>>> dlm_lockres_calc_usage
>>>   __dlm_lockres_calc_usage
>>>     __dlm_lockres_unused
>>>       if (res->state & (DLM_LOCK_RES_RECOVERING| // won't purge lock as DLM_LOCK_RES_RECOVERING is set
>>
>> True.
>>
>>>
>>>> clear RECOVERING
>>> dlm_finish_local_lockres_recovery
>>>   res->state &= ~DLM_LOCK_RES_RECOVERING;
>>>
>>> Could you help explain where it gets stuck?
>>
>> Sure.
>> As dlm missed the window to purge the lock resource, it can't be unhashed.
>>
>> During umount:
>> dlm_unregister_domain
>>   dlm_migrate_all_locks -> there is always a lock resource still hashed, so we can't return from dlm_migrate_all_locks(), and umount hangs.
>
> In dlm_migrate_all_locks, the lockres will be moved to purge_list and purged again:
> dlm_migrate_all_locks
>   __dlm_lockres_calc_usage
>     list_add_tail(&res->purge, &dlm->purge_list);
>
> Do you mean this process does not work?

Yes. Only for lock resources that are being migrated does it get a chance to call __dlm_lockres_calc_usage().
But in our situation the problematic lock resource obviously has no master, so there is no chance for it to set the stack variable *dropped* in dlm_migrate_all_locks().

Thanks,
Changwei

>
>>
>> Thanks,
>> Changwei
>>
>>
>>>
>>>>
>>>> To reproduce this issue, crash a host and then umount ocfs2
>>>> from another node.
>>>>
>>>> To solve this, just let the unlock process wait for recovery to finish.
>>>>
>>>> Signed-off-by: Changwei Ge <ge.changwei at h3c.com>
>>>> ---
>>>>  fs/ocfs2/dlm/dlmunlock.c | 23 +++++++++++++++++++----
>>>>  1 file changed, 19 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>> index 63d701c..c8e9b70 100644
>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>> @@ -105,7 +105,8 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>  	enum dlm_status status;
>>>>  	int actions = 0;
>>>>  	int in_use;
>>>> -	u8 owner;
>>>> +	u8 owner;
>>>> +	int recovery_wait = 0;
>>>>
>>>>  	mlog(0, "master_node = %d, valblk = %d\n", master_node,
>>>>  	     flags & LKM_VALBLK);
>>>> @@ -208,9 +209,12 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>  		}
>>>>  		if (flags & LKM_CANCEL)
>>>>  			lock->cancel_pending = 0;
>>>> -		else
>>>> -			lock->unlock_pending = 0;
>>>> -
>>>> +		else {
>>>> +			if (!lock->unlock_pending)
>>>> +				recovery_wait = 1;
>>>> +			else
>>>> +				lock->unlock_pending = 0;
>>>> +		}
>>>>  	}
>>>>
>>>>  	/* get an extra ref on lock.  if we are just switching
>>>> @@ -244,6 +248,17 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>  	spin_unlock(&res->spinlock);
>>>>  	wake_up(&res->wq);
>>>>
>>>> +	if (recovery_wait) {
>>>> +		spin_lock(&res->spinlock);
>>>> +		/* Unlock request will directly succeed after owner dies,
>>>> +		 * and the lock is already removed from grant list. We have to
>>>> +		 * wait for RECOVERING done or we miss the chance to purge it
>>>> +		 * since the removement is much faster than RECOVERING proc.
>>>> +		 */
>>>> +		__dlm_wait_on_lockres_flags(res, DLM_LOCK_RES_RECOVERING);
>>>> +		spin_unlock(&res->spinlock);
>>>> +	}
>>>> +
>>>>  	/* let the caller's final dlm_lock_put handle the actual kfree */
>>>>  	if (actions & DLM_UNLOCK_FREE_LOCK) {
>>>>  		/* this should always be coupled with list removal */
>>>>
>>>
>> .
>>
>
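For readers following the call path Jun pasted above: the purge decision comes down to an "is this lock resource unused?" test of roughly the shape below. This is a simplified sketch assuming the ocfs2 dlm's internal types from dlmcommon.h; the helper name is illustrative, not the real __dlm_lockres_unused().

#include "dlmcommon.h"	/* struct dlm_lock_resource, DLM_LOCK_RES_* flags (assumed build context) */

/* Simplified sketch of the "can this lock resource be purged?" test
 * discussed above.  Field and flag names follow the ocfs2 dlm code;
 * the function itself is an illustration, not the kernel source.
 */
static int lockres_purgeable_sketch(struct dlm_lock_resource *res)
{
	/* a resource with any lock still attached is in use */
	if (!list_empty(&res->granted) ||
	    !list_empty(&res->converting) ||
	    !list_empty(&res->blocked))
		return 0;

	/* This is the branch that bites in the scenario above: the unlock
	 * path calls dlm_lockres_calc_usage() while the recovery master is
	 * still working, so RECOVERING is set and the resource is never
	 * moved onto dlm->purge_list; nothing recalculates its usage later,
	 * so it stays hashed and umount hangs. */
	if (res->state & (DLM_LOCK_RES_RECOVERING |
			  DLM_LOCK_RES_MIGRATING))
		return 0;

	return 1;
}

With the patch applied, dlmunlock_common() waits for DLM_LOCK_RES_RECOVERING to clear before returning, so the dlm_lockres_calc_usage() call that follows sees a purgeable resource and moves it onto the purge list.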
Changwei Ge
2019-Feb-18 09:46 UTC
[Ocfs2-devel] [PATCH] ocfs2: wait for recovering done after direct unlock request
Hi Jun,

Ping...

Do you have any further questions? If anything is still unclear, please let me know.
Otherwise, could you please give your feedback?

Thanks,
Changwei

On 2019/2/15 16:48, Changwei Ge wrote:
> [...]
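On the mechanism the fix relies on: __dlm_wait_on_lockres_flags() is entered with res->spinlock held and sleeps on res->wq until none of the requested bits remain set in res->state. The sketch below is a paraphrase of that kind of flag-wait loop using the usual linux/wait.h primitives, written as an illustration rather than a verbatim copy of the kernel helper.

#include <linux/wait.h>		/* DECLARE_WAITQUEUE, add_wait_queue */
#include <linux/sched.h>	/* set_current_state, schedule */
#include "dlmcommon.h"		/* struct dlm_lock_resource (assumed build context) */

/* Illustrative flag-wait loop in the style of __dlm_wait_on_lockres_flags().
 * Caller holds res->spinlock; the lock is dropped around schedule() and
 * re-taken before the flags are re-checked.
 */
static void wait_on_lockres_flags_sketch(struct dlm_lock_resource *res,
					 int flags)
{
	DECLARE_WAITQUEUE(wait, current);

	assert_spin_locked(&res->spinlock);
	add_wait_queue(&res->wq, &wait);

	while (res->state & flags) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		spin_unlock(&res->spinlock);
		schedule();
		spin_lock(&res->spinlock);
	}

	remove_wait_queue(&res->wq, &wait);
	__set_current_state(TASK_RUNNING);
}

Assuming the path that clears DLM_LOCK_RES_RECOVERING also wakes res->wq (which the patch depends on), the blocked unlock path continues once recovery is done, and the resource can finally be purged.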