Changwei Ge
2018-Aug-30 07:52 UTC
[Ocfs2-devel] [PATCH] fix dead lock caused by ocfs2_defrag_extent
Hi Larry,

On 2018/8/30 14:26, Larry Chen wrote:
> Hi Changwei,
>
> Maybe we need more investigation.
>
> The following is your fix:
> lock truncate log inode
>     __ocfs2_flush_truncate_log()
>     ocfs2_lock_allocators_move_extents()
>         unlock truncate log inode
>
> The lock action will happen like the following:
> lock(truncate_inode)
>     lock(sub allocator)
>     lock(local_alloc) or lock(global_bitmap)

I don't think we have to worry much about mixing the order of cluster
locks and inode mutexes, since a cluster lock request from the node that
has already been granted the lock succeeds directly instead of waiting
on itself.

> I'm not sure if there is another code path that tries to take the same
> locks but in a different order, which may cause deadlocks.
>
> Indeed ocfs2 involves too many locks; I would like to reduce the
> deadlock risk as much as possible.
>
> Another way is to add a new argument to __ocfs2_flush_truncate_log()
> which indicates whether the global bitmap needs to be locked.

Sounds like a feasible way :)

Thanks,
Changwei

> Thanks,
> Larry
>
> On 08/30/2018 10:05 AM, Changwei Ge wrote:
>> Hi Larry,
>>
>> Sorry for this late reply.
>>
>> On 2018/8/28 12:46, Larry Chen wrote:
>>> Hi Changwei,
>>>
>>> I think there might be some logic error.
>>>
>>> Imagine that we remove __ocfs2_flush_truncate_log() in
>>> ocfs2_defrag_extent().
>>>
>>> ocfs2_reserve_clusters() might skip flushing the truncate log if
>>> there are enough clusters locally. Then __ocfs2_move_extent() might
>>> return an error because ocfs2_truncate_log_append() might fail.
>>
>> I think changing ocfs2 infrastructure methods in your way is not a
>> good idea.
>> So how about this way?
>> -> lock truncate log inode
>> -> __ocfs2_flush_truncate_log()
>> -> ocfs2_lock_allocators_move_extents()
>> -> unlock truncate log inode
>>
>> Thanks,
>> Changwei
>>
>>> Thanks,
>>> Larry
>>>
>>> On 08/28/2018 09:59 AM, Changwei Ge wrote:
>>>> Hi Larry,
>>>>
>>>> I'd like to propose another way to fix this issue, which might be
>>>> easier. :-)
>>>> Actually, ocfs2_reserve_clusters() will flush the truncate log
>>>> internally if there are not enough clusters left.
>>>> So how about just removing __ocfs2_flush_truncate_log() from
>>>> ocfs2_defrag_extent()?
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>> On 2018/8/27 16:01, Larry Chen wrote:
>>>>> ocfs2_defrag_extent may fall into a deadlock:
>>>>>
>>>>> ocfs2_ioctl_move_extents
>>>>>   ocfs2_move_extents
>>>>>     ocfs2_defrag_extent
>>>>>       ocfs2_lock_allocators_move_extents
>>>>>         ocfs2_reserve_clusters
>>>>>           inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>>       __ocfs2_flush_truncate_log
>>>>>           inode_lock GLOBAL_BITMAP_SYSTEM_INODE
>>>>>
>>>>> As the call chain above shows, ocfs2_reserve_clusters will take
>>>>> inode_lock on the global bitmap if the local allocator does not
>>>>> have sufficient clusters.
>>>>> Once the global bitmap can meet the demand, ocfs2_reserve_clusters
>>>>> returns success with the global bitmap still locked.
>>>>>
>>>>> After ocfs2_reserve_clusters, if the truncate log is full,
>>>>> __ocfs2_flush_truncate_log will definitely fall into a deadlock,
>>>>> because it needs to inode_lock the global bitmap, which has
>>>>> already been locked.
>>>>>
>>>>> To fix this bug, we could remove from
>>>>> ocfs2_lock_allocators_move_extents the code which locks the global
>>>>> allocator, and put the removed code after
>>>>> __ocfs2_flush_truncate_log.
>>>>>
>>>>> ocfs2_lock_allocators_move_extents is referred to from two places:
>>>>> one is here; the other does not need the data allocator context,
>>>>> which means this patch does not affect that caller so far.
>>>>>
>>>>> Signed-off-by: Larry Chen <lchen at suse.com>
>>>>> ---
>>>>>  fs/ocfs2/move_extents.c | 13 ++++++-------
>>>>>  1 file changed, 6 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
>>>>> index 7eb3b0a6347e..064dedf40d74 100644
>>>>> --- a/fs/ocfs2/move_extents.c
>>>>> +++ b/fs/ocfs2/move_extents.c
>>>>> @@ -192,13 +192,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
>>>>>  			goto out;
>>>>>  	}
>>>>>  
>>>>> -	if (data_ac) {
>>>>> -		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
>>>>> -		if (ret) {
>>>>> -			mlog_errno(ret);
>>>>> -			goto out;
>>>>> -		}
>>>>> -	}
>>>>>  
>>>>>  	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
>>>>>  
>>>>> @@ -283,6 +276,12 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
>>>>>  		}
>>>>>  	}
>>>>>  
>>>>> +	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
>>>>> +	if (ret) {
>>>>> +		mlog_errno(ret);
>>>>> +		goto out_unlock_mutex;
>>>>> +	}
>>>>> +
>>>>>  	handle = ocfs2_start_trans(osb, credits);
>>>>>  	if (IS_ERR(handle)) {
>>>>>  		ret = PTR_ERR(handle);
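To make the failure mode in Larry's patch description concrete, here is a
minimal userspace C model of the self-deadlock, assuming the simplified
behavior described above: reserve_clusters() returns holding the
global-bitmap lock, and flush_truncate_log() then tries to take the same
lock on the same thread. An error-checking pthread mutex stands in for
the rw_semaphore behind inode_lock on the global-bitmap system inode; all
names are illustrative, not the real ocfs2 API.

/* Build: cc -pthread self_deadlock.c */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t global_bitmap_lock;

/* Stands in for ocfs2_reserve_clusters(): when the local allocator is
 * short, fall back to the global bitmap and return *holding* its lock. */
static int reserve_clusters(bool local_alloc_sufficient)
{
	if (!local_alloc_sufficient)
		return pthread_mutex_lock(&global_bitmap_lock);
	return 0;
}

/* Stands in for __ocfs2_flush_truncate_log(): replaying the truncate
 * log frees clusters into the global bitmap, so it needs the same lock. */
static int flush_truncate_log(void)
{
	return pthread_mutex_lock(&global_bitmap_lock);
}

int main(void)
{
	pthread_mutexattr_t attr;

	/* An error-checking mutex turns the hang into a visible EDEADLK. */
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&global_bitmap_lock, &attr);

	/* Pre-patch ordering in ocfs2_defrag_extent(): reserve first,
	 * then flush the (full) truncate log. */
	reserve_clusters(false);        /* global bitmap now locked */
	int err = flush_truncate_log(); /* same thread, same lock   */
	printf("flush_truncate_log: %s\n", strerror(err)); /* EDEADLK */
	return 0;
}

The patch avoids this by reordering, not by making the lock re-entrant:
the flush happens before any reservation, so the thread never holds the
global-bitmap lock when the flush needs it.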
Larry Chen
2018-Aug-30 08:24 UTC
[Ocfs2-devel] [PATCH] fix dead lock caused by ocfs2_defrag_extent
Hi Changwei,

On 08/30/2018 03:52 PM, Changwei Ge wrote:
> Hi Larry,
>
> On 2018/8/30 14:26, Larry Chen wrote:
>> Hi Changwei,
>>
>> Maybe we need more investigation.
>>
>> The following is your fix:
>> lock truncate log inode
>>     __ocfs2_flush_truncate_log()
>>     ocfs2_lock_allocators_move_extents()
>>         unlock truncate log inode
>>
>> The lock action will happen like the following:
>> lock(truncate_inode)
>>     lock(sub allocator)
>>     lock(local_alloc) or lock(global_bitmap)
>
> I don't think we have to worry much about mixing the order of cluster
> locks and inode mutexes, since a cluster lock request from the node
> that has already been granted the lock succeeds directly instead of
> waiting on itself.
>
>> I'm not sure if there is another code path that tries to take the
>> same locks but in a different order, which may cause deadlocks.

Yeah, I used "lock" to mean both inode_lock and ocfs2_inode_lock.
Since too many kinds of locks and inode locks are involved, I cannot
guarantee that there is no logic error.

>> Indeed ocfs2 involves too many locks; I would like to reduce the
>> deadlock risk as much as possible.
>>
>> Another way is to add a new argument to __ocfs2_flush_truncate_log()
>> which indicates whether the global bitmap needs to be locked.
>
> Sounds like a feasible way :)

Haha, I also prefer this way :)
I'll send another patch and run the test cases in my environment.

Thanks,
Larry
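For illustration, a sketch of what the flag-based approach agreed on here
might look like, again as a userspace model: the caller tells the flush
routine whether it already holds the global-bitmap lock, so the routine
never relocks it on the same thread. The signature and names are
hypothetical, not the actual ocfs2 change, and, as the next message
shows, this alone turns out not to be enough.

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t global_bitmap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical flag-taking flush: skip the lock if the caller holds it. */
static void flush_truncate_log(bool caller_holds_bitmap_lock)
{
	if (!caller_holds_bitmap_lock)
		pthread_mutex_lock(&global_bitmap_lock);

	/* ... replay the truncate log, freeing clusters back into the
	 * global bitmap; safe because the lock is held either way ... */

	if (!caller_holds_bitmap_lock)
		pthread_mutex_unlock(&global_bitmap_lock);
}

/* Defrag path after reserving from the global bitmap: no self-relock. */
int main(void)
{
	pthread_mutex_lock(&global_bitmap_lock); /* reserve_clusters() */
	flush_truncate_log(true);                /* lock already held  */
	pthread_mutex_unlock(&global_bitmap_lock);
	return 0;
}

This fixes the single-thread relock, but it says nothing about other
threads that flush the truncate log with their own lock ordering.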
Larry Chen
2018-Sep-02 09:09 UTC
[Ocfs2-devel] [PATCH] fix dead lock caused by ocfs2_defrag_extent
Hi Changwei,

>>> Another way is to add a new argument to __ocfs2_flush_truncate_log()
>>> which indicates whether the global bitmap needs to be locked.
>>
>> Sounds like a feasible way :)

I tried this way, but it still causes a deadlock. Here are the back
traces.

The root cause is that ocfs2 sometimes relies on a work queue to flush
the truncate log. The worker that processes the queue is
ocfs2_truncate_log_worker, which directly calls
ocfs2_flush_truncate_log, and its lock order is the truncate log first
and then the global bitmap, which conflicts with ocfs2_defrag_extent.

The worker, holding the truncate log and trying the global bitmap:

 #0 [ffffb6c000a03c50] __schedule at ffffffff966c47bf
 #1 [ffffb6c000a03ce0] schedule at ffffffff966c4e08
 #2 [ffffb6c000a03ce8] rwsem_down_write_failed at ffffffff966c82b3
 #3 [ffffb6c000a03d80] schedule at ffffffff966c4e08
 #4 [ffffb6c000a03d88] call_rwsem_down_write_failed at ffffffff963a6e03
 #5 [ffffb6c000a03dc8] down_write at ffffffff966c7410                          <- try global bitmap
 #6 [ffffb6c000a03dd0] __ocfs2_flush_truncate_log at ffffffffc0749145 [ocfs2]  <- got truncate log
 #7 [ffffb6c000a03e68] ocfs2_flush_truncate_log at ffffffffc0749afd [ocfs2]
 #8 [ffffb6c000a03e80] ocfs2_truncate_log_worker at ffffffffc0749b29 [ocfs2]
 #9 [ffffb6c000a03e98] process_one_work at ffffffff9609e64a
#10 [ffffb6c000a03ed8] worker_thread at ffffffff9609e88b
#11 [ffffb6c000a03f10] kthread at ffffffff960a470a
#12 [ffffb6c000a03f50] ret_from_fork at ffffffff96800235

The defrag path, holding the global bitmap and trying the truncate log:

[<ffffffff963a6e03>] call_rwsem_down_write_failed+0x13/0x20   <- try truncate log, got global bitmap
[<ffffffffc07906f3>] ocfs2_defrag_extent+0x1d3/0x680 [ocfs2]
[<ffffffffc07919f9>] ocfs2_move_extents+0x329/0x740 [ocfs2]
[<ffffffffc0791f43>] ocfs2_ioctl_move_extents+0x133/0x2d0 [ocfs2]
[<ffffffffc0773a83>] ocfs2_ioctl+0x253/0x640 [ocfs2]
[<ffffffff96256480>] do_vfs_ioctl+0x90/0x5f0
[<ffffffff96256a54>] SyS_ioctl+0x74/0x80
[<ffffffff96003ae4>] do_syscall_64+0x74/0x140
[<ffffffff9680009a>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<ffffffffffffffff>] 0xffffffffffffffff
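The two traces describe a classic ABBA lock-order inversion between two
threads. A self-contained userspace model, with pthread mutexes in place
of the kernel rw_semaphores and illustrative names rather than the real
ocfs2 API, reproduces the hang:

/* Build: cc -pthread abba.c  (the program hangs by design) */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t truncate_log  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t global_bitmap = PTHREAD_MUTEX_INITIALIZER;

/* Models ocfs2_truncate_log_worker -> ocfs2_flush_truncate_log:
 * lock order is truncate_log, then global_bitmap. */
static void *truncate_log_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&truncate_log);
	sleep(1);                           /* widen the race window   */
	pthread_mutex_lock(&global_bitmap); /* blocks: defrag holds it */
	pthread_mutex_unlock(&global_bitmap);
	pthread_mutex_unlock(&truncate_log);
	return NULL;
}

/* Models ocfs2_defrag_extent after reserving from the global bitmap:
 * lock order is global_bitmap, then truncate_log. */
static void *defrag_extent(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&global_bitmap);
	sleep(1);
	pthread_mutex_lock(&truncate_log);  /* blocks: worker holds it */
	pthread_mutex_unlock(&truncate_log);
	pthread_mutex_unlock(&global_bitmap);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, truncate_log_worker, NULL);
	pthread_create(&b, NULL, defrag_extent, NULL);
	pthread_join(a, NULL);              /* never returns: ABBA hang */
	pthread_join(b, NULL);
	printf("unreachable\n");
	return 0;
}

This is why a flag on __ocfs2_flush_truncate_log cannot help by itself:
the flag only suppresses a relock within one thread, while the worker is
a separate thread that still takes the two locks in the opposite order.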