Heming Zhao
2023-Feb-17 00:37 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix defrag path triggering jbd2 ASSERT
code path:

ocfs2_ioctl_move_extents
 ocfs2_move_extents
  ocfs2_defrag_extent
   __ocfs2_move_extent
    + ocfs2_journal_access_di
    + ocfs2_split_extent  //sub-paths call jbd2_journal_restart
    + ocfs2_journal_dirty //crashes on jbd2 ASSERT

crash stacks:
PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2"
#0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01
#1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d
#2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d
#3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f
#4 [ffffb25d8dad3a58] do_trap at ffffffff83833205
#5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6
#6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18
[exception RIP: jbd2_journal_dirty_metadata+0x2ba]
RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207
RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250
RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000
R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28
R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2]
#8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2]
#9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2]
Analysis
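This bug has the same root cause as commit 7f27ec978b0e ("ocfs2: call
ocfs2_journal_access_di() before ocfs2_journal_dirty() in
ocfs2_write_end_nolock()").
In this case, sub-paths of ocfs2_split_extent() call
jbd2_journal_restart() during defragmentation, so the handle is attached
to a new transaction by the time __ocfs2_move_extent() calls
ocfs2_journal_dirty() on the root buffer it accessed earlier.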
How to fix
ocfs2_split_extent() handles its journal operations entirely by itself.
The caller does not need the journal access/dirty pair; it only needs
the journal start/stop pair. The fix is to remove the journal
access/dirty calls from __ocfs2_move_extent().
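
The ordering problem can be sketched with a small userspace model. This is
only an illustration under stated assumptions: every toy_* name below is
hypothetical and none of it is kernel API, a transaction is reduced to an
integer id, and the point at which toy_split() restarts the handle is chosen
for the example rather than taken from ocfs2_split_extent().

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical, not kernel API. */
struct toy_handle { int txn; };   /* transaction the handle is attached to */
struct toy_buffer { int txn; };   /* transaction the buffer was joined to (0 = none) */

/* Models "journal access": join the buffer to the handle's current transaction. */
static void toy_access(const struct toy_handle *h, struct toy_buffer *b)
{
	assert(h != NULL && b != NULL);
	b->txn = h->txn;
}

/* Models a restart: the handle moves on to a new transaction. */
static void toy_restart(struct toy_handle *h)
{
	assert(h != NULL);
	h->txn++;
}

/* Models "journal dirty": returns -1 where the real code would hit the ASSERT. */
static int toy_dirty(const struct toy_handle *h, const struct toy_buffer *b)
{
	assert(h != NULL && b != NULL);
	return b->txn == h->txn ? 0 : -1;
}

/*
 * Models a callee that does its own access/dirty pairs and may restart the
 * handle in between (assumed ordering, for illustration only).
 */
static int toy_split(struct toy_handle *h, struct toy_buffer *root,
		     struct toy_buffer *leaf)
{
	toy_access(h, root);
	if (toy_dirty(h, root))
		return -1;
	toy_restart(h);
	toy_access(h, leaf);
	return toy_dirty(h, leaf);
}

/* Old pattern: the caller wraps the callee in its own access/dirty pair. */
int toy_move_extent_old(void)
{
	struct toy_handle h = { .txn = 1 };
	struct toy_buffer root = { 0 }, leaf = { 0 };

	toy_access(&h, &root);
	if (toy_split(&h, &root, &leaf))
		return -2;
	/* root still belongs to the pre-restart transaction here */
	return toy_dirty(&h, &root);
}

/* Fixed pattern: the caller leaves all access/dirty calls to the callee. */
int toy_move_extent_fixed(void)
{
	struct toy_handle h = { .txn = 1 };
	struct toy_buffer root = { 0 }, leaf = { 0 };

	return toy_split(&h, &root, &leaf);
}
```

In this model toy_move_extent_old() fails its final dirty check because the
handle has moved to a new transaction, while toy_move_extent_fixed() succeeds
because every access/dirty pair sits on one side of the restart.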
The discussion for this patch:
https://oss.oracle.com/pipermail/ocfs2-devel/2023-February/000647.html
Signed-off-by: Heming Zhao <heming.zhao at suse.com>
---
v1 -> v2:
- no code changes
- change patch subject from "ocfs2: fix J_ASSERT_JH in defragment path"
  to "ocfs2: fix defrag path triggering jbd2 ASSERT"
- rewrite/polish commit log

v1: https://oss.oracle.com/pipermail/ocfs2-devel/2022-May/000101.html
---
fs/ocfs2/move_extents.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 192cad0662d8..6251748c695b 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -105,14 +105,6 @@ static int __ocfs2_move_extent(handle_t *handle,
 	 */
 	replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;
 
-	ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode),
-				      context->et.et_root_bh,
-				      OCFS2_JOURNAL_ACCESS_WRITE);
-	if (ret) {
-		mlog_errno(ret);
-		goto out;
-	}
-
 	ret = ocfs2_split_extent(handle, &context->et, path, index,
 				 &replace_rec, context->meta_ac,
 				 &context->dealloc);
@@ -121,8 +113,6 @@ static int __ocfs2_move_extent(handle_t *handle,
 		goto out;
 	}
 
-	ocfs2_journal_dirty(handle, context->et.et_root_bh);
-
 	context->new_phys_cpos = new_p_cpos;
 
 	/*
--
2.39.0
Joseph Qi
2023-Feb-19 11:12 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix defrag path triggering jbd2 ASSERT
On 2/17/23 8:37 AM, Heming Zhao wrote:
> [commit message quoted in full, snipped]
>
> Signed-off-by: Heming Zhao <heming.zhao at suse.com>

Reviewed-by: Joseph Qi <joseph.qi at linux.alibaba.com>

> [changelog and diff quoted unchanged, snipped]