Joseph Qi
2023-Feb-14 11:08 UTC
[Ocfs2-devel] discuss about jbd2 assertion in defragment path
Hi, On 2/14/23 12:33 PM, Heming Zhao wrote:> On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote: >> Hi, Sorry about the late reply. >> This thread is indeed a long time ago:( >> It seems that I said the two ocfs2_journal_access_di() are for different >> buffer head. Anyway, I have to recall the discussion before and get back >> to you. >> > If you belive from [a] to [b.1] doesn't need any journal protection, my patch > makes sense from theory. > > From code logic, if the defrag path (ocfs2_split_extent) doesn't call > jbd2_journal_restart(), current code is absolute correct. Increasing credits > can't help avoid jbd2 assert crash, but make more easily to trigger crash, > because it makes jbd2_journal_extend() more easily to return "status > 0". In my > view, the correct fix should delete the access/dirty pair and make sure the > ocfs2_split_extent() is journal self-service. > > Different buffer head issue could be handled by rechecking ocfs2_split_extent(), > make sure all journal handle branches are correct. I already checked > ocfs2_split_extent, but I can't assure my eyes didn't miss any corner.ocfs2_split_extent are not just 'read' operations, it may grow the tree, that's why we have to do a transaction here. The whole code flow: ocfs2_ioctl_move_extents ocfs2_move_extents __ocfs2_move_extents_range ocfs2_defrag_extent [1] ocfs2_start_trans [a] __ocfs2_move_extent ocfs2_journal_access_di ocfs2_split_extent ocfs2_journal_dirty ocfs2_commit_trans ocfs2_move_extent [2] ocfs2_start_trans __ocfs2_move_extent ocfs2_journal_access_di ocfs2_split_extent ocfs2_journal_dirty ocfs2_commit_trans ocfs2_start_trans [b] ocfs2_journal_access_di ocfs2_journal_dirty ocfs2_commit_trans In above, [a] and [b] are different transaction start/commit pair, and each allocates different journal credits, even they are operate the same bh, they are still different operations. My another question is, __ocfs2_move_extent will be called by both ocfs2_defrag_extent and ocfs2_move_extent, why this doesn't happen on the other path? Thanks, Joseph
Heming Zhao
2023-Feb-14 11:48 UTC
[Ocfs2-devel] discuss about jbd2 assertion in defragment path
On Tue, Feb 14, 2023 at 07:08:16PM +0800, Joseph Qi wrote:> Hi, > > On 2/14/23 12:33 PM, Heming Zhao wrote: > > On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote: > >> Hi, Sorry about the late reply. > >> This thread is indeed a long time ago:( > >> It seems that I said the two ocfs2_journal_access_di() are for different > >> buffer head. Anyway, I have to recall the discussion before and get back > >> to you. > >> > > If you belive from [a] to [b.1] doesn't need any journal protection, my patch > > makes sense from theory. > > > > From code logic, if the defrag path (ocfs2_split_extent) doesn't call > > jbd2_journal_restart(), current code is absolute correct. Increasing credits > > can't help avoid jbd2 assert crash, but make more easily to trigger crash, > > because it makes jbd2_journal_extend() more easily to return "status > 0". In my > > view, the correct fix should delete the access/dirty pair and make sure the > > ocfs2_split_extent() is journal self-service. > > > > Different buffer head issue could be handled by rechecking ocfs2_split_extent(), > > make sure all journal handle branches are correct. I already checked > > ocfs2_split_extent, but I can't assure my eyes didn't miss any corner. > > ocfs2_split_extent are not just 'read' operations, it may grow the tree, > that's why we have to do a transaction here.You misunderstand my meaning, my mean: from the caller __ocfs2_move_extent() calling ocfs2_journal_access_di() (below [a]) to (before) the handling ctxt.c_contig_type (below [b.1]) in ocfs2_split_extent, this area is purely read operations. It's the range [ [a], [b.1]) (include [a], not include [b.1]) __ocfs2_move_extent + ocfs2_journal_access_di //[a] <------+ + ocfs2_split_extent //[b] | [a, b.1) area is read operations | + if //[b.1] <------+ | | A | | +---- from this line, this func starts r/w IOs & needs jbd2 operations | | | | ocfs2_replace_extent_rec()/ocfs2_split_and_insert() | + else | ocfs2_try_to_merge_extent () | + ocfs2_journal_dirty //[c]> > The whole code flow: > ocfs2_ioctl_move_extents > ocfs2_move_extents > __ocfs2_move_extents_range > ocfs2_defrag_extent [1] > ocfs2_start_trans [a] > __ocfs2_move_extent > ocfs2_journal_access_di > ocfs2_split_extent > ocfs2_journal_dirty > ocfs2_commit_trans > ocfs2_move_extent [2] > ocfs2_start_trans > __ocfs2_move_extent > ocfs2_journal_access_di > ocfs2_split_extent > ocfs2_journal_dirty > ocfs2_commit_trans > ocfs2_start_trans [b] > ocfs2_journal_access_di > ocfs2_journal_dirty > ocfs2_commit_trans > > In above, [a] and [b] are different transaction start/commit pair, and > each allocates different journal credits, even they are operate the same > bh, they are still different operations.I agree above writing, but it can't change my conclusion: ocfs2_split_extent() is journal self-service. It seems the author meaning: he wanted [a] to protect defragging actions (extents changes), wanted [b] to protect inode ctime changes. the ocfs2_start_trans/ocfs2_commit_trans in [a] area is necessary. but the access/dirty pair in __ocfs2_move_extent() is potential crash risk.> > My another question is, __ocfs2_move_extent will be called by both > ocfs2_defrag_extent and ocfs2_move_extent, why this doesn't happen on the > other path?The reason is userspace defragfs.ocfs2 never trigger ocfs2_move_extent(). see do_file_defrag(): struct ocfs2_move_extents me = { .me_start = 0, .me_len = buf->st_size, .me_flags = OCFS2_MOVE_EXT_FL_AUTO_DEFRAG, //<-- It makes do_defrag becomes 1 //in kernel space __ocfs2_move_extents_range() }; Thanks, Heming