Heming Zhao
2023-Feb-15 06:20 UTC
[Ocfs2-devel] discuss about jbd2 assertion in defragment path
On 2/15/23 10:06 AM, Joseph Qi wrote:> > > On 2/14/23 7:48 PM, Heming Zhao wrote: >> On Tue, Feb 14, 2023 at 07:08:16PM +0800, Joseph Qi wrote: >>> Hi, >>> >>> On 2/14/23 12:33 PM, Heming Zhao wrote: >>>> On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote: >>>>> Hi, Sorry about the late reply. >>>>> This thread is indeed a long time ago:( >>>>> It seems that I said the two ocfs2_journal_access_di() are for different >>>>> buffer head. Anyway, I have to recall the discussion before and get back >>>>> to you. >>>>> >>>> If you belive from [a] to [b.1] doesn't need any journal protection, my patch >>>> makes sense from theory. >>>> >>>> From code logic, if the defrag path (ocfs2_split_extent) doesn't call >>>> jbd2_journal_restart(), current code is absolute correct. Increasing credits >>>> can't help avoid jbd2 assert crash, but make more easily to trigger crash, >>>> because it makes jbd2_journal_extend() more easily to return "status > 0". In my >>>> view, the correct fix should delete the access/dirty pair and make sure the >>>> ocfs2_split_extent() is journal self-service. >>>> >>>> Different buffer head issue could be handled by rechecking ocfs2_split_extent(), >>>> make sure all journal handle branches are correct. I already checked >>>> ocfs2_split_extent, but I can't assure my eyes didn't miss any corner. >>> >>> ocfs2_split_extent are not just 'read' operations, it may grow the tree, >>> that's why we have to do a transaction here. >> >> You misunderstand my meaning, my mean: from the caller __ocfs2_move_extent() >> calling ocfs2_journal_access_di() (below [a]) to (before) the handling >> ctxt.c_contig_type (below [b.1]) in ocfs2_split_extent, this area is purely read >> operations. It's the range [ [a], [b.1]) (include [a], not include [b.1]) >> >> __ocfs2_move_extent >> + ocfs2_journal_access_di //[a] <------+ >> + ocfs2_split_extent //[b] | [a, b.1) area is read operations >> | + if //[b.1] <------+ >> | | A >> | | +---- from this line, this func starts r/w IOs & needs jbd2 operations >> | | >> | | ocfs2_replace_extent_rec()/ocfs2_split_and_insert() >> | + else >> | ocfs2_try_to_merge_extent () >> | >> + ocfs2_journal_dirty //[c] >> >>> >>> The whole code flow: >>> ocfs2_ioctl_move_extents >>> ocfs2_move_extents >>> __ocfs2_move_extents_range >>> ocfs2_defrag_extent [1] >>> ocfs2_start_trans [a] >>> __ocfs2_move_extent >>> ocfs2_journal_access_di >>> ocfs2_split_extent >>> ocfs2_journal_dirty >>> ocfs2_commit_trans >>> ocfs2_move_extent [2] >>> ocfs2_start_trans >>> __ocfs2_move_extent >>> ocfs2_journal_access_di >>> ocfs2_split_extent >>> ocfs2_journal_dirty >>> ocfs2_commit_trans >>> ocfs2_start_trans [b] >>> ocfs2_journal_access_di >>> ocfs2_journal_dirty >>> ocfs2_commit_trans >>> >>> In above, [a] and [b] are different transaction start/commit pair, and >>> each allocates different journal credits, even they are operate the same >>> bh, they are still different operations. >> >> I agree above writing, but it can't change my conclusion: ocfs2_split_extent() >> is journal self-service. > > Don't understand what does 'journal self-service' mean.Sorry for my English skill, I invent this word 'journal self-service'. The meaning is: this function could handle journal operations totally by itself. Caller doesn't need to call access/dirty pair. Caller only needs to call jbd2 journal start/stop pair. Because ocfs2_start_trans() & ocfs2_commit_trans are reenterable function, we even could add these two func in ocfs2_split_extent(). Then any caller won't take care of any journal operations when calling ocfs2_split_extent().> >> It seems the author meaning: he wanted [a] to protect defragging actions (extents >> changes), wanted [b] to protect inode ctime changes. the >> ocfs2_start_trans/ocfs2_commit_trans in [a] area is necessary. but the >> access/dirty pair in __ocfs2_move_extent() is potential crash risk. >> > If so, what is transaction [a] protect? (no dirty buffer)Transaction [a] protects the dirty buffers in ocfs2_split_extent(). ocfs2_split_extent() has already correctly used access/dirty pair to mark/handle dirty buffers. Thanks, Heming
Joseph Qi
2023-Feb-15 06:42 UTC
[Ocfs2-devel] discuss about jbd2 assertion in defragment path
On 2/15/23 2:20 PM, Heming Zhao wrote:> On 2/15/23 10:06 AM, Joseph Qi wrote: >> >> >> On 2/14/23 7:48 PM, Heming Zhao wrote: >>> On Tue, Feb 14, 2023 at 07:08:16PM +0800, Joseph Qi wrote: >>>> Hi, >>>> >>>> On 2/14/23 12:33 PM, Heming Zhao wrote: >>>>> On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote: >>>>>> Hi, Sorry about the late reply. >>>>>> This thread is indeed a long time ago:( >>>>>> It seems that I said the two ocfs2_journal_access_di() are for different >>>>>> buffer head. Anyway, I have to recall the discussion before and get back >>>>>> to you. >>>>>> >>>>> If you belive from [a] to [b.1] doesn't need any journal protection, my patch >>>>> makes sense from theory. >>>>> >>>>> ?From code logic, if the defrag path (ocfs2_split_extent) doesn't call >>>>> jbd2_journal_restart(), current code is absolute correct. Increasing credits >>>>> can't help avoid jbd2 assert crash, but make more easily to trigger crash, >>>>> because it makes jbd2_journal_extend() more easily to return "status > 0". In my >>>>> view, the correct fix should delete the access/dirty pair and make sure the >>>>> ocfs2_split_extent() is journal self-service. >>>>> >>>>> Different buffer head issue could be handled by rechecking ocfs2_split_extent(), >>>>> make sure all journal handle branches are correct. I already checked >>>>> ocfs2_split_extent, but I can't assure my eyes didn't miss any corner. >>>> >>>> ocfs2_split_extent are not just 'read' operations, it may grow the tree, >>>> that's why we have to do a transaction here. >>> >>> You misunderstand my meaning, my mean: from the caller __ocfs2_move_extent() >>> calling ocfs2_journal_access_di() (below [a]) to (before) the handling >>> ctxt.c_contig_type (below [b.1]) in ocfs2_split_extent, this area is purely read >>> operations. It's the range [ [a], [b.1]) (include [a], not include [b.1]) >>> >>> __ocfs2_move_extent >>> ? + ocfs2_journal_access_di //[a]?? <------+ >>> ? + ocfs2_split_extent????? //[b]????????? | [a, b.1) area is read operations >>> ? |? + if?????????????????? //[b.1] <------+ >>> ? |? |? A >>> ? |? |? +---- from this line, this func starts r/w IOs & needs jbd2 operations >>> ? |? | >>> ? |? |?? ocfs2_replace_extent_rec()/ocfs2_split_and_insert() >>> ? |? + else >>> ? |????? ocfs2_try_to_merge_extent () >>> ? | >>> ? + ocfs2_journal_dirty???? //[c] >>> >>>> >>>> The whole code flow: >>>> ocfs2_ioctl_move_extents >>>> ?? ocfs2_move_extents >>>> ???? __ocfs2_move_extents_range >>>> ?????? ocfs2_defrag_extent? [1] >>>> ???????? ocfs2_start_trans? [a] >>>> ???????? __ocfs2_move_extent >>>> ?????????? ocfs2_journal_access_di >>>> ?????????? ocfs2_split_extent >>>> ?????????? ocfs2_journal_dirty >>>> ???????? ocfs2_commit_trans >>>> ?????? ocfs2_move_extent? [2] >>>> ???????? ocfs2_start_trans >>>> ???????? __ocfs2_move_extent >>>> ?????????? ocfs2_journal_access_di >>>> ?????????? ocfs2_split_extent >>>> ?????????? ocfs2_journal_dirty >>>> ???????? ocfs2_commit_trans >>>> ???? ocfs2_start_trans? [b] >>>> ???? ocfs2_journal_access_di >>>> ???? ocfs2_journal_dirty >>>> ???? ocfs2_commit_trans >>>> >>>> In above, [a] and [b] are different transaction start/commit pair, and >>>> each allocates different journal credits, even they are operate the same >>>> bh, they are still different operations. >>> >>> I agree above writing, but it can't change my conclusion: ocfs2_split_extent() >>> is journal self-service. >> >> Don't understand what does 'journal self-service' mean. > > Sorry for my English skill, I invent this word 'journal self-service'. > The meaning is: this function could handle journal operations totally by itself. > Caller doesn't need to call access/dirty pair. Caller only needs to call jbd2 > journal start/stop pair. > > Because ocfs2_start_trans() & ocfs2_commit_trans are reenterable function, > we even could add these two func in ocfs2_split_extent(). Then any caller won't > take care of any journal operations when calling ocfs2_split_extent(). > >> >>> It seems the author meaning: he wanted [a] to protect defragging actions (extents >>> changes), wanted [b] to protect inode ctime changes. the >>> ocfs2_start_trans/ocfs2_commit_trans in [a] area is necessary. but the >>> access/dirty pair in __ocfs2_move_extent() is potential crash risk. >>> >> If so, what is transaction [a] protect? (no dirty buffer) > > Transaction [a] protects the dirty buffers in ocfs2_split_extent(). > ocfs2_split_extent() has already correctly used access/dirty pair to > mark/handle dirty buffers. >Okay, I think I've gotten your idea now. You change looks reasonable. Could you please also check if move extents without auto defrag also has the same issue? Thanks, Joseph