Joseph Qi
2023-Feb-14 02:52 UTC
[Ocfs2-devel] discuss about jbd2 assertion in defragment path
Hi, Sorry about the late reply. This thread is indeed a long time ago:( It seems that I said the two ocfs2_journal_access_di() are for different buffer head. Anyway, I have to recall the discussion before and get back to you. Thanks, Joseph On 2/10/23 6:04 PM, Heming Zhao wrote:> Hello Joseph, > > I am sorry to wake up a long time ago thread. > > All mails of this thread (my patch is [1]): > [1] https://oss.oracle.com/pipermail/ocfs2-devel/2022-May/000101.html > [2] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000105.html > [3] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000109.html > [4] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000217.html > > I re-checked ocfs2 defragmentation & jbd2 flow recently, I still think my > patch [1] is right. At least, the fixing code is correct, the patch commit > log needs to polish. > > This bug has the same root cause of commit 7f27ec978b0e ("ocfs2: call > ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()"). > For this bug, jbd2_journal_restart() is called by ocfs2_split_extent() during > defragmenting, and it's not about "not enough credits" issue you ever said in [2]. > > I explain my thinking again in this mail. > > the crash call flow: > > ocfs2_defrag_extent //caller call it in while() loop. > + handle = ocfs2_start_trans(osb, credits) > + __ocfs2_move_extent > | + ocfs2_journal_access_di //[a] > | + ocfs2_split_extent //[b] > | | + if //[b.1] > | | | ocfs2_replace_extent_rec/ocfs2_split_and_insert > | | + else > | | ocfs2_try_to_merge_extent > | | > | + ocfs2_journal_dirty //[c] > | > + ocfs2_commit_trans(osb, handle) //<== complete this handle > > In my viewpoint, ocfs2_split_extent() is journal self-service function. I still > belive the two lines ([a] & [c]) in __ocfs2_move_extent() are totally useless. > In ocfs2_split_extent(), the code from the first code line to "if-else" code > area ([b.1]) doesn't need any journal protection, and we also could see there > are only read operations. > > If we worry about data corruption after removing [a] & [c], (e.g: my eyes missed > some journal operations from [a] to [b.1]), we could only delete [c]. So the > fixed code seems (only remove line [c]): > > ocfs2_defrag_extent > + handle = ocfs2_start_trans(osb, credits) > + __ocfs2_move_extent > | + ocfs2_journal_access_di //[a] <-- keep it, but remove pair dirty action > | + ocfs2_split_extent //[b] > | + if //[b.1] > | | ocfs2_replace_extent_rec/ocfs2_split_and_insert > | + else > | ocfs2_try_to_merge_extent > | > + ocfs2_commit_trans(osb, handle) > > Thanks, > Heming
Heming Zhao
2023-Feb-14 04:33 UTC
[Ocfs2-devel] discuss about jbd2 assertion in defragment path
On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote:> Hi, Sorry about the late reply. > This thread is indeed a long time ago:( > It seems that I said the two ocfs2_journal_access_di() are for different > buffer head. Anyway, I have to recall the discussion before and get back > to you. >If you belive from [a] to [b.1] doesn't need any journal protection, my patch makes sense from theory.>From code logic, if the defrag path (ocfs2_split_extent) doesn't calljbd2_journal_restart(), current code is absolute correct. Increasing credits can't help avoid jbd2 assert crash, but make more easily to trigger crash, because it makes jbd2_journal_extend() more easily to return "status > 0". In my view, the correct fix should delete the access/dirty pair and make sure the ocfs2_split_extent() is journal self-service. Different buffer head issue could be handled by rechecking ocfs2_split_extent(), make sure all journal handle branches are correct. I already checked ocfs2_split_extent, but I can't assure my eyes didn't miss any corner. Thanks, Heming> > On 2/10/23 6:04 PM, Heming Zhao wrote: > > Hello Joseph, > > > > I am sorry to wake up a long time ago thread. > > > > All mails of this thread (my patch is [1]): > > [1] https://oss.oracle.com/pipermail/ocfs2-devel/2022-May/000101.html > > [2] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000105.html > > [3] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000109.html > > [4] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000217.html > > > > I re-checked ocfs2 defragmentation & jbd2 flow recently, I still think my > > patch [1] is right. At least, the fixing code is correct, the patch commit > > log needs to polish. > > > > This bug has the same root cause of commit 7f27ec978b0e ("ocfs2: call > > ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()"). > > For this bug, jbd2_journal_restart() is called by ocfs2_split_extent() during > > defragmenting, and it's not about "not enough credits" issue you ever said in [2]. > > > > I explain my thinking again in this mail. > > > > the crash call flow: > > > > ocfs2_defrag_extent //caller call it in while() loop. > > + handle = ocfs2_start_trans(osb, credits) > > + __ocfs2_move_extent > > | + ocfs2_journal_access_di //[a] > > | + ocfs2_split_extent //[b] > > | | + if //[b.1] > > | | | ocfs2_replace_extent_rec/ocfs2_split_and_insert > > | | + else > > | | ocfs2_try_to_merge_extent > > | | > > | + ocfs2_journal_dirty //[c] > > | > > + ocfs2_commit_trans(osb, handle) //<== complete this handle > > > > In my viewpoint, ocfs2_split_extent() is journal self-service function. I still > > belive the two lines ([a] & [c]) in __ocfs2_move_extent() are totally useless. > > In ocfs2_split_extent(), the code from the first code line to "if-else" code > > area ([b.1]) doesn't need any journal protection, and we also could see there > > are only read operations. > > > > If we worry about data corruption after removing [a] & [c], (e.g: my eyes missed > > some journal operations from [a] to [b.1]), we could only delete [c]. So the > > fixed code seems (only remove line [c]): > > > > ocfs2_defrag_extent > > + handle = ocfs2_start_trans(osb, credits) > > + __ocfs2_move_extent > > | + ocfs2_journal_access_di //[a] <-- keep it, but remove pair dirty action > > | + ocfs2_split_extent //[b] > > | + if //[b.1] > > | | ocfs2_replace_extent_rec/ocfs2_split_and_insert > > | + else > > | ocfs2_try_to_merge_extent > > | > > + ocfs2_commit_trans(osb, handle) > > > > Thanks, > > Heming