Joseph Qi
2020-Mar-11  01:15 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a null pointer derefrence in ocfs2_block_group_clear_bits()
On 2020/3/9 16:26, lishan wrote:> A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. > The information of NULL pointer stack as follows: > > PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" > #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 > #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 > #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc > #3 [ffff0000b4d6b380] die at ffff00008008f6ac > #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 > #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c > #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 > #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c > PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] > LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] > SP: ffff0000b4d6b640 PSTATE: 60400009 > X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 > X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 > X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 > X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 > X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 > X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff > X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff > X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 > X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 > X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 > #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] > #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] > #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] > #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] > #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] > #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] > #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] > #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] > #16 [ffff0000b4d6bb70] evict at ffff000080365040 > #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 > #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] > #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 > #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 > #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 > #22 [ffff0000b4d6bcb0] dput at ffff00008036093c > #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 > #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac > #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc > > The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. > It is presumed that the network was disconnected during the write process, > causing the transaction abort. as follows: > jbd2_journal_abort > ....... > jbd2_journal_commit_transaction > jh->b_committed_data = NULL; > > _ocfs2_free_suballoc_bits > ocfs2_block_group_clear_bits > // undo_bg is now set to null > BUG_ON(!undo_bg); > > When applying for free space, if b_committed_data is null, > it will be directly occupied, as follows: > ocfs2_cluster_group_search > ocfs2_block_group_find_clear_bits > ocfs2_test_bg_bit_allocatable: > bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; > if (bg) > ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > else > ret = 1; > b_committed_data is an intermediate state backup for bitmap transaction commits, > newly applied space can overwrite previous dirty data, > so, I think, while free clusters, if b_committed_data is null, ignore it. > Host panic directly, too violent. > > Signed-off-by: Shan Li <lishan24 at huawei.com> > Reviewed-by: Jun Piao <piaojun at huawei.com> > --- > fs/ocfs2/suballoc.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c > index 939df99d2dec..aaf1b3cbd984 100644 > --- a/fs/ocfs2/suballoc.c > +++ b/fs/ocfs2/suballoc.c > @@ -2412,14 +2412,19 @@ static int ocfs2_block_group_clear_bits(handle_t *handle, > if (undo_fn) { > spin_lock(&jh->b_state_lock); > undo_bg = (struct ocfs2_group_desc *) jh->b_committed_data; > - BUG_ON(!undo_bg); > + if (!undo_bg) > + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " > + "b_committed_data had been cleared.\n", > + OCFS2_SB(alloc_inode->i_sb)->uuid_str, > + (unsigned long long)le64_to_cpu(bg->bg_blkno), > + alloc_inode->i_sb->s_id);Seems a kind of workaround. I am worrying about other abnormal cases of NULL b_committed_data, it may lead to a corrupt filesystem. So how about isolating the journal abort case? Thanks, Joseph> } > > tmp = num_bits; > while(tmp--) { > ocfs2_clear_bit((bit_off + tmp), > (unsigned long *) bg->bg_bitmap); > - if (undo_fn) > + if (undo_fn && undo_bg) > undo_fn(bit_off + tmp, > (unsigned long *) undo_bg->bg_bitmap); > } >
lishan
2020-Mar-12  02:38 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a null pointer derefrence in ocfs2_block_group_clear_bits()
Hi, Jan Kara: I have a long-standing problem, In the ocfs2 file system, there is a panic problem related to b_committed_data, details can be found in the historical mail. Refer to the old version of the ext3 file system, ocfs2 file system uses the undo process to ensure that metadata which is not committed to disk is not reused. I understand that for transactions that have not been committed, b_committed_data will not be NULL, so, in what scenario do you think b_committed_data will be set to NULL? Thanks, Shan On 2020/3/11 9:15, Joseph Qi wrote:> > > On 2020/3/9 16:26, lishan wrote: >> A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. >> The information of NULL pointer stack as follows: >> >> PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" >> #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 >> #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 >> #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc >> #3 [ffff0000b4d6b380] die at ffff00008008f6ac >> #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 >> #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c >> #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 >> #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c >> PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] >> LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] >> SP: ffff0000b4d6b640 PSTATE: 60400009 >> X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 >> X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 >> X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 >> X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 >> X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 >> X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff >> X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff >> X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 >> X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 >> X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 >> #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] >> #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] >> #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] >> #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] >> #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] >> #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] >> #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] >> #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] >> #16 [ffff0000b4d6bb70] evict at ffff000080365040 >> #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 >> #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] >> #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 >> #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 >> #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 >> #22 [ffff0000b4d6bcb0] dput at ffff00008036093c >> #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 >> #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac >> #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc >> >> The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. >> It is presumed that the network was disconnected during the write process, >> causing the transaction abort. as follows: >> jbd2_journal_abort >> ....... >> jbd2_journal_commit_transaction >> jh->b_committed_data = NULL; >> >> _ocfs2_free_suballoc_bits >> ocfs2_block_group_clear_bits >> // undo_bg is now set to null >> BUG_ON(!undo_bg); >> >> When applying for free space, if b_committed_data is null, >> it will be directly occupied, as follows: >> ocfs2_cluster_group_search >> ocfs2_block_group_find_clear_bits >> ocfs2_test_bg_bit_allocatable: >> bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; >> if (bg) >> ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> else >> ret = 1; >> b_committed_data is an intermediate state backup for bitmap transaction commits, >> newly applied space can overwrite previous dirty data, >> so, I think, while free clusters, if b_committed_data is null, ignore it. >> Host panic directly, too violent. >> >> Signed-off-by: Shan Li <lishan24 at huawei.com> >> Reviewed-by: Jun Piao <piaojun at huawei.com> >> --- >> fs/ocfs2/suballoc.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c >> index 939df99d2dec..aaf1b3cbd984 100644 >> --- a/fs/ocfs2/suballoc.c >> +++ b/fs/ocfs2/suballoc.c >> @@ -2412,14 +2412,19 @@ static int ocfs2_block_group_clear_bits(handle_t *handle, >> if (undo_fn) { >> spin_lock(&jh->b_state_lock); >> undo_bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> - BUG_ON(!undo_bg); >> + if (!undo_bg) >> + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " >> + "b_committed_data had been cleared.\n", >> + OCFS2_SB(alloc_inode->i_sb)->uuid_str, >> + (unsigned long long)le64_to_cpu(bg->bg_blkno), >> + alloc_inode->i_sb->s_id); > > Seems a kind of workaround. > I am worrying about other abnormal cases of NULL b_committed_data, it > may lead to a corrupt filesystem. > So how about isolating the journal abort case? > > Thanks, > Joseph > >> } >> >> tmp = num_bits; >> while(tmp--) { >> ocfs2_clear_bit((bit_off + tmp), >> (unsigned long *) bg->bg_bitmap); >> - if (undo_fn) >> + if (undo_fn && undo_bg) >> undo_fn(bit_off + tmp, >> (unsigned long *) undo_bg->bg_bitmap); >> } >> > > . >