lishan
2020-Mar-09 08:26 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a null pointer derefrence in ocfs2_block_group_clear_bits()
A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. The information of NULL pointer stack as follows: PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc #3 [ffff0000b4d6b380] die at ffff00008008f6ac #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] SP: ffff0000b4d6b640 PSTATE: 60400009 X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] #16 [ffff0000b4d6bb70] evict at ffff000080365040 #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 #22 [ffff0000b4d6bcb0] dput at ffff00008036093c #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. It is presumed that the network was disconnected during the write process, causing the transaction abort. as follows: jbd2_journal_abort ....... jbd2_journal_commit_transaction jh->b_committed_data = NULL; _ocfs2_free_suballoc_bits ocfs2_block_group_clear_bits // undo_bg is now set to null BUG_ON(!undo_bg); When applying for free space, if b_committed_data is null, it will be directly occupied, as follows: ocfs2_cluster_group_search ocfs2_block_group_find_clear_bits ocfs2_test_bg_bit_allocatable: bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; if (bg) ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); else ret = 1; b_committed_data is an intermediate state backup for bitmap transaction commits, newly applied space can overwrite previous dirty data, so, I think, while free clusters, if b_committed_data is null, ignore it. Host panic directly, too violent. Signed-off-by: Shan Li <lishan24 at huawei.com> Reviewed-by: Jun Piao <piaojun at huawei.com> --- fs/ocfs2/suballoc.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 939df99d2dec..aaf1b3cbd984 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -2412,14 +2412,19 @@ static int ocfs2_block_group_clear_bits(handle_t *handle, if (undo_fn) { spin_lock(&jh->b_state_lock); undo_bg = (struct ocfs2_group_desc *) jh->b_committed_data; - BUG_ON(!undo_bg); + if (!undo_bg) + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " + "b_committed_data had been cleared.\n", + OCFS2_SB(alloc_inode->i_sb)->uuid_str, + (unsigned long long)le64_to_cpu(bg->bg_blkno), + alloc_inode->i_sb->s_id); } tmp = num_bits; while(tmp--) { ocfs2_clear_bit((bit_off + tmp), (unsigned long *) bg->bg_bitmap); - if (undo_fn) + if (undo_fn && undo_bg) undo_fn(bit_off + tmp, (unsigned long *) undo_bg->bg_bitmap); } -- 2.19.1
lishan
2020-Mar-10 13:07 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a null pointer derefrence in ocfs2_block_group_clear_bits()
ping ? cc Andrew Morton On 2020/3/9 16:26, lishan wrote:> A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. > The information of NULL pointer stack as follows: > > PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" > #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 > #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 > #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc > #3 [ffff0000b4d6b380] die at ffff00008008f6ac > #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 > #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c > #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 > #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c > PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] > LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] > SP: ffff0000b4d6b640 PSTATE: 60400009 > X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 > X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 > X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 > X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 > X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 > X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff > X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff > X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 > X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 > X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 > #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] > #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] > #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] > #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] > #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] > #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] > #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] > #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] > #16 [ffff0000b4d6bb70] evict at ffff000080365040 > #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 > #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] > #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 > #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 > #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 > #22 [ffff0000b4d6bcb0] dput at ffff00008036093c > #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 > #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac > #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc > > The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. > It is presumed that the network was disconnected during the write process, > causing the transaction abort. as follows: > jbd2_journal_abort > ....... > jbd2_journal_commit_transaction > jh->b_committed_data = NULL; > > _ocfs2_free_suballoc_bits > ocfs2_block_group_clear_bits > // undo_bg is now set to null > BUG_ON(!undo_bg); > > When applying for free space, if b_committed_data is null, > it will be directly occupied, as follows: > ocfs2_cluster_group_search > ocfs2_block_group_find_clear_bits > ocfs2_test_bg_bit_allocatable: > bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; > if (bg) > ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > else > ret = 1; > b_committed_data is an intermediate state backup for bitmap transaction commits, > newly applied space can overwrite previous dirty data, > so, I think, while free clusters, if b_committed_data is null, ignore it. > Host panic directly, too violent. > > Signed-off-by: Shan Li <lishan24 at huawei.com> > Reviewed-by: Jun Piao <piaojun at huawei.com> > --- > fs/ocfs2/suballoc.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c > index 939df99d2dec..aaf1b3cbd984 100644 > --- a/fs/ocfs2/suballoc.c > +++ b/fs/ocfs2/suballoc.c > @@ -2412,14 +2412,19 @@ static int ocfs2_block_group_clear_bits(handle_t *handle, > if (undo_fn) { > spin_lock(&jh->b_state_lock); > undo_bg = (struct ocfs2_group_desc *) jh->b_committed_data; > - BUG_ON(!undo_bg); > + if (!undo_bg) > + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " > + "b_committed_data had been cleared.\n", > + OCFS2_SB(alloc_inode->i_sb)->uuid_str, > + (unsigned long long)le64_to_cpu(bg->bg_blkno), > + alloc_inode->i_sb->s_id); > } > > tmp = num_bits; > while(tmp--) { > ocfs2_clear_bit((bit_off + tmp), > (unsigned long *) bg->bg_bitmap); > - if (undo_fn) > + if (undo_fn && undo_bg) > undo_fn(bit_off + tmp, > (unsigned long *) undo_bg->bg_bitmap); > } >
Joseph Qi
2020-Mar-11 01:15 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a null pointer derefrence in ocfs2_block_group_clear_bits()
On 2020/3/9 16:26, lishan wrote:> A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. > The information of NULL pointer stack as follows: > > PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" > #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 > #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 > #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc > #3 [ffff0000b4d6b380] die at ffff00008008f6ac > #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 > #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c > #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 > #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c > PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] > LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] > SP: ffff0000b4d6b640 PSTATE: 60400009 > X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 > X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 > X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 > X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 > X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 > X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff > X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff > X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 > X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 > X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 > #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] > #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] > #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] > #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] > #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] > #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] > #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] > #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] > #16 [ffff0000b4d6bb70] evict at ffff000080365040 > #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 > #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] > #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 > #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 > #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 > #22 [ffff0000b4d6bcb0] dput at ffff00008036093c > #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 > #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac > #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc > > The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. > It is presumed that the network was disconnected during the write process, > causing the transaction abort. as follows: > jbd2_journal_abort > ....... > jbd2_journal_commit_transaction > jh->b_committed_data = NULL; > > _ocfs2_free_suballoc_bits > ocfs2_block_group_clear_bits > // undo_bg is now set to null > BUG_ON(!undo_bg); > > When applying for free space, if b_committed_data is null, > it will be directly occupied, as follows: > ocfs2_cluster_group_search > ocfs2_block_group_find_clear_bits > ocfs2_test_bg_bit_allocatable: > bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; > if (bg) > ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > else > ret = 1; > b_committed_data is an intermediate state backup for bitmap transaction commits, > newly applied space can overwrite previous dirty data, > so, I think, while free clusters, if b_committed_data is null, ignore it. > Host panic directly, too violent. > > Signed-off-by: Shan Li <lishan24 at huawei.com> > Reviewed-by: Jun Piao <piaojun at huawei.com> > --- > fs/ocfs2/suballoc.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c > index 939df99d2dec..aaf1b3cbd984 100644 > --- a/fs/ocfs2/suballoc.c > +++ b/fs/ocfs2/suballoc.c > @@ -2412,14 +2412,19 @@ static int ocfs2_block_group_clear_bits(handle_t *handle, > if (undo_fn) { > spin_lock(&jh->b_state_lock); > undo_bg = (struct ocfs2_group_desc *) jh->b_committed_data; > - BUG_ON(!undo_bg); > + if (!undo_bg) > + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " > + "b_committed_data had been cleared.\n", > + OCFS2_SB(alloc_inode->i_sb)->uuid_str, > + (unsigned long long)le64_to_cpu(bg->bg_blkno), > + alloc_inode->i_sb->s_id);Seems a kind of workaround. I am worrying about other abnormal cases of NULL b_committed_data, it may lead to a corrupt filesystem. So how about isolating the journal abort case? Thanks, Joseph> } > > tmp = num_bits; > while(tmp--) { > ocfs2_clear_bit((bit_off + tmp), > (unsigned long *) bg->bg_bitmap); > - if (undo_fn) > + if (undo_fn && undo_bg) > undo_fn(bit_off + tmp, > (unsigned long *) undo_bg->bg_bitmap); > } >
Andrew Morton
2020-Mar-12 01:40 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a null pointer derefrence in ocfs2_block_group_clear_bits()
On Tue, 10 Mar 2020 21:07:06 +0800 lishan <lishan24 at huawei.com> wrote:> On 2020/3/9 16:26, lishan wrote: > > A NULL pointer panic dereference in ocfs2_block_group_clear_bits() happen again. > > The information of NULL pointer stack as follows: > > > > PID: 81866 TASK: ffffa07c3c21ae80 CPU: 66 COMMAND: "fallocate" > > #0 [ffff0000b4d6b0b0] machine_kexec at ffff0000800a2954 > > #1 [ffff0000b4d6b110] __crash_kexec at ffff0000801bab34 > > #2 [ffff0000b4d6b2a0] panic at ffff0000800f02cc > > #3 [ffff0000b4d6b380] die at ffff00008008f6ac > > #4 [ffff0000b4d6b3d0] bug_handler at ffff00008008f744 > > #5 [ffff0000b4d6b400] brk_handler at ffff000080085d1c > > #6 [ffff0000b4d6b420] do_debug_exception at ffff000080081194 > > #7 [ffff0000b4d6b630] el1_dbg at ffff00008008332c > > PC: ffff00000190e9c0 [_ocfs2_free_suballoc_bits+1608] > > LR: ffff00000190e990 [_ocfs2_free_suballoc_bits+1560] > > SP: ffff0000b4d6b640 PSTATE: 60400009 > > X29: ffff0000b4d6b650 X28: 0000000000000000 X27: 00000000000052f3 > > X26: ffff807c511a9570 X25: ffff807ca0054000 X24: 00000000000052f2 > > X23: 0000000000000001 X22: ffff807c7cde7a90 X21: ffff0000811d9000 > > X20: ffff807c5e7d2000 X19: ffff00000190c768 X18: 0000000000000000 > > X17: 0000000000000000 X16: ffff000080a032f0 X15: 0000000000000000 > > X14: ffffffffffffffff X13: fffffffffffffff7 X12: ffffffffffffffff > > X11: 0000000000000038 X10: 0101010101010101 X9: ffffffffffffffff > > X8: 7f7f7f7f7f7f7f7f X7: 0000000000000000 X6: 0000000000000080 > > X5: 0000000000000000 X4: 0000000000000002 X3: ffff00000199f390 > > X2: a603c08321456e00 X1: ffff807c7cde7a90 X0: 0000000000000000 > > #8 [ffff0000b4d6b650] _ocfs2_free_suballoc_bits at ffff00000190e9bc [ocfs2] > > #9 [ffff0000b4d6b710] _ocfs2_free_clusters at ffff0000019110d4 [ocfs2] > > #10 [ffff0000b4d6b790] ocfs2_free_clusters at ffff000001913e94 [ocfs2] > > #11 [ffff0000b4d6b7d0] __ocfs2_flush_truncate_log at ffff0000018b5294 [ocfs2] > > #12 [ffff0000b4d6b8a0] ocfs2_remove_btree_range at ffff0000018bb34c [ocfs2] > > #13 [ffff0000b4d6b960] ocfs2_commit_truncate at ffff0000018bc76c [ocfs2] > > #14 [ffff0000b4d6ba60] ocfs2_wipe_inode at ffff0000018e57bc [ocfs2] > > #15 [ffff0000b4d6bb00] ocfs2_evict_inode at ffff0000018e5db8 [ocfs2] > > #16 [ffff0000b4d6bb70] evict at ffff000080365040 > > #17 [ffff0000b4d6bba0] iput at ffff0000803655d8 > > #18 [ffff0000b4d6bbe0] ocfs2_dentry_iput at ffff0000018c60a0 [ocfs2] > > #19 [ffff0000b4d6bc30] dentry_unlink_inode at ffff00008035ef58 > > #20 [ffff0000b4d6bc50] __dentry_kill at ffff000080360384 > > #21 [ffff0000b4d6bc80] dentry_kill at ffff000080360670 > > #22 [ffff0000b4d6bcb0] dput at ffff00008036093c > > #23 [ffff0000b4d6bcf0] __fput at ffff000080343930 > > #24 [ffff0000b4d6bd40] ____fput at ffff000080343aac > > #25 [ffff0000b4d6bd60] task_work_run at ffff0000801172fc > > > > The direct panic reason is that bh2jh (group_bh)-> b_committed_data is null. > > It is presumed that the network was disconnected during the write process, > > causing the transaction abort. as follows: > > jbd2_journal_abort > > ....... > > jbd2_journal_commit_transaction > > jh->b_committed_data = NULL; > > > > _ocfs2_free_suballoc_bits > > ocfs2_block_group_clear_bits > > // undo_bg is now set to null > > BUG_ON(!undo_bg); > > > > When applying for free space, if b_committed_data is null, > > it will be directly occupied, as follows: > > ocfs2_cluster_group_search > > ocfs2_block_group_find_clear_bits > > ocfs2_test_bg_bit_allocatable: > > bg = (struct ocfs2_group_desc *) bh2jh(bg_bh)->b_committed_data; > > if (bg) > > ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > > else > > ret = 1; > > b_committed_data is an intermediate state backup for bitmap transaction commits, > > newly applied space can overwrite previous dirty data, > > so, I think, while free clusters, if b_committed_data is null, ignore it. > > Host panic directly, too violent.(top-posting repaired)> ping ? > cc Andrew MortonThere's something dreadfully wrong with the ocfs2-devel mailing list. I almost never receive patches when people add me to cc. I've never seen much of a pattern to it - it just drops stuff everywhere. Who is the admin for this list? Can we please move it to kernel.org? Regarding this patch, I did actually receive one email from the server.>From Joseph:: > + if (!undo_bg) : > + mlog(ML_NOTICE, "%s: group descriptor # %llu (device %s) journal " : > + "b_committed_data had been cleared.\n", : > + OCFS2_SB(alloc_inode->i_sb)->uuid_str, : > + (unsigned long long)le64_to_cpu(bg->bg_blkno), : > + alloc_inode->i_sb->s_id); : : Seems a kind of workaround. : I am worrying about other abnormal cases of NULL b_committed_data, it : may lead to a corrupt filesystem. : So how about isolating the journal abort case?