Hi Junxiao,
Thanks for pointing this out. I meant to free the hr_db_* buffers, which were leaking.
Sorry for my mistake.
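
To illustrate the mix-up, here is a minimal user-space sketch. It assumes the hr_debug_* fields are handles to debugfs files (owned by another subsystem) and the hr_db_* fields are the buffers the region allocated itself. All names below are hypothetical stand-ins, not the kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a debugfs file handle: NOT owned by the region. */
struct fake_dentry {
	int removed;
};

/* Hypothetical stand-in for the per-file debug buffer: owned by the region. */
struct fake_debug_buf {
	int kind;
};

struct fake_region {
	struct fake_dentry *debug_pinned;	/* handle, must not be freed here */
	struct fake_debug_buf *db_pinned;	/* buffer, must be freed here */
};

/* Allocate the buffer and remember the externally owned file handle. */
static int fake_region_init(struct fake_region *reg, struct fake_dentry *file)
{
	reg->db_pinned = malloc(sizeof(*reg->db_pinned));
	if (!reg->db_pinned)
		return -1;
	reg->db_pinned->kind = 0;
	reg->debug_pinned = file;
	return 0;
}

/*
 * Release: tell the owner of the handle to drop it (analogous to
 * debugfs_remove()), then free only the memory we allocated ourselves.
 * Passing debug_pinned to free() instead would hand someone else's
 * object to the allocator, which is the kind of slip fixed here.
 */
static void fake_region_release(struct fake_region *reg)
{
	reg->debug_pinned->removed = 1;
	reg->debug_pinned = NULL;

	free(reg->db_pinned);
	reg->db_pinned = NULL;
}
```

The point of the sketch is only the ownership split: the handle is released through its owner, and the free call is reserved for the buffer the region allocated.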
Reviewed-by: Joseph Qi <joseph.qi at huawei.com>
On 2016/3/17 15:50, Junxiao Bi wrote:
> This is a regression that causes the following kernel panic
> when running the ocfs2 multiple test.
>
> [ 254.604228] BUG: unable to handle kernel paging request at
> 00000002000800c0
> [ 254.605013] IP: [<ffffffff81192978>] kmem_cache_alloc+0x78/0x160
> [ 254.605013] PGD 7bbe5067 PUD 0
> [ 254.605013] Oops: 0000 [#1] SMP
> [ 254.605013] Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
> ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi
> scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront
> [ 254.605013] CPU: 2 PID: 4044 Comm: mpirun Not tainted
> 4.5.0-rc5-next-20160225 #1
> [ 254.605013] Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014
> [ 254.605013] task: ffff88007a521a80 ti: ffff88007aed0000 task.ti:
> ffff88007aed0000
> [ 254.605013] RIP: 0010:[<ffffffff81192978>] [<ffffffff81192978>]
> kmem_cache_alloc+0x78/0x160
> [ 254.605013] RSP: 0018:ffff88007aed3a48 EFLAGS: 00010282
> [ 254.605013] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000001991
> [ 254.605013] RDX: 0000000000001990 RSI: 00000000024000c0 RDI:
> 000000000001b330
> [ 254.605013] RBP: ffff88007aed3a98 R08: ffff88007d29b330 R09:
> 00000002000800c0
> [ 254.605013] R10: 0000000c51376d87 R11: ffff8800792cac38 R12:
> ffff88007cc30f00
> [ 254.605013] R13: 00000000024000c0 R14: ffffffff811b053f R15:
> ffff88007aed3ce7
> [ 254.605013] FS: 0000000000000000(0000) GS:ffff88007d280000(0000)
> knlGS:0000000000000000
> [ 254.605013] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 254.605013] CR2: 00000002000800c0 CR3: 000000007aeb2000 CR4:
> 00000000000406e0
> [ 254.605013] Stack:
> [ 254.605013] 0000000013082000 ffff88007aed3d28 0000007900000000
> 0000000000000001
> [ 254.605013] 2f2f2f2f00000000 ffff8800792cac00 ffff88007aed3d38
> 0000000000000101
> [ 254.605013] ffff88007a5e2000 ffff88007aed3ce7 ffff88007aed3b08
> ffffffff811b053f
> [ 254.605013] Call Trace:
> [ 254.605013] [<ffffffff811b053f>] __d_alloc+0x2f/0x1a0
> [ 254.605013] [<ffffffff811a58f2>] ? unlazy_walk+0xe2/0x160
> [ 254.605013] [<ffffffff811b1c67>] d_alloc+0x17/0x80
> [ 254.605013] [<ffffffff811a5b0a>] lookup_dcache+0x8a/0xc0
> [ 254.605013] [<ffffffff81143e63>] ? __alloc_pages_nodemask+0x173/0xeb0
> [ 254.605013] [<ffffffff811aa523>] path_openat+0x3c3/0x1210
> [ 254.605013] [<ffffffff81354eb3>] ? radix_tree_lookup_slot+0x13/0x30
> [ 254.605013] [<ffffffff81139002>] ? find_get_entry+0x32/0xc0
> [ 254.605013] [<ffffffff811b4065>] ? atime_needs_update+0x55/0xe0
> [ 254.605013] [<ffffffff8113b7a1>] ? filemap_fault+0xd1/0x4b0
> [ 254.605013] [<ffffffff81168296>] ? do_set_pte+0xb6/0x140
> [ 254.605013] [<ffffffff811ab3f0>] do_filp_open+0x80/0xe0
> [ 254.605013] [<ffffffff811b7c48>] ? __alloc_fd+0x48/0x1a0
> [ 254.605013] [<ffffffff811a60aa>] ? getname_flags+0x7a/0x1e0
> [ 254.605013] [<ffffffff8119a2d0>] do_sys_open+0x110/0x200
> [ 254.605013] [<ffffffff8119a3f9>] SyS_open+0x19/0x20
> [ 254.605013] [<ffffffff81003ec2>] do_syscall_64+0x72/0x230
> [ 254.605013] [<ffffffff8105fc37>] ? __do_page_fault+0x177/0x430
> [ 254.605013] [<ffffffff8193bc61>] entry_SYSCALL64_slow_path+0x25/0x25
> [ 254.605013] Code: 05 e6 77 e7 7e 4d 8b 08 49 8b 40 10 4d 85 c9 0f 84
> dd 00 00 00 48 85 c0 0f 84 d4 00 00 00 49 63 44 24 20 49 8b 3c 24 48 8d
> 4a 01 <49> 8b 1c 01 4c 89 c8 65 48 0f c7 0f 0f 94 c0 3c 01 75 b6 49
> 63
> [ 254.605013] RIP [<ffffffff81192978>] kmem_cache_alloc+0x78/0x160
> [ 254.605013] RSP <ffff88007aed3a48>
> [ 254.605013] CR2: 00000002000800c0
> [ 254.792273] ---[ end trace 823969e602e4aaac ]---
>
> Fixes: a4a1dfa4bb8b ("ocfs2/cluster: fix memory leak in o2hb_region_release")
> Signed-off-by: Junxiao Bi <junxiao.bi at oracle.com>
> ---
> fs/ocfs2/cluster/heartbeat.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
> index ef6a2ec494de..bd15929b5f92 100644
> --- a/fs/ocfs2/cluster/heartbeat.c
> +++ b/fs/ocfs2/cluster/heartbeat.c
> @@ -1444,8 +1444,8 @@ static void o2hb_region_release(struct config_item *item)
> debugfs_remove(reg->hr_debug_dir);
> kfree(reg->hr_db_livenodes);
> kfree(reg->hr_db_regnum);
> - kfree(reg->hr_debug_elapsed_time);
> - kfree(reg->hr_debug_pinned);
> + kfree(reg->hr_db_elapsed_time);
> + kfree(reg->hr_db_pinned);
>
> spin_lock(&o2hb_live_lock);
> list_del(&reg->hr_all_item);
>