wuzhouhui
2018-Jan-10 07:19 UTC
[CentOS] soft lockup after setting multicast_router of bridge and its port to 2
OS: CentOS 6.5.

After I set multicast_router of the bridge and its port to 2, as follows:

echo 2 > /sys/devices/virtual/net/eth81/bridge/multicast_router
echo 2 > /sys/devices/virtual/net/bond2/brport/multicast_router

a soft lockup occurred:

Message from syslogd at node-0 at Jan 9 15:47:12 ...
kernel:BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

The call trace is:

RIP: 0010:[<ffffffffa04f3608>] [<ffffffffa04f3608>] br_multicast_flood+0x88/0x140 [bridge]
RSP: 0018:ffff88013bc038f0 EFLAGS: 00000246
RAX: ffff88404f816020 RBX: ffff88013bc03940 RCX: ffff88204e40a640
RDX: ffff882002b9ce01 RSI: ffff882002b9ce80 RDI: 0000000000000000
RBP: ffffffff8100bb93 R08: 0000000000000001 R09: 00000000ff09f4a1
R10: ffff88202c884070 R11: 0000000000000000 R12: ffff88013bc03870
R13: ffff882002b9ce80 R14: ffff88013bc03860 R15: ffffffff8151b225
FS: 0000000000000000(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007fa11a942000 CR3: 0000000001a85000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
Stack:
880be7e100028813 ffff882002b9ce80 ffff882002b9ce80 ffffffffa04f3930
<d> 00000000880be7e1 ffff882002b9ce80 ffff882002b9ce80 ffff88200286c042
<d> ffff88202ae7c6e0 ffff882002b9ceb8 ffff88013bc03950 ffffffffa04f36d5
Call Trace:
<IRQ>
[<ffffffffa04f3930>] ? __br_forward+0x0/0xd0 [bridge]
[<ffffffffa04f36d5>] ? br_multicast_forward+0x15/0x20 [bridge]
[<ffffffffa04f4a34>] ? br_handle_frame_finish+0x144/0x2a0 [bridge]
[<ffffffffa04fa938>] ? br_nf_pre_routing_finish+0x238/0x350 [bridge]
[<ffffffffa04faedb>] ? br_nf_pre_routing+0x48b/0x7b0 [bridge]
[<ffffffff8143ba57>] ? __kfree_skb+0x47/0xa0
[<ffffffff814734f9>] ? nf_iterate+0x69/0xb0
[<ffffffffa04f48f0>] ? br_handle_frame_finish+0x0/0x2a0 [bridge]
[<ffffffff814736b6>] ? nf_hook_slow+0x76/0x120
[<ffffffffa04f48f0>] ? br_handle_frame_finish+0x0/0x2a0 [bridge]
[<ffffffffa04f4d1c>] ? br_handle_frame+0x18c/0x250 [bridge]
[<ffffffff81445709>] ? __netif_receive_skb+0x529/0x750
[<ffffffff814397da>] ? __alloc_skb+0x7a/0x180
[<ffffffff814492f8>] ? netif_receive_skb+0x58/0x60
[<ffffffff81449400>] ? napi_skb_finish+0x50/0x70
[<ffffffff8144ab79>] ? napi_gro_receive+0x39/0x50
[<ffffffffa016887f>] ? bnx2x_rx_int+0x83f/0x1630 [bnx2x]
[<ffffffff810608dc>] ? perf_event_task_sched_out+0x4c/0x70
[<ffffffffa01698ae>] ? bnx2x_poll+0x23e/0x2f0 [bnx2x]
[<ffffffff8144ac93>] ? net_rx_action+0x103/0x2f0
[<ffffffff8107a811>] ? __do_softirq+0xc1/0x1e0
[<ffffffff810e6b30>] ? handle_IRQ_event+0x60/0x170
[<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
[<ffffffff8100fa75>] ? do_softirq+0x65/0xa0
[<ffffffff8107a6c5>] ? irq_exit+0x85/0x90
[<ffffffff8151b165>] ? do_IRQ+0x75/0xf0
[<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
<EOI>
[<ffffffff81016627>] ? mwait_idle+0x77/0xd0
[<ffffffff815176fa>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
[<ffffffff814f6e3a>] ? rest_init+0x7a/0x80
[<ffffffff81c25f70>] ? start_kernel+0x405/0x411
[<ffffffff81c2533a>] ? x86_64_start_reservations+0x125/0x129
[<ffffffff81c25453>] ? x86_64_start_kernel+0x115/0x124

Does anyone know the reason?
wuzhouhui
2018-Jan-10 10:03 UTC
[CentOS] soft lockup after setting multicast_router of bridge and its port to 2
Never mind, commit 1a040eaca1a2 (bridge: fix multicast router rlist endless loop) fixes it.
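The commit title mentions an endless loop over the router list (rlist). One way such a loop can arise is that the same port node gets inserted into a linked list twice and ends up pointing at itself, so anything walking the list never reaches the end. That mechanism is an assumption here, not something stated in this thread. Below is a toy userspace model in Python; the class and method names are made up for illustration and this is not the kernel's hlist code.

```python
class Port:
    """Hypothetical stand-in for a bridge port's list node (not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.next = None
        self.linked = False  # models "this node is already on the list"


class RouterList:
    """Toy singly linked list with head insertion."""

    def __init__(self):
        self.head = None

    def add_unchecked(self, port):
        # Head insert with no "already linked?" test.
        # Adding the current head again makes port.next point at port itself.
        port.next = self.head
        self.head = port
        port.linked = True

    def add_checked(self, port):
        # Same insert, but a repeated add of a linked node is a no-op.
        if port.linked:
            return
        self.add_unchecked(port)

    def walk(self, limit=10):
        # Bounded walk so the demo terminates even if the list loops.
        names = []
        node = self.head
        while node is not None and len(names) < limit:
            names.append(node.name)
            node = node.next
        return names


if __name__ == "__main__":
    buggy = RouterList()
    port = Port("bond2")
    buggy.add_unchecked(port)
    buggy.add_unchecked(port)      # second add creates the self-loop
    print(port.next is port)       # the node now points at itself
    print(len(buggy.walk(limit=10)))

    fixed = RouterList()
    port2 = Port("bond2")
    fixed.add_checked(port2)
    fixed.add_checked(port2)       # ignored, list stays sane
    print(fixed.walk())
```

In the unchecked variant the walk only stops because of the artificial limit; a walker without such a limit would spin forever, which is the kind of behaviour a soft lockup report in a list-walking function would be consistent with.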