thr3ads.net - CentOS - [CentOS] soft lockup after set multicast_router of bridge and it's port to 2 [Jan 2018]

wuzhouhui

2018-Jan-10 07:19 UTC

[CentOS] soft lockup after set multicast_router of bridge and it's port to 2

OS: CentOS 6.5.

I set multicast_router of the bridge and of its port to 2, as follows:
    echo 2 > /sys/devices/virtual/net/eth81/bridge/multicast_router
    echo 2 > /sys/devices/virtual/net/bond2/brport/multicast_router
Then a soft lockup occurred:
    Message from syslogd at node-0 at Jan  9 15:47:12 ...
     kernel:BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
The call trace is:
    RIP: 0010:[<ffffffffa04f3608>]  [<ffffffffa04f3608>] br_multicast_flood+0x88/0x140 [bridge]
    RSP: 0018:ffff88013bc038f0  EFLAGS: 00000246
    RAX: ffff88404f816020 RBX: ffff88013bc03940 RCX: ffff88204e40a640
    RDX: ffff882002b9ce01 RSI: ffff882002b9ce80 RDI: 0000000000000000
    RBP: ffffffff8100bb93 R08: 0000000000000001 R09: 00000000ff09f4a1
    R10: ffff88202c884070 R11: 0000000000000000 R12: ffff88013bc03870
    R13: ffff882002b9ce80 R14: ffff88013bc03860 R15: ffffffff8151b225
    FS:  0000000000000000(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00007fa11a942000 CR3: 0000000001a85000 CR4: 00000000001407e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
    Stack:
     880be7e100028813 ffff882002b9ce80 ffff882002b9ce80 ffffffffa04f3930
    <d> 00000000880be7e1 ffff882002b9ce80 ffff882002b9ce80 ffff88200286c042
    <d> ffff88202ae7c6e0 ffff882002b9ceb8 ffff88013bc03950 ffffffffa04f36d5
    Call Trace:
     <IRQ> 
     [<ffffffffa04f3930>] ? __br_forward+0x0/0xd0 [bridge]
     [<ffffffffa04f36d5>] ? br_multicast_forward+0x15/0x20 [bridge]
     [<ffffffffa04f4a34>] ? br_handle_frame_finish+0x144/0x2a0 [bridge]
     [<ffffffffa04fa938>] ? br_nf_pre_routing_finish+0x238/0x350 [bridge]
     [<ffffffffa04faedb>] ? br_nf_pre_routing+0x48b/0x7b0 [bridge]
     [<ffffffff8143ba57>] ? __kfree_skb+0x47/0xa0
     [<ffffffff814734f9>] ? nf_iterate+0x69/0xb0
     [<ffffffffa04f48f0>] ? br_handle_frame_finish+0x0/0x2a0 [bridge]
     [<ffffffff814736b6>] ? nf_hook_slow+0x76/0x120
     [<ffffffffa04f48f0>] ? br_handle_frame_finish+0x0/0x2a0 [bridge]
     [<ffffffffa04f4d1c>] ? br_handle_frame+0x18c/0x250 [bridge]
     [<ffffffff81445709>] ? __netif_receive_skb+0x529/0x750
     [<ffffffff814397da>] ? __alloc_skb+0x7a/0x180
     [<ffffffff814492f8>] ? netif_receive_skb+0x58/0x60
     [<ffffffff81449400>] ? napi_skb_finish+0x50/0x70
     [<ffffffff8144ab79>] ? napi_gro_receive+0x39/0x50
     [<ffffffffa016887f>] ? bnx2x_rx_int+0x83f/0x1630 [bnx2x]
     [<ffffffff810608dc>] ? perf_event_task_sched_out+0x4c/0x70
     [<ffffffffa01698ae>] ? bnx2x_poll+0x23e/0x2f0 [bnx2x]
     [<ffffffff8144ac93>] ? net_rx_action+0x103/0x2f0
     [<ffffffff8107a811>] ? __do_softirq+0xc1/0x1e0
     [<ffffffff810e6b30>] ? handle_IRQ_event+0x60/0x170
     [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
     [<ffffffff8100fa75>] ? do_softirq+0x65/0xa0
     [<ffffffff8107a6c5>] ? irq_exit+0x85/0x90
     [<ffffffff8151b165>] ? do_IRQ+0x75/0xf0
     [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
    <EOI> 
     [<ffffffff81016627>] ? mwait_idle+0x77/0xd0
     [<ffffffff815176fa>] ? atomic_notifier_call_chain+0x1a/0x20
     [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
     [<ffffffff814f6e3a>] ? rest_init+0x7a/0x80
     [<ffffffff81c25f70>] ? start_kernel+0x405/0x411
     [<ffffffff81c2533a>] ? x86_64_start_reservations+0x125/0x129
     [<ffffffff81c25453>] ? x86_64_start_kernel+0x115/0x124

Does anyone know the reason?
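
When reproducing this, it can help to record what the bridge and its ports are currently set to before changing anything. The following is only a minimal sketch. It assumes the usual sysfs layout, where /sys/class/net/<bridge>/brif/ holds one entry per port. The function name show_mcast_router and the SYSFS_NET override are made up for illustration and are not part of the original report.

```shell
# Print the multicast_router setting of a bridge and of each of its ports.
# Assumption: ports appear under <bridge>/brif/ in sysfs.
# SYSFS_NET is a hypothetical override so the function can be pointed
# at a scratch directory instead of the live /sys tree.
show_mcast_router() {
    bridge=$1
    net=${SYSFS_NET:-/sys/class/net}

    # Bridge-wide setting.
    printf '%s (bridge): %s\n' "$bridge" \
        "$(cat "$net/$bridge/bridge/multicast_router")"

    # Per-port settings; skip entries that do not expose the attribute.
    for port in "$net/$bridge"/brif/*; do
        [ -e "$port/multicast_router" ] || continue
        printf '%s (port): %s\n' "${port##*/}" \
            "$(cat "$port/multicast_router")"
    done
}
```

With the names from this report, it would be called as `show_mcast_router eth81`.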

wuzhouhui

2018-Jan-10 10:03 UTC

[CentOS] soft lockup after set multicast_router of bridge and it's port to 2

Never mind, commit 1a040eaca1a2 (bridge: fix multicast router rlist endless loop) fixes it.
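
On a kernel that does not yet carry that commit, one cautious approach is to avoid writing a value that is already set. This rests on an assumption that this thread does not confirm: that the endless loop comes from a port being put on the router list a second time, as the commit title suggests. The sketch below is a hedged workaround idea, not a substitute for the fix, and the helper name set_mcast_router is hypothetical.

```shell
# Write VALUE to a multicast_router file only if it differs from the
# current value.
# Assumption (not confirmed in this thread): repeated writes of the same
# value are what trigger the rlist loop on unpatched kernels.
set_mcast_router() {
    file=$1
    want=$2

    # Read the current setting; give up if the attribute is unreadable.
    have=$(cat "$file") || return 1

    if [ "$have" = "$want" ]; then
        echo "unchanged: $file already $want"
        return 0
    fi

    echo "$want" > "$file" && echo "updated: $file $have -> $want"
}
```

For the port in this report, the call would be `set_mcast_router /sys/devices/virtual/net/bond2/brport/multicast_router 2`.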
