zhangzhengming
2021-Apr-28 14:38 UTC
[Bridge] [PATCH v2] bridge: Fix possible races between assigning rx_handler_data and setting IFF_BRIDGE_PORT bit
From: Zhang Zhengming <zhangzhengming at huawei.com>
There is a crash in the function br_get_link_af_size_filtered,
because port_exists(dev) is true while the rx_handler_data of dev is NULL.
However, the rx_handler_data of dev is saved correctly in the vmcore.
The oops looks something like:
...
pc : br_get_link_af_size_filtered+0x28/0x1c8 [bridge]
...
Call trace:
br_get_link_af_size_filtered+0x28/0x1c8 [bridge]
if_nlmsg_size+0x180/0x1b0
rtnl_calcit.isra.12+0xf8/0x148
rtnetlink_rcv_msg+0x334/0x370
netlink_rcv_skb+0x64/0x130
rtnetlink_rcv+0x28/0x38
netlink_unicast+0x1f0/0x250
netlink_sendmsg+0x310/0x378
sock_sendmsg+0x4c/0x70
__sys_sendto+0x120/0x150
__arm64_sys_sendto+0x30/0x40
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
In br_add_if(), we found there is no guarantee that
assigning rx_handler_data to dev->rx_handler_data
happens before setting the IFF_BRIDGE_PORT bit of priv_flags.
So there is a possible data race:
CPU 0:                                      CPU 1:
(RCU read lock)                             (RTNL lock)
rtnl_calcit()                               br_add_slave()
  if_nlmsg_size()                             br_add_if()
    br_get_link_af_size_filtered()            -> netdev_rx_handler_register
                                                 ...
                                                 // The order is not guaranteed
                                                 ...
                                              -> dev->priv_flags |= IFF_BRIDGE_PORT;
// The IFF_BRIDGE_PORT bit of priv_flags has been set
-> if (br_port_exists(dev)) {
  // The dev->rx_handler_data has NOT been assigned
  -> p = br_port_get_rcu(dev);
  ....
                                              -> rcu_assign_pointer(dev->rx_handler_data, rx_handler_data);
                                                 ...
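The interleaving above can be replayed deterministically in a small
userspace model. This is only a sketch: the model_* types, the
MODEL_IFF_BRIDGE_PORT constant and all function names are hypothetical
stand-ins, not kernel definitions, and the two writer steps are called by
hand in the problematic order instead of relying on real CPU reordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
#define MODEL_IFF_BRIDGE_PORT 0x1u

struct model_port {
	int id;
};

struct model_dev {
	unsigned int priv_flags;
	struct model_port *rx_handler_data;
};

/* Writer step: the IFF_BRIDGE_PORT-like bit becomes visible. */
static void writer_set_flag(struct model_dev *dev)
{
	dev->priv_flags |= MODEL_IFF_BRIDGE_PORT;
}

/* Writer step: the port pointer becomes visible. */
static void writer_publish_port(struct model_dev *dev, struct model_port *p)
{
	dev->rx_handler_data = p;
}

/* Reader as before the patch: trusts the flag, returns the pointer unchecked. */
static struct model_port *reader_unchecked(const struct model_dev *dev)
{
	if (dev->priv_flags & MODEL_IFF_BRIDGE_PORT)
		return dev->rx_handler_data;	/* may still be NULL */
	return NULL;
}

/*
 * Replays the diagram: flag first, reader in the window, pointer last.
 * Returns 1 if the reader saw the flag set together with a NULL pointer.
 */
static int replay_bad_interleaving(void)
{
	struct model_port port = { 1 };
	struct model_dev dev = { 0, NULL };
	struct model_port *p;
	int flag_seen;

	writer_set_flag(&dev);			/* CPU 1 */
	flag_seen = (dev.priv_flags & MODEL_IFF_BRIDGE_PORT) != 0;
	p = reader_unchecked(&dev);		/* CPU 0, inside the window */
	writer_publish_port(&dev, &port);	/* CPU 1, too late for CPU 0 */

	return flag_seen && p == NULL;
}
```

Calling replay_bad_interleaving() from a test harness returns 1, which
corresponds to the state seen in the oops: the port check passes but the
pointer handed to the caller is NULL.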
Fix it in br_get_link_af_size_filtered, using br_port_get_check_rcu() and
checking the return value.
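The caller-side shape of the fix can be sketched in the same kind of
userspace model. Everything named sketch_* here is hypothetical and only
mirrors the patched hunk; in particular sketch_port_get_check() is a
stand-in that may return NULL, not the body of br_port_get_check_rcu(),
and treating a missing VLAN group as a count of zero is an assumption of
the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's definitions (assumption). */
#define SKETCH_IFF_BRIDGE_PORT 0x1u

struct sketch_vlan_group {
	int num_vlans;
};

struct sketch_port {
	struct sketch_vlan_group *vlgrp;
};

struct sketch_dev {
	unsigned int priv_flags;
	struct sketch_port *rx_handler_data;
};

/* Checked getter stand-in: returns NULL if the port is not fully set up. */
static struct sketch_port *sketch_port_get_check(const struct sketch_dev *dev)
{
	if (!(dev->priv_flags & SKETCH_IFF_BRIDGE_PORT))
		return NULL;
	return dev->rx_handler_data;	/* NULL inside the race window */
}

/* Caller shaped like the patched hunk: dereference only a non-NULL port. */
static struct sketch_vlan_group *sketch_lookup_vg(const struct sketch_dev *dev)
{
	struct sketch_vlan_group *vg = NULL;
	struct sketch_port *p;

	if (dev->priv_flags & SKETCH_IFF_BRIDGE_PORT) {
		p = sketch_port_get_check(dev);
		if (p)
			vg = p->vlgrp;
	}
	return vg;
}

/* Model assumption: no VLAN group means nothing to report. */
static int sketch_vlan_count(const struct sketch_dev *dev)
{
	struct sketch_vlan_group *vg = sketch_lookup_vg(dev);

	return vg ? vg->num_vlans : 0;
}
```

With a device whose flag is set but whose pointer is still NULL,
sketch_vlan_count() returns 0 instead of dereferencing NULL; once the
pointer is published it returns the group's count.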
Signed-off-by: Zhang Zhengming <zhangzhengming at huawei.com>
Reviewed-by: Zhao Lei <zhaolei69 at huawei.com>
Reviewed-by: Wang Xiaogang <wangxiaogang3 at huawei.com>
Suggested-by: Nikolay Aleksandrov <nikolay at nvidia.com>
---
net/bridge/br_netlink.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index f2b1343..ed5aba2 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -103,8 +103,9 @@ static size_t br_get_link_af_size_filtered(const struct net_device *dev,
 	rcu_read_lock();
 	if (netif_is_bridge_port(dev)) {
-		p = br_port_get_rcu(dev);
-		vg = nbp_vlan_group_rcu(p);
+		p = br_port_get_check_rcu(dev);
+		if (p)
+			vg = nbp_vlan_group_rcu(p);
 	} else if (dev->priv_flags & IFF_EBRIDGE) {
 		br = netdev_priv(dev);
 		vg = br_vlan_group_rcu(br);
--
2.7.4
patchwork-bot+netdevbpf at kernel.org
2021-Apr-29 22:40 UTC
[Bridge] [PATCH v2] bridge: Fix possible races between assigning rx_handler_data and setting IFF_BRIDGE_PORT bit
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Wed, 28 Apr 2021 22:38:14 +0800 you wrote:
> From: Zhang Zhengming <zhangzhengming at huawei.com>
>
> There is a crash in the function br_get_link_af_size_filtered,
> as the port_exists(dev) is true and the rx_handler_data of dev is NULL.
> But the rx_handler_data of dev is correct saved in vmcore.
>
> The oops looks something like:
> ...
> pc : br_get_link_af_size_filtered+0x28/0x1c8 [bridge]
> ...
> Call trace:
> br_get_link_af_size_filtered+0x28/0x1c8 [bridge]
> if_nlmsg_size+0x180/0x1b0
> rtnl_calcit.isra.12+0xf8/0x148
> rtnetlink_rcv_msg+0x334/0x370
> netlink_rcv_skb+0x64/0x130
> rtnetlink_rcv+0x28/0x38
> netlink_unicast+0x1f0/0x250
> netlink_sendmsg+0x310/0x378
> sock_sendmsg+0x4c/0x70
> __sys_sendto+0x120/0x150
> __arm64_sys_sendto+0x30/0x40
> el0_svc_common+0x78/0x130
> el0_svc_handler+0x38/0x78
> el0_svc+0x8/0xc
>
> [...]

Here is the summary with links:
  - [v2] bridge: Fix possible races between assigning rx_handler_data and setting IFF_BRIDGE_PORT bit
    https://git.kernel.org/netdev/net/c/59259ff7a81b

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html