Nikolay Aleksandrov
2015-Sep-30 18:16 UTC
[Bridge] [PATCH net-next 4/5] bridge: vlan: fix possible null ptr derefs on port init and deinit
From: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> When a new port is being added we need to make vlgrp available after rhashtable has been initialized and when removing a port we need to flush the vlans and free the resources after we're sure noone can use the port, i.e. after it's removed from the port list and synchronize_rcu is executed. Signed-off-by: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> --- net/bridge/br_if.c | 3 ++- net/bridge/br_vlan.c | 16 ++++++++++------ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 45e4757c6fd2..934cae9fa317 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -248,7 +248,6 @@ static void del_nbp(struct net_bridge_port *p) list_del_rcu(&p->list); - nbp_vlan_flush(p); br_fdb_delete_by_port(br, p, 0, 1); nbp_update_port_count(br); @@ -257,6 +256,8 @@ static void del_nbp(struct net_bridge_port *p) dev->priv_flags &= ~IFF_BRIDGE_PORT; netdev_rx_handler_unregister(dev); + /* use the synchronize_rcu done by netdev_rx_handler_unregister */ + nbp_vlan_flush(p); br_multicast_del_port(p); diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index 90ac4b0c55c1..7e9d60a402e2 100644 --- a/net/bridge/br_vlan.c +++ b/net/bridge/br_vlan.c @@ -854,16 +854,20 @@ err_rhtbl: int nbp_vlan_init(struct net_bridge_port *p) { + struct net_bridge_vlan_group *vg; int ret = -ENOMEM; - p->vlgrp = kzalloc(sizeof(struct net_bridge_vlan_group), GFP_KERNEL); - if (!p->vlgrp) + vg = kzalloc(sizeof(struct net_bridge_vlan_group), GFP_KERNEL); + if (!vg) goto out; - ret = rhashtable_init(&p->vlgrp->vlan_hash, &br_vlan_rht_params); + ret = rhashtable_init(&vg->vlan_hash, &br_vlan_rht_params); if (ret) goto err_rhtbl; - INIT_LIST_HEAD(&p->vlgrp->vlan_list); + INIT_LIST_HEAD(&vg->vlan_list); + /* Make sure everything's committed before publishing vg */ + smp_wmb(); + p->vlgrp = vg; if (p->br->default_pvid) { ret = nbp_vlan_add(p, p->br->default_pvid, BRIDGE_VLAN_INFO_PVID | @@ -875,9 +879,9 @@ out: return ret; err_vlan_add: - rhashtable_destroy(&p->vlgrp->vlan_hash); + rhashtable_destroy(&vg->vlan_hash); err_rhtbl: - kfree(p->vlgrp); + kfree(vg); goto out; } -- 2.4.3
Ido Schimmel
2015-Oct-11 12:21 UTC
[Bridge] [PATCH net-next 4/5] bridge: vlan: fix possible null ptr derefs on port init and deinit
Wed, Sep 30, 2015 at 09:16:54PM IDT, razor at blackwall.org wrote:>From: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> > >When a new port is being added we need to make vlgrp available after >rhashtable has been initialized and when removing a port we need to >flush the vlans and free the resources after we're sure noone can use >the port, i.e. after it's removed from the port list and synchronize_rcu >is executed.Hi Nikolay, Changing the order of port deinit breaks symmetry with the init sequence. It also introduces a problem for switchdev drivers. Flushing the VLANs clears HW VLAN filters and then, when port is unlinked from bridge and CHANGEUPPER is received, port is configured to direct traffic to CPU (as it's not offloaded anymore). Doing the reverse (like in this patch) renders the port unusable. Regarding the reason for this change, are you afraid that vlgrp will be accessed in bridge's rx handler or xmit function after it's freed? If so, maybe we can access it using RCU primitives? That way, both the rx handler and xmit function (executed under RCU lock) will either have a valid copy or not. Looking at previous iterations of this code, I see that was the case with the 'net_port_vlans' struct. I can start working on a fix if you agree with the proposed solution. Thanks.
Nikolay Aleksandrov
2015-Oct-11 12:42 UTC
[Bridge] [PATCH net-next 4/5] bridge: vlan: fix possible null ptr derefs on port init and deinit
On 10/11/2015 02:21 PM, Ido Schimmel wrote:> Wed, Sep 30, 2015 at 09:16:54PM IDT, razor at blackwall.org wrote: >> From: Nikolay Aleksandrov <nikolay at cumulusnetworks.com> >> >> When a new port is being added we need to make vlgrp available after >> rhashtable has been initialized and when removing a port we need to >> flush the vlans and free the resources after we're sure noone can use >> the port, i.e. after it's removed from the port list and synchronize_rcu >> is executed. > > Hi Nikolay, > > Changing the order of port deinit breaks symmetry with the init > sequence. It also introduces a problem for switchdev drivers. Flushing > the VLANs clears HW VLAN filters and then, when port is unlinked from > bridge and CHANGEUPPER is received, port is configured to direct traffic > to CPU (as it's not offloaded anymore). Doing the reverse (like in this > patch) renders the port unusable. > > Regarding the reason for this change, are you afraid that vlgrp will be > accessed in bridge's rx handler or xmit function after it's freed? If > so, maybe we can access it using RCU primitives? That way, both the rx > handler and xmit function (executed under RCU lock) will either have a > valid copy or not. Looking at previous iterations of this code, I see > that was the case with the 'net_port_vlans' struct. > > I can start working on a fix if you agree with the proposed solution. > > Thanks. >Hi, Ah, I didn't know about this, I feared that something might rely on the particular order of the operations but didn't have a way to test this one in particular. Anyway, your proposed solution sounds good to me. Thank you, Nik