Johannes Nixdorf
2023-May-15 08:50 UTC
[Bridge] [PATCH net-next 1/2] bridge: Add a limit on FDB entries
A malicious actor behind one bridge port may spam the kernel with packets with a random source MAC address, each of which will create an FDB entry, each of which is a dynamic allocation in the kernel. There are roughly 2^48 different MAC addresses, further limited by the rhashtable they are stored in to 2^31. Each entry is of the type struct net_bridge_fdb_entry, which is currently 128 bytes big. This means the maximum amount of memory allocated for FDB entries is 2^31 * 128B 256GiB, which is too much for most computers. Mitigate this by adding a bridge netlink setting IFLA_BR_FDB_MAX_ENTRIES, which, if nonzero, limits the amount of entries to a user specified maximum. For backwards compatibility the default setting of 0 disables the limit. All changes to fdb_n_entries are under br->hash_lock, which means we do not need additional locking. The call paths are (? denotes that br->hash_lock is taken around the next call): - fdb_delete <-+- fdb_delete_local <-+- br_fdb_changeaddr ? | +- br_fdb_change_mac_address ? | +- br_fdb_delete_by_port ? +- br_fdb_find_delete_local ? +- fdb_add_local <-+- br_fdb_changeaddr ? | +- br_fdb_change_mac_address ? | +- br_fdb_add_local ? +- br_fdb_cleanup ? +- br_fdb_flush ? +- br_fdb_delete_by_port ? +- fdb_delete_by_addr_and_port <--- __br_fdb_delete ? +- br_fdb_external_learn_del ? - fdb_create <-+- fdb_add_local <-+- br_fdb_changeaddr ? | +- br_fdb_change_mac_address ? | +- br_fdb_add_local ? +- br_fdb_update ? +- fdb_add_entry <--- __br_fdb_add ? +- br_fdb_external_learn_add ? Signed-off-by: Johannes Nixdorf <jnixdorf-oss at avm.de> --- include/uapi/linux/if_link.h | 1 + net/bridge/br_device.c | 2 ++ net/bridge/br_fdb.c | 6 ++++++ net/bridge/br_netlink.c | 9 ++++++++- net/bridge/br_private.h | 2 ++ 5 files changed, 19 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 4ac1000b0ef2..27cf5f2d8790 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -510,6 +510,7 @@ enum { IFLA_BR_VLAN_STATS_PER_PORT, IFLA_BR_MULTI_BOOLOPT, IFLA_BR_MCAST_QUERIER_STATE, + IFLA_BR_FDB_MAX_ENTRIES, __IFLA_BR_MAX, }; diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index 8eca8a5c80c6..d455a28df7c9 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -528,6 +528,8 @@ void br_dev_setup(struct net_device *dev) br->bridge_hello_time = br->hello_time = 2 * HZ; br->bridge_forward_delay = br->forward_delay = 15 * HZ; br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME; + br->fdb_n_entries = 0; + br->fdb_max_entries = 0; dev->max_mtu = ETH_MAX_MTU; br_netfilter_rtable_init(br); diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e69a872bfc1d..8a833e6dee92 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -329,6 +329,8 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f, hlist_del_init_rcu(&f->fdb_node); rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode, br_fdb_rht_params); + if (!WARN_ON(!br->fdb_n_entries)) + br->fdb_n_entries--; fdb_notify(br, f, RTM_DELNEIGH, swdev_notify); call_rcu(&f->rcu, fdb_rcu_free); } @@ -391,6 +393,9 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br, struct net_bridge_fdb_entry *fdb; int err; + if (unlikely(br->fdb_max_entries && br->fdb_n_entries >= br->fdb_max_entries)) + return NULL; + fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC); if (!fdb) return NULL; @@ -408,6 +413,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br, } hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list); + br->fdb_n_entries++; return fdb; } diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 05c5863d2e20..e5b8d36a3291 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1527,6 +1527,12 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[], return err; } + if (data[IFLA_BR_FDB_MAX_ENTRIES]) { + u32 val = nla_get_u32(data[IFLA_BR_FDB_MAX_ENTRIES]); + + br->fdb_max_entries = val; + } + return 0; } @@ -1656,7 +1662,8 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev) nla_put_u8(skb, IFLA_BR_TOPOLOGY_CHANGE_DETECTED, br->topology_change_detected) || nla_put(skb, IFLA_BR_GROUP_ADDR, ETH_ALEN, br->group_addr) || - nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm)) + nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm) || + nla_put_u32(skb, IFLA_BR_FDB_MAX_ENTRIES, br->fdb_max_entries)) return -EMSGSIZE; #ifdef CONFIG_BRIDGE_VLAN_FILTERING diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 2119729ded2b..64fb359c6e3e 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -494,6 +494,8 @@ struct net_bridge { #endif struct rhashtable fdb_hash_tbl; + u32 fdb_n_entries; + u32 fdb_max_entries; struct list_head port_list; #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) union { -- 2.40.1
Johannes Nixdorf
2023-May-15 08:50 UTC
[Bridge] [PATCH net-next 2/2] bridge: Add a sysctl to limit new brides FDB entries
This is a convenience setting, which allows the administrator to limit the default limit of FDB entries for all created bridges, instead of having to set it for each created bridge using the netlink property. The setting is network namespace local, and defaults to 0, which means unlimited, for backwards compatibility reasons. Signed-off-by: Johannes Nixdorf <jnixdorf-oss at avm.de> --- net/bridge/br.c | 83 +++++++++++++++++++++++++++++++++++++++++ net/bridge/br_device.c | 4 +- net/bridge/br_private.h | 9 +++++ 3 files changed, 95 insertions(+), 1 deletion(-) diff --git a/net/bridge/br.c b/net/bridge/br.c index 4f5098d33a46..e32bb956111c 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -14,6 +14,7 @@ #include <linux/init.h> #include <linux/llc.h> #include <net/llc.h> +#include <net/netns/generic.h> #include <net/stp.h> #include <net/switchdev.h> @@ -348,6 +349,82 @@ void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on) clear_bit(opt, &br->options); } +#ifdef CONFIG_SYSCTL +static unsigned int br_net_id __read_mostly; + +struct br_net { + struct ctl_table_header *ctl_hdr; + + unsigned int fdb_max_entries_default; +}; + +static int br_proc_rtnl_uintvec(struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + int ret; + + rtnl_lock(); + ret = proc_douintvec(table, write, buffer, lenp, ppos); + rtnl_unlock(); + + return ret; +} + +static struct ctl_table br_sysctl_table[] = { + { + .procname = "bridge-fdb-max-entries-default", + .maxlen = sizeof(unsigned int), + .mode = 0644, + .proc_handler = br_proc_rtnl_uintvec, + }, + { } +}; + +static int __net_init br_net_init(struct net *net) +{ + struct ctl_table *table = br_sysctl_table; + struct br_net *brnet; + + if (!net_eq(net, &init_net)) { + table = kmemdup(table, sizeof(br_sysctl_table), GFP_KERNEL); + if (!table) + return -ENOMEM; + } + + brnet = net_generic(net, br_net_id); + + brnet->fdb_max_entries_default = 0; + + table[0].data = &brnet->fdb_max_entries_default; + brnet->ctl_hdr = register_net_sysctl(net, "net/bridge", table); + if (!brnet->ctl_hdr) { + if (!net_eq(net, &init_net)) + kfree(table); + + return -ENOMEM; + } + + return 0; +} + +static void __net_exit br_net_exit(struct net *net) +{ + struct br_net *brnet = net_generic(net, br_net_id); + struct ctl_table *table = brnet->ctl_hdr->ctl_table_arg; + + unregister_net_sysctl_table(brnet->ctl_hdr); + if (!net_eq(net, &init_net)) + kfree(table); +} + +unsigned int br_fdb_max_entries_default(struct net *net) +{ + struct br_net *brnet = net_generic(net, br_net_id); + + return brnet->fdb_max_entries_default; +} +#endif + static void __net_exit br_net_exit_batch(struct list_head *net_list) { struct net_device *dev; @@ -367,6 +444,12 @@ static void __net_exit br_net_exit_batch(struct list_head *net_list) } static struct pernet_operations br_net_ops = { +#ifdef CONFIG_SYSCTL + .init = br_net_init, + .exit = br_net_exit, + .id = &br_net_id, + .size = sizeof(struct br_net), +#endif .exit_batch = br_net_exit_batch, }; diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index d455a28df7c9..26023f2732e8 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -117,8 +117,11 @@ static void br_set_lockdep_class(struct net_device *dev) static int br_dev_init(struct net_device *dev) { struct net_bridge *br = netdev_priv(dev); + struct net *net = dev_net(dev); int err; + br->fdb_max_entries = br_fdb_max_entries_default(net); + dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); if (!dev->tstats) return -ENOMEM; @@ -529,7 +532,6 @@ void br_dev_setup(struct net_device *dev) br->bridge_forward_delay = br->forward_delay = 15 * HZ; br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME; br->fdb_n_entries = 0; - br->fdb_max_entries = 0; dev->max_mtu = ETH_MAX_MTU; br_netfilter_rtable_init(br); diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 64fb359c6e3e..d4b0f85cc278 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -2223,4 +2223,13 @@ void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br, u16 vid, struct net_bridge_port *p, struct nd_msg *msg); struct nd_msg *br_is_nd_neigh_msg(struct sk_buff *skb, struct nd_msg *m); bool br_is_neigh_suppress_enabled(const struct net_bridge_port *p, u16 vid); + +#ifdef CONFIG_SYSFS +unsigned int br_fdb_max_entries_default(struct net *net); +#else +static inline unsigned int br_fdb_max_entries_default(struct net *net) +{ + return 0; +} +#endif #endif -- 2.40.1
Nikolay Aleksandrov
2023-May-15 09:35 UTC
[Bridge] [PATCH net-next 1/2] bridge: Add a limit on FDB entries
On 15/05/2023 11:50, Johannes Nixdorf wrote:> A malicious actor behind one bridge port may spam the kernel with packets > with a random source MAC address, each of which will create an FDB entry, > each of which is a dynamic allocation in the kernel. > > There are roughly 2^48 different MAC addresses, further limited by the > rhashtable they are stored in to 2^31. Each entry is of the type struct > net_bridge_fdb_entry, which is currently 128 bytes big. This means the > maximum amount of memory allocated for FDB entries is 2^31 * 128B > 256GiB, which is too much for most computers. > > Mitigate this by adding a bridge netlink setting IFLA_BR_FDB_MAX_ENTRIES, > which, if nonzero, limits the amount of entries to a user specified > maximum. > > For backwards compatibility the default setting of 0 disables the limit. > > All changes to fdb_n_entries are under br->hash_lock, which means we do > not need additional locking. The call paths are (? denotes that > br->hash_lock is taken around the next call): > > - fdb_delete <-+- fdb_delete_local <-+- br_fdb_changeaddr ? > | +- br_fdb_change_mac_address ? > | +- br_fdb_delete_by_port ? > +- br_fdb_find_delete_local ? > +- fdb_add_local <-+- br_fdb_changeaddr ? > | +- br_fdb_change_mac_address ? > | +- br_fdb_add_local ? > +- br_fdb_cleanup ? > +- br_fdb_flush ? > +- br_fdb_delete_by_port ? > +- fdb_delete_by_addr_and_port <--- __br_fdb_delete ? > +- br_fdb_external_learn_del ? > - fdb_create <-+- fdb_add_local <-+- br_fdb_changeaddr ? > | +- br_fdb_change_mac_address ? > | +- br_fdb_add_local ? > +- br_fdb_update ? > +- fdb_add_entry <--- __br_fdb_add ? > +- br_fdb_external_learn_add ? > > Signed-off-by: Johannes Nixdorf <jnixdorf-oss at avm.de> > --- > include/uapi/linux/if_link.h | 1 + > net/bridge/br_device.c | 2 ++ > net/bridge/br_fdb.c | 6 ++++++ > net/bridge/br_netlink.c | 9 ++++++++- > net/bridge/br_private.h | 2 ++ > 5 files changed, 19 insertions(+), 1 deletion(-) >Hi, If you're sending a patch series please add a cover letter (--cover-letter) which explains what the series are trying to do and why. I've had a patch that implements this feature for a while but didn't get to upstreaming it. :) Anyway more comments below,> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h > index 4ac1000b0ef2..27cf5f2d8790 100644 > --- a/include/uapi/linux/if_link.h > +++ b/include/uapi/linux/if_link.h > @@ -510,6 +510,7 @@ enum { > IFLA_BR_VLAN_STATS_PER_PORT, > IFLA_BR_MULTI_BOOLOPT, > IFLA_BR_MCAST_QUERIER_STATE, > + IFLA_BR_FDB_MAX_ENTRIES, > __IFLA_BR_MAX, > }; > > diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c > index 8eca8a5c80c6..d455a28df7c9 100644 > --- a/net/bridge/br_device.c > +++ b/net/bridge/br_device.c > @@ -528,6 +528,8 @@ void br_dev_setup(struct net_device *dev) > br->bridge_hello_time = br->hello_time = 2 * HZ; > br->bridge_forward_delay = br->forward_delay = 15 * HZ; > br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME; > + br->fdb_n_entries = 0; > + br->fdb_max_entries = 0;Unnecessary, the private area is already cleared.> dev->max_mtu = ETH_MAX_MTU; > > br_netfilter_rtable_init(br); > diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c > index e69a872bfc1d..8a833e6dee92 100644 > --- a/net/bridge/br_fdb.c > +++ b/net/bridge/br_fdb.c > @@ -329,6 +329,8 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f, > hlist_del_init_rcu(&f->fdb_node); > rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode, > br_fdb_rht_params); > + if (!WARN_ON(!br->fdb_n_entries)) > + br->fdb_n_entries--;This is pointless, just put the WARN_ON(!br->fdb_n_entries) above decrementing, if we hit that we are already in trouble and not decrementing doesn't help us.> fdb_notify(br, f, RTM_DELNEIGH, swdev_notify); > call_rcu(&f->rcu, fdb_rcu_free); > } > @@ -391,6 +393,9 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br, > struct net_bridge_fdb_entry *fdb; > int err; > > + if (unlikely(br->fdb_max_entries && br->fdb_n_entries >= br->fdb_max_entries)) > + return NULL; > +This one needs more work, fdb_create() is also used when user-space is adding new entries, so it would be nice to return a proper error.> fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC); > if (!fdb) > return NULL; > @@ -408,6 +413,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br, > } > > hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list); > + br->fdb_n_entries++; > > return fdb; > } > diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c > index 05c5863d2e20..e5b8d36a3291 100644 > --- a/net/bridge/br_netlink.c > +++ b/net/bridge/br_netlink.c > @@ -1527,6 +1527,12 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[], > return err; > } > > + if (data[IFLA_BR_FDB_MAX_ENTRIES]) { > + u32 val = nla_get_u32(data[IFLA_BR_FDB_MAX_ENTRIES]); > + > + br->fdb_max_entries = val; > + } > + > return 0; > } > > @@ -1656,7 +1662,8 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev) > nla_put_u8(skb, IFLA_BR_TOPOLOGY_CHANGE_DETECTED, > br->topology_change_detected) || > nla_put(skb, IFLA_BR_GROUP_ADDR, ETH_ALEN, br->group_addr) || > - nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm)) > + nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm) || > + nla_put_u32(skb, IFLA_BR_FDB_MAX_ENTRIES, br->fdb_max_entries))You are not returning the current entry count, that is also needed.> return -EMSGSIZE; > > #ifdef CONFIG_BRIDGE_VLAN_FILTERING > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h > index 2119729ded2b..64fb359c6e3e 100644 > --- a/net/bridge/br_private.h > +++ b/net/bridge/br_private.h > @@ -494,6 +494,8 @@ struct net_bridge { > #endif > > struct rhashtable fdb_hash_tbl; > + u32 fdb_n_entries; > + u32 fdb_max_entries;These are not critical, so I'd use 4 byte holes in net_bridge and pack it better instead of making it larger.> struct list_head port_list; > #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) > union {
Nikolay Aleksandrov
2023-May-15 09:35 UTC
[Bridge] [PATCH net-next 2/2] bridge: Add a sysctl to limit new brides FDB entries
On 15/05/2023 11:50, Johannes Nixdorf wrote:> This is a convenience setting, which allows the administrator to limit > the default limit of FDB entries for all created bridges, instead of > having to set it for each created bridge using the netlink property. > > The setting is network namespace local, and defaults to 0, which means > unlimited, for backwards compatibility reasons. > > Signed-off-by: Johannes Nixdorf <jnixdorf-oss at avm.de> > --- > net/bridge/br.c | 83 +++++++++++++++++++++++++++++++++++++++++ > net/bridge/br_device.c | 4 +- > net/bridge/br_private.h | 9 +++++ > 3 files changed, 95 insertions(+), 1 deletion(-) >The bridge doesn't need private sysctls. Netlink is enough. Nacked-by: Nikolay Aleksandrov <razor at blackwall.org>
Stephen Hemminger
2023-May-15 15:56 UTC
[Bridge] [PATCH net-next 2/2] bridge: Add a sysctl to limit new brides FDB entries
On Mon, 15 May 2023 10:50:46 +0200 Johannes Nixdorf <jnixdorf-oss at avm.de> wrote:> +static struct ctl_table br_sysctl_table[] = { > + { > + .procname = "bridge-fdb-max-entries-default",That name is too long. Also, all the rest of bridge code does not use sysctl's. Why is this special and why should the property be global and not per bridge? NAK
Nikolay Aleksandrov
2023-May-16 08:38 UTC
[Bridge] [PATCH net-next 1/2] bridge: Add a limit on FDB entries
On 15/05/2023 11:50, Johannes Nixdorf wrote:> A malicious actor behind one bridge port may spam the kernel with packets > with a random source MAC address, each of which will create an FDB entry, > each of which is a dynamic allocation in the kernel. > > There are roughly 2^48 different MAC addresses, further limited by the > rhashtable they are stored in to 2^31. Each entry is of the type struct > net_bridge_fdb_entry, which is currently 128 bytes big. This means the > maximum amount of memory allocated for FDB entries is 2^31 * 128B > 256GiB, which is too much for most computers. > > Mitigate this by adding a bridge netlink setting IFLA_BR_FDB_MAX_ENTRIES, > which, if nonzero, limits the amount of entries to a user specified > maximum. > > For backwards compatibility the default setting of 0 disables the limit. > > All changes to fdb_n_entries are under br->hash_lock, which means we do > not need additional locking. The call paths are (? denotes that > br->hash_lock is taken around the next call): > > - fdb_delete <-+- fdb_delete_local <-+- br_fdb_changeaddr ? > | +- br_fdb_change_mac_address ? > | +- br_fdb_delete_by_port ? > +- br_fdb_find_delete_local ? > +- fdb_add_local <-+- br_fdb_changeaddr ? > | +- br_fdb_change_mac_address ? > | +- br_fdb_add_local ? > +- br_fdb_cleanup ? > +- br_fdb_flush ? > +- br_fdb_delete_by_port ? > +- fdb_delete_by_addr_and_port <--- __br_fdb_delete ? > +- br_fdb_external_learn_del ? > - fdb_create <-+- fdb_add_local <-+- br_fdb_changeaddr ? > | +- br_fdb_change_mac_address ? > | +- br_fdb_add_local ? > +- br_fdb_update ? > +- fdb_add_entry <--- __br_fdb_add ? > +- br_fdb_external_learn_add ? > > Signed-off-by: Johannes Nixdorf <jnixdorf-oss at avm.de> > --- > include/uapi/linux/if_link.h | 1 + > net/bridge/br_device.c | 2 ++ > net/bridge/br_fdb.c | 6 ++++++ > net/bridge/br_netlink.c | 9 ++++++++- > net/bridge/br_private.h | 2 ++ > 5 files changed, 19 insertions(+), 1 deletion(-) >I completely missed the fact that you don't deal with the situation where you already have fdbs created and a limit is set later, then it would be useless because it will start counting from 0 even though there are already entries. Also another issue that came to mind is that you don't deal with fdb_create() for "special" entries, i.e. when adding a port. Currently it will print an error, but you should revisit all callers and see where it might be a problem.
Apparently Analagous Threads
- [Bridge] [PATCH net-next 1/2] bridge: Add a limit on FDB entries
- [Bridge] [PATCH net-next v2 2/3] bridge: Add a limit on learned FDB entries
- [Bridge] [PATCH net-next v2 0/3, iproute2-next 0/1] bridge: Add a limit on learned FDB entries
- [Bridge] [PATCH] bridge: check kmem_cache_create() error
- [Bridge] [PATCH net-next v2 1/3] bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry