thr3ads.net - Linux Ethernet Bridging - [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature [Jul 2022]

Vladimir Oltean

2022-Jul-06 20:21 UTC

[Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature

On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> I don't think that is new or surprising, if there isn't anything to control the
> device resources you'll get there. You don't really need to write any new programs
> you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
> devices for a few seconds.

Of course it isn't new, but that doesn't make the situation in any way better,
quite the opposite...

> The point is it's not the bridge's task to limit memory consumption or to watch for resource
> management. You can limit new entries from the device driver (in case of swdev learning) or
> you can use a daemon to watch the number of entries and disable learning. There are many
> different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
> per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
> algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
> current default use cases are unacceptable, they must be opt-in.

I don't think you can really say that it's not the bridge's task to
limit memory consumption when what it does is essentially allocate
memory from untrusted and unbounded user input, in kernel softirq
context.

That's in fact the problem, the kernel OOM killer will kick in, but
there will be no process to kill. This is why the kernel deadlocks on
memory and dies.

Maybe where our expectations differ is that I believe that a Linux
bridge shouldn't need gazillions of tweaks to not kill the kernel?
There are many devices in production using a bridge without such
configuration, you can't just make it opt-in.

Of course, performance under heavy stress is a separate concern, and
maybe user space monitoring would be a better idea for that.

I know you changed jobs, but did Cumulus Linux have an application to
monitor and limit the FDB entry count? Is there some standard
application which does this somewhere, or does everybody roll their own?

Anyway, limiting FDB entry count from user space is still theoretically
different from not dying. If you need to schedule a task to dispose of
the weight while the ship is sinking from softirq context, you may never
get to actually schedule that task in time. AFAIK the bridge UAPI doesn't
expose a pre-programmed limit, so what needs to be done is for user
space to manually delete entries until the count falls below the limit.
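
To make the user-space approach discussed above concrete, here is a minimal sketch in Python. It assumes the text output of `bridge fdb show` has the form `MAC dev PORT [vlan N] master BRIDGE [permanent|static]`, and it treats entries without a `permanent` or `static` keyword as learned. The function names and the per-port `limit` parameter are hypothetical, and the eviction choice (drop whatever is listed last) is a placeholder rather than LRU. VLAN-aware bridges would also need the `vlan` argument on deletion, which this sketch omits.

```python
from collections import defaultdict


def learned_entries_per_port(fdb_text):
    """Group learned MAC addresses by port from `bridge fdb show` style text."""
    per_port = defaultdict(list)
    for line in fdb_text.splitlines():
        tokens = line.split()
        # Expect at least "MAC dev PORT"; skip anything else.
        if len(tokens) < 3 or tokens[1] != "dev":
            continue
        # Skip entries that were configured rather than learned.
        if "permanent" in tokens or "static" in tokens:
            continue
        # Only count entries that live in the bridge's own FDB.
        if "master" not in tokens:
            continue
        per_port[tokens[2]].append(tokens[0])
    return dict(per_port)


def entries_to_delete(per_port, limit):
    """Return (port, mac) pairs above the hypothetical per-port limit."""
    excess = []
    for port, macs in per_port.items():
        for mac in macs[limit:]:
            excess.append((port, mac))
    return excess


def delete_command(port, mac):
    """Build the argv for deleting one entry from the bridge FDB."""
    return ["bridge", "fdb", "del", mac, "dev", port, "master"]
```

A daemon built on this would run `bridge fdb show` periodically, feed the output to `learned_entries_per_port`, and execute the commands for each pair returned by `entries_to_delete`. As noted above, nothing guarantees that such a task gets scheduled in time when entries are being created from softirq context.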

Nikolay Aleksandrov

2022-Jul-06 21:01 UTC

On 06/07/2022 23:21, Vladimir Oltean wrote:
> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
>> I don't think that is new or surprising, if there isn't anything to control the
>> device resources you'll get there. You don't really need to write any new programs
>> you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
>> devices for a few seconds.
>
> Of course it isn't new, but that doesn't make the situation in any way better,
> quite the opposite...
>
>> The point is it's not the bridge's task to limit memory consumption or to watch for resource
>> management. You can limit new entries from the device driver (in case of swdev learning) or
>> you can use a daemon to watch the number of entries and disable learning. There are many
>> different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
>> per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
>> algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
>> current default use cases are unacceptable, they must be opt-in.
>
> I don't think you can really say that it's not the bridge's task to
> limit memory consumption when what it does is essentially allocate
> memory from untrusted and unbounded user input, in kernel softirq
> context.
>
> That's in fact the problem, the kernel OOM killer will kick in, but
> there will be no process to kill. This is why the kernel deadlocks on
> memory and dies.
>
> Maybe where our expectations differ is that I believe that a Linux
> bridge shouldn't need gazillions of tweaks to not kill the kernel?
> There are many devices in production using a bridge without such
> configuration, you can't just make it opt-in.
>
No, you cannot suddenly enforce such a limit because such a limit cannot work for everyone.
There is no silver bullet that works for everyone. Opt-in is the only way to go
about this with specific config for different devices and deployments, anyone
interested can set their limits. They can be auto-adjusted by swdev drivers
after that if necessary, but first they must be implemented in software.

If you're interested in adding default limits based on memory heuristics and consumption
I'd be interested to see it.

> Of course, performance under heavy stress is a separate concern, and
> maybe user space monitoring would be a better idea for that.
>

You can do the whole software learning from user-space if needed, not only under heavy stress.

> I know you changed jobs, but did Cumulus Linux have an application to
> monitor and limit the FDB entry count? Is there some standard
> application which does this somewhere, or does everybody roll their own?
>

I don't see how that is relevant.

> Anyway, limiting FDB entry count from user space is still theoretically
> different from not dying. If you need to schedule a task to dispose of

you can disable learning altogether and add entries from a user-space daemon, ie
implement complete user-space learning agent, theoretically you can solve it in
many ways if that's the problem

> the weight while the ship is sinking from softirq context, you may never
> get to actually schedule that task in time. AFAIK the bridge UAPI doesn't
> expose a pre-programmed limit, so what needs to be done is for user
> space to manually delete entries until the count falls below the limit.

That is a single case speculation, it depends on how it was implemented in the first place. You
can disable learning and have more than enough time to deal with it.
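
The "disable learning" option mentioned here can be sketched as follows. This is only an illustration, assuming a daemon that already knows the number of learned entries per port. The function name and the `limit` threshold are hypothetical. The command it builds uses the `learning` flag of `bridge link set` from iproute2.

```python
def disable_learning_commands(learned_counts, limit):
    """Return argv lists that turn learning off on ports at or above the limit.

    learned_counts maps a port name to its current number of learned entries;
    limit is a hypothetical per-port threshold chosen by the administrator.
    """
    commands = []
    for port, count in sorted(learned_counts.items()):
        if count >= limit:
            commands.append(
                ["bridge", "link", "set", "dev", port, "learning", "off"]
            )
    return commands
```

Each returned list could be passed to `subprocess.run`. Re-enabling learning once the table has been cleaned up is left out; a real daemon would want some hysteresis so that ports do not flap between the two states.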

I already said it's ok to add hard configurable limits if they're done properly performance-wise.
Any distribution can choose to set some default limits after the option exists.
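
For illustration, the opt-in semantics argued for in this message could look like the toy model below. It is not the bridge's implementation: the class, its `max_learned` parameter and the convention that 0 means "no limit" are assumptions made for the example. The point it shows is that the default leaves existing behaviour unchanged, and a limit only applies once someone configures it.

```python
class LearningTable:
    """Toy FDB model with an opt-in cap on learned entries."""

    def __init__(self, max_learned=0):
        # Assumption for this sketch: 0 means "no limit", so the default
        # behaves exactly like a table without the feature.
        self.max_learned = max_learned
        self.entries = {}

    def learn(self, mac, port):
        """Record mac on port; return False if the configured limit blocks it."""
        if mac in self.entries:
            # A known address moving to another port is an update, not growth.
            self.entries[mac] = port
            return True
        if self.max_learned and len(self.entries) >= self.max_learned:
            return False
        self.entries[mac] = port
        return True
```

With such an option in place, a distribution or a switchdev driver could pick its own default value, while an unconfigured bridge would keep learning without a cap.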
