thr3ads.net - Linux Ethernet Bridging - [Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans [Aug 2015]

If this information is useful, please help other people find it:
Share via:

Vlad Yasevich

2015-Aug-27 23:47 UTC

[Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans

On 08/27/2015 05:02 PM, Nikolay Aleksandrov wrote:> 
>> On Aug 26, 2015, at 9:57 PM, roopa <roopa at cumulusnetworks.com>
wrote:
>>
>> On 8/26/15, 4:33 AM, Nikolay Aleksandrov wrote:
>>>> On Aug 25, 2015, at 11:06 PM, David Miller <davem at
davemloft.net> wrote:
>>>>
>>>> From: Nikolay Aleksandrov <nikolay at
cumulusnetworks.com>
>>>> Date: Tue, 25 Aug 2015 22:28:16 -0700
>>>>
>>>>> Certainly, that should be done and I will look into it, but
the
>>>>> essence of this patch is a bit different. The problem here
is not
>>>>> the size of the fdb entries, it?s more the number of them -
having
>>>>> 96000 entries (even if they were 1 byte ones) is just way
too much
>>>>> especially when the fdb hash size is small and static. We
could work
>>>>> on making it dynamic though, but still these type of local
entries
>>>>> per vlan per port can easily be avoided with this option.
>>>> 96000 bits can be stored in 12k.  Get where I'm going with
this?
>>>>
>>>> Look at the problem sideways.
>>> Oh okay, I misunderstood your previous comment. I?ll look into
that.
>>>
>> I just wanted to add the other problems we have had with keeping these
macs (mostly from userspace POV):
>> - add/del netlink notification storms
>> - and large netlink dumps
>>
>> In addition to in-kernel optimizations, will be nice to have a solution
that reduces the burden on userspace. That will need a newer netlink dump format
for fdbs. Considering all the changes needed, Nikolays patch seems less
intrusive.
> 
> Right, we need to take these into account as well. I?ll continue the
discussion on this (or restart it) because
> I looked into using a bitmap for the local entries only and while it fixes
the scalability issue, it presents
> a few new ones which are mostly related to the fact that these entries now
exist only without a vlan
> and if a new mac comes along which matches one of these but is in a vlan,
the entry will get created
> in br_fdb_update() unless we add a second lookup, but that will slow down
the learning path.
> Also this change requires an update of every fdb function that uses the vid
as a key (every fdb function?!)
> because now we can have the mac in two places instead of one which is a
pretty big churn with lots
> of conditionals all over the place and I don?t like it. Adding this
complexity for the local addresses only
> seems like an overkill, so I think to drop this issue for now.
I seem to recall Roopa and I and maybe a few others have discussing this a few
years ago at plumbers, I can't remember the details any more.  All these
local
addresses add a ton of confusion.  Does anyone (Stephen?) remember what the
original reason was for all these local addresses? I wonder if we can have
a nob to disable all of them (not just per vlan)?  That might be cleaner and
easier to swallow.
> This patch (that works around the initial problem) also has these issues.
> Note that one way to take care of this in a more straight-forward way would
be to have each entry
> with some sort of a bitmap (like Vlad has tried earlier) and then we can
combine the paths so most
> of these issues disappear, but that will not be easy as was already
commented earlier. I?ve looked
> briefly into doing this with rhashtable so we can keep the memory footprint
for each entry relatively
> small but it still affects the performance and we can have thousands of
resizes happening.
> 
So, one of the earlier approaches that I've tried (before rhashtable was
in the kernel) was to have a hash of vlan ids each with a data structure
pointing to a list of ports for a given vlan as well as a list of fdbs for
a given vlan.  As far as scalability goes, that's really the best approach.
It would also allow us to do packet accounting per vlan.  The only concern
at the time was performance of ingress lookup.   I think rhashtables might
help with this as well as ability to grow the footprint of the vlan hash
table dynamically.

-vlad
> On the notification side if we can fix that, we can actually delete the
96000 entries without creating a
> huge notification storm and do a user-land workaround of the original
issue, so I?ll look into that next.
> 
> Any comments or ideas are very welcome.
> 
> Thank you,
>  Nik
>

Nikolay Aleksandrov

2015-Aug-28 02:17 UTC

head link

[Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans

> On Aug 27, 2015, at 4:47 PM, Vlad Yasevich <vyasevic at redhat.com>
wrote:
> 
> On 08/27/2015 05:02 PM, Nikolay Aleksandrov wrote:
>> 
>>> On Aug 26, 2015, at 9:57 PM, roopa <roopa at
cumulusnetworks.com> wrote:
>>> 
>>> On 8/26/15, 4:33 AM, Nikolay Aleksandrov wrote:
>>>>> On Aug 25, 2015, at 11:06 PM, David Miller <davem at
davemloft.net> wrote:
>>>>> 
>>>>> From: Nikolay Aleksandrov <nikolay at
cumulusnetworks.com>
>>>>> Date: Tue, 25 Aug 2015 22:28:16 -0700
>>>>> 
>>>>>> Certainly, that should be done and I will look into it,
but the
>>>>>> essence of this patch is a bit different. The problem
here is not
>>>>>> the size of the fdb entries, it?s more the number of
them - having
>>>>>> 96000 entries (even if they were 1 byte ones) is just
way too much
>>>>>> especially when the fdb hash size is small and static.
We could work
>>>>>> on making it dynamic though, but still these type of
local entries
>>>>>> per vlan per port can easily be avoided with this
option.
>>>>> 96000 bits can be stored in 12k.  Get where I'm going
with this?
>>>>> 
>>>>> Look at the problem sideways.
>>>> Oh okay, I misunderstood your previous comment. I?ll look into
that.
>>>> 
>>> I just wanted to add the other problems we have had with keeping
these macs (mostly from userspace POV):
>>> - add/del netlink notification storms
>>> - and large netlink dumps
>>> 
>>> In addition to in-kernel optimizations, will be nice to have a
solution that reduces the burden on userspace. That will need a newer netlink
dump format for fdbs. Considering all the changes needed, Nikolays patch seems
less intrusive.
>> 
>> Right, we need to take these into account as well. I?ll continue the
discussion on this (or restart it) because
>> I looked into using a bitmap for the local entries only and while it
fixes the scalability issue, it presents
>> a few new ones which are mostly related to the fact that these entries
now exist only without a vlan
>> and if a new mac comes along which matches one of these but is in a
vlan, the entry will get created
>> in br_fdb_update() unless we add a second lookup, but that will slow
down the learning path.
>> Also this change requires an update of every fdb function that uses the
vid as a key (every fdb function?!)
>> because now we can have the mac in two places instead of one which is a
pretty big churn with lots
>> of conditionals all over the place and I don?t like it. Adding this
complexity for the local addresses only
>> seems like an overkill, so I think to drop this issue for now.
> 
> I seem to recall Roopa and I and maybe a few others have discussing this a
few
> years ago at plumbers, I can't remember the details any more.  All
these local
> addresses add a ton of confusion.  Does anyone (Stephen?) remember what the
> original reason was for all these local addresses? I wonder if we can have
> a nob to disable all of them (not just per vlan)?  That might be cleaner
and
> easier to swallow.
> 
Right, this would be the easiest way and if the others agree - I?ll post a patch
for it so we can
have some way to resolve it today and even if we fix the scalability issue, this
is still a valid case
that some people don?t want local fdbs installed automatically.
Any objections to this ?
>> This patch (that works around the initial problem) also has these
issues.
>> Note that one way to take care of this in a more straight-forward way
would be to have each entry
>> with some sort of a bitmap (like Vlad has tried earlier) and then we
can combine the paths so most
>> of these issues disappear, but that will not be easy as was already
commented earlier. I?ve looked
>> briefly into doing this with rhashtable so we can keep the memory
footprint for each entry relatively
>> small but it still affects the performance and we can have thousands of
resizes happening.
>> 
> 
> So, one of the earlier approaches that I've tried (before rhashtable
was
> in the kernel) was to have a hash of vlan ids each with a data structure
> pointing to a list of ports for a given vlan as well as a list of fdbs for
> a given vlan.  As far as scalability goes, that's really the best
approach.
> It would also allow us to do packet accounting per vlan.  The only concern
> at the time was performance of ingress lookup.   I think rhashtables might
> help with this as well as ability to grow the footprint of the vlan hash
> table dynamically.
> 
> -vlad
> I?ll look into it but I?m guessing the learning will become a more complicated
process with additional
allocations and some hash handling.
>> On the notification side if we can fix that, we can actually delete the
96000 entries without creating a
>> huge notification storm and do a user-land workaround of the original
issue, so I?ll look into that next.
>> 
>> Any comments or ideas are very welcome.
>> 
>> Thank you,
>> Nik

Linux Ethernet Bridging - Aug 2015 - [Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans

[Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans

[Bridge] [PATCH net-next v2] bridge: vlan: allow to suppress local mac install for all vlans