Hi again,
On 01/07/2019 20:31, Martin Weinelt wrote:
> Hi Nik,
>
> On 7/1/19 7:03 PM, Nikolay Aleksandrov wrote:
>> Hi Martin,
>>
>> On 01/07/2019 19:53, Martin Weinelt wrote:
>>> Hi Nik,
>>>
>>> more info below.
>>>
>>> On 6/29/19 3:11 PM, nikolay at cumulusnetworks.com wrote:
>>>> On 29 June 2019 14:54:44 EEST, Martin Weinelt <martin at linuxlounge.net> wrote:
>>>>> Hello,
>>>>>
>>>>> we've recently been experiencing memory leaks on our Linux-based
>>>>> routers, at least as far back as v4.19.16.
>>>>>
>>>>> After rebuilding with KASAN it found a use-after-free in
>>>>> br_multicast_rcv which I could reproduce on v5.2.0-rc6.
>>>>>
>>>>> Please find the KASAN report below. I'm not sure what else to provide,
>>>>> so feel free to ask.
>>>>>
>>>>> Best,
>>>>> Martin
>>>>>
>>>>>
>>>>
>>>> Hi Martin,
>>>> I'll look into this, are there any specific steps to reproduce it?
>>>>
>>>> Thanks,
>>>> Nik
>>>>>
>>> Each server is a KVM Guest and has 18 bridges with the same master/slave
>>> relationships:
>>>
>>> bridge -> batman-adv -> {l2 tunnel, virtio device}
>>>
>>> Linus Lüssing from batman-adv asked me to apply this patch to help
>>> with debugging.
>>>
>>> v5.2-rc6-170-g728254541ebc with this patch yielded the following KASAN
>>> report, not sure if the additional information at the end is a result of
>>> the added patch though.
>>>
>>> Best,
>>> Martin
>>>
>>
>> I see a couple of issues that can cause out-of-bounds accesses in br_multicast.c,
>> more specifically there are pskb_may_pull calls and accesses to stale skb pointers.
>> I've had these on my "to fix" list for some time now; I will prepare and test the
>> fixes and send them for review. In a few minutes I'll send a test patch for you.
>> That being said, I thought you said you've been experiencing memory leaks, but the
>> reports below are for out-of-bounds accesses. Could you please clarify whether you
>> were speaking about these, or is there another issue as well?
>> If you're experiencing memory leaks, are you sure they're related to the bridge?
>> You could try kmemleak for those.
>>
>> Thank you,
>> Nik
>>
>
> we had been experiencing memory leaks on v4.19.37, that's why we started to turn on
> KASAN and kmemleak in the first place. This is when we found this use-after-free.
>
> The memory leak exists, and is a separate issue. Apparently kmemleak does not work;
> I suspect the early log size is too small.
>
> root@gw02:~# echo scan > /sys/kernel/debug/kmemleak
> -bash: echo: write error: Device or resource busy
>
> CONFIG_HAVE_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
> CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
>
> I'll increase the early log size with the next build to try and get more information
> on the memory leak, I'll open a separate thread for that then.
>
> Thanks,
> Martin
>
I see, thanks for clarifying this. So for the KASAN report, could you please try the
attached patch?
Also could you please run the br_multicast_rcv+xxx addresses through
linux/scripts/faddr2line for your kernel/bridge:
usage: faddr2line [--list] <object file> <func+offset> <func+offset>...
Thanks,
Nik
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-net-bridge-mcast-fix-possible-uses-of-stale-pointers.patch
Type: text/x-patch
Size: 3443 bytes
Desc: not available
URL:
<http://lists.linuxfoundation.org/pipermail/bridge/attachments/20190701/848a5c95/attachment.bin>
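
The patch itself is not reproduced in this archive. As a rough illustration of the bug class named in its title (a pointer that goes stale once pskb_may_pull moves the packet data), here is a minimal userspace C sketch. It is not the patch and not kernel code: struct buf, buf_may_pull and read_byte_after_pull are hypothetical names invented for this example, and buf_may_pull only mimics the one relevant property, namely that the data pointer may change across the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a heap buffer that can be regrown. */
struct buf {
    unsigned char *data;
    size_t len;
};

/*
 * Hypothetical analogue of pskb_may_pull(): make sure at least `need`
 * bytes are addressable. To make the hazard visible it always moves the
 * data when it has to grow, so any pointer into the old b->data is stale
 * after a successful call. Returns 1 on success, 0 on failure.
 */
static int buf_may_pull(struct buf *b, size_t need)
{
    unsigned char *p;

    assert(b != NULL && b->data != NULL);
    if (need <= b->len)
        return 1;

    p = calloc(1, need);            /* new, zero-filled storage */
    if (p == NULL)
        return 0;
    memcpy(p, b->data, b->len);     /* carry over the existing bytes */
    free(b->data);                  /* old storage is gone from here on */
    b->data = p;
    b->len = need;
    return 1;
}

/*
 * Safe pattern: keep an offset across the call, not a pointer.
 *
 * The buggy pattern would be:
 *     unsigned char *hdr = b->data + off;   // taken before the pull
 *     buf_may_pull(b, need);                // may free the old data
 *     *out = *hdr;                          // use-after-free
 */
static int read_byte_after_pull(struct buf *b, size_t off, size_t need,
                                unsigned char *out)
{
    if (!buf_may_pull(b, need))
        return 0;
    if (off >= b->len)
        return 0;
    *out = b->data[off];            /* re-derived from the current b->data */
    return 1;
}
```

The safe variant carries only the offset across buf_may_pull and recomputes the address from the current data pointer afterwards. Whether the actual patch takes this approach cannot be told from the attachment metadata alone.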