Hello,

I've updated the PR on this via the bug-tracker email (hopefully -- it bounced my first email), but I thought I should bring it to the attention of the list, as it's still happening and the original PR is from March 2012.

The PR is here: http://www.freebsd.org/cgi/query-pr.cgi?pr=165903&cat

I am experiencing the same mbuf leak on fresh 9.1-RELEASE machines (amd64). Most of my machines are ESXi 5.1 VMs using the e1000 (em0) NIC. The VM is stock -- just one freebsd-update run, nothing custom.

I have also seen this condition on an older 9.0-STABLE from July 1st, 2012. I did not notice it much before that date, but I can't tell for sure. I still use a few machines on that build, so confirmation was easy.

I do not see the error if I load the VMware tools and use the vmx3f0 adapter; it happens only with em0.

I have set the mbuf limit to a very high number (322144) to buy more time between lockups/crashes. Most often the systems stay functional; they just need a reboot, or more mbufs when I add them. Sometimes they lock up or crash as I ifconfig the adapter down/up or attempt to add more mbufs via sysctl.

1) Is anyone else able to reproduce this problem? The PR is still open, which suggests to me that not everyone is hitting it, or there would be more drive to fix it.
2) What can I do to help move this problem forward? It's not just my systems, as evidenced by the original poster of the PR.
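In case it helps anyone reproduce the monitoring side: here is roughly what I use to read and raise the cluster limit from C (a minimal sketch around sysctlbyname(3); sysctl(8) and "netstat -m" do the same job from the shell, and raising the limit needs root):

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
        int clusters;
        size_t len = sizeof(clusters);

        /* Read the current limit, same as "sysctl kern.ipc.nmbclusters". */
        if (sysctlbyname("kern.ipc.nmbclusters", &clusters, &len,
            NULL, 0) == -1) {
                perror("sysctlbyname");
                return (1);
        }
        printf("kern.ipc.nmbclusters: %d\n", clusters);

        /* Optionally raise it, same as "sysctl kern.ipc.nmbclusters=N". */
        if (argc > 1) {
                int newlimit = atoi(argv[1]);

                if (sysctlbyname("kern.ipc.nmbclusters", NULL, NULL,
                    &newlimit, sizeof(newlimit)) == -1) {
                        perror("sysctlbyname");
                        return (1);
                }
        }
        return (0);
}

Run it with no arguments to print the limit, or with a number (e.g. 322144) to raise it.

Thanks.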
On Wed, Apr 10, 2013 at 07:39:31PM +0000, Chris Forgeron wrote:
> I've updated the PR on this via the bug-tracker email (hopefully -- it bounced my first email), but I thought I should bring it to the attention of the list, as it's still happening and the original PR is from March 2012.
>
> The PR is here: http://www.freebsd.org/cgi/query-pr.cgi?pr=165903&cat
>
> I am experiencing the same mbuf leak on fresh 9.1-RELEASE machines (amd64). Most of my machines are ESXi 5.1 VMs using the e1000 (em0) NIC. The VM is stock -- just one freebsd-update run, nothing custom.
>
> I have also seen this condition on an older 9.0-STABLE from July 1st, 2012. I did not notice it much before that date, but I can't tell for sure. I still use a few machines on that build, so confirmation was easy.
>
> I do not see the error if I load the VMware tools and use the vmx3f0 adapter; it happens only with em0.
>
> I have set the mbuf limit to a very high number (322144) to buy more time between lockups/crashes. Most often the systems stay functional; they just need a reboot, or more mbufs when I add them. Sometimes they lock up or crash as I ifconfig the adapter down/up or attempt to add more mbufs via sysctl.
>
> 1) Is anyone else able to reproduce this problem? The PR is still open, which suggests to me that not everyone is hitting it, or there would be more drive to fix it.
> 2) What can I do to help move this problem forward? It's not just my systems, as evidenced by the original poster of the PR.

1. This PR does not contain output from "dmesg" or "pciconf -lvbc", and neither does your email. That output matters.

2. Please try 9.1-STABLE and see if there is an improvement; there have been a huge number of changes/fixes to em(4) between 9.1-RELEASE and now. You can try this:

https://pub.allbsd.org/FreeBSD-snapshots/amd64-amd64/9.1-RELENG_9-r249290-JPSNAP/

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/    |
| Mountain View, CA, US                                               |
| Making life hard for others since 1977.             PGP 4BD6C0CB    |
Hi.

On 11.04.2013 01:39, Chris Forgeron wrote:
> I do not see the error if I load the VMware tools and use the vmx3f0 adapter; it happens only with em0.
>
> I have set the mbuf limit to a very high number (322144) to buy more time between lockups/crashes. Most often the systems stay functional; they just need a reboot, or more mbufs when I add them. Sometimes they lock up or crash as I ifconfig the adapter down/up or attempt to add more mbufs via sysctl.
>
> 1) Is anyone else able to reproduce this problem? The PR is still open, which suggests to me that not everyone is hitting it, or there would be more drive to fix it.
> 2) What can I do to help move this problem forward? It's not just my systems, as evidenced by the original poster of the PR.

(I'm the author of the PR.)

I was experiencing this on 9.0 up until some -STABLE build, after which the leak was gone on the exact same configuration. This server is equipped with bce(4) interfaces only, so I don't see any connection to the interface driver. I think it's more configuration related.

I created this PR in order to investigate why one of my 9.x servers hangs periodically. Since then I have tried lots of 9-STABLE snapshots; none of them fixed my problem. Last month I decided to switch to 10.x. The uptime is 37 days so far; none of my 9.x snapshots was able to stand that long. Even if this machine crashes while I write this, it still means that 10.x right now is at least as stable as 9.x, and can run as smoothly as 9.x does.

My advice: use 10.0-CURRENT. 9.0 and all of its descendant versions are broken beyond repair, imo. Switching to 10.x was a hard decision for me too; I was scared off by the '-CURRENT' karma. Seems like it's not that creepy.

Eugene.
On Tue, Apr 16, 2013 at 04:43:49PM +0000, Chris Forgeron wrote:
> Thanks, I've applied it, and am rebuilding now. I should know tonight/tomorrow.
>
> I take it this proves that I don't have the latest source with cvsup, or is this a work in progress?
>
> Thanks again.

The patch Gleb provided is not committed anywhere (not even HEAD/current) -- it's a patch for you to test. :-)

The sources you have via csup/cvsup seem to be recent enough (all I can go off of is your legacy em(4) driver version being 1.0.5).

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/    |
| Mountain View, CA, US                                               |
| Making life hard for others since 1977.             PGP 4BD6C0CB    |
On Wed, Apr 17, 2013 at 05:38:12PM +0000, Chris Forgeron wrote:
> Hello,
>
> I'm happy to report that the patch from Gleb has fixed the problem.
>
> My system had 256 mbuf clusters in use at boot, and after a day it still has only 256 mbuf clusters in use.
>
> From the patch, I see we are now dropping these packets(?). Was the issue that the packets were being queued up for further work, but nothing was being done with them?

Not exactly. Please open up the source file and follow along.

At line 538, a call to mtod() is made, which returns a pointer to the ARP header inside the mbuf (the mbuf itself was allocated earlier on the receive path). Now go to lines 543 and 549. These are error checks for certain kinds of ARP headers that are either malformed (line 543) or should not be honoured (line 549).

When either of these error checks proved true, the code simply did "return" to get out of the function it was in (in_arpinput()), but never issued m_freem() to free the mbuf, hence the leak.

The patch changes each "return" into "goto drop". The drop label is at line 873, which is where you'll find the m_freem(), followed immediately by the function returning.
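In condensed form, the before/after looks roughly like this (a hypothetical sketch, not the actual if_ether.c code -- the check functions and their bodies are stand-ins I've named for illustration; only the "return" to "goto drop" change mirrors the real patch):

/*
 * Hypothetical condensation of the bug and the fix.  NOT the actual
 * in_arpinput() code; the check functions below are made-up stand-ins.
 */
#include <sys/param.h>
#include <sys/mbuf.h>
#include <net/if_arp.h>

static int
arphdr_malformed(const struct arphdr *ah)
{
        return (ah->ar_hln == 0);       /* placeholder for the check at 543 */
}

static int
arphdr_untrusted(const struct arphdr *ah)
{
        (void)ah;
        return (0);                     /* placeholder for the check at 549 */
}

static void
arp_input_sketch(struct mbuf *m)
{
        struct arphdr *ah = mtod(m, struct arphdr *);

        if (arphdr_malformed(ah))
                goto drop;              /* was "return;" -- leaked m */
        if (arphdr_untrusted(ah))
                goto drop;              /* was "return;" -- leaked m */

        /* ... normal ARP processing ... */
        return;
drop:
        m_freem(m);                     /* every early exit now frees the mbuf */
}

The point of the "goto drop" idiom is that a single exit path owns freeing the mbuf, so future error checks can't quietly reintroduce the leak.

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/    |
| Mountain View, CA, US                                               |
| Making life hard for others since 1977.             PGP 4BD6C0CB    |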