Mark Martinec
2018-Aug-07 12:58 UTC
All the memory eaten away by ZFS 'solaris' malloc - on 11.1-R amd64
> On Sat, Aug 04, 2018 at 08:38:04PM +0200, Mark Martinec wrote:
>> 2018-08-04 19:01, Mark Johnston wrote:
>> > I think running "zpool list" is adding a lot of noise to the output.
>> > Could you retry without doing that?
>> No, like I said previously, the "zpool list" (with one defunct
>> zfs pool) *is* the sole culprit of the zfs memory leak.
>> With each invocation of "zpool list" the "solaris" malloc
>> jumps up by the same amount, and never ever drops. Without
>> running it (like repeatedly under 'telegraf' monitoring
>> of zfs), the machine runs normally and never runs out of
>> memory, and the "solaris" malloc count no longer grows steadily.

2018-08-04 21:47, Mark Johnston wrote:
> Sorry, I missed that message. Given that information, it would be
> useful to see the output of the following script instead:
>
> # dtrace -c "zpool list -Hp" -x temporal=off -n '
>     dtmalloc::solaris:malloc
>     /pid == $target/{@allocs[stack(), args[3]] = count()}
>     dtmalloc::solaris:free
>     /pid == $target/{@frees[stack(), args[3]] = count();}'
>
> This will record all allocations and frees from a single instance of
> "zpool list".

Collected, here it is:
  https://www.ijs.si/usr/mark/tmp/dtrace-cmd.out.bz2

Kevin P. Neal wrote:
> Was there a mention of a defunct pool?

Indeed. I haven't tried yet to destroy it, so it is only my hypothesis
that a defunct pool plays a role in this leak.

> I've got a machine with 8GB RAM running 11.1-RELEASE-p4 with a single
> ZFS pool. It runs zfs list in a script multiple times a minute, and it
> has been doing so for 181 days with no reboot. I have not seen any
> memory issues.

I have jumped from 10.3 directly to 11.1-RELEASE-p11, so I'm not sure
with exactly which version / patch level the problem was introduced.

I tried to reproduce the problem on another host running 11.2R, using a
memory disk (md): created a GPT partition on it and a ZFS pool on top,
then destroyed the disk, so the pool was left as UNAVAILABLE.
Unfortunately this did not reproduce the problem: "zpool list" on that
host does not cause ZFS to leak memory. It must be something specific
to that failed disk or pool which is causing the leak.

  Mark
Mark Martinec
2018-Aug-13 17:39 UTC
All the memory eaten away by ZFS 'solaris' malloc - on 11.2-R amd64
> 2018-08-04 21:47, Mark Johnston wrote:
>> Sorry, I missed that message. Given that information, it would be
>> useful to see the output of the following script instead:
>>
>> # dtrace -c "zpool list -Hp" -x temporal=off -n '
>>     dtmalloc::solaris:malloc
>>     /pid == $target/{@allocs[stack(), args[3]] = count()}
>>     dtmalloc::solaris:free
>>     /pid == $target/{@frees[stack(), args[3]] = count();}'
>>
>> This will record all allocations and frees from a single instance of
>> "zpool list".

2018-08-07 14:58, Mark Martinec wrote:
> Collected, here it is:
>   https://www.ijs.si/usr/mark/tmp/dtrace-cmd.out.bz2
>
>> Was there a mention of a defunct pool?
>
> Indeed. I haven't tried yet to destroy it, so it is only my hypothesis
> that a defunct pool plays a role in this leak.
[...]
> I have jumped from 10.3 directly to 11.1-RELEASE-p11, so I'm not sure
> with exactly which version / patch level the problem was introduced.
>
> I tried to reproduce the problem on another host running 11.2R, using a
> memory disk (md): created a GPT partition on it and a ZFS pool on top,
> then destroyed the disk, so the pool was left as UNAVAILABLE.
> Unfortunately this did not reproduce the problem: "zpool list" on that
> host does not cause ZFS to leak memory. It must be something specific
> to that failed disk or pool which is causing the leak.
>   Mark

More news: in my last posting I said I can't reproduce the issue on
another 11.2 host. Well, it turned out this was only half the truth.

This is what I did last time:

  # create a test pool on md
  mdconfig -a -t swap -s 1Gb
  gpart create -s gpt /dev/md0
  gpart add -t freebsd-zfs -a 4k /dev/md0
  zpool create test /dev/md0p1

  # destroy the disk underneath the pool, making it "unavailable"
  mdconfig -d -u 0 -o force

and I reported that the "zpool list" command did not leak memory there,
unlike on the host where the problem was first detected.

But in the days that followed, this second machine also started to run
out of memory and ground to a standstill after a couple of days - this
has now happened three times, until I realized the same thing was
happening here as on the original host. (The "zpool list" is run
periodically by a plugin of the "telegraf" monitoring.)

Sure enough, "zpool list" was leaking "solaris" zone memory here too,
and even in larger chunks (previously by 570, now by about 2k):

  # (while true; do zpool list >/dev/null; vmstat -m | \
     fgrep solaris; sleep 0.5; done) | awk '{print $2-a; a=$2}'
  12224540
  2509
  3121
  5022
  2507
  1834
  2508
  2505

And it's not just the "zpool list" command. The same leak occurs with
"zpool status" and with "zpool iostat", either when the defunct pool is
explicitly specified as an argument or when no pool is specified
(implying all pools) - but not when a healthy pool is explicitly given
to such a command.

And to confirm the hypothesis: while running "zpool list" in the above
loop, I destroyed the defunct pool from another terminal, and the leak
immediately vanished (the "vmstat -m | fgrep solaris" count no longer
grew).

So the only missing link is: why did the leak not start immediately
after revoking the disk and making the pool unavailable, but only some
time later (hours? a few days? after a reboot? after running some other
command?).

  Mark
Andriy Gapon
2018-Aug-14 09:18 UTC
All the memory eaten away by ZFS 'solaris' malloc - on 11.1-R amd64
On 07/08/2018 15:58, Mark Martinec wrote:
> Collected, here it is:
>
>   https://www.ijs.si/usr/mark/tmp/dtrace-cmd.out.bz2

I see one memory leak, not sure if it's the only one.
It looks like vdev_geom_read_config() leaks all parsed vdev nvlist-s but
the last. The problem seems to come from r316760. Before that commit the
function would return upon finding the first valid config, but now it
keeps iterating.

The memory leak should not be a problem when vdev-s are probed
sufficiently rarely, but it appears that with an unhealthy pool the
probing can happen much more frequently (e.g., every time pools are
listed).

-- 
Andriy Gapon
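
To make the pattern concrete, below is a minimal, self-contained C
sketch of the shape of that bug - not the actual FreeBSD source. The
real vdev_geom_read_config() unpacks an nvlist_t from each of the
vdev's labels and would release a previously kept config with
nvlist_free(); the "struct config" and read_label() stand-ins here are
invented purely for illustration, and the real fix in the tree may look
different.

/*
 * Illustrative sketch only; the data types are stand-ins, not ZFS code.
 */
#include <stdio.h>
#include <stdlib.h>

#define VDEV_LABELS 4           /* each vdev carries four copies of its label */

struct config {                 /* stand-in for an unpacked nvlist_t config */
    char text[64];
};

/* Stand-in for reading and unpacking one label; may fail (returns NULL). */
static struct config *
read_label(int l)
{
    struct config *c;

    if ((c = malloc(sizeof(*c))) == NULL)
        return (NULL);
    snprintf(c->text, sizeof(c->text), "config from label %d", l);
    return (c);
}

/* Post-r316760 shape: keeps iterating and overwrites the kept pointer. */
static struct config *
read_config_leaky(void)
{
    struct config *best = NULL;

    for (int l = 0; l < VDEV_LABELS; l++) {
        struct config *c = read_label(l);

        if (c == NULL)
            continue;
        best = c;               /* BUG: the previously kept config leaks */
    }
    return (best);
}

/* The obvious remedy: release the earlier config before replacing it. */
static struct config *
read_config_fixed(void)
{
    struct config *best = NULL;

    for (int l = 0; l < VDEV_LABELS; l++) {
        struct config *c = read_label(l);

        if (c == NULL)
            continue;
        free(best);             /* nvlist_free(best) in the real code */
        best = c;
    }
    return (best);
}

int
main(void)
{
    free(read_config_leaky());  /* this call leaked up to VDEV_LABELS - 1 configs */
    free(read_config_fixed());  /* leaks nothing */
    return (0);
}

With the post-r316760 loop, every label that yields a valid config
except the last one stays allocated, which matches the steady
per-invocation growth of the "solaris" malloc type observed above.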