Hi!

Long story short: 11.2-STABLE/amd64 r335757 has leaked over 4600MB of kernel wired memory
over 81 days of uptime, out of 8GB total RAM. Details follow.

I have a workstation running Xorg, Firefox, Thunderbird, LibreOffice and occasionally
VirtualBox for a single VM.

It has two identical 320GB HDDs combined into a single graid-based array with "Intel"
on-disk format, having 3 volumes:
- one "RAID1" volume /dev/raid/r0 occupies the first 10GB of each HDD;
- two "SINGLE" volumes /dev/raid/r1 and /dev/raid/r2 utilize the "tails" of the HDDs (310GB each).

/dev/raid/r0 (10GB) has MBR partitioning and two slices:
- /dev/raid/r0s1 (8GB) is used for swap;
- /dev/raid/r0s2 (2GB) is used by a non-redundant ZFS pool named "os" that contains only
  the root file system (177M used) and the /usr file system (340M used).

There is also a second pool (ZMIRROR) named "z" built directly on top of the /dev/raid/r[12]
volumes; this pool contains all other file systems, including /var, /home, /usr/ports,
/usr/local, /usr/{src|obj} etc.

# zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
os    1,98G   520M  1,48G        -         -    55%    25%  1.00x  ONLINE  -
z      288G  79,5G   209G        -         -    34%    27%  1.00x  ONLINE  -

This way I have swap outside of ZFS, boot blocks and partitioning mirrored by means of
GEOM_RAID, and I can use the local console to break into single-user mode, unmount all
file systems other than root and /usr, and even export the bigger ZFS pool "z". I did
exactly that and saw ARC usage (limited with vfs.zfs.arc_max="3G" in /boot/loader.conf)
drop from over 2500MB down to 44MB, but "Wired" stayed high. Now, after importing "z"
back and booting to multi-user mode, top(1) shows:

last pid: 51242;  load averages:  0.24, 0.16, 0.13    up 81+02:38:38  22:59:18
104 processes: 1 running, 103 sleeping
CPU:  0.0% user,  0.0% nice,  0.4% system,  0.2% interrupt, 99.4% idle
Mem: 84M Active, 550M Inact, 4K Laundry, 4689M Wired, 2595M Free
ARC: 273M Total, 86M MFU, 172M MRU, 64K Anon, 1817K Header, 12M Other
     117M Compressed, 333M Uncompressed, 2.83:1 Ratio
Swap: 8192M Total, 940K Used, 8191M Free

I also have KDB and DDB in my custom kernel. How do I debug the leak further?

I use the nvidia-driver-340-340.107 driver for a GK208 [GeForce GT 710B] video card.

Here are the outputs of "vmstat -m": http://www.grosbein.net/freebsd/leak/vmstat-m.txt
and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt
as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt
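For a first pass at pinning down where the wired memory sits, a rough sketch along
these lines may help (assuming the standard 11.x sysctl names and the stock
vmstat -m / vmstat -z column layouts; adjust as needed):

# Total wired memory in MB (v_wire_count is in pages):
echo "$(( $(sysctl -n vm.stats.vm.v_wire_count) * $(sysctl -n hw.pagesize) / 1048576 )) MB wired"

# Current ARC size in MB, for comparison:
echo "$(( $(sysctl -n kstat.zfs.misc.arcstats.size) / 1048576 )) MB ARC"

# Largest malloc(9) consumers, sorted by MemUse (column 3 of vmstat -m):
vmstat -m | sort -rn -k 3 | head -n 20

# Largest UMA zones by memory actually in use (SIZE * USED):
vmstat -z | awk -F '[:,] *' 'NR > 1 && $2 > 0 { printf "%12.1f KB  %s\n", $2 * $4 / 1024, $1 }' | sort -rn | head -n 20

Since KDB/DDB are compiled in, DDB's "show malloc" and "show uma" commands print similar
per-type and per-zone summaries from the debugger.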
On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
> Hi!
>
> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel wired memory
> over 81 days uptime out of 8GB total RAM.
[...]
> I have KDB and DDB in my custom kernel also. How do I debug the leak further?
>
> I use nvidia-driver-340-340.107 driver for GK208 [GeForce GT 710B] video card.
> Here are outputs of "vmstat -m": http://www.grosbein.net/freebsd/leak/vmstat-m.txt
> and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt

I suspect that the "leaked" memory is simply being used to cache UMA items. Note that
the values in the FREE column of the vmstat -z output are quite large. The cached items
are reclaimed only when the page daemon wakes up to reclaim memory; if there are no
memory shortages, large amounts of memory may accumulate in UMA caches. In this case,
the sum of the products of columns 2 and 5 gives a total of roughly 4GB cached.

> as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
> and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt
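For reference, a small sketch of that arithmetic run against the live system
(assuming the stock vmstat -z format "NAME: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP",
i.e. column 2 is the item size and column 5 the number of cached free items):

# Sum SIZE * FREE over all zones to estimate memory held in UMA free caches.
vmstat -z | awk -F '[:,] *' '
    NR > 1 && $2 > 0 { cached += $2 * $5 }
    END { printf "%.1f MB cached in UMA free lists\n", cached / 1048576 }'

If that figure tracks the otherwise unexplained wired total, the memory is reclaimable
cache rather than a leak, and it should shrink once the page daemon runs under memory
pressure.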
In article <d8c7abc0-3ba1-40e4-22b1-1b30d28ced14 at grosbein.net> eugen at grosbein.net writes:

> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel wired memory
> over 81 days uptime out of 8GB total RAM.

Not a whole lot of evidence yet, but anecdotally I'm seeing the same thing on some
huge-memory NFS servers running releng/11.2. They seem to run fine for a few weeks,
then mysteriously start swapping continuously, a few hundred pages a second. This
continues for hours at a time, and then stops just as mysteriously. Over time the
total memory dedicated to the ZFS ARC goes down, but there is no decrease in wired
memory. I've tried disabling swap, but this seems to make the server unstable.

I have yet to find any obvious commonality, aside from the fact that these are all
large-memory NFS servers which don't do much of anything else -- the only software
running on them is related to managing and monitoring the NFS service.

-GAWollman
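One hedged way to capture that slow divergence would be a periodic logger comparing
ARC size, wired memory and swap usage (assuming the usual 11.x sysctl names; the log
path and interval are only placeholders):

# Log ARC, wired and swap usage every 5 minutes to see when they drift apart.
pagesize=$(sysctl -n hw.pagesize)
while :; do
    arc_mb=$(( $(sysctl -n kstat.zfs.misc.arcstats.size) / 1048576 ))
    wired_mb=$(( $(sysctl -n vm.stats.vm.v_wire_count) * pagesize / 1048576 ))
    swap_mb=$(swapinfo -m | awk 'END { print $3 }')   # "Used" column, last line
    echo "$(date '+%F %T') arc=${arc_mb}MB wired=${wired_mb}MB swap_used=${swap_mb}MB"
    sleep 300
done >> /var/log/wired-vs-arc.log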
Hi,

I just want to report a similar issue here with 11.2-RELEASE-p8. The affected machine
has 64 GB of RAM and does daily backups of several machines during the night; in the
daytime there are parallel runs of clamav on a specific dataset.

One symptom is basic I/O performance: after upgrading from 11.1 to 11.2, backup times
have increased and are still increasing. After one week of operation, backup times had
doubled, without anything else having changed.

Then there is the wired memory issue and the far too lazy reclaiming of memory for user
processes: the clamav scans start at 10:30 and get swapped out immediately. Although
vfs.zfs.arc_max=48G, wired is at 62 GB before the scans, and it takes about 10 minutes
for the scan processes to actually run from system RAM rather than swap.

There is obviously something broken here, as there are several threads with similar
observations.

with kind regards,
Robert Schulze
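If the extra wired memory here is also cached UMA items, a simple before/after comparison
around the 10:30 scan start could show whether the page daemon actually drains those
caches once the clamav processes need the RAM (file names and times below are only
placeholders):

# Snapshot UMA zone statistics before and after the scans start and compare the
# FREE column; it should shrink if cached items are being reclaimed under pressure.
vmstat -z > /tmp/uma-before.txt     # e.g. at 10:25
sleep 900                           # wait through the slow start-up window
vmstat -z > /tmp/uma-after.txt      # e.g. at 10:40
diff -u /tmp/uma-before.txt /tmp/uma-after.txt | less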