On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
> Hi!
>
> Long story short: 11.2-STABLE/amd64 r335757 has leaked over 4600MB of kernel wired memory over 81 days of uptime,
> out of 8GB total RAM.
>
> Details follow.
>
> I have a workstation running Xorg, Firefox, Thunderbird, LibreOffice and occasionally VirtualBox for a single VM.
>
> It has two identical 320GB HDDs combined into a single graid-based array with the "Intel"
> on-disk format, having 3 volumes:
> - one "RAID1" volume, /dev/raid/r0, occupies the first 10GB of each HDD;
> - two "SINGLE" volumes, /dev/raid/r1 and /dev/raid/r2, utilize the "tails" of the HDDs (310GB each).
>
> /dev/raid/r0 (10GB) has MBR partitioning and two slices:
> - /dev/raid/r0s1 (8GB) is used for swap;
> - /dev/raid/r0s2 (2GB) is used by a non-redundant ZFS pool named "os" that contains only
>   the root file system (177M used) and the /usr file system (340M used).
>
> There is also a second pool (ZMIRROR) named "z" built directly on top of the /dev/raid/r[12] volumes;
> this pool contains all other file systems, including /var, /home, /usr/ports, /usr/local, /usr/{src|obj} etc.
>
> # zpool list
> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> os    1,98G   520M  1,48G        -         -    55%    25%  1.00x  ONLINE  -
> z      288G  79,5G   209G        -         -    34%    27%  1.00x  ONLINE  -
>
> This way I have swap outside of ZFS, with boot blocks and partitioning mirrored by means of GEOM_RAID,
> and I can use the local console to break into single user mode, unmount all file systems other than root
> and /usr, and even export the bigger ZFS pool "z". I did exactly that and saw that ARC usage
> (limited with vfs.zfs.arc_max="3G" in /boot/loader.conf) dropped from over 2500MB
> down to 44MB, but "Wired" stayed high. Now, after I imported "z" back and booted to multiuser mode,
> top(1) shows:
>
> last pid: 51242;  load averages: 0.24, 0.16, 0.13    up 81+02:38:38  22:59:18
> 104 processes: 1 running, 103 sleeping
> CPU: 0.0% user, 0.0% nice, 0.4% system, 0.2% interrupt, 99.4% idle
> Mem: 84M Active, 550M Inact, 4K Laundry, 4689M Wired, 2595M Free
> ARC: 273M Total, 86M MFU, 172M MRU, 64K Anon, 1817K Header, 12M Other
>      117M Compressed, 333M Uncompressed, 2.83:1 Ratio
> Swap: 8192M Total, 940K Used, 8191M Free
>
> I also have KDB and DDB in my custom kernel. How do I debug the leak further?
>
> I use the nvidia-driver-340-340.107 driver for a GK208 [GeForce GT 710B] video card.
> Here are outputs of "vmstat -m": http://www.grosbein.net/freebsd/leak/vmstat-m.txt
> and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt

I suspect that the "leaked" memory is simply being used to cache UMA items. Note that the values in the FREE column of vmstat -z output are quite large. The cached items are reclaimed only when the page daemon wakes up to reclaim memory; if there are no memory shortages, large amounts of memory may accumulate in UMA caches. In this case, the sum of the product of columns 2 and 5 gives a total of roughly 4GB cached.

> as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
> and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt
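For reference, a minimal sketch of that column-2-times-column-5 computation, assuming the stock vmstat -z layout (the first colon ends the zone name, so after the sed the second comma-separated field is SIZE and the fifth is FREE; the header line has no commas and contributes zero):

# vmstat -z | sed 's/:/,/' | \
    awk -F, '{cached += $2 * $5} END {printf "%.0f MB of items cached in UMA zones\n", cached / 1048576}'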
12.02.2019 23:34, Mark Johnston wrote:
> I suspect that the "leaked" memory is simply being used to cache UMA
> items. Note that the values in the FREE column of vmstat -z output are
> quite large. The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches. In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.

I forgot to note that before I got the system to single user mode, there was heavy swap usage (over 3.5GB) and heavy page-in/page-out at 10-20 megabytes per second, and the system was crawling due to the paging.
12.02.2019 23:34, Mark Johnston wrote:
> I suspect that the "leaked" memory is simply being used to cache UMA
> items. Note that the values in the FREE column of vmstat -z output are
> quite large. The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches. In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
>
>> as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
>> and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt

It seems the page daemon is broken somehow, as it did not reclaim several gigabytes of wired memory despite a long period of VM thrashing:

$ sed 's/:/,/' vmstat-z.txt | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort -k1,1 -rn | head
      1892 abd_chunk
   454.629 dnode_t
    351.35 zio_buf_512
   228.391 zio_buf_16384
   173.968 dmu_buf_impl_t
    130.25 zio_data_buf_131072
   93.6887 VNODE
   81.6978 arc_buf_hdr_t_full
   74.9368 256
   57.4102 4096
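One quick data point on whether the page daemon has been waking up at all is its wakeup counter; assuming the stock vm.stats sysctls present on 11.2, something like the following could be checked:

$ sysctl vm.stats.vm.v_pdwakeups vm.stats.vm.v_free_count vm.stats.vm.v_free_target

If the free page count never fell below the daemon's wakeup threshold, it would have had no reason to run and, per the explanation quoted above, the UMA caches would not have been trimmed.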
12.02.2019 23:34, Mark Johnston wrote:
> On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
>> Long story short: 11.2-STABLE/amd64 r335757 has leaked over 4600MB of kernel wired memory over 81 days of uptime,
>> out of 8GB total RAM.
[...]
> I suspect that the "leaked" memory is simply being used to cache UMA
> items. Note that the values in the FREE column of vmstat -z output are
> quite large. The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches. In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.

After another day with the system mostly idle, "Wired" has increased to more than 6GB out of the 8GB total.
I've tried increasing sysctl vm.v_free_min from the default of 12838 (50MB) up to 1048576 (4GB) and "Wired" dropped a bit, but it is still huge, 5060M:

last pid: 61619;  load averages: 1.05, 0.78, 0.40    up 81+22:33:09  18:53:49
119 processes: 1 running, 118 sleeping
CPU: 0.0% user, 0.0% nice, 50.0% system, 0.0% interrupt, 50.0% idle
Mem: 47M Active, 731M Inact, 4K Laundry, 5060M Wired, 2080M Free
ARC: 3049M Total, 216M MFU, 2428M MRU, 64K Anon, 80M Header, 325M Other
     2341M Compressed, 5874M Uncompressed, 2.51:1 Ratio
Swap: 8192M Total, 940K Used, 8191M Free

# sysctl vm.v_free_min
vm.v_free_min: 1048576
# sysctl vm.stats.vm.v_free_count
vm.stats.vm.v_free_count: 533232

The ARC probably cached the results of the nightly periodic jobs traversing the file system trees and hit its limit (3G). I still cannot understand where the other 2G (5G-3G) of wired memory has leaked to.

USED:

# vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$4/1024/1024, $1}' | sort -k1,1 -rn | head
    2763,2 abd_chunk
   196,547 zio_buf_16384
   183,711 dnode_t
   128,304 zio_buf_512
   96,3062 VNODE
   79,0076 arc_buf_hdr_t_full
      66,5 zio_data_buf_131072
   63,0772 UMA Slabs
   61,6484 256
   61,2564 dmu_buf_impl_t

FREE:

# vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort -k1,1 -rn | head
   245,301 dnode_t
   209,086 zio_buf_512
   110,163 dmu_buf_impl_t
   31,2598 64
    21,656 256
   10,9262 swblk
   10,6295 128
    9,0379 RADIX NODE
   8,54521 L VFS Cache
    7,4917 512
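As a sanity check on those numbers: vm.stats.vm.v_free_count is in pages, so with the usual 4 KiB page size on amd64 the 533232 free pages come to about 2083 MB, matching the 2080M Free shown by top; by the same arithmetic, the new vm.v_free_min of 1048576 pages is the 4GB mentioned above. A tiny sketch of that conversion, assuming hw.pagesize reports the page size as it does on stock FreeBSD:

$ sysctl -n vm.stats.vm.v_free_count hw.pagesize | xargs | awk '{printf "%.0f MB free\n", $1 * $2 / 1048576}'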