Karl Denninger
2014-Mar-16 15:56 UTC
OK, so the buffer allocation problem with ZFS is fixed, but now I got this.... (VM management issues)
From this morning.....

3 users    Load  2.08  2.40  2.33    Mar 16 10:41

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act 2341416   20052  9279140    56272  436064  count    44    32
All 3144676   25544 10111364   255624          pages   112    43
Proc:                                                              Interrupts
r p d s w Csw Trp Sys Int Sof Flt                       43 ioflt  29689 total
3 40 200 45 80k 40k 186k 17k 438 14k                  1348 cow       11 uart0 4
                                                      1890 zfod    2085 uhci0 16
8.0%Sys  2.3%Intr  5.6%User  0.0%Nice  84.1%Idle        48 ozfod        pcm0 17
|    |    |    |    |    |    |    |    |    |           2%ozfod        ehci0 uhci
====+>>>                                                   daefr        uhci1 21
                                        55 dtbuf      2123 prcfr    520 uhci3 ehci
Namei     Name-cache   Dir-cache    485946 desvn      8196 totfr   1045 arcmsr0 30
   Calls    hits   %    hits   %    164802 numvn        22 react   1085 cpu0:timer
   14254   14202 100                121388 frevn           pdwak     77 mps0 256
                                                   1663632 pdpgs   7020 em0:rx 0
Disks   da0   da1   da2   da3   da4   da5   da6         49 intrn   6864 em0:tx 0
KB/t   8.20 63.89 11.91 29.66 34.00 17.36  0.00    4829120 wire         em0:link
tps      77  1497     9    19    15     9     0    2117724 act       86 em1:rx 0
MB/s   0.61 93.43  0.11  0.54  0.51  0.15  0.00   17078072 inact     87 em1:tx 0
%busy    26    26     1     3     2     1     0     431968 cache        em1:link
                                                      7832 free         ahci0:ch0
                                                   1694896 buf          ahci0:ch2
                                                                    655 cpu1:timer
                                                                    898 cpu11:time
                                                                    627 cpu2:timer
                                                                    784 cpu10:time
                                                                    938 cpu5:timer
                                                                   1054 cpu13:time
                                                                    636 cpu4:timer
                                                                    476 cpu12:time
                                                                    579 cpu3:timer
                                                                    702 cpu8:timer
                                                                    646 cpu6:timer
                                                                    573 cpu9:timer
                                                                    670 cpu7:timer
                                                                   1056 cpu14:time
                                                                    515 cpu15:time

This is a rather busy (read: extreme demands on the system) time during which I have managed to provoke some really awful behavior, including filesystem stalls. The system in question has both ufs and zfs filesystems (but won't for much longer) and runs both an SMB service (Samba) and Postgres.

Of note is that nasty "inact" page count. It has driven the adaptive ARC code patch (which is on this box) to trim the ARC cache down to the minimum, where it remains pinned.
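As a rough cross-check on the figures in the dump, here is a minimal sketch that adds up the five page-queue sizes and reports how much of the total each one holds. The numbers are copied from the snapshot; the function name `queue_shares` is made up for illustration, and treating wire + act + inact + cache + free as "all of RAM" is an assumption (the "buf" figure is left out on the guess that it overlaps the other queues).

```python
# Page-queue sizes in KB, copied from the systat -vm snapshot in this post.
# Assumption: these five queues together approximate physical RAM; "buf" is
# deliberately left out because it is assumed to overlap the other queues.
PAGE_QUEUES_KB = {
    "wire": 4829120,
    "act": 2117724,
    "inact": 17078072,
    "cache": 431968,
    "free": 7832,
}

KB_PER_GIB = 1024 * 1024


def queue_shares(queues_kb):
    """Return {queue: (size in GiB, fraction of the summed queues)}.

    Hypothetical helper for illustration only; not part of any FreeBSD tool.
    """
    total_kb = sum(queues_kb.values())
    return {
        name: (kb / KB_PER_GIB, kb / total_kb)
        for name, kb in queues_kb.items()
    }


if __name__ == "__main__":
    for name, (gib, frac) in queue_shares(PAGE_QUEUES_KB).items():
        print(f"{name:>6}: {gib:6.2f} GiB  {frac:6.1%}")
```

On these numbers the "inact" queue works out to a bit over 16 GiB, roughly 70 percent of the summed queues, while "free" plus "cache" together are under half a GiB.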
My reading of the "inact" page count is that pages shouldn't stay in that state on an indefinite basis -- they should either be reactivated (if they're re-used) or invalidated and moved to the "cache" bucket where the VM code can free them.

Buuuuut.... neither is happening over the space of several hours and a look at the RSS of the working processes doesn't show anything interesting -- or different than normal activity in that regard.

17 _*gigabytes*_ of inactive pages (out of 24 GB of RAM in total) and they're not being reclaimed?

Time for me to dig into the vm code?

FreeBSD 10.0-STABLE #13 r263037M: Fri Mar 14 14:58:11 CDT 2014
karl at NewFS.denninger.net:/usr/obj/usr/src/sys/KSD-SMP

--
-- Karl
karl at denninger.net