On Wed, Apr 07, 2021 at 11:22:41PM +0300, Andriy Gapon wrote:
> On 07/04/2021 22:54, Mark Johnston wrote:
> > On Wed, Apr 07, 2021 at 10:42:57PM +0300, Andriy Gapon wrote:
> >>
> >> I regularly see that the top's memory line does not add up (and by a lot).
> >> That can be seen with vm.stats as well.
> >>
> >> For example:
> >> $ sysctl vm.stats | fgrep count
> >> vm.stats.vm.v_cache_count: 0
> >> vm.stats.vm.v_user_wire_count: 3231
> >> vm.stats.vm.v_laundry_count: 262058
> >> vm.stats.vm.v_inactive_count: 3054178
> >> vm.stats.vm.v_active_count: 621131
> >> vm.stats.vm.v_wire_count: 1871176
> >> vm.stats.vm.v_free_count: 187777
> >> vm.stats.vm.v_page_count: 8134982
> >>
> >> $ bc
> >> 187777 + 1871176 + 621131 + 3054178 + 262058
> >> 5996320
> >> 8134982 - 5996320
> >> 2138662
> >>
> >> As you can see, it's not a small number of pages either.
> >> Approximately 2 million pages, 8 gigabytes or 25% of the whole memory on
> >> this system.
> >>
> >> This is 47c00a9835926e96, 13.0-STABLE amd64.
> >> I do not think that I saw anything like that when I used (much) older FreeBSD.
> >
> > One relevant change is that vm_page_wire() no longer removes pages from
> > LRU queues, so the count of pages in the queues can include wired pages.
> > If the page daemon runs, it will dequeue any wired pages that are
> > encountered.
>
> Maybe I misunderstand how that works, but I would expect that the sum of all
> counters could be greater than v_page_count at times.  But in my case it's less.

I misread, sorry.  You're right, what I described would cause double
counting.

I don't know what might be causing it then.  It could be a page leak.
The kernel allocates wired pages without adjusting the v_wire_count
counter in some cases, but the ones I know about happen at boot and
should not account for such a large disparity.  I do not see it on a few
systems that I have access to.

> > This was done to reduce queue lock contention: operations like
> > sendfile() which transiently wire pages would otherwise trigger two
> > queue operations per page.  Now that queue operations are batched this
> > might not be as important.
> >
> > We could perhaps add a new flavour of vm_page_wire() which is not lazy
> > and would be suited for, e.g., the buffer cache.  What is the primary
> > source of wired pages in this case?
>
> It should be ZFS, I guess.
>
> --
> Andriy Gapon
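The check in the quoted message boils down to simple arithmetic; as a
sketch, with the counter values hard-coded from the sysctl output above
(on a live system they would come from sysctl):

```shell
# Sum the per-queue page counters and compare with v_page_count.
# Values copied from the sysctl output quoted in this message.
free=187777; wire=1871176; active=621131; inactive=3054178; laundry=262058
page_count=8134982
accounted=$((free + wire + active + inactive + laundry))
missing=$((page_count - accounted))
echo "accounted=$accounted missing=$missing"
```

With these inputs, roughly 2.1 million pages are unaccounted for, which
matches the bc session above.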
On 8/04/2021 6:56 am, Mark Johnston wrote:
> On Wed, Apr 07, 2021 at 11:22:41PM +0300, Andriy Gapon wrote:
>> On 07/04/2021 22:54, Mark Johnston wrote:
>>> On Wed, Apr 07, 2021 at 10:42:57PM +0300, Andriy Gapon wrote:
>>>>
>>>> I regularly see that the top's memory line does not add up (and by a lot).
>>>> That can be seen with vm.stats as well.
>>>>
>>>> For example:
>>>> $ sysctl vm.stats | fgrep count
>>>> vm.stats.vm.v_cache_count: 0
>>>> vm.stats.vm.v_user_wire_count: 3231
>>>> vm.stats.vm.v_laundry_count: 262058
>>>> vm.stats.vm.v_inactive_count: 3054178
>>>> vm.stats.vm.v_active_count: 621131
>>>> vm.stats.vm.v_wire_count: 1871176
>>>> vm.stats.vm.v_free_count: 187777
>>>> vm.stats.vm.v_page_count: 8134982
>>>>
>>>> $ bc
>>>> 187777 + 1871176 + 621131 + 3054178 + 262058
>>>> 5996320
>>>> 8134982 - 5996320
>>>> 2138662
>>>>
>>>> As you can see, it's not a small number of pages either.
>>>> Approximately 2 million pages, 8 gigabytes or 25% of the whole memory on
>>>> this system.
>>>>
>>>> This is 47c00a9835926e96, 13.0-STABLE amd64.
>>>> I do not think that I saw anything like that when I used (much) older FreeBSD.

For reference, I think that a smaller error has been around for a while.

On a UFS-only system, FreeBSD 12.2-STABLE #0 r369523M: Sat Mar 27 00:27:03
AEDT 2021, I have:

# sysctl vm.stats | fgrep count; top -b
vm.stats.vm.v_cache_count: 0
vm.stats.vm.v_user_wire_count: 0
vm.stats.vm.v_laundry_count: 0
vm.stats.vm.v_inactive_count: 423959
vm.stats.vm.v_active_count: 82623
vm.stats.vm.v_wire_count: 256273
vm.stats.vm.v_free_count: 5457329
vm.stats.vm.v_page_count: 6112118
last pid: 83881;  load averages: 0.07, 0.09, 0.06  up 0+07:31:44  12:59:37
90 processes: 1 running, 89 sleeping
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.9% idle
Mem: 323M Active, 1656M Inact, 1001M Wired, 585M Buf, 21G Free
Swap: 24G Total, 24G Free

# bc
423959 + 82623 + 256273 + 5457329
6220184
6112118 - 6220184
-108066
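The manual bc step can be automated; below is a sketch of an awk filter
that sums the queue counters and reports the difference (sum_pages is a
made-up helper name, and v_user_wire_count is deliberately left out of the
sum, as it is in the bc sessions above):

```shell
# Pipe `sysctl vm.stats | fgrep count` into this helper to see how many
# pages are unaccounted for (negative means the sum exceeds v_page_count).
sum_pages() {
  awk -F': ' '
    $1 ~ /\.v_(free|wire|active|inactive|laundry)_count$/ { sum += $2 }
    $1 ~ /\.v_page_count$/                                { total = $2 }
    END { printf "accounted %d of %d, missing %d\n", sum, total, total - sum }'
}
```

Usage would be, e.g., `sysctl vm.stats | fgrep count | sum_pages`.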
> -----Original Message-----
> From: owner-freebsd-current at freebsd.org <owner-freebsd-current at freebsd.org>
> On Behalf Of Mark Johnston
> Sent: Wednesday, April 7, 2021 10:57 PM
> To: Andriy Gapon <avg at freebsd.org>
> Cc: freebsd-stable List <stable at freebsd.org>; FreeBSD Current <current at freebsd.org>
> Subject: Re: stable/13, vm page counts do not add up
>
> On Wed, Apr 07, 2021 at 11:22:41PM +0300, Andriy Gapon wrote:
> > On 07/04/2021 22:54, Mark Johnston wrote:
> > > On Wed, Apr 07, 2021 at 10:42:57PM +0300, Andriy Gapon wrote:
> > >>
> > >> I regularly see that the top's memory line does not add up (and by a lot).
> > >> That can be seen with vm.stats as well.
> > >>
> > >> For example:
> > >> $ sysctl vm.stats | fgrep count
> > >> vm.stats.vm.v_cache_count: 0
> > >> vm.stats.vm.v_user_wire_count: 3231
> > >> vm.stats.vm.v_laundry_count: 262058
> > >> vm.stats.vm.v_inactive_count: 3054178
> > >> vm.stats.vm.v_active_count: 621131
> > >> vm.stats.vm.v_wire_count: 1871176
> > >> vm.stats.vm.v_free_count: 187777
> > >> vm.stats.vm.v_page_count: 8134982
> > >>
> > >> $ bc
> > >> 187777 + 1871176 + 621131 + 3054178 + 262058
> > >> 5996320
> > >> 8134982 - 5996320
> > >> 2138662
> > >>
> > >> As you can see, it's not a small number of pages either.
> > >> Approximately 2 million pages, 8 gigabytes or 25% of the whole memory
> > >> on this system.
> > >>
> > >> This is 47c00a9835926e96, 13.0-STABLE amd64.
> > >> I do not think that I saw anything like that when I used (much) older FreeBSD.
> > >
> > > One relevant change is that vm_page_wire() no longer removes pages from
> > > LRU queues, so the count of pages in the queues can include wired pages.
> > > If the page daemon runs, it will dequeue any wired pages that are
> > > encountered.
> >
> > Maybe I misunderstand how that works, but I would expect that the sum of all
> > counters could be greater than v_page_count at times.  But in my case it's less.
>
> I misread, sorry.  You're right, what I described would cause double
> counting.
>
> I don't know what might be causing it then.  It could be a page leak.
> The kernel allocates wired pages without adjusting the v_wire_count
> counter in some cases, but the ones I know about happen at boot and
> should not account for such a large disparity.  I do not see it on a few
> systems that I have access to.
>
> > > This was done to reduce queue lock contention: operations like
> > > sendfile() which transiently wire pages would otherwise trigger two
> > > queue operations per page.  Now that queue operations are batched this
> > > might not be as important.
> > >
> > > We could perhaps add a new flavour of vm_page_wire() which is not lazy
> > > and would be suited for, e.g., the buffer cache.  What is the primary
> > > source of wired pages in this case?
> >
> > It should be ZFS, I guess.
> >
> > --
> > Andriy Gapon
>
> _______________________________________________
> freebsd-current at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"

I see kernel memory disappearing when enabling ktls:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253281

Last test done with 13.0-RC1.  I'm a bit at a loss how to debug this further.

Regards,

Juergen Weiss
Juergen Weiss | weiss at uni-mainz.de |
On 07/04/2021 23:56, Mark Johnston wrote:
> I don't know what might be causing it then.  It could be a page leak.
> The kernel allocates wired pages without adjusting the v_wire_count
> counter in some cases, but the ones I know about happen at boot and
> should not account for such a large disparity.  I do not see it on a few
> systems that I have access to.

Mark or anyone, do you have a suggestion on how to approach hunting for the
potential page leak?  It's been a long while since I worked with that code
and it has changed a lot.

Here is some additional info.  I had approximately 2 million unaccounted
pages.  I rebooted the system and that number became 20 thousand, which is
more reasonable and could be explained by those boot-time allocations that
you mentioned.  After 30 hours of uptime the number became 60 thousand.
I monitored the number and so far I could not correlate it with any
activity.

P.S.  I have not been running any virtual machines.  I do use the nvidia
graphics driver.

--
Andriy Gapon
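One low-tech way to watch a leak like this is to snapshot the
unaccounted-page count periodically and look at the growth rate; a sketch
follows, where missing_pages is a made-up helper (in a live run its
arguments would come from sysctl) and the 20-thousand/60-thousand snapshots
are the numbers from the message above:

```shell
# missing_pages: pages not accounted for by any queue or the free list.
# args: v_page_count v_free v_wire v_active v_inactive v_laundry
missing_pages() {
  echo $(( $1 - ($2 + $3 + $4 + $5 + $6) ))
}

# Growth between the post-boot snapshot and the one 30 hours later:
after_boot=20000
after_30h=60000
echo "growth: $(( (after_30h - after_boot) / 30 )) pages/hour"
```

At roughly 1333 pages/hour (around 5 MB/hour with 4 KB pages), the drift is
slow, so correlating it with activity would likely need long sampling
intervals, e.g. a cron job logging the value alongside a timestamp.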