ZFS works very stably on FreeBSD, but my biggest problem is how to
control ZFS memory usage. I've no idea how to leash that beast.

FreeBSD has a backpressure mechanism: I can register a function that
will be called when the system runs low on memory, which I do, and I
use it for the ARC layer.
Even with this in place, under heavy load the kernel panics, because
memory requested with KM_SLEEP cannot be allocated.

Here are some statistics of memory usage when the panic occurs:

	zfs_znode_cache:  356 * 11547  =   4110732 bytes
	zil_lwb_cache:    176 * 43     =      7568 bytes
	arc_buf_t:         20 * 7060   =    141200 bytes
	arc_buf_hdr_t:    188 * 7060   =   1327280 bytes
	dnode_t:          756 * 162311 = 122707116 bytes !!
	dmu_buf_impl_t:   332 * 18649  =   6191468 bytes
	other:                            14432256 bytes (regular kmem_alloc())

There is 1GB of RAM, of which 320MB is for the kernel. 1/3 of kernel
memory is configured as the ARC's maximum.
When it panics, debugger statistics show that only around 2/3 of this
is actually allocated, so the failure is probably due to memory
fragmentation.

The most important part is probably dnode_t, as it looks like it
doesn't obey any limits. Maybe it is a bug in my port and I'm leaking
them somehow? On the other hand, when I unload the ZFS kernel module,
FreeBSD's kernel does report memory leaks - they exist, but are much,
much smaller.

There are also quite a lot of znodes, which I'd also like to be able
to free, but I'm not sure how. In Solaris a vnode's life ends in the
VOP_INACTIVE() routine, but the znode is kept around. In FreeBSD,
VOP_INACTIVE() means "put the vnode onto the free vnodes list"; when
we want to reuse that vnode for a different file system, VOP_RECLAIM()
is called, and VOP_RECLAIM() would be a good place to free the znode
as well, if possible.

Any ideas how to fix it?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                         http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
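For readers not familiar with the FreeBSD mechanism being referred to,
here is a minimal sketch of registering such a backpressure hook,
assuming FreeBSD's vm_lowmem eventhandler. The handler name and the
ARC-shrinking helper zfs_arc_reclaim_hint() are hypothetical
placeholders, not code from the actual port:

#include <sys/param.h>
#include <sys/eventhandler.h>

/* Hypothetical helper that asks the ARC layer to shrink itself. */
extern void zfs_arc_reclaim_hint(void);

static eventhandler_tag zfs_lowmem_tag;

/*
 * Called by the VM when the system is short on memory.  The second
 * argument carries lowmem flags; its exact meaning varies across
 * FreeBSD versions, so it is ignored here.
 */
static void
zfs_lowmem_handler(void *arg __unused, int howto __unused)
{
	zfs_arc_reclaim_hint();
}

static void
zfs_lowmem_register(void)
{
	zfs_lowmem_tag = EVENTHANDLER_REGISTER(vm_lowmem,
	    zfs_lowmem_handler, NULL, EVENTHANDLER_PRI_FIRST);
}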
Pawel Jakub Dawidek, et al,

First, I am describing a moving target, and maybe I am off target for
you...

The memory consumption output via the slab allocator functions is not
really correct. Normally, even when memory is freed it is cached until
a SLEEP memory allocation fails, and then it is re-allocated. Is this
your memory leak? So memory tends to show up as more and more
allocated, never decreasing past a point, IMO.

The assumption is that memory will be allocated if SLEEP is called and
then returns.

Two standard hash tables are allocated via NOSLEEP in arc.c and
?dbuf.c?, within the local buf_init()s, retrying with 1/2 values. So
my first memory-consumption suggestion is to decrease to 1/4 values if
that fails, and/or even to start with 1/2 or 1/4 the size of the
default hash tables. Minimally, generating a message reporting what
size the hash tables are might tell you something. I am assuming that
your smaller page size or disk blocks might allocate larger hash
tables than wanted.

Then go through the functions, identify all of the SLEEP allocs, and
pre-allocate a working-set number of items via one of the slab
functions.

Then go and change all of the SLEEPs to NOSLEEPs and return failures
on memory allocs. You can then retry the allocs at a later time or
return ENOMEM. This will keep the system responsive even when low
memory has occurred. Worst case scenario, a poorly managed FS is
hopefully an /opt-based FS that can be offlined instead of panicking
the whole box.

Also, I am assuming that FS memory allocs are less important than some
other internal / kernel objects.

You might find that you can't change some of the SLEEPs to NOSLEEP,
but doing most will probably delay your problem until you have
possibly leaked more memory. This isn't that hard and will remove, or
hopefully significantly delay, your panics.

On the other hand, if I remember correctly, VOP_INACTIVE ties into the
DNLC; that is a much more complicated fix and would require a code
walkthrough of your current dev code.

Mitchell Erblich
------------------

Pawel Jakub Dawidek wrote:
> ZFS works very stably on FreeBSD, but my biggest problem is how to
> control ZFS memory usage. I've no idea how to leash that beast.
> [...]
> Any ideas how to fix it?
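To make the SLEEP-to-NOSLEEP suggestion concrete, here is a minimal
sketch of the conversion being described. The zfs_foo_t type and
zfs_foo_alloc() routine are hypothetical stand-ins, not functions from
ZFS; only the kmem_zalloc() interface is the real Solaris one:

#include <sys/kmem.h>
#include <sys/errno.h>

/* Hypothetical object type, used only for illustration. */
typedef struct zfs_foo {
	int f_dummy;
} zfs_foo_t;

int
zfs_foo_alloc(zfs_foo_t **fpp)
{
	zfs_foo_t *fp;

	/* Before: fp = kmem_zalloc(sizeof (*fp), KM_SLEEP); */
	fp = kmem_zalloc(sizeof (*fp), KM_NOSLEEP);
	if (fp == NULL) {
		/*
		 * Low memory: propagate the failure instead of
		 * blocking inside the allocator.  The caller may
		 * return ENOMEM to userland or retry later.
		 */
		return (ENOMEM);
	}
	*fpp = fp;
	return (0);
}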
Sorry, Pawel Jakub Dawidek, I don't read what I type.. :-)

Mitchell Erblich
------------------
On Thu, Nov 02, 2006 at 12:16:31PM -0800, Erblichs wrote:
> Normally, even when memory is freed it is cached until
> a SLEEP memory allocation fails, and then it is re-allocated. Is
> this your memory leak? So memory tends to show up as more and more
> allocated, never decreasing past a point, IMO.

That's not the case for FreeBSD. I wasn't clear, but the statistics I
used show me both used elements and freed-but-kept-around elements,
and what I posted was the number of elements really in use.

I don't think it is a leak, because when I unload the module, the
memory is freed by ZFS; it's just that the backpressure mechanism
doesn't work too well. My function is called when there is no memory
left in the system - is there anything I can do from there to decrease
the number of active dnodes?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                         http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
> > Normally, even when memory is freed it is cached until
> > a SLEEP memory allocation fails, and then it is re-allocated. Is
> > this your memory leak? So memory tends to show up as more and more
> > allocated, never decreasing past a point, IMO.
>
> That's not the case for FreeBSD. I wasn't clear, but the statistics I
> used show me both used elements and freed-but-kept-around elements,
> and what I posted was the number of elements really in use.

This isn't the case for Solaris, either. KM_SLEEP allocations _never_
fail, which is why there's the option for KM_NOSLEEP.

I presume that this is really a description of vmem_xalloc(); however,
this isn't an accurate description either. vmem_xalloc() calls
kmem_reap() when it fails to find vmem that it can allocate. The code
which checks whether VM_NOSLEEP has been set comes after the call to
kmem_reap(). If NOSLEEP is set, vmem_xalloc() breaks out of its loop
and returns NULL. If SLEEP is set, the routine ends up sleeping on
vmp->vm_cv. Waiters here are awoken by a broadcast in
vmem_freelist_insert() and vmem_update().

There's also a thread in kmem which checks the caches once every 15
seconds. This thread performs hashtable and magazine resizing. If it
resizes a magazine, it purges the contents of the magazine layer
before instantiating a re-sized set. This frees memory back to the
system. See kmem_update().

Since this is a cache, the system tries to leave the memory available,
since allocating it from scratch is expensive. kmem_reap() is called
in about 9 places in the kernel where we might get especially tight on
free memory.

-j
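A pseudo-C paraphrase of the vmem_xalloc() control flow described
above may make this easier to follow. This is not the actual
OpenSolaris source; the segment-search and wait helpers are stand-ins
for the real internals:

#include <stddef.h>

#define VM_NOSLEEP	0x01		/* stand-in for the real flag */

typedef struct vmem vmem_t;

void *vmem_segment_find(vmem_t *, size_t);	/* stand-in for the search */
void kmem_reap(void);				/* real routine */
void vmem_wait_for_free(vmem_t *);	/* stand-in for cv_wait on vm_cv */

void *
vmem_xalloc_sketch(vmem_t *vmp, size_t size, int vmflag)
{
	void *addr;

	for (;;) {
		/* Try to find vmem that can satisfy the request. */
		if ((addr = vmem_segment_find(vmp, size)) != NULL)
			return (addr);

		/* On failure, ask the kmem caches to give memory back. */
		kmem_reap();

		/* The NOSLEEP check comes after the reap. */
		if (vmflag & VM_NOSLEEP)
			return (NULL);

		/*
		 * SLEEP callers wait on vmp->vm_cv; they are awoken by
		 * the broadcast in vmem_freelist_insert()/vmem_update().
		 */
		vmem_wait_for_free(vmp);
	}
}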
Pawel Jakub Dawidek wrote:
> Here are some statistics of memory usage when the panic occurs:
> [...]
> 	dnode_t:          756 * 162311 = 122707116 bytes !!

The physical blocks for storing dnodes are 16KB (as dictated by
DNODE_BLOCK_SHIFT), with 32 dnodes per physical block. The interesting
part is that if you read in one file (or have it cached by the DNLC),
then that whole 16KB block has to stay in memory.

So in the worst case, you have an app that somehow unluckily reads in
every 32nd file / dnode (incurring the memory baggage of 31 dnodes not
actively being used).

eric
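A back-of-the-envelope calculation against the dnode count from the
panic statistics illustrates this packing effect. The bounds below are
illustrative, not measured data:

#include <stdio.h>

int
main(void)
{
	const long long ndnodes = 162311;	/* from the panic stats */
	const long long blk_size = 16 * 1024;	/* dnode block size */
	const long long dnodes_per_blk = 32;

	/*
	 * Best case: dnodes densely packed, 32 per 16KB block,
	 * ~5073 blocks -> ~79MB of dnode blocks held in memory.
	 */
	printf("densely packed:      %lld bytes\n",
	    (ndnodes / dnodes_per_blk + 1) * blk_size);

	/*
	 * Worst case: every cached dnode sits in its own block, so
	 * each 512-byte dnode pins a full 16KB block -- a 32x
	 * amplification over the dnode data actually in use.
	 */
	printf("one dnode per block: %lld bytes\n", ndnodes * blk_size);
	return (0);
}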
Pawel Jakub Dawidek wrote:
> There are also quite a lot of znodes, which I'd also like to be able
> to free, but I'm not sure how. [...] VOP_RECLAIM() would be a good
> place to free the znode as well, if possible.

The problem is that in ZFS the vnode holds onto more memory than just
the vnode itself. It's fine to place the vnode on a "free vnodes list"
after a VOP_INACTIVE()... but you need to make sure that you have
"released" the *extra* memory associated with the vnode:

	vnode refs a znode (356 bytes + 512 bytes for the phys)
	znode refs a dnode (756 bytes + 512-16k for the phys)

So a vnode could be holding up to 17k of data in memory!

I suggest you free up the znode at VOP_INACTIVE() rather than waiting
until VOP_RECLAIM().

-Mark
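Summing the chain quoted above reproduces the "up to 17k" figure (a
quick illustrative calculation, using only the sizes given in this
message):

#include <stdio.h>

int
main(void)
{
	/* Sizes quoted above, in bytes. */
	const int znode = 356, znode_phys = 512;
	const int dnode = 756;
	const int dnode_phys_min = 512, dnode_phys_max = 16 * 1024;

	printf("per-vnode minimum: %d bytes\n",		/* 2136 */
	    znode + znode_phys + dnode + dnode_phys_min);
	printf("per-vnode maximum: %d bytes\n",		/* 18008, ~17.6k */
	    znode + znode_phys + dnode + dnode_phys_max);
	return (0);
}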
On Tue, Nov 07, 2006 at 06:06:48PM -0700, Mark Maybee wrote:
> The problem is that in ZFS the vnode holds onto more memory than just
> the vnode itself. [...]
>
> I suggest you free up the znode at VOP_INACTIVE() rather than waiting
> until VOP_RECLAIM().

How? From what I see, the znode is freed via zfs_znode_free(), called
from znode_pageout_func(). I don't think I can just call
zfs_znode_free(), as I'm quite sure there are some dependent resources
I need to release. Which function should I use for this?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                         http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Pawel Jakub Dawidek wrote:
> How? From what I see, the znode is freed via zfs_znode_free(), called
> from znode_pageout_func(). I don't think I can just call
> zfs_znode_free(), as I'm quite sure there are some dependent resources
> I need to release. Which function should I use for this?

The callback to the pageout function is triggered by the dmu_buf_rele()
on the buffer containing the phys portion of the znode. This rele also
triggers the freeing of all the other memory associated with the znode.

In the current Solaris bits, the rele happens in zfs_zinactive().

-Mark
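A sketch of the lifecycle being described here, using the function
names from this thread. The struct layout and the way the eviction
callback is registered are assumptions about the 2006 code base, not
verified signatures:

typedef struct dmu_buf dmu_buf_t;

typedef struct znode {
	dmu_buf_t *z_dbuf;	/* dbuf holding the znode phys data */
} znode_t;

void dmu_buf_rele(dmu_buf_t *db, void *tag);
void zfs_znode_free(znode_t *zp);

void
zfs_zinactive_sketch(znode_t *zp)
{
	/*
	 * Drop the znode's hold on the dbuf backing its phys data.
	 * This does NOT free the znode right away: the buffer stays
	 * cached in the ARC, and the pageout callback fires only when
	 * the ARC later evicts it.
	 */
	dmu_buf_rele(zp->z_dbuf, NULL);
}

/*
 * Eviction callback, assumed to be registered when the znode was
 * created; runs when the dbuf is finally evicted, and only then
 * frees the znode and its dependent resources.
 */
void
znode_pageout_sketch(dmu_buf_t *dbuf, void *arg)
{
	znode_t *zp = arg;

	zfs_znode_free(zp);
}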
On Fri, Nov 10, 2006 at 06:36:07AM -0700, Mark Maybee wrote:
> The callback to the pageout function is triggered by the dmu_buf_rele()
> on the buffer containing the phys portion of the znode. This rele also
> triggers the freeing of all the other memory associated with the znode.
>
> In the current Solaris bits, the rele happens in zfs_zinactive().

Ok, I also call dmu_buf_rele() from zfs_zinactive(), but I don't see
the pageout being called right after zfs_zinactive().

In the dmu_buf_rele() function I see such data (this is on a simple
'touch /tank/foo'):

	holds = 1
	db->db_dirtycnt = 1
	db->db_level = 0
	db->db_d.db_immediate_evict = 0

Is it the same in Solaris on 'touch /tank/foo' for dmu_buf_rele()
called from zfs_zinactive()?

PS. I'm sorry for wasting your time; I could check it by myself with
dtrace, but I decided to reinstall my Solaris installation and now it
seems I have some hardware problems, so I need to find another box to
install Solaris on, which will allow me to do such comparisons by
myself.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                         http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Pawel Jakub Dawidek wrote:
> Ok, I also call dmu_buf_rele() from zfs_zinactive(), but I don't see
> the pageout being called right after zfs_zinactive().
>
> In the dmu_buf_rele() function I see such data (this is on a simple
> 'touch /tank/foo'):
>
> 	holds = 1
> 	db->db_dirtycnt = 1
> 	db->db_level = 0
> 	db->db_d.db_immediate_evict = 0
>
> Is it the same in Solaris on 'touch /tank/foo' for dmu_buf_rele()
> called from zfs_zinactive()?

This looks fine. Note that db_immediate_evict == 0 means that you will
probably *not* see a callback to the pageout function immediately.
This is the general case. We hold onto the znode (and related memory)
until the associated disk blocks are evicted from the cache (arc). The
cache is likely to hold onto that data until either:

	- we encounter memory shortage, and so reduce the cache size
	- we read new data into the cache, and evict this data to
	  make space for it.

-Mark
On Fri, Nov 10, 2006 at 10:41:07AM -0700, Mark Maybee wrote:
> This looks fine. Note that db_immediate_evict == 0 means that you will
> probably *not* see a callback to the pageout function immediately.
> This is the general case. We hold onto the znode (and related memory)
> until the associated disk blocks are evicted from the cache (arc). The
> cache is likely to hold onto that data until either:
> 	- we encounter memory shortage, and so reduce the cache size
> 	- we read new data into the cache, and evict this data to
> 	  make space for it.

It seems we have come full circle here :) I use a FreeBSD-specific
mechanism to detect memory shortage situations, and from it I call the
ZFS code which should reduce the cache size, etc. Most of the time it
frees some memory, but sometimes it frees nothing, and statistics show
that there are still many znodes/dnodes in memory. I'll try to
investigate further. Thanks.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                         http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!