My customer is running Java on a ZFS file system. His platform is Solaris 10 x86 on a Sun Fire X4200. When he enabled ZFS, his free memory dropped from 18 GB to 2 GB rather quickly. I had him run

# ps -e -o pid,vsz,comm | sort -n +1

and it came back with the following. The culprit application you see is java:

  507   89464 /usr/bin/postmaster
  515   89944 /usr/bin/postmaster
  517   91136 /usr/bin/postmaster
  508   96444 /usr/bin/postmaster
  516   98088 /usr/bin/postmaster
  503 3449580 /usr/jre1.5.0_07/bin/amd64/java
  512 3732468 /usr/jre1.5.0_07/bin/amd64/java

Here is what the customer responded:

Well, Java is a memory hog, but it's not the leak -- it's the application. Even after it fails due to lack of memory, the memory is not reclaimed and we can no longer restart it.

Is there a bug in ZFS? I did not find one in SunSolve, but then again I might have been searching for the wrong thing.

We have done some sleuth work and are starting to think our problem might be ZFS -- the new file system Sun supports. The documentation for ZFS states that it tries to cache as much as it can, and it uses kernel memory for the cache. That would explain memory gradually disappearing. ZFS can give memory back, but it does not do so quickly.

So, is there any way to check that? If it turns out to be the problem...

1) Is there a way to limit the size of ZFS's caches?

If not, then

2) Is there a way to clear ZFS's cache?

If not, then

3) Is there a way to force the Java VM to take a certain amount of memory on startup and never give it back? -Xms does not appear to work.

Thanks,
Jill

===========================================================================
S U N  M I C R O S Y S T E M S  I N C.
Jill Manfield - TSE-Alternate Platform Team
email: jill.manfield at sun.com
phone: (800)USA-4SUN (Reference your case #)
address: 1617 Southwood Drive Nashua, NH 03063
mailstop: NSH-01-B287
Mgr: Dave O'Connor: dave.oconnor at sun.com
Submit, View and Update tickets at http://www.sun.com/service/online
===========================================================================
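For reference, `sort -n +1` in the ps pipeline above is the obsolete spelling of `sort -n -k 2`. The sketch below sorts and totals the VSZ column per command, using the rows quoted in this post instead of a live system; the function names are made up for illustration. Keep in mind that VSZ only counts each process's address space, so memory held by the ZFS cache in the kernel would not show up in this listing.

```shell
#!/bin/sh
# Sort "pid vsz comm" rows by the vsz column (kilobytes), largest last.
# `-k 2,2` is the POSIX form of the obsolete `+1` key syntax.
sort_by_vsz() {
    sort -n -k 2,2
}

# Total the vsz column per command name and print the sums in megabytes.
# (Hypothetical helper, not part of any Solaris tool.)
vsz_by_command() {
    awk '{ kb[$3] += $2 }
         END { for (c in kb) printf "%s %d\n", c, kb[c] / 1024 }' |
        sort -n -k 2,2
}

# Sample rows copied from the ps output quoted in this thread.
sample='507 89464 /usr/bin/postmaster
515 89944 /usr/bin/postmaster
517 91136 /usr/bin/postmaster
508 96444 /usr/bin/postmaster
516 98088 /usr/bin/postmaster
503 3449580 /usr/jre1.5.0_07/bin/amd64/java
512 3732468 /usr/jre1.5.0_07/bin/amd64/java'

printf '%s\n' "$sample" | vsz_by_command
```

On a live system the same helper could be fed with `ps -e -o pid,vsz,comm | sed 1d | vsz_by_command`, where `sed 1d` drops the header row.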
Jill Manfield writes:
> My customer is running Java on a ZFS file system. His platform is Solaris 10 x86 on a Sun Fire X4200. When he enabled ZFS, his free memory dropped from 18 GB to 2 GB rather quickly. I had him run
> # ps -e -o pid,vsz,comm | sort -n +1
> and it came back:
>
> The culprit application you see is java:
>   507   89464 /usr/bin/postmaster
>   515   89944 /usr/bin/postmaster
>   517   91136 /usr/bin/postmaster
>   508   96444 /usr/bin/postmaster
>   516   98088 /usr/bin/postmaster
>   503 3449580 /usr/jre1.5.0_07/bin/amd64/java
>   512 3732468 /usr/jre1.5.0_07/bin/amd64/java
>
> Here is what the customer responded:
> Well, Java is a memory hog, but it's not the leak -- it's the
> application. Even after it fails due to lack of memory, the memory is
> not reclaimed and we can no longer restart it.
>
> Is there a bug in ZFS? I did not find one in SunSolve,
> but then again I might have been searching for the wrong thing.

Assuming you run S10U2, you may be hit by this one:

4034947 anon_swap_adjust(), anon_resvmem() should call kmem_reap() if availrmem is low.

Fixed in snv_42. It would show up as a bad return code from either of the above functions when java fails to start up.

-r
Jill Manfield wrote:
> My customer is running Java on a ZFS file system. His platform is Solaris 10 x86 on a Sun Fire X4200. When he enabled ZFS, his free memory dropped from 18 GB to 2 GB rather quickly. I had him run
> # ps -e -o pid,vsz,comm | sort -n +1
> and it came back:
>
> The culprit application you see is java:
>   507   89464 /usr/bin/postmaster
>   515   89944 /usr/bin/postmaster
>   517   91136 /usr/bin/postmaster
>   508   96444 /usr/bin/postmaster
>   516   98088 /usr/bin/postmaster
>   503 3449580 /usr/jre1.5.0_07/bin/amd64/java
>   512 3732468 /usr/jre1.5.0_07/bin/amd64/java
>
> Here is what the customer responded:
> Well, Java is a memory hog, but it's not the leak -- it's the
> application. Even after it fails due to lack of memory, the memory is
> not reclaimed and we can no longer restart it.
>
> Is there a bug in ZFS? I did not find one in SunSolve, but then again I might have been searching for the wrong thing.
>
> We have done some sleuth work and are starting to think our problem
> might be ZFS -- the new file system Sun supports. The documentation for
> ZFS states that it tries to cache as much as it can, and it uses kernel
> memory for the cache. That would explain memory gradually disappearing.
> ZFS can give memory back, but it does not do so quickly.
>

Yup, this is likely your problem. ZFS takes a little time to "give back" memory, and the app may fail with ENOMEM before this happens.

> So, is there any way to check that? If it turns out to be the problem...
>
> 1) Is there a way to limit the size of ZFS's caches?
>

Well... sort of. You can set the size of arc.c_max, and this will put an upper bound on the cache. But this is a bit of a hack.

> If not, then
>
> 2) Is there a way to clear ZFS's cache?
>

Try unmounting and remounting the file system. If that does not work, try an export/import of the pool.

-Mark
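As a rough illustration of the two suggestions above, the sketch below only prints what one might configure or run; it does not touch a live system. It assumes a release that honors the `zfs_arc_max` tunable in /etc/system, which may not be true of the S10U2 system discussed here (there, arc.c_max would have to be changed directly in the running kernel, the hack Mark mentions). The function names and the `tank/data` dataset are hypothetical.

```shell
#!/bin/sh
# Print an /etc/system line capping the ARC at the given number of gigabytes.
# ASSUMPTION: the running release reads the zfs_arc_max tunable at boot.
arc_max_line() {
    gib=$1
    printf 'set zfs:zfs_arc_max = 0x%x\n' $((gib * 1024 * 1024 * 1024))
}

# Print (do not run) the commands for the two cache-clearing suggestions:
# remount the file system first, then export/import the whole pool.
# The dataset name passed in is a hypothetical placeholder.
remount_plan() {
    dataset=$1
    pool=${dataset%%/*}    # the pool is the first path component
    printf 'zfs unmount %s\nzfs mount %s\n' "$dataset" "$dataset"
    printf 'zpool export %s\nzpool import %s\n' "$pool" "$pool"
}

arc_max_line 4
remount_plan tank/data
```

Printing the commands instead of running them keeps the sketch safe to try anywhere; both the unmount and the export fail if the application still has files open, so the application would have to be stopped first.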