My customer is running Java on a ZFS file system. His platform is Solaris 10
x86 on a Sun Fire X4200. When he enabled ZFS, his available memory dropped
from 18 GB to 2 GB rather quickly. I had him run
# ps -e -o pid,vsz,comm | sort -n +1
and it came back as follows. The culprit application you see is java:
507 89464 /usr/bin/postmaster
515 89944 /usr/bin/postmaster
517 91136 /usr/bin/postmaster
508 96444 /usr/bin/postmaster
516 98088 /usr/bin/postmaster
503 3449580 /usr/jre1.5.0_07/bin/amd64/java
512 3732468 /usr/jre1.5.0_07/bin/amd64/java
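
As a quick cross-check of the listing above, the VSZ column can be totalled per command. This is only a sketch: the function name sum_vsz_by_comm is made up for illustration, and VSZ is virtual size in kilobytes, so it is an upper bound on what the processes actually hold in RAM.

```shell
#!/bin/sh
# Sum the VSZ column (kilobytes) per command name.
# Reads `ps -e -o pid,vsz,comm` output on stdin.
# sum_vsz_by_comm is a hypothetical helper name, not a system utility.
sum_vsz_by_comm() {
    awk '
        # Only count rows whose second field is numeric (skips the header).
        $2 ~ /^[0-9]+$/ { kb[$3] += $2; total += $2 }
        END {
            for (c in kb) printf "%d %s\n", kb[c], c
            printf "%d TOTAL\n", total
        }
    ' | sort -n
}

# Usage:
#   ps -e -o pid,vsz,comm | sum_vsz_by_comm
```

For the seven processes shown, this adds up to about 7.3 GB of virtual size, well short of the 16 GB that went missing, which is consistent with the memory being held somewhere other than these processes.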
Here is what the customer responded:
Well, Java is a memory hog, but it's not the leak -- it's the
application. Even after it fails due to lack of memory, the memory is
not reclaimed and we can no longer restart it.
Is there a bug in ZFS? I did not find one in SunSolve, but then again I might
have been searching for the wrong thing.
We have done some sleuth work and are starting to think our problem
might be ZFS -- the new file system Sun supports. The documentation for
ZFS states that it tries to cache as much as it can, and it uses kernel
memory for the cache. That would explain memory gradually disappearing.
ZFS can give memory back, but it does not do so quickly.
So, is there any way to check that? If it turns out to be the problem...
1) Is there a way to limit the size of ZFS's caches?
If not, then
2) Is there a way to clear ZFS's cache?
If not, then
3) Is there a way to force the Java VM to take a certain amount of
memory on startup and never give it back? -Xms does not appear to work.
Thanks,
Jill
===========================================================================
S U N M I C R O S Y S T E M S I N C.
Jill Manfield - TSE-Alternate Platform Team
email: jill.manfield at sun.com
phone: (800)USA-4SUN (Reference your case #)
address: 1617 Southwood Drive Nashua,NH 03063
mailstop: NSH-01- B287
Mgr: Dave O'Connor: dave.oconnor at sun.com
Submit, View and Update tickets at http://www.sun.com/service/online
=============================================================================
Jill Manfield writes:
> My customer is running Java on a ZFS file system. His platform is Solaris 10
> x86 SF X4200. When he enabled ZFS his memory of 18 gigs drops to 2 gigs
> rather quickly. I had him do a # ps -e -o pid,vsz,comm | sort -n +1 and it
> came back:
>
> The culprit application you see is java:
> 507 89464 /usr/bin/postmaster
> 515 89944 /usr/bin/postmaster
> 517 91136 /usr/bin/postmaster
> 508 96444 /usr/bin/postmaster
> 516 98088 /usr/bin/postmaster
> 503 3449580 /usr/jre1.5.0_07/bin/amd64/java
> 512 3732468 /usr/jre1.5.0_07/bin/amd64/java
>
> Here is what the customer responded:
> Well, Java is a memory hog, but it's not the leak -- it's the
> application. Even after it fails due to lack of memory, the memory is
> not reclaimed and we can no longer restart it.
> Is there a bug in ZFS? I did not find one in SunSolve
> but then again I might have been searching for the wrong thing.

Assuming you run S10U2, you may be hit by this one:

4034947 anon_swap_adjust(), anon_resvmem() should call kmem_reap() if
availrmem is low.

Fixed in snv_42. It would show up as a bad return code from either of the
above functions when java fails to start up.

-r

Jill Manfield wrote:
> My customer is running Java on a ZFS file system. His platform is Solaris 10
> x86 SF X4200. When he enabled ZFS his memory of 18 gigs drops to 2 gigs
> rather quickly. I had him do a # ps -e -o pid,vsz,comm | sort -n +1 and it
> came back:
>
> The culprit application you see is java:
> 507 89464 /usr/bin/postmaster
> 515 89944 /usr/bin/postmaster
> 517 91136 /usr/bin/postmaster
> 508 96444 /usr/bin/postmaster
> 516 98088 /usr/bin/postmaster
> 503 3449580 /usr/jre1.5.0_07/bin/amd64/java
> 512 3732468 /usr/jre1.5.0_07/bin/amd64/java
>
> Here is what the customer responded:
> Well, Java is a memory hog, but it's not the leak -- it's the
> application. Even after it fails due to lack of memory, the memory is
> not reclaimed and we can no longer restart it.
> Is there a bug in ZFS? I did not find one in SunSolve but then again I
> might have been searching for the wrong thing.
>
> We have done some sleuth work and are starting to think our problem
> might be ZFS -- the new file system Sun supports. The documentation for
> ZFS states that it tries to cache as much as it can, and it uses kernel
> memory for the cache. That would explain memory gradually disappearing.
> ZFS can give memory back, but it does not do so quickly.

Yup, this is likely your problem. ZFS takes a little time to "give back"
memory, and the app may fail with ENOMEM before this happens.

> So, is there any way to check that? If it turns out to be the problem...
>
> 1) Is there a way to limit the size of ZFS's caches?

Well... sort of. You can set the size of arc.c_max and this will put an
upper bound on the cache. But this is a bit of a hack.

> If not, then
>
> 2) Is there a way to clear ZFS's cache?

Try unmounting/mounting the file system; if that does not work, try
export/import of the pool.

-Mark
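
The two suggestions above (capping the cache, and clearing it by remounting or re-importing) might be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure. The helper names arc_max_line, remount_dataset, and reimport_pool are invented for illustration. The /etc/system line assumes a release that provides the zfs_arc_max tunable, which may not be the case on the release discussed in this thread, and the setting only takes effect after a reboot. The dataset and pool names in the usage comments are placeholders.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names are system utilities.

# Print an /etc/system line that caps the ARC at the given number of GB.
# ASSUMPTION: the running release supports the zfs_arc_max tunable.
arc_max_line() {
    gb=$1
    bytes=$((gb * 1024 * 1024 * 1024))
    printf 'set zfs:zfs_arc_max = 0x%x\n' "$bytes"
}

# Unmount and remount one dataset so its cached data can be dropped.
# The dataset must not be in use, or the unmount will fail.
remount_dataset() {
    zfs unmount "$1" && zfs mount "$1"
}

# Heavier fallback: export and re-import the whole pool.
# Every file system in the pool goes offline while this runs.
reimport_pool() {
    zpool export "$1" && zpool import "$1"
}

# Usage (placeholder names):
#   arc_max_line 4            # review the line, then add it to /etc/system
#   remount_dataset tank/app
#   reimport_pool tank
```

Neither remounting nor re-importing reports whether memory actually came back, so free memory should be checked before and after each step before moving on to the heavier one.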