We're trying to locate a problem on one of our web servers where suddenly everything grinds to a virtual halt (well, not really) due to something forcing a *lot* of paging activity. We suspect that it might be some process that suddenly allocates a lot of memory and accesses it quickly, forcing the rest of the (big) processes out to swap. *Or* something filesystem related (ZFS perhaps?).

One thing that makes it problematic to trace is that when things slow down or halt we can't log in to the machine ("fork: resource temporarily unavailable").

Using a DTrace script we've seen that during the periods when things are really slow some processes start paging (and have really long paging response times). (script: http://www.solarisinternals.com/si/dtrace/whospaging.d)

An added complication is that during the times when things fail, dtrace also more or less fails to run...

# priocntl -e -c RT dtrace -s ./whospaging.d > paging-RT.log
dtrace: processing aborted: Abort due to systemic unresponsiveness

It worked better with:

# priocntl -e -c RT dtrace -w -s ./whospaging.d > paging-RT-2.log

but then it wouldn't print anything at all when the interesting things were happening...

(Machine: Sun Ultra 60, 2x 360 MHz CPUs, 1500 MB RAM)

Any suggestions on what to check next?
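P.S. For reference, whospaging.d is roughly along these lines - this is only a minimal sketch using the vminfo provider, not the actual script (which also reports the paging response times mentioned above):

#!/usr/sbin/dtrace -s
/* Minimal sketch only: count anonymous page-ins per process name
 * and print the totals every 10 seconds. */

vminfo:::anonpgin
{
        @pgins[execname] = count();
}

tick-10sec
{
        printa("%-20s %@d\n", @pgins);
        trunc(@pgins);
}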
Before you dig down: there is currently a ZFS best practice to configure as much disk-based swap as your expected ZFS caches (and if that is unknown, provision swap for all of memory). That may change in the future, but it's still the reality for now, and it will at least help with the diagnosis.

More BP here:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

-r
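P.S. In case it helps, checking and growing swap looks roughly like this - the 2 GB size and the /export path are only examples, and a swap file needs to live on UFS (or use a dedicated slice), since swapping to a file on ZFS isn't supported:

# swap -l                        (list the currently configured swap devices)
# swap -s                        (summary of swap allocation)
# mkfile 2048m /export/swapfile  (create a 2 GB swap file; example path)
# swap -a /export/swapfile       (add it to the running swap configuration)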
You need to take a step back, I think, and first identify the problem. You do not yet know if the memory usage is a user-land process, and the approach of determining which processes are having their pages stolen may not help - such processes may be victims, not the cause.

You need to audit the memory consumers, and go from there. They are:

- The kernel
- The file system cache
- Processes

I assume you're certain about the paging activity, meaning you see free memory drop and the page scanner getting busy. This is observable with vmstat - monitor freemem and the "sr" column.

prstat(1) is your friend. "prstat -s rss" is a wonderfully simple and effective way to track physical memory usage on a per-process basis. Sure, we know all about shared pages, and the fact that the sum of all processes' RSS sizes will be something much, much larger than physical memory. But all we're looking for here are processes with increasingly large RSS, and who the large consumers are. Once you've identified the process(es), use "pmap -x" to refine your understanding of their memory usage.

On a system of this size (1.5GB of RAM), use mdb's "memstat" dcmd:

# mdb -k
> ::memstat

This will give you a memory usage profile. In my experience, the symptoms you describe are frequently the result of the file system cache consuming memory (which, in and of itself, is not a bad thing), then a process comes along that needs a bigger chunk than is available, and the kernel has to get busy managing the shortfall.

With UFS, you'll see the page cache in memstat. With ZFS, you will not, since ZFS uses its own mechanism for caching data and metadata. Unfortunately, there isn't an easy way to track ZFS as a memory consumer (at least not that I'm aware of) - the mdb "kmastat" dcmd will show usage for all the zio pools and zfs caches, but it takes a bit of parsing to sort it out. I'm sure a dtrace script could help track ZFS memory consumption, but I'd need to spend a bit of time working through something like that.

Anyway, before we jump to conclusions, let's start with first identifying the consumer. If it turns out that kernel memory is growing, we can chase that down with dtrace and mdb/kmastat. If it's a process, pmap to determine the segment(s), and dtrace to track allocations.

HTH,
/jim
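P.S. To make the audit concrete, it boils down to a handful of commands (the <pid> below is just a placeholder for whatever process prstat flags):

# vmstat 5                     (watch "free" dropping and the "sr" column rising)
# prstat -s rss -c 5           (processes sorted by resident set size, largest first)
# pmap -x <pid>                (per-segment breakdown for a suspect process)
# echo ::memstat | mdb -k      (the same memory usage profile, as a one-liner)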
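P.P.S. Until there is something better, one crude way to eyeball the ZFS-related kernel caches is to filter the ::kmastat output - the cache-name pattern here is only a guess and the exact names vary between builds:

# echo ::kmastat | mdb -k | egrep 'zio|arc|dmu|dnode|zfs'

Summing the "memory in use" column for those caches gives a rough idea of how much the ARC and friends are holding.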
> With UFS, you'll see the page cache in memstat. With ZFS, you will
> not, since ZFS uses its own mechanism for caching data and metadata.
> Unfortunately, there isn't an easy way to track ZFS as a memory consumer
> (at least not that I'm aware of) - the mdb "kmastat" dcmd will show
> usage for all the zio pools and zfs caches, but it takes a bit of
> parsing to sort it out.

Is there an RFE open to add a ZFS ARC cache entry to the mdb memstat dcmd? I looked through the bug archive, but wasn't able to locate anything to this effect.

Thanks,
- Ryan

--
UNIX Administrator
http://prefetch.net
Peter Eriksson
2006-Dec-04 22:22 UTC
[dtrace-discuss] Re: How to trace process memory usage
> prstat(1) is your friend. "prstat -s rss" is a wonderfully simple and effective

There is only one small problem with that approach - when things are running normally we don't see any strange behaviour. And when things misbehave we typically can't start any new processes (fork: resource temporarily unavailable)... I.e., no prstat/vmstat/pmap/top/ps...

That's why we were thinking of using an already-running DTrace to try to "see" what's going on when things are misbehaving...

Anyway, I've now increased the available swap space so it's more than the size of the RAM in the machine, and we'll see what happens...
Got it - I didn't realize the window of time between "things are getting slow" and "we're wedged" was so small.

I would not start with a DTrace collection in the background. You could run prstat in collect mode ("prstat -s rss -c 10 > /var/tmp/prstat.out" - the 10 is an interval of 10 seconds) in the background. If it's a user-land process growing, you'll hopefully capture something before things wedge.

I would also recommend running "kstat -n system_pages 10 > /var/tmp/kstat.out" in the background. You can track the free page list size and kernel pages. Again, hopefully there will be a trend there we can drill down on before things get wedged.

Between those two, we should be able to determine if it's a user-land process consuming memory, or something in the kernel. The next step will be based on what that tells us.

HTH
/jim
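P.S. One way to launch those collectors so they survive a logout - and, since neither needs to spawn anything new per interval, keep writing even once fork(2) starts failing - might be:

# nohup prstat -s rss -c 10 > /var/tmp/prstat.out 2>&1 &
# nohup kstat -n system_pages 10 > /var/tmp/kstat.out 2>&1 &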
Surya.Prakki at Sun.COM
2006-Dec-05 03:37 UTC
[dtrace-discuss] Re: How to trace process memory usage
One of the things you may need to look out for is: if you are creating way too many LWPs on the system, you may exhaust the segkp space configured on your system (segkpsize). ::kmastat from an already-running 'mdb -k' session will help you figure this out (check for any failures in the segkp caches).

-surya
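P.S. For example (from an already-running 'mdb -k' session, or piped in as below; the exact ::kmastat column layout varies a little between releases):

# echo ::kmastat | mdb -k | grep segkp

A non-zero count in the allocation-failure column for the segkp cache would point at segkp exhaustion; segkpsize can then be raised via /etc/system (units and limits depend on the release, so check the tunable parameters guide first), followed by a reboot.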