Would someone "in the know" be willing to write up (preferably blog) definitive definitions/explanations of all the arcstats provided via kstat? I''m struggling with proper interpretation of certain values, namely "p", "memory_throttle_count", and the mru/mfu+ghost hit vs demand/prefetch hit counters. I think I''ve got it figured out, but I''d really like expert clarification before I start tweaking. Thanks. benr. This message posted from opensolaris.org
Ben,

Here is an attempt:

  c            -> the total cache size (MRU + MFU)
  p            -> the target size of the MRU
  (c - p)      -> the target size of the MFU
  c_max, c_min -> hard limits on c
  size         -> total amount of memory consumed by the ARC
  memory_throttle_count -> the number of times ZFS decided to throttle ARC growth

The ARC maintains ghost lists for the MRU and MFU. When it decides to evict a buffer from the MRU/MFU, the data buffer is freed; however, the corresponding header is moved onto one of these ghost lists. A future hit on one of these ghost lists is an indication to the algorithm that the corresponding list (MRU/MFU) should have been larger to accommodate that buffer.

  mfu_ghost_hits, mru_ghost_hits -> hits on the respective ghost lists

These are used to correct the sizes of the MRU/MFU by adjusting the value of 'p' (arc.c:arc_adjust()).

I know this is not complete.

Thanks and regards,
Sanjeev.

On Wed, Aug 20, 2008 at 04:04:59AM -0700, Ben Rockwood wrote:
> Would someone "in the know" be willing to write up (preferably blog) definitive definitions/explanations of all the arcstats provided via kstat? I'm struggling with proper interpretation of certain values, namely "p", "memory_throttle_count", and the mru/mfu+ghost hit vs demand/prefetch hit counters. I think I've got it figured out, but I'd really like expert clarification before I start tweaking.
>
> Thanks.
>
> benr.
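To make the role of 'p' concrete, here is a deliberately simplified model of how ghost-list hits steer the MRU/MFU split. This is illustrative Python only, not the real implementation: arc.c works in byte-sized deltas inside arc_adapt()/arc_adjust() and handles many more states.

```python
# Toy model of the ARC's adaptive target 'p' (illustrative only).
# c is the total target cache size; p is the MRU target, c - p the MFU target.

class ArcModel:
    def __init__(self, c, p):
        self.c = c  # total target cache size
        self.p = p  # MRU target size

    def mru_ghost_hit(self, delta=1):
        # A hit on the MRU ghost list means the MRU was too small: grow p.
        self.p = min(self.c, self.p + delta)

    def mfu_ghost_hit(self, delta=1):
        # A hit on the MFU ghost list means the MFU was too small: shrink p.
        self.p = max(0, self.p - delta)

arc = ArcModel(c=100, p=50)
arc.mru_ghost_hit(10)   # MRU deserved more room: p becomes 60
arc.mfu_ghost_hit(30)   # MFU deserved more room: p becomes 30
print(arc.p, arc.c - arc.p)
```

So 'p' is never read directly as "the MRU size"; it is the dividing line the ARC steers toward as ghost hits accumulate on either side.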
Thanks -- not as much as I was hoping for, but still extremely helpful.

Can you, or others, have a look at this: http://cuddletech.com/arc_summary.html

This is a Perl script that uses kstats to drum up a report such as the following:

System Memory:
         Physical RAM:  32759 MB
         Free Memory :  10230 MB
         LotsFree:        511 MB

ARC Size:
         Current Size:             7989 MB (arcsize)
         Target Size (Adaptive):   8192 MB (c)
         Min Size (Hard Limit):    1024 MB (zfs_arc_min)
         Max Size (Hard Limit):    8192 MB (zfs_arc_max)

ARC Size Breakdown:
         Most Recently Used Cache Size:   13%    1087 MB (p)
         Most Frequently Used Cache Size: 86%    7104 MB (c-p)

ARC Efficiency:
         Cache Access Total:     3947194710
         Cache Hit Ratio:   99%  3944674329
         Cache Miss Ratio:   0%     2520381

         Data Demand   Efficiency: 99%
         Data Prefetch Efficiency: 69%

         CACHE HITS BY CACHE LIST:
           Anon:                        0%    16730069
           Most Frequently Used:       99%  3915830091 (mfu)
           Most Recently Used:          0%    10490502 (mru)
           Most Frequently Used Ghost:  0%      439554 (mfu_ghost)
           Most Recently Used Ghost:    0%     1184113 (mru_ghost)

         CACHE HITS BY DATA TYPE:
           Demand Data:       99%  3914527790
           Prefetch Data:      0%     2447831
           Demand Metadata:    0%    10709326
           Prefetch Metadata:  0%    16989382

         CACHE MISSES BY DATA TYPE:
           Demand Data:       45%     1144679
           Prefetch Data:     42%     1068975
           Demand Metadata:    5%      132649
           Prefetch Metadata:  6%      174078
---------------------------------------------

Feedback and input is welcome, in particular if I'm mischaracterizing data.

benr.
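For anyone reimplementing the report above, the ratio arithmetic is just truncated integer percentages over the arcstats counters. A sketch in Python (rather than the script's Perl); the field names are real arcstats kstats and the values are the sample numbers from the report:

```python
# Hit/miss ratio arithmetic as used by arc_summary-style reports.
# Sample values taken from the report above.

stats = {
    "hits":                  3944674329,
    "misses":                   2520381,
    "demand_data_hits":      3914527790,
    "demand_data_misses":       1144679,
    "prefetch_data_hits":       2447831,
    "prefetch_data_misses":     1068975,
}

accesses   = stats["hits"] + stats["misses"]
hit_ratio  = int(100 * stats["hits"]   / accesses)
miss_ratio = int(100 * stats["misses"] / accesses)

demand_eff = int(100 * stats["demand_data_hits"] /
                 (stats["demand_data_hits"] + stats["demand_data_misses"]))
prefetch_eff = int(100 * stats["prefetch_data_hits"] /
                   (stats["prefetch_data_hits"] + stats["prefetch_data_misses"]))

print(f"hit {hit_ratio}%  miss {miss_ratio}%")
print(f"demand {demand_eff}%  prefetch {prefetch_eff}%")
```

With these inputs the output reproduces the report's 99%/0% hit/miss split and the 99%/69% demand/prefetch efficiencies.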
On 21 August, 2008 - Ben Rockwood sent me these 2,2K bytes:

> Thanks, not as much as I was hoping for but still extremely helpful.
>
> Can you, or others have a look at this: http://cuddletech.com/arc_summary.html
>
> This is a Perl script that uses kstats to drum up a report such as the following:

It breaks with a division by zero (line 102) if you have disabled zfs prefetch (set zfs:zfs_prefetch_disable = 1), and gives some interesting stats on the same machine..

         CACHE HITS BY CACHE LIST:
           Anon:                       -6%  -121314224
           Most Frequently Used:       69%  1282896887 (mfu)
           Most Recently Used:         30%   572058573 (mru)
           Most Frequently Used Ghost:  2%    41192924 (mfu_ghost)
           Most Recently Used Ghost:    4%    80121300 (mru_ghost)

Disk backend with 2G ram and a 2.7TB working set, so data cache is a lost cause. (Waiting for snv96 with primarycache=metadata)

Other than that.. I like it.

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
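For what it's worth, the divide-by-zero is avoidable with a simple guard on the denominator. A sketch of such a guard (a hypothetical helper, not Ben's actual code, and in Python rather than the script's Perl):

```python
# Hypothetical guard for percentage math when a counter class has seen
# zero accesses (e.g. prefetch counters with zfs_prefetch_disable=1).

def ratio(part, whole):
    # Report 0% instead of dividing by zero when there were no accesses.
    return int(100 * part / whole) if whole > 0 else 0

print(ratio(2447831, 0))        # prefetch disabled: no prefetch accesses
print(ratio(2447831, 3516806))  # prefetch enabled
```

Whether 0% or "n/a" is the right thing to print when prefetch is disabled is a presentation choice; the point is only that the denominator must be checked.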
Ben,

This looks good! A couple of points:

- You might want to show whether the ARC is throttled (although this is not available as a kstat). I think it is a good indication of how stressed the box is.

- I am not sure the 'anon_hits' make too much sense:

  -- From arc.c --
   * Anonymous buffers are buffers that are not associated with
   * a DVA. These are buffers that hold dirty block copies
   * before they are written to stable storage. By definition,
   * they are "ref'd" and are considered part of arc_mru
   * that cannot be freed. Generally, they will acquire a DVA
   * as they are written and migrate onto the arc_mru list.
   */
  -- snip --

I agree with you that we would need a little better explanation of the demand* parameters. Let me try and understand that a little better.

Thanks and regards,
Sanjeev.

Ben Rockwood wrote:
> Thanks, not as much as I was hoping for but still extremely helpful.
>
> Can you, or others have a look at this: http://cuddletech.com/arc_summary.html
>
> This is a PERL script that uses kstats to drum up a report such as the following:
> [report snipped]
>
> Feedback and input is welcome, in particular if I'm mischaracterizing data.
>
> benr.
It's a starting point anyway. The key is to try and draw useful conclusions from the info to answer the torrent of "why is my ARC 30GB???"

There are several things I'm unclear on whether or not I'm properly interpreting, such as:

* As you state, the anon pages. Even the comment in the code is, to me anyway, a little vague. I include them because otherwise you look at the hit counters and wonder where a large chunk of them went.

* Prefetch... I want to use the Prefetch Data hit ratio as a judgment call on the efficiency of prefetch. If the value is very low it might be best to turn it off..... but I'd like to hear that from someone else before I go saying that. In high-latency environments, such as ZFS on iSCSI, prefetch can either significantly help or hurt; determining which is difficult without some type of metric such as the one above.

* There are several instances (based on dtracing) in which the ARC is bypassed... for the ZIL I understand; in some other cases I need to spend more time analyzing the DMU (dbuf_*) for why.

* In answering the "Is having a 30GB ARC good?" question, I want to say that if MFU is >60% of the ARC, and if the hits are mostly MFU, then you are deriving significant benefit from your large ARC.... but on a system with a 2GB ARC or a 30GB ARC the overall hit ratio tends to be 99%. Which is nuts, and tends to reinforce a misinterpretation of anon hits.

The only way I'm seeing to _really_ understand the ARC's efficiency is to look at the overall number of reads, then how many are intercepted by the ARC and how many actually made it to disk... and why (prefetch or demand). This is tricky to implement via kstats because you have to pick out and monitor the zpool disks themselves.

I've spent a lot of time in this code (arc.c) and still have a lot of questions. I really wish there was an "Advanced ZFS Internals" talk coming up; I simply can't keep spending so much time on this.

Feedback from PAE or other tuning experts is welcome and appreciated. :)

benr.
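Pending better kstats, one rough first cut at "how many reads made it to disk" is the sum of the ARC miss counters, since each ARC miss results in a backend read. This is a sketch under simplifying assumptions -- it ignores the L2ARC and ZIL paths and any I/O aggregation -- using the miss numbers from the report earlier in the thread:

```python
# Rough approximation: reads reaching disk, estimated from the four
# arcstats miss counters (ignores L2ARC, ZIL, and I/O aggregation).

misses = {
    "demand_data":       1144679,
    "prefetch_data":     1068975,
    "demand_metadata":    132649,
    "prefetch_metadata":  174078,
}

to_disk  = sum(misses.values())
demand   = misses["demand_data"]   + misses["demand_metadata"]
prefetch = misses["prefetch_data"] + misses["prefetch_metadata"]
print(to_disk, demand, prefetch)
```

Note that to_disk here reproduces the report's total miss count (2520381), split into its demand and prefetch halves -- which at least answers the "why" (prefetch or demand) half of the question without touching the zpool disks.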
New version is available.... (v0.2):

* Fixes the divide by zero.
* Includes tuning from /etc/system in the output.
* If prefetch is disabled, I explicitly say so.
* Accounts for the jacked anon count. Still needs improvement here.
* Added friendly explanations for the MRU/MFU & ghost list counts.

Page and examples are updated: cuddletech.com/arc_summary.pl

Still needs work, but hopefully interest in this will stimulate some improved understanding of ARC internals.

benr.
On Thu, Aug 21, 2008 at 8:47 PM, Ben Rockwood <benr at cuddletech.com> wrote:
> New version is available.... (v0.2):
>
> * Fixes divide by zero,
> * includes tuning from /etc/system in output
> * if prefetch is disabled I explicitly say so.
> * Accounts for jacked anon count. Still need improvement here.
> * Added friendly explanations for MRU/MFU & Ghost lists counts.
>
> Page and examples are updated: cuddletech.com/arc_summary.pl
>
> Still needs work, but hopefully interest in this will stimulate some improved understanding of ARC internals.

For a bit of light relief (in other words, with pretty graphs) I've hacked up a graphical Java version of Ben's script as part of jkstat (updated to 0.24):

http://www.petertribble.co.uk/Solaris/jkstat.html

Now, this is pretty rough, and chews up a modest amount of CPU, and I'm not sure of the interpretation, but I've basically taken Ben's code and lifted it more or less as is.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Brendan Gregg - Sun Microsystems
2008-Aug-28 22:09 UTC
[zfs-discuss] ARCSTAT Kstat Definitions
G'Day Ben,

ARC visibility is important; did you see Neel's arcstat?:

http://www.solarisinternals.com/wiki/index.php/Arcstat

Try -x for various sizes, and -v for definitions.

On Thu, Aug 21, 2008 at 10:23:24AM -0700, Ben Rockwood wrote:
> Its a starting point anyway. The key is to try and draw useful conclusions from the info to answer the torrent of "why is my ARC 30GB???"
>
> * As you state, the anon pages. Even the comment in code is, to me anyway, a little vague. I include them because otherwise you look at the hit counters and wonder where a large chunk of them went.

Yes, anon hits don't make sense - they are dirty pages and won't have a DVA, and so won't be findable by other threads in arc_read(). I can see why arc_summary.pl thinks they exist - it is accounting for the discrepancy between arcstats:hits and the sum of the hits from the four ARC lists.

Ghost list hits aren't part of arcstats:hits - arcstats:hits are real hits; the ghost hits are an artifact of the ARC algorithm. If you do want to break down arcstats:hits into its components, use:

        zfs:0:arcstats:demand_data_hits
        zfs:0:arcstats:demand_metadata_hits
        zfs:0:arcstats:prefetch_data_hits
        zfs:0:arcstats:prefetch_metadata_hits

And for a different perspective on the demand hits:

        zfs:0:arcstats:mru_hits
        zfs:0:arcstats:mfu_hits

Also, arc_summary.pl's reported MRU and MFU sizes aren't actual; these are target sizes. The ARC will try to steer itself towards them, but in at least one case (where the ARC has yet to fill) they can be very different from actual (until arc_adjust() is called to whip them back to size.)

> * Prefetch... I want to use the Prefetch Data hit ratio as a judgment call on the efficiency of prefetch. If the value is very low it might be best to turn it off..... but I'd like to hear that from someone else before I go saying that.

Sounds good to me.

> In high latency environments, such as ZFS on iSCSI, prefetch can either significantly help or hurt, determining which is difficult without some type of metric as above.
>
> * There are several instances (based on dtracing) in which the ARC is bypassed... for ZIL I understand, in some other cases I need to spend more time analyzing the DMU (dbuf_*) for why.
>
> * In answering the "Is having a 30GB ARC good?" question, I want to say that if MFU is >60% of ARC, and if the hits are mostly MFU that you are deriving significant benefit from your large ARC.... but on a system with a 2GB ARC or a 30GB ARC the overall hit ratio tends to be 99%. Which is nuts, and tends to reinforce a misinterpretation of anon hits.

I wouldn't read *too* much into MRU vs MFU hits. MFU means 2 hits, MRU means 1.

> The only way I'm seeing to _really_ understand ARC's efficiency is to look at the overall number of reads and then how many are intercepted by ARC and how many actually made it to disk... and why (prefetch or demand). This is tricky to implement via kstats because you have to pick out and monitor the zpool disks themselves.

This would usually have more to do with the workload than the ARC's efficiency.

> I've spent a lot of time in this code (arc.c) and still have a lot of questions. I really wish there was an "Advanced ZFS Internals" talk coming up; I simply can't keep spending so much time on this.

Maybe you could try forgetting about the kstats for a moment and draw a fantasy arc_summary.pl output. Then we can look at adding kstats to make writing that script possible/easy (Mark and I could add the kstats, and Neel could provide the script, for example). Of course, if we do add more kstats, it's not going to help on older rev kernels out there...

cheers,

Brendan

--
Brendan [CA, USA]
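As a sanity check on the breakdown Brendan describes: for the report posted earlier in the thread, the four demand/prefetch counters do sum exactly to arcstats:hits, confirming that ghost-list hits (and the derived "anon" bucket) are not part of hits. A quick check in Python:

```python
# Verify that arcstats:hits equals the sum of the four real hit counters
# (sample values from the arc_summary report earlier in the thread).

hits = 3944674329
components = {
    "demand_data_hits":      3914527790,
    "demand_metadata_hits":    10709326,
    "prefetch_data_hits":       2447831,
    "prefetch_metadata_hits":  16989382,
}

print(sum(components.values()) == hits)  # prints: True
```

The same check run against the mru/mfu (+ghost) counters comes up short by exactly the "anon" figure the script reports, which is how that derived bucket arises.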
Dear Sanjeev,

In order to analyze a customer issue, I am working on a script that computes ARC and L2ARC activity per interval. My intention is to analyze cache efficiency as well as evictions. I would appreciate more details:

About the hits breakdown:
- Are m[fr]u_ghost_hits a subset of m[fr]u_hits?
- I had a look at Ben's arc_summary script, but I am not sure of the correct interpretation of the hits counters. The sum of mfu, mru, mfu_ghost, and mru_ghost hits is greater than hits.

About the following parameters:
+ deleted
+ evict_skip
+ hdr_size
+ mutex_miss
+ recycle_miss
+ memory_throttle_count