I'm running into an issue where there seems to be a high number of read IOPS hitting the disks, and physical free memory is fluctuating between 200MB and 450MB out of 16GB total. We have the L2ARC configured on a 32GB Intel X25-E SSD and the slog on another 32GB X25-E SSD. According to our tester, Oracle writes are extremely slow (high latency). Below is a snippet of iostat:

     r/s    w/s   Mr/s  Mw/s  wait  actv wsvc_t asvc_t  %w   %b device
     0.0    0.0    0.0   0.0   0.0   0.0    0.0    0.0   0    0 c0
     0.0    0.0    0.0   0.0   0.0   0.0    0.0    0.0   0    0 c0t0d0
  4898.3   34.2   23.2   1.4   0.1 385.3    0.0   78.1   0 1246 c1
     0.0    0.8    0.0   0.0   0.0   0.0    0.0   16.0   0    1 c1t0d0
   401.7    0.0    1.9   0.0   0.0  31.5    0.0   78.5   1  100 c1t1d0
   421.2    0.0    2.0   0.0   0.0  30.4    0.0   72.3   1   98 c1t2d0
   403.9    0.0    1.9   0.0   0.0  32.0    0.0   79.2   1  100 c1t3d0
   406.7    0.0    2.0   0.0   0.0  33.0    0.0   81.3   1  100 c1t4d0
   414.2    0.0    1.9   0.0   0.0  28.6    0.0   69.1   1   98 c1t5d0
   406.3    0.0    1.8   0.0   0.0  32.1    0.0   79.0   1  100 c1t6d0
   404.3    0.0    1.9   0.0   0.0  31.9    0.0   78.8   1  100 c1t7d0
   404.1    0.0    1.9   0.0   0.0  34.0    0.0   84.1   1  100 c1t8d0
   407.1    0.0    1.9   0.0   0.0  31.2    0.0   76.6   1  100 c1t9d0
   407.5    0.0    2.0   0.0   0.0  33.2    0.0   81.4   1  100 c1t10d0
   402.8    0.0    2.0   0.0   0.0  33.5    0.0   83.2   1  100 c1t11d0
   408.9    0.0    2.0   0.0   0.0  32.8    0.0   80.3   1  100 c1t12d0
     9.6   10.8    0.1   0.9   0.0   0.4    0.0   20.1   0   17 c1t13d0
     0.0   22.7    0.0   0.5   0.0   0.5    0.0   22.8   0   33 c1t14d0

Is this an indicator that we need more physical memory?

From http://blogs.sun.com/brendan/entry/test, the order in which a read request is satisfied is:

1) ARC
2) vdev cache of L2ARC devices
3) L2ARC devices
4) vdev cache of disks
5) disks

Using arc_summary.pl, we determined that prefetch was not helping much, so we disabled it.

CACHE HITS BY DATA TYPE:
        Demand Data:             22%    158853174
        Prefetch Data:           17%    123009991  <--- not helping???
        Demand Metadata:         60%    437439104
        Prefetch Metadata:        0%      2446824

The write IOPS started to kick in more, and latency dropped on the spinning disks:

     r/s    w/s   Mr/s  Mw/s  wait  actv wsvc_t asvc_t  %w   %b device
     0.0    0.0    0.0   0.0   0.0   0.0    0.0    0.0   0    0 c0
     0.0    0.0    0.0   0.0   0.0   0.0    0.0    0.0   0    0 c0t0d0
  1629.0  968.0   17.4   7.3   0.0  35.9    0.0   13.8   0 1088 c1
     0.0    1.9    0.0   0.0   0.0   0.0    0.0    1.7   0    0 c1t0d0
   126.7   67.3    1.4   0.2   0.0   2.9    0.0   14.8   0   90 c1t1d0
   129.7   76.1    1.4   0.2   0.0   2.8    0.0   13.7   0   90 c1t2d0
   128.0   73.9    1.4   0.2   0.0   3.2    0.0   16.0   0   91 c1t3d0
   128.3   79.1    1.3   0.2   0.0   3.6    0.0   17.2   0   92 c1t4d0
   125.8   69.7    1.3   0.2   0.0   2.9    0.0   14.9   0   89 c1t5d0
   128.3   81.9    1.4   0.2   0.0   2.8    0.0   13.1   0   89 c1t6d0
   128.1   69.2    1.4   0.2   0.0   3.1    0.0   15.7   0   93 c1t7d0
   128.3   80.3    1.4   0.2   0.0   3.1    0.0   14.7   0   91 c1t8d0
   129.2   69.3    1.4   0.2   0.0   3.0    0.0   15.2   0   90 c1t9d0
   130.1   80.0    1.4   0.2   0.0   2.9    0.0   13.6   0   89 c1t10d0
   126.2   72.6    1.3   0.2   0.0   2.8    0.0   14.2   0   89 c1t11d0
   129.7   81.0    1.4   0.2   0.0   2.7    0.0   12.9   0   88 c1t12d0
    90.4   41.3    1.0   4.0   0.0   0.2    0.0    1.2   0    6 c1t13d0
     0.0   24.3    0.0   1.2   0.0   0.0    0.0    0.2   0    0 c1t14d0

Is it true that if your MFU stats start to go over 50%, more memory is needed?

CACHE HITS BY CACHE LIST:
        Anon:                        10%     74845266              [ New Customer, First Cache Hit ]
        Most Recently Used:          19%    140478087 (mru)        [ Return Customer ]
        Most Frequently Used:        65%    475719362 (mfu)        [ Frequent Customer ]
        Most Recently Used Ghost:     2%     20785604 (mru_ghost)  [ Return Customer Evicted, Now Back ]
        Most Frequently Used Ghost:   1%      9920089 (mfu_ghost)  [ Frequent Customer Evicted, Now Back ]

CACHE HITS BY DATA TYPE:
        Demand Data:             22%    158852935
        Prefetch Data:           17%    123009991
        Demand Metadata:         60%    437438658
        Prefetch Metadata:        0%      2446824

My theory is that since there's not enough memory for the ARC to cache the data, reads hit the L2ARC, where the data can't be found either, and have to be serviced from disk. This causes contention between reads and writes, inflating the service times. Thoughts?
-- 
This message posted from opensolaris.org
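A high MFU share by itself usually just means the working set is being re-read a lot; the more telling numbers are arguably the ghost lists, since a hit on mru_ghost or mfu_ghost means a block was cached, evicted for lack of space, and then requested again -- a read that a larger ARC might have served from memory. Recomputing the ratios from the arc_summary.pl figures above (plain arithmetic, not a ZFS tool):

```python
# Recompute the CACHE HITS BY CACHE LIST percentages from the raw counts above.
# Ghost-list hits are evictions that came back -- a rough signal of ARC pressure.
hits = {
    "anon":       74845266,
    "mru":       140478087,
    "mfu":       475719362,
    "mru_ghost":  20785604,
    "mfu_ghost":   9920089,
}

total = sum(hits.values())
mfu_pct = 100 * hits["mfu"] / total
ghost_pct = 100 * (hits["mru_ghost"] + hits["mfu_ghost"]) / total

print(f"MFU hits:   {mfu_pct:.1f}%")    # ~65.9%, matching the 65% reported
print(f"Ghost hits: {ghost_pct:.1f}%")  # ~4.3% combined
```

By this reading, roughly 4% of cache-list hits landed on the ghost lists, so some fraction of the disk reads would likely have been absorbed by a bigger ARC, though whether that accounts for ~400 r/s per spindle is another question.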
Richard L. Hamilton
2009-Dec-25 23:18 UTC
[zfs-discuss] high read iops - more memory for arc?
FYI, the arc and arc-discuss lists or forums are not appropriate for this. There are two "arc" acronyms:

* Architecture Review Committee (the arc list is for cases being considered; arc-discuss is for other discussion. Non-committee business is most unwelcome on the arc list.)
* the ZFS Adaptive Replacement Cache, which is what you are talking about.

The zfs-discuss list is appropriate for that subject; storage-discuss and database-discuss _may_ relate, but rather than sending to every list that _might_ relate, I'd suggest starting with the most appropriate one first, and reading enough of the posts already on a list to get some idea of what's appropriate there and what isn't, before adding it as an additional CC in the hopes that someone might answer.

Very few people are likely to be responding here at this time, insofar as most of the people who might are probably observing (at least socially) the Christmas holiday right now (their families might not appreciate them being distracted by anything else!), and many of the rest aren't interacting much because of how many are not around right now. Don't expect too much until the first Monday after 1 January. And anyway, discussion lists are not a place where anyone is _obligated_ to answer. Those with support contracts presumably have other ways of getting help.

Now... I probably couldn't answer your question even if I had all the information you left out, but maybe someone could, eventually. Some of the information they might need:

* What are you running (uname -a will do)? ZFS is constantly being improved; problems get fixed (and sometimes introduced) in just about every build.
* What system, how is it configured, exactly what disk models, etc.?

Free memory is _supposed_ to be low. Free memory is wasted memory, except that a little is kept free to quickly respond to requests for more.
Most memory not otherwise being used for mappings, kernel data structures, etc., is used either as additional VM page cache of pages that might be used again, or by the ZFS ARC. The tools that report on how memory is used behave differently on Solaris (and even on different versions of it) than they do on other OSes, because Solaris tries very hard to make the best use of all RAM. The uname -a information would also help someone (more knowledgeable than I, although I might be able to look it up) suggest which tools would best help you understand your situation.

So while free memory alone doesn't tell you much, there's a good chance that more would help, unless there's some specific problem involved. There's also a good chance that your problem is known, recognizable, and probably has a fix in a newer version or a workaround, if you provide enough information to help someone find that for you.
-- 
This message posted from opensolaris.org
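[Editorial note: the counters that arc_summary.pl summarizes come from the ZFS `arcstats` kstat, which can be dumped in parseable form with `kstat -p zfs:0:arcstats` on Solaris. A minimal sketch of summarizing that output follows; the sample values and the exact field names shown are illustrative, not taken from the poster's system.]

```python
# Sketch: summarize ARC size vs. target from `kstat -p zfs:0:arcstats`-style
# output (module:instance:name:statistic<TAB>value per line).
# The sample below is made-up illustrative data, not a real capture.
sample = """\
zfs:0:arcstats:size\t13958643712
zfs:0:arcstats:c\t14495514624
zfs:0:arcstats:c_max\t15837691904
zfs:0:arcstats:hits\t721748408
zfs:0:arcstats:misses\t35790412"""

def parse_arcstats(text):
    """Map each trailing statistic name to its integer value."""
    stats = {}
    for line in text.splitlines():
        key, value = line.split("\t")
        stats[key.rsplit(":", 1)[-1]] = int(value)
    return stats

arc = parse_arcstats(sample)
gb = 1 << 30
print(f"ARC size: {arc['size'] / gb:.1f} GB "
      f"(target c = {arc['c'] / gb:.1f} GB, c_max = {arc['c_max'] / gb:.2f} GB)")
print(f"Hit rate: {100 * arc['hits'] / (arc['hits'] + arc['misses']):.1f}%")
```

On a live box one would feed this the output of the kstat command itself; watching `size` against `c` and `c_max` over time shows whether the ARC is being held well below its target, which bears directly on the "do we need more RAM" question.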