PW
2009-Dec-28 06:38 UTC
[zfs-discuss] [storage-discuss] high read iops - more memory for arc?
Prefetching at both the file and device level has been disabled, yielding good results so far. We've lowered the number of concurrent I/Os from 35 to 1, which caused the service times to go even lower (1 -> 8 ms) but inflated actv (0.4 -> 2). I've followed your recommendation and set primarycache to metadata; I'll have to check with our tester in the morning whether it made a difference. I'm still trying to understand why we're seeing so many read requests going to disk when the ARC is set to 8GB and we have a 32GB SSD L2ARC. With that many reads hitting the disks, it may be causing contention with the writes.
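For anyone following along, this is roughly how we applied those changes. Treat it as a sketch: the dataset and pool names below are placeholders for ours, and the /etc/system tunables are the usual unsupported ones, so double-check the knobs on your own build (I believe zfs_vdev_cache_size is the one for device-level read-ahead):

  # cache only metadata for the Oracle dataset ("tank/oracle" is a placeholder)
  zfs set primarycache=metadata tank/oracle
  zfs set secondarycache=metadata tank/oracle

  * /etc/system entries, take effect at the next reboot
  * file-level prefetch off
  set zfs:zfs_prefetch_disable = 1
  * device-level read-ahead (vdev cache) off
  set zfs:zfs_vdev_cache_size = 0
  * concurrent I/Os queued per vdev, default here was 35
  set zfs:zfs_vdev_max_pending = 1

  # the same can be flipped on the running system with mdb
  echo zfs_prefetch_disable/W0t1 | mdb -kw
  echo zfs_vdev_max_pending/W0t1 | mdb -kw

--- On Sat, 12/26/09, Robert Heinzmann (reg) <reg at elconas.de> wrote: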
> From: Robert Heinzmann (reg) <reg at elconas.de>
> Subject: Re: [storage-discuss] high read iops - more memory for arc?
> To: "Brad" <beneri3 at yahoo.com>
> Cc: storage-discuss at opensolaris.org
> Date: Saturday, December 26, 2009, 6:07 AM
>
> Hi Brad,
>
> just an idea:
>
> If you run Oracle or another RDBMS, a read cache should not be of much
> use, because a well-sized RDBMS (except a data warehouse) should not do
> many reads. If a well-sized RDBMS does read (OK, we are not talking
> MyISAM here :), it requests NEW data, and a good RDBMS queries for that
> data only once - so a read cache does not help much (unless it is a
> magic one that pre-reads guessed future request addresses).
>
> So my idea would be to disable the device read-ahead for the Oracle
> zvols / datasets by setting "primarycache" to "metadata", and thus use
> the precious main memory for metadata only (ZFS does a lot of metadata
> operations, so having this in memory helps). If this does not help,
> setting "secondarycache" to "metadata" may be a good idea as well.
>
> Maybe this helps,
> Robert
>
> Brad schrieb:
> > I'm running into an issue where there seems to be a high number of
> > read IOPS hitting the disks, and physical free memory is fluctuating
> > between 200MB -> 450MB out of 16GB total. We have the L2ARC
> > configured on a 32GB Intel X25-E SSD and the slog on another 32GB
> > X25-E SSD.
> >
> > According to our tester, Oracle writes are extremely slow (high latency).
> >
> > Below is a snippet of iostat:
> >
> >      r/s    w/s   Mr/s   Mw/s wait  actv wsvc_t asvc_t  %w   %b device
> >      0.0    0.0    0.0    0.0  0.0   0.0    0.0    0.0   0    0 c0
> >      0.0    0.0    0.0    0.0  0.0   0.0    0.0    0.0   0    0 c0t0d0
> >   4898.3   34.2   23.2    1.4  0.1 385.3    0.0   78.1   0 1246 c1
> >      0.0    0.8    0.0    0.0  0.0   0.0    0.0   16.0   0    1 c1t0d0
> >    401.7    0.0    1.9    0.0  0.0  31.5    0.0   78.5   1  100 c1t1d0
> >    421.2    0.0    2.0    0.0  0.0  30.4    0.0   72.3   1   98 c1t2d0
> >    403.9    0.0    1.9    0.0  0.0  32.0    0.0   79.2   1  100 c1t3d0
> >    406.7    0.0    2.0    0.0  0.0  33.0    0.0   81.3   1  100 c1t4d0
> >    414.2    0.0    1.9    0.0  0.0  28.6    0.0   69.1   1   98 c1t5d0
> >    406.3    0.0    1.8    0.0  0.0  32.1    0.0   79.0   1  100 c1t6d0
> >    404.3    0.0    1.9    0.0  0.0  31.9    0.0   78.8   1  100 c1t7d0
> >    404.1    0.0    1.9    0.0  0.0  34.0    0.0   84.1   1  100 c1t8d0
> >    407.1    0.0    1.9    0.0  0.0  31.2    0.0   76.6   1  100 c1t9d0
> >    407.5    0.0    2.0    0.0  0.0  33.2    0.0   81.4   1  100 c1t10d0
> >    402.8    0.0    2.0    0.0  0.0  33.5    0.0   83.2   1  100 c1t11d0
> >    408.9    0.0    2.0    0.0  0.0  32.8    0.0   80.3   1  100 c1t12d0
> >      9.6   10.8    0.1    0.9  0.0   0.4    0.0   20.1   0   17 c1t13d0
> >      0.0   22.7    0.0    0.5  0.0   0.5    0.0   22.8   0   33 c1t14d0
> >
> > Is this an indicator that we need more physical memory? From
> > http://blogs.sun.com/brendan/entry/test, the order in which a read
> > request is satisfied is:
> >
> > 1) ARC
> > 2) vdev cache of L2ARC devices
> > 3) L2ARC devices
> > 4) vdev cache of disks
> > 5) disks
> >
> > Using arc_summary.pl, we determined that prefetch was not helping
> > much, so we disabled it.
> >
> > CACHE HITS BY DATA TYPE:
> >          Demand Data:               22%    158853174
> >          Prefetch Data:             17%    123009991   <--- not helping?
> >          Demand Metadata:           60%    437439104
> >          Prefetch Metadata:          0%    2446824
> >
> > The write IOPS started to kick in more and latency dropped on the
> > spinning disks:
> >
> >      r/s    w/s   Mr/s   Mw/s wait  actv wsvc_t asvc_t  %w   %b device
> >      0.0    0.0    0.0    0.0  0.0   0.0    0.0    0.0   0    0 c0
> >      0.0    0.0    0.0    0.0  0.0   0.0    0.0    0.0   0    0 c0t0d0
> >   1629.0  968.0   17.4    7.3  0.0  35.9    0.0   13.8   0 1088 c1
> >      0.0    1.9    0.0    0.0  0.0   0.0    0.0    1.7   0    0 c1t0d0
> >    126.7   67.3    1.4    0.2  0.0   2.9    0.0   14.8   0   90 c1t1d0
> >    129.7   76.1    1.4    0.2  0.0   2.8    0.0   13.7   0   90 c1t2d0
> >    128.0   73.9    1.4    0.2  0.0   3.2    0.0   16.0   0   91 c1t3d0
> >    128.3   79.1    1.3    0.2  0.0   3.6    0.0   17.2   0   92 c1t4d0
> >    125.8   69.7    1.3    0.2  0.0   2.9    0.0   14.9   0   89 c1t5d0
> >    128.3   81.9    1.4    0.2  0.0   2.8    0.0   13.1   0   89 c1t6d0
> >    128.1   69.2    1.4    0.2  0.0   3.1    0.0   15.7   0   93 c1t7d0
> >    128.3   80.3    1.4    0.2  0.0   3.1    0.0   14.7   0   91 c1t8d0
> >    129.2   69.3    1.4    0.2  0.0   3.0    0.0   15.2   0   90 c1t9d0
> >    130.1   80.0    1.4    0.2  0.0   2.9    0.0   13.6   0   89 c1t10d0
> >    126.2   72.6    1.3    0.2  0.0   2.8    0.0   14.2   0   89 c1t11d0
> >    129.7   81.0    1.4    0.2  0.0   2.7    0.0   12.9   0   88 c1t12d0
> >     90.4   41.3    1.0    4.0  0.0   0.2    0.0    1.2   0    6 c1t13d0
> >      0.0   24.3    0.0    1.2  0.0   0.0    0.0    0.2   0    0 c1t14d0
> >
> > Is it true that if your MFU stats start to go over 50%, more memory
> > is needed?
> >
> >      CACHE HITS BY CACHE LIST:
> >          Anon:                        10%    74845266               [ New Customer, First Cache Hit ]
> >          Most Recently Used:          19%    140478087 (mru)        [ Return Customer ]
> >          Most Frequently Used:        65%    475719362 (mfu)        [ Frequent Customer ]
> >          Most Recently Used Ghost:     2%    20785604 (mru_ghost)   [ Return Customer Evicted, Now Back ]
> >          Most Frequently Used Ghost:   1%    9920089 (mfu_ghost)    [ Frequent Customer Evicted, Now Back ]
> >      CACHE HITS BY DATA TYPE:
> >          Demand Data:                 22%    158852935
> >          Prefetch Data:               17%    123009991
> >          Demand Metadata:             60%    437438658
> >          Prefetch Metadata:            0%    2446824
> >
> > My theory is that since there's not enough memory for the ARC to
> > cache data, reads hit the L2ARC, where the data can't be found
> > either, and then have to go to the disks. This causes contention
> > between reads and writes, which inflates the service times.
> >
> > Thoughts?
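To dig further into why reads seem to be missing both the ARC and the L2ARC, the plan here is to watch the raw counters that arc_summary.pl reads from the arcstats kstat and diff two samples taken a few minutes apart. A rough sketch (stat names are the ones I believe our build exposes):

  # ARC size versus its target and hard limit
  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max

  # ARC and L2ARC hit/miss counters plus how much the L2ARC currently holds
  kstat -p zfs:0:arcstats | egrep 'hits|misses|l2_size'

If l2_size is still small, or l2_misses climbs about as fast as misses, the 32GB cache device simply isn't absorbing this working set yet.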
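A per-vdev view should also show directly whether the X25-E cache device is serving any of those reads instead of the spinning disks ("tank" below is just a stand-in for our pool name):

  # cache and log devices are listed as separate rows under the pool
  zpool iostat -v tank 5

If the data disks stay at several hundred r/s while the cache device row shows next to nothing, that would line up with the theory above.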