Seems a nice sale on Newegg for SSD devices. Talk about choices. What are the latest recommendations for a log device?

http://bit.ly/aL1dne
-- 
This message posted from opensolaris.org
On Tue, 2010-04-06 at 08:26 -0700, Anil wrote:
> Seems a nice sale on Newegg for SSD devices. Talk about choices. What are the latest recommendations for a log device?
>
> http://bit.ly/aL1dne

The Vertex LE models should do well as ZIL (though not as well as an
X25-E or a Zeus) for all non-enterprise users.

The X25-M is still the best choice for an L2ARC device, but the Vertex
Turbo or Corsair Nova are good if you're on a budget.

If you really want an SSD as a boot drive, or just need something for
L2ARC, the various Intel X25-V models are cheap, if not really great
performers. I'd recommend one of these if you want an SSD for rpool, or
if you need a large L2ARC for dedup (or similar) and can't afford
anything in the X25-M price range. You should also be OK with a Corsair
Reactor in this performance category.

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
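[As a practical footnote to the question: once you have picked a device, putting it to work as a log device is a single command. A minimal sketch, assuming a data pool named tank and an SSD at c2t1d0 (both names are placeholders, not from the thread):

    # dedicate the SSD to the pool as a separate intent log (slog)
    zpool add tank log c2t1d0

    # or, with two devices, mirror the slog for redundancy
    zpool add tank log mirror c2t1d0 c2t2d0

    # recent builds (zpool version 19 and later) can remove a log device again
    zpool remove tank c2t1d0

The same pattern with "cache" instead of "log" attaches a device as L2ARC.]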
On 4/6/2010 3:41 PM, Erik Trimble wrote:
> On Tue, 2010-04-06 at 08:26 -0700, Anil wrote:
>
>> Seems a nice sale on Newegg for SSD devices. Talk about choices. What are the latest recommendations for a log device?
>>
>> http://bit.ly/aL1dne
>>
> The Vertex LE models should do well as ZIL (though not as well as an
> X25-E or a Zeus) for all non-enterprise users.
>
> The X25-M is still the best choice for an L2ARC device, but the Vertex
> Turbo or Corsair Nova are good if you're on a budget.
>
> If you really want an SSD as a boot drive, or just need something for
> L2ARC, the various Intel X25-V models are cheap, if not really great
> performers. I'd recommend one of these if you want an SSD for rpool, or
> if you need a large L2ARC for dedup (or similar) and can't afford
> anything in the X25-M price range. You should also be OK with a Corsair
> Reactor in this performance category.
>
What about if you want to get one that you can use for both the rpool
and ZIL (for another data pool)?
What if you want one for all 3 (rpool, ZIL, L2ARC)?

-Kyle
On Tue, 2010-04-06 at 19:43 -0400, Kyle McDonald wrote:
> On 4/6/2010 3:41 PM, Erik Trimble wrote:
> > On Tue, 2010-04-06 at 08:26 -0700, Anil wrote:
> >
> >> Seems a nice sale on Newegg for SSD devices. Talk about choices. What are the latest recommendations for a log device?
> >>
> >> http://bit.ly/aL1dne
> >>
> > The Vertex LE models should do well as ZIL (though not as well as an
> > X25-E or a Zeus) for all non-enterprise users.
> >
> > The X25-M is still the best choice for an L2ARC device, but the Vertex
> > Turbo or Corsair Nova are good if you're on a budget.
> >
> > If you really want an SSD as a boot drive, or just need something for
> > L2ARC, the various Intel X25-V models are cheap, if not really great
> > performers. I'd recommend one of these if you want an SSD for rpool, or
> > if you need a large L2ARC for dedup (or similar) and can't afford
> > anything in the X25-M price range. You should also be OK with a Corsair
> > Reactor in this performance category.
> >
> What about if you want to get one that you can use for both the rpool
> and ZIL (for another data pool)?
> What if you want one for all 3 (rpool, ZIL, L2ARC)?
>
> -Kyle
>

It all boils down to performance and the tradeoffs you are willing to
make. For good ZIL, you want something that has a very high IOPS rating
(50,000+ if possible, 10,000+ minimum, particularly when writing small
chunks). For L2ARC, you are more concerned with total size/capacity, and
modest IOPS (3000-10000 IOPS, or the ability to write at least 100Mb/s
at 4-8k write sizes, plus as high as possible read I/O). For rpool use,
you don't really care about performance so much, as it's almost
exclusively read-only (one should generally not configure a swap device
on an SSD-based rpool).

You could probably live with an X25-M as something to use for all three,
but of course you're making tradeoffs all over the place.

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On Apr 6, 2010, at 5:00 PM, Erik Trimble wrote:
> On Tue, 2010-04-06 at 19:43 -0400, Kyle McDonald wrote:
>> On 4/6/2010 3:41 PM, Erik Trimble wrote:
>>> On Tue, 2010-04-06 at 08:26 -0700, Anil wrote:
>>>
>>>> Seems a nice sale on Newegg for SSD devices. Talk about choices. What are the latest recommendations for a log device?
>>>>
>>>> http://bit.ly/aL1dne
>>>>
>>> The Vertex LE models should do well as ZIL (though not as well as an
>>> X25-E or a Zeus) for all non-enterprise users.
>>>
>>> The X25-M is still the best choice for an L2ARC device, but the Vertex
>>> Turbo or Corsair Nova are good if you're on a budget.
>>>
>>> If you really want an SSD as a boot drive, or just need something for
>>> L2ARC, the various Intel X25-V models are cheap, if not really great
>>> performers. I'd recommend one of these if you want an SSD for rpool, or
>>> if you need a large L2ARC for dedup (or similar) and can't afford
>>> anything in the X25-M price range. You should also be OK with a Corsair
>>> Reactor in this performance category.
>>>
>> What about if you want to get one that you can use for both the rpool
>> and ZIL (for another data pool)?
>> What if you want one for all 3 (rpool, ZIL, L2ARC)?
>>
>> -Kyle
>>
>
> It all boils down to performance and the tradeoffs you are willing to
> make. For good ZIL, you want something that has a very high IOPS rating
> (50,000+ if possible, 10,000+ minimum, particularly when writing small
> chunks).

High write IOPS :-)

> For L2ARC, you are more concerned with total size/capacity, and
> modest IOPS (3000-10000 IOPS, or the ability to write at least 100Mb/s
> at 4-8k write sizes, plus as high as possible read I/O).

The L2ARC fill rate is throttled to 16 MB/sec at boot and 8 MB/sec later.
Many SSDs work well as L2ARC cache devices.

> For rpool use,
> you don't really care about performance so much, as it's almost
> exclusively read-only

Yep

> (one should generally not configure a swap device
> on an SSD-based rpool).

Disagree. Swap is a perfectly fine workload for SSDs. Under ZFS,
even more so. I'd really like to squash this rumour and thought we
were making progress on that front :-( Today, there are millions, or at
least thousands, of systems deployed with SSDs as boot and swap on a
wide variety of OSes. Go for it.

> You could probably live with an X25-M as something to use for all three,
> but of course you're making tradeoffs all over the place.

That would be better than almost any HDD on the planet because
the HDD tradeoffs result in much worse performance.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
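[For anyone who takes Richard's advice, swap on a ZFS root is just a zvol. A minimal sketch, assuming an OpenSolaris-style rpool, that no rpool/swap volume exists yet, and a 4 GB size chosen purely for illustration:

    # create a 4 GB zvol on the root pool and enable it as swap
    zfs create -V 4G rpool/swap
    swap -a /dev/zvol/dsk/rpool/swap
    swap -l                            # verify the new swap device is listed

    # make it persistent across reboots with an /etc/vfstab entry:
    # /dev/zvol/dsk/rpool/swap  -  -  swap  -  no  -
]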
Erik Trimble wrote:
> On Tue, 2010-04-06 at 08:26 -0700, Anil wrote:
>> Seems a nice sale on Newegg for SSD devices. Talk about choices. What are the latest recommendations for a log device?
>>
>> http://bit.ly/aL1dne
>
> The Vertex LE models should do well as ZIL (though not as well as an
> X25-E or a Zeus) for all non-enterprise users.

I just found an 8 GB SATA Zeus (Z4S28I) for £83.35 (~US$127) shipped to California. That should be more than large enough for my ZIL @home, based on zilstat.

The web site says EOL, limited to current stock.

http://www.dpieshop.com/stec-zeus-z4s28i-8gb-25-sata-ssd-solid-state-drive-industrial-temp-p-410.html

Of course this seems _way_ too good to be true, but I decided to take the risk.

-- 
Carson
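[A note on that sizing approach: zilstat, Richard Elling's DTrace-based script whose output appears later in this thread, reports how many bytes each transaction group pushes through the ZIL. A minimal sketch of using it to sanity-check a slog purchase, assuming the pool is named tank and zilstat.ksh is in the current directory (both assumptions):

    # watch ZIL traffic per transaction group while running a realistic workload
    ./zilstat.ksh -p tank txg
    # the B-Bytes column is what actually lands on the log device per txg;
    # as a rough, non-authoritative rule of thumb a slog only needs to hold a
    # few txgs' worth of that peak value, so even the largest figure posted
    # later in this thread (~245 MB in one txg) fits easily in an 8 GB device.
]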
On Tue, 2010-04-06 at 17:17 -0700, Richard Elling wrote:
> On Apr 6, 2010, at 5:00 PM, Erik Trimble wrote:

[snip]

> > For L2ARC, you are more concerned with total size/capacity, and
> > modest IOPS (3000-10000 IOPS, or the ability to write at least 100Mb/s
> > at 4-8k write sizes, plus as high as possible read I/O).
>
> The L2ARC fill rate is throttled to 16 MB/sec at boot and 8 MB/sec later.
> Many SSDs work well as L2ARC cache devices.
>

Where is that limit set? That's completely new to me. :-(

In any case, L2ARC devices should probably have at least reasonable write
performance for small sizes, given the propensity to put things like the
DDT and other table structures/metadata into it, all of which is small
write chunks. I tried one of the old JMicron-based 1st-gen SSDs as an
L2ARC, and it wasn't much of a success.

Fast read speed is good for an L2ARC, but that's not generally a problem
with even the cheap SSDs these days.

> > (one should generally not configure a swap device
> > on an SSD-based rpool).
>
> Disagree. Swap is a perfectly fine workload for SSDs. Under ZFS,
> even more so. I'd really like to squash this rumour and thought we
> were making progress on that front :-( Today, there are millions, or at
> least thousands, of systems deployed with SSDs as boot and swap on a
> wide variety of OSes. Go for it.

Really? I've generally not had good results running swap on lower-performing
SSDs over here in Java-land, but that may have to do with my specific
workload. I'll take your word for it (of course, I'm voting for swap not
being necessary on many machines these days).

> > You could probably live with an X25-M as something to use for all three,
> > but of course you're making tradeoffs all over the place.
>
> That would be better than almost any HDD on the planet because
> the HDD tradeoffs result in much worse performance.
> -- richard
>

True. Viva la SSD!

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On 04/06/10 17:17, Richard Elling wrote:
>> You could probably live with an X25-M as something to use for all three,
>> but of course you're making tradeoffs all over the place.
>
> That would be better than almost any HDD on the planet because
> the HDD tradeoffs result in much worse performance.

Indeed. I've set up a couple of small systems (one a desktop workstation, and the other a home fileserver) with the root pool plus the L2ARC and slog for a data pool on an 80G X25-M and have been very happy with the result.

The recipe I'm using is to slice the SSD, with the rpool in s0 taking roughly half the space, 1 GB in s3 for the slog, and the rest of the space as L2ARC in s4.

That may actually be overly generous for the root pool, but I run with copies=2 on rpool/ROOT and I tend to keep a bunch of BEs around.

 - Bill
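[For anyone wanting to copy that layout, the data-pool side is just a matter of handing the slices to zpool. A minimal sketch, assuming the SSD shows up as c1t0d0 (a placeholder), that s0 already carries the root pool, and that the data pool is named tank:

    # after slicing the SSD with format(1M) so that s3 is ~1 GB and s4
    # holds the remaining space:
    zpool add tank log c1t0d0s3      # dedicated slog for the data pool
    zpool add tank cache c1t0d0s4    # L2ARC cache device for the data pool
    zpool status tank                # the new "logs" and "cache" sections should appear

The root pool itself normally goes onto s0 at install time, so only the log and cache slices need to be added by hand.]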
On Apr 6, 2010, at 5:38 PM, Erik Trimble wrote:
> On Tue, 2010-04-06 at 17:17 -0700, Richard Elling wrote:
>> On Apr 6, 2010, at 5:00 PM, Erik Trimble wrote:
>
> [snip]
>
>>> For L2ARC, you are more concerned with total size/capacity, and
>>> modest IOPS (3000-10000 IOPS, or the ability to write at least 100Mb/s
>>> at 4-8k write sizes, plus as high as possible read I/O).
>>
>> The L2ARC fill rate is throttled to 16 MB/sec at boot and 8 MB/sec later.
>> Many SSDs work well as L2ARC cache devices.
>>
>
> Where is that limit set? That's completely new to me. :-(

L2ARC_WRITE_SIZE (8MB) is the default size of data to be written and
L2ARC_FEED_SECS (1) is the interval. When arc_warm is FALSE, the
L2ARC_WRITE_SIZE is doubled (16MB). Look somewhere around
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c#553

This change was made per CR 6709301, "An empty L2ARC cache device is slow to warm up":
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6709301

I'll agree the feed rate is somewhat arbitrary, but probably suits many use cases.

> In any case, L2ARC devices should probably have at least reasonable write
> performance for small sizes, given the propensity to put things like the
> DDT and other table structures/metadata into it, all of which is small
> write chunks. I tried one of the old JMicron-based 1st-gen SSDs as an
> L2ARC, and it wasn't much of a success.

I haven't done many L2ARC measurements, but I suspect the writes are large.

> Fast read speed is good for an L2ARC, but that's not generally a problem
> with even the cheap SSDs these days.

yep.

>>> (one should generally not configure a swap device
>>> on an SSD-based rpool).
>>
>> Disagree. Swap is a perfectly fine workload for SSDs. Under ZFS,
>> even more so. I'd really like to squash this rumour and thought we
>> were making progress on that front :-( Today, there are millions, or at
>> least thousands, of systems deployed with SSDs as boot and swap on a
>> wide variety of OSes. Go for it.
>
> Really? I've generally not had good results running swap on lower-performing
> SSDs over here in Java-land, but that may have to do with my specific
> workload. I'll take your word for it (of course, I'm voting for swap not
> being necessary on many machines these days).

If you have to swap, you have no performance. But people with SSDs
(eg MacBook Air) seem happy to see fewer spinning beach balls :-)
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
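[If you want to inspect or experiment with that throttle on a live system, the compile-time constants back a set of global tunables in arc.c. A minimal sketch, assuming a build where the l2arc_write_max, l2arc_write_boost, and l2arc_feed_secs globals are present (check arc.c for your build; names and defaults can differ, and this is unsupported tuning):

    # inspect the current L2ARC feed settings (bytes, bytes, seconds)
    echo l2arc_write_max/E   | mdb -k
    echo l2arc_write_boost/E | mdb -k
    echo l2arc_feed_secs/E   | mdb -k

    # experimental /etc/system entries to raise the fill rate to 32 MB/sec
    # (takes effect after a reboot; use at your own risk)
    set zfs:l2arc_write_max = 33554432
    set zfs:l2arc_write_boost = 33554432
]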
On Tue, Apr 06, 2010 at 06:53:04PM -0700, Richard Elling wrote:
> >> Disagree. Swap is a perfectly fine workload for SSDs. Under ZFS,
> >> even more so. I'd really like to squash this rumour and thought we
> >> were making progress on that front :-( Today, there are millions, or at
> >> least thousands, of systems deployed with SSDs as boot and swap on a
> >> wide variety of OSes. Go for it.

+1

> > Really? I've generally not had good results running swap on lower-performing
> > SSDs over here in Java-land, but that may have to do with my specific
> > workload. I'll take your word for it (of course, I'm voting for swap not
> > being necessary on many machines these days).
>
> If you have to swap, you have no performance.

Disagree. If you're thrashing heavily, yes. An SSD will make a difference in swap latency up until that point, but that won't help much when everything is starved for memory.

However, a lot can happen before that point. Swapping out unused stuff (including idle services/processes and old tmpfs pages) can be very useful for performance, making room for the performance-sensitive working set. Some of your lower-priority processes can page in and out faster with an SSD, smoothing the curve from memory pressure to total gridlock.

Finally, this middle ground is where SSD root also helps, because executable text is paged in from there.

-- 
Dan.
On Tue, Apr 06, 2010 at 05:22:25PM -0700, Carson Gaspar wrote:
> I just found an 8 GB SATA Zeus (Z4S28I) for £83.35 (~US$127) shipped to
> California. That should be more than large enough for my ZIL @home,
> based on zilstat.

Transcend sells an 8 GByte SLC SSD for about 70 EUR. The specs are not awe-inspiring though (I used it in an embedded firewall).

> The web site says EOL, limited to current stock.
>
> http://www.dpieshop.com/stec-zeus-z4s28i-8gb-25-sata-ssd-solid-state-drive-industrial-temp-p-410.html
>
> Of course this seems _way_ too good to be true, but I decided to take
> the risk.

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
Carson Gaspar wrote:
> I just found an 8 GB SATA Zeus (Z4S28I) for £83.35 (~US$127) shipped to
> California. That should be more than large enough for my ZIL @home,
> based on zilstat.
>
> The web site says EOL, limited to current stock.
>
> http://www.dpieshop.com/stec-zeus-z4s28i-8gb-25-sata-ssd-solid-state-drive-industrial-temp-p-410.html
>
> Of course this seems _way_ too good to be true, but I decided to take
> the risk.

Following up, I just added the STEC Zeus to my pool. The SSD is on the motherboard's ICH7 AHCI port; the rust is all on an LSI 1068E. Anyone know how these numbers stack up against an X25-E? It certainly was a great deal for the price!

        NAME         STATE     READ WRITE CKSUM
        vault        ONLINE       0     0     0
          raidz2-0   ONLINE       0     0     0
            c7t9d0   ONLINE       0     0     0
            c7t12d0  ONLINE       0     0     0
            c7t11d0  ONLINE       0     0     0
            c7t13d0  ONLINE       0     0     0
            c7t14d0  ONLINE       0     0     0
            c7t8d0   ONLINE       0     0     0
            c7t10d0  ONLINE       0     0     0
            c7t15d0  ONLINE       0     0     0
        logs
          c9t3d0     ONLINE       0     0     0

Before (Mac OS 10.6.3 NFS client over GigE, local subnet, source file in RAM):

carson:arthas 0 $ time tar jxf /Volumes/RamDisk/gcc-4.4.3.tar.bz2

real    92m33.698s
user    0m20.291s
sys     0m37.978s

After:

carson:arthas 130 $ time tar jxf /Volumes/RamDisk/gcc-4.4.3.tar.bz2

real    6m52.888s
user    0m18.159s
sys     0m33.909s

Some data from a 5 second interval iostat of the SSD during the tar:

                        extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  747.8    0.0 3055.3  0.0  0.4    0.0    0.5   1  36 c9t3d0
    0.0  743.5    0.0 3010.9  0.0  0.4    0.0    0.5   0  35 c9t3d0
    0.0  745.8    0.0 3087.3  0.0  0.4    0.0    0.5   1  36 c9t3d0
    0.0  744.2    0.0 3088.8  0.0  0.4    0.0    0.5   1  36 c9t3d0
    0.0  739.4    0.0 3166.4  0.0  0.4    0.0    0.5   1  36 c9t3d0

And from zilstat -p vault txg:

     txg    N-Bytes  N-Bytes/s N-Max-Rate    B-Bytes  B-Bytes/s B-Max-Rate    ops  <=4kB 4-32kB >=32kB
  744952    7525472     250849     438848   94437376    3147912    3653632  21961  21443    503     15
  744953    8185360     272845     662632   94330880    3144362    3899392  21761  21440    267     54
  744954    9337848     311261    2596120   93069312    3102310    6045696  21502  21275    186     41
  744955    6789952     226331     416336   91144192    3038139    3440640  21838  21665    157     16
  744956   51461280    1715376   10132760  155000832    5166694   15319040  19621  15813   3290    518
  744957  100863840    3362128    7033128  245440512    8181350   13565952  15589  10968   2720   1901
  744958   38813592    1293786    5668728  136380416    4546013   11210752  19155  17173   1407    575
  744959   46950168    1514521    5705184  150495232    4854684   10326016  18784  15418   2740    626
  744960   35539504    1184650    2764008  137379840    4579328    7311360  19919  16287   3044    588
On Sun, 18 Apr 2010, Carson Gaspar wrote:
>
> Before (Mac OS 10.6.3 NFS client over GigE, local subnet, source file in
> RAM):
>
> carson:arthas 0 $ time tar jxf /Volumes/RamDisk/gcc-4.4.3.tar.bz2
>
> real    92m33.698s
> user    0m20.291s
> sys     0m37.978s

That's awful!

> carson:arthas 130 $ time tar jxf /Volumes/RamDisk/gcc-4.4.3.tar.bz2
>
> real    6m52.888s
> user    0m18.159s
> sys     0m33.909s

That is a massive improvement!

On a Mac G5 running Leopard (and with the wife running NeoOffice and other apps on the console), I see this extraction time here to my NFS server:

tar jxf ~/scratch/gcc-4.4.3.tar.bz2  54.76s user 99.63s system 39% cpu 6:30.00 total

% ./zilstat.ksh -p Sun_2540 txg
waiting for txg commit...
     txg    N-Bytes  N-Bytes/s N-Max-Rate    B-Bytes  B-Bytes/s B-Max-Rate    ops  <=4kB 4-32kB >=32kB
 7299113    9053088     646649    1298896   52330496    3737892    5058560  10625   9804    764     57
 7299114   18021496     600716    2720424  105238528    3507950    6709248  21688  20829    718    141
 7299115    6199592     206653     691992   46944256    1564808    3780608  10742  10414    308     20

I see similar times when using my old Sun Blade 2500 ("Red") as a client:

% time (bzcat scratch/gcc-4.4.3.tar.bz2 | tar -xf -)
(; bzcat scratch/gcc-4.4.3.tar.bz2 | tar -xf -; )  68.31s user 136.00s system 50% cpu 6:42.77 total

My NFS server is a Sun Ultra-40 M2 with storage on a Sun StorageTek 2540 arranged as six mirrors. No SSDs are used for the intent log. The StorageTek 2540 seems to offer 330MB of battery-backed cache per controller.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Bob Friesenhahn wrote:
> On Sun, 18 Apr 2010, Carson Gaspar wrote:
>>
>> Before (Mac OS 10.6.3 NFS client over GigE, local subnet, source file
>> in RAM):
>>
>> carson:arthas 0 $ time tar jxf /Volumes/RamDisk/gcc-4.4.3.tar.bz2
>>
>> real    92m33.698s
>> user    0m20.291s
>> sys     0m37.978s
>
> That's awful!
...
> tar jxf ~/scratch/gcc-4.4.3.tar.bz2  54.76s user 99.63s system 39% cpu
> 6:30.00 total
...
> My NFS server is a Sun Ultra-40 M2 with storage on a Sun StorageTek 2540
> arranged as six mirrors. No SSDs are used for the intent log. The
> StorageTek 2540 seems to offer 330MB of battery-backed cache per
> controller.

Yes, that much BB cache is plenty to make the 18MB per TXG ZIL writes (based on your zilstat output) very fast indeed ;-)

Sadly my pool is raidz2 sans controller cache, thus the amazingly crappy performance sans SLOG.

-- 
Carson
I did the same experiment in a VMware guest (SLES 10 x64). The archive was stored on the vdisk and untarring went to the same vdisk.

The storage backend is a Sun system with 64 GB RAM, 2 quad-core CPUs, 24 SAS disks of 450 GB each, 4 vdevs with 6 disks as RAIDZ2, and an Intel X25-E as log device (c2t1d0). A StorageTek SAS RAID host bus adapter with 256 MB RAM and BBU serves the zpool, and a second HBA handles the slog device. c3 is for the zpool and c2 for the slog (c2t1d0)/boot (c2t0d0) devices. There are currently 140 VMs running, used over NFS from vSphere 4 with two 1 Gb/s links.

Before untarring:

zd-nms-s5:/build # iostat -indexC 5
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    0.0  396.0    0.0 9428.3  0.0  0.1    0.0    0.2   0   5   0   0   0   0 c2
    0.0   14.0    0.0   61.9  0.0  0.0    0.0    2.8   0   1   0   0   0   0 c2t0d0
    0.0  382.0    0.0 9366.4  0.0  0.0    0.0    0.1   0   3   0   0   0   0 c2t1d0
  265.4    0.0 3631.2    0.0  0.0  1.2    0.0    4.3   0 105   0   0   0   0 c3
    9.8    0.0  148.2    0.0  0.0  0.0    0.0    3.4   0   3   0   0   0   0 c3t0d0
    8.8    0.0  137.7    0.0  0.0  0.0    0.0    3.6   0   3   0   0   0   0 c3t1d0
....

During untarring:

zd-nms-s5:/build # iostat -indexC 5
                          extended device statistics         ---- errors ---
    r/s    w/s   kr/s    kw/s wait  actv wsvc_t asvc_t  %w   %b s/w h/w trn tot device
    0.0 1128.3    0.0 31713.6  0.0   0.2    0.0    0.1   0   12   0   0   0   0 c2
    0.0    0.0    0.0     0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c2t0d0
    0.0 1128.3    0.0 31713.6  0.0   0.2    0.0    0.1   1   12   0   0   0   0 c2t1d0
 2005.7 5708.9 7423.7 42041.5  0.1  61.7    0.0    8.0   0 1119   0   0   0   0 c3
   82.8  602.2  364.9  2408.4  0.0   4.4    0.0    6.4   1   68   0   0   0   0 c3t0d0
   72.4  601.6  288.5  2452.7  0.0   4.2    0.0    6.2   1   61   0   0   0   0 c3t1d0
....

zd-nms-s5:/build # time tar jxf /tmp/gcc-4.4.3.tar.bz2

real    0m58.086s
user    0m12.241s
sys     0m6.552s

Andreas
-- 
This message posted from opensolaris.org