Jeff Bacon
2011-Jun-03 18:36 UTC
[zfs-discuss] Experiences with SAS, SATA SSDs, and Expanders (Re: Should Intel X25-E not be used with a SAS Expander?)
Let me throw two cents into the mix here.

Background: I have probably 8 different ZFS boxes, BYO using SMC chassis. The standard config now looks like this:

- CSE847-E26-1400LPB main chassis, X8DTH-iF board, dual X5670 CPUs, 96G RAM (some have 144G)
- Intel X520 dual-10G card
- 2 LSI 9211-8i controllers driving the internal backplanes, dual-attach
- 2 CSE847-E26-RJBOD1 chassis, connected via 4 LSI 9200-8e controllers (so everything is dual-connected, no daisy-chaining)
- primary disks are 2TB Constellation SAS drives - lots and lots of them
- some Cheetah 450G 15k drives
- some L2ARC and ZIL, mostly in the main chassis
- sol10U9
- the 2TBs are all in 7-disk RAIDZ2s, the Cheetahs are mirrors.

(Before you wonder: all of the ZFS filesystems are set compression=gzip. The data compresses anywhere from 2:1 to 5:1. So the actual record being written to disk is invariably of a variable size and less than 128k - so matching # of drives to 128k boundary is a pointless exercise from all that I've seen.)

Some of the machines are NFS fileservers, some are running a proprietary column-oriented time-series database.

They don't all follow the pattern, as it's evolved over time. I have one machine that has 100-some Barracuda 1TB drives hung off a bunch of LSI 3081E-R controllers in three zpools. Some use the onboard 82576 Intel NICs, no 10G. Some have a mix of 1068- and 2008-based controllers. CPUs vary, as does total memory. The main prod boxes are of course the most standard.

Note that I don't use the X25-Es like everyone else does; I use a mix of drives:

- Crucial C300 128G drives for L2ARC
- OCZ Vertex 2 Pro 100G drives for ZIL

I will be testing the new Intel 520s soon.

Personally, I've been noodling with Solaris for going on 20 years - I cut my teeth on SunOS 3.2.

-----

I spent a fair bit of time mucking with SAS/SATA interposers, and I agree with Richard that they're toxic.

a) they add latency - a not-inconsiderable amount
b) they're a bad idea.
The thing is, the drive behind the interposer is still a single port. So you have multipath ... some small gain... but is it really worth it? I figure that if we lose a disk controller in some serious way, we're hosed anyway and looking at a reboot, and we can figure on doing whatever minor reconfig is necessary so long as all the right pieces are in place.

Then too, it's a SAS controller on a tiny board. What firmware is it running? What bugs are in that firmware? Who's maintaining it? Do you really want another piece of weirdness in the chain?

Plus, I never found anyone who made a mount for the SMC CSE-846/847 cases that would allow mounting the interposer and the drive in a consistent way, and didn't feel like fabricating it myself. So that was that.

I have a drawer full of interposers, if anyone cares.

------

One criterion to look for in choosing a SATA SSD to use: it needs to report a GUID, so the expander chip and the driver can keep track of the drive as a unique entity that is connected via a series of ports/connections, not "some drive that's connected via that port on that backplane via that port off that controller".

I can't cite scientific evidence of a tie between that and "having issues", but the pattern holds.

-----

I freely mix the 2TB Constellations and the SATA drives on the same backplanes and controllers.

The main-line prod machines have been in continuous use for at least 6 months, if not more, and quite literally have the living crap beaten out of them. The primary db server has a pair of X5690s and 144G and runs flat out 24x7 and thrashes its disks like a mofo. We've learned a lot in terms of ARC tuning and other such fun, in an environment where we have probably 10,000 threads competing - 6 ZFS pools and a pile of Java applications. The L2ARC gets thrashed. The Cheetahs get pounded.

So far, I have yet to have a single error pop up with respect to drive timeouts or other such fun that you might expect from SAS/SATA conflicts.
They are amazingly solid, given they're built out of commodity parts and running stock Sol10U9.

------

Note however that the above is using the LSI 2008-based controllers with the Phase7 firmware. Phase4 and Phase5 were abysmal and put me through much pain and grief. If you haven't flashed your controllers, do yourself a favor and do so.

Note that I use the retail box LSI cards, not the SMC cards. I'm sure SMC is less than best pleased by that, but I'd rather be able to talk to LSI directly in case of issues, and so far LSI support's been pretty decent, to the extent that I've even needed it. (SMC consoles themselves by selling me piles of chassis, motherboards, and disks.)

-----

Note that, if you're using SATA drives on the SAS backplanes, the SATA drives will always only attach to the primary controller - so buying a dual-chip backplane does you zip for good; your SATA drives WILL NOT FAIL OVER. That's just how it is. Distribute your SATA drives across the backplanes and controllers, assume hardware will fail, and move on.

-----

SAS multipathing with Solaris - well, um, it sorta works. Unreliably.

As has been observed here, round-robin load-balancing in scsi_vhci hurts performance terribly. Don't do it.

As for which drives come up via which paths, it seems relatively random. I haven't dug much into the whys of it - I'm sure I could influence it, but it isn't worth my time to do so at the present.

-----

A tool that doesn't get mentioned much around here is SANTools' smartmon-ux, written by David Lethe (www.santools.com). I started talking to him a long time ago - he's a valuable resource for low-level SAS knowledge.

Anyway, the tool rather handily solves the problem of "how do I tell some remote-support guy which drive to pull" or "how do I reset the damn blinky lights on the enclosure" or "which drive is in what slot". It also does a whole bunch of other things. If you don't need a fancy GUI, this is your swiss-army-knife tool.
It's no longer cheap (sadly, my own fault, as my asking tons of questions and asking for enhancements resulted in, well, enhancements, which raised the value of the product rather quite a bit), but if you're driving a big ZFS box using SMC backplanes, you really want to look at this and I can't recommend the tool enough.

-bacon
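
-----

To put rough numbers on the earlier point about gzip compression and the 128k boundary, here is a minimal back-of-envelope sketch. It is an illustration under stated assumptions, not a measurement from these boxes: it assumes 512-byte sectors and models RAIDZ allocation with the commonly described rule (data sectors, plus nparity parity sectors per stripe row, rounded up to a multiple of nparity+1). The function names and the sample compression ratios are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def raidz_alloc_sectors(psize_bytes, ndisks, nparity, sector_size=512):
    """Sectors allocated for one block under a simplified RAIDZ model.

    Assumption: data sectors + nparity parity sectors per row of
    (ndisks - nparity) data sectors, rounded up to a multiple of
    (nparity + 1). This is a sketch, not the ZFS source.
    """
    data_disks = ndisks - nparity
    data = ceil_div(psize_bytes, sector_size)
    parity = nparity * ceil_div(data, data_disks)
    return ceil_div(data + parity, nparity + 1) * (nparity + 1)

RECORD = 128 * 1024  # logical record size before compression

for ratio in (1, 2, 3, 5):          # illustrative compression ratios
    psize = ceil_div(RECORD, ratio)  # compressed size in bytes
    data = ceil_div(psize, 512)      # data sectors actually written
    for ndisks in (6, 7):            # 6-disk vs 7-disk RAIDZ2
        data_disks = ndisks - 2
        total = raidz_alloc_sectors(psize, ndisks, 2)
        even = "even" if data % data_disks == 0 else "uneven"
        print(f"{ratio}:1  {ndisks}-disk RAIDZ2  data={data:3d} sectors "
              f"({even} over {data_disks} data disks)  allocated={total}")
```

Uncompressed, a 128k record is 256 data sectors, which splits evenly over the 4 data disks of a 6-disk RAIDZ2 but not over the 5 data disks of a 7-disk one. Once compression shrinks the block to an arbitrary sector count (86 sectors at 3:1 in this model), the split is uneven for either width, which is the sense in which tuning drive count to the 128k boundary stops mattering.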