Kyle McDonald
2008-Jan-22 17:47 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Are there, or does it make any sense to try to find, RAID cards with battery backup that will ignore ZFS commit (cache flush) commands when the battery can guarantee stable storage? I don't know whether any do this, but I've recently had good non-ZFS performance with the IBM ServeRAID 8k RAID controller in an xSeries server I was using. The 8k has 256MB of battery-backed cache.

The server it was in only had 6 drive bays, and I'm not looking to have it do RAID5 for ZFS, but I just had the idea: "Hey, I wonder if I could set up the card with 5 (single-drive) RAID 0 LUNs, and gain the advantage of the 256MB battery-backed cache, when I tell ZFS to do RAIDZ across them?"

I know battery-backed cache and proper commit semantics are generally found only on higher-end RAID controllers and arrays (right?). But I'm wondering now whether I couldn't get an 8-port SATA controller that would let me map each single drive as a RAID 0 LUN and use its cache to boost performance.

My primary use case is NFS-based storage for a farm of software build servers and developer desktops.

Has anyone searched for this already? Has anyone found any reasons why it wouldn't work?

-Kyle
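For illustration, the layout being described here - one RAID 0 LUN per physical disk, with ZFS providing the redundancy across them - would be built along these lines. The pool and device names are placeholders for however the controller actually presents its five LUNs; this is only a sketch of the idea, not a tested recipe for the ServeRAID cards:

    # five single-drive RAID 0 LUNs presented by the controller,
    # combined into one raidz vdev so ZFS owns the redundancy
    zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0

    # check the resulting layout and health
    zpool status tank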
Albert Chin
2008-Jan-22 19:14 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
> My primary use case is NFS-based storage for a farm of software build
> servers and developer desktops.

For the above environment, you'll probably see a noticeable improvement with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive cards exist for the common consumer (with ECC memory anyway). If you convince http://www.micromemory.com/ to sell you one, let us know :)

Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of improvement you can expect. Don't use this in production though.

-- 
albert chin (china at thewrittenword.com)
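For reference, the test Albert describes amounts to adding one line to /etc/system, rebooting, and re-running a synchronous-write workload (an NFS client untarring a source tree, for example), then comparing against the default. The workload is just an example; the setting disables ZIL protection for every pool on the host, so it is strictly a benchmarking aid:

    * /etc/system -- disable the ZIL globally (benchmarking only!)
    set zfs:zil_disable = 1

After a reboot, time the same sync-heavy job from an NFS client (e.g. "time tar xf src.tar" into the exported filesystem) with the line present and with it removed.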
Kyle McDonald
2008-Jan-22 20:36 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
>> My primary use case is NFS-based storage for a farm of software build
>> servers and developer desktops.
>
> For the above environment, you'll probably see a noticeable improvement
> with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive
> cards exist for the common consumer (with ECC memory anyway). If you
> convince http://www.micromemory.com/ to sell you one, let us know :)

I know, but for that card you need a driver to make it appear as a device. Plus it would take a PCI slot. I was hoping to make use of the battery-backed RAM on a RAID card that I already have (but can't use, since I want to let ZFS do the redundancy.)

If I had a card with battery-backed RAM, how would I go about testing the commit semantics to see if it only obeys ZFS commits when the battery is bad? Does anyone know if the IBM ServeRAID 7k or 8k do this correctly? If not, any chance of getting IBM to 'fix' the firmware? The Solaris Redbooks I've read seem to think highly of ZFS.

Back on the subject of NVRAM for ZIL devices: what are people using for ZIL devices on the budget-limited side of things? I've found some SATA flash drives, and a bunch that are IDE. Unfortunately the HW I'd like to stick this in is a little older... it's got a U320 SCSI controller in it. Has anyone found a good U320 flash disk that's not overkill size-wise and not outrageously expensive? Google found what appear to be a few OEM vendors, but no resellers in the quantity I'd be interested in.

Anyone using a USB flash drive? Is USB fast enough to gain any benefits?

-Kyle

> Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of
> improvement you can expect. Don't use this in production though.
Carson Gaspar
2008-Jan-22 21:33 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Kyle McDonald wrote:
...
> I know, but for that card you need a driver to make it appear as a
> device. Plus it would take a PCI slot.
> I was hoping to make use of the battery-backed RAM on a RAID card that I
> already have (but can't use, since I want to let ZFS do the redundancy.)
> If I had a card with battery-backed RAM, how would I go about testing
> the commit semantics to see if it only obeys ZFS commits when the
> battery is bad?

Any _sane_ controller that supports battery-backed cache will disable its write cache if its battery goes bad. It should also log this. I'd check the docs or contact your vendor's tech support to verify that the card you have is sane, and that it reports the error to its monitoring tools so you find out about it quickly.

Now you'll probably _still_ need to disable the ZFS cache flushes, which is a global option, so you'd need to make sure that _all_ your ZFS devices had battery-backed write caches or no write caches at all.

-- 
Carson
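The global option Carson refers to is, on builds of that era, the zfs_nocacheflush tunable (older builds used a per-device sd.conf workaround instead). A minimal sketch, on the assumption that every device behind every pool on the host has a battery-backed write cache or no write cache at all:

    * /etc/system -- stop ZFS from issuing cache-flush commands to devices
    * (only safe if no pool device has a volatile write cache)
    set zfs:zfs_nocacheflush = 1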
Kyle McDonald
2008-Jan-23 02:20 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Carson Gaspar wrote:
> Kyle McDonald wrote:
> ...
>> I know, but for that card you need a driver to make it appear as a
>> device. Plus it would take a PCI slot.
>> I was hoping to make use of the battery-backed RAM on a RAID card that I
>> already have (but can't use, since I want to let ZFS do the redundancy.)
>> If I had a card with battery-backed RAM, how would I go about testing
>> the commit semantics to see if it only obeys ZFS commits when the
>> battery is bad?
>
> Any _sane_ controller that supports battery-backed cache will disable
> its write cache if its battery goes bad. It should also log this. I'd
> check the docs or contact your vendor's tech support to verify that the
> card you have is sane, and that it reports the error to its monitoring
> tools so you find out about it quickly.

You're right, I forgot that. Not only would the commits need to happen right away, but the cache should be disabled completely. Now that you mention it, I know from experience that for the ServeRAID 7k/8k controllers the cache is disabled if/when the battery fails. Good point.

Now I just need to determine whether a) the cache is used by the card even when using the disks on it as JBOD, or b) the card will allow me to make 5 or 6 RAID 0 LUNs with only 1 disk in each, to simulate (a) and activate the write cache.

Anyone know the answer to this? I'll be ordering 2 of the 7k's for my x346's this week. If neither A nor B will work, I'm not sure there's any advantage to using the 7k card, considering I want ZFS to do the mirroring.

If this all does work, it should speed up all the writes to the disk, including the ZIL writes. Is there still an advantage to investigating a solid state disk or flash drive device to relocate the ZIL to?

> Now you'll probably _still_ need to disable the ZFS cache flushes, which
> is a global option, so you'd need to make sure that _all_ your ZFS
> devices had battery-backed write caches or no write caches at all.

I guess this is a better solution than chasing down firmware authors to get them to ignore flush requests. It's just too bad it's not settable on a pool-by-pool basis rather than server by server. It won't affect me, though; this will be the only pool on this machine.

-Kyle
Kyle McDonald
2008-Jan-23 02:48 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Kyle McDonald wrote:
> Now I just need to determine whether a) the cache is used by the card even
> when using the disks on it as JBOD, or b) the card will allow me to
> make 5 or 6 RAID 0 LUNs with only 1 disk in each, to simulate (a) and
> activate the write cache.

I found docs at IBM that make me think that (B) at least will work. The 7k can use up to 30 disks, and can allow the host to see as many as 8 LUNs. IBM's description of RAID 0 is that it requires a minimum of 1 drive, so I can't see why I can't create 5 or 6 one-drive RAID 0 LUNs to use the 7k's 256MB cache with ZFS.

Next question: with a single-drive RAID 0 LUN, will the card's stripe unit size be a factor, and if so, how should I set it? I know ZFS likes to write to the disk in 128K chunks. Is that 128K to each vdev, or 128K/n to each vdev? This card allows stripe unit sizes of 8k, 16k, 32k, and 64k. I'm guessing that if ZFS will be sending 128K to each vdev at once nearly all the time, I should use 64k. If it's 128K across the 5 vdevs, then 16k or 32k might be better?? In either case, is there an advantage to tuning ZFS's record size down to match the card's stripe unit size?

> If this all does work, it should speed up all the writes to the disk,
> including the ZIL writes. Is there still an advantage to investigating a
> solid state disk or flash drive device to relocate the ZIL to?

I'm still going to investigate this further. I think I'll try to calculate the max ZIL size once I'm up and running, and see if I can't get a cheap USB flash drive of a decent size to test this with.

-Kyle
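For what it's worth, a full 128K record is written to a single top-level vdev; within a raidz vdev it is then split across the data disks, so with five single-LUN disks in one raidz group each disk sees roughly 128K/4 of data plus its share of parity. The record size is per-filesystem and can be inspected or tuned; the pool and filesystem names below are examples only, and the 32K value is illustrative rather than a recommendation:

    # check the default record size (128K) on the pool
    zfs get recordsize tank

    # optionally lower it for one filesystem, e.g. to line up with a
    # 32k controller stripe unit
    zfs set recordsize=32K tank/builds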
Albert Chin
2008-Jan-23 03:43 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
> Anyone know the answer to this? I'll be ordering 2 of the 7k's for
> my x346's this week. If neither A nor B will work, I'm not sure
> there's any advantage to using the 7k card, considering I want ZFS to
> do the mirroring.

Why even bother with a H/W RAID array when you won't use the H/W RAID? Better to find a decent SAS/FC JBOD with cache. Would definitely be cheaper.

-- 
albert chin (china at thewrittenword.com)
Carson Gaspar
2008-Jan-23 03:53 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>> Anyone know the answer to this? I'll be ordering 2 of the 7k's for
>> my x346's this week. If neither A nor B will work, I'm not sure
>> there's any advantage to using the 7k card, considering I want ZFS to
>> do the mirroring.
>
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.

Please name some candidate cards matching that description - I don't know of any.

-- 
Carson
Kyle McDonald
2008-Jan-23 06:46 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>> Anyone know the answer to this? I'll be ordering 2 of the 7k's for
>> my x346's this week. If neither A nor B will work, I'm not sure
>> there's any advantage to using the 7k card, considering I want ZFS to
>> do the mirroring.
>
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.

I've never heard of such a thing. Do you have any links (cheap or not)?

Do they exist for less than $350? That's what the 7k will run me. Do they include an enclosure for at least 6 disks? The 7k will use the 6 U320 hot-swap bays already in my IBM x346 chassis.

I'm not being sarcastic; if something better exists, even for a little more, I'm interested. I'd especially love to switch to SATA, as I'm about to pay about $550 each for 300GB U320 drives, and with SATA I could go bigger, or save money, or both. :)

-Kyle
Erik Trimble
2008-Jan-25 04:51 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Kyle McDonald wrote:
> Albert Chin wrote:
>> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>>> Anyone know the answer to this? I'll be ordering 2 of the 7k's for
>>> my x346's this week. If neither A nor B will work, I'm not sure
>>> there's any advantage to using the 7k card, considering I want ZFS to
>>> do the mirroring.
>>
>> Why even bother with a H/W RAID array when you won't use the H/W RAID?
>> Better to find a decent SAS/FC JBOD with cache. Would definitely be
>> cheaper.
>
> I've never heard of such a thing. Do you have any links (cheap or not)?
>
> Do they exist for less than $350? That's what the 7k will run me.
> Do they include an enclosure for at least 6 disks? The 7k will use the 6
> U320 hot-swap bays already in my IBM x346 chassis.
>
> I'm not being sarcastic; if something better exists, even for a little
> more, I'm interested. I'd especially love to switch to SATA, as I'm about
> to pay about $550 each for 300GB U320 drives, and with SATA I could go
> bigger, or save money, or both. :)
>
> -Kyle

Frankly, the IBM ServeRAIDs are a good deal, and they DEFINITELY will support your second option, where you assign each physical drive to a logical drive, then use ZFS to assemble the logicals. This fully utilizes the battery cache on the card.

The only gotcha is that the ServeRAIDs support a maximum of 8 logical drives. My favorite is the 4Mx card (cheap, good Ultra160 performance, and a reasonable amount of cache). The only supported ones for the x346 are the 6M and 7k, though, and both are reasonable. The 4Mx works fine in an x346, if you can live without "official" support.

I've had _very_ little luck with finding any JBOD w/ cache, SCSI/FC/SATA/SAS or otherwise. Nobody makes these beasts - they all want to either be a pure JBOD, or add a full RAID controller onto it.

At this point, one of my favorites is to use older EMC (frequently Dell-labeled, but EMC-built) 1Gb FC arrays, and hook them to my servers. Case in point is the Dell 660F (RAID head) + Dell 224F (JBOD). Works in both direct-connect and SAN modes, and a full 28-drive config runs under $1k from reputable dealers (these are EOL'd, so find someone that can get you a real support contract). The minor issue is that they generally have minimal NVRAM cache, often as little as 256MB, whereas I'd really like 1GB or more.

For the mid-line, the various FC-controller-to-host, SCSI-controller-to-JBOD solutions are the most flexible and reasonable. HP's StorageWorks 1500cs is an example. But there, you're looking at $10k for a decent solution of a couple TB.

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Kyle McDonald
2008-Jan-25 05:59 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Erik Trimble wrote:
> Kyle McDonald wrote:
>> Albert Chin wrote:
>>> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>>>> Anyone know the answer to this? I'll be ordering 2 of the 7k's for
>>>> my x346's this week. If neither A nor B will work, I'm not sure
>>>> there's any advantage to using the 7k card, considering I want ZFS to
>>>> do the mirroring.
>>>
>>> Why even bother with a H/W RAID array when you won't use the H/W RAID?
>>> Better to find a decent SAS/FC JBOD with cache. Would definitely be
>>> cheaper.
>>
>> I've never heard of such a thing. Do you have any links (cheap or not)?
>>
>> Do they exist for less than $350? That's what the 7k will run me.
>> Do they include an enclosure for at least 6 disks? The 7k will use
>> the 6 U320 hot-swap bays already in my IBM x346 chassis.
>>
>> I'm not being sarcastic; if something better exists, even for a
>> little more, I'm interested. I'd especially love to switch to SATA, as
>> I'm about to pay about $550 each for 300GB U320 drives, and with SATA
>> I could go bigger, or save money, or both. :)
>>
>> -Kyle
>
> Frankly, the IBM ServeRAIDs are a good deal, and they DEFINITELY will
> support your second option, where you assign each physical drive to a
> logical drive, then use ZFS to assemble the logicals. This fully
> utilizes the battery cache on the card.

Thanks, that's good to know. With the 256MB doing write caching, is there any further benefit to moving the ZIL to flash or other fast NV storage?

> The only gotcha is that the ServeRAIDs support a maximum of 8 logical
> drives. My favorite is the 4Mx card (cheap, good Ultra160
> performance, and a reasonable amount of cache). The only supported ones
> for the x346 are the 6M and 7k, though, and both are reasonable. The
> 4Mx works fine in an x346, if you can live without "official" support.

I can live with 8 drives for now. I only have the 6 internal drive bays anyway - 5 if I keep the boot drive separate. I suppose I could use all 6 if I find a good external U320 enclosure for booting. Wonder what I'd have to pay for a used Sun D240? Hmmm... with the bus split it could handle 2 x346's and allow me to mirror the boot disk. Without it split I could go up to 8 drives. Hmm. Maybe in the next budget. ;)

> I've had _very_ little luck with finding any JBOD w/ cache,
> SCSI/FC/SATA/SAS or otherwise. Nobody makes these beasts - they all
> want to either be a pure JBOD, or add a full RAID controller onto it.
>
> At this point, one of my favorites is to use older EMC (frequently
> Dell-labeled, but EMC-built) 1Gb FC arrays, and hook them to my
> servers. Case in point is the Dell 660F (RAID head) + Dell 224F
> (JBOD). Works in both direct-connect and SAN modes, and a full
> 28-drive config runs under $1k from reputable dealers (these are
> EOL'd, so find someone that can get you a real support contract). The
> minor issue is that they generally have minimal NVRAM cache, often as
> little as 256MB, whereas I'd really like 1GB or more.

That's good to know. Thanks.

> For the mid-line, the various FC-controller-to-host,
> SCSI-controller-to-JBOD solutions are the most flexible and
> reasonable. HP's StorageWorks 1500cs is an example. But there, you're
> looking at $10k for a decent solution of a couple TB.

Out of my price range. ;)

-Kyle
Albert Chin
2008-Jan-25 15:21 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
On Fri, Jan 25, 2008 at 12:59:18AM -0500, Kyle McDonald wrote:
> ... With the 256MB doing write caching, is there any further benefit
> to moving the ZIL to flash or other fast NV storage?

Do some tests with and without the ZIL enabled. You should see a big difference. With the ZIL on battery-backed RAM you should see something close to the performance of running with the ZIL disabled. I'd put the ZIL on battery-backed RAM in a heartbeat if I could find a card. I think others would as well.

-- 
albert chin (china at thewrittenword.com)
Kyle McDonald
2008-Jan-25 15:35 UTC
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Fri, Jan 25, 2008 at 12:59:18AM -0500, Kyle McDonald wrote:
>> ... With the 256MB doing write caching, is there any further benefit
>> to moving the ZIL to flash or other fast NV storage?
>
> Do some tests with and without the ZIL enabled. You should see a big
> difference. With the ZIL on battery-backed RAM you should see something
> close to the performance of running with the ZIL disabled. I'd put the
> ZIL on battery-backed RAM in a heartbeat if I could find a card. I think
> others would as well.

I agree, when your disks are slow to place the changes in 'safe' storage. My question is: with the ZIL on the main disks of the pool, *and* those same disks write-cached by the battery-backed RAM on the RAID controller, aren't the ZIL writes going to be (nearly?) just as fast as they would be to a dedicated NVRAM or flash device?

Granted, the 256MB on the RAID controller may not be enough, and it's a shame to have to share it among all the writes to the disk, not just the ZIL writes, but it should still be a huge improvement. My question is just how close it comes to a dedicated ZIL device. 90%? 50%?

For that matter, considering *all* the writes that ZFS will do (in my case) will be to battery-backed cache devices, is there still a risk to disabling the ZIL altogether?

-Kyle
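For anyone experimenting along these lines: on Nevada builds recent enough to support separate intent log devices (zpool version 7 and later; not yet in Solaris 10 at the time of this thread), a dedicated ZIL device can be attached to an existing pool and benchmarked against the controller-cache approach. Pool and device names below are placeholders only:

    # attach a dedicated log (slog) device to an existing pool
    # (use a fast, non-volatile device; a plain USB stick is likely too slow)
    zpool add tank log c4t0d0

    # confirm the log vdev shows up
    zpool status tank

Note that on builds of that era a log device could not later be removed from the pool, so it is worth trying this on a scratch pool first.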