Does anyone know of a SATA and/or SAS HBA with a battery-backed write cache? Using a full-blown RAID controller and exporting each individual drive back to ZFS as a single LUN seems like a waste of power and $$$. Looking for any thoughts or ideas. Thanks.

-Matt
>>>>> "mb" == Matt Beebe <matthew.beebe at high-eng.com> writes:mb> Anyone know of a SATA and/or SAS HBA with battery backed write mb> cache? I''ve never heard of a battery that''s used for anything but RAID features. It''s an interesting question, if you use the controller in ``JBOD mode'''' will it use the write cache or not? I would guess not, but it might. And if it doesn''t, can you force it, even by doing sneaky things like making 2-disk mirrors where 1 disk happens to be missing thus wasting half the ports you bought, but turning on the damned write cache? I don''t know. The alternative is to get a battery-backed SATA slog like the gigabyte iram. However, beware, because once you add a slog to a pool, you can never remove it. You can''t improt the pool without the slog, not even DEGRADED, not even if you want ZFS to pretend the slog is empty, not even if the slog actually was empty. IIRC (might be confused) Ross found the pool will mount at boot without the slog if it''s listed in zpool.cache (why? don''t know, but I think he said it does), but once you export the pool there is no way to get it back into zpool.cache since zpool.cache is a secret binary config file. Can you substitute any empty device for the missing slog? nope---the slog has secret binary header label on it. I''m guessing one of the reasons you wanted a non-RAID controller with a write cache was so that if the controller failed, and the exact same model wasn''t available to replace it, most of your pool would still be readable with any random controller, modulo risk of corruption from the lost write cache. so...with the slog, you don''t have that, because there are magic irreplaceable bits stored on the slog without which your whole pool is useless. bash-3.00# zpool import -d /usr/vdev pool: slogtest id: 11808644862621052048 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: slogtest ONLINE mirror ONLINE /usr/vdev/d0 ONLINE /usr/vdev/d1 ONLINE logs slogtest ONLINE /usr/vdev/slog ONLINE bash-3.00# mv vdev/slog . bash-3.00# zpool import -d /usr/vdev pool: slogtest id: 11808644862621052048 state: FAULTED status: One or more devices are missing from the system. action: The pool cannot be imported. Attach the missing devices and try again. see: http://www.sun.com/msg/ZFS-8000-6X config: slogtest UNAVAIL missing device mirror ONLINE /usr/vdev/d0 ONLINE /usr/vdev/d1 ONLINE Additional devices are known to be part of this pool, though their exact configuration cannot be determined. bash-3.00# damn. ``no user-serviceable parts inside.'''' however, if you were sneaky enough to save a backup copy of your empty slog to get around Solaris''s obtinence, maybe you can proceed: bash-3.00# gzip slog <-- save a copy of the exported empty slog bash-3.00# ls -l slog.gz -rw-r--r-- 1 root root 106209 Sep 3 16:17 slog.gz bash-3.00# gunzip < slog.gz > vdev/slog bash-3.00# zpool import -d /usr/vdev pool: slogtest id: 11808644862621052048 state: ONLINE action: The pool can be imported using its name or numeric identifier. 
config: slogtest ONLINE mirror ONLINE /usr/vdev/d0 ONLINE /usr/vdev/d1 ONLINE logs slogtest ONLINE /usr/vdev/slog ONLINE bash-3.00# zpool import -d /usr/vdev slogtest bash-3.00# pax -rwpe /usr/sfw/bin /slogtest ^C bash-3.00# zpool export slogtest bash-3.00# gunzip < slog.gz > vdev/slog <-- wipe the slog bash-3.00# zpool import -d /usr/vdev slogtest bash-3.00# zfs list -r slogtest NAME USED AVAIL REFER MOUNTPOINT slogtest 18.1M 25.4M 17.9M /slogtest bash-3.00# zpool scrub slogtest bash-3.00# zpool status slogtest pool: slogtest state: ONLINE scrub: scrub completed with 0 errors on Wed Sep 3 16:23:44 2008 config: NAME STATE READ WRITE CKSUM slogtest ONLINE 0 0 0 mirror ONLINE 0 0 0 /usr/vdev/d0 ONLINE 0 0 0 /usr/vdev/d1 ONLINE 0 0 0 logs ONLINE 0 0 0 /usr/vdev/slog ONLINE 0 0 0 errors: No known data errors bash-3.00# I''m not sure this will always work, because there probably wasn''t anything in the slog when I wiped it. But I guess it''s better than ``restore your pool from backup'''' because of the pedantry of some wallpaper tool and brittle windows-registry-style binary config files. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080903/33ed700e/attachment.bin>
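If you put a real dedicated slog device into service, the same trick
presumably works by imaging the device while it is still empty, since the
labels restored above were all ZFS needed to accept the pool again.  A
hedged sketch only: the pool name and device paths below are made up, and
as noted above there is no guarantee this is safe once real log records
have passed through the slog.

  # Right after adding the (still empty) slog, save an image of it:
  #   zpool add mypool log c2t1d0
  dd if=/dev/rdsk/c2t1d0s0 of=/backup/mypool-slog.img bs=1024k

  # If the slog later dies and blocks an import, write the saved image onto
  # a blank replacement device of at least the same size and retry:
  dd if=/backup/mypool-slog.img of=/dev/rdsk/c3t0d0s0 bs=1024k
  zpool import mypool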
comment at bottom...

Miles Nordin wrote:
[...]
> I'm not sure this will always work, because there probably wasn't
> anything in the slog when I wiped it.  But I guess it's better than
> "restore your pool from backup" because of the pedantry of some
> wallpaper tool and brittle windows-registry-style binary config files.

There are a number of fixes in the works to allow more options for
dealing with slogs:
http://bugs.opensolaris.org/search.do?process=1&type=&sortBy=relevance&bugStatus=&perPage=50&bugId=&keyword=&textSearch=slog+fault&category=kernel&subcategory=zfs&since

If you can think of a new wrinkle, please file a bug.
 -- richard
On Wed, Sep 3, 2008 at 1:48 PM, Miles Nordin <carton at ivy.net> wrote:
> I've never heard of a battery that's used for anything but RAID
> features.  It's an interesting question: if you use the controller in
> "JBOD mode", will it use the write cache or not?  I would guess not,
> but it might.  And if it doesn't, can you force it, even by doing
> sneaky things like making 2-disk mirrors where one disk happens to be
> missing, thus wasting half the ports you bought but turning on the
> damned write cache?  I don't know.

The X4150 SAS RAID controllers will use the on-board battery-backed
cache even when disks are presented as individual LUNs.  You can also
globally enable/disable the disk write caches.
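For what it's worth, from the Solaris host side the expert menu of
format(1M) can usually show (and on some drivers toggle) an individual
drive's own write cache, which is handy for checking what a controller in
JBOD mode has actually left enabled.  A rough sketch of the interactive
session; the disk selection is an example, and whether the setting can be
changed at all depends on the driver:

  # format -e
    (select the disk, e.g. c1t0d0)
  format> cache
  cache> write_cache
  write_cache> display      <-- show whether the drive's write cache is enabled
  write_cache> disable      <-- or enable, if you decide to trust it
  write_cache> quit
  cache> quit
  format> quit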
It doesn't have a write cache, but some of us have been using this
relatively inexpensive card with good, fast results.  I've been using it
with SATA rather than SAS.

AOC-USAS-L8i
http://www.supermicro.com/products/accessories/addon/AOC-USAS-L8i.cfm

Thread: http://opensolaris.org/jive/thread.jspa?threadID=66128&tstart=60
On 03 September, 2008 - Aaron Blew sent me these 2,5K bytes:

> On Wed, Sep 3, 2008 at 1:48 PM, Miles Nordin <carton at ivy.net> wrote:
>> I've never heard of a battery that's used for anything but RAID
>> features.  [...]
>
> The X4150 SAS RAID controllers will use the on-board battery-backed
> cache even when disks are presented as individual LUNs.  You can also
> globally enable/disable the disk write caches.

We're using an Infortrend SATA/SCSI disk array with individual LUNs,
but it still uses the disk cache.

/Tomas
-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Hey Miles,

Yes, you remembered right.  If the zpool.cache file is there, Solaris
will mount the pool fine on boot, with or without the slog.  So if
you're running a slog, make sure you keep a backup of that file!  I was
told that work is being done to make zpool import work the same way, so
pools with missing slogs can be imported, but that's not available yet,
so this is definitely a bit of a problem for now.

However, I also found that once you have the pool online, you can
happily replace the missing slog with another device.  In my case I
didn't have an i-RAM available, so I was using a regular ramdisk as my
slog device to see what effect it had on performance for us.  To get it
working after a reboot, I just wrote a boot script that creates a new
ramdisk with the same name and adds it to the pool with zpool replace.
Obviously that's not for production use, but it was an interesting
test: just a 256MB slog gave us an immediate 12x performance boost over
NFS.

Ross
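For reference, a minimal sketch of the kind of boot script Ross
describes.  The pool name (tank), ramdisk name, and 256m size are
assumptions for illustration, and a ramdisk slog throws away committed
synchronous writes on power loss, so this is strictly for benchmarking:

  #!/sbin/sh
  # Recreate the volatile slog device at boot, then resilver it into the pool.
  /usr/sbin/ramdiskadm -a slog 256m                # creates /dev/ramdisk/slog
  /usr/sbin/zpool replace tank /dev/ramdisk/slog   # replace the device with itself

And, per the advice above, keep a copy of the binary pool config
somewhere safe, e.g. cp /etc/zfs/zpool.cache /root/zpool.cache.backup.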
> I'm guessing one of the reasons you wanted a non-RAID controller with
> a write cache was so that if the controller failed, and the exact same
> model wasn't available to replace it, most of your pool would still be
> readable with any random controller, modulo risk of corruption from
> the lost write cache.  so...with the slog, you don't have that,
> because there are magic irreplaceable bits stored on the slog without
> which your whole pool is useless.

Actually, I just wanted to get the benefit of increased write cache
without paying for the RAID controller...  All the "best practice"
guides say that using your RAID controller's RAID features is generally
redundant and that you should use RAID-Z (or a variant) in most
implementations (leaving room for scenarios where hardware mirroring of
some of the drives may be better, etc.).

Telling the RAID controller to export each drive as a single LUN works
with most of the RAID controllers out there...  but in addition to
being painful to configure (on most of the RAID cards), you're paying
for RAID hardware logic that goes unused.  Also, all the RAID cards
(that I've seen) write some sort of magic secret onto the drive (even
in a 1:1 config) that messes with you when you need to replace or move
the drives down the road.

So how 'bout it, hardware vendors?  When can we get a PCIe (x8)
SAS/SATA controller with an x4 internal port and an x4 external port
and 512MB of battery-backed cache for about $250??  :)  Heck, I'd take
SATA-only if I could get it at a decent price point...

While we're at it, I'd also be happy with a PCIe (x4) card with 2 or 4
DIMM slots and a battery back-up that exposes itself as a system drive
(a la the i-RAM, but PCIe, not SATA 150) for slog and read cache...
say, a $150 price point?  heehee...  There is an SSD-based option out
there, but it has 80GB available and starts at $2500 (overkill for my
requirement).

-Matt
On Wed, Sep 10, 2008 at 16:56, Matt Beebe <matthew.beebe at high-eng.com> wrote:
> So how 'bout it, hardware vendors?  When can we get a PCIe (x8) SAS/SATA
> controller with an x4 internal port and an x4 external port and 512MB of
> battery-backed cache for about $250??  :)  Heck, I'd take SATA-only if I
> could get it at a decent price point...

The Supermicro AOC-USAS-S4iR nearly fits the bill.  It's got 4 internal
and 4 external ports, on PCI Express x8, with 256 MB of cache, for
about $320 [1].  Adding battery backup is about another $150 [2].

> While we're at it, I'd also be happy with a PCIe (x4) card with 2 or 4
> DIMM slots and a battery back-up that exposes itself as a system drive
> (a la the i-RAM, but PCIe, not SATA 150) for slog and read cache... say,
> a $150 price point?

Not terribly likely to see this soon, I'm afraid.  Memory interface
technology changes every couple of years, and that makes such a device
less attractive to market.  Consider that DDR(-1) RAM is almost three
times as expensive as DDR-2 RAM ($184 versus $67 for a 2GB ECC stick;
and SDRAM?  Survey says $500, easy), so building around the latest
generation seems to make sense.  But that means that, as a
manufacturer, your device goes obsolete quicker, you sell fewer units,
and you make less return on your investment.  So unless a common memory
bus is developed, such a device would be a bad investment.

Actually, what I'd rather have than battery "backup" is a flash device
large enough to store the contents of RAM, and a battery big enough to
get everything dumped to persistent storage.  That takes out the
question of running out of battery prematurely, and leaves only the
question of batteries losing capacity over time and needing to be
replaced.

Will

[1]: http://www.wiredzone.com/itemdesc.asp?ic=32005545
[2]: http://www.wiredzone.com/itemdesc.asp?ic=10017972