We have Sun STK RAID cards in our x4170 servers. These are battery-backed with 256MB of cache. What is the recommended ZFS configuration for these cards?

Right now, I have created a one-to-one logical-volume-to-disk mapping on the RAID card (one disk == one volume on the RAID card). Then I mirror them using ZFS. No hardware mirror. What I am a little confused about is whether it would be better to not create any logical volumes on the RAID card and only use ZFS mirrors for creating the pools.

Please see configuration output here:
http://pastebin.ca/1971460

Also, what exactly does "write-back" mean?

Thanks
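P.S. For concreteness, the setup I described looks something like the following (device names are illustrative, not from my actual config; the real output is in the pastebin above):

    # each physical disk is exported by the RAID card as its own
    # single-disk logical volume; ZFS then mirrors those devices in pairs:
    zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0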
On 10/23/2010 8:22 PM, Anil wrote:
> We have Sun STK RAID cards in our x4170 servers. These are battery-backed
> with 256MB of cache. What is the recommended ZFS configuration for these
> cards?
>
> Right now, I have created a one-to-one logical-volume-to-disk mapping on
> the RAID card (one disk == one volume on the RAID card). Then I mirror
> them using ZFS. No hardware mirror. What I am a little confused about is
> whether it would be better to not create any logical volumes on the RAID
> card and only use ZFS mirrors for creating the pools.

If you have a RAID card that has battery-backed cache, most cards make a distinction between a JBOD mode and a RAID volume mode. I'd have to look at the specs for your card, but most cards put into JBOD mode will act similarly to a stupid HBA - that is, the on-board cache will be used solely as a read-ahead cache, and not for any write caching. When using "one disk per volume", the card acts as a normal RAID controller and can potentially use the on-board cache as a write cache (see below).

Normally, doing what you are doing is the preferred method of operation, as it gives ZFS the chance to do its magic (and use all its advantages), AND it also gives you the bonus of the fast write cache on the RAID controller. It's how I configure these things on virtually all my systems.

The one real downside of doing it your way is that the drives are now non-portable. That is, you can't move them to an HBA of a different brand, and you most likely will only ever be able to move them to another physical controller of the identical brand AND model. And, of course, you have to make sure that you re-power the RAID controller within its battery lifespan after a power loss, or you lose data.

> Please see configuration output here:
> http://pastebin.ca/1971460
>
> Also, what exactly does "write-back" mean?

"Write-back" and "write-through" are the two major modes most RAID cards support with respect to how they treat their on-board cache on write requests. "Write-back" mode indicates that the RAID card will acknowledge a committed write as soon as the CACHE has been updated, but BEFORE the data makes it all the way out to the disks. "Write-through" mode will not acknowledge a committed write operation until the data actually makes it to disk (it "writes through" the cache).

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
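P.S. On these Adaptec-based STK controllers you can usually inspect and flip the cache mode per logical drive with arcconf. I don't have one in front of me, and the exact syntax varies by arcconf version, so double-check against its help output, but it is roughly:

    # show controller, battery, and logical-drive status (controller 1):
    arcconf getconfig 1

    # show just the logical drives, including their cache settings:
    arcconf getconfig 1 LD

    # set logical drive 0 to write-back (wb) or write-through (wt);
    # verify the mode keywords against your arcconf's help text:
    arcconf setcache 1 logicaldrive 0 wb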
On Sat, Oct 23, 2010 at 10:06 PM, Erik Trimble <erik.trimble at oracle.com> wrote:
> When using "one disk per volume", the card acts as a normal RAID
> controller and can potentially use the on-board cache as a write cache
> (see below).

Cool, thanks. One other question: how can I ensure that the controller's cache is really being used? (arcconf doesn't seem to show much.) Since ZFS would flush the data as soon as it can, I am curious to see whether the caching is making a difference or not. I think I will need to capture ZFS stats with the logical volumes vs. plain ZFS mirrors and compare. Just wondering if these controllers have any other utility for this.
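Concretely, I was planning to watch pool-level throughput while a test workload runs (pool name is illustrative):

    # report per-vdev read/write ops and bandwidth every 5 seconds:
    zpool iostat -v tank 5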
replicase at gmail.com said:
> One other question: how can I ensure that the controller's cache is
> really being used? (arcconf doesn't seem to show much.) Since ZFS would
> flush the data as soon as it can, I am curious to see whether the
> caching is making a difference or not.

Share out a dataset on the pool over NFS to a remote client. On the client, unpack a tar archive onto the NFS dataset, timing how long it takes. Do this once with the cache set to "write-through" (which basically disables the write cache), and again with it set to "write-back". NFS is the key ingredient: the client forces synchronous commits (e.g. on every file create), so the run time directly reflects how fast the storage can acknowledge synchronous writes, which is exactly where a write-back cache helps.

Regards,

Marion
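P.S. A rough sketch of the whole test; the dataset name, hostname, mount point, and archive path are all illustrative:

    # on the server: share a test dataset over NFS
    zfs set sharenfs=on tank/nfstest

    # on the client: mount it and time an untar of a small-file-heavy
    # archive, once per cache mode (write-through, then write-back)
    mkdir -p /mnt/nfstest
    mount -F nfs server:/tank/nfstest /mnt/nfstest
    cd /mnt/nfstest
    time tar xf /var/tmp/test-archive.tar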