This question is concerning ZFS. We have a Sun Fire V890 attached to an EMC disk array. Here's our plan to incorporate ZFS:

On our EMC storage array we will create 3 LUNs. Now how would ZFS be used for the best performance?

What I'm trying to ask is: if you have 3 LUNs and you want to create a ZFS storage pool, would it be better to have a storage pool per LUN, or to combine the 3 LUNs as one big disk under ZFS and create 1 huge ZFS storage pool?

Example:

LUN1 200gb  ZFS Storage Pool "pooldata1"
LUN2 200gb  ZFS Storage Pool "pooldata2"
LUN3 200gb  ZFS Storage Pool "pooldata3"

or

LUN 600gb   ZFS Storage Pool "alldata"
Kory Wheatley wrote:
> This question is concerning ZFS. We have a Sun Fire V890 attached to an EMC disk array.
> Here's our plan to incorporate ZFS:
> On our EMC storage array we will create 3 LUNs. Now how would ZFS be used for the
> best performance?
>
> What I'm trying to ask is if you have 3 LUNs and you want to create a ZFS storage pool,
> would it be better to have a storage pool per LUN or combine the 3 LUNs as one big disk
> under ZFS and create 1 huge ZFS storage pool.

One huge zpool. Remember, the pool can contain many file systems, but the reverse is not true.
 -- richard
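For illustration, building that one pool out of the three LUNs is a single command. The cXtYdZ device names below are placeholders for whatever format(1M) reports for your EMC LUNs, and "alldata" is just the name from your example:

    # zpool create alldata c2t0d0 c2t1d0 c2t2d0

With no mirror or raidz keyword, ZFS dynamically stripes writes across all three devices.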
Are you looking purely for performance, or for the added reliability that ZFS can give you?

If the latter, then you would want to configure across multiple LUNs in either a mirrored or RAID configuration. This does require sacrificing some storage in exchange for the peace of mind that any "silent data corruption" in the array or storage fabric will be not only detected but repaired by ZFS.

From a performance point of view, what will work best depends greatly on your application I/O pattern, how you would map the application's data to the available ZFS pools if you had more than one, how many channels are used to attach the disk array, etc. A single pool can be a good choice from an ease-of-use perspective, but multiple pools may perform better under certain types of load (for instance, there's one intent log per pool, so if the intent log writes become a bottleneck then multiple pools can help). This also depends on how the LUNs are configured within the EMC array...

If you can put together a test system, and run your application as a benchmark, you can get an answer. Without that, I don't think anyone can predict which will work best in your particular situation.
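To make the reliability option concrete, here is a rough sketch (device names are placeholders, and the two commands are alternatives, not both): raidz gives single-parity protection across the three LUNs at roughly 400gb usable, while a two-way mirror of two LUNs gives roughly 200gb usable:

    # zpool create alldata raidz c2t0d0 c2t1d0 c2t2d0
    # zpool create alldata mirror c2t0d0 c2t1d0

Either layout lets ZFS repair, not merely detect, a corrupted block, at the cost of raw capacity.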
> Are you looking purely for performance, or for the added reliability that ZFS can give you?
>
> If the latter, then you would want to configure across multiple LUNs in either a mirrored or RAID configuration. This does require sacrificing some storage in exchange for the peace of mind that any "silent data corruption" in the array or storage fabric will be not only detected but repaired by ZFS.
>
> From a performance point of view, what will work best depends greatly on your application I/O pattern, how you would map the application's data to the available ZFS pools if you had more than one, how many channels are used to attach the disk array, etc. A single pool can be a good choice from an ease-of-use perspective, but multiple pools may perform better under certain types of load (for instance, there's one intent log per pool, so if the intent log writes become a bottleneck then multiple pools can help).

Bad example, as there's actually one intent log per file system!

> This also depends on how the LUNs are configured within the EMC array...
>
> If you can put together a test system, and run your application as a benchmark, you can get an answer. Without that, I don't think anyone can predict which will work best in your particular situation.
Hi Kory,

It depends on the capabilities of your array in our experience...and also the zpool type. If you're going to do RAID-Z in a write intensive environment you're going to have a lot more I/Os with three LUNs than a single large LUN. Your controller may go nutty.

Also, (Richard can address this better than I) you may want to disable the ZIL or have your array ignore the write cache flushes that ZFS issues.

Best Regards,
Jason

On 12/12/06, Kory Wheatley <wheakory at isu.edu> wrote:
> This question is concerning ZFS. We have a Sun Fire V890 attached to an EMC disk array. Here's our plan to incorporate ZFS:
> On our EMC storage array we will create 3 LUNs. Now how would ZFS be used for the best performance?
>
> What I'm trying to ask is if you have 3 LUNs and you want to create a ZFS storage pool, would it be better to have a storage pool per LUN or combine the 3 LUNs as one big disk under ZFS and create 1 huge ZFS storage pool.
>
> Example:
> LUN1 200gb ZFS Storage Pool "pooldata1"
> LUN2 200gb ZFS Storage Pool "pooldata2"
> LUN3 200gb ZFS Storage Pool "pooldata3"
>
> or
>
> LUN 600gb ZFS Storage Pool "alldata"
We're looking for pure performance.

What will be contained in the LUNs is student user account files that they will access, and department share files like MS Word documents, Excel files, and PDFs. There will be no applications on the ZFS storage pools. Does this help on what strategy might be best?
Also, there will be no NFS services on this system.
> We're looking for pure performance.
>
> What will be contained in the LUNs is student user account files that they will access,
> and department share files like MS Word documents, Excel files, and PDFs. There will be
> no applications on the ZFS storage pools. Does this help on what strategy might be best?

I think so. I would suggest striping a single pool across all available LUNs, then. (I'm presuming that you would be prepared to recover from ZFS-detected errors by reloading from backup.)

There doesn't seem any compelling reason to split your storage into multiple pools, and by using a single pool, you don't have to worry about reallocating storage if one pool fills up while another has free space.
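Even with a single pool, you can still keep the student and department data administratively separate by giving each its own file system; the names and the quota figure below are only illustrative:

    # zfs create alldata/students
    # zfs create alldata/departments
    # zfs set quota=200g alldata/students

File systems in the same pool draw on the pool's shared free space unless you cap them, which is exactly what avoids the "one pool full while another is empty" problem mentioned above.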
The LUNs will be on separate "SPA" controllers, not all on the same controller, so that's why I thought if we split our data on different disks and ZFS storage pools we would get better I/O performance. Correct?
Kory Wheatley wrote:
> The LUNs will be on separate "SPA" controllers, not all on
> the same controller, so that's why I thought if we split
> our data on different disks and ZFS storage pools we would
> get better I/O performance. Correct?

The way to think about it is that, in general, for best performance, you want all parts of the system operating concurrently and the load spread randomly. This leads to designs with one zpool with multiple LUNs.
 -- richard
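Once the pool exists, you can confirm that the I/O really is being spread across the LUNs (and therefore across both storage processors) by watching the per-device statistics, for example ("alldata" being the pool name from the earlier example):

    # zpool iostat -v alldata 5

Each LUN shows up as its own row, so a device that is idle, or disproportionately busy, is easy to spot.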
> Also, (Richard can address this better than I) you may want to disable
> the ZIL or have your array ignore the write cache flushes that ZFS issues.

The latter is quite a reasonable thing to do, since the array has battery-backed cache.

The ZIL should almost *never* be disabled. The only reason I can think of is to determine whether a performance issue is caused by the ZIL.

Disabling the ZIL does not only disable the intent log; it also causes ZFS to renege on the contract that fsync(), O_SYNC, and friends ensure that data is safely stored. A mail server, for instance, relies on this contract to ensure that a message is on disk before acknowledging its reception; if the ZIL is disabled, incoming messages can be lost in the event of a system crash. A database relies on this contract to ensure that its log is on disk before modifying its tables; if the ZIL is disabled, the database may be damaged and unrecoverable in the event of a system crash.

The ZIL is a necessary part of ZFS. Just because the ZFS file structure will be consistent after a system crash even with the ZIL disabled does not mean that disabling it is safe!
Right on. And you might want to capture this in a blog for reference. The permalink will be quite useful.

We did have a use case for zil synchronicity, which was a big user-controlled transaction:

	turn zil off
	do tons of things to the filesystem.
	big sync
	turn zil back on

[ ] Rename or remove zil_disable
[x] Implement zil synchronicity.
[ ] I see no problem the way it is currently.

As for a DB, if the log and data are on different pools (our current best practice) then I guess that DB corruption is still possible with zil_disable. With the case of a DB on a single pool but different filesystems, better ensure you have the same setting for both.

Notification of the completion of a transaction may also leave the bounds of the host system. Never use zil_disable there. This last issue applies to an NFS server. I have a blog entry coming up on that.

-r

Anton B. Rang writes:
 > > Also, (Richard can address this better than I) you may want to disable
 > > the ZIL or have your array ignore the write cache flushes that ZFS issues.
 >
 > The latter is quite a reasonable thing to do, since the array has
 > battery-backed cache.
 >
 > The ZIL should almost *never* be disabled. The only reason I can
 > think of is to determine whether a performance issue is caused by the
 > ZIL.
 >
 > Disabling the ZIL does not only disable the intent log; it also causes
 > ZFS to renege on the contract that fsync(), O_SYNC, and friends ensure
 > that data is safely stored. A mail server, for instance, relies on
 > this contract to ensure that a message is on disk before acknowledging
 > its reception; if the ZIL is disabled, incoming messages can be lost
 > in the event of a system crash. A database relies on this contract to
 > ensure that its log is on disk before modifying its tables; if the ZIL
 > is disabled, the database may be damaged and unrecoverable in the event
 > of a system crash.
 >
 > The ZIL is a necessary part of ZFS. Just because the ZFS file
 > structure will be consistent after a system crash even with the ZIL
 > disabled does not mean that disabling it is safe!
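For completeness, a rough sketch of how that "turn zil off / back on" sequence gets done today with the zil_disable tunable. The filesystem name is hypothetical, the tunable only takes effect for file systems mounted after it changes (so a remount is needed), the exact mdb incantation may differ on your build, and per Anton's warning above this is not something to leave on in production:

    # echo zil_disable/W0t1 | mdb -kw                    # disable the ZIL
    # zfs umount tank/build && zfs mount tank/build      # remount so it takes effect
      ... run the big batch of work ...
    # sync                                               # push everything to stable storage
    # echo zil_disable/W0t0 | mdb -kw                    # re-enable the ZIL
    # zfs umount tank/build && zfs mount tank/build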
On Thu, 2006-12-14 at 11:33 +0100, Roch - PAE wrote:
> We did have a use case for zil synchronicity, which was a
> big user-controlled transaction:
>
> 	turn zil off
> 	do tons of things to the filesystem.
> 	big sync
> 	turn zil back on

Yep. The bulk of the "heavy lifting" on systems I run with ZFS is conceptually of this form -- nightly builds of the Solaris "ON" consolidation. Some of the tools used within the build may call fsync() -- and this may be appropriate when they're operating on their own, but within the context of the build, the fsync() is wasted effort which may cause CPUs to go idle.

Similarly, the bulk of the synchronous I/O done during the import of SMF manifests early in boot after an install or upgrade is wasted effort.

					- Bill
Anton B. Rang wrote:
> The ZIL is a necessary part of ZFS. Just because the ZFS file structure will
> be consistent after a system crash even with the ZIL disabled does not mean
> that disabling it is safe!

Is there a list of battery-backed RAID controllers supported by Solaris x86 somewhere? Does anyone know if 3ware cards are going to be supported in the near future?

Jim
Bill Sommerfeld wrote:
> Similarly, the bulk of the synchronous I/O done during the import of SMF
> manifests early in boot after an install or upgrade is wasted effort.

I've done hundreds of installs. Empirically, my observation is that the SMF manifest import scales well with processors. In other words, I don't notice it being I/O bound. I suppose I could move my dtrace boot analysis scripts to the first boot and verify...
 -- richard
Roch - PAE wrote:
> Right on. And you might want to capture this in a blog for
> reference. The permalink will be quite useful.

Such as:

http://blogs.sun.com/erickustarz/entry/zil_disable

?
> Bill Sommerfeld wrote:
>> Similarly, the bulk of the synchronous I/O done during the import of SMF
>> manifests early in boot after an install or upgrade is wasted effort.
>
> I've done hundreds of installs. Empirically, my observation is that
> the SMF manifest import scales well with processors. In other words,
> I don't notice it being I/O bound. I suppose I could move my dtrace
> boot analysis scripts to the first boot and verify...

My observation is not the same; I see it scaling with CPU speed.

It's not synchronous I/O; it's synchronous door calls to svc.configd, 100s of them.

Casper
Hello Casper,

Thursday, December 14, 2006, 6:40:31 PM, you wrote:

>> Bill Sommerfeld wrote:
>>> Similarly, the bulk of the synchronous I/O done during the import of SMF
>>> manifests early in boot after an install or upgrade is wasted effort.
>>
>> I've done hundreds of installs. Empirically, my observation is that
>> the SMF manifest import scales well with processors. In other words,
>> I don't notice it being I/O bound. I suppose I could move my dtrace
>> boot analysis scripts to the first boot and verify...

CDSC> My observation is not the same; I see it scaling with CPU speed.
CDSC> It's not synchronous I/O; it's synchronous door calls to svc.configd,
CDSC> 100s of them.

Is anyone working to fix it? On some slower servers this is really annoying (I know flash would 'fix' it).

-- 
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                     http://milek.blogspot.com
Basically then, with the data being stored on the ZFS disks (no applications), plus web server logs, it would benefit us more to have the 3 LUNs set up in one ZFS storage pool?
> Is anyone working to fix it? On some slower servers this is really
> annoying (I know flash would 'fix' it).

Not that I am aware of; it is really annoying on older hardware.

Casper
Yes. Use one pool.
Hello Kory,

Friday, December 15, 2006, 6:58:53 PM, you wrote:

KW> Basically then, with the data being stored on the ZFS disks (no
KW> applications), plus web server logs, it would benefit us more to
KW> have the 3 LUNs set up in one ZFS storage pool?

If you put all 3 LUNs in one pool, just make sure they have similar performance characteristics. Also make sure there isn't an application so hungry for I/Os that it can hurt other, perhaps more important, applications. If that's not a concern, then in most cases it would be much better to put all three LUNs in one pool.

-- 
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                     http://milek.blogspot.com
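A quick way to sanity-check "similar characteristics" before committing all three LUNs to one pool is to drive some test I/O at each one and compare the per-device service times; on Solaris something along these lines (watch the asvc_t and %b columns for the three LUN devices):

    # iostat -xnz 5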
Kory Wheatley wrote:
> Basically then, with the data being stored on the ZFS disks (no applications),
> plus web server logs, it would benefit us more to have the 3 LUNs set up in
> one ZFS storage pool?

In general, yes.
 -- richard