Paul Sobey
2008-Nov-26 10:43 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
Hello,

We have a new Thor here with 24TB of disk in it (first of many, hopefully). We are trying to determine the best practices with respect to file system management and sizing. Previously, we have tried to keep each file system to a max size of 500GB to make sure we could fit it all on a single tape, and to minimise restore times and impact should we experience some kind of volume corruption. With zfs, we are re-evaluating our working practices.

A few questions then. Apologies if these have been asked recently, I went back through a month's worth of posts and couldn't see anything. I've also read the best practices guide here:
solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_Administration_Considerations

Pointers to additional info are most welcome!

1. Do these kinds of self-imposed limitations make any sense in a zfs world?

2. What is the 'logical corruption boundary' for a zfs system - the filesystem or the zpool?

3. Are there scenarios apart from latency sensitive applications (e.g. Oracle logs) that warrant separate zpools?

One of our first uses for our shiny new server is to hold about 5TB of data which logically belongs on one share/mount, but has historically been partitioned up into 500GB pieces. The owner of the data is keen to see it available in one place, and we (as the infrastructure team) are debating whether it's a sensible thing to allow.

Thanks for any advice/wisdom you may have...

Cheers,
Paul
Paul Sobey
2008-Nov-26 14:40 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Wed, 26 Nov 2008, Paul Sobey wrote:

> [original post quoted in full - snipped]

Apologies to all - sent to opensolaris.com and then bounced from Pine to the correct address, hoping the headers would sort themselves. Apparently not. Replying to this one should work...

Paul
Bob Friesenhahn
2008-Nov-26 16:31 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Wed, 26 Nov 2008, Paul Sobey wrote:
>
> Pointers to additional info are most welcome!
>
> 1. Do these kinds of self-imposed limitations make any sense in a zfs
> world?

Depending on your backup situation, they may make just as much sense as before. For zfs this is simply implemented by applying a quota to each filesystem in the pool.

> 2. What is the 'logical corruption boundary' for a zfs system - the
> filesystem or the zpool?

The entire pool.

> 3. Are there scenarios apart from latency sensitive applications (e.g.
> Oracle logs) that warrant separate zpools?

I can't think of any reason for separate zpools other than to limit the exposure to catastrophic risk (e.g. total pool failure) or because parts of the storage may be moved to a different system.

The size of the overall pool is much less important than the design of its vdevs (two-way mirror, three-way mirror, raidz, raidz2). Golden Rule: "The pool is only as strong as its weakest vdev". The number of vdevs in the pool, and the performance of the individual devices comprising each vdev, determine the pool's performance. More vdevs result in better multi-user performance since more I/Os can be active at once. With an appropriate design, a larger pool will deliver more performance without sacrificing reliability.

Given that your pool is entirely subservient to one system ("Thor") it likely makes sense to put its devices in one pool since (barring zfs implementation bugs) the reliability of the pool will be dictated by the reliability of that system.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, simplesystems.org/users/bfriesen
GraphicsMagick Maintainer, GraphicsMagick.org
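For illustration, the per-filesystem quota approach Bob describes might look like the sketch below. The pool and dataset names (tank, projects, archive01 and so on) are made up, and 500G is just the tape-sized cap from the original post:

  # one large pool, one filesystem per backup unit, each capped at 500GB
  zfs create tank/projects
  zfs create -o quota=500G tank/projects/archive01
  zfs create -o quota=500G tank/projects/archive02

  # or cap an existing filesystem after the fact
  zfs set quota=500G tank/projects/archive03

Each filesystem can then be dumped to its own tape, while all of them draw on the pool's shared free space.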
Paul Sobey
2008-Nov-26 16:55 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Wed, 26 Nov 2008, Bob Friesenhahn wrote:

>> 1. Do these kinds of self-imposed limitations make any sense in a zfs
>> world?
>
> Depending on your backup situation, they may make just as much sense as
> before. For zfs this is simply implemented by applying a quota to each
> filesystem in the pool.

We're more worried about the idea of a single 'zfs filesystem' becoming corrupt somehow. From what you say below, the pool is the boundary where that might happen, not the individual filesystem. Therefore it seems no less dangerous creating a single 5TB pool vs. 10 500GB ones, from a risk-of-corruption point of view - is that correct?

>> 2. What is the 'logical corruption boundary' for a zfs system - the
>> filesystem or the zpool?
>
> The entire pool.
>
>> 3. Are there scenarios apart from latency sensitive applications (e.g.
>> Oracle logs) that warrant separate zpools?
>
> I can't think of any reason for separate zpools other than to limit the
> exposure to catastrophic risk (e.g. total pool failure) or because parts of
> the storage may be moved to a different system.
>
> The size of the overall pool is much less important than the design of its
> vdevs (two-way mirror, three-way mirror, raidz, raidz2). Golden Rule: "The
> pool is only as strong as its weakest vdev". The number of vdevs in the
> pool, and the performance of the individual devices comprising each vdev,
> determine the pool's performance. More vdevs result in better multi-user
> performance since more I/Os can be active at once. With an appropriate
> design, a larger pool will deliver more performance without sacrificing
> reliability.

Given that we have a load of available disks (can't remember the exact number for an X4540) - is it better to chop a storage pool into a few raidz vdevs then, rather than all into one? Are there any metrics I can use to guide me in this as far as performance tuning goes?

> Given that your pool is entirely subservient to one system ("Thor") it likely
> makes sense to put its devices in one pool since (barring zfs implementation
> bugs) the reliability of the pool will be dictated by the reliability of that
> system.

The Thor is an X4540 - we're rather pleased with it so far. On a slightly off topic note - do people find the top loading nature of these easy to work with? Strikes me that it's a lot of torque on those rails when fully extended - presumably they are better in the bottom of racks than the top?!

Paul
Bob Friesenhahn
2008-Nov-26 17:29 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Wed, 26 Nov 2008, Paul Sobey wrote:
>
> We're more worried about the idea of a single 'zfs filesystem' becoming
> corrupt somehow. From what you say below, the pool is the boundary where
> that might happen, not the individual filesystem. Therefore it seems no less
> dangerous creating a single 5TB pool vs. 10 500GB ones, from a
> risk-of-corruption point of view - is that correct?

Other than hardware failure, the biggest fear is some sort of a zfs implementation bug. If there is a zfs implementation bug it could perhaps be more risky to have five pools rather than one.

> Given that we have a load of available disks (can't remember the exact
> number for an X4540) - is it better to chop a storage pool into a few raidz
> vdevs then, rather than all into one? Are there any metrics I can use to
> guide me in this as far as performance tuning goes?

An important thing to keep in mind is that each vdev offers a "write IOP". If you put ten disks in a raidz2 vdev, then those ten disks are providing one write IOP and one read IOP. If you use those 10 disks to create five mirror vdevs, then you obtain five write IOPs and ten read IOPs, but almost half the usable disk space. This is a sort of simplistic way to look at the performance issue, but it is still useful.

One factor I forgot to mention is that there is value to having multiple pools if the performance characteristics of the pools need to be radically different. For example, one pool could be optimized for storage capacity while the other is optimized for IOPS.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, simplesystems.org/users/bfriesen
GraphicsMagick Maintainer, GraphicsMagick.org
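To make that trade-off concrete, the two ten-disk layouts Bob compares would be created roughly as follows; the disk names (c1t0d0 and so on) are placeholders, not the actual X4540 device names:

  # ten disks in a single raidz2 vdev: roughly 8 disks of capacity,
  # but only one vdev's worth of write IOPS for the whole set
  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
      c1t5d0 c1t6d0 c1t7d0 c2t0d0 c2t1d0

  # the same ten disks as five two-way mirrors: roughly 5 disks of
  # capacity, but five vdevs and therefore about five times the write IOPS
  zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
      mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0 mirror c2t0d0 c2t1d0

A middle ground - several modest raidz2 vdevs in one pool rather than one very wide one - gives up some capacity in exchange for more vdevs and hence more concurrent I/O.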
Paul Sobey
2008-Nov-26 17:35 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Wed, 26 Nov 2008, Bob Friesenhahn wrote:

> An important thing to keep in mind is that each vdev offers a "write IOP".
> If you put ten disks in a raidz2 vdev, then those ten disks are providing
> one write IOP and one read IOP. If you use those 10 disks to create five
> mirror vdevs, then you obtain five write IOPs and ten read IOPs, but almost
> half the usable disk space. This is a sort of simplistic way to look at the
> performance issue, but it is still useful.
>
> One factor I forgot to mention is that there is value to having multiple
> pools if the performance characteristics of the pools need to be radically
> different. For example, one pool could be optimized for storage capacity
> while the other is optimized for IOPS.

Excellent advice - thank you very much.

Paul
Anton B. Rang
2008-Nov-27 03:27 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
> If there is a zfs implementation bug it could perhaps be more risky
> to have five pools rather than one.

Kind of goes both ways. You're perhaps 5 times as likely to wind up with a damaged pool, but if that ever happens, there's only 1/5 as much data to restore.
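To put rough numbers on that, taking Anton's assumption that the risk scales with the number of pools: if p is the (purely notional) chance of any one pool being lost in a given period and D is the total data, one big pool risks losing D with probability p, while five pools of D/5 each give an expected loss of 5 x p x (D/5) = p x D. The expected amount lost is the same either way; splitting only trades fewer, larger incidents for more frequent, smaller ones.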
Paul Sobey
2008-Nov-27 16:10 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Wed, 26 Nov 2008, Anton B. Rang wrote:

>> If there is a zfs implementation bug it could perhaps be more risky
>> to have five pools rather than one.
>
> Kind of goes both ways. You're perhaps 5 times as likely to wind up with a
> damaged pool, but if that ever happens, there's only 1/5 as much data to
> restore.

One more question regarding this - has anybody on this list had (or heard of) a zpool corruption that prevented access to all data in the pool?

We're probably going to have a second X4540 here and push snapshots to it daily, so as a last resort we'd have access to the data on another machine. I'd like to minimise the chance of problems on the live machine first though :)
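For reference, one common way to do the daily snapshot push described above is zfs send/receive. A minimal sketch, assuming a pool named tank on the live machine, a pool named backup on the second X4540, ssh as the transport, and a build recent enough to support zfs send -R:

  # day 1: recursive snapshot of the whole pool, then a full copy
  zfs snapshot -r tank@day1
  zfs send -R tank@day1 | ssh backuphost zfs receive -Fd backup

  # later days: snapshot again and send only the changes since the last one
  zfs snapshot -r tank@day2
  zfs send -R -i tank@day1 tank@day2 | ssh backuphost zfs receive -Fd backup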
Yes, several people have had problems. Many, many people without redundancy in their pools have had problems, and I've also seen one or two cases of quite large pools going corrupt (including at least one on a thumper I believe). I think they were all a good while ago now (9 months or so), and ZFS appears much more mature these days.

I suspect the risk now is relatively low, but ZFS is still a pretty young filesystem, and one that's undergoing rapid development. I'm happily storing production data on ZFS, but I do have backups of everything stored on a non-ZFS filesystem.
Bob Friesenhahn
2008-Nov-27 16:51 UTC
[zfs-discuss] ZPool and Filesystem Sizing - Best Practices?
On Thu, 27 Nov 2008, Paul Sobey wrote:

> One more question regarding this - has anybody on this list had (or heard
> of) a zpool corruption that prevented access to all data in the pool?

If you check the list archives, you will find a number of such cases. Usually they occurred on non-Sun hardware. In some cases people lost their pools after a BIOS upgrade which commandeered a few more disk bytes for itself. In most cases the pool was recoverable with some assistance from Sun tech support.

It is important to avoid hardware which writes data in the wrong order, or does not obey cache control and cache flush commands. As one would expect, Sun hardware is "well behaved".

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, simplesystems.org/users/bfriesen
GraphicsMagick Maintainer, GraphicsMagick.org