Chris Siebenmann
2008-Apr-08 16:22 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
In our environment, the politically and administratively simplest approach to managing our storage is to give each separate group at least one ZFS pool of their own (into which they will put their various filesystems). This could lead to a proliferation of ZFS pools on our fileservers (my current guess is at least 50 pools and perhaps up to several hundred), which leaves us wondering how well ZFS handles this many pools.

So: is ZFS happy with, say, 200 pools on a single server? Are there any issues (slow startup, say, or peculiar IO performance) that we'll run into? Has anyone done this in production? If there are issues, is there any sense of what the recommended largest number of pools per server is?

Thanks in advance.

- cks
Wade.Stuart at fallon.com
2008-Apr-08 16:55 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
zfs-discuss-bounces at opensolaris.org wrote on 04/08/2008 11:22:53 AM:

> In our environment, the politically and administratively simplest
> approach to managing our storage is to give each separate group at
> least one ZFS pool of their own (into which they will put their various
> filesystems). This could lead to a proliferation of ZFS pools on our
> fileservers (my current guess is at least 50 pools and perhaps up to
> several hundred), which leaves us wondering how well ZFS handles this
> many pools.
>
> So: is ZFS happy with, say, 200 pools on a single server? Are there any
> issues (slow startup, say, or peculiar IO performance) that we'll run
> into? Has anyone done this in production? If there are issues, is there
> any sense of what the recommended largest number of pools per server is?

Chris,

Well, I have done testing with filesystems and not as much with pools -- I believe the core design premise for ZFS is that administrators would use few pools and many filesystems. I would think that Sun would recommend that you make a large pool (or a few) and divvy out filesystems with reservations to the groups (to which they can add sub-filesystems).

As far as ZFS filesystems are concerned, my testing has shown that the mount time and I/O overhead for multiple filesystems scale pretty linearly -- timing 10 mounts translates pretty well to 100 and 1000. After you hit some level (depending on processor and memory), the mount time, I/O, and write/read batching spike up pretty heavily. This is one of the reasons I take a strong stance against the recommendation that people use reservations and filesystems as user/group quotas (ignoring that the functionality is not by any means at parity).

-Wade
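For anyone who wants to repeat that kind of measurement, a rough sketch of the test -- assuming a scratch pool named "tank" on a dedicated test box; this is not the exact script used:

    # create many filesystems, then time mounting them all
    i=1
    while [ $i -le 1000 ]; do
        zfs create tank/fs$i
        i=`expr $i + 1`
    done
    # unmount and remount everything to measure mount time at this count
    zfs unmount -a
    time zfs mount -a

Repeating this at 10, 100 and 1000 filesystems gives a feel for where the scaling stops being linear on a given machine.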
Simon Breden
2008-Apr-12 09:55 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
Hi Chris, I would have thought that managing multiple pools (you mentioned 200) would be an absolute administrative nightmare. If you give more details about your storage needs, like number of users, space required, etc., it might become clearer what you're thinking of setting up.

Also, I see you were considering 200 pools on a single server. Considering that you'll want redundancy in each pool, if you're forming your pools from complete physical disks, you are looking at 400 disks minimum if you use a simple 2-disk mirror for each pool. I think it's not recommended to use partial disk slices to form pools -- use whole disks.

I'll be bold here and make some assumptions. You have a lot of students/staff there and you have a need, say, for 10TB of data. You could create one pool, using sixteen 1TB disks. The pool could be formed as two RAIDZ2 vdevs, each vdev containing five disks for data and two for parity. Additionally, for extra safety, you could add two hot spares. Something like:

    zpool create unipool raidz2 disk1 disk2 disk3 disk4 disk5 disk6 disk7 \
        raidz2 disk8 disk9 disk10 disk11 disk12 disk13 disk14 \
        spare disk15 disk16

This arrangement would give 10TB of double-parity redundant data, meaning that it would survive 2 disks failing within each vdev (there's 2 of them), and when a disk or two fail from a vdev, spares would be used. Sun's X4500 can house 48 disks for around 24TB of storage if you need more capacity.

Then, according to your needs, you could create multiple filesystems, using different mountpoints and access permissions of your choosing. Using the approach of one pool and multiple filesystems avoids the problem of one pool getting full whilst other pools are under-used: the pool provides capacity across all the filesystems.

You will also want to consider using snapshots to help avoid data loss, and these can be used very elegantly to perform incremental backups to another large-capacity backup server on your network -- see zfs send / recv with the -i option, for example.

I'm not an expert with ZFS yet, so I hope that others correct any mistakes I may have made here. The above is just based on my current limited experience and knowledge of ZFS :)

Anyway, hope it helps.

Simon

http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/
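As a rough sketch of what the per-group filesystems and the incremental backups could look like (group, pool and host names here are hypothetical, not a tested recipe):

    # per-group filesystems carved out of the single pool
    zfs create unipool/groupA
    zfs set mountpoint=/export/groupA unipool/groupA
    zfs set quota=2T unipool/groupA

    # incremental replication of that filesystem to a backup server
    # (assumes the @monday snapshot was already sent in full earlier)
    zfs snapshot unipool/groupA@monday
    # ...a day later:
    zfs snapshot unipool/groupA@tuesday
    zfs send -i unipool/groupA@monday unipool/groupA@tuesday | \
        ssh backuphost zfs recv backup/groupA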
Chris Siebenmann
2008-Apr-12 19:51 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
| Hi Chris, I would have thought that managing multiple pools (you
| mentioned 200) would be an absolute administrative nightmare. If you
| give more details about your storage needs like number of users, space
| required etc it might become clearer what you're thinking of setting
| up.

Every university department has to face the issue of how to allocate disk space to people. Here, we handle storage allocation decisions through the relatively simple method of selling fixed-size chunks of storage to faculty (either single professors or groups of them) for a small one-time fee. (We use fixed-size chunks partly because it is simpler to administer and to set prices, and partly because it is our current model in our Solaris 8 + DiskSuite + constant-sized partitions environment.)

So, we are always going to have a certain number of logical pools of storage space to manage. The question is whether to handle them as separate ZFS pools or to aggregate them into fewer ZFS pools and then administer them as sub-hierarchies using quotas[*]; our current belief is that the former is simpler to administer and simpler to explain to users.

200 pools on a single server is probably pessimistic (hopefully there will be significantly fewer), but it could happen if people go wild with separate pools and there is a failover situation where a single physical server has to handle several logical fileservers at once.

| Also, I see you were considering 200 pools on a single
| server. Considering that you'll want redundancy in each pool, if
| you're forming your pools from complete physical disks, you are
| looking at 400 disks minimum if you use a simple 2-disk mirror for
| each pool. I think it's not recommended to use partial disk slices to
| form pools -- use whole disks.

We're not going to use local disk storage on the fileservers, for various reasons including failover and easier long-term storage management and expansion. We have pretty much settled on iSCSI (mirroring each ZFS vdev across two controllers, so our fileservers do not panic if we lose a single controller).

The fixed-size chunks will be done at the disk level, either as slices from a single LUN on Solaris or as individual LUNs sliced out of each disk on the iSCSI target. (Probably the latter, because it lets us use more slices per disk and we have some number of 'legacy' 35 GB disk chunks that we cannot really give free size upgrades to.)

(Using full disks as the chunk size is infeasible for several reasons.)

- cks

[*: we've experimented, and quotas turn out to work better than reservations for this purpose. If anyone wants more details, see http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSReservationsVsQuotas ]
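A minimal sketch of the per-group-pool model described above; the pool name and LUN names are hypothetical, standing in for one chunk on each of the two iSCSI controllers:

    # create a professor's pool from one chunk on each controller
    zpool create profA mirror c2t1d0 c3t1d0

    # when they buy another chunk, grow the pool with another mirrored pair
    zpool add profA mirror c2t2d0 c3t2d0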
Will Murnane
2008-Apr-12 20:35 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
On Sat, Apr 12, 2008 at 7:51 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
> So, we are always going to have a certain number of logical pools of
> storage space to manage. The question is whether to handle them as
> separate ZFS pools or aggregate them into fewer ZFS pools and then
> administer them as sub-hierarchies using quotas[*], and our current
> belief is that doing the former is simpler to administer and simpler to
> explain to users.

I don't think that's the case. What's wrong with setting both a quota and a reservation on your user filesystems? What advantage will multiple zpools present over a single one with filesystems carved out of it? With a single pool, you can "expand" filesystems if the user requests it just by changing the quota and reservation for that filesystem, and add more capacity if necessary by adding more disks to the pool. If your policy is to use, say, a single mirrored pair of 35GB chunks per zpool and the user wants more space, they need to split their files into categories somehow.

Also, you mention 'legacy' 35GB chunks that you can't give free upgrades to. A single-pool methodology will help with that -- you can set arbitrary-sized quotas for whichever users want them. If one group wants to buy 10 gigs and another wants 47, you can give them exactly how much they want.

Lastly, if you plan to support (or use internally) snapshots, a single large pool will be much easier to deal with. Taking snapshots and then making large changes means that the snapshot starts taking up space, which could be problematic in a small pool.

> [*: we've experimented, and quotas turn out to work better than reservations
> for this purpose.

You might want to use "refquota" and "refreservation" if you're running a Solaris that supports them -- that precludes Solaris 10u4, unfortunately. If you're running Nevada, though, they're definitely the way to go.

In any case, I'm interested in why multiple pools might be a good choice. We have a similar situation (albeit on a smaller scale) with many disks, a faculty member's data on each, and I'd like to start using ZFS for that. A single pool was the model that sprang to mind for the purpose, and I'd like to hear reasons why it might not be a good choice.

Will
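A minimal sketch of the single-pool model with both properties set (pool and group names are hypothetical; sizes are just examples):

    # guarantee and cap a group's space within one shared pool
    zfs create tank/groupA
    zfs set reservation=35G tank/groupA   # space the group is guaranteed
    zfs set quota=35G tank/groupA         # space the group cannot exceed

    # growing the group later is just a property change
    zfs set reservation=47G tank/groupA
    zfs set quota=47G tank/groupA

    # on builds that have them, refquota/refreservation count only the
    # live data, so snapshots do not eat into the group's limit
    zfs set refquota=47G tank/groupA
    zfs set refreservation=47G tank/groupA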
David Collier-Brown
2008-Apr-12 20:40 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
Chris Siebenmann wrote:
> Every university department has to face the issue of how to allocate
> disk space to people. Here, we handle storage allocation decisions
> through the relatively simple method of selling fixed-size chunks of
> storage to faculty (either single professors or groups of them) for a
> small one-time fee.

I would expect all sorts of organizations would want to allocate space to their customers as a concatenation of "reasonable sized" chunks, where the definition of reasonable would vary in size depending on the business. I was recently in a similar discussion about how best to do this allocation on a 9990v, so I expect it's not peculiar to the UofT (:-))

--dave (about 6 miles north of Chris) c-b

--
David Collier-Brown            | Always do right. This will gratify
Sun Microsystems, Toronto      | some people and astonish the rest
davecb at sun.com              | -- Mark Twain
(905) 943-1983, cell: (647) 833-9377, (800) 555-9786 x56583,
bridge: (877) 385-4099 code: 506 9191#
Chris Siebenmann
2008-Apr-12 20:58 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
| I don't think that's the case. What's wrong with setting both a quota
| and a reservation on your user filesystems?

In a shared ZFS pool situation I don't think we'd get anything from using both. We have to use something to limit people to the storage that they bought, and in at least S10 U4 quotas work better for this (we tested).

| What advantage will multiple zpools present over a single one with
| filesystems carved out of it? With a single pool, you can "expand"
| filesystems if the user requests it just by changing the quota and
| reservation for that filesystem, and add more capacity if necessary
| by adding more disks to the pool. If your policy is to use, say, a
| single mirrored pair of 35GB chunks per zpool and the user wants more
| space, they need to split their files into categories somehow.

Pools can/will have more than one vdev. The plan is that we will have a set of unallocated fixed-size chunks of disk space (as LUNs or slices). When someone buys more space, we pair up two such chunks and add them to the person's pool as a mirrored vdev.

With the single pool approach, you have a number of issues:

- If you keep pools at their currently purchased space, you have to both add a new vdev *and* bump someone's quota by the appropriate amount (see the sketch at the end of this message). This is somewhat more work and opens you up to the possibility of stupid mistakes when you change the quotas.

- If you preallocate space to pools before it is purchased by anyone, you have to statically split your space between fileservers in advance. You may also need to statically split the space between multiple pools on a single fileserver, if a single pool would otherwise have too many disks to make you comfortable; this limits how much space a person can add to their existing allocation in an artificial way.

- If a disaster happens and you lose both sides of a mirrored vdev, you will have lost a *lot* more data (and a lot more people will be affected) than if you had things split up into separate pools. (Of course, this depends on how many of your separate pools had vdevs involving the pair of disks that you just lost; you could lose nearly as much data if most of your pools were using chunks of those disks.) This argues for having multiple pools on a fileserver, which runs you into the 'people can only grow so far' problem.

We plan to use snapshots only while we take backups, partly because of their effects on quotas and so on. Any additional usage of snapshots would probably be under user control, so that the people who own the space can make decisions like 'we will accept losing some space so that we can instantly go back to yesterday'. (There are groups that would probably take that, and groups that never would.)

| You might want to use "refquota" and "refreservation" if you're
| running a Solaris that supports them -- that precludes Solaris 10u4,
| unfortunately. If you're running Nevada, though, they're definitely
| the way to go.

This is going to be a production environment, so we're pretty much stuck with Solaris 10 U<whatever is current>.

- cks
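For concreteness, a sketch of the two-step growth in the shared-pool model versus the one-step growth with per-group pools (the pool, filesystem and device names are hypothetical):

    # shared pool: add the chunk *and* remember to bump the quota
    zpool add tank mirror c2t9d0 c3t9d0
    zfs set quota=105G tank/profA

    # per-group pool: the new mirrored vdev is all that is needed
    zpool add profA mirror c2t9d0 c3t9d0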
Joe Little
2008-Apr-12 23:02 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
On Tue, Apr 8, 2008 at 9:55 AM, <Wade.Stuart at fallon.com> wrote:
> zfs-discuss-bounces at opensolaris.org wrote on 04/08/2008 11:22:53 AM:
>
> > In our environment, the politically and administratively simplest
> > approach to managing our storage is to give each separate group at
> > least one ZFS pool of their own (into which they will put their various
> > filesystems). This could lead to a proliferation of ZFS pools on our
> > fileservers (my current guess is at least 50 pools and perhaps up to
> > several hundred), which leaves us wondering how well ZFS handles this
> > many pools.
> >
> > So: is ZFS happy with, say, 200 pools on a single server? Are there any
> > issues (slow startup, say, or peculiar IO performance) that we'll run
> > into? Has anyone done this in production? If there are issues, is there
> > any sense of what the recommended largest number of pools per server is?
>
> Chris,
>
> Well, I have done testing with filesystems and not as much with
> pools -- I believe the core design premise for ZFS is that administrators
> would use few pools and many filesystems. I would think that Sun would
> recommend that you make a large pool (or a few) and divvy out filesystems
> with reservations to the groups (to which they can add sub-filesystems).
> As far as ZFS filesystems are concerned, my testing has shown that the mount
> time and I/O overhead for multiple filesystems scale pretty linearly --
> timing 10 mounts translates pretty well to 100 and 1000. After you hit
> some level (depending on processor and memory), the mount time, I/O, and
> write/read batching spike up pretty heavily. This is one of the reasons I
> take a strong stance against the recommendation that people use
> reservations and filesystems as user/group quotas (ignoring that the
> functionality is not by any means at parity).

Not to beat a dead horse too much: the lack of per-user quotas, together with the clients' mount limits and the per-filesystem mount time mentioned above, means we can heavily utilize ZFS only for second-tier storage, where quotas can be set at a logical group level, and not for first-tier use, which still demands per-user quotas. It's an unmet requirement.

As to your original question, with enough LUN carving you can artificially create many pools. However, ease of management and a focus on both performance and reliability suggest putting as many drives as possible in a redundant config in as few pools as possible, splitting your disk space among top-level ZFS filesystems, one per group, and then letting each group divvy it up further with nested ZFS filesystems.

> -Wade
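A minimal sketch of that layout -- one quota'd top-level filesystem per group, with the group free to create nested filesystems under it (all names hypothetical):

    # top-level filesystem per group, with a group-level quota
    zfs create -o quota=5T tank/groupA

    # the group subdivides its space however it likes
    zfs create tank/groupA/alice
    zfs create tank/groupA/projects
    zfs create tank/groupA/projects/sim2008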
David Collier-Brown
2008-Apr-15 14:22 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
We've discussed this in considerable detail, but the original question remains unanswered: if an organization *must* use multiple pools, is there an upper bound to avoid or a rate of degradation to be considered?

--dave

--
David Collier-Brown            | Always do right. This will gratify
Sun Microsystems, Toronto      | some people and astonish the rest
davecb at sun.com              | -- Mark Twain
(905) 943-1983, cell: (647) 833-9377, (800) 555-9786 x56583
bridge: (877) 385-4099 code: 506 9191#
Mike Gerdts
2008-Apr-15 14:52 UTC
[zfs-discuss] How many ZFS pools is it sensible to use on a single server?
On Tue, Apr 15, 2008 at 9:22 AM, David Collier-Brown <davecb at sun.com> wrote:
> We've discussed this in considerable detail, but the original
> question remains unanswered: if an organization *must* use
> multiple pools, is there an upper bound to avoid or a rate
> of degradation to be considered?

I have a keen interest in this as well. I would really like zones to be able to independently fail over between hosts in a zone farm. The work coming out of the Indiana, IPS, Caiman, etc. projects implies that zones will have to be on ZFS. In order to fail zones over between systems independently, I either need a zpool per zone or per-dataset replication. Considering that with some workloads 20+ zones on a T2000 is quite feasible, a T5240 could be pushing 80+ zones and thus a relatively large number of zpools.

--
Mike Gerdts
http://mgerdts.blogspot.com/
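A minimal sketch of the zpool-per-zone failover being envisaged, assuming shared storage visible to both hosts and the zone already configured on the target host (zone and pool names are hypothetical):

    # on the host giving up the zone
    zoneadm -z webzone halt
    zoneadm -z webzone detach
    zpool export webzone_pool

    # on the host taking it over
    zpool import webzone_pool
    zoneadm -z webzone attach
    zoneadm -z webzone boot

With one pool per zone, each zone's storage can move independently; with a shared pool, every zone on that pool has to move together or be replicated dataset by dataset.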