I'm creating a zpool that is 25TB in size. What are the recommendations with regard to LUN sizes?

For example: should I have 4 x 6.25TB LUNs to add to the zpool, or 20 x 1.25TB LUNs? Or does it depend on the size of the SAN disks themselves? Or should I divide the zpool up and make several smaller zpools?

Thanks
On Sat, Oct 27, 2012 at 4:08 AM, Morris Hooten <mhooten at us.ibm.com> wrote:
> I'm creating a zpool that is 25TB in size.
>
> What are the recommendations in regards to LUN sizes?
>
> For example:
>
> Should I have 4 x 6.25 TB LUNs to add to the zpool or 20 x 1.25TB LUNs to
> add to the pool?
>
> Or does it depend on the size of the san disks themselves?

More like "you shouldn't let the SAN mess with the disks; let ZFS see the disks as JBOD".

... but then again, your SAN might not let you do that. So my suggestion is actually to just present one huge 25TB LUN to ZFS and let the SAN handle redundancy. Or, if your SAN can't do that either, let it present the biggest LUNs it can, and simply stripe them on the ZFS side.

> Or should I divide the zpool up and make several smaller zpools?

If you're going to use them for a single purpose anyway (e.g. storing database files from a single db, or whatever), I don't see the benefit in doing that.

--
Fajar
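For illustration, a rough sketch of that option (the device names are just placeholders for whatever LUNs your SAN actually presents, and there is deliberately no ZFS-level redundancy here since the SAN is assumed to provide it):

    # single 25TB LUN presented by the SAN; ZFS can detect data errors
    # via checksums but cannot repair them without a redundant copy
    zpool create tank c2t0d0

    # or a plain stripe across the largest LUNs the SAN can present
    zpool create tank c2t0d0 c2t1d0 c2t2d0 c2t3d0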
Disclaimer: I haven't used LUNs with ZFS, so take this with a grain of salt.

On Fri, Oct 26, 2012 at 4:08 PM, Morris Hooten <mhooten at us.ibm.com> wrote:
> I'm creating a zpool that is 25TB in size.
>
> What are the recommendations in regards to LUN sizes?

The first standard piece of advice I can give is that ZFS wants to be in charge of the redundancy; otherwise it can only tell you when there is a data error, but it can't fix it (other than possibly metadata copies, or if you set copies for data). So, if you can have the LUNs be real disks, or otherwise set up so that they reside on independent storage, something like a pool of raidz2 vdevs, or of mirror pairs, is the general idea for good data integrity.

When ZFS detects data corruption and can't fix it (no redundancy, or problems in too many of the constituent sectors to successfully reconstruct), it will return an I/O error, not a "best effort" at what it thinks the data was, so beware of insufficient redundancy.

> For example:
>
> Should I have 4 x 6.25 TB LUNs to add to the zpool or 20 x 1.25TB LUNs to
> add to the pool?
>
> Or does it depend on the size of the san disks themselves?

Without redundancy controlled by ZFS, I am unsure whether having multiple separate LUNs will change performance significantly; it probably depends most on whether the LUNs actually perform independently (i.e. whether saturating one makes any other LUN significantly slower). When in doubt, benchmark.

> Or should I divide the zpool up and make several smaller zpools?

Multiple zpools are not usually needed; one main point of ZFS is that you can have as many filesystems as you want on each pool. As for performance, zpool performance should, in theory, scale with the performance of the underlying devices, so two small zpools shouldn't be faster in aggregate than one large zpool. The recommended configuration will depend on how you intend to use it, and what your constraints are, which aren't clear.

Tim
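As a sketch of what such a layout might look like with the 20 x 1.25TB option (the device names are hypothetical placeholders, and I haven't tested this exact setup):

    # two 10-wide raidz2 vdevs: each vdev survives two LUN failures,
    # roughly 2 x 8 x 1.25TB = ~20TB usable
    zpool create tank \
        raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0 c3t8d0 c3t9d0 \
        raidz2 c3t10d0 c3t11d0 c3t12d0 c3t13d0 c3t14d0 c3t15d0 c3t16d0 c3t17d0 c3t18d0 c3t19d0

    # verify the layout and run a scrub to exercise the checksums
    zpool status tank
    zpool scrub tank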
Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
2012-Oct-27 14:16 UTC
[zfs-discuss] Zpool LUN Sizes
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org]
> On Behalf Of Fajar A. Nugraha
>
> So my suggestion is actually just present one huge 25TB LUN to zfs and let
> the SAN handle redundancy.

Oh, no. Definitely let ZFS handle the redundancy. Because ZFS is doing the checksumming, if it finds a cksum error, it needs access to a redundant copy in order to correct it. If you let the SAN handle the redundancy and ZFS finds a cksum error, your data is unrecoverable. (Just the file in question, not the whole pool or anything like that.)

The answer to Morris's question about the size of LUNs and so forth: it really doesn't matter what size the LUNs are. Just choose based on your redundancy and performance requirements. Best would be to go JBOD, or, if that's not possible, create a bunch of 1-disk volumes and let ZFS handle them as if they're JBOD.

Performance is much better if you use mirrors instead of raid. (Sequential performance is just as good either way, but sequential IO is unusual for most use cases. Random IO is much better with mirrors, and that includes scrubs & resilvers.)
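For example, a rough sketch of the mirrors-on-1-disk-volumes approach (device names are placeholders for whatever per-disk LUNs the SAN presents):

    # pair the 1-disk volumes into ZFS mirrors; ZFS then holds the
    # redundant copy itself, so cksum errors can be repaired, not just detected
    zpool create tank \
        mirror c4t0d0 c4t1d0 \
        mirror c4t2d0 c4t3d0 \
        mirror c4t4d0 c4t5d0

    # more capacity later: add another mirror pair to the stripe of mirrors
    zpool add tank mirror c4t6d0 c4t7d0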
Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
2012-Oct-27 14:21 UTC
[zfs-discuss] Zpool LUN Sizes
> From: Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
>
> Performance is much better if you use mirrors instead of raid. (Sequential
> performance is just as good either way, but sequential IO is unusual for most
> use cases. Random IO is much better with mirrors, and that includes scrubs &
> resilvers.)

Even if you think you use sequential IO... if you use snapshots, then thanks to the nature of snapshot creation & deletion and the nature of COW, you probably don't have much sequential IO in your system after a couple of months of actual usage. Some people use raidzN, but I always use mirrors.
On Sat, Oct 27, 2012 at 9:21 AM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:
> > From: Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
> >
> > Performance is much better if you use mirrors instead of raid. (Sequential
> > performance is just as good either way, but sequential IO is unusual for most
> > use cases. Random IO is much better with mirrors, and that includes scrubs &
> > resilvers.)
>
> Even if you think you use sequential IO... If you use snapshots...
> Thanks to the nature of snapshot creation & deletion & the nature of COW,
> you probably don't have much sequential IO in your system, after a couple
> months of actual usage. Some people use raidzN, but I always use mirrors.

This may be the case if you often rewrite portions of files, especially for database usage, but if you generally write entire new files rather than modifying old ones, I wouldn't expect fragmentation to be that bad. The particular workload I have is like this: if a file is changed, it is overwritten entirely, so I went with raidz2 vdevs for more capacity. However, I'm not exactly pushing the limits of the pool's performance, as my bottleneck is the network.

Tim
On Sat, Oct 27, 2012 at 9:16 PM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org]
>> On Behalf Of Fajar A. Nugraha
>>
>> So my suggestion is actually just present one huge 25TB LUN to zfs and let
>> the SAN handle redundancy.
>
> create a bunch of 1-disk volumes and let ZFS handle them as if they're JBOD.

The last time I used IBM's enterprise storage (which was, admittedly, a long time ago) you couldn't even do that. And looking at Morris's mail address, that should be relevant :)

... or maybe it's just me who hasn't found out how to do it. Which is why I suggested just using whatever the SAN can present :)

--
Fajar
On Sun, Oct 28, 2012 at 04:43:34PM +0700, Fajar A. Nugraha wrote:
> On Sat, Oct 27, 2012 at 9:16 PM, Edward Ned Harvey
> (opensolarisisdeadlongliveopensolaris)
> <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:
> >> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org]
> >> On Behalf Of Fajar A. Nugraha
> >>
> >> So my suggestion is actually just present one huge 25TB LUN to zfs and let
> >> the SAN handle redundancy.
>
> > create a bunch of 1-disk volumes and let ZFS handle them as if they're JBOD.
>
> Last time I used IBM's enterprise storage (which was, admittedly, a
> long time ago) you couldn't even do that. And looking at Morris's mail
> address, that should be relevant :)
>
> ... or maybe it's just me who hasn't found how to do that. Which is
> why I suggested just using whatever the SAN can present :)

You are entering the uncharted waters of "multi-level disk management" here. Both ZFS and the SAN use redundancy and error-checking to ensure data integrity. Both of them also do automatic replacement of failing disks. A good SAN will present LUNs that behave as perfectly reliable virtual disks, guaranteed to be error free. Almost all of the time, ZFS will find no errors. If ZFS does find an error, there's no nice way to recover. Most commonly, this happens when the SAN is powered down or rebooted while the ZFS host is still running.

--
-Gary Mills-        -refurb-        -Winnipeg, Manitoba, Canada-