Harry Putnam
2009-Mar-19 22:50 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
I'm finally getting close to the setup I wanted, after quite a bit of experimentation and bugging these lists endlessly. So first, thanks for your tolerance and patience.

My setup consists of 4 disks. One holds the OS (rpool) and 3 more all the same model and brand, all 500gb.

I've created a zpool in raidz1 configuration with:

  zpool create zbk raidz1 c3d0 c4d0 c4d1

No errors showed up and zpool status shows no problems with those three:

  pool: zbk
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zbk         ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3d0    ONLINE       0     0     0
            c4d0    ONLINE       0     0     0
            c4d1    ONLINE       0     0     0

However, I appear to have lost an awful lot of space... even above what I expected.

  df -h
  [...]
  zbk                   913G   26K  913G   1% /zbk

It appears something like 1 entire disk is gobbled up by raidz1.

The same disks configured in zpool with no raidz1 shows 1.4tb with df.

I was under the impression raidz1 would take something like 20%.. but this is more like 33.33%.

So, is this to be expected or is something wrong here?
IIRC, that's about right. If you look at the zfs best practices wiki (genunix.org I think?), there should be some space calculations linked in there somewhere.

On Thu, Mar 19, 2009 at 6:50 PM, Harry Putnam <reader at newsguy.com> wrote:
> I'm finally getting close to the setup I wanted, after quite a bit of
> experimentation and bugging these lists endlessly.
> [...]
> I was under the impression raidz1 would take something like 20%.. but
> this is more like 33.33%.
>
> So, is this to be expected or is something wrong here?

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
This verifies my guess:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#RAID-Z_Configuration_Requirements_and_Recommendations

On Thu, Mar 19, 2009 at 6:57 PM, Blake <blake.irvin at gmail.com> wrote:
> IIRC, that's about right. If you look at the zfs best practices wiki
> (genunix.org I think?), there should be some space calculations linked
> in there somewhere.
> [...]
Tomas Ögren
2009-Mar-19 23:00 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On 19 March, 2009 - Harry Putnam sent me these 1,4K bytes:

> [...]
> I was under the impression raidz1 would take something like 20%.. but
> this is more like 33.33%.
>
> So, is this to be expected or is something wrong here?

Not a percentage at all.. raidz1 "takes" 1 disk. raidz2 takes 2 disks. This is to be able to handle 1 vs 2 any-disk failures.

Then there's the 1000 vs 1024 factor as well. Your HD manufacturer says 500GB, the rest of the computer industry says ~465..

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
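[To put numbers on this - a back-of-the-envelope sketch, not part of the original thread: one disk lost to parity plus the decimal-vs-binary conversion accounts for nearly all of the "missing" space.]

```python
# Rough capacity estimate for a 3-disk raidz1 of "500 GB" drives.
# Assumes each drive is exactly 500 * 10**9 bytes; real drives vary
# slightly, and ZFS metadata overhead trims a bit more (hence 913G in df).
disk_bytes = 500 * 10**9           # manufacturer's decimal gigabytes
n_disks, parity = 3, 1             # raidz1: one disk's worth of parity

data_bytes = (n_disks - parity) * disk_bytes
gib = data_bytes / 2**30           # binary units, what df -h labels "G"

print(f"{data_bytes / 10**9:.0f} GB decimal -> {gib:.0f} GiB")
# -> 1000 GB decimal -> 931 GiB
```

[The pool's reported 913G is this 931 GiB minus a small slice of filesystem overhead.]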
Bob Friesenhahn
2009-Mar-19 23:25 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Thu, 19 Mar 2009, Harry Putnam wrote:

> I've created a zpool in raidz1 configuration with:
>
>  zpool create zbk raidz1 c3d0 c4d0 c4d1

This is not a very useful configuration. With this number of disks, it is best to use two of them to build a mirror, and save the other disk for something else (e.g. to build a mirrored root pool). Mirrors perform better, are more fault tolerant, and are easier to administer.

With five disks, raidz1 becomes useful.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Richard Elling
2009-Mar-20 00:28 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Bob Friesenhahn wrote:

> On Thu, 19 Mar 2009, Harry Putnam wrote:
>> I've created a zpool in raidz1 configuration with:
>>
>> zpool create zbk raidz1 c3d0 c4d0 c4d1
>
> This is not a very useful configuration. [...]
>
> With five disks, raidz1 becomes useful.

+1

also remember that you can add mirrors later. For best data availability, start with 2 mirrored disks, each split in half. As your data requirements grow, add mirrored halves. For diversity, make each side (half) of the mirror be on different disks.

-- richard
Harry Putnam
2009-Mar-20 01:35 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Tomas Ögren <stric at acc.umu.se> writes:

>> I was under the impression raidz1 would take something like 20%.. but
>> this is more like 33.33%.
>>
>> So, is this to be expected or is something wrong here?
>
> Not a percentage at all.. raidz1 "takes" 1 disk. raidz2 takes 2 disks.
> This is to be able to handle 1 vs 2 any-disk failures.

Now I remember seeing something to that effect. It didn't stick, I guess because I had no experience to hang it on.

> Then there's the 1000 vs 1024 factor as well. Your HD manufacturer says
> 500GB, the rest of the computer industry says ~465..

Yes, I knew this part... but still doesn't it seem there should be some way to make them quit lying...
Harry Putnam
2009-Mar-20 02:00 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:

> On Thu, 19 Mar 2009, Harry Putnam wrote:
>> I've created a zpool in raidz1 configuration with:
>>
>> zpool create zbk raidz1 c3d0 c4d0 c4d1
>
> This is not a very useful configuration. With this number of disks,
> it is best to use two of them to build a mirror, and save the other
> disk for something else (e.g. to build a mirrored root pool). Mirrors
> perform better, are more fault tolerant, and are easier to administer.

Ok, I was going by comments on a site called Simons' blog that tells you how to set things up with zfs. Of course it is just one guy's opinion.

Let me just say a couple of words about my intended usage.

I'm building a home NAS server for my home lan using opensolaris and zfs.

I will be backing up 4 Windows XP boxes, 2 of which are used primarily for processing video. And eventually I'd be storing finished video projects too. Often these projects run to 50gb or so. But I don't do so many. Maybe 9-12 in a yr.

So I'd want to put ghosted images of each machine OS (several apiece) and running backups (incremental) that would stretch back a few weeks. And the projects mentioned above. As well as what is becoming quite a large photo collection.

Also 2 linux boxes will be getting backed up there.

I thought the kind of redundancy offered by raidz1 would be enough for my needs and would allow me to get a little more storage room out of my disks than a mirrored setup.

I suspect as well, that the access speeds of either raidz1 or mirrored would be vastly better than what the low end consumer grade NASs that are available offer. I did try one out.. a WD Worldbook (about $200 US) that advertises gigabit access but in use cannot even come close to what a 10/100 connect can handle.

So raidz1 would probably be adequate for me... I wouldn't be putting it to the test like a commercial operation might.

You mentioned administration was a bigger problem with raidz1, can you be more specific there?... I have really no idea what to expect in that regard with either technique.

> With five disks, raidz1 becomes useful.

The three 500gb I have now are all one brand and model number and IDE ata. If I were to expand to 5, those 2 would need to be sata or else I'd also need to add a PCI IDE controller.

With that in mind would it be problematic to make up the 5 by adding 2 sata 500gb to the mix?
Harry Putnam
2009-Mar-20 02:04 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Richard Elling <richard.elling at gmail.com> writes:

>> With five disks, raidz1 becomes useful.
>
> +1
> also remember that you can add mirrors later. For best data availability,
> start with 2 mirrored disks, each split in half. As your data requirements
> grow, add mirrored halves. For diversity, make each side (half) of the
> mirror be on different disks.

Not sure I understand why this is a good idea. Understand that I'm totally inexperienced with zfs, but I thought I'd seen that zfs was more usable on whole disks. You're talking about fdisk partitions with slices inside above, aren't you?
I'd be careful about raidz unless you have either:

1 - automatic notification of failure set up using fmadm
2 - at least one hot spare

Because raidz is parity-based (it does some math-magic to give you redundancy), replacing a disk that's failed can take a very long time compared to mirror resilvering (the zfs term for rebuilding redundancy).

You can get a nice 1000gb SATA drive on newegg or a similar site for about $90 - well worth the extra money ($120) for the convenience of mirroring. Mirrors are probably also faster for any kind of video playback (like the video projects you mention).

I use raidz2 with 2 hot spares at my company, yes, but only for data warehousing. User data (windows home dirs) I put on mirrors.

On Thu, Mar 19, 2009 at 10:00 PM, Harry Putnam <reader at newsguy.com> wrote:
> [...]
> So raidz1 would probably be adequate for me... I wouldn't be putting
> it to the test like a commercial operation might.
>
> You mentioned administration was a bigger problem with raidz1, can you
> be more specific there?... I have really no idea what to expect in
> that regard with either technique.
>
>> With five disks, raidz1 becomes useful.
>
> The three 500gb I have now are all one brand and model number and IDE ata.
> If I were to expand to 5, those 2 would need to be sata or else I'd
> also need to add a PCI IDE controller.
>
> With that in mind would it be problematic to make up the 5 by adding 2
> sata 500gb to the mix?
Harry Putnam
2009-Mar-20 03:38 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Blake <blake.irvin at gmail.com> writes:

> I'd be careful about raidz unless you have either:
>
> 1 - automatic notification of failure set up using fmadm
>
> 2 - at least one hot spare

Sorry to be so dense here but can you expand a little on what a 'hot spare' is. Do you mean just a spare similar sized disk to use if one fails?

> Because raidz is parity-based (it does some math-magic to give you
> redundancy), replacing a disk that's failed can take a very long time
> compared to mirror resilvering (the zfs term for rebuilding
> redundancy).

How long are you talking about there? For a 500gb drive to be inserted in a disk failure situation and have the data rebuilt on it.

> You can get a nice 1000gb SATA drive on newegg or a similar site for
> about $90 - well worth the extra money ($120) for the convenience of
> mirroring. Mirrors are probably also faster for any kind of video
> playback (like the video projects you mention).

I guess I'm dumb as a stick here.. Do you mean 1 costs $90 and 2 cost $120? (Or is it a typo and you meant $180).

Then I'd install 2 1000gb sata drives in a mirror configuration, instead of the 3 500gb drives I've already installed?
Michael Ramchand
2009-Mar-20 09:31 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
You also DON'T want to give a single disk to your rpool. ZFS really needs to be able to fix errors when it finds them.

Suggest you read the ZFS Best Practices Guide (again).

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools

Mike

Tomas Ögren wrote:
> [...]
> Not a percentage at all.. raidz1 "takes" 1 disk. raidz2 takes 2 disks.
> This is to be able to handle 1 vs 2 any-disk failures.
>
> Then there's the 1000 vs 1024 factor as well. Your HD manufacturer says
> 500GB, the rest of the computer industry says ~465..
>
> /Tomas
Bob Friesenhahn
2009-Mar-20 14:57 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Thu, 19 Mar 2009, Harry Putnam wrote:

> So raidz1 would probably be adequate for me... I wouldn't be putting
> it to the test like a commercial operation might.

Yes, but it is pointless to use it with three disks since a mirror provides the same space using only two disks and is therefore 1/3 less likely to encounter a disk or media failure.

> You mentioned administration was a bigger problem with raidz1, can you
> be more specific there?... I have really no idea what to expect in
> that regard with either technique.

Mirrors are quite simple. Simple is good. Mirrors resilver faster if you have to replace a drive or a drive is accidentally unplugged for a while.

>> With five disks, raidz1 becomes useful.
>
> The three 500gb I have now are all one brand and model number and IDE ata.
> If I were to expand to 5, those 2 would need to be sata or else I'd
> also need to add a PCI IDE controller.
>
> With that in mind would it be problematic to make up the 5 by adding 2
> sata 500gb to the mix?

Adding a few more drives so that you have at least five drives would get you to the point where raidz1 is useful and you will have more space available.

At the moment there is no way to expand the size of a raidz vdev other than by replacing all of the disks with larger disks. You can't add additional drives to a vdev. So you need to install the drives from the very beginning.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Will Murnane
2009-Mar-20 15:25 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Fri, Mar 20, 2009 at 14:57, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:

> On Thu, 19 Mar 2009, Harry Putnam wrote:
>>
>> So raidz1 would probably be adequate for me... I wouldn't be putting
>> it to the test like a commercial operation might.
>
> Yes, but it is pointless to use it with three disks since a mirror provides
> the same space using only two disks

Huh? A three-disk raidz1 provides as much space as two disks, a mirror of two disks provides as much space as one disk.

> At the moment there is no way to expand the size of a raidz vdev other than
> by replacing all of the disks with larger disks. You can't add additional
> drives to a vdev. So you need to install the drives from the very
> beginning.

Or create more vdevs. You could start with three disks in raidz1, and add another three in a year when you run out of space. This won't be optimal in terms of performance, but it'll work.

Will
Harry Putnam
2009-Mar-20 15:42 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:

>>> With five disks, raidz1 becomes useful.
>>
>> The three 500gb I have now are all one brand and model number and IDE ata.
>> If I were to expand to 5, those 2 would need to be sata or else I'd
>> also need to add a PCI IDE controller.
>>
>> With that in mind would it be problematic to make up the 5 by adding 2
>> sata 500gb to the mix?
>
> Adding a few more drives so that you have at least five drives would
> get you to the point where raidz1 is useful and you will have more
> space available.

So is 5 with 1 hotswap (total 6) a sensible arrangement? And would that leave me with something like 2tb (minus manufacturer exaggeration) and one disk would be swallowed for parity data.

Also, I'm not getting why a 4 disk raidz arrangement would not be practical. I see now why 3 isn't smart, with no space advantage over a 2 disk mirror. But wouldn't a 4 disk raidz1 of 500gb drives give about 1500 gb of space whereas a mirror setup with 4 500gb would only give 1000 gb. (again minus manufacturer bragging). Or is the needed hotswap for the raidz1 the deal breaker? Meaning it would really take 5 disks to get 1500gb of space in a raidz1 of all 500gb disks.

> . . . At the moment there is no way to expand the size of
> a raidz vdev other than by replacing all of the disks with larger
> disks. You can't add additional drives to a vdev. So you need to
> install the drives from the very beginning.

So far, I'm still at the point where I can just destroy whatever I've created and start over, since no data has been written to the zpool so far.

But what about mixing the 3 IDE ata I have currently with 2 sata (all 500gb) (I have no more IDE controllers available, but do have 2 sata ports available). Is mixing IDE and SATA likely to be a problem?

And another thing... if all controllers are full and used in a raidz is it still possible to add a hotswap? Would I pull out one of the raidz drives to do it or what?
I understand the hotswap disk would not have to remain in the machine but still, all available controllers would be used for the 5 disk raidz so something would have to come out to be able to add the hotswap disk.
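[The space arithmetic behind the 4-disk question can be sketched like this - my own illustration, not part of the original thread, ignoring metadata overhead and the decimal/binary gap:]

```python
# Usable data space, in the manufacturer's decimal GB, for 500 GB drives.
disk_gb = 500

def raidz1_gb(n):
    # raidz1 spends one disk's worth of space on parity
    return (n - 1) * disk_gb

def mirror_pairs_gb(n):
    # striped two-way mirrors; each pair stores one disk's worth of data
    return (n // 2) * disk_gb

for n in (3, 4, 5):
    print(f"{n} disks: raidz1 {raidz1_gb(n)} GB vs mirrors {mirror_pairs_gb(n)} GB")
```

[So a 4-disk raidz1 does give 1500 GB of data space against 1000 GB for two mirror pairs; the trade-off the thread is debating is resilver time and flexibility, not raw space.]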
Bob Friesenhahn
2009-Mar-20 15:48 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Fri, 20 Mar 2009, Will Murnane wrote:

>> Yes, but it is pointless to use it with three disks since a mirror provides
>> the same space using only two disks
>
> Huh? A three-disk raidz1 provides as much space as two disks, a
> mirror of two disks provides as much space as one disk.

Sorry about that. Brain fart.

With the four drives mentioned, one mirror could be used for the root pool, and another mirror could be used for the second pool. While there are then two pools, the available fault-tolerant storage space will be roughly the same and the Solaris boot/execution environment will be more reliable.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Casper.Dik at Sun.COM
2009-Mar-20 15:49 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
> So is 5 with 1 hotswap (total 6) a sensible arrangement? And would
> that leave me with something like 2tb (minus manufacturer exaggeration)
> and one disk would be swallowed for parity data.

Yeah, yeah. Perhaps you can ask the "SI" to change the definition of "T" from 10^12 to 2^40. The only people who are weird are those computer people who believe that 1K == 1024 and 1M is 2^20, etc. I suggest we just give it a rest. (Even Flash memory is now "SI" measured)

Casper
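[The gap Casper describes grows with each prefix, since every step up multiplies in another factor of 1000/1024 - a quick illustration, not from the thread:]

```python
# Ratio between the decimal (SI) and binary (IEC) reading of each prefix.
# A "500 GB" drive is 500 * 10**9 bytes, which an OS reporting in
# base-2 units shows as roughly 500 * 0.931 = ~465 "GB".
for label, power in [("K", 1), ("M", 2), ("G", 3), ("T", 4)]:
    ratio = 1000**power / 1024**power
    print(f"1 {label}B (decimal) = {ratio:.3f} {label}iB (binary)")
```

[At the terabyte level the discrepancy is already about 9%.]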
reader at newsguy.com
2009-Mar-20 16:32 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Casper.Dik at Sun.COM writes:

>> So is 5 with 1 hotswap (total 6) a sensible arrangement? And would
>> that leave me with something like 2tb (minus manufacturer exaggeration)
>> and one disk would be swallowed for parity data.
>
> Yeah, yeah. Perhaps you can ask the "SI" to change the definition of
> "T" from 10^12 to 2^40. The only people who are weird are those computer
> people who believe that 1K == 1024 and 1M is 2^20, etc. I suggest we
> just give it a rest. (Even Flash memory is now "SI" measured)

I was only trying to forestall other posters pointing out that I really didn't get 500gb on a 500gb disk.

But Casper, what about any of the 4 questions asked:

1) Is 5 disk raidz1 with 1 hotswap a sensible arrangement
2) Why is a 4 disk raidz1 not a good plan? Isn't it still a space
   saving over 4 disks of mirror? Or is the needed hotswap disk
   (bringing the total to 5) the deal breaker?
3) Is mixing same size IDE and SATA disks in a raidz1 a bad idea?
4) If all controllers are full, how is a hotswap added to a zpool?
   Just pull one of the zpool raidz1 disks out or what?
Cindy.Swearingen at Sun.COM
2009-Mar-20 17:24 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Harry,

Bob F. has given you some excellent advice about using mirrored configurations. I can answer your RAIDZ questions but your original configuration was for a root pool and non-root pool using 4 disks total.

Start with two mirrored pools of two disks each. In the future, you will be able to add two or more disks to your non-root pool. You can't do that with a RAIDZ pool.

If you need to, you can even detach one side of the mirror of each pool. You can't do that with a RAIDZ pool. If you need larger pools you can replace all the disks in both pools with larger disks. You can do that with a RAIDZ pool, but more flexibility exists with mirrored pools.

1. Yes, sensible.
2. Saving space isn't always the best configuration.
3. I don't know.
4. Yes, with more disks, you can identify hot spares to be used in the case of a disk failure.

Cindy

reader at newsguy.com wrote:
> [...]
> But Casper, what about any of the 4 questions asked:
>
> 1) Is 5 disk raidz1 with 1 hotswap a sensible arrangement
> 2) Why is a 4 disk raidz1 not a good plan? Isn't it still a space
>    saving over 4 disks of mirror? Or is the needed hotswap disk
>    (bringing the total to 5) the deal breaker?
> 3) Is mixing same size IDE and SATA disks in a raidz1 a bad idea?
> 4) If all controllers are full, how is a hotswap added to a zpool?
>    Just pull one of the zpool raidz1 disks out or what?
Harry Putnam
2009-Mar-20 19:15 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
ALERT.. Long Winded reply ahead..

Cindy.Swearingen at Sun.COM writes:

> Harry,
>
> Bob F. has given you some excellent advice about using mirrored
> configurations. I can answer your RAIDZ questions but your original
> configuration was for a root pool and non-root pool using 4 disks
> total.

I didn't make it clear. 1 disk, the one with rpool on it, is 60gb. The other 3 are 500GB. Using a 500gb to mirror a 60gb would be something of a waste .. eh?

And is a mirror of rpool really important? I assumed there would be some way to backup rpool in a format that could be written onto a new disk and booted in the event of rpool disk failure. With the backup not kept on the rpool disk.

Someone had suggested even creating an rpool mirror and putting the bootmanager bits on its mbr but then keeping it in storage instead of on the machine (freeing a controller port).

It has the downside of having to be re-mirrored every now and then. But could be safe enough if nothing of great importance was kept on the rpool... Just OS and config changes, some BEs. But nothing that couldn't be lost.

Then in the event of disk failure... You'd have to just install the spare, boot it, and bring it up to date. Something kind of like what people do with ghost images on windows OS.

> Start with two mirrored pools of two disks each. In the future,
> you will be able to add two or more disks to your non-root pool.
> You can't do that with a RAIDZ pool.

Well one thing there... if I use 5 500gb disks (not counting the rpool disk - 6 total), by the time my raidz fills up, I'll need a whole new machine really since I'll be out of controller ports and it's getting hard to find controllers that are not PCI express already. (My hardware is plain PCI only and even then the onboard sata is not recognized and I'm adding a PCI sata controller already)

Also some of the older data will have outlived its usefulness, so what needs transferring to a new setup may not be really hard to accommodate.
And finally, I'm 65 yrs old... It's kind of hard to imagine my wife and I filling up the nearly 2tb of space the above mentioned raidz1 would afford before we go before the old grim reaper. Even with lots of pictures and video projects thrown in.

I'm really thinking now to go to 5 500gb disks in raidz1, and one hotswap (plus the rpool on 1 60gb disk). I would be clear out of both sata and IDE controller ports then, so I'm hoping I can add a hot swap by pulling one of the raidz disks long enough to add the hotswap... then take it back out and replace the missing raidz disk.

I could do this by getting 3 more 500gb disks: 2 more for the raidz and 1 for hotswap. No other hardware would be needed. All the while assuming I can mix 3 500GB IDE and 2 500GB SATA with no problems.

> If you need to, you can even detach one side of the mirror
> of each pool. You can't do that with a RAIDZ pool. If you need
> larger pools you can replace all the disks in both pools with
> larger disks. You can do that with a RAIDZ pool, but more
> flexibility exists with mirrored pools.
>
> 1. Yes, sensible.
> 2. Saving space isn't always the best configuration.
> 3. I don't know.
> 4. Yes, with more disks, you can identify hot spares to
>    be used in the case of a disk failure.

Nice thanks (To Bob F as well). And I'm not being hard headed about using a mirror config. It's just that I have limited controller ports (4 ide 2 sata), limited budget, and kind of wanted to get this backup machine set up to where I could basically just leave it alone and let the backups run.

On 3) Mixing IDE and SATA on same zpool, I'd really like to hear from someone who has done that.

About 4).. so if all controllers are already full with either a zpool or rpool, do you pull out one of the raidz1 disks to add a hotswap, then remove the hotswap and put the pulled disk from the raidz back?
If so, does that cause some kind of resilvering, or does some other
thing happen, when a machine is booted with a raidz1 disk missing and
then rebooted with it back in place?
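The space figures being tossed around here follow from simple raidz arithmetic. Below is a rough sketch (the `raidz_usable_gb` helper is made up for illustration; real pools lose a bit more to metadata, reservations, and the decimal-GB vs binary-GiB difference that `df` reports in):

```python
def raidz_usable_gb(disks, size_gb, parity=1):
    """raidzN dedicates N disks' worth of space to parity;
    the remaining (disks - parity) hold data."""
    return (disks - parity) * size_gb

# The 3 x 500gb raidz1 from the original post: ~1000 GB of data space.
# In binary GiB (what df shows) that is ~931G; pool metadata overhead
# brings it down near the observed 913G.
small = raidz_usable_gb(3, 500)            # 1000
small_gib = small * 1000**3 / 1024**3      # ~931.3

# The proposed 5 x 500gb raidz1: the "nearly 2tb" figure.
big = raidz_usable_gb(5, 500)              # 2000
```

So the "lost" space in the 3-disk case really is one full disk of parity, i.e. 33%, not the ~20% first expected; the parity fraction shrinks as you add disks.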
Replies inline (I really would recommend reading the whole ZFS Best
Practices guide a few times - many of your questions are answered in
that document):

On Fri, Mar 20, 2009 at 3:15 PM, Harry Putnam <reader at newsguy.com> wrote:
>
> I didn't make it clear. 1 disk, the one with rpool on it, is 60gb.
> The other 3 are 500GB. Using a 500gb disk to mirror a 60gb one would
> be something of a waste.. eh?

In the near term, yes, but it would work.

>
> And is a mirror of rpool really important? I assumed there would be
> some way to back up rpool in a format that could be written onto a
> new disk and booted in the event of rpool disk failure, with the
> backup not kept on the rpool disk.

You say you want a storage server you can forget about - that sounds
like zfs self-healing, which requires at least a mirror.

>
> Someone had suggested even creating an rpool mirror and putting the
> boot-manager bits on its MBR, but then keeping it in storage instead
> of on the machine (freeing a controller port).

This is not a replacement for live redundancy.

>
> It has the downside of having to be re-mirrored every now and then.

Actually, the point of a backup is to have a known-good copy of data
somewhere. Re-mirroring would be a mistake, as it destroys your old
data state.

>
> But it could be safe enough if nothing of great importance was kept
> on the rpool... just the OS and config changes, some BEs. Nothing
> that couldn't be lost.
>
> Then in the event of disk failure, you'd have to just install the
> spare, boot it, and bring it up to date.
>
> Something kind of like what people do with Ghost images on a Windows
> OS.
>
>> Start with two mirrored pools of two disks each. In the future,
>> you will be able to add two or more disks to your non-root pool.
>> You can't do that with a RAIDZ pool.
>
> Well, one thing there...
> if I use 5 500gb disks (not counting the rpool disk - 6 total), by
> the time my raidz fills up I'll really need a whole new machine,
> since I'll be out of controller ports, and it's getting hard to find
> controllers that are not already PCI Express. (My hardware is plain
> PCI only, and even then the onboard SATA is not recognized, so I'm
> already adding a PCI SATA controller.)

If the hardware is old/partially supported/flaky, all the more reason
to use mirrors. Any single disk from a mirror can be used standalone.
Big disks are cheap: http://tinyurl.com/5tzguf

>
> Also, some of the older data will have outlived its usefulness, so
> what needs transferring to a new setup may not be really hard to
> accommodate or insurmountable.
>
> And finally, I'm 65 yrs old... It's kind of hard to imagine my wife
> and I filling up the nearly 2tb of space the above-mentioned raidz1
> would afford before we go to the old grim reaper.
>
> Even with lots of pictures and video projects thrown in.
>
> I'm really thinking now to go to 5 500gb disks in raidz1, plus one
> hot spare (plus the rpool on 1 60gb disk). I would be clear out of
> both SATA and IDE controller ports then, so I'm hoping I can add a
> hot spare by pulling one of the raidz disks long enough to add it...
> then take it back out and replace the missing raidz disk.

See the zfs docs for more about hot spares. The 'hot' part means the
disk is in the chassis and spinning all the time, ready to replace a
failed drive automatically - not something easy to work out the
hardware for in a situation like yours. If you don't have room for yet
another disk in the chassis, you won't be able to use a hot spare.

>
> I could do this by getting 3 more 500gb disks: 2 more for the raidz
> and 1 for the hot spare. No other hardware would be needed. All the
> while assuming I can mix 3 500GB IDE and 2 500GB SATA with no
> problems.
>
>> If you need to, you can even detach one side of the mirror
>> of each pool.
>> You can't do that with a RAIDZ pool. If you need
>> larger pools you can replace all the disks in both pools with
>> larger disks. You can do that with a RAIDZ pool, but more
>> flexibility exists with mirrored pools.
>>
>> 1. Yes, sensible.
>> 2. Saving space isn't always the best configuration.
>> 3. I don't know.
>> 4. Yes, with more disks, you can identify hot spares to
>> be used in the case of a disk failure.
>
> Nice, thanks (to Bob F. as well). And I'm not being hard-headed about
> using a mirror config. It's just that I have limited controller ports
> (4 IDE, 2 SATA), a limited budget, and I kind of wanted to get this
> backup machine set up to where I could basically just leave it alone
> and let the backups run.
>
> On 3) Mixing IDE and SATA in the same zpool:
> I'd really like to hear from someone who has done that.

In my experience, zfs doesn't care what kind of block device you give it.

>
> About 4).. so if all controllers are already full with either a zpool
> or rpool, do you pull out one of the raidz1 disks to add a hot spare,
> then remove the hot spare and put the pulled disk back in the raidz?
>
> If so, does that cause some kind of resilvering, or does some other
> thing happen, when a machine is booted with a raidz1 disk missing and
> then rebooted with it back in place?

If you pull a disk from a raidz1 array, you get DEGRADED status in
your zpool status output. This means that you take a performance hit,
and that if you lose another drive, you lose the whole pool. This is
why most storage admins only use raidz in the raidz2 config, with a
few hot spares ready to go. That means a chassis that supports 7+
drives, which means a big power supply, etc.
Harry Putnam
2009-Mar-21 00:22 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Blake <blake.irvin at gmail.com> writes:

> Replies inline (I really would recommend reading the whole ZFS Best
> Practices guide a few times - many of your questions are answered in
> that document):

First, I hope you don't think I am being hard-headed and not reading
the documentation you suggest. I have some trouble reading it, but
more important... it really is hard to stick any of the info into my
pea brain and hope it stays without ANY actual experience to hang it
on.

That is what makes responses like yours, which are based on actual
experience, so valuable to me.

[...] snipped out the whole prior discussion to take a different turn now

> If the hardware is old/partially supported/flaky, all the more reason
> to use mirrors.

OK, I'm convinced... Mirrors it is. I will ALWAYS be on old, shaky,
half-supported hardware. And I think I'm seeing a way to do this.

But first, the Best Practices guide says in several places that
keeping any storage data off rpool is a good thing.. a best practice.
I didn't fully understand the reasoning given in "ZFS Root Pool
Considerations". One of the reasons:

    Data pools can be architecture-neutral. It might make sense to move
    a data pool between SPARC and Intel. Root pools are pretty much tied
    to a particular architecture.

How hard of a problem is that?

For example, can Windows XP data be kept there without problems? Ditto
Linux data?

I'm guessing `best practices' are mainly aimed at people who have
sound, new, supported hardware... so I'm guessing there may be some
corners a user like me might cut.

I'm thinking I could manage 3 mirror-style pools using mostly the
hardware I have, plus maybe 1 to 3 more purchases: another 500 GB IDE
disk and possibly 2 750gb SATA drives. I would expect to use the rpool
for backup storage too, since the OS would take only a tiny portion.
Being mirrored should remove some of the problems of having data on
rpool.
Another of the reasons given in `best practices' for NOT keeping data
on rpool was the fact that it cannot be in a raidz, only mirrored...
but of course that's just what I'll be wanting.

I could move my OS to a 500GB disk and mirror rpool (2 IDE ports
down), then create one more zpool, a 500gb disk and mirror, on the
remaining IDE controller ports, which would be all of them.

It appears, after looking around when buying the 3 500gb IDE drives,
that 500gb is about as big as it gets for IDE. Nobody cares much about
IDE anymore.

Then one more zpool on the two SATA ports. I only have 200gb SATA
disks to hand, but could use those well in one or another Windows
machine, and I might splurge on a pair of 750s for the SATA ports.

So (keeping Casper happy):

    500 + 500 + 750
    ===============
    1750 GB

Minus 50gb for the OS and its mirror, that would give me about 1700GB
of mirrored storage room.

I doubt I'd actually use that up. But if so, I could always add a
better SATA controller... one with 6 ports or something. Or get really
big SATA drives to replace the 750s.

Or would I be likely to run into some kind of underpowered-PSU problem
with 10 or so disks? The one I have is 420 watts, on an Athlon64
2.2ghz 3400+ with an AOpen mobo and a limit of 3gb RAM.

I kind of like to keep the disk size down personally, because in my
experience on Linux, and some on Windows, the bigger the disk the
longer everything takes... I mean like formatting, scanning,
defragmenting on Windows... etc. Maybe that isn't true with zfs, but
it seems something like resilvering would have to take much longer on
a really huge disk. So 750gb may be about as big as I'd care to go.
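The mirrored-layout arithmetic above can be sketched as a quick check (illustration only, not zfs output; the raidz comparison at the end assumes raidz sizes every member down to the smallest disk, which is how raidz treats mixed sizes):

```python
# Three two-disk mirrors: each pair contributes one disk's capacity.
mirror_pools_gb = [500, 500, 750]

total_gb = sum(mirror_pools_gb)     # 1750
usable_gb = total_gb - 50           # ~1700 after the OS's ~50gb slice

# For contrast: the same six disks (4 x 500gb + 2 x 750gb) as a single
# raidz1 would yield more raw space, but every member is treated as the
# smallest disk, and the pool loses the flexibility discussed above.
raidz_equiv_gb = (6 - 1) * min(500, 500, 500, 500, 750, 750)   # 2500
```

The mirror layout trades some raw capacity for the ability to grow, detach, and replace pairs independently.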
Bob Friesenhahn
2009-Mar-21 15:35 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Fri, 20 Mar 2009, Harry Putnam wrote:
>
> I didn't fully understand the reasoning given in
> "ZFS Root Pool Considerations". One of the reasons:
>
>     Data pools can be architecture-neutral. It might make sense to move
>     a data pool between SPARC and Intel. Root pools are pretty much tied
>     to a particular architecture.
>
> How hard of a problem is that?
>
> For example, can Windows XP data be kept there without problems?
> Ditto Linux data?

This consideration only applies to those who have SPARC hardware (or
maybe IBM POWER hardware in the future), which you are very unlikely
to ever own. It is true that root pools are a bit more constraining
than ones which can be exported and imported at will.

You can store any data in a root pool, just as in any other pool. In
fact, two-disk systems will usually store user data in the root pool.

> I kind of like to keep the disk size down personally, because in my
> experience on Linux, and some on Windows, the bigger the disk the
> longer everything takes... I mean like formatting, scanning,
> defragmenting on Windows... etc.

The good news is that disk sizes don't really matter much to ZFS. Size
only has an impact on the resilver time when a disk is replaced.
Otherwise, most operations which might have taken hours on some other
system take maybe a second.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Harry Putnam
2009-Mar-21 19:07 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:

> The good news is that disk sizes don't really matter much to ZFS.
> Size only has an impact on the resilver time when a disk is replaced.
> Otherwise, most operations which might have taken hours on some other
> system take maybe a second.

Yes, I've been noticing that. How long would be about normal for
resilvering on a 750gb disk mirror?

Oh yes, thanks for the excellent information... It really helps
hearing from people with plenty of experience. Good to know that
storing data on rpool isn't like begging for trouble, or standing in
the fast lane of the expressway.... hehe.
Harry Putnam
2009-Mar-21 19:23 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
Harry Putnam <reader at newsguy.com> writes:

Just a kind of dopey, past-mattering correction here.

> I suspect as well, that the access times in raidz1 or mirrored would
> be vastly higher than what the low-end consumer-grade NASs that are
> available offer.

The above should have said `vastly lower' or `vastly better'. I'm
saying either technique is bound to beat the snot out of a
consumer-grade NAS.

> I did try one out.. a WD Worldbook (about $200 US) that advertises
> gigabit access but in use cannot even come close to what a 10/100
> connection can handle.

Proof that at least that one was horribly slothful.

Posted just for the unlikely event that some reader might see that and
think consumer-grade NASs have faster access times than zfs.
Eric D. Mudama
2009-Mar-21 19:47 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Sat, Mar 21 at 14:07, Harry Putnam wrote:
> Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:
>
>> The good news is that disk sizes don't really matter much to ZFS.
>> Size only has an impact on the resilver time when a disk is
>> replaced. Otherwise, most operations which might have taken hours on
>> some other system take maybe a second.
>
> Yes, I've been noticing that. How long would be about normal for
> resilvering on a 750gb disk mirror?

Disk sizes matter, but as a second-order effect, because they let you
have a larger dataset. Resilver time is governed by disk layout and
dataset size.

--eric

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
Bob Friesenhahn
2009-Mar-21 21:27 UTC
[zfs-discuss] Size discrepancy (beyond expected amount?)
On Sat, 21 Mar 2009, Harry Putnam wrote:
>
> Yes, I've been noticing that. How long would be about normal for
> resilvering on a 750gb disk mirror?

That is quite dependent on the disks involved. At 60MB/second, it
would take about 3-1/2 hours. At 30MB/second, it is way up to 7 hours.
This assumes that the disk is entirely full; ZFS only needs to
resilver the amount which is actually consumed.

Regardless, the maximum allowable resilver time should be considered
when looking at disk size. Huge disks increase risk.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
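Bob's figures fall out of a simple transfer-time estimate. A rough sketch (the helper name is made up; real resilvers also vary with pool layout and concurrent I/O):

```python
def resilver_hours(used_gb, mb_per_sec):
    """Estimate mirror resilver time: ZFS copies only the allocated
    data, roughly sequentially, so time ~= data volume / throughput."""
    seconds = used_gb * 1000 / mb_per_sec   # GB -> MB, then divide by MB/s
    return seconds / 3600

full_fast = resilver_hours(750, 60)   # ~3.5 hours: full 750gb disk, fast spindle
full_slow = resilver_hours(750, 30)   # ~7 hours on a slower disk
light = resilver_hours(150, 60)       # a pool only 20% full: well under an hour
```

This is why a half-empty 750gb mirror is less scary than it sounds: resilver time tracks the data actually stored, not the raw disk size.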