Hi there, I think I have managed to confuse myself, so I am asking outright in the hope of a straight answer. First, my situation: I have several disks of varying sizes that I would like to run as redundant storage in a file server at home. Performance is not my number one priority; I want the largest capacity possible while still allowing for a single disk failure. Is there a solution to my problem? My understanding is as follows: JBOD does not protect from a disk failure, and raidz can only use as much of each disk as the smallest disk. For example, if I had a 320GB with a 250GB and a 200GB, I could only have 400GB of storage. I know it's not exactly an enterprise-level question, but it would help my understanding a lot. I just want to be sure that buying new disks of equal size is not my only option for a proper raidz configuration. Thanks. This message posted from opensolaris.org
On 8/19/07, James <melchior at iamagloworm.com> wrote:
> Raidz can only be as big as your smallest disk. For example if I had a 320gig with a 250gig and 200gig I could only have 400gig of storage.

Correct - variable-sized disks are not (yet?) supported within a single raidz. However, you can work around this by slicing up the disks (or using partitions) and building multiple raidz sets. In your example, you would first use 200GB of each disk to make a 3x200GB raidz, leaving 120GB and 50GB of "excess" on two of the disks. Then you can make a mirror out of those (2x50GB), leaving 70GB of excess on the first disk (which can be used as temporary space or such). It gets better when there are more disks - if your two largest disks are of equal size, you even get away without any wasted space. This process is quite clumsy, and it gets really complicated when you want to add disks - but as it stands, it is the only option in this scenario. Wishlist: object-level raidz.
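For concreteness, here is a rough sketch of what the slicing approach described above could look like as commands. The device names and slice layout are assumptions, not from the original post (say the disks show up as c1t0d0 = 320GB, c1t1d0 = 250GB and c1t2d0 = 200GB, and format(1M) has already been used to create a 200GB slice 0 on each disk plus a 50GB slice 1 on the two larger ones):

    # 3 x 200GB slices -> one raidz vdev, roughly 400GB usable
    zpool create tank raidz c1t0d0s0 c1t1d0s0 c1t2d0s0

    # 2 x 50GB leftover slices -> a mirrored vdev in the same pool;
    # zpool warns about mixing raidz and mirror vdevs, hence -f
    zpool add -f tank mirror c1t0d0s1 c1t1d0s1

The remaining ~70GB on the 320GB disk can be left in a scratch slice, as the post suggests.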
I'm about to build a fileserver and I think I'm gonna use OpenSolaris and ZFS. I've got a 40GB PATA disk which will be the OS disk, and then I've got 4x250GB SATA + 2x500GB SATA disks. From what you are writing I would think my best option would be to slice the 500GB disks in two 250GB and then make two RAIDz with two 250 disks and one partition from each 500 disk, giving me two RAIDz of 4 slices of 250, equaling 2 x 750GB RAIDz.

How would the performance be with this? I mean, it would probably drop since I would have two raidz slices on one disk. From what I gather, I would still be able to lose one of the 500 disks (or 250) and still be able to recover, right?

Perhaps I should just get another 500GB disk and run a RAIDz on the 500s and one RAIDz on the 250s?

I'm also a bit of a noob when it comes to ZFS (but it looks like it's not that hard to admin) - would I be able to join the two RAIDz together for one BIG volume altogether? And it will survive one disk failure?

/Christopher

This message posted from opensolaris.org
> I'm about to build a fileserver and I think I'm gonna use OpenSolaris and ZFS.
>
> I've got a 40GB PATA disk which will be the OS disk, and then I've got 4x250GB SATA + 2x500GB SATA disks. From what you are writing I would think my best option would be to slice the 500GB disks in two 250GB and then make two RAIDz with two 250 disks and one partition from each 500 disk, giving me two RAIDz of 4 slices of 250, equaling to 2 x 750GB RAIDz.

Why not do it this way... Pair the 250GB drives into a 500GB Raid-1 and then use these in the RAID-Z configuration? I would think that this setup would have no less performance than a Raid-Z using only 500GB drives. However, you'll have to ask someone for the zfs command magic to do this. Off the top of my head, I'm not sure.

> How would the performance be with this? I mean, it would probably drop since I would have two raidz slices on one disk.
>
> From what I gather, I would still be able to lose one of the 500 disks (or 250) and still be able to recover, right?

Right, but then if a 500GB drive fails you degrade both pools.

> Perhaps I should just get another 500GB disk and run a RAIDz on the 500s and one RAIDz on the 250s?

That sounds even better.

> I'm also a bit of a noob when it comes to ZFS (but it looks like it's not that hard to admin) - Would I be able to join the two RAIDz together for one BIG volume altogether? And it will survive one disk failure?

No, these would be two separate pools of storage.

Gary

This message posted from opensolaris.org
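One way the "command magic" above could be approximated - and this is an assumption, not something given in the thread - is to build each paired set of 250GB drives into a single device outside ZFS with SVM, then hand that metadevice to raidz as if it were one disk. Note that to present 500GB from two 250GB drives the pair has to be a concatenation rather than a true Raid-1 (a mirror of two 250GB drives only presents 250GB). A rough, untested sketch with made-up device names, assuming SVM state database replicas already exist:

    # concatenate two 250GB drives into one ~500GB metadevice, twice
    metainit d10 2 1 c2t0d0s0 1 c2t1d0s0
    metainit d11 2 1 c2t2d0s0 1 c2t3d0s0

    # use the metadevices as raidz members next to the real 500GB disks
    zpool create tank raidz /dev/md/dsk/d10 /dev/md/dsk/d11 c3t0d0 c3t1d0

The usual caveat applies: ZFS prefers whole physical disks, and layering it on SVM hides the disk boundaries from the pool.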
I would keep it simple. Let's call your 250GB disks A, B, C, D, and your 500GB disks X and Y. I'd either make them all mirrors:

    zpool create mypool mirror A B mirror C D mirror X Y

or raidz the little ones and mirror the big ones:

    zpool create mypool raidz A B C D mirror X Y

or, as you mention, get another 500GB disk, Z, and raidz like this:

    zpool create mypool raidz A B C D raidz X Y Z

Jeff

On Wed, Sep 26, 2007 at 01:06:38PM -0700, Christopher wrote:
> I'm about to build a fileserver and I think I'm gonna use OpenSolaris and ZFS.
>
> I've got a 40GB PATA disk which will be the OS disk, and then I've got 4x250GB SATA + 2x500GB SATA disks. From what you are writing I would think my best option would be to slice the 500GB disks in two 250GB and then make two RAIDz with two 250 disks and one partition from each 500 disk, giving me two RAIDz of 4 slices of 250, equaling to 2 x 750GB RAIDz.
>
> How would the performance be with this? I mean, it would probably drop since I would have two raidz slices on one disk.
>
> From what I gather, I would still be able to lose one of the 500 disks (or 250) and still be able to recover, right?
>
> Perhaps I should just get another 500GB disk and run a RAIDz on the 500s and one RAIDz on the 250s?
>
> I'm also a bit of a noob when it comes to ZFS (but it looks like it's not that hard to admin) - Would I be able to join the two RAIDz together for one BIG volume altogether? And it will survive one disk failure?
>
> /Christopher
>
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On Wed, 26 Sep 2007, Jeff Bonwick wrote:
> I would keep it simple. Let's call your 250GB disks A, B, C, D,
> and your 500GB disks X and Y. I'd either make them all mirrors:
>
>     zpool create mypool mirror A B mirror C D mirror X Y
>
> or raidz the little ones and mirror the big ones:
>
>     zpool create mypool raidz A B C D mirror X Y
>
> or, as you mention, get another 500GB disk, Z, and raidz like this:
>
>     zpool create mypool raidz A B C D raidz X Y Z

+1 - all excellent solutions.

Also consider two pools, one a raidz and the 2nd a 2-way mirror. That way you can take advantage of the different operational characteristics of each pool:

    zpool create mypool raidz A B C D
    zpool create mirpool mirror X Y

> Jeff
>
> On Wed, Sep 26, 2007 at 01:06:38PM -0700, Christopher wrote:
>> I'm about to build a fileserver and I think I'm gonna use OpenSolaris and ZFS.
>>
>> I've got a 40GB PATA disk which will be the OS disk, and then I've got 4x250GB SATA + 2x500GB SATA disks. From what you are writing I would think my best option would be to slice the 500GB disks in two 250GB and then make two RAIDz with two 250 disks and one partition from each 500 disk, giving me two RAIDz of 4 slices of 250, equaling to 2 x 750GB RAIDz.
>>
>> How would the performance be with this? I mean, it would probably drop since I would have two raidz slices on one disk.
>>
>> From what I gather, I would still be able to lose one of the 500 disks (or 250) and still be able to recover, right?
>>
>> Perhaps I should just get another 500GB disk and run a RAIDz on the 500s and one RAIDz on the 250s?
>>
>> I'm also a bit of a noob when it comes to ZFS (but it looks like it's not that hard to admin) - Would I be able to join the two RAIDz together for one BIG volume altogether? And it will survive one disk failure?
>>
>> /Christopher
>>
>> This message posted from opensolaris.org
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Al Hopper Logical Approach Inc, Plano, TX. al at logical-approach.com
Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Hmm.. Thanks for the input. I want to have the most space but still need a raid in some way to have redundancy. I've added it up and found this:

ggendel - your suggestion makes me "lose" 1TB - lose 250GBx2 for the raid-1 ones and then 500GB from a 3x500GB = 1TB
bonwick - your first suggestion makes me "lose" 1TB. The second, 750GB. The third, still 750GB, but I gain 500GB more, since I now only lose 1/3 of 1500 instead of 1/2 of 1000.

ggendel - yeah, I know you would degrade both pools, but you would still be able to recover, unless our good friend Murphy comes around in between, as I would expect him to :-/

So, as bonwick said - let's keep it simple :) No need to make it very complex.

How is SATA support in OpenSolaris these days? I've read about ppl saying it has poor support, but I believe it was blogs and such from 2006. I downloaded the Developer Edition yesterday.

I can have 10 SATA disks in my tower (8 onboard SATA connections and 2 from a controller card). Why not fill it :)

I made a calculation, buying 1x500+3x750, 4x750 or 4x500 disks. The price/GB doesn't differ much here in Norway.

Option 1 - Buying 4x750GB disks:
4x250 RaidZ - 750/250 (raid size / lost to redundancy)
2x500 Raid1 - 500/500
4x750 RaidZ - 2250/750
Equals: 3500/1500 (3500GB space / 1500GB lost to redundancy)
Cost: 4x750 costs NOK 6000 = US$ 1100

Option 2 - Buy 1x500 + 3x750:
4x250 RaidZ - 750/250
3x500 RaidZ - 1000/500
3x750 RaidZ - 1500/750
Equals: 3250/1500
1x750 costs NOK 1500 = US$ 270
3x500 costs NOK 3000 = US$ 550
Total US$ 820

Option 3 - Buying 4x500GB disks:
4x250 RaidZ - 750/250
6x500 RaidZ - 2500/500
Equals: 3250/750
Cost: 4x500 costs NOK 4000 = US$ 720

Option 2 is not winning in either cost or space, so that's out. Option 1 gives me 250GB more space but costs me NOK 2000 / US$ 360 more than option 3. For NOK 2000 I could get two more 500GB disks or one big 1TB disk.

Obviously from a cost AND size perspective it would be best/smart to go for option 3 and have a raidz of 4x250 and one of 6x500.

Comments?

This message posted from opensolaris.org
For SATA support, try to stick to the Marvell/Supermicro Card, LSI, certain Intel and Nvidia, and the Silicon Image 3112 and 3114 chipsets. The Sun HCL is invaluable here. Note that you want ZFS to be looking directly at the disks, not at some abstraction created by hardware RAID controllers. Therefore, you may have to set the chipset bios on these to JBOD or vanilla controller mode. Blake On 9/27/07, Christopher <joffer at online.no> wrote:> > Hmm.. Thanks for the input. I want to have the most space but still need a > raid in some way to have redundancy. > > I''ve added it up and found this: > ggendel - your suggestiong makes me "loose" 1TB - Loose 250GBx2 for the > raid-1 ones and then 500GB from a 3x500GB = 1TB > bonwick - your first suggestion makes me "loose" 1TB. The second 750GB. > The third, still 750GB but I gain 500GB more, since I know only loose 1/3 of > 1500 instead of 1/2 of 1000. > > ggendel - yeah I know you would degrade both pools, but you would still be > able to recover, unless our good friend Murphy comes around in between, as I > would expect him to :-/ > > So, as bonwick said - let''s keep it simple :) No need to make it very > complex. > > How is SATA support in OpenSolaris these days. I''ve read about ppl saying > it has poor support, but I believe it was blogs and suchs from 2006. I > downloaded the Developer Edition yesterday. > > I can have 10 SATA disks in my tower (8 onbord sata connections and 2 from > a controller card). Why not fill it :) > > I made a calculation, buying 1x500+3x750, 4x750 or 4x500 disks. The > price/GB doesn''t differ much here in Norway. > > Option 1 - Buying 4x750GB disks: > 4x250 RaidZ - 750/250 (Raid size/lost to redundancy) > 2x500 Raid1 - 500/500 > 4x750 RaidZ - 2250/750 > Equals: 3500/1500 (3500GB space / 1500GB lost to redundancy) > Cost: 4x750 costs NOK6000 = US$1100 > > Option 2 - Buy 1x500 + 3x750 > 4x250 RaidZ - 750/250 > 3x500 Raid1 - 1000/500 > 3x750 RaidZ - 1500/750 > Equals 3250/1500 > 1x750 costs NOK 1500 = US$ 270 > 3x500 costs NOK 3000 = US$ 550 > Total US$ 820 > > Option 3 - Buying 4x500GB disks: > 4x250 RaidZ - 750/250 > 6x500 Raid1 - 2500/500 > Equals: 3250/750 > Cost: 4x500 costs NOK4000 = US$ 720 > > Option 2 is not winning in either cost or space, so thats out. > Option 1 gives me 250GB more space but costs me NOK 2000 / US$ 360 more > than option 3. For NOK 2000 I could get two more 500GB disks or one big 1TB > disk. > > Obviously from a cost AND size perspective it would be best/smart to go > for option 3 and have a raidz of 4x250 and one of 6x500. > > Comments? > > > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20070927/b591de66/attachment.html>
David Dyer-Bennet
2007-Sep-27 18:20 UTC
[zfs-discuss] Best option for my home file server?
Blake wrote:
>> Obviously from a cost AND size perspective it would be best/smart to go
>> for option 3 and have a raidz of 4x250 and one of 6x500.
>>
>> Comments?

How long are you going to need this data? Do you have an easy and quick way to back it all up? Is the volume you need going to grow over time?

For *my* home server, the need to expand over time ended up dominating the disk architecture, and I chose a less efficient (more space/money lost to redundant storage) architecture that was easier to upgrade in small increments, because that fit my intention to maintain the data long-term, and the lack of any efficient easy way to back up and restore the data (I *do* back it up to external firewire disks, but it takes 8 hours or so, so I don't want to have to have the system down for a full two-way copy when I need to upgrade the disk sizes).

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Christopher Gibbs
2007-Sep-27 18:58 UTC
[zfs-discuss] Best option for my home file server?
David, mind sharing the specifics of your configuration? I''m also about to re-configure my home file server so I''m interested in what configurations people are using. On 9/27/07, David Dyer-Bennet <dd-b at dd-b.net> wrote:> Blake wrote: > >> Obviously from a cost AND size perspective it would be best/smart to go > >> for option 3 and have a raidz of 4x250 and one of 6x500. > >> > >> Comments? > >> > >> > > How long are you going to need this data? Do you have an easy and quick > way to back it all up? Is the volume you need going to grow over time? > For *my* home server, the need to expand over time ended up dominating > the disk architecture, and I chose a less efficient (more space/money > lost to redundant storage) architecture that was easier to upgrade in > small increments, because that fit my intention to maintain the data > long-term, and the lack of any efficient easy way to back up and restore > the data (I *do* back it up to external firewire disks, but it takes 8 > hours or so, so I don''t want to have to have the system down for a full > two-way copy when I need to upgrade the disk sizes). > > -- > David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/ > Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/ > Photos: http://dd-b.net/photography/gallery/ > Dragaera: http://dragaera.info > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-- Christopher Gibbs Email / LDAP Administrator Web Integration & Programming Abilene Christian University
On 26/09/2007, Christopher <joffer at online.no> wrote:
> I'm about to build a fileserver and I think I'm gonna use OpenSolaris and ZFS.
>
> I've got a 40GB PATA disk which will be the OS disk,

Would be nice to remove that as a SPOF. I know ZFS likes whole disks, but I wonder how much performance would suffer if you SVMed up the first few GB of a ZFS mirror pair for your root fs? I did it this week on Solaris 10 and it seemed to work pretty well
( http://number9.hellooperator.net/articles/2007/09/27/solaris-10-on-mirrored-disks )

Roll on ZFS root :)

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
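A condensed sketch of the SVM root-mirror setup described above; the slice and metadevice names are assumptions, and the linked article has the full procedure:

    # state database replicas on a small slice of each disk
    metadb -a -f -c 2 c1t0d0s7 c1t1d0s7

    # one submirror per disk for the root slice, then the mirror itself
    metainit -f d11 1 1 c1t0d0s0
    metainit d12 1 1 c1t1d0s0
    metainit d10 -m d11
    metaroot d10          # updates /etc/vfstab and /etc/system

    # after a reboot, attach the second half
    metattach d10 d12

    # the remaining large slice on each disk (here assumed to be s3) goes to ZFS
    zpool create tank mirror c1t0d0s3 c1t1d0s3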
David Dyer-Bennet
2007-Sep-27 19:58 UTC
[zfs-discuss] Best option for my home file server?
Christopher Gibbs wrote:
> On 9/27/07, David Dyer-Bennet <dd-b at dd-b.net> wrote:
>> How long are you going to need this data? Do you have an easy and quick
>> way to back it all up? Is the volume you need going to grow over time?
>> For *my* home server, the need to expand over time ended up dominating
>> the disk architecture, and I chose a less efficient (more space/money
>> lost to redundant storage) architecture that was easier to upgrade in
>> small increments, because that fit my intention to maintain the data
>> long-term, and the lack of any efficient easy way to back up and restore
>> the data (I *do* back it up to external firewire disks, but it takes 8
>> hours or so, so I don't want to have to have the system down for a full
>> two-way copy when I need to upgrade the disk sizes).
>
> David, mind sharing the specifics of your configuration?
>
> I'm also about to re-configure my home file server so I'm interested
> in what configurations people are using.

Asus M2N SLI Deluxe motherboard with AMD X2 processor and 2GB RAM in a Chenbro case, maybe the 106? The pictures online show a different door than the one I have, so I may have the wrong model, but they may just have restyled the door. 8 hot-swap bays. Trouble is, hot-swap isn't recognized on the chipset on that motherboard, so I'm currently living with that.

Currently using two bays for the boot disks, leaving me 6 bays (but only 4 controllers) for data disks. I picked up a bunch of 400GB spare SATA disks (Sun-branded Hitachi drives) from work when they were put out for disposal, so I've got two mirror pairs in there right now.

Also I can't get RSA authentication to work with sshd on Solaris, and I keep asking here and elsewhere and nobody has been able to help yet, darn it (over a year now). I use RSA2 authentication to other servers just fine.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
On Thu, 27 Sep 2007, Blake wrote:> For SATA support, try to stick to the Marvell/Supermicro Card, LSI, certain > Intel and Nvidia, and the Silicon Image 3112 and 3114 chipsets. The Sun HCLAvoid the 3112 and 3114. Last I looked they are supported by the legacy ata driver and performance is, for want of a better word, poor. A 3124 based card would be a good choice - as long as you use the fixed 3124 driver here: http://www.opensolaris.org/jive/servlet/JiveServlet/download/80-32437-138083-3390/si3124.tar.gz ... snip .... Al Hopper Logical Approach Inc, Plano, TX. al at logical-approach.com Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007 http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
> I think I have managed to confuse myself so i am asking outright hoping for a straight answer.

Straight answer:

ZFS does not (yet) support adding a disk to an existing raidz set - the only way to expand an existing pool is by adding a stripe. Stripes can either be mirror, raid5, or raid6 (raidz with single or double parity) - these striped pools are also known as raid10, raid50, and raid60 respectively. Each stripe in a pool may be different in both size and type - essentially, each offers space at a resiliency rating. However, since apps can't control which stripe their data is written to, all stripes in a pool generally have the same amount of parity. Thus, in practice, stripes differ only in size, which can be achieved either by using larger disks or by using more disks (in a raidz). When stripes are of different size, ZFS will, in time, consume all the space each stripe offers - assuming data access is completely balanced, larger stripes effectively get more I/O. Regarding matching the amount of parity in each stripe, note that a 2-disk mirror has the same amount of parity as RAID5 and a 3-disk mirror has the same parity as RAID6.

So, if the primary goal is to grow a pool over time by adding as few disks as possible each time while keeping one disk's worth of parity, you need to plan on adding two disks in a mirrored configuration each time. Thus your number of disks would grow like this: 2, 4, 6, 8, 10, etc.

But since folks apparently want to be able to just add disks to a RAIDZ, let's compare that to adding 2-disk mirror stripes in terms of impact to space, resiliency, and performance. In both cases I'm assuming 500GB disks with an MTBF of 4 years, 7,200 rpm, and 8.5 ms average read seek.

Let's first consider adding disks to a RAID5:

Following the ZFS best-practice rule of (N+P), where N={2,4,8} and P={1,2}, the disk count should grow as follows: 3, 5, 9. That is, you would start with 3, add 2, and then add 4 - note: this would be the limit of the raidz expansion since ZFS discourages N>8. So, the pool's MTTDL would be:

3 disks: space=1000 GB, mttdl=760.42 years, iops=79
5 disks: space=2000 GB, mttdl=228.12 years, iops=79
9 disks: space=4000 GB, mttdl=63.37 years, iops=79

Now let's consider adding 2-disk mirror stripes:

We already said that the disks would grow by twos: 2, 4, 6, 8, 10, etc. - so the pool's MTTDL would be:

2 disks: space=500 GB, mttdl=760.42 years, iops=158
4 disks: space=1000 GB, mttdl=380 years, iops=316
6 disks: space=1500 GB, mttdl=190 years, iops=474
8 disks: space=2000 GB, mttdl=95 years, iops=632

So, adding 2-disk mirrors:

1. is less expensive per addition (it's always just two disks)
2. is not limited in number of stripes (a raidz should only hold up to 8 data disks)
3. drops mttdl at about the same rate (though the raidz is dropping a little faster)
4. increases performance (adding disks to a raidz set has no impact)
5. increases space more slowly (the only negative - can you live with it?)

Highly recommended resources:

http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance
http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl

Hope that helps,

Kent
I made a mistake in calculating the mttdl-drop for adding stripes - it should have read: 2 disks: space=500 GB, mttdl=760.42 years, iops=158 4 disks: space=1000 GB, mttdl=380 years, iops=316 6 disks: space=1500 GB, mttdl=*253* years, iops=474 8 disks: space=2000 GB, mttdl=*190* years, iops=632 So, in my conclusion, it should have read 1. is less expensive per addition (its always just two disks) 2. not limited in number of stripes (a raidz should only hold up to 8 data disks) 3. *drops mttdl much less quickly (in fact, you''d need 12 stripes before hitting the 8+1 mttdl)* 4. increases performance (adding disks to a raidz set has no impact) 5. increases space more slowly (the only negative - can you live with it?) Sorry! Kent Kent Watsen wrote:> >> I think I have managed to confuse myself so i am asking outright hoping for a straight answer. >> > Straight answer: > > ZFS does not (yet) support adding a disk to an existing raidz set > - the only way to expand an existing pool is by adding a stripe. > Stripes can either be mirror, raid5, or raid6 (raidz w/ single or > double parity) - these striped pools are also known as raid10, > raid50, and raid60 respectively. Each stripe in a pool may be > different in both size and type - essentially, each offers space > at a resiliency rating. However, since apps can''t control which > stripe their data is written to, all stripes in a pool generally > have the same amount of parity. Thus, in practice, stripes differ > only in size, which can be achieved either by using larger disks > or by using more disks (in a raidz). When stripes are of > different size, ZFS will, in time, consume all the space each > stripe offers - assuming data-access is completely balanced, > larger stripes effectively have more I/O. Regarding matching the > amount of parity in each stripe, note that a 2-disk mirror has the > same amount of parity as RAID5 and a 3-disk mirror has the same > parity as RAID6. > > > So, if the primary goal is to grow a pool over time by adding as few > disks as possible each time while having 1 bit of parity, you need to > plan on each time adding two disks in a mirrored configuration. Thus > your number of disks would grow like this: 2, 4, 6, 8, 10, etc. > > > But since folks apparently want to be able to just add disks to a > RAIDZ, lets compare that to adding 2-disk mirror stripes in terms of > impact to space, resiliency, and performance. In both cases I''m > assuming 500GB disks having a MTBF of 4 years,7,200 rpm, and 8.5 ms > average read seek. > > Lets first consider adding disks to a RAID5: > > Following the ZFS best-practice rule of (N+P), where N={2,4,8} and > P={1,2}, the disk-count should grow as follows: 3, 5, 9. That is, > you would start with 3, add 2, and then add 4 - note: this would > be the limit of the raidz expansion since ZFS discourages N>8. > So, the pool''s MTTDL would be: > > 3 disks: space=1000 GB, mttdl=760.42 years, iops=79 > 5 disks: space=2000 GB, mttdl=228.12 years, iops=79 > 9 disks: space=4000 GB, mttdl=63.37 years, iops=79 > > Now lets consider adding 2-disk mirror stripes: > > We already said that the disks would grow by twos: 2, 4, 6, 8, 10, > etc. - so the pool''s MTTDL would be: > > 2 disks: space=500 GB, mttdl=760.42 years, iops=158 > 4 disks: space=1000 GB, mttdl=380 years, iops=316 > 6 disks: space=1500 GB, mttdl=190 years, iops=474 > 8 disks: space=2000 GB, mttdl=95 years, iops=632 > > So, adding 2-disk mirrors: > > 1. is less expensive per addition (its always just two disks) > 2. 
not limited in number of stripes (a raidz should only hold up to > 8 data disks) > 3. drops mttdl at about the same rate (though the raidz is dropping > a little faster) > 4. increases performance (adding disks to a raidz set has no impact) > 5. increases space more slowly (the only negative - can you live > with it?) > > > Highly Recommended Resources: > > http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance > http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl > > > > > Hope that helps, > > Kent > > > ------------------------------------------------------------------------ > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20070928/f3b55615/attachment.html>
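The grow-by-adding-mirror-pairs path discussed in the two messages above comes down to a single command per upgrade; a minimal sketch with placeholder device names (not taken from the thread):

    # initial pool: one mirrored pair
    zpool create tank mirror c2t0d0 c2t1d0

    # each later upgrade just adds another mirrored pair as a new top-level vdev
    zpool add tank mirror c2t2d0 c2t3d0
    zpool add tank mirror c2t4d0 c2t5d0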
I'm new to the list so this is probably a noob question: is this forum part of a mailing list or something? I keep getting some answers to my posts in this thread on email as well as some here, but it seems that those answers/posts on email aren't posted on this forum..?? Or do I just get a copy on email of what ppl post here on the forum?

Georg Edelmann wrote me on email saying he was interested in making a homeserver/NAS as I'm about to (try to) do and wanted to know my hardware etc.

What I was thinking of using for this server was an Asus A8N-SLI Deluxe with some kind of AMD64 CPU, probably the cheapest X2 I can find, and pair it with 1 or perhaps 2GB of RAM. The mainboard has 8 SATA onboard, 4 nvidia and 4 sil3114. I was also gonna get a 2-port SATA add-on controller card, totalling 10 SATA ports. But now I'm not sure, since alhopper just said the performance of the 3114 is poor. Blake, on the other hand, mentioned the Sil3114 as a controller chip to use. I will of course not make use of the fake-raid on the mainboard.

Kent - I see your point and it's a good one, but for me, I only want a big fileserver with redundancy for my music collection, movie collection and pictures etc. I would of course make a backup of the most important data as well from time to time.

This message posted from opensolaris.org
I would agree that the performance of the SiI 3114 is not great. I have a similar ASUS board, and have used the standalone controller as well. Adaptec makes a nice 2-channel SATA card that is a lot better, though about 2x as much money. The Supermicro/Marvell controller is very well rated and supports 8 drives I think. The best option would be to get the LSI card that also works on SPARC hardware (which is way more industrial-grade than anything pee-cee) - but that card is about 300 bux. Remember that ZFS obsoletes the need for hardware RAID, so you will need (for example in the case of the 3114) to set the controller to expose individual disks to the OS. In the case of the 3114 this means re-flashing the controller BIOS. As far as the system goes, make sure you use 64-bit proc (you can address a lot more memory with ZFS this way) and lots of RAM. Anything below 4gb of RAM in the Solaris world is considered paltry :^) - Solaris makes extremely good use of lots of RAM, and ZFS in particular (because of it''s smart I/O scheduler) enjoys nice performance gains on a box with lots of RAM. If I were you, I''d buy the cheapest 64-bit proc you can and spend the saved money on maxing the RAM out. Blake On 9/28/07, Christopher <joffer at online.no> wrote:> > I''m new to the list so this is probably a noob question: Are this forum > part of a mailinglist or something? I keep getting some answers to my posts > in this thread on email as well as some here, but it seems that those > answers/posts on email aren''t posted on this forum..?? Or do I just get a > copy on email from what ppl post here on the forum? > > Georg Edelmann wrote me on email saying he was interested in making a > homeserver/nas as I''m about to (try to) do and wanted to know my hardware > etc. > > What I was thinking of using for this server was Asus A8N-SLI Deluxe with > some kind of AMD64 CPU, probably the cheapest X2 I can find and pair it with > 1 or perhaps 2GB of RAM. The mainbord has 8 SATA onboard, 4 nvidia and 4 > sil3114. I was also gonna get a 2sata add-on controller card, totalling 10 > sata ports. But now I''m not sure, since alhopper just said the performance > of the 3114 is poor. > > Blake, on the other hand mentioned the Sil3114 as a controller chip to > use. I will of course not make use of the fake-raid on the mainboard. > > Kent - I see your point and it''s a good one and, but for me, I only want a > big fileserver with redundancy for my music collection, movie collection and > pictures etc. I would of course make a backup of the most important data as > well from time to time. > > > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20070928/3a5319ee/attachment.html>
Christopher wrote:
> Kent - I see your point and it's a good one, but for me, I only want a big fileserver with redundancy for my music collection, movie collection and pictures etc. I would of course make a backup of the most important data as well from time to time.

Chris,

We have two things in common - I'm also a n00b (only started looking at ZFS seriously in June) and I'm also building a home server for my music/movies/pictures and all the other data in my house. For me, maximizing space and resiliency is more important than performance (as even entry-level performance exceeds my worst case of 7 simultaneous 1080p video streams).

I decided to get a 24-bay case and will start with a single 4+2 set, and will stripe in the remaining three 4+2 sets over time. The reason I chose this approach over striping a bunch of 2-disk mirrors is that similar calculations resulted in the following:

- 11 * (2-disk mirror): space=11 TB, mttdl=69 years, iops=1738 (2 hot-spares not inc. in mttdl calc)
- 4 * (4+2 raidz2 set): space=16 TB, mttdl=8673.5 years, iops=316

So you see, I get more space and resiliency, but not as good performance (though it exceeds my needs).

Thanks,

Kent
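For reference, the grow-over-time plan described above maps onto plain zpool commands; a sketch with made-up device names (the actual controller/target names will of course differ):

    # start with a single 4+2 raidz2 set
    zpool create tank raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0

    # later, stripe in another 4+2 set as an additional top-level vdev
    zpool add tank raidz2 c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0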
pet peeve below... Kent Watsen wrote:> >> I think I have managed to confuse myself so i am asking outright hoping for a straight answer. >> > Straight answer: > > ZFS does not (yet) support adding a disk to an existing raidz set - > the only way to expand an existing pool is by adding a stripe. > Stripes can either be mirror, raid5, or raid6 (raidz w/ single or > double parity) - these striped pools are also known as raid10, > raid50, and raid60 respectively. Each stripe in a pool may be > different in both size and type - essentially, each offers space at > a resiliency rating. However, since apps can''t control which stripe > their data is written to, all stripes in a pool generally have the > same amount of parity. Thus, in practice, stripes differ only in > size, which can be achieved either by using larger disks or by using > more disks (in a raidz). When stripes are of different size, ZFS > will, in time, consume all the space each stripe offers - assuming > data-access is completely balanced, larger stripes effectively have > more I/O. Regarding matching the amount of parity in each stripe, > note that a 2-disk mirror has the same amount of parity as RAID5 and > a 3-disk mirror has the same parity as RAID6. > > > So, if the primary goal is to grow a pool over time by adding as few > disks as possible each time while having 1 bit of parity, you need to > plan on each time adding two disks in a mirrored configuration. Thus > your number of disks would grow like this: 2, 4, 6, 8, 10, etc. > > > But since folks apparently want to be able to just add disks to a RAIDZ, > lets compare that to adding 2-disk mirror stripes in terms of impact to > space, resiliency, and performance. In both cases I''m assuming 500GB > disks having a MTBF of 4 years,7,200 rpm, and 8.5 ms average read seek.MTBF=4 years is *way too low*! Disk MTBF should be more like 114 years. This is also a common misapplication of reliability analysis. To excerpt from http://blogs.sun.com/relling/entry/using_mtbf_and_time_dependent For example, data collected for the years 1996-1998 in the US showed that the annual death rate for children aged 5-14 was 20.8 per 100,000 resident population. This shows an average failure rate of 0.0208% per year. Thus, the MTBF for children aged 5-14 in the US is approximately 4,807 years. Clearly, no human child could be expected to live 5,000 years. That said (ok, it is a pet peeve for RAS guys :-) the relative merit of the rest of the analysis is good :-) And, for the record, I mirror. -- richard> Lets first consider adding disks to a RAID5: > > Following the ZFS best-practice rule of (N+P), where N={2,4,8} and > P={1,2}, the disk-count should grow as follows: 3, 5, 9. That is, > you would start with 3, add 2, and then add 4 - note: this would be > the limit of the raidz expansion since ZFS discourages N>8. So, > the pool''s MTTDL would be: > > 3 disks: space=1000 GB, mttdl=760.42 years, iops=79 > 5 disks: space=2000 GB, mttdl=228.12 years, iops=79 > 9 disks: space=4000 GB, mttdl=63.37 years, iops=79 > > Now lets consider adding 2-disk mirror stripes: > > We already said that the disks would grow by twos: 2, 4, 6, 8, 10, > etc. - so the pool''s MTTDL would be: > > 2 disks: space=500 GB, mttdl=760.42 years, iops=158 > 4 disks: space=1000 GB, mttdl=380 years, iops=316 > 6 disks: space=1500 GB, mttdl=190 years, iops=474 > 8 disks: space=2000 GB, mttdl=95 years, iops=632 > > So, adding 2-disk mirrors: > > 1. is less expensive per addition (its always just two disks) > 2. 
not limited in number of stripes (a raidz should only hold up to 8 > data disks) > 3. drops mttdl at about the same rate (though the raidz is dropping a > little faster) > 4. increases performance (adding disks to a raidz set has no impact) > 5. increases space more slowly (the only negative - can you live with > it?) > > > Highly Recommended Resources: > > http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance > http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl > > > > > Hope that helps, > > Kent > > > > ------------------------------------------------------------------------ > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
IMHO, a better investment is in the NVidia MCP-55 chipsets which support more than 4 SATA ports. The NForce 680a boasts 12 SATA ports. Nevada builds 72+ should see these as SATA drives using the nv_sata driver and not as ATA/IDE disks. -- richard Christopher wrote:> I''m new to the list so this is probably a noob question: Are this forum part of a mailinglist or something? I keep getting some answers to my posts in this thread on email as well as some here, but it seems that those answers/posts on email aren''t posted on this forum..?? Or do I just get a copy on email from what ppl post here on the forum? > > Georg Edelmann wrote me on email saying he was interested in making a homeserver/nas as I''m about to (try to) do and wanted to know my hardware etc. > > What I was thinking of using for this server was Asus A8N-SLI Deluxe with some kind of AMD64 CPU, probably the cheapest X2 I can find and pair it with 1 or perhaps 2GB of RAM. The mainbord has 8 SATA onboard, 4 nvidia and 4 sil3114. I was also gonna get a 2sata add-on controller card, totalling 10 sata ports. But now I''m not sure, since alhopper just said the performance of the 3114 is poor. > > Blake, on the other hand mentioned the Sil3114 as a controller chip to use. I will of course not make use of the fake-raid on the mainboard. > > Kent - I see your point and it''s a good one and, but for me, I only want a big fileserver with redundancy for my music collection, movie collection and pictures etc. I would of course make a backup of the most important data as well from time to time. > > > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Just keep in mind that I tried the patched driver and occasionally had kernel panics because of recursive mutex calls. I believe that it isn't multi-processor safe. I switched to the Marvell chipset and have been much happier.

This message posted from opensolaris.org
sliceing say "S0" to be used as root-filesystem would make ZFS not using the write-buffer on the disks. This would be a slight performance degrade, but would increate reliability of the system (since root is mirrored). Why not living on the edge and booting from ZFS ? This would nearly eliminate UFS. Use e.g. the two 500GB Disks for the root-filesystem on a mirrored pool: mirror X Z here lives the OS with it''s root-Filesystem on ZFS *and* userdata in the same pool raidz A B C D or any other layout or User zwo of the 250GB ones: pool boot-and-userdata-one mirror A B here lives the OS and userdata-one pool userdata-two mirror C D userdata-two spanning CD - XY mirror X Y Thomas On Thu, Sep 27, 2007 at 08:39:40PM +0100, Dick Davies wrote:> On 26/09/2007, Christopher <joffer at online.no> wrote: > > I''m about to build a fileserver and I think I''m gonna use OpenSolaris and ZFS. > > > > I''ve got a 40GB PATA disk which will be the OS disk, > > Would be nice to remove that as a SPOF. > > I know ZFS likes whole disks, but I wonder how much would performance suffer > if you SVMed up the first few Gb of a ZFS mirror pair for your root fs? > I did it this week on Solaris 10 and it seemed to work pretty well > > ( > http://number9.hellooperator.net/articles/2007/09/27/solaris-10-on-mirrored-disks > ) > > Roll on ZFS root :) >
Let A, B, C, D be the 250GB disks and X, Y the 500GB ones. My choice here would be raidz over (A+B), (C+D), X, Y - meaning something like: zpool create tank raidz (stripe A B) (stripe C D) X Y (how do you actually write that up as zpool commands?)
Would the nv_sata driver also be used on nforce 590 sli? I found Asus M2N32 WS PRO at my hw shop which has 9 internal sata connectors. This message posted from opensolaris.org
I believe so. The Solaris device detection tool will show the MCP version, too. http://www.sun.com/bigadmin/hcl/hcts/device_detect.html -- richard Christopher wrote:> Would the nv_sata driver also be used on nforce 590 sli? I found Asus M2N32 WS PRO at my hw shop which has 9 internal sata connectors. > > > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
I went ahead and bought an M9N-Sli motherboard with 6 SATA controllers and also a Promise TX4 (4x SATA300 non-raid) PCI controller. Anyone know if the TX4 is supported in OpenSolaris? If it's as badly supported as the (crappy) Sil chipsets, I'm better off with OpenFiler (Linux), I think.

This message posted from opensolaris.org