I'm about to build a ZFS-based NAS and I'd like some suggestions about how to set up my drives.

The case I'm using holds 20 hot-swap drives, so I plan to use either 4 vdevs with 5 drives each or 5 vdevs with 4 drives each (and a hot spare inside the machine).

The motherboard I'm getting has 4 PCI-X slots: 2 @ 133 MHz and 2 @ 100 MHz. I was planning on buying 3 of the famous AOC-SAT2-MV8 cards, which would give me more than enough SATA ports. I'll also have 6 onboard ports.

I also plan on using 2 SATA-to-CompactFlash adapters with 16 GB CompactFlash cards for the OS.

My main question is: what is the best way to lay out the vdevs? Does it really matter how I lay them out, considering I only have a gigabit network?

Thanks for any help.
I can't answer your question, but I would like to see more details about the system you are building (sorry if off topic here). What motherboard and what compact flash adapters are you using?
Hello,

On Dec 30, 2009, at 2:08 PM, Thomas Burgess wrote:
> My main question is: what is the best way to lay out the vdevs?
> Does it really matter how I lay them out, considering I only have a gigabit network?

It depends: random I/O and resilver/scrubbing should be a bit faster with 5 vdevs, but for sequential data access it should not matter over gigabit. It all comes down to what you want out of the configuration: redundancy versus usable space and price.

raidz2 might be a better choice than raidz, especially if you have large disks. For most of my storage needs I would probably build a pool out of 4 raidz2 vdevs.

Regards,
Henrik
http://sparcv9.blogspot.com
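To make the trade-off concrete, here is a minimal sketch of the two layouts under discussion, using hypothetical device names (c1t0d0 through c4t4d0, plus c4t5d0 for the spare); substitute whatever the real controllers enumerate:

    # Option A (Henrik's suggestion): 4 x 5-disk raidz2.
    # 20 drives, 12 disks' worth of usable capacity, any two drives
    # per vdev can fail.
    zpool create tank \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
        raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 \
        raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 \
        raidz2 c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0

    # Option B (the other layout mentioned): 5 x 4-disk raidz plus a
    # hot spare.  15 disks' worth of usable capacity and one more vdev
    # for random I/O, but only a single drive per vdev may fail.
    zpool create tank \
        raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
        raidz c1t4d0 c2t0d0 c2t1d0 c2t2d0 \
        raidz c2t3d0 c2t4d0 c3t0d0 c3t1d0 \
        raidz c3t2d0 c3t3d0 c3t4d0 c4t0d0 \
        raidz c4t1d0 c4t2d0 c4t3d0 c4t4d0 \
        spare c4t5d0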
OK, but how should I connect the drives across the controllers?

I'll have 3 PCI-X cards, each with 8 SATA ports, across 2 PCI-X buses at 133 MHz and 2 at 100 MHz, plus 6 onboard SATA ports... so what would be the best method of connecting the drives if I go with 4 raidz vdevs or 5 raidz vdevs?

On Wed, Dec 30, 2009 at 10:19 AM, Henrik Johansson <henrikj at henkis.net> wrote:
> It depends: random I/O and resilver/scrubbing should be a bit faster with 5 vdevs, but for sequential data access it should not matter over gigabit.
> raidz2 might be a better choice than raidz, especially if you have large disks.
On Dec 30, 2009, at 7:50 AM, Thomas Burgess wrote:
> OK, but how should I connect the drives across the controllers?

Don't worry about the controllers. They are at least an order of magnitude more reliable than the disks, and if you are using HDDs, then you will have plenty of performance.
 -- richard
On Wed, 30 Dec 2009, Thomas Burgess wrote:
> what would be the best method of connecting the drives if I go with 4 raidz vdevs or 5 raidz vdevs?

Try to distribute the raidz vdevs as evenly as possible across the available SATA controllers. In other words, try to wire a drive from each vdev to a different SATA controller if possible. The ideal case (rarely achieved) would be if you had the same number of SATA controllers as you have devices in the raidz vdev. This should assist with performance since you are only using PCI-X.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
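One possible wiring of that advice onto this particular hardware, again with hypothetical controller numbers (c1, c2, c3 for the three AOC-SAT2-MV8 cards, c4 for the onboard ports): with four controllers and five-disk vdevs a perfect one-disk-per-controller spread isn't possible, but no controller has to carry more than two disks of any one vdev.

    # Same 4 x raidz2 shape as before, but each vdev draws its disks from
    # all four controllers.  No controller holds more than two disks of a
    # single raidz2 vdev, so losing a controller leaves every vdev readable.
    zpool create tank \
        raidz2 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c1t1d0 \
        raidz2 c1t2d0 c2t1d0 c3t1d0 c4t1d0 c2t2d0 \
        raidz2 c1t3d0 c2t3d0 c3t2d0 c4t2d0 c3t3d0 \
        raidz2 c1t4d0 c2t4d0 c3t4d0 c4t3d0 c4t4d0

Each controller ends up holding exactly five of the twenty drives.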
On Dec 30, 2009, at 10:17 AM, Bob Friesenhahn wrote:
> Try to distribute the raidz vdevs as evenly as possible across the available SATA controllers. [...] This should assist with performance since you are only using PCI-X.

He's limited by GbE, which can only do 100 MB/s or so... the PCI busses, bridges, memory, controllers, and disks will be mostly loafing, from a bandwidth perspective. In other words, don't worry about it.
 -- richard
On Wed, Dec 30, 2009 at 1:17 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> Try to distribute the raidz vdevs as evenly as possible across the available SATA controllers.

Thanks for the advice.

Just curious, but in your "ideal" situation, is it considered best to use one controller for each vdev, or a different controller for each device in the vdev? (I'd guess the latter, but I've been wrong before.)
On Wed, 30 Dec 2009, Richard Elling wrote:
> He's limited by GbE, which can only do 100 MB/s or so... the PCI busses, bridges, memory, controllers, and disks will be mostly loafing, from a bandwidth perspective. In other words, don't worry about it.

Except that cases like 'zfs scrub' and resilver will benefit from any bandwidth increases. This is sufficient reason to try to optimize the I/O.

Bob
On Wed, 30 Dec 2009, Thomas Burgess wrote:
> Just curious, but in your "ideal" situation, is it considered best to use one controller for each vdev, or a different controller for each device in the vdev?

From both a fault-tolerance standpoint and a performance standpoint, it is best to distribute the vdev devices across controllers. With perfect distribution, you could pull a controller and it would be as if one drive failed in each vdev (and you could still read your data). If a controller messes up and sends bad data, then the damage is more limited.

Bob
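For anyone who wants to see that behaviour on a scratch pool before trusting it, a short experiment using the hypothetical device names from the sketches above (these four disks fall in different vdevs in both raidz2 sketches):

    # Take one disk out of each vdev at once -- roughly what a pulled
    # controller looks like in the well-distributed case:
    zpool offline tank c1t0d0 c2t1d0 c3t2d0 c4t3d0

    # The pool should report DEGRADED but keep serving reads and writes:
    zpool status tank

    # Put the disks back and let ZFS resilver what was missed:
    zpool online tank c1t0d0 c2t1d0 c3t2d0 c4t3d0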
On Wed, Dec 30, 2009 at 2:01 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> From both a fault-tolerance standpoint and a performance standpoint, it is best to distribute the vdev devices across controllers.

I realized two things after I hit "send": 1, that it made more sense in case a controller failed, and 2, you had already answered the question. =)

I'm going to go drink some coffee now and put a big reminder on the board: DON'T HIT SEND BEFORE COFFEE.

Thanks =)
On Dec 30, 2009, at 10:56 AM, Bob Friesenhahn wrote:
> Except that cases like 'zfs scrub' and resilver will benefit from any bandwidth increases. This is sufficient reason to try to optimize the I/O.

Disagree. Scrubs and resilvers are IOPS-bound.
 -- richard
On Dec 30, 2009, at 11:01 AM, Bob Friesenhahn wrote:
> From both a fault-tolerance standpoint and a performance standpoint, it is best to distribute the vdev devices across controllers. With perfect distribution, you could pull a controller and it would be as if one drive failed in each vdev (and you could still read your data). If a controller messes up and sends bad data, then the damage is more limited.

Don't confuse reliability with diversity. Reliability is highly dependent on the number of parts: fewer is better. Therefore the optimum number of items for redundancy is 2. Hence, you'll see RAS guys recommending mirrors as a standard practice :-)

In the case where you want high GB/$, you must trade off something. If you decide to go raidz[123] instead of mirroring, then it really doesn't matter how many controllers you have; 2 is the optimum number for dependability.

The reason I keep harping on this is that in the bad old days there was a 1:1 relationship between a controller and a shared bus (IDE, parallel SCSI). If you take out the shared bus, you take out the controller. Today, with point-to-point connections like SATA or SAS, there is no shared bus, so the disks are isolated from each other. In this architecture there is a 1:many relationship between controllers and disks, with one less (big, important) failure mode -- the shared bus. It is true that the controller itself is still a SPOF, but it is also true that the controller is more than an order of magnitude more reliable than the disks. If you do an availability analysis on such systems, the results will show that the controller doesn't matter -- worry about the disks.

To take this thought a step further, if you are so worried about a controller being the SPOF, then you will reject Intel or AMD architectures where there is only one bridge chip. In other words, you would reject 99+% of the Intel and AMD systems available on the market. So, if you accept a high-volume Intel or AMD system, then you should also accept as few controllers as possible -- which is a very reasonable and defensible engineering decision.
 -- richard
On Wed, 30 Dec 2009, Richard Elling wrote:
> Disagree. Scrubs and resilvers are IOPS-bound.

This is a case of "it depends". On both of my Solaris systems, scrubs seem to be bandwidth-limited. However, I am not using raidz or SATA, and the drives are faster than the total connectivity.

Bob
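One way to settle "it depends" for a particular pool is simply to measure it; a minimal sketch, assuming the pool is named tank:

    # Kick off a scrub and watch per-disk activity while it runs.
    zpool scrub tank
    zpool status tank        # shows scrub progress and rate
    iostat -xn 5             # per-device r/s (ops) vs kr/s (throughput) and %b

    # If the disks sit near 100 %b while kr/s stays far below their
    # sequential streaming rate, the scrub is IOPS-bound; if kr/s gets
    # close to that rate, it is bandwidth-bound.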
On Wed, Dec 30, 2009 at 7:08 AM, Thomas Burgess <wonslung at gmail.com> wrote:
> I also plan on using 2 SATA-to-CompactFlash adapters with 16 GB CompactFlash cards for the OS.

For the OS, I'd drop the adapter/compact-flash combo and use the "stripped down" Kingston version of the Intel X25-M MLC SSD. If you're not familiar with it, the basic scoop is that this drive contains half the flash memory (40 GB) *and* half the controller channels (5 versus 10) of the Intel drive, so write performance is basically a little less than half, although read performance is still very good. For more info, google for hardware reviews. This product is still a little hard to find; froogle for the following part numbers:

Desktop Bundle - SNV125-S2BD/40GB
Bare drive - SNV125-S2/40GB

Currently you can find the bare drive for under $100. This is bound to give you better performance and guaranteed compatibility compared to adapters and compact flash.

The problem with adapters is that, although the price is great, compatibility and build quality are all over the map and YMMV considerably. You would not be happy if you saved $20 on the adapter/flash combo and ended up with nightmare reliability.

The great thing about 2.5" SSDs is that mounting is simply a question of duct tape or velcro! [well ... almost ... but you can velcro them onto the sidewall of your case] So you can use all your available 3.5" drive bays for ZFS disks.

Regards,
--
Al Hopper  Logical Approach Inc, Plano, TX  al at logical-approach.com
           Voice: 972.379.2133  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Rather than hacking something like that, he could use a Disk on Module (http://en.wikipedia.org/wiki/Disk_on_module) or something like http://www.tomshardware.com/news/nanoSSD-Drive-Elecom-Japan-SATA,8538.html (which I suspect may be a DOM, but I've not poked around sufficiently to see).

Paul
> For the OS, I'd drop the adapter/compact-flash combo and use the "stripped down" Kingston version of the Intel X25-M MLC SSD.

I was able to find some of the 64 GB SNV125-S2 drives for a decent price. Do these also work well for L2ARC?

This brings up more questions, actually. I know it's not recommended to use partitions for ZFS, but does this still apply for SSDs and the root pool?

I was thinking about maybe using half of the SSD for the root pool and putting the ZIL on the other half. Or would I just be better off leaving the ZIL on the raidz drives?
Thomas Burgess wrote:
> I was able to find some of the 64 GB SNV125-S2 drives for a decent price. Do these also work well for L2ARC?
>
> I know it's not recommended to use partitions for ZFS, but does this still apply for SSDs and the root pool?
>
> I was thinking about maybe using half of the SSD for the root pool and putting the ZIL on the other half. Or would I just be better off leaving the ZIL on the raidz drives?

It's OK to use partitions on SSDs, so long as you realize that using an SSD for multiple purposes splits the SSD's bandwidth across those uses. In your case, using an SSD as both an L2ARC and a root pool device is reasonable, as the rpool traffic should not be heavy.

I would NOT recommend using an X25-M, or especially the SNV125-S2, as a ZIL device. Write performance isn't going to be very good at all; in fact, I think it would not be much different from using the bare drives. As an L2ARC cache device, however, it's a good choice.

Oh, and there are plenty of bay adapters out there for cheap -- use one. My favorite is a two-SSDs-in-one-bay model like this:
http://www.startech.com/item/HSB220SAT25B-35-Tray-Less-Dual-25-SATA-HD-Hot-Swap-Bay.aspx
(I see them for under $40 at local stores.)

20 GB for an rpool is sufficient, so the rest can go to L2ARC. I would disable any swap volume on the SSDs, however. If you need swap, put it somewhere else.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
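A rough sketch of that split, assuming the SSD shows up as a hypothetical c5t0d0 and has been sliced with format(1M) into s0 (about 20 GB, where the installer puts rpool) and s1 (the remainder), with the data pool named tank and the default rpool/swap volume:

    # Give the leftover slice to the data pool as an L2ARC cache device:
    zpool add tank cache c5t0d0s1
    zpool status tank              # the slice appears under a "cache" section

    # Drop the swap zvol so paging never competes for the SSD; re-create
    # swap somewhere else if it is really needed:
    swap -d /dev/zvol/dsk/rpool/swap
    zfs destroy rpool/swap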
raidz2 is recommended. As disks get large, it can take a long time to repair a raidz vdev, maybe several days. With raidz1, if another disk blows during the repair, you are screwed.