I have a shiny new big RAID array: 16x500GB SATA 300+NCQ drives
connected to the host via 4Gb fibre channel. This gives me 6.5Tb of
raw disk.

I've come up with three possibilities for organizing this disk. My
needs are really for a single 1Tb file system on which I will run
postgres. However, I'm not sure what I'll really need in the future.
I don't plan to ever connect any other servers to this RAID unit.

The three choices I've come up with so far are:

1) Make one RAID volume of 6.5Tb (in a RAID6 + hot spare
configuration), and make one FreeBSD file system on the whole
partition.

2) Make one RAID volume of 6.5Tb (in a RAID6 + hot spare
configuration), and make 6 FreeBSD partitions with one file system
each.

3) Make 6 RAID volumes and expose them to FreeBSD as multiple drives,
then make one partition + file system on each "disk". Each RAID volume
would span all 16 drives, and I could make the volumes of differing
RAID levels if needed, but I'd probably stick with RAID6 + spare. (A
rough sketch of what this looks like from the FreeBSD side is at the
end of this message.)

I'm not keen on option 1 because of the potentially long fsck times
after a crash.

What advantage/disadvantage would I have between 2 and 3? The only
thing I can come up with is that the disk scheduling algorithm in
FreeBSD might not be optimal if the drives are not truly independent,
since they are really backed by the same 16 drives, so option 2 might
be better. However, with option 3, if I ever do end up connecting
another host to the array, I can assign some of the volumes to the
other host(s).

My goal is speed, speed, speed. I'm running FreeBSD 6.2/amd64 and
using an LSI fibre card.

Thanks for any opinions and recommendations.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Vivek Khera, Ph.D.
Khera Communications, Inc.
Internet: khera@kciLink.com
Rockville, MD  +1-301-869-4449 x806
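The sketch mentioned in option 3: assuming the six exported volumes
show up as da1 through da6 (the real device names will differ), each a
bit over 1Tb, the FreeBSD side would just be something like:

  bsdlabel -w da1           # standard label; "a" spans the volume
  newfs -U /dev/da1a        # UFS2 with soft updates
  mkdir -p /vol1
  mount /dev/da1a /vol1
  # ...and the same for da2 through da6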
> I have a shiny new big RAID array.  16x500GB SATA 300+NCQ drives
> connected to the host via 4Gb fibre channel.  This gives me 6.5Tb of
> raw disk.
>
> [...]
>
> I'm not keen on option 1 because of the potentially long fsck times
> after a crash.

If you want to avoid the long fsck times, your remaining options are a
journaling filesystem or ZFS; either requires an upgrade from FreeBSD
6.2. I have used ZFS and had a server stop due to a power outage in
our area; our ZFS Samba server came up fine with no data corruption.
So I will suggest FreeBSD 7.0 with ZFS. Short fsck times and UFS2
don't go well together. I know there is background fsck, but for me
that is not an option. (A rough sketch of such a pool is below.)

--
regards
Claus

When lenity and cruelty play for a kingdom, the gentlest gamester is
the soonest winner.
                                                         Shakespeare
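The sketch mentioned above: if the enclosure can export the sixteen
drives individually (JBOD), a raidz2 pool with one hot spare would be
created roughly like this; the pool and dataset names and the da
numbers are only placeholders:

  # FreeBSD 7.0; da0..da15 are assumed to be the sixteen array drives
  zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6 da7 \
                           da8 da9 da10 da11 da12 da13 da14 spare da15
  zfs create tank/pgdata
  zfs set recordsize=8k tank/pgdata   # match the 8k postgres page size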
On Fri, 17 Aug 2007 17:42:55 -0400 Vivek Khera wrote:

> I have a shiny new big RAID array.  16x500GB SATA 300+NCQ drives
> connected to the host via 4Gb fibre channel.  This gives me 6.5Tb of
> raw disk.
>
> [...]
>
> My goal is speed, speed, speed.

Seems that RAID[56] may be too sloooow. I'd suggest RAID10. I have 6
SATA-II 300MB/s disks on a 3WARE adapter. My (very!) simple tests gave
about 170MB/s for dd. BTW, I tested (OK, very fast) RAID5, RAID6,
gmirror+gstripe and none got close to RAID10. (Well, as expected, I
suppose.)

> I'm running FreeBSD 6.2/amd64 and using an LSI fibre card.

If you have time you may do your own tests (a sample dd run is at the
end of this message)... And in the case of RAID0 you shouldn't have
problems with a long fsck. Leave a couple of your disks for
hot-swapping and you'll get 7Tb. ;-)

> Thanks for any opinions and recommendations.

WBR
--
bsam
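The dd run mentioned above is nothing fancier than something like the
following; the path and size are only examples, and the file should be
bigger than RAM so the read-back isn't served from cache:

  dd if=/dev/zero of=/data/testfile bs=1m count=8192   # ~8GB sequential write
  dd if=/data/testfile of=/dev/null bs=1m              # sequential read back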
Vivek Khera wrote:

> I'm not keen on option 1 because of the potentially long fsck times
> after a crash.

Depending on your allowable downtime after a crash, fscking even a
1 TB UFS file system can take a long time. For large file systems
there's really no alternative to using -CURRENT / 7.0, and either
gjournal or ZFS.

When you get there, you'll need to create one small RAID volume
(<= 1 GB) from which to boot (and probably use it for root) and use
the rest for whatever your choice is (doesn't really matter at this
point). This is because you can't have fdisk or bsdlabel partitions
larger than 2 TB and you can't boot from GPT.
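If you go the gjournal route rather than ZFS, the setup on a big data
volume is roughly this (the device name is only an example):

  kldload geom_journal                    # or: gjournal load
  gjournal label da1                      # creates /dev/da1.journal
  newfs -J /dev/da1.journal               # UFS2 with journaling enabled
  mount -o async /dev/da1.journal /data   # async is fine; the journal keeps it consistent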
On Aug 17, 2007, at 6:26 PM, Boris Samorodov wrote:

> I have 6 SATA-II 300MB/s disks on a 3WARE adapter. My (very!) simple
> tests gave about 170MB/s for dd. BTW, I tested (OK, very fast) RAID5,
> RAID6, gmirror+gstripe and none got close to RAID10. (Well, as
> expected, I suppose.)

Whichever RAID level I choose, I still need to decide how to split the
6.5Tb into smaller hunks. In any case, my testing with RAID10, RAID5,
and RAID6 showed marginal differences with my workload.