Well, off the top of my head:

2 x Storage Heads, 4 x 10G, 256G RAM, 2 x Intel E5 CPUs
8 x 60-Bay JBODs with 60 x 4TB SAS drives
RAIDZ2 stripe over the 8 x JBODs

That should fit within 1 rack comfortably and provide 1 PB of storage.

Regards,
Kristoffer Sheather
Cloud Central
Scale Your Data Center In The Cloud
Phone: 1300 144 007 | Mobile: +61 414 573 130 | Email: kris@cloudcentral.com.au
LinkedIn: | Skype: kristoffer.sheather | Twitter: http://twitter.com/kristofferjon

----------------------------------------
From: "Marion Hakanson" <hakansom@ohsu.edu>
Sent: Saturday, March 16, 2013 12:12 PM
To: zfs@lists.illumos.org
Subject: [zfs] Petabyte pool?

Greetings,

Has anyone out there built a 1-petabyte pool? I've been asked to look into
this, and was told "low performance" is fine; the workload is likely to be
write-once, read-occasionally, archive storage of gene sequencing data.
A single 10Gbit NIC is probably sufficient for connectivity.

We've had decent success with the 45-slot, 4U SuperMicro SAS disk chassis,
using 4TB "nearline SAS" drives, giving over 100TB usable space (raidz3).
Back-of-the-envelope suggests stacking up eight to ten of those, depending
on whether you want a "raw marketing petabyte" or a proper "power-of-two
usable petabyte".

I get a little nervous at the thought of hooking all that up to a single
server, and am a little vague on how much RAM would be advisable, other
than "as much as will fit" (:-). Then again, I've been waiting for
something like pNFS/NFSv4.1 to be usable for gluing together multiple NFS
servers into a single global namespace, without any sign of that happening
anytime soon.

So, has anyone done this? Or come close to it? Thoughts, even if you
haven't done it yourself?

Thanks and regards,

Marion
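As a quick sanity check on those numbers, here is a minimal sketch, assuming
one plausible reading of "RAIDZ2 stripe over the 8 x JBODs" (60 raidz2 vdevs,
each 8 drives wide with one drive drawn from each JBOD; the layout is an
assumption, since the one-line spec could be read several ways):

    # Capacity sanity check for the proposed build. The vdev layout is an
    # assumed interpretation: 60 raidz2 vdevs, each 8 drives wide with one
    # drive per JBOD, so the pool survives the loss of any two whole JBODs.

    TB = 1000**4   # marketing terabyte, in bytes
    PIB = 1024**5  # pebibyte, in bytes

    jbods = 8
    drives_per_jbod = 60
    drive_bytes = 4 * TB      # 4TB nearline SAS
    vdev_width = jbods        # one drive per JBOD
    parity = 2                # raidz2
    vdevs = drives_per_jbod   # 60 vdevs of 8 drives each

    raw = jbods * drives_per_jbod * drive_bytes
    usable = vdevs * (vdev_width - parity) * drive_bytes  # before ZFS overhead

    print(f"raw:    {raw / TB:.0f} TB = {raw / PIB:.2f} PiB")
    print(f"usable: {usable / TB:.0f} TB = {usable / PIB:.2f} PiB")
    # -> raw: 1920 TB = 1.71 PiB; usable: 1440 TB = 1.28 PiB

So even by the "power-of-two usable petabyte" definition, this layout clears
1 PiB with some margin left for metadata and slop.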
On Sat, 16 Mar 2013, Kristoffer Sheather @ CloudCentral wrote:
> Well, off the top of my head:
>
> 2 x Storage Heads, 4 x 10G, 256G RAM, 2 x Intel E5 CPUs
> 8 x 60-Bay JBODs with 60 x 4TB SAS drives
> RAIDZ2 stripe over the 8 x JBODs
>
> That should fit within 1 rack comfortably and provide 1 PB of storage.

What does one do for power? What are the power requirements when the
system is first powered on? Can drive spin-up be staggered between JBOD
chassis? Does the server need to be powered up last so that it does not
time out on the zfs import?

Bob
--
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On 2013-03-16 15:20, Bob Friesenhahn wrote:
> On Sat, 16 Mar 2013, Kristoffer Sheather @ CloudCentral wrote:
>
>> Well, off the top of my head:
>>
>> 2 x Storage Heads, 4 x 10G, 256G RAM, 2 x Intel E5 CPUs
>> 8 x 60-Bay JBODs with 60 x 4TB SAS drives
>> RAIDZ2 stripe over the 8 x JBODs
>>
>> That should fit within 1 rack comfortably and provide 1 PB of storage.
>
> What does one do for power? What are the power requirements when the
> system is first powered on? Can drive spin-up be staggered between JBOD
> chassis? Does the server need to be powered up last so that it does not
> time out on the zfs import?

I guess you can use managed PDUs like those from APC (many models for
varied socket types and counts); they can be scripted at an advanced
level, and at a basic level I think delays can simply be configured
per-socket to stagger the startup once power arrives from the wall
(UPS), regardless of what the boxes' individual power supplies can do.
Conveniently, they also let you do a remote hard-reset of hung boxes
without walking to the server room ;)

My 2c,
//Jim Klimov
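To illustrate what such per-socket delays buy you, here is a minimal sketch
of a staggered power-on schedule (the outlet names, 15-second stagger, and
head margin are illustrative assumptions, not any vendor's configuration
syntax):

    # Hypothetical staggered power-on schedule: JBODs come up one at a time
    # so their spin-up surges don't overlap, and the storage heads come up
    # last so every disk is present before the pool import is attempted.

    STAGGER_S = 15      # assumed gap between JBOD power-ons
    HEAD_MARGIN_S = 60  # extra settle time before the heads start

    jbod_outlets = [f"jbod{i}" for i in range(1, 9)]
    head_outlets = ["head1", "head2"]

    schedule = {name: i * STAGGER_S for i, name in enumerate(jbod_outlets)}
    heads_on = max(schedule.values()) + STAGGER_S + HEAD_MARGIN_S
    schedule.update({name: heads_on for name in head_outlets})

    for name, delay in sorted(schedule.items(), key=lambda kv: kv[1]):
        print(f"{name}: power on at T+{delay:3d}s")
    # jbod1..jbod8 at T+0 through T+105s; both heads at T+180s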
On 2013-03-16 15:20, Bob Friesenhahn wrote:
> On Sat, 16 Mar 2013, Kristoffer Sheather @ CloudCentral wrote:
>
>> Well, off the top of my head:
>>
>> 2 x Storage Heads, 4 x 10G, 256G RAM, 2 x Intel E5 CPUs
>> 8 x 60-Bay JBODs with 60 x 4TB SAS drives
>> RAIDZ2 stripe over the 8 x JBODs
>>
>> That should fit within 1 rack comfortably and provide 1 PB of storage.
>
> What does one do for power? What are the power requirements when the
> system is first powered on? Can drive spin-up be staggered between JBOD
> chassis? Does the server need to be powered up last so that it does not
> time out on the zfs import?

Giving this question a second thought, I think the JBODs should spin up
quickly (i.e. as soon as power is applied), while the server head(s)
take time to pass POST and initialize their HBAs and other hardware.
Booting 8 JBODs, one every 15 seconds to let a typical spin-up power
draw complete, would take a couple of minutes. It is likely that a
server booted along with the first JBOD won't get to importing the pool
that quickly ;)

Anyhow, with such a system, attention should be given to redundant
power and cooling, including redundant UPSes, preferably fed from
different power lines coming into the room.

This does not seem like a fantastic power sucker, however: 480 drives
at 15W would consume 7200W; add a bit for the processor/RAM heads
(perhaps a kW?) and this would still fit into 8-10kW, so a couple of
15kVA UPSes (or more smaller ones) should suffice, including
redundancy. This might overall exceed a rack in size, though; but for
power/cooling this seems like a standard figure for a 42U rack, or
just a bit more.

//Jim
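The arithmetic behind that estimate, as a small sketch (the 15W per drive
and "perhaps a kW" for the heads are Jim's figures above; the spin-up
multiplier is a rough rule-of-thumb assumption):

    # Rough power budget for 8 x 60-drive JBODs plus two storage heads,
    # using the per-device figures from the message above.

    drives = 8 * 60          # 480 drives
    drive_w = 15             # steady-state watts per nearline SAS drive
    heads_w = 1000           # both heads together, the "perhaps a kW" guess

    steady_w = drives * drive_w + heads_w
    print(f"steady state: ~{steady_w / 1000:.1f} kW")  # ~8.2 kW, within 8-10kW

    # Why staggered spin-up matters: a 3.5" drive can briefly draw roughly
    # twice its steady-state power while the spindle comes up to speed.
    SPINUP_FACTOR = 2.0      # rough rule-of-thumb assumption
    surge_w = drives * drive_w * SPINUP_FACTOR
    print(f"all 480 spinning up at once: ~{surge_w / 1000:.1f} kW")  # ~14.4 kW

The gap between those two figures is the case for staggering: the steady
state fits the UPS budget, while a simultaneous cold start may not.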
On Sat, Mar 16, 2013 at 2:27 PM, Jim Klimov <jimklimov@cos.ru> wrote:
> On 2013-03-16 15:20, Bob Friesenhahn wrote:
>> On Sat, 16 Mar 2013, Kristoffer Sheather @ CloudCentral wrote:
>>
>>> Well, off the top of my head:
>>>
>>> 2 x Storage Heads, 4 x 10G, 256G RAM, 2 x Intel E5 CPUs
>>> 8 x 60-Bay JBODs with 60 x 4TB SAS drives
>>> RAIDZ2 stripe over the 8 x JBODs
>>>
>>> That should fit within 1 rack comfortably and provide 1 PB of storage.
>>
>> What does one do for power? What are the power requirements when the
>> system is first powered on? Can drive spin-up be staggered between JBOD
>> chassis? Does the server need to be powered up last so that it does not
>> time out on the zfs import?
>
> I guess you can use managed PDUs like those from APC (many models for
> varied socket types and counts); they can be scripted at an advanced
> level, and at a basic level I think delays can simply be configured
> per-socket to stagger the startup once power arrives from the wall
> (UPS), regardless of what the boxes' individual power supplies can do.
> Conveniently, they also let you do a remote hard-reset of hung boxes
> without walking to the server room ;)
>
> My 2c,
> //Jim Klimov

Any modern JBOD should have the intelligence built in to stagger drive
spin-up; I wouldn't spend money on one that didn't. There's really no
need to stagger the JBOD power-up at the PDU. As for the head, yes, it
should have a delayed power-on, which you can typically set in the BIOS.

--Tim