Hi All,

This is a bit off-topic... but since the Thumper is the poster child for ZFS, I hope it's not too off-topic.

What are the actual origins of the Thumper? I've heard varying stories in word and print. It appears that the Thumper was the original server Bechtolsheim designed at Kealia as a massive video server. However, when we were first told about it a year ago through Sun contacts, "Thumper" was described as part of a scalable iSCSI storage system, where Thumpers would be connected to a head (which looked a lot like a pair of X4200s) via iSCSI, and that head would then present the storage over iSCSI and NFS. Recently, other sources mentioned they were told about the same time that Thumper was part of the Honeycomb project.

So I was curious if anyone had any insights into the history/origins of the Thumper... or just wanted to throw more rumors on the fire. ;-)

Thanks in advance for your indulgence.

Best Regards,
Jason
Jason J. W. Williams wrote:
> So I was curious if anyone had any insights into the history/origins
> of the Thumper... or just wanted to throw more rumors on the fire. ;-)

Thumper was created to hold the entire electronic transcript of the Bill Clinton impeachment proceedings...
Neal Pollack wrote:
> Thumper was created to hold the entire electronic transcript of the
> Bill Clinton impeachment proceedings...

Actually, it was meant to hold the entire electronic transcript of the George Bush impeachment proceedings... we were thinking ahead.

Yes, that's a joke for those playing at home. Please take it with the humor intended. :-P
Jason J. W. Williams wrote:
> However, when we were first told about it a year ago through Sun contacts,
> "Thumper" was described as part of a scalable iSCSI storage system, where
> Thumpers would be connected to a head (which looked a lot like a pair of
> X4200s) via iSCSI, and that head would then present the storage over
> iSCSI and NFS. Recently, other sources mentioned they were told about the
> same time that Thumper was part of the Honeycomb project.

That sounds more like the StorageTek 5320 systems.

> So I was curious if anyone had any insights into the history/origins
> of the Thumper... or just wanted to throw more rumors on the fire. ;-)

Personally, I see it as a modern version of the SSA-100 (also designed by Andy).
 -- richard
> This is a bit off-topic... but since the Thumper is the poster child
> for ZFS, I hope it's not too off-topic.
>
> What are the actual origins of the Thumper? I've heard varying stories
> in word and print. It appears that the Thumper was the original server
> Bechtolsheim designed at Kealia as a massive video server.

That's correct -- it was originally called the StreamStor. Speaking personally, I first learned about it in the meeting with Andy that I described here:

  http://blogs.sun.com/bmc/entry/man_myth_legend

I think it might be true that this was the first that anyone in Solaris had heard of it. Certainly, it was the first time that Andy had ever heard of ZFS. It was a very high bandwidth conversation, at any rate. ;)

After the meeting, I returned post-haste to Menlo Park, where I excitedly described the box to Jeff Bonwick, Bill Moore and Bart Smaalders. Bill said something like "I gotta see this thing," and sometime later (perhaps the next week?) Bill, Bart and I went down to visit Andy. Andy gave us a much more detailed tour, with Bill asking all sorts of technical questions about the hardware (many of which were something like "how did you get a supplier to build that for you?!"). After the tour, Andy took the three of us to lunch, and it was one of those moments that I won't forget: Bart, Bill, Andy and I sitting in the late afternoon Palo Alto sun, with us very excited about his hardware, and Andy very excited about our software. Everyone realized that these two projects -- born independently -- were made for each other, that together they would change the market. It was one of those rare moments that reminds you why you got into this line of work -- and I feel lucky to have shared in it.

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
Wow. That's an incredibly cool story. Thank you for sharing it! Does the Thumper today pretty much resemble what you saw then?

Best Regards,
Jason

On 1/23/07, Bryan Cantrill <bmc at eng.sun.com> wrote:
> That's correct -- it was originally called the StreamStor. Speaking
> personally, I first learned about it in the meeting with Andy that I
> described here:
>
>   http://blogs.sun.com/bmc/entry/man_myth_legend
On Wed, Jan 24, 2007 at 12:15:21AM -0700, Jason J. W. Williams wrote:
> Wow. That's an incredibly cool story. Thank you for sharing it! Does
> the Thumper today pretty much resemble what you saw then?

Yes, amazingly so: 4-way, 48 spindles, 4u. The real beauty of the match between ZFS and Thumper was (and is) that ZFS unlocks new economics in storage -- smart software achieving high performance and ultra-high reliability with dense, cheap hardware -- and that Thumper was (and is) the physical embodiment of those economics. And without giving away too much of our future roadmap, suffice it to say that one should expect much, much more from Sun in this vein: innovative software and innovative hardware working together to deliver world-beating systems with undeniable economics.

And actually, as long as we're talking history, you might be interested to know the story behind the name "Thumper": Fowler initially suggested the name as something of a joke, but, as often happens with Fowler, he tells a joke with a straight face once too many to one person too many, and next thing you know it's the plan of record. I had suggested the name "Humper" for the server that became Andromeda (the x8000 series) -- so you could order a datacenter by asking for (say) "two Humpers and five Thumpers." (And I loved the idea of asking "would you like a Humper for your Thumper?") But Fowler said the name was too risque (!). Fortunately the name "Thumper" stuck...

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
> Actually, it was meant to hold the entire electronic transcript of the
> George Bush impeachment proceedings ... we were thinking ahead.

Fortunately, larger disks became available in time.

Casper
On 24/1/07 9:06, "Bryan Cantrill" <bmc at eng.sun.com> wrote:
> But Fowler said the name was too risque (!). Fortunately the name
> "Thumper" stuck...

I assumed it was a reference to Bambi... That's what comes from having small children :-)

Cheers,

Chris
Chris,

Well, "Thumper" is actually a reference to Bambi. The comment about being risque was referring to "Humper" as a codename proposed for a related server (and e.g. leo.org confirms that it has a meaning labelled as "[vulg.]" :-)

-- Roland

Chris Ridd schrieb:
> I assumed it was a reference to Bambi... That's what comes from having
> small children :-)
> too much of our future roadmap, suffice it to say that one should expect
> much, much more from Sun in this vein: innovative software and innovative
> hardware working together to deliver world-beating systems with undeniable
> economics.

Yes please. Now give me a fairly cheap (but still quality) FC-attached JBOD utilizing SATA/SAS disks and I'll be really happy! :-)
I think this will be a hard sell internally, given that it would eat into their own StorageTek line.
On Jan 24, 2007, at 09:25, Peter Eriksson wrote:
> Yes please. Now give me a fairly cheap (but still quality) FC-attached
> JBOD utilizing SATA/SAS disks and I'll be really happy! :-)

Could you outline why FC-attached rather than network-attached (iSCSI, say) makes more sense to you? It would help me illustrate the demand I'm hearing for an FC target rather than just a network target.

.je
Peter Eriksson wrote:
> Yes please. Now give me a fairly cheap (but still quality) FC-attached
> JBOD utilizing SATA/SAS disks and I'll be really happy! :-)

... with write cache and dual redundant controllers? I think we call that the Sun StorageTek 3511.
 -- richard
> well, "Thumper" is actually a reference to BambiYou''d have to ask Fowler, but certainly when he coined it, "Bambi" was the last thing on anyone''s mind. I believe Fowler''s intention was "one that thumps" (or, in the unique parlance of a certain Commander-in-Chief, "one that gives a thumpin''"). - Bryan -------------------------------------------------------------------------- Bryan Cantrill, Solaris Kernel Development. http://blogs.sun.com/bmc
Well, he did say fairly cheap. The ST 3511 is about $18.5k. That's about the same price as the low-end NetApp FAS250 unit.

-Moazam

On Jan 24, 2007, at 9:40 AM, Richard Elling wrote:
> ... with write cache and dual redundant controllers? I think we call that
> the Sun StorageTek 3511.
Sean McGrath - Sun Microsystems Ireland (2007-Jan-24 17:54 UTC) wrote:

Bryan Cantrill stated:
< > well, "Thumper" is actually a reference to Bambi

I keep thinking of the classic AC/DC song when Fowler and thumpers are mentioned... s/thunder/thumper/

--
Sean.
On Wed, 24 Jan 2007, Jonathan Edwards wrote:
> Could you outline why FC-attached rather than network-attached (iSCSI, say)
> makes more sense to you?

Dunno about FC or iSCSI, but what I'd really like to see is a 1U direct attach 8-drive SAS JBOD, as described (back in May 2006!) here:

  http://richteer.blogspot.com/2006/05/sun-storage-product-i-would-like-to.html

Modulo the UltraSCSI 320 stuff perhaps. Given that other vendors have released something similar, and how strong Sun's entry-level server offerings are, I can't believe that Sun hasn't announced something like this, to bring their entry-level storage offerings up to the bar set by their servers...

--
Rich Teer, SCSA, SCNA, SCSECA, OpenSolaris CAB member
President, Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
On Wed, Jan 24, 2007 at 09:46:11AM -0800, Moazam Raja wrote:
> Well, he did say fairly cheap. The ST 3511 is about $18.5k. That's
> about the same price as the low-end NetApp FAS250 unit.

Note that the 3511 is being replaced with the 6140:

  http://www.sun.com/storagetek/disk_systems/midrange/6140/

Also, don't read too much into the prices you see on the website -- that's the list price, and doesn't reflect any discounting. If you're interested in what it _actually_ costs, you should talk to a Sun rep or one of our channel partners to get a quote. (And lest anyone attack the messenger: I'm not defending this system of getting an accurate price, I'm just describing it.)

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
On Jan 24, 2007, at 12:41, Bryan Cantrill wrote:
> You'd have to ask Fowler, but certainly when he coined it, "Bambi" was the
> last thing on anyone's mind. I believe Fowler's intention was "one that
> thumps" (or, in the unique parlance of a certain Commander-in-Chief,
> "one that gives a thumpin'").

You can take your pick of things that thump here:

  http://en.wikipedia.org/wiki/Thumper

Given the other name is the X4500... it does seem like it should be a weapon.

.je
On 1/24/07, Jonathan Edwards <Jonathan.Edwards at sun.com> wrote:
> Could you outline why FC-attached rather than network-attached (iSCSI, say)
> makes more sense to you?

I'm not generally for FC-attached storage, but we've documented here many times how the round trip latency with iSCSI hasn't been the perfect match with ZFS and NFS (think NAS). You need either IB or FC right now to make that workable. Some day though... either with nvram-backed NFS or cheap 10Gig-E...
> You can take your pick of things that thump here:
> http://en.wikipedia.org/wiki/Thumper

I think it's safe to say that Fowler was thinking more along the lines of whoever dubbed the M79 grenade launcher -- which you can safely bet was not named after a fictional bunny...

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
On Wed, 24 Jan 2007, Sean McGrath - Sun Microsystems Ireland wrote:
> I keep thinking of the classic AC/DC song when Fowler and thumpers are
> mentioned... s/thunder/thumper/

Yeah, AC/DC songs seem to be most apropos for Sun at the moment:

 * Thumperstruck (the subject of this thread)
 * For Those About to Rock (the successor to the US-IV)
 * Back in Black (Sun's return to profitability, as announced yesterday)

Although Queen is almost as good:

 * We Will Rock You
 * We Are the Champions

And what do M$ users have? Courtesy of the Rolling Stones:

 * (I Can't Get No) Satisfaction
 * 19th Nervous Breakdown

:-)

--
Rich Teer, SCSA, SCNA, SCSECA, OpenSolaris CAB member
President, Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
On January 24, 2007 9:40:41 AM -0800 Richard Elling <Richard.Elling at Sun.COM> wrote:
> ... with write cache and dual redundant controllers? I think we call that
> the Sun StorageTek 3511.

Ah, but the 3511 JBOD is not supported for direct attach to a host, nor is it supported for attachment to a SAN. You have to have a 3510 or 3511 with RAID controller to use the 3511 JBOD. The RAID controller is pretty pricey on these guys. $5k each IIRC.

On January 24, 2007 10:04:04 AM -0800 Bryan Cantrill <bmc at eng.sun.com> wrote:
> Note that the 3511 is being replaced with the 6140:

Which is MUCH nicer but also much pricier. Also, no non-RAID option.

You can get a 4Gb FC->SATA RAID with 12*750gb drives for about $10k from third parties. I doubt we'll ever see that from Sun if for no other reason than the drive markups. (Which might be justified based on drive qualification; I'm not making any comment as to whether the markup is warranted or not, just that it exists and is obscene.)

But you still can't beat Thumper overall. I believe S10U3 has iSCSI target support? If so, there you go. Not on the low end in absolute $$$, but certainly in $/GB per bits/sec. Probably better on power too, compared to equivalent solutions.

-frank
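On the iSCSI target point, a minimal sketch of how a Thumper could export a chunk of a pool as an iSCSI LUN -- the pool and volume names here are made up, and the shareiscsi property may not be in every build (if it isn't, iscsitadm can be pointed at the zvol by hand):

  # create a 500 GB ZFS volume and export it as an iSCSI target
  zfs create -V 500g tank/iscsivol
  zfs set shareiscsi=on tank/iscsivol

  # confirm the target exists
  iscsitadm list target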
On Wed, 24 Jan 2007, Bryan Cantrill wrote:
> I think it's safe to say that Fowler was thinking more along the lines

Presumably, that's John Fowler?

--
Rich Teer, SCSA, SCNA, SCSECA, OpenSolaris CAB member
President, Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
On January 24, 2007 10:02:52 AM -0800 Rich Teer <rich.teer at rite-group.com> wrote:
> Dunno about FC or iSCSI, but what I'd really like to see is a 1U direct
> attach 8-drive SAS JBOD, as described (back in May 2006!) here:
>
>   http://richteer.blogspot.com/2006/05/sun-storage-product-i-would-like-to.html

The problem with that is the 2.5" drives are too expensive and too small.

-frank
On 24 Jan 2007, at 13:04, Bryan Cantrill wrote:
> Also, don't read too much into the prices you see on the website -- that's
> the list price, and doesn't reflect any discounting. If you're interested
> in what it _actually_ costs, you should talk to a Sun rep or one of our
> channel partners to get a quote.

If your company can qualify as a start-up (4 years old or less, with fewer than 150 employees) you may want to look at the Sun Startup Essentials program. It provides Sun hardware at big discounts for startups.

  http://www.sun.com/emrkt/startupessentials/

For an idea on the levels of discounts, see

  http://kalsey.com/2006/11/sun_startup_essentials_pricing/

-Angelo
'Cept the 3511 is highway robbery for what you get. ;-)

Best Regards,
Jason

On 1/24/07, Richard Elling <Richard.Elling at sun.com> wrote:
> ... with write cache and dual redundant controllers? I think we call that
> the Sun StorageTek 3511.
On Wed, Jan 24, 2007 at 01:25:26PM -0500, Angelo Rajadurai wrote:
> If your company can qualify as a start-up (4 years old or less, with fewer
> than 150 employees) you may want to look at the Sun Startup Essentials
> program. It provides Sun hardware at big discounts for startups.
>
>   http://www.sun.com/emrkt/startupessentials/
>
> For an idea on the levels of discounts, see
>   http://kalsey.com/2006/11/sun_startup_essentials_pricing/

In addition, here are Sun's promotions for educational institutions:

  http://www.sun.com/products-n-solutions/edu/promotions/hardware.html

Ed Plese
Frank Cusack wrote:
> Ah, but the 3511 JBOD is not supported for direct attach to a host, nor is
> it supported for attachment to a SAN. You have to have a 3510 or 3511 with
> RAID controller to use the 3511 JBOD. The RAID controller is pretty pricey
> on these guys. $5k each IIRC.

I started looking into the 3511 for a ZFS system and just about immediately stopped considering it for this reason. If it is not supported in JBOD, then I might as well go get a third-party JBOD at the same level of support.

> You can get a 4Gb FC->SATA RAID with 12*750gb drives for about $10k
> from third parties. I doubt we'll ever see that from Sun if for no
> other reason than the drive markups.

Yep. I went with a third-party FC/SATA unit which has been flawless as a direct attach for my ZFS JBOD system. Paid about $0.70/GB. And I still have enough money left over this year to upgrade my network core. If I had gone with Sun, I wouldn't be able to push as many bits across my network.

I just don't know how people can afford Sun storage, or even if they can, what drives them to pay such premiums. Sun is missing out on lots of lower-end storage, but perhaps that is by design. I am a small shop by many standards, but I would have spent tens of thousands over the last few years with Sun if they had reasonably priced storage. <shrug> I just need a place to put my bits. It doesn't need to be the fastest, bleeding edge stuff. Just a bucket that performs reasonably, and preferably one that I can use with ZFS.

-Shannon
On Wed, 24 Jan 2007, Shannon Roddy wrote:
> Sun is missing out on lots of lower-end storage, but perhaps that is by
> design. I am a small shop by many standards, but I would have spent
> tens of thousands over the last few years with Sun if they had
> reasonably priced storage. <shrug> I just need a place to put my bits.
> It doesn't need to be the fastest, bleeding edge stuff. Just a bucket
> that performs reasonably, and preferably one that I can use with ZFS.

+1

--
Rich Teer, SCSA, SCNA, SCSECA, OpenSolaris CAB member
President, Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
Bryan Cantrill wrote:
> You'd have to ask Fowler, but certainly when he coined it, "Bambi" was the
> last thing on anyone's mind. I believe Fowler's intention was "one that
> thumps" (or, in the unique parlance of a certain Commander-in-Chief,
> "one that gives a thumpin'").

Me, I always thought of calling sandworms.

Sandworms use up a lot of space, you see...
> Me, I always thought of calling sandworms.
>
> Sandworms use up a lot of space, you see...

And bring in a lot of cash.... (IIRC, the worms caused the spice and the spice was mined.)

It was my association too.

Casper
Casper.Dik at Sun.COM wrote:
> And bring in a lot of cash.... (IIRC, the worms caused the spice and the
> spice was mined.)
>
> It was my association too.

... and if you imagine 48 head positioner arms moving at once... one can imagine the vibration would travel through sand, is all. Just means it's a good name, I suppose!
> #1 is speed. You can aggregate 4x1Gbit ethernet and still not touch 4Gb/sec FC.
> #2 drop-in compatibility. I'm sure people would love to drop this into an existing SAN.

#2 is the key for me. And I also have a #3: FC has been around a long time now. The HBAs and switches are (more or less :-) debugged and we know how things work...

iSCSI - well, perhaps. But to me that feels like it gets "too far away" from the hardware. I'd like to keep the "distance" between the disks and ZFS as short as possible. Ie:

  ZFS -> HBA -> FC Switch -> JBOD -> "Simple" FC-SATA converter -> SATA disk
On Jan 24, 2007, at 12:37 PM, Shannon Roddy wrote:
> I went with a third-party FC/SATA unit which has been flawless as
> a direct attach for my ZFS JBOD system. Paid about $0.70/GB.

What did you use, if you don't mind my asking?

--
Ben
Hello Peter,

Wednesday, January 24, 2007, 10:24:22 PM, you wrote:

PE> Ie: ZFS -> HBA -> FC Switch -> JBOD -> "Simple" FC-SATA converter -> SATA disk

Why bother with the switch here?

--
Best regards,
 Robert                       mailto:rmilkowski at task.gda.pl
                              http://milek.blogspot.com
Ben Gollmer wrote:
> What did you use, if you don't mind my asking?

Arena Janus 6641. Turns out I underestimated what I paid per GB. I went back and dug up the invoice and I paid just under $1/GB. My memory was a little off on the 750 GB drive prices.

I used an LSI Logic FC card that was listed on the "Solaris Ready" page, and I am using the LSI Logic driver.

  http://www.sun.com/io_technologies/vendor/lsi_logic_corporation.html

Works fine for our purposes, but again, we don't need screaming bleeding-edge performance either.

-Shannon
On Jan 24, 2007, at 04:06, Bryan Cantrill wrote:
> Yes, amazingly so: 4-way, 48 spindles, 4u. The real beauty of the
> match between ZFS and Thumper was (and is) that ZFS unlocks new economics
> in storage -- smart software achieving high performance and ultra-high

If Thumper and ZFS were born independently, how were all those disks going to be used without ZFS? It seems logical that the two be mated, but AFAIK there is no hardware RAID available in Thumpers.

Was "normal" software RAID the plan? Treating each disk as a separate mount point?

Just curious.
Hello David,

Thursday, January 25, 2007, 1:47:57 AM, you wrote:

DM> If Thumper and ZFS were born independently, how were all those disks
DM> going to be used without ZFS? It seems logical that the two be mated,
DM> but AFAIK there is no hardware RAID available in Thumpers.
DM> Was "normal" software RAID the plan? Treating each disk as a separate
DM> mount point?

I guess Linux was considered, probably with LVM or something else.

--
Best regards,
 Robert                       mailto:rmilkowski at task.gda.pl
                              http://milek.blogspot.com
> If Thumper and ZFS were born independently, how were all those disks
> going to be used without ZFS? It seems logical that the two be mated,
> but AFAIK there is no hardware RAID available in Thumpers.

Like I said, Andy was very interested in ZFS. ;) And ZFS in Thumper: after all, what was ZFS going to do with that expensive but useless hardware RAID controller? That's part of what made this union so uncanny: despite being developed separately, each very much needed the other...

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
On 1/25/07, Bryan Cantrill <bmc at eng.sun.com> wrote:
> ... after all, what was ZFS going to do with that expensive but useless
> hardware RAID controller? ...

I almost rolled over reading this.

This is exactly what I went through when we moved our database server out from Vx** to ZFS. We had a 3510 and were thinking how best to configure the RAID. In the end, we ripped out the controller board and used the 3510 as a JBOD directly attached to the server. My DBA was so happy with this setup (especially with the snapshot capability) that he is asking for another such setup.

--
Just me,
Wire ...
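For anyone curious what that kind of setup looks like, a minimal sketch -- the pool, dataset and device names below are invented, and assume the 3510 shelf simply presents its disks as individual targets once the controller is out of the picture:

  # mirror disk pairs straight off the JBOD shelf and carve out a dataset
  zpool create orapool mirror c3t0d0 c4t0d0 mirror c3t1d0 c4t1d0
  zfs create orapool/oradata
  zfs list -r orapool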
Hi Wee,

Having snapshots in the filesystem that work so well is really nice. How are y'all quiescing the DB?

Best Regards,
J

On 1/24/07, Wee Yeh Tan <weeyeh at gmail.com> wrote:
> In the end, we ripped out the controller board and used the 3510 as a JBOD
> directly attached to the server. My DBA was so happy with this setup
> (especially with the snapshot capability) that he is asking for another
> such setup.
On 1/25/07, Jason J. W. Williams <jasonjwwilliams at gmail.com> wrote:
> Having snapshots in the filesystem that work so well is really nice.
> How are y'all quiescing the DB?

So the DBA has a cron job that puts the DB (Oracle) into hot backup mode, takes a snapshot of all affected filesystems (i.e. log + datafile + binary), and then releases the database.

Oracle lives in its own zone and the DBA gets root to the zone. The 2 ZFS datasets from 2 pools are imported into the zone.

--
Just me,
Wire ...
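For reference, a rough sketch of what such a cron job could look like. The dataset names are invented, and the ALTER DATABASE BEGIN/END BACKUP statements assume Oracle 10g (older releases would loop over tablespaces instead):

  #!/bin/sh
  # hypothetical snapshot job: quiesce Oracle, snapshot the datasets, release
  SNAP=`date +%Y%m%d%H%M`

  # 1. put the database into hot backup mode
  echo "alter database begin backup;" | sqlplus -s "/ as sysdba"

  # 2. snapshot the datasets holding the logs, datafiles and binaries
  zfs snapshot logpool/oralog@$SNAP
  zfs snapshot datapool/oradata@$SNAP
  zfs snapshot datapool/orabin@$SNAP

  # 3. take the database out of backup mode
  echo "alter database end backup;" | sqlplus -s "/ as sysdba"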
>> ZFS -> HBA -> FC Switch -> JBOD -> "Simple" FC-SATA converter -> SATA disk
> Why bother with the switch here?

Think multiple JBODs.

With a single JBOD a switch is not needed, and then FC is probably also overkill - normal SCSI can work.

- Peter
On Wed, Jan 24, 2007 at 10:19:29AM -0800, Frank Cusack wrote:
> On January 24, 2007 10:04:04 AM -0800 Bryan Cantrill <bmc at eng.sun.com> wrote:
> > Note that the 3511 is being replaced with the 6140:
>
> Which is MUCH nicer but also much pricier. Also, no non-RAID option.

So there's no way to treat a 6140 as JBOD? If you wanted to use a 6140 with ZFS, and really wanted JBOD, your only choice would be a RAID 0 config on the 6140?

--
albert chin (china at thewrittenword.com)
Hello Peter,

Thursday, January 25, 2007, 10:29:52 AM, you wrote:

PE> Think multiple JBODs.
PE> With a single JBOD a switch is not needed, and then FC is probably also
PE> overkill - normal SCSI can work.

Still, buying several dual-ported FC cards, putting them into the server and connecting directly would be cheaper and probably less error-prone.

--
Best regards,
 Robert                       mailto:rmilkowski at task.gda.pl
                              http://milek.blogspot.com
Albert Chin wrote:
> So there's no way to treat a 6140 as JBOD? If you wanted to use a 6140
> with ZFS, and really wanted JBOD, your only choice would be a RAID 0
> config on the 6140?

Why would you want to treat a 6140 like a JBOD? (See the previous threads about JBOD vs HW RAID...)
On Jan 25, 2007, at 10:16, Torrey McMahon wrote:
> Why would you want to treat a 6140 like a JBOD? (See the previous
> threads about JBOD vs HW RAID...)

I was trying to see if we sold the CSM2 trays without the controller, but I don't think that's commonly asked for... It reminds me of the old D1000 days - I seem to recall putting in more of those as the A1000 controllers weren't the greatest, and people tended to opt for s/w mirrors instead. Then, as the system application load went higher and the data became more critical, the push was towards offloading this onto better storage controllers... So, since it seems like we now have more processing and bus speed on the system that applications aren't taking advantage of yet, it looks like the pendulum might be swinging back towards host-based RAID again.

Not a verdict... just a thought.

.je
On Thu, Jan 25, 2007 at 10:16:47AM -0500, Torrey McMahon wrote:
> Why would you want to treat a 6140 like a JBOD? (See the previous
> threads about JBOD vs HW RAID...)

Well, a 6140 with RAID 10 is not an option because we don't want to lose 50% of the disk capacity. So, we're left with RAID 5. Yes, we could layer ZFS on top of this. But what do you do if you want RAID 6? The easiest way to get it is ZFS RAID-Z2 on top of JBOD. The only reason I'd consider RAID is if the HW RAID performance was enough of a win over ZFS SW RAID.

--
albert chin (china at thewrittenword.com)
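For what it's worth, the RAID-Z2 case really is a one-liner on a JBOD -- the pool name and device names below are made up:

  # double-parity pool across eight JBOD disks (survives any two disk failures)
  zpool create tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0
  zpool status tank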
On January 25, 2007 11:22:41 AM -0500 Jonathan Edwards <Jonathan.Edwards at Sun.COM> wrote:
> I was trying to see if we sold the CSM2 trays without the controller, but
> I don't think that's commonly asked for...

Best I could tell, the connector was proprietary, so even if you could get the tray by itself you couldn't attach it to your host.

-frank
On Thu, Jan 25, 2007 at 10:57:17AM +0800, Wee Yeh Tan wrote:
> This is exactly what I went through when we moved our database server
> out from Vx** to ZFS. We had a 3510 and were thinking how best to
> configure the RAID. In the end, we ripped out the controller board
> and used the 3510 as a JBOD directly attached to the server.

The only benefit of using a HW RAID controller with ZFS is that it reduces the I/O that the host needs to do, but the trade-off is that ZFS cannot do combinatorial parity reconstruction, so it could only detect errors, not correct them. It would be cool if the host could offload the RAID I/O to a HW controller but still be able to read the individual stripes to perform combinatorial parity reconstruction.
Nicolas Williams wrote:
> The only benefit of using a HW RAID controller with ZFS is that it
> reduces the I/O that the host needs to do, but the trade-off is that ZFS
> cannot do combinatorial parity reconstruction, so it could only
> detect errors, not correct them.

OK, not the *only* benefit :-) IMHO, the most visible benefit is the nonvolatile write cache. The RAID configuration is simply an implementation detail.
 -- richard
On Thu, Jan 25, 2007 at 09:52:05AM -0800, Richard Elling wrote:
> OK, not the *only* benefit :-) IMHO, the most visible benefit is the
> nonvolatile write cache. The RAID configuration is simply an
> implementation detail.

Well, yes, but NVRAM could be a generic device. The task of taking one I/O from the host and mapping it to N + parity I/Os to actual devices is very specific to RAID.

Nico
--
On Thu, 2007-01-25 at 10:16 -0500, Torrey McMahon wrote:
> Why would you want to treat a 6140 like a JBOD? (See the previous
> threads about JBOD vs HW RAID...)

Let's turn this around. Assume I want an FC JBOD. What should I get?

        - Bill
On Jan 25, 2007, at 14:34, Bill Sommerfeld wrote:
> Let's turn this around. Assume I want an FC JBOD. What should I get?

Perhaps something coming real soon... (stall)

.je

BTW - I've also said you could do an FC target in a Thumper a la FalconStor... but I'm not sure if they've got that going on S10, and their target multipathing was less than stellar. We did have a target-mode driver at one point, but I think that project got scrapped a while back.
On Thu, 25 Jan 2007, Bill Sommerfeld wrote:
> Let's turn this around. Assume I want an FC JBOD. What should I get?

Hi Bill,

Many companies make FC expansion "boxes" to go along with their FC-based hardware RAID arrays. Often, the expansion chassis is identical to the RAID-equipped chassis - same power supplies, same physical chassis and disk drive carriers - the only difference is that the slots used to house the (dual) RAID H/W controllers have been blanked off. These expansion chassis are designed to be daisy-chained back to the "box" with the H/W RAID. So you simply use one of the expansion chassis and attach it directly to a system equipped with an FC HBA and ... you've got an FC JBOD. Nearly all of them will support two FC connections to allow dual redundant connections to the FC RAID H/W. So if you equip your ZFS host with either a dual-port FC HBA or two single-port FC HBAs, you have a pretty good redundant FC JBOD solution.

An example of such an expansion box is the DS4000 EXP100 from IBM. It's also possible to purchase a 3510FC box from Sun with no RAID controllers - but their nearest equivalent of an "empty" box comes with 6 (overpriced) disk drives pre-installed. :(

Perhaps you could use your vast influence at Sun to persuade them to sell an empty 3510FC box? Or an empty box bundled with a single or dual-port FC card (Qlogic-based please). Well - there's no harm in making the suggestion ... right?

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
Hello Jonathan,

Thursday, January 25, 2007, 9:03:47 PM, you wrote:

JE> Perhaps something coming real soon... (stall)

And that something would be...? OK, I believe you can't tell :)

--
Best regards,
 Robert                       mailto:rmilkowski at task.gda.pl
                              http://milek.blogspot.com
On Thu, Jan 25, 2007 at 02:24:47PM -0600, Al Hopper wrote:
> An example of such an expansion box is the DS4000 EXP100 from IBM. It's
> also possible to purchase a 3510FC box from Sun with no RAID controllers -
> but their nearest equivalent of an "empty" box comes with 6 (overpriced)
> disk drives pre-installed. :(

Well, when you buy disk for the Sun 5320 NAS Appliance, you get a Controller Unit shelf and, if you expand storage, an Expansion Unit shelf that connects to the Controller Unit. Maybe the Expansion Unit shelf is a JBOD 6140?

--
albert chin (china at thewrittenword.com)
On Jan 25, 2007, at 17:30, Albert Chin wrote:
> Well, when you buy disk for the Sun 5320 NAS Appliance, you get a
> Controller Unit shelf and, if you expand storage, an Expansion Unit
> shelf that connects to the Controller Unit. Maybe the Expansion Unit
> shelf is a JBOD 6140?

That's the CSM200 - the IOMs in it should just take a 2Gb or 4Gb SFP (copper or fibre), and the tray should run switched loop so you can mix FC and SATA as it connects back to the 6140 or 6540 controller head.

.je
Nicolas Williams writes:
> The only benefit of using a HW RAID controller with ZFS is that it
> reduces the I/O that the host needs to do, but the trade-off is that ZFS
> cannot do combinatorial parity reconstruction, so it could only
> detect errors, not correct them. It would be cool if the host could
> offload the RAID I/O to a HW controller but still be able to read the
> individual stripes to perform combinatorial parity reconstruction.

Right, but in this situation, if the "cosmic ray / bit flip" hits on the way to the controller, the array will store wrong data and we will not be able to reconstruct the correct block.

So having multiple I/Os here improves the time-to-data-loss metric.

-r
On Tue, Jan 30, 2007 at 06:32:14PM +0100, Roch - PAE wrote:
> Right, but in this situation, if the "cosmic ray / bit flip" hits on the
> way to the controller, the array will store wrong data and we will not
> be able to reconstruct the correct block.
>
> So having multiple I/Os here improves the time-to-data-loss metric.

You missed my point. Assume _new_ RAID HW that allows the host to read the individual stripes. Then ZFS could offload I/O to the RAID HW but, when a checksum fails to validate on read, go read the individual stripes and parity and do the combinatorial reconstruction as if the RAID HW didn't exist.

I don't believe such RAID HW exists, therefore the point is moot. But if such HW ever comes along...

Nico
--
Nicolas Williams writes:
> You missed my point. Assume _new_ RAID HW that allows the host to read
> the individual stripes. Then ZFS could offload I/O to the RAID HW but,
> when a checksum fails to validate on read, go read the individual
> stripes and parity and do the combinatorial reconstruction as if the
> RAID HW didn't exist.

I think I got the point. Mine was that if the data travels a single time toward the storage and is corrupted along the way, then there will be no hope of recovering it since the array was given bad data. Having the data travel twice is a benefit for MTTDL.

-r
On Tue, Jan 30, 2007 at 06:41:25PM +0100, Roch - PAE wrote:
> I think I got the point. Mine was that if the data travels a single time
> toward the storage and is corrupted along the way, then there will be no
> hope of recovering it, since the array was given bad data. Having the
> data travel twice is a benefit for MTTDL.

Well, this is certainly true, so I missed your point :)

Mirroring would help. A mirror with RAID-Z members would only double
the I/O and still provide for combinatorial reconstruction when the
errors arise from bit rot on the rotating rust or on the path from the
RAID HW to the individual disks, as opposed to on the path from the host
to the RAID HW. Depending on the relative error rates this could be a
useful trade-off to make. (Plus, mirroring should halve access times,
while RAID-Z, if disk heads can be synchronized and the disks have
similar geometries, can provide an N-fold increase in bandwidth, though
I'm told that disk head synchronization is no longer a common feature.)

Nico
--
Nicolas Williams wrote:
> On Tue, Jan 30, 2007 at 06:41:25PM +0100, Roch - PAE wrote:
> > I think I got the point. Mine was that if the data travels a single time
> > toward the storage and is corrupted along the way, then there will be no
> > hope of recovering it, since the array was given bad data. Having the
> > data travel twice is a benefit for MTTDL.
>
> Well, this is certainly true, so I missed your point :)

This technique is used in many situations where the BER is non-zero
(pretty much always) and the data is very important. For example,
consider command sequences being sent to a deep space probe -- you
really, really, really want the correct commands to be received, so you
use ECC and repeat the commands many times. There are mathematical
models for this. Slow? Yes. Correct? More likely.

> Mirroring would help. A mirror with RAID-Z members would only double
> the I/O and still provide for combinatorial reconstruction when the
> errors arise from bit rot on the rotating rust or on the path from the
> RAID HW to the individual disks, as opposed to on the path from the host
> to the RAID HW. Depending on the relative error rates this could be a
> useful trade-off to make. (Plus, mirroring should halve access times,
> while RAID-Z, if disk heads can be synchronized and the disks have
> similar geometries, can provide an N-fold increase in bandwidth, though
> I'm told that disk head synchronization is no longer a common feature.)

One of the benefits of ZFS is that not only is head synchronization not
needed, but also block offsets do not have to be the same. For example,
in a traditional mirror, block 1 on device 1 is paired with block 1 on
device 2. In ZFS, this 1:1 mapping is not required. I believe this will
result in ZFS being more resilient to disks with multiple block failures.
In order for a traditional RAID to implement this, it would basically
need to [re]invent a file system.

 -- richard
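As a toy version of those models (the numbers are invented, and the sketch
assumes each trip to storage corrupts the block independently with
probability p and that a checksum always identifies a bad copy), the chance
of losing the block is simply p raised to the number of independent copies:

    /*
     * Toy model only: assume each independent trip to storage silently
     * corrupts the block with probability p, corruption events are
     * independent, and a checksum always tells us which copies are bad.
     * The block is lost only if every copy is bad.
     */
    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
    	double p = 1e-9;	/* assumed per-trip corruption probability */

    	for (int k = 1; k <= 3; k++)
    		printf("%d cop%s: P(data loss) = %.1e\n",
    		    k, k == 1 ? "y" : "ies", pow(p, k));
    	return (0);
    }

Real MTTDL arithmetic also has to fold in drive failure and repair rates,
but the shape of the argument is the same: every extra independent copy
multiplies the exponent.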
On 30-Jan-07, at 5:48 PM, Richard Elling wrote:
...
> One of the benefits of ZFS is that not only is head synchronization not
> needed, but also block offsets do not have to be the same. For example,
> in a traditional mirror, block 1 on device 1 is paired with block 1 on
> device 2. In ZFS, this 1:1 mapping is not required. I believe this will
> result in ZFS being more resilient to disks with multiple block failures.
> In order for a traditional RAID to implement this, it would basically
> need to [re]invent a file system.

Yes, this is another feature for the "Why ZFS can beat RAID" FAQ.

--T
Richard Elling wrote:
> One of the benefits of ZFS is that not only is head synchronization not
> needed, but also block offsets do not have to be the same. For example,
> in a traditional mirror, block 1 on device 1 is paired with block 1 on
> device 2. In ZFS, this 1:1 mapping is not required. I believe this will
> result in ZFS being more resilient to disks with multiple block failures.
> In order for a traditional RAID to implement this, it would basically
> need to [re]invent a file system.

This may well offer better protection against drive firmware bugs.

Ian
> One of the benefits of ZFS is that not only is head synchronization not
> needed, but also block offsets do not have to be the same. For example,
> in a traditional mirror, block 1 on device 1 is paired with block 1 on
> device 2. In ZFS, this 1:1 mapping is not required. I believe this will
> result in ZFS being more resilient to disks with multiple block failures.
> In order for a traditional RAID to implement this, it would basically
> need to [re]invent a file system.
> -- richard

Richard,

This does not seem to be enforced (i.e., the mapping not being 1:1) in
the code anywhere that I can see. By "not required" do you mean that
this could be done in the future, or is it the case right now and I am
missing the code that accomplishes it?

-Wade
> > One of the benefits of ZFS is that not only is head synchronization not
> > needed, but also block offsets do not have to be the same. For example,
> > in a traditional mirror, block 1 on device 1 is paired with block 1 on
> > device 2. In ZFS, this 1:1 mapping is not required. I believe this will
> > result in ZFS being more resilient to disks with multiple block failures.
> > In order for a traditional RAID to implement this, it would basically
> > need to [re]invent a file system.
> > -- richard
>
> This does not seem to be enforced (i.e., the mapping not being 1:1) in
> the code anywhere that I can see. By "not required" do you mean that
> this could be done in the future, or is it the case right now and I am
> missing the code that accomplishes it?

I think he means that if a block fails to write on a VDEV, ZFS can write
that data elsewhere and is not forced to use that location, as opposed
to SVM, for example, where the mirror must try to write at a particular
offset or fail.

--
Darren Dunham                ddunham at taos.com
Senior Technical Consultant  TAOS  http://www.taos.com/
> > > One of the benefits of ZFS is that not only is head synchronization not
> > > needed, but also block offsets do not have to be the same. For example,
> > > in a traditional mirror, block 1 on device 1 is paired with block 1 on
> > > device 2. In ZFS, this 1:1 mapping is not required. I believe this will
> > > result in ZFS being more resilient to disks with multiple block failures.
> > > In order for a traditional RAID to implement this, it would basically
> > > need to [re]invent a file system.
> > > -- richard
> >
> > This does not seem to be enforced (i.e., the mapping not being 1:1) in
> > the code anywhere that I can see. By "not required" do you mean that
> > this could be done in the future, or is it the case right now and I am
> > missing the code that accomplishes it?
>
> I think he means that if a block fails to write on a VDEV, ZFS can write
> that data elsewhere and is not forced to use that location, as opposed
> to SVM, for example, where the mirror must try to write at a particular
> offset or fail.

Understood. I am asking if the current code base actually does this, as I
do not see the code path that deals with this case.

-Wade
> > I think he means that if a block fails to write on a VDEV, ZFS can write
> > that data elsewhere and is not forced to use that location, as opposed
> > to SVM, for example, where the mirror must try to write at a particular
> > offset or fail.
>
> Understood. I am asking if the current code base actually does this, as I
> do not see the code path that deals with this case.

Got it. So what happens now with a block write failure on one side of a
mirror? Does it retry forever or eventually fail the device?

--
Darren Dunham                ddunham at taos.com
Senior Technical Consultant  TAOS  http://www.taos.com/
Wade.Stuart at fallon.com wrote:
>> One of the benefits of ZFS is that not only is head synchronization not
>> needed, but also block offsets do not have to be the same. For example,
>> in a traditional mirror, block 1 on device 1 is paired with block 1 on
>> device 2. In ZFS, this 1:1 mapping is not required. I believe this will
>> result in ZFS being more resilient to disks with multiple block failures.
>> In order for a traditional RAID to implement this, it would basically
>> need to [re]invent a file system.
>> -- richard
>
> Richard,
>
> This does not seem to be enforced (i.e., the mapping not being 1:1) in
> the code anywhere that I can see. By "not required" do you mean that
> this could be done in the future, or is it the case right now and I am
> missing the code that accomplishes it?

IMHO, the best description of how mirrors work is:

  http://blogs.sun.com/bonwick/entry/smokin_mirrors

The ditto block code is interesting, too.

 -- richard
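For what it's worth, the ditto-block side of this is easy to picture from
the block pointer: each block pointer carries up to three data virtual
addresses (DVAs), each naming a vdev and an offset independently, along
with the checksum of the data it points to, so extra copies need not sit
at matching offsets or even on the same top-level vdev. The struct below
is a simplified sketch for illustration, not the real blkptr_t layout:

    /*
     * Simplified illustration of a ZFS block pointer; not the actual
     * on-disk blkptr_t, which packs vdev/offset into the DVA words and
     * carries additional fields (sizes, birth time, properties, ...).
     */
    #include <stdint.h>

    typedef struct example_dva {
    	uint64_t	dva_vdev;	/* which top-level vdev holds this copy */
    	uint64_t	dva_offset;	/* offset within that vdev */
    } example_dva_t;

    typedef struct example_blkptr {
    	example_dva_t	bp_dva[3];	/* up to three copies ("ditto blocks"),
    					 * each placed wherever the allocator
    					 * chose; no fixed pairing of offsets
    					 * between copies */
    	uint64_t	bp_cksum[4];	/* checksum of the data, stored in the
    					 * parent block, not with the data */
    } example_blkptr_t;

Because the parent records where each copy actually landed and which
checksum it must match, the placement of redundant copies is a file-system
decision rather than a fixed block-level pairing, which is the point of the
"reinvent a file system" remark above.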
On Jan 24, 2007, at 1:19 PM, Frank Cusack wrote:
>> On Wed, Jan 24, 2007 at 09:46:11AM -0800, Moazam Raja wrote:
>>
>> Note that the 3511 is being replaced with the 6140:
>
> Which is MUCH nicer but also much pricier. Also, no non-RAID option.

The 6140 is nicer in terms of performance.

However, if you have STK 35xx units now and want to replace them with
the 6140, AND you do LUN masking, be prepared to pay an extra
double-digit percentage of the base cost of a 6140 to enable that
feature. Not cool.

I call shenanigans on Engenio for that, and raise an eyebrow at Sun for
breaking feature continuity across the product line.

However, FC target for Solaris is very compelling, especially where one
may have only 100Mb/1Gb ethernet and a 2 or 4Gb FC SAN. It's a
no-brainer there which interconnect you want your targets to flow over
in a high-bandwidth environment.

Yeah, sure, it "might" eat into STK profits, but one will still have to
go there for redundant controllers. See, I have this dream of a
stripped-down, appliance-oriented Solaris OS, with ZFS, NFSv<whatever>,
iSCSI client and target server, FC target, and remote replication via
the incoming AVS. Just add a UI, a CIMOM provider, and a CLI.

Most of those bits are there already to make a swiss army knife of
storage anyway, and the more avenues that allow the inflow and outflow
of data, the better.

/dale
You can still do some LUN masking at the HBA level (Solaris 10); this
feature is called "blacklist".

On 1/31/07, Dale Ghent <daleg at elemental.org> wrote:
> The 6140 is nicer in terms of performance.
>
> However, if you have STK 35xx units now and want to replace them with
> the 6140, AND you do LUN masking, be prepared to pay an extra
> double-digit percentage of the base cost of a 6140 to enable that
> feature. Not cool.
>
> I call shenanigans on Engenio for that, and raise an eyebrow at Sun for
> breaking feature continuity across the product line.
On Jan 31, 2007, at 4:26 AM, Selim Daoud wrote:
> You can still do some LUN masking at the HBA level (Solaris 10); this
> feature is called "blacklist".

Oh, I'd do that, but Solaris isn't the only OS that uses arrays on my
SAN; other hosts, even cross-departmental ones, do too. Thus masking
from the array is a must to keep the amount of host-based tomfoolery to
a minimum.

/dale
BTW, I liked your blog entry about LUN masking: http://elektronkind.org/

selim.

On 1/31/07, Dale Ghent <daleg at elemental.org> wrote:
> On Jan 31, 2007, at 4:26 AM, Selim Daoud wrote:
> > You can still do some LUN masking at the HBA level (Solaris 10); this
> > feature is called "blacklist".
>
> Oh, I'd do that, but Solaris isn't the only OS that uses arrays on my
> SAN; other hosts, even cross-departmental ones, do too. Thus masking
> from the array is a must to keep the amount of host-based tomfoolery
> to a minimum.
>
> /dale
Richard Elling wrote:
> One of the benefits of ZFS is that not only is head synchronization not
> needed, but also block offsets do not have to be the same. For example,
> in a traditional mirror, block 1 on device 1 is paired with block 1 on
> device 2. In ZFS, this 1:1 mapping is not required. I believe this will
> result in ZFS being more resilient to disks with multiple block failures.
> In order for a traditional RAID to implement this, it would basically
> need to [re]invent a file system.

We had this fixed in T3 land a while ago, so I think most storage arrays
don't do the 1:1 mapping anymore; it's striped down the drives. In
theory, you could lose more than one drive in a T3 mirror and still
maintain data in certain situations.