This isn't about zfs as such, but about how to build a system for zfs.

Zfs likes JBOD, right? So how do I best build a system with lots of raw disk?

Let's assume that we're talking Sun kit (as I'm generally familiar with most of the bits). And that we're talking about a fibre interconnect - so that it's basically a SAN, and I can just add more disk to the network any time I like.

This gives me the 3510 and 3511 at the bottom end. I've been reading up on these (we already use direct attach 3510 boxes with hardware raid, and 3320/3310 scsi boxes). My understanding here is that the 3510 is only supported for direct attach to a host (so no SAN switches) and the 3511 isn't supported for host attach at all - you're supposed to hang it off a controller unit.

Go further up the scale and it isn't clear to me that JBOD exists.

So, any suggestions for a good way to connect lots of JBOD disk to a machine?

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
I think it would be safer to say, "ZFS likes lots of LUNs". That way it can better place data within a pool, deal with failures, etc. JBOD makes that easier as usually you throw lots of drives in a JBOD and go. There is nothing inherent in a hardware raid array that disables features or interferes with ZFS. You may want to reconfigure or re-architect in order to use those features more efficiently but I don't think anyone is going to say, "Throw away all your hardware raid controllers and convert to JBOD".

3510 and 3511 with raid controllers are supported on SAN or direct attach. The JBOD variant of the 3510 is only supported in direct connect mode as you state.

As for lots of FC connected JBOD: Got any A5K lying around? ;)

Peter Tribble wrote:
> So, any suggestions for a good way to connect lots of JBOD disk
> to a machine?
Torrey McMahon wrote:
> I think it would be safer to say, "ZFS likes lots of LUNs". That way it
> can better place data within a pool, deal with failures, etc.

The existence of this question (and it's quite appropriate, IMO) suggests the need for some good "Best Practices" documentation of how to handle ZFS in "big" SANs. I thought I'd seen a writeup from Bill Moore but it's lost in a mountain of email. Since ZFS changes the dynamic so much, but so many customers have so much invested in their setups with these large SANs, are there plans for such a guide? IME, many customers spend more on their storage than on their systems and software.

 - Pete
I had a question much the same as yours before, but mainly because I wanted to avoid the rather pricey (sorry Sun) storage when it was unnecessary.

One of these:
http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
Plus one of these:
http://www.cooldrives.com/sataenclosures.html

That should do you! :) At least, that's my plan. I've been assured that the SATA controller is supported, you might want to double check just to be sure, though.

If you need more than 8 drives, I'm not really sure. :) Maybe two of the marvell controllers? :P

This way you can buy SATA drives for the going ~30c/gig, instead of rather, uhm, painful pricing.

3511 for 1250G at 5 drives = ~18000$:
http://store.sun.com/CMTemplate/CEServlet?process=SunStore&cmdViewProduct_CP&catid=114140

That's $14.7/gig.

Solution I outlined for 2000G at 8 drives = ~1300$:
http://www.newegg.com/Product/Product.asp?Item=N82E16822144701
http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
http://www.cooldrives.com/sata-eight-bay-enclosure-1.html

That's $0.65/gig.

I don't think the integrated raid controller in the sun storage array is worth that kind of pricing. That's just me though. :) If somebody can point me to a nice 8+ drive rack mount enclosure for disks with SATA interface, I'd be super appreciative! :P

Cheers,
David

> So, any suggestions for a good way to connect lots of JBOD disk
> to a machine?
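[For what it's worth, a minimal sketch of what a pool across eight such SATA drives might look like. The device names are hypothetical and depend entirely on how the controller enumerates the disks:

    # one single-parity raidz group across all eight drives
    zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 \
                            c2t4d0 c2t5d0 c2t6d0 c2t7d0

    # or four mirrored pairs, trading capacity for resilience
    zpool create tank mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0 \
                      mirror c2t4d0 c2t5d0 mirror c2t6d0 c2t7d0

Either way ZFS sees raw spindles, which is the point of the DIY JBOD approach above.]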
Sorry, I misread your request, I didn't see you needed it to be SAN. My apologies, I suppose the FC connections would be worth the cost to you then.

Cheers,
David
I was hoping to provide (at least) some of this. I'm planning to do some Sun on Sun(tm) work with this to come up with best practices for ZFS in real terms.

I believe in ZFS. I think it is the future. However, it is not a simple beast, and offers plenty of points for sub-optimal configurations.

I'm waiting for the ZFS command set to stabilize before I start. Does anybody know the timeline for production quality ZFS? (e.g. when it will be practical to trust oracle on top of ZFS?)

On Apr 3, 2006, at 2:22 PM, Peter Rival wrote:
> The existence of this question (and it's quite appropriate, IMO)
> suggests the need for some good "Best Practices" documentation of
> how to handle ZFS in "big" SANs.

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive ULVL4-382        greg.shaw at sun.com (work)
Louisville, CO 80028-4382        shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds
On Mon, Apr 03, 2006 at 02:31:50PM -0600, Gregory Shaw wrote:
> I believe in ZFS. I think it is the future. However, it is not a
> simple beast, and offers plenty of points for sub-optimal
> configurations.
>
> I'm waiting for the ZFS command set to stabilize before I start.

I'm not sure what you mean by this. It's been stable since its integration. Or do you mean "not adding new features"? If it's the latter, we'll always be adding new features so you'll be waiting for a while.

> Does anybody know the timeline for production quality ZFS? (e.g.
> when it will be practical to trust oracle on top of ZFS?)

It is production quality now.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
On Mon, 2006-04-03 at 21:31, David J. Orman wrote:
> Sorry, I misread your request, I didn't see you needed it to be SAN. My
> apologies, I suppose the FC connections would be worth the cost to you
> then.

It doesn't have to be SAN. Personally, I'm not that enamoured with SAN systems - too complex and unreliable for my tastes - but they do have the advantage of allowing you to scale better than direct attach, and make adding to and managing the storage easier. A rough estimate indicates that significantly more than a dozen fast drives would be required just for one system - make that 20 SATA, just to get enough spindles.

Anybody from Sun care to comment on Thumper? Something like that could make a viable alternative.

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Peter Tribble wrote:
> It doesn't have to be SAN. Personally, I'm not that enamoured with
> SAN systems - too complex and unreliable for my tastes - but they
> do have the advantage of allowing you to scale better than direct
> attach, and make adding to and managing the storage easier.

Hi Peter,
In my experience of supporting customer SANs, and now writing the drivers to provide that connectivity, the complex SAN is the one which the customer has not planned beforehand. And I include the "planning for expansion" part there too. As to unreliable - what experiences have you had which make SANs misbehave?

best regards,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
On Mon, Apr 03, 2006 at 03:58:02PM -0400, Torrey McMahon wrote:
> 3510 and 3511 with raid controllers are supported on SAN or direct
> attach. The JBOD variant of the 3510 is only supported in direct connect
> mode as you state.

what about a 3510 JBOD connected to a fabric? I have minimal expansion slots on the front end systems and I don't want to burn them to add more storage.

along these lines, my ideas for a storage project I'm working on were either:

- 3510 JBOD connected to fabric, 2x 5 disk raidz and two hot spares.
- 3510 RAID connected to fabric, 2x 5 disk HW raid5 sets, two hot spares, raid5 sets in a zfs pool.

our IO pattern will be small, whole file reads and writes. I imagine the benefit of the HW cache would be worth the extra cost. I've started running some filebench scenarios, but I don't have the hardware in my hands yet so I can't do a direct comparison.

any thoughts on the above?

grant.
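[A rough sketch of what those two layouts might look like at the pool level. Device names are hypothetical, and the spare vdev assumes a ZFS build with hot-spare support:

    # option 1: JBOD, two 5-disk raidz groups plus two spares
    zpool create tank \
        raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 \
        raidz c4t5d0 c4t6d0 c4t7d0 c4t8d0 c4t9d0 \
        spare c4t10d0 c4t11d0

    # option 2: two hardware RAID-5 LUNs simply striped into a pool;
    # redundancy lives in the array, so ZFS can detect but not repair
    # damaged data blocks
    zpool create tank c5t40d0 c5t40d1

The second layout keeps the array's cache and parity handling, but gives ZFS nothing redundant to heal from.]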
Peter Rival wrote:
> Since ZFS changes the dynamic so much, but so many customers
> have so much invested in their setups with these large SANs,
> are there plans for such a guide?
> IME, many customers spend more on their storage than on their
> systems and software.

That's what ZFS is supposed to address. This will take some effort.

Henk (whose boss is at an EMC seminar this very moment)
On Mon, 2006-04-03 at 20:58, Torrey McMahon wrote:
> There is nothing inherent in a hardware raid array that disables features
> or interferes with ZFS. You may want to reconfigure or re-architect in
> order to use those features more efficiently

Having HW raid plus zfs costs you in two ways. The first is the obvious cost of the raid controllers (and they aren't at all cheap). The second is that you need to put in two lots of redundancy - once inside the HW raid, and then again so that zfs has redundant data.

> but I don't think anyone is
> going to say, "Throw away all your hardware raid controllers and convert
> to JBOD".

Oh I don't know! Besides, what about people starting from scratch? Looking at something like a 3510/3511, a JBOD solution is typically half the price of the equivalent HW raid solution.

> As for lots of FC connected JBOD: Got any A5K lying around? ;)

Ugh! (As it happens, yes, and the sooner they are put out of reach the better. Just replaced some with 3510s, and just have one left that I want to replace soon - hopefully with something that would involve zfs.)

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
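[To make the "two lots of redundancy" point concrete, a hypothetical example with made-up LUN names: to let ZFS self-heal on top of hardware RAID you would mirror two LUNs that are each already RAID-5 protected, paying for parity in the array and again in the pool:

    # ZFS-level mirror of two hardware RAID-5 LUNs of N disks each;
    # usable space is roughly (N-1)/(2N) of the raw disk capacity
    zpool create tank mirror c5t40d0 c5t41d0

That double cost is exactly the trade-off being debated here.]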
Grant says,
> what about a 3510 JBOD connected to a fabric? I have minimal expansion
> slots on the front end systems and I don't want to burn them to add
> more storage.

Just say "no." Actually, say "no way in hell" or "no (*%^@# way!"

The experience of the A5000 was so painful that I really hoped nobody would ever do it again. The fundamental problem is that the fault isolation is virtually nonexistent. Disks will never be sophisticated enough to offer both redundant paths and good fault isolation. In order to get there, you need disks which understand fabrics and switches with lots of ports and which are smart enough to understand broken disks. It is much, much easier to use a more modern technology which is designed for redundancy and fault isolation from the get-go: SAS. To use SAS, you are likely to be looking at a fancy controller which gets you back to a RAID array model. The circle is complete.

As others have pointed out, SATA is similar to SAS, but less robust and less costly. From a fault isolation perspective, SAS and SATA are staggeringly similar as commonly implemented (a good thing). I see few people implementing the dual-port SAS capabilities, so far.
 -- richard
Hello Peter,

Tuesday, April 4, 2006, 2:54:14 PM, you wrote:

>> As for lots of FC connected JBOD: Got any A5K lying around? ;)

PT> Ugh! (As it happens, yes, and the sooner they are put out of
PT> reach the better. Just replaced some with 3510s, and just have
PT> one left that I want to replace soon - hopefully with something
PT> that would involve zfs.)

Well, you can send me some A5k - I would gladly take them, especially with disks :)

-- 
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
On Tue, 2006-04-04 at 16:24, Richard Elling wrote:
> Grant says,
> > what about a 3510 JBOD connected to a fabric? I have minimal expansion
> > slots on the front end systems and I don't want to burn them to add
> > more storage.
>
> Just say "no." Actually, say "no way in hell" or "no (*%^@# way!"
>
> The experience of the A5000 was so painful

Indeed. This worried me as well. I was hoping that newer arrays would be better behaved, but I get the impression that you're saying they aren't.

Are you saying that we should forget JBOD and just stick regular HW arrays on the SAN?

What would you recommend as a scaleable solution, given a requirement for 20+ spindles, and an assumption that zfs is used?

Given the cost advantages of SATA, what about simply adding more drives and making 3-way mirrors to get the resilience back? (But even there, how to connect everything together - back to a SAN.)

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
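[If the three-way-mirror idea were pursued, the pool layout itself is simple enough. A sketch with hypothetical SATA device names, each top-level vdev being a triple mirror:

    # six drives as two three-way mirrors: any two drives in a vdev can
    # fail without data loss, at the cost of 2/3 of the raw capacity
    zpool create tank \
        mirror c2t0d0 c2t1d0 c2t2d0 \
        mirror c2t3d0 c2t4d0 c2t5d0

The open question in the thread is less the pool layout than how to physically attach that many spindles.]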
On Tue, 2006-04-04 at 11:24, Richard Elling wrote:
> To use SAS, you are likely to be looking at a fancy controller
> which gets you back to a RAID array model. The circle is
> complete.

so, I'm confused by this. Seems to me (as a networking guy) that if all you want is a JBOD with fault isolation, you can leave out caching, RAID, cache batteries, periodic cache battery maintenance, an administrative CLI/GUI, management ethernet & serial ports, etc. and I think you'd end up with a much simpler array controller..

By analogy to networking, consider the difference in complexity (and cost) between an unmanaged ethernet switch and a caching proxy...

                                        - Bill
Given ZFS's capabilities, I'd actually like to see if there were JBODs out there with NVRAM for write caching. Sort of like the old Sun StorEdge Array (you remember those, right? 10+ years ago?) Since HW raid isn't really needed anymore, I'd prefer to see what an NVRAM write cache would do for speeding things up.

Oh, and we (here in J2SE) just got one of the 6920 SAN things, and we'll be hooking our existing 3510/3511s into it. Seems a nice compromise on manageability, flexibility, storage, and cost. For our needs, I expect that we will solely be adding 3511s into it, since most of the requirements are long-term semi-archive storage. If you've got the coin, it's not bad at all. We got the base model with 4TB (essentially 2 6120 arrays, filled), and then have 3510s hooked into the FC switch head for management. Sun internal pricing is cheap (under $100k), but external is about 3x that.

Does anyone know if Sun is planning on producing something that looks like a 6020 but uses SATA instead of FC drives?

-Erik

On Tue, 2006-04-04 at 16:55 +0100, Peter Tribble wrote:
> What would you recommend as a scaleable solution, given
> a requirement for 20+ spindles, and an assumption that zfs
> is used?

-- 
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Peter Tribble wrote:
> Having HW raid plus zfs costs you in two ways. The first is the obvious
> cost of the raid controllers (and they aren't at all cheap). The second
> is that you need to put in two lots of redundancy - once inside the HW
> raid, and then again so that zfs has redundant data.

You're assuming that you have a bunch of Solaris boxes that are going to have dedicated storage arrays. Most datacenters are multiplatform and don't like dedicated storage arrays on each and every box. HW raid arrays let you consolidate storage across many hosts and OS.

I think it's fair to say that customers don't architect around specific OS or filesystems. They architect solutions to meet their overall data requirements. ZFS helps quite a bit but it's not the solution to every problem....though I'm sure Jeff and company are working on it. :)

Also, when people talk about HW raid storage arrays you usually get other benefits: Cache for coalescing writes, remote replication, ability to split mirrors, higher redundancy, etc. JBODs don't play in that space.
Bill Sommerfeld wrote:
> By analogy to networking, consider the difference in complexity (and
> cost) between an unmanaged ethernet switch and a caching proxy...

To continue your analogy, you would have to buy every host its own unmanaged ethernet switch and convert all the hosts to speak IP when they don't.
On Tue, 2006-04-04 at 12:07 -0400, Bill Sommerfeld wrote:
> so, I'm confused by this. Seems to me (as a networking guy) that if all
> you want is a JBOD with fault isolation, you can leave out caching,
> RAID, cache batteries, periodic cache battery maintenance, an
> administrative CLI/GUI, management ethernet & serial ports, etc. and I
> think you'd end up with a much simpler array controller..

Fundamentally, disks aren't very sophisticated, and the price point for disks doesn't allow much sophistication. One of the lessons learned with the A5000 is that even though each disk has 2 ports for connection to 2 different FC-AL loops, by definition that is a common cause fault opportunity. A faulty disk can bring down both loops. Since the disks aren't very sophisticated, and firmware is difficult to manage once released to the field, all hell can break loose. The way to get around this is to go point-to-point, ala SATA and SAS. You will see this pattern repeated often in the high-availability space. For a recent example, see the Sun Netra CT900 announced this week. This design has excellent fault isolation and containment.
http://www.sun.com/products-n-solutions/hw/networking/ct900/

> By analogy to networking, consider the difference in complexity (and
> cost) between an unmanaged ethernet switch and a caching proxy...

The problem in your analogy is that you are assuming IP. There is no equivalent level to IP in the SAN world, as implemented. SANs are more like IPX/SPX with some features removed (ok, maybe I am a little biased... :-)
 -- richard
Regarding firmware, most intelligent disk controllers (arrays) manage disk microcode. I like that idea, as it allows the device closest to the disk (and not the host) to manage the disks.

On Apr 7, 2006, at 4:42 PM, Richard Elling wrote:
> Since the disks aren't very sophisticated, and firmware is
> difficult to manage once released to the field, all hell can break
> loose.

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive ULVL4-382        greg.shaw at sun.com (work)
Louisville, CO 80028-4382        shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds
On Fri, 2006-04-07 at 22:31, Torrey McMahon wrote:
> You're assuming that you have a bunch of Solaris boxes that are going to
> have dedicated storage arrays.

Indeed. That was essentially what the original question was. Let me rephrase it:

"Assuming I have some Sun servers and will be using zfs, what's the best way to buy the storage?"

It's not entirely a theoretical question - I have proposals to write involving just this scenario, hence the original question.

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
On Mon, 2006-04-03 at 23:32, James C. McPherson wrote:
> Hi Peter,
> In my experience of supporting customer SANs, and now writing the
> drivers to provide that connectivity, the complex SAN is the one
> which the customer has not planned beforehand. And I include the
> "planning for expansion" part there too. As to unreliable - what
> experiences have you had which make SANs misbehave?

Buying a Sun 6900 was - with hindsight - a mistake. (At the time I'm not sure there was much else - it would have been more expensive to direct attach T3s to each box [need more T3s] and the current 3xxx systems didn't exist.) But the switches would freeze, the VEs would randomly misbehave, GBICs would fail, cables go bad, T3 controllers would play up - and diagnosis was extremely difficult. Remember this was a canned solution - so we had very little access. (I think that if the place hadn't been closed down I would have probably ripped out the T3s and gone for direct attach.) Were we on the bleeding edge of adoption? Have all the switch lockup problems (I know other colleagues at the time had problems getting switches stable) been fixed?

The general point, though, is that a SAN has many more points of failure. You've got switches, the software on them, and the configuration of the SAN. (Networks and network switches are more mature, and I guess that VLANs are in some ways analogous to zones, which - given some experience - doesn't encourage me.)

My hope is that currently the components are more reliable than they were when I last tried this. Would that be a reasonable expectation?

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
On Fri, 2006-04-07 at 22:36, Torrey McMahon wrote:
> Bill Sommerfeld wrote:
> > By analogy to networking, consider the difference in complexity (and
> > cost) between an unmanaged ethernet switch and a caching proxy...
>
> To continue your analogy, you would have to buy every host its own
> unmanaged ethernet switch and convert all the hosts to speak IP when
> they don't.

Or maybe the difference between a huge SMP system and a rack of cheap blade servers? In the HPC space, commodity clusters (JBOD) have largely superseded large single machines (HW raid boxes). The key is the glue, usually MPI (zfs). The issues are similar - large clusters have to deal with fault isolation and manageability as well.

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Boy, it sounds like you've had some bad experience with switches. I don't know the timeframe for the below, but it sounds like ~1999. At that time, there were some bad lasers being produced that would burn out. Especially bad were the converters from copper FC to fiber FC (required by the T3). They failed regularly.

I've been running (brocade) FC switches of varying size (16-120 ports), and have had very few problems. I've currently got over 60 switches in production (1g-4g), and we've had 3 switches actually fail.

We designed everything with pairs of switches. One path goes to one switch, while the other goes to another switch. That has worked well when multi-pathing is available, and allows for an entire switch to fail, yet the system operation will continue.

In other words, our SANs have been reliable for years in production. In recent history, it has helped that newer switches support upgrading microcode without downtime.

On Apr 10, 2006, at 6:56 AM, Peter Tribble wrote:
> But the switches would freeze, the
> VEs would randomly misbehave, GBICs would fail, cables go bad,
> T3 controllers would play up - and diagnosis was extremely
> difficult.

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382        greg.shaw at sun.com (work)
Louisville, CO 80028-4382        shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds
> The general point, though, is that a SAN has many more points
> of failure. You've got switches, the software on them, and the
> configuration of the SAN. (Networks and network switches are
> more mature, and I guess that VLANs are in some ways analogous
> to zones, which - given some experience - doesn't encourage me.)

"And if you're on a SAN, you're using a network designed by disk firmware writers. God help you." - Jeff Bonwick

http://blogs.sun.com/roller/page/bonwick?entry=zfs_end_to_end_data

Casper
On Mon, 10 Apr 2006, Peter Tribble wrote:
> The general point, though, is that a SAN has many more points
> of failure. You've got switches, the software on them, and the
> configuration of the SAN.
>
> My hope is that currently the components are more reliable
> than they were when I last tried this. Would that be a
> reasonable expectation?

A couple of comments from someone with no first-hand experience of the particular equipment you refer to.

One of the biggest gripes I have, and one that I constantly recommend that people (clients) consider, is that while some level of redundancy is good from a systems availability standpoint, *identicality* is not. Because if you use the same (FC) switches, with the same firmware, you experience the same bugs and the same failure modes. Whereas, if you use Switch A from vendor A and switch B from vendor B and you also use different GBICs[1] etc - you have redundancy without identical bugs and failure modes.

The same applies to networking equipment. Going all Cisco is an easy sell in the boardroom - but when a Cisco exploit hits the street, you'll wish you had not designed in identicality.

Regarding GBICs: I insisted on sparing at the time we deployed a SAN solution for a client and also insisted on Finisar GBICs. We never did use any of the spares (in 5 years).

One important design detail of FC is that the low-level (Layer 1) transport spec maintains a constant duty-cycle, regardless of the data being transported over the FC links. The intent is to keep the temperature of optical components[2] constant and prevent component temperature cycling, which often has a negative impact on electronic systems reliability. The FC wire protocol was contributed by IBM.

One more general point. Redundancy does not increase system complexity and manageability by x2 (times two); IMHO, it's more like x4. In most areas of computing, the human operators are the weak link - and they have a poor track record of being able to master highly complex technology. If your tech "owners" are unable to grok x2 complexity - don't burden them, and the down-stream user community, with x4 complexity by building fully redundant systems.

It's interesting to note that human failure modes, and how to minimize them, are well known in certain fields, but that knowledge has not been widely applied to high-reliability/high-availability computing systems. For example, it's a well-known fact that a 2 person flight crew, flying a modern aircraft, will exhibit dramatically better accident statistics than any single person crew flying equally equipped (modern) aircraft.

[1] There were some GBICs that were widely known to be "bad". Bad, as in subject to excessively high failure rates. I think that they were made by IBM ... but I'm not sure. They were sold/re-sold under various names.

[2] really all components that are transporting FC data.

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
Peter Tribble wrote:
> Were we on the bleeding edge of adoption? Have all the
> switch lockup problems (I know other colleagues at the time had
> problems getting switches stable) been fixed?

All of them? Hard to say. 6900 was quite the beast but I'd say SANs are much more reliable than they were six years ago when the 6900 was being sold. (Or was it seven? I can't remember that far back...)

> The general point, though, is that a SAN has many more points
> of failure. You've got switches, the software on them, and the
> configuration of the SAN.

I think it would be better to say there are as many points of failure but, in the past, SAN components were more likely to fail. I'd say the components are much less likely to fail now than they did 5+ years ago.

> My hope is that currently the components are more reliable
> than they were when I last tried this. Would that be a
> reasonable expectation?

I'd argue that, yes, they are much more reliable but I've no hard data to back that up.
Peter Tribble wrote:
> Indeed. That was essentially what the original question was.
> Let me rephrase it:
>
> "Assuming I have some Sun servers and will be using zfs,
> what's the best way to buy the storage?"

What are the i/o requirements for the apps running on the Sun boxes? Is this a purely Sun/Solaris environment? What's the growth potential in the next three years? How much redundancy do you want on the transport level? How many nines?

I'm not trying to be a pain in the arse - At least not in this forum :) - but a lot more data would be required before a good config could even be thought about.
> Well, you can send me some A5k - I would gladly take
> them, especially with disks :)

We've (we == the Lysator computer club at Linköping University) got four A5000s running just fine (equipped with 9GB and 18GB disks) on an Ultra2. But it was *HELL* getting the system to work in a stable way. Finally had to ditch the Sun FC Sbus controllers and use third party ones from JNI, and we also had to forget about using the whole pack of 36GB Seagate Cheetahs we had got for dirt cheap... Or else we'd get disks spinning up/down randomly, and FC errors would fill the log files endlessly. The Cheetahs work just fine in standalone machines (like a Sun Blade 1000) but they just won't work in the A5000 :-(

That system has been running perfectly since then, serving the HOME directories (a whopping 100GB of mirrored storage :-).

Now, we just recently got two A3500FC systems (fully redundant), with 120 18GB disks, and thus we figured it'd be a nice upgrade from (more space at least :-) the old A5000 solution, so I've started looking into how to configure and connect them...

Installed Solaris Nevada on an Ultra 30 and hacked Raid Manager (I know, I know - it's not supported after Solaris 9 - who cares? :-) to work so I could configure the controllers, and the old configured LUNs showed up just fine (13 RAID5 groups on each A3500FC system). Now since I was planning on using ZFS with these systems I figured I'd try to reconfigure one of them to be a more JBOD-like system - erased all the old LUN/RAID5 groups and wrote a script to create an individual LUN for each drive - would be 60 LUNs per A3500. Started the script and it created LUNs 0-15 just fine, then created LUN 16 and then it started failing... Apparently Solaris refused to create the device node for LUN 16 and then Raid Manager got seriously confused and just gave up. Doh! So I figured I'd remove the last LUN (configured on the controller just fine, it was just that Solaris wouldn't see it) and a truss on "raidutil" gave that it was silently giving up since it couldn't see the /dev/osa/dev/rdsk/c1t4d15s0 device file - so I created a dummy one (a link to c1t4d14s0) and gave the command "raidutil -c c1t4d0 -D 16"... And then the A3500 controller crashed (it doesn't show up on the FC bus anymore at least, and I'm currently 30km away and can't check on it).

*SIGH*

(Yeah yeah, I know... Just venting some frustration... The A3500 is known to be crappy hardware but we got them for free :-)
On Fri, 14 Apr 2006, Peter Eriksson wrote:

> Started the script and it created LUNs 0-15 just fine, then created
> LUN 16 and then it started failing... Apparently Solaris refused to
> create the device node for LUN 16 and then Raid Manager got seriously
> confused and just gave up.

Ensure that the LUNs you need are defined in sd.conf.

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
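[For anyone hitting the same wall: the point about sd.conf is that the sd driver only creates nodes for the LUNs listed in /kernel/drv/sd.conf, and the stock file typically covers only low LUN numbers. A hedged sketch of the kind of entries that would need adding - the target and lun numbers here are examples only and depend on the actual configuration:

    name="sd" class="scsi" target=4 lun=16;
    name="sd" class="scsi" target=4 lun=17;
    name="sd" class="scsi" target=4 lun=18;

followed by a reconfiguration boot so the new device nodes get created.]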
> If somebody can
> point me to a nice 8+ drive rack mount enclosure for
> disks with SATA
> interface, I'd be super appreciative! :P

promise vtrak j300s

adaptec supposedly has one also, but you can't get docs online, and i had one on order for 2 months before cancelling it.
> The same applies to networking equipment. Going all Cisco is an easy sell
> in the boardroom - but when a Cisco exploit hits the street, you'll wish
> you had not designed in identicality.

That's not generally true. If you use two vendors, you're not 50% isolated from one of their bugs, you're 200% exposed.

-frank
> You're assuming that you have a bunch of Solaris boxes that are going to
> have dedicated storage arrays. Most datacenters are multiplatform and
> don't like dedicated storage arrays on each and every box. HW raid
> arrays let you consolidate storage across many hosts and OS.

So does SAS JBOD.

> Also, when people talk about HW raid storage arrays you usually get
> other benefits: Cache for coalescing writes, remote replication, ability
> to split mirrors, higher redundancy, etc. JBODs don't play in that space.

I think zfs will put JBODs into that space.

-frank
Frank Cusack wrote:
>> You're assuming that you have a bunch of Solaris boxes that are going to
>> have dedicated storage arrays. Most datacenters are multiplatform and
>> don't like dedicated storage arrays on each and every box. HW raid
>> arrays let you consolidate storage across many hosts and OS.
>
> So does SAS JBOD.

How can you share storage from a "bunch of disks" to multiple hosts? Outside of connecting it to all of the hosts at the same time and making sure one system doesn't grab more than the disk you've allocated to it via mental process?
Frank Cusack wrote:
> Use a SAS switch (analogue of SAN switch). eg
> <http://pmc-sierra.com/products/details/pm8398/>
>
> Granted, you can only divvy up storage by entire disk, as opposed to
> arbitrarily-sized LUNs.

First, that looks to be a chip solution for an array product and not a switch to let you take a jbod and allocate storage in front of it.

Second, allocating an entire disk to a host - when they'll be at 1TB in size in a short time - doesn't help when a host might simply need 100GB. Allocate two for a mirror and you're over-allocating by 20X the required space.
On Tue, 2006-04-18 at 18:25 -0700, Frank Cusack wrote:
> That's not generally true. If you use two vendors, you're not
> 50% isolated from one of their bugs, you're 200% exposed.

I'm not sure I follow this, but if you're saying what I think you're saying, you are both 50% isolated and 200% exposed, and cannot be anything else. But I don't see what percentages have to do with this.

When we do a RAS analysis, we wouldn't look at it that way. Each fault has a probability and an effect. If you have two identical things, then they have the same probability of being affected by each fault. If you have two different things, then they have different (perhaps completely different) probabilities of being affected by any given fault. So yes, you are more exposed because there are more fault opportunities. But it is also less likely that any single fault will bring both down. For safety-critical systems, I'll go for the diversity.
 -- richard
[splitting hairs here...]

On Tue, 2006-04-18 at 21:37 -0700, Frank Cusack wrote:
> You have to do an analysis specific to the deployment at hand. You can''t just outright say, diversity is good.

Diversity is good. It also costs real money, which is your point. Fast, reliable, inexpensive: pick one.
 --richard
On Tue, 18 Apr 2006, Richard Elling wrote:
> [splitting hairs here...]
>
> On Tue, 2006-04-18 at 21:37 -0700, Frank Cusack wrote:
> > You have to do an analysis specific to the deployment at hand. You can''t just outright say, diversity is good.
>
> Diversity is good. It also costs real money, which is your point.
> Fast, reliable, inexpensive: pick one.
                                    ^^^
Correction: pick two.

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
Frank Cusack wrote:
> On April 18, 2006 11:11:09 PM -0400 Torrey McMahon <Torrey.McMahon at Sun.COM> wrote:
>> Frank Cusack wrote:
>>> On April 18, 2006 10:24:01 PM -0400 Torrey McMahon <Torrey.McMahon at Sun.COM> wrote:
>>>> Frank Cusack wrote:
>>>>>> You''re assuming that you have a bunch of Solaris boxes that are going to have dedicated storage arrays. Most datacenters are multiplatform and don''t like dedicated storage arrays on each and every box. HW raid arrays let you consolidate storage across many hosts and OS.
>>>>>
>>>>> So does SAS JBOD.
>>>>
>>>> How can you share storage from a "bunch of disks" to multiple hosts? Outside of connecting it to all of the hosts at the same time and making sure one system doesn''t grab more than the disk you''ve allocated to it via mental process?
>>>
>>> Use a SAS switch (analogue of SAN switch). eg <http://pmc-sierra.com/products/details/pm8398/>
>>>
>>> Granted, you can only divvy up storage by entire disk, as opposed to arbitrarily-sized LUNs.
>>
>> First, that looks to be a chip solution for an array product and not a switch to let you take a jbod and allocate storage in front of it.
>
> Yes. But LSI just demo''d an actual switch, and probably in a year''s time there will be multiple products on the market.

...which adds how much to the cost of the JBOD? (Hardware, support, training, etc.) How much for a hw raid array that lets you do lun masking, carve out space, do lun expansion, etc.?

>> Second, allocating an entire disk to a host - When they''ll be at 1TB in size in a short time - doesn''t help when a host might simply need 100GB. Allocate two for mirror and you''re over allocating by 20X the required space.
>
> It seems to me that hosts which require only 100GB don''t really participate in today''s SANs. (But just a guess, really.) I''d expect most disk allocations to be multi-disk, not sub-disk. But why buy 1TB disks to give out 100GB chunks? I wouldn''t pay the SAN/FC $$$ premium to do it. It''s going to be cheaper to buy smaller SAS disk than to part out parts of more expensive (per GB) FC-attached disk.

In some cases we see customers taking a single HW raid array and carving up hundreds of LUNs of small size for multiple systems. Not as often as larger datasets but I got a call about one such setup the other week.

My point here is that in the near future you will only be able to buy 1TB, or some other large size, drives. Drive density has been growing at an exponential rate the past 10 years. Anyone recall those 2GB drives that we used to fill SSAs with? SATA drives are up to 500GB today...in case you didn''t notice. Do you think SAS drives are going to stay at 37GB for long? Not a chance. They''ll ramp just as fast, if not faster, than their FC cousins. (147 is out now. I''m pretty sure ~250 is in the works.) At some point you''ll need to split the disk drives in a logical manner to avoid gross over allocation, yet maintaining performance requirements for the majority of your hosts. Or just place all of the storage behind a NAS box. Take your pick. ;)
On Wed, Apr 19, 2006 at 01:31:43AM -0400, Torrey McMahon wrote:
> In some cases we see customers taking a single HW raid array and carving up hundreds of LUNs of small size for multiple systems. Not as often as larger datasets but I got a call about one such setup the other week.

This will change, surely. Partly because this way lies madness, partly because ZFS rocks.

> My point here is that in the near future you will only be able to buy 1TB, or some other large size, drives. Drive density has been growing at an exponential rate the past 10 years. Anyone recall those 2GB drives that we used to fill SSAs with? SATA drives are up to 500GB today...in

I remember 10MB hard drives, FWIW. I also remember that no matter how large the drives get there''s always stuff to fill them with. 100GB HW RAID seems pitiful now...

> case you didn''t notice. Do you think SAS drives are going to stay at 37GB for long? Not a chance. They''ll ramp just as fast, if not faster, than their FC cousins. (147 is out now. I''m pretty sure ~250 is in the works.) At some point you''ll need to split the disk drives in a logical manner to avoid gross over allocation, yet maintaining performance requirements for the majority of your hosts. Or just place all of the storage behind a NAS box. Take your pick. ;)

I expect the latter, as you seem to also, because ZFS rocks :)

I expect some database applications will just use huge logical devices without volume management, even if it''d be better to use volume management.

But small allocations have got to go.

I can imagine, say, Solaris as iSCSI servers serving ZFS files as raw devices where small allocations are needed but NAS is, for whatever reason, not applicable. Anything but manage multitudes of LUNs.

Nico
 --
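A rough sketch of how small allocations could be carved out of a single pool on such a server, using ZFS emulated volumes (zvols). The pool layout, dataset names and sizes below are invented, and actually exporting the volume over iSCSI would still require a separate iSCSI target on the host; this only shows the ZFS side:

  # one pool built from whole disks (hypothetical device names)
  zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0

  # a container dataset, then a 100GB "LUN" for one client host
  zfs create tank/vol
  zfs create -V 100g tank/vol/hostA

  # the block device shows up under /dev/zvol, ready to be exported
  ls -l /dev/zvol/rdsk/tank/vol/hostA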
> I don''t think the integrated raid controller in the sun storage array is worth that kind of pricing. That''s just me though. :) If somebody can point me to a nice 8+ drive rack mount enclosure for disks with SATA interface, I''d be super appreciative! :P

This winter I built a nice rsync server running Solaris 10 (not ZFS yet though, but that is definitely coming) from the following parts:

  Motherboard: SuperMicro X6DHE-XG2
  CPU: 2x Intel Xeon 2.8GHz
  RAM: 2GB
  Disks: 14x 400GB SATA 7200rpm
  Rack mount case: SuperMicro SC933T-R760
  Disk controller: Adaptec S21610SA (16port SATA RAID)

That rack mount case supports 15 1" SATA disks and has a triple-redundant power supply, lots of fans and stuff. Link: http://www.supermicro.com/products/chassis/3U/933/SC933T-R760.cfm

I don''t use the RAID capability of that Adaptec card though so if I was going to build a similar server today I would probably go for your suggested SATA controller instead (probably cheaper :-).

This message posted from opensolaris.org
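For what a setup like that might look like once ZFS lands on it, here is a minimal sketch; the controller and target numbers are made up, and splitting the 14 data drives into two raidz groups is just one reasonable choice:

  # two 7-disk raidz vdevs in a single pool
  zpool create tank \
      raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
      raidz c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0

  zpool status tank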
On Apr 19, 2006, at 12:17 AM, Nicolas Williams wrote:
> On Wed, Apr 19, 2006 at 01:31:43AM -0400, Torrey McMahon wrote:
>> In some cases we see customers taking a single HW raid array and carving up hundreds of LUNs of small size for multiple systems. Not as often as larger datasets but I got a call about one such setup the other week.
>
> This will change, surely. Partly because this way lies madness, partly because ZFS rocks.

Why would this change? When you buy disk resources, you want to use them as effectively as possible. It''s dangerous to have a large number of systems on a large HW array because of SPOF and performance concerns, but it''s entirely reasonable. It''s far better than local storage.

>> My point here is that in the near future you will only be able to buy 1TB, or some other large size, drives. Drive density has been growing at an exponential rate the past 10 years. Anyone recall those 2GB drives that we used to fill SSAs with? SATA drives are up to 500GB today...in
>
> I remember 10MB hard drives, FWIW. I also remember that no matter how large the drives get there''s always stuff to fill them with. 100GB HW RAID seems pitiful now...

True. Arrays today are measured in the 10''s of TBs.

>> case you didn''t notice. Do you think SAS drives are going to stay at 37GB for long? Not a chance. They''ll ramp just as fast, if not faster, than their FC cousins. (147 is out now. I''m pretty sure ~250 is in the works.) At some point you''ll need to split the disk drives in a logical manner to avoid gross over allocation, yet maintaining performance requirements for the majority of your hosts. Or just place all of the storage behind a NAS box. Take your pick. ;)
>
> I expect the latter, as you seem to also, because ZFS rocks :)
>
> I expect some database applications will just use huge logical devices without volume management, even if it''d be better to use volume management.
>
> But small allocations have got to go.
>
> I can imagine, say, Solaris as iSCSI servers serving ZFS files as raw devices where small allocations are needed but NAS is, for whatever reason, not applicable. Anything but manage multitudes of LUNs.
>
> Nico

One thing bothers me about NAS and iSCSI. What''s the max performance? On modern arrays and tape drives, the arrays can drive at several thousand i/o''s per second, and 200+mb/sec. How can a nas box come even remotely close to that? Single threaded gig-e using NFS maxes out at around 45MB/second. That''s around 20% of what a local array can do. I don''t get the focus on NAS when local disk performance is far better.

On an unrelated note, the problem I see with big drives is how to back up those drives in a reasonable amount of time. Drive spindles aren''t getting faster, which means that as drives get bigger, the amount of time to back them up is linear.

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382          greg.shaw at sun.com (work)
Louisville, CO 80028-4382           shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
> [splitting hairs here...]
>
> On Tue, 2006-04-18 at 21:37 -0700, Frank Cusack wrote:
> > You have to do an analysis specific to the deployment at hand. You can''t just outright say, diversity is good.
>
> Diversity is good. It also costs real money, which is your point.

Actually, my original point was that vendor diversity doesn''t necessarily insulate you from the problems of one of those vendors and may in fact increase your problems. I''ve certainly never been in a situation where having a second network vendor saved me from the problems of the first.

There are, absolutely, reasons to use multiple vendors. Security exposure is probably not one of them.

-frank

This message posted from opensolaris.org
> There are, absolutely, reasons to use multiple vendors. Security exposure is probably not one of them.

The DOD supposedly uses firewall complexes consisting of three layers, each using a different vendor.

(It works when they''re in series, not when they''re in parallel)

Casper
Gregory Shaw wrote:
>
> On Apr 19, 2006, at 12:17 AM, Nicolas Williams wrote:
>
>> On Wed, Apr 19, 2006 at 01:31:43AM -0400, Torrey McMahon wrote:
>>> In some cases we see customers taking a single HW raid array and carving up hundreds of LUNs of small size for multiple systems. Not as often as larger datasets but I got a call about one such setup the other week.
>>
>> This will change, surely. Partly because this way lies madness, partly because ZFS rocks.
>
> Why would this change? When you buy disk resources, you want to use them as effectively as possible. It''s dangerous to have a large number of systems on a large HW array because of SPOF and performance concerns, but it''s entirely reasonable.
>
> It''s far better than local storage.

Right. Datasets are growing, but in a lot of cases there are apps that still only need 100GB to get their work done. (Think fast temp space for a grid.) It could be argued that certain ZFS features, like snapshots, could cause the overall space requirements to increase, but even then you will find systems that don''t require TBs of storage. They might need high-performing, reliable storage...but not lots and lots of TBs.

Of course, the issue isn''t with one system, or even five, but hundreds. Datacenter math is always more interesting. :)
On Wed, 2006-04-19 at 01:17 -0500, Nicolas Williams wrote:
> > case you didn''t notice. Do you think SAS drives are going to stay at 37GB for long? Not a chance. They''ll ramp just as fast, if not faster, than their FC cousins. (147 is out now. I''m pretty sure ~250 is in the works.)

This week Seagate announced 300 GByte, 15k rpm SAS (enterprise-style) drives.

> But small allocations have got to go.

Small blocks, too. Today 512 byte blocks are common. Going forward, this won''t work and the consensus seems to be falling to 4kByte blocks. We''ve made great strides in the past few years getting to large memory pages in the kernel; I expect the same exercise in the disk drives.
 -- richard
> On an unrelated note, the problem I see with big drives is how to back up those drives in a reasonable amount of time. Drive spindles aren''t getting faster, which means that as drives get bigger, the amount of time to back them up is linear.

So you back them up to more disk. And you don''t do it all at once. This is where incremental snapshots and zfs send/receive can replicate a snapshot remotely.

Henk
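A minimal sketch of that style of replication; the pool, dataset and host names here are invented:

  # initial full copy of a snapshot to another machine
  zfs snapshot tank/data@sun
  zfs send tank/data@sun | ssh backuphost zfs receive backup/data

  # later, send only what changed between two snapshots
  zfs snapshot tank/data@mon
  zfs send -i tank/data@sun tank/data@mon | ssh backuphost zfs receive backup/data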
Gregory Shaw wrote:
>
> One thing bothers me about NAS and iSCSI. What''s the max performance? On modern arrays and tape drives, the arrays can drive at several thousand i/o''s per second, and 200+mb/sec. How can a nas box come even remotely close to that? Single threaded gig-e using NFS maxes out at around 45MB/second. That''s around 20% of what a local array can do.

What stack are you using to test this? NFSv4 on Solaris 10 gig/E should be much much much faster than 45 MB/s.

> On an unrelated note, the problem I see with big drives is how to back up those drives in a reasonable amount of time. Drive spindles aren''t getting faster, which means that as drives get bigger, the amount of time to back them up is linear.

Traditional backup methods are running out of steam but software like SAM/FS helps ... but we''re probably veering off topic.
On Wed, Apr 19, 2006 at 08:29:15AM -0600, Gregory Shaw wrote:
> On Apr 19, 2006, at 12:17 AM, Nicolas Williams wrote:
>
>> On Wed, Apr 19, 2006 at 01:31:43AM -0400, Torrey McMahon wrote:
>>> In some cases we see customers taking a single HW raid array and carving up hundreds of LUNs of small size for multiple systems. Not as often as larger datasets but I got a call about one such setup the other week.
>>
>> This will change, surely. Partly because this way lies madness, partly because ZFS rocks.
>
> Why would this change? When you buy disk resources, you want to use them as effectively as possible. It''s dangerous to have a large number of systems on a large HW array because of SPOF and performance concerns, but it''s entirely reasonable.

Not dangerous as much as difficult to manage.

> It''s far better than local storage.

Yes, but then NAS would probably be fine for any apps with small storage needs. Bring NAS into the picture and you bring ZFS into the picture, with volume management, quotas, snapshots and all that.

> I don''t get the focus on NAS when local disk performance is far better.

Managing local storage is a pain. NAS is much easier. I''ll let someone who has numbers respond to the NAS performance comment.

> On an unrelated note, the problem I see with big drives is how to back up those drives in a reasonable amount of time. Drive spindles aren''t getting faster, which means that as drives get bigger, the amount of time to back them up is linear.

Redundancy (RAID-Z, mirroring, replication) + snapshots and/or backup to disk + infrequent backup to tape is one answer.

Backup/restore has long been a problem, and local storage makes the problem worse, not better.

Nico
 --
On Wed, Torrey McMahon wrote:
> Gregory Shaw wrote:
>>
>> One thing bothers me about NAS and iSCSI. What''s the max performance? On modern arrays and tape drives, the arrays can drive at several thousand i/o''s per second, and 200+mb/sec. How can a nas box come even remotely close to that? Single threaded gig-e using NFS maxes out at around 45MB/second. That''s around 20% of what a local array can do.

I can regularly get 100-110MB/second on gig-e. It has to be fast enough processors on the client and server but it can be done out of the box.

Spencer
On Wed, 19 Apr 2006, Casper.Dik at Sun.COM wrote:
> The DOD supposedly uses firewall complexes consisting of three layers, each using a different vendor.
>
> (It works when they''re in series, not when they''re in parallel)

Avionics systems also use triple redundancy, with different implementations of the same design spec for each instance* to avoid the common failure mode problem described by Al earlier.

* Or at least they did when I last worked on military stuff, a few years ago.

--
Rich Teer, SCNA, SCSA, OpenSolaris CAB member
President, Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
On Apr 19, 2006, at 3:16 PM, Nicolas Williams wrote:> On Wed, Apr 19, 2006 at 08:29:15AM -0600, Gregory Shaw wrote: >> >> On Apr 19, 2006, at 12:17 AM, Nicolas Williams wrote: >> >>> On Wed, Apr 19, 2006 at 01:31:43AM -0400, Torrey McMahon wrote: >>>> In some cases we see customers taking a single HW raid array and >>>> carving >>>> up hundreds of LUNs of small size for multiple systems. Not as >>>> often as >>>> larger datasets but I got a call about one such setup the other >>>> week. >>> >>> This will change, surely. Partly because this way lies madness, >>> partly >>> because ZFS rocks. >> >> Why would this change? When you buy disk resources, you want to use >> them as effectively as possible. It''s dangerous to have a large >> number of systems on a large HW array because of SPOF and performance >> concerns, but it''s entirely reasonable. > > Not dangerous as much as difficult to manage. >Agreed. By dangerous, I meant lots of systems going to a single array. It all goes away together...>> It''s far better than local storage. > > Yes, but then NAS would probably be fine for any apps with small > storage > needs. Bring NAS into the picture and you bring ZFS into the picture, > with volume management, quotas, snapshots and all that. >Perhaps. I think the local storage on most hosts today are sufficient for small storage needs.>> I don''t get the focus on NAS when local disk performance is far >> better. > > Managing local storage is a pain. NAS is much easier. I''ll let > someone > who has numbers respond to the NAS performance comment. >Managing NAS instead of SAN seems to be the same to me, if not worse.>> On an unrelated note, the problem I see with big drives is how to >> back up those drives in a reasonable amount of time. Drives spindles >> aren''t getting faster, which means that as drives get bigger, the >> amount of time to back them up is linear. > > Redundancy (RAID-Z, mirroring, replication) + snapshots and/or > backup to > disk + infrequent backup to tape is one answer. >Even with snapshots, you''ve got to back it up to tape. That involves the same disks, so it helps in an application sense (no downtime due to snapshot), but it doesn''t impact the need to back everything up.> Backup/restore has long been a problem, and local storage makes the > problem worse, not better. >When you say NAS, do you mean appliances, or servers? It changes the picture significantly between the two.> Nico > ------- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive MS 4382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
Is that NAS or iSCSI? On Apr 19, 2006, at 3:36 PM, Spencer Shepler wrote:> On Wed, Torrey McMahon wrote: >> Gregory Shaw wrote: >>> >>> One thing bothers me about NAS and iSCSI. What''s the max >>> performance? On modern arrays and tape drives, the arrays can drive >>> at several thousand i/o''s per second, and 200+mb/sec. How can a nas >>> box come even remotely close to that? Single threaded gig-e using >>> NFS maxes out at around 45MB/second. That''s around 20% of what a >>> local array can do. > > I can regularly get 100-110MB/second on gig-e. It has to be fast > enough processors on the client and server but it can be done > out of the box. > > Spencer----- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive MS 4382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
NAS - NFSv3 or NFSv4... On Wed, Gregory Shaw wrote:> Is that NAS or iSCSI? > > On Apr 19, 2006, at 3:36 PM, Spencer Shepler wrote: > > >On Wed, Torrey McMahon wrote: > >>Gregory Shaw wrote: > >>> > >>>One thing bothers me about NAS and iSCSI. What''s the max > >>>performance? On modern arrays and tape drives, the arrays can drive > >>>at several thousand i/o''s per second, and 200+mb/sec. How can a nas > >>>box come even remotely close to that? Single threaded gig-e using > >>>NFS maxes out at around 45MB/second. That''s around 20% of what a > >>>local array can do. > > > >I can regularly get 100-110MB/second on gig-e. It has to be fast > >enough processors on the client and server but it can be done > >out of the box. > > > >Spencer > > ----- > Gregory Shaw, IT Architect > Phone: (303) 673-8273 Fax: (303) 673-8273 > ITCTO Group, Sun Microsystems Inc. > 1 StorageTek Drive MS 4382 greg.shaw at sun.com (work) > Louisville, CO 80028-4382 shaw at fmsoft.com (home) > "When Microsoft writes an application for Linux, I''ve Won." - Linus > Torvalds > >
In my testing, I''ve found that a single task can write no faster than 45mb/sec on nfsv3. I don''t know if v4 is faster -- I don''t have the infrastructure for that at this time. This was on a bluearc titan NAS fileserver, which is capable of far beyond gig-e throughput. When you start running multiple threads, it''s possible get better throughput, but I tend to think in single processes, like a batch job. If it can''t write faster than 45mb/sec, it doesn''t matter what else may be occurring on the same system -- it''s limited by the single process throughput. What were you using for testing? On Apr 20, 2006, at 9:08 AM, Spencer Shepler wrote:> > NAS - NFSv3 or NFSv4... > > On Wed, Gregory Shaw wrote: >> Is that NAS or iSCSI? >> >> On Apr 19, 2006, at 3:36 PM, Spencer Shepler wrote: >> >>> On Wed, Torrey McMahon wrote: >>>> Gregory Shaw wrote: >>>>> >>>>> One thing bothers me about NAS and iSCSI. What''s the max >>>>> performance? On modern arrays and tape drives, the arrays can >>>>> drive >>>>> at several thousand i/o''s per second, and 200+mb/sec. How can >>>>> a nas >>>>> box come even remotely close to that? Single threaded gig-e >>>>> using >>>>> NFS maxes out at around 45MB/second. That''s around 20% of what a >>>>> local array can do. >>> >>> I can regularly get 100-110MB/second on gig-e. It has to be fast >>> enough processors on the client and server but it can be done >>> out of the box. >>> >>> Spencer >> >> ----- >> Gregory Shaw, IT Architect >> Phone: (303) 673-8273 Fax: (303) 673-8273 >> ITCTO Group, Sun Microsystems Inc. >> 1 StorageTek Drive MS 4382 greg.shaw at sun.com (work) >> Louisville, CO 80028-4382 shaw at fmsoft.com (home) >> "When Microsoft writes an application for Linux, I''ve Won." - Linus >> Torvalds >> >> > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss----- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
Roch Bourbonnais - Performance Engineering
2006-Apr-20 16:40 UTC
[zfs-discuss] Re: Sun JBOD setup
Gregory Shaw writes: > In my testing, I''ve found that a single task can write no faster than > 45mb/sec on nfsv3. I don''t know if v4 is faster -- I don''t have the > infrastructure for that at this time. This was on a bluearc titan > NAS fileserver, which is capable of far beyond gig-e throughput. > > When you start running multiple threads, it''s possible get better > throughput, but I tend to think in single processes, like a batch > job. If it can''t write faster than 45mb/sec, it doesn''t matter what > else may be occurring on the same system -- it''s limited by the > single process throughput. Strange, was the problem investigated ? Something certainly went wrong. -r
On Thu, Gregory Shaw wrote:
> In my testing, I''ve found that a single task can write no faster than 45mb/sec on nfsv3. I don''t know if v4 is faster -- I don''t have the infrastructure for that at this time. This was on a bluearc titan NAS fileserver, which is capable of far beyond gig-e throughput.
>
> When you start running multiple threads, it''s possible get better throughput, but I tend to think in single processes, like a batch job. If it can''t write faster than 45mb/sec, it doesn''t matter what else may be occurring on the same system -- it''s limited by the single process throughput.
>
> What were you using for testing?

Ah, I have to describe my treachery. :-)

So, the clients and servers were 2-way opteron boxes and using tmpfs on the server for the filesystem. I used dd on the client to generate the i/o. The tmpfs usage was just to make it convenient to remove the issues of filesystem tuning. But I have seen a written report describing a Solaris server with local/FC attached nvram cached storage that demonstrated the same type of throughput. Anyway.

As you have noted, the issue is queueing. If the client can effectively queue i/o to the server at an appropriate queue depth, then most servers at this date can generate a very good overall effective throughput. The main problem with NFS implementations is that they are not as effective as they could be at queueing requests. To overcome the single threaded nature of an application, most NFS clients will use async or helper threads in the kernel to drive up the queue depth of i/o at the server.

So, my numbers were an attempt to demonstrate that NFS is capable of reasonable throughput.

Spencer
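For reference, the kind of single-stream test being described is easy to reproduce; a sketch, with an arbitrary server path, mount point and transfer size:

  # on the client: write 1GB to an NFS mount and time it
  mount -F nfs server:/export/test /mnt
  time dd if=/dev/zero of=/mnt/ddtest bs=1024k count=1024

  # read it back (beware of client-side caching on a second pass)
  time dd if=/mnt/ddtest of=/dev/null bs=1024k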
I wasn''t able to find a better throughput solution. I''ll have to recreate the test, as if I encountered it, I have to think that others encounter it regularly. I did some serious research on this about a year and a half ago. You can get better with some tuning (water marks, jumbo frames, etc.), but it generally drops to 45MB/sec on a single threaded process. Of course, the hardware in question has a huge impact. Faster servers and more intelligent gig-e cards (that don''t drown the CPU in interrupts) make a big difference. On Apr 20, 2006, at 10:40 AM, Roch Bourbonnais - Performance Engineering wrote:> > Gregory Shaw writes: >> In my testing, I''ve found that a single task can write no faster than >> 45mb/sec on nfsv3. I don''t know if v4 is faster -- I don''t have the >> infrastructure for that at this time. This was on a bluearc titan >> NAS fileserver, which is capable of far beyond gig-e throughput. >> >> When you start running multiple threads, it''s possible get better >> throughput, but I tend to think in single processes, like a batch >> job. If it can''t write faster than 45mb/sec, it doesn''t matter what >> else may be occurring on the same system -- it''s limited by the >> single process throughput. > > > Strange, was the problem investigated ? > Something certainly went wrong. > > -r >----- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
On Thu, 2006-04-20 at 10:51 -0600, Gregory Shaw wrote:
> I did some serious research on this about a year and a half ago. You can get better with some tuning (water marks, jumbo frames, etc.), but it generally drops to 45MB/sec on a single threaded process.

45 MBytes/s is suspiciously close to the typical media speed for a single disk. Disks are slower than GbE.
 -- richard
On Thu, 2006-04-20 at 17:40, Spencer Shepler wrote:
> Ah, I have to describe my treachery. :-)
>
> So, the clients and servers were 2-way opteron boxes and using tmpfs on the server for the filesystem.
>
> I used dd on the client to generate the i/o. The tmpfs usage was just to make it convenient to remove the issues of filesystem tuning. But I have seen a written report describing a Solaris server with local/FC attached nvram cached storage that demonstrated the same type of throughput. Anyway.

Hm. 45 still seems low. I could get that for single-threaded reads (from disk, at that) off an E250 5 years ago. Writes then were limited to just over 30, because that''s all you can push into a mirrored A3500.

A couple of years ago I was getting 70MB/s single-threaded writes onto a V240 with an attached SE3310. That was CPU-bound - it would have been interesting to try with Solaris 10, as that''s clearly better. (Although I''m not sure the SE3310 could soak the data up much faster.)

With Solaris 10 what we noticed was that CPU utilization for a given level of network traffic (NFS, specifically) dropped enormously. Essentially, we were down to 1GHz to saturate a gigE network. And in every case I remember, the transfer was then limited by the disk system (or the filesystem).

--
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Nicolas Williams wrote:
> On Wed, Apr 19, 2006 at 08:29:15AM -0600, Gregory Shaw wrote:
>
>> I don''t get the focus on NAS when local disk performance is far better.
>
> Managing local storage is a pain. NAS is much easier. I''ll let someone who has numbers respond to the NAS performance comment.

Managing storage when you''re a NAS client is much easier. Guess what you end up seeing behind a lot of the NAS heads these days? A SAN. :-)
On Thu, Apr 20, 2006 at 05:23:16PM -0400, Torrey McMahon wrote:
> Nicolas Williams wrote:
> > Managing local storage is a pain. NAS is much easier. I''ll let someone who has numbers respond to the NAS performance comment.
>
> Managing storage when you''re a NAS client is much easier. Guess what you end up seeing behind a lot of the NAS heads these days? A SAN. :-)

But the converse, that managing storage when you''re the NAS is harder, is not true. With a NAS server you can have one big volume with whole disks -- no LUNs to maintain -- and use filesystem quotas to manage storage allocations. I.e., storage allocation and storage devices are decoupled.
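In ZFS terms that decoupling is just per-dataset quotas and reservations on top of one pool; a small sketch with invented names:

  # all the disks sit in one pool; allocation is handled per dataset
  zfs create tank/home
  zfs create tank/home/alice
  zfs set quota=100g tank/home/alice          # cap what this user can consume
  zfs set reservation=20g tank/home/alice     # guarantee a minimum

  zfs get quota,reservation tank/home/alice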
Gregory Shaw wrote:
>
> On Apr 19, 2006, at 3:16 PM, Nicolas Williams wrote:
>
>> Redundancy (RAID-Z, mirroring, replication) + snapshots and/or backup to disk + infrequent backup to tape is one answer.
>
> Even with snapshots, you''ve got to back it up to tape. That involves the same disks, so it helps in an application sense (no downtime due to snapshot), but it doesn''t impact the need to back everything up.

Do you? Maybe you can just take the snapshots? Maybe you just keep them on disk someplace? Maybe you really don''t have to archive as much as you need to?

One of my favorite past times is getting the CFO and legal types to answer the "What are the data retention requirements?" instead of the CIO types. Ask most IT departments and they''ll say, "Level 0 every month, clone tapes off site, incremental every day." The CFO/Legal folks always have better answers that meet the real business requirements. (Well...not always but I think you get my meaning.)

>> Backup/restore has long been a problem, and local storage makes the problem worse, not better.
>
> When you say NAS, do you mean appliances, or servers? It changes the picture significantly between the two.

What''s the difference?
Funny, that''s what ZFS does as well. On Apr 20, 2006, at 3:25 PM, Nicolas Williams wrote:> On Thu, Apr 20, 2006 at 05:23:16PM -0400, Torrey McMahon wrote: >> Nicolas Williams wrote: >>> Managing local storage is a pain. NAS is much easier. I''ll let >>> someone >>> who has numbers respond to the NAS performance comment. >> >> Managing storage when you''re a NAS client is much easier. Guess >> what you >> end up seeing behind a lot of the NAS heads these days? A SAN. :-) > > But the converse, that managing storage when you''re the NAS is > harder is > not true. With a NAS server you can have one big volume with whole > disks -- no LUNs to maintain -- and use filesystem quotas to manage > storage allocations. I.e., storage allocation and storage devices are > decoupled.----- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
On Apr 20, 2006, at 3:28 PM, Torrey McMahon wrote:> Gregory Shaw wrote: >> >> On Apr 19, 2006, at 3:16 PM, Nicolas Williams wrote: >> >> >>> >>> Redundancy (RAID-Z, mirroring, replication) + snapshots and/or >>> backup to >>> disk + infrequent backup to tape is one answer. >>> >> >> Even with snapshots, you''ve got to back it up to tape. That >> involves the same disks, so it helps in an application sense (no >> downtime due to snapshot), but it doesn''t impact the need to back >> everything up. > > Do you? Maybe you can just take the snapshots? Maybe you just keep > them on disk someplace? Maybe you really don''t have to archive as > much as you need to?> One of my favorite past times is getting the CFO and legal types to > answer the "What are the data retention requirements?" instead of > the CIO types. Ask most IT departments and they''ll say, "Level 0 > every month, clone tapes off site, incremental every day." The CFO/ > Legal folks always have better answers that meet the real business > requirements. (Well...not always but I think you get my meaning.) >With Sarbanes-Oxley, most companies are going far the other direction -- keep everything for 7-27 years. In my experience, engineering wants their data offsite for 7 years, while core business (such as ERP systems) aim higher, such as 27 years. Keeping things on disk doesn''t address disaster recovery either.>> >>> Backup/restore has long been a problem, and local storage makes the >>> problem worse, not better. >>> >> >> When you say NAS, do you mean appliances, or servers? It changes >> the picture significantly between the two. > > > Whats the difference? >----- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
On Thu, Apr 20, 2006 at 04:15:13PM -0600, Gregory Shaw wrote:> Funny, that''s what ZFS does as well.That''s the point. I may be speaking of NASes generally, but in mind I have Solaris, ZFS, NFSv4, CIFS.> On Apr 20, 2006, at 3:25 PM, Nicolas Williams wrote: > >But the converse, that managing storage when you''re the NAS is > >harder is > >not true. With a NAS server you can have one big volume with whole > >disks -- no LUNs to maintain -- and use filesystem quotas to manage > >storage allocations. I.e., storage allocation and storage devices are > >decoupled.
> In my experience, engineering wants their data offsite for 7 years, while core business (such as ERP systems) aim higher, such as 27 years.

27? Better use printers and ink then.

There''s no backup media I know of that will live that long.

(Spinning rust is the only one which will survive, as long as the data is migrated to new technology every 3-5 years.)

Don''t expect any backup media to be readable after 5-10 years (if you can find the drives, the media will have perished).

Casper
> With Sarbanes-Oxley, most companies are going far the other direction > -- keep everything for 7-27 years. > > In my experience, engineering wants their data offsite for 7 years, > while core business (such as ERP systems) aim higher, such as 27 years. > > Keeping things on disk doesn''t address disaster recovery either. > ----- > Gregory Shaw, IT Architect > Phone: (303) 673-8273 Fax: (303) 673-8273 > ITCTO Group, Sun Microsystems Inc. > 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) > Louisville, CO 80028-4382 shaw at fmsoft.com (home) > "When Microsoft writes an application for Linux, I''ve Won." - Linus > Torvalds >The problem here is that both Engineering and Business generally say "keep it all". That''s idiotic. Even with S-OX. Take us here in JavaSoft for example. A good chunk of the email should be kept according to SOX requirements. As should core business data, such as what goes into customer and ordering DBs. BUT, the vast majority of daily work has no need to be kept, even with SOX coming in now. Certainly, all the Dev and QA work has much lower retention requirements. In a company that has enough $$$ to contemplate a SAN, multi-site disk redundancy is well within the cost horizon, and can cover the needs of Engineering for both backup and redundancy quite well. ZFS (and similar) snapshot capability and WAN mirroring via a SAN work perfectly for backup and D-R. Periodic archive of important long-term data is still required, but that is a VERY small amount compared to the transient data volume. Back to us here in JavaSoft. We generate about 10TB/year in build and testing binaries. For example, there are weekly code snapshots of our Mustang (JDK6) work, which are then built on multiple architectures and run through QA. Now, we need to keep the code snapshot around to do regression analysis, and it might be good to keep the built binaries, but neither have long-term requirements. When 6.0 finally ships later this year, we can effectively dump virtually all the build/test binaries and related test data, as it can be regenerated at will. In reality, we probably need to keep about ~50% of our data less than 1 year, ~40% for 1-3 years, and less than 5% for longer than 3. I can''t see this a atypical for an engineering department. People continually confuse archival, backup, and redundancy (disaster recovery) as the same thing. Mgmt as a whole (speaking about the business world in general) really needs to have this drilled into their skulls - one system does NOT fulfill all three purposes. While it is possible to have a single system for all 3, it is severely sub-optimal in time, cost, and effort. Erik Trimble Java System Support Mailstop: usca14-102 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800)
Casper.Dik at sun.com wrote:
>> In my experience, engineering wants their data offsite for 7 years, while core business (such as ERP systems) aim higher, such as 27 years.
>
> 27? Better use printers and ink then.
>
> There''s no backup media I know of that will live that long.
>
> (Spinning rust is the only one which will survive, as long as the data is migrated to new technology every 3-5 years.)
>
> Don''t expect any backup media to be readable after 5-10 years (if you can find the drives, the media will have perished).

Well, I have plenty of 20+ year old CDs that aren''t showing any signs of degradation and are all still readable on new commodity hardware today, but I''m not going to debate about the longevity of a single piece of media.

Dana
Archival quality Tape will reliably last at least 10 years if stored properly. Finding a tape drive to read it, however, is a severe problem. :-) Long-term data storage is a problem. The best solution I''ve seen is Magneto-Optical stuff (the media is resistant to all common problems), but capacities suck, and finding old readers is problematic. Outside that, if you are truly worried about archival, then mastering a DVD is the best option. Mastered (i.e. pressed) DVD/CDs will last 50 years or more with proper storage, and we''ll probably have a better chance finding an operational reader for the format in 2050 than any other media. -Erik On Fri, 2006-04-21 at 00:45 +0200, Casper.Dik at Sun.COM wrote:> >In my experience, engineering wants their data offsite for 7 years, > >while core business (such as ERP systems) aim higher, such as 27 years. > > 27? Better use printers and ink then. > > There''s no backup media I know off that will live that long. > > (Spinning rust is the only one which will survive as long as the > data is migrated to new technology every 3-5 years) > > Don''t expect any backup media to be readable after 5-10 years > (if you can find the drives, the media will have perished) > > Casper > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss-- Erik Trimble Java System Support Mailstop: usca14-102 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800)
Wow, what an email. See my comments below: On Apr 20, 2006, at 4:57 PM, Erik Trimble wrote:> >> With Sarbanes-Oxley, most companies are going far the other direction >> -- keep everything for 7-27 years. >> >> In my experience, engineering wants their data offsite for 7 years, >> while core business (such as ERP systems) aim higher, such as 27 >> years. >> >> Keeping things on disk doesn''t address disaster recovery either. >> ----- >> Gregory Shaw, IT Architect >> Phone: (303) 673-8273 Fax: (303) 673-8273 >> ITCTO Group, Sun Microsystems Inc. >> 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) >> Louisville, CO 80028-4382 shaw at fmsoft.com (home) >> "When Microsoft writes an application for Linux, I''ve Won." - Linus >> Torvalds >> > > > The problem here is that both Engineering and Business generally say > "keep it all". That''s idiotic. Even with S-OX. >I totally agree. Being in IT, I''ve got to design the solutions that have to deal with it. Data never gets smaller.> Take us here in JavaSoft for example. A good chunk of the email should > be kept according to SOX requirements. As should core business data, > such as what goes into customer and ordering DBs. > > BUT, the vast majority of daily work has no need to be kept, even with > SOX coming in now. Certainly, all the Dev and QA work has much lower > retention requirements. > > > In a company that has enough $$$ to contemplate a SAN, multi-site disk > redundancy is well within the cost horizon, and can cover the needs of > Engineering for both backup and redundancy quite well. ZFS (and > similar) > snapshot capability and WAN mirroring via a SAN work perfectly for > backup and D-R. Periodic archive of important long-term data is still > required, but that is a VERY small amount compared to the transient > data > volume. > > > Back to us here in JavaSoft. We generate about 10TB/year in build and > testing binaries. For example, there are weekly code snapshots of our > Mustang (JDK6) work, which are then built on multiple architectures > and > run through QA. Now, we need to keep the code snapshot around to do > regression analysis, and it might be good to keep the built binaries, > but neither have long-term requirements. When 6.0 finally ships later > this year, we can effectively dump virtually all the build/test > binaries > and related test data, as it can be regenerated at will. > > In reality, we probably need to keep about ~50% of our data less > than 1 > year, ~40% for 1-3 years, and less than 5% for longer than 3. I can''t > see this a atypical for an engineering department. > > > People continually confuse archival, backup, and redundancy (disaster > recovery) as the same thing. Mgmt as a whole (speaking about the > business world in general) really needs to have this drilled into > their > skulls - one system does NOT fulfill all three purposes. While it is > possible to have a single system for all 3, it is severely sub-optimal > in time, cost, and effort. > > > > Erik Trimble > Java System Support > Mailstop: usca14-102 > Phone: x17195 > Santa Clara, CA > Timezone: US/Pacific (GMT-0800) >I see two solutions here: 1. What you''re talking about at a basic level is Information Lifecycle Management (ILM). If we had defined policies around data retention, automation for the data policies can be implemented. However, without policies, you''re stuck with an all-or-nothing view by the business which translates into ''all''. Everything has to be backed up, everything has to be offsite in multiple copies forever. 
If we had policies, we could use ILM to migrate the data (via SAMFS or another solution) from tier to tier of storage. That would significantly reduce the ongoing cost. 2. In your particular code case, you might want to look at the Intellistore storage solution. It''s part of the STK purchase. It allows you to define storage policies for data retention. In other words, you configure a NFS share with a defined data lifetime. It will guarantee that the data will not be touched and will live only as long as you need it. I find in the unix admin space that everybody wants to have the data go away. However, we''re not very good at following through when the data has expired and should be deleted. ----- Gregory Shaw, IT Architect Phone: (303) 673-8273 Fax: (303) 673-8273 ITCTO Group, Sun Microsystems Inc. 1 StorageTek Drive ULVL4-382 greg.shaw at sun.com (work) Louisville, CO 80028-4382 shaw at fmsoft.com (home) "When Microsoft writes an application for Linux, I''ve Won." - Linus Torvalds
On Apr 20, 2006, at 4:08 PM, Dana H. Myers wrote:
> Casper.Dik at sun.com wrote:
>>> In my experience, engineering wants their data offsite for 7 years, while core business (such as ERP systems) aim higher, such as 27 years.
>>
>> 27? Better use printers and ink then.
>>
>> There''s no backup media I know of that will live that long.
>>
>> (Spinning rust is the only one which will survive, as long as the data is migrated to new technology every 3-5 years.)
>>
>> Don''t expect any backup media to be readable after 5-10 years (if you can find the drives, the media will have perished).
>
> Well, I have plenty of 20+ year old CDs that aren''t showing any signs of degradation and are all still readable on new commodity hardware today, but I''m not going to debate about the longevity of a single piece of media.
>
> Dana

Yes, but I''m sure those were CDs where the data was physically stamped into the metal.

Re/Writable CDs used laser-heated dyes and won''t last anywhere near as long.

ckl
Chad Lewis wrote:> > On Apr 20, 2006, at 4:08 PM, Dana H. Myers wrote: > >> Casper.Dik at sun.com wrote: >>>> In my experience, engineering wants their data offsite for 7 years, >>>> while core business (such as ERP systems) aim higher, such as 27 years. >>> >>> 27? Better use printers and ink then. >>> >>> There''s no backup media I know off that will live that long.>> Well, I have plenty of 20+ year old CDs that aren''t showing any >> signs of degradation and are all still readable on new commodity hardware >> today, but I''m not going to debate about the longevity of a single piece >> of media. >> >> Dana >> > > Yes, but I''m sure those were CDs where the data was physically stamped > into the metal. > > Re/Writable CDs used laser-heated dyes and won''t last anywhere near as > long.Of course. If you''re looking for long-term retention, you wouldn''t use -R or -R/W. Dana
> Well, I have plenty of 20+ year old CDs that aren''t showing any signs of degradation and are all still readable on new commodity hardware today, but I''m not going to debate about the longevity of a single piece of media.

20 year old CD-Rs? Or 20 year old CDs? (I''m not sure that writable CDs even existed, 20 years ago)

CDs will generally live that long; CD-Rs do not.

Casper
Casper.Dik at Sun.COM wrote:>> Well, I have plenty of 20+ year old CDs that aren''t showing any >> signs of degradation and are all still readable on new commodity hardware >> today, but I''m not going to debate about the longevity of a single piece >> of media. > > 20 year old CD-Rs? Or 20 year old CDs? (I''m not sure that writable CDs > even existed, 20 years ago)You didn''t specify the difference; CDs are as much media as CD-Rs are. I''m aware that CDs take longer and cost more to write than CD-Rs; but they''re media all the same. However, none of this matters; maintaining a data archive is much more than making a copy of the data and storing it a long time. Dana
> You didn''t specify the difference; CDs are as much media as CD-Rs are. I''m aware that CDs take longer and cost more to write than CD-Rs; but they''re media all the same.

Well, they''re not "backup media" by any standard.

But the optical disks, even pressed, are useless. They don''t store enough data; not currently anyway. And the gap is only worsening.

Casper
Dana H. Myers wrote:
> Casper.Dik at Sun.COM wrote:
>>> Well, I have plenty of 20+ year old CDs that aren''t showing any signs of degradation and are all still readable on new commodity hardware today, but I''m not going to debate about the longevity of a single piece of media.
>>
>> 20 year old CD-Rs? Or 20 year old CDs? (I''m not sure that writable CDs even existed, 20 years ago)
>
> You didn''t specify the difference; CDs are as much media as CD-Rs are. I''m aware that CDs take longer and cost more to write than CD-Rs; but they''re media all the same.

The difference matters a lot. There is a huge difference in the quality and the failure characteristics between pressed (ie in a proper manufacturing plant) media and those produced in consumer burners. This is basically the difference between CD and CD-R - same applies in the DVD world too.

--
Darren J Moffat
Casper.Dik at sun.com wrote:
>> You didn''t specify the difference; CDs are as much media as CD-Rs are. I''m aware that CDs take longer and cost more to write than CD-Rs; but they''re media all the same.
>
> Well, they''re not "backup media" by any standard.

They may be archive media, though. Backups and archives aren''t necessarily the same thing. If the requirement is 27-year retention, it may in fact become cost-effective to master DVDs and handle them less often, particularly if done in bulk.

> But the optical disks, even pressed, are useless. They don''t store enough data; not currently anyway. And the gap is only worsening.

This is indeed a problem.

Dana
Dana H. Myers
2006-Apr-21 17:48 UTC
Volume of DVDs vs. disk drives (was Re: [zfs-discuss] Re: Sun JBOD setup)
Casper.Dik at sun.com wrote:
>> You didn''t specify the difference; CDs are as much media as CD-Rs are. I''m aware that CDs take longer and cost more to write than CD-Rs; but they''re media all the same.
>
> Well, they''re not "backup media" by any standard.
>
> But the optical disks, even pressed, are useless. They don''t store enough data; not currently anyway. And the gap is only worsening.

Actually, this got me to thinking. If one desires to maintain a long-term archive for SOX, I''m guessing that the archive will be rarely accessed. So it''s perfectly reasonable to master double-sided DVDs and stack them on spindles, and put them in a vault.

Each DVD would require a volume of around 9cc and would store approximately 9.4GB. A common 400GB SATA drive today has a volume of around 394 cc; thus DVDs could contain 394/9 * 9.4GB = 411GB in approximately the same volume. This is just estimating, of course.

Fast forward to Blu-Ray (for example); it seems that a single double-sided Blu-Ray disk could contain 50GB, or something in excess of 2TB in the same volume as a disk drive. I''m assuming that Blu-Ray disks would be pressed metal, could be double-sided, and have the same volume as current DVDs.

Since we''re talking about archiving data for a long period of time, the latency in mastering DVDs is unimportant, and the cost savings over the lifetime of the archive probably more than makes up for the greater cost of mastering the DVDs, particularly if this turns into a common business.

Note that I calculate the volume of a DVD based on the square of the diameter, since circles don''t pack as tightly as squares.

Dana
Nicolas Williams
2006-Apr-21 18:30 UTC
Volume of DVDs vs. disk drives (was Re: [zfs-discuss] Re: Sun JBOD setup)
On Fri, Apr 21, 2006 at 10:48:49AM -0700, Dana H. Myers wrote:> Since we''re talking about archiving data for a long period of time, the > latency in mastering DVDs is unimportant, and the cost savings over the > lifetime of the archive probably more than makes up for the greater cost > of mastering the DVDs, particularly if this turns into a common business.Well, the latency matters because the archive system needs to be able to cache bandwidth * latency. A quick search leads me to think that latency would be somewhere around 10 seconds. For bandwidth = 1TB/day that works out to a reasonably small number (<1/2 TB). OK, yes, the latency in mastering DVDs is unimportant :) Nico --
Nicolas Williams
2006-Apr-21 18:35 UTC
Volume of DVDs vs. disk drives (was Re: [zfs-discuss] Re: Sun JBOD setup)
On Fri, Apr 21, 2006 at 01:30:42PM -0500, Nicolas Williams wrote:> On Fri, Apr 21, 2006 at 10:48:49AM -0700, Dana H. Myers wrote: > > Since we''re talking about archiving data for a long period of time, the > > latency in mastering DVDs is unimportant, and the cost savings over the > > lifetime of the archive probably more than makes up for the greater cost > > of mastering the DVDs, particularly if this turns into a common business. > > Well, the latency matters because the archive system needs to be able to > cache bandwidth * latency. A quick search leads me to think that > latency would be somewhere around 10 seconds. For bandwidth = 1TB/day > that works out to a reasonably small number (<1/2 TB). OK, yes, the > latency in mastering DVDs is unimportant :)Or not, 10 seconds is pressing time. I don''t know how long it takes to make a master. I''m guessing that mastering is not realistic.
Erik Trimble
2006-Apr-21 19:48 UTC
Volume of DVDs vs. disk drives (was Re: [zfs-discuss] Re: Sun JBOD setup)
I''ve done this before for CDs. Mastering a CD (i.e. getting it pressed) usually takes about 2 weeks, provided you''ve already got the account/relationship set up with the vendor. Normally, it''s 1 day to burn THREE CD-Rs of all the data (vendors want multiple verification before they commit to plastic) and ship it out to them. 2-3 days for the mail. 5 business days for them to insert your work into their work queue and make a run of 100. 2-3 days to ship it back to you. Voila!

Also, when you store them, you DON''T store them on a spindle. It''s bad for the CD, and it makes finding the right one more difficult. Generally, I''ve found that they get stored in a standard jewel case, complete with full label (and barcode) in an (acid-free) paper insert (just like liner notes).

In reality, it''s fine for archival. VERY few businesses truly need vast quantities of archival data. Even a company the size of Sun, I''d be surprised if we needed more than 1TB of long-term archives per year. All of JavaSoft probably produces no more than 25GB per year of data that should be archived (source code snapshots of our public releases - nice compression ratios, too), and I''d be surprised if any other Engineering division was significantly different. HR and Sales generate the most data, and even there, it''s all DB files, and that stuff compresses _very_ nicely. :-)

So, for a Fortune 100 company, we''d use somewhere around 200 DVDs per year (assuming some wasted space). That''s about 75 linear feet of shelf space, 6" high, 6" deep, or less than 20 cubic feet. That''s trivial.

And certified archival-quality pressed CDs have GUARANTEED lifespans of 50+ years. I haven''t looked at the corresponding DVDs, but I can''t imagine them being different.

Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Erik Trimble <Erik.Trimble at Sun.COM> wrote:
> Outside that, if you are truly worried about archival, then mastering a DVD is the best option. Mastered (i.e. pressed) DVD/CDs will last 50 years or more with proper storage, and we''ll probably have a better chance finding an operational reader for the format in 2050 than any other media.

This is also true for "burned" DVDs if you use the right media.

Jörg

--
EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
      js at cs.tu-berlin.de (uni)
      schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Chad Lewis <Chad.Lewis at Sun.COM> wrote:
> > Well, I have plenty of 20+ year old CDs that aren't showing any
> > signs of degradation and are all still readable on new commodity
> > hardware today, but I'm not going to debate about the longevity of a
> > single piece of media.
> >
> > Dana
>
> Yes, but I'm sure those were CDs where the data was physically
> stamped into the metal.
>
> Re/Writable CDs used laser-heated dyes and won't last anywhere near
> as long.

I still have a readable Kodak CD from 1992 (when the first CD-Rs came out).

Jörg
On Wed, Apr 26, 2006 at 06:29:57PM +0200, Joerg Schilling wrote:
> Erik Trimble <Erik.Trimble at Sun.COM> wrote:
> > Outside that, if you are truly worried about archival, then mastering a
> > DVD is the best option. Mastered (i.e. pressed) DVD/CDs will last 50
> > years or more with proper storage, and we'll probably have a better
> > chance finding an operational reader for the format in 2050 than any
> > other media.
>
> This is also true for "burned" DVDs in case you use the right media.

Links? (I believe you, I just want to know what media to buy.)
On Wed, 2006-04-26 at 11:51 -0500, Nicolas Williams wrote:
> On Wed, Apr 26, 2006 at 06:29:57PM +0200, Joerg Schilling wrote:
> > Erik Trimble <Erik.Trimble at Sun.COM> wrote:
> > > Outside that, if you are truly worried about archival, then mastering a
> > > DVD is the best option. Mastered (i.e. pressed) DVD/CDs will last 50
> > > years or more with proper storage, and we'll probably have a better
> > > chance finding an operational reader for the format in 2050 than any
> > > other media.
> >
> > This is also true for "burned" DVDs in case you use the right media.
>
> Links? (I believe you, I just want to know what media to buy.)

http://www.delkin.com/delkin_products_archival_gold_dvd.html

Kodak makes some too: http://www.dvd-recordable.org/Article2616.phtml

Google for "DVD gold media" with the additional terms "archive" or
"archival". There's a good selection there.

--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Joerg Schilling
2006-Apr-29 12:07 UTC
Volume of DVDs vs. disk drives (was Re: [zfs-discuss] Re: Sun JBOD setup)
"Dana H. Myers" <Dana.Myers at Sun.COM> wrote:> Fast forward to Blu-Ray (for example); it seems that a single double-sided > Blu-Ray disk could contain 50GB, or something in excess of 2TB in the same > volume as a disk drive. I''m assuming that Blu-Ray disks would be pressed > metal, could be double-sided, and have the same volume as current DVDs.HD-DVD and Blu ray are not using organic dye anymore, they are based on phase change technology afaik. For reallity: Note that support for cdrw did stop recently and that cdrecord is the software that is actively maintained. But HD-DVD and Blu ray drives are expensive (~300 Euro for the laser component only) and for this reason, drive manufacturers do (currently) not give away sample drives for free. So you would need to wait some time until I am able to support the drives with cdrecord. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Nicolas Williams <Nicolas.Williams at Sun.COM> wrote:
> On Wed, Apr 26, 2006 at 06:29:57PM +0200, Joerg Schilling wrote:
> > Erik Trimble <Erik.Trimble at Sun.COM> wrote:
> > > Outside that, if you are truly worried about archival, then mastering a
> > > DVD is the best option. Mastered (i.e. pressed) DVD/CDs will last 50
> > > years or more with proper storage, and we'll probably have a better
> > > chance finding an operational reader for the format in 2050 than any
> > > other media.
> >
> > This is also true for "burned" DVDs in case you use the right media.
>
> Links? (I believe you, I just want to know what media to buy.)

I am sorry, this was Pioneer information I read in 1998, so it is most
likely referring to Pioneer or TDK media.

Jörg
Erik Trimble <Erik.Trimble at Sun.COM> wrote:
> On Wed, 2006-04-26 at 11:51 -0500, Nicolas Williams wrote:
> > On Wed, Apr 26, 2006 at 06:29:57PM +0200, Joerg Schilling wrote:
> > > Erik Trimble <Erik.Trimble at Sun.COM> wrote:
> > > > Outside that, if you are truly worried about archival, then mastering a
> > > > DVD is the best option. Mastered (i.e. pressed) DVD/CDs will last 50
> > > > years or more with proper storage, and we'll probably have a better
> > > > chance finding an operational reader for the format in 2050 than any
> > > > other media.
> > >
> > > This is also true for "burned" DVDs in case you use the right media.
> >
> > Links? (I believe you, I just want to know what media to buy.)
>
> http://www.delkin.com/delkin_products_archival_gold_dvd.html
>
> Kodak makes some too: http://www.dvd-recordable.org/Article2616.phtml

From what I've heard, the "Kodak" media is from MAME Italy - mid-level
quality.

Jörg