Hey folks,

I guess this is an odd question to be asking here, but I could do with some feedback from anybody who's actually using ZFS in anger.

I'm about to go live with ZFS in our company on a new fileserver, but I have some real concerns about whether I can really trust ZFS to keep my data alive if things go wrong. This is a big step for us: we're a 100% Windows company and I'm really going out on a limb by pushing Solaris.

The problems with zpool status hanging concern me, knowing that I can't hot plug drives is an issue, and the long resilver times bug is also a potential problem. I suspect I can work around the hot plug bug with a big warning label on the server, but knowing the pool can hang so easily makes me worry about how well ZFS will handle other faults.

On my drive home tonight I was wondering whether I'm going to have to swallow my pride and order a hardware RAID controller for this server, letting that deal with the drive issues, and just using ZFS as a very basic filesystem.

What has me re-considering ZFS, though, is that on the other hand I know the Thumpers have sold well for Sun, and they pretty much have to use ZFS. So there's a big installed base out there using it, and that base has been using it for a few years. I know from the Thumper manual that you have to unconfigure drives before removal on those servers, which goes a long way towards making me think that should be a relatively safe way to work.

The question is whether I can make a server I can be confident in. I'm now planning a very basic OpenSolaris server just using ZFS as an NFS server; is there anybody out there who can reassure me that such a server can work well and handle real-life drive failures?

thanks,

Ross
Ross wrote:
> Hey folks,
>
> I guess this is an odd question to be asking here, but I could do with some feedback from anybody who's actually using ZFS in anger.
>
> I'm about to go live with ZFS in our company on a new fileserver, but I have some real concerns about whether I can really trust ZFS to keep my data alive if things go wrong. This is a big step for us: we're a 100% Windows company and I'm really going out on a limb by pushing Solaris.
> [...]

Hi

What kind of hardware etc. is the fileserver going to be running, and what zpool layout is being planned?

As for Thumpers, once 138053-02 (the marvell88sx driver patch) releases within the next two weeks (assuming no issues are found), the Thumper platform running the S10 updates will be up to date in terms of marvell88sx driver fixes, which addresses some pretty important issues for Thumper. I strongly suggest applying this patch to Thumpers going forward. U6 will have the fixes by default.

Enda
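As a rough sketch of checking for and applying the patch Enda mentions (the staging directory is just an example path, and these are standard Solaris 10 patch commands, not anything ZFS-specific):

Check whether the patch is already installed:

  # showrev -p | grep 138053

Apply it from wherever the unpacked patch was downloaded to:

  # patchadd /var/tmp/138053-02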
On Thu, Jul 31, 2008 at 16:25, Ross <myxiplx at hotmail.com> wrote:
> The problems with zpool status hanging concern me, knowing that I can't hot plug drives is an issue, and the long resilver times bug is also a potential problem. I suspect I can work around the hot plug drive bug with a big warning label on the server, but knowing the pool can hang so easily makes me worry about how well ZFS will handle other faults.

Other hardware-failure type things can cause what appear to be big problems, too. We have a scsi->sata enclosure here with some embedded firmware, and it's connected to a scsi controller on an x4150. I swapped some disks in the enclosure and updated the controller configuration, then rebooted the controller... and the host box died, because ZFS decided that too many disks were unavailable to continue, so it panicked the box.

At first I thought this behavior was terrible---my server is down!---but on some reflection, it makes sense: it's better to quit before anything else on the filesystem is corrupted rather than write garbage across a whole pool because of controller failure or something to that effect. In any case, I thought you'd be interested in this property of zpools. It's not likely to happen in general (especially with DAS and a dumb controller, like you have), and it's better than the alternative of potentially scribbling on a pool, but other services running on the same box could suffer if you were incautious.

> On my drive home tonight I was wondering whether I'm going to have to swallow my pride and order a hardware raid controller for this server, letting that deal with the drive issues, and just using ZFS as a very basic filesystem.

Letting ZFS handle one layer of redundancy is always recommended, if you're going to use it at all. Otherwise it can get into a situation where it finds checksum errors and can't do anything about them.

> The question is whether I can make a server I can be confident in. I'm now planning a very basic OpenSolaris server just using ZFS as a NFS server, is there anybody out there who can re-assure me that such a server can work well and handle real life drive failures?

We haven't had any "real life" drive failures at work, but at home I took some old flaky IDE drives and put them in a Pentium 3 box running Nevada. Several of them were known to cause errors under Linux, so I mirrored them in approximately-the-same-size pairs and set up weekly scrubs. Two drives out of six failed entirely, and were nicely retired, before I gave up on the idea and bought new disks. I didn't lose any data with this scheme, and ZFS told me every once in a while that it had recovered from a checksum error. Good drives are always recommended, of course, but I saw nothing but good behavior with old broken hardware while I was using it.

Finally, at work we're switching everything over to ZFS because it's so convenient... but we keep tape backups nonetheless. I strongly recommend having up-to-date backups in any situation, but even more so with ZFS. It's been very reliable for me personally and at work, but I've seen horror stories of corrupt pools from which all data is lost. I'd rather be sitting around the campfire quaking in my boots at story time than have a flashlight pointed at my face doing the telling, if you catch my drift.

Will
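Will's home setup (mirrored pairs plus weekly scrubs) can be sketched roughly as follows; the pool and device names here are made up purely for illustration:

Mirror the old drives in similar-sized pairs:

  # zpool create junk mirror c1d0 c2d0 mirror c1d1 c2d1 mirror c3d0 c3d1

Schedule a weekly scrub from root's crontab, for example every Sunday at 03:00:

  0 3 * * 0 /usr/sbin/zpool scrub junk

Then review the results afterwards:

  # zpool status -v junk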
On Thu, 2008-07-31 at 13:25 -0700, Ross wrote:
> Hey folks,
>
> I guess this is an odd question to be asking here, but I could do with some feedback from anybody who's actually using ZFS in anger.

ZFS in anger? That's an interesting way of putting it :-)

> but I have some real concerns about whether I can really trust ZFS to
> keep my data alive if things go wrong. This is a big step for us,
> we're a 100% windows company and I'm really going out on a limb by
> pushing Solaris.

I can appreciate how this could be considered a risk, especially if it is your idea. But let's put this all in perspective and you'll see why it isn't even remotely a question. I have put all sorts of file servers into production with things like Online Disk Suite 1.0 and NFS V1 - and slept like a baby. Now, for the non-historians on the list, the quality of Online Disk Suite 1.0 led directly to the creation of the volume management marketplace and Veritas in particular (hey - that's a joke, OK? But only marginally).

> The question is whether I can make a server I can be confident in.
> I'm now planning a very basic OpenSolaris server just using ZFS as a
> NFS server, is there anybody out there who can re-assure me that such
> a server can work well and handle real life drive failures?

There are two questions in there - can it be built, and are you comfortable with it? Those are two different things.

The simple answer to the first is yes. If this is mission critical (and things like NFS servers generally are - even if they are only serving up iTunes music libraries - ask my daughter), then Enda's point about the Marvell driver updates for Solaris 10 should be carefully considered. If it's just an NFS server then the vast majority of OpenSolaris benefits won't be applicable (newer GNOME, better packaging, better Linux interoperability, etc). Putting this on Solaris 10 with Live Upgrade and a service contract would make me sleep like a baby.

Now, for the other question - if you are looking at this like an appliance then you might not be quite as happy. It does take a little care and feeding, but nearly every piece of technology more complicated than a toaster needs a little love every once in a while. I would much rather put a Solaris/ZFS file server into a Windows environment than a Windows file server into a Unix environment :-)

Bob
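The Live Upgrade workflow Bob refers to boils down to keeping an alternate boot environment to patch and fall back to. A minimal sketch, assuming a spare UFS slice and made-up BE and patch directory names:

Create an alternate boot environment on a spare slice:

  # lucreate -n s10_patched -m /:/dev/dsk/c0t1d0s0:ufs

Patch the inactive BE, activate it, and reboot into it:

  # luupgrade -t -n s10_patched -s /var/tmp/patches 138053-02
  # luactivate s10_patched
  # init 6

If the new environment misbehaves, the previous BE is still there to boot back into.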
> We haven't had any "real life" drive failures at work, but at home I
> took some old flaky IDE drives and put them in a Pentium 3 box running
> Nevada.

Similar story here. Some IDE and SATA drive burps under Linux (and please don't tell me how wonderful Reiser4 is - 'cause it's banned in this house forever.... arrrrgh) and Windows. It ate my entire iTunes library. Yeah, lurve that silent data corruption feature.

> Several of them were known to cause errors under Linux, so I
> mirrored them in approximately-the-same-size pairs and set up weekly
> scrubs. Two drives out of six failed entirely, and were nicely
> retired, before I gave up on the idea and bought new disks.

Pretty cool, eh?

> Finally, at work we're switching everything over to ZFS because it's
> so convenient... but we keep tape backups nonetheless.

A very good idea. Disasters will still occur. With enough storage, snapshots can eliminate the routine file-by-file restores, but a complete meltdown is always a possibility. So backups aren't optional, but I find myself doing very few restores any more.

Bob
On Jul 31, 2008, at 2:56 PM, Bob Netherton wrote:
> On Thu, 2008-07-31 at 13:25 -0700, Ross wrote:
>> I guess this is an odd question to be asking here, but I could do
>> with some feedback from anybody who's actually using ZFS in anger.
>
> ZFS in anger? That's an interesting way of putting it :-)

If you watch Phil Liggett and/or Paul Sherwen commentating on a cycling event, you're pretty much guaranteed to hear "turning the pedals in anger" at some point when a rider goes on the attack.
We have 50,000 users' worth of mail spool on ZFS, so we've been trusting it for production usage for THE most critical and visible enterprise app. Works fine.

Our stores are ZFS RAID-10 built of LUNs from pairs of 3510FC arrays. Had an entire array go down once; the system kept going fine. Brought the array back online and ran a scrub to be certain of the data - it came up clean.

Running a checksum integrity scrub while online, THAT is the killer app that makes me sleep better.
Enda O'Connor wrote:
>
> As for thumpers, once 138053-02 ( marvell88sx driver patch ) releases
> within the next two weeks ( assuming no issues found ), then the thumper
> platform running s10 updates will be up to date in terms of marvell88sx
> driver fixes, which fixes some pretty important issues for thumper.
> Strongly suggest applying this patch to thumpers going forward.
> u6 will have the fixes by default.

I'm assuming the fixes listed in these patches are already committed in OpenSolaris (b94 or greater)?

--
Dave
>>>>> "r" == Ross <myxiplx at hotmail.com> writes:

    r> This is a big step for us, we're a 100% windows company and
    r> I'm really going out on a limb by pushing Solaris.

I'm using it in anger. I'm angry at it, and can't afford anything that's better. Whatever I replaced ZFS with, I would make sure it had:

 * snapshots

 * weekly scrubbing

 * dual-parity, to make the rebuild succeed after a disk fails in case the frequent scrubbing is not adequate, and also to deal with the infant-mortality problem and the relatively high 6% annual failure rate

 * checksums (block- or filesystem-level, either one is fine)

 * a fix for the RAID5 write hole (either FreeBSD-style RAID3, which is analogous to the ZFS full-stripe-write approach, or battery-backed NVRAM)

 * being built from only drives that have been burned in for 1 month

ZFS can have all those things, except the weekly scrubbing. I'm sure the scrubbing works really well for some people like Vincent, but for me it takes much longer than scrubbing took with pre-ZFS RAID, and increases filesystem latency a lot more, too. This is probably partly my broken iSCSI setup, but I'm not sure. I'm having problems where the combined load of 'zpool scrub' and some filesystem activity bogs down the Linux iSCSI targets so much that ZFS marks the whole pool faulted, so I have to use the pool "gently" during scrub. :(

RAID-on-a-card doesn't usually have these bullet points, so I would use ZFS over RAID-on-a-card. There are too many horror stories about those damn cards, even the "good" ones. Even if they worked well, which in my opinion they do not, they make getting access to your pool dependent on getting replacement cards of the same vintage, and on getting the right drivers for this proprietary, obscure card for the (possibly just re-installed different version of) the OS, possibly cards with silently-different "steppings" or "firmware revisions" or some other such garbage. Also, with RAID-on-a-card there is no clear way to get a support contract that stands behind the whole system, in terms of the data's availability, either. With Sun ZFS stuff there sort-of is, and there definitely is with a traditional storage hardware vendor, so optimistically, even if you are not covered by a contract yourself because you downloaded Solaris or bought a Filer on eBay, some other customer is, so the product (optimistically) won't make some colossally stupid mistakes that some RAID-on-a-card companies make. I would stay well away from that card crap.

Many ZFS problems discussed here sound like the fixes are going into s10u6, so they are not available on Solaris 10 yet, and are drastic enough to introduce some regressions. I don't think ZFS in stable Solaris will be up to my stability expectations until the end of the year---for now, "that's fixed in weeks-old b94" probably doesn't fit your application. Maybe for a scrappy super-competitive high-roller shared hosting shop, but not for a plodding Windows shop. And having fully-working drivers for the X4500 right after its replacement is announced makes me think maybe you should buy an X4500, not the replacement. :(

ZFS has been included in stable Solaris for two full years already, and you're still asking questions about it. The Solaris CIFS server I've never tried, but it is even newer, so I think you would be crazy to make yourself the black sheep pushing that within a conservative, hostile environment. If you have some experience with Samba in your environment, maybe that's OK to use in place of CIFS.
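For what it's worth, a dual-parity ZFS pool covering most of that list is a one-liner to create; the pool and device names below are placeholders:

Six-disk raidz2 (any two disks can fail) plus a hot spare:

  # zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 spare c1t6d0

Checksums are on by default; scrubbing still has to be run (or scheduled) by hand:

  # zpool scrub tank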
If you want something more out-of-the-box than Samba, you could get a NetApp StoreVault. I've never had one myself, though, so maybe I'll regret having this suggestion archived on the Interweb forever. I think that unlike Samba the StoreVault can accommodate the Windows security model without kludginess. To my view that's not necessarily a good thing, but it IS probably what a Windows shop wants. The StoreVault has all those reliability bullet points above, AIUI. It's advertised as a crippled version of their real Filer's software. It may annoy you by missing certain basic, dignified features (like it is web-managed only?!, and maybe you have to pay more to "unlock" the snapshot feature with some stupid registration code), but it should have most of the silent reliability/availability tricks that are in the higher-end NetApps.

Something cheaper than NetApp, like the Adaptec SNAP filer, has snapshots, scrubbing, and I assume a fix for the RAID5 hole, and something like the support-contract-covering-your-data, though obviously not anything to set beside NetApp. Also the Windows-security-model support is kludgy. I'm not sure SNAP has dual-parity or checksums, and I've found it slightly sketchy---it was locking up every week until I forced an XFS fsck, and there is no supported way to force an XFS fsck. Their integration work does seem to hide some of the Linux crappiness, but not all. LVM2 seems to be relatively high-quality on the inside compared to current ZFS.

    r> The problems with zpool status hanging concern me,

Yes. You might distinguish bugs that affect availability from bugs that can cause data loss. The 'zpool status' not always working is half-way in between, because it interferes with responding to failures.

The disk-pulled problems, the slow-mirror-component-makes-whole-mirror-slow problems, and the problems of proper error handling being put off for over two years with the excuse "we're integrating FMA" and then FMA, once integrated, not behaving reasonably, are all in the availability category, so maybe they aren't show-stoppers? For people using ZFS on top of an expensive storage solution, they may not care at all---if there is some weird chain of events leading to an availability problem, use the excuse "you should have paid more and set up multipath"---the availability demands on ZFS are lower with big FC arrays.

However, the reports of "my pool is corrupt, help" / <silence> and "the kernel {panics, runs out of memory and freezes} every time I do XXX"---these scare the shit out of me, because it means you lose your data in this frustrating way, as if it were encrypted by a data-for-ransom Internet worm: some day, maybe a year from now, the bug will be fixed and maybe you can get your data back. In the meantime, you're SOL with thousands of dollars of (possibly leased) disk, while the data is just barely out of reach, perhaps sucking your time away with desperate, futile maybe-this-will-work attempts. I have fairly high confidence I can recover most of the data off an abused UFS-over-SVM mirror with dd and fsck, but I don't have that confidence at all with supposedly "always-consistent" ZFS.
Besides several tiers of storage-layer and ZFS-layer redundancy, experience here suggests you also need rsync-level redundancy---either to another ZFS pool, or to some other cheap backup filesystem; a backup filesystem that might be acceptable even with some of the problems in the bulleted list, like not being dual-parity, not having snapshots, or having a RAID5 write hole (but it still needs to be scrubbed).

If you get an integrated NAS like the StoreVault, the ZFS machine will probably be cheaper, so you could use it as the cheaper backup filesystem---rsync the StoreVault onto the ZFS filesystem every night. You can do this for a couple of years, so you will have a chance to notice if ZFS stability is improving, and maybe conduct more experiments in provoking it.
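A minimal sketch of that nightly rsync, assuming the NAS export is reachable via the automounter, that the backup pool is called backuppool, and that rsync lives in /usr/bin (all of these are assumptions to adjust locally):

Root's crontab, pulling the NAS contents onto the ZFS pool every night at 01:30:

  30 1 * * * /usr/bin/rsync -a --delete /net/storevault/vol0/ /backuppool/storevault/

Snapshot the copy afterwards so files removed by --delete are still recoverable (the % signs must be escaped in a crontab):

  30 5 * * * /usr/sbin/zfs snapshot backuppool/storevault@nightly-`date +\%Y\%m\%d`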
Ross wrote:
> Hey folks,
>
> I guess this is an odd question to be asking here, but I could do with some feedback from anybody who's actually using ZFS in anger.

I've been using ZFS for nearly 3 years now. It has been my (mirrored :-) home directory for that time. I've never lost any of that data, though I do spend some time torturing ZFS and hardware. Inside Sun, we use ZFS home directories for a large number of developers, and these servers are upgraded every build. As marketing would say, we eat our own dog food.

> I'm about to go live with ZFS in our company on a new fileserver, but I have some real concerns about whether I can really trust ZFS to keep my data alive if things go wrong. This is a big step for us, we're a 100% windows company and I'm really going out on a limb by pushing Solaris.

I'm not that familiar with running Windows file systems for large numbers of users, but my personal experience with them has been fraught with data loss and, going back a few years, "ABORT, RETRY, GIVE UP".

> The problems with zpool status hanging concern me, knowing that I can't hot plug drives is an issue, and the long resilver times bug is also a potential problem. I suspect I can work around the hot plug drive bug with a big warning label on the server, but knowing the pool can hang so easily makes me worry about how well ZFS will handle other faults.

While you've demonstrated hot unplug problems with USB drives, that is a very different software path than the more traditional hot-plug SAS/FC/UltraSCSI devices. USB devices are considered removable media and have a very different use case than what is normally considered for enterprise-class storage devices.

> On my drive home tonight I was wondering whether I'm going to have to swallow my pride and order a hardware raid controller for this server, letting that deal with the drive issues, and just using ZFS as a very basic filesystem.

If you put all of your trust in the hardware RAID controller, then one day you may be disappointed. This is why we tend to recommend using some sort of data protection at the ZFS level, regardless of the hardware. If you look at this forum's archive, you will see someone who has discovered a faulty RAID controller, switch, HBA, or some other device by using ZFS. With other file systems, it would be difficult to isolate the fault.

> What has me re-considering ZFS though is that on the other hand I know the Thumpers have sold well for Sun, and they pretty much have to use ZFS. So there's a big installed base out there using it, and that base has been using it for a few years. I know from the Thumper manual that you have to unconfigure drives before removal on those servers, which goes a long way towards making me think that should be a relatively safe way to work.

You can run Windows, RHEL, FreeBSD, and probably another dozen or two OSes on Thumpers. We have customers who run many different OSes on our open and industry-standard hardware.

> The question is whether I can make a server I can be confident in. I'm now planning a very basic OpenSolaris server just using ZFS as a NFS server, is there anybody out there who can re-assure me that such a server can work well and handle real life drive failures?

Going back to your USB remove test, if you protect that disk at the ZFS level, such as in a mirror, then when the disk is removed it will be detected as removed, and zpool status will show its state as "removed" and the pool as "degraded", but it will continue to function, as expected.
Replacing the USB device will bring it back online, again as expected, and it should resilver automatically. To reiterate, it is best to let ZFS do the data protection regardless of the storage used.
 -- richard
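In practice the re-attach Richard describes looks roughly like this; the pool and device names are only examples:

See which pools and devices ZFS thinks are unhealthy:

  # zpool status -x

After re-inserting the same disk, bring it back into service if it stayed offline:

  # zpool online tank c1t0d0

Or, if a brand new disk went into that slot, trigger a resilver onto it:

  # zpool replace tank c1t0d0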
> Going back to your USB remove test, if you protect that disk
> at the ZFS level, such as a mirror, then when the disk is removed
> then it will be detected as removed and zpool status will show its
> state as "removed" and the pool as "degraded" but it will continue
> to function, as expected.
> -- richard

Except it doesn't. The reason I'm doing these single disk tests is that pulling a single SATA drive out of my main pool (5 sets of 3-way mirrors) hangs the whole pool (or, if I set failmode=continue, crashes Solaris, even though it's a data pool and holds nothing the OS needs at all).

I also saw before with mirrored iSCSI drives that pulling the network cable on one hung the ZFS pool for 3 minutes. ZFS handles checksum errors great, but it doesn't seem to cope with the loss of devices at all.
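For anyone following along, on builds that have the failmode pool property the behaviour Ross mentions can be inspected and changed like this (the pool name is just an example):

Show how the pool reacts when all paths to a device are lost:

  # zpool get failmode tank

Switch between wait (the default, blocks I/O), continue (returns errors), or panic:

  # zpool set failmode=continue tank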
On Thu, Jul 31, 2008 at 11:03 PM, Ross <myxiplx at hotmail.com> wrote:
> > Going back to your USB remove test, if you protect that disk
> > at the ZFS level, such as a mirror, then when the disk is removed
> > then it will be detected as removed and zpool status will show its
> > state as "removed" and the pool as "degraded" but it will continue
> > to function, as expected.
> > -- richard
>
> Except it doesn't. The reason I'm doing these single disk tests is that
> pulling a single SATA drive out of my main pool (5 sets of 3-way mirrors)
> hangs the whole pool (or if I set failmode=continue, crashes Solaris, even
> though it's a data pool and holds nothing the OS needs at all).
>
> I also saw before with mirrored iSCSI drives that pulling the network cable
> on one hung the ZFS pool for 3 minutes. ZFS handles checksum errors great,
> but it doesn't seem to cope with the loss of devices at all.

This conversation piques my interest. I have been reading a lot about OpenSolaris/Solaris for the last few weeks, and have even spoken to Sun storage techs about bringing in Thumper/Thor for our storage needs.

I have recently brought online a Dell server with a DAS (14 SCSI drives). This will be part of my tests now: physically removing a member of the pool before issuing the removal command for that particular drive.

One other issue I have now also: how do you physically locate a failing/failed drive in ZFS? With hardware RAID sets, if the RAID controller itself detects the error, it will initiate a BLINK command to that drive, so the individual drive is now flashing red/amber/whatever on the RAID enclosure. How would this be possible with ZFS? Say you have a JBOD enclosure (14, hell maybe 48 drives). Knowing c0d0xx failed is no longer helpful if only ZFS catches an error. Will you be able to isolate the drive quickly, to replace it? Or will you be going "does the enclosure start at logical zero... left to right.. hrmmm"?

Thanks

--
Brent Jones
brent at servuhome.net
Hey Brent,

On the Sun hardware like the Thumper you do get a nice bright blue "ready to remove" LED as soon as you issue the "cfgadm -c unconfigure xxx" command. On other hardware it takes a little more care; I'm labelling our drive bays up *very* carefully to ensure we always remove the right drive. Stickers are your friend: mine will probably be labelled "sata1/0", "sata1/1", "sata1/2", etc.

I know Sun are working to improve the LED support, but I don't know whether that support will ever be extended to 3rd party hardware:
http://blogs.sun.com/eschrock/entry/external_storage_enclosures_in_solaris

I'd love to use Sun hardware for this, but while things like x2200 servers are great value for money, Sun don't have anything even remotely competitive to a standard 3U server with 16 SATA bays. The x4240 is probably closest, but is at least double the price. Even the J4200 arrays are more expensive than this entire server.

Ross

PS. Once you've tested SCSI removal, could you add your results to my thread? Would love to hear how that went.
http://www.opensolaris.org/jive/thread.jspa?threadID=67837&tstart=0

> This conversation piques my interest. I have been reading a lot about OpenSolaris/Solaris for the last few weeks.
> Have even spoken to Sun storage techs about bringing in Thumper/Thor for our storage needs.
> [...]
> --
> Brent Jones
> brent at servuhome.net
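For reference, the unconfigure-before-pull sequence looks roughly like the following; the attachment point (sata1/3) and device name are only examples and vary by controller:

Optionally take the disk out of active use first:

  # zpool offline tank c1t3d0

Unconfigure the SATA port (on a Thumper the blue "ready to remove" LED then lights):

  # cfgadm -c unconfigure sata1/3

Swap the disk, reconfigure the port, and let ZFS resilver onto the new drive:

  # cfgadm -c configure sata1/3
  # zpool replace tank c1t3d0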
Enda O'Connor (Sun Microsystems Ireland) - 2008-Aug-01 09:08 UTC - [zfs-discuss] Can I trust ZFS?
Dave wrote:
>
> Enda O'Connor wrote:
>>
>> As for thumpers, once 138053-02 ( marvell88sx driver patch ) releases
>> within the next two weeks ( assuming no issues found ), then the
>> thumper platform running s10 updates will be up to date in terms of
>> marvell88sx driver fixes, which fixes some pretty important issues for
>> thumper.
>> Strongly suggest applying this patch to thumpers going forward.
>> u6 will have the fixes by default.
>
> I'm assuming the fixes listed in these patches are already committed in
> OpenSolaris (b94 or greater)?
>
> --
> Dave

Yep. I know this is an OpenSolaris list, but a lot of folk asking questions do seem to be running various update releases.

Enda
Hello Ross,

I personally know many environments that have been using ZFS in production for quite some time, quite often in business-critical environments. Some of them are small, some of them are rather large (hundreds of TBs), some of them are clustered. Different usages like file servers, MySQL on ZFS, Oracle on ZFS, mail on ZFS, virtualization on ZFS, ...

So far I haven't seen them losing any data. I hit some issues from time to time, but nothing which couldn't be worked around.

That being said, ZFS is still a relatively young technology, so if your top priority regardless of anything else is stability and confidence, I would go with UFS or VxFS/VxVM, which have been in the market for many, many years and are proven in a lot of deployments.

--
Best regards,
Robert Milkowski             mailto:milek at task.gda.pl
                             http://milek.blogspot.com
I have done a bit of testing, and so far so good really. I have a Dell 1800 with a Perc4e and a 14-drive Dell PowerVault 220S. I have a RAID-Z2 volume named 'tank' that spans 6 drives, and I have made 1 drive available as a spare to ZFS.

Normal array:

# zpool status
  pool: tank
 state: ONLINE
 scrub: scrub completed with 0 errors on Fri Aug  1 19:37:33 2008
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2      ONLINE       0     0     0
            c0t1d0    ONLINE       0     0     0
            c0t2d0    ONLINE       0     0     0
            c0t3d0    ONLINE       0     0     0
            c0t4d0    ONLINE       0     0     0
            c0t5d0    ONLINE       0     0     0
            c0t6d0    ONLINE       0     0     0
        spares
          c0t13d0     AVAIL

errors: No known data errors

One drive removed:

# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Fri Aug  1 20:30:39 2008
config:

        NAME          STATE     READ WRITE CKSUM
        tank          DEGRADED     0     0     0
          raidz2      DEGRADED     0     0     0
            c0t1d0    ONLINE       0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     DEGRADED     0     0     0
              c0t3d0  UNAVAIL      0     0     0  cannot open
              c0t13d0 ONLINE       0     0     0
            c0t4d0    ONLINE       0     0     0
            c0t5d0    ONLINE       0     0     0
            c0t6d0    ONLINE       0     0     0
        spares
          c0t13d0     INUSE     currently in use

errors: No known data errors

Now let's remove the hot spare ;)

# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Fri Aug  1 20:30:39 2008
config:

        NAME          STATE     READ WRITE CKSUM
        tank          DEGRADED     0     0     0
          raidz2      DEGRADED     0     0     0
            c0t1d0    ONLINE       0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     UNAVAIL      0   656     0  insufficient replicas
              c0t3d0  UNAVAIL      0     0     0  cannot open
              c0t13d0 UNAVAIL      0     0     0  cannot open
            c0t4d0    ONLINE       0     0     0
            c0t5d0    ONLINE       0     0     0
            c0t6d0    ONLINE       0     0     0
        spares
          c0t13d0     INUSE     currently in use

errors: No known data errors

Now, this Perc4e doesn't support JBOD, so each drive is a standalone RAID-0 (how annoying). With that, I cannot plug the drives back in with the system running; the controller will keep them offline until I enter the BIOS. But in my scenario, this does demonstrate that ZFS tolerates hot removal of drives without issuing a graceful removal of the device. I was copying MP3s to the volume the whole time, and the copy continued uninterrupted, without error. I verified all data was written as well. All data should be online when I reboot and put the pool back in a normal state.

I am very happy with the test. I don't know many hardware controllers that'll lose 3 drives out of an array of 6 (with spare) and still function normally (even if the controller supports RAID-6, I've seen major issues where writes were not committed).

I'll add my results to your forum thread as well.

Regards

Brent Jones
brent at servuhome.net

On Thu, Jul 31, 2008 at 11:56 PM, Ross Smith <myxiplx at hotmail.com> wrote:
> Hey Brent,
>
> On the Sun hardware like the Thumper you do get a nice bright blue "ready
> to remove" LED as soon as you issue the "cfgadm -c unconfigure xxx"
> command. On other hardware it takes a little more care; I'm labelling our
> drive bays up *very* carefully to ensure we always remove the right drive.
> Stickers are your friend: mine will probably be labelled "sata1/0",
> "sata1/1", "sata1/2", etc.
> [...]
According to the hard disk drive guide at http://www.storagereview.com/guide2000/ref/hdd/index.html, a whopping 36% of data loss is due to human error, and 49% is due to hardware or system malfunction. With proper pool design, ZFS addresses most of the 49% of data loss due to hardware malfunction. You can do as much MTTDL analysis as you want based on drive reliability and read failure rates, but it still only addresses that 49% of data loss.

ZFS makes human error really easy. For example:

  $ zpool destroy mypool
  $ zfs destroy mypool/mydata

The commands are almost instantaneous and are much faster than the classic:

  $ rm -rf /mydata

or

  % newfs /dev/rdsk/c0t0d0s6 < /dev/null

Most problems we hear about on this list are due to one of these issues:

  * Human error
  * Beta-level OS software
  * System memory error (particularly non-ECC memory)
  * Wrong pool design

ZFS is a tool which can lead to exceptional reliability. Some forms of human error can be limited by facilities such as snapshots. System administrator human error is still a major factor.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
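As a rough illustration of the snapshot safety net Bob mentions (the dataset and snapshot names are made up):

Snapshot before doing anything destructive:

  $ zfs snapshot mypool/mydata@before-cleanup

If the rm -rf turns out to be a mistake, roll the filesystem back:

  $ zfs rollback mypool/mydata@before-cleanup

Or just copy individual files back out of the read-only snapshot directory:

  $ ls /mypool/mydata/.zfs/snapshot/before-cleanup/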
On Sun, 2008-08-03 at 11:42 -0500, Bob Friesenhahn wrote:
> Zfs makes human error really easy. For example
>
> $ zpool destroy mypool

Note that "zpool destroy" can be undone by "zpool import -D" (if you get to it before the disks are overwritten).
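For the record, that recovery looks like this (the pool name is just an example):

List destroyed pools that can still be recovered:

  # zpool import -D

Re-import one of them before its disks get reused:

  # zpool import -D mypool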