I've read in numerous threads that it's important to use ECC RAM in a ZFS file server.

My question is: is there any technical reason, in ZFS's design, that makes it particularly important for ZFS to require ECC RAM? Is ZFS especially vulnerable, more so than other filesystems, to bit errors in RAM?

For example, if the wrong bit flips at the wrong time, could I lose my entire RAID-Z pool instead of, say, corrupting one file's contents or metadata? Is there such a possibility? (Assume the rest of the hardware stack "behaves", eg an fsync to the drive won't return until the bytes are written to stable storage.)

I had assumed that a bit error from RAM would only have a localized effect (eg, corrupt the contents or metadata of a file or directory) each time it "struck", but now I'm wondering if the failure could be global because of something in ZFS's design, and that's why the recommendation for ECC RAM is always so "strong".

Some of the posts in this thread ("Another user loses his pool..."):

http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0

make me think ZFS may in fact "require" ECC RAM.
-- 
This message posted from opensolaris.org
On Fri, 24 Jul 2009, Michael McCandless wrote:

> I've read in numerous threads that it's important to use ECC RAM in a
> ZFS file server.
>
> My question is: is there any technical reason, in ZFS's design, that
> makes it particularly important for ZFS to require ECC RAM?
[...]
> Some of the posts in this thread ("Another user loses his pool..."):
>
> http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0
>
> make me think ZFS may in fact "require" ECC RAM.

I don't think it's ZFS per se that "requires" ECC RAM. More likely, it's any application (in the use sense, not a program) that actually cares about being able to detect--and preferably correct--errors in memory.

Given that data integrity is presumably important in every non-gaming computing use, I don't understand why people even consider not using ECC RAM all the time. The hardware cost delta is a red herring: how much would undetected memory errors cost an organisation? That's the true cost of skimping on memory by using non-ECC RAM, IMHO.

HTH,

-- 
Rich Teer, SCSA, SCNA, SCSECA

URLs: http://www.rite-group.com/rich
      http://www.linkedin.com/in/richteer
Michael McCandless wrote:

> I've read in numerous threads that it's important to use ECC RAM in a
> ZFS file server.
>
> My question is: is there any technical reason, in ZFS's design, that
> makes it particularly important for ZFS to require ECC RAM?

I think, basically, the idea is that if you're going to use ZFS to protect your data from this sort of thing through the path to the stable storage, then it seems like a shame (or a waste?) not to equally protect the data both before it's given to ZFS for writing, and after ZFS reads it back and returns it to you.

-Kyle
On Fri, 24 Jul 2009 07:19:40 -0700 (PDT)
Rich Teer <rich.teer at rite-group.com> wrote:

> Given that data integrity is presumably important in every non-gaming
> computing use, I don't understand why people even consider not using
> ECC RAM all the time. The hardware cost delta is a red herring:

I live in Holland and it is not easy to find motherboards that (a) truly support ECC RAM and (b) are (Open)Solaris compatible.

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | nevada / OpenSolaris 2010.02 B118
+ All that's really worth doing is what we do for others (Lewis Carrol)
On Fri, 24 Jul 2009 10:44:36 -0400
Kyle McDonald <KMcDonald at Egenera.COM> wrote:

> ... then it seems like a shame (or a waste?) not to equally
> protect the data both before it's given to ZFS for writing, and after
> ZFS reads it back and returns it to you.

But that was not the question. The question was: [quote] "My question is: is there any technical reason, in ZFS's design, that makes it particularly important for ZFS to require ECC RAM?"

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | nevada / OpenSolaris 2010.02 B118
+ All that's really worth doing is what we do for others (Lewis Carrol)
On Fri, Jul 24, 2009 at 05:01:15PM +0200, dick hoogendijk wrote:

> On Fri, 24 Jul 2009 10:44:36 -0400
> Kyle McDonald <KMcDonald at Egenera.COM> wrote:
>> ... then it seems like a shame (or a waste?) not to equally
>> protect the data both before it's given to ZFS for writing, and after
>> ZFS reads it back and returns it to you.
>
> But that was not the question.
> The question was: [quote] "My question is: is there any technical
> reason, in ZFS's design, that makes it particularly important for ZFS
> to require ECC RAM?"

The only thing I can think of is this: if a cosmic ray flips a bit in memory holding a ZFS transaction that's already had all its checksums computed, but hasn't hit disk yet, then you'll have a checksum verification failure later when you read back the affected file (or directory). Using ECC memory avoids that. You still have the processor to worry about though.

Nico
-- 
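[Editor's note: Nico's scenario is easy to demonstrate outside ZFS. Flip one bit in a buffer after its checksum has been computed, and verification of the (faithfully written) data later fails. A rough sketch using sha256sum — ZFS's checksums are different (fletcher, and optionally sha256), but the principle is identical; the scratch file path is arbitrary:]

```shell
# Checksum a "block" about to be written, then flip one bit before the
# write -- the stored checksum no longer matches the stored data, so a
# later read reports a checksum error even though the disk was honest.
printf 'transaction payload' > /tmp/zfs_blk
before=$(sha256sum /tmp/zfs_blk | cut -d' ' -f1)
# flip the low bit of the first byte: 't' (0x74) becomes 'u' (0x75)
printf 'u' | dd of=/tmp/zfs_blk bs=1 count=1 conv=notrunc 2>/dev/null
after=$(sha256sum /tmp/zfs_blk | cut -d' ' -f1)
[ "$before" != "$after" ] && echo "checksum mismatch detected"
```

ECC would catch (and usually correct) the flip before the checksum and the data ever diverge.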
On Jul 24, 2009, at 3:18 AM, Michael McCandless wrote:

> I've read in numerous threads that it's important to use ECC RAM in a
> ZFS file server.

It is important to use ECC RAM. The embedded market and server market demand ECC RAM. It is only the el-cheapo PC market that does not. Going back to some of the early studies by IBM on errors in PC memory, it is really a shame that the market has not moved on.

> My question is: is there any technical reason, in ZFS's design, that
> makes it particularly important for ZFS to require ECC RAM?

No.

> Is ZFS especially vulnerable, moreso than other filesystems, to bit
> errors in RAM?

No. Except that ZFS actually does check data integrity, so ZFS can detect if you had a problem. Other file systems can be blissfully ignorant of data corruption.

> For example, if the wrong bit flips at the wrong time, could I lose my
> entire RAID-Z pool instead of, say, corrupting one file's contents or
> metadata? Is there such a possibility?

Not likely, but I don't think anyone has done such low-level analysis to prove it.

> (Assume the rest of the hardware stack "behaves", eg an fsync to the
> drive won't return until the bytes are written to stable storage).
>
> I had assumed that a bit error from RAM would only have a localized
> effect (eg, corrupt the contents or metadata of file or directory)
> each time it "struck", but now I'm wondering if the failure could be
> global because of something in ZFS's design, and that's why the
> recommendation for ECC RAM is always so "strong".

IMHO, the reason this gets discussed on zfs-discuss so frequently is because ZFS detects data corruption and people start to speculate about the source. NB many hard disk drives and controllers have only parity-protected memory.
So even if your main memory is ECC, it is unlikely that the entire data path is ECC protected.

> Some of the posts in this thread ("Another user loses his pool..."):
>
> http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0
>
> make me think ZFS may in fact "require" ECC RAM.

The root cause of that thread's woes has absolutely nothing to do with ECC RAM. It has everything to do with VirtualBox configuration.
-- richard
dick hoogendijk wrote:

> On Fri, 24 Jul 2009 10:44:36 -0400
> Kyle McDonald <KMcDonald at Egenera.COM> wrote:
>
>> ... then it seems like a shame (or a waste?) not to equally
>> protect the data both before it's given to ZFS for writing, and after
>> ZFS reads it back and returns it to you.
>
> But that was not the question.
> The question was: [quote] "My question is: is there any technical
> reason, in ZFS's design, that makes it particularly important for ZFS
> to require ECC RAM?"

No, there isn't.

-- 
Robert Milkowski
http://milek.blogspot.com
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:

    re> The root cause of this thread's woes have absolutely nothing
    re> to do with ECC RAM. It has everything to do with VirtualBox
    re> configuration.

What part of VirtualBox configuration? The post I read said the OpenSolaris guest crashed, and the guy clicked the ``power off guest'' button on the virtual machine. The host never crashed, so whether the IDE cache flush parameter was set or not, or whether the guest backing store was a file or a raw disk, seems irrelevant to me.

Is there a correct way to configure it, or will any component of the overall system other than ZFS always get blamed when ZFS loses a pool?
> The post I read said OpenSolaris guest crashed, and the guy clicked
> the ``power off guest'' button on the virtual machine.

I seem to recall "guest hung". 99% of Solaris hangs (without a crash dump) are "hardware" in nature (my experience, backed by an uptime of 1116 days), so the finger is still pointed at VirtualBox's "hardware" implementation.

As for ZFS requiring "better" hardware, you could turn off checksums and other protections so one isn't notified of issues, making it act like the others.

Rob
On Fri, 24 Jul 2009, Miles Nordin wrote:

> The post I read said OpenSolaris guest crashed, and the guy clicked
> the ``power off guest'' button on the virtual machine. The host never
> crashed. so whether the IDE cache flush parameter was set or not,

Clicking ``power off guest'' is the same as walking up and pulling the power cord out of the wall. That is not how the guest operating system is supposed to be shut down.

If VirtualBox does not at least flush pending writes (that it lied about) when the user clicks on ``power off guest'' then it has committed a crime. Regardless, it has committed a crime.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Fri, Jul 24, 2009 at 4:35 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:

> On Fri, 24 Jul 2009, Miles Nordin wrote:
>>
>> The post I read said OpenSolaris guest crashed, and the guy clicked
>> the ``power off guest'' button on the virtual machine. The host never
>> crashed. so whether the IDE cache flush parameter was set or not,
>
> Clicking ``power off guest'' is the same as walking up and pulling the power
> cord out of the wall. That is not how the guest operating system is
> supposed to be shut down.
>
> If VirtualBox does not at least flush pending writes (that it lied about)
> when the user clicks on ``power off guest'' then it has committed a crime.
> Regardless, it has committed a crime.

IIRC, VirtualBox has 3 shutdown options - Power Off (like pulling the plug), Send Shutdown Signal (emulates a hardware signal asking for a graceful (write-committing) shutdown), and Suspend (write system state to the host's disk).
Rob Logan wrote:

>> The post I read said OpenSolaris guest crashed, and the guy clicked
>> the ``power off guest'' button on the virtual machine.
>
> I seem to recall "guest hung". 99% of solaris hangs (without
> a crash dump) are "hardware" in nature. (my experience backed by
> an uptime of 1116 days) so the finger is still
> pointed at VirtualBox's "hardware" implementation.
>
> as for ZFS requiring "better" hardware, you could turn
> off checksums and other protections so one isn't notified
> of issues making it act like the others.

Maybe not better hardware, but honest hardware. Every piece of software depends to some extent on the devices it uses honouring their contracts. ZFS has to trust the storage to have committed the data it claims to have committed, in the same way it has to trust the integrity of the RAM it uses for checksummed data. That's the price you pay for end-to-end checksums.

-- 
Ian.
On 07/24/09 04:35 PM, Bob Friesenhahn wrote:

> Regardless, it [VirtualBox] has committed a crime.

But ZFS is a journalled file system! Any hardware can lose a flush; it's just more likely in a VM, especially when anything Microsoft is involved, and the whole point of journalling is to prevent things like this happening. However the issue is moot since CR 6667683 is being addressed. Here's a related thought - does it make sense to mirror ZFS on iSCSI if the host drives are themselves ZFS mirrors?

The whole question of the requirement for ECC depends on your tolerance for loss of files vs. errors in files. As Richard Elling points out, there are other sources of error (e.g., no checking of PCI parity). But that isn't relevant to the ECC on main memory question. You can disable checksumming, and then ZFS is no worse in this regard than any other file system; bad files get read and you either notice or you don't, but you won't lose any because of fatal checksum errors, and you still have all the other great features of ZFS.

If you don't mirror, all bets are off. You should set copies=2 or higher and cross your fingers. You should also disable file checksumming in ZFS and in that sense degenerate to the behavior of lesser file systems. However, mirroring doesn't buy you much here because it evidently doesn't double-buffer the write before calculating the checksum, so a stray bit flip can cause metadata or data corruption, causing a mirrored file to have an unrecoverable checksum failure (of course there are many other reasons to mirror).

The real question is - what is the probability of this occurring? IMO the typical SOHO user has a 1 in 10 to 1 in 100 chance of this happening in a year of reasonably constant operation (a few dozen writes/day).
I believe that this can be mitigated by setting copies=2, a good idea anyway if you have biggish disks since, as Richard Elling has pointed out in his excellent blogs, if you need to resilver after a disk failure you have a rather large possibility of a disk read error causing file loss, and copies=2 also mitigates this. Note that hopefully fixing CR 6667683 should eliminate any possibility of losing an entire mirrored or raidz pool.

So, it seems to me ZFS has a definite dependency on ECC for reliable operation. However, for non-commercial uses (i.e., less than an hour or so a day of writes) the probability of losing a file is fairly small and can be mitigated still further by setting copies=2. But to eliminate the possibility entirely, you must have ECC. You should also make sure that the buses have at least parity if not ECC and that this is actually checked - maybe Richard can comment on this since I believe he thinks this is a more likely source of errors.

HTH -- Frank
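[Editor's note: for anyone wanting to try the copies=2 mitigation described above, it is an ordinary per-dataset property; a sketch, where the pool name `tank` and dataset `tank/home` are placeholders for your own:]

```shell
# Store two copies of every data block for this dataset (ZFS metadata
# already gets extra copies).  Note that only blocks written after the
# property is set are duplicated; pre-existing data must be rewritten
# to gain a second copy.
zfs set copies=2 tank/home
zfs get copies tank/home
```

Doubling the copies roughly doubles the space consumed by new writes, which is the trade-off Frank's probability estimate is weighing.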
Frank Middleton wrote:

> On 07/24/09 04:35 PM, Bob Friesenhahn wrote:
>> Regardless, it [VirtualBox] has committed a crime.
>
> But ZFS is a journalled file system!

Even a journalled file system has to trust the journal. If the storage says the journal is committed and it isn't, all bets are off. The issue we see here with ZFS appears to be the lack of a means of rewinding to a known sane state when this happens.

> The whole question of the requirement for ECC depends on your
> tolerance for loss of files vs. errors in files. As Richard
> Elling points out, there are other sources of error (e.g.,
> no checking of PCI parity). But that isn't relevant to the ECC
> on main memory question. You can disable checksumming, and then
> ZFS is no worse in this regard than any other file system; bad
> files get read and you either notice or you don't, but you won't
> lose any because of fatal checksum errors and you still have all
> the other great features of ZFS,

That's probably the root of the issues we see here: ZFS does a great job of telling you when something is irrevocably broken, but doesn't (yet) offer a means of fixing the problem.

I guess ZFS is a bit like a single-bit parity scheme that reports, but does not correct, (gross) errors. When these are used in an on-the-wire protocol, bad packets can either be dropped or retransmitted. With a file system, only the former option is available; the original is lost.

Transmission protocols are always designed to manage data errors. Filesystems have traditionally been designed to ignore them, assuming the round trip from CPU to storage and back is 100% reliable. ZFS has changed the rules.

-- 
Ian.
On Jul 24, 2009, at 16:00, Miles Nordin wrote:

> Is there a correct way to configure it, or will any
> component of the overall system other than ZFS always get blamed when ZFS
> loses a pool?

By default VB does not respect the 'disk sync' command that a guest OS could send--it's just ignored. This messes up ZFS's assumption about transactions being safely on-disk. Toby Thain posted a link to a VB forum posting on how to configure things so that the flush command is not silently ignored:

http://forums.virtualbox.org/viewtopic.php?f=8&t=13661&start=0

ZFS doesn't make many assumptions, but it does assume that when the disk says "the data is safe" it actually is. If the disk lies then this is where things go wonky (and why we have these giant threads).
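[Editor's note: the configuration change the linked forum post describes is, per the VirtualBox manual of that era, an extradata key; a sketch, where the VM name "opensolaris-vm" and the LUN number are placeholders for your own VM and disk attachment:]

```shell
# Tell VirtualBox to pass the guest's IDE flush-cache requests through
# to the host instead of silently ignoring them (the default behavior
# being discussed in this thread).
VBoxManage setextradata "opensolaris-vm" \
    "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0
```

With IgnoreFlush set to 0, a cache flush issued by the guest is only acknowledged once the host has actually flushed, restoring the ordering guarantee ZFS depends on.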
On Fri, 24 Jul 2009, Frank Middleton wrote:

> On 07/24/09 04:35 PM, Bob Friesenhahn wrote:
>> Regardless, it [VirtualBox] has committed a crime.
>
> But ZFS is a journalled file system! Any hardware can lose a flush;

From my understanding, ZFS is not a journalled file system. ZFS relies on ordered writes followed by a cache sync (to make sure that the bits are on disk) and so it does not use a journaled transaction rollback mechanism. Here is a description of what a journaling file system is:

http://en.wikipedia.org/wiki/Journaling_file_system

Notice that the second sentence introduces the notion of a "race condition", but since ZFS uses ordered writes to freshly allocated space, there is no possibility of a race condition. A journaling filesystem uses a journal (transaction log) to roll back (replace with previous data) the unordered writes in an incomplete transaction. In the case of ZFS, it is only necessary to go back to the most recent checkpoint, and any subsequent writes after that checkpoint are simply forgotten.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:

> A journaling filesystem uses a journal (transaction log) to roll
> back (replace with previous data) the unordered writes in an
> incomplete transaction. In the case of ZFS, it is only necessary to
> go back to the most recent checkpoint and any subsequent writes
> after that checkpoint are simply forgotten.

And fixing CR 6667683 is what would allow ZFS to properly / automatically recover from a messed-up power down:

> need a way to rollback to an uberblock from a previous txg

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6667683

Most of the issues that I've read on this list would have been "solved" if there was a mechanism where the user / sysadmin could tell ZFS to simply go back until it found a TXG that worked.

The trade-off is that any transactions (and their data) after the working one would be lost. But at least you're not left with an unimportable pool.
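[Editor's note: a sketch of what such a rollback mechanism looks like at the command line, assuming the `-F` recovery option that was eventually delivered for this class of problem, and a placeholder pool name `tank` — check your build's zpool(1M) man page before relying on it:]

```shell
# Try a normal import first; if the pool's most recent txg is damaged,
# retry in recovery mode, which discards the last few transaction
# groups to reach an importable state.
zpool import tank || zpool import -F tank

# Dry run: report what would be discarded without actually importing.
zpool import -F -n tank
```

This is exactly the trade-off described above: the discarded transaction groups (and their data) are lost, but the pool imports.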
On 24-Jul-09, at 6:41 PM, Frank Middleton wrote:

> On 07/24/09 04:35 PM, Bob Friesenhahn wrote:
>> Regardless, it [VirtualBox] has committed a crime.
>
> But ZFS is a journalled file system! Any hardware can lose a flush;

No, the problematic default in VirtualBox is flushes being *ignored*, which has a different failure mode. A host crash under this regime can potentially corrupt *any* journaled and transactional system (starting with filesystems and RDBMS) in a manner that does not occur on properly functioning bare metal that honours flushes, because their ordering assumptions no longer hold.

Whether this is 'possible' with a guest-only crash is arguable - I don't want to speak for Miles, but I suspect he was reasoning that a guest crash would not interact with ignore-flush, as all issued I/O up until the crash "should" finally complete - making a guest crash similar to a "real" crash. But the virtualised stack is complex enough that I don't know if we can be certain about that. I would say that ignoring flushes is still a suspect.

> it's just more likely in a VM, especially when anything Microsoft
> is involved,

I originally saw the problem on a Ubuntu system, 6 months ago. The subsystems which broke were ext3fs and InnoDB - both supposedly "journaling".

> and the whole point of journalling is to prevent things
> like this happening.

It can ONLY do that when flushes/barriers/ordering are respected.

--Toby
Thanks for the numerous responses everyone! Responding to some of the answers...:

> ZFS has to trust the storage to have committed the data it
> claims to have committed in the same way it has to trust the integrity
> of the RAM it uses for checksummed data.

I hope that's not true.

Ie, I can understand that if an IO system "lies" in an fsync call (returns before the bits are in fact on stable storage) that ZFS might lose the pool. EG it seems like that may've been what happened on the VB thread (though I agree since it was only the guest that crashed, the writes should in fact have made it to disk, so...).

But if a bit flips in RAM, at a particularly unlucky moment, is there any chance whatsoever that ZFS could lose the pool? There seems to be mixed "opinions" here so far... but if I were tallying votes it looks like more people say "no, it cannot" than "yes it may".

>> For example, if the wrong bit flips at the wrong time, could I lose my
>> entire RAID-Z pool instead of, say, corrupting one file's contents or
>> metadata? Is there such a possibility?
>
> Not likely, but I don't think anyone has done such low-level
> analysis to prove it.

So this is exactly what I'm driving at -- has there really been no such low-level failure analysis? Ie, "if a bit error happens at point XYZ in ZFS's code, what's the impact" (for XYZ at all "interesting" points)?

EG say (pure speculation) ZFS has a global checksum that's written on closing the pool, and then later the pool cannot be imported when the checksum is bad. Since a bit error could corrupt that checksum, this would in fact mean I could "lose" the pool due to an unluckily timed bit error.

The decision (to use ECC or not) ought to be a basic cost/benefit analysis, once one has the facts. I'm trying to get to the facts here... ie, if you don't use ECC just how bad is it when bit errors inevitably happen?
If the effects are local (file/dir contents & metadata get corrupted) that's one thing; if I could lose the pool that's very different. [Eventually] armed with the facts, one should be free to decide on ECC or not, just like one picks, say, the latest & greatest consumer hard drive (higher risk of errors since they have no track record) or a known good enterprise hard drive.

> You still have the processor to worry about though.

and

> NB many hard disk drives and controllers have only parity protected
> memory. So even if your main memory is ECC, it is unlikely that the
> entire data path is ECC protected.

These are good points -- even if you have ECC RAM, your CPU and PCI bus and other parts of the data path could still flip bits. So I'm really hoping the answer is "no, you'll never lose the pool from bit errors".

> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6667683
>
> Most of the issues that I've read on this list would have been
> "solved" if there was a mechanism where the user / sysadmin could tell
> ZFS to simply go back until it found a TXG that worked.

This one sounds important! Any means of disaster recovery would be very welcome...

BTW is there some way for a user to vote/comment on bugs? EG I think I've hit this one:

http://bugs.opensolaris.org/view_bug.do?bug_id=6807184

And would love to vote, share my config, situation, etc. But I can't find any links that let me, there are no comments on the bug, etc.
Michael McCandless wrote:

> Thanks for the numerous responses everyone! Responding to some of the
> answers...:
>
>> ZFS has to trust the storage to have committed the data it
>> claims to have committed in the same way it has to trust the integrity
>> of the RAM it uses for checksummed data.
>
> I hope that's not true.
>
> Ie, I can understand that if an IO system "lies" in an fsync call
> (returns before the bits are in fact on stable storage) that ZFS might
> lose the pool. EG it seems like that may've been what happened on the
> VB thread (though I agree since it was only the guest that crashed,
> the writes should in fact have made it to disk, so...).
>
> But if a bit flips in RAM, at a particularly unlucky moment, is there
> any chance whatsoever that ZFS could lose the pool? There seems to be
> mixed "opinions" here so far... but if I were tallying votes it looks
> like more people say "no, it cannot" than "yes it may".

I've never seen reports of that happening. What I have seen is corrupted files. Without checksums, the files would have been silently corrupted.

-- 
Ian.
dick hoogendijk <dick <at> nagual.nl> writes:

> I live in Holland and it is not easy to find motherboards that (a)
> truly support ECC ram and (b) are (Open)Solaris compatible.

Virtually all motherboards for AMD processors support ECC RAM because the memory controller is in the CPU and all AMD CPUs support ECC RAM. I have heard of a few BIOSes that refuse to POST if ECC RAM is detected, but this is often an attempt to segment markets, rather than a real lack of ability to support ECC RAM.

-mrb
On Sat, 25 Jul 2009 21:58:48 +0000 (UTC)
Marc Bevand <m.bevand at gmail.com> wrote:

> dick hoogendijk <dick <at> nagual.nl> writes:
>>
>> I live in Holland and it is not easy to find motherboards that (a)
>> truly support ECC ram and (b) are (Open)Solaris compatible.
>
> Virtually all motherboards for AMD processors support ECC RAM because
> the memory controller is in the CPU and all AMD CPUs support ECC RAM.

Then why is it that most AMD MoBo's in the shops clearly state that ECC RAM is not supported on the MoBo?

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | SunOS 10u7 05/09 | OpenSolaris 2010.02 B118
+ All that's really worth doing is what we do for others (Lewis Carrol)
dick hoogendijk wrote:

> On Sat, 25 Jul 2009 21:58:48 +0000 (UTC)
> Marc Bevand <m.bevand at gmail.com> wrote:
>
>> dick hoogendijk <dick <at> nagual.nl> writes:
>>
>>> I live in Holland and it is not easy to find motherboards that (a)
>>> truly support ECC ram and (b) are (Open)Solaris compatible.
>>
>> Virtually all motherboards for AMD processors support ECC RAM because
>> the memory controller is in the CPU and all AMD CPUs support ECC RAM.
>
> Then why is it that most AMD MoBo's in the shops clearly state that ECC
> RAM is not supported on the MoBo?

All /OPTERON/ chips support ECC: unbuffered, non-registered in the case of the 100/1000 series, and registered in the case of the 200/2000/800/8000 series.

I _believe_ all socket AM2, AM2+ and AM3 consumer chips (Phenom, Phenom II, Athlon X2, Athlon X3 and Athlon X4) also support unbuffered non-registered ECC. The AMD Specs page for the above processors indicates I'm right about those CPUs.

I think what they're (the retail shops, that is) stating is that consumer AMD CPUs won't take the "server" (i.e. registered) ECC DIMMs. A quick glance at ASUS's website shows that all current consumer (i.e. socket AM2/2+/3) AMD motherboards from them support unregistered, unbuffered ECC. I suspect it's the same for the other board makers, too.

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Erik Trimble wrote:

> I _believe_ all socket AM2, AM2+ and AM3 consumer chips (Phenom,
> Phenom II, Athlon X2, Athlon X3 and Athlon X4) also support unbuffered
> non-registered ECC. The AMD Specs page for the above processors
> indicates I'm right about those CPUs.

Quick correction: the current AMD CPUs are Phenom X3, Phenom X4, Phenom II, Athlon X2, Athlon, and Sempron. According to the Processor Data Sheets for all AMD CPUs, they /all/ support ECC RAM (in some form), all the way back to the Socket 754 chips.

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
dick hoogendijk <dick <at> nagual.nl> writes:

> Then why is it that most AMD MoBo's in the shops clearly state that ECC
> RAM is not supported on the MoBo?

To restate what Erik explained: *all* AMD CPUs support ECC RAM; however, poorly written motherboard specs often make the mistake of confusing "non-ECC vs. ECC" with "unbuffered vs. registered" (these are 2 completely unrelated technical characteristics). So, don't blindly trust manuals saying ECC RAM is not supported.

-mrb
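[Editor's note: since a board can accept ECC DIMMs yet run them without ECC enabled, it is worth verifying after installation rather than trusting the manual either way. A rough check via the SMBIOS tables — assuming your firmware populates them accurately; both commands typically require root:]

```shell
# Linux: the physical-memory-array record reports the ECC mode actually
# in effect ("Single-bit ECC", "Multi-bit ECC", or "None").
dmidecode -t memory | grep -i 'error correction'

# (Open)Solaris exposes the same SMBIOS data via smbios(1M):
smbios -t SMB_TYPE_MEMARRAY
```

If the report says "None" despite ECC DIMMs being installed, the BIOS is likely running them in non-ECC mode.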
> On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:
> ....
> Most of the issues that I've read on this list would have been
> "solved" if there was a mechanism where the user / sysadmin could tell
> ZFS to simply go back until it found a TXG that worked.
>
> The trade off is that any transactions (and their data) after the
> working one would be lost. But at least you're not left with an
> un-importable pool.

I'm curious as to why people think rolling back txgs doesn't come with additional costs beyond losing recent transactions. What are the odds that the data blocks that were replaced by the discarded transactions haven't been overwritten? Without a snapshot to hold the references, aren't those blocks considered free and available for reuse?

Don't get me wrong, I do think that rolling back to previous uberblocks should be an option v. total pool loss, but it doesn't seem like one can reliably say that their data is in some known good state.
On 31.07.09 22:04, Kurt Olsen wrote:

>> On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:
>> ....
>> Most of the issues that I've read on this list would have been
>> "solved" if there was a mechanism where the user / sysadmin could tell
>> ZFS to simply go back until it found a TXG that worked.
>>
>> The trade off is that any transactions (and their data) after the
>> working one would be lost. But at least you're not left with an
>> un-importable pool.
>
> I'm curious as to why people think rolling back txgs don't come with
> additional costs beyond losing recent transactions. What are the odds
> that the data blocks that were replaced by the discarded transactions
> haven't been overwritten?

Odds depend on lots of factors - activity in the pool, free space, block selection policy, metaslab cursor positions, etc. I have seen examples of successful recovery to a point in time which is around 9 hours before the last synced txg. Sometimes it is enough to roll one txg back; sometimes it requires going back and trying a few older ones.

> Without a snapshot to hold the references
> aren't those blocks considered free and available for reuse?

As soon as a transaction group is synced, blocks freed during that transaction group are released back to the pool, and potentially can be overwritten during the next txg.

> Don't get me wrong, I do think that rolling back to previous
> uberblocks should be an option v. total pool loss, but it doesn't
> seem like one can reliably say that their data is in some known good
> state.

In fact, thanks to the fact that everything is checksummed, one can say that the pool is in good shape as reliably as the checksum in use allows.

victor
On 25.07.09 00:30, Rob Logan wrote:

>> The post I read said OpenSolaris guest crashed, and the guy clicked
>> the ``power off guest'' button on the virtual machine.
>
> I seem to recall "guest hung". 99% of solaris hangs (without
> a crash dump) are "hardware" in nature. (my experience backed by
> an uptime of 1116 days) so the finger is still
> pointed at VirtualBox's "hardware" implementation.
>
> as for ZFS requiring "better" hardware, you could turn
> off checksums and other protections so one isn't notified
> of issues making it act like the others.

You cannot turn off checksums and copies for metadata though, so even if you don't care about your data ZFS still cares about its metadata.

victor