Shawn Joy
2006-Dec-21 15:28 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
All,

I understand that ZFS gives you more error correction when using two
LUNs from a SAN. But does it provide you with fewer features than UFS
does on one LUN from a SAN (i.e., is it less stable)?

Thanks,
Shawn
Robert Milkowski
2006-Dec-21 15:45 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Hello Shawn,

Thursday, December 21, 2006, 4:28:39 PM, you wrote:

SJ> All,

SJ> I understand that ZFS gives you more error correction when using
SJ> two LUNs from a SAN. But does it provide you with fewer features
SJ> than UFS does on one LUN from a SAN (i.e., is it less stable)?

With only one LUN you still get error detection, which UFS doesn't
give you. You can still use snapshots, clones, quotas, etc., so in
general you still have more features than UFS.

Now when it comes to stability - it depends. UFS has been in use for
years, while ZFS is much younger.

More and more people are using ZFS in production, and while there are
some corner cases, mostly performance related, it works really well.
And I haven't heard of verified data loss due to ZFS. I've been using
ZFS for quite some time (well before it was available in SX) and I
haven't lost any data either.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
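To make the single-LUN case concrete, here is a minimal sketch; the
pool, dataset, and device names are hypothetical (on a real SAN host
the LUN name would come from format):

    # one LUN, no ZFS-level redundancy - checksums still detect errors
    zpool create tank c4t0d0

    # per-dataset features work exactly as on a redundant pool
    zfs create tank/home
    zfs set quota=10g tank/home           # cap the dataset at 10 GB
    zfs snapshot tank/home@friday         # point-in-time snapshot
    zfs clone tank/home@friday tank/home_clone   # writable clone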
przemolicc at poczta.fm
2006-Dec-22 09:02 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
> Hello Shawn,
>
> Thursday, December 21, 2006, 4:28:39 PM, you wrote:
>
> SJ> All,
>
> SJ> I understand that ZFS gives you more error correction when using
> SJ> two LUNs from a SAN. But does it provide you with fewer features
> SJ> than UFS does on one LUN from a SAN (i.e., is it less stable)?
>
> With only one LUN you still get error detection, which UFS doesn't
> give you. You can still use snapshots, clones, quotas, etc., so in
> general you still have more features than UFS.
>
> Now when it comes to stability - it depends. UFS has been in use for
> years, while ZFS is much younger.
>
> More and more people are using ZFS in production, and while there are
> some corner cases, mostly performance related, it works really well.
> And I haven't heard of verified data loss due to ZFS. I've been using
> ZFS for quite some time (well before it was available in SX) and I
> haven't lost any data either.

Robert,

I don't understand why not losing any data is an advantage of ZFS.
No filesystem should lose any data. It is like saying that an
advantage of a football player is that he/she plays football (he/she
should do that!), or an advantage of a chef is that he/she cooks
(he/she should do that!). Every filesystem should _save_ our data,
not lose it.

Regards
przemol
przemolicc at poczta.fm wrote:

> Robert,
>
> I don't understand why not losing any data is an advantage of ZFS.
> No filesystem should lose any data. It is like saying that an
> advantage of a football player is that he/she plays football (he/she
> should do that!), or an advantage of a chef is that he/she cooks
> (he/she should do that!). Every filesystem should _save_ our data,
> not lose it.

Yes, you are right: every filesystem should save the data.
(... and every program should have no errors! ;-)

Unfortunately there are some cases where the disks lose data; these
cannot be detected by traditional filesystems, but can be with ZFS:

 * bit rot: some bits on the disk get flipped (~ 1 in 10^11)
   (cosmic rays, static particles in airflow, random thermodynamics)
 * phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
   (positioning errors, disk firmware errors, ...)
 * misdirected reads/writes: the disk reads or writes at the wrong
   position (~ 1 in 10^8)
   (disks use very small structures; the head can move after
   positioning)
 * errors on the data transfer connection

You can look up the probabilities at several disk vendors; they are
published.

Traditional filesystems do not check the data they read. You get
strange effects when the filesystem code runs with wrong metadata
(worst case: panic). If you use the wrong data in your application,
you 'only' get wrong results...

ZFS, on the contrary, checksums every block it reads and is able to
fetch the data from the mirror or reconstruct it in a raidz config.
Therefore ZFS uses only valid data and is able to repair damaged data
blocks automatically. This is not possible in a traditional
filesystem/volume manager configuration.

You may say you have never heard of a disk losing data; but you have
heard of systems which behaved strangely until a re-installation fixed
everything, or of data that went bad so that you had to recover from
backup. It may be that these were such cases. Our service encounters
a number of cases every year where the customer was not able to
re-install, or did not want to restore his data, and the problem can
be traced back to such a disk error. These are always nasty problems,
and they get nastier because customers have more and more data, and
there is a trend to save money on backup/restore infrastructures,
which makes it hurt to restore data.

Regards,

Ulrich

--
| Ulrich Graef, Senior Consultant, OS Ambassador  \
| Operating Systems, Performance \ Platform Technology    \
| Mail: Ulrich.Graef at Sun.COM   \ Global Systems Engineering \
| Phone: +49 6103 752 359        \ Sun Microsystems Inc       \
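To see the detection (and, with redundancy, the self-healing) from the
command line - a minimal sketch, pool and device names hypothetical:

    # walk every allocated block in the pool and verify its checksum
    zpool scrub tank
    zpool status -v tank   # the CKSUM column counts corruption found

    # with two LUNs in a mirror, ZFS repairs a bad block from the
    # good copy automatically on read or scrub
    zpool create tank2 mirror c4t0d0 c5t0d0

On a single LUN, scrub and status still report the corruption; repair
is only possible where a redundant copy (mirror or raidz) exists.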
Robert Milkowski
2006-Dec-22 10:45 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Hello przemolicc,

Friday, December 22, 2006, 10:02:44 AM, you wrote:

ppf> On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
>> [...]
>> More and more people are using ZFS in production, and while there are
>> some corner cases, mostly performance related, it works really well.
>> And I haven't heard of verified data loss due to ZFS. I've been using
>> ZFS for quite some time (well before it was available in SX) and I
>> haven't lost any data either.

ppf> Robert,

ppf> I don't understand why not losing any data is an advantage of ZFS.
ppf> No filesystem should lose any data.

I wasn't saying this is an advantage. Of course no file system should
lose your data - it's just that when new file systems show up on the
market, people do not trust them at first, which is an expected
precaution.

Part of that perception comes from Linux - due to a different
development style you often get software that is badly written and
badly tested - try searching Google for how many people lost their
data with ReiserFS, for example. The same happened to many people
with XFS on Linux.

That's why I thought it was worth emphasizing that ZFS hasn't lost my
data even though it's a new-born file system and I've been using it
for years (as have other users) - especially for people coming from
the Linux world.

ps. I really believe the development style in OpenSolaris is better
than in Linux (kernel).

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
Ulrich,

in his e-mail Robert mentioned _two_ things regarding ZFS:

[1] the ability to detect errors (checksums)
[2] that using ZFS hasn't caused data loss so far

I completely agree that [1] is wonderful and a huge advantage. And you
also underlined [1] in your e-mail! The _only_ thing I mentioned is
[2]. And I guess Robert wrote about it only because ZFS is relatively
young. When you talk about VxFS/UFS you don't underline that they
don't lose data - it would be ridiculous.

Regards
przemol

On Fri, Dec 22, 2006 at 11:39:44AM +0100, Ulrich Graef wrote:
> przemolicc at poczta.fm wrote:
> [...]
> Yes, you are right: every filesystem should save the data.
> (... and every program should have no errors! ;-)
>
> Unfortunately there are some cases where the disks lose data; these
> cannot be detected by traditional filesystems, but can be with ZFS:
> [...]
Shawn Joy
2006-Dec-22 13:34 UTC
[zfs-discuss] Re: Difference between ZFS and UFS with one LUN from a SAN
OK, but let's get back to the original question. Does ZFS provide you
with fewer features than UFS does on one LUN from a SAN (i.e., is it
less stable)?

> ZFS, on the contrary, checksums every block it reads and is able to
> fetch the data from the mirror or reconstruct it in a raidz config.
> Therefore ZFS uses only valid data and is able to repair damaged data
> blocks automatically. This is not possible in a traditional
> filesystem/volume manager configuration.

The above is fine if I have two LUNs. But my original question was
about having only one LUN.

What about kernel panics from ZFS if, for instance, access to one
controller goes away for a few seconds or minutes? Normally UFS would
just sit there and warn that it has lost access to the controller.
Then, when the controller returns after a short period, the warnings
go away and the LUN continues to operate. The admin can then research
further into why the controller went away.

With ZFS, the above will panic the system and possibly cause other
corruption on other LUNs due to this panic? I believe this was
discussed in other threads, and I also believe there is a bug filed
against this. If so, when should we expect this bug to be fixed?

My understanding of ZFS is that it functions better in an environment
where we have JBODs attached to the hosts, so that ZFS takes care of
all of the redundancy. But what about SAN environments where customers
have spent big money to invest in storage? I know of one instance
where a customer has a growing need for more storage space. Their
environment uses many inodes. Due to the UFS inode limitation when
creating LUNs over one TB, they would have to quadruple the amount of
storage used in their SAN in order to hold all of the files. A
possible solution to this inode issue would be ZFS. However, they have
experienced kernel panics in their environment when a controller
dropped offline.

Anybody have a solution to this?

Shawn
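On the inode point: UFS fixes its inode count when the file system is
created (newfs -i sets the bytes-per-inode ratio, and if I remember
the multi-terabyte UFS limit correctly, density there is capped at one
inode per megabyte), while ZFS allocates its equivalent on demand. A
sketch with hypothetical devices:

    # UFS: inode count is fixed at newfs time; one inode per 8 KB here
    newfs -i 8192 /dev/rdsk/c4t0d0s0

    # ZFS: nothing to preallocate - file metadata is created as files
    # are, so a small-file-heavy dataset cannot run out of inodes
    # while the pool has free space
    zpool create tank c4t0d0
    zfs create tank/manyfiles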
Roch - PAE
2006-Dec-22 14:14 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Robert Milkowski writes:
 > [...]
 > That's why I thought it was worth emphasizing that ZFS hasn't lost
 > my data even though it's a new-born file system and I've been using
 > it for years (as have other users) - especially for people coming
 > from the Linux world.

The fact that most filesystems do not manage the disk write caches
does mean you're at risk of data loss with those filesystems.

-r
> Unfortunately there are some cases where the disks lose data; these
> cannot be detected by traditional filesystems, but can be with ZFS:
>
> * bit rot: some bits on the disk get flipped (~ 1 in 10^11)
> * phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
> * misdirected reads/writes: the disk reads or writes at the wrong
>   position (~ 1 in 10^8)
>
> You can look up the probabilities at several disk vendors; they are
> published.

I'm puzzled where you got those numbers from. They seem to be several
orders of magnitude too low.

Bit errors: For SATA disks, the probability of an *uncorrected* error
is roughly 1 in 10^14 bits read (12 terabytes or so) [Seagate WinHEC].
These should be handled identically by ZFS and a traditional file
system over RAID. The probability of either an *undetected* or
*miscorrected* error is not, so far as I know, published for disks.
For high-end tape, where the uncorrected error rate is roughly 1 in
10^17 bits read, the miscorrected error rate is 1 in 10^33 bits.
Modern disks may use a two-level ECC [IBM ECC] which reduces the
miscorrected error rate even further. These are one class of errors
which ZFS will catch and a traditional file system will not.

Phantom writes and/or misdirected reads/writes: I haven't seen
probabilities published on this; obviously the disk vendors would
claim zero, but we believe they're slightly wrong. ;-) That said, 1 in
10^8 bits would mean we'd have an error in every 12 megabytes written!
That's clearly far too low. 1 in 10^8 blocks would be an error in
every 46 gigabytes written; that is also clearly far too low. (At 1
GB/second that would be a phantom write every minute.)

References:

[Seagate WinHEC] "SATA in the Enterprise."
<http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWST05005_WinHEC05.ppt>

[IBM ECC] "Two-level coding for error control in magnetic disk storage
products."
<http://www.research.ibm.com/journal/rd/334/ibmrd3304G.pdf>
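For the record, the back-of-envelope numbers above check out (decimal
units; 512-byte blocks assumed for the per-block case):

    echo '10^14 / 8 / 10^12' | bc -l  # 1 in 10^14 bits  ~= 12.5 TB between errors
    echo '10^8 / 8 / 10^6'   | bc -l  # 1 in 10^8 bits   ~= 12.5 MB between errors
    echo '10^8 * 512 / 10^9' | bc -l  # 1 in 10^8 blocks ~= 51 GB between errors
                                      # (~48 GiB, i.e. the "46 gigabytes" above)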
On Dec 22, 2006, at 09:50, Anton B. Rang wrote:

> Phantom writes and/or misdirected reads/writes:
>
> I haven't seen probabilities published on this; obviously the disk
> vendors would claim zero, but we believe they're slightly wrong. ;-)
> That said, 1 in 10^8 bits would mean we'd have an error in every 12
> megabytes written! That's clearly far too low. 1 in 10^8 blocks
> would be an error in every 46 gigabytes written; that is also
> clearly far too low. (At 1 GB/second that would be a phantom write
> every minute.)

Jim Gray (a well-known and respected database expert, currently at
Microsoft) claims that the drive/controller combination will write
data to the wrong place on the drive at a rate of about one
incident/drive/year. In a 400-drive array (JBOD or RAID, doesn't
matter), that works out to about 400 incidents a year - roughly one a
day. This is a kind of error that (so far, at least) can only be
detected (and, given redundancy, potentially corrected) by ZFS.

--Ed
Torrey McMahon
2006-Dec-22 20:17 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Roch - PAE wrote:
>
> The fact that most filesystems do not manage the disk write caches
> does mean you're at risk of data loss with those filesystems.

Does ZFS? I thought it just turned the cache on in the places where we
had previously turned it off.
Robert Milkowski
2006-Dec-22 20:40 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Hello Torrey,

Friday, December 22, 2006, 9:17:46 PM, you wrote:

TM> Roch - PAE wrote:
>> The fact that most filesystems do not manage the disk write caches
>> does mean you're at risk of data loss with those filesystems.

TM> Does ZFS? I thought it just turned the cache on in the places
TM> where we had previously turned it off.

ZFS sends a flush-cache command after each transaction group, so it is
sure the transaction is on stable storage.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
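For anyone who wants to check what a given drive's write cache is
actually set to, the expert mode of Solaris format(1M) exposes it on
SCSI/FC disks - an interactive menu, not a script, and the path below
is from memory:

    # format -e          (select the disk, then navigate:)
    #   cache -> write_cache -> display   # show current state
    #   cache -> write_cache -> enable    # safe under ZFS, since ZFS
    #                                     # flushes after each txg commit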
Neil Perrin
2006-Dec-22 22:06 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Robert Milkowski wrote on 12/22/06 13:40:

> Hello Torrey,
>
> Friday, December 22, 2006, 9:17:46 PM, you wrote:
>
> TM> Roch - PAE wrote:
>
>>> The fact that most filesystems do not manage the disk write caches
>>> does mean you're at risk of data loss with those filesystems.
>
> TM> Does ZFS? I thought it just turned the cache on in the places
> TM> where we had previously turned it off.
>
> ZFS sends a flush-cache command after each transaction group, so it
> is sure the transaction is on stable storage.

... and after every fsync, O_DSYNC, etc. that writes out intent log
blocks.