Shawn Joy
2006-Dec-21 15:28 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
All,

I understand that ZFS gives you more error correction when using two
LUNs from a SAN. But does it provide you with fewer features than UFS
does on one LUN from a SAN (i.e., is it less stable)?

Thanks,
Shawn
Robert Milkowski
2006-Dec-21 15:45 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Hello Shawn,

Thursday, December 21, 2006, 4:28:39 PM, you wrote:

SJ> All,

SJ> I understand that ZFS gives you more error correction when using
SJ> two LUNs from a SAN. But does it provide you with fewer features
SJ> than UFS does on one LUN from a SAN (i.e., is it less stable)?

With only one LUN you still get error detection, which UFS doesn't
give you. You can still use snapshots, clones, quotas, etc., so in
general you still have more features than UFS.

Now when it comes to stability - it depends. UFS has been in use for
years, while ZFS is much younger.

More and more people are using ZFS in production, and while there are
some corner cases, mostly performance related, it works really well.
And I haven't heard of verified data loss due to ZFS. I've been using
ZFS for quite some time (well before it was available in SX) and I
haven't lost any data either.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
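To make the single-LUN case concrete, here is a minimal sketch; the
pool, dataset, and device names are hypothetical (on a real SAN host
the LUN name would come from format):

    # one LUN, no ZFS-level redundancy - checksums still detect errors
    zpool create tank c4t0d0

    # per-dataset features work exactly as on a redundant pool
    zfs create tank/home
    zfs set quota=10g tank/home           # cap the dataset at 10 GB
    zfs snapshot tank/home@friday         # point-in-time snapshot
    zfs clone tank/home@friday tank/home_clone   # writable clone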
przemolicc at poczta.fm
2006-Dec-22 09:02 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
> Hello Shawn,
>
> Thursday, December 21, 2006, 4:28:39 PM, you wrote:
>
> SJ> All,
>
> SJ> I understand that ZFS gives you more error correction when using
> SJ> two LUNs from a SAN. But does it provide you with fewer features
> SJ> than UFS does on one LUN from a SAN (i.e., is it less stable)?
>
> With only one LUN you still get error detection, which UFS doesn't
> give you. You can still use snapshots, clones, quotas, etc., so in
> general you still have more features than UFS.
>
> Now when it comes to stability - it depends. UFS has been in use for
> years, while ZFS is much younger.
>
> More and more people are using ZFS in production, and while there are
> some corner cases, mostly performance related, it works really well.
> And I haven't heard of verified data loss due to ZFS. I've been using
> ZFS for quite some time (well before it was available in SX) and I
> haven't lost any data either.

Robert,

I don't understand why not losing any data is an advantage of ZFS.
No filesystem should lose any data. It is like saying that an
advantage of a football player is that he/she plays football (he/she
should do that!), or an advantage of a chef is that he/she cooks
(he/she should do that!). Every filesystem should _save_ our data,
not lose it.

Regards
przemol
przemolicc at poczta.fm wrote:

> Robert,
>
> I don't understand why not losing any data is an advantage of ZFS.
> No filesystem should lose any data. It is like saying that an
> advantage of a football player is that he/she plays football (he/she
> should do that!), or an advantage of a chef is that he/she cooks
> (he/she should do that!). Every filesystem should _save_ our data,
> not lose it.

Yes, you are right: every filesystem should save the data.
(... and every program should have no errors! ;-)

Unfortunately there are some cases where the disks lose data; these
cannot be detected by traditional filesystems, but can be with ZFS:

 * bit rot: some bits on the disk get flipped (~ 1 in 10^11)
   (cosmic rays, static particles in airflow, random thermodynamics)
 * phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
   (positioning errors, disk firmware errors, ...)
 * misdirected reads/writes: the disk reads or writes at the wrong
   position (~ 1 in 10^8)
   (disks use very small structures; the head can move after
   positioning)
 * errors on the data transfer connection

You can look up the probabilities at several disk vendors; they are
published.

Traditional filesystems do not check the data they read. You get
strange effects when the filesystem code runs with wrong metadata
(worst case: panic). If you use the wrong data in your application,
you 'only' get wrong results...

ZFS, on the contrary, checksums every block it reads and is able to
fetch the data from the mirror or reconstruct it in a raidz config.
Therefore ZFS uses only valid data and is able to repair damaged data
blocks automatically. This is not possible in a traditional
filesystem/volume manager configuration.

You may say you have never heard of a disk losing data; but you have
heard of systems which behaved strangely until a re-installation fixed
everything, or of data that went bad so that you had to recover from
backup. It may be that these were such cases. Our service encounters
a number of cases every year where the customer was not able to
re-install, or did not want to restore his data, and the problem can
be traced back to such a disk error. These are always nasty problems,
and they get nastier because customers have more and more data, and
there is a trend to save money on backup/restore infrastructures,
which makes it hurt to restore data.

Regards,

Ulrich

--
| Ulrich Graef, Senior Consultant, OS Ambassador  \
| Operating Systems, Performance \ Platform Technology    \
| Mail: Ulrich.Graef at Sun.COM   \ Global Systems Engineering \
| Phone: +49 6103 752 359        \ Sun Microsystems Inc       \
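To see the detection (and, with redundancy, the self-healing) from the
command line - a minimal sketch, pool and device names hypothetical:

    # walk every allocated block in the pool and verify its checksum
    zpool scrub tank
    zpool status -v tank   # the CKSUM column counts corruption found

    # with two LUNs in a mirror, ZFS repairs a bad block from the
    # good copy automatically on read or scrub
    zpool create tank2 mirror c4t0d0 c5t0d0

On a single LUN, scrub and status still report the corruption; repair
is only possible where a redundant copy (mirror or raidz) exists.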
Robert Milkowski
2006-Dec-22 10:45 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Hello przemolicc,

Friday, December 22, 2006, 10:02:44 AM, you wrote:

ppf> On Thu, Dec 21, 2006 at 04:45:34PM +0100, Robert Milkowski wrote:
>> [...]
>> More and more people are using ZFS in production, and while there are
>> some corner cases, mostly performance related, it works really well.
>> And I haven't heard of verified data loss due to ZFS. I've been using
>> ZFS for quite some time (well before it was available in SX) and I
>> haven't lost any data either.

ppf> Robert,

ppf> I don't understand why not losing any data is an advantage of ZFS.
ppf> No filesystem should lose any data.

I wasn't saying this is an advantage. Of course no file system should
lose your data - it's just that when new file systems show up on the
market, people do not trust them at first, which is an expected
precaution.

Part of that perception comes from Linux - due to a different
development style you often get software that is badly written and
badly tested - try searching Google for how many people lost their
data with ReiserFS, for example. The same happened to many people
with XFS on Linux.

That's why I thought it was worth emphasizing that ZFS hasn't lost my
data even though it's a new-born file system and I've been using it
for years (as have other users) - especially for people coming from
the Linux world.

ps. I really believe the development style in OpenSolaris is better
than in Linux (kernel).

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
Ulrich,

in his e-mail Robert mentioned _two_ things regarding ZFS:

[1] the ability to detect errors (checksums)
[2] that using ZFS hasn't caused data loss so far

I completely agree that [1] is wonderful and a huge advantage. And you
also underlined [1] in your e-mail! The _only_ thing I mentioned is
[2]. And I guess Robert wrote about it only because ZFS is relatively
young. When you talk about VxFS/UFS you don't underline that they
don't lose data - it would be ridiculous.

Regards
przemol

On Fri, Dec 22, 2006 at 11:39:44AM +0100, Ulrich Graef wrote:
> przemolicc at poczta.fm wrote:
> [...]
> Yes, you are right: every filesystem should save the data.
> (... and every program should have no errors! ;-)
>
> Unfortunately there are some cases where the disks lose data; these
> cannot be detected by traditional filesystems, but can be with ZFS:
> [...]
Shawn Joy
2006-Dec-22 13:34 UTC
[zfs-discuss] Re: Difference between ZFS and UFS with one LUN from a SAN
OK, but let's get back to the original question. Does ZFS provide you
with fewer features than UFS does on one LUN from a SAN (i.e., is it
less stable)?

> ZFS, on the contrary, checksums every block it reads and is able to
> fetch the data from the mirror or reconstruct it in a raidz config.
> Therefore ZFS uses only valid data and is able to repair damaged data
> blocks automatically. This is not possible in a traditional
> filesystem/volume manager configuration.

The above is fine if I have two LUNs. But my original question was
about having only one LUN.

What about kernel panics from ZFS if, for instance, access to one
controller goes away for a few seconds or minutes? Normally UFS would
just sit there and warn that it has lost access to the controller.
Then, when the controller returns after a short period, the warnings
go away and the LUN continues to operate. The admin can then research
further into why the controller went away.

With ZFS, the above will panic the system and possibly cause other
corruption on other LUNs due to this panic? I believe this was
discussed in other threads, and I also believe there is a bug filed
against this. If so, when should we expect this bug to be fixed?

My understanding of ZFS is that it functions better in an environment
where we have JBODs attached to the hosts, so that ZFS takes care of
all of the redundancy. But what about SAN environments where customers
have spent big money to invest in storage? I know of one instance
where a customer has a growing need for more storage space. Their
environment uses many inodes. Due to the UFS inode limitation when
creating LUNs over one TB, they would have to quadruple the amount of
storage used in their SAN in order to hold all of the files. A
possible solution to this inode issue would be ZFS. However, they have
experienced kernel panics in their environment when a controller
dropped offline.

Anybody have a solution to this?

Shawn
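On the inode point: UFS fixes its inode count when the file system is
created (newfs -i sets the bytes-per-inode ratio, and if I remember
the multi-terabyte UFS limit correctly, density there is capped at one
inode per megabyte), while ZFS allocates its equivalent on demand. A
sketch with hypothetical devices:

    # UFS: inode count is fixed at newfs time; one inode per 8 KB here
    newfs -i 8192 /dev/rdsk/c4t0d0s0

    # ZFS: nothing to preallocate - file metadata is created as files
    # are, so a small-file-heavy dataset cannot run out of inodes
    # while the pool has free space
    zpool create tank c4t0d0
    zfs create tank/manyfiles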
Roch - PAE
2006-Dec-22 14:14 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Robert Milkowski writes:
 > [...]
 > That's why I thought it was worth emphasizing that ZFS hasn't lost
 > my data even though it's a new-born file system and I've been using
 > it for years (as have other users) - especially for people coming
 > from the Linux world.

The fact that most filesystems do not manage the disk write caches
does mean you're at risk of data loss with those filesystems.

-r
> Unfortunately there are some cases where the disks lose data; these
> cannot be detected by traditional filesystems, but can be with ZFS:
>
> * bit rot: some bits on the disk get flipped (~ 1 in 10^11)
> * phantom writes: a disk 'forgets' to write data (~ 1 in 10^8)
> * misdirected reads/writes: the disk reads or writes at the wrong
>   position (~ 1 in 10^8)
>
> You can look up the probabilities at several disk vendors; they are
> published.

I'm puzzled where you got those numbers from. They seem to be several
orders of magnitude too low.

Bit errors: For SATA disks, the probability of an *uncorrected* error
is roughly 1 in 10^14 bits read (12 terabytes or so) [Seagate WinHEC].
These should be handled identically by ZFS and a traditional file
system over RAID. The probability of either an *undetected* or
*miscorrected* error is not, so far as I know, published for disks.
For high-end tape, where the uncorrected error rate is roughly 1 in
10^17 bits read, the miscorrected error rate is 1 in 10^33 bits.
Modern disks may use a two-level ECC [IBM ECC] which reduces the
miscorrected error rate even further. These are one class of errors
which ZFS will catch and a traditional file system will not.

Phantom writes and/or misdirected reads/writes: I haven't seen
probabilities published on this; obviously the disk vendors would
claim zero, but we believe they're slightly wrong. ;-) That said, 1 in
10^8 bits would mean we'd have an error in every 12 megabytes written!
That's clearly far too low. 1 in 10^8 blocks would be an error in
every 46 gigabytes written; that is also clearly far too low. (At 1
GB/second that would be a phantom write every minute.)

References:

[Seagate WinHEC] "SATA in the Enterprise."
<http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWST05005_WinHEC05.ppt>

[IBM ECC] "Two-level coding for error control in magnetic disk storage
products."
<http://www.research.ibm.com/journal/rd/334/ibmrd3304G.pdf>
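For the record, the back-of-envelope numbers above check out (decimal
units; 512-byte blocks assumed for the per-block case):

    echo '10^14 / 8 / 10^12' | bc -l  # 1 in 10^14 bits  ~= 12.5 TB between errors
    echo '10^8 / 8 / 10^6'   | bc -l  # 1 in 10^8 bits   ~= 12.5 MB between errors
    echo '10^8 * 512 / 10^9' | bc -l  # 1 in 10^8 blocks ~= 51 GB between errors
                                      # (~48 GiB, i.e. the "46 gigabytes" above)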
On Dec 22, 2006, at 09:50, Anton B. Rang wrote:

> Phantom writes and/or misdirected reads/writes:
>
> I haven't seen probabilities published on this; obviously the disk
> vendors would claim zero, but we believe they're slightly wrong. ;-)
> That said, 1 in 10^8 bits would mean we'd have an error in every 12
> megabytes written! That's clearly far too low. 1 in 10^8 blocks
> would be an error in every 46 gigabytes written; that is also
> clearly far too low. (At 1 GB/second that would be a phantom write
> every minute.)

Jim Gray (a well-known and respected database expert, currently at
Microsoft) claims that the drive/controller combination will write
data to the wrong place on the drive at a rate of about one
incident/drive/year. In a 400-drive array (JBOD or RAID, doesn't
matter), that works out to about 400 incidents a year - roughly one a
day. This is a kind of error that (so far, at least) can only be
detected (and, given redundancy, potentially corrected) by ZFS.

--Ed
Torrey McMahon
2006-Dec-22 20:17 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Roch - PAE wrote:
>
> The fact that most filesystems do not manage the disk write caches
> does mean you're at risk of data loss with those filesystems.

Does ZFS? I thought it just turned the cache on in the places where we
had previously turned it off.
Robert Milkowski
2006-Dec-22 20:40 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Hello Torrey,

Friday, December 22, 2006, 9:17:46 PM, you wrote:

TM> Roch - PAE wrote:
>> The fact that most filesystems do not manage the disk write caches
>> does mean you're at risk of data loss with those filesystems.

TM> Does ZFS? I thought it just turned the cache on in the places
TM> where we had previously turned it off.

ZFS sends a flush-cache command after each transaction group, so it is
sure the transaction is on stable storage.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
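For anyone who wants to check what a given drive's write cache is
actually set to, the expert mode of Solaris format(1M) exposes it on
SCSI/FC disks - an interactive menu, not a script, and the path below
is from memory:

    # format -e          (select the disk, then navigate:)
    #   cache -> write_cache -> display   # show current state
    #   cache -> write_cache -> enable    # safe under ZFS, since ZFS
    #                                     # flushes after each txg commit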
Neil Perrin
2006-Dec-22 22:06 UTC
[zfs-discuss] Difference between ZFS and UFS with one LUN from a SAN
Robert Milkowski wrote on 12/22/06 13:40:

> Hello Torrey,
>
> Friday, December 22, 2006, 9:17:46 PM, you wrote:
>
> TM> Roch - PAE wrote:
>
>>> The fact that most filesystems do not manage the disk write caches
>>> does mean you're at risk of data loss with those filesystems.
>
> TM> Does ZFS? I thought it just turned the cache on in the places
> TM> where we had previously turned it off.
>
> ZFS sends a flush-cache command after each transaction group, so it
> is sure the transaction is on stable storage.

... and after every fsync, O_DSYNC, etc. that writes out intent log
blocks.