Orvar Korvar
2009-Jun-15 17:45 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
According to this webpage, there are some errors that make ZFS unusable under certain conditions. That is not really optimal for an Enterprise file system. In my opinion the ZFS team should focus on bug correction instead of adding new functionality. The functionality that exists already far surpasses any other file system, so it is better to fix bugs. In my opinion. Read the error reports and complaints about data corruption here:

http://hardware.slashdot.org/story/09/06/09/2336223/Apple-Removes-Nearly-All-Reference-To-ZFS
Joerg Schilling
2009-Jun-15 17:57 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Orvar Korvar <no-reply at opensolaris.org> wrote:

> According to this webpage, there are some errors that make ZFS unusable under certain conditions. That is not really optimal for an Enterprise file system. In my opinion the ZFS team should focus on bug correction instead of adding new functionality. The functionality that exists already far surpasses any other file system, so it is better to fix bugs. In my opinion. Read the error reports and complaints about data corruption here:
> http://hardware.slashdot.org/story/09/06/09/2336223/Apple-Removes-Nearly-All-Reference-To-ZFS

Could you help? I cannot see any reference to data corruption on this page.

Jörg

--
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       joerg.schilling at fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Tim Cook
2009-Jun-15 18:30 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Mon, Jun 15, 2009 at 12:57 PM, Joerg Schilling <Joerg.Schilling at fokus.fraunhofer.de> wrote:

> Orvar Korvar <no-reply at opensolaris.org> wrote:
>
>> According to this webpage, there are some errors that make ZFS unusable under certain conditions. [...]
>
> Could you help? I cannot see any reference to data corruption on this page.
>
> Jörg

Did you actually search the page?

http://opensolaris.org/jive/message.jspa?messageID=318457#318457
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-April/027748.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-April/027765.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-January/025601.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-March/027629.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-March/027365.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-March/027257.html

--Tim
Orvar Korvar
2009-Jun-15 18:58 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
In the comments there are several people complaining of losing data. That doesn't sound too good. It takes a long time to build a good reputation, and five minutes to ruin it. We don't want ZFS to lose its reputation as an uber file system.
Bob Friesenhahn
2009-Jun-15 20:14 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Mon, 15 Jun 2009, Orvar Korvar wrote:

> In the comments there are several people complaining of losing data. That doesn't sound too good. It takes a long time to build a good reputation, and five minutes to ruin it. We don't want ZFS to lose its reputation as an uber file system.

I recognize the fellow who griped the most on Slashdot. He wasted quite a lot of time here because he was not willing to read any of the zfs documentation. His PC had failing memory chips which resulted in data corruption. He did not use any ZFS RAID features.

Basically this Slashdot discussion is typical Apple discussion, with lots of people who don't know anything at all talking about what Apple may or may not do. Anyone who did learn what Apple is planning to do can't say anything, since they had to sign an NDA to learn it. As usual, the users will learn what Apple decided to do at midnight on the day the new OS is released.

If Apple dumps ZFS it would most likely be due to not having developed sufficient GUIs to make it totally "user friendly".

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Sean Sprague
2009-Jun-15 20:30 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Orvar Korvar wrote:

> In the comments there are several people complaining of losing data. That doesn't sound too good. It takes a long time to build a good reputation, and five minutes to ruin it. We don't want ZFS to lose its reputation as an uber file system.

With due respect, I recommend that no-one waste the same five minutes that I have just spent reading the comments section on Slashdot. It is a complete load of subjective claptrap. Do something sensible instead, like microwaving a curry or calling your Mom (well, maybe not the latter...).

Bob F got it absolutely right that a possible lack of GUI could be a stumbling block for potential ZFS users in the Apple camp; but to be able to manipulate a filesystem with the underlying power that ZFS has via just two commands (or a few more if you include the SMF bits) is mindblowing. Try comparing that with the mess that is VxVM/VxFS.
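For readers who have not watched it in action, a minimal sketch of what "just two commands" looks like in practice; the pool, device and dataset names below are placeholders, not a recommendation:

    zpool create tank mirror c1t0d0 c1t1d0   # build a mirrored pool in one line
    zfs create tank/home                     # carve out a filesystem
    zfs set compression=on tank/home         # tune a property in place
    zfs snapshot tank/home@before-upgrade    # take a point-in-time snapshot
    zpool status tank                        # check pool health

Everything above is zpool(1M) or zfs(1M); the comparable VxVM/VxFS workflow spreads the same steps across several different utilities.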
Bogdan M. Maryniuk
2009-Jun-16 02:38 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Tue, Jun 16, 2009 at 2:45 AM, Orvar Korvar <no-reply at opensolaris.org> wrote:

> According to this webpage, there are some errors that make ZFS unusable under certain conditions.
> That is not really optimal for an Enterprise file system. In my opinion the ZFS team should focus
> on bug correction instead of adding new functionality. The functionality that exists already far
> surpasses any other file system, so it is better to fix bugs. In my opinion. Read the error reports
> and complaints about data corruption here:
> http://hardware.slashdot.org/story/09/06/09/2336223/Apple-Removes-Nearly-All-Reference-To-ZFS

Slashdot? C'mon, Orvar... what a source to point to, LOL. Try saying something really reasonable on Slashdot, like that GNOME (GUI No One Might Enjoy) actually sucks at integration and is still horrible at small resolutions (e.g. the OK/Cancel buttons end up off the screen on a netbook), and you will be an enemy of the whole world. And if you say that the latest KDE (Kids Desktop Environment) is actually even more terrible than Windows 95, you're simply dead. :-)

Personally, I have tried to get scared about ZFS, but every time yet another slashdotter (read: teenager) screams about dramatic data loss, I am unable to reproduce the problem. So I think it would be much better for the community if we actually found real, step-by-step reproducible crashes (VirtualBox is our friend here) and filed real bug reports; then it would be much more reasonable to talk about a particular case rather than spreading stupid FUD taken from useless Slashdot commenters.

P.S. I mean, let's not waste our time on Slashdot; let's find something actually bad, reproduce it, file a bug, and then report it here. :-)

--
Kind regards, bm
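In that spirit, a minimal sketch of a disposable test pool built on file-backed vdevs (inside VirtualBox or anywhere else), so experiments cannot touch real data; the paths, sizes and pool name are arbitrary:

    mkfile 256m /var/tmp/vdev1 /var/tmp/vdev2 /var/tmp/vdev3
    zpool create crashdummy raidz /var/tmp/vdev1 /var/tmp/vdev2 /var/tmp/vdev3
    # Deliberately damage one backing file, then see whether ZFS notices
    # and repairs it from parity:
    dd if=/dev/urandom of=/var/tmp/vdev2 bs=1024k count=8 conv=notrunc
    zpool scrub crashdummy
    zpool status -v crashdummy

If a sequence like this (or a nastier one) reliably ends in a lost pool or a panic, that is exactly the kind of step-by-step recipe worth attaching to a bug report.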
Orvar Korvar
2009-Jun-16 11:38 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
I totally agree with you. I am just concerned about ZFS's reputation. If there are complaints, what should Sun do? Should the complaints be taken seriously or not? I love ZFS, and I don't want it to lose its credibility. BTW, ZFS rocks. Hard.
Moutacim LACHHAB
2009-Jun-16 13:05 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Hi,

> If there are complaints, what should Sun do? Should the complaints be taken seriously or not?

Customer complaints are ALWAYS taken seriously by Sun, and more than that, with those kinds of statements bugs can be traced, problems resolved, and so the filesystem (ZFS) can be improved. If I took the Slashdot comment "ZFS isn't really ready for prime time" seriously, then I would worry about ZFS's reputation. I do not!

And please try not to manage ZFS via the web console, because it keeps you away from the commands. Only if you are a winblows user would you prefer a GUI to the CLI, and in that case, why move to a UNIX system at all? The "stay away" advice is your friend. Take your time, learn, experiment, THEN use ZFS in a production environment. It really is the last word in filesystems.

kind regards
--
Moutacim LACHHAB
Service Engineer Software Technical Services Center
Sun Microsystems Inc.
Email moutacim.lachhab at Sun.COM
+33(0)134030594 x30210
For knowledge and support: http://sunsolve.sun.com
Miles Nordin
2009-Jun-16 21:58 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
>>>>> "bmm" == Bogdan M Maryniuk <bogdan.maryniuk at gmail.com> writes:bmm> but all the time when yet another slashdotter (read: teenager) bmm> screams the comments about data loss were mostly quoting this list. And some of the posters have said ``I''m losing a lot more ZFS pools than UFS and VxFS filesystems on my FC SAN,'''' so if you decide to take this _ad hominem_ I think it will hurt your argument. bmm> about dramatical data loss, I am unable to reproduce the bmm> problem. What have you done to try to reproduce the problem? I''m very interested in that approach. but ``I''ve had absolutely no problems with it. I can''t tell you how close to zero the number of problems I''ve had with it is. It''s so close to zero, it is zero.''''---that''s not the same thing as trying to reproduce the problem. That''s shouting an anecdote with your ears covered. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090616/39ce3c07/attachment.bin>
Bogdan M. Maryniuk
2009-Jun-17 01:42 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Wed, Jun 17, 2009 at 6:58 AM, Miles Nordin <carton at ivy.net> wrote:

> What have you done to try to reproduce the problem?

Well, if you had posted steps that fail for you here and I missed them, then I am sorry; I would like to dig them out of the archive and try them. However, please don't get me wrong: no ad hominem, but why bother busy people with buzz? Let's drop the attitude. Personally, I am running various OpenSolaris versions in VirtualBox as a crash dummy, as well as running OpenSolaris on real systems. All ZFS. So believe me, I am also really concerned about these kinds of things and quite paranoid. However, currently I see more FUD and useless buzz than reality. Please give clear steps that fail for you, provide some dtrace output, etc. Once others confirm the same, I am pretty sure it will be fixed, because nobody here has ever said ZFS is bug free. And the Sun guys are always willing to help, as is everybody else. :-)

--
Kind regards, bm
Bogdan M. Maryniuk
2009-Jun-17 01:47 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Wed, Jun 17, 2009 at 6:58 AM, Miles Nordin <carton at ivy.net> wrote:

> What have you done to try to reproduce the problem?

P.S. I've read that Slashdot article and all the comments, and even replied to some. Plus, I've actually tried to reproduce the few things that they are able to vaguely describe. No failures so far. Also, I tried to write some stress tests (in Python) and still no failures. The problem is that 99% of Slashdot comments are usually best archived to /dev/null... Really, if you find something serious, let us know.

--
Kind regards, bm
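For illustration only, a crude shell rendering of that kind of stress test (the poster's version was in Python); the pool and paths are placeholders, and the loop proves nothing by itself, it just gives corruption a chance to show up:

    # Write files of random data and remember their checksums.
    DIR=/testpool/stress; mkdir -p $DIR
    i=0
    while [ $i -lt 1000 ]; do
        dd if=/dev/urandom of=$DIR/f$i bs=128k count=8 2>/dev/null
        echo "`digest -a sha1 $DIR/f$i` $DIR/f$i" >> /var/tmp/stress.sha1
        i=`expr $i + 1`
    done
    sync
    # ...export/import the pool, reboot, pull a cable, etc., then verify:
    while read sum file; do
        [ "`digest -a sha1 $file`" = "$sum" ] || echo "MISMATCH: $file"
    done < /var/tmp/stress.sha1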
Orvar Korvar
2009-Jun-17 11:37 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
OK, so you mean the comments are mostly FUD and bullshit? Because there are no bug reports from the whiners? Could this be the case? Is it mostly FUD? Hmmm...?
Bogdan M. Maryniuk
2009-Jun-17 14:13 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Wed, Jun 17, 2009 at 8:37 PM, Orvar Korvar <no-reply at opensolaris.org> wrote:

> OK, so you mean the comments are mostly FUD and bullshit?

Unless there is real, step-by-step reproducible proof, then yes, it is a completely useless waste of time and BS that I would not care about at all, if I were you.

--
Kind regards, BM
Things, that are stupid at the beginning, rarely ends up wisely.
Toby Thain
2009-Jun-17 14:40 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 17-Jun-09, at 7:37 AM, Orvar Korvar wrote:

> OK, so you mean the comments are mostly FUD and bullshit? Because there are no bug reports from the whiners? Could this be the case? Is it mostly FUD? Hmmm...?

Having read the thread, I would say "without a doubt".

Slashdot was never the place to go for accurate information about ZFS.

--Toby
Miles Nordin
2009-Jun-17 21:42 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
>>>>> "bmm" == Bogdan M Maryniuk <bogdan.maryniuk at gmail.com> writes: >>>>> "tt" == Toby Thain <toby at telegraphics.com.au> writes: >>>>> "ok" == Orvar Korvar <no-reply at opensolaris.org> writes:bmm> Personally I am running various open solaris versions on a bmm> VirtualBox as a crash dummy, as well as running osol on a real bmm> systems. All ZFS. It sounds like what you''re doing to ``reproduce'''' the problem is: to use solaris. This is about what I imagined, and isn''t what I had in mind as adequate. bmm> Please give here a clear steps that fails for you, steps given on this list were: 1. use iSCSI or FCP as a vdev. 2. reboot the target but do not reboot the ZFS initiator. bmm> provide some dtrace output etc. haha, ``hello, slashpot? This is slashkettle.'''' It''s just funny that after watching others on this list (sometimes with success!) debug their corrupt filesystems, the tool you latched onto was dtrace, and not mdb or zdb which do not appear in Sun marketing nearly so often. bmm> Unless there is real step-by-step reproducible proof, corruption problems with other filesystems generally do not work this way, though we can try to get closer to it. What''s more common is to pass around an image of the corrupt filesystem. Surely you can understand there is such thing as a ``hard to reproduce problem?'''' Is the phrase so new to you? If you''d experience with other filesystems in their corruption-prone infancy, it wouldn''t be. ok> Ok, so you mean the comments are mostly FUD and bull shit? ok> Because there are no bug reports from the whiners? Access to the bug database is controlled. Access to the mailing list is not. The posters did point to reports on the mailing list. tt> Slashdot was never the place to go for accurate information tt> about ZFS. again, the posts in the slashdot thread complaining about corruption were just pointers to original posts on this list, so attacking the forum where you saw the pointer instead of the content of its destination really is clearly _ad hominem_. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090617/35ff8af8/attachment.bin>
Toby Thain
2009-Jun-17 22:51 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 17-Jun-09, at 5:42 PM, Miles Nordin wrote:

> tt> Slashdot was never the place to go for accurate information
> tt> about ZFS.
>
> Again, the posts in the Slashdot thread complaining about corruption
> were just pointers to original posts on this list, so attacking the
> forum where you saw the pointer instead of the content of its
> destination really is clearly _ad hominem_.

Ad foruminem?!! Or did you simply mean "uncalled-for"? /. is no person... And most of the thread really was rubbish. If one or two posts linked to the mailing list, that doesn't change it.

--Toby
Bogdan M. Maryniuk
2009-Jun-18 01:35 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Thu, Jun 18, 2009 at 6:42 AM, Miles Nordin <carton at ivy.net> wrote:

> Surely you can understand there is such a thing as a "hard to reproduce
> problem"? Is the phrase so new to you? If you had experience with
> other filesystems in their corruption-prone infancy, it wouldn't be.

I understand your point, but I don't understand what you're trying to achieve this way. Of course, not everything that you can do should be done (like rebooting your target, etc.), and of course it helps once it is reproducible. In the same way, if you have a mirror of USB hard drives, then swap the cables and reboot, your mirror is gone. But that's not because of ZFS, if you look more closely... That's why I think that saying "My $foo crashes, therefore it is all crap" is a bad idea: either help to fix it or just don't use it; after all, fsck and lost+found are your friends on ext3 with a corrupted superblock after yet another Linux kernel panic. :-)

JFYI: *all* filesystems crash and lose data from time to time. That's what backups are for. Hence, if you exercise your backups quite often, then you can find the problem and report it here. That would be very appreciated and helpful. Thanks!

--
Kind regards, BM
Things, that are stupid at the beginning, rarely ends up wisely.
Ian Collins
2009-Jun-18 01:35 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Thu 18/06/09 09:42, Miles Nordin carton at Ivy.NET sent:

> Access to the bug database is controlled.

No, the bug database is open.

Ian.
Timh Bergström
2009-Jun-18 04:47 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
The way I see it is that even though ZFS may be a wonderful filesystem, it is not the best solution for every possible (odd) setup. E.g. USB sticks have proven a bad idea with ZFS mirrors; ergo, don't do it(tm).

ZFS on iSCSI *is* flaky, and a host reboot without telling the target will most likely get the target to panic and/or crash in a couple of fun ways. Yes, I have tested this, way too many times, and yes, I know this; therefore I do not reboot my hosts if not absolutely necessary, and if I have to, I usually tell my target(s) to unmount/export/calm down before I do it. The same is true for FC-connected vdevs, and thus I don't "reboot" my SAN nor my FC-connected cabinets without telling the targets.

I know that sometimes you can't avoid these situations, but then I've learned how to "fix" it when it happens, and there's no BS, just fix it.

I've been equally disappointed and encouraged by ZFS itself; I've lost data, I've recovered data (thanks to Victor Latushkin), and I'm OK with it. Why?

Know thy filesystem!

I know what I can do, what I can trust and what I can't, so I do and I don't accordingly.

Flaming people on ./ is *not* the way to silence those who lost data due to corruption; telling them "OK, that's sad, but perhaps ZFS is NOT for you if you want to run a USB-connected iscsi-initiator over zfs raidz with mushrooms and sauce" would probably be a better idea. Nor is blaming *all* errors/corruptions that actually occur on bad hardware (the ECC discussion), stupid programs (mostly GNU-hate) or stupid administrators who didn't get the budget to run backups on 100TB+ nodes, etc.

Now I'll probably get flamed by those who get religious over these kinds of discussions, but that's OK, I don't get hurt by flames on the internet... Arguing on the internet... special olympics yada yada... ;-)

Best Regards,
Timh

2009/6/18 Ian Collins <masuma.ian at quicksilver.net.nz>:
> On Thu 18/06/09 09:42, Miles Nordin carton at Ivy.NET sent:
>
>> Access to the bug database is controlled.
>
> No, the bug database is open.
>
> Ian.

--
Timh Bergström
System Operations Manager
Diino AB - www.diino.com
:wq
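A minimal sketch of the "tell the targets first" routine described above, with a placeholder pool name; the point is simply that the pool is quiesced on the initiator before the target goes away:

    zpool export sanpool      # unmounts the datasets and closes the vdevs cleanly
    # ...reboot or upgrade the iSCSI target / FC enclosure here...
    zpool import sanpool      # bring the pool back once the target has returned
    zpool status -x sanpool   # confirm it came back healthy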
Timh Bergström
2009-Jun-18 05:25 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 18 June 2009 06:47, Timh Bergström <timh.bergstrom at diino.net> wrote:

> The way I see it is that even though ZFS may be a wonderful filesystem,
> it is not the best solution for every possible (odd) setup. [...]
> Nor is blaming *all* errors/corruptions that actually occur on bad
> hardware (the ECC discussion), stupid programs (mostly GNU-hate) or
> stupid administrators who didn't get the budget to run backups on
> 100TB+ nodes, etc.

Edit: Of course, blaming everything that happens on ZFS itself isn't the *right* thing to do either; rather, blame the marketing. If Sun or any other PR says ZFS is the one and only filesystem you need for everything, tell them to go to h*ll, since that is never the truth. And as usual, just because you *can* doesn't mean you *should*.

//T

--
Timh Bergström
System Operations Manager
Diino AB - www.diino.com
:wq
Bogdan M. Maryniuk
2009-Jun-18 07:42 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
2009/6/18 Timh Bergström <timh.bergstrom at diino.net>:

> USB sticks have proven a bad idea with ZFS mirrors

I think USB sticks are a bad idea for mirrors in general... :-)

> ZFS on iSCSI *is* flaky

OK, so what is the status of your bug report about this? Was it ignored or just rejected?..

> Flaming people on ./

Nobody is flaming people, neither in the current directory (./) nor on /. (slash-dot). All that is asked for is practical steps or bug reports.

P.S. Additionally, everyone can vent their true anger on a Solaris installed somewhere on spare hardware and kill that sucker with stress tests. Effect: you're relaxed and the Sun folks have a job. :-)

--
Kind regards, BM
Things, that are stupid at the beginning, rarely ends up wisely.
Timh Bergström
2009-Jun-18 09:04 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 18 June 2009 09:42, Bogdan M. Maryniuk <bogdan.maryniuk at gmail.com> wrote:

>> ZFS on iSCSI *is* flaky
>
> OK, so what is the status of your bug report about this? Was it ignored or
> just rejected?..

No bug report, because I don't think it's the filesystem's fault, and why bother when disappearing vdevs (even though the pool is fully redundant (raidz) and has enough vdevs left to theoretically keep working) cause the machine to panic and crash, while there are other solutions/filesystems that are more robust (for me) when using iSCSI/FC. If my data is gone (or inaccessible), I have other things to worry about than filing bug reports and/or getting on the list and getting flamed for not having proper backups. :-]

How to reproduce? Create a raidz2 pool (with Solaris 10u3) over two iSCSI enclosures, shut down one of the enclosures, and observe the results. It would probably work better if I upgraded Solaris/ZFS, but as I said, at the time I had other things to worry about.

No flaming/blaming/hating; I simply don't use the combination of ZFS + iSCSI/FC for critical data anymore, and that's OK with me.

--
Best Regards,
Timh
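For the record, a hedged rendering of those reproduction steps as commands; the discovery addresses and device names are placeholders, and behaviour on builds newer than Solaris 10u3 may well differ:

    # 1. Make LUNs from two separate iSCSI enclosures visible to the initiator.
    iscsiadm add discovery-address 192.168.1.10:3260
    iscsiadm add discovery-address 192.168.1.11:3260
    iscsiadm modify discovery --sendtargets enable
    devfsadm -i iscsi
    # 2. Build a raidz2 pool striped across both enclosures (placeholder names).
    zpool create tank raidz2 c2t0d0 c2t1d0 c3t0d0 c3t1d0
    # 3. Power off one enclosure while the pool is under write load, and watch
    #    whether the host rides it out, hangs, or panics.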
Miles Nordin
2009-Jun-18 16:14 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
>>>>> "bmm" == Bogdan M Maryniuk <bogdan.maryniuk at gmail.com> writes: >>>>> "tt" == Toby Thain <toby at telegraphics.com.au> writes:bmm> That''s why I think that speaking "My $foo crashes therefore it bmm> is all crap" is bad idea: either help to fix it or just don''t bmm> use it, First, people are allowed to speak and share information, and yes, even complain, without helping to fix things. You do not get to silence people who lack the talent, time, and interest to fix problems. Everyone''s allowed to talk here. Second, I do use ZFS. But I keep a backup pool. And although my primary pool is iSCSI-based, the backup pools are direct-attached. Thanks to the open discussion on the list, I know that using iSCSI puts me at higher risk of pool loss. I know I need to budget for the backup pool equipment if I want to switch from $oldfilesystem to ZFS and not take a step down in reliability. I know that, while there is no time-consuming fsck to draw out downtime, pretty much every corruption event results in ``restore the pool from backup'''' which takes a while, so I need to expect that by, for example, being prepared to run critical things directly off the backup pools. Finally, I know that ZFS pool corruption almost always results in loss of the whole pool, while other filesystem corruption tends to do crazier things which cappen to be less catastrophic to my particular dataset: some files but not all are lost after fsck, some files remain but lose their names, or more usefully retain their names but lose the name of one of their parent directories, the insides of some files are silently corrupted. There''s actionable information in here. Technical discussion is worth more than sucks/rules armwrestling. bmm> The same way, if you have a mirror of USB hard drives, then bmm> swap cables and reboot ? your mirror gone. But that''s not bmm> because of ZFS, if you will look more closely... actually I think you are the one not looking closely enough. You say no one is losing pools, and then 10min later reply to a post about running zdb on a lost pool. You shouldn''t need me to tell you something''s wrong. When you limit your thesis to ``ZFS rules'''' and then actively mislead people, we all lose. tt> /. is no person... right, so I use a word like ad hominem, and you stray from the main point to say ``Erm ayctually your use of rhetorical terminology is incorrect.'''' maybe, maybe not, whatever, but again [x2], the posts in the slashdot thread complaining about corruption were just pointers to original posts on this list, so attacking the forum where you saw the pointer instead of the content of its destination really is clearly _ad hominem_. *brrk* *brr* ``no! no it''s not ad hominem! it''s a different word! ah, ha ah thought'' you''d slip one past me there eh?'''' QUIT BEING SO DAMNED ADD. We can get nowhere. As for the posts being rubbish, you and I both know it''s plausible speculation that Apple delayed unleashing ZFS on their consumers because of the lost pool problems. ZFS doesn''t suck, I do use it, I hope and predict it will get better---so just back off and calm down with the rotten fruit. But neither who''s saying it nor your not wanting to hear it makes it less plausible. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090618/d4039a78/attachment.bin>
Toby Thain
2009-Jun-18 19:36 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 18-Jun-09, at 12:14 PM, Miles Nordin wrote:

> ... you and I both know it's plausible speculation that Apple delayed
> unleashing ZFS on their consumers because of the lost-pool problems.
> ZFS doesn't suck, I do use it, I hope and predict it will get better,
> so just back off and calm down with the rotten fruit. But neither who's
> saying it nor your not wanting to hear it makes it less plausible.

In my opinion, a more plausible explanation is: Apple has not made ZFS integration a high priority [for 10.6]. There is no doubt Apple has the engineering resources to make it perfectly reliable as a component of Mac OS X, if that were a high-priority goal.

I run OS X, but I am not at all tempted to play with ZFS there; life is too short for betas. If I want ZFS, I install Solaris 10.

--Toby
Sean Sprague
2009-Jun-18 23:11 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Toby,

> On 17-Jun-09, at 7:37 AM, Orvar Korvar wrote:
>
>> OK, so you mean the comments are mostly FUD and bullshit? Because there are no bug reports from the whiners? Could this be the case? Is it mostly FUD? Hmmm...?
>
> Having read the thread, I would say "without a doubt".
>
> Slashdot was never the place to go for accurate information about ZFS.

Many would even say:

Slashdot was never the place to go for accurate information.
Slashdot was never the place to go for information.
Slashdot was never the place to go.
Slashdot? Never.

Take your pick ;-)

Regards... Sean.
Miles Nordin
2009-Jun-19 19:17 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
>>>>> "ic" == Ian Collins <masuma.ian at quicksilver.net.nz> writes:>> Access to the bug database is controlled. ic> No, the bug databse is open. no, it isn''t. Not all the bugs are visible, and after submitting a bug it has to be approved. Neither is true of the mailing list. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090619/4d289c75/attachment.bin>
Miles Nordin
2009-Jun-19 19:27 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
>>>>> "bmm" == Bogdan M Maryniuk <bogdan.maryniuk at gmail.com> writes:bmm> OK, so what is the status of your bugreport about this? That''s a good question if it''s meant genuinely, and not to be obstructionist. It''s hard to report one bug with clear information because the problem isn''t well-isolated yet. In my notes: 6565042, 6749630 but as I said before, I''ve found the information on the mailing list more useful w.r.t. this particular problem. You can see how those bugs are about specific, methodically-reproduceable problems. Bugs are not ``I have been losing more zpools than I lost UFS/vxfs filesystems on top of the same storage platform.'''' It may take a while. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090619/08a9bfea/attachment.bin>
Tim Haley
2009-Jun-19 19:34 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Miles Nordin wrote:

> bmm> OK, so what is the status of your bug report about this?
>
> In my notes: 6565042, 6749630

The first of which is marked as fixed in snv_77, 19 months ago. The second is marked as a duplicate of 6784395, fixed in snv_107, 20 weeks ago.

-tim
Miles Nordin
2009-Jun-19 20:09 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
>>>>> "th" == Tim Haley <tim.haley at Sun.COM> writes:th> The second is marked as a duplicate of 6784395, fixed in th> snv_107, 20 weeks ago. Yeah nice sleuthing. :/ I understood Bogdan''s post was a trap: ``provide bug numbers. Oh, they''re fixed? nothing to see here then. no bugs? nothing to see here then.'''' But think about it. Does this mean ZFS was not broken before those bugs were filed? It does not. now, extrapolate: imagine looking back on this day from the future. In the next line of that post right below where I give the bug numbers, I provide context explaining why I still think there''s a problem. Also, as I said elsewhere, there''s a barrier controlled by Sun to getting bugs accepted. This is a useful barrier: the bug database is a more useful drive toward improvement if it''s not cluttered. It also means, like I said, sometimes the mailing list is a more useful place for information. HTH. I think a better question would be: what kind of tests would be most promising for turning some subclass of these lost pools reported on the mailing list into an actionable bug? my first bet would be writing tools that test for ignored sync cache commands leading to lost writes, and apply them to the case when iSCSI targets are rebooted but the initiator isn''t. I think in the process of writing the tool you''ll immediately bump into a defect, because you''ll realize there is no equivalent of a ''hard'' iSCSI mount like there is in NFS. and there cannot be a strict equivalent to ''hard'' mounts in iSCSI, because we want zpool redundancy to preserve availability when an iSCSI target goes away. I think the whole model is wrong somehow. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090619/a3e26d63/attachment.bin>
Nicolas Williams
2009-Jun-19 20:24 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On Fri, Jun 19, 2009 at 04:09:29PM -0400, Miles Nordin wrote:

> Also, as I said elsewhere, there's a barrier controlled by Sun to
> getting bugs accepted. This is a useful barrier: the bug database is
> a more useful drive toward improvement if it's not cluttered. It also
> means, like I said, sometimes the mailing list is a more useful place
> for information.

There are two bug databases, sadly. bugs.opensolaris.org is like you describe, whereas defect.opensolaris.org is not.

Nico
--
Haudy Kazemi
2009-Jun-20 04:51 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
> I think a better question would be: what kind of tests would be most
> promising for turning some subclass of these lost pools reported on
> the mailing list into an actionable bug?
>
> My first bet would be writing tools that test for ignored sync cache
> commands leading to lost writes, and apply them to the case when iSCSI
> targets are rebooted but the initiator isn't.
>
> I think in the process of writing the tool you'll immediately bump
> into a defect, because you'll realize there is no equivalent of a
> 'hard' iSCSI mount like there is in NFS. And there cannot be a strict
> equivalent to 'hard' mounts in iSCSI, because we want zpool redundancy
> to preserve availability when an iSCSI target goes away. I think the
> whole model is wrong somehow.

I'd surely hope that a ZFS pool with redundancy built on iSCSI targets could survive the loss of some targets, whether due to actual failures or to necessary upgrades of the iSCSI targets (think OS upgrades plus reboots on the systems that are offering iSCSI devices to the network).

My suggestion is to use multi-way redundancy with iSCSI, e.g. 3-way mirrors or RAIDZ2, so that you can safely offline one of the iSCSI targets while still leaving the pool with some redundancy. Sure, there is an increased risk while that device is offline, but the window of opportunity for a failure of the second level of redundancy is small; and even then nothing is lost until a third device has a fault.

Failure handling should also distinguish between complete failure (e.g. a device no longer responds to commands whatsoever) and intermittent failure (e.g. a "sticky" patch of sectors, or a drive that stops responding for a minute because it has a non-changeable TLER value that may cause trouble in a RAID configuration). Drives have a gradation from complete failure to flaky to flawless; if the software running on them recognizes this, better decisions can be made about what to do when an error is encountered, rather than the simplistic good/failed model that has been used in RAIDs for years.

My preference for storage behavior is that it should not cause a system panic (ever). Graceful error-recovery techniques are important. File system error messages should be passed up the line when possible, so the user can figure out that something is amiss with some files (even if not all), even when the sysadmin is not around or email notification of problems is not working. If it is possible to return a CRC error to a network share client, that would seem to be a close match for an uncorrectable checksum failure. (Windows throws these errors when it cannot read a CD/DVD.)

A good damage-mitigation feature would be a mechanism that lets a user ignore the checksum failure, since in many user-data cases partial recovery is preferable to no recovery. To ensure that damaged files are not accidentally confused with good files, ignoring the checksum failures might only be allowed through a special "recovery filesystem" that lists only the damaged files the authenticated user has access to. From the network client's perspective, this would be another shared folder/subfolder that is only present when uncorrectable, damaged files have been found. ZFS would set up the appropriate links to replicate the directory structure of the original as needed to include the damaged files.
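A hedged sketch of the multi-way-redundancy suggestion above: a three-way mirror across LUNs from three different iSCSI target hosts (device names are placeholders), so that one target can be taken down for maintenance while redundancy remains:

    zpool create tank mirror c2t0d0 c3t0d0 c4t0d0
    zpool offline tank c4t0d0    # before rebooting/upgrading the third target
    # ...perform the maintenance on that target...
    zpool online tank c4t0d0     # ZFS resilvers the stale side automatically
    zpool status tank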
Dave
2009-Jun-20 06:25 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Haudy Kazemi wrote:

>> My first bet would be writing tools that test for ignored sync cache
>> commands leading to lost writes, and apply them to the case when iSCSI
>> targets are rebooted but the initiator isn't. [...]
>
> I'd surely hope that a ZFS pool with redundancy built on iSCSI targets
> could survive the loss of some targets, whether due to actual failures
> or to necessary upgrades of the iSCSI targets (think OS upgrades plus
> reboots on the systems that are offering iSCSI devices to the network).

I've had a mirrored zpool created from Solaris iSCSI target servers in production since April 2008. I've had disks die and reboots of the target servers - ZFS has handled them very well. My biggest wish is to be able to tune the iSCSI timeout value so ZFS can fail reads/writes over to the other half of the mirror quicker than it does now (about 180 seconds on my config). A minor gripe considering the features that ZFS provides.

I've also had the ZFS server (the initiator aggregating the mirrored disks) unintentionally power cycled with the iSCSI zpool imported. The pool re-imported and scrubbed fine.

ZFS is definitely my FS of choice - by far.
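On the timeout wish: the roughly 180 seconds observed above is consistent with the generic sd(7D) per-command timeout multiplied by its retry count, but whether those knobs actually govern a given iSCSI initiator path is an assumption to verify rather than a statement of fact. A hedged sketch of where one might look:

    echo "sd_io_time/D" | mdb -k     # inspect the current per-command timeout (default 60s)
    # /etc/system (takes effect after reboot) -- test on a scratch box first:
    #   set sd:sd_io_time = 15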
Bogdan M. Maryniuk
2009-Jun-20 13:09 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Hi, Miles! Hope the weather is fine at your place. :-)

On Sat, Jun 20, 2009 at 5:09 AM, Miles Nordin wrote:

> I understood Bogdan's post was a trap: "provide bug numbers. Oh,
> they're fixed? nothing to see here then. no bugs? nothing to see
> here then."

It would be great if you did not put words in my mouth, please. All I wanted to say (not to you, but to everyone, including myself) is that we have to be constructive and use common sense (which is not so common, unfortunately). Otherwise I am not sure we are welcome here.

> Does this mean ZFS was not broken before those bugs were filed?

Does this mean ZFS has no more bugs? Does this mean we have to stop using it? Were flame-throwing dragons real? Is there life on Mars?.. :) Just kidding, never mind. :-)

> Also, as I said elsewhere, there's a barrier controlled by Sun to
> getting bugs accepted.

Looks like you're new here. :-) E.g. there is a list of very nasty bugs in Sun Java that were filed in 2006 or earlier, and lots of people (including me) are still suffering from them now, in 2009. But hey, it is not our job to cry and spread FUD around, I think. How about this scenario: either let's find a workaround (and provide it on the same bug report) or, if it is so critical (and Sun rejected it), let's make a nice PDF with exploit sources or step-by-step instructions for crashing your system down to Italian spaghetti, and publish it on Slashdot :-) to let the "good" guys figure out the rest of how to kill Solarises in two seconds. Then I am 100.0% sure Sun will patch it immediately. It is exaggerated, but still, do you like it?

But instead of doing it that way, the Slashdot folks mostly just talk vague blah-blah-blah (often modded "Insightful: 5" or "Interesting: 5" while it is just a troll or FUD) rather than doing something really useful. I am pretty sure that if there were graphical comparisons with source code on Phoronix or similar resources, like "FAT32 seriously beats ZFS in stability" or "How to DoS your ZFS from Google Android" or "Linux's ext2 is quince faster than ZFS", then that would add more adrenaline to the Sun folks fixing it. However... there is only Slashdot talk, which is nothing more than just Slashdot talk.

I understand that you and other Slashdot folks had some problems. But I haven't, and neither have lots of other people for whom ZFS works just fine. Thus it is even/even. :-P

> HTH.

No, it does not. It is just yet another e-mail posting that does not really help fix bugs. :-)

> I think a better question would be: what kind of tests would be most
> promising for turning some subclass of these lost pools reported on
> the mailing list into an actionable bug?
>
> My first bet would be writing tools that test for ignored sync cache
> commands leading to lost writes, and apply them to the case when iSCSI
> targets are rebooted but the initiator isn't.
>
> I think in the process of writing the tool you'll immediately bump
> into a defect, because you'll realize there is no equivalent of a
> 'hard' iSCSI mount like there is in NFS. And there cannot be a strict
> equivalent to 'hard' mounts in iSCSI, because we want zpool redundancy
> to preserve availability when an iSCSI target goes away. I think the
> whole model is wrong somehow.

Now this DOES make sense! :-) Actually, iSCSI has lots of small issues that grow into serious problems, so this needs to be brought up and clearly described, and I am sure suggestions are welcome. If you want help with stress tests, then I can help you with this, I think.

For example, here is a very nice article on an iSCSI setup for Time Machine. The article is also a very nice academic example that could teach the Slashdot folks how to write docs, complaints and reports that make sense:

http://www.kamiogi.net/Kamiogi/Frame_Dragging/Entries/2009/5/25_OpenSolaris_ZFS_iSCSI_Time_Machine_in_20_Minutes_or_Less.html

So go check it out, follow the steps and build the same thing. Then write some scripts that can bring it down, find out why, find where the problem is, suggest a solution and publish it in Sun's bug database. If you do that, you have my applause and respect. How does this sound to you? :-)

--
Kind regards, BM
Things, that are stupid at the beginning, rarely ends up wisely.
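A hedged outline of the kind of setup the linked article walks through (2009-era iscsitgt; the newer COMSTAR stack uses different commands, and the names and sizes here are placeholders rather than anything taken from the article):

    zfs create -V 200g tank/tmachine      # a 200 GB zvol as backing store
    zfs set shareiscsi=on tank/tmachine   # publish the zvol as an iSCSI target
    iscsitadm list target -v              # note the IQN the Mac initiator needs
    # Point the Mac's iSCSI initiator at that IQN and let Time Machine use the
    # resulting disk; breaking *this* setup repeatably is bug-report material.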
Moutacim LACHHAB
2009-Jun-22 07:41 UTC
[zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Nicolas Williams schrieb:> On Fri, Jun 19, 2009 at 04:09:29PM -0400, Miles Nordin wrote: > >> Also, as I said elsewhere, there''s a barrier controlled by Sun to >> getting bugs accepted. This is a useful barrier: the bug database is >> a more useful drive toward improvement if it''s not cluttered. It also >> means, like I said, sometimes the mailing list is a more useful place >> for information. >> > > There''s two bug databases, sadly. bugs.opensolaris.org is like you > describe, whereas defect.opensolaris.org is not. > > Nico >Hi, are you talking about ZFS or just BUG''s on opensolaris ? Just like everywhere, there are internal databases -for internal use only-, and others like http://bugs.sun.com/ for everyone ! And the barrier is controlled for good reasons; -most of the times there is a workaround, that someone ignored, or people just don''t know. So its only a BUG, if there is no workaround to fiw the problem. kind regards, Moutacim -- Moutacim LACHHAB Service Engineer Software Technical Services Center Sun Microsystems Inc. Email moutacim.lachhab at Sun.COM <mailto:moutacim.lachhab at sun.com> +33(0)134030594 x31457 For knowledge and support: http://sunsolve.sun.com