Hi Chris -

I'm starting to dig into the fun part of error handling and
btrfs_commit_transaction is a minefield right now.

I've been thinking about how I would go about recovering from a
serious error like an -EIO while writing out or an -ENOMEM in a deep
part of the code that it's prohibitively expensive to recover from.
Mostly I'm looking for the best way to make calling btrfs_std_error()
be functionally equivalent to killing the power on the disk. We
already block off new writers, but that's obviously nowhere near
enough. We could have an open transaction floating around, uncommitted
transactions queued, and then an unrecoverable error hits, forcing us
to shut it all down.

It seems to me that a method of recovery similar to the one I wrote
for reiserfs can be used here as well. Am I understanding correctly
that if I go through the motions of committing the transaction
*except* for updating the tree roots, or maybe even doing that but
declining to write the superblocks out, the transaction essentially
doesn't exist on disk? Including the allocations? The in-memory
representation will not match what's on disk, but that's what happens
with every file system in RO-failure mode. With CoW even for data,
data is essentially frozen in time as well. (I suppose with nodatacow
that's not true, but that's for another day.)

-Jeff

--
Jeff Mahoney
SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, Dec 13, 2011 at 04:47:30PM -0500, Jeff Mahoney wrote:
> It seems to me that a method of recovery similar to the one I wrote
> for reiserfs can be used here as well. Am I understanding correctly
> that if I go through the motions of committing the transaction
> *except* for updating the tree roots, or maybe even doing that but
> declining to write the superblocks out, the transaction essentially
> doesn't exist on disk? Including the allocations?

Hi Jeff,

Thanks for taking another pass at this.

It should be possible to just skip the step where we update the roots
in the super and you'll keep a fully consistent FS on disk. The only
rule would be that you're not allowed to take a block that we've freed
in the aborted transaction and reuse it.

-chris

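A minimal user-space sketch of the rule Chris describes, not btrfs code. It assumes a toy disk where only the superblock names the live tree root; the names `struct disk`, `struct txn`, `txn_cow_root`, `txn_commit` and `block_reusable` are hypothetical and exist only for illustration. An aborted transaction skips the superblock update, so the old root stays live, and any block the transaction freed stays off-limits for reuse.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8

/* Toy on-disk state: the superblock is the only pointer to the live root. */
struct disk {
    int super_root;          /* block number of the committed tree root */
    int blocks[NBLOCKS];     /* block contents, a stand-in for tree nodes */
};

/* Toy in-memory transaction (hypothetical, for illustration only). */
struct txn {
    int new_root;            /* CoW copy of the root written by this txn */
    bool freed[NBLOCKS];     /* blocks this txn freed */
    bool aborted;
    bool committed;
};

void txn_begin(const struct disk *d, struct txn *t)
{
    memset(t, 0, sizeof(*t));
    t->new_root = d->super_root;
}

/* CoW: write the new root into a different block, never over the live one. */
void txn_cow_root(struct disk *d, struct txn *t, int target, int value)
{
    assert(target >= 0 && target < NBLOCKS);
    assert(target != d->super_root);
    d->blocks[target] = value;
    t->freed[d->super_root] = true;   /* the old root is freed by this txn */
    t->new_root = target;
}

/* The superblock update is the commit point; an aborted txn skips it. */
int txn_commit(struct disk *d, struct txn *t)
{
    if (t->aborted)
        return -EIO;
    d->super_root = t->new_root;
    t->committed = true;
    return 0;
}

/* A block freed by a txn may be reused only if that txn really committed. */
bool block_reusable(const struct txn *t, int block)
{
    return !t->freed[block] || t->committed;
}
```

After an abort, `d.super_root` still points at the old root and `block_reusable()` refuses the old root block, which is what keeps the on-disk tree consistent in this model.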
On 12/13/2011 07:13 PM, Chris Mason wrote:
> It should be possible to just skip the step where we update the
> roots in the super and you'll keep a fully consistent FS on disk.
> The only rule would be that you're not allowed to take a block that
> we've freed in the aborted transaction and reuse it.

Perfect.

Sorry I haven't responded to this yet. I started digging right in and
I've started to have some good results. It turns out there's already a
btrfs_cleanup_transaction call that will tear down outstanding
transactions. It's not perfect and I've fixed a few bugs in there, but
it saved me a bunch of effort. I just wish I'd noticed it a day
earlier, since I had it half implemented myself. :)

This afternoon I started running xfstests on a dm-linear mapped
partition. Halfway through a sufficiently long test, I swap out the
linear mapping for an error mapping. It still crashes, but somewhat
less spectacularly. There are still a ton of BUG_ONs I need to
eliminate, and I need to work out the usual I/O error-recovery issue
of uninterruptible, unrecoverable writeback contexts and still-locked
pages holding up exit. I'm pretty pleased with the results so far and
am pretty optimistic.

-Jeff

--
Jeff Mahoney
SUSE Labs

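The test setup Jeff describes (swap a working mapping for a failing one while the workload runs) can be modeled in user space. This is only a sketch of the idea, not device-mapper code: `struct dev`, `linear_write`, `error_write` and `run_workload` are hypothetical names, and the real test swaps device-mapper tables underneath a mounted file system.

```c
#include <assert.h>
#include <errno.h>

#define DEV_SIZE 16

struct dev;
typedef int (*write_fn)(struct dev *d, int sector, char byte);

/* Toy block device whose write path can be swapped at run time. */
struct dev {
    write_fn write;          /* current mapping */
    char data[DEV_SIZE];
};

/* "Linear" mapping: writes land on the backing store. */
int linear_write(struct dev *d, int sector, char byte)
{
    if (sector < 0 || sector >= DEV_SIZE)
        return -EINVAL;
    d->data[sector] = byte;
    return 0;
}

/* "Error" mapping: every write fails with -EIO. */
int error_write(struct dev *d, int sector, char byte)
{
    (void)d;
    (void)sector;
    (void)byte;
    return -EIO;
}

/*
 * Write sectors [0, n), swapping in the error mapping just before
 * sector swap_at. Returns 0 on success or the first error seen.
 */
int run_workload(struct dev *d, int n, int swap_at)
{
    for (int i = 0; i < n; i++) {
        if (i == swap_at)
            d->write = error_write;
        int ret = d->write(d, i, 'x');
        if (ret)
            return ret;
    }
    return 0;
}
```

Everything written before the swap stays on the toy device and the first write after it returns -EIO, which is the condition the error-handling paths then have to survive.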
On 12/22/2011 10:59 AM, Jeff Mahoney wrote:
> Sorry I haven't responded to this yet. I started digging right in and
> I've started to have some good results. It turns out there's already a
> btrfs_cleanup_transaction call that will tear down outstanding
> transactions. It's not perfect and I've fixed a few bugs in there, but
> it saved me a bunch of effort. I just wish I'd noticed it a day
> earlier, since I had it half implemented myself. :)

Hi Jeff,

Yes, it should be. I wrote this cleanup_transaction, and I should have
let you know about it earlier... Anyway, thanks for your effort.

The error handling part has lots of corner cases, so I just picked a
brute-force way to tear down the current transaction in order to make
the FS RO.

thanks,
liubo

On 12/21/2011 10:21 PM, Liu Bo wrote:
> Yes, it should be. I wrote this cleanup_transaction, and I should have
> let you know about it earlier... Anyway, thanks for your effort.
>
> The error handling part has lots of corner cases, so I just picked a
> brute-force way to tear down the current transaction in order to make
> the FS RO.

Oh, and it's worked great. The brute force method is a good start and
will address the most severe problems (and most cases) well. I've
decided to ignore most cases of -ENOMEM for now. The biggest bug I ran
into so far was calling mutex_lock while holding a spinlock. It was a
quick fix.

The method I've generally used is to mark the transaction aborted and
pass the error up as quickly as possible, cleaning up the local
allocations and locks as I go. The transaction gets completed
normally, returns an error, isn't committed, and then is destroyed
(with others, potentially) when called from btrfs_commit_transaction.
Btrfs makes this super easy since we can just skip all the CoW writes.

Thanks!

-Jeff

--
Jeff Mahoney
SUSE Labs

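The pattern Jeff outlines (mark the transaction aborted, hand the error up right away, release local allocations and locks on the way out) might look roughly like this in plain C. It is a hedged user-space sketch, not the actual btrfs patch: `struct txn`, `struct lock`, `txn_abort`, `write_block`, `do_step` and `txn_commit` are all hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy transaction: 0 while healthy, else the first error that aborted it. */
struct txn {
    int aborted;
};

/* Stand-in for a kernel lock, so the unwind path is observable. */
struct lock {
    bool held;
};

/* Record only the first error; later failures do not overwrite it. */
void txn_abort(struct txn *t, int err)
{
    if (!t->aborted)
        t->aborted = err;
}

/* Hypothetical low-level step that can be told to fail. */
int write_block(bool fail)
{
    return fail ? -EIO : 0;
}

/* One operation inside a transaction, with a single unwind path. */
int do_step(struct txn *t, struct lock *l, bool fail)
{
    int ret;
    char *buf = malloc(64);          /* local allocation */

    if (!buf)
        return -ENOMEM;
    l->held = true;                  /* local lock */

    ret = write_block(fail);
    if (ret) {
        txn_abort(t, ret);           /* mark the transaction aborted */
        goto out;                    /* and pass the error up right away */
    }
    /* ... further work would go here on success ... */
out:
    l->held = false;                 /* drop the lock on every path */
    free(buf);                       /* free the allocation on every path */
    return ret;
}

/* Commit refuses to make an aborted transaction durable. */
int txn_commit(struct txn *t)
{
    return t->aborted;               /* 0 means "would commit" in this model */
}
```

The single `out:` label keeps the cleanup in one place, so the failing call neither leaks the buffer nor returns with the lock held, and the commit step sees the recorded error.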
On 12/21/2011 10:38 PM, Jeff Mahoney wrote:
> The method I've generally used is to mark the transaction aborted and
> pass the error up as quickly as possible, cleaning up the local
> allocations and locks as I go. The transaction gets completed
> normally, returns an error, isn't committed, and then is destroyed
> (with others, potentially) when called from btrfs_commit_transaction.
> Btrfs makes this super easy since we can just skip all the CoW writes.

Now, just out of curiosity, would it be ok if I printed this when we
run out of memory in deep call paths?

FAIL WHALE!

   W     W      W
W        W  W     W
              '.  W
  .-""-._     \ \.--|
 /       "-..__) .-'
|     _         /
\'-.__,   .__.,'
 `'----'._\--'
VVVVVVVVVVVVVVVVVVVVV

Happy Holidays ;)

-Jeff

--
Jeff Mahoney
SUSE Labs

On 12/23/2011 01:12 PM, Jeff Mahoney wrote:
> Now, just out of curiosity, would it be ok if I printed this when we
> run out of memory in deep call paths?

I'm ok with this, but it depends on Chris :)

Indeed, ENOMEM in deep call paths is big trouble for us, and we don't
yet have a graceful solution. For simplicity we can make the memory
allocation with the __GFP_NOFAIL flag in the mask, although it is not
recommended:

 * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures. This modifier is deprecated and no new
 * users should be added.

> Happy Holidays ;)

Happy Holidays!

thanks,
liubo

On Fri, Dec 23, 2011 at 12:12:31AM -0500, Jeff Mahoney wrote:
> Now, just out of curiosity, would it be ok if I printed this when we
> run out of memory in deep call paths?
>
> FAIL WHALE!
>
> Happy Holidays ;)

I'll take any patch you put into the suse kernel ;)

-chris