pbojanic@clusterfs.com
2007-Jan-18 05:37 UTC
[Lustre-devel] [Bug 10734] ptlrpc connect to non-existant node crashes
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10734

Eric advises:

1. Create a global list of zombie imports and exports and a cleanup thread that consumes them.
2. Change class_import_put() to add the import to the zombie import list on removing the last reference, and run the rest of it from the cleanup thread.
3. Change __class_export_put() to add the export to the zombie export list on removing the last reference, and run the rest of it from the cleanup thread.

1 week to code and unit test? I wonder if there are existing fields in struct obd_export and struct obd_import that could be used for the queueing, but it's not a big deal. Someone with familiarity with this code should verify my suggestion.
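What Eric describes is a classic deferred-destruction pattern: the final put() only queues the object, and a dedicated thread does the actual teardown. Below is a minimal user-space sketch of that pattern in C with pthreads - it is only an illustration, not the Lustre code; struct import, import_put(), zombie_thread() and the list/lock names are invented stand-ins for struct obd_import, class_import_put() and the proposed cleanup thread, which would of course run in kernel context.

/* User-space analogue of the proposed zombie-list scheme (illustrative only;
 * none of these names are Lustre symbols).  Build with: cc -pthread demo.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct import {                       /* stand-in for struct obd_import */
        int            refcount;
        struct import *next;          /* queueing field for the zombie list */
        const char    *target;
};

static struct import  *zombie_list;   /* global list of zombie imports */
static pthread_mutex_t zombie_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  zombie_cond = PTHREAD_COND_INITIALIZER;
static int             zombies_pending;

/* class_import_put() analogue: on dropping the last reference, queue the
 * import for destruction instead of tearing it down in the caller's context. */
static void import_put(struct import *imp)
{
        if (--imp->refcount > 0)      /* a real version would use an atomic refcount */
                return;

        pthread_mutex_lock(&zombie_lock);
        imp->next = zombie_list;
        zombie_list = imp;
        zombies_pending++;
        pthread_cond_broadcast(&zombie_cond);
        pthread_mutex_unlock(&zombie_lock);
}

/* Cleanup thread: consumes the zombie list and does the expensive teardown
 * (in Lustre this is where network callbacks would be waited out). */
static void *zombie_thread(void *arg)
{
        (void)arg;
        for (;;) {
                struct import *imp;

                pthread_mutex_lock(&zombie_lock);
                while (zombie_list == NULL)
                        pthread_cond_wait(&zombie_cond, &zombie_lock);
                imp = zombie_list;
                zombie_list = imp->next;
                pthread_mutex_unlock(&zombie_lock);

                printf("destroying import to %s\n", imp->target);
                free(imp);

                pthread_mutex_lock(&zombie_lock);
                zombies_pending--;
                pthread_cond_broadcast(&zombie_cond);
                pthread_mutex_unlock(&zombie_lock);
        }
        return NULL;
}

int main(void)
{
        pthread_t tid;
        struct import *imp = calloc(1, sizeof(*imp));

        imp->refcount = 1;
        imp->target   = "unreachable-node";

        pthread_create(&tid, NULL, zombie_thread, NULL);
        import_put(imp);      /* drops the last ref: queued, not destroyed inline */
        sleep(1);             /* give the cleanup thread time to run */
        return 0;
}

The point of the pattern is that import_put() becomes cheap and safe to call from any context; all the slow or blocking work moves into the cleanup thread.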
eeb@clusterfs.com
2007-Jan-18 13:12 UTC
[Lustre-devel] [Bug 10734] ptlrpc connect to non-existant node crashes
Created an attachment (id=9372): patch against b1_5
(https://bugzilla.lustre.org/attachment.cgi?id=9372&action=view)

This patch changes put_{im,ex}port to schedule the {im,ex}port for destruction by the ptlrpc daemon. It's more of a DLD than an actual solution, and the following points must be considered.

1. It uses the ptlrpc daemon to do the actual {im,ex}port destruction. I used it for convenience; the extra cleanup work is negligible and won't affect performance. But this means the ptlrpc daemon must run everywhere, not just on clients (and the MDS?) as it has up till now.
2. This introduces an extra level of asynchrony into shutting down. Note that shutdown has never actually been synchronous, even though it may appear to be and lconf assumes it is - it has always been the case that network callbacks have to complete before everything can clean up, although normally this is quite fast. More work may be required to make shutdown/unload scripts block properly; one way to do that is sketched after this list.
3. I've not actually tested this code on b1_5.
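One way shutdown or module-unload paths could block on the deferred cleanup is a simple barrier over the count of queued zombies. This continues the hypothetical user-space sketch above (zombie_lock, zombie_cond and zombies_pending are names invented there, not Lustre symbols) and only illustrates the idea; it is not what the attached patch does.

/* Hypothetical barrier: shutdown/unload would call this to block until the
 * cleanup thread has destroyed every queued zombie. */
static void zombie_barrier(void)
{
        pthread_mutex_lock(&zombie_lock);
        while (zombies_pending > 0)
                pthread_cond_wait(&zombie_cond, &zombie_lock);
        pthread_mutex_unlock(&zombie_lock);
}

In the earlier sketch, calling zombie_barrier() in main() instead of sleep(1) would make the process exit only after the queue drains, which is roughly the guarantee a shutdown/unload script would want.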
nathan@clusterfs.com
2007-Jan-19 12:35 UTC
[Lustre-devel] [Bug 10734] ptlrpc connect to non-existant node crashes
What                         |Removed |Added
-----------------------------+--------+------
Attachment #9372 is obsolete |0       |1

Created an attachment (id=9388): compile fix and added regression test
(https://bugzilla.lustre.org/attachment.cgi?id=9388&action=view)
eeb@clusterfs.com
2007-Jan-20 04:14 UTC
[Lustre-devel] [Bug 10734] ptlrpc connect to non-existant node crashes
> Eric, is there any more work you want to do here before we land this?

I'm not familiar with 1.6 teardown / module unload procedures. Does that need to take account of possible delays and, if so, does it? Apart from that issue, I'm happy this can land.
nathan@clusterfs.com
2007-Jan-20 11:41 UTC
[Lustre-devel] [Bug 10734] ptlrpc connect to non-existant node crashes
What       |Removed |Added
-----------+--------+---------
Status     |NEW     |RESOLVED
Resolution |        |FIXED

(In reply to comment #13)
> > Eric, is there any more work you want to do here before we land this?
>
> I'm not familiar with 1.6 teardown / module unload procedures. Does that need
> to take account of possible delays and, if so, does it?

umount waits for the last disk reference to drop before returning, but isn't particularly concerned that all obds stop. As long as all obds do eventually stop (the mgc in this case), I don't have a problem with it. It will prevent immediate re-startup (probably EALREADY), but since it's trying to talk to a non-responding node anyhow, I don't think this will cause anyone pain.

> Apart from that issue, I'm happy this can land.

Landed on b1_5.