All,

I am finding that on reboot, my client systems hang and require a hard reset because of the lnet service.

It doesn't unload all the modules in the proper order for me. I get errors like "module osc has non-zero count".

After some hunting, I see that lov needs to be unloaded before osc. I also find that:
fid and fld need to be unloaded before ptlrpc
ofd needs to be unloaded before ost

Sometimes others as well.

Is there an updated lnet script available that is more complete? For the time being, I have been modifying the stock one to include the 'missing' modules for processing.

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238
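The ordering constraints described in the message above can be sketched as a small shell function. This is an illustration only, not the stock lnet script and not the patch discussed in this thread. The names UNLOAD_ORDER, MODULES_FILE, RMMOD, is_loaded, and unload_modules are hypothetical, and the list encodes only the three constraints reported here, with ptlrpc placed last as an assumption. A real system may need further modules.

```sh
#!/bin/sh
# Sketch only: unload modules in an order that respects the dependencies
# reported in this thread. Not the stock lnet script.

# Assumption: this list encodes only the constraints from the thread
# (lov before osc; fid and fld before ptlrpc; ofd before ost).
UNLOAD_ORDER="lov osc fid fld ofd ost ptlrpc"

# Both are overridable so the function can be dry-run without root.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"
RMMOD="${RMMOD:-rmmod}"

# Succeeds if the named module appears in the loaded-module list.
# Each line of /proc/modules starts with the module name and a space.
is_loaded() {
    grep -q "^$1 " "$MODULES_FILE"
}

# Remove each loaded module in list order; stop at the first failure.
unload_modules() {
    for mod in $UNLOAD_ORDER; do
        if is_loaded "$mod"; then
            $RMMOD "$mod" || return 1
        fi
    done
}
```

For a dry run without root, set RMMOD=echo before calling unload_modules. It then prints the names of the modules it would remove instead of removing them.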
Hi Brian,

On Sun, Jun 30, 2013 at 05:37:42PM +0000, Andrus, Brian Contractor wrote:
> Is there an updated lnet script available that is more complete?

There is a patch to improve the module unloading behavior here, but it is not yet landed:

http://review.whamcloud.com/5478/

It would be valuable to know if the proposed changes fix your issue.

Ned
I have put the patch in place and so far it seems to be working!

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238

> -----Original Message-----
> From: Ned Bass [mailto:bass6-i2BcT+NCU+M@public.gmane.org]
> Sent: Monday, July 01, 2013 3:03 PM
> To: Andrus, Brian Contractor
> Cc: lustre-discuss-aLEFhgZF4x6X6Mz3xDxJMA@public.gmane.org
> Subject: Re: [Lustre-discuss] /etc/init.d/lnet
>
> There is a patch to improve the module unloading behavior here, but it is
> not yet landed:
>
> http://review.whamcloud.com/5478/
>
> It would be valuable to know if the proposed changes fix your issue.
>
> Ned
Bleh, I'm getting a 404 error on that link.

Is there an updated link for the patch or a full copy of the patched script?

------------------------------------------------------------------------
From: Andrus, Brian Contractor <bdandrus-u6e/tGqFTB8@public.gmane.org>
Sent: Tue, 2 Jul 2013 14:11:38 +0000
To: Ned Bass <bass6-i2BcT+NCU+M@public.gmane.org>
Subject: Re: [Lustre-discuss] /etc/init.d/lnet

> I have put the patch in place and so far it seems to be working!
>
>> There is a patch to improve the module unloading behavior here, but it is
>> not yet landed:
>>
>> http://review.whamcloud.com/5478/

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss-aLEFhgZF4x6X6Mz3xDxJMA@public.gmane.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
On Tue, Jul 09, 2013 at 05:30:51PM -0500, Mike Hanby wrote:
> Bleh, I'm getting a 404 error on that link.
>
> Is there an updated link for the patch or a full copy of the patched script?

Sorry, there should be no trailing slash.

http://review.whamcloud.com/5478

Ned