I have a small Lustre setup that's worked pretty well for the last year. A few weeks ago I posted that the OSS locks up when it mounts a particular OST. The Dilger Procedure was suggested, and mounting with "-o abort_recovery" worked. This morning it crashed again, and remounting the same OST and one other hard locks the system. I'm able to mount the newest problem child with abort_recovery but not the original troublemaker. I've made several tries at truncating the last_rcvd file but no luck. The system has about 30 seconds after the OST is mounted before locking up.

Suggestions?

The dump mentions dirty journal metadata as a possible culprit.

Dan
Andreas Dilger
2008-Sep-30 17:35 UTC
[Lustre-discuss] mounting OST causes hard lock, part 2
On Sep 29, 2008 17:08 -0700, Dan wrote:
> I have a small Lustre setup that's worked pretty well for the last year. A few weeks ago I posted that the OSS locks up when it mounts a particular OST. The Dilger Procedure was suggested, and mounting with "-o abort_recovery" worked. This morning it crashed again, and remounting the same OST and one other hard locks the system. I'm able to mount the newest problem child with abort_recovery but not the original troublemaker. I've made several tries at truncating the last_rcvd file but no luck. The system has about 30 seconds after the OST is mounted before locking up.
>
> Suggestions?
>
> The dump mentions dirty journal metadata as a possible culprit.

Run "e2fsck -f" while the OST is unmounted, using the latest e2fsprogs from Sun.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
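The advice above has two parts: the OST must be unmounted, and the check is forced with "-f". A minimal Python sketch of that precondition might look like the following. The device path /dev/sdb1 and both helper names are hypothetical, and the sketch only builds the command line rather than running e2fsck.

```python
def is_mounted(device, mounts_text):
    """Return True if `device` is listed as a source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # The first field of each /proc/mounts line is the mounted device.
        if fields and fields[0] == device:
            return True
    return False


def fsck_command(device, mounts_text):
    """Build the forced-check command, refusing if the device is still mounted."""
    if is_mounted(device, mounts_text):
        raise RuntimeError(f"{device} is still mounted; unmount the OST first")
    # -f forces a check even when the filesystem is marked clean.
    return ["e2fsck", "-f", device]


# Example use on the OSS (hypothetical device path):
#   with open("/proc/mounts") as fh:
#       print(" ".join(fsck_command("/dev/sdb1", fh.read())))
```

Passing the mount table in as text keeps the check easy to try against a saved copy of /proc/mounts before touching the real OST.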
Dear all,

We implemented a Lustre file system some time ago. Recently we changed our DNS server and updated /etc/resolv.conf on every node to point to the new one. It seems to be working fine. However, we want to head off any issues that this change might cause in the future.

As far as I know, as long as Lustre can resolve the DNS names, there should not be any other issues. Is that correct, or might there be some hidden problems?

Thanks for your clarification.

Best Regards,
Karen
Hi,

As far as I know this is correct. To make sure, just run lctl ping from each client to the Lustre servers using names instead of IP addresses. If lctl ping resolves the names correctly, then everything should work fine when names are used.

Cheers

Wojciech

Minh Hien wrote:
> Dear all,
>
> We implemented a Lustre file system some time ago. Recently we changed our DNS server and updated /etc/resolv.conf on every node to point to the new one. It seems to be working fine. However, we want to head off any issues that this change might cause in the future.
>
> As far as I know, as long as Lustre can resolve the DNS names, there should not be any other issues. Is that correct, or might there be some hidden problems?
>
> Thanks for your clarification.
>
> Best Regards,
> Karen
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
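The name-based check suggested in this reply can be sketched in Python as below. This is only an illustration under stated assumptions: the server names mds01 and oss01 and the "tcp" LNET network are hypothetical, the helper name is made up, and the sketch prepares the lctl ping commands for each client rather than executing them.

```python
import socket


def check_servers(hostnames, network="tcp", resolve=socket.gethostbyname):
    """Resolve each server name and pair it with the lctl ping command to run."""
    results = []
    for name in hostnames:
        try:
            address = resolve(name)
        except OSError:
            # socket.gaierror is a subclass of OSError: the name did not resolve.
            address = None
        # lctl ping takes a NID; here the NID is written with the host name.
        command = ["lctl", "ping", f"{name}@{network}"]
        results.append((name, address, command))
    return results


# Example use on a client (hypothetical server names):
#   for name, address, command in check_servers(["mds01", "oss01"]):
#       print(name, address or "DOES NOT RESOLVE", " ".join(command))
```

A name that comes back with no address points at the new DNS configuration rather than at Lustre itself, which is worth ruling out before running the pings.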
Dear Wojciech,

Thanks for your reply.

Karen

--- On Thu, 10/2/08, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
> From: Wojciech Turek <wjt27 at cam.ac.uk>
> Subject: Re: [Lustre-discuss] Lustre file system and DNS
> To: minhhien261 at yahoo.com
> Cc: lustre-discuss at lists.lustre.org
> Date: Thursday, October 2, 2008, 3:14 PM
>
> Hi,
>
> As far as I know this is correct. To make sure, just run lctl ping from each client to the Lustre servers using names instead of IP addresses. If lctl ping resolves the names correctly, then everything should work fine when names are used.
>
> Cheers
>
> Wojciech