We lost our MDS/MGS to a power failure yesterday evening. Just to be safe, we ran e2fsck on the combined MDT/MGT, and there were only a couple of minor complaints about HTREE issues, which it fixed. The MDT/MGT now fsck's cleanly. The problem is that, despite the clean e2fsck, the MGS is crashing in the Lustre mount code when attempting to mount the MDT.

It is a scratch file system, so it is not backed up. Still, it is a pain to lose the data. I'm assuming this is not normal, and there is not much in the manual about doing anything more than e2fsck, but I want to ask whether anyone else has seen something like this before and might have some additional suggestions before I trash the data and reformat the file system.

Thanks,

Charlie Taylor
UF HPC Center
On Mon, Jun 02, 2008 at 11:02:11AM -0400, Charles Taylor wrote:
> We lost our MDS/MGS to a power failure yesterday evening. Just to
> be safe, we ran e2fsck on the combined MDT/MGT and there were only a
> couple of minor complaints about HTREE issues, which it fixed. The
> MDT/MGT now fsck's cleanly. The problem is that, despite the clean
> e2fsck, the MGS is crashing in the Lustre mount code when attempting
> to mount the MDT.

Where is it crashing exactly? Any stack traces, assertion failures ...
on the console?

Johann
Well, I figured someone would ask that. :)  The last messages that make it to syslog prior to the crash are:

Jun 2 10:29:54 hpcmds kernel: LDISKFS FS on md2, internal journal
Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: recovery complete.
Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Jun 2 10:29:54 hpcmds kernel: kjournald starting. Commit interval 5 seconds
Jun 2 10:29:54 hpcmds kernel: LDISKFS FS on md2, internal journal
Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Jun 2 10:29:54 hpcmds kernel: Lustre: MGS MGS started
Jun 2 10:29:54 hpcmds kernel: Lustre: Enabling user_xattr
Jun 2 10:29:54 hpcmds kernel: Lustre: 4540:0:(mds_fs.c:446:mds_init_server_data()) RECOVERY: service ufhpc-MDT0000, 100 recoverable clients, last_transno 9412464331
Jun 2 10:29:54 hpcmds kernel: Lustre: MDT ufhpc-MDT0000 now serving dev (ufhpc-MDT0000/cac99db5-a66a-a6ac-4649-6ec8cc2dc0e7), but will be in recovery until 100 clients reconnect, or if no clients reconnect for 4:10; during that time new clients will not be allowed to connect. Recovery progress can be monitored by watching /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status.
Jun 2 10:29:55 hpcmds kernel: Lustre: 4540:0:(mds_lov.c:858:mds_notify()) MDS ufhpc-MDT0000: in recovery, not resetting orphans on ufhpc-OST0004_UUID
Jun 2 10:29:55 hpcmds kernel: Lustre: 4540:0:(mds_lov.c:858:mds_notify()) MDS ufhpc-MDT0000: in recovery, not resetting orphans on ufhpc-OST0005_UUID

Note that all of the clients are powered off and the OSSes are currently unmounted (though they appear to be fine). Unfortunately, getting the messages off the console (in the machine room) means using pencil and paper (you'd think we would have something as fancy as an IP-KVM console server, but alas, we do things, ahem, "inexpensively" here).

I'm going to let the md mirrors resync before I try it again (although I don't think that should be an issue). If it crashes a third time, and I suspect it will, I'll include some of the stack trace. Of course, part of the problem is that the trace is deep enough that it goes off screen and we can't see the top of it (which is the part that would be useful). :)

I was hoping for a silver bullet, but...

Thanks,

Charlie Taylor
UF HPC Center

On Jun 2, 2008, at 11:16 AM, Johann Lombardi wrote:

> On Mon, Jun 02, 2008 at 11:02:11AM -0400, Charles Taylor wrote:
>> We lost our MDS/MGS to a power failure yesterday evening. Just to
>> be safe, we ran e2fsck on the combined MDT/MGT and there were only a
>> couple of minor complaints about HTREE issues, which it fixed. The
>> MDT/MGT now fsck's cleanly. The problem is that, despite the clean
>> e2fsck, the MGS is crashing in the Lustre mount code when attempting
>> to mount the MDT.
>
> Where is it crashing exactly? Any stack traces, assertion failures ...
> on the console?
>
> Johann
On Mon, 2008-06-02 at 11:35 -0400, Charles Taylor wrote:
>
> Well, I figured someone would ask that. :)  The last messages that
> make it to syslog prior to the crash are....
>
> Jun 2 10:29:54 hpcmds kernel: LDISKFS FS on md2, internal journal
> Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: recovery complete.
> Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
> Jun 2 10:29:54 hpcmds kernel: kjournald starting. Commit interval 5 seconds
> Jun 2 10:29:54 hpcmds kernel: LDISKFS FS on md2, internal journal
> Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
> Jun 2 10:29:54 hpcmds kernel: Lustre: MGS MGS started
> Jun 2 10:29:54 hpcmds kernel: Lustre: Enabling user_xattr
> Jun 2 10:29:54 hpcmds kernel: Lustre: 4540:0:(mds_fs.c:446:mds_init_server_data()) RECOVERY: service ufhpc-MDT0000, 100 recoverable clients, last_transno 9412464331
> Jun 2 10:29:54 hpcmds kernel: Lustre: MDT ufhpc-MDT0000 now serving dev (ufhpc-MDT0000/cac99db5-a66a-a6ac-4649-6ec8cc2dc0e7), but will be in recovery until 100 clients reconnect, or if no clients reconnect for 4:10; during that time new clients will not be allowed to connect. Recovery progress can be monitored by watching /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status.
> Jun 2 10:29:55 hpcmds kernel: Lustre: 4540:0:(mds_lov.c:858:mds_notify()) MDS ufhpc-MDT0000: in recovery, not resetting orphans on ufhpc-OST0004_UUID
> Jun 2 10:29:55 hpcmds kernel: Lustre: 4540:0:(mds_lov.c:858:mds_notify()) MDS ufhpc-MDT0000: in recovery, not resetting orphans on ufhpc-OST0005_UUID

This is all perfectly normal. Is there anything else or does this amount to all that you are seeing?

> Note that all of the clients are powered off and the OSS's are
> currently unmounted (though they appear to be fine).

Does anything bad happen when you bring up the OSSes? Ideally, OSTs should be brought up before the MDT, but there is no requirement for that.

> If it crashes

Do you have messages from a crash?

> a third time, and I suspect it will, I'll include some
> of the stack trace.

Unless you are getting some kind of kernel panic, that stack trace should be in the syslog.

b.
Todd,

Does this make sense? He is saying that OSTs need to be mounted first? I thought that they should not connect if the MDT is not mounted.

On 6/2/08 10:45 AM, "Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:

> On Mon, 2008-06-02 at 11:35 -0400, Charles Taylor wrote:
>>
>> Well, I figured someone would ask that. :)  The last messages that
>> make it to syslog prior to the crash are....
>>
>> Jun 2 10:29:54 hpcmds kernel: LDISKFS FS on md2, internal journal
>> Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: recovery complete.
>> Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
>> Jun 2 10:29:54 hpcmds kernel: kjournald starting. Commit interval 5 seconds
>> Jun 2 10:29:54 hpcmds kernel: LDISKFS FS on md2, internal journal
>> Jun 2 10:29:54 hpcmds kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
>> Jun 2 10:29:54 hpcmds kernel: Lustre: MGS MGS started
>> Jun 2 10:29:54 hpcmds kernel: Lustre: Enabling user_xattr
>> Jun 2 10:29:54 hpcmds kernel: Lustre: 4540:0:(mds_fs.c:446:mds_init_server_data()) RECOVERY: service ufhpc-MDT0000, 100 recoverable clients, last_transno 9412464331
>> Jun 2 10:29:54 hpcmds kernel: Lustre: MDT ufhpc-MDT0000 now serving dev (ufhpc-MDT0000/cac99db5-a66a-a6ac-4649-6ec8cc2dc0e7), but will be in recovery until 100 clients reconnect, or if no clients reconnect for 4:10; during that time new clients will not be allowed to connect. Recovery progress can be monitored by watching /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status.
>> Jun 2 10:29:55 hpcmds kernel: Lustre: 4540:0:(mds_lov.c:858:mds_notify()) MDS ufhpc-MDT0000: in recovery, not resetting orphans on ufhpc-OST0004_UUID
>> Jun 2 10:29:55 hpcmds kernel: Lustre: 4540:0:(mds_lov.c:858:mds_notify()) MDS ufhpc-MDT0000: in recovery, not resetting orphans on ufhpc-OST0005_UUID
>
> This is all perfectly normal. Is there anything else or does this
> amount to all that you are seeing?
>
>> Note that all of the clients are powered off and the OSS's are
>> currently unmounted (though they appear to be fine).
>
> Does anything bad happen when you bring up the OSSes? Ideally, OSTs
> should be brought up before the MDT, but there is no requirement for
> that.
>
>> If it crashes
>
> Do you have messages from a crash?
>
>> a third time, and I suspect it will, I'll include some
>> of the stack trace.
>
> Unless you are getting some kind of kernel panic, that stack trace
> should be in the syslog.
>
> b.
On Mon, Jun 02, 2008 at 11:35:35AM -0400, Charles Taylor wrote:
> Well, I figured someone would ask that. :)  The last messages that
> make it to syslog prior to the crash are....
[...]

As expected, there is nothing wrong here.

> Unfortunately, getting the messages off the console (in the machine
> room) means using a pencil and paper (you'd think we have something as
> fancy as an IP-KVM console server, but alas, we do things, ahem,
> "inexpensively" here).

Unfortunately, we cannot really help without more information ...
You can still try to abort recovery (-o abort_recov) when mounting the MDS.

Johann
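For reference, a sketch of that mount (the MDT device /dev/md2 is taken from the syslog above; the mount point is only an assumption, adjust to your own layout):

    mount -t lustre -o abort_recov /dev/md2 /mnt/mds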
On Mon, 2008-06-02 at 08:49 -0700, Dennis Nelson wrote:
> Todd,
>
> Does this make sense? He is saying that OSTs need to be mounted
> first?

Not *need*, but rather, ideally, should. The reason is that when the MDS comes up, the opportunity for clients to get object pointers exists. It's better that the OSTs are up to serve the expected object requests when that happens.

Conversely, when shutting down, ideally you shut down the MDS first so that the ability for clients to get object pointers goes away before the OSTs serving them go away.

b.
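To illustrate the ordering described above as a sketch (the OST device, mount points, and MGS NID are placeholders; /dev/md2 and the "ufhpc" fsname come from this thread):

    # Startup: OSTs first, then the MDT/MGS, then clients
    oss1#   mount -t lustre /dev/sdX /mnt/ost0
    mds#    mount -t lustre /dev/md2 /mnt/mds
    client# mount -t lustre <mgs-nid>@tcp0:/ufhpc /lustre

    # Shutdown: unmount the MDT before the OSTs
    mds#    umount /mnt/mds
    oss1#   umount /mnt/ost0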
On Jun 2, 2008, at 11:49 AM, Dennis Nelson wrote:

> > Unless you are getting some kind of kernel panic, that stack trace
> > should be in the syslog.

No, it is going down hard in a kernel panic. All of the stack trace I can see at the moment looks like this (scribbled by hand, so forgive me for leaving off the addresses and offsets):

:libcfs:cfs_alloc
:obdclass:lustre_init_lsi
:obdclass:lustre_fill_super
:obdclass::lustre_fill_super
set_anon_super
set_anon_super
:obd_class:lustre_fill_super
et_sb_nodev
vfs_kern_mount
do_kern_mount
do_mount
__handle_mm_fault
__up_read
do_page_fault
zone_statistics
__alloc_pages
sys_mount
system_call

RIP < ..... > resched_task

I wish I could get the whole trace to you. We might try to get kdump on there, but my luck with kdump has been mixed. It seems to work with some chipsets and not with others.

Anyway, we may just be out of luck. I just hate to give up too easily, because it seems like everything is solid, yet we crash on or just after the mount. This is on an MDS that has been running without a problem for 5 months (Lustre 1.6.4.2).

uname -a
Linux hpcmds 2.6.18-8.1.14.el5.L-1642 #2 SMP Thu Feb 21 15:42:14 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

I don't know if that trace is a lot of help to you since it is not complete (which is why I didn't post it initially), but maybe there is something there of use.

Regards,

Charlie Taylor
UF HPC Center
On Monday 02 June 2008 08:35:35 am Charles Taylor wrote:
> Unfortunately, getting the messages off the console (in the machine
> room) means using a pencil and paper (you'd think we have something
> as fancy as an IP-KVM console server, but alas, we do things, ahem,
> "inexpensively" here).

There are a couple of solutions to help you there:

* Use a serial console connected to a remote machine (costs a serial cable and some configuration).

* Use an IPMI-enabled BMC, or any sort of remote-control card, which should give you easy access to the machine's console remotely. Those cards ain't cheap, but if you already have them in your servers, this is a good occasion to put them to use.

* Maybe the easiest, most inexpensive (no hardware involved) and most convenient option: use netdump [1]. You configure a netdump client on the machine you want to gather logs and traces from, and a netdump server on another host to receive those messages. This solution has proved really efficient for gathering Lustre debug logs and crash dumps.

[1] http://www.redhat.com/support/wpapers/redhat/netdump/
    and http://docs.freevps.com/doku.php?id=how-to:netdump

HTH,
--
Kilian
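A rough sketch of the netdump setup on RHEL/CentOS 4 (the server IP is a placeholder; see the links above for the full procedure):

    # On the crashing machine (netdump client), in /etc/sysconfig/netdump:
    NETDUMPADDR=192.168.1.10       # IP of the netdump server

    service netdump propagate      # push the ssh key to the server
    chkconfig netdump on
    service netdump start

    # On the receiving machine: install the netdump-server package, then
    chkconfig netdump-server on
    service netdump-server start
    # console logs and vmcores land under /var/crash/<client-ip>/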
On Mon, 2008-06-02 at 12:58 -0400, Charles Taylor wrote:
>
> No, it is going down hard in a kernel panic. All of the stack
> trace I can see at the moment looks like (scribbled by hand... so
> forgive me for leaving off the addresses and offsets).
>
> :libcfs:cfs_alloc
> :obdclass:lustre_init_lsi
> :obdclass:lustre_fill_super
> :obdclass::lustre_fill_super
> set_anon_super
> set_anon_super
> :obd_class:lustre_fill_super
> et_sb_nodev
> vfs_kern_mount
> do_kern_mount
> do_mount
> __handle_mm_fault
> __up_read
> do_page_fault
> zone_statistics
> __alloc_pages
> sys_mount
> system_call
>
> RIP < ..... > resched_task

I'm afraid that is too vague. Can you plug the serial port from this crashing machine into another (like a laptop) and use minicom on a serial-directed console to capture it?

b.
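A sketch of that setup (the baud rate, tty names, and log path are assumptions; adjust to your hardware):

    # On the crashing MDS, direct the kernel console to the first serial port
    # by appending to the kernel line in /boot/grub/grub.conf:
    console=tty0 console=ttyS0,115200n8

    # On the capturing machine (laptop), log everything that arrives:
    minicom -D /dev/ttyS0 -C /tmp/mds-console.log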
On Jun 02, 2008 10:05 -0700, Kilian CAVALOTTI wrote:
> On Monday 02 June 2008 08:35:35 am Charles Taylor wrote:
> > Unfortunately, getting the messages off the console (in the machine
> > room) means using a pencil and paper (you'd think we have something
> > as fancy as an IP-KVM console server, but alas, we do things, ahem,
> > "inexpensively" here).
>
> There are a couple of solutions to help you there:
> * Use a serial console connected to a remote machine (costs a serial
> cable and some configuration).

One very practical and low-cost mechanism is to cross-cable the serial console from one machine to its neighbour. Most server-class machines have two serial ports, so you can have an inbound port for the console of the neighbour, and an outbound port configured to be the serial console of that machine.

> * Maybe the easiest, most inexpensive (no hardware involved) and
> most convenient option: use netdump [1]. You configure a netdump client
> on the machine you want to gather logs and traces from, and a
> netdump server on another host to receive those messages. This
> solution has proved really efficient for gathering Lustre debug
> logs and crash dumps.
>
> [1] http://www.redhat.com/support/wpapers/redhat/netdump/
> and http://docs.freevps.com/doku.php?id=how-to:netdump

Yes, LLNL has been using netdump to good effect. It works with the "normal" crashdump utilities like "crash" (modified gdb). It isn't in all kernels, however.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
We appreciate all the suggestions and help. Just for the record, we've used netdump successfully for a long time, up through CentOS/RHEL 4.5. However, it seems that support for it has been deprecated in RHEL/CentOS 5 and above in favor of kdump (as far as we can tell). If we are wrong about that, let us know, since we had more success with netdump than we are having with kdump.

Thanks,

Charlie Taylor
UF HPC Center

On Jun 2, 2008, at 3:30 PM, Andreas Dilger wrote:

>> [1] http://www.redhat.com/support/wpapers/redhat/netdump/
>> and http://docs.freevps.com/doku.php?id=how-to:netdump
>
> Yes, LLNL has been using netdump to good effect. It works with the
> "normal" crashdump utilities like "crash" (modified gdb). It isn't
> in all kernels, however.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
On Jun 02, 2008 12:58 -0400, Charles Taylor wrote:
> No, it is going down hard in a kernel panic. All of the stack trace I
> can see at the moment looks like (scribbled by hand... so forgive me for
> leaving off the addresses and offsets).
>
> :libcfs:cfs_alloc
> :obdclass:lustre_init_lsi
> :obdclass:lustre_fill_super
> :obdclass::lustre_fill_super
> set_anon_super
> set_anon_super
> :obd_class:lustre_fill_super
> et_sb_nodev
> vfs_kern_mount
> do_kern_mount
> do_mount
> __handle_mm_fault
> __up_read
> do_page_fault
> zone_statistics
> __alloc_pages
> sys_mount
> system_call
>
> RIP < ..... > resched_task

Hmm, this doesn't seem very useful. The callpath shown:

    lustre_fill_super->lustre_init_lsi->cfs_alloc()

is _really_ early in the mount, and either memory has been corrupted before this point (causing cfs_alloc() to crash) or you are missing some part of the stack at the top?

> I wish I could get the whole trace to you. We might try to get kdump on
> there but my luck with kdump has been mixed. It seems to work with some
> chipsets and not with others.
>
> Anyway, we may just be out of luck. I just hate to give up too easily
> because it seems like everything is solid yet we crash on or just after
> the mount. This is on an MDS that has been running without a problem for
> 5 months (Lustre 1.6.4.2).
>
> uname -a
> Linux hpcmds 2.6.18-8.1.14.el5.L-1642 #2 SMP Thu Feb 21 15:42:14 EST 2008
> x86_64 x86_64 x86_64 GNU/Linux

If mounting with "-o abort_recov" doesn't solve the problem, are you able to mount the MDT filesystem as "-t ldiskfs" instead of "-t lustre"? Try that, then copy and truncate the last_rcvd file:

    mount -t ldiskfs /dev/MDSDEV /mnt/mds
    cp /mnt/mds/last_rcvd /mnt/mds/last_rcvd.sav
    cp /mnt/mds/last_rcvd /tmp/last_rcvd.sav
    dd if=/mnt/mds/last_rcvd.sav of=/mnt/mds/last_rcvd bs=8k count=1
    umount /mnt/mds

    mount -t lustre /dev/MDSDEV /mnt/mds

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
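A quick sanity check after the dd, as a sketch (mount point as in the commands above): the dd keeps only the first 8 KB of the file, so the truncated last_rcvd should show exactly 8192 bytes, while the .sav copies preserve the full original in case you need to roll back:

    ls -l /mnt/mds/last_rcvd        # expect 8192 bytes
    ls -l /mnt/mds/last_rcvd.sav    # the untouched original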
Right, netdump support was dropped in favor of kdump in RHEL 5. As a result, netdump is past tense for us at LLNL.

Jim

On Mon, Jun 02, 2008 at 03:35:04PM -0400, Charles Taylor wrote:
> We appreciate all the suggestions and help. Just for the record,
> we've used netdump successfully for a long time, up through CentOS/RHEL
> 4.5. However, it seems that support for it has been deprecated in
> RHEL/CentOS 5 and above in favor of kdump (as far as we can tell).
> If we are wrong about that, let us know, since we had more success with
> netdump than we are having with kdump.
>
> Thanks,
>
> Charlie Taylor
> UF HPC Center
>
> On Jun 2, 2008, at 3:30 PM, Andreas Dilger wrote:
>
> >> [1] http://www.redhat.com/support/wpapers/redhat/netdump/
> >> and http://docs.freevps.com/doku.php?id=how-to:netdump
> >
> > Yes, LLNL has been using netdump to good effect. It works with the
> > "normal" crashdump utilities like "crash" (modified gdb). It isn't
> > in all kernels, however.
> >
> > Cheers, Andreas
> > --
> > Andreas Dilger
> > Sr. Staff Engineer, Lustre Group
> > Sun Microsystems of Canada, Inc.
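For anyone in the same boat, a minimal kdump setup on RHEL/CentOS 5 looks roughly like the following (the reserved-memory size and dump target are assumptions; check the distro's kexec/kdump documentation for your hardware):

    # /boot/grub/grub.conf: reserve memory for the capture kernel
    # by appending to the kernel line:
    crashkernel=128M@16M

    # /etc/kdump.conf: where to write the vmcore
    path /var/crash
    core_collector makedumpfile -c -d 31

    chkconfig kdump on
    service kdump start     # or reboot so crashkernel= takes effect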
Wow, you are one powerful witch doctor. So we rebuilt our system disk (just to be sure) and that made no difference; we still panicked as soon as we mounted the MDT. The "-o abort_recov" did not help either. However, your recipe below worked wonders... almost. Now we can mount the MDT, but it does not go into recovery. It just shows as "inactive". We are so close I can taste it, but what are we doing wrong now?

[root at hpcmds lustre]# cat /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status
status: INACTIVE

Which tire do we kick now? :)

Thanks,

Charlie Taylor
UF HPC Center

On Jun 2, 2008, at 3:36 PM, Andreas Dilger wrote:

> If mounting with "-o abort_recov" doesn't solve the problem,
> are you able to mount the MDT filesystem as "-t ldiskfs" instead of
> "-t lustre"? Try that, then copy and truncate the last_rcvd file:
>
> mount -t ldiskfs /dev/MDSDEV /mnt/mds
> cp /mnt/mds/last_rcvd /mnt/mds/last_rcvd.sav
> cp /mnt/mds/last_rcvd /tmp/last_rcvd.sav
> dd if=/mnt/mds/last_rcvd.sav of=/mnt/mds/last_rcvd bs=8k count=1
> umount /mnt/mds
>
> mount -t lustre /dev/MDSDEV /mnt/mds
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
On Jun 02, 2008 19:51 -0400, Charles Taylor wrote:
> Wow, you are one powerful witch doctor. So we rebuilt our system disk
> (just to be sure) and that made no difference; we still panicked as
> soon as we mounted the MDT. The "-o abort_recov" did not help either.
> However, your recipe below worked wonders... almost. Now we can mount
> the MDT, but it does not go into recovery. It just shows as "inactive".
> We are so close I can taste it, but what are we doing wrong now?
>
> [root at hpcmds lustre]# cat /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status
> status: INACTIVE
>
> Which tire do we kick now? :)

Well, deleting the tail of the last_rcvd file is the "hard" way to tell the MDT/OST it is no longer in recovery... The deleted part of the file is where the per-client state is kept, so when it is removed the MDT decides no recovery is needed.

The "recovery_status" being "INACTIVE" is somewhat misleading. It means "no recovery is currently active", but the MDT is up and you should be able to use it, with the caveat that clients previously doing operations will get an IO error for in-flight operations before they start afresh... However, you said the clients are powered off, so they probably aren't busy doing anything...

If you had a more complete stack trace, it would be useful for determining what is actually going wrong with the mount.

> On Jun 2, 2008, at 3:36 PM, Andreas Dilger wrote:
>> If mounting with "-o abort_recov" doesn't solve the problem,
>> are you able to mount the MDT filesystem as "-t ldiskfs" instead of
>> "-t lustre"? Try that, then copy and truncate the last_rcvd file:
>>
>> mount -t ldiskfs /dev/MDSDEV /mnt/mds
>> cp /mnt/mds/last_rcvd /mnt/mds/last_rcvd.sav
>> cp /mnt/mds/last_rcvd /tmp/last_rcvd.sav
>> dd if=/mnt/mds/last_rcvd.sav of=/mnt/mds/last_rcvd bs=8k count=1
>> umount /mnt/mds
>>
>> mount -t lustre /dev/MDSDEV /mnt/mds

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
I'm sorry, I should have updated you. You are right, it was misleading. The MDS/MDT was fine, and after about twenty minutes or so everything became active. We now have a working file system with data that we can access, so we can't *thank you* enough.

BTW, that's a pretty obscure "fix". I was going to ask for an explanation, but we've been pretty busy doing fsck's and lfsck's (which we are still working up to, since it takes a while to generate the db's). It is a pretty slow process, but things are looking relatively good. Of course, when you go from thinking you just lost all your data to having almost all of it, anything looks pretty good. :)

Thanks again for your help,

Charlie Taylor
UF HPC Center

PS - We now refer to your commands to truncate the last_rcvd file as the "Dilger Procedure" (with great reverence). :)

ct

On Jun 3, 2008, at 4:20 PM, Andreas Dilger wrote:

> On Jun 02, 2008 19:51 -0400, Charles Taylor wrote:
>> Wow, you are one powerful witch doctor. So we rebuilt our system disk
>> (just to be sure) and that made no difference; we still panicked as
>> soon as we mounted the MDT. The "-o abort_recov" did not help either.
>> However, your recipe below worked wonders... almost. Now we can mount
>> the MDT, but it does not go into recovery. It just shows as "inactive".
>> We are so close I can taste it, but what are we doing wrong now?
>>
>> [root at hpcmds lustre]# cat /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status
>> status: INACTIVE
>>
>> Which tire do we kick now? :)
>
> Well, deleting the tail of the last_rcvd file is the "hard" way to tell
> the MDT/OST it is no longer in recovery... The deleted part of the file
> is where the per-client state is kept, so when it is removed the MDT
> decides no recovery is needed.
>
> The "recovery_status" being "INACTIVE" is somewhat misleading. It means
> "no recovery is currently active", but the MDT is up and you should be
> able to use it, with the caveat that clients previously doing operations
> will get an IO error for in-flight operations before they start afresh...
> However, you said the clients are powered off, so they probably aren't
> busy doing anything...
>
> If you had a more complete stack trace, it would be useful for determining
> what is actually going wrong with the mount.
>
>> On Jun 2, 2008, at 3:36 PM, Andreas Dilger wrote:
>>> If mounting with "-o abort_recov" doesn't solve the problem,
>>> are you able to mount the MDT filesystem as "-t ldiskfs" instead of
>>> "-t lustre"? Try that, then copy and truncate the last_rcvd file:
>>>
>>> mount -t ldiskfs /dev/MDSDEV /mnt/mds
>>> cp /mnt/mds/last_rcvd /mnt/mds/last_rcvd.sav
>>> cp /mnt/mds/last_rcvd /tmp/last_rcvd.sav
>>> dd if=/mnt/mds/last_rcvd.sav of=/mnt/mds/last_rcvd bs=8k count=1
>>> umount /mnt/mds
>>>
>>> mount -t lustre /dev/MDSDEV /mnt/mds
>>>
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger
>>> Sr. Staff Engineer, Lustre Group
>>> Sun Microsystems of Canada, Inc.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
On Jun 03, 2008 16:37 -0400, Charles Taylor wrote:
> I'm sorry, I should have updated you. You are right, it was
> misleading. The MDS/MDT was fine, and after about twenty minutes or
> so everything became active. We now have a working file system with
> data that we can access, so we can't *thank you* enough.

You're welcome.

> BTW, that's a pretty obscure "fix". I was going to ask for an
> explanation, but we've been pretty busy doing fsck's and lfsck's (which
> we are still working up to, since it takes a while to generate the
> db's). It is a pretty slow process, but things are looking
> relatively good. Of course, when you go from thinking you just lost
> all your data to having almost all of it, anything looks pretty
> good. :)
>
> PS - We now refer to your commands to truncate the last_rcvd file as
> the "Dilger Procedure" (with great reverence). :)

Well, by no means should this be a normal process. If you can spare the time after your system is back in shape, then copying the last_rcvd.sav file to a test MDT and mounting it with a serial console enabled would help track down what the root cause of this is. The fewer people that have to perform the "Dilger Procedure" the better.

> On Jun 3, 2008, at 4:20 PM, Andreas Dilger wrote:
> > On Jun 02, 2008 19:51 -0400, Charles Taylor wrote:
> >> Wow, you are one powerful witch doctor. So we rebuilt our system disk
> >> (just to be sure) and that made no difference; we still panicked as
> >> soon as we mounted the MDT. The "-o abort_recov" did not help either.
> >> However, your recipe below worked wonders... almost. Now we can mount
> >> the MDT, but it does not go into recovery. It just shows as "inactive".
> >> We are so close I can taste it, but what are we doing wrong now?
> >>
> >> [root at hpcmds lustre]# cat /proc/fs/lustre/mds/ufhpc-MDT0000/recovery_status
> >> status: INACTIVE
> >>
> >> Which tire do we kick now? :)
> >
> > Well, deleting the tail of the last_rcvd file is the "hard" way to tell
> > the MDT/OST it is no longer in recovery... The deleted part of the file
> > is where the per-client state is kept, so when it is removed the MDT
> > decides no recovery is needed.
> >
> > The "recovery_status" being "INACTIVE" is somewhat misleading. It means
> > "no recovery is currently active", but the MDT is up and you should be
> > able to use it, with the caveat that clients previously doing operations
> > will get an IO error for in-flight operations before they start afresh...
> > However, you said the clients are powered off, so they probably aren't
> > busy doing anything...
> >
> > If you had a more complete stack trace, it would be useful for determining
> > what is actually going wrong with the mount.
> >
> >> On Jun 2, 2008, at 3:36 PM, Andreas Dilger wrote:
> >>> If mounting with "-o abort_recov" doesn't solve the problem,
> >>> are you able to mount the MDT filesystem as "-t ldiskfs" instead of
> >>> "-t lustre"? Try that, then copy and truncate the last_rcvd file:
> >>>
> >>> mount -t ldiskfs /dev/MDSDEV /mnt/mds
> >>> cp /mnt/mds/last_rcvd /mnt/mds/last_rcvd.sav
> >>> cp /mnt/mds/last_rcvd /tmp/last_rcvd.sav
> >>> dd if=/mnt/mds/last_rcvd.sav of=/mnt/mds/last_rcvd bs=8k count=1
> >>> umount /mnt/mds
> >>>
> >>> mount -t lustre /dev/MDSDEV /mnt/mds
> >>>
> >>> Cheers, Andreas
> >>> --
> >>> Andreas Dilger
> >>> Sr. Staff Engineer, Lustre Group
> >>> Sun Microsystems of Canada, Inc.
> >
> > Cheers, Andreas
> > --
> > Andreas Dilger
> > Sr. Staff Engineer, Lustre Group
> > Sun Microsystems of Canada, Inc.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.