Hello,

we have a _test_ setup for a Lustre 1.6.5.1 installation with 2 RAID systems
(64-bit systems) providing 4 OSTs of 6 TB each, plus one combined MDS/MDT
server (32-bit system, for testing only).

OST mkfs.lustre command:
"mkfs.lustre --param="failover.mode=failout" --fsname scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b 4096' --mgsnode=mds1lustre@tcp0 /dev/sdb"
(Our files on the system are quite large, 100 MB+.)

Kernel: vanilla kernel 2.6.22.19, Lustre compiled from the sources on Gentoo 2008.0.

The client mount point is /misc/testfs via automount.
Access can also go through a link from /mnt/testfs -> /misc/testfs.

The following procedure hangs a client:
1) copy files to the Lustre file system
2) do a 'du -sh /mnt/testfs/willi' while copying
3) unmount an OST (here OST0003) while copying

The 'du' job hangs and the Lustre file system can no longer be accessed on this
client, even from other logins. The only way to restore normal operation, as far
as I can tell, is a hard reset of the machine; a reboot hangs because the file
system is still active. Other clients and their mount points are not affected as
long as they do not access the file system with 'du', 'ls', or similar.
I know that this is drastic, but it may happen in production with our users.

Deactivating/reactivating or remounting the OST does not have any effect on the
'du' job. The 'du' job (#29665, see process list below) and the corresponding
Lustre thread (#29694) cannot be killed manually.

This behaviour is reproducible. OST0003 is not reactivated on the client side
even though the MDS does reactivate it. It seems that this information does not
propagate to the client. See the last lines of dmesg below.

What is the proper way (besides avoiding the use of 'du') to reactivate the
client file system?

Thanks and Regards
Heiko


The process list on the CLIENT:
<snip>
root     29175  5026  0 08:36 ?      00:00:00 sshd: laura [priv]
laura    29177 29175  0 08:36 ?      00:00:01 sshd: laura@pts/0
laura    29178 29177  0 08:36 pts/0  00:00:00 -bash
laura    29665 29178  0 09:15 pts/0  00:00:03 du -sh /mnt/testfs/foo/fam/
schell   29694     2  0 09:15 ?      00:00:00 [ll_sa_29665]
root     29695  4846  0 09:15 ?      00:00:00 /usr/sbin/automount --timeout 60 --pid-file /var/run/autofs.misc.pid /misc yp auto.misc
<snap>

and CLIENT dmesg:
Lustre: 5361:0:(import.c:395:import_select_connection()) scia-OST0003-osc-ffff8100ea24a000: tried all connections, increasing latency to 6s
Lustre: 5361:0:(import.c:395:import_select_connection()) Skipped 10 previous similar messages
LustreError: 11-0: an error occurred while communicating with 192.168.16.97@tcp. The ost_connect operation failed with -19
LustreError: Skipped 20 previous similar messages
Lustre: 5361:0:(import.c:395:import_select_connection()) scia-OST0003-osc-ffff8100ea24a000: tried all connections, increasing latency to 51s
Lustre: 5361:0:(import.c:395:import_select_connection()) Skipped 20 previous similar messages
LustreError: 11-0: an error occurred while communicating with 192.168.16.97@tcp. The ost_connect operation failed with -19
LustreError: Skipped 24 previous similar messages
Lustre: 5361:0:(import.c:395:import_select_connection()) scia-OST0003-osc-ffff8100ea24a000: tried all connections, increasing latency to 51s
Lustre: 5361:0:(import.c:395:import_select_connection()) Skipped 24 previous similar messages
LustreError: 167-0: This client was evicted by scia-OST0003; in progress operations using this service will fail.
The MDS dmesg:
<snip>
Lustre: 6108:0:(import.c:395:import_select_connection()) scia-OST0003-osc: tried all connections, increasing latency to 51s
Lustre: 6108:0:(import.c:395:import_select_connection()) Skipped 10 previous similar messages
LustreError: 11-0: an error occurred while communicating with 192.168.16.97@tcp. The ost_connect operation failed with -19
LustreError: Skipped 10 previous similar messages
Lustre: 6108:0:(import.c:395:import_select_connection()) scia-OST0003-osc: tried all connections, increasing latency to 51s
Lustre: 6108:0:(import.c:395:import_select_connection()) Skipped 20 previous similar messages
Lustre: Permanently deactivating scia-OST0003
Lustre: Setting parameter scia-OST0003-osc.osc.active in log scia-client
Lustre: Skipped 3 previous similar messages
Lustre: setting import scia-OST0003_UUID INACTIVE by administrator request
Lustre: scia-OST0003-osc.osc: set parameter active=0
Lustre: Skipped 2 previous similar messages
Lustre: scia-MDT0000: haven't heard from client 9111f740-b7a7-e2ff-b672-288a66decfab (at 192.168.16.106@tcp) in 1269 seconds. I think it's dead, and I am evicting it.
Lustre: Permanently reactivating scia-OST0003
Lustre: Modifying parameter scia-OST0003-osc.osc.active in log scia-client
Lustre: Skipped 1 previous similar message
Lustre: 15406:0:(import.c:395:import_select_connection()) scia-OST0003-osc: tried all connections, increasing latency to 51s
Lustre: 15406:0:(import.c:395:import_select_connection()) Skipped 2 previous similar messages
LustreError: 167-0: This client was evicted by scia-OST0003; in progress operations using this service will fail.
Lustre: scia-OST0003-osc: Connection restored to service scia-OST0003 using nid 192.168.16.97@tcp.
Lustre: scia-OST0003-osc.osc: set parameter active=1
Lustre: MDS scia-MDT0000: scia-OST0003_UUID now active, resetting orphans
<snap>
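For context, the "Permanently deactivating" / "Permanently reactivating scia-OST0003" lines in the MDS dmesg above are what Lustre 1.6 prints when the OSC for an OST is toggled administratively. A minimal sketch of the commands typically used for this, assuming the fsname 'scia' and OST index 0003 from this setup (<devno> is a placeholder to be read from 'lctl dl'):

  # temporarily deactivate the OSC for OST0003 on the MDS (find its device number first)
  lctl dl | grep scia-OST0003-osc
  lctl --device <devno> deactivate

  # permanently deactivate / reactivate via the MGS configuration log
  lctl conf_param scia-OST0003.osc.active=0
  lctl conf_param scia-OST0003.osc.active=1
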
We are experiencing the same problem with 1.6.4.2. We thought it was the
statahead problem, but after turning off the statahead code we experienced the
same problem again. I had hoped going to 1.6.5 would resolve the issue.

If you open a bug, would you mind sending the bug number to the list? I would
like to get on the CC list.

> -----Original Message-----
> From: lustre-discuss-bounces@lists.lustre.org
> [mailto:lustre-discuss-bounces@lists.lustre.org] On Behalf Of Heiko Schroeter
> Sent: Thursday, July 10, 2008 2:25 AM
> To: lustre-discuss@clusterfs.com
> Subject: [Lustre-discuss] lustre client 1.6.5.1 hangs
>
> <snip: full original message quoted above>
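For readers who suspect statahead as well: on 1.6.x clients, statahead is normally turned off through the llite proc tunable. A minimal sketch, assuming the default /proc layout (the exact instance name under llite/ varies per mount, hence the wildcard):

  # disable the client statahead thread (0 = off) for all mounted Lustre file systems
  for f in /proc/fs/lustre/llite/*/statahead_max; do echo 0 > "$f"; done
  # verify the new setting
  cat /proc/fs/lustre/llite/*/statahead_max
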
On Thu, 2008-07-10 at 10:25 +0200, Heiko Schroeter wrote:

> Hello,

Hi.

> OST lustre mkfs:
> "mkfs.lustre --param="failover.mode=failout" --fsname
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Given this (above) parameter setting...

> scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b
> 4096' --mgsnode=mds1lustre@tcp0 /dev/sdb"
>
> The following procedure hangs a client:
> 1) copy files to the lustre system
> 2) do a 'du -sh /mnt/testfs/willi' while copying
> 3) unmount an OST (here OST0003) while copying

Do you expect that the copy and du (which are both running at the same time
while you unmount the OST, right?) should both get EIOs?

> Deactivating/Reactivating or remounting the OST does not have any effect on
> the 'du' job. The 'du' job (#29665 see process list below) and the
> corresponding lustre thread (#29694) cannot be killed manually.

That latter process (ll_sa_29665) is statahead at work.

> What is the proper way (besides avoiding the use of 'du') to reactivate the
> client file system ?

Well, in fact the du and the copy should both EIO when they get to trying to
write to the unmounted OST.

Can you get a stack trace (sysrq-t) on the client after you have unmounted the
OST and processes are hung/blocked?

b.
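A sketch of how the requested sysrq-t dump can be captured on the hung client, assuming the magic SysRq interface is compiled into the kernel (the output file name is only an example):

  # enable the magic SysRq interface, then dump the state of all tasks
  echo 1 > /proc/sys/kernel/sysrq
  echo t > /proc/sysrq-trigger
  # the per-task stack traces land in the kernel ring buffer / syslog
  dmesg > /tmp/stack_trace.txt
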
On Thursday, 10 July 2008 19:35:57, you wrote:

> Hi.
>
> > OST lustre mkfs:
> > "mkfs.lustre --param="failover.mode=failout" --fsname
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Given this (above) parameter setting...

Is 'failout' not OK? Actually we like to use it because we want to use the
Lustre system as a huge, expandable data archive. If one OST breaks down and
destroys the data on it, we can restore that data.

> > scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b
> > 4096' --mgsnode=mds1lustre@tcp0 /dev/sdb"
> >
> > The following procedure hangs a client:
> > 1) copy files to the lustre system
> > 2) do a 'du -sh /mnt/testfs/willi' while copying
> > 3) unmount an OST (here OST0003) while copying
>
> Do you expect that the copy and du (which are both running at the same
> time while you unmount the OST, right?

Right.

> ) should both get EIOs?

Actually, I expect the client not to hang any job that accesses the file
system at that moment. If that requires an EIO and a kill of that process,
that is fine by me.

> > What is the proper way (besides avoiding the use of 'du') to reactivate
> > the client file system ?
>
> Well, in fact the du and the copy should both EIO when they get to
> trying to write to the unmounted OST.
>
> Can you get a stack trace (sysrq-t) on the client after you have
> unmounted the OST and processes are hung/blocked?

I will get this done today. If the output is very large, can I zip it and
attach it?

Thank you.
Heiko
On Thursday, 10 July 2008 19:35:57, Brian J. Murrell wrote:

> Well, in fact the du and the copy should both EIO when they get to
> trying to write to the unmounted OST.
>
> Can you get a stack trace (sysrq-t) on the client after you have
> unmounted the OST and processes are hung/blocked?

Here is the stack trace. I hope it is the one you requested.

Regards
Heiko

[Attachment: stack_trace.txt.gz, application/x-gzip, 30198 bytes:
http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080711/0592fb67/attachment-0003.bin]
On Fri, 2008-07-11 at 08:24 +0200, Heiko Schroeter wrote:

> Is 'failout' not OK?

That's up to you. Failout means that if an OST becomes unreachable (because it
has failed, been taken off the network, unmounted, turned off, etc.), then any
I/O to get objects from that OST will cause a client to get an EIO
(Input/Output error). Failover means that a client which tries to do I/O to a
failed OST will continue to retry (forever) until it gets an answer. Userspace
sees nothing strange, other than an I/O that takes, potentially, a very long
time to complete.

> Actually we like to use it because we want to use the
> Lustre system as a huge, expandable data archive.

I'm not sure what using failout has to do with that.

> If one OST breaks down and destroys the data on it, we can restore that data.

Again, failout/failover really has nothing to do with this. It has everything
to do with what a client does when it sees an OST fail.

> Actually, I expect the client not to hang any job that accesses the file
> system at that moment. If that requires an EIO and a kill of that process,
> that is fine by me.

Well, no kill should be necessary. An EIO should terminate an application,
unless it has a retry handler for EIOs written into it, which is not very
common. EIO usually should be interpreted as fatal.

b.
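For reference, the failout/failover choice described above is the failover.mode parameter that was set at format time in the original post. A minimal sketch of the two variants, reusing the device, fsname, and MGS NID from that post (the tunefs.lustre line assumes the target is unmounted and that "failover" is accepted as the default mode name; check the manual for your Lustre version):

  # failout: clients get EIO for objects on an unreachable OST (as in the original mkfs)
  mkfs.lustre --param="failover.mode=failout" --fsname scia --ost --mgsnode=mds1lustre@tcp0 /dev/sdb

  # switch an existing, unmounted OST back to the default failover behaviour,
  # where clients block and retry until the OST returns
  tunefs.lustre --param="failover.mode=failover" /dev/sdb
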
On Fri, 2008-07-11 at 10:14 +0200, Heiko Schroeter wrote:

> Here is the stack trace. I hope it is the one you requested.

Hrm. What is strange is that you have configured failout but are not getting
EIOs. Maybe you should file a bug in our bugzilla about this one.

b.