Marina Cacciagrano
2012-Feb-15 16:52 UTC
[Lustre-discuss] LustreError codes -114 and -16 (ldlm_lib.c:1919:target_send_reply_msg())
Hello,

On all the nodes of a Lustre 1.8.2 system, I often see messages similar to the following in /var/log/syslog:

    LustreError: 8862:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-114) req@ffff8103f97dc850 x1393780295030087/t0 o250->bfd79683-ce51-1e18-7f40-632c3a616b01@NET_0x20000ac1054d2_UUID:0/0 lens 368/264 e 0 to 0 dl 1329235090 ref 1 fl Interpret:/0/0 rc -114/0

and

    LustreError: 8963:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-16) req@ffff81024b636400 x1393871458142225/t0 o38->e3e2f978-3cc1-c6f9-17cb-9ac846be7fae@NET_0x20000ac1138c2_UUID:0/0 lens 368/264 e 0 to 0 dl 1329300007 ref 1 fl Interpret:/0/0 rc -16/0

I cannot find the meaning of error codes -114 and -16. Can anybody advise on what generates those errors?

A quick description of the configuration: the Lustre version is 1.8.2, the system is made up of one MDS host and seven OSS hosts, and LNet runs over 10GbE.

Regards,
marina

Framestore 9 Noel Street London W1F 8GH [T] +44 (0)20 7208 2600 [F] +44 (0)20 7208 2626
Chris Horn
2012-Feb-15 17:21 UTC
[Lustre-discuss] LustreError codes -114 and -16 (ldlm_lib.c:1919:target_send_reply_msg())
errno 16 is EBUSY (device or resource busy) and errno 114 is EALREADY (Operation already in progress).

Chris Horn
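For anyone who wants to decode these return codes from a script, here is a minimal sketch. It assumes the Linux errno numbering quoted in the reply above; the table and the function name `describe_rc` are illustrative only and not part of Lustre. Lustre logs the code as a negative number, so the sign is flipped before the lookup.

```python
# Linux errno values taken from the reply above.
# Other platforms may number these errors differently.
LINUX_ERRNO = {
    16: ("EBUSY", "Device or resource busy"),
    114: ("EALREADY", "Operation already in progress"),
}


def describe_rc(rc):
    """Translate a negative return code from a log line into an errno name."""
    name, text = LINUX_ERRNO.get(-rc, ("UNKNOWN", "not in this table"))
    return f"{rc}: {name} ({text})"


for rc in (-114, -16):
    print(describe_rc(rc))
```

On a Linux host, Python's standard `errno.errorcode` and `os.strerror` should give the same names and messages for a wider range of codes.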
Marina Cacciagrano
2012-Feb-15 18:46 UTC
[Lustre-discuss] LustreError codes -114 and -16 (ldlm_lib.c:1919:target_send_reply_msg())
Thanks! Maybe that means that the drives are a bit too slow to respond to the requests... Can that be related to a problem with lnet as well?

marina
Chris Horn
2012-Feb-15 18:57 UTC
[Lustre-discuss] LustreError codes -114 and -16 (ldlm_lib.c:1919:target_send_reply_msg())
Hard to say what's going on without additional context. The first message relates to an MGS_CONNECT RPC (o250), the second message relates to an MDS_CONNECT RPC (o38). I would suspect network issues.

Chris Horn
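To see which RPC types are failing across a large syslog, the error code and opcode can be pulled out of each line. The following is a rough sketch, not a Lustre tool: the regular expression, the `parse` helper, and the two-entry opcode table are assumptions built only from the log lines and opcode names given in this thread. The list archive shows "@" as " at "; the pattern does not depend on either form.

```python
import re

# Opcode names as given in the reply above; any other opcode is reported as unknown.
OPCODES = {250: "MGS_CONNECT", 38: "MDS_CONNECT"}

# Capture the error code in parentheses and the opcode in the "oNNN->" field.
PATTERN = re.compile(r"processing error \((-?\d+)\).*? o(\d+)->")


def parse(line):
    """Return (rc, opcode, opcode_name) for a matching line, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    rc, opcode = int(match.group(1)), int(match.group(2))
    return rc, opcode, OPCODES.get(opcode, "unknown")


sample = (
    "LustreError: 8862:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ "
    "processing error (-114) req at ffff8103f97dc850 x1393780295030087/t0 "
    "o250->bfd79683-ce51-1e18-7f40-632c3a616b01 at NET_0x20000ac1054d2_UUID:0/0 "
    "lens 368/264 e 0 to 0 dl 1329235090 ref 1 fl Interpret:/0/0 rc -114/0"
)
print(parse(sample))
```

Counting the resulting tuples per client UUID or per hour might help show whether the connect failures cluster around particular clients or time windows, which would fit the network suspicion above.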