Our CentOS 5 2.6.18-53.1.13.el5_lustre.1.6.4.3smp OSS client was giving us trouble. The /var/log/messages file would frequently have the text:

Jun 29 17:49:27 oss2 kernel: LustreError: 2565:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at ffff810077ef4000 x82828/t0 o101->MGS at MGC172.18.0.10@o2ib_0:26 lens 232/240 ref 1 fl Rpc:/0/0 rc 0/0
Jun 29 17:49:27 oss2 kernel: LustreError: 2565:0:(client.c:519:ptlrpc_import_delay_req()) Skipped 59 previous similar messages
Jun 29 18:00:08 oss2 kernel: LustreError: 2565:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at ffff810077f34400 x82888/t0 o101->MGS at MGC172.18.0.10@o2ib_0:26 lens 232/240 ref 1 fl Rpc:/0/0 rc 0/0
Jun 29 18:00:08 oss2 kernel: LustreError: 2565:0:(client.c:519:ptlrpc_import_delay_req()) Skipped 59 previous similar messages

Last week I connected a new InfiniBand cable between the OSS box and the IB switch. I have not had an IMP_INVALID error in the /var/log/messages file since then. I don't know how a cable can work and then not work, but in this example, changing the cable stopped the Lustre errors from appearing.

megan
Brian J. Murrell
2008-Jul-02 13:27 UTC
[Lustre-discuss] IMP_INVALID -- one change that worked
On Tue, 2008-07-01 at 13:57 -0700, megan wrote:
> Last week I connected a new InfiniBand cable between the OSS box and
> the IB switch. I have not had an IMP_INVALID error in the
> /var/log/messages file since then. I don't know how a cable can work
> and then not work, but in this example, changing the cable stopped
> the Lustre errors from appearing.

Our experience has shown that IB cables can indeed be "flaky" and work only intermittently. You could probably compare this flakiness to an audio cable that crackles when you wiggle it, i.e. a connection that works sometimes and not others.

b.
IB cables (at least the copper ones) are fairly delicate. We always made sure ours didn't have any severe bends or kinks in them; they tend to behave like bent optical cables if they're abused too severely.

Klaus

On 7/1/08 1:57 PM, "megan" <dobsonunit at gmail.com> did etch on stone tablets:
> Last week I connected a new InfiniBand cable between the OSS box and
> the IB switch. I have not had an IMP_INVALID error in the
> /var/log/messages file since then. I don't know how a cable can work
> and then not work, but in this example, changing the cable stopped
> the Lustre errors from appearing.
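
To check whether a cable swap like the one described in this thread really made the errors stop, it can help to tally IMP_INVALID messages per day before and after the change. The following is only a rough sketch in Python, not anything posted by the thread's authors. The function name count_imp_invalid and the constants are made up for illustration, and the sketch assumes that every "Skipped N previous similar messages" line logged from ptlrpc_import_delay_req() stands for N suppressed IMP_INVALID messages, which matches the excerpt above but may not hold in every log.

```python
import re
from collections import Counter

# Syslog timestamp prefix, e.g. "Jun 29 " or "Jul  1 " (day may be space padded).
DATE = re.compile(r"^(\w{3}\s+\d{1,2})\s")
# Rate-limit summary line emitted by the kernel log code.
SKIPPED = re.compile(r"Skipped (\d+) previous similar messages")
# Function name that appears in the messages quoted in this thread.
SOURCE = "ptlrpc_import_delay_req()"


def count_imp_invalid(lines):
    """Return a Counter mapping 'Mon DD' to an estimated IMP_INVALID count.

    Assumption (hypothetical, see lead-in): a "Skipped N" line from SOURCE
    represents N additional IMP_INVALID messages.
    """
    counts = Counter()
    for line in lines:
        # Only look at LustreError lines from the function of interest.
        if "LustreError" not in line or SOURCE not in line:
            continue
        date = DATE.match(line)
        if not date:
            continue
        day = date.group(1)
        if "IMP_INVALID" in line:
            counts[day] += 1
        else:
            skipped = SKIPPED.search(line)
            if skipped:
                counts[day] += int(skipped.group(1))
    return counts
```

The function takes any iterable of lines, so an open file handle for /var/log/messages can be passed in directly. A count that drops to zero on the days after the cable was replaced would be consistent with megan's observation, though it would not by itself prove the cable was the cause.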