eeb@clusterfs.com
2007-Jan-29 08:56 UTC
[Lustre-devel] [Bug 11616] ASSERTION(error != 0 || conn->ibc_state >= IBLND_CONN_ESTABLISHED) causes nmi
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11616

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bugs@clusterfs.com          |eeb@clusterfs.com
             Status|NEW                         |ASSIGNED

Created an attachment (id=9440)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9440&action=view)
debug patch

Can you attach the output of (dmesg | grep o2ib) to this bug? Any errors reported before the assertion failure should provide a clue. It would also help to enable network error printks (echo +neterror > /proc/sys/lnet/printk) beforehand.

FYI, this assertion states that the only reason for closing a connection before it becomes established is that there has been an error. A trawl through the code hasn't thrown up any obvious errors, and AFAICS the only way this could happen is if the CM delivered RDMA_CM_EVENT_DISCONNECTED before RDMA_CM_EVENT_ESTABLISHED, which I believe shouldn't be possible (but I'll check). In any case, this patch adds some debug in case this is occurring.
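
For anyone following along, here is a minimal sketch of the invariant the assertion encodes and of the event ordering suspected above. It is a toy model in Python, not the o2iblnd code: the ConnState values, close_allowed() and first_violation() are hypothetical names invented for illustration, and it assumes that a plain disconnect closes the connection with error 0.

```python
from enum import IntEnum

# CM event names as quoted in the bug comment.
ESTABLISHED_EV = "RDMA_CM_EVENT_ESTABLISHED"
DISCONNECTED_EV = "RDMA_CM_EVENT_DISCONNECTED"


class ConnState(IntEnum):
    # Hypothetical, simplified states; the real driver has more of them.
    CONNECTING = 1
    ESTABLISHED = 2
    CLOSING = 3


def close_allowed(state, error):
    """Mirror of the assertion: error != 0 || state >= ESTABLISHED."""
    return error != 0 or state >= ConnState.ESTABLISHED


def first_violation(events):
    """Replay CM events against a fresh connection.

    Returns the index of the first event that would trip the assertion,
    or None if the sequence is legal in this model.
    """
    state = ConnState.CONNECTING
    for i, ev in enumerate(events):
        if ev == ESTABLISHED_EV:
            state = ConnState.ESTABLISHED
        elif ev == DISCONNECTED_EV:
            # Assumption: a plain disconnect is a close with error == 0.
            if not close_allowed(state, error=0):
                return i
            return None  # legal close; nothing more to check
    return None


if __name__ == "__main__":
    print(first_violation([ESTABLISHED_EV, DISCONNECTED_EV]))
    print(first_violation([DISCONNECTED_EV, ESTABLISHED_EV]))
```

In this model the normal ordering passes, while a disconnect that arrives before the established event is flagged at index 0, which is the case the debug patch is meant to detect.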