nic@cray.com
2007-Feb-06 09:41 UTC
[Lustre-devel] [Bug 11659] ptllnd credit overflows on Portals congestion
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11659

Created an attachment (id=9526)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9526&action=view)
client side syslog/console
nic@cray.com
2007-Feb-06 09:42 UTC
[Lustre-devel] [Bug 11659] ptllnd credit overflows on Portals congestion
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11659

Created an attachment (id=9527)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9527&action=view)
server syslog
eeb@clusterfs.com
2007-Feb-07 14:26 UTC
[Lustre-devel] [Bug 11659] ptllnd credit overflows on Portals congestion
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11659

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bugs@clusterfs.com          |eeb@clusterfs.com
             Status|NEW                         |ASSIGNED

Created an attachment (id=9538)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9538&action=view)
patch to increase resilience to network problems

This patch tracks peer buffer utilisation more closely so that overruns are handled properly and don't fail the assertion in kptllnd_rx_done(). It also includes stronger peer incarnation checks which may help debug pathological behaviour if the underlying network misbehaves.

Nic, it would be great if you can try this with the sort of network meltdown that produced the logs attached here.
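For readers following the thread without the attachment, here is a minimal sketch of the general idea described above: count the receive buffers a peer is using, and report an overrun or an incarnation mismatch as an error instead of tripping an assertion. This is not the code from attachment 9538. The struct, the enum, and the functions peer_rx_start() and peer_rx_done() are hypothetical names invented for illustration, and the real ptllnd data structures may look quite different.

```c
#include <stdbool.h>

/* Hypothetical per-peer bookkeeping; NOT the real ptllnd peer structure. */
struct peer_state {
    int max_buffers;                 /* buffers the peer is allowed to fill */
    int buffers_in_use;              /* buffers holding unprocessed messages */
    unsigned long long incarnation;  /* peer instance stamp last accepted */
};

enum rx_result { RX_OK, RX_OVERRUN, RX_STALE_PEER };

/* Called when a message arrives from the peer (illustrative only). */
static enum rx_result peer_rx_start(struct peer_state *p,
                                    unsigned long long msg_incarnation)
{
    /* Reject traffic stamped with a different peer incarnation. */
    if (msg_incarnation != p->incarnation)
        return RX_STALE_PEER;

    /* The peer sent more than it had buffers for: report it to the
     * caller rather than letting the count run past the limit. */
    if (p->buffers_in_use >= p->max_buffers)
        return RX_OVERRUN;

    p->buffers_in_use++;
    return RX_OK;
}

/* Called when a received message has been fully processed.
 * Returns false on an accounting mismatch instead of asserting. */
static bool peer_rx_done(struct peer_state *p)
{
    if (p->buffers_in_use <= 0)
        return false;

    p->buffers_in_use--;
    return true;
}
```

In this sketch the caller decides what to do with RX_OVERRUN or RX_STALE_PEER, for example dropping the message or closing the peer. Which of those the actual patch does is not stated in this thread.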