nic@cray.com
2006-Dec-15  09:43 UTC
[Lustre-devel] [Bug 11394] OSS loses its mind, spits out error messages with garbage data.
Please don't reply to lustre-devel. Instead, comment in Bugzilla by
using the following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11394
nic@cray.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eeb@clusterfs.com
           Priority|P3                          |P2
We have had 2 new hits on the production XT3 at ORNL (jaguar) of this bug. This
is now a hot one! 
This always seems to occur after some network issues -- it almost seems like we
are using a buffer with bogus data in it, like we are ignoring (or not seeing) a
bad return value from a network receive.
I'll upload more logs -- but we need to get some movement on this soon.
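To make the suspected failure mode concrete, here is a minimal sketch of a receive path that refuses to interpret a buffer unless the receive fully succeeded. This is not Lustre, LNET, or Portals code: `recv_exact` and `read_header` are hypothetical names, and plain POSIX `read()` on a file descriptor stands in for the real network receive.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from fd into buf.
 * Returns 0 on success, -1 on error or premature end of stream. */
static int recv_exact(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error: buffer contents are not valid */
        }
        if (n == 0)
            return -1;      /* stream ended early: buffer is only partly filled */
        got += (size_t)n;
    }
    return 0;
}

/* Hypothetical caller: the header is only meaningful when 0 is returned.
 * On failure the buffer is cleared so stale bytes cannot reach a log
 * message or a parser by accident. */
static int read_header(int fd, unsigned char *hdr, size_t hdr_len)
{
    if (recv_exact(fd, hdr, hdr_len) != 0) {
        memset(hdr, 0, hdr_len);
        return -1;
    }
    return 0;
}
```

If a caller skipped the return-value check and parsed `hdr` anyway, it would be reading whatever bytes were left in the buffer, which would match the garbage seen in the error messages.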
pbojanic@clusterfs.com
2007-Jan-08  07:58 UTC
[Lustre-devel] [Bug 11394] OSS loses its mind, spits out error messages with garbage data.
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |green@clusterfs.com
  Status Whiteboard|2006-12-21: CFS consider the|2007-01-08: Likely not an
                   |corruption has occurred     |LNET issue, but may be
                   |prior to reaching LNET      |Lustre or Cray Portals; CFS
                   |                            |to determine next steps for
                   |                            |debugging
Eric Barton advises that he really doesn't think this is an LNET issue. He is
not sure whether it's Lustre or Cray Portals related. He suggests trying the
normal use-after-free debugging techniques (e.g. POISON) first. I've asked Oleg
to discuss with mjmac to see if we can move this forward ourselves, or if we
need further help from Cray.
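For readers unfamiliar with poison-based use-after-free debugging, the idea can be sketched as follows. This is an illustration only, not the Lustre or kernel implementation: the fixed-slot pool, all function names, and the 0x5a byte value are assumptions chosen for the example. Freed memory is overwritten with a known pattern, so a stale reader sees obviously bogus data and a stale writer leaves a detectable hole in the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 0x5a    /* arbitrary illustrative pattern */
#define SLOT_SIZE   64
#define NUM_SLOTS   4

/* Hypothetical fixed-slot pool; free slots always hold the poison pattern. */
struct pool {
    unsigned char mem[NUM_SLOTS * SLOT_SIZE];
    int in_use[NUM_SLOTS];
};

static void pool_init(struct pool *p)
{
    memset(p->mem, POISON_BYTE, sizeof(p->mem));
    memset(p->in_use, 0, sizeof(p->in_use));
}

/* Returns 1 if every byte of buf still holds the poison pattern. */
static int is_poisoned(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != POISON_BYTE)
            return 0;
    return 1;
}

/* Hand out the first free slot, zeroed; NULL if the pool is exhausted. */
static void *pool_alloc(struct pool *p)
{
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            memset(&p->mem[i * SLOT_SIZE], 0, SLOT_SIZE);
            return &p->mem[i * SLOT_SIZE];
        }
    }
    return NULL;
}

/* Poison the slot on free so later stale reads see the pattern. */
static void pool_free(struct pool *p, void *ptr)
{
    int i = (int)(((unsigned char *)ptr - p->mem) / SLOT_SIZE);

    assert(i >= 0 && i < NUM_SLOTS && p->in_use[i]);
    memset(&p->mem[i * SLOT_SIZE], POISON_BYTE, SLOT_SIZE);
    p->in_use[i] = 0;
}

/* Returns the index of the first free slot whose poison was overwritten
 * (evidence of a write after free), or -1 if all free slots are intact. */
static int pool_find_corruption(const struct pool *p)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        if (!p->in_use[i] && !is_poisoned(&p->mem[i * SLOT_SIZE], SLOT_SIZE))
            return i;
    return -1;
}
```

With this in place, error messages full of the poison byte would point to a read after free, while a non-negative result from `pool_find_corruption` would point to a write after free.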