thr3ads.net - Lustre discuss - [Lustre-discuss] csum errors [Aug 2008]

If this information is useful, please help other people find it:
Share via:

Stuart Midgley

2008-Aug-28 03:57 UTC

[Lustre-discuss] csum errors

We recently upgraded from 1.4.10.1 to 1.6.5.1 (clients and servers)  
and now we are seeing errors like


Aug 27 07:49:54 oss025 kernel: LustreError: 3738:0:(ost_handler.c: 
1163:ost_brw_write()) client csum 2dbc1696, server csum 9d081697
Aug 27 07:49:54 oss025 kernel: LustreError: 168-f: p1-OST0018: BAD  
WRITE CHECKSUM: changed in transit before arrival at OST from  
12345-172.16.4.93 at tcp inum 24522277/426969871 object 12021/0 extent  
[10485760-11534335]
Aug 27 07:49:55 oss025 kernel: LustreError: 3738:0:(ost_handler.c: 
1225:ost_brw_write()) client csum 2dbc1696, original server csum  
9d081697, server csum now 9d081697


always from the same cluster node...  Should we be worried?  I suspect  
this means we shouldn''t turn check summing off?  I assume these are  
rejected and resent from the client?


-- 
Dr Stuart Midgley
sdm900 at gmail.com

Stuart Midgley

2008-Aug-28 13:49 UTC

head link

[Lustre-discuss] csum errors

for completeness, here are the logs from 172.16.4.93

Aug 27 07:49:55 clus093 kernel: LustreError: 132-0: BAD WRITE  
CHECKSUM: changed on the client after we checksummed it - likely false  
positive due to mmap IO (bug 11742): from 172.16.0.25 at tcp inum  
24522277/1605841060 object 12021/0 extent [10485760-11534335]
Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c: 
1162:check_write_checksum()) original client csum 2dbc1696 (type 2),  
server csum 9d081697 (type 2), client csum now 9d081697
Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c: 
1372:osc_brw_redo_request()) @@@ redo for recoverable error   
req at ffff81012c434600 x4720217/t820873 o4->p1- 
OST0018_UUID at 172.16.0.25@tcp:6/4 lens 384/480 e 0 to 100 dl 1219794694  
ref 2 fl Interpret:R/0/0 rc 0/0


-- 
Dr Stuart Midgley
sdm900 at gmail.com



On 28/08/2008, at 11:57 AM, Stuart Midgley wrote:
> We recently upgraded from 1.4.10.1 to 1.6.5.1 (clients and servers)  
> and now we are seeing errors like
>
>
> Aug 27 07:49:54 oss025 kernel: LustreError: 3738:0:(ost_handler.c: 
> 1163:ost_brw_write()) client csum 2dbc1696, server csum 9d081697
> Aug 27 07:49:54 oss025 kernel: LustreError: 168-f: p1-OST0018: BAD  
> WRITE CHECKSUM: changed in transit before arrival at OST from  
> 12345-172.16.4.93 at tcp inum 24522277/426969871 object 12021/0 extent  
> [10485760-11534335]
> Aug 27 07:49:55 oss025 kernel: LustreError: 3738:0:(ost_handler.c: 
> 1225:ost_brw_write()) client csum 2dbc1696, original server csum  
> 9d081697, server csum now 9d081697
>
>
> always from the same cluster node...  Should we be worried?  I  
> suspect this means we shouldn''t turn check summing off?  I assume
> these are rejected and resent from the client?
>
>
> -- 
> Dr Stuart Midgley
> sdm900 at gmail.com
>
>
>

Andreas Dilger

2008-Aug-28 17:25 UTC

head link

[Lustre-discuss] csum errors

On Aug 28, 2008  21:49 +0800, Stuart Midgley wrote:> for completeness, here are the logs from 172.16.4.93
> 
> Aug 27 07:49:55 clus093 kernel: LustreError: 132-0: BAD WRITE  
> CHECKSUM: changed on the client after we checksummed it - likely false  
> positive due to mmap IO (bug 11742): from 172.16.0.25 at tcp inum  
This is the important part to note - if you are doing mmap writes then
the VM doesn''t protect the pages from being modified while they are
being checksummed and sent over the wire.
> Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c: 
> 1162:check_write_checksum()) original client csum 2dbc1696 (type 2),  
> server csum 9d081697 (type 2), client csum now 9d081697
This means the data changed after the initial checksum was computed,
but now it has settled down.  In some cases the "client csum now" can
have changed again, depending on whether the process is rewriting the
same file repeatedly.
> Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c: 
> 1372:osc_brw_redo_request()) @@@ redo for recoverable error   
> req at ffff81012c434600 x4720217/t820873 o4->p1- 
> OST0018_UUID at 172.16.0.25@tcp:6/4 lens 384/480 e 0 to 100 dl 1219794694  
> ref 2 fl Interpret:R/0/0 rc 0/0
Here it tells you it is resending the RPC.
> > always from the same cluster node...  Should we be worried?  I  
> > suspect this means we shouldn''t turn check summing off?  I
assume
> > these are rejected and resent from the client?
If you are NOT doing mmap IO (just normal read/write) then it is possible
your node has memory corruption.  There is an extra check that can be
done to checksum the pages while they are in memory, instead of just
over the wire.  More overhead of course, but can help isolate the problem.

	echo 1> /proc/fs/lustre/llite/*/checksum_pages

This will also enable on-wire checksumming, which is already on
by default.  One caveat is that turning off checksum_pages will
also turn off the on-wire checksums (which can be re-enabled via
/proc/fs/lustre/osc/*/checksums)...  Blame Phil.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Stuart Midgley

2008-Aug-29 01:07 UTC

head link

[Lustre-discuss] csum errors

Thanks for the information, greatly appreciated.

We are keeping an eye on the client causing these errors and doing a  
few tests.  The mmap issue is interesting.  The code producing these  
errors is running across the entire cluster, so I assume if it was  
mmap-ing we would be seeing these sort of random errors from more than  
just one node.

If the problems persist, we will turn on the extra debugging and see  
where we are at.


-- 
Dr Stuart Midgley
sdm900 at gmail.com



On 29/08/2008, at 1:25 AM, Andreas Dilger wrote:
> On Aug 28, 2008  21:49 +0800, Stuart Midgley wrote:
>> for completeness, here are the logs from 172.16.4.93
>>
>> Aug 27 07:49:55 clus093 kernel: LustreError: 132-0: BAD WRITE
>> CHECKSUM: changed on the client after we checksummed it - likely  
>> false
>> positive due to mmap IO (bug 11742): from 172.16.0.25 at tcp inum
>
> This is the important part to note - if you are doing mmap writes then
> the VM doesn''t protect the pages from being modified while they
are
> being checksummed and sent over the wire.
>
>> Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c:
>> 1162:check_write_checksum()) original client csum 2dbc1696 (type 2),
>> server csum 9d081697 (type 2), client csum now 9d081697
>
> This means the data changed after the initial checksum was computed,
> but now it has settled down.  In some cases the "client csum now"
can
> have changed again, depending on whether the process is rewriting the
> same file repeatedly.
>
>> Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c:
>> 1372:osc_brw_redo_request()) @@@ redo for recoverable error
>> req at ffff81012c434600 x4720217/t820873 o4->p1-
>> OST0018_UUID at 172.16.0.25@tcp:6/4 lens 384/480 e 0 to 100 dl  
>> 1219794694
>> ref 2 fl Interpret:R/0/0 rc 0/0
>
> Here it tells you it is resending the RPC.
>
>>> always from the same cluster node...  Should we be worried?  I
>>> suspect this means we shouldn''t turn check summing off?  I
assume
>>> these are rejected and resent from the client?
>
> If you are NOT doing mmap IO (just normal read/write) then it is  
> possible
> your node has memory corruption.  There is an extra check that can be
> done to checksum the pages while they are in memory, instead of just
> over the wire.  More overhead of course, but can help isolate the  
> problem.
>
> 	echo 1> /proc/fs/lustre/llite/*/checksum_pages
>
> This will also enable on-wire checksumming, which is already on
> by default.  One caveat is that turning off checksum_pages will
> also turn off the on-wire checksums (which can be re-enabled via
> /proc/fs/lustre/osc/*/checksums)...  Blame Phil.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>

Lustre discuss - Aug 2008 - csum errors

[Lustre-discuss] csum errors

[Lustre-discuss] csum errors

[Lustre-discuss] csum errors

[Lustre-discuss] csum errors