thr3ads.net - Lustre announce - [Lustre-discuss] WARNING: data corruption issue found in 1.8.x releases [Sep 2009]


Johann Lombardi

2009-Sep-09 15:00 UTC

[Lustre-discuss] WARNING: data corruption issue found in 1.8.x releases

A bug has been identified in the 1.8 releases (1.8.0, 1.8.0.1 & 1.8.1 are
impacted) that can cause data corruption on the OSTs. This problem is
related to the OSS read cache feature that has been introduced in 1.8.0.
This can happen when a bulk read or write request is aborted due to the
client being evicted or because the data transfer over the network has
timed out. More details are available in bug 20560:
https://bugzilla.lustre.org/show_bug.cgi?id=20560

A patch is under testing and will be included in 1.8.1.1.
Until 1.8.1.1 is available, we recommend disabling the OSS read cache
feature. This feature can be disabled by running the following two
commands on the OSSs:
# lctl set_param obdfilter.*.writethrough_cache_enable=0
# lctl set_param obdfilter.*.read_cache_enable=0

This has to be done each time an OST is restarted.

Best regards,
Johann, for the Lustre team
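
Since the settings have to be re-applied each time an OST is restarted, the two commands above could be wrapped in a small script. The following is a minimal sketch under stated assumptions: the function name `disable_oss_cache` and the `LCTL` override variable are illustrative and not part of Lustre; only the two `lctl set_param` invocations come from the message.

```shell
#!/bin/sh
# Sketch only: re-apply the workaround from this message on one OSS.
# LCTL is a hypothetical override so the commands can be previewed with LCTL=echo.
LCTL="${LCTL:-lctl}"

disable_oss_cache() {
    # The patterns are quoted so the shell passes '*' through to lctl unexpanded.
    "$LCTL" set_param "obdfilter.*.writethrough_cache_enable=0" &&
    "$LCTL" set_param "obdfilter.*.read_cache_enable=0"
}
```

Calling `disable_oss_cache` after the OSTs are mounted applies both settings; with `LCTL=echo` it only prints the commands it would run.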

Charles A. Taylor

2009-Sep-09 18:07 UTC

Just for the record, we've been running 1.8.1 for several weeks now
with no problems.  Well, truthfully, "no problems" is an exaggeration,
but it is mostly working.  We see lots of log messages we are not used
to regarding client and server checksum differences.

Anyway, your email concerned us, so we issued the recommended commands
on our OSSs to disable the caching.  That promptly crashed two of our
OSSs.  We got the servers back up and after fsck'ing (fsck.ext4) all
the OSTs and remounting lustre, one of the two OSSs promptly crashed
again.

We're still working through it, but we weren't having any problems - or
at least none we were aware of - until we disabled the caching.  Maybe
we were already doomed - I don't know.

Right now I'm kind of wishing we had moved to 1.6.7.2 rather than
1.8.0.1/1.8.1.  I think we got overconfident after running 1.6.4.2 for
so long with so few problems.

Charlie Taylor
UF HPC Center


Mervini, Joseph A

2009-Sep-09 18:23 UTC

I'm not really sure why writethrough_cache_enable is being disabled, but
the method we have used to disable the read cache is "echo 0 >
/proc/fs/lustre/obdfilter/<ost name>/read_cache_enable", without any
issues.
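
For an OSS serving several OSTs, the same write could be looped over every OST directory. This is a minimal sketch, assuming each OST exposes a `read_cache_enable` file under the /proc path quoted above; the function name and the `PROC_ROOT` override are hypothetical and exist only so the loop can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: write 0 to read_cache_enable for every OST found under /proc.
# PROC_ROOT is a hypothetical override for trying this against a scratch directory.
PROC_ROOT="${PROC_ROOT:-/proc/fs/lustre/obdfilter}"

disable_read_cache_proc() {
    for f in "$PROC_ROOT"/*/read_cache_enable; do
        # Skip the unexpanded pattern when no OST is present on this server.
        [ -e "$f" ] || continue
        echo 0 > "$f" && echo "disabled: $f"
    done
}
```

Like the lctl commands, this only changes the running state, so it would need to be repeated after an OST restart.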


Lundgren, Andrew

2009-Sep-09 19:23 UTC

Does this need to be run on EACH OSS?  Is there a central way to do it on the
MDS?

Do you recommend disabling both the read and the write cache, as the settings
indicate, or just the read cache, as the text indicates?
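
The announcement only says the commands are run on the OSSs; it does not name a central mechanism on the MDS. One possible approach, sketched here under assumptions, is to loop over the OSS nodes from an admin host via ssh. The host names `oss01 oss02`, the `OSS_HOSTS` and `SSH` variables, and the function name are all hypothetical; only the `lctl set_param` parameters come from the announcement.

```shell
#!/bin/sh
# Sketch only: run the workaround on each OSS from one admin node over ssh.
# OSS_HOSTS and SSH are hypothetical; replace the host names with your own.
OSS_HOSTS="${OSS_HOSTS:-oss01 oss02}"
SSH="${SSH:-ssh}"

disable_cache_on_all_oss() {
    for host in $OSS_HOSTS; do
        for param in writethrough_cache_enable read_cache_enable; do
            # The single quotes keep the remote shell from expanding '*'.
            "$SSH" "$host" "lctl set_param 'obdfilter.*.$param=0'" ||
                echo "failed on $host: $param" >&2
        done
    done
}
```

Setting `SSH=echo` before calling `disable_cache_on_all_oss` prints the per-host commands instead of running them.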


Charles A. Taylor

2009-Sep-09 19:30 UTC

On Wed, 2009-09-09 at 13:23 -0600, Lundgren, Andrew wrote:
> Does this need to be run on EACH OSS?  Is there a central way to do it on
> the MDS?
>
> You recommend disabling the read and the write as the settings indicate or
> just the read as the text indicates?

A clarification would be good here.  So far, we have found that our
OSSs crash with the recommended workaround, so that is a non-starter for
us.  If we can run with just read_cache_enable=0 and that is
acceptable to avoid the corruption bug, then that would be good to
know.

At the moment we are not even sure we can run with just
read_cache_enable=0.  We just know that we can't run with them both
disabled for more than a few minutes without crashing in
obd_filter_preprw().

Charlie Taylor
UF HPC Center

Charles A. Taylor

2009-Sep-09 21:03 UTC

FWIW, we seem to be OK with just "read_cache_enable=0".  Don't know if
that is sufficient to avoid the data corruption or not.  It will have
to do, though, because running with "writethrough_cache_enable=0" crashes
the OSSs within a few minutes of completing recovery.

Charlie Taylor
UF HPC Center
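
When running with only one of the two caches disabled, it may help to confirm which settings are actually in effect. A minimal sketch, assuming `lctl get_param` prints one `name=value` line per OST; the helper name `still_enabled` is hypothetical.

```shell
#!/bin/sh
# Sketch only: list cache parameters that are still enabled.
# Assumes the input is one "name=value" line per OST, as printed by lctl get_param.
still_enabled() {
    # Reads "name=value" lines on stdin; prints the names whose value is not 0.
    awk -F= 'NF == 2 && $2 != 0 { print $1 }'
}
```

For example, `lctl get_param 'obdfilter.*.read_cache_enable' | still_enabled` would print nothing once the read cache is off on every OST of that OSS.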



Oleg Drokin

2009-Sep-09 22:28 UTC

Hello!

On Sep 9, 2009, at 2:07 PM, Charles A. Taylor wrote:
> Anyway, your  email concerned us so we issued the recommended commands
> on our OSSs to disable the caching.   That promptly crashed two of our
> OSSs.   We got the servers back up and after fsck'ing (fsck.ext4) all
> the OSTs and remounting lustre, one of the two OSSs promptly crashed
> again.

Can you share the crash information with us? Better yet, file a bug
with stack traces and whatever else you might have.

Thanks.

Bye,
     Oleg

Andreas Dilger

2009-Sep-10 06:05 UTC

On Sep 09, 2009  15:30 -0400, Charles A. Taylor wrote:
> > You recommend disabling the read and the write as the settings
> > indicate or just the read as the text indicates?
>
> A clarification would be good here.   So far, we have found that our
> OSSs crash with the recommended work-around so that is a non-starter for
> us.   If we can run with just the read_cache_enable=0 and that is
> acceptable to avoid the corruptions bug, then that would be good to
> know.

The problem affects OSS-side caching of both writes and reads.  That said,
by disabling only the read cache you would reduce the chance of hitting
the problem significantly.  For writes there would still be a small
chance of data corruption making it to disk if a client was in the
middle of doing a write that fails (due to eviction, network error, etc.)
and then another client starts a partial-page write of the same data some
time after this failure.

This is a pretty unlikely scenario, since most clients aren't evicted
very often, write to separate files, or write to disjoint
parts of the same file.  Still, there is some small risk.

> At the moment we are not even sure we can run with just
> read_cache_enable=0.   We just know that we can't run with them both
> disabled for more than a few minutes with crashing in
> obd_filter_preprw().

Can you please post your stack traces into bug 20560 so that we can
resolve this problem ASAP?

Note that the patch to actually fix this problem is already in bug 20560,
but it requires rebuilding Lustre for the OST.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Johann Lombardi

2009-Sep-10 07:28 UTC

On Sep 10, 2009, at 8:05 AM, Andreas Dilger wrote:
>> At the moment we are not even sure we can run with just
>> read_cache_enable=0.   We just know that we can't run with them both
>> disabled for more than a few minutes with crashing in
>> obd_filter_preprw().
>
> Can you please post your stack traces into bug 20560 so that we can
> resolve this problem ASAP.

For the record, we tested this workaround many times on various
clusters and it worked just fine. I see that you have provided more data
in bug 20560; we are looking at it.

Cheers,
Johann

Johann Lombardi

2009-Sep-10 10:35 UTC

On Sep 10, 2009, at 9:28 AM, Johann Lombardi wrote:
> clusters and it worked just fine. I see that you have provided more data
> in bug 20560, we are looking at it.

We have attached a new patch to bug 20560 that should address your
problem, which may happen in rare cases with partial truncates.

Johann

Charles A. Taylor

2009-Sep-10 13:50 UTC

On Thu, 2009-09-10 at 09:28 +0200, Johann Lombardi wrote:
> > Can you please post your stack traces into bug 20560 so that we can
> > resolve this problem ASAP.
>
> For the record, we tested this workaround many times on various
> clusters and it worked just fine. I see that you have provided more data
> in bug 20560, we are looking at it.

Actually, we did not provide the data.  Another site did.  However,
their experience - including the stack trace - was identical to ours.
The only difference is our servers went down within minutes, not hours.

Thanks for taking a look and providing the patch.  I'm sorry we didn't
have the stack trace to post, but it just goes to the console and there
is no easy way to get the text into a file.  And yes, I know we should
fix that.  We're just not used to lustre crashing on us and have not
had to worry about capturing stack traces for some time.  :)

Charlie Taylor
UF HPC Center

Aaron Knister

2009-Sep-11 13:33 UTC

Is the read cache corruption actually causing on-disk corruption? Or just
in-memory corruption? I'm assuming the write cache corruption would end up
causing the file to become corrupt on disk, but if a node crashes during a
write then I'm personally not all that bothered by it.

On a side note, any advice about how to avoid buggy releases or how to
select a stable older release would be much appreciated. Also, is 1.8
considered "stable" and/or "production-ready", or should I be using the 1.6
series currently?

Thanks!


Oleg Drokin

2009-Sep-11 13:39 UTC

Hello!

On Sep 11, 2009, at 9:33 AM, Aaron Knister wrote:
> Is the read cache corruption actually causing on-disk corruption? Or
> just in-memory corruption? I'm assuming the write cache corruption
> would end up causing the file to become corrupt on disk, but if a
> node crashes during a write then I'm personally not all that
> bothered by it.

Well, it's in-memory corruption that could later lead to on-disk
corruption.
Consider an app that does a partial-page write, or that just reads and then
writes back data.
In both cases invalid data would be read from the cache and then
written back to the disk.

Bye,
     Oleg

Robin Humble

2009-Sep-15 06:44 UTC

On Thu, Sep 10, 2009 at 12:35:54PM +0200, Johann Lombardi wrote:
>We have attached a new patch to bug 20560 which should address your
>problem which may happen in rare cases with partial truncates.

as we are about to throw users onto the new system, can I ask for a
quick update pointing us to the current best guess at a workaround/fix
for the 1.8.1 read cache problems please?

to me it looks like
  https://bugzilla.lustre.org/show_bug.cgi?id=20560
is still evolving, but it looks like writethrough_cache=0 should now
work (and not crash the OSS) with attachment:
  https://bugzilla.lustre.org/attachment.cgi?id=25833

so if I patched our OSS's with just this one liner, then would that be
enough to run with until the situation has had some time to bed in?
or would we be better off with all 4 patches from 20560 applied (and
both read caches still off)?

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility

Johann Lombardi

2009-Sep-15 11:50 UTC

On Sep 15, 2009, at 8:44 AM, Robin Humble wrote:
> as we are about to throw users onto the new system, can I ask for a
> quick update pointing us to the current best guess at a workaround/fix
> for the 1.8.1 read cache problems please?
>
> to me it looks like
>  https://bugzilla.lustre.org/show_bug.cgi?id=20560
> is still evolving, but it looks like writethrough_cache=0 should now
> work (and not crash the OSS) with attachment:
>  https://bugzilla.lustre.org/attachment.cgi?id=25833
>
> so if I patched our OSS's with just this one liner, then would that be
> enough to run with until the situation has had some time to bed in?

Yes.

> or would we be better off with all 4 patches from 20560 applied (and
> both read cache's still off)?

In fact, you don't need to apply the 4 patches to fix the problem, but
only attachment 25896 (about to land for 1.8.1.1). Once this patch
is applied, it is safe to run with the OSS read cache turned on (that is
to say, with both writethrough_cache_enable and
read_cache_enable enabled). Since you have to recompile lustre in both
cases (we are doing our best to release 1.8.1.1 asap), I would
recommend applying both attachments 25833 and 25896.

Best regards,
Johann
