Hi,

I'm seeing what can only be described as dismal striped write
performance from lustre 1.6.3 clients :-/
1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (from cvs a couple
of days ago) are also terrible.

the tables below show that the OS (centos4.5/5), the fabric (gigE/IB),
and the lustre version on the servers don't matter - the problem is
with the 1.6.3 and 1.6.4rc3 client kernels and striped writes (although
un-striped writes are a tad slower too).

with 1M lustre stripes:

  client     client                               dd write speed (MB/s)
  OS         kernel                                 a)   b)   c)   d)
 1.6.2:
  centos4.5  2.6.9-55.0.2.EL_lustre.1.6.2smp       202  270  118  117
  centos5    2.6.18-8.1.8.el5_lustre.1.6.2rjh      166  190  117  119
 1.6.3+:
  centos4.5  2.6.9-55.0.9.EL_lustre.1.6.3smp        32    9   30    9
  centos5    2.6.18-53.el5-lustre1.6.4rc3rjh        36   10   27   10
                                                       ^^^^      ^^^^
yes, that is really 9MB/s. sigh

with no lustre stripes:

  client     client                               dd write speed (MB/s)
  OS         kernel                                 a)   c)
 1.6.2:
  centos4.5  2.6.9-55.0.2.EL_lustre.1.6.2smp       102   98
  centos5    2.6.18-8.1.8.el5_lustre.1.6.2rjh       84   77
 1.6.3+:
  centos4.5  2.6.9-55.0.9.EL_lustre.1.6.3smp        94   95
  centos5    2.6.18-53.el5-lustre1.6.4rc3rjh        73   67

 a) servers centos5,   2.6.18-53.el5-lustre1.6.4rc3rjh,   md raid5, fabric IB
 b) servers centos4.5, 2.6.9-55.0.9.EL_lustre.1.6.3smp,   "",       fabric IB
 c) servers centos5,   2.6.18-8.1.14.el5_lustre.1.6.3smp, "",       fabric gigE
 d) servers centos4.5, 2.6.9-55.0.9.EL_lustre.1.6.3smp,   "",       fabric gigE

all runs have the same setup - two OSS's, each with a 16 FC disk md
raid5 OST. clients have 512m ram, servers have 8g, all x86_64. the
test is

  dd if=/dev/zero of=/mnt/testfs/blah bs=1M count=5000

and each test was run >=2 times. there are no errors from lustre or
the kernels, and I can't see anything relevant in bugzilla.

is anyone else seeing this? it seems weird that 1.6.3 has been out
there for a while and nobody else has reported it, but I can't think
of any more testing variants to try...

anyway, some more simple setup info:

 % lfs getstripe /mnt/testfs/
 OBDS:
 0: testfs-OST0000_UUID ACTIVE
 1: testfs-OST0001_UUID ACTIVE
 /mnt/testfs/
 default stripe_count: -1 stripe_size: 1048576 stripe_offset: -1
 /mnt/testfs/blah
        obdidx           objid          objid            group
             1               3            0x3                0
             0               2            0x2                0

 % lfs df
 UUID                  1K-blocks      Used  Available  Use%  Mounted on
 testfs-MDT0000_UUID     1534832    306680    1228152   19%  /mnt/testfs[MDT:0]
 testfs-OST0000_UUID    15481840   3803284   11678556   24%  /mnt/testfs[OST:0]
 testfs-OST0001_UUID    15481840   3803284   11678556   24%  /mnt/testfs[OST:1]
 filesystem summary:    30963680   7606568   23357112   24%  /mnt/testfs

cheers,
robin

ps. the 'rjh' series kernels are needed 'cos the lustre rhel5 kernels
don't have ko2iblnd support in them.
Johann Lombardi
2007-Nov-26 13:53 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
On Mon, Nov 26, 2007 at 08:39:58AM -0500, Robin Humble wrote:
> I'm seeing what can only be described as dismal striped write
> performance from lustre 1.6.3 clients :-/
> 1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (from cvs a couple
> of days ago) are also terrible.
> [...]
> 1.6.3+:
>  centos4.5  2.6.9-55.0.9.EL_lustre.1.6.3smp        32    9   30    9
>  centos5    2.6.18-53.el5-lustre1.6.4rc3rjh        36   10   27   10
>                                                        ^^^^      ^^^^
> yes, that is really 9MB/s. sigh

Could you please try to disable checksums? On the client side:

  for file in /proc/fs/lustre/osc/*/checksums; do echo 0 > $file; done

Johann
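A quick way to confirm the setting actually took on every OSC, as a
sketch (the per-OSC directory names under /proc depend on the
filesystem and OST names, hence the glob):

  # print each OSC's checksums flag; all should read 0 after the loop above
  for f in /proc/fs/lustre/osc/*/checksums; do
      echo "$f = $(cat $f)"
  done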
On Mon, Nov 26, 2007 at 02:53:25PM +0100, Johann Lombardi wrote:
> Could you please try to disable checksums? On the client side:
>
>   for file in /proc/fs/lustre/osc/*/checksums; do echo 0 > $file; done

done. no change.

cheers,
robin
Andrei Maslennikov
2007-Nov-26 15:58 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
On Nov 26, 2007 3:32 PM, Robin Humble <rjh+lustre at cita.utoronto.ca> wrote:
> >> I'm seeing what can only be described as dismal striped write
> >> performance from lustre 1.6.3 clients :-/
> >> 1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (from cvs a couple
> >> of days ago) are also terrible.

I have 3 OSTs, each capable of delivering 300+ MB/sec for large
streaming writes with a 1M blocksize. From one client writing to a
single OST I can see almost all of this bandwidth over Infiniband. If I
run three processes in parallel on that same client, each writing to a
separate OST, I get 520 MB/sec aggregate (3 streams at approx 170+
MB/sec each).

If I stripe over these three OSTs from this client, the performance of
a single stream drops to 60+ MB/sec. Changing the stripe size to a
smaller one (1/3 MB) makes things worse, and writing with larger block
sizes (9M, 30M) does not improve things. Increasing the stripe size to
25 MB gets close to the speed of a single OST, as one would expect
(blocks are round-robined over all three OSTs), but never more.
Zeroing checksums on the client does not help.

Will now downgrade the client to 1.6.2 to see if that helps.

Andrei.
Andrei Maslennikov
2007-Nov-26 17:16 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
Confirmed: 1.6.3 striped write performance sux. With 1.6.2, I see this:

 [root at srvandrei ~]$ lfs setstripe /lustre/162 0 0 3
 [root at srvandrei ~]$ lmdd.linux of=/lustre/162 bs=1024k time=180 fsync=1
 157705.8304 MB in 180.0225 secs, 876.0341 MB/sec

i.e. 1.6.2 nicely joined the aggregate bandwidth of three OSTs of
300 MB/sec each into almost 900 MB/sec.

Andrei.

On Nov 26, 2007 4:58 PM, Andrei Maslennikov <andrei.maslennikov at gmail.com> wrote:
> I have 3 OSTs, each capable of delivering 300+ MB/sec for large
> streaming writes with a 1M blocksize.
> [...]
> Will now downgrade the client to 1.6.2 to see if that helps.
On Nov 26, 2007 18:16 +0100, Andrei Maslennikov wrote:
> Confirmed: 1.6.3 striped write performance sux.
>
> With 1.6.2, I see this:
>
> [root at srvandrei ~]$ lfs setstripe /lustre/162 0 0 3
> [root at srvandrei ~]$ lmdd.linux of=/lustre/162 bs=1024k time=180 fsync=1
> 157705.8304 MB in 180.0225 secs, 876.0341 MB/sec

Can you verify that you disabled data checksumming:

  echo 0 > /proc/fs/lustre/llite/*/checksum_pages

Note that there are 2 kinds of checksumming that Lustre does. The first
is checksumming of data in client memory, and the second is checksumming
of data over the network. Setting $LPROC/llite/*/checksum_pages turns
both the in-memory and wire checksums on/off. Setting
$LPROC/osc/*/checksums turns only the network checksums on/off.

If checksums are disabled, can you please report whether a process on
the client is consuming all of the CPU, or possibly all of a single
CPU, on 1.6.3 and on 1.6.2?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
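To tie the two knobs and the CPU question together, here is a minimal
sketch of running the whole check end to end. It assumes the 1.6.x
/proc paths quoted above, and top's batch mode is just one convenient
way to catch a spinning kernel thread:

  #!/bin/sh
  # turn off both the in-memory+wire checksums and the wire-only ones
  for f in /proc/fs/lustre/llite/*/checksum_pages \
           /proc/fs/lustre/osc/*/checksums; do
      echo 0 > $f
  done

  # rerun the test and sample per-thread CPU usage a few times
  dd if=/dev/zero of=/mnt/testfs/blah bs=1M count=5000 &
  top -b -d 2 -n 10 | egrep 'Cpu|ldlm_poold|ptlrpcd|kswapd| dd'
  wait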
Andrei Maslennikov
2007-Nov-26 19:14 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
Hello Andreas,

I am currently reconfiguring the setup so cannot do these checks
immediately. Will come back on it, hopefully tomorrow.

Greetings - Andrei.

On Nov 26, 2007 7:59 PM, Andreas Dilger <adilger at sun.com> wrote:
> Can you verify that you disabled data checksumming:
>
>   echo 0 > /proc/fs/lustre/llite/*/checksum_pages
> [...]
> If checksums are disabled, can you please report whether a process on
> the client is consuming all of the CPU, or possibly all of a single
> CPU, on 1.6.3 and on 1.6.2?
On Mon, Nov 26, 2007 at 11:59:32AM -0700, Andreas Dilger wrote:
> Can you verify that you disabled data checksumming:
>   echo 0 > /proc/fs/lustre/llite/*/checksum_pages

those checksums were off in my runs (they were off by default?), so I
don't think any of the checksums are making a difference.

> Note that there are 2 kinds of checksumming that Lustre does. [...]

good to know, thanks. are all of those new in 1.6.3?

> If checksums are disabled, can you please report whether a process on
> the client is consuming all of the CPU, or possibly all of a single
> CPU, on 1.6.3 and on 1.6.2?

with checksums disabled, a 1.6.3+ client looks like:

   PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
  7437 root    15   0     0    0    0 R   57  0.0  7:31.77 ldlm_poold
 18547 rjh900  15   0  5820  504  412 S    3  0.1  0:34.52 dd

which is interesting. ldlm_poold is using an awful lot of cpu.

a 'top' on a 1.6.2 client shows only dd using significant cpu (plus the
usual small percentages for ptlrpcd, kswapd0, pdflush, kiblnd_sd_*).

cheers,
robin
Andrei Maslennikov
2007-Nov-27 12:59 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
Andreas, here are some numbers obtained against a file striped over 3
OSTs (there were 8 cores; only 3 of them had any visible load, so I
quote the CPU usage only for those cores):

1) 1.6.3 client, checksums enabled: 35 MB/sec
   ldlm_poold: 85-100% the whole time
   ptlrpcd: 15-23%
   dd: from 85% down gradually to 4-6%

2) same, after zeroing /proc/fs/lustre/llite/*/checksum_pages: 65 MB/sec
   loads are very much the same as in the first case

3) 1.6.2 client, checksums enabled: 790 MB/sec
   dd: 85-95%
   kswapd: 35%
   ptlrpcd: 15-20%

Andrei.

On Nov 26, 2007 7:59 PM, Andreas Dilger <adilger at sun.com> wrote:
> If checksums are disabled, can you please report whether a process on
> the client is consuming all of the CPU, or possibly all of a single
> CPU, on 1.6.3 and on 1.6.2?
Andrei Maslennikov wrote:
> 1) 1.6.3 client, checksums enabled: 35 MB/sec
>    ldlm_poold: 85-100% the whole time
>    ptlrpcd: 15-23%
>    dd: from 85% down gradually to 4-6%

Any chance you can run oprofile for the 1.6.3 case? It'd (hopefully)
show where ldlm_poold is spinning.

Nic
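For anyone who hasn't used it, an oprofile session of that era looks
roughly like the sketch below. The vmlinux path in particular is an
assumption; it depends on where the uncompressed debug image for the
lustre kernel was installed:

  # point oprofile at the uncompressed kernel image (path is a guess)
  opcontrol --vmlinux=/usr/lib/debug/lib/modules/`uname -r`/vmlinux
  opcontrol --start

  # reproduce the slow striped write while sampling
  dd if=/dev/zero of=/mnt/testfs/blah bs=1M count=5000

  opcontrol --stop
  opreport --symbols | head -30   # top kernel symbols by sample count
  opcontrol --shutdown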
Andrei Maslennikov
2007-Nov-27 15:14 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
On Nov 27, 2007 3:47 PM, Nicholas Henke <nic at cray.com> wrote:
> Any chance you can run oprofile for the 1.6.3 case? It'd (hopefully)
> show where ldlm_poold is spinning.

Not at the moment, my lab is currently dismantled... Maybe I will be
able to do it before Friday...

Andrei.
chas williams - CONTRACTOR
2007-Nov-28 03:01 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
reconfigure your client with --disable-lru-resize. this appears to be
a new feature in 1.6.3. this fixed striped performance for me.

In message <515158c30711270459k5c80c142k1e179c71bfecbdab at mail.gmail.com>,
"Andrei Maslennikov" writes:
> Andreas, here are some numbers obtained against a file striped over 3
> OSTs (there were 8 cores; only 3 of them had any visible load, so I
> quote the CPU usage only for those cores):
>
> 1) 1.6.3 client, checksums enabled: 35 MB/sec
>    ldlm_poold: 85-100% the whole time
>    ptlrpcd: 15-23%
>    dd: from 85% down gradually to 4-6%
> [...]
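For anyone wanting to try the same workaround, a sketch of the rebuild;
the kernel source path is whatever your client kernel was built from.
The lru_size part is an assumption: on 1.6 builds that expose that
/proc file, writing a fixed value into it is supposed to pin the lock
LRU and turn dynamic resizing off without a rebuild:

  # rebuild the client with lru resize compiled out
  cd lustre-1.6.3
  ./configure --with-linux=/usr/src/kernels/`uname -r` --disable-lru-resize
  make && make rpms

  # assumption: a fixed per-namespace lru_size disables dynamic resizing
  for ns in /proc/fs/lustre/ldlm/namespaces/*/lru_size; do
      echo 100 > $ns
  done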
Johann Lombardi
2007-Nov-28 16:32 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
On Tue, Nov 27, 2007 at 10:01:05PM -0500, chas williams - CONTRACTOR wrote:
> reconfigure your client with --disable-lru-resize. this appears to be
> a new feature in 1.6.3. this fixed striped performance for me.

FYI, I've filed a new bugzilla ticket about this problem (see bug #14353).

Johann
Johann Lombardi wrote:
> FYI, I've filed a new bugzilla ticket about this problem (see bug #14353).

hi all!

Will fix it tomorrow, but the fix will go into the current 1.6 branch,
so it will only be available in the next release. Thanks.

--
umka
chas williams - CONTRACTOR
2007-Nov-28 17:28 UTC
[Lustre-discuss] bad 1.6.3 striped write performance
In message <474D9D00.1080506 at sun.com>, Yuriy Umanets writes:
> Will fix it tomorrow, but the fix will go into the current 1.6 branch,
> so it will only be available in the next release.

if you have a fix, i can apply it as a point patch to my local copy.
it's not a huge deal.
chas williams - CONTRACTOR wrote:
> if you have a fix, i can apply it as a point patch to my local copy.
> it's not a huge deal.

hi Williams,

It turned out to be a more complex issue, one which was already
observed earlier in bug 13766. Some more info on it is also in bug
14353. It is related to aggressive memory-pressure event handling in
the server-side ldlm pools code.

I have a patch (quite big, 55K) for the 1.6.4 version. It fixes this as
well as other related things, but it is completely untested on serious
HW and I would not like to make you deal with it unless you ask for it.
If you really want to give it a try, and will not use it on live
storage with sensitive data, update your local copy to 1.6.4 and I will
send you the patch.

But if, as I suspect, you only want to make lustre behave like 1.6.2
as far as IO performance goes, you probably do not need the patch and
can simply disable the feature with the configure key
--disable-lru-resize.

Thanks.

--
umka