We are seeing really bad performance from a Java app, and it boils down to
poor performance of 1-byte reads from a Lustre file system.  After a
detailed strace of the application running, I have generated the following
code snippet, which demonstrates the problem:

#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv) {
        int fd = open("5GB_file", O_RDONLY, 0666);
        int i, j;
        char b[1];

        for (i = 0; i < 10000; i++) {
                lseek(fd, i * 500, SEEK_SET);
                for (j = 0; j < 100; j++) {
                        read(fd, b, 1);
                }
        }
        close(fd);
}

5GB_file is just a "dd if=/dev/zero of=5GB_file bs=1024k count=5000".

Anyway, this code runs in <1s on local disk, ~5s on NFS and >30s on
Lustre... I was disappointed to see Lustre slower than NFS.  I was hoping
that Lustre's read-ahead would have been triggered by this code, but it
doesn't appear to be.  Is there any way I can tune Lustre to work better
with this code?  (I know, change the code, but it isn't that easy - this
is a C reconstruction of an strace of a Java app, and changing the
original Java app isn't so easy.)

Thanks
Stu.

--
Dr Stuart Midgley
sdm900@gmail.com
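The pattern above issues roughly a million syscalls (an lseek plus 100
one-byte reads per offset) to fetch about 1 MB of data, so per-read
overhead dominates on a network filesystem.  As a point of comparison,
here is a minimal sketch of the kind of change hinted at by "I know,
change the code": the same offsets, but with each group of 100 single-byte
reads collapsed into one pread().  The file name and offsets come from the
test program above; everything else is an assumption, not the original
application's code.

#define _XOPEN_SOURCE 500
/* Sketch only: batch each group of 100 single-byte reads at offset i*500
 * into a single pread(), so the kernel sees ~10,000 syscalls instead of
 * ~1,000,000. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int fd = open("5GB_file", O_RDONLY);
        char buf[100];
        int i;

        if (fd < 0) {
                perror("open");
                return 1;
        }
        for (i = 0; i < 10000; i++) {
                /* one syscall per offset instead of lseek() + 100 x read() */
                if (pread(fd, buf, sizeof(buf), (off_t)i * 500) < 0) {
                        perror("pread");
                        break;
                }
        }
        close(fd);
        return 0;
}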
Additionally, I zeroed /proc/fs/lustre/llite/fs0/read_ahead_stats, re-ran
the test and checked the read-ahead stats afterward... and they were all
zero.  So I assume this means that Lustre isn't doing any read-ahead in
this case.

On 17/07/2007, at 9:49 AM, Stuart Midgley wrote:

> We are seeing really bad performance from a Java app, and it boils down
> to poor performance of 1-byte reads from a Lustre file system.
> [...]

--
Dr Stuart Midgley
sdm900@gmail.com
Sorry to be obnoxious, but why is an app doing 1-byte reads the
filesystem's problem?  This seems like something that really should be the
responsibility of the application, not the FS.  Why isn't the Java runtime
doing some read-ahead of its own?

On Tue, Jul 17, 2007 at 07:26:26PM +0800, Stuart Midgley wrote:
> Additionally, I zeroed /proc/fs/lustre/llite/fs0/read_ahead_stats, re-ran
> the test and checked the read-ahead stats afterward... and they were all
> zero.  So I assume this means that Lustre isn't doing any read-ahead in
> this case.
> [...]

--
Troy Benjegerdes                'da hozer'                hozer@hozed.org

Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
software stuff and not get a real job.  Charles Schulz had the best answer:

"Why do musicians compose symphonies and poets write poems?  They do it
because life wouldn't have any meaning for them if they didn't.  That's
why I draw cartoons.  It's my life."  -- Charles Schulz
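The read-ahead Troy is asking about would normally come from buffering in
the application or runtime; on the Java side this is roughly what
java.io.BufferedInputStream does for sequential reads, though a seeking
workload like this one may need its own buffering layer.  As an
illustration, here is a minimal C sketch of that idea applied to the test
program from the start of the thread: the 1-byte reads are served from a
small userspace buffer that is refilled with one pread() per miss.  The
buffer size and helper names are arbitrary assumptions, not anything from
the original app.

#define _XOPEN_SOURCE 500
/* Sketch: serve the 1-byte reads from a small userspace buffer, refilled
 * with one pread() on a miss, instead of issuing one syscall per byte.
 * This approximates what a buffered reader in the application/runtime
 * would do; BUF_SIZE is an arbitrary assumption. */
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BUF_SIZE 4096

struct buf_reader {
        int fd;
        char buf[BUF_SIZE];
        off_t start;    /* file offset of buf[0] */
        ssize_t len;    /* valid bytes in buf, or 0 if empty */
};

/* Return one byte at 'off', refilling the buffer only on a miss. */
static int buffered_read_byte(struct buf_reader *r, off_t off, char *out)
{
        if (r->len <= 0 || off < r->start || off >= r->start + r->len) {
                r->len = pread(r->fd, r->buf, BUF_SIZE, off);
                if (r->len <= 0)
                        return -1;
                r->start = off;
        }
        *out = r->buf[off - r->start];
        return 0;
}

int main(void)
{
        struct buf_reader r = { .fd = open("5GB_file", O_RDONLY), .len = 0 };
        char b;
        int i, j;

        if (r.fd < 0) {
                perror("open");
                return 1;
        }
        /* same access pattern as the original test: 10000 offsets,
         * 100 bytes read one at a time from each */
        for (i = 0; i < 10000; i++)
                for (j = 0; j < 100; j++)
                        buffered_read_byte(&r, (off_t)i * 500 + j, &b);
        close(r.fd);
        return 0;
}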
On Jul 17, 2007 09:49 +0800, Stuart Midgley wrote:
> We are seeing really bad performance from a Java app, and it boils down
> to poor performance of 1-byte reads from a Lustre file system.
> [...]
> Is there any way I can tune Lustre to work better with this code?

Two likely reasons:
- Lustre has DLM overhead for each read() syscall that local filesystems
  and NFS do not have
- the default debug level for Lustre is punishing for small reads.  Try
  setting "sysctl -w lnet.debug=0" to test this.  The default debug level
  will be changing in Lustre 1.6.1.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
There is a reason you're a superstar.

# sysctl -w lnet.debug=0
lnet.debug = 0

> time ~/tmp/a.out
0.052u 4.937s 0:04.98 100.0%    0+0k 0+0io 0pf+0w

So MUCH better.  While still not as good as NFS, it is definitely
acceptable.

Thanks
Stu.

> Two likely reasons:
> - Lustre has DLM overhead for each read() syscall that local filesystems
>   and NFS do not have
> - the default debug level for Lustre is punishing for small reads.
> [...]

--
Dr Stuart Midgley
sdm900@gmail.com
Stuart Midgley wrote:
> There is a reason you're a superstar.
>
> # sysctl -w lnet.debug=0
> lnet.debug = 0
>
> > time ~/tmp/a.out
> 0.052u 4.937s 0:04.98 100.0%    0+0k 0+0io 0pf+0w
>
> So MUCH better.  While still not as good as NFS, it is definitely
> acceptable.

Turn off the debugging on the server too, if you have access.

You also might want to try increasing the number of RPCs in flight on the
client:

  /proc/fs/lustre/osc/*/max_rpcs_in_flight
  /proc/fs/lustre/mdc/*/max_rpcs_in_flight

and/or increase your readahead limits:

  /proc/fs/lustre/llite/lustre-c6cfd238/max_read_ahead_whole_mb
  /proc/fs/lustre/llite/lustre-c6cfd238/max_read_ahead_mb
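For completeness: these tunables are ordinary files under /proc, so they
can be inspected and changed with cat/echo or programmatically.  Below is
a minimal sketch, not from the thread, that reads and then raises
max_read_ahead_mb.  The "lustre-c6cfd238" path component is
instance-specific, the 64 MB value is an arbitrary assumption, and writing
requires root.

/* Sketch: read the current max_read_ahead_mb and set a new value.
 * Equivalent to a cat/echo pair on the proc file; path and value are
 * assumptions for illustration only. */
#include <stdio.h>

int main(void)
{
        const char *path =
                "/proc/fs/lustre/llite/lustre-c6cfd238/max_read_ahead_mb";
        FILE *f = fopen(path, "r");
        unsigned int mb = 0;

        if (!f || fscanf(f, "%u", &mb) != 1) {
                perror("read max_read_ahead_mb");
                return 1;
        }
        fclose(f);
        printf("current max_read_ahead_mb = %u\n", mb);

        f = fopen(path, "w");
        if (!f || fprintf(f, "64\n") < 0) {     /* arbitrary new value */
                perror("write max_read_ahead_mb");
                return 1;
        }
        fclose(f);
        return 0;
}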
Morning.

I have turned off debugging for our entire environment.  Increasing the
maximum number of RPCs in flight didn't help performance at all for this
issue, nor did increasing the read-ahead.  But thanks for the suggestions.

Stu.

> Turn off the debugging on the server too, if you have access.
>
> You also might want to try increasing the number of RPCs in flight on
> the client:
>   /proc/fs/lustre/osc/*/max_rpcs_in_flight
>   /proc/fs/lustre/mdc/*/max_rpcs_in_flight
> and/or increase your readahead limits:
>   /proc/fs/lustre/llite/lustre-c6cfd238/max_read_ahead_whole_mb
>   /proc/fs/lustre/llite/lustre-c6cfd238/max_read_ahead_mb

--
Dr Stuart Midgley
sdm900@gmail.com