Artem Russakovskii
2020-Apr-30 01:24 UTC
[Gluster-users] Extremely slow file listing in folders with many files
Hi all,

We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the 10TB one especially is extremely slow to do certain things with (and has been since gluster 3.x when we started). We're currently on 5.13.

The number of files isn't even what I'd consider that great - under 100k per dir.

Here are some numbers to look at:

On gluster volume in a dir of 45k files:
The first time

time find | wc -l
45423
real    8m44.819s
user    0m0.459s
sys     0m0.998s

And again

time find | wc -l
45423
real    0m34.677s
user    0m0.291s
sys     0m0.754s

If I run the same operation on the xfs block device itself:
The first time

time find | wc -l
45423
real    0m13.514s
user    0m0.144s
sys     0m0.501s

And again

time find | wc -l
45423
real    0m0.197s
user    0m0.088s
sys     0m0.106s

I'd expect a performance difference here, but just as it was several years ago when we started with gluster, it's still huge, and simple file listings are incredibly slow.

At the time, the team was looking to do some optimizations, but I'm not sure this has happened.

What can we do to try to improve performance?

Thank you.

Some setup values follow.

xfs_info /mnt/SNIP_block1
meta-data=/dev/sdc               isize=512    agcount=103, agsize=26214400 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=2684354560, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=51200, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Volume Name: SNIP_data1
Type: Replicate
Volume ID: SNIP
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
Brick2: forge:/mnt/SNIP_block1/SNIP_data1
Brick3: hive:/mnt/SNIP_block1/SNIP_data1
Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
Options Reconfigured:
cluster.quorum-count: 1
cluster.quorum-type: fixed
network.ping-timeout: 5
network.remote-dio: enable
performance.rda-cache-limit: 256MB
performance.readdir-ahead: on
performance.parallel-readdir: on
network.inode-lru-limit: 500000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.readdir-optimize: on
performance.io-thread-count: 32
server.event-threads: 4
client.event-threads: 4
performance.read-ahead: off
cluster.lookup-optimize: on
performance.cache-size: 1GB
cluster.self-heal-daemon: enable
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
cluster.granular-entry-heal: enable
cluster.data-self-heal-algorithm: full

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>
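One way to narrow down where those minutes go is GlusterFS's built-in profiler, which reports per-brick call counts and latencies for each FOP. A minimal sketch, assuming the volume name SNIP_data1 from the output above and that the slow listing is re-run from a client while profiling is enabled (profiling adds a little overhead, so it is best switched off afterwards):

gluster volume profile SNIP_data1 start    # begin collecting per-brick FOP stats
# ... re-run the slow listing from a client, e.g. time find | wc -l ...
gluster volume profile SNIP_data1 info     # dump the counters; look for high average
                                           # latency on LOOKUP, READDIRP and OPENDIR
gluster volume profile SNIP_data1 stop     # stop collecting when finished

If READDIRP dominates, the readdir-ahead/parallel-readdir options already in the list above are the usual knobs; if LOOKUP dominates, the cost is mostly the per-entry round trips for the 45k files.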
Strahil Nikolov
2020-Apr-30 14:44 UTC
[Gluster-users] Extremely slow file listing in folders with many files
On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii <archon810 at gmail.com> wrote:
> We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the 10TB one especially is extremely slow to do certain things with

Hi Artem,

Have you checked the same at the brick level? How big is the difference?

Best Regards,
Strahil Nikolov
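A rough way to make that brick-versus-mount comparison fair is to clear the client's caches before each run so both start cold. A sketch, where the brick path comes from the volume info above, while /mnt/SNIP_data1 (the client FUSE mount point) and somedir are assumed names not given in the thread:

sync
echo 3 > /proc/sys/vm/drop_caches                         # drop page/dentry/inode caches (run as root)
time find /mnt/SNIP_block1/SNIP_data1/somedir | wc -l     # listing directly on one brick's xfs

sync
echo 3 > /proc/sys/vm/drop_caches
time find /mnt/SNIP_data1/somedir | wc -l                 # same directory through the FUSE mount

The brick run only measures local xfs; the FUSE run still pays network round trips per entry across the replica set, so some gap is expected even with everything tuned, but the profiler output should show whether the gap is dominated by lookups or by the readdir calls themselves.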