Artem Russakovskii
2020-Apr-30 01:24 UTC
[Gluster-users] Extremely slow file listing in folders with many files
Hi all,

We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the 10TB one especially is extremely slow to do certain things with (and has been since gluster 3.x when we started). We're currently on 5.13.

The number of files isn't even what I'd consider that great - under 100k per dir.

Here are some numbers to look at:

On gluster volume in a dir of 45k files:
The first time

time find | wc -l
45423
real    8m44.819s
user    0m0.459s
sys     0m0.998s

And again

time find | wc -l
45423
real    0m34.677s
user    0m0.291s
sys     0m0.754s

If I run the same operation on the xfs block device itself:
The first time

time find | wc -l
45423
real    0m13.514s
user    0m0.144s
sys     0m0.501s

And again

time find | wc -l
45423
real    0m0.197s
user    0m0.088s
sys     0m0.106s

I'd expect a performance difference here, but just as it was several years ago when we started with gluster, it's still huge, and simple file listings are incredibly slow.

At the time, the team was looking to do some optimizations, but I'm not sure this has happened.

What can we do to try to improve performance?

Thank you.

Some setup values follow.

xfs_info /mnt/SNIP_block1
meta-data=/dev/sdc               isize=512    agcount=103, agsize=26214400 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=2684354560, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=51200, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Volume Name: SNIP_data1
Type: Replicate
Volume ID: SNIP
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
Brick2: forge:/mnt/SNIP_block1/SNIP_data1
Brick3: hive:/mnt/SNIP_block1/SNIP_data1
Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
Options Reconfigured:
cluster.quorum-count: 1
cluster.quorum-type: fixed
network.ping-timeout: 5
network.remote-dio: enable
performance.rda-cache-limit: 256MB
performance.readdir-ahead: on
performance.parallel-readdir: on
network.inode-lru-limit: 500000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.readdir-optimize: on
performance.io-thread-count: 32
server.event-threads: 4
client.event-threads: 4
performance.read-ahead: off
cluster.lookup-optimize: on
performance.cache-size: 1GB
cluster.self-heal-daemon: enable
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
cluster.granular-entry-heal: enable
cluster.data-self-heal-algorithm: full

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>
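One way to narrow down where those minutes go is GlusterFS's built-in profiler, which reports per-brick call counts and latencies for each FOP. A minimal sketch, assuming the volume name SNIP_data1 from the output above and that the slow listing is re-run from a client while profiling is enabled (profiling adds a little overhead, so it is best switched off afterwards):

gluster volume profile SNIP_data1 start    # begin collecting per-brick FOP stats
# ... re-run the slow listing from a client, e.g. time find | wc -l ...
gluster volume profile SNIP_data1 info     # dump the counters; look for high average
                                           # latency on LOOKUP, READDIRP and OPENDIR
gluster volume profile SNIP_data1 stop     # stop collecting when finished

If READDIRP dominates, the readdir-ahead/parallel-readdir options already in the list above are the usual knobs; if LOOKUP dominates, the cost is mostly the per-entry round trips for the 45k files.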
Strahil Nikolov
2020-Apr-30 14:44 UTC
[Gluster-users] Extremely slow file listing in folders with many files
On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii <archon810 at gmail.com> wrote:
> We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the 10TB one especially is extremely slow to do certain things with

Hi Artem,

Have you checked the same at the brick level? How big is the difference?

Best Regards,
Strahil Nikolov
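A rough way to make that brick-versus-mount comparison fair is to clear the client's caches before each run so both start cold. A sketch, where the brick path comes from the volume info above, while /mnt/SNIP_data1 (the client FUSE mount point) and somedir are assumed names not given in the thread:

sync
echo 3 > /proc/sys/vm/drop_caches                         # drop page/dentry/inode caches (run as root)
time find /mnt/SNIP_block1/SNIP_data1/somedir | wc -l     # listing directly on one brick's xfs

sync
echo 3 > /proc/sys/vm/drop_caches
time find /mnt/SNIP_data1/somedir | wc -l                 # same directory through the FUSE mount

The brick run only measures local xfs; the FUSE run still pays network round trips per entry across the replica set, so some gap is expected even with everything tuned, but the profiler output should show whether the gap is dominated by lookups or by the readdir calls themselves.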