Amar Tumballi
2018-Feb-05 02:44 UTC
[Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first
Thanks for the report, Artem.

It looks like the issue is about cache warm-up. Specifically, I suspect rsync is doing a 'readdir(), stat(), file operations' loop, whereas when a find or ls is issued we get a 'readdirp()' request, which carries the stat information along with the entries and also makes sure the cache is up to date (at the md-cache layer).

Note that this is just an off-the-top-of-my-head hypothesis. We need to analyse and debug this more thoroughly for a proper explanation. Someone on my team will look at it soon.

Regards,
Amar

On Mon, Feb 5, 2018 at 7:25 AM, Vlad Kopylov <vladkopy at gmail.com> wrote:
> Are you mounting it to the local bricks?
>
> I'm struggling with the same performance issues. Try using this volume setting:
> http://lists.gluster.org/pipermail/gluster-users/2018-January/033397.html
> performance.stat-prefetch: on might be it.
>
> It seems that once it gets to the cache it is fast; the stat fetches, which
> seem to come from .gluster, are slow.
>
> On Sun, Feb 4, 2018 at 3:45 AM, Artem Russakovskii <archon810 at gmail.com> wrote:
> > An update, and a very interesting one!
> >
> > After I started stracing rsync, all I could see was lstat calls, quite slow
> > ones, over and over, which is expected.
> >
> > For example:
> > lstat("uploads/2016/10/nexus2cee_DSC05339_thumb-161x107.jpg",
> > {st_mode=S_IFREG|0664, st_size=4043, ...}) = 0
> >
> > I googled around and found
> > https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1, which is
> > seeing this exact issue with gluster, rsync and xfs.
> >
> > Here's the craziest finding so far. If, while rsync is running (or right
> > before), I run /bin/ls or find on the same gluster dirs, it immediately
> > speeds up rsync by a factor of 100 or maybe even 1000. It's absolutely
> > insane.
> >
> > I'm stracing the rsync run, and the slow lstat calls flood in at an
> > incredible speed as soon as ls or find run. Several hundred files per
> > minute (excruciatingly slow) become thousands or even tens of thousands
> > of files a second.
> >
> > What do you make of this?

--
Amar Tumballi (amarts)
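For anyone who wants to reproduce the effect without strace, here is a rough timing sketch in Python. It is not how rsync or Gluster work internally. It assumes you already know the file names from somewhere other than the Gluster mount (for example, from the rsync source tree), times `os.lstat()` on half of them, scans the directory with `os.scandir()` as a stand-in for `ls` or `find`, and then times the other half. Whether the scan actually results in a `readdirp()` request on the Gluster side depends on the client and kernel, and this sketch does not verify that. The function names are made up for illustration.

```python
import os
import time


def average_lstat_seconds(paths):
    """Return the average wall-clock seconds per os.lstat() call."""
    if not paths:
        return 0.0
    start = time.perf_counter()
    for path in paths:
        os.lstat(path)
    return (time.perf_counter() - start) / len(paths)


def scan_directory(directory):
    """List one directory, as a stand-in for running ls or find on it."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def compare_before_after_scan(directory, names):
    """Time lstat on one half of the names, scan, then time the other half.

    Two disjoint halves are used so that the first timing pass does not
    itself warm the cache for the files measured in the second pass.
    """
    paths = [os.path.join(directory, name) for name in names]
    half = len(paths) // 2
    before = average_lstat_seconds(paths[:half])
    entry_count = scan_directory(directory)
    after = average_lstat_seconds(paths[half:])
    return before, after, entry_count
```

Calling `compare_before_after_scan("/mnt/gluster/uploads/2016/10", names)` (the mount path here is only an example) returns the average lstat time before and after the scan, plus the number of directory entries seen. A large drop in the second number would be consistent with the cache warm-up explanation, but it does not prove which cache layer is responsible.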
Tom Fite
2018-Feb-05 16:18 UTC
[Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first
Hi all,

I have seen this issue as well, on Gluster 3.12.1 (3 bricks per box, 2 boxes, distributed-replicate). My testing shows the same thing: running a find on a directory dramatically increases lstat performance. To add another clue, the performance degrades again after issuing a call to reset the system's cache of dentries and inodes:

    # sync; echo 2 > /proc/sys/vm/drop_caches

I think this shows that it's the system cache that's actually doing the heavy lifting here. There are a couple of sysctl tunables that I've found help with this.

See here:

http://docs.gluster.org/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/

Contrary to what that doc says, I've found that setting vm.vfs_cache_pressure to a low value increases performance by allowing more dentries and inodes to be retained in the cache.

    # Set the swappiness to avoid swap when possible.
    vm.swappiness = 10

    # Set the cache pressure to prefer inode and dentry cache over file cache.
    # This is done to keep as many dentries and inodes in cache as possible,
    # which dramatically improves gluster small file performance.
    vm.vfs_cache_pressure = 25

For comparison, my config is:

    Volume Name: gv0
    Type: Tier
    Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
    Status: Started
    Snapshot Count: 13
    Number of Bricks: 8
    Transport-type: tcp
    Hot Tier :
    Hot Tier Type : Replicate
    Number of Bricks: 1 x 2 = 2
    Brick1: gluster2:/data/hot_tier/gv0
    Brick2: gluster1:/data/hot_tier/gv0
    Cold Tier:
    Cold Tier Type : Distributed-Replicate
    Number of Bricks: 3 x 2 = 6
    Brick3: gluster1:/data/brick1/gv0
    Brick4: gluster2:/data/brick1/gv0
    Brick5: gluster1:/data/brick2/gv0
    Brick6: gluster2:/data/brick2/gv0
    Brick7: gluster1:/data/brick3/gv0
    Brick8: gluster2:/data/brick3/gv0
    Options Reconfigured:
    performance.cache-max-file-size: 128MB
    cluster.readdir-optimize: on
    cluster.watermark-hi: 95
    features.ctr-sql-db-cachesize: 262144
    cluster.read-freq-threshold: 5
    cluster.write-freq-threshold: 2
    features.record-counters: on
    cluster.tier-promote-frequency: 15000
    cluster.tier-pause: off
    cluster.tier-compact: on
    cluster.tier-mode: cache
    features.ctr-enabled: on
    performance.cache-refresh-timeout: 60
    performance.stat-prefetch: on
    server.outstanding-rpc-limit: 2056
    cluster.lookup-optimize: on
    performance.client-io-threads: off
    nfs.disable: on
    transport.address-family: inet
    features.barrier: disable
    client.event-threads: 4
    server.event-threads: 4
    performance.cache-size: 1GB
    network.inode-lru-limit: 90000
    performance.md-cache-timeout: 600
    performance.cache-invalidation: on
    features.cache-invalidation-timeout: 600
    features.cache-invalidation: on
    performance.quick-read: on
    performance.io-cache: on
    performance.nfs.write-behind-window-size: 4MB
    performance.write-behind-window-size: 4MB
    performance.nfs.io-threads: off
    network.tcp-window-size: 1048576
    performance.rda-cache-limit: 64MB
    performance.flush-behind: on
    server.allow-insecure: on
    cluster.tier-demote-frequency: 18000
    cluster.tier-max-files: 1000000
    cluster.tier-max-promote-file-size: 10485760
    cluster.tier-max-mb: 64000
    features.ctr-sql-db-wal-autocheckpoint: 2500
    cluster.tier-hot-compact-frequency: 86400
    cluster.tier-cold-compact-frequency: 86400
    performance.readdir-ahead: off
    cluster.watermark-low: 50
    storage.build-pgfid: on
    performance.rda-request-size: 128KB
    performance.rda-low-wmark: 4KB
    cluster.min-free-disk: 5%
    auto-delete: enable
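Before changing the sysctl values Tom mentions, it can help to see what a box is currently using. The following is a minimal Python sketch, assuming a Linux host where the tunables are exposed as files under /proc/sys/vm. The helper name `read_vm_tunables` and its `base` parameter are made up for this example. It only reads values and changes nothing.

```python
import os


def read_vm_tunables(names=("swappiness", "vfs_cache_pressure"),
                     base="/proc/sys/vm"):
    """Return {"vm.<name>": int or None} for each requested tunable.

    A value of None means the file was missing, unreadable, or did not
    contain a plain integer.
    """
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as handle:
                values["vm." + name] = int(handle.read().strip())
        except (OSError, ValueError):
            values["vm." + name] = None
    return values


if __name__ == "__main__":
    for key, value in read_vm_tunables().items():
        print(f"{key} = {value}")
```

Comparing the output against the values in the message above (vm.swappiness = 10, vm.vfs_cache_pressure = 25) shows whether a machine already runs with a lower cache pressure than its distribution default.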
Artem Russakovskii
2018-Feb-27 13:52 UTC
[Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first
Any updates on this one?