Vlad Kopylov
2018-Apr-10 05:38 UTC
[Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs
You definitely need to add mount options to /etc/fstab; use the ones from here:
http://lists.gluster.org/pipermail/gluster-users/2018-April/033811.html

I went with local mounts to achieve performance as well.

Also, the 3.12 or 3.10 branches would be preferable for production.

On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii <archon810 at gmail.com> wrote:

> Hi again,
>
> I'd like to expand on the performance issues and plead for help. Here's
> one case which shows these odd hiccups: https://i.imgur.com/CXBPjTK.gifv.
>
> In this GIF where I switch back and forth between copy operations on 2
> servers, I'm copying a 10GB dir full of .apk and image files.
>
> On server "hive" I'm copying straight from the main disk to an attached
> volume block (xfs). As you can see, the transfers are relatively speedy and
> don't hiccup.
> On server "citadel" I'm copying the same set of data to a 4-replicate
> gluster which uses block storage as a brick. As you can see, performance is
> much worse, and there are frequent pauses for many seconds where nothing
> seems to be happening - just freezes.
>
> All 4 servers have the same specs, and all of them have performance issues
> with gluster and no such issues when raw xfs block storage is used.
>
> hive has long finished copying the data, while citadel is barely chugging
> along and is expected to take probably half an hour to an hour. I have over
> 1TB of data to migrate, at which point if we went live, I'm not even sure
> gluster would be able to keep up instead of bringing the machines and
> services down.
>
> Here's the cluster config, though it didn't seem to make any difference
> performance-wise before I applied the customizations vs after.
>
> Volume Name: apkmirror_data1
> Type: Replicate
> Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Bricks:
> Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
> Brick2: forge:/mnt/forge_block1/apkmirror_data1
> Brick3: hive:/mnt/hive_block1/apkmirror_data1
> Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
> Options Reconfigured:
> cluster.quorum-count: 1
> cluster.quorum-type: fixed
> network.ping-timeout: 5
> network.remote-dio: enable
> performance.rda-cache-limit: 256MB
> performance.readdir-ahead: on
> performance.parallel-readdir: on
> network.inode-lru-limit: 500000
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> cluster.readdir-optimize: on
> performance.io-thread-count: 32
> server.event-threads: 4
> client.event-threads: 4
> performance.read-ahead: off
> cluster.lookup-optimize: on
> performance.cache-size: 1GB
> cluster.self-heal-daemon: enable
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
>
> The mounts are done as follows in /etc/fstab:
> /dev/disk/by-id/scsi-0Linode_Volume_citadel_block1 /mnt/citadel_block1 xfs defaults 0 2
> localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev 0 0
>
> I'm really not sure if direct-io-mode mount tweaks would do anything here,
> what the value should be set to, and what it is by default.
>
> The OS is OpenSUSE 42.3, 64-bit. 80GB of RAM, 20 CPUs, hosted by Linode.
>
> I'd really appreciate any help in the matter.
>
> Thank you.
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii
> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
> <http://twitter.com/ArtemR>
>
> On Thu, Apr 5, 2018 at 11:13 PM, Artem Russakovskii <archon810 at gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm trying to squeeze performance out of gluster on 4 80GB RAM 20-CPU
>> machines where Gluster runs on attached block storage (Linode) in (4
>> replicate bricks), and so far everything I tried results in sub-optimal
>> performance.
>>
>> There are many files - mostly images, several million - and many
>> operations take minutes, copying multiple files (even if they're small)
>> suddenly freezes up for seconds at a time, then continues, iostat
>> frequently shows large r_await and w_awaits with 100% utilization for the
>> attached block device, etc.
>>
>> But anyway, there are many guides out there for small-file performance
>> improvements, but more explanation is needed, and I think more tweaks
>> should be possible.
>>
>> My question today is about performance.cache-size. Is this a size of
>> cache in RAM? If so, how do I view the current cache size to see if it gets
>> full and I should increase its size? Is it advisable to bump it up if I
>> have many tens of gigs of RAM free?
>>
>> More generally, in the last 2 months since I first started working with
>> gluster and set a production system live, I've been feeling frustrated
>> because Gluster has a lot of poorly-documented and confusing options. I
>> really wish documentation could be improved with examples and better
>> explanations.
>>
>> Specifically, it'd be absolutely amazing if the docs offered a strategy
>> for setting each value and ways of determining more optimal values. For
>> example, for performance.cache-size, if it said something like "run command
>> abc to see your current cache size, and if it's hurting, up it, but be
>> aware that it's limited by RAM," it'd be already a huge improvement to the
>> docs. And so on with other options.
>>
>> The gluster team is quite helpful on this mailing list, but in a reactive
>> rather than proactive way. Perhaps it's tunnel vision once you've worked on
>> a project for so long where less technical explanations and even proper
>> documentation of options takes a back seat, but I encourage you to be more
>> proactive about helping us understand and optimize Gluster.
>>
>> Thank you.
>>
>> Sincerely,
>> Artem
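The option list quoted above is easier to compare against a baseline once it is loaded into a dictionary. The following is a minimal sketch, assuming the text is `gluster volume info` output pasted as-is (shortened here to a few of the quoted options). The helper names `reconfigured_options` and `size_to_bytes` are hypothetical, and treating "GB" as a binary unit is an assumption. It does not answer the open question of how to observe actual cache usage; it only puts the configured performance.cache-size in proportion to the 80GB of RAM mentioned.

```python
# Hypothetical helpers, not part of Gluster: parse pasted `gluster volume info`
# text and relate the configured cache size to the RAM figure from the thread.

VOLUME_INFO = """\
Volume Name: apkmirror_data1
Type: Replicate
Number of Bricks: 1 x 4 = 4
Options Reconfigured:
cluster.quorum-count: 1
cluster.quorum-type: fixed
network.ping-timeout: 5
performance.rda-cache-limit: 256MB
performance.md-cache-timeout: 600
performance.io-thread-count: 32
performance.read-ahead: off
performance.cache-size: 1GB
"""

# Assumption: size suffixes are binary multiples (1GB = 1024**3 bytes).
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def reconfigured_options(info_text):
    """Return the key/value pairs listed after 'Options Reconfigured:'."""
    options = {}
    in_options = False
    for line in info_text.splitlines():
        line = line.strip()
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options


def size_to_bytes(text):
    """Convert a value such as '256MB' or '1GB' to a byte count."""
    for suffix, factor in UNITS.items():
        if text.upper().endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


opts = reconfigured_options(VOLUME_INFO)
cache_bytes = size_to_bytes(opts["performance.cache-size"])
ram_bytes = size_to_bytes("80GB")  # RAM per server, as stated in the thread
print(f"performance.cache-size is {cache_bytes / ram_bytes:.2%} of RAM")
```

With the values from the thread, the configured cache is a small fraction of the available memory, which is the context for the question of whether raising it is advisable.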
Artem Russakovskii
2018-Apr-10 05:47 UTC
[Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs
Hi Vlad,

I'm using only localhost: mounts.

Can you please explain what effect each option has on the performance issues
shown in my posts?

"negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"

From what I remember, direct-io-mode=enable didn't make a difference in my
tests, but I suppose I can try again. The explanations of direct-io-mode in
various guides on the web are quite confusing, saying that enabling it could
make performance worse in some situations and better in others due to the OS
file cache.

There are also these gluster volume settings, adding to the confusion:

Option: performance.strict-o-direct
Default Value: off
Description: This option when set to off, ignores the O_DIRECT flag.

Option: performance.nfs.strict-o-direct
Default Value: off
Description: This option when set to off, ignores the O_DIRECT flag.

Re: 4.0. I moved to 4.0 after finding out that it fixes the disappearing
dirs bug related to cluster.readdir-optimize, if you remember
(http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html).
I was already on 3.13 by then, and 4.0 resolved the issue. It's been stable
for me so far, thankfully.

Sincerely,
Artem
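To make the suggestion concrete, here is a minimal sketch of how the quoted mount options would combine with the existing glusterfs line in /etc/fstab. The option string and the mount source and target are taken from this thread; the helper `fstab_entry` is hypothetical and only assembles text. Whether these options help this particular workload is untested here.

```python
# Mount options exactly as quoted in the thread; their individual effects
# are the open question being asked, so none are explained here.
THREAD_MOUNT_OPTIONS = [
    "negative-timeout=10",
    "attribute-timeout=30",
    "fopen-keep-cache",
    "direct-io-mode=enable",
    "fetch-attempts=5",
]


def fstab_entry(source, mountpoint, fstype, options, dump=0, fsck_pass=0):
    """Assemble one /etc/fstab line from its six fields (hypothetical helper)."""
    return f"{source} {mountpoint} {fstype} {','.join(options)} {dump} {fsck_pass}"


# Start from the options already present in the original fstab line.
existing_options = ["defaults", "_netdev"]

entry = fstab_entry(
    "localhost:/apkmirror_data1",
    "/mnt/apkmirror_data1",
    "glusterfs",
    existing_options + THREAD_MOUNT_OPTIONS,
)
print(entry)
```

Generating the line rather than hand-editing it makes it easier to toggle one option at a time, for example dropping direct-io-mode=enable, and to rerun the same copy test after each remount.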
Vlad Kopylov
2018-Apr-10 14:01 UTC
[Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs
I wish I knew, or could get, a detailed description of those options myself.

Here is a discussion of direct-io-mode:
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode

Like you, I ran tests on a large volume of files and found that the main
delays are in attribute calls, which is how I ended up with those mount
options to improve performance. I discovered those options basically by
googling this users list, where people share their tests.

I'm not sure I share your optimism: rather than going up, I downgraded to
3.12 and have no dir view issue now. I did have to recreate the cluster and
re-add the bricks with existing data, though.
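One way to check whether attribute calls dominate on a given mount is to time them directly. The following is a minimal sketch, assuming Python is available on the client; `time_attribute_calls` is a hypothetical helper, and the demo runs against a temporary directory rather than a real Gluster mount.

```python
import os
import tempfile
import time


def time_attribute_calls(root, limit=10000):
    """Time os.lstat() on up to `limit` files under `root`.

    Returns (number_of_files, elapsed_seconds). The directory walk that
    collects the paths is done first and is not included in the timing.
    """
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
            if len(paths) >= limit:
                break
        if len(paths) >= limit:
            break

    start = time.perf_counter()
    for path in paths:
        os.lstat(path)  # one attribute (stat) call per file
    return len(paths), time.perf_counter() - start


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(5):
        with open(os.path.join(demo_dir, f"file{i}.apk"), "w") as handle:
            handle.write("x")
    count, elapsed = time_attribute_calls(demo_dir)
    print(f"{count} lstat calls took {elapsed * 1000:.3f} ms")
```

Pointing the function at the Gluster mount (for example /mnt/apkmirror_data1) and then at a directory on the raw xfs volume would give a rough per-call comparison. Repeat runs are likely to be affected by caching, so the first run after a fresh mount is probably the more telling number.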