thr3ads.net - Gluster users - [Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs [Apr 2018]


Vlad Kopylov

2018-Apr-10 14:01 UTC

[Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs

I wish I knew, or could get, a detailed description of those options myself.
Here is one on direct-io-mode:
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode
Like you, I ran tests on a large volume of files and found that the main
delays are in attribute calls, which is how I ended up with those mount
options to improve performance.
I discovered those options basically by googling this user list, where
people share their tests.
I'm not sure I share your optimism: rather than going up, I downgraded
to 3.12 and have no dir view issue now, though I had to recreate the
cluster and re-add the bricks with existing data.
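
For reference, the mount options under discussion (the option string quoted in the reply below) could be combined with the glusterfs fstab entry from the original post roughly as follows. This is only a sketch: the values are the ones quoted in this thread, not tuned recommendations, and `build_fstab_line` is a hypothetical helper written for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab line for a GlusterFS FUSE mount.
# Arguments: <server:/volume> <mount point> <comma-separated extra options>
build_fstab_line() {
    printf '%s %s glusterfs defaults,_netdev,%s 0 0\n' "$1" "$2" "$3"
}

# Option string as quoted in this thread; the values are the poster's choices.
OPTS='negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5'

# Volume and mount point are taken from the fstab entry in the original post.
build_fstab_line localhost:/apkmirror_data1 /mnt/apkmirror_data1 "$OPTS"
```

The printed line would replace the existing glusterfs entry in /etc/fstab, followed by a remount. Whether direct-io-mode=enable helps is disputed later in the thread, so it is worth testing with and without it.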

On Tue, Apr 10, 2018 at 1:47 AM, Artem Russakovskii <archon810 at gmail.com> wrote:
> Hi Vlad,
>
> I'm using only localhost: mounts.
>
> Can you please explain what effect each option has on performance issues
> shown in my posts?
> "negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"
> From what I remember, direct-io-mode=enable didn't make a difference in my
> tests, but I suppose I can try again. The explanations about direct-io-mode
> are quite confusing on the web in various guides, saying enabling it could
> make performance worse in some situations and better in others due to OS
> file cache.
>
> There are also these gluster volume settings, adding to the confusion:
> Option: performance.strict-o-direct
> Default Value: off
> Description: This option when set to off, ignores the O_DIRECT flag.
>
> Option: performance.nfs.strict-o-direct
> Default Value: off
> Description: This option when set to off, ignores the O_DIRECT flag.
>
> Re: 4.0. I moved to 4.0 after finding out that it fixes the disappearing
> dirs bug related to cluster.readdir-optimize, if you remember
> (http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html).
> I was already on 3.13 by then, and 4.0 resolved the issue. It's been stable
> for me so far, thankfully.
>
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii
> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
> <http://twitter.com/ArtemR>
>
> On Mon, Apr 9, 2018 at 10:38 PM, Vlad Kopylov <vladkopy at gmail.com> wrote:
>
>> You definitely need to add mount options to /etc/fstab.
>> Use the ones from here:
>> http://lists.gluster.org/pipermail/gluster-users/2018-April/033811.html
>>
>> I went on with using local mounts to achieve performance as well
>>
>> Also, 3.12 or 3.10 branches would be preferable for production
>>
>> On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii <archon810 at gmail.com>
>> wrote:
>>
>>> Hi again,
>>>
>>> I'd like to expand on the performance issues and plead for help. Here's
>>> one case which shows these odd hiccups: https://i.imgur.com/CXBPjTK.gifv.
>>>
>>> In this GIF where I switch back and forth between copy operations on 2
>>> servers, I'm copying a 10GB dir full of .apk and image files.
>>>
>>> On server "hive" I'm copying straight from the main disk to an attached
>>> volume block (xfs). As you can see, the transfers are relatively speedy and
>>> don't hiccup.
>>> On server "citadel" I'm copying the same set of data to a 4-replicate
>>> gluster which uses block storage as a brick. As you can see, performance is
>>> much worse, and there are frequent pauses for many seconds where nothing
>>> seems to be happening - just freezes.
>>>
>>> All 4 servers have the same specs, and all of them have performance
>>> issues with gluster and no such issues when raw xfs block storage is used.
>>>
>>> hive has long finished copying the data, while citadel is barely
>>> chugging along and is expected to take probably half an hour to an hour. I
>>> have over 1TB of data to migrate, at which point if we went live, I'm not
>>> even sure gluster would be able to keep up instead of bringing the machines
>>> and services down.
>>>
>>>
>>>
>>> Here's the cluster config, though it didn't seem to make any difference
>>> performance-wise before I applied the customizations vs after.
>>>
>>> Volume Name: apkmirror_data1
>>> Type: Replicate
>>> Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 4 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
>>> Brick2: forge:/mnt/forge_block1/apkmirror_data1
>>> Brick3: hive:/mnt/hive_block1/apkmirror_data1
>>> Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
>>> Options Reconfigured:
>>> cluster.quorum-count: 1
>>> cluster.quorum-type: fixed
>>> network.ping-timeout: 5
>>> network.remote-dio: enable
>>> performance.rda-cache-limit: 256MB
>>> performance.readdir-ahead: on
>>> performance.parallel-readdir: on
>>> network.inode-lru-limit: 500000
>>> performance.md-cache-timeout: 600
>>> performance.cache-invalidation: on
>>> performance.stat-prefetch: on
>>> features.cache-invalidation-timeout: 600
>>> features.cache-invalidation: on
>>> cluster.readdir-optimize: on
>>> performance.io-thread-count: 32
>>> server.event-threads: 4
>>> client.event-threads: 4
>>> performance.read-ahead: off
>>> cluster.lookup-optimize: on
>>> performance.cache-size: 1GB
>>> cluster.self-heal-daemon: enable
>>> transport.address-family: inet
>>> nfs.disable: on
>>> performance.client-io-threads: on
>>>
>>>
>>> The mounts are done as follows in /etc/fstab:
>>> /dev/disk/by-id/scsi-0Linode_Volume_citadel_block1 /mnt/citadel_block1 xfs defaults 0 2
>>> localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev 0 0
>>>
>>> I'm really not sure if direct-io-mode mount tweaks would do anything
>>> here, what the value should be set to, and what it is by default.
>>>
>>> The OS is OpenSUSE 42.3, 64-bit. 80GB of RAM, 20 CPUs, hosted by Linode.
>>>
>>> I'd really appreciate any help in the matter.
>>>
>>> Thank you.
>>>
>>>
>>> Sincerely,
>>> Artem
>>>
>>>
>>> On Thu, Apr 5, 2018 at 11:13 PM, Artem Russakovskii <archon810 at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to squeeze performance out of gluster on 4 80GB RAM 20-CPU
>>>> machines where Gluster runs on attached block storage (Linode) in (4
>>>> replicate bricks), and so far everything I tried results in sub-optimal
>>>> performance.
>>>>
>>>> There are many files - mostly images, several million - and many
>>>> operations take minutes, copying multiple files (even if they're small)
>>>> suddenly freezes up for seconds at a time, then continues, iostat
>>>> frequently shows large r_await and w_awaits with 100% utilization for the
>>>> attached block device, etc.
>>>>
>>>> But anyway, there are many guides out there for small-file performance
>>>> improvements, but more explanation is needed, and I think more tweaks
>>>> should be possible.
>>>>
>>>> My question today is about performance.cache-size. Is this a size of
>>>> cache in RAM? If so, how do I view the current cache size to see if it gets
>>>> full and I should increase its size? Is it advisable to bump it up if I
>>>> have many tens of gigs of RAM free?
>>>>
>>>>
>>>>
>>>> More generally, in the last 2 months since I first started working with
>>>> gluster and set a production system live, I've been feeling frustrated
>>>> because Gluster has a lot of poorly-documented and confusing options. I
>>>> really wish documentation could be improved with examples and better
>>>> explanations.
>>>>
>>>> Specifically, it'd be absolutely amazing if the docs offered a strategy
>>>> for setting each value and ways of determining more optimal values. For
>>>> example, for performance.cache-size, if it said something like "run command
>>>> abc to see your current cache size, and if it's hurting, up it, but be
>>>> aware that it's limited by RAM," it'd be already a huge improvement to the
>>>> docs. And so on with other options.
>>>>
>>>>
>>>>
>>>> The gluster team is quite helpful on this mailing list, but in a
>>>> reactive rather than proactive way. Perhaps it's tunnel vision once you've
>>>> worked on a project for so long where less technical explanations and even
>>>> proper documentation of options takes a back seat, but I encourage you to
>>>> be more proactive about helping us understand and optimize Gluster.
>>>>
>>>> Thank you.
>>>>
>>>> Sincerely,
>>>> Artem
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
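
On the performance.cache-size question at the bottom of the quoted thread: the gluster CLI can at least show and change the configured value. Note that `gluster volume get` reports the setting, not how full the cache currently is, so this does not answer the utilization part of the question. The sketch below only prints the commands unless RUN=1 is set; the `run` wrapper is a hypothetical helper, and the 4GB value is purely illustrative, not a recommendation.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

VOL=apkmirror_data1   # volume name from the config quoted above

# Show the currently configured value (configuration only, not cache usage).
run gluster volume get "$VOL" performance.cache-size

# Change it; 4GB is an illustrative value chosen for this example.
run gluster volume set "$VOL" performance.cache-size 4GB
```

Running it as `RUN=1 sh script.sh` on one of the servers would execute the commands for real. Changing one option at a time and re-running the same copy test makes it easier to tell which setting, if any, made a difference.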

Artem Russakovskii

2018-Apr-10 16:56 UTC


Hi Vlad,

I actually saw that post already and even asked a question 4 days ago (
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode#comment1172497_540917).
The accepted answer also seems to go against your suggestion to enable
direct-io-mode as it says it should be disabled for better performance when
used just for file accesses.

It'd be great if someone from the Gluster team chimed in about this thread.


Sincerely,
Artem



Artem Russakovskii

2018-Apr-18 04:44 UTC


Following up here on a related issue that is very serious for us.

I took down one of the 4 replicate gluster servers for maintenance today.
There are 2 gluster volumes totaling about 600GB. Not that much data. After
the server comes back online, it starts auto healing and pretty much all
operations on gluster freeze for many minutes.

For example, I was trying to run an ls -alrt in a folder with 7300 files,
and it took a good 15-20 minutes before returning.

During this time, I can see iostat show 100% utilization on the brick, heal
status takes many minutes to return, glusterfsd uses up tons of CPU (I saw
it spike to 600%). gluster already has massive performance issues for me,
but healing after a 4-hour downtime is on another level of bad perf.

For example, this command took many minutes to run:

gluster volume heal androidpolice_data3 info summary
Brick nexus2:/mnt/nexus2_block4/androidpolice_data3
Status: Connected
Total Number of entries: 91
Number of entries in heal pending: 90
Number of entries in split-brain: 0
Number of entries possibly healing: 1

Brick forge:/mnt/forge_block4/androidpolice_data3
Status: Connected
Total Number of entries: 87
Number of entries in heal pending: 86
Number of entries in split-brain: 0
Number of entries possibly healing: 1

Brick hive:/mnt/hive_block4/androidpolice_data3
Status: Connected
Total Number of entries: 87
Number of entries in heal pending: 86
Number of entries in split-brain: 0
Number of entries possibly healing: 1

Brick citadel:/mnt/citadel_block4/androidpolice_data3
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0
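
Since the summary command itself is slow during a heal, it can help to capture its output once and reduce it to a single number to track over time. A minimal sketch, assuming the output format shown above; `sum_heal_pending` is a hypothetical helper, and the sample input is copied from the first two bricks.

```shell
#!/bin/sh
# Hypothetical helper: sum the "heal pending" counts across all bricks.
# Reads `gluster volume heal <VOLNAME> info summary` output on stdin.
sum_heal_pending() {
    awk -F': ' '/^Number of entries in heal pending:/ { total += $2 }
                END { print total + 0 }'
}

# Sample input copied from the output above (first two bricks only).
sum_heal_pending <<'EOF'
Brick nexus2:/mnt/nexus2_block4/androidpolice_data3
Status: Connected
Total Number of entries: 91
Number of entries in heal pending: 90
Number of entries in split-brain: 0
Number of entries possibly healing: 1

Brick forge:/mnt/forge_block4/androidpolice_data3
Status: Connected
Total Number of entries: 87
Number of entries in heal pending: 86
Number of entries in split-brain: 0
Number of entries possibly healing: 1
EOF
```

For the two sample bricks this prints 176. On a live system the same function can be fed directly: `gluster volume heal androidpolice_data3 info summary | sum_heal_pending`.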


Statistics showed a diminishing number of failed heals:
...
Ending time of crawl: Tue Apr 17 21:13:08 2018

Type of crawl: INDEX
No. of entries healed: 2
No. of entries in split-brain: 0
No. of heal failed entries: 102

Starting time of crawl: Tue Apr 17 21:13:09 2018

Ending time of crawl: Tue Apr 17 21:14:30 2018

Type of crawl: INDEX
No. of entries healed: 4
No. of entries in split-brain: 0
No. of heal failed entries: 91

Starting time of crawl: Tue Apr 17 21:14:31 2018

Ending time of crawl: Tue Apr 17 21:15:34 2018

Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 88
...
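
To see the trend in failed heals at a glance, the crawl statistics can be reduced to one line per crawl. This is a sketch that assumes the output format shown above; the post does not name the command that produced it (it looks like the output of `gluster volume heal <VOLNAME> statistics`, but that is an assumption), and `failed_per_crawl` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print "<crawl end time><TAB><failed entries>" per crawl.
# Reads crawl statistics in the format shown above on stdin.
failed_per_crawl() {
    awk -F': ' '
        /^Ending time of crawl:/        { end = $2 }
        /^No\. of heal failed entries:/ { print end "\t" $2 }
    '
}

# Sample input copied from the statistics above.
failed_per_crawl <<'EOF'
Ending time of crawl: Tue Apr 17 21:13:08 2018

Type of crawl: INDEX
No. of entries healed: 2
No. of entries in split-brain: 0
No. of heal failed entries: 102

Starting time of crawl: Tue Apr 17 21:13:09 2018

Ending time of crawl: Tue Apr 17 21:14:30 2018

Type of crawl: INDEX
No. of entries healed: 4
No. of entries in split-brain: 0
No. of heal failed entries: 91
EOF
```

For the sample this yields two lines, with 102 and then 91 failed entries, matching the downward trend described above.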

Eventually, everything heals and goes back to at least where the roof isn't
on fire anymore.

The server stats and volume options were given in one of the previous
replies to this thread.

Any ideas or things I could run and show the output of to help diagnose?
I'm also very open to working with someone on the team on a live debugging
session if there's interest.

Thank you.


Sincerely,
Artem


