On 20 August 2020 at 3:46:41 GMT+03:00, Computerisms Corporation <bob at
computerisms.ca> wrote:
>Hi Strahil,
>
>so over the last two weeks, the system has been relatively stable. I
>have powered off both servers at least once, for about 5 minutes each
>time. Each server came up and auto-healed what it needed to, so all of
>that part is working as expected.
>
>I will answer things inline and follow with more questions:
>
>>>> Hm... OK. I guess you can try 7.7 whenever it's possible.
>>>
>>> Acknowledged.
>
>Still on my list.
>> It could be a bad firmware also. If you get the opportunity, flash
>> the firmware and bump the OS to the max.
>
>The datacenter says everything was up to date as of installation, and
>I don't really want them to take the servers offline for long enough
>to redo all the hardware.
>
>>>>> more number of CPU cycles than needed, increasing the event
>>>>> thread count would enhance the performance of the Red Hat
>>>>> Storage Server." which is why I had it at 8.
>>>> Yeah, but you got only 6 cores and they are not dedicated for
>>>> gluster only. I think that you need to test with lower values.
>
>I figured out that my magic number for the client/server event threads
>should be 5. I set it to 5, observed no change I could attribute to
>it, so tried 4, and got the same thing; no visible effect.
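>
>For anyone following along, these get set per volume, something along
>the lines of:
>
>  gluster volume set webisms client.event-threads 4
>  gluster volume set webisms server.event-threads 4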
>
>>>>> right now the only suggested parameter I haven't played with is
>>>>> the performance.io-thread-count, which I currently have at 64.
>>>>> not really sure what would be a reasonable value for my system.
>> I guess you can try to increase it a little bit and check how it is
>> going.
>
>It turns out that if you try to set this higher than 64, you get an
>error saying 64 is the max.
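>
>That is, this is accepted:
>
>  gluster volume set webisms performance.io-thread-count 64
>
>but anything higher gets rejected.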
>
>>>> What I/O scheduler are you using for the SSDs (you can check via
>>>> 'cat /sys/block/sdX/queue/scheduler')?
>>>
>>> # cat /sys/block/vda/queue/scheduler
>>> [mq-deadline] none
>>
>> Deadline prioritizes reads in a 2:1 ratio /default tunings/. You
>> can consider testing 'none' if your SSDs are good.
>
>I did this. I would say it did have a positive effect, but it was a
>minimal one.
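>
>For reference, the switch is just the usual sysfs toggle (it does not
>persist across reboots):
>
>  # echo none > /sys/block/vda/queue/scheduler
>  # cat /sys/block/vda/queue/scheduler
>  mq-deadline [none]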
>
>> I see vda, please share details on the infra as this is very
>> important. Virtual disks have their limitations and if you are on a
>> VM, then there might be a chance to increase the CPU count.
>> If you are on a VM, I would recommend you to use more (in number)
>> and smaller disks in stripe sets (either raid0 via mdadm, or pure
>> striped LV).
>> Also, if you are on a VM -> there is no reason to reorder your I/O
>> requests in the VM, just to do it again on the hypervisor. In such a
>> case 'none' can bring better performance, but this varies with the
>> workload.
>
>hm, this is a good question, one I have been asking the datacenter
>about for a while, but they are a little bit slippery about what
>exactly it is they have going on there. They advertise the servers as
>metal with a virtual layer. The virtual layer is so you can log into a
>site and power the server down or up, mount an ISO to boot from,
>access a console, and some other nifty things. You can't any more, but
>when they first introduced the system, you could even access the BIOS
>of the server. But apparently, and they swear up and down by this, it
>is a physical server, with real dedicated SSDs and real sticks of RAM.
>I have found virtio and qemu among the loaded kernel modules, so
>certainly there is something virtual involved, but other than that and
>their nifty little tools, it has always acted and worked like a metal
>server to me.
You can use the 'virt-what' binary to find out whether, and what type
of, virtualization is used.
I have a suspicion you are on top of OpenStack (which uses Ceph), so I
guess you can try to get more info.
For example, an OpenStack instance can have '0x1af4' in
'/sys/block/vdX/device/vendor' (replace X with the actual device
letter).
Another check could be:
/usr/lib/udev/scsi_id -g -u -d /dev/vda
And also, you can try to take a look with smartctl from the
smartmontools package:
smartctl -a /dev/vdX
>> All necessary data is in the file attributes on the brick. I doubt
>> you will need to have access times on the brick itself. Another
>> possibility is to use 'relatime'.
>
>remounted all bricks with noatime, no significant difference.
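>
>(For clarity, that was a remount of each brick filesystem, something
>like this, assuming each brick sits on its own filesystem:
>
>  mount -o remount,noatime /var/GlusterBrick/replset-0
>
>plus the matching change in fstab.)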
>
>>> cache unless flush-behind is on. So seems that is a way to throw
>>> ram to it? I put performance.write-behind-window-size: 512MB and
>>> performance.flush-behind: on and the whole system calmed down
>>> pretty much immediately. could be just timing, though, will have to
>>> see tomorrow during business hours whether the system stays at a
>>> reasonable
>
>Tried increasing this to its max of 1GB, no noticeable change from
>512MB.
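>
>For completeness, the two settings in question are roughly:
>
>  gluster volume set webisms performance.flush-behind on
>  gluster volume set webisms performance.write-behind-window-size 1GB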
>
>The 2nd server is not acting in line with the first server. glusterfsd
>processes are running at 50-80% of a core each, with one brick often
>going over 200%, whereas they usually stick to 30-45% on the first
>server. apache processes consume as much as 90% of a core, whereas
>they rarely go over 15% on the first server, and they frequently stack
>up to more than 100 running at once, which drives the load average up
>to 40-60. It's very much like the first server was before I found the
>flush-behind setting, but not as bad; at least it isn't going
>completely non-responsive.
>
>Additionally, it is still taking an excessively long time to load the
>first page of most sites. I am guessing I need to increase read speeds
>to fix this, so I have played with
>performance.io-cache/cache-max-file-size (slight positive change),
>read-ahead/read-ahead-page-count (negative change until the page count
>was set to its max of 16, then no noticeable difference), and
>rda-cache-limit/rda-request-size (minimal positive effect). I still
>have RAM to spare, so it would be nice if I could use it to improve
>the read side of things, but I have found no magic bullet like
>flush-behind was.
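>
>For reference, those read-side knobs translate to settings like:
>
>  gluster volume set webisms performance.cache-max-file-size 5MB
>  gluster volume set webisms performance.read-ahead-page-count 16
>  gluster volume set webisms performance.rda-cache-limit 1GB
>  gluster volume set webisms performance.rda-request-size 128KB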
>
>I found a good number of additional options to try; I have been going
>a little crazy with them and will post them at the bottom. I found a
>post suggesting that mount options are also important:
>
>https://lists.gluster.org/pipermail/gluster-users/2018-September/034937.html
>
>I confirmed these are in the man pages, so I tried unmounting and
>re-mounting with -o to include them, like so:
>
>mount -t glusterfs moogle:webisms /Computerisms/ -o
>negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
>
>But I don't think they are working:
>
>/# mount | grep glus
>moogle:webisms on /Computerisms type fuse.glusterfs
>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
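>
>(If I understand the fuse helper right, mount.glusterfs passes these
>as command-line flags to the glusterfs client process instead of
>showing them in the mount table, so something like
>
>  ps ax | grep glusterfs
>
>should reveal whether --negative-timeout, --attribute-timeout, etc.
>actually made it onto the client process.)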
>
>I would be grateful for any other suggestions anyone can think of.
>
>root at moogle:/# gluster v info
>
>Volume Name: webisms
>Type: Distributed-Replicate
>Volume ID: 261901e7-60b4-4760-897d-0163beed356e
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 2 x (2 + 1) = 6
>Transport-type: tcp
>Bricks:
>Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
>Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
>Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb
>(arbiter)
>Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
>Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
>Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb
>(arbiter)
>Options Reconfigured:
>performance.rda-cache-limit: 1GB
>performance.client-io-threads: off
>nfs.disable: on
>storage.fips-mode-rchecksum: off
>transport.address-family: inet
>performance.stat-prefetch: on
>network.inode-lru-limit: 200000
>performance.write-behind-window-size: 1073741824
>performance.readdir-ahead: on
>performance.io-thread-count: 64
>performance.cache-size: 12GB
>server.event-threads: 4
>client.event-threads: 4
>performance.nl-cache-timeout: 600
>auth.allow: xxxxxx
>performance.open-behind: off
>performance.quick-read: off
>cluster.lookup-optimize: off
>cluster.rebal-throttle: lazy
>features.cache-invalidation: on
>features.cache-invalidation-timeout: 600
>performance.cache-invalidation: on
>performance.md-cache-timeout: 600
>performance.flush-behind: on
>cluster.read-hash-mode: 0
>performance.strict-o-direct: on
>cluster.readdir-optimize: on
>cluster.lookup-unhashed: off
>performance.cache-refresh-timeout: 30
>performance.enable-least-priority: off
>cluster.choose-local: on
>performance.rda-request-size: 128KB
>performance.read-ahead: on
>performance.read-ahead-page-count: 16
>performance.cache-max-file-size: 5MB
>performance.io-cache: on