On 4 August 2020 22:47:44 GMT+03:00, Computerisms Corporation <bob at
computerisms.ca> wrote:
>Hi Strahil, thanks for your response.
>
>>>
>>> I have compiled gluster 7.6 from sources on both servers.
>>
>> There is a 7.7 version which fixes some stuff. Why do you have
>>to compile it from source?
>
>Because I have often found with other stuff in the past that compiling
>from source makes a bunch of problems go away. Software generally works
>the way the developers expect it to if you use the sources, so they are
>better able to help if required. So now I generally compile most of my
>center-piece software and use packages for all the supporting stuff.
Hm... OK. I guess you can try 7.7 whenever it's possible.
>>
>>> Servers are 6-core/3.4GHz with 32 GB RAM, no swap, and SSD and gigabit
>>> network connections. They are running Debian, and are being used as
>>> redundant web servers. There are some 3 million files on the Gluster
>>> storage, averaging 130KB/file.
>>
>> This type of workload is called 'metadata-intensive'.
>
>does this mean the metadata-cache group file would be a good one to
>enable? will try.
>
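>(For reference, the command to apply that group, with <VOLNAME> standing
>in for the actual volume name, would be something like:
>  gluster volume set <VOLNAME> group metadata-cache
>)
>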
>waited 10 minutes, no change that I can see.
>
>> There are some recommendations for this type of workload:
>>
>>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements
>>
>> Keep an eye on the section that mentions dirty-ratio = 5
>> and dirty-background-ratio = 2.
>
>I have actually read that whole manual, and specifically that page
>several times. And also this one:
>
>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/small_file_performance_enhancements
>
>Perhaps I am not understanding it correctly. I tried these suggestions
>before and it got worse, not better. So I have been operating under the
>assumption that maybe these guidelines are not appropriate for newer
>versions.
Actually, the settings have not changed much, so they should work for you.
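In case it saves a lookup, the kernel side of those is just two sysctls
(values taken from that page; persist them in /etc/sysctl.conf if they help):
  sysctl -w vm.dirty_ratio=5
  sysctl -w vm.dirty_background_ratio=2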
>But will try again, adjusting the dirty ratios.
>
>Load average went from around 15 to 35 in about 2-3 minutes, but 20
>minutes later, it is back down to 20. It may be having a minimal
>positive impact on cpu, though; I haven't seen the main glusterfs go over
>200% since I changed this, and the brick processes are hovering just
>below 50% where they were consistently above 50% before. Might just be
>time of day with the system not as busy.
>
>after watching for 30 minutes, load average is fluctuating between 10
>and 30, but cpu idle appears marginally better on average than it was.
>
>>> Interestingly, mostly because it is not something I have ever
>>> experienced before, software interrupts sit between 1 and 5 on each
>>> core, but the last core is usually sitting around 20. Have never
>>> encountered a high load average where the si number was ever
>>> significant. I have googled the crap out of that (as well as gluster
>>> performance in general); there are nearly limitless posts about what
>>> it is, but have yet to see one thing to explain what to do about it.
Is this happening on all nodes?
I had a similar situation caused by a bad NIC (si in top was way high), but
the chance of a bad NIC on all servers is very low.
You can still patch OS + Firmware on your next maintenance.
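If you want to dig into the si numbers, something like this shows which
interrupts land on which core (mpstat comes from the sysstat package):
  cat /proc/interrupts
  mpstat -P ALL 1 5
That should at least tell you whether the NIC queues are pinned to that last
core.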
>> There is an explanation about that in the link I provided above:
>>
>> Configuring a higher event threads value than the available
>>processing units could again cause context switches on these threads.
>>As a result reducing the number deduced from the previous step to a
>>number that is less than the available processing units is recommended.
>
>Okay, again, have played with these numbers before and it did not pan
>out as expected. If I understand it correctly, I have 3 brick processes
>(glusterfsd), so the "deduced" number should be 3, and I should set it
>lower than that, so 2. But it also says "If a specific thread consumes
>more number of CPU cycles than needed, increasing the event thread count
>would enhance the performance of the Red Hat Storage Server." which is
>why I had it at 8.
Yeah, but you have only 6 cores and they are not dedicated to gluster only. I
think that you need to test with lower values.
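For example, for both the server and the client side (<VOLNAME> being your
volume):
  gluster volume set <VOLNAME> server.event-threads 2
  gluster volume set <VOLNAME> client.event-threads 2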
>but will set it to 2 now. load average is at 17 to start, waiting a
>while to see what happens.
>
>so 15 minutes later, load average is currently 12, but is fluctuating
>between 10 and 20, have seen no significant change in cpu usage or
>anything else in top.
>
>now try also changing server.outstanding-rpc-limit to 256 and wait.
>
>15 minutes later; load has been above 30 but is currently back down to
>12. no significant change in cpu. try increasing to 512 and wait.
>
>15 minutes later, load average is 50. No significant difference in cpu.
>
>Software interrupts remain around where they were. wa from top remains
>about where it was. Not sure why load average is climbing so high.
>Changing rpc-limit to 128.
>
>ugh. 10 minutes later, load average just popped over 100. resetting
>rpc-limit.
>
>Now trying cluster.lookup-optimize on, with lazy rebalancing (probably a bad
>idea on the live system, but how much worse can it get?). Ya, bad idea:
>80 hours estimated to complete, load is over 50, and the server is crawling.
>
>disabling rebalance and turning lookup-optimize off, for now.
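>(For the record, the commands were along these lines, with <VOLNAME>
>standing in for my volume:
>  gluster volume set <VOLNAME> cluster.lookup-optimize on
>  gluster volume set <VOLNAME> cluster.rebal-throttle lazy
>  gluster volume rebalance <VOLNAME> start
>and to back out:
>  gluster volume rebalance <VOLNAME> stop
>  gluster volume set <VOLNAME> cluster.lookup-optimize off
>)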
>
>right now the only suggested parameter I haven't played with is the
>performance.io-thread-count, which I currently have at 64.
I think that as you have only SSDs, you might see some results by changing
this one.
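E.g. you could compare the current 64 against a lower value:
  gluster volume set <VOLNAME> performance.io-thread-count 32
(<VOLNAME> is your volume; 32 is just an example value to test against.)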
>Sigh. An hour later load average is 80 and climbing. Apache processes
>are numbering in the hundreds and I am constantly having to restart it.
>This brings load average down to 5, but as apache processes climb and
>are held open, load average gets up over 100 again within 3-4 minutes,
>and the system starts going non-responsive. Rinse and repeat.
>
>So I followed all the recommendations; maybe the dirty settings had a
>small positive impact, but overall the system is most definitely worse for
>having made the changes.
>
>I have returned the configs back to how they were except the dirty
>settings and the metadata-cache group. Increased performance.cache-size
>to 16GB for now, because that is the one thing that seems to help when I
>"tune" (aka make worse) the system. Have had to restart apache a couple
>dozen times or more, but after another 30 minutes or so the system has
>pretty much settled back to how it was before I started. cpu is as I
>originally stated: all 6 cores maxed out most of the time; software
>interrupts still have all cpus running around 5, with the last one
>consistently sitting around 20-25. Disk is busy but not usually maxed
>out. RAM is about half used. Network load peaks at about 1/3 capacity.
>Load average is between 10 and 20. Sites are responding, but sluggish.
>
>So am I not reading these recommendations and following the instructions
>correctly? Am I not waiting long enough after each implementation?
>Should I be making 1 change per day instead of thinking 15 minutes
>should be enough for the system to catch up? I have read the full Red
>Hat documentation and the significant majority of the gluster docs;
>maybe I am missing something else there? Should these settings have had
>a different effect than they did?
>
>For what it's worth, I am running ext4 as my underlying fs and I have
>read a few times that XFS might have been a better choice. But that is
>not a trivial experiment to make at this time with the system in
>production. It's one thing (and still a bad thing to be sure) to
>semi-bork the system for an hour or two while I play with
>configurations, but it would take a day or so offline to reformat and
>restore the data.
XFS should bring better performance, but if the issue is not in the FS -> it
won't make a change...
What I/O scheduler are you using for the SSDs (you can check via 'cat
/sys/block/sdX/queue/scheduler')?
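For SSDs, 'noop'/'none' or '(mq-)deadline' is usually a better fit than 'cfq',
and you can switch on the fly (as root), for example:
  echo mq-deadline > /sys/block/sdX/queue/scheduler
(use one of the names that the scheduler file actually lists on your kernel).
>>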
>> As 'storage.fips-mode-rchecksum' is using sha256, you can try to
>>disable it - which should use the less cpu-intensive md5. Yet, I have
>>never played with that option ...
>
>Done. No significant difference that I can see.
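>(For reference, the disable is roughly:
>  gluster volume set <VOLNAME> storage.fips-mode-rchecksum off
>with <VOLNAME> standing in for the actual volume name.)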
>
>> Check the RH page about the tunings and try different values for the
>>event threads.
>
>In the past I have tried 2, 4, 8, 16, and 32. Playing with just those I
>never noticed that any of them made any difference. Though I might have
>some different options now than I did then, so I might try these again
>throughout the day...
Are you talking about server or client event threads (or both)?
>Thanks again for your time Strahil, if you have any more thoughts I would
>love to hear them.
Can you check if you use 'noatime' for the bricks? It won't have
any effect on the CPU side, but it might help with the I/O.
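A quick way to test without downtime is a remount, and then make it permanent
in /etc/fstab. For example (the brick device and mount point here are just
placeholders):
  mount -o remount,noatime /bricks/brick1
  # /etc/fstab entry, along the lines of:
  # /dev/sdb1  /bricks/brick1  ext4  defaults,noatime  0  2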
I see that your indicator for high load is loadavg, but have you actually
checked how many processes are in 'R' or 'D' state?
Some monitoring checks can raise loadavg artificially.
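Something like this will list them (R = runnable, D = uninterruptible I/O
wait; both count towards loadavg):
  ps -eo state,pid,comm | awk '$1 ~ /^[RD]/'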
Also, are you using software mirroring (either mdadm or striped/mirrored LVs)?
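You can check both quickly with:
  cat /proc/mdstat
  lvs -a -o +devices
(the lvs output shows the layout of each LV and which devices it sits on).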
>>
>>
>> Best Regards,
>> Strahil Nikolov
>>
>________
>
>
>
>Community Meeting Calendar:
>
>Schedule -
>Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>Bridge: https://bluejeans.com/441850968
>
>Gluster-users mailing list
>Gluster-users at gluster.org
>https://lists.gluster.org/mailman/listinfo/gluster-users