Hi Strahil, thanks again for sticking with me on this.

> Hm... OK. I guess you can try 7.7 whenever it's possible.

Acknowledged.

>> Perhaps I am not understanding it correctly. I tried these suggestions
>> before and it got worse, not better. So I have been operating under the
>> assumption that maybe these guidelines are not appropriate for newer
>> versions.
>
> Actually, the settings are not changed much, so they should work for you.

Okay, then maybe I am doing something incorrectly, or not understanding
some fundamental piece of things that I should be.

>>>> Interestingly, mostly because it is not something I have ever
>>>> experienced before, software interrupts sit between 1 and 5 on each
>>>> core, but the last core is usually sitting around 20. I have never
>>>> encountered a high load average where the si number was ever
>>>> significant. I have googled the crap out of that (as well as gluster
>>>> performance in general); there are nearly limitless posts about what
>>>> it is, but I have yet to see one that explains what to do about it.
>
> This is happening on all nodes ?
> I got a similar situation caused by a bad NIC (si in top was way high), but the chance of a bad NIC on all servers is very low.
> You can still patch OS + Firmware on your next maintenance.

Yes, but it's not to the same extreme. The other node is currently not
actually serving anything to the internet, so right now its only
function is replicated gluster and databases. On the 2nd node there is
also one hot core, the first one in this case as opposed to the last one
on the main node, but it sits between 10 and 15 instead of 20 and 25,
and the remaining cores will be between 0 and 2 instead of 1 and 5.

I have no evidence of any bad hardware, and these servers were both
commissioned only within the last couple of months. But I will still
poke around on this path.
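In case it is useful to anyone following along, the per-core soft
interrupt numbers I keep quoting come from nothing gluster-specific,
just the usual tools (mpstat is from the sysstat package). The first
command shows which softirq class is climbing on the busy core, the
second shows the per-CPU %soft column over one-second samples:

  # watch -d -n1 cat /proc/softirqs
  # mpstat -P ALL 1 5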
>> more number of CPU cycles than needed, increasing the event thread
>> count would enhance the performance of the Red Hat Storage Server."
>> which is why I had it at 8.
>
> Yeah, but you got only 6 cores and they are not dedicated for gluster only. I think that you need to test with lower values.

Okay, I will change these values a few times over the next couple of
hours and see what happens.

>> right now the only suggested parameter I haven't played with is the
>> performance.io-thread-count, which I currently have at 64.
>
> I think that as you have SSDs only, you might have some results by changing this one.

Okay, I will also modify this incrementally. Do you think it can go
higher? I think I got this number from a thread on this list, but I am
not really sure what would be a reasonable value for my system.

>> For what it's worth, I am running ext4 as my underlying fs and I have
>> read a few times that XFS might have been a better choice. But that is
>> not a trivial experiment to make at this time with the system in
>> production. It's one thing (and still a bad thing to be sure) to
>> semi-bork the system for an hour or two while I play with
>> configurations, but it would take a day or so offline to reformat and
>> restore the data.
>
> XFS should bring better performance, but if the issue is not in the FS -> it won't make a change...
> What I/O scheduler are you using for the SSDs (you can check via 'cat /sys/block/sdX/queue/scheduler')?

# cat /sys/block/vda/queue/scheduler
[mq-deadline] none

>> in the past I have tried 2, 4, 8, 16, and 32. Playing with just those
>> I never noticed that any of them made any difference. Though I might
>> have some different options now than I did then, so might try these
>> again throughout the day...
>
> Are you talking about server or client event threads (or both)?

It never occurred to me to set them to different values. So far when I
set one I have set the other to the same value.

>> Thanks again for your time Strahil, if you have any more thoughts
>> would love to hear them.
>
> Can you check if you use 'noatime' for the bricks ? It won't bring any effect on the CPU side, but it might help with the I/O.

I checked into this, and I have nodiratime set, but not noatime. From
what I can gather, it should provide nearly the same performance benefit
while leaving the atime attribute on the files. You never know, I may
decide I want those at some point in the future.

> I see that your indicator for high load is loadavg, but have you actually checked how many processes are in 'R' or 'D' state ?
> Some monitoring checks can raise loadavg artificially.

Occasionally a batch of processes will be in R state, and I see the D
state show up from time to time, but mostly everything is S.

> Also, are you using software mirroring (either mdadm or striped/mirrored LVs )?

No, single disk. And I opted to not put the gluster bricks on a thin
LVM, as I don't see myself using the LVM snapshots in this scenario.

So, we just moved into a quieter time of the day, but maybe I just
stumbled onto something. I was trying to figure out if/how I could throw
more RAM at the problem. The gluster docs say write-behind is not a
cache unless flush-behind is on, so it seems that is a way to throw RAM
at it? I put performance.write-behind-window-size: 512MB and
performance.flush-behind: on and the whole system calmed down pretty
much immediately. It could be just timing, though; I will have to see
tomorrow during business hours whether the system stays at a reasonable
load.
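For anyone searching the archives later, the commands were nothing more
exotic than this (the volume name "webvol" is just mine, substitute your
own; the last command only confirms what is actually applied):

  # gluster volume set webvol performance.flush-behind on
  # gluster volume set webvol performance.write-behind-window-size 512MB
  # gluster volume get webvol performance.write-behind-window-size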
I will still test the other options you suggested tonight, though; this
is probably too good to be true.

Can't thank you enough for your input, Strahil, your help is truly
appreciated!

>>> Best Regards,
>>> Strahil Nikolov


Hi List,

> So, we just moved into a quieter time of the day, but maybe I just
> stumbled onto something. I was trying to figure out if/how I could
> throw more RAM at the problem. The gluster docs say write-behind is not
> a cache unless flush-behind is on, so it seems that is a way to throw
> RAM at it? I put performance.write-behind-window-size: 512MB and
> performance.flush-behind: on and the whole system calmed down pretty
> much immediately. It could be just timing, though; I will have to see
> tomorrow during business hours whether the system stays at a reasonable
> load.

So, reporting back that this seems to have definitely had a significant
positive effect. So far today I have not seen the load average climb
over 13, with the 15-minute average hovering around 7. CPUs are still
spiking from time to time, but they are not staying maxed out all the
time, and frequently I am seeing brief periods of up to 80% idle. The
glusterfs process is still spiking up to 180% or so, but consistently
running around 70%, and the brick processes are still spiking up to
70-80%, but consistently running around 20%. Disk has only been above
50% in atop once so far today, when it spiked up to 92%, and there is
still lots of RAM left over. So far nload even seems to indicate I could
get away with a 100Mbit network connection.

Websites are snappy relative to what they were, still a bit sluggish on
the first page of any given site, but tolerable or close to it. Apache
processes are opening and closing right away, instead of stacking up.
Overall, the system is performing pretty much like I would expect it to
without gluster.
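(Since Strahil asked about R/D states earlier: the quick checks behind
the numbers above are nothing fancy,

  # uptime
  # ps -eo state= | sort | uniq -c

the second one just counts processes by state, so it is easy to see at a
glance whether anything is piling up in R or D.)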
I haven't played with any of the other settings yet, just going to leave
it like this for a day. I have to admit I am a little bit suspicious; I
have been arguing with Gluster for a very long time, and I have never
known it to play this nice. Kind of feels like when your girl tells you
she is "fine"; the conversation has stopped, but you aren't really sure
if it's done...

> I will still test the other options you suggested tonight, though, this
> is probably too good to be true.
>
> Can't thank you enough for your input, Strahil, your help is truly
> appreciated!


On 5 August 2020 at 4:53:34 GMT+03:00, Computerisms Corporation <bob at computerisms.ca> wrote:

>Hi Strahil,
>
>thanks again for sticking with me on this.
>> Hm... OK. I guess you can try 7.7 whenever it's possible.
>
>Acknowledged.
>
>>> Perhaps I am not understanding it correctly. I tried these suggestions
>>> before and it got worse, not better. So I have been operating under the
>>> assumption that maybe these guidelines are not appropriate for newer
>>> versions.
>>
>> Actually, the settings are not changed much, so they should work for you.
>
>Okay, then maybe I am doing something incorrectly, or not understanding
>some fundamental piece of things that I should be.

To be honest, the documentation seems pretty useless to me.

>>>>> Interestingly, mostly because it is not something I have ever
>>>>> experienced before, software interrupts sit between 1 and 5 on each
>>>>> core, but the last core is usually sitting around 20. I have never
>>>>> encountered a high load average where the si number was ever
>>>>> significant. I have googled the crap out of that (as well as gluster
>>>>> performance in general); there are nearly limitless posts about what
>>>>> it is, but I have yet to see one that explains what to do about it.
>>
>> This is happening on all nodes ?
>> I got a similar situation caused by a bad NIC (si in top was way high), but the chance of a bad NIC on all servers is very low.
>> You can still patch OS + Firmware on your next maintenance.
>
>Yes, but it's not to the same extreme. The other node is currently not
>actually serving anything to the internet, so right now its only
>function is replicated gluster and databases. On the 2nd node there is
>also one hot core, the first one in this case as opposed to the last one
>on the main node, but it sits between 10 and 15 instead of 20 and 25,
>and the remaining cores will be between 0 and 2 instead of 1 and 5.
>
>I have no evidence of any bad hardware, and these servers were both
>commissioned only within the last couple of months. But I will still
>poke around on this path.

It could be a bad firmware also. If you get the opportunity, flash the firmware and bump the OS to the max.

>>> more number of CPU cycles than needed, increasing the event thread
>>> count would enhance the performance of the Red Hat Storage Server."
>>> which is why I had it at 8.
>>
>> Yeah, but you got only 6 cores and they are not dedicated for gluster only. I think that you need to test with lower values.
>
>Okay, I will change these values a few times over the next couple of
>hours and see what happens.
>
>>> right now the only suggested parameter I haven't played with is the
>>> performance.io-thread-count, which I currently have at 64.
>>
>> I think that as you have SSDs only, you might have some results by changing this one.
>
>Okay, I will also modify this incrementally. Do you think it can go
>higher? I think I got this number from a thread on this list, but I am
>not really sure what would be a reasonable value for my system.

I guess you can try to increase it a little bit and check how it is going.
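All of these are just volume options, so testing values is quick and
reversible. For example, to try lower event threads (the volume name and
the value 4 are only placeholders, not recommendations - change one
thing at a time and watch top):

# gluster volume set myvol client.event-threads 4
# gluster volume set myvol server.event-threads 4
# gluster volume get myvol all | grep -E 'event-threads|io-thread-count'

performance.io-thread-count is set the same way, and the get command
shows what is currently applied before and after a change.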
>>> For what it's worth, I am running ext4 as my underlying fs and I have
>>> read a few times that XFS might have been a better choice. But that is
>>> not a trivial experiment to make at this time with the system in
>>> production. It's one thing (and still a bad thing to be sure) to
>>> semi-bork the system for an hour or two while I play with
>>> configurations, but it would take a day or so offline to reformat and
>>> restore the data.
>>
>> XFS should bring better performance, but if the issue is not in the FS -> it won't make a change...
>> What I/O scheduler are you using for the SSDs (you can check via 'cat /sys/block/sdX/queue/scheduler')?
>
># cat /sys/block/vda/queue/scheduler
>[mq-deadline] none

Deadline prioritizes reads in a 2:1 ratio /default tunings/. You can consider testing 'none' if your SSDs are good.

I see vda, so please share details on the infra, as this is very important. Virtual disks have their limitations, and if you are on a VM there might be a chance to increase the CPU count. If you are on a VM, I would also recommend using more (in number) and smaller disks in stripe sets (either raid0 via mdadm, or a pure striped LV).

Also, if you are on a VM there is no reason to reorder your I/O requests in the VM just to do it again on the hypervisor. In such a case 'none' can bring better performance, but this varies with the workload.
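If you decide to test it, the switch is non-destructive and easy to
revert at runtime (the device name is taken from your output above):

# cat /sys/block/vda/queue/scheduler
# echo none > /sys/block/vda/queue/scheduler
# cat /sys/block/vda/queue/scheduler

To keep it across reboots, one way I know of is a udev rule, something
like this in /etc/udev/rules.d/60-io-scheduler.rules (adjust to your
distro):

ACTION=="add|change", KERNEL=="vd[a-z]", ATTR{queue/scheduler}="none"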
>>> in the past I have tried 2, 4, 8, 16, and 32. Playing with just those
>>> I never noticed that any of them made any difference. Though I might
>>> have some different options now than I did then, so might try these
>>> again throughout the day...
>>
>> Are you talking about server or client event threads (or both)?
>
>It never occurred to me to set them to different values. So far when I
>set one I have set the other to the same value.

Yeah, this makes sense.

>>> Thanks again for your time Strahil, if you have any more thoughts
>>> would love to hear them.
>>
>> Can you check if you use 'noatime' for the bricks ? It won't bring any effect on the CPU side, but it might help with the I/O.
>
>I checked into this, and I have nodiratime set, but not noatime. From
>what I can gather, it should provide nearly the same performance benefit
>while leaving the atime attribute on the files. You never know, I may
>decide I want those at some point in the future.

All the necessary data is in the file attributes on the brick, so I doubt you will need access times on the brick itself. Another possibility is to use 'relatime'.

>> I see that your indicator for high load is loadavg, but have you actually checked how many processes are in 'R' or 'D' state ?
>> Some monitoring checks can raise loadavg artificially.
>
>Occasionally a batch of processes will be in R state, and I see the D
>state show up from time to time, but mostly everything is S.
>
>> Also, are you using software mirroring (either mdadm or striped/mirrored LVs )?
>
>No, single disk. And I opted to not put the gluster bricks on a thin
>LVM, as I don't see myself using the LVM snapshots in this scenario.
>
>So, we just moved into a quieter time of the day, but maybe I just
>stumbled onto something. I was trying to figure out if/how I could throw
>more RAM at the problem. The gluster docs say write-behind is not a
>cache unless flush-behind is on, so it seems that is a way to throw RAM
>at it? I put performance.write-behind-window-size: 512MB and
>performance.flush-behind: on and the whole system calmed down pretty
>much immediately. It could be just timing, though; I will have to see
>tomorrow during business hours whether the system stays at a reasonable
>load.
>
>I will still test the other options you suggested tonight, though; this
>is probably too good to be true.
>
>Can't thank you enough for your input, Strahil, your help is truly
>appreciated!
>
>________
>
>Community Meeting Calendar:
>
>Schedule -
>Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>Bridge: https://bluejeans.com/441850968
>
>Gluster-users mailing list
>Gluster-users at gluster.org
>https://lists.gluster.org/mailman/listinfo/gluster-users