Amar Tumballi Suryanarayan
2019-Feb-21 05:18 UTC
[Gluster-users] glusterfsd Ubuntu 18.04 high iowait issues
If you have both systems available to compare, can you get the `gluster profile info` output from each? That would help in understanding the issue.

On Thu, Feb 21, 2019 at 8:20 AM Kartik Subbarao <subbarao at computer.org> wrote:

> We're running gluster on two hypervisors running Ubuntu. When we
> upgraded from Ubuntu 14.04 to 18.04, it upgraded gluster from 3.4.2 to
> 3.13.2. As soon as we upgraded and since then, we've been seeing
> substantially higher iowait on the system, as measured by top and iotop,
> and iotop indicates that glusterfsd is the culprit. For some reason,
> glusterfsd is doing more disk reads and/or those reads are being held
> up at a greater rate. The guest VMs are also seeing more iowait -- their
> images are hosted on the gluster volume. This is causing inconsistent
> responsiveness from the services hosted on the VMs.
>
> I'm looking for any recommendations on how to troubleshoot and/or
> resolve this problem. We have other sites that are still running 14.04,
> so I can compare/contrast any configuration parameters and performance.
>
> The block scheduler on 14.04 was set to deadline and 18.04 was set to
> cfq. But changing the 18.04 scheduler to deadline didn't make any
> difference.
>
> I was wondering whether glusterfsd on 18.04 isn't caching as much as it
> should. We tried increasing performance.cache-size substantially but
> that didn't make any difference.
>
> Another option we're considering but haven't tried yet is upgrading to
> gluster 5.3 by back-porting the package from Ubuntu 19.04 to 18.04. Does
> anyone think this might help?
>
> Is there any particular debug logging we could set up or other commands
> we could run to troubleshoot this better? Any thoughts, suggestions,
> ideas would be greatly appreciated.
>
> Thanks,
>
>     -Kartik

--
Amar Tumballi (amarts)
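For anyone wanting to compare iowait between the 14.04 and 18.04 hosts with a number rather than by watching top, the following is a minimal sketch, not something posted in this thread. It assumes the standard Linux `/proc/stat` layout, where the aggregate `cpu` line lists user, nice, system, idle, iowait, irq, softirq and steal counters in that order. The function names and the 5-second default interval are illustrative choices.

```python
import time


def cpu_times(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat samples."""
    before = cpu_times(before_text)
    after = cpu_times(after_text)
    deltas = [b - a for a, b in zip(after, before)]
    deltas = [-d for d in deltas]  # after minus before for every counter
    # Sum only user..steal (first 8 fields); guest time is already counted in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval_s=5.0, path="/proc/stat"):
    """Take two samples interval_s seconds apart and return the iowait percentage."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return iowait_percent(before, after)
```

Calling `sample_iowait()` on each hypervisor under similar load would give comparable figures. It reports system-wide iowait only, so attributing it to glusterfsd still needs a per-process tool such as iotop.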
Kartik Subbarao
2019-Feb-21 16:34 UTC
[Gluster-users] glusterfsd Ubuntu 18.04 high iowait issues
Here are three profile reports from 60-second intervals:

Ubuntu 18.04 system with low load: https://pastebin.com/XzgmjeuJ
Ubuntu 14.04 system with low load: https://pastebin.com/5BEHDFwq
Ubuntu 14.04 system with high load: https://pastebin.com/CFSWW4qn

Each of these systems is "gluster1" in the report. In each cluster, there are two bricks, gluster1:/md3/gluster and gluster2:/md3/gluster. The systems are identical hardware-wise. (I noticed this morning that the 18.04 upgrade applied the powersave governor to the CPU. I changed it to the performance governor before running the profile, but that doesn't seem to have changed the iowait behavior or the profile report appreciably.)

What jumps out at me for the 18.04 systems is:

1) The excessively high average latency of the FINODELK operations on the *local* brick (i.e. gluster1:/md3/gluster). The latency is far lower for these FINODELK operations against the other node's brick (gluster2:/md3/gluster). This is puzzling to me.

2) Almost double the average latency for FSYNC operations against both the gluster1 and gluster2 bricks.

On the 14.04 systems, the number of FINODELK operations performed during the 60-second interval is much lower (even on the high-load system), and the latencies are lower.

Regards,

    -Kartik

On 2/21/19 12:18 AM, Amar Tumballi Suryanarayan wrote:

> If you have both systems to get some idea, can you get the `gluster
> profile info` output? That helps a bit to understand the issue.
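The per-brick comparison of FINODELK and FSYNC latencies described above can also be done mechanically. Below is a rough sketch, not something from the thread, that parses saved profile output (as printed by `gluster volume profile <VOLNAME> info`) and pulls out the average latency and call count for one FOP on each brick. It assumes each brick section starts with a `Brick:` line, that sub-sections are introduced by a line ending in `Stats:`, and that table rows look like `<%-latency> <avg> us <min> us <max> us <calls> <FOP>`. Check those assumptions against your own output, since the layout may differ between gluster versions.

```python
import re

# Assumed row shape: "<%-latency> <avg> us <min> us <max> us <calls> <FOP>"
ROW = re.compile(
    r"^\s*([\d.]+)\s+([\d.]+)\s+us\s+([\d.]+)\s+us\s+([\d.]+)\s+us"
    r"\s+(\d+)\s+([A-Z0-9_]+)\s*$"
)


def parse_profile(text):
    """Map (brick, section, fop) -> (avg_latency_us, calls)."""
    stats = {}
    brick = None
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Brick:"):
            brick = stripped[len("Brick:"):].strip()
            section = None
        elif stripped.endswith("Stats:"):
            section = stripped[:-1]  # e.g. "Cumulative Stats"
        else:
            match = ROW.match(line)
            if match and brick is not None:
                key = (brick, section, match.group(6))
                stats[key] = (float(match.group(2)), int(match.group(5)))
    return stats


def compare_fop(stats, fop, section):
    """Return {brick: (avg_latency_us, calls)} for one FOP within one section."""
    return {
        brick: value
        for (brick, sec, name), value in stats.items()
        if name == fop and sec == section
    }
```

With the report saved to a file (the filename is up to you), `compare_fop(parse_profile(text), "FINODELK", "Cumulative Stats")` returns one entry per brick, which makes a local-versus-remote latency gap like the one in point 1 easy to spot.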