Sahina Bose
2018-May-29 05:50 UTC
[Gluster-users] [ovirt-users] Gluster problems, cluster performance issues
[Adding gluster-users to look at the heal issue]

On Tue, May 29, 2018 at 9:17 AM, Jim Kusznir <jim at palousetech.com> wrote:

> Hello:
>
> I've been having some cluster and gluster performance issues lately. I
> also found that my cluster was out of date and was trying to apply updates
> (hoping to fix some of these), when I discovered the oVirt 4.1 repos had
> been taken completely offline. So, I was forced to begin an upgrade to
> 4.2. According to the docs I found/read, I needed only to add the new
> repo, do a yum update, and reboot to be good on my hosts (I did the yum
> update, plus engine-setup on my hosted engine). Things seemed to work
> relatively well, except for a gluster sync issue that showed up.
>
> My cluster is a 3-node hyperconverged cluster. I upgraded the hosted
> engine first, then engine 3. When engine 3 came back up, for some reason
> one of my gluster volumes would not sync. Here's sample output:
>
> [root at ovirt3 ~]# gluster volume heal data-hdd info
> Brick 172.172.1.11:/gluster/brick3/data-hdd
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2
> Status: Connected
> Number of entries: 8
>
> Brick 172.172.1.12:/gluster/brick3/data-hdd
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b
> Status: Connected
> Number of entries: 8
>
> Brick 172.172.1.13:/gluster/brick3/data-hdd
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba
> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625
> Status: Connected
> Number of entries: 8
>
> ---------
> It's been in this state for a couple of days now, and bandwidth monitoring
> shows no appreciable data moving. I've tried repeatedly commanding a full
> heal from all three nodes in the cluster. It's always the same files that
> need healing.
>
> When running gluster volume heal data-hdd statistics, I sometimes see
> different information, but always some number of "heal failed" entries.
> It shows 0 for split-brain.
>
> I'm not quite sure what to do. I suspect it may be due to nodes 1 and 2
> still being on the older ovirt/gluster release, but I'm afraid to upgrade
> and reboot them until I have a good gluster sync (I don't need to create
> a split-brain issue). How do I proceed with this?
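For reference, the heal state described above can be re-checked, and a heal
re-triggered, with the standard gluster CLI. This is a minimal sketch using
the volume name data-hdd taken from the output above:

    # List files pending heal on each brick
    gluster volume heal data-hdd info

    # Per-brick pending counts, without the full file list
    gluster volume heal data-hdd statistics heal-count

    # Confirm no entries are actually in split-brain
    gluster volume heal data-hdd info split-brain

    # Re-trigger a full self-heal crawl
    gluster volume heal data-hdd full

    # Verify the self-heal daemon is online on all three nodes
    gluster volume status data-hdd

If the same entries fail repeatedly, the per-file reasons are logged by the
self-heal daemon in /var/log/glusterfs/glustershd.log on each node.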
> Second issue: I've been experiencing VERY POOR performance on most of my
> VMs, to the point that logging into a Windows 10 VM via Remote Desktop can
> take 5 minutes, and launching QuickBooks inside said VM can easily take
> 10 minutes. On some Linux VMs, I get random messages like this:
>
> Message from syslogd at unifi at May 28 20:39:23 ...
> kernel:[6171996.308904] NMI watchdog: BUG: soft lockup - CPU#0 stuck for
> 22s! [mongod:14766]
>
> (The process and PID are often different.)
>
> I'm not quite sure what to do about this either. My initial thought was
> to upgrade everything to current and see if the problem persists, but I
> cannot move forward with that until my gluster is healed...
>
> Thanks!
> --Jim
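As for the upgrade path Jim describes (adding the new repo, yum update,
reboot), a rough sketch of the 4.1-to-4.2 host steps follows. The release-RPM
URL is the standard oVirt package location rather than something quoted in
this thread, so treat it as an assumption and check the oVirt 4.2 upgrade
notes first; in any case, hosts should be upgraded one at a time, only after
the volume has fully healed, to avoid the split-brain risk raised above:

    # On each host, one at a time, once "heal info" shows 0 entries
    # (repo RPM location assumed; verify against the 4.2 release notes):
    yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm
    yum update
    reboot

    # On the hosted-engine VM, after its packages are updated:
    engine-setup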