Nicolas Ecarnot
2016-May-12 15:08 UTC
[Gluster-users] Continual heals happening on sharded cluster, me too...
Hello,

On a replica 3 sharded cluster running gluster 3.7.11 (CentOS 7.2), I had witnessed very sound behaviour for months. Recently, I upgraded from [something quite recent, like 3.7.something] to 3.7.11, and immediately started seeing continual healing. I am using this cluster to host 3 oVirt VMs with very low I/O.

I read and tried to use the information here:
https://www.mail-archive.com/gluster-users%40gluster.org/msg24598.html
but I am missing the knowledge needed to fix things.

Here is what I see when running "gluster volume heal data-shard-03 info":

* serv-vm-al01

Brick serv-vm-al01:/gluster/data/brick03
Status: Connected
Number of entries: 0

Brick serv-vm-al02:/gluster/data/brick03
Status: Connected
Number of entries: 0

Brick serv-vm-al03:/gluster/data/brick03
Status: Connected
Number of entries: 0

* serv-vm-al02

Brick serv-vm-al01:/gluster/data/brick03
<gfid:75ab0b1c-258f-4349-85bd-49ef80592919>
<gfid:39bdcf05-f5b1-4df2-9941-614838b50e18>
<gfid:7e0d84a0-0749-4a2d-9390-e95d489ec66a>
Status: Connected
Number of entries: 3

Brick serv-vm-al02:/gluster/data/brick03
/.shard/41573624-feb9-4ea6-bbd4-f0a912429b2f.1003
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.2351
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.1113
Status: Connected
Number of entries: 3

Brick serv-vm-al03:/gluster/data/brick03
/.shard/34948b18-5991-4761-b867-2bd4fe4879d4.2807
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.2351
/.shard/34948b18-5991-4761-b867-2bd4fe4879d4.2062
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.1113
/.shard/41573624-feb9-4ea6-bbd4-f0a912429b2f.1833
Status: Connected
Number of entries: 5

* serv-vm-al03

Brick serv-vm-al01:/gluster/data/brick03
/.shard/41573624-feb9-4ea6-bbd4-f0a912429b2f.882
<gfid:39bdcf05-f5b1-4df2-9941-614838b50e18>
<gfid:93c83d37-71e5-4f17-b757-ff00a4384f2f>
<gfid:75fcdab7-6d18-4371-bbb2-be156dd3da86>
<gfid:75ab0b1c-258f-4349-85bd-49ef80592919>
<gfid:f933a07a-4f89-4bf9-ab82-a05432a0162f>
/.shard/34948b18-5991-4761-b867-2bd4fe4879d4.2329
Status: Connected
Number of entries: 7

Brick serv-vm-al02:/gluster/data/brick03
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.1113
/.shard/34948b18-5991-4761-b867-2bd4fe4879d4.2807
/.shard/41573624-feb9-4ea6-bbd4-f0a912429b2f.190
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.2351
Status: Connected
Number of entries: 4

Brick serv-vm-al03:/gluster/data/brick03
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.1113
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.185
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.183
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.2351
Status: Connected
Number of entries: 4

We monitor heal entries in Nagios/Centreon, and the history graphs show that, from the time I upgraded to 3.7.11, we went from around 0-1 files to heal every 5 minutes to an average of 6 to 10 files to heal every minute. The VMs are still doing the exact same nothing as before.

Questions:
- how come?
- how to fix?

--
Nicolas ECARNOT
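[Editor's note: the per-brick counts graphed in Nagios/Centreon can be extracted from the heal-info output with a short shell filter. The sketch below is an assumption about how such a check might be wired up, not the poster's actual plugin; the heredoc stands in for live `gluster volume heal data-shard-03 info` output, with sample counts chosen for illustration.]

```shell
#!/bin/sh
# Sketch of a Nagios-style check: sum the "Number of entries:" lines
# reported per brick by `gluster volume heal <vol> info`.
# heal_info() is a stand-in for the real command (hypothetical sample data).
heal_info() {
cat <<'EOF'
Brick serv-vm-al01:/gluster/data/brick03
Status: Connected
Number of entries: 0
Brick serv-vm-al02:/gluster/data/brick03
/.shard/26ec536a-8919-478c-834c-f6ac70882ee6.2351
Status: Connected
Number of entries: 1
Brick serv-vm-al03:/gluster/data/brick03
Status: Connected
Number of entries: 2
EOF
}

# Sum the last field of every "Number of entries:" line across all bricks.
total=$(heal_info | awk '/^Number of entries:/ {sum += $NF} END {print sum+0}')
echo "entries pending heal: $total"
```

In production the heredoc would be replaced by the real command, e.g. `gluster volume heal data-shard-03 info | awk ...`, run on a schedule and fed to the monitoring system.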
Ashish Pandey
2016-May-12 17:41 UTC
[Gluster-users] Continual heals happening on sharded cluster, me too...
Hi Nicolas,

I think this issue has already been raised: we are seeing different heal info reported by different servers.
https://bugzilla.redhat.com/show_bug.cgi?id=1335429

The patch for this is under review.

Ashish

----- Original Message -----
From: "Nicolas Ecarnot" <nicolas at ecarnot.net>
To: "gluster-users" <Gluster-users at gluster.org>
Sent: Thursday, May 12, 2016 8:38:53 PM
Subject: [Gluster-users] Continual heals happening on sharded cluster, me too...
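[Editor's note: the bug referenced above describes exactly the symptom in the thread, i.e. each server returning a different entry list for the same volume. A minimal way to confirm the divergence is to compare the entry lists node by node. The sketch below simulates two servers' heal-info output with heredocs so it is self-contained; the function names and sample entries are illustrative, not from a real cluster.]

```shell
#!/bin/sh
# Sketch: check whether two nodes agree on the pending-heal entry list,
# the symptom described in bug 1335429. Each heal_info_* function is a
# stand-in for `gluster volume heal data-shard-03 info` run on that node
# (hypothetical sample output; entries copied from the thread).
heal_info_al01() {
cat <<'EOF'
Brick serv-vm-al01:/gluster/data/brick03
Status: Connected
Number of entries: 0
EOF
}

heal_info_al02() {
cat <<'EOF'
Brick serv-vm-al01:/gluster/data/brick03
<gfid:75ab0b1c-258f-4349-85bd-49ef80592919>
Status: Connected
Number of entries: 1
EOF
}

# Keep only the entry lines (paths under /.shard or <gfid:...> markers),
# sorted so the comparison ignores ordering differences.
entries() { grep -E '^(/|<gfid:)' | sort; }

if [ "$(heal_info_al01 | entries)" = "$(heal_info_al02 | entries)" ]; then
  echo "servers agree"
else
  echo "servers disagree"
fi
```

On a live cluster the same comparison could be done by collecting the command output from each node (for example over ssh) and diffing the filtered lists pairwise.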