Hi,

I have a volume made of 12 bricks with 3x replication (no striping). We had to take one server down for maintenance (2 bricks per server, but the bricks are ordered so that the first brick of every server comes before the second brick of every server, so no server should appear more than once in any replica group). The server was down for 40 minutes, and after it came back up "gluster volume heal home0 info" listed some files. I started healing, but after 3 days the list is still the same. Today I also enabled quorum enforcement to make sure we don't get into split brain in the future; as we have 3 replicas, 2 should make quorum.

Anyway, the healing information is attached to this e-mail, gathered with:

    [root@se1 ~]# for i in "" heal-failed split-brain; do gluster volume heal home0 info $i > home-heal-$i.txt 2>&1; done

(Attachments scrubbed by the list archive:
    home-heal-.txt             <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20121126/9c98d512/attachment.txt>
    home-heal-heal-failed.txt  <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20121126/9c98d512/attachment-0001.txt>
    home-heal-split-brain.txt  <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20121126/9c98d512/attachment-0002.txt>)

Any ideas how to fix this?

Mario Kadastik, PhD
Researcher

---
"Physics is like sex, sure it may have practical reasons, but that's not why we do it"
  -- Richard P. Feynman
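For reference, a rough sketch of the commands behind the steps described above (enabling quorum, triggering a heal, and gathering the heal reports). The exact quorum options used were not included in the mail, so the "cluster.quorum-type auto" setting shown here is an assumption, though it is the usual choice for a replica-3 volume:

    # Assumed quorum setting: "auto" requires a majority (2 of 3) of each
    # replica set to be up before writes are allowed on that set.
    gluster volume set home0 cluster.quorum-type auto

    # Trigger self-heal on files that need it, or a full sweep of the volume.
    gluster volume heal home0
    gluster volume heal home0 full

    # Gather the three heal reports, as in the command above.
    for i in "" heal-failed split-brain; do
        gluster volume heal home0 info $i > home-heal-$i.txt 2>&1
    done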
On 11/26/2012 05:26 AM, Mario Kadastik wrote:
> Hi,
>
> I have a volume made of 12 bricks with 3x replication (no striping). We had to take one server down for maintenance (2 bricks per server, but the bricks are ordered so that the first brick of every server comes before the second brick of every server, so no server should appear more than once in any replica group). The server was down for 40 minutes, and after it came back up "gluster volume heal home0 info" listed some files. I started healing, but after 3 days the list is still the same. Today I also enabled quorum enforcement to make sure we don't get into split brain in the future; as we have 3 replicas, 2 should make quorum.
>
> Anyway, the healing information is attached to this e-mail, gathered with:
>
>     [root@se1 ~]# for i in "" heal-failed split-brain; do gluster volume heal home0 info $i > home-heal-$i.txt 2>&1; done

For some of the files where healing failed, check the extended attributes on each replica. For example:

    getfattr -d -e hex -m . .../res/out_files_485.tgz

Also, check the logs in /var/log/glusterfs to see if they give any indication of why self-heal is failing. In my experience, the most common cause of such failures is a GFID mismatch, which is really a form of split brain but is not recognized or handled as such (which is why it doesn't show up in the split-brain report). These can occur, for example, if a file is created separately on two bricks because of a network partition, or because two servers were down at different times.
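A minimal sketch of how such a GFID mismatch could be checked and cleared by hand, assuming the brick is mounted at /bricks/brick1 on each server; the brick path and the GFID value below are placeholders, not taken from the attached reports:

    # On every server holding a replica, compare the trusted.gfid and the
    # trusted.afr.* changelog attributes of the affected file:
    getfattr -d -e hex -m . /bricks/brick1/res/out_files_485.tgz

    # If trusted.gfid differs between replicas, pick the copy to discard and
    # remove both the file and its hard link under .glusterfs on that brick.
    # The hard link lives at .glusterfs/<first byte>/<second byte>/<full GFID>,
    # e.g. for a (hypothetical) GFID starting with d0f7:
    rm /bricks/brick1/res/out_files_485.tgz
    rm /bricks/brick1/.glusterfs/d0/f7/d0f7xxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

    # Then trigger self-heal so the file is recreated from a good replica:
    gluster volume heal home0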