Joe - Scott sent me a private email and I provided a workaround. For some
(unknown) reason all the nodes ended up with two uuids for one particular
peer, which caused the problem. I've asked for the log files to debug
further.
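
A duplicate or self-referencing peer entry of this kind can be spotted by
inspecting the peer store on each node. What follows is a minimal Python
sketch, not a Gluster tool. It assumes the usual on-disk layout, in which
/var/lib/glusterd/glusterd.info carries a UUID=... line and each file under
/var/lib/glusterd/peers holds uuid=... and hostnameN=... lines. The names
read_kv and check_peers are hypothetical helpers introduced for illustration.

```python
import os
from collections import defaultdict

# Assumed default glusterd working directory; adjust if yours differs.
GLUSTERD_DIR = "/var/lib/glusterd"


def read_kv(path):
    """Parse a simple key=value file into a dict (hypothetical helper)."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    return values


def check_peers(glusterd_dir=GLUSTERD_DIR):
    """Return (self_entries, duplicate_hosts) for one node's peer store.

    self_entries: peer files whose uuid equals this node's own UUID.
    duplicate_hosts: hostnames that appear under more than one uuid.
    """
    info = read_kv(os.path.join(glusterd_dir, "glusterd.info"))
    local_uuid = info.get("UUID")
    peers_dir = os.path.join(glusterd_dir, "peers")

    self_entries = []
    uuids_by_host = defaultdict(set)
    for name in sorted(os.listdir(peers_dir)):
        peer = read_kv(os.path.join(peers_dir, name))
        # Fall back to the file name if the file has no uuid= line.
        peer_uuid = peer.get("uuid", name)
        if peer_uuid == local_uuid:
            self_entries.append(name)
        for key, value in peer.items():
            if key.startswith("hostname"):
                uuids_by_host[value].add(peer_uuid)

    duplicate_hosts = {
        host: sorted(uuids)
        for host, uuids in uuids_by_host.items()
        if len(uuids) > 1
    }
    return self_entries, duplicate_hosts


if __name__ == "__main__":
    if not os.path.isdir(os.path.join(GLUSTERD_DIR, "peers")):
        print("no peer store found under", GLUSTERD_DIR)
    else:
        own, dupes = check_peers()
        for name in own:
            print("peer file refers to this node itself:", name)
        for host, uuids in dupes.items():
            print("host", host, "is listed under several uuids:", ", ".join(uuids))
```

The script only reads files and changes nothing. Run on every node, it shows
whether a node lists itself as a peer and whether any hostname is recorded
under more than one uuid, which are the two symptoms described in this thread.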
On Fri, 17 Feb 2017 at 21:58, Joe Julian <joe at julianfamily.org> wrote:
> Does your repaired server have the correct uuid in
> /var/lib/glusterd/glusterd.info?
>
> On February 16, 2017 9:49:56 PM PST, Scott Hazelhurst <
> Scott.Hazelhurst at wits.ac.za> wrote:
>
>
> Dear all
>
> Last week I posted a query about a problem I had with a machine that had
failed but the underlying hard disk with the gluster brick was good. I've made
some progress in restoring it. I now have a problem with my newly restored
machine, where it becomes its own peer, which then breaks everything.
>
> 1. Gluster daemons are off on all peers, content of /var/lib/glusterd/peers
looks good.
> 2. I start the gluster daemons on all peers. All looks good.
> 3. For about 2 minutes, there's no obvious problem: if I do a gluster peer
status on any machine it looks good, and if I do a gluster volume status A01 on
any machine it looks good.
> 4. Then at some point, the /var/lib/glusterd/peers directory of the new,
restored machine gets an entry for itself and things start breaking. A typical
error message is the understandable
>
> : Unable to get lock for uuid: 4fb930f7-554e-462a-9204-4592591feeb8, lock
held by: 4fb930f7-554e-462a-9204-4592591feeb8
>
> 5. This is repeatable: if I stop the daemons, remove the offending entry in
/var/lib/glusterd/peers, and restart, the same behavior occurs. All is good for
a minute or two and then something magically puts something in
/var/lib/glusterd/peers.
>
> In a previous step in restoring my machine, I had a different error of
mismatching cksums and what I did then may be the cause of the problem. In
searching the list archives I found someone with a similar cksum problem, and
the proposed solution was to copy the /var/lib/glusterd/vols/ from another of
the peers to the new machine. This may not be the issue but this is the only
thing I think I did that was unconventional.
>
> I am running version 3.7.5-19 on Scientific Linux 6.8
>
> If anyone can suggest a way forward I would be grateful
>
> Many thanks
>
> Scott
>
>
> ------------------------------
>
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
--
- Atin (atinm)