thr3ads.net - Gluster users - [Gluster-users] Problems with .gluster structure

Shawn Heisey

2014-Mar-09 02:45 UTC

[Gluster-users] Problems with .gluster structure - bad symlinks

Some background:

----

On version 3.3.1, we tried to rebalance after adding storage.  It blew
up badly due to this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=859387

We have now upgraded to 3.4.2.  A new rebalance attempt resulted in
several dozen entries showing up in the 'gluster volume heal $vol info'
output.

----

With the help of Joe Julian in the IRC channel, I made my way through
the heal problems, but I continue to get errors in my server logs.

I have now learned that there are a bunch of bad symlinks in the
.glusterfs structure on each of my bricks.  All of them say too many
levels of symbolic links.  I do not believe they are loops ... when I
manually checked a couple of them, they were actually valid, but had
more than the allowed number of symlinks in the chain.

cat: /bricks/d00v00/mdfs/.glusterfs/65/30/6530ce82-310d-4c7c-8d14-135655328a77: Too many levels of symbolic links
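
For anyone who wants to reproduce this kind of check, here is a rough Python sketch (not an official Gluster tool). It walks a directory tree, lists every symlink whose resolution fails with ELOOP, and can count by hand how many symlinks a path passes through, which helps tell a real loop from a chain that is merely too long. The function names and the hop limit are my own choices for illustration.

```python
import errno
import os


def find_eloop_symlinks(root):
    """Return symlinks under root whose resolution fails with ELOOP."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                continue
            try:
                os.stat(full)  # follows the link, like cat would
            except OSError as exc:
                if exc.errno == errno.ELOOP:
                    bad.append(full)
    return bad


def count_symlink_hops(path, limit=10000):
    """Resolve path one component at a time, counting every symlink followed.

    Returns (hops, resolved_path). Raises RuntimeError once more than
    `limit` symlinks have been followed, which suggests a real loop.
    The limit is an arbitrary value chosen for this sketch.
    """
    hops = 0
    pending = [p for p in os.path.abspath(path).split(os.sep) if p]
    resolved = os.sep  # always a fully resolved, symlink-free prefix
    while pending:
        part = pending.pop(0)
        if part == ".":
            continue
        if part == "..":
            resolved = os.path.dirname(resolved)
            continue
        candidate = os.path.join(resolved, part)
        if os.path.islink(candidate):
            hops += 1
            if hops > limit:
                raise RuntimeError("probable symlink loop at " + candidate)
            target = os.readlink(candidate)
            if os.path.isabs(target):
                resolved = os.sep
            # Relative targets are resolved against the link's directory.
            pending = [p for p in target.split(os.sep) if p] + pending
        else:
            resolved = candidate
    return hops, resolved
```

Running find_eloop_symlinks() on a brick's .glusterfs directory and then count_symlink_hops() on each result shows whether a link resolves after a finite number of hops or never terminates.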

What do I need to do to fix this problem?  Is there something I can do
for each of the bad symlinks?  Would a 'heal full' do anything useful?
Do I need to do something more drastic, like take the volume down and
entirely remove (or rename) the .glusterfs structure from all 32 bricks
(16x2 distributed-replicate)?  I don't want to cause myself more
problems, but I want to get the volume in a completely pristine state
and NOT risk losing any of the 52 terabytes of data that's in the volume.

Thanks,
Shawn

Shawn Heisey

2014-Mar-09 16:39 UTC

On 3/8/2014 7:45 PM, Shawn Heisey wrote:
> cat: /bricks/d00v00/mdfs/.glusterfs/65/30/6530ce82-310d-4c7c-8d14-135655328a77: Too many levels of symbolic links
>
> What do I need to do to fix this problem?  Is there something I can do
> for each of the bad symlinks?  Would a 'heal full' do anything useful?
> Do I need to do something more drastic, like take the volume down and
> entirely remove (or rename) the .glusterfs structure from all 32 bricks
> (16x2 distributed-replicate)?  I don't want to cause myself more
> problems, but I want to get the volume in a completely pristine state
> and NOT risk losing any of the 52 terabytes of data that's in the volume.

Some additional info:

http://fpaste.org/83806/43825451/

This is from nfs.log on the server that all my clients contact for NFS
mounts.  It is peered with the other servers, but has no bricks.

So far I have determined the following about my bricks:

* There are no stray directories under .glusterfs/??/??/
* There is nothing remaining with nonzero trusted.afr* attributes
* There *are* broken symlinks (too many levels)
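
As an illustration of the first check in the list above, here is a minimal Python sketch. It assumes (my understanding, not something confirmed in this thread) that entries under .glusterfs/??/??/ are normally hardlinks for files or symlinks for directories, so a real directory there counts as stray. The function name is hypothetical.

```python
import glob
import os


def stray_directories(brick_root):
    """List real (non-symlink) directories under .glusterfs/??/??/."""
    pattern = os.path.join(brick_root, ".glusterfs", "??", "??", "*")
    return sorted(
        p for p in glob.glob(pattern)
        if os.path.isdir(p) and not os.path.islink(p)
    )
```

An empty result for every brick would match the "no stray directories" finding.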

I will run another check to make sure there are no files outside of the
indices directory that have a link count of one.  I will also check for
files that have more than two hardlinks.  I do not use hardlinks in my
data, so I think a link count above two should never happen.
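
A minimal Python sketch of that link-count check follows. It assumes a brick where every regular file is expected to have exactly two links (the data file plus its entry under .glusterfs), and it skips .glusterfs/indices. The function name and the skip list are assumptions for illustration. Other housekeeping files under .glusterfs may legitimately show up in the output.

```python
import os
import stat


def unexpected_link_counts(brick_root,
                           skip_dirs=(os.path.join(".glusterfs", "indices"),)):
    """Return (single, many): regular files with 1 link and with >2 links."""
    brick_root = os.path.normpath(brick_root)
    skip = {os.path.join(brick_root, d) for d in skip_dirs}
    single, many = [], []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Prune skipped directories so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames
                       if os.path.join(dirpath, d) not in skip]
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # do not follow symlinks
            if not stat.S_ISREG(st.st_mode):
                continue
            if st.st_nlink == 1:
                single.append(full)
            elif st.st_nlink > 2:
                many.append(full)
    return single, many
```

Every path sharing an inode with more than two links is reported, so one over-linked file appears once per name it has on the brick.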

Is there anything else I can look for, and if I find something, where
can I go for information about how to fix it?

Thanks,
Shawn
