Hello all,

I'm very interested in Gluster because it seems easy and powerful. I carefully read the GlusterFS documentation, but I don't understand why recovery of a dead brick failed.

I set up a new GlusterFS system based on CentOS 7 for the servers and Debian Wheezy for the client. GlusterFS 3.6.2 is used on all systems.

I set up a replicated volume "gv0" with two bricks, one on each server (192.168.1.198 and 192.168.1.199):

    gluster volume create gv0 replica 2 192.168.1.198:/export/VG0_DATA/brick \
        192.168.1.199:/export/VG0_DATA/brick
    gluster volume start gv0

Client access works, and replication is fine.

Then I ran a test: I removed one brick on one server, recreated the brick directory as an empty directory, and re-added it via "gluster volume add-brick":

    gluster volume remove-brick gv0 replica 1 192.168.1.199:/export/VG0_DATA/brick force
    ssh 192.168.1.199 rm -rf /export/VG0_DATA/brick
    ssh 192.168.1.199 mkdir /export/VG0_DATA/brick
    gluster volume add-brick gv0 replica 2 192.168.1.199:/export/VG0_DATA/brick force

I observed the following behaviour: when the second brick is brought back online, all files vanish on the client. Only after running a manual full heal on the volume do the files reappear on the client side.

Some logs:

gluster pool list / volume info: http://fpaste.org/182536/23247297/

Client glusterfs.log: http://fpaste.org/182527/23246678/

Client+server OS environment: http://fpaste.org/182530/24688114/

Server1 glustershd.log: http://fpaste.org/182531/24697414/
Server2 glustershd.log: http://fpaste.org/182532/14232470/

Server1 etc-glusterfs-glusterd.vol.log: http://fpaste.org/182533/14232470/
Server2 etc-glusterfs-glusterd.vol.log: http://fpaste.org/182534/23247143/

Server1 cli.log: http://fpaste.org/182535/24720014/

Thanks for any help.

Regards,
Tobias
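[For reference, the "manual full heal" workaround described in this post uses the standard gluster CLI. A sketch, assuming the volume name gv0 from the post and that the commands are run on one of the servers; exact output depends on the cluster state:]

```shell
# Trigger a full self-heal: crawl the entire volume and copy
# files that are missing on the newly re-added (empty) brick.
gluster volume heal gv0 full

# Show entries that are still pending heal on each brick.
gluster volume heal gv0 info

# Confirm that both bricks and the self-heal daemons are online.
gluster volume status gv0
```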
On 06.02.2015 at 19:30, Tobias Unsleber wrote:
> [...]

After trying different versions and packages, I got GlusterFS working the way I expect it to. Maybe the CentOS 7 build is broken.
I compiled glusterfs 3.6.2 from source and it worked right away.

Regards,
Tobias