On 03/07/2016 07:40 AM, songxin wrote:
> Hi all,
> I have a problem with how to recover a replicate volume.
>
> Precondition:
> glusterfs version: 3.7.6
> brick of A board: 128.224.95.140:/data/brick/gv0
> brick of B board: 128.224.162.255:/data/brick/gv0
>
> Steps to reproduce:
> 1. gluster peer probe 128.224.162.255                     (on A board)
> 2. gluster volume create gv0 replica 2 128.224.95.140:/data/brick/gv0
>    128.224.162.255:/data/brick/gv0 force                  (on A board)
> 3. gluster volume start gv0                               (on A board)
> 4. reboot the B board
>
> After the B board reboots, I sometimes see the following problems.
> 1. The peer status is sometimes Rejected when I run "gluster peer
>    status".

This is where you get into the problem. I am really not sure what happens
when you reboot a board. In our earlier conversation about a similar
problem you mentioned that a board reboot doesn't wipe out
/var/lib/glusterd; please double-confirm that. Also, please send
cmd_history.log along with the glusterd log from both nodes. And post
reboot, are you also trying to detach/probe A? If so, were A and B in a
connected cluster state before the detach?

>    (on A or B board)
> 2. The brick on the B board is sometimes offline when I run "gluster
>    volume status".
>    (on A or B board)
>
> I want to know what I should do to recover my replicate volume.
>
> PS.
> Currently I do the following to recover the replicate volume, but
> sometimes I can't sync all the files in the volume even if I run
> "heal full".
> 1. gluster volume remove-brick gv0 replica 1
>    128.224.162.255:/data/brick/gv0 force                  (on A board)
> 2. gluster peer detach 128.224.162.255                    (on A board)
> 3. gluster peer probe 128.224.162.255                     (on A board)
> 4. gluster volume add-brick gv0 replica 2
>    128.224.162.255:/data/brick/gv0 force                  (on A board)
>
> Please help me.
>
> Thanks,
> Xin
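For what it's worth, the sequence usually suggested for a peer stuck in the
Rejected state is to resync its configuration from the healthy node rather
than removing and re-adding the brick. A minimal sketch, assuming the reboot
leaves /var/lib/glusterd intact and that B (128.224.162.255) is the rejected
node; adjust the service commands to your init system:

    # on the rejected node (B board)
    service glusterd stop
    mv /var/lib/glusterd/glusterd.info /tmp/   # keep this node's own UUID
    rm -rf /var/lib/glusterd/*                 # drop the stale volume/peer config
    mv /tmp/glusterd.info /var/lib/glusterd/
    service glusterd start
    gluster peer probe 128.224.95.140          # pull the config back from A
    service glusterd restart
    gluster peer status                        # expect "Peer in Cluster (Connected)"

If the brick on B still shows offline in "gluster volume status" after that,
"gluster volume start gv0 force" respawns the missing brick process, and
"gluster volume heal gv0 full" can then resync the data.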
Hi all,

I have a file with a gfid-mismatch problem, as shown below.

stat: cannot stat '/mnt/c//public_html/cello/ior_files/nameroot.ior': Input/output error

Remote:
getfattr -d -m . -e hex opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
# file: opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000256ded2f6000ad80f
trusted.gfid=0x771221a7bb3c4f1aade40ce9e38a95ee

Local:
getfattr -d -m . -e hex opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
# file: opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
trusted.bit-rot.version=0x000000000000000256ded38f000e3a51
trusted.gfid=0x8ea33f46703c4e2d95c09153c1b858fd

The documentation at
https://gluster.readthedocs.org/en/latest/Troubleshooting/split-brain/ says:

"This is done by observing the afr changelog extended attributes of the file
on the bricks using the getfattr command; then identifying the type of
split-brain (data split-brain, metadata split-brain, entry split-brain or
split-brain due to gfid-mismatch); and finally determining which of the
bricks contains the 'good copy' of the file."

So a gfid-mismatch is also a kind of split-brain. But I found that "gluster
volume heal gv0 info split-brain" does not show split-brain entries caused
by a gfid-mismatch.

My questions are:
1. Which command can be used to show split-brain due to gfid-mismatch?
2. How do I heal it? Is the procedure the same as for a data split-brain?

Thanks,
Xin
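As far as I know, in this release "heal info split-brain" only reports
data/metadata split-brains tracked by the afr changelog attributes; a gfid
mismatch surfaces only as the Input/output error on lookup and has to be
resolved by hand. A sketch of the usual manual procedure, assuming the brick
root is /opt/lvmdir/c2/brick and that the Local brick holds the bad copy
(deciding which copy is "good" is up to you); the gfid below is taken from
your getfattr output:

    # on the node whose brick has the BAD copy only -- here assumed Local
    BRICK=/opt/lvmdir/c2/brick
    F=public_html/cello/ior_files/nameroot.ior

    rm -f $BRICK/$F
    # remove the gfid hard link too: gfid 0x8ea33f46... lives under
    # .glusterfs/<first 2 hex chars>/<next 2 hex chars>/<full uuid>
    rm -f $BRICK/.glusterfs/8e/a3/8ea33f46-703c-4e2d-95c0-9153c1b858fd

    # then trigger a lookup from a client mount so AFR recreates the
    # file from the good copy:
    stat /mnt/c/public_html/cello/ior_files/nameroot.ior

If the file has other hard links on that brick, they need to be removed as
well before triggering the heal.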
Hi,

I have created a replicate volume and I want to run "gluster volume heal
gv0 full". I found that if I run "gluster volume heal gv0 full" on one
board, it always fails with the error below.

Launching heal operation to perform full self heal on volume gv0 has been unsuccessful

But if I run "heal full" on the other board, it always succeeds.

I found this code in glusterfs:

        if (gf_uuid_compare (brickinfo->uuid, candidate) > 0)
                gf_uuid_copy (candidate, brickinfo->uuid);

        if ((*index) % hxl_children == 0) {
                if (!gf_uuid_compare (MY_UUID, candidate)) {
                        _add_hxlator_to_dict (dict, volinfo,
                                              ((*index)-1)/hxl_children,
                                              (*hxlator_count));
                        (*hxlator_count)++;
                }
                gf_uuid_clear (candidate);
        }

My questions are:
1. Must I run "heal full" on the board whose UUID is the biggest?
2. If so, how can I know which board has the biggest UUID before trying to
   run "heal full" on every board?

Thanks,
Xin
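If it helps while you investigate: each node's own UUID is stored in
/var/lib/glusterd/glusterd.info, and the peers' UUIDs show up in "gluster
peer status", so you can compare them without trying "heal full" on every
board. A sketch, assuming the stock CLI and default state-file location:

    # local node's UUID (run on each board):
    gluster system:: uuid get
    # ...or read it straight from glusterd's state file:
    grep UUID= /var/lib/glusterd/glusterd.info

    # the other peers' UUIDs:
    gluster peer status

Since gf_uuid_compare() orders UUIDs by their fields in the same order they
appear in the canonical hex string, sorting the two UUID strings
lexicographically should tell you which board is the "candidate", i.e. the
one on which "heal full" will succeed.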