Dan Bretherton
2011-Aug-06 12:28 UTC
[Gluster-users] Brick pair file mismatch, self-heal problems?
> Try this to trigger self heal:
>
> find <gluster-mount> -noleaf -print0 -name <file name> | xargs --null stat >/dev/null
>
> On Sun, May 15, 2011 at 11:20 AM, Martin Schenker
> <martin.schenker at profitbricks.com> wrote:
>
> > Can someone enlighten me what's going on here? We have a two peers, the file
> > 21313 is shown through the client mountpoint as "1Jan1970", attribs on
> > server pserver3 don't match but NO self-heal or repair can be triggered
> > through "ls -alR"?!?
> >
> > Checking the files through the server mounts show that two versions are on
> > the system. But the wrong one (as with the "1Jan1970") seems to be the
> > preferred one by the client?!?
> >
> > Do I need to use setattr or what in order to get the client to see the RIGHT
> > version?!? This is not the ONLY file displaying this problematic behaviour!
> >
> > Thanks for any feedback.
> >
> > Martin
> >
> > pserver5:
> >
> > 0root@pserver5:~ # ls -al /mnt/gluster/brick1/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images
> >
> > -rwxrwx--- 1 libvirt-qemu vcb 483183820800 May 13 13:41 21313
> >
> > 0root@pserver5:~ # getfattr -R -d -e hex -m "trusted.afr." /mnt/gluster/brick1/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> > getfattr: Removing leading '/' from absolute path names
> > # file: mnt/gluster/brick1/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> > trusted.afr.storage0-client-2=0x000000000000000000000000
> > trusted.afr.storage0-client-3=0x000000000000000000000000
> >
> > 0root@pserver5:~ # ls -alR /opt/profitbricks/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> > -rwxrwx--- 1 libvirt-qemu kvm 483183820800 Jan 1 1970 /opt/profitbricks/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> >
> > pserver3:
> >
> > 0root@pserver3:~ # ls -al /mnt/gluster/brick1/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images
> >
> > -rwxrwx--- 1 libvirt-qemu kvm 483183820800 Jan 1 1970 21313
> >
> > 0root@pserver3:~ # ls -alR /opt/profitbricks/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> > -rwxrwx--- 1 libvirt-qemu kvm 483183820800 Jan 1 1970 /opt/profitbricks/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> >
> > 0root@pserver3:~ # getfattr -R -d -e hex -m "trusted.afr." /mnt/gluster/brick1/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> > getfattr: Removing leading '/' from absolute path names
> > # file: mnt/gluster/brick1/storage/images/2078/ebb83b05-3a83-9d18-ad8f-8542864da6ef/hdd-images/21313
> > trusted.afr.storage0-client-2=0x000000000000000000000000
> > trusted.afr.storage0-client-3=0x0b0000090900000000000000 <- mismatch,
> > should be targeted for self-heal/repair? Why is there a difference in the
> > views?
> >
> > From the volfile:
> >
> > volume storage0-client-2
> >     type protocol/client
> >     option remote-host de-dc1-c1-pserver3
> >     option remote-subvolume /mnt/gluster/brick1/storage
> >     option transport-type rdma
> >     option ping-timeout 5
> > end-volume
> >
> > volume storage0-client-3
> >     type protocol/client
> >     option remote-host de-dc1-c1-pserver5
> >     option remote-subvolume /mnt/gluster/brick1/storage
> >     option transport-type rdma
> >     option ping-timeout 5
> > end-volume

Hello All-

I am seeing similar behaviour in two of my volumes, now using GlusterFS
version 3.2.2. There are files dated 1st Jan 1970 on one brick, where the
same files on the mirror brick have sensible date stamps. In the cases I
have investigated the date shown at the mount point is 1st Jan 1970.
However, unlike the problem initially reported in this thread, I have not
seen any xattr mismatches, as illustrated by the example below.

[root@bdan4 glusterfs]# ls -l behemoth/aatsr/AT2_AVG_3PAARC19951229_D_nD2b.nc
-rw-r--r-- 1 resc essc 381894 Jan 1 1970 behemoth/aatsr/AT2_AVG_3PAARC19951229_D_nD2b.nc
[root@bdan4 glusterfs]# getfattr -R -d -e hex -m "trusted.afr." behemoth/aatsr/AT2_AVG_3PAARC19951229_D_nD2b.nc
# file: behemoth/aatsr/AT2_AVG_3PAARC19951229_D_nD2b.nc
trusted.afr.marine-client-2=0x000000000000000000000000
trusted.afr.marine-client-3=0x000000000000000000000000

I have been using the following self heal method since it became the
recommended method shown in the GlusterFS documentation.

find <gluster-mount> -noleaf -print0 -name <file name> | xargs --null stat >/dev/null

Is there a better way to trigger self-healing, which would catch these
obvious modification time errors?

-Dan.

--
Mr. D.A. Bretherton
Computer System Manager
Environmental Systems Science Centre
Harry Pitt Building
3 Earley Gate
University of Reading
Reading, RG6 6AL
UK

Tel. +44 118 378 5205
Fax: +44 118 378 6413
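
On the question of catching the files that show a 1 Jan 1970 modification time, one possible starting point is to let find select them by timestamp before they are handed to stat. The following is only a sketch: it assumes GNU find (for -newermt and -print0), the function name list_epoch_mtime_files is made up for illustration, and nothing in this thread confirms that a stat through the mount point actually repairs the modification time.

```shell
#!/bin/sh
# Sketch only: list regular files under a directory whose modification time
# is not later than 1970-01-02 00:00 UTC, i.e. files that "ls -l" would show
# as dated 1 Jan 1970. Requires GNU find.
# list_epoch_mtime_files is a hypothetical helper name, not a GlusterFS tool.
list_epoch_mtime_files() {
    # "! -newermt T" keeps files whose mtime is NOT newer than T.
    # Names are NUL-terminated so they can be piped safely into xargs --null.
    find "$1" -noleaf -type f ! -newermt '1970-01-02 00:00:00 UTC' -print0
}

# Example, using the <gluster-mount> placeholder from the thread:
#   list_epoch_mtime_files <gluster-mount> | xargs --null stat >/dev/null
```

Two remarks on the design. The selection runs against the client mount point because that is where the 1970 date was reported as visible; running it against a brick directory would instead show which replica holds the bad timestamp. Also, find evaluates its expression from left to right, so in the command quoted in the thread, where -print0 comes before -name, every path is printed whatever the name test says; in the sketch the tests are placed before -print0 so that only matching files are printed.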