Pavel Cernohorsky
2016-Nov-23 10:26 UTC
[Gluster-users] Files won't heal, although no obvious problem visible
Hello,

I have Gluster 3.8.5-1.fc24 with a replica 3 arbiter 1 volume, where "gluster volume heal <volname> info" reports (simplified):

Brick 10.10.27.11:/opt/data/hdd5/gluster
/assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
Status: Connected

Brick 10.10.27.10:/opt/data/hdd6/gluster
/assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
Status: Connected

Brick 10.10.27.12:/opt/data/ssd/arbiter8
/assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
Status: Connected

The extended attributes of that file on the bricks are (in the same order of bricks):

10.10.27.11:
getfattr -m . -d -e hex assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
# file: assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.bit-rot.version=0x0200000000000000583561b500050a4e
trusted.gfid=0x5d2793f9b2a74514937ceb1a3bca3e1f

10.10.27.10:
getfattr -m . -d -e hex assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
# file: assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.hot-client-21=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000583558e2000b1457
trusted.gfid=0x5d2793f9b2a74514937ceb1a3bca3e1f

10.10.27.12:
getfattr -m . -d -e hex assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
# file: assets/1/286381384/99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.hot-client-21=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000583440580005944b
trusted.gfid=0x5d2793f9b2a74514937ceb1a3bca3e1f

According to the vol-file, "hot-client-21" is the following brick:

option remote-subvolume /opt/data/hdd5/gluster
option remote-host 10.10.27.11

I have the self-heal daemon disabled, but when I try to trigger healing manually (gluster volume heal <volname>), I get: "Launching heal operation to perform index self heal on volume <volname> has been unsuccessful on bricks that are down. Please check if all brick processes are running.", although all the bricks are online (gluster volume status <volname>).

When I just md5sum the file, to trigger automatic healing on file access, I get a result, but the file is still not healed. This usually works when I do not get 3 entries for the same file in the heal info.

Any clues? What am I doing wrong?

Kind regards,
Pavel
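
For readers following the xattr dump above, here is a minimal Python sketch of how a trusted.afr.* value can be read. It assumes the usual AFR changelog layout of three big-endian 32-bit counters (data, metadata, entry); the function name decode_afr_changelog is made up for illustration and is not part of any Gluster tooling.

```python
import struct


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into its three counters.

    Assumption: the value is 12 bytes, laid out as three big-endian
    unsigned 32-bit integers in the order data, metadata, entry.
    """
    # getfattr -e hex prints values with a leading "0x"; strip it if present.
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output in the post.
    print(decode_afr_changelog("0x000000010000000000000000"))  # trusted.afr.dirty
    print(decode_afr_changelog("0x000000000000000000000000"))  # trusted.afr.hot-client-21
```

Read this way, trusted.afr.dirty carries a data counter of 1 on all three bricks, while every trusted.afr.hot-client-21 counter shown is zero.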
Ravishankar N
2016-Nov-23 10:55 UTC
[Gluster-users] Files won't heal, although no obvious problem visible
On 11/23/2016 03:56 PM, Pavel Cernohorsky wrote:
> The "hot-client-21" is, based on the vol-file, the following of the
> bricks:
> option remote-subvolume /opt/data/hdd5/gluster
> option remote-host 10.10.27.11
>
> I have self healing daemon disabled, but when I try to trigger healing
> manually (gluster volume heal <volname>), I get: "Launching heal
> operation to perform index self heal on volume <volname> has been
> unsuccessful on bricks that are down. Please check if all brick
> processes are running.", although all the bricks are online (gluster
> volume status <volname>).

Can you enable the self-heal daemon and try again? `gluster volume heal <volname>` requires the shd to be enabled. The error message that you get is inappropriate and is being fixed.

> When I try to just md5sum the file, to trigger automated healing on
> file manipulation, I get the result, but the file is not healed
> anyway. This usually works when I do not get 3 entries for the same
> file in the heal info.

Is the file size for 99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4 non-zero on the 2 data bricks (i.e. on 10.10.27.11 and 10.10.27.10) and do they match? Do the md5sums match with what you got on the mount when you calculate it directly on these bricks?

Thanks,
Ravi

> Any clues? What am I doing wrong?
>
> Kind regards,
> Pavel
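
The brick-side check asked about above (non-zero size, matching sizes, matching md5sums) could be scripted roughly as follows. This is only a sketch: the helper names size_and_md5 and copies_match are hypothetical, and it assumes you run the first helper locally on each data brick and on the mount, then compare the collected results by hand or with the second helper.

```python
import hashlib
import os


def size_and_md5(path, chunk_size=1 << 20):
    """Return (size in bytes, md5 hex digest) of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large media files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def copies_match(results):
    """True if every (size, md5) pair is identical and the size is non-zero."""
    return len(results) > 0 and len(set(results)) == 1 and results[0][0] > 0
```

For example, call size_and_md5 on the file under /opt/data/hdd5/gluster on 10.10.27.11, on the file under /opt/data/hdd6/gluster on 10.10.27.10, and on the same file through the client mount, then pass the three pairs to copies_match. The arbiter brick is left out because the question concerns only the 2 data bricks.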