Joe Julian wrote on 28/01/2016 18:28:
> Response inline
Ditto. And thanks BTW. :-).
> On 01/28/2016 07:28 AM, Ronny Adsetts wrote:
>> Hi all,
>>
>> Have an issue I'm having trouble explaining or getting to the bottom of.
>> I have a two node, two brick replicated 75G volume containing ~4661 files:
>>
>> gotham:~# gluster volume status software
>> Status of volume: software
>> Gluster process                                         Port    Online  Pid
>> ------------------------------------------------------------------------------
>> Brick gotham.stor.graysofwestminster.co.uk:/data/gluste
>> rfs/software/brick1/brick                               49152   Y       27296
>> Brick metropolis.stor.graysofwestminster.co.uk:/data/gl
>> usterfs/software/brick1/brick                           49152   Y       30335
>> NFS Server on localhost                                 2049    Y       27309
>> Self-heal Daemon on localhost                           N/A     Y       27316
>> NFS Server on metropolis.stor.graysofwestminster.co.uk  2049    Y       30348
>> Self-heal Daemon on metropolis.stor.graysofwestminster.
>> co.uk                                                   N/A     Y       30355
>>
>> Task Status of Volume software
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>>
>> gotham:~# gluster volume heal software info
>> Brick gotham.graysofwestminster.co.uk:/data/glusterfs/software/brick1/brick/
>> Number of entries: 0
>>
>> Brick metropolis.graysofwestminster.co.uk:/data/glusterfs/software/brick1/brick/
>> Number of entries: 0
>>
>>
>> I have the volume mounted on both nodes at /stor/software. On one of the
>> nodes, lots of files show a link count of 0, which is causing validation
>> issues for the Windows software that uses the data:
>>
>> metropolis:~# ls -ltr /stor/software/win_patches/ | head
>> total 21585884
>> -rwxr--r-- 0 ainet Domain Admins 27816 Jan 4 2006 nullpatch.exe
>> -rwxrwxr-- 0 ronny Domain Admins 135477136 Oct 29 2006 W2KA_SP4_128_x86_ENU.exe
>> -rwxrwxr-- 0 ronny Domain Admins 432568 Oct 29 2006 MSXML26SP3.exe
>> -rwxrwxr-- 0 ronny Domain Admins 1070592 Oct 29 2006 msxml3sp7.msi
>> -rwxrwxr-- 0 ronny Domain Admins 711160 Oct 29 2006 Windows2000-KB842773.EXE
>> -rwxrwxr-- 0 ronny Domain Admins 760824 Oct 29 2006 Windows2000-KB842933.EXE
>> -rwxrwxr-- 0 ronny Domain Admins 2585872 Oct 29 2006 WindowsInstaller-KB893803v2.exe
>> -rwxrwxr-- 0 ronny Domain Admins 106240 Oct 29 2006 Windows-KB870669.exe
>> -rwxrwxr-- 0 ronny Domain Admins 1260024 Oct 29 2006 Windows2000-KB908506.EXE
>>
>>
>> From the other node:
>>
>> gotham:~# ls -ltr /stor/software/win_patches/ | head
>> total 21585884
>> -rwxr--r-- 1 1066 1016 27816 Jan 4 2006 nullpatch.exe
>> -rwxrwxr-- 1 1045 1016 135477136 Oct 29 2006 W2KA_SP4_128_x86_ENU.exe
>> -rwxrwxr-- 1 1045 1016 432568 Oct 29 2006 MSXML26SP3.exe
>> -rwxrwxr-- 1 1045 1016 1070592 Oct 29 2006 msxml3sp7.msi
>> -rwxrwxr-- 1 1045 1016 711160 Oct 29 2006 Windows2000-KB842773.EXE
>> -rwxrwxr-- 1 1045 1016 760824 Oct 29 2006 Windows2000-KB842933.EXE
>> -rwxrwxr-- 1 1045 1016 2585872 Oct 29 2006 WindowsInstaller-KB893803v2.exe
>> -rwxrwxr-- 1 1045 1016 106240 Oct 29 2006 Windows-KB870669.exe
>> -rwxrwxr-- 1 1045 1016 1260024 Oct 29 2006 Windows2000-KB908506.EXE
>>
>>
>> I'm seeing the following and lots more similar in the logs:
>>
>> [2016-01-28 15:13:15.099478] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-software-client-1: remote operation failed: No such file or directory. Path: /win_patches/SkypeSetup71732106.msi (3489f648-9fde-4bcc-b906-6ef88ffcf90f)
>
> client-1 is brick2, metropolis. That error says the file is missing, but it
> also shows the gfid. Perhaps the gfid hard link (in the .glusterfs tree) is
> missing.
>
> Check the metadata for the file on both bricks:
> getfattr -m . -d -e hex /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
>
> Ensure the gfid is the same on both. Check the number of links on both;
> there should be 2. With a gfid of 3489f648-9fde-4bcc-b906-6ef88ffcf90f, that
> file should be hard linked to
> .glusterfs/34/89/3489f648-9fde-4bcc-b906-6ef88ffcf90f on both bricks.
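
The layout described above (the first two hex bytes of the gfid name two directory levels under .glusterfs, and the link itself is named after the dashed gfid) can be sketched as a small POSIX shell helper. This is only an illustration: gfid_link_path is a hypothetical name, not a Gluster command, and it only builds the expected path without touching the brick.

```shell
# Hypothetical helper (not part of Gluster): print the path where the
# gfid hard link for a file is expected to live under a brick.
# $1 = brick directory, $2 = gfid, either dashed or as the 0x... hex
# value that getfattr prints for trusted.gfid.
# Note: POSIX sh has no "local", so the variables below are global.
gfid_link_path() {
    brick=${1%/}
    # Normalise: drop a leading 0x, drop dashes, lowercase the hex digits.
    hex=$(printf '%s\n' "$2" | sed -e 's/^0x//' -e 's/-//g' | tr 'A-F' 'a-f')
    case "$hex" in
        *[!0-9a-f]*) echo "not a hex gfid: $2" >&2; return 1 ;;
    esac
    if [ "${#hex}" -ne 32 ]; then
        echo "gfid must be 32 hex digits: $2" >&2
        return 1
    fi
    # Re-insert the dashes in the usual 8-4-4-4-12 grouping.
    uuid=$(printf '%s\n' "$hex" |
        sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/')
    d1=$(printf '%s\n' "$hex" | cut -c1-2)
    d2=$(printf '%s\n' "$hex" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$uuid"
}
```

For the file in this thread, `gfid_link_path /data/glusterfs/software/brick1/brick 0x3489f6489fde4bccb9066ef88ffcf90f` yields the .glusterfs/34/89/3489f648-9fde-4bcc-b906-6ef88ffcf90f path under that brick. A gfid with a digit missing is rejected rather than mapped to a wrong directory.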
"Good" brick:
gotham:~# getfattr -m . -d -e hex /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
trusted.afr.software-client-0=0x000000000000000000000000
trusted.afr.software-client-1=0x000000000000000000000000
trusted.gfid=0x3489f6489fde4bccb9066ef88ffcf90f

gotham:~# ls -l /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
-rwxr--r-- 2 1066 1016 44380160 Dec 31 11:41 /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi

gotham:~# ls -i /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
114563677 /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi

gotham:~# find /data/glusterfs/software -inum 114563677
/data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
/data/glusterfs/software/brick1/brick/.glusterfs/34/89/3489f648-9fde-4bcc-b906-6ef88ffcf90f

"Bad" brick:
metropolis:~# getfattr -m . -d -e hex /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
trusted.afr.software-client-0=0x000000000000000000000000
trusted.afr.software-client-1=0x000000000000000000000000
trusted.gfid=0x3489f6489fde4bccb9066ef88ffcf90f

metropolis:~# ls -l /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
-rwxr--r-- 1 ainet Domain Admins 44380160 Dec 31 11:41 /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi

metropolis:~# ls -i /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
132120540 /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi

metropolis:~# find /data/glusterfs/software -inum 132120540
/data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi

So we're missing the file in .glusterfs.
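
Since the file on the good brick has two links (the named file plus its .glusterfs entry) and the one on the bad brick has only one, regular files with a link count of 1 are candidates for the same problem. A rough sketch for listing them follows. It assumes it is run against the brick directory rather than the mount, and find_single_link is a made-up helper name. It only reports and changes nothing.

```shell
# Hypothetical helper: list regular files under a brick that have exactly
# one hard link, i.e. no second link from the .glusterfs tree.
# $1 = brick directory. The .glusterfs tree itself is skipped.
find_single_link() {
    brick=${1%/}
    find "$brick" -path "$brick/.glusterfs" -prune -o \
        -type f -links 1 -print
}
```

Something like `find_single_link /data/glusterfs/software/brick1/brick | wc -l` on each node would give a quick count to compare. A hit is only a hint that the gfid link is absent; checking trusted.gfid with getfattr, as above, is still needed to confirm.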
The magic question, I guess, is: do you happen to know how to put the node
back together?
Thanks.
Ronny
--
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com
Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957