Florian Leleu
2017-Jul-07 10:09 UTC
[Gluster-users] I/O error for one folder within the mountpoint
I guess you're right about the gfid, I got this:

[2017-07-07 07:35:15.197003] W [MSGID: 108008]
[afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
0-applicatif-replicate-0: GFID mismatch for
<gfid:3fa785b5-4242-4816-a452-97da1a5e45c6>/snooper
b9222041-72dd-43a3-b0ab-4169dbd9a87f on applicatif-client-1 and
60056f98-20f8-4949-a4ae-81cc1a139147 on applicatif-client-0

Can you tell me how I can fix that? If that helps, I don't mind deleting
the whole snooper folder, I have a backup.

Thanks.

On 07/07/2017 at 11:54, Ravishankar N wrote:
> What does the mount log say when you get the EIO error on snooper?
> Check if there is a gfid mismatch on the snooper directory or the files
> under it for all 3 bricks. In any case the mount log or the
> glustershd.log of the 3 nodes for the gfids you listed below should
> give you some idea on why the files aren't healed.
> Thanks.
>
> On 07/07/2017 03:10 PM, Florian Leleu wrote:
>>
>> Hi Ravi,
>>
>> thanks for your answer, sure, there you go:
>>
>> # gluster volume heal applicatif info
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>> <gfid:e3b5ef36-a635-4e0e-bd97-d204a1f8e7ed>
>> <gfid:f8030467-b7a3-4744-a945-ff0b532e9401>
>> <gfid:def47b0b-b77e-4f0e-a402-b83c0f2d354b>
>> <gfid:46f76502-b1d5-43af-8c42-3d833e86eb44>
>> <gfid:d27a71d2-6d53-413d-b88c-33edea202cc2>
>> <gfid:7e7f02b2-3f2d-41ff-9cad-cd3b5a1e506a>
>> Status: Connected
>> Number of entries: 6
>>
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>> <gfid:47ddf66f-a5e9-4490-8cd7-88e8b812cdbd>
>> <gfid:8057d06e-5323-47ff-8168-d983c4a82475>
>> <gfid:5b2ea4e4-ce84-4f07-bd66-5a0e17edb2b0>
>> <gfid:baedf8a2-1a3f-4219-86a1-c19f51f08f4e>
>> <gfid:8261c22c-e85a-4d0e-b057-196b744f3558>
>> <gfid:842b30c1-6016-45bd-9685-6be76911bd98>
>> <gfid:1fcaef0f-c97d-41e6-87cd-cd02f197bf38>
>> <gfid:9d041c80-b7e4-4012-a097-3db5b09fe471>
>> <gfid:ff48a14a-c1d5-45c6-a52a-b3e2402d0316>
>> <gfid:01409b23-eff2-4bda-966e-ab6133784001>
>> <gfid:c723e484-63fc-4267-b3f0-4090194370a0>
>> <gfid:fb1339a8-803f-4e29-b0dc-244e6c4427ed>
>> <gfid:056f3bba-6324-4cd8-b08d-bdf0fca44104>
>> <gfid:a8f6d7e5-0ff2-4747-89f3-87592597adda>
>> <gfid:3f6438a0-2712-4a09-9bff-d5a3027362b4>
>> <gfid:392c8e2f-9da4-4af8-a387-bfdfea2f404e>
>> <gfid:37e1edfd-9f58-4da3-8abe-819670c70906>
>> <gfid:15b7cdb3-aae8-4ca5-b28c-e87a3e599c9b>
>> <gfid:1d087e51-fb40-4606-8bb5-58936fb11a4c>
>> <gfid:bb0352b9-4a5e-4075-9179-05c3a5766cf4>
>> <gfid:40133fcf-a1fb-4d60-b169-e2355b66fb53>
>> <gfid:00f75963-1b4a-4d75-9558-36b7d85bd30b>
>> <gfid:2c0babdf-c828-475e-b2f5-0f44441fffdc>
>> <gfid:bbeff672-43ef-48c9-a3a2-96264aa46152>
>> <gfid:6c0969dd-bd30-4ba0-a7e5-ba4b3a972b9f>
>> <gfid:4c81ea14-56f4-4b30-8fff-c088fe4b3dff>
>> <gfid:1072cda3-53c9-4b95-992d-f102f6f87209>
>> <gfid:2e8f9f29-78f9-4402-bc0c-e63af8cf77d6>
>> <gfid:eeaa2765-44f4-4891-8502-5787b1310de2>
>> Status: Connected
>> Number of entries: 29
>>
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>> <gfid:e3b5ef36-a635-4e0e-bd97-d204a1f8e7ed>
>> <gfid:f8030467-b7a3-4744-a945-ff0b532e9401>
>> <gfid:def47b0b-b77e-4f0e-a402-b83c0f2d354b>
>> <gfid:46f76502-b1d5-43af-8c42-3d833e86eb44>
>> <gfid:d27a71d2-6d53-413d-b88c-33edea202cc2>
>> <gfid:7e7f02b2-3f2d-41ff-9cad-cd3b5a1e506a>
>> Status: Connected
>> Number of entries: 6
>>
>> # gluster volume heal applicatif info split-brain
>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> Doesn't it seem odd that the first command gives a different output?
>>
>> On 07/07/2017 at 11:31, Ravishankar N wrote:
>>> On 07/07/2017 01:23 PM, Florian Leleu wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> first time on the ML, so excuse me if I'm not following the rules
>>>> well; I'll improve if I get comments.
>>>>
>>>> We have one volume "applicatif" on three nodes (2 plus 1 arbiter);
>>>> every command below was run on node ipvr8.xxx:
>>>>
>>>> # gluster volume info applicatif
>>>>
>>>> Volume Name: applicatif
>>>> Type: Replicate
>>>> Volume ID: ac222863-9210-4354-9636-2c822b332504
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: ipvr7.xxx:/mnt/gluster-applicatif/brick
>>>> Brick2: ipvr8.xxx:/mnt/gluster-applicatif/brick
>>>> Brick3: ipvr9.xxx:/mnt/gluster-applicatif/brick (arbiter)
>>>> Options Reconfigured:
>>>> performance.read-ahead: on
>>>> performance.cache-size: 1024MB
>>>> performance.quick-read: off
>>>> performance.stat-prefetch: on
>>>> performance.io-cache: off
>>>> transport.address-family: inet
>>>> performance.readdir-ahead: on
>>>> nfs.disable: off
>>>>
>>>> # gluster volume status applicatif
>>>> Status of volume: applicatif
>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick ipvr7.xxx:/mnt/gluster-applicatif/
>>>> brick                                       49154     0          Y       2814
>>>> Brick ipvr8.xxx:/mnt/gluster-applicatif/
>>>> brick                                       49154     0          Y       2672
>>>> Brick ipvr9.xxx:/mnt/gluster-applicatif/
>>>> brick                                       49154     0          Y       3424
>>>> NFS Server on localhost                     2049      0          Y       26530
>>>> Self-heal Daemon on localhost               N/A       N/A        Y       26538
>>>> NFS Server on ipvr9.xxx                     2049      0          Y       12238
>>>> Self-heal Daemon on ipvr9.xxx               N/A       N/A        Y       12246
>>>> NFS Server on ipvr7.xxx                     2049      0          Y       2234
>>>> Self-heal Daemon on ipvr7.xxx               N/A       N/A        Y       2243
>>>>
>>>> Task Status of Volume applicatif
>>>> ------------------------------------------------------------------------------
>>>> There are no active volume tasks
>>>>
>>>> The volume is mounted with autofs (NFS) in /home/applicatif and one
>>>> folder is "broken":
>>>>
>>>> l /home/applicatif/services/
>>>> ls: cannot access /home/applicatif/services/snooper: Input/output error
>>>> total 16
>>>> lrwxrwxrwx  1 applicatif applicatif    9 Apr  6 15:53 config -> ../config
>>>> lrwxrwxrwx  1 applicatif applicatif    7 Apr  6 15:54 .pwd -> ../.pwd
>>>> drwxr-xr-x  3 applicatif applicatif 4096 Apr 12 10:24 querybuilder
>>>> d?????????  ? ?          ?             ?            ? snooper
>>>> drwxr-xr-x  3 applicatif applicatif 4096 Jul  6 02:57 snooper_new
>>>> drwxr-xr-x 16 applicatif applicatif 4096 Jul  6 02:58 snooper_old
>>>> drwxr-xr-x  4 applicatif applicatif 4096 Jul  4 23:45 ssnooper
>>>>
>>>> I checked whether there was a heal pending, and it seems so:
>>>>
>>>> # gluster volume heal applicatif statistics heal-count
>>>> Gathering count of entries to be healed on volume applicatif has
>>>> been successful
>>>>
>>>> Brick ipvr7.xxx:/mnt/gluster-applicatif/brick
>>>> Number of entries: 8
>>>>
>>>> Brick ipvr8.xxx:/mnt/gluster-applicatif/brick
>>>> Number of entries: 29
>>>>
>>>> Brick ipvr9.xxx:/mnt/gluster-applicatif/brick
>>>> Number of entries: 8
>>>>
>>>> But the folder "snooper" is actually fine in the brick on each server.
>>>>
>>>> I tried rebooting the servers and restarting gluster after killing
>>>> every process using it, but it's not working.
>>>>
>>>> Has anyone already experienced that? Any help would be nice.
>>>>
>>> Can you share the output of `gluster volume heal <volname> info` and
>>> `gluster volume heal <volname> info split-brain`? If the second
>>> command shows entries, please also share the getfattr output from
>>> the bricks for these files (getfattr -d -m . -e hex /brick/path/to/file).
>>> -Ravi
>>>>
>>>> Thanks a lot!
>>>>

--
Best regards,

Florian LELEU
Responsable Hosting, Cognix Systems
Rennes | Brest | Saint-Malo | Paris
florian.leleu at cognix-systems.com
Tél. : 02 99 27 75 92
http://www.cognix-systems.com/
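The mismatch can be confirmed directly on the bricks with the getfattr
invocation Ravi quotes above. A minimal sketch, assuming the directory
sits at services/snooper under each brick root listed in the volume info
(the exact sub-path is an assumption inferred from the mount layout):

    # run as root on each of ipvr7, ipvr8 and ipvr9; adjust the sub-path
    # if snooper lives elsewhere under the brick root
    getfattr -d -m . -e hex /mnt/gluster-applicatif/brick/services/snooper

The trusted.gfid value should be byte-for-byte identical on all three
bricks; the afr_selfheal_name_gfid_mismatch_check warning above says it
is not.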
Ravishankar N
2017-Jul-07 10:28 UTC
[Gluster-users] I/O error for one folder within the mountpoint
On 07/07/2017 03:39 PM, Florian Leleu wrote:
>
> I guess you're right about the gfid, I got this:
>
> [2017-07-07 07:35:15.197003] W [MSGID: 108008]
> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
> 0-applicatif-replicate-0: GFID mismatch for
> <gfid:3fa785b5-4242-4816-a452-97da1a5e45c6>/snooper
> b9222041-72dd-43a3-b0ab-4169dbd9a87f on applicatif-client-1 and
> 60056f98-20f8-4949-a4ae-81cc1a139147 on applicatif-client-0
>
> Can you tell me how I can fix that? If that helps, I don't mind
> deleting the whole snooper folder, I have a backup.
>

The steps listed in "Fixing Directory entry split-brain:" of
https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/
should give you an idea. That section is about files whose gfids
mismatch, but the steps are similar for directories too.

If the contents of snooper are the same on all bricks, you could also
try directly deleting the directory from one of the bricks and
immediately doing an `ls snooper` from the mount to trigger heals to
recreate the entries.

Hope this helps,
Ravi

> Thanks.
>
> [...]
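A rough sketch of what the second suggestion could look like on the
command line. Everything brick-specific here is an assumption made only
for illustration: it treats the copy on ipvr8.xxx as the one to discard
(only the administrator can decide which copy is good), reuses the gfid
reported for applicatif-client-1 in the log above (client-1 normally maps
to the second brick, ipvr8), and assumes the brick-side path
services/snooper. The linked split-brain document also removes the stale
gfid link under .glusterfs; take a backup before deleting anything.

    # on the brick chosen as "bad" (assumption: ipvr8.xxx), as root
    BRICK=/mnt/gluster-applicatif/brick
    GFID=b9222041-72dd-43a3-b0ab-4169dbd9a87f   # snooper's gfid on that brick, per the log
    rm -rf "$BRICK/services/snooper"
    rm -f "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"

    # then, from a client mount, look the directory up again so the
    # self-heal recreates it from the surviving copy
    ls -l /home/applicatif/services/
    ls /home/applicatif/services/snooper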
Florian Leleu
2017-Jul-07 13:16 UTC
[Gluster-users] I/O error for one folder within the mountpoint
Thank you Ravi, after checking the gfid within the brick I think someone
made modifications inside the brick and not through the mountpoint...
Well, I'll try to fix it, it's all in my hands.

Thanks again, have a nice day.

On 07/07/2017 at 12:28, Ravishankar N wrote:
> On 07/07/2017 03:39 PM, Florian Leleu wrote:
>>
>> I guess you're right about the gfid, I got this:
>> [...]
>>
>> Can you tell me how I can fix that? If that helps, I don't mind
>> deleting the whole snooper folder, I have a backup.
>>
> The steps listed in "Fixing Directory entry split-brain:" of
> https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/
> should give you an idea. That section is about files whose gfids
> mismatch, but the steps are similar for directories too.
> If the contents of snooper are the same on all bricks, you could also
> try directly deleting the directory from one of the bricks and
> immediately doing an `ls snooper` from the mount to trigger heals to
> recreate the entries.
> Hope this helps,
> Ravi
> [...]

--
Best regards,

Florian LELEU
Responsable Hosting, Cognix Systems
Rennes | Brest | Saint-Malo | Paris
florian.leleu at cognix-systems.com
Tél. : 02 99 27 75 92
http://www.cognix-systems.com/
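Once the directory has been recreated by self-heal, two quick checks
(same brick-path assumption as in the earlier sketches) would confirm
that the fix took:

    # the gfid should now be identical on all three bricks
    getfattr -n trusted.gfid -e hex /mnt/gluster-applicatif/brick/services/snooper

    # and the pending-heal counters seen earlier should drain back to zero
    gluster volume heal applicatif info
    gluster volume heal applicatif statistics heal-count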