Hi Strahil, today we have the same number of clients on all nodes, but the problem persists. I have the impression that it gets more frequent as the server capacity fills up; now we are having at least one incident per day.
Regards,
Martin

On Mon, Oct 26, 2020 at 8:09 AM Martín Lorenzo <mlorenzo at gmail.com> wrote:
> Hi Strahil, thanks for your reply,
> I had one node with 13 clients, the rest with 14. I've just restarted the
> services on that node, now I have 14, let's see what happens.
> Regarding the samba repos, I wasn't aware of that, I was using the CentOS
> main repo. I'll check them out.
> Best Regards,
> Martin
>
> On Tue, Oct 20, 2020 at 3:19 PM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>> Do you have the same amount of clients connected to each brick?
>>
>> I guess something like this can show it:
>>
>> gluster volume status VOL clients
>> gluster volume status VOL client-list
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Tuesday, 20 October 2020 at 15:41:45 GMT+3, Martín Lorenzo <mlorenzo at gmail.com> wrote:
>>
>> Hi, I have the following problem: I have a distributed replicated cluster
>> set up with samba and CTDB, over FUSE mount points.
>> I am having inconsistencies across the FUSE mounts; users report that
>> files are disappearing after being copied/moved. When I take a look at the
>> mount points on each node, they don't display the same data.
>>
>> #### faulty mount point ####
>> [root at gluster6 ARRIBA GENTE martes 20 de octubre]# ll
>> ls: cannot access PANEO VUELTA A CLASES CON TAPABOCAS.mpg: No such file or directory
>> ls: cannot access PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg: No such file or directory
>> total 633723
>> drwxr-xr-x. 5 arribagente PN        4096 Oct 19 10:52 COMERCIAL AG martes 20 de octubre
>> -rw-r--r--. 1 arribagente PN   648927236 Jun  3 07:16 PANEO FACHADA PALACIO LEGISLATIVO DRONE DIA Y NOCHE.mpg
>> -?????????? ? ?           ?            ?            ? PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg
>> -?????????? ? ?           ?            ?            ? PANEO VUELTA A CLASES CON TAPABOCAS.mpg
>>
>> ### healthy mount point ###
>> [root at gluster7 ARRIBA GENTE martes 20 de octubre]# ll
>> total 3435596
>> drwxr-xr-x. 5 arribagente PN         4096 Oct 19 10:52 COMERCIAL AG martes 20 de octubre
>> -rw-r--r--. 1 arribagente PN    648927236 Jun  3 07:16 PANEO FACHADA PALACIO LEGISLATIVO DRONE DIA Y NOCHE.mpg
>> -rw-r--r--. 1 arribagente PN   2084415492 Aug 18 09:14 PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg
>> -rw-r--r--. 1 arribagente PN    784701444 Sep  4 07:23 PANEO VUELTA A CLASES CON TAPABOCAS.mpg
>>
>> - So far the only way to solve this is to create a directory in the
>> healthy mount point, on the same path:
>> [root at gluster7 ARRIBA GENTE martes 20 de octubre]# mkdir hola
>>
>> - When you refresh the other mount point, the issue is resolved:
>> [root at gluster6 ARRIBA GENTE martes 20 de octubre]# ll
>> total 3435600
>> drwxr-xr-x. 5 arribagente PN         4096 Oct 19 10:52 COMERCIAL AG martes 20 de octubre
>> drwxr-xr-x. 2 root        root       4096 Oct 20 08:45 hola
>> -rw-r--r--. 1 arribagente PN    648927236 Jun  3 07:16 PANEO FACHADA PALACIO LEGISLATIVO DRONE DIA Y NOCHE.mpg
>> -rw-r--r--. 1 arribagente PN   2084415492 Aug 18 09:14 PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg
>> -rw-r--r--. 1 arribagente PN    784701444 Sep  4 07:23 PANEO VUELTA A CLASES CON TAPABOCAS.mpg
>>
>> Interestingly, the error occurs on the mount point where the files were
>> copied. They don't show up as pending heal entries. I have around 15 people
>> using them over samba; I think I'm having this issue reported every two days.
>>
>> I have an older cluster with similar issues, a different gluster version,
>> but a very similar topology (4 bricks, initially two bricks then expanded).
>> Please note, the bricks aren't the same size (but their replicas are),
>> so my other suspicion is that rebalancing has something to do with it.
>>
>> I'm trying to reproduce it over a small virtualized cluster; so far no
>> results.
>>
>> Here are the cluster details:
>> four nodes, replica 2, plus one arbiter hosting 2 bricks.
>> I have 2 bricks with ~20 TB capacity and the other pair is ~48 TB.
>>
>> Volume Name: tapeless
>> Type: Distributed-Replicate
>> Volume ID: 53bfa86d-b390-496b-bbd7-c4bba625c956
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 2 x (2 + 1) = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster6.glustersaeta.net:/data/glusterfs/tapeless/brick_6/brick
>> Brick2: gluster7.glustersaeta.net:/data/glusterfs/tapeless/brick_7/brick
>> Brick3: kitchen-store.glustersaeta.net:/data/glusterfs/tapeless/brick_1a/brick (arbiter)
>> Brick4: gluster12.glustersaeta.net:/data/glusterfs/tapeless/brick_12/brick
>> Brick5: gluster13.glustersaeta.net:/data/glusterfs/tapeless/brick_13/brick
>> Brick6: kitchen-store.glustersaeta.net:/data/glusterfs/tapeless/brick_2a/brick (arbiter)
>> Options Reconfigured:
>> features.quota-deem-statfs: on
>> performance.client-io-threads: on
>> nfs.disable: on
>> transport.address-family: inet
>> features.quota: on
>> features.inode-quota: on
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.cache-samba-metadata: on
>> performance.stat-prefetch: on
>> performance.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> network.inode-lru-limit: 200000
>> performance.nl-cache: on
>> performance.nl-cache-timeout: 600
>> performance.readdir-ahead: on
>> performance.parallel-readdir: on
>> performance.cache-size: 1GB
>> client.event-threads: 4
>> server.event-threads: 4
>> performance.normal-prio-threads: 16
>> performance.io-thread-count: 32
>> performance.write-behind-window-size: 8MB
>> storage.batch-fsync-delay-usec: 0
>> cluster.data-self-heal: on
>> cluster.metadata-self-heal: on
>> cluster.entry-self-heal: on
>> cluster.self-heal-daemon: on
>> performance.write-behind: on
>> performance.open-behind: on
>>
>> Log section from the faulty mount point. I think the [File exists] entries
>> are from people trying to copy the missing files over and over:
>>
>> [2020-10-20 11:31:03.034220] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:32:06.684329] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:33:02.191863] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:34:05.841608] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:35:20.736633] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-tapeless-replicate-1: performing metadata selfheal on 958dbd7a-3cd7-4b66-9038-76e5c5669644
>> [2020-10-20 11:35:20.741213] I [MSGID: 108026] [afr-self-heal-common.c:1750:afr_log_selfheal] 0-tapeless-replicate-1: Completed metadata selfheal on 958dbd7a-3cd7-4b66-9038-76e5c5669644. sources=[0] 1  sinks=2
>> [2020-10-20 11:35:04.278043] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> The message "I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-tapeless-replicate-1: performing metadata selfheal on 958dbd7a-3cd7-4b66-9038-76e5c5669644" repeated 3 times between [2020-10-20 11:35:20.736633] and [2020-10-20 11:35:26.733298]
>> The message "I [MSGID: 108026] [afr-self-heal-common.c:1750:afr_log_selfheal] 0-tapeless-replicate-1: Completed metadata selfheal on 958dbd7a-3cd7-4b66-9038-76e5c5669644. sources=[0] 1  sinks=2" repeated 3 times between [2020-10-20 11:35:20.741213] and [2020-10-20 11:35:26.737629]
>> [2020-10-20 11:36:02.548350] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:36:57.365537] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-tapeless-replicate-1: performing metadata selfheal on f4907af2-1775-4c46-89b5-e9776df6d5c7
>> [2020-10-20 11:36:57.370824] I [MSGID: 108026] [afr-self-heal-common.c:1750:afr_log_selfheal] 0-tapeless-replicate-1: Completed metadata selfheal on f4907af2-1775-4c46-89b5-e9776df6d5c7. sources=[0] 1  sinks=2
>> [2020-10-20 11:37:01.363925] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-tapeless-replicate-1: performing metadata selfheal on f4907af2-1775-4c46-89b5-e9776df6d5c7
>> [2020-10-20 11:37:01.368069] I [MSGID: 108026] [afr-self-heal-common.c:1750:afr_log_selfheal] 0-tapeless-replicate-1: Completed metadata selfheal on f4907af2-1775-4c46-89b5-e9776df6d5c7. sources=[0] 1  sinks=2
>> The message "I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0" repeated 3 times between [2020-10-20 11:36:02.548350] and [2020-10-20 11:37:36.389208]
>> [2020-10-20 11:38:07.367113] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:39:01.595981] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:40:04.184899] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:41:07.833470] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:42:01.871621] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:43:04.399194] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:44:04.558647] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:44:15.953600] W [MSGID: 114031] [client-rpc-fops_v2.c:2114:client4_0_create_cbk] 0-tapeless-client-5: remote operation failed. Path: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg [File exists]
>> [2020-10-20 11:44:15.953819] W [MSGID: 114031] [client-rpc-fops_v2.c:2114:client4_0_create_cbk] 0-tapeless-client-2: remote operation failed. Path: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg [File exists]
>> [2020-10-20 11:44:15.954072] W [MSGID: 114031] [client-rpc-fops_v2.c:2114:client4_0_create_cbk] 0-tapeless-client-3: remote operation failed. Path: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg [File exists]
>> [2020-10-20 11:44:15.954680] W [fuse-bridge.c:2606:fuse_create_cbk] 0-glusterfs-fuse: 31043294: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg => -1 (File exists)
>> [2020-10-20 11:44:15.963175] W [fuse-bridge.c:2606:fuse_create_cbk] 0-glusterfs-fuse: 31043306: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg => -1 (File exists)
>> [2020-10-20 11:44:15.971839] W [fuse-bridge.c:2606:fuse_create_cbk] 0-glusterfs-fuse: 31043318: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg => -1 (File exists)
>> [2020-10-20 11:44:16.010242] W [fuse-bridge.c:2606:fuse_create_cbk] 0-glusterfs-fuse: 31043403: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg => -1 (File exists)
>> [2020-10-20 11:44:16.020291] W [fuse-bridge.c:2606:fuse_create_cbk] 0-glusterfs-fuse: 31043415: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg => -1 (File exists)
>> [2020-10-20 11:44:16.028857] W [fuse-bridge.c:2606:fuse_create_cbk] 0-glusterfs-fuse: 31043427: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg => -1 (File exists)
>> The message "W [MSGID: 114031] [client-rpc-fops_v2.c:2114:client4_0_create_cbk] 0-tapeless-client-5: remote operation failed. Path: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg [File exists]" repeated 5 times between [2020-10-20 11:44:15.953600] and [2020-10-20 11:44:16.027785]
>> The message "W [MSGID: 114031] [client-rpc-fops_v2.c:2114:client4_0_create_cbk] 0-tapeless-client-2: remote operation failed. Path: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg [File exists]" repeated 5 times between [2020-10-20 11:44:15.953819] and [2020-10-20 11:44:16.028331]
>> The message "W [MSGID: 114031] [client-rpc-fops_v2.c:2114:client4_0_create_cbk] 0-tapeless-client-3: remote operation failed. Path: /PN/arribagente/PLAYER 2020/ARRIBA GENTE martes 20 de octubre/PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg [File exists]" repeated 5 times between [2020-10-20 11:44:15.954072] and [2020-10-20 11:44:16.028355]
>> [2020-10-20 11:45:03.572106] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:45:40.080010] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> The message "I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0" repeated 2 times between [2020-10-20 11:45:40.080010] and [2020-10-20 11:47:10.871801]
>> [2020-10-20 11:48:03.913129] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:49:05.082165] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:50:06.725722] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:51:04.254685] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:52:07.903617] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:53:01.420513] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-tapeless-replicate-0: performing metadata selfheal on 3c316533-5f47-4267-ac19-58b3be305b94
>> [2020-10-20 11:53:01.428657] I [MSGID: 108026] [afr-self-heal-common.c:1750:afr_log_selfheal] 0-tapeless-replicate-0: Completed metadata selfheal on 3c316533-5f47-4267-ac19-58b3be305b94. sources=[0]  sinks=1 2
>> The message "I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0" repeated 3 times between [2020-10-20 11:52:07.903617] and [2020-10-20 11:53:12.037835]
>> [2020-10-20 11:54:02.208354] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:55:04.360284] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:56:09.508092] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:57:02.580970] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>> [2020-10-20 11:58:06.230698] I [MSGID: 108031] [afr-common.c:2581:afr_local_discovery_cbk] 0-tapeless-replicate-0: selecting local read_child tapeless-client-0
>>
>> Let me know if you need something else. Thank you for your support!
>> Best Regards,
>> Martin Lorenzo
>>
>> ________
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
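The symptom described above is the same directory listing differently on two FUSE mounts, so it can be caught mechanically instead of waiting for user reports. Below is a minimal sketch of such a check; the function name and the example mount paths are illustrative, not from the thread:

```shell
#!/usr/bin/env bash
# compare_listings: print "OK" if two directories expose identical entry
# names, otherwise print a MISMATCH line followed by the differing names.
# Intended to be run against the same path under two FUSE mounts.
compare_listings() {
    local a="$1" b="$2"
    # Sort both listings so ordering differences don't cause false alarms.
    if diff <(ls -1A -- "$a" | sort) <(ls -1A -- "$b" | sort) >/dev/null; then
        echo "OK: listings match"
    else
        echo "MISMATCH between $a and $b"
        diff <(ls -1A -- "$a" | sort) <(ls -1A -- "$b" | sort) || true
    fi
}

# Hypothetical usage, e.g. from a cron job on one node with both volumes
# reachable (paths are placeholders):
#   compare_listings /mnt/tapeless_gluster6/SOMEDIR /mnt/tapeless_gluster7/SOMEDIR
```

Requires bash (process substitution). A cron wrapper could mail the MISMATCH output, which would also give a timestamped record of when the divergence appears relative to copies and rebalances.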
Have you tried to reduce the cache timeouts? I can't find your gluster version in the thread - can you share again OS + gluster version?

Best Regards,
Strahil Nikolov
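For reference, the timeouts Strahil is asking about correspond to the three 600-second values in the volume options posted earlier (performance.md-cache-timeout, performance.nl-cache-timeout, features.cache-invalidation-timeout). A sketch of dialing them down with `gluster volume set`; the option names come from the posted volume info, but the values here are illustrative, not a recommendation from the thread:

```shell
# Lower the 600 s cache windows shown in the volume info. A shorter window
# narrows the time during which a stale cached (negative) lookup can hide
# a freshly copied file on another mount. Illustrative values only.
gluster volume set tapeless performance.md-cache-timeout 60
gluster volume set tapeless performance.nl-cache-timeout 60
gluster volume set tapeless features.cache-invalidation-timeout 60

# If stale negative lookups are suspected specifically, disabling the
# negative-lookup cache entirely is a common isolation test:
# gluster volume set tapeless performance.nl-cache off
```

These are cluster-wide settings with a performance cost for Samba workloads, so they are best treated as a diagnostic step rather than a permanent fix.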