On 11/16/2017 04:12 PM, Nithya Balachandran wrote:
>
> On 15 November 2017 at 19:57, Frederic Harmignies
> <frederic.harmignies at elementai.com> wrote:
>
> Hello, we have 2x files that are missing from one of the bricks.
> No idea how to fix this.
>
> Details:
>
> # gluster volume info
> Volume Name: data01
> Type: Replicate
> Volume ID: 39b4479c-31f0-4696-9435-5454e4f8d310
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.186.11:/mnt/AIDATA/data
> Brick2: 192.168.186.12:/mnt/AIDATA/data
> Options Reconfigured:
> performance.cache-refresh-timeout: 30
> client.event-threads: 16
> server.event-threads: 32
> performance.readdir-ahead: off
> performance.io-thread-count: 32
> performance.cache-size: 32GB
> transport.address-family: inet
> nfs.disable: on
> features.trash: off
> features.trash-max-filesize: 500MB
>
> # gluster volume heal data01 info
> Brick 192.168.186.11:/mnt/AIDATA/data
> Status: Connected
> Number of entries: 0
>
> Brick 192.168.186.12:/mnt/AIDATA/data
> <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54>
> <gfid:9612ecd2-106d-42f2-95eb-fef495c1d8ab>
> Status: Connected
> Number of entries: 2
>
> # gluster volume heal data01 info split-brain
> Brick 192.168.186.11:/mnt/AIDATA/data
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick 192.168.186.12:/mnt/AIDATA/data
> Status: Connected
> Number of entries in split-brain: 0
>
> Both files are missing from the folder on Brick1; the gfid files
> are also missing in the .gluster folder on that same Brick1.
> Brick2 has both the files and the gfid file in .gluster.
>
> We already tried:
>
> #gluster heal volume full
> Running a stat and ls -l on both files from a mounted client to
> try and trigger a heal
>
> Would a re-balance fix this? Any guidance would be greatly
> appreciated!
>
> A rebalance would not help here as this is a replicate volume. Ravi,
> any idea what could be going wrong here?

No, an explicit lookup should have healed the file on the missing brick,
unless the lookup did not hit afr and was served from the caching translators.
Frederic, what version of gluster are you running? Can you launch
'gluster heal volume' and check the glustershd logs for possible warnings?
Use the DEBUG client-log-level if you have to. Also, instead of stat, try a
getfattr on the file from the mount.

-Ravi

> Regards,
> Nithya
>
> Thank you in advance!
>
> --
> *Frederic Harmignies*
> *High Performance Computer Administrator*
>
> www.elementai.com
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
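[Editor's note: a minimal sketch of the commands Ravi is suggesting. Only the
volume name data01 and the glustershd log path come from this thread; the
mount point and file path below are placeholders to adjust for your setup.]

    # Placeholder client mount and file; only the volume name data01 is from the thread.
    MOUNT=/mnt/data01
    FILE="$MOUNT/path/to/missing-file"

    # Look the file up via getfattr from the mount instead of stat
    getfattr -d -m . -e hex "$FILE"

    # Raise client-side logging to DEBUG while reproducing, then lower it again
    gluster volume set data01 diagnostics.client-log-level DEBUG
    # ... reproduce the lookup ...
    gluster volume set data01 diagnostics.client-log-level INFO

    # Kick off heals and watch the self-heal daemon log for warnings
    gluster volume heal data01
    gluster volume heal data01 full
    gluster volume heal data01 info
    tail -f /var/log/glusterfs/glustershd.log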
Frederic Harmignies
2017-Nov-16 15:13 UTC
[Gluster-users] Missing files on one of the bricks
Hello, we are using glusterfs 3.10.3.

We currently have a gluster heal volume full running; the crawl is still in
progress.

Starting time of crawl: Tue Nov 14 15:58:35 2017
Crawl is in progress
Type of crawl: FULL
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0

getfattr from both files:

# getfattr -d -m . -e hex /mnt/AIDATA/data//ishmaelb/experiments/omie/omieali/cifar10/donsker_grad_reg_ali_dcgan_stat_dcgan_ac_True/omieali_cifar10_zdim_100_enc_dcgan_dec_dcgan_stat_dcgan_posterior_propagated_enc_beta1.0_dec_beta_1.0_info_metric_donsker_varadhan_info_lam_0.334726025306_222219-23_10_17/data/data_gen_iter_86000.pkl
getfattr: Removing leading '/' from absolute path names
# file: mnt/AIDATA/data//ishmaelb/experiments/omie/omieali/cifar10/donsker_grad_reg_ali_dcgan_stat_dcgan_ac_True/omieali_cifar10_zdim_100_enc_dcgan_dec_dcgan_stat_dcgan_posterior_propagated_enc_beta1.0_dec_beta_1.0_info_metric_donsker_varadhan_info_lam_0.334726025306_222219-23_10_17/data/data_gen_iter_86000.pkl
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.data01-client-0=0x000000000000000100000000
trusted.gfid=0x7e8513f4d4e24e66b0ba2dbe4c803c54

# getfattr -d -m . -e hex /mnt/AIDATA/data/home/allac/experiments/171023_105655_mini_imagenet_projection_size_mixing_depth_num_filters_filter_size_block_depth_Explore\ architecture\ capacity/Explore\ architecture\ capacity\(projection_size\=32\;mixing_depth\=0\;num_filters\=64\;filter_size\=3\;block_depth\=3\)/model.ckpt-70001.data-00000-of-00001.tempstate1629411508065733704
getfattr: Removing leading '/' from absolute path names
# file: mnt/AIDATA/data/home/allac/experiments/171023_105655_mini_imagenet_projection_size_mixing_depth_num_filters_filter_size_block_depth_Explore architecture capacity/Explore architecture capacity(projection_size=32;mixing_depth=0;num_filters=64;filter_size=3;block_depth=3)/model.ckpt-70001.data-00000-of-00001.tempstate1629411508065733704
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.data01-client-0=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005979d278000af1e7
trusted.gfid=0x9612ecd2106d42f295ebfef495c1d8ab

# gluster volume heal data01
Launching heal operation to perform index self heal on volume data01 has been successful
Use heal info commands to check status

# cat /var/log/glusterfs/glustershd.log
[2017-11-12 08:39:01.907287] I [glusterfsd-mgmt.c:1789:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2017-11-15 08:18:02.084766] I [MSGID: 100011] [glusterfsd.c:1414:reincarnate] 0-glusterfsd: Fetching the volume file from server...
[2017-11-15 08:18:02.085718] I [glusterfsd-mgmt.c:1789:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2017-11-15 19:13:42.005307] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]
The message "W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]" repeated 5 times between [2017-11-15 19:13:42.005307] and [2017-11-15 19:13:42.166579]
[2017-11-15 19:23:43.041956] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]
The message "W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]" repeated 5 times between [2017-11-15 19:23:43.041956] and [2017-11-15 19:23:43.235831]
[2017-11-15 19:30:22.726808] W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]
The message "W [MSGID: 114031] [client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-data01-client-0: remote operation failed. Path: <gfid:7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54> (7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54) [No such file or directory]" repeated 4 times between [2017-11-15 19:30:22.726808] and [2017-11-15 19:30:22.827631]
[2017-11-16 15:04:34.102010] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab
[2017-11-16 15:04:34.186781] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab. sources=[1] sinks=0
[2017-11-16 15:04:38.776070] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed data selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1] sinks=0
[2017-11-16 15:04:38.811744] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54
[2017-11-16 15:04:38.867474] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1] sinks=0
--
*Frederic Harmignies*
*High Performance Computer Administrator*

www.elementai.com
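[Editor's note: a hedged aside on reading the trusted.afr.* values in the
getfattr output above. As far as I understand the afr changelog layout, the
24 hex digits are three big-endian 32-bit counters for pending data, metadata
and entry operations blamed on that client. A rough bash sketch, with the
value copied from the first file above:]

    # Split a trusted.afr.<volume>-client-N changelog value into its three
    # pending counters; assumes bash for the substring and base-16 arithmetic.
    val=000000000000000100000000        # hex digits after the 0x prefix
    printf 'data pending:     %d\n' "$((16#${val:0:8}))"
    printf 'metadata pending: %d\n' "$((16#${val:8:8}))"
    printf 'entry pending:    %d\n' "$((16#${val:16:8}))"

[For that value this prints 0, 1, 0, suggesting one pending metadata operation
recorded against data01-client-0 (Brick1), which is consistent with the
metadata selfheal lines in the glustershd log above.]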
Frederic Harmignies
2017-Nov-16 18:13 UTC
[Gluster-users] Missing files on one of the bricks
Hello, looks like the full heal fixed the problem, I was just impatient :)

[2017-11-16 15:04:34.102010] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab
[2017-11-16 15:04:34.186781] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 9612ecd2-106d-42f2-95eb-fef495c1d8ab. sources=[1] sinks=0
[2017-11-16 15:04:38.776070] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed data selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1] sinks=0
[2017-11-16 15:04:38.811744] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-data01-replicate-0: performing metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54
[2017-11-16 15:04:38.867474] I [MSGID: 108026] [afr-self-heal-common.c:1255:afr_log_selfheal] 0-data01-replicate-0: Completed metadata selfheal on 7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54. sources=[1] sinks=0

# gluster volume heal data01 info
Brick 192.168.186.11:/mnt/AIDATA/data
Status: Connected
Number of entries: 0

Brick 192.168.186.12:/mnt/AIDATA/data
Status: Connected
Number of entries: 0

Thank you for your fast response!
--
*Frederic Harmignies*
*High Performance Computer Administrator*

www.elementai.com
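[Editor's note: for completeness, a hedged way to double-check on the
previously bad brick that a healed file's gfid backlink is back under
.glusterfs. The brick path and gfid below are taken from the thread; this
assumes a bash shell on the Brick1 server.]

    # Run on the Brick1 server; brick path and gfid are from the thread.
    BRICK=/mnt/AIDATA/data
    GFID=7e8513f4-d4e2-4e66-b0ba-2dbe4c803c54

    # Regular files are hardlinked under .glusterfs/<first two>/<next two>/<gfid>,
    # so after a successful heal the backlink should exist with a link count >= 2.
    stat -c '%h %n' "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
    getfattr -d -m . -e hex "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"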