thr3ads.net - Gluster users - [Gluster-users] issue with self-heal [Jul 2018]


Brian Andrus

2018-Jul-13 15:50 UTC

[Gluster-users] issue with self-heal

Your message means something (usually glusterfsd) is not running properly, 
or at all, on one of the servers.
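
One way to tell which server is affected is to look at the Online column of 
`gluster volume status`. The snippet below is only a minimal sketch: 
`offline_bricks` is a hypothetical helper name, not part of the gluster CLI, 
and it assumes each brick is reported on a single line starting with `Brick`, 
with Online as the second-to-last column. Some versions wrap long brick paths 
onto a second line, which this does not handle.

```shell
#!/bin/sh
# offline_bricks: hypothetical helper, not part of the gluster CLI.
# Reads `gluster volume status <vol>` output on stdin and prints the
# host:path of every brick whose Online column is not "Y".
# Assumed line layout: Brick <host:path> <ports...> <Online> <Pid>
offline_bricks() {
    awk '$1 == "Brick" && NF >= 4 && $(NF-1) != "Y" { print $2 }'
}

# Usage (run on any server in the pool):
#   gluster volume status gv1 | offline_bricks
```

If this prints nothing, every brick is reported as online and the problem may 
lie elsewhere, for example with the self-heal daemon.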

If you can tell which server it is, you need to stop and restart glusterd 
and glusterfsd there. Note: sometimes just stopping them doesn't really stop 
them. You may need to do a killall -9 for glusterd, glusterfsd and anything 
else with "gluster" in its name.

Then just start glusterd and glusterfsd. Once they are up you should be 
able to do the heal.

If you can't tell which server it is and are able to take gluster offline 
for users for a moment, apply that process to all your brick servers.
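
The stop, kill, start and heal sequence can be sketched roughly as follows. 
This is not a definitive procedure: it assumes systemd with a unit named 
`glusterd`, uses the volume name gv1 from this thread, and relies on glusterd 
starting the brick processes when it comes up. The `run`, `restart_gluster` 
and `heal_volume` names and the `DRY_RUN` variable are made up for this 
example. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the restart-then-heal sequence. Assumes systemd, a unit named
# "glusterd", and root privileges on the affected brick server.
# DRY_RUN defaults to 1 so commands are only printed; set DRY_RUN=0 to
# actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restart_gluster() {
    run systemctl stop glusterd
    # Stopping the service can leave processes behind, so kill leftovers.
    # Note: killing "glusterfs" also takes down any local FUSE mounts.
    # killall exits non-zero when a name has no running process; ignore that.
    run killall -9 glusterd glusterfsd glusterfs || true
    run systemctl start glusterd
    # Ask glusterd to start any brick process that is still down.
    run gluster volume start "$1" force
}

heal_volume() {
    run gluster volume status "$1"      # confirm every brick shows Online = Y
    run gluster volume heal "$1"        # trigger an index heal
    run gluster volume heal "$1" info   # list entries still pending heal
}

# Usage:
#   restart_gluster gv1 && heal_volume gv1
```

Review the printed commands first, then rerun with `DRY_RUN=0` once they look 
right for your distribution.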

Brian Andrus


On 7/13/2018 10:55 AM, hsafe wrote:
> Hello Gluster community,
>
> After several hundred GB of data writes (small images, 100k < size < 1M)
> into a replicated volume on 2 glusterfs servers, I am facing an issue 
> with the healing process. Earlier, heal info returned the bricks and 
> nodes and showed no failed heals, but now it returns the message 
> below:
>
> *# gluster volume heal gv1 info healed*
>
> *Gathering list of heal failed entries on volume gv1 has been 
> unsuccessful on bricks that are down. Please check if all brick 
> processes are running.*
>
> Issuing the heal info command gives a long list of gfid entries that 
> takes about an hour to complete. The file data, being images, does not 
> change and is primarily served from 8 servers that mount the volume 
> with the native glusterfs client.
>
> Here is some insight on the status of the gluster. How can I 
> effectively do a successful heal on the storage? The last few times I 
> tried, the servers went south and became unresponsive.
>
> *# gluster volume info
>
> Volume Name: gv1
> Type: Replicate
> Volume ID: f1c955a1-7a92-4b1b-acb5-8b72b41aaace
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: IMG-01:/images/storage/brick1
> Brick2: IMG-02:/images/storage/brick1
> Options Reconfigured:
> performance.md-cache-timeout: 128
> cluster.background-self-heal-count: 32
> server.statedump-path: /tmp
> performance.readdir-ahead: on
> nfs.disable: true
> network.inode-lru-limit: 50000
> features.bitrot: off
> features.scrub: Inactive
> performance.cache-max-file-size: 16MB
> client.event-threads: 8
> cluster.eager-lock: on*
>
> Appreciate your help. Thanks

hsafe

2018-Jul-13 17:55 UTC


[Gluster-users] issue with self-heal

Hello Gluster community,

After several hundred GB of data writes (small images, 100k < size < 1M) 
into a replicated volume on 2 glusterfs servers, I am facing an issue with 
the healing process. Earlier, heal info returned the bricks and nodes and 
showed no failed heals, but now it returns the message below:

*# gluster volume heal gv1 info healed*

*Gathering list of heal failed entries on volume gv1 has been 
unsuccessful on bricks that are down. Please check if all brick 
processes are running.*

Issuing the heal info command gives a long list of gfid entries that takes 
about an hour to complete. The file data, being images, does not change 
and is primarily served from 8 servers that mount the volume with the 
native glusterfs client.

Here is some insight on the status of the gluster. How can I effectively 
do a successful heal on the storage? The last few times I tried, the 
servers went south and became unresponsive.

*# gluster volume info

Volume Name: gv1
Type: Replicate
Volume ID: f1c955a1-7a92-4b1b-acb5-8b72b41aaace
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: IMG-01:/images/storage/brick1
Brick2: IMG-02:/images/storage/brick1
Options Reconfigured:
performance.md-cache-timeout: 128
cluster.background-self-heal-count: 32
server.statedump-path: /tmp
performance.readdir-ahead: on
nfs.disable: true
network.inode-lru-limit: 50000
features.bitrot: off
features.scrub: Inactive
performance.cache-max-file-size: 16MB
client.event-threads: 8
cluster.eager-lock: on*

Appreciate your help. Thanks
