Laurent Chouinard
2014-May-22 15:46 UTC
[Gluster-users] Unavailability during self-heal for large volumes
Hi,

Digging in the archives of this list and Bugzilla, it seems that the problem I'm about to describe has existed for a long time. However, I am unclear whether a solution was ever found, so I'd like to get some input from the users mailing list.

For a volume with a very large number of files (several million), following an outage of a node, or if we replace a brick and present it empty to the cluster, the self-heal system kicks in, which is the expected behaviour.

However, during this self-heal, system load is so high that it renders the machine unavailable for several hours until the heal completes. On certain extreme occasions it goes so far as to prevent SSH login, and at one point we even had to force a reboot to recover a minimum of usability.

Has anyone found a way to control the load of the self-heal system to a more acceptable level? My understanding is that the issue is caused by the very large number of IOPS required on every brick to enumerate all files and read their metadata flags, then copy data and write changes. The machines are quite capable of heavy I/O, since the disks are all SSDs in RAID-0 and multiple network links are bonded per machine for extra bandwidth.

I don't mind the time it takes to heal; I mind the impact healing has on other operations.

Any ideas?

Thanks

Laurent Chouinard
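(The "metadata flags" mentioned above are AFR's per-file changelog extended attributes kept on each brick. As a rough illustration of why the crawl is IOPS-heavy, roughly one directory read plus several xattr reads per file, here is a minimal sketch, assuming a hypothetical brick path and Linux xattr support from Python; it is not GlusterFS code:)

    # Minimal sketch, not GlusterFS code: walk one brick and report files whose
    # AFR changelog xattrs are non-zero, i.e. files with pending heal work.
    # Assumptions: Linux, Python 3.3+, hypothetical brick path below.
    import os

    BRICK = "/bricks/brick1"  # hypothetical brick path

    def pending_afr_xattrs(path):
        """Return the trusted.afr.* xattrs (name -> raw bytes) set on one file."""
        return {name: os.getxattr(path, name)
                for name in os.listxattr(path)
                if name.startswith("trusted.afr.")}

    for root, dirs, files in os.walk(BRICK):
        dirs[:] = [d for d in dirs if d != ".glusterfs"]  # skip the gfid hardlink tree
        for f in files:
            p = os.path.join(root, f)
            flags = pending_afr_xattrs(p)
            if any(v.strip(b"\x00") for v in flags.values()):
                print(p, flags)  # non-zero counters mean pending operations

Multiplied over several million files, that enumerate-and-read pattern is where the IOPS go, which matches the load described above.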
Pranith Kumar Karampuri
2014-May-30 06:31 UTC
[Gluster-users] Unavailability during self-heal for large volumes
----- Original Message -----
> From: "Laurent Chouinard" <laurent.chouinard at ubisoft.com>
> To: gluster-users at gluster.org
> Sent: Thursday, May 22, 2014 9:16:01 PM
> Subject: [Gluster-users] Unavailability during self-heal for large volumes
>
> Hi,
>
> Digging in the archives of this list and Bugzilla, it seems that the problem
> I'm about to describe has existed for a long time. However, I am unclear
> whether a solution was ever found, so I'd like to get some input from the
> users mailing list.
>
> For a volume with a very large number of files (several million), following
> an outage of a node, or if we replace a brick and present it empty to the
> cluster, the self-heal system kicks in, which is the expected behaviour.
>
> However, during this self-heal, system load is so high that it renders the
> machine unavailable for several hours until the heal completes. On certain
> extreme occasions it goes so far as to prevent SSH login, and at one point
> we even had to force a reboot to recover a minimum of usability.
>
> Has anyone found a way to control the load of the self-heal system to a more
> acceptable level? My understanding is that the issue is caused by the very
> large number of IOPS required on every brick to enumerate all files and read
> their metadata flags, then copy data and write changes. The machines are
> quite capable of heavy I/O, since the disks are all SSDs in RAID-0 and
> multiple network links are bonded per machine for extra bandwidth.
>
> I don't mind the time it takes to heal; I mind the impact healing has on
> other operations.
>
> Any ideas?

Laurent,

This has been improved significantly in afr-v2 (an enhanced version of the replication translator in gluster), which I believe will be released with 3.6. The issue happens because of the directory self-heal in the older versions. In the new version, per-file healing in a directory is performed instead of a full-directory heal at once, which was creating a lot of traffic. Unfortunately this is too big a change to backport to older releases :-(.

Pranith

> Thanks
>
> Laurent Chouinard
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
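(As a rough illustration of the granularity change described here, and not actual GlusterFS source, the difference between healing a directory as one big operation and healing it entry by entry can be sketched as follows; the lock and heal helpers are hypothetical stand-ins:)

    # Simplified illustration of the granularity change, not GlusterFS source.
    # 'lock' and 'heal_entry' are stand-ins for the real cluster operations.
    from contextlib import contextmanager

    @contextmanager
    def lock(obj):
        print(f"lock   {obj}")   # stand-in for the lock held while healing
        yield
        print(f"unlock {obj}")

    def heal_entry(entry):
        print(f"heal   {entry}")  # stand-in for copying data/metadata for one file

    def old_directory_selfheal(directory, entries):
        # pre-afr-v2: one big operation per directory; the heal traffic for
        # every entry happens back to back while the directory is held
        with lock(directory):
            for entry in entries:
                heal_entry(entry)

    def afr_v2_selfheal(entries):
        # afr-v2: each file is healed as its own small operation, so ordinary
        # client I/O can interleave between heals instead of queuing behind them
        for entry in entries:
            with lock(entry):
                heal_entry(entry)

    if __name__ == "__main__":
        old_directory_selfheal("dir", ["a", "b", "c"])
        afr_v2_selfheal(["a", "b", "c"])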
Laurent Chouinard
2014-Jun-02 19:26 UTC
[Gluster-users] Unavailability during self-heal for large volumes
> Laurent,
>
> This has been improved significantly in afr-v2 (an enhanced version of the
> replication translator in gluster), which I believe will be released with 3.6.
> The issue happens because of the directory self-heal in the older versions. In
> the new version, per-file healing in a directory is performed instead of a
> full-directory heal at once, which was creating a lot of traffic. Unfortunately
> this is too big a change to backport to older releases :-(.
>
> Pranith

Hi Pranith,

Thank you for this information.

Do you think there is a way to limit/throttle the current directory self-heal, then? I don't mind if it takes a long time.

Alternatively, is there a way to completely disable the healing system? I would consider running a manual healing operation by STAT'ing every file, which would allow me to throttle the speed to a more manageable level.

Thanks,

Laurent Chouinard
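(For what it's worth, the manual STAT crawl described here can be sketched as below. Assumptions: the volume is FUSE-mounted at a hypothetical /mnt/volume and the pause per file is arbitrary. Stat'ing a file through a client mount is what triggers the heal check for that file, so the sleep acts as the throttle. Whether the built-in self-heal daemon can simply be switched off per volume depends on the options available in the installed version, so that part is left out of the sketch.)

    # Minimal sketch of a throttled, manual heal crawl. Assumptions: the volume
    # is FUSE-mounted at /mnt/volume (hypothetical) and PAUSE is arbitrary.
    import os
    import time

    MOUNT = "/mnt/volume"   # hypothetical client mount point
    PAUSE = 0.01            # seconds to sleep between files; tune to taste

    for root, dirs, files in os.walk(MOUNT):
        for name in files:
            try:
                os.stat(os.path.join(root, name))  # lookup via the client mount
            except OSError:
                pass                               # file vanished mid-crawl; skip it
            time.sleep(PAUSE)                      # throttle the crawl rate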