Grant Ridder
2015-Aug-31 19:35 UTC
[Gluster-users] Handling a fuse mount with a failed gluster node
Hi,

I am testing out several failure scenarios with GlusterFS. I have a 3-node replicated gluster that I am testing with.

One test I am having trouble solving is when a host dies (i.e. as if someone pulled the power cord out). The test procedure is:
- Firewall off a host from the rest of the cluster
- Measure the time it takes for a fuse mount to respond once the iptables rule is added

With the default settings, the mount hangs for 39 seconds. If I change ping-timeout to 5, the mount only hangs for 9.3 seconds. Is there any way to eliminate the hang, or to get the hang time down to a negligible value (less than 1 second)?

I have not seen much about handling GlusterFS failure scenarios in my Googling around.

Several blog posts I have looked at:
http://thornelabs.net/2015/02/24/change-gluster-volume-connection-timeout-for-glusterfs-native-client.html
https://joejulian.name/blog/keeping-your-vms-from-going-read-only-when-encountering-a-ping-timeout-in-glusterfs/

Thanks,
Grant
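For anyone wanting to reproduce this kind of measurement, the following is a minimal sketch of one way to time how long a call against the mount blocks. It is not the method used for the numbers above: the helper names `time_stat` and `probe`, the choice of `os.stat` as the probe operation, and the polling interval are all assumptions made for illustration. A stat may be answered from a cache, so a real test might need to read or write a file instead.

```python
import os
import time


def time_stat(path):
    """Time a single os.stat call on path.

    Returns (elapsed_seconds, error), where error is None on success
    or the OSError that was raised.
    """
    start = time.monotonic()
    try:
        os.stat(path)
        error = None
    except OSError as exc:
        error = exc
    return time.monotonic() - start, error


def probe(path, interval=0.5, rounds=5):
    """Call time_stat repeatedly and return the list of elapsed times.

    path is assumed to live on the fuse mount under test; interval and
    rounds are illustrative defaults, not recommended values.
    """
    samples = []
    for _ in range(rounds):
        elapsed, _error = time_stat(path)
        samples.append(elapsed)
        time.sleep(interval)
    return samples
```

Running `probe` in a loop while the iptables rule is added, then taking `max()` of the samples, gives a rough figure for the hang.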
Joe Julian
2015-Aug-31 22:00 UTC
[Gluster-users] Handling a fuse mount with a failed gluster node
On 08/31/2015 12:35 PM, Grant Ridder wrote:
> Hi,
>
> I am testing out several failure scenarios with GlusterFS. I have a 3
> node replicated gluster that i am testing with.
>
> One test i am having trouble solving is when a host dies. (i.e. as if
> someone pulled the power cord out).
> - Firewall off a host from the rest of the cluster
> - Test time it takes for a fuse mount to respond once the iptables
> rule is added
>
> With the default settings, the mount hangs for 39 seconds. If i
> change ping-timeout to 5 then the mount only hangs for 9.3 seconds.
> Is there anyway to eliminate or get the hang time to a negligible
> value (less than 1 second)?
>
> I have not seem much about handling GlusterFS failure scenarios with
> my Googling around.
>
> Several blog posts i have looked at:
> http://thornelabs.net/2015/02/24/change-gluster-volume-connection-timeout-for-glusterfs-native-client.html
> https://joejulian.name/blog/keeping-your-vms-from-going-read-only-when-encountering-a-ping-timeout-in-glusterfs/
>
> Thanks,
> Grant
>

If a client disconnects from a server, you have to reestablish all the file descriptors and synchronize the locks when the client reconnects. This can be pretty expensive and there's no way to avoid it.

To balance that, you don't want your clients to disconnect from the servers just because a packet is lost or a response takes too long. That's why the connections are TCP, to help mitigate that, and why the client waits for the ping-timeout. If it were too short, even server load could trigger a disconnection, which would be followed by high server load as the connection was reestablished, potentially causing a disconnection again.

Pulled power cords or complete system failures should be a very rare occurrence. Typically they are much rarer than a temporary network issue, which is much more likely to be mitigated in the network fabric and is transient enough for the ping-timeout to hold the connection long enough to avoid reestablishing the FDs and locks.

It's also only an issue if your file hash hits that specific replica out of the dht set (or the file doesn't exist). If you're using a cluster where server failure is frequent enough to be an issue, your dht distribution lowers the likelihood of the file being hit to an insignificant statistic. If you're working with reasonably resilient hardware, you should easily be able to engineer for 5 or 6 nines even with a 42 second ping-timeout.
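To put rough numbers on the "5 or 6 nines" remark, here is a back-of-the-envelope sketch of how many full 42 second hangs fit into a yearly downtime budget. The function names and the 365-day year are assumptions for illustration, and the calculation deliberately overstates the impact: it counts every hang as total unavailability, whereas only files whose hash lands on the affected replica are actually hit.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (assumes a 365-day year)


def downtime_budget(nines):
    """Seconds per year of allowed unavailability for N nines of uptime.

    For example, 5 nines means 99.999% availability, i.e. an
    unavailable fraction of 10**-5.
    """
    return SECONDS_PER_YEAR * 10 ** (-nines)


def hangs_per_year(nines, hang_seconds=42.0):
    """How many hangs of hang_seconds each fit into the yearly budget."""
    return downtime_budget(nines) / hang_seconds


if __name__ == "__main__":
    for n in (5, 6):
        print(n, "nines:", round(downtime_budget(n), 2), "s/year,",
              round(hangs_per_year(n), 2), "hangs of 42 s")
```

Under these assumptions, five nines allows about 315 seconds a year, or roughly seven and a half hard server failures, each causing a full 42 second hang. Six nines allows about 31.5 seconds, so less than one such hang per year, which is where the dht distribution and the rarity of hard failures have to do the rest of the work.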