Sure, if you never restart / autoscale anything and if your use case isn't bothered with up to 42 seconds of downtime, for us - 42 seconds is a really long time for something like a patient management system to refuse file attachments from being uploaded etc...

We apply a strict patching policy for security and kernel updates, we often also load balance between underlying physical hosts, and if the virtual hosts have lots of storage it can be quicker to let them shut down and start on another host.

So for us, gone are the old Unix days of caring about uptime. A huge part of our measurement of success and risk reduction has become how quickly we can not just deploy our software / web apps into production but also how quickly our platform can be reformed, patched and migrated as is effective.

So in reality, I'd probably rolling restart our three node gluster clusters every few weeks or so, depending on what patches have been released etc...

--
Sam McLeod
https://smcleod.net
https://twitter.com/s_mcleod

> On 29 Dec 2017, at 11:08 am, Joe Julian <joe at julianfamily.org> wrote:
>
> The reason for the long (42 second) ping-timeout is that re-establishing fd's and locks can be a very expensive operation. With an average MTBF of 45000 hours for a server, even just a replica 2 would result in a 42 second MTTR every 2.6 years, or 6 nines of uptime.
>
> On December 27, 2017 3:17:01 AM PST, Omar Kohl <omar.kohl at iternity.com> wrote:
>> Hi,
>>
>>> If you set it to 10 seconds, and a node goes down, you'll see a 10 seconds freeze in all I/O for the volume.
>>
>> Exactly! ONLY 10 seconds instead of the default 42 seconds :-)
>>
>> As I said before the problem with the 42 seconds is that a Windows Samba Client will disconnect (and therefore interrupt any read/write operation) after waiting for about 25 seconds. So 42 seconds is too high. In this case it would therefore make more sense to reduce the ping-timeout, right?
>>
>> Has anyone done any performance measurements on what the implications of a low ping-timeout are? What are the costs of "triggering heals all the time"?
>>
>> On a related note I found the extras/hook-scripts/start/post/S29CTDBsetup.sh script that mounts a CTDB (Samba) share and explicitly sets the ping-timeout to 10 seconds. There is a comment saying: "Make sure ping-timeout is not default for CTDB volume". Unfortunately there is no explanation in the script, in the commit or in the Gerrit review history (https://review.gluster.org/#/c/7569/, https://review.gluster.org/#/c/8007/) for WHY you make sure ping-timeout is not default. Can anyone tell me the reason?
>>
>> Kind regards,
>> Omar
>>
>> -----Original Message-----
>> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On behalf of lemonnierk at ulrar.net
>> Sent: Tuesday, 26 December 2017 22:05
>> To: gluster-users at gluster.org
>> Subject: Re: [Gluster-users] Exact purpose of network.ping-timeout
>>
>>> Hi,
>>>
>>> It's just the delay for which a node can stop responding before being marked as down.
>>> Basically that's how long a node can go down before a heal becomes necessary to bring it back.
>>>
>>> If you set it to 10 seconds, and a node goes down, you'll see a 10 seconds freeze in all I/O for the volume. That's why you don't want it too high (having a 2 minutes freeze on I/O for example would be pretty bad, depending on what you host), but you don't want it too low either (to avoid triggering heals all the time).
>>>
>>> You can configure it because it depends on what you host. You might be okay with a few minutes freeze to avoid a heal, or you might not care about heals at all and prefer a very low value to avoid freezes.
>>> The default value should work pretty well for most things though.
>>>
>>> On Tue, Dec 26, 2017 at 01:11:48PM +0000, Omar Kohl wrote:
>>>> Hi,
>>>>
>>>> I have a question regarding the "ping-timeout" option. I have been researching its purpose for a few days and it is not completely clear to me. Especially that it is apparently strongly encouraged by the Gluster community not to change or at least decrease this value!
>>>>
>>>> Assuming that I set ping-timeout to 10 seconds (instead of the default 42) this would mean that if I have a network outage of 11 seconds then Gluster internally would have to re-allocate some resources that it freed after the 10 seconds, correct? But apart from that there are no negative implications, are there? For instance if I'm copying files during the network outage then those files will continue copying after those 11 seconds.
>>>>
>>>> This means that the only purpose of ping-timeout is to save those extra resources that are used by "short" network outages. Is that correct?
>>>>
>>>> If I am confident that my network will not have many 11 second outages and if they do occur I am willing to incur those extra costs due to resource allocation is there any reason not to set ping-timeout to 10 seconds?
>>>>
>>>> The problem I have with a long ping-timeout is that the Windows Samba Client disconnects after 25 seconds. So if one of the nodes of a Gluster cluster shuts down ungracefully then the Samba Client disconnects and the file that was being copied is incomplete on the server. These "costs" seem to be much higher than the potential costs of those Gluster resource re-allocations. But it is hard to estimate because there is no clear documentation of what exactly those Gluster costs are.
>>>>
>>>> In general I would be very interested in a comprehensive explanation of ping-timeout and the up- and downsides of setting high or low values for it.
>>>>
>>>> Kind regards,
>>>> Omar
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
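The "every 2.6 years" and "6 nines" figures in Joe Julian's quoted message can be reproduced with a little arithmetic. The following is only a minimal sketch under stated assumptions: server failures are independent, every failure costs exactly one ping-timeout of stalled I/O, and a year has 365.25 days. The function and variable names are made up for illustration and are not part of Gluster.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: average year length


def stall_availability(mtbf_hours, servers, ping_timeout_s):
    """Return (years between failures, availability) for a replicated volume.

    Assumes independent failures and one ping-timeout stall per failure.
    """
    # With N servers, some server fails N times as often as a single one.
    hours_between_failures = mtbf_hours / servers
    years_between_failures = hours_between_failures / HOURS_PER_YEAR
    # Fraction of time spent stalled: one timeout per failure interval.
    unavailability = ping_timeout_s / (hours_between_failures * 3600)
    return years_between_failures, 1.0 - unavailability


# Figures from the thread: MTBF 45000 h, replica 2, 42 s ping-timeout.
years, availability = stall_availability(45000, 2, 42)
print(f"one stall roughly every {years:.1f} years")
print(f"availability: {availability:.7%}")
```

With the numbers from the thread this gives a stall about every 2.6 years and an unavailability of roughly 5 in 10 million, which is where "6 nines" comes from. Planned restarts are not part of this model, which is the point Sam McLeod raises.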
Restarts will go through a shutdown process. As long as the network isn't actively unconfigured before the final kill, the TCP connection will be shut down and there will be no wait.

On 12/28/17 20:19, Sam McLeod wrote:
> Sure, if you never restart / autoscale anything and if your use case isn't bothered with up to 42 seconds of downtime, for us - 42 seconds is a really long time for something like a patient management system to refuse file attachments from being uploaded etc...
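The difference between a graceful shutdown and a silent failure that Joe Julian describes can be illustrated with plain TCP sockets. This is a hedged sketch, not Gluster's RPC code: a peer that closes its connection is noticed immediately, while a peer that just goes quiet is only noticed once a timer expires, which is the role ping-timeout plays. The helper names and the short timeouts are illustrative assumptions.

```python
import socket
import threading
import time


def peer(listener, close_cleanly, hold_s):
    """Accept one connection, then either close it or stay silent."""
    conn, _ = listener.accept()
    if not close_cleanly:
        time.sleep(hold_s)  # "crashed" peer: sends nothing at all
    conn.close()            # graceful shutdown: the client sees EOF


def seconds_until_peer_detected(close_cleanly, timeout_s=1.0):
    """Measure how long a client waits before it knows the peer is gone."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    worker = threading.Thread(
        target=peer, args=(listener, close_cleanly, timeout_s * 2)
    )
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.settimeout(timeout_s)  # stand-in for a ping-timeout
    start = time.monotonic()
    try:
        client.recv(1)            # returns b"" at once after a clean close
    except socket.timeout:
        pass                      # silent peer: only the timer tells us
    elapsed = time.monotonic() - start

    client.close()
    worker.join()
    listener.close()
    return elapsed


print("clean close :", seconds_until_peer_detected(True, timeout_s=0.5))
print("silent peer :", seconds_until_peer_detected(False, timeout_s=0.5))
```

In the clean case the wait is close to zero regardless of the timeout. In the silent case the wait is the full timeout, which corresponds to a hard crash or to the network being torn down before the process exits.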
lemonnierk at ulrar.net
2017-Dec-29 09:05 UTC
[Gluster-users] Exact purpose of network.ping-timeout
On Fri, Dec 29, 2017 at 03:19:36PM +1100, Sam McLeod wrote:
> Sure, if you never restart / autoscale anything and if your use case isn't bothered with up to 42 seconds of downtime, for us - 42 seconds is a really long time for something like a patient management system to refuse file attachments from being uploaded etc...

It won't refuse anything for 42 seconds, it'll just take 42 seconds + whatever time the upload would take to complete. Might be as bad to you, I don't know, but it shouldn't refuse.
Hi,

I know that "glusterbot" text about ping-timeout almost by heart by now ;-) I have searched the complete IRC logs and mailing list from the last 4 or 5 years for anything related to ping-timeout.

The problem with "can be a very expensive operation" is that this is extremely vague. It would be helpful to put some numbers behind it. Of course I also understand that any numbers would be very case specific and would not necessarily generalize to other use cases.

So anyway... Coming back to my original problem:

If a Microsoft Windows client mounts a Samba share with an underlying Gluster volume and this volume goes away for more than 25 seconds then the Samba share is dead and any file operation is cancelled. This means for instance that a big file that is being copied will be stored in an incomplete state in the Gluster volume. This is especially annoying since one server (Gluster brick) is online the whole time and all operations could in theory have continued without problems.

If I reduce the ping-timeout to something like 5 seconds the problem goes away! File operations in the Samba share will stall for a few seconds and then everything will continue.

I understand that with a regular server shutdown this should never happen anyway. In practice (at least with CentOS 7) this does still happen (possibly because the network goes away too quickly, as you suggested) but it should be fixable. BUT I definitely want to support hard server crashes as well. The current behaviour of the Samba share is not an option!

Would you therefore say it is appropriate in my use case to decrease the ping-timeout? Or can you think of anything else that could/should be done? I have no control over the client.

Since everything goes through plenty of layers, there are many places where additional delays could be introduced. So my first instinct would be to reduce ping-timeout as much as possible to avoid coming near those "25 seconds". Hence my question about specific data on what the "ping-timeout" costs are.

What strengthens my belief that a 42 second ping-timeout is not appropriate for a Samba share is the script from the Gluster repository I linked to in a previous mail:

> I found the extras/hook-scripts/start/post/S29CTDBsetup.sh script that mounts a CTDB (Samba) share and explicitly sets the ping-timeout to 10 seconds. There is a comment saying: "Make sure ping-timeout is not default for CTDB volume". Unfortunately there is no explanation in the script, in the commit or in the Gerrit review history (https://review.gluster.org/#/c/7569/, https://review.gluster.org/#/c/8007/) for WHY you make sure ping-timeout is not default. Can anyone tell me the reason?

Thanks for your help!

Kind regards,
Omar

-----Original Message-----
From: Joe Julian [mailto:joe at julianfamily.org]
Sent: Friday, 29 December 2017 06:35
To: Sam McLeod <mailinglists at smcleod.net>
Cc: Gluster Users <gluster-users at gluster.org>; Omar Kohl <omar.kohl at iternity.com>
Subject: Re: [Gluster-users] Exact purpose of network.ping-timeout

> Restarts will go through a shutdown process. As long as the network isn't actively unconfigured before the final kill, the tcp connection will be shutdown and there will be no wait.
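Omar's reasoning about staying under the Windows client's limit can be written down as a small check. This is a sketch only: the 25-second client limit and the 42/10/5-second values come from the thread, while `other_delays_s` (extra latency from the layers in between) and the volume name `myvol` are hypothetical placeholders. The command follows the usual `gluster volume set <VOLNAME> network.ping-timeout <seconds>` form. It is only built here, not executed.

```python
def fits_client_budget(ping_timeout_s, client_timeout_s=25, other_delays_s=0):
    """True if a stall of ping_timeout_s plus other delays stays under
    the client's disconnect timeout (25 s is the figure from the thread)."""
    return ping_timeout_s + other_delays_s < client_timeout_s


def set_ping_timeout_command(volume, seconds):
    """Build (but do not run) the gluster CLI call that changes the option."""
    return ["gluster", "volume", "set", volume,
            "network.ping-timeout", str(seconds)]


# Values discussed in the thread; other_delays_s=10 is a made-up margin.
for candidate in (42, 10, 5):
    ok = fits_client_budget(candidate, other_delays_s=10)
    print(f"ping-timeout {candidate:>2}s with 10s of other delays: "
          f"{'fits' if ok else 'exceeds'} the 25s client limit")

# 'myvol' is a placeholder volume name.
print(" ".join(set_ping_timeout_command("myvol", 10)))
```

The check says nothing about the cost side of the trade-off (re-establishing fd's and locks, extra heals), which is exactly the data Omar is asking for.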