dev at sapalski.de
2018-Sep-26 11:09 UTC
sshd: ClientAlive{CountMax,Interval} fires twice each interval if connection is interrupted
I've discovered a bug in serverloop.c (function wait_until_can_do_something) that I believe has not been reported so far.

With the latest openssh (7.8p1 as well as current master), sshd disconnects a non-responding client after approximately:

    (ClientAliveCountMax / 2) * ClientAliveInterval

rather than the documented ClientAliveCountMax * ClientAliveInterval.

I did a bisect, which showed that the fix introduced for bz#2756 causes this behavior:

https://bugzilla.mindrot.org/show_bug.cgi?id=2756

How to reproduce:

1. server #> /sbin/sshd -p 2020 -ddd -f ${sshd_config} 2>&1 | ts
2. client $> ssh $IP -p2020
3. Interrupt the client connection (e.g. close the lid of the client notebook) and wait for the timeout to happen

${sshd_config}
----
TCPKeepAlive no
ClientAliveInterval 15
ClientAliveCountMax 8
----

The debug log of sshd shows:
----
...
[2018-04-26 11:59:35] debug3: /tmp/sshd_config:94 setting TCPKeepAlive no
[2018-04-26 11:59:35] debug3: /tmp/sshd_config:98 setting ClientAliveInterval 15
[2018-04-26 11:59:35] debug3: /tmp/sshd_config:99 setting ClientAliveCountMax 8
...
[2018-04-26 12:00:16] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:00:16] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:00:31] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:00:31] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:00:46] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:00:46] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:01:01] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:01:01] debug2: channel 0: request keepalive@openssh.com confirm 1
[2018-04-26 12:01:16] Timeout, client not responding from user $USER x.x.x.x port xxxxx
----

As we can see, two keepalive requests are sent in every interval, so the ClientAliveCountMax budget of 8 is used up after only 4 intervals (60 seconds).
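To make the timing in the log easier to check, here is a minimal sketch of the arithmetic. It is a hypothetical model, not OpenSSH code: the function name is made up, and it assumes that each alive check increments a counter and that the client is dropped as soon as that counter exceeds ClientAliveCountMax.

```c
/*
 * Hypothetical model (not OpenSSH code): returns the number of seconds
 * between the first unanswered keepalive and the disconnect.
 *
 * Assumption: every alive check increments a timeout counter, and the
 * connection is dropped once the counter exceeds count_max.
 */
static int
seconds_from_first_keepalive_to_disconnect(int interval, int count_max,
    int checks_per_interval)
{
    int timeouts = 0;   /* unanswered alive checks so far */
    int elapsed = 0;    /* seconds since the first keepalive */

    for (;;) {
        /* All checks that fire at this point in time. */
        for (int i = 0; i < checks_per_interval; i++) {
            if (++timeouts > count_max)
                return elapsed;
        }
        elapsed += interval;
    }
}
```

Under these assumptions, calling it with (15, 8, 1) gives 120 seconds, while (15, 8, 2) gives 60 seconds, which matches the gap between the first keepalive at 12:00:16 and the timeout at 12:01:16 in the log.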
I think the problem is that when the select call in wait_until_can_do_something times out, the variable last_client_time isn't set to the current time. During the next iteration the select call returns immediately with data contained in 'writesetp', and because last_client_time is still stale, client_alive_check is called a second time.

A possible fix, which I believe doesn't break the fix for bz#2756 and solves this problem, could be:
----
diff --git a/serverloop.c b/serverloop.c
index d71724e..7110bf6 100644
--- a/serverloop.c
+++ b/serverloop.c
@@ -290,6 +290,7 @@ wait_until_can_do_something(struct ssh *ssh,
         if (ret == 0) { /* timeout */
             client_alive_check(ssh);
+            last_client_time = now;
         } else if (FD_ISSET(connection_in, *readsetp)) {
             last_client_time = now;
         } else if (last_client_time != 0 && last_client_time +
----

This solves the problem for me.

Can someone confirm that this is a bug and apply either my proposed fix or any other fix that solves this problem? If this gets confirmed, shall I open a bugzilla ticket, or isn't that necessary?

Thanks,
Samuel
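The explanation above can be illustrated with a small stand-alone model of the branch logic. This is only a sketch under stated assumptions, not the real serverloop.c: all names (wake, alive_model, on_wake, checks_after_intervals) are hypothetical, the third branch's condition is cut off in the diff context, so the model assumes it fires once a full interval has passed since last_client_time and then resets last_client_time, and it assumes every select timeout is immediately followed by a wakeup where only the socket is writable.

```c
#include <stdbool.h>

/* Hypothetical reasons why the select call returned. */
enum wake { WAKE_TIMEOUT, WAKE_WRITABLE, WAKE_READABLE };

struct alive_model {
    long last_client_time;  /* last time the model considers the client seen */
    int checks;             /* number of alive checks performed */
};

/* Mirrors the three branches of the diff context; `fix` toggles the patch. */
static void
on_wake(struct alive_model *m, enum wake w, long now, long interval, bool fix)
{
    if (w == WAKE_TIMEOUT) {
        m->checks++;                    /* stands in for client_alive_check() */
        if (fix)
            m->last_client_time = now;  /* the proposed one-line change */
    } else if (w == WAKE_READABLE) {
        m->last_client_time = now;
    } else if (m->last_client_time != 0 &&
        m->last_client_time + interval <= now) {
        /* Assumed continuation of the truncated condition. */
        m->checks++;
        m->last_client_time = now;
    }
}

/* Counts alive checks over n intervals with an unresponsive client. */
static int
checks_after_intervals(int n, long interval, bool fix)
{
    struct alive_model m = { 1, 0 };    /* client last seen at t = 1 */
    long now = 1;

    for (int i = 0; i < n; i++) {
        now += interval;
        on_wake(&m, WAKE_TIMEOUT, now, interval, fix);
        /* Assumed: select returns at once because the socket is writable. */
        on_wake(&m, WAKE_WRITABLE, now, interval, fix);
    }
    return m.checks;
}
```

In this model, four intervals of 15 seconds produce 8 checks without the change and 4 checks with it, which is consistent with the doubled keepalive lines in the log.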
Samuel Sapalski
2018-Oct-11 19:45 UTC
sshd: ClientAlive{CountMax, Interval} fires twice each interval if connection is interrupted
In the meantime I've created a bugzilla report for this issue:

https://bugzilla.mindrot.org/show_bug.cgi?id=2917

Thanks for looking at it,
Samuel