I will try to explain this as best I can. I have two computers; one a
Supermicro X10SAE running CentOS 6, the other a very old DOS box.[*] The DOS
box runs a CCD camera, sending images via Ethernet to the X10SAE. Thus, the
X10SAE runs a Python server on port 5700 (a socket which binds to 5700 and
listens, and then accepts a connection from the DOS box; nothing fancy).[**]
The DOS box connects to the server and sends images.

This all works great, except: when the DOS box exits, crashes, or is rebooted,
it fails to shut down the socket properly. Under CentOS 6.5, upon reboot, when
the DOS box would attempt to reconnect, the original accepted server socket
would (after a couple of connection attempts from the DOS box) see a 0-length
recv and close, allowing the server to accept a new connection and resume
receiving images. Under CentOS 6.6, the server never sees the 0-length recv.
The DOS box flails away attempting to reconnect forever, and the server never
seems to get any type of signal that the DOS box is attempting to reconnect.

Possibly relevant facts:
- The DOS box uses the same local port (1025) every time it tries to connect.
  It does not use a random ephemeral port.
- The exact same code was tested on a CentOS 6.5 and 6.6 box, resulting in the
  described behavior. The boxes were identical clones except for the O/S
  upgrade.
- The Python interpreter was not changed during the upgrade, because I run
  this code using my own 2.7.2 install. However, both glibc and the kernel
  were upgraded as part of the O/S upgrade.

My only theory is that this has something to do with non-ephemeral ports and
socket reuse, but I'm not sure what. It is entirely possible that some
low-level socket option default has changed between 6.5 and 6.6, and I
wouldn't know it. It is also possible that I have been relying on unsupported
behavior this whole time, and that the current behavior is actually correct.
Does anyone have any insight they can offer?

[*] Hardware is not an issue; in fact, I have two identical systems, each of
which has one X10SAE and three DOS boxes. But the problem can be boiled down
to a single pair.

[**] I'm actually using an asyncore.dispatcher to do the bind/listen, and then
tossing the accept()ed socket into an asynchat. But I actually went ahead and
put a trap on socket.recv() just to be sure that I'm not swallowing the
0-length recv by accident.

-G.
--
Glenn Eychaner (geychaner at lco.cl)
Telescope Systems Programmer, Las Campanas Observatory
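To make the arrangement in [**] concrete, here is a minimal, hypothetical
sketch of such a listener in Python 2.7. The class names and the lack of
frame parsing are assumptions for illustration; the real server obviously does
more.

    # Hypothetical sketch (not the original code) of the bind/listen/accept
    # arrangement: an asyncore.dispatcher listening on 5700 that hands each
    # accepted socket to an asynchat.async_chat handler.
    import asyncore
    import asynchat
    import socket

    class ImageHandler(asynchat.async_chat):
        def __init__(self, sock):
            asynchat.async_chat.__init__(self, sock)
            self.set_terminator(None)   # raw stream; real framing omitted

        def collect_incoming_data(self, data):
            # Image bytes arrive here; a 0-length recv instead triggers
            # handle_close() on this channel.
            pass

        def handle_close(self):
            print 'client went away'
            self.close()

    class ImageServer(asyncore.dispatcher):
        def __init__(self, port=5700):
            asyncore.dispatcher.__init__(self)
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            self.set_reuse_addr()
            self.bind(('', port))
            self.listen(1)

        def handle_accept(self):
            pair = self.accept()
            if pair is not None:
                sock, addr = pair
                ImageHandler(sock)

    if __name__ == '__main__':
        ImageServer()
        asyncore.loop()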
On Thu, Jan 15, 2015 at 03:40:08PM -0300, Glenn Eychaner wrote:

> My only theory is that this has something to do with non-ephemeral ports and
> socket reuse, but I'm not sure what.

If you want a quick detection that the link is dead, have the server
occasionally send bytes to the dos box. You will get an immediate error if the
dos box is up and knows that connection is kaput.

Given that the port numbers of the new connection are the same, I'm kind of
surprised that the behavior changed from 6.5 to 6.6, but, I always use
defensive programming (sending those extra bytes).

-- greg
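A hypothetical sketch of that defensive probe on the server side. The probe
byte and error handling are assumptions; the real protocol would need a byte
the DOS box can tolerate, and (as raised later in the thread) this only helps
if the peer's stack answers with a RST or the client eventually reads.

    # Periodic application-level probe: if the DOS box has rebooted and
    # answers the probe segment with a RST, the send()/next recv() fails,
    # so the server learns the old connection is dead.
    import socket
    import errno

    def probe(conn):
        """Return True if the connection still looks alive, False if dead."""
        try:
            conn.send('\x00')   # one harmless byte (assumed acceptable)
            return True
        except socket.error, e:
            if e.errno in (errno.EPIPE, errno.ECONNRESET):
                return False
            raise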
Since you always use the same local port - maybe you need to set the
SO_REUSEADDR option.

Greetings from Germany
Alex
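For concreteness, setting that option on a listening socket in Python looks
like the sketch below. As discussed later in the thread, this mainly lets a
server rebind an address that is still in TIME_WAIT; it does not by itself
make the server accept a new connection that reuses an old 5-tuple.

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(('', 5700))
    s.listen(1)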
[I wish I knew how to get the mailing list to thread my replies properly in
the archives; I subscribe to the daily digest, and replying to that doesn't
do it.]

Greg Lindahl wrote:

> On Thu, Jan 15, 2015 at 03:40:08PM -0300, Glenn Eychaner wrote:
>
> > My only theory is that this has something to do with non-ephemeral ports and
> > socket reuse, but I'm not sure what.
>
> If you want a quick detection that the link is dead, have the server
> occasionally send bytes to the dos box. You will get an immediate
> error if the dos box is up and knows that connection is kaput.

What if I am sending bytes to the DOS box, but it never reads the socket?
(Let us assume, for the sake of argument, that I can't change the DOS box
software. In fact, I can, but it's more difficult than changing the Linux
end.) Won't that either result in my detecting the socket as "dead" when it
is not, or eventually overflowing the socket buffering?

> Given that the port numbers of the new connection are the same, I'm
> kind of surprised that the behavior changed from 6.5 to 6.6, but, I
> always use defensive programming (sending those extra bytes).

I was super-surprised by the change, in that I fully tested the upgrade on my
simulator system before deploying, and still got bit on deployment. Of course,
the simulator doesn't have a real DOS box, just a simulation process that
sends the images.

[And, I also recently got bit by this
http://www.macstadium.com/blog/osx-10-9-mavericks-bugs/ after upgrading some
Macs. Sigh, network issues.]

Alex from Germany wrote:

> Since you always use the same local port -
> maybe you need to set SO_REUSEADDR option.

I assume I would have to set that on the client (DOS) side (the box which is
using the same local port 1025 each time); setting it on the bound-listener
socket on the Linux side doesn't seem like it would do anything to resolve the
issue, based on my reading of SO_REUSEADDR on the net:
http://www.unixguide.net/network/socketfaq/4.5.shtml
http://stackoverflow.com/questions/14388706/

-G.
--
Glenn Eychaner (geychaner at lco.cl)
Telescope Systems Programmer, Las Campanas Observatory
What about SO_LINGER at the Linux side, have you tried that?
http://stackoverflow.com/questions/3757289/tcp-option-so-linger-zero-when-its-required

On Fri, Jan 16, 2015 at 1:18 PM, Glenn Eychaner <geychaner at mac.com> wrote:

>> Since you always use the same local port -
>> maybe you need to set SO_REUSEADDR option.
>
> I assume I would have to set that on the client (DOS) side (the box which is
> using the same local port 1025 each time); setting it on the bound-listener
> socket on the Linux side doesn't seem like it would do anything to resolve
> the issue, based on my reading of SO_REUSEADDR on the net:
> http://www.unixguide.net/network/socketfaq/4.5.shtml
> http://stackoverflow.com/questions/14388706/
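For reference, setting SO_LINGER with a zero timeout in Python looks like the
snippet below; closing a socket configured this way sends a RST instead of
going through the normal FIN/TIME_WAIT shutdown. Whether that actually helps
in this particular scenario is questioned later in the thread.

    import socket
    import struct

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # struct linger { int l_onoff; int l_linger; }: on, linger 0 seconds
    s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))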
On Fri, Jan 16, 2015 at 6:18 AM, Glenn Eychaner <geychaner at mac.com> wrote:

> I was super-surprised by the change, in that I fully tested the upgrade on
> my simulator system before deploying, and still got bit on deployment.

I'm not sure I completely understand the scenario, but it seems wrong for it
to have worked before. Why should a 'new' attempt at a connection with
different TCP sequence numbers have been able to have any effect on the
working socket that hasn't been closed yet at the other end, unless it is
sending a RST packet? Might be interesting to watch with Wireshark to see if
you are getting a RST that doesn't close the old connection.

--
Les Mikesell
lesmikesell at gmail.com
On Jan 15, 2015, at 11:40 AM, Glenn Eychaner <geychaner at mac.com> wrote:

> When the DOS box exits, crashes, or is rebooted, it fails to shut down the
> socket properly.

Yes, that's what happens when you use an OS that doesn't implement sockets in
kernel space: there is no program still running that can send the RST packet
for the dead socket.

> Under CentOS 6.5, upon reboot, when the DOS box would attempt
> to reconnect, the original accepted server socket would (after a couple of
> connection attempts from the DOS box) see a 0-length recv and close, allowing
> the server to accept a new connection and resume receiving images.

You're relying on undocumented behavior here. I don't know exactly what was
going on before [*] but the new behavior is at least legal, and probably
better. It is preventing a bogus reconnection, which could be used for
nefarious purposes. (Connection hijacking, etc.)

[*] Your "flailing about" diagnosis is somewhat lacking in its level of rigor.
:) I think if you look more deeply into it, you'll be shocked at how thin the
ice you've been dancing on is.

> Possibly relevant facts:

Oh, yeah. Relevant like rashes are to a diagnosis of chicken pox.

> - The DOS box uses the same local port (1025) every time it tries to connect.

That's legal only if you allow the previous connection to die first, via the
TIME_WAIT delay. Until that delay expires, the connection's 5-tuple [**] is
still in use, and the kernel is right to refuse to accept another SYN using
the same 5-tuple.

Another poster recommended SO_REUSEADDR, but that's just a hack around the
TIME_WAIT delay. The correct fix is to change the DOS app to use an ephemeral
port number. That won't 100% fix it, because you'll still have a 1:16,383
chance [***] of causing the same problem as you've run into now, but that
sounds live-able to me. If you reboot only once a week, you'd have to be Yoda
to have much reason to be worried about running into this again during the
balance of your tenure with this company.

If you're really worried about it, write the prior port to a text file on
program startup and avoid that one on the next run. Oh, let me guess the
objection: old binary-only DOS app, no source code available, programmers long
since vanished, right?

[**] Transport protocol, local port, local IP, remote port, remote IP. At
least one must be different for a new connection to be allowed.

[***] The IANA ephemeral port range
(https://en.wikipedia.org/wiki/Ephemeral_port) has about 16k ports. I spent
some time puzzling over the probabilities, and I'm pretty sure you don't count
two "draws" here: you're only concerned with the chance that the *next* port
you pick will be equal to the preceding one.
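To make the fixed-port versus ephemeral-port distinction concrete, here is a
hypothetical Python sketch (for illustration only; the real client is a DOS
program, and the addresses are placeholders). Binding the client socket to a
fixed local port reproduces the problem, because every reconnect reuses the
same 5-tuple; skipping the bind() lets the stack pick a fresh ephemeral port,
so a reconnect after a crash looks like a brand-new connection.

    import socket

    SERVER = ('192.0.2.10', 5700)   # placeholder server address

    # What the DOS box effectively does today: same local port every time.
    fixed = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    fixed.bind(('', 1025))
    fixed.connect(SERVER)

    # What a well-behaved client does: no explicit bind, ephemeral local port.
    ephemeral = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    ephemeral.connect(SERVER)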
A couple more thoughts...

On Jan 16, 2015, at 10:42 AM, Warren Young <wyml at etr-usa.com> wrote:

> On Jan 15, 2015, at 11:40 AM, Glenn Eychaner <geychaner at mac.com> wrote:
>
>> When the DOS box exits, crashes, or is rebooted, it fails to shut down the
>> socket properly.
>
> Yes, that's what happens when you use an OS that doesn't implement sockets
> in kernel space: there is no program still running that can send the RST
> packet for the dead socket.

That said, your Linux/Python side code shouldn't be relying on the RST anyway.
A power blip that unceremoniously reboots the DOS box will also skip the RST.
That happens with *all* TCP stacks, even in-kernel ones.

True war story, seen on devices from multiple vendors:

The setup: An embedded system has a TCP listener. Some network problem [*]
causes packet loss for an extended period, causing an established peer to time
out and drop its conn. The packet loss also prevents the RST/FIN from getting
to the embedded device, so it thinks it's still connected. Because the
embedded device's programmer is counting every processor cycle, he makes it so
it only handles a single TCP connection at a time.

The result: The embedded box is now unreachable until boots on the ground walk
over and power-cycle it.

The fix: Make the embedded TCP listener either a) allow multiple TCP
connections; or b) drop the prior TCP conn when a new one comes in.

The lesson: If your TCP/IP program was easy to write, it isn't robust. You've
missed *something*.

[*] It could be a misconfiguration, broken cable, firmware update,
power-cycled wiring closet, etc.

> The correct fix is to change the DOS app to use an ephemeral port number.

That also fixes the "missing RST" problem I've described above. If by some bad
bit of luck the DOS box happens to pick the same ephemeral port number after a
reboot that it was using before, it will get RST. The DOS app will then retry,
causing the DOS TCP stack to pick a different ephemeral port, so it will
succeed.

A different fix is to exploit the real-time nature of video camera imagery: if
your Python app goes more than a second without receiving an image frame, it
can presume that the DOS box has disappeared again, and drop its conn. By the
time the DOS box reboots, TIME_WAIT may have expired, so the DOS box might
reconnect without a problem. You may wish to reduce tcp_fin_timeout to ensure
that TIME_WAIT does indeed expire before the DOS box reboots, per
http://goo.gl/zQCzqK
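A hypothetical sketch of that inactivity-timeout idea: remember when data last
arrived and drop the connection if the camera has been silent too long. The
one-second figure comes from the message above; the handler name and polling
interval are assumptions, and real frame parsing is omitted.

    import asynchat
    import asyncore
    import time

    IDLE_LIMIT = 1.0   # seconds of silence before assuming the DOS box is gone

    class TimedImageHandler(asynchat.async_chat):
        def __init__(self, sock):
            asynchat.async_chat.__init__(self, sock)
            self.set_terminator(None)
            self.last_rx = time.time()

        def collect_incoming_data(self, data):
            self.last_rx = time.time()   # any incoming bytes count as "alive"

        def idle_check(self):
            if time.time() - self.last_rx > IDLE_LIMIT:
                self.close()             # frees the stale connection on our side

    # Main loop idea: poll in short steps so the idle check runs regularly.
    def run():
        while asyncore.socket_map:
            asyncore.loop(timeout=0.25, count=1)
            for chan in asyncore.socket_map.values():
                if isinstance(chan, TimedImageHandler):
                    chan.idle_check()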
I'd like to thank everyone for their replies and advice. I'm sorry it took so
long for me to respond; I took a long weekend after a long shift. Some
remaining questions can be found in the final section of this posting.

The summary (I hope I have all of this correct):

Problem: A DOS box (client) connects to a Linux box (server) using the same
local port (1025) on the client each time. The client sends data which the
server reads; the server is passive and does not write any data. If the client
crashes and fails to properly close the connection, under CentOS 6.5 the old
accepted socket on the server receives a 0-length recv(), allowing for a
"clean" reconnect; under 6.6 it does not, and the client unsuccessfully
retries the reconnect endlessly.

Diagnosis: Because the client is connecting using the same port every time,
the server sees the same 5-tuple each time. At that point, the reconnection
should fail until the old socket on the server is closed, and the previous
behavior of receiving a 0-length recv() on the old server socket is
unsupported and unreliable. Until the update to CentOS 6.6 'broke' the
existing functionality, I had never looked deeply into the connection between
the client and the server; it 'just worked', so I left it alone. Once it did
break, I realized that because the client was connecting on the same port
every time, the whole setup might have been relying on unsupported behavior.

My workaround: I unfortunately had to implement an emergency workaround before
receiving any replies. Fortunately, the client also sends status messages to
the same computer (but a different server program) over a serial-port
side-channel (well, it's more complicated than that, but anyway). I set up a
listener for a "failed connection" status message which signal()s the server
program to close all client connections (but not the bound dispatchers) and
thereby force all clients to reconnect. It's a cheat and a cheesy hack, but it
works.

Other diagnostics: One test I intend to run in a couple of weeks (next
opportunity) is to boot the CentOS 6.6 box with the older kernel, in order to
find out whether the behavior change is in the kernel or in the libraries.

Correct solutions:
1) Client port: The client should be connecting on a random, ephemeral port
   like a good client instead of on a fixed port, which I suspected. I don't
   know if this can be changed (due to a really dumb binary TCP driver).
2) Protocol change: The server never writes to the socket in the existing
   protocol, and can therefore never find out that the connection is dead.
   Writing to the socket would reveal this. But what happens if the server
   writes to the socket, and the client never reads? (We do, as it happens,
   have access to the client software, so the protocol can be fixed
   eventually. But I'm still curious as to the answer.)
3) Several people suggested using SO_REUSEADDR and/or an SO_LINGER of zero to
   drop the socket out of TIME_WAIT, but does the socket enter TIME_WAIT as
   soon as the client crashes? I didn't think so, but I may be wrong.
4) Several people suggested SO_KEEPALIVE, but keepalive probes occur only
   after hours unless you change kernel parameters via procfs and/or sysctl,
   and when the client crashes, I need recovery right away, not hours down the
   road. Time here is literally worth a dollar per second, roughly. (See the
   keepalive sketch after this message.)

Anyway, thanks for the discussion and helpful links. At one time I knew all
this stuff, but it has been 20 years since I had to dig into the TCP protocol
this deeply.

-G.
--
Glenn Eychaner (geychaner at lco.cl)
Telescope Systems Programmer, Las Campanas Observatory
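Regarding point 4 above: on Linux, the keepalive timings can also be tuned per
socket, without touching the procfs/sysctl defaults, so detection can be
brought down from hours to seconds. A hypothetical sketch, assuming Linux and
Python 2.7 (the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT constants are
Linux-specific, and the values are only illustrative):

    import socket

    def enable_fast_keepalive(sock, idle=5, interval=2, count=3):
        """Probe after `idle` seconds of silence, every `interval` seconds,
        and drop the connection after `count` unanswered probes."""
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

Note that after the DOS box reboots, its stack should answer a keepalive probe
for the forgotten connection with a RST, which would also free the stale
server-side socket.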
On Wed, Jan 21, 2015 at 10:49 AM, Glenn Eychaner <geychaner at mac.com> wrote:

> 2) Protocol change: The server never writes to the socket in the existing
> protocol, and can therefore never find out that the connection is dead.
> Writing to the socket would reveal this. But what happens if the server writes
> to the socket, and the client never reads? (We do, as it happens, have access
> to the client software, so the protocol can be fixed eventually. But I'm still
> curious as to the answer.)

If you can change the client, and you want to keep essentially re-using the
same socket after a reboot, can't you simply send a RST on it when starting up
and then re-connect - or even run a different program ahead of starting it
that just sends a RST with that source/dest/port combination? That should make
the server side abandon that connection and accept another, although you may
still need to play tricks on the server side to avoid TIME_WAIT.

--
Les Mikesell
lesmikesell at gmail.com
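For illustration only, a hypothetical way to emit such a RST from a separate
program is with a packet-crafting library such as scapy (an assumption; the
DOS box itself obviously cannot run this, and the addresses are placeholders).
Note that a blind RST whose sequence number does not land in the server's
receive window may simply be discarded, so in practice the sequence number
would have to be learned, e.g. from a packet capture of the old connection.

    from scapy.all import IP, TCP, send

    # seq=0 is a stand-in; a real RST needs an in-window sequence number.
    send(IP(src='192.0.2.20', dst='192.0.2.10') /
         TCP(sport=1025, dport=5700, flags='R', seq=0))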
On 01/21/2015 08:49 AM, Glenn Eychaner wrote:

> Diagnosis:
> the previous behavior of
> receiving a 0-length recv() on the old server socket is unsupported and
> unreliable.

You mention that a lot, and it might help to understand why that happens. A
0-length recv() on a standard (blocking) socket indicates end-of-file. The
remote side has closed the connection. What you were previously seeing was the
client sending SYN to establish a new connection. Because it was unrelated to
the existing connection on the same 5-tuple, the server's TCP stack closed the
existing socket. I'm not positive, but the server may have sent a keepalive or
other probe to the client and got a RST. Either way, the kernel determined
that the socket had been closed by the client, and a 0-length read (recv) is
the way that the kernel informs an application of that closure.

> Until the update to CentOS 6.6 'broke' the existing functionality,
> I had never looked deeply into the connection between the client and the
> server; it 'just worked', so I left it alone. Once it did break, I realized
> that because the client was connecting on the same port every time, the
> whole setup might have been relying on unsupported behavior.

Not just unsupported, but incorrect. Unrelated packets with a 5-tuple matching
an established socket are typically injection attacks. TCP is supposed to
discard them.

> Other diagnostics:
> One test I intend to run in a couple of weeks (next opportunity) is to boot
> the CentOS 6.6 box with the older kernel, in order to find out whether the
> behavior change is in the kernel or in the libraries.

It's always good to test, but it's almost certainly the kernel. Libraries
don't decide whether or not a socket has closed, which is what the 0-length
read (recv) indicates.

> Correct solutions:
> 1) Client port: The client should be connecting on a random, ephemeral port

Yes.

> 2) Protocol change: The server never writes to the socket in the existing
> protocol, and can therefore never find out that the connection is dead.
> Writing to the socket would reveal this. But what happens if the server
> writes to the socket, and the client never reads?

You will eventually fill up a buffer on one side or the other, and at that
point any further write (send) will block forever.

> 3) Several people suggested using SO_REUSEADDR and/or an SO_LINGER of zero to
> drop the socket out of TIME_WAIT, but does the socket enter TIME_WAIT as soon
> as the client crashes? I didn't think so, but I may be wrong.

No. It enters TIME_WAIT when the socket closes. If the socket were closing,
you'd be getting a 0-length read (recv). You can confirm that with "netstat".
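As a small plain-socket illustration of the 0-length recv() = end-of-file rule
(hypothetical; the real server uses asyncore, and only the port number is
taken from this thread):

    import socket

    def serve_once(port=5700):
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(('', port))
        listener.listen(1)
        conn, addr = listener.accept()
        while True:
            data = conn.recv(4096)
            if not data:            # 0-length recv: the peer closed (EOF)
                print 'client closed the connection'
                break
            # process image bytes here ...
        conn.close()

Conversely, if this server wrote to a client that never reads, conn.send()
would eventually block once the send and receive buffers fill, exactly as
described above.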