Hey all, I've run in to a strange issue and I'm wondering if anyone has any insight on this. So I have a modified version of OpenSSH called hpnssh that uses larger receive buffer sizes to improve throughput. When I ported that patch set to 8.9 I found out after an initial burst of data the throughput would drop to zero after an initial burst and just sort of stay there. This happens regardless of the RTT; same issues at sub-millisecond RTTs and 150ms. I've been able to replicate this behaviour in OpenSSH by increasing the size of CHAN_SES_DEFAULT_WINDOW from 64*CHAN_SES_PACKET_DEFAULT to higher values. You kind of see it in the debug log at 96*CSPD and you can start to see it impact throughput at 128*CSDP. With my tests I have it at 512*CSPD and the pause will last upwards of 5 seconds. When it's at 1024*CSPD (corresponding to a 32MB receive buffer) the pause will last 18 seconds or more. I dropped in a debug statement in to sshbuf.c in sshbuf_allocate to get a better view of whats happening. The following are the results of that in both 8.9p1 and 8.8p1 Version 8.9p1 Start of buffer growth debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 98304, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 131072, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 163840, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 196608, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 229376, len: 32768 for 0x55d548211550 ? debug1: adjusted rlen: 22003712, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22036480, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22069248, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 debug1: adjusted rlen: 22085632, len: 16384 for 0x55d548211550 debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 debug1: adjusted rlen: 22102016, len: 16384 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22134784, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22167552, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22200320, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 debug1: adjusted rlen: 22216704, len: 16384 for 0x55d548211550 debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 debug1: adjusted rlen: 22233088, len: 16384 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22265856, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22298624, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 22331392, len: 32768 for 0x55d548211550 debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 debug1: adjusted rlen: 22347776, len: 16384 for 0x55d548211550 debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 ? end of buffer growth So there is a buffer that kicks in after an short initial period of good throughput that has then has to grow larger that the size of CHAN_SES_WINDOW_DEFAULT (128*CSPD in this case) before throughput can start to pick up again. During this growth period CPU usage is pegged at 100% even though throughput is minimal. Version 8.8p1 Start buffer adjustment debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 49152, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 81920, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 114688, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 147456, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 180224, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 212992, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 245760, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 278528, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 311296, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 344064, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 376832, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 409600, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 442368, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 475136, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 507904, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 540672, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 573440, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 606208, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 638976, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 671744, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 704512, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 737280, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 770048, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 802816, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 835584, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 868352, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 901120, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 933888, len: 32768 for 0x5570cf28a3e0 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 End buffer adjustment In version 8.8 I also see a buffer growing during the course of the transfer but it looks like it doesn?t need to grow to the same size as in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on the same test bed, same set up, and the tests are being run right after each other. The behaviour in 8.9 is very consistent. After a certain number of adjusted rlens for, what I am assuming is the packet buffer, it goes into this growth phase for another buffer. In my test that been about 700 of these adjusts which is unrelated to the size of CHAN_SES_WINDOW_DEFAULT So I?m wondering what changed between 8.8 and 8.9 that might account for this and if this is expected/desired behaviour. I'm still working through the diff from 8.8 to 8.9 but nothing has leaped out at me yet. Thanks, Chris
Of course, 90 minutes later I find that my issues starts with commit b30d32159d "upstream: add a ssh_packet_process_read() function that reads from a fd directly into the transport input buffer. Use this in the client and server mainloops to avoid unnecessary copying. It also lets us use a more greedy read size without penalty. Yields a 2-3% performance gain on cipher-speed.sh (in a fairly unscientific test tbf)" I still don't know why it's causing this behaviour so if anyone has a clue please share? Thanks Chris On 5/18/22 2:02 PM, rapier wrote:> Hey all, > > I've run in to a strange issue and I'm wondering if anyone has any > insight on this. So I have a modified version of OpenSSH called hpnssh > that uses larger receive buffer sizes to improve throughput. When I > ported that patch set to 8.9 I found out after an initial burst of data > the throughput would drop to zero after an initial burst and just sort > of stay there. This happens regardless of the RTT; same issues at > sub-millisecond RTTs and 150ms. > > I've been able to replicate this behaviour in OpenSSH by increasing the > size of CHAN_SES_DEFAULT_WINDOW from 64*CHAN_SES_PACKET_DEFAULT to > higher values. You kind of see it in the debug log at 96*CSPD and you > can start to see it impact throughput at 128*CSDP. With my tests I have > it at 512*CSPD and the pause will last upwards of 5 seconds. When it's > at 1024*CSPD (corresponding to a 32MB receive buffer) the pause will > last 18 seconds or more. > > I dropped in a debug statement in to sshbuf.c in sshbuf_allocate to get > a better view of whats happening. The following are the results of that > in both 8.9p1 and 8.8p1 > > > Version 8.9p1 > > Start of buffer growth > > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 98304, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 131072, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 163840, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 196608, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 229376, len: 32768 for 0x55d548211550 > ? > debug1: adjusted rlen: 22003712, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22036480, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22069248, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 > debug1: adjusted rlen: 22085632, len: 16384 for 0x55d548211550 > debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 > debug1: adjusted rlen: 22102016, len: 16384 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22134784, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22167552, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22200320, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 > debug1: adjusted rlen: 22216704, len: 16384 for 0x55d548211550 > debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 > debug1: adjusted rlen: 22233088, len: 16384 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22265856, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22298624, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 22331392, len: 32768 for 0x55d548211550 > debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 > debug1: adjusted rlen: 22347776, len: 16384 for 0x55d548211550 > debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60 > ? end of buffer growth > > So there is a buffer that kicks in after an short initial period of good > throughput that has then has to grow larger that the size of > CHAN_SES_WINDOW_DEFAULT (128*CSPD in this case) before throughput can > start to pick up again. During this growth period CPU usage is pegged at > 100% even though throughput is minimal. > > > Version 8.8p1 > > Start buffer adjustment > > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 49152, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 81920, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 114688, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 147456, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 180224, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 212992, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 245760, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 278528, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 311296, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 344064, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 376832, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 409600, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 442368, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 475136, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 507904, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 540672, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 573440, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 606208, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 638976, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 671744, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 704512, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 737280, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 770048, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 802816, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 835584, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 868352, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 901120, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 933888, len: 32768 for 0x5570cf28a3e0 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410 > > End buffer adjustment > > In version 8.8 I also see a buffer growing during the course of the > transfer but it looks like it doesn?t need to grow to the same size as > in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on > the same test bed, same set up, and the tests are being run right after > each other. > > The behaviour in 8.9 is very consistent. After a certain number of > adjusted rlens for, what I am assuming is the packet buffer, it goes > into this growth phase for another buffer. In my test that been about > 700 of these adjusts which is unrelated to the size of > CHAN_SES_WINDOW_DEFAULT > > So I?m wondering what changed between 8.8 and 8.9 that might account for > this and if this is expected/desired behaviour. I'm still working > through the diff from 8.8 to 8.9 but nothing has leaped out at me yet. > > Thanks, > Chris
On Wed, 18 May 2022, rapier wrote:> Hey all, > > I've run in to a strange issue and I'm wondering if anyone has any > insight on this. So I have a modified version of OpenSSH called hpnssh > that uses larger receive buffer sizes to improve throughput. When I > ported that patch set to 8.9 I found out after an initial burst of data > the throughput would drop to zero after an initial burst and just sort > of stay there. This happens regardless of the RTT; same issues at > sub-millisecond RTTs and 150ms. > > I've been able to replicate this behaviour in OpenSSH by increasing the > size of CHAN_SES_DEFAULT_WINDOW from 64*CHAN_SES_PACKET_DEFAULT to > higher values. You kind of see it in the debug log at 96*CSPD and you > can start to see it impact throughput at 128*CSDP. With my tests I have > it at 512*CSPD and the pause will last upwards of 5 seconds. When it's > at 1024*CSPD (corresponding to a 32MB receive buffer) the pause will > last 18 seconds or more. > > I dropped in a debug statement in to sshbuf.c in sshbuf_allocate to get > a better view of whats happening. The following are the results of that > in both 8.9p1 and 8.8p1[snip]> In version 8.8 I also see a buffer growing during the course of the > transfer but it looks like it doesn?t need to grow to the same size as > in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on > the same test bed, same set up, and the tests are being run right after > each other. > > The behaviour in 8.9 is very consistent. After a certain number of > adjusted rlens for, what I am assuming is the packet buffer, it goes > into this growth phase for another buffer. In my test that been about > 700 of these adjusts which is unrelated to the size of > CHAN_SES_WINDOW_DEFAULT > > So I?m wondering what changed between 8.8 and 8.9 that might account for > this and if this is expected/desired behaviour. I'm still working > through the diff from 8.8 to 8.9 but nothing has leaped out at me yet.8.9 switch from select() to poll() and included a couple of bugs that could cause weird problems. IMO you should try to port to what's on the top of the V_9_0 git branch, which is 9.0 + one more poll()- related fix. -d