bugzilla-daemon at bugzilla.mindrot.org
2016-Apr-18 02:49 UTC
[Bug 2565] New: High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

            Bug ID: 2565
           Summary: High baud rate gets sent, solaris closes pty
           Product: Portable OpenSSH
           Version: 7.1p2
          Hardware: Sparc
                OS: Solaris
            Status: NEW
          Severity: minor
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: thogard at abnormal.com

It seems that trying to set the tty baud rate in Solaris to reasonable
modern values is broken, resulting in the pseudo tty output being
closed. This affects Solaris 11.3 back to at least Sol 9.

I can replicate it from OS X:

    stty ospeed 57600
    ssh solaris
    Password: ....
    (no output)

I can force it to 9600 to fix the problem, with code that verifies the
problem and fixes it:

    case TTY_OP_ISPEED_PROTO2 (and TTY_OP_OSPEED_PROTO2)
    ...
    baud = packet_get_int();
    +debug("ibaud=%d",baud);
    +baud=9600;

On some clients (such as QNX) stty can be used as a workaround.
"stty ispeed 9600;stty ospeed 9600;ssh broken-host" will work.

Is it appropriate to try to fix this in sshd? If so, then should the
proper way to fix this be something along the lines of
if ( baud > MAX_BAUD) baud=MAX_BAUD;? I can't see a way of setting
MAX_BAUD other than a config option hack with a list for broken
systems. Even very old Solaris has B57600 in termios.h so that can't
be used as an indicator.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Apr-18 10:59 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #1 from Damien Miller <djm at mindrot.org> ---
Does setting ospeed actually do anything on a PTY? AFAIK it doesn't on
OS X, BSD or Linux.

If the terminal is being closed as a result of a failed cfsetispeed()
call then that indicates a kernel or (possibly) libc problem - ssh
doesn't exit when that fails, it just logs an error and continues.
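
[Editorial illustration] The "log an error and continue" behaviour described in comment #1 could look roughly like the sketch below. It is not the code from ttymodes.c; apply_speeds() is a hypothetical helper, and only the POSIX calls cfsetispeed() and cfsetospeed() are real.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <termios.h>

/*
 * Hypothetical helper (not an OpenSSH function): store the requested
 * speeds in a termios structure. A failed call is reported on stderr
 * and otherwise ignored, so the session setup carries on.
 * Returns the number of calls that failed.
 */
int
apply_speeds(struct termios *tio, speed_t ispeed, speed_t ospeed)
{
	int failures = 0;

	assert(tio != NULL);
	if (cfsetispeed(tio, ispeed) == -1) {
		fprintf(stderr, "cfsetispeed failed: %s\n", strerror(errno));
		failures++;
	}
	if (cfsetospeed(tio, ospeed) == -1) {
		fprintf(stderr, "cfsetospeed failed: %s\n", strerror(errno));
		failures++;
	}
	return failures;
}
```

Note that these calls only update the structure in memory. Nothing reaches the kernel until the structure is passed to tcsetattr(), so a successful return here says nothing about how the pty driver will later react to the speed.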
bugzilla-daemon at bugzilla.mindrot.org
2016-Jul-20 03:48 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Darren Tucker <dtucker at zip.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at zip.com.au

--- Comment #2 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Damien Miller from comment #1)
> does setting ospeed actually do anything on a PTY? AFAIK it doesn't
> on OS X, BSD or Linux.
>
> If the terminal is being closed as a result of a failed
> cfsetispeed() call then that indicates a kernel or (possibly) libc
> problem - ssh doesn't exit when that fails, it just logs an error
> and continues.

I can reproduce this with an OpenBSD client and Solaris 11 server. The
cfsetispeed succeeds but it looks like the data from the pty slave
never makes it back to the master.

truss'ing (with -v pollsys) sshd then hitting enter ends with
(lwp_sigmask calls elided):

    21585: write(2, " [ d t u c k e r @ s o l".., 19) = 19
    21583: pollsys(0x08047890, 2, 0x00000000, 0x00000000) (sleeping...)
    21583: fd=4 ev=POLLRDNORM rev=0
    21583: fd=6 ev=POLLRDNORM rev=0x814
    21583: pollsys(0x08047000, 2, 0x00000000, 0x00000000) (sleeping...)
    21585: read(0, 0x080473AC, 1) (sleeping...)

The pids are:

    dtucker 21585 21583 0 11:40:29 pts/2 0:00 -bash
    root 21580 21579 0 11:40:27 pts/1 0:00 sshd -d -p 2022 -R
    dtucker 21583 21580 0 11:40:29 pts/1 0:00 sshd -d -p 2022 -R

so we've got the shell writing the prompt to stderr then waiting for
input on stdin, both of which are the pty. Looks reasonable. The sshd
post-auth privsep slave is blocked in poll (which is how select is
implemented on Solaris). sshd's descriptors are:

    $ sudo lsof -p 21583
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
    [...]
    sshd 21583 root 0u VCHR 221,1 443220255 /dev/pts/1
    sshd 21583 root 1u VCHR 221,1 443220255 /dev/pts/1
    sshd 21583 root 2u VCHR 221,1 443220255 /dev/pts/1
    sshd 21583 root 3r DOOR 0t0 46 /system/volatile/name_service_door (door to nscd[426]) (FA:->0xffffff00cf178700)
    sshd 21583 root 4u FIFO 0xffffff00dadd16c0 0t0 184434 (fifofs) PIPE->0xffffff00dadd1750
    sshd 21583 root 5u FIFO 0xffffff00dadd1750 0t0 184434 (fifofs) PIPE->0xffffff00dadd16c0
    sshd 21583 root 6u IPv4 0xffffff00ce59a100 0t5578 TCP sol11.dtucker.net:2022->client.dtucker.net:31563 (ESTABLISHED)
    sshd 21583 root 7u VCHR 238,2 81265120 /devices/pseudo/clone@0:ptm->ptm
    sshd 21583 root 10u VCHR 238,2 81265120 /devices/pseudo/clone@0:ptm->ptm

So, sshd is waiting to read from the TCP connection or a pipe, but not
the pty master. That seems odd.

For the record, sshd is blocking in wait_until_can_do_something():

    $ sudo gdb -q --args `pwd`/sshd -de -p 2022 -o useprivilegeseparation=no -r
    (gdb) set follow-fork child
    (gdb) run
    [...]
    debug1: Setting controlling tty using TIOCSCTTY.
    ^C
    Program received signal SIGINT, Interrupt.
    0xfec78ec5 in __pollsys () from /lib/libc.so.1
    (gdb) bt
    #0 0xfec78ec5 in __pollsys () from /lib/libc.so.1
    #1 0xfec664c6 in _pollsys () from /lib/libc.so.1
    #2 0xfec233b8 in pselect () from /lib/libc.so.1
    #3 0xfec236b9 in select () from /lib/libc.so.1
    #4 0x08074bbb in wait_until_can_do_something (readsetp=0x80479f8, writesetp=0x80479f4, maxfdp=0x80479f0, nallocp=0x80479ec, max_time_ms=0) at ../../serverloop.c:370
    #5 0x08075ce7 in server_loop2 (authctxt=0x814be78) at ../../serverloop.c:863
    #6 0x08081a3b in do_authenticated2 (authctxt=0x814be78) at ../../session.c:2758
    #7 0x0807be6c in do_authenticated (authctxt=0x814be78) at ../../session.c:271
    #8 0x0806be26 in main (ac=7, av=0x81451a8) at ../../sshd.c:2324
bugzilla-daemon at bugzilla.mindrot.org
2016-Jul-20 03:56 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

--- Comment #3 from Darren Tucker <dtucker at zip.com.au> ---
Comparing the truss from a client with ospeed=9600:

    5172: pollsys(0x08047890, 3, 0x00000000, 0x00000000) (sleeping...)
    5172: fd=4 ev=POLLRDNORM rev=0
    5172: fd=6 ev=POLLRDNORM rev=0x814
    5172: fd=9 ev=POLLRDNORM rev=0x804
    5174: read(0, 0x080473AC, 1) (sleeping...)

    $ sudo lsof -p 5172
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
    [...]
    sshd 5172 root 4u FIFO 0xffffff00dadd16c0 0t0 184472 (fifofs) PIPE->0xffffff00dadd1750
    sshd 5172 root 5u FIFO 0xffffff00dadd1750 0t0 184472 (fifofs) PIPE->0xffffff00dadd16c0
    sshd 5172 root 6u IPv4 0xffffff00de346380 0t6014 TCP sol11.dtucker.net:2022->quoll.dtucker.net:30321 (ESTABLISHED)
    sshd 5172 root 7u VCHR 238,2 81265120 /devices/pseudo/clone@0:ptm->ptm
    sshd 5172 root 9u VCHR 238,2 81265120 /devices/pseudo/clone@0:ptm->ptm
    sshd 5172 root 10u VCHR 238,2 81265120 /devices/pseudo/clone@0:ptm->ptm

so in the working case sshd is polling an extra descriptor (fd 9)
attached to the pty master. I don't know why yet though.
bugzilla-daemon at bugzilla.mindrot.org
2016-Jul-20 04:41 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

--- Comment #4 from Darren Tucker <dtucker at zip.com.au> ---
Confirmed that commenting out only the calls to cfsetospeed and
cfsetispeed in ttymodes.c prevents the problem.

Going back through a complete strace of sshd I see this:

    6212: pollsys(0x08047890, 3, 0x00000000, 0x00000000) = 2
    6212: fd=4 ev=POLLRDNORM rev=0
    6212: fd=6 ev=POLLOUT|POLLRDNORM rev=POLLOUT
    6212: fd=9 ev=POLLRDNORM rev=POLLRDNORM
    6212: clock_gettime(4, 0x08047954) = 0
    6212: read(9, 0x0804391C, 16384) = 0
    6212: close(9) = 0
    6212: write(6, "\0\0\010 BA0 q zF8 {88F9".., 84) = 84
    6212: getpid() = 6212 [6209]
    6212: clock_gettime(4, 0x080478B4) = 0
    6212: pollsys(0x08047890, 2, 0x00000000, 0x00000000) = 1
    6212: fd=4 ev=POLLRDNORM rev=0
    6212: fd=6 ev=POLLOUT|POLLRDNORM rev=POLLOUT

where fd#9 is our missing FD on the pty master, and this is the first
pollsys where fd#9 is included.

From this we can infer that sshd does in fact set up the descriptors
correctly. pollsys says that fd#9 is readable, and when sshd reads it
the read returns zero, indicating end-of-file. sshd then closes the
descriptor and removes it from the set it's looking at, which is why
we didn't see it in the later calls.

If this sounds familiar it's because we've encountered the same thing
on AIX (but probably for different reasons; in that case it was passing
through zero-byte writes from the pty slave). There's already a
workaround hack for it (PTY_ZEROREAD) which seems to help in this case,
but I'd like to understand what the problem is (and what other
potential side effects there are) before we go enabling that on
Solaris.

Anyone from Oracle care to comment? Why does setting the baud rate on
a pty to 57600 cause it to change its read behaviour?
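
[Editorial illustration] The sequence in the trace above (descriptor reported readable, read() returns 0, descriptor closed and dropped from the poll set) follows from the usual convention that a zero-length read means end-of-file. A minimal sketch of that classification, assuming nothing about sshd's real channel code; classify_read() and the enum are hypothetical names:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

enum read_result { READ_DATA, READ_EOF, READ_RETRY, READ_ERROR };

/*
 * Hypothetical helper: perform one read() on fd and classify the
 * outcome the way a typical select/poll loop would. The byte count
 * (or -1) is stored in *nread.
 */
enum read_result
classify_read(int fd, char *buf, size_t len, ssize_t *nread)
{
	assert(buf != NULL && nread != NULL);

	*nread = read(fd, buf, len);
	if (*nread > 0)
		return READ_DATA;
	if (*nread == 0)
		return READ_EOF;	/* caller closes fd, stops polling it */
	if (errno == EINTR || errno == EAGAIN)
		return READ_RETRY;
	return READ_ERROR;
}
```

Under this convention a pty master that hands back a single spurious zero-length read is indistinguishable from one whose slave side has gone away, which matches the behaviour seen in the trace.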
bugzilla-daemon at bugzilla.mindrot.org
2016-Aug-02 14:46 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Tomas Kuthan <tomas.kuthan at oracle.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tomas.kuthan at oracle.com
           Assignee|unassigned-bugs at mindrot.org |tomas.kuthan at oracle.com

--- Comment #5 from Tomas Kuthan <tomas.kuthan at oracle.com> ---
Thanks for reporting this issue.

It turns out, it is a long-standing Solaris kernel bug. It only
pertains to the one particular speed (57600 baud), other rates (such
as 115200, 230400 and 460800) work fine, provided that the HW device
supports it.

I have a verified fix ready for this. We will fix it eventually in
Solaris.

I am taking ownership of the bug for now, but we could just as well
close it, as this is not a bug in OpenSSH.
bugzilla-daemon at bugzilla.mindrot.org
2016-Aug-02 23:30 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

--- Comment #6 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Tomas Kuthan from comment #5)
> Thanks for reporting this issue.
>
> It turns out, it is a long-standing Solaris kernel bug. It only
> pertains to the one particular speed (57600 baud), other rates (such
> as 115200, 230400 and 460800) work fine, provided that the HW device
> supports it.

That sounds fascinatingly specific. Are you able to share the details?

> I have a verified fix ready for this. We will fix it eventually in
> Solaris.

Thanks!

> I am taking ownership of the bug for now, but we could just as well
> close it, as this is not a bug in OpenSSH.

Given that there's already a workaround (ie "don't do that then") and
a proper fix coming I think we should close the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Aug-03 08:56 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Tomas Kuthan <tomas.kuthan at oracle.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #7 from Tomas Kuthan <tomas.kuthan at oracle.com> ---
On Solaris, the 16 original baud rates B0 through B38400 were
represented in the four least significant bits of termios.c_cflag.
When additional baud rates were added back in 1994, the adjacent bits
were already assigned for a different purpose, so bits 22 and 23 were
used to flag extended baud rates. Baud rates B57600 through B460800
were again represented in the least significant bits, with the
extension bit set. All the speed handling code was updated to take the
extension bits into account.

... all the code, that is, except for one 'if' that was left behind.

B57600 is encoded with the extension bit set and with the 4 least
significant bits all zeros. The neglected 'if' mistook B57600 for B0.
B0 means 'hang up'; hang up it did.

I am closing the bug as not-an-OpenSSH-bug. (Feel free to change the
substatus.)
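
[Editorial illustration] The encoding problem described in comment #7 can be modelled in a few lines. This is a sketch only: the X_ constants, the speed codes and both predicates are hypothetical stand-ins chosen to match the description (four low bits plus one extension flag), not values or code taken from the Solaris headers or kernel.

```c
#include <assert.h>

/* Illustrative values only; NOT copied from Solaris <sys/termios.h>. */
#define X_CBAUD    0x0000000fUL	/* four least significant bits */
#define X_CBAUDEXT (1UL << 21)	/* placeholder "extended speed" flag */

#define X_B0       0U	/* means "hang up" */
#define X_B38400   15U	/* last of the original 16 rates */
#define X_B57600   16U	/* first extended rate: low bits 0, flag set */
#define X_B115200  18U	/* some later extended rate */

/* Store a speed code in cflag: low four bits plus the extension flag. */
unsigned long
encode_ospeed(unsigned long cflag, unsigned int code)
{
	assert(code < 32);
	cflag &= ~(X_CBAUD | X_CBAUDEXT);
	cflag |= code & X_CBAUD;
	if (code > X_CBAUD)
		cflag |= X_CBAUDEXT;
	return cflag;
}

/* The neglected check: looks at the low bits only. */
int
is_hangup_buggy(unsigned long cflag)
{
	return (cflag & X_CBAUD) == 0;
}

/* A check that also takes the extension flag into account. */
int
is_hangup_fixed(unsigned long cflag)
{
	return (cflag & (X_CBAUD | X_CBAUDEXT)) == 0;
}
```

With this model, is_hangup_buggy(encode_ospeed(0, X_B57600)) is true while the same call with X_B115200 is false, which is consistent with the report that only 57600 baud triggers the hang-up.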
bugzilla-daemon at mindrot.org
2021-Apr-23 05:00 UTC
[Bug 2565] High baud rate gets sent, solaris closes pty
https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #8 from Damien Miller <djm at mindrot.org> ---
closing resolved bugs as of 8.6p1 release