thr3ads.net - openssh bugs - [Bug 2565] New: High baud rate gets sent, solaris closes pty [Apr 2016]

bugzilla-daemon at bugzilla.mindrot.org

2016-Apr-18 02:49 UTC

[Bug 2565] New: High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

            Bug ID: 2565
           Summary: High baud rate gets sent, solaris closes pty
           Product: Portable OpenSSH
           Version: 7.1p2
          Hardware: Sparc
                OS: Solaris
            Status: NEW
          Severity: minor
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: thogard at abnormal.com

It seems that setting the tty baud rate on Solaris to reasonable modern
values is broken: the pseudo-tty output ends up being closed.  This
affects Solaris 11.3 back to at least Solaris 9.

I can replicate it from OS X:
stty ospeed 57600
ssh solaris
Password: ....
(no output)

I can force the rate to 9600 with the following change, which both
confirms the problem and works around it:
 case TTY_OP_ISPEED_PROTO2 (and TTY_OP_OSPEED_PROTO2)
...
 baud = packet_get_int();
+debug("ibaud=%d",baud);
+baud=9600;

On some clients (such as QNX) stty can be used as a workaround.  "stty
ispeed 9600;stty ospeed 9600;ssh broken-host" will work.

Is it appropriate to try to fix this in sshd?  If so, should the proper
fix be something along the lines of
if (baud > MAX_BAUD) baud = MAX_BAUD;?
I can't see a way of setting MAX_BAUD other than a config option hack
with a list of broken systems.  Even very old Solaris has B57600 in
termios.h, so that can't be used as an indicator.
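
The clamp proposed above could be sketched as follows. This is only an
illustration: the value of MAX_BAUD and the helper name clamp_baud() are
hypothetical, and nothing here is taken from the sshd source.

```c
#include <assert.h>

/* Hypothetical ceiling, for illustration only; the report does not
 * say how a real value would be chosen. */
#define MAX_BAUD 38400u

/*
 * Limit a client-supplied baud rate to max_baud.
 * Rates at or below the ceiling are passed through unchanged.
 */
unsigned int
clamp_baud(unsigned int baud, unsigned int max_baud)
{
	/* A ceiling of 0 would turn every request into B0 (hang up). */
	assert(max_baud > 0);
	return baud > max_baud ? max_baud : baud;
}
```

With these assumptions, clamp_baud(57600, MAX_BAUD) yields 38400, while
clamp_baud(9600, MAX_BAUD) leaves the rate at 9600.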

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2016-Apr-18 10:59 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #1 from Damien Miller <djm at mindrot.org> ---
does setting ospeed actually do anything on a PTY? AFAIK it doesn't on
OS X, BSD or Linux.

If the terminal is being closed as a result of a failed cfsetispeed()
call then that indicates a kernel or (possibly) libc problem - ssh
doesn't exit when that fails, it just logs an error and continues.
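
The "log an error and continue" behaviour described above might look
roughly like this. It is a sketch of the pattern only, not the actual
ssh code; the function name set_speeds_best_effort() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <termios.h>

/*
 * Apply the requested input and output speeds to a termios structure.
 * A failure is reported on stderr and counted, but never treated as
 * fatal.  Returns the number of calls that failed (0, 1 or 2).
 */
int
set_speeds_best_effort(struct termios *tio, speed_t ispeed, speed_t ospeed)
{
	int failures = 0;

	assert(tio != NULL);
	if (cfsetispeed(tio, ispeed) == -1) {
		perror("cfsetispeed");
		failures++;
	}
	if (cfsetospeed(tio, ospeed) == -1) {
		perror("cfsetospeed");
		failures++;
	}
	return failures;
}
```

A caller would go on to apply the structure with tcsetattr() regardless
of the return value, so a rejected speed alone would not end the session.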

bugzilla-daemon at bugzilla.mindrot.org

2016-Jul-20 03:48 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Darren Tucker <dtucker at zip.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at zip.com.au

--- Comment #2 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Damien Miller from comment #1)
> does setting ospeed actually do anything on a PTY? AFAIK it doesn't
> on OS X, BSD or Linux.
> 
> If the terminal is being closed as a result of a failed
> cfsetispeed() call then that indicates a kernel or (possibly) libc
> problem - ssh doesn't exit when that fails, it just logs an error
> and continues.

I can reproduce this with an OpenBSD client and Solaris 11 server.  The
cfsetispeed succeeds but it looks like the data from the pty slave
never makes it back to the master.

truss'ing (with -v pollsys) sshd then hitting enter ends with
(lwp_sigmask calls elided):
21585:  write(2, " [ d t u c k e r @ s o l".., 19)      = 19
21583:   pollsys(0x08047890, 2, 0x00000000, 0x00000000) (sleeping...)
21583:           fd=4  ev=POLLRDNORM rev=0
21583:           fd=6  ev=POLLRDNORM rev=0x814 
21583:  pollsys(0x08047000, 2, 0x00000000, 0x00000000) (sleeping...)
21585:  read(0, 0x080473AC, 1)          (sleeping...)

The pids are:
 dtucker 21585 21583   0 11:40:29 pts/2       0:00 -bash
    root 21580 21579   0 11:40:27 pts/1       0:00 sshd -d -p 2022 -R
 dtucker 21583 21580   0 11:40:29 pts/1       0:00 sshd -d -p 2022 -R

so we've got the shell writing the prompt to stderr then waiting for
input on stdin, both of which are the pty.  Looks reasonable.

The sshd post-auth privsep slave is blocked in poll (which is how
select is implemented on Solaris).

sshd's descriptors are:

$ sudo lsof -p 21583
COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
[...]
sshd   21583 root    0u  VCHR              221,1          443220255 /dev/pts/1
sshd   21583 root    1u  VCHR              221,1          443220255 /dev/pts/1
sshd   21583 root    2u  VCHR              221,1          443220255 /dev/pts/1
sshd   21583 root    3r  DOOR                         0t0        46 /system/volatile/name_service_door (door to nscd[426]) (FA:->0xffffff00cf178700)
sshd   21583 root    4u  FIFO 0xffffff00dadd16c0      0t0    184434 (fifofs) PIPE->0xffffff00dadd1750
sshd   21583 root    5u  FIFO 0xffffff00dadd1750      0t0    184434 (fifofs) PIPE->0xffffff00dadd16c0
sshd   21583 root    6u  IPv4 0xffffff00ce59a100   0t5578       TCP sol11.dtucker.net:2022->client.dtucker.net:31563 (ESTABLISHED)
sshd   21583 root    7u  VCHR              238,2           81265120 /devices/pseudo/clone@0:ptm->ptm
sshd   21583 root   10u  VCHR              238,2           81265120 /devices/pseudo/clone@0:ptm->ptm

So, sshd is waiting to read from the TCP connection or a pipe, but not
the pty master.  That seems odd.

For the record, sshd is blocking in wait_until_can_do_something():

$ sudo gdb -q --args `pwd`/sshd -de -p 2022 -o useprivilegeseparation=no -r
(gdb) set follow-fork child
(gdb) run
[...]
debug1: Setting controlling tty using TIOCSCTTY.
^C
Program received signal SIGINT, Interrupt.
0xfec78ec5 in __pollsys () from /lib/libc.so.1
(gdb) bt
#0  0xfec78ec5 in __pollsys () from /lib/libc.so.1
#1  0xfec664c6 in _pollsys () from /lib/libc.so.1
#2  0xfec233b8 in pselect () from /lib/libc.so.1
#3  0xfec236b9 in select () from /lib/libc.so.1
#4  0x08074bbb in wait_until_can_do_something (readsetp=0x80479f8,
    writesetp=0x80479f4, maxfdp=0x80479f0, nallocp=0x80479ec, max_time_ms=0)
    at ../../serverloop.c:370
#5  0x08075ce7 in server_loop2 (authctxt=0x814be78) at ../../serverloop.c:863
#6  0x08081a3b in do_authenticated2 (authctxt=0x814be78)
    at ../../session.c:2758
#7  0x0807be6c in do_authenticated (authctxt=0x814be78) at ../../session.c:271
#8  0x0806be26 in main (ac=7, av=0x81451a8) at ../../sshd.c:2324

bugzilla-daemon at bugzilla.mindrot.org

2016-Jul-20 03:56 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

--- Comment #3 from Darren Tucker <dtucker at zip.com.au> ---
Comparing the truss from a client with ospeed=9600:

5172:   pollsys(0x08047890, 3, 0x00000000, 0x00000000) (sleeping...)
5172:           fd=4  ev=POLLRDNORM rev=0
5172:           fd=6  ev=POLLRDNORM rev=0x814  
5172:           fd=9  ev=POLLRDNORM rev=0x804  
5174:   read(0, 0x080473AC, 1)          (sleeping...)

$ sudo lsof -p 5172
COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
[...]
sshd    5172 root    4u  FIFO 0xffffff00dadd16c0      0t0    184472 (fifofs) PIPE->0xffffff00dadd1750
sshd    5172 root    5u  FIFO 0xffffff00dadd1750      0t0    184472 (fifofs) PIPE->0xffffff00dadd16c0
sshd    5172 root    6u  IPv4 0xffffff00de346380   0t6014       TCP sol11.dtucker.net:2022->quoll.dtucker.net:30321 (ESTABLISHED)
sshd    5172 root    7u  VCHR              238,2           81265120 /devices/pseudo/clone@0:ptm->ptm
sshd    5172 root    9u  VCHR              238,2           81265120 /devices/pseudo/clone@0:ptm->ptm
sshd    5172 root   10u  VCHR              238,2           81265120 /devices/pseudo/clone@0:ptm->ptm

so sshd is polling an extra descriptor attached to the pty master.  I
don't know why yet though.

bugzilla-daemon at bugzilla.mindrot.org

2016-Jul-20 04:41 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

--- Comment #4 from Darren Tucker <dtucker at zip.com.au> ---
Confirmed that commenting out only the calls to cfsetospeed and
cfsetispeed in ttymodes.c prevents the problem.

Going back through a complete strace of sshd I see this:

6212:   pollsys(0x08047890, 3, 0x00000000, 0x00000000)  = 2
6212:           fd=4  ev=POLLRDNORM rev=0
6212:           fd=6  ev=POLLOUT|POLLRDNORM rev=POLLOUT
6212:           fd=9  ev=POLLRDNORM rev=POLLRDNORM
6212:   clock_gettime(4, 0x08047954)                    = 0
6212:   read(9, 0x0804391C, 16384)                      = 0
6212:   close(9)                                        = 0
6212:   write(6, "\0\0\010 BA0 q zF8 {88F9".., 84)      = 84
6212:   getpid()                                        = 6212 [6209]
6212:   clock_gettime(4, 0x080478B4)                    = 0
6212:   pollsys(0x08047890, 2, 0x00000000, 0x00000000)  = 1
6212:           fd=4  ev=POLLRDNORM rev=0
6212:           fd=6  ev=POLLOUT|POLLRDNORM rev=POLLOUT

where fd#9 is our missing FD on the pty master, and this is the first
pollsys where fd#9 is included.

From this we can infer that sshd does in fact set up the descriptors
correctly.  pollsys says that fd#9 is readable, and when sshd reads it
the read returns zero, indicating end-of-file.  sshd then closes the
descriptor and removes it from the set it's looking at, which is why we
didn't see it in the later calls.
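
The poll, read, zero-means-EOF sequence seen in the trace can be
reproduced on an ordinary pipe. The sketch below is not sshd code and
does not involve a pty; poll_and_read() and demo_eof_on_pipe() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd is readable (or hung up), then read once.
 * Returns the number of bytes read, 0 on end-of-file, or -1 on
 * timeout or error.  A caller that sees 0 would normally close the
 * descriptor and stop polling it.
 */
ssize_t
poll_and_read(int fd, char *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	assert(buf != NULL);
	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;
	return read(fd, buf, len);
}

/*
 * Close the write end of a pipe and read from the other end:
 * poll() reports an event and read() returns 0.
 */
int
demo_eof_on_pipe(void)
{
	int p[2];
	char buf[16];
	ssize_t n;

	if (pipe(p) == -1)
		return -1;
	close(p[1]);
	n = poll_and_read(p[0], buf, sizeof(buf), 1000);
	close(p[0]);
	return (int)n;
}
```

Here the zero-length read is a genuine end-of-file because the writer
has gone away; in the trace above the shell on the pty slave was still
running when the master returned zero.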

If this sounds familiar it's because we've encountered the same thing
on AIX (but probably for different reasons, in that case it was passing
through zero-byte writes from the pty slave).  There's already a
workaround hack for it (PTY_ZEROREAD) which seems to help in this case,
but I'd like to understand what the problem is (and what other
potential side effects are) before we go enabling that on Solaris.

Anyone from Oracle care to comment?  Why does setting the baud rate on
a pty to 57600 cause it to change its read behaviour?

bugzilla-daemon at bugzilla.mindrot.org

2016-Aug-02 14:46 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Tomas Kuthan <tomas.kuthan at oracle.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tomas.kuthan at oracle.com
           Assignee|unassigned-bugs at mindrot.org |tomas.kuthan at oracle.com

--- Comment #5 from Tomas Kuthan <tomas.kuthan at oracle.com> ---
Thanks for reporting this issue.

It turns out, it is a long-standing Solaris kernel bug. It only
pertains to the one particular speed (57600 baud), other rates (such as
115200, 230400 and 460800) work fine, provided that the HW device
supports it.

I have a verified fix ready for this. We will fix it eventually in
Solaris.

I am taking ownership of the bug for now, but we could just as well
close it, as this is not a bug in OpenSSH.

bugzilla-daemon at bugzilla.mindrot.org

2016-Aug-02 23:30 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

--- Comment #6 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Tomas Kuthan from comment #5)
> Thanks for reporting this issue.
> 
> It turns out, it is a long-standing Solaris kernel bug. It only
> pertains to the one particular speed (57600 baud), other rates (such
> as 115200, 230400 and 460800) work fine, provided that the HW device
> supports it.

That sounds fascinatingly specific.  Are you able to share the details?

> I have a verified fix ready for this. We will fix it eventually in
> Solaris.

Thanks!

> I am taking ownership of the bug for now, but we could just as well
> close it, as this is not a bug in OpenSSH.

Given that there's already a workaround (ie "don't do that then") and a
proper fix coming I think we should close the bug.

bugzilla-daemon at bugzilla.mindrot.org

2016-Aug-03 08:56 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Tomas Kuthan <tomas.kuthan at oracle.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #7 from Tomas Kuthan <tomas.kuthan at oracle.com> ---
On Solaris, 16 original baud rates B0 through B38400 were represented
in the four least significant bits of termios.c_cflag. When additional
baud rates were added back in 1994, the adjacent bits were already
assigned for a different purpose, so bits 22 and 23 were used to flag
extended baudrates. Baud rates B57600 through B460800 were again
represented in the least significant bits with the extension bit set.
All the speed handling code was updated to take the extension bits into
account.

... all the code, that is, except for one 'if' that was left behind.
B57600 is encoded with the extension bit set and with the 4 least
significant bits all zeros. The neglected 'if' mistook B57600 for B0.
B0 means 'hang up'; hung up it did.

I am closing the bug as not-an-OpenSSH-bug. (Feel free to change the
substatus.)

bugzilla-daemon at mindrot.org

2021-Apr-23 05:00 UTC

[Bug 2565] High baud rate gets sent, solaris closes pty

https://bugzilla.mindrot.org/show_bug.cgi?id=2565

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #8 from Damien Miller <djm at mindrot.org> ---
closing resolved bugs as of 8.6p1 release
