sshd also dies when certain other kinds of traffic are generated, such as `man pw' using the most pager[1], and many x11 apps such as emacs. However, it is stable when running simple x11 apps such as xeyes, and the link itself is stable -- a terminal will stay connected without issue for days, as long as not much happens in it. Also, an sshfs connection dies immediately.

    ssh -Y karren
    gkrellm &
    *sshd dies*

Cutting to the chase, the log message which seems the most important is:

    Aug 23 14:45:11 karen sshd[62451]: fatal: Fssh_packet_write_poll: Connection from 174.77.777.77 port 57670: Permission denied

However, even if I put both machines outside their respective firewalls, opening all ports, the message is still the same. It sounds like something internal to the server is denying access to the high port it wants, but other high port services work ok: irc & mosh. And yeah, mosh works atop of ssh, but it doesn't do everything I need, and it scrambles keycodes going to emacs.

Even more confusing, these two machines work fine when they're both on the same LAN, so it seems like it must be something with the uplink to the Internet. I also suspect the server's uplink, as the behavior was the same when I took the client to our local university.

`sshd -ddd' doesn't add any further insights for me, only lots of PAM diagnostics:

    debug1: Setting controlling tty using TIOCSCTTY.
    Fssh_packet_write_poll: Connection from 174.52.251.44 port 32812: Permission denied
    debug1: do_cleanup
    debug3: PAM: sshpam_thread_cleanup entering
    debug3: mm_request_receive entering
    debug1: do_cleanup
    debug1: PAM: cleanup
    debug1: PAM: closing session
    debug1: PAM: deleting credentials
    debug3: PAM: sshpam_thread_cleanup entering
    debug1: session_pty_cleanup: session 0 release /dev/pts/23

Feedback on one of the FreeBSD forums suggested that the MTU on the routers might be less than what the machines were using, and that excessive fragmentation might be causing the connection to die.
The router MTUs were 1492, and the system MTUs were 1500. Unfortunately, reducing the systems' MTUs to 1400 did not affect the problem, but at least I have less fragmentation now.

I've tried every config option and command-line switch that looked even remotely related, but nothing has affected it: -ddd, -vvv, -E, -D and all sorts of keepalives. Of course, I'm hoping that someone is going to point at one that I missed and magically make it work. I've used ssh for many years and never had a problem like this before.

The Server
----------

    $ uname -a
    FreeBSD karren.example.com 11.0-RELEASE-p9 FreeBSD 11.0-RELEASE-p9 #0: Tue Apr 11 08:48:40 UTC 2017 root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
    $ sshd -v
    OpenSSH_7.2p2, OpenSSL 1.0.2k-freebsd 26 Jan 2017

The ISDN-TA is a CenturyLink ZyXEL PK5001Z

The Client
----------

    $ uname -a
    Linux piglet 4.10.0-32-generic #36~16.04.1-Ubuntu SMP Wed Aug 9 09:19:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
    $ ssh -V
    OpenSSH_7.2p2 Ubuntu-4ubuntu2.2, OpenSSL 1.0.2g 1 Mar 2016

The Cable Modem is an ARRIS TG1682

Even after writing all this I'm not sure what makes sense to try next. I could upgrade the binaries, but these are the standard shipping ones on the distros, and no one else seems to be having this problem. This thing has really crimped my style for the last week of head banging against it. Please can someone help?

[1] The failure with `man pw' and the more pager is quasi intermittent. Sometimes the link dies before the first screenful is rendered. Other times you can page up and down a bit before it croaks. The `man pw' page is stable using the less pager.
On 30 August 2017 at 11:50, cira <ciradrak at centurylink.net> wrote:
> [...]
> Cutting to the chase, the log message which seems the most important is:

In future please include the full log as it may have other information relevant to the problem.

> Aug 23 14:45:11 karen sshd[62451]: fatal: Fssh_packet_write_poll:
> Connection from 174.77.777.77 port 57670: Permission denied

"Fssh_packet_write_poll" does not look like a message generated by the stock source available at openssh.com.

> OpenSSH_7.2p2, OpenSSL 1.0.2k-freebsd 26 Jan 2017

That is a server that has been modified by a third party. Can you reproduce your problem with the stock code from openssh.com?

--
Darren Tucker (dtucker at zip.com.au)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
On 30 August 2017 at 12:51, Darren Tucker <dtucker at zip.com.au> wrote:
> "Fssh_packet_write_poll" does not look like a message generated by the
> stock source available at openssh.com.

For the record this is due to some name mangling on the part of FreeBSD:

https://github.com/freebsd/freebsd/blob/master/crypto/openssh/ssh_namespace.h#L496

so misleadingly the error actually comes from opacket.c:packet_write_poll() and not packet.c:ssh_packet_write_poll().

I would guess that something on the server (maybe a firewall) is causing either accept() or read() on high numbered ports to fail with EPERM. I'd suggest you strace/truss/ktrace the sshd immediately before triggering the failure and look for what's EPERMing.

--
Darren Tucker (dtucker at zip.com.au)
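
For readers who want to follow the tracing suggestion above on the FreeBSD server, here is a minimal sketch using ktrace(1) and kdump(1). The trace file path and the function names start_trace and eperm_calls are assumptions made for this example, and the filter assumes kdump's usual "RET <syscall> -1 errno <n> <text>" record layout; adjust it if your output looks different.

```bash
#!/usr/bin/env bash
# Sketch only: trace a running sshd on FreeBSD and list the system calls
# that failed with EPERM (errno 1, "Operation not permitted").
# TRACE_FILE, start_trace and eperm_calls are names invented for this example.

TRACE_FILE=${TRACE_FILE:-/tmp/sshd.ktrace}

# start_trace PID: attach ktrace to the sshd process serving the session.
# -d also traces its current children, -i lets future children inherit
# the tracing. Tracing of a process ends when that process exits.
start_trace() {
    ktrace -f "$TRACE_FILE" -d -i -p "$1"
}

# eperm_calls: read kdump text on stdin and keep only the return records
# whose errno is exactly 1 (so errno 13, errno 35 and so on are skipped).
eperm_calls() {
    grep -E 'RET +[[:alnum:]_]+ -1 errno 1( |$)'
}

# Possible usage, as root on the server, just before triggering the failure:
#   start_trace <pid of the per-connection sshd>
#   ... run `man pw' in the ssh session until the link dies ...
#   kdump -f "$TRACE_FILE" | eperm_calls
```

The name of the failing call in the surviving lines (write, sendto, read, ...) should show which operation is being refused.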
> sshd also dies when certain other kinds of traffic is generated, such as
> `man pw' using the most pager[1], and many x11 apps such as emacs.
> However, it is stable when running simple x11 apps such as xeyes, and
> the link its self is stable -- a terminal will stay connected without
> issue for days, as long as not much happens in it. Also a sshfs
> connection dies immediately.
>
> ssh -Y karren
> gkrellm &
> *sshd dies*

My guess is that you *have* an MTU problem on your link, ie. that the PMTU detection doesn't work; and 1400 might still be too big.

For a test, please run

    # ping -M do -c 3 -s <size> <peer>

with size values from 600 to 1400, and then 1450 and 1472.

If some size works but the higher ones don't, set the MTU of one of the machines to the corresponding value, and then establish an SSH connection.

Also, if your traceroute has an "--mtu" option, you could run

    # traceroute --mtu <peer> 1472

to see some information; but as your ICMPs seem to get filtered, that might not actually help.
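
The size sweep suggested above can also be automated. The following is a rough sketch, not a tested tool: it binary-searches for the largest ICMP payload that gets through with the don't-fragment bit set. The function names probe_linux and max_payload are invented for this example, and probe_linux assumes the Linux iputils ping (-M do, -W); on FreeBSD the probe would need to be swapped for one using ping -D.

```bash
#!/usr/bin/env bash
# Sketch only: find the largest ping payload that passes unfragmented.
# probe_linux and max_payload are hypothetical helper names.

# probe_linux SIZE HOST: succeed if one DF-flagged echo of SIZE bytes is answered.
probe_linux() {
    ping -M do -c 1 -W 2 -s "$1" "$2" >/dev/null 2>&1
}

# max_payload LO HI PROBE HOST: print the largest size in [LO, HI] for which
# "PROBE size HOST" succeeds. Assumes that if a size works, every smaller
# size works too. Returns 1 if even LO fails.
max_payload() {
    local lo=$1 hi=$2 probe=$3 host=$4 mid
    "$probe" "$lo" "$host" || return 1
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))   # round up so the loop always shrinks
        if "$probe" "$mid" "$host"; then
            lo=$mid
        else
            hi=$(( mid - 1 ))
        fi
    done
    echo "$lo"
}

# Possible usage from the Linux client:
#   max_payload 600 1472 probe_linux karren.example.us
```

A single lost packet will look like a size failure, so it is worth repeating the search once or twice before trusting the result.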
On 08/30/17 01:33, Philipp Marek wrote:
>> sshd also dies when certain other kinds of traffic is generated, such as
>> `man pw' using the most pager[1], and many x11 apps such as emacs.
>> However, it is stable when running simple x11 apps such as xeyes, and
>> the link its self is stable -- a terminal will stay connected without
>> issue for days, as long as not much happens in it. Also a sshfs
>> connection dies immediately.
>>
>> ssh -Y karren
>> gkrellm &
>> *sshd dies*
>
> My guess is that you *have* an MTU problem on your link, ie. that the
> PMTU detection doesn't work; and 1400 might still be too big.
>
> For a test, please run
>
> # ping -M do -c 3 -s <size> <peer>
>
> with size values from 600 to 1400, and then 1450 and 1472.

-M do appears to be a linux ping option -- executing from linux.

    $ ping -M do -c 3 -s 600 karren.example.us
    PING karren.example.us (67.77.77.777) 600(628) bytes of data.
    608 bytes from 67.77.77.777: icmp_seq=1 ttl=55 time=92.8 ms
    608 bytes from 67.77.77.777: icmp_seq=2 ttl=55 time=102 ms
    608 bytes from 67.77.77.777: icmp_seq=3 ttl=55 time=89.9 ms

    --- karren.example.us ping statistics ---
    3 packets transmitted, 3 received, 0% packet loss, time 2002ms
    rtt min/avg/max/mdev = 89.956/94.975/102.072/5.171 ms

    $ ping -M do -c 3 -s 1400 karren.example.us
    PING karren.example.us (67.77.77.777) 1400(1428) bytes of data.
    ping: local error: Message too long, mtu=1400
    ping: local error: Message too long, mtu=1400
    ping: local error: Message too long, mtu=1400

    --- karren.example.us ping statistics ---
    3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2037ms

    $ ping -M do -c 3 -s 1300 karren.example.us
    [abbreviating output]
    3 packets transmitted, 3 received, 0% packet loss, time 2003ms
    rtt min/avg/max/mdev = 99.784/106.764/113.670/5.669 ms

    $ ping -M do -c 3 -s 1372 karren.example.us
    3 packets transmitted, 3 received, 0% packet loss, time 6133ms
    rtt min/avg/max/mdev = 99.977/101.501/102.877/1.244 ms

    $ ping -M do -c 3 -s 1392 karren.example.us
    3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2051ms

    $ ping -M do -c 3 -s 1373 karren.example.us
    3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2034ms

1372 appears to be the magic number.

*repeatingTestsFromFreeBSD*

-D appears to be the equivalent of "-M do"

    $ ping -D -c 3 -s 1373 badger.example.us
    ping: packet size too large: 1373 > 56: Operation not permitted
    $ sudo bash
    # ping -D -c 3 -s 1373 badger.example.us
    3 packets transmitted, 0 packets received, 100.0% packet loss
    # ping -D -c 3 -s 1372 badger.example.us
    3 packets transmitted, 3 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 99.126/100.870/102.472/1.370 ms

    # ifconfig igb0 mtu 1372

(on both ends of the link for good measure)

    $ ssh karren.example.com
    $ man pw
    *linkDies*
    packet_write_wait: Connection to 67.77.77.777 port 22: Broken pipe

    Aug 30 12:31:58 karren sshd[28510]: fatal: Fssh_packet_write_poll: Connection from 174.77.777.77 port 49798: Permission denied

Darn.

> If some size works but the higher ones don't, set the MTU of one of the
> machines to the corresponding value, and then establish an SSH connection.
>
> Also, if your traceroute has an "--mtu" option, you could run
>
> # traceroute --mtu <peer> 1472
>
> to see some information; but as your ICMPs seem to get filtered, that
> might not actually help.

    $ traceroute --mtu karren.example.us 1373
    traceroute to karren.example.us (67.77.77.777), 30 hops max, 1373 byte packets
     1  10.0.0.1 (10.0.0.1)  5.500 ms F=1372  7.258 ms  3.027 ms
     2  96.120.96.125 (96.120.96.125)  14.881 ms  14.909 ms  18.676 ms

And traceroute confirms that the router is restricting packets to 1380.

I also tried setting the MTUs down to 600 just to try it, but it didn't seem to make a difference. :( The link still dies.

Thank you so much for your suggestions. I think I'm going to try compiling the portable 7.5p1 sshd from a local mirror, and see if it behaves differently.
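
A note on reading the ping numbers above: for IPv4, the -s value is only the ICMP payload, and the packet on the wire is 28 bytes larger (20 bytes of IP header without options plus 8 bytes of ICMP header), which is what the "600(628)" in the ping banner shows. Under that assumption, a payload of 1372 is a 1400-byte packet, so the cutoff seen here may simply reflect the 1400-byte MTU configured on the machines earlier (the "local error ... mtu=1400" message points the same way) rather than a limit somewhere on the path. The small sketch below does the conversion in both directions; the function names are invented for this example.

```bash
#!/usr/bin/env bash
# Sketch only: convert between `ping -s' payload sizes and IPv4 packet sizes.
# Assumes IPv4 with no IP options: 20-byte IP header + 8-byte ICMP header.

IP_HEADER=20
ICMP_HEADER=8

# packet_size_for_payload PAYLOAD: size of the IP packet that ping -s sends.
packet_size_for_payload() {
    echo $(( $1 + IP_HEADER + ICMP_HEADER ))
}

# payload_for_mtu MTU: largest ping -s value that fits in one packet of MTU bytes.
payload_for_mtu() {
    echo $(( $1 - IP_HEADER - ICMP_HEADER ))
}

# Possible usage:
#   packet_size_for_payload 1372
#   payload_for_mtu 1492
```

By the same arithmetic, an interface MTU chosen to match a probe that succeeded with -s 1372 would be 1400, not 1372.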