Dear CentOS Users:

I installed CentOS 7 on my server a few months ago. While using ssh, I keep getting a strange message, "Write failed: Broken pipe", which forces the SSH session to quit. It's really annoying, as it happens very often and at irregular intervals, from a couple of minutes to a few hours. I have been working with Linux (Red Hat, Fedora and CentOS) for over 15 years, and this never happened to me, even under CentOS 6.6.

I have tried the following approaches, but none of them has helped. I wonder if it could be solved by reinstalling the system, but reinstalling a lot of software is time consuming.

1. Logged in from Mac, Windows and Linux systems on different computers.
2. Modified sshd_config on the server, as suggested by many posts:
       TCPKeepAlive yes
       ClientAliveInterval 60
3. Modified the ~/.ssh/config file on my local computer:
       Host *
           ServerAliveInterval 60
4. Logged in with ssh -Y instead of -X.
5. Added "unset autologout" to my .cshrc.
6. Checked the IP address with the network administrator, and it works well.
7. Added a file named autologout.csh containing "set autologout=0".

Do you know a good solution? Thanks!

Cheers,
Hua

-----------------------------
Hua Wang, Ph.D. in Geodesy
Department of Surveying Engineering,
Guangdong University of Technology,
100 Waihuan Xi Rd., Panyu District,
Guangzhou, 510006, China.
Tel: +86-13570019257
Email: ehwang at 163.com
Homepage: http://homepages.see.leeds.ac.uk/~earhw
On Thu, 8 Oct 2015 10:33:55 +0800 Hua Wang wrote:
> While using ssh, there is always a strange message "Write failed: Broken
> pipe". It forces quit of SSH.

It sounds like the network connection between you and the server is dying for some reason.

That being the case you probably can't fix it yourself if it's a remote server.

You may need to get a better Internet connection on one or both ends.

--
MELVILLE THEATRE ~ Real D 3D Digital Cinema ~ www.melvilletheatre.com
Hi Frank,

Thanks for your prompt reply. The server is in my office. I have tried several client computers, so it shouldn't be a problem with the clients' Internet connections. I pinged the server, and it received all of the packets. Is there a good way to check this?

Everything always worked well under CentOS 6.6 with the same server and the same network connection (IP, cable, etc.). The problem appeared when I reinstalled with CentOS 7.7. I suspect it's a problem with the system rather than the network.

Cheers,
Hua
On 10/07/2015 07:33 PM, Hua Wang wrote:
> I installed Centos 7 on my server a few months ago. While using ssh, there is always a strange message "Write failed: Broken pipe".

That's very often a result of IP conflict. I'm assuming that you're connecting to an IPv4 address. If so, log in to your CentOS server and use arping to look for conflicts:

    # arping -c 2 -D -I em1 <your address>

> 1. Login via Mac, Windows, Linux systems from different computers.
> 2. Modify sshd_config on the server as suggested by many posts:
> TCPKeepAlive yes
> ClientAliveInterval 60

TCPKeepAlive is "yes" by default. ClientAliveInterval doesn't appear to be a valid setting. Either TCPKeepAlive or ServerAliveInterval could be useful if the problem were a stateful firewall which was dropping your connection from its state table, and then resetting the connection in response to a later packet from your client.

Since those don't help, that tends to suggest that the problem isn't an intermediate host, but the server itself. Possibly an IP conflict. Also, check the output of "dmesg" to see if there are any problems recorded with the NIC. Check the output of "ifconfig" to see if there are TX or RX errors that increase when your connections are reset.

> 3. Modify ~/.ssh/config file on my local computer:
> Host *
> ServerAliveInterval 60
> 4. Login ssh using -Y instead of -X.

You didn't say what client OS you're using, but Fedora and CentOS set ForwardX11Trusted to "yes" by default, so "ssh -Y" is the same as "ssh -X". And even if it weren't, it wouldn't cause the problem you're seeing.

> 5. add "unset autologout" in my .cshrc.

The error you're seeing won't be triggered by your shell exiting.

> 6. I checked IP address with the internet administrator, and it works well.
> 7. add a file named autologout.csh with "set autologout=0".
>
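To confirm which keepalive values are actually in effect, rather than only what was typed into the config files, the effective configuration can be dumped and filtered. The following is a minimal sketch and not part of the original reply: the function name keepalive_settings is hypothetical, and it assumes an OpenSSH build that supports `sshd -T` on the server and `ssh -G` on the client (older clients lack that option). For reference, ClientAliveInterval and ClientAliveCountMax are documented in sshd_config(5) (server side), ServerAliveInterval and ServerAliveCountMax in ssh_config(5) (client side), and TCPKeepAlive in both.

```shell
#!/bin/sh
# keepalive_settings: hypothetical helper, not from the thread.
# Reads "keyword value" lines on stdin (the format printed by `sshd -T`
# and `ssh -G`) and keeps only the keepalive-related ones.
keepalive_settings() {
    grep -iE '^(tcpkeepalive|clientaliveinterval|clientalivecountmax|serveraliveinterval|serveralivecountmax)[[:space:]]'
}

# Usage (assumptions: run as root on the server; <server> is a placeholder):
#   sshd -T | keepalive_settings          # effective server-side values
#   ssh -G <server> | keepalive_settings  # effective client-side values
```

If a value set in sshd_config does not show up in the `sshd -T` output, the daemon is probably reading a different file or the edit contains a typo; sshd also needs a reload before a changed file applies to new connections.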
On 09/10/15 10:23, Gordon Messmer wrote:
>
> Since those don't help, that tends to suggest that the problem isn't
> an intermediate host, but the server itself. Possibly an IP
> conflict. Also, check the output of "dmesg" to see if there are any
> problems recorded with the NIC. Check the output of "ifconfig" to see
> if there are TX or RX errors that increase when your connections are
> reset.

As Gordon suggests, let's see if the problem might be related to a dying NIC. The output of the following command may reveal any illness:

    # ip -s -d l l

Cheers,
ak.
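Because the disconnects are irregular, a single snapshot of the counters says little; what matters is whether they jump around the time a session drops. One way to capture that is to log the counters periodically and compare the timestamps with the disconnect times. This is only a sketch under stated assumptions: nic_counters and the SYSFS_NET override are hypothetical names introduced here, and the counters are read from the per-interface statistics directory that Linux exposes under /sys/class/net.

```shell
#!/bin/sh
# nic_counters: hypothetical helper, not from the thread.
# Prints one timestamped line with the error/drop counters that the
# Linux kernel exposes under /sys/class/net/<iface>/statistics/.
# SYSFS_NET is an assumed override, used only to make the function testable.
nic_counters() {
    iface=$1
    dir="${SYSFS_NET:-/sys/class/net}/$iface/statistics"
    [ -d "$dir" ] || { echo "no such interface: $iface" >&2; return 1; }
    line=$(date '+%Y-%m-%d %H:%M:%S')
    for c in rx_errors tx_errors rx_dropped tx_dropped; do
        line="$line $c=$(cat "$dir/$c")"
    done
    echo "$line"
}

# Usage (em1 is the interface name used in the thread):
#   while sleep 60; do nic_counters em1; done >> /tmp/em1-counters.log
```

A counter that stays flat across several disconnects makes a failing NIC less likely; one that climbs at those moments points back at the hardware, cable or switch port.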
I am not sure whether attachments can be sent to the mailing list. There were quite a lot of replies before, but I have received nothing since I added the attachments. I will remove the attachments and send the message again. Please have a look at the email below. Thanks for your help.

---

Dear All,

Thanks for all your help. I have put all the comments together. Please have a look and see whether there is any clue to this ghost problem. I have also attached the log files: dmesg, secure, messages. Please note that the following message appeared in secure when the session exited just now:

    Oct 9 10:55:55 maya2012 su: pam_unix(su:session): session closed for user root

> Can you trigger the error reliably by doing something network intensive, like scp or rsync a large file? I've seen similar behaviour with a bad NIC that was in the process of dying.

Yes, I tried that: I copied tens of GB of files using rsync, and it worked well.

> That's very often a result of IP conflict. I'm assuming that you're connecting to an IPv4 address. If so, log in to your CentOS server and use arping to look for conflicts:
>
> # arping -c 2 -D -I em1 <your address>

The IP is fixed to my server. The network administrator has checked the address, and only this computer uses it. When I run the above command line, the output is:

[root@maya2012 hwang]# arping -c 2 -D -I em1 222.200.125.5
ARPING 222.200.125.5 from 0.0.0.0 em1
Sent 2 probes (2 broadcast(s))
Received 0 response(s)

>> 1. Login via Mac, Windows, Linux systems from different computers.
>> 2. Modify sshd_config on the server as suggested by many posts:
>> TCPKeepAlive yes
>> ClientAliveInterval 60
>
> TCPKeepAlive is "yes" by default. ClientAliveInterval doesn't appear to be a valid setting. Either TCPKeepAlive or ServerAliveInterval could be useful if the problem were a stateful firewall which was dropping your connection from its state table, and then resetting the connection in response to a later packet from your client.
>
> Since those don't help, that tends to suggest that the problem isn't an intermediate host, but the server itself. Possibly an IP conflict. Also, check the output of "dmesg" to see if there are any problems recorded with the NIC. Check the output of "ifconfig" to see if there are TX or RX errors that increase when your connections are reset.

[root@maya2012 hwang]# ifconfig
em1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 222.200.125.5  netmask 255.255.255.128  broadcast 222.200.125.127
        inet6 fe80::d6ae:52ff:fe6a:405e  prefixlen 64  scopeid 0x20<link>
        ether d4:ae:52:6a:40:5e  txqueuelen 1000  (Ethernet)
        RX packets 2865  bytes 396191 (386.9 KiB)
        RX errors 0  dropped 180  overruns 0  frame 0
        TX packets 510  bytes 55844 (54.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

em2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether d4:ae:52:6a:40:5f  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 7  bytes 748 (748.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7  bytes 748 (748.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@maya2012 hwang]# ip -s -d l l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0
    RX: bytes  packets  errors  dropped overrun mcast
    748        7        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    748        7        0       0       0       0
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether d4:ae:52:6a:40:5e brd ff:ff:ff:ff:ff:ff promiscuity 0
    RX: bytes  packets  errors  dropped overrun mcast
    312908     2272     0       138     0       1081
    TX: bytes  packets  errors  dropped carrier collsns
    43946      403      0       0       0       0
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether d4:ae:52:6a:40:5f brd ff:ff:ff:ff:ff:ff promiscuity 0
    RX: bytes  packets  errors  dropped overrun mcast
    0          0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       0       0       0

Thanks,
Hua
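The arping run quoted in this message reports "Received 0 response(s)", meaning no other host answered for 222.200.125.5 during that probe. Since the disconnects are intermittent, a conflicting host that is only sometimes online would be missed by a single run, so repeating the probe and recording the result may be worthwhile. The sketch below shows one way to read the summary line; arping_replies and dad_verdict are hypothetical names, and the parsing assumes the iputils arping output format shown in this thread.

```shell
#!/bin/sh
# arping_replies: hypothetical helper, not from the thread.
# Reads iputils arping output on stdin and prints the number from the
# "Received N response(s)" summary line.
arping_replies() {
    sed -n 's/^Received \([0-9][0-9]*\) response.*/\1/p'
}

# dad_verdict: turns that count into a short message.
dad_verdict() {
    n=$(arping_replies)
    if [ -z "$n" ]; then
        echo "no summary line found"
    elif [ "$n" -eq 0 ]; then
        echo "no other host answered for this address"
    else
        echo "$n reply(ies): another host may be using this address"
    fi
}

# Usage (as root; address and interface as in the thread):
#   arping -c 2 -D -I em1 222.200.125.5 | dad_verdict
```

A zero count only covers the moment of the probe; it does not rule out a conflict that appears later.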