carl at bl.echidna.id.au
2001-Dec-13 01:19 UTC
behaviour of ssh/scp over flakey links - timeout/retry?
I'm using OpenSSH's ssh and scp to back up some remote machines, roughly as follows:

    ssh remote-host "tar up a few dirs"
    scp remote-host:tarfile local-repository

On the whole, as I'd expect, this works just fine. But sometimes the link is a bit dodgy (for lack of a more explicit term, this being a polite list :) ).

Can anyone tell me whether ssh and scp time out and retry during a session, and if so, how? (I know about timeout and retry during connection establishment; that's documented in the man page.) The job's scripted, and I need my script to deal with a lost link by, at minimum, alerting me that it's happened. At the moment it seems to just hang when it's lost the link. I could use expect, I guess, and catch timeouts that way, but I'd prefer it if ssh said "timeout on session, giving up" or something.

FWIW, we're using OpenSSH_2.9p2, SSH protocols 1.5/2.0, OpenSSL 0x0090600f on RedHat 7.1 and 6.2. I'll be updating to 3.0 "soon", once I've had a chance to test it in our environment.

Thanks :)

Carl
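[A minimal sketch of the kind of watchdog wrapper being asked for above, assuming a POSIX shell; the host, paths, mail address, and 3600-second limit are all placeholders. ssh of this vintage has no in-session timeout option, so the deadline has to be enforced from outside:]

    #!/bin/sh
    # Hypothetical watchdog: kill the copy if it runs past LIMIT seconds.
    LIMIT=3600                              # placeholder deadline, in seconds
    scp remote-host:tarfile /local-repository &
    COPY=$!
    ( sleep "$LIMIT"; kill "$COPY" 2>/dev/null ) &
    WATCHDOG=$!
    if wait "$COPY"; then
        kill "$WATCHDOG" 2>/dev/null        # copy finished in time; cancel the timer
    else
        # scp failed or was killed by the watchdog; raise the alarm
        echo "scp of tarfile did not complete" | mail -s "backup alert" admin@example.com
        exit 1
    fi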
Dan Kaminsky
2001-Dec-13 02:10 UTC
behaviour of ssh/scp over flakey links - timeout/retry?
Carl:

I've honestly never had the best of luck with scp. In my experience, forwarding the tar command over ssh is the fastest, most cross-platform (even on Windows, using the Cygwin OpenSSH daemon) method of moving files.

Try this:

    # For unpacked files on the backup host
    alicehost$ ssh alice@bobhost "tar -cf - /path" | tar -xf -
    # To get the tarball itself
    alicehost$ ssh alice@bobhost "tar -cf - /path" > /path/bobhost.tar
    # Slight variant -- send a tarball somewhere else
    bobhost$ tar -cf - /path | ssh bob@alicehost "cat > /path/bobhost.tar"

Now, that being said, it sounds like the real problem is that the TCP session dies on occasion. There are a couple of solutions to this, some of which I'm still working on:

1) Use a transfer system that can handle incremental (and thus resumable) updates. Rsync comes to mind. Make sure keepalives are enabled (ssh -o KeepAlive=yes, or modify your /etc/ssh/ssh_config) so timeouts will happen quicker, then have your script go back and rsync again if rsync returns an error code. (It won't upon a successful sync; a sketch of such a retry loop follows this message.) Do something like:

    alicehost$ rsync -e "ssh -o KeepAlive=yes" alice@bobhost:/path /path/bobhost/

or

    bobhost$ rsync -e "ssh -o KeepAlive=yes" /path bob@alicehost:/path/bobhost

2) Add a TCP reliability layer. ROCKS, available at http://www.cs.wisc.edu/~zandy/rocks/ , handles links that are... ah... "less than stable". Quoting the description: "Rock-enabled programs continue to run after any of these events; their broken connections recover automatically, without loss of in-flight data, when connectivity returns. Rocks work transparently with most applications, including SSH clients, X-windows applications, and network service daemons." You'll almost certainly need to disable SSH keepalives for this to work, but the reliability layer should handle even extended network outages. I haven't tested ROCKS at *all*, but I'll be doing so shortly.

You might find yourself missing, well, the status updates that you get with scp. cpipe, available at http://wsd.iitb.fhg.de/~kir/cpipehome/ , is a nice general-purpose tool for monitoring the speed of a pipe.

Lemme know if any of this helps; I'm working on stuff related to this right now.

Yours Truly,

    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com
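[A minimal sketch of the retry loop described in (1) above, assuming a POSIX shell; the host, paths, retry count, and sleep interval are placeholders. KeepAlive is the client option name in OpenSSH of this era (it was later renamed TCPKeepAlive):]

    #!/bin/sh
    # Re-run rsync until it exits 0, giving up after a fixed number of tries.
    TRIES=5
    n=0
    until rsync -e "ssh -o KeepAlive=yes" alice@bobhost:/path /path/bobhost/
    do
        n=$((n + 1))
        if [ "$n" -ge "$TRIES" ]; then
            echo "rsync failed after $TRIES attempts" >&2
            exit 1
        fi
        sleep 60    # give the link a moment to recover before retrying
    done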
carl at bl.echidna.id.au
2001-Dec-13 02:20 UTC
behaviour of ssh/scp over flakey links - timeout/retry?
> From: "Dan Kaminsky" <dan at doxpara.com> > > Carl: > > I've honestly never had the best of luck with scp. In my experience, > command forwarding the tar command is the fastest, most cross-platform(even > on Windows, using the Cygwin OpenSSH daemon) method of moving files. > > Try this: > > # For unpacked files on the backup host > alicehost$ ssh alice at bobhost "tar -cf - /path" | tar -xf - > # To get the tarball itself > alicehost$ ssh alice at bobhost "tar -cf - /path" > /path/bobhost.tar > # Slight variant -- send a tarball somewhere else > bobhost$ tar -cf - /path | ssh bob at alicehost "cat > /path/bobhost.tar" > > Now, that being said, it sounds like the real problem is that the TCP > session dies on occasion.Yep, the connection is to a few IDS probes that are behind flakey switches and overloaded LANs. We get a lot of dropped sessions.> There are a couple solutions to this, some of > which I'm still working on: > > 1) Use a transfer system that can handle incremental(and thus resumable) > updates. Rsync comes to mind. Make sure Keepalives are enabled("ssh -o > KeepAlive Yes", or modify your /etc/ssh.conf) so timeouts will happen > quicker, then have your script go back and rsync again if rsync returns an > error code. (It won't upon a successful syncing.) Do something like: > > alicehost$ rsync -e "ssh -o KeepAlive yes" alice at bobhost:/path > /path/bobhost/ > or > bobhost$ rysnc -e "ssh -o KeepAlive yes" /path > bob at alicehost:/path/bobhostOk, I like that option.> 2) Add a TCP Reliability Layer. ROCKS, available at > http://www.cs.wisc.edu/~zandy/rocks/ ,handles links that are...ah..."less > than stable". Quoting the description: "Rock-enabled programs continue to > run after any of these events; their broken connections recover > automatically, without loss of in-flight data, when connectivity returns. > Rocks work transparently with most applications, including SSH clients, > X-windows applications, and network service daemons." You'll almost > certainly need to disable SSH keepalives for this to work, but the > reliability layer will almost certainly handle even extended network > outages. > > I haven't tested ROCKS at *all*, but I'll be doing so shortly.Interesting.> You might find yourself missing, well, the status updates that you get > with scp. cpipe, available at http://wsd.iitb.fhg.de/~kir/cpipehome/ , is a > nice general purpose tool for monitoring the speed of a pipe.No, don't care. Just want the copy to complete, or tell me it didn't.> Lemme know if any of this helps; I'm working on stuff related to this > right now.It sure does, thankyou. Carl
carl at bl.echidna.id.au
2001-Dec-13 02:49 UTC
behaviour of ssh/scp over flakey links - timeout/retry?
> From: "Dan Kaminsky" <dan at doxpara.com> > > > Yep, the connection is to a few IDS probes that are behind flakey > > switches and overloaded LANs. We get a lot of dropped sessions. > > Hmmm. There's an interesting third option that attempts to stream all files > directly to a backend aggregator, using the file system only as a cache in > case connectivity is lost, if at all. > > What's the datarate of each of your probes, worstcase?I'm not sure, I'm not involved in that part of the system much.