Nick Holland
2016-Jul-07 15:07 UTC
portable sftp oddity: sftp, redirection of stderr and ControPersist
hi, Ran into a problem which I thought was an AIXism, but have since found that it can be reproduced on Linux and MacOS. It can NOT be reproduced on OpenBSD. Reproduced on: AIXv7.1 OpenSSH v6.0p1 RedHat 6.8 OpenSSH 5.4p1 Redhat 7.2 OpenSSH 6.6.1p1 MacOS 10.11 (sorry, forgot to grab the OpenSSH version) Could not reproduce on: OpenBSD 5.9-current, march snapshot, OpenSSH_7.2 OpenBSD 5.3-current, April snapshot, OpenSSH_6.2 Demonstration: set up an account on a system that can ssh to itself and authenticate via keys. Create and run this script: ====#!/usr/bin/ksh mkdir -p ~/dest for X in 1 2 3 4 5 6 7 8 9 10; do echo $X echo "cd dest put $0" | sftp -b - localhost 2>&1 | tee -a outfile done ==== Change first line to whatever shell your system uses (that's AIX's default), otherwise, should be pretty portable. Run it, it should copy itself to a directory in your home directory ten times, should take just a few seconds. Now, add the following to ~/.ssh/config : ====ControlMaster auto ControlPath ~/.ssh/control/%h:%r:%p ControlPersist 10s ==== re-run script. * What I think should happen is the persistent control channel should greatly reduce the SSH connection time, so it should run significantly faster. * What DOES happen on platforms with a problem: the "ControlPersist" value becomes an SSH rate limiter -- instead of holding a connection OPEN for ten seconds, it PREVENTS another SSH session from starting for ten seconds! So, instead of taking maybe four seconds before, now it takes 104 seconds (10 x 10 seconds + ssh connection overhead). Change the ControlPersist to some other value, the overall speed goes up or down, but never as fast as without the ControlPersist options in place. here's where it gets weird. Remove the "2>&1" from the sftp line, and the problem goes away -- the script runs much faster with the .ssh/config file than without it. Unfortunately, I need the stderr output. :-/ It appears to be the "2>&1 | {cmd}" structure that is at fault -- doesn't matter if {cmd} is tee, wc, or a shell function (as it was in my "real" project). I have found that 2>error.file |tee outfile works, and then I can append the error.file output into the end of the "outfile" and accomplish my goals, but that's kinda ugly. Thanks for looking! Nick.
Jakub Jelen
2016-Jul-11 08:07 UTC
portable sftp oddity: sftp, redirection of stderr and ControPersist
On 07/07/2016 09:07 PM, Nick Holland wrote:> hi, > > Ran into a problem which I thought was an AIXism, but have since found > that it can be reproduced on Linux and MacOS. It can NOT be > reproduced on OpenBSD. > > Reproduced on: > AIXv7.1 OpenSSH v6.0p1 > RedHat 6.8 OpenSSH 5.4p1 > Redhat 7.2 OpenSSH 6.6.1p1 > MacOS 10.11 (sorry, forgot to grab the OpenSSH version) > > Could not reproduce on: > OpenBSD 5.9-current, march snapshot, OpenSSH_7.2 > OpenBSD 5.3-current, April snapshot, OpenSSH_6.2 > > Demonstration: set up an account on a system that can ssh to itself > and authenticate via keys. > > Create and run this script: > ====> #!/usr/bin/ksh > > mkdir -p ~/dest > > for X in 1 2 3 4 5 6 7 8 9 10; do > echo $X > echo "cd dest > put $0" | sftp -b - localhost 2>&1 | tee -a outfile > done > ====> > Change first line to whatever shell your system uses (that's AIX's > default), otherwise, should be pretty portable. > > Run it, it should copy itself to a directory in your home directory > ten times, should take just a few seconds. > > > Now, add the following to ~/.ssh/config : > ====> ControlMaster auto > ControlPath ~/.ssh/control/%h:%r:%p > ControlPersist 10s > ====> > re-run script. > * What I think should happen is the persistent control channel should > greatly reduce the SSH connection time, so it should run significantly > faster. > > * What DOES happen on platforms with a problem: the "ControlPersist" > value becomes an SSH rate limiter -- instead of holding a connection > OPEN for ten seconds, it PREVENTS another SSH session from starting > for ten seconds! So, instead of taking maybe four seconds before, now > it takes 104 seconds (10 x 10 seconds + ssh connection overhead). > Change the ControlPersist to some other value, the overall speed goes > up or down, but never as fast as without the ControlPersist options in > place. > > > here's where it gets weird. > Remove the "2>&1" from the sftp line, and the problem goes away -- the > script runs much faster with the .ssh/config file than without it. > Unfortunately, I need the stderr output. :-/ > > It appears to be the "2>&1 | {cmd}" structure that is at fault -- > doesn't matter if {cmd} is tee, wc, or a shell function (as it was in > my "real" project). > > I have found that 2>error.file |tee outfile works, and then I can > append the error.file output into the end of the "outfile" and > accomplish my goals, but that's kinda ugly.This looks like related to the bug #1988 [1]. The mux process is holding the stderr. Could you test it with the patch from the referenced bug applied? Also as I see you are using RHEL, you can contact your support to provide a test package (we plan to fix this in RHEL7.3). [1] https://bugzilla.mindrot.org/show_bug.cgi?id=1988 -- Jakub Jelen Associate Software Engineer Security Technologies Red Hat
Nick Holland
2016-Jul-13 22:34 UTC
portable sftp oddity: sftp, redirection of stderr and ControPersist
On 07/11/16 04:06, Jakub Jelen wrote:> On 07/07/2016 09:07 PM, Nick Holland wrote:...> This looks like related to the bug #1988 [1]. The mux process is holding > the stderr. Could you test it with the patch from the referenced bug > applied? > Also as I see you are using RHEL, you can contact your support to > provide a test package (we plan to fix this in RHEL7.3). > > [1] https://bugzilla.mindrot.org/show_bug.cgi?id=1988 >well, RHEL is what I reproduced the error on, the actual problem cropped up in an AIX project. Just not too easy to test AIX if you don't have power- hungry, outrageously expensive hw on hand. :-/ So. I loaded up a CentOS 7.2 VM, downloaded the most recent OpenSSH-portable tar file (openssh-7.2p2.tar.gz), added zlib-devel and openssl-devel, ./configure, make, make install'd the program. pkill'ed sshd, /usr/local/sbin/sshd to start sshd....and same results. I then applied this patch: diff --git a/log.c b/log.c index a5ba22f..9011f9d 100644 --- a/log.c +++ b/log.c @@ -308,7 +308,7 @@ log_change_level(LogLevel new_log_level) int log_is_on_stderr(void) { - return log_on_stderr; + return log_on_stderr && log_stderr_fd == STDERR_FILENO; } /* redirect what would usually get written to stderr to specified file */ diff --git a/ssh.c b/ssh.c index ec8a515..65e0217 100644 --- a/ssh.c +++ b/ssh.c @@ -1346,7 +1346,7 @@ static void control_persist_detach(void) { pid_t pid; - int devnull; + int devnull, keep_stderr; debug("%s: backgrounding master process", __func__); @@ -1377,8 +1377,10 @@ control_persist_detach(void) error("%s: open(\"/dev/null\"): %s", __func__, strerror(errno)); } else { + keep_stderr = log_is_on_stderr() && debug_flag; if (dup2(devnull, STDIN_FILENO) == -1 || - dup2(devnull, STDOUT_FILENO) == -1) + dup2(devnull, STDOUT_FILENO) == -1 || + (!keep_stderr && dup2(devnull, STDERR_FILENO) == -1)) error("%s: dup2: %s", __func__, strerror(errno)); if (devnull > STDERR_FILENO) close(devnull); and seems to work, at least once I modified my script to use the sftp I had compiled in /usr/local/bin rather than the "factory" /usr/bin (oops). my test script: ===#!/bin/bash mkdir -p ~/dest for X in 1 2 3 4 5 6 7 8 9 10; do echo $X echo "cd dest put $0" | /usr/local/bin/sftp -b - localhost 2>&1 | tee -a outfile done ===So yes, looks like there's a fix on the way. Thanks! Nick.
Apparently Analagous Threads
- ssh -f and -O ControlPersist=yes, ControlMaster=yes leaves stderr open
- [PATCH] ssh: set proctitle for mux master
- [Bug 1988] ControlPersist causes stderr to be left open until the master connection times out
- Problemes with ControlPersist
- ControlPersist and multiple X11 forwarding.