From what I understand, the problem is due to people's disagreement about
what the "correct" behavior should be. I'm pretty sure that the following
is the correct behavior, from running rsh and ssh often (both F-Secure
and OpenSSH).

Let's say you have a stupid script, called foreverSleep, on your remote
host, that does:

    while 1 do sleep 1 done

Then:

    rsh remotehost "foreverSleep &"

should and does hang (on Linux and Solaris at least). HOWEVER,

    rsh remotehost
    # foreverSleep &
    # exit

does NOT hang.

---

If you run OpenSSH, like the following:

    ssh remotehost "foreverSleep &"

it should and does hang (F-Secure hangs as well). HOWEVER,

    ssh remotehost
    # foreverSleep &
    # exit

DOES hang. (F-Secure does not hang.) This is where the bug is. If you run
ssh with a tty and in interactive mode, and the client decides to
disconnect, it should disconnect cleanly. (I'm not sure about what happens
to the remaining processes; you will have to look at the rsh code for
that -- it may be SIGHUP or something, I don't know -- other posts may be
clearer on this.)

I hope I'm not just stating the obvious, and hope this clears things up.
If I'm wrong about the behaviours, let me know. I really think we should
figure out what the correct behaviour should be before trying to come up
with a fix.

-rchit

-----Original Message-----
From: Michael [mailto:michael at bizsystems.com]
Sent: Monday, December 10, 2001 1:23 PM
To: openssh-unix-dev at mindrot.org
Subject: Re: hang on exit bug under Linux

> On Mon, Dec 10, 2001 at 10:50:06AM -0800, Dan Kaminsky wrote:
> > Look: ssh user at host "command" needs to never, ever hang.
>
> wrong.
>
> it needs to hang.
>
> it needs to hang until it can be sure that 'command' does not need
> any input.
>
> it needs to hang until it can be sure that 'command' does not
> produce any output.
>
> it needs to hang until 'command' exits because sshd needs to tell
> the exit status from 'command' to ssh.
So from a sysadmin's viewpoint: some fool writes a piece of buggy software
which hundreds of shell users decide to use, and they then proceed to
connect to the host via ssh and leave hundreds of "hung" sshd's in the
process table -- or even just one user with a cron job doing a repeated
action. That sounds just great. Why on earth should anyone use OpenSSH if
they can expect it to mess up the operation of an entire system because it
is BROKEN?

This is a problem that will not go away. You can assert that script
writers should do a better job, but they won't, and that is why they write
scripts.

Your response requesting me to write the code is something I can't do. I
only have access to Linux boxes and have no clue (and would not presume to
know) what the implications are for Sun, AIX, HP, BSD, etc... Closing off
discussion on the issue won't fix it either.

I don't mean to be a pest, but I consider OpenSSH to be an excellent tool
that does a lot to promote security in general, and security at our site
in particular. I'd like to see it work well. It seems to have this one
glaring flaw that needs to be fixed to make it generally acceptable as a
replacement for virtually all other remote shell access programs. Saying
that rsh is broken too simply doesn't justify why a program under active
development by a very bright group of people has to be broken as well.

Michael
Michael at Insulin-Pumpers.org
Hi,

On Mon, Dec 10, 2001 at 02:09:20PM -0800, Rachit Siamwalla wrote:
> Called foreverSleep on your remote host:
>
> rsh remotehost "foreverSleep &"
>
> Should and does hang (on Linux and Solaris at least).
>
> HOWEVER,
>
> rsh remotehost
> # foreverSleep &
> # exit
>
> does NOT hang.

This is what I have already suggested:

- if we have a pty, and the direct child goes away, "just close the
  session", and accept data loss. Data loss can only come from background
  processes, which are *background* processes and shouldn't send stuff
  anyway -- if they do, they deserve a SIGPIPE or worse.

- if we have no pty, do what we do now, and block if needed.

gert

--
USENET is *not* the non-clickable part of WWW!
                                         //www.muc.de/~gert/
Gert Doering - Munich, Germany             gert at greenie.muc.de
fax: +49-89-35655025          gert.doering at physik.tu-muenchen.de
On Mon, Dec 10, 2001 at 02:09:20PM -0800, Rachit Siamwalla wrote:
> From what I understand, the problem is due to people's disagreement
> about what the "correct" behavior should be. I'm pretty sure that the
> following is the correct behavior from running rsh and ssh often (both
> fsecure and openssh).

Some people don't get stdio. Hey.

> Called foreverSleep on your remote host:
>
> rsh remotehost "foreverSleep &"
>
> Should and does hang (on Linux and Solaris at least).
>
> HOWEVER,
>
> rsh remotehost
> # foreverSleep &
> # exit
>
> does NOT hang.

It should do a killpg() to send SIGHUP to the relevant processes. What if
they don't want to die? Huh?

> ---
>
> If you run openssh, like the following:
>
> ssh remotehost "foreverSleep &"
>
> Should and does hang (fsecure hangs as well).
>
> HOWEVER,
>
> ssh remotehost
> # foreverSleep &
> # exit
>
> DOES hang. (fsecure does not hang) This is where the bug is. If you run
> ssh with a tty and in interactive mode, if the client decides to
> disconnect, it disconnects cleanly (I'm not sure about what happens to
> the remaining processes, you will have to look at rsh code for that --
> it may be SIGHUP or something, i dunno -- other posts may be clearer on
> this).

What if "foreverSleep" needs a forwarded agent/port/X11? Huh? What if it
doesn't exit when sshd sends it SIGHUP as you exit?

I say: with ptys, send SIGHUP when the main process exits and/or when the
client closes the session.

Perhaps there should be an option like -n for the client, but which
applies to stdout and stderr, for the faint of heart who refuse to
understand '>' and '2>' and '2>&1' and so on.

> I hope I'm not just stating the obvious, and hope this clears things up.
> If I'm wrong about the behaviours, let me know. I really think we should
> figure out what the correct behaviour should be before trying to come up
> with a fix.
>
> -rchit

Cheers,
Nico
--
-DISCLAIMER: an automatically appended disclaimer may follow.
By posting to a public e-mail mailing list I hereby grant permission to
distribute and copy this message.

Visit our website at http://www.ubswarburg.com

This message contains confidential information and is intended only for
the individual named. If you are not the named addressee you should not
disseminate, distribute or copy this e-mail. Please notify the sender
immediately by e-mail if you have received this e-mail by mistake and
delete this e-mail from your system. E-mail transmission cannot be
guaranteed to be secure or error-free, as information could be
intercepted, corrupted, lost, destroyed, arrive late or incomplete, or
contain viruses. The sender therefore does not accept liability for any
errors or omissions in the contents of this message which arise as a
result of e-mail transmission. If verification is required please request
a hard-copy version. This message is provided for informational purposes
and should not be construed as a solicitation or offer to buy or sell any
securities or related financial instruments.
On Wed, Dec 12, 2001 at 09:18:05AM +1100, Damien Miller wrote:
> On Tue, 11 Dec 2001, Dan Astoorian wrote:
> > On Mon, 10 Dec 2001 23:20:14 EST, Nicolas Williams writes:
> > >
> > > I say: with ptys, send SIGHUP when the main process exits and/or
> > > when the client closes the session.
> >
> > Would setting the HUPCL termios cflag for the pty a) work, b) be
> > portable, and c) be more appropriate than killpg()?
>
> What about sessions without a pty?

They should always "hang", or, rather, "hang around."

In any case, I no longer think that sshd should do killpg(HUP) when the
session leader exits, nor, for that matter, should it set the HUPCL
termios cflag for the pty.

Instead I think the client should have an option to, when the sshd tells
it the session exited, close the related channels and/or pass a SIGHUP to
the session, which the sshd would then send to the process group of the
session leader.

> -d

Cheers,
Nico
--
> > If you run openssh, like the following:
> >
> > ssh remotehost "foreverSleep &"
> >
> > Should and does hang (fsecure hangs as well).
> >
> > HOWEVER,
> >
> > ssh remotehost
> > # foreverSleep &
> > # exit
> >
> > DOES hang. (fsecure does not hang) This is where the bug is. If you
> > run ssh with a tty and in interactive mode, if the client decides to
> > disconnect, it disconnects cleanly (I'm not sure about what happens to
> > the remaining processes, you will have to look at rsh code for that --
> > it may be SIGHUP or something, i dunno -- other posts may be clearer
> > on this).

A real example would be a perl program that runs as a daemon:

    #!/usr/bin/perl

    unless ($pid = fork) {
        unless (fork) {
            open(STDOUT, '>/dev/null');
            open(STDERR, '>/dev/null');
            open(X, 'some_process 2>&1 |');  # that generates stdio to X
            while (<X>) {    # real program uses select
                # do something
            }
            # dies
            exit 0;
        }
        waitpid($pid, 0);
        exit 0;
    }

This process will hang ssh; it should not. ...or please tell me why it
should.

Michael
Michael at Insulin-Pumpers.org
If you'd like a C version, you might be inspired by my "daemon.c"; I use
it to run shell scripts as "clean" daemons:

/* daemon.c
 * $Id: daemon.c,v 1.4 2001/10/10 07:14:59 jp Exp $
 * $Source: /home/u/jp/RCS/daemon.c,v $
 *
 * 1.1 19.09.2001 jp - first version checked into RCS
 * 1.2 19.09.2001 jp - added options -l logfile, -c, -q
 * 1.3 19.09.2001 jp - ID as extern string
 * 1.4 10.10.2001 jp - usage also on stderr
 */
/* cc daemon.c -o daemon; strip daemon */

#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <sys/syslog.h>
#include <sys/types.h>

extern char version[] = "$Id: daemon.c,v 1.4 2001/10/10 07:14:59 jp Exp $";

char *progname;
int chdirRoot = 0;
int quiet = 0;

void usage(char *txt)
{
    fprintf(stderr, "%s\n", txt);
    fprintf(stderr, "Usage: %s [-c] [-l /log/file] /path/to/exe arg1 arg2\n",
            progname);
    fprintf(stderr, " -l /log/file does '>/log/file 2>&1'\n");
    fprintf(stderr, " -c does cd / (use whenever possible!)\n");
    syslog(LOG_INFO | LOG_DAEMON, "%s\n", txt);
    syslog(LOG_INFO | LOG_DAEMON,
           "Usage: %s [-c] [-l /log/file] /path/to/exe arg1 arg2\n", progname);
    syslog(LOG_INFO | LOG_DAEMON, " -l /log/file does '>/log/file 2>&1'\n");
    syslog(LOG_INFO | LOG_DAEMON, " -c does cd / (use whenever possible!)\n");
    exit(-1);
}

char *timetext(void)
{
    time_t current;
    struct tm *local;
    static char str[22];

    time(&current);                     /* current time */
    local = localtime(&current);
    sprintf(str, "%02i.%02i.%04i,%02i:%02i:%02i",
            local->tm_mday, local->tm_mon + 1, local->tm_year + 1900,
            local->tm_hour, local->tm_min, local->tm_sec);
    return str;
}

void MSHdaemon(char *logfile, char *infotxt)
{
    int rc;

    rc = fork();
    if (-1 == rc) {
        syslog(LOG_INFO | LOG_DAEMON, "daemon - Unable to fork()\n");
        exit(-1);
    }
    if (rc > 0)
        exit(0);    /* parent should exit and return control, it's OK. */

    rc = setsid();
    if (-1 == rc) {
        syslog(LOG_INFO | LOG_DAEMON, "daemon - Unable to setsid()\n");
        exit(-1);
    }

    rc = fork();
    if (-1 == rc) {
        syslog(LOG_INFO | LOG_DAEMON, "daemon - Unable to fork()\n");
        exit(-1);
    }
    if (rc > 0)
        exit(0);    /* parent should exit and return control, it's OK. */

    if (chdirRoot == 1) {
        rc = chdir("/");
        if (-1 == rc) {
            syslog(LOG_INFO | LOG_DAEMON, "daemon - Unable to chdir()\n");
            exit(-1);
        }
        /* umask(0); */
    }

    if (!freopen("/dev/null", "r", stdin)) {
        syslog(LOG_INFO | LOG_DAEMON,
               "daemon - Unable to freopen(%s,...,%s): %s\n",
               "/dev/null", "stdin", strerror(errno));
        fflush(stderr);
        exit(-1);
    }
    if (!freopen(logfile, "a", stdout)) {
        syslog(LOG_INFO | LOG_DAEMON,
               "daemon - Unable to freopen(%s,...,%s): %s\n",
               logfile, "stdout", strerror(errno));
        fflush(stderr);
        exit(-1);
    }
    if (!quiet)
        fprintf(stdout, "Test stdout\n");
    fflush(stdout);
    if (!freopen(logfile, "a", stderr)) {
        syslog(LOG_INFO | LOG_DAEMON,
               "daemon - Unable to freopen(%s,...,%s): %s\n",
               logfile, "stderr", strerror(errno));
        fflush(stderr);
        exit(-1);
    }
    if (!quiet)
        fprintf(stderr, "Test stderr\n");
    if (!quiet)
        fprintf(stderr, "%s\n", infotxt);
    fprintf(stderr, "%s\n__O_u_t_p_u_t_:____\n", timetext());
    fflush(stderr);
    if (!quiet)
        syslog(LOG_INFO | LOG_DAEMON, infotxt);
    return;    /* a grandchild returns... free as a bird.. */
}

int main(int argc, char *const *argv)
{
    char prog[1024];
    char infotxt[1024];
    char errmsg[1024];
    FILE *ftest;
    extern char *optarg;
    extern int optind;
    int ch;
    /* global int chdirRoot: cd / ? */
    char logfile[256] = "/dev/null"; /* where stdout & stderr are sent */

    progname = argv[0];
    while ((ch = getopt(argc, argv, "qcl:")) != -1)
        switch (ch) {
        case 'q':
            quiet = 1;
            break;
        case 'c':
            chdirRoot = 1;
            break;
        case 'l':
            if (strlen(optarg) > 255) {
                syslog(LOG_INFO | LOG_DAEMON,
                       "Logfilename too long! (max 255 char.!)\n");
                exit(-1);
            }
            strcpy(logfile, optarg);
            break;
        case '?':
        default:
            usage("Unknown argument");
        }
    argc -= optind;
    argv += optind;

    if (argc < 1) {
        usage("Too few arguments!");
    }
    if (strlen(progname) + strlen(argv[0]) > 750) {
        syslog(LOG_INFO | LOG_DAEMON, "ArgV too long.... bye\n");
        exit(-1);
    }

    /* test whether the logfile can be written */
    ftest = fopen(logfile, "w");
    if (!ftest) {
        syslog(LOG_INFO | LOG_DAEMON, "daemon - Unable to fopen(%s): %s\n",
               logfile, strerror(errno));
        fflush(stderr);
        exit(-1);
    }
    fclose(ftest);

    sprintf(prog, "uid=%i prog='%s'", (int)geteuid(), argv[0]);
    sprintf(infotxt, "run as daemon: %s log='%s'", prog, logfile);
    if (!quiet)
        fprintf(stdout, "%s\n", infotxt);

    MSHdaemon(logfile, infotxt);
    execvp(argv[0], argv);

    /* if we get here, exec did not work... */
    sprintf(errmsg, "%s ERROR: %s", prog, strerror(errno));
    syslog(LOG_ERR | LOG_DAEMON, errmsg);    /* write to log too ... */
    syslog(LOG_INFO | LOG_DAEMON, "\nERROR\n%s\n", errmsg);
    fflush(stderr);
    fclose(stderr);
    exit(0);
    return 0;
}
> From: Peter Stuge <stuge at cdy.org>
>
> The true solution is considered to be one of two things:
>
> 1. All daemons shall behave. (ie. close std*)

Ideally, but not likely :(

> 2. The user knows what he/she wants. (ie. to exit, losing data)
>
> I actually want both. I want to be able to tell sloppy daemon
> programmers that they should clean up their code. But I also want my
> users to not have to deal with sloppy daemon programmers, unless they
> choose to do so. This is tough.

Could it be done at the command line? ssh -bad-daemon foohost ?

We have to restart Firewall-1 and IDS probes with closed source all the
time using ssh. Having the ability to do so without totally hosing things
is a big plus. It's not something I can script around either.

> A thought that occurred to me tonight while thinking about this is that
> the ssh client could background itself and tell the user about it,
> instead of closing down and causing possible data loss. And it needs to
> leave std* open, ie. not daemonize, but background. This would propagate
> over multiple connections all the way back to my actual terminal. And if
> I choose to close my terminal (xterm, console, whatever) the process
> that I started at remotehost will be sent SIGHUP. In the perl case it
> would kill it; a C program that catches the signal doesn't care and
> keeps running. Data will be lost but that is out of SSH's scope.
>
> Comments on this, anyone?

Is it too hard to have a command line switch (or config option) to say
"lossy/not lossy"? It's because of this problem that I'm still stuck with
a lot of firewalls still running ssh v1 :(

Carl
> From: Peter Stuge <stuge at cdy.org>
>
> On Thu, Dec 13, 2001 at 01:34:32PM +1100, carl at bl.echidna.id.au wrote:
> >
> > Is it too hard to have a command line switch (or config option) to say
> > "lossy/not lossy"? It's because of this problem that I'm still stuck
> > with a lot of firewalls still running ssh v1 :(
>
> You wouldn't need the option if the ssh client put itself in the
> background.

So if I did

    ssh firewall
    firewall$ fwstop; fwstart
    exit

it wouldn't hang in that instance?
> From: Peter Stuge <stuge at cdy.org>
>
> On Thu, Dec 13, 2001 at 01:52:09PM +1100, carl at bl.echidna.id.au wrote:
> >
> > > You wouldn't need the option if the ssh client put itself in the
> > > background.
> >
> > So if I did
> >
> > ssh firewall
> > firewall$ fwstop; fwstart
> > exit
> >
> > it wouldn't hang in that instance?
>
> The ssh client process would linger, but in the background. It would be
> the equivalent of doing ^Z and then typing bg with the current OpenSSH
> version.
>
> You would be back at the prompt where you typed "ssh firewall".

Does that solve the problem? Wouldn't I then end up with, over time, a
stack of these ssh client sessions? I guess I could kill -KILL them? :)

Carl
Markus et al.,

Please pardon me if I've missed anything, but coming into this a bit
late...

The hanging of ssh connections with OpenSSH where a job has been
"detached" (backgrounded) is a real concern here as well. We have lots of
existing ssh and rsh scripts that we are converting over to later versions
of OpenSSH, and have started running into this with startling frequency;
its operational impact is serious when compounded by regular invocations.
We wind up with large numbers of waiting sshd's on one side and/or hung
scripts on the client side.

My understanding from reading code and the archives is that the current
behaviour we are seeing (a hang) is by design, and is intended to ensure
that any output from that spawned task is dutifully carried back to the
originating ssh client for appropriate disposition. The goal is to avoid
losing any data in the "end game" as things die off and connections are
closed. However, the prior behaviour of rsh and ssh was different, and the
termination of the remote connection was governed by the death of all
child processes (or the only child process). It is this behaviour that
people would like to see again.

I agree that technically (in the sense of not losing data) the new
behaviour is more correct, and users should wrap programs they are trying
to detach to close all FDs (easily done with a Perl program and probably
shell as well). People would like to control this on the client side, but
it's completely server-side behaviour, and currently there is no way for
the client to influence this other than to recode their scripts (which in
our case is 1000's of scripts). What we need is a way to support the old
functionality, but in a way that lets us migrate smoothly over time to the
new behaviour.

I believe that a few modifications can be made to the client and server to
support both the new and old behaviour, and to control the default
behaviour.

First the server-side changes:

1. Add an option to terminate when the primary or all child processes die.
2. Add an option to set the default for this flag in the sshd_config file
   (the default should probably be the old behaviour, to be compatible
   with v1.3 and 1.5 clients).
3. Add code to allow the client side to set this option (client should
   override server). (I think this needs to be a SSH2_MSG_GLOBAL_REQUEST.)
   Only ssh v2.0 clients will be able to set this option.

Client-side changes:

1. Add code to send the new option described above.
2. Add code to set the default setting of the option in ssh_config.
3. Add command line processing to override the default and send the
   desired option setting to the server.

We need to be aware that there are many different versions of ssh client
code out there, much of it beyond our control, and we need to ensure that
it continues to operate as expected when we upgrade the server (backwards
compatibility). This means making the default server behaviour accommodate
the older client expectations unless it knows it's got a newer client that
wants the new behaviour. Once a site has converted their environment, they
can change the default to wait for all output FDs to close before exiting.

How does this proposal sound to folks? Markus? What have I missed...

-Doug-
--
Douglas Kingston
Director Global Unix Engineering Manager
Deutsche Bank AG London
6 Bishopsgate
London EC2N 4DA
Work: +44-20-7545-3907
Mobile: +44-7767-616-028
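To make the proposal concrete, here is a sketch of what the configuration
side might look like. The keyword names below are invented purely for
illustration; they are not actual OpenSSH configuration options:

```
# sshd_config (hypothetical option names)
# Old rsh-like behaviour: close the session when the direct child exits,
# without waiting for background processes to close stdout/stderr.
SessionCloseOnChildExit yes

# ssh_config (hypothetical option names)
# Request the old behaviour from the server; this would be sent as an
# SSH2_MSG_GLOBAL_REQUEST, so only protocol-2 clients could set it.
WaitForAllDescriptors no
```

The server default would carry the old semantics for backwards
compatibility, and a site that has converted its scripts could flip the
default to the new wait-for-all-FDs behaviour.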
For rlogin-like behaviour (i.e. a pty is allocated) it might be an option
to discard data, like telnetd/rlogind do.

For rsh-like behaviour, data loss is not acceptable. For example,

    $ rsh host broken-daemon

blocks too (on many platforms; if not, please show me the relevant rshd
code). And it should block for ssh-1.2.32.