Hi,

As a regular user, I have a cron job; 'crontab -l' says:

-----------------------------------------
SHELL=/bin/sh
MAILTO=""

# run at bootup and then every 5 minutes
@reboot     $HOME/bin/ssh_tunnel
*/5 * * * * $HOME/bin/ssh_tunnel
-----------------------------------------

The ssh_tunnel is an sh-script, which checks whether a particular
ssh-tunnel still exists, and if not regenerates it, as follows:

#!/bin/sh
#---------------- ssh_tunnel script ---------
tunnel="-L 55110:localhost:110 pop3.univ.net"
tunnel_up=`pgrep -f -- "${tunnel}"`
[ "${tunnel_up}" = "" ] && /usr/bin/ssh -N -f ${tunnel}

It works beautifully, but why does this also generate one zombie process:

USER  PID %CPU %MEM VSZ RSS TT STAT STARTED    TIME COMMAND
rob   655  0.0  0.0   0   0 ??  Z    Sat02PM 0:00.01 <defunct>

The "STARTED" time is when the PC rebooted last time.
When I remove the cronjob and reboot, the zombie process is not
created anymore.

Any idea what's the problem here?

Thanks,
Rob.

On Wed, 2005-Jan-19 09:16:59 +0900, Rob Lahaye wrote:
> tunnel="-L 55110:localhost:110 pop3.univ.net"
> tunnel_up=`pgrep -f -- "${tunnel}"`
> [ "${tunnel_up}" = "" ] && /usr/bin/ssh -N -f ${tunnel}

>It works beautifully, but why does this also generate one zombie process:
> USER  PID %CPU %MEM VSZ RSS TT STAT STARTED    TIME COMMAND
> rob   655  0.0  0.0   0   0 ??  Z    Sat02PM 0:00.01 <defunct>

You get a zombie when a process has exited and the parent hasn't issued
a wait(2) (or SIG_IGN'd SIGCHLD). Have a look at what the parent process
is and that might give you an idea as to what is going wrong.

--
Peter Jeremy

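To illustrate the point about wait(2), here is a minimal sketch, not taken from the thread, that produces a zombie on purpose. A shell starts a short-lived child and then exec()s into sleep, which never calls wait(2), so the exited child should stay in state Z for as long as that parent lives. The variable names (holder, zombies) and the timings are illustrative assumptions, and the sketch assumes a ps that accepts -A and the ppid/stat output keywords, as the FreeBSD and Linux procps versions do.

```shell
#!/bin/sh
# Start a shell that launches a 1-second child in the background and then
# replaces itself (exec) with "sleep 3". The sleep program never calls
# wait(2), so the child is not reaped when it exits.
sh -c 'sleep 1 & exec sleep 3' &
holder=$!          # PID of the process that is now "sleep 3"

# Give the short-lived child time to exit.
sleep 2

# Count the children of $holder whose state starts with Z (zombie).
zombies=$(ps -A -o ppid= -o stat= |
    awk -v p="$holder" '$1 == p && $2 ~ /^Z/ { n++ } END { print n + 0 }')
echo "zombie children of PID ${holder}: ${zombies}"

# Once the parent exits, the zombie is handed to init and reaped.
wait "$holder"
```

The same ps/awk filter, pointed at the PID of the "cron: running job" process, is one way to confirm which process is the parent of a defunct entry.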
Raymond Wiker wrote:
> Peter Jeremy writes:
> > On Wed, 2005-Jan-19 09:16:59 +0900, Rob Lahaye wrote:
> > > tunnel="-L 55110:localhost:110 pop3.univ.net"
> > > tunnel_up=`pgrep -f -- "${tunnel}"`
> > > [ "${tunnel_up}" = "" ] && /usr/bin/ssh -N -f ${tunnel}
> >
> > >It works beautifully, but why does this also generate one zombie process:
> > > USER  PID %CPU %MEM VSZ RSS TT STAT STARTED    TIME COMMAND
> > > rob   655  0.0  0.0   0   0 ??  Z    Sat02PM 0:00.01 <defunct>
> >
> > You get a zombie when a process has exited and the parent hasn't issued
> > a wait(2) (or SIG_IGN'd SIGCHLD). Have a look at what the parent process
> > is and that might give you an idea as to what is going wrong.
>
> Ancient Perl did not collect for children started via the
> backtick operator - is this a possible issue for /bin/sh as well? It
> should be harmless to call wait just after the use of the backtick
> operator above; does that change anything? I.e:
>
> tunnel_up=`pgrep -f -- "${tunnel}"`; wait
>
> To see the parent pid, add "-O ppid" to the arguments to ps;
> e.g,
>
> ps axww -O ppid

Adding the "wait" here does not help at all.

When I verify the parent process I have this:

PID PPID TT STAT    TIME COMMAND           USER   %CPU %MEM  VSZ  RSS STARTED
423  417 ??  I   0:00.00 cron: running jo  root    0.0  0.2 1360 1148 11:38PM
425  423 ??  Z   0:00.00 <defunct>         lahaye  0.0  0.0    0    0 11:38PM

So PID 423 "cron: running job (cron)" is the parent of my <defunct> zombie.

After playing with commenting out lines, I found out that the ssh call is the
reason for the zombie: "/usr/bin/ssh -N -f -L 55110:localhost:110 pop3.univ.net"

Then I tried exec : "exec /usr/bin/ssh -N -f ..."
And I also tried & : "/usr/bin/ssh -N -f ... &"
Both to no avail.

What else can I try, and why is this ssh command causing a zombie process when
called from cron?

Rob.

Raymond Wiker wrote:
> Rob writes:
> > Raymond Wiker wrote:
> > > Peter Jeremy writes:
> > > > On Wed, 2005-Jan-19 09:16:59 +0900, Rob Lahaye wrote:
> > > > > tunnel="-L 55110:localhost:110 pop3.univ.net"
> > > > > tunnel_up=`pgrep -f -- "${tunnel}"`
> > > > > [ "${tunnel_up}" = "" ] && /usr/bin/ssh -N -f ${tunnel}
> > > >
> > > > >It works beautifully, but why does this also generate one zombie process:
> > > > > USER  PID %CPU %MEM VSZ RSS TT STAT STARTED    TIME COMMAND
> > > > > rob   655  0.0  0.0   0   0 ??  Z    Sat02PM 0:00.01 <defunct>
> > > >
> > > > You get a zombie when a process has exited and the parent hasn't issued
> > > > a wait(2) (or SIG_IGN'd SIGCHLD). Have a look at what the parent process
> > > > is and that might give you an idea as to what is going wrong.
> > >
> > > Ancient Perl did not collect for children started via the
> > > backtick operator - is this a possible issue for /bin/sh as well? It
> > > should be harmless to call wait just after the use of the backtick
> > > operator above; does that change anything? I.e:
> > >
> > > tunnel_up=`pgrep -f -- "${tunnel}"`; wait
> > >
> > > To see the parent pid, add "-O ppid" to the arguments to ps;
> > > e.g,
> > >
> > > ps axww -O ppid
> >
> > Adding the "wait" here does not help at all.
> >
> > When I verify the parent process I have this:
> >
> > PID PPID TT STAT    TIME COMMAND           USER   %CPU %MEM  VSZ  RSS STARTED
> > 423  417 ??  I   0:00.00 cron: running jo  root    0.0  0.2 1360 1148 11:38PM
> > 425  423 ??  Z   0:00.00 <defunct>         lahaye  0.0  0.0    0    0 11:38PM
> >
> >
> > So PID 423 "cron: running job (cron)" is the parent of my <defunct> zombie.
> >
> > After playing with commenting out lines, I found out that the ssh call is the
> > reason for the zombie: "/usr/bin/ssh -N -f -L 55110:localhost:110 pop3.univ.net"
> >
> > Then I tried exec : "exec /usr/bin/ssh -N -f ..."
> > And I also tried & : "/usr/bin/ssh -N -f ... &"
> > Both to no avail.
> >
> > What else can I try, and why is this ssh command causing a zombie process when
> > called from cron?
>
> Hmm... Maybe this will work?
>
> /bin/sh -c "/usr/bin/ssh -n -f ${tunnel} &"
>
> --- the effect of this should (hopefully) be that init becomes the
> parent of the zombie process.

No, makes no difference.

What is peculiar about the problem: I use this construct to keep an
ssh-tunnel alive. What could be better than having a cron-script check
whether the tunnel is still active and, if not, re-establish the ssh-tunnel?

Strange that such an obvious construct ends up with a zombie process.

Another interesting detail: as soon as I kill the tunnel created by the
cron-script, the zombie process also disappears. Does that give a clue?

Any more suggestions how to tackle this?

Thanks,
Rob.

Peter Jeremy wrote:
> On Wed, 2005-Jan-19 21:14:26 -0800, spam maps wrote:
>
>>> ( /usr/bin/ssh -n -f ${tunnel} & )
>>
>>Alas, no success. Still get the <defunct> zombie
>>process.
>>
>>I actually wonder if this is an odd or buggy
>>behaviour of ssh, or is cron making a mistake here?
>
>
> The cron daemon (which will have a PPID of 1) forks
> a copy of itself to actually handle the cron job
> (I suspect this is the parent of the zombie that
> you are seeing). This child process runs
> "/bin/sh -c CRONJOB" (where CRONJOB is the line in
> your crontab) and I suspect this is the zombie you
> are seeing.
>
> My guess is that your ssh process is holding open
> file descriptors and the cron child process is
> waiting for these descriptors to close before
> wait()ing for the child. If this is true, then you
> should avoid it with something like:
> ( /usr/bin/ssh -n -f ${tunnel} >/dev/null 2>&1 & )
>

BINGO! That works. Zombie has gone. Thank you.

>>Leaving a zombie process around, means there's a kind
>>of bug/mistake somewhere, right?
>
> Yes. But it's not necessarily a bug in FreeBSD :-).

So, after you've given me a complicated solution to avoid the zombie,
can you tell which program is at error here? Cron, ssh, or FreeBSD?

Rob.
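
Peter's guess about inherited descriptors can be tried out without cron. In the sketch below a command substitution stands in for cron's child process: it reads the job's output from a pipe and only returns once every process holding the write end has closed it. This is an analogy under that assumption, not cron's actual code; the 3-second sleep stands in for the long-lived ssh, and the variable names are illustrative.

```shell
#!/bin/sh
# Case 1: the background sleep inherits stdout (the pipe of the command
# substitution), so the reader cannot see end-of-file until sleep exits.
t0=$(date +%s)
out_held=$(sh -c 'sleep 3 & echo started')
t1=$(date +%s)

# Case 2: same job, but the background process sends stdout and stderr
# to /dev/null, so the pipe closes as soon as the inner shell exits.
out_free=$(sh -c 'sleep 3 >/dev/null 2>&1 & echo started')
t2=$(date +%s)

held=$((t1 - t0))   # seconds spent waiting in case 1
free=$((t2 - t1))   # seconds spent waiting in case 2
echo "inherited descriptors: ${held}s, redirected: ${free}s"
```

The first substitution should block for about as long as the background process runs, while the second returns almost at once. In the tunnel script the corresponding change is the `>/dev/null 2>&1` redirection on the ssh line, as in the command quoted above.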