thr3ads.net - Puppet users - puppet hangs while trying to restart a daemon [Mar 2007]

Lluis Gili

2007-Mar-19 11:55 UTC

puppet hangs while trying to restart a daemon

Hello all,
Puppet hangs while trying to restart a daemon when the associated config
file changes. Here is the log:

Mar 19 11:30:04 ingentTest puppetd[27390]: Starting Puppet client version 0.22.0
puppetd[27390]: Starting configuration run
puppetd[27390]: (/ingenttest/qualsevolnode/guaita[guaita]/File[/etc/guaita.conf]/content) synced
puppetd[27390]: (//ingenttest/qualsevolnode/guaita[guaita]/Service[guaitad]) Triggering 'refresh' from 1 dependencies

At this point, Puppet does nothing more until I restart the daemon
manually:

# /etc/init.d/guaitad restart
S'està aturant el  (guaitad):                             [ FET  ]
S'està iniciant el guaitad:                               [ FET  ]

puppetd[27390]: Finished configuration run in 4169.79 seconds

Here is the relevant part of the manifest:

file { "/etc/guaita.conf":
    backup  => ".bak2",
    owner   => root,
    group   => root,
    mode    => 644,
    content => template("guaita.erb"),
    alias   => guaitaconf,
    notify  => Service[guaitad],
}

service { "guaitad":
    enable     => true,
    ensure     => running,
    hasrestart => true,
    hasstatus  => false,
    path       => "/etc/init.d/",
    pattern    => "guaitad",
    subscribe  => File[guaitaconf],
}



thanks all!

Luke Kanies

2007-Mar-20 22:33 UTC

Re: puppet hangs while trying to restart a daemon

On Mar 19, 2007, at 6:55 AM, Lluis Gili wrote:
> Hello all,
> puppet hangs while trying to restart a daemon when the associated
> config file changes, here the log:
>
> Mar 19 11:30:04 ingentTest puppetd[27390]: Starting Puppet client version 0.22.0
> puppetd[27390]: Starting configuration run
> puppetd[27390]: (/ingenttest/qualsevolnode/guaita[guaita]/File[/etc/guaita.conf]/content) synced
> puppetd[27390]: (//ingenttest/qualsevolnode/guaita[guaita]/Service[guaitad]) Triggering 'refresh' from 1 dependencies

Can you try this with 0.22.2?  I changed the execution code to reopen
stdin as /dev/null, which might fix problems like this, but I have no
real idea whether it will.

This is a bug in your program's init script, but it is a common enough
bug that Puppet should be able to work around it. I haven't yet found
the right workaround.

  --
  When a man sits with a pretty girl for an hour, it seems like a
  minute.  But let him sit on a hot stove for a minute, and it's longer
  than any hour.  That's relativity.   --Albert Einstein
  ---------------------------------------------------------------------
  Luke Kanies | http://reductivelabs.com | http://madstop.com

Lluis Gili

2007-Mar-21 09:10 UTC

Re: puppet hangs while trying to restart a daemon

On Tue, 20 Mar 2007 at 17:33 -0500, Luke Kanies wrote:
> Can you try this with 0.22.2?  I changed the execution code to reopen
> stdin as /dev/null, which might fix problems like this, but I've no
> real idea.
>
> This is a bug in the init script of your program, but it's a common
> enough bug that Puppet should be able to work around it, but I
> haven't yet found the right workaround.

I'm still having the same problem with 0.22.2. What should I change in my
init script? Does it have to return something?
Anyway, I think Puppet should have a timeout for the configuration run.

Lluis

Jeff McCune

2007-Mar-21 16:15 UTC

Re: puppet hangs while trying to restart a daemon

Lluis Gili wrote:
> On Tue, 20 Mar 2007 at 17:33 -0500, Luke Kanies wrote:
>> Can you try this with 0.22.2?  I changed the execution code to reopen
>> stdin as /dev/null, which might fix problems like this, but I've no
>> real idea.
>>
>> This is a bug in the init script of your program, but it's a common
>> enough bug that Puppet should be able to work around it, but I
>> haven't yet found the right workaround.
>
> I'm still having same problem with 0.22.2, what would I change of my
> init script? have to return something?
> Anyway, I think that puppet should have a timeout for configuration run.
>
> Lluis

I'm seeing the same thing trying to start automount on a Mac workstation
in 0.22.2.

Here's what I have:

class autohome {
  # file { "/var/run/automount.initialized": }
  service { "autohome":
    provider => base,
    hasrestart => true,
    pattern => 'automount/home',
    restart => "ps auxww | grep 'automount/home' \
        | grep -v grep | awk '{print \$2}' | xargs kill -HUP",
    start => "/usr/sbin/automount -m /home \
        /etc/auto_home -mnt /private/var/automount/home",
    stop => "ps auxww | grep 'automount/home' | \
        grep -v grep | awk '{print \$2}' | xargs kill",
    subscribe => File["/etc/auto_home"],
    require => File["/etc/auto_home"],
    ensure => running
  }
  remotefile { "/etc/auto_home":
    basedir => "dynamic",
    source => "gen_autohome/auto_home",
    mode => 0644, owner => 0, group => 0
  }
}

I notice a few strange things about the process spawned by puppet.  The
service I'm starting forks a copy of itself and should then terminate
gracefully, but it remains a zombie for as long as puppetd is running.

For example, while puppet is running, ps auxww shows:
root 1526 0.0 0.0  0      0  p1  Z+   31Dec69   0:00.00 (automount)

As soon as I kill puppetd, the zombie process goes away.  It seems like
puppet is starting the process and passing along the TTY I'm running
puppetd from.

I'll look at spawning the process in a more non-interactive way.

-- 
Jeff McCune
The Ohio State University
Department of Mathematics
Systems Manager


_______________________________________________
Puppet-users mailing list
Puppet-users@madstop.com
https://mail.madstop.com/mailman/listinfo/puppet-users

Tim Stoop

2007-Mar-23 04:02 UTC

Re: puppet hangs while trying to restart a daemon

On 3/21/07, Jeff McCune <mccune@math.ohio-state.edu> wrote:
> I'm seeing the same thing trying to start automount on a mac workstation
> in 0.22.2.

I'm having it with 0.22.1 and the tomcat5.5 init script.

And I think I found a lead to the problem. This script calls su to
execute the program. As soon as I comment that out, the script returns
as normal. This is the code that I commented out:

#               su -p -s /bin/sh $TOMCAT5_USER \
#                               -c "$ROTATELOGS \"$CATALINA_BASE/logs/catalina_%F.log\" 86400" \
#                               < "$CATALINA_BASE/logs/catalina.out" &
#               su -p -s /bin/sh $TOMCAT5_USER \
#                       -c "\"$DAEMON\" start $STARTUP_OPTS" \
#                       >> "$CATALINA_BASE/logs/catalina.out" 2>&1

As soon as I uncomment either of these su statements, the process hangs
again on startup. And I kinda need them because that's how the stuff
gets started :)

I played around a bit with su, but the problem is definitely something
in the scripts that get called. $DAEMON is tomcat5.5, the Java app.
Rotatelogs is simply that, the ELF binary.

Everything gets started correctly, though, even when I abort the
puppet run. Maybe build in a timeout for the service start, and if
there is no response after 5 seconds, try a status to see whether it's
running or not? Something like that, at least.

-- 
Regards,
Tim
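
The timeout-then-status idea suggested above can be sketched in Ruby roughly as follows. This is only an illustration under assumptions: start_with_timeout, its arguments and its return symbols are made up for this example and are not part of Puppet.

```ruby
# Hypothetical helper, not Puppet's API: run a start command, stop waiting
# after `limit` seconds, and then ask a status command whether the service
# came up anyway.
def start_with_timeout(start_cmd, status_cmd, limit = 5)
  # Output goes to /dev/null so nothing here can block on a pipe.
  pid = Process.spawn(*start_cmd, in: File::NULL, out: File::NULL, err: File::NULL)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit

  # WNOHANG makes waitpid return nil while the command is still running.
  until Process.waitpid(pid, Process::WNOHANG)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.detach(pid) # reap it whenever it finishes, so no zombie is left
      running = system(*status_cmd, out: File::NULL, err: File::NULL)
      return running ? :running_after_timeout : :timed_out
    end
    sleep 0.1
  end
  $?.success? ? :started : :failed
end

# Example call (paths are placeholders for whatever init script is in use):
#   start_with_timeout(["/etc/init.d/tomcat5.5", "start"],
#                      ["/etc/init.d/tomcat5.5", "status"], 5)
```

The polling interval of 0.1 seconds is arbitrary; the point is only that the caller never blocks indefinitely on the start command.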

Jordan Share

2007-Mar-23 05:11 UTC

Re: puppet hangs while trying to restart a daemon

Tim Stoop wrote:
> On 3/21/07, Jeff McCune <mccune@math.ohio-state.edu> wrote:
>> I'm seeing the same thing trying to start automount on a mac workstation
>> in 0.22.2.
>
> I'm having it with 0.22.1 and the tomcat5.5 init script.
>
> And I think I found a lead to the problem. This script calls su to
> execute the program. As soon as I comment that out, the script returns
> as normal. This is the code that I commented:
>
> #               su -p -s /bin/sh $TOMCAT5_USER \
> #                               -c "$ROTATELOGS \"$CATALINA_BASE/logs/catalina_%F.log\" 86400" \
> #                               < "$CATALINA_BASE/logs/catalina.out" &
> #               su -p -s /bin/sh $TOMCAT5_USER \
> #                       -c "\"$DAEMON\" start $STARTUP_OPTS" \
> #                       >> "$CATALINA_BASE/logs/catalina.out" 2>&1
>
> As soon as I uncomment either of these su statement, the process keeps
> hanging on this startup. And I kinda need them because that's how the
> stuff gets started :)
>
> I played around a bit with su, but it's definitly something in the
> scripts that gets called. The $DAEMON is tomcat5.5, the java app.
> Rotatelogs is simply that, the ELF binary.
>
> Everything gets started correctly, though, even when I abort the
> puppet run. Maybe built a timeout for the service start and if after 5
> seconds there is no response, try a status, to see if it's running or
> not? Something like that, at least.

This is the same problem that I described in my mail on Mar 15.

If you add " 2> /dev/null 1> /dev/null " before the ampersand on the
first su command, you can start and stop the service properly.

My test was whether I am actually dropped out of ssh after starting the
service manually (without the above fix, I had to ~. out of my ssh
sessions to the box).

I figure I am just going to build my own tomcat package with that fix in
the init script, since I can't think how else to install the package and
have it not hang puppet.

Jordan
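
A small Ruby experiment can make the effect of that redirect visible. It is a sketch under assumptions, not the tomcat init script: a backgrounded sleep stands in for the daemon, and seconds_until_eof is a name invented for this example. It times how long a parent that reads the script's stdout has to wait for end-of-file.

```ruby
# Illustrative only: time how long a parent blocks reading a child's stdout
# pipe when a backgrounded process inherits that pipe, and when it does not.

# Run `script` with sh, read its stdout until EOF, return elapsed seconds.
def seconds_until_eof(script)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  IO.popen(["sh", "-c", script]) { |io| io.read }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# The background sleep keeps the write end of the pipe open, so EOF arrives
# only when it exits, even though the sh that launched it is long gone.
inherited = seconds_until_eof("sleep 2 &")

# With stdout and stderr pointed at /dev/null, nothing holds the pipe open.
redirected = seconds_until_eof("sleep 2 >/dev/null 2>&1 &")

puts format("inherited pipe:  %.1fs", inherited)
puts format("redirected pipe: %.1fs", redirected)
```

With a real daemon in place of sleep, the first case never reaches EOF, which would match the hang described in this thread.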

Adrian Phillips

2007-Mar-23 05:55 UTC

Re: puppet hangs while trying to restart a daemon

>>>>> "Tim" == Tim Stoop <tim.stoop@gmail.com>
writes:
    Tim> On 3/21/07, Jeff McCune <mccune@math.ohio-state.edu> wrote:
    >> I'm seeing the same thing trying to start automount on a mac
    >> workstation in 0.22.2.

    Tim> I'm having it with 0.22.1 and the tomcat5.5 init script.

We had occasional problems with init.d scripts and cfengine. Have you
tried using nohup (before the command in the init.d script that
appears to hang), with stdout/stderr sent somewhere else and possibly
stdin redirected from /dev/null (< /dev/null)?

Sincerely,

Adrian Phillips

-- 
Who really wrote the works of William Shakespeare ?
http://www.pbs.org/wgbh/pages/frontline/shakespeare/

Lluis Gili

2007-Mar-23 12:20 UTC

Re: puppet hangs while trying to restart a daemon

Hello again,
running in debug mode I get this:

notice: Caught INT; shutting down
debug: Signal caught here:
debug: /usr/lib/ruby/site_ruby/1.8/puppet/external/event-loop/event-loop.rb:123:in `call'
debug: /usr/lib/ruby/site_ruby/1.8/puppet/external/event-loop/event-loop.rb:123:in `select'
debug: /usr/lib/ruby/site_ruby/1.8/puppet/external/event-loop/event-loop.rb:123:in `select'
debug: /usr/lib/ruby/site_ruby/1.8/puppet/external/event-loop/event-loop.rb:112:in `iterate'
debug: /usr/lib/ruby/site_ruby/1.8/puppet/external/event-loop/event-loop.rb:103:in `run'
debug: /usr/lib/ruby/site_ruby/1.8/puppet.rb:323:in `start'
debug: /usr/sbin/puppetd:443

I looked at it, but my Ruby knowledge is below basic :(

Lluís

Ceri Storey

2007-Mar-23 13:41 UTC

Re: puppet hangs while trying to restart a daemon

Adrian Phillips wrote:
>>>>>> "Tim" == Tim Stoop <tim.stoop@gmail.com> writes:
>
>     Tim> On 3/21/07, Jeff McCune <mccune@math.ohio-state.edu> wrote:
>     >> I'm seeing the same thing trying to start automount on a mac
>     >> workstation in 0.22.2.
>
>     Tim> I'm having it with 0.22.1 and the tomcat5.5 init script.
>
> We had occasional problems with init.d scripts and cfengine - have you
> tried using nohup (before the command in the init.d script that
> appears to hang), with stdout/err to somehwre else, possibly
> redirecting stdin to < dev/null.

To clarify things from what I've seen myself: we've had hangs when
restarting daemons from within puppet. When we ran puppet under
strace, it seemed that ruby would notice that the process had exited,
but would then wait to drain the output pipe before continuing.

The fix we had was, as suggested, to run the offending process under
something like nohup(1) (or in our case, initlog(1) via the daemon
shell function).

From my point of view, one way to fix this would be to re-implement
(yes, it's horrible) IO.popen in terms of the non-blocking write
functions implemented in ruby 1.9 or 1.8.5. However, that's not
necessarily such a great idea.

Alternatively, you could use a signal handler for SIGCHLD and still use
IO.popen, but it seems difficult to ensure that you capture all of the
output from the command up until the point that the process exits.

You might use something like the (utterly horrible) code (written during
my lunch hour) I've attached below as a basis. If I have time, I'll have
a go at cleaning it up and integrating it into puppet this weekend.

Ceri Storey

2007-Mar-23 13:44 UTC

Re: puppet hangs while trying to restart a daemon

On Fri, Mar 23, 2007 at 01:41:08PM +0000, Ceri Storey wrote:
> **********************************************************************
> ******* This email and any attachments are strictly confidential
> and intended for the addressee(s) only. If this email has been
> sent to you in error, [...]

My apologies for the above guff. I'd forgotten that our outbound relays
add this to every message.

-- 
Ceri Storey <cez@necrofish.org.uk>
'What I really want is "apt-get smite"'
    --Rob Partington
http://unix.culti.st/

Robb Wagoner

2007-Mar-23 14:28 UTC

Re: puppet hangs while trying to restart a daemon

I've seen similar issues with my own init scripts. I've found that not
closing the stdin descriptor (with "<&-") is the culprit.

Hope that is of help,
Robb

Robb Wagoner

2007-Mar-23 15:04 UTC

Re: puppet hangs while trying to restart a daemon

Like I mentioned in an earlier email, in my experience this behavior is
usually caused by unclosed file descriptors. I believe a more 'correct'
approach is to close the stdin, stdout, and stderr file descriptors
with "<&- >&- 2>&-":

Close stdin: <&-
Close stdout: >&-
Close stderr: 2>&-

I can only confirm that this works in bash; it is untested in sh, so it
may not work on an *old* system.

RW

Jordan Share

2007-Mar-23 16:07 UTC

Re: puppet hangs while trying to restart a daemon

Robb Wagoner wrote:
> Like I mentioned in an earlier email, in my experience this behavior is
> usually caused by unclosed file descriptors. I believe a more 'correct'
> approach is to close the stdin, stdout, and stderr file descriptors:
>
> "<&- >&- 2>&-" , respectively.
>
> Close stdin: <&-
> Close stdout: >&-
> Close stderr: 2>&-
>
> I can only confirm that this is correct for bash. However, it is
> untested for sh. So if someone has an *old* system, it may not work.

Okey doke, I'll give that a whirl.

I had sent my mail last night (before you replied) but from a 
non-subscribed account, so it got moderator-delay. :)

Jordan

David Lutterkort

2007-Mar-23 16:35 UTC

Re: puppet hangs while trying to restart a daemon

On Fri, 2007-03-23 at 13:41 +0000, Ceri Storey wrote:
> From my point of view, one way to fix this would be to re-implement
> (yes, It's horrible) IO.popen in terms of the non-blocking write
> functions implemented in ruby 1.9 or 1.8.5. However, that's not
> necessarily such a great idea.

It would be great if puppet didn't depend on 1.8.5 - RHEL4 and therefore
CentOS4 are still on ruby-1.8.1

David

Russ Allbery

2007-Mar-23 17:18 UTC

Re: puppet hangs while trying to restart a daemon

Robb Wagoner <robbw@klir.com> writes:

> Like I mentioned in an earlier email, in my experience this behavior is
> usually caused by unclosed file descriptors. I believe a more 'correct'
> approach is to close the stdin, stdout, and stderr file descriptors:
> "<&- >&- 2>&-" , respectively.
> Close stdin: <&-
> Close stdout: >&-
> Close stderr: 2>&-
> I can only confirm that this is correct for bash. However, it is
> untested for sh. So if someone has an *old* system, it may not work.

Redirecting to /dev/null is safer than closing.  Some programs freak out
if the standard file descriptors are closed.  (Including parts of Ruby
that Puppet uses, in fact.)

A common programming error is to assume that a new file descriptor can't
ever be 0 since 0 is always open on standard input.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>
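
The difference between closing a descriptor and redirecting it to /dev/null can be tried from Ruby with Process.spawn. This is a minimal sketch, assuming a POSIX sh whose echo reports a write error on a closed stdout; echo_succeeds? is a name invented for this example.

```ruby
# Illustrative only: run the same command once with stdout closed and once
# with stdout redirected to /dev/null, and report whether it exited cleanly.
def echo_succeeds?(stdout_target)
  pid = Process.spawn("sh", "-c", "echo hello",
                      in: File::NULL, out: stdout_target, err: File::NULL)
  Process.wait(pid)
  $?.success?
end

# :close asks spawn to close the descriptor in the child; File::NULL is
# the path of the null device ("/dev/null" on Unix-like systems).
puts "stdout closed:    #{echo_succeeds?(:close)}"
puts "stdout /dev/null: #{echo_succeeds?(File::NULL)}"
```

A write to a closed descriptor fails, so a program that checks its writes exits with an error; a write to /dev/null is simply discarded.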

Kenton Brede

2007-Mar-23 17:48 UTC

Re: puppet hangs while trying to restart a daemon

On 3/23/07, David Lutterkort <dlutter@redhat.com> wrote:
> On Fri, 2007-03-23 at 13:41 +0000, Ceri Storey wrote:
> > From my point of view, one way to fix this would be to re-implement
> > (yes, It's horrible) IO.popen in terms of the non-blocking write
> > functions implemented in ruby 1.9 or 1.8.5. However, that's not
> > necessarily such a great idea.
>
> It would be great if puppet didn't depend on 1.8.5 - RHEL4 and therefore
> CentOS4 are still on ruby-1.8.1

Are you saying the new 0.22.2 version doesn't work with ruby-1.8.1, or
are you saying it's not a good idea to tie Puppet to a particular Ruby
version?  I've been testing version 0.22.1 on RHEL4 but haven't had a
chance to try 0.22.2.

I don't have that many boxes compared to a lot of people, but
upgrading/maintaining Ruby to support Puppet isn't a road I want to go
down.  Is there a reasonable guess as to how long backward
compatibility will remain for ruby-1.8.1?  BTW Ubuntu LTS uses version
1.8.2.

Kent

David Lutterkort

2007-Mar-23 19:50 UTC

Re: puppet hangs while trying to restart a daemon

On Fri, 2007-03-23 at 12:48 -0500, Kenton Brede wrote:
> Are you saying the new 0.22.2 version doesn't work with ruby-1.8.1

No, 0.22.2 works with ruby-1.8.1, in fact my puppetmaster is running on
RHEL4 and everything seems fine.

> or are you saying it's not a good idea to tie Puppet to a particular Ruby
> version?

I am saying we shouldn't break compatibility with ruby-1.8.1 if it's not
absolutely necessary.

David

Kenton Brede

2007-Mar-23 21:00 UTC

Re: puppet hangs while trying to restart a daemon

On 3/23/07, David Lutterkort <dlutter@redhat.com> wrote:
> On Fri, 2007-03-23 at 12:48 -0500, Kenton Brede wrote:
> > Are you saying the new 0.22.2 version doesn't work with ruby-1.8.1
>
> No, 0.22.2 works with ruby-1.8.1, in fact my puppetmaster is running on
> RHEL4 and everything seems fine.
>
> > or are you saying it's not a good idea to tie Puppet to a particular Ruby
> > version?
>
> I am saying we shouldn't break compatibility with ruby-1.8.1 if it's not
> absolutely necessary.

I second that opinion :)
Kent

Luke Kanies

2007-Mar-24 21:39 UTC

Re: puppet hangs while trying to restart a daemon

On Mar 23, 2007, at 8:41 AM, Ceri Storey wrote:
> To clarify things from what I've seen myself, we've had hangs when
> restarting daemons from within puppet. Now, when we ran puppet under
> strace, it seemed to be the case that ruby would notice that the
> process had exited, but would then wait to drain the output pipe before
> continuing.

Now that I've realized 0.22.2 has a bug in filebucket handling, along
with a couple of other small bugs in it, I'm thinking of putting
0.22.3 out on Monday or Tuesday.

Given that, is there something I can do that will fix all of these
hanging problems?  Has anyone found a consistent fix for them?

  --
  Opportunity is missed by most people because it is dressed in overalls
  and looks like work.        -- Thomas A. Edison
  ---------------------------------------------------------------------
  Luke Kanies | http://reductivelabs.com | http://madstop.com

Russ Allbery

2007-Mar-24 21:51 UTC

Re: puppet hangs while trying to restart a daemon

Luke Kanies <luke@madstop.com> writes:
> Given that, is there something I can do that will fix all of these
> hanging problems?  Has anyone found a consistent fix for them?
In the select loop, one has to wake up periodically and see if the child
has exited, and if it has, decide that the action is done even if the
output pipes are still unclosed.

In an ideal world where everyone implemented it properly, you could
instead block SIGCHLD and then use pselect instead of select to ensure
that the SIGCHLD breaks you out of your select loop. But given that
Linux's pselect was broken until very recently (and in some testing last
night appears to still be broken, at least with the glibc that I have on
hand), you have to use a timeout in select and check for the child exit
status periodically.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>

Luke Kanies

2007-Mar-24 21:54 UTC

Re: puppet hangs while trying to restart a daemon

On Mar 24, 2007, at 4:51 PM, Russ Allbery wrote:
> Luke Kanies <luke@madstop.com> writes:
>
>> Given that, is there something I can do that will fix all of these
>> hanging problems?  Has anyone found a consistent fix for them?
>
> In the select loop, one has to wake up periodically and see if the
> child has exited, and if it has, decide that the action is done even
> if the output pipes are still unclosed.

Could you provide some pseudo-code or real code to demonstrate this?
This area isn't exactly my forté.

  --
  It's not the voting that's democracy, it's the counting.  -- Tom Stoppard
  ---------------------------------------------------------------------
  Luke Kanies | http://reductivelabs.com | http://madstop.com

Russ Allbery

2007-Mar-24 22:18 UTC

Re: puppet hangs while trying to restart a daemon

Luke Kanies <luke@madstop.com> writes:
> Could you provide some pseudo-code or real code to demonstrate this?
> This area isn''t exactly my forté.
I don't have anything in Ruby, but here's the core of remctld's select
loop in C.  I spent a couple of hours last night debugging problems
related to this and trying to get pselect to work.

    /* Initialize read status for standard output and standard error. */
    status[0] = -1;
    status[1] = -1;

    /* Now, loop while we have input.  We no longer have input if the return
       status of read is 0 on all file descriptors.  At that point, we break
       out of the loop.

       Exceptionally, however, we want to catch the case where our child
       process ran some other command that didn't close its inherited standard
       output and error and then exited itself.  This is not uncommon with
       init scripts that start poorly-written daemons.  Once our child process
       is finished, we're done, even if standard output and error from the
       child process aren't closed yet.  To catch this case, call waitpid with
       the WNOHANG flag each time through the select loop and decide we're
       done as soon as our child has exited. */
    while (!process->reaped) {
        FD_ZERO(&fdset);
        maxfd = -1;
        for (i = 0; i < 2; i++) {
            if (status[i] != 0) {
                FD_SET(process->fds[i], &fdset);
                if (process->fds[i] > maxfd)
                    maxfd = process->fds[i];
            }
        }
        if (maxfd == -1)
            break;

        /* We want to wait until either our child exits or until we get data
           on its output file descriptors.  Normally, the SIGCHLD signal from
           the child exiting would break us out of our select loop.  However,
           the child could exit between the waitpid call and the select call,
           in which case select could block forever since there's nothing to
           wake it up.

           The POSIX-correct way of doing this is to block SIGCHLD and then
           use pselect instead of select with a signal mask that allows
           SIGCHLD.  This allows SIGCHLD from the exiting child process to
           reliably interrupt pselect without race conditions from the child
           exiting before pselect is called.

           Unfortunately, Linux didn't implement a proper pselect until 2.6.16
           and the glibc wrapper that emulates it leaves us open to exactly
           the race condition we're trying to avoid.  This unfortunately
           leaves us with no choice but to set a timeout and wake up every
           five seconds to see if our child died.  (The wait time is arbitrary
           but makes the test suite less annoying.)

           If we see that the child has already exited, do one final poll of
           our output file descriptors and then call the command finished. */
        timeout.tv_sec = 5;
        timeout.tv_usec = 0;
        if (waitpid(process->pid, &process->status, WNOHANG) > 0) {
            process->reaped = 1;
            timeout.tv_sec = 0;
        }
        result = select(maxfd + 1, &fdset, NULL, NULL, &timeout);
        if (result < 0 && errno != EINTR) {
            syswarn("select failed");
            server_send_error(client, ERROR_INTERNAL, "Internal failure");
            free(status);
            return 0;
        }

        /* Iterate through each set file descriptor and read its output.   If
           we're using protocol version one, we append all the output together
           into the buffer.  Otherwise, we send an output token for each bit
           of output as we see it. */
        for (i = 0; i < 2; i++) {
            fd = process->fds[i];
            if (!FD_ISSET(fd, &fdset))
                continue;
            if (client->protocol == 1) {
                if (left > 0) {
                    status[i] = read(fd, p, left);
                    if (status[i] < 0 && (errno != EINTR && errno != EAGAIN))
                        goto readfail;
                    else if (status[i] > 0) {
                        p += status[i];
                        left -= status[i];
                    }
                } else {
                    status[i] = read(fd, junk, sizeof(junk));
                    if (status[i] < 0 && (errno != EINTR && errno != EAGAIN))
                        goto readfail;
                }
            } else {
                status[i] = read(fd, client->output, MAXBUFFER);
                if (status[i] < 0 && (errno != EINTR && errno != EAGAIN))
                    goto readfail;
                if (status[i] > 0) {
                    client->outlen = status[i];
                    if (!server_v2_send_output(client, i + 1))
                        goto fail;
                }
            }
        }
    }

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>
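
For readers who want the same idea in Ruby, here is a rough sketch of the loop described above: poll the child with WNOHANG, use a timeout in select, and stop once the child has exited even if something else still holds the pipe open. It is not Puppet's code or a translation of remctld; run_and_collect and poll_interval are names made up for this example, and it assumes a Ruby with Process.spawn and read_nonblock (newer than the 1.8.1 discussed in this thread).

```ruby
# Hypothetical helper: run cmd, collect stdout and stderr, and return as soon
# as the child itself has exited, even if a process it started keeps the
# pipe open.
def run_and_collect(cmd, poll_interval = 5)
  reader, writer = IO.pipe
  pid = Process.spawn(*cmd, in: File::NULL, out: writer, err: writer)
  writer.close                        # the parent keeps only the read end
  output = String.new
  status = nil
  eof = false

  until status || eof
    # WNOHANG: waitpid returns nil while the child is still running.
    status = $? if Process.waitpid(pid, Process::WNOHANG)
    # If the child is gone, poll once more without waiting; otherwise sleep
    # in select for at most poll_interval seconds.
    ready, = IO.select([reader], nil, nil, status ? 0 : poll_interval)
    next unless ready
    begin
      loop { output << reader.read_nonblock(4096) }
    rescue IO::WaitReadable
      # Everything currently available has been read.
    rescue EOFError
      eof = true                      # every writer has closed the pipe
    end
  end

  status ||= Process.wait2(pid).last  # pipe reached EOF before the child was reaped
  reader.close
  [status, output]
end

# The sh exits at once; the background sleep still holds the pipe open.
status, output = run_and_collect(["sh", "-c", "echo started; sleep 3 &"], 1)
puts "exit status: #{status.exitstatus}, output: #{output.inspect}"
```

As in the C version, output written by a leftover background process after the child has exited is deliberately ignored.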

Ceri Storey

2007-Mar-24 23:12 UTC

Re: puppet hangs while trying to restart a daemon

On 23 Mar 2007, at 19:50, David Lutterkort wrote:
> On Fri, 2007-03-23 at 12:48 -0500, Kenton Brede wrote:
>> Are you saying the new 0.22.2 version doesn't work with ruby-1.8.1
>
> No, 0.22.2 works with ruby-1.8.1, in fact my puppetmaster is
> running on RHEL4 and everything seems fine.

I'm using CentOS 4.4 i386 at work, and we had problems with the
puppetmaster segfaulting, but upgrading to the ruby from CentOS
testing fixed that issue.

Just a note for the archives.
-- 
Ceri Storey <cez@necrofish.org.uk> -- http://unix.culti.st/

Luke Kanies

2007-Apr-03 17:06 UTC

Re: puppet hangs while trying to restart a daemon

On Mar 23, 2007, at 8:41 AM, Ceri Storey wrote:
> To clarify things from what I've seen myself, we've had hangs when
> restarting daemons from within puppet. Now, when we ran puppet under
> strace, it seemed to be the case that ruby would notice that the
> process had exited, but would then wait to drain the output pipe before
> continuing.

Jeff McCune has just done a bunch of work that should hopefully fix
this problem.

Puppet no longer waits on the output of service commands, so this
shouldn't be a problem any more.  Package commands and execs, which
produce output that Puppet needs, can still clearly cause problems,
but I don't think we'll ever completely avoid those.

  --
  I respect faith, but doubt is what gets you an education.
          -- Wilson Mizner
  ---------------------------------------------------------------------
  Luke Kanies | http://reductivelabs.com | http://madstop.com
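
The approach described in this last message, waiting for a service command's exit status but not for its output, might look something like the following in Ruby. This is a guess at the general shape only, not the change that went into Puppet; run_service_command is a hypothetical name.

```ruby
# Hypothetical sketch: no pipe is created, so there is nothing to drain and
# a leftover background process cannot keep the caller waiting.
def run_service_command(cmd)
  pid = Process.spawn(*cmd, in: File::NULL, out: File::NULL, err: File::NULL)
  Process.wait(pid)   # blocks only until the command itself exits
  $?.success?
end

# Returns promptly although the background sleep is still running.
puts run_service_command(["sh", "-c", "sleep 3 & exit 0"])
```

The trade-off is the one noted above: commands whose output is needed still have to be read through a pipe.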
