Puppet is failing to restart lighttpd using the Debian init script.
Both the default action of stop/start and using the reload action, which
basically does the same thing, fail.

It seems to be a filehandle problem. Changing the execute method
in service.rb to redirect stdout to /dev/null allows the daemon to
restart. Otherwise, I end up with a zombie process and according to
strace, ruby keeps doing this:

read(7, 0xa7bf1000, 4096) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
read(7, 0xa7bf1000, 4096) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---

Likewise, removing the redirection of stderr to stdout also fixes the
problem.

This works:
ruby -e "print %x{/etc/init.d/lighttpd reload > /dev/null 2>&1}"
As does:
ruby -e "print %x{/etc/init.d/lighttpd reload}"
But not:
ruby -e "print %x{/etc/init.d/lighttpd reload 2>&1}"

Christian
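The three cases in the report can be reproduced without lighttpd. The sketch below assumes the cause is that %x{} reads from a pipe until every process holding the write end has closed it. FAKE_DAEMON and timed are hypothetical names, and the sh/sleep command is only a stand-in for a daemon that redirects its own stdout but keeps whatever stderr it inherited.

```ruby
# Stand-in for a misbehaving daemon (an assumption, not lighttpd itself):
# it backgrounds a process whose stdout goes to /dev/null but whose stderr
# is left pointing at whatever the caller passed in.
FAKE_DAEMON = "sh -c 'sleep 2 > /dev/null &'"

# Run a command through %x{} and return how long the call blocked, in seconds.
def timed(cmd)
  start = Time.now
  %x{#{cmd}}
  Time.now - start
end

plain    = timed(FAKE_DAEMON)                       # stderr is not the pipe
merged   = timed("#{FAKE_DAEMON} 2>&1")             # stderr becomes the pipe
silenced = timed("#{FAKE_DAEMON} > /dev/null 2>&1") # nothing points at the pipe

puts "plain:    %.2fs" % plain
puts "merged:   %.2fs" % merged
puts "silenced: %.2fs" % silenced
```

If the assumption holds, only the 2>&1 form should block for the lifetime of the background process, because that is the only form in which the background process holds the write end of the pipe.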
Anyone have any comments on whether this should be considered a bug of
the init script, of puppet, or of ruby?

Thanks,
Christian

On Fri, Sep 01, 2006 at 07:34:33PM -0700, Christian G. Warden wrote:
> Puppet is failing to restart lighttpd using the Debian init script.
> Both the default action of stop/start and using the reload action, which
> basically does the same thing, fail.
>
> It seems to be a filehandle problem. Changing the execute method
> in service.rb to redirect stdout to /dev/null allows the daemon to
> restart. Otherwise, I end up with a zombie process and according to
> strace, ruby keeps doing this:
>
> read(7, 0xa7bf1000, 4096) = ? ERESTARTSYS (To be restarted)
> --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
> read(7, 0xa7bf1000, 4096) = ? ERESTARTSYS (To be restarted)
> --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
>
> Likewise, removing the redirection of stderr to stdout also fixes the
> problem.
>
> This works:
> ruby -e "print %x{/etc/init.d/lighttpd reload > /dev/null 2>&1}"
> As does:
> ruby -e "print %x{/etc/init.d/lighttpd reload}"
> But not:
> ruby -e "print %x{/etc/init.d/lighttpd reload 2>&1}"
>
> Christian
Christian G. Warden wrote:
> Anyone have any comments on whether this should be considered a bug of
> the init script, of puppet, or of ruby?

It's hard to say; I tend to think it's probably a bug in lighttpd, but
it's going to be tough to actually pin it down.

I'll look at what impact it would be to send all output to /dev/null,
since that seems to fix things.

--
The Ninety-Ninety Rule of Project Schedules:
The first 90% of the task takes 90% of the time, and the last 10%
takes the other 90%.
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
Christian G. Warden wrote:
> Anyone have any comments on whether this should be considered a bug of
> the init script, of puppet, or of ruby?

Can you try this in perl? Your example was pure ruby, so it can't be a
Puppet bug (unless it's a bug that I redirect stderr to stdout).

--
Luke Kanies | http://reductivelabs.com | http://madstop.com
On Tue, Sep 05, 2006 at 04:36:43PM -0700, Christian G. Warden wrote:
> Anyone have any comments on whether this should be considered a bug of
> the init script, of puppet, or of ruby?

It's a bug in lighttpd (or its init script, perhaps). Lighttpd doesn't
close / redirect its file descriptors like a good daemon should. In
addition to screwing up Puppet, it also does unpleasant things to cfengine
and even to SSH connections (SSH hangs when logging out from a remote
machine).

- Matt

--
"You keep using that word. I do not think it means what you think it
means." -- Inigo, The Princess Bride
On Wed, Sep 06, 2006 at 09:53:32AM +1000, Matthew Palmer wrote:
> On Tue, Sep 05, 2006 at 04:36:43PM -0700, Christian G. Warden wrote:
> > Anyone have any comments on whether this should be considered a bug of
> > the init script, of puppet, or of ruby?
>
> It's a bug in lighttpd (or its init script, perhaps). Lighttpd doesn't
> close / redirect its file descriptors like a good daemon should. In
> addition to screwing up Puppet, it also does unpleasant things to cfengine
> and even to SSH connections (SSH hangs when logging out from a remote
> machine).

This hasn't been my experience:

# ls -l /proc/`pgrep lighttpd`/fd
total 7
lr-x------ 1 root root 64 Sep  5 16:57 0 -> /dev/null
l-wx------ 1 root root 64 Sep  5 16:57 1 -> /dev/null
l-wx------ 1 root root 64 Sep  5 16:57 2 -> /dev/null
l-wx------ 1 root root 64 Sep  5 16:57 3 -> /var/log/lighttpd/access.log
lrwx------ 1 root root 64 Sep  5 16:57 4 -> socket:[1584089764]
lrwx------ 1 root root 64 Sep  5 16:57 5 -> socket:[1584089765]
l-wx------ 1 root root 64 Sep  5 16:57 6 -> /var/log/lighttpd/error.log

I'm able to start lighty from an ssh session and log out, using the init
script and directly, with or without -D.

I thought the problem might be due to the fact that %x{} doesn't pass a
file descriptor for stderr.

Christian
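One way to check the stderr hypothesis on Linux is to ask a child process where its own descriptors point. This is a minimal sketch, assuming /proc is mounted and the readlink utility is installed; the variable names are illustrative only.

```ruby
# Linux-only sketch: /proc/self/fd/N inside the child resolves to whatever
# the child's descriptor N refers to, so readlink reports what %x{} handed it.
stdout_target = %x{readlink /proc/self/fd/1}.chomp
stderr_plain  = %x{readlink /proc/self/fd/2}.chomp
stderr_merged = %x{readlink /proc/self/fd/2 2>&1}.chomp

puts "stdout inside %x{}:        #{stdout_target}"
puts "stderr inside %x{}:        #{stderr_plain}"
puts "stderr with 2>&1 appended: #{stderr_merged}"
```

If the second line names the terminal or log that Ruby itself writes to, the child did inherit a stderr, just not the pipe that %x{} reads from; only the 2>&1 form would then hand that pipe to the child as stderr.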
Christian G. Warden wrote:
>
> I thought the problem might be due to the fact that %x{} doesn't pass a file
> descriptor for stderr.

It's certainly possible.

Since I'm in a four-way crunch right now, only one of which is trying to
get this bug fixed, could you investigate whether redirecting to
/dev/null would impact much? If not, can you submit a patch for me?

Thanks!

--
Anyone who considers arithmetical methods of producing random digits
is, of course, in a state of sin. --John Von Neumann
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
On Tue, Sep 05, 2006 at 07:09:59PM -0500, Luke Kanies wrote:
> Christian G. Warden wrote:
> >
> > I thought the problem might be due to the fact that %x{} doesn't pass a file
> > descriptor for stderr.
>
> It's certainly possible.
>
> Since I'm in a four-way crunch right now, only one of which is trying to
> get this bug fixed, could you investigate whether redirecting to
> /dev/null would impact much? If not, can you submit a patch for me?

If the command fails, the error message that puppet generates includes
the output from the command so it would reduce the ability to
troubleshoot other types of failures.

Here's the patch though:

--- lib/puppet/type/service.rb.orig 2006-09-05 17:27:43.000000000 -0700
+++ lib/puppet/type/service.rb 2006-09-05 17:28:52.000000000 -0700
@@ -442,10 +442,10 @@
     # code.
     def execute(type, cmd)
         self.debug "Executing %s" % cmd.inspect
-        output = %x(#{cmd} 2>&1)
+        %x(#{cmd} > /dev/null 2>&1)
         unless $? == 0
-            self.fail "Could not %s %s: %s" %
-                [type, self.name, output.chomp]
+            self.fail "Could not %s %s" %
+                [type, self.name]
         end
     end

Christian
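The patch gives up the command's output in exchange for a restart that does not hang. One possible middle ground, which nobody in this thread proposes, is to send the output to a temporary file instead of a pipe and read it back afterwards. In the sketch below, run_capturing_to_file is a hypothetical helper and not part of Puppet. It relies on the spawn-style redirection options of Kernel#system and on Tempfile.create, neither of which exists on Ruby 1.8, so it would not apply as-is to the Ruby versions discussed here.

```ruby
require "tempfile"

# Hypothetical helper (not Puppet's API): run cmd with stdout and stderr
# redirected to a temporary file, then return [success, captured_output].
def run_capturing_to_file(cmd)
  Tempfile.create("service-output") do |file|
    # system waits only for the command itself to exit. A background process
    # that keeps the file open does not delay the read the way a pipe would.
    ok = system(cmd, [:out, :err] => [file.path, "w"])
    [ok == true, File.read(file.path)]
  end
end

# Stand-in for a daemon that writes to stderr and leaves a process behind.
ok, output = run_capturing_to_file("sh -c 'echo starting >&2; sleep 2 &'")
puts "ok=#{ok} output=#{output.inspect}"
```

Under these assumptions a failing command still leaves its message in the returned output, so the error raised by execute could keep including it.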
On Sep 5, 2006, at 5:32 PM, Christian G. Warden wrote:
> If the command fails, the error message that puppet generates includes
> the output from the command so it would reduce the ability to
> troubleshoot other types of failures.
>
> Here's the patch though:

I need to get a version of 1.8.1 on my system, but can someone test
this patch against that before committing it? I believe we have some
code that makes sure that stderr isn't redirected on that version,
because it causes all sorts of problems.

1.8.1 IIRC is the version that RHEL is using, so this is an important
thing to cover.

--
Erik Hollensbe
erik@hollensbe.org
On Sep 6, 2006, at 12:20 AM, Erik Hollensbe wrote:
>
> On Sep 5, 2006, at 5:32 PM, Christian G. Warden wrote:
>> If the command fails, the error message that puppet generates includes
>> the output from the command so it would reduce the ability to
>> troubleshoot other types of failures.
>>
>> Here's the patch though:
>
> I need to get a version of 1.8.1 on my system, but can someone test
> this patch against that before committing it? I believe we have some
> code that makes sure that stderr isn't redirected on that version,
> because it causes all sorts of problems.
>
> 1.8.1 IIRC is the version that RHEL is using, so this is an important
> thing to cover.

I should also note that I have new methods to abstract this (and
several other process/fd related issues) in a patch I'm intending to
introduce after the test suite is revamped.

If someone wants to take on the arduous pain of integrating the module
with the rest of the system, I'll be happy to send you the working
module and tests, although I believe it's already in the bug tracker.

--
Erik Hollensbe
erik@hollensbe.org
On Sep 1, 2006, at 9:34 PM, Christian G. Warden wrote:
> Puppet is failing to restart lighttpd using the Debian init script.
> Both the default action of stop/start and using the reload action,
> which basically does the same thing, fail.
>
> It seems to be a filehandle problem. Changing the execute method
> in service.rb to redirect stdout to /dev/null allows the daemon to
> restart. Otherwise, I end up with a zombie process and according to
> strace, ruby keeps doing this:
[snip]

Did you ever submit a bug for this, or find another fix?

I have this as an open issue, and I want to make sure it's resolved in
some way.

--
Luke Kanies
http://madstop.com | http://reductivelabs.com | 615-594-8199
On Thu, Sep 14, 2006 at 01:40:06AM -0500, Luke Kanies wrote:
> On Sep 1, 2006, at 9:34 PM, Christian G. Warden wrote:
>
> > Puppet is failing to restart lighttpd using the Debian init script.
> > Both the default action of stop/start and using the reload action,
> > which basically does the same thing, fail.
> >
> > It seems to be a filehandle problem. Changing the execute method
> > in service.rb to redirect stdout to /dev/null allows the daemon to
> > restart. Otherwise, I end up with a zombie process and according to
> > strace, ruby keeps doing this:
> [snip]
>
> Did you ever submit a bug for this, or find another fix?
>
> I have this as an open issue, and I want to make sure it's resolved
> in some way.

I'm using the patch I submitted to redirect stdout to /dev/null. It's
not ideal, but it solved my immediate problem.

xn
Christian G. Warden wrote:
>
> I'm using the patch I submitted to redirect stdout to /dev/null. It's
> not ideal, but it solved my immediate problem.

Okay.

What do others think -- is it more important that these weird services
work, or that output is provided? Or should I add another parameter so
people can choose, with the default being to keep output?

--
Everything that is really great and inspiring is created by the
individual who can labor in freedom. -- Albert Einstein
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
>>>>> "Christian" == Christian G Warden <cwarden@xerus.org> writes:

    Christian> On Thu, Sep 14, 2006 at 01:40:06AM -0500, Luke Kanies wrote:
    >> On Sep 1, 2006, at 9:34 PM, Christian G. Warden wrote:
    >>
    >> > Puppet is failing to restart lighttpd using the Debian init script.
    >> > Both the default action of stop/start and using the reload action,
    >> > which basically does the same thing, fail.
    >> >
    >> > It seems to be a filehandle problem. Changing the execute method
    >> > in service.rb to redirect stdout to /dev/null allows the daemon to
    >> > restart. Otherwise, I end up with a zombie process and according to
    >> > strace, ruby keeps doing this:
    >> [snip]
    >>

[new to Puppet but not to Debian]

Um, I've noticed similar problems in the past with restarting daemons
under cfengine, although the example I can think of, proftpd, needed
redirection to /dev/tty. I also have a recollection of a discussion on
the exim mailing list about a similar problem to the above (with
lighttpd), although I cannot find it at the moment.

Perhaps there is a Debian developer following the list who can
enlighten us about the current policy regarding init.d scripts and
stdin/out/err and other filehandles.

I would have thought either the init.d script or the daemon itself
should have handled this correctly, but perhaps an option (or options)
for redirecting filehandles needs to be available for those that don't
play nicely.

Sincerely,

Adrian Phillips

--
Who really wrote the works of William Shakespeare?
http://www.pbs.org/wgbh/pages/frontline/shakespeare/
On Tue, Sep 19, 2006 at 07:10:18AM +0200, Adrian Phillips wrote:
> Perhaps there is a Debian developer following the list who can
> enlighten us of the current policy regarding init.d scripts and
> stdin/out/err and other filehandles.

A daemon must manage all of its file handles properly. Lighttpd is broken
(but I haven't got any spare round tuits to fix it).

- Matt <mpalmer@debian.org>

--
> There really is no substitute for brute force.
Indeed - I must admit to being a disciple of blessed Saint Makita myself.
-- Robert Sneddon and Tanuki, in the Monastery
On Mon, 2006-09-18 at 10:48 -0500, Luke Kanies wrote:
> Christian G. Warden wrote:
> >
> > I'm using the patch I submitted to redirect stdout to /dev/null. It's
> > not ideal, but it solved my immediate problem.
>
> Okay.
>
> What do others think -- is it more important that these weird services
> work, or that output is provided? Or should I add another parameter so
> people can choose, with the default being to keep output?

I would be extremely reluctant to add a feature to puppet just to work
around upstream breakage. How about using puppet to roll out a fixed
init script as long as it's broken upstream?

David