Walter Heck
2012-Apr-24 02:43 UTC
[Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
Hi all,

In an unfortunate incident, I managed to lock myself out of a client's server. An openssh module that disables remote root logins by default was applied to a server that was only accessed by remote root login (there are no other user accounts on that server). Unfortunately, this is a colocated server and the next trip to the DC is scheduled for next Thursday. No KVM over IP, no remote hands, pretty much the ideal situation :P. The Xen server is still running, and so are the domUs on it, but this is less than ideal. If any of the domUs goes down, there's nothing we can do :)

Now, the puppet agent is running every 30 minutes, but something seems to keep it from executing the catalog. I have set the puppetmaster to debug in order to see what's happening, but I can't figure it out. Here's a gist of the puppet master log: gist.github.com/2475554

x7 is the offending server; x6 has exactly the same puppet definition. Can anyone tell me why the log for x7 just stops, with no error or anything? What does that indicate is happening on x7? Any help is much appreciated :)

cheers,

--
Walter Heck

--
follow @walterheck on twitter to see what I'm up to!
--
Check out my new startup: Server Monitoring as a Service @ tribily.com
Follow @tribily on Twitter and/or 'Like' our Facebook page at facebook.com/tribily

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at groups.google.com/group/puppet-users?hl=en.
Christopher Wood
2012-Apr-24 02:56 UTC
Re: [Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
I admit I've never read puppetmaster logs like that, so what I'm about to say may be very bad advice.

Since resources removed from your manifests become unmanaged rather than deleted, why not swap the node's current manifest for one which only re-enables ssh root login? Once you have access you can return to the desired manifest, and use your existing socket to see what gives.

This might also be the time to consider remote syslog; that way you can see what the node itself is doing.
Sharuzzaman Ahmat Raslan
2012-Apr-24 02:59 UTC
Re: [Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
In my test environment, I managed to make the puppet agent stop working by updating the facter package without restarting puppet. Somehow, once facter was updated, puppet was not able to continue working, maybe because the expected version of facter was different or something. It was resolved when I restarted the puppet agent.

On troubleshooting: did you enable reporting in the puppet agent config? If you did, you might want to look at the reports in /var/lib/puppet/reports/<hostname>. I noticed my previous issue when I went through the reports in that folder.

--
Sharuzzaman Ahmat Raslan
Walter Heck
2012-Apr-24 03:48 UTC
Re: [Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
Hiya,

On Tue, Apr 24, 2012 at 10:56, Christopher Wood <christopher_wood@pobox.com> wrote:
> I admit I've never read puppetmaster logs like that so what I'm about to say may be very bad advice.

Any advice is welcome :)

> Since resources removed from your manifests become unmanaged rather than deleted, why not swap the node's current manifest for one which only re-enables ssh root login? Once you have access you can return to the desired manifest, and use your existing socket to see what gives.

We already have only an exec there (nothing else) that is supposed to copy the sshd config file to an NFS share, which we are 100% certain is mounted. The gist is the output from that.

> This might also be time to consider remote syslog, that way you can see what the node itself is doing.

Yes, as soon as we have access back, that is on the todo list :)

--
Walter Heck
Walter Heck
2012-Apr-24 05:12 UTC
Re: [Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
Hi Sharuzzaman,

On Tue, Apr 24, 2012 at 10:59, Sharuzzaman Ahmat Raslan <sharuzzaman@gmail.com> wrote:
> In my test environment, I managed to make puppet agent stop working when I
> update facter package, without restarting puppet.
>
> Somehow, when facter updated, puppet is not able to continue working, maybe
> because the expected version of facter is different or something.
>
> It was resolved when I restarted puppet agent.

We run puppet off of a cron job, so no restarting is involved.

> On troubleshooting, did you enable reporting in puppet agent config? If you
> do, you might want to see the output of the report in
> /var/lib/puppet/reports/<hostname>
>
> I noticed my previous issue when I go through the report in that folder.

I checked, and it seems the last report in there is from July 18th. It contains a bunch of errors from when the full puppet manifests were enabled, but that should be irrelevant now as we only have a single exec in there. Any idea as to why it's not creating new reports?

cheers,

--
Walter Heck
Sharuzzaman Ahmat Raslan
2012-Apr-24 06:07 UTC
Re: [Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
Hi Walter,

OK, I think that narrows it down to a possible cron issue, since you mentioned that the other server with the same puppet configuration works well.

Reports are generated whenever puppet runs, even if there are errors. When there are no reports, it could mean that puppet is not running, or that whatever should execute puppet (e.g. cron) is not running.

It looks like there is nothing you can do remotely right now, short of physical access to the server.

Maybe others will have a better suggestion.

--
Sharuzzaman Ahmat Raslan
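The point about reports can be checked mechanically by looking at the age of the newest file in the node's report directory. What follows is only a minimal Ruby sketch, not part of any Puppet tooling: `newest_report` is a hypothetical helper name, and it assumes the /var/lib/puppet/reports/<hostname> layout mentioned earlier in the thread.

```ruby
# Hypothetical helper (not a Puppet API): find the most recently modified
# report file in a directory and say how old it is.
# Returns nil when the directory holds no files, else [path, age_in_seconds].
def newest_report(dir, now = Time.now)
  files = Dir.glob(File.join(dir, '*')).select { |f| File.file?(f) }
  return nil if files.empty?

  newest = files.max_by { |f| File.mtime(f) }
  [newest, now - File.mtime(newest)]
end

# Example call, with the directory taken from the thread (substitute the
# real node name for <hostname>):
#   path, age = newest_report('/var/lib/puppet/reports/<hostname>')
#   puts "#{path} is #{(age / 60).round} minutes old" if path
```

If the newest report is much older than the 30-minute run interval, the agent is probably not completing its runs, even when something is still contacting the master on schedule.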
Walter Heck
2012-Apr-24 13:07 UTC
Re: [Puppet Users] Analysing some puppetmaster logs to find out what's happening on an agent
Hiya,

On Tue, Apr 24, 2012 at 14:07, Sharuzzaman Ahmat Raslan <sharuzzaman@gmail.com> wrote:
> Ok, I think that narrows it down to cron issue possibility, as you mentioned
> the other server with the same puppet configurations works well.
>
> Reports only generated if puppet is running, even if there are errors. When
> there are no reports, it could mean that puppet is not running, or things
> that should execute puppet (eg. cron) is not running.

Well, I'd tend to agree with that, but we see this appearing in the logs every 30 minutes, which means that something is contacting the puppet master to ask for x7's catalog. I'd just expect an error message or anything else that indicates what's up here, especially in debug mode.

> It looks like nothing you can do remotely right now, other than physical
> access to the server.

We have someone on-site on Thursday morning. It's not the end of the world; it's more my curiosity than anything else to see what happened here.

> Maybe others will have better suggestion.

That would be more than welcome :)

--
Walter Heck
jcbollinger
2012-Apr-24 13:09 UTC
[Puppet Users] Re: Analysing some puppetmaster logs to find out what's happening on an agent
On Apr 23, 9:43 pm, Walter Heck <walterh...@gmail.com> wrote:
> x7 is the offending server, x6 has exactly the same puppet definition.
> Can anyone tell me why the log for x7 just stops, with no error or
> nothing? What does that indicate is happening on x7? Any help is much
> appreciated :)

In the excerpt you posted, it looks like x7 is getting a cached catalog, whereas x6's catalog needed to be recompiled. The fact that their manifests are the same is not inconsistent with that. Perhaps that's why you don't see more.

Alternatively, most of the log lines pertaining to x7 appear to show it downloading plugins -- maybe you have a hung client, but successive cron-initiated runs are performing plugin sync and fact gathering before the hang stops them.

Since it looks like the client is still (plugin)syncing, however, that may be enough of an opening for you to break the server back open. You could try sending it a custom fact that has whatever clever side effect you like. I'm not certain whether facts are evaluated with privilege, but you should at least be able to collect information and write it to your share.

John
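A diagnostic fact along the lines suggested here could look roughly like the Ruby sketch below. Everything named in it is an assumption made for illustration: the helper `write_diagnostics`, the fact name `agent_diag`, and the directory `/mnt/share/diag` are hypothetical and would need to be replaced with the real share. The recorded uid also answers the privilege question, since it shows which user the fact actually ran as.

```ruby
require 'fileutils'
require 'socket'
require 'time'

# Hypothetical helper: write a small snapshot of agent-side state into a
# directory (for example the NFS share that is known to be mounted).
# Returns the path written, or nil if the write failed.
def write_diagnostics(dir, now = Time.now)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "puppet-diag-#{Socket.gethostname}.txt")
  lines = [
    "time: #{now.utc.iso8601}",
    "uid: #{Process.uid}", # 0 means the fact was evaluated as root
    "ruby: #{RUBY_VERSION}"
  ]
  File.open(path, 'w') { |f| f.puts(lines) }
  path
rescue SystemCallError
  nil # a diagnostic must never break fact loading itself
end

# Only registers the fact when this file is loaded by Facter (e.g. after
# pluginsync). The fact name and the directory are made up for this sketch.
if defined?(Facter)
  Facter.add(:agent_diag) do
    setcode { write_diagnostics('/mnt/share/diag') }
  end
end
```

The side effect happens while facts are being gathered, so it still works when the run hangs later on, which is the opening described above.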
Michael Baydoun
2012-Apr-24 17:18 UTC
Re: [Puppet Users] Re: Analysing some puppetmaster logs to find out what's happening on an agent
Maybe you could use puppet to configure the server to send its logs to a remote syslog. If that works, you could then add your problem module back and get visibility into exactly what the client is doing and any errors it is seeing.

Or make some small modification to the existing log file on the remote server, with filebucketing to your master, and use that to retrieve the puppet log from the remote server.
Walter Heck
2012-Apr-28 14:11 UTC
Re: [Puppet Users] Re: Analysing some puppetmaster logs to find out what's happening on an agent
So, we got access back. It turned out to be a malformed fact causing the problem. The fact in question used a Ruby exec to get the LVM free space and called chomp on the result without checking whether there was a result. Since the x7 server didn't have LVM installed to begin with, that led to problems.

So, just to make sure I understand this correctly: in this case just removing all the puppet code didn't help, since the mere existence of the module with the custom fact in our module path made the fact execute on the agent, right?

cheers,

Walter

--
Walter Heck
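The failure mode described here (calling chomp on a missing result) can be avoided by guarding against nil before touching the string. The original fact was not posted, so the Ruby below is only a sketch under assumptions: `clean_fact_value`, `run_command`, and the fact name `lvm_free_space` are hypothetical, and the `vgs` invocation is a guess at what such a fact might run.

```ruby
# Hypothetical helper: turn raw command output into a fact value.
# Returns nil (the fact is simply absent) when there is no usable output,
# instead of calling a String method on nil.
def clean_fact_value(raw)
  return nil if raw.nil?

  value = raw.to_s.strip
  value.empty? ? nil : value
end

# Hypothetical wrapper: run a shell command and return its stdout, or nil
# when the command fails or is not installed at all.
def run_command(cmd)
  output = `#{cmd} 2>/dev/null`
  $?.success? ? output : nil
rescue SystemCallError
  nil
end

# Only registers the fact when loaded by Facter. The command is an assumed
# example; the thread does not say which one the real fact used.
if defined?(Facter)
  Facter.add(:lvm_free_space) do
    setcode { clean_fact_value(run_command('vgs --noheadings -o vg_free')) }
  end
end
```

On a host without LVM the command fails, `run_command` returns nil, and the fact resolves to nothing rather than raising during fact gathering.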
Dick Davies
2012-Apr-28 18:17 UTC
Re: [Puppet Users] Re: Analysing some puppetmaster logs to find out what's happening on an agent
On 28 April 2012 15:11, Walter Heck <walterheck@gmail.com> wrote:
> So, just to make sure I understand this correctly: in this case just
> removing all the puppet code didn't help, since the mere existence of
> the module with the custom fact in our module path made the fact
> execute on the agent, right?

Yes (in my experience at least): the agent pulls down all facts in all available modules, whether or not they're actually loaded/included on the node.

Removing that fact (or fixing it) on the puppetmaster should push the change down to all nodes at the start of their next run, which will probably help.
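To see which custom facts would be distributed regardless of what a node includes, one can list the fact files sitting in the module path on the master. This is a rough Ruby sketch that assumes the usual <module>/lib/facter/<fact>.rb layout; `fact_files` and `facts_by_module` are hypothetical helper names, and /etc/puppet/modules in the comment is only an assumed module path.

```ruby
# Lists custom fact files in all modules under one modulepath directory.
# Layout assumed: <modulepath>/<module>/lib/facter/<fact>.rb
def fact_files(modulepath)
  Dir.glob(File.join(modulepath, '*', 'lib', 'facter', '*.rb')).sort
end

# Groups the fact file names by the module that ships them.
def facts_by_module(modulepath)
  fact_files(modulepath).each_with_object({}) do |file, acc|
    # Three levels up from the file: facter -> lib -> module directory.
    mod = File.basename(File.expand_path('../../..', file))
    (acc[mod] ||= []) << File.basename(file)
  end
end

# Example call (the path is an assumption, adjust to the real modulepath):
#   facts_by_module('/etc/puppet/modules').each do |mod, facts|
#     puts "#{mod}: #{facts.join(', ')}"
#   end
```

Any file that shows up in this listing is a candidate for running on every agent, so a broken fact in an otherwise unused module is worth fixing or removing.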