Hello there

I have puppet managing a fair few hosts, but because we are still testing (and later for peace of mind) we'd like to hear from hosts that are failing their puppet run.

I had a look at the configuration reference on reductivelabs but I can't see anything that appears relevant. Does anyone know if there is something I can set to notify me, or a fact through $yamldir, that indicates that something is wrong on a particular host? That is, the puppet run did not complete because of an error.

Oh, I'm using puppetd daemonized ... if I was still running it through cron there would be a way.

Something I just thought of was monit, which I could get to look at the log ... but that seems to be a rather long way around...

Any thoughts?
Cheers
chakkerz

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---

You might want to have a look at http://projects.reductivelabs.com/issues/1934 to add exit codes to puppetd --onetime runs at least.

Otherwise you should implement reports and examine those, I guess.

--
Nigel Kersten
nigelk@google.com
System Administrator
Google, Inc.

On Wednesday 26 August 2009 12:52:13 chakkerz wrote:
> I have puppet managing a fair few hosts but because we are still
> testing (and later for peace of mind) we'd like to hear from hosts
> that are failing their puppet run.

I did this to make it work with Nagios:
http://www.kallisti.net.nz/blog/2009/02/monitoring-puppet-with-nagios/

You could probably easily adapt the idea here to do it some other way.

--
Robin <robin@kallisti.net.nz> JabberID: <eythian@jabber.kallisti.net.nz>
http://www.kallisti.net.nz/blog ||| http://identi.ca/eythian

PGP Key 0xA99CEB6D = 5957 6D23 8B16 EFAB FEF8 7175 14D3 6485 A99C EB6D

A few options:

1. If you don't have a lot of clients, use the built-in tagmail report, which sends email with every change.
2. Write your own custom puppet report that does something.
3. Parse the reports and import them into a db or whatever.
4. Wait a bit until the puppet web interface is ready ;)

Ohad

On Aug 25, 2009, at 9:06 PM, Robin Sheat wrote:
> On Wednesday 26 August 2009 12:52:13 chakkerz wrote:
>> I have puppet managing a fair few hosts but because we are still
>> testing (and later for peace of mind) we'd like to hear from hosts
>> that are failing their puppet run.
>
> I did this to make it work with Nagios:
> http://www.kallisti.net.nz/blog/2009/02/monitoring-puppet-with-nagios/

IMHO the nagios method of checking the state.yaml file is the best option to ensure puppet itself is working correctly. Making sure your manifests are applied is a different issue.

-L

--
Larry Ludwig
Reductive Labs

What about checking the last-update times of the /var/lib/puppet.....yaml files on the puppetmaster? I've looked at this manually just to see if a host stopped checking in, but have not automated anything.

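The last-update check suggested here could be automated roughly as follows. This is only a sketch: the function name `stale_nodes` and the 60-minute threshold are made up for illustration, the directory in the example call is borrowed from the listing posted elsewhere in this thread and may differ on your puppetmaster, and `-mmin` is assumed to be available (GNU and BSD find have it, but it is not in POSIX).

```shell
#!/bin/sh
# Sketch: list node YAML files that have not been modified recently.
# The function name and the threshold used below are illustrative assumptions.

# stale_nodes DIR MINUTES
# Prints one path per line for every *.yaml file under DIR whose
# modification time is more than MINUTES minutes in the past.
stale_nodes() {
    dir=$1
    minutes=$2
    # -mmin is a GNU/BSD extension, not POSIX.
    find "$dir" -type f -name '*.yaml' -mmin +"$minutes" -print
}

# Example call (assumed path, adjust to your puppetmaster):
#   stale_nodes /var/lib/puppet/yaml/node 60
```

Any output means at least one host has not checked in within the window, so the result can be mailed from cron or turned into a Nagios check. As discussed elsewhere in this thread, a fresh timestamp does not prove that the run itself succeeded.
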
Yeah, the timestamp is useful, but it only shows a complete failure, that is, one in which the host didn't make contact (I could be wrong). I had been using that, but I discovered this morning that one of my hosts was working, but it would hit an error and fail. The timestamp didn't give it away:

<break host specific file>

    [root@tangelo ~]# ls -ltr /var/lib/puppet/yaml/node/blah.example.org.yaml
    -rw-r----- 1 puppet puppet 3328 Aug 26 13:30 /var/lib/puppet/yaml/node/blah.example.org.yaml
    [root@tangelo ~]# grep timestamp !$
    grep timestamp /var/lib/puppet/yaml/node/blah.example.org.yaml
    :_timestamp: 2009-08-26 13:30:19.096401 +10:00
    [root@tangelo ~]# ls -ltr /var/lib/puppet/yaml/node/blah.example.org.yaml
    -rw-r----- 1 puppet puppet 3328 Aug 26 13:38 /var/lib/puppet/yaml/node/blah.example.org.yaml
    [root@tangelo ~]# grep timestamp /var/lib/puppet/yaml/node/blah.example.org.yaml
    :_timestamp: 2009-08-26 13:38:11.365687 +10:00

<fix file again :) >

The error takes the following form (some variables are not defined in the right scope):

    [root@example ~]# service puppet stop ; puppetd -vt
    Stopping puppet: [ OK ]
    info: Loading fact all_mounted_partitions
    <...snip...>
    err: Could not create snmpd on && /tmp/snmpd_chkconfig.USG: ' snmpd on && /tmp/snmpd_chkconfig.USG' is both unqualifed and specified no search path at /etc/puppet/manifests/nodes/example.node:10
    warning: Not using cache on failed catalog
    warning: Configuration could not be instantiated: ' snmpd on && /tmp/snmpd_chkconfig.USG' is both unqualifed and specified no search path at /etc/puppet/manifests/nodes/example.node:10

I guess exit codes could be handy, but what I'm after is notification. Exit codes would need wrapping and wouldn't work for a daemon, which would invalidate the way puppet is meant to be used.

Should this be a feature request?

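For one-off test runs like the `puppetd -vt` call above, the error lines can be picked out of the verbose output with a small filter. This is a sketch only: the function name `puppet_errors` is made up, it assumes errors always appear on lines starting with `err:` as they do in the output quoted above, and it does nothing for the daemonized case.

```shell
#!/bin/sh
# Sketch: pull error lines out of verbose puppetd output.
# Assumes errors are printed on lines that start with "err:".

# puppet_errors
# Reads puppetd output on stdin and prints the "err:" lines.
# Returns 0 if at least one was found, 1 otherwise (grep's convention).
puppet_errors() {
    grep '^err:'
}

# Hypothetical use; the address is a placeholder and mail(1) is assumed
# to be installed:
#   errors=$(puppetd -vt 2>&1 | puppet_errors) &&
#       printf '%s\n' "$errors" | mail -s "puppet errors on $(hostname)" admin@example.org
```

Because the filter returns success only when something matched, the mail is sent only for runs that actually logged an error.
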
On Aug 25, 2009, at 11:44 PM, chakkerz wrote:
> Yeah, the timestamp is useful but only shows a complete failure, that
> is one in which the host didn't make contact (i could be wrong).

Communication failure AND a manifest that cannot compile on the puppetmaster will cause the yaml file to get stale.

-L

--
Larry Ludwig
Reductive Labs

On Wednesday 26 August 2009 15:44:21 chakkerz wrote:
> Yeah, the timestamp is useful but only shows a complete failure, that
> is one in which the host didn't make contact (i could be wrong).

It has alerted me to faults where the catalogue wouldn't compile due to syntax errors or whatever (it's probably about time I put the syntax checker onto SVN really). However, I suspect it wouldn't notify me of errors where the syntax is fine but an operation can't be performed for some reason.

--
Robin <robin@kallisti.net.nz> JabberID: <eythian@jabber.kallisti.net.nz>
http://www.kallisti.net.nz/blog ||| http://identi.ca/eythian

PGP Key 0xA99CEB6D = 5957 6D23 8B16 EFAB FEF8 7175 14D3 6485 A99C EB6D

Hi,

I'm using nagios to monitor puppet by running the script below on each host. The script looks at the log file; it returns 0 if everything is OK or 1 if there's some error.

Cheers,
-------------------------------------------------------------------------
Gerard Bernabeu
Port d'Informació Científica (PIC)      e_mail: bernabeu@pic.es
Campus UAB - Edificio D                 Tel: +34 93 581 33 22
08193 Bellaterra (Barcelona), Spain     Fax: +34 93 581 41 10
-------------------------------------------------------------------------

<SCRIPT>
cat check_puppet_log.sh
#!/bin/bash
# This sensor checks if puppet is running successfully and logging it

LOGFILE=/var/log/puppet/puppet.log
rc=0
message=""

awk=awk
if [ "`uname`" = "SunOS" ]; then
    awk=/opt/csw/bin/gawk
fi

# First we check if the logfile exists; if not we'll raise an error

if [ ! -f $LOGFILE ]; then
    puppetversion=""
    puppetversion=`puppetd --version`

    if [ "$puppetversion" = "" ]; then
        echo "Puppet is not even installed in this host"
    else
        echo "Puppet $puppetversion log file does not exist $LOGFILE"
    fi
    exit 1
fi

# We check if the last line in the log is newer than 1 hour and is a clean finish
linia=`tail -1 $LOGFILE | $awk '{print $3 " " $4}' | cut -d: -f1`
date=`date +"%a %b %e %T %Z %Y" | $awk '{print $3 " " $4}' | cut -d: -f1`

i=0
for x in $linia; do
    let i++
    y="`echo $date | $awk -v i=$i '{print $i}'`"
    if [ $i -eq 1 ]; then
        # Day number comparison
        let dif="10#$y-10#$x"
        if [ $dif -gt 1 ]; then # If we've more than 1 day difference...
            message="$message WARN: Last log is $dif days old, check if puppet is running."
            rc=1
        fi
    else
        if [ $i -eq 2 ]; then
            # Hour comparison
            if [ $y -gt 2 ]; then # We make sure that it's later than 2 because we don't want to deal with the day change "issue"
                let dif="10#$y-10#$x"
                if [ $dif -gt 2 ]; then # And we've more than 2 hours difference
                    message="$message WARN: Last log is $dif hours old, check if puppet is running."
                    rc=1
                fi
            fi
        fi
    fi
done

# In case a problem has been detected we quit to avoid loading the server....
if [ "$rc" != "0" ]; then
    echo $message
    exit $rc
fi

# We check if the last line of the log is a successful finish; if not we'll check if puppet is running, and since when....

missatgeOK="`tail -1 $LOGFILE | grep "Puppet (notice): Finished catalog run in"`"
if [ $? -eq 1 ]; then # This means that message has not been found and we should look further
    procsdate=`ps -ef | grep puppet | grep -v grep | $awk '{print $5}'`
    for i in $procsdate; do
        if [ "$i" != "`date +"%a %b %e %T %Z %Y" | $awk '{print $2$3}'`" ]; then # This means that the process is old
            message="$message CRIT: there are puppet procs running since $i"
            rc=1
        fi
    done
fi

# We analyze the last run log
tac=tac
if [ "`uname`" = "SunOS" ]; then
    tac=/opt/csw/bin/gtac
fi

finish=0
$tac $LOGFILE | while read line; do
    if [ `echo $line | grep 'Puppet (notice): Finished catalog run in' > /dev/null; echo $?` -eq 0 ]; then
        if [ $finish -eq 0 ]; then # We discard the first finish.
            finish=1
        else
            # We finished processing the last run log
            if [ "$rc" != "0" ]; then
                echo $message
                exit $rc
            else
                echo $missatgeOK
                exit 0
            fi
        fi
    # We'll process each line grepping for possible errors. To add a new error just add a new || condition to the following line:
    elif [ `echo $line | grep '(err)' > /dev/null; echo $?` -eq 0 ] || [ `echo $line | grep 'skipping catalog run' > /dev/null; echo $?` -eq 0 ]; then
        echo "$message ERROR: $line"
        exit 1
    fi
done

</SCRIPT>

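The central idea of the script above, looking only at the most recent run in the log and flagging it if it logged an error, can be condensed into a few lines of awk. This is a simplified sketch, not a replacement for the script: it skips the age and stale-process checks, the function name `last_run_errors` is made up, and it assumes the same log markers the script greps for (`Finished catalog run`, `(err)`, `skipping catalog run`).

```shell
#!/bin/sh
# Sketch: report errors logged by the most recent puppet run only.
# Assumes a run ends with a line containing "Finished catalog run" and
# that errors contain "(err)" or "skipping catalog run".

# last_run_errors LOGFILE
# Prints the error lines of the most recent run in LOGFILE.
# Returns 0 when that run logged no errors, 1 when it did.
last_run_errors() {
    errors=$(awk '
        # A finish marker closes the current run: remember its errors.
        /Finished catalog run/ { last = cur; cur = ""; unfinished = 0; next }
        # Any other line means a run is in progress (or died mid-way).
        { unfinished = 1 }
        /\(err\)|skipping catalog run/ { cur = cur $0 "\n" }
        # Prefer the unfinished run if there is one, else the last finished run.
        END { printf "%s", (unfinished ? cur : last) }
    ' "$1")
    [ -z "$errors" ] && return 0
    printf '%s\n' "$errors"
    return 1
}

# Example call, using the log path from the script above:
#   last_run_errors /var/log/puppet/puppet.log || echo "last puppet run had errors"
```

Reading the file forwards and keeping only the current run's errors avoids the need for tac and for spawning a grep per line, which matters on large logs.
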
Very nice! I think we can make that work. Thanks for that!!!

chakkerz