Al @ Lab42
2010-Oct-19 15:50 UTC
[Puppet Users] Automating infrastructure tests on Puppet nodes after a puppetrun
Hi List,

I would like to discuss, with whoever is interested, a topic that I suppose is of general interest.

I want to implement some kind of automatic testing of the status of a node after a Puppet run.
These tests involve trivial and less trivial things like:
- A local service is running
- A local port is open
- A remote server on a remote port is reachable by the node
- A URL replies with the expected content
- Some specific function needed by the node and provided by a remote host is working (e.g. LDAP access for user authentication, NTP sync...)
- Any other check that asserts that the node is working correctly

I want to do this directly in my modules, at least for the checks that are directly related to the resources provided by the module, and build some defines to quickly manage things like "check the url" or "check if the remote port is accessible".

The point is to have a solid testing infrastructure, early notification of any problem that might arise after a Puppet run and, at the same time, a sort of monitoring logic that might also be used by other tools, like Nagios.

To achieve something like this there are different approaches, and I would like to follow the one that seems most sane and, above all, best fits the evolution of the Puppet ecosystem.

Here are a couple of examples:

- APPROACH 1 - CHECK TRIGGERED BY PUPPET NODE
After the Puppet run a script/command is launched and performs the necessary checks (built on the node dynamically, according to the modules installed).
If I'm not wrong, in recent Puppet versions there's a hook that lets you run custom commands after (or before? or both?) the Puppet run, so this might be the way to automate the start of the checks without too much hassle.
The cons are that everything is done on the node and, unless specifically implemented, there is no centralized management of check runs, process logic, notifications and history.

- APPROACH 2 - CHECK RUN BY AN MCOLLECTIVE CLIENT ON THE PUPPET NODE
This somehow intrigues me, and requires the node to have an mcollective server daemon running.
The automation might be triggered remotely by the mcollective client, using mcollective agents available on the Puppet node.
The mcollective client should be notified of the Puppet run and might not be the PuppetMaster itself; one way to do this might be a custom report extension that reports directly to the mcollective client.
The benefit is that the monitoring can be managed via mcollective, and there is a central point where data is collected and commands are executed.
The list of checks to be run on the client should, IMHO, remain on the Puppet client (mcollective server) itself (no need to have store configs for this), and maybe a specific agent could be written to retrieve and run, from the mcollective client, the list of checks to perform.

Another point is how to organize and define the list of checks.
Cucumber seems a nice and somehow "standard" way to define the check logic, but it could also be a plain execution of the different checks from a sort of wrapper script.
The single checks could be nrpe commands and/or mcollective agents (I love the nettest one, incidentally).

AFAIK there's nothing in the above examples that is particularly difficult or can't be done with existing tools, but I would like to introduce them seamlessly in my modules (using my monitoring abstraction classes).

So, I wonder if someone is already doing similar checks, what approach they are following, and how Puppet might evolve with regard to these topics.

Any further or related idea is welcome,
Alessandro Franceschi

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
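To make the kinds of checks listed in the opening message concrete, here is a minimal sketch in Python of two of them (a TCP port is reachable, a URL replies with expected content) plus a small runner. The function names and the idea of a name-to-callable table are assumptions made for illustration; nothing here comes from an existing Puppet module.

```python
import http.client
import socket
import urllib.request


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def url_contains(url, expected, timeout=5.0):
    """Return True if the URL answers and its body contains `expected`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # URLError/HTTPError are OSError subclasses; ValueError covers
        # malformed URLs.
        return False
    return expected in body


def run_checks(checks):
    """Run a mapping of check name -> zero-argument callable.

    Returns a mapping of check name -> bool (True means the check passed).
    """
    return {name: bool(check()) for name, check in checks.items()}
```

A module-specific table could then look like `{"ssh port": lambda: port_is_open("localhost", 22)}`; the same functions work for local and remote ports, since only the host argument changes.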
Nicolas Szalay
2010-Oct-19 20:52 UTC
Re: [Puppet Users] Automating infrastructure tests on Puppet nodes after a puppetrun
----- "Al @ Lab42" <lab42.it@gmail.com> wrote:

| Hi List,

Hi,

| The point is to have a solid testing infrastructure, early
| notification of any problem that might take place after a Puppet run
| and, at the same time have a sort of monitoring logic that might be
| used also by other tools, like Nagios.

Do you know about puppet-cucumber?

| - APPROACH 1 - CHECK TRIGGERED BY PUPPET NODE

This is an easy approach, but how will the information be pushed back to you? I have not checked, but I don't think the results of post-run hooks are included in reports.

| - APPROACH 2 - CHECK RUN BY AN MCOLLECTIVE CLIENT ON THE PUPPET NODE

I would use that one, probably combined with Nagios through the mc nrpe agent, or with something like a Hudson instance to run a permanent check on this.

| So, I wonder if someone is already doing similar checks, what's the
| approach they are following and what might be the evolution of Puppet
| under regarding these topics.

Not doing it but definitely interested.

Nico.
R.I.Pienaar
2010-Oct-19 21:06 UTC
Re: [Puppet Users] Automating infrastructure tests on Puppet nodes after a puppetrun
----- "Al @ Lab42" <lab42.it@gmail.com> wrote:

> These tests involve trivial and less trivial things like:
> - A local service is running
> - A local port is open
> - A remote server on a remote port is reachable by the node
> - An URL replies with an expected content
> - Some specific function needed by the node and provided by a remote
> host is working (ie: ldap acces for users authentication, ntp
> sync...)
> - Whatever other check that asserts that the node is correctly
> working

Sounds like things you want to monitor anyway, on an ongoing basis?

So, assuming you have monitoring for all of this, is the problem that you want visibility of the state right now, after a run, rather than when Nagios gets round to doing its next checks, which might be many minutes later?

I favor nrpe, because I can deploy my check logic with Puppet, but I really think you want your monitoring to cover all of this.

To answer the 'now' part of it, I'd just notify my Nagios box via mcollective to do a check of all services on the node after the Puppet run.
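One way the "check all services on the node now" request could be expressed on the Nagios side is through the external command file, using the `SCHEDULE_FORCED_HOST_SVC_CHECKS` external command. The sketch below only builds the command line; how it reaches the Nagios box (mcollective or otherwise) is left open, and the command file path mentioned afterwards is an assumption that varies per installation.

```python
def format_recheck_command(host_name, now, forced=True):
    """Build a Nagios external command that reschedules every service
    check of `host_name` at time `now` (seconds since the epoch).

    Format: [timestamp] COMMAND;host_name;check_time
    """
    command = (
        "SCHEDULE_FORCED_HOST_SVC_CHECKS" if forced
        else "SCHEDULE_HOST_SVC_CHECKS"
    )
    timestamp = int(now)
    return "[{0}] {1};{2};{3}\n".format(timestamp, command, host_name, timestamp)
```

On the Nagios server the resulting line would be written to the external command file (for example `/var/lib/nagios3/rw/nagios.cmd` on some Debian-based setups; check `command_file` in `nagios.cfg`), with `now` taken from `time.time()`.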
Al @ Lab42
2010-Oct-20 06:21 UTC
[Puppet Users] Re: Automating infrastructure tests on Puppet nodes after a puppetrun
On Oct 19, 10:52 pm, Nicolas Szalay <nsza...@qualigaz.com> wrote:

> Do you know about puppet-cucumber ?

Yes, but as far as I've understood, puppet-cucumber runs on the Puppet Master and checks resources managed by Puppet.
I'd also like to make checks that might not be directly related to Puppet resources (but might be broken by a wrong config pushed via Puppet).

> | - APPROACH 1 - CHECK TRIGGERED BY PUPPET NODE
>
> This is an easy approach but how will you push information back to you ? I have not checked but I don't think that the result of post run hooks are included into reports

Indeed, and that's one reason why I don't prefer this approach: you would have to build your own reporting.

> | - APPROACH 2 - CHECK RUN BY AN MCOLLECTIVE CLIENT ON THE PUPPET NODE
>
> I would use that one, combined with nagios through the mc nrpe agent probably or something like a hudson instance to do a permanent check about this.

+1

> Not doing it but definitely interested.

I'll let you know if I come up with something interesting :-)

Al
Al @ Lab42
2010-Oct-20 06:49 UTC
[Puppet Users] Re: Automating infrastructure tests on Puppet nodes after a puppetrun
On Oct 19, 11:06 pm, "R.I.Pienaar" <r...@devco.net> wrote:

> sounds like things you want to monitor anyway in an ongoing manner?

Generally yes.

> So assuming you have monitoring for all of this, is the problem that you
> want visibility of the state right now after a run and not when nagios
> gets round to doing its next checks which might be many minutes?

Yes, but I also want a direct correlation between a Puppet run and any resulting failure.

> I favor nrpe - cos I can deploy my check logic with puppet - but I really
> think you want your monitoring to cover all of this.
>
> To answer the 'now' part of it, I'd just notify via mcollective my nagios
> box to do a check for all services on the node post puppet run.

That could be an option, but it wouldn't directly correlate the check's failure with a Puppet run.
I think I would prefer to use the existing checks (so nrpe is perfect) but also be able to run them outside Nagios.

BTW, an implementation question. How would you suggest triggering an action on the mcollective client from the PuppetMaster after a Puppet run on one of its clients?
I suppose that a custom report is the most logical approach, but what's the sanest way to actually deliver it? Having a service listening on an mcollective client node and sending reports there? Using stomp messaging? How?

Al
R.I.Pienaar
2010-Oct-20 06:53 UTC
Re: [Puppet Users] Re: Automating infrastructure tests on Puppet nodes after a puppetrun
----- "Al @ Lab42" <lab42.it@gmail.com> wrote:

> BTW, an implementation question. How do you suggest to manage the
> triggering of an action on the mcollective client from the
> PupetMaster, after a Puppet run on one of its clients?
> I suppose that using a custom report is the most logic approach, but
> what's the sanest way to actually deliver it? Having a service
> listening on a mcollective client node and send reports there? Using
> stomp messaging? How?

I'd run it in the postrun_command script on each node. Otherwise I guess a report isn't too bad, but reports kind of only work when a whole lot of other stuff is working as well at the same time.

--
R.I.Pienaar
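As a rough illustration of what a script behind `postrun_command` might do, the sketch below runs a table of NRPE-style check commands and aggregates their exit codes using the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and the check table are hypothetical, and taking the numerically highest code as the overall result is a simplification chosen for this sketch.

```python
import subprocess

# Nagios plugin exit-code convention.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(name, argv, timeout=30):
    """Run one check command; return (name, exit_code, output)."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
        code, output = proc.returncode, proc.stdout.strip()
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing plugin, permission problem or timeout: report UNKNOWN.
        code, output = 3, str(exc)
    if code not in STATES:
        code = 3
    return name, code, output


def run_all(checks):
    """Run every check in a name -> argv mapping.

    Returns (worst_code, results); worst_code is the highest exit code
    seen (a simplification), or 0 if there are no checks.
    """
    results = [run_check(name, argv) for name, argv in checks.items()]
    worst = max((code for _, code, _ in results), default=0)
    return worst, results
```

A wrapper script could end with `sys.exit(run_all(CHECKS)[0])`, where `CHECKS` maps names to the same plugin command lines already configured for nrpe, and be referenced from the `postrun_command` setting in `puppet.conf`.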
Matt Wallace
2010-Oct-20 08:27 UTC
Re: [Puppet Users] Automating infrastructure tests on Puppet nodes after a puppetrun
On Tuesday 19 Oct 2010 21:52:37 Nicolas Szalay wrote:

> ----- "Al @ Lab42" <lab42.it@gmail.com> wrote:
> | I want to implement some kind of automatic testing on the status of a
> | node after a Puppet Run.

OK, so we do this slightly differently; however, it might help...

1) All our manifests are stored in Git
2) A Git update forces Hudson to run a build
3) The build process performs the following steps:
   * Check out the latest version of the manifest into the staging server's puppet-module-path
   * Start a virtual server of the defined type using cucumber-vhost[0]
   * Use puppet to deploy the latest staging versions of the manifests to the virtual server
   * Run cucumber tests (using webrat for web services and SMTP/IMAP libraries to test sending/delivery of email) against the service/facility that is contained in the manifests we are testing
   * Report back on the results of those tests
   * Destroy the virtual server

This means that all of our manifests are fully tested before they go near our production system, and we can be confident (although obviously this is only as good as the tests that we write!) that when we merge from staging into master the changes that are rolled out will work correctly.

I've not gone down the cucumber-puppet route, as I'm not 100% sure how it works and how to write stories correctly, so if anyone can point me at a good resource on this, I'd be very appreciative!

Hope that helps,

Matt
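The build steps described above follow a common pattern: run the steps in order, stop at the first failure, and destroy the virtual server whatever happens. The following is only a sketch of that control flow; the step names and callables are placeholders, not the actual Hudson job or cucumber-vhost API.

```python
def run_build(steps, teardown):
    """Run (name, callable) steps in order until one returns a falsy value.

    `teardown` is always called, even if a step fails or raises.
    Returns (success, completed_step_names, failed_step_name_or_None).
    """
    completed = []
    try:
        for name, action in steps:
            if not action():
                return False, completed, name
            completed.append(name)
        return True, completed, None
    finally:
        # Corresponds to "Destroy the virtual server" in the list above.
        teardown()
```

In this sketch the teardown also runs if the virtual server never started, so a real teardown would need to tolerate that case.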
Nicolas Szalay
2010-Oct-20 08:35 UTC
Re: [Puppet Users] Re: Automating infrastructure tests on Puppet nodes after a puppetrun
On Tuesday 19 October 2010 at 23:49 -0700, Al @ Lab42 wrote:

> Generally yes.

IMHO, monitoring needs a "refresh" to cope with the "new way" servers are operated & built. This is a larger topic than this single thread :)

> > So assuming you have monitoring for all of this, is the problem that you
> > want visibility of the state right now after a run and not when nagios
> > gets round to doing its next checks which might be many minutes?
>
> Yes, but also I want direct correlation between a puppet run and an
> eventual failure.

Wouldn't this kind of "instant" monitoring be too overwhelming? I mean: if you have 500 hosts, each running Puppet every 30 minutes, a "central service server" would get checked every 3.6 s on average.

How about log correlation? It's not perfect, but it can be an acceptable intermediate solution (damn Splunk and its crazy pricing).

Nico.
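The 3.6 s figure follows from dividing the run interval by the number of hosts, assuming the runs are spread evenly over the interval and every run triggers one check of the shared service. A small worked version, with hypothetical function names:

```python
def seconds_between_checks(hosts, run_interval_minutes):
    """Average gap between checks hitting one shared service when every
    host triggers a check once per Puppet run."""
    return run_interval_minutes * 60 / hosts


def checks_per_hour(hosts, run_interval_minutes):
    """Number of checks the shared service receives per hour."""
    return hosts * 60 / run_interval_minutes
```

For 500 hosts and a 30-minute interval this gives 1800 / 500 = 3.6 seconds between checks, or 1000 checks per hour against the central service.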
Nikolay Sturm
2010-Oct-20 09:28 UTC
Re: [Puppet Users] Automating infrastructure tests on Puppet nodes after a puppetrun
* Matt Wallace [2010-10-20]:
> I've not gone down the cucumber-puppet route as I'm not 100% sure how
> it works and how to write stories correctly so if anyone can point me
> at a good resource on this, I'd be very appreciative!

I have put up some documentation at

http://projects.puppetlabs.com/projects/cucumber-puppet/wiki

If that doesn't get you started, feel free to ask here or email me directly. I would be glad to update the documentation in case anything is unclear.

cheers,

Nikolay

--
"It's all part of my Can't-Do approach to life." Wally
Trevor Vaughan
2010-Oct-20 10:55 UTC
Re: [Puppet Users] Re: Automating infrastructure tests on Puppet nodes after a puppetrun
All,

I would suggest taking a look at OpenSCAP and the SCAP initiative led by NIST.

It is an Open Standard and, to me, the concepts act as the validation side of Puppet enforcement.

http://www.open-scap.org/page/Main_Page
http://scap.nist.gov/revision/index.html
http://oval.mitre.org/

Thanks,

Trevor

--
Trevor Vaughan
Vice President, Onyx Point, Inc.
email: tvaughan@onyxpoint.com
phone: 410-541-ONYX (6699)
pgp: 0x6C701E94

-- This account not approved for unencrypted sensitive information --
Felix Frank
2010-Oct-21 09:22 UTC
Re: [Puppet Users] Re: Automating infrastructure tests on Puppet nodes after a puppetrun
>> To answer the 'now' part of it, I'd just notify via mcollective my nagios
>> box to do a check for all services on the node post puppet run.
>
> That could be an option but it wouldn't directly correlate the check's
> failure with a Puppet run.
> I think I would prefer to use the existing checks (so nrpe is perfect)
> but be able run them also outside Nagios.

Hi,

you could use NSCA (I think that is the acronym; it is the mechanism for submitting passive check results, as used in distributed Nagios setups) instead of NRPE.

The Puppet client can then push passive check results for all services to the Nagios server after a Puppet run.

Regards,
Felix
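With NSCA, the node submits results itself by piping lines to the `send_nsca` client, whose default input format is tab-separated: host name, service description, return code, plugin output. The helper below only formats such a line; its name is made up for this sketch, and the host and service names must match what is defined on the Nagios server.

```python
def nsca_line(host, service, code, output):
    """Format one passive service check result for send_nsca.

    Default format: host<TAB>service<TAB>return_code<TAB>output<NEWLINE>
    """
    if code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0, 1, 2 or 3")
    # Tabs and newlines would break the field layout, so flatten them.
    clean = output.replace("\t", " ").replace("\n", " ")
    return "\t".join([host, service, str(code), clean]) + "\n"
```

The lines for all checks run after the Puppet run could be concatenated and piped to something like `send_nsca -H <nagios server> -c <send_nsca config file>`, which also gives the direct link between a Puppet run and a failing check that was asked for earlier in the thread.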