Puppet runs every hour or so on our production servers and makes sure they stick to the manifest. I'm curious to know if this is advised for production.

In theory, if something breaks in puppet for whatever reason, all of our production servers may be hurt simultaneously.

berber

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
berber <webersites@gmail.com> writes:

> Puppet runs every hour or so on our production servers and makes sure
> they stick to the manifest. I'm curious to know if this is advised for
> production.

> In theory, if something breaks in puppet for whatever reason, all of
> our production servers may be hurt simultaneously.

We do it as well. You have to be fairly paranoid about what you change in the Puppet manifests, but we like the constant consistency check that all is as it should be. On our Kerberos KDCs, though, we run Puppet in --noop mode and require manual intervention to apply changes.

We've only gotten bitten by this once, when there was a bug in Puppet that occasionally caused it to overwrite managed files with their own checksums. That bug was fixed a long time ago, but it was pretty bad while it was present. But you'll notice it wasn't enough to keep us from continuing the policy of running Puppet in production.

--
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
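A minimal sketch of the "noop on the sensitive hosts" policy Russ describes, assuming a wrapper script decides how to invoke the agent. The role names, the `approved` flag, and the `agent_command` helper are hypothetical; only `--onetime`, `--no-daemonize`, and `--noop` are real `puppetd` options.

```python
# Hypothetical set of roles that must never be changed unattended.
# The thread only mentions Kerberos KDCs; "kdc" is an assumed role name.
NOOP_ROLES = {"kdc"}


def agent_command(role, approved=False):
    """Build a one-shot puppetd command line for a host with the given role.

    Hosts in NOOP_ROLES only report what would change (--noop) unless a
    human has explicitly approved the run.
    """
    cmd = ["puppetd", "--onetime", "--no-daemonize"]
    if role in NOOP_ROLES and not approved:
        cmd.append("--noop")
    return cmd


if __name__ == "__main__":
    print(" ".join(agent_command("web")))
    print(" ".join(agent_command("kdc")))
    print(" ".join(agent_command("kdc", approved=True)))
```

The point of the design is that the dangerous default (applying changes) has to be opted into per run on the sensitive hosts, while everything else keeps its hourly consistency check.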
berber wrote:
> Puppet runs every hour or so on our production servers and makes sure
> they stick to the manifest. I'm curious to know if this is advised for
> production.
>
> In theory, if something breaks in puppet for whatever reason, all of
> our production servers may be hurt simultaneously.
>

1) Don't disable using the cached catalog in case of failure
2) Treat your Puppet deployments like software deployments. STAGE THEM!!

If you stage manifest releases you know exactly what they will do in production.

-scott
James Turnbull
2009-Dec-29 07:52 UTC
Re: [Puppet Users] continues puppet run in production
Scott Smith wrote:
> berber wrote:
>> Puppet runs every hour or so on our production servers and makes sure
>> they stick to the manifest. I'm curious to know if this is advised for
>> production.
>>
>> In theory, if something breaks in puppet for whatever reason, all of
>> our production servers may be hurt simultaneously.
>>
>
> 1) Don't disable using the cached catalog in case of failure
> 2) Treat your Puppet deployments like software deployments. STAGE THEM!!

Puppet makes managing configuration easier and more automated - it doesn't preclude change control, testing and work flow. :) In fact using Puppet without these can sometimes be downright dangerous.

Regards

James Turnbull

--
Author of:
* Pro Linux System Administration (http://tinyurl.com/linuxadmin)
* Pulling Strings with Puppet (http://tinyurl.com/pupbook)
* Pro Nagios 2.0 (http://tinyurl.com/pronagios)
* Hardening Linux (http://tinyurl.com/hardeninglinux)
If you look at the reply from Russ Allbery you will notice he wrote: "We've only gotten bitten by this once, when there was a bug in Puppet that occasionally caused it to overwrite managed files with their own checksums."

I'm thinking to myself that bugs will always happen, and that this particular kind of bug would not necessarily show up in a staging environment, as it only "occasionally" happens.

Now consider a company running hundreds of production servers with puppet running continuously every hour, and overnight random servers start to fail. By the time someone understands that puppet is to blame and stops it (one may think there is an attack), more servers may fail. At this point you may have 10, 20, 100 servers down and no puppet to fix them, as the current version has a bug that randomly ("occasionally") kills files.

Why would anyone want to put himself in this situation instead of running puppet on a need-to-deploy basis?

I was looking for "cached catalog" but could not find a reference to it in the documentation. Can you point me there?

Thanks
R.I.Pienaar
2009-Dec-29 18:53 UTC
Re: [Puppet Users] Re: continues puppet run in production
hello,

----- "berber" <webersites@gmail.com> wrote:

> Now consider a company running hundreds of production servers with
> puppet running continuously every hour and over the night random
> servers start to fail. By the time someone understands that puppet is
> to blame and stops it (one may think there is an attack), more
> servers may fail. At this point you may have 10,20,100 servers down and no
> puppet to fix them as the current version has a bug that randomly
> ("occasionally") kills files.

You could say the same about most parts of the software stack; edge cases are very hard to find and define.

In the end, though, puppet has the flexibility to work in both modes, and all you have to do is find a balance that meets your needs. Not everyone has the same availability or real-time needs; many only use puppet at deploy time.

> Why would anyone want to put himself in this situation instead of
> running puppet on a need to deploy basis?

I run it full time, but I have the ability to enable/disable it or run subsets of my architectures during at-risk periods. For some clients I disable puppet at night and only run it when people are around, for example.

http://www.devco.net/archives/2009/11/30/managing_puppetd_with_mcollective.php

> I was looking for "cached catalog" but could not find a reference to
> it in the documentation, can you point me there?

Search in http://reductivelabs.com/trac/puppet/wiki/ConfigurationReference for usecacheonfailure

> > 1) Don't disable using the cached catalog in case of failure

I'm curious why you recommend using this as a feature to enhance reliability? It does the opposite.

I'd argue against usecacheonfailure=true as a setting:

It's true that it would use a cache of the old compiled catalog if there's a mistake in the new catalog. BUT it will still fetch files sourced with puppet:/// from the server, it will still apply new packages set with ensure => latest, and so forth. What it will not do is rebuild templates.

So you could quite easily get into a situation where you've updated a package but are applying config from old templates. Or you've updated one config file in a service but not another, since it comes from a template, leading to weird, unpredictable sets of circumstances, made worse if you notify services etc.

--
R.I.Pienaar
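The "only run it when people are around" policy mentioned above could be sketched roughly as follows. This is an illustration under assumptions, not the mcollective setup from the linked post: the staffed hours (weekdays 09:00 to 18:00) and the helper names are hypothetical, and the script only picks between `puppetd --enable` and `puppetd --disable` rather than running anything.

```python
from datetime import datetime, time


def admins_around(now, start=time(9, 0), end=time(18, 0)):
    """Return True on weekdays between start and end.

    The staffed window is an assumed example; adjust to the hours when
    someone can actually react to a bad run.
    """
    return now.weekday() < 5 and start <= now.time() < end


def lock_action(now):
    """Pick the puppetd option a scheduler (e.g. cron) would pass."""
    return "--enable" if admins_around(now) else "--disable"


if __name__ == "__main__":
    # A scheduler could run `puppetd <option>` with the printed option.
    print("puppetd", lock_action(datetime.now()))
```

Keeping the decision in a pure function of the current time makes the policy easy to review and test separately from whatever actually toggles the agent.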
Silviu Paragina
2009-Dec-29 22:36 UTC
Re: [Puppet Users] Re: continues puppet run in production
berber wrote:
> If you look at the reply from Russ Allbery you will notice he wrote "
> We've only gotten bitten by this once, when there was a bug in Puppet
> that occasionally caused it to overwrite managed files with their own
> checksums."
>
> I'm thinking to myself that bugs will always happen and that this
> particular kind of bug would not necessarily show on a staging
> environment as it only "occasionally" happens.
>
> Now consider a company running hundreds of production servers with
> puppet running continuously every hour and over the night random
> servers start to fail. By the time someone understands that puppet is
> to blame and stops it (one may think there is an attack), more servers
> may fail. At this point you may have 10,20,100 servers down and no
> puppet to fix them as the current version has a bug that randomly
> ("occasionally") kills files.
>
> Why would anyone want to put himself in this situation instead of
> running puppet on a need to deploy basis?
>

The first thing you must understand is that this is a really dangerous piece of software, just like any other similar software (configuration/settings/policy enforcer).

I read a course about SMS (the equivalent of puppet from M$ for windowze only, I think they renamed it) and the course started with something like: with administrator permissions one could break a computer, with SMS permissions one can break all the computers in the organization :-)

Why would you keep this software always running, you ask? Simply put, because you get tired of making the same changes to computers every day. Because you sometimes change a setting for "10 minutes", somebody calls you, and you forget about the change. Configuration drift, and the list can go on.

Yes, it's very dangerous, but very productive also :-)

Silviu
This doesn't sound very professional. You would risk your production environment because it's uncomfortable for the sys admin to remember he needs to roll back a change in 10 minutes or 10 hours?

Assume we are talking about a business that loses tens of thousands of $$$ for any small downtime. Does this change the picture?

Just because it's hard to follow procedure and deploy changes to production in an orderly manner, some system admins just get puppet to run every hour, and then they can forget that they made a temporary change to a system.

I'm starting to wonder, put bluntly so don’t get mad, if “Lazy” system admins run puppet continuously in production, while putting their systems in harm's way due to a possible bug in puppet, corruption of the source, accidental changes to the manifest, etc… just so they don’t have to follow tiring procedures or keep track of manual changes to the servers (damn that was long).

Is this the case or am I missing out on the big picture? Since when does “being productive” come before production integrity?
james@lovedthanlost.net
2009-Dec-30 08:41 UTC
Re: [Puppet Users] Re: continues puppet run in production
2009/12/30 berber <webersites@gmail.com>:
> I'm starting to wonder, put bluntly so don’t get mad, if “Lazy” system
> admins run puppet continuously in production, while putting their
> systems in harm way due to a possible bug in puppet, corruption of the
> source, accidental changes to the manifest, etc… just so they don’t
> have to follow tiring procedures or keep track of manual changes to
> the servers (damn that was long).

That's a highly subjective view. The decision to run Puppet this way is a risk equation (it's actually two risks - you've conflated them above). The risks go something like this:

* There is the risk of a bug in Puppet that could impact my production availability
* There is a risk that poor controls will result in incorrect configuration being applied and impact my production availability

These risks exist with pretty much every sysadmin tool that has similar powers - even just having root on the box - hence the sudo warning:

    We trust you have received the usual lecture from the local System
    Administrator. It usually boils down to these three things:

        #1) Respect the privacy of others.
        #2) Think before you type.
        #3) With great power comes great responsibility.

We then determine whether the likelihood/consequence of the risk of running Puppet in a particular mode outweighs the benefits. If in your environment it does, then you shouldn't do it. And that's the first risk...

Alternatively, if you have weighed up this risk and said "Sure, I'll run it continuously", then you have to consider the mitigating controls that reduce the likelihood/consequences of any faults. Such controls include staging changes, version-controlling manifests, work flow, testing changes, --noop mode, change control, segregation of duties, etc, etc, etc. If you can reduce the level of risk to whatever your appetite is, then you've addressed the second risk.

That's professional, rational, and working within your organisation's risk appetite. Seems perfectly reasonable to me.

Regards

James Turnbull
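The likelihood/consequence reasoning above can be made concrete with a toy model. This is only a sketch under stated assumptions, not a method from the thread: it assumes risk is likelihood times consequence, that each mitigating control independently removes a fraction of the likelihood, and that "appetite" is a single threshold. All function names are hypothetical.

```python
def residual_risk(likelihood, consequence, control_reductions=()):
    """Toy residual-risk estimate.

    likelihood: probability of a fault per period, between 0 and 1.
    consequence: cost of the fault, in any consistent unit.
    control_reductions: fractions (0..1) by which each control, such as
        staging or --noop review, is assumed to cut the likelihood.
    """
    for reduction in control_reductions:
        likelihood *= 1.0 - reduction
    return likelihood * consequence


def within_appetite(risk, appetite):
    """True if the estimated risk is no larger than what the org accepts."""
    return risk <= appetite
```

With made-up inputs, `residual_risk(0.1, 1000, (0.5, 0.5))` comes out at about 25, against about 100 with no controls; whether either figure is acceptable depends entirely on the appetite you pass to `within_appetite`.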
Julian Simpson
2009-Dec-30 09:35 UTC
Re: [Puppet Users] Re: continues puppet run in production
> Is this the case or am I missing out on the big picture? Since when
> does “being productive” come before production integrity?

I've seen "production integrity" used as an excuse for not delivering new services, costing the business $$$. There has to be a balance...
Scott Smith
2009-Dec-30 10:19 UTC
Re: [Puppet Users] Re: continues puppet run in production
berber wrote:
> This doesn't sound very professional. You would risk your production
> environment because it's uncomfortable for the sys admin to remember
> he needs to roll back a change in 10 minutes or 10 hours?
>
> Assume we are talking about a business that loses tens of thousands
> of $$$ for any small downtime. Does this change the picture?
>

How much money do you think one-off config file edit mistakes cost businesses every year? How much money do you think misconfigured applications or hosts cost businesses every year?

You seem to be misunderstanding the notion of controlled change. Change is good. Preventing change actually causes more harm - it costs money in missed business opportunity, wasted employee resources, etc. The key part that you need to recognize is that it should be *controlled* change.

With Puppet, this happens via deploying your manifests in the same way you deploy any other piece of software. You test it. You stage it. You roll it out in an iterative fashion to prod. You monitor it.

-scott
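The "stage it, roll it out iteratively, monitor it" loop can be sketched as below. It is a minimal illustration, not Puppet functionality: `apply` and `healthy` are hypothetical callbacks standing in for whatever pushes a manifest release to a host and whatever monitoring check you trust, and the host names in the example are made up.

```python
def rollout(waves, apply, healthy):
    """Apply a release wave by wave, halting after the first unhealthy wave.

    waves: list of lists of host names, e.g. staging first, then
        progressively larger slices of production.
    apply: callable(host) that deploys the release to one host.
    healthy: callable(host) -> bool, the monitoring check.
    Returns the hosts that received the release, in order.
    """
    done = []
    for wave in waves:
        for host in wave:
            apply(host)
            done.append(host)
        # Monitor the whole wave before touching the next one.
        if not all(healthy(host) for host in wave):
            break
    return done


if __name__ == "__main__":
    waves = [["stage1"], ["prod1", "prod2"], ["prod3", "prod4"]]
    touched = rollout(waves, apply=lambda h: None, healthy=lambda h: h != "prod2")
    print(touched)
```

In the example run the second wave fails its health check, so the last wave is never touched; the blast radius of a bad release is limited to the waves already deployed.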
Silviu Paragina
2009-Dec-30 13:49 UTC
Re: [Puppet Users] Re: continues puppet run in production
Bugs happen in any service. This is the classic dilemma of unplugging your computer from the network vs. connectivity (usability in this case). From my perspective, high-risk bugs happen rarely enough that I'm comfortable using puppet, like any other service I use.

IMHO misconfiguration has a bigger chance of happening when you configure each server by hand, as opposed to changing and testing and retesting a manifest. I personally have a higher chance of making mistakes in a copy/paste scenario than in a writing-a-manifest scenario.

In a normal puppet workflow any configuration change may be reviewed by other sysadmins. Personally I see more risk in not being able to review changes made by other admins (we are all human and make mistakes, after all), and in not knowing who made a change and why.

berber wrote:
> This doesn't sound very professional. You would risk your production
> environment because it's uncomfortable for the sys admin to remember
> he needs to roll back a change in 10 minutes or 10 hours?
>
> Assume we are talking about a business that loses tens of thousands
> of $$$ for any small downtime. Does this change the picture?
>

Not at all. You lose more $$$ paying for 10 times more admins. With puppet you are able to audit the security of the network and know that it holds on all the computers. In the classic (manual) case you would say that it should be done a certain way, but it probably won't be done that way, which might lead to a lot more $$$ lost.

If your service is that important, you should also have an emergency night admin, one who could resolve the problems if/when they pop up. Or an on-call admin who receives an SMS or something if a service is down.

> Just because it's hard to follow procedure and deploy changes to
> production in an orderly manner, some system admins just get puppet to
> run every hour and then they can forget that they made a temporary
> change to a system.
>

Don't get me wrong. Local changes should be made only in emergency cases. But without puppet you would make every change via a local login, and as in any workplace distractions occur, whether it's the boss or whatever. Changes should be made via puppet, and with the development -> testing -> production workflow, with the sole exception of critical changes, which should have a tighter release schedule. But a reinstall workflow is kind of slow; maybe I don't understand your scenario.

As an alternative, you could allow puppet to run only when admins are around, if you are that scared of puppet running through the night.

> I'm starting to wonder, put bluntly so don’t get mad, if “Lazy” system
> admins run puppet continuously in production, while putting their
> systems in harm way due to a possible bug in puppet, corruption of the
> source, accidental changes to the manifest, etc… just so they don’t
> have to follow tiring procedures or keep track of manual changes to
> the servers (damn that was long).

I'm beginning to believe that you are trolling :-)

> Is this the case or am I missing out on the big picture? Since when
> does “being productive” come before production integrity?
>

Pretty much: with puppet always running you can concentrate on the real problem, not on copy/pasting the configs or waiting for an install to finish.

I'm not mad, I take pride in my laziness, I'm more efficient that way.

Cheers and good luck for the new year,
Silviu
Julian Simpson
2009-Dec-30 14:10 UTC
Re: [Puppet Users] Re: continues puppet run in production
> I'm not mad, I take pride in my laziness, I'm more efficient that way.

On Laziness:

Wall, along with Randal L. Schwartz and Tom Christiansen, writing in the second edition of Programming Perl, outlined the Three Virtues of a Programmer:

Laziness - The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don't have to answer so many questions about it. Hence, the first great virtue of a programmer. Also hence, this book. See also impatience and hubris.

Impatience - The anger you feel when the computer is being lazy. This makes you write programs that don't just react to your needs, but actually anticipate them. Or at least pretend to. Hence, the second great virtue of a programmer. See also laziness and hubris.

Hubris - Excessive pride, the sort of thing Zeus zaps you for. Also the quality that makes you write (and maintain) programs that other people won't want to say bad things about. Hence, the third great virtue of a programmer. See also laziness and impatience.

(http://en.wikipedia.org/wiki/Larry_Wall)

I see no reason why this shouldn't apply to systems administrators.

Happy new year, everyone.

J.