Ross McKerchar
2008-Nov-22 13:03 UTC
[Puppet Users] Potentially dangerous behaviour of puppet with facter timeouts
Hi All,

I've just encountered an (admittedly fairly unlikely) sequence of events that could have led to puppet breaking a lot of our systems. Most things are behaving correctly, but the end result is unexpected and dangerous, so although I'm not sure what the fix is I thought I'd highlight it anyway.

Consider the following snippet:

    case $operatingsystemrelease {
        4: { $filename = "auth.for4" }
        5: { $filename = "auth.for5" }
    }

    file { "/etc/pam.d/system-auth":
        source => "puppet://$puppetserver/files/$filename"
    }

I'm sure more experienced users will already be spotting problems with this; however, I would guess it's a fairly common pattern.

Here's what happened:

1. The server was under reasonably high load, hence the operatingsystemrelease fact timed out.
2. Consequently $filename was undefined, so source was set to "puppet://$puppetserver/files/". (My servers are all Red Hat 4 or 5, so I never considered needing a default.)
3. Puppet spotted that source was a directory, so it converted /etc/pam.d/system-auth to a directory, entirely breaking pam and locking me out of the system!

Of course, there are various easy fixes for this specific case, which I have now implemented. However, the general problem remains: I think it's _very_ important that puppet rigidly adheres to the principle of least surprise, given that it naturally has the capability of breaking a lot of systems. The example above is particularly interesting as it is easy to see how it could be missed during tests.

Maybe the solution is for puppet to behave in a slightly more paranoid manner when things like facts aren't working quite right? When investigating the above I found that puppet had actually decided my OS was FreeBSD, not Red Hat, resulting in it calling all sorts of interesting commands. Luckily they all just failed, but it's conceivable they could've done some damage.
As someone who does use puppet to manage high profile systems, I'd sleep far better at night knowing that puppet would rather give up at the merest hint of something a bit "whiffy" than valiantly fire ahead anyway and make some potentially serious mistakes...

I am running 0.24.5 on the client, 0.24.6 on the server & facter 1.5.0 throughout.

-ross

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---
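One possible fix for the snippet in this message, sketched on the assumption that refusing to compile is preferable to guessing: add a default branch that calls fail(). The error message wording is illustrative only, and this is not necessarily the fix Ross implemented.

```puppet
case $operatingsystemrelease {
    4: { $filename = "auth.for4" }
    5: { $filename = "auth.for5" }
    # Any other value, including an empty one from a timed-out fact,
    # stops compilation instead of leaving $filename undefined.
    default: { fail("Unexpected operatingsystemrelease '${operatingsystemrelease}'") }
}

file { "/etc/pam.d/system-auth":
    source => "puppet://$puppetserver/files/$filename"
}
```

With the default branch in place, a missing fact produces a compile error rather than a file resource whose source ends in a bare directory.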
RijilV
2008-Nov-22 19:18 UTC
[Puppet Users] Re: Potentially dangerous behaviour of puppet with facter timeouts
2008/11/22 Ross McKerchar <Ross.McKerchar@sophos.com>:

> Hi All,
>
> I've just encountered an (admittedly fairly unlikely) sequence of events
> that could have led to puppet breaking a lot of our systems. Most things
> are behaving correctly, but the end result is unexpected and dangerous, so
> although I'm not sure what the fix is I thought I'd highlight it anyway.
>
> Consider the following snippet:
>
>     case $operatingsystemrelease {
>         4: { $filename = "auth.for4" }
>         5: { $filename = "auth.for5" }
>     }
>
>     file { "/etc/pam.d/system-auth":
>         source => "puppet://$puppetserver/files/$filename"
>     }
>
> I'm sure more experienced users will already be spotting problems with this;
> however, I would guess it's a fairly common pattern.
>
> Here's what happened:
> 1. The server was under reasonably high load, hence the
>    operatingsystemrelease fact timed out.
> 2. Consequently $filename was undefined, so source was set to
>    "puppet://$puppetserver/files/". (My servers are all Red Hat 4 or 5, so I
>    never considered needing a default.)
> 3. Puppet spotted that source was a directory, so it converted
>    /etc/pam.d/system-auth to a directory, entirely breaking pam and locking
>    me out of the system!
>
> Of course, there are various easy fixes for this specific case, which I have
> now implemented. However, the general problem remains: I think it's _very_
> important that puppet rigidly adheres to the principle of least surprise,
> given that it naturally has the capability of breaking a lot of systems. The
> example above is particularly interesting as it is easy to see how it could
> be missed during tests.
>
> Maybe the solution is for puppet to behave in a slightly more paranoid
> manner when things like facts aren't working quite right? When investigating
> the above I found that puppet had actually decided my OS was FreeBSD, not
> Red Hat, resulting in it calling all sorts of interesting commands. Luckily
> they all just failed, but it's conceivable they could've done some damage.
>
> As someone who does use puppet to manage high profile systems, I'd sleep
> far better at night knowing that puppet would rather give up at the merest
> hint of something a bit "whiffy" than valiantly fire ahead anyway and make
> some potentially serious mistakes...
>
> I am running 0.24.5 on the client, 0.24.6 on the server & facter 1.5.0
> throughout.
>
> -ross

Hi Ross,

I think the problem you outlined is an important one. One solution is to use an external node tool that also supplies all of the facter variables, and to import data into that external node tool only when the servers are not under heavy strain (iClassify is such a tool).

Another idea is to not run puppet when there is a ton of load on the system, for example by running puppet from cron with a wrapper script that makes some reasonable guess as to the load of the system.

Also, you might consider upgrading facter to 1.5.2. Between 1.5.0 and 1.5.1 the timeout setting was refactered (har har har) a couple of times, and I think you might find that it behaves more like you would want it to.

.r'
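A minimal inline variant of the cron wrapper idea, expressed as a Puppet cron resource. Everything specific here is an assumption made for illustration: a Linux host exposing /proc/loadavg, a load threshold of 4, puppetd installed at /usr/sbin/puppetd, a half-hourly schedule, and no puppetd daemon running alongside the cron job.

```puppet
# Hypothetical resource name; run puppet only when the 1-minute load is low.
cron { "puppet-when-idle":
    # Single quotes keep Puppet from interpolating the shell's $(...).
    # The integer part of the 1-minute load average is compared with 4.
    command => '[ "$(cut -d. -f1 /proc/loadavg)" -lt 4 ] && /usr/sbin/puppetd --onetime --no-daemonize',
    user    => root,
    minute  => [0, 30]
}
```

This only lowers the chance of a run coinciding with heavy load; it does not make manifests safe against a fact that is missing for some other reason.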
Marcin Owsiany
2008-Nov-23 11:15 UTC
[Puppet Users] Re: Potentially dangerous behaviour of puppet with facter timeouts
On Sat, Nov 22, 2008 at 01:03:03PM +0000, Ross McKerchar wrote:

> Hi All,
>
> I've just encountered an (admittedly fairly unlikely) sequence of events that could have led to puppet breaking a lot of our systems. Most things are behaving correctly, but the end result is unexpected and dangerous, so although I'm not sure what the fix is I thought I'd highlight it anyway.
>
> Consider the following snippet:
>
>     case $operatingsystemrelease {
>         4: { $filename = "auth.for4" }
>         5: { $filename = "auth.for5" }
>     }
>
>     file { "/etc/pam.d/system-auth":
>         source => "puppet://$puppetserver/files/$filename"
>     }
>
> I'm sure more experienced users will already be spotting problems with this; however, I would guess it's a fairly common pattern.
>
> Here's what happened:
> 1. The server was under reasonably high load, hence the operatingsystemrelease fact timed out.
> 2. Consequently $filename was undefined, so source was set to
>    "puppet://$puppetserver/files/". (My servers are all Red Hat 4 or 5, so
>    I never considered needing a default.)

One way is to use a selector instead of a case statement in such a situation. Here it would just fail with:

    No matching value for selector param '' at file.pp:N

And all would be well.

-- 
Marcin Owsiany <marcin@owsiany.pl> http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216 FE67 DA2D 0ACA FC5E 3F75 D6F6 3A0D 8AA0 60F4 1216

"Every program in development at MIT expands until it can read mail."
-- Unknown
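Marcin's selector suggestion might look roughly like the following. This is a sketch that reuses the variable and file names from Ross's snippet; it was not posted in the thread.

```puppet
# A selector with no default branch: an unmatched value, including the
# empty string left by a timed-out fact, is a compile error rather than
# an undefined $filename.
$filename = $operatingsystemrelease ? {
    4 => "auth.for4",
    5 => "auth.for5"
}

file { "/etc/pam.d/system-auth":
    source => "puppet://$puppetserver/files/$filename"
}
```

The design point is that the selector must yield a value, so the failure surfaces when the manifest is compiled instead of when the file resource is applied.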