The thread on "templates and tagging" (http://groups.google.com/group/puppet-users/browse_thread/thread/df87d0837b2e4993) brought out some questions about how I (and, for some of the things I'm about to say, others in the Puppet community) think about how Puppet is designed. In my message in that thread, I had a disclaimer, which talked about the fact that Puppet is a great tool, and that Luke deserves all praise for having built and conceived of it. That said, we do disagree about a few things.

What follows is my attempt to clarify how and why I think the way I do. It's broken up into sections, since the places where we diverge philosophically don't actually occur until you get pretty deep into actually using Puppet. My intent was to be able to bring people along, so that we're not talking above people's understanding of Puppet.

That said, I don't claim to be perfect or unique in my understanding of Puppet or its internals. These are simply my observations, having used Puppet to help build many different infrastructures for various organizations.

== The Language =

Puppet is a huge leap forward in the art and practice of systems administration/systems automation/systems architecture. Standing on the shoulders of cfengine and similar tools, Luke created a fundamentally useful abstraction layer for not only the practical management of systems, but for expressing the entire infrastructure (well, the *nix based systems in it anyway) as meaningful code.

Let me say that one more time, because it's the thing that is most amazing about Puppet, and about Luke's accomplishment with it:

Puppet allows you to express your entire infrastructure as meaningful code.

This was just not even theoretically possible with any existing tool until Luke created Puppet, in my opinion. Puppet, in two words, kicks ass.
From a practical point of view, the big win here was the creation of a syntax that allowed for the easy expression of sometimes complicated relationships between discrete resources in the system. The minimum set of resources that must be capable of being managed by a tool like Puppet is:

* Files
* Directories
* Symlinks
* Exec
* Packages
* Services

From those six basic resources, you could automate the vast majority of modern *nix systems.

Puppet didn't stop there, though. It gave an even more powerful abstraction layer to us, and that was the ability to group together sets of those fundamental resources (which it calls types) into classes. This allowed us to start grouping together resources according to function, regardless of the underlying implementation. Instead of modeling all of our init scripts, then all of our users, we could start rolling all of those things together into a single class, and then stating that it should exist on a given system.

    class apache {
      file { "httpd.conf": .. }
      package { "apache": .. }
    }

That rocked. We took all the varying steps required to get a functioning apache installation, and we rolled them up in a single thing we could name: "apache".

Still, this wasn't enough. Oftentimes, we would have a repetitive series of resources that we needed to apply over and over again. So Luke one-upped himself: he gave us the ability to create a resource definition. We could now group together resources, and reference them in our classes *as if they were a single resource*.

    define apache_virtual_host($ensure="enabled") {
      file { "$name.conf": .. }
      exec { "reload-apache": .. }
    }

Now, instead of repeating that pattern anywhere I needed a new virtual host, I could just say:

    apache_virtual_host { "frank": ... }

And have all of those resources created for me.

This is a leap forward similar to the invention of chain-saws in the logging industry.
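One way to picture what a definition buys you is to treat it as a function that expands into the fundamental resources it wraps. The Ruby sketch below is only a toy model of that idea, not how Puppet implements definitions; the Resource struct and the method name are hypothetical, chosen to mirror the apache_virtual_host example.

```ruby
# A toy model of a resource: just a type and a title.
Resource = Struct.new(:type, :title)

# Hypothetical stand-in for the apache_virtual_host definition: one call
# expands into the lower-level resources that the definition groups together.
def apache_virtual_host(name)
  [
    Resource.new("file", "#{name}.conf"),
    Resource.new("exec", "reload-apache"),
  ]
end

# Declaring one virtual host yields both underlying resources.
expanded = apache_virtual_host("frank")
expanded.each { |r| puts "#{r.type} { \"#{r.title}\" }" }
```

The caller only names "frank"; the file and exec resources are implied by the definition, which is the saving being described.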
The impact on the ability of a single systems administrator to create elegant, functional, repeatable configurations cannot be overstated. The practical benefits of these three powerful abstraction layers (Resources, Classes, and Definitions) are:

* I can manage different resources with similar semantics, in one place.
* I can group those resources together under functional umbrellas, so it's easy to find, maintain, and extend them.
* I can create new semantic abstractions that simplify my problem domain.

If you knew nothing else about what Puppet provided, or how it turns those resources into functional reality, you should already be able to see the value inherent in those three things alone.

If I was putting a percentage value on the different parts of Puppet, in terms of their impact on how efficient I am as a Systems Administrator, I'm putting the language at 90%. Sure, without the ability to take that language and do meaningful action with it, it's functionally useless. But many tools provide the ability to manage the six fundamental resources, to say "make sure this file is owned by root" or "make sure this service is enabled". None of them come close to letting you do it as elegantly as Puppet does.

== Providers and Native Types =

Puppet takes the language abstraction layer, and it somehow magically turns it into reality on your system. When you say:

    package { "apache": ensure => latest }

Puppet knows that Apache should be installed, and that it should always be kept at the latest version. It knows how to do this for multiple platforms, even in the face of wildly different syntax for the underlying operation.

This is the second half of the magic of resource abstraction. Not only does it let you easily say you want it, it also *hides the implementation details from you*.

Now, there are a few snags to this magic. One is that the underlying platforms don't all agree on the name of the apache package. Is it "httpd", "apache", or "apache2"?
Puppet lets you work through this with a conditional syntax:

    package { "apache":
      name => $operatingsystem ? {
        Debian  => "apache2",
        CentOS  => "httpd",
        default => "apache"
      },
      ensure => latest
    }

If you tried to do this the other way, by having the Package provider understand "apache" in a native way, you would find the technical obstacles much too hard. If the provider was supposed to know dynamically that Package["apache"] was called "httpd" on CentOS, it would need to have applied its own ontology on top of a hideous number of potential package layers. So it lets you give it a hint, and move along.

The results kick ass. I can now refer to one resource, Package["apache"], any time I want to talk about Apache, no matter what platform I'm on. That rocks, because it lets me apply my own model of what was desired, and map that easily to a number of underlying implementations.

It works because Package, as a concept, is a fundamental thing. It can be defined, and its basic attributes are similar regardless of implementation. Packages have:

* Names
* Versions
* Well defined states (installed, not installed, current, out of date, etc.)

Regardless of your packaging system, it probably has a lot of similarities with other packaging systems. Building an abstraction layer at the concept of a "Package" works great, because the similarities between packaging systems are quite high in the abstract.

One of the things you'll notice is that this is true about all six of the fundamental resources I outlined above. No matter what platform you're on, things of that kind share a lot of common attributes. They are differentiated mainly by differences in implementation, but not in abstract concepts.

These basic types show the sweet spot for this kind of underlying platform abstraction:

A resource can be boiled down into discrete abstract concepts, which are present in the vast majority of implementations, regardless of platform.
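The "hint" in the package example is essentially a lookup from one abstract name to a per-platform name, with a fallback. A minimal Ruby sketch of that lookup follows; the table and method name are hypothetical and only model the selector, they are not Puppet's provider code.

```ruby
# Per-platform overrides for an abstract package name, mirroring the
# selector in the package example (Debian => apache2, CentOS => httpd).
PACKAGE_NAMES = {
  "apache" => { "Debian" => "apache2", "CentOS" => "httpd" },
}.freeze

# Hypothetical helper: resolve the abstract name for one operating system,
# falling back to the abstract name itself (the selector's "default" branch).
def platform_package_name(abstract_name, operatingsystem)
  overrides = PACKAGE_NAMES.fetch(abstract_name, {})
  overrides.fetch(operatingsystem, abstract_name)
end

%w[Debian CentOS FreeBSD].each do |os|
  puts "#{os}: #{platform_package_name('apache', os)}"
end
```

Everything else in the manifest keeps referring to the abstract name; only this one mapping knows about platform differences.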
Doing this provides a clear benefit, as I can now manage a single declaration of a resource, and have it take care of the details, regardless of my platform.

When a resource fits that requirement, the benefit to providing this cross-platform abstraction layer is clear. It takes more work up front, but it saves time in the long run.

This combination of a Resource and a set of Providers is often referred to as a "native type". It's a thing that Puppet inherently knows how to manage, across multiple implementations or platforms. This is in contrast to my apache_virtual_host definition above, which is a "defined type", and is made up of a series of Puppet's own native types working in concert.

If there were no cross-platform providers at all, I could still build cross-platform tools with Puppet's language.. it would just be wildly less useful and elegant.

= Where Providers and Native Types break down

So, the sweet spot for native types is if:

A resource can be boiled into discrete abstract concepts, which are present in the vast majority of implementations, regardless of platform.

There are many things that might fit into this box, outside of just the six fundamental resources:

* Firewall rules
* Mount points
* Host entries
* DNS entries

The list goes on. In all of these above cases, you can take the basic idea of the thing, and boil it down to a series of high level concepts, regardless of the underlying implementation.

But wait.. there is a snag. With Firewall rules, it's actually about a lot more than just the basic idea of "from x to y on z port". That's easy. But the implementation details do matter -- think of the wide variety of rules you can place in iptables alone, then extend that to pf or any other sort of similar tool. You might still see benefit out of a native type, but it probably won't be the cross-platform nature of it, unless you are willing to settle for a non-fully-functional abstraction.
Let's take this a step further, and talk about a much more complicated beast: web servers. You can make some abstraction about the sorts of things web servers allow you to do, on a high level. They bind to a port, they serve files. But they also have virtual hosts, complicated rewrite rules, proxy layers, ssl, and a huge list of other things. It would be cool to be able to say:

    web_server { "foo":
      port          => [ 80, 443 ],
      document_root => "/",
      provider      => "nginx"
    }

And then be able to switch it to say:

    web_server { "foo":
      port          => [ 80, 443 ],
      document_root => "/",
      provider      => "apache"
    }

You could absolutely build that in the simplest case. But the underlying complexity of modern web servers, along with the wildly different approaches to their underlying configuration, makes this kind of thing very difficult to do. Even though there would be a benefit to being able to switch out what webserver I am running with a simple swap of provider, that benefit is outweighed by what I would lose in flexibility.

I use this example to illustrate that there is a place outside the sweet spot for native type development. It is when:

The complexity of implementing an abstraction layer means losing significant functionality within the underlying providers.

Or, put another way, as the complexity of the thing you are abstracting increases, the utility of abstracting it falls.

The sweet spot in the case of web servers is to use definitions, along with the fundamental building blocks, to automate the process of configuring them. It provides huge semantic benefit for comparatively little effort. Building a native web_server provider provides an interesting benefit (the ability to use different web servers transparently), but at a cost of lost functionality and a huge amount of effort.
So, a particular thing is a great case for becoming a native type if:

1) It can be boiled into discrete abstract concepts, which are present in the vast majority of implementations, regardless of platform.
2) The complexity of implementing an abstraction layer does not result in a significant loss of functionality within the underlying providers.

Often, only one of these two things will be true, and we'll wind up building a native type anyway. An example of this would be a pending iptables native type from Stanford. They aren't building a "firewall" or "packet filter" native type with an iptables provider. They are building native iptables types, that understand the specific complexities inherent in that one implementation, so that they can have a more complete semantic set by which to manage their infrastructure.

So, something might be a good case for being a native type if:

1) Its configuration can be abstracted into puppet's native syntax easily
2) It benefits from a deeper level of decision making on the client side

In the case of iptables, it fits both these slots. We can totally model its syntax in puppet, or even punt on doing that at all, and just let you write the rule inline (since we don't have to worry about it being cross platform!) Plus, it solves the problem of making sure your rules all appear in order, since you can use Puppet's internal ability to declare relationships between resources to structure the resulting rule set, which would be impossible using Defined types alone.

If it doesn't fit well into one of those two boxes, it's a bad case for a native type, and you are better off making definitions and using the smaller building blocks to solve your problem.

== Puppet's language is declarative (except when it isn't) =

In practical terms, this means that when you declare two resources:

    file { "this": }
    file { "that": }

in a manifest, they are not guaranteed to be applied on the client in the order you declared them.
If they are dependent on one another, you must make that dependency explicit, or you won't have any guarantee that they will be applied in the order you expect. Puppet makes a Directed Graph of all of your resources, and then performs a Topological Sort on that graph to determine the order they will be applied in. One side-effect of this is that the order they will be applied in may vary from one run to the next, because a topological sort of a directed graph may have many different solutions.

The reasoning behind this is based on the assumption that, as complexity increases, relying on the order in which resources are specified to determine when things should happen becomes impossible to manage. With 10 things, it's easy to put them in a line. With 100 things, it's harder. With 1000 things, it's quite hard. By using dependencies, you only have to say "this requires that". You don't have to worry about where "this" or "that" slot into the grand scheme of things.

The gotcha here is that people actually don't think in terms of directed graphs and topological sorts. If you look in your Puppet manifests, you most often write them in order (at least within a class). It's the natural thing to do, because it's just "normal" to write in a logical line. When you realize that order doesn't matter that first time, you start to make your edits *out* of order, because you know it doesn't matter. Many people have been bitten by this:

    file { "foo":
      content => template("woot")
    }

    $variable_i_need_in_foo = "something"

You can see the thought process here... I need to model this file as a template. Oh, and it needs $variable_i_need_in_foo. That should work, you figure, because Puppet is declarative.

Except that template() is a function that gets called at the time the parser sees it. And since the parser hasn't seen $variable_i_need_in_foo, it can't use it in the template.
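The same trap can be reproduced outside Puppet with plain ERB. This is a minimal sketch, assuming Ruby 2.5 or later for ERB#result_with_hash; the scope hash is a hypothetical stand-in for the variables the parser has seen so far, not Puppet's actual scope object.

```ruby
require "erb"

# Hypothetical stand-in for the manifest scope: only variables assigned
# before the template is rendered are visible to it.
scope = {}

template = ERB.new("setting = <%= variable_i_need_in_foo %>")

# Rendering happens immediately, like template() being called as soon as
# the parser reaches it. The variable has not been assigned yet.
early =
  begin
    template.result_with_hash(scope)
  rescue NameError => e
    "error: #{e.name} is not defined yet"
  end

# The assignment comes later in "manifest order", so only a later render sees it.
scope[:variable_i_need_in_foo] = "something"
late = template.result_with_hash(scope)

puts early
puts late
```

The first render fails and the second succeeds, even though nothing about the template changed; only the point at which it was evaluated did.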
It's a totally reasonable behavior to have, and it would blow up if the language wasn't declarative as well. But it goes against what the user gets trained to expect.

Finally, I think the nail in the declarative coffin is this. As the complexity of your implementation increases, it is harder to keep the explicit statements of dependency in order than it is to slot things into a line. The reason?

Most of the time, it doesn't matter where something gets slotted in.

If I have a thousand things, very few of them *actually care about where they fit*. They just need to happen after a few other things. If Puppet was not declarative, but simply executed resources in the order they appear in the manifest (based on the order the top-level classes are assigned to the node) the problem would become one of ensuring that each class state explicitly what *other* classes need to have been applied before it can be run. Which you have to do anyway, if you want to ensure that your complete configuration can be delivered in a single run.

This would be the first major philosophical break I have with Puppet. I think the declarative nature of the language is a hindrance at scale, not a benefit.

One way to fix this would be to implement an automatic require statement for every resource that appears in an "include foo" line inside a class for every resource within the class. At that point, though, you might as well just ditch the graph altogether.

= Puppet is Myopic by default

For those of you who have read this far, this is where Luke and I really start to diverge philosophically.

At the moment, the majority of Puppet users are utilizing it in a myopic way. For any given run, a node knows all about itself, but it doesn't know about its neighbors. So how do we handle configuring services which require knowledge about the infrastructure at large, instead of just my one node?

To solve this problem, Luke has added a few more layers of abstraction into the language.
They are the ability to create a virtual resource and "export" it, and the ability to "collect" those resources again on another node. (Often referred to, cleverly enough, as export/collect.) To take the simplest case, that of ssh host keys:

    class ssh {
      case $sshdsakey {
        "": {
          # ignore empty keys
        }
        default: {
          @sshkey { $hostname:
            type => dsa,
            key  => $sshdsakey
          }
        }
      }
      sshkey <||>
    }

So, a lot of things are happening here:

* We are saying that the canonical sshkey for $hostname is $sshdsakey. (Both of which are facts the node submitted to puppet when it ran.)
* We are saying that, on every host, we want to make all of the sshkey resources puppet knows about exist on this node.

If you accept that Puppet is, or should be, the single canonical and authoritative place where all the information about how your systems architecture should be built is, then the @sshkey statement seems to make some sense.

The thing is, though, we are really saying that whatever the node says is correct is correct. Puppet doesn't have an ssh host key for me that it's distributing to my client. It isn't in control of what that value is or is not. Each individual node decides for itself what the ssh host key should be, and we're just letting everyone else know about it.

This is a good thing, in my opinion. Creating every ssh host key manually and distributing it would work, but what a pain. This solves my problem, which is great. By letting me grab the resources of other nodes, I can have a much broader view of what's happening around me. (I am less myopic!)

In order to make this functionality work, you need to store every created resource for every node in your infrastructure at the last state you saw it in (Puppet does this by sticking the resources into a MySQL or PostgreSQL database).

(Disclaimer: I am not familiar at all with the actual storeconfigs/export/collect code. I am extrapolating.)

In one of our infrastructures, we define approximately 900 different resources per node.
If we have 10 machines, that means 9000 discrete resources total. If Puppet runs every 30 minutes, that means:

1) The first run makes 9000 INSERTs.
2) Subsequent runs make ideally a single SELECT for all the resources I stored on my last run, and then any UPDATEs, INSERTs or DELETEs that are required to make my working set match. In an un-optimized version, it makes 9000 individual SELECTs for comparison.
3) Any use of the export/collect functionality is a subsequent SELECT.

This is not such a big deal at 10 systems. Let's scale to 10x, though, to 100 systems. That number is now 90,000. 200 systems is 180,000. 1000 systems is 900,000 rows.

That's quite a bit of data, and it scales pretty rapidly as you use puppet more. Especially if you start declaring resources for everything you want to do. We have a Nagios installation I'll talk about later, that has a total of 1748 service checks. How many of those would need to be @nagios_service { .. }?

Now, there are benefits to this storage:

1) You know about the state of every resource on every system you manage. This has *super sweet* reporting potential.
2) You can query against it to do things like sshkey <| |>

The downside:

1) Using storeconfigs today means taking a fairly significant performance penalty.
2) You can't export/collect non-native types.

The first one is problematic for obvious reasons. The second one is more nefarious. If I want to use a resource declaration in a puppet manifest to configure something that *doesn't* make sense as a native type, I'm left out in the cold. I'm forced to make a native type, because it's the only way I'm going to get the behavior I want when it comes out the other end.

For example, Puppet has native Nagios types. These exist because you want to be able to configure Nagios through resource declarations within the functional classes that reflect the monitors you need.
(And they aren't a more generic "monitor" type because it would be too complex, and you would lose too much functionality.)

So, what do I really *need* when I'm configuring Nagios in an automated way?

1) A list of all the hosts in my infrastructure
2) Knowledge of what services those hosts provide
3) Knowledge of what monitors should be watching those services

All of these details can be boiled down to one thing: I need knowledge of data about all the systems in my infrastructure. I need to know the IP Address of every host, along with FQDNs. I need to know that you're a "web server" or a "database server".

All of this is information that exists *outside of the resource modeling* that provides so much power in puppet. When I model:

    @nagios_host { $fqdn: address => $ipaddress }

All I'm doing is taking data I already know about and forcing it into a semantic model that provides very little actual value. If my infrastructure is using Nagios, I'm not learning about it because I see a @nagios_host declaration in my puppet manifest. (I hope not!) I'm not learning that this host is monitored by nagios, because the assumption is that, either way, it will be (or why was I automating nagios in the first place?)

Its only purpose is to force data I already have into a semantic model that it doesn't fit well within, so that I can avoid knowledge of how nagios deals with its configuration. (The value of the puppet syntax, after all!)

Compare export/collecting to these lines of Ruby in a host template:

    <% ic.search("tag:monitored").each do |node| -%>
    define host{
      use        generic-host          ; Name of host template to use
      host_name  <%= node[:hostname] %>
      alias      <%= node[:fqdn] %>
      address    <%= node[:ipaddress] %>
    }
    <% end -%>

And this puppet resource:

    class nagios {
      file { "nagios-hosts":
        ...
        content => template("nagios_hosts.erb")
      }
    }

All wrapped up in a Nagios class. It assumes I have some knowledge of how to configure Nagios, since I am still writing the config file by hand.
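To see what that template produces, here is a self-contained Ruby sketch that renders the same host block with the standard library's ERB. The node list is hypothetical sample data standing in for the ic.search("tag:monitored") call, whose backing service is not shown in this message; the example assumes Ruby 2.6 or later for the trim_mode keyword.

```ruby
require "erb"

# Hypothetical node data; in the original template this comes from
# ic.search("tag:monitored").
nodes = [
  { hostname: "web1", fqdn: "web1.example.com", ipaddress: "192.0.2.10" },
  { hostname: "db1",  fqdn: "db1.example.com",  ipaddress: "192.0.2.20" },
]

# trim_mode "-" makes a closing "-%>" swallow the newline after the tag,
# so the loop lines leave no blank lines in the output.
template = ERB.new(<<~TEMPLATE, trim_mode: "-")
  <% nodes.each do |node| -%>
  define host{
    use        generic-host
    host_name  <%= node[:hostname] %>
    alias      <%= node[:fqdn] %>
    address    <%= node[:ipaddress] %>
  }
  <% end -%>
TEMPLATE

config = template.result_with_hash(nodes: nodes)
puts config
```

One pass over a list of nodes yields the whole hosts file, which is the "data about the infrastructure in aggregate" that the template approach relies on.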
In order to use the Nagios native types well, you *still must have an intimate knowledge of Nagios*. You don't get to magically wipe away that complexity, because it's hard. Hard enough that it doesn't make sense to make a "monitor" native type, hard enough that if you want to avoid having the 1748 service checks as individual resources you need to understand that you should be modeling hostgroups instead.

So when people question why the template model is useful at this level, and argue that the world should be made up of resources, I have to ask: why? What's the concrete benefit?

It's not:

* That I can swap my monitoring solution out for another one easily with a native type (I can't)
* That it scales better (it takes a couple seconds to compile a manifest that handles 243 nodes, 1748 service checks - storeconfigs + export/collect will likely never match that)
* That I now magically know that a node is being monitored by nagios (because, as Digant said so elegantly regarding sudoers, you still have to look at sudoers if you really want to be sure what's in there)

So what's left? That I like being able to look at:

    class apache {
      @nagios_service { "httpd": ... }
    }

And

    class nagios {
      nagios_service <| |>
    }

Instead of knowing that my services are defined in the template for Nagios service configs? Even if that's enough for you, is the above so superior that we shouldn't expand Puppet's language to make the gross parts of doing it with templates (the inability to embed logic that belongs outside the template itself in the manifest that utilizes the template) easier?

= Summary

Resources, Definitions, and Native Types aren't good in and of themselves. They need to have an increased level of practical utility for the people who wield them in order to provide any real benefit. There are situations where they clearly just are not the right answer.
I believe Nagios is one of them, indeed, almost any situation where you need to know data about the infrastructure in aggregate. In those cases, I don't need to model the data I already have as a resource -- I need a way to express that data to impact the creation of resources, sometimes simple fundamental ones.

More abstraction is not always better.

Regards,
Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759 E: adam@hjksolutions.com

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---

On Thu, May 8, 2008 at 6:12 PM, Adam Jacob <adam@hjksolutions.com> wrote:
>...
>
> = Where Providers and Native Types break down
>...
[lots of really, really good summaries of Puppet deleted]
> So, a particular thing is a great case for becoming a native type if:
>
> 1) It can be boiled into discrete abstract concepts, which are present
> in the vast majority of implementations, regardless of platform.
> 2) The complexity of implementing an abstraction layer does not result
> in a significant loss of functionality within the underlying
> providers.
>
> Often, only one of these two things will be true, and we'll wind up
> building a native type anyway. An example of this would be a pending
> iptables native type from Stanford. They aren't building a "firewall"
> or "packet filter" native type with an iptables provider. They are
> building native iptables types, that understand the specific
> complexities inherent in that one implementation, so that they can
> have a more complete semantic set by which to manage their
> infrastructure.
>
> So, something might be a good case for being a native type if:
>
> 1) Its configuration can be abstracted into puppet's native syntax easily
> 2) It benefits from a deeper level of decision making on the client side
...
>
> If it doesn't fit well into one of those two boxes, it's a bad case
> for a native type, and you are better off making definitions and using
> the smaller building blocks to solve your problem.
>

Some work on that issue has occurred at various Configuration Management workshops at LISA over the past several years. One way to handle that (using terms from OOP, but not specific to Puppet) is to declare abstract classes ('interfaces' in Java parlance) that a particular item might be, but it can have provider-specific extensions. You can do something like (N.B. this isn't correct Puppet code):

    abstract type foo {
      requires "item1"
      requires "item2"
      requires "item3"
      ...
    }

    native type bar implements foo {
      provides "item1" as "bar-something"
      provides "item2" as "bar2"
      ...
      and provides "specific-to-bar"
      provides "something-else-specific-to-bar"
    }

Now, bar is a subtype of foo, so everywhere that foo could be used, bar can be used in its place. Note that you could also choose to have methods instantiated as part of the class of bar -- I've just shown variables here.

Note that there have been some projects to create 'meta-languages' for configuration where an application provider would give a specification of its configuration needs, but I've never seen any of those gain significant traction. I suspect that the impetus will need to come from the configuration management side, and after the SAs & System Architects see the benefits of actual implementations, pressure will then move upstream to the application providers. But that won't happen in the short term except in specific, controlled circumstances (I know of at least one company that is doing that kind of thing with in-house developed applications; it's also occurring to a degree in the cluster computing/grid computing worlds).

> == Puppet's language is declarative (except when it isn't) =
...
>
> The gotcha here is that people actually don't think in terms of
> directed graphs and topological sorts. If you look in your Puppet
> manifests, you most often write them in order (at least within a
> class). It's the natural thing to do, because it's just "normal" to
> write in a logical line. When you realize that order doesn't matter
> that first time, you start to make your edits *out* of order, because
> you know it doesn't matter. Many people have been bitten by this:
>
> file { "foo":
>   content => template("woot")
> }
>
> $variable_i_need_in_foo = "something"
>
> You can see the thought process here... I need to model this file as a
> template. Oh, and it needs $variable_i_need_in_foo. That should
> work, you figure, because Puppet is declarative.
>
> Except that template() is a function that gets called at the time the
> parser sees it. And since the parser hasn't seen
> $variable_i_need_in_foo, it can't use it in the template. It's a
> totally reasonable behavior to have, and it would blow up if the
> language wasn't declarative as well. But it goes against what the
> user gets trained to expect.
>

Again, bringing some ideas from programming languages, this is a solved problem: e.g., in C, you can declare a variable to be an extern, and the resolution does not occur until link time or load time (for shared libraries it's usually done by a run-time linker/loader). If Puppet had a more sophisticated run-time engine, this issue could be addressed.

> Finally, I think the nail in the declarative coffin is this. As the
> complexity of your implementation increases, it is harder to keep the
> explicit statements of dependency in order than it is to slot things
> into a line. The reason?
>
> Most of the time, it doesn't matter where something gets slotted in.
>
> If I have a thousand things, very few of them *actually care about
> where they fit*. They just need to happen after a few other things.
> If Puppet was not declarative, but simply executed resources in the
> order they appear in the manifest (based on the order the top-level
> classes are assigned to the node) the problem would become one of
> ensuring that each class state explicitly what *other* classes need to
> have been applied before it can be run. Which you have to do anyway,
> if you want to ensure that your complete configuration can be
> delivered in a single run.
>
> This would be the first major philosophical break I have with Puppet.
> I think the declarative nature of the language is a hindrance at
> scale, not a benefit.
>
> One way to fix this would be to implement an automatic require
> statement for every resource that appears in an "include foo" line
> inside a class for every resource within the class.
> At that point, though, you might as well just ditch the graph altogether.
>

I think that partial ordering is very helpful in managing complexity, but you clearly don't. What should be done instead? It would be helpful to have an alternative here that we could weigh against the existing model and say "option X is better than option Y because of foo".

> = Puppet is Myopic by default
>
> For those of you who have read this far, this is where Luke and I
> really start to diverge philosophically.
>
> At the moment, the majority of Puppet users are utilizing it in a
> myopic way. For any given run, a node knows all about itself, but it
> doesn't know about its neighbors. So how do we handle configuring
> services which require knowledge about the infrastructure at large,
> instead of just my one node?
>

This really requires a more sophisticated ontology (or 'model') with a far richer type system. One doesn't have to go to the extreme that CIM has done, but Puppet could benefit from a more composable type system. It might be instructive to do a side-by-side look at PAN vs Puppet to see what I mean.

...
> In order to make this functionality work, you need to store every
> created resource for every node in your infrastructure at the last
> state you saw it in (Puppet does this by sticking the resources into a
> MySQL or PostgreSQL database).
>
> (Disclaimer: I am not familiar at all with the actual
> storeconfigs/export/collect code. I am extrapolating.)
>
> In one of our infrastructures, we define approximately 900 different
> resources per node. If we have 10 machines, that means 9000 discrete
> resources total. If Puppet runs every 30 minutes, that means:
>...
>
> This is not such a big deal at 10 systems. Let's scale to 10x, though,
> to 100 systems. That number is now 90,000. 200 systems is 180,000.
> 1000 systems is 900,000 rows.
>
> That's quite a bit of data, and it scales pretty rapidly as you use
> puppet more.
Especially if you start declaring resources for > everything you want to do. We have a Nagios installation I''ll talk > about later, that has a total of 1748 service checks. How many of > those would need to be @nagios_service { .. }? > > Now, there are benefits to this storage: > > 1) You know about the state of every resource on every system you > manage. This has *super sweet* reporting potential. > 2) You can query against it to do things like sshkey <| |> > > The downside: > > 1) Using storeconfigs today means taking a fairly significant > performance penalty. > 2) You can''t export/collect non-native types. >Note that Quattor (which is the framework that uses PAN) has this issue as well: there, templates must all be compiled into an instantiation of the compute system. And lots of variables and interactions mean the compilation is expensive. There are many ideas that can be incorporated from the programming languages world to help manage this, though (e.g., object files, Partial Evaluation, conditional re-compilation, etc). ...> > All I''m doing is taking data I already know about and forcing it into > a semantic model that provides very little actual value....I have to> ask: why? What''s the concrete benefit? > > It''s not: > > * That I can swap my monitoring solution out for another one easily > with a native type (I can''t) > * That it scales better (it takes a couple seconds to compile a > manifest that handles 243 nodes, 1748 service checks - storeconfigs + > export/collect will likely never match that) > * That I now magically know that a node is being monitored by nagios > (because, as Digant said so elegantly regarding sudoers, you still > have to look at sudoers if you really want to be sure what''s in there) >Based on conversations I''ve had with Luke about this in the past, I think Puppet is very open to a more rich type system so that these types of compositions can be done more easily, but this is a hard problem. 
In my view, the various existing type systems in Puppet have been attempts to take ''one more step'' down the road, but no one is claiming that the existing model is perfect (well, no one I''ve talked to anyway). It''s probably worthwhile to have some kind of ''hackathon'' or Puppet-Con where these issues could be discussed, hashed out, and then some kind of road map assembled. ...> > = Summary > > Resources, Definitions, and Native Types aren''t good in and of > themselves. They need to have an increased level of practical utility > for the people who wield them in order to provide any real benefit. > There are situations where they clearly just are not the right answer. >Agreed.> I believe Nagios is one of them, indeed, almost any situation where > you need to know data about the infrastructure in aggregate. In those > cases, I don''t need to model the data I already have as a resource -- > I need a way to express that data to impact the creation of resources, > sometimes simple fundamental ones. >Agreed, given the current state of Puppet.> More abstraction is not always better. >That''s true, but, to me, your points can be boiled down to "here are areas where Puppet needs to improve -- what can we do to make this better?" Abstraction is not necessarily bad, but the Abstraction and Composition tools in Puppet could be improved, and there are well-known ways to deal with many of the issues you''ve raised, but they just aren''t in Puppet (yet). Steven --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com To unsubscribe from this group, send email to puppet-users-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en -~----------~----~----~----~------~----~------~--~---
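
The storeconfigs figures quoted in the message above follow from a single multiplication. As a rough check, here is a minimal Ruby sketch. It assumes, as the quoted text does, about 900 resources per node and one stored row per resource; the constant and the `stored_rows` helper are made-up names for illustration, not anything from Puppet.

```ruby
# Assumption taken from the quoted message: roughly 900 resources per node,
# with one stored row per resource per node.
RESOURCES_PER_NODE = 900

# Hypothetical helper: total stored rows for a given number of nodes.
def stored_rows(node_count, per_node = RESOURCES_PER_NODE)
  node_count * per_node
end

# Print the row count for the node counts mentioned in the thread.
[10, 100, 200, 1000].each do |nodes|
  puts "#{nodes} nodes -> #{stored_rows(nodes)} rows"
end
```

The growth is linear in the number of nodes, so the concern raised in the thread is about the absolute volume and the cost of refreshing it on every run, not about super-linear blow-up.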
On Thu, May 8, 2008 at 8:04 PM, Steven Jenkins <steven.jenkins@gmail.com> wrote:
> Note that there have been some projects to create 'meta-languages' for
> configuration where an application provider would give a specification
> of its configuration needs, but I've never seen any of those gain
> significant traction. I suspect that the impetus will need to come
> from the configuration management side, and after the SAs & System
> Architects see the benefits of actual implementations, pressure will
> then move upstream to the application providers. But that won't
> happen in the short term except in specific, controlled circumstances
> (I know of at least one company that is doing that kind of thing with
> in-house developed applications; it's also occurring to a degree in
> the cluster computing/grid computing worlds).

So, that would be neat, but I'm not sure it's solving a problem I actually have most of the time. Projects like Elektra propose a new API for configuration. Projects like Augeas take a more rational approach, which is converting the syntax of configuration systems into a common tree structure that it can manipulate and regurgitate.

The issue with complex Native Types is more that having a sane provider layer becomes unwieldy, to the point where modeling certain complicated applications in that way becomes more trouble than using Defined Types made up of simpler native constructs (such as templates, files, and execs.)

In the "Good" scenarios, the iptables of the world, the value of the Native Type becomes your ability to influence how the final collection of similar resources will be combined in a way that's a bit more flexible than definitions and shell scripts (because you have much more influence on the syntax and behavior.)

I don't think this is a bad thing, necessarily. More complicated applications are, well, more complicated, and building abstraction layers on top of them is hard. Making those abstraction layers also not limit the underlying functionality of the complicated applications is even more difficult. In those cases, all of that can be swept aside, basically, by having better tools to influence the outcome of the simpler primitives (templates, files, etc.)

>> == Puppet's language is declarative (except when it isn't) =
> ...
>>
>> The gotcha here is that people actually don't think in terms of
>> directed graphs and topological sorts. If you look in your Puppet
>> manifests, you most often write them in order (at least within a
>> class). It's the natural thing to do, because it's just "normal" to
>> write in a logical line. When you realize that order doesn't matter
>> that first time, you start to make your edits *out* of order, because
>> you know it doesn't matter. Many people have been bitten by this:
>>
>> file { "foo":
>>   content => template("woot")
>> }
>>
>> $variable_i_need_in_foo = "something"
>>
>> You can see the thought process here... I need to model this file as a
>> template. Oh, and it needs $variable_i_need_in_foo. That should
>> work, you figure, because Puppet is declarative.
>>
>> Except that template() is a function that gets called at the time the
>> parser sees it. And since the parser hasn't seen
>> $variable_i_need_in_foo, it can't use it in the template. It's a
>> totally reasonable behavior to have, and it would blow up if the
>> language wasn't declarative as well. But it goes against what the
>> user gets trained to expect.
>>
>
> Again, bringing some ideas from programming languages, this is a
> solved problem: e.g., in C, you can declare a variable to be an
> extern, and the resolution does not occur until run-time (it's
> usually done by a run-time linker/loader).
>
> If Puppet had a more sophisticated run-time engine, this issue could
> be addressed.

I agree, it can be addressed. One of the stated goals of the configuration language, though, is to explicitly avoid this sort of complexity. See the lack of some of the fundamental primitives you have in most modern programming languages (loops, associative arrays, etc.) as one example of that fact.

I gush over the impact puppet's language has had on my ability to get things done, and I'm not lying about it.

Unlike Luke, I'm not convinced that a similar syntax built on top of a fully functional programming language wouldn't solve many of these issues, while reducing the complexity of the run-time engine as a whole.

> I think that partial ordering is very helpful in managing complexity,
> but you clearly don't.
>
> What should be done instead? It would be helpful to have an
> alternative here that we could weight against the existing model and
> say "option X is better than option Y because of foo".

Personally, I think you would get by fairly easily on nothing but class-level include statements causing the resources they create to "happen" before those that follow. For example, if you have a class that includes an apache virtual host:

class monkey {
  include apache

  apache_virtual_host { "something":
    ..
  }
}

The "include apache" statement could ensure that all the resources it lists are applied before the resources specified within class monkey. This is a good habit to get into in general with puppet, since if you require one functional element to be available before another, you're likely doing this kind of "include something everywhere I need a reference to a resource it creates" activity regardless.

It's essentially the same as what we do with Puppet already, but instead of saying all things are equal in execution order, we say that all things should happen after the things they need. (But they could happen before or after anything else that doesn't require those things as well.)

>> = Puppet is Myopic by default
>>
>> For those of you who have read this far, this is where Luke and I
>> really start to diverge philosophically.
>>
>> At the moment, the majority of Puppet users are utilizing it in a
>> myopic way. For any given run, a node knows all about itself, but it
>> doesn't know about it's neighbors. So how do we handle configuring
>> services which require knowledge about the infrastructure at large,
>> instead of just my one node?
>>
>
> This really requires a more sophisticated ontology (or 'model') with a
> far richer type system. One doesn't have to go to the extreme that
> CIM has done, but Puppet could benefit from a more composable type
> system. It might be instructive to do a side-by-side look at PAN vs
> Pupppet to see what I mean.

Really? I think the reason most of these efforts have failed in the past is exactly that attempt to create a sophisticated ontology. You don't need the library of congress -- you need semi-structured data stores with a low barrier to entry and internal consistency. Give me access to the data and the ability to create resources from it, and you can take the abstraction much further with very little need to formalize an ontology.

I'll take a look at PAN, since I'm not familiar with it.

>> Now, there are benefits to this storage:
>>
>> 1) You know about the state of every resource on every system you
>> manage. This has *super sweet* reporting potential.
>> 2) You can query against it to do things like sshkey <| |>
>>
>> The downside:
>>
>> 1) Using storeconfigs today means taking a fairly significant
>> performance penalty.
>> 2) You can't export/collect non-native types.
>>
>
> Note that Quattor (which is the framework that uses PAN) has this
> issue as well: there, templates must all be compiled into an
> instantiation of the compute system. And lots of variables and
> interactions mean the compilation is expensive. There are many ideas
> that can be incorporated from the programming languages world to help
> manage this, though (e.g., object files, Partial Evaluation,
> conditional re-compilation, etc).

Absolutely. In this case, it's more an issue of being able to get back at the data you need from across the infrastructure. I can think of lots of ways to speed it up (message queues, separating reporting from export/collect, etc.)

What I'm not convinced is fruitful is this abstraction:

@ssh_host_key { $fqdn:
  dsa_key => $ssh_dsa_key
}

being stored as a resource, which then has a native type to assemble it. I don't feel like it has the same kind of power as the types that exist as more than transient data stores.

> Based on conversations I've had with Luke about this in the past, I
> think Puppet is very open to a more rich type system so that these
> types of compositions can be done more easily, but this is a hard
> problem. In my view, the various existing type systems in Puppet have
> been attempts to take 'one more step' down the road, but no one is
> claiming that the existing model is perfect (well, no one I've talked
> to anyway).

I'm not certain extending the type system is the answer. I think extending the access to the data you would need to compose those types of resources is the answer, although the two aren't mutually exclusive.

Essentially, storing the data from each node as a resource you then later utilize is, I think, actually less useful than simply creating the end resource from the data. The level of benefit goes way down as the complexity of the abstraction increases like this.

> It's probably worthwhile to have some kind of 'hackathon' or
> Puppet-Con where these issues could be discussed, hashed out, and then
> some kind of road map assembled.

To be honest, one of the reasons I haven't talked more about the way we do things (and the reasons why) is the reaction to that feeling about where the resource abstraction starts to not be useful. (It tends to be overwhelmingly negative, and to come with a lot of "You don't understand the Puppet Way still!")

Puppet has a road map -- that road map is more and more native types, and a larger and larger use of tools like Export/Collect to solve the more difficult issues of services that require broader points of view within the infrastructure. (Along with some very cool work revolving around changing the network protocols.) I disagree with some of it, but I'm doing just fine with the tools as they stand for the majority of our use case, and talking about it tends to make Luke dour and grumpy.

Three things I would like to see that I'm fairly certain Luke would rather not have:

1) The ability to create arbitrary Ruby data-structures from within a manifest for use in a template
2) A saner default for relationships, or a removal of the directed graph altogether
3) The full set of common language constructs, such as loops, expanded conditionals, etc.

See the comments in the earlier thread regarding Puppet's syntax becoming more and more of a fully featured programming language over time, and the agreement between Luke and Digant that this is an undesirable state of affairs. I see where they are coming from, I just don't agree. :)

> That's true, but, to me, your points can be boiled down to "here are
> areas where Puppet needs to improve -- what can we do to make this
> better?" Abstraction is not necessarily bad, but the Abstraction and
> Composition tools in Puppet could be improved, and there are
> well-known ways to deal with many of the issues you've raised, but
> they just aren't in Puppet (yet).

Absolutely. In some cases, though, like the above, I'm not sure that Luke would agree with my assessment of export/collect versus querying an external data store. (In fact, I'm almost positive he won't, but I could be surprised.) I imagine he would much rather have a performance improved export/collect.

Regards,
Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
On May 8, 2008, at 9:19 PM, Adam Jacob wrote:
>>
>> Again, bringing some ideas from programming languages, this is a
>> solved problem: e.g., in C, you can declare a variable to be an
>> extern, and the resolution does not occur until run-time (it's
>> usually done by a run-time linker/loader).
>>
>> If Puppet had a more sophisticated run-time engine, this issue could
>> be addressed.
>
> I agree, it can be addressed. One of the stated goals of the
> configuration language, though, is to explicitly avoid this sort of
> complexity. See the lack of some of the fundamental primitives you
> have in most modern programming languages (loops, associative arrays,
> etc.) as one example of that fact.
>
> I gush over the impact puppet's language has had on my ability to get
> things done, and I'm not lying about it.
>
> Unlike Luke, I'm not convinced that a similar syntax built on top of a
> fully functional programming language wouldn't solve many of these
> issues, while reducing the complexity of the run-time engine as a
> whole.

There's a pure ruby DSL started; it can be found in git at: puppet/lib/puppet/dsl.rb

I'd love to see someone put some more effort into it. Probably someone other than Luke (we all know he has more than enough on his plate), who also has experience with this level of coding. Hint hint. :)

>
>> I think that partial ordering is very helpful in managing complexity,
>> but you clearly don't.
>>
>> What should be done instead? It would be helpful to have an
>> alternative here that we could weight against the existing model and
>> say "option X is better than option Y because of foo".
>
> Personally, I think you would get by fairly easily on nothing but
> class-level include statements causing the resources they create to
> "happen" before those that follow. For example, if you have a class
> that includes an apache virtual host:
>
> class monkey {
>   include apache
>
>   apache_virtual_host { "something":
>     ..
>   }
> }
>
> The "include apache" statement could ensure that all the resources it
> lists are applied before the resources specified within class monkey.
> This is a good habit to get in to in general with puppet, since if you
> require one functional element to be available before another, you're
> likely doing this kind of "include something everywhere I need a
> reference to a resource it creates" activity regardless.
>
> It's essentially the same as what we do with Puppet already, but
> instead of saying all things are equal in execution order, we say that
> all things should happen after the things they need. (But could
> happen before or after anything else that doesn't require those things
> as well)

You can do this pretty easily already: just use require => Class["apache"] on the first resource in your class dependency chain.

>
> Puppet has a road map -- that road map is more and more native types,
> and a larger and larger use of tools like Export/Collect to solve the
> more difficult issues of services that require broader points of view
> within the infrastructure. (Along with some very cool work revolving
> around changing the network protocols) I disagree with some of it,
> but I'm doing just fine with the tools as they stand for the majority
> of our use case, and talking about it tends to make Luke dour and
> grumpy.
>
> Three things I would like to see that I'm fairly certain Luke would
> rather not have:
>
> 1) The ability to create arbitrary Ruby data-structures from within a
> manifest for use in a template
> 2) A saner default for relationships, or a removal of the directed
> graph altogether
> 3) The full set of common language constructs, such as loops, expanded
> conditionals, etc.
>
> See the comments in the earlier thread regarding Puppet's syntax
> becoming more and more of a more fully featured programming language
> over time, and the agreement between Luke and Digant that this is an
> undesirable state of affairs. I see where they are coming from, I just
> don't agree. :)

Honestly, I don't think it would hurt the community at all if there were multiple implementations of the language, as long as the underlying functionality is still tied together.

With that said, there's a reason people haven't done something like puppet with existing languages. When you provide all the features of a robust language, there are many, many ways to do things (and shoot yourself in the foot). One of the great things about puppet is that it brings conventions to system administration. It obviously goes a little against our nature, but it brings tremendous shared benefit.

>
>> That's true, but, to me, your points can be boiled down to "here are
>> areas where Puppet needs to improve -- what can we do to make this
>> better?" Abstraction is not necessarily bad, but the Abstraction and
>> Composition tools in Puppet could be improved, and there are
>> well-known ways to deal with many of the issues you've raised, but
>> they just aren't in Puppet (yet).
>
> Absolutely. In some cases, though, like the above, I'm not sure that
> Luke would agree with my assessment of export/collect versus querying
> an external data store. (In fact, I'm almost positive he won't, but I
> could be surprised.) I imagine he would much rather have a
> performance improved export/collect.

A performance improved export/collect will come, as well as external data stores for various things (they're already popping up, you being the author of one). People will write what is useful to them, and the best of the work will bubble up.

-Blake
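
The partial ordering being argued over here can be illustrated outside Puppet. The following is a minimal Ruby sketch using the standard library's TSort module; it is a toy model, not Puppet's actual graph code. The `ToyGraph` class and the resource names in the hash are assumptions made up for illustration, with the single edge standing in for the require => Class["apache"] relationship mentioned above.

```ruby
require "tsort"

# Toy dependency graph: each resource maps to the resources it requires.
# This is an illustration only, not how Puppet represents its catalog.
class ToyGraph
  include TSort

  def initialize(requires)
    @requires = requires
  end

  # TSort hook: enumerate every node in the graph.
  def tsort_each_node(&block)
    @requires.each_key(&block)
  end

  # TSort hook: enumerate the nodes that must come before `node`.
  def tsort_each_child(node, &block)
    @requires.fetch(node, []).each(&block)
  end
end

# Hypothetical resources: the virtual host requires the apache class,
# while the motd file has no declared relationship to anything.
requires = {
  "Apache_virtual_host[something]" => ["Class[apache]"],
  "Class[apache]"                  => [],
  "File[motd]"                     => []
}

order = ToyGraph.new(requires).tsort
puts order
```

Only the declared edge is guaranteed: "Class[apache]" always sorts before the virtual host, while "File[motd]" may legitimately land anywhere. That is the "very few things care where they fit" point from both sides of the argument.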
--On Thursday, May 08, 2008 3:12 PM -0700 Adam Jacob <adam@hjksolutions.com> wrote:

> So, something might be a good case for being a native type if:
>
> 1) Its configuration can be abstracted into puppet's native syntax easily
> 2) It benefits from a deeper level of decision making on the client side
>
> In the case of iptables, it fits both these slots. We can totally
> model its syntax in puppet, or even punt on doing that at all, and
> just let you write the rule inline (since we don't have to worry about
> it being cross platform!) Plus, it solves the problem of making sure
> your rules all appear in order, since you can use Puppet's internal
> ability to declare relationships between resources to structure the
> resulting rule set, which would be impossible using Defined types
> alone.
>
> If it doesn't fit well in to one of those two boxes, it's a bad case
> for a native type, and you are better off making definitions and using
> the smaller building blocks to solve your problem.
>

This is where I totally disagree. If you can do something using a definition (which I would assume uses templates that are also specific to the provider you are using, plus a series of file copies and execs), why not make that a native type so at least it executes faster? Using definitions or templates still doesn't give you abstraction. In other words, you still can't create a definition that can handle all different kinds of firewalls, can you? So even if the native type is specific to iptables, at least it is declared as a resource, looks and feels like a resource, and hopefully does things in a smarter way than just a definition might do.
--On Thursday, May 08, 2008 3:12 PM -0700 Adam Jacob <adam@hjksolutions.com> wrote:

> The gotcha here is that people actually don't think in terms of
> directed graphs and topological sorts. If you look in your Puppet
> manifests, you most often write them in order (at least within a
> class). It's the natural thing to do, because it's just "normal" to
> write in a logical line. When you realize that order doesn't matter
> that first time, you start to make your edits *out* of order, because
> you know it doesn't matter. Many people have been bitten by this:
>
> file { "foo":
>   content => template("woot")
> }
>
> $variable_i_need_in_foo = "something"
>
> You can see the thought process here... I need to model this file as a
> template. Oh, and it needs $variable_i_need_in_foo. That should
> work, you figure, because Puppet is declarative.
>
> Except that template() is a function that gets called at the time the
> parser sees it. And since the parser hasn't seen
> $variable_i_need_in_foo, it can't use it in the template. It's a
> totally reasonable behavior to have, and it would blow up if the
> language wasn't declarative as well. But it goes against what the
> user gets trained to expect.

This is also why it is often said not to use variables this way, but to use definitions and native types. You can cleanly override them, whereas variable scoping is a bit of a guess.
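
The ordering gotcha quoted above can be mimicked in plain Ruby. The sketch below is a toy stand-in, not Puppet's parser: `ToyScope`, its `set` and `template` methods, and the choice to render an unset variable as an empty string are all assumptions made for illustration. It only shows why a function evaluated at the moment it is encountered cannot see an assignment that comes later in the file.

```ruby
require "erb"

# Toy stand-in for a manifest scope; NOT Puppet's real implementation.
class ToyScope
  def initialize
    @vars = {}
  end

  # Record a variable assignment, like `$name = value` in a manifest.
  def set(name, value)
    @vars[name] = value
  end

  # Render a template immediately, against only the variables seen so far.
  # Assumption: an unset variable renders as an empty string.
  def template(source)
    vars = @vars.dup
    lookup = ->(name) { vars.fetch(name, "") }
    ERB.new(source).result(binding)
  end
end

TEMPLATE = "value=<%= lookup.call('variable_i_need_in_foo') %>"

# Same order as the quoted manifest: template first, assignment second.
early = ToyScope.new
early_content = early.template(TEMPLATE)
early.set("variable_i_need_in_foo", "something")

# Reordered: assignment first, template second.
late = ToyScope.new
late.set("variable_i_need_in_foo", "something")
late_content = late.template(TEMPLATE)

puts early_content
puts late_content
```

In this toy model the first rendering has no value to interpolate and the second does, even though both "manifests" contain the same two statements.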
--On Thursday, May 08, 2008 3:12 PM -0700 Adam Jacob <adam@hjksolutions.com> wrote:

> If I have a thousand things, very few of them *actually care about
> where they fit*. They just need to happen after a few other things.
> If Puppet was not declarative, but simply executed resources in the
> order they appear in the manifest (based on the order the top-level
> classes are assigned to the node) the problem would become one of
> ensuring that each class state explicitly what *other* classes need to
> have been applied before it can be run. Which you have to do anyway,
> if you want to ensure that your complete configuration can be
> delivered in a single run.
>
> This would be the first major philosophical break I have with Puppet.
> I think the declarative nature of the language is a hindrance at
> scale, not a benefit.
>
> One way to fix this would be to implement an automatic require
> statement for every resource that appears in an "include foo" line
> inside a class for every resource within the class. At that point,
> though, you might as well just ditch the graph altogether.

Do you really have that big of a problem with this? I think there are some automatic dependency limitations that I would consider more bugs than limitations. There are some things that need to get ironed out. But for the most part, we don't rely on as much dependency stuff as one might expect.

I think you're trying to use Puppet like it was CFengine or a script. Just use CFengine or a script if that's what you want. Seriously, if you don't want declarative, there are already tools that do that. Why use Puppet? And why make Puppet like tools that already do that?

(Apologies for breaking up the long long long long long email into multiple replies. Fascinating read but easier to digest in chunks.)
--On Thursday, May 08, 2008 3:12 PM -0700 Adam Jacob <adam@hjksolutions.com> wrote: Finally getting to the heart of the discussion here....> So, what do I really *need* when I''m configuring Nagios in an automated > way? > > 1) A list of all the hosts in my infrastructure > 2) Knowledge of what services those hosts provide > 3) Knowledge of what monitors should be watching those services > > All of these details can be boiled down to one thing: I need knowledge > of data about all the systems in my infrastructure. I need to know > the IP Address of every host, along with FQDNs. I need to know that > you''re a "web server" or a "database server". > > All of this is information that exists *outside of the resource > modeling* that provides so much power in puppet. When I model: > > @nagios_host { $fqdn: > address => $ipaddress > } > > All I''m doing is taking data I already know about and forcing it into > a semantic model that provides very little actual value. If my > infrastructure is using Nagios, I''m not learning about it because I > see a @nagios_host declaration in my puppet manifest. (I hope not!) > I''m not learning that this host is monitored by nagios, because the > assumption is that, either way, it will be (or why was I automating > nagios in the first place?)You mention this is data you already have outside of Puppet. If that is true, then yes, you could use a Nagios configuration generation tool. But not everyone has this data outside of Puppet nor does everyone want to manually compile this data outside of Puppet. What if it is just represented in manifests?> > It''s only purpose is to force data I already have into a semantic > model that it doesn''t fit well within, so that I can avoid knowledge > of how nagios deals with it''s configuration. (The value of the puppet > syntax, after all!) 
> > Compare export/collecting to these lines of Ruby in a host template: > > <% ic.search("tag:monitored").each do |node| -%> > define host{ > use generic-host ; Name of host template to use > host_name <%= node[:hostname] %> > alias <%= node[:fqdn] %> > address <%= node[:ipaddress] %> > } > <% end -%> > > And this puppet resource: > > class nagios { > file { "nagios-hosts": > ... > contents => template("nagios_hosts.erb") > } > } > > All wrapped up in a Nagios class. It assumes I have some knowledge of > how to configure Nagios, since I am still writing the config file by > hand. > > In order to use the Nagios native types well, you *still must have an > intimate knowledge of Nagios*. You don''t get to magically wipe away > that complexity, because it''s hard. Hard enough that it doesn''t make > sense to make a "monitor" native type, hard enough that if you want to > avoid having the 1748 service checks as individual resources you need > to understand that you should be modeling hostgroups instead. > > So when people question why the template model is useful at this > level, and that the world should be made up of resources, I have to > ask: why? What''s the concrete benefit?A very good point. But most of the examples (90%?) I see of people relying heavily on templates don''t involve such a complex case. Most of the stuff I see actually can be handled quite well with a declarative language and native types. Nagios and Apache configs, as we discussed the other night, are probably the kinds of things that require something beyond what Puppet can natively understand how to do. 
But that doesn''t mean because it is necessary for these extreme cases, it should be considered the best practice.> > It''s not: > > * That I can swap my monitoring solution out for another one easily > with a native type (I can''t) > * That it scales better (it takes a couple seconds to compile a > manifest that handles 243 nodes, 1748 service checks - storeconfigs + > export/collect will likely never match that)But this means that you''re storing this knowledge in an external node tool, right? Because otherwise, you are having to duplicate this information in both the manifests and some kind of external tool. And what if we could get export/collect to match that? Why are we giving up on this and just saying "it will never work, lets just work around it?" Why not try to bring Puppet up to where it *can* perform at those levels?> * That I now magically know that a node is being monitored by nagios > (because, as Digant said so elegantly regarding sudoers, you still > have to look at sudoers if you really want to be sure what''s in there)But as I also said, looking at a template doesn''t tell you with any greater assurance either. The only way to be 100% sure is to go look on the Nagios server.> > So what''s left? That I like being able to look at: > > class apache { > @nagios_service { "httpd": > ... > } > } > > And > > class nagios { > nagios_service <| |> > } > > Instead of knowing that my services are defined in the template for > Nagios service configs?Yup. If I want to check if we''re setting up monitoring for Apache, I want to look at manifests related to configuring Apache. I don''t want to jump from template to template to see if I''ve got all the wheels doing the right things. 
That's one of the major buy-in points for going with Puppet.

>
> Even if that's enough for you, is the above so superior that we
> shouldn't expand Puppet's language to make the gross parts of doing it
> with templates (the inability to embed logic that belongs outside the
> template itself in the manifest that utilizes the template) easier?

Yeah, because that leads to unnecessary complexity. I guess the question really then becomes when does Puppet just become Ruby?

I'd love to see an example of what you're proposing would be necessary to make a megatemplate work that you couldn't do with existing classing or other Puppet constructs. I think that would help me understand what you are trying to get after. I too often hear people say "I can't live unless Puppet lets me do X" but they never say what they want to do with it and often, as others have said, when you look at this from a Puppet-way, you didn't need X to actually accomplish the goal, it just wasn't being looked at from the right angle.

It seems to me this may be the major failing of your argument: maybe a clear example of how something absolutely cannot be done the Puppet way would help expand things. What kind of expansion in Puppet's language is needed to put logic in manifests for use by templates?

>
> = Summary
>
> Resources, Definitions, and Native Types aren't good in and of
> themselves. They need to have an increased level of practical utility
> for the people who wield them in order to provide any real benefit.
> There are situations where they clearly just are not the right answer.
>
> I believe Nagios is one of them, indeed, almost any situation where
> you need to know data about the infrastructure in aggregate. In those
> cases, I don't need to model the data I already have as a resource --
> I need a way to express that data to impact the creation of resources,
> sometimes simple fundamental ones.
>
> More abstraction is not always better.

I'm still not buying the leap from "you can't make abstract native types" to "you shouldn't do things declaratively." There is a major difference between declaring an iptables specific native type instead of a firewall native type and using that declaratively, and using a big iptables template.

But you are right, native types may not always be the right answer. We do most certainly maintain apache configs as files and use definitions to simply drop in and enable those sites. I don't know if it would be feasible to represent all the inner workings of an apache config in declarations.

But even with these two examples, I still think there can exist a myriad of native types that will have practical utility. I don't think the complexity of Nagios and Apache means we need to abandon that push.

= Summary

I don't think this philosophical difference will get settled. It is very fair to say each side has good points. But I'm more interested in native types and even native types for meta data, and modeling everything in manifests, with the reliance upon some kind of external node tool eventually to map nodes to a class (to replace the nodes.pp file, essentially). I'm expecting to hit performance issues and I hope to be able to help Reductive Labs with resources to help iron those out.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---
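The point quoted above about "modeling hostgroups instead" of one resource per service check can be made concrete with a small, runnable Ruby sketch. It is only an illustration: the NODES list is a made-up stand-in for whatever ic.search("tag:monitored") would return, the :tags key is an assumption, and the output uses only the standard Nagios hostgroup_name and members directives.

```ruby
require 'erb'

# Hypothetical stand-in for the result of ic.search("tag:monitored").
# The :hostname/:fqdn/:ipaddress keys mirror the quoted template; the
# :tags key is an assumption made for this sketch.
NODES = [
  { hostname: 'web1', fqdn: 'web1.example.com', ipaddress: '10.0.0.11', tags: %w[apache] },
  { hostname: 'web2', fqdn: 'web2.example.com', ipaddress: '10.0.0.12', tags: %w[apache] },
  { hostname: 'db1',  fqdn: 'db1.example.com',  ipaddress: '10.0.0.21', tags: %w[mysql] }
].freeze

# Build { "apache" => ["web1", "web2"], "mysql" => ["db1"] } so a service
# check can be attached once per group rather than once per host.
def hostgroups(nodes)
  groups = Hash.new { |hash, key| hash[key] = [] }
  nodes.each do |node|
    node[:tags].each { |tag| groups[tag] << node[:hostname] }
  end
  groups
end

HOSTGROUP_TEMPLATE = <<~TEMPLATE
  <% groups.sort.each do |name, members| -%>
  define hostgroup{
    hostgroup_name <%= name %>
    members        <%= members.sort.join(',') %>
  }
  <% end -%>
TEMPLATE

# Render the aggregate view of the infrastructure as Nagios config text.
def render_hostgroups(nodes)
  ERB.new(HOSTGROUP_TEMPLATE, trim_mode: '-')
     .result_with_hash(groups: hostgroups(nodes))
end

puts render_hostgroups(NODES)
```

The grouping step is the part that needs a view of every node at once, which is the kind of aggregate knowledge the thread is arguing about.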
--On Thursday, May 08, 2008 9:52 PM -0700 Blake Barnett <shadoi@gmail.com> wrote:

>> See the comments in the earlier thread regarding Puppet's syntax
>> becoming more and more of a more fully featured programming language
>> over time, and the agreement between Luke and Digant that this is an
>> undesirable state of affairs.
>> I see where they are coming from, I just don't agree. :)
>
> Honestly, I don't think it would hurt the community at all if there
> were multiple implementations of the language. As long as the
> underlying functionality is still tied together. With that said,
> there's a reason people haven't done something like puppet with
> existing languages. When you provide all the features of a robust
> language, there are many many ways to do things (and shoot yourself in
> the foot). One of the great things about puppet is that it brings
> conventions to system administration. It obviously goes a little
> against our nature, but it brings tremendous shared benefit.

Very good point. The more ways you can do things, the more ways in which things can break. I believe that grows exponentially rather than linearly.

Conventions were absolutely one of the key goals we had and why we went with Puppet. Before Puppet, we already had a situation where each system administrator has his or her own preference for how to solve a problem. This led to the problem where no one else would know how to maintain someone else's system if they went on vacation, for instance. We wanted consistency in practice and we want the tool to help enforce that.

We haven't tried to do anything as complex as Nagios yet but we have rarely hit upon limitations in the language.
On Thu, May 8, 2008 at 11:08 PM, Digant C Kasundra <digant@stanford.edu> wrote:
> This is where I totally disagree. If you can do something using a
> definition which I would assume is using templates that are also specific
> to the provider you are using and a series of file copies and execs, why
> not make that a native type so at least it executes faster? Using
> definitions or templates still doesn't give you abstraction. In other
> words, you still can't create a definition that can handle all different
> kinds of firewalls, can you? So even if the native type is specific to
> iptables, at least it is declared as a resource, looks and feels like a
> resource, and hopefully does things in a smarter way than just a definition
> might do.

But I can make it look and feel like a resource!

define iptables($foo=bar, $baz=foot) {
  ...
}

iptables { "foo": }

In the case of iptables, I think we're in agreement. There is some value to be gained from making this a native type -- you'll be able to solve the ordering issues in a much more elegant way than you ever could with definitions. But really, Apache? Nagios? Is everything better off modeled as a native type? I just don't think so.

As for the speed issue, I'm willing to wager that I use a lot more templates than you do, and I don't have significant speed issues. If speed was the issue, lots of things can be done to make template rendering faster. (Erubis instead of ERB, caching of the pre-compiled template, etc.)

Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
On Fri, May 9, 2008 at 12:04 AM, Digant C Kasundra <digant@stanford.edu> wrote:
> Very good point. The more ways you can do things, the more ways in which
> things can break. I believe that grows exponentially rather than linearly.

I agree. I'm not sure that a syntax like:

file "foo" do
  owner "root"
  group "root"
  ensure "exists"
end

Really makes that much of a difference. The thing we all don't want is a shift backwards to the day when everyone wrote a script in whatever language they liked at the time to solve the problem of the moment.

> Conventions were absolutely one of the key goals we had and why we went with
> Puppet. Before Puppet, we already had a situation where each system
> administrator has his or her own preference for how to solve a problem.
> This led to the problem where no one else would know how to maintain someone
> else's system if they went on vacation, for instance. We wanted
> consistency in practice and we want the tool to help enforce that.

As is evidenced by the length of the best practices document, you still have to enforce a standard about how things should be done. Puppet gives you a powerful set of tools for doing it, and you can still get it wrong.

The important thing here is that, if you provide people with a powerful enough paradigm by which to do what they need, they will use that paradigm. It's the shortest path to getting what they want done.

> We haven't tried to do anything as complex as Nagios yet but we have rarely
> hit upon limitations in the language.

That is precisely why my message was so long. We have hit upon limitations of the language, and, in my opinion, of the "native types are always right, definitions are a hacky workaround" point of view. It takes a *long* time to get there! We have only a few parts of the complete set of manifests we build that don't fit within Puppet's Resource based world view. That's a beautiful thing!

We build infrastructure for people with 0 manual configuration for the most common cases, running production applications in Rails, Java, and PHP, doing things from Web 2.0 companies to electric car monitoring. For the 98% case, Puppet's language fits. For the two percent, it doesn't.

Why does this make me a heretic? I'm not saying the language sucks (although I do have some disagreements with it, just like I have disagreements with Perl, Ruby, Python, Shell and English), or that the Resource paradigm is bad. It's great! It just doesn't work in some cases, and I think that's okay.

Adam
On Thu, May 8, 2008 at 11:11 PM, Digant C Kasundra <digant@stanford.edu> wrote:
>> file { "foo":
>>   content => template("woot")
>> }
>>
>> $variable_i_need_in_foo = "something"
>
> Also why it is said often not to use variables this way, but to use
> definitions and native types. You can cleanly override them where variable
> scoping is a bit of a guess.

I'm all about definitions, Digant, and we use them all over the place. There are lots of examples of people trying to do things like this, though. How about when you distribute a module to other people? You often will wind up needing people to set some variables at the top level scope, since that's the only way to get module reuse done. (Otherwise, I have to edit all your modules directly, which diverges me from your mainline enough that I may not benefit from future changes.) What if I want to ship defaults?

The fact that variable scoping is mysterious doesn't mean you shouldn't use variables -- it means making it less mysterious is a good ticket. :)

Adam
On Thu, May 8, 2008 at 11:16 PM, Digant C Kasundra <digant@stanford.edu> wrote:
>> One way to fix this would be to implement an automatic require
>> statement for every resource that appears in an "include foo" line
>> inside a class for every resource within the class. At that point,
>> though, you might as well just ditch the graph altogether.
>
> Do you really have that big of a problem with this? I think there are some
> automatic dependency limitations that I would consider more bugs than
> limitations. There are some things that need to get ironed out. But for
> the most part, we don't rely on as much dependency stuff as one might
> expect. I think you're trying to use Puppet like it was CFengine or a
> script. Just use CFengine or a script if that's what you want. Seriously,
> if you don't want declarative, there are already tools that do that. Why
> use Puppet? And why make Puppet like tools that already do that?

There aren't tools that do that, actually. And I'm certainly not using Puppet like it was Cfengine or a script. :)

I gave a pretty exhaustive list of "Why use Puppet" at the top of that philosophy email. Puppet is a game changing piece of awesomeness. That awesomeness does not tend to come from the declarative language, in my opinion. It comes from the usefulness of this:

file { "something":
  owner => root
}

As the abstraction layer for a whole set of complicated interactions. The fact that, if I want that file to exist in a directory, I'm liable to write:

file { "my-directory":
  name   => "/foo",
  ensure => directory
}

file { "something-in-foo":
  name    => "/foo/something",
  ensure  => present,
  require => File["my-directory"]
}

As opposed to:

file { "something-in-foo":
  name    => "/foo/something",
  ensure  => present,
  require => File["my-directory"]
}

file { "my-directory":
  name   => "/foo",
  ensure => directory
}

Is where I'm heading.
The goal of the declarative nature of the language is not just that I declare "file" and the system figures out how to do something with it. That, I clearly love. The side-effect of a declarative language attempting to allow the declarations to self-organize into the proper execution order is, I think, significantly less useful (and more often a source of bugs than a saving grace.) How many people on this list have a complicated manifest set that always applies cleanly on the first pass? Convergence catches this for you over time, and you can absolutely get it right, but more often than not, it bites me instead of helping me.

> (apologies for breaking up the long long long long long email into multiple
> replies. Fascinating read but easier to digest in chunks).

Not a problem at all. It was a crazy long message, and it's far easier to discuss in pieces than as a whole.

Adam
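For readers following the ordering argument, here is a minimal Ruby sketch of what "self-organizing into the proper execution order" means for the two file resources above. It is a toy model built on Ruby's standard TSort module, not Puppet's actual graph code, and the Catalog class and its method names are invented for illustration.

```ruby
require 'tsort'

# A toy catalog: resource title => titles it requires. A simplified model
# for illustration only, not Puppet's internal graph implementation.
class Catalog
  include TSort

  def initialize
    @requires = {}
  end

  # Declare a resource and, optionally, the resources it requires.
  def add(title, requires: [])
    @requires[title] = requires
    self
  end

  # TSort callback: enumerate every declared resource.
  def tsort_each_node(&block)
    @requires.each_key(&block)
  end

  # TSort callback: enumerate the dependencies of one resource.
  def tsort_each_child(title, &block)
    @requires.fetch(title).each(&block)
  end

  # Dependencies come out before the resources that require them.
  def apply_order
    tsort
  end
end

directory_first = Catalog.new
  .add('my-directory')
  .add('something-in-foo', requires: ['my-directory'])

file_first = Catalog.new
  .add('something-in-foo', requires: ['my-directory'])
  .add('my-directory')

p directory_first.apply_order
p file_first.apply_order
```

Both catalogs yield the directory before the file, so declaration order stops mattering once the require is stated. Leave the require out and the sort has nothing to go on, which is where the first-pass failures described above tend to come from.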
> I gave a pretty exhaustive list of "Why use Puppet" at the top of that
> philosophy email. Puppet is a game changing piece of awesomeness.
> That awesomeness does not tend to come from the declarative language,
> in my opinion. It comes from the usefulness of this:

I think that should really find its way into the Wiki. Excellent read.

For those of us that have not hit that 98% wall, or in my case to see if I am looking at it, could you put up a manifest of one of those situations?

It has been quite informative to follow this discussion, even though I feel like the apprentice dining at the masters' table.

Evan
On Thu, May 8, 2008 at 11:59 PM, Digant C Kasundra <digant@stanford.edu> wrote:
> Finally getting to the heart of the discussion here....

Takes a while, huh? :) This is why it's always a hard conversation to have... we started to have it in person a few days ago, and I don't think we ever got this far.

>> All I'm doing is taking data I already know about and forcing it into
>> a semantic model that provides very little actual value. If my
>> infrastructure is using Nagios, I'm not learning about it because I
>> see a @nagios_host declaration in my puppet manifest. (I hope not!)
>> I'm not learning that this host is monitored by nagios, because the
>> assumption is that, either way, it will be (or why was I automating
>> nagios in the first place?)
>
> You mention this is data you already have outside of Puppet. If that is
> true, then yes, you could use a Nagios configuration generation tool. But
> not everyone has this data outside of Puppet nor does everyone want to
> manually compile this data outside of Puppet. What if it is just
> represented in manifests?

I would argue that this data actually belongs outside of Puppet, in some kind of external node tool, that's accessible for other tools to utilize. When I say "outside of puppet", what I mean is, the canonical place for representing it should not be as declarations of resources within a manifest. I lose much of my ability to use that data in the many tools that it could be helpful with. (I'm thinking of things like inventory management systems, meta-directories, and application deployment systems.)

I agree that not everyone has this data accessible outside of Puppet. Sometime in the next year, I would wager that most people will, as Luke will have completed work on a Reductive-stamped external node tool. If you want to solve the cases where you need knowledge of the external infrastructure, right now, using iClassify or LDAP to solve this problem for you is, in my opinion, your only viable option.
Where the real cognitive divergence happens is that I take this a step further, and say that it's actually the "right thing" that the world works this way. I *do* want some data outside of Puppet. It's not the right tool for many jobs. But those other tools could get large scale benefit from access to that information, and Puppet is just another one of those tools.

> A very good point. But most of the examples (90%?) I see of people relying
> heavily on templates don't involve such a complex case. Most of the stuff
> I see actually can be handled quite well with a declarative language and
> native types. Nagios and Apache configs, as we discussed the other night,
> are probably the kinds of things that require something beyond what Puppet
> can natively understand how to do. But that doesn't mean because it is
> necessary for these extreme cases, it should be considered the best
> practice.

It should be considered best practice for the 10% that can't be solved that way, though, instead of dismissed as a misunderstanding of the grand vision, which is what happens today. These aren't the extreme cases, either. They are the logical conclusion of building your infrastructure with a tool like Puppet in many cases. I want to get as close to 100% automation as I possibly can, and that means integration with other systems and tools. You already do this for Apache. Do you really want to configure Nagios by hand again?

Another side effect of having that class of tool (the ones that need access to a more global view of the infrastructure) utilize a different paradigm is the ease of redistribution. Right now, if you have a fully resource-based infrastructure, and you roll that up and distribute it, it's going to include your references to things like what nagios monitors you want. What if I'm using Hyperic? I can strip out your resource declarations, but now I've lost at least some of the utility involved in the redistribution.
If, however, you distribute a single "nagios" module that understands how to grab that data from an external source, all you need to do is document what variables need to be populated with data in order to render your monitoring system. Sometimes, it's better to silo.

> But this means that you're storing this knowledge in an external node tool,
> right? Because otherwise, you are having to duplicate this information in
> both the manifests and some kind of external tool.

Absolutely.

> And what if we could get export/collect to match that? Why are we giving
> up on this and just saying "it will never work, let's just work around it?"
> Why not try to bring Puppet up to where it *can* perform at those levels?

I'm not convinced that export/collect is actually the right way to approach the problem. I think that, for most of the possible use cases for it, all I'm doing is forcing external data into a resource construct that may or may not actually give me any semantic advantage (and in the case of distribution above, a disadvantage.)

I'm not at all opposed to bringing Puppet up to where it can perform at those levels. If people see the value in:

@ssh_key { $fqdn:
  ssh_key => $ssh_dsa_key
}

ssh_key <| |>

Then go nuts with it. The part I dislike about the current workaround is:

file { "ssh_host_keys":
  content => template(foo)
}

# in the template
<% ic = IClassify::Client.new(server, username, password) -%>
<% ic.search("ssh_dsa_key:*", [ "ssh_dsa_key" ]).each do |n| -%>
<%= n[:ssh_dsa_key] %>
<% end -%>

What sucks about that is I need to look at the template to see what's going on, and it's the exact problem you point out with this technique. If we could say:

$ssh_host_key_list = search("ssh_dsa_key:*", [ "ssh_dsa_key" ])

file { "ssh_host_keys":
  content => template(foo)
}

# template
<% ssh_host_key_list.each do |n| -%>
<%= n[:ssh_dsa_key] %>
<% end -%>

You've resolved that issue.
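To make the proposal above easier to picture, here is a rough, runnable Ruby sketch of the same split: the query happens outside the template, and the template only iterates over what it is handed. The search method below is a toy written for this example (it is neither iClassify's client nor an existing Puppet function), the INVENTORY data is invented, and the key strings are placeholders.

```ruby
require 'erb'

# Hypothetical inventory standing in for an external node tool such as
# iClassify. Attribute names and values are made up for this sketch.
INVENTORY = [
  { fqdn: 'web1.example.com', ssh_dsa_key: 'KEY-FOR-WEB1' },
  { fqdn: 'db1.example.com',  ssh_dsa_key: 'KEY-FOR-DB1' },
  { fqdn: 'old.example.com' } # no key recorded for this node
].freeze

# Toy version of the proposed search(): supports only "attribute:*"
# queries and returns the requested attributes of every matching node.
def search(query, attributes, inventory: INVENTORY)
  attribute, pattern = query.split(':', 2)
  raise ArgumentError, "only 'attr:*' queries are supported" unless pattern == '*'

  key = attribute.to_sym
  inventory.select { |node| node.key?(key) }
           .map { |node| node.slice(*attributes.map(&:to_sym)) }
end

# The template only iterates; it no longer knows where the data came from.
KNOWN_KEYS_TEMPLATE = <<~TEMPLATE
  <% ssh_host_key_list.each do |n| -%>
  <%= n[:ssh_dsa_key] %>
  <% end -%>
TEMPLATE

ssh_host_key_list = search('ssh_dsa_key:*', ['ssh_dsa_key'])
puts ERB.new(KNOWN_KEYS_TEMPLATE, trim_mode: '-')
        .result_with_hash(ssh_host_key_list: ssh_host_key_list)
```

With the lookup hoisted out, a reader of the calling code can see which data feeds the file without opening the template, which is the readability complaint being addressed.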
If you really, really want to declare each ssh key as a resource, because the above syntax isn't clear enough for you:

$ssh_host_key_list = search("ssh_dsa_key:*", [ "ssh_dsa_key" ])

foreach $ssh_host_key_list ($key) {
  ssh_key { $key:
    ssh_dsa_key => $key
  }
}

And if loops are too much:

ssh_host_keys { "something":
  keys => search("ssh_dsa_key:*", [ "ssh_dsa_key" ])
}

Lots of ways to make that better, that don't involve modeling each discrete piece of data as a resource per-node.

> But as I also said, looking at a template doesn't tell you with any greater
> assurance either. The only way to be 100% sure is to go look on the Nagios
> server.

Exactly. Template or Resource declaration, the only way to be sure is to look at Nagios.

> Yup. If I want to check if we're setting up monitoring for Apache, I want
> to look at manifests related to configuring Apache. I don't want to jump
> from template to template to see if I've got all the wheels doing the right
> things. That's one of the major buy-in points for going with Puppet.

See the syntax above, and my questions about distribution. Do you *really* want that embedded in your Apache class? We do this now for anything that is client-side configured. But when I think about making what we do useful outside of HJK, I start to reconsider that point of view.

> Yeah, because that leads to unnecessary complexity. I guess the question
> really then becomes when does Puppet just become Ruby?

I think that is a great question.

> I'd love to see an example of what you're proposing would be necessary to
> make a megatemplate work that you couldn't do with existing classing or
> other Puppet constructs. I think that would help me understand what you
> are trying to get after.
> I too often hear people say "I can't live unless
> Puppet lets me do X" but they never say what they want to do with it and
> often, as others have said, when you look at this from a Puppet-way, you
> didn't need X to actually accomplish the goal, it just wasn't being looked
> at from the right angle.

Oh, I can live. I'm living quite fine doing it the way I am... but I'm not doing it this way because I misunderstand something fundamental about how Puppet works. I get how it's pictured to work, I just don't think it's the right way to solve a particular class of systems management problems.

> It seems to me this may be the major failing of your argument: maybe a
> clear example of how something absolutely cannot be done the Puppet way
> would help expand things. What kind of expansion in Puppet's language is
> needed to put logic in manifests for use by templates?

See above.

> I'm still not buying the leap from "you can't make abstract native types"
> to "you shouldn't do things declaratively." There is a major difference
> between declaring an iptables specific native type instead of a firewall
> native type and using that declaratively, and using a big iptables
> template.

It's more about "sometimes, you shouldn't do things entirely declaratively". It's not "you shouldn't do things declaratively". That's why I was so careful to outline the places that declarative syntax fits so well. It's a huge swath, just not all of 'em.

> But you are right, native types may not always be the right answer. We do
> most certainly maintain apache configs as files and use definitions to
> simply drop in and enable those sites. I don't know if it would be
> feasible to represent all the inner workings of an apache config in
> declarations.

I'm willing to go out on a limb and say "No". :)

> But even with these two examples, I still think there can exist a myriad of
> native types that will have practical utility.
> I don't think the complexity of
> Nagios and Apache means we need to abandon that push.

Absolutely. I never argued otherwise.

In the case that brought this thread up, we were talking about making decisions in a template based on what the larger infrastructure looked like. (Sudoers, in this case.) Ashley got pointed in the direction of using definitions and snippets, or writing a native type, to best solve the problem. I think those suggestions are probably the correct ones. However, that doesn't change the fact that the approach he was taking to that problem was, in terms of capabilities, a totally reasonable one. You want that functionality eventually, especially as the complexity of what you are configuring increases, or as its need for a greater view of the infrastructure rises. You may think he was going about that problem wrong, but I think he probably should have been able to pull it off. Sometimes, you need that stuff.

> I don't think this philosophical difference will get settled. It is very
> fair to say each side has good points. But I'm more interested in native
> types and even native types for meta data, and modeling everything in
> manifests, with the reliance upon some kind of external node tool
> eventually to map nodes to a class (to replace the nodes.pp file,
> essentially). I'm expecting to hit performance issues and I hope to be
> able to help Reductive Labs with resources to help iron those out.

I put forth that you want more than just mapping nodes to classes. You need the single, centralized source of raw information about the infrastructure, including how it's been mapped to classes. Once you have that, the desire for certain types of information to be encoded as resources in the manifest fades, since it's no longer giving you much practical utility.

I agree that I don't think the differences will get settled, but it's an interesting conversation regardless.
Thanks for your well reasoned feedback, and your continuing contributions to the Puppet community, Digant.

Regards,
Adam
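The "single, centralized source of raw information" idea in the message above can be sketched in a few lines of Ruby. This assumes, as I understand Puppet's external node interface, that a node script prints a YAML document with "classes" and "parameters" keys. The NODE_DATA store, its contents, and both helper methods are hypothetical.

```ruby
require 'yaml'

# Hypothetical central inventory: raw facts about each node plus the
# classes it has been mapped to. All names and values are made up.
NODE_DATA = {
  'web1.example.com' => {
    'classes'    => %w[apache nagios::target],
    'parameters' => { 'datacenter' => 'sea1', 'monitored' => 'true' }
  },
  'db1.example.com' => {
    'classes'    => %w[mysql nagios::target],
    'parameters' => { 'datacenter' => 'sea1', 'monitored' => 'true' }
  }
}.freeze

# Shape one node's data as a YAML document with "classes" and
# "parameters" keys, the form an external node script would print.
def classify(fqdn, data = NODE_DATA)
  node = data.fetch(fqdn) { { 'classes' => [], 'parameters' => {} } }
  YAML.dump({ 'classes' => node['classes'], 'parameters' => node['parameters'] })
end

# The same store can answer questions Puppet never asks, for example
# for an inventory system or an application deployment tool.
def nodes_with_class(name, data = NODE_DATA)
  data.select { |_fqdn, node| node['classes'].include?(name) }.keys
end

puts classify('web1.example.com')
p nodes_with_class('nagios::target')
```

The second helper is the part that matters for the argument: once the mapping lives in one queryable place, tools other than Puppet can read it without parsing manifests.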
On Fri, May 9, 2008 at 10:24 AM, Adam Jacob <adam@hjksolutions.com> wrote:
> I gave a pretty exhaustive list of "Why use Puppet" at the top of that
> philosophy email. Puppet is a game changing piece of awesomeness.
> That awesomeness does not tend to come from the declarative language,
> in my opinion. It comes from the usefulness of this:
>
> file { "something":
>   owner => root
> }

We're having a definition-mismatch here. When I say "it doesn't come from the declarative language" above, I should have said: "It doesn't come from the underlying assumption that discrete resources should organize themselves into a directed graph with a topological sort". It most certainly *does* come from the declarative aspect of the language.

Adam
On May 9, 2008, at 10:59 AM, Adam Jacob wrote:> >> I don''t think this philosophical difference will get settled. It >> is very >> fair to say each side has good points. But I''m more interested in >> native >> types and even native types for meta data, and modeling everything in >> manifests, with the reliance upon some kind of external node tool >> eventually to map nodes to a class (to replace the nodes.pp file, >> essentially). I''m expecting to hit performance issues and I hope >> to be >> able to help Reductive Labs with resources to help iron those out. > > I put forth that you want more than just mapping nodes to classes. > You need the single, centralized source of raw information about the > infrastructure, including how it''s been mapped to classes. Once you > have that, the desire for certain types of information to be encoded > as resrouces in the manifest fades, since it''s no longer giving you > much practical utility.I agree with this, mainly because I don''t think it can be argued that puppet can accurately model an entire infrastructure yet. Configuration of routers, switches, disk arrays, and other "meta- infrastructure" pieces need to be accounted for as well. An external tool will most likely always be necessary. Being able to relate and add semantics to an overall infrastructure is invaluable. But that doesn''t mean it can''t be a part of the Puppet community, it''s just an extension of it. Of course, anyone can choose to fork off and do whatever they want, but the real value in all of this is the community. The knowledge sharing and collaboration that has been almost totally absent from the Systems Administration world.> > I agree that I don''t think the differences will get settled, but it''s > an interesting conversation regardless.I would go so far as to say that they don''t need to get settled. It''s an ecosystem, and that''s a good thing. 
-Blake
On Fri, May 9, 2008 at 12:08 PM, Blake Barnett <shadoi@gmail.com> wrote:
> I agree with this, mainly because I don't think it can be argued that puppet can accurately model an entire infrastructure yet. Configuration of routers, switches, disk arrays, and other "meta-infrastructure" pieces needs to be accounted for as well. An external tool will most likely always be necessary. Being able to relate and add semantics to an overall infrastructure is invaluable. But that doesn't mean it can't be a part of the Puppet community, it's just an extension of it.

And there are some human things that you might not ever be able to model easily with Puppet, such as what port an employee's phone is on, which relates to how you configure the PBX and Voicemail system, which ties back to, say, how you configure Asterisk.

> Of course, anyone can choose to fork off and do whatever they want, but the real value in all of this is the community. The knowledge sharing and collaboration that has been almost totally absent from the Systems Administration world.

Well, even if there were tools other than Puppet, it wouldn't shatter that community. Much of what exists in Puppet (especially in the providers) would be an invaluable treasure trove of code to anyone who was writing another similar tool. It's one of the reasons I'm glad Puppet is GPL, and not a BSD license... if people want to re-use that knowledge, they need to give credit where it's due and continue that legacy of open sharing.

Puppet moved the state of the art forward in the same ways that Space Flight changed aviation, and a whole host of ancillary technologies. It's good to see what happens when we approach things in different ways.

> I would go so far as to say that they don't need to get settled. It's an ecosystem, and that's a good thing.

The more I talk to you, Blake, the more I like you.
:) +1

Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
On May 8, 2008, at 5:12 PM, Adam Jacob wrote:
> = Puppet is Myopic by default
>
> For those of you who have read this far, this is where Luke and I really start to diverge philosophically.

I'm not convinced we diverge as much as you think. I appreciate your leading disclaimers; no worries that I thought you were trashing Puppet, but it's good to have it reiterated for others.

I've drastically cut down your email, and am trying to only reply to the heart of it. I don't have a ton of time for 4000 emails right now, and this is the most important bit. I will try to go through and make notes where appropriate in a second pass.

> At the moment, the majority of Puppet users are utilizing it in a myopic way. For any given run, a node knows all about itself, but it doesn't know about its neighbors. So how do we handle configuring services which require knowledge about the infrastructure at large, instead of just my one node?

I completely agree that this is a fundamental problem, and it's one I've been struggling to address for most of my career using larger tools.

What might surprise you is that I find export/collect to not be sufficient. It's my best-effort attempt, but that doesn't make it good enough. But...

This is where you see disagreement but I see opportunity.

Architecturally, you really only have problems with the 'export' part of 'export/collect'. You don't want to specify all of your resources in Puppet's language, because it's more natural to do in a database. Yes, implementation-wise, you don't like how collection works (the db, etc.), but that's a bug, not a feature; architecturally, you're just looking to put your nagios configurations in a database instead of specifying them in a manifest.

Fine, put them in a database. But instead of using templates to do the query, and then translating them into chunks of opaque text, use Puppet to do the query, and it will translate them into resources. Yay!
As an example, I just created a tool for a client that accepts YAML on stdin and uses it to create resources in the database. Then I have a collect statement in the language that pulls those resources out of the db. So, this tool replaces the 'export' functionality, but we still use 'collect'.

Of course, there are implementation problems stopping this from happening. You'd need to store all of your configurations in the db, which is slow (even worse than your estimate) and silly. But the solution here is for you to spend some of that time that Puppet has saved you on fixing the implementation, rather than seeing philosophical differences that don't exist.

You seem pretty fixated on Puppet's language as its primary feature, so when the language isn't the answer, you seem to think Puppet has kind of lost its use. I, instead, see the RAL, and generally modeling things as resources, as its main advance, with the language as the primary enabler.

In a few years, I see a significant portion of resources being specified in external data sources, or calculated using decision engines, rather than being in the language at all.

The only way we're going to get there, though, is if people spend their development time making Puppet better, rather than working around it. You have all of these templates, but they're all built to your site, which makes them the exact kind of script you don't want to go back to. If you use resources, and just have drivers to hook Puppet's resource querying to your db, then you're doing less work and more of it is reusable by others.

Of course, you listed out other parts of Puppet you thought sucked, and I hope to be able to reply to those, but this is the most important bit.

--
You've got to take the bitter with the sour.
-- Samuel Goldwyn
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
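The YAML-to-resources idea in the message above can be sketched roughly as follows. This is not the tool Luke describes; the YAML layout, the `Resource` struct, and the in-memory `store` (standing in for the database) are all assumptions made for illustration, and `collect` here is just a filter by type, not Puppet's collection syntax.

```ruby
require "yaml"

# Hypothetical resource record: type, title, and a hash of parameters.
Resource = Struct.new(:type, :title, :params)

# Parse YAML shaped like { type => { title => { param => value } } }
# into a flat list of Resource records. This layout is an assumption.
def resources_from_yaml(text)
  data = YAML.safe_load(text) || {}
  data.flat_map do |type, instances|
    (instances || {}).map do |title, params|
      Resource.new(type, title, params || {})
    end
  end
end

# Stand-in for the collect side: every stored resource of one type.
def collect(store, type)
  store.select { |resource| resource.type == type }
end

# A stdin-driven tool would pass $stdin.read here instead of a literal.
input = <<~YAML
  nagios_host:
    web1:
      address: 10.0.0.1
    db1:
      address: 10.0.0.2
  user:
    adam:
      shell: /bin/bash
YAML

store = resources_from_yaml(input)
collect(store, "nagios_host").each do |resource|
  puts "#{resource.type} #{resource.title} #{resource.params.inspect}"
end
```

The point of the sketch is only the division of labour: something outside the manifest produces resource-shaped records, and the consumer asks for them by type.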
This is my attempt to reply to issues on this thread that are small and separate from the heart of the matter. I'm cutting everything that isn't specifically needing a reply, and I'm joining all replies into this one email.

On May 8, 2008, at 5:12 PM, Adam Jacob wrote:
> Or, put another way, as the complexity of the thing you are abstracting increases, the utility of abstracting it falls. The sweet spot in the case of web servers is to use definitions, along with the fundamental building blocks, to automate the process of configuring them. It provides huge semantic benefit for comparatively little effort. Building a native web_server provider provides an interesting benefit (the ability to use different web servers transparently,) but at a cost of lost functionality and a huge amount of effort.

I agree with this -- you're always going to struggle to determine whether a given resource can be abstracted or not. This is more a criticism of how unnecessarily different software is, IMO, and forcing an abstraction onto it is often pretty damn helpful. It's always a judgement call, though; I'm seldom going to castigate someone for their decision.

> == Puppet's language is declarative (except when it isn't) =

Generally, I think this criticism is essentially unrelated to this thread. I agree that this is a bug, but if you want to discuss it, it belongs in a new thread (or a ticket).

> The reasoning behind this is based on the assumption that, as complexity increases, relying on the order in which resources are specified to determine when things should happen becomes impossible to manage. With 10 things, it's easy to put them in a line. With 100 things, it's harder. With 1000 things, it's quite hard. By using dependencies, you only have to say "this requires that". You don't have to worry about where "this" or "that" slot in to the grand scheme of things.

I just can't agree less, here.
The *only* scalable way to manage large numbers of related resources is via local relationship graphs which all merge into a single directed graph.

> Many people have been bitten by this:
>
> file { "foo":
>   content => template("woot")
> }
>
> $variable_i_need_in_foo = "something"
>
> You can see the thought process here... I need to model this file as a template. Oh, and it needs $variable_i_need_in_foo. That should work, you figure, because Puppet is declarative.

It's probably more correct to call its compiled catalogs declarative. I have not done a good job of making the language itself very declarative, in many ways. But again, those are unrelated bugs.

> Finally, I think the nail in the declarative coffin is this. As the complexity of your implementation increases, it is harder to keep the explicit statements of dependency in order than it is to slot things into a line. The reason?
>
> Most of the time, it doesn't matter where something gets slotted in.

Wait... because it doesn't matter where something gets slotted in, it's harder to use a system that only requires you specify the relationships that matter?

Instead, you'd prefer that every resource had implied relationships based on its order in the manifests? And you think this would be easier?

> If I have a thousand things, very few of them *actually care about where they fit*. They just need to happen after a few other things. If Puppet was not declarative, but simply executed resources in the order they appear in the manifest (based on the order the top-level classes are assigned to the node) the problem would become one of ensuring that each class state explicitly what *other* classes need to have been applied before it can be run.
> Which you have to do anyway, if you want to ensure that your complete configuration can be delivered in a single run.

This is the road to hell, I absolutely promise you.

> One way to fix this would be to implement an automatic require statement for every resource that appears in an "include foo" line inside a class for every resource within the class. At that point, though, you might as well just ditch the graph altogether.
>
> = Puppet is Myopic by default

It's more correct to say it's easier to talk about a single host than sets of hosts.

> If you accept that Puppet is, or should be, the single canonical and authoritative place where all the information about how your systems architecture should be built is, then the @sshkey statement seems to make some sense.

I've not only not said this, I've recommended external node sources, and multiple times posited external resource sources.

> 2) You can't export/collect non-native types.

You can now, as of 0.24.3 or so.

On May 8, 2008, at 11:19 PM, Adam Jacob wrote:
> Unlike Luke, I'm not convinced that a similar syntax built on top of a fully functional programming language wouldn't solve many of these issues, while reducing the complexity of the run-time engine as a whole.

As Blake pointed out, I've experimented both ways. The community seems to prefer the external DSL, and AFAICT, literally *no one* has ever looked at my pure Ruby internal DSL that attempts to mimic Puppet's language. I clearly don't have any religious biases against an internal DSL -- I've implemented one, and no one seems to care.

> The "include apache" statement could ensure that all the resources it lists are applied before the resources specified within class monkey.
> This is a good habit to get in to in general with puppet, since if you require one functional element to be available before another, you're likely doing this kind of "include something everywhere I need a reference to a resource it creates" activity regardless.

Again, this is basically unrelated to the heart of the thread (managing sets of hosts). This feature would be easy to add. Really, you'd probably want to add a super-set function: something that both included the class and set up a relationship. This is *clearly* possible, given that custom functions have been around for ages. If you want it, add it, and submit it as an enhancement. Even better, start a thread on -dev about whether this makes sense for all cases of 'include', or whether we should reassess these functions entirely. The 'include' function and the 'require' metaparam have gotten unwieldy and counter-intuitive; it's probably time to rethink them.

> I'm not certain extending the type system is the answer. I think extending the access to the data you would need to compose those types of resources is the answer, although the two aren't mutually exclusive. Essentially, storing the data from each node as a resource you then later utilize is, I think, actually less useful than simply creating the end resource from the data. The level of benefit goes way down as the complexity of the abstraction increases like this.

You've got my answer for this whole thread right here; your mistake was assuming I didn't agree with you. This is *exactly* what Puppet's query system is supposed to enable, but its implementation has been poor enough that we haven't been able to really experiment.

> Two things I would like to see that I'm fairly certain Luke would rather not have:
>
> 1) The ability to create arbitrary Ruby data-structures from within a manifest for use in a template

I'm not particularly against this, really.
My biggest problem is that it will tie Puppet's language to Ruby more than it already is, and my second biggest problem is that about ten seconds after you add this, people will ask, "why even use an external DSL?", at which point all of your effort was wasted. I've built an internal DSL no one has used. No one has ever provided a patch that adds support for hashes, regexes, or anything else like this to Puppet's language.

> 2) A saner default for relationships, or a removal of the directed graph altogether

The directed graph is never going away, I can promise you that. I started without it, and I can tell you, it's 100000x better. Having more automatic relationships is a reasonable idea, and one you should bring up on the dev list.

> 3) The full set of common language constructs, such as loops, expanded conditionals, etc.

My biggest problem here is that 99% of these would be used to make files, not resources. If I could somehow add code that noticed people were managing text instead of resources, it'd be fine, but at this point, nearly everyone starts with the same mistake, and I don't want to enable it. Really. I know you think this is ok, but I think it could really kill Puppet.

If you want that ability, put it in a module or a template. It's not that hard, and it forces a clean separation between your resources and your text.

--
It's a small world, but I wouldn't want to paint it. -- Stephen Wright
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
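For readers unfamiliar with the "local relationship graphs which all merge into a single directed graph" idea from the message above, here is a small sketch using Ruby's standard `TSort` module. The `Catalog` class, its `add` method, and the resource names are hypothetical and are not Puppet's internals; the sketch only shows that a handful of local "this requires that" statements is enough to derive a full ordering.

```ruby
require "tsort"

# Hypothetical catalog: each resource name maps to the resources it requires.
class Catalog
  include TSort

  def initialize
    @requires = Hash.new { |hash, key| hash[key] = [] }
  end

  # Record a local relationship: `resource` requires each entry in `requires`.
  def add(resource, requires: [])
    @requires[resource] # make sure the resource itself is a node
    requires.each do |dependency|
      @requires[dependency] # dependencies are nodes too
      @requires[resource] << dependency
    end
    self
  end

  # TSort hooks: how to walk the nodes and each node's dependencies.
  def tsort_each_node(&block)
    @requires.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @requires[node].each(&block)
  end
end

catalog = Catalog.new
catalog.add("Package[apache]")
catalog.add("File[httpd.conf]", requires: ["Package[apache]"])
catalog.add("Service[apache]", requires: ["File[httpd.conf]", "Package[apache]"])
catalog.add("User[adam]") # no relationships: it can go anywhere

# tsort lists every dependency before the resource that needs it,
# and raises TSort::Cyclic if the relationships form a loop.
puts catalog.tsort
```

Only the three apache resources constrain one another; `User[adam]` carries no relationships at all, which is the "most of the time it doesn't matter" case both sides of the thread agree exists.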
On Mon, May 12, 2008 at 11:59 AM, Luke Kanies <luke@madstop.com> wrote:
> I'm not convinced we diverge as much as you think. I appreciate your leading disclaimers; no worries that I thought you were trashing Puppet, but it's good to have it reiterated for others.

That was primarily who it was targeted at. :)

> I've drastically cut down your email, and am trying to only reply to the heart of it. I don't have a ton of time for 4000 emails right now, and this is the most important bit. I will try to go through and make notes where appropriate in a second pass.

Cool.

> I completely agree that this is a fundamental problem, and it's one I've been struggling to address for most of my career using larger tools.
>
> What might surprise you is that I find export/collect to not be sufficient. It's my best-effort attempt, but that doesn't make it good enough. But...
>
> This is where you see disagreement but I see opportunity.

We both see opportunity, I think. :)

> Architecturally, you really only have problems with the 'export' part of 'export/collect'. You don't want to specify all of your resources in Puppet's language, because it's more natural to do in a database. Yes, implementation-wise, you don't like how collection works (the db, etc.), but that's a bug, not a feature; architecturally, you're just looking to put your nagios configurations in a database instead of specifying them in a manifest.

Not really -- I'm saying my Nagios configuration is actually driven by data that has nothing to do with Nagios at all, for the most part. It's information about what systems I have, what they should be doing, etc. It's true that I'm not certain that specifying that data (again) in a manifest is a win.

> But instead of using templates to do the query, and then translating them into chunks of opaque text, use Puppet to do the query, and it will translate them into resources.
> Yay!

I'm not convinced that modeling the nagios configuration as resources gets me much farther than having the external data source available to render the template, which is essentially what the native resource type is doing anyway at the end of the game. Essentially, I'm saying that the template isn't opaque -- the template is quite explicit about what it's doing (configuring Nagios.) Nagios has already defined the way it wants to be spoken to (through its own configuration language.) The difficulty in tools automating tools like Nagios is where you get the data to configure them, because the breadth of knowledge required is so wide. See my replies to Digant for ways we might extend Puppet's language to better support that kind of thing.

> As an example, I just created a tool for a client that accepts YAML on stdin and uses it to create resources in the database. Then I have a collect statement in the language that pulls those resources out of the db. So, this tool replaces the 'export' functionality, but we still use 'collect'.

Like I say above, I'm not sure that I even want to 'collect' them. As the complexity increases, the overhead of modeling those entities as resources increases, and the flexibility of using templates becomes stronger. Essentially, what it takes to model:

<% node_list.each do |node| %>
define host {
  use generic-host ; Name of host template to use
  host_name <%= node["hostname"] %>
  alias <%= node["fqdn"] %>
  address <%= node["ipaddress"] %>
}
<% end %>

In a template is pretty straightforward. If you could do this in a manifest:

node_list = search("tag:monitor")
file { "nagios-hosts-cfg":
  ..
}

You have, in my opinion, pretty well solved the problem. The fact that File["nagios-hosts-cfg"] is rendering a template is not, to me, a flaw in any way. The other way to do this would be to model everything as resources, and have my lookup be actually returning those pre-modeled resources..
which I have to admit, I think is a waste.

> Of course, there are implementation problems stopping this from happening. You'd need to store all of your configurations in the db, which is slow (even worse than your estimate) and silly. But the solution here is for you to spend some of that time that Puppet has saved you on fixing the implementation, rather than seeing philosophical differences that don't exist.

We both see the philosophical differences, it's not just me. :) The reason these conversations get taken so easily out of context is that we diverge pretty late in the game, which is why my original post was so long.

> You seem pretty fixated on Puppet's language as its primary feature, so when the language isn't the answer, you seem to think Puppet has kind of lost its use. I, instead, see the RAL, and generally modeling things as resources, as its main advance, with the language as the primary enabler.

I know. It is, in my opinion, actually the language. The cross-platform nature of the abstraction is great, in that:

user { "adam": .. }

Just does the right thing on every platform it's available. But being able to say:

user { "adam": .. }

At all was where the earth really shattered. Modeling the world as resources is valuable, but I think it breaks down at a certain point. At that point, the language still enables me to get the job done.

> In a few years, I see a significant portion of resources being specified in external data sources, or calculated using decision engines, rather than being in the language at all.

Really? I'm pretty sure I'm in the class of people who would use something like that, and I'm not sure I really see the need. For example, let's say you live in a world where you want HR to own the creation of "People" within the company. That means, in the best case, HR has some silo someplace of the information that is canonical for "People". So you want Puppet to automatically create User entries for each Person.
In my world, I think being able to query HR for the data, then create the resulting resource is where the goodness is. That could be external to puppet (a decision engine that injects puppet resources outside of the manifest), but why?

search_hr("people")
search_hr("people").each do |p|
  user { p.name:
    ...
  }
end

(Sorry for the Puppet/Ruby pastiche, but you see where I'm going)

Shipping the resources around as the thing you store is, I think, far less valuable than letting me create the resources based on the data I have. Does that make sense?

> The only way we're going to get there, though, is if people spend their development time making Puppet better, rather than working around it. You have all of these templates, but they're all built to your site, which makes them the exact kind of script you don't want to go back to. If you use resources, and just have drivers to hook Puppet's resource querying to your db, then you're doing less work and more of it is reusable by others.

To a degree. Templates are not inherently un-redistributable -- they are in fact quite the opposite. The abstraction into resources does make things more portable for many cases, but not all cases. When you're dealing with Nagios, Apache, or even Sudoers, the truth is the Template is just as portable as the provider or definition would be.

I agree that Puppet could use some extending to make this better -- see my response to Digant with some proposed syntax for doing just that.

Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
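To make the template side of the argument concrete, here is a minimal runnable version of the Nagios host template from the message above, rendered with Ruby's standard ERB library. The `render_nagios_hosts` helper and the hard-coded node list are assumptions for illustration; in particular the list stands in for the proposed `search("tag:monitor")` call, which is syntax Adam is asking for and not something Puppet provides here.

```ruby
require "erb"

# The host template from the thread, lightly adapted: with trim_mode "-",
# a tag ending in -%> swallows the newline that follows it.
NAGIOS_HOSTS = ERB.new(<<~TEMPLATE, trim_mode: "-")
  <% node_list.each do |node| -%>
  define host {
      use        generic-host
      host_name  <%= node["hostname"] %>
      alias      <%= node["fqdn"] %>
      address    <%= node["ipaddress"] %>
  }
  <% end -%>
TEMPLATE

# Render one "define host" stanza per node hash.
def render_nagios_hosts(node_list)
  NAGIOS_HOSTS.result_with_hash(node_list: node_list)
end

# Stand-in for an external data source; these values are made up.
nodes = [
  { "hostname" => "web1", "fqdn" => "web1.example.com", "ipaddress" => "10.0.0.1" },
  { "hostname" => "db1",  "fqdn" => "db1.example.com",  "ipaddress" => "10.0.0.2" }
]

puts render_nagios_hosts(nodes)
```

Everything site-specific lives in the node list; the template itself only knows the Nagios file format, which is the portability claim being made in the message.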
On May 12, 2008, at 3:15 PM, Adam Jacob wrote:
>> Architecturally, you really only have problems with the 'export' part of 'export/collect'. You don't want to specify all of your resources in Puppet's language, because it's more natural to do in a database. Yes, implementation-wise, you don't like how collection works (the db, etc.), but that's a bug, not a feature; architecturally, you're just looking to put your nagios configurations in a database instead of specifying them in a manifest.
>
> Not really -- I'm saying my Nagios configuration is actually driven by data that has nothing to do with Nagios at all, for the most part. It's information about what systems I have, what they should be doing, etc. It's true that I'm not certain that specifying that data (again) in a manifest is a win.

Sorry; s/database/external data source or whatever/.

>> But instead of using templates to do the query, and then translating them into chunks of opaque text, use Puppet to do the query, and it will translate them into resources. Yay!
>
> I'm not convinced that modeling the nagios configuration as resources gets me much farther than having the external data source available to render the template, which is essentially what the native resource type is doing anyway at the end of the game. Essentially, I'm saying that the template isn't opaque -- the template is quite explicit about what it's doing (configuring Nagios.) Nagios has already defined the way it wants to be spoken to (through its own configuration language.)

This is a fundamental disagreement, then. The only reason, IMO, that you view Nagios resources as chunks of text and users as resources is that you're conditioned to this by the existence of user tools and lack of Nagios tools.
Of course the template is explicit; the problem is that it includes data, formatting, query information, etc.

> The difficulty in tools automating tools like Nagios is where you get the data to configure them, because the breadth of knowledge required is so wide. See my replies to Digant for ways we might extend Puppet's language to better support that kind of thing.

I don't think this is difficult at all. At the end of the day, Puppet queries a smart system that has access to all of the knowledge and can apply domain-specific rules. Puppet's queries are relatively dumb; the data repository (or, more likely, the interface to it) is relatively smart.

> Like I say above, I'm not sure that I even want to 'collect' them. As the complexity increases, the overhead of modeling those entities as resources increases, and the flexibility of using templates becomes stronger. Essentially, what it takes to model:
>
> <% node_list.each do |node| %>
> define host {
>   use generic-host ; Name of host template to use
>   host_name <%= node["hostname"] %>
>   alias <%= node["fqdn"] %>
>   address <%= node["ipaddress"] %>
> }
> <% end %>
>
> In a template is pretty straightforward. If you could do this in a manifest:
>
> node_list = search("tag:monitor")
> file { "nagios-hosts-cfg":
>   ..
> }
>
> You have, in my opinion, pretty well solved the problem. The fact that File["nagios-hosts-cfg"] is rendering a template is not, to me, a flaw in any way. The other way to do this would be to model everything as resources, and have my lookup be actually returning those pre-modeled resources.. which I have to admit, I think is a waste.

Everything is a waste. Ruby is a huge waste. Using user resources is a waste. Using package resources is a waste. The question is, is that waste worth what you get in return?
For some reason, you think the answer is "yes" when it comes to some resources stored in files (e.g., users) but not other resources stored in files (e.g., nagios configurations). I just don't see a fundamental difference between users and nagios configurations; all I see is a cultural difference in how we see them because of tradition and available tools.

>> Of course, there are implementation problems stopping this from happening. You'd need to store all of your configurations in the db, which is slow (even worse than your estimate) and silly. But the solution here is for you to spend some of that time that Puppet has saved you on fixing the implementation, rather than seeing philosophical differences that don't exist.
>
> We both see the philosophical differences, it's not just me. :) The reason these conversations get taken so easily out of context is that we diverge pretty late in the game, which is why my original post was so long.

Ok.

> At all was where the earth really shattered. Modeling the world as resources is valuable, but I think it breaks down at a certain point. At that point, the language still enables me to get the job done.

Right. C allows you to escape to assembly, Ruby allows you to escape to the shell, and no one would argue that it's bad to be able to do that. What I'm arguing is, it's incorrect to act like you're not doing the equivalent of escaping to the shell. You might as well have backticks throughout your manifests, from my perspective.

>> In a few years, I see a significant portion of resources being specified in external data sources, or calculated using decision engines, rather than being in the language at all.
>
> Really? I'm pretty sure I'm in the class of people who would use something like that, and I'm not sure I really see the need. For example, let's say you live in a world where you want HR to own the creation of "People" within the company.
> That means, in the best case, HR has some silo someplace of the information that is canonical for "People". So you want Puppet to automatically create User entries for each Person.
>
> In my world, I think being able to query HR for the data, then create the resulting resource is where the goodness is. That could be external to puppet (a decision engine that injects puppet resources outside of the manifest), but why?
>
> search_hr("people")
> search_hr("people").each do |p|
>   user { p.name:
>     ...
>   }
> end
>
> (Sorry for the Puppet/Ruby pastiche, but you see where I'm going)
>
> Shipping the resources around as the thing you store is, I think, far less valuable than letting me create the resources based on the data I have. Does that make sense?

This syntax already exists:

User <<| |>>

There's your query. Now you just need to extend it to support specifying a source; maybe:

User(hr) <<| |>>

Not sure, really. But how is your syntax better than mine? More importantly, how is your syntax fundamentally different, other than that you're apparently expecting the interface to produce a struct and I'm expecting it to produce a resource? In the RESTian world, we *should* be shipping around resources, not abstract data types.

>> The only way we're going to get there, though, is if people spend their development time making Puppet better, rather than working around it. You have all of these templates, but they're all built to your site, which makes them the exact kind of script you don't want to go back to. If you use resources, and just have drivers to hook Puppet's resource querying to your db, then you're doing less work and more of it is reusable by others.
>
> To a degree. Templates are not inherently un-redistributable -- they are in fact quite the opposite. The abstraction into resources does make things more portable for many cases, but not all cases.
> When you're dealing with Nagios, Apache, or even Sudoers, the truth is the Template is just as portable as the provider or definition would be.

It's a shell escape. It might be portable, and probably is, but it's not generally portable, and, as with all shell escapes, it tends to get more difficult as your problems scale.

If you have clean abstraction layers, then you can enforce compatibility at the interfaces between those layers. Templates skip through the whole system, which means you have no means of enforcing anything.

If any of those formats changes, or (for example) you start using the database-stored configurations for Nagios, all of your templates are useless.

> I agree that Puppet could use some extending to make this better -- see my response to Digant with some proposed syntax for doing just that.

All of that syntax just seems built to support your data sources returning arbitrary data structures instead of a standardized Puppet resource. Why would you do this, instead of having a standard resource format that got automatically turned into normal resources by the parser?

--
People are more violently opposed to fur than leather because it is safer to harass rich women than motorcycle gangs.
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---
On May 12, 2008, at 12:32 PM, Luke Kanies wrote:
>> Finally, I think the nail in the declarative coffin is this. As the complexity of your implementation increases, it is harder to keep the explicit statements of dependency in order than it is to slot things into a line. The reason?
>>
>> Most of the time, it doesn't matter where something gets slotted in.
>
> Wait... because it doesn't matter where something gets slotted in, it's harder to use a system that only requires you specify the relationships that matter?
>
> Instead, you'd prefer that every resource had implied relationships based on its order in the manifests? And you think this would be easier?

My main issue with inconsistency is that it's inconsistent. A configuration should be applied the same way from run 1 to run 2 every single time. Period. We all know that building large scale systems requires repeatability. The order in which a configuration is applied MUST be the same every time or it's not repeatable.

It's very easy to make claims that 'it shouldn't matter' if the dependencies are defined correctly, but the reality is that sometimes small things have a big impact. Because the order is not always the same unless you create a huge number of dependencies by hand, you run the risk of not seeing an issue in testing, but only in a production deployment, which is incredibly dangerous.

Butterfly. Tsunami.

Indeterminacy is evil and goes against the principle of repeatability that is critical to large scale systems maintenance.

Note that this isn't an argument against the Puppet language or anything else. A default ordering will get me what I want. I don't even care if it's alphabetical or some other strange sort (with dependencies in order of course). I just need it the same way every time.

--Randy
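Randy's request (dependencies respected, everything else in a fixed default order) can be sketched in a few lines of Ruby. This is only an illustration of the idea, not Puppet's actual graph code: the `stable_order` helper and the resource names are made up for the example, and alphabetical tie-breaking is just one possible default.

```ruby
# Hypothetical sketch: a topological sort whose result is the same on
# every run, because ties between "ready" resources are broken by name.
# `deps` maps a resource name to the names it requires.
def stable_order(deps)
  nodes = (deps.keys + deps.values.flatten).uniq
  remaining = nodes.map { |n| [n, (deps[n] || []).uniq] }.to_h
  ordered = []
  until remaining.empty?
    # Resources whose requirements have all been applied already.
    ready = remaining.select { |_, reqs| (reqs - ordered).empty? }.keys.sort
    raise ArgumentError, "dependency cycle among #{remaining.keys.sort}" if ready.empty?
    # Always take the alphabetically first ready resource, so the result
    # never depends on hash or insertion order.
    pick = ready.first
    ordered << pick
    remaining.delete(pick)
  end
  ordered
end

deps = {
  "service[apache]"  => ["package[apache]", "file[httpd.conf]"],
  "file[httpd.conf]" => ["package[apache]"],
  "user[frank]"      => []
}
puts stable_order(deps).inspect
# => ["package[apache]", "file[httpd.conf]", "service[apache]", "user[frank]"]
```

Only the declared relationships constrain the order; `user[frank]` has none, so it falls wherever the default sort puts it, and it falls there on every run.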
On May 12, 2008, at 3:40 PM, Randy Bias wrote:
> My main issue with inconsistency is that it's inconsistent. A configuration should be applied the same way from run 1 to run 2 every single time. Period. We all know that building large scale systems requires repeatability. The order in which a configuration is applied MUST be the same every time or it's not repeatable.

I contend that you must pick one: consistent ordering, or manageability of your resource graph.

Puppet could go a little further toward trying to consistently order resources at the same sort level in the graph, but then, so could someone else. I'm glad to make an effort if someone wants to pay for it, but it'd be a fishing expedition since I don't know enough about graphing to know if I can even do it.

I assume you don't mean get rid of the dependency graph; I think we would be in a world of hurt if that went away, because we were before we had it.

--
It is curious that physical courage should be so common in the world and moral courage so rare. -- Mark Twain
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
On Mon, May 12, 2008 at 1:39 PM, Luke Kanies <luke@madstop.com> wrote:
> > I'm not convinced that modeling the nagios configuration as resources gets me much farther than having the external data source available to render the template, which is essentially what the native resource type is doing anyway at the end of the game. Essentially, I'm saying that the template isn't opaque -- the template is quite explicit about what it's doing (configuring Nagios.) Nagios has already defined the way it wants to be spoken to (through its own configuration language.
>
> This is a fundamental disagreement, then. The only reason, IMO, that you view Nagios resources as chunks of text and users as resources is that you're conditioned to this by the existence of user tools and lack of Nagios tools.

No. It's that I view Users as a thing that is easily abstract-able and differs across multiple implementations and platforms, and I view Nagios as something that isn't easily abstract-able, and doesn't differ greatly across platforms. If we were talking about an abstract "Monitor" type, that would put it on par with "User". But we aren't, because a Monitor type would be nearly impossible to build, because a "Monitor" is orders of magnitude more complex than a "User".

I absolutely view most User providers as "chunks of text", just like I view the Nagios providers as "chunks of text". It *is* chunks of text. The fact that you've moved Nagios' configuration syntax from its native state to this:

  nagios_host { "foo":
    one => two
  }

Does not actually save me much of anything, because you haven't abstracted it at all! You've translated it. You've just put another layer between me and the native language of the tool.
In either case, you have maintenance issues (you now have to maintain the translation layer, plus the native language builder, between changes and revisions of nagios, and you must be able to detect which is available at run time, blah blah blah)

In the case of Users, I dig that layer. It

> Of course the template is explicit; the problem is that it includes data, formatting, query information, etc.

That, to me, is a problem easily solved. MVC frameworks deal with this all the time. Templates are not a bad thing, you just have to learn how to manage them well. Puppet doesn't manage them well today, and the push towards "resources good, templates are a hack" makes it a very unappealing area to put in energy.

Templates including formatting and data is normal. Query information sucks, and we should move it to the language, but that assumes we can get back arbitrary data, or else we lose most of the utility you would require to handle templates well.

> Everything is a waste. Ruby is a huge waste. Using user resources is a waste. Using package resources is a waste.
>
> The question is, is that waste worth what you get in a return?

Right. What I want is a system that's fully configurable without my intervention as close to 100% of the time as possible.

> For some reason, you think the answer is "yes" when it comes to some resources stored in files (e.g., users) but not other resources stored in files (e.g., nagios configurations). I just don't see a fundamental difference between users and nagios configurations; all I see is a cultural difference in how we see them because of tradition and available tools.

It's not a cultural difference -- it's a very, very large complexity difference.

> > At all was where the earth really shattered. Modeling the world as resources is valuable, but I think it breaks down at a certain point. At that point, the language still enables me to get the job done.
>
> Right.
> C allows you to escape to assembly, Ruby allows you to escape to the shell, and no one would argue that it's bad to be able to do that.
>
> What I'm arguing is, it's incorrect to act like you're not doing the equivalent of escaping to the shell. You might as well have backticks throughout your manifests, from my perspective.

I'm not doing the equivalent of escaping to the shell -- I'm allowing for the modeling of complex applications within their native form, without requiring a translation layer that provides a 1:1 mapping and partial template rendering. You say potatoe, I say potato.

> This syntax already exists:
>
>   User <<| |>>
>
> There's your query. Now you just need to extend it to support specifying a source; maybe:
>
>   User(hr) <<| |>>
>
> Not sure, really. But how is your syntax better than mine? More importantly, how is your syntax fundamentally different, other than you're apparently expecting the interface to produce a struct and I'm expecting it to produce a resource.
>
> In the RESTian world, we *should* be shipping around resources, not abstract data types.

You hit the nail on the head. I want it to return a struct, and you want it to return a resource. I think, in many cases, I would rather be able to see the logic relating to what data I transform into resources than have it abstracted again.

In a RESTful world, what you want is for each resource to have a single canonical location by which you can access and alter it. It has very little to do with what that resource is. Be careful with further overloading that terminology, it's loaded up enough as it is. :)

Let me try putting my opinion into different terms. I think the majority of the data about your infrastructure needs to live in a semi-structured, searchable index. I think you should put as little enforced ontology in there as you possibly can, to maximize the potential for re-use elsewhere.
If the only thing I can retrieve is a fully formed resource, you've limited my ability to access that information, without further defining a new "type" of thing for it. I don't want to have to do that -- I want to be free to get the data I want, how I want it, and then use that data to drive Puppet's already very powerful abstraction layer.

> It's a shell escape. It might be portable, and probably is, but it's not generally portable, and, as with all shell escapes, it tends to get more difficult as your problems scale.

My Nagios templates are no less portable than your Nagios native types. They support 4 different platforms, and could easily be extended to support more by simply defining a new set of variables in a manifest (in regards to where files live, for example, once they are generated.) I can alter what systems are being monitored by altering single variables, and the unstructured nature of the underlying data queries lets me be as flexible as I might ever need to be about selecting which systems are monitored (and how.)

I realize you see this as a leaky abstraction, but I say it's actually the *right* level of abstraction for problems that are this complicated. The resource modeling forces more layers of abstraction for reduced practical benefit, and it forces me to know in advance how I might want to use the data.

> If you have clean abstraction layers, then you can enforce compatibility at the interfaces between those layers. Templates skip through the whole system, which means you have no means of enforcing anything.

Compatibility at the interfaces between the layers matters when what I am modeling doesn't have a 1:1 mapping to the end results. Users, Packages, Groups, Files, Directories, Symlinks. All of these might be handled a little differently on each different type of system, so defining a clean abstraction layer buys you these benefits. Nagios?
It's a 1:1 translation -- you aren't enforcing compatibility with anything at all.

> If any of those formats changes, or (for example) you start using the database-stored configurations for Nagios, all of your templates are useless.

So are the Puppet Native Types for Nagios, unless you have a database provider. Which, if the underlying file format changes, you'll have to update as well to deal with the new syntax or options. (Just like updating a template) Or if the database format changes, you'll need to update your native provider to deal with the new schema.

> All of that syntax just seems built to support your data sources returning arbitrary data structures instead of a standardized Puppet resource. Why would you do this, instead of having a standard resource format that got automatically turned into normal resources by the parser?

See the above.

To take the argument to even more philosophical places, it's the reason that Yahoo!'s index is not the way anyone finds information on the internet. Even though they have a well defined ontology, and have mapped a very complicated thing into simple to use terms, the reality is that even the most well thought out ontology can't cover all the possible use cases. Forcing the world into native types forces me to abstract even things that may not fit well into that abstraction, instead of allowing me to drive the parts that *do* work well with the information.

Regards,
Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759 E: adam@hjksolutions.com
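The "struct versus resource" split that Adam and Luke keep returning to can be made concrete with a small Ruby sketch. Everything in it is hypothetical: `PEOPLE` stands in for whatever an external data source would return, and `search_people`, `users_from_records` and `collect_users` are names invented for the illustration, not iClassify or Puppet APIs.

```ruby
# Hypothetical records standing in for the result of a search against an
# external node tool; none of these names are real Puppet or iClassify APIs.
PEOPLE = [
  { "name" => "frank", "dept" => "ops", "shell" => "/bin/bash" },
  { "name" => "alice", "dept" => "hr",  "shell" => "/bin/zsh" }
]

# A stand-in for "a resource": a type, a title and a parameter hash.
Resource = Struct.new(:type, :title, :params)

# "Struct" style: the caller gets raw records and decides what to build.
def search_people(dept: nil)
  PEOPLE.select { |p| dept.nil? || p["dept"] == dept }
end

def users_from_records(records)
  records.map { |p| Resource.new("user", p["name"], { "shell" => p["shell"] }) }
end

# "Resource" style: the data source hands back finished resources only.
def collect_users
  users_from_records(PEOPLE)
end

puts users_from_records(search_people(dept: "ops")).map(&:title).inspect
puts collect_users.map(&:title).inspect
# The raw records can also feed something that is not a resource at all,
# such as a per-department report.
puts search_people.group_by { |p| p["dept"] }.keys.sort.inspect
```

In the first style the transformation from data to resources is visible at the call site, which is what Adam is asking for; in the second, every consumer sees the same standard shape, which is Luke's point.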
On May 12, 2008, at 1:46 PM, Luke Kanies wrote:
> I assume you don't mean get rid of the dependency graph; I think we would be in a world of hurt if that went away, because we were before we had it.

No, I don't really have a problem with the dependency graph. I just want 100% repeatability.

About the only thing I would change is to find a cleaner way to have coarser dependencies. Like this whole class dependency thing, which sounds pretty good to me. I haven't tried it yet. Deps between modules and/or classes would get us pretty far, I think.

--Randy
I'm once again replying to as little as possible; I can't keep up this amount of text.

> I absolutely view most User providers as "chunks of text", just like I view the Nagios providers as "chunks of text". It *is* chunks of text. The fact that you've moved Nagios' configuration syntax from its native state to this:
>
>   nagios_host { "foo":
>     one => two
>   }
>
> Does not actually save me much of anything, because you haven't abstracted it at all! You've translated it. You've just put another layer between me and the native language of the tool. In either case, you have maintenance issues (you now have to maintain the translation layer, plus the native language builder, between changes and revisions of nagios, and you must be able to detect which is available at run time, blah blah blah)

An abstraction layer like the RAL is only useful if it's consistent. Using this extra layer of abstraction isn't necessarily useful in the particular case of Nagios, but its strict use throughout the system makes it generally more useful.

Again, auditing, querying, inventorying, reporting -- all require custom coding if you use templates, but all can use the same Puppet interfaces if you use the RAL.

It seems that you don't see the value of the RAL except in its specific ability to make your life easier right now, which is the real myopia, I think. Wide adoption of the RAL, or something like it, is critical in my quest to stop caring about OS implementation details and focus on the bits that matter. The more that you encourage people to shell out of the RAL, the less people will be able to use tools built on the RAL.

>> Everything is a waste. Ruby is a huge waste. Using user resources is a waste. Using package resources is a waste.
>>
>> The question is, is that waste worth what you get in a return?
>
> Right.
> What I want is a system that's fully configurable without my intervention as close to 100% of the time as possible.

That seems pretty orthogonal to this discussion; clearly, 100% automation is the goal, and wasting resources on abstraction doesn't affect that goal.

>> Right. C allows you to escape to assembly, Ruby allows you to escape to the shell, and no one would argue that it's bad to be able to do that.
>>
>> What I'm arguing is, it's incorrect to act like you're not doing the equivalent of escaping to the shell. You might as well have backticks throughout your manifests, from my perspective.
>
> I'm not doing the equivalent of escaping to the shell -- I'm allowing for the modeling of complex applications within their native form, without requiring a translation layer that provides a 1:1 mapping and partial template rendering. You say potatoe, I say potato.

Sorry, I don't agree. You've got a consistent abstraction layer you can use to do your work, and instead of doing so, you're working around it. That's not a simple pronunciation difference, it's a declaration that the old ways are better.

> Let me try putting my opinion into different terms. I think the majority of the data about your infrastructure needs to live in a semi-structured, searchable index. I think you should put as little enforced ontology in there as you possibly can, to maximize the potential for re-use elsewhere. If the only thing I can retrieve is a fully formed resource, you've limited my ability to access that information, without further defining a new "type" of thing for it. I don't want to have to do that -- I want to be free to get the data I want, how I want it, and then use that data to drive Puppet's already very powerful abstraction layer.

I haven't *limited* anything, I've just provided a consistent way to talk about Puppet's idea of a resource.
Think of it like a microformat; we define a canonical resource format, and now *anyone* can use that microformat, just like they can use hcard or any of the popular web microformats. There's just no way this can be seen as a limitation.

You need an arbitrary storage format, and I do too; the only differences are that I want to document mine as a "standard" format, and I want Puppet to automatically turn that into resources.

>> If any of those formats changes, or (for example) you start using the database-stored configurations for Nagios, all of your templates are useless.
>
> So are the Puppet Native Types for Nagios, unless you have a database provider. Which, if the underlying file format changes, you'll have to update as well to deal with the new syntax or options. (Just like updating a template) Or if the database format changes, you'll need to update your native provider to deal with the new schema.

Right, but with Puppet, you drop in a new provider, and change one field in your resource specification. With templates, you have to change the internals of your manifests, and, most likely, you need to provide the ability to select between formats. It doesn't look like your current templating system supports a switch to choose between formats, but Puppet already does.

> To take the argument to even more philosophical places, it's the reason that Yahoo!'s index is not the way anyone finds information on the internet. Even though they have a well defined ontology, and have mapped a very complicated thing into simple to use terms, the reality is that even the most well thought out ontology can't cover all the possible use cases. Forcing the world into native types forces me to abstract even things that may not fit well into that abstraction, instead of allowing me to drive the parts that *do* work well with the information.

That's a truism on any abstraction, and it's why the shell-out options exist.
But that's no excuse to jump to the shell-out as quickly as possible, like you're recommending.

--
America believes in education: the average professor earns more money in a year than a professional athlete earns in a whole week. -- Evan Esar
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
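Luke's "drop in a new provider, and change one field" claim can be illustrated with a toy version of the type/provider split. This is a sketch under loud assumptions: `NagiosHost`, its `provide` registry and both output formats are invented for the example and are not Puppet's real classes or Nagios's full syntax.

```ruby
# Hypothetical sketch of the type/provider split; not Puppet's real classes.
class NagiosHost
  PROVIDERS = {}

  # Register a named provider: a block that turns a resource into output.
  def self.provide(name, &block)
    PROVIDERS[name] = block
  end

  attr_reader :title, :params

  def initialize(title, params)
    @title = title
    @params = params
  end

  # Render through whichever provider the resource names (:file by default).
  def apply
    PROVIDERS.fetch(params.fetch(:provider, :file)).call(self)
  end
end

# Provider 1: a flat-file stanza (format simplified for illustration).
NagiosHost.provide(:file) do |host|
  "define host {\n  host_name #{host.title}\n  address #{host.params[:address]}\n}\n"
end

# Provider 2: a made-up database row, standing in for a db-backed config.
NagiosHost.provide(:db) do |host|
  { table: "hosts", host_name: host.title, address: host.params[:address] }
end

spec = { address: "10.0.0.5" }
puts NagiosHost.new("web01", spec).apply
# Switching backends changes one field of the resource, nothing else.
puts NagiosHost.new("web01", spec.merge(provider: :db)).apply.inspect
```

The sketch shows only the switch itself. Adam's counterpoint still stands in it: each provider block has to be written and kept current when the underlying format changes, just as a template would.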
On May 13, 2008, at 9:27 AM, Luke Kanies wrote:
> An abstraction layer like the RAL is only useful if it's consistent. Using this extra layer of abstraction isn't necessarily useful in the particular case of Nagios, but its strict use throughout the system makes it generally more useful.
>
> Again, auditing, querying, inventorying, reporting -- all require custom coding if you use templates, but all can use the same Puppet interfaces if you use the RAL.
>
> It seems that you don't see the value of the RAL except in its specific ability to make your life easier right now, which is the real myopia, I think. Wide adoption of the RAL, or something like it, is critical in my quest to stop caring about OS implementation details and focus on the bits that matter. The more that you encourage people to shell out of the RAL, the less people will be able to use tools built on the RAL.

Very well said. This is, I think, the heart of the argument. This is the reason why puppet exists, and why it's being successful. It will definitely be hard to model the more complex resources, but I think it can be done and at least meet the 80/20 rule.

As far as some of the more complex examples given, such as monitoring and web servers, there certainly seems to be a trend in dumbing down individual systems. That's why applications like Pound, Nginx, Mongrel, Rack, etc. have emerged with such force in the Ruby on Rails world. Apache could already do all the things those tools did, but they implemented it in a smaller, more focused package. So maybe the trend will be to model the smaller, simpler packages, and provide all the features of an apache or nagios via an aggregate of more focused resources.

I really think nagios is a mess, so much time is wasted on configuring it. Perhaps the attempts to model it are the wrong approach entirely. A more distributed approach to monitoring would be the better approach, I think.
Integrating god and puppet has long been an itch I want to scratch.

-Blake
Hi

> On May 8, 2008, at 11:19 PM, Adam Jacob wrote:
>> Unlike Luke, I'm not convinced that a similar syntax built on top of a fully functional programming language wouldn't solve many of these issues, while reducing the complexity of the run-time engine as a whole.
>
> As Blake pointed out, I've experimented both ways. The community seems to prefer the external DSL, and AFAICT, literally *no one* has ever looked at my pure Ruby internal DSL that attempts to mimic Puppet's language. I clearly don't have any religious biases against an internal DSL -- I've implemented one, and no one seems to care

you made me interested in that internal dsl, but i couldn't find any references or documentation about it. are you using it somewhere in the code or is it just there waiting to be examined?

and would a combination of the external and internal be possible? for example, use the internal one in new types and the external one still for your manifests.

or did i now simply misunderstand you?

greets pete
On May 14, 2008, at 9:37 AM, Peter Meier wrote:
> you made me interested in that internal dsl, but couldn't find any references or documentation about it. are you using it somewhere in the code or is it just their waiting to be examined?
> and would be a combination of the external and internal be possible? for example use the internal one in new types and the external one still for your manifests.
>
> or did i know simply misunderstood you?

At this point, you couldn't use the internal and external dsl together.

And you're right that it doesn't have much documentation; when I originally created it, I blogged about it and talked about it in email and irc and basically got crickets, which told me there were better things to do with my time.

I'm glad to help you through some of it, but it's likely to require a bit of hacking. I think Blake looked into it some, too.

--
The big thieves hang the little ones. -- Czech Proverb
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
And collectd.

On May 13, 2008, at 10:53 AM, Blake Barnett wrote:
> Integrating god and puppet has long been an itch I want to scratch.
David Schmitt
2008-May-20 11:56 UTC
[Puppet Users] Re: Philosophical Differences (using external data 1)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[as others have done, I split my answer into several pieces]

On Friday 09 May 2008, Adam Jacob wrote:
> > You mention this is data you already have outside of Puppet. If that is true, then yes, you could use a Nagios configuration generation tool. But not everyone has this data outside of Puppet nor does everyone want to manually compile this data outside of Puppet. What if it is just represented in manifests?
>
> I would argue that this data actually belongs outside of Puppet, in some kind of external node tool, that's accessible for other tools to utilize. When I say "outside of puppet", what I mean is, the canonical place for representing it should not be as declarations of resources within a manifest. I lose much of my ability to use that data in the many tools that it could be helpful with. (I'm thinking of things like inventory management systems, meta-directories, and application deployment systems.)
>
> I agree that not everyone has this data accessible outside of Puppet. Sometime in the next year, I would wager that most people will, as Luke will have completed work on a Reductive-stamped external node tool. If you want to solve the cases where you need knowledge of the external infrastructure, right now, using iClassify or LDAP to solve this problem for you is, in my opinion, your only viable option.
>
> Where the real cognitive divergence happens is that I take this a step further, and say that it's actually the "right thing" that the world works this way. I *do* want some data outside of Puppet. It's not the right tool for many jobs. But those other tools could get large scale benefit from access to that information, and Puppet is just another one of those tools.

I totally agree with you here Adam!

When I look at how my "customers" are defined, see e.g. http://git.black.co.at/?p=manifests;a=blob;f=manifests/site_hosting/dasz.pp for a glimpse of it, I am amazed that I CAN do this AT ALL [1].

Still, coming so far, I yearn for more. Having this information in an "oddly formatted text file" doesn't help at all for e.g. creating a self-service web interface where people can say "Yes, I'd like to have wordpress too on my server". Of course part of this is lack of appropriate classes which could be bound into an external node classification tool, but others like lines 28-31 (user creation) cannot be done with classes alone.

[1] this automatically sets up a vserver, configures it, creates an LDAP tree, samba, private openvpn, apache, awstats, munin-node, exim4, smarthost acls, databases, wordpress, svn repos and a DAV folder.

Regards, DavidS

- --
The primary freedom of open source is not the freedom from cost, but the freedom to shape software to do what you want. This freedom is /never/ exercised without cost, but is available /at all/ only by accepting the very different costs associated with open source, costs not in money, but in time and effort.
- -- http://www.schierer.org/~luke/log/20070710-1129/on-forks-and-forking
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFIMrxW/Pp1N6Uzh0URAtmOAKCX5KJ9nIYqUNncT3rdvF+NHP3UpwCff2Tf
bpKpL/7D7rI4BAMvOjk+eyI=yQcd
-----END PGP SIGNATURE-----
David Schmitt
2008-May-20 12:09 UTC
[Puppet Users] Re: Philosophical Differences (using external data 2: export/collect)
On Friday 09 May 2008, Adam Jacob wrote:
> > And what if we could get export/collect to match that? Why are we giving
> > up on this and just saying "it will never work, lets just work around
> > it?" Why not try to bring Puppet up to where it *can* perform at those
> > levels?
>
> I'm not convinced that export/collect is actually the right way to
> approach the problem. I think that, for most of the possible use
> cases for it, all I'm doing is forcing external data into a resource
> construct that may or may not actually give me any semantic advantage
> (and in the case of distribution above, a dis-advantage.)
>
> I'm not at all opposed to bringing Puppet up to where it can perform
> at those levels. If people see the value in:
>
>   @ssh_key { $fqdn:
>     ssh_key => $ssh_dsa_key
>   }
>
>   ssh_key <| |>
>
> Then go nuts with it. The part I dislike about the current work around is:
>
>   file { "ssh_host_keys":
>     content => template(foo)
>   }
>
>   # in the template
>   <% ic = IClassify::Client.new(server, username, password) -%>
>   <% ic.search("ssh_dsa_key:*", [ "ssh_dsa_key" ]).each do |n| -%>
>   <%= n[:ssh_dsa_key] %>
>   <% end -%>
>
> What sucks about that is I need to look at the template to see what's
> going on, and it's the exact problem you point out with this
> technique. If we could say:
>
>   $ssh_host_key_list = search("ssh_dsa_key:*", [ "ssh_dsa_key" ])
>   file { "ssh_host_keys":
>     content => template(foo)
>   }
>
>   # template
>   <% ssh_host_key_list.each do |n| -%>
>   <%= n[:ssh_dsa_key] %>
>   <% end -%>
>
> You've resolved that issue. If you really, really want to declare
> each ssh key as a resource, because the above syntax isn't clear
> enough for you:
>
>   $ssh_host_key_list = search("ssh_dsa_key:*", [ "ssh_dsa_key" ])
>   foreach $ssh_host_key_list ($key) {
>     ssh_key { $key:
>       ssh_dsa_key => $key
>     }
>   }
>
> And if loops are too much:
>
>   ssh_host_keys { "something":
>     keys => search("ssh_dsa_key:*", [ "ssh_dsa_key" ])
>   }
>
> Lots of ways to make that better, that don't involve modeling each
> discrete piece of data as a resource per-node.

Getting resources from an (external) database into the puppet manifests is high on my wishlist (see last mail, "using external data 1"). Without going into much implementation detail, and without claiming any efficiency or ergonomics, this can be done today:

  define ssh_key_from_db() {
    sshkey { $name:
      keys => sql_array("SELECT key FROM sshkeys WHERE id = '$name'")
    }
  }

  ssh_key_from_db { sql_array('SELECT DISTINCT id FROM sshkeys'): }

This is completely transparent to the reader and doesn't need any templates. Of course, it suffers from the N+1 problem (one query for the list of ids, then one more query per key) as well as SQL injection, but these are only implementation details that can be fixed by appropriate encapsulation within a database layer.

Another possibility would be to use generate:

  file { "/etc/ssh/ssh_host_keys":
    content => generate("mysql", "--no-formatting", "-e", "SELECT keytype || ' ' || keydata FROM sshkeys"),
    mode => ...
  }

Again, no ruby, no template, no loops, not even a native type or custom function.

A third possibility would be to go totally function-nut-ically and do something like this:

  interpret ( " SELECT 'sshkey' as Type, hostname as Title, sshkey as Param_sshkey FROM sshkeys" )

which generates resources from the resulting array(s).

All of this can of course be extended by analogy to cover non-SQL sources or to use some ORM tool.

Regards, DavidS
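
The third possibility in this mail, turning query rows into resources by column naming convention, can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `interpret` does not exist in Puppet, and the `Resource` struct and `rows_to_resources` helper below are hypothetical names. The rows are plain hashes standing in for the result of the SELECT shown in the mail, so no database is involved.

```ruby
# Hypothetical in-memory representation of one generated resource.
Resource = Struct.new(:type, :title, :params)

# Map result rows to resources using the column convention from the mail:
# "Type" and "Title" identify the resource, and every "Param_<name>" column
# becomes the parameter <name>. Other columns are ignored.
def rows_to_resources(rows)
  rows.map do |row|
    params = row.each_with_object({}) do |(column, value), acc|
      name = column[/\AParam_(.+)\z/, 1]
      acc[name] = value if name
    end
    Resource.new(row.fetch("Type"), row.fetch("Title"), params)
  end
end

# Example rows, as a single query might return them (values are made up).
EXAMPLE_ROWS = [
  { "Type" => "sshkey", "Title" => "web01", "Param_sshkey" => "AAAAB3Nza-example" },
  { "Type" => "sshkey", "Title" => "db01",  "Param_sshkey" => "AAAAB3Nzb-example" },
].freeze
```

Since all rows come back from one query, this shape sidesteps the N+1 problem mentioned for the first variant. Quoting and escaping of the values is deliberately left out of the sketch.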
David Schmitt
2008-May-20 12:11 UTC
[Puppet Users] Re: Philosophical Differences (apache configuration)
On Friday 09 May 2008, Adam Jacob wrote:
> > Yup. If I want to check if we're setting up monitoring for Apache, I
> > want to look at manifests related to configuring Apache. I don't want to
> > jump from template to template to see if I've got all the wheels doing
> > the right things. That's one of the major buy-in points for going with
> > Puppet.
>
> See the syntax above, and my questions about distribution. Do you
> *really* want that embedded in your Apache class? We do this now for
> anything that is client-side configured. But when I think about
> making what we do useful outside of HJK, I start to reconsider that
> point of view.

I can totally see your point here. After the distro-splitting in my apache module, the nagios and munin stuff is an unsolved sore point. Without having done anything for it, I'd expect some kind of optional include to be enough for this:

  class apache::base {
    optional_include apache::muninstats
    optional_include apache::nagiosmonitoring
  }

  define apache::site(...) {
    if $has_nagios {
      nagios::port { "$ipaddress:$port": ensure => open }
      nagios::http { "http://$vhostname/": ensure => 200 }
    }
  }

Currently puppet needs the nagios::{port,http} types (or defines) to exist even when $has_nagios is false, I believe. Again, this is only an implementation detail.

> > But you are right, native types may not always be the right answer. We
> > do most certainly maintain apache configs as files and use definitions to
> > simply drop in and enable those sites. I don't know if it would be
> > feasible to represent all the inner workings of an apache config in
> > declarations.
>
> I'm willing to go out on a limb and say "No". :)

There is a trivial, if impractical, mapping from apache's configuration to puppet resources, which has all configuration directives as parameters to an apache::global define, an apache::virtualhost define which takes all directives that can be applied in VirtualHost sections, and so on and so forth. Containers like <Directory> and friends need a title syntax so they can be nested into <VirtualHost>s or something. In the end, all configurations are only nested hashes with well-known keys.

The root of the obvious impracticality is the way apache handles and nests defaults, vhost-specific configuration and ACLs. To flatten this out, a (constricting) abstraction could look like this:

  define web::virtualhost(
    $public_directory = '/var/www',
    $private_directory = '/var/lib/www',
    $port = 80,
    $ssl = false,
    $modules = [],
    $options = [],
    $acls = [ ['allow', 'all'], ['deny', 'none'] ],
    $charset = '')
  {...}

The point for me is that one has to abandon distribution- and application-specific notions of configuration space IFF one wants to go down that road. Currently I'm just distributing site.confs, which in a way has its own problems, since they have to conform to some external restrictions (not too much pollution of global config space, but e.g. NameVirtualHost has to go there, yet is potentially dangerous there). Referring back to my "using external data 2: export/collect" mail, this file-based distribution is of course very suboptimal, since it cannot be driven at all from an external database.

> > But even with these two examples, I still think there can exist a myriad
> > of native types that will have practical utility. I don't think the
> > complexity of Nagios and Apache means we need to abandon that push.
>
> Absolutely. I never argued otherwise.

+1

> I agree that I don't think the differences will get settled, but it's
> an interesting conversation regardless.
>
> Thanks for your well reasoned feedback, and your continuing
> contributions to the Puppet community, Digant.

The same to you, Adam! It's a pleasure to see professionals at work.

Regards, DavidS
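
The remark in this mail that "all configurations are only nested hashes with well-known keys" can be illustrated with a short Ruby sketch. It assumes a simple convention that is not part of Puppet or Apache: a key whose value is a hash is rendered as a container such as <VirtualHost ...>, and any other key is rendered as a plain directive. The `render_config` helper and the example data are hypothetical.

```ruby
# Render a nested hash as Apache-style configuration lines.
# A Hash value becomes a container (<Name args> ... </Name>); anything else
# becomes one "Directive value" line per value.
def render_config(config, indent = 0)
  pad = "  " * indent
  config.flat_map do |key, value|
    if value.is_a?(Hash)
      tag = key.split.first # "VirtualHost *:80" closes as </VirtualHost>
      ["#{pad}<#{key}>", *render_config(value, indent + 1), "#{pad}</#{tag}>"]
    else
      Array(value).map { |v| "#{pad}#{key} #{v}" }
    end
  end
end

# Example data (made up): one virtual host with a nested <Directory> container.
EXAMPLE_VHOST = {
  "VirtualHost *:80" => {
    "ServerName"         => "www.example.com",
    "DocumentRoot"       => "/var/www",
    "Directory /var/www" => { "Options" => "Indexes" },
  },
}.freeze
```

Calling `puts render_config(EXAMPLE_VHOST)` prints the nested configuration. The sketch shows why the trivial mapping is easy to write and also why it buys little: the hash is just the config file in another notation, which is the impracticality the mail describes.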
Is there someone here, that wants to write a summary of this discussion and the different opinions? and perhaps put it in the wiki? Adam Jacob schrieb:> The thread on "templates and tagging" > (http://groups.google.com/group/puppet-users/browse_thread/thread/df87d0837b2e4993) > brought out some questions about how I (and, for some of the things > I''m about to say, others in the Puppet community) think about how > Puppet is designed. > > In my message in that thread, I had a disclaimer, which talked about > the fact that Puppet is a great tool, and that Luke deserves all > praises for having built and conceived of it. That said, we do > disagree about a few things. What follows is my attempt to clarify > how and why I think the way I do. > > It''s broken up in to sections, since the places where we diverge > philosophically don''t actually occur until you get pretty deep into > actually using Puppet. My intent was to be able to bring people > along, so that we''re not talking above peoples understanding of > Puppet. > > That said, I don''t claim to be perfect or unique in my understanding > of Puppet or it''s internals. These are simply my observations, having > used Puppet to help build many different infrastructures for various > organizations. > > == The Language => > Puppet is a huge leap forward in the art and practice of systems > administration/systems automation/systems architecture. Standing on > the shoulders of cfengine and similar tools, Luke created a > fundamentally useful abstraction layer for not only the practical > management of systems, but for expressing the entire infrastructure > (well, the *nix based systems in it anyway) as meaningful code. > > Let me say that one more time, because it''s the thing that is most > amazing about Puppet, and about Luke''s accomplishment with it: > > Puppet allows you to express your entire infrastructure as meaningful code. 
> > This was just not even theoretically possible with any existing tool > until Luke created puppet, in my opinion. Puppet, in two words, kicks > ass. > > From a practical point of view, the big win here was the creation of a > syntax that allowed for the easy expression of sometimes complicated > relationships between discrete resources in the system. The minimum > set of resources that must be capable of being managed by a tool like > Puppet are: > > * Files > * Directories > * Symlinks > * Exec > * Packages > * Services > > From those six basic resources, you could automate the vast majority > of modern *nix systems. > > Puppet didn''t stop there, though. It gave an even more powerful > abstraction layer to us, and that was the ability to group together > sets of those fundamental resources (which it calls types) into > classes. This allowed us to start grouping together resources > according to function, regardless of the underlying implementation. > Instead of modeling all of our init scripts, then all of our users, we > could start rolling all of those things together into a single class, > and then stating that it should exist on a given system. > > class apache { > file { "httpd.conf": > .. > } > package { "apache": > .. > } > } > > That rocked. We took all the varying steps required to get a > functioning apache installation, and we rolled them up in a single > thing we could name: "apache". > > Still, this wasn''t enough. Often times, we would have a repetitive > series of resources that we needed to apply over and over again. So > Luke one-upped himself: he gave us the ability to create a resource > definition. We could now group together resources, and reference them > in our classes *as if they were a single resource*. > > define apache_virtual_host($ensure="enabled") { > file { "$name.conf": > .. > } > exec { "reload-apache": > .. 
> } > } > > Now, instead of repeating that pattern anywhere I needed a new virtual > host, I could just say: > > apache_virtual_host { "frank": > ... > } > > And have all of those resources created for me. This is a leap > forward similar to the invention of chain-saws in the logging > industry. The impact on the ability of a single systems administrator > to create elegant, functional, repeatable configurations cannot be > understated. The practical benefit of these three powerful > abstraction layers (Resources, Classes, and Definitions) are: > > * I can manage different resources with similar semantics, in one place. > * I can group those resources together under functional umbrellas, so > it''s easy to find, maintain, and extend them. > * I can create new semantic abstractions that simplify my problem domain. > > If you knew nothing else about what Puppet provided, or how it turns > those resources into functional reality, you should already be able to > see the value inherent in those three things alone. > > If I was putting a percentage value on the different parts of puppet, > in terms of it''s impact on how efficient I am as a Systems > Administrator, I''m putting the language at 90%. Sure, without the > ability to take that language and do meaningful action with it, it''s > functionally useless. But many tools provide the ability to manage > the six fundamental resources, to say "make sure this file is owned by > root" or "make sure this service is enabled". None of them come close > to letting you do it as elegantly as Puppet does. > > == Providers and Native Types => > Puppet takes the language abstraction layer, and it somehow magically > turns them in to reality on your system. When you say: > > package { "apache": > ensure => latest > } > > Puppet knows that Apache should be installed, and that it should > always be kept at the latest version. 
It knows how to do this for > multiple platforms, even in the face of wildly different syntax for > the underlying operation. This is the second half of the magic of > resource abstraction. Not only does it let you easily say you want > it, it also *hides the implementation details from you*. > > Now, there are a few snags to this magic. One is that the underlying > platforms don''t all agree on the name of the apache package. Is it > "httpd", "apache", or "apache2"? Puppet lets you work through this > with a conditional syntax: > > package { "apache": > name => $operatingsystem ? { > Debian => "apache2", > CentOS => "httpd", > default => "apache" > } > ensure => latest > } > > If you tried to do this the other way, which is having the Package > provider understand "apache" in a native way, you would find the > technical obstacles much to hard. If the provider was supposed to > know that Package["apache"] was called "httpd" on CentOS dynamically, > it would need to have applied it''s own ontology on top of a hideous > number of potential package layers. So it lets you give it a hint, > and move along. > > The results kick ass. I can now refer to one resource, > Package["apache"], any time I want to talk about Apache, no matter > what platform I''m on. That rocks, because it let me apply my own > model of what was desired, and map that easily to a number of > underlying implementations. > > It works because Package, as a concept, is a fundamental thing. It > can be defined, and it''s basic attributes are similar regardless of > implementation. Packages have: > > * Names > * Versions > * Well defined states (installed, not installed, current, out of date, etc.) > > Regardless of your packaging system, it probably has a lot of > similarities with other packaging systems. Building an abstraction > layer at the concept of a "Package" works great, because the > similarities between packaging systems is quite high in the abstract. 
> > One of the things you''ll notice is that this is true about all six of > the fundamental resources I outlined above. No matter what platform > your on, things of that kind share a lot of common attributes. They > are differentiated mainly by differences in implementation, but not in > abstract concepts. > > These basic types show the sweet spot for this kind of underlying > platform abstraction: > > A resource can be boiled down into discrete abstract concepts, which > are present in the vast majority of implementations, regardless of > platform. > > Doing this provides a clear benefit, as I can now manage a single > declaration of a resource, and have it take care of the details, > regardless of my platform. > > When a resource fits that requirement, the benefit to providing this > cross-platform abstraction layer is clear. It takes more work up > front, but it saves time in the long run. > > This combination of a Resource and a set of Providers is often > referred to as a "native type". It''s a thing that Puppet inherently > knows how to manage, across multiple implementations or platforms. > This is in contrast to my apache_virtual_host definition above, which > is a "defined type", and is made up of a series of Puppets own native > types working in concert. > > If you didn''t have any cross platform providers at all, I could still > build cross platform tools with Puppet''s language.. it would just be > wildly less useful and elegant. > > = Where Provider''s and Native Types beak down > > So, the sweet spot for native types is if: > > A resource can be boiled into discrete abstract concepts, which are > present in the vast majority of implementations, regardless of > platform. > > There are many things that might fit into this box, outside of just > the six fundamental resources: > > * Firewall rules > * Mount points > * Host entries > * DNS entries > > The list goes on. 
In all of these above cases, you can take the basic > idea of the thing, and boil it down to a series of high level > concepts, regardless of the underlying implementation. > > But wait.. there is a snag. With Firewall rules, it''s actually about > a lot more than just the basic idea of "from x to y on z port". > That''s easy. But the implementation details do matter -- think of the > wide variety of rules you can place in iptables alone, then extend > that to pf or any other sort of similar tool. You might still see > benefit out of a native type, but it probably won''t be the > cross-platform nature of it, unless you are willing to settle for a > non-fully-functional abstraction. > > Lets take this a step further, and talk about a much more complicated > beast: web servers. You can make some abstraction about the sorts of > things web servers allow you to do, on a high level. They bind to a > port, they serve files. But they also have virtual hosts, complicated > rewrite rules, proxy layers, ssl, and a huge list of other things. So > while it would be cool to be able to say: > > web_server { "foo": > port => [ 80, 443 ], > document_root => "/", > provider => "nginx" > } > > And then be able to switch it to say: > > web_server { "foo": > port => [ 80, 443 ], > document_root => "/", > provider => "apache" > } > > Which you could absolutely build in the simplest case. But the > underlying complexity of modern web servers, along with the wildly > different approaches to their underlying configuration, make this kind > of thing very difficult to do. Even though there would be a benefit > to being able to switch out what webserver I am running with a simple > swap of provider, that benefit is outweighed by what I would loose in > flexibility. > > I use this example to illustrate that there is a place outside the > sweet-spot for native type development. 
It is when: > > The complexity of implementing an abstraction layer means loosing > significant functionality within the underlying providers. > > Or, put another way, as the complexity of the thing you are > abstracting increases, the utility of abstracting it falls. The sweet > spot in the case of web servers is to use definitions, along with the > fundamental building blocks, to automate the process of configuring > them. It provides huge semantic benefit for comparatively little > effort. Building a native web_server provider provides an interesting > benefit (the ability to use different web servers transparently,) but > at a cost of lost functionality and a huge amount of effort. > > So, a particular thing is a great case for becoming a native type if: > > 1) It can be boiled into discrete abstract concepts, which are present > in the vast majority of implementations, regardless of platform. > 2) The complexity of implementing an abstraction layer does not result > in a significant loss of functionality within the underlying > providers. > > Often, one of these two things will be true, and we''ll wind up > building a native type anyway. An example of this would be a pending > iptables native type from Stanford. They aren''t building a "firewall" > or "packet filter" native type with an iptables provider. They are > building native iptables types, that understand the specific > complexities inherent in that one implementation, so that they can > have a more complete semantic set by which to manage their > infrastructure. > > So, something might be an good case for being a native type if: > > 1) It''s configuration can be abstracted into puppet''s native syntax easily > 2) It benefits from a deeper level of decision making on the client side > > In the case of iptables, it fits both these slots. 
We can totally > model it''s syntax in puppet, or even punt on doing that at all, and > just let you write the rule inline (since we don''t have to worry about > it being cross platform!) Plus, it solves the problem of making sure > your rules all appear in order, since you can use Puppet''s internal > ability to declare relationships between resources to structure the > resulting rule set, which would be impossible using Defined types > alone. > > If it doesn''t fit well in to one of those two boxes, it''s a bad case > for a native type, and you are better off making definitions and using > the smaller building blocks to solve your problem. > > == Puppet''s language is declarative (except when it isn''t) => > In practical terms, this means that when you declare two resources: > > file { "this": } > file { "that": } > > In a manifest, they are not guaranteed to be applied on the client in > the order you declared them. If they were dependent on one another, > you must make that dependency explicit, or you won''t have any > guarantee that they will be applied in the order you expect. > > Puppet makes a Directed Graph of all of your resources, and then > performs a Topological Sort on that graph to determine the order they > will be applied in. One side-effect of this is that the order they > will be applied in may vary from one run to the next, because a > topological sort of a directed graph may have many different > solutions. > > The reasoning behind this is based on the assumption that, as > complexity increases, relying on the order in which resources are > specified to determine when things should happen becomes impossible to > manage. With 10 things, it''s easy to put them in a line. With a 100 > things, it''s harder. With a 1000 things, it''s quite hard. By using > dependencies, you only have to say "this requires that". You don''t > have to worry about where "this" or "that" slot in to the grand scheme > of things. 
> > The gotcha here is that people actually don''t think in terms of > directed graphs and topological sorts. If you look in your Puppet > manifests, you most often write them in order (at least within a > class). It''s the natural thing to do, because it''s just "normal" to > write in a logical line. When you realize that order doesn''t matter > that first time, you start to make your edits *out* of order, because > you know it doesn''t matter. Many people have been bitten by this: > > file { "foo": > content => template("woot") > } > > $variable_i_need_in_foo = "something" > > You can see the thought process here... I need to model this file as a > template. Oh, and it needs $variable_i_need_in_foo. That should > work, you figure, because Puppet is declarative. > > Except that template() is a function that gets called at the time the > parser sees it. And since the parser hasn''t seen > $variable_i_need_in_foo, it can''t use it in the template. It''s a > totally reasonable behavior to have, and it would blow up if the > language wasn''t declarative as well. But it goes against what the > user gets trained to expect. > > Finally, I think the nail in the declarative coffin is this. As the > complexity of your implementation increases, it is harder to keep the > explicit statements of dependency in order than it is to slot things > into a line. The reason? > > Most of the time, it doesn''t matter where something gets slotted in. > > If I have a thousand things, very few of them *actually care about > where they fit*. They just need to happen after a few other things. > If Puppet was not declarative, but simply executed resources in the > order they appear in the manifest (based on the order the top-level > classes are assigned to the node) the problem would become one of > ensuring that each class state explicitly what *other* classes need to > have been applied before it can be run. 
Which you have to do anyway, > if you want to ensure that your complete configuration can be > delivered in a single run. > > This would be the first major philosophical break I have with Puppet. > I think the declarative nature of the language is a hinderance at > scale, not a benefit. > > One way to fix this would be to implement an automatic require > statement for every resource that appears in an "include foo" line > inside a class for every resource within the class. At that point, > though, you might as well just ditch the graph altogether. > > = Puppet is Myopic by default > > For those of you who have read this far, this is where Luke and I > really start to diverge philosophically. > > At the moment, the majority of Puppet users are utilizing it in a > myopic way. For any given run, a node knows all about itself, but it > doesn''t know about it''s neighbors. So how do we handle configuring > services which require knowledge about the infrastructure at large, > instead of just my one node? > > To solve this problem, Luke has added a few more layers of abstraction > into the language. They are the ability to create a virtual resource > and "export" it, and the ability to "collect" those resources again on > another node. (Often referred to, cleverly enough, as export/collect) > > To take the simplest case, that of ssh host keys: > > class ssh { > case $sshdsakey { > "": { # ignore empty keys > } > default: { > @sshkey { $hostname: type => dsa, key => $sshdsakey } > } > } > > sshkey <||> > } > > So, a lot of things are happening here: > > * We are saying that the canonical sshkey for $hostname is $sshdsakey. > (Both of which are facts the node submitted to puppet when it ran) > * We are saying that, on every host, we want to make all of the sshkey > resources puppet knows about exist on this node. 
> > If you accept that Puppet is, or should be, the single canonical and > authoritative place where all the information about how your systems > architecture should be built is, than the @sshkey statement seems to > make some sense. > > The thing is, though, we are really saying that whatever the node says > is correct is correct. Puppet doesn''t have an ssh host key for me > that it''s distributing to my client. It isn''t in control of what that > value is or is not. Each individual node decides for itself what the > ssh host key should be, and we''re just letting everyone else know > about it. > > This is a good thing, in my opinion. Creating every ssh host key > manually and distributing it would work, but what a pain. This solves > my problem, which is great. By letting me grab the resources of other > nodes, I can have a much broader view of what''s happening around me. > (I am less myopic!) > > In order to make this functionality work, you need to store every > created resource for every node in your infrastructure at the last > state you saw it in (Puppet does this by sticking the resources into a > MySQL or PostgreSQL database). > > (Disclaimer: I am not familiar at all with the actual > storeconfigs/export/collect code. I am extrapolating.) > > In one of our infrastructures, we define approximately 900 different > resources per node. If we have 10 machines, that means 9000 discrete > resources total. If Puppet runs every 30 minutes, that means: > > 1) The first run makes 9000 INSERTs. > 2) Subsequent runs make ideally a single SELECT for all the resources > I stored on my last run, and then any UPDATEs, INSERTs or DELETEs that > are required to make my working set match. In an un-optimized > version, it makes 9000 individual SELECTs for comparison. > 3) Any use of the export/collect functionality is a subsequent SELECT. > > This is not such a big deal at 10 systems. Lets scale to 10x, though, > to 100 systems. That number is now 90,000. 
200 systems is 180,000. > 1000 systems is 900,000 rows. > > That''s quite a bit of data, and it scales pretty rapidly as you use > puppet more. Especially if you start declaring resources for > everything you want to do. We have a Nagios installation I''ll talk > about later, that has a total of 1748 service checks. How many of > those would need to be @nagios_service { .. }? > > Now, there are benefits to this storage: > > 1) You know about the state of every resource on every system you > manage. This has *super sweet* reporting potential. > 2) You can query against it to do things like sshkey <| |> > > The downside: > > 1) Using storeconfigs today means taking a fairly significant > performance penalty. > 2) You can''t export/collect non-native types. > > The first one is problematic for obvious reasons. The second one is > more nefarious. If I want to use a resource declaration in a puppet > manifest to configure something that *doesn''t* make sense as a native > type, I''m left out in the cold. I''m forced to make a native type, > because it''s the only way I''m going to get the behavior I want when it > comes out the other end. > > For example, Puppet has native Nagios types. These exist because you > want to be able to configure Nagios through resource declarations > within the functional classes that reflect the monitors you need. (And > they aren''t a more generic "monitor" type because it would be too > complex, and you would loose too much functionality.) > > So, what do I really *need* when I''m configuring Nagios in an automated way? > > 1) A list of all the hosts in my infrastructure > 2) Knowledge of what services those hosts provide > 3) Knowledge of what monitors should be watching those services > > All of these details can be boiled down to one thing: I need knowledge > of data about all the systems in my infrastructure. I need to know > the IP Address of every host, along with FQDNs. 
> I need to know that you're a "web server" or a "database server".
>
> All of this is information that exists *outside of the resource
> modeling* that provides so much power in puppet. When I model:
>
> @nagios_host { $fqdn:
>   address => $ipaddress
> }
>
> All I'm doing is taking data I already know about and forcing it into
> a semantic model that provides very little actual value. If my
> infrastructure is using Nagios, I'm not learning about it because I
> see a @nagios_host declaration in my puppet manifest. (I hope not!)
> I'm not learning that this host is monitored by nagios, because the
> assumption is that, either way, it will be (or why was I automating
> nagios in the first place?)
>
> Its only purpose is to force data I already have into a semantic
> model that it doesn't fit well within, so that I can avoid knowledge
> of how nagios deals with its configuration. (The value of the puppet
> syntax, after all!)
>
> Compare export/collecting to these lines of Ruby in a host template:
>
> <% ic.search("tag:monitored").each do |node| -%>
> define host{
>   use        generic-host ; Name of host template to use
>   host_name  <%= node[:hostname] %>
>   alias      <%= node[:fqdn] %>
>   address    <%= node[:ipaddress] %>
> }
> <% end -%>
>
> And this puppet resource:
>
> class nagios {
>   file { "nagios-hosts":
>     ...
>     content => template("nagios_hosts.erb")
>   }
> }
>
> All wrapped up in a Nagios class. It assumes I have some knowledge of
> how to configure Nagios, since I am still writing the config file by
> hand.
>
> In order to use the Nagios native types well, you *still must have an
> intimate knowledge of Nagios*. You don't get to magically wipe away
> that complexity, because it's hard. Hard enough that it doesn't make
> sense to make a "monitor" native type, hard enough that if you want to
> avoid having the 1748 service checks as individual resources you need
> to understand that you should be modeling hostgroups instead.
>
> So when people question why the template model is useful at this
> level, and that the world should be made up of resources, I have to
> ask: why? What's the concrete benefit?
>
> It's not:
>
> * That I can swap my monitoring solution out for another one easily
> with a native type (I can't)
> * That it scales better (it takes a couple seconds to compile a
> manifest that handles 243 nodes, 1748 service checks - storeconfigs +
> export/collect will likely never match that)
> * That I now magically know that a node is being monitored by nagios
> (because, as Digant said so elegantly regarding sudoers, you still
> have to look at sudoers if you really want to be sure what's in there)
>
> So what's left? That I like being able to look at:
>
> class apache {
>   @nagios_service { "httpd":
>     ...
>   }
> }
>
> And
>
> class nagios {
>   nagios_service <| |>
> }
>
> Instead of knowing that my services are defined in the template for
> Nagios service configs?
>
> Even if that's enough for you, is the above so superior that we
> shouldn't expand Puppet's language to make the gross parts of doing it
> with templates (the inability to embed logic that belongs outside the
> template itself in the manifest that utilizes the template) easier?
>
> = Summary
>
> Resources, Definitions, and Native Types aren't good in and of
> themselves. They need to have an increased level of practical utility
> for the people who wield them in order to provide any real benefit.
> There are situations where they clearly just are not the right answer.
>
> I believe Nagios is one of them, indeed, almost any situation where
> you need to know data about the infrastructure in aggregate. In those
> cases, I don't need to model the data I already have as a resource --
> I need a way to express that data to impact the creation of resources,
> sometimes simple fundamental ones.
>
> More abstraction is not always better.
>
> Regards,
> Adam
>

--
Phillip Scholz
Junior Linux System-Administrator
IT Shared Hosting Linux - SaaS, Karlsruhe

1&1 Internet AG
Brauerstraße 48
D-76135 Karlsruhe
Tel. +49-721-91374-4818
phillip.scholz@1und1.de
http://www.1und1.de/

Amtsgericht Montabaur HRB 6484
Management Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Achim Weiss
Chairman of the Supervisory Board: Michael Scheeren
David Schmitt
2008-May-20 12:47 UTC
[Puppet Users] Re: Repeatability of Resource sorting (was: Re: Philosophical Differences)
On Monday 12 May 2008, Luke Kanies wrote:
> On May 12, 2008, at 3:40 PM, Randy Bias wrote:
> > My main issue with inconsistency is that it's inconsistent. A
> > configuration should be applied the same way from run 1 to run 2 every
> > single time. Period. We all know that building large scale systems
> > requires repeatability. Changes in the order that a configuration is
> > applied MUST be the same every time or it's not repeatable.
>
> I contend that you must either pick: Consistent ordering, or
> manageability of your resource graph.
>
> Puppet could go a little further toward trying to consistently order
> resources at the same sort level in the graph, but then, so could
> someone else.
>
> I'm glad to make an effort if someone wants to pay for it, but it'd be
> a fishing expedition since I don't know enough about graphing to know
> if I can even do it.

I think the "only" thing one has to do to get a stable sort is to sort all nodes on one "level" of in-degree; see the attached patch for an idea. The remaining problem is how to compare the resources, but a simple string-sort on .to_s should be sufficient if this is unique (which I think it is, giving "Type[title]").

Regards, DavidS

--
The primary freedom of open source is not the freedom from cost, but the freedom to shape software to do what you want. This freedom is /never/ exercised without cost, but is available /at all/ only by accepting the very different costs associated with open source, costs not in money, but in time and effort.
-- http://www.schierer.org/~luke/log/20070710-1129/on-forks-and-forking
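
For readers following along, the idea of sorting each "level" by its string form can be sketched roughly as below. This is not the attached patch and not Puppet's actual graph code; it is a minimal standalone illustration using Kahn's algorithm, where the method name stable_topsort and the plain-string resource names are assumptions made up for the example.

```ruby
# Hypothetical sketch, not Puppet's implementation: a topological sort
# whose output does not depend on the order the nodes were supplied in.
# `edges` maps a resource name to the names that must be applied after it.
def stable_topsort(nodes, edges)
  in_degree = Hash.new(0)
  nodes.each { |n| in_degree[n] += 0 }
  edges.each_value { |targets| targets.each { |t| in_degree[t] += 1 } }

  # Everything with no unmet dependency forms the first level;
  # ties inside a level are broken by comparing the string form.
  ready = nodes.select { |n| in_degree[n].zero? }.sort_by(&:to_s)
  order = []

  until ready.empty?
    order.concat(ready)
    next_level = []
    ready.each do |n|
      edges.fetch(n, []).each do |t|
        in_degree[t] -= 1
        next_level << t if in_degree[t].zero?
      end
    end
    ready = next_level.sort_by(&:to_s)
  end

  raise ArgumentError, "dependency cycle detected" if order.size != nodes.size
  order
end
```

Called with the same nodes in any input order, the sketch returns the same sequence, which is the repeatability property under discussion. It relies on the string form of each resource being unique, as the message above assumes for "Type[title]".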
On May 20, 2008, at 7:25 AM, Phillip Scholz wrote:
> Is there someone here, that wants to write a summary of this
> discussion and the different opinions? and perhaps put it in the wiki?

That'd be a great idea, but it's pretty clear neither Adam nor I should be writing that summary. I think each of us has provided soundbite-like sections in our first couple of emails that could be relatively easily snipped into docs on the wiki, but for the rest, it would take some heavy editing, I expect.

--
A gentleman is a man who can play the accordion but doesn't. --Unknown
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---
Luke Kanies
2008-May-20 14:46 UTC
[Puppet Users] Re: Philosophical Differences (using external data 2: export/collect)
On May 20, 2008, at 7:09 AM, David Schmitt wrote:
>
> Getting resources from a (external) database into the puppet manifests is high
> on my wishlist (see last mail, "using external data 1"). Without going into
> much implementation detail, and without claiming any efficiency or ergonomy,
> this can be done today:
>
> define ssh_key_from_db() {
>   sshkey { $name:
>     keys => sql_array("SELECT key FROM sshkeys WHERE id = '$name'")
>   }
> }
>
> ssh_key_from_db { sql_array('SELECT DISTINCT id FROM sshkeys'): }

[snipped other examples]

I think this kind of approach is reasonable -- embedding an external query directly in your Puppet code -- but it still seems like it should be nearly as easy to have the existing query syntax (Sshkey <<| |>>) be hooked into sql like this.

I guess what I like about using the query syntax is that it requires you to formalize your integration with Puppet a bit. It shouldn't require much coding, but it makes it more clear and more shareable.

I think your basic approaches make sense for initial, simple integrations, though, and if people settled on a good function, I'd go for it. The downside of two of your examples is that they only retrieve content, not resources, but doing so would require that extra development.

--
We cannot really love anybody with whom we never laugh. --Agnes Repplier
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
On Tue, May 20, 2008 at 7:40 AM, Luke Kanies <luke@madstop.com> wrote:
> That'd be a great idea, but it's pretty clear neither Adam nor I
> should be writing that summary.

What, Luke, I'm clearly fair and balanced! :)

Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
Adam Jacob
2008-May-20 19:09 UTC
[Puppet Users] Re: Philosophical Differences (using external data 1)
On Tue, May 20, 2008 at 4:56 AM, David Schmitt <david@schmitt.edv-bus.at> wrote:
>> Where the real cognitive divergence happens is that I take this a step
>> further, and say that it's actually the "right thing" that the world
>> works this way. I *do* want some data outside of Puppet. It's not
>> the right tool for many jobs. But those other tools could get large
>> scale benefit from access to that information, and Puppet is just
>> another one of those tools.
>
> I totally agree with you here Adam! When I look at how my "customers" are
> defined, see e.g.
> http://git.black.co.at/?p=manifests;a=blob;f=manifests/site_hosting/dasz.pp
> for a glimpse of it, I am amazed that I CAN do this AT ALL [1]. Still, coming
> so far, I yearn for more. Having this information in an "oddly formatted text
> file" doesn't help at all for e.g. creating a self-service web interface
> where people can say "Yes, I'd like to have wordpress too on my server". Of
> course part of this is lack of appropriate classes which could be bound into
> an external node classification tool but others like lines 28-31 (user
> creation) cannot be done with classes alone.

I think there is a general agreement amongst everyone that this sort of functionality is necessary. You detail one approach to it in another response.

Where lak and I diverge (and I think we hammered this out pretty distinctly) is in what we think the results of such a query should be. To me, I want the results to be essentially semi-structured data. I'm not sure where (or when) I might want to use a given piece of information, so I want to be able to query it and get actionable data structures back when I need them. This would mean extending Puppet's language in a couple directions, some of which I think might not fit well with the overall vision of how the "Puppet Way" will evolve.

Luke's vision would have you returning resources from an external source, instead of the data that would allow you to generate the resources in your manifest.

Both ways get you where you are going. :)

Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
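
To make the two positions concrete, here is a rough standalone sketch of the difference between a query that returns data and one that returns resources. Nothing in it is Puppet API: the inventory method, search_data, search_resources, and the example hosts are all hypothetical names and values invented for illustration.

```ruby
# Hypothetical external inventory, standing in for whatever data store
# lives outside Puppet. All hosts and addresses are made up.
def inventory
  [
    { hostname: "web1", fqdn: "web1.example.com", ipaddress: "10.0.0.11", tags: ["monitored", "web"] },
    { hostname: "db1",  fqdn: "db1.example.com",  ipaddress: "10.0.0.21", tags: ["monitored", "db"] },
    { hostname: "dev1", fqdn: "dev1.example.com", ipaddress: "10.0.0.31", tags: ["dev"] }
  ]
end

# "Data" style: the query hands back semi-structured rows, and the
# caller (a template, a definition) decides what to build from them.
def search_data(tag)
  inventory.select { |node| node[:tags].include?(tag) }
end

# "Resource" style: the query hands back finished resource-like records
# (type, title, parameters) that are ready to be added to a catalog.
def search_resources(tag)
  search_data(tag).map do |node|
    { type: "nagios_host", title: node[:fqdn], params: { address: node[:ipaddress] } }
  end
end
```

In this sketch the resource style is just the data style with one mapping step fixed in advance; the open question in the thread is whether that mapping belongs inside the query mechanism or in the manifest that consumes the data.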
Blake Barnett
2008-May-20 22:13 UTC
[Puppet Users] Re: Philosophical Differences (using external data 1)
On May 20, 2008, at 12:09 PM, Adam Jacob wrote:
>
> On Tue, May 20, 2008 at 4:56 AM, David Schmitt <david@schmitt.edv-bus.at> wrote:
>>> Where the real cognitive divergence happens is that I take this a step
>>> further, and say that it's actually the "right thing" that the world
>>> works this way. I *do* want some data outside of Puppet. It's not
>>> the right tool for many jobs. But those other tools could get large
>>> scale benefit from access to that information, and Puppet is just
>>> another one of those tools.
>>
>> I totally agree with you here Adam! When I look at how my "customers" are
>> defined, see e.g.
>> http://git.black.co.at/?p=manifests;a=blob;f=manifests/site_hosting/dasz.pp
>> for a glimpse of it, I am amazed that I CAN do this AT ALL [1]. Still, coming
>> so far, I yearn for more. Having this information in an "oddly formatted text
>> file" doesn't help at all for e.g. creating a self-service web interface
>> where people can say "Yes, I'd like to have wordpress too on my server". Of
>> course part of this is lack of appropriate classes which could be bound into
>> an external node classification tool but others like lines 28-31 (user
>> creation) cannot be done with classes alone.
>
> I think there is a general agreement amongst everyone that this sort
> of functionality is necessary. You detail one approach to it in
> another response.
>
> Where lak and I diverge (and I think we hammered this out pretty
> distinctly) is in what we think the results of such a query should be.
> To me, I want the results to be essentially semi-structured data.
> I'm not sure where (or when) I might want to use a given piece of
> information, so I want to be able to query it and get actionable data
> structures back when I need them. This would mean extending Puppet's
> language in a couple directions, some of which I think might not fit
> well with the overall vision of how the "Puppet Way" will evolve.
>
> Luke's vision would have you returning resources from an external
> source, instead of the data that would allow you to generate the
> resources in your manifest.
>
> Both ways get you where you are going. :)

I think this is really splitting hairs. Whether the data source returns an actual resource, or the data is _used_ by a native type or definition, the results would be equivalent. The ugly part is when the data is never given any structure (i.e. it never becomes a resource). Whether this happens as an export/collect, a native type with an external back-end, or a hacked up definition doesn't really matter to me.

-Blake