Schofield
2012-Dec-11 01:25 UTC
[Puppet Users] How to handle multi-variable cross cutting concerns in hiera?
I am working with puppet 3.0 and have the opportunity to build the hiera hierarchy from scratch. I am pondering which data should be included in hiera and how it should be organized. After some research it appears that most folks struggle when their data is dependent on multiple facts rather than a strict hierarchical data structure. For example: a value depends on the node location *and* what environment it is in dev|test|qa|prod.

In my mind a hiera hierarchy like the following which is based on network location of a node would work great because each level is more specific and a subset of the previous making overrides very clear and clean.

- fqdn - for node specific overrides
- cluster - cluster specific overrides
- network - all clusters are isolated on a network segment.
- common - the default

Now the difficult part is that I also want to externalize data in to hiera based on the network location *and* the environment. This becomes more complex if a third variable is added.

So the question is: Is there a best practice for handling hiera data values based on multiple attributes? In this case location *and* environment. One doesn't take precedence over the other but both are needed to find a unique and correct value.

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/puppet-users/-/XIQXwB5aBiwJ.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
jcbollinger
2012-Dec-11 14:56 UTC
[Puppet Users] Re: How to handle multi-variable cross cutting concerns in hiera?
On Monday, December 10, 2012 7:25:30 PM UTC-6, Schofield wrote:

> I am working with puppet 3.0 and have the opportunity to build the hiera hierarchy from scratch. I am pondering which data should be included in hiera and how it should be organized. After some research it appears that most folks struggle when their data is dependent on multiple facts rather than a strict hierarchical data structure. For example: a value depends on the node location *and* what environment it is in dev|test|qa|prod.

Well, yes. Hiera is a hierarchical data store. It works great if you can put your data in a well-defined priority hierarchy, because that's what it's designed to handle. It is flexible enough to support some kinds of deviations from a strict hierarchy, but the further you go from a hierarchical structure, the worse Hiera supports it. If your data isn't something reasonably close to hierarchical, then you might be better off choosing or creating an altogether different data store.

> In my mind a hiera hierarchy like the following which is based on network location of a node would work great because each level is more specific and a subset of the previous making overrides very clear and clean.
>
> - fqdn - for node specific overrides
> - cluster - cluster specific overrides
> - network - all clusters are isolated on a network segment.
> - common - the default

Absolutely. That sort of thing is Hiera's bread & butter.

> Now the difficult part is that I also want to externalize data in to hiera based on the network location *and* the environment. This becomes more complex if a third variable is added.
>
> So the question is: Is there a best practice for handling hiera data values based on multiple attributes? In this case location *and* environment. One doesn't take precedence over the other but both are needed to find a unique and correct value.

Hiera allows you to lay out your data in two dimensions: data file and key. Whatever selection rules you want to use to choose particular data need to operate in that context. There are at least three ways in which you can embed additional dimensions:

1. You can create separate hierarchies or hierarchy pieces based on node data, by interpolating the data into the hierarchy definition file
2. You can use compound keys
3. You can expand your values into hashes (with the hash keyspace constituting an additional dimension)

Those can be used separately or in combination, and even in self-combination, so in principle, you can use as many dimensions as you want. In practice, it can get very messy, very quickly.

The best Hiera approach for any given situation is highly dependent on the data that need to be stored and served. Factors such as whether commonalities are coincidental or by design, and who has logical administrative control over the data, may be important.

Supposing that you want to support a site under a single, unified administration and with a lot of commonality between different environments, I would suggest considering using option (1) above to add a dimension for environment. That would manifest in the hierarchy definition in your hiera.yaml file, which might look something like this:

    :hierarchy:
      - "%{fqdn}"
      - "%{environment}/%{cluster}"
      - "%{cluster}"
      - "%{environment}/%{network}"
      - "%{network}"
      - "%{environment}/common"
      - common

John
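As a rough illustration of how an interpolated hierarchy like the one above becomes a per-node search path, here is a minimal Ruby sketch. It is not Hiera's own code: the `expand_hierarchy` helper, the treatment of unknown variables, and the sample node values (`web01.example.com`, `prod`, `web`, `dmz`) are all assumptions made for the example.

```ruby
# Minimal sketch (not Hiera's implementation): expand %{var} tokens in a
# hierarchy definition using one node's data to get its ordered search path.
HIERARCHY = [
  '%{fqdn}',
  '%{environment}/%{cluster}',
  '%{cluster}',
  '%{environment}/%{network}',
  '%{network}',
  '%{environment}/common',
  'common'
].freeze

# Replace each %{name} token with the matching value from scope.
# Unknown names become empty strings (an assumption made for this sketch).
def expand_hierarchy(hierarchy, scope)
  hierarchy.map do |level|
    level.gsub(/%\{([^}]+)\}/) { scope.fetch(Regexp.last_match(1), '') }
  end
end

# Hypothetical values for a single node.
NODE_SCOPE = {
  'fqdn' => 'web01.example.com',
  'environment' => 'prod',
  'cluster' => 'web',
  'network' => 'dmz'
}.freeze

# Most specific source first, 'common' last.
SEARCH_PATH = expand_hierarchy(HIERARCHY, NODE_SCOPE)
puts SEARCH_PATH
```

Each environment-qualified level sits directly above its unqualified counterpart, so a value that differs per environment only needs to be written in the `prod/...` or `dev/...` file, while shared values stay in the plain cluster, network, or common file.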
Schofield
2012-Dec-11 17:10 UTC
[Puppet Users] Re: How to handle multi-variable cross cutting concerns in hiera?
> Hiera allows you to lay out your data in two dimensions: data file and key. Whatever selection rules you want to use to choose particular data need to operate in that context. There are at least three ways in which you can embed additional dimensions:
>
> 1. You can create separate hierarchies or hierarchy pieces based on node data, by interpolating the data into the hierarchy definition file
> 2. You can use compound keys
> 3. You can expand your values into hashes (with the hash keyspace constituting an additional dimension)

Would you mind going into detail on options 2 and 3?

> Those can be used separately or in combination, and even in self-combination, so in principle, you can use as many dimensions as you want. In practice, it can get very messy, very quickly.

Getting messy quickly is my concern if the hierarchy is not the best fit for the enterprise or the enterprise architecture changes. Are there any rules of thumb to consider that would suggest hiera is not the best data externalization tool and someone might be better off with an RDBMS or denormalized search index as the external data source?
Luke Bigum
2012-Dec-11 17:45 UTC
[Puppet Users] Re: How to handle multi-variable cross cutting concerns in hiera?
On Tuesday, December 11, 2012 5:10:48 PM UTC, Schofield wrote:

>> Hiera allows you to lay out your data in two dimensions: data file and key. Whatever selection rules you want to use to choose particular data need to operate in that context. There are at least three ways in which you can embed additional dimensions:
>>
>> 1. You can create separate hierarchies or hierarchy pieces based on node data, by interpolating the data into the hierarchy definition file
>> 2. You can use compound keys
>> 3. You can expand your values into hashes (with the hash keyspace constituting an additional dimension)
>
> Would you mind going into detail on options 2 and 3?
>
>> Those can be used separately or in combination, and even in self-combination, so in principle, you can use as many dimensions as you want. In practice, it can get very messy, very quickly.
>
> Getting messy quickly is my concern if the hierarchy is not the best fit for the enterprise or the enterprise architecture changes. Are there any rules of thumb to consider that would suggest hiera is not the best data externalization tool and someone might be better off with an RDBMS or denormalized search index as the external data source?

I can't speak for John, but I can take a guess at what he was getting at regarding hashes getting complicated. You can use Hiera to store complex information structures like the one below:

    postfix_additional_settings:
      smtp_tls_security_level: encrypt
      tls_random_source: dev:/dev/urandom
      smtpd_use_tls: "yes"
      smtpd_tls_loglevel: 1

Then inside a Puppet manifest or template you can retrieve and handle the hash in a more concise manner than requesting each postfix configuration key individually.
The Puppet and template snippet below will put any Postfix options I add to the hash above into my main.cf file without me having to go in and edit the postfix module itself:

Manifest:

    $postfix_additional_settings = hiera_hash('postfix_additional_settings', undef)
    $postfix_main_conf_file = '/etc/postfix/main.cf'
    file { $postfix_main_conf_file:
      content => template("${module_name}/${postfix_main_conf_file}.erb"),
    }

Template snippet:

    ###############################################################################
    # Everything below here comes from the Hiera postfix_additional_settings hash #
    ###############################################################################
    <% if @postfix_additional_settings -%>
    <% @postfix_additional_settings.sort.each do |key, val| -%>
    <% if val -%>
    <%= key %> = <%= val %>
    <% end -%>
    <% end -%>
    <% end -%>

That's not too bad for a Postfix config where all the keys are unique and there's only one level of depth. I don't have to have much complexity in my template file to handle the different types of Postfix options my sites have; that's all in Hiera.

Now here's a more complex template where we write a HAProxy configuration file.
This hash:

    haproxy_listen_hash:
      something:
        bind:
          ssl: 1.1.1.1:443 ssl crt /etc/pki/tls/private/1.1.1.1.pem
        servers:
          woof: 2.2.2.2:80 check
        opts:
          mode: tcp

Feeds this template:

    <% haproxy_listen_hash.sort.each do |key, listen_hash| -%>
    listen <%= key %>
    <% listen_hash.sort.each do |key, val| -%>
    <% if key == "bind" -%>
        # Bind to these addresses
        # -----------------------
    <% val.sort.each do |subkey, subval| -%>
        # <%= subkey %>
        bind <%= subval %>
    <% end -%>
    <% elsif key == "servers" -%>
        # Forward traffic to these servers
        # --------------------------------
    <% val.sort.each do |subkey, subval| -%>
        server <%= subkey %> <%= subval %>
    <% end -%>
    <% elsif key == "opts" -%>
        # Extra options
        # -------------
    <% val.sort.each do |subkey, subval| -%>
        <%= subkey %> <%= subval %><% if subkey == "stick-table" && haproxy_peers %> peers mypeers<% end %>
    <% end -%>
    <% elsif key == "stats" -%>
    <% val.sort.each do |subkey, subval| -%>
        stats <%= subkey %> <%= subval %>
    <% end -%>
    <% end -%>
    <% end -%>
    <% end -%>

It still works well for our purposes, but it's starting to get quite complicated. There are so many nested hashes that the template is difficult to read. We do manage to keep just raw HAProxy information in Hiera, though.

If you need to go there, using the hiera_hash function adds even more complexity. It merges the hashes found for a key at each level of your hierarchy into a single hash, which is useful for overriding different portions of your default hash in other parts of your hierarchy.

-Luke
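To make the merging behaviour of a hash lookup more concrete, here is a small Ruby sketch of a top-level merge across hierarchy levels in which the more specific level wins. It only approximates what `hiera_hash` does and is not its implementation; the level names, the sample data, and the `merge_levels` helper are hypothetical.

```ruby
# Sketch only: approximate a hiera_hash-style lookup by merging the hash found
# at each hierarchy level. Levels are listed most specific first, and a key set
# at a more specific level is kept. Level names and data are hypothetical.
LEVELS = [
  ['web01.example.com', { 'smtpd_tls_loglevel' => 2 }],
  ['common', { 'smtpd_use_tls' => 'yes', 'smtpd_tls_loglevel' => 1 }]
].freeze

# Merge top-level keys only (no deep merge of nested hashes).
def merge_levels(levels)
  levels.reduce({}) do |merged, (_name, data)|
    # Values already in `merged` come from more specific levels, so they win.
    data.merge(merged)
  end
end

MERGED = merge_levels(LEVELS)
MERGED.sort.each { |key, val| puts "#{key} = #{val}" }
```

With data shaped like this, a node-level file only has to carry the keys it overrides, and the defaults stay in the common file.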
jcbollinger
2012-Dec-12 14:45 UTC
[Puppet Users] Re: How to handle multi-variable cross cutting concerns in hiera?
On Tuesday, December 11, 2012 11:10:48 AM UTC-6, Schofield wrote:

>> Hiera allows you to lay out your data in two dimensions: data file and key. Whatever selection rules you want to use to choose particular data need to operate in that context. There are at least three ways in which you can embed additional dimensions:
>>
>> 1. You can create separate hierarchies or hierarchy pieces based on node data, by interpolating the data into the hierarchy definition file
>> 2. You can use compound keys
>> 3. You can expand your values into hashes (with the hash keyspace constituting an additional dimension)
>
> Would you mind going into detail on options 2 and 3?

Option 2: Instead of having keys of (say) this form:

    <module>::<class>::foo

at least some of them have a form similar to this:

    <environment>__<module>::<class>::foo

Then you account for that specifically when you perform lookups:

    $foo = hiera("${environment}__mymodule::myclass::foo")

That only works for explicit lookups, however: if you want to support class parameter autolookup, then you need to use the exact keys that autolookup expects. Of course, I generally recommend using explicit lookups instead of class parameters anyway, but that's a different discussion.

Option 3: Luke was right that I was talking about using complex data structures in your hiera data, but I was trying to describe a somewhat narrower and more specific use than he recognized. I was suggesting that (some of) your hiera data can look like this:

    mymodule::myclass::foo:
      development: dev_foo
      production: prod_foo

So that in your manifests you can write:

    $foo_hash = hiera('mymodule::myclass::foo')
    $foo = $foo_hash[$environment]

> Getting messy quickly is my concern if the hierarchy is not the best fit for the enterprise or the enterprise architecture changes. Are there any rules of thumb to consider that would suggest hiera is not the best data externalization tool and someone might be better off with an RDBMS or denormalized search index as the external data source?

I don't have any rules of thumb for you, as it really depends a lot on your priorities, and also somewhat on your available resources. However, I think some of the hiera metrics you should be evaluating are:

- The number of separate files you will need
- The complexity of the data files and their layout
- The amount of data duplication required
- Your manifests' usage of hiera-dependent features (mainly class parameter autolookup)

Also remember that hiera has some nice advantages stemming from the relatively simple form its data take. You can put them under version control alongside your manifests, for example, and you can modify them with a plain text editor. You don't need any separate software to be running to get at the data.

Remember too that hiera supports multiple, pluggable back-ends. Instead of replacing hiera, you could consider just adding a custom back-end for some of your more unruly data.

John
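The difference between options 2 and 3 can be sketched in a few lines of Ruby. Here a plain Hash stands in for the data Hiera would return; the keys, values, and the two helper methods are hypothetical and only meant to show where the environment dimension lives in each approach.

```ruby
# Sketch only: a plain Hash stands in for the data Hiera would return.
# All keys, values, and helper names here are hypothetical.
DATA = {
  # Option 2: the environment is folded into the key itself.
  'production__mymodule::myclass::foo' => 'prod_foo',
  'development__mymodule::myclass::foo' => 'dev_foo',
  # Option 3: one key whose value is a hash indexed by environment.
  'mymodule::myclass::bar' => {
    'production' => 'prod_bar',
    'development' => 'dev_bar'
  }
}.freeze

# Option 2: build the compound key, then do an ordinary lookup.
def lookup_compound(data, environment, key)
  data.fetch("#{environment}__#{key}")
end

# Option 3: look up the hash, then select the entry for the environment.
def lookup_by_hash(data, environment, key)
  data.fetch(key).fetch(environment)
end

puts lookup_compound(DATA, 'production', 'mymodule::myclass::foo')
puts lookup_by_hash(DATA, 'development', 'mymodule::myclass::bar')
```

In the compound-key form, every environment adds another top-level key; in the hash form the key count stays fixed and each value grows instead, which keeps the per-environment variants of a setting next to each other in the data file.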