Our current plan for the inventory service is to provide active_record termini for the "facts" and "inventory" indirections. This is to support fast look-up of facts, and search of nodes based on their facts.

However, there are already tables for facts, used for storeconfigs, along with an active_record terminus for facts. We want to avoid unnecessarily duplicating this behavior, by reusing the existing tables and terminus. This would result in the same fact data being used by both the inventory service and storeconfigs.

The only potential concern we can see with this is users wanting different fact expiration policies for inventory service and storeconfigs. Given the usage scenarios for storeconfigs that we are aware of, this seems unlikely (it sounds like storeconfig fact data is mostly being used as a stand-in for an inventory service). This proposal would have no other effect on storeconfigs.

Please share any other comments or concerns you may have related to this proposal, particularly if it would interfere with your current use of storeconfigs. Thanks.

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To post to this group, send email to puppet-dev@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.
Matt Robinson
2011-Feb-25 21:55 UTC
[Puppet Users] Re: [Puppet-dev] RFC: Database-backed inventory service plan
On Wed, Feb 23, 2011 at 2:04 PM, Nick Lewis <nick@puppetlabs.com> wrote:
> Our current plan for the inventory service is to provide active_record
> termini for the "facts" and "inventory" indirections. This is to support
> fast look-up of facts, and search of nodes based on their facts. However,
> there are already tables for facts, used for storeconfigs, along with an
> active_record terminus for facts. We want to avoid unnecessarily duplicating
> this behavior, by reusing the existing tables and terminus. This would
> result in the same fact data being used by both the inventory service and
> storeconfigs.

In principle I don't like the idea of tying the backends of storeconfigs and inventory together by sharing tables, especially since I'm not clear on the future of storeconfigs or a lot of details of how it's currently used, so it makes it harder to change implementation details. As a specific example, I don't like the schema storeconfigs has for storing fact data (explained in more detail below) and would prefer to use a different one. If we share tables this is awkward.

I propose that we don't share tables, and the inventory service (and any other future service that needs a database backend) has its own set of namespaced tables (servicename_tablename). Ideally I would like to use separate database schemas entirely, but that would be a bigger, harder to manage change with the current code that relies on the active_record terminus.

Currently the storeconfigs tables dealing with facts look something like this (I've removed the columns that are irrelevant to the inventory service):

    create_table :hosts do |t|
      t.column :name, :string, :null => false
    end

    create_table :fact_names do |t|
      t.column :name, :string, :null => false
    end

    create_table :fact_values do |t|
      t.column :value, :text, :null => false
      t.column :fact_name_id, :integer, :null => false
      t.column :host_id, :integer, :null => false
    end

I propose something more like:

    create_table :nodes do |t|
      t.column :name, :string, :null => false
      t.column :timestamp, :datetime
    end

    create_table :facts do |t|
      t.column :name, :string, :null => false
      t.column :value, :text, :null => false
      t.column :node_id, :integer, :null => false
    end

It's less normalized than the storeconfigs schema since fact names will be duplicated per node, but easier to understand and work with, and I think it better satisfies the types of queries we will be doing, which are of the form "select nodes where fact equal to value". The more normalized schema would be better for queries of the form "select all values for fact", but I don't think that's something we'll be doing. Correct me if I'm wrong.

Other benefits of the proposed schema include the "metadata" about each fact set being columns on the node table (Nick has also proposed that table be called fact_sets and have a column called node_name) instead of being stored as a fact. Also, we tend to use the word host all over our code (in both puppet and dashboard) when we really ought to use the word node, since host confuses people into thinking the host name is what identifies a node, when by default it's the fqdn and could be anything.

> Please share any other comments or concerns you may have related to this
> proposal, particularly if it would interfere with your current use of
> storeconfigs. Thanks.

Questions:

Do or will we want historical fact sets?
Current understanding is no, that we only store the most recent fact set per node. This makes the database smaller and I can't think of a motivator for wanting historical fact sets, but maybe someone else can.

What other "metadata" do we want to store about facts? Currently the only metadata we're storing is timestamp.
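
To make the query trade-off between the two schemas concrete, here is a minimal Ruby sketch. It models both table layouts as in-memory rows using plain Structs rather than ActiveRecord, and runs the "select nodes where fact equal to value" lookup against each. The Struct names, the sample rows, and the helper methods nodes_with_fact_normalized and nodes_with_fact_denormalized are all hypothetical, invented for illustration only; a real implementation would query the database.

```ruby
# Hypothetical in-memory stand-ins for the two schemas; not ActiveRecord.

# Storeconfigs-style (normalized) rows.
Host      = Struct.new(:id, :name)
FactName  = Struct.new(:id, :name)
FactValue = Struct.new(:value, :fact_name_id, :host_id)

# Proposed (less normalized) rows.
Node = Struct.new(:id, :name, :timestamp)
Fact = Struct.new(:name, :value, :node_id)

# Normalized: resolve the fact name to an id, match values, then map to hosts.
def nodes_with_fact_normalized(hosts, fact_names, fact_values, name, value)
  fact_name = fact_names.find { |fn| fn.name == name }
  return [] unless fact_name

  host_ids = fact_values
             .select { |fv| fv.fact_name_id == fact_name.id && fv.value == value }
             .map(&:host_id)
  hosts.select { |h| host_ids.include?(h.id) }.map(&:name)
end

# Less normalized: name and value sit on the same row, so one filter suffices.
def nodes_with_fact_denormalized(nodes, facts, name, value)
  node_ids = facts
             .select { |f| f.name == name && f.value == value }
             .map(&:node_id)
  nodes.select { |n| node_ids.include?(n.id) }.map(&:name)
end

# Made-up sample data: two nodes with one fact each.
HOSTS       = [Host.new(1, "web01.example.com"), Host.new(2, "db01.example.com")]
FACT_NAMES  = [FactName.new(1, "operatingsystem")]
FACT_VALUES = [FactValue.new("Debian", 1, 1), FactValue.new("CentOS", 1, 2)]

NODES = [Node.new(1, "web01.example.com", Time.now),
         Node.new(2, "db01.example.com", Time.now)]
FACTS = [Fact.new("operatingsystem", "Debian", 1),
         Fact.new("operatingsystem", "CentOS", 2)]

puts nodes_with_fact_normalized(HOSTS, FACT_NAMES, FACT_VALUES, "operatingsystem", "Debian").inspect
puts nodes_with_fact_denormalized(NODES, FACTS, "operatingsystem", "Debian").inspect
```

Both calls return the same node names for the sample data. The difference is that the normalized layout needs an extra lookup through fact_names before it can filter values, which corresponds to one more join in SQL.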
Matt Robinson
2011-Mar-01 21:32 UTC
[Puppet Users] Re: [Puppet-dev] RFC: Database-backed inventory service plan
On Wed, Feb 23, 2011 at 2:04 PM, Nick Lewis <nick@puppetlabs.com> wrote:
> Our current plan for the inventory service is to provide active_record
> termini for the "facts" and "inventory" indirections. This is to support
> fast look-up of facts, and search of nodes based on their facts. However,
> there are already tables for facts, used for storeconfigs, along with an
> active_record terminus for facts.

On Fri, Feb 25, 2011 at 1:55 PM, Matt Robinson <matt@puppetlabs.com> wrote:
> I propose that we don't share tables, and the inventory service (and
> any other future service that needs a database backend) has its own
> set of namespaced tables (servicename_tablename).

Thanks to those who gave feedback. The general consensus I've reached talking offline to other devs (Jacob, Nick, Paul) is that the inventory service should use its own tables, separate from the ones that storeconfigs currently uses.

The question of whether to normalize or denormalize (which I didn't mean to be the focus of this discussion at all) can be left up to the devs who end up working on the implementation, taking the discussion from this thread into account.

Matt