Hi All,

In my company's environment, we have multiple sites in multiple geographic locations, sometimes with high latency between the sites. I'm trying to come up with a solution that could provide Puppet infrastructure to all sites' nodes.

--- a few assumptions ---
- puppet manifests / configuration are fetched from a centralized version control system.
- a store db is needed (ssh keys, inventory, puppetshow etc.)
- failover can be done to any other puppet server in any other location (given that you have a common CA and build your configuration to detect from where you are connecting - see the sketch after my sign-off...)

--- Centralized option ---
+ easiest to manage.
+ can use store config centrally - one location for the db.
+ one global CA
+ can scale up with Mongrel (or others).
- slow for remote clients
- possible WAN cut-off

--- Decentralized option 1 - local puppet server + local sql db ---
+ fast local response time
+ no dependency on any other remote services.
- the store config db has to be exported to a centralized db in order to have a global view.

--- Decentralized option 2 - local puppet server + centralized remote db ---
+ fast local response time, but only for file transfers
+ can use store config centrally - one location for the db.
- configuration runs / database operations are really slow; machines with long manifests often time out before getting their configuration.

--- Decentralized option 3 - remote puppet server for configuration, local puppet server for file transfers + centralized remote db ---
+ fast local response time for file transfers
+ can use store config centrally - one location for the db.
- configuration runs / database operations are not as slow as in the previous setup, but I'm still not sure how this would behave under load...

Did I miss anything? Does anyone have a better idea?

Thanks!
Ohad
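P.S. By "detect from where you are connecting" I mean something like keying the file server choice off a client-side fact. A rough manifest sketch - the domains and server names are hypothetical:

    # Sketch only: pick the nearest file server from the client's
    # $domain fact. All domain and server names here are made up.
    case $domain {
        "eu.example.com":   { $fileserver = "puppet-eu.example.com" }
        "asia.example.com": { $fileserver = "puppet-asia.example.com" }
        default:            { $fileserver = "puppet-central.example.com" }
    }

    file { "/etc/motd":
        source => "puppet://$fileserver/files/motd",
    }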
David Schmitt
2008-Feb-15 07:23 UTC
Re: centralized or decentralized puppet infrastructure
Ohad Levy schrieb:
> In my company's environment, we have multiple sites in multiple
> geographic locations, sometimes with high latency between the sites.
> [...]
> Did I miss anything? Does anyone have a better idea?

You could use a replicated setup. This at least enables you to have fast reads. Since Puppet is self-healing, even serious delays in replication will not have a long-term impact on the installation.

Regards, DavidS
> You could use a replicated setup. This at least enables you to have
> fast reads. Since Puppet is self-healing, even serious delays in
> replication will not have a long-term impact on the installation.

I know about replication, but how do you tell Rails to write to a different database than the one it reads from?

Ohad
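(One common ActiveRecord answer at the time was to bind writes to a second connection through an abstract base class; plugins such as masochism packaged this up. A rough sketch, assuming hypothetical 'replica' and 'master' connection entries and a 'hosts' table, none of which are Puppet's actual schema:)

    # Minimal sketch: reads use a local replica, writes go to the master.
    # All connection entries, columns and table names here are hypothetical.
    require 'rubygems'
    require 'active_record'

    ActiveRecord::Base.configurations = {
      'replica' => { 'adapter' => 'mysql', 'host' => 'localhost',
                     'database' => 'puppet', 'username' => 'puppet' },
      'master'  => { 'adapter' => 'mysql', 'host' => 'db-master.example.com',
                     'database' => 'puppet', 'username' => 'puppet' }
    }
    ActiveRecord::Base.establish_connection :replica   # default: local reads

    # Abstract class holding the second, master connection.
    class MasterDatabase < ActiveRecord::Base
      self.abstract_class = true
      establish_connection :master
    end

    class Host < ActiveRecord::Base       # reads hit the replica
    end

    class MasterHost < MasterDatabase     # same table, master connection
      set_table_name 'hosts'
    end

    host = Host.find_by_name('web01.example.com')          # fast local read
    MasterHost.update(host.id, :last_compile => Time.now)  # write to master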
One of the things I haven't used yet, as my set-up is rather small, but which seems very useful here, is the automatic failover you get if you supply multiple sources, at least for file distribution.

If you go for decentralized option 3, the sources could all specify the central server as a second choice for fetching a file, which should work if your primary local file server is hosed.

On 15/02/2008, David Schmitt <david@schmitt.edv-bus.at> wrote:
> You could use a replicated setup. This at least enables you to have
> fast reads. Since Puppet is self-healing, even serious delays in
> replication will not have a long-term impact on the installation.

-- 
/peter
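(In manifest terms, the failover Peter describes is just an array of sources, tried in order. A sketch with hypothetical server and mount names:)

    # Sketch: the client tries the local file server first and falls
    # back to the central one. Server and mount names are hypothetical.
    file { "/etc/ntp.conf":
        source => ["puppet://puppet-local.example.com/files/ntp.conf",
                   "puppet://puppet-central.example.com/files/ntp.conf"],
    }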
Thanks Peter, but that was never the problem.

The main problem is the amount of time it takes to store the configuration in the database and hand the configuration to the client... in many cases this long period results in a timeout on the client, so in the end puppet exits without doing anything.

Cheers,
Ohad

On Feb 15, 2008 4:54 PM, Peter Hoeg <peter@hoeg.com> wrote:
> If you go for decentralized option 3, the sources could all specify
> the central server as a second choice for fetching a file, which
> should work if your primary local file server is hosed.
On Feb 15, 2008, at 8:05 PM, Ohad Levy wrote:
> The main problem is the amount of time it takes to store the
> configuration in the database and hand the configuration to the
> client... in many cases this long period results in a timeout on the
> client, so in the end puppet exits without doing anything.

This storage time has been a consistent problem with storeconfigs, and I've already spent a considerable amount of time optimizing it, with significant but not sufficient gains.

With 0.25.0, it should be easier to make the storage itself asynchronous, but that will only help if the total storage time is within what your server can handle.

If anyone has any recommendations for how to fix storeconfigs, I'm all ears.

-- 
I happen to feel that the degree of a person's intelligence is directly
reflected by the number of conflicting attitudes she can bring to bear
on the same topic. -- Lisa Alther
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
Maybe this could help?

http://wiki.rubyonrails.com/rails/pages/HowToDoAsynchronousDatabaseUpdate

Ohad

On Feb 17, 2008 4:16 AM, Luke Kanies <luke@madstop.com> wrote:
> With 0.25.0, it should be easier to make the storage itself
> asynchronous, but that will only help if the total storage time is
> within what your server can handle.
>
> If anyone has any recommendations for how to fix storeconfigs, I'm
> all ears.
On Feb 17, 2008, at 7:02 PM, Ohad Levy wrote:
> Maybe this could help?
>
> http://wiki.rubyonrails.com/rails/pages/HowToDoAsynchronousDatabaseUpdate

Hrm, that page seems to be empty...

-- 
In theory, there is no difference between theory and practice;
in practice, there is. -- Chuck Reid
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
James Turnbull
2008-Feb-18 02:47 UTC
Re: centralized or decentralized puppet infrastructure
Luke Kanies wrote:
> On Feb 17, 2008, at 7:02 PM, Ohad Levy wrote:
>> Maybe this could help?
>>
>> http://wiki.rubyonrails.com/rails/pages/HowToDoAsynchronousDatabaseUpdate
>
> Hrm, that page seems to be empty...

But this one isn't...

http://wiki.rubyonrails.com/rails/pages/HowToDoAsynchronousDatabaseUpdates

Regards

James

-- 
James Turnbull (james@lovedthanlost.net)
Author of:
- Pulling Strings with Puppet (http://www.amazon.com/gp/product/1590599780/)
- Pro Nagios 2.0 (http://www.amazon.com/gp/product/1590596099/)
- Hardening Linux (http://www.amazon.com/gp/product/1590594444/)
On Feb 17, 2008, at 8:47 PM, James Turnbull wrote:
> But this one isn't...
>
> http://wiki.rubyonrails.com/rails/pages/HowToDoAsynchronousDatabaseUpdates

That's pretty typical "do all the work yourself" asynchrony.

I expect that I'll end up dumping the configs to yaml, and then having a separate process that treats them as a queue, storing them into the db in turn. It should be pretty easy on both fronts, but, again, not until we've switched to the Indirection/REST stuff in 0.25.0.

-- 
He is indebted to his memory for his jests and to his imagination for
his facts. -- Richard Brinsley Sheridan
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
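(A rough sketch of the spool shape Luke describes. The paths and the store_in_database stand-in are hypothetical; this is not what Puppet shipped:)

    # Sketch of a spool-directory queue: the master dumps each catalog
    # to YAML; a separate worker drains the directory into the db.
    require 'yaml'

    SPOOL_DIR = '/var/puppet/storeconfigs-queue'   # hypothetical path

    # On the master, after compiling: write the catalog out, then rename
    # into place so the worker never sees a half-written file.
    def enqueue(hostname, catalog)
      tmp = File.join(SPOOL_DIR, ".#{hostname}.tmp")
      File.open(tmp, 'w') { |f| f.write(catalog.to_yaml) }
      File.rename(tmp, File.join(SPOOL_DIR, "#{hostname}.yaml"))
    end

    # Stand-in for the existing Rails storage code (hypothetical).
    def store_in_database(catalog)
      # ... the slow ActiveRecord writes happen here, off the client's path
    end

    # In a separate worker process: drain the queue, oldest first.
    def drain
      Dir[File.join(SPOOL_DIR, '*.yaml')].sort_by { |f| File.mtime(f) }.each do |path|
        store_in_database(YAML.load_file(path))
        File.unlink(path)
      end
    end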
Frank Sweetser
2008-Feb-18 03:24 UTC
Re: centralized or decentralized puppet infrastructure
James Turnbull wrote:
> But this one isn't...
>
> http://wiki.rubyonrails.com/rails/pages/HowToDoAsynchronousDatabaseUpdates

One comment about the insert delayed method. The docs on it:

http://dev.mysql.com/doc/refman/5.0/en/insert-delayed.html

mention that it will only work on MyISAM tables (and MEMORY and ARCHIVE, but those don't matter here). This means that it will not work on InnoDB tables, which effectively means that this feature is mutually exclusive with foreign key constraints.

If it came down to insert delayed vs foreign keys, my vote would absolutely be foreign keys.

-- 
Frank Sweetser fs at wpi.edu | For every problem, there is a solution that
WPI Senior Network Engineer  | is simple, elegant, and wrong. - HL Mencken
GPG fingerprint = 6174 1257 129E 0D21 D8D4 E8A3 8E39 29E3 E2E8 8CEC
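(For concreteness: DELAYED is just a keyword on the INSERT statement, so from ActiveRecord it would have to be issued as raw SQL. Table and column names below are illustrative, loosely modeled on the storeconfigs layout rather than taken from Puppet's code:)

    # Sketch: issuing a delayed insert from ActiveRecord as raw SQL.
    # INSERT DELAYED is only meaningful on MyISAM/MEMORY/ARCHIVE tables;
    # the table and columns here are illustrative.
    ActiveRecord::Base.connection.execute(<<-SQL)
      INSERT DELAYED INTO fact_values (host_id, fact_name_id, value)
      VALUES (42, 7, 'x86_64')
    SQL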
Ramon van Alteren
2008-Feb-18 13:47 UTC
Re: centralized or decentralized puppet infrastructure
Frank Sweetser wrote:
> One comment about the insert delayed method. The docs on it:
>
> http://dev.mysql.com/doc/refman/5.0/en/insert-delayed.html
>
> mention that it will only work on MyISAM tables (and MEMORY and
> ARCHIVE, but those don't matter here). This means that it will not
> work on InnoDB tables, which effectively means that this feature is
> mutually exclusive with foreign key constraints.
>
> If it came down to insert delayed vs foreign keys, my vote would
> absolutely be foreign keys.

The above is not a problem for InnoDB: it uses row-level locking, so it doesn't NEED insert delayed. MyISAM and most of the other storage engines need it very badly, because they lock the entire table to do even a single insert. That causes inserts to stall until they get a lock on the table. In that case delayed inserting is valuable; it doesn't do much more than offload the wait for the lock to the database server.

InnoDB allows concurrent inserts/updates/selects/deletes on the same table, but NOT on the same record.

Regards,

Ramon
Frank Sweetser
2008-Feb-18 19:50 UTC
Re: centralized or decentralized puppet infrastructure
Ramon van Alteren wrote:
> The above is not a problem for InnoDB: it uses row-level locking, so
> it doesn't NEED insert delayed. MyISAM and most of the other storage
> engines need it very badly, because they lock the entire table to do
> even a single insert.
>
> InnoDB allows concurrent inserts/updates/selects/deletes on the same
> table, but NOT on the same record.

Interesting - so if I'm understanding you correctly, does this mean that just switching table types from MyISAM to InnoDB would get the same gains that insert delayed would offer on MyISAM tables?

-- 
Frank Sweetser fs at wpi.edu | For every problem, there is a solution that
WPI Senior Network Engineer  | is simple, elegant, and wrong. - HL Mencken
GPG fingerprint = 6174 1257 129E 0D21 D8D4 E8A3 8E39 29E3 E2E8 8CEC
Blake Barnett
2008-Feb-18 21:52 UTC
Re: centralized or decentralized puppet infrastructure
On Feb 17, 2008, at 7:23 PM, Luke Kanies wrote:
> That's pretty typical "do all the work yourself" asynchrony.
>
> I expect that I'll end up dumping the configs to yaml, and then having
> a separate process that treats them as a queue, storing them into the
> db in turn. It should be pretty easy on both fronts, but, again, not
> until we've switched to the Indirection/REST stuff in 0.25.0.

It might be fun to use Runnels as the queue mechanism....

Also, for large environments, we could use starfish (or some other map/reduce method) to distribute processing updates.

But I can't help but think that we're doing much more work for updates than we need to be. Even if only a single fact_value changes for a host, we do all the work of looking up everything about the host, etc. If puppetmasterd has a copy of the last update and can do a much quicker comparison (perhaps even keeping a small cache in memory), this would definitely help a lot.

-Blake
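(A minimal sketch of the comparison Blake suggests: remember the last-seen facts per host and only write rows whose values changed. FactCache and update_fact_value are hypothetical, not puppetmasterd internals:)

    # Sketch: diff incoming facts against an in-memory cache so only
    # changed values hit the database.
    class FactCache
      def initialize
        @last_seen = {}   # hostname => { fact name => value }
      end

      def store_facts(hostname, new_facts)
        old = @last_seen[hostname] || {}
        changed = new_facts.reject { |name, value| old[name] == value }
        @last_seen[hostname] = new_facts
        changed.each { |name, value| update_fact_value(hostname, name, value) }
      end

      # Stand-in for the real row-level write (hypothetical).
      def update_fact_value(hostname, name, value)
        puts "UPDATE #{hostname}: #{name} = #{value}"
      end
    end

    cache = FactCache.new
    cache.store_facts('web01', 'kernel' => 'Linux', 'memory' => '4GB')
    cache.store_facts('web01', 'kernel' => 'Linux', 'memory' => '8GB')  # one write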
On Feb 18, 2008, at 3:52 PM, Blake Barnett wrote:
> But I can't help but think that we're doing much more work for updates
> than we need to be. Even if only a single fact_value changes for a
> host, we do all the work of looking up everything about the host, etc.
> If puppetmasterd has a copy of the last update and can do a much
> quicker comparison (perhaps even keeping a small cache in memory),
> this would definitely help a lot.

I'm sure there's some way to speed this up, but I can't seem to find it; someone else is going to have to take a crack at it, I think.

-- 
The surest sign that intelligent life exists elsewhere in the universe
is that it has never tried to contact us.
--Calvin and Hobbes (Bill Watterson)
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
Blake Barnett
2008-Feb-18 22:17 UTC
Re: centralized or decentralized puppet infrastructure
On Feb 18, 2008, at 1:55 PM, Luke Kanies wrote:
> I'm sure there's some way to speed this up, but I can't seem to find
> it; someone else is going to have to take a crack at it, I think.

I definitely want to, and will unless someone better at this stuff beats me to it.

-Blake
Stig Sandbeck Mathisen
2008-Feb-19 07:11 UTC
Re: centralized or decentralized puppet infrastructure
Frank Sweetser <fs@WPI.EDU> writes:
> Interesting - so if I'm understanding you correctly, does this mean
> that just switching table types from MyISAM to InnoDB would get the
> same gains that insert delayed would offer on MyISAM tables?

Almost, but not quite. You would remove the problems introduced by MyISAM that would require you to use insert delayed.

With InnoDB you gain transactions, which among other things mean you don't have to lock your tables to prevent inconsistencies. It also means you can back up by dumping your database within a transaction, without having to lock every table. On the other hand, you can also do backups from a slave mysql server, which won't impact master performance.

Short, and possibly horribly wrong in several circumstances:

* MyISAM: Good read speed. Fast SELECTs, slow multiple INSERTs due to
  table locking. Can do backup via the file system. Good for data you
  write once and read a lot. Does not do foreign key constraints.

* InnoDB: Good write speed. Reliability due to transactions (if you
  use them). Fast INSERTs due to row locking. Cannot do backup via
  the file system. Good for data you write a lot to.

If you don't have many concurrent reads or writes, the speed and locking probably don't matter. As far as I remember, foreign keys across different storage backends do not work.

If I need to use MySQL, I'd pick InnoDB by default for all tables, except the "mysql" system database, whose tables are required to be MyISAM. If I didn't need MySQL, I'd pick PostgreSQL instead. Both work just fine.

-- 
Stig Sandbeck Mathisen, Linpro
Ramon van Alteren
2008-Feb-19 09:01 UTC
Re: centralized or decentralized puppet infrastructure
Stig Sandbeck Mathisen wrote:
> Frank Sweetser <fs@WPI.EDU> writes:
>> Interesting - so if I'm understanding you correctly, does this mean
>> that just switching table types from MyISAM to InnoDB would get the
>> same gains that insert delayed would offer on MyISAM tables?
>
> Almost, but not quite.

For that particular part of the functionality, that is true, yes. But switching the storage engine under a database server is slightly more complex than just that. As usual, there are trade-offs involved.

> You would remove the problems introduced by MyISAM that would require
> you to use insert delayed.
>
> With InnoDB you gain transactions, which among other things mean you
> don't have to lock your tables to prevent inconsistencies. It also
> means you can back up by dumping your database within a transaction,
> without having to lock every table.

You gain transactions, row-level locking, a proper relational integrity model, several different index types and some other miscellaneous stuff that may or may not interest you.

> On the other hand, you can also do backups from a slave mysql server,
> which won't impact master performance.

But that might introduce data inconsistencies if there was a previous problem with replication.

> * MyISAM: Good read speed. Fast SELECTs, slow multiple INSERTs due to
>   table locking. Can do backup via the file system. Good for data you
>   write once and read a lot. Does not do foreign key constraints.

Do NOT run long, expensive queries on a MyISAM-based database that you would also like to use for other purposes. Copying through the filesystem is something both engines understand.

> * InnoDB: Good write speed. Reliability due to transactions (if you
>   use them). Fast INSERTs due to row locking. Cannot do backup via
>   the file system. Good for data you write a lot to.

InnoDB can be copied from the filesystem, but not while the server is running. It needs regular on-disk format cleaning IF you delete a lot of records: deleted rows keep occupying space in the on-disk format, since the engine basically just removes the pointer to the row. Much better automatic recovery from crashes than MyISAM, though.

> If you don't have many concurrent reads or writes, the speed and
> locking probably don't matter. As far as I remember, foreign keys
> across different storage backends do not work.
>
> If I need to use MySQL, I'd pick InnoDB by default for all tables,
> except the "mysql" system database, whose tables are required to be
> MyISAM. If I didn't need MySQL, I'd pick PostgreSQL instead.

An interesting feature of MySQL is its replication, which makes it a stellar performer in the web arena. Most web apps have a database access pattern of 80-99% reads and 1-20% writes; Puppet would seem to fall into the same category, if I'm not mistaken. Replication lets you cater very easily to this pattern, far more so than PostgreSQL + Slony, which is a bitch to set up compared to MySQL.

Additionally, MySQL supports cross-storage-engine replication, which allows you to do the following: a master (for writes) replicating to multiple slaves on MyISAM, plus a few slaves on InnoDB. Then define three connection types in your application (see the sketch below):

- Write host: for writing data
- FastReader: MyISAM slaves, for queries that are guaranteed to come back within 1 sec
- SlowReader: InnoDB slaves, for new/slow queries and backups

If you need even more performance for specific parts, you can mix in HEAP (in-memory) tables for specific data. MySQL doesn't care much if it needs to replicate to different storage backends on a single server. You can have table1 on InnoDB, table2 on MyISAM and table3 on HEAP in the same database if you want to (and, more importantly, IF you think you can maintain it :-)

Ramon
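(Ramon's three roles map naturally onto abstract ActiveRecord connection classes, in the same style as the earlier read/write-split sketch. The connection names are hypothetical database.yml entries:)

    # Sketch of the three connection roles; every name here is a
    # hypothetical database.yml entry, not anything Puppet ships.
    require 'rubygems'
    require 'active_record'

    class WriteHost < ActiveRecord::Base
      self.abstract_class = true
      establish_connection :master          # all writes go here
    end

    class FastReader < ActiveRecord::Base
      self.abstract_class = true
      establish_connection :myisam_slave    # sub-second, well-indexed queries
    end

    class SlowReader < ActiveRecord::Base
      self.abstract_class = true
      establish_connection :innodb_slave    # ad-hoc/slow queries and backups
    end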