Hi all, hoping someone may have encountered a problem similar to this before:

On my customer's EC2-based infrastructure, we have implemented the nodeless, truth-driven module outlined by Jordan Sissel here:
http://www.semicomplete.com/blog/geekery/puppet-nodeless-configuration
It's quite an effective model, especially in the realm of EC2. We still have a puppetmaster configuration, and have decided not to go with the masterless option, since we'd like to implement a nagios module using the storeconfigs approach and exported resources.

Problem is, we're seeing large latency during agent runs, which appears to be down to the config retrieval step, so much so that we've needed to increase timeouts to stop our Apache/Mongrel puppetmasterd setup from timing out the connection. I've done some basic profiling, using --summarize and --evaltrace, which shows that the bottleneck *appears* to be at the config retrieval level:

(This is on our monitoring server)

Time:
    Attachedstorage: 2.00
    Class: 0.00
    Collectd conf: 0.01
    Config retrieval: 85.91
    Cron: 0.00
    Exec: 34.11
    File: 35.56
    Filebucket: 0.00
    Group: 0.26
    Mailalias: 0.17
    Mount: 3.48
    Nagios command: 0.02
    Nagios contact: 0.00
    Nagios contactgroup: 0.00
    Nagios host: 0.02
    Nagios service: 0.12
    Nagios servicegroup: 0.00
    Package: 3.12
    Resources: 0.00
    Schedule: 0.01

This agent run was done with storeconfigs enabled, but the Nagios resources themselves appear to be processed fairly quickly. When we disable storeconfigs, config retrieval takes almost half as long, which would suggest db latency. We've got the puppetmaster running on an m1.small EC2 instance, which only seems to have a single core - I'm not sure if that's perhaps the cause of the bottleneck?

Any suggestions / advice would be much appreciated, thanks in advance!

Cheers,
Andrew

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/puppet-users/-/P1szoP5rBmUJ. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
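
To make the reading of the --summarize output above a bit more concrete, here is a minimal Python sketch that parses the "Time:" listing from the post and ranks the categories by share of the listed time. The function names (parse_timings, rank_slowest) are made up for illustration and are not part of Puppet. The share is computed over the categories shown in the post only, since no total line appears there.

```python
import re

# Timing section copied from the post (seconds per category).
SUMMARY = """\
Attachedstorage: 2.00
Class: 0.00
Collectd conf: 0.01
Config retrieval: 85.91
Cron: 0.00
Exec: 34.11
File: 35.56
Filebucket: 0.00
Group: 0.26
Mailalias: 0.17
Mount: 3.48
Nagios command: 0.02
Nagios contact: 0.00
Nagios contactgroup: 0.00
Nagios host: 0.02
Nagios service: 0.12
Nagios servicegroup: 0.00
Package: 3.12
Resources: 0.00
Schedule: 0.01
"""


def parse_timings(text):
    """Return {category: seconds} for every 'Name: 1.23' line in text."""
    timings = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?):\s+([0-9]+(?:\.[0-9]+)?)\s*$", line)
        if match:
            timings[match.group(1)] = float(match.group(2))
    return timings


def rank_slowest(timings, top=3):
    """Return the `top` slowest categories as (name, seconds, share) tuples.

    `share` is the fraction of the summed listed time, not of wall-clock time.
    """
    total = sum(timings.values())
    ordered = sorted(timings.items(), key=lambda item: item[1], reverse=True)
    return [(name, secs, secs / total) for name, secs in ordered[:top]]


if __name__ == "__main__":
    for name, secs, share in rank_slowest(parse_timings(SUMMARY)):
        print(f"{name:<20} {secs:>7.2f}s  {share:6.1%}")
```

On the numbers from the post, config retrieval comes out as roughly half of the listed time, with File and Exec next, which matches the impression that the time is mostly spent before any resources are applied.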
We ended up upgrading the EC2 instance from an m1.small to a c1.medium. It was bottoming out on CPU load, and moving to a dual-core instance resolved the issue :) No more timeouts!! And a happy customer too.

On Tuesday, March 6, 2012 11:46:14 AM UTC, Andrew Stangl wrote:
>
> Hi all, hoping someone may have encountered a problem similar to this
> before:
>
> On my customer's EC2 based infrastructure, we have implemented the
> nodeless, truth driven module outlined by Jordan Sissel here
> http://www.semicomplete.com/blog/geekery/puppet-nodeless-configuration.
> It's quite an effective model, especially in the realm of EC2... we still
> have a puppetmaster configuration, and have decided not to go with the
> masterless option, since we'd like to implement a nagios module using the
> storeconfigs approach and exported resources.
>
> Problem is, we're seeing large latency during agent runs, which appears to
> be down to the config retrieval step, so much so that we've needed to
> increase timeouts to avoid our Apache/Mongrel puppetmasterd solution from
> timing out the connection. I've done some basic profiling, using
> --summarize and --evaltrace, which shows that the bottleneck *appears* to
> be happening at the config retrieval level:
>
> (This is on our monitoring server)
>
> Time:
> Attachedstorage: 2.00
> Class: 0.00
> Collectd conf: 0.01
> Config retrieval: 85.91
> Cron: 0.00
> Exec: 34.11
> File: 35.56
> Filebucket: 0.00
> Group: 0.26
> Mailalias: 0.17
> Mount: 3.48
> Nagios command: 0.02
> Nagios contact: 0.00
> Nagios contactgroup: 0.00
> Nagios host: 0.02
> Nagios service: 0.12
> Nagios servicegroup: 0.00
> Package: 3.12
> Resources: 0.00
> Schedule: 0.01
>
> This agent run has been done with storeconfigs enabled, but it appears
> that the Nagios resources are being processed fairly quickly; when we
> disable storeconfigs, the config retrieval can be reduced to almost half as
> long, which would suggest db latency. We've got the puppetmaster running on
> a m1.small EC2 instance, which only seems to have a single core - I'm not
> sure if that's perhaps the cause of the bottleneck?
>
> Any suggestions / advice would be much appreciated, thanks in advance!
>
> Cheers,
> Andrew
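
For anyone wanting to check for the same kind of CPU saturation on their own master before resizing, a rough Python sketch follows. It compares the 1-minute load average to the number of cores. The helper names and the 1.0 load-per-core threshold are assumptions chosen for illustration, not something measured on the system in this thread, and on Linux the load average also counts tasks waiting on disk I/O, so treat the result as a hint rather than proof.

```python
import os


def load_per_core(load_1min, cores):
    """Return the 1-minute load average divided by the core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_1min / cores


def is_cpu_saturated(load_1min, cores, threshold=1.0):
    """Rule of thumb (assumed threshold): load per core at or above 1.0."""
    return load_per_core(load_1min, cores) >= threshold


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    cores = os.cpu_count() or 1
    load_1min, _, _ = os.getloadavg()
    state = "saturated" if is_cpu_saturated(load_1min, cores) else "has headroom"
    print(f"load {load_1min:.2f} on {cores} core(s): {state}")
```

Run on the puppetmaster while several agents are checking in: the same load figure that saturates a single-core instance may leave headroom on a dual-core one.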