Tim Lank
2012-May-08 12:35 UTC
[Puppet Users] 12% of my puppet clients -- Could not retrieve catalog from remote server: execution expired
How do I troubleshoot this error, which occurs for about 12% of my Puppet clients (~70 out of ~550)? -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Arnau Bria
2012-May-08 13:03 UTC
Re: [Puppet Users] 12% of my puppet clients -- Could not retrieve catalog from remote server: execution expired
On Tue, 8 May 2012 05:35:34 -0700 (PDT) Tim Lank wrote:
> how do I troubleshoot this error that occurs for about 12% of the
> puppet clients (~70 out of ~550.)

Do they run as a daemon? Is it always the same 70 hosts that fail? Do they run at the same time?

Cheers,
Arnau
Tim Lank
2012-May-08 14:59 UTC
Re: [Puppet Users] 12% of my puppet clients -- Could not retrieve catalog from remote server: execution expired
They do run as a daemon. It is pretty much always the same 70, and they don't all run at the same time. Many do, but not all.
Steve Shipway
2012-May-09 09:45 UTC
RE: [Puppet Users] 12% of my puppet clients -- Could not retrieve catalog from remote server: execution expired
Not sure if it is the same issue, but we had a lot of timeout errors for catalogue retrieval once we started getting to the 200 nodes/hour point. We changed puppet to run every 2 hours, and all was well until we had 450 nodes (again, roughly 200/hr) and the problem resurfaced. I take it to be some limitation in the puppet system.

Now we've just finished installing a fully distributed puppet setup, with one frontend and four backend puppetmasters. This should be able to handle 800/hr if the previous tests were right, and we can expand horizontally indefinitely.

It could just be that you've reached the limit of your puppet infrastructure.

I also found that features such as storeconfigs greatly slow things down and reduce how many catalogues/hr can be served (thin storeconfigs is much better). We were advised of this limitation when we put it in, but I had to try it out myself and see...

Steve

Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.shipway@auckland.ac.nz
Ph: +64 9 373 7599 ext 86487
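The nodes-per-hour figures above are simple arithmetic, and it can help to run the same calculation for your own site. The following is a minimal sketch, not anything Puppet ships: the function names are made up for illustration, the 30-minute interval applied to the original poster's ~550 clients is an assumption (the thread does not state their run interval), and the 200/hr-per-master figure is only the rough ceiling reported in this message, not a general Puppet limit.

```python
import math


def catalog_requests_per_hour(nodes: int, run_interval_minutes: float) -> float:
    """Average catalog requests per hour if every node checks in once per interval."""
    if run_interval_minutes <= 0:
        raise ValueError("run_interval_minutes must be positive")
    return nodes * 60.0 / run_interval_minutes


def masters_needed(requests_per_hour: float, per_master_capacity: float) -> int:
    """Smallest number of masters whose combined capacity covers the request rate."""
    if per_master_capacity <= 0:
        raise ValueError("per_master_capacity must be positive")
    return math.ceil(requests_per_hour / per_master_capacity)


# 450 nodes checking in every 2 hours, as described in the message above.
print(catalog_requests_per_hour(450, 120))  # 225.0

# ASSUMPTION: ~550 clients on a 30-minute interval (interval not given in the thread).
rate = catalog_requests_per_hour(550, 30)
print(rate)  # 1100.0

# ASSUMPTION: ~200 catalogs/hr per master, the rough ceiling reported above.
print(masters_needed(rate, 200))  # 6
```

These are averages only; if many agents check in at the same moment, the peak rate will be higher than the figure computed here.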
Jake - USPS
2012-May-09 12:55 UTC
Re: [Puppet Users] 12% of my puppet clients -- Could not retrieve catalog from remote server: execution expired
I was getting timeouts before as well. Usually they had to do with Apache MaxClients being reached (we run an Apache/Passenger setup), so I increased that when the system could handle some more load. Other times they came from too much load on our puppetmasters, so we needed to increase the number of CPUs and adjust 'PassengerMaxPoolSize' in the Apache config. Finally, we also ran into 'open file' limit issues with the number of connections/sockets, which caused problems with Passenger, so I had to bump that up as well (from the 1024 default to 2048).

We have ~4500 systems running every 30 minutes. We use 4 systems with 16 cores each to support this. The systems run with a load of around 30% right now, so really all we probably need is 2 of these systems, but we want redundancy. So we handle ~9000 runs/hr with this setup, to give you an idea of runs/hr and horsepower.

Regards,
Jake
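For the 'open file' limit mentioned above, a quick way to see what limit a process actually inherits is Python's standard `resource` module. This is only a sketch under assumptions: it reports the limit of the Python process itself, so it is meaningful only when run as the same user and from the same environment as the Apache/Passenger processes; the `fd_headroom` helper, the 1500 expected connections, and the reserve of 64 descriptors are hypothetical values chosen for illustration.

```python
import resource


def fd_headroom(soft_limit: int, expected_connections: int, reserve: int = 64) -> int:
    """Descriptors left over after expected sockets plus a reserve for logs, libraries, etc.

    A negative result means the soft limit is too low for the expected load.
    """
    return soft_limit - (expected_connections + reserve)


def current_nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = current_nofile_limits()
print("soft limit:", soft, "hard limit:", hard)

# ASSUMPTION: 1500 simultaneous connections is a made-up figure for illustration.
expected = 1500
if soft == resource.RLIM_INFINITY:
    print("soft limit is unlimited")
elif fd_headroom(soft, expected) < 0:
    print("soft limit too low for", expected, "connections; consider raising it")
else:
    print("headroom:", fd_headroom(soft, expected))
```

With the numbers from the message, a limit of 1024 would fall short of this hypothetical load while 2048 would leave some room.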