Dick Davies
2010-Apr-14 15:35 UTC
[Puppet Users] Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
I'm getting a lot of 'connection reset' errors all of a sudden on our 0.24.8 puppetmaster. I'm assuming that's a load issue?

Number of nodes has been stable for a month or so, but a lot of them were rebooted at the same time yesterday so they rain down requests on the poor (untuned, WEBrick-based) puppetmaster in concert every 30 minutes.

Quick hack is to try to manually stagger them (by shunting half the node runs back 15 minutes), but obviously it's time to scale this install up.

I was thinking of bumping to 0.25.3 (latest EPEL, puppetmaster then nodes), and then switching over to Passenger. Is that a sensible approach? Thanks!

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Peter Meier
2010-Apr-14 15:42 UTC
Re: [Puppet Users] Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
> I was thinking of bumping to 0.25.3 (latest EPEL, puppetmaster then nodes), and then switching over to Passenger. Is that a sensible approach? Thanks!

Yes, but I would go for 0.25.4 (you'll find EPEL RPMs for it) or wait a little and get 0.25.5. Both versions contain numerous fixes, and 0.25.5rc1 runs smoothly here.

cheers pete
Jeff McCune
2010-Apr-14 15:44 UTC
Re: [Puppet Users] Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
On Wednesday, April 14, 2010, Dick Davies <rasputnik@hellooperator.net> wrote:
> I'm getting a lot of 'connection reset' errors all of a sudden on our 0.24.8 puppetmaster.
> [...]
> I was thinking of bumping to 0.25.3 (latest EPEL, puppetmaster then nodes), and then switching over to Passenger. Is that a sensible approach? Thanks!

I encountered similar issues with the default master, and Passenger has alleviated all of them for me. I definitely recommend Passenger. 0.25.3 has some SSL-related bugs that you may want to avoid by going straight to 0.25.4.

-Jeff
Ken
2010-Apr-14 17:35 UTC
[Puppet Users] Re: Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
+1 on Passenger ... we wouldn't survive without it. We are running 0.25.4 as well.

In regards to the nodes hitting the puppetmaster at the same time: I presume you've looked at the 'splay' option? There is a gotcha at the moment, however: http://projects.puppetlabs.com/issues/3321

Vote if you think it's important :-).
Patrick
2010-Apr-14 18:52 UTC
Re: [Puppet Users] Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
On Apr 14, 2010, at 8:44 AM, Jeff McCune wrote:
> I encountered similar issues with the default master and passenger has alleviated all of those issues for me. I definitely recommend passenger. 0.25.3 has some SSL related bugs you may want to avoid with 0.25.4

Upgrading your server and clients to 0.25.x will probably double your server's capacity. Adding Passenger makes it much higher. My server with Passenger is a Pentium D with 512 MB of memory. It's just fine when I tell all the clients to hit it at once. Even then, the puppet runs only take twice as long (120 seconds compared to 60 seconds).
Nikolay Sturm
2010-Apr-15 06:09 UTC
[Puppet Users] Re: Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
On Apr 14, 11:35 am, Dick Davies <rasput...@hellooperator.net> wrote:
> I'm getting a lot of 'connection reset' errors all of a sudden on our 0.24.8 puppetmaster.
> I'm assuming that's a load issue?

Probably. I had the same problem recently and solved it by serializing puppetd runs. It works by restarting my puppetd processes daily at a host-specific time from cron:

- have a sorted list of all nodes accessing your puppetmaster
- external node classifier computes rank of node (the host's place in above list) modulo 30 and puts it into a parameter
- install cronjob on each machine to restart puppet daily at some hour and rank minute

It's not perfect, but instead of load spikes up to 10 for several minutes, the load on my puppetmaster stays between 0.5 and 1.

It would be really nice if puppetmaster had some intelligence to help spread out clients like this automatically.

cheers, Nikolay
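The rank-modulo-30 scheme described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not Nikolay's actual classifier: the function names, the example hostnames, the restart hour, and the init script path are all assumptions made for the sketch.

```python
def restart_minute(hostname, all_nodes, window=30):
    """Return the host's rank in the sorted node list, modulo the window.

    window=30 matches the 30-minute run interval mentioned in the thread.
    """
    return sorted(all_nodes).index(hostname) % window


def cron_line(hostname, all_nodes, hour=4):
    """Build a crontab entry that restarts puppet daily at hour:rank-minute.

    The hour (4) and the /etc/init.d/puppet path are hypothetical choices.
    """
    minute = restart_minute(hostname, all_nodes)
    return f"{minute} {hour} * * * /etc/init.d/puppet restart"


if __name__ == "__main__":
    # Hypothetical node list; in practice this would come from wherever
    # the external node classifier keeps its inventory.
    nodes = ["web02.example.com", "db01.example.com", "web01.example.com"]
    for node in sorted(nodes):
        print(node, "->", cron_line(node, nodes))
```

Because the daemon schedules its runs relative to its own start time, restarting each host at a different minute spreads the 30-minute cycles across the half hour. With more than 30 nodes the minutes wrap around, so several hosts share a slot, but the load is still spread far more evenly than when every host starts together.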
David Schmitt
2010-Apr-15 08:12 UTC
Re: [Puppet Users] Re: Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
On 4/15/2010 8:09 AM, Nikolay Sturm wrote:
> It would be really nice, if puppetmaster had some intelligence, to help spreading out clients like this automatically.

There is --splay

Regards, D.
Dick Davies
2010-Apr-15 09:05 UTC
Re: [Puppet Users] Re: Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
On Wed, Apr 14, 2010 at 6:35 PM, Ken <ken@bob.sh> wrote:
> +1 on the passenger ... we wouldn't survive without it. We are running 0.25.4 as well.

That seems to be the general advice, thanks.

> In regards to the nodes hitting the puppetmaster at the same time - I presume you've looked at the 'splay' option?

Ah, that looks ideal!

> http://projects.puppetlabs.com/issues/3321

Ah, crap! :D

> Vote if you think its important :-).

Will do, thanks.

For a quick fix I'll try manually staggering puppetd startup, see if that buys me enough time to test out 0.25.4 with our manifests.

Thanks, everyone, for the suggestions.
Nikolay Sturm
2010-Apr-15 16:39 UTC
[Puppet Users] Re: Time to scale up? : Could not call fileserver.describe: #<Errno::ECONNRESET: Connection reset by peer>
On Apr 15, 10:12 am, David Schmitt <da...@dasz.at> wrote:
> There is --splay

Which doesn't help at all. When I tested it with 0.24.8, it would delay the initial run randomly, but successive runs would start relative to puppetd's startup, excluding the delay. I haven't looked at whether this is fixed in 0.25, or whether a bug report even exists, though.

Nikolay
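To make the complaint concrete, here is a small simulation of the behaviour as described in this message. It is a toy model, not Puppet's scheduler: both functions, the 30-minute splay limit, and the "carried delay" variant are assumptions introduced only for contrast.

```python
import random

RUN_INTERVAL = 30  # minutes between runs, as stated in the thread


def run_times_as_observed(start, splay_limit, runs=3, rng=random):
    """Model of the reported 0.24.8 behaviour.

    Only the first run is delayed by a random amount; later runs are
    scheduled relative to the daemon's start time, ignoring the delay.
    """
    delay = rng.uniform(0, splay_limit)
    return [start + delay] + [start + k * RUN_INTERVAL for k in range(1, runs)]


def run_times_if_delay_carried(start, splay_limit, runs=3, rng=random):
    """Hypothetical alternative: the random delay shifts every run."""
    delay = rng.uniform(0, splay_limit)
    return [start + delay + k * RUN_INTERVAL for k in range(runs)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Five nodes that were all rebooted at the same moment (start = 0).
    for label, fn in [("observed", run_times_as_observed),
                      ("carried", run_times_if_delay_carried)]:
        print(label)
        for _ in range(5):
            print("  ", [round(t, 1) for t in fn(0, 30, rng=rng)])
```

In the "observed" model, every node's second and third runs land on exactly the same minute, so the herd re-forms after the first run. In the hypothetical "carried" model, the nodes stay spread out.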