Chris
2010-Dec-14 08:24 UTC
[Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Hi

I recently upgraded my puppet masters (and clients) from 0.24.8 to 2.6.4.

Previously, my busiest puppet master would hover around a 0.9 load average; after the upgrade, its load hovers around 5.

I am running Passenger and MySQL-based stored configs.

Checking my running processes, ruby (puppetmasterd) shoots up to 99% CPU and stays there for a few seconds before dropping again. Often there are four of these running simultaneously, pegging each core at 99% CPU.

It seems that there has been a serious performance regression between 0.24 and 2.6 for my configuration. I hope the following can help work out where...

I ran puppetmasterd through a profiler to find the root cause of this (http://boojum.homelinux.org/profile.svg). The main problem appears to be in /usr/lib/ruby/site_ruby/1.8/puppet/parser/ast/resource.rb, in the evaluate function.

I added a few timing commands around various sections of that function to get a breakdown of the time spent inside it. The two most expensive calls are

---
paramobjects = parameters.collect { |param|
  param.safeevaluate(scope)
}
---

and

---
resource_titles.flatten.collect { |resource_title|
  exceptwrap :type => Puppet::ParseError do
    resource = Puppet::Parser::Resource.new(
      fully_qualified_type, resource_title,
      :parameters => paramobjects,
      :file => self.file,
      :line => self.line,
      :exported => self.exported,
      :virtual => virt,
      :source => scope.source,
      :scope => scope,
      :strict => true
    )

    if resource.resource_type.is_a? Puppet::Resource::Type
      resource.resource_type.instantiate_resource(scope, resource)
    end
    scope.compiler.add_resource(scope, resource)
    scope.compiler.evaluate_classes([resource_title], scope, false) if fully_qualified_type == 'class'
    resource
  end
}.reject { |resource| resource.nil? }
---

Unfortunately, that is about the limit of my current Ruby skills. What else can be looked at to bring 2.6 back up to the performance of 0.24?
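Ruby's stdlib Benchmark module is one way to add that kind of section timing; a minimal sketch of the approach (not the exact instrumentation used above):

---
require 'benchmark'

# Inside Puppet::Parser::AST::Resource#evaluate, where `parameters` and
# `scope` are in scope. Declare the variable first so the assignment
# inside the block remains visible afterwards (Ruby 1.8 block scoping):
paramobjects = nil
elapsed = Benchmark.realtime do
  paramobjects = parameters.collect { |param| param.safeevaluate(scope) }
end
Puppet.notice "parameter evaluation took #{elapsed} seconds"
---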
Ken Barber
2010-Dec-14 23:40 UTC
[Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Hi Chris,

Sorry - I can't say I'm seeing this performance issue myself with my setup :-(. I'm not an expert on that part of the code.

Having said that, it's probably DSL parsing related (possibly a recursion somewhere)... I'd focus on your content, not just the Ruby, to see what part of the Puppet DSL is causing it. Strip your content right back and add bits back in slowly. I think that would make your report very useful if it turns out to be a bug, and perhaps you can find a workaround that way as well.

That's just my 2c. Good luck :-).

ken.

On Tuesday, December 14, 2010 8:24:55 AM UTC, Chris wrote:
> [...]
Nigel Kersten
2010-Dec-15 00:48 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Tue, Dec 14, 2010 at 12:24 AM, Chris <iwouldratherbesleepingnow@gmail.com> wrote:
> I recently upgraded my puppet masters (and clients) from 0.24.8 to 2.6.4.
>
> Previously, my busiest puppet master would hover around a 0.9 load
> average; after the upgrade, its load hovers around 5.
> [...]
> It seems that there has been a serious performance regression between
> 0.24 and 2.6 for my configuration.

Some useful info would be:

OS
OS version
Ruby version
Apache version/worker model
Passenger version

> [...]

--
Nigel Kersten - Puppet Labs - http://www.puppetlabs.com
Chris
2010-Dec-15 07:10 UTC
[Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
> Some useful info would be:
>
> OS
> OS version
> Ruby version
> Apache version/worker model
> Passenger version

CentOS 5.2
ruby-1.8.5-5.el5_3.7
httpd-2.2.3-31.el5.centos.2
rubygem-passenger-2.2.11-2el5.ecn
rubygem-rails-2.1.1-2.el5
rubygem-rack-1.1.0-1el5
Brice Figureau
2010-Dec-15 10:42 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Tue, 2010-12-14 at 00:24 -0800, Chris wrote:
> Checking my running processes, ruby (puppetmasterd) shoots up to 99%
> CPU and stays there for a few seconds before dropping again. Often
> there are four of these running simultaneously, pegging each core at
> 99% CPU.

I would say this is perfectly normal. Compiling the catalog is a hard and complex problem and requires CPU.

The difference between 0.24.8 and 2.6 (or 0.25, for that matter) is that some performance issues have been fixed. Those issues made the master mostly I/O bound under 0.24; in later versions it is mostly CPU bound.

Now compare the compilation time under 0.24.8 and 2.6 and you should see that it has dropped drastically (allowing more compilations to fit in the same amount of time). The other side of the coin is that your master now needs transient bursts of high CPU usage.

I don't really get what the issue is with using 100% of the CPU. You're paying about the same price when your CPU is busy as when it's idle, so that shouldn't make a difference :)

If it is an issue, reduce the concurrency of your setup (run fewer compilations in parallel, implement splay time, etc.).

> It seems that there has been a serious performance regression between
> 0.24 and 2.6 for my configuration.

I think it's the reverse that happened.

> I ran puppetmasterd through a profiler to find the root cause of this
> (http://boojum.homelinux.org/profile.svg). The main problem appears
> to be in /usr/lib/ruby/site_ruby/1.8/puppet/parser/ast/resource.rb, in
> the evaluate function.
> [...]

Yes, this is what the compiler does during compilation: evaluating resources and parameters. The more resources you use, the more time and CPU the compilation will take.
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
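Splay time is an agent-side setting; a minimal sketch of the relevant puppet.conf entries, assuming 2.6 setting names (values are illustrative):

---
# /etc/puppet/puppet.conf on each agent
[agent]
    runinterval = 1800   # run every 30 minutes
    splay       = true   # sleep a random delay before each run...
    splaylimit  = 1800   # ...of up to this many seconds, spreading check-ins
---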
Chris
2010-Dec-15 13:28 UTC
[Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Dec 15, 12:42 pm, Brice Figureau <brice-pup...@daysofwonder.com> wrote:
> I would say this is perfectly normal. Compiling the catalog is a hard
> and complex problem and requires CPU.
>
> The difference between 0.24.8 and 2.6 (or 0.25, for that matter) is
> that some performance issues have been fixed. Those issues made the
> master mostly I/O bound under 0.24; in later versions it is mostly CPU
> bound.

If we were talking about CPU usage only, I would agree with you. But in this case the load average of the machine has gone up more than 5x, and a high load average indicates processes not getting enough runtime. To me that is an indication that 2.6 is performing worse than 0.24: previously, on average, all processes got enough runtime and did not have to wait for system resources; now processes are sitting in the run queue, waiting for a chance to run.

> I don't really get what the issue is with using 100% of the CPU.

That's not the issue, just an indication of what is causing it.

> You're paying about the same price when your CPU is busy as when it's
> idle, so that shouldn't make a difference :)

Generally true, but this is on a VM which is also running some of my RADIUS and proxy instances, amongst others.

> If it is an issue, reduce the concurrency of your setup (run fewer
> compilations in parallel, implement splay time, etc.).

Splay has been enabled since 0.24.

My Apache MaxClients is set to 15 to limit concurrency.
Trevor Vaughan
2010-Dec-15 14:12 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
What is your CPU-to-puppetmaster-instance ratio? I've had decent luck with 1 CPU to 2 puppetmaster processes, but not much above that.

If you need dedicated resources for other tasks, you may want to ensure that you don't have more masters spawning than you have processors.

Trevor

On Wed, Dec 15, 2010 at 8:28 AM, Chris <iwouldratherbesleepingnow@gmail.com> wrote:
> [...]

--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvaughan@onyxpoint.com

-- This account not approved for unencrypted proprietary information --
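With Passenger, one way to enforce such a ratio is to cap the pool size in the Apache config; a minimal sketch assuming Passenger 2.x directive names (the numbers are illustrative, e.g. 2 masters per core on a 4-core box):

---
# Apache vhost for the puppet master
PassengerMaxPoolSize 8      # at most 8 puppetmasterd processes
PassengerPoolIdleTime 300   # reap idle masters after 5 minutes
---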
Brice Figureau
2010-Dec-15 17:15 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, 2010-12-15 at 05:28 -0800, Chris wrote:
> If we were talking about CPU usage only, I would agree with you. But
> in this case the load average of the machine has gone up more than 5x,
> and a high load average indicates processes not getting enough
> runtime. To me that is an indication that 2.6 is performing worse than
> 0.24 [...]

Load is not necessarily an indication of a problem: it can also mean some tasks are waiting on I/O, not only on CPU. The only real issue under load is when service time goes beyond an acceptable value; otherwise you can't say whether it's bad or not. If you see some hosts reporting timeouts, that is an indication that service time is not good :)

BTW, do you run your MySQL storedconfig instance on the same server? You can activate thin_storeconfigs to reduce the load on the MySQL db.

> My Apache MaxClients is set to 15 to limit concurrency.

I think this is too many unless you have 8 cores. As Trevor said in another e-mail in this thread, 2 puppetmasters per core is best.

Now it all depends on your number of nodes and sleep time. I suggest you use ext/puppet-load to find your setup's real concurrency.
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
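thin_storeconfigs is a master-side puppet.conf setting; a minimal sketch, assuming the 2.6 setting name:

---
# /etc/puppet/puppet.conf on the master
[master]
    storeconfigs      = true
    thin_storeconfigs = true   # store only facts and exported resources,
                               # not whole catalogs, lightening the DB load
---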
Ashley Penney
2010-Dec-15 18:27 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
This issue is definitely a problem. I have a support ticket in with Puppet Labs about the same thing. My CPU remains at 100% almost constantly, and it slows things down significantly. If you strace it, you can see that very little appears to be going on. This is absolutely not normal behavior. Even when I had one client checking in, all cores were fully used.

On Wed, Dec 15, 2010 at 12:15 PM, Brice Figureau <brice-puppet@daysofwonder.com> wrote:
> [...]
Disconnect
2010-Dec-15 18:35 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
"me too". All the logs show nice quick compilations but the actual wall clock to get anything done is HUGE. Dec 15 13:10:29 puppet puppet-master[31406]: Compiled catalog for puppet.foo.com in environment production in 21.52 seconds Dec 15 13:10:51 puppet puppet-agent[8251]: Caching catalog for puppet.foo.com That was almost 30 minutes ago. Since then, it has sat there doing nothing... $ sudo strace -p 8251 Process 8251 attached - interrupt to quit select(7, [6], [], [], {866, 578560} lsof shows: puppetd 8251 root 6u IPv4 11016045 0t0 TCP puppet.foo.com:33065->puppet.foo.com:8140 (ESTABLISHED) On Wed, Dec 15, 2010 at 1:27 PM, Ashley Penney <apenney@gmail.com> wrote:> This issue is definitely a problem. I have a support ticket in with Puppet > Labs about the same thing. My CPU remains at 100% almost constantly and it > slows things down significantly. If you strace it you can see that very > little appears to be going on. This is absolutely not normal behavior. > Even when I had 1 client checking in I had all cores fully used. > > > On Wed, Dec 15, 2010 at 12:15 PM, Brice Figureau < > brice-puppet@daysofwonder.com> wrote: > >> On Wed, 2010-12-15 at 05:28 -0800, Chris wrote: >> > >> > On Dec 15, 12:42 pm, Brice Figureau <brice-pup...@daysofwonder.com> >> > wrote: >> > > On Tue, 2010-12-14 at 00:24 -0800, Chris wrote: >> > > > Hi >> > > >> > > > I recently upgraded my puppet masters (and clients) from 0.24.8 to >> > > > 2.6.4 >> > > >> > > > Previously, my most busy puppet master would hover around about 0.9 >> > > > load average, after the upgrade, its load hovers around 5 >> > > >> > > > I am running passenger and mysql based stored configs. >> > > >> > > > Checking my running processes, ruby (puppetmasterd) shoots up to 99% >> > > > cpu load and stays there for a few seconds before dropping again. >> > > > Often there are 4 of these running simultaneously, pegging each core >> > > > at 99% cpu. >> > > >> > > I would say it is perfectly normal. Compiling the catalog is a hard >> and >> > > complex problem and requires CPU. >> > > >> > > The difference between 0.24.8 and 2.6 (or 0.25 for what matters) is >> that >> > > some performance issues have been fixed. Those issues made the master >> be >> > > more I/O bound under 0.24, but now mostly CPU bound in later versions. >> > >> > If we were talking about only cpu usage, I would agree with you. But >> > in this case, the load average of the machine has gone up over 5x. >> > And as high load average indicates processes not getting enough >> > runtime, in this case it is an indication to me that 2.6 is performing >> > worse than 0.24 (previously, on average, all processes got enough >> > runtime and did not have to wait for system resources, now processes >> > are sitting in the run queue, waiting to get a chance to run) >> >> Load is not necessarily an indication of an issue. It can also mean some >> tasks are waiting for I/O not only CPU. >> The only real issue under load is if service time is beyond an >> admissible value, otherwise you can''t say it''s bad or not. >> If you see some hosts reporting timeouts, then it''s an indication that >> service time is not good :) >> >> BTW, do you run your mysql storedconfig instance on the same server? >> You can activate thin_storeconfigs to reduce the load on the mysql db. >> >> > > >> > > I don''t really get what is the issue about using 100% of CPU? 
>> > Thats not the issue, just an indication of what is causing it >> > >> > > >> > > You''re paying about the same price when your CPU is used and when it''s >> > > idle, so that shouldn''t make a difference :) >> > Generally true, but this is a on VM which is also running some of my >> > radius and proxy instances, amongst others. >> > >> > > >> > > If that''s an issue, reduce the concurrency of your setup (run less >> > > compilation in parallel, implement splay time, etc...). >> > splay has been enabled since 0.24 >> > >> > My apache maxclients is set to 15 to limit concurrency. >> >> I think this is too many except if you have 8 cores. As Trevor said in >> another e-mail in this thread, 2PM/Core is the best. >> >> Now it all depends on your number of nodes and sleeptime. I suggest you >> use ext/puppet-load to find your setup real concurrency. >> -- >> Brice Figureau >> Follow the latest Puppet Community evolutions on www.planetpuppet.org! >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Puppet Users" group. >> To post to this group, send email to puppet-users@googlegroups.com. >> To unsubscribe from this group, send email to >> puppet-users+unsubscribe@googlegroups.com<puppet-users%2Bunsubscribe@googlegroups.com> >> . >> For more options, visit this group at >> http://groups.google.com/group/puppet-users?hl=en. >> >> > -- > You received this message because you are subscribed to the Google Groups > "Puppet Users" group. > To post to this group, send email to puppet-users@googlegroups.com. > To unsubscribe from this group, send email to > puppet-users+unsubscribe@googlegroups.com<puppet-users%2Bunsubscribe@googlegroups.com> > . > For more options, visit this group at > http://groups.google.com/group/puppet-users?hl=en. >-- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Disconnect
2010-Dec-15 18:38 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
One addendum - the box is absolutely not I/O or CPU bound:

Cpu(s): 83.0%us, 13.1%sy, 0.0%ni, 2.5%id, 0.0%wa, 0.1%hi, 1.3%si, 0.0%st

(64-bit KVM VM with 6 3.5GHz amd64 CPUs, on an LVM partition - raw disk - with 5G of RAM but only 3G in use. PLENTY of power, and monitoring supports that.)

On Wed, Dec 15, 2010 at 1:35 PM, Disconnect <dc.disconnect@gmail.com> wrote:
> [...]
Brice Figureau
2010-Dec-15 19:14 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 15/12/10 19:35, Disconnect wrote:
> "Me too." All the logs show nice quick compilations, but the actual
> wall-clock time to get anything done is HUGE.
>
> Dec 15 13:10:29 puppet puppet-master[31406]: Compiled catalog for
> puppet.foo.com in environment production in 21.52 seconds

This looks long.

> Dec 15 13:10:51 puppet puppet-agent[8251]: Caching catalog for puppet.foo.com
>
> That was almost 30 minutes ago. Since then, it has sat there doing nothing...
>
> $ sudo strace -p 8251
> Process 8251 attached - interrupt to quit
> select(7, [6], [], [], {866, 578560}
>
> lsof shows:
> puppetd 8251 root 6u IPv4 11016045 0t0 TCP puppet.foo.com:33065->puppet.foo.com:8140 (ESTABLISHED)

Note: we were talking about the puppet master taking 100% CPU, but you're apparently looking at the puppet agent, which is a different story.
--
Brice Figureau
My Blog: http://www.masterzen.fr/
Brice Figureau
2010-Dec-15 19:15 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 15/12/10 19:27, Ashley Penney wrote:
> This issue is definitely a problem. I have a support ticket in with
> Puppet Labs about the same thing. My CPU remains at 100% almost
> constantly, and it slows things down significantly. If you strace it,
> you can see that very little appears to be going on. This is absolutely
> not normal behavior. Even when I had one client checking in, all cores
> were fully used.

I do agree that this is not the correct behavior. I suggest you strace the master, or use any other Ruby introspection technique, to find what part of it is taking the CPU.
--
Brice Figureau
My Blog: http://www.masterzen.fr/
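A minimal strace invocation along those lines, attaching to one of the busy master processes (the PID is illustrative):

---
# Summarise which syscalls dominate, then detach with Ctrl-C:
sudo strace -c -p 31392
# Or log timestamped calls to a file for later inspection:
sudo strace -tt -p 31392 -o /tmp/master.strace
---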
Disconnect
2010-Dec-15 19:24 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, Dec 15, 2010 at 2:14 PM, Brice Figureau <brice-puppet@daysofwonder.com> wrote:
> Note: we were talking about the puppet master taking 100% CPU, but
> you're apparently looking at the puppet agent, which is a different story.

The agent isn't taking CPU; it is hanging, waiting for the master to do anything. (The run I quoted earlier eventually ended with a timeout.) The master has pegged the CPUs, and it seems to be related to file resources:

$ ps auxw | grep master
puppet 31392 74.4 4.7 361720 244348 ? R 10:42 162:06 Rack: /usr/share/puppet/rack/puppetmasterd
puppet 31396 70.0 4.9 369524 250200 ? R 10:42 152:32 Rack: /usr/share/puppet/rack/puppetmasterd
puppet 31398 66.2 3.9 318828 199472 ? R 10:42 144:10 Rack: /usr/share/puppet/rack/puppetmasterd
puppet 31400 66.6 4.9 369992 250588 ? R 10:42 145:04 Rack: /usr/share/puppet/rack/puppetmasterd
puppet 31406 68.6 3.9 318292 200992 ? R 10:42 149:31 Rack: /usr/share/puppet/rack/puppetmasterd
puppet 31414 67.0 2.4 243800 124476 ? R 10:42 146:00 Rack: /usr/share/puppet/rack/puppetmasterd

Dec 15 13:42:23 puppet puppet-master[31406]: Compiled catalog for puppet.foo.com in environment production in 30.83 seconds
Dec 15 13:42:49 puppet puppet-agent[10515]: Caching catalog for puppet.foo.com
Dec 15 14:00:18 puppet puppet-agent[10515]: Applying configuration version '1292438512'
...
Dec 15 14:14:56 puppet puppet-agent[10515]: Finished catalog run in 882.43 seconds

Changes:
    Total: 6
Events:
    Success: 6
    Total: 6
Resources:
    Changed: 6
    Out of sync: 6
    Total: 287
Time:
    Config retrieval: 72.20
    Cron: 0.05
    Exec: 32.42
    File: 752.33
    Filebucket: 0.00
    Mount: 0.98
    Package: 6.13
    Schedule: 0.02
    Service: 9.09
    Ssh authorized key: 0.07
    Sysctl: 0.00

real 34m56.066s
user 1m6.030s
sys 0m26.590s
Brice Figureau
2010-Dec-15 19:43 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 15/12/10 20:24, Disconnect wrote:
> The agent isn't taking CPU; it is hanging, waiting for the master to
> do anything. (The run I quoted earlier eventually ended with a
> timeout.) The master has pegged the CPUs, and it seems to be related
> to file resources:

Oh, I see.

> $ ps auxw | grep master
> puppet 31392 74.4 4.7 361720 244348 ? R 10:42 162:06 Rack: /usr/share/puppet/rack/puppetmasterd
> [...]

Note that they're all in the running state. That means there are none left to serve file content if they are all busy for several seconds (in our case around 20) compiling catalogs.

> Dec 15 13:42:23 puppet puppet-master[31406]: Compiled catalog for
> puppet.foo.com in environment production in 30.83 seconds
> [...]
> Resources:
>     Total: 287

That's not a big number.

> Time:
>     Config retrieval: 72.20

This is also suspect.

>     File: 752.33

Indeed.

That just means your masters are so busy serving catalogs that they barely have time to serve files. One possibility is to offload file content (see one of my blog posts about this: http://www.masterzen.fr/2010/03/21/more-puppet-offloading/).

How many nodes are you compiling at the same time? Apparently you have 6 master processes running at high CPU usage.

As I said earlier, I really advise people to try puppet-load (which can be found in the ext/ directory of the source tarball since puppet 2.6) to exercise load against a master. This will help you find your actual concurrency.

But if it's a bug, could this be an issue with Passenger?
--
Brice Figureau
My Blog: http://www.masterzen.fr/
Disconnect
2010-Dec-15 20:10 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
As a datapoint, this exact config (with mongrel_cluster) was working great under 0.25.x, with fewer, slower CPUs, slower storage (VM image files) and 2G of RAM...

I gave puppet-load a try, but it is throwing errors that I don't have time to dig into today:

debug: reading facts from: puppet.foo.com.yaml
/var/lib/gems/1.8/gems/em-http-request-0.2.15/lib/em-http/request.rb:72:in `send_request': uninitialized constant EventMachine::ConnectionError (NameError)
        from /var/lib/gems/1.8/gems/em-http-request-0.2.15/lib/em-http/request.rb:59:in `setup_request'
        from /var/lib/gems/1.8/gems/em-http-request-0.2.15/lib/em-http/request.rb:49:in `get'
        from ./puppet-load.rb:272:in `spawn_request'
        from ./puppet-load.rb:334:in `spawn'

Running about 250 nodes, every 30 minutes.

On Wed, Dec 15, 2010 at 2:43 PM, Brice Figureau <brice-puppet@daysofwonder.com> wrote:
> [...]
Brice Figureau
2010-Dec-15 21:45 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 15/12/10 21:10, Disconnect wrote:
> As a datapoint, this exact config (with mongrel_cluster) was working
> great under 0.25.x, with fewer, slower CPUs, slower storage (VM image
> files) and 2G of RAM...

So I ask it again: could it be a problem with Passenger rather than an issue with Puppet itself?

It would really be interesting to use some Ruby introspection[1] to find exactly where the CPU time is spent in those masters. Could it be that under Passenger everything is reparsed instead of just compiled? (I simply don't know, just throwing out some ideas.)

I myself use nginx + mongrel, but have only a dozen nodes, so I don't really qualify.

> I gave puppet-load a try, but it is throwing errors that I don't have
> time to dig into today:
> [...] uninitialized constant EventMachine::ConnectionError (NameError)

Could it be that you're missing EventMachine?

> Running about 250 nodes, every 30 minutes.

Did you try to use mongrel? Do you use splay time?

Just some math (which might be totally wrong), to give an idea of how I think we can compute the optimal scaling case:

With 250 nodes and a sleep time of 30 minutes, we need to handle 250 compiles in every 30-minute span. If we assume a concurrency of 2, with all nodes evenly spaced in time, each master process must compile 125 nodes in 30 minutes. If each compilation takes about 10s, that comes to 1250s, or roughly 20 minutes, so you have some room for growth :)

During those 20 minutes your 2 master processes will consume 100% CPU. Since the CPU is busy for only about two thirds of the 30-minute span, you'll consume roughly 66% of all your available CPU...

Hope that helps,

[1]: http://projects.puppetlabs.com/projects/1/wiki/Puppet_Introspection
--
Brice Figureau
My Blog: http://www.masterzen.fr/
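The same back-of-the-envelope estimate as a tiny Ruby sketch (all numbers illustrative):

---
nodes        = 250      # agents checking in
interval     = 30 * 60  # seconds between runs for each agent
concurrency  = 2        # master processes compiling in parallel
compile_time = 10       # seconds per catalog compilation

# Seconds of compile work each master process must do per interval:
busy = (nodes / concurrency.to_f) * compile_time
puts "each process is busy #{(busy / 60).round} of every #{interval / 60} minutes"
puts "CPU utilisation: #{(100 * busy / interval).round}%"
# => each process is busy 21 of every 30 minutes
# => CPU utilisation: 69%
---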
Ashley Penney
2010-Dec-16 00:47 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Just to reply to this - like I said earlier, I can hit this problem with one node checking in against the puppetmaster. All the puppetmasterd processes use maximum CPU. It's not a scaling issue, considering that serving one node is certainly not going to max out a newish physical server.

On Wed, Dec 15, 2010 at 4:45 PM, Brice Figureau <brice-puppet@daysofwonder.com> wrote:
> [...]
Nigel Kersten
2010-Dec-16 01:25 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, Dec 15, 2010 at 4:47 PM, Ashley Penney <apenney@gmail.com> wrote:
> Just to reply to this - like I said earlier, I can hit this problem
> with one node checking in against the puppetmaster. All the
> puppetmasterd processes use maximum CPU. [...]

That is definitely a problem.

Does this happen as soon as a node checks in? Or as soon as you start the Passenger processes?

Can you post a sanitized strace somewhere?

--
Nigel Kersten - Puppet Labs - http://www.puppetlabs.com
Brice Figureau
2010-Dec-16 09:25 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, 2010-12-15 at 19:47 -0500, Ashley Penney wrote:
> Just to reply to this - like I said earlier I can get this problem
> with 1 node checking in against puppetmaster. All the puppetmasterd
> processes use maximum CPU. It's not a scaling issue, considering that
> serving one node is certainly not going to max out a newish physical
> server.

This looks like a bug to me.

Do your manifests use many file sources? And/or recursive file resources? It's possible that those masters are spending their time checksumming files.

Like I said earlier in the thread, the only real way to know is to use Puppet introspection:
http://projects.puppetlabs.com/projects/1/wiki/Puppet_Introspection
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
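To see why recursive file resources can hurt, remember the master may end up checksumming every file under a mount when serving it. A stand-alone Ruby illustration of that per-file cost (a sketch, not the master's actual code path; the directory is a placeholder):

---
require 'digest/md5'
require 'find'

mount = '/etc/puppet/files'  # placeholder path
count = 0
Find.find(mount) do |path|
  next unless File.file?(path)
  Digest::MD5.hexdigest(File.read(path))  # the per-file work that adds up
  count += 1
end
puts "checksummed #{count} files"
---

Run against a large mount, this makes it obvious how recursive file serving can keep a master busy even for a single node.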
Leonid Batizhevsky
2010-Dec-16 12:49 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
I had the same issue running puppetmaster and puppetd on the same host. When I updated Ruby to 1.8.7 Enterprise, it resolved the problem for me.

Leonid S. Batizhevsky

On Tue, Dec 14, 2010 at 11:24, Chris <iwouldratherbesleepingnow@gmail.com> wrote:
> Hi
>
> I recently upgraded my puppet masters (and clients) from 0.24.8 to
> 2.6.4
>
> [...]
Nigel Kersten
2010-Dec-17 00:03 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Thu, Dec 16, 2010 at 4:49 AM, Leonid Batizhevsky <the.leonko@gmail.com> wrote:
> I had the same issue running puppetmaster and puppetd on the same
> host. When I updated Ruby to 1.8.7 Enterprise, it resolved the problem
> for me.
> Leonid S. Batizhevsky

For the sake of the archives, what version did you upgrade *from*, Leonid?

> On Tue, Dec 14, 2010 at 11:24, Chris
> <iwouldratherbesleepingnow@gmail.com> wrote:
>> [...]

--
Nigel Kersten - Puppet Labs - http://www.puppetlabs.com
Leonid Batizhevsky
2010-Dec-17 16:27 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Ruby or Puppet? I started with 0.25.x (from the EPEL repo, and not for long) and Ruby 1.8.5. Then I updated to 2.6.0 and saw memory problems. I started to google and found:
http://projects.puppetlabs.com/projects/1/wiki/Puppet_Red_Hat_Centos
"The 1.8.5 branch of Ruby shipped with RHEL5 can exhibit memory leaks."
I upgraded to Ruby Enterprise 1.8.7 and it solved my problems!

Leonid S. Batizhevsky

On Fri, Dec 17, 2010 at 03:03, Nigel Kersten <nigel@puppetlabs.com> wrote:
> For the sake of the archives, what version did you upgrade *from*, Leonid?
Ashley Penney
2010-Dec-17 16:39 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
As a datapoint, I experience this problem on RHEL6:

ruby-1.8.7.299-4.el6.x86_64

Gems:
passenger (3.0.0)
rack (1.2.1)
rack-mount (0.6.13)
rack-test (0.5.6)
rails (3.0.3)

On Fri, Dec 17, 2010 at 11:27 AM, Leonid Batizhevsky <the.leonko@gmail.com> wrote:
> Ruby or Puppet?
> I started with 0.25.x (from the EPEL repo, and not for long) and Ruby 1.8.5.
> Then I updated to 2.6.0 and saw memory problems.
> [...]
Leonid Batizhevsky
2011-Jan-08 22:51 UTC
Re: [Puppet Users] puppetmaster 100%cpu usage on 2.6 (not on 0.24)
No, I have not. Maybe try playing with the passenger workers' time-to-live?

Leonid S. Batizhevsky

On Fri, Dec 17, 2010 at 19:39, Ashley Penney <apenney@gmail.com> wrote:
> As a datapoint, I experience this problem on RHEL6:
> ruby-1.8.7.299-4.el6.x86_64
> Gems:
> passenger (3.0.0)
> rack (1.2.1)
> rack-mount (0.6.13)
> rack-test (0.5.6)
> rails (3.0.3)
> [...]
Micah Anderson
2011-Jan-25 22:11 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Brice Figureau <brice-puppet@daysofwonder.com> writes:

> On 15/12/10 19:27, Ashley Penney wrote:
>> This issue is definitely a problem. I have a support ticket in with
>> Puppet Labs about the same thing. My CPU remains at 100% almost
>> constantly and it slows things down significantly. If you strace it you
>> can see that very little appears to be going on. This is absolutely not
>> normal behavior. Even when I had 1 client checking in I had all cores
>> fully used.
>
> I do agree that it's not the correct behavior. I suggest you strace
> or use any other ruby introspection technique to find what part of the
> master is taking CPU.

I'm having a similar problem with 2.6.3. At this point, I can't get reliable puppet runs, and I'm not sure what to do.

What seems to happen is that things are working fine at the beginning. Catalog compiles peg the CPU for the puppet process that is doing them and take anywhere between 20 and 75 seconds. Then things get drastically worse after 4 compiles (note: I have four mongrels too, coincidence?); catalog compiles shoot up to 115, 165, 209, 268, 273, 341, 418, 546, 692, 774, 822, then 1149 seconds... then things are really hosed. Sometimes hosts will fail outright and complain about weird things, like:

Jan 25 14:04:34 puppetmaster puppet-master[30294]: Host is missing hostname and/or domain: gull.example.com
Jan 25 14:04:55 puppetmaster puppet-master[30294]: Failed to parse template site-apt/local.list: Could not find value for 'lsbdistcodename' at /etc/puppet/modules/site-apt/manifests/init.pp:4 on node gull.example.com

All four of my mongrels are constantly pegged, doing 40-50% of the CPU each, occupying all available CPUs. They never settle down. I've got 74 nodes checking in now; it doesn't seem like that many, but perhaps I've reached a tipping point with my puppetmaster (it's a dual 1GHz, 2GiB of RAM machine)?

I've tried a large number of different things to attempt to work around this:

0. reduced my node check-in times to be once an hour (and splayed randomly)

1. turned on puppetqd/stomp queuing

This didn't seem to make a difference; it's off now.

2. turned on thin stored configs

This sort of helped a little, but not enough.

3. tried to upgrade rails from 2.3.5 (the debian version) to 2.3.10

I didn't see any appreciable difference here. I ended up going back to 2.3.5 because that was the packaged version.

4. tried to offload file content via nginx[1]

This maybe helped a little, but it's clear that the problem isn't the fileserving; it seems to be something in the catalog compilation.

5. tried to cache catalogs by adding an http front-end cache and expiring that cache when manifests are updated[1]

I'm not sure this works at all.

6. set 'fair' queuing in my nginx.conf[3]

This seemed to help for a few days, but then things got bad again.

7. set --http_compression

I'm not sure if this actually hurts the master or not (because it has to now occupy the CPU compressing catalogs?)

8. tried to follow the introspection technique[2]
This wasn't so easy to do; I had to operate really fast, because if I was too slow the thread would exit, or it would get hung up on:

[Thread 0xb6194b70 (LWP 25770) exited]
[New Thread 0xb6194b70 (LWP 25806)]

Eventually I did manage to get somewhere:

0xb74f1b16 in memcpy () from /lib/i686/cmov/libc.so.6
(gdb) session-ruby
(gdb) redirect_stdout
$1 = 2
(gdb)
$2 = 2
(gdb) eval "caller"
$3 = 3
(gdb) rb_object_counts
Cannot get thread event message: debugger service failed
An error occurred while in a function called from GDB.
Evaluation of the expression containing the function
(rb_eval_string_protect) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) eval "total = \[\[ObjectSpace\]\].each_object(Array)\{\|x\| puts '---'; puts x.inspect \}; puts \\"---\\nTotal Arrays: \#{total}\\""
Invalid character '\' in expression.

... then nothing.

In the tail:

root@puppetmaster:/tmp# tail -f ruby-debug.28724
207 Puppet::Util::LoadedFile["/usr/lib/ruby/1.8/active_record/base.rb:2746:in `attributes='", "/usr/lib/ruby/1.8/active_record/base.rb:2742:in `each'", "/usr/lib/ruby/1.8/active_record/base.rb:2742:in `attributes='", "/usr/lib/ruby/1.8/active_record/base.rb:2438:in `initialize'", "/usr/lib/ruby/1.8/active_record/reflection.rb:162:in `new'", "/usr/lib/ruby/1.8/active_record/reflection.rb:162:in `build_association'", "/usr/lib/ruby/1.8/active_record/associations/association_collection.rb:423:in `build_record'", "/usr/lib/ruby/1.8/active_record/associations/association_collection.rb:102:in `build'", "/usr/lib/ruby/1.8/puppet/rails/host.rb:145:in `merge_facts'", "/usr/lib/ruby/1.8/puppet/rails/host.rb:144:in `each'", "/usr/lib/ruby/1.8/puppet/rails/host.rb:144:in `merge_facts'", "/usr/lib/ruby/1.8/puppet/rails/host.rb:140:in `each'", "/usr/lib/ruby/1.8/puppet/rails/host.rb:140:in `merge_facts'", "/usr/lib/ruby/1.8/puppet/indirector/facts/active_record.rb:32:in `save'", "/usr/lib/ruby/1.8/puppet/indirector/indirection.rb:256:in `save'", "/usr/lib/ruby/1.8/puppet/node/facts.rb:15:in `save'", "/usr/lib/ruby/1.8/puppet/indirector.rb:64:in `save'", "/usr/lib/ruby/1.8/puppet/indirector/catalog/compiler.rb:25:in `extract_facts_from_request'", "/usr/lib/ruby/1.8/puppet/indirector/catalog/compiler.rb:30:in `find'", "/usr/lib/ruby/1.8/puppet/indirector/indirection.rb:193:in `find'", "/usr/lib/ruby/1.8/puppet/indirector.rb:50:in `find'", "/usr/lib/ruby/1.8/puppet/network/http/handler.rb:101:in `do_find'", "/usr/lib/ruby/1.8/puppet/network/http/handler.rb:68:in `send'", "/usr/lib/ruby/1.8/puppet/network/http/handler.rb:68:in `process'", "/usr/lib/ruby/1.8/mongrel.rb:159:in `process_client'", "/usr/lib/ruby/1.8/mongrel.rb:158:in `each'", "/usr/lib/ruby/1.8/mongrel.rb:158:in `process_client'", "/usr/lib/ruby/1.8/mongrel.rb:285:in `run'", "/usr/lib/ruby/1.8/mongrel.rb:285:in `initialize'", "/usr/lib/ruby/1.8/mongrel.rb:285:in `new'", "/usr/lib/ruby/1.8/mongrel.rb:285:in `run'", "/usr/lib/ruby/1.8/mongrel.rb:268:in `initialize'", "/usr/lib/ruby/1.8/mongrel.rb:268:in `new'", "/usr/lib/ruby/1.8/mongrel.rb:268:in `run'", "/usr/lib/ruby/1.8/puppet/network/http/mongrel.rb:22:in `listen'", "/usr/lib/ruby/1.8/puppet/network/server.rb:127:in `listen'", "/usr/lib/ruby/1.8/puppet/network/server.rb:142:in `start'", "/usr/lib/ruby/1.8/puppet/daemon.rb:124:in `start'", "/usr/lib/ruby/1.8/puppet/application/master.rb:114:in `main'", "/usr/lib/ruby/1.8/puppet/application/master.rb:46:in `run_command'", "/usr/lib/ruby/1.8/puppet/application.rb:287:in `run'", "/usr/lib/ruby/1.8/puppet/application.rb:393:in `exit_on_fail'", "/usr/lib/ruby/1.8/puppet/application.rb:287:in `run'", "/usr/lib/ruby/1.8/puppet/util/command_line.rb:55:in `execute'", "/usr/bin/puppet:4"]

190 Puppet::Parser::AST::CaseStatement
181 ZAML::Label
170 Puppet::Parser::AST::Default
152 ActiveRecord::DynamicFinderMatch
152 ActiveRecord::DynamicScopeMatch
150 ActiveSupport::OrderedHash
148 OptionParser::Switch::RequiredArgument
138 YAML::Syck::Node
125 Range
124 Puppet::Parser::AST::IfStatement
117 ActiveRecord::Errors
115 Puppet::Provider::Confine::Exists
109 Puppet::Parser::AST::Selector
108 UnboundMethod
107 File::Stat
99 Puppet::Parameter::Value
90 Bignum
86 OptionParser::Switch::NoArgument
85 Puppet::Util::Settings::Setting
80 Puppet::Indirector::Request
75 Puppet::Parser::AST::ComparisonOperator
74 Puppet::Parser::Lexer::Token
73 Puppet::Parser::AST::ResourceOverride
70 ActiveRecord::ConnectionAdapters::MysqlColumn
66 Sync
65 StringIO
64 Binding
62 ActiveSupport::Callbacks::Callback
61 Puppet::Util::Settings::FileSetting
58 Puppet::Provider::ConfineCollection
56 Mysql::Result
52 Puppet::Module
47 Puppet::Network::AuthStore::Declaration
46 IPAddr
39 Puppet::Util::Settings::BooleanSetting
38 Thread
36 Puppet::Util::Autoload
35 Mysql
35 ActiveRecord::ConnectionAdapters::MysqlAdapter
34 Puppet::Parser::AST::Not
28 Puppet::Type::MetaParamLoglevel
28 Puppet::Type::File
28 Puppet::Type::File::ParameterPurge
28 Puppet::Type::File::ParameterLinks
28 Puppet::Type::File::Ensure
28 Puppet::Type::File::ParameterBackup
28 Puppet::Type::File::ParameterReplace
28 Puppet::Type::File::ParameterProvider
28 Puppet::Type::File::ParameterPath
28 Puppet::Type::File::ProviderPosix
28 Puppet::Type::File::ParameterChecksum

but then it seemed to stop logging entirely...

I'm available on IRC to try more advanced debugging, just ping me (hacim). I'd really like things to function again!

micah

1. http://www.masterzen.fr/2010/03/21/more-puppet-offloading/
2. http://projects.puppetlabs.com/projects/1/wiki/Puppet_Introspection
3. http://www.mail-archive.com/puppet-users@googlegroups.com/msg13692.html
Felix Frank
2011-Jan-26 09:21 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
> What seems to happen is that things are working fine at the
> beginning. Catalog compiles peg the CPU for the puppet process that is
> doing them and take anywhere between 20 and 75 seconds. Then things
> get drastically worse after 4 compiles (note: I have four mongrels
> too, coincidence?); catalog compiles shoot up to 115, 165, 209, 268,
> 273, 341, 418, 546, 692, 774, 822, then 1149 seconds... then things
> are really hosed. Sometimes hosts will fail outright and complain
> about weird things, like:
>
> Jan 25 14:04:34 puppetmaster puppet-master[30294]: Host is missing hostname and/or domain: gull.example.com
> Jan 25 14:04:55 puppetmaster puppet-master[30294]: Failed to parse template site-apt/local.list: Could not find value for 'lsbdistcodename' at /etc/puppet/modules/site-apt/manifests/init.pp:4 on node gull.example.com
>
> All four of my mongrels are constantly pegged, doing 40-50% of the CPU
> each, occupying all available CPUs. They never settle down. I've got 74
> nodes checking in now; it doesn't seem like that many, but perhaps
> I've reached a tipping point with my puppetmaster (it's a dual 1GHz,
> 2GiB of RAM machine)?

Hmm, some quick math: you have 74 nodes that (I assume) check in at least once every 1800 seconds. Each compile takes above 40 seconds on average, so all compiles (if run serially) take some 3000 seconds. Of course, seeing as you have two cores on that machine, you can take advantage of some concurrency, but in the ideal case you're down to 1500 seconds, which leaves you with little room to breathe (and in real life, concurrency will not be that efficient).

I propose you restructure your manifests so that they compile faster (if at all possible) or scale up your master. What you're watching is probably just overload and resource thrashing.

Do you have any idea why each individual compilation takes that long? I see a 15-20 second compile now and again, but most compiles are under 3 seconds in my case (but then, that's with 4 2.4GHz cores, so it doesn't necessarily compare).

Regards,
Felix
Brice Figureau
2011-Jan-26 10:13 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Tue, 2011-01-25 at 17:11 -0500, Micah Anderson wrote:
> Brice Figureau <brice-puppet@daysofwonder.com> writes:
>
> All four of my mongrels are constantly pegged, doing 40-50% of the CPU
> each, occupying all available CPUs. They never settle down. I've got 74
> nodes checking in now; it doesn't seem like that many, but perhaps
> I've reached a tipping point with my puppetmaster (it's a dual 1GHz,
> 2GiB of RAM machine)?

The puppetmaster is mostly CPU bound. Since you have only 2 CPUs, you shouldn't try to achieve a concurrency of 4 (which your mongrels are trying to do), otherwise more than one request will be accepted by one mongrel process and each thread will contend for the CPU. The bad news is that the ruby MRI uses green threading, so the second thread will only run when the first one either sleeps, does I/O, or relinquishes the CPU voluntarily. In other words, it will only run when the first thread has finished its compilation.

Now you have 74 nodes, with a worst-case compilation time of 75s (which is a lot); that translates to 74*75 = 5550s of compilation time. With a concurrency of 2, that's still 2775s of compilation time per round of <insert here your default sleep time>. With the default 30min of sleep time and assuming perfect scheduling, that's larger than a round of sleep time, which means you won't ever finish compiling nodes before the first node asks again for a catalog.

And I'm talking only about compilation. If your manifests use file sourcing, you must also add this to the equation.

Another explanation of the issue is swapping. You mention your server has 2GiB of RAM. Are you sure your 4 mongrel processes after some time still fit in the physical RAM (along with the other things running on the server)? Maybe your server is constantly swapping.
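The green-threading limitation described above is easy to demonstrate: on MRI 1.8, two CPU-bound threads take about as long as running the work twice serially (an illustrative sketch added for this writeup, not part of the original message):

---
require 'benchmark'

work = lambda { 3_000_000.times { } }

serial = Benchmark.realtime { 2.times { work.call } }
threaded = Benchmark.realtime do
  threads = (1..2).map { Thread.new { work.call } }
  threads.each { |t| t.join }
end

# On green threads both figures come out about the same:
# the two threads never run on two cores at once.
printf("serial: %.1fs  threaded: %.1fs\n", serial, threaded)
---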
So you can do several things to get better performance:

* reduce the number of nodes that check in at a single time (i.e. increase sleep time)
* reduce the time it takes to compile a catalog:
  + which includes not using storeconfigs (or using puppetqd or thin_storeconfigs instead)
  + check the server is not swapping
  + reduce the number of mongrel instances, to artificially reduce the concurrency (this is counter-intuitive, I know)
  + use a "better" ruby interpreter like Ruby Enterprise Edition (for several reasons this one has a better GC and a better memory footprint)
  + cache compiled catalogs in nginx
  + offload file content serving in nginx
  + use passenger instead of mongrel

Note: you can use puppet-load (in the 2.6 source distribution) to simulate concurrent nodes asking for catalogs. This is really helpful to size a puppetmaster and check the real concurrency a stack/hardware can give.

> I've tried a large number of different things to attempt to work around
> this:
>
> 0. reduced my node check-in times to be once an hour (and splayed
> randomly)
>
> 1. turned on puppetqd/stomp queuing
>
> This didn't seem to make a difference; it's off now.
>
> 2. turned on thin stored configs
>
> This sort of helped a little, but not enough.
>
> 3. tried to upgrade rails from 2.3.5 (the debian version) to 2.3.10
>
> I didn't see any appreciable difference here. I ended up going back to
> 2.3.5 because that was the packaged version.

Since you seem to use Debian, make sure you use either the latest ruby from lenny-backports (or REE), as they fixed an issue with pthreads and CPU consumption:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579229

> 4. tried to offload file content via nginx[1]
>
> This maybe helped a little, but it's clear that the problem isn't the
> fileserving; it seems to be something in the catalog compilation.

Actually offloading only helps when the puppet agent needs the file content, which happens only when the file changes or when the node doesn't have it yet. In practice this helps only for new nodes.

> 5. tried to cache catalogs by adding an http front-end cache and
> expiring that cache when manifests are updated[1]
>
> I'm not sure this works at all.

This should have helped, because this would prevent the puppetmaster from even being called. You might check your nginx configuration then.

> 6. set 'fair' queuing in my nginx.conf[3]
>
> This seemed to help for a few days, but then things got bad again.
>
> 7. set --http_compression
>
> I'm not sure if this actually hurts the master or not (because it has
> to now occupy the CPU compressing catalogs?)

This is a client option, and you need the collaboration of nginx for it to work. This will certainly add more burden on your master CPU, because nginx now has to gzip everything you're sending.

> 8. tried to follow the introspection technique[2]
>
> This wasn't so easy to do; I had to operate really fast, because if I
> was too slow the thread would exit, or it would get hung up on:
>
> [Thread 0xb6194b70 (LWP 25770) exited]
> [New Thread 0xb6194b70 (LWP 25806)]

When you attach gdb, how many threads are running?

> Eventually I did manage to get somewhere:
>
> 0xb74f1b16 in memcpy () from /lib/i686/cmov/libc.so.6
> (gdb) session-ruby
> (gdb) redirect_stdout
> $1 = 2
> (gdb)
> $2 = 2
> (gdb) eval "caller"
> $3 = 3
> (gdb) rb_object_counts
> Cannot get thread event message: debugger service failed
> An error occurred while in a function called from GDB.
> Evaluation of the expression containing the function
> (rb_eval_string_protect) will be abandoned.
> When the function is done executing, GDB will silently stop.
> (gdb) eval "total = \[\[ObjectSpace\]\].each_object(Array)\{\|x\| puts '---'; puts x.inspect \}; puts \\"---\\nTotal Arrays: \#{total}\\""
> Invalid character '\' in expression.
>
> ... then nothing.
>
> In the tail:
>
> root@puppetmaster:/tmp# tail -f ruby-debug.28724
> [...]
This is just the objects in use at a given time. What is more interesting is where the CPU time is spent (i.e. getting a stack trace would be helpful, but not easy).

> but then it seemed to stop logging entirely...
>
> I'm available on IRC to try more advanced debugging, just ping me
> (hacim). I'd really like things to function again!

I'll ping you, but I'm just really busy for the next couple of days :(
--
Brice Figureau
My Blog: http://www.masterzen.fr/
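Brice mentions puppet-load above; for a sense of what such a load test does, here is a minimal Ruby stand-in (a sketch with several assumptions: the hostname, port, and node names are placeholders, and the master has to accept these requests, e.g. a test instance that does not enforce client certificate verification):

---
require 'net/https'
require 'benchmark'

def fetch_catalog(node)
  http = Net::HTTP.new('puppetmaster', 8140)  # placeholder host/port
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE  # test setup only
  http.get("/production/catalog/#{node}", 'Accept' => 'pson')
end

threads = %w[node1 node2 node3 node4].map do |name|
  Thread.new do
    time = Benchmark.realtime { fetch_catalog(name) }
    printf("%s: %.1fs\n", name, time)
  end
end
threads.each { |t| t.join }
---

Unlike the compilation itself, these requests are I/O from the client's side, so plain Ruby threads are enough to generate real concurrency on the master.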
Micah Anderson
2011-Jan-26 14:44 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Felix Frank <felix.frank@alumni.tu-berlin.de> writes:

> I propose you restructure your manifests so that they compile
> faster (if at all possible) or scale up your master. What you're
> watching is probably just overload and resource thrashing.

I'm interested in ideas for what good steps are for restructuring manifests so they compile faster, or at least methods for identifying problematic areas in manifests.

> Do you have any idea why each individual compilation takes that long?

It wasn't before. Before things start spinning, compilation times are between 9 seconds and 60 seconds, usually averaging just shy of 30 seconds.

micah
Felix Frank
2011-Jan-26 14:46 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 01/26/2011 03:44 PM, Micah Anderson wrote:
> Felix Frank <felix.frank@alumni.tu-berlin.de> writes:
>
> [...]
>
> I'm interested in ideas for what good steps are for restructuring
> manifests so they compile faster, or at least methods for
> identifying problematic areas in manifests.

Are there many templates or uses of the file() function? Do you make heavy use of modules and the autoloader?

>> Do you have any idea why each individual compilation takes that long?
>
> It wasn't before. Before things start spinning, compilation times are
> between 9 seconds and 60 seconds, usually averaging just shy of 30
> seconds.

That's still quite considerable IMO.

Regards,
Felix
Micah Anderson
2011-Jan-26 15:11 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Brice Figureau <brice-puppet@daysofwonder.com> writes:

> On Tue, 2011-01-25 at 17:11 -0500, Micah Anderson wrote:
>> Brice Figureau <brice-puppet@daysofwonder.com> writes:
>>
>> All four of my mongrels are constantly pegged, doing 40-50% of the CPU
>> each, occupying all available CPUs. They never settle down. I've got 74
>> nodes checking in now; it doesn't seem like that many, but perhaps
>> I've reached a tipping point with my puppetmaster (it's a dual 1GHz,
>> 2GiB of RAM machine)?
>
> The puppetmaster is mostly CPU bound. Since you have only 2 CPUs, you
> shouldn't try to achieve a concurrency of 4 (which your mongrels are
> trying to do), otherwise more than one request will be accepted by one
> mongrel process and each thread will contend for the CPU. The bad news
> is that the ruby MRI uses green threading, so the second thread will
> only run when the first one either sleeps, does I/O, or relinquishes
> the CPU voluntarily. In other words, it will only run when the first
> thread has finished its compilation.

Ok, that is a good thing to know. I wasn't aware that ruby was not able to do that.

> Now you have 74 nodes, with a worst-case compilation time of 75s (which is
> a lot); that translates to 74*75 = 5550s of compilation time.
> With a concurrency of 2, that's still 2775s of compilation time per
> round of <insert here your default sleep time>. With the default 30min
> of sleep time and assuming perfect scheduling, that's larger
> than a round of sleep time, which means you won't ever finish
> compiling nodes before the first node asks again for a catalog.

I'm doing 60 minutes of sleep time, which is 3600 seconds an hour; the concurrency of 2 giving me 2775s of compile time per hour does keep me under the 3600 seconds... assuming scheduling is perfect, which it very likely is not.

> And I'm talking only about compilation. If your manifests use file
> sourcing, you must also add this to the equation.

As explained, I set up your nginx method for offloading file sourcing.

> Another explanation of the issue is swapping. You mention your server
> has 2GiB of RAM. Are you sure your 4 mongrel processes after some time
> still fit in the physical RAM (along with the other things running on the
> server)?
> Maybe your server is constantly swapping.

I'm actually doing fine on memory, not dipping into swap. I've watched i/o to see if I could identify either a swap or disk problem, but didn't notice very much happening there. The CPU usage of the mongrel processes is pretty much where everything is spending its time.

I've been wondering if I have some loop in a manifest or something that is causing them to just spin.

> So you can do several things to get better performance:
> * reduce the number of nodes that check in at a single time (i.e. increase
> sleep time)

I've already reduced it to once per hour, but I could consider reducing it more.

> * reduce the time it takes to compile a catalog:
> + which includes not using storeconfigs (or using puppetqd or
> thin_storeconfigs instead)

I need to use storeconfigs, and as detailed in my original message, I've tried puppetqd and it didn't do much for me.
thin_storeconfigs did help, and I'm still using it, so this one has already been done too.

> + check the server is not swapping

Not swapping.

> + reduce the number of mongrel instances, to artificially reduce the
> concurrency (this is counter-intuitive, I know)

Ok, I'm backing off to two mongrels to see how well that works.

> + use a "better" ruby interpreter like Ruby Enterprise Edition (for
> several reasons this one has a better GC and a better memory footprint)

I'm pretty sure my problem isn't memory, so I'm not sure if these will help much.

> + cache compiled catalogs in nginx

Doing this.

> + offload file content serving in nginx

Doing this.

> + use passenger instead of mongrel

I tried to switch to passenger, and things were much worse. Actually, passenger worked fine with 0.25, but when I upgraded I couldn't get it to function anymore. I actually had to go back to nginx to get things functioning again.

>> 3. tried to upgrade rails from 2.3.5 (the debian version) to 2.3.10
>>
>> I didn't see any appreciable difference here. I ended up going back to
>> 2.3.5 because that was the packaged version.
>
> Since you seem to use Debian, make sure you use either the latest ruby
> from lenny-backports (or REE), as they fixed an issue with pthreads and CPU
> consumption:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579229

I'm using Debian Squeeze, which has the same version you are mentioning from lenny backports (2.3.5).

>> 5. tried to cache catalogs by adding an http front-end cache and
>> expiring that cache when manifests are updated[1]
>>
>> I'm not sure this works at all.
>
> This should have helped, because this would prevent the puppetmaster
> from even being called. You might check your nginx configuration then.

Hmm. According to jamesturnbull, the rest terminus shouldn't allow you to request any node's catalog, so I'm not sure how this can work at all... but in case I've got something screwed up in my nginx.conf, I'd really be happy if you could have a look at it; it's possible that I misunderstood something from your blog post!
Here it is:

user www-data;
worker_processes 2;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
  # In a reverse proxy situation, max_clients becomes
  # max_clients = worker_processes * worker_connections/4
  worker_connections 2048;
}

http {
  default_type application/octet-stream;

  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;
  large_client_header_buffers 1024 2048k;
  client_max_body_size 150m;
  proxy_buffers 128 4k;
  keepalive_timeout 65;

  gzip on;
  gzip_min_length 1000;
  gzip_types text/plain;

  ssl on;
  ssl_certificate /var/lib/puppet/ssl/certs/puppetmaster.pem;
  ssl_certificate_key /var/lib/puppet/ssl/private_keys/puppetmaster.pem;
  ssl_client_certificate /var/lib/puppet/ssl/ca/ca_crt.pem;
  ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA;
  ssl_session_cache shared:SSL:8m;
  ssl_session_timeout 5m;
  proxy_read_timeout 600;

  upstream puppet_mongrel {
    fair;
    server 127.0.0.1:18140;
    server 127.0.0.1:18141;
    server 127.0.0.1:18142;
    server 127.0.0.1:18143;
  }

  log_format noip '0.0.0.0 - $remote_user [$time_local] '
                  '"$request" $status $body_bytes_sent '
                  '"$http_referer" "$http_user_agent"';

  proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=puppetcache:10m;

  server {
    listen 8140;
    access_log /var/log/nginx/access.log noip;
    ssl_verify_client required;

    root /etc/puppet;

    # make sure we serve everything
    # as raw
    types { }
    default_type application/x-raw;

    # serve static files for the [files] mountpoint
    location /production/file_content/files/ {
      allow 172.16.0.0/16;
      allow 10.0.1.0/8;
      allow 127.0.0.1/8;
      deny all;

      alias /etc/puppet/files/;
    }

    # serve modules' files sections
    location ~ /production/file_content/[^/]+/files/ {
      # it is advisable to have some access rules here
      allow 172.16.0.0/16;
      allow 10.0.1.0/8;
      allow 127.0.0.1/8;
      deny all;

      root /etc/puppet/modules;

      # rewrite /production/file_content/module/files/file.txt
      # to /module/file.txt
      rewrite ^/production/file_content/([^/]+)/files/(.+)$ $1/$2 break;
    }

    # Variables
    # $ssl_cipher          the cipher used for the established SSL connection
    # $ssl_client_serial   the serial number of the client certificate
    # $ssl_client_s_dn     the subject DN of the client certificate
    # $ssl_client_i_dn     the issuer DN of the client certificate
    # $ssl_protocol        the protocol of the established SSL connection

    location / {
      proxy_pass http://puppet_mongrel;
      proxy_redirect off;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Client-Verify SUCCESS;
      proxy_set_header X-SSL-Subject $ssl_client_s_dn;
      proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
      proxy_buffer_size 16k;
      proxy_buffers 8 32k;
      proxy_busy_buffers_size 64k;
      proxy_temp_file_write_size 64k;
      proxy_read_timeout 540;

      # we handle catalogs differently
      # because we want to cache them
      location /production/catalog {
        proxy_pass http://puppet_mongrel;
        proxy_redirect off;

        # it is a good thing to actually restrict who
        # can ask for a catalog (especially for cached
        # catalogs)
        allow 172.16.0.0/16;
        allow 10.0.1.0/8;
        allow 127.0.0.1/8;
        deny all;

        # where to cache contents
        proxy_cache puppetcache;

        # we cache content by catalog host
        # we could also use $args to take into account request
        # facts, but those change too often (ie uptime or memory)
        # to be really useful
        proxy_cache_key $uri;

        # define how long to cache responses

        # normal catalogs will be cached 2 weeks
        proxy_cache_valid 200 302 301 2w;

        # errors are not cached long
        proxy_cache_valid 500 403 1m;

        # the rest is cached a little bit
        proxy_cache_valid any 30m;
      }

      # catch-all location for other termini
      location / {
        proxy_pass http://puppet_mongrel;
        proxy_redirect off;
      }
    }
  }

  server {
    listen 8141;
    ssl_verify_client off;
    root /var/empty;
    access_log /var/log/nginx/access.log noip;

    location / {
      proxy_pass http://puppet_mongrel;
      proxy_redirect off;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Client-Verify FAILURE;
      proxy_set_header X-SSL-Subject $ssl_client_s_dn;
      proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
    }
  }
}

>> 7. set --http_compression
>>
>> I'm not sure if this actually hurts the master or not (because it has
>> to now occupy the CPU compressing catalogs?)
>
> This is a client option, and you need the collaboration of nginx for it
> to work. This will certainly add more burden on your master CPU, because
> nginx now has to gzip everything you're sending.

Yeah, I have the gzip compression turned on in nginx, but I don't really need it and my master could use the break.

>> 8. tried to follow the introspection technique[2]
>>
>> This wasn't so easy to do; I had to operate really fast, because if I
>> was too slow the thread would exit, or it would get hung up on:
>>
>> [Thread 0xb6194b70 (LWP 25770) exited]
>> [New Thread 0xb6194b70 (LWP 25806)]
>
> When you attach gdb, how many threads are running?

I'm not sure, how can I determine that? I just had the existing 4 mongrel processes.

>> (gdb) eval "total = \[\[ObjectSpace\]\].each_object(Array)\{\|x\| puts '---'; puts x.inspect \}; puts \\"---\\nTotal Arrays: \#{total}\\""
>> Invalid character '\' in expression.

The above seemed to be a problem with the expression on the wiki page; does anyone know what that should be so gdb doesn't have a problem with it?

>> I'm available on IRC to try more advanced debugging, just ping me
>> (hacim). I'd really like things to function again!
>
> I'll ping you, but I'm just really busy for the next couple of days :(

Thanks for any help or ideas, I'm out of ideas myself so anything helps!

micah
Brice Figureau
2011-Jan-26 15:35 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, 2011-01-26 at 09:44 -0500, Micah Anderson wrote:
> Felix Frank <felix.frank@alumni.tu-berlin.de> writes:
>
> [...]
>
> I'm interested in ideas for what good steps are for restructuring
> manifests so they compile faster, or at least methods for
> identifying problematic areas in manifests.
>
>> Do you have any idea why each individual compilation takes that long?
>
> It wasn't before. Before things start spinning, compilation times are
> between 9 seconds and 60 seconds, usually averaging just shy of 30
> seconds.

Do you use an External Node Classifier?
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
Brice Figureau
2011-Jan-26 16:23 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, 2011-01-26 at 10:11 -0500, Micah Anderson wrote:
> Brice Figureau <brice-puppet@daysofwonder.com> writes:
>
> [...]
>
> I'm actually doing fine on memory, not dipping into swap. I've watched
> i/o to see if I could identify either a swap or disk problem, but didn't
> notice very much happening there. The CPU usage of the mongrel processes
> is pretty much where everything is spending its time.
>
> I've been wondering if I have some loop in a manifest or something that
> is causing them to just spin.

I don't think that's the problem. There could be some ruby internals issues playing here, but I doubt something in your manifest creates a loop. What is strange is that you mentioned that the very first catalog compilations were fine, but then the compilation time increases.

> > So you can do several things to get better performance:
> > * reduce the number of nodes that check in at a single time (i.e. increase
> > sleep time)
>
> I've already reduced it to once per hour, but I could consider reducing it
> more.

That would be interesting. This would help us know if the problem is too much load/concurrency from your clients or a problem in the master itself.
BTW, what's the load on the server?

> > * reduce the time it takes to compile a catalog:
> > + which includes not using storeconfigs (or using puppetqd or
> > thin_storeconfigs instead)
>
> I need to use storeconfigs, and as detailed in my original message, I've
> tried puppetqd and it didn't do much for me. thin_storeconfigs did help,
> and I'm still using it, so this one has already been done too.
>
> > + check the server is not swapping
>
> Not swapping.

OK, good.

> > + reduce the number of mongrel instances, to artificially reduce the
> > concurrency (this is counter-intuitive, I know)
>
> Ok, I'm backing off to two mongrels to see how well that works.

Let me know if that changes something.

> > + use a "better" ruby interpreter like Ruby Enterprise Edition (for
> > several reasons this one has a better GC and a better memory footprint)
>
> I'm pretty sure my problem isn't memory, so I'm not sure if these will
> help much.

Well, having a better GC means the ruby interpreter becomes faster at allocating and recycling objects. In the end that means the overall memory footprint can be better, but it also means it will spend much less time doing garbage collection (i.e. it will use the CPU for your code and not for tidying stuff).

> > + cache compiled catalogs in nginx
>
> Doing this.
>
> > + offload file content serving in nginx
>
> Doing this.
>
> > + use passenger instead of mongrel
>
> I tried to switch to passenger, and things were much worse. Actually,
> passenger worked fine with 0.25, but when I upgraded I couldn't get it
> to function anymore. I actually had to go back to nginx to get things
> functioning again.
>
> >> 3. tried to upgrade rails from 2.3.5 (the debian version) to 2.3.10
> >>
> >> I didn't see any appreciable difference here. I ended up going back to
> >> 2.3.5 because that was the packaged version.
> >
> > Since you seem to use Debian, make sure you use either the latest ruby
> > from lenny-backports (or REE), as they fixed an issue with pthreads and CPU
> > consumption:
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579229
>
> I'm using Debian Squeeze, which has the same version you are mentioning
> from lenny backports (2.3.5).

I was talking about the ruby1.8 package, not rails. Make sure you use the squeeze version or the lenny-backports one.

> >> 5. tried to cache catalogs by adding an http front-end cache and
> >> expiring that cache when manifests are updated[1]
> >>
> >> I'm not sure this works at all.
> >
> > This should have helped, because this would prevent the puppetmaster
> > from even being called. You might check your nginx configuration then.
>
> Hmm. According to jamesturnbull, the rest terminus shouldn't allow you
> to request any node's catalog, so I'm not sure how this can work at
> all... but in case I've got something screwed up in my nginx.conf, I'd
> really be happy if you could have a look at it; it's possible that I
> misunderstood something from your blog post! Here it is:

When a client asks for a catalog, nginx checks whether it has already cached it; if it has and the cache is still fresh, it serves it, otherwise it asks a puppetmaster for the same REST url and then caches what the master returns.

It's easy to check whether nginx is caching the catalog: have a look into /var/cache/nginx/cache and see if there are some files containing some of your catalogs.

Puppet doesn't send the necessary caching headers right now, and I'm not sure how nginx deals with that. I hope it would still cache (through the virtue of proxy_cache_valid).
What version of nginx are you using?

> server {
> listen 8140;
> access_log /var/log/nginx/access.log noip;
> ssl_verify_client required;

Make that:
ssl_verify_client optional;

And remove the second server{} block, and make sure your clients do not
use a different ca_port. But only if you use nginx >= 0.7.64

> root /etc/puppet;
>
> # make sure we serve everything
> # as raw
> types { }
> default_type application/x-raw;
>
> # serve static files for the [files] mountpoint
> location /production/file_content/files/ {
> allow 172.16.0.0/16;
> allow 10.0.1.0/8;
> allow 127.0.0.1/8;
> deny all;
>
> alias /etc/puppet/files/;
> }
>
> # serve modules'' files sections
> location ~ /production/file_content/[^/]+/files/ {
> # it is advisable to have some access rules here
> allow 172.16.0.0/16;
> allow 10.0.1.0/8;
> allow 127.0.0.1/8;
> deny all;
>
> root /etc/puppet/modules;
>
> # rewrite /production/file_content/module/files/file.txt
> # to /module/file.txt
> rewrite ^/production/file_content/([^/]+)/files/(.+)$ $1/$2 break;
> }
>
> # Variables
> # $ssl_cipher returns the cipher used for the established SSL connection
> # $ssl_client_serial returns the serial number of the client certificate for the established SSL connection
> # $ssl_client_s_dn returns the subject DN of the client certificate for the established SSL connection
> # $ssl_client_i_dn returns the issuer DN of the client certificate for the established SSL connection
> # $ssl_protocol returns the protocol of the established SSL connection
>
> location / {
> proxy_pass http://puppet_mongrel;
> proxy_redirect off;
> proxy_set_header Host $host;
> proxy_set_header X-Real-IP $remote_addr;
> proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
> proxy_set_header X-Client-Verify SUCCESS;

If you used ssl_verify_client as I explained above, this should be:
proxy_set_header X-Client-Verify $ssl_client_verify

> proxy_set_header X-SSL-Subject $ssl_client_s_dn;
> proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
> proxy_buffer_size 16k;
> proxy_buffers 8 32k;
> proxy_busy_buffers_size 64k;
> proxy_temp_file_write_size 64k;
> proxy_read_timeout 540;
>
> # we handle catalogs differently
> # because we want to cache them
> location /production/catalog {

Warning: this ^^ will work only if your nodes are in the "production"
environment. Adjust for your environments.

> proxy_pass http://puppet_mongrel;
> proxy_redirect off;
>
> # it is a good thing to actually restrict who
> # can ask for a catalog (especially for cached
> # catalogs)
> allow 172.16.0.0/16;
> allow 10.0.1.0/8;
> allow 127.0.0.1/8;
> deny all;
>
> # where to cache contents
> proxy_cache puppetcache;
>
> # we cache content by catalog host
> # we could also use $args to take into account request
> # facts, but those change too often (ie uptime or memory)
> # to be really useful
> proxy_cache_key $uri;
>
> # define how long to cache responses
>
> # normal catalogs will be cached 2 weeks
> proxy_cache_valid 200 302 301 2w;
>
> # errors are not cached long
> proxy_cache_valid 500 403 1m;
>
> # the rest is cached a little bit
> proxy_cache_valid any 30m;
> }
>
> # catch-all location for other termini
> location / {

You already have a location ''/'' above.
Are you sure nginx is correctly using this configuration?
Try:
nginx -t
It will check your configuration.

> proxy_pass http://puppet_mongrel;
> proxy_redirect off;
> }
> }
> }
> server {
> listen 8141;
> ssl_verify_client off;
> root /var/empty;
> access_log /var/log/nginx/access.log noip;
>
> location / {
> proxy_pass http://puppet_mongrel;
> proxy_redirect off;
> proxy_set_header Host $host;
> proxy_set_header X-Real-IP $remote_addr;
> proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
> proxy_set_header X-Client-Verify FAILURE;
> proxy_set_header X-SSL-Subject $ssl_client_s_dn;
> proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
> }
> }
> }

This server{} wouldn''t be needed if you use ssl_verify_client as
explained above.

> >> 7. set --http_compression
> >>
> >> I''m not sure if this actually hurts the master or not (because it has
> >> to now occupy the CPU compressing catalogs?)
> >
> > This is a client option, and you need the collaboration of nginx for it
> > to work. This will certainly add more burden on your master CPU, because
> > nginx now has to gzip everything you''re sending.
>
> Yeah, I have the gzip compression turned on in nginx, but I don''t really
> need it and my master could use the break.

Actually your nginx is only compressing text/plain documents, so it
won''t compress your catalogs.

> >> 8. tried to follow the introspection technique[2]
> >>
> >> this wasn''t so easy to do, I had to operate really fast, because if I
> >> was too slow the thread would exit, or it would get hung up on:
> >>
> >> [Thread 0xb6194b70 (LWP 25770) exited]
> >> [New Thread 0xb6194b70 (LWP 25806)]
> >
> > When you attach gdb, how many threads are running?
>
> I''m not sure, how can I determine that? I just had the existing 4
> mongrel processes.

Maybe you can first try to display the full C trace for all threads:
thread apply all bt

Then resume everything and, 2 to 5 seconds later, take another snapshot
with the command above. Comparing the two traces might help us
understand what the process is doing.

HTH,
-- 
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
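Condensing Brice''s corrections, the single-server layout he is steering Micah towards would look roughly like the sketch below. This is an editor''s condensation, not a config from the thread; it assumes nginx >= 0.7.64 (for ssl_verify_client optional) and reuses the upstream and cache-zone names quoted above:

server {
    listen 8140;
    # ''optional'' lets one server block handle both verified agents and
    # initial certificate requests, replacing the second server{} on 8141
    ssl_verify_client optional;

    location /production/catalog {
        proxy_pass http://puppet_mongrel;
        # pass the real verification result instead of a hardcoded SUCCESS
        proxy_set_header X-Client-Verify $ssl_client_verify;
        proxy_cache       puppetcache;
        proxy_cache_key   $uri;
        proxy_cache_valid 200 301 302 2w;
    }

    location / {
        proxy_pass http://puppet_mongrel;
        proxy_set_header X-Client-Verify $ssl_client_verify;
    }
}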
Brice Figureau
2011-Jan-26 16:30 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Wed, 2011-01-26 at 10:11 -0500, Micah Anderson wrote:
> http {
> default_type application/octet-stream;
>
> sendfile on;
> tcp_nopush on;
> tcp_nodelay on;
>
> large_client_header_buffers 1024 2048k;
> client_max_body_size 150m;
> proxy_buffers 128 4k;
>
> keepalive_timeout 65;
>
> gzip on;
> gzip_min_length 1000;
> gzip_types text/plain;
>
> ssl on;
> ssl_certificate /var/lib/puppet/ssl/certs/puppetmaster.pem;
> ssl_certificate_key /var/lib/puppet/ssl/private_keys/puppetmaster.pem;
> ssl_client_certificate /var/lib/puppet/ssl/ca/ca_crt.pem;
> ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA;
> ssl_session_cache shared:SSL:8m;
> ssl_session_timeout 5m;
>
> proxy_read_timeout 600;
> upstream puppet_mongrel {
> fair;
> server 127.0.0.1:18140;
> server 127.0.0.1:18141;
> server 127.0.0.1:18142;
> server 127.0.0.1:18143;
> }
> log_format noip ''0.0.0.0 - $remote_user [$time_local] ''
> ''"$request" $status $body_bytes_sent ''
> ''"$http_referer" "$http_user_agent"'';
>
> proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=puppetcache:10m;

Make this:

proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=puppetcache:50m inactive=300m

The default inactive is 10 minutes, which is too low for a sleeptime of
60 minutes, and makes it possible for a cached catalog to be evicted.
-- 
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Micah Anderson
2011-Jan-26 19:47 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Felix Frank <felix.frank@alumni.tu-berlin.de> writes:

> On 01/26/2011 03:44 PM, Micah Anderson wrote:
>> Felix Frank <felix.frank@alumni.tu-berlin.de> writes:
>>
>>> I propose you need to restructure your manifest so that it compiles
>>> faster (if at all possible) or scale up your master. What you''re
>>> watching is probably just overload and resource thrashing.
>>
>> I''m interested in ideas for what are good steps for restructuring
>> manifests so they can compile faster, or at least methods for
>> identifying problematic areas in manifests.
>
> Are there many templates or use of the file() function?

Yes, there are quite a few. I''m not really sure the best way to count
them. I have 288 ''source => "$fileserver"'' lines in my
manifests. Another ~160 of them in various modules. As far as templates
go, I have ~77 "content => template(...)" lines in my manifests and
another 55 in modules.

> Do you make heavy use of modules and the autoloader?

I do make heavy use of modules, I have about 50 of them. I''m importing
18 of them in my manifests/modules.pp. I think, if they are set up
right, I only need to import one of those, and I''ve been slowly paring
those down. I presume that by ''the autoloader'' you mean those
modules which aren''t explicitly included somewhere?

>>> Do you have any idea why each individual compilation takes that long?
>>
>> It wasn''t before. Before things start spinning, compilation times are
>> between 9 seconds and 60 seconds, usually averaging just shy of 30
>> seconds.
>
> That''s still quite considerable IMO.

Actually, looking at my logs, compile time was averaging around
15 seconds each, some taking very little time at all. When things go
bad, it''s more or less a thundering herd and the times start going up
and up.

micah

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
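For counting those, a throwaway Ruby sketch does the job; the search root is an assumption, and the regexes only approximate what the parser actually sees:

# Tally file-sourcing and template usage across manifests and modules.
counts = Hash.new(0)
Dir.glob('/etc/puppet/**/*.pp').each do |manifest|
  text = File.read(manifest)
  counts['source']   += text.scan(/source\s*=>/).size
  counts['template'] += text.scan(/content\s*=>\s*template\(/).size
end
counts.each { |kind, n| puts "#{kind}: #{n}" }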
Micah Anderson
2011-Jan-26 19:48 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Brice Figureau <brice-puppet@daysofwonder.com> writes:

> On Wed, 2011-01-26 at 09:44 -0500, Micah Anderson wrote:
>> Felix Frank <felix.frank@alumni.tu-berlin.de> writes:
>>
>> > I propose you need to restructure your manifest so that it compiles
>> > faster (if at all possible) or scale up your master. What you''re
>> > watching is probably just overload and resource thrashing.
>>
>> I''m interested in ideas for what are good steps for restructuring
>> manifests so they can compile faster, or at least methods for
>> identifying problematic areas in manifests.
>>
>> > Do you have any idea why each individual compilation takes that long?
>>
>> It wasn''t before. Before things start spinning, compilation times are
>> between 9 seconds and 60 seconds, usually averaging just shy of 30
>> seconds.
>
> Do you use an External Node Classifier?

I do not.

micah

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Micah Anderson
2011-Jan-26 20:40 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Brice Figureau <brice-puppet@daysofwonder.com> writes:

> On Wed, 2011-01-26 at 10:11 -0500, Micah Anderson wrote:
>> I''ve been wondering if I have some loop in a manifest or something that
>> is causing them to just spin.
>
> I don''t think that''s the problem. There could be some ruby internals
> issues playing here, but I doubt something in your manifest creates a
> loop.
>
> What is strange is that you mentioned that the very first catalog
> compilations were fine, but then the compilation time increases.

Yes, and it increases quite rapidly. Interesting to note that the first
few compile times are basically within range of what I was experiencing
before things started to tip over (the last few days).
I''m struggling to think of anything I could have changed, but so far I
haven''t come up with anything.

>> > So you can do several things to get better performance:
>> > * reduce the number of nodes that check in at a single time (ie increase
>> > sleep time)
>>
>> I''ve already reduced to once per hour, but I could consider reducing it
>> more.
>
> That would be interesting. This would help us know if the problem is too
> much load/concurrency from your clients or a problem in the master
> itself.

I''ll need to set up mcollective to do that, I believe. Right now I''m
setting up a cronjob like this:

"<%= scope.function_fqdn_rand([''59'']) %> * * * *"

which results in a cronjob (on one host):

6 * * * * root /usr/sbin/puppetd --onetime --no-daemonize --config=/etc/puppet/puppet.conf --color false | grep -E ''(^err:|^alert:|^emerg:|^crit:)''

> BTW, what''s the load on the server?

The server is dedicated to puppetmaster. When I had four mongrels
running it was basically at 4 constantly. Now that I''ve backed it down
to 2 mongrels, it''s:

11:57:41 up 58 days, 21:20, 2 users, load average: 2.31, 1.97, 2.02

>> Not swapping.
>
> OK, good.

Just as a confirmation of this... vmstat shows no si/so happening, and
very high numbers in the CPU user column. Very little bi/bo, and low sys
values. Context switches are a bit high... this clearly points to the
process eating CPU, not any disk/memory/swap scenario.

>> > + Reduce the number of mongrel instances, to artificially reduce the
>> > concurrency (this is counter-intuitive I know)
>>
>> Ok, I''m backing off to two mongrels to see how well that works.
>
> Let me know if that changes something.

Doesn''t seem to help. Compiles start out low, and are inching up
(started at 27, and now they are at 120 seconds).

>> > + use a "better" ruby interpreter like Ruby Enterprise Edition (for
>> > several reasons this one has better GC, better memory footprint).
>>
>> I''m pretty sure my problem isn''t memory, so I''m not sure if these will
>> help much.
>
> Well, having a better GC means that the ruby interpreter will become
> faster at allocating and recycling objects. In the end that means
> the overall memory footprint can be better, but it also means it will
> spend much less time doing garbage collection (ie the CPU is used for
> your code and not for tidying up).

That could be interesting. I haven''t tried REE or jruby on debian
before; I suppose it''s worth a try.

>> >> 3. tried to upgrade rails from 2.3.5 (the debian version) to 2.3.10
>> >>
>> >> I didn''t see any appreciable difference here. I ended up going back to
>> >> 2.3.5 because that was the packaged version.
>> >
>> > Since you seem to use Debian, make sure you use either the latest ruby
>> > lenny backports (or REE) as they fixed an issue with pthreads and CPU
>> > consumption:
>> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=579229
>>
>> I''m using Debian Squeeze, which has the same version you are mentioning
>> from lenny backports (2.3.5).
>
> I was talking about the ruby1.8 package, not rails. Make sure you use
> the squeeze version or the lenny-backports one.

Yep, I''m using the squeeze ruby1.8, which is 1.8.7.302-2

>> >> 5. tried to cache catalogs through adding a http front-end cache and
>> >> expiring that cache when manifests are updated[1]
>> >>
>> >> I''m not sure this works at all.
>> >
>> > This should have helped because this would prevent the puppetmaster from
You might check your nginx configuration then.It wasn''t really caching before, because of the nginx parameter you pointed out in a previous message. But now it seems like it is: find /var/cache/nginx/cache -type f |wc 29 29 1769> What version of nginx are you using?0.7.67-3> Make that: > ssl_verify_client optional; > > And remove the second server{} block, and make sure your clients do not > use a different ca_port. But only if you use nginx >= 0.7.64Ok, that second server block was for the cert request... but sounds like if I tweak the verify to optional, I dont need that. I''m sure the clients aren''t using a different ca_port (except for the initial node bootstrap). I''ve changed that and removed the block.> If you used ssl_verify_client as I explained above, this should be: > proxy_set_header X-Client-Verify $ssl_client_verifyChanged.>> # we handle catalog differently >> # because we want to cache them >> location /production/catalog { > > Warning: this ^^ will work only if your nodes are in the "production" > environment. Adjust for your environments./etc/puppet/puppet.conf has: environment = production I do occasionally use development environments, but rarely enough that not having caching is ok.> You already have a location ''/'' above. > Are you sure nginx is correctly using this configuration? > Try: > nginx -t > it will check your configurationHm, good catch. nginx -t seems ok with it, but I''ve removed the extra location ''/'' just in case.> This server{} wouldn''t be needed if you use the ssl_verify_client as > explained above.Removed.>> >> 7. set --http_compression >> >> >> >> I''m not sure if this actually hurts the master or not (because it has >> >> to now occupy the CPU compressing catalogs?) >> > >> > This is a client option, and you need the collaboration of nginx for it >> > to work. This will certainly add more burden on your master CPU, because >> > nginx now has to gzip everything you''re sending. >> >> Yeah, I have the gzip compression turned on in nginx, but I dont really >> need it and my master could use the break. > > Actually your nginx are only compressing text/plain documents, so it > won''t compress your catalogs.Ah, interesting! Well, again... I''m turning it off on the nodes, its not needed.>> >> 8. tried to follow the introspection technique[2] >> >> >> >> this wasn''t so easy to do, I had to operate really fast, because if I >> >> was too slow the thread would exit, or it would get hung up on: >> >> >> >> [Thread 0xb6194b70 (LWP 25770) exited] >> >> [New Thread 0xb6194b70 (LWP 25806)] >> > >> > When you attach gdb, how many threads are running? >> >> I''m not sure, how can I determine that? I just had the existing 4 >> mongrel processes. > > Maybe you can first try to display the full C trace for all threads: > thread apply all bt > > Then, resume everything, and 2 to 5s take another snapshot with the > command above. Comparing the two trace might help us understand what the > process is doing.Now that I''ve fixed up the nginx.conf and caching is actually happening, I''ve noticed that catalog compiles are 10s, 14s, 19s, 10s, 25s, 8s and things haven''t fallen over yet, so its much better right now. I''m going to let this run for an hour or two and if things are still bad, I''ll look at the thread traces. m -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. 
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
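The ERB cron splay Micah sets up above can also be expressed directly in the Puppet DSL with fqdn_rand; a minimal sketch, with an illustrative resource title and command rather than anything from the thread:

cron { 'puppet-agent':
  command => '/usr/sbin/puppetd --onetime --no-daemonize',
  user    => 'root',
  # fqdn_rand picks a stable per-host value in 0..59, spreading
  # agent check-ins across the hour
  minute  => fqdn_rand(60),
}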
Felix Frank
2011-Jan-27 09:57 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
>> Are there many templates or use of the file() function?
>
> Yes, there are quite a few. I''m not really sure the best way to count
> them. I have 288 ''source => "$fileserver"'' lines in my

Those don''t hurt compilation.

> manifests. Another ~160 of them in various modules. As far as templates
> go, I have ~77 "content => template(...)" lines in my manifests and
> another 55 in modules.

If in doubt, try to lose a greater portion of them to see if compile
times are affected. But that won''t be very practical.

>> Do you make heavy use of modules and the autoloader?
>
> I do make heavy use of modules, I have about 50 of them. I''m importing
> 18 of them in my manifests/modules.pp. I think, if they are set up
> right, I only need to import one of those, and I''ve been slowly paring
> those down. I presume that by ''the autoloader'' you mean those
> modules which aren''t explicitly included somewhere?

Yes. If your structure leads to all modules eventually being included
on all nodes, you''re possibly wasting CPU cycles during compilation.

I recently had a script rename all my classes to <module_name>::classname
(including all references, i.e. includes etc.) and got rid of all import
statements. I didn''t notice any change in compilation time, but the
manifests are now generally less messy, so there really are no downsides
for me.

>>>> Do you have any idea why each individual compilation takes that long?
>>>
>>> It wasn''t before. Before things start spinning, compilation times are
>>> between 9 seconds and 60 seconds, usually averaging just shy of 30
>>> seconds.
>>
>> That''s still quite considerable IMO.
>
> Actually, looking at my logs, compile time was averaging around
> 15 seconds each, some taking very little time at all. When things go
> bad, it''s more or less a thundering herd and the times start going up
> and up.

Reminiscent of what I saw before moving away from Webrick. Seeing as you
noticed in the other branch that tuning nginx (was it?) helped you, I
still think you were just overloaded before.

Regards,
Felix

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
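Felix''s rename-and-autoload scheme looks roughly like the following; the ntp module is a hypothetical example, not one of Micah''s:

# modules/ntp/manifests/init.pp -- autoloaded when 'include ntp' is seen,
# so no import statement is needed:
class ntp {
  package { 'ntp': ensure => installed }
}

# modules/ntp/manifests/server.pp -- the file name maps to the class name,
# following the <module_name>::classname convention:
class ntp::server {
  include ntp
}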
Udo Waechter
2011-Jan-31 18:11 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Hi. I am just reading this thread, and it strikes me that we have the
same problems with 2.6.3.

Since upgrading from 2.6.2 to .3, puppetmaster shows the behaviour
described in this thread. We have about 160 clients, and the
puppetmaster is now an 8-core, 8GB RAM kvm instance. We had this with 4
cores and 4 gigs of RAM; "double-sizing" the VM did not change a thing!

We use passenger 2.2.11debian-2 and apache 2.2.16-3, ruby1.8 from
squeeze.

Puppetmaster works fine after restart, then after about 2-3 hours it
becomes pretty unresponsive; catalog runs go up to 120 seconds and more
(the baseline being something like 10 seconds).

I need to restart apache/puppetmaster about once a day. When I do that I
need to:

* stop apache
* kill (still running) puppetmasters (with SIGKILL!), some are always
left running at "CPU 100%"
* start apache

Something is very weird there, and there were no fundamental changes to
the manifests/modules.

The only thing that really changed is the VM itself. It was XEN (for
years), we switched to KVM with kernel 2.6.35

Another strange thing:

puppet clients take a lot longer to run nowadays. A machine usually took
about 40-50 seconds for one run. When puppetmaster goes crazy it now
takes ages (500 seconds and even more).

Something is weird there...
--udo,

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Nan Liu
2011-Jan-31 18:16 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Mon, Jan 31, 2011 at 10:11 AM, Udo Waechter
<udo.waechter@uni-osnabrueck.de> wrote:
> Puppetmaster works fine after restart, then after about 2-3 hours it
> becomes pretty unresponsive; catalog runs go up to 120 seconds and more
> (the baseline being something like 10 seconds).
>
> Another strange thing:
>
> puppet clients take a lot longer to run nowadays. A machine usually took
> about 40-50 seconds for one run. When puppetmaster goes crazy it now
> takes ages (500 seconds and even more).

When it takes longer, is the agent simply spending more time on
config_retrieval? You can find this metric in stored reports. I would
not focus on the agent if the delays are caused by compilation delays
on the master.

Thanks,

Nan

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
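A crude way to pull that metric out of reports stored on the master; this scans the YAML as text rather than deserializing it (loading report objects requires puppet itself), and the reports path is the default reportdir, which may differ:

# Print the config_retrieval timing line from each stored report.
Dir.glob('/var/lib/puppet/reports/*/*.yaml').each do |path|
  File.foreach(path) do |line|
    puts "#{path}: #{line.strip}" if line.include?('config_retrieval')
  end
end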
Brice Figureau
2011-Jan-31 21:43 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 31/01/11 19:11, Udo Waechter wrote:
> Hi.
>
> I am just reading this thread, and it strikes me that we have the
> same problems with 2.6.3.
>
> Since upgrading from 2.6.2 to .3, puppetmaster shows the behaviour
> described in this thread. We have about 160 clients, and the
> puppetmaster is now an 8-core, 8GB RAM kvm instance. We had this with
> 4 cores and 4 gigs of RAM; "double-sizing" the VM did not change a
> thing!
>
> We use passenger 2.2.11debian-2 and apache 2.2.16-3, ruby1.8 from
> squeeze.

I see a pattern here. It seems Micah (see a couple of mails above in
this thread) has about the same setup, except he''s using mongrels.

It would be great to try a non-debian ruby (hint: Ruby Enterprise
Edition for instance) to see if that''s any better.

Do you use storeconfigs?

> Puppetmaster works fine after restart, then after about 2-3 hours it
> becomes pretty unresponsive; catalog runs go up to 120 seconds and
> more (the baseline being something like 10 seconds).

With 160 hosts, a 30 min sleeptime, and a compilation time of 10s, you
need 1600 cpu-seconds to build catalogs for your whole fleet.
With a concurrency of 8 cores (assuming you use a pool of 8 passenger
apps), that''s 200s per core, which is way less than the max of 1800s you
can accommodate in a 30 min time-frame. Of course this assumes an evenly
distributed load and perfect concurrency, but still, you have plenty of
available resources. So I conclude this is not normal.

> I need to restart apache/puppetmaster about once a day. When I do
> that I need to:
>
> * stop apache
> * kill (still running) puppetmasters (with SIGKILL!), some are always
> left running at "CPU 100%"
> * start apache

Does stracing/ltracing the process show something useful?

> Something is very weird there, and there were no fundamental changes
> to the manifests/modules.
>
> The only thing that really changed is the VM itself. It was XEN (for
> years), we switched to KVM with kernel 2.6.35
>
> Another strange thing:
>
> puppet clients take a lot longer to run nowadays. A machine usually
> took about 40-50 seconds for one run. When puppetmaster goes crazy it
> now takes ages (500 seconds and even more).

If your masters are busy, chances are your clients have to wait longer
to be served catalogs, sourced files, or file metadata. This can
dramatically increase the run time.

> Something is weird there... --udo,

Indeed.
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
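The same capacity arithmetic as the earlier sketch, with Udo''s figures plugged in; again a rough model, not a benchmark:

# 160 nodes, 10s compiles, 8 workers; check a few agent intervals.
nodes, compile_s, workers = 160, 10, 8
[30, 60, 120].each do |sleep_min|
  per_core = (nodes * compile_s) / workers   # CPU-seconds per core per round
  puts "sleep #{sleep_min}m: #{per_core}s per core vs #{sleep_min * 60}s available"
end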
John Warburton
2011-Jan-31 21:50 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 1 February 2011 08:43, Brice Figureau
<brice-puppet@daysofwonder.com> wrote:
> On 31/01/11 19:11, Udo Waechter wrote:
>> Do you use storeconfigs?

Speaking of resource hogs, do you run the puppet labs dashboard on the
same host? I had a similar setup (on crusty old Sun kit, mind), and
found a big performance hit from clients writing their reports to the
puppetmaster and the master then writing those reports to the dashboard.
Everything calmed down once I moved the dashboard to another host.

John

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Waechter Udo
2011-Feb-01 10:30 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Hi,
On 31.01.2011, at 22:43, Brice Figureau wrote:

> On 31/01/11 19:11, Udo Waechter wrote:
>> [..]
>> We use passenger 2.2.11debian-2 and apache 2.2.16-3, ruby1.8 from
>> squeeze.
>
> I see a pattern here. It seems Micah (see a couple of mails above in
> this thread) has about the same setup, except he''s using mongrels.
>
> It would be great to try a non-debian ruby (hint: Ruby Enterprise
> Edition for instance) to see if that''s any better.

Well, since this behaviour turned up with 2.6.3, I did not think about
blaming it on another tool, like ruby. I will try RubyEE though.

> Do you use storeconfigs?

Yes, A LOT! Nowadays with stompserver and puppetqd. I did switch it off
already and that did not change a (performance) thing.

>> Puppetmaster works fine after restart, then after about 2-3 hours it
>> becomes pretty unresponsive; catalog runs go up to 120 seconds and
>> more (the baseline being something like 10 seconds).
>
> With 160 hosts, a 30 min sleeptime, and a compilation time of 10s, you
> need 1600 cpu-seconds to build catalogs for your whole fleet.
> With a concurrency of 8 cores (assuming you use a pool of 8 passenger
> apps), that''s 200s per core, which is way less than the max of 1800s
> you can accommodate in a 30 min time-frame. Of course this assumes an
> evenly distributed load and perfect concurrency, but still, you have
> plenty of available resources. So I conclude this is not normal.

Nope, like I said, we had the puppetmaster running as a VM with 4 cores
and 4 gigs of RAM. This worked fine since 0.22.x; now it''s twice as big
(if this comparison holds) and performance is worse than ever.

Also, we do not do 30 minute puppet runs. We do them every hour for
workstations and every 2 hours for servers, each with a random sleep of
up to half that interval. The load on the server is pretty evenly
distributed. Once or twice a day there are some peaks, but those are not
critical at all.
Thanks,
udo.
Waechter Udo
2011-Feb-01 10:31 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Hi,
On 31.01.2011, at 22:50, John Warburton wrote:

> On 1 February 2011 08:43, Brice Figureau <brice-puppet@daysofwonder.com> wrote:
> On 31/01/11 19:11, Udo Waechter wrote:
> > Do you use storeconfigs?
>
> Speaking of resource hogs, do you run the puppet labs dashboard on the
> same host? I had a similar setup (on crusty old Sun kit, mind), and
> found a big performance hit from clients writing their reports to the
> puppetmaster and the master then writing those reports to the dashboard.
> Everything calmed down once I moved the dashboard to another host.

Yes I do, but I always did... Even if this is not a good idea,
performance was acceptable until 2.6.3. Something must have changed
there.
--udo.
Ashley Penney
2011-Feb-01 15:30 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
This is the crux of the situation for me too - Puppetlabs blame it on a
Ruby bug that hasn''t been resolved with RHEL6 (in my situation), but
this wasn''t an issue until .3 for me too. I feel the fact that many of
us have had this problem since upgrading means it can be fixed within
Puppet, rather than Ruby, because it was fine before.

On Tue, Feb 1, 2011 at 5:31 AM, Waechter Udo
<udo.waechter@uni-osnabrueck.de> wrote:
> Yes I do, but I always did... Even if this is not a good idea,
> performance was acceptable until 2.6.3. Something must have changed
> there.
> --udo.

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Brice Figureau
2011-Feb-01 17:14 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Tue, 2011-02-01 at 10:30 -0500, Ashley Penney wrote:
> This is the crux of the situation for me too - Puppetlabs blame it on
> a Ruby bug that hasn''t been resolved with RHEL6 (in my situation), but
> this wasn''t an issue until .3 for me too. I feel the fact that many
> of us have had this problem since upgrading means it can be fixed
> within Puppet, rather than Ruby, because it was fine before.

Do you mean puppet 2.6.2 wasn''t exhibiting this problem?

-- 
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Ashley Penney
2011-Feb-01 19:35 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Yes, it didn''t happen with the earlier versions of 2.6.

On Tue, Feb 1, 2011 at 12:14 PM, Brice Figureau
<brice-puppet@daysofwonder.com> wrote:
> On Tue, 2011-02-01 at 10:30 -0500, Ashley Penney wrote:
> > This is the crux of the situation for me too - Puppetlabs blame it on
> > a Ruby bug that hasn''t been resolved with RHEL6 (in my situation), but
> > this wasn''t an issue until .3 for me too. I feel the fact that many
> > of us have had this problem since upgrading means it can be fixed
> > within Puppet, rather than Ruby, because it was fine before.
>
> Do you mean puppet 2.6.2 wasn''t exhibiting this problem?
>
> --
> Brice Figureau
> Follow the latest Puppet Community evolutions on www.planetpuppet.org!

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Brice Figureau
2011-Feb-01 19:45 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 01/02/11 20:35, Ashley Penney wrote:
> Yes, it didn''t happen with the earlier versions of 2.6.

If it''s easy for you to reproduce the issue, you really should git
bisect it and tell puppetlabs which commit is the root cause (the
difference between 2.6.2 and 2.6.3 is not that big). This way, they''ll
certainly be able to fix it.

Do we have a redmine ticket to track this issue?
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Udo Waechter
2011-Feb-01 20:17 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 01.02.2011, at 18:14, Brice Figureau wrote:

> On Tue, 2011-02-01 at 10:30 -0500, Ashley Penney wrote:
>> This is the crux of the situation for me too - Puppetlabs blame it on
>> a Ruby bug that hasn''t been resolved with RHEL6 (in my situation), but
>> this wasn''t an issue until .3 for me too. I feel the fact that many
>> of us have had this problem since upgrading means it can be fixed
>> within Puppet, rather than Ruby, because it was fine before.
>
> Do you mean puppet 2.6.2 wasn''t exhibiting this problem?

Yes for me.
--udo.

-- 
:: udo waechter - root@zoide.net :: N 52º16''30.5" E 8º3''10.1"
:: genuine input for your ears: http://auriculabovinari.de
:: your eyes: http://ezag.zoide.net
:: your brain: http://zoide.net

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Ashley Penney
2011-Feb-07 16:23 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Because I like to live dangerously I upgraded to 2.6.5, and it seems
like this has resolved the CPU problem completely for me.

On Tue, Feb 1, 2011 at 3:17 PM, Udo Waechter
<udo.waechter@uni-osnabrueck.de> wrote:
>
> On 01.02.2011, at 18:14, Brice Figureau wrote:
>
> > On Tue, 2011-02-01 at 10:30 -0500, Ashley Penney wrote:
> >> This is the crux of the situation for me too - Puppetlabs blame it on
> >> a Ruby bug that hasn''t been resolved with RHEL6 (in my situation), but
> >> this wasn''t an issue until .3 for me too. I feel the fact that many
> >> of us have had this problem since upgrading means it can be fixed
> >> within Puppet, rather than Ruby, because it was fine before.
> >
> > Do you mean puppet 2.6.2 wasn''t exhibiting this problem?
> Yes for me.
> --udo.
>
> --
> :: udo waechter - root@zoide.net :: N 52º16''30.5" E 8º3''10.1"
> :: genuine input for your ears: http://auriculabovinari.de
> :: your eyes: http://ezag.zoide.net
> :: your brain: http://zoide.net

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Brice Figureau
2011-Feb-07 18:56 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 07/02/11 17:23, Ashley Penney wrote:
> Because I like to live dangerously I upgraded to 2.6.5, and it seems
> like this has resolved the CPU problem completely for me.

Did you upgrade the master, or the master and all the nodes?

I had a discussion about this issue with Nigel during the week-end, and
he said something really interesting that I hadn''t thought about:
it might be possible that the reports generated by 2.6.3 were larger
than they were in previous versions.

It is then possible that the CPU time taken to unserialize and process
those larger reports is the root cause of the high CPU usage.

It''d be great if one of the people having the problem could disable
reports to see if that''s the culprit.

And if this is the case, we should at least log how long it takes to
process a report on the master.
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
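One quick way to test the bigger-reports theory from the master side; a sketch assuming the default reportdir and YAML report files:

# Average on-disk report size; compare before and after upgrading agents.
sizes = Dir.glob('/var/lib/puppet/reports/*/*.yaml').map { |f| File.size(f) }
unless sizes.empty?
  avg = sizes.inject(0) { |sum, n| sum + n } / sizes.length
  puts "#{sizes.length} reports, average #{avg} bytes"
end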
Ashley Penney
2011-Feb-07 19:15 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
I just upgraded the master; I was too lazy to do the nodes yet.

On Mon, Feb 7, 2011 at 1:56 PM, Brice Figureau
<brice-puppet@daysofwonder.com> wrote:
> On 07/02/11 17:23, Ashley Penney wrote:
> > Because I like to live dangerously I upgraded to 2.6.5, and it seems
> > like this has resolved the CPU problem completely for me.
>
> Did you upgrade the master, or the master and all the nodes?
>
> I had a discussion about this issue with Nigel during the week-end, and
> he said something really interesting that I hadn''t thought about:
> it might be possible that the reports generated by 2.6.3 were larger
> than they were in previous versions.
>
> It is then possible that the CPU time taken to unserialize and process
> those larger reports is the root cause of the high CPU usage.
>
> It''d be great if one of the people having the problem could disable
> reports to see if that''s the culprit.
>
> And if this is the case, we should at least log how long it takes to
> process a report on the master.
> --
> Brice Figureau
> My Blog: http://www.masterzen.fr/

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Udo Waechter
2011-Feb-10 14:55 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
Hello,
I am one of those who have this problem. Some people suggested using
Ruby Enterprise. I looked at its installation; it looked a little bit
time-consuming, so I did not try that one out.
I upgraded to debian squeeze (of course), and the problem persists.

Thus I did some tests:

1. got ruby from "Ubuntu Meerkat":
libruby1.8 1.8.7.299-2
ruby1.8 1.8.7.299-2
ruby1.8-dev 1.8.7.299-2

Same problem (debian is 1.8.7.302 I think); with ruby from ubuntu lucid
(1.8.7.249) the problem is the same. I guess we can rule out debian''s
ruby here.

2. I reported that after stopping apache, stray master processes remain
and sit at 100% cpu. I did an strace on those processes and they do this
(whatever that means):

$ strace -p 1231
Process 1231 attached - interrupt to quit
brk(0xa49a000) = 0xa49a000
brk(0xbf51000) = 0xbf51000
brk(0xda09000) = 0xda09000
brk(0xa49a000) = 0xa49a000
brk(0xbf52000) = 0xbf52000
brk(0xda09000) = 0xda09000
brk(0xa49a000) = 0xa49a000
brk(0xbf52000) = 0xbf52000
brk(0xda09000) = 0xda09000
^CProcess 1231 detached

3. I have now disabled reports, let''s see what happens.

Thanks for the effort and have a nice day.
udo.

On 07.02.2011, at 19:56, Brice Figureau wrote:

> Did you upgrade the master, or the master and all the nodes?
>
> I had a discussion about this issue with Nigel during the week-end, and
> he said something really interesting that I hadn''t thought about:
> it might be possible that the reports generated by 2.6.3 were larger
> than they were in previous versions.
>
> It is then possible that the CPU time taken to unserialize and process
> those larger reports is the root cause of the high CPU usage.

-- 
:: udo waechter - root@zoide.net :: N 52º16''30.5" E 8º3''10.1"
:: genuine input for your ears: http://auriculabovinari.de
:: your eyes: http://ezag.zoide.net
:: your brain: http://zoide.net

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Brice Figureau
2011-Feb-10 15:22 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Thu, 2011-02-10 at 15:55 +0100, Udo Waechter wrote:
> Hello,
> I am one of those who have this problem. Some people suggested using
> Ruby Enterprise. I looked at its installation; it looked a little bit
> time-consuming, so I did not try that one out.
> I upgraded to debian squeeze (of course), and the problem persists.
>
> Thus I did some tests:
>
> 1. got ruby from "Ubuntu Meerkat":
> libruby1.8 1.8.7.299-2
> ruby1.8 1.8.7.299-2
> ruby1.8-dev 1.8.7.299-2
>
> Same problem (debian is 1.8.7.302 I think); with ruby from ubuntu lucid
> (1.8.7.249) the problem is the same. I guess we can rule out debian''s
> ruby here.
>
> 2. I reported that after stopping apache, stray master processes remain
> and sit at 100% cpu. I did an strace on those processes and they do this
> (whatever that means):
>
> $ strace -p 1231
> Process 1231 attached - interrupt to quit
> brk(0xa49a000) = 0xa49a000
> brk(0xbf51000) = 0xbf51000
> brk(0xda09000) = 0xda09000
> brk(0xa49a000) = 0xa49a000
> brk(0xbf52000) = 0xbf52000
> brk(0xda09000) = 0xda09000
> brk(0xa49a000) = 0xa49a000
> brk(0xbf52000) = 0xbf52000
> brk(0xda09000) = 0xda09000
> ^CProcess 1231 detached

This process is allocating memory like crazy :)

> 3. I have now disabled reports, let''s see what happens.

Are you still on puppet 2.6.3?
Can you upgrade to 2.6.5 to see if that''s better, as reported by one
other user?

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Patrick
2011-Feb-10 19:40 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On Feb 10, 2011, at 6:55 AM, Udo Waechter wrote:
> Hello,
> I am one of those who have this problem. Some people suggested using
> Ruby Enterprise. I looked at its installation; it looked a little bit
> time-consuming, so I did not try that one out.

Well, I find it takes about 30 min at the most, saves on RAM, and causes
puppet to use a little more CPU. Here''s what I did. This method requires
a compiler. You can also do everything up to (but not including) step 5
without affecting puppet. It''s also easy to reverse.

1) Changed /usr/share/puppet/rack/puppetmasterd/config.ru to use an
absolute path to the folder. Need this line:
$:.unshift(''/usr/lib/ruby/1.8/'')

2) Install the dependencies for the compile:
package { "libssl-dev": ensure => present }
package { "libsqlite3-dev": ensure => present }
package { ''libmysql++-dev'': ensure => present }
package { ''libpq-dev'': ensure => present }
package { ''apache2-prefork-dev'': ensure => present }
package { ''libapr1-dev'': ensure => present }
package { ''libaprutil1-dev'': ensure => present }

3) Installed RubyEE from their universal package.

4) Added a passengerEE mod to /etc/apache2/mods-available/

/etc/apache2/mods-available/passengeree.load:
LoadModule passenger_module /opt/ruby-enterprise-1.8.7-2010.02/lib/ruby/gems/1.8/gems/passenger-2.2.15/ext/apache2/mod_passenger.so
PassengerRoot /opt/ruby-enterprise-1.8.7-2010.02/lib/ruby/gems/1.8/gems/passenger-2.2.15
PassengerRuby /opt/ruby-enterprise-1.8.7-2010.02/bin/ruby

5) Disable the old passenger and enable the new one:
a2dismod passenger
a2enmod passengeree
service apache2 restart

If things don''t work, do this to re-enable your old passenger:
a2enmod passenger
a2dismod passengeree
service apache2 restart

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
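To confirm Apache actually picked up the REE-backed module after step 5 (paths as in Patrick''s steps, which may differ on your system), something like:

apache2ctl -t -D DUMP_MODULES | grep -i passenger
/opt/ruby-enterprise-1.8.7-2010.02/bin/ruby -v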
John Warburton
2011-Feb-10 23:55 UTC
Re: [Puppet Users] Re: puppetmaster 100%cpu usage on 2.6 (not on 0.24)
On 8 February 2011 06:15, Ashley Penney <apenney@gmail.com> wrote:
> I just upgraded the master; I was too lazy to do the nodes yet.
>
> On Mon, Feb 7, 2011 at 1:56 PM, Brice Figureau
> <brice-puppet@daysofwonder.com> wrote:
>> On 07/02/11 17:23, Ashley Penney wrote:
>> > Because I like to live dangerously I upgraded to 2.6.5, and it seems
>> > like this has resolved the CPU problem completely for me.
>>
>> Did you upgrade the master, or the master and all the nodes?

Was that upgrade to 2.6.5rc2? It seems there has been a nice patch to
speed up large HTTP POSTs & PUTs. Since 2.6.x reports can be large (I
have some approaching 1 MB), this might be where the problem was:

https://projects.puppetlabs.com/projects/puppet/wiki/Release_Notes#2.6.5
https://projects.puppetlabs.com/issues/6257

John

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.